From charlesreid1

Chapter 1: Book Outline and Notes

Chapter 2: Review of univariate generalized linear models and extensions

Chapter 3: Models for multicategorical responses (i.e. multiple, unordered responses)

Chapter 4: Selecting variables for models, variable reduction procedures, checking models, goodness-of-fit, residual analysis (outliers or consistent trend?)

Chapter 5: Semi- and non-parametric approaches

Chapter 6: Fixed-parameter models for time series (extends Ch. 2 and Ch. 3)

Chapter 7: Random effects models for non-normal data

Chapter 8: State space models for analyzing non-normal time series; relate time series observations y_t to unobserved states, like trend and seasonal components

Chapter 9: Survival models; identifying the factors that determine survival/transition


Chapter 2: Univariate Generalized Linear Models

Cross-sectional regression analysis: univariate variable of primary interest (response variable) $ y $

Explained by a vector of covariates $ x = (x_1, x_2, \dots, x_m) $

Data consist of observations on $ (y,x) $:

$ (y_i , x_i ), i = 1, \dots, n $

Definition of Univariate Generalized Linear Models

The classical linear model for ungrouped normal responses and deterministic covariates is:

$ y_i = z_i^{\prime} \beta + \epsilon_i $

where:

$ z_i $ = design vector, function of covariate vector $ x_i $

$ \beta $ = vector of unknown parameters

$ \epsilon_i $ = errors, normally distributed and independent, $ \epsilon_i \sim N(0, \sigma^2) $

The observations $ y_i $ are therefore independent and normally distributed with mean $ \mu_i = z_i^{\prime} \beta $,

$ y_i \sim N(\mu_i, \sigma^2), i=1, \dots, n $
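A minimal sketch of the classical linear model on simulated data. The design `Z`, the values in `beta_true`, and `sigma` are all hypothetical and chosen only for illustration; least squares is used because it coincides with maximum likelihood for $ \beta $ under normal errors.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 200
x = rng.uniform(-1.0, 1.0, size=n)       # hypothetical scalar covariate x_i
Z = np.column_stack([np.ones(n), x])     # design vectors z_i = (1, x_i)'
beta_true = np.array([1.0, 2.0])         # illustrative parameter values
sigma = 0.5                              # illustrative error standard deviation

# y_i = z_i' beta + eps_i, with eps_i ~ N(0, sigma^2) independent
y = Z @ beta_true + rng.normal(0.0, sigma, size=n)

# Least-squares estimate of beta (equal to the MLE under normal errors)
beta_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Residuals and the usual unbiased estimate of sigma^2
residuals = y - Z @ beta_hat
sigma2_hat = residuals @ residuals / (n - Z.shape[1])
```

With a few hundred observations, `beta_hat` should land close to `beta_true`, and the residuals are orthogonal to the columns of `Z` by construction.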

A specific generalized linear model is fully characterized by three components:

  • type of exponential family
  • response or link function
  • design vector
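Written out in standard GLM notation, the last two components look roughly as follows. The symbols $ \eta_i $, $ h $ and $ g $ are not used elsewhere in these notes and are assumed here for illustration:

$ \eta_i = z_i^{\prime} \beta $ (linear predictor built from the design vector)

$ \mu_i = h(\eta_i) $ (response function $ h $)

$ g(\mu_i) = \eta_i, \quad g = h^{-1} $ (link function $ g $)

The classical linear model is the special case with a normal response and $ h $ equal to the identity, so that $ \mu_i = z_i^{\prime} \beta $.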

Example:

Exponential family

  • important members: normal, binomial, Poisson, gamma, inverse Gaussian distributions
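For reference, one common way to write the density of a one-parameter exponential family is sketched below. The symbols $ \theta_i $ (natural parameter), $ \phi $ (dispersion parameter), $ \omega_i $ (known weight), $ b $ and $ c $ are assumptions of this sketch, not notation taken from the notes above:

$ f(y_i \mid \theta_i, \phi, \omega_i) = \exp \left\{ \frac{ y_i \theta_i - b(\theta_i) }{ \phi } \omega_i + c(y_i, \phi, \omega_i) \right\} $

In this form the mean and variance are

$ E(y_i) = \mu_i = b^{\prime}(\theta_i) $

$ \mathrm{var}(y_i) = \phi \, v(\mu_i) / \omega_i, \quad v(\mu_i) = b^{\prime\prime}(\theta_i) $

As a check, the Poisson distribution fits this form with $ \theta_i = \log \mu_i $, $ b(\theta_i) = \exp(\theta_i) $ and $ \phi = 1 $, which gives the variance function $ v(\mu_i) = \mu_i $.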

Models for Continuous Responses

Normal distribution

Gamma distribution

Inverse Gaussian distribution

Models for Binary and Binomial Responses

Linear probability model

Probit model

Logit model

Complementary log-log model
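The four binary response models listed above differ only in the response function that maps the linear predictor $ z_i^{\prime} \beta $ to a probability. A small sketch of those four functions follows; the function names and the example values in `eta` are hypothetical.

```python
import math
import numpy as np

def linear_probability(eta):
    # Identity response; only a valid probability when eta lies in [0, 1]
    return np.asarray(eta, dtype=float)

def logit_response(eta):
    # Logistic distribution function: exp(eta) / (1 + exp(eta))
    eta = np.asarray(eta, dtype=float)
    return 1.0 / (1.0 + np.exp(-eta))

def probit_response(eta):
    # Standard normal distribution function, written via the error function
    eta = np.asarray(eta, dtype=float)
    return 0.5 * (1.0 + np.vectorize(math.erf)(eta / math.sqrt(2.0)))

def cloglog_response(eta):
    # Extreme-minimal-value distribution function: 1 - exp(-exp(eta))
    eta = np.asarray(eta, dtype=float)
    return 1.0 - np.exp(-np.exp(eta))

eta = np.array([-2.0, 0.0, 2.0])     # hypothetical linear predictor values
pi_logit = logit_response(eta)
pi_probit = probit_response(eta)
pi_cloglog = cloglog_response(eta)
```

The logit and probit responses are symmetric about `eta = 0`, where both equal 0.5; the complementary log-log response is not symmetric and equals `1 - exp(-1)` there.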

Models for Counted Data

Log-linear Poisson model

Linear Poisson model
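The two Poisson models above differ in how the mean is tied to the linear predictor: the log-linear model uses $ \mu_i = \exp(z_i^{\prime} \beta) $, while the linear model uses $ \mu_i = z_i^{\prime} \beta $ directly. A short sketch of the practical consequence, with a hypothetical design `Z` and illustrative `beta`:

```python
import numpy as np

# Hypothetical design: intercept plus one covariate on a small grid
Z = np.column_stack([np.ones(5), np.linspace(-2.0, 2.0, 5)])
beta = np.array([0.5, 1.0])      # illustrative parameter values

eta = Z @ beta                   # linear predictor z_i' beta

mu_loglinear = np.exp(eta)       # log-linear model: mean is always positive
mu_linear = eta                  # linear model: mean equals the predictor

# A Poisson mean must be positive, so the linear model is only
# admissible where the linear predictor itself is positive.
admissible = mu_linear > 0
```

Here the linear predictor is negative for the two smallest covariate values, so the linear Poisson model would need a restriction on $ \beta $ there, whereas the log-linear model needs none.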


Likelihood Inference

Regression analysis with generalized linear models is based on likelihoods

This section contains inferential tools for:

  • parameter estimation
  • hypothesis testing
  • goodness-of-fit tests
  • more detailed material on model choice/checking: see Chapter 4

Assumes that the model is completely and correctly specified

Maximum likelihood estimator (MLE): estimate of the unknown parameter vector $ \beta $, obtained by maximizing the likelihood
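For most generalized linear models the likelihood has to be maximized numerically. The sketch below uses Fisher scoring for a log-linear Poisson model as one possible approach; it is not necessarily the algorithm the book presents. The function name `fit_poisson_loglinear`, the simulated data and `beta_true` are all hypothetical.

```python
import numpy as np

def fit_poisson_loglinear(Z, y, n_iter=25, tol=1e-10):
    """Fisher scoring for a Poisson model with mean mu_i = exp(z_i' beta)."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = np.exp(Z @ beta)
        score = Z.T @ (y - mu)              # gradient of the log-likelihood
        fisher = Z.T @ (Z * mu[:, None])    # expected information Z' diag(mu) Z
        step = np.linalg.solve(fisher, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:      # stop once the update is negligible
            break
    return beta

# Simulated data under illustrative parameter values
rng = np.random.default_rng(1)
n = 500
Z = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, size=n)])
beta_true = np.array([0.3, 0.8])
y = rng.poisson(np.exp(Z @ beta_true))

beta_hat = fit_poisson_loglinear(Z, y)
```

At the solution the score vector `Z.T @ (y - mu)` is numerically zero, which is the likelihood equation for this model. Because the log link is the canonical link for the Poisson distribution, Fisher scoring and Newton-Raphson coincide here.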

Goodness-of-Fit Statistics

Two measures of the adequacy of a model (goodness of fit) are:

Pearson statistic $ \chi^2 = \sum_{i=1}^{g} \frac{ \left( y_i - \hat{\mu}_i \right)^2 }{ v(\hat{\mu}_i) } $

Deviance $ D = - 2 \phi \sum_{i=1}^{g} \left[ l_i ( \hat{\mu}_i ) - l_i (y_i) \right] $

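where

$ g $ = number of (grouped) observations

$ \phi $ = dispersion parameter

$ \hat{\mu}_i $ = estimated mean function

$ v(\hat{\mu}_i) $ = estimated variance function

$ l_i(\hat{\mu}_i) $ = individual log-likelihood evaluated at the fitted mean

$ l_i(y_i) $ = individual log-likelihood with $ \mu_i $ replaced by $ y_i $ (the largest value it can attain)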
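As a concrete instance of the two statistics, take a Poisson response, where $ v(\mu_i) = \mu_i $ and $ \phi = 1 $. The deviance then reduces to $ D = 2 \sum_i \left[ y_i \log(y_i / \hat{\mu}_i) - (y_i - \hat{\mu}_i) \right] $, with the convention $ 0 \log 0 = 0 $. The sketch below computes both statistics; the function names, the counts `y` and the fitted means `mu_hat` are hypothetical.

```python
import numpy as np

def pearson_statistic(y, mu):
    # Pearson chi-square with Poisson variance function v(mu) = mu
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.sum((y - mu) ** 2 / mu)

def poisson_deviance(y, mu):
    # D = 2 * sum[ y*log(y/mu) - (y - mu) ], using 0*log(0) = 0
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    ratio = np.where(y > 0, y / mu, 1.0)   # log(1) = 0 handles the y = 0 terms
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))

y = np.array([2.0, 0.0, 5.0, 3.0])         # hypothetical observed counts
mu_hat = np.array([2.5, 0.5, 4.0, 3.0])    # hypothetical fitted means

chi2 = pearson_statistic(y, mu_hat)
dev = poisson_deviance(y, mu_hat)
```

Both statistics are zero when the fitted means reproduce the observations exactly and grow as the fit deteriorates.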











References

Generalized linear models:

  • McCullagh and Nelder 1989 - standard source of information about generalized linear models
  • Santner and Duffy 1989 - consider cross-classified data and univariate discrete data