Lecture Notes 5
Introduction
• We have discussed methods for analyzing associations in
two-way and three-way tables.
• Now we will use models as the basis of such analysis.
• Models can handle more complicated situations than those discussed so far.
• We can also estimate parameters that describe the effects in a more informative way.
The Data
Obs   x  y      Obs   x  y      Obs   x  y
  1  66  0        9  57  1       17  70  0
  2  70  1       10  63  1       18  81  0
  3  69  0       11  70  1       19  76  0
  4  68  0       12  78  0       20  79  0
  5  67  0       13  67  0       21  75  1
  6  72  0       14  53  1       22  76  0
  7  73  0       15  67  0       23  58  1
  8  70  0       16  75  0
Fit From Logistic Regression
[figure omitted]
Example Continued
• We consider width alone as a predictor.
• To obtain a clearer picture, we grouped the female crabs into a set of width categories:
• ≤ 23.25, 23.25-24.25, 24.25-25.25, 25.25-26.25, 26.25-27.25, 27.25-28.25, 28.25-29.25, > 29.25.
• We then calculated the sample mean number of satellites for the female crabs in each category (a sketch of this computation follows).
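The grouping step can be sketched in a few lines of numpy. Only the eight cut points come from the slide; the width and satellite arrays below are simulated stand-ins, and the coefficients in the simulated mean are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical stand-in values; a real analysis would load the
    # horseshoe crab data (carapace width in cm, satellite count).
    width = rng.uniform(21.0, 33.5, size=173)
    satellites = rng.poisson(np.exp(-3.3 + 0.16 * width))

    # Cut points from the slide: <=23.25, (23.25, 24.25], ..., >29.25
    edges = [23.25, 24.25, 25.25, 26.25, 27.25, 28.25, 29.25]
    cats = np.digitize(width, edges)       # 0..7 -> eight width categories

    for c in range(8):
        in_cat = cats == c
        if in_cat.any():
            print(f"category {c}: n={in_cat.sum():3d}, "
                  f"mean satellites={satellites[in_cat].mean():.2f}")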
Components of a GLM
• Random component
– Identifies the response variable Y and assumes a probability
distribution for it
• Systematic component
– Specifies the explanatory variables used as predictors in the
model
• Link
– Describes the functional relation between the systematic
component and the expected value of the random component
Random Component
• Let Y1, · · · , YN denote the N observations on the response variable Y.
• The random component specifies a probability distribution for Y1, · · · , YN.
• If the potential outcomes for each observation Yi are binary, such as "success" or "failure", or, more generally, if each Yi is the number of "successes" out of a certain fixed number of trials, we can assume a binomial distribution for the random component.
• If each response observation is a non-negative count, such as a cell count in a contingency table, then we may assume a Poisson distribution for the random component.
Systematic Component
• The systematic component specifies the explanatory variables.
• It specifies the variables that play the roles of xj in the formula
α + β1 x1 + · · · + βk xk .
• This linear combination of explanatory variables is called the linear predictor.
• Some xj may be based on others in the model; for instance, perhaps x3 = x1 x2, to allow interaction between x1 and x2 in their effects on Y, or perhaps x3 = x1^2 to allow a curvilinear effect of x1.
Link
• It specifies how µ = E(Y) relates to the explanatory variables in the linear predictor.
• The model formula states that
g(µ) = α + β1 x1 + · · · + βk xk
• Identity link
g(µ) = µ = α + β1 x1 + · · · + βk xk
• Log link
g(µ) = log(µ) = α + β1 x1 + · · · + βk xk
• Logit link
g(µ) = log[µ/(1 − µ)] = α + β1 x1 + · · · + βk xk
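A minimal plain-numpy sketch of these three links, pairing each g with its inverse (the "mean function"); the value µ = 0.3 is arbitrary:

    import numpy as np

    # Each entry pairs a link g with its inverse g^-1.
    links = {
        "identity": (lambda mu: mu, lambda eta: eta),
        "log":      (np.log, np.exp),
        "logit":    (lambda mu: np.log(mu / (1 - mu)),
                     lambda eta: 1 / (1 + np.exp(-eta))),
    }

    mu = 0.3
    for name, (g, g_inv) in links.items():
        eta = g(mu)                      # linear-predictor scale
        print(f"{name:8s} g(mu)={eta:+.4f}  g^-1(eta)={g_inv(eta):.4f}")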
Example: Snoring and Heart Disease
[table of snoring level by heart disease status omitted]
Example:
• We use the scores (0, 2, 4, 5) for the snoring categories, treating the last two levels as closer together.
• Linear fit obtained by maximizing the likelihood: [fitted equation omitted]
Logistic Regression Model
• The relationship between π(x) and x is usually nonlinear rather than linear. The most important function describing this nonlinear relationship has the form
log[π(x)/(1 − π(x))] = α + βx
• That is,
π(x) = F0(α + βx), where F0(x) = e^x/(1 + e^x) = 1/(1 + e^{−x}),
where F0(x) is the cdf of the logistic distribution. Its pdf is F0(x)(1 − F0(x)).
• The associated GLM is called the logistic regression function.
• Logistic regression models are often referred to as logit models because the link in this GLM is the logit link.
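The pdf identity above is easy to check numerically. The sketch below compares a central finite difference of F0 with F0(x)(1 − F0(x)); everything here is self-contained, nothing comes from a fitted model:

    import numpy as np

    def F0(x):
        # cdf of the standard logistic distribution
        return 1 / (1 + np.exp(-x))

    x = np.linspace(-4, 4, 9)
    h = 1e-6
    numeric_pdf = (F0(x + h) - F0(x - h)) / (2 * h)  # d/dx F0, central diff
    print(np.max(np.abs(numeric_pdf - F0(x) * (1 - F0(x)))))  # ~1e-10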
Parameters
• The parameter β determines the rate of increase or decrease of
the curve.
• When β > 0, π(x) increases with x.
• When β < 0, π(x) decreases as x increases.
• The magnitude of β determines how fast the curve increases or
decreases.
• As |β| increases, the curve has a steeper rate of change.
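A small sketch of this effect, using invented values of β and centering each curve so that π(x0) = 1/2 at the same x0. The printed slope uses the fact (from the logistic pdf above) that dπ/dx = βπ(x)(1 − π(x)), which equals β/4 where π = 1/2:

    import numpy as np

    def pi(x, alpha, beta):
        return 1 / (1 + np.exp(-(alpha + beta * x)))

    # alpha = -beta * x0 makes every curve pass through 1/2 at x0 = 25,
    # isolating the effect of beta on steepness (values are illustrative).
    x0 = 25.0
    for beta in (0.25, 0.5, 1.0, -0.5):
        alpha = -beta * x0
        print(f"beta={beta:+.2f}: pi(24)={pi(24.0, alpha, beta):.3f}, "
              f"pi(26)={pi(26.0, alpha, beta):.3f}, slope at x0={beta / 4:+.3f}")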
Example: Snoring and Heart Disease
[figure omitted]
Effect of Parameters
π(x) = F(α + βx)
Probit Models
• The probability of success, π(x), has the form Φ(α + βx), where Φ is the cdf of the standard normal distribution N(0, 1).
• The link function is known as the probit link: g(π) = Φ^{−1}(π).
• The probit transform maps π(x) so that the regression curve for π(x) (or 1 − π(x), when β < 0) has the appearance of a normal cdf with mean µ = −α/β and standard deviation σ = 1/|β|.
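A quick numerical check of this rescaling, with illustrative (not fitted) values of α and β:

    import numpy as np
    from scipy.stats import norm

    alpha, beta = -12.35, 0.50                # illustrative values
    mu, sigma = -alpha / beta, 1 / abs(beta)  # mean 24.7, sd 2.0

    x = np.linspace(20.0, 30.0, 5)
    print(norm.cdf(alpha + beta * x))         # probit regression curve pi(x)
    print(norm.cdf(x, loc=mu, scale=sigma))   # identical: N(mu, sigma) cdf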
Example: Snoring and Heart Disease
[figure omitted]
Poisson Regression
• Assume a Poisson distribution for the random component.
• One can model the Poisson mean using the identity link.
• But it is more common to model the log of the mean.
• A Poisson loglinear model is a GLM that assumes a Poisson distribution for Y and uses the log link.
log(µ) = α + βx
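One consequence of the log link is multiplicative: increasing x by one unit multiplies µ by e^β. A sketch with invented coefficients:

    import numpy as np

    alpha, beta = -3.3, 0.16     # illustrative coefficients, not fitted ones
    x = np.array([24.0, 25.0, 26.0])
    mu = np.exp(alpha + beta * x)

    print(mu)                 # means at x, x+1, x+2
    print(mu[1:] / mu[:-1])   # consecutive ratios: both equal exp(beta)
    print(np.exp(beta))       # each unit of x multiplies mu by this factor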
Example: Horseshoe Crab Data
Exponential Family
• The random variable Y has a distribution in the exponential family if its p.d.f. (or p.m.f.) can be written as
f(y; θ, ϕ) = exp{[yθ − b(θ)]/a(ϕ) + c(y, ϕ)}
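As a concrete check, the Bernoulli(π) pmf fits this form with θ = log[π/(1 − π)], b(θ) = log(1 + e^θ), a(ϕ) = 1, and c(y, ϕ) = 0. A short numerical verification:

    import numpy as np

    pi = 0.3
    theta = np.log(pi / (1 - pi))    # canonical parameter: the logit of pi
    b = np.log(1 + np.exp(theta))    # b(theta); here a(phi) = 1 and c = 0

    for y in (0, 1):
        family_form = np.exp(y * theta - b)
        direct_pmf = pi**y * (1 - pi)**(1 - y)
        print(y, round(family_form, 6), round(direct_pmf, 6))  # identical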
Log-Likelihood Functions
• The log-likelihood function is
l(θ, ϕ; y) = [yθ − b(θ)]/a(ϕ) + c(y, ϕ)
• Here
∂l/∂θ = (y − b′(θ))/a(ϕ) and ∂²l/∂θ² = −b′′(θ)/a(ϕ)
• Combining these with the likelihood identities E(∂l/∂θ) = 0 and −E(∂²l/∂θ²) = E[(∂l/∂θ)²] gives E(Y) = b′(θ) and var(Y) = b′′(θ)a(ϕ).
Examples
• Normal Distribution: b(θ) = θ^2/2
– E(Y) = b′(θ) = θ = µ
– var(Y) = b′′(θ)a(ϕ) = ϕ = σ^2.
• Bernoulli Distribution: b(θ) = log(1 + e^θ)
– E(Y) = b′(θ) = e^θ/(1 + e^θ) = π.
– var(Y) = b′′(θ) = e^θ/(1 + e^θ)^2 = π(1 − π).
• Poisson Distribution: b(θ) = exp(θ)
– E(Y) = b′(θ) = exp(θ) = λ.
– var(Y) = b′′(θ) = exp(θ) = λ.
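These derivative identities can be confirmed numerically; the sketch below does so for the Poisson case with central differences (λ = 2.5 is arbitrary):

    import numpy as np

    # Poisson case: b(theta) = exp(theta), so b' = b'' = exp(theta).
    b = np.exp
    theta, h = np.log(2.5), 1e-5     # lambda = exp(theta) = 2.5

    b1 = (b(theta + h) - b(theta - h)) / (2 * h)              # numerical b'
    b2 = (b(theta + h) - 2 * b(theta) + b(theta - h)) / h**2  # numerical b''
    print(b1, b2)    # both ~2.5: E(Y) = var(Y) = lambda for the Poisson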
ηi = g(µi )
∂L(β)/∂βj = ∑_{i=1}^N ∂Li/∂βj = 0, for j = 1, · · · , p.
• Simplifying, we have
∑_{i=1}^N [(yi − µi)xij / var(yi)] (∂µi/∂ηi) = 0
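These equations generally have no closed-form solution and are solved iteratively. Below is a minimal Newton-Raphson sketch for the logit case (for this canonical link it coincides with Fisher scoring), written from scratch on simulated data:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + x
    beta_true = np.array([-0.5, 1.2])
    y = (rng.random(n) < 1 / (1 + np.exp(-X @ beta_true))).astype(float)

    beta = np.zeros(2)
    for _ in range(25):
        pi = 1 / (1 + np.exp(-X @ beta))
        score = X.T @ (y - pi)          # the likelihood equations (logit)
        W = pi * (1 - pi)               # var(y_i); also d mu/d eta here
        info = X.T @ (X * W[:, None])   # Fisher information X' W X
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break

    print(beta)                                       # near beta_true
    print(X.T @ (y - 1 / (1 + np.exp(-X @ beta))))    # score ~ 0 at the MLE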
Examples
• Logit Model:
∑_{i=1}^N (yi − πi)xij = 0, where πi = exp(∑_{j=1}^p βj xij) / [1 + exp(∑_{j=1}^p βj xij)]
• Probit Model:
∑_{i=1}^N [(yi − πi)xij / (πi(1 − πi))] ϕ(∑_{k=1}^p βk xik) = 0, where πi = Φ(∑_{j=1}^p βj xij)
• Log-Linear Model:
∑_{i=1}^N [yi − exp(∑_{k=1}^p βk xik)] xij = 0
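For a non-canonical link the same iterative idea applies with Fisher weights. A companion sketch solving the probit score equation above, again on simulated data:

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    n = 500
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = (rng.random(n) < norm.cdf(X @ np.array([0.3, -0.8]))).astype(float)

    beta = np.zeros(2)
    for _ in range(50):
        eta = X @ beta
        pi = np.clip(norm.cdf(eta), 1e-10, 1 - 1e-10)  # guard against 0/1
        phi = norm.pdf(eta)
        u = (y - pi) / (pi * (1 - pi)) * phi   # summand of the score equation
        score = X.T @ u
        W = phi**2 / (pi * (1 - pi))           # Fisher weights for the probit
        info = X.T @ (X * W[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break

    print(beta)    # near (0.3, -0.8)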
Score Test
• The score statistic, or efficient score statistic, uses the size of the derivative of the log-likelihood function evaluated at βj = 0.
• The score statistic is the square of the ratio of this derivative to
its ASE.
• It also has an approximate chi-squared distribution.
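For H0: β = 0 in simple logistic regression the statistic has a closed form, since under H0 the restricted MLE of every πi is ȳ. A sketch on simulated data; the variance below is the efficient-score variance, which accounts for estimating the intercept:

    import numpy as np

    def score_test(x, y):
        # Score statistic for H0: beta = 0 in simple logistic regression.
        ybar = y.mean()                      # restricted MLE: pi_i = ybar
        u = np.sum(x * (y - ybar))           # dl/dbeta at (alpha_hat, 0)
        v = ybar * (1 - ybar) * np.sum((x - x.mean())**2)  # var of u
        return u**2 / v                      # ~ chi-squared with 1 df

    rng = np.random.default_rng(3)
    x = rng.normal(size=100)
    y = (rng.random(100) < 1 / (1 + np.exp(-(-0.2 + 0.9 * x)))).astype(float)
    print(score_test(x, y))                  # large => evidence against H0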
Example
• In simple logistic regression with one explanatory variable, the
log-likelihood function is:
l(α, β) = ∑_{i=1}^N {yi(α + βxi) − log[1 + exp(α + βxi)]}
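A direct numpy transcription of this log-likelihood. The x values reuse the snoring scores (0, 2, 4, 5) from the earlier slide; the 0/1 responses and the (α, β) evaluation point are invented:

    import numpy as np

    def loglik(alpha, beta, x, y):
        eta = alpha + beta * x
        return np.sum(y * eta - np.log1p(np.exp(eta)))  # stable log(1+e^eta)

    x = np.array([0.0, 2.0, 4.0, 5.0])   # snoring scores from the slide
    y = np.array([0.0, 0.0, 1.0, 1.0])   # made-up binary responses
    print(loglik(-2.0, 0.6, x, y))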
Model Residuals
• For the i-th observation, the raw residual is yi − µ̂i.
• For a Poisson GLM, the Pearson residual scales this by the estimated standard deviation: ei = (yi − µ̂i)/√µ̂i.
Adjusted Residuals
• A Pearson residual divided by its estimated standard error is called an adjusted residual.
• Adjusted residuals have an approximate standard normal distribution.
• For Poisson GLMs, the general form of the adjusted residual is:
(yi − µ̂i)/√(µ̂i(1 − hi)) = ei/√(1 − hi)
where hi is called the leverage of observation i.
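A sketch computing all three residuals for a Poisson GLM with log link, taking the leverages from the diagonal of the GLM hat matrix H = W^{1/2}X(X′WX)^{−1}X′W^{1/2} with W = diag(µ̂). The "fitted" coefficients below are stand-ins, not the result of an actual fit:

    import numpy as np

    rng = np.random.default_rng(4)
    n = 50
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    beta_hat = np.array([0.5, 0.3])     # stand-in for fitted coefficients
    mu_hat = np.exp(X @ beta_hat)       # Poisson means under the log link
    y = rng.poisson(mu_hat).astype(float)

    raw = y - mu_hat                    # raw residual
    pearson = raw / np.sqrt(mu_hat)     # e_i for a Poisson GLM

    # Leverages: diag of W^{1/2} X (X'WX)^{-1} X' W^{1/2}, W = diag(mu_hat)
    Xw = X * np.sqrt(mu_hat)[:, None]
    h = np.diag(Xw @ np.linalg.solve(Xw.T @ Xw, Xw.T))
    adjusted = pearson / np.sqrt(1 - h)

    print(np.column_stack([raw, pearson, adjusted])[:5])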