
✬ ✩

Part IV:
Theory of Generalized Linear
Models

✫ ✪

198 BIO 233, Spring 2015


✬ ✩

Lung cancer surgery

Q: Is there an association between time spent in the operating room and


post-surgical outcomes?

• Could choose from a number of possible response variables, including:


⋆ hospital stay of > 7 days
⋆ number of major complications during the hospital stay

• The scientific goal is to characterize the joint distribution between both of


these responses and a p-vector of covariates, X
⋆ age, co-morbidities, surgery type, resection type, etc

• The first response is binary and the second is a count variable


⋆ Y ∈ {0, 1}
⋆ Y ∈ {0, 1, 2, . . .}
✫ ✪

199 BIO 233, Spring 2015


✬ ✩
Q: Can we analyze such response variables with the linear regression model?

⋆ specify a mean model

E[Yi |Xi ] = XiT β

⋆ estimate β via least squares and perform inference via the CLT

• Given continuous response data, least squares estimation works remarkably


well for the linear regression model
⋆ assuming the mean model is correctly specified, β̂_OLS is unbiased

⋆ OLS is generally robust to the underlying distribution of the error terms


∗ Homework #2

⋆ OLS is ‘optimal’ if the error terms are homoskedastic


∗ MLE if ε ∼ Normal(0, σ²) and BLUE otherwise

✫ ✪

200 BIO 233, Spring 2015


✬ ✩
• For a binary response variable, we could specify a linear regression model:

E[Yi |Xi ] = XiT β


Yi |Xi ∼ Bernoulli(µi )

where, for notational convenience, µi = XiT β

• As long as this model is correctly specified, β̂_OLS will still be unbiased

• For the Bernoulli distribution, there is an implicit mean-variance


relationship:

V[Yi |Xi ] = µi (1 − µi )

⋆ as long as µi ≠ µ ∀ i, study units will be heteroskedastic


⋆ non-constant variance
✫ ✪

201 BIO 233, Spring 2015


✬ ✩
• Ignoring heteroskedasticity results in invalid inference
⋆ naïve standard errors (that assume homoskedasticity) are incorrect

• We’ve seen three possible remedies:


(1) transform the response variable
(2) use OLS and base inference on a valid standard error
(3) use WLS

• Recall, β̂_WLS is the solution to

0 = ∂/∂β RSS(β; W)

0 = ∂/∂β Σ_{i=1}^n wi (yi − Xi^T β)²

0 = Σ_{i=1}^n Xi wi (yi − Xi^T β)

✫ ✪

202 BIO 233, Spring 2015


✬ ✩
• For a binary response, we know the form of V[Yi ]
⋆ estimate β by setting W = Σ−1 , a diagonal matrix with elements:

wi = 1 / (µi (1 − µi))

• From the Gauss-Markov Theorem, the resulting estimator is BLUE

β̂_GLS = (X^T Σ^{-1} X)^{-1} X^T Σ^{-1} Y

• Note, the least squares equations become


0 = Σ_{i=1}^n [ Xi / (µi (1 − µi)) ] (yi − µi)

⋆ in practice, we use the IWLS algorithm to estimate β̂_GLS while simultaneously accommodating the mean-variance relationship


✫ ✪

203 BIO 233, Spring 2015


✬ ✩
• We can also show that β̂_GLS, obtained via the IWLS algorithm, is the MLE

⋆ firstly, note that the likelihood and log-likelihood are:


L(β|y) = Π_{i=1}^n µi^{yi} (1 − µi)^{1−yi}

ℓ(β|y) = Σ_{i=1}^n [ yi log(µi) + (1 − yi) log(1 − µi) ]

⋆ to get the MLE, we take derivatives, set them equal to zero and solve
⋆ following the algebra trail we find that

∂ℓ(β|y)/∂β = Σ_{i=1}^n [ Xi / (µi (1 − µi)) ] (Yi − µi)

• The score equations are equivalent to the least squares equations


⋆ β̂_GLS is therefore the MLE
✫ ✪

204 BIO 233, Spring 2015


✬ ✩
• So, least squares estimation can accommodate implicit heteroskedasticity
for binary data by using the IWLS algorithm
⋆ assuming the model is correctly specified, WLS is in fact optimal!

• However, when modeling binary or count response data, the linear


regression model doesn’t respect the fact that the outcome is bounded

⋆ the functional that is being modeled is bounded:


∗ binary: E[Yi |Xi ] ∈ (0, 1)
∗ count: E[Yi |Xi ] ∈ (0, ∞)

⋆ but our current specification of the mean model doesn’t impose any
restrictions

E[Yi |Xi ] = XiT β

Q: Is this a problem?

✫ ✪

205 BIO 233, Spring 2015
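As a quick illustration of the question above, the following R sketch (simulated data; the variable names and effect size are arbitrary) fits an unconstrained linear mean model to a binary response and inspects the range of the fitted means:

## Simulated binary data whose true mean follows a logistic curve
set.seed(0)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(2 * x))
fit.lm <- lm(y ~ x)            # linear mean model E[Y|X] = X'beta
range(fitted(fit.lm))          # typically extends below 0 and/or above 1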


✬ ✩

Summary

• Our goal is to develop statistical models to characterize the relationship


between some response variable, Y , and a vector of covariates, X

• Statistical models consist of two components:


⋆ a systematic component
⋆ a random component

• When moving beyond linear regression analysis of continuous response


data, we need to be aware of two key challenges:
(1) sensible specification of the systematic component
(2) proper accounting of any implicit mean-variance relationships
arising from the random component

✫ ✪

206 BIO 233, Spring 2015


✬ ✩

Generalized Linear Models

Definition

• A generalized linear model (GLM) specifies a parametric statistical model


for the conditional distribution of a response Yi given a p-vector of
covariates Xi

• Consists of three elements:


(1) probability distribution, Y ∼ fY (y)
(2) linear predictor, XiT β
(3) link function, g(·)

⋆ element (1) is the random component


⋆ elements (2) and (3) jointly specify the systematic component
✫ ✪

207 BIO 233, Spring 2015


✬ ✩

Random component

• In practice, we see a wide range of response variables with a wide range of


associated (possible) distributions

Response type Range Possible distribution


Continuous (−∞, ∞) Normal(µ, σ 2 )
Binary {0, 1} Bernoulli(π)
Polytomous {1, . . . , K} Multinomial(πk )
Count {0, 1, . . . , n} Binomial(n, π)
Count {0, 1, . . .} Poisson(µ)
Continuous (0, ∞) Gamma(α, β)
Continuous (0, 1) Beta(α, β)

• Desirable to have a single framework that accommodates all of these


✫ ✪

208 BIO 233, Spring 2015


✬ ✩

Systematic component

• For a given choice of probability distribution, a GLM specifies a model for


the conditional mean:

µi = E[Yi |Xi ]

Q: How do we specify reasonable models for µi while ensuring that we respect


the appropriate range/scale of µi ?

• Achieved by constructing a linear predictor XiT β and relating it to µi via a


link function g(·):

g(µi ) = XiT β

⋆ often use the notation ηi = XiT β


✫ ✪

209 BIO 233, Spring 2015


✬ ✩

The random component

• GLMs form a class of statistical models for response variables whose


distribution belongs to the exponential dispersion family

⋆ family of distributions with a pdf/pmf of the form:


 
fY(y; θ, φ) = exp{ [yθ − b(θ)] / a(φ) + c(y, φ) }

⋆ θ is the canonical parameter

⋆ φ is the dispersion parameter

⋆ b(θ) is the cumulant function

• Many common distributions are members of this family


✫ ✪

210 BIO 233, Spring 2015


✬ ✩
• Y ∼ Bernoulli(µ)

fY(y; µ) = µ^y (1 − µ)^{1−y}

fY(y; θ, φ) = exp{ yθ − log(1 + exp{θ}) }

θ = log( µ / (1 − µ) )

a(φ) = 1

b(θ) = log(1 + exp{θ})

c(y, φ) = 0
✫ ✪

211 BIO 233, Spring 2015


✬ ✩
• Many other common distributions are also members of this family

• The canonical parameter has key relationships with both E[Y ] and V[Y ]
⋆ typically varies across study units
⋆ index θ by i: θi

• The dispersion parameter has a key relationship with V[Y ]


⋆ may but typically does not vary across study units
⋆ typically no unit-specific index: φ
⋆ in some settings we may have a(·) vary with i: ai (φ)
∗ e.g. ai (φ) = φ/wi , where wi is a prior weight

• When the dispersion parameter is known, we say that the distribution is a


member of the exponential family

✫ ✪

212 BIO 233, Spring 2015


✬ ✩

Properties

• Consider the likelihood function for a single observation


 
L(θi, φ; yi) = exp{ [yi θi − b(θi)] / ai(φ) + c(yi, φ) }

• The log-likelihood is

ℓ(θi, φ; yi) = [yi θi − b(θi)] / ai(φ) + c(yi, φ)

• The first partial derivative with respect to θi is the score function for θi
and is given by

∂ℓ(θi, φ; yi)/∂θi = U(θi) = [yi − b′(θi)] / ai(φ)

✫ ✪

213 BIO 233, Spring 2015


✬ ✩
• Using standard results from likelihood theory, we know that under
appropriate regularity conditions:

E[U(θi)] = 0

V[U(θi)] = E[ U(θi)² ] = −E[ ∂U(θi)/∂θi ]

⋆ this latter expression is the (i, i)th component of the Fisher information
matrix

• Since the score has mean zero, we find that

E[ (Yi − b′(θi)) / ai(φ) ] = 0

and, consequently, that

E[Yi ] = b′ (θi )
✫ ✪

214 BIO 233, Spring 2015


✬ ✩
• The second partial derivative of ℓ(θi , φ; yi ) is

∂²ℓ(θi, φ; yi)/∂θi² = −b′′(θi) / ai(φ)

⋆ the observed information for the canonical parameter from the ith
observation

• This is also the expected information and using the above properties it follows that

V[U(θi)] = V[ (Yi − b′(θi)) / ai(φ) ] = b′′(θi) / ai(φ),

so that

V[Yi ] = b′′ (θi )ai (φ)

✫ ✪

215 BIO 233, Spring 2015


✬ ✩
• The variance of Yi is therefore a function of both θi and φ

• Note that the canonical parameter is a function of µi

µi = b′(θi) ⇒ θi = θ(µi) = (b′)^{-1}(µi)

so that we can write

V[Yi ] = b′′ (θ(µi ))ai (φ)

• The function V (µi ) = b′′ (θ(µi )) is called the variance function


⋆ specific form indicates the nature of the (if any) mean-variance
relationship
• For example, for Y ∼ Bernoulli(µ)

a(φ) = 1

✫ ✪

216 BIO 233, Spring 2015


✬ ✩
b(θ) = log(1 + exp{θ})

E[Y] = b′(θ) = exp{θ} / (1 + exp{θ}) = µ

V[Y] = b′′(θ) a(φ) = exp{θ} / (1 + exp{θ})² = µ(1 − µ)

V(µ) = µ(1 − µ)

✫ ✪

217 BIO 233, Spring 2015
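A quick numerical check of the Bernoulli results above in R (a sketch; the value of θ and the step size h are arbitrary illustrative choices):

## Cumulant function for the Bernoulli distribution
b <- function(theta) log(1 + exp(theta))
theta <- 0.7
h <- 1e-4
mu <- exp(theta) / (1 + exp(theta))
(b(theta + h) - b(theta - h)) / (2 * h)             # numerical b'(theta)
mu                                                  # = E[Y]
(b(theta + h) - 2 * b(theta) + b(theta - h)) / h^2  # numerical b''(theta)
mu * (1 - mu)                                       # = V(mu)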


✬ ✩

The systematic component

• For the exponential dispersion family, the pdf/pmf has the following form:
 
fY(yi; θi, φ) = exp{ [yi θi − b(θi)] / ai(φ) + c(yi, φ) }

⋆ this distribution is the random component of the statistical model

• We need a means of specifying how this distribution depends on a vector


of covariates Xi
⋆ the systematic component

• In GLMs we model the conditional mean, µi = E[Yi |Xi ]


⋆ provides a connection between Xi and the distribution of Yi via the canonical parameter θi and the cumulant function b(θi)
✫ ✪

218 BIO 233, Spring 2015


✬ ✩
• Specifically, the relationship between µi and Xi is given by

g(µi ) = XiT β

⋆ we ‘link’ the linear predictor to the distribution of Yi via a


transformation of µi

• Traditionally, this specification is broken down into two parts:


(1) the linear predictor, ηi = XiT β
(2) the link function, g(µi ) = ηi

• You’ll often find the linear predictor called the ‘systematic component’
⋆ e.g., McCullagh and Nelder (1989) Generalized Linear Models

• In practice, one cannot consider one without the other


⋆ the relationship between µi and Xi is jointly determined by β and g(·)
✫ ✪

219 BIO 233, Spring 2015


✬ ✩

The linear predictor, ηi = XiT β

• Constructing the linear predictor for a GLM follows the same process one
uses for linear regression

• Given a set of covariates Xi , there are two decisions


⋆ which covariates to include in the model?
⋆ how to include them in the model?

• For the most part, the decision of which covariates to include should be
driven by scientific considerations
⋆ is the goal estimation or prediction?
⋆ is there a primary exposure of interest?
⋆ which covariates are predictors of the response variable?
⋆ are any of the covariates effect modifiers? confounders?
✫ ✪

220 BIO 233, Spring 2015


✬ ✩
• In some settings, practical or data-oriented considerations may drive these
decisions
⋆ small sample sizes
⋆ missing data
⋆ measurement error/misclassification

• How one includes them in the model will also depend on a mixture of
scientific and practical considerations

• Suppose we are interested in the relationship between birth weight and risk
of death within the first year of life
⋆ infant mortality

• Note: birth weight is a continuous covariate


⋆ there are a number of options for including a continuous covariate into
the linear predictor
✫ ✪

221 BIO 233, Spring 2015


✬ ✩
• Let Xw denote the continuous birth weight measure

• A simple model would be to include Xw via a linear term

η = β0 + β1 Xw

⋆ a ‘constant’ relationship between birth weight and infant mortality

• May be concerned that this is too restrictive a model


⋆ include additional polynomial terms

η = β0 + β1 Xw + β2 Xw² + β3 Xw³

⋆ more flexible than the linear model


⋆ but the interpretation of β2 and β3 is difficult

✫ ✪

222 BIO 233, Spring 2015


✬ ✩
• Scientifically, one might only be interested in the ‘low birth weight’
threshold
⋆ let Xlbw = 0/1 if birth weight is >2.5kg/≤2.5kg

η = β0 + β1 Xlbw

⋆ impact of birth weight on risk of infant mortality manifests solely


through whether or not the baby has a low birth weight

• The underlying relationship may be more complex than a simple linear or


threshold effect, although we don’t like the (lack of) interpretability of the
polynomial model
⋆ categorize the continuous covariates into K + 1 groups
⋆ include in the linear predictor via K dummy variables

η = β0 + β1 Xcat,1 + . . . + βK Xcat,K

✫ ✪

223 BIO 233, Spring 2015
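In R, the birth-weight codings discussed above correspond to different formula terms. A sketch using a simulated data frame (the variable names, cut points, and risk model are illustrative, not from the course data):

## Hypothetical data: birth weight Xw in kg and a binary outcome 'death'
set.seed(10)
infants <- data.frame(Xw = runif(500, 0.5, 4.5))
infants$death <- rbinom(500, 1, plogis(1 - 1.5 * infants$Xw))

fit.lin  <- glm(death ~ Xw,                     family=binomial, data=infants)  # linear term
fit.poly <- glm(death ~ Xw + I(Xw^2) + I(Xw^3), family=binomial, data=infants)  # polynomial terms
fit.lbw  <- glm(death ~ I(Xw <= 2.5),           family=binomial, data=infants)  # low birth weight indicator
fit.cat  <- glm(death ~ cut(Xw, c(0, 1.5, 2.5, 3.5, Inf)), family=binomial, data=infants)  # K+1 categories via K dummies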


✬ ✩

The link function, g(·)

• Given the form of linear predictor XiT β we need to specify how it is


related to the conditional mean µi

• As we’ve noted, the range of values that µi can take on may be restricted
⋆ binary data: µi ∈ (0, 1)
⋆ count data: µi ∈ (0, ∞)

• One approach would be to estimate β subject to the constraint that all


(modeled) values of µi respect the appropriate range

Q: What might the drawbacks of such an approach be?

✫ ✪

224 BIO 233, Spring 2015


✬ ✩
• An alternative is to permit the estimation of β to be ‘free’ but impose a
functional form of the relationship between µi and XiT β
⋆ via the link function g(·)

g(µi ) = XiT β

• We interpret the link function as specifying a transformation of the


conditional mean, µi
⋆ we are not specifying a transformation of the response Yi

• The inverse of the link function provides the specification of the model on
the scale of µi

µi = g^{-1}(Xi^T β)

⋆ link functions are therefore usually monotone and have a well-defined inverse

✫ ✪

225 BIO 233, Spring 2015


✬ ✩
• In linear regression we specify

µi = XiT β

⋆ g(·) is the identity link

• In logistic regression we specify


 
log( µi / (1 − µi) ) = Xi^T β
⋆ g(·) is the logit or logistic link

• In Poisson regression we specify

log(µi ) = XiT β

⋆ g(·) is the log link

✫ ✪

226 BIO 233, Spring 2015


✬ ✩
• For linear regression also we have that

µi = XiT β

⋆ g −1 (ηi ) = ηi is the identity function

• For logistic regression

µi = exp{Xi^T β} / (1 + exp{Xi^T β})

⋆ g −1 (ηi ) = expit(ηi ) is the expit function

• For Poisson regression

µi = exp{Xi^T β}

⋆ g −1 (ηi ) = exp(ηi ) is the exponential function


✫ ✪

227 BIO 233, Spring 2015
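The link/inverse-link pairs above are available directly from R family objects; a brief sketch (the value of η is arbitrary):

logit <- binomial()$linkfun    # g(mu) = log(mu / (1 - mu))
expit <- binomial()$linkinv    # g^{-1}(eta) = exp(eta) / (1 + exp(eta))
eta <- 0.4
mu  <- expit(eta)
logit(mu)                      # recovers eta, since g and g^{-1} are inverses
poisson()$linkinv(eta)         # exp(eta), the inverse of the log link
exp(eta)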


✬ ✩

The canonical link

• Recall that the mean and the canonical parameter are linked via the
derivative of the cumulant function
⋆ E[Yi ] = µi = b′ (θi )

• An important link function is the canonical link:

g(µi ) = θ(µi )

⋆ the function that results by viewing the canonical parameter θi as a


function of µi
⋆ inverse of b′ (·)

• We’ll see later that this choice results in some mathematical convenience

✫ ✪

228 BIO 233, Spring 2015


✬ ✩

Choosing g(·)

• In practice, there are often many possible link functions

• For binary response data, one might choose a link function from among
the following:

identity: g(µi) = µi
log: g(µi) = log(µi)
logit: g(µi) = log( µi / (1 − µi) )
probit: g(µi) = probit(µi) = Φ^{-1}(µi)
complementary log-log: g(µi) = log{ −log(1 − µi) }

⋆ note the logit link is the canonical link function

✫ ✪

229 BIO 233, Spring 2015


✬ ✩
• We typically choose a specific link function via consideration of two issues:

(1) respect of the range of values that µi can take

(2) impact on the interpretability of β

• There can be a trade-off between mathematical convenience and


interpretability of the model

• We’ll spend more time on this later on in the course

✫ ✪

230 BIO 233, Spring 2015


✬ ✩

Frequentist estimation and inference

• Given an i.i.d sample of size n, the log-likelihood is

ℓ(β, φ; y) = Σ_{i=1}^n { [yi θi − b(θi)] / ai(φ) + c(yi, φ) }

where θi is a function of β and is determined by


⋆ the form of b′ (θi ) = µi
⋆ the choice of the link function via g(µi ) = ηi = XiT β

• The primary goal is to perform estimation and inference with respect to β

• Since we’ve fully specified the likelihood, we can proceed with


likelihood-based estimation/inference
✫ ✪

231 BIO 233, Spring 2015


✬ ✩

Estimation

• There are (p+2) unknown parameters: (β, φ)

• To obtain the MLE we need to solve the score equations:

( ∂ℓ(β, φ; y)/∂β0, · · · , ∂ℓ(β, φ; y)/∂βp, ∂ℓ(β, φ; y)/∂φ )^T = 0

⋆ system of (p+2) equations

• The contribution to the score for φ by the ith unit is

∂ℓ(β, φ; yi)/∂φ = −[ a′i(φ) / ai(φ)² ] (yi θi − b(θi)) + c′(yi, φ)

✫ ✪

232 BIO 233, Spring 2015


✬ ✩
• We can use the chain rule to obtain a convenient expression for the ith
contribution to the score function for βj :

∂ℓ(β, φ; yi)/∂βj = [∂ℓ(β, φ; yi)/∂θi] · [∂θi/∂µi] · [∂µi/∂ηi] · [∂ηi/∂βj]

• Note the following results:

∂ℓ(β, φ; yi)/∂θi = (yi − µi) / ai(φ)

∂µi/∂θi = b′′(θi) = V[Yi] / ai(φ)

∂ηi/∂βj = Xj,i

✫ ✪

233 BIO 233, Spring 2015


✬ ✩
• The score function for βj can therefore be written as

∂ℓ(β, φ; y)/∂βj = Σ_{i=1}^n (yi − µi) [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ]

⋆ depends on the distribution of Yi solely through E[Yi ] = µi and


V[Yi ] = V (µi )ai (φ)

• Suppose ai (φ) = φ/wi . The score equations become

∂ℓ(β, φ; y)/∂φ = Σ_{i=1}^n { −wi (yi θi − b(θi)) / φ² + c′(yi, φ) } = 0

∂ℓ(β, φ; y)/∂βj = Σ_{i=1}^n wi (yi − µi) [∂µi/∂ηi] Xj,i / V(µi) = 0

✫ ✪

234 BIO 233, Spring 2015


✬ ✩
• Notice that the (p+1) score equations for β do not depend on φ

• Consequently, obtaining the MLE of β doesn’t require knowledge of φ


⋆ φ isn’t required to be known or estimated (if unknown)
⋆ for example, in linear regression we don’t need σ² (or σ̂²) to obtain

β̂_MLE = (X^T X)^{-1} X^T Y

⋆ inference does require an estimate of φ (see below)

✫ ✪

235 BIO 233, Spring 2015


✬ ✩

Asymptotic sampling distribution

• From standard likelihood theory, subject to appropriate regularity


conditions,
√n ( (β̂_MLE, φ̂_MLE) − (β, φ) ) −→ MVN( 0, I(β, φ)^{-1} )

• To get the asymptotic variance, we first need to derive expressions for the
second partial derivatives:

∂²ℓ(β, φ; yi)/∂βj∂βk = ∂/∂βk { (yi − µi) [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ] }

= (yi − µi) ∂/∂βk { [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ] } − [∂µi/∂ηi]² Xj,i Xk,i / [ V(µi) ai(φ) ]

✫ ✪

236 BIO 233, Spring 2015


✬ ✩
∂²ℓ(β, φ; yi)/∂βj∂φ = ∂/∂φ { (yi − µi) [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ] }

= −[ a′i(φ) / ai(φ)² ] (yi − µi) [∂µi/∂ηi] Xj,i / V(µi)

∂²ℓ(β, φ; yi)/∂φ∂φ = ∂/∂φ { −[ a′i(φ) / ai(φ)² ] (yi θi − b(θi)) + c′(yi, φ) }

= −[ ( ai(φ)² a′′i(φ) − 2 ai(φ) a′i(φ)² ) / ai(φ)⁴ ] (yi θi − b(θi)) + c′′(yi, φ)

= −K(φ) (yi θi − b(θi)) + c′′(yi, φ)

✫ ✪

237 BIO 233, Spring 2015


✬ ✩
• Upon taking expectations with respect to Y , we find that

−E[ ∂²ℓ(β, φ; y)/∂βj∂βk ] = Σ_{i=1}^n [∂µi/∂ηi]² Xj,i Xk,i / [ V(µi) ai(φ) ]

• The second expression has mean zero, so that

−E[ ∂²ℓ(β, φ; y)/∂βj∂φ ] = 0

• Taking the expectation of the negative of the third expression gives:

−E[ ∂²ℓ(β, φ; y)/∂φ∂φ ] = Σ_{i=1}^n { K(φ) (b′(θi)θi − b(θi)) − E[c′′(Yi, φ)] }

✫ ✪

238 BIO 233, Spring 2015


✬ ✩
• The expected information matrix can therefore be written in block-diagonal form:

I(β, φ) = [ Iββ  0  ;  0  Iφφ ]

where the components of Iββ are given by the first expression on the
previous slide and the Iφφ is given by the last expression on the previous
slide

• The inverse of the information matrix gives the asymptotic variance


 
V[β̂_MLE, φ̂_MLE] = I(β, φ)^{-1} = [ Iββ^{-1}  0  ;  0  Iφφ^{-1} ]

✫ ✪

239 BIO 233, Spring 2015


✬ ✩

• The block-diagonal structure of V[β̂_MLE, φ̂_MLE] indicates that for GLMs valid characterization of the uncertainty in our estimate of β does not require the propagation of uncertainty in our estimation of φ

• For example, for linear regression of Normally distributed response data we


plug in an estimate of σ² into

V[β̂_MLE] = σ² (X^T X)^{-1}

⋆ we typically don’t plug in σ̂²_MLE but, rather, an unbiased estimate:

σ̂² = [ 1 / (n − p − 1) ] Σ_{i=1}^n (Yi − Xi^T β̂_MLE)²

⋆ further, we don’t worry about the fact that what we plug in is an


estimate of σ 2

✫ ✪

240 BIO 233, Spring 2015


✬ ✩
• For GLMs, therefore, estimation of the variance of β̂_MLE proceeds by plugging in the values of (β̂_MLE, φ̂) into the upper (p+1)×(p+1) sub-matrix:

V̂[β̂_MLE] = Î_ββ^{-1}

where φ̂ is any consistent estimator of φ

✫ ✪

241 BIO 233, Spring 2015


✬ ✩

Matrix notation

• If we set

Wi = [∂µi/∂ηi]² · 1 / [ V(µi) ai(φ) ]

then the (j, k)th element of Iββ can be expressed as


Σ_{i=1}^n Wi Xj,i Xk,i

• We can therefore write:

Iββ = X^T W X

where W is an n × n diagonal matrix with entries Wi , i = 1, . . ., n, and


X is the design matrix from the specification of the linear predictor
✫ ✪

242 BIO 233, Spring 2015
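The matrix form Iββ = X^T W X can be checked numerically against the variance returned by glm(); a sketch with simulated logistic-regression data (variable names and sample size are arbitrary):

set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))
fit <- glm(y ~ x, family = binomial)
X  <- model.matrix(fit)
mu <- fitted(fit)
W  <- mu * (1 - mu)          # for the logit link (a(phi)=1), W_i = mu_i(1 - mu_i)
Ibb <- t(X) %*% (W * X)      # X^T W X
solve(Ibb)                   # agrees with
vcov(fit)                    # the model-based variance of beta-hat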


✬ ✩

Special case: canonical link function

• For the canonical link function, ηi = g(µi ) = θi (µi ), so that

∂θi/∂ηi = 1 ⇒ ∂µi/∂ηi = [∂µi/∂θi][∂θi/∂ηi] = V[Yi] / ai(φ) = V(µi)

• The score contribution for βj by the ith unit simplifies to

∂ℓ(β, φ; yi)/∂βj = (yi − µi) [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ] = (yi − µi) Xj,i / ai(φ)

and the components of the sub-matrix for β of the expected information


matrix, Iββ , are the summation of

−E[ ∂²ℓ(β, φ; yi)/∂βj∂βk ] = [∂µi/∂ηi]² Xj,i Xk,i / [ V(µi) ai(φ) ] = V(µi) Xj,i Xk,i / ai(φ)
✫ ✪

243 BIO 233, Spring 2015
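For the canonical (logit) link with ai(φ) = 1, the score simplifies to X^T(y − µ), so it should be numerically zero at the MLE; a quick R check (simulated data, arbitrary names):

set.seed(2)
x <- rnorm(300)
y <- rbinom(300, 1, plogis(0.5 * x))
fit <- glm(y ~ x, family = binomial)
X  <- model.matrix(fit)
mu <- fitted(fit)
colSums(X * (y - mu))    # both score components are ~0 at the MLE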


✬ ✩

Hypothesis testing

• For the linear predictor XiT β, suppose we partition β = (β 1 , β 2 ) and we


are interested in testing:

H0 : β1 = β1,0 vs Ha : β1 ≠ β1,0

⋆ length of β 1 is q ≤ (p + 1)
⋆ β 2 is left arbitrary

• In most settings, β 1,0 = 0 which represents some form of ‘no effect’


⋆ at least given the structure of the model

• Following our review of asymptotic theory, there are three common


hypothesis testing frameworks

✫ ✪

244 BIO 233, Spring 2015


✬ ✩
• Wald test:
⋆ let β̂_MLE = (β̂_1,MLE , β̂_2,MLE )

⋆ under H0

(β̂_1,MLE − β1,0)^T V̂[β̂_1,MLE]^{-1} (β̂_1,MLE − β1,0) −→d χ²_q

where V̂[β̂_1,MLE] is the inverse of the q × q sub-matrix of Iββ corresponding to β1, evaluated at β̂_1,MLE

• Score test:
⋆ let β̂_0,MLE = (β1,0 , β̂_2,MLE ) denote the MLE under H0

⋆ under H0

U(β̂_0,MLE ; y)^T I(β̂_0,MLE)^{-1} U(β̂_0,MLE ; y) −→d χ²_q

✫ ✪

245 BIO 233, Spring 2015


✬ ✩
• Likelihood ratio test:
⋆ obtain the ‘best fitting model’ without restrictions: θ̂_MLE
⋆ obtain the ‘best fitting model’ under H0 : θ̂_0,MLE
⋆ under H0

2( ℓ(β̂_MLE ; y) − ℓ(β̂_0,MLE ; y) ) −→d χ²_q

✫ ✪

246 BIO 233, Spring 2015
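The Wald and likelihood ratio tests above are easy to carry out for a fitted glm; a sketch with simulated data (the covariates, effect sizes, and the null hypothesis that the coefficient of x2 is zero are all illustrative):

set.seed(3)
n  <- 500
x1 <- rnorm(n); x2 <- rbinom(n, 1, 0.5)
y  <- rbinom(n, 1, plogis(-1 + 0.8 * x1))
fit1 <- glm(y ~ x1 + x2, family = binomial)   # unrestricted model
fit0 <- glm(y ~ x1,      family = binomial)   # restricted model under H0

## Wald test of H0: beta_{x2} = 0
b  <- coef(fit1)["x2"]
se <- sqrt(vcov(fit1)["x2", "x2"])
(b / se)^2                          # compare to qchisq(0.95, df = 1)

## Likelihood ratio test
anova(fit0, fit1, test = "LRT")
2 * (as.numeric(logLik(fit1)) - as.numeric(logLik(fit0)))   # the same statistic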


✬ ✩

Iteratively re-weighted least squares

• We saw that the score equation for βj is

∂ℓ(β, φ; y)/∂βj = Σ_{i=1}^n (yi − µi) [∂µi/∂ηi] Xj,i / [ V(µi) ai(φ) ] = 0

⋆ estimation of β requires solving (p + 1) of these equations


simultaneously
⋆ tricky because β appears in several places

• A general approach to finding roots is the Newton-Raphson algorithm


⋆ iterative procedure based on the gradient

• For a GLM, the gradient is the derivative of the score function with
respect to β

⋆ these form the components of the observed information matrix
✫ ✪

247 BIO 233, Spring 2015


✬ ✩
• Fisher scoring is an adaptation of the Newton-Raphson algorithm that
uses the expected information, Iββ , rather than the observed information, for the update

• Suppose the current estimate of β is β̂^(r)
⋆ compute the following:

ηi^(r) = Xi^T β̂^(r)

µi^(r) = g^{-1}( ηi^(r) )

Wi^(r) = [∂µi/∂ηi]²|_{ηi^(r)} · 1 / V( µi^(r) )

zi^(r) = ηi^(r) + ( yi − µi^(r) ) [∂ηi/∂µi]|_{µi^(r)}

⋆ Wi is called the ‘working weight’
⋆ zi is called the ‘adjusted response variable’
✫ ✪

248 BIO 233, Spring 2015


✬ ✩
• The updated value of β̂ is obtained as the WLS estimate from the regression of Z on X:

β̂^(r+1) = (X^T W^(r) X)^{-1} (X^T W^(r) Z^(r))

⋆ X is the n × (p + 1) design matrix from the initial specification of the


model
⋆ W^(r) is a diagonal n × n matrix with entries {W1^(r), . . . , Wn^(r)}
⋆ Z^(r) is the n-vector (z1^(r), . . . , zn^(r))

• Iterate until the value of β̂ converges
⋆ i.e. the difference between β̂^(r+1) and β̂^(r) is ‘small’

✫ ✪

249 BIO 233, Spring 2015
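A bare-bones IWLS/Fisher-scoring loop for logistic regression, compared against glm(); this is a sketch with simulated data, and the starting value, iteration cap, and convergence tolerance are arbitrary choices:

set.seed(4)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 - x))
X <- cbind(1, x)

beta <- rep(0, ncol(X))                   # starting value
for (r in 1:25) {
  eta <- drop(X %*% beta)
  mu  <- plogis(eta)
  W   <- mu * (1 - mu)                    # working weights for the logit link
  z   <- eta + (y - mu) / W               # adjusted response
  beta.new <- solve(t(X) %*% (W * X), t(X) %*% (W * z))   # WLS update
  if (max(abs(beta.new - beta)) < 1e-10) { beta <- beta.new; break }
  beta <- beta.new
}
cbind(IWLS = drop(beta), glm = coef(glm(y ~ x, family = binomial)))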


✬ ✩

Fitting GLMs in R with glm()

• A generic call to glm() is given by

fit0 <- glm(formula, family, data, ...)

⋆ many other arguments that control various aspects of the model/fit


⋆ ?glm for more information

• ‘data’ specifies the data frame containing the response and covariate data

• ‘formula’ specifies the structure of linear predictor, ηi = XiT β


⋆ input is an object of class ‘formula’
⋆ typical input might be of the form:
Y ~ X1 + X2 + X3
⋆ ?formula for more information
✫ ✪

250 BIO 233, Spring 2015


✬ ✩
• ‘family’ jointly specifies the probability distribution fY (·), link function
g(·) and variance function V (·)
⋆ most common distributions have already been implemented
⋆ input is an object of class ‘family’
∗ object is a list of elements describing the details of the GLM

• The call for a standard logistic regression for binary data might be of the
form:

glm(Y ~ X1 + X2, family=binomial(), data=myData)

or, more simply,

glm(Y ~ X1 + X2, family=binomial, data=myData)

✫ ✪

251 BIO 233, Spring 2015


✬ ✩
• A more detailed look at family objects:

> ##
> ?family
> poisson()

Family: poisson
Link function: log
> ##
> myFamily <- binomial()
> myFamily

Family: binomial
Link function: logit
> names(myFamily)
[1] "family" "link" "linkfun" "linkinv" "variance"
"dev.resids" "aic"
[8] "mu.eta" "initialize" "validmu" "valideta" "simulate"
> myFamily$link
[1] "logit"

✫ ✪

252 BIO 233, Spring 2015


✬ ✩

> myFamily$variance
function (mu)
mu * (1 - mu)
>
> ## Changing the link function
> ## * for a true ’log-linear’ model we’d need to make appropriate
> ## changes to the other components of the family object
> ##
> myFamily$link <- "log"
>
> ## Standard logistic regression
> ##
> fit0 <- glm(Y ~ X, family=binomial)
>
> ## log-linear model for binary data
> ##
> fit1 <- glm(Y ~ X, family=binomial(link = "log"))
>
> ## which is (currently) not the same as
> ##

✫ ✪
> fit1 <- glm(Y ~ X, family=myFamily)

253 BIO 233, Spring 2015


✬ ✩
• Once you’ve fit a GLM you can examine the components of the glm object:

> ##
> names(fit0)
[1] "coefficients" "residuals" "fitted.values" "effects"
[5] "R" "rank" "qr" "family"
[9] "linear.predictors" "deviance" "aic" "null.deviance"
[13] "iter" "weights" "prior.weights" "df.residual"
[17] "df.null" "y" "converged" "boundary"
[21] "model" "call" "formula" "terms"
[25] "data" "offset" "control" "method"
[29] "contrasts" "xlevels"
>
> ##
> names(summary(fit0))
[1] "call" "terms" "family" "deviance" "aic"
[6] "contrasts" "df.residual" "null.deviance" "df.null" "iter"
[11] "deviance.resid" "coefficients" "aliased" "dispersion" "df"
[16] "cov.unscaled" "cov.scaled"

✫ ✪

254 BIO 233, Spring 2015


✬ ✩

The deviance

• Recall, the contribution to the log-likelihood by the ith study unit is

ℓ(θi, φ; yi) = [yi θi − b(θi)] / ai(φ) + c(yi, φ)

• Implicitly, θi is a function of µi so we could write the log-likelihood


contribution as a function of µi :

ℓ(θi , φ; yi ) ⇒ ℓ(µi , φ; yi )

• Given β̂_MLE, we can compute each µ̂i and evaluate

ℓ(µ̂, φ; y) = Σ_{i=1}^n ℓ(µ̂i, φ; yi),

⋆ the maximum log-likelihood


✫ ✪

255 BIO 233, Spring 2015


✬ ✩
• ℓ(µ̂, φ; y) is the maximum achievable log-likelihood given the structure of
the model
⋆ µi is modeled via g(µi ) = ηi = XiT β
⋆ any other value of β would correspond to a lower value of the
log-likelihood

• The overall maximum achievable log-likelihood, however, is one based on a


saturated model
⋆ same number of parameters as observations
⋆ each observation is its own mean: µi = yi

ℓ(y, φ; y) = Σ_{i=1}^n ℓ(yi, φ; yi),

⋆ this represents the ‘best possible fit’

✫ ✪

256 BIO 233, Spring 2015


✬ ✩
• The difference

D*(y, µ̂) = 2 [ ℓ(y, φ; y) − ℓ(µ̂, φ; y) ]

is called the scaled deviance

• Let
⋆ θ̃i be the value of θi based on setting µi = yi
⋆ θ̂i be the value of θi based on setting µi = µ̂i

• If we take ai (φ) = φ/wi , then

D*(y, µ̂) = Σ_{i=1}^n (2wi/φ) [ yi(θ̃i − θ̂i) − b(θ̃i) + b(θ̂i) ] = D(y, µ̂)/φ

• D(y, µ̂) is the deviance for the current model
✫ ✪

257 BIO 233, Spring 2015


✬ ✩
• D(y, µ̂) is used as a measure of goodness of fit of the model to the data
⋆ measures the ‘discrepancy’ between the fitted model and the data

• For the Normal distribution, the deviance is the sum of squared residuals:
D(y, µ̂) = Σ_{i=1}^n (yi − µ̂i)²

⋆ has an exact χ2 distribution


⋆ compare two nested models by taking the difference in their deviances
∗ distribution of the difference is still a χ2
∗ the likelihood ratio test

• Beyond the Normal distribution the deviance is not χ2

• But we still can rely on a χ2 approximation to the asymptotic sampling

✫ ✪
distribution of the difference in the deviance between two models

258 BIO 233, Spring 2015
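For ungrouped binary data the saturated log-likelihood is 0, so the deviance reduces to −2 times the maximized log-likelihood, and a difference in deviances between nested models is the likelihood ratio statistic; a brief R check (simulated data, arbitrary names):

set.seed(5)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))
fit  <- glm(y ~ x, family = binomial)
deviance(fit)
-2 * as.numeric(logLik(fit))            # the same value for binary data
fit0 <- glm(y ~ 1, family = binomial)
deviance(fit0) - deviance(fit)          # the likelihood ratio statistic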


✬ ✩

Residuals

• In the context of regression modeling, residuals are used primarily to

⋆ examine the adequacy of model fit


∗ functional form for terms in the linear predictor
∗ link function
∗ variance function

⋆ investigate potential data issues


∗ e.g. outliers
• Interpreted as representing variation in the outcome that is not explained
by the model
⋆ variation once the systematic component has been accounted for
⋆ residuals are therefore model-specific

✫ ✪

259 BIO 233, Spring 2015


✬ ✩
• An ideal residual would look like an i.i.d sample when the correct mean
model is fit

• For linear regression, we often consider the raw or response residual

ri = yi − µ̂i

⋆ if the εi are homoskedastic then {r1, . . . , rn} will be i.i.d

• For GLMs the underlying probability distribution is often skewed and


exhibits a mean-variance relationship

• Pearson residuals account for the heteroskedasticity via standardization

ri^p = (yi − µ̂i) / √V(µ̂i)

⋆ Pearson χ² statistic for goodness-of-fit is equal to Σi (ri^p)²
✫ ✪

260 BIO 233, Spring 2015


✬ ✩
• The deviance residual is defined as
ri^d = sign(yi − µ̂i) √di

where di is the contribution to D(y, µ̂) from the ith study unit
⋆ why is this a reasonable quantity to consider?

• Pierce and Schafer (JASA, 1986) examined various residuals for GLMs
⋆ conclude that deviance residuals are ‘a very good choice’
⋆ very nearly normally distributed after one allows for the discreteness
⋆ continuity correction which replaces

yi ⇒ yi ± 1/2

in the definition of the residual


∗ +/− chosen to move the value closer to µ̂i
✫ ✪

261 BIO 233, Spring 2015


✬ ✩
• All three types of residuals are returned by glm() in R:

> ## generic (logistic regression) model


> fit0 <- glm(Y ~ X, family=binomial)
>
> args(residuals.glm)
function (object, type = c("deviance", "pearson", "working",
"response", "partial"), ...)
NULL
>
> ## deviance residuals are the default
> residuals(fit0)
...
>
> ## extracting the pearson residuals
> residuals(fit0, type="pearson")
...

✫ ✪

262 BIO 233, Spring 2015
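The residual definitions above can also be checked numerically: the squared deviance residuals sum to the deviance and the squared Pearson residuals give the Pearson χ² statistic. A sketch using a Poisson example so that the mean-variance relationship is non-trivial (data simulated, names arbitrary):

set.seed(6)
x <- rnorm(150)
y <- rpois(150, exp(0.2 + 0.5 * x))
fit <- glm(y ~ x, family = poisson)
sum(residuals(fit, type = "deviance")^2)   # equals deviance(fit)
deviance(fit)
sum(residuals(fit, type = "pearson")^2)    # the Pearson X^2 statistic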


✬ ✩

The Bayesian solution

• A GLM is specified by:

Yi | Xi ∼ fY(y; µi, φ)
E[Yi | Xi] = g^{-1}(Xi^T β) = µi
V[Yi | Xi] = V(µi) ai(φ)

⋆ fY (·) is a member of the exponential dispersion family


⋆ β is a vector of regression coefficients
⋆ φ is the dispersion parameter

• (β, φ) are the unknown parameters


⋆ note there might not necessarily be a dispersion parameter
⋆ e.g. for binary or Poisson data
✫ ✪

263 BIO 233, Spring 2015


✬ ✩
• Required to specify a prior distribution for (β, φ) which is often factored
into

π(β, φ) = π(β|φ)π(φ)

• For β|φ, strategies include

⋆ a flat, non-informative prior


∗ recover the classical analysis
∗ posterior mode corresponding to a uniform prior density is the MLE

⋆ an informative prior
∗ e.g., β ∼ MVN(β 0 , Σβ )
∗ convenient choice given the computational methods described below

• Unfortunately, specifying a prior for φ is less prescriptive


⋆ consider specific models in Parts V-VII of the notes
✫ ✪

264 BIO 233, Spring 2015


✬ ✩
• Given an independent sample Y1 , . . ., Yn , the likelihood is the product of
n terms:
L(β, φ|y) = Π_{i=1}^n fY(yi | µi, φ)

• Apply Bayes’ Theorem to get the posterior:

π(β, φ|y) ∝ L(β, φ|y)π(β, φ)

✫ ✪

265 BIO 233, Spring 2015


✬ ✩

Computation

• For most GLMs, the posterior won’t be of a convenient form


⋆ analytically intractable

• Use Monte Carlo methods to summarize the posterior distribution

• We’ve seen that the Gibbs sampler and the Metropolis-Hastings algorithm
are powerful tools for generating samples from the posterior distribution
⋆ need to specify a proposal distribution
⋆ need to specify starting values for the Markov chain(s)

• Towards this, let θ̃ = (β̃, φ̃) denote the posterior mode

✫ ✪

266 BIO 233, Spring 2015


✬ ✩
• Consider a Taylor series expansion of the log-posterior at θ̃:

log π(θ|y) = log π(θ̃|y)
+ (θ − θ̃)^T [ ∂ log π(θ|y)/∂θ ]|_{θ=θ̃}
+ (1/2) (θ − θ̃)^T [ ∂² log π(θ|y)/∂θ∂θ ]|_{θ=θ̃} (θ − θ̃)
+ ...

• Ignore the log π(θ̃|y) term because, as a function of θ, it is constant

• The linear term in the expansion disappears because the first derivative of
the log-posterior at the mode is equal to 0

• The middle component of the quadratic term is approximately the negative observed information matrix, evaluated at the mode
✫ ✪

267 BIO 233, Spring 2015


✬ ✩
• We therefore get

log π(θ|y) ≈ −(1/2) (θ − θ̃)^T I(θ̃) (θ − θ̃)

which is the log of the kernel for a Normal distribution

• So, towards specifying a proposal distribution for the Metropolis-Hastings


algorithm, we can consider the following Normal approximation to the
posterior
 
π(θ|y) ≈ Normal( θ̃, I(θ̃)^{-1} )

Q: How can we make use of this for sampling from the posterior π(β, φ|y)?
⋆ there are many approaches that one could take
⋆ we’ll describe three

✫ ✪

268 BIO 233, Spring 2015


✬ ✩
• First, we need to find the mode, (β̃, φ̃)
⋆ the value that maximizes π(β, φ|y)
⋆ given a non-informative prior:

(β̃, φ̃) ≡ (β̂_MLE, φ̂_MLE)

∗ obtain the mode via the IRLS algorithm


⋆ otherwise, use any other standard optimization technique
∗ e.g. Newton-Raphson
∗ could use (β̂_MLE, φ̂_MLE) as a starting point

• Next, recall the block-diagonal structure of the information matrix for a


GLM:
 
I(β, φ) = [ Iββ  0  ;  0  Iφφ ]

✫ ✪

269 BIO 233, Spring 2015


✬ ✩
• Exploit this and consider the approximation:

π(β|y) ≈ Normal( β̃, Vβ(β̃, φ̃) )

to the marginal posterior of β

⋆ Vβ(β̃, φ̃) = Iββ^{-1} evaluated at the mode

⋆ denote the approximation by π̃(β|y)

• Also consider the approximation:

π(φ|y) ≈ Normal( φ̃, Vφ(β̃, φ̃) )

to the marginal posterior of φ

⋆ Vφ(β̃, φ̃) = Iφφ^{-1} evaluated at the mode

⋆ denote the approximation by π̃(φ|y)
✫ ✪

270 BIO 233, Spring 2015


✬ ✩

Approach #1

• If we believe that π̃(β|y) is a good approximation, we could simply report summary statistics directly from the multivariate Normal distribution

β|y ∼ Normal( β̃, Vβ(β̃, φ̃) )

⋆ report the posterior mean (equivalently, the posterior median)

⋆ posterior credible intervals using the components of Vβ(β̃, φ̃)

• The approach conditions on φ̃


⋆ uncertainty in the true value of φ is ignored
⋆ this is what we do in classical estimation/inference for linear regression
anyway

• Similarly, we could summarize features of the posterior distribution of φ using the π̃(φ|y) Normal approximation
✫ ✪

271 BIO 233, Spring 2015
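In R, Approach #1 amounts to reading summaries off the Normal approximation; a sketch for a logistic regression with a flat prior on β, so the posterior mode is the MLE (data simulated, names arbitrary):

set.seed(7)
x <- rnorm(250)
y <- rbinom(250, 1, plogis(-0.3 + 0.8 * x))
fit <- glm(y ~ x, family = binomial)       # posterior mode under a flat prior
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))               # from Iββ^{-1} evaluated at the mode
cbind(post.mean = est, lower = est - 1.96 * se, upper = est + 1.96 * se)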


✬ ✩

Approach #2

• We may not be willing to believe that the approximation is good enough


to summarize features of π(β; y)
⋆ approximation may not be good in small samples
⋆ approximation may not be good in the tails of the distribution
∗ away from the posterior mode

• We could use π̃(β|y) as a proposal distribution in a Metropolis-Hastings algorithm to sample from the exact posterior π(β; y)

• Let β (r) be the current state in the sequence

(1) generate a proposal β∗ from π̃(β|y)
∗ straightforward since this is a multivariate Normal distribution

✫ ✪

272 BIO 233, Spring 2015


✬ ✩
(2) evaluate the acceptance ratio

ar = min{ 1, [ π(β∗|y, φ̃) π̃(β^(r)|β∗) ] / [ π(β^(r)|y, φ̃) π̃(β∗|β^(r)) ] }

= min{ 1, [ π(β∗|y, φ̃) π̃(β^(r)) ] / [ π(β^(r)|y, φ̃) π̃(β∗) ] }

(3) generate a random U ∼ Uniform(0, 1)


∗ reject the proposal if ar < U :

β (r+1) = β (r)

∗ accept the proposal if ar ≥ U :

β (r+1) = β ∗

✫ ✪

273 BIO 233, Spring 2015
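A minimal R sketch of Approach #2 for a logistic regression: an independence Metropolis-Hastings sampler whose proposal is the Normal approximation at the mode. It assumes a flat prior on β (so the mode is the MLE) and that the mvtnorm package is available; the data, sample size, and number of draws are arbitrary choices:

library(mvtnorm)
set.seed(8)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + x))
X <- cbind(1, x)

## log-posterior under a flat prior = log-likelihood
logpost <- function(beta) sum(dbinom(y, 1, plogis(drop(X %*% beta)), log = TRUE))

fit  <- glm(y ~ x, family = binomial)
bhat <- coef(fit)          # posterior mode
Vhat <- vcov(fit)          # inverse expected information at the mode

R <- 5000
draws <- matrix(NA, R, 2)
beta  <- bhat
for (r in 1:R) {
  prop <- drop(rmvnorm(1, bhat, Vhat))
  ## independence proposal: q(current)/q(prop) replaces the usual q ratio
  log.ar <- (logpost(prop) - logpost(beta)) +
            (dmvnorm(beta, bhat, Vhat, log = TRUE) -
             dmvnorm(prop, bhat, Vhat, log = TRUE))
  if (log(runif(1)) < log.ar) beta <- prop
  draws[r, ] <- beta
}
colMeans(draws)            # posterior means; close to the MLE in this example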


✬ ✩

Approach #3

• While approach #2 facilitates sampling from the exact posterior


distribution of β, π(β|y), uncertainty in the value of φ is still ignored
⋆ condition on φ = φ̃

• To sample from the full exact posterior π(β, φ; y) we could implement a


Gibbs sampling scheme and iterate between the full conditionals
⋆ for each, implement a Metropolis-Hastings step using the
approximations we’ve developed
⋆ for the r th sample:
(1) sample β^(r) from π(β | φ^(r−1); y) with π̃(β|y) as a proposal
(2) sample φ^(r) from π(φ | β^(r); y) with π̃(φ|y) as a proposal

• Use the approximations to generate starting values for the Markov chain(s)
✫ ✪

274 BIO 233, Spring 2015
