Choosing Link Functions
Probit Regression
Model Selection
Statistics 149
Spring 2006
Copyright © 2006 by Mark E. Irwin
Fitting Generalized Linear Models in R
The base function for fitting GLIMs is glm
glm(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart,
offset, control = glm.control(...), model = TRUE,
method = "glm.fit", x = FALSE, y = TRUE,
contrasts = NULL, ...)
formula: a symbolic description of the model to be fit. The
structure is the same as for lm, i.e.
y ~ x1 + x2 + ... + xp
with conventions involving categorical factors and
interactions the same as in lm.
Normally y is a numeric variable. However for
’binomial’ models the response can also be specified
as a ’factor’ (when the first level denotes failure
and all others success) or as a two-column matrix with
the columns giving the numbers of successes
and failures.
family: a description of the error distribution and link
function to be used in the model. This can be a
character string naming a family function, a family
function or the result of a call to a family function.
(See ’family’ for details of family functions.)
The possible choices for family are (with default link):
binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")
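For instance, the following calls are equivalent ways to specify a
binomial model with the default logit link (y and x are placeholder
variables):
glm(y ~ x, family = "binomial")                # character string
glm(y ~ x, family = binomial)                  # family function
glm(y ~ x, family = binomial(link = "logit"))  # result of a call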
weights: an optional vector of weights to be used in the
fitting process.
subset: an optional vector specifying a subset of observations
to be used in the fitting process.
offset: this can be used to specify an _a priori_ known
component to be included in the linear predictor
during fitting.
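For example (a sketch; events, treatment, exposure, and mydata are
hypothetical names), a known log exposure can be supplied as an offset
in a Poisson model:
glm(events ~ treatment + offset(log(exposure)),
    family = poisson, data = mydata)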
So to fit a model with a different link function, you need to pass the link
option to family. For example, to do a probit regression, run the commands
fasten.logit.glm <- glm(fasten.res ~ fasten.load,
family=binomial()) # same as family=binomial(link="logit")
fasten.probit.glm <- glm(fasten.res ~ fasten.load,
family=binomial(link="probit"))
In this example fasten.res is a two-column matrix, with the first column
giving the number of fasteners that failed at a particular load and the
second column giving the number of fasteners that held under the same load.
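A sketch of how such a response could be assembled; the counts and loads
below are made up for illustration:
n.failed <- c(10, 17, 30, 35, 43)  # fasteners failing at each load
n.held   <- c(40, 33, 20, 15,  7)  # fasteners holding at each load
fasten.load <- c(2500, 2700, 2900, 3100, 3300)
fasten.res <- cbind(n.failed, n.held)
fasten.probit.glm <- glm(fasten.res ~ fasten.load,
                         family = binomial(link = "probit"))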
Choosing Link Function in Binomial Regression Models
As already mentioned, R has many possibilities for a link function in
binomial-based regression. So one issue that may need to be considered is
which link function to use. Let's look at what happens in two examples
(airplane fasteners and birth weight) for the links
• logit: g(µ) = log(µ/(1 − µ)) = logit(µ)
• probit: g(µ) = Φ⁻¹(µ)
• cloglog (Complementary Log-Log link): g(µ) = log(−log(1 − µ))
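All three links can be evaluated directly with built-in R functions:
p <- 0.3
qlogis(p)         # logit: log(p / (1 - p))
qnorm(p)          # probit: inverse standard normal CDF
log(-log(1 - p))  # complementary log-log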
[Figure: fitted proportion failing versus load (psi) for the fastener
data, with curves for the logit, probit, and complementary log-log links.]
[Figure: fitted probability of low birth weight versus maternal age, with
curves for the logit, probit, and complementary log-log links.]
In both of these examples, the logit and probit links give similar fitted
probabilities; however, the fits from the C-Log-Log link are a bit different.
Even though the fits may be similar, the estimated regression parameter
values can be quite different.
For the birth weight example:
Parameter      Logit       Probit      C-Log-Log
β̂0             0.38458     0.23590     -0.02595
β̂1            -0.05115    -0.03155    -0.04185
One implication of this is that the interpretation of the parameters is
different with the three link functions. I’ll get back to this later.
Even though the meaning of the parameters is different, the tests on
slope-type parameters are often similar, as can be seen in the examples below.
Call: glm(formula = low ~ age,
family = binomial(link = "logit"), data = birthwt)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.38458 0.73212 0.525 0.599
age -0.05115 0.03151 -1.623 0.105
Call: glm(formula = low ~ age,
family = binomial(link = "probit"), data = birthwt)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.23590 0.43981 0.536 0.5917
age -0.03155 0.01875 -1.682 0.0925 .
Call: glm(formula = low ~ age,
family = binomial(link = "cloglog"), data = birthwt)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.02595 0.60755 -0.043 0.966
age -0.04185 0.02654 -1.577 0.115
The reason for the similarity in the results from logit and probit regressions
can be seen in the following plot.
This plot shows the relationship between π and g(π) for the three link
functions on a logit probability scale.
The reason that logit and probit regression gave similar predicted
probabilities comes from the fact that the probit curve is approximately
linear in the probability range 0.1 to 0.9, and the logit curve is exactly
linear on this scale (as it was constructed to be).
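One way to draw a similar comparison in base R graphics (a sketch; each
g(p) is plotted against logit(p), so the logit link appears as a straight
line):
p <- seq(0.01, 0.99, by = 0.01)
plot(qlogis(p), qlogis(p), type = "l", lty = 1,
     xlab = "logit(p)", ylab = "g(p)")
lines(qlogis(p), qnorm(p), lty = 2)              # probit
lines(qlogis(p), log(-log(1 - p)), lty = 3)      # complementary log-log
legend("topleft", c("Logit", "Probit", "C-Log-Log"), lty = 1:3)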
The difference in magnitude of β̂1 for logit and probit regression can be
seen through differences in g(π) − g(1 − π), or in g′(0.5), as β1 is related
to this difference by
β1 ≈ ∆g/∆x
The differences in the C-Log-Log fits come from two features. The first is
that this link function is less linear in the range 0.1 to 0.9 than the
other two links.
The second reason relates to ...
Relationship Between Link Function and Counting Successes or Failures
When collecting binomial data, we can either count the "successes" or the
"failures", as they give the same information. What happens in the analysis
if we switch what we are counting?
For the logit link,
logit(1 − π) = log((1 − π)/π) = −log(π/(1 − π)) = −logit(π)
This implies that switching between counts of "successes" and counts of
"failures" switches the signs of the βs, as can be seen for the fastener
example:
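A minimal sketch of how the reversed response could be built, assuming
fasten.res is the two-column (failed, held) matrix described earlier:
fasten.rev <- fasten.res[, 2:1]  # swap the columns: count "holds" first
fasten.logit.rev.glm <- glm(fasten.rev ~ fasten.load, family = binomial())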
> summary(fasten.logit.rev.glm)
Call:
glm(formula = fasten.rev ~ fasten.load, family = binomial())
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 5.3397115 0.5456932 9.785 <2e-16 ***
fasten.load -0.0015484 0.0001575 -9.829 <2e-16 ***
> summary(fasten.logit.glm)
Call:
glm(formula = fasten.res ~ fasten.load, family = binomial())
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -5.3397115 0.5456932 -9.785 <2e-16 ***
fasten.load 0.0015484 0.0001575 9.829 <2e-16 ***
Probit regression has the same property,
Φ⁻¹(1 − π) = −Φ⁻¹(π)
so the βs will act the same way here. In fact this is the case for any link
function based on a symmetric density function.
However, for the C-Log-Log link,
log(−log(1 − π)) ≠ −log(−log(π))
So there is not a nice relationship between the βs for counting "successes"
and the βs for counting "failures". Due to this asymmetry, the C-Log-Log
link is usually only used for problems where π is small.
I don’t think the C-Log-Log link is used much today, partly due to the
features mentioned, but also due to the fact that for small π
logit(π) ≈ log(− log(1 − π))
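A quick numerical check (illustrative values):
p <- c(0.01, 0.05, 0.10)
cbind(logit = qlogis(p), cloglog = log(-log(1 - p)))
# For p = 0.01 both columns are about -4.60, since -log(1 - p) is
# approximately p when p is small.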
Probit Regression
As we have seen, probit regression is based on the model
Φ⁻¹(π) = Xβ
Let's consider the case with only a single predictor:
Φ⁻¹(π) = β0 + β1x
Unfortunately this probit regression model doesn't have a nice
interpretation for the regression parameters like logistic regression does
(changes in log odds for changes in x). For levels of the predictor variable
leading to moderate probabilities (i.e. π ∈ (0.1, 0.9)), it can be shown that
1.6 β1 ≈ log(ω(x + 1)/ω(x))
where ω(x) = π(x)/(1 − π(x)) is the odds at x; that is, 1.6 times the probit
slope approximates the log odds ratio.
However, the true multiplier depends on π(x) and gets bigger the further
π(x) gets from 0.5.
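This can be checked numerically: the multiplier at a given π is the ratio
of the link derivatives, 1/(π(1 − π)) for the logit and 1/φ(Φ⁻¹(π)) for the
probit. A quick sketch:
mult <- function(p) dnorm(qnorm(p)) / (p * (1 - p))  # logit/probit slopes
mult(0.5)            # about 1.60
mult(c(0.1, 0.9))    # about 1.95, larger away from 0.5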
When data are fit with logistic and probit regression, the relationship
between the βs depends on the levels of the predictor variables. For example:
Example        Logistic     Probit      Ratio
Fastener        0.00154     0.00095     1.63
Birth Weight   -0.05115    -0.03155     1.62
Since the two procedures give similar answers and logistic regression has a
nicer interpretation of its parameters, it is usually preferred to probit
regression.
Model Selection
As in linear regression, it can be useful to do model selection in generalized
linear models. As we are basing our estimation on the log likelihood
function, choosing our model based on a large log likelihood (or on a small
deviance) might seem to be a reasonable approach.
However, as discussed earlier, when parameters are added to a model, the
log likelihood must go up (or the deviance must go down). So we need to
adjust our model selection criteria to take into account the number of
predictor variables in a prospective model and the amount of information
each predictor variable adds.
One approach is a penalized likelihood approach, similar to Mallows' Cp for
linear regression models. The idea is to pick the model that minimizes
Deviance + Penalty(p)
where Penalty(p) is a penalty term which depends on p, the number of
parameters in the model of interest, including the intercept.
There are a number of common penalty terms, two of which I'll discuss here:
• Schwarz's Bayesian Information Criterion (BIC)
BIC = Deviance + p log n
This approach, originally derived from Bayesian considerations involving
posterior probabilities of models, says to pick the model with the smallest
value of BIC.
• Akaike's Information Criterion (AIC)
AIC = Deviance + 2p
Similarly, this approach says to pick the model with the smallest value of
AIC. Both criteria can be computed by hand from a fitted glm, as sketched
below.
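As a sketch of computing these by hand (for Bernoulli responses the
deviance equals −2 × log-likelihood, so Deviance + 2p agrees with R's
built-in AIC; bwt is the birth weight data frame used in the examples
below):
fit <- glm(low ~ ., family = binomial, data = bwt)
p <- length(coef(fit))              # parameters, including the intercept
fit$deviance + 2 * p                # AIC
fit$deviance + p * log(nrow(bwt))   # BIC, with n = 189
AIC(fit)                            # built-in equivalents
BIC(fit)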
As can easily be seen, BIC tends to penalize big models more than AIC, as
long as n > e² ≈ 7.4. What tends to happen is that AIC will tend to pick
models where the chosen predictors have at least moderate influence,
whereas BIC tends to include only variables with strong influence. Which to
use can be a philosophical issue. In addition, it can be problem specific.
As the book discusses in Section 12.4.2, we usually aren't trying to find
one "best model", but instead trying to find a set of reasonable models and
working with those.
In R, it is easiest to deal with AIC as this is given for almost every model
fit with glm. For example,
> birthwtall.glm <- glm(low ~ ., binomial, bwt)
> summary(birthwtall.glm)
Call:
glm(formula = low ~ ., family = binomial, data = bwt)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.82302 1.24471 0.661 0.50848
age -0.03723 0.03870 -0.962 0.33602
lwt -0.01565 0.00708 -2.211 0.02705 *
raceblack 1.19241 0.53597 2.225 0.02609 *
raceother 0.74069 0.46174 1.604 0.10869
smokeTRUE 0.75553 0.42502 1.778 0.07546 .
ptdTRUE 1.34376 0.48062 2.796 0.00518 **
htTRUE 1.91317 0.72074 2.654 0.00794 **
uiTRUE 0.68019 0.46434 1.465 0.14296
ftv1 -0.43638 0.47939 -0.910 0.36268
ftv2+ 0.17901 0.45638 0.392 0.69488
---
Null deviance: 234.67 on 188 degrees of freedom
Residual deviance: 195.48 on 178 degrees of freedom
AIC: 217.48
It is possible to build a stepwise model selection procedure similar to
forward/backward/stepwise selection in linear regression.
This procedure has the advantage that it will easily deal with categorical
factors, which are problematic for stepwise procedures based on t/F-tests.
It can also deal with interactions in a reasonable fashion, by requiring
main effects to be in the model whenever an interaction is, as the sketch
below illustrates.
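For instance, a hedged sketch using stepAIC (introduced below) with an
interaction in the scope; the smoke:ui term is only eligible once both
main effects are in the model:
library(MASS)
birthwtnull.glm <- glm(low ~ 1, family = binomial, data = bwt)
birthwt.int.step <- stepAIC(birthwtnull.glm, direction = "both",
                            scope = list(upper = ~ age + lwt + smoke * ui,
                                         lower = ~ 1))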
• Forward AIC Selection: This procedure starts with no predictors in the
model and keeps adding predictors until the AIC stops decreasing. The
basic structure works as follows:
1. Add each variable currently not in the model to the current model (one
at a time) and calculate the AIC.
2. Choose the variable that leads to the smallest AIC.
3. If the AIC of this model is lower than the AIC of the model not
containing the variable, add the variable and go back to step 1.
Otherwise, don't add the variable and stop.
• Backward AIC Elimination: This procedure starts with all predictors in
the model and removes predictors until the AIC stops decreasing. The
basic structure is:
1. Remove each variable currently in the model (one at a time) and
calculate the AIC.
2. Choose the variable that leads to the model with the smallest AIC.
3. If the AIC of the model without this variable is lower than the AIC of
the model containing the variable, drop the variable and go back to
step 1. Otherwise, don't drop the variable and stop.
• Stepwise AIC: This procedure combines features of both of the above,
allowing variables to be both added to and removed from the model. The
structure is as follows:
1. Given the current model, do one add step as described in forward AIC
selection.
2. Given the model after the forward step, do one drop step.
3. If any variables are added or dropped, go to step 1. Otherwise stop.
While any prospective model could be used as the starting model in this
procedure, it is often the null model with no predictors in it.
In R, the function stepAIC (from the MASS package) implements these ideas.
Forward selection can be done as in the following example:
> birthwtnull.glm <- glm(low ~ 1, family=binomial, data=bwt)
> birthwt.add.step <- stepAIC(birthwtnull.glm,
direction="forward",
scope=list(upper= ~ age + lwt + race + smoke + ptd + ht
+ ui + ftv, lower= ~ 1))
Start: AIC= 236.67
low ~ 1
Df Deviance AIC
+ ptd 1 221.90 225.90
+ lwt 1 228.69 232.69
+ ui 1 229.60 233.60
+ smoke 1 229.81 233.81
+ ht 1 230.65 234.65
+ race 2 229.66 235.66
+ age 1 231.91 235.91
<none> 234.67 236.67
+ ftv 2 232.09 238.09
Step: AIC= 225.9
low ~ ptd
Df Deviance AIC
+ age 1 217.30 223.30
+ lwt 1 217.50 223.50
+ ht 1 217.66 223.66
+ race 2 217.02 225.02
+ ui 1 219.12 225.12
+ smoke 1 219.33 225.33
+ ftv 2 217.88 225.88
<none> 221.90 225.90
Step: AIC= 223.3
low ~ ptd + age
Df Deviance AIC
+ ht 1 213.12 221.12
+ lwt 1 214.32 222.32
+ smoke 1 215.04 223.04
+ ui 1 215.13 223.13
<none> 217.30 223.30
+ race 2 213.97 223.97
+ ftv 2 214.63 224.63
Step: AIC= 221.12
low ~ ptd + age + ht
Df Deviance AIC
+ lwt 1 207.43 217.43
+ ui 1 210.13 220.13
+ smoke 1 210.89 220.89
<none> 213.12 221.12
+ race 2 210.06 222.06
+ ftv 2 210.38 222.38
Step: AIC= 217.43
low ~ ptd + age + ht + lwt
Df Deviance AIC
+ ui 1 205.15 217.15
+ smoke 1 205.39 217.39
<none> 207.43 217.43
+ race 2 203.77 217.77
+ ftv 2 204.33 218.33
Step: AIC= 217.15
low ~ ptd + age + ht + lwt + ui
Df Deviance AIC
<none> 205.15 217.15
+ smoke 1 203.24 217.24
+ race 2 201.25 217.25
+ ftv 2 202.41 218.41
Backward elimination can be handled similarly
> birthwtall.glm <- glm(low ~ ., binomial, bwt)
> birthwt.drop.step <- stepAIC(birthwtall.glm,
direction="backward",
scope=list(upper= ~ age + lwt + race + smoke + ptd + ht
+ ui + ftv, lower= ~ 1))
Start: AIC= 217.48
low ~ age + lwt + race + smoke + ptd + ht + ui + ftv
Df Deviance AIC
- ftv 2 196.83 214.83
- age 1 196.42 216.42
<none> 195.48 217.48
- ui 1 197.59 217.59
- smoke 1 198.67 218.67
- race 2 201.23 219.23
- lwt 1 200.95 220.95
- ht 1 202.93 222.93
- ptd 1 203.58 223.58
Step: AIC= 214.83
low ~ age + lwt + race + smoke + ptd + ht + ui
Df Deviance AIC
- age 1 197.85 213.85
<none> 196.83 214.83
- ui 1 199.15 215.15
- race 2 203.24 217.24
- smoke 1 201.25 217.25
- lwt 1 201.83 217.83
- ptd 1 203.95 219.95
- ht 1 204.01 220.01
Step: AIC= 213.85
low ~ lwt + race + smoke + ptd + ht + ui
Df Deviance AIC
<none> 197.85 213.85
- ui 1 200.48 214.48
- smoke 1 202.57 216.57
- race 2 205.47 217.47
- lwt 1 203.82 217.82
- ptd 1 204.22 218.22
- ht 1 205.16 219.16
For a full stepwise procedure, the following is an example. Note that R
implements this procedure slightly differently than described earlier.
> birthwtnull.glm <- glm(low ~ 1, family=binomial, data=bwt)
> birthwt.step.step <- stepAIC(birthwtnull.glm,
direction="both",
scope=list(upper= ~ age + lwt + race + smoke + ptd + ht
+ ui + ftv, lower= ~ 1))
Start: AIC= 236.67
low ~ 1
Df Deviance AIC
+ ptd 1 221.90 225.90
+ lwt 1 228.69 232.69
+ ui 1 229.60 233.60
+ smoke 1 229.81 233.81
+ ht 1 230.65 234.65
+ race 2 229.66 235.66
+ age 1 231.91 235.91
<none> 234.67 236.67
+ ftv 2 232.09 238.09
Step: AIC= 225.9
low ~ ptd
Df Deviance AIC
+ age 1 217.30 223.30
+ lwt 1 217.50 223.50
+ ht 1 217.66 223.66
+ race 2 217.02 225.02
+ ui 1 219.12 225.12
+ smoke 1 219.33 225.33
+ ftv 2 217.88 225.88
<none> 221.90 225.90
- ptd 1 234.67 236.67
Step: AIC= 223.3
low ~ ptd + age
Df Deviance AIC
+ ht 1 213.12 221.12
+ lwt 1 214.32 222.32
+ smoke 1 215.04 223.04
+ ui 1 215.13 223.13
<none> 217.30 223.30
+ race 2 213.97 223.97
+ ftv 2 214.63 224.63
- age 1 221.90 225.90
- ptd 1 231.91 235.91
Step: AIC= 221.12
low ~ ptd + age + ht
Df Deviance AIC
+ lwt 1 207.43 217.43
+ ui 1 210.13 220.13
+ smoke 1 210.89 220.89
<none> 213.12 221.12
+ race 2 210.06 222.06
+ ftv 2 210.38 222.38
- ht 1 217.30 223.30
- age 1 217.66 223.66
- ptd 1 227.93 233.93
Step: AIC= 217.43
low ~ ptd + age + ht + lwt
Df Deviance AIC
+ ui 1 205.15 217.15
+ smoke 1 205.39 217.39
<none> 207.43 217.43
+ race 2 203.77 217.77
- age 1 210.12 218.12
+ ftv 2 204.33 218.33
- lwt 1 213.12 221.12
- ht 1 214.32 222.32
- ptd 1 219.88 227.88
Step: AIC= 217.15
low ~ ptd + age + ht + lwt + ui
Df Deviance AIC
<none> 205.15 217.15
+ smoke 1 203.24 217.24
+ race 2 201.25 217.25
- ui 1 207.43 217.43
- age 1 207.51 217.51
+ ftv 2 202.41 218.41
- lwt 1 210.13 220.13
- ht 1 212.70 222.70
- ptd 1 215.48 225.48
For the stepwise procedure, different starting models can lead to different
final models. For example, starting with all predictors in the model, the
final model is
Step: AIC= 213.85
low ~ lwt + race + smoke + ptd + ht + ui
Df Deviance AIC
<none> 197.85 213.85
- ui 1 200.48 214.48
+ age 1 196.83 214.83
+ ftv 2 196.42 216.42
- smoke 1 202.57 216.57
- race 2 205.47 217.47
- lwt 1 203.82 217.82
- ptd 1 204.22 218.22
- ht 1 205.16 219.16
Starting Model   Final Model                                  AIC
None In          low ~ ptd + ht + lwt + ui + age              217.15
All In           low ~ ptd + ht + lwt + ui + race + smoke     213.85
The function stepAIC can handle different penalty functions. Thus by
adding the option k = log(n), BIC can be used as well
> birthwt.BIC.step <- stepAIC(birthwtnull.glm, k=log(189),
direction="both",
scope=list(upper= ~ age + lwt + race + smoke + ptd + ht
+ ui + ftv, lower= ~ 1))
Start: AIC= 239.91
low ~ 1
Df Deviance AIC
+ ptd 1 221.90 232.38
+ lwt 1 228.69 239.17
<none> 234.67 239.91
+ ui 1 229.60 240.08
+ smoke 1 229.81 240.29
+ ht 1 230.65 241.13
+ age 1 231.91 242.40
+ race 2 229.66 245.39
+ ftv 2 232.09 247.81
Step: AIC= 232.38
low ~ ptd
Df Deviance AIC
<none> 221.90 232.38
+ age 1 217.30 233.02
+ lwt 1 217.50 233.22
+ ht 1 217.66 233.39
+ ui 1 219.12 234.85
+ smoke 1 219.33 235.05
+ race 2 217.02 237.99
+ ftv 2 217.88 238.85
- ptd 1 234.67 239.91
Starting   Final Model                           Approach   Criterion
None In    ptd + ht + lwt + ui + age             Forward    AIC
All In     ptd + ht + lwt + ui + race + smoke    Backward   AIC
None In    ptd + ht + lwt + ui + age             Stepwise   AIC
All In     ptd + ht + lwt + ui + race + smoke    Stepwise   AIC
None In    ptd                                   Stepwise   BIC
All In     ptd + ht + lwt                        Stepwise   BIC