BIO 226: APPLIED LONGITUDINAL ANALYSIS

LECTURE 16

Review of Logistic and Poisson Regression Models

Generalized Linear Models

Generalized linear models are a class of regression models; they include the
standard linear regression model but also many other important models:

- Linear regression for continuous data

- Logistic regression for binary data

- Log-linear/Poisson regression models for count data

Generalized linear models extend the methods of regression analysis to settings where the outcome variable can be categorical (e.g., binary) or a count.

In the remainder of the course, we consider extensions of generalized linear models to longitudinal data.

Motivating Example

Oral Treatment of Toenail Infection

Randomized, double-blind, parallel-group, multicenter study of 294 patients comparing 2 oral treatments (denoted A and B) for toenail infection.

Outcome variable: Binary variable indicating presence of onycholysis (separation of the nail plate from the nail bed).

Patients evaluated for degree of onycholysis at baseline (week 0) and at weeks 4, 8, 12, 24, 36, and 48.

Interested in the rate of decline of the proportion of patients with onycholysis over time and the effects of treatment on that rate.

Motivating Example

Clinical trial of anti-epileptic drug progabide

(Thall and Vail, Biometrics, 1990)

Randomized, placebo-controlled study of treatment of epileptic seizures with progabide.

Patients were randomized to treatment with progabide, or to placebo in addition to standard therapy.

Outcome variable: Count of number of seizures

Measurement schedule: Baseline measurement during 8 weeks prior to randomization. Four measurements during consecutive two-week intervals.

Sample size: 28 epileptics on placebo; 31 epileptics on progabide

Review of Generalized Linear Models for a Single Response

So far, we have considered linear regression models for a continuous response, Y, of the following form

Y = β1X1 + β2X2 + . . . + βpXp + e

The response variable, Y, is assumed to have a normal distribution with mean

E(Y) = β1X1 + β2X2 + . . . + βpXp

and with variance σ².

Recall that the population intercept, β1 (the coefficient of X1 when X1 = 1 for every subject), has interpretation as the mean value of the response when all of the other covariates take on the value zero.

The population slope, say βk, has interpretation in terms of the expected change in the mean response for a single-unit change in Xk given that all of the other covariates remain constant.

In many studies, however, we are interested in a response variable that is dichotomous/binary rather than continuous.

Next, we consider a regression model for a binary (or dichotomous) response.

Review: Logistic Regression

Let Y be a binary response, where

Y = 1 represents a “success”; Y = 0 represents a “failure”.

Then the mean of the binary response variable, denoted π, is the proportion
of successes or the probability that the response takes on the value 1.

That is,
π = E(Y ) = Pr(Y = 1) = Pr(“success”)

With a binary response, we are usually interested in estimating the probability π, and relating it to a set of covariates.

To do this, we can use logistic regression.

A naive strategy for modeling a binary response is to consider a linear regression model

π = E(Y ) = β1X1 + β2X2 + . . . + βpXp

However, in general, this model is not feasible since π is a probability and is restricted to values between 0 and 1, whereas the right-hand side can take any real value.

Also, the usual assumption of homogeneity of variance would be violated since the variance of a binary response depends on the mean, i.e.

Var(Y ) = π (1 − π)
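The variance formula follows in one step from the fact that a binary Y satisfies Y² = Y. The short derivation below is added for completeness and uses only the symbols already defined above.

```latex
\begin{align*}
\operatorname{Var}(Y) &= E(Y^2) - [E(Y)]^2 \\
                      &= E(Y) - \pi^2
                         \qquad \text{(since } Y^2 = Y \text{ when } Y \in \{0, 1\}\text{)} \\
                      &= \pi - \pi^2 = \pi(1 - \pi).
\end{align*}
```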

Instead, we can consider a logistic regression model where

ln [π/ (1 − π)] = β1X1 + β2X2 + . . . + βpXp

This model accommodates the constraint that π is restricted to values between 0 and 1.

Recall that π/ (1 − π) is defined as the odds of success.

Therefore, modeling π with a logistic function can be considered equivalent to a linear regression model where the mean of the continuous response has been replaced by the logarithm of the odds of success.

Note that the relationship between π and the covariates is non-linear.

Figure 1: Plot of logistic response function.
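The S-shaped logistic response function can be tabulated with a few lines of Python. This is only a minimal sketch: the function name `inverse_logit` and the grid of linear-predictor values are illustrative choices, not part of the lecture. It shows that any real value of the linear predictor maps to a probability strictly between 0 and 1, and that the relationship is non-linear.

```python
import math


def inverse_logit(eta):
    """Map a linear predictor eta (any real number) to a probability."""
    # Use the algebraically equivalent form that avoids overflow in exp().
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Illustrative grid of values for the linear predictor (an assumption).
grid = [-6, -4, -2, 0, 2, 4, 6]
curve = [(eta, inverse_logit(eta)) for eta in grid]

for eta, pi in curve:
    print(f"eta = {eta:+d}  ->  pi = {pi:.4f}")
```

Equal steps in the linear predictor produce unequal steps in π: the curve is steepest near π = 0.5 and flattens toward 0 and 1.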

Under the assumption that the binary responses are Bernoulli random
variables, we can use ML estimation to obtain estimates of the logistic
regression parameters.

Finally, recall the relationship between “odds” and “probabilities”:

Odds = π / (1 − π);

π = Odds / (1 + Odds).
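The two conversions are inverses of each other, which a short Python sketch can confirm. The function names and the example probabilities are illustrative assumptions.

```python
def odds_from_prob(pi):
    """Odds = pi / (1 - pi), defined for 0 <= pi < 1."""
    if not 0.0 <= pi < 1.0:
        raise ValueError("pi must satisfy 0 <= pi < 1")
    return pi / (1.0 - pi)


def prob_from_odds(odds):
    """pi = Odds / (1 + Odds), defined for Odds >= 0."""
    if odds < 0.0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# Illustrative probabilities (an assumption, not data from the lecture).
for pi in (0.2, 0.5, 0.8):
    odds = odds_from_prob(pi)
    print(f"pi = {pi:.1f}  odds = {odds:.2f}  back to pi = {prob_from_odds(odds):.1f}")
```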

Given the logistic regression model

ln [π/ (1 − π)] = β1X1 + β2X2 + . . . + βpXp

the population intercept, β1, has interpretation as the log odds of success
when all of the covariates take on the value zero.

The population slope, say βk, has interpretation in terms of the change in log odds of success for a single-unit change in Xk given that all of the other covariates remain constant.

When one of the covariates is dichotomous, say X2, then β2 has a special
interpretation:

exp (β2) is the odds ratio or ratio of odds of success for the two possible
levels of X2 (given that all of the other covariates remain constant).
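To see why exp (β2) is an odds ratio, write the model at the two levels of X2 and subtract. The sketch below assumes X2 is coded 0/1 (the coding is not stated in the text) and holds all other covariates fixed.

```latex
\begin{align*}
\ln(\text{Odds} \mid X_2 = 1) &= \beta_1 X_1 + \beta_2 (1) + \beta_3 X_3 + \dots + \beta_p X_p \\
\ln(\text{Odds} \mid X_2 = 0) &= \beta_1 X_1 + \beta_2 (0) + \beta_3 X_3 + \dots + \beta_p X_p \\
\ln\!\left[ \frac{\text{Odds} \mid X_2 = 1}{\text{Odds} \mid X_2 = 0} \right] &= \beta_2
\quad\Longrightarrow\quad
\text{Odds ratio} = \exp(\beta_2).
\end{align*}
```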

Keep in mind that as:

π increases

⇒ odds of success increases

⇒ log odds of success increases

Similarly, as:

π decreases

⇒ odds of success decreases

⇒ log odds of success decreases

Example: Development of bronchopulmonary dysplasia (BPD) in a sample of 223 low birth weight infants.

Binary Response: Y = 1 if BPD is present, Y = 0 otherwise.

Covariate: Birth weight of infant in grams.

Consider the following logistic regression model

ln [π/ (1 − π)] = β1 + β2Weight

where π = E(Y ) = Pr(Y = 1) = Pr(BPD)

For the 223 infants in the sample, the estimated logistic regression (obtained using ML) is

ln [π̂/ (1 − π̂)] = 4.0343 − 0.0042 Weight

The ML estimate of β2 implies that, for every 1 gram increase in birth weight, the log odds of BPD decreases by 0.0042.

For example, the odds of BPD for an infant weighing 1200 grams is

exp (4.0343 − 1200 × 0.0042) = exp (−1.0057) = 0.3658

Thus the predicted probability of BPD is:

0.3658/ (1 + 0.3658) = 0.268
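The arithmetic for the 1200 gram infant can be reproduced directly from the reported estimates. This is a minimal sketch: the coefficients are the ML estimates quoted above, while the function and variable names are illustrative.

```python
import math

# ML estimates reported in the text for the BPD example.
INTERCEPT = 4.0343
SLOPE_PER_GRAM = -0.0042


def bpd_log_odds(weight_grams):
    """Estimated log odds of BPD at a given birth weight (grams)."""
    return INTERCEPT + SLOPE_PER_GRAM * weight_grams


def bpd_probability(weight_grams):
    """Estimated probability of BPD: odds / (1 + odds)."""
    odds = math.exp(bpd_log_odds(weight_grams))
    return odds / (1.0 + odds)


log_odds_1200 = bpd_log_odds(1200)
odds_1200 = math.exp(log_odds_1200)
prob_1200 = bpd_probability(1200)

print(f"log odds    = {log_odds_1200:.4f}")
print(f"odds        = {odds_1200:.4f}")
print(f"probability = {prob_1200:.3f}")
```

Calling `bpd_probability` at other birth weights traces out the estimated response curve described in Figure 2.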

Figure 2: Plot of estimated logistic response function of BPD on birth weight.

Review: Poisson Regression

In Poisson regression, the response variable is a count (e.g. number of cases of a disease in a given period of time).

The Poisson distribution provides the basis of likelihood-based inference.

Often the counts may be expressed as rates.

That is, the count or absolute number of events is often not satisfactory because any comparison depends almost entirely on the sizes of the groups (or the “time at risk”) that generated the observations.

Like a proportion or probability, a rate provides a basis for direct comparison.

In either case, Poisson regression relates the expected counts or rates to a set of covariates.

The Poisson regression model has two components:

1. The response variable is a count and is assumed to have a Poisson distribution.
That is, the probability that a specific number of events, y, occurs is

Pr(y events) = e^(−λ) λ^y / y!

Note that λ is the expected count or number of events and the expected rate is given by λ/t, where t is a relevant baseline measure (e.g., t might be the number of persons or the number of person-years of observation).

2. ln(λ/t) = β1X1 + β2X2 + . . . + βpXp

Note that since ln(λ/t) = ln(λ) − ln(t), the Poisson regression model can also be considered as

ln(λ) = ln(t) + β1X1 + β2X2 + . . . + βpXp

where the ‘coefficient’ associated with ln(t) is fixed to be 1.

This adjustment term is known as an “offset”.
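The two components can be put side by side in a small Python sketch: the offset turns a linear predictor for the log rate into an expected count λ, and the Poisson formula then gives the probability of each possible count. The numerical values of `t` and `eta` are hypothetical, chosen only for illustration, and the function names are not from the lecture.

```python
import math


def poisson_pmf(y, lam):
    """Pr(y events) = exp(-lam) * lam**y / y! for integer y >= 0, lam > 0."""
    if y < 0 or lam <= 0:
        raise ValueError("need y >= 0 and lam > 0")
    # Computed on the log scale so large y does not overflow the factorial.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def expected_count(t, linear_predictor):
    """lambda from ln(lambda) = ln(t) + linear predictor (offset coefficient 1)."""
    return math.exp(math.log(t) + linear_predictor)


# Hypothetical illustration values (assumptions, not lecture data).
t = 1000.0             # person-years of observation
eta = math.log(0.003)  # linear predictor = log expected rate (3 per 1000)

lam = expected_count(t, eta)
probs = [poisson_pmf(y, lam) for y in range(60)]

print(f"expected count lambda = {lam:.3f}")
print(f"expected rate lambda/t = {lam / t:.4f}")
print(f"Pr(0 events) = {probs[0]:.4f}")
```

Because the coefficient on ln(t) is fixed at 1, doubling the time at risk doubles the expected count while leaving the expected rate unchanged.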

Therefore, modelling λ (or λ/t) with a log function can be considered equivalent to a linear regression model where the mean of the continuous response has been replaced by the logarithm of the expected count (or rate).

Note that the relationship between λ (or λ/t) and the covariates is non-linear.

We can use ML estimation to obtain estimates of the Poisson regression parameters, under the assumption that the responses are Poisson random variables.

Given the Poisson regression model

ln(λ/t) = β1X1 + β2X2 + . . . + βpXp

the population intercept, β1, has interpretation as the log expected rate
when all the covariates take on the value zero.

The population slope, say βk, has interpretation in terms of the change in log expected rate for a single-unit change in Xk given that all of the other covariates remain constant.

When one of the covariates is dichotomous, say X2, then β2 has a special
interpretation:

exp (β2) is the (incidence) rate ratio for the two possible levels of X2 (given
that all of the other covariates remain constant).

Example: Prospective study of coronary heart disease (CHD).

The study observed 3154 men aged 40-50 for an average of 8 years and
recorded incidence of cases of CHD.

The risk factors considered include:

Smoking exposure: 0, 10, 20, 30 cigs per day;

Blood Pressure: 0 (< 140), 1 (≥ 140);
Behavior Type: 0 (type B), 1 (type A).

A simple Poisson regression model is:

ln (λ/t) = ln(rate of CHD) = β1 + β2 Smoke

or
ln (λ) = ln(t) + β1 + β2 Smoke

Person-Years   Smoking   Blood Pressure   Behavior   CHD

5268.2         0         0                0          20
2542.0         10        0                0          16
1140.7         20        0                0          13
614.6          30        0                0          3
4451.1         0         0                1          41
2243.5         10        0                1          24
1153.6         20        0                1          27
925.0          30        0                1          17
1366.8         0         1                0          8
497.0          10        1                0          9
238.1          20        1                0          3
146.3          30        1                0          7
1251.9         0         1                1          29
640.0          10        1                1          21
374.5          20        1                1          7
338.2          30        1                1          12

In this model the ML estimate of β2 is 0.0318. That is, the rate of CHD increases by a factor of exp(0.0318) = 1.032 for each additional cigarette smoked per day.

Alternatively, the rate of CHD in smokers of one pack per day (20 cigs) is estimated to be (1.032)^20 = 1.88 times the rate of CHD in non-smokers.
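The simple model ln (λ) = ln(t) + β1 + β2 Smoke can be fitted to the tabulated CHD data using only the Python standard library. The sketch below uses Newton-Raphson on the Poisson log-likelihood, which is one common way to obtain ML estimates; the lecture does not say which software or algorithm produced its estimate, so the function `fit_poisson_offset` and its starting values are assumptions made for illustration.

```python
import math

# Rows of the CHD table: (person-years, cigarettes per day, CHD cases).
# Blood pressure and behavior type are not used in this simple model.
chd_rows = [
    (5268.2, 0, 20), (2542.0, 10, 16), (1140.7, 20, 13), (614.6, 30, 3),
    (4451.1, 0, 41), (2243.5, 10, 24), (1153.6, 20, 27), (925.0, 30, 17),
    (1366.8, 0, 8), (497.0, 10, 9), (238.1, 20, 3), (146.3, 30, 7),
    (1251.9, 0, 29), (640.0, 10, 21), (374.5, 20, 7), (338.2, 30, 12),
]


def fit_poisson_offset(rows, n_iter=50):
    """Newton-Raphson ML fit of ln(lambda) = ln(t) + b1 + b2 * smoke."""
    total_t = sum(t for t, _, _ in rows)
    total_y = sum(y for _, _, y in rows)
    # Start at the overall log rate with no smoking effect.
    b1, b2 = math.log(total_y / total_t), 0.0
    for _ in range(n_iter):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for t, x, y in rows:
            mu = t * math.exp(b1 + b2 * x)  # offset ln(t) has coefficient 1
            g1 += y - mu                    # score for b1
            g2 += x * (y - mu)              # score for b2
            h11 += mu                       # information matrix entries
            h12 += x * mu
            h22 += x * x * mu
        det = h11 * h22 - h12 * h12
        # Newton step: solve the 2x2 system (information) * delta = score.
        d1 = (h22 * g1 - h12 * g2) / det
        d2 = (h11 * g2 - h12 * g1) / det
        b1, b2 = b1 + d1, b2 + d2
        if abs(d1) + abs(d2) < 1e-12:
            break
    return b1, b2


b1_hat, b2_hat = fit_poisson_offset(chd_rows)
print(f"b1 = {b1_hat:.4f}, b2 = {b2_hat:.4f}")
print(f"rate ratio per cigarette   = {math.exp(b2_hat):.3f}")
print(f"rate ratio, 20 vs 0 cigs   = {math.exp(20 * b2_hat):.2f}")
```

In practice a statistical package would be used for this fit; the hand-written solver is shown only to make the role of the offset and the likelihood equations explicit.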

We can include the additional risk factors in the following model:

ln (λ/t) = β1 + β2 Smoke + β3 Type + β4BP

Effect      Estimate   Std. Error

Intercept   -5.420     0.130
Smoke       0.027      0.006
Type        0.753      0.136
BP          0.753      0.129

Now, the adjusted rate of CHD (controlling for BP and behavior type) increases by a factor of exp(0.027) = 1.027 for each additional cigarette smoked per day.

The adjusted rate of CHD in smokers of one pack per day (20 cigs) is estimated to be (1.027)^20 = 1.7 times the rate of CHD in non-smokers.
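The adjusted rate ratios follow from exponentiating the reported estimates. The sketch below simply redoes that arithmetic; the dictionary layout and variable names are illustrative, and the estimates are copied from the table for the adjusted model.

```python
import math

# Estimates reported for the adjusted model (Smoke, Type, BP).
estimates = {"Smoke": 0.027, "Type": 0.753, "BP": 0.753}

# exp(estimate) is the rate ratio for a one-unit change in each covariate.
rate_ratios = {name: math.exp(beta) for name, beta in estimates.items()}

# One pack per day is 20 cigarettes, so the exponent is multiplied by 20.
rr_pack_per_day = math.exp(20 * estimates["Smoke"])

for name, rr in rate_ratios.items():
    print(f"{name:<6} rate ratio = {rr:.3f}")
print(f"20 cigs vs 0: rate ratio = {rr_pack_per_day:.2f}")
```

Exponentiating 20 times the unrounded estimate and raising the rounded ratio 1.027 to the 20th power give slightly different third digits, but both round to 1.7.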

Finally, note that when a Poisson regression model is applied to data consisting of very small rates (say, λ/t << 0.01), then the rate is approximately equal to the corresponding probability, p, and

ln (rate) ≈ ln (p) ≈ ln [p/ (1 − p)]

Therefore, the parameters for Poisson regression and logistic regression models are approximately equal when the event being studied is rare.

In that case, Poisson and logistic regression will not give discernibly different results.
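The quality of the approximation ln (p) ≈ ln [p/ (1 − p)] can be checked numerically. The gap between the two is exactly −ln (1 − p), which shrinks toward zero as p becomes small. The probabilities below are illustrative values chosen for this sketch.

```python
import math


def logit(p):
    """Log odds ln[p / (1 - p)] for 0 < p < 1."""
    return math.log(p / (1.0 - p))


# Illustrative probabilities, from common to rare (an assumption).
p_values = (0.2, 0.05, 0.01, 0.001)
comparisons = [(p, math.log(p), logit(p)) for p in p_values]

# Gap between the logit and the log: equals -ln(1 - p).
gaps = [lg - l for _, l, lg in comparisons]

for (p, l, lg), gap in zip(comparisons, gaps):
    print(f"p = {p:<6} ln(p) = {l:8.4f}  logit(p) = {lg:8.4f}  gap = {gap:.4f}")
```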

Overdispersion

Count data (or counts of number of successes) often have variability that far exceeds that predicted by the Poisson (or binomial) distribution.

This phenomenon is referred to as overdispersion.

Although underdispersion can also arise, it is far less common.

Failure to account for overdispersion has negligible impact on the estimated regression coefficients.

Neglecting overdispersion results in standard errors being underestimated and potentially misleading inferences (e.g., confidence intervals that are too narrow and p-values that are too small).

Example: Clinical Trial of Antibiotics for Leprosy

Placebo-controlled clinical trial of 30 patients with leprosy at the Eversley Childs Sanitorium in the Philippines.

Participants were randomized to either of two antibiotics (denoted treatment drug A and B) or to a placebo (denoted treatment drug C).

Baseline data on number of leprosy bacilli at 6 sites of body were recorded.

After several months of treatment, number of bacilli were recorded a second time.

Outcome: Total count of number of leprosy bacilli at 6 sites.

Table 1: Mean count of leprosy bacilli at six sites of the body (and variance) post-treatment.

Treatment Group        Post-Treatment Mean (Variance)

Drug A (Antibiotic)    5.3 (21.6)
Drug B (Antibiotic)    6.1 (37.9)
Drug C (Placebo)       12.3 (51.1)

Consider outcome (post-treatment) at end of study.

Variability is approximately 4 to 6 times larger than that predicted by Poisson variation, under which the variance would equal the mean.
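The "4 to 6 times" figure comes from dividing each group's variance by its mean, since a Poisson count has variance equal to its mean. The sketch below redoes that division with the values in Table 1; the variable names are illustrative.

```python
# (mean, variance) of post-treatment bacilli counts from Table 1.
table1 = {
    "Drug A (Antibiotic)": (5.3, 21.6),
    "Drug B (Antibiotic)": (6.1, 37.9),
    "Drug C (Placebo)": (12.3, 51.1),
}

# Under Poisson variation the ratio variance / mean would be about 1.
dispersion_ratios = {
    group: variance / mean for group, (mean, variance) in table1.items()
}

for group, ratio in dispersion_ratios.items():
    print(f"{group:<20} variance / mean = {ratio:.1f}")
```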

Adjustments to nominal standard errors to account for overdispersion can be made either by including a scale factor φ in the specification of the Poisson variance,

Var(Yi) = φ μi,

or by basing standard errors on the so-called “sandwich” estimator of Cov(β̂).
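The text does not say how φ is estimated. One widely used moment-based choice, given here as an assumption rather than as the lecture's method, divides the Pearson chi-square statistic by the residual degrees of freedom (N observations, p regression parameters) and then inflates the nominal Poisson standard errors by the square root of the result.

```latex
\begin{align*}
\hat{\phi} &= \frac{1}{N - p} \sum_{i=1}^{N} \frac{(Y_i - \hat{\mu}_i)^2}{\hat{\mu}_i}, \\
\widehat{\mathrm{SE}}_{\text{adjusted}}(\hat{\beta}_k)
  &= \sqrt{\hat{\phi}} \times \widehat{\mathrm{SE}}_{\text{Poisson}}(\hat{\beta}_k).
\end{align*}
```

With φ̂ between 4 and 6, as the leprosy data suggest, the adjusted standard errors would be roughly 2 to 2.5 times the nominal ones.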
