Lecture 13: Introduction to
Logistic Regression
Sandy Eckel
[email protected]
13 May 2008
Logistic Regression
Basic Idea:
Logistic regression is the type of
regression we use for a response variable
(Y) that follows a binomial distribution
Linear regression is the type of regression
we use for a continuous, normally
distributed response (Y) variable
Remember the Binomial Distribution?
Review of the Binomial Model
Y ~ Binomial(n,p)
n independent trials
(e.g., coin tosses)
p = probability of success on each trial
(e.g., p = ½ = Pr of heads)
Y = number of successes out of n trials
(e.g., Y= number of heads)
Binomial Distribution Example
Binomial probability mass function (pmf):
P(Y = y) = (n choose y) p^y (1 − p)^(n−y)
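As a quick illustrative sketch (not part of the lecture), the pmf above can be computed directly with Python's standard library:

```python
# Binomial pmf, computed directly from the formula above.
from math import comb

def binom_pmf(y, n, p):
    """P(Y = y) for Y ~ Binomial(n, p)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# Example: probability of 2 heads in 4 tosses of a fair coin
print(binom_pmf(2, 4, 0.5))  # 0.375
```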
Why can’t we use Linear Regression
to model binary responses?
The response (Y) is NOT normally distributed
The variability of Y is NOT constant
Variance of Y depends on the expected value of Y
For a binary response Y ~ Binomial(1,p) we have Var(Y) = pq, which
depends on the expected response, E(Y) = p
The model must produce predicted/fitted
probabilities that are between 0 and 1
Linear models produce fitted responses that vary
from -∞ to ∞
Binomial Y example
Consider a phase I clinical trial in which
35 independent patients are given a
new medication for pain relief. Of the 35
patients, 22 report “significant” relief
one hour after medication
Question: How effective is the drug?
Model
Y = # patients who get relief
n = 35 patients (trials)
p = probability of relief for any patient
The truth we seek in the population
How effective is the drug? What is p?
Want a method to
Get best estimate of p given data
Determine range of plausible values for p
How do we estimate p?
Maximum Likelihood Method
The method of maximum likelihood estimation chooses
values for parameter estimates which make the observed
data “maximally likely” under the specified model
Likelihood Function: Pr(Y = 22 of 35)
[Figure: the likelihood plotted against p = Prob(Event) over 0 to 1; the curve peaks at the maximum likelihood estimate, p = 0.63.]
Maximum Likelihood
Clinical trial example
Under the binomial model, ‘likelihood’ for observed Y=y
P(Y = y) = (n choose y) p^y (1 − p)^(n−y)
So for this example the likelihood function is:
P(Y = 22) = (35 choose 22) p^22 (1 − p)^13
So, estimate p by choosing the value for p which makes
observed data “maximally likely”
i.e., choose p that makes the value of Pr (Y=22) maximal
The ML estimate of p is y/n
= 22/35
= 0.63
The estimated proportion of patients who will experience
relief is 0.63
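The maximization can also be sketched numerically; the crude grid search below (an illustration, not the lecture's method) recovers the closed-form MLE y/n:

```python
from math import comb

def likelihood(p, y=22, n=35):
    """Binomial likelihood for the observed trial data (22 of 35 relieved)."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

# crude grid search over p in (0, 1)
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)
print(p_hat)  # ~0.629, matching the closed-form MLE 22/35 = 0.63
```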
Confidence Interval (CI) for p
Recall the general form of any CI:
Estimate ± (something near 2) x SE(estimate)
Variance of p̂: Var(p̂) = p(1 − p)/n = pq/n
"Standard Error" of p̂: √(pq/n)
Estimate of "Standard Error" of p̂: √(p̂q̂/n)
Confidence Interval for p
95% Confidence Interval for the ‘true’
proportion, p:
p̂ ± 1.96·√(p̂q̂/n) = 0.63 ± 1.96·√((0.63)(0.37)/35)
LB: 0.63 − 1.96(0.082) = 0.47
UB: 0.63 + 1.96(0.082) = 0.79
= (0.47, 0.79)
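The interval arithmetic above can be checked with a short script (illustrative sketch):

```python
from math import sqrt

y, n = 22, 35
p_hat = y / n                         # 0.63
se = sqrt(p_hat * (1 - p_hat) / n)    # ~0.082
lb, ub = p_hat - 1.96 * se, p_hat + 1.96 * se
print(round(lb, 2), round(ub, 2))     # 0.47 0.79
```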
Conclusion
Based upon our clinical trial in which 22 of 35
patients experience relief, we estimate that 63%
of persons who receive the new drug experience
relief within 1 hour (95% CI: 47% to 79%)
Whether 63% (47% to 79%) represents an
'effective' drug will depend on many things,
especially on the science of the problem.
Sore throat pain?
Arthritis pain?
Childbirth pain?
Aside: Review of Probabilities and Odds
The odds of an event are defined as:
odds(Y=1) = P(Y = 1)/P(Y = 0) = P(Y = 1)/[1 − P(Y = 1)] = p/(1 − p)
We can go back and forth between odds and
probabilities:
Odds = p/(1 − p)
p = odds/(odds + 1)
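These two conversions can be sketched as a pair of helper functions (the names `odds` and `prob` are mine):

```python
def odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)

def prob(o):
    """Convert odds back to a probability: odds / (odds + 1)."""
    return o / (o + 1)

print(odds(0.5))                       # 1.0
print(round(prob(odds(0.63)), 10))     # 0.63 (round trip)
```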
Aside: Review of Odds Ratio
We saw that an odds ratio (OR) can be
helpful for comparisons.
Recall the Vitamin A trial where we
looked at the odds ratio of death
comparing the vitamin A group to the
no vitamin A group:
OR = odds(Death | Vit. A) / odds(Death | No Vit. A)
Aside: Review of Odds Ratio Interpretation
The OR here describes the benefits of
Vitamin A therapy. We saw for this
example that:
OR = 0.59
The Vitamin A group had 0.59 times the
odds of death of the no Vitamin A group; or
roughly an estimated 40% reduction in mortality
OR is a building block for logistic
regression
Logistic Regression
Suppose we want to ask whether new
drug is better than a placebo and have
the following observed data:
Relief? Drug Placebo
No 13 20
Yes 22 15
Total 35 35
Confidence Intervals for p
[Figure: 95% confidence intervals for p, drawn on the 0-to-1 probability scale, for the Placebo and Drug groups.]
Odds Ratio
OR = odds(Relief | Drug) / odds(Relief | Placebo)
   = {P(Relief | Drug)/[1 − P(Relief | Drug)]} / {P(Relief | Placebo)/[1 − P(Relief | Placebo)]}
   = [0.63/(1 − 0.63)] / [0.43/(1 − 0.43)] = 2.26
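The calculation can be reproduced directly from the 2×2 table (an illustrative sketch):

```python
# 2x2 table from the trial
relief = {"drug": 22, "placebo": 15}
no_relief = {"drug": 13, "placebo": 20}

odds_drug = relief["drug"] / no_relief["drug"]            # 22/13
odds_placebo = relief["placebo"] / no_relief["placebo"]   # 15/20
or_hat = odds_drug / odds_placebo
print(round(or_hat, 2))  # 2.26
```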
Confidence Interval for OR
The CI uses Woolf's method for the standard error of log(ÔR) (from Lecture 6):
se(log(ÔR)) = √(1/22 + 1/13 + 1/15 + 1/20) = 0.489
Find log(ÔR) ± 1.96·se(log(ÔR)) = (L, U)
Then the CI for the OR is (e^L, e^U)
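Woolf's standard error and the resulting interval can be sketched as:

```python
from math import exp, log, sqrt

a, b, c, d = 22, 13, 15, 20   # relief/no-relief counts: drug, then placebo
log_or = log((a / b) / (c / d))
se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
print(round(se, 3))  # 0.489

# CI on the log scale, then exponentiated back to the OR scale
lo, hi = exp(log_or - 1.96 * se), exp(log_or + 1.96 * se)
print(round(lo, 2), round(hi, 2))  # close to the slide's (0.86, 5.90) up to rounding
```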
Interpretation
OR = 2.26
95% CI: (0.86 , 5.90)
The Drug is an estimated 2 ¼ times
better than the placebo.
But could the difference be due to
chance alone?
YES ! 1 is a ‘plausible’ true population OR
Logistic Regression
Can we set up a model for this binomial
outcome similar to what we’ve done in
regression?
Idea: model the log odds of the event,
(in this example, relief) as a function of
predictor variables
A regression model for the log odds
log[ odds(Relief | Tx) ] = log[ P(relief | Tx) / P(no relief | Tx) ] = β0 + β1·Tx
where: Tx = 0 if Placebo
1 if Drug
log( odds(Relief|Drug) ) = β0 + β1
log( odds(Relief|Placebo) ) = β0
log( odds(Relief|D)) – log( odds(Relief|P)) = β1
And…
Because of the basic property of logs:
log( odds(Relief|D)) – log( odds(Relief|P)) = β1
log[ odds(R | D) / odds(R | P) ] = β1
And: OR = exp(β1) = e^β1 !!
So: exp(β1) = odds ratio of relief for patients
taking the Drug-vs-patients taking the Placebo.
Logistic Regression
Logit estimates Number of obs = 70
LR chi2(1) = 2.83
Prob > chi2 = 0.0926
Log likelihood = -46.99169 Pseudo R2 = 0.0292
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Tx | .8137752 .4889211 1.66 0.096 -.1444926 1.772043
(Intercept) | -.2876821 .341565 -0.84 0.400 -.9571372 .3817731
------------------------------------------------------------------------------
Estimates:
log( odds(relief|Tx) ) = β̂0 + β̂1Tx
= -0.288 + 0.814(Tx)
Therefore: OR = exp(0.814) = 2.26 !
So 2.26 is the odds ratio of relief for patients taking the Drug
compared to patients taking the Placebo
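Because this model has a single binary predictor, it is saturated and its MLEs have a closed form (the sample log-odds), so the Stata estimates can be verified by hand; a sketch:

```python
from math import exp, log

# trial counts
relief_drug, no_relief_drug = 22, 13
relief_placebo, no_relief_placebo = 15, 20

# saturated 2x2 logistic model: MLEs are the sample log-odds
b0 = log(relief_placebo / no_relief_placebo)   # intercept: log-odds of relief on placebo
b1 = log(relief_drug / no_relief_drug) - b0    # log odds ratio, drug vs placebo
print(round(b0, 3), round(b1, 3))  # -0.288 0.814
print(round(exp(b1), 2))           # 2.26
```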
It’s the same as the OR we got before!
So, why go to all the trouble of setting up a
linear model?
What if there is a biologic reason to expect
that the rate of relief (and perhaps drug
efficacy) is age dependent?
What if
Pr(relief) = function of Drug or Placebo AND Age
We could easily include age in a model such
as:
log( odds(relief) ) = β0 + β1Drug + β2Age
Logistic Regression
As in MLR, we can include many
additional covariates
For a Logistic Regression model with
r number of predictors:
log ( odds(Y=1)) = β0 + β1X1 + ... + βrXr
where: odds(Y=1) = Pr(Y = 1)/[1 − Pr(Y = 1)] = Pr(Y = 1)/Pr(Y = 0)
Logistic Regression
Thus:
log[ Pr(Y = 1)/Pr(Y = 0) ] = β0 + β1X1 + ... + βrXr
But, why use log(odds)?
Linear regression might estimate anything
(-∞, +∞), not just a proportion in the range of
0 to 1
Logistic regression is a way to estimate a
proportion (between 0 and 1) as well as some
related items
Another way to motivate using log(odds) for the
left-hand side of logistic regression
We would like to use something like
what we know from linear regression:
Continuous outcome = β0 + β1X1 + β2X2+…
How can we turn a proportion into a
continuous outcome?
Transforming a proportion…
A proportion is a value between 0 and 1
The odds are always positive:
odds = p/(1 − p) ⇒ [0, +∞)
The log odds is continuous:
log odds = ln[ p/(1 − p) ] ⇒ (−∞, +∞)
“Logit” transformation of the probability
Measure                              Min   Max   Name
Pr(Y = 1)                            0     1     "probability"
Pr(Y = 1)/[1 − Pr(Y = 1)]            0     ∞     "odds"
log{ Pr(Y = 1)/[1 − Pr(Y = 1)] }     −∞    ∞     "log-odds" or "logit"
Logit Function
Relates log-odds (logit) to p = Pr(Y=1)
[Figure: the logit function; log-odds (−10 to 10) plotted against the probability of success (0 to 1).]
Key Relationships
Relating log-odds, probabilities, and
parameters in logistic regression:
Suppose we have the model:
logit(p) = β0 + β1X
i.e. log[ p/(1 − p) ] = β0 + β1X
Take "anti-logs" to get back to the odds scale:
p/(1 − p) = exp(β0 + β1X)
Solve for p as a function of the coefficients
p/(1-p) = exp(β0 + β1X)
p = (1 – p)⋅exp(β0 + β1X)
p = exp(β0 + β1X) – p ⋅ exp(β0 + β1X)
p + p ⋅exp(β0 + β1X) = exp(β0 + β1X)
p ⋅{1+ exp(β0 + β1X)} = exp(β0 + β1X)
p = exp(β0 + β1X) / [1 + exp(β0 + β1X)]
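The final expression is the inverse-logit function; a minimal sketch (the name `expit` is mine, matching common usage):

```python
from math import exp

def expit(x):
    """Inverse logit: maps a log-odds value to a probability in (0, 1)."""
    return exp(x) / (1 + exp(x))

print(expit(0))                          # 0.5
print(round(expit(-0.288 + 0.814), 2))   # 0.63: Pr(relief | Drug) from the fitted model
```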
What’s the point of all that algebra?
Now we can determine the estimated
probability of success for a specific set
of covariates, X, after running a logistic
regression model
Example
Dependence of Blindness on Age
The following data concern the Aegean
island of Kalytos where inhabitants
suffer from a congenital eye disease
whose effects become more marked
with age.
Samples of 50 people were taken at five
different ages and the numbers of blind
people were counted
Example: Data
Age Number blind / 50
20 6 / 50
35 7 / 50
45 26 / 50
55 37 / 50
70 44 / 50
Question
The scientific question of interest is to
determine how the probability of
blindness is related to age in this
population
Let pi = Pr(a person in age class i is blind)
Model 1 – Intercept only model
logit(pi) = β0*
β0*= log-odds of blindness for all ages
exp(β0*) = odds of blindness for all ages
No age dependence in this model
Model 2 – Intercept and age
logit(pi) = β0 + β1(agei – 45)
β0 = log-odds of blindness among 45 year olds
exp(β0) = odds of blindness among 45 year olds
β1 = difference in log-odds of blindness
comparing a group that is one year older than
another
exp(β1) = odds ratio of blindness comparing a
group that is one year older than another
Results
Model 1: logit(pi) = β0*
Logit estimates Number of obs = 250
LR chi2(0) = 0.00
Prob > chi2 = .
Log likelihood = -173.08674 Pseudo R2 = 0.0000
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(Intercept) | -.0800427 .1265924 -0.63 0.527 -.3281593 .1680739
------------------------------------------------------------------------------
logit(p̂i) = −0.08   or   p̂i = exp(−0.08)/[1 + exp(−0.08)] = 0.48
Results
Model 2: logit(pi) = β0 + β1(agei – 45)
Logit estimates Number of obs = 250
LR chi2(1) = 99.30
Prob > chi2 = 0.0000
Log likelihood = -123.43444 Pseudo R2 = 0.2869
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0940683 .0119755 7.86 0.000 .0705967 .1175399
(Intercept) | -4.356181 .5700966 -7.64 0.000 -5.473549 -3.238812
------------------------------------------------------------------------------
logit(p̂i) = −4.36 + 0.094·agei
(equivalently, −0.13 + 0.094(agei − 45), since −4.36 + 0.094×45 ≈ −0.13)
or
p̂i = exp(−4.36 + 0.094·agei)/[1 + exp(−4.36 + 0.094·agei)]
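Using the (rounded) Stata estimates, the fitted probabilities at the five study ages can be compared with the observed proportions; a sketch:

```python
from math import exp

def p_blind(age, b0=-4.356, b1=0.094):
    """Fitted Pr(blind | age) from Model 2, using the rounded Stata estimates."""
    lp = b0 + b1 * age
    return exp(lp) / (1 + exp(lp))

observed = {20: 6 / 50, 35: 7 / 50, 45: 26 / 50, 55: 37 / 50, 70: 44 / 50}
for age, obs in observed.items():
    print(age, round(p_blind(age), 2), round(obs, 2))
```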
Test of significance
Is the addition of the age variable in the
model important?
Maximum likelihood estimates:
β̂1 =0.094 s.e.(β̂1 )=0.012
z-test: H0: β1 = 0
z=7.855; p-val=0.000
95% C.I. (0.07, 0.12)
What about the Odds Ratio?
Maximum likelihood estimates:
OR = exp( β̂1)= exp(0.094)= 1.10
SE(β̂1) = SE(log(ÔR)) = 0.012
Same z-test, reworded for the OR scale:
H0: exp(β1) = 1
z = 7.86 p-val = 0.000
95% C.I. for exp(β1): (1.07, 1.12)
*(calculated on the log scale, then exponentiated!!)
(e^(0.094 − 1.96×0.012), e^(0.094 + 1.96×0.012))
It appears that blindness is age dependent
Note: exp(0) = 1, where is this fact useful?
Model 1 fit
Plot of observed proportion -vs-
predicted proportions using an intercept
only model
[Figure: observed proportions blind vs. age (20 to 80), with the flat predicted probability from the intercept-only model.]
Model 2 fit
Plot of observed proportion -vs-
predicted proportions with age in the
model
[Figure: observed proportions blind vs. age (20 to 80), with the rising predicted probability curve from Model 2.]
Conclusion
Model 2 clearly fits better than Model 1!
Including age in our model is better
than intercept alone.
Lecture 13 Summary
Logistic regression gives us a framework in which
to model binary outcomes
Uses the structure of linear models, with
outcomes modelled as a function of covariates
As we’ll see, many concepts carry over from
linear regression
Interactions
Linear splines
Tests of significance for coefficients
All coefficients will have different
interpretations in logistic regression
Log odds or Log odds ratios!
HW 3 Hint
General logistic model specification:
Systematic:
logit(P(Yi = 1)) = log(odds(Yi = 1)) = β0 + β1x1 + β2x2
Random:
Yi ~ Binomial(1, pi)
where pi depends on the covariates for person i