
Logistic Regression

Prof. Andy Field


Aims
• When and why do we use logistic regression?
  – Binary
  – Multinomial
• Theory behind logistic regression
  – Assessing the model
  – Assessing predictors
  – Things that can go wrong
• Interpreting logistic regression
When and Why
• To predict an outcome variable that is categorical from one or more categorical or continuous predictor variables.
• Used because having a categorical outcome variable violates the assumption of linearity in normal regression.
With One Predictor

$$P(Y) = \frac{1}{1 + e^{-(b_0 + b_1X_{1i})}}$$

• Outcome
  – We predict the probability of the outcome occurring.
• b0 and b1
  – Can be thought of in much the same way as in multiple regression.
  – Note that the normal regression equation forms part of the logistic regression equation.
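Not part of the original slides: a minimal Python sketch of this equation with made-up coefficients, just to show how the probability is computed.

```python
import math

def logistic_probability(b0, b1, x):
    """P(Y) = 1 / (1 + e^-(b0 + b1*x)) for a single predictor."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

# Hypothetical coefficients: when b0 + b1*x = 0 the probability is 0.5
print(logistic_probability(-4.0, 0.08, 50.0))  # 0.5
```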
With Several Predictors

$$P(Y) = \frac{1}{1 + e^{-(b_0 + b_1X_{1i} + b_2X_{2i} + \dots + b_nX_{ni})}}$$

• Outcome
  – We still predict the probability of the outcome occurring.
• Differences
  – Note the multiple regression equation forms part of the logistic regression equation.
  – This part of the equation expands to accommodate additional predictors.
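The same sketch extended to several predictors (again with hypothetical values); only the linear part of the equation grows:

```python
import math

def logistic_probability(b0, bs, xs):
    """P(Y) = 1 / (1 + e^-(b0 + b1*x1 + ... + bn*xn))."""
    linear = b0 + sum(b * x for b, x in zip(bs, xs))
    return 1 / (1 + math.exp(-linear))

# Hypothetical model with two predictors
print(logistic_probability(-2.0, [0.5, 1.2], [3.0, 0.0]))  # ~0.38
```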
Assessing the Model: The Log-Likelihood Statistic

$$\log\text{-likelihood} = \sum_{i=1}^{N}\Big[\,Y_i \ln\big(P(Y_i)\big) + (1 - Y_i)\ln\big(1 - P(Y_i)\big)\Big]$$

• The log-likelihood statistic
  – Analogous to the residual sum of squares in multiple regression.
  – It is an indicator of how much unexplained information there is after the model has been fitted.
  – Large values indicate poorly fitting statistical models.
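An illustrative computation of this statistic (the outcomes and fitted probabilities below are made up):

```python
import math

def log_likelihood(y, p):
    """Sum over cases of Y*ln(P(Y)) + (1 - Y)*ln(1 - P(Y))."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

# Made-up observed outcomes and fitted probabilities
y = [1, 0, 1, 1, 0]
p = [0.9, 0.2, 0.7, 0.6, 0.4]
print(log_likelihood(y, p))  # nearer to 0 = less unexplained information
```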
Assessing the Model: The Deviance Statistic
• The deviance is very closely related to the log-likelihood; it's given by
  – Deviance = −2 × log-likelihood
• It's possible to calculate a log-likelihood or deviance for different models and to compare these models by looking at the difference between their deviances.
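A sketch of such a comparison with made-up log-likelihoods (the baseline value is chosen to match the −2LL of 154.084 that appears in the worked example later). The drop in deviance is tested against a chi-square distribution:

```python
from scipy.stats import chi2

# Hypothetical log-likelihoods for a baseline model and one extra predictor
ll_baseline, ll_new = -77.042, -72.5
deviance_baseline = -2 * ll_baseline    # 154.084
deviance_new = -2 * ll_new              # 145.0

# The difference in deviances is chi-square distributed, with df equal
# to the number of parameters added (here, 1)
chi_square = deviance_baseline - deviance_new
p_value = chi2.sf(chi_square, df=1)
print(chi_square, p_value)
```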
Assessing the Model: R and R²
• The R-statistic
  – is the partial correlation between the outcome variable and each of the predictor variables.
  – It can vary between −1 and 1.
• The R-statistic is given by:

$$R = \sqrt{\frac{\text{Wald} - 2\,\text{df}}{-2LL(\text{baseline})}}$$

• Or, as an R²-analogue, Hosmer and Lemeshow's:

$$R_L^2 = \frac{-2LL(\text{baseline}) - \big(-2LL(\text{new})\big)}{-2LL(\text{baseline})}$$
Assessing Predictors: The Wald Statistic

$$\text{Wald} = \frac{b}{SE_b}$$

• Similar to the t-statistic in regression.
• Tests the null hypothesis that b = 0.
• Is biased when b is large.
• Better to look at likelihood-ratio statistics.
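An illustrative calculation with hypothetical values; note the slide gives the z form, while SPSS reports its square, which follows a chi-square distribution with 1 df:

```python
from scipy.stats import chi2

# Hypothetical coefficient and standard error
b, se_b = 1.23, 0.40

z = b / se_b          # the Wald statistic as on the slide (z form)
wald_chi2 = z ** 2    # squared form, chi-square distributed with 1 df
p_value = chi2.sf(wald_chi2, df=1)
print(z, wald_chi2, p_value)
```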
The Odds Ratio: Exp(B)

$$\text{odds ratio} = \frac{\text{odds after a unit change in the predictor}}{\text{original odds}}$$

• Indicates the change in odds resulting from a unit change in the predictor (see the sketch below).
  – OR > 1: as the predictor increases, the probability of the outcome occurring increases.
  – OR < 1: as the predictor increases, the probability of the outcome occurring decreases.
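For example, with a hypothetical coefficient:

```python
import math

# Hypothetical coefficient from a fitted model
b = 0.32
odds_ratio = math.exp(b)
print(odds_ratio)  # ~1.38: each unit increase in the predictor
                   # multiplies the odds of the outcome by about 1.38
```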
Methods of Regression
• Forced entry: all variables entered simultaneously.
• Hierarchical: variables entered in blocks.
  – Blocks should be based on past research, or the theory being tested. A good method.
• Stepwise: variables entered on the basis of statistical criteria (i.e. their relative contribution to predicting the outcome).
  – Should be used only for exploratory analysis.
Model Building and Parsimony
• When building a model we should strive for parsimony.
  – Predictors should not be included unless they have explanatory benefit.
• First fit the model that includes all potential predictors, and then systematically remove any that don't seem to contribute to the model.
Things That Can Go Wrong
• Assumptions from Linear Regression:
– Linearity
– Independence of Errors
– Multicollinearity
• Unique Problems
– Incomplete Information
– Complete Separation
– Overdispersion
Incomplete Information from the Predictors
• Categorical predictors:
  – Predicting cancer from smoking and eating tomatoes.
  – We don't know what happens when non-smokers eat tomatoes because we have no data in this cell of the design.
• Continuous variables:
  – Is your sample likely to include an 80-year-old, highly anxious, Buddhist, left-handed lesbian?
Complete Separation
• When the outcome variable can be perfectly predicted.
  – E.g. predicting whether someone is a burglar or your teenage son or your cat based on weight.
  – Weight is a perfect predictor of cat/burglar unless you have a very fat cat indeed!

[Figure: two panels plotting probability of outcome (0.0–1.0) against weight (kg).]
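A sketch (not from the slides) of why estimation breaks down here: with perfectly separated data, making the slope steeper always improves the log-likelihood, so there is no finite maximum-likelihood estimate. The weights below are invented.

```python
import math

def log_likelihood(b0, b1, xs, ys):
    """Binary log-likelihood, computed in a numerically stable form."""
    ll = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        # y*ln(p) + (1-y)*ln(1-p) simplifies to y*z - ln(1 + e^z)
        ll += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return ll

# Made-up weights (kg): cats well below 18 kg (y=0), burglars above (y=1)
xs = [4, 6, 8, 60, 75, 90]
ys = [0, 0, 0, 1, 1, 1]

# Steeper and steeper slopes (cut-off fixed at 18 kg) keep improving the
# fit: the log-likelihood climbs toward 0 and never reaches a maximum.
for b1 in (0.1, 1.0, 10.0):
    print(b1, log_likelihood(-18 * b1, b1, xs, ys))
```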


Overdispersion
• Overdispersion is where the variance is larger than expected from the model.
• This can be caused by violating the assumption of independence.
• This problem makes the standard errors too small!
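One common diagnostic (not shown on the slide) is the ratio of the model's deviance to its residual degrees of freedom; values well above 1 suggest overdispersion. A sketch with made-up numbers:

```python
# Hypothetical fitted-model statistics
deviance = 187.2
residual_df = 110          # number of cases minus parameters estimated

dispersion = deviance / residual_df
print(dispersion)          # well above 1 => possible overdispersion
# A common remedy is to rescale standard errors by sqrt(dispersion).
```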
An Example
• Predictors of a treatment intervention.
• Participants:
  – 113 adults with a medical problem.
• Outcome:
  – Cured (1) or not cured (0).
• Predictors:
  – Intervention: intervention or no treatment.
  – Duration: the number of days before treatment that the patient had the problem.
Output: Initial Model

Output: Block 0
The output is split into two blocks: block 0 describes the model before Intervention is included, and block 1 describes the model after Intervention is included. As such, block 1 is the main part we're interested in. The part of the block 0 output that does come in useful is the iteration history (Output 19.3), which will be there only if you selected Iteration history in Figure 19.10. This table tells us the initial −2LL, which is 154.084. We'll use this value later, so don't forget it.

Output: Model Summary
With Intervention included in the model, a patient is now classified as being cured or not based on whether they had an intervention or not (waiting list).

Classification Plot
Summary
• The overall fit of the final model is shown by −2LL and its associated chi-square statistic.
  – If the significance of the chi-square statistic is less than .05, then the model is a significant fit to the data.
• Check the table labelled Variables in the equation to see the regression parameters for any predictors in the model.
• Look at the Wald statistic and its significance.
• Use the odds ratio, Exp(B), for interpretation (see the sketch below).
  – OR > 1: as the predictor increases, the odds of the outcome occurring increase.
  – OR < 1: as the predictor increases, the odds of the outcome occurring decrease.
  – The confidence interval of Exp(B) should not cross 1.
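A small sketch of the last two points, using a hypothetical coefficient and standard error:

```python
import math

# Hypothetical output: coefficient and its standard error
b, se_b = 0.42, 0.123

odds_ratio = math.exp(b)                 # Exp(B)
ci_lower = math.exp(b - 1.96 * se_b)     # 95% CI for Exp(B)
ci_upper = math.exp(b + 1.96 * se_b)
print(odds_ratio, (ci_lower, ci_upper))
# If this interval crosses 1, the direction of the effect is uncertain.
```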
Reporting the Analysis
Multinomial Logistic Regression
• Logistic regression to predict membership of more than two categories.
• It (basically) works in the same way as binary logistic regression.
• The analysis breaks the outcome variable down into a series of comparisons between two categories.
  – E.g., if you have three outcome categories (A, B and C), then the analysis will consist of two comparisons that you choose:
    • compare everything against your first category (e.g. A vs. B and A vs. C),
    • or your last category (e.g. A vs. C and B vs. C),
    • or a custom category (e.g. B vs. A and B vs. C).
• The important parts of the analysis and output are much the same as we have just seen for binary logistic regression (see the sketch below).
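As an illustration only: a multinomial fit on made-up data using statsmodels' MNLogit (an assumption about tooling; these slides are based on SPSS output). It estimates one set of coefficients per comparison against a reference category:

```python
import numpy as np
import statsmodels.api as sm

# Made-up data: a 3-category outcome (coded 0, 1, 2) and two predictors
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = rng.integers(0, 3, size=200)

# MNLogit estimates J-1 sets of coefficients, each comparing one category
# against the reference (by default the lowest-coded category)
model = sm.MNLogit(y, sm.add_constant(X)).fit(disp=False)
print(model.summary())
```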
I May Not Be Fred Flintstone …
• How successful are chat-up lines?
• The chat-up lines used by 348 men and 672 women in a night-club were recorded.
• Outcome:
  – Whether the chat-up line resulted in one of the following three events:
    • the person got no response or the recipient walked away,
    • the person obtained the recipient's phone number,
    • the person left the night-club with the recipient.
• Predictors:
  – The content of the chat-up lines was rated for:
    • Funniness (0 = not funny at all, 10 = the funniest thing that I have ever heard),
    • Sexuality (0 = no sexual content at all, 10 = very sexually direct),
    • Moral values (0 = the chat-up line does not reflect good characteristics, 10 = the chat-up line is very indicative of good characteristics).
  – Gender of recipient.
Output I
Output II
Output III
Interpretation I
• Good_Mate: whether the chat-up line showed signs of good moral fibre did not significantly predict whether you went home with the date or got a slap in the face, b = 0.13, Wald χ²(1) = 2.42, p = .120.
• Funny: whether the chat-up line was funny significantly predicted whether you went home with the date or got no response, b = 0.32, Wald χ²(1) = 6.46, p = .011.
• Gender: the gender of the person being chatted up significantly predicted whether they went home with the person or gave no response, b = −5.63, Wald χ²(1) = 17.93, p < .001.
• Sex: the sexual content of the chat-up line significantly predicted whether you went home with the date or got a slap in the face, b = 0.42, Wald χ²(1) = 11.68, p = .001.
Interpretation II
• Funny × Gender: the success of funny chat-up lines depended on whether they were delivered to a man or a woman, because in interaction these variables predicted whether or not you went home with the date, b = 1.17, Wald χ²(1) = 34.63, p < .001.
• Sex × Gender: the success of chat-up lines with sexual content depended on whether they were delivered to a man or a woman, because in interaction these variables predicted whether or not you went home with the date, b = −0.48, Wald χ²(1) = 8.51, p = .004.
Reporting the Results
