
Chapter 10 – Logistic Regression

Data Mining for Business Intelligence


Shmueli, Patel & Bruce

© Galit Shmueli and Peter Bruce 2010


Logistic Regression

• Powerful model-based classification tool
• Extends the idea of linear regression to situations where the outcome variable is categorical
• The model relates the predictors to the outcome
  – Example: Y denotes the recommendation on holding/selling/buying a stock – a categorical variable with 3 categories
• We focus on binary classification (Y = 0 or Y = 1), but the predictors can be categorical or continuous
• Widely used, particularly where a structured model is useful
The Logit

Goal: find a function of the predictor variables that relates them to a 0/1 outcome

• Instead of Y as the outcome variable (as in linear regression), we use a function of Prob(Y = 1) called the logit
• The logit can be modeled as a linear function of the predictors
• The logit can be mapped back to a probability, which, in turn, can be mapped to a class
  – Using a cutoff value on the probability of belonging to class 1, P(Y = 1)
Step 1: Logistic Response Function

• Let p = probability of belonging to class 1
• Logistic regression relates p to the predictors with a function that guarantees 0 ≤ p ≤ 1
• The standard linear function (below, with q = number of predictors) does not:

  p = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q

• The fix: use the logistic response function (eq. 10.2 in the textbook; a code sketch follows below):

  p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q)}}
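To make the shape of eq. 10.2 concrete, here is a minimal Python sketch (NumPy and the coefficient values are illustrative assumptions, not from the slides):

```python
import numpy as np

def logistic(x, beta0, beta1):
    """Logistic response function (eq. 10.2) for one predictor:
    p = 1 / (1 + exp(-(beta0 + beta1 * x))); always returns a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(beta0 + beta1 * x)))

# Hypothetical coefficients, chosen only to show the S-shaped squeeze into (0, 1)
x = np.linspace(-10, 10, 5)
print(logistic(x, beta0=0.0, beta1=1.0))
```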


Step 2: The Odds

The odds of an event are defined as:

  \text{Odds} = \frac{p}{1 - p}    (eq. 10.3, where p = probability of the event)

Or, given the odds of an event, the probability of the event can be computed by:

  p = \frac{\text{Odds}}{1 + \text{Odds}}    (eq. 10.4)
We can also relate the odds to the predictors:

  \text{Odds} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q}    (eq. 10.5)

Recall that p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_q x_q)}} (eq. 10.2); substituting this into eq. 10.3 yields eq. 10.5. A code sketch of these conversions follows below.
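A minimal Python sketch of eqs. 10.3–10.5 (the coefficient and predictor values are hypothetical, chosen only to exercise the formulas):

```python
import numpy as np

def odds_from_prob(p):
    """eq. 10.3: Odds = p / (1 - p)"""
    return p / (1.0 - p)

def prob_from_odds(odds):
    """eq. 10.4: p = Odds / (1 + Odds)"""
    return odds / (1.0 + odds)

# eq. 10.5: odds as an exponentiated linear function of the predictors
beta = np.array([-6.0, 0.04])   # hypothetical beta0 and beta1
x = np.array([1.0, 150.0])      # leading 1 multiplies the intercept
odds = np.exp(beta @ x)
print(odds, prob_from_odds(odds))
```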
Step 3: Take the Log of Both Sides

• This gives us the logit:

  \log(\text{Odds}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q    (eq. 10.6)

• log(Odds) is called the logit, and it takes values from –∞ to +∞
• The logit is the dependent variable, and it is a linear function of the predictors x1, x2, …, xq
• This linearity helps make interpretation easier
Example: Acceptance of Personal Loan Offer

• Outcome variable: accept bank loan (0/1)
• Predictors: demographics (age, income, etc.) and information about the customer's bank relationship (mortgage, securities account, etc.)
• Data: 5,000 customers – 480 (9.6%) accepted the loan offer previously
• Goal: find characteristics of customers who are most likely to accept the loan offer in future mailings

Data preprocessing:
• Partition 60% training, 40% validation
• Create 0/1 dummy variables for categorical predictors (see the sketch below)
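A minimal pandas sketch of this preprocessing (the column names, example values, and random seed are assumptions, not taken from the slides):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical slice of the customer data with one categorical predictor
df = pd.DataFrame({
    "Income": [49, 34, 11, 100],
    "Education": ["UG", "Grad", "UG", "Prof"],
    "PersonalLoan": [0, 0, 0, 1],
})

# 0/1 dummy variables for the categorical predictor (drop one level as the baseline)
df = pd.get_dummies(df, columns=["Education"], drop_first=True)

# 60% training / 40% validation partition
train, valid = train_test_split(df, train_size=0.6, random_state=1)
```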
Single Predictor Model

• Modeling loan acceptance as a function of income (x)
• Fitted coefficients: b0 = -6.3525, b1 = 0.0392, so the fitted logit is
  log(Odds) = -6.3525 + 0.0392 x (see the sketch below)
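Plugging the fitted coefficients into the logistic response function gives the estimated probability of acceptance at any income level (a minimal sketch; the example income value and its units are assumptions):

```python
import numpy as np

b0, b1 = -6.3525, 0.0392   # fitted coefficients from the single-predictor model

def p_accept(income):
    """Estimated P(accept loan | income) from the fitted logit."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * income)))

print(p_accept(100))   # e.g. income of 100 -> roughly 0.08
```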


Last Step: Classification

• The model produces an estimated probability of being a "1"
  – Example: P(accept loan | income)
• Convert to a classification by establishing a cutoff level
• If the estimated probability > cutoff, classify as "1"
• Thus the model helps with classification as well as with predicting the probability of belonging to one class
• Default cutoff value: 0.50, but it can be changed, e.g. to maximize classification accuracy (see the sketch below)
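A minimal sketch of the cutoff step (the probability values are hypothetical):

```python
import numpy as np

probs = np.array([0.03, 0.48, 0.52, 0.91])   # estimated P(Y=1) for four customers

cutoff = 0.50                        # default cutoff; adjust to suit the task
classes = (probs > cutoff).astype(int)
print(classes)                       # -> [0 0 1 1]
```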
Example: Parameter Estimation

• Estimates of the β's are derived through an iterative process called maximum likelihood estimation (MLE)
• Let us now include all 12 predictors in the model (a fitting sketch follows below)
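A sketch of fitting such a model in Python (statsmodels is one common choice, not necessarily the tool used in the textbook; the toy data, column names, and values are assumptions):

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical training frame: two of the predictors plus the 0/1 outcome
train = pd.DataFrame({
    "Income": [49, 34, 11, 100, 45, 29, 72, 22],
    "Family": [4, 3, 1, 1, 4, 4, 2, 1],
    "PersonalLoan": [0, 1, 0, 1, 0, 0, 1, 0],
})

X = sm.add_constant(train[["Income", "Family"]])   # intercept + predictors
y = train["PersonalLoan"]

# Maximum likelihood estimation iterates until the log-likelihood converges
model = sm.Logit(y, X).fit()
print(model.params)   # the estimated beta coefficients
```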


Estimated Equation for the Logit

• Interpreting binary predictor effects:
  – The odds of accepting the loan offer for those who already have a CD account with the bank are 32.1 times the odds for those who do not have a CD account (p-value < 0.001).

• Interpreting continuous predictor effects:
  – The odds of accepting the loan offer increase by 77.1% if family size increases by one (p-value < 0.001).
  – The odds of accepting the loan offer decrease by 4.4% if a client is 1 year older (p-value = 0.624).
http://www.ats.ucla.edu/stat/mult_pkg/faq/general/odds_ratio.htm
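Each interpretation above is just e^β for the corresponding coefficient. A sketch (the coefficient values below are back-calculated from the stated odds ratios, not read off the textbook output):

```python
import numpy as np

# Approximate coefficients implied by the interpretations above
b_cd_account = 3.47     # exp(3.47)   ~ 32.1  -> odds multiplied by 32.1
b_family     = 0.57     # exp(0.57)   ~ 1.77  -> odds increase by ~77%
b_age        = -0.045   # exp(-0.045) ~ 0.956 -> odds decrease by ~4.4%

for name, b in [("CD account", b_cd_account), ("Family", b_family), ("Age", b_age)]:
    print(f"{name}: odds ratio = {np.exp(b):.3f}")
```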
Variable Selection

Problems:
• As in linear regression, correlated predictors cause trouble: they make the coefficient estimates unstable and hard to interpret
• Overly complex models run the risk of overfitting

Solution: remove extreme redundancies by dropping predictors via automated selection of variable subsets (as in linear regression) or by data reduction methods such as PCA (see the sketch below)
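A sketch of automated subset selection with recursive feature elimination (scikit-learn is an implementation choice; the simulated data and feature counts are assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFE

# Simulated design matrix with a deliberately redundant column
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
X[:, 5] = X[:, 0] + 0.01 * rng.normal(size=200)   # near-duplicate of column 0
y = (X[:, 0] + X[:, 1] + rng.normal(size=200) > 0).astype(int)

# Recursive feature elimination: repeatedly drop the weakest predictor
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)
print(selector.support_)   # mask of the retained predictor subset
```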
P-values for Predictors

• Test the null hypothesis that a coefficient = 0
• P-values reported with the coefficients display the results of these tests
• Coefficients with low p-values (close to 0) are statistically significant
• Useful for reviewing whether to include a variable in the model (see the sketch below)
• Key in profiling tasks, but less important in predictive classification
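Continuing the statsmodels sketch from the parameter-estimation slide (same toy data, same caveats), the fitted model reports these tests directly:

```python
import pandas as pd
import statsmodels.api as sm

train = pd.DataFrame({
    "Income": [49, 34, 11, 100, 45, 29, 72, 22],
    "Family": [4, 3, 1, 1, 4, 4, 2, 1],
    "PersonalLoan": [0, 1, 0, 1, 0, 0, 1, 0],
})
model = sm.Logit(train["PersonalLoan"],
                 sm.add_constant(train[["Income", "Family"]])).fit(disp=0)

print(model.pvalues)    # one p-value per coefficient (H0: coefficient = 0)
```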
Summary

• Logistic regression is similar to linear regression, except that it is used with a categorical response
• It can be used for explanatory tasks (= profiling) or predictive tasks (= classification)
• The predictors are related to the response Y via a nonlinear function called the logit
• As in linear regression, reducing the number of predictors can be done via variable selection
• Logistic regression can be generalized to more than two classes
