Lecture 22. GLM

This document provides an overview of binary random variables and logistic regression, emphasizing its application in predicting binary outcomes. It explains the logistic function, model equations, and the relationship between probabilities and odds, along with methods for parameter estimation and model evaluation. Real-world applications include spam detection, medical diagnosis, and loan default prediction.

Generalized Linear Model

LECTURE 22
BINARY VARIABLES AND LOGISTIC REGRESSION: FROM BASICS TO APPLICATIONS
What is a binary random variable?

 A binary random variable is a type of random variable that can take on only two possible values, typically represented as 0 and 1.
 These values often correspond to two distinct outcomes in a scenario, such as success/failure, yes/no, or true/false.
Introduction to Logistic Regression

 Definition: Logistic regression is a statistical model used to predict the probability of a binary outcome (e.g., success/failure).
 Key Idea: Unlike linear regression, logistic regression is suitable for categorical dependent variables.
 Objective: To understand how logistic regression works, its equations, parameters, and applications.
How to model binary outcome variables?

 In the basic form of logistic regression, dichotomous variables (0 or 1) can be predicted, and the probability of the occurrence of the value 1 (success/characteristic present) is estimated.
Real-world Examples:

 Spam detection: Classify emails as spam or not spam.
 Medical diagnosis: Predict the presence of a disease (e.g., diabetes: yes/no).
 Loan default prediction: Will a customer default on a loan?
Simple Logistic Regression

 Simple Logistic Regression is a type of logistic regression model used to predict the probability of a binary outcome based on a single explanatory (independent) variable. It provides a framework for understanding the relationship between one predictor variable and the response variable.
 Model Equation:
 P(Y=1∣X) = exp(β0 + β1X) / [1 + exp(β0 + β1X)]
 Logit Function:
 logit(P) = ln{P/(1−P)} = β0 + β1X
 Parameters:
 β0: Intercept
 β1: Slope (effect of the explanatory variable)
 Interpretation: The odds increase by a factor of exp(β1) for a one-unit increase in X.
 Graphical Representation: S-shaped logistic curve showing the probability of Y=1 as X changes.
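The simple model above can be sketched in a few lines of Python; the coefficient values b0 = −3 and b1 = 1.5 are hypothetical, chosen only to show the S-shape and the odds-ratio interpretation:

```python
import math

def p_success(x, b0=-3.0, b1=1.5):
    """P(Y=1|X=x) = exp(b0 + b1*x) / (1 + exp(b0 + b1*x)).
    b0 and b1 are illustrative values, not estimates."""
    z = b0 + b1 * x
    return math.exp(z) / (1.0 + math.exp(z))

# The S-shape: probabilities rise from near 0 to near 1 as x grows.
for x in (-2, 0, 2, 4):
    print(x, round(p_success(x), 3))

# A one-unit increase in x multiplies the odds by exp(b1).
odds = lambda p: p / (1.0 - p)
ratio = odds(p_success(1.0)) / odds(p_success(0.0))
print(round(ratio, 3), round(math.exp(1.5), 3))  # the two values agree
```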
Example
General Logistic Regression Model

 Model: Predicts a binary outcome based on multiple explanatory variables.
 Equation:
 P(Y=1∣X) = exp(β0 + β1X1 + β2X2 + ⋯ + βkXk) / [1 + exp(β0 + β1X1 + β2X2 + ⋯ + βkXk)]
 Logit Function:
 logit(P) = ln{P/(1−P)} = β0 + β1X1 + β2X2 + ⋯ + βkXk
 Parameters:
 β0: Intercept
 β1, …, βk: Coefficients for the predictors X1, …, Xk.
Example
Applications:

 Disease prediction using biomarkers and demographic variables.
 Fraud detection in financial transactions.
Logistic regression and probabilities

 In linear regression, the independent variables (e.g., age, height and gender) are used to estimate the specific value of the dependent variable (e.g., body mass index).
 In logistic regression, on the other hand, the dependent variable is dichotomous (0 or 1) and the probability that the value 1 occurs is estimated.
 Returning to an example: how likely is it that a disease is present if the person under consideration has a certain age, sex and smoking status?
Calculate logistic regression

To build a logistic regression model, the linear regression equation is used as the starting point.
However, if a linear regression were simply calculated to solve a logistic regression problem, the following result would appear graphically: as can be seen in the graph, values between plus and minus infinity can now occur. The goal of logistic regression, however, is to estimate the probability of occurrence, not the value of the variable itself.
Therefore, this equation must be transformed.
 To do this, it is necessary to restrict the value range for the prediction to the range between 0 and 1. To ensure that only values between 0 and 1 are possible, the logistic function f is used.
Logistic function

 The logistic model is based on the logistic function. The special thing about the logistic function is that for values between minus and plus infinity, it always assumes only values between 0 and 1.
Logistic Function

So the logistic function is perfect to describe the probability P(Y=1).
If the logistic function is now applied to the previous regression equation, the result is:
P(Y=1) = 1 / (1 + e^−(β0 + β1X1 + ⋯ + βkXk))
This now ensures that, no matter in which range the x values are located, only values between 0 and 1 will come out. The new graph is the S-shaped logistic curve.
The logistic function is a mathematical formula that maps any real number z to a value between 0 and 1:
P = 1/(1 + e^−z)
Where:
P: Probability of success (e.g., P(Y=1)).
z: Linear combination of predictors, expressed as:
z = β0 + β1X1 + β2X2 + ⋯ + βkXk
Key Properties:
P is always in the range [0,1], making it suitable to represent probabilities.
For large positive z, P→1; for large negative z, P→0.
The logistic function creates an S-shaped (sigmoidal) curve, which models the non-linear relationship between z and P.
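These properties are easy to verify numerically; a minimal sketch (the function name `logistic` is our own choice):

```python
import math

def logistic(z):
    """P = 1 / (1 + e^(-z)): maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Large positive z -> P near 1; large negative z -> P near 0.
print(logistic(-10), logistic(0), logistic(10))

# The form used earlier, exp(z) / (1 + exp(z)), is algebraically identical.
z = 1.7
assert abs(logistic(z) - math.exp(z) / (1.0 + math.exp(z))) < 1e-12
```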
Logistic Regression

 Logistic regression applies the logistic function to model the relationship between a binary outcome variable Y (e.g., success/failure) and one or more predictors (X1, X2, …, Xk).
 Logistic Regression Model:
 P(Y=1∣X) = exp(β0 + β1X1 + β2X2 + ⋯ + βkXk) / [1 + exp(β0 + β1X1 + β2X2 + ⋯ + βkXk)]
 Where:
 P(Y=1∣X): Probability that the dependent variable Y = 1 given predictors X1, X2, …, Xk.
 z = β0 + β1X1 + β2X2 + ⋯ + βkXk: The linear combination of the predictors.


The connection between the logistic function and logistic regression lies in
the transformation from probabilities to a linear relationship:
(a) Probabilities and Odds:
 Logistic regression models the probability of success P(Y=1∣X).
 The odds of success are defined as: Odds=P/(1−P)
(b) Logit Function (Linearization):
 The logistic regression model transforms the non-linear probability into a
linear form using the logit function:
 logit(P)=ln {P/(1−P)}=β0+β1X1+β2X2+⋯+βkXk
 This ensures that the predictors are linearly related to the log-odds of
the outcome.
 The logistic function is used to "invert" this transformation and predict
probabilities from the linear combination.
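The logit/logistic round trip described above can be checked numerically; the coefficients b0 and b1 below are hypothetical, used only for illustration:

```python
import math

def logit(p):
    """Map a probability p in (0, 1) to log-odds in (-inf, inf)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Logistic function: the inverse of the logit transformation."""
    return 1.0 / (1.0 + math.exp(-x))

# Round trip: the logistic function "inverts" the logit.
for p in (0.1, 0.5, 0.9):
    assert abs(inv_logit(logit(p)) - p) < 1e-12

# On the logit scale the model is linear in the predictors.
b0, b1, x = -1.0, 0.8, 2.0
p = inv_logit(b0 + b1 * x)
print(round(logit(p), 6), round(b0 + b1 * x, 6))  # both print 0.6
```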
Key Components and Notations

 Dependent Variable:
 Binary (e.g., 0 or 1).
 Independent Variables:
 Can be continuous, categorical, or a mix.
 Odds and Odds Ratio:
 Odds = P/(1−P); the odds ratio exp(βj) compares the odds after a one-unit increase in Xj to the odds before.
Estimation of Parameters

 Maximum Likelihood Estimation (MLE):
 Likelihood Function: L(β) = ∏i P(Yi=1∣Xi)^Yi [1 − P(Yi=1∣Xi)]^(1−Yi)
 Log-Likelihood Function: ℓ(β) = Σi { Yi ln P(Yi=1∣Xi) + (1−Yi) ln[1 − P(Yi=1∣Xi)] }
 Solved using iterative methods such as Newton-Raphson.
 Interpretation:
 Parameters represent changes in log-odds for a one-unit increase in predictors.
Likelihood Function
Log Likelihood Function
Optimization
Fitting the Model
Interpretation of Parameters
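The fitting procedure above can be sketched as a small Newton-Raphson loop for a single predictor. This is a minimal illustration of the idea, not a production estimator, and the toy data are invented:

```python
import math

def fit_logistic(xs, ys, iters=25):
    """Fit P(Y=1|x) = 1/(1 + exp(-(b0 + b1*x))) by maximum likelihood,
    using Newton-Raphson iterations on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p                 # gradient w.r.t. b0
            g1 += (y - p) * x           # gradient w.r.t. b1
            w = p * (1.0 - p)           # observation weight (negative Hessian)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (negative Hessian)^(-1) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented toy data: Y = 1 becomes more likely as x grows.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
print(b0, b1)  # b1 > 0: the log-odds rise with x
```

At the maximum-likelihood solution the gradient is (numerically) zero, which is what the iteration drives toward.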
Goodness of Fit and Model Evaluation

 Deviance:
 Measures lack of fit.
 Hosmer-Lemeshow Test:
 Compares observed and predicted values in grouped data.
 Pseudo-R²:
 McFadden’s, Cox-Snell, and Nagelkerke’s R².
Deviance Measure

 The deviance for logistic regression with grouped data can also be expressed as:

 D = 2 Σi { yi ln(yi / ŷi) + (ni − yi) ln[(ni − yi) / (ni − ŷi)] }

 Where:
 yi: Observed number of successes in the i-th group of ni observations.
 ŷi = ni·π̂i: Fitted number of successes in the i-th group.
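Assuming grouped data with 0 < yi < ni (so every logarithm is defined), the deviance can be computed as a direct translation of the formula; the group counts and fitted probabilities below are hypothetical:

```python
import math

def deviance(y, n, p_hat):
    """Grouped-data deviance D = 2 * sum over groups of
    y*ln(y/yhat) + (n-y)*ln((n-y)/(n-yhat)), with yhat = n*p_hat."""
    d = 0.0
    for yi, ni, pi in zip(y, n, p_hat):
        fitted = ni * pi  # fitted number of successes in the group
        d += yi * math.log(yi / fitted)
        d += (ni - yi) * math.log((ni - yi) / (ni - fitted))
    return 2.0 * d

# Hypothetical groups: observed successes y out of n trials,
# with model-fitted success probabilities p_hat.
y = [3, 7, 12]
n = [10, 10, 15]
p_hat = [0.25, 0.65, 0.85]
print(round(deviance(y, n, p_hat), 4))  # near 0 when the fit is good
```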
Deviance measure and chi square test
Other Goodness-of-Fit Statistics

 AIC
 BIC
Pseudo-R squared
In a linear regression, the coefficient of determination R² indicates the proportion of the explained variance.
In logistic regression, the dependent variable is scaled nominally or ordinally and it is not possible to calculate a variance, so the coefficient of determination cannot be calculated in logistic regression.
However, in order to make a statement about the quality of the logistic regression model, so-called pseudo coefficients of determination have been established, also called pseudo-R squared.
Pseudo coefficients of determination are constructed in such a way that they lie between 0 and 1, just like the original coefficient of determination. The best-known coefficients of determination are the Cox and Snell R-square and the Nagelkerke R-square.
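Another common choice, McFadden's pseudo-R², compares the model's log-likelihood to that of a null model that predicts the overall success rate for everyone. A minimal sketch (the fitted probabilities below are hypothetical):

```python
import math

def log_likelihood(ys, ps):
    """Binary log-likelihood for outcomes ys and predicted probabilities ps."""
    return sum(y * math.log(p) + (1 - y) * math.log(1 - p)
               for y, p in zip(ys, ps))

def mcfadden_r2(ys, ps):
    """McFadden's pseudo-R^2: 1 - LL(model) / LL(null model)."""
    p_null = sum(ys) / len(ys)  # null model: everyone gets the base rate
    ll_null = log_likelihood(ys, [p_null] * len(ys))
    return 1.0 - log_likelihood(ys, ps) / ll_null

# Hypothetical observed outcomes and model-fitted probabilities.
ys = [0, 0, 1, 0, 1, 1]
ps = [0.1, 0.2, 0.7, 0.3, 0.8, 0.9]
print(round(mcfadden_r2(ys, ps), 3))  # closer to 1 = better fit
```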
 Source: https://datatab.net/tutorial/logistic-regression
