Logistic Regression

Dr. Rajavel Ramadoss

Associate Professor
SSN College of Engineering
Presentation Outline
❑ Introduction to logistic regression

❑ Problem formulation

❑ Logistic regression for binary and multiclass classification

❑ Demo - Python implementation

❑ Summary
Logistic Regression
▪ Logistic regression is a simple classification algorithm that learns
to predict a discrete variable, for example whether a person is

male or female (binary classification)

male, female or transgender (multiclass classification)

▪ Logistic regression is a linear classifier, so we’ll use a linear
function 𝑓(𝐱) = 𝑏₀ + 𝑏₁𝑥₁ + ⋯ + 𝑏ᵣ𝑥ᵣ, also called the logit.

▪ The variables 𝑏₀, 𝑏₁, …, 𝑏ᵣ are the estimators of the regression
coefficients, which are also called the predicted weights or
just coefficients.
Logistic Regression
▪ The logistic regression function 𝑝(𝐱) is the sigmoid function of
𝑓(𝐱): 𝑝(𝐱) = 1 / (1 + exp(−𝑓(𝐱))).

▪ The function 𝑝(𝐱) is often interpreted as the predicted probability
that the output for a given 𝐱 is equal to 1.

▪ Therefore, 1 − 𝑝(𝐱) is the probability that the output is 0.
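The two formulas above can be sketched in plain Python for a single feature. This is only an illustration: the coefficient values b0 and b1 below are assumptions chosen for the example, not values taken from the slides.

```python
import math

def logit(x, b0, b1):
    """Linear function f(x) = b0 + b1*x for a single feature."""
    return b0 + b1 * x

def sigmoid(z):
    """Logistic (sigmoid) function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.5

for x in (0, 2, 4):
    p = sigmoid(logit(x, b0, b1))  # predicted probability that the output is 1
    print(f"x={x}: p(x)={p:.3f}, 1 - p(x)={1 - p:.3f}")
```

With these assumed coefficients the logit is zero at x = 2, so the predicted probability there is exactly 0.5.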

Logistic Regression
▪ Logistic regression determines the best predicted weights 𝑏₀, 𝑏₁,
…, 𝑏ᵣ such that the function 𝑝(𝐱) is as close as possible to all
actual responses 𝑦ᵢ, 𝑖 = 1, …, 𝑛, where 𝑛 is the number of
observations.

▪ The process of calculating the best weights using available

observations is called model training or fitting.

▪ To get the best weights, we usually maximize the log-likelihood

function (LLF) for all observations 𝑖 = 1, …, 𝑛.

▪ This method is called maximum likelihood estimation, and the
LLF is given by the equation

LLF = Σᵢ (𝑦ᵢ log(𝑝(𝐱ᵢ)) + (1 − 𝑦ᵢ) log(1 − 𝑝(𝐱ᵢ))), 𝑖 = 1, …, 𝑛.
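As a rough sketch of what "maximizing the LLF" means, the snippet below evaluates the LLF for two candidate weight pairs on a small single-feature data set. The data and both weight pairs are assumptions made for this illustration; a real solver searches over the weights instead of comparing two guesses.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(xs, ys, b0, b1):
    """LLF = sum_i [ y_i*log(p(x_i)) + (1 - y_i)*log(1 - p(x_i)) ]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

# Illustrative single-feature data set (an assumption for this sketch).
xs = list(range(10))
ys = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

llf_flat = log_likelihood(xs, ys, b0=0.0, b1=0.0)     # p(x) = 0.5 everywhere
llf_sloped = log_likelihood(xs, ys, b0=-1.0, b1=0.5)  # hypothetical weights

print(f"LLF with zero weights:    {llf_flat:.3f}")
print(f"LLF with sloped weights:  {llf_sloped:.3f}")
```

The LLF is always negative or zero, and the weights whose probabilities agree better with the labels give the larger (less negative) value.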

Logistic Regression - Notations
Logistic Regression-Objective Function
▪ In logistic regression we can’t use the same objective function
(MSE) that is used in linear regression.

▪ MSE penalizes errors in proportion to their square. For
example, when y − ŷ is small it adds a low penalty: y − ŷ = 0.1
adds a penalty of (0.1)^2 = 0.01, whereas when y − ŷ is large it
adds a large penalty: y − ŷ = 10 adds a penalty of (10)^2 = 100.

▪ This is not the case in logistic regression: the error is never
larger than 1 in magnitude, because both the model output and the
label lie between 0 and 1, so the squared error cannot exceed 1
however wrong a prediction is.
Logistic Regression-Objective Function
▪ In logistic regression, we use cross entropy as the objective
function
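To make the contrast concrete, the sketch below compares the squared error with the per-observation cross-entropy, −[𝑦 log(𝑝) + (1 − 𝑦) log(1 − 𝑝)], which is the negative of one term of the LLF. The probabilities tried here are arbitrary values picked for illustration.

```python
import math

def squared_error(y, p):
    """Squared difference between the label and the predicted probability."""
    return (y - p) ** 2

def cross_entropy(y, p):
    """Per-observation cross-entropy: -[y*log(p) + (1 - y)*log(1 - p)]."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

y = 1  # true label
for p in (0.9, 0.5, 0.1, 0.001):  # arbitrary predicted probabilities
    print(f"p={p}: squared error={squared_error(y, p):.4f}, "
          f"cross-entropy={cross_entropy(y, p):.4f}")
```

The squared error stays below 1 even for a confidently wrong prediction, while the cross-entropy keeps growing as the predicted probability of the true class approaches 0.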
Intuition behind the Objective Function
Multi-Class Classification
Multi-Class Logistic Regression
Logistic Regression in Python With scikit-learn
▪ General steps to prepare classification models:
1. Import packages, functions, and classes
2. Get data to work with and, if appropriate, transform it
3. Create a classification model and train (or fit) it with your existing data
4. Evaluate your model to see if its performance is satisfactory
▪ A sufficiently good model can then be used to make predictions on new, unseen data.
Logistic Regression in Python With scikit-learn
Step 1: Import Packages, Functions, and Classes
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
Step 2: Get Data
x = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
Note: .reshape(-1, 1) uses -1 to get as many rows as needed and 1 to
get one column, so x becomes a two-dimensional array with a single
column, which is the input shape scikit-learn expects.
Logistic Regression in Python With scikit-learn
Step 3: Create a Model and Train It
model = LogisticRegression(solver='liblinear', random_state=0)

▪ Once the model is created, we need to fit (or train) it.

▪ Model fitting is the process of determining the
coefficients 𝑏₀, 𝑏₁, …, 𝑏ᵣ that correspond to the best
value of the cost function.
▪ We can fit the model with .fit()
model.fit(x, y)
Or, equivalently, we can use
model = LogisticRegression(solver='liblinear', random_state=0).fit(x, y)
Logistic Regression in Python With scikit-learn
▪ After fitting, we can get the attributes of the model: .classes_ (the distinct class labels), .intercept_ (𝑏₀) and .coef_ (𝑏₁):

model.classes_

array([0, 1])

model.intercept_

array([-1.04608067])

model.coef_

array([[0.51491375]])
Logistic Regression in Python With scikit-learn
Step 4: Evaluate the Model

▪ Once the model is fitted, we can check its
performance with .predict_proba(), which returns the
matrix of probabilities that the predicted output is
equal to zero or one

model.predict_proba(x)

array([[0.74002157, 0.25997843],
       [0.62975524, 0.37024476],
       [0.5040632 , 0.4959368 ],
       [0.37785549, 0.62214451],
       [0.26628093, 0.73371907],
       [0.17821501, 0.82178499],
       [0.11472079, 0.88527921],
       [0.07186982, 0.92813018],
       [0.04422513, 0.95577487],
       [0.02690569, 0.97309431]])
Logistic Regression in Python With scikit-learn
▪ The first column is the probability of the predicted
output being zero, that is 1 - 𝑝(𝑥).
▪ The second column is the probability that the output is
one, or 𝑝(𝑥).
▪ We can get the actual predictions, based on the
probability matrix and the values of 𝑝(𝑥), with
.predict():
model.predict(x)
array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1])
▪ This function returns the predicted output values as a
one-dimensional array.
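The probability matrix and the predictions can be reproduced by hand from the intercept and coefficient reported above. The sketch below uses only the standard library and assumes that a point is assigned to class 1 when 𝑝(𝑥) is greater than 0.5.

```python
import math

# Values reported above by model.intercept_ and model.coef_.
b0 = -1.04608067
b1 = 0.51491375

def p(x):
    """Predicted probability that the output is 1 for input x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

xs = range(10)
proba = [(1 - p(x), p(x)) for x in xs]            # columns: P(y=0), P(y=1)
pred = [1 if p1 > 0.5 else 0 for _, p1 in proba]  # assumed 0.5 threshold

for x, (p0, p1), label in zip(xs, proba, pred):
    print(f"x={x}: P(0)={p0:.4f}  P(1)={p1:.4f}  predicted={label}")
```

The printed rows should agree with the output of .predict_proba() and .predict() shown above, up to rounding of the reported coefficients.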
Logistic Regression in Python With scikit-learn
▪ The green circles represent the actual responses as
well as the correct predictions.

▪ The red × shows the incorrect prediction.

▪ The full black line is the estimated logistic regression

line 𝑝(𝑥).

▪ The grey squares are the points on this line that

correspond to 𝑥 and the values in the second column
of the probability matrix.

▪ The black dashed line is the logit 𝑓(𝑥).

Logistic Regression in Python With scikit-learn
▪ The value of 𝑥 slightly above 2 corresponds to the
threshold 𝑝(𝑥)=0.5, which is 𝑓(𝑥)=0.
▪ For example, the first point has input 𝑥=0, actual
output 𝑦=0, probability 𝑝=0.26, and a predicted
value of 0.
▪ The second point has 𝑥=1, 𝑦=0, 𝑝=0.37, and a
prediction of 0.
▪ Only the fourth point has the actual output 𝑦=0 and
the probability higher than 0.5 (at 𝑝=0.62), so it’s
wrongly classified as 1.
▪ All other values are predicted correctly.
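The threshold location follows directly from the fitted coefficients: 𝑝(𝑥) = 0.5 exactly where 𝑓(𝑥) = 𝑏₀ + 𝑏₁𝑥 = 0, that is at 𝑥 = −𝑏₀/𝑏₁. A short check, using the intercept and coefficient reported earlier:

```python
# Values reported earlier by model.intercept_ and model.coef_.
b0 = -1.04608067
b1 = 0.51491375

# f(x) = b0 + b1*x = 0  =>  x = -b0 / b1
x_threshold = -b0 / b1
print(f"p(x) = 0.5 at x = {x_threshold:.4f}")

# Inputs to the right of the threshold are classified as 1.
predicted_one = [x for x in range(10) if x > x_threshold]
print("inputs classified as 1:", predicted_one)
```

Because the threshold lies between 2 and 3, the fourth point (𝑥 = 3) falls on the class-1 side even though its actual output is 0.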
Logistic Regression in Python With scikit-learn
▪ The accuracy of the model on the training data is 9/10 = 0.9,
which we can obtain with .score():
model.score(x, y) => 0.9
▪ We can get more information on the accuracy of the
model with a confusion matrix.
confusion_matrix(y, model.predict(x))
array([[3, 1],
[0, 6]])
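To see where the four numbers come from, the confusion matrix can be counted by hand from the actual and predicted labels shown earlier. In this sketch rows are actual classes and columns are predicted classes, which is the layout confusion_matrix() uses.

```python
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]   # actual outputs
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]   # output of model.predict(x)

# matrix[actual][predicted] counts the observations in each cell.
matrix = [[0, 0], [0, 0]]
for actual, predicted in zip(y_true, y_pred):
    matrix[actual][predicted] += 1

# Accuracy is the share of observations on the main diagonal.
accuracy = (matrix[0][0] + matrix[1][1]) / len(y_true)

print(matrix)
print(accuracy)
```

The top row holds the three true negatives and the one false positive; the bottom row holds zero false negatives and six true positives.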
Logistic Regression in Python With scikit-learn
▪ We can get a more comprehensive report on the
classification with classification_report():
print(classification_report(y, model.predict(x)))
              precision    recall  f1-score   support

           0       1.00      0.75      0.86         4
           1       0.86      1.00      0.92         6

    accuracy                           0.90        10
   macro avg       0.93      0.88      0.89        10
weighted avg       0.91      0.90      0.90        10
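The per-class figures in the report can be recomputed from the confusion matrix. The helper name metrics below is hypothetical and exists only for this sketch.

```python
# Confusion matrix from above: rows are actual classes, columns are predicted.
matrix = [[3, 1],
          [0, 6]]

def metrics(cls):
    """Return (precision, recall, f1, support) for one class."""
    tp = matrix[cls][cls]
    predicted_as_cls = sum(row[cls] for row in matrix)  # column sum
    actually_cls = sum(matrix[cls])                     # row sum (support)
    precision = tp / predicted_as_cls
    recall = tp / actually_cls
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, actually_cls

for cls in (0, 1):
    precision, recall, f1, support = metrics(cls)
    print(f"class {cls}: precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")
```

Precision divides the correct predictions for a class by everything predicted as that class, while recall divides them by everything that actually belongs to the class.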
Session Summary
In this session we have learned:

❑ Introduction to logistic regression

❑ Problem formulation

❑ Logistic regression for binary and multiclass classification

❑ Demo - Python implementation

References

1. https://realpython.com/logistic-regression-python/
Thanks
