Logistic Regression

Logistic Regression is a classification algorithm used to predict binary outcomes based on independent variables, functioning as a special case of linear regression for categorical data. It assumes a linear relationship between the logit of the outcome and predictor variables, and can be applied in various fields such as text classification and image recognition. The method includes types like Multinomial and Ordinal Logistic Regression, and utilizes metrics like AIC, Null Deviance, and ROC Curve for model performance evaluation.

Uploaded by

abhijaychauhan88


Classification
• Classification is a very important area of supervised machine learning.
A large number of important machine learning problems fall within
this area. There are many classification methods, and logistic
regression is one of them.
• Supervised machine learning algorithms define models that capture
relationships among data. Classification is an area of supervised
machine learning that tries to predict which class or category some
entity belongs to, based on its features.
• There are two main types of classification problems:
• Binary or binomial classification: exactly two classes to choose
between (usually 0 and 1, true and false, or positive and negative)
• Multiclass or multinomial classification: three or more classes to
choose from
When Do You Need Classification?
• You can apply classification in many fields of science and technology.
Examples:
• Text classification algorithms are used to separate legitimate emails
from spam, and positive comments from negative ones.
• Image recognition tasks are often represented as classification
problems.
What is Logistic Regression?
• Logistic Regression is a classification algorithm. It is used to predict a
binary outcome (1 / 0, Yes / No, True / False) given a set of
independent variables.
• You can also think of logistic regression as an adaptation of linear
regression to a categorical outcome variable: the linear model is fitted
to the log of the odds of the outcome rather than to the outcome itself.
• In simple words, it predicts the probability of occurrence of an event
by fitting data to a logit function.
• In many situations, the response variable is qualitative or, in other
words, categorical. For example, gender is qualitative, taking on
values male or female.
• Predicting a qualitative response for an observation is called
classifying that observation, since it assigns the observation to a
category, or class. Classification methods often first predict the
probability of each category of the qualitative variable and use those
probabilities as the basis for the classification.
• Linear regression is poorly suited to predicting probabilities, because
its fitted values are not restricted to the range 0 to 1.
Logistic Regression Assumptions
The logistic regression method assumes that:

• The outcome is a binary or dichotomous variable, such as yes vs no, positive vs negative,
or 1 vs 0.
• There is a linear relationship between the logit of the outcome and each predictor
variable. Recall that the logit function is logit(p) = log(p/(1-p)), where p is the
probability of the outcome.
• There are no influential values (extreme values or outliers) in the continuous predictors.
• There are no high intercorrelations (i.e. multicollinearity) among the predictors.

To improve the accuracy of your model, you should make sure that these assumptions hold
true for your data.
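
The multicollinearity assumption can be screened with pairwise correlations between predictors. The following is a minimal sketch, not a full diagnostic: the column names, the data values, and the 0.8 cutoff are all hypothetical and chosen only for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlated_pairs(columns, cutoff=0.8):
    """Return predictor pairs whose absolute correlation exceeds the cutoff."""
    names = list(columns)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) > cutoff:
                flagged.append((names[i], names[j], r))
    return flagged

# Hypothetical predictors: hours_studied and pages_read move together.
predictors = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "pages_read": [10, 21, 29, 42, 50, 61],
    "sleep_hours": [7, 6, 8, 5, 7, 6],
}
flagged = correlated_pairs(predictors)
for first, second, r in flagged:
    print(f"{first} vs {second}: r = {r:.2f}")
```

A flagged pair suggests dropping or combining one of the two predictors before fitting.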
Types Of Logistic Regression Models
• One advantage of logistic regression is that it extends to multi-class
classification problems through the Multinomial and Ordinal Logistic
models.
Multinomial Logistic Regression:
• Multinomial Regression is an extension of binary logistic regression
that is used when the response variable has more than 2 classes, i.e. for
multi-class classification problems.
• Suppose the response variable has K = 3 classes. The Multinomial
logistic model then fits K-1 = 2 binary logistic models, each comparing
one class against a reference class, and combines them to compute the
final outcome.
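
The way K-1 binary models combine into K class probabilities can be sketched as follows. This assumes the common reference-class formulation, in which each score is the log-odds of one class against the reference class; the function name and the example scores are hypothetical.

```python
import math

def multinomial_probs(scores):
    """Turn K-1 log-odds (each class vs. the reference class) into K probabilities.

    scores[k] is the linear predictor for one non-reference class; the
    reference class has an implicit score of 0. The reference class
    probability is returned first.
    """
    exp_scores = [math.exp(s) for s in scores]
    denom = 1.0 + sum(exp_scores)
    reference = 1.0 / denom
    return [reference] + [e / denom for e in exp_scores]

# Hypothetical log-odds for classes B and C, each measured against class A.
probs = multinomial_probs([0.0, math.log(2.0)])
print(probs)  # roughly [0.25, 0.25, 0.5]
```

The probabilities always sum to 1, and the predicted class is the one with the largest probability.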
Ordinal Logistic Regression:
• Ordinal Logistic Regression, also known as ordinal classification, is a
predictive modeling technique used when the response variable is
ordinal in nature.
• An ordinal variable is one where the order of the values is significant,
but not the difference between values. For example, you might ask a
person to rate a movie on a scale of 1 to 5. A score of 4 is better than
a score of 3, but the difference between ratings of 4 and 3 may not be
the same as that between ratings of 2 and 1. The values simply express
an order.
What is the Sigmoid Function?
• The sigmoid function, also called the logistic function, takes any real
value and maps it to a value between 0 and 1. Its graph is shaped like
the letter “S”. It is defined as σ(z) = 1 / (1 + e^(-z)).
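
A minimal sketch of the sigmoid in Python, using only the standard library. The two-branch form is a design choice made here to avoid overflow for large negative inputs; it is not taken from the slides.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

for z in (-5, 0, 5):
    print(z, round(sigmoid(z), 4))
```

Large negative inputs map close to 0, large positive inputs map close to 1, and an input of 0 maps to exactly 0.5.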
Derivation of Logistic Regression Equation
• Logistic regression belongs to the family of generalized linear models
(GLM), whose basic equation is:

• g(E(y)) = α + βx1 + γx2

• Here, g() is the link function, E(y) is the expectation of the target variable, and α + βx1 + γx2 is the
linear predictor (α, β, γ are the parameters to be estimated). The role of the link function is to ‘link’
the expectation of y to the linear predictor.

Important Points

• A generalized linear model (GLM) does not assume a linear relationship between the dependent and
independent variables. However, the logit model assumes a linear relationship between the link function
and the independent variables.
• The dependent variable need not be normally distributed.
• It does not use OLS (Ordinary Least Squares) for parameter estimation. Instead, it uses maximum
likelihood estimation (MLE).
• Errors need to be independent but need not be normally distributed.
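
To make the maximum likelihood point concrete, here is a minimal sketch that fits a one-variable logistic model by gradient ascent on the log-likelihood. Gradient ascent is only one possible optimizer and is an assumption made here for simplicity; the data, learning rate, and step count are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the data under p = sigmoid(b0 + b1 * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_by_gradient_ascent(xs, ys, learning_rate=0.1, steps=5000):
    """Climb the log-likelihood surface and return the estimates (b0, b1)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the log-likelihood: residuals (y - p) times each input.
        grad0 = sum(y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys))
        grad1 = sum((y - sigmoid(b0 + b1 * x)) * x for x, y in zip(xs, ys))
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1

# Hypothetical data: hours studied vs. pass (1) / fail (0). The classes
# overlap, so the maximum likelihood estimate is finite.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_by_gradient_ascent(hours, passed)
print(b0, b1, log_likelihood(b0, b1, hours, passed))
```

No squared-error term appears anywhere: the fit is driven entirely by how probable the observed labels are under the model.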
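• In logistic regression, we are only concerned with the probability of
the outcome of the dependent variable (success or failure). As described
above, g() is the link function. This function is established using two
things: the probability of success (p) and the probability of failure
(1-p). p should meet the following criteria:
• It must always be non-negative (p >= 0)
• It must always be less than or equal to 1 (p <= 1)
• Start from a linear predictor with a single variable, Age:

• g(y) = βo + β(Age) ---- (a)

• To keep p non-negative, put the linear predictor in exponential form:

• p = exp(βo + β(Age)) = e^(βo + β(Age)) ---- (b)

• To keep p at most 1, divide by a number greater than the numerator:

• p = exp(βo + β(Age)) / (exp(βo + β(Age)) + 1) = e^(βo + β(Age)) / (e^(βo + β(Age)) + 1) ---- (c)

• Using (a), (b) and (c), and writing y for the linear predictor βo + β(Age), we can redefine the probability as:

• p = e^y / (1 + e^y) ---- (d)

• q = 1 - p = 1 - e^y / (1 + e^y) = 1 / (1 + e^y) ---- (e)

• On dividing (d) by (e), we get:

• p / (1 - p) = e^y

• After taking the log on both sides, we get:

• log(p / (1 - p)) = y = βo + β(Age)

• This is the logit link function: the log of the odds is a linear function of the predictor.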
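
The relationship between equation (d) and the log of the odds can be checked numerically. This is a small sketch; the coefficient values and the age are hypothetical and chosen only for illustration.

```python
import math

def probability(y):
    """Equation (d): p = e^y / (1 + e^y)."""
    return math.exp(y) / (1.0 + math.exp(y))

def log_odds(p):
    """The logit: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients and input.
beta_0, beta_1, age = -4.0, 0.1, 50
y = beta_0 + beta_1 * age   # linear predictor
p = probability(y)
print(p, log_odds(p))
```

Applying the logit to p recovers the linear predictor y (up to floating-point rounding), which is why the logit serves as the link function.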
Decision Boundary
• Our current prediction function returns a probability score between 0
and 1. In order to map this to a discrete class (true/false, cat/dog), we
select a threshold value or tipping point: values at or above it are
classified as class 1, and values below it as class 0.
p ≥ 0.5, class = 1
p < 0.5, class = 0
• For example, if our threshold was 0.5 and our prediction function
returned 0.7, we would classify this observation as positive. If our
prediction was 0.2 we would classify the observation as negative. For
logistic regression with multiple classes we could select the class with
the highest predicted probability.
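
The thresholding rule and the multi-class rule can be sketched in a few lines. The function names and the example probabilities are hypothetical.

```python
def classify(probability, threshold=0.5):
    """Return 1 when the predicted probability reaches the threshold, else 0."""
    return 1 if probability >= threshold else 0

def classify_multiclass(class_probabilities):
    """Return the label with the highest predicted probability."""
    return max(class_probabilities, key=class_probabilities.get)

print(classify(0.7))  # 1
print(classify(0.2))  # 0
print(classify_multiclass({"cat": 0.2, "dog": 0.5, "bird": 0.3}))  # dog
```

Raising the threshold makes the classifier more conservative about predicting class 1, which trades false positives for false negatives.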
Python code of Logistic Regression
# load libraries
import pandas as pd
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# load dataset
df = pd.read_csv('Student-Pass-Fail-Data.csv')
print(df.head())  # preview the first rows

# split input and target variables

x = df.drop('Pass_Or_Fail',axis = 1)
y = df.Pass_Or_Fail
# split data into training and testing dataset
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=4)

# create object for logistic regression
logistic_regression = LogisticRegression()

# train the model
logistic_regression.fit(x_train, y_train)

# predict values
y_pred = logistic_regression.predict(x_test)

# check model accuracy
accuracy = metrics.accuracy_score(y_test, y_pred)
accuracy_percentage = 100 * accuracy
print(accuracy_percentage)
Performance of Logistic Regression Model
• 1. AIC (Akaike Information Criterion) – AIC plays a role in logistic
regression analogous to that of adjusted R² in linear regression. It is a
measure of fit that penalizes the model for the number of model
coefficients. Therefore, when comparing models fitted to the same data,
we prefer the one with the minimum AIC value.
• 2. Null Deviance and Residual Deviance – Null deviance measures how
well the response is predicted by a model with nothing but an intercept.
Residual deviance measures how well the response is predicted once the
independent variables are added. In both cases, the lower the value, the
better the fit.
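• 3. Confusion Matrix – A tabular representation of actual vs predicted
values. For a binary outcome it is a 2×2 table that counts true
positives, false positives, true negatives, and false negatives. It helps
us find the accuracy of the model and, when computed on test data, check
for overfitting.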
• 4. ROC Curve – The Receiver Operating Characteristic (ROC) curve
summarizes the model’s performance by evaluating the trade-off between
the true positive rate (sensitivity) and the false positive rate
(1 - specificity) as the classification threshold on p varies. The area
under the curve (AUC), also referred to as the index of accuracy (A) or
concordance index, is a widely used summary of the ROC curve. The higher
the area under the curve, the better the predictive power of the model.
The ROC curve of a perfect predictive model has a true positive rate of 1
and a false positive rate of 0, so it touches the top left corner of the
graph.
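
The metrics above can be computed by hand on a small example. This is a sketch under stated assumptions: the labels and predicted probabilities are made up, the outcomes are individual 0/1 observations (so the residual deviance equals -2 times the log-likelihood), and the AUC is computed as the concordance index over all positive/negative pairs.

```python
import math

def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels coded as 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn

def auc(actual, scores):
    """Concordance index: share of (positive, negative) pairs ranked correctly.

    A tie in score counts as half a correct pair.
    """
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def deviance_and_aic(actual, scores, n_coefficients):
    """Residual deviance (-2 * log-likelihood) and AIC for 0/1 outcomes."""
    log_lik = sum(
        math.log(s) if a == 1 else math.log(1.0 - s)
        for a, s in zip(actual, scores)
    )
    return -2.0 * log_lik, 2.0 * n_coefficients - 2.0 * log_lik

# Hypothetical labels and predicted probabilities.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7, 0.8, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
true_positive_rate = tp / (tp + fn)
false_positive_rate = fp / (fp + tn)
area = auc(actual, scores)
deviance, aic = deviance_and_aic(actual, scores, n_coefficients=2)

print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("accuracy:", accuracy, "TPR:", true_positive_rate, "FPR:", false_positive_rate)
print("AUC:", area)
print("deviance:", deviance, "AIC:", aic)
```

Each threshold yields one (false positive rate, true positive rate) point; sweeping the threshold over all values traces the ROC curve whose area the `auc` function summarizes.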
