0% found this document useful (0 votes)
11 views25 pages

Logistic Regression

Logistic Regression is a classification algorithm used to predict binary outcomes based on independent variables, functioning as a special case of linear regression for categorical data. It assumes a linear relationship between the logit of the outcome and predictor variables, and can be applied in various fields such as text classification and image recognition. The method includes types like Multinomial and Ordinal Logistic Regression, and utilizes metrics like AIC, Null Deviance, and ROC Curve for model performance evaluation.

Uploaded by

abhijaychauhan88
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
11 views25 pages

Logistic Regression

Logistic Regression is a classification algorithm used to predict binary outcomes based on independent variables, functioning as a special case of linear regression for categorical data. It assumes a linear relationship between the logit of the outcome and predictor variables, and can be applied in various fields such as text classification and image recognition. The method includes types like Multinomial and Ordinal Logistic Regression, and utilizes metrics like AIC, Null Deviance, and ROC Curve for model performance evaluation.

Uploaded by

abhijaychauhan88
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 25

Logistic Regression

Classification
• Classification is a very important area of supervised machine learning.
A large number of important machine learning problems fall within
this area. There are many classification methods, and logistic
regression is one of them.
• Supervised machine learning algorithms define models that capture
relationships among data. Classification is an area of supervised
machine learning that tries to predict which class or category some
entity belongs to, based on its features.
• There are two main types of classification problems:
• Binary or binomial classification: exactly two classes to choose
between (usually 0 and 1, true and false, or positive and negative)
• Multiclass or multinomial classification: three or more classes of the
outputs to choose from
When Do You Need Classification?
• You can apply classification in many fields of science and technology.
Example:
• text classification algorithms are used to separate legitimate and
spam emails, as well as positive and negative comments.
• Image recognition tasks are often represented as classification
problems.
What is Logistic Regression?
• Logistic Regression is a classification algorithm. It is used to predict a
binary outcome (1 / 0, Yes / No, True / False) given a set of
independent variables.
• You can also think of logistic regression as a special case of linear
regression when the outcome variable is categorical, where we are
using log of odds as dependent variable.
• In simple words, it predicts the probability of occurrence of an event
by fitting data to a logit function.
• In many situations, the response variable is qualitative or, in other
words, categorical. For example, gender is qualitative, taking on
values male or female.
• Prediciting a qualitative response for an observation can be referred
to as classifying that observation, since it involves assigning the
observation to a category, or class. On the other hand, the methods
that are often used for classification first predict the probability of
each of the categories of a qualitative variable, as the basis for making
the classification.
• Linear regression is not capable of predicting probability.
Logistic Regression Assumptions
The logistic regression method assumes that:

• The outcome is a binary or dichotomous variable like yes vs no, positive vs negative, 1 vs
0.
• There is a linear relationship between the logit of the outcome and each predictor
variables. Recall that the logit function is logit(p) = log(p/(1-p)), where p is the
probabilities of the outcome.
• There is no influential values (extreme values or outliers) in the continuous predictors
• There is no high intercorrelations (i.e. multicollinearity) among the predictors.

To improve the accuracy of your model, you should make sure that these assumptions hold
true for your data.
Types Of Logistic Regression
Models
• One of the plus points of Logistic Regression is that it can be used to
solve multi-class classification problems by using the Multinomial and
Ordinal Logistic models.
Multinomial Logistic Regression:
• Multinomial Regression is an extension of binary logistic regression,
that is used when the response variable has more than 2 classes.
Multinomial regression is used to handle multi-class classification
problems.
• Let’s assume that our response variable has K = 3 classes, then the
Multinomial logistic model will fit K-1 independent binary logistic
models in order to compute the final outcome.
Ordinal Logistic Regression:
• Ordinal Logistic Regression also known as Ordinal classification is a
predictive modeling technique used when the response variable is
ordinal in nature.
• An ordinal variable is one where the order of the values is significant,
but not the difference between values. For example, you might ask a
person to rate a movie on a scale of 1 to 5. A score of 4 is much better
than 3, because it means that the person liked the movie. But the
difference between a rating of 4 and the 3 may not be the same as
that between 4 and 1. The values simply express an order.
What is the Sigmoid Function?
• It is a mathematical function having a characteristic that can take any
real value and map it to between 0 to 1 shaped like the letter “S”. The
sigmoid function also called a logistic function.
Derivation of Logistic
Regression Equation
• g(E(y)) = α + βx1 + γx2

• Here, g() is the link function, E(y) is the expectation of target variable and α + βx1 + γx2 is the linear
predictor ( α,β,γ to be predicted). The role of link function is to ‘link’ the expectation of y to linear
predictor.

Important Points

• GLM does not assume a linear relationship between dependent and independent variables. However, it
assumes a linear relationship between link function and independent variables in logit model.
• The dependent variable need not to be normally distributed.
• It does not uses OLS (Ordinary Least Square) for parameter estimation. Instead, it uses maximum
likelihood estimation (MLE).
• Errors need to be independent but not normally distributed.
• In logistic regression, we are only concerned about the probability of
outcome dependent variable ( success or failure). As described above,
g() is the link function. This function is established using two things:
Probability of Success(p) and Probability of Failure(1-p). p should
meet following criteria:
• It must always be positive (since p >= 0)
• It must always be less than equals to 1 (since p <= 1)
• g(y) = βo + β(Age) ---- (a)

• p = exp(βo + β(Age)) = e^(βo + β(Age)) ------- (b)

• p = exp(βo + β(Age)) / exp(βo + β(Age)) + 1 = e^(βo + β(Age)) / e^(βo + β(Age)) + 1 ----- (c)

• Using (a), (b) and (c), we can redefine the probability as:

• p = e^y/ 1 + e^y --- (d)

• q = 1 - p = 1 - (e^y/ 1 + e^y) --- (e)

• On dividing, (d) / (e), we get,


• After taking log on both side, we get,
Decision Boundary
• Our current prediction function returns a probability score between 0
and 1. In order to map this to a discrete class (true/false, cat/dog), we
select a threshold value or tipping point above which we will classify
values into class 1 and below which we classify values into class 2.
p≥0.5,class=1
p<0.5,class=0
• For example, if our threshold was .5 and our prediction function
returned .7, we would classify this observation as positive. If our
prediction was .2 we would classify the observation as negative. For
logistic regression with multiple classes we could select the class with
the highest predicted probability.
Python code of Logistic Regression
# load libraries
import pandas as pd
import numpy as np
from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# load dataset
df = pd.read_csv('Student-Pass-Fail-Data.csv')
df.head()

# split input and target variables


x = df.drop('Pass_Or_Fail',axis = 1)
y = df.Pass_Or_Fail
Code…
# split data into training and testing dataset
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=4)

# create object for logistic regression


logistic_regression = LogisticRegression()

#train the model


logistic_regression.fit(x_train, y_train)

#predict values
y_pred = logistic_regression.predict(x_test)
Code…
#check model accuracy

accuracy = metrics.accuracy_score(y_test, y_pred)


accuracy_percentage = 100 * accuracy
accuracy_percentage
Performance of Logistic
Regression Model
• 1. AIC (Akaike Information Criteria) – The analogous metric of
adjusted R² in logistic regression is AIC. AIC is the measure of fit which
penalizes model for the number of model coefficients. Therefore, we
always prefer model with minimum AIC value.
• 2. Null Deviance and Residual Deviance – Null Deviance indicates the
response predicted by a model with nothing but an intercept. Lower
the value, better the model. Residual deviance indicates the response
predicted by a model on adding independent variables. Lower the
value, better the model.
• 3. Confusion Matrix: It is nothing but a tabular representation of
Actual vs Predicted values. This helps us to find the accuracy of the
model and avoid overfitting. This is how it looks like:
• ROC Curve: Receiver Operating Characteristic(ROC) summarizes the
model’s performance by evaluating the trade offs between true
positive rate (sensitivity) and false positive rate(1- specificity). For
plotting ROC, it is advisable to assume p > 0.5 since we are more
concerned about success rate. ROC summarizes the predictive power
for all possible values of p > 0.5. The area under curve (AUC), referred
to as index of accuracy(A) or concordance index, is a perfect
performance metric for ROC curve. Higher the area under curve,
better the prediction power of the model. Below is a sample ROC
curve. The ROC of a perfect predictive model has TP equals 1 and FP
equals 0. This curve will touch the top left corner of the graph.

You might also like