Logistic Regression in Machine Learning

Logistic regression is a supervised machine learning algorithm primarily used for binary classification tasks, predicting the probability of an instance belonging to a specific class. It employs the sigmoid function to map input values to probabilities between 0 and 1, and can be categorized into binomial, multinomial, and ordinal types based on the nature of the dependent variable. The document also covers the assumptions, evaluation metrics, and implementation steps for logistic regression, highlighting its differences from linear regression.

Uploaded by

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

166 views10 pages

Logistic Regression in Machine Learning

Uploaded by

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 10

Logistic Regression in Machine Learning

In our previous discussion, we explored the fundamentals of machine learning and walked
through a hands-on implementation of Linear Regression. Now, let’s take a step forward and
dive into one of the first and most widely used classification algorithms — Logistic Regression
What is Logistic Regression?
Logistic regression is a supervised machine learning algorithm used for classification
tasks where the goal is to predict the probability that an instance belongs to a given class or
not. Logistic regression is a statistical algorithm which analyze the relationship between two
data factors. The article explores the fundamentals of logistic regression, it’s types and
implementations.
Logistic regression is used for binary classification where we use sigmoid function, that takes
input as independent variables and produces a probability value between 0 and 1.
For example, we have two classes Class 0 and Class 1 if the value of the logistic function for
an input is greater than 0.5 (threshold value) then it belongs to Class 1 otherwise it belongs to
Class 0. It’s referred to as regression because it is the extension of linear regression but is
mainly used for classification problems.
Key Points:
• Logistic regression predicts the output of a categorical dependent variable. Therefore,
the outcome must be a categorical or discrete value.
• It can be either Yes or No, 0 or 1, true or False, etc. but instead of giving the exact value
as 0 and 1, it gives the probabilistic values which lie between 0 and 1.
• In Logistic regression, instead of fitting a regression line, we fit an “S” shaped logistic
function, which predicts two maximum values (0 or 1).
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three types:
1. Binomial: In binomial Logistic regression, there can be only two possible types of the
dependent variables, such as 0 or 1, Pass or Fail, etc.
2. Multinomial: In multinomial Logistic regression, there can be 3 or more possible
unordered types of the dependent variable, such as “cat”, “dogs”, or “sheep”
3. Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered types
of dependent variables, such as “low”, “Medium”, or “High”.
Assumptions of Logistic Regression
We will explore the assumptions of logistic regression as understanding these assumptions is
important to ensure that we are using appropriate application of the model. The assumption
include:
1. Independent observations: Each observation is independent of the other. meaning there
is no correlation between any input variables.
2. Binary dependent variables: It takes the assumption that the dependent variable must
be binary or dichotomous, meaning it can take only two values. For more than two
categories SoftMax functions are used.
3. Linearity relationship between independent variables and log odds: The relationship
between the independent variables and the log odds of the dependent variable should
be linear.
4. No outliers: There should be no outliers in the dataset.
5. Large sample size: The sample size is sufficiently large
Understanding Sigmoid Function
So far, we’ve covered the basics of logistic regression but now let’s focus on the most important
function that forms the core of logistic regression.
• The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
• It maps any real value into another value within a range of 0 and 1. The value of the
logistic regression must be between 0 and 1, which cannot go beyond this limit, so it
forms a curve like the “S” form.
• The S-form curve is called the Sigmoid function or the logistic function.
• In logistic regression, we use the concept of the threshold value, which defines the
probability of either 0 or 1. Such as values above the threshold value tends to 1, and a
value below the threshold values tends to 0.
Sigmoid Function
Now we use the sigmoid function where the input will be z and we find the probability between
0 and 1. i.e. predicted y.
σ(z)=11+e−zσ(z)=1+e−z1

Sigmoid function
As shown above, the figure sigmoid function converts the continuous variable data into
the probability i.e. between 0 and 1.
• σ(z) σ(z) tends towards 1 as z→∞z→∞
• σ(z) σ(z) tends towards 0 as z→−∞z→−∞
• σ(z) σ(z) is always bounded between 0 and 1
where the probability of being a class can be measured as:
P(y=1) =σ(z)
P(y=0) =1−σ(z)
Equation of Logistic Regression:
The odd is the ratio of something occurring to something not occurring. it is different from
probability as the probability is the ratio of something occurring to everything that could
possibly occur. so odd will be:

Likelihood Function for Logistic Regression

Gradient of the log-likelihood function
To find the maximum likelihood estimates, we differentiate w.r.t w,

Terminologies involved in Logistic Regression

Here are some common terms involved in logistic regression:
• Independent variables: The input characteristics or predictor factors applied to the
dependent variable’s predictions.
• Dependent variable: The target variable in a logistic regression model, which we are
trying to predict.
• Logistic function: The formula used to represent how the independent and dependent
variables relate to one another. The logistic function transforms the input variables into
a probability value between 0 and 1, which represents the likelihood of the dependent
variable being 1 or 0.
• Odds: It is the ratio of something occurring to something not occurring. it is different
from probability as the probability is the ratio of something occurring to everything that
could possibly occur.
• Log-odds: The log-odds, also known as the logit function, is the natural logarithm of
the odds. In logistic regression, the log odds of the dependent variable are modeled as
a linear combination of the independent variables and the intercept.
• Coefficient: The logistic regression model’s estimated parameters, show how the
independent and dependent variables relate to one another.
• Intercept: A constant term in the logistic regression model, which represents the log
odds when all independent variables are equal to zero.
• Maximum likelihood estimation: The method used to estimate the coefficients of the
logistic regression model, which maximizes the likelihood of observing the data given
the model
Code Implementation for Logistic Regression
So far, we’ve covered the basics of logistic regression with all the theoritical concepts, but now
let’s focus on the hands on code implementation part which makes you understand the logistic
regression more clearly. We will dicuss Binomial Logistic regression and Multinomial
Logistic Regression one by one.
Binomial Logistic regression:
Target variable can have only 2 possible types: “0” or “1” which may represent “win” vs “loss”,
“pass” vs “fail”, “dead” vs “alive”, etc., in this case, sigmoid functions are used, which is
already discussed above.
Importing necessary libraries based on the requirement of model. This Python code shows how
to use the breast cancer dataset to implement a Logistic Regression model for classification.

This code loads the breast cancer dataset from scikit-learn, splits it into training and testing
sets, and then trains a Logistic Regression model on the training data. The model is used to
predict the labels for the test data, and the accuracy of these predictions is calculated by
comparing the predicted values with the actual labels from the test set. Finally, the accuracy is
printed as a percentage.
Multinomial Logistic Regression:
Target variable can have 3 or more possible types which are not ordered (i.e. types have no
quantitative significance) like “disease A” vs “disease B” vs “disease C”.
In this case, the softmax function is used in place of the sigmoid function. Softmax function for
K classes will be:

Here, K represents the number of elements in the vector z, and i, j iterates over all the elements
in the vector.
Then the probability for class c will be:

In Multinomial Logistic Regression, the output variable can have more than two possible
discrete outputs. Consider the Digit Dataset.

How to Evaluate Logistic Regression Model?

So far, we’ve covered the implementation of logistic regression. Now, let’s dive into the
evaluation of logistic regression and understand why it’s important
Evaluating the model helps us assess the model’s performance and ensure it generalizes well
to new data
We can evaluate the logistic regression model using the following metrics:
• Accuracy: Accuracy provides the proportion of correctly classified instances.
• Precision: Precision focuses on the accuracy of positive predictions.
• Recall (Sensitivity or True Positive Rate): Recall measures the proportion of correctly
predicted positive instances among all actual positive instances.
• F1 Score: F1 score is the harmonic mean of precision and recall.
• Area Under the Receiver Operating Characteristic Curve (AUC-ROC): The ROC
curve plots the true positive rate against the false positive rate at various thresholds.
AUC-ROC measures the area under this curve, providing an aggregate measure of a
model’s performance across different classification thresholds.
• Area Under the Precision-Recall Curve (AUC-PR): Similar to AUC-ROC, AUC-PR
measures the area under the precision-recall curve, providing a summary of a model’s
performance across different precision-recall trade-offs.

Differences Between Linear and Logistic Regression

Linear Regression Logistic Regression
Linear regression is used to predict the Logistic regression is used to predict the
continuous dependent variable using a given set categorical dependent variable using a given set
of independent variables. of independent variables.
Linear regression is used for solving regression It is used for solving classification problems.
problem.
In this we predict the value of continuous In this we predict values of categorical variables
variables
In this we find best fit line. In this we find S-Curve.
Least square estimation method is used for Maximum likelihood estimation method is used
estimation of accuracy. for Estimation of accuracy.
The output must be continuous value, such as Output must be categorical value such as 0 or 1,
price, age, etc. Yes or no, etc.

It required linear relationship between It not required linear relationship.

dependent and independent variables.
There may be collinearity between the There should be little to no collinearity between
independent variables. independent variables.
Logistic Regression Formula
Unlike Linear Regression, which predicts a continuous output, Logistic Regression predicts a
probability using the sigmoid function:

To convert this probability into a classification decision:

Types of Logistic Regression

• Binary Logistic Regression → Outcome has two classes (e.g., Spam/Not Spam)
• Multinomial Logistic Regression → Outcome has more than two classes (e.g.,
Low/Medium/High)
• Ordinal Logistic Regression → Ordered categories (e.g., Customer Satisfaction: Low
< Medium < High)

Steps for Building a Logistic Regression Model

Step 1: Import Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
Step 2: Load Dataset
df = pd.read_csv("data.csv") # Replace with your dataset
print(df.head())
Step 3: Preprocess Data
• Handle missing values
• Convert categorical variables to numeric (One-Hot Encoding or Label Encoding)
• Scale features if necessary
df.dropna(inplace=True) # Drop missing values
df = pd.get_dummies(df, drop_first=True) # Convert categorical variables
Step 4: Split Data
X = df.drop("target", axis=1) # Independent variables
y = df["target"] # Dependent variable (0 or 1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Step 5: Train Logistic Regression Model
model = LogisticRegression()
model.fit(X_train, y_train)
Step 6: Make Predictions
y_pred = model.predict(X_test)
Step 7: Evaluate the Model
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

Model Interpretation
• Accuracy: Measures overall correctness of predictions
• Confusion Matrix: Shows True Positives (TP), False Positives (FP), True Negatives
(TN), and False Negatives (FN)
• Precision & Recall: Important when dealing with imbalanced classes
• ROC Curve & AUC Score: Measures model's ability to distinguish between classes
from sklearn.metrics import roc_curve, auc
y_prob = model.predict_proba(X_test)[:, 1] # Get probability scores
fpr, tpr, _ = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)
plt.plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()

When to Use Logistic Regression?

• When the target variable is categorical (binary or multinomial)
• When features and target have a log-odds linear relationship
• When you need interpretability over complex models like Neural Networks

Logistic Regression
No ratings yet
Logistic Regression
22 pages
GPost Manual
No ratings yet
GPost Manual
449 pages
Logistic Regression
No ratings yet
Logistic Regression
17 pages
Report Logistic Regression
No ratings yet
Report Logistic Regression
21 pages
FALLSEM2024-25 BCSE209L TH VL2024250101695 2024-08-12 Reference-Material-II
No ratings yet
FALLSEM2024-25 BCSE209L TH VL2024250101695 2024-08-12 Reference-Material-II
19 pages
Logistic Regression
No ratings yet
Logistic Regression
9 pages
Exp 2 121a1047 ML Lavanya Kurup Div C C3
No ratings yet
Exp 2 121a1047 ML Lavanya Kurup Div C C3
8 pages
ML Assignment
No ratings yet
ML Assignment
20 pages
Logistic Regression
No ratings yet
Logistic Regression
16 pages
Experiment No 8
No ratings yet
Experiment No 8
4 pages
M2 Logistic Regression Classcopy 4
No ratings yet
M2 Logistic Regression Classcopy 4
7 pages
Logistic Regression in Machine Learning - GeeksforGeeks
No ratings yet
Logistic Regression in Machine Learning - GeeksforGeeks
10 pages
MACHINE LEARNING Presentation Logistic Regression
No ratings yet
MACHINE LEARNING Presentation Logistic Regression
18 pages
Logistic Regression
No ratings yet
Logistic Regression
25 pages
B.Tech V KCS055 Unit2 2
No ratings yet
B.Tech V KCS055 Unit2 2
7 pages
Logistic Regression
No ratings yet
Logistic Regression
10 pages
Logistic Regression in Machine Learning
No ratings yet
Logistic Regression in Machine Learning
3 pages
Aiml Unit 3 1
No ratings yet
Aiml Unit 3 1
9 pages
Logistics Regression
No ratings yet
Logistics Regression
8 pages
Lecture 4-Logistic Regression
No ratings yet
Lecture 4-Logistic Regression
20 pages
What Is Logistic Regression
No ratings yet
What Is Logistic Regression
20 pages
Lecture Note #9 - PEC-CS701E
No ratings yet
Lecture Note #9 - PEC-CS701E
41 pages
ML (08-08-2024)
No ratings yet
ML (08-08-2024)
5 pages
Logistic Regression
No ratings yet
Logistic Regression
14 pages
09 23ECE216 LogisticRegression
No ratings yet
09 23ECE216 LogisticRegression
40 pages
Mathematics Behind Logistic Regression Model 1598272636
No ratings yet
Mathematics Behind Logistic Regression Model 1598272636
6 pages
Logistic Regression
No ratings yet
Logistic Regression
36 pages
Logistic Regression
No ratings yet
Logistic Regression
4 pages
Dav Exp4 66
No ratings yet
Dav Exp4 66
5 pages
Supervised Logistic Regression
No ratings yet
Supervised Logistic Regression
13 pages
Wa0004.
No ratings yet
Wa0004.
9 pages
Misc 5
No ratings yet
Misc 5
1 page
Logistic Regression
No ratings yet
Logistic Regression
8 pages
EXP-2-To Implement Logistic Regression
No ratings yet
EXP-2-To Implement Logistic Regression
5 pages
W5S01 - PM-Logistic Regression
No ratings yet
W5S01 - PM-Logistic Regression
17 pages
Chapter Two Dss
No ratings yet
Chapter Two Dss
3 pages
Logistic Regression For Machine Learning Complete TutorialUnderstand This Popular Supervised Classifi
No ratings yet
Logistic Regression For Machine Learning Complete TutorialUnderstand This Popular Supervised Classifi
10 pages
Lecture 07
No ratings yet
Lecture 07
26 pages
2-Logistic Regression
No ratings yet
2-Logistic Regression
15 pages
13 Logistic Regression Main
No ratings yet
13 Logistic Regression Main
14 pages
Logistic Regressions
No ratings yet
Logistic Regressions
11 pages
Logistic Regression by Nirzona
No ratings yet
Logistic Regression by Nirzona
11 pages
03 Logistic Regression
No ratings yet
03 Logistic Regression
23 pages
Logistic Regression in Machine Learning
No ratings yet
Logistic Regression in Machine Learning
3 pages
29 LogisticRegression
No ratings yet
29 LogisticRegression
15 pages
Logistic Regression
No ratings yet
Logistic Regression
6 pages
Logistic Regression
No ratings yet
Logistic Regression
16 pages
Logistic Regression
No ratings yet
Logistic Regression
6 pages
All Merged Chap 2
No ratings yet
All Merged Chap 2
19 pages
Linear and Logistic Regression
No ratings yet
Linear and Logistic Regression
21 pages
Logistic Regression
No ratings yet
Logistic Regression
8 pages
ML Lec-9
No ratings yet
ML Lec-9
13 pages
4.logistic Regression
No ratings yet
4.logistic Regression
28 pages
Lecture 22. GLM
No ratings yet
Lecture 22. GLM
41 pages
Presentation (FA20 BCS 104)
No ratings yet
Presentation (FA20 BCS 104)
9 pages
MLStackCafe QAS 1672810525772
No ratings yet
MLStackCafe QAS 1672810525772
12 pages
Logistic Regression
No ratings yet
Logistic Regression
14 pages
Logistic Regression Report
No ratings yet
Logistic Regression Report
39 pages
ML Unit 3
No ratings yet
ML Unit 3
40 pages
ML Unit 4
No ratings yet
ML Unit 4
28 pages
Correlation and Regression: Six Sigma Thinking, #8
From Everand
Correlation and Regression: Six Sigma Thinking, #8
Sumeet Savant
5/5 (1)
Topic 5 Calculus Review HL
No ratings yet
Topic 5 Calculus Review HL
120 pages
Python Lab Manual
No ratings yet
Python Lab Manual
17 pages
Class 11 English Guess (2025) - 1
75% (4)
Class 11 English Guess (2025) - 1
20 pages
Ent131 HRM Assessment
No ratings yet
Ent131 HRM Assessment
42 pages
22ISE464 IoT Lab Manual
No ratings yet
22ISE464 IoT Lab Manual
56 pages
Human-Centered Machine Learning Implementation in Banking Case Study in BRILink BRI Branchless Banking Agent Acquisition Upgrade and Activation
No ratings yet
Human-Centered Machine Learning Implementation in Banking Case Study in BRILink BRI Branchless Banking Agent Acquisition Upgrade and Activation
7 pages
Dragonlock Cura Profile Instr v3
No ratings yet
Dragonlock Cura Profile Instr v3
6 pages
Flow-Xpress Release Notes - RN - FlowXpress-EN
No ratings yet
Flow-Xpress Release Notes - RN - FlowXpress-EN
22 pages
Network Categories
No ratings yet
Network Categories
20 pages
10 Coolest Jobs in Cybersecurity
No ratings yet
10 Coolest Jobs in Cybersecurity
1 page
Auto LISP
No ratings yet
Auto LISP
18 pages
Harshal More Aws&Devops
No ratings yet
Harshal More Aws&Devops
3 pages
110-6300-EN-R1 PS v10.0 Updating Firmware PS-FS
No ratings yet
110-6300-EN-R1 PS v10.0 Updating Firmware PS-FS
31 pages
A Local Business Listing
0% (1)
A Local Business Listing
11 pages
Meet Raval Resume
No ratings yet
Meet Raval Resume
2 pages
Compare Laptop
No ratings yet
Compare Laptop
7 pages
A Case Study Analysis of JDPi Automotive Manufacturer
No ratings yet
A Case Study Analysis of JDPi Automotive Manufacturer
14 pages
Sap MM Module Most Essential Notes at One Place
88% (8)
Sap MM Module Most Essential Notes at One Place
18 pages
Astm-Standards-3-Pdf-Free 23
No ratings yet
Astm-Standards-3-Pdf-Free 23
1 page
PC Partner Intel LGA 775
No ratings yet
PC Partner Intel LGA 775
2 pages
! Diet Problem Given in The Note : Model Title
No ratings yet
! Diet Problem Given in The Note : Model Title
2 pages
Roll It!
No ratings yet
Roll It!
4 pages
NPS International School-2
No ratings yet
NPS International School-2
13 pages
Single Developer License
No ratings yet
Single Developer License
2 pages
Map Update Manual - LTH - BR V2.1 - ENG
No ratings yet
Map Update Manual - LTH - BR V2.1 - ENG
6 pages
Jetpack Compose UI App Development Toolkit - Android Developers
No ratings yet
Jetpack Compose UI App Development Toolkit - Android Developers
10 pages
Cpe Reviewer
No ratings yet
Cpe Reviewer
5 pages
MYSQL
No ratings yet
MYSQL
44 pages
HPE ProLiant DL380 Gen10 Server
No ratings yet
HPE ProLiant DL380 Gen10 Server
3 pages

Logistic Regression in Machine Learning

Uploaded by

Logistic Regression in Machine Learning

Uploaded by

Logistic Regression in Machine Learning

Likelihood Function for Logistic Regression

Terminologies involved in Logistic Regression

How to Evaluate Logistic Regression Model?

Differences Between Linear and Logistic Regression

It required linear relationship between It not required linear relationship.

To convert this probability into a classification decision:

Types of Logistic Regression

Steps for Building a Logistic Regression Model

When to Use Logistic Regression?

You might also like