
FUTURE INSTITUTE OF TECHNOLOGY

240, Garia Boral Main Road, Kolkata - 700154 West Bengal

(Affiliated To MAKAUT)

Detailed Report on
Logistic Regression and Maximum Likelihood
Estimation

Submitted as CA2 in

Machine Learning Applications

(PCCAIML601)

for

the partial fulfilment of

B. Tech in

Computer Science and Engineering (AI & ML)

Submitted by:

Eshika Giri

(34230822009)

Submitted on: 12th of March, 2025


Table of Contents

1. Introduction
2. Logistic Regression
o Definition and Importance
o Mathematical Formulation
o Sigmoid Function
o Decision Boundary
3. Maximum Likelihood Estimation (MLE)
o Concept of Likelihood
o Derivation of MLE for Logistic Regression
o Log-Likelihood Function
o Optimization Using Gradient Descent
4. Cost Function for Logistic Regression
5. Regularization in Logistic Regression
o L1 (Lasso) Regularization
o L2 (Ridge) Regularization
6. Advantages of Logistic Regression
7. Limitations of Logistic Regression
8. Applications of Logistic Regression
9. Conclusion
10. References
Abstract

Logistic Regression is a fundamental machine learning algorithm used for binary
classification problems. It models the probability of an event occurring by applying the
logistic (sigmoid) function to a linear combination of input features. The parameters of
logistic regression are optimized using Maximum Likelihood Estimation (MLE), which
ensures the best fit to the data by maximizing the probability of the observed outcomes. This
report provides a comprehensive discussion of the mathematical foundations of logistic
regression, the derivation of MLE, optimization techniques, cost functions, regularization
methods, and real-world applications. Additionally, the report highlights the advantages and
limitations of logistic regression in practical scenarios.

Introduction

Logistic Regression is a widely used statistical method for binary classification tasks. It is
particularly effective when the target variable has two possible outcomes, such as 'yes' or 'no',
'spam' or 'not spam', and 'fraudulent' or 'non-fraudulent'. Unlike linear regression, which
predicts continuous values, logistic regression estimates the probability of a particular class
label.

The key idea behind logistic regression is to model the relationship between a set of
independent variables and the probability of a dependent variable belonging to a particular
class. The model uses the sigmoid function to ensure that the output values are constrained
between 0 and 1, making them interpretable as probabilities.

To estimate the parameters of the logistic regression model, Maximum Likelihood
Estimation (MLE) is used. MLE finds the parameter values that maximize the likelihood of
the observed data. Since logistic regression does not have a closed-form solution like linear
regression, optimization techniques such as Gradient Descent are used to find the best
parameter estimates.

Logistic Regression

Definition and Importance

Logistic regression is a supervised learning algorithm used for classification problems where
the dependent variable is categorical. The model predicts the probability of an event
occurring, making it useful in numerous domains, including medical diagnosis, fraud
detection, and marketing.
Mathematical Formulation

For a given set of input features X, the logistic regression model is expressed as:

P(Y = 1 ∣ X) = 1 / (1 + e^−(β0 + β1X1 + ... + βnXn))

where:

 β0, β1, ..., βn are the model parameters (weights and bias),
 X1, ..., Xn are the input features,
 e is Euler’s number (≈ 2.718); the exponential form constrains the output to lie between 0 and 1.

Sigmoid Function

The sigmoid function is used to map the linear combination of input features to a probability:

σ(z) = 1 / (1 + e^−z)

where z = β0 + β1X1 + ... + βnXn is the linear combination of input features. The sigmoid
function ensures that output values range between 0 and 1.
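As an illustrative sketch (not part of the formal derivation), the sigmoid and the linear combination can be computed in a few lines of Python with NumPy; the weights and feature values below are made up for demonstration:

```python
import numpy as np

def sigmoid(z):
    """Map any real number z to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Linear combination z = beta0 + beta1*x1 + ... + betan*xn,
# with a 1 prepended to x for the bias term beta0.
beta = np.array([0.5, -1.2, 2.0])   # example weights (bias first)
x = np.array([1.0, 0.3, 0.8])
z = beta @ x                        # z = 1.74 for these values
p = sigmoid(z)                      # probability that Y = 1 (about 0.85)
```

Note that σ(0) = 0.5, which is exactly the classification threshold discussed next.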

Decision Boundary

 The model classifies an instance as 1 if P(Y=1∣X) ≥ 0.5, and as 0 otherwise.
 The decision boundary is linear in simple logistic regression, but in nonlinear cases,
feature transformations can be applied.
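The 0.5 threshold can be expressed directly in code; a minimal sketch in which the data and weights are hypothetical:

```python
import numpy as np

def predict(X, beta, threshold=0.5):
    """Label each row of X as 1 if P(Y=1|X) >= threshold, else 0."""
    probs = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return (probs >= threshold).astype(int)

# Two toy samples, each with a bias column of ones prepended.
X = np.array([[1.0,  2.0],
              [1.0, -2.0]])
beta = np.array([0.0, 1.0])   # linear decision boundary at x = 0
labels = predict(X, beta)     # -> [1 0]
```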
Maximum Likelihood Estimation (MLE)

Concept of Likelihood

MLE is a statistical technique that estimates parameters by maximizing the likelihood
function, which represents the probability of the observed data given the parameters.

Derivation of MLE for Logistic Regression

For logistic regression, the likelihood function is given by:

L(β) = ∏ᵢ pᵢ^(yᵢ) (1 − pᵢ)^(1 − yᵢ),  where pᵢ = P(Yᵢ = 1 ∣ Xᵢ)

Taking the logarithm gives the log-likelihood function:

ℓ(β) = Σᵢ [ yᵢ log pᵢ + (1 − yᵢ) log(1 − pᵢ) ]
Optimization Using Gradient Descent

 Compute the gradients of the cost with respect to the parameters β.
 Use the update rule:

βⱼ := βⱼ − α · ∂J(β)/∂βⱼ

where α is the learning rate.

 Iterate until convergence.
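The three steps above can be sketched as a plain NumPy training loop; this is a toy illustration on fabricated, linearly separable data, not a production implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, alpha=0.1, n_iters=2000):
    """Gradient descent on the negative log-likelihood.
    The gradient of the cost with respect to beta is X.T @ (p - y) / n."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ beta)                 # step 1: current predictions
        gradient = X.T @ (p - y) / len(y)     # step 1: compute gradients
        beta -= alpha * gradient              # step 2: update rule
    return beta                               # step 3: after n_iters iterations

# Toy data: a bias column of ones plus one feature.
X = np.array([[1, -2.0], [1, -1.0], [1, 1.0], [1, 2.0]])
y = np.array([0, 0, 1, 1])
beta = fit_logistic(X, y)   # learned weights separate the two classes
```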


Cost Function for Logistic Regression

The cost function is derived from the negative log-likelihood function:

J(β) = −(1/n) Σᵢ [ yᵢ log pᵢ + (1 − yᵢ) log(1 − pᵢ) ]

Minimizing this function yields the optimal parameter values.

Regularization in Logistic Regression

To prevent overfitting, regularization techniques are used:

 L1 (Lasso) Regularization: Adds a penalty proportional to the absolute values of the
weights, which can shrink some coefficients exactly to zero.
 L2 (Ridge) Regularization: Adds a penalty proportional to the squared weights, which
prevents large coefficient values.
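The two penalties can be attached to the cost function from the previous section; a minimal sketch in which λ and the weights are made-up values (the bias term is conventionally left unpenalized):

```python
import numpy as np

def regularized_cost(beta, base_cost, lam=0.1, kind="l2"):
    """Add an L1 (Lasso) or L2 (Ridge) penalty to an unregularized cost."""
    weights = beta[1:]                            # skip the bias beta[0]
    if kind == "l1":
        penalty = lam * np.sum(np.abs(weights))   # encourages sparse weights
    else:
        penalty = lam * np.sum(weights ** 2)      # discourages large weights
    return base_cost + penalty

beta = np.array([0.3, 2.0, -1.5, 0.0])
c1 = regularized_cost(beta, base_cost=0.7, kind="l1")   # 0.7 + 0.1*3.5
c2 = regularized_cost(beta, base_cost=0.7, kind="l2")   # 0.7 + 0.1*6.25
```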

Advantages of Logistic Regression

1. Simple and Interpretable: Provides clear probability estimates.
2. Efficient for Binary Classification: Performs well when classes are linearly separable.
3. Less Prone to Overfitting: Simpler than deep learning models.
4. Probability Outputs: Unlike SVM, it provides probability scores.
Limitations of Logistic Regression

1. Limited to Linear Decision Boundaries: Cannot handle complex decision boundaries
without feature transformations.
2. Sensitive to Outliers: Outliers can impact model performance.
3. Struggles with High-Dimensional Data: Feature selection is necessary.

Applications of Logistic Regression

1. Medical Diagnosis: Predicting diseases (e.g., diabetes detection).
2. Spam Detection: Classifying emails as spam or not spam.
3. Credit Scoring: Assessing loan eligibility.
4. Marketing: Predicting customer purchase behavior.
5. Fraud Detection: Identifying fraudulent transactions.

Conclusion

Logistic regression is a powerful classification algorithm, particularly for binary
classification problems. Maximum Likelihood Estimation plays a crucial role in optimizing
the model parameters. While logistic regression is simple and effective, it has limitations that
must be addressed using techniques such as feature engineering and regularization.

References

1. ScienceDirect. (n.d.). Logistic Regression Analysis. Retrieved from
https://www.sciencedirect.com/topics/medicine-and-dentistry/logistic-regression-analysis
2. GeeksforGeeks. (n.d.). Understanding Logistic Regression. Retrieved from
https://www.geeksforgeeks.org/understanding-logistic-regression/
