
Unit 4 - Logistic Regression

Logistic regression is a statistical approach used for binary classification problems. It estimates the probability that an outcome is true or false based on one or more independent variables. The hypothesis function uses the sigmoid (logistic) function to output a probability between 0 and 1. Predictions use a threshold of 0.5: probabilities of 0.5 or above predict true, and those below predict false. The cost function plays the same role as in linear regression but is adapted (using log loss) to the sigmoid output between 0 and 1. Multiclass classification can be handled by combining multiple binary logistic regressions. Overfitting is addressed by reducing the number of features, either through manual selection or with feature-reduction algorithms such as PCA.


Logistic Regression

Classification
Logistic Regression
• A statistical approach that determines a binary outcome based on one or more independent variables.

• Binary outcome:
» 1/0,
» Yes/No,
» True/False etc.
Sigmoid Function (Logistic Function)

g(z) = 1 / (1 + e^(-z))

The sigmoid maps any real-valued z to an output between 0 and 1.
Logistic Regression: Model Representation

• For linear regression:


hθ(x) = θTx

• In the logistic regression model,


We want: 0 <= hθ(x) <= 1.

• The hypothesis function:


hθ(x) = g(θTx) = 1 / (1 + e^(-θTx))

where g(z) = 1 / (1 + e^(-z)) and z = θTx.
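As a quick illustration, here is a minimal NumPy sketch of this hypothesis (the parameter values and array shapes are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    # Logistic function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    # h_theta(x) = g(theta^T x): a probability between 0 and 1
    return sigmoid(np.dot(theta, x))

# Illustrative values: one feature plus an intercept term.
theta = np.array([-1.0, 0.5])   # [theta_0, theta_1]
x = np.array([1.0, 4.0])        # [1, x_1], leading 1 for the intercept
print(hypothesis(theta, x))     # estimated probability that y = 1
```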
Interpretation of Hypothesis Output
Hypothesis:
hθ(x) = “estimated probability that y= 1 on input x” = g(θTx)

Example: For a given tumor size x, suppose we get
hθ(x) = 0.7
This means there is a 70% chance that the tumor is malignant.
Mathematical representation of the hypothesis output
hθ(x)= P(y=1|x; θ)
Probability that y=1, given x, parameterized by θ

(Probability that y=0) + (Probability that y=1) = 1


P(y=0|x; θ) + P(y=1|x; θ) = 1

P(y=0|x; θ) = 1 - P(y=1|x; θ)

For the tumor example above, P(y=0|x; θ) = 1 - 0.7 = 0.3.
Decision Boundary

Logistic Regression

hθ(x) = g(θTx) = P(y=1|x; θ)


g(z) = 1 / (1 + e^(-z))

Suppose we
predict “y = 1” if hθ(x) >= 0.5
predict “y = 0” if hθ(x) < 0.5

Since g(z) >= 0.5 exactly when z >= 0, this is equivalent to predicting y = 1 whenever θTx >= 0; the set of points where θTx = 0 is the decision boundary.
Example: Two features x1 and x2

Let hθ(x) = g(θ0 + θ1x1 + θ2x2).
Predicting y = 1 whenever θ0 + θ1x1 + θ2x2 >= 0 gives a linear decision boundary in the (x1, x2) plane, as in the sketch below.
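A minimal sketch of such a two-feature decision rule, assuming the illustrative parameter values θ0 = -3, θ1 = 1, θ2 = 1 (these numbers are not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative parameters (not from the slides).
theta = np.array([-3.0, 1.0, 1.0])   # [theta_0, theta_1, theta_2]

def predict(x1, x2):
    # Predict y = 1 when h_theta(x) >= 0.5, i.e. when theta^T x >= 0.
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return int(sigmoid(z) >= 0.5)

# With these parameters the decision boundary is the line x1 + x2 = 3.
print(predict(1.0, 1.0))  # 0: below the boundary
print(predict(3.0, 3.0))  # 1: above the boundary
```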
Cost Function for Logistic Regression

In logistic regression, the cost of a single training example is:
Cost(hθ(x), y) = -log(hθ(x))       if y = 1
Cost(hθ(x), y) = -log(1 - hθ(x))   if y = 0

Rewriting the above cost function as a single equation:
Cost(hθ(x), y) = -y log(hθ(x)) - (1 - y) log(1 - hθ(x))

The overall cost over m training examples is then
J(θ) = -(1/m) Σ_{i=1..m} [ y(i) log(hθ(x(i))) + (1 - y(i)) log(1 - hθ(x(i))) ]
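A minimal NumPy sketch of this cost function, vectorized over m examples (the data and variable names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta): average cross-entropy (log loss) over m examples.
    # X: (m, n) feature matrix with a leading column of ones for the intercept.
    # y: (m,) vector of 0/1 labels.
    m = len(y)
    h = sigmoid(X @ theta)   # predicted probabilities h_theta(x)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Tiny made-up example.
X = np.array([[1.0, 2.0], [1.0, 4.0], [1.0, 6.0]])
y = np.array([0, 0, 1])
theta = np.array([-5.0, 1.0])
print(cost(theta, X, y))
```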


Multiclass Classification

In a multiclass problem the label takes more than two values, for example:
y = 1, y = 2, y = 3, y = 4
y = 1, y = 2, y = 3

One-vs-all: handle a three-class problem with three different binary classifications, one per class:
Classifier 1: y = 1 vs. the rest
Classifier 2: y = 2 vs. the rest
Classifier 3: y = 3 vs. the rest

For a new input x, run all three classifiers and predict the class whose classifier outputs the largest hθ(x), as in the sketch below.
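A minimal one-vs-all sketch using the same sigmoid hypothesis (the parameter values are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One parameter vector per class (illustrative values for a
# 3-class problem with two features plus an intercept).
thetas = {
    1: np.array([-1.0,  2.0, -1.0]),
    2: np.array([-1.0, -1.0,  2.0]),
    3: np.array([ 1.0, -1.0, -1.0]),
}

def predict_class(x):
    # Run every binary classifier and return the class with the highest h_theta(x).
    scores = {c: sigmoid(theta @ x) for c, theta in thetas.items()}
    return max(scores, key=scores.get)

x = np.array([1.0, 0.5, 3.0])   # [1, x1, x2]
print(predict_class(x))          # prints the most probable class
```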
Bias vs Variance: Overfitting Problem

A model with high bias underfits the training data, while a model with high variance fits the training data too closely (overfits) and generalizes poorly to new examples.
Addressing Overfitting
Options:
1. Reduce the number of features
- Manually select which features to keep.
- Use a feature selection / feature reduction algorithm (e.g. PCA), as sketched below.
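A minimal sketch of the feature-reduction route using scikit-learn's PCA followed by logistic regression (the made-up dataset and the choice of 2 components are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Made-up data: 100 examples, 10 features, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Reduce the 10 features to 2 principal components, then fit logistic regression.
model = make_pipeline(PCA(n_components=2), LogisticRegression())
model.fit(X, y)
print(model.predict(X[:5]))   # predicted labels for the first five examples
```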
