
Lec - 5 - Logistic Regression

The document provides an overview of logistic regression, focusing on its application in classification tasks. It explains the differences between classification, clustering, and regression, and details the logistic regression cost function and its optimization using gradient descent. Additionally, it outlines the steps involved in logistic regression, including the transformation of linear outputs into probabilities and the use of the binary cross-entropy cost function.


Logistic Regression

“Classification”

Dr. Mohanad Ali Deif


Difference Between
classification, clustering, and regression.

Difference Between
classification and regression.
The function used to represent
our hypothesis in classification:
the sigmoid function applied to hθ(x)

Linear regression hypothesis:   hθ(x) = θ0 + θ1x1 + θ2x2
Logistic regression hypothesis: hθ(x) = g(θ0 + θ1x1 + θ2x2), where g is the sigmoid function
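The two hypotheses can be sketched in plain Python; the parameter values θ = (0.5, -1.0, 2.0) and features x = (1.0, 0.5) below are made-up illustrations, not values from the lecture:

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z)): squashes any real value into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def h_linear(theta, x):
    # Linear hypothesis: theta0 + theta1*x1 + theta2*x2
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]

def h_logistic(theta, x):
    # Logistic hypothesis: wrap the linear output in the sigmoid
    return sigmoid(h_linear(theta, x))

theta = [0.5, -1.0, 2.0]     # example parameters (made up)
x = [1.0, 0.5]               # example features (made up)
print(h_linear(theta, x))    # unbounded real value
print(h_logistic(theta, x))  # value in (0, 1), usable as a probability
```

The only difference between the two models is the sigmoid wrapper, which turns the unbounded linear score into something interpretable as a probability.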
Cost Function

Numerical Example

Training data (x: tumor size, y ∈ {0, 1}):

x    y
3    1
2    1
1    1
5    0
4    0
6    0

Linear model: h(x) = -2x + 6, with z = h(x)

Estimated probability: g(z) = 1 / (1 + e^(-z)) = p(y = 1 | x; θ)

x     z     e^(-z)         1 + e^(-z)     g(z)      ŷ
3      0    1              2              0.5       1
2      2    0.13536335     1.13536335     0.8808    1
1      4    0.018323237    1.018323237    0.9820    1
5     -4    54.57551085    55.57551085    0.0179    0
4     -2    7.387524       8.387524       0.1192    0
6     -6    403.1778962    404.1778962    0.00247   0

When z ≥ 0, then g(z) ≥ 0.5, so predict ŷ = 1.
When z < 0, then g(z) < 0.5, so predict ŷ = 0.
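The table above can be reproduced in a few lines of Python, using the example's h(x) = -2x + 6 and the 0.5 decision threshold:

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

X = [3, 2, 1, 5, 4, 6]   # tumor sizes from the example
y = [1, 1, 1, 0, 0, 0]   # true labels

for xi, yi in zip(X, y):
    z = -2 * xi + 6                # h(x) = -2x + 6 from the example
    g = sigmoid(z)                 # estimated probability p(y=1 | x)
    pred = 1 if g >= 0.5 else 0    # threshold at g(z) = 0.5, i.e. z = 0
    print(f"x={xi}  z={z:+d}  g(z)={g:.4f}  y_hat={pred}  y={yi}")
```

Every prediction matches the true label, since the chosen line -2x + 6 already separates the two classes.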
How to Minimize
the Logistic Regression Cost Function

In linear regression, to find optimal values for the θ's:

1. Choose a cost function: j(θ) = (1/2) Σ_{i=1..m} (h(x^(i)) - y^(i))^2

   h(x) = θ^T x is a linear function, so this squared-error cost is convex.

▪ What if we choose the same cost function for logistic regression?

   Now h(x) = g(z) = 1 / (1 + e^(-θ^T x)), with z = θ^T x.

   Because the sigmoid is nonlinear, the squared-error cost is no longer convex in θ, and gradient descent can get stuck in local minima.

▪ Can we find a convex cost function?

j(θ) = cost(h(x), y)

cost(h(x), y) = -log(h(x))       if y = 1
                -log(1 - h(x))   if y = 0

This will give us the convexity.


Logistic Regression Cost Function

cost(h(x), y) = -log(h(x))       if y = 1
                -log(1 - h(x))   if y = 0

The two cases combine into a single expression:

j(θ) = -y log(h(x)) - (1 - y) log(1 - h(x))

Averaged over the m training examples:

j(θ) = -(1/m) Σ_{i=1..m} [ y^(i) log(h(x^(i))) + (1 - y^(i)) log(1 - h(x^(i))) ]

This is the Binary Cross-Entropy Cost Function.
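A direct translation of this cost into Python; the probability vectors passed in below are illustrative, not taken from the lecture's example:

```python
import math

def bce_cost(h, y):
    # j(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )
    m = len(y)
    return -sum(yi * math.log(hi) + (1 - yi) * math.log(1 - hi)
                for hi, yi in zip(h, y)) / m

# Confident, correct predictions -> small cost
low = bce_cost([0.9, 0.9, 0.1], [1, 1, 0])
# One confident, wrong prediction -> the -log term inflates the cost
high = bce_cost([0.1, 0.9, 0.1], [1, 1, 0])
print(low, high)
```

The asymmetry is the point of the piecewise definition: predicting a probability near 0 when y = 1 (or near 1 when y = 0) is penalized heavily, which is exactly what drives the parameters toward confident, correct outputs.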


How to Minimize
the Logistic Regression Cost Function

Gradient Descent: θ_j := θ_j - γ (d/dθ_j) J(θ)

The derivative of the cost is:

(d/dθ_j) j(θ) = (1/m) Σ_{i=1..m} (g(z^(i)) - y^(i)) · x_j^(i)

Substituting the derivative gives the update rule:

θ_j := θ_j - γ (1/m) Σ_{i=1..m} (g(z^(i)) - y^(i)) · x_j^(i)

Cross-Entropy Cost Function Derivative (Optional)
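One way to sanity-check the derivative formula is to compare it against a finite-difference approximation of the cost. The sketch below uses the earlier numerical example's data and an arbitrary θ = (0.5, -0.3) for a model with an intercept and one weight:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

X = [3, 2, 1, 5, 4, 6]   # data from the numerical example
y = [1, 1, 1, 0, 0, 0]

def cost(theta):
    # Binary cross-entropy with h(x) = g(theta0 + theta1*x)
    m = len(y)
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(theta[0] + theta[1] * xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1 - h)
    return -total / m

def grad(theta):
    # (d/dtheta_j) j(theta) = (1/m) * sum( (g(z_i) - y_i) * x_ij )
    m = len(y)
    g0 = g1 = 0.0
    for xi, yi in zip(X, y):
        err = sigmoid(theta[0] + theta[1] * xi) - yi
        g0 += err        # x_0 = 1 (intercept term)
        g1 += err * xi
    return [g0 / m, g1 / m]

theta = [0.5, -0.3]      # arbitrary test point (made up)
eps = 1e-6
numeric = (cost([theta[0], theta[1] + eps])
           - cost([theta[0], theta[1] - eps])) / (2 * eps)
print(grad(theta)[1], numeric)   # the two values agree closely
```

Agreement between the analytic and numeric values confirms the derivative really is (1/m) Σ (g(z) - y) x, despite the log terms in the cost.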
Logistic Regression Steps Summary

Step 1: h(x) = θ^T x            Learning model (linear classifier), with z = h(x)

Step 2: g(z) = 1 / (1 + e^(-z))     Converts a real value into one that can be interpreted as a probability

Step 3: j(θ) = -(1/m) Σ_{i=1..m} [ y^(i) log(g(z^(i))) + (1 - y^(i)) log(1 - g(z^(i))) ]     Binary Cross-Entropy Cost Function

Step 4: Minimize j(θ) using an optimization algorithm,
        e.g. Gradient Descent: θ_j := θ_j - γ (d/dθ_j) J(θ)
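Putting the four steps together gives a minimal training loop. The learning rate γ = 0.1 and the 5000 iterations below are arbitrary choices, and the data is the numerical example from earlier:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

X = [3, 2, 1, 5, 4, 6]
y = [1, 1, 1, 0, 0, 0]

theta = [0.0, 0.0]      # [intercept, weight]
gamma = 0.1             # learning rate (arbitrary choice)

for _ in range(5000):   # Step 4: gradient descent on the BCE cost
    m = len(y)
    g0 = g1 = 0.0
    for xi, yi in zip(X, y):
        z = theta[0] + theta[1] * xi     # Step 1: linear model
        err = sigmoid(z) - yi            # Step 2: sigmoid -> probability
        g0 += err                        # gradient for the intercept
        g1 += err * xi                   # gradient for the weight
    theta[0] -= gamma * g0 / m           # simultaneous parameter update
    theta[1] -= gamma * g1 / m

preds = [1 if sigmoid(theta[0] + theta[1] * xi) >= 0.5 else 0 for xi in X]
print(theta, preds)
```

After training, the learned weight is negative (the same direction as the hand-picked h(x) = -2x + 6) and every training example is classified correctly.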
Multi-class Classification
Open Discussion
