4. Logistic Regression
Logistic Regression
• A supervised classification algorithm.
• The predicted output is discrete: only specific values or classes
are allowed, e.g. Pass/Fail, Spam/Not spam, Malignant/Benign
tumor.
• That is, the output y = f(x), predicted from the inputs
(features) x, takes only discrete values (classes).
• Types of logistic regression:
(a) Binary logistic regression: y takes two discrete values (for
two classes).
Binary Logistic Regression
• A two-class classification problem: y = 0 or 1.
• y can be predicted from a single variable or feature (x):
y = f(x) = β0 + β1 x
• Or from multiple features (x1, x2, ..., xk):
y = f(x1, x2, ..., xk) = β0 + β1 x1 + β2 x2 + ... + βk xk
• An example on a student dataset:
Hours studied   Hours slept   Result (Pass = 1, Fail = 0)
4.85            9.63          1
8.62            3.23          0
5.43            8.23          1
9.21            6.34          0
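As a rough illustration of how the linear expression becomes a 0/1 prediction, the sketch below passes β0 + β1 x1 + β2 x2 through the sigmoid function (which the cost-function slide says is used) and thresholds the result. The function names, the 0.5 threshold, and the coefficient values are assumptions made for this example; the coefficients are picked by hand and are not fitted values from the slides.

```python
import math

def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, betas):
    """Return the estimated P(y = 1 | x) for one sample.

    betas[0] is the intercept; betas[1:] pair up with the features.
    """
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], features))
    return sigmoid(z)

def predict_class(features, betas, threshold=0.5):
    """Turn the probability into a class label (threshold is an assumption)."""
    return 1 if predict_proba(features, betas) >= threshold else 0

# Student dataset from the table: (hours studied, hours slept) -> result
students = [
    ((4.85, 9.63), 1),
    ((8.62, 3.23), 0),
    ((5.43, 8.23), 1),
    ((9.21, 6.34), 0),
]

# Hypothetical coefficients, chosen by hand for illustration only (not fitted).
betas = [0.0, -1.0, 1.0]

for features, label in students:
    p = predict_proba(features, betas)
    print(features, "P(pass) = %.3f" % p,
          "predicted:", predict_class(features, betas), "actual:", label)
```

With these hand-picked coefficients the predicted class happens to match the label for all four rows; a trained model would obtain its coefficients by minimizing a cost function instead.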
Cost function
• Consider n training points (samples) (Xi, Yi), where each Xi
is a k-dimensional feature vector (k features) and Yi = 0 or 1.
• Logistic regression uses the sigmoid function, which has an
exponential term in the denominator; with it, MSE is no longer
convex in the parameters. The following cost function, called
log-loss or cross-entropy, is therefore used instead of the MSE
used for linear regression:
$$J(\theta) = J(\beta_0, \beta_1, \dots, \beta_k) = -\frac{1}{n} \sum_{i=1}^{n} \left[ Y_i \log f(X_i) + (1 - Y_i) \log\left(1 - f(X_i)\right) \right]$$
• Like MSE, this cost function quantifies the difference between
the actual (Yi) and predicted (f(Xi)) outputs, and it has to be
minimized.
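To make the log-loss concrete, here is a minimal sketch that evaluates it for a list of labels Yi and predicted probabilities f(Xi). The function name `log_loss` and the clipping constant `eps` are assumptions added for this example; the clipping only guards against taking log(0) and is not part of the formula itself.

```python
import math

def log_loss(y_true, y_prob, eps=1e-12):
    """Average cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip p away from 0 and 1 so that log() is always defined (assumption).
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.9, 0.1]   # high probability on the true class
confident_wrong = [0.1, 0.9, 0.1, 0.9]   # high probability on the wrong class

print("loss, good predictions:", log_loss(labels, confident_right))
print("loss, bad predictions: ", log_loss(labels, confident_wrong))
```

Confident predictions on the correct class give a small loss, while confident predictions on the wrong class are penalized heavily, which is the behavior the minimization relies on.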
Gradient descent for training
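A minimal sketch of this training step, under stated assumptions: it runs batch gradient descent on the log-loss using the standard gradient ∂J/∂βj = (1/n) Σ (f(Xi) − Yi) xij, with f the sigmoid of the linear expression. The function name `train`, the zero initialization, the learning rate, and the epoch count are illustrative choices, not values taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, learning_rate=0.01, epochs=5000):
    """Batch gradient descent on the log-loss; returns [b0, b1, ..., bk]."""
    n = len(samples)
    k = len(samples[0])
    betas = [0.0] * (k + 1)          # start from all-zero coefficients
    for _ in range(epochs):
        grads = [0.0] * (k + 1)
        for x, y in zip(samples, labels):
            z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
            error = sigmoid(z) - y   # f(Xi) - Yi
            grads[0] += error        # intercept term (its feature is 1)
            for j, xi in enumerate(x):
                grads[j + 1] += error * xi
        # Step against the average gradient.
        betas = [b - learning_rate * g / n for b, g in zip(betas, grads)]
    return betas

# Student dataset: (hours studied, hours slept) and pass/fail labels.
samples = [(4.85, 9.63), (8.62, 3.23), (5.43, 8.23), (9.21, 6.34)]
labels = [1, 0, 1, 0]

betas = train(samples, labels)
predictions = []
for x in samples:
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    predictions.append(1 if sigmoid(z) >= 0.5 else 0)
print("coefficients:", betas)
print("predictions: ", predictions, "labels:", labels)
```

The small learning rate is a deliberate, conservative choice for unscaled features of this magnitude; with larger steps the updates can overshoot, and in practice features are often standardized first.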
Testing
Multi-class logistic regression