L16 LogisticRegression
April 5, 2022
Classification as Regression
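Regression problem:
Given x (independent variable),
predict y (dependent variable).
[Figure: scatter plot of points (x_i, y_i), with x on the horizontal axis and y on the vertical axis]
Classification problem:
Given x (features),
predict y (labels).
[Figure: scatter plot of points (x_i, y_i), with x on the horizontal axis and y taking only the values 0 and 1]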
Classification as Regression
[Figure: plots over x, with the vertical axis running from 0 to 1]
$$ \sigma(t) = \frac{1}{1 + e^{-t}} $$
Linear Predictor Inside Logistic Function
$$ p(y = 1 \mid x) = \sigma(\beta_0 + \beta_1 x) = \frac{1}{1 + e^{-\beta_0 - \beta_1 x}} $$
$\beta_0$: “intercept”, $\beta_1$: “slope”
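As a rough illustration of the formula above, the following Python sketch evaluates the logistic function and plugs a linear predictor into it, reading the output as the probability that y = 1. The function names and the coefficient values are hypothetical, chosen only for this example, and the two-branch form of `sigmoid` is one common way to avoid overflow rather than anything prescribed by the slides.

```python
import math

def sigmoid(t: float) -> float:
    """Logistic function sigma(t) = 1 / (1 + e^{-t}), evaluated without overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    # For negative t use the algebraically equivalent form e^t / (1 + e^t).
    z = math.exp(t)
    return z / (1.0 + z)

def predict_proba(x: float, beta0: float, beta1: float) -> float:
    """Probability that y = 1 given x: sigma(beta0 + beta1 * x)."""
    return sigmoid(beta0 + beta1 * x)

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -4.0, 1.5
for x in (1.0, 2.0, 3.0, 4.0):
    print(x, round(predict_proba(x, beta0, beta1), 3))
```

With a positive slope the printed probabilities increase with x, and they cross 0.5 where β0 + β1 x = 0.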
Example (from Wikipedia)
Pass/fail of exam (y) vs. Hours spent studying (x)
Multivariate Predictor
Maximize likelihood:
1. Compute derivative (gradient) of likelihood w.r.t. β
2. Solve for β that makes this derivative zero
Likelihood Function
$$ L(\beta; X, y) = \prod_{i=1}^{n} \sigma(X_{i\bullet}\beta)^{y_i} \, \bigl(1 - \sigma(X_{i\bullet}\beta)\bigr)^{1 - y_i} $$
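The likelihood can be evaluated directly as a product over the data points. The sketch below does this in plain Python. It assumes each row of X starts with a constant 1 so that the first coefficient acts as the intercept, and the toy data and coefficient values are made up for illustration.

```python
import math

def sigmoid(t):
    """Logistic function, evaluated without overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)

def likelihood(beta, X, y):
    """Product over i of sigma(X_i beta)^{y_i} * (1 - sigma(X_i beta))^{1 - y_i}."""
    value = 1.0
    for row, yi in zip(X, y):
        p = sigmoid(sum(xk * bk for xk, bk in zip(row, beta)))
        # With y_i in {0, 1}, the factor is p when y_i = 1 and (1 - p) when y_i = 0.
        value *= p if yi == 1 else 1.0 - p
    return value

# Made-up toy data: column 0 is the assumed intercept column of ones.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 3.0]]
y = [0, 1, 0, 1]

print(likelihood([0.0, 0.0], X, y))   # every factor is 0.5 when beta = 0
print(likelihood([-2.0, 1.0], X, y))  # hypothetical coefficients
```

Because the product of many probabilities shrinks quickly towards zero, working with its logarithm is usually more practical.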
Log-Likelihood Function
$$ \ell(\beta; X, y) = \ln L(\beta; X, y) = \sum_{i=1}^{n} \left[ (y_i - 1) X_{i\bullet}\beta - \ln\left(1 + e^{-X_{i\bullet}\beta}\right) \right] $$
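The step from the likelihood to this sum can be filled in as follows. The shorthand $z_i = X_{i\bullet}\beta$ is introduced here only for readability and is not part of the original notation.

```latex
\begin{align*}
\ln \sigma(z_i) &= -\ln\left(1 + e^{-z_i}\right), \\
\ln\bigl(1 - \sigma(z_i)\bigr) &= \ln\frac{e^{-z_i}}{1 + e^{-z_i}}
  = -z_i - \ln\left(1 + e^{-z_i}\right), \\
y_i \ln \sigma(z_i) + (1 - y_i)\ln\bigl(1 - \sigma(z_i)\bigr)
  &= -y_i \ln\left(1 + e^{-z_i}\right) - (1 - y_i)\, z_i - (1 - y_i)\ln\left(1 + e^{-z_i}\right) \\
  &= (y_i - 1)\, z_i - \ln\left(1 + e^{-z_i}\right).
\end{align*}
```

Summing the last line over $i = 1, \dots, n$ gives the log-likelihood.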
Gradient of Log-Likelihood Function
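$$ \nabla \ell(\beta; X, y) = \begin{bmatrix} \partial \ell / \partial \beta_0 \\ \partial \ell / \partial \beta_1 \\ \vdots \\ \partial \ell / \partial \beta_d \end{bmatrix} $$
$$ \frac{\partial \ell}{\partial \beta_k} = \sum_{i=1}^{n} \left[ (y_i - 1) + \frac{e^{-X_{i\bullet}\beta}}{1 + e^{-X_{i\bullet}\beta}} \right] X_{ik} $$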
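Since e^{-z} / (1 + e^{-z}) = 1 - σ(z), the bracketed term in the partial derivative equals y_i - σ(X_i• β). The sketch below computes the gradient in that form and compares it with a central finite-difference approximation of the log-likelihood as a sanity check. The helper names, the toy data, and the evaluation point are assumptions made for this example.

```python
import math

def sigmoid(t):
    """Logistic function, evaluated without overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def log_likelihood(beta, X, y):
    """Sum over i of (y_i - 1) * X_i beta - ln(1 + e^{-X_i beta})."""
    total = 0.0
    for row, yi in zip(X, y):
        z = dot(row, beta)
        # ln(1 + e^{-z}), rearranged so exp never receives a large positive argument.
        log_term = math.log1p(math.exp(-z)) if z >= 0 else -z + math.log1p(math.exp(z))
        total += (yi - 1) * z - log_term
    return total

def gradient(beta, X, y):
    """k-th entry: sum over i of (y_i - sigma(X_i beta)) * X_ik."""
    grad = [0.0] * len(beta)
    for row, yi in zip(X, y):
        residual = yi - sigmoid(dot(row, beta))
        for k, xik in enumerate(row):
            grad[k] += residual * xik
    return grad

def numerical_gradient(beta, X, y, h=1e-6):
    """Central finite differences, used only as an independent check."""
    grad = []
    for k in range(len(beta)):
        up, down = list(beta), list(beta)
        up[k] += h
        down[k] -= h
        grad.append((log_likelihood(up, X, y) - log_likelihood(down, X, y)) / (2 * h))
    return grad

# Made-up toy data: column 0 is the assumed intercept column of ones.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.0], [1.0, 3.0]]
y = [0, 1, 0, 1]
beta = [-1.0, 0.5]  # arbitrary evaluation point

analytic = gradient(beta, X, y)
numeric = numerical_gradient(beta, X, y)
print("analytic:", analytic)
print("numeric: ", numeric)
```

The two printed vectors should agree to several decimal places if the formula is implemented correctly.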
Problem! There is no closed-form solution for the β that makes this zero!
Gradient Ascent
• Take a small step in the gradient direction
• Repeat until the gradient is zero
Algorithm for Logistic Regression
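A minimal sketch of gradient ascent applied to the logistic regression log-likelihood, not necessarily the exact algorithm presented in the lecture. The fixed step size, the stopping tolerance on the gradient norm, the iteration cap, and the toy data are all assumptions chosen for illustration. Each row of X is assumed to start with a constant 1 so that the first coefficient plays the role of the intercept.

```python
import math

def sigmoid(t):
    """Logistic function, evaluated without overflow."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    z = math.exp(t)
    return z / (1.0 + z)

def gradient(beta, X, y):
    """Gradient of the log-likelihood: sum over i of (y_i - sigma(X_i beta)) * X_i."""
    grad = [0.0] * len(beta)
    for row, yi in zip(X, y):
        residual = yi - sigmoid(sum(x * b for x, b in zip(row, beta)))
        for k, xik in enumerate(row):
            grad[k] += residual * xik
    return grad

def gradient_ascent(X, y, step=0.1, tol=1e-6, max_iter=100_000):
    """Step along the gradient until its norm drops below tol or max_iter is reached."""
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        grad = gradient(beta, X, y)
        if math.sqrt(sum(g * g for g in grad)) < tol:
            break  # gradient is numerically zero: stop
        beta = [b + step * g for b, g in zip(beta, grad)]
    return beta

# Made-up toy data: column 0 is the assumed intercept column of ones.
# The labels overlap in x, so the maximum-likelihood estimate is finite.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 2.0], [1.0, 2.5], [1.0, 3.0]]
y = [0, 0, 1, 0, 1, 1]

beta_hat = gradient_ascent(X, y)
print("estimated beta:", [round(b, 3) for b in beta_hat])
```

In floating point the gradient rarely becomes exactly zero, which is why the sketch stops once its norm falls below a small tolerance. If the two classes were perfectly separable in x, the coefficients would grow without bound and only the iteration cap would end the loop.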