Logistic Regression
BU MET CS-677: Data Science With Python, v.2.0
Overview

[Figure: introductory scatter plot of data points in the (X, Y) plane]
logical "and"

[Figure: the four points (x1, x2) of the logical "and"; Class 0 (False) and Class 1 (True) are separated by the line x1 + x2 = 1.5]

• linearly separable
logical "or"

[Figure: the four points (x1, x2) of the logical "or"; Class 0 (False) and Class 1 (True) are separated by the line x1 + x2 = 0.5]

• linearly separable
logical "xor"

[Figure: the four points (x1, x2) of the logical "xor"; Class 0 (False) and Class 1 (True)]

• not linearly separable: no single line can put both True points on one side
Binary Classification

H : X ↦ {0, 1}
Background: Linear Regression
Simple Linear Regression

• fit a line y = b0 + b1 x
• choose b0, b1 to minimize the sum of squared errors over the n samples:

Σ_{i=1}^{n} (yi − b0 − b1 xi)²
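As a quick refresher, here is a minimal sketch (not from the slides) that fits such a line by least squares; the names b0, b1 follow the notation above:

import numpy as np

# toy data: y is roughly 2x + 1 plus noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# np.polyfit minimizes the sum of squared errors; degree 1 returns (b1, b0)
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x   # fitted values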
BU MET CS-677: Data Science With Python, v.2.0 Logistic Regression
Classification Problem
• how do we transform a
linear prediction model to
classification problem?
Page 9
BU MET CS-677: Data Science With Python, v.2.0 Logistic Regression
Issues

• linear regression predicts continuous variables
• classification requires discrete labels
• solution: pass the linear output through the logistic (sigmoid) function
logit Function

1 / (1 + exp(−x)) = exp(x) / (1 + exp(x))
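A quick NumPy sketch (illustrative, not from the slides) confirming that the two forms above agree:

import numpy as np

def sigmoid(x):
    # 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-10, 10, 41)
assert np.allclose(sigmoid(x), np.exp(x) / (1 + np.exp(x)))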
• define odds as

odds = P / (1 − P)

• ex. 1: P = 0.25 → odds = 1/3
• ex. 2: P = 0.50 → odds = 1
• ex. 3: P = 0.75 → odds = 3/1
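The three examples are easy to verify (a throwaway check, not from the slides):

for P in (0.25, 0.50, 0.75):
    print(P, '->', P / (1 - P))   # 0.333..., 1.0, 3.0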
Main Idea:

• estimate logit(odds)
• use regression

log(P / (1 − P)) = b0 + b1 x

P / (1 − P) = exp(b0 + b1 x)

P = exp(b0 + b1 x) / (1 + exp(b0 + b1 x))

• note:

exp(b0 + b1 x) / (1 + exp(b0 + b1 x)) = 1 / (1 + exp(−(b0 + b1 x)))
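A small numeric sketch of this algebra (the coefficients below are made up): the regression output is a log-odds, and the sigmoid maps it back to a probability.

import numpy as np

b0, b1 = -3.0, 0.5            # hypothetical coefficients
x = 4.0
log_odds = b0 + b1 * x        # log(P / (1 - P))
P = np.exp(log_odds) / (1 + np.exp(log_odds))
P_alt = 1 / (1 + np.exp(-(b0 + b1 * x)))   # the equivalent form from the note
assert np.isclose(P, P_alt)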
Illustration

[Figure]
Logistic Regression

[Diagram: inputs 1 (bias), x1, x2 with weights w0, w1, w2 feed a sigmoid unit; y = w0 + w1x1 + w2x2]

• supervised learning
• estimate label probabilities by a sigmoid function
Linear Regression

[Diagram: the same inputs and weights produce the predicted y directly; y = w0 + w1x1 + w2x2]
Linear Separability

[Figure: sepal-width vs sepal-length for Iris-virginica, Iris-versicolor, and Iris-setosa]

• draw a hyperplane
• difficult in many cases
Logistic Regression

• dependent variable is the class label (categorical)
• output is a weighted sum of inputs:

y = w0 + w1x1 + · · · + wmxm

• the weighted sum is passed through a sigmoid function:

z(y) = 1 / (1 + e^(−y))

• assign labels based on z(y)
[Figure: the sigmoid z(y) plotted for y from −10 to 10]
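Combining these bullets, a minimal NumPy sketch of the prediction step (the weights and inputs below are made up):

import numpy as np

def predict(x, w):
    # w = (w0, w1, ..., wm); x = (x1, ..., xm)
    y = w[0] + np.dot(w[1:], x)      # weighted sum of inputs
    z = 1.0 / (1.0 + np.exp(-y))     # sigmoid
    return 1 if z >= 0.5 else 0      # label from z(y)

w = np.array([0.1, 0.7, -0.5])       # hypothetical weights
print(predict(np.array([5.8, 4.2]), w))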
A Numerical Dataset
ipdb> data
id Height Weight Foot Label
0 1 5.00 100 6 green
1 2 5.50 150 8 green
2 3 5.33 130 7 green
3 4 5.75 150 9 green
4 5 6.00 180 13 red
5 6 5.92 190 11 red
6 7 5.58 170 12 red
7 8 5.92 165 10 red
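For reference, a pandas sketch that reconstructs this table (values copied from the output above):

import pandas as pd

data = pd.DataFrame(
    {'id': [1, 2, 3, 4, 5, 6, 7, 8],
     'Height': [5.00, 5.50, 5.33, 5.75, 6.00, 5.92, 5.58, 5.92],
     'Weight': [100, 150, 130, 150, 180, 190, 170, 165],
     'Foot': [6, 8, 7, 9, 13, 11, 12, 10],
     'Label': ['green'] * 4 + ['red'] * 4})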
A Dataset Illustration

[Figure: 3-D scatter of Height, Weight, and Foot for the eight samples]
A New Instance

[Figure: the same 3-D scatter with the new instance added]
Separability in Detail

w0 = 1.25, w1 = 4.25, w2 = 2.75

[Figure: Foot (F) vs Height (H); Class 0 and Class 1 training points, the decision line given by these weights, and the new instance x∗]
[Figure: probability z(y) = 1 / (1 + e^(−y)) against predicted y; the Class 0 and Class 1 training points and the new instance y∗ are placed on the curve]
Summary of Logistic Regression

• feature vector: X = (1, x1, …, xm)
How to Compute W?

[Figure: cost curves −log(z) and −log(1 − z) over h(y) from 0 to 1]

• maximize likelihood

L = ∏_X h(X)^C(X) · [1 − h(X)]^(1 − C(X))

where C(X) ∈ {0, 1} is the true class of X and h(X) is the predicted probability.
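In practice one minimizes the negative log of L (the cross-entropy cost). A minimal sketch with made-up numbers:

import numpy as np

def cost(h, c):
    # -log L = -sum( C(X) log h(X) + (1 - C(X)) log(1 - h(X)) )
    return -np.sum(c * np.log(h) + (1 - c) * np.log(1 - h))

h = np.array([0.9, 0.2, 0.7])   # predicted probabilities h(X)
c = np.array([1, 0, 1])         # true classes C(X)
print(cost(h, c))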
Cost Function

[Figure: the per-sample cost curves −log(z) and −log(1 − z) over h(y) from 0 to 1]

cost(X) = −C(X) log h(X) − (1 − C(X)) log(1 − h(X))
Cost Intuition

[Figure: the same cost curves]

• if C(X) = 1, the cost is −log h(X): near 0 when h(X) → 1, large when h(X) → 0
• if C(X) = 0, the cost is −log(1 − h(X)): near 0 when h(X) → 0, large when h(X) → 1
Computing Gradient
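For reference, differentiating −log L with respect to each weight, using the sigmoid hypothesis h, gives the standard result (in LaTeX notation):

\frac{\partial(-\log L)}{\partial w_j} \;=\; \sum_{X} \bigl(h(X) - C(X)\bigr)\, x_j

where x_j is the j-th component of X (with x_0 = 1 for the bias term). Gradient descent then updates each w_j by subtracting this partial derivative scaled by a learning rate.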
Computing Weights

[Figure: Class 0 and Class 1 points in the (Height, Foot) plane with the learned decision boundary]
iterations w0 w1 w2 accuracy
100 0.021 0.084 -0.084 50%
250 0.053 0.221 -0.166 75%
1000 0.177 0.741 -0.482 100%
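A hedged sketch of the training loop behind this table (learning rate and initialization are my assumptions, so it need not reproduce the exact numbers above):

import numpy as np

def train(X, c, lr=0.01, iterations=1000):
    # X: n x m feature matrix, c: 0/1 class labels
    n, m = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])     # prepend the bias column x0 = 1
    w = np.zeros(m + 1)                      # (w0, w1, ..., wm)
    for _ in range(iterations):
        h = 1.0 / (1.0 + np.exp(-(Xb @ w)))  # predicted probabilities
        w -= lr * (Xb.T @ (h - c))           # gradient step on -log L
    return w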
Logistic Regression (original dataset)

[Figure: decision regions over (scaled) Height and (scaled) Weight]

• predict(x∗) = red
• accuracy = 100%
log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)
predicted = log_reg_classifier.predict(new_x)   # new_x: the new instance
accuracy = log_reg_classifier.score(X, Y)

ipdb> predicted[0]
red
ipdb> accuracy
0.875
F/W/H Change

[Figure: 3-D scatter of Height, Weight, and Foot for the modified dataset]
Logistic Regression (modified dataset)

[Figure: decision regions over (scaled) Height and (scaled) Weight]
• predict(x∗)=green
• accuracy = 87.5%
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X = data[['Height', 'Weight']].values      # features (assumed from the axes above)
scaler = StandardScaler().fit(X)
X = scaler.transform(X)
Y = data['Label'].values
log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)
new_x = scaler.transform(np.array([[6, 160]]))   # np.asmatrix is gone in NumPy >= 2.0
predicted = log_reg_classifier.predict(new_x)
accuracy = log_reg_classifier.score(X, Y)

ipdb> predicted[0]
green
ipdb> accuracy
0.875
Categorical Dataset: Change to Dummy Variables

                Weather          Temperature      Wind
Day   overcast  rainy  sunny   cold  hot  mild   high  low
  1       0       0      1       0    1    0      0     1
  2       0       1      0       0    0    1      1     0
  3       0       0      1       1    0    0      0     1
  4       0       1      0       1    0    0      1     0
  5       0       0      1       1    0    0      1     0
  6       1       0      0       0    0    1      0     1
  7       0       0      1       0    1    0      0     1
  8       1       0      0       0    1    0      1     0
  9       0       1      0       0    1    0      1     0
 10       0       1      0       0    0    1      0     1
Python Code

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

data = pd.DataFrame(
    {'Day': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'Weather': ['sunny', 'rainy', 'sunny', 'rainy',
                 'sunny', 'overcast', 'sunny', 'overcast',
                 'rainy', 'rainy'],
     'Temperature': ['hot', 'mild', 'cold', 'cold', 'cold',
                     'mild', 'hot', 'hot', 'hot', 'mild'],
     'Wind': ['low', 'high', 'low', 'high', 'high',
              'low', 'low', 'high', 'high', 'low'],
     'Play': ['no', 'yes', 'yes', 'no', 'yes',
              'yes', 'yes', 'yes', 'no', 'yes']},
    columns=['Day', 'Weather', 'Temperature', 'Wind', 'Play'])

# one-hot encode each categorical feature, then concatenate
input_data = data[['Weather', 'Temperature', 'Wind']]
dummies = [pd.get_dummies(data[c]) for c in input_data.columns]
binary_data = pd.concat(dummies, axis=1)
X = binary_data[0:10].values

# encode the label ('no'/'yes') as 0/1
le = LabelEncoder()
Y = le.fit_transform(data['Play'].values)

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)
prediction = log_reg_classifier.predict(X)
accuracy = log_reg_classifier.score(X, Y)

ipdb> prediction[0]
1
ipdb> accuracy
0.8
Iris Histograms

[Figure: per-feature histograms (sepal length, sepal width, petal length, petal width, in cm) colored by class: Setosa, Versicolor, Virginica]
Iris Dataset

[Figure: 3-D scatter of petal-length vs sepal-length vs sepal-width for Iris-setosa, Iris-versicolor, and Iris-virginica]
le = LabelEncoder()
Y = le.fit_transform(data['Class'].values)

ipdb> accuracy
1.0
[Figure: decision regions over the scaled sepal features]

• accuracy = 100%
• easy to separate
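A hedged sklearn sketch of this experiment (my reconstruction; the slides' exact preprocessing isn't shown): Iris-setosa against the other two classes, using the two scaled sepal features. Setosa is linearly separable from the rest, so accuracy near 100% is expected.

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = StandardScaler().fit_transform(iris.data[:, :2])  # sepal length and width, scaled
Y = (iris.target == 0).astype(int)                    # 1 = setosa, 0 = the rest

clf = LogisticRegression().fit(X, Y)
print(clf.score(X, Y))   # close to 1.0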
le = LabelEncoder()
Y = le.fit_transform(data['Class'].values)

ipdb> accuracy
0.68
[Figure: decision regions over the scaled sepal features]

• accuracy = 68%
• difficult to separate
Concepts Check: