
BU MET CS-677: Data Science With Python, v.2.0

LOGISTIC REGRESSION

Overview
[Scatter plot: two classes of points in the X-Y plane, both axes running 0 to 6]

• want to separate the classes
• want a smooth decision function

Example: "AND" Gate

[Plot: the logical "and" truth table in the (x1, x2) plane; the line x1 + x2 = 1.5 separates Class 1 (True) at (1,1) from Class 0 (False) at the other three corners]

• linearly separable


Example: "OR" Gate

[Plot: the logical "or" truth table in the (x1, x2) plane; the line x1 + x2 = 0.5 separates Class 0 (False) at (0,0) from Class 1 (True) at the other three corners]

• linearly separable


Example: "XOR" Gate

[Plot: the logical "xor" truth table in the (x1, x2) plane; Class 1 (True) at (0,1) and (1,0), Class 0 (False) at (0,0) and (1,1); no single line separates the classes]

• not linearly separable


Binary Classification

• training set S with labels {0, 1}
• find a classifier H:

  H : X → {0, 1}

• low generalization error
• linear classification (based on logistic regression)
• the dividing (hyper)plane is called a linear discriminant


Background: Linear Regression

• example of a Generalized Linear Model (GLM)

[Figure reprinted from www.kdnuggets.com with explicit permission of the editor]


Simple Linear Regression

• choose the line that minimizes the loss

  Loss = Σ_{i=1..N} (yi − ŷi)²

[Figure reprinted from www.kdnuggets.com with explicit permission of the editor]


Classification Problem

• how do we transform a linear prediction model into a classification model?

[Figure reprinted from www.kdnuggets.com with explicit permission of the editor]


Issues
• linear regression: continuous variables
• classification: discrete labels
• probabilities must be in [0, 1]
• solution:

  linear regression → classification

• how: model the log-odds (logit) linearly and invert it with the logistic (sigmoid) function


Logistic (Sigmoid) Function

  1 / (1 + exp(−x)) = exp(x) / (1 + exp(x))

This is the inverse of the logit function.

[Figure reprinted from www.kdnuggets.com with explicit permission of the editor]


Probability and Odds


• assume probability P
• define the odds as

  odds = P / (1 − P)

• ex.1: P = 0.25 → odds = 1/3
• ex.2: P = 0.50 → odds = 1
• ex.3: P = 0.75 → odds = 3/1 (sketch below)
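A minimal sketch of the probability-odds conversion in both directions (plain Python, using the three example values above):

def odds(p):
    # odds in favor of an event with probability p
    return p / (1 - p)

def prob(o):
    # invert: recover the probability from the odds
    return o / (1 + o)

for p in (0.25, 0.50, 0.75):
    o = odds(p)
    print(p, o, prob(o))   # prob(odds(p)) returns the original p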


Main Idea:
• estimate the log-odds (the logit of P)
• use regression:

  log(P / (1 − P)) = b0 + b1·x
  P / (1 − P) = exp(b0 + b1·x)
  P = exp(b0 + b1·x) / (1 + exp(b0 + b1·x))

• note (checked numerically below):

  exp(b0 + b1·x) / (1 + exp(b0 + b1·x)) = 1 / (1 + exp(−(b0 + b1·x)))
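As a quick numerical check of the last identity, here is a sketch with made-up coefficients (b0 = 0.5 and b1 = 2.0 are illustrative values, not fitted ones):

import math

b0, b1 = 0.5, 2.0          # hypothetical coefficients, for illustration only
x = 1.2
log_odds = b0 + b1 * x     # the linear model of the logit

p1 = math.exp(log_odds) / (1 + math.exp(log_odds))   # odds form
p2 = 1 / (1 + math.exp(-log_odds))                   # sigmoid form
print(p1, p2)              # both print ~0.9478: the two forms agree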


Illustration

[Figure reprinted from www.kdnuggets.com with explicit permission of the editor]


Logistic Regression

[Diagram: inputs 1, x1, x2 with weights w0, w1, w2 feed the sum y = w0 + w1x1 + w2x2, which passes through z = 1/(1 + e^−y) to produce the output label probability]

• supervised learning
• estimate label probabilities with the sigmoid function

Linear Regression

[Diagram: inputs 1, x1, x2 with weights w0, w1, w2 produce the predicted value y = w0 + w1x1 + w2x2 directly]

• real-valued output from a weighted sum of inputs


Linear vs. Logistic


• linear regression:
  1. estimate w0, w1, . . . , wn by minimizing the squared error
  2. predict y = w0 + w1x1 + · · · + wnxn
• logistic regression:
  1. estimate w0, w1, . . . , wn by maximizing the likelihood (see "How to Compute W?" below)
  2. compute y = w0 + w1x1 + · · · + wnxn
  3. apply the sigmoid function z(y) to compute label probabilities


Linear Separability

[Scatter plot: sepal-width vs. sepal-length for Iris-setosa, Iris-versicolor, and Iris-virginica]

• draw a hyperplane
• difficult in many cases

Logistic Regression
• dependent variable is the class label (categorical)
• output is a weighted sum of inputs:

  y = w0 + w1x1 + · · · + wmxm

• the weighted sum is passed through a sigmoid function:

  z(y) = 1 / (1 + e^−y)

• assign labels based on z(y)


Sigmoid Function z(y)

[Plot: z(y) = 1/(1 + e^−y) for y in [−10, 10]; the probability z(y) rises from 0 to 1 and crosses 0.5 at y = 0]

• z(y) > 0.5 if y > 0 (class 1)
• z(y) < 0.5 if y < 0 (class 0); see the sketch below
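A tiny sketch of this thresholding rule (the sample y values are arbitrary; ties at exactly 0.5 are sent to class 0 here, a convention the slides leave unspecified):

import math

def z(y):
    # sigmoid: maps any real y to a probability in (0, 1)
    return 1 / (1 + math.exp(-y))

for y in (-4, -1, 0, 1, 4):
    label = 1 if z(y) > 0.5 else 0
    print(y, round(z(y), 3), label)   # negative y -> class 0, positive y -> class 1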

A Numerical Dataset

object xi   Height (H)   Weight (W)   Foot (F)   Label (L)
x1          5.00         100           6         green
x2          5.50         150           8         green
x3          5.33         130           7         green
x4          5.75         150           9         green
x5          6.00         180          13         red
x6          5.92         190          11         red
x7          5.58         170          12         red
x8          5.92         165          10         red


Code for the Dataset


import pandas as pd

data = pd.DataFrame(
    {'id': [1, 2, 3, 4, 5, 6, 7, 8],
     'Label': ['green', 'green', 'green', 'green',
               'red', 'red', 'red', 'red'],
     'Height': [5.00, 5.50, 5.33, 5.75,
                6.00, 5.92, 5.58, 5.92],
     'Weight': [100, 150, 130, 150,
                180, 190, 170, 165],
     'Foot': [6, 8, 7, 9, 13, 11, 12, 10]},
    columns=['id', 'Height', 'Weight', 'Foot', 'Label'])

ipdb> data
id Height Weight Foot Label
0 1 5.00 100 6 green
1 2 5.50 150 8 green
2 3 5.33 130 7 green
3 4 5.75 150 9 green
4 5 6.00 180 13 red
5 6 5.92 190 11 red
6 7 5.58 170 12 red
7 8 5.92 165 10 red


A Dataset Illustration

[3D scatter plot: the eight points plotted by Height, Weight, and Foot; the green and red classes form two distinct groups]

A New Instance

[3D scatter plot: the same eight points by Height, Weight, and Foot, with the new instance added]

(H=6, W=160, F=10) → ?


Separability in Detail

[Plot: Foot (F) vs. Height (H) for the eight points (Class 0 vs. Class 1), the new instance x*, and the decision boundary given by w0 = 1.25, w1 = 4.25, w2 = 2.75]


Computing Class Labels

[Plot: each point's predicted y mapped through z(y) = 1/(1 + e^−y); the new instance lands at y* on the class-0 side]

• z(y*) < 0.5 → "red" (class 0)



Summary of Logistic Regression
• feature vector: X = (1, x1, . . . , xm)
• weights: W = (w0, w1, . . . , wm)
• compute

  y = W · X = w0 + w1x1 + · · · + wmxm

• compute probability h(x):

  h(x) = 1 / (1 + e^−W·X)

• assign label C(X) (sketch below):

  C(X) = 1 if h(x) > 0.5
  C(X) = 0 if h(x) < 0.5
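A compact NumPy sketch of these three steps, reusing the 1000-iteration weights from the "Computing Weights" slide below and the new instance's Height and Foot, so the bias-augmented feature vector is (1, 6, 10):

import numpy as np

W = np.array([0.177, 0.741, -0.482])   # (w0, w1, w2) from the weights table below
X = np.array([1.0, 6.0, 10.0])         # (1, x1, x2): bias term first

y = W @ X                              # y = W . X  (about -0.20)
h = 1 / (1 + np.exp(-y))               # h(x) (about 0.45)
label = int(h > 0.5)                   # 0, i.e. "red", matching the previous slide
print(y, h, label)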

How to Compute W?

[Plot: the cost curves −log(z) and −log(1 − z) against h(y) on [0, 1]]

• maximize the likelihood

  L = Π_X h(X)^C(X) · [1 − h(X)]^(1−C(X))


Cost Function
[Plot: the same cost curves −log(z) and −log(1 − z) against h(y)]

• minimize the cost ("loss"); a sketch follows:

  Q = − Σ_X [ C(X) log(h(X)) + (1 − C(X)) log(1 − h(X)) ]
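A direct transcription of Q as a sketch (h holds the model's probabilities, C the 0/1 labels; the sample numbers are made up to show the behavior):

import numpy as np

def cost(h, C):
    # Q = -sum over X of [ C*log(h) + (1 - C)*log(1 - h) ]
    h, C = np.asarray(h, float), np.asarray(C, float)
    return -np.sum(C * np.log(h) + (1 - C) * np.log(1 - h))

print(cost([0.9, 0.1], [1, 0]))   # confident and correct: ~0.21
print(cost([0.1, 0.9], [1, 0]))   # confident and wrong:   ~4.61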


Cost Intuition
[Plot: the cost curves −log(z) and −log(1 − z) once more]

• correct classification: cost → 0
• misclassification: cost → ∞


Computing Gradient

• gradient (with respect to wi):

  ∂Q/∂wi = Σ_X [ h(X) − C(X) ] · xi

• computation of weights (as sketched below):
  1. initialize the weights
  2. (simultaneously) update

     wi = wi − α Σ_X [ h(X) − C(X) ] xi

  3. α is the learning rate
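Here is a batch gradient-descent sketch of that update on the Height/Foot columns of the numerical dataset, with green coded as class 1 and red as class 0. The learning rate and iteration count echo the next slide, but the resulting weights may differ slightly from the table there, since the slides do not specify the initialization:

import numpy as np

# Height (H) and Foot (F) for the eight points; green = 1, red = 0
X = np.array([[5.00, 6], [5.50, 8], [5.33, 7], [5.75, 9],
              [6.00, 13], [5.92, 11], [5.58, 12], [5.92, 10]])
C = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)

Xb = np.hstack([np.ones((len(X), 1)), X])   # prepend the bias column: (1, x1, x2)
w = np.zeros(Xb.shape[1])                   # initialize weights (assumed: zeros)
alpha = 0.003                               # learning rate

for _ in range(1000):
    h = 1 / (1 + np.exp(-(Xb @ w)))         # h(X) for all points at once
    w -= alpha * (Xb.T @ (h - C))           # wi <- wi - alpha * sum[(h(X) - C(X)) xi]

pred = (1 / (1 + np.exp(-(Xb @ w))) > 0.5).astype(int)
print(w, np.mean(pred == C))                # learned weights and training accuracy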


Computing Weights

[Plot: Foot vs. Height with the decision boundary after 100, 250, and 1000 iterations at learning rate 0.003]

iterations w0 w1 w2 accuracy
100 0.021 0.084 -0.084 50%
250 0.053 0.221 -0.166 75%
1000 0.177 0.741 -0.482 100%


Effect of Lower Rate

[Plot: decision boundaries after 100, 250, and 1000 iterations at learning rate 0.001]

• a lower learning rate needs more iterations


Effect of Higher Rate

[Plot: decision boundaries after 100, 250, and 1000 iterations at learning rate 0.01]

• higher accuracy for the same number of iterations

Logistic Regression (original dataset)

[Plot: decision boundary in (scaled) Height vs. (scaled) Weight space for the original dataset, with the new instance x* on the red side]

• predict(x*) = red
• accuracy = 100%

Code: Log. Regression


import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame(
    {'id': [1, 2, 3, 4, 5, 6, 7, 8],
     'Label': ['green', 'green', 'green', 'green',
               'red', 'red', 'red', 'red'],
     'Height': [5.00, 5.50, 5.33, 5.75, 6.00, 5.92, 5.58, 5.92],
     'Weight': [100, 150, 130, 150, 180, 190, 170, 165],
     'Foot': [6, 8, 7, 9, 13, 11, 12, 10]},
    columns=['id', 'Height', 'Weight', 'Foot', 'Label'])

# use Height and Weight as features; scale to zero mean, unit variance
X = data[['Height', 'Weight']].values
scaler = StandardScaler()
scaler.fit(X)
X = scaler.transform(X)
Y = data['Label'].values

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)

# the new instance must be scaled with the same scaler (2-D array, one row)
new_x = scaler.transform(np.array([[6, 160]]))
predicted = log_reg_classifier.predict(new_x)
accuracy = log_reg_classifier.score(X, Y)

ipdb> predicted[0]
red
ipdb> accuracy
0.875


F/W/H Change

[3D scatter plot: the dataset with point 1 moved; Height, Weight, and Foot axes as before]

id   Height   Weight      Foot     Label
1    5 → 6    100 → 170   6 → 10   green

(H=6, W=160, F=10) → ?



Logistic Regression (modified dataset)

[Plot: decision boundary in (scaled) Height vs. (scaled) Weight space for the modified dataset, with the new instance x* now on the green side]

• predict(x*) = green
• accuracy = 87.5%

Code: Log. Regression (modified dataset)

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

data = pd.DataFrame(
    {'id': [1, 2, 3, 4, 5, 6, 7, 8],
     'Label': ['green', 'green', 'green', 'green',
               'red', 'red', 'red', 'red'],
     'Height': [5.00, 5.50, 5.33, 5.75, 6.00, 5.92, 5.58, 5.92],
     'Weight': [100, 150, 130, 150, 180, 190, 170, 165],
     'Foot': [6, 8, 7, 9, 13, 11, 12, 10]},
    columns=['id', 'Height', 'Weight', 'Foot', 'Label'])

# modify the first point in place (data.loc avoids pandas' chained-assignment trap)
data.loc[0, 'Height'] = 6
data.loc[0, 'Weight'] = 170
data.loc[0, 'Foot'] = 10

X = data[['Height', 'Weight']].values
scaler = StandardScaler().fit(X)
X = scaler.transform(X)
Y = data['Label'].values

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)

new_x = scaler.transform(np.array([[6, 160]]))
predicted = log_reg_classifier.predict(new_x)
accuracy = log_reg_classifier.score(X, Y)

ipdb> predicted[0]
green
ipdb> accuracy
0.875

Categorical Dataset

Day Weather Temperature Wind Play


1 sunny hot low no
2 rainy mild high yes
3 sunny cold low yes
4 rainy cold high no
5 sunny cold high yes
6 overcast mild low yes
7 sunny hot low yes
8 overcast hot high yes
9 rainy hot high no
10 rainy mild low yes

• x* = (sunny, cold, low) → ?
• need numeric values for the attributes

Change to Dummy Variables

      Weather                 Temperature      Wind
Day   overcast rainy sunny    cold hot mild    high low
1        0      0     1        0    1   0       0    1
2        0      1     0        0    0   1       1    0
3        0      0     1        1    0   0       0    1
4        0      1     0        1    0   0       1    0
5        0      0     1        1    0   0       1    0
6        1      0     0        0    0   1       0    1
7        0      0     1        0    1   0       0    1
8        1      0     0        0    1   0       1    0
9        0      1     0        0    1   0       1    0
10       0      1     0        0    0   1       0    1
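This one-hot table is what pd.get_dummies produces, with columns ordered alphabetically within each attribute; a quick sketch for the Weather column alone:

import pandas as pd

weather = pd.Series(['sunny', 'rainy', 'sunny', 'rainy', 'sunny',
                     'overcast', 'sunny', 'overcast', 'rainy', 'rainy'])
# .astype(int) gives 0/1 instead of the booleans newer pandas returns by default
print(pd.get_dummies(weather).astype(int))   # columns: overcast, rainy, sunny
# row 0 (sunny) -> 0 0 1, matching the first Weather triple in the table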


Python Code
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder

data = pd.DataFrame(
    {'Day': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'Weather': ['sunny', 'rainy', 'sunny', 'rainy',
                 'sunny', 'overcast', 'sunny', 'overcast',
                 'rainy', 'rainy'],
     'Temperature': ['hot', 'mild', 'cold', 'cold', 'cold',
                     'mild', 'hot', 'hot', 'hot', 'mild'],
     'Wind': ['low', 'high', 'low', 'high', 'high',
              'low', 'low', 'high', 'high', 'low'],
     'Play': ['no', 'yes', 'yes', 'no', 'yes',
              'yes', 'yes', 'yes', 'no', 'yes']},
    columns=['Day', 'Weather', 'Temperature', 'Wind', 'Play'])

input_data = data[['Weather', 'Temperature', 'Wind']]
dummies = [pd.get_dummies(data[c]) for c in input_data.columns]
binary_data = pd.concat(dummies, axis=1)
X = binary_data[0:10].values

le = LabelEncoder()
Y = le.fit_transform(data['Play'].values)   # no -> 0, yes -> 1

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X, Y)

# sunny -> (0,0,1), cold -> (1,0,0), low -> (0,1)
new_instance = np.array([[0, 0, 1, 1, 0, 0, 0, 1]])
prediction = log_reg_classifier.predict(new_instance)
accuracy = log_reg_classifier.score(X, Y)

ipdb> prediction[0]
1
ipdb> accuracy
0.8

Iris Histograms

[Figure: four histograms of the Iris features, sepal length, sepal width, petal length, and petal width (in cm), with per-class counts for Setosa, Versicolor, and Virginica]


Iris Dataset:

[3D scatter plot: the three Iris classes plotted by sepal-length, sepal-width, and petal-length]


Iris: Python Code (setosa vs. versicolor)


import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

url = r'https://archive.ics.uci.edu/ml/' + \
      r'machine-learning-databases/iris/iris.data'

data = pd.read_csv(url, names=['sepal-length', 'sepal-width',
                               'petal-length', 'petal-width', 'Class'])

features = ['sepal-length', 'sepal-width']
class_labels = ['Iris-setosa', 'Iris-versicolor']

# keep only the two classes being compared (binary classification)
data = data[data['Class'].isin(class_labels)]

X = data[features].values
le = LabelEncoder()
Y = le.fit_transform(data['Class'].values)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.5, random_state=3)

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X_train, Y_train)

prediction = log_reg_classifier.predict(X_test)
accuracy = np.mean(prediction == Y_test)

ipdb> accuracy
1.0


Iris: Logistic Regression


[Plot: decision boundary for Iris-setosa vs. Iris-versicolor in (scaled) sepal-length vs. (scaled) sepal-width]

• accuracy = 100%
• easy to separate

Iris: Python Code (versicolor vs. virginica)


import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

url = r'https://archive.ics.uci.edu/ml/' + \
      r'machine-learning-databases/iris/iris.data'

data = pd.read_csv(url, names=['sepal-length', 'sepal-width',
                               'petal-length', 'petal-width', 'Class'])

features = ['sepal-length', 'sepal-width']
class_labels = ['Iris-versicolor', 'Iris-virginica']

# keep only the two classes being compared (binary classification)
data = data[data['Class'].isin(class_labels)]

X = data[features].values
le = LabelEncoder()
Y = le.fit_transform(data['Class'].values)

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.5, random_state=3)

log_reg_classifier = LogisticRegression()
log_reg_classifier.fit(X_train, Y_train)

prediction = log_reg_classifier.predict(X_test)
accuracy = np.mean(prediction == Y_test)

ipdb> accuracy
0.68


Iris: Logistic Regression


[Plot: decision boundary for Iris-versicolor vs. Iris-virginica in (scaled) sepal-length vs. (scaled) sepal-width; the classes overlap heavily]

• accuracy = 68%
• difficult to separate

Concepts Check:

(a) linear separability


(b) logistic vs. linear regression
(c) odds and logit function
(d) computing weights
(e) analysis of categorical data
