Logistic Regression: Sigmoid Function

This document summarizes the key steps in logistic regression modeling. It introduces the sigmoid function used to model probabilities, the likelihood function, and the log-likelihood function, then derives the gradient of the log-likelihood used for gradient-ascent optimization of the coefficients. In a worked example, the coefficients are initialized to zero, the log-likelihood is computed, the coefficients are updated over 100 iterations of gradient ascent, and the final log-likelihood and coefficients are reported. Performance metrics such as precision, recall, and accuracy are then calculated from the predicted and actual classes.


Logistic Regression

Sigmoid Function
$$P(Y) = \frac{1}{1 + e^{-(\theta_0 + \theta_i X_i)}} = \sigma(\theta^T x_i)$$
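
As a reference, a minimal NumPy sketch of the sigmoid (the function and variable names are illustrative, not from the original document):

```python
import numpy as np

def sigmoid(z):
    """Map a real-valued score z = theta^T x onto a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))
```

With all-zero coefficients, sigmoid(0.0) returns 0.5, which is why every first-iteration prediction later in the document equals 0.5.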

Likelihood Function
$$L(\theta) = \prod_{i=1}^{n} P(Y = y_i \mid X = x_i) = \prod_{i=1}^{n} \sigma(\theta^T x_i)^{y_i} \, [1 - \sigma(\theta^T x_i)]^{1 - y_i}$$

Log of Likelihood Function


$$\ell(\theta) = \sum_{i=1}^{n} y_i \log(\sigma(\theta^T x_i)) + (1 - y_i) \log[1 - \sigma(\theta^T x_i)]$$
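
Translated directly into NumPy, the log-likelihood is a few lines. This is a minimal sketch (names are illustrative, and X is assumed to already contain any intercept column):

```python
import numpy as np

def log_likelihood(theta, X, y):
    """Log-likelihood of coefficients theta for features X (n x d) and labels y (n,)."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))  # sigma(theta^T x_i) for every sample
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
```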

Gradient of Log Likelihood Function


$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \frac{\partial}{\partial \theta_j} \, y \log(\sigma(\theta^T x)) + \frac{\partial}{\partial \theta_j} \, (1 - y) \log[1 - \sigma(\theta^T x)]$$

$$= \left[ \frac{y}{\sigma(\theta^T x)} - \frac{1 - y}{1 - \sigma(\theta^T x)} \right] \frac{\partial}{\partial \theta_j} \, \sigma(\theta^T x)$$

$$= \left[ \frac{y}{\sigma(\theta^T x)} - \frac{1 - y}{1 - \sigma(\theta^T x)} \right] \sigma(\theta^T x) \, [1 - \sigma(\theta^T x)] \, x_j$$

$$= \left[ \frac{y - \sigma(\theta^T x)}{\sigma(\theta^T x) \, [1 - \sigma(\theta^T x)]} \right] \sigma(\theta^T x) \, [1 - \sigma(\theta^T x)] \, x_j$$

Summing over all n samples:

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{n} [\, y^i - \sigma(\theta^T x^i) \,] \, x_j^i$$
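
Stacking the partial derivatives for every coefficient j turns the final sum into a single matrix product, X^T (y − p). A sketch under the same assumptions as above:

```python
import numpy as np

def gradient(theta, X, y):
    """Gradient of the log-likelihood: one partial derivative per coefficient."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))  # predicted probabilities sigma(theta^T x_i)
    return X.T @ (y - p)                  # sum_i (y_i - p_i) * x_i, vectorized
```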

Gradient Ascent Optimization


$$\theta_j^{(t+1)} = \theta_j^{(t)} + C \cdot \sum_{i=1}^{n} [\, y^i - \sigma(\theta^T x^i) \,] \, x_j^i$$

where $C$ is the step size.
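
Putting the pieces together, the update rule becomes a short training loop. A sketch under the conventions above; the defaults are illustrative and mirror the worked example (zero initialization, a small constant step size C, 100 iterations):

```python
import numpy as np

def gradient_ascent(X, y, C=0.001, iterations=100):
    """Fit logistic-regression coefficients by gradient ascent on the log-likelihood."""
    theta = np.zeros(X.shape[1])               # all coefficients start at 0
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-X @ theta))   # current predicted probabilities
        theta += C * (X.T @ (y - p))           # step along the gradient
    return theta
```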

First Iteration
Initial coefficients = θ = [0, 0, 0, …, 0] (every coefficient starts at 0)

Prediction with sigmoid function = $P(Y) = \frac{1}{1 + e^{0}} = 0.5$
Prediction with sigmoid function = [0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5
0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5]

Indicators =y=[1 1 1 1 0 0 0 1 0 1 0 1 1 0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 1 0 0 0 1 1 0 1 0 1 0 1 0 0 0 0 1 0 1 0 1 1
1 0 1 1 1 1 1 0 1 0 0]

X[:,0] = 1st feature = [0.735 -1.188 0.686 1.478 -0.616 -1.0254 -0.519 0.371 -0.568 -1.956 1.453 0.814 -0.427 1.141 -0.182 -1.668 0.731 0.373 1.742 -0.746 -1.579 0.666 1.790 0.294 -0.640 0.996 -1.600 -0.585 0.852 1.141 1.020 -1.530 -0.556 -0.477 0.515 -0.212 1.020 0.036 -0.905 1.028 -0.351 -1.097 -0.110 -1.218 -0.832 0.418 0.179 -0.832 -0.544 0.737 1.417 0.876 0.466 0.892 -0.554 -2.108 0.529 -0.664 -0.682 1.646 -0.592]

Gradient of Log Likelihood Function

$$\frac{\partial \ell(\theta)}{\partial \theta_j} = \sum_{i=1}^{n} [\, y^i - \sigma(\theta^T x^i) \,] \, x_j^i = [(1 - 0.5)(0.735)] + \ldots + [(0 - 0.5)(-0.592)]$$

= -2.50362

Gradient Ascent Optimization


$$\theta_0^{(1)} = \theta_0^{(0)} + C \cdot \sum_{i=1}^{n} [\, y^i - \sigma(\theta^T x^i) \,] \, x_0^i = 0 + 0.001 \times (-2.50362)$$

C = 0.001

$\theta_0 = -0.002503$

Log of likelihood function


$$\ell(\theta) = \sum_{i=1}^{n} y_i \log(\sigma(\theta^T x_i)) + (1 - y_i) \log[1 - \sigma(\theta^T x_i)]$$

= -42.2196

After 100 iterations


Log of likelihood = -36.4231
Coefficients = [-1.17915, 0.3395, -0.1759, 0.0032, -0.3622, -0.2968, 0.2607, 0.2031, 0.2608, 0.3840, 0.4710, 0.3835, 0.0264, -0.0015, 0.0261, -0.4771, -0.2799, -0.4769, -0.1195, 0.2207, -0.4122, 0.1689, 0.9106, -0.3114, -0.3865, 0.0175, -0.3805, -0.228, 0.2621, 0.0213, 0.8631, 0.2207, -0.6635, -0.5419, 1.1652, -0.2991, -0.2501, -0.2075, -0.1395, 0.2207, 0.3154, 0.6443, 0.3154, -0.3159, 0.2207, -0.6088, 0.2207, -0.2819, 0.0077, 0.2207, 0.2207, -0.2293, -0.3164, -0.5953, -0.2805, -0.2152]
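
The per-sample probabilities and predicted labels tabulated below follow from the fitted coefficients by thresholding the sigmoid output at 0.5; a minimal sketch (names illustrative):

```python
import numpy as np

def predict(theta, X, threshold=0.5):
    """Predicted probabilities and class labels for fitted coefficients theta."""
    p = 1.0 / (1.0 + np.exp(-X @ theta))    # P(Y = 1 | x) for each row of X
    return p, (p >= threshold).astype(int)  # label 1 where the probability clears 0.5
```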
Probabilities, predicted labels, and actual classes, with $P(Y) = \frac{1}{1 + e^{-\theta^T x}}$:

No.   P(Y)         Predicted label   Actual class
1     0.9788491    1                 1
2     0.96463481   1                 1
3     0.99637032   1                 1
4     0.97703039   1                 1
5     0.99459516   1                 1
6     0.93098767   1                 1
7     0.97416454   1                 1
8     0.54667615   1                 0
9     0.88297689   1                 0
10    0.01412368   0                 0
11    0.28683184   0                 0
12    0.79641965   1                 0
13    0.99196144   1                 1
14    0.66984013   1                 0
15    0.99631057   1                 1
16    0.00974661   0                 0
17    0.25649455   0                 0
18    0.84043497   1                 0
19    0.77538173   1                 0
20    0.9687391    1                 1
21    0.94291326   1                 0
22    0.53847297   1                 0
23    0.97493028   1                 1
24    0.16696163   0                 0
25    0.43009663   0                 0
26    0.55881501   1                 0
27    0.09237489   0                 0

Confusion Matrix

        0    1
  0     7    9
  1     0   11

Precision = TP / (TP + FP) = 11 / (11 + 0) = 1

Recall = TP / (TP + FN) = 11 / (11 + 9) = 0.55

F1 score = 2 × precision × recall / (precision + recall) = (2 × 1 × 0.55) / (1 + 0.55) = 0.709

TP Rate = recall = 0.55

FP Rate = FP / (FP + TN) = 0 / (0 + 7) = 0

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (11 + 7) / (11 + 7 + 0 + 9) = 0.667

Specificity = TN / (TN + FP) = 7 / (7 + 0) = 1
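
All of these metrics follow from the four confusion-matrix counts. A small sketch that also guards against the zero denominators that appear in the next section (the count assignments follow the document's usage):

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

# With the counts used above (TP = 11, TN = 7, FP = 0, FN = 9) this returns
# precision = 1.0, recall = 0.55, F1 = 0.709..., accuracy = 0.667 (rounded).
print(classification_metrics(tp=11, tn=7, fp=0, fn=9))
```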

Logistic Regression Without SMOTE Data Balancing

Confusion Matrix
0 1
0 7 8
1 0 0

Precision = TP / (TP + FP) = 0 / (0 + 0) = 0

Recall = TP / (TP + FN) = 0 / (0 + 8) = 0

F1 score = 2 × precision × recall / (precision + recall) = 0

Accuracy = (TP + TN) / (TP + TN + FP + FN) = (0 + 7) / (0 + 0 + 7 + 8) = 0.467
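
For context, SMOTE balancing is typically applied to the training data before fitting, for example with the imbalanced-learn library. A self-contained sketch on synthetic data (the dataset here is invented purely for illustration):

```python
import numpy as np
from imblearn.over_sampling import SMOTE

# Invented imbalanced toy data: 12 majority-class and 8 minority-class samples.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(20, 2))
y_train = np.array([0] * 12 + [1] * 8)

# SMOTE synthesizes new minority-class samples by interpolating between
# existing minority neighbours until the classes are balanced.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_train, y_train)
print(np.bincount(y_bal))  # -> [12 12]
```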
