Binary Logistic Regression From Scratch
In [2]:
# required imports for displaying the reference images
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib import rcParams

# figure size in inches (optional)
rcParams['figure.figsize'] = 11, 8
# read images
img_A = mpimg.imread('Classification_Using_Linear_Regression.png')
img_B = mpimg.imread('Classification_Using_Linear_Regression_Issue.png')
# display images side by side
fig, ax = plt.subplots(1, 2)
ax[0].imshow(img_A);
ax[1].imshow(img_B);
Hypothesis Function
Since our objective is to get a discrete value (0 or 1), we will create a hypothesis function that returns values between 0 and 1.
The sigmoid function does exactly that: it maps the whole real number line into the range 0 to 1. It is also called the logistic function.

$$g(z) = \frac{1}{1 + e^{-z}}$$

The term sigmoid means 'S-shaped', and when plotted this function gives an S-shaped curve.

$$z = \theta^T x$$

$$h_\theta(x) = g(z) = \frac{1}{1 + e^{-\theta^T x}}$$

Basically, we are feeding the linear function into the sigmoid function in order to map its output into the range 0 to 1. The way our sigmoid function g(z) behaves is that when its input is greater than or equal to zero, its output is greater than or equal to 0.5.
Since a positive input results in the positive class and a negative input results in the negative class, we can separate the two classes by setting the weighted sum of inputs to 0, i.e.

$$z = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = 0$$
Decision Boundary
The decision boundary separates the positive class from the negative class.
The decision boundary is the line that separates the area where y = 0 from the area where y = 1. It is created by our hypothesis function.
As explained earlier, the decision boundary can be found by setting the weighted sum of inputs to 0.
Let's derive the decision boundary formula for a two-feature ($x_1$ and $x_2$) dataset:

$$\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0 \quad\Rightarrow\quad x_2 = -\frac{\theta_0 + \theta_1 x_1}{\theta_2}$$
Cost Function
To find the optimum values of the theta parameters we have to try multiple values and then choose the best ones based on how well the predicted classes match the given data. To do this we will create a cost function (J). The inner working of the cost function is as follows:
We execute the hypothesis function with the current theta values to get the predicted value for every training example.
We then compare the predicted values with the actual target values from the training data.
If a predicted value matches the actual value the cost is 0; otherwise the cost is heavily penalized.
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)} \log\big(h_\theta(x^{(i)})\big) + \big(1 - y^{(i)}\big)\log\big(1 - h_\theta(x^{(i)})\big)\Big]$$

In vectorized form, with $h = g(X\theta)$:

$$J(\theta) = \frac{1}{m}\Big(-y^T \log(h) - (1 - y)^T \log(1 - h)\Big)$$
Just like in linear regression, the logistic cost function is a convex function, so the optimum theta values are the ones for which the cost is minimum.
The gradient of the cost with respect to the parameters, in vectorized form, is

$$\nabla_\theta J(\theta) = \frac{1}{m} X^T \big(g(X\theta) - y\big)$$

Note that while this gradient looks identical to the linear regression gradient, the formula is actually different, because linear and logistic regression have different definitions of $h_\theta(x)$.
In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Notations used
m = number of training examples (number of rows of the feature matrix)
n = number of features (number of columns of the feature matrix)
X's = input variables / independent variables / features
y's = output variables / dependent variables / target / labels
The data consists of marks from two exams for 100 applicants. The target value takes the binary values 1 and 0, where 1 means the applicant was admitted to the university and 0 means the applicant did not get admission. The objective is to build a classifier that can predict whether an applicant will be admitted to the university or not.
In [4]:
df = pd.read_csv('admission_basedon_exam_scores.csv')
m, n = df.shape
print('Number of training examples m = ', m)
print('Number of features n = ', n - 1) # Not counting the 'Label: Admission status'
df.sample(15) # Show 15 random training examples
Out [4]:
Exam 1 marks Exam 2 marks Admission status
57 32.577200 95.598548 0
24 77.924091 68.972360 1
88 78.635424 96.647427 1
8 76.098787 87.420570 1
28 61.830206 50.256108 0
38 74.789253 41.573415 0
15 53.971052 89.207350 1
7 75.024746 46.554014 1
79 82.226662 42.719879 0
99 74.775893 89.529813 1
39 34.183640 75.237720 0
64 44.668262 66.450086 0
81 94.834507 45.694307 1
5 45.083277 56.316372 0
84 80.366756 90.960148 1
Data Understanding
There are 100 training examples in total (m = 100, i.e. 100 rows)
There are two features: Exam 1 marks and Exam 2 marks
The label column contains the admission status, where '1' means admitted and '0' means not admitted
Total number of features (n) = 2 (later we will add a column of ones (x_0) to make it 3)
In [5]:
df_admitted = df[df['Admission status'] == 1]
print('Dimension of df_admitted= ', df_admitted.shape)
df_admitted.sample(10)
Out [5]:
Exam 1 marks Exam 2 marks Admission status
42 94.443368 65.568922 1
94 89.845807 45.358284 1
82 67.319257 66.589353 1
88 78.635424 96.647427 1
93 74.492692 84.845137 1
30 61.379289 72.807887 1
8 76.098787 87.420570 1
21 89.676776 65.799366 1
84 80.366756 90.960148 1
77 50.458160 75.809860 1
In [6]:
df_notadmitted = df[df['Admission status'] == 0]
print('Dimension of df_notadmitted= ', df_notadmitted.shape)
df_notadmitted.sample(5)
Out [6]:
Exam 1 marks Exam 2 marks Admission status
23 34.212061 44.209529 0
70 32.722833 43.307173 0
10 95.861555 38.225278 0
17 67.946855 46.678574 0
92 55.482161 35.570703 0
Data Visualization
To plot the data of admitted and not-admitted applicants, we first need to create a separate dataframe for each class (admitted / not admitted).
In [8]:
plt.figure(figsize = (5,5))
plt.scatter(df_admitted['Exam 1 marks'], df_admitted['Exam 2 marks'], color='green', label='Admitted')
plt.scatter(df_notadmitted['Exam 1 marks'], df_notadmitted['Exam 2 marks'], color='red', label='Not Admitted')
plt.xlabel('Exam 1 Marks')
plt.ylabel('Exam 2 Marks')
plt.legend()
plt.title('Admitted Vs Not Admitted Applicants')
In [9]:
# Get feature columns from dataframe
X = df.iloc[:, 0:2].values
# Add a column of ones (intercept term)
X = np.hstack((np.ones((m, 1)), X))
# Now X is a 2-dimensional numpy array
print("Dimension of feature matrix X = ", X.shape, '\n')
y = df.iloc[:, -1].values
# First 5 training examples with labels
for i in range(5):
    print('x =', X[i, :], ', y =', y[i])
x = [ 1. 34.62365962 78.02469282] , y = 0
x = [ 1. 30.28671077 43.89499752] , y = 0
x = [ 1. 35.84740877 72.90219803] , y = 0
x = [ 1. 60.18259939 86.3085521 ] , y = 1
x = [ 1. 79.03273605 75.34437644] , y = 1
In [11]:
# Initialize theta with zeros: one parameter per column of X (intercept + 2 features), so n = 3
theta = np.zeros(n)
theta
In [12]:
def sigmoid(z):
    """
    To convert a continuous value into the range 0 to 1
    I/P
    ----------
    z : Continuous value (scalar or numpy array)
    O/P
    -------
    Value in the range 0 to 1.
    """
    g = 1 / (1 + np.exp(-z))
    return g
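A quick sanity check of the threshold behaviour described earlier (a minimal illustrative sketch; the input values are arbitrary):
# g(0) = 0.5; positive inputs map above 0.5, negative inputs below 0.5
print(sigmoid(0))                     # 0.5
print(sigmoid(np.array([-5, 0, 5])))  # approximately [0.0067 0.5 0.9933]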
A vectorized implementation of the cost and gradient formulas, for better performance
In [14]:
def cost_function(theta, X, y):
    """
    Compute cost for logistic regression.
    I/P
    ----------
    theta : 1D array of parameters. dimension (1 x n)
    X : 2D array where each row represents a training example and each column represents a feature.
        m = number of training examples
        n = number of features (including the X_0 column of ones)
    y : 1D array of labels/target values for each training example. dimension (1 x m)
    O/P
    -------
    J : The cost of using theta as the parameter for logistic regression to fit the data points
    """
    m, n = X.shape
    x_dot_theta = X.dot(theta)
    h = sigmoid(x_dot_theta)
    # Vectorized cost: J = (1/m) * (-y' log(h) - (1 - y)' log(1 - h))
    J = (1 / m) * (-y.T.dot(np.log(h)) - (1 - y).T.dot(np.log(1 - h)))
    return J

def gradient(theta, X, y):
    """
    Compute the gradient of the cost with respect to the parameters theta.
    I/P
    ----------
    theta : 1D array of parameters. dimension (1 x n)
    X : 2D array where each row represents a training example and each column represents a feature.
        m = number of training examples
        n = number of features (including the X_0 column of ones)
    y : 1D array of labels/target values for each training example. dimension (1 x m)
    O/P
    -------
    grad : (numpy array) The gradient of the cost with respect to the parameters theta
    """
    m, n = X.shape
    x_dot_theta = X.dot(theta)
    # Vectorized gradient: grad = (1/m) * X' (g(X theta) - y)
    grad = (1 / m) * X.T.dot(sigmoid(x_dot_theta) - y)
    return grad
Testing the cost_function() and gradient() using the initial theta values
cost = cost_function(theta, X, y)
print('Cost at initial theta (zeros):', cost)
grad = gradient(theta, X, y)
print('Gradient at initial theta (zeros):', grad)
But here we are going to use the fmin_tnc function from the scipy library.
This process is the same as using the 'fit' method from the sklearn library, because here we are trying to optimize our cost function in order to find the best possible parameter (theta) values.
fmin_tnc function takes four arguments:
func: Cost function to minimize
fprime: Gradient for the function defined by ‘func’
x0 : initial values for the parameters(theta) that we want to find
args: feature and label values
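Putting this together, a call along these lines finds the optimal theta values (a minimal sketch; unpacking result[0] relies on fmin_tnc returning the solution array as its first element):
from scipy.optimize import fmin_tnc
# Minimize the cost function, supplying the analytic gradient
result = fmin_tnc(func=cost_function, x0=theta, fprime=gradient, args=(X, y))
theta = result[0]  # optimized parameter values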
cost = cost_function(theta, X, y)
print('Cost at theta found by fmin_tnc:', cost)
print('theta:', theta)
NIT NF F GTG
0 1 2.034977015894748E-01 2.47309204E-13
tnc: |pg| = 4.97302e-07 -> local minimum
0 1 2.034977015894748E-01 2.47309204E-13
tnc: Local minima reach (|pg| ~= 0)
Visualization
In [23]:
# Let's calculate the x and y values of the decision boundary line using the formula derived earlier
# For plotting a line we just need 2 points; here I am taking the 'min' and 'max' of Exam 1 marks as my two x values
x_values = [np.min(X[:, 1]), np.max(X[:, 1])]
y_values = - (theta[0] + np.dot(theta[1], x_values)) / theta[2]
plt.figure(figsize = (6,5))
plt.scatter(df_admitted['Exam 1 marks'], df_admitted['Exam 2 marks'], color='green', label='Admitted')
plt.scatter(df_notadmitted['Exam 1 marks'], df_notadmitted['Exam 2 marks'], color='red', label='Not Admitted')
plt.plot(x_values, y_values, color='blue', label='Decision Boundary')
plt.xlabel('Exam 1 Marks')
plt.ylabel('Exam 2 Marks')
plt.legend()
Question: Predict an admission probability for applicant with scores 45 in Exam 1 and 85 in Exam 2
We can use our hypothesis function for prediction h(x) = g(z) = g(Xθ)
In [24]:
input_data = np.array([1, 45, 85]) # Note the intercept term '1' in array
prob = sigmoid(np.dot(input_data, theta))
print('Admission probability for applicant with scores 45 in Exam 1 and 85 in Exam 2 is =', prob)
Admission probability for applicant with scores 45 in Exam 1 and 85 in Exam 2 is = 0.7762906222622858
Next we create a prediction function for our logistic model. Instead of returning the probability between 0 and 1, this function uses a threshold value of 0.5 to predict the discrete class: 1 when the probability is ≥ 0.5, else 0.
In [25]:
def predict(theta, X):
    """
    Predict the class (0 or 1) using the learned logistic regression parameters theta.
    Uses a threshold value of 0.5 to convert the probability value into a class value.
    I/P
    ----------
    theta : 1D array of learned parameters. dimension (1 x n)
    X : 2D array where each row represents a training example and each column represents a feature.
        m = number of training examples
        n = number of features (including the X_0 column of ones)
    O/P
    -------
    Class value (0 or 1) based on the threshold
    """
    p = sigmoid(X.dot(theta)) >= 0.5
    return p.astype(int)
Accuracy Of Model
p = predict(theta, X)
print ('Accuracy:', np.mean(p == y) * 100 )
Accuracy: 89.0
Confusion Matrix
In [28]:
from sklearn import metrics
In [29]:
actualAdmissionStatus = y   # actual labels from the training data
predictedValue = p          # classes predicted by our model
confusion_matrix = metrics.confusion_matrix(actualAdmissionStatus, predictedValue)
In [30]:
cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = ['Not Admitted', 'Admitted'])  # display label names are assumed
In [31]:
cm_display.plot()
plt.show()
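To relate the plotted matrix to the metric formulas below, the four counts can be read straight out of the matrix (a small sketch; sklearn places true negatives at position [0, 0] when the labels are ordered 0, 1):
# confusion_matrix is a 2x2 array: rows = actual class, columns = predicted class
tn, fp, fn, tp = confusion_matrix.ravel()
print('TN =', tn, ', FP =', fp, ', FN =', fn, ', TP =', tp)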
Precision
Out of all the predicted positives, what percentage are truly positive?
Precision = TP/(TP+FP)
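The same value can be obtained with sklearn (a one-line sketch using the y and p arrays from above):
print('Precision:', metrics.precision_score(y, p))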
Accuracy
In [34]:
Accuracy = (34+56)/(34+56+5+6)
Accuracy
Recall
Out of all the actual positives, what percentage are predicted positive? It is the same as the TPR (true positive rate).
Recall = TP/(TP+FN)
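Again, sklearn provides this directly (a one-line sketch):
print('Recall:', metrics.recall_score(y, p))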
F1 Score:
It is the harmonic mean of precision and recall. It takes both false positives and false negatives into account, and therefore performs well on an imbalanced dataset.
F1 = 2/(1/Precision + 1/Recall)
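And the sklearn equivalent of the harmonic-mean formula above (a one-line sketch):
print('F1 score:', metrics.f1_score(y, p))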