0% found this document useful (0 votes)
67 views

Logistic Regression

The document loads and preprocesses a dataset from a CSV file for logistic regression modeling. It splits the data into training and test sets, initializes logistic regression coefficients, and performs gradient descent optimization to fit the model to the training data. It then makes predictions on the test set and evaluates the accuracy. The process is repeated in a 5-fold cross validation scheme.

Uploaded by

Shwetha Reddy
Copyright
© © All Rights Reserved
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
67 views

Logistic Regression

The document loads and preprocesses a dataset from a CSV file for logistic regression modeling. It splits the data into training and test sets, initializes logistic regression coefficients, and performs gradient descent optimization to fit the model to the training data. It then makes predictions on the test set and evaluates the accuracy. The process is repeated in a 5-fold cross validation scheme.

Uploaded by

Shwetha Reddy
Copyright
© © All Rights Reserved
Available Formats
Download as TXT, PDF, TXT or read online on Scribd
You are on page 1/ 3

import pandas as pd

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import KFold

ds=pd.read_csv("logistic.csv")
ds
34.62365962 78.02469282 0
0 30.286711 43.894998 0
1 35.847409 72.902198 0
2 60.182599 86.308552 1
3 79.032736 75.344376 1
4 45.083277 56.316372 0
... ... ... ...
94 83.489163 48.380286 1
95 42.261701 87.103851 1
96 99.315009 68.775409 1
97 55.340018 64.931938 1
98 74.775893 89.529813 1
99 rows × 3 columns

print(ds)
x = ds.iloc[:,[0,1]].values
34.62365962 78.02469282 0
0 30.286711 43.894998 0
1 35.847409 72.902198 0
2 60.182599 86.308552 1
3 79.032736 75.344376 1
4 45.083277 56.316372 0
.. ... ... ..
94 83.489163 48.380286 1
95 42.261701 87.103851 1
96 99.315009 68.775409 1
97 55.340018 64.931938 1
98 74.775893 89.529813 1

[99 rows x 3 columns]

y = ds.iloc[:,2].values
xp = preprocessing.scale(x)

kf=KFold(n_splits=5)
for train_index, test_index in kf.split(xp):

xtrain, xtest, ytrain, ytest = train_test_split(xp, y, test_size = 0.20,


random_state = 0)
x1=xtrain[:,0]
x2=xtrain[:,1]
b0=0.0
b1=0.0
b2=0.0
epoch=10000
alpha=0.001

while(epoch>0):
for i in range(len(xtrain)):

prediction=1/(1+np.exp(-(b0+b1*x1[i]+b2*x2[i])))
b0=b0+alpha*(ytrain[i]-prediction)*prediction*(1-prediction)*1.0
b1=b1+alpha*(ytrain[i]-prediction)*prediction*(1-prediction)*x1[i]
b2=b2+alpha*(ytrain[i]-prediction)*prediction*(1-prediction)*x2[i]
epoch=epoch-1

print(b0)
print(b1)
print(b2)

final_prediction=[]

x3=xtest[:,0]

x4=xtest[:,1]

print(ytest)

y_pred=[0]*len(xtest)

for i in range(len(xtest)):

y_pred[i]=np.round(1/(1+np.exp(-(b0+b1*x3[i]+b2*x4[i]))))

final_prediction.append(np.ceil(y_pred[i]))

print(final_prediction)
from sklearn.metrics import accuracy_score
print("Accuracy",accuracy_score(ytest,y_pred))
1.590641737857832
3.1148936955248936
2.412367982045799
[0 1 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 1]
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0,
0.0, 1.0, 0.0, 1.0]
Accuracy 0.9
1.590641737857832
3.1148936955248936
2.412367982045799
[0 1 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 1]
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0,
0.0, 1.0, 0.0, 1.0]
Accuracy 0.9
1.590641737857832
3.1148936955248936
2.412367982045799
[0 1 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 1]
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0,
0.0, 1.0, 0.0, 1.0]
Accuracy 0.9
1.590641737857832
3.1148936955248936
2.412367982045799
[0 1 1 1 1 1 0 1 0 1 0 0 0 0 1 1 0 1 0 1]
[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0,
0.0, 1.0, 0.0, 1.0]
Accuracy 0.9

You might also like