Data Analysis in Python-4

The document shows how to use a K-nearest neighbors (KNN) classifier in Python. It imports the KNN class from scikit-learn, creates a classifier with 5 neighbors, fits it on the training data, and predicts the test values. It then computes the confusion matrix, the accuracy, and the number of misclassified samples. Finally, it tries different K values and finds that the number of misclassified samples is lowest at K=16.

Uploaded by

mohan
Copyright
© All Rights Reserved
Data Analysis in Python-4

'''
Now we will see the KNN classifier model
'''
#importing the necessary library of KNN
from sklearn.neighbors import KNeighborsClassifier
#now create an instance of the model using the K-nearest neighbors classifier
#here the K value is 5, i.e. each sample is assigned the majority class
#(salary less than or equal to 50000, or not) among its 5 nearest neighbors
KNN_classifier=KNeighborsClassifier(n_neighbors=5)
KNN_classifier.fit(train_x,train_y) #fitting the values for x and y
#predicting the test values with this model
prediction=KNN_classifier.predict(test_x)
print(prediction)
#Now check the performance metrics
from sklearn.metrics import accuracy_score,confusion_matrix
#store results under new names so the imported functions are not overwritten
conf_matrix=confusion_matrix(test_y,prediction)
print('\t','predicted values')
print('original values','\n',conf_matrix)
accuracy=accuracy_score(test_y,prediction)
print(accuracy)
print('Misclassified values: %d' % (test_y!=prediction).sum())
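
To make the role of n_neighbors concrete, here is a minimal pure-Python sketch of the majority vote that a K-nearest-neighbors classifier performs. It is an illustration only, not how scikit-learn implements KNeighborsClassifier. The function knn_predict and the toy data are made up for this example and do not come from the original dataset.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=5):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, paired with its label
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k closest training points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbors is the prediction
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (age, hours worked per week) with a salary class label
toy_x = [(25, 40), (30, 45), (28, 38), (50, 60), (48, 55), (52, 50), (45, 58)]
toy_y = ["<=50K", "<=50K", "<=50K", ">50K", ">50K", ">50K", ">50K"]

toy_prediction = knn_predict(toy_x, toy_y, query=(27, 42), k=5)
print(toy_prediction)
```

With k=5, the three closest points carry the "<=50K" label and outvote the two ">50K" neighbors, so the query is assigned "<=50K".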
'''
Now check the effect of K values on classifier
'''
Misclassified_sample=[]
#calculating errors for K values from 1 to 19 (range(1,20) excludes 20)
for i in range(1,20):
    knn=KNeighborsClassifier(n_neighbors=i)
    knn.fit(train_x,train_y)
    pred_i=knn.predict(test_x)
    Misclassified_sample.append((test_y!=pred_i).sum())
print(Misclassified_sample)
#therefore, from these K values we can take K=16, for which the
#number of misclassified samples is lowest (1401)
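
Instead of reading the best K off the printed list by eye, it can be picked programmatically. The sketch below assumes a list shaped like Misclassified_sample, where the first entry corresponds to K=1. The helper best_k_from_errors and the example numbers are hypothetical and are not the original results.

```python
def best_k_from_errors(error_counts, first_k=1):
    """Return (K, error count) for the smallest entry in error_counts."""
    # enumerate pairs each error count with its K value, starting at first_k;
    # min returns the first minimum, so ties resolve to the smallest K
    best_k, lowest_error = min(
        enumerate(error_counts, start=first_k), key=lambda pair: pair[1]
    )
    return best_k, lowest_error

# Made-up error counts for K = 1..5, used only to demonstrate the helper
example_errors = [1700, 1650, 1500, 1520, 1480]
best_k, lowest_error = best_k_from_errors(example_errors)
print(best_k, lowest_error)
```

Calling best_k_from_errors(Misclassified_sample) after the loop would report the K with the fewest misclassified test samples.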
'''
So, we considered and studied two algorithms for the classification problem:
1. Logistic Regression
2. KNN
'''
