Lecture-11-K Nearest Neighbors-Part2 - Jupyter Notebook
Import Libraries
In [10]: import pandas as pd
         import seaborn as sns
         import matplotlib.pyplot as plt
         import numpy as np
         %matplotlib inline
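The cell that loads the data was not captured in this export. A minimal sketch of that step, assuming the anonymized features sit in a CSV file whose first column is the row index (the file name here is hypothetical):

In [ ]: # Load the anonymized data; the file name is an assumption, since the
        # original loading cell did not survive the export
        df = pd.read_csv('Classified_Data', index_col=0)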
In [12]: df.head()
Out[12]: (first five rows of the feature columns XVPM, GWYH, TRAT, TLLZ, IGGA, HYKR, EDFS, ...; the table layout did not survive the export)
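The cell that creates and fits the scaler is missing; the sketch below reconstructs it from the Out[13] result, assuming the usual scikit-learn pattern of fitting on every column except the target:

In [13]: from sklearn.preprocessing import StandardScaler
         # Fit the scaler on the features only, leaving out the label column
         scaler = StandardScaler()
         scaler.fit(df.drop('TARGET CLASS', axis=1))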
Out[13]: StandardScaler()
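Next the features are transformed and rebuilt into a DataFrame; again a reconstruction of the missing cells under the same assumptions:

In [ ]: # Transform the features to zero mean and unit variance
        scaled_features = scaler.transform(df.drop('TARGET CLASS', axis=1))
        # Rebuild a DataFrame with the original feature column names
        df_feat = pd.DataFrame(scaled_features, columns=df.columns[:-1])

In [20]: df_feat.head()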
Out[20]: (first five rows of the scaled features XVPM, GWYH, TRAT, TLLZ, IGGA, HYKR, EDFS, GUUB, MGJM, ...; the table layout did not survive the export)
Using KNN
Remember that we are trying to build a model that predicts whether someone falls in the TARGET
CLASS or not. We'll start with k=1.
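The train/test split and classifier-creation cells are missing from this export. A minimal reconstruction, assuming the standard scikit-learn workflow (the test_size and random_state values are assumptions, not from the original):

In [ ]: from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier

        # Split the scaled features and the label into train and test sets
        # (test_size and random_state here are assumed values)
        X_train, X_test, y_train, y_test = train_test_split(
            df_feat, df['TARGET CLASS'], test_size=0.3, random_state=101)

        # Start with a single neighbor
        knn = KNeighborsClassifier(n_neighbors=1)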
In [25]: knn.fit(X_train,y_train)
Out[25]: KNeighborsClassifier(n_neighbors=1)
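Before the evaluation cells below, we need predictions on the test set; that cell was not captured, so this is a reconstruction:

In [ ]: from sklearn.metrics import classification_report, confusion_matrix

        # Predict the class of each test observation
        pred = knn.predict(X_test)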
In [28]: print(confusion_matrix(y_test,pred))
[[109 45]
[ 33 113]]
In [29]: print(classification_report(y_test,pred))
Choosing a K Value
Let's go ahead and use the elbow method to pick a good K value: retrain the model for a range of K values and plot the test error rate for each.
In [30]: error_rate = []

         # Refit the model once per candidate K; this will take some time
         for i in range(1, 40):
             knn = KNeighborsClassifier(n_neighbors=i)
             knn.fit(X_train, y_train)
             pred_i = knn.predict(X_test)
             # Fraction of test points misclassified at this K
             error_rate.append(np.mean(pred_i != y_test))
In [31]: plt.figure(figsize=(10,6))
         # the marker style was truncated in the export; 'o' is an assumption
         plt.plot(range(1,40), error_rate, color='blue', linestyle='dashed',
                  marker='o', markerfacecolor='red', markersize=10)
         plt.title('Error Rate vs. K Value')
         plt.xlabel('K')
         plt.ylabel('Error Rate')
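The cells that retrain the model at the two K values compared below were not captured; a reconstruction under the same assumptions (run once with n_neighbors=1 and once with n_neighbors=30):

In [ ]: # Retrain and re-evaluate at a chosen K
        knn = KNeighborsClassifier(n_neighbors=30)
        knn.fit(X_train, y_train)
        pred = knn.predict(X_test)
        print('WITH K=30')
        print(confusion_matrix(y_test, pred))
        print(classification_report(y_test, pred))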
WITH K=1
[[109 45]
[ 33 113]]
WITH K=30
[[114 40]
[ 20 126]]
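Moving from K=1 to K=30 improves the result: correct predictions rise from 109 + 113 = 222 of the 300 test points (74% accuracy) to 114 + 126 = 240 of 300 (80% accuracy).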