Machine Learning Assignment 3
18BCE2301
Devangshu Mazumder
Aim:
Design and implement a KNN classifier on data loaded from a CSV file.
Abstract:
The abbreviation KNN stands for “K-Nearest Neighbours”. It is a supervised machine learning
algorithm that can be used for both classification and regression problems. The symbol 'K'
denotes the number of nearest neighbours that are consulted when predicting or classifying
a new, unseen data point.
KNN works by computing the distance between a query point and every example in the data,
selecting the K examples closest to the query, and then either voting for the most frequent
label (in the case of classification) or averaging the labels (in the case of regression).
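The steps described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in this assignment: the function names (`k_nearest`, `knn_classify`, `knn_regress`), the choice of Euclidean distance, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter
from statistics import mean


def k_nearest(train_X, train_y, query, k):
    """Return the labels of the k training points closest to `query`."""
    # Euclidean distance from the query to every training example
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance only and keep the k closest examples
    distances.sort(key=lambda pair: pair[0])
    return [label for _, label in distances[:k]]


def knn_classify(train_X, train_y, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    labels = k_nearest(train_X, train_y, query, k)
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train_X, train_y, query, k=3):
    """Regression: average of the k nearest target values."""
    return mean(k_nearest(train_X, train_y, query, k))


# Toy data, made up for illustration only
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
classes = [0, 0, 0, 1, 1, 1]
targets = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

predicted_class = knn_classify(points, classes, (1.2, 1.4), k=3)
predicted_value = knn_regress(points, targets, (6.4, 7.0), k=3)
print(predicted_class, predicted_value)
```

With this toy data the first query lies next to the three class-0 points and the second query next to the three points with targets 10, 11 and 12, so the vote and the average are easy to check by hand.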
Sample Code:
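import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Load the data. "heart.csv" is an assumed file name; the original listing does not show the path.
dataset = pd.read_csv("heart.csv")
df1 = pd.DataFrame(dataset)

# Replace missing values with the column means and inspect the affected rows
print(df1['ca'].mean())
df1.fillna(df1.mean(numeric_only=True), inplace=True)
print(df1.loc[[166,192,287,302]])
print(df1['thal'].mean())
print(df1.loc[[87,266]])

# The first 13 columns are the features; 'output' is the class label
feature_cols = list(df1.columns[0:13])
X = df1[feature_cols]
y = df1['output'].values
print("\nFeature values:")
print(X.head())

# Train/test split. The 80/20 ratio and the seed are assumptions; the original listing omits this step.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train)

# Normalization. StandardScaler is assumed; the original does not show which scaler was used.
# The scaler is fitted on the training data only and then reused for the test data.
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
print(X_train)
X_test = scaler.transform(X_test)
print(X_test)

print("KNN CLASSIFIER")
clf2 = KNeighborsClassifier(n_neighbors=5)
clf2.fit(X_train, y_train)
y_predictions = clf2.predict(X_test)
print("Accuracy=", accuracy_score(y_test, y_predictions))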
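Because KNN relies on distances, features on larger scales can dominate the result, which is why the normalization step matters. A common convention is to compute the scaling statistics from the training rows only and reuse them for the test rows. The sketch below illustrates that idea in plain Python; the helper names (`fit_standardizer`, `standardize`) and the toy numbers are hypothetical and are not part of the assignment's code.

```python
from statistics import mean, pstdev


def fit_standardizer(rows):
    """Compute the per-column mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply (value - mean) / std column by column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Toy values, made up for illustration only
train_rows = [[50.0, 120.0], [60.0, 140.0], [70.0, 160.0]]
test_rows = [[60.0, 150.0]]

means, stds = fit_standardizer(train_rows)          # statistics from the training rows only
train_scaled = standardize(train_rows, means, stds)
test_scaled = standardize(test_rows, means, stds)   # reuse the training statistics
print(test_scaled)
```

Reusing the training statistics keeps information about the test rows out of the preprocessing step, so the reported accuracy reflects data the model has not seen in any form.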
OUTPUT: