Practical 10 K-Nearest Neighbors Algorithm

The K-Nearest Neighbors (KNN) algorithm is a supervised machine learning method for classification and regression that makes predictions based on the proximity of data points. It identifies the K nearest neighbors of an input point and assigns the most common class label among them. The document also covers practical steps for implementing KNN on the IRIS dataset in Google Colab: data loading, model training, and performance evaluation metrics.


K-Nearest Neighbors Algorithm
What is KNN (K-Nearest Neighbor)?
• The k-nearest neighbors (KNN) algorithm is a simple, supervised machine learning method that makes predictions based on how close a data point is to other points.
• It is widely used for both classification and regression tasks because of its simplicity and ease of implementation.
KNN algorithm
• The algorithm identifies the K nearest neighbors to the input data
point based on their distances.
• In the case of classification, the algorithm assigns the most common
class label among the K neighbors as the predicted label for the input
data point.
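The classification rule above can be sketched from scratch. This is a minimal illustration only; the helper name `knn_predict` and the toy points are hypothetical and assume Euclidean distance:

```python
from collections import Counter
import math

def knn_predict(X_train, y_train, x, k=3):
    """Classify point x by majority vote among its k nearest training points.
    Uses Euclidean distance; ties fall to the first-seen label."""
    distances = sorted(
        (math.dist(p, x), label) for p, label in zip(X_train, y_train)
    )
    k_labels = [label for _, label in distances[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Tiny made-up dataset (NOT the IRIS data): two well-separated clusters
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(X, y, (2, 2), k=3))  # → A (all three nearest points are class A)
```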
K-NN class prediction

If K = 3, which class do the three nearest neighbors assign?
Decision boundary for different k-values
Selection of k in the KNN algorithm
Distance computation algorithms
• Euclidean distance
• Manhattan distance
• Cosine similarity
• Minkowski Distance
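The four distance measures listed above can be computed with a few lines of standard-library Python. The vectors `a` and `b` below are made up for illustration:

```python
import math

a, b = [1.0, 2.0, 3.0], [4.0, 6.0, 8.0]

# Euclidean distance (Minkowski with p=2): straight-line distance
euclidean = math.dist(a, b)

# Manhattan distance (Minkowski with p=1): sum of absolute differences
manhattan = sum(abs(x - y) for x, y in zip(a, b))

# Minkowski distance with a general order p
def minkowski(u, v, p):
    return sum(abs(x - y) ** p for x, y in zip(u, v)) ** (1 / p)

# Cosine similarity: cosine of the angle between the vectors (1.0 = same direction)
dot = sum(x * y for x, y in zip(a, b))
cosine = dot / (math.hypot(*a) * math.hypot(*b))

print(euclidean, manhattan, cosine)
```

Note that cosine similarity measures orientation rather than distance, so larger values mean the vectors are *more* alike, unlike the three distance metrics.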
Practical 10
• Train a k-NN classifier with n_neighbors=3 on the IRIS dataset using
the scikit-learn library, with 80% of the data allocated for training and
20% for testing.
• Evaluate the model's performance by calculating the testing accuracy,
precision, recall, F1 score and confusion matrix.
Colab: IRIS Dataset classification using a k-NN classifier
1. Open Google Colab at https://colab.research.google.com/
2. Load data
3. Split data into training and test parts
4. Load a k-NN classifier and classification metrics
5. Build model
6. Calculate training and test set accuracies
Step 2: Load Data

1. Dataset loaded
2. Get the X and y variables from the data
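The original loading code is shown only as a screenshot. A minimal sketch using scikit-learn's built-in IRIS loader might look like:

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target  # X: (150, 4) feature matrix, y: (150,) class labels
print(X.shape, y.shape)        # → (150, 4) (150,)
```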
Split data into training and test parts

1. The first command loads the train_test_split function from the sklearn library
2. Passing X and y to this function returns X_train, X_test, y_train and y_test
3. The test size is 0.2, which means 80% of the data is used for training and 20% for testing
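The three steps above can be sketched as follows; `random_state=42` is an assumption added for reproducibility and is not shown on the slide:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split  # step 1: import the function

X, y = load_iris(return_X_y=True)
# Step 2: pass X and y; the function returns the four splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # step 3: 80% train / 20% test

print(len(X_train), len(X_test))  # → 120 30
```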
Import classifier and metrics

1. Classification metrics are imported on line 2
2. The k-NN classifier is imported on line 3
3. The clf (k-NN) classifier is initialised on line 4
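The screenshot's imports and initialisation can be sketched as below; the exact metric names imported are an assumption based on the evaluation step that follows:

```python
# Classification metrics (line 2 of the slide's screenshot)
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)
# The k-NN classifier (line 3)
from sklearn.neighbors import KNeighborsClassifier

# Initialise the classifier with n_neighbors=3 (line 4)
clf = KNeighborsClassifier(n_neighbors=3)
```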
Fit/Train the classifier

1. The k-NN classifier is fitted with the training data on line 1
2. Predictions are made on the test data on line 2
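A self-contained sketch of the fit-and-predict step (the split with `random_state=42` is an assumption, since the original code is shown only as a screenshot):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # assumed seed for reproducibility

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)        # line 1: fit on the training data
y_pred = clf.predict(X_test)     # line 2: predict the test labels
```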
Compute test accuracy, precision, etc.

1. The trained model is evaluated
2. In the first cell, the test metrics are computed
3. In the next cell, these metrics are printed
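The two cells can be sketched as below. For multiclass data, precision, recall and F1 need an averaging mode; `average="macro"` is an assumption, since the slide does not show which one is used:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # assumed seed
y_pred = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train).predict(X_test)

# First cell: compute the test metrics
acc = accuracy_score(y_test, y_pred)
prec = precision_score(y_test, y_pred, average="macro")  # macro-averaging assumed
rec = recall_score(y_test, y_pred, average="macro")
f1 = f1_score(y_test, y_pred, average="macro")

# Next cell: print them
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```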
Confusion matrix code

1. This code draws the confusion matrix for publication purposes
2. In the ConfusionMatrixDisplay function, the first argument passed is the confusion matrix, followed by the labels for each IRIS class
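A sketch of that plotting code, rebuilt around the same assumed split; saving to `confusion_matrix.png` is an illustrative choice (in Colab, `plt.show()` would display it inline instead):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42)  # assumed seed
clf = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

cm = confusion_matrix(y_test, clf.predict(X_test))
# First argument: the confusion matrix; then the labels for each IRIS class
disp = ConfusionMatrixDisplay(confusion_matrix=cm,
                              display_labels=iris.target_names)
disp.plot()
plt.savefig("confusion_matrix.png")  # or plt.show() inside Colab
```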
Confusion Matrix

1. Diagonal entries are the correct predictions
2. Off-diagonal entries are the incorrect predictions
3. Here there is only 1 wrong prediction: one instance of Versicolor predicted as Virginica
