
Detailed Explanation of Module 3 Lab 2: Implementing KNN from Scratch and Visualizing Algorithm Performance


Section 1: Implementing KNN from Scratch

What is KNN?
K-Nearest Neighbors (KNN) is a simple, intuitive algorithm for classification and regression. It
predicts the label of a new data point by looking at the labels of its k closest points in the
training set (using a distance metric, usually Euclidean distance), and choosing the most
common label among them.

How is KNN Implemented from Scratch?


1. Distance Calculation: For each test point, compute the distance to every training point.
2. Find Neighbors: Sort all distances and select the k smallest (closest) points.
3. Predict Label: For classification, take the most frequent label among the k neighbors.
Example Code:

import numpy as np
from collections import Counter

def predict(X_train, y_train, X_test, k):
    distances = []
    targets = []
    # Euclidean distance from the test point to every training point, with its index
    for i in range(len(X_train)):
        distances.append([np.sqrt(np.sum(np.square(X_test - X_train[i, :]))), i])
    # Sort by distance and collect the labels of the k closest training points
    distances = sorted(distances)
    for i in range(k):
        index = distances[i][1]
        targets.append(y_train[index])
    # Return the most common label among the k neighbors
    return Counter(targets).most_common(1)[0][0]

For k=1, the label of the single nearest neighbor is returned.


For k>1, the most common label among the k neighbors is chosen.

Accuracy Metric
Accuracy is the ratio of correctly classified samples to total samples:

def Accuracy(gtlabel, predlabel):
    # Fraction of predictions that match the ground-truth labels
    correct = (gtlabel == predlabel).sum()
    return correct / len(gtlabel)

Section 1.1: KNN on the Iris Dataset


Dataset: Iris (150 samples, 4 features, 3 classes).
Process:
1. Split data into training and test sets.
2. Use your KNN function to predict test labels.
3. Calculate accuracy.
Result Example:
"The accuracy of our classifier is 94.0%"
Comparison:
The sklearn library’s KNN implementation gives the same accuracy, validating your scratch
code.
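A minimal sketch of this workflow, assuming the predict() and Accuracy() functions defined above (the split ratio, random_state, and k = 3 are illustrative choices):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=1/3, random_state=42)

# Scratch KNN: classify each test point with the predict() function above
pred = np.array([predict(X_train, y_train, x, 3) for x in X_test])
print("Scratch accuracy:", Accuracy(y_test, pred))

# sklearn KNN with the same k, for comparison
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
print("sklearn accuracy:", clf.score(X_test, y_test))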

Section 1.2: Weighted KNN


Why Weighted?
If k is large, distant neighbors may outvote closer, more relevant ones. Weighted KNN gives
more importance to closer neighbors (e.g., by using the inverse of their distance as a
weight).
How to Implement:
In sklearn, use weights='distance' in KNeighborsClassifier.
In your own code, you’d multiply each neighbor’s vote by its weight (inverse distance).
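A minimal sketch of inverse-distance weighting in a scratch implementation (the function name predict_weighted, the eps term, and the 1/(d + eps) weighting scheme are illustrative choices, not part of the lab code):

import numpy as np
from collections import defaultdict

def predict_weighted(X_train, y_train, X_test, k, eps=1e-8):
    # Distance to every training point, together with its index
    distances = sorted(
        (np.sqrt(np.sum(np.square(X_test - X_train[i, :]))), i)
        for i in range(len(X_train))
    )
    votes = defaultdict(float)
    for dist, index in distances[:k]:
        # Closer neighbors contribute larger votes (inverse-distance weighting)
        votes[y_train[index]] += 1.0 / (dist + eps)
    # Return the class with the largest total weight
    return max(votes, key=votes.get)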

Section 1.3: Return Neighbors and Distances


Modification:
Instead of just the predicted label, your function can return the indices, distances, and
labels of the k nearest neighbors for each test point.
Why?
This helps you analyze which points are influencing each prediction.
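One way to sketch this modification (the return format is one reasonable choice, not the only one):

import numpy as np
from collections import Counter

def predict_with_neighbors(X_train, y_train, X_test, k):
    # Sort all training points by distance to the test point
    distances = sorted(
        (np.sqrt(np.sum(np.square(X_test - X_train[i, :]))), i)
        for i in range(len(X_train))
    )
    k_nearest = distances[:k]
    indices = [i for _, i in k_nearest]
    dists = [d for d, _ in k_nearest]
    labels = [y_train[i] for i in indices]
    # Return the prediction plus the supporting evidence for analysis
    prediction = Counter(labels).most_common(1)[0][0]
    return prediction, indices, dists, labels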

Section 2: Visualizing Data and KNN Behavior


Voronoi Diagrams
What are they?
Voronoi diagrams partition the plane into regions where each region contains all points
closest to one "seed" (data point).
Why useful?
They show how the choice of distance metric and data distribution affects the influence of
each training point.
Limitation:
Only practical for 2D data, so you use the first two features or apply PCA to reduce
dimensions.
Example Code:

import matplotlib.pyplot as plt
from scipy.spatial import Voronoi, voronoi_plot_2d

# points: 2D coordinates of the training samples, targets: their class labels
vor = Voronoi(points)
voronoi_plot_2d(vor)
plt.scatter(points[:, 0], points[:, 1], c=targets, cmap='viridis', edgecolor='k')
plt.show()

Section 2.2: Decision Boundaries in KNN


What are Decision Boundaries?
Imaginary lines (or surfaces) in the feature space where the predicted class changes. They
show which regions of the space are classified as which class by KNN.
How are they plotted?
1. Create a grid covering the feature space.
2. Use KNN to predict the class at each grid point.
3. Color each region according to the predicted class.
4. Overlay the training data points.
Why are they important?
They help you see how KNN generalizes and where it is likely to make mistakes.
For small k, boundaries are jagged and sensitive to noise; for large k, boundaries are
smoother.
Example Code:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.colors import ListedColormap
from sklearn.neighbors import KNeighborsClassifier

def decision_boundary_plot(x_dec, y_dec, k):
    h = .02  # grid step size
    n = len(set(y_dec))
    cmap_light = ListedColormap(['pink', 'green', 'cyan', 'yellow'][:n])
    cmap_bold = ['pink', 'darkgreen', 'blue', 'yellow'][:n]
    for weights in ['uniform', 'distance']:
        # Fit KNN with both uniform and distance-weighted voting
        clf = KNeighborsClassifier(n_neighbors=k, weights=weights)
        clf.fit(x_dec, y_dec)
        # Build a grid that covers the feature space
        x_min, x_max = x_dec[:, 0].min() - 1, x_dec[:, 0].max() + 1
        y_min, y_max = x_dec[:, 1].min() - 1, x_dec[:, 1].max() + 1
        xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                             np.arange(y_min, y_max, h))
        # Predict the class at every grid point and color each region
        Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
        Z = Z.reshape(xx.shape)
        plt.figure(figsize=(8, 6))
        plt.contourf(xx, yy, Z, cmap=cmap_light)
        # Overlay the training points on the colored regions
        sns.scatterplot(x=x_dec[:, 0], y=x_dec[:, 1], hue=y_dec,
                        palette=cmap_bold, edgecolor="black", alpha=1.0)
        plt.show()

Section 2.3: PCA for Visualization


Why PCA?
The Iris dataset has 4 features; to plot Voronoi diagrams and decision boundaries, you need
2D data.
How?
Use PCA to reduce the data to two principal components, then plot as above.
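A minimal sketch using sklearn's PCA, assuming the decision_boundary_plot function from Section 2.2 (k = 5 is an illustrative choice):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
# Project the 4 Iris features onto the first two principal components
X_2d = PCA(n_components=2).fit_transform(iris.data)

# The 2D projection can now be used for the Voronoi and decision-boundary plots above
decision_boundary_plot(X_2d, iris.target, k=5)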

Section 2.4: Confusion Matrix and Classification Report


Confusion Matrix:
A table showing the number of correct and incorrect predictions for each class. Diagonal
values are correct; off-diagonal are mistakes.
Classification Report:
Gives precision, recall, F1-score, and support for each class.
Precision: Of all predicted as class X, how many were correct?
Recall: Of all actual class X, how many did we find?
F1-score: Harmonic mean of precision and recall.
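A minimal sketch with sklearn's metrics, assuming y_test holds the true test labels and pred the predicted labels (variable names are illustrative, e.g. from the Iris experiment above):

from sklearn.metrics import confusion_matrix, classification_report

print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))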
Example Output:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        13
           1       0.88      1.00      0.94        22
           2       1.00      0.80      0.89        15

    accuracy                           0.94        50
   macro avg       0.96      0.93      0.94        50
weighted avg       0.95      0.94      0.94        50
Section 3: Applying KNN on the Car Evaluation Dataset
Data Preparation:
Categorical features are label-encoded to numbers.
Data is split into train/test sets.
KNN Training and Evaluation:
KNN is trained and tested as above.
Accuracy is reported (e.g., 89.88%).
Visualization:
PCA reduces the data to 2D for plotting Voronoi diagrams and decision boundaries.
Confusion matrix and classification report are generated for model evaluation.
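A minimal sketch of the label-encoding step, assuming the Car Evaluation data has already been loaded into a pandas DataFrame named df (the name and column handling are illustrative):

from sklearn.preprocessing import LabelEncoder

# df: DataFrame holding the Car Evaluation data, where every column is categorical
df_encoded = df.copy()
for col in df_encoded.columns:
    df_encoded[col] = LabelEncoder().fit_transform(df_encoded[col])
# The encoded columns can then be split into features/target and fed to KNN as in Section 1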

Summary Table

KNN from scratch: You implement and understand the algorithm's logic yourself rather than just calling a library

Weighted KNN: Closer neighbors have more influence on the prediction

Voronoi diagrams: Visualize which points "own" which regions of space

Decision boundaries: Show where class predictions change in feature space

PCA: Reduces high-dimensional data to 2D for visualization

Confusion matrix: Shows details of correct/incorrect predictions per class

Classification report: Gives precision, recall, and F1-score for each class

In summary:
This lab teaches you to implement KNN from scratch, understand how it works, visualize its
behavior using Voronoi diagrams and decision boundaries, and evaluate its performance with
confusion matrices and classification reports. You also learn how to handle categorical data, use
PCA for visualization, and interpret the strengths and weaknesses of your classifier [1] [2] [3] [4] [5] [6] [7].

1. https://www.machinelearningmastery.com/tutorial-to-implement-k-nearest-neighbors-in-python-from-scratch/
2. https://www.kaggle.com/code/jebathuraiibarnabas/knn-from-scratch-with-visualization
3. https://realpython.com/knn-python/
4. https://www.kaggle.com/code/just4jcgeorge/k-nearest-neighbour-algorithm
5. https://dataaspirant.com/k-nearest-neighbor-algorithm-implementaion-python-scratch/
6. https://www.scribd.com/document/736817575/MACHINE-LEARNING-LAB-MANUAL
7. AIML_Module_3_Lab_2_Implementing_KNN_from_scratch_and_visualize_Algorithm_performance.ipynb - Colab.pdf
