Lab 1 1.2
Name of the Experiment: Implementation of the Nearest Neighbor classification algorithm with
and without distorted patterns.
Dataset: MNIST dataset
The MNIST (Modified National Institute of Standards and Technology) dataset is a widely
used dataset for handwritten digit recognition and image classification tasks. It consists of
70,000 grayscale images of handwritten digits (0-9) and is commonly used as a benchmark for
testing and comparing the performance of various machine learning algorithms.
Some Key Characteristics of the MNIST Dataset:
➢ Image size: The images in the MNIST dataset are 28×28 pixels in size, making them
relatively small and easy to work with.
➢ Image format: The images are stored in a 28×28 array of pixel values, with each
pixel having a value between 0 and 255, representing the intensity of the pixel.
➢ Labels: The MNIST dataset includes labels for each image, indicating the digit that
the image represents.
➢ Training and Test Sets: 60,000 training images and 10,000 testing images.
➢ Balance: The MNIST dataset is well-balanced, with roughly equal numbers of
images for each digit.
➢ Features: Number of total features is 784.
➢ Classes: Total 10 classes.
Ratio of training and Test Dataset:
The MNIST dataset is typically split into two sets: a training set of 60,000 images and a
test set of 10,000 images. So, the ratio of the training set to the test set is
60,000/10,000 = 6:1.
Implementation:
First, all the dependencies are loaded.
Code:
import numpy as np
import pandas as pd
import statistics
from statistics import mode
import tensorflow as tf
Then the dataset is loaded, and the training data (x_train), training class labels (y_train),
test data (x_test), and test class labels (y_test) are extracted from it.
(x_train,y_train),(x_test,y_test) = tf.keras.datasets.mnist.load_data(path='mnist.npz')
The data and label arrays are reshaped into 2-D arrays for flexibility, i.e. each 28×28 image becomes a 784-element row, and the training and test portions are then stacked into the combined arrays MN_image and MN_label.
x_train = x_train.reshape(x_train.shape[0],784)
y_train = y_train.reshape(y_train.shape[0],1)
x_test = x_test.reshape(x_test.shape[0],784)
y_test = y_test.reshape(y_test.shape[0],1)
MN_image = np.vstack((x_train,x_test))
MN_label = np.vstack((y_train,y_test))
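As a quick sanity check (the expected shapes below follow directly from the reshape and stack above), the combined arrays can be inspected:
print(MN_image.shape)   # expected: (70000, 784), 70,000 images with 784 pixel features each
print(MN_label.shape)   # expected: (70000, 1), one class label per image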
After that, I have defined a function that implements the KNN algorithm. It performs the
following operations:
➢ Takes a test image and a k-value.
➢ Measures the Manhattan distance between two images, where the features are the pixel
values. The Manhattan distance between two points (X1, Y1) and (X2, Y2) is given by
|X1 − X2| + |Y1 − Y2|; for the 784-pixel images it is the sum of the absolute pixel-wise differences.
sort_distance.append((np.absolute(MN_image[i,] - MN_image[j,]).sum(), i, j, MN_label[j][0]))
sort_distance.sort(key=lambda tup: tup[0])
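The full body of the function is not reproduced in this report; the sketch below shows one way it might look, assuming the signature KNN(k, i) used in the evaluation loop (i being the index of the test image in MN_image) and a majority vote with mode() from the statistics module imported above. The exact neighbour count per call and the voting details are assumptions, not the verbatim code.
def KNN(k, i):
    # Sketch: classify test image i using its k nearest training images
    # under the Manhattan (L1) distance.
    sort_distance = []
    for j in range(0, 60000):                         # training images only
        dist = np.absolute(MN_image[i,] - MN_image[j,]).sum()
        sort_distance.append((dist, i, j, MN_label[j][0]))
    sort_distance.sort(key=lambda tup: tup[0])
    neighbour_labels = [t[3] for t in sort_distance[:k]]
    # statistics.mode raises StatisticsError on a tied vote (Python < 3.8),
    # which the evaluation loop below catches before retrying with a larger k.
    return mode(neighbour_labels)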
KNN has been applied to all test images, and then the accuracy of the model is
calculated. It should be noted that when more than one majority label exists (a tied vote), the
situation has been resolved by increasing the k-value; a small sketch of this tie check is given
after the list below. The value of K can greatly impact the performance of a KNN model, and
choosing the right value can be crucial. The value of K depends on various attributes, including:
➢ Number of classes
➢ Data distribution
➢ Outliers
➢ Noise
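As a hedged illustration of the tie-breaking idea (not the exact code used in the experiment), the vote among the k nearest labels can be inspected with collections.Counter, and the query repeated with a larger k whenever the two most common labels receive the same number of votes:
from collections import Counter

def vote_with_tie_check(neighbour_labels):
    # Returns (label, tied); tied is True when the two most common
    # labels received the same number of votes.
    counts = Counter(neighbour_labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return counts[0][0], True        # tie: retry with a larger k
    return counts[0][0], False

# Example: a 4-neighbour vote that ends in a tie between labels 3 and 5
label, tied = vote_with_tie_check([3, 5, 3, 5])
print(label, tied)                        # tied is True, so k should be increased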
c = 0
for i in range(60000, 70000):        # indices of the test images in MN_image
    for k in range(0, 2):            # retry with a larger k when the vote is tied
        try:
            t = KNN(k, i)
            if t == MN_label[i][0]:  # prediction matches the true label
                c = c + 1
            break
        except:
            a = 0                    # tied vote raised an exception; try the next k
print("Accuracy:", (c * 100) / 10000)
Analysis Of Accuracy:
Table 1.1: Accuracy of KNN on MNIST dataset
K-value    Accuracy (%)
1          76.39
7          81.76
Limitations of KNN:
➢ The dataset is randomly distributed, so this classifier does not perform well on it.
➢ The dataset has many outliers, and KNN is very sensitive to outliers.
➢ If the labels of some data points are missing, performance degrades.
➢ When the dataset has more classes, performance degrades further.
➢ On a dataset that has no labels, KNN performs poorly as a classifier.
Conclusion:
KNN performs well when classifying data between two classes. As the number of classes
increases, it becomes harder to classify the data accurately, because the shortest distance
from a point to be classified may be the same for more than one class. Though it has some
drawbacks, KNN is used as a simple, lightweight classifier.