MachineLearning-Spring24 - KNN Implementation For Classification
K-Nearest Neighbors (KNN) is a simple and intuitive supervised machine learning algorithm used for classification and regression tasks. In classification, KNN predicts the class label of a new data point based on the majority class of its 'k' nearest neighbors in the feature space. The 'k' neighbors are determined by measuring distances, typically the Euclidean distance, between the new data point and all data points in the training set. The algorithm makes no assumptions about the underlying data distribution, which makes it versatile and suitable for many types of datasets. However, its performance can degrade with high-dimensional or noisy data, and it can be computationally expensive, especially on large datasets, since prediction requires storing all training samples and computing distances to each of them.
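To make the voting procedure concrete, here is a minimal from-scratch sketch (not the scikit-learn implementation used later in this notebook); the function name knn_predict is illustrative, and the toy arrays mirror the brightness/saturation dataset introduced below.

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    # Euclidean distance from the query point to every training sample
    dists = np.linalg.norm(X_train - query, axis=1)
    # Positions of the k closest training samples
    nearest = np.argsort(dists)[:k]
    # Majority vote among the labels of those k neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[40, 20], [50, 50], [60, 90], [10, 25]])
y_train = np.array(['Red', 'Blue', 'Blue', 'Red'])
print(knn_predict(X_train, y_train, np.array([20, 35])))  # -> Red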
Minkowski Distance:
The Minkowski distance is a generalization of other distance measures such as Euclidean distance and
Manhattan distance. It calculates the distance between two points in a multi-dimensional space.
The Minkowski distance between two points $P$ and $Q$ in $n$-dimensional space is given by:

$$D(P, Q) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$
Where:
- $x_i$ and $y_i$ are the $i$-th coordinates of points $P$ and $Q$ respectively.
- $p$ is a parameter that defines the order of the Minkowski distance. When $p = 1$, it is equivalent to the Manhattan distance, and when $p = 2$, it is equivalent to the Euclidean distance.
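As a quick sketch of the formula in code (the helper name minkowski and the two sample points are illustrative; the points are the two Red rows of the toy dataset used later):

import numpy as np

def minkowski(a, b, p):
    # Minkowski distance of order p between points a and b
    return np.sum(np.abs(a - b) ** p) ** (1 / p)

a, b = np.array([40, 20]), np.array([10, 25])
print(minkowski(a, b, p=1))  # 35.0, the Manhattan distance
print(minkowski(a, b, p=2))  # ~30.41, the Euclidean distance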
Euclidean Distance:
The Euclidean distance is a measure of the straight-line distance between two points in Euclidean space.
The Euclidean distance between two points $P$ and $Q$ in $n$-dimensional space is given by:

$$D(P, Q) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
Where:
- $x_i$ and $y_i$ are the $i$-th coordinates of points $P$ and $Q$ respectively.
These distances are commonly used in various machine learning algorithms and data analysis tasks to
quantify the similarity or dissimilarity between data points in a dataset.
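In practice these measures rarely need to be hand-coded; SciPy ships reference implementations in scipy.spatial.distance. A brief sketch using the same illustrative pair of points as above:

import numpy as np
from scipy.spatial import distance

a, b = np.array([40, 20]), np.array([10, 25])
print(distance.cityblock(a, b))       # 35.0, Manhattan (Minkowski with p = 1)
print(distance.euclidean(a, b))       # ~30.41, Euclidean (Minkowski with p = 2)
print(distance.minkowski(a, b, p=3))  # a higher-order Minkowski distance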
In [1]:
import pandas as pd

# Toy dataset: four color samples described by brightness and saturation
df = pd.DataFrame({'brightness': [40, 50, 60, 10],
                   'saturation': [20, 50, 90, 25],
                   'Class': ['Red', 'Blue', 'Blue', 'Red']})
print(df)
brightness saturation Class
0 40 20 Red
1 50 50 Blue
2 60 90 Blue
3 10 25 Red
What is scikit-learn?
Scikit-learn, often abbreviated as sklearn, is one of the most popular and widely used machine learning
libraries in Python. It provides simple and efficient tools for data mining and data analysis, built on top of
NumPy, SciPy, and matplotlib.
In [2]:
# import kNN from sklearn
from sklearn.neighbors import KNeighborsClassifier
Each row of the feature matrix corresponds to a single sample or data point (a feature vector), and each column corresponds to a feature or attribute of that sample. Therefore, the dimensions of the feature matrix X are typically m × n, where m is the number of samples and n is the number of features.
For example, in a dataset with m samples and n features, the feature matrix X would look like this:
$$X = \begin{bmatrix}
x_{11} & x_{12} & \dots & x_{1n} \\
x_{21} & x_{22} & \dots & x_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
x_{m1} & x_{m2} & \dots & x_{mn}
\end{bmatrix}$$

where $x_{ij}$ represents the value of the $j$-th feature of the $i$-th sample.
So, in the context of scikit-learn or any machine learning library, when you refer to the feature matrix X,
you are essentially talking about the data matrix containing the features of your dataset.
In [3]:
# Select the two feature columns from df as the feature matrix
X = df[['brightness', 'saturation']]
X
Out[3]:
   brightness  saturation
0          40          20
1          50          50
2          60          90
3          10          25
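As a quick illustrative follow-up, the dimensions can be checked against the m × n description above:

print(X.shape)       # (4, 2): m = 4 samples, n = 2 features
print(X.to_numpy())  # the underlying data matrix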
For classification problems, y contains the discrete class labels or categories assigned to each sample, e.g. Red, Blue. In regression problems, y contains continuous values representing the target variable to be predicted for each sample, e.g. 2500, 35000, 400000, 200000.
In [4]:
# Select the class-label column from df as the target vector
y = df['Class']
y
Out[4]: 0 Red
1 Blue
2 Blue
3 Red
Name: Class, dtype: object
In [5]:
# Creating and training the KNN classifier
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X, y)

# Classify an unseen sample (its feature values appear in the output below)
X_test = pd.DataFrame({'brightness': [20], 'saturation': [35]})
y_pred = knn.predict(X_test)
print("The predicted label for sample: \n\n", X_test, " \n\nis ", y_pred)

The predicted label for sample:

   brightness  saturation
0          20          35

is  ['Red']
/home/abdullah/anaconda3/lib/python3.9/site-packages/sklearn/neighbors/_classification.py:228: FutureWarning: Unlike other reduction functions (e.g. `skew`, `kurtosis`), the default behavior of `mode` typically preserves the axis it acts along. In SciPy 1.11.0, this behavior will change: the default value of `keepdims` will become False, the `axis` over which the statistic is taken will be eliminated, and the value None will no longer be accepted. Set `keepdims` to True or False to avoid this warning.
  mode, _ = stats.mode(_y[neigh_ind, k], axis=1)
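Beyond the hard label, KNeighborsClassifier can also report the vote proportions among the k neighbors through predict_proba; a short illustrative follow-up (with this toy data, 2 of the 3 nearest neighbors are Red):

print(knn.classes_)               # class order used for the columns below
print(knn.predict_proba(X_test))  # fraction of the k = 3 neighbors in each class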
In [6]:
# Finding the distances and indices of the k nearest neighbors
distances, indices = knn.kneighbors(X_test)
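Both returned arrays have shape (n_queries, k). A small illustrative follow-up that maps the neighbor indices back to their training labels, i.e. the votes behind the prediction above:

print("Distances:", distances)  # distance from the test point to each of its 3 neighbors
print("Indices:", indices)      # row positions of those neighbors in the training set
print("Neighbor labels:", y.iloc[indices[0]].tolist())  # the labels that were voted on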