
K Nearest Neighbor

The document explains the K-Nearest Neighbors (kNN) algorithm, which classifies unlabeled examples based on the most similar labeled examples from a training dataset. It discusses the importance of choosing the appropriate number of neighbors (k) to balance overfitting and underfitting, as well as techniques for calculating distance, such as Euclidean distance. Additionally, it highlights the lazy learning nature of kNN and provides resources for further learning.


K-Nearest Neighbors

Nearest Neighbor Classification

• In a single sentence, nearest neighbor classifiers are defined by their characteristic of classifying unlabeled examples by assigning them the class of the most similar labeled examples. Despite the simplicity of this idea, nearest neighbor methods are extremely powerful. They have been used successfully for:
  – Computer vision applications, including optical character recognition and facial recognition in both still images and video
  – Predicting whether a person will enjoy a movie that has been recommended to them (as in the Netflix challenge)
  – Identifying patterns in genetic data, for use in detecting specific proteins or diseases
The kNN Algorithm

• The kNN algorithm begins with a training dataset made up of examples that are classified into several categories, as labeled by a nominal variable.
• Assume that we have a test dataset containing unlabeled examples that otherwise have the same features as the training data.
• For each record in the test dataset, kNN identifies k records in the training data that are the "nearest" in similarity, where k is an integer specified in advance.
• The unlabeled test instance is assigned the class of the majority of its k nearest neighbors.
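To make the procedure concrete, here is a minimal sketch of these steps in Python; the helper name knn_predict and the toy food data are illustrative, not taken from the slides:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from the new instance to every training record
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # indices of the k nearest training records
    nearest = np.argsort(dists)[:k]
    # assign the class held by the majority of those neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# toy data: [sweetness, crunchiness] with labels (illustrative values)
X_train = np.array([[7, 3], [8, 5], [3, 6], [3, 7]])
y_train = np.array(["fruit", "fruit", "protein", "vegetable"])
print(knn_predict(X_train, y_train, np.array([6, 4]), k=3))   # "fruit"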
Example: "Classify me now!" – classifying a tomato by its nearest neighbors

Reference: Machine Learning with R, Brett Lantz, Packt Publishing
Calculating Distance

• Locating the tomato's nearest neighbors requires a distance function, or a formula that measures the similarity between two instances.
• There are many different ways to calculate distance.
• Traditionally, the kNN algorithm uses Euclidean distance, which is the distance one would measure if you could use a ruler to connect two points, illustrated in the previous figure by the dotted lines connecting the tomato to its neighbors.
Calculating Distance

Reference: Super Data Science


Distance

• Euclidean distance is specified by the following formula, where p and q are the examples to be compared, each having n features. The term p1 refers to the value of the first feature of example p, while q1 refers to the value of the first feature of example q:

  dist(p, q) = sqrt((p1 - q1)^2 + (p2 - q2)^2 + ... + (pn - qn)^2)

• The distance formula involves comparing the values of each feature. For example, to calculate the distance between the tomato (sweetness = 6, crunchiness = 4) and the green bean (sweetness = 3, crunchiness = 7), we can use the formula as follows:

  dist(tomato, green bean) = sqrt((6 - 3)^2 + (4 - 7)^2) = sqrt(18) ≈ 4.24
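A quick hand check of this calculation in Python (the variable names tomato and green_bean are illustrative); the Manhattan distance compared on the next slides is included for contrast:

import math

tomato = (6, 4)       # (sweetness, crunchiness)
green_bean = (3, 7)

# Euclidean: sqrt((6 - 3)^2 + (4 - 7)^2) = sqrt(18) ≈ 4.24
euclidean = math.sqrt(sum((p - q) ** 2 for p, q in zip(tomato, green_bean)))

# Manhattan: |6 - 3| + |4 - 7| = 6
manhattan = sum(abs(p - q) for p, q in zip(tomato, green_bean))

print(round(euclidean, 2), manhattan)   # 4.24 6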
Distance

Manhattan Distance vs. Euclidean Distance

Closest Neighbors
Choosing appropriate k

• Deciding how many neighbors to use for kNN determines how well the model will generalize to future data.
• The balance between overfitting and underfitting the training data is a problem known as the bias-variance tradeoff.
• Choosing a large k reduces the impact or variance caused by noisy data, but can bias the learner such that it runs the risk of ignoring small, but important patterns.
Choosing appropriate k

• In practice, choosing k depends on the difficulty of the concept to be learned and the number of records in the training data.
• Typically, k is set somewhere between 3 and 10. One common practice is to set k equal to the square root of the number of training examples.
• In the classifier, we might set k = 4, because there were 15 example ingredients in the training data and the square root of 15 is 3.87.
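A small illustration of this square-root rule of thumb (the variable names are illustrative):

import math

n_train = 15                    # number of training examples, as in the slide
k = round(math.sqrt(n_train))   # sqrt(15) ≈ 3.87, rounded to k = 4
print(k)                        # 4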
Min-Max normalization

• The traditional method of rescaling features for kNN is min-max normalization.
• This process transforms a feature such that all of its values fall in a range between 0 and 1. The formula for normalizing a feature is as follows. Essentially, the formula subtracts the minimum of feature X from each value and divides by the range of X:

  X_new = (X - min(X)) / (max(X) - min(X))

• Normalized feature values can be interpreted as indicating how far, from 0 percent to 100 percent, the original value fell along the range between the original minimum and maximum.
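A minimal sketch of min-max normalization, applying the formula directly and cross-checking with scikit-learn's MinMaxScaler (the sample feature values are illustrative):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0], [3.0], [6.0], [10.0]])   # one feature, illustrative values

# Direct use of the formula: (X - min) / (max - min)
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# The same rescaling via scikit-learn
X_scaled = MinMaxScaler().fit_transform(X)

print(np.allclose(X_norm, X_scaled))   # True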
Lazy Learning

• Using the strict definition of learning, a lazy learner is not really learning anything.
• Instead, it merely stores the training data. This allows the training phase to occur very rapidly, with a potential downside being that the process of making predictions tends to be relatively slow.
• Due to the heavy reliance on the training instances, lazy learning is also known as instance-based learning or rote learning.
Few Lazy Learning Algorithms

• K Nearest Neighbors
• Local Regression
• Lazy Naive Bayes
The kNN Algorithm
Python Packages needed

• pandas – Data Analytics
• numpy – Numerical Computing
• matplotlib.pyplot – Plotting graphs
• sklearn – Classification and Regression
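Typical imports for these packages, as a small sketch (the aliases pd, np and plt are common conventions, not mandated by the slides):

import pandas as pd                                    # data analytics
import numpy as np                                     # numerical computing
import matplotlib.pyplot as plt                        # plotting graphs
from sklearn.neighbors import KNeighborsClassifier    # kNN classification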
Sample Application
KNN – Classification : Dataset

• The best small project to start with on a new tool is the classification of iris flowers (e.g. the iris dataset).
• This is a good project because it is so well understood.
  – Attributes are numeric so you have to figure out how to load and handle data.
  – It is a classification problem, allowing you to practice with perhaps an easier type of supervised learning algorithm.
  – It is a multi-class classification problem (multi-nominal) that may require some specialized handling.
  – It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a screen or A4 page).
  – All of the numeric attributes are in the same units and the same scale, not requiring any special scaling or transforms to get started.
Dataset
KNN – Classification : Dataset
Pre-processing
Characterize
Finding error with changed k
Error rate
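The slides above show screenshots of the application; a self-contained sketch along the same lines is given below. Details such as test_size=0.3, random_state=42 and the range of k values are assumptions, not read from the screenshots:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier

# Dataset: 150 rows, 4 numeric attributes, 3 iris species
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Pre-processing: min-max normalization fitted on the training data only
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Finding error with changed k: test-set error rate for each k
ks = range(1, 26)
errors = [np.mean(KNeighborsClassifier(n_neighbors=k)
                  .fit(X_train, y_train)
                  .predict(X_test) != y_test) for k in ks]

plt.plot(ks, errors, marker='o')
plt.xlabel('k')
plt.ylabel('error rate')
plt.title('Error rate vs. k (iris test set)')
plt.show()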
Useful resources

• www.mitu.co.in
• www.pythonprogramminglanguage.com
• www.scikit-learn.org
• www.towardsdatascience.com
• www.medium.com
• www.analyticsvidhya.com
• www.kaggle.com
• www.stephacking.com
• www.github.com
Thank you
This presentation was created using LibreOffice Impress 5.1.6.2 and can be used freely as per the GNU General Public License.

mITuSkillologies   @mitu_group   /company/mitu-skillologies   MITUSkillologies

Web Resources
https://mitu.co.in
http://tusharkute.com

[email protected]
[email protected]
