
K-Nearest Neighbour

Tushar B. Kute,
http://tusharkute.com
What sort of Machine Learning?

• An idea that can be used for machine learning is captured by a maxim involving poultry: "birds of a feather flock together."
• In other words, things that are alike are likely to have properties that are alike.
• We can use this principle to classify data by placing it in the category with the most similar, or "nearest," neighbors.
Nearest Neighbor Classification

• In a single sentence, nearest neighbor classifiers are defined by their characteristic of classifying unlabeled examples by assigning them the class of the most similar labeled examples. Despite the simplicity of this idea, nearest neighbor methods are extremely powerful. They have been used successfully for:
– Computer vision applications, including optical character recognition and facial recognition in both still images and video
– Predicting whether a person will enjoy a movie that has been recommended to him or her (as in the Netflix challenge)
– Identifying patterns in genetic data, for use in detecting specific proteins or diseases
The kNN Algorithm

• The kNN algorithm begins with a training dataset made up of examples that are classified into several categories, as labeled by a nominal variable.
• Assume that we have a test dataset containing unlabeled examples that otherwise have the same features as the training data.
• For each record in the test dataset, kNN identifies the k records in the training data that are the "nearest" in similarity, where k is an integer specified in advance.
• The unlabeled test instance is assigned the class of the majority of the k nearest neighbors.
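To make the procedure concrete, here is a minimal sketch of the kNN classification rule in Python, written from the description above (the function and variable names are illustrative, not from the original slides):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    # Euclidean distance from x_new to every training example
    distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    # Indices of the k smallest distances
    nearest = np.argsort(distances)[:k]
    # Majority vote among the k nearest neighbors
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Hypothetical two-feature training data and a new point to classify
X_train = np.array([[6.0, 4.0], [3.0, 7.0], [8.0, 5.0]])
y_train = np.array(['fruit', 'vegetable', 'fruit'])
print(knn_predict(X_train, y_train, np.array([7.0, 4.0]), k=3))  # 'fruit'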
Example:

[Figure: food ingredients plotted by two features, sweetness and crunchiness, with a new tomato to be classified among its labeled neighbors]

Reference: Machine Learning with R, Brett Lantz, Packt Publishing
Calculating Distance

• Locating the tomato's nearest neighbors requires a distance function, or a formula that measures the similarity between two instances.
• There are many different ways to calculate distance.
• Traditionally, the kNN algorithm uses Euclidean distance, which is the distance one would measure if a ruler were used to connect two points, illustrated in the previous figure by the dotted lines connecting the tomato to its neighbors.
Calculating Distance

[Figure: illustration of distance measures; Reference: Super Data Science]
Distance

• Euclidean distance is specified by the following formula, where p and q are the examples to be compared, each having n features. The term p1 refers to the value of the first feature of example p, while q1 refers to the value of the first feature of example q:

dist(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2}
• The distance formula involves comparing the values of each feature. For example, to calculate the distance between the tomato (sweetness = 6, crunchiness = 4) and the green bean (sweetness = 3, crunchiness = 7), we can use the formula as follows:
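Working through the formula with these values:

dist(tomato, green bean) = \sqrt{(6 - 3)^2 + (4 - 7)^2} = \sqrt{9 + 9} = \sqrt{18} \approx 4.2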
Distance

[Figure: the same pair of points measured with Manhattan distance versus Euclidean distance]

Closest Neighbors

[Figure: the tomato's nearest labeled neighbors in the sweetness-crunchiness plot]
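As a quick illustration of the two metrics above, here is a small Python sketch using the tomato and green bean feature values; the function names are my own:

import numpy as np

def euclidean(p, q):
    # Straight-line distance: square root of the sum of squared differences
    return np.sqrt(np.sum((np.asarray(p) - np.asarray(q)) ** 2))

def manhattan(p, q):
    # City-block distance: sum of absolute feature differences
    return np.sum(np.abs(np.asarray(p) - np.asarray(q)))

tomato, green_bean = [6, 4], [3, 7]
print(euclidean(tomato, green_bean))   # ~4.24
print(manhattan(tomato, green_bean))   # 6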
Choosing appropriate k

• Deciding how many neighbors to use for kNN determines how well the model will generalize to future data.
• The balance between overfitting and underfitting the training data is a problem known as the bias-variance tradeoff.
• Choosing a large k reduces the impact of variance caused by noisy data, but can bias the learner such that it runs the risk of ignoring small, but important, patterns.
Choosing appropriate k

[Figure: the effect of smaller versus larger values of k on the decision boundary]
Choosing appropriate k

• In practice, choosing k depends on the difficulty of the concept to be learned and the number of records in the training data.
• Typically, k is set somewhere between 3 and 10. One common practice is to set k equal to the square root of the number of training examples.
• In the classifier, we might set k = 4, because there were 15 example ingredients in the training data and the square root of 15 is 3.87.
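As a quick sketch of that heuristic (X_train is assumed to hold the training examples):

import math

# Square-root-of-n heuristic for choosing k
k = int(round(math.sqrt(len(X_train))))   # with 15 examples: sqrt(15) ≈ 3.87, so k = 4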
Min-Max normalization

• The traditional method of rescaling features for kNN is min-max normalization.
• This process transforms a feature such that all of its values fall in a range between 0 and 1. The formula for normalizing a feature is as follows. Essentially, the formula subtracts the minimum of feature X from each value and divides by the range of X:

X_{new} = \frac{X - \min(X)}{\max(X) - \min(X)}
• Normalized feature values can be interpreted as indicating how far, from 0 percent to 100 percent, the original value fell along the range between the original minimum and maximum.
Lazy Learning

• Using the strict definition of learning, a lazy learner is not really learning anything.
• Instead, it merely stores the training data verbatim. This allows the training phase to occur very rapidly, with a potential downside being that the process of making predictions tends to be relatively slow.
• Due to the heavy reliance on the training instances, lazy learning is also known as instance-based learning or rote learning.
A Few Lazy Learning Algorithms

• K Nearest Neighbors
• Local Regression
• Lazy Naive Bayes
Python Packages needed

• pandas
– Data Analytics
• numpy
– Numerical Computing
• matplotlib.pyplot
– Plotting graphs
• sklearn
– Classification and Regression Classes
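In the demo that follows, these packages would typically be imported along these lines (a conventional setup, not copied from the original slides):

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier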
Sample Application
KNN – Classification: Dataset

• The best small project to start with on a new tool is the classification of iris flowers (the iris dataset); a starter sketch follows the list below.
• This is a good project because it is so well understood.
– Attributes are numeric, so you have to figure out how to load and handle data.
– It is a classification problem, allowing you to practice with perhaps an easier type of supervised learning algorithm.
– It is a multi-class (multinomial) classification problem that may require some specialized handling.
– It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a screen or A4 page).
– All of the numeric attributes are in the same units and the same scale, not requiring any special scaling or transforms to get started.
[Demo slides: Dataset preview, Pre-processing, Characterize, Finding error with changed k, Error rate; the code and output appear in the original presentation]
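The "finding error with changed k" step is typically a loop over candidate values of k; a minimal sketch, reusing the variables and imports from the iris example above:

# Mean misclassification rate on the test set for each candidate k
error_rate = []
k_values = range(1, 26)
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    error_rate.append(np.mean(knn.predict(X_test) != y_test))

plt.plot(k_values, error_rate, marker='o')
plt.xlabel('k')
plt.ylabel('Error rate')
plt.title('Error rate vs. k')
plt.show()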
Useful resources

• www.mitu.co.in
• www.pythonprogramminglanguage.com
• www.scikit-learn.org
• www.towardsdatascience.com
• www.medium.com
• www.analyticsvidhya.com
• www.kaggle.com
• www.stephacking.com
• www.github.com
Thank you
This presentation was created using LibreOffice Impress 5.1.6.2 and can be used freely as per the GNU General Public License.

/mITuSkillologies   @mitu_group   /company/mitu-skillologies   MITUSkillologies

Web Resources
https://mitu.co.in
http://tusharkute.com

[email protected]
[email protected]
