k Nearest Neighbors
(KNN) Classifier
Dr. Arvind Selwal
Central University of Jammu, J&K
Supervised learning
and classification
Given: dataset of instances with
known categories
Goal: using the “knowledge” in the dataset, classify a given instance, i.e., predict the category of the given instance that is rationally consistent with the dataset
Classifiers
[Diagram: feature values X1, X2, X3, …, Xn are fed into a Classifier, which outputs the category Y; the classifier is built from DB, a collection of instances with known categories.]
k NEAREST NEIGHBOR
Requires 3 things:
Feature space (training data)
Distance metric
• to compute the distance between records
The value of k
• the number of nearest neighbors to retrieve, from which to take the majority class
To classify an unknown record:
Compute distance to other
training records
Identify k nearest neighbors
Use class labels of the nearest
neighbors to determine the
class label of the unknown record
k NEAREST NEIGHBOR
For an unknown point (marked “?” in the example plot of square and triangle classes):
k = 1: belongs to the square class
k = 3: belongs to the triangle class
k = 7: belongs to the square class
Choosing the value of k:
If k is too small, sensitive to noise points
If k is too large, the neighborhood may include points from other classes
Choose an odd value of k to eliminate ties (in two-class problems)
K - Nearest Neighbors
For a given instance T, get the top k
dataset instances that are “nearest” to T
Select a reasonable distance measure
Inspect the category of these k
instances and choose the category C
represented by the most instances
Conclude that T belongs to category C
K - Nearest Neighbors
Algorithm
Input: training data set, test data set, value of 'k'
Steps:
Do for all test data points
i. Calculate the distance of the test data point from each of the training data points.
ii. Sort all the training data points in ascending order of the distance computed in step i.
iii. Choose the top k items from the sorted list of step ii.
if k = 1
then assign the class label of that single training data point to the test data point
else
assign the class label to the test data point by majority voting among the k neighbors
End do
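A minimal Python sketch of this algorithm, assuming numeric feature vectors and Euclidean distance; the function names, variable names, and toy data below are illustrative, not part of the original slides:

```python
import math
from collections import Counter

def euclidean(a, b):
    # Square root of the sum of squared feature differences
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train_X, train_y, test_point, k):
    # i. distance of the test point from every training point
    distances = [(euclidean(test_point, x), label)
                 for x, label in zip(train_X, train_y)]
    # ii. sort in ascending order of distance
    distances.sort(key=lambda pair: pair[0])
    # iii. keep the top k items
    top_k = distances[:k]
    if k == 1:
        return top_k[0][1]  # label of the single nearest neighbor
    # majority voting among the k nearest neighbors
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]

# Toy usage: two "square" and two "triangle" training points
train_X = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.1), (4.8, 5.3)]
train_y = ["square", "square", "triangle", "triangle"]
print(knn_classify(train_X, train_y, (1.1, 1.0), k=3))  # -> square
```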
Example 1
Determining decision on scholarship
application based on the following features:
Household income (annual income in
millions of pesos)
Number of siblings in family
High school grade (on a QPI scale of 1.0 –
4.0)
Intuition (reflected in the data set): award scholarships to high performers and to those with financial need
Distance formula
Euclidean distance: the square root of the sum of the squares of the differences
for two features: √((Δx)² + (Δy)²)
Intuition: similar samples should be
close to each other
May not always apply
(example: quota and actual sales)
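As a quick worked example of the formula, two instances with feature values (3, 4) and (0, 0) lie at distance √((3−0)² + (4−0)²) = √25 = 5.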
Example revisited
Suppose household income was
instead indicated in thousands of
pesos per month and that grades are
given on a 70-100 scale
Note the different results produced by
the kNN algorithm on the same dataset
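Because kNN is distance-based, a feature measured on a much larger numeric scale (such as income in thousands of pesos) dominates the computed distances and can flip the classification. A minimal sketch of min-max rescaling to [0, 1], one common remedy; the function and the income values below are illustrative, not from the slides:

```python
def min_max_scale(column):
    # Rescale a list of numeric values to the [0, 1] range
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

incomes = [120, 300, 80, 450]  # hypothetical monthly incomes (thousands of pesos)
print(min_max_scale(incomes))  # values now on the same [0, 1] scale as other rescaled features
```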
Non-numeric data
Feature values are not always
numbers
Example
Boolean values: Yes or no, presence
or absence of an attribute
Categories: Colors, educational
attainment, gender
How do these values factor into the
computation of distance?
Dealing with non-numeric
data
Boolean values => convert to 0 or 1
Applies to yes-no/presence-absence
attributes
Non-binary characterizations
Use natural progression when applicable;
e.g., educational attainment: GS, HS,
College, MS, PHD => 1,2,3,4,5
Assign arbitrary numbers but be careful
about distances; e.g., color: red, yellow, blue
=> 1, 2, 3 (see the encoding sketch after this list)
What about unavailable (missing) data?
(a 0 value is not always the answer)
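A minimal sketch of the boolean and categorical encodings described above; the mappings and the sample record are illustrative, and handling of missing values is left open, as the slide notes:

```python
# Boolean attribute: presence/absence or yes/no -> 1/0
def encode_boolean(value):
    return 1 if value in ("yes", "present", True) else 0

# Ordinal attribute with a natural progression (educational attainment)
EDUCATION = {"GS": 1, "HS": 2, "College": 3, "MS": 4, "PhD": 5}

# Nominal attribute with no natural order: arbitrary codes make the
# resulting distances arbitrary too, so interpret them with care
COLOR = {"red": 1, "yellow": 2, "blue": 3}

record = {"employed": "yes", "education": "College", "color": "blue"}
encoded = [encode_boolean(record["employed"]),
           EDUCATION[record["education"]],
           COLOR[record["color"]]]
print(encoded)  # [1, 3, 3]
```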
k-NN variations
Value of k
Larger k increases confidence in prediction
Note that if k is too large, decision may be
skewed
Weighted evaluation of nearest neighbors
Plain majority may unfairly skew decision
Revise the algorithm so that closer neighbors
have greater “vote weight” (see the sketch after this list)
Other distance measures
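A minimal sketch of distance-weighted voting, assuming the common 1/distance weighting scheme; the slides do not fix a particular weighting function, and the names and data below are illustrative:

```python
from collections import defaultdict

def weighted_knn_vote(neighbors):
    # neighbors: list of (distance, label) pairs for the k nearest neighbors
    weights = defaultdict(float)
    for dist, label in neighbors:
        # Closer neighbors cast a larger vote; the small constant avoids
        # division by zero when a neighbor coincides with the query point
        weights[label] += 1.0 / (dist + 1e-9)
    return max(weights, key=weights.get)

# One very close "award" neighbor outweighs two distant "deny" neighbors
print(weighted_knn_vote([(0.5, "award"), (2.0, "deny"), (2.5, "deny")]))  # -> award
```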
k-NN Time Complexity
Suppose there are m instances and n
features in the dataset
Nearest neighbor algorithm requires
computing m distances
Each distance computation involves
scanning through each feature value
Running time complexity is therefore proportional to m × n
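For example, classifying a single test instance against, say, m = 10,000 training instances with n = 20 features requires scanning 10,000 × 20 = 200,000 feature values.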
k NEAREST NEIGHBOR
ADVANTAGES
Simple technique that is easily implemented
Building the model is inexpensive
Extremely flexible classification scheme
does not involve preprocessing
Well suited for
Multi-modal classes (classes of multiple forms)
Records with multiple class labels
Asymptotic error rate is at most twice the Bayes rate (Cover & Hart, 1967)
Can sometimes be the best method
Michihiro Kuramochi and George Karypis, Gene Classification using
Expression Profiles: A Feasibility Study, International Journal on
Artificial Intelligence Tools, Vol. 14, No. 4, pp. 641-660, 2005
K nearest neighbor outperformed SVM for protein function prediction
using expression profiles
k NEAREST NEIGHBOR
DISADVANTAGES
Classifying unknown records is relatively
expensive
Requires computing the distance to the training
records to find the k nearest neighbors
Computationally intensive, especially when the size
of the training set grows
Accuracy can be severely degraded by the
presence of noisy or irrelevant features
NN classification expects the class-conditional
probability to be locally constant, an assumption
that is biased in high dimensions