
K Nearest Neighbors

KNN is a simple machine learning algorithm that stores all available cases and classifies new cases based on similarity. It works by finding the k nearest neighbors of a new case and assigning the most common label among them for classification or averaging their values for regression. The number of neighbors k can be tuned, and distance measures account for different types of attributes. While conceptually simple, KNN has proven effective for complex problems in areas like medicine, law, and customer service.

Uploaded by

surabhi
Copyright
© Public Domain

K Nearest Neighbors

Saed Sayad

www.ismartsoft.com 1
KNN - Definition

KNN is a simple algorithm that stores all available cases and classifies new
cases based on a similarity measure.

KNN – different names
• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Case-Based Reasoning
• Lazy Learning

KNN – Short History
• Nearest-neighbor methods have been used in statistical
estimation and pattern recognition since the
early 1970s (as non-parametric techniques).
• Dynamic Memory: A Theory of Reminding and
Learning in Computers and People (Schank, 1982).
• People reason by remembering and learn by doing.
• Thinking is reminding, making analogies.
• Examples = Concepts???

KNN Classification

[Figure: scatter plot of the cases; axes: Age and Loan ($)]

KNN Classification – Distance
Age   Loan       Default   Distance
25    $40,000    N         102000
35    $60,000    N         82000
45    $80,000    N         62000
20    $20,000    N         122000
35    $120,000   N         22000
52    $18,000    N         124000
23    $95,000    Y         47000
40    $62,000    Y         80000
60    $100,000   Y         42000
48    $220,000   Y         78000
33    $150,000   Y         8000

48    $142,000   ?

Euclidean distance:

$$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
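
The Distance column above can be reproduced with a few lines of Python. This is a minimal sketch using only the rows of the table; the names `cases`, `query`, `euclidean`, and `majority` are illustrative and not part of the original slides.

```python
import math
from collections import Counter

# Training cases from the table: (age, loan in dollars, default label)
cases = [
    (25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
    (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
    (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
    (48, 220_000, "Y"), (33, 150_000, "Y"),
]
query = (48, 142_000)  # the new case marked "?" in the table


def euclidean(p, q):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Rank all stored cases by their distance to the query, nearest first.
ranked = sorted(cases, key=lambda c: euclidean(c[:2], query))
nearest = ranked[0]

# Majority label among the three nearest cases (K = 3).
top3_labels = [label for _, _, label in ranked[:3]]
majority = Counter(top3_labels).most_common(1)[0][0]

print(nearest, round(euclidean(nearest[:2], query)))
print(top3_labels, majority)
```

Because Loan is measured in tens of thousands and Age in tens, the distance is determined almost entirely by Loan.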

KNN Classification – Standardized Distance
Age     Loan   Default   Distance
0.125   0.11   N         0.7652
0.375   0.21   N         0.5200
0.625   0.31   N         0.3160
0       0.01   N         0.9245
0.375   0.50   N         0.3428
0.8     0.00   N         0.6220
0.075   0.38   Y         0.6669
0.5     0.22   Y         0.4437
1       0.41   Y         0.3650
0.7     1.00   Y         0.3861
0.325   0.65   Y         0.3771

0.7     0.61   ?

Standardized variable:

$$X_s = \frac{X - \text{Min}}{\text{Max} - \text{Min}}$$
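
As a rough illustration of the standardization formula, the sketch below rescales Age and Loan to the range 0 to 1 and recomputes the distances. It assumes Min and Max are taken over the eleven training cases only (the new case happens to fall inside both ranges); the helper name `min_max` is made up for this example.

```python
import math

ages = [25, 35, 45, 20, 35, 52, 23, 40, 60, 48, 33]
loans = [40_000, 60_000, 80_000, 20_000, 120_000, 18_000,
         95_000, 62_000, 100_000, 220_000, 150_000]
labels = ["N", "N", "N", "N", "N", "N", "Y", "Y", "Y", "Y", "Y"]
query_age, query_loan = 48, 142_000


def min_max(x, lo, hi):
    """Standardize x to [0, 1] given the attribute's minimum and maximum."""
    return (x - lo) / (hi - lo)


age_lo, age_hi = min(ages), max(ages)
loan_lo, loan_hi = min(loans), max(loans)

# Standardized training points and standardized query.
scaled = [(min_max(a, age_lo, age_hi), min_max(l, loan_lo, loan_hi))
          for a, l in zip(ages, loans)]
q = (min_max(query_age, age_lo, age_hi), min_max(query_loan, loan_lo, loan_hi))

# math.dist (Python 3.8+) is the Euclidean distance between two points.
dists = [math.dist(p, q) for p in scaled]
best = min(range(len(dists)), key=dists.__getitem__)

print(ages[best], loans[best], labels[best], round(dists[best], 4))
```

After standardization the nearest case is the 45-year-old with an $80,000 loan (label N), whereas on the raw values it was the 33-year-old with a $150,000 loan (label Y). The scale of the attributes can therefore change the prediction.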
KNN Regression - Distance
Age   Loan       House Price Index   Distance
25    $40,000    135                 102000
35    $60,000    256                 82000
45    $80,000    231                 62000
20    $20,000    267                 122000
35    $120,000   139                 22000
52    $18,000    150                 124000
23    $95,000    127                 47000
40    $62,000    216                 80000
60    $100,000   139                 42000
48    $220,000   250                 78000
33    $150,000   264                 8000

48    $142,000   ?

$$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
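
For regression the same distances are used, but the prediction is a number rather than a label. The sketch below is one possible way to write it, averaging the House Price Index of the K nearest cases on the raw (unstandardized) values; `knn_regress` and its parameters are illustrative names.

```python
import math

# Training cases from the table: (age, loan in dollars, house price index)
cases = [
    (25, 40_000, 135), (35, 60_000, 256), (45, 80_000, 231),
    (20, 20_000, 267), (35, 120_000, 139), (52, 18_000, 150),
    (23, 95_000, 127), (40, 62_000, 216), (60, 100_000, 139),
    (48, 220_000, 250), (33, 150_000, 264),
]


def knn_regress(cases, query, k):
    """Average the target value of the k cases closest to the query."""
    ranked = sorted(cases, key=lambda c: math.dist(c[:2], query))
    targets = [target for _, _, target in ranked[:k]]
    return sum(targets) / k


query = (48, 142_000)
pred_k1 = knn_regress(cases, query, k=1)  # value of the single nearest case
pred_k3 = knn_regress(cases, query, k=3)  # mean of the three nearest cases
print(pred_k1, round(pred_k3, 2))
```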

KNN Regression – Standardized Distance
Age     Loan   House Price Index   Distance
0.125   0.11   135                 0.7652
0.375   0.21   256                 0.5200
0.625   0.31   231                 0.3160
0       0.01   267                 0.9245
0.375   0.50   139                 0.3428
0.8     0.00   150                 0.6220
0.075   0.38   127                 0.6669
0.5     0.22   216                 0.4437
1       0.41   139                 0.3650
0.7     1.00   250                 0.3861
0.325   0.65   264                 0.3771

0.7     0.61   ?

$$X_s = \frac{X - \text{Min}}{\text{Max} - \text{Min}}$$
KNN – Number of Neighbors
• If K=1, use the single nearest neighbor
• If K>1,
– For classification select the most frequent
class among the K nearest neighbors.
– For regression calculate the average value of the K
nearest neighbors.
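
To see how the choice of K affects the result, the following sketch applies the two rules above to the standardized distances listed in the tables of this deck. The function names are illustrative, and the tie-breaking behaviour (the label encountered first among the nearest neighbors wins) is an arbitrary choice made for this example, not something the slides specify.

```python
from collections import Counter
from statistics import mean

# (standardized distance, Default label, House Price Index) per training case,
# copied from the standardized tables.
rows = [
    (0.7652, "N", 135), (0.5200, "N", 256), (0.3160, "N", 231),
    (0.9245, "N", 267), (0.3428, "N", 139), (0.6220, "N", 150),
    (0.6669, "Y", 127), (0.4437, "Y", 216), (0.3650, "Y", 139),
    (0.3861, "Y", 250), (0.3771, "Y", 264),
]


def k_nearest(rows, k):
    """Return the k rows with the smallest distance, nearest first."""
    return sorted(rows, key=lambda r: r[0])[:k]


def classify(rows, k):
    """Most frequent label among the k nearest rows."""
    votes = Counter(label for _, label, _ in k_nearest(rows, k))
    return votes.most_common(1)[0][0]


def regress(rows, k):
    """Mean target value of the k nearest rows."""
    return mean(value for _, _, value in k_nearest(rows, k))


for k in (1, 3, 5):
    print(k, classify(rows, k), round(regress(rows, k), 2))
```

With these distances the predicted class is N for K=1 and K=3 but Y for K=5, which is why K is usually treated as a parameter to tune.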

Distance – Categorical Variables

X      Y        Distance
Male   Male     0
Male   Female   1

$$D = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \neq y \end{cases}$$
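
The 0/1 rule above is simple to express in code. The sketch below also sums the rule over several categorical attributes (the Hamming distance); that extension is an assumption added here for illustration, as is the hypothetical marital-status attribute in the usage lines.

```python
def overlap(x, y):
    """Distance between two categorical values: 0 if equal, 1 otherwise."""
    return 0 if x == y else 1


def categorical_distance(a, b):
    """Sum of per-attribute 0/1 distances over two equal-length records.

    Summing across attributes is an assumption of this sketch; the slide
    only defines the single-attribute case.
    """
    return sum(overlap(x, y) for x, y in zip(a, b))


print(overlap("Male", "Male"))    # rows of the table above
print(overlap("Male", "Female"))
# Hypothetical two-attribute records: (gender, marital status)
print(categorical_distance(("Male", "Single"), ("Female", "Single")))
```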

Instance Based Reasoning
• IB1 is based on standard KNN.
• IB2 is an incremental KNN learner that only
incorporates misclassified instances into the
classifier.
• IB3 keeps a success record for each stored instance
and discards instances that do not perform well.
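
The IB2 idea can be sketched in a few lines: present the training cases one at a time and store a case only if the cases stored so far would misclassify it. The code below is a simplified illustration under stated assumptions, not a reference implementation: it uses 1-nearest-neighbor with Euclidean distance, always stores the first case, and processes the standardized loan data in table order. The result depends on the order of presentation.

```python
import math

# Standardized (age, loan) points with their Default label, in table order.
training = [
    ((0.125, 0.11), "N"), ((0.375, 0.21), "N"), ((0.625, 0.31), "N"),
    ((0.0, 0.01), "N"), ((0.375, 0.50), "N"), ((0.8, 0.00), "N"),
    ((0.075, 0.38), "Y"), ((0.5, 0.22), "Y"), ((1.0, 0.41), "Y"),
    ((0.7, 1.00), "Y"), ((0.325, 0.65), "Y"),
]


def nearest_label(store, point):
    """Label of the stored case closest to the given point (1-NN)."""
    return min(store, key=lambda case: math.dist(case[0], point))[1]


def ib2(training):
    """Keep only the cases that the current store misclassifies."""
    store = []
    for point, label in training:
        # An empty store cannot classify anything, so the first case is kept.
        if not store or nearest_label(store, point) != label:
            store.append((point, label))
    return store


stored = ib2(training)
print(len(stored), "of", len(training), "cases kept")
for case in stored:
    print(case)
```

Storing fewer cases reduces the memory and search cost of plain KNN, at the risk of keeping noisy cases, which is the issue IB3 addresses with its success records.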

Case Based Reasoning

KNN - Applications
• Classification and Interpretation
– legal, medical, news, banking

• Problem-solving
– planning, pronunciation

• Function learning
– dynamic control

• Teaching and aiding
– help desk, user training
Summary
• KNN is conceptually simple, yet able to solve
complex problems
• Can work with relatively little information
• Learning is simple (no learning at all!)
• High memory and CPU cost (all cases are stored and compared)
• Feature selection problem
• Sensitive to representation

Questions?
