K Nearest Neighbors
Saed Sayad
www.ismartsoft.com 1
KNN – Definition
KNN – different names
• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Case-Based Reasoning
• Lazy Learning
KNN – Short History
• Nearest-neighbor methods have been used in statistical
estimation and pattern recognition since the beginning
of the 1970s (as non-parametric techniques).
• Dynamic Memory: A Theory of Reminding and
Learning in Computers and People (Schank, 1982).
• People reason by remembering and learn by doing.
• Thinking is reminding, making analogies.
• Examples = Concepts???
KNN Classification
[Figure: Loan ($) plotted against Age]
KNN Classification – Distance
Age Loan Default Distance
25 $40,000 N 102000
35 $60,000 N 82000
45 $80,000 N 62000
20 $20,000 N 122000
35 $120,000 N 22000
52 $18,000 N 124000
23 $95,000 Y 47000
40 $62,000 Y 80000
60 $100,000 Y 42000
48 $220,000 Y 78000
33 $150,000 Y 8000
48 $142,000 ?
Euclidean distance:
$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
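The Distance column above can be reproduced with a short Python sketch. The names `TRAIN`, `QUERY`, `euclidean` and `nearest_label` are illustrative and not part of the slides; the data rows are copied from the table, and the distances are rounded to whole numbers as the table appears to do.

```python
import math

# Training rows copied from the table above: (age, loan, default)
TRAIN = [
    (25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
    (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
    (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
    (48, 220_000, "Y"), (33, 150_000, "Y"),
]
QUERY = (48, 142_000)  # the row marked "?" in the table


def euclidean(p, q):
    """D = sqrt((x1 - x2)^2 + (y1 - y2)^2), written for any number of features."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_label(train, query):
    """Return the label of the single closest training row (K = 1)."""
    closest = min(train, key=lambda row: euclidean(row[:2], query))
    return closest[2]


# Rounded distances, in the same row order as the table
distances = [round(euclidean(row[:2], QUERY)) for row in TRAIN]
prediction = nearest_label(TRAIN, QUERY)
```

Because loan amounts are in the tens of thousands while ages are in the tens, the loan difference dominates every distance; the closest row is (33, $150,000), so the K=1 prediction on the raw values is Y.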
KNN Classification – Standardized Distance
Age Loan Default Distance
0.125 0.11 N 0.7652
0.375 0.21 N 0.5200
0.625 0.31 N 0.3160
0 0.01 N 0.9245
0.375 0.50 N 0.3428
0.8 0.00 N 0.6220
0.075 0.38 Y 0.6669
0.5 0.22 Y 0.4437
1 0.41 Y 0.3650
0.7 1.00 Y 0.3861
0.325 0.65 Y 0.3771
0.7 0.61 ?
Standardized variable:
$X_s = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$
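A minimal sketch of the min-max standardization, assuming Min and Max are taken per column over the eleven training rows (which matches the Age and Loan values shown in the table). The helper names `min_max_scale` and `standardize` are hypothetical, chosen for illustration.

```python
import math

# Training rows copied from the unstandardized table: (age, loan, default)
TRAIN = [
    (25, 40_000, "N"), (35, 60_000, "N"), (45, 80_000, "N"),
    (20, 20_000, "N"), (35, 120_000, "N"), (52, 18_000, "N"),
    (23, 95_000, "Y"), (40, 62_000, "Y"), (60, 100_000, "Y"),
    (48, 220_000, "Y"), (33, 150_000, "Y"),
]
QUERY = (48, 142_000)


def min_max_scale(value, lo, hi):
    """Xs = (X - Min) / (Max - Min)."""
    return (value - lo) / (hi - lo)


def standardize(train, query):
    """Scale every feature with the training Min and Max; the last column is the label."""
    n = len(query)
    bounds = [(min(r[j] for r in train), max(r[j] for r in train)) for j in range(n)]

    def scale(point):
        return tuple(min_max_scale(point[j], *bounds[j]) for j in range(n))

    return [(scale(r[:n]), r[-1]) for r in train], scale(query)


scaled_train, scaled_query = standardize(TRAIN, QUERY)
distances = [math.dist(point, scaled_query) for point, _ in scaled_train]
nearest = min(range(len(distances)), key=distances.__getitem__)
nearest_label = scaled_train[nearest][1]
```

After scaling, both features contribute on a comparable 0 to 1 range. The closest row becomes (45, $80,000) with label N, whereas the raw-value distances pick (33, $150,000) with label Y, so the K=1 prediction changes once the variables are standardized.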
KNN Regression – Distance
Age Loan House Price Index Distance
25 $40,000 135 102000
35 $60,000 256 82000
45 $80,000 231 62000
20 $20,000 267 122000
35 $120,000 139 22000
52 $18,000 150 124000
23 $95,000 127 47000
40 $62,000 216 80000
60 $100,000 139 42000
48 $220,000 250 78000
33 $150,000 264 8000
48 $142,000 ?
$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
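For regression the same distances are used, but the prediction is a number rather than a class. The sketch below is one possible reading, assuming the prediction is the plain (unweighted) mean of the House Price Index over the K closest rows; `knn_regress` is a hypothetical helper name, and the rows are copied from the table above.

```python
import math

# Rows copied from the table above: (age, loan, house price index)
TRAIN = [
    (25, 40_000, 135), (35, 60_000, 256), (45, 80_000, 231),
    (20, 20_000, 267), (35, 120_000, 139), (52, 18_000, 150),
    (23, 95_000, 127), (40, 62_000, 216), (60, 100_000, 139),
    (48, 220_000, 250), (33, 150_000, 264),
]
QUERY = (48, 142_000)  # the row marked "?" in the table


def knn_regress(train, query, k):
    """Average the target (last column) of the k rows closest to the query."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training rows")
    ranked = sorted(train, key=lambda row: math.dist(row[:2], query))
    nearest = ranked[:k]
    return sum(row[2] for row in nearest) / len(nearest)


hpi_k1 = knn_regress(TRAIN, QUERY, 1)  # target of the single closest row
hpi_k3 = knn_regress(TRAIN, QUERY, 3)  # mean target of the three closest rows
```

With these unstandardized distances, K=1 returns 264, the index of the row at distance 8000. K=3 averages the rows at distances 8000, 22000 and 42000, giving (264 + 139 + 139) / 3, roughly 180.7.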
KNN Regression – Standardized Distance
Age Loan House Price Index Distance
0.125 0.11 135 0.7652
0.375 0.21 256 0.5200
0.625 0.31 231 0.3160
0 0.01 267 0.9245
0.375 0.50 139 0.3428
0.8 0.00 150 0.6220
0.075 0.38 127 0.6669
0.5 0.22 216 0.4437
1 0.41 139 0.3650
0.7 1.00 250 0.3861
0.325 0.65 264 0.3771
0.7 0.61 ?
$X_s = \frac{X - \mathrm{Min}}{\mathrm{Max} - \mathrm{Min}}$
KNN – Number of Neighbors
• If K=1, select the single nearest neighbor.
• If K>1,
– For classification, select the most frequent class
among the K nearest neighbors.
– For regression, calculate the average of the target
values of the K nearest neighbors.
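The effect of K can be sketched directly from the standardized Distance columns in the earlier tables. The code below copies the Default label, House Price Index and standardized distance for each row; the names `NEIGHBORS`, `k_nearest`, `classify` and `regress` are illustrative assumptions, not part of the slides.

```python
from collections import Counter

# (default label, house price index, standardized distance), copied from the tables
NEIGHBORS = [
    ("N", 135, 0.7652), ("N", 256, 0.5200), ("N", 231, 0.3160),
    ("N", 267, 0.9245), ("N", 139, 0.3428), ("N", 150, 0.6220),
    ("Y", 127, 0.6669), ("Y", 216, 0.4437), ("Y", 139, 0.3650),
    ("Y", 250, 0.3861), ("Y", 264, 0.3771),
]


def k_nearest(rows, k):
    """Return the k rows with the smallest distance."""
    return sorted(rows, key=lambda row: row[2])[:k]


def classify(rows, k):
    """Majority vote over the labels of the k nearest rows."""
    votes = Counter(label for label, _, _ in k_nearest(rows, k))
    return votes.most_common(1)[0][0]


def regress(rows, k):
    """Unweighted mean of the target over the k nearest rows."""
    nearest = k_nearest(rows, k)
    return sum(value for _, value, _ in nearest) / len(nearest)


labels_by_k = {k: classify(NEIGHBORS, k) for k in (1, 3, 5)}
hpi_by_k = {k: regress(NEIGHBORS, k) for k in (1, 3)}
```

On this data the vote is N for K=1 and K=3 but Y for K=5, so the choice of K changes the answer. With two classes, an odd K avoids tied votes.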
Distance – Categorical Variables
X Y Distance
Male Male 0
Male Female 1
$x = y \Rightarrow D = 0$
$x \neq y \Rightarrow D = 1$
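The 0/1 rule is simple to express in code. The slide defines it for a single categorical variable; summing it over several categorical variables (a Hamming-style distance) is an assumption added here for illustration, as are the function names and the second attribute in the test values.

```python
def categorical_distance(x, y):
    """D = 0 when the two values are equal, D = 1 otherwise."""
    return 0 if x == y else 1


def mismatch_count(a, b):
    """Assumed extension: total number of attributes on which two records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(categorical_distance(x, y) for x, y in zip(a, b))


same = categorical_distance("Male", "Male")
different = categorical_distance("Male", "Female")
# Hypothetical two-attribute records, used only to exercise mismatch_count
two_attributes = mismatch_count(("Male", "Owner"), ("Female", "Owner"))
```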
Instance-Based Reasoning
• IB1 is based on standard KNN.
• IB2 is an incremental KNN learner that only
incorporates misclassified instances into the
classifier.
• IB3 discards instances that do not perform
well, based on a success record kept for each instance.
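The IB2 idea of storing only misclassified instances can be sketched as follows. This is a simplified illustration and not a reference implementation: it assumes a 1-nearest-neighbor check with Euclidean distance, assumes the first instance is always stored, and uses a hypothetical toy stream. The names `ib2_fit`, `nearest_label` and `STREAM` are made up for this example.

```python
import math


def nearest_label(stored, point):
    """Label of the stored instance closest to the point (1-NN)."""
    return min(stored, key=lambda item: math.dist(item[0], point))[1]


def ib2_fit(stream):
    """Process instances one at a time, keeping only those the current store gets wrong."""
    stored = []
    for point, label in stream:
        # Assumption: an empty store cannot classify, so the first instance is kept.
        if not stored or nearest_label(stored, point) != label:
            stored.append((point, label))
    return stored


# Hypothetical toy data: two well-separated groups
STREAM = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"),
    ((1.0, 1.0), "B"), ((0.9, 1.0), "B"),
    ((0.2, 0.1), "A"),
]
kept = ib2_fit(STREAM)
```

On this toy stream only two of the five instances are kept, one per group, because every later instance is already classified correctly by the stored ones. The result depends on the order in which instances arrive.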
Case-Based Reasoning
KNN – Applications
• Classification and Interpretation
– legal, medical, news, banking
• Problem-solving
– planning, pronunciation
• Function learning
– dynamic control
Questions?