KNN Algorithm
(KNN) Classifier
[Figure: feature values X1, X2, X3, …, Xn enter the classifier, which outputs category Y; the classifier draws on DB, a collection of instances with known categories.]
k NEAREST NEIGHBOR
Requires 3 things:
Feature space (training data)
Distance metric
• to compute the distance between records
The value of k
• the number of nearest neighbors to retrieve, from which to get the majority class
To classify an unknown record:
Compute its distance to every
training record
Identify k nearest neighbors
Use class labels of nearest
neighbors to determine the
class label of unknown record
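
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance as the metric (the slides do not fix one), and the names `euclidean`, `knn_classify`, and the toy `training` records are hypothetical. Ties in the vote fall to whichever tied label was encountered first among the nearest neighbors.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(training, query, k):
    """Return the majority label among the k training records nearest to query.

    training is a list of (features, label) pairs.
    """
    # Step 1: compute the distance from the unknown record to every training record.
    distances = [(euclidean(features, query), label) for features, label in training]
    # Step 2: identify the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Step 3: use the neighbors' class labels (plain majority vote).
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data, invented for illustration only.
training = [
    ((1.0, 1.0), "square"),
    ((1.5, 2.0), "square"),
    ((5.0, 5.0), "triangle"),
    ((6.0, 5.5), "triangle"),
    ((5.5, 6.5), "triangle"),
]

print(knn_classify(training, (1.2, 1.4), k=3))  # prints: square
```

Sorting all distances is the simplest way to find the k nearest records; it touches every training record on every query.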
k NEAREST NEIGHBOR
k = 1:
Belongs to square class
k = 3:
Belongs to triangle class
k = 7:
Belongs to square class
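
The flip between classes as k changes can be reproduced with a small sketch. The coordinates below are invented so that the neighbors sit at distances 1 through 7 from the unknown record; they are not taken from the slide's figure, and `majority_label` is a hypothetical helper that assumes Euclidean distance.

```python
import math
from collections import Counter

def majority_label(training, query, k):
    """Plain majority vote among the k training records nearest to query."""
    nearest = sorted(training, key=lambda rec: math.dist(rec[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical points at distances 1, 2, 3, 4, 5, 6, 7 from the origin.
points = [
    ((1, 0), "square"),
    ((0, 2), "triangle"),
    ((-3, 0), "triangle"),
    ((0, -4), "square"),
    ((5, 0), "square"),
    ((0, 6), "square"),
    ((-7, 0), "square"),
]
unknown = (0, 0)

for k in (1, 3, 7):
    print(f"k = {k}: {majority_label(points, unknown, k)}")
# k = 1: square
# k = 3: triangle
# k = 7: square
```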
Value of k
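Larger k increases confidence in the prediction, since a single noisy neighbor is less likely to decide the vote
Note that if k is too large, the decision may be
skewed by distant neighbors from other classes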
Weighted evaluation of nearest neighbors
A plain majority vote may unfairly skew the decision
Revise the algorithm so that closer neighbors
have greater “vote weight”
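
One way to give closer neighbors a greater vote weight is to let each neighbor contribute 1/distance instead of 1. The sketch below assumes that inverse-distance weighting and Euclidean distance; the slides do not specify either, and other schemes such as 1/distance² are equally plausible. The function name `weighted_knn_classify`, the `eps` guard against division by zero, and the toy data are all hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_classify(training, query, k, eps=1e-9):
    """Vote among the k nearest records, each weighted by 1 / distance."""
    nearest = sorted(training, key=lambda rec: math.dist(rec[0], query))[:k]
    scores = defaultdict(float)
    for features, label in nearest:
        d = math.dist(features, query)
        # Assumed weighting: closer neighbors add more to their class score.
        scores[label] += 1.0 / (d + eps)
    # The class with the largest total weight wins.
    return max(scores, key=scores.get)

# Hypothetical data: one very close square, two farther triangles.
training = [
    ((0.5, 0.0), "square"),
    ((0.0, 3.0), "triangle"),
    ((3.0, 0.0), "triangle"),
]

print(weighted_knn_classify(training, (0.0, 0.0), k=3))  # prints: square
```

With k = 3 a plain majority would pick triangle (two votes to one), while the weighted vote picks square because the single square is much closer (weight about 2.0 against about 0.67 for the two triangles combined).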
Other distance measures
k-NN Time Complexity