Week 3. K-Nearest Neighbours (KNN) : Dr. Shuo Wang
k-Nearest Neighbours
(kNN)
Dr. Shuo Wang
Overview
§ Intuitive understanding
§ The kNN algorithm
§ Pros/cons
kNN Basics
§ Full name: k-Nearest Neighbours (kNN, or k-NN).
§ It is nonparametric.
No assumption about the functional form of the model.
§ It is instance-based.
Predictions are made by comparing a new point with the data points in the training set, rather than by fitting an explicit model.
§ It is a lazy algorithm.
There is no explicit training step; all computation is deferred until prediction time (see the sketch after this list).
§ Can be used for both classification and regression problems.
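To make these properties concrete, here is a minimal sketch of a kNN classifier in plain Python/NumPy. The names (knn_predict, X_train, y_train) are illustrative, not from the lecture, and it assumes Euclidean distance with majority voting:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest
    training points (Euclidean distance). Illustrative sketch."""
    # "Lazy": no training step -- all work happens here, at prediction time.
    dists = np.linalg.norm(X_train - x_new, axis=1)   # distance to every training point
    nearest = np.argsort(dists)[:k]                   # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)      # count class labels among neighbours
    return votes.most_common(1)[0][0]                 # majority class

# Toy example: two classes in 2-D
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [4.8, 5.2]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([1.1, 0.9]), k=3))  # -> 0
```

For regression, the majority vote would simply be replaced by the mean (or a distance-weighted mean) of the neighbours' target values.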
Intuitive Understanding
Instead of approximating a model function 𝑓(𝑥) globally, kNN approximates the label of a new point locally, based on its nearest neighbours in the training data.
[Figure: model complexity as a function of 1/k, from underfit (small 1/k, large k) to overfit (large 1/k, small k)]
§ Small k -> small neighbourhood -> high complexity -> may overfit
§ Large k -> large neighbourhood -> low complexity -> may underfit
§ Practitioners often choose k between 3 and 15, or k < √𝑁 (N is the number of training examples).
§ Refer to "model selection/evaluation", covered next week; a cross-validation sketch for choosing k is shown below.
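As a preview of next week's material, a common way to pick k is cross-validation. The following sketch uses scikit-learn and its iris dataset (an assumption; the lecture does not prescribe a library or dataset) to score a range of k values and keep the best:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Score each candidate k by 5-fold cross-validated accuracy.
scores = {}
for k in range(3, 16):  # the 3-15 range suggested above
    clf = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(clf, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(f"best k = {best_k} (accuracy {scores[best_k]:.3f})")
```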
The issue of numeric attribute ranges