
K-Nearest Neighbors

INSTANCE BASED LEARNING

K-Nearest Neighbors

Key Idea:
Just store all training examples <x_i, f(x_i)>.

Thus the training algorithm is very simple: it amounts to memorizing the data.

Classification Algorithm (1-NN):
* Given a query instance x_q
* Locate the nearest training example x_n
* Estimate f(x_q) as f(x_n)
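A minimal sketch of this 1-NN rule (the function and variable names are illustrative, and math.dist assumes Python 3.8+):

import math

def one_nn(training, x_q):
    # Training amounted to storing the pairs <x_i, f(x_i)>; classification
    # finds the stored x_n nearest to x_q and returns f(x_n).
    x_n, f_x_n = min(training, key=lambda ex: math.dist(ex[0], x_q))
    return f_x_n

training = [((1.0, 2.0), "A"), ((4.0, 4.0), "B"), ((1.5, 1.8), "A")]
print(one_nn(training, (1.2, 1.9)))  # -> "A"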

Simple Analogy
KNN – Different names
• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning
What is KNN?
• A powerful classification algorithm used in pattern recognition.
• K-nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
• One of the top data mining algorithms used today.
• A non-parametric, lazy learning algorithm (an instance-based learning method).
KNN: Classification Approach
• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class amongst its K nearest neighbors, as measured by a distance function.
Distance Measures
For continuous variables, the distance between neighbors is typically measured with the Euclidean, Manhattan, or Minkowski distance (see the sketch below).
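As a sketch, the Minkowski distance generalizes the other two: p = 2 gives Euclidean and p = 1 gives Manhattan (function names here are illustrative):

def minkowski(x, y, p=2):
    # (sum_i |x_i - y_i|^p)^(1/p); p=2 is Euclidean, p=1 is Manhattan.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def euclidean(x, y):
    return minkowski(x, y, p=2)

def manhattan(x, y):
    return minkowski(x, y, p=1)

print(euclidean((1, 2), (4, 6)))  # -> 5.0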
K-Nearest Neighbor Algorithm
• All instances correspond to points in an n-dimensional feature space.
• Each instance is represented by a set of numerical attributes.
• Each training example consists of a feature vector and an associated class label.
• Classification is done by comparing the feature vector of a new example E with those of the stored training points.
• Select the K examples nearest to E in the training set.
• Assign E to the most common class among its K nearest neighbors.
3-KNN: Example
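As a stand-in for the original example figure, a minimal sketch of the whole procedure with K = 3 on hypothetical 2-D points:

import math
from collections import Counter

def knn_classify(training, x_q, k=3):
    # Take the K stored examples nearest to the query...
    neighbors = sorted(training, key=lambda ex: math.dist(ex[0], x_q))[:k]
    # ...and return the majority class among their labels.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

training = [((1, 1), "A"), ((2, 1), "A"), ((1, 2), "A"), ((5, 4), "B"), ((4, 5), "B")]
print(knn_classify(training, (1.5, 1.5), k=3))  # -> "A"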
How to choose K?
• If K is too small, the classifier is sensitive to noise points.
• A larger K works better, but too large a K may include majority points from other classes.
• A rule of thumb is K < sqrt(n), where n is the number of training examples (see the sketch below).
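A minimal sketch of the rule of thumb; stepping down to an odd K is an extra convention (assumed here, not from the slides) that prevents tied votes in two-class problems:

import math

def rule_of_thumb_k(n):
    # Start from sqrt(n) and step down to an odd K below it.
    k = int(math.sqrt(n))
    if k % 2 == 0:
        k -= 1
    return max(1, k)

print(rule_of_thumb_k(150))  # sqrt(150) ~ 12.2 -> K = 11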
INSTANCE BASED LEARNING

Distance-weighted kNN algorithm

If all the training examples are used to determine the classification of x_q, the algorithm is called a global method; otherwise it is called a local method.

For real-valued target functions, the global method is also known as Shepard's method.

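The weighting formula itself is not spelled out above; a common choice is the inverse-square weight w_i = 1 / d(x_q, x_i)^2. A sketch under that assumption, where k=None uses every training example (the global case):

import math
from collections import defaultdict

def distance_weighted_knn(training, x_q, k=None):
    # Sort stored examples by distance to the query.
    examples = sorted(training, key=lambda ex: math.dist(ex[0], x_q))
    if k is not None:
        examples = examples[:k]       # local method: K nearest only
    votes = defaultdict(float)
    for x_i, label in examples:
        d = math.dist(x_i, x_q)
        if d == 0.0:
            return label              # query coincides with a stored example
        votes[label] += 1.0 / d ** 2  # w_i = 1 / d(x_q, x_i)^2
    return max(votes, key=votes.get)

training = [((1, 1), "A"), ((2, 2), "A"), ((6, 6), "B")]
print(distance_weighted_knn(training, (1.5, 1.5)))  # -> "A"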
INSTANCE BASED LEARNING

Terminology

It is useful to know the following terms:
• Regression: approximating a real-valued target function.
• Residual: the approximation error f̂(x) − f(x).
• Kernel function: the function of distance that determines the weight of each training example, K(d(x_q, x_i)).
KNN Feature Weighting
• Scale each feature by its importance for classification.
• The weights can come from prior knowledge about which features are more important.
• The weights w_k can also be learned using cross-validation (see the sketch below).
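A minimal sketch of how such weights enter the distance computation (the weight values are hypothetical, standing in for learned or hand-chosen importances):

def weighted_euclidean(x, y, w):
    # Each squared feature difference is scaled by that feature's weight w_k.
    return sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)) ** 0.5

w = (2.0, 0.5)  # hypothetical: feature 0 weighted more heavily than feature 1
print(weighted_euclidean((1.0, 10.0), (2.0, 12.0), w))  # -> 2.0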
Feature Normalization
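One standard normalization for KNN is min-max scaling, which maps each feature to [0, 1] so that no attribute dominates the distance purely because of its units; a minimal sketch:

def min_max_normalize(data):
    # Rescale each feature column to the range [0, 1].
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(row, lo, hi))
            for row in data]

print(min_max_normalize([(1, 100), (2, 300), (3, 200)]))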
Nominal/Categorical Data
• Distance works naturally with numerical attributes.
• Binary categorical attributes can be encoded as 1 or 0 (see the sketch below).
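A sketch of the 0/1 treatment: matching attribute values contribute 0 to the distance and differing values contribute 1 (an overlap-style measure, assumed here as the natural reading of the slide):

def overlap_distance(x, y):
    # 0 per matching attribute, 1 per differing attribute, summed.
    return sum(0 if a == b else 1 for a, b in zip(x, y))

print(overlap_distance(("red", "yes"), ("red", "no")))  # -> 1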
KNN Classification – Distance and Standardized Distance
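Standardized distance replaces each raw value with its z-score (subtract the column mean, divide by the column standard deviation) before the usual distance is computed; a minimal sketch on hypothetical data:

import math
import statistics

def standardize(data):
    # Replace each value with its z-score: (x - column mean) / column stdev.
    cols = list(zip(*data))
    mu = [statistics.mean(c) for c in cols]
    sd = [statistics.stdev(c) for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, mu, sd)) for row in data]

data = [(25, 40000), (35, 60000), (45, 80000)]  # hypothetical (age, income) rows
z = standardize(data)
print(math.dist(z[0], z[1]))  # distance on the standardized scale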
Strengths of KNN
• Very simple and intuitive.
• Can be applied to data from any distribution.
• Gives good classification when the number of samples is large enough.
Weaknesses of KNN
• Takes more time to classify a new example.
• Must calculate and compare the distance from the new example to all stored examples.
• Choosing k may be tricky.
• Needs a large number of samples for accuracy.
