Lecture Notes for Chapter 4: Instance-Based Learning. Introduction to Data Mining, 2nd Edition
Instance-Based Learning
Examples:
– Rote-learner
Memorizes the entire training data and performs
classification only if the attributes of a record exactly
match one of the training examples
– Nearest neighbor
Uses k “closest” points (nearest neighbors) for
performing classification
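The rote-learner described above can be sketched as a lookup table. This is a minimal illustration, not a reference implementation: the class name `RoteLearner`, its methods, and the toy records are hypothetical, and returning `None` when nothing matches is an assumed way of expressing "no classification".

```python
class RoteLearner:
    """Memorizes training records; classifies only on an exact attribute match."""

    def __init__(self):
        self._table = {}

    def fit(self, records, labels):
        # Store every training record verbatim, keyed by its attribute tuple.
        # Assumption: if the same record appears twice, the last label wins.
        for attrs, label in zip(records, labels):
            self._table[tuple(attrs)] = label
        return self

    def predict(self, attrs):
        # Returns None when no training example matches exactly.
        return self._table.get(tuple(attrs))


# Hypothetical toy data: (walks_like_duck, size) -> label
learner = RoteLearner().fit([("yes", "small"), ("no", "large")], ["duck", "goose"])
rote_hit = learner.predict(("yes", "small"))   # seen during training
rote_miss = learner.predict(("yes", "large"))  # never seen, so no prediction
```

The second query shows the weakness of rote learning: a record that differs from every training example in even one attribute cannot be classified, which is what motivates using the nearest neighbors instead.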
Basic idea:
– If it walks like a duck, quacks like a duck, then
it’s probably a duck
[Figure: compute the distance between the test record and the training records]
Voronoi Diagram
$$d(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$$
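The Euclidean distance and its use in nearest-neighbor classification can be sketched as follows. The function names `euclidean` and `knn_classify`, the toy data, and the choice of a simple majority vote among the k closest records are assumptions made for illustration; the notes only say that the k closest points are used.

```python
import math
from collections import Counter


def euclidean(p, q):
    # d(p, q) = sqrt(sum_i (p_i - q_i)^2)
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def knn_classify(train, labels, x, k=3):
    # Rank training records by their distance to the test record x.
    order = sorted(range(len(train)), key=lambda i: euclidean(train[i], x))
    # Assumed rule: majority vote among the k nearest neighbors.
    # Ties go to the label encountered first among those neighbors.
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-attribute training set.
train = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["duck", "duck", "duck", "goose", "goose"]
knn_label = knn_classify(train, labels, (1.1, 1.0), k=3)
```

With k = 1 the predicted label is simply that of the single closest training record, so each training record "owns" the region of space nearest to it; those regions are the cells of the Voronoi diagram.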
Scaling issues
– Attributes may have to be scaled to prevent
distance measures from being dominated by
one of the attributes
– Example:
height of a person may vary from 1.5m to 1.8m
weight of a person may vary from 90lb to 300lb
income of a person may vary from $10K to $1M
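A small sketch of the scaling issue, using min-max normalization as one possible remedy (the notes do not prescribe a particular scaling method, so this choice is an assumption). The three people and the helper names `min_max_scale` and `euclidean` are hypothetical; the attribute ranges follow the example above.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


def min_max_scale(rows):
    """Rescale each attribute (column) to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        # A constant column is mapped to 0.0 to avoid dividing by zero.
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in rows
    ]


# Hypothetical people: (height in m, weight in lb, income in $)
people = [
    (1.5, 90.0, 10_000.0),
    (1.8, 300.0, 12_000.0),     # very different build, similar income
    (1.6, 100.0, 1_000_000.0),  # similar build, very different income
]

raw_01 = euclidean(people[0], people[1])
raw_02 = euclidean(people[0], people[2])

scaled = min_max_scale(people)
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])
```

On the raw values, income differences swamp height and weight, so the first person looks far closer to the second than to the third. After scaling, every attribute contributes on a comparable range and the ordering of the two distances reverses.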
111111111110 000000000001
vs
011111111111 100000000000
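The two bit-vector pairs can be compared numerically. The sketch below assumes the intended pairs are 111111111110 vs 011111111111 and 000000000001 vs 100000000000 (one reading of the slide layout), and the helper names `euclidean_bits` and `shared_ones` are hypothetical.

```python
import math


def euclidean_bits(a, b):
    # Treat each character as a 0/1 attribute value.
    return math.sqrt(sum((int(x) - int(y)) ** 2 for x, y in zip(a, b)))


def shared_ones(a, b):
    # Number of positions where both vectors have a 1.
    return sum(x == y == "1" for x, y in zip(a, b))


dense_a, dense_b = "111111111110", "011111111111"
sparse_a, sparse_b = "000000000001", "100000000000"

d_dense = euclidean_bits(dense_a, dense_b)
d_sparse = euclidean_bits(sparse_a, sparse_b)
shared_dense = shared_ones(dense_a, dense_b)
shared_sparse = shared_ones(sparse_a, sparse_b)
```

Under this pairing each pair differs in exactly two positions, so both Euclidean distances are equal, even though the first pair shares ten 1s and the second pair shares none. This is the kind of counter-intuitive result that Euclidean distance can produce on such data.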
Proximity graphs
– a graph in which two vertices are connected
by an edge if and only if the vertices satisfy
particular geometric requirements
– nearest neighbor graphs
– minimum spanning trees
– Delaunay triangulations
– relative neighborhood graphs
– Gabriel graphs
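As an illustration of a "geometric requirement", here is a brute-force sketch of a Gabriel graph: two points are joined if and only if no other point lies in the disk whose diameter is the segment between them. The function names and the sample points are hypothetical, and treating a point exactly on the disk boundary as blocking the edge is an assumed convention.

```python
from itertools import combinations


def sq_dist(p, q):
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def gabriel_graph(points):
    """Return index pairs (i, j) that satisfy the Gabriel condition."""
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = sq_dist(points[i], points[j])
        # A point r lies strictly outside the disk with diameter p_i p_j
        # exactly when d(i, r)^2 + d(j, r)^2 > d(i, j)^2.
        if all(
            sq_dist(points[i], points[r]) + sq_dist(points[j], points[r]) > d_ij
            for r in range(len(points))
            if r not in (i, j)
        ):
            edges.append((i, j))
    return edges


# Hypothetical sample: point 2 sits almost on the segment between points 0 and 1.
points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.1), (1.0, 3.0)]
gabriel_edges = gabriel_graph(points)
```

This check is cubic in the number of points, which is acceptable for a demonstration. In the sample, point 2 blocks the direct edges 0-1, 0-3, and 1-3, so the other three points connect only through it.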