Instance-Based Learning: Slides Provided by Introduction to Data Mining, 2nd Edition
Instance-Based Learning
Examples:
– Rote-learner
Memorizes entire training data and performs
classification only if attributes of record match one
of the training examples exactly
– Nearest neighbor
Uses k “closest” points (nearest neighbors) for
performing classification
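The two examples above can be sketched side by side. This is a minimal illustration, not the slides' own code: the names rote_classify and knn_classify, the toy training set, and the choice of Euclidean distance with a simple majority vote are all assumptions made for the example.

```python
import math
from collections import Counter

def rote_classify(train, record):
    """Rote learner: return a label only on an exact attribute match."""
    for features, label in train:
        if features == record:
            return label
    return None  # no exact match, so no classification is made

def knn_classify(train, record, k=3):
    """Nearest neighbor: majority vote among the k closest training records."""
    neighbors = sorted(train, key=lambda ex: math.dist(ex[0], record))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training data: (attributes, class label)
train = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((4.0, 4.2), "B"),
    ((4.1, 3.9), "B"),
]

print(rote_classify(train, (1.0, 1.0)))    # exact match exists
print(rote_classify(train, (1.1, 1.0)))    # no exact match
print(knn_classify(train, (1.1, 1.0), 3))  # decided by the 3 closest points
```

The rote learner abstains on any record it has not seen verbatim, while the nearest-neighbor rule still answers by consulting nearby points.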
Basic idea:
– If it walks like a duck, quacks like a duck, then
it’s probably a duck
[Figure: compute the distance from the test record to each training record]
Voronoi Diagram
Euclidean distance: $d(p, q) = \sqrt{\sum_i (p_i - q_i)^2}$
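The distance d(p, q) above translates directly into code. A minimal sketch, assuming records are equal-length sequences of numeric attributes; the function name euclidean is chosen only for this example.

```python
import math

def euclidean(p, q):
    """Square root of the sum of squared attribute differences."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of attributes")
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean((0, 0), (3, 4)))  # 5.0
```

Python's standard library offers math.dist (3.8 and later) for the same computation.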
Scaling issues
– Attributes may have to be scaled to prevent
distance measures from being dominated by
one of the attributes
– Example:
height of a person may vary from 1.5 m to 1.8 m
weight of a person may vary from 90 lb to 300 lb
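The effect of the height and weight ranges above can be checked numerically. The sketch below uses min-max scaling to [0, 1], which is one common choice and an assumption here (the slides do not name a scaling method); the three records are hypothetical.

```python
import math

def min_max_scale(rows):
    """Rescale every attribute (column) to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]

# Hypothetical records: (height in m, weight in lb)
people = [(1.5, 90.0), (1.8, 92.0), (1.5, 300.0)]

# Unscaled: the full height range barely registers next to the weight range.
print(math.dist(people[0], people[1]), math.dist(people[0], people[2]))

# Scaled: both attributes now contribute on a comparable footing.
scaled = min_max_scale(people)
print(math.dist(scaled[0], scaled[1]), math.dist(scaled[0], scaled[2]))
```

Before scaling, the record that differs by the whole height range looks far closer than the one that differs by the whole weight range; after scaling, the two distances are nearly equal.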
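– Example (binary vectors under Euclidean distance):
111111111110 vs 011111111111
000000000001 vs 100000000000
Each pair differs in exactly two positions, so both distances are √2 ≈ 1.414,
although the first pair shares ten 1s and the second pair shares none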
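The binary vectors above can be compared directly. This sketch assumes the intended pairing is 111111111110 with 011111111111 and 000000000001 with 100000000000 (read column-wise from the slide layout); the helper name bits is made up for the example.

```python
import math

def bits(s):
    """Turn a string of 0/1 characters into a list of ints."""
    return [int(ch) for ch in s]

a, b = bits("111111111110"), bits("011111111111")
c, d = bits("000000000001"), bits("100000000000")

d_ab = math.dist(a, b)
d_cd = math.dist(c, d)

# Number of positions where both vectors have a 1
shared_ab = sum(x & y for x, y in zip(a, b))
shared_cd = sum(x & y for x, y in zip(c, d))

print(d_ab, d_cd)            # the two Euclidean distances
print(shared_ab, shared_cd)  # shared 1s in each pair
```

Under this pairing the Euclidean distance cannot tell the two cases apart, even though one pair overlaps almost entirely and the other not at all.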
Proximity graphs
– A graph in which two vertices are connected by an edge if and
only if the vertices satisfy particular geometric requirements
– Examples:
nearest neighbor graphs
minimum spanning trees
Delaunay triangulations
relative neighborhood graphs
Gabriel graphs
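As one concrete instance of such a geometric requirement, the Gabriel graph joins two points when no third point lies in the closed disc whose diameter is the segment between them. The brute-force sketch below is only illustrative (cubic in the number of points, with a hypothetical function name and sample points); it is not taken from the slides or from Toussaint's papers.

```python
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def gabriel_edges(points):
    """Return index pairs (i, j) that are adjacent in the Gabriel graph."""
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = sq_dist(points[i], points[j])
        # A point r is in the closed disc with diameter (i, j) exactly when
        # d(i, r)^2 + d(j, r)^2 <= d(i, j)^2.
        blocked = any(
            sq_dist(points[i], points[r]) + sq_dist(points[j], points[r]) <= d_ij
            for r in range(len(points))
            if r not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges

# Hypothetical 2-D points
points = [(0, 0), (2, 0), (1, 0.5), (1, 3)]
print(gabriel_edges(points))  # [(0, 2), (1, 2), (2, 3)]
```

Here the point (1, 0.5) sits inside the disc spanned by (0, 0) and (2, 0), so that pair is not connected, while each of the outer points is linked to it.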
See recent papers by Toussaint
– G. T. Toussaint. Proximity graphs for nearest neighbor decision rules: recent progress.
In Interface-2002, 34th Symposium on Computing and Statistics, Montreal, Canada,
April 17–20, 2002.
– G. T. Toussaint. Open problems in geometric methods for instance based learning. In
Discrete and Computational Geometry, volume 2866 of Lecture Notes in Computer
Science, pages 273–283, December 6–9, 2003.
– G. T. Toussaint. Geometric proximity graphs for improving nearest neighbor methods
in instance-based learning and data mining. Int. J. Comput. Geometry Appl.,
15(2):101–150, 2005.