BookSlides 5A Similarity Based Learning
1 Big Idea
2 Fundamentals
Feature Space
Distance Metrics
3 Standard Approach
A Worked Example
4 Epilogue
5 Summary
Big Idea Fundamentals Standard Approach Epilogue Summary
Big Idea
✓ ✗ ✗ 1
✗ ✓ ✗ 1
✗ ✓ ✓ 2
Figure: Matching animals you remember to the features of the
unknown animal described by the sailor. Note: The images used in
this figure were created by Jan Gillbank for the English for the
Australian Curriculum website (http://www.e4ac.edu.au) and
are used under the Creative Commons Attribution 3.0 Unported licence
(http://creativecommons.org/licenses/by/3.0). The
images were sourced via Wikimedia Commons.
Fundamentals
Feature Space
Table: The speed and agility ratings for 20 college athletes labelled
with the decisions for whether they were drafted or not.
ID Speed Agility Draft ID Speed Agility Draft
1 2.50 6.00 No 11 2.00 2.00 No
2 3.75 8.00 No 12 5.00 2.50 No
3 2.25 5.50 No 13 8.25 8.50 No
4 3.25 8.25 No 14 5.75 8.75 Yes
5 2.75 7.50 No 15 4.75 6.25 Yes
6 4.50 5.00 No 16 5.50 6.75 Yes
7 3.50 5.25 No 17 5.25 9.50 Yes
8 3.00 3.25 No 18 7.00 4.25 Yes
9 4.00 4.00 No 19 7.50 8.00 Yes
10 4.25 3.75 No 20 7.25 5.75 Yes
Figure: A feature space plot of the data in Table 2 (x-axis: Speed;
y-axis: Agility). The triangles represent 'Non-draft' instances and
the crosses represent 'Draft' instances.
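As a minimal sketch of what "feature space" means here, each athlete in the table can be treated as a point whose coordinates are its Speed and Agility values. The names `ATHLETES`, `points`, and `labels` are assumptions made for this illustration and do not come from the slides.

```python
# (id, speed, agility, drafted) rows copied from the athlete table.
ATHLETES = [
    (1, 2.50, 6.00, "No"), (2, 3.75, 8.00, "No"), (3, 2.25, 5.50, "No"),
    (4, 3.25, 8.25, "No"), (5, 2.75, 7.50, "No"), (6, 4.50, 5.00, "No"),
    (7, 3.50, 5.25, "No"), (8, 3.00, 3.25, "No"), (9, 4.00, 4.00, "No"),
    (10, 4.25, 3.75, "No"), (11, 2.00, 2.00, "No"), (12, 5.00, 2.50, "No"),
    (13, 8.25, 8.50, "No"), (14, 5.75, 8.75, "Yes"), (15, 4.75, 6.25, "Yes"),
    (16, 5.50, 6.75, "Yes"), (17, 5.25, 9.50, "Yes"), (18, 7.00, 4.25, "Yes"),
    (19, 7.50, 8.00, "Yes"), (20, 7.25, 5.75, "Yes"),
]

# Each instance becomes one point in a two-dimensional feature space:
# the first coordinate is Speed, the second is Agility.
points = [(speed, agility) for _, speed, agility, _ in ATHLETES]
labels = [drafted for _, _, _, drafted in ATHLETES]

# The extent of the occupied region along each axis.
speed_range = (min(s for s, _ in points), max(s for s, _ in points))
agility_range = (min(a for _, a in points), max(a for _, a in points))

print(len(points), "points;", labels.count("Yes"), "drafted")
print("speed range:", speed_range, "agility range:", agility_range)
```

With more descriptive features, each point would simply gain one coordinate per feature.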
Distance Metrics
Example
The Euclidean distance between instances d12 (SPEED = 5.00,
AGILITY = 2.50) and d5 (SPEED = 2.75, AGILITY = 7.50) in Table 2
is:
\[
\begin{aligned}
\mathrm{Euclidean}(\langle 5.00, 2.50 \rangle, \langle 2.75, 7.50 \rangle) &= \sqrt{(5.00 - 2.75)^2 + (2.50 - 7.50)^2} \\
&= \sqrt{30.0625} = 5.4829
\end{aligned}
\]
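The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name `euclidean` and the tuple representation of an instance are assumptions made for the illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    # Sum the squared per-feature differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


d12 = (5.00, 2.50)  # (speed, agility) of instance d12
d5 = (2.75, 7.50)   # (speed, agility) of instance d5

print(round(euclidean(d12, d5), 4))  # 5.4829
```

Squaring each difference removes its sign, so the order of the two instances does not affect the result.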
Footnote 1: The abs() function surrounding the subtraction term
indicates that we use the absolute value, i.e., the non-negative value,
when we are summing the differences; this makes sense because
distances can't be negative.
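For reference, the abs() term described in this footnote appears in the usual definition of Manhattan distance, shown here next to the Euclidean definition for comparison. The notation is an assumption made for this sketch and may differ from the slides: a and b are two instances, m is the number of descriptive features, and a[i] is the value of the i-th feature of a.

```latex
\[
\mathrm{Manhattan}(\mathbf{a}, \mathbf{b}) = \sum_{i=1}^{m} \mathrm{abs}\left(\mathbf{a}[i] - \mathbf{b}[i]\right)
\]
\[
\mathrm{Euclidean}(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{i=1}^{m} \left(\mathbf{a}[i] - \mathbf{b}[i]\right)^{2}}
\]
```

In the Euclidean form the squaring already makes each term non-negative, which is why only the Manhattan form needs abs().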
Figure: The Euclidean and Manhattan distances between two points.
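Example
The Manhattan distance between instances d12 (SPEED = 5.00,
AGILITY = 2.50) and d5 (SPEED = 2.75, AGILITY = 7.50) in Table 2
is:
\[
\begin{aligned}
\mathrm{Manhattan}(\langle 5.00, 2.50 \rangle, \langle 2.75, 7.50 \rangle) &= \mathrm{abs}(5.00 - 2.75) + \mathrm{abs}(2.50 - 7.50) \\
&= 2.25 + 5 = 7.25
\end{aligned}
\]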
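The Manhattan distance for the same pair of instances can be computed with a short Python sketch. The function name `manhattan` and the tuple representation are assumptions made for this illustration.

```python
def manhattan(a, b):
    """Sum of absolute per-feature differences between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


d12 = (5.00, 2.50)  # (speed, agility) of instance d12
d5 = (2.75, 7.50)   # (speed, agility) of instance d5

print(manhattan(d12, d5))  # 7.25
```

For any pair of points the Manhattan distance is at least as large as the Euclidean distance, because it follows the axes rather than the straight line between the points.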
Figure: A feature space plot (x-axis: Speed; y-axis: Agility) with
instances 5 and 12 labelled.
A Worked Example
Example
Should we draft an athlete with the following profile:
Figure: A feature space plot of the data in Table 2 (x-axis: Speed;
y-axis: Agility) with the position in the feature space of the query
represented by the ? marker. The triangles represent 'Non-draft'
instances and the crosses represent 'Draft' instances.
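A nearest neighbour prediction for a query like this one can be sketched as follows: compute the distance from the query to every instance in the table and return the target level of the closest one. The query values in `QUERY` are hypothetical and chosen only for illustration, and the names `ATHLETES`, `euclidean`, and `nearest_neighbour` are likewise assumptions of this sketch.

```python
import math

# (id, speed, agility, drafted) rows copied from the athlete table.
ATHLETES = [
    (1, 2.50, 6.00, "No"), (2, 3.75, 8.00, "No"), (3, 2.25, 5.50, "No"),
    (4, 3.25, 8.25, "No"), (5, 2.75, 7.50, "No"), (6, 4.50, 5.00, "No"),
    (7, 3.50, 5.25, "No"), (8, 3.00, 3.25, "No"), (9, 4.00, 4.00, "No"),
    (10, 4.25, 3.75, "No"), (11, 2.00, 2.00, "No"), (12, 5.00, 2.50, "No"),
    (13, 8.25, 8.50, "No"), (14, 5.75, 8.75, "Yes"), (15, 4.75, 6.25, "Yes"),
    (16, 5.50, 6.75, "Yes"), (17, 5.25, 9.50, "Yes"), (18, 7.00, 4.25, "Yes"),
    (19, 7.50, 8.00, "Yes"), (20, 7.25, 5.75, "Yes"),
]


def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour(query, instances):
    """Return (id, label, distance) of the instance closest to the query."""
    best = min(instances, key=lambda row: euclidean(query, (row[1], row[2])))
    return best[0], best[3], euclidean(query, (best[1], best[2]))


# Hypothetical query profile (speed, agility), used only for illustration.
QUERY = (6.75, 3.00)

nn_id, nn_label, nn_dist = nearest_neighbour(QUERY, ATHLETES)
print(f"nearest: d{nn_id}, prediction: {nn_label}, distance: {nn_dist:.4f}")
```

For this assumed query the closest instance is d18, so the sketch predicts 'Yes'. A different query, or Manhattan distance in place of Euclidean, could give a different neighbour.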
Figure: (a) The Voronoi tessellation of the feature space for the
dataset in Table 2 with the position of the query represented by the
? marker; (b) the decision boundary created by aggregating the
neighboring Voronoi regions that belong to the same target level.
Figure: (a) The Voronoi tessellation of the feature space when the
dataset has been updated to include the query instance; (b) the
updated decision boundary reflecting the addition of the query
instance in the training set.
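The effect described in this caption can be sketched in Python: because a nearest neighbour model only stores its instances, adding a labelled instance changes the Voronoi regions, and therefore the predictions, immediately. The class `NearestNeighbour`, the probe point, and the added instance with its 'Yes' label are all hypothetical. Only four rows of the athlete table are used, to keep the sketch short.

```python
import math


class NearestNeighbour:
    """Minimal 1-NN model: 'training' just stores the instances."""

    def __init__(self, instances):
        # Each instance is ((speed, agility), label).
        self.instances = list(instances)

    def add(self, features, label):
        """Store one more labelled instance; no retraining step is needed."""
        self.instances.append((features, label))

    def predict(self, query):
        # A query lies in the Voronoi region of its closest stored instance,
        # so it receives that instance's target level.
        _, label = min(self.instances, key=lambda inst: math.dist(query, inst[0]))
        return label


# Four rows from the athlete table (d10, d12, d18, d20).
model = NearestNeighbour([
    ((4.25, 3.75), "No"), ((5.00, 2.50), "No"),
    ((7.00, 4.25), "Yes"), ((7.25, 5.75), "Yes"),
])

probe = (6.00, 2.75)            # hypothetical point near the boundary
before = model.predict(probe)   # closest stored instance is d12
model.add((6.75, 3.00), "Yes")  # hypothetical new instance and label
after = model.predict(probe)    # closest stored instance is now the new one
print(before, after)
```

The probe point changes from 'No' to 'Yes' once the new instance is stored, which is the kind of boundary shift that panel (b) illustrates.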
Epilogue
Footnote 2: The story recounted here of the discovery of the platypus
is loosely based on real events.
Summary