Lecture 5
Pre-lecture Assignment:
• Read Sections 3.4-3.5 of the 100pMLB
In-class activities:
• Poll Everywhere review questions
• Review fundamental equations governing SVM
• Discuss impact of kernel function
• Implement a KNN classifier by hand using sample data
Key terms
• Hinge loss function, kernel trick, kernel functions (kernels), RBF
kernel, Euclidean distance, k-Nearest Neighbors, cosine similarity,
support vector.
Support Vector Machines - Review
• Review of support vector machines (SVM)
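Note that the 100pMLbook doesn't show the dot product explicitly:
$f(\mathbf{x}) = \operatorname{sign}(\mathbf{w}^{T}\mathbf{x} - b) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} - b)$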
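The decision function above can be sketched in a few lines of NumPy. This is only an illustration: the weights `w`, offset `b`, and sample points are made-up values, and mapping a score of exactly 0 to +1 is an arbitrary choice.

```python
import numpy as np

def svm_predict(w, b, X):
    """Return +1 or -1 for each row of X using f(x) = sign(w . x - b)."""
    scores = X @ w - b
    # A score of exactly 0 lies on the boundary; it is mapped to +1 here (arbitrary).
    return np.where(scores >= 0, 1, -1)

w = np.array([2.0, -1.0])    # hypothetical weights
b = 1.0                      # hypothetical offset
X = np.array([[2.0, 1.0],    # 2*2 - 1*1 - 1 =  2  -> +1
              [0.0, 1.0]])   # 2*0 - 1*1 - 1 = -2  -> -1
preds = svm_predict(w, b, X)
```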
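Learn w, b such that the following cost is minimized:
$\min_{\mathbf{w},\, b} \; C\lVert\mathbf{w}\rVert^{2} + \frac{1}{N}\sum_{i=1}^{N} \max\bigl(0,\; 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i - b)\bigr)$
• C determines the tradeoff between increasing the size of the decision boundary (the margin) and classifying the training data correctly; it regulates the empirical risk
• Large C -> width of margin matters MOST
• Small C -> classifying the training data correctly matters more, at the cost of margin width and generalization => TRADE-OFF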
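A rough numerical sketch of the tradeoff, assuming the cost has the book's form C·||w||² + (1/N)·Σ max(0, 1 − y_i(w·x_i − b)). The 1-D data and the two candidate weight vectors are invented for illustration; nothing is actually optimized here, the two candidates are only compared.

```python
import numpy as np

def svm_objective(w, b, X, y, C):
    """C * ||w||^2 plus the average hinge loss over the training set."""
    margins = y * (X @ w - b)
    hinge = np.maximum(0.0, 1.0 - margins)
    return C * np.dot(w, w) + hinge.mean()

# Hypothetical 1-D training set
X = np.array([[-2.0], [-0.5], [0.5], [2.0]])
y = np.array([-1, -1, 1, 1])

w_wide = np.array([0.5])    # small ||w||: wide margin, some hinge loss
w_narrow = np.array([2.0])  # large ||w||: narrow margin, zero hinge loss

large_C = 1.0
small_C = 0.01
wide_wins_large_C = (svm_objective(w_wide, 0.0, X, y, large_C)
                     < svm_objective(w_narrow, 0.0, X, y, large_C))
narrow_wins_small_C = (svm_objective(w_narrow, 0.0, X, y, small_C)
                       < svm_objective(w_wide, 0.0, X, y, small_C))
```

With the larger C the wide-margin candidate has the lower cost; with the smaller C the candidate that classifies every point with no hinge loss wins.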
• Soft-margin SVM: optimize the hinge-loss function.
• Inherent non-linearity: implicitly transform the original space into a higher-dimensional space during the cost function optimization. This is called the kernel trick.
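One way to see the kernel trick concretely is to compare an explicit feature map with the kernel that replaces it. The sketch below uses a quadratic kernel and a 2-D to 3-D map chosen purely as an illustration (they are not taken from the slides); the point is that the dot product in the higher-dimensional space is obtained without ever computing the mapped vectors.

```python
import numpy as np

def phi(x):
    """Explicit map from 2-D to 3-D: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

def quadratic_kernel(x, z):
    """k(x, z) = (x . z)^2, computed entirely in the original 2-D space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])    # hypothetical points
z = np.array([3.0, -1.0])

explicit = np.dot(phi(x), phi(z))   # dot product after mapping to 3-D
implicit = quadratic_kernel(x, z)   # same value, no mapping needed
```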
1) Noisy/mislabelled data
• Use hinge loss function to allow for incorrectly classified training data
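• Large C -> width of margin matters MOST
• Small C -> classifying matters more
• For correctly classified points that lie outside the margin, $1 - y_i(\mathbf{w} \cdot \mathbf{x}_i - b) < 0$: the value becomes negative, so $\max(0, \cdot) = 0$, i.e. these points contribute nothing to the sum being minimized
• For incorrectly classified samples the penalty is positive, which makes them dominate the loss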
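The per-sample behaviour of the hinge loss can be checked by hand on a toy case. The weights and points below are hypothetical, and the loss is assumed to be max(0, 1 − y(w·x − b)), matching the sign convention of f(x) above.

```python
import numpy as np

def hinge_losses(w, b, X, y):
    """Per-sample hinge loss max(0, 1 - y * (w . x - b))."""
    return np.maximum(0.0, 1.0 - y * (X @ w - b))

w = np.array([1.0])   # hypothetical weight
b = 0.0               # hypothetical offset
X = np.array([[3.0],   # correct side, outside the margin -> loss 0
              [0.5],   # correct side, inside the margin   -> loss 0.5
              [-2.0]]) # wrong side                        -> loss 3
y = np.array([1, 1, 1])

losses = hinge_losses(w, b, X, y)
```

The first point contributes nothing, while the misclassified point contributes the largest penalty.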
From http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html
2) Inherent structure of data (example cont'd)
From http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html
2) Inherent structure of data
Common kernels:
1) Linear kernel: $k(\mathbf{x}, \mathbf{x}') = \mathbf{x} \cdot \mathbf{x}'$
No mapping to higher dimensional space…
[Figure: hand-drawn KNN example on sample fish data. Axes: length (horizontal), weight (vertical). Classes: salmon and sea bass, with one unknown point classified by its nearest neighbours. Annotation: "7 neighbour = Salmon".]
• No distances need to be computed when every training point counts as a neighbour, because all of them will be included. Example: if there are more apples than oranges, the prediction will always be apples.
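The by-hand KNN exercise can be mirrored in a short sketch. The fish measurements below are invented sample data (not the values from the slide figure), Euclidean distance is used, and `knn_predict` is a hypothetical helper name.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`.

    train: list of ((length, weight), label) pairs; query: (length, weight).
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (length, weight) samples
train = [
    ((4.0, 5.0), "salmon"),
    ((5.0, 6.0), "salmon"),
    ((5.5, 5.0), "salmon"),
    ((9.0, 8.0), "sea bass"),
    ((10.0, 9.0), "sea bass"),
]

near_salmon = knn_predict(train, (6.0, 6.0), k=3)
near_bass_k1 = knn_predict(train, (9.5, 8.5), k=1)
near_bass_k_all = knn_predict(train, (9.5, 8.5), k=len(train))
```

With k equal to the size of the training set, the query sitting right next to the sea bass points is still labelled salmon, because the vote reduces to the overall majority class.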