Lecture #20-22
Support Vector Machines
Fig 1: Linearly separable classes can be separated in infinitely many different ways.
Fig 2
Fig 3
Fig 4
Decision Function
▪ If we think of b as an additional weight, w0, we can rewrite
Eq. (1) as
w0 + w1x1 + w2x2 = 0
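As a concrete illustration, here is a minimal Python sketch of evaluating this decision function and classifying a point by its sign; the weights w0, w1, w2 are made-up values, not ones from the lecture.

```python
# Hypothetical weights for the hyperplane w0 + w1*x1 + w2*x2 = 0
w0, w1, w2 = -3.0, 1.0, 2.0

def decision(x1, x2):
    """Signed value of the decision function; its sign gives the predicted class."""
    return w0 + w1 * x1 + w2 * x2

print(decision(2.0, 1.0))   # +1.0 -> predicted class +1
print(decision(0.5, 0.5))   # -1.5 -> predicted class -1
```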
Fig 5
Decision Function
▪ Hence the hyperplanes defining the “sides” of the margin
can be written as
o H1 : w0 + w1x1 + w2x2 ≥ 1 for yi = +1
o H2 : w0 + w1x1 + w2x2 ≤ -1 for yi = -1
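The two conditions can be folded into the single constraint yi (w0 + w1xi1 + w2xi2) ≥ 1 for every training point. A small sketch, again with made-up weights and points, that checks this combined constraint:

```python
import numpy as np

# Hypothetical weights: w = (w1, w2) and bias w0
w = np.array([1.0, 2.0])
w0 = -3.0

# Labeled training points (x1, x2, y) with y in {+1, -1}, made up for illustration
points = [(2.0, 1.5, +1), (3.0, 2.0, +1), (0.0, 0.5, -1), (1.0, 0.0, -1)]

for x1, x2, y in points:
    margin = y * (w0 + w @ np.array([x1, x2]))
    # margin >= 1 means the point lies on or outside its side of the margin (H1/H2)
    print(f"x=({x1}, {x2}), y={y:+d}, y*(w0 + w.x)={margin:+.1f}, satisfies constraint: {margin >= 1}")
```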
Fig 6
h(x, λ) = f(x) − Σi=1..m λi [gi(x) − bi]

▪ Where the new variables λ = (λ1, λ2, …, λm) are called Lagrange
multipliers. Notice the key fact that for the feasible values of x,
gi(x) − bi = 0 for all i,
so h(x, λ) = f(x).
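As a concrete check of this fact, take the made-up example f(x) = x1² + x2² with a single constraint g(x) = x1 + x2 and b = 1: any feasible x has g(x) − b = 0, so h(x, λ) equals f(x) no matter what λ is.

```python
import numpy as np

def f(x):
    return x[0]**2 + x[1]**2      # objective

def g(x):
    return x[0] + x[1]            # constraint function, with b = 1

b = 1.0

def h(x, lam):
    """Lagrangian h(x, lambda) = f(x) - lambda * (g(x) - b)."""
    return f(x) - lam * (g(x) - b)

x_feasible = np.array([0.3, 0.7])          # satisfies g(x) = 1
for lam in (0.0, 1.0, 5.0):
    print(lam, h(x_feasible, lam), f(x_feasible))   # h equals f for every lambda
```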
Making predictions
Fig 7
▪ These are the only points (those for which αi > 0) that need to be
used when classifying new data.
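A minimal sketch of this prediction step, assuming the standard linear-kernel decision rule sign(Σ αi yi (xi · x) + b); the support vectors, labels, multipliers and bias below are made-up values, not ones computed from data.

```python
import numpy as np

# Hypothetical support vectors, their labels, multipliers, and bias
sv_x  = np.array([[1.0, 1.0], [2.0, 3.0], [3.0, 1.0]])   # support vectors x_i
sv_y  = np.array([+1, -1, +1])                            # labels y_i
alpha = np.array([0.7, 1.0, 0.3])                         # alpha_i > 0 only for support vectors
b     = 0.2

def predict(x):
    """Classify a new point using only the support vectors (linear kernel x_i . x)."""
    score = np.sum(alpha * sv_y * (sv_x @ x)) + b
    return np.sign(score)

print(predict(np.array([2.5, 0.5])))   # -1.0 for these made-up values
```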
Fig 8
▪ There is one difference from Fig 7: the support vector from the class
denoted by grey squares has been moved closer to the other class.
▪ Moving this single data point has had a large effect on the
position of the decision boundary.
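This sensitivity is easy to reproduce with scikit-learn. The sketch below (with a made-up toy dataset) fits a near-hard-margin linear SVC twice, once on the original points and once after moving a single point from one class towards the other, and prints the resulting w, b and support-vector indices.

```python
import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data, made up for illustration
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],      # class +1
              [4.0, 4.0], [4.5, 3.5], [3.5, 4.5]])     # class -1
y = np.array([+1, +1, +1, -1, -1, -1])

# Same data, but one point from class -1 moved closer to class +1
X_moved = X.copy()
X_moved[5] = [2.5, 2.5]

for name, data in [("original", X), ("moved point", X_moved)]:
    clf = SVC(kernel="linear", C=1e6).fit(data, y)     # large C approximates a hard margin
    w, b = clf.coef_[0], clf.intercept_[0]
    print(f"{name}: w={np.round(w, 2)}, b={b:.2f}, support vectors={clf.support_.tolist()}")
```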
Fig 9