SVM Slides
Canonical hyperplane
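A hyperplane H0: w^T x + b = 0 has infinitely many equivalent parameterizations (w, b) → (c·w, c·b). The canonical form fixes this scale relative to the training set (standard convention, stated here for reference):

    \min_i \left| w^T x_i + b \right| = 1

so the training points closest to H0 satisfy |w^T x_i + b| = 1 exactly.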
Distance between H1 and H0
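With the canonical normalization, the closest points lie on H1: w^T x + b = +1 and H-1: w^T x + b = -1. A short worked derivation of the distance (standard result):

    d(H_1, H_0) = \frac{|1 - 0|}{\|w\|} = \frac{1}{\|w\|},
    \qquad \text{margin} = d(H_1, H_{-1}) = \frac{2}{\|w\|}

Maximizing the margin is therefore equivalent to minimizing J(w) = \frac{1}{2}\|w\|^2 subject to y_i (w^T x_i + b) \ge 1, which is the constrained problem whose Lagrangian L_P appears below.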
◼ Expansion of L_P yields

    L_P = \frac{1}{2} w^T w - \sum_i \alpha_i y_i (w^T x_i + b) + \sum_i \alpha_i

◼ Using the optimality condition \partial J / \partial w = 0, which gives w = \sum_i \alpha_i y_i x_i, the first term in L_P can be expressed as

    \frac{1}{2} w^T w = \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, x_i^T x_j

◼ Substituting back (together with \sum_i \alpha_i y_i = 0, from \partial J / \partial b = 0) yields the dual function

    L_D(\alpha) = \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, x_i^T x_j

  subject to constraints \alpha_i \ge 0 and \sum_i \alpha_i y_i = 0
Lagrangian Dual Problem (L_P → L_D)
◼ The dual formulation transforms the problem of finding a saddle point of L_P(w, b) into the easier one of maximizing L_D(α)
◼ 𝐿𝐷(𝛼) depends on the Lagrange multipliers 𝛼, not on (𝑤,𝑏)
◼ The primal problem scales with dimensionality (w has one coefficient per dimension), whereas the dual problem scales with the number of training examples (there is one Lagrange multiplier per example)
◼ In L_D(α), the training data appears only through the dot products x_i^T x_j
◼ This property can be cleverly exploited to perform the classification in a higher-dimensional (even infinite-dimensional) feature space, as illustrated in the sketch after this list
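To make the scaling concrete, here is a minimal sketch of training a linear SVM by solving the dual QP directly. It is illustrative only: the helper name svm_dual_fit is hypothetical, a generic SLSQP solver stands in for a dedicated QP/SMO solver, and the box bound C is assumed purely for numerical stability (a large C approximates the hard-margin problem).

    import numpy as np
    from scipy.optimize import minimize

    def svm_dual_fit(X, y, C=10.0):
        # Maximize L_D(a) = sum_i a_i - 1/2 sum_ij a_i a_j y_i y_j x_i^T x_j
        # by minimizing its negative; one multiplier a_i per training example.
        n = X.shape[0]
        K = X @ X.T                                # data enters only via dot products
        Q = (y[:, None] * y[None, :]) * K
        fun = lambda a: 0.5 * a @ Q @ a - a.sum()
        jac = lambda a: Q @ a - np.ones(n)
        cons = [{"type": "eq", "fun": lambda a: a @ y, "jac": lambda a: y}]
        res = minimize(fun, np.zeros(n), jac=jac, method="SLSQP",
                       bounds=[(0.0, C)] * n, constraints=cons)
        a = res.x
        w = (a * y) @ X                            # w = sum_i a_i y_i x_i
        sv = a > 1e-6                              # support vectors: a_i > 0
        b = np.mean(y[sv] - X[sv] @ w)             # KKT on SVs: y_i (w^T x_i + b) = 1
        return w, b, a

Note that the QP has n variables and an n × n matrix Q regardless of the input dimension, while the primal would have one variable per feature; and since X appears only inside K = X X^T, replacing K with a kernel Gram matrix is all that changes in the non-linear case below.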
Linear SVM
The KKT complementarity condition states that, for every point in the training set, the following equality must hold:

    \alpha_i \left[ y_i (w^T x_i + b) - 1 \right] = 0 \quad \forall i

so either \alpha_i = 0 or y_i (w^T x_i + b) = 1; the points with \alpha_i > 0 are the support vectors.
The bias term b is found from the KKT complementarity condition on the support vectors (see the worked formula below); hence the complete dataset could be replaced by the support vectors alone, and the resulting separating hyperplane would be the same
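A short worked derivation of b (standard result, using the expansion of w from the optimality condition): for any support vector x_s, y_s (w^T x_s + b) = 1 and y_s^2 = 1, so

    b = y_s - w^T x_s = y_s - \sum_i \alpha_i y_i \, x_i^T x_s

In practice, b is usually averaged over all support vectors for numerical stability.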
Non-Linear SVM
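Since the training data enters L_D(α) and the decision function only through dot products, substituting a kernel k(x_i, x_j) = φ(x_i)^T φ(x_j) performs the linear separation implicitly in the feature space of φ. A minimal sketch of this substitution, reusing the hypothetical svm_dual_fit structure above (the RBF width gamma is an assumed parameter):

    def rbf(A, B, gamma=1.0):
        # k(a, b) = exp(-gamma * ||a - b||^2), the Gaussian/RBF kernel
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    def svm_kernel_fit(X, y, kernel=rbf, C=10.0):
        n = X.shape[0]
        K = kernel(X, X)                           # only change: Gram matrix of k(x_i, x_j)
        Q = (y[:, None] * y[None, :]) * K
        fun = lambda a: 0.5 * a @ Q @ a - a.sum()
        jac = lambda a: Q @ a - np.ones(n)
        cons = [{"type": "eq", "fun": lambda a: a @ y, "jac": lambda a: y}]
        a = minimize(fun, np.zeros(n), jac=jac, method="SLSQP",
                     bounds=[(0.0, C)] * n, constraints=cons).x
        sv = a > 1e-6
        b = np.mean(y[sv] - (a * y) @ kernel(X, X[sv]))   # KKT bias on support vectors
        # w = sum_i a_i y_i phi(x_i) is never formed explicitly; classify via kernels:
        predict = lambda Xnew: np.sign((a * y) @ kernel(X, Xnew) + b)
        return predict

Only the kernel evaluation changes; the QP and the KKT-based bias computation are identical to the linear case, which is the sense in which the dot-product property is exploited.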