Lecture 6
David Sontag
New York University
[Equations: (Primal) and (Dual) SVM formulations, numbered (1)-(3)]
Classification rule using dual solution
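The slide's formula did not survive extraction. The standard rule predicts sign(Σi αi yi K(xi, x) + b), so only training points with αi > 0 contribute. Below is a minimal sketch of that rule, assuming a linear kernel by default. The names `dual_predict` and `linear_kernel` and the toy values are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

def linear_kernel(u, v):
    # Plain dot product; any valid kernel function could be swapped in.
    return float(np.dot(u, v))

def dual_predict(x, X_train, y_train, alpha, b, kernel=linear_kernel):
    # Hypothetical helper: evaluates sign(sum_i alpha_i * y_i * K(x_i, x) + b).
    score = b
    for x_i, y_i, a_i in zip(X_train, y_train, alpha):
        if a_i > 0:  # points with alpha_i = 0 do not affect the prediction
            score += a_i * y_i * kernel(x_i, x)
    return 1 if score >= 0 else -1

# Hand-worked toy problem (assumed for illustration, not produced by a solver):
# w = sum_i alpha_i * y_i * x_i = (0.5, 0.5), so (1, 1) and (-1, -1) sit on the margins.
X_train = np.array([[1.0, 1.0], [-1.0, -1.0], [3.0, 3.0]])
y_train = np.array([1, -1, 1])
alpha = np.array([0.25, 0.25, 0.0])
b = 0.0

print(dual_predict(np.array([2.0, 0.5]), X_train, y_train, alpha, b))
print(dual_predict(np.array([-2.0, 0.0]), X_train, y_train, alpha, b))
```

The third training point has αi = 0 and is skipped entirely, which is why a sparse dual solution makes prediction cheap.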
Dual:
What changed?
• Added upper bound of C on αi!
• Intuitive explanation:
• Without slack, αi → ∞ when constraints are violated (points misclassified)
• Upper bound of C limits the αi, so misclassifications are allowed
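For reference, the textbook form of the soft-margin dual is sketched below, since the slide's own equation is not recoverable here. It is offered as the standard formulation, with the usual symbols (training pairs x_i, y_i and slack penalty C) assumed. The only difference from the hard-margin dual is the upper bound in the last constraint.

```latex
\begin{align*}
\max_{\alpha} \quad & \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j) \\
\text{s.t.} \quad & \sum_{i} \alpha_i y_i = 0, \\
& 0 \le \alpha_i \le C \quad \text{for all } i
\end{align*}
```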
Support vectors
[Figure: margin and decision boundary, w.x + b = +1, w.x + b = 0, w.x + b = -1]
Final solution tends to be sparse
• αj = 0 for most j
[Tommi Jaakkola]
Quadratic kernel
[Cynthia Rudin]
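A small numerical check can make the quadratic kernel concrete. The sketch below assumes the form K(u, v) = (u.v)^2 on 2-D inputs and compares it with an explicit feature map φ(u) = (u1², √2 u1 u2, u2²). The function names are hypothetical and the slide's own example may differ.

```python
import numpy as np

def quadratic_kernel(u, v):
    # Assumed form: K(u, v) = (u . v)^2
    return float(np.dot(u, v)) ** 2

def quadratic_features(u):
    # Explicit feature map for 2-D inputs whose dot product reproduces the kernel.
    u1, u2 = u
    return np.array([u1 * u1, np.sqrt(2.0) * u1 * u2, u2 * u2])

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

kernel_value = quadratic_kernel(u, v)
feature_value = float(np.dot(quadratic_features(u), quadratic_features(v)))
print(np.isclose(kernel_value, feature_value))
```

The kernel needs one dot product in the original space, while the explicit map works in a higher-dimensional space, and that saving grows quickly with the input dimension and degree.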
Common kernels
• Polynomials of degree exactly d
• Polynomials of degree up to d
• Gaussian kernels (the exponent uses the squared Euclidean distance)
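The formulas for these kernels were lost in extraction. A minimal sketch of their usual definitions follows, assuming (u.v)^d for degree exactly d, (u.v + 1)^d for degree up to d, and exp(-||u - v||² / (2σ²)) for the Gaussian kernel. The function names and the bandwidth parameter `sigma` are assumptions for illustration.

```python
import numpy as np

def poly_kernel_exact(u, v, d):
    # Polynomials of degree exactly d: (u . v)^d
    return float(np.dot(u, v)) ** d

def poly_kernel_up_to(u, v, d):
    # Polynomials of degree up to d: (u . v + 1)^d
    return (float(np.dot(u, v)) + 1.0) ** d

def gaussian_kernel(u, v, sigma=1.0):
    # Gaussian kernel: exp(-||u - v||^2 / (2 * sigma^2)); sigma is an assumed bandwidth.
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])
print(poly_kernel_exact(u, v, 3))
print(poly_kernel_up_to(u, v, 2))
print(gaussian_kernel(u, v, sigma=2.0))
```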
Support vectors
Q: How would you prove that the “Gaussian kernel” is a valid kernel?
A: Expand the Euclidean norm as follows:
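The expansion itself is missing from the extracted text. One standard way to carry it out is sketched below, assuming the Gaussian kernel is written with a bandwidth σ (an assumed symbol).

```latex
\begin{align*}
\|u - v\|^2 &= \|u\|^2 - 2\,(u \cdot v) + \|v\|^2 \\
\exp\!\left(-\frac{\|u - v\|^2}{2\sigma^2}\right)
  &= \exp\!\left(-\frac{\|u\|^2}{2\sigma^2}\right)
     \exp\!\left(-\frac{\|v\|^2}{2\sigma^2}\right)
     \exp\!\left(\frac{u \cdot v}{\sigma^2}\right) \\
\exp\!\left(\frac{u \cdot v}{\sigma^2}\right)
  &= \sum_{k=0}^{\infty} \frac{1}{k!\,\sigma^{2k}}\,(u \cdot v)^k
\end{align*}
```

Each term (u.v)^k is a polynomial kernel, and sums of kernels with non-negative coefficients (and their limits) are again kernels, so the last factor is a valid kernel. Multiplying a kernel K(u, v) by f(u) f(v) for any real-valued function f also gives a valid kernel, which accounts for the first two factors.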