Support Vector Machine For Classification
Instructor: Seunghoon Hong
Recap: image representation
Recap: nonparametric approach for classification
[Figure: labeled training images (e.g., "Cat", …) compared against a test image; examples are marked Positive or Negative]
Example: separable 2D data
[Figure: 2D scatter plot of separable Positive and Negative examples]
Example: determining a good classifier
● Issues?
[Figure: Positive and Negative examples with a separating line and its margin marked]
Support Vector Machine (SVM)
● Let’s assume that we have a set of linearly separable data
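One common way to write this assumption is sketched below. The notation is assumed here for illustration: $N$ training points $\mathbf{x}_i$ with labels $y_i \in \{+1, -1\}$, a weight vector $\mathbf{w}$, and a bias $b$.

```latex
\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}, \qquad
\mathbf{x}_i \in \mathbb{R}^d, \qquad y_i \in \{+1, -1\}

% Linear separability: some hyperplane puts every point on its correct side
\exists\, \mathbf{w}, b \quad \text{such that} \quad
y_i(\mathbf{w}^\top \mathbf{x}_i + b) > 0 \quad \text{for all } i = 1, \dots, N
```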
Support Vector Machine (SVM)
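● Our decision rule: classify a point $\mathbf{x}$ as positive if $\mathbf{w}^\top \mathbf{x} + b \ge 0$, and as negative otherwise, where $\mathbf{w}$ is the weight vector and $b$ is the bias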
Support Vector Machine (SVM)
● Our decision rule
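By the definition of support vectors, the training constraint $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1$ holds with equality for every support vector $\mathbf{x}_i$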
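A short sketch of how the margin width follows from the support-vector condition. It assumes the usual scaling, in which support vectors satisfy $y_i(\mathbf{w}^\top \mathbf{x}_i + b) = 1$; the names $\mathbf{x}_+$ and $\mathbf{x}_-$ (a support vector on each side) are introduced here for illustration.

```latex
\mathbf{w}^\top \mathbf{x}_+ + b = +1, \qquad
\mathbf{w}^\top \mathbf{x}_- + b = -1

% Project the difference of the two support vectors onto the unit normal w / ||w||
\text{margin}
  = \frac{\mathbf{w}^\top (\mathbf{x}_+ - \mathbf{x}_-)}{\lVert \mathbf{w} \rVert}
  = \frac{(1 - b) - (-1 - b)}{\lVert \mathbf{w} \rVert}
  = \frac{2}{\lVert \mathbf{w} \rVert}
```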
Support Vector Machine (SVM)
● Problem of maximizing the margin
● We can find the solution using any Quadratic Programming (QP) solver
● The problem is convex, so the solution the solver returns is the global optimum (there are no local optima)
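In the standard hard-margin form, which is assumed here to match the problem on this slide, maximizing the margin $2 / \lVert \mathbf{w} \rVert$ is equivalent to minimizing $\tfrac{1}{2}\lVert \mathbf{w} \rVert^2$. The objective is quadratic and the constraints are linear, which is what makes the problem a QP.

```latex
\min_{\mathbf{w},\, b} \quad \frac{1}{2} \lVert \mathbf{w} \rVert^2
\qquad \text{subject to} \qquad
y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1, \quad i = 1, \dots, N
```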
Support Vector Machine (SVM)
● Parameters for the max-margin hyperplane
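○ Weight coefficient: $\mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i$, where $\alpha_i \ge 0$ are the Lagrange multipliers of the margin constraints (nonzero only for support vectors)
○ Bias parameter
■ From the fact that $y_i(\mathbf{w}^\top \mathbf{x}_i + b) = 1$ for all support vectors, $b = y_i - \mathbf{w}^\top \mathbf{x}_i$
■ We usually average this value over all support vectors for numerical stability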
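A minimal sketch of this recovery step, assuming the multipliers $\alpha_i$ are already available from a solver. The function name `recover_hyperplane`, the tolerance `tol`, and the toy data are all hypothetical and chosen so the result can be checked by hand.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def recover_hyperplane(alphas, X, y, tol=1e-8):
    """Recover (w, b) of a linear SVM from its Lagrange multipliers.

    w = sum_i alpha_i * y_i * x_i
    b = average over support vectors of (y_i - w . x_i)
    """
    dim = len(X[0])
    w = [0.0] * dim
    for a, xi, yi in zip(alphas, X, y):
        for j in range(dim):
            w[j] += a * yi * xi[j]

    # Support vectors are the training points with a nonzero multiplier.
    support = [(xi, yi) for a, xi, yi in zip(alphas, X, y) if a > tol]
    b = sum(yi - dot(w, xi) for xi, yi in support) / len(support)
    return w, b


# Toy data (hypothetical): (2, 2) and (0, 0) are the support vectors,
# (3, 3) lies outside the margin and therefore has alpha = 0.
X = [(2.0, 2.0), (0.0, 0.0), (3.0, 3.0)]
y = [1, -1, 1]
alphas = [0.25, 0.25, 0.0]

w, b = recover_hyperplane(alphas, X, y)
print(w, b)  # [0.5, 0.5] -1.0
```

Averaging over the support vectors matters in practice because a numerical solver returns multipliers that are only approximately optimal, so each support vector gives a slightly different estimate of b.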
Support Vector Machine (SVM)
● Testing
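At test time a linear SVM only needs the learned weight vector and bias. The sketch below assumes the usual rule of taking the sign of $\mathbf{w}^\top \mathbf{x} + b$; the function name `predict` and the parameter values are hypothetical.

```python
def predict(w, b, x):
    """Return +1 if w . x + b >= 0, otherwise -1."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical parameters standing in for a trained max-margin hyperplane.
w, b = [0.5, 0.5], -1.0

test_points = [(3.0, 2.0), (0.5, 0.0)]
labels = [predict(w, b, x) for x in test_points]
print(labels)  # [1, -1]
```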
Linear SVM, Non-separable case
● Soft margin
○ Introduce slack variables $\xi_i \ge 0$, one per training example
○ Allow a training example to lie within the margin or even on the wrong side of the linear separator
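A minimal sketch of what the slack variables measure, assuming the common soft-margin objective $\tfrac{1}{2}\lVert \mathbf{w} \rVert^2 + C \sum_i \xi_i$ with constraints $y_i(\mathbf{w}^\top \mathbf{x}_i + b) \ge 1 - \xi_i$. Under that assumption the optimal slack for a fixed hyperplane is $\xi_i = \max(0,\, 1 - y_i(\mathbf{w}^\top \mathbf{x}_i + b))$. The trade-off constant `C`, the function names, and the data are hypothetical.

```python
def slack(w, b, x, y):
    """Slack of one example: 0 outside the margin, in (0, 1] inside the
    margin but correctly classified, above 1 on the wrong side."""
    functional_margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
    return max(0.0, 1.0 - functional_margin)


def soft_margin_objective(w, b, X, Y, C):
    """0.5 * ||w||^2 + C * (sum of slacks), for a fixed hyperplane (w, b)."""
    regularizer = 0.5 * sum(wj * wj for wj in w)
    return regularizer + C * sum(slack(w, b, x, y) for x, y in zip(X, Y))


# Hypothetical hyperplane and data.
w, b = [0.5, 0.5], -1.0
X = [(3.0, 3.0), (1.5, 1.5), (0.5, 0.5), (0.0, 0.0)]
Y = [1, 1, 1, -1]

slacks = [slack(w, b, x, y) for x, y in zip(X, Y)]
print(slacks)  # [0.0, 0.5, 1.5, 0.0]
print(soft_margin_objective(w, b, X, Y, C=1.0))  # 2.25
```

The second example sits inside the margin but on the correct side (slack 0.5), while the third is misclassified (slack 1.5). A larger `C` penalizes such violations more heavily.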
Non-linear SVM
● The kernel trick lets us define the kernel $k(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^\top \phi(\mathbf{x}')$ directly, without knowing the explicit form of the mapping function φ
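A small check of this idea for one case where the mapping happens to be known: for 2D inputs, the degree-2 polynomial kernel $(\mathbf{x}^\top \mathbf{z})^2$ equals an inner product in a 3D feature space. The function names below are hypothetical, and the example is an illustration, not part of the original slides.

```python
import math


def poly2_kernel(x, z):
    """Degree-2 homogeneous polynomial kernel for 2D inputs: (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit feature map whose inner product reproduces poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


x, z = (1.0, 2.0), (3.0, -1.0)

k_val = poly2_kernel(x, z)                               # computed in 2D
explicit = sum(a * b for a, b in zip(phi(x), phi(z)))    # computed in 3D

print(k_val)
print(math.isclose(k_val, explicit))
```

The kernel evaluation never builds the 3D vectors, which is the point: for kernels such as the Gaussian one, the corresponding feature space is infinite-dimensional and could not be built explicitly at all.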
Widely-used kernels
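● Linear kernel: $k(\mathbf{x}, \mathbf{x}') = \mathbf{x}^\top \mathbf{x}'$
● Polynomial kernel: $k(\mathbf{x}, \mathbf{x}') = (\mathbf{x}^\top \mathbf{x}' + c)^d$
● Gaussian (radial basis function, RBF) kernel: $k(\mathbf{x}, \mathbf{x}') = \exp\left(-\lVert \mathbf{x} - \mathbf{x}' \rVert^2 / (2\sigma^2)\right)$
● Histogram intersection kernel: $k(\mathbf{x}, \mathbf{x}') = \sum_j \min(x_j, x'_j)$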
● And many others...
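The kernels listed above can be sketched in a few lines each. The parameterizations are assumptions, since several conventions are in use: the polynomial kernel is written as $(\mathbf{x}^\top \mathbf{z} + c)^d$ and the Gaussian kernel as $\exp(-\lVert \mathbf{x} - \mathbf{z} \rVert^2 / (2\sigma^2))$. The parameter names `degree`, `coef0`, and `sigma` are hypothetical.

```python
import math


def linear_kernel(x, z):
    """k(x, z) = x . z"""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """k(x, z) = (x . z + coef0) ** degree"""
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def histogram_intersection_kernel(h1, h2):
    """k(h1, h2) = sum_j min(h1[j], h2[j]), for non-negative histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z))       # 11.0
print(polynomial_kernel(x, z))   # 144.0
print(rbf_kernel(x, x))          # 1.0, a point is maximally similar to itself
print(histogram_intersection_kernel((0.2, 0.5, 0.3), (0.4, 0.1, 0.5)))
```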
Non-linear SVM
● Optimizing the SVM objective with kernel
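The objective depends on the training data only through inner products, so each inner product can be replaced by a kernel evaluation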
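For reference, the standard dual form of the SVM objective with the inner product $\mathbf{x}_i^\top \mathbf{x}_j$ replaced by a kernel is sketched below. The notation ($\alpha_i$ for the Lagrange multipliers, $C$ for the soft-margin constant) is assumed rather than taken from the slide; in the hard-margin case the upper bound $C$ is dropped.

```latex
\max_{\boldsymbol{\alpha}} \quad
\sum_{i=1}^{N} \alpha_i
- \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N}
  \alpha_i \alpha_j \, y_i y_j \, k(\mathbf{x}_i, \mathbf{x}_j)
\qquad \text{subject to} \qquad
0 \le \alpha_i \le C, \quad \sum_{i=1}^{N} \alpha_i y_i = 0

% The resulting classifier also needs only kernel evaluations
f(\mathbf{x}) = \operatorname{sign}\!\left(
  \sum_{i=1}^{N} \alpha_i y_i \, k(\mathbf{x}_i, \mathbf{x}) + b \right)
```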
○ Kernel parameters
● One‐versus‐one
○ Training: learn an SVM for each pair of classes
○ Testing: each learned SVM “votes” for a class to assign to the test example
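A minimal sketch of the one-versus-one voting scheme. The pairwise classifiers here are hand-set linear rules on a single feature, standing in for trained SVMs; the class names, parameters, and function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def make_linear_classifier(pos, neg, w, b):
    """Binary classifier that returns `pos` if w . x + b >= 0, else `neg`."""
    def classify(x):
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        return pos if score >= 0 else neg
    return classify


classes = ["cat", "dog", "bird"]

# Training: one classifier per pair of classes, K * (K - 1) / 2 in total.
pairs = list(combinations(classes, 2))

# Hand-set (w, b) values in place of learned SVM parameters.
params = {
    ("cat", "dog"): ([-1.0], 2.5),
    ("cat", "bird"): ([-1.0], 5.0),
    ("dog", "bird"): ([-1.0], 7.5),
}
pairwise = [make_linear_classifier(p, n, *params[(p, n)]) for p, n in pairs]


def ovo_predict(classifiers, x):
    """Testing: every pairwise classifier votes; the most-voted class wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


predictions = [ovo_predict(pairwise, [value]) for value in (1.0, 6.0, 9.0)]
print(predictions)  # ['cat', 'dog', 'bird']
```

Ties are possible with this scheme. In this sketch they are broken by whichever tied class received a vote first, which is an arbitrary choice; practical implementations often fall back on the classifiers' scores instead.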
SVM Resources
● References
○ C. Cortes and V. Vapnik, Support-vector networks, Machine Learning 20 (3): 273-297, 1995.
○ N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, 2000.
○ B. Schölkopf and A. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002.