Support Vectors: Michael (Mike) Gelbart

This document discusses support vector machines (SVMs) and compares logistic regression to SVMs. It explains that SVMs are linear classifiers trained using hinge loss and L2 regularization to maximize the margin between classes. Only support vectors, which are points close to the decision boundary or misclassified, affect the model. Kernel SVMs transform features nonlinearly. Logistic regression and SVMs can be extended to multi-class and handle large datasets using SGDClassifier in scikit-learn. SVMs are generally faster than logistic regression when using kernels.


Support Vectors

LINEAR CLASSIFIERS IN PYTHON

Michael (Mike) Gelbart

Instructor, The University of British Columbia
What is an SVM?

Linear classifiers (so far)

Trained using the hinge loss and L2 regularization

Support vector: a training example not in the flat part of the loss diagram

Support vector: an example that is incorrectly classified or close to the boundary

If an example is not a support vector, removing it has no effect on the model

Having a small number of support vectors makes kernel SVMs really fast



Max-margin viewpoint

The SVM maximizes the "margin" for linearly separable datasets

Margin: distance from the boundary to the closest points
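The margin statement can be made precise. This is the standard formulation (not written out on the slides): for a boundary w·x + b = 0, scaled so the closest training points satisfy |w·x_i + b| = 1,

```latex
\text{margin} \;=\; \min_i \frac{\lvert w^\top x_i + b \rvert}{\lVert w \rVert} \;=\; \frac{1}{\lVert w \rVert}
```

so maximizing the margin is equivalent to minimizing ||w||², which is exactly the L2 regularization term from the hinge-loss view above.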


Let's practice!

Kernel SVMs
Transforming your features

transformed feature = (original feature)²
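Why squaring helps can be seen on a tiny 1-D example (illustrative data, not the dataset plotted on the slides): the raw feature is not linearly separable, but its square is.

```python
# Minimal sketch: the squared-feature transformation (illustrative data).
import numpy as np
from sklearn.svm import LinearSVC

# 1-D dataset where class 1 sits in the middle: no single threshold
# (i.e. no linear classifier on x alone) separates the classes.
X = np.array([[-3.0], [-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1, 1, 0, 0])

X_sq = X ** 2  # transformed feature = (original feature)^2

acc_raw = LinearSVC().fit(X, y).score(X, y)
acc_sq = LinearSVC().fit(X_sq, y).score(X_sq, y)
print(acc_raw, acc_sq)  # the transformed version is perfectly separable
```

After squaring, class 1 occupies the low end of the transformed axis, so a single threshold (a linear boundary in the new feature) separates the classes.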


Kernel SVMs

from sklearn.svm import SVC

svm = SVC(gamma=1)    # default is kernel="rbf"
svm = SVC(gamma=0.01) # smaller gamma leads to smoother boundaries
svm = SVC(gamma=2)    # larger gamma leads to more complex boundaries
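The gamma effect shows up in the gap between training and test accuracy. A sketch on an illustrative dataset (the gamma values are arbitrary endpoints, not recommendations):

```python
# Minimal sketch: effect of gamma on an RBF SVM (illustrative data).
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

train_acc = {}
for gamma in (0.01, 1, 100):
    svm = SVC(gamma=gamma).fit(X_tr, y_tr)  # kernel="rbf" by default
    train_acc[gamma] = svm.score(X_tr, y_tr)
    print(gamma, train_acc[gamma], svm.score(X_te, y_te))

# Larger gamma hugs the training data (more complex boundary, risk of
# overfitting); smaller gamma gives smoother, simpler boundaries.
```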


Let's practice!

Comparing logistic regression and SVM
Logistic regression:
- Is a linear classifier
- Can use with kernels, but slow
- Outputs meaningful probabilities
- Can be extended to multi-class
- All data points affect fit
- L2 or L1 regularization

Support vector machine (SVM):
- Is a linear classifier
- Can use with kernels, and fast
- Does not naturally output probabilities
- Can be extended to multi-class
- Only "support vectors" affect fit
- Conventionally just L2 regularization
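The probability row of the comparison is visible directly in scikit-learn: by default `SVC` exposes no `predict_proba`, and `SVC(probability=True)` only bolts probabilities on via an extra calibration step (Platt scaling). Toy data, for illustration:

```python
# Minimal sketch: probability outputs, logistic regression vs. SVM.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(random_state=0)

logreg = LogisticRegression().fit(X, y)
proba = logreg.predict_proba(X[:1])  # meaningful class probabilities
print(proba)

svm = SVC().fit(X, y)
print(hasattr(svm, "predict_proba"))  # no natural probabilities by default
```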


Use in scikit-learn
Logistic regression in sklearn:

linear_model.LogisticRegression

Key hyperparameters in sklearn:

C (inverse regularization strength)

penalty (type of regularization)

multi_class (type of multi-class)

SVM in sklearn:

svm.LinearSVC and svm.SVC



Use in scikit-learn (cont.)
Key hyperparameters in sklearn:

C (inverse regularization strength)

kernel (type of kernel)

gamma (inverse RBF smoothness)

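Putting the hyperparameters from both slides together in one sketch (the values and dataset are illustrative, not recommendations):

```python
# Minimal sketch: the key hyperparameters named above (illustrative values).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(random_state=0)

logreg = LogisticRegression(C=0.1, penalty="l2").fit(X, y)  # C: inverse regularization strength
linsvc = LinearSVC(C=0.1).fit(X, y)
svc = SVC(C=0.1, kernel="rbf", gamma=0.1).fit(X, y)         # gamma: inverse RBF smoothness

print(logreg.score(X, y), linsvc.score(X, y), svc.score(X, y))
```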


SGDClassifier

SGDClassifier : scales well to large datasets

from sklearn.linear_model import SGDClassifier

logreg = SGDClassifier(loss='log_loss') # spelled loss='log' in scikit-learn < 1.1

linsvm = SGDClassifier(loss='hinge')

SGDClassifier hyperparameter alpha is like 1/C


Let's practice!

Conclusion
How does this course fit into data science?

Data science

→ Machine learning
→→ Supervised learning
→→→ Classification

→→→→ Linear classifiers (this course)


Congratulations & thanks!