Lecture#12

A support vector machine is a supervised machine learning model that uses classification algorithms for two-group classification problems. It works by finding the optimal separating hyperplane that maximizes the margin between the two classes of data. The kernel trick allows SVMs to perform nonlinear classification efficiently by mapping data into higher-dimensional spaces.


Support Vector Machines?

by
What are Support Vector Machines?
• A support vector machine (SVM) is a supervised machine learning
model that uses classification algorithms for two-group
classification problems. After an SVM model is given sets of labeled
training data for each category, it is able to categorize new
text.

• Compared to newer algorithms like neural networks, SVMs have
two main advantages: higher speed and better performance with a
limited number of samples (in the thousands). This makes the
algorithm very suitable for text classification problems, where it is
common to have access to a dataset of at most a couple of
thousand tagged samples.
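The slides include no code, but here is a minimal sketch of the kind of text classifier described above, assuming the scikit-learn library is available; the tiny dataset and the label names are invented purely for illustration.

# TF-IDF features plus a linear SVM: a common baseline for text classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# A few labeled examples per category (toy data).
texts = [
    "great product, works perfectly",
    "terrible quality, broke after a day",
    "absolutely love it, highly recommend",
    "waste of money, very disappointed",
]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)

print(model.predict(["this was a great purchase"]))  # expected: ['positive']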
How Does SVM Work?

• The basics of Support Vector Machines
and how they work are best understood with
a simple example. Let's imagine we have
two tags, red and blue, and our data has
two features, x and y. We want a classifier
that, given a pair of (x, y) coordinates,
outputs whether it is red or blue. We plot
our already labeled training data on a
plane.
• A support vector machine takes these data
points and outputs the hyperplane (which
in two dimensions is simply a line) that
best separates the tags. This line is the
decision boundary: anything that falls on
one side of it we classify as blue, and
anything that falls on the other side as red.
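As a minimal sketch of this red/blue example, assuming scikit-learn is available: fit an SVM with a linear kernel on a handful of made-up (x, y) points and read off the separating line. The attributes coef_, intercept_ and support_vectors_ are standard scikit-learn, not something defined in the slides.

import numpy as np
from sklearn.svm import SVC

# Toy (x, y) coordinates with their tags.
X = np.array([[1.0, 2.0], [2.0, 1.5], [1.5, 1.0],   # blue cluster
              [4.0, 5.0], [5.0, 4.5], [4.5, 5.5]])  # red cluster
y = np.array(["blue", "blue", "blue", "red", "red", "red"])

clf = SVC(kernel="linear").fit(X, y)

# In two dimensions the hyperplane is just the line w0*x + w1*y + b = 0.
w, b = clf.coef_[0], clf.intercept_[0]
print(f"decision boundary: {w[0]:.2f}*x + {w[1]:.2f}*y + {b:.2f} = 0")

# Only the support vectors are kept to define the boundary.
print("support vectors:\n", clf.support_vectors_)

print(clf.predict([[2.0, 2.0], [5.0, 5.0]]))  # expected: ['blue' 'red']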
Nonlinear data

• This example was easy, since the data was
clearly linearly separable: we could draw a
straight line to separate red and blue. Sadly,
things usually aren't that simple. Take a look
at a case where the points of one tag form a
ring around the points of the other.
It's pretty clear that there is no linear
decision boundary (a single straight line that
separates both tags). However, the vectors
are very clearly segregated, and it looks as
though it should be easy to separate them.
One way to do it is to add a third dimension,
for example z = x² + y², so that the data
becomes linearly separable in the
higher-dimensional space.
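To make that mapping concrete, here is a minimal sketch, again assuming scikit-learn: generate ring-shaped two-class data (make_circles is used here only as a stand-in for the figure on the slide), add the extra feature z = x² + y², and fit a plain linear SVM in the new three-dimensional space.

import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: no single straight line separates them in 2D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Map each point (x, y) -> (x, y, x^2 + y^2).
z = (X ** 2).sum(axis=1)
X3 = np.column_stack([X, z])

# A plain linear SVM now separates the two classes almost perfectly.
clf = SVC(kernel="linear").fit(X3, y)
print("training accuracy after the mapping:", clf.score(X3, y))  # close to 1.0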
The kernel trick
• In our example we found a way to classify nonlinear data by cleverly
mapping our space to a higher dimension. However, it turns out that
calculating this transformation can get pretty computationally
expensive: there can be a lot of new dimensions, each one of them
possibly involving a complicated calculation. Doing this for every
vector in the dataset can be a lot of work, so it’d be great if we could
find a cheaper solution.

• And we're in luck! Here's the trick: the SVM doesn't need the actual
vectors to work its magic; it can get by with only the dot
products between them. This means that we can sidestep the
expensive calculation of the new dimensions.
• This is what we do instead:

• Imagine the new space we want: z = x² + y²

• Figure out what the dot product in that space looks like:

• a · b = x_a·x_b + y_a·y_b + z_a·z_b

• a · b = x_a·x_b + y_a·y_b + (x_a² + y_a²)·(x_b² + y_b²)

• Tell SVM to do its thing, but using the new dot product. We call this a
kernel function.
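A small sketch of the computation above, in Python: the kernel value computed directly from the original 2D vectors matches the dot product taken after the explicit mapping. The helper names phi and kernel are illustrative, not part of any library.

import numpy as np

def phi(p):
    # Explicit mapping (x, y) -> (x, y, x^2 + y^2).
    x, y = p
    return np.array([x, y, x**2 + y**2])

def kernel(a, b):
    # Dot product in the mapped space, computed without ever mapping:
    # k(a, b) = x_a*x_b + y_a*y_b + (x_a^2 + y_a^2) * (x_b^2 + y_b^2)
    return a @ b + (a @ a) * (b @ b)

a = np.array([1.0, 2.0])
b = np.array([3.0, 0.5])

print(phi(a) @ phi(b))  # dot product after the explicit 3D mapping
print(kernel(a, b))     # the same number, computed from the 2D vectors alone

scikit-learn's SVC also accepts a callable kernel, so the same idea can be plugged into a real classifier without ever materializing the mapped vectors.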
Pros and Cons associated with SVM
• Pros:
• It works really well with a clear margin of separation
• It is effective in high dimensional spaces.
• It is effective in cases where the number of dimensions is greater than the
number of samples.
• It uses a subset of training points in the decision function (called support
vectors), so it is also memory efficient.
• Cons:
• It doesn't perform well on large datasets, because the required
training time is higher.
• It also doesn't perform very well when the dataset has more noise, i.e.
the target classes overlap.
• SVM doesn't directly provide probability estimates; these are calculated
using an expensive five-fold cross-validation, as implemented in the related
SVC class of the Python scikit-learn library (see the sketch below).
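As a minimal sketch of that last point, assuming scikit-learn: probability estimates have to be requested explicitly with probability=True, which triggers the extra internal cross-validation during fit and makes training noticeably slower. The dataset here is synthetic, purely for illustration.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

clf = SVC(probability=True).fit(X, y)   # extra cross-validation happens inside fit()
print(clf.predict_proba(X[:3]))         # class probabilities per sample
print(clf.decision_function(X[:3]))     # raw margins, available without probability=True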
Thanks for Listening
