Lecture#12
What is a Support Vector Machine?
• A support vector machine (SVM) is a supervised machine learning
model that uses classification algorithms for two-group
classification problems. After you give an SVM model sets of labeled
training data for each category, it is able to categorize new
text.
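The idea above can be sketched in a few lines. This is a minimal illustration (it assumes scikit-learn is installed; the toy points and labels are made up for the example): train a two-class SVM on labeled data, then classify new points.

```python
from sklearn.svm import SVC

# Toy labeled training data: two categories, 0 and 1.
X_train = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
y_train = [0, 0, 1, 1]

clf = SVC(kernel="linear")   # a linear decision boundary
clf.fit(X_train, y_train)    # learn from the labeled examples

# New, unseen points are assigned to one of the two categories.
print(clf.predict([[0.1, 0.0]]))  # -> [0]
print(clf.predict([[1.0, 0.9]]))  # -> [1]
```

For text classification specifically, the documents would first be turned into numeric vectors (e.g. word counts) before being passed to `fit`.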
• And we’re in luck! Here’s a trick: SVM doesn’t need the actual
vectors to work its magic; it can get by with only the dot
products between them. This means we can sidestep the
expensive computation of the new dimensions.
• This is what we do instead:
• Add a third dimension to each point: z = x² + y²
• Figure out what the dot product in that space looks like:
• a · b = xa · xb + ya · yb + za · zb
• Tell SVM to do its thing, but using the new dot product; we call this
a kernel function
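The trick above can be sketched concretely. This is an illustrative example (scikit-learn and NumPy assumed; the `lifted_kernel` name and the circular toy data are made up for the sketch): instead of explicitly adding the z = x² + y² feature, we hand SVC a function that computes the dot product a · b = xa·xb + ya·yb + za·zb directly, so the lifted vectors are never formed.

```python
import numpy as np
from sklearn.svm import SVC

def lifted_kernel(A, B):
    """Gram matrix of the dot product in (x, y, x^2 + y^2) space."""
    za = (A ** 2).sum(axis=1)          # z = x^2 + y^2 for each row of A
    zb = (B ** 2).sum(axis=1)
    return A @ B.T + np.outer(za, zb)  # xa*xb + ya*yb + za*zb

# A circular pattern: class 1 inside the unit circle, class 0 outside.
# Not linearly separable in 2D, but separable by the plane z = 1 in 3D.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 2))
y = ((X ** 2).sum(axis=1) < 1.0).astype(int)

clf = SVC(kernel=lifted_kernel).fit(X, y)
print(clf.score(X, y))  # high training accuracy: separable in the lifted space
```

scikit-learn accepts any callable as `kernel` as long as it returns the Gram matrix between two sets of samples, which is exactly the "only dot products are needed" property.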
Pros and Cons associated with SVM
• Pros:
• It works really well with a clear margin of separation
• It is effective in high dimensional spaces.
• It is effective in cases where the number of dimensions is greater than the
number of samples.
• It uses a subset of training points in the decision function (called support
vectors), so it is also memory efficient.
• Cons:
• It doesn’t perform well on large data sets because the required
training time is higher
• It also doesn’t perform very well when the data set has more noise, i.e.
the target classes are overlapping
• SVM doesn’t directly provide probability estimates; these are calculated
using an expensive five-fold cross-validation. This is available in the related
SVC method of the Python scikit-learn library.
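The last point can be shown directly. A minimal sketch (scikit-learn assumed; the toy data is invented for the example): `SVC` only exposes `predict_proba` when it is constructed with `probability=True`, which triggers the extra cross-validated calibration step mentioned above.

```python
from sklearn.svm import SVC

# Small two-class toy set, made up for illustration.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.0], [0.0, 0.3], [0.2, 0.2],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.2, 1.0], [1.0, 1.2], [0.8, 0.9]]
y = [0] * 6 + [1] * 6

# probability=True enables the internal cross-validated calibration,
# which makes fitting noticeably more expensive on large data sets.
clf = SVC(kernel="linear", probability=True).fit(X, y)
proba = clf.predict_proba([[0.1, 0.05]])[0]
print(proba)  # two class probabilities summing to 1
```

Without `probability=True`, calling `predict_proba` raises an error; only the hard class labels from `predict` and the margins from `decision_function` are available.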
Thanks for Listening