4 - Lec 11 - ML - Support Vector Machine
In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is
the number of features), with the value of each feature being the value of a particular
coordinate.
Then, we perform classification by finding the hyper-plane that best separates the two
classes (look at the below snapshot).
The SVM classifier is the frontier (a hyper-plane, or a line in two dimensions) that best
segregates the two classes.
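To make the idea of a separating hyper-plane concrete, here is a minimal sketch in plain Python. It assumes the hyper-plane is written as w·x + b = 0; the weights `w`, the offset `b`, and the sample points are made up for illustration and are not taken from the lecture.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyper-plane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 for one side of the hyper-plane and -1 for the other."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical hyper-plane in a 2-feature space: x1 + x2 - 5 = 0
w, b = (1.0, 1.0), -5.0

print(classify(w, b, (4.0, 3.0)))  # point on the positive side of the line
print(classify(w, b, (1.0, 2.0)))  # point on the negative side of the line
```

With two features the hyper-plane is just a line; with n features the same formula applies unchanged.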
How does it work?
Above, we got accustomed to the process of segregating the two classes with a
hyper-plane. Now the burning question is “How can we identify the right hyper-plane?”.
Don’t worry, it’s not as hard as you think!
Let’s understand:
Identify the right hyper-plane (Scenario-1): Here, we have three hyper-planes (A,
B and C). Now, identify the right hyper-plane to classify the stars and circles.
You need to remember a rule of thumb to identify the right hyper-plane: “Select the
hyper-plane which segregates the two classes better”. In this scenario, hyper-plane
“B” does this job best.
Identify the right hyper-plane (Scenario-2): Here, we have three hyper-planes (A,
B and C) and all are segregating the classes well. Now, how can we identify the
right hyper-plane?
Here, maximizing the distance between the hyper-plane and the nearest data point (of
either class) will help us decide on the right hyper-plane. This distance is called the
margin. Let’s look at the below snapshot:
Above, you can see that the margin for hyper-plane C is high compared to both A
and B. Hence, we name C as the right hyper-plane. Another compelling reason for
selecting the hyper-plane with the higher margin is robustness. If we select a
hyper-plane with a low margin, there is a high chance of misclassification.
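The margin rule can be sketched in a few lines of Python. The points and the three candidate hyper-planes below are invented for illustration (they are not the ones in the lecture's figure); each candidate is stored as a hypothetical pair (w, b) describing w·x + b = 0, and all three separate the two groups of points.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyper-plane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points):
    """Margin = distance from the hyper-plane to the nearest point of either class."""
    return min(distance_to_hyperplane(w, b, p) for p in points)


# Illustrative data: two points per class, classes far apart
points = [(1.0, 1.0), (2.0, 1.0), (5.0, 5.0), (6.0, 5.0)]

# Three hypothetical candidates that all separate the classes
candidates = {
    "A": ((1.0, 0.0), -2.5),  # vertical line x1 = 2.5
    "B": ((1.0, 0.0), -4.5),  # vertical line x1 = 4.5
    "C": ((1.0, 1.0), -6.5),  # diagonal line x1 + x2 = 6.5
}

# Pick the candidate whose nearest point is farthest away
best = max(candidates, key=lambda name: margin(*candidates[name], points))
print(best)
```

In this toy setup A and B each pass close to one of the classes, while C keeps its distance from both, so C has the largest margin.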
Identify the right hyper-plane (Scenario-3): Some of you may have selected
hyper-plane B, as it has a higher margin than A. But here is the catch: SVM selects the
hyper-plane that classifies the classes accurately before maximizing the margin. Here,
hyper-plane B has a classification error and A has classified all points correctly.
Therefore, the right hyper-plane is A.
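The "accuracy first, margin second" priority can be written as a lexicographic comparison. This is a minimal sketch with invented data and two hypothetical candidates named A and B to mirror the text; the margin here is simply measured as the distance to the nearest point, whichever side it falls on.

```python
import math


def score(w, b, x):
    """Signed score w.x + b for point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def errors(w, b, data):
    """Number of points whose label disagrees with the side they fall on."""
    return sum(1 for x, y in data if y * score(w, b, x) <= 0)


def margin(w, b, data):
    """Distance from the hyper-plane to the nearest point of either class."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(abs(score(w, b, x)) / norm for x, _ in data)


def pick(candidates, data):
    """Fewest classification errors first; larger margin only breaks ties."""
    return min(
        candidates,
        key=lambda name: (errors(*candidates[name], data),
                          -margin(*candidates[name], data)),
    )


# Illustrative data: (point, label) with labels -1 (circle) and +1 (star)
data = [((1.0, 1.0), -1), ((2.0, 2.0), -1), ((1.0, 2.0), -1),
        ((3.0, 3.0), 1), ((5.0, 5.0), 1), ((6.0, 5.0), 1)]

candidates = {
    "A": ((1.0, 1.0), -5.0),  # x1 + x2 = 5: small margin, no errors
    "B": ((1.0, 1.0), -8.0),  # x1 + x2 = 8: larger margin, one star misclassified
}

print(pick(candidates, data))
```

B sits farther from its nearest point than A does, yet it puts one star on the wrong side, so A is chosen.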
Can we classify two classes (Scenario-4)?: Below, I am unable to segregate the
two classes using a straight line, as one of the stars lies in the territory of the
other (circle) class as an outlier.
As I have already mentioned, the one star at the other end is like an outlier for the
star class. The SVM algorithm has a feature to ignore such outliers and find the
hyper-plane that has the maximum margin. Hence, we can say that SVM classification is
robust to outliers.
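The text does not name the mechanism behind this tolerance. One standard formulation, assumed here, is the soft-margin SVM: every point gets a slack value (hinge loss) that is zero when the point is on the correct side with enough margin, and the objective trades margin width against total slack through a parameter usually called C. The data, the hyper-plane (w, b), and the value of C below are illustrative only.

```python
def score(w, b, x):
    """Signed score w.x + b for point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def slack(w, b, x, y):
    """Hinge loss: 0 if x is correctly classified with margin, positive otherwise."""
    return max(0.0, 1.0 - y * score(w, b, x))


def soft_margin_objective(w, b, data, C):
    """0.5 * ||w||^2 (favors a wide margin) + C * total slack (penalizes violations)."""
    return 0.5 * sum(wi * wi for wi in w) + C * sum(slack(w, b, x, y) for x, y in data)


# Illustrative data: the third point is a star sitting among the circles
data = [((6.0, 6.0), 1), ((7.0, 7.0), 1), ((1.5, 1.5), 1),
        ((1.0, 1.0), -1), ((2.0, 2.0), -1)]

# Hypothetical wide-margin hyper-plane that ignores the outlier
w, b = (0.25, 0.25), -2.0

slacks = [slack(w, b, x, y) for x, y in data]
print(slacks)                                    # only the outlier has non-zero slack
print(soft_margin_objective(w, b, data, C=1.0))  # margin term plus slack penalty
```

A small C makes the outlier's slack cheap, so the wide-margin hyper-plane wins; a very large C pushes the optimizer to fit every point, outliers included.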
In the SVM classifier, it is easy to have a linear hyper-plane between these two
classes. But another burning question arises: do we need to add this feature manually
to have a hyper-plane? No, the SVM algorithm has a technique called the kernel trick.
The SVM kernel is a function that, in effect, takes a low-dimensional input space and
transforms it into a higher-dimensional space, i.e. it converts a non-separable problem
into a separable one. It does so by computing similarities as if the data had been
mapped, without building the new features explicitly. It is mostly useful in non-linear
separation problems. Simply put, it does some extremely complex data transformations,
then finds out how to separate the data based on the labels or outputs you’ve
defined.
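A small numeric check can make the kernel trick less mysterious. The sketch below uses the degree-2 polynomial kernel k(x, z) = (x·z)^2 as an example (the text does not specify a kernel, so this choice is an assumption) and shows that it gives the same number as an ordinary dot product after an explicit mapping `phi` into three dimensions.

```python
import math


def dot(u, v):
    """Ordinary dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))


def phi(x):
    """Explicit map from 2-D to 3-D: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel, computed entirely in the original 2-D space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)

explicit = dot(phi(x), phi(z))  # map to 3-D first, then take the dot product
implicit = poly2_kernel(x, z)   # never leaves 2-D

print(explicit, implicit)
```

Both routes agree (up to floating-point rounding), which is why the higher-dimensional features never need to be added by hand.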
When we look at the hyper-plane in the original input space, it looks like a circle:
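One way to see why a flat boundary can look like a circle: assume the extra feature is z = x^2 + y^2 (an assumption for illustration; the text does not state the mapping). A linear threshold on z, say z = 4, then corresponds to the circle of radius 2 around the origin in the original (x, y) plane. The points below are made up.

```python
def lift(point):
    """Hypothetical added feature: squared distance from the origin."""
    x, y = point
    return x * x + y * y


# Illustrative data: one class near the origin, the other surrounding it
inner = [(0.0, 0.0), (1.0, 0.0), (0.0, -1.0), (-0.5, 0.5)]
outer = [(3.0, 0.0), (0.0, 3.0), (-2.0, 2.0), (2.0, -2.5)]

threshold = 4.0  # linear boundary z = 4, i.e. the circle x^2 + y^2 = 4


def classify(point):
    """+1 outside the circle, -1 inside: a single linear test on the lifted feature."""
    return 1 if lift(point) > threshold else -1


print([classify(p) for p in inner])
print([classify(p) for p in outer])
```

No straight line in the (x, y) plane separates these two groups, but a single cut along the z axis does.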
Now, let’s look at the methods to apply the SVM classifier algorithm in a data science challenge.