What Is Support Vector Machine
Support vectors are simply the coordinates of individual observations. The SVM
classifier is the frontier (hyper-plane/line) that best segregates the two classes.
Let’s understand:
Above, you can see that the margin for hyper-plane C is higher than for both A
and B. Hence, we name C as the right hyper-plane. Another compelling reason for
selecting the hyper-plane with the higher margin is robustness: if we select a
hyper-plane with a low margin, there is a high chance of misclassification.
In the SVM classifier, it is easy to have a linear hyper-plane between two such
classes. But another burning question arises: do we need to add this feature
manually to obtain a hyper-plane? No, the SVM algorithm has a technique called
the kernel trick. An SVM kernel is a function that takes a low-dimensional input
space and transforms it into a higher-dimensional space, i.e. it converts a
non-separable problem into a separable one. It is mostly useful in non-linear
separation problems. Simply put, it performs some fairly complex data
transformations, then finds the process to separate the data based on the labels
or outputs you’ve defined.
When we look at the hyper-plane in the original input space, it looks like a circle:
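This lifting trick can be sketched on synthetic circular data using only NumPy. The cluster radii and the threshold below are illustrative assumptions, not values from the article:

```python
import numpy as np

rng = np.random.default_rng(0)
# Inner class: points near the origin; outer class: points on a ring of radius ~3.
inner = rng.normal(0, 0.5, size=(50, 2))
angles = rng.uniform(0, 2 * np.pi, 50)
outer = np.c_[3 * np.cos(angles), 3 * np.sin(angles)] + rng.normal(0, 0.1, (50, 2))
X = np.vstack([inner, outer])
y = np.array([0] * 50 + [1] * 50)

# No straight line separates the classes in (x1, x2). Add the squared
# radius as a third feature: z = x1^2 + x2^2. In (x1, x2, z) a flat
# plane z = const separates the classes -- and that plane, viewed back
# in the original 2-D input space, is exactly a circle.
z = (X ** 2).sum(axis=1)
threshold = 4.0  # plane z = 4, chosen between the two clusters (assumed)
pred = (z > threshold).astype(int)
accuracy = (pred == y).mean()
print(accuracy)
```

A kernel (such as RBF) lets the SVM use this kind of lifted space implicitly, without ever building the extra feature.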
Now, let’s look at how to apply the SVM classifier algorithm in a data science
challenge, using a real-life problem statement and dataset to understand how to
apply SVM for classification.
Problem Statement
Dream Housing Finance company deals in all kinds of home loans. It has a presence
across all urban, semi-urban and rural areas. A customer first applies for a home
loan, after which the company validates the customer’s eligibility for the loan.
The company wants to automate the loan eligibility process (in real time) based on
the customer details provided while filling in an online application form. These
details are Gender, Marital Status, Education, Number of Dependents, Income, Loan
Amount, Credit History and others. To automate this process, they have posed the
problem of identifying the customer segments that are eligible for a loan amount,
so that they can specifically target these customers. They have provided a partial
data set.
Use the coding window below to predict the loan eligibility on the test set. Try
changing the hyperparameters for the linear SVM to improve the accuracy.
The e1071 package in R makes it easy to create Support Vector Machines. It contains
helper functions as well as code for the Naive Bayes classifier. The creation of a
support vector machine in R and in Python follows a similar approach; let’s now
take a look at the following code:
# Import library
require(e1071) # contains the SVM
Train <- read.csv(file.choose())
Test <- read.csv(file.choose())

# There are various options associated with SVM training,
# like changing the kernel, gamma and C (cost) value.

# Create model
model <- svm(Target ~ Predictor1 + Predictor2 + Predictor3, data = Train,
             kernel = 'linear', gamma = 0.2, cost = 100)

# Predict output
preds <- predict(model, Test)
table(preds)
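The same model can be sketched in Python with scikit-learn. The loan CSV files are not bundled here, so a synthetic data set stands in for Train/Test (an assumption for illustration only); note that scikit-learn calls the cost parameter C:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the loan data: 3 predictors, binary target.
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Same knobs as the R call: kernel, gamma, and cost (called C here).
model = SVC(kernel='linear', gamma=0.2, C=100).fit(X_train, y_train)
preds = model.predict(X_test)
print(model.score(X_test, y_test))
```

With the linear kernel, gamma is ignored; it only matters for the non-linear kernels discussed below.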
Tuning the parameter values for machine learning algorithms effectively improves
model performance. Let’s look at the list of parameters available with SVM.
kernel: We have already discussed it. Various options are available for the
kernel, like “linear”, “rbf”, “poly” and others (the default value is “rbf”).
“rbf” and “poly” are useful for a non-linear hyper-plane. Let’s look at an
example where we’ve used a linear kernel on two features of the iris data set to
classify their class.
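The iris example referenced above can be sketched as follows (a minimal version, omitting the plotting code):

```python
from sklearn import datasets, svm

# Linear-kernel SVM on the first two iris features only.
iris = datasets.load_iris()
X = iris.data[:, :2]  # sepal length and sepal width
y = iris.target

svc = svm.SVC(kernel='linear', C=1).fit(X, y)
print(svc.score(X, y))  # training accuracy on the two-feature problem
```

Because only two of the four features are used, the three classes are not fully separable and the training accuracy stays well below 1.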
Example: using the SVM rbf kernel
Change the kernel type to rbf in the line below and look at the impact. Note that
gamma must be positive (recent scikit-learn versions also accept 'scale' or
'auto'); gamma=0 raises an error.
svc = svm.SVC(kernel='rbf', C=1, gamma='scale').fit(X, y)
I would suggest you go for a linear SVM kernel if you have a large number of
features (>1000), because it is more likely that the data is linearly separable
in a high-dimensional space. You can also use RBF, but do not forget to
cross-validate its parameters to avoid over-fitting.
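That cross-validation can be sketched with scikit-learn's GridSearchCV on the built-in iris data; the grid values below are illustrative assumptions:

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV

X, y = datasets.load_iris(return_X_y=True)

# 5-fold cross-validation over a small grid of C and gamma values.
param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.01, 0.1, 1]}
search = GridSearchCV(svm.SVC(kernel='rbf'), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

The best parameters found this way are the ones you would then use to refit the model on the full training set.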
gamma: the kernel coefficient for ‘rbf’, ‘poly’ and ‘sigmoid’. The higher the
value of gamma, the more the model will try to exactly fit the training data set,
which can hurt generalization and cause an over-fitting problem.
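The effect can be seen directly on the iris data: with a very large gamma the model fits the training set almost perfectly, while its cross-validated accuracy drops. The two gamma values below are illustrative extremes:

```python
from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

X, y = datasets.load_iris(return_X_y=True)
for g in (0.01, 100.0):
    # Training accuracy vs 5-fold cross-validated accuracy for this gamma.
    train_acc = svm.SVC(kernel='rbf', gamma=g, C=1).fit(X, y).score(X, y)
    cv_acc = cross_val_score(svm.SVC(kernel='rbf', gamma=g, C=1), X, y, cv=5).mean()
    print(f"gamma={g}: train accuracy={train_acc:.2f}, CV accuracy={cv_acc:.2f}")
```

With gamma=100 each training point only influences its immediate neighbourhood, so the model effectively memorizes the training set; the gap between training and cross-validated accuracy is the over-fitting described above.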
In R, SVMs can be tuned in a similar fashion as in Python. In the e1071 package,
the corresponding arguments to svm() are kernel (one of “linear”, “polynomial”,
“radial” and “sigmoid”, where “radial” is the RBF kernel and the default), cost
(the C constant of the regularization term) and gamma (defaulting to 1 divided by
the number of features).
Practice Problem
Find the right additional feature to obtain a hyper-plane segregating the classes
in the snapshot below:
Post the variable name in the comments section below. I’ll then reveal the
answer.