
Classification

Summer 2024
© IIT Roorkee India

Prof. Sharma T
Support Vector Machine (SVM)

Support Vector Machine Algorithm
• One of the most popular supervised learning algorithms; it can be used for classification as well as regression problems.
• Goal: to create the best line or decision boundary that segregates n-dimensional space into classes, so that a new data point can easily be placed in the correct category in the future. This best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed Support Vector Machine. A minimal usage sketch follows.
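A minimal sketch of fitting an SVM classifier with scikit-learn; the toy data below is illustrative, not from the lecture.

```python
from sklearn.svm import SVC

# Toy 2-D data: two linearly separable classes (illustrative only)
X = [[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

# Fit a linear support vector classifier
clf = SVC(kernel="linear")
clf.fit(X, y)

# Predict the class of a new point
print(clf.predict([[4, 4]]))
```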

SVM
• SVM finds a hyperplane that segregates the labeled dataset (supervised machine learning) into two classes.

• To choose the right hyperplane we need the margin.

• The margin is the distance between the hyperplane and the closest point from either set.

Hyperplane and Support Vectors

• Hyperplane: There can be multiple lines/decision boundaries to segregate the classes in n-dimensional space, but we need to find the best decision boundary that helps to classify the data points. This best boundary is known as the hyperplane of the SVM.

• The dimension of the hyperplane depends on the number of features in the dataset: if there are 2 features, the hyperplane is a straight line; if there are 3 features, the hyperplane is a 2-dimensional plane.

• We always create the hyperplane that has the maximum margin, i.e. the maximum distance between the hyperplane and the nearest data points of each class.

• Support Vectors: The data points or vectors that are closest to the hyperplane and affect its position are termed support vectors. Because these vectors support the hyperplane, the method is called a Support Vector Machine. A sketch showing how to inspect them in scikit-learn follows.
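A short sketch (reusing illustrative data) showing how a fitted scikit-learn SVC exposes its support vectors:

```python
from sklearn.svm import SVC

# Illustrative 2-D data (not from the lecture)
X = [[1, 2], [2, 3], [3, 3], [6, 5], [7, 8], [8, 8]]
y = [0, 0, 0, 1, 1, 1]

clf = SVC(kernel="linear").fit(X, y)

# The points closest to the hyperplane, which define its position
print(clf.support_vectors_)   # coordinates of the support vectors
print(clf.n_support_)         # number of support vectors per class
```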

Two different categories that are classified
using a decision boundary or hyperplane:

Example:
• SVM creates a decision boundary between the two classes (cat and dog) by choosing the extreme cases (support vectors); it looks at the extreme examples of cats and dogs and, on the basis of these support vectors, classifies a new image as a cat or a dog.
• Other applications: face detection, image classification, text categorization, etc.

Types of SVM
• Linear SVM: Linear SVM is used for linearly separable data. If a dataset can be classified into two classes by using a single straight line, such data is termed linearly separable data, and the classifier is called a Linear SVM classifier.

• Non-linear SVM: Non-linear SVM is used for non-linearly separable data. If a dataset cannot be classified by using a straight line, such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier. A small sketch contrasting the two follows.
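A minimal sketch contrasting a linear and a non-linear (RBF) SVM in scikit-learn; make_circles is used only as an illustrative non-linearly separable dataset.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: not separable by a single straight line
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_clf = SVC(kernel="linear").fit(X, y)   # linear SVM
rbf_clf = SVC(kernel="rbf").fit(X, y)         # non-linear SVM

print("Linear SVM accuracy:", linear_clf.score(X, y))
print("RBF SVM accuracy:", rbf_clf.score(X, y))
```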

How does SVM work?
Linear SVM:
• Suppose we have a dataset with two tags (green and blue), and the dataset has two features, x1 and x2.
• We want a classifier that can classify a pair (x1, x2) of coordinates as either green or blue.
• It is a 2-D space, so we can separate the two classes with a straight line, but there can be multiple lines that separate them.
• Hence, the SVM algorithm helps to find the best line or decision boundary; this best boundary is called a hyperplane. A sketch of such a two-feature classifier follows.
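A sketch of the two-feature case; make_blobs provides illustrative data, and the fitted line w·x + b = 0 is the decision boundary.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two features (x1, x2) and two classes (stand-ins for green/blue)
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=0)

clf = SVC(kernel="linear").fit(X, y)

# Coefficients of the separating line w.x + b = 0
print("w =", clf.coef_[0])
print("b =", clf.intercept_[0])
```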
How to get the best line or decision boundary?

• The SVM algorithm finds the points of each class that are closest to the decision boundary. These points are called support vectors.
• The distance between these vectors and the hyperplane is called the margin.
• The goal of SVM is to maximize this margin. The hyperplane with the maximum margin is called the optimal hyperplane. The margin of a fitted linear SVM can be computed as shown below.
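A sketch computing the margin of a fitted linear SVM from its weight vector, following the lecture's convention that the half-margin is 1/||w||; numpy and the illustrative blobs data are assumptions.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
clf = SVC(kernel="linear", C=1e5).fit(X, y)   # large C approximates a hard margin

w = clf.coef_[0]
half_margin = 1.0 / np.linalg.norm(w)    # distance from the hyperplane to H1 (or H2)
print("full margin width (2/||w||):", 2 * half_margin)
```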

Mathematical Interpretation of Optimal Hyperplane

Fig. 1: We have l training examples, where each example x_i is of dimension D and has a label of either y = +1 or y = −1, and the examples are linearly separable. The training data is therefore of the form {(x_i, y_i)}, x_i ∈ R^D, y_i ∈ {−1, +1}, i = 1, …, l.

Fig. 1

We consider D = 2 to keep the explanation simple, and the data points are linearly separable. The separating hyperplane can be described as w·x + b = 0.

The support vectors are the examples closest to the optimal hyperplane, and the aim of SVM is to orient this hyperplane as far as possible from the closest members of both classes.

From Fig. 2, the SVM problem can be formulated as follows:

Fig. 2

• From Fig. 2 we have two hyperplanes, H1 and H2, passing through the support vectors of the −1 and +1 classes respectively, so H1: w·x + b = −1 and H2: w·x + b = +1.

• The distance between hyperplane H1 and the origin is (−1 − b)/|w|, and the distance between hyperplane H2 and the origin is (1 − b)/|w|. The distance between the two hyperplanes is therefore M = (1 − b)/|w| − (−1 − b)/|w| = 2/|w|.

• Here M is twice the margin, so the margin can be written as 1/|w|. As the optimal hyperplane maximizes the margin, the SVM objective boils down to maximizing the term 1/|w|, subject to every training point lying on the correct side; this is written out below.
Fig. 2 explanation
• The two hyperplanes pass through the support vectors of the two classes and define the boundaries of the margin in an SVM. They are denoted H1 and H2 and correspond to the following conditions:
• The hyperplane H1, located closer to the class 1 points (circles), satisfies w⋅x + b = −1; it is the boundary for class 1, on the negative side of the margin.
• The hyperplane H2, located closer to the class 2 points (black dots), satisfies w⋅x + b = +1; it is the boundary for class 2, on the positive side of the margin.
• The solid line in between is the decision boundary w⋅x + b = 0, which separates the two classes.

Non-Linear SVM
• If data is linearly arranged, we can separate it by using a straight line, but for non-linear data we cannot draw a single straight line.
• To separate these data points, we need to add one more dimension. For linear data we used two dimensions, x and y, so for non-linear data we add a third dimension, z.
• It can be calculated as z = x² + y². A sketch of this explicit mapping follows.
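A sketch of the explicit mapping just described; make_circles stands in for the non-linear data, and adding z = x² + y² as a third feature makes the classes separable by a plane.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Add the third dimension z = x1^2 + x2^2
z = (X[:, 0] ** 2 + X[:, 1] ** 2).reshape(-1, 1)
X3 = np.hstack([X, z])

# In the lifted 3-D space a *linear* SVM is enough
clf = SVC(kernel="linear").fit(X3, y)
print("accuracy in lifted space:", clf.score(X3, y))
```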

Kernel
The kernel trick in SVM allows us to handle complex data patterns that can't be separated by a simple straight line. Kernels are functions that transform (or "project") data into a higher-dimensional space, where it becomes easier to find a dividing boundary between different classes.
Types of Kernel:

1. Linear Kernel: draws a straight line for simpler, separable data.

2. Polynomial Kernel: adds curved boundaries for more complex, non-linear separations.

3. RBF Kernel: adapts flexibly around data points, creating complex, non-linear boundaries suitable for highly intricate patterns.

Each kernel provides a different way of "reshaping" the data to help SVM find a clear boundary, no matter how complex the data distribution is. A comparison sketch of these kernels follows.
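A minimal comparison of the three kernels listed above on illustrative data (make_moons is an assumption, not from the lecture):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel, degree=3, gamma="scale")
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{kernel:6s} kernel: mean CV accuracy = {scores.mean():.3f}")
```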

By adding the third dimension, the sample space becomes as shown in the image below:

SVM divides the dataset into classes. Since we are in 3-D space, the decision boundary looks like a plane parallel to the x-axis. If we convert it back to 2-D space with z = 1, it looks like the figure below: we get a circle of radius 1 in the case of the non-linear data.

Class Exercise

• Revise the Jupyter notebook from the previous lecture for the heart disease prediction dataset, using SVM.
• Note the change in the performance of the model for different hyperparameters.
Task to perform
• Run SVM with default hyperparameters
• Run SVM with linear kernel
• Run SVM with polynomial kernel
• Run SVM with sigmoid kernel
• Confusion matrix
• Classification metrics
• ROC - AUC
• Stratified k-fold cross-validation with shuffle split
• Hyperparameter optimization using GridSearchCV and RandomizedSearchCV
A starter sketch covering several of these steps follows the list.
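A starter sketch for the exercise, assuming a CSV file named heart.csv with a binary target column (both names are assumptions, not from the lecture):

```python
import pandas as pd
from sklearn.model_selection import (train_test_split, StratifiedKFold,
                                     GridSearchCV, cross_val_score)
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score

df = pd.read_csv("heart.csv")                     # assumed file name
X, y = df.drop(columns="target"), df["target"]    # assumed target column
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42)

# SVM with default hyperparameters inside a scaling pipeline
model = make_pipeline(StandardScaler(), SVC(probability=True))
model.fit(X_train, y_train)

# Confusion matrix, classification metrics, ROC-AUC
y_pred = model.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
print("ROC-AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Stratified k-fold cross-validation with shuffling
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
print("CV accuracy:", cross_val_score(model, X, y, cv=cv).mean())

# Hyperparameter optimization with GridSearchCV (try the kernels listed above)
param_grid = {"svc__kernel": ["linear", "poly", "rbf", "sigmoid"],
              "svc__C": [0.1, 1, 10]}
grid = GridSearchCV(model, param_grid, cv=cv)
grid.fit(X_train, y_train)
print("best params:", grid.best_params_)
```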
