Support Vector Machine

Task: Classification (Mostly)

Algorithm: Support Vector Machine


SVM

Support Vector Machine (SVM) is a powerful machine learning algorithm used for linear or nonlinear classification, regression, and even outlier detection tasks.
SVMs can be used for a variety of tasks, such as text classification, image classification, spam detection, handwriting identification, gene expression analysis, face detection, and anomaly detection.
SVMs are adaptable and efficient across a variety of applications because they can handle high-dimensional data and nonlinear relationships.
Support Vector Machine is grounded in statistical learning theory.
SVM works best when the dataset is small and complex.
SVM
Support Vector Machine, or SVM, is one of the most popular Supervised Learning algorithms. It is used for both Classification and Regression problems, though in Machine Learning it is primarily used for Classification.

The goal of the SVM algorithm is to find the best line or decision boundary that segregates n-dimensional space into classes, so that new data points can easily be placed in the correct category in the future. This best decision boundary is called a hyperplane.

The hyperplane is chosen so that the margin between the closest points of the different classes is as large as possible, as the sketch below illustrates.
The dimension of the hyperplane depends on the number of features.
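As a minimal sketch of the decision rule (with made-up weights and bias, purely for illustration), a point is assigned to a class according to which side of the hyperplane w·x + b = 0 it falls on:

```python
import numpy as np

# Hypothetical weights and bias defining the hyperplane w.x + b = 0
# (made-up numbers, purely for illustration)
w = np.array([2.0, -1.0])
b = -0.5

def classify(x):
    # The sign of w.x + b tells us which side of the hyperplane x lies on
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([3.0, 1.0])))  # positive side -> 1
print(classify(np.array([0.0, 2.0])))  # negative side -> -1
```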
Hyperplane?

In Machine Learning, a hyperplane is a decision boundary that divides the input space into two or more regions, each corresponding to a different class or output label.

In a 2D space, a hyperplane is a straight line that divides the space into two halves.
Important Terms

● Support Vectors: These are the data points closest to the hyperplane. The separating line is defined with the help of these points.

● Margin: The distance between the hyperplane and the observations closest to the hyperplane (the support vectors). In SVM, a large margin is considered a good margin. There are two types of margin, hard margin and soft margin; a short sketch illustrating these terms follows this list.
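A minimal sketch of these terms using scikit-learn (synthetic data; the specific blob parameters are arbitrary choices): after fitting a linear SVM, the support vectors can be read off the model, and for a linear kernel the margin width is 2/||w||.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated synthetic blobs so a linear SVM fits cleanly
X, y = make_blobs(n_samples=40, centers=2, random_state=6)

clf = SVC(kernel="linear", C=1000).fit(X, y)

# Support vectors: the training points closest to the hyperplane
print("support vectors:\n", clf.support_vectors_)

# For a linear kernel, the margin width is 2 / ||w||
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))
```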
Types of Support Vector Machine (SVM) Algorithms

● Linear SVM: Linear SVM can be used only when the data is perfectly linearly separable. Perfectly linearly separable means that the data points can be classified into 2 classes using a single straight line (in 2D).

● Non-Linear SVM: Non-Linear SVM is used when the data is not linearly separable, i.e. when the data points cannot be separated into 2 classes using a straight line (in 2D). In that case we use advanced techniques such as the kernel trick to classify them. In most real-world applications we do not find linearly separable data points, hence we use the kernel trick to solve them; the comparison sketch after this list shows the difference.
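A short comparison sketch (synthetic two-moons data; scikit-learn assumed): a linear SVM struggles on data that is not linearly separable, while an RBF-kernel SVM handles it well.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: not separable by a straight line
X, y = make_moons(n_samples=300, noise=0.15, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for kernel in ("linear", "rbf"):
    clf = SVC(kernel=kernel).fit(X_tr, y_tr)
    print(kernel, "accuracy:", clf.score(X_te, y_te))
# The RBF kernel typically scores noticeably higher here,
# because the true boundary is curved.
```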
Example
From the figure it is clear that there are multiple lines (our hyperplane here is a line because we are considering only two input features, x1 and x2) that segregate the data points, i.e. classify the red and blue circles. So how do we choose the best line, or in general the best hyperplane, to segregate our data points?

How does SVM work?

One reasonable choice for the best hyperplane is the one that represents the largest separation, or margin, between the two classes.

So we choose the hyperplane whose distance to the nearest data point on each side is maximized.

If such a hyperplane exists, it is known as the maximum-margin hyperplane, or hard margin. So from the figure above, we choose L2.
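A hard margin can be approximated in scikit-learn by making the penalty parameter C very large, so that no margin violation is tolerated (a sketch on separable synthetic data; the blob parameters are arbitrary choices):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Linearly separable synthetic data: a maximum-margin hyperplane exists
X, y = make_blobs(n_samples=50, centers=2, cluster_std=0.6, random_state=0)

# A very large C penalizes any violation so heavily that the fit
# approximates the hard-margin (maximum-margin) hyperplane
hard = SVC(kernel="linear", C=1e6).fit(X, y)
print("training accuracy:", hard.score(X, y))  # expect 1.0 on separable data
```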
Let's consider a scenario like the one shown below.

Here we have one blue ball inside the boundary of the red balls. So how does SVM classify the data? It's simple: the blue ball inside the boundary of the red ones is an outlier of the blue class.

The SVM algorithm ignores such outliers and finds the best hyperplane that maximizes the margin; SVM is robust to outliers.

So with this kind of data point, SVM finds the maximum margin as it did with the previous datasets, and in addition it adds a penalty each time a point crosses the margin. The margins in these cases are called soft margins, as the sketch below illustrates.
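A minimal soft-margin sketch (hand-made toy data with one deliberate outlier; in scikit-learn the per-violation penalty is controlled by the parameter C):

```python
import numpy as np
from sklearn.svm import SVC

# Two separable clusters plus one class-0 "outlier" sitting among class 1
X = np.array([[1.0, 1.0], [1.5, 0.8], [2.0, 1.2], [1.2, 1.5],  # class 0
              [4.0, 4.0], [4.5, 3.8], [5.0, 4.2], [4.2, 4.5],  # class 1
              [4.3, 4.0]])                                     # outlier, class 0
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 0])

for C in (0.1, 1000.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C={C}: margin width = {margin:.2f}, "
          f"training accuracy = {clf.score(X, y):.2f}")
# A small C tolerates the outlier and keeps a wide margin (soft margin);
# a large C shrinks the margin trying to reduce violations.
```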
So far we have been talking about linearly separable data (the groups of blue balls and red balls are separable by a straight line). What do we do if the data are not linearly separable? Say our data looks like the figure below.

SVM solves this by creating a new variable using a kernel. For a point xi on the line, we create a new variable yi as a function of its distance from the origin O. If we plot this, we get something like the figure shown below.

In this case, the new variable y is created as a function of the distance from the origin. A non-linear function that creates such a new variable is referred to as a kernel; a minimal numeric sketch of this idea follows.
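A minimal numeric sketch of this idea (toy 1-D data, made up for illustration): no single threshold on x separates the classes, but the new variable y = |x|, the distance from the origin, makes them separable by a horizontal line.

```python
import numpy as np

# Toy 1-D data: class 0 sits near the origin, class 1 on both sides of it,
# so no single threshold on x can separate them
x = np.array([-3.0, -2.5, -0.5, 0.0, 0.5, 2.5, 3.0])
labels = np.array([1, 1, 0, 0, 0, 1, 1])

# New variable y_i as a function of the distance of x_i from the origin
y_new = np.abs(x)
print(y_new)  # [3.  2.5 0.5 0.  0.5 2.5 3. ]

# In the (x, y_new) plane the horizontal line y = 1.5 now separates
# the classes: class 1 has y_new > 1.5, class 0 has y_new < 1.5
print(labels == (y_new > 1.5))  # all True
```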
Support Vector Machine Terminology
1. Hyperplane: The hyperplane is the decision boundary used to separate the data points of different classes in a feature space. In the case of linear classification, it is a linear equation, i.e. wx + b = 0.
2. Support Vectors: Support vectors are the data points closest to the hyperplane, and they play a critical role in deciding the hyperplane and the margin.
3. Margin: The margin is the distance between the support vectors and the hyperplane. The main objective of the support vector machine algorithm is to maximize the margin; a wider margin generally indicates better classification performance.
4. Kernel: The kernel is the mathematical function used in SVM to map the original input data points into high-dimensional feature spaces, so that the hyperplane can be found easily even when the data points are not linearly separable in the original input space. Some common kernel functions are linear, polynomial, radial basis function (RBF), and sigmoid.
5. Hard Margin: The maximum-margin hyperplane, or hard-margin hyperplane, is a hyperplane that properly separates the data points of different categories without any misclassifications.
6. Soft Margin: When the data is not perfectly separable or contains outliers, SVM permits a soft-margin technique. The soft-margin SVM formulation introduces a slack variable for each data point, which relaxes the strict margin requirement and permits some misclassifications or violations. It finds a compromise between increasing the margin and reducing violations.
Types of Support Vector Machine
Based on the nature of the decision boundary, Support Vector Machines (SVM) can be divided into two main types:

Linear SVM: Linear SVMs use a linear decision boundary to separate the data points of different classes. When the
data can be precisely linearly separated, linear SVMs are very suitable. This means that a single straight line (in 2D) or a
hyperplane (in higher dimensions) can entirely divide the data points into their respective classes. A hyperplane that
maximizes the margin between the classes is the decision boundary.

Non-Linear SVM: Non-Linear SVM can be used to classify data when it cannot be separated into two classes by a
straight line (in the case of 2D). By using kernel functions, nonlinear SVMs can handle nonlinearly separable data. The
original input data is transformed by these kernel functions into a higher-dimensional feature space, where the data
points can be linearly separated. A linear SVM fitted in this transformed space then corresponds to a nonlinear decision boundary in the original space.
Advantages of SVM
● Effective in high-dimensional cases.
● Different kernel functions can be specified for the decision function, and it is possible to specify custom kernels, as the sketch below shows.
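For instance, scikit-learn's SVC accepts a callable as the kernel, which must return the Gram (kernel) matrix between two sets of samples; a hand-rolled polynomial-style kernel might look like this sketch (the kernel formula here is an arbitrary illustration):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=60, centers=2, random_state=1)

def my_kernel(A, B):
    # A hand-rolled polynomial-style kernel: K(a, b) = (a.b + 1)^2
    return (A @ B.T + 1.0) ** 2

# SVC accepts a callable kernel that returns the Gram matrix
clf = SVC(kernel=my_kernel).fit(X, y)
print("training accuracy:", clf.score(X, y))
```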

Disadvantages of SVM
● Choosing a good kernel is not easy.
● It does not perform well on large datasets, since training becomes computationally expensive.
Real-life example: predict whether a cancer is benign or malignant.

Using historical data about patients diagnosed with cancer, doctors can differentiate malignant cases from benign ones given the independent attributes.
Steps
● Load the breast cancer dataset from sklearn.datasets.
● Separate the input features and the target variable.
● Build and train the SVM classifier using the RBF kernel.
● Plot a scatter plot of the input features.
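A minimal end-to-end sketch of these steps (scikit-learn and matplotlib assumed; the train/test split is an added detail, and only the first two of the 30 features are plotted):

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Step 1: load the breast cancer dataset
data = load_breast_cancer()

# Step 2: separate input features and target variable
X, y = data.data, data.target  # target: 0 = malignant, 1 = benign

# Step 3: build and train an SVM classifier with the RBF kernel
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=42)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))

# Step 4: scatter plot of the first two input features, coloured by class
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="coolwarm", s=15)
plt.xlabel(data.feature_names[0])
plt.ylabel(data.feature_names[1])
plt.title("Breast cancer dataset: first two features")
plt.show()
```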
