
An Overview of Support Vector Machines

By: Bhuvin Sai Batikiri


01 INTRODUCTION

02 PRIMAL PROBLEM

03 DUAL PROBLEM

04 HARD MARGIN

05 SOFT MARGIN

06 CONCLUSION
Introduction to SVMs
Support Vector Machines (SVMs) are a set of supervised learning methods used for
classification, regression, and outlier detection. In Support Vector Machines, the goal is to
find a hyperplane that separates the data points of the two classes with the largest possible
margin.
The hard margin SVM is used when the data is
linearly separable without any error, while the
soft margin SVM allows for misclassification
when the data is not perfectly separable.
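
To make this concrete, below is a minimal sketch of fitting a linear SVM classifier with
scikit-learn; the toy data and parameter values are assumptions chosen for illustration, not
part of the original slides.

# Minimal linear SVM sketch with scikit-learn (illustrative toy data).
import numpy as np
from sklearn.svm import SVC

# Two small, linearly separable clusters (assumed example data).
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

# C is the soft-margin regularization parameter; a very large C approximates a hard margin.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print(clf.support_vectors_)        # the support vectors found during training
print(clf.predict([[3.0, 3.0]]))   # classify a new point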
The Primal Problem
The Primal Problem refers to the original optimization problem formulated for the
classification task. It seeks to find the best separating hyperplane between two classes by
maximizing the margin between the closest data points (support vectors) and the
hyperplane.

The primal problem is a quadratic programming problem that tries to minimize the norm of
the weight vector (which corresponds to maximizing the margin) while using slack variables
to penalize points that are misclassified or fall inside the margin.
Equation
Given a dataset {(x_i, y_i)}, i = 1, …, n, where x_i ∈ R^d are the feature vectors and
y_i ∈ {−1, +1} are the class labels, the primal problem is to minimize the following
objective function:

\min_{w, b, \xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i

Subject to the following constraints:

y_i (w^\top x_i + b) \geq 1 - \xi_i, \qquad \xi_i \geq 0, \qquad i = 1, \dots, n

Where:
w is the weight vector that defines the hyperplane.
b is the bias term.
ξ_i are slack variables, which allow for some misclassification in non-linearly separable
cases.
C is the regularization parameter controlling the trade-off between maximizing the margin
and minimizing the classification error (penalizing the slack variables).
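
As a rough numerical sketch of this primal problem, the slack variables can be folded into the
equivalent hinge-loss form ξ_i = max(0, 1 − y_i(w·x_i + b)) and the objective minimized by
plain subgradient descent; the data, learning rate, and iteration count below are assumptions
for illustration.

# Soft-margin primal via the equivalent hinge-loss objective:
#   (1/2)||w||^2 + C * sum_i max(0, 1 - y_i * (w . x_i + b)),
# minimized with simple subgradient descent (a sketch, not a production solver).
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

C, lr, n_iter = 1.0, 0.01, 2000
w, b = np.zeros(2), 0.0

for _ in range(n_iter):
    margins = y * (X @ w + b)
    viol = margins < 1                                    # points violating the margin
    grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
    grad_b = -C * y[viol].sum()
    w -= lr * grad_w
    b -= lr * grad_b

print("training accuracy:", (np.sign(X @ w + b) == y).mean())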
The Dual Problem
The dual problem focuses on maximizing a different objective function that depends only on
the Lagrange multipliers, denoted as α_i.

These multipliers are associated with the constraints in the primal problem.
It is often used in practice because it allows the introduction of kernel functions to handle non-
linearly separable data and provides a more computationally efficient solution, especially for
high-dimensional feature spaces.

The dual optimization problem is formulated as:

\max_{\alpha} \;\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)

Subject to the following constraints:

\sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \leq \alpha_i \leq C, \qquad i = 1, \dots, n

Where:
α_i are the Lagrange multipliers.
y_i are the class labels.
x_i · x_j is the dot product of the feature vectors x_i and x_j.
C is the regularization parameter from the primal problem.
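
Because the dual depends on the data only through the dot products x_i · x_j, they can be
swapped for a kernel function K(x_i, x_j); this is what makes the dual attractive for
non-linearly separable data. A minimal sketch with scikit-learn's RBF-kernel SVM follows; the
dataset and parameters are assumptions for illustration.

# The kernel trick: the dual only needs K(x_i, x_j), so a non-linear kernel
# can handle data that is not linearly separable in the original input space.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear", C=1.0).fit(X, y)
rbf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y)

print("linear kernel training accuracy:", linear.score(X, y))  # poor on concentric circles
print("RBF kernel training accuracy:   ", rbf.score(X, y))     # close to 1.0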
Hard Margin
The Hard Margin case intends to find a hyperplane that separates the two classes in such a
way that no data points are misclassified. Given a dataset {(x_i, y_i)}, i = 1, …, n, where
x_i ∈ R^d and y_i ∈ {−1, +1}, we want to find a hyperplane of the form:

w^\top x + b = 0

where w is the normal vector to the hyperplane and b is the bias term.

The classification rule is:

f(x) = \operatorname{sign}(w^\top x + b)

To achieve the maximum margin, we need to minimize the norm of the weight vector w, since
the margin is 2/‖w‖. Keeping this in mind, the primal problem reduces to:

\min_{w, b} \;\; \frac{1}{2}\|w\|^2 \qquad \text{subject to} \qquad y_i (w^\top x_i + b) \geq 1, \quad i = 1, \dots, n

This is a convex quadratic programming problem: the objective is quadratic in w, and the
constraints are linear in w and b.
Lagrangian for Hard Margin:
We form the Lagrangian for the constrained optimization problem by introducing a Lagrange
multiplier α_i ≥ 0 for each constraint y_i(w·x_i + b) ≥ 1:

L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^\top x_i + b) - 1 \right]

Dual Problem:
To derive the dual problem, we compute the partial derivatives of the Lagrangian with respect
to w and b and set them to zero:

\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad \frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0
Dual Optimization Problem
The dual formulation of the hard margin SVM is:

\max_{\alpha} \;\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)

subject to \alpha_i \geq 0 and \sum_{i=1}^{n} \alpha_i y_i = 0.

Solution:
The equations below give the optimal weight vector and bias recovered from the solution α* of
the dual optimization problem:

w^* = \sum_{i=1}^{n} \alpha_i^* y_i x_i, \qquad b^* = y_k - w^{*\top} x_k \quad \text{for any support vector } x_k \text{ (any } k \text{ with } \alpha_k^* > 0)
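
A small numpy sketch of this recovery step, assuming the dual multipliers have already been
obtained from a QP solver; the toy points and the hard-coded α values below are illustrative
stand-ins for that solver's output.

# Recover w* and b* from dual multipliers (hard margin case).
# The alpha values are hard-coded stand-ins for a QP solver's output on this toy set.
import numpy as np

X = np.array([[1.0, 1.0], [2.0, 2.5], [4.0, 4.0], [5.0, 3.5]])
y = np.array([-1, -1, 1, 1])
alpha = np.array([0.0, 0.32, 0.32, 0.0])     # assumed dual solution

# w* = sum_i alpha_i * y_i * x_i
w = (alpha * y) @ X

# b* = y_k - w* . x_k, averaged over the support vectors (alpha_k > 0)
sv = alpha > 1e-8
b = np.mean(y[sv] - X[sv] @ w)

print("w* =", w, " b* =", b)
print("predictions:", np.sign(X @ w + b))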
Soft Margin
When the data is not linearly separable, we allow for some misclassification by introducing
slack variables ξ_i ≥ 0. The slack variable ξ_i measures how much the i-th data point violates
the margin.

Primal Problem for Soft Margin


The primal optimization problem for soft margin SVM is:

\min_{w, b, \xi} \;\; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \qquad \text{subject to} \qquad y_i (w^\top x_i + b) \geq 1 - \xi_i, \;\; \xi_i \geq 0

Here, C is a regularization parameter that controls the trade-off between maximizing the
margin and minimizing the classification error (through the slack variables ξ_i).
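
After fitting, the slack values themselves can be read off from the fitted w and b as
ξ_i = max(0, 1 − y_i(w·x_i + b)). A short sketch using scikit-learn's linear SVM; the
overlapping toy data and C value are assumptions.

# Recover the slack variables xi_i = max(0, 1 - y_i * (w . x_i + b)) from a fitted linear SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (30, 2)), rng.normal(1, 1, (30, 2))])  # overlapping classes
y = np.array([-1] * 30 + [1] * 30)

clf = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

xi = np.maximum(0.0, 1.0 - y * (X @ w + b))   # xi > 0: inside the margin; xi > 1: misclassified
print("points with nonzero slack:", int((xi > 1e-8).sum()))
print("misclassified points:     ", int((xi > 1.0).sum()))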
Lagrangian for Soft Margin:
Similar to the hard margin case, we construct the Lagrangian by introducing Lagrange
multipliers α_i for the margin constraints and μ_i for the slack variables:

L(w, b, \xi, \alpha, \mu) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^\top x_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{n} \mu_i \xi_i

Dual Formulation for Soft Margin:


To derive the dual formulation, we minimize the Lagrangian with respect to w, b and ξ, and
then maximize it with respect to α and μ. Eliminating μ leaves the same dual objective as in
the hard margin case, with the single change that each α_i is now box-constrained:

\max_{\alpha} \;\; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)

subject to \sum_{i=1}^{n} \alpha_i y_i = 0 and 0 \leq \alpha_i \leq C.
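
For a concrete (if naive) illustration, this dual can be handed to a generic constrained
optimizer such as scipy's SLSQP; real SVM libraries use specialized solvers such as SMO
instead, and the data and C below are assumptions.

# Solve the soft-margin dual with a generic constrained optimizer (illustrative only).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(-2, 1, (15, 2)), rng.normal(2, 1, (15, 2))])
y = np.array([-1.0] * 15 + [1.0] * 15)
n, C = len(y), 1.0

K = X @ X.T                      # Gram matrix of dot products x_i . x_j
Q = np.outer(y, y) * K

def neg_dual(a):                 # negated dual objective, since scipy minimizes
    return -(a.sum() - 0.5 * a @ Q @ a)

res = minimize(neg_dual, np.zeros(n), method="SLSQP",
               bounds=[(0.0, C)] * n,
               constraints=[{"type": "eq", "fun": lambda a: a @ y}])
alpha = res.x

# Recover w and b from the dual solution.
w = (alpha * y) @ X
sv = (alpha > 1e-5) & (alpha < C - 1e-5)      # on-margin support vectors
if not sv.any():
    sv = alpha > 1e-5
b = np.mean(y[sv] - X[sv] @ w)
print("training accuracy:", (np.sign(X @ w + b) == y).mean())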
CONCLUSION
When choosing between hard margin and soft margin SVM, consider the following:

Use hard margin SVM when the data is linearly separable and free of noise or outliers. It
requires perfect separation and is sensitive to outliers, making it suitable for clean datasets
where misclassifications are unacceptable.

Use soft margin SVM when the data is not perfectly separable and may contain noise or
outliers. It allows for misclassifications through slack variables and includes a
regularization parameter C to balance margin size and classification errors. This approach is
more robust and flexible for real-world datasets where noise is common.
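
As a rough way to see this trade-off in practice, a hard margin can be approximated by a very
large C and compared with a moderate C on noisy data; the sketch below uses assumed data and
values.

# Approximating a hard margin with a very large C vs. a soft margin with a moderate C.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(-1.5, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))])  # noisy, overlapping
y = np.array([-1] * 50 + [1] * 50)

for C in (1e6, 1.0):   # 1e6 ~ "hard margin"; 1.0 = soft margin
    clf = SVC(kernel="linear", C=C).fit(X, y)
    margin = 2.0 / np.linalg.norm(clf.coef_[0])
    print(f"C={C:g}: margin width = {margin:.3f}, support vectors = {clf.n_support_.sum()}")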
THANK YOU
