
Support Vector Machines:

Theory, Implementation,
and Applications
Presented By:

Sachin 2022UIT3065

Praatham Sharma 2022UIT3068

Ishan Singh 2022UIT3072

Amjad Imran 2022UIT3079

Sachin Simauliya 2022UIT3151

Presented to:

Ms. Shubhra Goyal


Overview/Agenda
• Introduction to Support Vector Machines
• Theoretical Foundations
• Linear and Non-linear Classification
• Kernel Methods
• Mathematical Framework
• Implementation Approaches
• Applications across domains
• Advanced Topics and Extensions
• Case Studies and Practical Considerations

Learning Objectives: Understanding SVM principles, mathematical formulation, implementation, and practical use
cases.
Introduction to Support
Vector Machines
Developed by Vladimir Vapnik and colleagues at AT&T Bell
Laboratories (1992-1995), SVMs evolved from Statistical Learning Theory.
An SVM is a supervised machine learning algorithm that finds an optimal
hyperplane to separate data into distinct classes, with the focus on
maximizing the margin between those classes. Initially designed for binary
classification, it was later extended to multi-class and regression problems.
Fundamental Concept
At its core, the Support Vector Machine (SVM) seeks to identify the
optimal hyperplane that most effectively separates data points into
distinct classes. This hyperplane is chosen to maximize the margin—
the distance between itself and the nearest data points from each
class. In an n-dimensional space, the hyperplane is defined by the
equation w·x + b = 0, where w represents the weight vector, x is the
input vector, and b is the bias. Classification is then accomplished
using the decision function: f(x) = sign(w·x + b).
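
A minimal sketch of this decision rule in Python (NumPy), using hand-picked illustrative values of w and b rather than parameters learned from data:

import numpy as np

# Illustrative (not learned) parameters: a 2-D weight vector and a bias.
w = np.array([2.0, -1.0])
b = -0.5

def decision_function(x):
    # f(x) = sign(w.x + b): which side of the hyperplane does x fall on?
    return np.sign(np.dot(w, x) + b)

print(decision_function(np.array([1.0, 0.5])))   # 1.0  (positive side)
print(decision_function(np.array([0.0, 1.0])))   # -1.0 (negative side)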
Key Terminology
Support Vectors: Critical data points nearest to the separating hyperplane, influencing its position and orientation.

Margin: The perpendicular distance between the hyperplane and the closest support vectors, indicating classification confidence.

Maximum Margin Hyperplane (MMH): The optimal hyperplane that maximizes the margin, providing the best separation between classes.

Decision Boundary: The hyperplane that distinctly separates data points of different classes, enabling classification.

Feature Space: The n-dimensional space representing all possible values of the input features, where data points are plotted.

[Figure: Maximum margin separating hyperplane]


How SVM Works - Basic Principles
Support Vector Machines (SVMs) follow a structured process to achieve optimal data classification. Here's a concise
overview of the key principles:

Feature Space Mapping: The input data is transformed into a high-dimensional feature space, enabling the
representation of complex relationships within the data.
Hyperplane Generation: Multiple potential hyperplanes are created within the feature space, each acting as a
candidate decision boundary between classes.
Margin Calculation: For each hyperplane, the margin—the distance between the hyperplane and the nearest data
points (support vectors) from each class—is computed.
Maximum Margin Selection: The algorithm selects the hyperplane with the largest margin, known as the Maximum
Margin Hyperplane (MMH), ensuring optimal class separation.
Support Vector Identification: Critical data points closest to the MMH, called support vectors, are identified. These
points play a pivotal role in determining the hyperplane's position and orientation.
Decision Function Derivation: A decision function is formulated using the support vectors and MMH. This function
classifies new data points by determining which side of the hyperplane they fall on.
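
The same sequence of steps can be traced in a few lines with scikit-learn's SVC (a sketch on small synthetic data; the numbers are purely illustrative):

import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters in a 2-D feature space.
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

# A linear kernel with a very large C approximates a hard-margin SVM.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

print("Support vectors:\n", clf.support_vectors_)   # the points defining the MMH
print("w =", clf.coef_[0], " b =", clf.intercept_[0])

# New points are classified by the side of the hyperplane they fall on.
print("Prediction for [4, 4]:", clf.predict([[4.0, 4.0]]))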
Mathematical Representation
The foundation of a linear Support Vector Machine lies in its precise mathematical formulation,
which can be broken down into the following components:

Primal Formulation:
The objective is to minimize ½||w||², where w is the weight vector, while ensuring that every
data point satisfies the constraint yi(w·xi + b) ≥ 1. This guarantees not only correct
classification but also a margin of at least 1, resulting in a quadratic optimization problem that
can be solved using Lagrangian multipliers.

Lagrangian Formulation:
To solve this optimization problem, the Lagrangian function is defined as:
L(w, b, α) = ½||w||² - Σi αi[yi(w·xi + b) - 1]
Here, αi represents the Lagrangian multipliers associated with each constraint.

Karush-Kuhn-Tucker (KKT) Conditions:


Applying the KKT conditions provides essential relationships for optimality:
w = Σi αiyixi
Σi αiyi = 0
These equations play a crucial role in defining the optimal parameters.

Decision Function:
Finally, with the optimal parameters in hand, the decision function formulated for classifying
new data points is:
f(x) = sign(Σi αiyi(xi·x) + b)
This function assigns the class of a new data point based on which side of the hyperplane it falls
on.
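
To make the dual decision function concrete, here is a sketch that evaluates f(x) = sign(Σi αiyi(xi·x) + b) by hand from a fitted scikit-learn SVC, whose dual_coef_ attribute stores the products αi·yi for the support vectors (the data are illustrative):

import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [2, 2], [1, 2], [5, 5], [6, 5], [5, 6]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])
clf = SVC(kernel="linear", C=10.0).fit(X, y)

x_new = np.array([3.0, 4.0])

# f(x) = sum_i alpha_i y_i (x_i . x) + b, summed over the support vectors only.
f = np.dot(clf.dual_coef_[0], clf.support_vectors_ @ x_new) + clf.intercept_[0]

print("Manual decision value: ", f)
print("SVC decision_function:", clf.decision_function([x_new])[0])
print("Predicted class:", np.sign(f))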
Linear Separability
Linearly separable data can be perfectly divided into distinct classes
using a hyperplane. This division requires selecting parameters w (the
weight vector) and b (the bias) so that for every training example (xi,
yi), the condition yi(w·xi + b) > 0 is met. The classification boundaries
are established by the canonical hyperplanes defined by w·x + b = 1
and w·x + b = -1, which create a margin of width 2/||w||. A hard-
margin SVM seeks to find the unique hyperplane that maximizes this
margin, ensuring the most robust separation between the classes.
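
The canonical-hyperplane picture can be checked numerically: after fitting a (near) hard-margin linear SVM, every training point should satisfy yi(w·xi + b) ≥ 1, with the support vectors sitting exactly at 1. A sketch with assumed toy data:

import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)   # very large C ~ hard margin
w, b = clf.coef_[0], clf.intercept_[0]

margins = y * (X @ w + b)                     # functional margins y_i (w.x_i + b)
print(np.round(margins, 3))                   # all >= 1; support vectors are ~1
print("Margin width 2/||w|| =", 2 / np.linalg.norm(w))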
Margin Maximization
The goal of margin maximization is to achieve the widest possible
separation between classes, quantified as 2/||w||, while ensuring all
data points meet the requirement yi(w·xi + b) ≥ 1. This objective is
mathematically equivalent to minimizing (||w||²)/2 under the same
constraints, forming a convex optimization problem with a unique
solution. This problem is commonly solved using the Lagrangian
formulation:

L(w, b, α) = ½||w||² - Σi αi[yi(w·xi + b) - 1],

where the αi represent Lagrange multipliers. The dual formulation of
this optimization problem is then expressed as:

Maximize: Σi αi - ½ Σi Σj αiαj yi yj (xi·xj),

subject to the constraints that αi ≥ 0 for all i and Σi αiyi = 0.
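
This dual problem is a quadratic program and can be handed to a general-purpose constrained optimizer. The sketch below uses SciPy's SLSQP on a tiny assumed data set; production implementations rely on specialized solvers such as SMO (as in libsvm):

import numpy as np
from scipy.optimize import minimize

# Tiny linearly separable toy set (illustrative only).
X = np.array([[1.0, 1.0], [2.0, 1.0], [4.0, 4.0], [5.0, 4.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

Q = (y[:, None] * y[None, :]) * (X @ X.T)       # Q_ij = y_i y_j (x_i . x_j)

def neg_dual(alpha):
    # Negative of the dual objective: maximize sum(alpha) - 1/2 alpha^T Q alpha.
    return 0.5 * alpha @ Q @ alpha - alpha.sum()

cons = {"type": "eq", "fun": lambda a: a @ y}   # sum_i alpha_i y_i = 0
bnds = [(0, None)] * len(y)                     # alpha_i >= 0 (hard margin)

res = minimize(neg_dual, np.zeros(len(y)), method="SLSQP",
               bounds=bnds, constraints=[cons])
alpha = res.x

w = (alpha * y) @ X                             # w = sum_i alpha_i y_i x_i
sv = int(np.argmax(alpha))                      # index of the largest multiplier
b = y[sv] - w @ X[sv]                           # from y_sv (w . x_sv + b) = 1

print("alpha =", np.round(alpha, 4))
print("w =", np.round(w, 4), " b =", round(float(b), 4))
print("margin 2/||w|| =", round(2 / np.linalg.norm(w), 4))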


Non-Linearly Separable Data
In many practical scenarios, data cannot be perfectly divided by a simple linear
boundary. A classic illustration of this challenge is the XOR problem, where data
points from different classes are so intricately interwoven that no single straight line
or hyperplane can effectively separate them.

To address non-linearly separable data, Support Vector Machines (SVMs) employ two
main approaches. The first is the Soft-Margin SVM, which introduces slack variables.
These slack variables allow for some misclassifications by accepting data points that
fall within the margin or even on the wrong side of the hyperplane. This approach
carefully balances the need to maximize the margin while also accommodating
errors in classification.

The second approach utilizes Kernel Methods. Instead of trying to find a linear
boundary in the original input space, kernel functions transform the data into a
higher-dimensional space where linear separation becomes feasible. This
transformation is performed implicitly through the computation of dot products in
the new space, making it unnecessary to explicitly compute the high-dimensional
mapping. As a result, the kernel trick enables SVMs to manage complex, non-linear
relationships effectively.

Together, these techniques allow SVMs to adapt to a wide range of real-world
problems where simple linear boundaries are insufficient.
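
Both remedies are exposed by scikit-learn's SVC: the C parameter sets the softness of the margin, and the kernel argument applies the kernel trick. A sketch on the XOR pattern mentioned above (all parameter values are illustrative):

import numpy as np
from sklearn.svm import SVC

# The classic XOR pattern: no single line separates the two classes.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# kernel="rbf" maps the data implicitly to a higher-dimensional space;
# smaller C tolerates more margin violations, larger C penalizes them.
clf = SVC(kernel="rbf", gamma=2.0, C=10.0)
clf.fit(X, y)

print(clf.predict(X))   # [0 1 1 0]: the non-linear pattern is separated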
