Support Vector Machine
• In logistic regression, we take the output of the linear function and squash the value within the range [0, 1] using the sigmoid function.
• If the squashed value is greater than a threshold value (0.5), we assign it the label 1; otherwise we assign it the label 0.
• In SVM, we take the output of the linear function and if that output is greater than 1, we identify it with one class, and if the output is less than −1, we identify it with the other class.
• Since the threshold values are changed to 1 and −1 in SVM, we obtain this reinforcement range of values ([−1, 1]), which acts as a margin (see the sketch below).
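A minimal sketch of this decision rule in Python (the weight vector w, bias b, and test point below are made-up values for illustration, not from the slides):

import numpy as np

w = np.array([2.0, -1.0])   # hypothetical hyperplane normal
b = -0.5                    # hypothetical bias

def svm_predict(x):
    # Classify by the sign of the linear score wTx + b.
    return 1 if np.dot(w, x) + b >= 0 else -1

def in_margin(x):
    # True if the point falls inside the reinforcement band (-1, 1).
    return -1 < np.dot(w, x) + b < 1

x = np.array([1.0, 0.25])
print(svm_predict(x), in_margin(x))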
Maximum Margin: Formalization
w: decision hyperplane normal vector
Margin
• Distance from an example to the separator is r = y(wTx + b)/|w|
• Examples closest to the hyperplane are support vectors.
• Margin ρ of the separator is the width of separation between support vectors of classes.
Derivation of finding r:
• The dotted line x′ – x is perpendicular to the decision boundary, so it is parallel to w. The unit vector is w/|w|, so the line is rw/|w|.
• x′ = x – yrw/|w|
• x′ satisfies wTx′ + b = 0, so
wT(x – yrw/|w|) + b = 0
• Recall that |w| = sqrt(wTw), so
wTx – yr|w| + b = 0
• Solving for r gives:
r = y(wTx + b)/|w|
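As a quick numeric check of this formula (toy values assumed for illustration), the signed distance r = y(wTx + b)/|w| can be computed directly:

import numpy as np

w = np.array([3.0, 4.0])   # toy normal vector, |w| = 5
b = -5.0
x = np.array([3.0, 4.0])   # a point with label y = +1
y = 1

r = y * (np.dot(w, x) + b) / np.linalg.norm(w)
print(r)   # (25 - 5) / 5 = 4.0, the distance from x to the hyperplane wTx + b = 0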
Linear SVM Mathematically: the linearly separable case
• Assume that all data is at least distance 1 from the hyperplane; then the following two constraints follow for a training set {(xi, yi)}:
wTxi + b ≥ 1 if yi = 1
wTxi + b ≤ −1 if yi = −1
• Since each example's distance from the hyperplane is r = y(wTxi + b)/|w|, the margin is ρ = 2/|w|.
• Hyperplane: wTx + b = 0
• The closest points on either side satisfy wTxa + b = 1 and wTxb + b = −1
• Extra scale constraint: mini=1,…,n |wTxi + b| = 1
• This implies:
wT(xa – xb) = 2
ρ = ||xa – xb||2 = 2/||w||2
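A small sketch (toy 2-D points chosen so the closest examples lie exactly on wTx + b = ±1) that checks the constraints yi(wTxi + b) ≥ 1 and computes the resulting margin ρ = 2/||w||:

import numpy as np

w = np.array([1.0, 1.0])
b = -3.0
X = np.array([[1.0, 1.0], [1.0, 0.0],    # negative class
              [2.0, 2.0], [3.0, 2.0]])   # positive class
y = np.array([-1, -1, 1, 1])

margins = y * (X @ w + b)
print(margins)                  # all >= 1; support vectors give exactly 1
print(2 / np.linalg.norm(w))    # margin width rho = 2/||w|| = sqrt(2)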
Solving the Optimization Problem
Find w and b such that
Φ(w) = ½ wTw is minimized;
and for all {(xi, yi)}: yi(wTxi + b) ≥ 1
• The solution classifier has the form: f(x) = Σαi yi xiTx + b
• Notice that it relies on an inner product between the test point x and the support vectors xi
• We will return to this later.
• Also keep in mind that solving the optimization problem involved computing the inner products xiTxj between all pairs of training points.
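As an illustrative sketch (using scikit-learn, which the slides do not mention, so treat the library calls as an assumption), one can fit a linear SVM and confirm that f(x) can be evaluated purely from inner products with the support vectors:

import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 2.0]])   # toy data
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=10.0).fit(X, y)

x_test = np.array([2.5, 1.5])
# dual_coef_ stores alpha_i * y_i for the support vectors.
f = clf.dual_coef_ @ (clf.support_vectors_ @ x_test) + clf.intercept_
print(f, clf.decision_function([x_test]))   # the two values agree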
Classification with SVMs
• The most “important” training points are the support vectors; they define the hyperplane.
• Quadratic optimization algorithms can identify which training points xi are support vectors with
non-zero Lagrangian multipliers αi.
• Both in the dual formulation of the problem and in the solution, training points appear only inside inner products:
Find α1…αN such that
Q(α) = Σαi – ½ ΣΣ αiαj yiyj xiTxj is maximized and
(1) Σαiyi = 0
(2) 0 ≤ αi ≤ C for all αi
The solution is again of the form f(x) = Σαi yi xiTx + b
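A minimal sketch (made-up data and candidate multipliers, for illustration only) of evaluating the dual objective Q(α) and its constraints with numpy:

import numpy as np

X = np.array([[1.0, 1.0], [1.0, 0.0], [2.0, 2.0], [3.0, 2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha = np.array([1.0, 0.0, 1.0, 0.0])   # candidate multipliers with sum(alpha*y) = 0
C = 10.0

K = X @ X.T                                        # Gram matrix of inner products xiTxj
Q = alpha.sum() - 0.5 * alpha @ (np.outer(y, y) * K) @ alpha
feasible = np.isclose(alpha @ y, 0) and bool(np.all((alpha >= 0) & (alpha <= C)))
print(Q, feasible)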
Non-linear SVMs
• Datasets that are linearly separable (with some noise) work out great:
[Figure: data points along the x-axis]
Non-linear SVMs: Feature spaces
• General idea: the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable:
Φ: x → φ(x)
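A tiny illustration (made-up 1-D data, and the feature map φ(x) = (x, x²) is a hypothetical choice) of how mapping to a higher-dimensional space can make a non-separable dataset linearly separable:

import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([1, -1, -1, -1, 1])        # not separable by any threshold on x alone

phi = np.column_stack([x, x ** 2])      # phi(x) = (x, x^2)
w, b = np.array([0.0, 1.0]), -2.0       # in the mapped space, x^2 = 2 separates the classes
print(np.sign(phi @ w + b) == y)        # [ True  True  True  True  True ]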
The “Kernel Trick”
• The linear classifier relies on an inner product between vectors K(xi, xj) = xiTxj
• If every datapoint is mapped into high-dimensional space via some transformation Φ: x → φ(x), the
inner product becomes:
K(xi, xj) = φ(xi)Tφ(xj)
• A kernel function is some function that corresponds to an inner product in some expanded feature
space.
• Example:
2-dimensional vectors x = [x1 x2]; let K(xi, xj) = (1 + xiTxj)²
Need to show that K(xi, xj) = φ(xi)Tφ(xj):
K(xi, xj) = (1 + xiTxj)² = 1 + xi1²xj1² + 2xi1xj1xi2xj2 + xi2²xj2² + 2xi1xj1 + 2xi2xj2
= [1  xi1²  √2 xi1xi2  xi2²  √2 xi1  √2 xi2]T [1  xj1²  √2 xj1xj2  xj2²  √2 xj1  √2 xj2]
= φ(xi)Tφ(xj), where φ(x) = [1  x1²  √2 x1x2  x2²  √2 x1  √2 x2]
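The identity above can be checked numerically; a short sketch with arbitrary example vectors:

import numpy as np

def phi(x):
    # Explicit feature map for the quadratic kernel K(x, z) = (1 + xTz)^2 in 2-D.
    x1, x2 = x
    return np.array([1, x1**2, np.sqrt(2)*x1*x2, x2**2, np.sqrt(2)*x1, np.sqrt(2)*x2])

xi = np.array([1.0, 2.0])
xj = np.array([3.0, -1.0])

print((1 + xi @ xj) ** 2)     # kernel computed directly: 4.0
print(phi(xi) @ phi(xj))      # inner product in the expanded space: 4.0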
Kernels
Common kernels
• Linear: K(x, z) = xTz
• Polynomial: K(x, z) = (1 + xTz)^d
• Gives feature conjunctions
• Radial basis function (infinite dimensional space)
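A brief sketch (numpy, with assumed parameter values d = 3 and σ = 1) of evaluating these common kernels on two vectors:

import numpy as np

x = np.array([1.0, 2.0])
z = np.array([0.5, -1.0])

k_linear = x @ z                                             # linear kernel xTz
k_poly   = (1 + x @ z) ** 3                                  # polynomial, degree d = 3
k_rbf    = np.exp(-np.sum((x - z) ** 2) / (2 * 1.0 ** 2))    # RBF with sigma = 1
print(k_linear, k_poly, k_rbf)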