13.1 Support Vector Machine

This document discusses support vector machines (SVMs) for classification. It explains that SVMs find the optimal separating hyperplane that maximizes the margin between the two classes of data. This hyperplane is known as the maximum marginal hyperplane (MMH). The data points that lie closest to the MMH are called the support vectors, and they are the most informative for determining the classification. SVMs can handle both linearly separable and non-linearly separable data using techniques like soft margins and kernels to project the data into higher dimensions where linear separation is possible.


SUPPORT VECTOR MACHINE

Pranya PO, MSc


Classification of customers

[Figure, repeated over several slides: a scatter plot of customers by Income (vertical axis) and Age (horizontal axis). One symbol marks customers who buy computers (YES) and the other marks customers who do not (NO); the two groups occupy different regions of the plot. A new, unlabeled customer, shown as "?", must be assigned to one of the two classes, which motivates drawing a decision boundary between the groups.]
Support vector machine
• Support vector machine (SVM) is a method for the classification of both linear and
nonlinear data.
• It uses a nonlinear mapping to transform the original training data into a higher
dimension. Within this new dimension, it searches for the linear optimal separating
hyperplane: a “decision boundary” separating the tuples of one class from another.

• The SVM finds this hyperplane using support vectors ("essential" training tuples) and margins (defined by the support vectors).
  [Figure: the two customer classes separated by a hyperplane, with Y and N marking the two sides of the boundary.]
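To make the idea concrete, here is a minimal sketch (not part of the original slides) of fitting a linear SVM to a toy customer data set with scikit-learn; the numbers, the [age, income] feature order, and the +1/-1 class encoding are all invented for illustration.

```python
# Minimal sketch (assumed): a linear SVM on a toy age/income data set.
# The data values below are invented purely for illustration.
import numpy as np
from sklearn.svm import SVC

# Each row is [age, income]; label +1 = buys a computer, -1 = does not.
X = np.array([[25, 40], [30, 55], [35, 60],
              [45, 30], [50, 25], [55, 35]], dtype=float)
y = np.array([+1, +1, +1, -1, -1, -1])

clf = SVC(kernel="linear", C=1.0)   # linear decision boundary
clf.fit(X, y)

print(clf.support_vectors_)         # the "essential" training tuples
print(clf.predict([[40, 45]]))      # classify a new customer
```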
Main ideas of SVM
• Maximum marginal hyperplane
  • Formalizes the notion of the best linear separator
• Lagrangian multipliers
  • A way to convert a constrained optimization problem into one that is easier to solve
• Slack variables
  • Allow misclassification of difficult or noisy examples
• Kernel tricks
  • Projecting data into a higher-dimensional space can make it linearly separable
Linear SVM
Linear SVM
• Linear SVM: the case when the data are linearly separable.
• The simplest case: a two-class problem where the classes are linearly separable.
• Let the data set $D$ be given as $(X_1, y_1), (X_2, y_2), \ldots, (X_k, y_k)$, where each $X_i$ is a training tuple with an associated class label $y_i$.
• Each $y_i$ takes one of two values, either $+1$ or $-1$ ($y_i \in \{-1, +1\}$), corresponding to the classes buys_computer = yes and buys_computer = no, respectively.
Linear SVM
• The 2-D training data are linearly separable. There are an infinite number of possible separating hyperplanes or "decision boundaries," some of which are shown here as dashed lines. Which one is best?
  [Figure: linearly separable 2-D data with several candidate separating lines drawn as dashed lines.]
• An SVM approaches this problem by searching for the maximum marginal hyperplane (MMH).
Maximum marginal hyperplane
We expect the hyperplane with the larger margin to be more accurate at classifying future data tuples than the hyperplane with the smaller margin. This is why the SVM searches for the hyperplane with the largest margin.
Maximum marginal hyperplane
• The shortest distance from a hyperplane to one side of its margin is equal to the shortest distance from the hyperplane to the other side of its margin, where the "sides" of the margin are parallel to the hyperplane.
• This distance is the shortest distance from the MMH to the closest training tuple of either class.
Maximum marginal hyperplane
• A separating hyperplane can be written as
  $$W \cdot X + b = 0,$$
  where $W$ is a weight vector, namely $W = (w_1, w_2, \ldots, w_n)$; $n$ is the number of attributes; and $b$ is a scalar, often referred to as a bias.
• Consider two input attributes, $A_1$ and $A_2$, where $x_1$ and $x_2$ are the values of the attributes for $X$. The equation above can be rewritten as
  $$w_0 + w_1 x_1 + w_2 x_2 = 0,$$
  where $w_0$ is the scalar $b$.
Maximum marginal hyperplane
• Thus, any point that lies above the separating hyperplane satisfies
  $$w_0 + w_1 x_1 + w_2 x_2 > 0.$$
  Similarly, any point that lies below the separating hyperplane satisfies
  $$w_0 + w_1 x_1 + w_2 x_2 < 0.$$
• The weights can be adjusted so that the hyperplanes defining the "sides" of the margin can be written as
  $$H_1: \; w_0 + w_1 x_1 + w_2 x_2 \ge 1 \quad \text{for } y_i = +1,$$
  $$H_2: \; w_0 + w_1 x_1 + w_2 x_2 \le -1 \quad \text{for } y_i = -1.$$
• Any tuple that falls on or above $H_1$ belongs to class $+1$, and any tuple that falls on or below $H_2$ belongs to class $-1$. Combining the two inequalities, we get
  $$y_i (w_0 + w_1 x_1 + w_2 x_2) \ge 1, \quad \forall i.$$
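As a quick sanity check (assumed, not from the slides), the combined constraint can be evaluated numerically for a hand-picked hyperplane and a few toy tuples; all numbers below are invented for illustration.

```python
# Sketch (assumed): numerically checking y_i * (w0 + w1*x1 + w2*x2) >= 1
# for a hypothetical hyperplane and four toy training tuples.
import numpy as np

w0, w = -1.0, np.array([2.0, -1.0])           # hypothetical bias and weights
X = np.array([[2.0, 1.0], [3.0, 2.0],         # tuples of class +1
              [0.0, 1.0], [0.5, 2.0]])         # tuples of class -1
y = np.array([+1, +1, -1, -1])

margins = y * (w0 + X @ w)                     # y_i * (w0 + w1*x1 + w2*x2)
print(margins)                                 # every value is >= 1
print(np.all(margins >= 1))                    # True: all tuples respect H1/H2
```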
Maximum marginal hyperplane
• Any training tuples that fall on hyperplanes $H_1$ or $H_2$ are called support vectors.
• Essentially, the support vectors are the most difficult tuples to classify and give the most information regarding classification.
Maximum marginal hyperplane
• The distance from the separating hyperplane to any point on $H_1$ is $\frac{1}{\lVert W \rVert}$, where $\lVert W \rVert = \sqrt{W \cdot W} = \sqrt{w_1^2 + w_2^2 + \cdots + w_n^2}$ is the Euclidean norm of $W$.
• By definition, this is equal to the distance from any point on $H_2$ to the separating hyperplane.
• Therefore, the maximal margin is $\frac{2}{\lVert W \rVert}$.
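The margin width can be read off a trained linear SVM. The sketch below (assumed; it uses scikit-learn rather than anything from the slides) fits a near hard-margin SVM to four toy points whose true margin is 2 and recovers $2 / \lVert W \rVert$ from the learned weights.

```python
# Sketch (assumed): recovering the maximal margin 2/||W|| from a fitted
# linear SVM; scikit-learn exposes the learned weight vector as clf.coef_.
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 0.0], [2.0, 1.0]])
y = np.array([-1, -1, +1, +1])                 # separable, true margin = 2

clf = SVC(kernel="linear", C=1e6).fit(X, y)    # very large C ~ hard margin
w = clf.coef_[0]                               # W = (w1, ..., wn)
margin = 2.0 / np.linalg.norm(w)               # maximal margin = 2 / ||W||
print(w, clf.intercept_, margin)               # margin is close to 2
```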
Maximum marginal hyperplane
• So, how does an SVM find the MMH and the support vectors?
• We can rewrite $y_i (w_0 + w_1 x_1 + w_2 x_2) \ge 1, \ \forall i$ so that it becomes what is known as a constrained quadratic optimization problem, using a Lagrangian formulation.
• The solution is then obtained using the Karush-Kuhn-Tucker (KKT) conditions.
Maximum marginal hyperplane
• Based on the Lagrangian formulation mentioned before, the MMH can be rewritten as the decision boundary
  $$d(X^T) = \sum_{i=1}^{l} y_i \,\alpha_i \, X_i \cdot X^T + b_0,$$
  where $y_i$ is the class label of support vector $X_i$; $X^T$ is a test tuple; $\alpha_i$ (the Lagrange multipliers) and $b_0$ are numeric parameters that were determined automatically by the optimization or SVM algorithm noted before; and $l$ is the number of support vectors.
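This decision boundary can be reproduced from a fitted model. A minimal sketch (assumed, based on scikit-learn's documented attributes rather than on the slides): SVC stores the products $y_i \alpha_i$ in dual_coef_ and the support vectors in support_vectors_, so the sum above can be computed by hand and compared with decision_function.

```python
# Sketch (assumed): reproducing d(X^T) = sum_i y_i*alpha_i*(X_i . X^T) + b0
# from a fitted scikit-learn SVC with a linear kernel.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)     # a linearly separable toy labeling

clf = SVC(kernel="linear").fit(X, y)

X_test = rng.normal(size=(3, 2))
# dual_coef_ holds y_i * alpha_i for the support vectors only.
manual = clf.dual_coef_ @ (clf.support_vectors_ @ X_test.T) + clf.intercept_
print(manual.ravel())
print(clf.decision_function(X_test))           # matches the manual sum
```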

Soft margin classification


• What if the training set is not linearly separable?
• Slack variables $\xi_i$ can be added to allow misclassification of difficult or noisy examples; the resulting margin is called a soft margin.
• We then pay a cost for each misclassified example, which depends on how far it is from meeting the margin requirement.
• Adding the slack variables $\xi_i$ to $y_i (w_0 + w_1 x_1 + w_2 x_2) \ge 1$, the constraint is rewritten as
  $$y_i (w_0 + w_1 x_1 + w_2 x_2) \ge 1 - \xi_i.$$
  [Figure: two points that violate the margin, with their slack distances labeled $\xi_i$ and $\xi_j$.]
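In practice the cost paid per unit of slack is a tunable hyperparameter; in scikit-learn it is the C argument of SVC. The sketch below (assumed, toy data with a few deliberately flipped labels) shows how varying C changes the number of support vectors and the resulting margin width.

```python
# Sketch (assumed): the soft margin in scikit-learn is controlled by C,
# the penalty per unit of slack. Typically, a smaller C tolerates more
# margin violations (wider margin, more support vectors), while a larger
# C penalizes violations heavily (narrower margin).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = np.where(X[:, 0] > 0, 1, -1)
y[:5] = -y[:5]                                  # flip a few labels to add noise

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(C, len(clf.support_), 2.0 / np.linalg.norm(clf.coef_[0]))
```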
Non-linear SVM

Non-linear SVM
• Datasets that are linearly separable work out great:
  [Figure: points on a one-dimensional x-axis that a single threshold separates.]
• But what are we going to do if the dataset is just too hard?
  [Figure: points on the x-axis with the two classes interleaved, so no single threshold separates them.]
• How about mapping the data to a higher-dimensional space:
  [Figure: the same points lifted to two dimensions, e.g. $x \mapsto (x, x^2)$, where a straight line separates the classes.]
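A minimal sketch of that lifting idea (assumed; the toy points and the specific map $x \mapsto (x, x^2)$ are illustrative, not taken from the slides): the 1-D classes below cannot be split by a single threshold, but after the lift a linear SVM separates them perfectly.

```python
# Sketch (assumed): making 1-D non-separable data separable by mapping
# each point x to (x, x**2) and fitting a linear SVM in the lifted space.
import numpy as np
from sklearn.svm import SVC

x = np.array([-3.0, -2.5, -0.5, 0.0, 0.5, 2.5, 3.0])
y = np.array([+1, +1, -1, -1, -1, +1, +1])      # outer vs. inner points:
                                                # no single threshold on x works

X2 = np.column_stack([x, x ** 2])               # lift to (x, x^2)
clf = SVC(kernel="linear").fit(X2, y)
print(clf.score(X2, y))                         # 1.0: perfectly separated in 2-D
```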

Feature spaces
General idea: the original feature space can always be mapped to
some higher-dimensional feature space where the training set is
separable:

Φ: x → φ(x)

The “kernel trick”


• The linear classifier relies on an inner product between vectors: $K(x_i, x_j) = x_i^T x_j$.
• If every data point is mapped into a high-dimensional space via some transformation $\Phi: x \mapsto \varphi(x)$, the inner product becomes $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$.
• A kernel function is a function that corresponds to an inner product in some expanded feature space.
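A small worked check of this correspondence (assumed, not from the slides): for 2-D inputs, the polynomial kernel $K(x, z) = (x^T z)^2$ equals an ordinary inner product after the explicit map $\varphi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$, so the kernel computes the expanded-space inner product without ever constructing $\varphi$.

```python
# Sketch (assumed): the kernel trick for K(x, z) = (x . z)**2 in 2-D.
import numpy as np

def phi(v):
    # Explicit feature map whose inner product equals (x . z)**2.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

kernel_value = (x @ z) ** 2             # computed in the original 2-D space
explicit_value = phi(x) @ phi(z)        # same value via the explicit mapping
print(kernel_value, explicit_value)     # both equal 1.0
```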

Kernels
• Why use kernels?
  • They can make a non-separable problem separable.
  • They map data into a better representational space.
• Common kernels: linear, polynomial, radial basis function (RBF/Gaussian), and sigmoid.
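A minimal sketch (assumed, random toy data) showing how these kernels are selected in scikit-learn's SVC; the circular class boundary is chosen so the nonlinear kernels have an edge over the linear one.

```python
# Sketch (assumed): trying the common kernel choices exposed by SVC.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0, 1, -1)   # circular boundary

models = {
    "linear": SVC(kernel="linear"),
    "polynomial": SVC(kernel="poly", degree=3),
    "rbf (Gaussian)": SVC(kernel="rbf", gamma="scale"),
    "sigmoid": SVC(kernel="sigmoid"),
}
for name, clf in models.items():
    print(name, clf.fit(X, y).score(X, y))     # training accuracy per kernel
```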
Multiclass classification
One-versus-all/one-versus-rest
• Given m classes, we train m binary classifiers, one for each class.
• Classifier j is trained using data of class j as the positive class, and the
remaining data as the negative class. It learns to return a positive
value for class j and a negative value for the rest.
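A minimal one-versus-rest sketch (assumed, not from the slides) using scikit-learn's OneVsRestClassifier around a linear SVC; the iris data set stands in as a convenient 3-class example.

```python
# Sketch (assumed): one-versus-rest multiclass SVMs. For m classes,
# m binary SVMs are trained, each with one class as positive and the
# remaining data as negative.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)               # m = 3 classes
ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)
print(len(ovr.estimators_))                     # 3 binary classifiers
print(ovr.predict(X[:5]))
```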
All-versus-all/one-versus-one
• Given $m$ classes, we construct $\frac{m(m-1)}{2}$ binary classifiers: the combination nCr, where n = m classes and r = 2.
• A classifier is trained using data of the two classes.
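A matching one-versus-one sketch (assumed): scikit-learn's OneVsOneClassifier builds the $\frac{m(m-1)}{2}$ pairwise SVMs explicitly, so the count can be inspected; plain SVC also uses a one-versus-one scheme internally for multiclass problems.

```python
# Sketch (assumed): one-versus-one multiclass SVMs, one classifier per
# pair of classes, m*(m-1)/2 in total.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsOneClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)               # m = 3 classes
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
print(len(ovo.estimators_))                     # m*(m-1)/2 = 3 pairwise classifiers
print(ovo.predict(X[:5]))
```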
