SVM - Worked Out Example

Dan Ventura

March 12, 2009

Abstract

We try to give a helpful, simple example that demonstrates a linear SVM and then extend the example to a simple nonlinear case to illustrate the use of mapping functions and kernels.

1 Introduction

Many learning models make use of the idea that any learning problem can be made easy with the right set of features. The trick, of course, is discovering that "right set of features", which in general is a very difficult thing to do. SVMs are another attempt at a model that does this. The idea behind SVMs is to make use of a (nonlinear) mapping function Φ that transforms data in input space to data in feature space in such a way as to render a problem linearly separable. The SVM then automatically discovers the optimal separating hyperplane (which, when mapped back into input space via Φ−1, can be a complex decision surface). SVMs are rather interesting in that they enjoy both a sound theoretical basis and state-of-the-art success in real-world applications.

To illustrate the basic ideas, we will begin with a linear SVM (that is, a model that assumes the data is linearly separable). We will then expand the example to the nonlinear case to demonstrate the role of the mapping function Φ, and finally we will explain the idea of a kernel and how it allows SVMs to make use of high-dimensional feature spaces while remaining tractable.

2 Linear Example
Suppose we are given a set of positively labeled data points in ℜ2 together with a set of negatively labeled data points in ℜ2 (see Figure 1).

Figure 1: Sample data points in ℜ2. Blue diamonds are positive examples and red squares are negative examples.
We would like to discover a simple SVM that accurately discriminates the two classes. Since the data is linearly separable, we can use a linear SVM (that is, one whose mapping function Φ() is the identity function). By inspection, it should be obvious that there are three support vectors (see Figure 2): s1 = (1, 0), s2 = (3, 1), and s3 = (3, −1).
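To make the setup concrete, here is a small Python sketch. The full training sets are shown only in Figure 1, so the coordinates below are illustrative assumptions consistent with that figure's description; the three support vectors are the ones just identified.

import numpy as np

# Assumed training data (the exact coordinates appear only in Figure 1;
# these points are illustrative stand-ins consistent with its description).
positive = np.array([[3, 1], [3, -1], [6, 1], [6, -1]], dtype=float)
negative = np.array([[1, 0], [0, 1], [0, -1], [-1, 0]], dtype=float)

# The three support vectors identified above: s1, s2, s3.
support_vectors = np.array([[1, 0], [3, 1], [3, -1]], dtype=float)

# A quick sanity check that the two classes are linearly separable:
# every assumed positive point lies to the right of every assumed negative point.
print(positive[:, 0].min() > negative[:, 0].max())   # True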
In what follows we will use vectors augmented with a 1 as a bias input, and for clarity we will differentiate these with an over-tilde. So, if s1 = (1, 0), then s̃1 = (1, 0, 1). Figure 3 shows the SVM architecture, and our task is to find values for the αi such that

α1Φ(s1) · Φ(s1) + α2Φ(s2) · Φ(s1) + α3Φ(s3) · Φ(s1) = −1
α1Φ(s1) · Φ(s2) + α2Φ(s2) · Φ(s2) + α3Φ(s3) · Φ(s2) = +1
α1Φ(s1) · Φ(s3) + α2Φ(s2) · Φ(s3) + α3Φ(s3) · Φ(s3) = +1

Figure 3: The SVM architecture.

Since for now we have let Φ() = I, this reduces to

α1 s̃1 · s̃1 + α2 s̃2 · s̃1 + α3 s̃3 · s̃1 = −1
α1 s̃1 · s̃2 + α2 s̃2 · s̃2 + α3 s̃3 · s̃2 = +1
α1 s̃1 · s̃3 + α2 s̃2 · s̃3 + α3 s̃3 · s̃3 = +1

Now, computing the dot products results in
2α1 + 4α2 + 4α3 = −1
4α1 + 11α2 + 9α3 = +1
4α1 + 9α2 + 11α3 = +1
A little algebra reveals that the solution to this system of equations is α1 = −3.5, α2 = 0.75, and α3 = 0.75.
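As a quick numerical check (not part of the original derivation), we can solve this 3 × 3 system directly; its matrix entries are simply the pairwise dot products of the augmented support vectors.

import numpy as np

# Augmented support vectors s~1 = (1,0,1), s~2 = (3,1,1), s~3 = (3,-1,1).
S = np.array([[1, 0, 1],
              [3, 1, 1],
              [3, -1, 1]], dtype=float)

K = S @ S.T                       # pairwise dot products: [[2,4,4],[4,11,9],[4,9,11]]
y = np.array([-1.0, +1.0, +1.0])  # desired outputs for the three support vectors

alpha = np.linalg.solve(K, y)
print(alpha)                      # [-3.5, 0.75, 0.75]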
Now, we can look at how these α values relate to the discriminating hyperplane; or, in other words, now that we have the αi, how do we find the hyperplane that discriminates the positive from the negative examples? It turns out that

w̃ = Σi αi s̃i
   = −3.5 (1, 0, 1) + 0.75 (3, 1, 1) + 0.75 (3, −1, 1)
   = (1, 0, −2)

Finally, remembering that our vectors are augmented with a bias, we can equate the last entry in w̃ with the hyperplane offset b and write the separating hyperplane equation y = wx + b with w = (1, 0) and b = −2. Plotting this line gives the expected decision surface (see Figure 4).

Figure 4: The discriminating hyperplane corresponding to the values α1 = −3.5, α2 = 0.75 and α3 = 0.75.
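Continuing the same sketch, the weight vector and the resulting decision function take only a few lines; the two test points at the end are arbitrary illustrations, not taken from the text.

import numpy as np

S = np.array([[1, 0, 1],       # augmented support vectors
              [3, 1, 1],
              [3, -1, 1]], dtype=float)
alpha = np.array([-3.5, 0.75, 0.75])

w_tilde = alpha @ S            # -> [ 1.  0. -2.]
w, b = w_tilde[:2], w_tilde[2]

def f(x):
    # Linear SVM decision function: the sign of w . x + b.
    return int(np.sign(w @ np.asarray(x, dtype=float) + b))

print(w, b)                    # [1. 0.] -2.0  (the hyperplane x1 = 2)
print(f((4, 1)), f((0, 0)))    # arbitrary test points: 1, -1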
3 Nonlinear Example

Now suppose instead that we are given positively labeled and negatively labeled data points in ℜ2 that are not linearly separable (see Figure 5).

Figure 5: Nonlinearly separable sample data points in ℜ2. Blue diamonds are positive examples and red squares are negative examples.

Our goal, again, is to discover a separating hyperplane that accurately discriminates the two classes. Of course, it is obvious that no such hyperplane exists in the input space (that is, in the space in which the original input data live). Therefore, we must use a nonlinear SVM (that is, one whose mapping function Φ is a nonlinear mapping from input space into some feature space). Define

Φ1(x1, x2) =  …   if …
              …   otherwise                                                    (1)

Referring back to Figure 3, we can see how Φ transforms our data before the dot products are performed. Therefore, we can rewrite the data in feature space, for both the positive and the negative examples (see Figure 6). Now we can once again easily identify the support vectors (see Figure 7).

Figure 7: The two support vectors (in feature space) are marked as yellow circles.

We again use vectors augmented with a 1 as a bias input and will differentiate them as before. Now, given the [augmented] support vectors, we must again find values for the αi. This time our constraints are
3α1 + 5α2 = −1
5α1 + 9α2 = +1

A little algebra reveals that the solution to this system of equations is α1 = −7 and α2 = 4. Just as before,

w̃ = Σi αi s̃i

and, remembering that our vectors are augmented with a bias, the last entry of w̃ gives the hyperplane offset b while the remaining entries give the weight vector w of the discriminating hyperplane in feature space. To classify a new point x, we compute

f(x) = σ(w · Φ(x) + b)                                                    (2)

where σ(z) returns the sign of z. For example, if we wanted to classify the point x = (4, 5) using the mapping function of Eq. 1, we would map it into feature space with Φ1, evaluate Eq. 2, and thus we would classify x = (4, 5) as negative.
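The following sketch checks these numbers. The exact form of Φ1 and the coordinates of the two support vectors appear only in Eq. 1 and the figures, so both are written here as assumptions, chosen to reproduce the dot products 3, 5, and 9 used in the constraints above and the negative classification of x = (4, 5).

import numpy as np

def phi1(x):
    # Assumed form of the mapping in Eq. 1 (the exact expression is only
    # given in the original equation/figures, so treat this as a placeholder).
    x1, x2 = x
    if np.hypot(x1, x2) > 2:
        return np.array([4 - x2 + abs(x1 - x2), 4 - x1 + abs(x1 - x2)])
    return np.array([x1, x2], dtype=float)

# Assumed support vectors in feature space (chosen to reproduce the Gram
# entries 3, 5, 9 used in the constraints above).
S = np.array([[1, 1, 1],      # negative support vector, augmented
              [2, 2, 1]],     # positive support vector, augmented
             dtype=float)

K = S @ S.T                   # [[3, 5], [5, 9]]
alpha = np.linalg.solve(K, np.array([-1.0, +1.0]))
print(alpha)                  # [-7.  4.]

w_tilde = alpha @ S           # [ 1.  1. -3.]
w, b = w_tilde[:2], w_tilde[2]

x = np.array([4.0, 5.0])
print(np.sign(w @ phi1(x) + b))   # -1.0  ->  classify (4, 5) as negative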
Looking again at the input space, we might be tempted to think this is not a reasonable classification; however, it is what our model says, and our model is consistent with all the training data. As always, there are no guarantees on generalization accuracy, and if we are not happy about our generalization, the likely culprit is our choice of Φ. Indeed, if we map our discriminating hyperplane (which lives in feature space) back into input space, we can see the effective decision surface of our model (see Figure 9).

Figure 9: The decision surface in input space corresponding to Φ1. Note the singularity.

Of course, we may or may not be able to improve generalization accuracy by choosing a different Φ; however, there is another reason to revisit our choice of mapping function. Consider, instead of Φ1, an alternative mapping

Φ2(x1, x2) = ( … , … , … )                                                    (3)

which transforms our data from 2-dimensional input space to 3-dimensional feature space. Using this alternative mapping, the positive examples and the negative examples are mapped to points that differ in their third feature. With a little thought, we realize that in this case, all 8 of the examples will be support vectors, with one common α value for the positive support vectors and another for the negative ones. Note that a consequence of this mapping is that we do not need to use augmented vectors (though it wouldn't hurt to do so) because the hyperplane in feature space goes through the origin, y = wx + b, with b = 0. Therefore, the discriminating feature is x3, and Eq. 2 reduces to f(x) = σ(x3). Figure 10 shows the decision surface induced in the input space for this new mapping function.
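Since Eq. 3 itself is not reproduced above, the sketch below uses an assumed 2-D to 3-D mapping whose third coordinate is positive for one class and negative for the other; it is meant only to illustrate how, after such a mapping, the sign of the third feature alone classifies the data.

import numpy as np

def phi2(x):
    # Assumed 2-D -> 3-D mapping standing in for Eq. 3: the first two features
    # are kept, and the third is positive for points far from the origin and
    # negative for points near it.
    x1, x2 = x
    return np.array([x1, x2, (x1**2 + x2**2 - 5.0) / 3.0])

# Assumed training points (consistent with Figure 5's description of an inner
# negative cluster surrounded by positive points).
positive = [(2, 2), (2, -2), (-2, -2), (-2, 2)]
negative = [(1, 1), (1, -1), (-1, -1), (-1, 1)]

print([phi2(p)[2] for p in positive])   # [1.0, 1.0, 1.0, 1.0]
print([phi2(n)[2] for n in negative])   # [-1.0, -1.0, -1.0, -1.0]

def f(x):
    # With this mapping, the sign of the third feature alone classifies a point.
    return np.sign(phi2(x)[2])

print(f((4, 5)), f((0.5, 0.5)))         # 1.0 -1.0

Note that, under this assumed mapping, the point (4, 5) comes out positive, whereas the Φ1-based model above classified it as negative; this echoes the point that the choice of Φ is what drives generalization behavior.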
4 The Kernel Trick

Notice that in everything we have done so far, the data points themselves appear only inside dot products, both in the constraints used to find the αi and in the classification function of Eq. 2. This is what makes the kernel trick possible: if we can find a kernel function k(x, y) = Φ(x) · Φ(y) that computes the feature-space dot product directly from the input-space vectors, we never have to represent Φ(x) explicitly, and we can work with very high-dimensional feature spaces while remaining tractable.
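As a standalone illustration (not taken from this text), the classic example is the homogeneous polynomial kernel: for the feature map Φ(x1, x2) = (x1^2, √2 x1 x2, x2^2), the kernel k(x, y) = (x · y)^2 computes Φ(x) · Φ(y) without ever forming the 3-dimensional vectors.

import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D inputs.
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def k(x, y):
    # Polynomial kernel that computes phi(x) . phi(y) directly in input space.
    return float(np.dot(x, y)) ** 2

x, y = np.array([3.0, 1.0]), np.array([1.0, 2.0])
print(phi(x) @ phi(y), k(x, y))   # both 25.0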
5 Conclusion

Many important issues have been glossed over here, including how to choose a good kernel, slack variables for data that are not separable, the theory behind SVMs and their generalization behavior, the dual form of the optimization problem, and the quadratic programming (QP) methods used to solve it.