Lecture 17 - Hyperplane Classifiers - SVM - Plain

This document discusses support vector machines (SVMs) for classification. It explains that SVMs find the optimal hyperplane that separates classes with the maximum margin. The hyperplane is defined by a weight vector w and bias b. Hard-margin SVMs require all training examples to satisfy margin constraints, while soft-margin SVMs allow some violations via slack variables. The goal is to maximize the margin while minimizing slack variables. SVMs are solved using Lagrangian duality, resulting in a quadratic programming problem that is optimized to find the separating hyperplane.

Hyperplane based Classifiers (2):

Large-Margin Classification - SVM

CS771: Introduction to Machine Learning


Piyush Rai
2
Support Vector Machine (SVM)
(SVM was originally proposed by Vapnik and colleagues in the early 90s)

 Hyperplane based classifier. Ensures a large margin around the hyperplane

 Will assume a linear hyperplane of the form 𝒘⊤𝒙 + b = 0 (nonlinear ext. later)

 Each class must lie beyond its supporting hyperplane:
  𝒘⊤𝒙_n + b ≥ +1 if y_n = +1
  𝒘⊤𝒙_n + b ≤ −1 if y_n = −1
  or compactly, y_n(𝒘⊤𝒙_n + b) ≥ 1 ∀n

 "Margin" of the hyperplane = distance of the closest point (on either side) from the hyperplane
  (distance of an input 𝒙_n from the h.p. is |𝒘⊤𝒙_n + b| / ‖𝒘‖)
  γ = min_{1≤n≤N} |𝒘⊤𝒙_n + b| / ‖𝒘‖ = 1/‖𝒘‖, so the total margin is 2/‖𝒘‖ (see the sketch below)

 Want the hyperplane such that this margin is maximized (max-margin hyperplane): a constrained optimization problem

 Two other "supporting" hyperplanes, 𝒘⊤𝒙 + b = +1 and 𝒘⊤𝒙 + b = −1, define a "no man's land"
  (The +1/−1 in the supporting hyperplane equations is arbitrary; can replace by any scalar m/−m and the solution won't change, except a simple scaling of 𝒘 and b)

 Ensure that zero training examples fall in this region (will relax later)

 The SVM idea: Position the hyperplane s.t. this region is as "wide" as possible

(Figure, not reproduced: the separating hyperplane 𝒘⊤𝒙 + b = 0, the Class +1 region where 𝒘⊤𝒙 + b ≥ 1, the Class −1 region where 𝒘⊤𝒙 + b ≤ −1, and the two supporting hyperplanes bounding the margin)
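The margin computation on this slide can be illustrated in a few lines of NumPy. This is a minimal sketch, not part of the original slides; the toy data and the particular (𝒘, b) are made up for illustration.

```python
import numpy as np

# Toy 2D data: two well-separated classes, labels in {+1, -1}
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])

# An assumed separating hyperplane w^T x + b = 0 (placeholder values)
w = np.array([1.0, 1.0])
b = 0.0

scores = X @ w + b                            # w^T x_n + b for every example
feasible = np.all(y * scores >= 1)            # hard-margin constraints y_n (w^T x_n + b) >= 1
gamma = np.min(np.abs(scores)) / np.linalg.norm(w)   # margin: distance of the closest point
total_margin = 2.0 / np.linalg.norm(w)        # equals the width of the no-man's land only when
                                              # the closest points lie exactly on w^T x + b = +/-1
print(feasible, gamma, total_margin)
```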
3
Hard-Margin SVM
 Hard-Margin: Every training example must fulfil margin condition
 Meaning: Must not have any example in the no-man’s land
 Also want to maximize the margin 2/‖𝒘‖; this is equivalent to minimizing ‖𝒘‖, or equivalently ½‖𝒘‖²

 The objective func. for hard-margin SVM (a sketch of a direct solver is given below):
  min_{𝒘,b} ½‖𝒘‖²  s.t.  y_n(𝒘⊤𝒙_n + b) ≥ 1  ∀n

 A constrained optimization problem with inequality constraints. Objective and constraints both are convex

(Figure, not reproduced: the same max-margin geometry as on the previous slide, with the hyperplane 𝒘⊤𝒙 + b = 0 and the supporting hyperplanes 𝒘⊤𝒙 + b = ±1)
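Since the objective and constraints are convex, the problem can be handed to a generic convex solver. A minimal sketch using the cvxpy library (cvxpy is not mentioned in the lecture and is used here only for illustration; the toy data are assumed linearly separable):

```python
import numpy as np
import cvxpy as cp

# Linearly separable toy data
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])

w = cp.Variable(2)
b = cp.Variable()

# min (1/2)||w||^2  s.t.  y_n (w^T x_n + b) >= 1 for all n
objective = cp.Minimize(0.5 * cp.sum_squares(w))
constraints = [cp.multiply(y, X @ w + b) >= 1]
cp.Problem(objective, constraints).solve()

print(w.value, b.value)
```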
4
Soft-Margin SVM (More Commonly Used)
 Allow some training examples to fall within the no-man's land (margin region)

 Even okay for some training examples to fall totally on the wrong side of the h.p.

 Extent of "violation" by a training input (𝒙_n, y_n) is known as its slack ξ_n ≥ 0
  ξ_n > 1 means the example is totally on the wrong side (see the sketch below)

 The margin constraints, relaxed by the slacks:
  𝒘⊤𝒙_n + b ≥ +1 − ξ_n if y_n = +1
  𝒘⊤𝒙_n + b ≤ −1 + ξ_n if y_n = −1

 Soft-margin constraint: y_n(𝒘⊤𝒙_n + b) ≥ 1 − ξ_n ∀n
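A minimal NumPy sketch of how the slacks described above can be computed for a given hyperplane; the data and the (𝒘, b) are made-up placeholders.

```python
import numpy as np

X = np.array([[2.0, 2.0], [0.3, 0.2], [-0.5, 0.5], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])
w = np.array([1.0, 1.0])   # an assumed hyperplane
b = 0.0

# Slack of each example: how much it violates y_n (w^T x_n + b) >= 1
xi = np.maximum(0.0, 1.0 - y * (X @ w + b))

inside_margin = (xi > 0) & (xi <= 1)   # in the no-man's land, still on the correct side
wrong_side    = xi > 1                 # totally on the wrong side of the hyperplane
print(xi, inside_margin, wrong_side)
```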
5
Soft-Margin SVM (Contd)
 Goal: Still want to maximize the margin, such that
  Soft-margin constraints are satisfied for all training ex.
  We do not have too many margin violations (the sum of slacks Σ_n ξ_n should be small; the sum of slacks is like the training error)

 The objective func. for soft-margin SVM (a sketch evaluating it is given below):
  min_{𝒘,b,ξ} ½‖𝒘‖² + C Σ_n ξ_n  s.t.  y_n(𝒘⊤𝒙_n + b) ≥ 1 − ξ_n, ξ_n ≥ 0 ∀n
  (½‖𝒘‖² is inversely prop. to the margin; Σ_n ξ_n is the training error; C is a trade-off hyperparam)
  A constrained optimization problem with inequality constraints. Objective and constraints both are convex

 Hyperparameter C controls the trade-off between a large margin and a small training error (needs to be tuned)
  Large C: small training error but also small margin (bad)
  Small C: large margin but large training error (bad)
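A small NumPy sketch evaluating the soft-margin objective above for a candidate (𝒘, b), to make the margin / slack trade-off concrete. C, the data, and (𝒘, b) are illustrative placeholders, not values from the lecture.

```python
import numpy as np

def soft_margin_objective(w, b, X, y, C):
    """(1/2)||w||^2 + C * sum of slacks, with slack xi_n = max(0, 1 - y_n (w^T x_n + b))."""
    xi = np.maximum(0.0, 1.0 - y * (X @ w + b))
    return 0.5 * np.dot(w, w) + C * np.sum(xi)

X = np.array([[2.0, 2.0], [0.3, 0.2], [-0.5, 0.5], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])
w, b = np.array([1.0, 1.0]), 0.0

# Large C penalizes slack heavily (small training error, small margin);
# small C tolerates slack (large margin, possibly larger training error).
for C in [0.1, 1.0, 10.0]:
    print(C, soft_margin_objective(w, b, X, y, C))
```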
6

Solving the SVM Problem

7
Solving Hard-Margin SVM
 The hard-margin SVM optimization problem is
  min_{𝒘,b} ½‖𝒘‖²  s.t.  y_n(𝒘⊤𝒙_n + b) ≥ 1  ∀n

 A constrained optimization problem. One option is to solve it using Lagrange's method

 Introduce Lagrange multipliers α_n ≥ 0, one for each constraint, and solve
  L(𝒘, b, α) = ½‖𝒘‖² + Σ_n α_n {1 − y_n(𝒘⊤𝒙_n + b)}
  α = [α_1, …, α_N] denotes the vector of Lagrange multipliers

 It is easier (and helpful; we will soon see why) to solve the dual: minimize L w.r.t. 𝒘 and b first, and then maximize w.r.t. α (written out below)
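For reference, a reconstruction of the Lagrangian and the primal/dual ordering in standard notation; this is consistent with the slide's description but is not copied from the original slide.

```latex
\[
\mathcal{L}(\boldsymbol{w}, b, \boldsymbol{\alpha})
  = \tfrac{1}{2}\lVert \boldsymbol{w}\rVert^2
  + \sum_{n=1}^{N} \alpha_n \bigl\{ 1 - y_n(\boldsymbol{w}^\top \boldsymbol{x}_n + b) \bigr\},
  \qquad \alpha_n \ge 0 \;\; \forall n
\]
\[
\text{primal: } \min_{\boldsymbol{w}, b}\; \max_{\boldsymbol{\alpha} \ge 0}\; \mathcal{L}(\boldsymbol{w}, b, \boldsymbol{\alpha}),
\qquad
\text{dual: } \max_{\boldsymbol{\alpha} \ge 0}\; \min_{\boldsymbol{w}, b}\; \mathcal{L}(\boldsymbol{w}, b, \boldsymbol{\alpha})
\]
```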
8
Solving Hard-Margin SVM

 The dual problem (min w.r.t. 𝒘, b, then max w.r.t. α) is
  max_{α ≥ 0} min_{𝒘,b} L(𝒘, b, α) = ½‖𝒘‖² + Σ_n α_n {1 − y_n(𝒘⊤𝒙_n + b)}
  (Note: if we ignore the bias term b then we don't need to handle the constraint Σ_n α_n y_n = 0 and the problem becomes a bit easier to solve; otherwise the α_n's are coupled and some opt. techniques such as co-ordinate ascent can't easily be applied)

 Taking (partial) derivatives of L w.r.t. 𝒘 and b and setting them to zero gives (verify)
  ∂L/∂𝒘 = 0 ⟹ 𝒘 = Σ_n α_n y_n 𝒙_n
  ∂L/∂b = 0 ⟹ Σ_n α_n y_n = 0
  (α_n tells us how important training example (𝒙_n, y_n) is)

 The solution 𝒘 is simply a weighted sum of all the training inputs

 Substituting these in the Lagrangian, we get the dual problem as (verify)
  max_{α ≥ 0, Σ_n α_n y_n = 0}  α⊤1 − ½ α⊤Gα
  where G is the N×N p.s.d. matrix with G_{mn} = y_m y_n 𝒙_m⊤𝒙_n (also called the Gram matrix), and 1 is a vector of all 1s

 Note that the inputs appear only as pairwise dot products 𝒙_m⊤𝒙_n. This will be useful later on when we make the SVM nonlinear using kernel methods

 This is also a "quadratic program" (QP) – a quadratic function of the variables α. We are maximizing a concave function (or minimizing a convex function) s.t. α ≥ 0 and Σ_n α_n y_n = 0. Many methods exist to solve it (a small sketch is given below; for various SVM solvers, see "Support Vector Machine Solvers" by Bottou and Lin)
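A minimal NumPy sketch of the objects in the dual above: the Gram matrix G and the dual objective α⊤1 − ½α⊤Gα. The data and α are placeholders; an actual solver would maximize this subject to the stated constraints.

```python
import numpy as np

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
N = len(y)

# Gram matrix: G_{mn} = y_m y_n x_m^T x_n  (inputs appear only via dot products)
G = (y[:, None] * y[None, :]) * (X @ X.T)

def dual_objective(alpha):
    # alpha^T 1 - (1/2) alpha^T G alpha, to be maximized s.t. alpha >= 0, sum_n alpha_n y_n = 0
    return np.sum(alpha) - 0.5 * alpha @ G @ alpha

alpha = np.full(N, 0.1)   # placeholder value (it happens to satisfy sum_n alpha_n y_n = 0), not the optimum
print(G.shape, dual_objective(alpha))
```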
9
Solving Hard-Margin SVM
 Once we have the α_n's by solving the dual, we can get 𝒘 and b as
  𝒘 = Σ_n α_n y_n 𝒙_n
  b = y_n − 𝒘⊤𝒙_n for any n with α_n > 0 (in practice, often averaged over all such n)
  (see the sketch below)

 A nice property: Most α_n's in the solution will be zero (sparse solution)

 Reason: the KKT conditions. For the optimal α_n's, we must have
  α_n {1 − y_n(𝒘⊤𝒙_n + b)} = 0

 Thus α_n is nonzero only if y_n(𝒘⊤𝒙_n + b) = 1, i.e., the training example lies exactly on the boundary (on one of the supporting hyperplanes 𝒘⊤𝒙 + b = +1 or 𝒘⊤𝒙 + b = −1)

 These examples are called support vectors
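A minimal NumPy sketch of the recovery step described above: given dual variables α (assumed to come from some dual solver; the values below are placeholders, not a real solution), form 𝒘 as the weighted sum of inputs and read b off the support vectors.

```python
import numpy as np

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
alpha = np.array([0.25, 0.0, 0.25, 0.0])   # placeholder dual solution (sparse: many zeros)

w = (alpha * y) @ X                         # w = sum_n alpha_n y_n x_n
sv = np.where(alpha > 1e-8)[0]              # support vectors: nonzero alpha_n
# KKT => support vectors satisfy y_n (w^T x_n + b) = 1, so b = y_n - w^T x_n;
# averaging over all support vectors is numerically more stable
b = np.mean(y[sv] - X[sv] @ w)

print(w, b, sv)
```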
10
Solving Soft-Margin SVM
 Recall the soft-margin SVM optimization problem
  min_{𝒘,b,ξ} ½‖𝒘‖² + C Σ_n ξ_n  s.t.  y_n(𝒘⊤𝒙_n + b) ≥ 1 − ξ_n, ξ_n ≥ 0 ∀n
  Here ξ = [ξ_1, …, ξ_N] is the vector of slack variables

 Introduce Lagrange multipliers for each constraint (α_n ≥ 0 for each margin constraint, β_n ≥ 0 for each ξ_n ≥ 0 constraint) and solve the Lagrangian
  L(𝒘, b, ξ, α, β) = ½‖𝒘‖² + C Σ_n ξ_n + Σ_n α_n {1 − ξ_n − y_n(𝒘⊤𝒙_n + b)} − Σ_n β_n ξ_n

 The slack-related terms above (shown in red on the original slide) were not present in the hard-margin SVM

 Two sets of dual variables, α = [α_1, …, α_N] and β = [β_1, …, β_N]

 Will eliminate the primal variables 𝒘, b, ξ to get a dual problem containing only the dual variables
11
Solving Soft-Margin SVM

 The Lagrangian problem to solve:
  max_{α ≥ 0, β ≥ 0} min_{𝒘,b,ξ} L(𝒘, b, ξ, α, β)
  (Note: if we ignore the bias term b then we don't need to handle the constraint Σ_n α_n y_n = 0 and the problem becomes a bit easier to solve; otherwise the α_n's are coupled and some opt. techniques such as co-ordinate ascent can't easily be applied)

 Taking (partial) derivatives of L w.r.t. 𝒘, b, and ξ_n and setting them to zero gives
  ∂L/∂𝒘 = 0 ⟹ 𝒘 = Σ_n α_n y_n 𝒙_n (again a weighted sum of the training inputs)
  ∂L/∂b = 0 ⟹ Σ_n α_n y_n = 0
  ∂L/∂ξ_n = 0 ⟹ C − α_n − β_n = 0

 Using C − α_n − β_n = 0 and β_n ≥ 0, we have α_n ≤ C (for hard-margin, α_n had no upper bound)

 Substituting these in the Lagrangian gives the dual problem
  max_{0 ≤ α ≤ C, Σ_n α_n y_n = 0}  α⊤1 − ½ α⊤Gα
  The dual variables β don't appear in the dual problem!
  Given α, 𝒘 and b can be found just like in the hard-margin SVM case

 We are again maximizing a concave function (or minimizing a convex function), now s.t. 0 ≤ α_n ≤ C and Σ_n α_n y_n = 0. Many methods exist to solve it (a minimal sketch is given below; for various SVM solvers, see "Support Vector Machine Solvers" by Bottou and Lin)

 In the solution, α will still be sparse just like in the hard-margin SVM case. The nonzero α_n correspond to the support vectors
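A minimal sketch of one way to (approximately) solve this dual, taking the slide's shortcut of ignoring the bias term so the equality constraint Σ_n α_n y_n = 0 disappears and only the box constraint 0 ≤ α_n ≤ C remains: plain projected gradient ascent in NumPy. Real solvers (SMO, dual coordinate ascent, QP packages) are far more efficient; the data, C, step size, and iteration count are placeholders.

```python
import numpy as np

def svm_dual_pga(X, y, C=1.0, lr=0.01, iters=500):
    """Projected gradient ascent on alpha^T 1 - 0.5 alpha^T G alpha, with 0 <= alpha <= C.
    Bias b is ignored, so the constraint sum_n alpha_n y_n = 0 is not enforced."""
    G = (y[:, None] * y[None, :]) * (X @ X.T)
    alpha = np.zeros(len(y))
    for _ in range(iters):
        grad = 1.0 - G @ alpha                        # gradient of the dual objective
        alpha = np.clip(alpha + lr * grad, 0.0, C)    # ascent step + projection onto the box
    w = (alpha * y) @ X                               # recover w as the weighted sum of inputs
    return alpha, w

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
alpha, w = svm_dual_pga(X, y, C=10.0)
print(alpha, w)
```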
12
Support Vectors in Soft-Margin SVM
 The hard-margin SVM solution had only one type of support vector
  All lay on the supporting hyperplanes 𝒘⊤𝒙 + b = +1 and 𝒘⊤𝒙 + b = −1

 The soft-margin SVM solution has three types of support vectors (all with nonzero α_n); see the sketch below
  1. Lying on the supporting hyperplanes (ξ_n = 0)
  2. Lying within the margin region but still on the correct side of the hyperplane (0 < ξ_n < 1)
  3. Lying on the wrong side of the hyperplane, i.e., misclassified training examples (ξ_n > 1)
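A small NumPy sketch that, given a learned (𝒘, b) and the dual α (placeholder values below, not an actual solution), sorts the support vectors into the three types listed above using their slacks.

```python
import numpy as np

X = np.array([[0.6, 0.4], [0.5, 0.4], [0.5, 0.5], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
w, b = np.array([1.0, 1.0]), 0.0            # assumed learned hyperplane
alpha = np.array([0.3, 0.7, 1.0, 0.0])      # placeholder dual solution

margins = y * (X @ w + b)
xi = np.maximum(0.0, 1.0 - margins)         # slack of each example
sv = alpha > 1e-8                           # support vectors: nonzero alpha_n

on_margin     = sv & np.isclose(margins, 1.0)   # type 1: on a supporting hyperplane
in_margin     = sv & (xi > 0) & (xi < 1)        # type 2: inside margin, correct side
misclassified = sv & (xi > 1)                   # type 3: on the wrong side
print(on_margin, in_margin, misclassified)
```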
13
SVMs via Dual Formulation: Some Comments
 Recall the final dual objectives for hard-margin and soft-margin SVM
  Hard-margin: max_{α ≥ 0} α⊤1 − ½ α⊤Gα
  Soft-margin: max_{0 ≤ α ≤ C} α⊤1 − ½ α⊤Gα
  (Note: Both of these ignore the bias term b; otherwise we would need the additional constraint Σ_n α_n y_n = 0)

 The dual formulation is nice due to two primary reasons
  Allows conveniently handling the margin-based constraints (via the Lagrangian)
  Allows learning nonlinear separators by replacing the inner products 𝒙_m⊤𝒙_n in G by general kernel-based similarities (more on this when we talk about kernels; a small sketch is given below)

 However, the dual formulation can be expensive if N is large (esp. compared to D)
  Need to solve for N variables
  Need to pre-compute and store the N×N Gram matrix G
  Lot of work on speeding up SVMs in these settings (e.g., can use co-ord. descent for the α's)
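As a preview of the kernel extension mentioned above, a small NumPy sketch showing how the dot-product Gram matrix can be swapped for a kernel-based one. The RBF kernel and its bandwidth are illustrative choices; kernels are covered properly in a later lecture.

```python
import numpy as np

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])

# Linear SVM dual uses plain dot products between inputs
K_linear = X @ X.T

# Nonlinear SVM: replace dot products with kernel similarities, e.g., an RBF kernel
gamma = 0.5
sq_norms = np.sum(X**2, axis=1)
sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * (X @ X.T)
K_rbf = np.exp(-gamma * sq_dists)

# Everything else in the dual stays the same; only the similarity matrix changes
G_linear = (y[:, None] * y[None, :]) * K_linear
G_rbf = (y[:, None] * y[None, :]) * K_rbf
print(G_linear.shape, G_rbf.shape)
```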
14
Solving for SVM in the Primal
 Maximizing the margin subject to constraints led to the soft-margin formulation of SVM
  min_{𝒘,b,ξ} ½‖𝒘‖² + C Σ_n ξ_n  s.t.  y_n(𝒘⊤𝒙_n + b) ≥ 1 − ξ_n, ξ_n ≥ 0 ∀n

 Note that the slack ξ_n is the same as max{0, 1 − y_n(𝒘⊤𝒙_n + b)}, i.e., the hinge loss for (𝒙_n, y_n)

 Thus the above is equivalent to minimizing the regularized hinge loss
  L(𝒘, b) = ½‖𝒘‖² + C Σ_n max{0, 1 − y_n(𝒘⊤𝒙_n + b)}
  The sum of slacks is like the sum of hinge losses, and C plays a role similar to an (inverse) regularization hyperparameter

 Can learn (𝒘, b) directly by minimizing this objective using (stochastic) (sub)gradient descent (a minimal sketch is given below)

 The hinge-loss version is preferred for linear SVMs, or when using other regularizers on 𝒘 (e.g., the ℓ1 norm)
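A minimal NumPy sketch of batch subgradient descent on the regularized hinge loss above, written in the ½‖𝒘‖² + C·Σ hinge form. The learning rate, iteration count, C, and toy data are placeholders; a stochastic version would use one example (or a mini-batch) per step.

```python
import numpy as np

def svm_primal_subgradient(X, y, C=1.0, lr=0.01, iters=1000):
    """Minimize 0.5*||w||^2 + C * sum_n max(0, 1 - y_n (w^T x_n + b)) by batch subgradient descent."""
    N, D = X.shape
    w, b = np.zeros(D), 0.0
    for _ in range(iters):
        margins = y * (X @ w + b)
        viol = margins < 1                    # examples with nonzero hinge loss
        grad_w = w - C * (y[viol] @ X[viol])  # regularizer gradient + hinge subgradient
        grad_b = -C * np.sum(y[viol])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1.0, +1.0, -1.0, -1.0])
w, b = svm_primal_subgradient(X, y, C=10.0)
print(w, b, np.sign(X @ w + b))
```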
15
SVM: Summary
 A hugely (perhaps the most!) popular classification algorithm
 Reasonably mature, highly optimized SVM software is freely available (perhaps the reason why it is more popular than various other competing algorithms)
 Some popular ones: libSVM and LIBLINEAR; sklearn also provides SVM implementations (see the sketch below)
 Lots of work on scaling up SVMs* (both large N and large D)
 Extensions beyond binary classification (e.g., multiclass, structured outputs)
 Can even be used for regression problems (Support Vector Regression)
 Nonlinear extensions possible via kernels

*
See: “Support Vector Machine Solvers” by Bottou and Lin
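For completeness, a minimal usage sketch with scikit-learn's SVM implementations mentioned above; the toy data are placeholders and the hyperparameters would normally be tuned.

```python
import numpy as np
from sklearn.svm import SVC, LinearSVC

X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([+1, +1, -1, -1])

# Linear SVM (uses the LIBLINEAR solver)
lin = LinearSVC(C=1.0).fit(X, y)
print(lin.coef_, lin.intercept_)

# Kernelized SVM (uses the libSVM solver); linear kernel here, other kernels also possible
svc = SVC(kernel="linear", C=1.0).fit(X, y)
print(svc.support_vectors_, svc.predict([[1.0, 1.5]]))
```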
16
Coming up next
 A co-ordinate ascent algorithm for solving the SVM dual
 Multi-class SVM
 One-class SVM
 Kernel methods and nonlinear SVM via kernels

