HW 3
A. Kernels
1. Let X be a finite set. Show that the kernel K defined over 2^X, the set of
subsets of X, by
\[
\forall A, B \in 2^X, \quad K(A, B) = \exp\left(-\frac{1}{2}\,|A \Delta B|\right),
\]
where A∆B is the symmetric difference of A and B, is PDS (hint: you could
use the fact that K is the result of the normalization of a kernel function
K'). Note that this could define a similarity measure for documents based
on the set of their common words, or n-grams, or gappy n-grams, or a sim-
ilarity measure for images based on some patterns, or a similarity measure
for graphs based on their common sub-graphs.
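One possible reading of the hint (a sketch only; the particular choice of K' below is an assumption, not part of the problem statement): since |A∆B| = |A| + |B| − 2|A∩B|, the candidate K'(A, B) = exp(|A∩B|) satisfies
\[
K(A, B) = \exp\left(|A \cap B| - \tfrac{1}{2}|A| - \tfrac{1}{2}|B|\right)
= \frac{K'(A, B)}{\sqrt{K'(A, A)\,K'(B, B)}},
\]
so it remains to verify that this K' is itself PDS and that normalization preserves the PDS property.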
2. Let X be a finite set and let K_0 be a PDS kernel over X. Show that K' defined by
\[
\forall A, B \in 2^X, \quad K'(A, B) = \sum_{x \in A,\; x' \in B} K_0(x, x')
\]
is a PDS kernel.
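This is a proof exercise, but a quick numerical sanity check can help build intuition. The sketch below is only an illustration under assumptions not in the problem (a small ground set in R^2, a Gaussian base kernel as K_0, hypothetical variable names): it builds the Gram matrix of K' over all subsets and checks that its smallest eigenvalue is non-negative up to rounding error.

```python
# Numerical sanity check (not a proof): the Gram matrix of
# K'(A, B) = sum_{x in A, x' in B} K0(x, x') over subsets should be PSD.
import itertools
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))          # small finite ground set of points in R^2

def K0(x, xp):
    # Gaussian (RBF) base kernel -- just one illustrative PDS choice.
    return np.exp(-0.5 * np.sum((x - xp) ** 2))

# All non-empty subsets of the ground set, represented by index tuples.
subsets = [s for r in range(1, len(X) + 1)
           for s in itertools.combinations(range(len(X)), r)]

def K_prime(A, B):
    return sum(K0(X[i], X[j]) for i in A for j in B)

G = np.array([[K_prime(A, B) for B in subsets] for A in subsets])
print("smallest eigenvalue:", np.linalg.eigvalsh(G).min())  # expected >= -1e-8
```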
3. Show that K defined by
\[
K(x, x') = \frac{1}{\sqrt{1 - (x \cdot x')}}
\]
for all x, x' ∈ X = {x ∈ R^N : ‖x‖_2 < 1} is a PDS kernel. Bonus point: show that the dimension of
the feature space associated to K is infinite (hint: one method to show that
consists of finding an explicit expression of a feature mapping Φ).
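One possible route for this question and the bonus (a sketch, assuming a power-series argument is acceptable): for |t| < 1,
\[
\frac{1}{\sqrt{1 - t}} = \sum_{n=0}^{\infty} \binom{2n}{n} \frac{t^n}{4^n},
\qquad \text{so} \qquad
K(x, x') = \sum_{n=0}^{\infty} \binom{2n}{n} \frac{(x \cdot x')^n}{4^n}.
\]
Each term (x·x')^n is itself a PDS kernel with a finite-dimensional monomial feature map, and the non-negative coefficients suggest how an explicit infinite-dimensional Φ could be assembled.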
http://www.csie.ntu.edu.tw/˜cjlin/libsvm/ ,
and briefly consult the documentation to become more familiar with the
tools.
2. Consider the splice data set
http://www.cs.toronto.edu/~delve/data/splice/desc.html .
Download the already formatted training and test files of that dataset from
http://www.cs.nyu.edu/~mohri/ml13/splice.train.txt
http://www.cs.nyu.edu/~mohri/ml13/splice.test.txt .
Use the libsvm scaling tool to scale the features of all the data. The scaling
parameters should be computed only on the training data and then applied to
the test data.
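A minimal sketch of this scaling step, assuming the libsvm command-line tools (svm-scale in particular) are on the PATH and the downloaded files are named splice.train.txt and splice.test.txt as above; the output file names are hypothetical. The -s flag saves the scaling parameters fitted on the training set, and -r re-applies those saved parameters to the test set without refitting.

```python
# Sketch: compute scaling parameters on the training data only, then apply
# them to the test data, using libsvm's svm-scale command-line tool.
import subprocess

# Fit scaling parameters on the training set and write the scaled file.
with open("splice.train.scaled.txt", "w") as out:
    subprocess.run(["svm-scale", "-s", "scale.params", "splice.train.txt"],
                   stdout=out, check=True)

# Re-use the saved parameters (no re-fitting) to scale the test set.
with open("splice.test.scaled.txt", "w") as out:
    subprocess.run(["svm-scale", "-r", "scale.params", "splice.test.txt"],
                   stdout=out, check=True)
```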
5. Suppose we replace in the primal optimization problem of SVMs the penalty
term $\sum_{i=1}^{m} \xi_i = \|\xi\|_1$ with $\|\xi\|_2^2$, that is, we use the quadratic hinge loss
instead. Give the associated dual optimization problem and compare it with
the dual optimization problem of SVMs.
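For concreteness, here is one standard way to write the modified primal (a sketch using the notation from class, with slack variables ξ_i and trade-off parameter C; the exact normalization of the objective is assumed to match the formulation given in class):
\[
\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \xi_i^2
\quad \text{subject to} \quad
y_i (w \cdot x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \quad i = 1, \ldots, m.
\]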
To do that, you could use instead of the margin loss function Φρ defined in
class the function Ψρ defined by
\[
\Psi_\rho(u) =
\begin{cases}
1 & \text{if } u \le 0, \\[2pt]
\left(\dfrac{u}{\rho} - 1\right)^{2} & \text{if } u \in [0, \rho], \\[2pt]
0 & \text{otherwise,}
\end{cases}
\]
and show that it is a Lipschitz function. Compare the empirical and complexity terms of your generalization bound to those given in class using Φρ.
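A possible observation for the Lipschitz part (a hedged sketch, not the complete argument): Ψρ is continuous and only varies on [0, ρ], where
\[
\Psi_\rho'(u) = \frac{2}{\rho}\left(\frac{u}{\rho} - 1\right),
\qquad |\Psi_\rho'(u)| \le \frac{2}{\rho} \quad \text{for } u \in (0, \rho),
\]
which suggests that Ψρ is (2/ρ)-Lipschitz, i.e., its Lipschitz constant is twice that of the (1/ρ)-Lipschitz function Φρ used in class.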
C. Boosting
2. Implement that algorithm with boosting stumps and apply the algorithm to
the same data set as question B, with the same training and test sets. Plot
the average cross-validation error plus or minus one standard deviation as
a function of the number of rounds of boosting T, by selecting the value
of this parameter out of {10, 10^2, . . . , 10^k} for a suitable value of k, as in
question B. Let T* be the best value found for the parameter. Plot the error
on the training and test sets as a function of the number of rounds of boosting
for t ∈ [1, T*]. Compare your results with those obtained using SVMs in
question B.
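A minimal sketch of boosting with decision stumps, under the assumption that the algorithm referred to (from the earlier boosting question, not shown here) is AdaBoost. The feature matrix X and labels y in {−1, +1} are assumed already loaded (e.g., from the scaled splice files); the small synthetic data at the end is only a stand-in, and stump selection is a plain exhaustive search rather than an optimized implementation.

```python
# Sketch: AdaBoost with decision stumps h(x) = s * sign(x[j] - theta).
# Illustrative implementation, not the official course solution.
import numpy as np

def fit_stump(X, y, w):
    """Exhaustively pick (feature, threshold, sign) minimizing weighted error."""
    m, d = X.shape
    best = (np.inf, 0, 0.0, 1)                       # (error, feature, theta, sign)
    for j in range(d):
        for theta in np.unique(X[:, j]):
            pred = np.where(X[:, j] > theta, 1, -1)
            for s in (1, -1):
                err = np.sum(w[s * pred != y])
                if err < best[0]:
                    best = (err, j, theta, s)
    return best

def adaboost(X, y, T):
    """Run T rounds of AdaBoost; return a list of (alpha, feature, theta, sign)."""
    m = X.shape[0]
    w = np.full(m, 1.0 / m)                          # uniform initial distribution
    ensemble = []
    for _ in range(T):
        err, j, theta, s = fit_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)         # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = s * np.where(X[:, j] > theta, 1, -1)
        w = w * np.exp(-alpha * y * pred)            # re-weight and normalize
        w /= w.sum()
        ensemble.append((alpha, j, theta, s))
    return ensemble

def predict(ensemble, X):
    score = sum(alpha * s * np.where(X[:, j] > theta, 1, -1)
                for alpha, j, theta, s in ensemble)
    return np.sign(score)

# Tiny synthetic usage example (stand-in for the splice training data):
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.where(X[:, 0] + 0.5 * X[:, 2] > 0, 1, -1)
clf = adaboost(X, y, T=50)
print("training error:", np.mean(predict(clf, X) != y))
```

The same loop can be rerun for each T in the grid inside a cross-validation split to produce the requested error curves; only the plotting code is omitted here.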