B. Growth function
1. Consider the following family of functions:

   F = { x ↦ sgn( ∑_{k=1}^{p} a_k f_k(x) ) : a_1, ..., a_p ∈ R }.
C. Support vector machines

1. Download and install the libsvm software library from
http://www.csie.ntu.edu.tw/~cjlin/libsvm/,
and consider the splice dataset described at
http://www.cs.toronto.edu/~delve/data/splice/desc.html.
Download the already formatted training and test files of a noisy
version of that dataset from
http://www.cs.nyu.edu/~mohri/ml15/splice_noise_train.txt
http://www.cs.nyu.edu/~mohri/ml15/splice_noise_test.txt.
Use the libsvm scaling tool to scale the features of all the data. The
scaling parameters should be computed only on the training data and
then applied to the test data (see the scaling sketch below).
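A minimal sketch of this scaling step, assuming libsvm's svm-scale
tool is on the PATH and the downloaded files keep the names above;
svm-scale's -s flag saves the scaling parameters learned from the
training set and -r re-applies them:

    # Sketch: scale features with libsvm's svm-scale, fitting the scaling
    # parameters on the training data only, then re-applying them to the
    # test data.
    import subprocess

    # -s learns the feature ranges from the training set and saves them.
    with open("splice_noise_train.scaled", "w") as out:
        subprocess.run(
            ["svm-scale", "-s", "scale_params.txt", "splice_noise_train.txt"],
            stdout=out, check=True)

    # -r restores the saved ranges and applies them, unchanged, to the test set.
    with open("splice_noise_test.scaled", "w") as out:
        subprocess.run(
            ["svm-scale", "-r", "scale_params.txt", "splice_noise_test.txt"],
            stdout=out, check=True)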
2. Use SVMs with polynomial kernels to tackle this task, and plot the
average cross-validation error plus or minus one standard deviation as
a function of C (leave the other parameters of polynomial kernels in
libsvm at their default values), varying C in powers of 5, from a
small value C = 5^{-k} up to C = 5^{k}, for some value of k. Choose k
so that you see a significant variation in the training error, from a
very high training error down to a low training error. Expect longer
training times with libsvm as the value of C increases. (A
cross-validation sketch follows this item.)
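A minimal sketch of this cross-validation loop, using scikit-learn's
SVC class (a wrapper around libsvm) instead of the libsvm command-line
tools; the scaled file name, the ten-fold split, and the value of k
are assumptions:

    # Sketch: ten-fold cross-validation error (mean +/- one standard
    # deviation over the folds) of a polynomial-kernel SVM as a function
    # of C, with C varied in powers of 5.
    from sklearn.datasets import load_svmlight_file
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_svmlight_file("splice_noise_train.scaled")  # assumed name

    k = 4  # assumed; widen until the training error varies significantly
    for p in range(-k, k + 1):
        # libsvm-style polynomial kernel (gamma*<u,v> + coef0)^degree with
        # libsvm's defaults: degree 3, gamma = 1/num_features, coef0 = 0.
        clf = SVC(C=5.0 ** p, kernel="poly", degree=3,
                  gamma=1.0 / X.shape[1], coef0=0.0)
        errors = 1.0 - cross_val_score(clf, X, y, cv=10)
        print(f"C = 5^{p}: cv error {errors.mean():.3f} +/- {errors.std():.3f}")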
5. Now, combine SVMs with Gaussian kernels to tackle the same task.
Use cross-validation as before to determine the best value of C and σ,
varying C in powers of 5, and σ in powers of 2 for a reasonable range
so that you see a significant variation in training error, as before. Fix
C and σ to the best values found via cross-validation. How does the
test error of this solution compare to the best result obtained using
polynomial kernels? What is the value of the soft margin? (See the
grid-search sketch below.)
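A sketch of the joint grid search over C and σ, under the same
assumptions as above; the grid ranges, and the identification
gamma = 1/(2σ^2) for the Gaussian kernel written as
exp(-||u - v||^2 / (2σ^2)), are choices to adapt:

    # Sketch: ten-fold cross-validated grid search over C (powers of 5)
    # and sigma (powers of 2) for a Gaussian-kernel SVM.
    from sklearn.datasets import load_svmlight_file
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_svmlight_file("splice_noise_train.scaled")  # assumed name

    # libsvm's RBF kernel is exp(-gamma * ||u - v||^2), so a Gaussian
    # kernel of width sigma corresponds to gamma = 1 / (2 * sigma^2).
    sigmas = [2.0 ** s for s in range(-4, 5)]       # assumed range
    param_grid = {
        "C": [5.0 ** p for p in range(-4, 5)],      # assumed range
        "gamma": [1.0 / (2.0 * s ** 2) for s in sigmas],
    }
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=10)
    search.fit(X, y)
    print("best parameters:", search.best_params_)
    print("best cv error:", 1.0 - search.best_score_)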
6. Here, use as a kernel the sum of the best polynomial kernel (degree
d*) and the Gaussian kernel with the best parameter σ you found in the
previous question. Use cross-validation as before to determine the
best value of C. How does the test error of this solution compare to
the best results obtained in the previous questions? (A
precomputed-kernel sketch follows.)
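libsvm's built-in kernel types do not include a sum of kernels, so one
option is to pass a precomputed Gram matrix; a minimal sketch, with
placeholder values d_star and sigma_star standing in for the
parameters tuned above:

    # Sketch: cross-validate C for the sum kernel K = K_poly + K_gauss by
    # passing a precomputed Gram matrix to the SVM.
    from sklearn.datasets import load_svmlight_file
    from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    X, y = load_svmlight_file("splice_noise_train.scaled")  # assumed name

    d_star, sigma_star = 3, 1.0  # placeholders for the values tuned above

    # Gram matrix of the sum of the two kernels on the training set; a sum
    # of PDS kernels is again PDS, so this defines a valid kernel.
    K = (polynomial_kernel(X, degree=d_star)
         + rbf_kernel(X, gamma=1.0 / (2.0 * sigma_star ** 2)))

    for p in range(-4, 5):  # C in powers of 5 (assumed range)
        clf = SVC(C=5.0 ** p, kernel="precomputed")
        errors = 1.0 - cross_val_score(clf, K, y, cv=10)
        print(f"C = 5^{p}: cv error {errors.mean():.3f} +/- {errors.std():.3f}")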
D. Kernels