
Elements of Machine Learning, WS 2024/2025

Prof. Dr. Isabel Valera and Dr. Kavya Gupta


Assignment Sheet #3: Generalization, Regularization and Beyond Linearity

Deadline: Wednesday, December 11th, 2024 23:59 hrs

This problem set is worth a total of 50 points, consisting of 3 theory questions and 1 programming
question. Please carefully follow the instructions below to ensure a valid submission:

• You are encouraged to work in groups of two students. Register your team (of 1 or 2 members) on
the CMS at least ONE week before the submission deadline. You have to register your team for
each assignment.
• All solutions, including coding answers, must be uploaded individually to the CMS under the
corresponding assignment and problem number. On CMS you will find FOUR problems under each
assignment. Make sure you upload each of your solutions correctly under
Assignment #X – Problem Y (where X is the assignment number and Y is the problem number) on
CMS. In total you have to upload THREE PDFs (theoretical problems) and ONE ZIP file
(programming problem).
• For each theoretical question, we encourage using LaTeX or Word to write your solutions for
clarity and readability. Scanned handwritten solutions will be accepted as long as they are clean
and easily legible. The final submission must always be a single PDF file per theoretical
problem. Ensure your name, your team member's name (if applicable), and matriculation numbers are
clearly listed at the top of each PDF.
• For the programming question, you need to upload a ZIP file to CMS under
Assignment #X – Problem 4. Each ZIP file must contain a PDF or HTML exported from the Jupyter
notebook and the .ipynb file with your solutions. Make sure all cells in your Jupyter notebook contain
your final answers. To create the PDF/HTML, use the export function of the Jupyter notebook. Before
exporting, ensure that all cells have been computed. To do this:

– Go to the “Cell” menu at the top of the Jupyter interface.


– Select “Run All” to execute every cell in your notebook.
– Once all cells are executed, export the notebook: Click on “File” in the top menu.
– Choose “Export As” and select either PDF or HTML.

The submission should include your name, your team member's name, and matriculation numbers at the
top of both the PDF/HTML and the .ipynb file.

• Finally, ensure academic integrity is maintained. Cite any external resources you use for your
assignment.
• If you have any questions, follow the instructions here.


Problem 1 (Generalization). (10 Points)


1. Assume you are only given training points for a binary classification problem and a small validation
set. Does it make sense to compute the validation error for all classification methods (Logistic
Regression, LDA, QDA) and report the minimal validation error over all methods as an estimate of the
test error? Justify your answer (a minimal sketch of this setup is shown below). (3 Points)
2. Is it possible that model selection using cross-validation overfits? If yes, describe with an example;
if no, explain why overfitting is impossible. (4 Points)
3. Why does K-fold CV result in a higher bias than LOOCV? (3 Points)
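A minimal sketch of the setup in part 1, assuming scikit-learn; the data and split here are illustrative placeholders, not part of the assignment:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

# Placeholder data standing in for the training points mentioned in the problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] ** 2 > 0.5).astype(int)

# Hold out a small validation set, as in the problem statement.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=20, random_state=0)

models = {"LogReg": LogisticRegression(),
          "LDA": LinearDiscriminantAnalysis(),
          "QDA": QuadraticDiscriminantAnalysis()}
# Validation error (1 - accuracy) for each method on the same small validation set.
val_errors = {name: 1 - m.fit(X_tr, y_tr).score(X_val, y_val)
              for name, m in models.items()}
print(val_errors)  # part 1 asks whether min(val_errors.values()) estimates the test error
```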

Problem 2 (Regularization). (15 Points)


1. Lasso and ridge regression are used to predict a target Y from X, as shown in Equations 2.1 and
2.2, respectively. To understand which of the two models is better suited for a task, their objectives
are written as follows (a short numerical sketch evaluating both objectives follows part (b)):
 2
n
X p
X p
X
yi − β0 − βj xij  + λ |βj | (2.1)
i=1 j=1 j=1

 2
n
X p
X p
X
yi − β0 − βj xij  + λ βj2 (2.2)
i=1 j=1 j=1

(a) Discuss how the model coefficients βj change as λ → 0 and as λ → ∞ in both Equations 2.1
and 2.2. (4 Points)
(b) If we have significantly more independent features than observations and want to perform
feature selection, which type of regularization method should we use? (Hint: L1 or L2?) What
value of λ should be considered, i.e., small or large? (3 Points)
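A minimal sketch that evaluates the two objectives numerically, assuming NumPy; the data and coefficient values are illustrative placeholders, not part of the problem:

```python
import numpy as np

def lasso_objective(y, X, beta0, beta, lam):
    # Equation 2.1: RSS + lambda * sum_j |beta_j|
    residuals = y - beta0 - X @ beta
    return np.sum(residuals ** 2) + lam * np.sum(np.abs(beta))

def ridge_objective(y, X, beta0, beta, lam):
    # Equation 2.2: RSS + lambda * sum_j beta_j^2
    residuals = y - beta0 - X @ beta
    return np.sum(residuals ** 2) + lam * np.sum(beta ** 2)

# Illustrative data and coefficients (assumed for the sketch only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 0.0, -2.0, 0.0]) + rng.normal(size=50)
beta = np.array([0.9, 0.1, -1.8, 0.0])

for lam in (0.0, 1.0, 100.0):  # compare how each penalty scales with lambda
    print(lam, lasso_objective(y, X, 0.0, beta, lam), ridge_objective(y, X, 0.0, beta, lam))
```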
2. Suppose that $y_i = \beta_0 + \sum_{j=1}^{p} x_{ij} \beta_j + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are independent and identically
distributed from a $N(0, \sigma^2)$ distribution.
(a) Write out the likelihood for the data (see the hint after part (c)). (2 Points)
(b) Assume the prior for $\beta$: $\beta_1, \ldots, \beta_p$ are independent and identically distributed according to a
double-exponential distribution with mean 0 and common scale parameter $b$, written as

$$p(\beta) = \frac{1}{2b} \exp\left( -\frac{|\beta|}{b} \right).$$

Write out the posterior for $\beta$ in this setting. (2 Points)
(c) Show that the lasso estimate is the mode for β under this posterior distribution. (4 Points)
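Hint: As a reminder, the density of a single $N(0, \sigma^2)$ error term $\epsilon_i$ is

$$p(\epsilon_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{\epsilon_i^2}{2\sigma^2} \right),$$

and, by independence, the likelihood of the data is the product of these densities over $i = 1, \ldots, n$.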

Problem 3 (Beyond linearity: Polynomial and Splines). (15 Points)


1. A cubic regression spline with one knot at $\xi$ can be obtained using a basis of the form
$x, x^2, x^3, (x - \xi)^3_+$, where $(x - \xi)^3_+ = (x - \xi)^3$ if $x > \xi$ and equals 0 otherwise. We can show that a
function of the form

$$f(x) = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 (x - \xi)^3_+$$

is indeed a cubic regression spline, regardless of the values of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.

(a) Find a cubic polynomial (2 Points)

$$f_1(x) = a_1 + b_1 x + c_1 x^2 + d_1 x^3$$

such that $f(x) = f_1(x)$ for all $x \le \xi$. Express $a_1, b_1, c_1, d_1$ in terms of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$.


(b) Find a cubic polynomial (2 Points)

$$f_2(x) = a_2 + b_2 x + c_2 x^2 + d_2 x^3$$

such that $f(x) = f_2(x)$ for all $x > \xi$. Express $a_2, b_2, c_2, d_2$ in terms of $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$. We
have now established that $f(x)$ is a piecewise polynomial.
(c) Show that $f_1(\xi) = f_2(\xi)$. That is, $f(x)$ is continuous at $\xi$. (2 Points)
(d) Show that $f_1'(\xi) = f_2'(\xi)$. That is, $f'(x)$ is continuous at $\xi$. (2 Points)
(e) Show that $f_1''(\xi) = f_2''(\xi)$. That is, $f''(x)$ is continuous at $\xi$. (2 Points)
Therefore, $f(x)$ is indeed a cubic spline. (A numerical sanity check of these continuity
conditions is sketched after the hint below.)
Hint: Parts (d) and (e) of this problem require knowledge of single-variable calculus. As a
reminder, given a cubic polynomial

$$f_1(x) = a_1 + b_1 x + c_1 x^2 + d_1 x^3,$$

the first derivative takes the form

$$f_1'(x) = b_1 + 2 c_1 x + 3 d_1 x^2.$$
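To sanity-check the continuity results of parts (c)–(e) numerically, a minimal sketch assuming NumPy (the coefficients and knot below are arbitrary illustrations):

```python
import numpy as np

# Arbitrary illustrative coefficients and knot; any values work for the check.
b0, b1, b2, b3, b4 = 1.0, -2.0, 0.5, 3.0, -1.5
xi = 0.7

def f(x):
    # f(x) = b0 + b1*x + b2*x^2 + b3*x^3 + b4*(x - xi)_+^3
    return b0 + b1 * x + b2 * x ** 2 + b3 * x ** 3 + b4 * np.maximum(x - xi, 0.0) ** 3

def d1(x, h=1e-5):
    # central finite difference for f'
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(x, h=1e-4):
    # central finite difference for f''
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

eps = 1e-3  # evaluate just left and just right of the knot
for name, g in [("f", f), ("f'", d1), ("f''", d2)]:
    print(f"{name}: left={g(xi - eps):.5f}  right={g(xi + eps):.5f}")
# The left/right values should match up to O(eps), reflecting continuity at xi.
```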

2. Consider two curves, $\hat{g}_1$ and $\hat{g}_2$, defined by (5 Points)

$$\hat{g}_1 = \arg\min_{g} \left( \sum_{i=1}^{n} \big( y_i - g(x_i) \big)^2 + \lambda \int \big[ g^{(3)}(x) \big]^2 \, dx \right),$$

$$\hat{g}_2 = \arg\min_{g} \left( \sum_{i=1}^{n} \big( y_i - g(x_i) \big)^2 + \lambda \int \big[ g^{(4)}(x) \big]^2 \, dx \right),$$

where $g^{(m)}$ represents the $m$-th derivative of $g$.

(a) As λ → ∞, will ĝ1 or ĝ2 have the smaller training RSS?


(b) As λ → ∞, will ĝ1 or ĝ2 have the smaller test RSS?
(c) For λ = 0, will ĝ1 or ĝ2 have the smaller training and test RSS?

Problem 4 (Coding Generalization, Regularization and Beyond Linearity). (10 Points)

In this assignment, you will work on selecting the best model using K-fold cross-validation. You will also
explore methods for selecting hyperparameters to enhance the generalizability of your trained models.

Please refer to the file assignment_3_handout.ipynb and complete only the sections marked in red and
the missing code denoted with #TODO. Once you have filled in the required parts, revisit the submission
instructions above to check how to submit. A generic sketch of the K-fold workflow is shown below.
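The notebook defines the actual tasks; the following is only a generic, minimal sketch of K-fold hyperparameter selection, assuming scikit-learn and placeholder data:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

# Placeholder data; the notebook provides its own dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=100)

lambdas = [0.01, 0.1, 1.0, 10.0]  # candidate regularization strengths
kf = KFold(n_splits=5, shuffle=True, random_state=0)

cv_errors = {}
for lam in lambdas:
    fold_errors = []
    for train_idx, val_idx in kf.split(X):
        model = Ridge(alpha=lam).fit(X[train_idx], y[train_idx])
        fold_errors.append(mean_squared_error(y[val_idx], model.predict(X[val_idx])))
    cv_errors[lam] = np.mean(fold_errors)  # average validation MSE over the K folds

best_lam = min(cv_errors, key=cv_errors.get)
print(f"best lambda by 5-fold CV: {best_lam}")
```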
