
University of Regina

Electronic Systems Engineering

Winter Term 2020, ENEL-865/ENSE 865


Assignment #02
Grade Points: 03

Acknowledgment: These questions are taken from the textbook An Introduction to Statistical
Learning with Applications in R by James, Witten, Hastie, and Tibshirani.
1) [3 × 5 Points]
We perform best subset, forward stepwise, and backward stepwise selection on a single
data set. For each approach, we obtain 𝑝 + 1 models, containing 0, 1, 2, . . . , 𝑝 predictors.
Explain your answers. (A short sketch after part (c) illustrates the stepwise procedures.)
a) Which one of the three models with 𝑘 predictors has the smallest training RSS?
b) Which one of the three models with 𝑘 predictors has the smallest test RSS?
c) True or False:
i) The predictors in the 𝑘-variable model identified by forward stepwise are a subset of
the predictors in the (𝑘 + 1)-variable model identified by forward stepwise selection.
ii) The predictors in the 𝑘-variable model identified by backward stepwise are a subset
of the predictors in the (𝑘 + 1)-variable model identified by backward stepwise
selection.
iii) The predictors in the 𝑘-variable model identified by backward stepwise are a subset
of the predictors in the (𝑘 + 1)-variable model identified by forward stepwise
selection.
iv) The predictors in the 𝑘-variable model identified by forward stepwise are a subset of
the predictors in the (𝑘 + 1)-variable model identified by backward stepwise
selection.
v) The predictors in the 𝑘-variable model identified by best subset are a subset of the
predictors in the (𝑘 + 1)-variable model identified by best subset selection.
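For intuition on part (c), the following minimal sketch traces forward stepwise selection on synthetic data (Python/NumPy; the data-generating model and the rss helper are illustrative assumptions, not part of the assignment). The predictor sets it prints are nested, which bears directly on statement (i).

import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)   # two true signals

def rss(cols):
    # training RSS of the least-squares fit on the given predictor columns
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2))

selected = []   # forward stepwise starts from the null model
for k in range(p):
    remaining = [j for j in range(p) if j not in selected]
    # greedily add the one predictor that most reduces training RSS
    best = min(remaining, key=lambda j: rss(selected + [j]))
    selected.append(best)
    print(f"{k + 1}-variable model: {sorted(selected)}")

Reversing the rule (start from the full model and drop the predictor whose removal increases RSS the least) gives the backward analogue relevant to statement (ii).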

2) [5 × 5 Points]

Suppose we estimate the regression coefficients in a linear regression model by minimizing

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p}|\beta_j| \le s$$

for a particular value of 𝑠. For parts (a) through (e), indicate which of i. through v. is correct.
Justify your answer. (A numeric illustration follows part (e).)
a) As we increase 𝑠 from 0, the training RSS will:
i) Increase initially, and then eventually start decreasing in an inverted U shape.
ii) Decrease initially, and then eventually start increasing in a U shape.
iii) Steadily increase.
iv) Steadily decrease.
v) Remain constant.
b) Repeat (a) for test RSS.
c) Repeat (a) for variance.
d) Repeat (a) for (squared) bias.
e) Repeat (a) for the irreducible error.
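As an empirical check on part (a), the sketch below sweeps the budget indirectly through scikit-learn's penalized lasso, an equivalent formulation in which every penalty strength alpha corresponds to some budget s = Σ|β̂_j|; the synthetic data here are purely illustrative.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 10))
y = X @ rng.normal(size=10) + rng.normal(size=80)

for alpha in [10.0, 1.0, 0.1, 0.01]:   # smaller penalty = larger implied budget s
    model = Lasso(alpha=alpha).fit(X, y)
    s = float(np.abs(model.coef_).sum())                    # implied l1 budget
    train_rss = float(np.sum((y - model.predict(X)) ** 2))
    print(f"alpha={alpha}: s={s:.3f}, training RSS={train_rss:.3f}")

The key observation for part (a) is that the feasible set of the constrained problem only grows as 𝑠 increases.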

3) [5 × 5 Points]
Suppose we estimate the regression coefficients in a linear regression model by minimizing

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2$$

for a particular value of 𝜆. For parts (a) through (e), indicate which of i. through v. is correct.
Justify your answer. (A numeric illustration follows part (e).)

a) As we increase 𝜆 from 0, the training RSS will:
i) Increase initially, and then eventually start decreasing in an inverted U shape.
ii) Decrease initially, and then eventually start increasing in a U shape.
iii) Steadily increase.
iv) Steadily decrease.
v) Remain constant.
b) Repeat (a) for test RSS.
c) Repeat (a) for variance.
d) Repeat (a) for (squared) bias.
e) Repeat (a) for the irreducible error.
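For a concrete feel for parts (a) and (b), the sketch below sweeps 𝜆 over synthetic data using the closed-form ridge solution (XᵀX + 𝜆I)⁻¹Xᵀy; omitting the intercept is an illustrative simplification (the synthetic responses are mean-zero by construction), and the data-generating setup is assumed, not given.

import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 15
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
y = X @ beta_true + rng.normal(scale=2.0, size=n)
X_test = rng.normal(size=(n, p))
y_test = X_test @ beta_true + rng.normal(scale=2.0, size=n)

for lam in [0.0, 0.1, 1.0, 10.0, 100.0]:
    # closed-form ridge estimate; coefficients shrink toward zero as lam grows
    beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    train_rss = float(np.sum((y - X @ beta) ** 2))
    test_rss = float(np.sum((y_test - X_test @ beta) ** 2))
    print(f"lambda={lam}: train RSS={train_rss:.2f}, test RSS={test_rss:.2f}")

The same sweep, repeated over many simulated draws, would also expose the variance and squared-bias trends asked about in parts (c) and (d).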

4) [5 + 10 + 15 Points]

We have discussed the coordinate descent algorithm for finding the optimal weights for the
least-squares cost function (using RSS only), ridge, and lasso on normalized data. Repeat the
process (and write out the complete iterative algorithms) when the data are not normalized.
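As a reference point for this question, here is a minimal sketch of the normalized-data case for the lasso (assuming the objective ½·RSS + 𝜆Σ|β_j| and columns of X scaled to unit ℓ2 norm; both the scaling convention and the function names are assumptions for illustration, not necessarily the exact formulation used in class).

import numpy as np

def soft_threshold(rho, lam):
    # soft-thresholding operator: the closed-form one-dimensional lasso solution
    return np.sign(rho) * max(abs(rho) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    # cyclic coordinate descent for (1/2)*RSS + lam * sum(|beta_j|),
    # assuming each column of X has unit l2 norm (the normalized case)
    p = X.shape[1]
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual excluding j
            rho = X[:, j] @ r                      # feature-residual inner product
            beta[j] = soft_threshold(rho, lam)     # unit-norm columns: divisor is 1
    return beta

Your answer should spell out how each coordinate update changes, for all three cost functions, once the unit-norm simplification no longer holds.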
