Assignment 2
Acknowledgment: These questions are taken from the textbook An Introduction to Statistical
Learning with Applications in R by James, Witten, Hastie, and Tibshirani.
1) [3 X 5 Points]
We perform best subset, forward stepwise, and backward stepwise selection on a single
data set. For each approach, we obtain 𝑝 + 1 models, containing 0, 1, 2, …, 𝑝 predictors.
Explain your answers:
a) Which one of the three models with 𝑘 predictors has the smallest training RSS?
b) Which one of the three models with 𝑘 predictors has the smallest test RSS?
c) True or False:
i) The predictors in the 𝑘-variable model identified by forward stepwise are a subset of
the predictors in the (𝑘 + 1)-variable model identified by forward stepwise selection.
ii) The predictors in the 𝑘-variable model identified by backward stepwise are a subset
of the predictors in the (𝑘 + 1)-variable model identified by backward stepwise
selection.
iii) The predictors in the 𝑘-variable model identified by backward stepwise are a subset
of the predictors in the (𝑘 + 1)-variable model identified by forward stepwise
selection.
iv) The predictors in the 𝑘-variable model identified by forward stepwise are a subset of
the predictors in the (𝑘 + 1)-variable model identified by backward stepwise
selection.
v) The predictors in the 𝑘-variable model identified by the best subset are a subset of the
predictors in the (𝑘 + 1)-variable model identified by best subset selection.
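The three procedures in Question 1 can be tried on simulated data before writing up the explanations. The following is a minimal sketch and not part of the textbook exercise: the data-generating model, the sample size, and the helper names (rss, best_subset, forward_stepwise, backward_stepwise) are all assumptions made for illustration. It uses only the Python standard library and fits each model by solving the normal equations.

```python
import itertools
import random


def rss(X, y, cols):
    """Training RSS of the least-squares fit of y on an intercept plus X[:, cols]."""
    rows = [[1.0] + [row[j] for j in cols] for row in X]
    m = len(rows[0])
    # Augmented normal equations [Z'Z | Z'y], where Z is the design matrix.
    A = [[sum(r[a] * r[b] for r in rows) for b in range(m)]
         + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(m):
        piv = max(range(i, m), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        for r in range(m):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [u - f * v for u, v in zip(A[r], A[i])]
    beta = [A[i][m] / A[i][i] for i in range(m)]
    return sum((yi - sum(b * z for b, z in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def best_subset(X, y, p):
    """For each k, search all size-k subsets and keep the one with lowest RSS."""
    return {k: min(itertools.combinations(range(p), k),
                   key=lambda c: rss(X, y, c))
            for k in range(p + 1)}


def forward_stepwise(X, y, p):
    """Start from the null model; add the predictor that lowers RSS the most."""
    current, path = (), {0: ()}
    for k in range(1, p + 1):
        rest = [j for j in range(p) if j not in current]
        add = min(rest, key=lambda j: rss(X, y, current + (j,)))
        current = tuple(sorted(current + (add,)))
        path[k] = current
    return path


def backward_stepwise(X, y, p):
    """Start from the full model; drop the predictor whose removal costs least."""
    current = tuple(range(p))
    path = {p: current}
    for k in range(p - 1, -1, -1):
        drop = min(current,
                   key=lambda j: rss(X, y, tuple(c for c in current if c != j)))
        current = tuple(c for c in current if c != drop)
        path[k] = current
    return path


# Simulated data (assumed for illustration only).
random.seed(0)
n, p = 60, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [2 * r[0] - r[1] + 0.5 * r[3] + random.gauss(0, 1) for r in X]

best = best_subset(X, y, p)
fwd = forward_stepwise(X, y, p)
bwd = backward_stepwise(X, y, p)

for k in range(p + 1):
    print(k, best[k], fwd[k], bwd[k],
          round(rss(X, y, best[k]), 3),
          round(rss(X, y, fwd[k]), 3),
          round(rss(X, y, bwd[k]), 3))
```

Each printed row lists, for one value of 𝑘, the predictor indices chosen by the three methods followed by their training RSS. Comparing the rows for consecutive values of 𝑘 is one way to probe the subset statements in part (c). A held-out sample would be needed to say anything about test RSS.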
2) [5 X 5 Points]
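Suppose we estimate the regression coefficients in a linear regression model by minimizing
\[ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 \quad \text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le s \]
for a particular value of 𝑠. For parts (a) through (e), indicate which of i. through v. is correct.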
Justify your answer.
a) As we increase 𝑠 from 0, the training RSS will:
i) Increase initially, and then eventually start decreasing in an inverted U shape.
ii) Decrease initially, and then eventually start increasing in a U shape.
iii) Steadily increase.
iv) Steadily decrease.
v) Remain constant.
b) Repeat (a) for test RSS.
c) Repeat (a) for variance.
d) Repeat (a) for (squared) bias.
e) Repeat (a) for the irreducible error.
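Part (a) of Question 2 can be explored numerically. The sketch below is an illustration only and rests on a stated assumption: that the constraint in this question is the lasso budget, the sum of |β_j| being at most 𝑠, as in the textbook exercise. It solves the equivalent penalized problem by coordinate descent for a decreasing grid of penalties and reports the resulting budget (the L1 norm of the solution) next to the training RSS. The simulated data and the names soft and lasso_cd are hypothetical.

```python
import random


def soft(a, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if a > t:
        return a - t
    if a < -t:
        return a + t
    return 0.0


def lasso_cd(X, y, lam, sweeps=200):
    """Minimize sum_i (y_i - sum_j b_j x_ij)^2 + lam * sum_j |b_j|.

    Assumes y and the columns of X are centered, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for the current beta (all zeros initially)
    z = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual excluding j.
            rho = sum(X[i][j] * (resid[i] + beta[j] * X[i][j]) for i in range(n))
            new = soft(rho, lam / 2) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# Simulated, centered data (assumed for illustration only).
random.seed(1)
n, p = 50, 4
raw = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y_raw = [3 * r[0] - 2 * r[2] + random.gauss(0, 1) for r in raw]
means = [sum(r[j] for r in raw) / n for j in range(p)]
X = [[r[j] - means[j] for j in range(p)] for r in raw]
y_bar = sum(y_raw) / n
y = [v - y_bar for v in y_raw]


def train_rss(beta):
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


# A smaller penalty corresponds to a larger budget s.
results = []
for lam in [400, 200, 100, 50, 20, 5, 0]:
    beta = lasso_cd(X, y, lam)
    results.append((sum(abs(b) for b in beta), train_rss(beta)))

for s, r in results:
    print(f"s = {s:.3f}   training RSS = {r:.3f}")
```

Only training RSS is tracked here. Parts (b) through (d) would need an independent test sample and repeated simulations, which this sketch does not attempt.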
3) [5 X 5 Points]
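Suppose we estimate the regression coefficients in a linear regression model by minimizing
\[ \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2 \]
for a particular value of 𝜆. For parts (a) through (e), indicate which of i. through v. is correct.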
Justify your answer.