Midterm 1 Practice Solutions
1. Setting up and solving a linear regression problem with features that are nonlinear
functions of the model’s input.
2. Writing down the optimization problem for least squares linear regression using
matrix-vector notation
3. Familiarity with matrix multiplication, in particular when multiplying by diagonal
matrices
4. Evaluating the true positive rate, false positive rate, precision, and recall of a
predictor for binary classification
5. Setting up a logistic regression problem and writing down the appropriate function
that is being minimized
6. Computing the mean (expected value) and variance of uniform random variables
7. Explaining causes and remedies for overfitting and underfitting of ML models
Question 1.
Consider the following training data.
x1 x2 y
0 0 0
0 1 1.5
1 0 2
1 1 2.5
Suppose the data comes from a model y = θ0 + θ1 x1 + θ2 x2 + noise for unknown constants
θ0, θ1, θ2. Use least squares linear regression to find an estimate of θ0, θ1, θ2.
Response:
Let

X =
[1 0 0]
[1 0 1]
[1 1 0]
[1 1 1],   y = (0, 1.5, 2, 2.5)ᵀ,

where the first column of X corresponds to the intercept θ0. We solve the least squares problem

min over θ of ‖Xθ − y‖².

The solution is given by the normal equations

θ̂ = (XᵀX)⁻¹ Xᵀy,

where

XᵀX =
[4 2 2]
[2 2 1]
[2 1 2],   Xᵀy = (6, 4.5, 4)ᵀ.

Solving (XᵀX)θ̂ = Xᵀy gives

θ̂0 = 0.25,   θ̂1 = 1.5,   θ̂2 = 1.
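The least squares arithmetic above can be checked numerically; a minimal sketch using numpy's least squares solver:

```python
import numpy as np

# Design matrix: a column of ones (intercept), then x1 and x2
X = np.array([[1, 0, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
y = np.array([0, 1.5, 2, 2.5])

# Least squares solution of min ||X theta - y||^2
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
# theta is approximately [0.25, 1.5, 1.0]
```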
Question 2.
Consider the following training data:
x y
1 3
2 1
3 0.5
Suppose the data comes from a model y = c·x^a + noise, for unknown constants c and a.
Use least squares linear regression to find an estimate of c and a.
Response:
Taking logarithms linearizes the model:

log y = log c + a log x.

Let u_i = log x_i and v_i = log y_i, and fit v ≈ β0 + β1 u by least squares, with β0 = log c and β1 = a. The data become

u = (0, log 2, log 3),   v = (log 3, 0, log 0.5),

and the one-dimensional least squares fit gives

a = β1 ≈ −1.63,   c = e^{β0} ≈ 3.02.
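Assuming the model y = c·x^a is fit by least squares after taking logs, the estimates can be checked numerically; a short sketch using np.polyfit for the one-dimensional fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 1.0, 0.5])

# Linearize y = c * x^a:  log y = log c + a * log x,
# then ordinary least squares on (log x, log y)
a_hat, log_c = np.polyfit(np.log(x), np.log(y), 1)
c_hat = np.exp(log_c)
# a_hat is about -1.63, c_hat is about 3.02
```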
Question 3.
(a) Let θ* ∈ ℝ^d, and let f(θ) = ½‖θ − θ*‖². Show that the Hessian of f is the identity
matrix.
Response:
Write f(θ) = ½ Σ_{i=1}^{d} (θ_i − θ*_i)². Then

∂f/∂θ_i = θ_i − θ*_i,

so ∂²f/∂θ_j ∂θ_i = 1 if i = j and 0 otherwise. Hence H = I_d, the d × d identity matrix.

(b) Let X ∈ ℝ^{n×d} and y ∈ ℝ^n. For θ ∈ ℝ^d, let g(θ) = ½‖Xθ − y‖². Show that the Hessian
of g is XᵀX.
Response:
From class,

∇g(θ) = Xᵀ(Xθ − y) = XᵀXθ − Xᵀy.

Let M = XᵀX ∈ ℝ^{d×d}, and let M_k denote the kth row of M. Then the kth entry of the gradient is

∂g/∂θ_k = M_k θ − (Xᵀy)_k,

so differentiating once more, ∂²g/∂θ_j ∂θ_k = M_{kj}, the (k, j) entry of M. Thus H = M = XᵀX.
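The claim H = XᵀX can also be checked numerically with central finite differences on a random instance (a sketch; the matrix sizes here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
y = rng.standard_normal(5)

def g(theta):
    # g(theta) = 0.5 * ||X theta - y||^2
    r = X @ theta - y
    return 0.5 * r @ r

# Central-difference estimate of the Hessian of g at theta = 0;
# for a quadratic this is exact up to floating-point roundoff
d, h = 3, 1e-4
I = np.eye(d)
H = np.zeros((d, d))
for i in range(d):
    for j in range(d):
        H[i, j] = (g(h*I[i] + h*I[j]) - g(h*I[i] - h*I[j])
                   - g(-h*I[i] + h*I[j]) + g(-h*I[i] - h*I[j])) / (4 * h * h)
# H matches X.T @ X
```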
Question 4.
Consider a binary classification problem whose features are in R2 . Suppose the predictor
learned by logistic regression is σ(θ0 + θ1 x1 + θ2 x2), where θ0 = −4, θ1 = 1, θ2 = 0. Find
and plot the curve along which P(class 1) = 1/2 and the curve along which P(class 1) = 0.95.
Response:
P(class 1) = σ(θ0 + θ1 x1 + θ2 x2) = σ(−4 + x1), since θ2 = 0.

P(class 1) = 1/2: σ(z) = 1/2 exactly when z = 0, so

−4 + x1 = 0,

i.e. the vertical line x1 = 4.

P(class 1) = 0.95: Recall σ(z) = 1/(1 + e^{−z}). So σ(z) = 0.95 gives

e^{−z} = 1/0.95 − 1 ≈ 0.0526,   i.e.   z = ln 19 ≈ 2.94.

Then −4 + x1 ≈ 2.94, i.e. the vertical line x1 ≈ 6.94. Because θ2 = 0, both curves are vertical lines in the (x1, x2) plane.
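The two boundary values can be confirmed with a few lines of arithmetic (assuming, as above, θ0 = −4, θ1 = 1, θ2 = 0):

```python
import math

theta0, theta1 = -4.0, 1.0  # assumed coefficients from the problem

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x1_half = -theta0 / theta1       # boundary where P(class 1) = 1/2, i.e. x1 = 4
z95 = math.log(0.95 / 0.05)      # ln 19, about 2.944
x1_95 = (z95 - theta0) / theta1  # boundary where P(class 1) = 0.95, about 6.944
```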
Question 5.
Consider a 3-class classification problem. You have trained a predictor whose input is
x ∈ ℝ² and whose output is softmax(x1 + x2 − 1, 2x1 + 3, x2). Find and sketch the three
regions in ℝ² that get classified as class 1, 2, and 3.
Response:
Write the three logits as

z1 = x1 + x2 − 1,   z2 = 2x1 + 3,   z3 = x2.

Softmax is monotone, so each point is assigned to the class with the largest logit.

Class 1: z1 ≥ z2 and z1 ≥ z3, i.e. x2 ≥ x1 + 4 and x1 ≥ 1.
Class 2: z2 ≥ z1 and z2 ≥ z3, i.e. x2 ≤ x1 + 4 and x2 ≤ 2x1 + 3.
Class 3: z3 ≥ z1 and z3 ≥ z2, i.e. x1 ≤ 1 and x2 ≥ 2x1 + 3.

(Sketch: the boundary lines x2 = x1 + 4, x1 = 1, and x2 = 2x1 + 3 all meet at (1, 5). Class 1 is the region above x2 = x1 + 4 with x1 ≥ 1, class 3 is the region above x2 = 2x1 + 3 with x1 ≤ 1, and class 2 is everything below.)
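A quick numerical check of the region boundaries, classifying one sample point per region:

```python
import numpy as np

def predict(x1, x2):
    # Softmax preserves order, so the predicted class is the argmax of the logits
    logits = np.array([x1 + x2 - 1, 2 * x1 + 3, x2])
    return int(np.argmax(logits)) + 1  # classes numbered 1, 2, 3

# One point inside each region:
# (5, 10) -> class 1, (0, 0) -> class 2, (0, 10) -> class 3
```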
Question 6.
Suppose x ∼ Uniform([−1, 1]) and y = x + ε, where ε ∼ Uniform([−δ, δ]) for some δ > 0.
Consider a predictor given by f_θ(x) = θ1 + θ2 x, where θ ∈ ℝ². Evaluate the risk of f_θ
with respect to the square loss. Your answer should be a deterministic expression only
depending on θ1, θ2, and δ.
Response:
R(θ) = E[(f_θ(x) − y)²] = E[(θ1 + θ2 x − x − ε)²] = E[(θ1 + (θ2 − 1)x − ε)²].

Note that for u ∼ Uniform([−a, a]), E[u] = 0 and E[u²] = a²/3. So E[x] = 0, E[x²] = 1/3, E[ε] = 0, E[ε²] = δ²/3. Expanding the square,

R(θ) = θ1² + (θ2 − 1)² E[x²] + E[ε²] + 2θ1(θ2 − 1)E[x] − 2θ1 E[ε] − 2(θ2 − 1)E[xε].

Since θ1, θ2 are deterministic and x and ε are independent, E[xε] = E[x]E[ε] = 0, so all cross terms vanish. Therefore

R(θ) = θ1² + (θ2 − 1)²/3 + δ²/3.
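The closed-form risk can be checked by Monte Carlo simulation; a sketch with one arbitrary choice of θ1, θ2, and δ:

```python
import numpy as np

# Monte Carlo check of R(theta) = theta1^2 + (theta2 - 1)^2 / 3 + delta^2 / 3
rng = np.random.default_rng(0)
theta1, theta2, delta = 0.3, 1.4, 0.5
n = 1_000_000

x = rng.uniform(-1, 1, n)
eps = rng.uniform(-delta, delta, n)
y = x + eps

empirical = np.mean((theta1 + theta2 * x - y) ** 2)
formula = theta1**2 + (theta2 - 1) ** 2 / 3 + delta**2 / 3
# empirical and formula agree to Monte Carlo accuracy
```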
Question 7.
You are training a logistic regression model and you notice that it does not perform well
on test data.
• Could the poor performance be due to underfitting? Explain.
• Could the poor performance be due to overfitting? Explain.
Response:
Underfitting: Yes. Logistic regression produces a linear decision boundary. Consider, for
example, the case of 2 classes that are separable only by a curved line: no linear boundary
can separate them, so the model underfits no matter how much training data it sees. A
remedy is to add nonlinear features of the input.

Overfitting: Yes. For example, if the training set is small or noisy enough to happen
to be linearly separable as a result of chance, unregularized logistic regression will drive
the weights toward infinity, fitting the training labels perfectly with overconfident
probabilities that do not generalize. Remedies include regularization and collecting more
training data.
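The weight blow-up on separable data can be illustrated with a tiny numpy sketch (an illustrative example under the assumptions above): for a 1-D linearly separable training set, scaling the weight up strictly decreases the unregularized training loss, so the minimizer is at infinity.

```python
import numpy as np

# 1-D linearly separable training set: class 0 for x < 0, class 1 for x > 0
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0, 0, 1, 1])

def logistic_loss(w):
    # Numerically stable log-loss: mean of log(1 + exp(-margin)),
    # where margin = (2y - 1) * w * x is positive for correct predictions
    margins = (2 * y - 1) * w * x
    return float(np.mean(np.logaddexp(0.0, -margins)))

# Scaling the weight up keeps lowering the training loss toward 0
losses = [logistic_loss(w) for w in (1.0, 10.0, 100.0)]
```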