P05 LinearRegression SolutionNotes
Practical exercises
There is no evidence of bias, as the residuals appear to be randomly distributed against y1 and y2.
f) Compute the closed form solution considering the Ridge regularization term with λ = 0.2.

w = (X^T X + λ I)^{-1} X^T Z = (0.24, 0.05, 0.63)^T
g) Compare the hyperplanes obtained using ordinary least squares and Ridge regression.
The weight vector obtained with Ridge regression has a lower norm, as expected, since the regularization term penalizes large weights.
w = (X^T X)^{-1} X^T Z = (1.5, 0, −0.5)^T
b) Assuming the output threshold θ = 0.5, use the regression to classify x_new = [2, 2.5]^T.
ẑ = 1.5 + 0 × 2 − 0.5 × 2.5 = 0.25 < 0.5, so x_new is assigned to class 0.
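The classification can be checked numerically; a minimal sketch using the weights w = (1.5, 0, −0.5) obtained above:

```python
import numpy as np

w = np.array([1.5, 0.0, -0.5])       # [bias, w1, w2] from the closed-form solution
x_new = np.array([1.0, 2.0, 2.5])    # prepend 1 to account for the bias term

z_hat = w @ x_new                    # 1.5 + 0*2 - 0.5*2.5 = 0.25
label = int(z_hat >= 0.5)            # threshold at 0.5 -> class 0
print(z_hat, label)                  # 0.25 0
```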
b) Compute w using the Bayesian (MAP) approach, assuming the prior p(w) = N(w | u = [0 0]^T, σ = [[0.2, 0], [0, 0.2]]).
The maximum posterior is given by (proof on the slides): w = (X^T X + λ I)^{-1} X^T Z, with
λ = σ²_posterior / σ²_prior = 0.1² / 0.2² = 0.25.
Solve similarly to exercise 1 f).
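The MAP estimate can be sketched in code. Note the design matrix X and targets z below are placeholders (the exercise's own data is not reproduced here); only the formula is from the notes:

```python
import numpy as np

# Placeholder design matrix (with bias column) and targets -- not the exercise data.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.0]])
z = np.array([0.4, 1.1, 1.6])

sigma2_posterior = 0.1 ** 2          # per the notes
sigma2_prior = 0.2 ** 2              # prior variance on the weights
lam = sigma2_posterior / sigma2_prior  # lambda = 0.25

# MAP solution coincides with Ridge: w = (X^T X + lambda I)^{-1} X^T z
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ z)
```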
4. Identify a transformation to aid the linear modelling of the following data points. Sketch the predicted surface.

        y1      y2      output
x1    -0.95    0.62      0
x2     0.63    0.31      0
x3    -0.12   -0.21      1
x4    -0.24   -0.50      0
x5     0.07   -0.42      1
x6     0.03    0.91      0
x7     0.05    0.09      1
x8    -0.83    0.22      0
Plotting the data points, we see that the labels seem to change with the distance from the origin. A way to capture this is to apply a quadratic feature transform:
φ(x1, x2) = (x1², x2²)
Φ = [[1, (−0.95)², 0.62²],
     [1, 0.63², 0.31²],
     [1, (−0.12)², (−0.21)²],
     [1, (−0.24)², (−0.5)²],
     [1, 0.07², (−0.42)²],
     [1, 0.03², 0.91²],
     [1, 0.05², 0.09²],
     [1, (−0.83)², 0.22²]],   z = (0, 0, 1, 0, 1, 0, 1, 0)^T
w = (Φ^T Φ)^{-1} Φ^T z = (0.817, −0.865, −0.95)^T

so the predicted surface is ẑ = 0.817 − 0.865 x1² − 0.95 x2².
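The fit can be reproduced with NumPy; this sketch rebuilds Φ from the table above and solves the least-squares problem:

```python
import numpy as np

# (y1, y2) observations and labels from exercise 4
Y = np.array([[-0.95, 0.62], [0.63, 0.31], [-0.12, -0.21], [-0.24, -0.50],
              [0.07, -0.42], [0.03, 0.91], [0.05, 0.09], [-0.83, 0.22]])
z = np.array([0, 0, 1, 0, 1, 0, 1, 0], dtype=float)

# Quadratic feature transform: phi(y1, y2) = (1, y1^2, y2^2)
Phi = np.column_stack([np.ones(len(Y)), Y[:, 0] ** 2, Y[:, 1] ** 2])

# Ordinary least squares on the transformed features
w, *_ = np.linalg.lstsq(Phi, z, rcond=None)
print(w)  # approximately (0.817, -0.865, -0.95)
```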
Φ2 = [[1, 9],
      [1, 16],
      [1, 36],
      [1, 100],
      [1, 144]],   Z = (1.5, 9.3, 23.4, 45.8, 60.1)^T

w2 = (Φ2^T Φ2)^{-1} Φ2^T Z = (2.7895, 0.4136)^T

yielding the model ẑ = 2.7895 + 0.4136 x².
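This fit can also be checked with NumPy; a sketch assuming the original inputs are x = (3, 4, 6, 10, 12), whose squares form the second column of Φ2:

```python
import numpy as np

# Inputs x (inferred from the squared column of Phi2) and targets Z
x = np.array([3.0, 4.0, 6.0, 10.0, 12.0])
Z = np.array([1.5, 9.3, 23.4, 45.8, 60.1])

# Design matrix with columns [1, x^2]: the model is linear in x^2
Phi2 = np.column_stack([np.ones_like(x), x ** 2])

w2, *_ = np.linalg.lstsq(Phi2, Z, rcond=None)
print(w2)  # approximately (2.7895, 0.4136)
```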
b) Which one minimizes the sum of squared errors on the original training data?
Programming quests
7. Consider the housing dataset available at https://web.ist.utl.pt/~rmch/dscience/data/housing.arff and
the Regression notebook available at the course’s webpage:
a) Compare the coefficient of determination (R²) of the non-regularized, Lasso and Ridge linear regressions.
b) Compare the MAE and RMSE of linear, kNN and decision tree regressors on the housing data.
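The metrics used in both sub-questions can be computed with NumPy alone; the arrays below are illustrative placeholders, not results from the housing dataset:

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Placeholder values; in the exercise, y_pred would come from each fitted regressor.
y_true = np.array([2.0, 3.0, 5.0, 7.0])
y_pred = np.array([2.5, 2.5, 5.0, 8.0])
```

In the exercise itself, y_true would be the housing targets and y_pred the predictions of each fitted regressor, so the three metrics can be compared side by side.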