Homework 2
Tran Anh Vu - V202100569
October 21, 2024
1 Perceptron
1.1 Exercise 1a
- Based on the diagram of the data distribution, the dataset is linearly separable. Therefore, we can train a Perceptron to classify it perfectly.
- Initially, w0 = [0, −1] and b0 = 1/2. The labels (consistent with the updates below) are y1 = −1 for x1 = [0, 0] and y = +1 for the other three points. We iterate through each sample.
- For iteration 1:
• x1 = [0, 0] =⇒ a = w0ᵀ x1 + b0 = 1/2. So a·y < 0, which implies an incorrect prediction. We update w1 = w0 + y·x1 = [0, −1] and b1 = b0 + y = −1/2.
• x2 = [0, 1] =⇒ a = w1ᵀ x2 + b1 = −3/2. So a·y < 0, which implies an incorrect prediction. We update w2 = w1 + y·x2 = [0, 0] and b2 = b1 + y = 1/2.
• x3 = [1, 0] =⇒ a = w2ᵀ x3 + b2 = 1/2. So a·y > 0, which implies a correct prediction.
• x4 = [1, 1] =⇒ a = w2ᵀ x4 + b2 = 1/2. So a·y > 0, which implies a correct prediction.
- For iteration 2:
• x1 = [0, 0] =⇒ a = w2ᵀ x1 + b2 = 1/2. So a·y < 0, which implies an incorrect prediction. We update w3 = w2 + y·x1 = [0, 0] and b3 = b2 + y = −1/2.
• x2 = [0, 1] =⇒ a = w3ᵀ x2 + b3 = −1/2. So a·y < 0, which implies an incorrect prediction. We update w4 = w3 + y·x2 = [0, 1] and b4 = b3 + y = 1/2.
• x3 = [1, 0] =⇒ a = w4ᵀ x3 + b4 = 1/2. So a·y > 0, which implies a correct prediction.
• x4 = [1, 1] =⇒ a = w4ᵀ x4 + b4 = 3/2. So a·y > 0, which implies a correct prediction.
- For iteration 3:
• x1 = [0, 0] =⇒ a = w4ᵀ x1 + b4 = 1/2. So a·y < 0, which implies an incorrect prediction. We update w5 = w4 + y·x1 = [0, 1] and b5 = b4 + y = −1/2.
• x2 = [0, 1] =⇒ a = w5ᵀ x2 + b5 = 1/2. So a·y > 0, which implies a correct prediction.
• x3 = [1, 0] =⇒ a = w5ᵀ x3 + b5 = −1/2. So a·y < 0, which implies an incorrect prediction. We update w6 = w5 + y·x3 = [1, 1] and b6 = b5 + y = 1/2.
• x4 = [1, 1] =⇒ a = w6ᵀ x4 + b6 = 5/2. So a·y > 0, which implies a correct prediction.
- For iteration 4:
• x1 = [0, 0] =⇒ a = w6ᵀ x1 + b6 = 1/2. So a·y < 0, which implies an incorrect prediction. We update w7 = w6 + y·x1 = [1, 1] and b7 = b6 + y = −1/2.
• x2 = [0, 1] =⇒ a = w7ᵀ x2 + b7 = 1/2. So a·y > 0, which implies a correct prediction.
• x3 = [1, 0] =⇒ a = w7ᵀ x3 + b7 = 1/2. So a·y > 0, which implies a correct prediction.
• x4 = [1, 1] =⇒ a = w7ᵀ x4 + b7 = 3/2. So a·y > 0, which implies a correct prediction.
A final pass over x1 confirms convergence: a = w7ᵀ x1 + b7 = −1/2, so a·y = 1/2 > 0, and all four points are now classified correctly. Therefore, the perfect classifier for the dataset is the Perceptron with w∗ = [1, 1] and b∗ = −1/2.
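To sanity-check the trace, here is a minimal Python sketch (my own, not part of the original solution) that replays the same updates starting from w0 = [0, −1], b0 = 1/2, with the labels stated above (y1 = −1, y2 = y3 = y4 = +1); variable names are mine.

import numpy as np

# Dataset and labels from the hand computation above
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1.0, 1.0, 1.0, 1.0])

w = np.array([0.0, -1.0])  # w0
b = 0.5                    # b0

converged = False
while not converged:
    converged = True
    for xi, yi in zip(X, y):
        a = w @ xi + b
        if a * yi <= 0:      # misclassified sample
            w += yi * xi     # perceptron update: w <- w + y x
            b += yi          # b <- b + y
            converged = False

print(w, b)  # expected: [1. 1.] -0.5, matching w* and b* above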
1.2 Exercise 1b
Assume that we can find a Perceptron that perfectly classifies the dataset, with parameters w∗ = [w1, w2] and b∗ = b. These parameters must satisfy the following system of inequalities, one per data point:
b < 0    (1)
w1 + b ≥ 0    (2)
w2 + b ≥ 0    (3)
w1 + w2 + b < 0    (4)
Adding (1) and (4) gives w1 + w2 + 2b < 0, while adding (2) and (3) gives w1 + w2 + 2b ≥ 0, which is a contradiction. Therefore, we cannot find a Perceptron that perfectly fits the dataset.
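The proof above is what establishes impossibility; purely as an illustration, the following sketch (my own, with the XOR-style labels assumed from the exercise) runs the same perceptron update rule and observes that it never stops making mistakes.

import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1.0, 1.0, 1.0, -1.0])  # assumed XOR labels

w = np.zeros(2)
b = 0.0
for epoch in range(1000):
    mistakes = 0
    for xi, yi in zip(X, y):
        if yi * (w @ xi + b) <= 0:  # misclassified
            w += yi * xi
            b += yi
            mistakes += 1
    if mistakes == 0:
        print(f"converged after epoch {epoch}")
        break
else:
    print("no convergence in 1000 epochs")  # expected, per the proof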
2 Linear Regression
2.1 Exercise 2a
The loss function is given as:
L(w) = ∥Xw − y∥² = (Xw − y)ᵀ(Xw − y) = wᵀXᵀXw − 2yᵀXw + yᵀy

∂L(w)/∂w = 2XᵀXw − 2Xᵀy = 2Xᵀ(Xw − y)
This is the derivative of the loss function with respect to w.
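As a quick numerical check (my own sketch, not part of the submission), the analytic gradient 2Xᵀ(Xw − y) can be compared against a central finite-difference estimate on random data:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
y = rng.normal(size=10)
w = rng.normal(size=3)

def loss(w):
    r = X @ w - y
    return r @ r  # ||Xw - y||^2

grad_analytic = 2 * X.T @ (X @ w - y)

eps = 1e-6
grad_numeric = np.array([
    (loss(w + eps * e) - loss(w - eps * e)) / (2 * eps)
    for e in np.eye(3)  # one basis direction per coordinate
])

print(np.allclose(grad_analytic, grad_numeric, atol=1e-4))  # True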
2.2 Exercise 2b
From Exercise 2a, we have the derivative:

∂L(w)/∂w = 2Xᵀ(Xw − y)
Setting this equal to zero:

Xᵀ(Xw∗ − y) = 0
=⇒ XᵀXw∗ = Xᵀy
=⇒ w∗ = (XᵀX)⁻¹Xᵀy

assuming XᵀX is invertible, which holds when X has full column rank.
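A small sketch (my own, assuming X has full column rank so that XᵀX is invertible) comparing the closed form against numpy's least-squares solver:

import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)

# Solve the normal equations X^T X w = X^T y without forming an explicit inverse
w_closed = np.linalg.solve(X.T @ X, X.T @ y)
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_closed, w_lstsq))  # True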
2.3 Exercise 2c
We have:
L_ridge(w) = ∥Xw − y∥² + λ∥w∥² = (Xw − y)ᵀ(Xw − y) + λwᵀw
Taking the derivative with respect to w and setting it to zero (the common factor of 2 cancels):

Xᵀ(Xw∗ − y) + λw∗ = 0
=⇒ XᵀXw∗ + λw∗ = Xᵀy
=⇒ (XᵀX + λI)w∗ = Xᵀy
For λ > 0, XᵀX + λI is positive definite, because XᵀX is positive semidefinite (wᵀXᵀXw = ∥Xw∥² ≥ 0 for all w). Therefore, XᵀX + λI is invertible, and the solution is:

w∗ = (XᵀX + λI)⁻¹Xᵀy
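A sketch (my own, with arbitrary random data and λ = 0.5) verifying that the closed form satisfies the stationarity condition Xᵀ(Xw∗ − y) + λw∗ = 0:

import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)
lam = 0.5

w_star = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
residual = X.T @ (X @ w_star - y) + lam * w_star  # should vanish
print(np.allclose(residual, 0))  # True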
2.4 Exercise 2d
Let X∗ and y∗ be the feature matrix and label vector of a new dataset such that ordinary least squares regression on (X∗, y∗) yields the same objective as L2-regularized regression on (X, y). In other words, we want:

∥X∗w − y∗∥² = ∥Xw − y∥² + λ∥w∥²

This is achieved by stacking the scaled identity √λ·I (d × d, where d is the number of features) below X and appending d zeros to y:

X∗ = [X ; √λ·I],  y∗ = [y ; 0]

since then ∥X∗w − y∗∥² = ∥Xw − y∥² + ∥√λ·w∥² = ∥Xw − y∥² + λ∥w∥². Hence, we can augment the original dataset (X, y) into the augmented dataset (X∗, y∗) to achieve the same effect as L2 regularization while using ordinary least squares regression, as checked in the sketch below.
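A sketch (my own, same assumed random-data setup as above) checking that ordinary least squares on the augmented dataset reproduces the ridge solution:

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)
lam = 0.5
d = X.shape[1]

# X* stacks sqrt(lambda) I below X; y* appends d zeros to y
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(d)])
y_aug = np.concatenate([y, np.zeros(d)])

w_ols_aug, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
print(np.allclose(w_ols_aug, w_ridge))  # True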
3 Coding Questions
My code is documented with inline comments.