Practical Work N. 3 (Travaux Pratiques N. 3) : Introduction To Machine Learning: Application To Geosciences
Giffard-Roisin Sophie M1 2022-23
2.1 Question 2
Let’s first have a look at the gradient descent code without running it.
a) What are the lines corresponding to the calculation of the loss gradient?
b) How are the parameters (or weights) initialized?
c) What is the size of the parameter vector w?
d) How many iterations of the for loop will be performed (if we use the default parameters of this gradient descent function)?
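For reference, here is a minimal sketch of what such a gradient descent function may look like for this two-parameter linear model (the exact code provided with the TP may differ; the argument names rate and iterations and their default values are assumptions):

import numpy as np

def gradient_descent(X, Y, rate=0.01, iterations=1000):
    # initialize the parameter vector w (one weight per feature, here 2)
    w = np.zeros(X.shape[1])
    Y = np.asarray(Y).ravel()  # make sure the target is a flat 1D array
    for it in range(iterations):
        # model prediction: f_w(x) = w1*x1 + w2*x2
        Y_pred = X @ w
        # gradient of the mean squared error loss with respect to w
        grad = -2.0 / len(Y) * X.T @ (Y - Y_pred)
        # update step
        w = w - rate * grad
    return w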
2.2 Question 3
Now call the function gradient_descent with inputs (X_transf, Y_transf) using the default rate and iterations values. The estimated parameter vector will be named w_estimated. Give in your report the values of w_estimated[0] and w_estimated[1] (i.e. w1 and w2).
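Assuming the sketch above (or the function given with the TP), the call could look like this:

w_estimated = gradient_descent(X_transf, Y_transf)  # default rate and iterations
print(w_estimated)  # two components: w1 and w2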
2.3 Question 4
Fill and launch the following code to plot the estimated linear model:
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
# first, plot the data points
ax.scatter(X_transf[:,0], X_transf[:,1], Y_transf)
# grids of X1 and X2 values that can be used to draw the estimated plane
x_data = np.arange(-1.5, 1.5, 0.1)
y_data = np.arange(-1.5, 1.5, 0.1)
# [TO FILL] plot the estimated plane f_w(x1, x2) = w1*x1 + w2*x2 over this grid
ax.set_xlabel('X1 (area)')
ax.set_ylabel('X2 (nb windows)')
ax.set_zlabel('Y (consumption)')
plt.show()
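One possible way to draw the estimated plane over that grid, to be placed just before plt.show(), assuming w_estimated from Question 3 (this is only a sketch of the missing part, using np.meshgrid and plot_surface):

xx, yy = np.meshgrid(x_data, y_data)
# estimated plane: f_w(x1, x2) = w1*x1 + w2*x2
zz = w_estimated[0] * xx + w_estimated[1] * yy
ax.plot_surface(xx, yy, zz, alpha=0.3)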
2.4 Question 5
We will make a modified version of the gradient descent function by adding some lines in order to store the values of w at each iteration (tip: fill in w_iterations using w_iterations[it] = ..). Fill in the following function:
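A minimal sketch of such a modified function, named gradient_descent_iterations as in Question 9 below (the exact signature of the skeleton to fill may differ):

def gradient_descent_iterations(X, Y, rate=0.01, iterations=1000):
    w = np.zeros(X.shape[1])
    Y = np.asarray(Y).ravel()
    # one row per iteration, to keep the whole trajectory of the parameters
    w_iterations = np.zeros((iterations, X.shape[1]))
    for it in range(iterations):
        Y_pred = X @ w
        grad = -2.0 / len(Y) * X.T @ (Y - Y_pred)
        w = w - rate * grad
        # store the current value of w
        w_iterations[it] = w
    return w, w_iterations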
2.5 Question 6
Repeat Question 5 with a different rate value: rate = 0.001. Add the plots to the report. Do the parameter values stabilize faster or slower (i.e., is convergence reached in a smaller or larger number of iterations)? (Bonus: do the same with rate = 0.1: what happens?)
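A possible way to produce these plots, assuming the gradient_descent_iterations sketch above (w_iterations has one row per iteration):

w_final, w_iterations = gradient_descent_iterations(X_transf, Y_transf, rate=0.001)
plt.figure()
plt.plot(w_iterations[:, 0], label='w1')
plt.plot(w_iterations[:, 1], label='w2')
plt.xlabel('iteration')
plt.ylabel('parameter value')
plt.legend()
plt.show()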
3 Testing
We will now estimate the consumption of 5 new houses using our trained model.
Load the test data set and transform it, in the same manner as in section 1:
dataframe_test = pd.read_csv('dataset_test_consumption.csv')
print(dataframe_test)
X_test = dataframe_test[name_featuresX].copy()
Y_test = dataframe_test.consumption.copy() # this is the ground truth to validate our predictions
X_test_transf = scaler.transform(X_test)  # reuse the scaler fitted on the training features
Y_test_transf = scaler_Y.transform(Y_test.array.reshape(-1,1))  # reuse the target scaler, which expects a 2D array
3.1 Question 7
From the values of w1 = w_estimated[0] and w2 = w_estimated[1] estimated in Question 5, calculate Y_test_transf_predicted from X_test_transf. Remember that this is a linear regression with formula: f_w(x) = w1 x1 + w2 x2. Write the values of the 5 estimated Y_test_transf_predicted.
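A minimal sketch of this computation (assuming X_test_transf holds the two standardized features in columns 0 and 1):

w1, w2 = w_estimated[0], w_estimated[1]
# linear model applied to each test house: f_w(x) = w1*x1 + w2*x2
Y_test_transf_predicted = w1 * X_test_transf[:, 0] + w2 * X_test_transf[:, 1]
print(Y_test_transf_predicted)  # the 5 predicted (standardized) consumption values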
3.2 Question 8
Let’s now transform our predictions back into kWh (’de-standardization’) and compare them with the ground-truth consumption values Y_test. Calculate the error (in kWh) for every sample, and then calculate the mean absolute error.
# back to kWh: invert the target standardization (reshape to the 2D shape expected by the scaler)
Y_test_predicted = scaler_Y.inverse_transform(Y_test_transf_predicted.reshape(-1,1)).flatten()
errors = [TO FILL]
absolute_errors = np.abs(errors)
mean_error = [TO FILL]
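One possible way to fill these lines (shown only as a sketch; Y_test is the ground-truth Series loaded above):

errors = Y_test_predicted - Y_test.to_numpy()  # signed error in kWh for each house
absolute_errors = np.abs(errors)
mean_error = absolute_errors.mean()            # mean absolute error in kWh
print(mean_error)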
4 Bonus
4.1 Question 9 (bonus)
Modify your function gradient_descent_iterations in order to calculate and store the mean squared error loss value at each iteration. Plot it for rate = 0.001 with a sufficient number of iterations so that it reaches convergence. Add the plot to your report.
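A sketch of this modification and of the loss plot, extending the gradient_descent_iterations sketch from Question 5 (the number of iterations below is only an example value):

def gradient_descent_iterations(X, Y, rate=0.01, iterations=1000):
    w = np.zeros(X.shape[1])
    Y = np.asarray(Y).ravel()
    w_iterations = np.zeros((iterations, X.shape[1]))
    losses = np.zeros(iterations)
    for it in range(iterations):
        Y_pred = X @ w
        # mean squared error loss at the current iteration
        losses[it] = np.mean((Y - Y_pred) ** 2)
        grad = -2.0 / len(Y) * X.T @ (Y - Y_pred)
        w = w - rate * grad
        w_iterations[it] = w
    return w, w_iterations, losses

w_final, w_iterations, losses = gradient_descent_iterations(X_transf, Y_transf, rate=0.001, iterations=10000)
plt.figure()
plt.plot(losses)
plt.xlabel('iteration')
plt.ylabel('mean squared error loss')
plt.show()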