# Lab 4 - Markdown Practical - Solution
Class name: __
Student code: __
## Objectives
After completing this lab you will be able to:

- Use a direct "closed-form" equation that directly computes the model parameters that best fit the model to the training set (i.e., the model parameters that minimize the cost function over the training set).
- Use an iterative optimization approach, called Gradient Descent (GD), that gradually tweaks the model parameters to minimize the cost function over the training set, eventually converging to the same set of parameters as the first method. We will look at a few variants of Gradient Descent that we will use again and again when we study neural networks in Part II: Batch GD, Mini-batch GD, and Stochastic GD.
Next we will look at Polynomial Regression, a more complex model that can fit nonlinear datasets. Since this model has more parameters than Linear Regression, it is more prone to overfitting the training data, so we will look at how to detect whether or not this is the case using learning curves, and then we will look at several regularization techniques that can reduce the risk of overfitting the training set. Finally, we will look at two more models that are commonly used for classification tasks: Logistic Regression and Softmax Regression.
## 1. Linear Regression
In Chapter 1, we looked at a simple regression model of life satisfaction:

$$\text{life\_satisfaction} = \theta_0 + \theta_1 \times \text{GDP\_per\_capita}$$

More generally, a linear model makes a prediction by computing a weighted sum of the input features, plus a constant called the bias term (Equation 4-1):

$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$$
This can be written much more concisely using a vectorized form, as shown in Equation 4-2.

Equation 4-2. Linear Regression model prediction (vectorized form)

$$\hat{y} = h_{\boldsymbol{\theta}}(\mathbf{x}) = \boldsymbol{\theta} \cdot \mathbf{x}$$
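As a quick illustration, here is a minimal sketch of this vectorized prediction as a NumPy dot product; the parameter values and the feature vector are made up for the example and are not part of the original lab:

```python
import numpy as np

theta = np.array([4.0, 3.0])   # hypothetical parameters [theta_0, theta_1]
x = np.array([1.0, 2.0])       # feature vector with x_0 = 1 (bias term), x_1 = 2

y_hat = theta.dot(x)           # theta . x = 4*1 + 3*2
print(y_hat)                   # 10.0
```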
## 2. How Do We Train It?
Okay, that’s the Linear Regression model, so now how do we train it? Well, recall that training a model means setting its parameters so that the model best fits the training set. For this purpose, we first need a measure of how well (or poorly) the model fits the training data. In Chapter 2 we saw that the most common performance measure of a regression model is the Root Mean Square Error (RMSE) in Equation 2-1. Therefore, to train a Linear Regression model, you need to find the value of θ that minimizes the RMSE. In practice, it is simpler to minimize the Mean Square Error (MSE) than the RMSE, and it leads to the same result (because the value that minimizes a function also minimizes its square root). The MSE of a Linear Regression hypothesis $h_{\boldsymbol{\theta}}$ on a training set **X** is calculated using Equation 4-3.
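For reference, this is the standard MSE cost function for a Linear Regression hypothesis (Equation 4-3 in the numbering used above):

$$\operatorname{MSE}(\mathbf{X}, h_{\boldsymbol{\theta}}) = \frac{1}{m} \sum_{i=1}^{m} \left( \boldsymbol{\theta}^{\mathsf{T}} \mathbf{x}^{(i)} - y^{(i)} \right)^2$$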
## 3. The Normal Equation

To find the value of θ that minimizes the cost function, there is a closed-form solution: in other words, a mathematical equation that gives the result directly. This is called the Normal Equation (Equation 4-4).
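For reference, the Normal Equation (Equation 4-4) has the standard closed form:

$$\hat{\boldsymbol{\theta}} = \left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1} \mathbf{X}^{\mathsf{T}} \mathbf{y}$$

where $\hat{\boldsymbol{\theta}}$ is the value of θ that minimizes the cost function and **y** is the vector of target values.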
## 4. Practical
Let’s generate some linear-looking data to test this equation on (Figure 4-1):
```python
import numpy as np

X = 2 * np.random.rand(100, 1)           # 100 instances, one feature drawn from [0, 2)
y = 4 + 3 * X + np.random.randn(100, 1)  # linear target plus Gaussian noise
```
Figure 4-1. Randomly generated linear dataset
Now let’s compute $\hat{\boldsymbol{\theta}}$ using the Normal Equation. We will use the `inv()` function from NumPy’s Linear Algebra module (`np.linalg`) to compute the inverse of a matrix, and the `dot()` method for matrix multiplication:
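The computation itself is not shown in the original; a minimal sketch, assuming the `X` and `y` arrays generated above:

```python
X_b = np.c_[np.ones((100, 1)), X]  # add x0 = 1 (bias term) to each instance
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
```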
The actual function that we used to generate the data is $y = 4 + 3x_1 + \text{Gaussian noise}$. Let’s see what the equation found:
```python
>>> theta_best
array([[4.21509616],
       [2.77011339]])
```
## 5. Prediction
Now you can make predictions using `theta_best`.
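A minimal sketch of such a prediction, assuming the `theta_best` computed above; the two query points in `X_new` are made up for the example:

```python
X_new = np.array([[0.0], [2.0]])         # hypothetical new instances
X_new_b = np.c_[np.ones((2, 1)), X_new]  # add x0 = 1 to each instance
y_predict = X_new_b.dot(theta_best)      # apply the learned parameters
print(y_predict)
```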