
Lab 4: Jupyter Notebook practical

Class name: __

Student code: __

Student name: ___

Objectives
After completing this lab you will be able to:

Use Markdown to present and format documents

Use headings in Markdown
Use images in Markdown
Use links in Markdown
Use math equations in Markdown
Format text (bold, italic, ...)

Estimated time needed: 60 minutes

Anyone who copies or shares the code will receive 1 point.

Training Models


So far we have treated Machine Learning models and their training algorithms mostly like black
boxes. If you went through some of the exercises in the previous chapters, you may have been
surprised by how much you can get done without knowing anything about what's under the
hood: you optimized a regression system, you improved a digit image classifier, and you even
built a spam classifier from scratch—all this without knowing how they actually work. Indeed, in
many situations you don’t really need to know the implementation details. However, having a
good understanding of how things work can help you quickly home in on the appropriate model,
the right training algorithm to use, and a good set of hyperparameters for your task.
Understanding what’s under the hood will also help you debug issues and perform error
analysis more efficiently. Lastly, most of the topics discussed in this chapter will be essential in
understanding, building, and training neural networks (discussed in another part of this book). In
this chapter, we will start by looking at the Linear Regression model, one of the simplest models
there is. We will discuss two very different ways to train it:

Using a direct closed-form equation that directly computes the model parameters that
best fit the model to the training set (i.e., the model parameters that minimize the cost
function over the training set).
Using an iterative optimization approach, called Gradient Descent (GD), that
gradually tweaks the model parameters to minimize the cost function over the training set,
eventually converging to the same set of parameters as the first method (a minimal sketch
follows this list). We will look at a few variants of Gradient Descent that we will use again
and again when we study neural networks in Part II: Batch GD, Mini-batch GD, and Stochastic GD.
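As a preview only (this lab itself uses the closed-form solution below), a minimal Batch Gradient Descent loop for Linear Regression might look like the sketch that follows. The learning rate eta, the iteration count, the seed, and the synthetic data are illustrative assumptions, not part of the lab instructions:

import numpy as np

np.random.seed(42)                       # illustrative seed for reproducibility
X = 2 * np.random.rand(100, 1)           # same kind of synthetic data as in Section 4
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]        # add x0 = 1 to each instance

eta = 0.1                                # learning rate (a hand-picked, assumed value)
n_iterations = 1000                      # number of gradient steps (assumed)
m = 100                                  # number of training instances

theta = np.random.randn(2, 1)            # random initialization of the parameters
for _ in range(n_iterations):
    gradients = 2 / m * X_b.T.dot(X_b.dot(theta) - y)   # gradient of the MSE cost
    theta = theta - eta * gradients                      # step against the gradient
print(theta)                             # should end up close to [[4.], [3.]]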

Next we will look at Polynomial Regression, a more complex model that can fit nonlinear
datasets. Since this model has more parameters than Linear Regression, it is more prone to
overfitting the training data, so we will look at how to detect whether or not this is the case, using
learning curves, and then we will look at several regularization techniques that can reduce the
risk of overfitting the training set. Finally, we will look at two more models that are commonly
used for classification tasks: Logistic Regression and Softmax Regression.

1. Linear Regression
In chapter 1, we looked at a simple regression model of life satisfaction:

$$\text{life\_satisfaction} = \theta_0 + \theta_1 \times \text{GDP\_per\_capita} \tag{1}$$

This model is just a linear function of the input feature GDP_per_capita. $\theta_0$ and $\theta_1$ are the
model's parameters. More generally, a linear model makes a prediction by simply
computing a weighted sum of the input features, plus a constant called the bias term (also
called the intercept term), as shown in Equation 4-1.

Equation 4-1. Linear Regression model prediction

$$\hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n \tag{2}$$

This can be written much more concisely using a vectorized form, as shown in Equation 4-2.
Equation 4-2. Linear Regression model prediction (vectorized form)

$$\hat{y} = h_{\boldsymbol{\theta}}(\mathbf{x}) = \boldsymbol{\theta} \cdot \mathbf{x} \tag{3}$$
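For example (the numbers here are made up purely for illustration), the vectorized prediction is just a dot product of the parameter vector with the feature vector:

import numpy as np

theta = np.array([4.0, 3.0])   # [theta_0 (bias term), theta_1] -- illustrative values only
x = np.array([1.0, 1.5])       # feature vector with x0 = 1 prepended
y_hat = theta.dot(x)           # weighted sum of the features plus the bias term
print(y_hat)                   # 8.5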
2. How to train?
Okay, that’s the Linear Regression model, so now how do we train it? Well, recall that training a
model means setting its parameters so that the model best fits the training set. For this purpose,
we first need a measure of how well (or poorly) the model fits the training data. In chapter 2 we
saw that the most common performance measure of a regression model is the Root Mean
Square Error (RMSE) in Equation 2-1. Therefore, to train a Linear Regression model, you need
to find the value of θ that minimizes the RMSE. In practice, it is simpler to minimize the Mean
Square Error (MSE) than the RMSE, and it leads to the same result (because the value that
minimizes a function also minimizes its square root). The MSE of a Linear Regression
hypothesis $h_\theta$ on a training set X is calculated using Equation 4-3.

Equation 4-3. MSE cost function for a Linear Regression model

$$\mathrm{MSE}(\mathbf{X}, h_\theta) = \frac{1}{m}\sum_{i=1}^{m}\left(\boldsymbol{\theta}^{T}\mathbf{x}^{(i)} - y^{(i)}\right)^{2} \tag{4}$$
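Equation 4-3 translates almost directly into NumPy. The sketch below is illustrative only; X_b is assumed to already include the bias column of ones:

import numpy as np

def mse_cost(X_b, y, theta):
    # Mean Squared Error of the linear hypothesis h_theta over the training set
    m = len(y)
    errors = X_b.dot(theta) - y      # prediction errors, one per training instance
    return (errors ** 2).sum() / m   # average of the squared errors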
3. The Normal Equation

To find the value of θ that minimizes the cost function, there is a closed-form solution — in
other words, a mathematical equation that gives the result directly. This is called the
Normal Equation (Equation 4-4).

Equation 4-4. Normal Equation

$$\hat{\boldsymbol{\theta}} = \left(\mathbf{X}^{T}\mathbf{X}\right)^{-1}\mathbf{X}^{T}\mathbf{y} \tag{5}$$

4. Practical
Let’s generate some linear-looking data to test this equation on (Figure 4-1):

import numpy as np

X = 2 * np.random.rand(100, 1)            # 100 random feature values in [0, 2)
y = 4 + 3 * X + np.random.randn(100, 1)   # linear target plus Gaussian noise
Figure 4-1. Randomly generated linear dataset
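Because rand and randn draw fresh random numbers on every run, the exact values you obtain below will differ slightly from the ones printed here; if you want reproducible results you can seed NumPy's generator before generating the data (the seed value is an arbitrary choice):

np.random.seed(42)   # optional: make the random draws reproducible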

Now let's compute $\hat{\boldsymbol{\theta}}$ using the Normal Equation. We will use the inv() function from
NumPy's Linear Algebra module (np.linalg) to compute the inverse of a matrix, and
the dot() method for matrix multiplication:

X_b = np.c_[np.ones((100, 1)), X]   # add x0 = 1 to each instance
theta_best = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)

The actual function that we used to generate the data is $y = 4 + 3x_1 + \text{Gaussian noise}$. Let's see what the equation found:

>>> theta_best
array([[4.21509616],
[2.77011339]])
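As an optional sanity check (not part of the lab as written), the same least-squares solution can be computed with np.linalg.lstsq, which avoids explicitly inverting the matrix:

theta_best_svd, residuals, rank, s = np.linalg.lstsq(X_b, y, rcond=None)
print(theta_best_svd)   # should match theta_best from the Normal Equation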

5. Prediction
Now you can make predictions using theta_best:

>>> X_new = np.array([[0], [2]])
>>> X_new_b = np.c_[np.ones((2, 1)), X_new]   # add x0 = 1 to each instance
>>> y_predict = X_new_b.dot(theta_best)
>>> y_predict
array([[4.21509616],
       [9.75532293]])
Let’s plot this model’s predictions (Figure 4-2):

import matplotlib.pyplot as plt

plt.plot(X_new, y_predict, "r-")   # the model's predictions (red line)
plt.plot(X, y, "b.")               # the training data (blue dots)
plt.axis([0, 2, 0, 15])
plt.show()
Figure 4-2. Linear Regression model predictions
