Lab09 Assignment
Outline
• 1 - Packages
• 2 - Neural Networks
– 2.1 Problem Statement
– 2.2 Dataset
– 2.3 Model representation
– 2.4 Tensorflow Model Implementation
• Exercise 1
– 2.5 NumPy Model Implementation (Forward Prop in NumPy)
• Exercise 2
– 2.6 Vectorized NumPy Model Implementation (Optional)
• Exercise 3
– 2.7 Congratulations!
– 2.8 NumPy Broadcasting Tutorial (Optional)
NOTE: To prevent errors from the autograder, you are not allowed to edit or delete non-graded
cells in this notebook. Please also refrain from adding any new cells. Once you have passed this
assignment and want to experiment with any of the non-graded code, you may follow the
instructions at the bottom of this notebook.
1 - Packages
First, let's run the cell below to import all the packages that you will need during this
assignment.
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import logging
logging.getLogger("tensorflow").setLevel(logging.ERROR)
tf.autograph.set_verbosity(0)
2 - Neural Networks
In Course 1, you implemented logistic regression. This was extended to handle non-linear
decision boundaries by adding polynomial features. For even more complex scenarios such as
image recognition, neural networks are preferred.
This exercise will show you how the methods you have learned can be used for this classification
task.
2.2 Dataset
You will start by loading the dataset for this task.
• The load_data() function shown below loads the data into variables X and y
• The data set contains 1000 training examples of handwritten digits, here limited to zero
and one.
– Each training example is a 20-pixel x 20-pixel grayscale image of the digit.
• Each pixel is represented by a floating-point number indicating the
grayscale intensity at that location.
• The 20 by 20 grid of pixels is “unrolled” into a 400-dimensional vector.
• Each training example becomes a single row in our data matrix X.
• This gives us a 1000 x 400 matrix X where every row is a training example
of a handwritten digit image.
$$X = \begin{pmatrix}
--- \left( x^{(1)} \right) --- \\
--- \left( x^{(2)} \right) --- \\
\vdots \\
--- \left( x^{(m)} \right) ---
\end{pmatrix}$$
• The second part of the training set is a 1000 x 1 dimensional vector y that contains labels
for the training set
– y = 0 if the image is of the digit 0, y = 1 if the image is of the digit 1.
# load dataset
X, y = load_data()
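A quick sanity check of the shapes described above (a minimal sketch, not one of the notebook's graded cells):

print('The shape of X is: ' + str(X.shape))   # expect (1000, 400)
print('The shape of y is: ' + str(y.shape))   # expect (1000, 1)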
• A good place to start is to print out each variable and see what it contains.
• In the cell below, the code randomly selects 64 rows from X, maps each row back to a 20
pixel by 20 pixel grayscale image and displays the images together.
• The label for each image is displayed above the image
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell
m, n = X.shape
• layer1: The shape of W1 is (400, 25) and the shape of b1 is (25,)
• layer2: The shape of W2 is (25, 15) and the shape of b2 is (15,)
• layer3: The shape of W3 is (15, 1) and the shape of b3 is (1,)
Note: It is also possible to add an input layer that specifies the input dimension of the
first layer. For example:
tf.keras.Input(shape=(400,)), #specify input shape
We will include that here to illuminate some model sizing.
Exercise 1
Below, use the Keras Sequential model and Dense layer with a sigmoid activation to construct the
network described above.
# UNQ_C1
# GRADED CELL: Sequential model

model = Sequential(
    [
        tf.keras.Input(shape=(400,)),    # Specify input size
        # Add the three Dense layers described above here
    ], name="my_model"
)

model.summary()
Model: "my_model"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Layer (type)              ┃ Output Shape          ┃   Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ dense (Dense)             │ (None, 25)            │    10,025 │
├───────────────────────────┼───────────────────────┼───────────┤
│ dense_1 (Dense)           │ (None, 15)            │       390 │
├───────────────────────────┼───────────────────────┼───────────┤
│ dense_2 (Dense)           │ (None, 1)             │        16 │
└───────────────────────────┴───────────────────────┴───────────┘
Expected Output (Click to Expand)
The model.summary() function displays a useful summary of the model. Because we have
specified an input layer size, the shapes of the weight and bias arrays are determined and the
total number of parameters per layer can be shown. Note, the names of the layers may vary as
they are auto-generated.
Model: "my_model"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 25) 10025
_________________________________________________________________
dense_1 (Dense) (None, 15) 390
_________________________________________________________________
dense_2 (Dense) (None, 1) 16
=================================================================
Total params: 10,431
Trainable params: 10,431
Non-trainable params: 0
_________________________________________________________________
model = Sequential(
    [
        tf.keras.Input(shape=(400,)),    # specify input size (optional)
        Dense(25, activation='sigmoid'),
        Dense(15, activation='sigmoid'),
        Dense(1,  activation='sigmoid')
    ], name = "my_model"
)
# UNIT TESTS
from public_tests import *
test_c1(model)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-a0cf083b9265> in <cell line: 3>()
      1 # UNIT TESTS
      2 from public_tests import *
----> 3 test_c1(model)

/content/public_tests.py in test_c1(target)
      8     assert len(target.layers) == 3, \
      9         f"Wrong number of layers. Expected 3 but got {len(target.layers)}"
---> 10     assert target.input.shape.as_list() == [None, 400], \
     11         f"Wrong input shape. Expected [None, 400] but got {target.input.shape.as_list()}"
     12     i = 0

/usr/local/lib/python3.10/dist-packages/keras/src/ops/operation.py in input(self)
    252         Input tensor or list of input tensors.
    253         """
--> 254         return self._get_node_attribute_at_index(0, "input_tensors", "input")
    255
    256     @property

/usr/local/lib/python3.10/dist-packages/keras/src/ops/operation.py in _get_node_attribute_at_index(self, node_index, attr, attr_name)
    283         """
    284         if not self._inbound_nodes:
--> 285             raise ValueError(
    286                 f"The layer {self.name} has never been called "
    287                 f"and thus has no defined {attr_name}."

ValueError: The layer my_model has never been called and thus has no defined input.
The parameter counts shown in the summary correspond to the number of elements in the
weight and bias arrays as shown below.
We can examine details of the model by first extracting the layers with model.layers and then
extracting the weights with layerx.get_weights() as shown below.
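As a reference, a minimal sketch of that inspection (unpacking model.layers into layer1, layer2, layer3, the same names used again in the NumPy section below); note how the parameter counts in the summary follow directly from these shapes:

[layer1, layer2, layer3] = model.layers
W1, b1 = layer1.get_weights()   # W1: (400, 25), b1: (25,) -> 400*25 + 25 = 10,025 params
W2, b2 = layer2.get_weights()   # W2: (25, 15),  b2: (15,) -> 25*15 + 15  = 390 params
W3, b3 = layer3.get_weights()   # W3: (15, 1),   b3: (1,)  -> 15*1  + 1   = 16 params
print(f"W1 shape = {W1.shape}, b1 shape = {b1.shape}")
print(f"W2 shape = {W2.shape}, b2 shape = {b2.shape}")
print(f"W3 shape = {W3.shape}, b3 shape = {b3.shape}")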
Expected Output
print(model.layers[2].weights)
The following code will define a loss function and run gradient descent to fit the weights of the
model to the training data. This will be explained in more detail in the following week.
model.compile(
loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(0.001),
)
model.fit(
X,y,
epochs=20
)
Epoch 1/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.4844
Epoch 2/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.1168
Epoch 3/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0355
Epoch 4/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0176
Epoch 5/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0102
Epoch 6/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0059
Epoch 7/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0090
Epoch 8/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0048
Epoch 9/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0042
Epoch 10/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0043
Epoch 11/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0024
Epoch 12/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0067
Epoch 13/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0015
Epoch 14/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0022
Epoch 15/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0014
Epoch 16/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0020
Epoch 17/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 9.1903e-04
Epoch 18/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 8.8541e-04
Epoch 19/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 9.5468e-04
Epoch 20/20
32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 7.5980e-04
<keras.src.callbacks.history.History at 0x7b8f48a7e9b0>
To run the model on an example to make a prediction, use Keras predict. The input to
predict is an array so the single example is reshaped to be two dimensional.
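A sketch of those two prediction calls (the indices X[0] and X[500] are assumptions, chosen to match the zero and one examples used with the NumPy model later in this notebook):

prediction = model.predict(X[0].reshape(1, 400))    # a zero
print(f" predicting a zero: {prediction}")
prediction = model.predict(X[500].reshape(1, 400))  # a one
print(f" predicting a one:  {prediction}")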
The output of the model is interpreted as a probability. In the first example above, the input is a
zero. The model predicts the probability that the input is a one is nearly zero. In the second
example, the input is a one. The model predicts the probability that the input is a one is nearly
one. As in the case of logistic regression, the probability is compared to a threshold to make a
final prediction.
Let's compare the predictions vs the labels for a random sample of 64 digits. This takes a
moment to run.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell
m, n = X.shape
Exercise 2
Below, build a dense layer subroutine. The example in lecture utilized a for loop to visit each unit
(j) in the layer and perform the dot product of the weights for that unit (W[:,j]) and sum the
bias for the unit (b[j]) to form z. An activation function g(z) is then applied to that result. This
section will not utilize some of the matrix operations described in the optional lectures. These
will be explored in a later section.
# UNQ_C2
# GRADED FUNCTION: my_dense
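The body of the graded cell is yours to complete. For reference, a minimal sketch that follows the loop-based description above (one possible implementation, not necessarily the graded solution; sigmoid is assumed to be provided by the lab's helper code):

def my_dense(a_in, W, b, g):
    # a_in: (n,) input to the layer, W: (n, j) weight matrix, b: (j,) bias, g: activation
    units = W.shape[1]
    a_out = np.zeros(units)
    for j in range(units):
        z = np.dot(a_in, W[:, j]) + b[j]   # weighted sum plus bias for unit j
        a_out[j] = g(z)                    # apply the activation
    return a_out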
# Quick Check
x_tst = 0.1*np.arange(1,3,1).reshape(2,)   # (1 example, 2 features)
W_tst = 0.1*np.arange(1,7,1).reshape(2,3)  # (2 input features, 3 output features)
b_tst = 0.1*np.arange(1,4,1).reshape(3,)   # (3 features)
A_tst = my_dense(x_tst, W_tst, b_tst, sigmoid)
print(A_tst)
Expected Output
# UNIT TESTS
test_c2(my_dense)
The following cell builds a three-layer neural network utilizing the my_dense subroutine above.
def my_sequential(x, W1, b1, W2, b2, W3, b3):
a1 = my_dense(x, W1, b1, sigmoid)
a2 = my_dense(a1, W2, b2, sigmoid)
a3 = my_dense(a2, W3, b3, sigmoid)
return(a3)
W1_tmp,b1_tmp = layer1.get_weights()
W2_tmp,b2_tmp = layer2.get_weights()
W3_tmp,b3_tmp = layer3.get_weights()
# make predictions
prediction = my_sequential(X[0], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp)
if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print("yhat = ", yhat, " label= ", y[0,0])
prediction = my_sequential(X[500], W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp)
if prediction >= 0.5:
    yhat = 1
else:
    yhat = 0
print("yhat = ", yhat, " label= ", y[500,0])
yhat = 0 label= 0
yhat = 0 label= 1
Run the following cell to see predictions from both the Numpy model and the Tensorflow
model. This takes a moment to run.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell
m, n = X.shape
We can demonstrate this using the examples X and the W1,b1 parameters above. We use
np.matmul to perform the matrix multiply. Note, the dimensions of x and W must be
compatible as shown in the diagram above.
x = X[0].reshape(-1,1) # column vector (400,1)
z1 = np.matmul(x.T,W1) + b1 # (1,400)(400,25) = (1,25)
a1 = sigmoid(z1)
print(a1.shape)
(1, 25)
You can take this a step further and compute all the units for all examples in one Matrix-Matrix
operation.
The full operation is $Z = XW + b$. This will utilize NumPy broadcasting to expand $b$ to $m$ rows.
If this is unfamiliar, a short tutorial is provided at the end of the notebook.
Exercise 3
Below, compose a new my_dense_v subroutine that performs the layer calculations for a
matrix of examples. This will utilize np.matmul().
Note: This function is not graded because it is discussed in the optional lectures on
vectorization. If you didn't go through them, feel free to click the hints below the expected code
to see the code. You can also submit the notebook even with a blank answer here.
# UNQ_C3
# UNGRADED FUNCTION: my_dense_v
Expected Output
Click for hints: In matrix form, this can be written in one or two lines.
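One possible sketch consistent with $Z = XW + b$ above (an assumption about the intended hint; broadcasting adds b to every row of the (m, j) product):

def my_dense_v(A_in, W, b, g):
    # A_in: (m, n) matrix of examples, W: (n, j) weights, b: (j,) bias, g: activation
    return g(np.matmul(A_in, W) + b)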
# UNIT TESTS
test_c3(my_dense_v)
The following cell builds a three-layer neural network utilizing the my_dense_v subroutine
above.
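As a reference, a sketch of that cell paralleling my_sequential above but operating on the full matrix of examples (the exact contents are an assumption):

def my_sequential_v(X, W1, b1, W2, b2, W3, b3):
    A1 = my_dense_v(X,  W1, b1, sigmoid)
    A2 = my_dense_v(A1, W2, b2, sigmoid)
    A3 = my_dense_v(A2, W3, b3, sigmoid)
    return(A3)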
W1_tmp,b1_tmp = layer1.get_weights()
W2_tmp,b2_tmp = layer2.get_weights()
W3_tmp,b3_tmp = layer3.get_weights()
Let's make a prediction with the new model. This will make a prediction on all of the examples at
once. Note the shape of the output.
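A sketch of that prediction cell (names assumed, matching the parameters extracted above); the shape printed below is its output:

Prediction = my_sequential_v(X, W1_tmp, b1_tmp, W2_tmp, b2_tmp, W3_tmp, b3_tmp)
print(Prediction.shape)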
(1000, 1)
Run the following cell to see predictions. This will use the predictions we just calculated above.
This takes a moment to run.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell
m, n = X.shape
for i, ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)
2.7 Congratulations!
You have successfully built and utilized a neural network.
2.8 NumPy Broadcasting Tutorial (Optional)
When operating on two arrays, NumPy compares their shapes element-wise. It starts with the
trailing (i.e. rightmost) dimensions and works its way left. Two dimensions are compatible when
• they are equal, or
• one of them is 1.
If these conditions are not met, a ValueError: operands could not be broadcast together
exception is thrown, indicating that the arrays have incompatible shapes. The size of the
resulting array is the size that is not 1 along each axis of the inputs.
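A small sketch of the error case (shapes chosen purely for illustration):

try:
    a = np.array([1,2,3,4]).reshape(-1,1)    # (4,1)
    c = np.array([[1,2,3],[4,5,6]])          # (2,3): the leading dims 4 and 2 don't match
    print(a + c)
except ValueError as e:
    print(e)    # operands could not be broadcast together ...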
The graphic below describes expanding dimensions. Note the red text below:
The graphic above shows NumPy expanding the arguments to match before the final operation.
Note that this is a notional description. The actual mechanics of NumPy operation choose the
most efficient implementation.
For each of the following examples, try to guess the size of the result before running the
example.
a = np.array([1,2,3]).reshape(-1,1) #(3,1)
b = 5
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")
a = np.array([1,2,3]).reshape(-1,1) #(3,1)
b = 5
print(f"(a * b).shape: {(a * b).shape}, \na * b = \n{a * b}")
a = np.array([1,2,3,4]).reshape(-1,1)
b = np.array([1,2,3]).reshape(1,-1)
print(a)
print(b)
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")
[[1]
[2]
[3]
[4]]
[[1 2 3]]
(a + b).shape: (4, 3),
a + b =
[[2 3 4]
[3 4 5]
[4 5 6]
[5 6 7]]
This is the scenario in the dense layer you built above: adding a 1-D vector b to an (m,j) matrix.
Matrix + 1-D Vector
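For example, a small sketch of that case (values chosen for illustration):

a = np.array([[1,2,3],[4,5,6]])   # (2,3) matrix
b = np.array([10,20,30])          # (3,) 1-D vector, broadcast across each row
print(f"(a + b).shape: {(a + b).shape}, \na + b = \n{a + b}")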