C1 W2 Lab02 Multiple Variable Soln
Multiple Variable Linear Regression
In this lab, you will extend the data structures and previously developed routines to support
multiple features. Several routines are updated, making the lab appear lengthy, but the changes
are minor adjustments to previous routines, so it is quick to review.
Outline
• 1.1 Goals
• 1.2 Tools
• 1.3 Notation
• 2 Problem Statement
• 2.1 Matrix X containing our examples
• 2.2 Parameter vector w, b
• 3 Model Prediction With Multiple Variables
• 3.1 Single Prediction element by element
• 3.2 Single Prediction, vector
• 4 Compute Cost With Multiple Variables
• 5 Gradient Descent With Multiple Variables
• 5.1 Compute Gradient with Multiple Variables
• 5.2 Gradient Descent With Multiple Variables
• 6 Congratulations
1.1 Goals
• Extend our regression model routines to support multiple features
– Extend data structures to support multiple features
– Rewrite prediction, cost and gradient routines to support multiple features
– Utilize NumPy np.dot to vectorize their implementations for speed and
simplicity
1.2 Tools
In this lab, we will make use of:
• NumPy, a popular library for scientific computing
• Matplotlib, a popular library for plotting data
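A minimal setup cell for these tools might look like the following sketch (your notebook's actual setup cell may differ, for example in plot styling):

import numpy as np                 # array and linear algebra routines (np.array, np.dot, ...)
import matplotlib.pyplot as plt    # plotting, used at the end of the lab
np.set_printoptions(precision=2)   # optional: reduce display precision of NumPy arrays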
1.3 Notation
Here is a summary of some of the notation you will encounter, updated for multiple features.
2 Problem Statement
You will use the motivating example of housing price prediction. The training dataset contains
three examples with four features (size, bedrooms, floors, and age) shown in the table below.
Note that, unlike the earlier labs, size is in sqft rather than 1000 sqft. This causes an issue,
which you will solve in the next lab!

| Size (sqft) | Number of Bedrooms | Number of Floors | Age of Home | Price (1000s dollars) |
|-------------|--------------------|------------------|-------------|-----------------------|
| 2104        | 5                  | 1                | 45          | 460                   |
| 1416        | 3                  | 2                | 40          | 232                   |
| 852         | 2                  | 1                | 35          | 178                   |

You will build a linear regression model using these values so you can then predict the price of
other houses; for example, a house with 1200 sqft, 3 bedrooms, 1 floor, and 40 years of age.
Please run the following code cell to create your X_train and y_train variables.
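A minimal sketch of that cell, using the values from the table above:

X_train = np.array([[2104, 5, 1, 45],
                    [1416, 3, 2, 40],
                    [852,  2, 1, 35]])   # (m,n) = (3,4): size, bedrooms, floors, age
y_train = np.array([460, 232, 178])      # price in 1000s of dollars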
2.1 Matrix X containing our examples
Each training example is stored as a row of the matrix $\mathbf{X}$, which has dimensions $(m, n)$: $m$ examples (rows) and $n$ features (columns).

$$\mathbf{X} = \begin{pmatrix}
x^{(0)}_0 & x^{(0)}_1 & \cdots & x^{(0)}_{n-1} \\
x^{(1)}_0 & x^{(1)}_1 & \cdots & x^{(1)}_{n-1} \\
\vdots \\
x^{(m-1)}_0 & x^{(m-1)}_1 & \cdots & x^{(m-1)}_{n-1}
\end{pmatrix}$$

notation:
• $\mathbf{x}^{(i)}$ is a vector containing example i: $\mathbf{x}^{(i)} = (x^{(i)}_0, x^{(i)}_1, \cdots, x^{(i)}_{n-1})$
• $x^{(i)}_j$ is element j in example i. The superscript in parentheses indicates the example number
while the subscript represents an element.
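To see this layout concretely, you can inspect the training arrays; a brief sketch using the X_train and y_train created above:

# X is an (m,n) matrix, y is an (m,) vector
print(f"X Shape: {X_train.shape}, X Type: {type(X_train)}")
print(X_train)
print(f"y Shape: {y_train.shape}, y Type: {type(y_train)}")
print(y_train)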
2.2 Parameter vector w, b
• $\mathbf{w}$ is a vector with $n$ elements, one parameter per feature:

$$\mathbf{w} = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_{n-1} \end{pmatrix}$$

• b is a scalar parameter.
For demonstration, w and b will be loaded with some initial selected values that are near the
optimal. w is a 1-D NumPy vector.
b_init = 785.1811367994083
w_init = np.array([ 0.39133535, 18.75376741, -53.36032453, -26.42131618])
print(f"w_init shape: {w_init.shape}, b_init type: {type(b_init)}")
3 Model Prediction With Multiple Variables
The model's prediction with multiple variables is given by the linear model:

$$f_{\mathbf{w},b}(\mathbf{x}) = w_0 x_0 + w_1 x_1 + \dots + w_{n-1} x_{n-1} + b \tag{1}$$

or in vector notation:

$$f_{\mathbf{w},b}(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x} + b \tag{2}$$

where $\cdot$ is a vector dot product.

To demonstrate the dot product, we will implement prediction using (1) and (2).
3.1 Single Prediction element by element
Our previous prediction multiplied one feature value by one parameter and added a bias. A direct
extension to multiple features is to loop over each element, multiply it by its parameter, and add
the bias at the end.

def predict_single_loop(x, w, b):
    """
    single predict using linear regression
    Args:
      x (ndarray): Shape (n,) example with multiple features
      w (ndarray): Shape (n,) model parameters
      b (scalar):  model parameter
    Returns:
      p (scalar):  prediction
    """
    n = x.shape[0]
    p = 0
    for i in range(n):
        p_i = x[i] * w[i]    # multiply feature i by its parameter
        p = p + p_i          # accumulate
    p = p + b                # add the bias
    return p
# get a row from our training data to use as a single example
x_vec = X_train[0,:]

# make a prediction
f_wb = predict_single_loop(x_vec, w_init, b_init)
print(f"f_wb shape {f_wb.shape}, prediction: {f_wb}")
Note the shape of x_vec. It is a 1-D NumPy vector with 4 elements, (4,). The result, f_wb is a
scalar.
3.2 Single Prediction, vector
Recall from the Python/NumPy lab that NumPy np.dot() can be used to perform a vector dot
product.
def predict(x, w, b):
    """
    single predict using linear regression, vectorized with np.dot
    Args:
      x (ndarray): Shape (n,) example; w (ndarray): Shape (n,) model parameters; b (scalar): model parameter
    Returns:
      p (scalar): prediction
    """
    p = np.dot(x, w) + b
    return p
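To check that the vectorized version behaves like the loop version, it can be run on the same example; a brief sketch reusing x_vec from above:

# make a prediction with the vectorized routine; the result should match the loop version
f_wb = predict(x_vec, w_init, b_init)
print(f"f_wb shape {f_wb.shape}, prediction: {f_wb}")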
The results and shapes are the same as the previous version which used looping. Going forward,
np.dot will be used for these operations. The prediction is now a single statement. Most
routines will implement it directly rather than calling a separate predict routine.
4 Compute Cost With Multiple Variables
The equation for the cost function with multiple variables, $J(\mathbf{w},b)$, is:

$$J(\mathbf{w},b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)^2 \tag{3}$$

where:

$$f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b \tag{4}$$

In contrast to previous labs, $\mathbf{w}$ and $\mathbf{x}^{(i)}$ are vectors rather than scalars, supporting
multiple features.

Below is an implementation of equations (3) and (4). Note that this uses a standard pattern for
this course where a for loop over all m examples is used.
def compute_cost(X, y, w, b):
    """
    compute cost
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      cost (scalar): cost
    """
    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b        #(n,)(n,) = scalar (see np.dot)
        cost = cost + (f_wb_i - y[i])**2    #scalar
    cost = cost / (2 * m)                   #scalar
    return cost
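The cost at the pre-chosen near-optimal parameters can then be computed; a minimal sketch of that check:

# Compute and display cost using our pre-chosen optimal parameters
cost = compute_cost(X_train, y_train, w_init, b_init)
print(f'Cost at optimal w : {cost}')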
5 Gradient Descent With Multiple Variables
Gradient descent for multiple variables:

$$\begin{align*}
&\text{repeat until convergence:} \; \lbrace \\
&\quad w_j = w_j - \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \tag{5} \quad \text{for } j = 0, \dots, n-1 \\
&\quad b = b - \alpha \frac{\partial J(\mathbf{w},b)}{\partial b} \\
&\rbrace
\end{align*}$$

where $n$ is the number of features, parameters $w_j$, $b$ are updated simultaneously, and where

$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x^{(i)}_j \tag{6}$$

$$\frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) \tag{7}$$
5.1 Compute Gradient with Multiple Variables
An implementation of equations (6) and (7) is below. There is an outer loop over all m examples,
with an inner loop over the n features accumulating the gradient for each $w_j$.

def compute_gradient(X, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      dj_dw (ndarray (n,)): The gradient of the cost w.r.t. the parameters w.
      dj_db (scalar):       The gradient of the cost w.r.t. the parameter b.
    """
    m,n = X.shape           #(number of examples, number of features)
    dj_dw = np.zeros((n,))
    dj_db = 0.

    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]   # prediction error for example i
        for j in range(n):
            dj_dw[j] = dj_dw[j] + err * X[i, j]
        dj_db = dj_db + err
    dj_dw = dj_dw / m
    dj_db = dj_db / m

    return dj_db, dj_dw
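The gradient at the initial parameters can be checked against the expected result below; a minimal sketch, assuming the (dj_db, dj_dw) return order used above:

# Compute and display the gradient at the initial w, b
tmp_dj_db, tmp_dj_dw = compute_gradient(X_train, y_train, w_init, b_init)
print(f'dj_db at initial w,b: {tmp_dj_db}')
print(f'dj_dw at initial w,b: \n {tmp_dj_dw}')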
Expected Result:
dj_db at initial w,b: -1.6739251122999121e-06
dj_dw at initial w,b:
[-2.73e-03 -6.27e-06 -2.22e-06 -6.92e-05]
5.2 Gradient Descent With Multiple Variables
The routine below performs the gradient descent updates from equation (5) above.

def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b, taking num_iters steps with learning rate alpha.
    Args:
      X (ndarray (m,n))   : Data, m examples with n features
      y (ndarray (m,))    : target values
      w_in (ndarray (n,)) : initial model parameters
      b_in (scalar)       : initial model parameter
      cost_function       : function to compute cost
      gradient_function   : function to compute the gradient
      alpha (float)       : Learning rate
      num_iters (int)     : number of iterations to run gradient descent
    Returns:
      w (ndarray (n,)) : Updated values of parameters
      b (scalar)       : Updated value of parameter
      J_history (list) : cost at each iteration, for graphing
    """
    J_history = []       # record cost at each iteration
    w = w_in.copy()      # avoid modifying w_in within the function
    b = b_in
    for i in range(num_iters):
        dj_db, dj_dw = gradient_function(X, y, w, b)   # compute the gradient
        w = w - alpha * dj_dw                          # update the parameters
        b = b - alpha * dj_db
        J_history.append(cost_function(X, y, w, b))    # save cost for graphing
    return w, b, J_history
# initialize parameters
initial_w = np.zeros_like(w_init)
initial_b = 0.
# some gradient descent settings
iterations = 1000
alpha = 5.0e-7
# run gradient descent
w_final, b_final, J_hist = gradient_descent(X_train, y_train, initial_w, initial_b,
                                            compute_cost, compute_gradient,
                                            alpha, iterations)
print(f"b,w found by gradient descent: {b_final:0.2f},{w_final} ")
m,_ = X_train.shape
for i in range(m):
    print(f"prediction: {np.dot(X_train[i], w_final) + b_final:0.2f}, target value: {y_train[i]}")
Expected Result:
b,w found by gradient descent: -0.00,[ 0.2 0. -0.01 -0.07]
prediction: 426.19, target value: 460
prediction: 286.17, target value: 232
prediction: 171.47, target value: 178
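A cost-versus-iteration plot can be reproduced from the J_hist history returned by gradient_descent; a minimal sketch, assuming matplotlib.pyplot is imported as plt:

# plot cost versus iteration to see whether gradient descent has converged
plt.plot(J_hist)
plt.title("Cost vs. iteration")
plt.xlabel("iteration step")
plt.ylabel("Cost")
plt.show()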
These results are not inspiring! Cost is still declining and our predictions are not very accurate.
The next lab will explore how to improve on this.
6 Congratulations!
In this lab you:
• Redeveloped the routines for linear regression, now with multiple variables.
• Utilized NumPy np.dot to vectorize the implementations