
utils.py

import numpy as np

def load_data():
    data = np.loadtxt("data/ex1data1.txt", delimiter=',')
    X = data[:, 0]
    y = data[:, 1]
    return X, y

def load_data_multi():
    data = np.loadtxt("data/ex1data2.txt", delimiter=',')
    X = data[:, :2]
    y = data[:, 2]
    return X, y

Practice Lab: Linear Regression


Welcome to your first practice lab! In this lab, you will implement linear regression with one
variable to predict profits for a restaurant franchise.

Outline
1 - Packages
2 - Linear regression with one variable
2.1 Problem Statement
2.2 Dataset
2.3 Refresher on linear regression
2.4 Compute Cost
Exercise 1
2.5 Gradient descent
Exercise 2
2.6 Learning parameters using batch gradient descent
NOTE: To prevent errors from the autograder, you are not allowed to edit or delete non-graded
cells in this notebook. Please also refrain from adding any new cells. Once you have passed this
assignment and want to experiment with any of the non-graded code, you may follow the
instructions at the bottom of this notebook.

1 - Packages
First, let's run the cell below to import all the packages that you will need during this assignment.
 numpy is the fundamental package for working with matrices in Python.
 matplotlib is a famous library to plot graphs in Python.
 utils.py contains helper functions for this assignment. You do not need to modify code in
this file.

import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline

2.1 Problem Statement
Suppose you are the CEO of a restaurant franchise and are considering different cities for
opening a new outlet.
 You would like to expand your business to cities that may give your restaurant higher
profits.
 The chain already has restaurants in various cities and you have data for profits and
populations from the cities.
 You also have data on cities that are candidates for a new restaurant.
 For these cities, you have the city population.
Can you use the data to help you identify which cities may potentially give your business higher
profits?
2.2 Dataset
You will start by loading the dataset for this task.
 The load_data() function shown below loads the data into variables x_train and y_train
 x_train is the population of a city
 y_train is the profit of a restaurant in that city. A negative value for profit indicates
a loss.
 Both x_train and y_train are numpy arrays.

# load the dataset


x_train, y_train = load_data()

View the variables


Before starting on any task, it is useful to get more familiar with your dataset.
 A good place to start is to just print out each variable and see what it contains.
The code below prints the variable x_train and the type of the variable.

# print x_train
print("Type of x_train:",type(x_train))
print("First five elements of x_train are:\n", x_train[:5])

x_train is a numpy array that contains decimal values that are all greater than zero.
 These values represent the city population times 10,000
 For example, 6.1101 means that the population for that city is 61,101
Now, let's print y_train

# print y_train
print("Type of y_train:",type(y_train))
print("First five elements of y_train are:\n", y_train[:5])

Similarly, y_train is a numpy array that has decimal values, some negative, some positive.
 These represent your restaurant's average monthly profits in each city, in units of $10,000.
 For example, 17.592 represents $175,920 in average monthly profits for that city.
 -2.6807 represents an average monthly loss of $26,807 for that city.

Check the dimensions of your variables


Another useful way to get familiar with your data is to view its dimensions.
Please print the shape of x_train and y_train and see how many training examples you have in
your dataset.

print ('The shape of x_train is:', x_train.shape)


print ('The shape of y_train is: ', y_train.shape)
print ('Number of training examples (m):', len(x_train))

The city population array has 97 data points, and the monthly average profits also has 97 data
points. These are NumPy 1D arrays.

Visualize your data


It is often useful to understand the data by visualizing it.
 For this dataset, you can use a scatter plot to visualize the data, since it has only two
properties to plot (profit and population).
 Many other problems that you will encounter in real life have more than two properties
(for example, population, average household income, monthly profits, monthly
sales). When you have more than two properties, you can still use a scatter plot to see the
relationship between each pair of properties.

# Create a scatter plot of the data. To change the markers to red "x",
# we used the 'marker' and 'c' parameters
plt.scatter(x_train, y_train, marker='x', c='r')

# Set the title


plt.title("Profits vs. Population per city")
# Set the y-axis label
plt.ylabel('Profit in $10,000')
# Set the x-axis label
plt.xlabel('Population of City in 10,000s')
plt.show()

Your goal is to build a linear regression model to fit this data.


 With this model, you can then input a new city's population, and have the model estimate
your restaurant's potential monthly profits for that city.
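As a refresher (section 2.3 of the outline), the model for linear regression with one variable and the cost you will implement in compute_cost are:

$$f_{w,b}(x^{(i)}) = w x^{(i)} + b$$

$$J(w,b) = \frac{1}{2m} \sum_{i=0}^{m-1} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2$$

where m is the number of training examples.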
# UNQ_C1
# GRADED FUNCTION: compute_cost
def compute_cost(x, y, w, b):
    """
    Computes the cost function for linear regression.

    Args:
        x (ndarray): Shape (m,) Input to the model (Population of cities)
        y (ndarray): Shape (m,) Label (Actual profits for the cities)
        w, b (scalar): Parameters of the model

    Returns
        total_cost (float): The cost of using w,b as the parameters for linear regression
                            to fit the data points in x and y
    """
    # number of training examples
    m = x.shape[0]

    # You need to return this variable correctly
    total_cost = 0

    ### START CODE HERE ###
    sum_cost = 0
    for i in range(m):
        # Model prediction for example i
        f_wb = w * x[i] + b
        # Accumulate the squared error for example i
        sum_cost += (f_wb - y[i]) ** 2
    total_cost = 1 / (2 * m) * sum_cost
    ### END CODE HERE ###

    return total_cost
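Side note (not part of the graded exercise): once the loop version above passes the tests, the same cost can be computed without an explicit Python loop by using NumPy vectorization. A minimal sketch, assuming x and y are 1-D NumPy arrays of equal length; the function name is only for illustration:

def compute_cost_vectorized(x, y, w, b):
    # Vector of model predictions for all examples at once
    f_wb = w * x + b
    # Half the mean squared error, matching the loop-based cost above
    return np.sum((f_wb - y) ** 2) / (2 * x.shape[0])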

You can check if your implementation was correct by running the following test code:
# Compute cost with some initial values for parameters w, b
initial_w = 2
initial_b = 1

cost = compute_cost(x_train, y_train, initial_w, initial_b)


print(type(cost))
print(f'Cost at initial w: {cost:.3f}')

# Public tests
from public_tests import *
compute_cost_test(compute_cost)
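Next you will implement compute_gradient (section 2.5 of the outline). As a refresher, gradient descent repeatedly updates the parameters using the gradients of the cost defined above:

$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)} \qquad \frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)$$

$$w := w - \alpha \frac{\partial J(w,b)}{\partial w} \qquad b := b - \alpha \frac{\partial J(w,b)}{\partial b}$$

The function below computes the two partial derivatives for given values of w and b.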
# UNQ_C2
# GRADED FUNCTION: compute_gradient
def compute_gradient(x, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
        x (ndarray): Shape (m,) Input to the model (Population of cities)
        y (ndarray): Shape (m,) Label (Actual profits for the cities)
        w, b (scalar): Parameters of the model
    Returns
        dj_dw (scalar): The gradient of the cost w.r.t. the parameter w
        dj_db (scalar): The gradient of the cost w.r.t. the parameter b
    """

    # Number of training examples
    m = x.shape[0]

    # You need to return the following variables correctly
    dj_dw = 0
    dj_db = 0

    # Nguyen Do Phuoc Rin
    ### START CODE HERE ###
    for i in range(m):
        # Model prediction for example i
        f_wb = w * x[i] + b
        # Accumulate the gradient contributions of example i
        dj_dw += (f_wb - y[i]) * x[i]
        dj_db += f_wb - y[i]
    dj_dw = dj_dw / m
    dj_db = dj_db / m
    ### END CODE HERE ###

    return dj_dw, dj_db

Run the cells below to check your implementation of the compute_gradient function with two
different initializations of the parameters w,b.

# Compute and display gradient with w initialized to zeroes


initial_w = 0
initial_b = 0

tmp_dj_dw, tmp_dj_db = compute_gradient(x_train, y_train, initial_w, initial_b)


print('Gradient at initial w, b (zeros):', tmp_dj_dw, tmp_dj_db)

compute_gradient_test(compute_gradient)

Now let's check the gradient with some non-zero values for the parameters w and b.

# Compute and display cost and gradient with non-zero w


test_w = 0.2
test_b = 0.2
tmp_dj_dw, tmp_dj_db = compute_gradient(x_train, y_train, test_w, test_b)

print('Gradient at test w, b:', tmp_dj_dw, tmp_dj_db)

2.6 Learning parameters using batch gradient descent


You will now find the optimal parameters of a linear regression model by using batch gradient
descent. Recall batch refers to running all the examples in one iteration.
 You don't need to implement anything for this part. Simply run the cells below.
 A good way to verify that gradient descent is working correctly is to look at the value
of J(w,b) and check that it is decreasing with each step.
 Assuming you have implemented the gradient and computed the cost correctly and you
have an appropriate value for the learning rate alpha, J(w,b) should never increase and
should converge to a steady value by the end of the algorithm.

def gradient_descent(x, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    num_iters gradient steps with learning rate alpha

    Args:
        x (ndarray): Shape (m,)
        y (ndarray): Shape (m,)
        w_in, b_in (scalar): Initial values of parameters of the model
        cost_function: function to compute cost
        gradient_function: function to compute the gradient
        alpha (float): Learning rate
        num_iters (int): number of iterations to run gradient descent
    Returns
        w (ndarray): Shape (1,) Updated values of parameters of the model after
            running gradient descent
        b (scalar): Updated value of parameter of the model after
            running gradient descent
    """

    # number of training examples
    m = len(x)

    # Arrays to store cost J and w at each iteration, primarily for graphing later
    J_history = []
    w_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w within function
    b = b_in

    for i in range(num_iters):

        # Calculate the gradient
        dj_dw, dj_db = gradient_function(x, y, w, b)

        # Update parameters using w, b, alpha and gradient
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:      # prevent resource exhaustion
            cost = cost_function(x, y, w, b)
            J_history.append(cost)

        # Print cost at intervals of num_iters/10 (10 updates over the run)
        if i % math.ceil(num_iters/10) == 0:
            w_history.append(w)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}")

    return w, b, J_history, w_history  # return w and J,w history for graphing

Now let's run the gradient descent algorithm above to learn the parameters for our dataset

# initialize fitting parameters. Recall that the shape of w is (n,)
initial_w = 0.
initial_b = 0.

# some gradient descent settings
iterations = 1500
alpha = 0.01

w, b, _, _ = gradient_descent(x_train, y_train, initial_w, initial_b,
                              compute_cost, compute_gradient, alpha, iterations)
print("w,b found by gradient descent:", w, b)

m = x_train.shape[0]
predicted = np.zeros(m)

for i in range(m):
    predicted[i] = w * x_train[i] + b

We will now plot the predicted values to see the linear fit.

# Plot the linear fit


plt.plot(x_train, predicted, c = "b")

# Create a scatter plot of the data.


plt.scatter(x_train, y_train, marker='x', c='r')

# Set the title


plt.title("Profits vs. Population per city")
# Set the y-axis label
plt.ylabel('Profit in $10,000')
# Set the x-axis label
plt.xlabel('Population of City in 10,000s')

Your final values of w,b can also be used to make predictions on profits. Let's predict what the
profit would be in areas of 35,000 and 70,000 people.
 The model takes in population of a city in 10,000s as input.
 Therefore, 35,000 people can be translated into an input to the model as np.array([3.5])
 Similarly, 70,000 people can be translated into an input to the model as np.array([7.])

predict1 = 3.5 * w + b
print('For population = 35,000, we predict a profit of $%.2f' % (predict1*10000))

predict2 = 7.0 * w + b
print('For population = 70,000, we predict a profit of $%.2f' % (predict2*10000))

Logistic Regression
utils.py

import numpy as np
import matplotlib.pyplot as plt

def load_data(filename):
    data = np.loadtxt(filename, delimiter=',')
    X = data[:, :2]
    y = data[:, 2]
    return X, y

def sig(z):
    return 1/(1+np.exp(-z))

def map_feature(X1, X2):
    """
    Feature mapping function to polynomial features
    """
    X1 = np.atleast_1d(X1)
    X2 = np.atleast_1d(X2)
    degree = 6
    out = []
    for i in range(1, degree+1):
        for j in range(i + 1):
            out.append((X1**(i-j) * (X2**j)))
    return np.stack(out, axis=1)

def plot_data(X, y, pos_label="y=1", neg_label="y=0"):
    positive = y == 1
    negative = y == 0

    # Plot examples
    plt.plot(X[positive, 0], X[positive, 1], 'k+', label=pos_label)
    plt.plot(X[negative, 0], X[negative, 1], 'yo', label=neg_label)

def plot_decision_boundary(w, b, X, y):
    # Credit to dibgerge on Github for this plotting code
    plot_data(X[:, 0:2], y)

    if X.shape[1] <= 2:
        plot_x = np.array([min(X[:, 0]), max(X[:, 0])])
        plot_y = (-1. / w[1]) * (w[0] * plot_x + b)

        plt.plot(plot_x, plot_y, c="b")
    else:
        u = np.linspace(-1, 1.5, 50)
        v = np.linspace(-1, 1.5, 50)

        z = np.zeros((len(u), len(v)))

        # Evaluate z = theta*x over the grid
        for i in range(len(u)):
            for j in range(len(v)):
                z[i, j] = sig(np.dot(map_feature(u[i], v[j]), w) + b)

        # important to transpose z before calling contour
        z = z.T

        # Plot z = 0.5
        plt.contour(u, v, z, levels=[0.5], colors="g")
1 - Packages
First, let's run the cell below to import all the packages that you will need during this assignment.
 numpy is the fundamental package for scientific computing with Python.
 matplotlib is a famous library to plot graphs in Python.
 utils.py contains helper functions for this assignment. You do not need to modify code in
this file.
import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline

2 - Logistic Regression
In this part of the exercise, you will build a logistic regression model to predict whether a student
gets admitted into a university.
2.1 Problem Statement
Suppose that you are the administrator of a university department and you want to determine each
applicant’s chance of admission based on their results on two exams.
 You have historical data from previous applicants that you can use as a training set for
logistic regression.
 For each training example, you have the applicant’s scores on two exams and the
admissions decision.
 Your task is to build a classification model that estimates an applicant’s probability of
admission based on the scores from those two exams.
2.2 Loading and visualizing the data
You will start by loading the dataset for this task.
 The load_data() function shown below loads the data into variables X_train and y_train
 X_train contains exam scores on two exams for a student
 y_train is the admission decision
o y_train = 1 if the student was admitted
o y_train = 0 if the student was not admitted
 Both X_train and y_train are numpy arrays.

# load dataset
X_train, y_train = load_data("data/ex2data1.txt")

View the variables


Let's get more familiar with your dataset.
 A good place to start is to just print out each variable and see what it contains.
The code below prints the first five values of X_train, y_train and the type of the variable.
print("First five elements in X_train are:\n", X_train[:5])
print("Type of X_train:",type(X_train))
print("First five elements in y_train are:\n", y_train[:5])
print("Type of y_train:",type(y_train))

Check the dimensions of your variables


Another useful way to get familiar with your data is to view its dimensions. Let's print the shape
of X_train and y_train and see how many training examples we have in our dataset.

print ('The shape of X_train is: ' + str(X_train.shape))


print ('The shape of y_train is: ' + str(y_train.shape))
print ('We have m = %d training examples' % (len(y_train)))

Visualize your data


Before starting to implement any learning algorithm, it is always good to visualize the data if
possible.
 The code below displays the data on a 2D plot (as shown below), where the axes are the
two exam scores, and the positive and negative examples are shown with different
markers.
 We use a helper function in the utils.py file to generate this plot.

plot_data(X_train, y_train[:], pos_label="Admitted", neg_label="Not admitted")

# Set the y-axis label


plt.ylabel('Exam 2 score')
# Set the x-axis label
plt.xlabel('Exam 1 score')
plt.legend(loc="upper right")
plt.show()

Your goal is to build a logistic regression model to fit this data.


 With this model, you can then predict if a new student will be admitted based on their
scores on the two exams.

# UNQ_C1
# GRADED FUNCTION: sigmoid

def sigmoid(z):
    """
    Compute the sigmoid of z

    Args:
        z (ndarray): A scalar, numpy array of any size.

    Returns:
        g (ndarray): sigmoid(z), with the same shape as z

    """
    ### START CODE HERE ###
    g = 1/(1+np.exp(-z))
    ### END SOLUTION ###

    return g

# Note: You can edit this value


value = 0

print (f"sigmoid({value}) = {sigmoid(value)}")

print ("sigmoid([ -1, 0, 1, 2]) = " + str(sigmoid(np.array([-1, 0, 1, 2]))))

# UNIT TESTS
from public_tests import *
sigmoid_test(sigmoid)

Note:
 As you are doing this, remember that the variables X_train and y_train are not scalar
values but matrices of shape (m, n) and (m, 1) respectively, where n is the number of
features and m is the number of training examples.
 You can use the sigmoid function that you implemented above for this part.
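For reference, the (unregularized) logistic regression cost that compute_cost below implements is

$$J(\mathbf{w},b) = \frac{1}{m} \sum_{i=0}^{m-1} \left[ -y^{(i)} \log\left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) \right) - \left( 1 - y^{(i)} \right) \log\left( 1 - f_{\mathbf{w},b}(\mathbf{x}^{(i)}) \right) \right]$$

where $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathrm{sigmoid}(\mathbf{w} \cdot \mathbf{x}^{(i)} + b)$.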

# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, *argv):
    """
    Computes the cost over all examples
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,)) target value
        w : (ndarray Shape (n,)) values of parameters of the model
        b : (scalar) value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below
    Returns:
        total_cost : (scalar) cost
    """

    m, n = X.shape

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    loss_sum = 0
    for i in range(m):
        # Model output f_wb = sigmoid(w . x^(i) + b) for example i
        z_wb = 0
        for j in range(n):
            z_wb += w[j] * X[i][j]
        z_wb += b
        f_wb = sigmoid(z_wb)
        # Logistic loss for example i
        loss = -y[i] * np.log(f_wb) - (1 - y[i]) * np.log(1 - f_wb)
        loss_sum += loss
    total_cost = loss_sum / m
    ### END CODE HERE ###

    return total_cost
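Side note (not graded): the nested loops above can be replaced by a vectorized computation. A minimal sketch, assuming X has shape (m, n), y has shape (m,), and using the sigmoid function defined earlier; the function name is only for illustration:

def compute_cost_vec(X, y, w, b):
    # Model output for all m examples at once, shape (m,)
    f_wb = sigmoid(X @ w + b)
    # Average of the per-example logistic loss
    return np.mean(-y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb))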

Run the cells below to check your implementation of the compute_cost function with two
different initializations of the parameters w and b

m, n = X_train.shape

# Compute and display cost with w and b initialized to zeros


initial_w = np.zeros(n)
initial_b = 0.
cost = compute_cost(X_train, y_train, initial_w, initial_b)
print('Cost at initial w and b (zeros): {:.3f}'.format(cost))

# Compute and display cost with non-zero w and b


test_w = np.array([0.2, 0.2])
test_b = -24.
cost = compute_cost(X_train, y_train, test_w, test_b)

print('Cost at test w and b (non-zeros): {:.3f}'.format(cost))


# UNIT TESTS
compute_cost_test(compute_cost)
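Next, compute_gradient. For reference, the gradients of the logistic cost have the same form as in linear regression, only with the sigmoid-based model:

$$\frac{\partial J(\mathbf{w},b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_j^{(i)} \qquad \frac{\partial J(\mathbf{w},b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)$$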

# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, *argv):
    """
    Computes the gradient for logistic regression
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,)) target value
        w : (ndarray Shape (n,)) values of parameters of the model
        b : (scalar) value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below
    Returns
        dj_db : (scalar) The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    for i in range(m):
        # Model output for example i
        z_wb = 0
        for j in range(n):
            z_wb += X[i, j] * w[j]
        z_wb += b
        f_wb = sigmoid(z_wb)

        # Accumulate the gradient contributions of example i
        dj_db += f_wb - y[i]
        for j in range(n):
            dj_dw[j] += (f_wb - y[i]) * X[i, j]

    dj_dw = dj_dw / m
    dj_db = dj_db / m
    ### END CODE HERE ###

    return dj_db, dj_dw

# Compute and display gradient with w and b initialized to zeros


initial_w = np.zeros(n)
initial_b = 0.

dj_db, dj_dw = compute_gradient(X_train, y_train, initial_w, initial_b)


print(f'dj_db at initial w and b (zeros):{dj_db}' )
print(f'dj_dw at initial w and b (zeros):{dj_dw.tolist()}' )

# Compute and display cost and gradient with non-zero w and b


test_w = np.array([ 0.2, -0.5])
test_b = -24
dj_db, dj_dw = compute_gradient(X_train, y_train, test_w, test_b)

print('dj_db at test w and b:', dj_db)


print('dj_dw at test w and b:', dj_dw.tolist())

# UNIT TESTS
compute_gradient_test(compute_gradient)

def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters, lambda_):
    """
    Performs batch gradient descent to learn w and b. Updates w and b by taking
    num_iters gradient steps with learning rate alpha
    Args:
        X : (ndarray Shape (m, n)) data, m examples by n features
        y : (ndarray Shape (m,)) target value
        w_in : (ndarray Shape (n,)) Initial values of parameters of the model
        b_in : (scalar) Initial value of parameter of the model
        cost_function : function to compute cost
        gradient_function : function to compute gradient
        alpha : (float) Learning rate
        num_iters : (int) number of iterations to run gradient descent
        lambda_ : (scalar, float) regularization constant

    Returns:
        w : (ndarray Shape (n,)) Updated values of parameters of the model after
            running gradient descent
        b : (scalar) Updated value of parameter of the model after
            running gradient descent
    """

    # number of training examples
    m = len(X)

    # Arrays to store cost J and w at each iteration, primarily for graphing later
    J_history = []
    w_history = []

    for i in range(num_iters):

        # Calculate the gradient
        dj_db, dj_dw = gradient_function(X, y, w_in, b_in, lambda_)

        # Update parameters using w, b, alpha and gradient
        w_in = w_in - alpha * dj_dw
        b_in = b_in - alpha * dj_db

        # Save cost J at each iteration
        if i < 100000:      # prevent resource exhaustion
            cost = cost_function(X, y, w_in, b_in, lambda_)
            J_history.append(cost)

        # Print cost at intervals of num_iters/10 (10 updates over the run) and on the last iteration
        if i % math.ceil(num_iters/10) == 0 or i == (num_iters-1):
            w_history.append(w_in)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}")

    return w_in, b_in, J_history, w_history  # return w and J,w history for graphing

Now let's run the gradient descent algorithm above to learn the parameters for our dataset.
Note The code block below takes a couple of minutes to run, especially with a non-vectorized
version. You can reduce the iterations to test your implementation and iterate faster. If you have
time later, try running 100,000 iterations for better results.

np.random.seed(1)
initial_w = 0.01 * (np.random.rand(2) - 0.5)
initial_b = -8

# Some gradient descent settings
iterations = 10000
alpha = 0.001

w, b, J_history, _ = gradient_descent(X_train, y_train, initial_w, initial_b,
                                      compute_cost, compute_gradient, alpha, iterations, 0)
2.7 Plotting the decision boundary
We will now use the final parameters from gradient descent to plot the decision boundary on top of
the training data. If you implemented the previous parts correctly, the boundary should roughly
separate the admitted and not-admitted examples.
We will use a helper function in the utils.py file to create this plot.

plot_decision_boundary(w, b, X_train, y_train)


# Set the y-axis label
plt.ylabel('Exam 2 score')
# Set the x-axis label
plt.xlabel('Exam 1 score')
plt.legend(loc="upper right")
plt.show()

2.8 Evaluating logistic regression


We can evaluate the quality of the parameters we have found by seeing how well the learned
model predicts on our training set.
You will implement the predict function below to do this.

# UNQ_C4
# GRADED FUNCTION: predict

def predict(X, w, b):
    """
    Predict whether the label is 0 or 1 using learned logistic
    regression parameters w

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        w : (ndarray Shape (n,)) values of parameters of the model
        b : (scalar) value of bias parameter of the model

    Returns:
        p : (ndarray (m,)) The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape
    p = np.zeros(m)

    ### START CODE HERE ###
    # Loop over each example
    for i in range(m):
        z_wb = 0
        # Loop over each feature
        for j in range(n):
            # Add the corresponding term to z_wb
            z_wb += X[i, j] * w[j]

        # Add bias term
        z_wb += b

        # Calculate the prediction for this example
        f_wb = sigmoid(z_wb)
        # Apply the threshold
        p[i] = f_wb >= 0.5
    ### END CODE HERE ###

    return p
Once you have completed the function predict, let's run the code below to report the training
accuracy of your classifier by computing the percentage of examples it got correct.
# Test your predict code
np.random.seed(1)
tmp_w = np.random.randn(2)
tmp_b = 0.3
tmp_X = np.random.randn(4, 2) - 0.5

tmp_p = predict(tmp_X, tmp_w, tmp_b)


print(f'Output of predict: shape {tmp_p.shape}, value {tmp_p}')

# UNIT TESTS
predict_test(predict)

Now let's use this to compute the accuracy on the training set

#Compute accuracy on our training set


p = predict(X_train, w,b)
print('Train Accuracy: %f'%(np.mean(p == y_train) * 100))
3 - Regularized Logistic Regression
In this part of the exercise, you will implement regularized logistic regression to predict whether
microchips from a fabrication plant pass quality assurance, based on the results of two tests.
You will start by loading the dataset for this task.

# load dataset
X_train, y_train = load_data("data/ex2data2.txt")

# print X_train
print("X_train:", X_train[:5])
print("Type of X_train:",type(X_train))

# print y_train
print("y_train:", y_train[:5])
print("Type of y_train:",type(y_train))

print ('The shape of X_train is: ' + str(X_train.shape))


print ('The shape of y_train is: ' + str(y_train.shape))
print ('We have m = %d training examples' % (len(y_train)))

# Plot examples
plot_data(X_train, y_train[:], pos_label="Accepted", neg_label="Rejected")

# Set the y-axis label


plt.ylabel('Microchip Test 2')
# Set the x-axis label
plt.xlabel('Microchip Test 1')
plt.legend(loc="upper right")
plt.show()

print("Original shape of data:", X_train.shape)

mapped_X = map_feature(X_train[:, 0], X_train[:, 1])


print("Shape after feature mapping:", mapped_X.shape)

print("X_train[0]:", X_train[0])
print("mapped X_train[0]:", mapped_X[0])
# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the cost over all examples
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,)) target value
        w : (ndarray Shape (n,)) values of parameters of the model
        b : (scalar) value of bias parameter of the model
        lambda_ : (scalar, float) Controls amount of regularization
    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    # Calls the compute_cost function that you implemented above
    cost_without_reg = compute_cost(X, y, w, b)

    # You need to calculate this value
    reg_cost = 0.

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    # Sum of squared weights (the bias b is not regularized)
    for j in range(n):
        reg_cost += w[j]**2
    reg_cost = (lambda_/(2 * m)) * reg_cost
    ### END CODE HERE ###

    # Add the regularization cost to get the total cost
    total_cost = cost_without_reg + reg_cost

    return total_cost
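For reference, the regularized cost computed above adds a penalty on the weights (the bias b is not regularized):

$$J_{\text{reg}}(\mathbf{w},b) = J(\mathbf{w},b) + \frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$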
# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the gradient for logistic regression with regularization

    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,)) target value
        w : (ndarray Shape (n,)) values of parameters of the model
        b : (scalar) value of bias parameter of the model
        lambda_ : (scalar, float) regularization constant
    Returns
        dj_db : (scalar) The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape

    dj_db, dj_dw = compute_gradient(X, y, w, b)

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    # Add the regularization term for each w_j (b is not regularized)
    for j in range(n):
        dj_dw[j] += (lambda_ / m) * w[j]
    ### END CODE HERE ###

    return dj_db, dj_dw
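Correspondingly, regularization only changes the gradient with respect to each $w_j$; the gradient with respect to $b$ is unchanged:

$$\frac{\partial J_{\text{reg}}}{\partial w_j} = \frac{\partial J}{\partial w_j} + \frac{\lambda}{m} w_j \qquad \frac{\partial J_{\text{reg}}}{\partial b} = \frac{\partial J}{\partial b}$$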


# Map the original features to polynomial features, as in the feature mapping demo above
X_mapped = map_feature(X_train[:, 0], X_train[:, 1])

# Initialize fitting parameters
np.random.seed(1)
initial_w = np.random.rand(X_mapped.shape[1]) - 0.5
initial_b = 1.

# Set regularization parameter lambda_ (you can try varying this)
lambda_ = 0.01

# Some gradient descent settings
iterations = 10000
alpha = 0.01

w, b, J_history, _ = gradient_descent(X_mapped, y_train, initial_w, initial_b,
                                      compute_cost_reg, compute_gradient_reg,
                                      alpha, iterations, lambda_)

plot_decision_boundary(w, b, X_mapped, y_train)


# Set the y-axis label
plt.ylabel('Microchip Test 2')
# Set the x-axis label
plt.xlabel('Microchip Test 1')
plt.legend(loc="upper right")
plt.show()

utils.py
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from mpl_toolkits.mplot3d import Axes3D

def load_data():
    X = np.load("data/ex7_X.npy")
    return X

def draw_line(p1, p2, style="-k", linewidth=1):
    plt.plot([p1[0], p2[0]], [p1[1], p2[1]], style, linewidth=linewidth)

def plot_data_points(X, idx):
    # Define colormap to match Figure 1 in the notebook
    cmap = ListedColormap(["red", "green", "blue"])
    c = cmap(idx)

    # plots data points in X, coloring them so that those with the same
    # index assignments in idx have the same color
    plt.scatter(X[:, 0], X[:, 1], facecolors='none', edgecolors=c, linewidth=0.1, alpha=0.7)

def plot_progress_kMeans(X, centroids, previous_centroids, idx, K, i):
    # Plot the examples
    plot_data_points(X, idx)

    # Plot the centroids as black 'x's
    plt.scatter(centroids[:, 0], centroids[:, 1], marker='x', c='k', linewidths=3)

    # Plot history of the centroids with lines
    for j in range(centroids.shape[0]):
        draw_line(centroids[j, :], previous_centroids[j, :])

    plt.title("Iteration number %d" % i)

def plot_kMeans_RGB(X, centroids, idx, K):
    # Plot the colors and centroids in a 3D space
    fig = plt.figure(figsize=(16, 16))
    ax = fig.add_subplot(221, projection='3d')
    ax.scatter(*X.T*255, zdir='z', depthshade=False, s=.3, c=X)
    ax.scatter(*centroids.T*255, zdir='z', depthshade=False, s=500, c='red', marker='x', lw=3)
    ax.set_xlabel('R value - Redness')
    ax.set_ylabel('G value - Greenness')
    ax.set_zlabel('B value - Blueness')
    ax.w_yaxis.set_pane_color((0., 0., 0., .2))
    ax.set_title("Original colors and their color clusters' centroids")
    plt.show()

def show_centroid_colors(centroids):
    palette = np.expand_dims(centroids, axis=0)
    num = np.arange(0, len(centroids))
    plt.figure(figsize=(16, 16))
    plt.xticks(num)
    plt.yticks([])
    plt.imshow(palette)

import numpy as np
import matplotlib.pyplot as plt
from utils import *
%matplotlib inline
# UNQ_C1
# GRADED FUNCTION: find_closest_centroids

def find_closest_centroids(X, centroids):
    """
    Computes the centroid memberships for every example

    Args:
        X (ndarray): (m, n) Input values
        centroids (ndarray): (K, n) centroids

    Returns:
        idx (array_like): (m,) closest centroids

    """
    # Set K
    K = centroids.shape[0]

    # You need to return the following variables correctly
    idx = np.zeros(X.shape[0], dtype=int)

    # YOUR CODE HERE
    for i in range(X.shape[0]):
        # Array to hold distance between X[i] and each centroids[j]
        distance = []
        for j in range(centroids.shape[0]):
            norm_ij = np.linalg.norm(X[i] - centroids[j])
            distance.append(norm_ij)

        # Index of the closest centroid to X[i]
        idx[i] = np.argmin(distance)

    return idx
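This implements the cluster-assignment step of K-Means: each example $x^{(i)}$ is assigned to the centroid that minimizes the (squared) Euclidean distance,

$$c^{(i)} := \arg\min_{j} \; \left\lVert x^{(i)} - \mu_j \right\rVert^2$$

where $\mu_j$ is the position of the $j$-th centroid. Minimizing the norm, as the code does, gives the same index as minimizing the squared norm.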

Now let's check your implementation using an example dataset

# Load an example dataset that we will be using


X = load_data()

print("First five elements of X are:\n", X[:5])


print("The shape of X is:", X.shape)

# Select an initial set of centroids (3 Centroids)


initial_centroids = np.array([[3, 3], [6, 2], [8, 5]])

# Find closest centroids using initial_centroids


idx = find_closest_centroids(X, initial_centroids)

# Print closest centroids for the first three elements


print("First three elements in idx are:", idx[:3])

# UNIT TEST
from public_tests import *

find_closest_centroids_test(find_closest_centroids)

# UNQ_C2
# GRADED FUNCTION: compute_centroids

def compute_centroids(X, idx, K):
    """
    Returns the new centroids by computing the means of the
    data points assigned to each centroid.

    Args:
        X (ndarray): (m, n) Data points
        idx (ndarray): (m,) Array containing index of closest centroid for each
                       example in X. Concretely, idx[i] contains the index of
                       the centroid closest to example i
        K (int): number of centroids

    Returns:
        centroids (ndarray): (K, n) New centroids computed
    """
    # Useful variables
    m, n = X.shape

    # You need to return the following variables correctly
    centroids = np.zeros((K, n))

    # YOUR CODE HERE
    for k in range(K):
        # Points currently assigned to centroid k
        points = X[idx == k]
        centroids[k] = np.mean(points, axis=0)

    return centroids
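This is the centroid-update step: each new centroid is the mean of the points currently assigned to it,

$$\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{i \in C_k} x^{(i)}$$

where $C_k$ is the set of examples with idx[i] = k.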

K=3
centroids = compute_centroids(X, idx, K)

print("The centroids are:", centroids)

# UNIT TEST
compute_centroids_test(compute_centroids)

# You do not need to implement anything for this part

def run_kMeans(X, initial_centroids, max_iters=10, plot_progress=False):
    """
    Runs the K-Means algorithm on data matrix X, where each row of X
    is a single example
    """
    # Initialize values
    m, n = X.shape
    K = initial_centroids.shape[0]
    centroids = initial_centroids
    previous_centroids = centroids
    idx = np.zeros(m)

    # Run K-Means
    for i in range(max_iters):

        # Output progress
        print("K-Means iteration %d/%d" % (i, max_iters - 1))

        # For each example in X, assign it to the closest centroid
        idx = find_closest_centroids(X, centroids)

        # Optionally plot progress
        if plot_progress:
            plot_progress_kMeans(X, centroids, previous_centroids, idx, K, i)
            previous_centroids = centroids

        # Given the memberships, compute new centroids
        centroids = compute_centroids(X, idx, K)
    plt.show()
    return centroids, idx

# Load an example dataset


X = load_data()

# Set initial centroids


initial_centroids = np.array([[3, 3], [6, 2], [8, 5]])
K=3

# Number of iterations
max_iters = 10

centroids, idx = run_kMeans(X, initial_centroids, max_iters, plot_progress=True)

# You do not need to modify this part

def kMeans_init_centroids(X, K):
    """
    This function initializes K centroids that are to be
    used in K-Means on the dataset X

    Args:
        X (ndarray): Data points
        K (int): number of centroids/clusters

    Returns:
        centroids (ndarray): Initialized centroids
    """
    # Randomly reorder the indices of examples
    randidx = np.random.permutation(X.shape[0])

    # Take the first K examples as centroids
    centroids = X[randidx[:K]]

    return centroids
4 - Image compression with K-Means
In this part of the exercise, you will apply K-Means to image compression: the algorithm finds the
K most representative colours in an image, and each pixel can then be stored as the index of its
closest colour instead of a full RGB value.

# original_img holds the image to compress as an array of RGB pixel values
# Divide by 255 so that all values are in the range 0 - 1
original_img = original_img / 255

# Reshape the image into an m x 3 matrix where m = number of pixels
# (in this case m = 128 x 128 = 16384)
# Each row will contain the Red, Green and Blue pixel values
# This gives us our dataset matrix X_img that we will use K-Means on.
X_img = np.reshape(original_img, (original_img.shape[0] * original_img.shape[1], 3))

4.2 K-Means on image pixels


Now, run the cell below to run K-Means on the pre-processed image.

# Run your K-Means algorithm on this data


# You should try different values of K and max_iters here
K = 16
max_iters = 10

# Using the function you have implemented above.


initial_centroids = kMeans_init_centroids(X_img, K)

# Run K-Means - this takes a couple of minutes


centroids, idx = run_kMeans(X_img, initial_centroids, max_iters)

print("Shape of idx:", idx.shape)


print("Closest centroid for the first five elements:", idx[:5])

# Represent image in terms of indices


X_recovered = centroids[idx, :]

# Reshape recovered image into proper dimensions


X_recovered = np.reshape(X_recovered, original_img.shape)
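To see why this is a compression, assuming the 128 x 128 image noted above with 24 bits per pixel (8 bits per RGB channel): the original image takes 128 × 128 × 24 = 393,216 bits, while the compressed representation only needs the 16-colour palette plus a 4-bit index per pixel, i.e. 16 × 24 + 128 × 128 × 4 = 65,920 bits, roughly a factor of 6 smaller.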

# Display original image


fig, ax = plt.subplots(1, 2, figsize=(8, 8))
plt.axis("off")

ax[0].imshow(original_img * 255)
ax[0].set_title("Original")
ax[0].set_axis_off()

# Display compressed image


ax[1].imshow(X_recovered * 255)
ax[1].set_title("Compressed with %d colours" % K)
ax[1].set_axis_off()
