Machine Learning
utils.py
import numpy as np

def load_data():
    data = np.loadtxt("data/ex1data1.txt", delimiter=',')
    X = data[:, 0]
    y = data[:, 1]
    return X, y

def load_data_multi():
    data = np.loadtxt("data/ex1data2.txt", delimiter=',')
    X = data[:, :2]
    y = data[:, 2]
    return X, y
Outline
1 - Packages
2 - Linear regression with one variable
2.1 Problem Statement
2.2 Dataset
2.3 Refresher on linear regression
2.4 Compute Cost
Exercise 1
2.5 Gradient descent
Exercise 2
2.6 Learning parameters using batch gradient descent
NOTE: To prevent errors from the autograder, you are not allowed to edit or delete non-graded cells in this notebook. Please also refrain from adding any new cells. Once you have passed this assignment and want to experiment with any of the non-graded code, you may follow the instructions at the bottom of this notebook.
1 - Packages
First, let's run the cell below to import all the packages that you will need during this assignment.
numpy is the fundamental package for working with matrices in Python.
matplotlib is a famous library to plot graphs in Python.
utils.py contains helper functions for this assignment. You do not need to modify code in
this file.
import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline
2 - Linear regression with one variable
2.1 Problem Statement
Suppose you are the CEO of a restaurant franchise and are considering different cities for
opening a new outlet.
You would like to expand your business to cities that may give your restaurant higher
profits.
The chain already has restaurants in various cities and you have data for profits and
populations from the cities.
You also have data on cities that are candidates for a new restaurant.
For these cities, you have the city population.
Can you use the data to help you identify which cities may potentially give your business higher
profits?
2.2 Dataset
You will start by loading the dataset for this task.
The load_data() function shown below loads the data into variables x_train and y_train:
x_train is the population of a city
y_train is the profit of a restaurant in that city. A negative value for profit indicates a loss.
Both x_train and y_train are numpy arrays.
# load the dataset
x_train, y_train = load_data()

# print x_train
print("Type of x_train:", type(x_train))
print("First five elements of x_train are:\n", x_train[:5])
x_train is a numpy array that contains decimal values that are all greater than zero.
These values represent the city population times 10,000
For example, 6.1101 means that the population for that city is 61,101
Now, let's print y_train
# print y_train
print("Type of y_train:",type(y_train))
print("First five elements of y_train are:\n", y_train[:5])
Similarly, y_train is a numpy array that has decimal values, some negative, some positive.
These represent your restaurant's average monthly profits in each city, in units of $10,000.
For example, 17.592 represents $175,920 in average monthly profits for that city.
-2.6807 represents -$26,807 in average monthly loss for that city.
The city population array has 97 data points, and the monthly average profits also has 97 data
points. These are NumPy 1D arrays.
# Create a scatter plot of the data. To change the markers to red "x",
# we used the 'marker' and 'c' parameters
plt.scatter(x_train, y_train, marker='x', c='r')
plt.ylabel('Profit in $10,000')
plt.xlabel('Population of City in 10,000s')
plt.show()
2.4 Compute Cost
Exercise 1: complete the compute_cost function below to compute the cost J(w,b).
# UNQ_C1
# GRADED FUNCTION: compute_cost
def compute_cost(x, y, w, b):
    """
    Computes the cost function for linear regression.
    Args:
        x (ndarray): Shape (m,) Input to the model (Population of cities)
        y (ndarray): Shape (m,) Label (Actual profits for the cities)
        w, b (scalar): Parameters of the model
    Returns
        total_cost (float): The cost of using w,b as the parameters for linear regression
               to fit the data points in x and y
    """
    # number of training examples
    m = x.shape[0]

    # You need to return this variable correctly
    total_cost = 0

    return total_cost
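If you want something concrete to compare your answer against, here is a minimal vectorized sketch of the squared-error cost, J(w,b) = (1 / 2m) * sum over i of (w*x(i) + b - y(i))^2. It assumes x and y are 1-D numpy arrays of equal length; the graded version the course expects may be written as an explicit loop instead.
def compute_cost(x, y, w, b):
    m = x.shape[0]
    f_wb = w * x + b                                # model predictions for all m examples at once
    total_cost = np.sum((f_wb - y) ** 2) / (2 * m)  # mean squared error, halved
    return total_cost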
You can check if your implementation was correct by running the following test code:
# Compute cost with some initial values for parameters w, b
initial_w = 2
initial_b = 1

cost = compute_cost(x_train, y_train, initial_w, initial_b)
print(f'Cost at initial w: {cost:.3f}')

# Public tests
from public_tests import *
compute_cost_test(compute_cost)
2.5 Gradient descent
Exercise 2: complete the compute_gradient function below to compute the gradient of the cost with respect to the parameters.
# UNQ_C2
# GRADED FUNCTION: compute_gradient
def compute_gradient(x, y, w, b):
    """
    Computes the gradient for linear regression
    Args:
        x (ndarray): Shape (m,) Input to the model (Population of cities)
        y (ndarray): Shape (m,) Label (Actual profits for the cities)
        w, b (scalar): Parameters of the model
    Returns
        dj_dw (scalar): The gradient of the cost w.r.t. the parameter w
        dj_db (scalar): The gradient of the cost w.r.t. the parameter b
    """
    # You need to return these variables correctly
    dj_dw = 0
    dj_db = 0

    return dj_dw, dj_db
Run the cells below to check your implementation of the compute_gradient function with two
different initializations of the parameters w,b.
compute_gradient_test(compute_gradient)
2.6 Learning parameters using batch gradient descent
You will now find the optimal parameters of the linear regression model by running batch gradient descent, which repeatedly calls the cost and gradient functions you implemented above.
def gradient_descent(x, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters):
    """
    Performs batch gradient descent to learn w and b, taking num_iters
    gradient steps with learning rate alpha.
    Args:
        x : (ndarray): Shape (m,)
        y : (ndarray): Shape (m,)
        w_in, b_in : (scalar) Initial values of parameters of the model
        cost_function : function to compute cost
        gradient_function : function to compute the gradient
        alpha : (float) Learning rate
        num_iters : (int) number of iterations to run gradient descent
    Returns
        w : (ndarray): Shape (1,) Updated values of parameters of the model after
            running gradient descent
        b : (scalar) Updated value of parameter of the model after
            running gradient descent
    """
    # An array to store cost J and w's at each iteration, primarily for graphing later
    J_history = []
    w_history = []
    w = copy.deepcopy(w_in)  # avoid modifying global w within function
    b = b_in

    for i in range(num_iters):
        # Calculate the gradient and take one descent step
        dj_dw, dj_db = gradient_function(x, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db

        # Save the cost at each iteration, and print progress 10 times over the run
        J_history.append(cost_function(x, y, w, b))
        if i % math.ceil(num_iters / 10) == 0:
            w_history.append(w)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}")

    return w, b, J_history, w_history
Now let's run the gradient descent algorithm above to learn the parameters for our dataset.
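The cell that actually launches the run did not survive extraction; a plausible reconstruction follows. Starting from zero parameters is standard for this exercise, while the settings alpha = 0.01 and 1500 iterations are assumptions.
# initialize fitting parameters to zero
initial_w = 0.
initial_b = 0.

# gradient descent settings (assumed values)
iterations = 1500
alpha = 0.01

w, b, _, _ = gradient_descent(x_train, y_train, initial_w, initial_b,
                              compute_cost, compute_gradient, alpha, iterations)
print("w, b found by gradient descent:", w, b)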
m = x_train.shape[0]
predicted = np.zeros(m)

for i in range(m):
    predicted[i] = w * x_train[i] + b
We will now plot the predicted values to see the linear fit.
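The plotting cell is missing here; a minimal sketch using the variables defined above:
# Plot the linear fit on top of the training data
plt.plot(x_train, predicted, c='b')
plt.scatter(x_train, y_train, marker='x', c='r')
plt.title("Profits vs. Population per city")
plt.ylabel('Profit in $10,000')
plt.xlabel('Population of City in 10,000s')
plt.show()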
Your final values of w,b can also be used to make predictions on profits. Let's predict what the
profit would be in areas of 35,000 and 70,000 people.
The model takes in population of a city in 10,000s as input.
Therefore, 35,000 people can be translated into an input to the model as np.array([3.5])
Similarly, 70,000 people can be translated into an input to the model as np.array([7.])
predict1 = 3.5 * w + b
print('For population = 35,000, we predict a profit of $%.2f' % (predict1*10000))
predict2 = 7.0 * w + b
print('For population = 70,000, we predict a profit of $%.2f' % (predict2*10000))
Logistic Regression
utils.py
import numpy as np
import matplotlib.pyplot as plt

def load_data(filename):
    data = np.loadtxt(filename, delimiter=',')
    X = data[:, :2]
    y = data[:, 2]
    return X, y

def sig(z):
    return 1 / (1 + np.exp(-z))

def plot_data(X, y, pos_label="y=1", neg_label="y=0"):
    positive = y == 1
    negative = y == 0

    # Plot examples
    plt.plot(X[positive, 0], X[positive, 1], 'k+', label=pos_label)
    plt.plot(X[negative, 0], X[negative, 1], 'yo', label=neg_label)

def plot_decision_boundary(w, b, X, y):
    # Plot the data points first
    plot_data(X[:, 0:2], y)

    if X.shape[1] <= 2:
        # Linear boundary: solve w[0]*x0 + w[1]*x1 + b = 0 for x1
        plot_x = np.array([min(X[:, 0]), max(X[:, 0])])
        plot_y = (-1. / w[1]) * (w[0] * plot_x + b)
        plt.plot(plot_x, plot_y, c="b")
    else:
        # Non-linear boundary: evaluate the model over a grid of mapped features
        # (map_feature is the polynomial feature mapping used later in this exercise)
        u = np.linspace(-1, 1.5, 50)
        v = np.linspace(-1, 1.5, 50)
        z = np.zeros((len(u), len(v)))
        for i in range(len(u)):
            for j in range(len(v)):
                z[i, j] = sig(np.dot(map_feature(u[i], v[j]), w) + b)
        # transpose z before calling contour
        z = z.T
        # Plot z = 0.5
        plt.contour(u, v, z, levels=[0.5], colors="g")
1 - Packages
First, let's run the cell below to import all the packages that you will need during this assignment.
numpy is the fundamental package for scientific computing with Python.
matplotlib is a famous library to plot graphs in Python.
utils.py contains helper functions for this assignment. You do not need to modify code in
this file.
import numpy as np
import matplotlib.pyplot as plt
from utils import *
import copy
import math
%matplotlib inline
2 - Logistic Regression
In this part of the exercise, you will build a logistic regression model to predict whether a student
gets admitted into a university.
2.1 Problem Statement
Suppose that you are the administrator of a university department and you want to determine each
applicant’s chance of admission based on their results on two exams.
You have historical data from previous applicants that you can use as a training set for
logistic regression.
For each training example, you have the applicant’s scores on two exams and the
admissions decision.
Your task is to build a classification model that estimates an applicant’s probability of
admission based on the scores from those two exams.
2.2 Loading and visualizing the data
You will start by loading the dataset for this task.
The load_data() function shown below loads the data into variables X_train and y_train:
X_train contains exam scores on two exams for a student
y_train is the admission decision
    y_train = 1 if the student was admitted
    y_train = 0 if the student was not admitted
Both X_train and y_train are numpy arrays.
# load dataset
X_train, y_train = load_data("data/ex2data1.txt")
# UNQ_C1
# GRADED FUNCTION: sigmoid
def sigmoid(z):
    """
    Compute the sigmoid of z
    Args:
        z (ndarray): A scalar, numpy array of any size.
    Returns:
        g (ndarray): sigmoid(z), with the same shape as z
    """
    ### START CODE HERE ###
    g = 1 / (1 + np.exp(-z))
    ### END CODE HERE ###

    return g
# UNIT TESTS
from public_tests import *
sigmoid_test(sigmoid)
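As a quick sanity check of your own (these values follow directly from the sigmoid definition, not from the autograder):
print("sigmoid(0) =", sigmoid(0))                                    # 0.5 by definition
print("sigmoid([-1, 0, 1, 2]) =", sigmoid(np.array([-1, 0, 1, 2])))  # elementwise on arrays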
Note: As you are doing this, remember that the variables X_train and y_train are not scalar values but matrices of shape (m, n) and (m, 1) respectively, where n is the number of features and m is the number of training examples.
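For reference, the (unregularized) logistic cost that compute_cost below implements can be written as

$$J(\mathbf{w},b) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right) - \left(1-y^{(i)}\right)\log\left(1-f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right)\right]$$

where $f_{\mathbf{w},b}(\mathbf{x}) = \mathrm{sigmoid}(\mathbf{w}\cdot\mathbf{x}+b)$.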
# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, *argv):
    """
    Computes the cost over all examples
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below
    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    loss_sum = 0

    for i in range(m):
        # Linear combination z = w . x^(i) + b, accumulated feature by feature
        z_wb = 0
        for j in range(n):
            z_wb += w[j] * X[i][j]
        z_wb += b

        # Model prediction and the log loss for this example
        f_wb = sigmoid(z_wb)
        loss_sum += -y[i] * np.log(f_wb) - (1 - y[i]) * np.log(1 - f_wb)

    total_cost = loss_sum / m
    ### END CODE HERE ###

    return total_cost
Run the cells below to check your implementation of the compute_cost function with two different initializations of the parameters w and b.
m, n = X_train.shape

# Compute and display cost with w and b initialized to zeros
initial_w = np.zeros(n)
initial_b = 0.
cost = compute_cost(X_train, y_train, initial_w, initial_b)
print('Cost at initial w and b (zeros): {:.3f}'.format(cost))
# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, *argv):
    """
    Computes the gradient for logistic regression
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        *argv : unused, for compatibility with regularized version below
    Returns
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.

    # Nguyen Do Phuoc Rin 21098871
    ### START CODE HERE ###
    for i in range(m):
        # Prediction for example i
        z_wb = 0
        for j in range(n):
            z_wb += X[i, j] * w[j]
        z_wb += b
        f_wb = sigmoid(z_wb)

        # Accumulate the gradient contributions of example i
        dj_db += f_wb - y[i]
        for j in range(n):
            dj_dw[j] += (f_wb - y[i]) * X[i][j]

    dj_dw = dj_dw / m
    dj_db = dj_db / m
    ### END CODE HERE ###

    return dj_db, dj_dw
# UNIT TESTS
compute_gradient_test(compute_gradient)
def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters, lambda_):
    """
    Performs batch gradient descent to learn w and b
    Returns:
        w : (ndarray Shape (n,)) Updated parameters after running gradient descent
        b : (scalar) Updated bias after running gradient descent
    """
    # An array to store cost J and w's at each iteration, primarily for graphing later
    J_history = []
    w_history = []

    for i in range(num_iters):
        # Calculate the gradient and take one descent step
        dj_db, dj_dw = gradient_function(X, y, w_in, b_in, lambda_)
        w_in = w_in - alpha * dj_dw
        b_in = b_in - alpha * dj_db

        # Record the cost, and print progress 10 times over the run
        J_history.append(cost_function(X, y, w_in, b_in, lambda_))
        if i % math.ceil(num_iters / 10) == 0 or i == num_iters - 1:
            w_history.append(w_in)
            print(f"Iteration {i:4}: Cost {float(J_history[-1]):8.2f}")

    return w_in, b_in, J_history, w_history  # return w and J,w history for graphing
Now let's run the gradient descent algorithm above to learn the parameters for our dataset.
Note: The code block below takes a couple of minutes to run, especially with a non-vectorized version. You can reduce the iterations to test your implementation and iterate faster. If you have time later, try running 100,000 iterations for better results.
np.random.seed(1)
initial_w = 0.01 * (np.random.rand(2) - 0.5)
initial_b = -8
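The call that actually runs gradient descent did not survive here; a plausible completion follows, where alpha = 0.001 and 10,000 iterations are assumptions (lambda_ is 0 since there is no regularization yet).
# gradient descent settings (assumed values)
iterations = 10000
alpha = 0.001

w, b, J_history, _ = gradient_descent(X_train, y_train, initial_w, initial_b,
                                      compute_cost, compute_gradient, alpha, iterations, 0)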
# UNQ_C4
# GRADED FUNCTION: predict
def predict(X, w, b):
    """
    Predict whether the label is 0 or 1 using learned logistic
    regression parameters w and b
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
    Returns:
        p : (ndarray (m,)) The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape
    p = np.zeros(m)

    # Predict 1 when the model probability is at least 0.5, else 0
    for i in range(m):
        p[i] = sigmoid(np.dot(X[i], w) + b) >= 0.5

    return p
# UNIT TESTS
predict_test(predict)
Now let's use this to compute the accuracy on the training set, as sketched below.
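A sketch of that computation, assuming predict and the learned w, b from above:
p = predict(X_train, w, b)
print('Train Accuracy: %f' % (np.mean(p == y_train) * 100))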
3 - Regularized Logistic Regression
In this part of the exercise, you will implement regularized logistic regression. You will again start by loading the dataset and printing a few values.
# load dataset for the regularized problem
X_train, y_train = load_data("data/ex2data2.txt")

# print X_train
print("X_train:", X_train[:5])
print("Type of X_train:", type(X_train))

# print y_train
print("y_train:", y_train[:5])
print("Type of y_train:", type(y_train))
# Plot examples
plot_data(X_train, y_train[:], pos_label="Accepted", neg_label="Rejected")
print("X_train[0]:", X_train[0])
print("mapped X_train[0]:", mapped_X[0])
# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_=1):
    """
    Computes the cost over all examples
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) Controls amount of regularization
    Returns:
        total_cost : (scalar) cost
    """
    m, n = X.shape

    # Cost without regularization, reusing the unregularized implementation above
    cost_without_reg = compute_cost(X, y, w, b)

    # Standard L2 penalty: (lambda_ / 2m) * sum of w_j^2 (b is not regularized)
    reg_cost = (lambda_ / (2 * m)) * np.sum(w ** 2)

    total_cost = cost_without_reg + reg_cost

    return total_cost
# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_=1):
    """
    Computes the gradient for logistic regression with regularization
    Args:
        X : (ndarray Shape (m,n)) data, m examples by n features
        y : (ndarray Shape (m,))  target value
        w : (ndarray Shape (n,))  values of parameters of the model
        b : (scalar)              value of bias parameter of the model
        lambda_ : (scalar, float) regularization constant
    Returns
        dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b.
        dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w.
    """
    m, n = X.shape

    # Unregularized gradient, reusing the earlier implementation
    dj_db, dj_dw = compute_gradient(X, y, w, b)

    # Add the L2 penalty term to the weight gradients (b is not regularized)
    for j in range(n):
        dj_dw[j] += (lambda_ / m) * w[j]

    return dj_db, dj_dw
K-means Clustering
utils.py
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from mpl_toolkits.mplot3d import Axes3D

def load_data():
    X = np.load("data/ex7_X.npy")
    return X

def plot_data_points(X, idx):
    # plots data points in X, coloring them so that those with the same
    # index assignments in idx have the same color
    cmap = ListedColormap(["red", "green", "blue"])  # assumed palette for K = 3 clusters
    c = cmap(idx)
    plt.scatter(X[:, 0], X[:, 1], facecolors='none', edgecolors=c, linewidth=0.1, alpha=0.7)

def show_centroid_colors(centroids):
    palette = np.expand_dims(centroids, axis=0)
    num = np.arange(0, len(centroids))
    plt.figure(figsize=(16, 16))
    plt.xticks(num)
    plt.yticks([])
    plt.imshow(palette)
import numpy as np
import matplotlib.pyplot as plt
from utils import *
%matplotlib inline
# UNQ_C1
# GRADED FUNCTION: find_closest_centroids
def find_closest_centroids(X, centroids):
    """
    Computes the centroid memberships for every example
    Args:
        X (ndarray): (m, n) Input values
        centroids (ndarray): (K, n) centroids
    Returns:
        idx (array_like): (m,) closest centroids
    """
    # Set K
    K = centroids.shape[0]

    # You need to return the following variables correctly
    idx = np.zeros(X.shape[0], dtype=int)

    # YOUR CODE HERE
    for i in range(X.shape[0]):
        # Array to hold the distance between X[i] and each centroids[j]
        distance = []
        for j in range(centroids.shape[0]):
            norm_ij = np.linalg.norm(X[i] - centroids[j])
            distance.append(norm_ij)
        # Assign X[i] to the centroid with the smallest distance
        idx[i] = np.argmin(distance)

    return idx
# UNIT TEST
from public_tests import *
find_closest_centroids_test(find_closest_centroids)
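To see it working on the example dataset, you can assign points to an initial guess of three centroids (the particular coordinates here are just an illustration):
# Load the example dataset and pick three initial centroids to test with
X = load_data()
initial_centroids = np.array([[3, 3], [6, 2], [8, 5]])

idx = find_closest_centroids(X, initial_centroids)
print("First three elements in idx are:", idx[:3])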
# UNQ_C2
# GRADED FUNCTION: compute_centroids
def compute_centroids(X, idx, K):
    """
    Returns the new centroids by computing the means of the
    data points assigned to each centroid.
    Args:
        X (ndarray): (m, n) Data points
        idx (ndarray): (m,) Array containing index of closest centroid for each
                       example in X. Concretely, idx[i] contains the index of
                       the centroid closest to example i
        K (int): number of centroids
    Returns:
        centroids (ndarray): (K, n) New centroids computed
    """
    # Useful variables
    m, n = X.shape

    # Each new centroid is the mean of the points assigned to it
    centroids = np.zeros((K, n))
    for k in range(K):
        points = X[idx == k]
        centroids[k] = np.mean(points, axis=0)

    return centroids
K=3
centroids = compute_centroids(X, idx, K)
# UNIT TEST
compute_centroids_test(compute_centroids)
def run_kMeans(X, initial_centroids, max_iters=10):
    """Runs the K-Means algorithm on data matrix X (one example per row)."""
    # Initialize values
    m, n = X.shape
    K = initial_centroids.shape[0]
    centroids = initial_centroids
    previous_centroids = centroids
    idx = np.zeros(m)

    # Run K-Means
    for i in range(max_iters):
        # Output progress
        print("K-Means iteration %d/%d" % (i, max_iters - 1))

        # For each example in X, assign it to the closest centroid
        idx = find_closest_centroids(X, centroids)

        # Given the memberships, compute new centroids
        previous_centroids = centroids
        centroids = compute_centroids(X, idx, K)

    return centroids, idx
# Number of iterations
max_iters = 10

# Run K-Means on the dataset, starting from the initial centroids chosen above
centroids, idx = run_kMeans(X, initial_centroids, max_iters)
def kMeans_init_centroids(X, K):
    """
    Initializes K centroids by picking K random examples from X
    Args:
        X (ndarray): Data points
        K (int): number of centroids/clusters
    Returns:
        centroids (ndarray): Initialized centroids
    """
    # Randomly reorder the indices of examples, then take the first K as centroids
    randidx = np.random.permutation(X.shape[0])
    centroids = X[randidx[:K]]
    return centroids
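Running K-Means from a random initialization might then look like this (the settings are illustrative):
# Run K-Means again, this time with centroids picked at random from the data
K = 3
max_iters = 10
initial_centroids = kMeans_init_centroids(X, K)
centroids, idx = run_kMeans(X, initial_centroids, max_iters)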
You can now use K-Means to compress an image by reducing it to a small palette of colors.
# Load an image (the filename is assumed from this exercise's data files)
original_img = plt.imread('bird_small.png')

# Divide by 255 so that all values are in the range 0 - 1
original_img = original_img / 255

# Plot the original image (scaled back up for display)
fig, ax = plt.subplots(1, 2, figsize=(8, 8))
ax[0].imshow(original_img * 255)
ax[0].set_title("Original")
ax[0].set_axis_off()
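The rest of the compression pipeline is missing above; here is a sketch of how it fits together, reusing kMeans_init_centroids and run_kMeans (K = 16 colors and 10 iterations are assumptions, and the image is assumed to have 3 color channels).
# Reshape the image into an (m, 3) matrix of RGB pixel values
X_img = np.reshape(original_img, (original_img.shape[0] * original_img.shape[1], 3))

# Cluster the pixel colors into K representative colors
K = 16
max_iters = 10
initial_centroids = kMeans_init_centroids(X_img, K)
centroids, idx = run_kMeans(X_img, initial_centroids, max_iters)

# Replace each pixel with its closest centroid color and restore the image shape
X_recovered = centroids[idx, :]
X_recovered = np.reshape(X_recovered, original_img.shape)

# Show the compressed image in the right-hand panel created above
ax[1].imshow(X_recovered * 255)
ax[1].set_title("Compressed with %d colors" % K)
ax[1].set_axis_off()
plt.show()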