C1 W3 Logistic Regression
In this exercise, you will implement logistic regression and apply it to two different datasets.
Outline
• 1 - Packages
• 2 - Logistic Regression
– 2.1 Problem Statement
– 2.2 Loading and visualizing the data
– 2.3 Sigmoid function
– 2.4 Cost function for logistic regression
– 2.5 Gradient for logistic regression
– 2.6 Learning parameters using gradient descent
– 2.7 Plotting the decision boundary
– 2.8 Evaluating logistic regression
• 3 - Regularized Logistic Regression
– 3.1 Problem Statement
– 3.2 Loading and visualizing the data
– 3.3 Feature mapping
– 3.4 Cost function for regularized logistic regression
– 3.5 Gradient for regularized logistic regression
– 3.6 Learning parameters using gradient descent
– 3.7 Plotting the decision boundary
– 3.8 Evaluating regularized logistic regression model
1 - Packages
First, let's run the cell below to import all the packages that you will need during this
assignment.
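The import cell itself is not reproduced in this writeup. A typical set of imports for this assignment would look like the sketch below; the exact lines are an assumption, with numpy and matplotlib used by the plotting code later and helpers such as load_data and plot_data coming from the course-provided utils.py.
```python
# Assumed import cell (not shown in the original writeup)
import numpy as np                 # array math
import matplotlib.pyplot as plt    # plotting
from utils import *                # course helpers such as load_data and plot_data
```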
2 - Logistic Regression
In this part of the exercise, you will build a logistic regression model to predict whether a
student gets admitted into a university.
• You have historical data from previous applicants that you can use as a training set for
logistic regression.
• For each training example, you have the applicant’s scores on two exams and the
admissions decision.
• Your task is to build a classification model that estimates an applicant’s probability of
admission based on the scores from those two exams.
• The load_data() function shown below loads the data into variables X_train and
y_train
– X_train contains exam scores on two exams for a student
– y_train is the admission decision
• y_train = 1 if the student was admitted
• y_train = 0 if the student was not admitted
– Both X_train and y_train are numpy arrays.
# load dataset
X_train, y_train = load_data("data/ex2data1.txt")
• A good place to start is to just print out each variable and see what it contains.
The code below prints the first five values of X_train and the type of the variable.
print("First five elements in X_train are:\n", X_train[:5])
print("Type of X_train:",type(X_train))
• The code below displays the data on a 2D plot (as shown below), where the axes are the
two exam scores, and the positive and negative examples are shown with different
markers.
• We use a helper function in the utils.py file to generate this plot.
# Plot examples
plot_data(X_train, y_train[:], pos_label="Admitted", neg_label="Not admitted")
# Set the y-axis label
plt.ylabel('Exam 2 score')
# Set the x-axis label
plt.xlabel('Exam 1 score')
plt.legend(loc="upper right")
plt.show()
• With this model, you can then predict if a new student will be admitted based on their
scores on the two exams.
Recall that for logistic regression, the model is represented as

$$f_{w,b}(x) = g(w \cdot x + b)$$

where function $g$ is the sigmoid function. The sigmoid function is defined as:

$$g(z) = \frac{1}{1 + e^{-z}}$$
Let's implement the sigmoid function first, so it can be used by the rest of this assignment.
Exercise 1
Please complete the sigmoid function to calculate

$$g(z) = \frac{1}{1 + e^{-z}}$$

Note that z is not always a single number; it can also be a numpy array of numbers, in which case the sigmoid should be applied to every element of the array.
If you get stuck, you can check out the hints presented after the cell below to help you with the
implementation.
# UNQ_C1
# GRADED FUNCTION: sigmoid
def sigmoid(z):
    """
    Compute the sigmoid of z

    Args:
        z (ndarray): A scalar, numpy array of any size.

    Returns:
        g (ndarray): sigmoid(z), with the same shape as z
    """
numpy has a function called np.exp(), which offers a convenient way to calculate the
exponential ($e^z$) of all elements in the input array (z).
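For reference, a minimal sketch of such an implementation is shown below. The name sigmoid_sketch is used to avoid presenting it as the graded solution; it relies only on np.exp as described above.
```python
import numpy as np

def sigmoid_sketch(z):
    # 1 / (1 + e^(-z)), applied elementwise; works for scalars and numpy arrays
    g = 1 / (1 + np.exp(-z))
    return g
```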
When you are finished, try testing a few values by calling sigmoid(x) in the cell below.
• For large positive values of x, the sigmoid should be close to 1, while for large negative
values, the sigmoid should be close to 0.
• Evaluating sigmoid(0) should give you exactly 0.5.
print ("sigmoid(0) = " + str(sigmoid(0)))
sigmoid(0) = 0.5
Expected Output: sigmoid(0) = 0.5
• As mentioned before, your code should also work with vectors and matrices. For a
matrix, your function should perform the sigmoid function on every element.
print ("sigmoid([ -1, 0, 1, 2]) = " + str(sigmoid(np.array([-1, 0, 1,
2]))))
# UNIT TESTS
from public_tests import *
sigmoid_test(sigmoid)
Exercise 2
Please complete the compute_cost function using the equations below.
Recall that for logistic regression, the cost function is of the form
$$J(w,b) = \frac{1}{m} \sum_{i=0}^{m-1} \left[ loss\left(f_{w,b}(x^{(i)}), y^{(i)}\right) \right]$$

where

$$loss\left(f_{w,b}(x^{(i)}), y^{(i)}\right) = -y^{(i)} \log\left(f_{w,b}(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - f_{w,b}(x^{(i)})\right)$$
• As you are doing this, remember that the variables X_train and y_train are not scalar
values but matrices of shape (m, n) and (m, 1) respectively, where n is the number of
features and m is the number of training examples.
• You can use the sigmoid function that you implemented above for this part.
If you get stuck, you can check out the hints presented after the cell below to help you with the
implementation.
# UNQ_C2
# GRADED FUNCTION: compute_cost
def compute_cost(X, y, w, b, lambda_= 1):
    """
    Computes the cost over all examples
    Args:
      X : (ndarray Shape (m,n))  data, m examples by n features
      y : (array_like Shape (m,)) target value
      w : (array_like Shape (n,)) Values of parameters of the model
      b : (scalar)                Value of bias parameter of the model
      lambda_ : (scalar, float)   unused placeholder
    Returns:
      total_cost : (scalar) cost
    """
    m, n = X.shape
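For reference, one possible (non-vectorized) completion is sketched below. This is a sketch under the assumptions that w has shape (n,) and that the sigmoid function implemented earlier is available; it is not presented as the graded solution.
```python
import numpy as np

def compute_cost_sketch(X, y, w, b, lambda_=1):
    # lambda_ is an unused placeholder, kept only to match the graded signature
    m, n = X.shape
    total_cost = 0.0
    for i in range(m):
        # model output f_wb = g(w . x^(i) + b), using the sigmoid implemented above
        z_i = np.dot(X[i], w) + b
        f_wb_i = sigmoid(z_i)
        # logistic loss for example i
        total_cost += -y[i] * np.log(f_wb_i) - (1 - y[i]) * np.log(1 - f_wb_i)
    return total_cost / m
```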
Run the cells below to check your implementation of the compute_cost function with two
different initializations of the parameters w
m, n = X_train.shape
# UNIT TESTS
compute_cost_test(compute_cost)
Exercise 3
Please complete the compute_gradient function to compute $\frac{\partial J(w,b)}{\partial w}$ and $\frac{\partial J(w,b)}{\partial b}$ from equations (2) and (3) below.

$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) \tag{2}$$

$$\frac{\partial J(w,b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x_j^{(i)} \tag{3}$$
• m is the number of training examples in the dataset
• Note: While this gradient looks identical to the linear regression gradient, the
formula is actually different because linear and logistic regression have different
definitions of $f_{w,b}(x)$.
As before, you can use the sigmoid function that you implemented above and if you get stuck,
you can check out the hints presented after the cell below to help you with the implementation.
# UNQ_C3
# GRADED FUNCTION: compute_gradient
def compute_gradient(X, y, w, b, lambda_=None):
    """
    Computes the gradient for logistic regression
    Args:
      X : (ndarray Shape (m,n))    variable such as house size
      y : (array_like Shape (m,1)) actual value
      w : (array_like Shape (n,1)) values of parameters of the model
      b : (scalar)                 value of parameter of the model
      lambda_ :                    unused placeholder
    Returns:
      dj_db : (scalar)                 The gradient of the cost w.r.t. the parameter b
      dj_dw : (array_like Shape (n,1)) The gradient of the cost w.r.t. the parameters w
    """
• Here's how you can structure the overall implementation for this function:

```python
def compute_gradient(X, y, w, b, lambda_=None):
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.
```

If you're still stuck, you can check the hints presented below to figure out how to
calculate f_wb, dj_db_i and dj_dw_ij.

• Hint to calculate f_wb: Recall that you calculated f_wb in compute_cost above; for detailed hints on how to calculate each intermediate term, check out the hints section below that exercise.
• More hints to calculate f_wb: You can calculate f_wb as

    for i in range(m):
        # Calculate f_wb (exactly how you did it in the compute_cost function above)
        z_wb = 0
        # Loop over each feature
        for j in range(n):
            # Add the corresponding term to z_wb
            z_wb_ij = X[i, j] * w[j]
            z_wb += z_wb_ij

• Hint to calculate dj_db_i: You can calculate dj_db_i as dj_db_i = f_wb - y[i]
• Hint to calculate dj_dw_ij: You can calculate dj_dw_ij as dj_dw_ij = (f_wb - y[i]) * X[i][j]
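Putting the hints above together, one possible completion looks like the sketch below. It assumes the sigmoid function from earlier is available and that the function returns dj_db first, matching the compute_gradient_reg call shown later in this assignment; it is illustrative rather than the graded solution.
```python
import numpy as np

def compute_gradient_sketch(X, y, w, b, lambda_=None):
    # lambda_ is an unused placeholder, kept only to match the graded signature
    m, n = X.shape
    dj_dw = np.zeros(w.shape)
    dj_db = 0.
    for i in range(m):
        # f_wb = g(w . x^(i) + b), computed exactly as in compute_cost
        f_wb = sigmoid(np.dot(X[i], w) + b)
        # error for example i contributes to both gradients
        err_i = f_wb - y[i]
        dj_db += err_i
        for j in range(n):
            dj_dw[j] += err_i * X[i, j]
    return dj_db / m, dj_dw / m
```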
Run the cells below to check your implementation of the compute_gradient function with
two different initializations of the parameters w
# UNIT TESTS
compute_gradient_test(compute_gradient)
• You don't need to implement anything for this part. Simply run the cells below.
• A good way to verify that gradient descent is working correctly is to look at the
value of $J(w,b)$ and check that it is decreasing with each step.
• Assuming you have implemented the gradient and computed the cost correctly, your
value of $J(w,b)$ should never increase, and should converge to a steady value by the
end of the algorithm.
def gradient_descent(X, y, w_in, b_in, cost_function, gradient_function, alpha, num_iters, lambda_):
    """
    Performs batch gradient descent to learn the parameters. Updates the parameters
    by taking num_iters gradient steps with learning rate alpha.

    Args:
      X :    (array_like Shape (m, n))
      y :    (array_like Shape (m,))
      w_in : (array_like Shape (n,)) Initial values of parameters of the model
      b_in : (scalar)                Initial value of parameter of the model
      cost_function :                function to compute cost
      gradient_function :            function to compute gradient
      alpha : (float)                Learning rate
      num_iters : (int)              number of iterations to run gradient descent
      lambda_ : (scalar, float)      regularization constant

    Returns:
      w : (array_like Shape (n,)) Updated values of parameters of the model after
          running gradient descent
      b : (scalar)                Updated value of parameter of the model after
          running gradient descent
    """
    for i in range(num_iters):
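As a reference for what happens inside this loop, here is a minimal self-contained sketch. The name gradient_descent_sketch and the assumption that gradient_function returns (dj_db, dj_dw) follow the compute_gradient_reg call shown later in this assignment; this is illustrative, not the graded helper.
```python
import numpy as np

def gradient_descent_sketch(X, y, w_in, b_in, cost_function, gradient_function,
                            alpha, num_iters, lambda_=0.):
    w = np.array(w_in, dtype=float)
    b = b_in
    J_history = []
    for i in range(num_iters):
        # gradient at the current parameters; (dj_db, dj_dw) return order assumed
        dj_db, dj_dw = gradient_function(X, y, w, b, lambda_)
        # simultaneous parameter update with learning rate alpha
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        # track the cost so you can check that J(w,b) is decreasing
        J_history.append(cost_function(X, y, w, b, lambda_))
    return w, b, J_history
```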
Now let's run the gradient descent algorithm above to learn the parameters for our dataset.
Note
The code block below takes a couple of minutes to run, especially with a non-vectorized version.
You can reduce the iterations to test your implementation and iterate faster. If you have
time, try running 100,000 iterations for better results.
np.random.seed(1)
initial_w = 0.01 * (np.random.rand(2).reshape(-1,1) - 0.5)
initial_b = -8
We will use a helper function in the utils.py file to create this plot.
Exercise 4
Please complete the predict function to produce 1 or 0 predictions given a dataset and
learned parameters $w$ and $b$.
• First you need to compute the prediction from the model $f(x^{(i)}) = g(w \cdot x^{(i)} + b)$ for every
example
  – You've implemented this before in the parts above
• We interpret the output of the model $f(x^{(i)})$ as the probability that $y^{(i)} = 1$ given $x^{(i)}$
and parameterized by $w$.
• Therefore, to get a final prediction ($y^{(i)} = 0$ or $y^{(i)} = 1$) from the logistic regression
model, you can use the following heuristic: predict $y^{(i)} = 1$ if $f(x^{(i)}) \geq 0.5$, and $y^{(i)} = 0$ otherwise.
# UNQ_C4
# GRADED FUNCTION: predict

def predict(X, w, b):
    """
    Predict whether the label is 0 or 1 using learned logistic regression
    parameters w and b

    Args:
      X : (ndarray Shape (m, n))
      w : (array_like Shape (n,)) Parameters of the model
      b : (scalar, float)         Parameter of the model

    Returns:
      p : (ndarray (m,1))
          The predictions for X using a threshold at 0.5
    """
    # number of training examples
    m, n = X.shape
    p = np.zeros(m)
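One possible (vectorized) way to finish the function is sketched below. It assumes the sigmoid function from earlier and a weight vector w of shape (n,), and is not presented as the graded solution.
```python
import numpy as np

def predict_sketch(X, w, b):
    # model output f_wb = g(X . w + b) for all m examples at once
    f_wb = sigmoid(np.dot(X, w) + b)
    # apply the 0.5 threshold to turn probabilities into 0/1 predictions
    p = (f_wb >= 0.5).astype(float)
    return p
```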
Once you have completed the function predict, let's run the code below to report the training
accuracy of your classifier by computing the percentage of examples it got correct.
# UNIT TESTS
predict_test(predict)
Output of predict: shape (4,), value [0. 1. 1. 1.]
All tests passed!
Expected output: shape (4,), value [0. 1. 1. 1.]
Now let's use this to compute the accuracy on the training set
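The accuracy cell itself is not reproduced here; a minimal sketch, assuming w and b hold the parameters learned by gradient descent above, is:
```python
# Training accuracy: fraction of examples where the prediction matches the label
p = predict(X_train, w, b)
print('Train Accuracy: %f' % (np.mean(p == y_train) * 100))
```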
3 - Regularized Logistic Regression
3.1 Problem Statement
In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA), based on the results of two tests.
• From these two tests, you would like to determine whether the microchips should be
accepted or rejected.
• To help you make the decision, you have a dataset of test results on past microchips,
from which you can build a logistic regression model.
• The load_data() function shown below loads the data into variables X_train and
y_train
– X_train contains the test results for the microchips from two tests
– y_train contains the results of the QA
• y_train = 1 if the microchip was accepted
• y_train = 0 if the microchip was rejected
– Both X_train and y_train are numpy arrays.
# load dataset
X_train, y_train = load_data("data/ex2data2.txt")
# print X_train
print("X_train:", X_train[:5])
print("Type of X_train:",type(X_train))
# print y_train
print("y_train:", y_train[:5])
print("Type of y_train:",type(y_train))
# Plot examples
plot_data(X_train, y_train[:], pos_label="Accepted", neg_label="Rejected")
# Set the y-axis label
plt.ylabel('Microchip Test 2')
# Set the x-axis label
plt.xlabel('Microchip Test 1')
plt.legend(loc="upper right")
plt.show()
Figure 3 shows that our dataset cannot be separated into positive and negative examples by a
straight line through the plot. Therefore, a straightforward application of logistic regression
will not perform well on this dataset since logistic regression will only be able to find a linear
decision boundary.
3.3 Feature mapping
One way to fit the data better is to create more features from each data point. The provided
map_feature function maps the two features into all polynomial terms of $x_1$ and $x_2$ up to the
sixth power. As a result of this mapping, our vector of two features (the scores on the two QA
tests) has been transformed into a 27-dimensional vector.
• A logistic regression classifier trained on this higher-dimension feature vector will have a
more complex decision boundary and will be nonlinear when drawn in our 2-dimensional
plot.
• We have provided the map_feature function for you in utils.py.
print("Original shape of data:", X_train.shape)
Let's also print the first elements of X_train and mapped_X to see the transformation.
print("X_train[0]:", X_train[0])
print("mapped X_train[0]:", mapped_X[0])
While the feature mapping allows us to build a more expressive classifier, it is also more
susceptible to overfitting. In the next parts of the exercise, you will implement regularized
logistic regression to fit the data and also see for yourself how regularization can help combat
the overfitting problem.
3.4 Cost function for regularized logistic regression
In this part, you will implement the cost function for regularized logistic regression.
Recall that for regularized logistic regression, the cost function is of the form
$$J(w,b) = \frac{1}{m} \sum_{i=0}^{m-1} \left[ -y^{(i)} \log\left(f_{w,b}(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - f_{w,b}(x^{(i)})\right) \right] + \frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$

Compare this to the cost function without regularization (which you implemented above), which
is of the form

$$J(w,b) = \frac{1}{m} \sum_{i=0}^{m-1} \left[ -y^{(i)} \log\left(f_{w,b}(x^{(i)})\right) - \left(1 - y^{(i)}\right) \log\left(1 - f_{w,b}(x^{(i)})\right) \right]$$

The difference is the regularization term, which is

$$\frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$
Exercise 5
Please complete the compute_cost_reg function below to calculate the following term for
each element in $w$:

$$\frac{\lambda}{2m} \sum_{j=0}^{n-1} w_j^2$$

The starter code then adds this to the cost without regularization (which you computed above in
compute_cost) to calculate the cost with regularization.
If you get stuck, you can check out the hints presented after the cell below to help you with the
implementation.
# UNQ_C5
def compute_cost_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the cost over all examples
    Args:
      X : (array_like Shape (m,n)) data, m examples by n features
      y : (array_like Shape (m,))  target value
      w : (array_like Shape (n,))  Values of parameters of the model
      b : (scalar)                 Value of bias parameter of the model
      lambda_ : (scalar, float)    Controls the amount of regularization
    Returns:
      total_cost : (scalar) cost
    """
    m, n = X.shape

    return total_cost
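For reference, a minimal sketch of the completed function is shown below. It reuses the compute_cost function from the earlier exercise (consistent with the instructions above) and only adds the regularization term; it is not presented as the graded solution.
```python
import numpy as np

def compute_cost_reg_sketch(X, y, w, b, lambda_=1):
    m, n = X.shape
    # cost without regularization, as implemented earlier in compute_cost
    cost_without_reg = compute_cost(X, y, w, b)
    # regularization term (lambda_/(2m)) * sum_j w_j^2; note that b is not regularized
    reg_cost = (lambda_ / (2 * m)) * np.sum(np.square(w))
    return cost_without_reg + reg_cost
```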
Run the cell below to check your implementation of the compute_cost_reg function.
# UNIT TEST
compute_cost_reg_test(compute_cost_reg)
3.5 Gradient for regularized logistic regression
In this part, you will implement the gradient for regularized logistic regression. The gradient of the regularized cost function with respect to $w_j$ is

$$\frac{\partial J(w,b)}{\partial w_j} = \left( \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x_j^{(i)} \right) + \frac{\lambda}{m} w_j \quad \text{for } j = 0 \ldots (n-1)$$

Compare this to the gradient of the cost function without regularization (which you
implemented above), which is of the form

$$\frac{\partial J(w,b)}{\partial b} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right)$$

$$\frac{\partial J(w,b)}{\partial w_j} = \frac{1}{m} \sum_{i=0}^{m-1} \left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$

As you can see, $\frac{\partial J(w,b)}{\partial b}$ is the same; the difference is the following additional term in $\frac{\partial J(w,b)}{\partial w}$:

$$\frac{\lambda}{m} w_j \quad \text{for } j = 0 \ldots (n-1)$$
Exercise 6
Please complete the compute_gradient_reg function below to calculate the following term

$$\frac{\lambda}{m} w_j \quad \text{for } j = 0 \ldots (n-1)$$

The starter code will add this term to the $\frac{\partial J(w,b)}{\partial w}$ returned from compute_gradient above
to get the gradient for the regularized cost function.
If you get stuck, you can check out the hints presented after the cell below to help you with the
implementation.
# UNQ_C6
def compute_gradient_reg(X, y, w, b, lambda_ = 1):
    """
    Computes the gradient for regularized logistic regression
    Args:
      X : (ndarray Shape (m,n))   variable such as house size
      y : (ndarray Shape (m,))    actual value
      w : (ndarray Shape (n,))    values of parameters of the model
      b : (scalar)                value of parameter of the model
      lambda_ : (scalar, float)   regularization constant
    Returns:
      dj_db : (scalar)             The gradient of the cost w.r.t. the parameter b
      dj_dw : (ndarray Shape (n,)) The gradient of the cost w.r.t. the parameters w
    """
    m, n = X.shape
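For reference, a minimal sketch of the completed function is shown below. It assumes compute_gradient from the earlier exercise returns (dj_db, dj_dw), matching the call in the next cell, and only adds the regularization term to the weight gradient; it is not presented as the graded solution.
```python
import numpy as np

def compute_gradient_reg_sketch(X, y, w, b, lambda_=1):
    m, n = X.shape
    # unregularized gradient from the earlier exercise
    dj_db, dj_dw = compute_gradient(X, y, w, b)
    # add (lambda_/m) * w_j to each weight gradient; the bias gradient is unchanged
    dj_dw = dj_dw + (lambda_ / m) * w
    return dj_db, dj_dw
```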
Run the cell below to check your implementation of the compute_gradient_reg function.
lambda_ = 0.5
dj_db, dj_dw = compute_gradient_reg(X_mapped, y_train, initial_w, initial_b, lambda_)

print(f"dj_db: {dj_db}")
print(f"First few elements of regularized dj_dw:\n {dj_dw[:4].tolist()}")
# UNIT TESTS
compute_gradient_reg_test(compute_gradient_reg)
dj_db: 0.07138288792343662
First few elements of regularized dj_dw:
[-0.010386028450548703, 0.011409852883280122, 0.0536273463274574,
0.003140278267313462]
All tests passed!
• If you have completed the cost and gradient for regularized logistic regression correctly,
you should be able to step through the next cell to learn the parameters w .
• After training the parameters, we will use them to plot the decision boundary.
Note
The code block below takes quite a while to run, especially with a non-vectorized version. You
can reduce the iterations to test your implementation and iterate faster. If you have time, run
for 100,000 iterations to see better results.
• After learning the parameters $w, b$, the next step is to plot a decision boundary
similar to Figure 4.