0% found this document useful (0 votes)

33 views40 pages

Python Tut Gradient Descent Algos MLR - Jupyter Notebook

Uploaded by

Anvitha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

33 views40 pages

Python Tut Gradient Descent Algos MLR - Jupyter Notebook

Uploaded by

Anvitha

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 40

Python Tutorial on Implementation of Gradient Desecent Optimization

algorithms from Scratch

Example : Implementation of Gradient Descent Algorithms from scratch in Python

Table of Contents
1. Create Data
2. Solve Direclty
3. Batch Gradient Descent
4. Stochastic Gradient Descent
5. Mini-batch Gradient Descent

In [1]:  1 import numpy as np

2 import matplotlib.pyplot as plt
3
4 %matplotlib inline
1.Create Data
In [15]:  1 # generate random data with some noise
2
3 X = 2 * np.random.rand(100, 1)
4 print(X.shape)
5 # print(X)
6 y = 8 +(3*X) + (np.random.randn(100, 1))
7 print(y.shape)
8
9 ## For printing together input and target
10 X_y=np.c_[X,y]
11 print("Input and target")
12 print(X_y)
(100, 1)
(100, 1)
Input and target
[[ 0.04497682 9.10423587]
[ 0.44369684 9.91643706]
[ 1.14757428 11.57207686]
[ 1.98332104 12.02308413]
[ 0.03038902 9.2510614 ]
[ 1.06406009 10.54856201]
[ 0.76146702 11.78215315]
[ 0.92371199 9.46087616]
[ 1.14317242 11.56766946]
[ 1.55185131 12.12205791]
[ 0.0186962 7.2777673 ]
[ 1.1789661 11.50784111]
[ 0.12408403 7.41553359]
[ 1.19483023 12.57702559]
[ 1.59064174 14.23886903]
[ 1.40627822 12.4646239 ]
[ 0.41236386 9.66891799]
[ 0.01554289 7.50275661]
[ 1.24460193 12.63655415]
[ 0.59811763 11.0568907 ]
[ 0.41969231 9.65690536]
[ 1.36654913 12.13159462]
[ 1.64921642 13.65952935]
[ 0.35949318 6.70617922]
[ 0.02194567 8.71344024]
[ 0.19417357 7.8774343 ]
[ 0.43027749 8.26604118]
[ 0.01570807 7.77699381]
[ 1.8942345 14.27457547]
[ 1.84577681 13.78644959]
[ 1.0891461 11.9357483 ]
[ 0.94948277 10.53574713]
[ 1.8847801 13.73178354]
[ 1.26666735 11.90012728]
[ 0.40107006 7.84937596]
[ 1.44085866 11.71417226]
[ 1.61521082 13.30733268]
[ 1.50202438 11.5954719 ]
[ 0.88315884 11.46431646]
[ 0.47911189 9.47640518]
[ 0.30588616 10.13541128]
[ 1.52129161 12.84197568]
[ 0.25075942 7.88154753]
[ 0.90823295 10.79025615]
[ 1.63317685 12.44083037]
[ 1.68238394 14.15681462]
[ 1.36562465 12.50388206]
[ 0.77130311 12.89670151]
[ 1.10717202 10.36614913]
[ 0.23961462 7.03771564]
[ 0.55449143 8.97155405]
[ 1.96119551 13.11300052]
[ 1.02226458 10.56326872]
[ 0.13509331 9.68292579]
[ 0.27385538 7.45643523]
[ 0.68252181 10.8104117 ]
[ 0.47555176 9.06428252]
[ 1.09838153 11.87228133]
[ 1.08531632 11.92645865]
[ 0.75247426 8.81934867]
[ 1.99022675 15.21941186]
[ 1.99980551 12.91520817]
[ 1.5864137 12.94750455]
[ 1.66776477 13.05471928]
[ 0.39009362 9.5872106 ]
[ 0.06842432 6.52626061]
[ 0.49160231 8.96437506]
[ 1.89122417 13.97621451]
[ 0.29577358 9.39384895]
[ 1.21440789 11.27633599]
[ 1.79309467 11.8779532 ]
[ 1.63039219 12.65629986]
[ 0.81381444 11.28511091]
[ 1.0785014 11.64870513]
[ 0.65317283 10.92945199]
[ 1.39720473 10.49816875]
[ 0.72273399 10.0385813 ]
[ 1.56516876 11.22598596]
[ 1.79435949 13.69041692]
[ 0.92515284 10.69258089]
[ 0.92666572 11.16743518]
[ 1.79482058 14.3119088 ]
[ 0.83466322 10.25302971]
[ 0.89994989 9.90721344]
[ 0.05247682 7.3231602 ]
[ 1.34408815 12.47571037]
[ 1.32256478 11.96633382]
[ 0.9896771 11.92459708]
[ 0.38734407 9.25595385]
[ 0.62419774 9.34182876]
[ 1.03598244 9.13251898]
[ 1.52120598 12.7770783 ]
[ 0.16201475 7.62316748]
[ 1.5294854 12.93170731]
[ 0.77389001 11.13815733]
[ 1.43854668 14.43331059]
[ 0.77970708 9.08175759]
[ 0.42117791 9.99767758]
[ 0.38189651 8.82041598]
[ 0.45969414 11.29653945]]

In [16]:  1 # X=2*np.random.rand(100,1)
2 # print(X)
3 # print(X.shape)
4 # y=8+3*X+np.random.rand(100,1)
5 # print(y.shape)
In [17]:  1 # Scatter plot to see the relation between X and y
2 plt.figure(figsize=(15,10))
3 plt.scatter(X, y, marker='x')
4 plt.xlabel('X')
5 plt.ylabel('y', rotation=0)

Out[17]: Text(0, 0.5, 'y')

2. Solve using Numpy Library
In [18]:  1 # # Understanding numpy functions
2 # import numpy as np
3 # X = np.arange(6)
4 # print(X)
5 # X = X.reshape((6, 1))
6 # print(X)
7 # X1=np.ones_like(X) # Return an array of ones with the same shape and type as a given array.
8 # print(X1)
9 # z=np.c_[X1,X] # Translates slice objects to concatenation along the second axis.
10 # print(z)
In [19]:  1 # Adding bais unit to every vector in X
2 B=np.ones_like(X)
3 X_b=np.c_[B,X]
4 print(X_b)
5 print(X_b.shape)
6
7 #y=y.reshape((6,2))
8 theta=np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
9 print(theta)
[[1. 0.04497682]
[1. 0.44369684]
[1. 1.14757428]
[1. 1.98332104]
[1. 0.03038902]
[1. 1.06406009]
[1. 0.76146702]
[1. 0.92371199]
[1. 1.14317242]
[1. 1.55185131]
[1. 0.0186962 ]
[1. 1.1789661 ]
[1. 0.12408403]
[1. 1.19483023]
[1. 1.59064174]
[1. 1.40627822]
[1. 0.41236386]
[1. 0.01554289]
[1. 1.24460193]
[1. 0.59811763]
[1. 0.41969231]
[1. 1.36654913]
[1. 1.64921642]
[1. 0.35949318]
[1. 0.02194567]
[1. 0.19417357]
[1. 0.43027749]
[1. 0.01570807]
[1. 1.8942345 ]
[1. 1.84577681]
[1. 1.0891461 ]
[1. 0.94948277]
[1. 1.8847801 ]
[1. 1.26666735]
[1. 0.40107006]
[1. 1.44085866]
[1. 1.61521082]
[1. 1.50202438]
[1. 0.88315884]
[1. 0.47911189]
[1. 0.30588616]
[1. 1.52129161]
[1. 0.25075942]
[1. 0.90823295]
[1. 1.63317685]
[1. 1.68238394]
[1. 1.36562465]
[1. 0.77130311]
[1. 1.10717202]
[1. 0.23961462]
[1. 0.55449143]
[1. 1.96119551]
[1. 1.02226458]
[1. 0.13509331]
[1. 0.27385538]
[1. 0.68252181]
[1. 0.47555176]
[1. 1.09838153]
[1. 1.08531632]
[1. 0.75247426]
[1. 1.99022675]
[1. 1.99980551]
[1. 1.5864137 ]
[1. 1.66776477]
[1. 0.39009362]
[1. 0.06842432]
[1. 0.49160231]
[1. 1.89122417]
[1. 0.29577358]
[1. 1.21440789]
[1. 1.79309467]
[1. 1.63039219]
[1. 0.81381444]
[1. 1.0785014 ]
[1. 0.65317283]
[1. 1.39720473]
[1. 0.72273399]
[1. 1.56516876]
[1. 1.79435949]
[1. 0.92515284]
[1. 0.92666572]
[1. 1.79482058]
[1. 0.83466322]
[1. 0.89994989]
[1. 0.05247682]
[1. 1.34408815]
[1. 1.32256478]
[1. 0.9896771 ]
[1. 0.38734407]
[1. 0.62419774]
[1. 1.03598244]
[1. 1.52120598]
[1. 0.16201475]
[1. 1.5294854 ]
[1. 0.77389001]
[1. 1.43854668]
[1. 0.77970708]
[1. 0.42117791]
[1. 0.38189651]
[1. 0.45969414]]
(100, 2)
[[7.87752609]
[3.11459601]]
In [20]:  1 # Adding bais unit to every vector in X
2 B= np.ones_like(X) # Return an array of ones with the same shape and type as a given array.
3 X_b = np.c_[B, X] # Translates slice objects to concatenation along the second axis.
4 # X_b = X
5 print(X_b)
6 theta = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y) # Compute the (multiplicative) inverse of a matrix.
7
8 # Values are very close to are thetas 8 and 3
9 print(theta)
10
[[1. 0.04497682]
[1. 0.44369684]
[1. 1.14757428]
[1. 1.98332104]
[1. 0.03038902]
[1. 1.06406009]
[1. 0.76146702]
[1. 0.92371199]
[1. 1.14317242]
[1. 1.55185131]
[1. 0.0186962 ]
[1. 1.1789661 ]
[1. 0.12408403]
[1. 1.19483023]
[1. 1.59064174]
[1. 1.40627822]
[1. 0.41236386]
[1. 0.01554289]
[1. 1.24460193]
[1. 0.59811763]
[1. 0.41969231]
[1. 1.36654913]
[1. 1.64921642]
[1. 0.35949318]
[1. 0.02194567]
[1. 0.19417357]
[1. 0.43027749]
[1. 0.01570807]
[1. 1.8942345 ]
[1. 1.84577681]
[1. 1.0891461 ]
[1. 0.94948277]
[1. 1.8847801 ]
[1. 1.26666735]
[1. 0.40107006]
[1. 1.44085866]
[1. 1.61521082]
[1. 1.50202438]
[1. 0.88315884]
[1. 0.47911189]
[1. 0.30588616]
[1. 1.52129161]
[1. 0.25075942]
[1. 0.90823295]
[1. 1.63317685]
[1. 1.68238394]
[1. 1.36562465]
[1. 0.77130311]
[1. 1.10717202]
[1. 0.23961462]
[1. 0.55449143]
[1. 1.96119551]
[1. 1.02226458]
[1. 0.13509331]
[1. 0.27385538]
[1. 0.68252181]
[1. 0.47555176]
[1. 1.09838153]
[1. 1.08531632]
[1. 0.75247426]
[1. 1.99022675]
[1. 1.99980551]
[1. 1.5864137 ]
[1. 1.66776477]
[1. 0.39009362]
[1. 0.06842432]
[1. 0.49160231]
[1. 1.89122417]
[1. 0.29577358]
[1. 1.21440789]
[1. 1.79309467]
[1. 1.63039219]
[1. 0.81381444]
[1. 1.0785014 ]
[1. 0.65317283]
[1. 1.39720473]
[1. 0.72273399]
[1. 1.56516876]
[1. 1.79435949]
[1. 0.92515284]
[1. 0.92666572]
[1. 1.79482058]
[1. 0.83466322]
[1. 0.89994989]
[1. 0.05247682]
[1. 1.34408815]
[1. 1.32256478]
[1. 0.9896771 ]
[1. 0.38734407]
[1. 0.62419774]
[1. 1.03598244]
[1. 1.52120598]
[1. 0.16201475]
[1. 1.5294854 ]
[1. 0.77389001]
[1. 1.43854668]
[1. 0.77970708]
[1. 0.42117791]
[1. 0.38189651]
[1. 0.45969414]]
[[7.87752609]
[3.11459601]]

3. Batch Gradient Decent Algorithm

Below mentioned Cost Function which stands for Mean Squared Error(MSE) is for Linear Regression and Gradient is derived from the it.

Cost
𝑚
𝐽(𝜃) = 1/𝑚 ∑(ℎ(𝜃)(𝑖) − 𝑦(𝑖) )2
𝑖=1
Gradient

∂𝐽(𝜃) = 2/𝑚 𝑚 (ℎ(𝜃(𝑖) ) − 𝑦(𝑖) ). 𝑋𝑗(𝑖)

∂𝜃𝑗 ∑𝑖=1
Below mentioned Cost Function which stands for Mean Squared Error(MSE) is for Linear Regression and Gradient is derived from the it.

Cost
𝑚
𝐽(𝜃) = 1/2𝑚 ∑(ℎ(𝜃)(𝑖) − 𝑦(𝑖) )2
𝑖=1
Gradient

∂𝐽(𝜃) = 1/𝑚 𝑚 (ℎ(𝜃(𝑖) ) − 𝑦(𝑖) ). 𝑋𝑗(𝑖)

∂𝜃𝑗 ∑𝑖=1

Function for computing Cost function

In [21]:  1 def cal_cost(theta, X, y):
2 '''
3 Calculates the cost for given X and y.
4
5
6 Parameters
7 ----------
8 theta: Vector of thetas
9 X: Row of X's
10 y: Actual y's
11
12 Returns
13 -------
14 Calculated cost
15 '''
16 m = len(y)
17 predictions = np.dot(X, theta)
18 cost = (1/ m) * np.sum(np.square(predictions - y))
19
20 return cost
Function for Batch gradient Decent Algorithm
In [22]:  1 def gradient_descent(X, y, theta, learning_rate, iterations):
2 '''
3 Parameters
4 ----------
5 X: Matrix of X with bais units added to every vector of X
6 y: Vector of y
7 theta: Vector initialized randomly
8 learning_rate: learning rate that will be used as step size
9 iterations = Number of iterations
10
11 Returns
12 -------
13 The final theta vector, array of cost and theta history over no of iterations
14 '''
15
16 m = len(y)
17 cost_history = np.zeros(iterations)
18 theta_history = np.zeros((iterations, 2))
19
20 for i in range(iterations):
21 #Forward propagation
22 predictions = np.dot(X, theta)
23 #Compute Cost (Value of loss function)
24 cost_history[i] = cal_cost(theta, X, y)
25 # Backpropogation
26 grad =(2/m) * (X.T.dot((predictions - y)))
27 # Update Parameters
28 theta = theta - (learning_rate * grad)
29 theta_history[i, :] = theta.T
30
31
32 return theta, cost_history, theta_history
Calling function

In [23]:  1 # Using gradient descent to find out the thetas between our X and y relation
2 import time
3 lr = 0.3
4 n_iter = 100
5
6 theta = np.random.randn(2,1)
7 print("Initial value of weights")
8 print(theta)
9 X_b = np.c_[np.ones((len(X),1)),X]
10
11 ## Calling BGD function
12 t10 = time.time()
13 theta,cost_history,theta_history = gradient_descent(X_b, y, theta, lr, n_iter)
14 t11 = time.time()
15 time_BGD = t11 - t10
16 print("Processing Time of MBG Algorithm")
17 print(time_BGD)
18
19
20 # Predicted value by NNs
21 y_predicted= np.dot(X_b, theta)
22 print("Optimal value of weights")
23 print('{:<10}{:.3f}'.format('Theta0:',theta[0][0]))
24 print('{:<10}{:.3f}'.format('Theta1:',theta[1][0]))
25 print("Minimum value of cost function")
26 print('{:<10}{:.3f}'.format('Cost/MSE:',cost_history[-1]))
27
28 # print(y_predicted)
29
30 predicted__actual = np.c_[y_predicted, y]
31 print("Comparing predicted vs actual")
32 print(predicted__actual)
[ 9.25939063 9.91643706]
[11.45175653 11.57207686]
[14.0548556 12.02308413]
[ 7.97206146 9.2510614 ]
[11.19163502 10.54856201]
[10.24914889 11.78215315]
[10.75449303 9.46087616]
[11.43804608 11.56766946]
[12.71095756 12.12205791]
[ 7.93564186 7.2777673 ]
[11.5495326 11.50784111]
[ 8.26389315 7.41553359]
[11.59894458 12.57702559]
[12.83177803 14.23886903]
[12.25754129 12.4646239 ]
[ 9.16179784 9.66891799]
[ 7.92582023 7.50275661]
[11.75396842 12.63655415]
[ 9.74036482 11.0568907 ]
[ 9.18462378 9.65690536]
[12 13379715 12 13159462]
In [24]:  1 # plotting the cost history vs number of iterations
2
3 plt.figure(figsize=(15,5))
4 plt.scatter(range(n_iter), cost_history)
5 plt.xlabel('Number of Iterations')
6 plt.ylabel('Cost', rotation=0)
7

Out[24]: Text(0, 0.5, 'Cost')

In [25]:  1 # plot prediction line
2 plt.figure(figsize=(15,5))
3 plt.scatter(X, y, marker='x')
4 plt.plot(X, y_predicted, 'red')
5 plt.xlabel('X')
6 plt.ylabel('y', rotation=0)

Out[25]: Text(0, 0.5, 'y')

4. Stochastic Gradient Descent (SGD) Algorithm
In [26]:  1 def stochastic_gradient_descent(X, y, theta, learning_rate,iterations):
2 '''
3 Parameters
4 ----------
5 X: Matrix of X with bais units added to every vector of X
6 y: Vector of y
7 theta: Vector initialized randomly
8 learning_rate: learning rate that will be used as step size
9 iterations = Number of iterations
10
11 Returns
12 -------
13 The final theta vector, array of cost and theta history over no of iterations
14 '''
15 m = len(y)
16 theta_history = np.zeros((iterations, 2))
17 cost_history = np.zeros(iterations)
18 # theta_history = []
19
20 for i in range(iterations):
21 cost_per_iteration = .0
22 for j in range(m):
23 # #Step : Shuffle (X,y)
24 X_rand_idx = np.random.randint(m)
25 X_inner = X[X_rand_idx].reshape(1, X.shape[1])
26 y_inner = y[X_rand_idx].reshape(1, 1)
27
28 # Forward Propgation
29 predictions = np.dot(X_inner, theta)
30
31 # Computing cost value
32 cost_per_iteration += cal_cost(theta, X_inner, y_inner)
33
34 # Backpropagation
35 grad= (2/m) * (X_inner.T.dot((predictions - y_inner)))
36
37 # Weight updation
38 theta = theta - (learning_rate * grad)
39
40 cost_history[i] = cost_per_iteration
41
42
43 return theta, cost_history, theta_history
In [31]:  1 # Using stochastic gradient descent to find out the thetas between our X and y relation
2 import time
3 lr = 0.3
4 n_iter = 100
5 theta = np.random.randn(2,1)
6 print("Initial value of weights")
7 print(theta)
8
9 X_b = np.c_[np.ones((len(X),1)),X]
10
11 ## calling SGD
12 t20 = time.time()
13 theta,cost_history,theta_history = stochastic_gradient_descent(X_b, y, theta, lr, n_iter)
14 t21 = time.time()
15 time_SGD = t21 - t20
16 print("Processing Time of SGD Algorithm")
17 print(time_SGD)
18 # theta,cost_history,theta_history = stochastic_gradient_descent(X_b, y, theta, lr, n_iter)
19 print(theta)
20
21
22 # Predicted value by NNs
23 y_predicted= np.dot(X_b, theta)
24
25 print("Optimal value of weights")
26 print('{:<10}{:.3f}'.format('Theta0:',theta[0][0]))
27 print('{:<10}{:.3f}'.format('Theta1:',theta[1][0]))
28 print("Minimum value of cost function")
29 print('{:<10}{:.3f}'.format('Cost/MSE:',cost_history[-1]))
30
31
32 # print(y_predicted)
33 predicted__actual = np.c_[y_predicted, y]
34 print("Comparing predicted vs actual")
35 print(predicted__actual)
36
37
[ 8.52699622 7.8774343 ]
[ 9.26269015 8.26604118]
[ 7.9709022 7.77699381]
[13.82434371 14.27457547]
[13.67335075 13.78644959]
[11.31570832 11.9357483 ]
[10.88052087 10.53574713]
[13.79488402 13.73178354]
[11.86886012 11.90012728]
[ 9.17168056 7.84937596]
[12.41163588 11.71417226]
[12.95491287 13.30733268]
[12.60222674 11.5954719 ]
[10.67385716 11.46431646]
[ 9.41485695 9.47640518]
[ 8.87508987 10.13541128]
[12.66226294 12.84197568]
[ 8.70331633 7.88154753]
[10.75198745 10.79025615]
[13.01089457 12.44083037]
[13 16422264 14 15681462]
In [32]:  1 # def stochastic_gradient_descent(X, y, theta, learning_rate,iterations):
2 # '''
3 # Parameters
4 # ----------
5 # X: Matrix of X with bais units added to every vector of X
6 # y: Vector of y
7 # theta: Vector initialized randomly
8 # learning_rate: learning rate that will be used as step size
9 # iterations = Number of iterations
10
11 # Returns
12 # -------
13 # The final theta vector, array of cost and theta history over no of iterations
14 # '''
15
16 # m = len(y)
17 # theta_history = np.zeros((iterations, 2))
18 # cost_history = np.zeros(iterations)
19 # theta_history = []
20
21 # for i in range(iterations):
22 # cost_per_iteration = .0
23 # for j in range(m):
24 # # #Step : Shuffle (X,y)
25 # X_rand_idx = np.random.randint(m)
26 # X_inner = X[X_rand_idx].reshape(1, X.shape[1])
27 # y_inner = y[X_rand_idx].reshape(1, 1)
28
29 # predictions = np.dot(X_inner, theta)
30
31 # theta = theta - (2/m) * learning_rate * (X_inner.T.dot((predictions - y_inner)))
32 # cost_per_iteration += cal_cost(theta, X_inner, y_inner)
33 # cost_history[i] = cost_per_iteration
34
35 # return theta, cost_history, theta_history
In [20]:  1 # plotting the cost history vs number of iterations
2
3 plt.figure(figsize=(15,5))
4 plt.scatter(range(n_iter), cost_history)
5 plt.xlabel('Number of Iterations')
6 plt.ylabel('Cost', rotation=0)
7

Out[20]: Text(0, 0.5, 'Cost')

In [20]:  1 # plot prediction line
2 plt.figure(figsize=(15,5))
3 plt.scatter(X, y, marker='x')
4 plt.plot(X, np.dot(X_b, theta), 'red')
5 plt.xlabel('X')
6 plt.ylabel('y', rotation=0)

Out[20]: Text(0, 0.5, 'y')

5. Mini-batch Gradient Descent (MBGD) Algorithm
In [27]:  1 def mini_batch_gradient_descent(X, y, theta, learning_rate, iterations):
2 '''
3 Parameters
4 ----------
5 X: Matrix of X with bais units added to every vector of X
6 y: Vector of y
7 theta: Vector initialized randomly
8 learning_rate: learning rate that will be used as step size
9 iterations = Number of iterations
10
11 Returns
12 -------
13 The final theta vector, array of cost and theta history over no of iterations
14 '''
15
16 m = len(y)
17 cost_history = np.zeros(iterations)
18 # print(cost_history)
19 theta_history = np.zeros((iterations, 2))
20 # print(theta_history)
21 # Batch size
22 batch_size=10
23 for i in range(iterations):
24 cost_per_iteration = .0
25 # #Step 1: Shuffle (X,y)
26 rand_indicies = np.random.permutation(m)
27 X = X[rand_indicies]
28 y = y[rand_indicies]
29
30 for j in range(0, m, batch_size):
31 X_inner = X[i: i+batch_size]
32 y_inner = y[i: i+batch_size]
33 # forward propagation
34 predictions = np.dot(X_inner, theta)
35 # # weight updataion equation
36 # theta = theta - (2/m) * learning_rate * (X_inner.T.dot((predictions - y_inner)))
37 # cost_per_iteration += cal_cost(theta, X_inner, y_inner)
38 # Computing cost value
39 cost_per_iteration += cal_cost(theta, X_inner, y_inner)
40 # Backpropagation
41 grad= (2/m) * (X_inner.T.dot((predictions - y_inner)))
42 # Weight updation
43 theta = theta - (learning_rate * grad)
44 cost_history[i] = cost_per_iteration
45
46 return theta, cost_history, theta_history
In [28]:  1 # Using mini batch gradient descent to find out the thetas between our X and y relation
2 lr = 0.3
3 n_iter = 100
4 theta = np.random.randn(2,1)
5 print("Initial value of weights")
6 print(theta)
7
8 # print(X_b)
9 X_b = np.c_[np.ones((len(X),1)),X]
10 # print(X_b)
11
12 ## calling MBGD
13 t30 = time.time()
14 theta,cost_history,theta_history = mini_batch_gradient_descent(X_b, y, theta, lr, n_iter)
15 t31 = time.time()
16 time_MBGD = t31 - t30
17 print("Processing Time of MBGD Algorithm")
18 print(time_MBGD)
19 # theta,cost_history,theta_history = mini_gradient_descent(X_b, y, theta, lr, n_iter)
20 # print(theta)
21
22 # Predicted value by NNs
23 y_predicted= np.dot(X_b, theta)
24
25 print("Optimal value of weights")
26 print('{:<10}{:.3f}'.format('Theta0:',theta[0][0]))
27 print('{:<10}{:.3f}'.format('Theta1:',theta[1][0]))
28 print("Minimum value of cost function")
29 print('{:<10}{:.3f}'.format('Cost/MSE:',cost_history[-1]))
30
31 # print(y_predicted)
32 predicted__actual = np.c_[y_predicted, y]
33 print("Comparing predicted vs actual")
34 print(predicted__actual)
[12.12343891 12.13159462]
[13.04414139 13.65952935]
[ 8.84326137 6.70617922]
[ 7.74380331 8.71344024]
[ 8.30478315 7.8774343 ]
[ 9.07381964 8.26604118]
[ 7.72348624 7.77699381]
[13.84221304 14.27457547]
[13.68437689 13.78644959]
[11.21988316 11.9357483 ]
[10.76497248 10.53574713]
[13.8114182 13.73178354]
[11.79810449 11.90012728]
[ 8.97868538 7.84937596]
[12.36547953 11.71417226]
[12.93337851 13.30733268]
[12.5647082 11.5954719 ]
[10.5489425 11.46431646]
[ 9.23288283 9.47640518]
[ 8.66865284 10.13541128]
In [29]:  1 # plotting the cost history vs number of iterations
2
3 plt.figure(figsize=(15,5))
4 plt.scatter(range(n_iter), cost_history)
5 plt.xlabel('Number of Iterations')
6 plt.ylabel('Cost', rotation=0)
7

Out[29]: Text(0, 0.5, 'Cost')

In [30]:  1 # plot prediction line
2
3 plt.figure(figsize=(15,5))
4 plt.scatter(X, y, marker='x')
5 plt.plot(X, np.dot(X_b, theta), 'red')
6 plt.xlabel('X')
7 plt.ylabel('y', rotation=0)

Out[30]: Text(0, 0.5, 'y')

Comparison of Processing Speed of above three Grandient Decent

Optimization Algorithms Implementation
In [33]:  1 import matplotlib.pyplot as plt
2 %matplotlib inline
3 # plt.style.use('ggplot')
4
5 x = ['Method-1', 'Method-2', 'Method-3']
6 Processing_Times = [time_BGD, time_SGD, time_MBGD]
7
8
9 x_pos = [i for i, _ in enumerate(x)]
10
11 plt.bar(x_pos, Processing_Times, color='green')
12 plt.xlabel("<-------------Method-----------> ")
13 plt.ylabel("<----------Time-------------> ")
14 plt.title(" Time Vs Method ")
15
16 plt.xticks(x_pos, x)
17
18 plt.show()
In [ ]:  1

Data AMMM
No ratings yet
Data AMMM
27 pages
Out 49
No ratings yet
Out 49
299 pages
Data Mining Portfolio
No ratings yet
Data Mining Portfolio
19 pages
Merged
No ratings yet
Merged
35 pages
Minor Project
No ratings yet
Minor Project
92 pages
Python Programming
No ratings yet
Python Programming
41 pages
01 5 CF BF
No ratings yet
01 5 CF BF
357 pages
DL Lab2
No ratings yet
DL Lab2
38 pages
In Time Tec
No ratings yet
In Time Tec
23 pages
Meta - LeetCode
100% (1)
Meta - LeetCode
84 pages
Stock-MArket-Forecasting - Untitled - Ipynb at Master Krishnaik06 - Stock-MArket-Forecasting
No ratings yet
Stock-MArket-Forecasting - Untitled - Ipynb at Master Krishnaik06 - Stock-MArket-Forecasting
37 pages
Import As
100% (1)
Import As
27 pages
Upyter Notebook1
No ratings yet
Upyter Notebook1
5 pages
Tabel Distribusi Normal Baku
No ratings yet
Tabel Distribusi Normal Baku
1 page
TP5 Ex1 Kmeans
No ratings yet
TP5 Ex1 Kmeans
7 pages
Implementation of Image Processing Algorithms For Fracture Detection On Different Human Body Parts. (Minor 02)
No ratings yet
Implementation of Image Processing Algorithms For Fracture Detection On Different Human Body Parts. (Minor 02)
22 pages
RDF
No ratings yet
RDF
21 pages
UG - B.Sc. - Computer Science - DATA STRUCTURES AND ALGORITHMS-13033
No ratings yet
UG - B.Sc. - Computer Science - DATA STRUCTURES AND ALGORITHMS-13033
136 pages
ML Journal
No ratings yet
ML Journal
58 pages
Simulacion Canica Con Rebotes
No ratings yet
Simulacion Canica Con Rebotes
5 pages
SK Learn 1
No ratings yet
SK Learn 1
11 pages
Numpy TE2
No ratings yet
Numpy TE2
12 pages
To Generate Permutation You Can Use These Codes-: Time Complexity
No ratings yet
To Generate Permutation You Can Use These Codes-: Time Complexity
128 pages
Python Basics With Illustrations From The Financial Markets Vivek Krishnamoorthy Download
No ratings yet
Python Basics With Illustrations From The Financial Markets Vivek Krishnamoorthy Download
86 pages
Tugas Monte Carlo
No ratings yet
Tugas Monte Carlo
15 pages
Double Spring Mass Damper
No ratings yet
Double Spring Mass Damper
90 pages
Método de Jacobi Final
No ratings yet
Método de Jacobi Final
11 pages
Format Oapa General
No ratings yet
Format Oapa General
110 pages
Tarea 8
No ratings yet
Tarea 8
7 pages
CSc9618 (As) - Mock 2 - Paper 2
No ratings yet
CSc9618 (As) - Mock 2 - Paper 2
11 pages
1 Simple Linear Regression
No ratings yet
1 Simple Linear Regression
9 pages
Metodo Explicito
No ratings yet
Metodo Explicito
5 pages
Assignment-07-DBSCAN Clustering (Crimes) - Jupyter Notebook
No ratings yet
Assignment-07-DBSCAN Clustering (Crimes) - Jupyter Notebook
11 pages
BHMC17 P5.ipynb - Colaboratory
No ratings yet
BHMC17 P5.ipynb - Colaboratory
4 pages
Double Spring Mass Damper
No ratings yet
Double Spring Mass Damper
90 pages
4,5
No ratings yet
4,5
3 pages
Allala
No ratings yet
Allala
3 pages
G6
No ratings yet
G6
14 pages
Cheatspeet Dsa
No ratings yet
Cheatspeet Dsa
114 pages
Practical 1
No ratings yet
Practical 1
9 pages
EQ Data2
No ratings yet
EQ Data2
18 pages
EQ Curve Data - 1
No ratings yet
EQ Curve Data - 1
18 pages
PCA
No ratings yet
PCA
23 pages
Interpolatingfunction : Bla Ndsolve ( (F ''' (T) + F (T) F '' (T) 0, F (0) 0, F ' (0) 0, F ' (100 000) 1), F, T)
No ratings yet
Interpolatingfunction : Bla Ndsolve ( (F ''' (T) + F (T) F '' (T) 0, F (0) 0, F ' (0) 0, F ' (100 000) 1), F, T)
7 pages
Bca 2024-25 I and Ii Sem Syllabus
No ratings yet
Bca 2024-25 I and Ii Sem Syllabus
19 pages
Grin 5
No ratings yet
Grin 5
4 pages
Module 4 Latches and Flip-Flops
No ratings yet
Module 4 Latches and Flip-Flops
84 pages
Coordenadas de La Cerchas Grupo Peres
No ratings yet
Coordenadas de La Cerchas Grupo Peres
31 pages
IIT M ES FOUNDATION FN EXAM FEF1 24 Dec 2023
No ratings yet
IIT M ES FOUNDATION FN EXAM FEF1 24 Dec 2023
89 pages
C++ 1st and 2nd Semester
No ratings yet
C++ 1st and 2nd Semester
95 pages
Data Covid-19 Jakarta: Numpy NP Matplotlib - Pyplot PLT Ipython - Display
No ratings yet
Data Covid-19 Jakarta: Numpy NP Matplotlib - Pyplot PLT Ipython - Display
3 pages
EQ Data 44
No ratings yet
EQ Data 44
18 pages
Go Optimizations 101 Tapir Liu 2024 Scribd Download
100% (1)
Go Optimizations 101 Tapir Liu 2024 Scribd Download
55 pages
(PRF193) - 4. Data Handling
No ratings yet
(PRF193) - 4. Data Handling
107 pages
April 23, 2025: Pandas PD
No ratings yet
April 23, 2025: Pandas PD
11 pages
123
No ratings yet
123
12 pages
Lec 12
No ratings yet
Lec 12
18 pages
DL Lab 3
No ratings yet
DL Lab 3
5 pages
Cambridge International AS & A Level: Computer Science 9618/23
No ratings yet
Cambridge International AS & A Level: Computer Science 9618/23
24 pages
ADS Notes Unit1
No ratings yet
ADS Notes Unit1
60 pages
Data Analytics
No ratings yet
Data Analytics
70 pages
Displacement vs. Time
No ratings yet
Displacement vs. Time
46 pages
Scholarship Schemes
No ratings yet
Scholarship Schemes
12 pages
Trainandoutput Patterns
No ratings yet
Trainandoutput Patterns
2 pages
HW8 La
No ratings yet
HW8 La
18 pages
Grin 7
No ratings yet
Grin 7
4 pages
Knn1 MinMaxScalar
No ratings yet
Knn1 MinMaxScalar
13 pages
Giản đồ Kremcer
No ratings yet
Giản đồ Kremcer
12 pages
EXAMEN
No ratings yet
EXAMEN
11 pages
Localweighted - Jupyter Notebook
No ratings yet
Localweighted - Jupyter Notebook
4 pages
SCİLAB T, Sa Değerleri
No ratings yet
SCİLAB T, Sa Değerleri
15 pages
Charotar University of Science & Technology & Devang Patel Institute of Advance Technology & Reserch CE144 Object Oriented Programming With C++
No ratings yet
Charotar University of Science & Technology & Devang Patel Institute of Advance Technology & Reserch CE144 Object Oriented Programming With C++
123 pages
Cumulative ZT
No ratings yet
Cumulative ZT
9 pages
Matlab Report
No ratings yet
Matlab Report
34 pages
Grin 4
No ratings yet
Grin 4
4 pages
Hashing
No ratings yet
Hashing
7 pages
EAW Resolution 2 User Guide
No ratings yet
EAW Resolution 2 User Guide
97 pages
Question Bank DPD All Modules
No ratings yet
Question Bank DPD All Modules
5 pages
Practice J277 02 Sep 2020 QPX
No ratings yet
Practice J277 02 Sep 2020 QPX
20 pages
ADADELTA
No ratings yet
ADADELTA
2 pages
Cambridge International AS & A Level: Computer Science 9618/21
No ratings yet
Cambridge International AS & A Level: Computer Science 9618/21
20 pages
Final Exam Question 232 CSE1111 C Summer 23
No ratings yet
Final Exam Question 232 CSE1111 C Summer 23
2 pages
Dimentionality Reduction Implementation
No ratings yet
Dimentionality Reduction Implementation
8 pages
Advanced JAVA-2 Marks
No ratings yet
Advanced JAVA-2 Marks
12 pages
Book 3
No ratings yet
Book 3
4 pages
12 CS EM Public Answer Key May 2022
No ratings yet
12 CS EM Public Answer Key May 2022
10 pages
Model QP DPD
No ratings yet
Model QP DPD
4 pages
Numpy
No ratings yet
Numpy
1 page
U03s01 Introduction
No ratings yet
U03s01 Introduction
17 pages
Sen 1
No ratings yet
Sen 1
6 pages
DATA SCIENCE IDC 302 End Sem Project
No ratings yet
DATA SCIENCE IDC 302 End Sem Project
1 page
Internal Assessment 2 Important Questions - 230327 - 124239
No ratings yet
Internal Assessment 2 Important Questions - 230327 - 124239
1 page
Memory Leak: Dangling Pointer
No ratings yet
Memory Leak: Dangling Pointer
12 pages
Lab 09 BSDSF23 PF
No ratings yet
Lab 09 BSDSF23 PF
2 pages
The Fibonacci Number Series
From Everand
The Fibonacci Number Series
Michael Husted
5/5 (1)

Python Tut Gradient Descent Algos MLR - Jupyter Notebook

Uploaded by

Python Tut Gradient Descent Algos MLR - Jupyter Notebook

Uploaded by

Python Tutorial on Implementation of Gradient Desecent Optimization

algorithms from Scratch

Example : Implementation of Gradient Descent Algorithms from scratch in Python

In [1]:  1 import numpy as np

Out[17]: Text(0, 0.5, 'y')

3. Batch Gradient Decent Algorithm

∂𝐽(𝜃) = 2/𝑚 𝑚 (ℎ(𝜃(𝑖) ) − 𝑦(𝑖) ). 𝑋𝑗(𝑖)

∂𝐽(𝜃) = 1/𝑚 𝑚 (ℎ(𝜃(𝑖) ) − 𝑦(𝑖) ). 𝑋𝑗(𝑖)

Function for computing Cost function

Out[24]: Text(0, 0.5, 'Cost')

Out[25]: Text(0, 0.5, 'y')

Out[20]: Text(0, 0.5, 'Cost')

Out[20]: Text(0, 0.5, 'y')

Out[29]: Text(0, 0.5, 'Cost')

Out[30]: Text(0, 0.5, 'y')

Comparison of Processing Speed of above three Grandient Decent

You might also like