Gradient Descent Summary
The function dydx(x) calculates the derivative of the objective function y=(x−10)**2. This gives the
gradient, or slope, at any point x.
If the gradient is positive (dydx(x) > 0), this means the slope is going upward, so we
decrease x (move left) by step.
If the gradient is negative or zero, the slope is going downward or flat, so we increase x
(move right) by step.
After updating x, the algorithm recalculates the value of the objective function
y=(x−10)**2.
The new values of x and y are added to the xs and ys lists, respectively.
Next, the algorithm checks whether the change in y between the last two iterations is smaller than
0.0001.
If this condition is satisfied, the loop breaks, and the algorithm prints:
o The number of steps it took to find the minimum (i).
o The value of y at the minimum (ys[-1]).
o The optimal value of x (xs[-1]).
If the loop instead reaches the maximum of 2000 iterations without finding a small enough
change in y, it prints a failure message.
Separate variables also track the best (minimum) value of y, the iteration at which this
minimum occurred (argmin_y), and the corresponding value of x (best_x). A minimal sketch of
the whole loop is shown below.
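The following sketch shows the loop as described above; the starting point and step size are
assumptions, since the text does not state them.

def dydx(x):
    return 2 * (x - 10)  # derivative of y = (x - 10)**2

x, step = 0.0, 0.01  # assumed starting point and step size
xs, ys = [x], [(x - 10)**2]
for i in range(2000):
    # Move against the sign of the gradient
    if dydx(x) > 0:
        x -= step  # positive slope: step left
    else:
        x += step  # negative or zero slope: step right
    xs.append(x)
    ys.append((x - 10)**2)
    # Stop once y barely changes between two iterations
    if abs(ys[-1] - ys[-2]) < 0.0001:
        print(f"Found minimum in {i} steps: y = {ys[-1]}, x = {xs[-1]}")
        break
else:
    print("Failed to converge within 2000 iterations")
argmin_y = ys.index(min(ys))  # iteration with the smallest y seen
best_x = xs[argmin_y]  # corresponding x value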
Summary:
The code is an implementation of directional grid search combined with gradient
descent.
It minimizes the function y=(x−10)**2 by updating x based on the sign of the gradient:
o If the gradient is positive, it moves x left (decreases x).
o If the gradient is negative, it moves x right (increases x).
The algorithm stops when the change in y between two iterations is small enough (less
than 0.0001) or after 2000 iterations if it fails to converge.
14) For simple mathematical functions, calculating the gradient (or slope) is straightforward using
basic calculus.
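For example, for y=(x−10)**2, basic calculus gives dy/dx = 2(x − 10), so at x = 2 the gradient is 2(2 − 10) = −16.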
15) However, in more complex machine learning models, especially deep neural networks, manually
calculating gradients becomes very difficult.
16) TensorFlow is a powerful library that automates the calculation of gradients, making it easier to
implement gradient-based optimization techniques in complex models.
17) Code:
import tensorflow as tf
from IPython.display import Markdown as md

# Define the variable x, initially set to 2
tfx = tf.Variable(2, dtype='float32')
# Use GradientTape to record operations for automatic differentiation
with tf.GradientTape() as tape:
    ty = (tfx - 10)**2  # Define the function y = (x - 10)^2
# Compute the gradient of y with respect to x
dydx = tape.gradient(ty, tfx).numpy()
# Display the result
md(f"the gradient of the function $y=(x-10)^2$ at $x=2$ is {dydx}")
Gradient descent using these TensorFlow gradients took 30 steps, while the previous approach (directional grid search) took 877.
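For reference, the following is a minimal sketch of what such a TensorFlow-based descent loop could look like; the learning rate and stopping threshold here are assumptions, not values taken from the original code.

import tensorflow as tf

tfx = tf.Variable(2, dtype='float32')  # same starting point as above
lr = 0.1  # assumed learning rate
ys = [float((tfx - 10)**2)]
for i in range(2000):
    # Record the forward computation so the gradient can be taken
    with tf.GradientTape() as tape:
        ty = (tfx - 10)**2
    grad = tape.gradient(ty, tfx)
    # Move x against the gradient: x <- x - lr * dy/dx
    tfx.assign_sub(lr * grad)
    ys.append(float((tfx - 10)**2))
    # Same stopping rule as before: tiny change in y between iterations
    if abs(ys[-1] - ys[-2]) < 0.0001:
        print(f"Converged in {i} steps at x = {tfx.numpy():.4f}")
        break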