REPORT - Gradient Based Optimization
Introduction
Mathematical Background
1. Gradient Descent
Gradient descent is a popular optimization algorithm used to minimize a loss function. Given a
differentiable function \(f(x)\), the goal is to find the value of \(x\) that minimizes \(f\). The
algorithm iteratively adjusts \(x\) in the direction opposite the gradient of \(f\) with respect to
\(x\), aiming to reach a local minimum. The update rule is:

\(x_{n+1} = x_n - \alpha \nabla f(x_n)\)

Where:
- \(x_n\) is the current value of \(x\) at iteration \(n\).
- \(\alpha\) is the learning rate, determining the step size.
- \(\nabla f(x_n)\) is the gradient of \(f\) at \(x_n\).
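As a concrete illustration of this update rule (a minimal sketch, not part of the original report; the function, starting point, and learning rate are chosen for demonstration), applying it to \(f(x) = (x - 3)^2\), whose gradient is \(2(x - 3)\), drives \(x\) toward the minimizer \(x = 3\):

```python
# Gradient descent on f(x) = (x - 3)^2; its gradient is 2 * (x - 3)
def grad_f(x):
    return 2 * (x - 3)

x = 0.0       # starting point x_0
alpha = 0.1   # learning rate
for _ in range(100):
    x = x - alpha * grad_f(x)   # x_{n+1} = x_n - alpha * grad f(x_n)

print(round(x, 4))  # → 3.0, the minimum of f
```

Each iteration shrinks the distance to the minimum by a constant factor \(1 - 2\alpha\), so convergence here is geometric.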
Stochastic gradient descent is a variation of gradient descent that updates the model's
parameters based on a randomly selected subset of the data (a mini-batch) rather than the
entire dataset. This reduces the computational cost and makes it suitable for large datasets.
The update rule for SGD has the same form as gradient descent, but the gradient is computed
on a mini-batch \(B_n\) sampled at iteration \(n\):

\(x_{n+1} = x_n - \alpha \nabla f_{B_n}(x_n)\)

where \(f_{B_n}\) denotes the loss evaluated only on the mini-batch.
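The mini-batch idea can be sketched as follows (an illustrative example; the least-squares loss, synthetic data, and batch size are assumptions, not from the report):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data for a one-parameter least-squares problem: y = 2 * x
X = rng.standard_normal(1000)
y = 2 * X

theta = 0.0
alpha = 0.1
batch_size = 32

for _ in range(500):
    # Sample a random mini-batch instead of using the full dataset
    idx = rng.integers(0, len(X), size=batch_size)
    xb, yb = X[idx], y[idx]
    # Gradient of the mean squared error on the mini-batch only
    gradient = 2 * np.mean((theta * xb - yb) * xb)
    theta = theta - alpha * gradient

print(theta)  # converges to roughly 2, the true coefficient
```

Because each step touches only `batch_size` examples, the per-iteration cost is independent of the dataset size, which is the practical appeal of SGD on large datasets.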
Python Implementation
To implement gradient-based optimization in Python, one can use libraries like NumPy and
TensorFlow. Below is a simplified example of gradient descent for linear regression:
import numpy as np

def gradient_descent(X, y, alpha, num_iterations):
    m = X.shape[0]                # number of training examples
    theta = np.zeros(X.shape[1])  # initialize parameters to zero
    for _ in range(num_iterations):
        predictions = np.dot(X, theta)      # model predictions
        errors = predictions - y            # residuals
        gradient = np.dot(X.T, errors) / m  # gradient of the MSE loss
        theta = theta - alpha * gradient    # descent step
    return theta
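As a quick sanity check of this approach (a self-contained sketch; the data, learning rate, and iteration count are illustrative, and the routine is restated in compact form so the snippet runs on its own), fitting a noiseless line \(y = 3x\) should recover a coefficient near 3:

```python
import numpy as np

def gradient_descent(X, y, alpha, num_iterations):
    # Batch gradient descent on the mean squared error loss
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(num_iterations):
        gradient = np.dot(X.T, np.dot(X, theta) - y) / m
        theta = theta - alpha * gradient
    return theta

# Synthetic data: y = 3 * x, no noise, single feature
X = np.linspace(0, 1, 50).reshape(-1, 1)
y = 3 * X[:, 0]

theta = gradient_descent(X, y, alpha=0.5, num_iterations=2000)
print(theta)  # approximately [3.]
```

With noiseless data the least-squares solution is exact, so the fitted coefficient approaches 3 to machine precision given enough iterations.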
Results
Conclusion
Gradient-based optimization is a powerful tool for minimizing loss functions and training
machine learning models. Understanding the underlying mathematical principles and being able
to implement it in Python opens up a wide array of possibilities in the field of machine learning
and numerical optimization. In future work, exploring advanced optimization techniques and
adaptive learning rate strategies could lead to further improvements in model performance.