Gradient Descent
Gradient Descent is one of the most commonly used optimization algorithms for training machine learning models: it minimizes the error between actual and predicted results. It is also used to train neural networks. In machine learning, optimization is the task of minimizing a cost function parameterized by the model's parameters, and the main objective of gradient descent is to minimize that (typically convex) function through iterative parameter updates. Once optimized, these models serve as powerful tools for Artificial Intelligence and many other computer science applications.

What is Gradient Descent or Steepest Descent?
Gradient Descent is an iterative optimization algorithm widely used to train machine learning and deep learning models. It finds a local minimum of a function by repeatedly stepping in the direction opposite to the gradient.
Main objective - to minimize the cost function through iterative parameter updates.
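As a rough sketch of what "minimizing by iteration" means, the following example repeatedly steps against the gradient of a simple one-dimensional convex function f(x) = (x - 3)^2. The function, starting point, and learning rate are illustrative choices, not part of the original text.

# Minimal gradient descent sketch on f(x) = (x - 3)^2 (illustrative example).
# The gradient is f'(x) = 2 * (x - 3), so each step moves x toward the minimum at x = 3.

def gradient(x):
    return 2 * (x - 3)

x = 0.0               # arbitrary starting point
learning_rate = 0.1   # step size (hyperparameter)

for step in range(50):
    x = x - learning_rate * gradient(x)   # move against the gradient

print(x)   # approaches 3.0, the point of convergence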
What is Cost-function?
The cost function measures the difference, or error, between actual values and predicted values at the current parameter position, expressed as a single real number. It improves the model by providing feedback, so the model can reduce its error and move toward a local or global minimum. The loss function refers to the error of one training example, while the cost function is the average error across the entire training set.

How does Gradient Descent work?
Consider a simple linear model:

Y = mX + c

where 'm' represents the slope of the line and 'c' represents the intercept on the y-axis.
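For this line, a minimal sketch of the mean squared error cost and the batch gradient descent updates for m and c might look as follows. The data, learning rate, and number of iterations are made-up values for illustration only.

# Illustrative sketch: mean squared error cost and gradient descent updates
# for the line y = m*x + c. Data and hyperparameters are arbitrary examples.

X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1

m, c = 0.0, 0.0
learning_rate = 0.01
n = len(X)

def cost(m, c):
    # average squared error over the whole training set
    return sum((m * x + c - y) ** 2 for x, y in zip(X, Y)) / n

for _ in range(1000):
    # gradients of the cost with respect to m and c
    grad_m = sum(2 * (m * x + c - y) * x for x, y in zip(X, Y)) / n
    grad_c = sum(2 * (m * x + c - y) for x, y in zip(X, Y)) / n
    m -= learning_rate * grad_m
    c -= learning_rate * grad_c

print(m, c, cost(m, c))   # m and c approach 2 and 1; the cost approaches 0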
The slope is steepest at the starting point; as new parameters are generated, the steepness gradually decreases until the updates approach the lowest point of the function, which is called the point of convergence.

Stochastic Gradient Descent
Stochastic Gradient Descent (SGD) is a variant of gradient descent in which the model is updated after processing one training example at a time, rather than the whole dataset at once. This makes large datasets easier to handle, since it requires less memory. However, because it updates so frequently, the updates can be noisier and convergence less steady than with batch gradient descent. Despite this, the noise can sometimes help SGD escape a local minimum and find the global minimum.
Advantages:
1. Uses less memory, making it easier to handle.
2. More efficient for large datasets.
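To make the per-example update concrete, here is a minimal sketch of stochastic gradient descent for the same line y = m*x + c. The shuffling, data, learning rate, and epoch count are illustrative choices rather than part of the original text.

import random

# Illustrative SGD sketch for y = m*x + c: parameters are updated after
# every single training example instead of after a full pass over the data.

X = [1.0, 2.0, 3.0, 4.0]
Y = [3.0, 5.0, 7.0, 9.0]   # generated from y = 2x + 1

m, c = 0.0, 0.0
learning_rate = 0.01

for epoch in range(200):
    examples = list(zip(X, Y))
    random.shuffle(examples)          # visit examples in a random order
    for x, y in examples:
        error = (m * x + c) - y       # error on this single example
        m -= learning_rate * 2 * error * x
        c -= learning_rate * 2 * error

print(m, c)   # values move toward m = 2 and c = 1, with some noise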