Mod 2.4,2.5,2.6 Architecture Design
The backpropagation algorithm contains two main phases, referred to as the forward and
backward phases, respectively.
1. Forward phase: In this phase, the inputs of a training instance are fed into the neural
network. This results in a forward cascade of computations across the layers, using the current
set of weights. The final predicted output is compared with the target of the training instance,
and the derivative of the loss function with respect to the output is computed. The derivatives
of this loss with respect to the weights in all layers are then computed in the backward phase.
2. Backward phase: The main goal of the backward phase is to compute the gradient of the loss
function with respect to the different weights by using the chain rule of differential calculus.
These gradients are used to update the weights. Since these gradients are computed in the
backward direction, starting from the output node, this phase is referred to as the backward
phase (a minimal sketch of both phases follows this list).
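As a concrete illustration of the two phases, here is a minimal sketch of one forward/backward pass for a tiny two-layer network with a sigmoid hidden layer, a linear output unit, and squared-error loss. All specifics (the data, the layer sizes, and the names x, y, W1, W2) are illustrative assumptions, not taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One training instance: two inputs and one target value (made up).
x = np.array([0.5, -1.2])
y = np.array([1.0])

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # hidden layer: 3 sigmoid units
W2 = rng.normal(size=(1, 3))   # output layer: 1 linear unit

# ---- Forward phase: cascade of computations using the current weights ----
h = sigmoid(W1 @ x)                      # hidden activations
y_hat = W2 @ h                           # predicted output
loss = 0.5 * np.sum((y_hat - y) ** 2)    # squared-error loss

# ---- Backward phase: chain rule, starting from the output node ----
dL_dyhat = y_hat - y                     # derivative of loss w.r.t. output
dL_dW2 = np.outer(dL_dyhat, h)           # gradient for the output weights
dL_dh = W2.T @ dL_dyhat                  # propagate the error to the hidden layer
dL_dz1 = dL_dh * h * (1.0 - h)           # through the sigmoid: sigma' = h(1 - h)
dL_dW1 = np.outer(dL_dz1, x)             # gradient for the hidden weights

# The gradients are then used to update the weights (gradient descent).
lr = 0.1
W2 -= lr * dL_dW2
W1 -= lr * dL_dW1
print("loss before update:", loss)
```

Note how the backward phase reuses quantities computed in the forward phase (h and y_hat); backpropagation is efficient precisely because these intermediate values are cached rather than recomputed.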
Training a Neural Network with Backpropagation
In a single-layer neural network, the training process is relatively
straightforward because the error (or loss function) can be computed as a direct
function of the weights, which allows easy gradient computation, as in the worked form below.
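For instance (the notation w, x, y here is an assumption for illustration), with a single linear unit y_hat = wᵀx and squared error on one training instance, the gradient is available in closed form:

```latex
E(w) = \tfrac{1}{2}\bigl(y - w^{\top} x\bigr)^{2} ,
\qquad
\nabla_{w} E = -\bigl(y - w^{\top} x\bigr)\, x
```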
In the case of multi-layer networks, the problem is that the loss is a complicated
composition function of the weights in the earlier layers. The gradient of such a
composition function is computed using the backpropagation algorithm, which repeatedly
applies the chain rule of differential calculus, as sketched below.
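To make the composition explicit, here is a worked two-layer form (the notation W1, W2, sigma, and h below is an assumption for illustration, matching the code sketch above): the loss depends on W1 only through the hidden activations, so its gradient follows from the chain rule, applied backwards from the output.

```latex
% Loss as a composition: E depends on W_1 only through h = \sigma(W_1 x).
E(W_1, W_2) = \tfrac{1}{2}\,\bigl\lVert y - W_2\,\sigma(W_1 x) \bigr\rVert^{2} ,
\qquad h = \sigma(W_1 x), \quad \hat{y} = W_2 h

% Chain rule, applied backwards from the output:
\frac{\partial E}{\partial W_2} = (\hat{y} - y)\, h^{\top} ,
\qquad
\frac{\partial E}{\partial W_1}
  = \bigl( (W_2^{\top} (\hat{y} - y)) \odot \sigma'(W_1 x) \bigr)\, x^{\top}
```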
Let’s visualize how we might minimize the squared error over all of the training
examples.
Imagine a three-dimensional space where the horizontal dimensions correspond to
the weights w1 and w2, and the vertical dimension corresponds to the value of the
error function E.
In this space, points in the horizontal plane correspond to different settings of the
weights, and the height at those points corresponds to the incurred error.
If we consider the errors we make over all possible weights, we get a surface in
this three-dimensional space; for a single linear unit with squared error, this surface
is a quadratic bowl, as the sketch below illustrates.
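To make the bowl concrete, here is a small sketch that evaluates the summed squared error E(w1, w2) of a single linear unit on a grid of weight settings (the data and variable names are made up for illustration):

```python
import numpy as np

# Made-up training set for a linear unit y_hat = w1*x1 + w2*x2.
X = np.array([[1.0, 2.0], [2.0, 1.0], [0.5, -1.0]])  # inputs
y = np.array([3.0, 4.0, -0.5])                        # targets

def E(w1, w2):
    """Summed squared error over all training examples at weights (w1, w2)."""
    y_hat = X @ np.array([w1, w2])
    return np.sum((y - y_hat) ** 2)

# Evaluate E on a grid of weight settings: the height of the surface
# at each point (w1, w2) in the horizontal plane.
w1s = np.linspace(-2.0, 4.0, 61)
w2s = np.linspace(-2.0, 4.0, 61)
surface = np.array([[E(a, b) for b in w2s] for a in w1s])
print("lowest grid height:", surface.min())

# The bottom of the bowl is the least-squares solution.
w_star, *_ = np.linalg.lstsq(X, y, rcond=None)
print("minimum at", w_star, "with error", E(*w_star))
```

Because E is quadratic in (w1, w2), the grid of heights traces out a paraboloid whose unique minimum is the least-squares solution printed at the end.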
Gradient Descent