Forward and Backward Propagation Deep Learning 1703697260
FORWARD PROPAGATION
Forward propagation is the first phase in training a neural network. It is carried out by feeding input data into the network, layer by layer, until an output prediction is produced.
The weights and biases to be used for both layers must be declared first. The weights are initialized randomly so that all units within a layer do not produce the same output, while the biases start at zero. The calculation itself then proceeds from the input according to the rules given below, where W1, W2 and b1, b2 are the weights and biases of the first and second layers, and A denotes the activation (output) of a layer.
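As a concrete illustration of these rules, the sketch below computes Z1, A1, Z2 and A2 from randomly initialized W1, W2 and zero biases b1, b2. It assumes, purely for illustration, a single hidden layer with a tanh activation and a sigmoid output unit; the layer sizes are also arbitrary choices, not taken from the text.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Illustrative sizes: 3 input features, 4 hidden units, 1 output unit.
    n_x, n_h, n_y = 3, 4, 1

    # Weights are initialized randomly (small values) so hidden units differ;
    # biases start at zero, as described above.
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))

    X = np.random.randn(n_x, 1)   # one example as a column vector

    # Forward propagation through both layers.
    Z1 = W1 @ X + b1
    A1 = np.tanh(Z1)              # activation of the first (hidden) layer
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)              # activation of the second (output) layer

    print(A2)                     # the network's output prediction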
Input layer:
Artificial input neurons form the input layer of the neural network: they receive the input data and pass it on to the subsequent layers of artificial neurons for processing. The input layer marks the beginning of the artificial neural network's operation.
An artificial neural network generally consists of three kinds of layers: input, hidden, and output. Convolutional layers and encoding/decoding layers are examples of additional components.
Since the input layer is the first layer in the network, it consists of "passive" neurons that do not receive information from preceding layers. This is one of the distinctive characteristics of the input layer: artificial neurons play a different role at the input level. In theory, the input layer can consist of artificial neurons without weighted inputs, or with separately calculated weights, although artificial neurons in later layers are generally expected to have weighted inputs and to operate on the basis of those weighted inputs.
Dense Layer:
A dense layer is densely connected to its preceding layer: every neuron receives input from every neuron of the previous layer. It changes the dimension of its output through a matrix-vector multiplication followed by the addition of a bias.
We can think of a deep learning model as a sequence of layers. Different kinds of layers can be used within a model, and each has a specific role by its nature: convolutional layers for image processing, LSTM layers for time-series analysis and NLP problems, and so on. In the final stage of a neural network, a dense layer, also called a fully connected layer, is typically used.
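As a minimal sketch of what a dense layer computes (a plain-numpy illustration of the matrix-vector multiplication described above, not any particular library's implementation; the sizes are assumed for illustration):

    import numpy as np

    def dense(x, W, b, activation=None):
        # y = W x + b, optionally followed by an activation function.
        y = W @ x + b
        return activation(y) if activation is not None else y

    x = np.random.randn(5)       # 5-dimensional input from the previous layer
    W = np.random.randn(3, 5)    # weight matrix chosen to produce 3 outputs
    b = np.zeros(3)              # bias vector

    out = dense(x, W, b)         # out.shape == (3,): the dimension has changed
    print(out.shape)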
Activation function:
The activation function decides whether a neuron should be activated by computing the weighted sum of its inputs and then adding the bias. The purpose of the activation function is to introduce non-linearity into the output of the neuron.
Linear Function:
• Equation: the linear function has the equation of a straight line, i.e. y = x.
• No matter how many layers we have, if they are all linear, the final activation of the last layer is still just a linear function of the input to the first layer (see the sketch after this list).
• Range: -inf to +inf
• Uses: the linear activation function is used in just one place, the output layer.
• Issues: if we differentiate the linear function to introduce non-linearity, the result is a constant that no longer depends on the input "x", so our model will not exhibit any new behaviour.
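The following sketch (with arbitrarily chosen layer sizes) checks numerically that stacking two purely linear layers is equivalent to a single linear layer, which is why a linear activation adds no expressive power:

    import numpy as np

    # Two purely linear layers (no activation function at all).
    W1, b1 = np.random.randn(4, 3), np.random.randn(4)
    W2, b2 = np.random.randn(2, 4), np.random.randn(2)

    x = np.random.randn(3)

    # Applying the two linear layers in sequence ...
    two_layers = W2 @ (W1 @ x + b1) + b2

    # ... is exactly a single linear layer with collapsed parameters.
    W, b = W2 @ W1, W2 @ b1 + b2
    one_layer = W @ x + b

    print(np.allclose(two_layers, one_layer))   # True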
Sigmoid Function:
• Equation: A(x) = 1 / (1 + e^(-x)).
• Value Range: 0 to 1
• Nature: - non-linear
• Uses: usually used in the output layer of a binary classifier, where the output can be read as a probability (see the closing remarks of this section).
Tanh Function:
• Tanh, also known as the hyperbolic tangent function, almost always performs better than the sigmoid function. It is a mathematically shifted and rescaled version of the sigmoid, and the two can be derived from each other (a numerical check follows this list).
• Equation: f(x) = tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)) = 2·sigmoid(2x) - 1
• Value Range: -1 to +1
• Nature: - non-linear
• Uses: - usually found in hidden layers. Because its values range from -1 to 1, the mean of a hidden layer's output is 0 or very close to it, which helps centre the data by bringing the average closer to 0. This makes learning much easier for the next layer.
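A small numerical check of the relationship between tanh and the sigmoid stated above (the identity tanh(x) = 2·sigmoid(2x) - 1; the sample points are arbitrary):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.linspace(-5, 5, 101)

    # tanh is a shifted and rescaled sigmoid: tanh(x) = 2*sigmoid(2x) - 1
    print(np.allclose(np.tanh(x), 2.0 * sigmoid(2.0 * x) - 1.0))   # True

    # Outputs are centred around 0, which helps the next layer learn.
    print(np.tanh(x).min(), np.tanh(x).max())   # close to -1 and +1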
RELU Function:
• It stands for Rectified Linear Unit. It is the most widely used activation function, chiefly implemented in the hidden layers of a neural network.
• Equation: A(x) = max(0, x). It gives an output of x if x is positive and 0 otherwise.
• Value Range: [0, inf)
• Nature: non-linear, which means we can easily backpropagate the errors and have multiple layers of neurons activated by the ReLU function.
• Uses: ReLU is computationally less expensive than tanh and sigmoid because it involves simpler mathematical operations. Only a few neurons are active at any time, which makes the network sparse and therefore efficient and easy to compute (a short sketch follows this list).
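A short sketch of the ReLU function and of the sparsity it induces (the random pre-activations are assumed purely for illustration):

    import numpy as np

    def relu(x):
        # A(x) = max(0, x), applied element-wise.
        return np.maximum(0.0, x)

    z = np.random.randn(10)      # pre-activations of a hidden layer
    a = relu(z)

    # Roughly half of the units come out inactive (exactly 0), making the layer sparse.
    print(a)
    print((a == 0).sum(), "of", a.size, "units are inactive")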
Softmax Function:
Although it is also a form of the sigmoid function, the softmax function is most useful when dealing with multiclass classification problems.
Nature: - non-linear
Uses: - it is typically used when handling multiple classes. In image classification problems, the softmax function is usually placed in the output layer, where the output for each class is divided by the sum of all the outputs and thereby squashed into the range 0 to 1, i.e. softmax(x_i) = e^(x_i) / Σ_j e^(x_j).
Output: - the softmax function is most useful in the output layer of a classifier, where we are actually trying to obtain a probability describing the class of each input.
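A minimal softmax sketch (the class scores are made up for illustration; subtracting the maximum before exponentiating is a standard numerical-stability step, not something taken from the text):

    import numpy as np

    def softmax(logits):
        # Subtracting the maximum is a common numerical-stability trick;
        # it does not change the result.
        shifted = logits - np.max(logits)
        exps = np.exp(shifted)
        return exps / exps.sum()

    scores = np.array([2.0, 1.0, 0.1])   # raw outputs for three classes
    probs = softmax(scores)

    print(probs)          # each value lies between 0 and 1
    print(probs.sum())    # 1.0 -> interpretable as class probabilities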
The general rule of thumb is that if you don't know exactly which activation function to use, just use ReLU, as it is the most common activation function for hidden layers and is widely used these days.
If your output is a binary classification, the sigmoid function is a very natural choice for the output layer.
Softmax is a very helpful tool for predicting the probability of each class if your problem involves a distribution over multiple classes.
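To tie this rule of thumb together, here is a hedged sketch of how these choices might look in practice, assuming TensorFlow/Keras is available; the layer sizes, input shape and class count are illustrative assumptions, not taken from the text:

    import tensorflow as tf

    # ReLU in the hidden layers; the output activation depends on the task.
    binary_model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),    # binary classification
    ])

    multiclass_model = tf.keras.Sequential([
        tf.keras.Input(shape=(20,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),   # 10-class classification
    ])

    binary_model.summary()
    multiclass_model.summary()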