2. Neural Networks
ROAD MAP
Why Artificial Neural Networks?
History of ANN
Why now?
Artificial Neural Networks
Perceptron
Loss Function
Activation functions
Binary Cross Entropy
Gradient Descent
Back Propagation
WHY NEURAL NETWORKS?
HISTORY
Late 1800s: Neural networks appear as an analogy to biological systems.
1960s and 70s: Simple neural networks appear. They fall out of favor because the perceptron is not effective by itself, and there were no good algorithms for training multilayer nets.
1986: The backpropagation algorithm appears. Neural networks have a resurgence in popularity, though they are more computationally expensive.
1986 (MLP, RNN): A multilayer perceptron is a class of feedforward artificial neural network; an RNN is a recurrent neural network.
2012 (Dropout): Dropout reduces overfitting in neural networks by preventing complex co-adaptations on training data.
2014 (GANs): A generative adversarial network is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in 2014.
WHY NOW?
DAVID HUBEL AND TORSTEN WIESEL EXPERIMENT ON THE VISUAL CORTEX
NEURON
Computational models inspired by the human brain:
Algorithms that try to mimic the brain.
Massively parallel, distributed systems made up of simple processing units (neurons).
Synaptic connection strengths among neurons are used to store the acquired knowledge.
Knowledge is acquired by the network from its environment through a learning process.
Dendrite: Receives signals from other neurons
Soma: Processes the information
Axon: Transmits the output of this neuron
Synapse: Point of connection to other neurons
ARTIFICIAL NEURON: PERCEPTRON
A perceptron is an algorithm used for supervised learning of binary classifiers. Binary classifiers decide whether an input, usually represented by a vector of features, belongs to a specific class.
In short, a perceptron is a single-layer neural network. It consists of four main parts: input values, weights and a bias, a net sum, and an activation function.
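As a minimal sketch (not from the slides), the perceptron's net sum and step activation can be written in NumPy together with the classic perceptron learning rule; the learning rate, epoch count, and the AND-gate data below are illustrative assumptions.

import numpy as np

def perceptron_predict(x, w, b):
    # Net sum of weighted inputs plus bias, followed by a step activation
    return 1 if np.dot(w, x) + b > 0 else 0

def perceptron_train(X, y, lr=0.1, epochs=10):
    # Classic perceptron learning rule; lr and epochs are illustrative choices
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            error = target - perceptron_predict(xi, w, b)
            w += lr * error * xi   # move weights toward the target
            b += lr * error        # move bias toward the target
    return w, b

# Toy usage: learn the logical AND of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])
w, b = perceptron_train(X, y)
print([perceptron_predict(xi, w, b) for xi in X])   # expected: [0, 0, 0, 1]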
WHY
A linear model is simple to solve but limited in its ability to handle complex problems or higher-degree polynomials.
Linear
ReLU
LeakyReLU
Sigmoid
Tanh
Softmax
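As a minimal sketch (an assumption of this text, not from the slides), the activation functions listed above can be written in NumPy as follows; the leaky-ReLU slope of 0.01 is a common default chosen here for illustration.

import numpy as np

def linear(x):
    return x

def relu(x):
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is an assumed default slope for negative inputs
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def softmax(x):
    e = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return e / e.sum()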
RELU
•Equation: f(x) = max(0, x)
•Range: [0, infinity)
•Pros:
•Both the function and its derivative are monotonic.
•It does not activate all the neurons at the same time: neurons with negative input output zero.
•It is computationally cheap and easy to implement.
•Cons:
•The outputs are not zero-centered, similar to the sigmoid activation function.
•The gradient is zero for negative inputs, so the corresponding weights stop being updated during backpropagation, which can lead to dead neurons (see the sketch below).
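As a minimal sketch illustrating the dead-neuron issue described above (the example values are assumptions, not from the slides):

import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # negative inputs are clipped to 0
print(relu_grad(x))   # gradient is 0 for the negative inputs, so a neuron that
                      # always receives negative input stops learning (a dead neuron)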
SIGMOID OR LOGISTIC ACTIVATION FUNCTION
The sigmoid function curve looks like an S-shape.
Sigmoid function equation: f(x) = 1 / (1 + exp(-x))
Range: (0, 1)
Pros:
The function is differentiable, which means we can find the slope of the sigmoid curve at any point.
The function is monotonic, but its derivative is not.
Cons:
It gives rise to the problem of "vanishing gradients", since the output responds very little to changes in x once the curve saturates (see the sketch below).
Its output is not zero-centered: 0 < output < 1, which makes the gradient updates swing in different directions and makes optimization harder.
Sigmoids saturate and kill gradients.
Sigmoids have slow convergence.
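As a minimal sketch of the saturation behaviour described above (the example values are assumptions, not from the slides), the sigmoid's derivative never exceeds 0.25 and shrinks toward 0 for large |x|, which is the source of vanishing gradients:

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1 - s)   # peaks at 0.25 when x = 0

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid(x))        # saturates toward 0 and 1 at the extremes
print(sigmoid_grad(x))   # near 0 for large |x|; chaining such factors through
                         # many layers makes gradients vanish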