AIN3001 - Introduction to Machine Learning
Introduction to ANN
Biological neuron
From Biological to Artificial Neurons
> Biological Neurons
The signal path from the retina to the human lateral occipital cortex (LOC),
which finally recognizes the object.
Figure credit to Jonas Kubilius
From Biological to Artificial Neurons
> A given input is perceived at multiple levels of abstraction,
such as edges, corners, contours, shapes, and object parts,
up to the whole object.
Perceptron diagram
From Biological to Artificial Neurons
> The Perceptron
How are the outputs of a fully connected layer computed?
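As a rough sketch (not from the slides), a fully connected layer computes its outputs for a whole batch in one vectorized step, outputs = φ(XW + b), where X holds the batch of instances, W the connection weights, b the bias terms, and φ the activation function. The NumPy example below uses made-up shapes and values:

import numpy as np

def dense_layer(X, W, b, activation=np.tanh):
    # X: (batch_size, n_inputs); W: (n_inputs, n_neurons); b: (n_neurons,)
    # Each neuron computes a weighted sum of its inputs plus a bias term,
    # then applies the activation function: outputs = activation(X W + b)
    return activation(X @ W + b)

# Made-up shapes and values, just to show the computation
X = np.array([[0.5, -1.2, 3.0]])        # one instance with 3 input features
W = np.random.randn(3, 4) * 0.1         # 3 inputs feeding 4 neurons
b = np.zeros(4)
print(dense_layer(X, W, b))             # one output per neuron: shape (1, 4)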
Multi-Layer Perceptron
From Biological to Artificial Neurons
> Multi-Layer Perceptron and Backpropagation
For many years, researchers struggled to find a way to train MLPs,
without success.
In 1986, David Rumelhart, Geoffrey Hinton, and Ronald Williams
published a groundbreaking paper introducing the backpropagation
training algorithm.
• Forward pass: for each training instance, the backpropagation algorithm
first makes a prediction.
• Measure the error: the output error is computed using a loss function.
• Backward pass: the algorithm goes through each layer in reverse to measure
the error contribution from each connection.
• Gradient Descent step: the connection weights are slightly tweaked to
reduce the error (these four steps are sketched in code below).
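A minimal sketch of these four steps on a single logistic neuron and a single training instance; the data, learning rate, and squared-error loss are illustrative assumptions, not from the slides:

import numpy as np

# One training instance with 2 features, and its target (made-up values)
x, y = np.array([0.6, -0.4]), 1.0
w, b, lr = np.zeros(2), 0.0, 0.1

# Forward pass: make a prediction
z = w @ x + b
y_hat = 1.0 / (1.0 + np.exp(-z))           # logistic activation

# Measure the error with a loss function (squared error here)
loss = 0.5 * (y_hat - y) ** 2

# Backward pass: the chain rule gives each weight's contribution to the error
dz = (y_hat - y) * y_hat * (1.0 - y_hat)   # dloss/dz
dw, db = dz * x, dz                        # dloss/dw, dloss/db

# Gradient Descent step: slightly tweak the weights to reduce the error
w, b = w - lr * dw, b - lr * db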
From Biological to Artificial Neurons
> Multi-Layer Perceptron and Backpropagation
Algorithm Details
• It handles one mini-batch at a time, and it goes through the full training set
multiple times. Each pass is called an epoch.
• Forward pass: the instances of each mini-batch are passed to the network’s input
layer, and the algorithm computes the output of all the neurons in each successive
layer until it reaches the last layer, the output layer. This is the same as making
predictions, except that all intermediate results are preserved, since they are
needed for the backward pass.
• Next, the algorithm measures the network’s output error by using a loss
function.
• Backward pass: it then computes how much each output connection
contributed to the error by applying the chain rule, and it keeps propagating this
error contribution backward, layer by layer, until it reaches the input layer.
This reverse pass measures the error gradient with respect to every connection
weight in the network.
• Finally, the algorithm performs a Gradient Descent step to tweak all the
connection weights in the network, using the error gradients it just computed
(a mini-batch training-loop sketch follows below).
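A minimal sketch of this mini-batch loop on a small two-layer MLP with manual backpropagation; the data, layer sizes, learning rate, and squared-error loss are made-up assumptions for illustration:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))              # 100 instances, 3 features (made up)
y = rng.normal(size=(100, 1))              # regression targets (made up)

# Two-layer MLP: 3 inputs -> 5 hidden neurons (tanh) -> 1 linear output
W1, b1 = rng.normal(scale=0.1, size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(scale=0.1, size=(5, 1)), np.zeros(1)
lr, batch_size = 0.01, 20

for epoch in range(10):                    # each full pass over the data is one epoch
    for start in range(0, len(X), batch_size):
        Xb, yb = X[start:start+batch_size], y[start:start+batch_size]

        # Forward pass: keep intermediate results for the backward pass
        h = np.tanh(Xb @ W1 + b1)
        y_hat = h @ W2 + b2

        # Measure the error with a loss function (mean squared error)
        err = y_hat - yb

        # Backward pass: chain rule, from the output layer back to the input layer
        dW2 = h.T @ err / len(Xb)
        db2 = err.mean(axis=0)
        dh = err @ W2.T * (1 - h ** 2)      # tanh'(z) = 1 - tanh(z)^2
        dW1 = Xb.T @ dh / len(Xb)
        db1 = dh.mean(axis=0)

        # Gradient Descent step: tweak all connection weights
        W1, b1 = W1 - lr * dW1, b1 - lr * db1
        W2, b2 = W2 - lr * dW2, b2 - lr * db2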
From Biological to Artificial Neurons
> Multi-Layer Perceptron and Backpropagation
Activation Functions
In order for this algorithm to work properly, the authors made a key
change to the MLP’s architecture:
• They replaced the step function with the logistic (sigmoid) function:
σ(z) = 1 / (1 + exp(–z))
• This was essential because the step function contains only flat
segments, so there is no gradient to work with, while the logistic
function has a well-defined nonzero derivative everywhere.
• Other popular activation functions:
Hyperbolic tangent function: tanh(z) = 2σ(2z) – 1
Rectified Linear Unit function: ReLU(z) = max(0, z)
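A small sketch of these activation functions in NumPy (the helper names are illustrative, and tanh is written via the identity above to match the slide):

import numpy as np

def sigmoid(z):
    # Logistic function: smooth, outputs in (0, 1), nonzero derivative everywhere
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    # Hyperbolic tangent: rescaled, shifted logistic; outputs in (-1, 1)
    return 2.0 * sigmoid(2.0 * z) - 1.0     # same result as np.tanh(z)

def relu(z):
    # Rectified Linear Unit: zero for negative inputs, identity for positive ones
    return np.maximum(0.0, z)

z = np.linspace(-3, 3, 7)
print(sigmoid(z), tanh(z), relu(z), sep="\n")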
From Biological to Artificial Neurons
> Multi-Layer Perceptron and Backpropagation
Activation Functions
http://playground.tensorflow.org/