Introduction to Neural Networks
MOTIVATION
• Our brain uses an extremely large, interconnected network of neurons for information
processing and to model the world around us. Simply put, a neuron collects inputs from other
neurons through its dendrites, sums all the inputs, and fires if the resulting value exceeds
a threshold. The fired signal is then sent to other connected neurons through the axon.
Biological Networks
1. The majority of neurons encode their
outputs or activations as a series of brief
electrical pulses (i.e. spikes or action
potentials).
2. Dendrites are the receptive zones that
receive activation from other neurons.
3. The cell body (soma) processes the
incoming activations and converts them
into output activations.
4. Axons are transmission lines that send
activation to other neurons.
5. Synapses allow weighted transmission of
signals (using neurotransmitters) between
axons and dendrites to build up large neural
networks.
Networks of McCulloch-Pitts Neurons
• Artificial neurons have the same basic components as biological
neurons. The simplest ANNs consist of a set of McCulloch-Pitts
neurons labelled by indices k, i, j, with activation flowing between
them via synapses of strengths w_ki and w_ij.
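• A minimal sketch of such a unit in Python (the function name and the AND example are illustrative, not from the slide): it sums the weighted inputs and fires when the sum reaches a threshold.

# McCulloch-Pitts style threshold unit: sum weighted inputs, fire at threshold.
def mp_neuron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Example: two binary inputs with unit weights and threshold 2 implement AND.
print(mp_neuron([1, 1], [1, 1], threshold=2))   # 1 (fires)
print(mp_neuron([1, 0], [1, 1], threshold=2))   # 0 (does not fire)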
MOTIVATION
• Humans are incredible pattern-recognition machines. Our brains
process ‘inputs’ from the world, categorize them (that’s a spider;
that’s ice-cream), and then generate an ‘output’ (run away from the
spider; taste the ice-cream). And we do this automatically and
quickly, with little or no effort.
MOTIVATION
• Neural networks loosely mimic the way our brains solve the problem:
by taking in inputs, processing them and generating an output. Like
us, they learn to recognize patterns, but they do this by training on
labelled datasets. Before we get to the learning part, let’s take a look
at the most basic of artificial neurons: the perceptron, and how it
processes inputs and produces an output.
THE PERCEPTRON
• Perceptrons were developed way back in the 1950s-60s by the scientist Frank Rosenblatt, inspired by
earlier work from Warren McCulloch and Walter Pitts. While today we use other models of artificial
neurons, they follow the general principles set by the perceptron.
• In such a network, the nodes send signals in one direction only. This is called a feed-forward
network.
• A neuron connected with n other neurons receives n inputs (x1, x2, …, xn).
This configuration is called a Perceptron.
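• In symbols, the perceptron computes a weighted sum of its n inputs and fires when that sum exceeds a threshold (a standard formulation; the threshold symbol θ is a notation choice, not from the slide):

\[
\text{output} =
\begin{cases}
1 & \text{if } \sum_{i=1}^{n} w_i x_i > \theta \\
0 & \text{otherwise}
\end{cases}
\]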
THE PERCEPTRON
• Let's understand this better with an example. Say you bike to work. Two
factors go into your decision to go to work: the weather must not be bad,
and it must be a weekday. The weather's not that big a deal, but working
on weekends is a big no-no. The inputs have to be binary, so let's pose the
conditions as yes-or-no questions. Is the weather fine? 1 for yes, 0 for no.
Is it a weekday? 1 for yes, 0 for no.
• Remember, I cannot tell the neural network these conditions; it has to
learn them for itself. How will it know which information is most
important in making its decision? It does this with something called weights.
Remember when I said that the weather's not a big deal, but the weekend is?
Weights are just a numerical representation of these preferences. A higher
weight means the neural network considers that input more important
than the other inputs.
THE PERCEPTRON
• For our example, let's deliberately set suitable weights: 2 for weather and 6 for
weekday. Now how do we calculate the output? We simply multiply each input
by its respective weight and sum the resulting values over all the inputs. For
example, if it's a nice, sunny (1) weekday (1), we would do the following
calculation:
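(1 × 2) + (1 × 6) = 8

• The perceptron then fires if this sum exceeds its threshold. Below is a minimal sketch of that decision in Python; the threshold value of 5 is an assumption for illustration, since the slide does not specify one.

# Worked perceptron decision for the bike-to-work example.
# The threshold of 5 is an assumed, illustrative value (not given in the slide).
weather, weekday = 1, 1                  # sunny (1), weekday (1)
w_weather, w_weekday = 2, 6              # the weights chosen above
total = weather * w_weather + weekday * w_weekday   # 1*2 + 1*6 = 8
threshold = 5
print("go to work" if total > threshold else "stay home")   # go to work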
• DRAWBACK: Suppose you are creating a binary classifier, something which should say
"yes" or "no" (activate or not activate). A step function could do that for you: it outputs
exactly a 1 or a 0. Now think about the use case where you want multiple such neurons
connected to bring in more classes: class1, class2, class3, and so on. What happens if more
than one neuron is "activated"? All activated neurons output a 1 (from the step function).
Which class do you decide on? Hard, complicated. The sketch after the reference below
illustrates this.
* https://towardsdatascience.com/activation-functions-and-its-types-which-is-better-a9a5310cc8f
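• A minimal sketch of the ambiguity described above, in Python; the step function and the score values are illustrative, not from the slide.

# Step-function drawback: several "class" neurons can all output 1,
# so the winning class is ambiguous (scores are made up).
def step(z):
    return 1 if z > 0 else 0

scores = [2.3, 0.7, 1.1]                  # pre-activation sums for class1..class3
activations = [step(z) for z in scores]
print(activations)                        # [1, 1, 1] -> which class is it?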
TYPES OF ACTIVATION FUNCTIONS
Linear function
• A = cx
• A straight-line function where the activation is proportional to the input (which is
the weighted sum from the neuron).
• This way, it gives a range of activations, so it is not binary activation. We
can definitely connect a few neurons together, and if more than one fires, we
could take the max and decide based on that. So that is OK too. Then what is
the problem with this?
• For A = cx, the derivative with respect to x is c. That means the gradient has no
relationship with x: it is a constant gradient, and the descent proceeds on a
constant gradient. If there is an error in prediction, the changes made by
backpropagation are constant and do not depend on the change in input, as the
sketch below shows.
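• A minimal sketch of the constant-gradient problem in Python (c = 2 is an arbitrary illustrative constant):

# Linear activation A = c*x and its derivative.
c = 2.0

def linear(x):
    return c * x

def linear_grad(x):
    return c    # dA/dx = c for every x: the gradient ignores the input

for x in [-1.0, 0.0, 3.0]:
    print(x, linear(x), linear_grad(x))   # gradient is always 2.0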
TYPES OF ACTIVATION FUNCTIONS
Sigmoid function
• Note that if the activation on the hidden layer were linear, the network would
be equivalent to a single-layer network, and wouldn't be able to cope with
non-linearly separable problems.
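• The slide does not show the formula itself; the standard logistic sigmoid, A = 1 / (1 + e^(-x)), and its derivative look like this in Python:

# Standard logistic sigmoid and its derivative (the usual textbook definition).
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)    # the gradient now depends on x, unlike A = cx

print(sigmoid(0.0))         # 0.5
print(sigmoid_grad(0.0))    # 0.25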
BACKPROPAGATION BY EXAMPLE
https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
• Consider a neural network with two inputs, two hidden neurons, and two
output neurons. Additionally, the hidden and output neurons each
include a bias.
• In order to have some numbers to work with, the linked post provides the initial
weights, the biases, and training inputs/outputs: given inputs 0.05 and
0.10, we want the neural network to output 0.01 and 0.99.
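• A minimal sketch of the forward pass for this 2-2-2 network in Python. The specific weight and bias values below are the initial values from the linked mattmazur.com walkthrough (they appear in that post's figure, not in this text), so treat them as that post's assumptions:

# Forward pass for the two-input, two-hidden, two-output network.
# Initial weights/biases follow the linked walkthrough's figure.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

i1, i2 = 0.05, 0.10                       # training inputs
t1, t2 = 0.01, 0.99                       # target outputs
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30   # input -> hidden weights
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # hidden -> output weights
b1, b2 = 0.35, 0.60                       # hidden bias, output bias

h1 = sigmoid(w1 * i1 + w2 * i2 + b1)      # hidden activations
h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
o1 = sigmoid(w5 * h1 + w6 * h2 + b2)      # output activations
o2 = sigmoid(w7 * h1 + w8 * h2 + b2)

# Squared-error loss summed over the two outputs.
E_total = 0.5 * (t1 - o1) ** 2 + 0.5 * (t2 - o2) ** 2
print(o1, o2, E_total)   # approx. 0.7514, 0.7729, 0.2984 per the linked post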