Multilayer Perceptron
What are (everyday) computer systems good at... and not so good at?

Good at:
- Rule-based systems: doing what the programmer wants them to do

Not so good at:
- Dealing with noisy data
- Dealing with unknown environment data
Neural networks, by contrast, offer:
- Massive parallelism
- Fault tolerance
- Adapting to circumstances
Neural networks learn from experience. They are useful:
- when we can't formulate an algorithmic solution
- when we can get lots of examples of the behaviour we require
- when we need to pick out the structure from existing data
Structure: large number of highly interconnected processing elements (neurons) working together to process the data
Neural networks are configured for a specific application, such as pattern recognition or data classification, through a learning process. In a biological system, learning involves adjustments to the synaptic connections between neurons.
A neuron: a many-inputs / one-output unit.
- The output can be excited or not excited.
- Incoming signals from other neurons determine whether the neuron fires ("excites").
- The output is subject to attenuation in the synapses, which are the junction parts of the neuron.
Dendrites:
nerve fibres carrying electrical signals to the cell
Cell body:
computes a non-linear function of its input
Axon:
single long fibre that carries the electrical signal from the cell body to other neurons
Synapse:
the point of contact between the axon of one cell and the dendrite of another, regulating a chemical connection whose strength affects the input to the cell.
Mathematical representation
The neuron calculates a weighted sum of inputs and compares it to a threshold. If the sum is higher than the threshold, the output is set to 1, otherwise to -1.
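A minimal sketch of this thresholding step in Python (the function name, weights, and threshold value below are illustrative, not from the slides):

```python
def threshold_neuron(inputs, weights, threshold):
    # Weighted sum of the inputs, compared to the threshold:
    # output is 1 if the sum exceeds the threshold, otherwise -1.
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > threshold else -1

# Illustrative example: two inputs with equal weights and threshold 0.5
print(threshold_neuron([1, 1], [0.6, 0.6], 0.5))   # -> 1  (1.2 > 0.5)
print(threshold_neuron([1, -1], [0.6, 0.6], 0.5))  # -> -1 (0.0 <= 0.5)
```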
Non-linearity
The function applied to the weighted sum is non-linear (here, a hard threshold); this is what makes the neuron capable of modelling more than linear maps.
A simple perceptron
It's a single-unit network. Training changes each weight by an amount proportional to the difference between the desired output D and the actual output Y:

    ΔW_i = η (D − Y) I_i

where η is the learning rate and I_i is the i-th input.
The network has 2 inputs and one output, all binary. The output is 1 when the weighted sum of the inputs exceeds the threshold, and -1 otherwise.
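A short sketch of the training loop under these assumptions (the learning rate, epoch count, and AND-style training set are illustrative choices, not from the slides; the update lines implement ΔW_i = η (D − Y) I_i):

```python
eta = 0.1
w = [0.0, 0.0]   # weights
b = 0.0          # bias (the threshold folded into the sum)

def predict(x):
    s = w[0] * x[0] + w[1] * x[1] + b
    return 1 if s > 0 else -1

# binary inputs encoded as ±1, targets in {-1, +1} (logical AND as an example task)
data = [([1, 1], 1), ([1, -1], -1), ([-1, 1], -1), ([-1, -1], -1)]

for _ in range(20):                    # a few training epochs
    for x, d in data:
        y = predict(x)
        w[0] += eta * (d - y) * x[0]   # ΔW_i = η (D − Y) I_i
        w[1] += eta * (d - y) * x[1]
        b += eta * (d - y)             # bias treated as a weight on a fixed +1 input

print([(x, predict(x)) for x, _ in data])  # all four patterns correctly classified
```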
What do the extra layers gain you? Start by looking at what a single layer can't do.
XOR Problem
Minsky & Papert (1969) offered a solution to the XOR problem by combining perceptron unit responses using a second layer of units.
[Figure: a two-layer network (units 1 and 2 feeding unit 3, each with a +1 bias input) separating the XOR points (1,1), (1,-1), (-1,1), (-1,-1).]
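To make the two-layer idea concrete, here is a sketch with hand-set weights; the specific weight values are an assumed construction, not read off the figure:

```python
def sign(s):
    return 1 if s > 0 else -1

# Two-layer network computing XOR on ±1 inputs. Hidden unit A computes AND,
# hidden unit B computes OR, and the output unit fires only when B fires
# but A does not ("B and not A" = XOR).
def xor_net(x1, x2):
    a = sign(x1 + x2 - 1.5)    # AND of the two inputs
    b = sign(x1 + x2 + 1.5)    # OR of the two inputs
    return sign(-a + b - 0.5)  # fires when OR fires but AND does not

for x1, x2 in [(1, 1), (1, -1), (-1, 1), (-1, -1)]:
    print((x1, x2), '->', xor_net(x1, x2))
# (1,1) -> -1, (1,-1) -> 1, (-1,1) -> 1, (-1,-1) -> -1
```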
Properties of architecture
- No connections within a layer
- No direct connections between input and output layers
- Fully connected between layers
- Often more than 3 layers
- Number of output units need not equal number of input units
- Number of hidden units per layer can be more or less than input or output units

Each unit computes y_i = f( Σ_{j=1}^{n} w_ij x_j + b_i ).
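A minimal forward-pass sketch of this architecture (numpy-based; the 2-3-1 shape, random weights, and tanh as the nonlinearity f are assumed for illustration, not from the slides):

```python
import numpy as np

def forward(x, layers, f=np.tanh):
    # Each layer computes y_i = f(sum_j w_ij x_j + b_i), i.e. y = f(W @ x + b).
    # Layers are fully connected; there are no connections within a layer.
    for W, b in layers:
        x = f(W @ x + b)
    return x

# Illustrative 2-3-1 network (the shapes are the point here, not the values)
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((3, 2)), rng.standard_normal(3)),  # input -> hidden
    (rng.standard_normal((1, 3)), rng.standard_normal(1)),  # hidden -> output
]
print(forward(np.array([1.0, -1.0]), layers))
```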
Why MLP?
- Popularity: the most used type of NN
- Universal approximators: general-purpose models with a huge number of applications
- Non-linearity: capable of modelling complex functions
- Robustness: good at ignoring irrelevant inputs and noise
- Adaptability: can adapt their weights and/or topology in response to environment changes
- Statisticians: perform flexible data analysis
- Engineers: exploit MLP capabilities in several areas
- Cognitive scientists: describe models of thinking
- Biologists: interpret DNA sequences
Applications
- Classification (discrete outputs)
- Regression (numeric outputs)
- Reinforcement learning (output is not perfectly known)
THANK YOU