Lecture 3 MLP
DS5007
Sahely Bhadra
[email protected]
Perceptron
• Developed by Frank Rosenblatt in the late 1950s
• The initial version was implemented in hardware (the Mark I Perceptron)
[Figure: perceptron units with weighted inputs and bias b; one variant with a ReLU activation]
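A minimal sketch of such a unit, assuming NumPy; the weights and bias below are made up for illustration:

```python
import numpy as np

def perceptron_unit(x, w, b):
    """Classic perceptron: step activation on the affine input w.x + b."""
    return 1 if np.dot(w, x) + b > 0 else 0

def relu_unit(x, w, b):
    """The ReLU variant: same affine input, rectified linear activation."""
    return max(0.0, np.dot(w, x) + b)

# Hypothetical 2-input unit: fires only when both inputs are on (AND-like)
w = np.array([1.0, 1.0])
b = -1.5
print(perceptron_unit(np.array([1, 1]), w, b))  # 1
print(relu_unit(np.array([1, 1]), w, b))        # 0.5
```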
MLP as a Classifier
Perceptron as a Classifier
● Linear Classifier
Decision boundary: w1 x1 + w2 x2 = T (output 1 where w1 x1 + w2 x2 ≥ T, output 0 otherwise)
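A small sketch of this linear classifier in plain Python; the weights and threshold are hypothetical:

```python
def linear_classify(x1, x2, w1, w2, T):
    """Output 1 on one side of the line w1*x1 + w2*x2 = T, else 0."""
    return 1 if w1 * x1 + w2 * x2 >= T else 0

# Hypothetical boundary: x1 + x2 = 1
print(linear_classify(0.9, 0.8, w1=1.0, w2=1.0, T=1.0))  # 1 (above the line)
print(linear_classify(0.1, 0.2, w1=1.0, w2=1.0, T=1.0))  # 0 (below the line)
```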
Modeling Complex Decision Boundaries (1)
Modeling Complex Decision Boundaries (2)
Modeling Complex Decision Boundaries (3)
Modeling Complex Decision Boundaries (4)
[Figure: five perceptrons, one per edge of a pentagon in the (x1, x2) plane, feed an AND unit; region labels show how many of the five units fire (5 inside the pentagon, 4 or 3 outside)]
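The figure's construction can be sketched in code: each perceptron defines a half-plane, and the AND unit fires only where all of them do. The square region below is a hypothetical stand-in for the pentagon in the slide:

```python
import numpy as np

def inside_convex_region(x, hyperplanes):
    """AND of several perceptrons: x is inside iff every unit fires.

    hyperplanes: list of (w, b) pairs; a unit fires when w.x + b >= 0.
    """
    return all(np.dot(w, x) + b >= 0 for w, b in hyperplanes)

# Hypothetical example: the unit square as the AND of four half-planes
square = [
    (np.array([ 1.0,  0.0]), 0.0),   # x1 >= 0
    (np.array([-1.0,  0.0]), 1.0),   # x1 <= 1
    (np.array([ 0.0,  1.0]), 0.0),   # x2 >= 0
    (np.array([ 0.0, -1.0]), 1.0),   # x2 <= 1
]
print(inside_convex_region(np.array([0.5, 0.5]), square))  # True
print(inside_convex_region(np.array([1.5, 0.5]), square))  # False
```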
More Complex Decision Boundaries
[Figure: two AND sub-networks combined by an OR unit, modeling a non-convex decision region]
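Adding an OR layer on top of several AND sub-networks yields unions of convex regions, which need not be convex. A sketch, assuming NumPy; the two squares are hypothetical:

```python
import numpy as np

def in_half_planes(x, hyperplanes):
    """AND layer: fires iff x satisfies every half-plane (w.x + b >= 0)."""
    return all(np.dot(w, x) + b >= 0 for w, b in hyperplanes)

def in_union(x, regions):
    """OR layer: fires iff x lies in at least one convex AND-region."""
    return any(in_half_planes(x, r) for r in regions)

# Two disjoint unit squares -> a non-convex (disconnected) decision region
left  = [(np.array([ 1.0, 0.0]),  0.0), (np.array([-1.0, 0.0]), 1.0),
         (np.array([ 0.0, 1.0]),  0.0), (np.array([ 0.0,-1.0]), 1.0)]
right = [(np.array([ 1.0, 0.0]), -2.0), (np.array([-1.0, 0.0]), 3.0),
         (np.array([ 0.0, 1.0]),  0.0), (np.array([ 0.0,-1.0]), 1.0)]
print(in_union(np.array([0.5, 0.5]), [left, right]))  # True  (left square)
print(in_union(np.array([2.5, 0.5]), [left, right]))  # True  (right square)
print(in_union(np.array([1.5, 0.5]), [left, right]))  # False (in the gap)
```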
More Complex Decision Boundaries
Note: Capacity of an MLP (1)
● Universal Approximation Theorem (Hornik 1991)
○ “a single hidden layer neural network with a linear output unit can
approximate any continuous function arbitrarily well, given enough
hidden units”
● The result also holds for MLPs that use other types of
activation functions.
● The theorem, however, does not guarantee that there is a
learning algorithm that can find the necessary parameter
values!
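The theorem can be illustrated (not proved) numerically. The sketch below, an addition to these notes rather than part of the slides, uses random ReLU hidden units and fits only the linear output layer by least squares; the worst-case error on sin(x) shrinks as hidden units are added:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 200)[:, None]  # sample points
y = np.sin(x).ravel()                        # target continuous function

def max_error_one_hidden_layer(n_hidden):
    """Random ReLU hidden layer + least-squares linear output unit."""
    W = rng.normal(size=(1, n_hidden))                # random input weights
    b = rng.uniform(-2 * np.pi, 2 * np.pi, n_hidden)  # random biases
    H = np.maximum(0.0, x @ W + b)                    # hidden activations
    c, *_ = np.linalg.lstsq(H, y, rcond=None)         # fit output weights
    return np.max(np.abs(H @ c - y))                  # worst-case error

for n in (5, 50, 500):
    print(n, "hidden units -> max error", max_error_one_hidden_layer(n))
```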
Note: Capacity of an MLP (2)
● A single-hidden-layer MLP is a universal function approximator
○ It can approximate any continuous function to arbitrary precision
○ But it may require an arbitrarily large number of nodes in the
hidden layer
● Deeper networks require far fewer nodes for the same
approximation error
○ How many layers?
● Too few layers with a small number of nodes per layer give
insufficient capacity.
● How can we determine whether the MLP has sufficient capacity
to model the data?
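One common practical heuristic, offered here as an addition to the slides: check whether the MLP can drive training error to (near) zero on a tiny subset of the data. If it cannot, the model likely lacks capacity (or the optimizer settings are at fault). A minimal NumPy sketch with a hand-written one-hidden-layer network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 2))                 # tiny subset of the data
y = (X[:, 0] * X[:, 1] > 0).astype(float)    # a non-linear (XOR-like) target

def train_mlp(n_hidden, steps=5000, lr=0.1):
    """One-hidden-layer MLP (tanh + sigmoid), plain gradient descent."""
    W1 = rng.normal(size=(2, n_hidden)); b1 = np.zeros(n_hidden)
    w2 = rng.normal(size=n_hidden);      b2 = 0.0
    for _ in range(steps):
        h = np.tanh(X @ W1 + b1)                    # hidden activations
        p = 1 / (1 + np.exp(-(h @ w2 + b2)))        # output probability
        g = (p - y) / len(y)                        # dLoss/dlogit (BCE)
        gh = np.outer(g, w2) * (1 - h ** 2)         # backprop through tanh
        w2 -= lr * h.T @ g;  b2 -= lr * g.sum()
        W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(axis=0)
    return np.mean((p > 0.5) == y)                  # training accuracy

# Too little capacity should fail to overfit even 16 points
for n in (1, 2, 8):
    print(n, "hidden units -> training accuracy", train_mlp(n))
```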