
Lecture 3

Neural Network Structure and Back Propagation
Decision Boundary
• 0 hidden layers: linear classifier
  – Hyperplanes
[Figure: network mapping inputs x1, x2 directly to the output; the decision boundary is a hyperplane]
Decision Boundary
• 1 hidden layer
  – Boundary of a convex region (open or closed)
[Figure: network with one hidden layer over inputs x1, x2; the decision boundary encloses a convex region]
Decision Boundary
• 2 hidden layers
  – Combinations of convex regions
[Figure: network with two hidden layers over inputs x1, x2 and output y; the decision boundary combines convex regions]
Decision Functions: Different Levels of Abstraction
• We don’t know the “right” levels of abstraction
• So let the model figure it out!

Example from Honglak Lee (NIPS 2010)
Decision Functions: Different Levels of Abstraction
Face Recognition:
  – A deep network can build up increasingly higher levels of abstraction
  – Lines, parts, regions

Example from Honglak Lee (NIPS 2010)
Decision Functions: Different Levels of Abstraction
[Figure: deep network with input x1, x2, x3, …, xM; Hidden Layer 1 (a1, a2, …, aD); Hidden Layer 2 (b1, b2, …, bE); Hidden Layer 3 (c1, c2, …, cF); and output y]

Example from Honglak Lee (NIPS 2010)
Neural Network Architectures
Even for a basic neural network, there are many design decisions to make (see the sketch below):
1. # of hidden layers (depth)
2. # of units per hidden layer (width)
3. Type of activation function (nonlinearity)
4. Form of objective function
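As an illustration only (a hypothetical sketch, not code from the lecture), the four design decisions above can be read directly off a tiny MLP constructor: `depth` and `width` fix the hidden layers, `activation` fixes the nonlinearity, and the objective is attached to the linear output afterwards.

```python
# Hypothetical sketch: the four design decisions as constructor arguments.
import numpy as np

def build_mlp(n_inputs, n_outputs, depth=2, width=16, activation=np.tanh, seed=0):
    """Return a list of (W, b) pairs: `depth` hidden layers of `width` units each."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + [width] * depth + [n_outputs]
    layers = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    return layers, activation

def forward(x, layers, activation):
    """Apply the nonlinearity after each hidden layer; leave the last layer
    linear so an objective (squared error, softmax cross-entropy, ...) can be
    attached on top."""
    for W, b in layers[:-1]:
        x = activation(x @ W + b)
    W, b = layers[-1]
    return x @ W + b
```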
Activation Functions
Sigmoid / Logistic Function
So far, we’ve assumed that the activation function (nonlinearity) is always the sigmoid function…
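For reference, the sigmoid squashes any real-valued input into (0, 1), and its derivative has a convenient closed form:

$$\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \sigma'(z) = \sigma(z)\big(1 - \sigma(z)\big)$$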
Activation Functions
• A new change: modifying the nonlinearity
  – The logistic function is not widely used in modern ANNs

Alternate 1: tanh
  – Like the logistic function, but shifted to the range [-1, +1]

Slide from William Cohen
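For reference, tanh is a rescaled and recentered logistic function, so it saturates the same way but is zero-centered:

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} = 2\,\sigma(2z) - 1 \in (-1, +1)$$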


Objective Functions for NNs
• Regression:
  – Use the same objective as Linear Regression
  – Quadratic loss (i.e. mean squared error)
• Classification:
  – Use the same objective as Logistic Regression
  – Cross-entropy (i.e. negative log likelihood)
  – This requires probabilities, so we add an additional “softmax” layer at the end of our network (see the sketch below)
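As a small illustration (a sketch assuming a 1-D vector of class scores, not code from the lecture), the softmax layer turns the network's final scores into probabilities, and cross-entropy is the negative log likelihood of the correct class:

```python
# Sketch of the classification objective: softmax + cross-entropy.
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

def cross_entropy(scores, true_class):
    # Negative log probability assigned to the correct class.
    probs = softmax(scores)
    return -np.log(probs[true_class])

# Example: 3-class scores from the last (linear) layer of the network.
print(cross_entropy(np.array([2.0, 0.5, -1.0]), true_class=0))
```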
Multi-Class Output
[Figure: network with inputs x1, x2, x3, …, xM; one hidden layer a1, a2, …, aD; and K outputs y1, …, yK]
Training: Backpropagation
• Question 1: When can we compute the gradients of the parameters of an arbitrary neural network?
• Question 2: When can we make the gradient computation efficient?
Training: Chain Rule

Given: $y_1 = f(u_1, \ldots, u_J)$, where each intermediate quantity $u_j = g_j(x_2)$

Chain Rule:
$$\frac{d y_1}{d x_2} = \sum_{j=1}^{J} \frac{\partial y_1}{\partial u_j}\,\frac{d u_j}{d x_2}$$

[Figure: computation graph in which x2 feeds into u1, u2, …, uJ, which feed into y1]
Training: Chain Rule

Given: $y_1 = f(u_1, \ldots, u_J)$, where each intermediate quantity $u_j = g_j(x_2)$

Chain Rule:
$$\frac{d y_1}{d x_2} = \sum_{j=1}^{J} \frac{\partial y_1}{\partial u_j}\,\frac{d u_j}{d x_2}$$

Backpropagation is just repeated application of the chain rule from Calculus 101.

[Figure: the same computation graph, with x2 feeding into u1, u2, …, uJ, which feed into y1]
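To make the chain rule concrete, here is a small made-up example (the functions f and g below are illustrative assumptions, not from the slides): y1 = f(u1, …, uJ) with u_j = g_j(x2), where the derivative is computed by summing the per-path products and checked against a finite-difference approximation.

```python
# Made-up example of the chain rule
#   dy1/dx2 = sum_j (dy1/du_j)(du_j/dx2),
# checked against a finite-difference approximation.
import numpy as np

J = 3

def g(x2):                       # u_j = g_j(x2) = sin((j+1) * x2)
    return np.sin(np.arange(1, J + 1) * x2)

def f(u):                        # y1 = f(u1, ..., uJ) = sum_j u_j^2
    return np.sum(u ** 2)

x2 = 0.7
u = g(x2)
dy_du = 2 * u                                                    # dy1/du_j
du_dx = np.arange(1, J + 1) * np.cos(np.arange(1, J + 1) * x2)   # du_j/dx2
grad_chain = dy_du @ du_dx                                       # sum over paths

eps = 1e-6
grad_fd = (f(g(x2 + eps)) - f(g(x2 - eps))) / (2 * eps)
print(grad_chain, grad_fd)       # the two should agree closely
```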
Training: Backpropagation
Case 1: Logistic Regression
[Figure: inputs x1, x2, x3, …, xM connected by parameters θ1, θ2, θ3, …, θM directly to the output y]
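As a worked step for Case 1 (assuming the cross-entropy objective from the earlier slide, with prediction $\hat{y} = \sigma\!\left(\sum_{m=1}^{M} \theta_m x_m\right)$ and true label $y^* \in \{0, 1\}$), the loss is $J = -\big(y^* \log \hat{y} + (1 - y^*) \log(1 - \hat{y})\big)$, and applying the chain rule through the sigmoid gives a particularly simple gradient:

$$\frac{\partial J}{\partial \theta_m} = (\hat{y} - y^*)\, x_m$$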
Training: Backpropagation
[Figure: network with inputs x1, x2, x3, …, xM, a hidden layer z1, z2, …, zD, and output y]
Training: Backpropagation
Case 2: Neural Network
[Figure: network with inputs x1, x2, x3, …, xM, hidden units z1, z2, …, zD, and output y]
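Below is a minimal sketch of backpropagation for Case 2, assuming sigmoid hidden units, a sigmoid output, and the binary cross-entropy objective (these assumptions match Case 1 and the earlier objective slide, not necessarily the exact setup used in the lecture). The backward pass is just the chain rule applied layer by layer, reusing the quantities saved during the forward pass.

```python
# Sketch of backprop for a one-hidden-layer network (assumptions: sigmoid
# activations and binary cross-entropy, matching Case 1 above).
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward_backward(x, y_true, W1, b1, w2, b2):
    # Forward pass: x (M,) -> hidden z (D,) -> output y_hat (scalar).
    a1 = W1 @ x + b1            # pre-activations of the hidden layer
    z = sigmoid(a1)             # hidden units z1..zD
    a2 = w2 @ z + b2            # output pre-activation
    y_hat = sigmoid(a2)
    loss = -(y_true * np.log(y_hat) + (1 - y_true) * np.log(1 - y_hat))

    # Backward pass: repeated chain rule, reusing forward quantities.
    da2 = y_hat - y_true        # dLoss/da2 for sigmoid + cross-entropy
    dw2 = da2 * z               # dLoss/dw2
    db2 = da2
    dz = da2 * w2               # dLoss/dz
    da1 = dz * z * (1 - z)      # dLoss/da1 through the hidden sigmoid
    dW1 = np.outer(da1, x)      # dLoss/dW1
    db1 = da1
    return loss, (dW1, db1, dw2, db2)

# Example with M = 3 inputs and D = 4 hidden units.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
loss, grads = forward_backward(x, 1.0, rng.normal(size=(4, 3)),
                               np.zeros(4), rng.normal(size=4), 0.0)
print(loss)
```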
Summary
1. Neural Networks…
   – provide a way of learning features
   – are highly nonlinear prediction functions
   – (can be) a highly parallel network of logistic regression classifiers
   – discover useful hidden representations of the input
2. Backpropagation…
   – provides an efficient way to compute gradients
   – is a special case of reverse-mode automatic differentiation
