

COMP9444: Neural Networks and Deep Learning
Week 3b. Hidden Unit Dynamics

Alan Blair
School of Computer Science and Engineering
June 11, 2024
Outline

➛ geometry of hidden unit activations (8.2)
➛ limitations of 2-layer networks
➛ vanishing / exploding gradients
➛ alternative activation functions (6.3)
➛ ways to avoid overfitting in neural networks (5.2-5.3)

Encoder Networks

Inputs Outputs
10000 10000
01000 01000
00100 00100
00010 00010
00001 00001

➛ identity mapping through a bottleneck
➛ also called the N–M–N task
➛ used to investigate hidden unit representations
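
A minimal PyTorch sketch of such an N–M–N encoder, shown here for the 5–2–5 case; the tanh hidden layer, cross-entropy loss, learning rate and epoch count are illustrative choices, not anything prescribed by the slides:

import torch
import torch.nn as nn

N, M = 5, 2                          # e.g. the 5-2-5 encoder task

X = torch.eye(N)                     # the N one-hot input patterns
targets = torch.arange(N)            # output should reproduce the input (identity mapping)

model = nn.Sequential(
    nn.Linear(N, M),                 # input-to-hidden ("bottleneck")
    nn.Tanh(),
    nn.Linear(M, N),                 # hidden-to-output
)

optimizer = torch.optim.SGD(model.parameters(), lr=0.5)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(model(X), targets)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    hidden = torch.tanh(model[0](X)) # N points in the M-dimensional hidden unit space
print(hidden)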

N–2–N Encoder
Hidden Unit Space: [figure]

8–3–8 Encoder

Exercise:

➛ Draw the hidden unit space for 2-2-2, 3-2-3, 4-2-4 and 5-2-5 encoders.
➛ Represent the input-to-hidden weights for each input unit by a point, and the
hidden-to-output weights for each output unit by a line.
➛ Now consider the 8-3-8 encoder with its 3-dimensional hidden unit space.
→ what shape would be formed by the 8 points representing the
input-to-hidden weights for the 8 input units?
→ what shape would be formed by the planes representing the
hidden-to-output weights for each output unit?
Hint: think of two platonic solids, which are “dual” to each other.
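
To experiment with this exercise numerically, the sketch below (illustrative only) builds an 8–3–8 encoder in PyTorch and pulls out the weights you would plot; in practice you would train the model first, e.g. as in the earlier sketch:

import torch
import torch.nn as nn

# a hypothetical 8-3-8 encoder; in practice you would train it before plotting
model = nn.Sequential(nn.Linear(8, 3), nn.Tanh(), nn.Linear(3, 8))

W_in  = model[0].weight.detach()     # shape (3, 8): column i holds the weights from input unit i
b_in  = model[0].bias.detach()
W_out = model[2].weight.detach()     # shape (8, 3): row j (plus its bias) defines the plane for output unit j
b_out = model[2].bias.detach()

# pre-activation point in hidden unit space for one-hot input unit i (before the tanh squashing)
points = W_in.t() + b_in             # shape (8, 3), one row per input unit
print(points)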

Hinton Diagrams
[Figure: a network with a 30x32 sensor input retina, 4 hidden units, and 30 output units ranging from Sharp Left through Straight Ahead to Sharp Right]

➛ used to visualize weights in higher-dimensional networks
➛ white = positive, black = negative
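
A minimal sketch of how such a diagram can be drawn with matplotlib, loosely following the well-known matplotlib Hinton-diagram recipe; the random matrix at the end is just a stand-in for a real weight matrix:

import numpy as np
import matplotlib.pyplot as plt

def hinton(matrix, ax=None):
    # square area = weight magnitude; white = positive, black = negative
    ax = ax or plt.gca()
    ax.patch.set_facecolor('gray')
    ax.set_aspect('equal', 'box')
    max_weight = 2 ** np.ceil(np.log2(np.abs(matrix).max()))
    for (y, x), w in np.ndenumerate(matrix):
        colour = 'white' if w > 0 else 'black'
        size = np.sqrt(abs(w) / max_weight)
        ax.add_patch(plt.Rectangle([x - size / 2, y - size / 2], size, size,
                                   facecolor=colour, edgecolor=colour))
    ax.autoscale_view()
    ax.invert_yaxis()

hinton(np.random.randn(4, 8))        # e.g. a (hidden x input) weight matrix
plt.show()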

Learning Face Direction

Weight Space Symmetry (8.2)

➛ swapping any pair of hidden nodes leaves the overall function unchanged
➛ for any hidden node, reversing the sign of all its incoming and outgoing weights also leaves the function unchanged (assuming a symmetric transfer function)
➛ hidden nodes with identical input-to-hidden weights would in theory never separate, so all weights must begin with different random values
➛ in practice, all hidden nodes may try to do a similar job at first, then gradually specialize
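
A quick numerical check of the sign-flip symmetry, using a tiny 3-4-2 tanh network with arbitrary weights (the sizes and seed are illustrative):

import numpy as np

rng = np.random.default_rng(0)

# a tiny 3-4-2 network with tanh hidden units (weights are arbitrary)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal(2)

def forward(x, W1, b1, W2, b2):
    return W2 @ np.tanh(W1 @ x + b1) + b2

# flip the sign of all weights into and out of hidden unit 1 (including its bias)
W1f, b1f, W2f = W1.copy(), b1.copy(), W2.copy()
W1f[1, :] *= -1
b1f[1]    *= -1
W2f[:, 1] *= -1

# since tanh(-z) = -tanh(z), the overall function is unchanged
x = rng.standard_normal(3)
print(np.allclose(forward(x, W1, b1, W2, b2), forward(x, W1f, b1f, W2f, b2)))   # True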

Controlled Nonlinearity

➛ for small weights, each layer implements an approximately linear function, so multiple layers also implement an approximately linear function
➛ for large weights, the transfer function approximates a step function, so computation becomes digital and learning becomes very slow
➛ with typical weight values, a two-layer neural network implements a function which is close to linear, but takes advantage of a limited degree of nonlinearity
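
For the tanh transfer function, this can be seen directly from a few values (a small numpy check):

import numpy as np

x = np.array([0.01, 0.1, 0.5, 2.0, 5.0])
print(np.tanh(x))          # [0.01  0.0997  0.4621  0.964  0.9999]: ~linear for small x, saturated for large x
print(1 - np.tanh(x)**2)   # the derivative: close to 1 for small x, close to 0 once the unit saturates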

Limitations of Two-Layer Neural Networks
Some functions are difficult for a 2-layer network to learn.
[Figure: the Twin Spirals data set, two interleaved spirals plotted over the range −6 to 6 in both dimensions]
For example, this Twin Spirals problem is difficult to learn with a 2-layer network,
but it can be learned using a 3-layer network.

First Hidden Layer

Second Hidden Layer

Network Output
Adding Hidden Layers

➛ the twin spirals can be learned by a 3-layer network
➛ the first hidden layer learns linearly separable features
➛ the second hidden layer combines these to produce more complex features
➛ the learning rate and initial weight values must be small
➛ learning can be improved using the Adam optimizer
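
A minimal PyTorch sketch of a 3-layer network of this kind for the twin spirals task; the layer widths, initialization scale and learning rate below are illustrative assumptions, not values taken from the lectures:

import torch
import torch.nn as nn

# two hidden layers (a "3-layer network" in the slides' counting)
model = nn.Sequential(
    nn.Linear(2, 16), nn.Tanh(),     # first hidden layer: linearly separable features
    nn.Linear(16, 16), nn.Tanh(),    # second hidden layer: combinations of those features
    nn.Linear(16, 1),                # one output logit for the spiral class
)

# small initial weights
for p in model.parameters():
    nn.init.normal_(p, std=0.1)

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(x, y):
    # x: (batch, 2) points; y: (batch, 1) float labels in {0, 1}
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()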

Vanishing / Exploding Gradients

➛ training by backpropagation in networks with many layers is difficult
➛ when the weights are small, the differentials become smaller and smaller as we backpropagate through the layers, and end up having no effect
➛ when the weights are large, the activations in the higher layers may saturate to extreme values
➛ when the weights are large, the differentials may also be repeatedly multiplied by large values in places where the transfer function is steep, causing them to blow up to large values
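
A small PyTorch sketch that makes the vanishing case visible: a deep stack of sigmoid layers with default (small) weights, where the weight gradients shrink as we move back towards the input; the depth and width are arbitrary choices:

import torch
import torch.nn as nn

depth, width = 20, 32
layers = []
for _ in range(depth):
    layers += [nn.Linear(width, width), nn.Sigmoid()]
net = nn.Sequential(*layers)

x = torch.randn(8, width)
net(x).sum().backward()

# weight-gradient norm per layer: with sigmoid units and small weights,
# the layers closer to the input (small i) receive much smaller gradients
for i, layer in enumerate(net):
    if isinstance(layer, nn.Linear):
        print(i, layer.weight.grad.norm().item())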

Vanishing / Exploding Gradients

Ways to avoid vanishing / exploding gradients:

➛ new activation functions
➛ weight initialization (Week 4)
➛ batch normalization (Week 4)
➛ skip connections (Week 4)
➛ long short term memory (LSTM) (Week 5)

Activation Functions (6.3)
[Figure: plots of four activation functions over the range −4 to 4: Sigmoid, Rectified Linear Unit (ReLU), Hyperbolic Tangent, and Scaled Exponential Linear Unit (SELU)]
Activation Functions

➛ sigmoid and hyperbolic tangent were traditionally used for 2-layer networks, but suffer from the vanishing gradient problem in deeper networks
➛ rectified linear units (ReLUs) are popular for deep networks (including convolutional networks); their gradients do not vanish, because the derivative is either 0 or 1
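
For reference, a minimal numpy sketch of the four activation functions shown above; the SELU constants are the standard published values, everything else is illustrative:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))                  # range (0, 1)

def tanh(x):
    return np.tanh(x)                                # range (-1, 1)

def relu(x):
    return np.maximum(0.0, x)                        # derivative is 0 (x < 0) or 1 (x > 0)

def selu(x, alpha=1.6733, scale=1.0507):             # standard SELU constants
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.linspace(-4, 4, 9)
for f in (sigmoid, tanh, relu, selu):
    print(f.__name__, np.round(f(x), 3))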
