
CHAPTER 6

Deep Learning Knowledge

Deep Learning (BSD4543)


DR. KU MUHAMMAD NA’IM KU KHALIF
Content
Chapter 6.1: Improvement of Deep Learning
Chapter 6.2: Example: ReLU and Dropout
Chapter 6.1: Improvement of Deep Learning
By the end of this topic, you should be able to:
▪ improve a deep learning model by adjusting its parameter settings.
Deep Learning
▪ Briefly, deep learning is a machine learning technique that employs a deep neural network. As you know, a deep neural network is a multi-layer neural network that contains two or more hidden layers.
▪ Although this may be disappointingly simple, it is the true essence of deep learning. Figure 6.1 illustrates the concept of deep learning and its relationship to machine learning.

Figure 6.1: The concept of deep learning and its relationship to machine learning.
▪ In Figure 6.1, the deep neural network takes the place of the final product of machine learning.
▪ The learning rule becomes the algorithm that generates the model (the deep neural network) from the training data.
Improvement of the Deep Neural Network

▪ Despite its outstanding achievements, deep learning is not built on any single critical new technology. The innovation of deep learning is the result of many small technical improvements. This section briefly introduces why the deep neural network originally yielded poor performance and how deep learning overcame this problem.
▪ The reason that the neural network with deeper layers yielded
poorer performance was that the network was not properly trained.
The backpropagation algorithm experiences the following three
primary difficulties in the training process of the deep neural
network:
1. Vanishing Gradient
2. Overfitting
3. Computational Load
Vanishing Gradient
▪ The gradient in this context can be thought of as a concept similar to the delta of the back-propagation algorithm. The vanishing gradient occurs in the training process with the back-propagation algorithm when the output error fails to reach the nodes farther from the output layer.
▪ The back-propagation algorithm trains the neural network as it propagates the output error backward to the hidden layers. However, as the error hardly reaches the first hidden layer, its weights cannot be adjusted. Therefore, the hidden layers that are close to the input layer are not properly trained. There is no point in adding hidden layers if they cannot be trained (see Figure 6.2). A short numerical sketch after the figure illustrates how the error shrinks layer by layer.
Figure 6.2: The vanishing gradient.
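▪ The following Python sketch is not from the slides; it is a hypothetical illustration of the effect. The sigmoid derivative is at most 0.25, so multiplying by it at every layer (as the chain rule requires) drives the propagated error toward zero; the weight factors that also appear in the chain rule are omitted for simplicity.

import numpy as np

# Hypothetical illustration (not from the slides): the back-propagated error is
# scaled by the activation derivative at every layer. The sigmoid derivative is
# at most 0.25, so the error shrinks roughly geometrically with depth.
def sigmoid_derivative(x):
    s = 1.0 / (1.0 + np.exp(-x))
    return s * (1.0 - s)

rng = np.random.default_rng(0)
error = 1.0                                   # error delta at the output layer
for layer, v in enumerate(rng.standard_normal(10), start=1):
    error *= sigmoid_derivative(v)            # one sigmoid node per layer
    print(f"layer {layer}: propagated error ~ {error:.2e}")
# After ten layers the propagated error is tiny, so the layers near the input barely learn.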

▪ The representative solution to the vanishing gradient is the use of the Rectified Linear Unit (ReLU) function as the activation function. It is known to transmit the error better than the sigmoid function. The ReLU function is defined as follows:

    ReLU(x) = x if x > 0, and 0 otherwise (equivalently, ReLU(x) = max(0, x))

▪ Figure 6.3 depicts the ReLU function. It produces zero for negative inputs and conveys the input unchanged for positive inputs. Its implementation is extremely easy as well.

Figure 6.3: The ReLU function.

▪ The sigmoid function limits the node's outputs to unity regardless of the input's magnitude. In contrast, the ReLU function does not exert such a limit.
▪ Isn't it interesting that such a simple change resulted in a drastic improvement of the learning performance of the deep neural network? Another element that we need for the back-propagation algorithm is the derivative of the ReLU function.
▪ By the definition of the ReLU function, its derivative is given as:

    ReLU'(x) = 1 if x > 0, and 0 otherwise
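▪ As a minimal Python sketch (not the course's code; the function names are illustrative), the ReLU function and its derivative defined above can be written as:

import numpy as np

def relu(x):
    # Passes positive inputs through unchanged and outputs zero otherwise.
    return np.maximum(0, x)

def relu_derivative(x):
    # Derivative of ReLU: 1 for positive inputs, 0 for non-positive inputs.
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))              # approximately [0. 0. 0. 0.5 2.]
print(relu_derivative(x))   # [0. 0. 0. 1. 1.]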

▪ In addition, cross entropy-driven learning rules may improve the performance. Furthermore, advanced gradient descent methods, which are numerical methods that better reach the optimum value, are also beneficial for the training of the deep neural network. A simple sketch of one such method follows.
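▪ The slides do not name a specific method, so the sketch below uses gradient descent with momentum as an illustrative assumption; it is one common improvement over the plain weight update.

import numpy as np

# Illustrative assumption: momentum as an example of "advanced" gradient descent.
# Plain gradient descent updates w by -alpha * gradient; momentum keeps a running
# velocity so updates build up along directions that stay consistent over time.
def momentum_update(w, gradient, velocity, alpha=0.01, beta=0.9):
    velocity = beta * velocity - alpha * gradient   # accumulate past update directions
    return w + velocity, velocity                   # move by the velocity, not the raw gradient

w = np.zeros((2, 2))
velocity = np.zeros_like(w)
gradient = np.array([[0.5, -0.2], [0.1, 0.3]])
for _ in range(3):                                  # repeated steps build momentum
    w, velocity = momentum_update(w, gradient, velocity)
print(w)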
Overfitting
▪ The reason that the deep neural network is especially vulnerable to overfitting is that the model becomes more complicated as it includes more hidden layers, and hence more weights. As addressed in Chapter 1, a complicated model is more vulnerable to overfitting. Here is the dilemma: deepening the layers for higher performance drives the neural network to face a core challenge of machine learning, overfitting.
▪ The most representative solution is dropout, which trains only some of the randomly selected nodes rather than the entire network. It is very effective, while its implementation is not very complex. Figure 6.4 explains the concept of dropout: some nodes are randomly selected at a certain percentage and their outputs are set to zero to deactivate them.
Figure 6.4: Dropout is where some nodes are randomly selected and their outputs are set to zero to deactivate the nodes.
▪ Dropout effectively prevents overfitting as it continuously alters the nodes and weights in the training process. Adequate dropout percentages are approximately 50% for hidden layers and 25% for the input layer.
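▪ As a minimal Python sketch (names are illustrative, not the course's code), a dropout mask can be generated and applied to a layer's output vector as follows. The surviving outputs are rescaled by 1/(1 - ratio), a common convention that keeps the expected output level unchanged; the slides only require the dropped outputs to be set to zero.

import numpy as np

def dropout_mask(layer_size, ratio, rng=np.random.default_rng()):
    # Returns a mask that zeroes roughly `ratio` of the nodes and scales the
    # survivors by 1/(1 - ratio) so the expected layer output is preserved.
    keep = (rng.random(layer_size) >= ratio).astype(float)
    return keep / (1.0 - ratio)

# Example: drop about 50% of a hidden layer's outputs during training
y_hidden = np.array([0.2, 0.9, 0.4, 0.7, 0.1, 0.6])
print(y_hidden * dropout_mask(y_hidden.size, ratio=0.5))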
▪ Another prevailing method used to prevent overfitting is adding regularization terms, which penalize the magnitude of the weights, to the cost function. This method works as it keeps the neural network's effective architecture as simple as possible, and hence reduces the possible onset of overfitting. Furthermore, the use of massive training data is also very helpful as the potential bias due to particular data is reduced.
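▪ A minimal sketch of the regularization idea, assuming the common L2 (sum of squared weights) form of the regularization term; the slides do not specify which form is used.

import numpy as np

def regularized_cost(cost, weight_matrices, lam=0.01):
    # Adds an L2 penalty on the weights to the original cost (assumed form;
    # other penalties, such as L1, are also used in practice).
    penalty = sum(np.sum(W ** 2) for W in weight_matrices)
    return cost + lam * penalty / 2.0

# Example: the penalty grows with the magnitude of the weights
W1 = np.array([[0.5, -1.0], [2.0, 0.1]])
W2 = np.array([[0.3], [-0.7]])
print(regularized_cost(cost=0.25, weight_matrices=[W1, W2]))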
Computational Load
▪ The last challenge is the time required to complete the training. The
number of weights increases geometrically with the number of
hidden layers, thus requiring more training data. This ultimately
requires more calculations to be made. The more computations the
neural network performs, the longer the training takes.
▪ This problem is a serious concern in the practical development of the neural network. If a deep neural network requires a month to train, it can only be modified about twelve times a year. A useful research study is hardly possible in this situation. This trouble has been relieved to a considerable extent by the introduction of high-performance hardware, such as GPUs, and algorithms, such as batch normalization.
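▪ The slides mention batch normalization only by name; the following Python sketch shows its forward step as an assumed illustration, not the course's code. Each mini-batch of activations is normalized per node and then rescaled with learnable parameters.

import numpy as np

def batch_norm_forward(x, gamma=1.0, beta=0.0, eps=1e-5):
    # x: mini-batch of activations for one layer, shape (batch_size, num_nodes).
    # gamma, beta: learnable scale and shift parameters.
    mean = x.mean(axis=0)                    # per-node mean over the mini-batch
    var = x.var(axis=0)                      # per-node variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # normalize to zero mean, unit variance
    return gamma * x_hat + beta              # rescale and shift

# Example: a mini-batch of 4 samples for a layer with 3 nodes
x = np.random.default_rng(0).standard_normal((4, 3)) * 10 + 5
print(batch_norm_forward(x).mean(axis=0))    # approximately zero per node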
▪ The minor improvements that this section introduced are the drivers that have made deep learning the star of machine learning.
▪ The three primary research areas of machine learning are usually said to be image recognition, speech recognition, and natural language processing.
▪ Each of these areas had been studied separately with its own specifically suited techniques. However, deep learning currently outperforms the previous techniques in all three areas.
Chapter 6.2: Example: ReLU and Dropout
By the end of this topic, you should be able to:
▪ implement the ReLU function
▪ implement dropout
ReLU Function
▪ This section introduces the ReLU function via an example. The function drelu trains the given deep neural network using the back-propagation algorithm. It takes the weights of the network and the training data and returns the trained weights:

    [W1, W2, W3, W4] = drelu(W1, W2, W3, W4, X, D)

▪ where W1, W2, W3, and W4 are the weight matrices of the input-hidden1, hidden1-hidden2, hidden2-hidden3, and hidden3-output layers, respectively. X and D are the input and correct output matrices of the training data.
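▪ The slides describe drelu only by its interface. The following Python/NumPy sketch shows what a function of that shape could look like; the ReLU hidden layers, sigmoid output, delta rule, and per-sample updates are assumed details, not the course's actual code.

import numpy as np

def relu(x):
    return np.maximum(0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def drelu_sketch(W1, W2, W3, W4, X, D, alpha=0.01):
    # Sketch of one training epoch for a network with three ReLU hidden layers.
    for x, d in zip(X, D):
        x = np.atleast_1d(x).reshape(-1, 1)          # column vector input
        d = np.atleast_1d(d).reshape(-1, 1)          # column vector target
        # Forward pass
        v1 = W1 @ x;  y1 = relu(v1)
        v2 = W2 @ y1; y2 = relu(v2)
        v3 = W3 @ y2; y3 = relu(v3)
        y = sigmoid(W4 @ y3)
        # Backward pass: propagate deltas through the layers
        delta4 = y * (1 - y) * (d - y)               # sigmoid derivative at the output
        delta3 = (v3 > 0) * (W4.T @ delta4)          # ReLU derivative is 1 where v > 0
        delta2 = (v2 > 0) * (W3.T @ delta3)
        delta1 = (v1 > 0) * (W2.T @ delta2)
        # Weight updates (stochastic gradient descent, one sample at a time)
        W4 += alpha * delta4 @ y3.T
        W3 += alpha * delta3 @ y2.T
        W2 += alpha * delta2 @ y1.T
        W1 += alpha * delta1 @ x.T
    return W1, W2, W3, W4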
Dropout
▪ This section presents the code that implements dropout. We use the sigmoid activation function for the hidden nodes.
▪ This code is mainly used to see how dropout is coded, as the training data may be too simple for us to perceive the actual improvement from reduced overfitting.
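▪ The slides reference the dropout code without reproducing it here. The Python sketch below shows one way a forward pass with a sigmoid hidden layer and dropout could look; the function and variable names are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_with_dropout(W1, W2, x, ratio=0.5, rng=np.random.default_rng()):
    # Illustrative sketch only. During training, roughly `ratio` of the hidden
    # outputs are zeroed and the survivors are rescaled by 1/(1 - ratio).
    x = x.reshape(-1, 1)
    y1 = sigmoid(W1 @ x)                                    # sigmoid hidden layer
    mask = (rng.random(y1.shape) >= ratio) / (1.0 - ratio)  # dropout mask
    y1 = y1 * mask                                          # deactivate dropped nodes
    return sigmoid(W2 @ y1)                                 # output layer

# Example usage with random weights
rng = np.random.default_rng(1)
W1 = rng.standard_normal((5, 3))
W2 = rng.standard_normal((1, 5))
print(forward_with_dropout(W1, W2, np.array([1.0, 0.5, -0.3])))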
