
Deep Learning - Week 8

1. What are the challenges associated with using the Tanh(x) activation function?

(a) It is not zero centered
(b) Computationally expensive
(c) Non-differentiable at 0
(d) Saturation

Correct Answer: (b),(d)


Solution: Tanh(x) is zero-centered, but the problem of saturation still persists, and evaluating tanh (which involves exponentials) is computationally expensive.
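As a minimal illustration (assuming NumPy), the gradient of tanh, 1 − tanh²(x), vanishes for large |x|, which is the saturation problem; evaluating tanh also requires exponentials rather than a simple comparison:

```python
import numpy as np

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])

t = np.tanh(x)
tanh_grad = 1.0 - t ** 2   # d/dx tanh(x) = 1 - tanh^2(x)

print(tanh_grad)  # ≈ [8.2e-09, 0.071, 1.0, 0.071, 8.2e-09] -> near-zero gradient at large |x|
```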

2. Which of the following problems makes training a neural network harder while using
sigmoid as the activation function?

(a) Not-continuous at 0
(b) Not-differentiable at 0
(c) Saturation
(d) Computationally expensive

Correct Answer: (c),(d)


Solution: Sigmoid is computationally expensive because of the exponentiation it requires. Sigmoid neurons also saturate easily, and because their outputs lie in [0, 1] (never zero-centered), the possible directions of the weight updates are restricted, as sketched below.
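A minimal sketch with made-up numbers of the restricted update directions: because sigmoid outputs are always positive, every weight gradient in a neuron, ∂L/∂wᵢ = δ·hᵢ, shares the sign of the upstream gradient δ, so within one update the weights can only all increase or all decrease (the zig-zag effect).

```python
import numpy as np

h = np.array([0.9, 0.1, 0.7])   # hypothetical sigmoid outputs from the previous layer (all > 0)
delta = -0.8                    # hypothetical upstream gradient at this neuron

grad_w = delta * h              # dL/dw_i = delta * h_i
print(grad_w)                   # [-0.72, -0.08, -0.56]: all components share the sign of delta
```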

3. Consider the Exponential ReLU (ELU) activation function, defined as:


f(x) = x if x > 0, and f(x) = a(e^x − 1) if x ≤ 0,

where a ≠ 0. Which of the following statements is true?

(a) The function is discontinuous at x = 0.
(b) The function is non-differentiable at x = 0.
(c) Exponential ReLU can produce negative values.
(d) Exponential ReLU is computationally less expensive than ReLU.

Correct Answer: (c)


Solution:
1. Discontinuity at x = 0:
(a) Right-hand limit: lim_{x→0+} f(x) = 0.
(b) Left-hand limit: lim_{x→0−} a(e^x − 1) = a(1 − 1) = 0.
(c) Since both limits and f(0) are equal, the function is continuous at x = 0.
2. Non-differentiability at x = 0:
(a) Right derivative: lim_{x→0+} f′(x) = 1.
(b) Left derivative: lim_{x→0−} a·e^x = a.
(c) The function is differentiable at x = 0 only if a = 1.
(d) Since a ≠ 0 but not necessarily 1, differentiability depends on a, making this statement inconclusive.
3. Computational expense compared to ReLU:
(a) ReLU uses max(0, x), which is a simple comparison.
(b) ELU involves an exponential operation, which is more computationally expensive.
4. Possibility of negative values:
(a) For x < 0, f(x) = a(e^x − 1).
(b) Since e^x − 1 < 0 for x < 0, f(x) is negative if a > 0.
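A minimal NumPy sketch of the ELU defined above (here the parameter a defaults to 1), confirming continuity at 0 and the possibility of negative outputs:

```python
import numpy as np

def elu(x, a=1.0):
    """ELU: x for x > 0, a*(exp(x) - 1) for x <= 0."""
    return np.where(x > 0, x, a * (np.exp(x) - 1.0))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(elu(x))   # ≈ [-0.950, -0.632, 0.0, 1.0, 3.0]: negative for x < 0, identity for x > 0
```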

4. We have observed that the sigmoid neuron has become saturated. What might be
the possible output values at this neuron?

(a) 0.0666
(b) 0.589
(c) 0.9734
(d) 0.498
(e) 1

Correct Answer: (a),(c),(e)


Solution: Since the neuron has saturated, its output values are close to 0 or 1, so only 0.0666, 0.9734 and 1 are plausible outputs.
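A quick check (assuming NumPy): inverting the sigmoid with the logit shows which of the finite options require a large-magnitude pre-activation, i.e. a saturated neuron (an output of exactly 1 corresponds to the pre-activation tending to +∞).

```python
import numpy as np

outputs = np.array([0.0666, 0.589, 0.9734, 0.498])
pre_activations = np.log(outputs / (1.0 - outputs))   # logit = inverse of the sigmoid
print(pre_activations)   # ≈ [-2.64, 0.36, 3.60, -0.01]: only 0.0666 and 0.9734 need large |z|
```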

5. What is the gradient of the sigmoid function at saturation?


Correct Answer: 0
Solution: At saturation the sigmoid output approaches 0 or 1; since its gradient is σ′(x) = σ(x)(1 − σ(x)), the gradient approaches zero, causing vanishing gradients.
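A quick numeric check (assuming NumPy) of σ′(x) = σ(x)(1 − σ(x)):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, 0.0, 10.0])
grad = sigmoid(x) * (1.0 - sigmoid(x))   # derivative of the sigmoid
print(grad)   # ≈ [4.5e-05, 0.25, 4.5e-05]: essentially zero at saturation, at most 0.25 at x = 0
```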

6. Which of the following are common issues caused by saturating neurons in deep
networks?

(a) Vanishing gradients
(b) Slow convergence during training
(c) Overfitting
(d) Increased model complexity
Correct Answer: (a),(b)
Solution: Saturating neurons, especially with sigmoid activation functions, cause vanishing gradients, making it hard to propagate error signals backward and slowing down learning.
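A minimal sketch (made-up numbers) of why this slows convergence: backpropagation multiplies the local gradients layer by layer, so a chain of saturated sigmoid neurons shrinks the error signal reaching the early layers towards zero.

```python
import numpy as np

# Hypothetical local gradients sigma'(z) of five saturated sigmoid neurons along a path
local_grads = np.array([0.01, 0.02, 0.01, 0.03, 0.02])

backprop_grad = 1.0 * np.prod(local_grads)   # upstream gradient of 1 multiplied through the chain
print(backprop_grad)                         # 1.2e-09: almost nothing reaches the early layers
```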

7. Given a neuron initialized with weights w1 = 0.9, w2 = 1.7, and inputs x1 = 0.4,
x2 = −0.7, calculate the output of a ReLU neuron.

Correct Answer: 0
Solution: The weighted sum is 0.9 × 0.4 + 1.7 × (−0.7) = 0.36 − 1.19 = −0.83. ReLU
outputs the max of 0 and the input, so the result is max(0, −0.83) = 0.
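The same computation as a minimal NumPy sketch (no bias term is given in the question, so none is used):

```python
import numpy as np

w = np.array([0.9, 1.7])
x = np.array([0.4, -0.7])

z = np.dot(w, x)             # 0.9*0.4 + 1.7*(-0.7) = -0.83
output = np.maximum(0.0, z)  # ReLU: max(0, z)
print(z, output)             # ≈ -0.83 and 0.0
```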

8. Which of the following is incorrect with respect to the batch normalization process
in neural networks?

(a) We normalize the output produced at each layer before feeding it into the next
layer
(b) Batch normalization leads to a better initialization of weights.
(c) Backpropagation can be used after batch normalization
(d) Variance and mean are not learnable parameters.

Correct Answer: (d)


Solution:
1. ”We normalize the output produced at each layer before feeding it into the next
layer.”
Batch Normalization (BN) normalizes activations by adjusting them to have zero
mean and unit variance before passing them to the next layer.
The formula for batch normalization is:
x̂ = (x − µ) / √(σ² + ε)

This helps stabilize learning and speeds up convergence.


2. ”Batch normalization leads to a better initialization of weights.”
BN helps mitigate issues like internal covariate shift, making training less dependent on careful weight initialization.
It allows training with higher learning rates and stabilizes deep networks.
3. ”Backpropagation can be used after batch normalization.”
BN is differentiable, and gradients can flow through it during backpropagation.
During training, gradients are computed normally, taking into account the transformation applied by BN.
4. ”Variance and mean are not learnable parameters.” (Incorrect)
BN initially normalizes using the batch statistics (mean µ and variance σ²).
However, batch normalization introduces learnable parameters: γ (a scaling parameter) and β (a shifting parameter).
These parameters allow the model to learn an optimal representation instead of always
enforcing zero mean and unit variance.
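A minimal sketch (assuming NumPy) of the training-time batch-normalization transform described above, including the learnable scale γ and shift β:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply the learnable scale and shift."""
    mu = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                     # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # x_hat = (x - mu) / sqrt(var + eps)
    return gamma * x_hat + beta

x = np.random.randn(32, 4) * 5.0 + 3.0      # a batch of 32 samples with 4 features
gamma, beta = np.ones(4), np.zeros(4)       # learnable parameters at their usual initialization
y = batch_norm(x, gamma, beta)
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))   # ≈ zeros and ones
```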
9. Which of the following is an advantage of unsupervised pre-training in deep learning?

(a) It helps in reducing overfitting
(b) Pre-trained models converge faster
(c) It requires fewer computational resources
(d) It improves the accuracy of the model

Correct Answer: (a),(b),(d)


Solution: Unsupervised pre-training helps reduce overfitting in deep neural networks by providing a better initialization of the weights. The technique requires more computational resources than purely supervised training, but it can improve the accuracy of the model. Additionally, pre-trained models have been shown to converge faster than non-pre-trained models.

10. How can you tell if your network is suffering from the Dead ReLU problem?

(a) The loss function is not decreasing during training
(b) A large number of neurons have zero output
(c) The accuracy of the network is not improving
(d) The network is overfitting to the training data

Correct Answer: (b)


Solution: The Dead ReLU problem can be detected by checking the outputs of the neurons in the network. If a large number of neurons output zero for every input, the network may be suffering from the Dead ReLU problem. This typically happens when a neuron's pre-activation is negative for all inputs, for example because its bias has been pushed to a large negative value, leaving the neuron permanently inactive with zero gradient.
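A minimal diagnostic sketch (assuming NumPy and made-up activations): count the units whose ReLU output is zero for an entire batch; a large fraction of such always-zero units points to dead ReLUs.

```python
import numpy as np

# Hypothetical ReLU activations for a batch of 128 inputs and 256 hidden units,
# with pre-activations shifted negative so that many units rarely (or never) fire
activations = np.maximum(0.0, np.random.randn(128, 256) - 2.0)

dead = np.all(activations == 0.0, axis=0)   # units that output zero for every input in the batch
print(f"{dead.mean():.1%} of units never activated on this batch")
```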
