
Introduction to Deep Learning (I2DL)

Mock Exam - Solutions


IN2346 - SoSe 2020
Technical University of Munich

Problem                 Full Points    Your Score
1 Multiple Choice                10
2 Short Questions                12
3 Backpropagation                 9

Total                            31

Total Time: 31 Minutes


Allowed Resources: None

The purpose of this mock exam is to give you an idea of the type of problems and the
structure of the final exam. The mock exam is not graded. The final exam will most
probably be composed of 90 graded points with a total time of 90 minutes.

Multiple Choice Questions:


• For all multiple choice questions, any number of answers can be correct, i.e. zero (!),
one, or multiple.

• For each question, you’ll receive 2 points if all boxes are answered correctly (i.e. correct
answers are checked, wrong answers are not checked) and 0 otherwise.

How to Check a Box:


• Please cross the respective box: (interpreted as checked)

• If you change your mind, please fill the box: (interpreted as not checked)

• If you change your mind again, please circle the box: (interpreted as checked)

Part I: Multiple Choice (10 points)


1. (2 points) To avoid overfitting, you can...

[ ] increase the size of the network.
[x] use data augmentation.
[ ] use Xavier initialization.
[x] stop training earlier.

2. (2 points) What is true about Dropout?

[ ] The training process is faster and more stable to initialization when using Dropout.
[ ] You should not use Leaky ReLU as non-linearity when using Dropout.
[x] Dropout acts as regularization.
[x] Dropout is applied differently during training and testing.
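For illustration, a minimal NumPy sketch of the last point (hypothetical helper name dropout, not from the exam), assuming inverted dropout: during training, units are randomly zeroed and the survivors rescaled by 1/(1-p); at test time the layer is the identity.

import numpy as np

def dropout(x, p=0.5, train=True):
    # Inverted dropout: only active during training.
    if not train:
        return x                                         # test time: identity
    mask = (np.random.rand(*x.shape) > p) / (1.0 - p)    # keep with prob. 1-p, rescale
    return x * mask

x = np.ones((2, 4))
print(dropout(x, train=True))    # some entries are 0, the surviving ones are 2.0
print(dropout(x, train=False))   # unchanged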
3. (2 points) What is true about Batch Normalization?

[x] Batch Normalization uses two trainable parameters that allow the network to undo the normalization effect of this layer if needed.
[x] Batch Normalization makes the gradients more stable so that we can train deeper networks.
[x] At test time, Batch Normalization uses a mean and variance computed on training samples to normalize the data.
[x] Batch Normalization has learnable parameters.
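A schematic NumPy sketch of how these points fit together (hypothetical names batchnorm_forward, gamma, beta; not from the exam): gamma and beta are the two learnable parameters, and the test-time branch normalizes with running statistics estimated on the training data.

import numpy as np

def batchnorm_forward(x, gamma, beta, running_mean, running_var,
                      train=True, momentum=0.9, eps=1e-5):
    if train:
        mean, var = x.mean(axis=0), x.var(axis=0)         # batch statistics
        running_mean = momentum * running_mean + (1 - momentum) * mean
        running_var = momentum * running_var + (1 - momentum) * var
    else:
        mean, var = running_mean, running_var             # statistics from training
    x_hat = (x - mean) / np.sqrt(var + eps)               # normalize
    out = gamma * x_hat + beta                            # learnable scale and shift
    return out, running_mean, running_var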

4. (2 points) Which of the following optimization methods use first order momentum?

[ ] Stochastic Gradient Descent
[x] Adam
[ ] RMSProp
[ ] Gauss-Newton

5. (2 points) Making your network deeper by adding more parametrized layers will always...

[x] slow down training and inference speed.
[ ] reduce the training loss.
[ ] improve the performance on unseen data.
[x] (Optional: make your model sound cooler when bragging about it at parties.)

Part II: Short Questions (12 points)


1. (2 points) You’re training a neural network and notice that the validation error is significantly lower than the training error. Name two possible reasons for this to happen.

Solution:
The model performs better on unseen data than on training data - this should not
happen under normal circumstances. Possible explanations:

• Training and Validation data sets are not from the same distribution
• Error in the implementation
• ...

2. (2 points) You’re working for a cool tech startup that receives thousands of job applications every day, so you train a neural network to automate the entire hiring process. Your model automatically classifies resumes of candidates, and rejects or sends job offers to all candidates accordingly. Which of the following measures is more important for your model? Explain.
Recall = True Positives / Total Positive Samples
Precision = True Positives / Total Predicted Positive Samples

Solution:
Precision: High precision means a low rate of false positives.
False Negatives are okay: since we get "thousands of applications", it's not too bad if
we miss a few candidates even when they'd be a good fit. However, we don't want
False Positives, i.e. offering a job to people who are not well suited.
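A small self-contained sketch of the two measures (hypothetical helper precision_recall, not from the exam); label 1 means "send an offer":

def precision_recall(y_true, y_pred):
    # y_true / y_pred: lists of 0/1 labels, 1 = "send an offer"
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0   # hurt by false positives
    recall = tp / (tp + fn) if tp + fn else 0.0      # hurt by false negatives
    return precision, recall

print(precision_recall([1, 0, 0, 1, 0], [1, 1, 0, 0, 0]))   # (0.5, 0.5)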

3. (2 points) You’re training a neural network for image classification with a very large dataset. Your friend who studies mathematics suggests: "If you used Newton's method for optimization, your neural network would converge much faster than with gradient descent!" Explain whether this statement is true (1p) and discuss potential downsides of following his suggestion (1p).

Solution:
Faster convergence in terms of the number of iterations ("mathematical view"). (1 pt.)
However: computing or approximating the inverse Hessian is computationally very costly and
not feasible for high-dimensional parameter spaces. (1 pt.)
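A back-of-the-envelope illustration of why the Hessian is the bottleneck (illustrative numbers, not from the exam):

# Memory needed just to store a dense Hessian for a network with n parameters:
n = 10_000_000                          # assume ~10M parameters, small by today's standards
print(n * n * 4 / 1e12, "TB")           # n x n float32 Hessian: ~400 TB
print(n * 4 / 1e6, "MB")                # the gradient used by (S)GD: ~40 MB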

4. (2 points) Your colleague trained a neural network using standard stochastic gradient
descent and L2 weight regularization with four different learning rates (shown below)
and plotted the corresponding loss curves (also shown below). Unfortunately, he
forgot which curve belongs to which learning rate. Please assign each of the learning rate
values below to the curve (A/B/C/D) it probably belongs to and explain your thoughts.
learning_rates = [3e-4, 4e-1, 2e-5, 8e-3]

[Figure: "Training Loss history" - training loss (y-axis, roughly 1.9 to 2.4) over iterations 0-140 (x-axis) for Curve A (red), Curve B (blue), Curve C (green) and Curve D (orange).]

Solution:
Curve A: 4e-1 = 0.4 (Learning Rate is way too high)
Curve B: 2e-5 = 0.00002 (Learning Rate is too low)
Curve C: 8e-3 = 0.008 (Learning Rate is too high)
Curve D: 3e-4 = 0.0003 (Good Learning Rate)

5. (1 point) Explain why we need activation functions.

Solution:
Without non-linearities, our network can only learn linear functions, because the
composition of linear functions is again linear.
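A short NumPy demonstration of this argument (illustration only, not from the exam): two stacked linear layers without a non-linearity collapse into a single equivalent linear layer.

import numpy as np

np.random.seed(0)
W1, b1 = np.random.randn(4, 3), np.random.randn(4)
W2, b2 = np.random.randn(2, 4), np.random.randn(2)
x = np.random.randn(3)

y_stacked = W2 @ (W1 @ x + b1) + b2           # two linear layers, no activation
y_single = (W2 @ W1) @ x + (W2 @ b1 + b2)     # one equivalent linear layer
print(np.allclose(y_stacked, y_single))       # True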

6. (3 points) When implementing a neural network layer from scratch, we usually implement a 'forward' and a 'backward' function for each layer. Explain what these functions do, potential variables that they need to save, which arguments they take, and what they return.

Solution:
Forward Function:

• takes output from previous layer, performs operation, returns result (1 pt.)
• caches values needed for gradient computation during backprop (1 pt.)

Backward Function:

• takes upstream gradient, returns all partial derivatives (1 pt.)
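A minimal sketch of what such a pair of functions could look like for an affine (fully connected) layer (hypothetical names affine_forward / affine_backward, not from the exam):

import numpy as np

def affine_forward(x, W, b):
    # Takes the previous layer's output, performs the operation, returns the result.
    out = x @ W + b
    cache = (x, W)                 # values needed later for the gradient computation
    return out, cache

def affine_backward(dout, cache):
    # Takes the upstream gradient, returns all partial derivatives.
    x, W = cache
    dx = dout @ W.T                # gradient w.r.t. the layer input
    dW = x.T @ dout                # gradient w.r.t. the weights
    db = dout.sum(axis=0)          # gradient w.r.t. the bias
    return dx, dW, db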

7. (0 points) Optional: Given a Convolution Layer with 8 filters, a filter size of 6, a stride
of 2, and a padding of 1. For an input feature map of 32 × 32 × 32, what is the output
dimensionality after applying the Convolution Layer to the input?

Solution:
(32 − 6 + 2·1)/2 + 1 = 14 + 1 = 15 (1 pt.)

15 × 15 × 8 (1 pt.)
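The same computation as a small sketch (hypothetical helper conv_output_size, not from the exam):

def conv_output_size(n_in, kernel, stride, pad):
    return (n_in - kernel + 2 * pad) // stride + 1

side = conv_output_size(32, kernel=6, stride=2, pad=1)
print(side, "x", side, "x", 8)   # 15 x 15 x 8 (depth = number of filters)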

Part III: Backpropagation (9 points)


1. (9 points) Given the following neural network with fully connected layers and ReLU
activations, including two input units (i1, i2), four hidden units (h1, h2) and (h3, h4). The
output units are indicated as (o1, o2) and their targets are indicated as (t1, t2). The
weights and biases of the fully connected layers are called w and b with specific sub-descriptors.
[Network diagram: inputs i1, i2 are connected to hidden units h1, h2 via weights w11, w12, w21, w22 and biases b1, b2; h3 = ReLU(h1) and h4 = ReLU(h2); h3, h4 are connected to outputs o1, o2 via weights w31, w32, w41, w42 and biases b3, b4.]

The values of variables are given in the following table:


Variable i1 i2 w11 w12 w21 w22 w31 w32 w41 w42 b1 b2 b3 b4 t1 t2
Value 2.0 -1.0 1.0 -0.5 0.5 -1.0 0.5 -1.0 -0.5 1.0 0.5 -0.5 -1.0 0.5 1.0 0.5

(a) (3 points) Compute the output (o1, o2) with the input (i1, i2) and network parameters
as specified above. Write down all calculations, including intermediate layer results.

Solution:

Forward pass:

h1 = i1 × w11 + i2 × w21 + b1 = 2.0 × 1.0 − 1.0 × 0.5 + 0.5 = 2.0


h2 = i1 × w12 + i2 × w22 + b2 = 2.0 × −0.5 + −1.0 × −1.0 − 0.5 = −0.5
h3 = max(0, h1 ) = h1 = 2
h4 = max(0, h2 ) = 0
o1 = h3 × w31 + h4 × w41 + b3 = 2 × 0.5 + 0 × −0.5 − 1.0 = 0
o2 = h3 × w32 + h4 × w42 + b4 = 2 × −1.0 + 0 × 1.0 + 0.5 = −1.5
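A small numerical cross-check of this forward pass in NumPy (illustration only, not from the exam), using the values from the table:

import numpy as np

i = np.array([2.0, -1.0])                       # inputs i1, i2
W1 = np.array([[1.0, -0.5],                     # row i1: [w11, w12]
               [0.5, -1.0]])                    # row i2: [w21, w22]
b12 = np.array([0.5, -0.5])                     # b1, b2
W2 = np.array([[0.5, -1.0],                     # row h3: [w31, w32]
               [-0.5, 1.0]])                    # row h4: [w41, w42]
b34 = np.array([-1.0, 0.5])                     # b3, b4

h = i @ W1 + b12                                # [h1, h2] = [ 2.0, -0.5]
h_relu = np.maximum(0, h)                       # [h3, h4] = [ 2.0,  0.0]
o = h_relu @ W2 + b34                           # [o1, o2] = [ 0.0, -1.5]
print(h, h_relu, o)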

(b) (1 point) Compute the mean squared error of the output (o1 , o2 ) calculated above
and the target (t1 , t2 ).

Solution:
MSE = 1/2 × (t1 − o1)² + 1/2 × (t2 − o2)² = 0.5 × 1.0 + 0.5 × 4.0 = 2.5
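The same number checked in plain Python (illustration only, not from the exam):

o = [0.0, -1.5]            # outputs from part (a)
t = [1.0, 0.5]             # targets t1, t2
mse = 0.5 * (t[0] - o[0]) ** 2 + 0.5 * (t[1] - o[1]) ** 2
print(mse)                 # 2.5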

(c) (5 points) Update the weight w21 using gradient descent with learning rate 0.1 as
well as the loss computed previously. (Please write down all your computations.)

Solution:

Backward pass (Applying chain rule):

∂MSE/∂w21 = ∂(1/2 (t1 − o1)²)/∂o1 × ∂o1/∂h3 × ∂h3/∂h1 × ∂h1/∂w21
          + ∂(1/2 (t2 − o2)²)/∂o2 × ∂o2/∂h3 × ∂h3/∂h1 × ∂h1/∂w21
        = (o1 − t1) × w31 × 1.0 × i2 + (o2 − t2) × w32 × 1.0 × i2
        = (0 − 1.0) × 0.5 × (−1.0) + (−1.5 − 0.5) × (−1.0) × (−1.0)
        = 0.5 − 2.0 = −1.5

Update using gradient descent:

w21⁺ = w21 − lr × ∂MSE/∂w21 = 0.5 − 0.1 × (−1.5) = 0.65
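A numerical cross-check of this gradient via central differences (illustration only, not from the exam; the helper loss simply re-implements the forward pass and MSE above as a function of w21):

def loss(w21):
    h1 = 2.0 * 1.0 + (-1.0) * w21 + 0.5          # i1*w11 + i2*w21 + b1
    h2 = 2.0 * (-0.5) + (-1.0) * (-1.0) - 0.5    # i1*w12 + i2*w22 + b2
    h3, h4 = max(0.0, h1), max(0.0, h2)          # ReLU
    o1 = h3 * 0.5 + h4 * (-0.5) - 1.0            # h3*w31 + h4*w41 + b3
    o2 = h3 * (-1.0) + h4 * 1.0 + 0.5            # h3*w32 + h4*w42 + b4
    return 0.5 * (1.0 - o1) ** 2 + 0.5 * (0.5 - o2) ** 2

eps = 1e-6
grad = (loss(0.5 + eps) - loss(0.5 - eps)) / (2 * eps)
print(round(grad, 4))                            # approx. -1.5
print(0.5 - 0.1 * grad)                          # updated w21, approx. 0.65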

Additional Space for solutions. Clearly mark the problem your answers are
related to and strike out invalid solutions.
