SS 2021
Department of Informatics
Technical University of Munich
Note:
• During the attendance check a sticker containing a unique code will be put on this exam.
• This code contains a unique number that associates this exam with your registration number.
• This number is printed both next to the code and to the signature field in the attendance check list.
Working instructions
• This exam consists of 16 pages with a total of 5 problems.
Please make sure now that you received a complete copy of the exam.
Problem 1 Multiple Choice (18 credits)
Below you can see how to answer multiple-choice questions: check the boxes of all correct answers.
• For each question, you'll receive 2 points if all boxes are answered correctly (i.e., correct answers are checked, wrong answers are not checked) and 0 points otherwise.
1.2 In which cases would you usually reduce the learning rate when training a neural network?
☐ When the training loss stops decreasing
☐ To reduce memory consumption
☐ After increasing the mini-batch size
☐ After reducing the mini-batch size
1.5 Which of the following are affected by multiplying the loss function by a constant positive value when using SGD?
☐ Memory consumption during training
☐ Magnitude of the gradient step
☐ Location of minima
☐ Number of mini-batches per epoch
1.6 Which of the following functions are not suitable as activation functions to add non-linearity to a network?
☐ sin(x)
☐ ReLU(x) − ReLU(−x)
☐ log(ReLU(x) + 1)
1.9 Which of the following datasets are NOT i.i.d. (independent and identically distributed)?
☐ A sequence of (toss number, result) pairs from 10,000 coin flips using biased coins with p(toss result = 1) = 0.7
☐ A set of (image, label) pairs where each image is a frame in a video and each label indicates whether that frame contains humans.
☐ A monthly sample of Munich's population over the past 100 years
☐ A set of (image, number) pairs where each image is a chest X-ray of a different human and each number represents the volume of their lungs.
Problem 2 Short Questions (29 credits)
2.1 Explain the idea of data augmentation (1p). Specify 4 different data augmentation techniques you can apply to a dataset of RGB images (2p).
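For illustration only (an editor's sketch, not part of the exam): four such techniques expressed with torchvision transforms; the parameter values are arbitrary examples.

```python
import torchvision.transforms as T

# A sketch of four common augmentations for RGB images:
# random horizontal flip, random crop with padding,
# color jitter, and random rotation.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),                # mirror the image half the time
    T.RandomCrop(size=64, padding=4),             # shift content via a padded crop
    T.ColorJitter(brightness=0.2, contrast=0.2),  # perturb color statistics
    T.RandomRotation(degrees=15),                 # small random rotations
    T.ToTensor(),
])
```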
2.2 You are training a deep neural network for the task of binary classification using the Binary Cross Entropy loss. What is the expected loss value for the first mini-batch with batch size N = 64 for an untrained, randomly initialized network?
Hint: $\mathrm{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right]$
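Editor's note, as a hedged sanity check: an untrained network is expected to output probabilities around 0.5, so each summand contributes roughly −log(0.5), independent of N = 64.

```latex
% Expected first-batch loss, assuming outputs near 0.5:
\mathrm{BCE} \approx -\frac{1}{N}\sum_{i=1}^{N}\log(0.5) = \log 2 \approx 0.69
```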
2.3 Explain the differences between ReLU, LeakyReLU and Parametric ReLU.
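For reference, a minimal NumPy sketch of the three activations (the 0.01 slope is a common default, not specified by the exam):

```python
import numpy as np

def relu(x):
    # Zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

def leaky_relu(x, slope=0.01):
    # Like ReLU, but negative inputs keep a small fixed slope.
    return np.where(x > 0, x, slope * x)

def parametric_relu(x, a):
    # Like LeakyReLU, but the negative slope `a` is a learnable parameter.
    return np.where(x > 0, x, a * x)
```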
2.4 How will weights be initialized by Xavier initialization? Which mean and variance will the weights have? Which mean and variance will the output data have?
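Editor's reminder (conventions vary between the original Glorot/Bengio paper and course notes, so treat the exact variance as an assumption):

```latex
% Xavier initialization: zero-mean weights whose variance is chosen
% so that activations keep roughly unit variance across layers.
\mathbb{E}[W] = 0, \qquad
\operatorname{Var}(W) = \frac{1}{n_{\text{in}}}
\quad \text{or} \quad
\operatorname{Var}(W) = \frac{2}{n_{\text{in}} + n_{\text{out}}}
```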
2.5 Why do we often refer to L2-regularization as "weight decay"? (Note: this equivalence is only true in the context of SGD.) Derive a mathematical expression that includes the weights W, the learning rate η, and the L2 regularization hyperparameter λ to explain your point.
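A sketch of the expected derivation (editor's note): adding the L2 term $\frac{\lambda}{2}\lVert W \rVert^2$ to the loss makes the SGD step shrink the weights multiplicatively.

```latex
W \leftarrow W - \eta\left(\nabla_W \mathcal{L} + \lambda W\right)
  = (1 - \eta\lambda)\,W - \eta\,\nabla_W \mathcal{L}
```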
2.6 Given a Convolution Layer in a network with 6 filters, kernel size 5, a stride of 3, and a padding of 2. For an input feature map of shape 28 × 28 × 28, what are the dimensions/shape of the output tensor after applying the Convolution Layer to the input?
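The standard output-size formula, applied to the numbers above (editor's sketch, not the official solution):

```latex
% Per spatial dimension: floor((W_in + 2P - K) / S) + 1.
W_{\text{out}} = \left\lfloor \frac{28 + 2 \cdot 2 - 5}{3} \right\rfloor + 1 = 10
% With 6 filters, the output tensor would be 6 x 10 x 10.
```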
2.7 You are given a Convolutional Layer with: number of input channels 3, number of filters 5, kernel size 4, stride 2, padding 1. What is the total number of trainable parameters for this layer? Don't forget to consider the bias.
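A counting sketch (editor's note): each filter spans all input channels, plus one bias per filter; stride and padding contribute no parameters.

```latex
5 \cdot (3 \cdot 4 \cdot 4 + 1) = 5 \cdot 49 = 245
```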
2.8 You are given a fully-connected network with 2 hidden layers, the first of which has 10 neurons and the second of which has 5 neurons. Both layers use dropout with probability 0.5. The network classifies gray-scale images of size 8 × 8 pixels as one of 3 different classes. All neurons include a bias. Calculate the total number of trainable parameters in this network.
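A counting sketch (editor's note): the input has 8 · 8 = 64 values, and dropout adds no trainable parameters.

```latex
(64 \cdot 10 + 10) + (10 \cdot 5 + 5) + (5 \cdot 3 + 3) = 650 + 55 + 18 = 723
```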
2.9 "Breaking the symmetry": Why is initializing all weights of a fully-connected layer to the same value problematic?
2.11 Generative Adversarial Networks (GANs): What is the input to the generator network (1 pt)? What are the two inputs to the discriminator (1 pt)?
2.12 Explain how LSTM networks often outperform traditional RNNs. What in their architecture enables this?
2.13 Explain how batch normalization is applied differently between a fully connected layer and a convolutional layer (1 pt). How many learnable parameters does batch normalization contain following (a) a single fully-connected layer (1 pt), and (b) a single convolutional layer with 16 filters (1 pt)?
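A hedged PyTorch sketch of the distinction (the fully-connected feature size of 100 is illustrative): BatchNorm1d learns a scale γ and shift β per feature, BatchNorm2d per channel.

```python
import torch.nn as nn

# After a fully-connected layer with D output features, batch norm
# normalizes each feature over the batch: 2 * D learnable parameters
# (a scale gamma and a shift beta per feature). D = 100 is illustrative.
bn_fc = nn.BatchNorm1d(num_features=100)   # 200 learnable parameters

# After a convolutional layer with 16 filters, batch norm normalizes
# each channel over batch and spatial dimensions: 2 * 16 = 32 parameters.
bn_conv = nn.BatchNorm2d(num_features=16)  # 32 learnable parameters
```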
Problem 3 Convolutions (13 credits)
You are asked to perform per-pixel semantic segmentation on the Cityscapes dataset, which consists of RGB images of European city streets, and you want to segment the images into 5 classes (vehicle, road, sky, nature, other). You have designed the following network, as seen in the illustration below:
[Figure: network architecture illustration not reproduced here.]
For clarification of notation: the shape after having applied the operation 'conv1' (the first convolutional layer in the network) is 50×32×32.
You are using 2D convolutions with: stride = 2, padding = 1, and kernel_size = 4.
For the MaxPool operation, you are using: stride = 2, padding = 0, and kernel_size = 2.
3.1 What is the shape of the weight matrix of the fully-connected layer 'fc1'? (Ignore the bias.)
3.2 Explain the term 'receptive field' (1p). What is the receptive field of one pixel of the activation map after performing the operation 'maxpool1' (1p)? What is the receptive field of a single neuron in the output of layer 'fc1' (1p)?
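Editor's reference: the receptive field grows with each layer's kernel size, scaled by the product of the strides of all earlier layers.

```latex
r_l = r_{l-1} + (k_l - 1)\prod_{i=1}^{l-1} s_i, \qquad r_0 = 1
% Assuming maxpool1 directly follows conv1 (k=4, s=2):
% r = 1 + (4-1) \cdot 1 + (2-1) \cdot 2 = 6, i.e. a 6x6 input patch.
```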
3.3 You now want to be able to classify finer-grained labels, which comprise 30 classes. What is the minimal change in network architecture needed in order to support this without adding any additional layers?
3.4 Luckily, you found a pre-trained version of this network, which is trained on the original 5 labels (it outputs a tensor of shape 5 × 64 × 64). How can you make use of/build upon this pre-trained network (as a black box) to perform segmentation into 30 classes?
3.5 Luckily, you have gained access to a large dataset of city street images. Unfortunately, these images are not labelled, and you do not have the resources to annotate them. How can you still make use of these images to improve your network? Explain the architecture of any networks that you will use and explain how training will be performed. (Note: this question is independent of (3.3) and (3.4).)
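One possible direction (an editor's sketch under the assumption that unsupervised pre-training is intended, not the official solution): train a convolutional autoencoder on the unlabeled images and reuse its encoder to initialize the segmentation network's feature extractor.

```python
import torch.nn as nn

# A minimal convolutional autoencoder sketch for unsupervised
# pre-training on the unlabeled street images. Channel sizes are
# illustrative, not taken from the exam's network.
autoencoder = nn.Sequential(
    # Encoder: downsample 3-channel images to a compact code.
    nn.Conv2d(3, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    # Decoder: upsample back to the input resolution.
    nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, kernel_size=4, stride=2, padding=1),
)
# Training would minimize a reconstruction loss, e.g. nn.MSELoss(),
# between the input image and the autoencoder's output.
```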
3.6 Instead of taking 64 × 64 images as input, you now want to be able to train the network to segment images of arbitrary size > 64. List, explicitly, two different approaches that would allow this. Your new network should support varying image sizes at run-time, without having to be re-trained.
Problem 4 Optimization (13 credits)
4.1 Explain the idea behind the RMSProp optimizer. How does it enable faster convergence than standard SGD? How does it make use of the gradient?
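Editor's reference, the standard RMSProp update: a running average of squared gradients rescales each parameter's step, so dimensions with consistently large gradients take smaller steps.

```latex
v_t = \beta\, v_{t-1} + (1 - \beta)\, g_t^2, \qquad
\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{v_t} + \epsilon}\, g_t
```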
4.2 What is the bias correction in the ADAM optimizer? Explain the problem that it fixes.
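Editor's reference, the standard bias-corrected moment estimates: $m_t$ and $v_t$ start at zero and are therefore biased toward zero early in training; dividing by $1 - \beta^t$ compensates.

```latex
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad
\hat{v}_t = \frac{v_t}{1 - \beta_2^t}
```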
4.3 You read that when training deeper networks, you may suffer from the vanishing gradients problem. Explain what vanishing gradients are in the context of deep convolutional networks and the underlying cause of the problem.
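Editor's illustration of the underlying cause: backpropagation multiplies per-layer Jacobians, and if each factor is small (e.g. saturated sigmoids, whose derivative is at most 0.25), the product shrinks exponentially with depth.

```latex
\frac{\partial \mathcal{L}}{\partial W_1}
\propto \prod_{l=2}^{L} \frac{\partial z_l}{\partial z_{l-1}}
\;\longrightarrow\; 0 \quad \text{as } L \text{ grows}
```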
4.4 In the following image you can see a segment of a very deep architecture that uses residual connections. [Figure: residual block illustration not reproduced here.] How are residual connections helpful against vanishing gradients? Demonstrate this mathematically by performing a weight update for $w_0$. Make sure to explain how this reduces the effect of vanishing gradients. Hint: write the mathematical expression for $\frac{\partial z}{\partial w_0}$ w.r.t. all other weights.
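Editor's sketch of the key step: with a skip connection z = F(x) + x, the local Jacobian gains an identity term, so a gradient path survives even where ∂F/∂x is small.

```latex
z = F(x) + x \quad\Longrightarrow\quad
\frac{\partial z}{\partial x} = \frac{\partial F(x)}{\partial x} + 1
```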
Problem 5 Multi-Class Classification (18 credits)
Note: If you cannot solve a sub-question and need its answer for a calculation in following sub-questions, mark it as such and use a symbolic placeholder (i.e., the mathematical expression you could not explicitly calculate, plus a note that it is missing from the previous question).
Assume you are given a labeled dataset {X, y}, where each sample $x_i$ belongs to one of C = 10 classes. We denote its corresponding label $y_i \in \{1, ..., 10\}$. In addition, you can assume each data sample is a row vector.
You are asked to train a classifier for this classification task, namely, a 2-layer fully-connected network. For a visualization of the setting, refer to the following illustration:
[Figure: network illustration not reproduced here.]
5.1 Why does one use a Softmax activation at the end of such a classification network? What property does it have that makes it a common choice for a classification task?

$$\hat{y}_i = \sigma(\vec{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}}$$
5.3 Show explicitly how this can be done, by writing $\frac{\partial \hat{y}_i}{\partial z_i}$ in terms of $\hat{y}_i$.
5.4 Similarly, show explicitly how this can be done, by writing $\frac{\partial \hat{y}_i}{\partial z_j}$ in terms of $\hat{y}_i$ and $\hat{y}_j$, for $i \neq j$.
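Editor's reference for the two cases (the standard softmax Jacobian):

```latex
\frac{\partial \hat{y}_i}{\partial z_i} = \hat{y}_i (1 - \hat{y}_i),
\qquad
\frac{\partial \hat{y}_i}{\partial z_j} = -\,\hat{y}_i \hat{y}_j
\quad (i \neq j)
```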
5.5 Using the Softmax activation, what loss function $\mathcal{L}(y, \hat{y})$ would you want to minimize to train a network on such a multi-class classification task? Name this loss function (1 pt) and write down its formula (2 pt), for a single sample x, in terms of the network's prediction $\hat{y}$ and its true label y. Here, you can assume the label $y \in \{0, 1\}^C$ is a one-hot encoded vector:

$$y_i = \begin{cases} 1, & \text{if } i == \text{true class index} \\ 0, & \text{otherwise} \end{cases}$$
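Editor's note (a hedged sketch of the intended answer): the usual choice is the cross-entropy loss, which for a one-hot label reduces to the negative log-probability of the true class.

```latex
\mathcal{L}(y, \hat{y}) = -\sum_{i=1}^{C} y_i \log \hat{y}_i
```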
5.6 Having done a forward pass with our sample x, we will back-propagate through the network. We want to perform a gradient update for the weight $w^{2}_{j,k}$ (the weight which is in row j, column k of the second weights' matrix $W^{2}$). First, use the chain rule to write down the derivative $\frac{\partial \mathcal{L}}{\partial w_{j,k}}$ as a product of 3 partial derivatives (no need to compute them). For convenience, you can ignore the bias and omit the 2 superscript.
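Editor's sketch of the expected decomposition, writing $z_k$ for the k-th logit and $x_j$ for the j-th input to the second layer (both names are assumptions about the figure's notation):

```latex
\frac{\partial \mathcal{L}}{\partial w_{j,k}}
= \frac{\partial \mathcal{L}}{\partial \hat{y}}
\cdot \frac{\partial \hat{y}}{\partial z_k}
\cdot \frac{\partial z_k}{\partial w_{j,k}}
```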
5.7 Now, compute the gradient for the weight $w^{2}_{3,1}$. For this, you will need to compute each of the partial derivatives you have written above, and perform the multiplication to get the final answer. You can assume the ground-truth label for the sample was true_class = 3. Hint: the derivative of the logarithm is $(\log t)' = \frac{1}{t}$.
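Editor's sketch, using the standard softmax-plus-cross-entropy simplification $\frac{\partial \mathcal{L}}{\partial z_k} = \hat{y}_k - y_k$ and assuming $z_1 = \sum_j x_j w_{j,1}$ as in 5.6 (the input notation x is an assumption about the figure):

```latex
\frac{\partial \mathcal{L}}{\partial w_{3,1}}
= (\hat{y}_1 - y_1)\, x_3
= \hat{y}_1\, x_3
\qquad \text{(since true\_class} = 3 \Rightarrow y_1 = 0)
```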
Additional space for solutions. Clearly mark the (sub)problem your answers are related to and strike out invalid solutions.