
Mathematical Foundations of Deep Neural Networks, M1407.001200
E. Ryu
Spring 2024
Due 5pm, Monday, May 06, 2024

Problem 1: Transpose of downsampling. Consider the downsampling operator T : Rm×n → R(m/2)×(n/2), defined as the average pool with a 2 × 2 kernel and stride 2. For the sake of simplicity, assume m and n are even. Describe the action of T ⊤. More specifically, describe how to compute T ⊤(Y ) for any Y ∈ R(m/2)×(n/2).

Clarification. The downsampling operator T is a linear operator (why?). Therefore, T has a matrix representation A ∈ R(mn/4)×(mn) such that

T (X) = (A(X.reshape(mn))).reshape(m/2, n/2)

for all X ∈ Rm×n . The adjoint T ⊤ has two equivalent definitions. One definition is

T ⊤ (Y ) = (A⊤ (Y.reshape(mn/4))).reshape(m, n)

for all Y ∈ R(m/2)×(n/2) . Another is


\[
\sum_{i=1}^{m/2} \sum_{j=1}^{n/2} Y_{ij} \, (T(X))_{ij} \;=\; \sum_{i=1}^{m} \sum_{j=1}^{n} (T^\top(Y))_{ij} \, X_{ij}
\]

for all X ∈ Rm×n and Y ∈ R(m/2)×(n/2) .


Hint. To spoil the suspense, T ⊤ is a constant times the nearest neighbor upsampling operator. Explain why in your answer.
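For orientation, here is a minimal numerical sketch (PyTorch, not the requested derivation): it checks the adjoint identity above with T ⊤ taken to be 1/4 times nearest neighbor upsampling. The factor 1/4 is an assumption that the written answer should justify.

import torch
import torch.nn.functional as F

# Check <Y, T(X)> = <T'(Y), X>, where T is 2x2 average pooling and
# T' is assumed to be (1/4) * nearest neighbor upsampling.
m, n = 6, 8
X = torch.randn(m, n)
Y = torch.randn(m // 2, n // 2)

TX = F.avg_pool2d(X[None, None], kernel_size=2, stride=2)[0, 0]
TtY = 0.25 * F.interpolate(Y[None, None], scale_factor=2, mode='nearest')[0, 0]

print((Y * TX).sum().item(), (X * TtY).sum().item())  # the two values agree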

Problem 2: Nearest neighbor upsampling. How is the nearest neighbor upsampling operator
an instance of transpose convolution? Specifically, describe how
layer = nn.Upsample(scale_factor=r, mode='nearest')

where r is a positive integer, can be equivalently represented by


layer = nn.ConvTranspose2d(...)
layer.weight.data = ...

with ... appropriately filled in.
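As a sanity check (a sketch under assumed settings, not necessarily the intended answer): with C input channels, one candidate is groups=C with an all-ones r × r kernel, which matches nn.Upsample numerically.

import torch
import torch.nn as nn

C, r = 3, 2  # assumed channel count and scale factor
up = nn.Upsample(scale_factor=r, mode='nearest')
tconv = nn.ConvTranspose2d(C, C, kernel_size=r, stride=r, groups=C, bias=False)
tconv.weight.data = torch.ones(C, 1, r, r)  # assumed weight choice

x = torch.randn(1, C, 4, 5)
print(torch.allclose(up(x), tconv(x)))  # True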

Problem 3: f-divergence. Let X and Y be two continuous random variables with densities pX
and pY . The f -divergence of X from Y is defined as
\[
D_f(X \,\|\, Y) = \int f\!\left( \frac{p_X(x)}{p_Y(x)} \right) p_Y(x) \, dx,
\]

where f is a convex function such that f (1) = 0.

(a) Show that Df (X∥Y ) ≥ 0.

(b) Show that f (t) = − log t and f (t) = t log t each correspond to a KL divergence.
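As a numerical illustration (an assumed 1-D Gaussian example, not part of the required proof), the divergence with f (t) = t log t can be checked against the closed-form KL divergence between two Gaussians:

import numpy as np
from scipy.stats import norm

mu1, s1, mu2, s2 = 0.0, 1.0, 1.0, 2.0  # assumed example parameters
x = np.linspace(-20, 20, 200001)
pX, pY = norm.pdf(x, mu1, s1), norm.pdf(x, mu2, s2)
t = pX / pY
Df = np.sum(t * np.log(t) * pY) * (x[1] - x[0])  # D_f with f(t) = t log t
kl = np.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2 * s2**2) - 0.5
print(Df, kl)  # the two values agree to several decimals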

Problem 4: Generalized inverse transform sampling. Let F : R → [0, 1] be the CDF of a random variable and let U ∼ Uniform([0, 1]). If F is continuous and strictly increasing and therefore invertible, then F −1(U ) is a random variable with CDF F , because

P(F −1 (U ) ≤ t) = P(U ≤ F (t)) = F (t).

When F is not necessarily invertible, the generalized inverse of F is G : (0, 1) → R with

G(u) = inf{x ∈ R | u ≤ F (x)}.

Show that G(U ) is a random variable with CDF F .

Hint. Use the fact that F is right-continuous, i.e., lim h→0+ F (x + h) = F (x) for all x ∈ R, and that lim x→−∞ F (x) = 0.
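To see the generalized inverse in action (an assumed discrete example, where F has jumps and is not invertible), G can be evaluated with a sorted search:

import numpy as np

rng = np.random.default_rng(0)
xs = np.array([0, 1, 2])  # assumed support points
cdf = np.cumsum([0.2, 0.5, 0.3])  # F evaluated at the support points
u = rng.uniform(size=100_000)
samples = xs[np.searchsorted(cdf, u)]  # G(u): smallest x with u <= F(x)
print(np.bincount(samples) / len(samples))  # approximately (0.2, 0.5, 0.3)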

Problem 5: Change of variables formula for Gaussians. If φ : Rn → Rn is a one-to-one differentiable function, Y = φ(X), and Y is a continuous random variable with density function pY , then X is a continuous random variable with density function
\[
p_X(x) = p_Y(\varphi(x)) \left| \det \frac{\partial \varphi}{\partial x}(x) \right|.
\]

Let Y ∈ Rn be a continuous random vector with density
\[
p_Y(y) = \frac{1}{(2\pi)^{n/2}} \, e^{-\frac{1}{2} \|y\|^2},
\]

i.e., Y ∼ N (0, I). Let X = AY + b with an invertible matrix A ∈ Rn×n and a vector b ∈ Rn .
Define Σ = AA⊺ . Show that X is a continuous random vector with density
\[
p_X(x) = \frac{1}{\sqrt{(2\pi)^n \det \Sigma}} \, e^{-\frac{1}{2} (x-b)^\intercal \Sigma^{-1} (x-b)}.
\]
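For intuition (an assumed 2-D example, not the requested derivation), the claim can be checked empirically: samples of X = AY + b have mean b and covariance Σ = AA⊺.

import numpy as np

rng = np.random.default_rng(1)
A = np.array([[2.0, 0.5], [0.0, 1.0]])  # assumed invertible A
b = np.array([1.0, -1.0])
Sigma = A @ A.T

Y = rng.standard_normal((200_000, 2))  # Y ~ N(0, I)
X = Y @ A.T + b
print(X.mean(axis=0))  # approximately b
print(np.cov(X, rowvar=False))  # approximately Sigma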

Problem 6: Inverse permutation. Let Sn denote the group of length-n permutations and let σ ∈ Sn . Note that the map i ↦ σ(i) is a bijection. Define σ −1 ∈ Sn as the permutation representing the inverse of this map, i.e., σ −1 (σ(i)) = i for i = 1, . . . , n. Describe an algorithm for computing σ −1 given σ.

Clarification. In this class, we defined σ as a list of length n containing the elements of {1, . . . , n}
exactly once. The output of the algorithm, σ −1 , should also be provided as a list.
Clarification. For this problem, it is sufficient to describe the algorithm in equations or pseudocode. There is no need to submit a Python script for this problem.
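Although no script is required, one possible sketch (one-indexed lists, as in the clarification above) looks like this:

def inverse_permutation(sigma):
    # sigma is a list containing each of 1..n exactly once; sigma[i-1] is sigma(i)
    inv = [0] * len(sigma)
    for i, s in enumerate(sigma, start=1):
        inv[s - 1] = i  # enforces sigma^{-1}(sigma(i)) = i
    return inv

print(inverse_permutation([2, 3, 1]))  # [3, 1, 2]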

Problem 7: Permutation matrix. Given a permutation σ ∈ Sn , the permutation matrix of σ is


defined as
\[
P_\sigma = \begin{bmatrix} e_{\sigma(1)}^\intercal \\ e_{\sigma(2)}^\intercal \\ \vdots \\ e_{\sigma(n)}^\intercal \end{bmatrix} \in \mathbb{R}^{n \times n},
\]
where e1 , . . . , en ∈ Rn are the standard unit vectors. Show

(a) (Pσ x)i = xσ(i) for all x ∈ Rn and i = 1, . . . , n,

(b) \(P_\sigma^\intercal = P_\sigma^{-1} = P_{\sigma^{-1}}\), and

(c) | det Pσ | = 1.

Hint. If the rows of U ∈ Rn×n are orthonormal, we say U is an orthogonal matrix. Orthogonal
matrices satisfy U U ⊺ = U ⊺ U = I.
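A small numerical sketch (an assumed 3-element example, not a proof) of properties (a)–(c):

import numpy as np

sigma = [2, 3, 1]  # assumed example permutation, one-indexed
n = len(sigma)
P = np.eye(n)[[s - 1 for s in sigma]]  # row i of P is e_{sigma(i)}^T

x = np.array([10.0, 20.0, 30.0])
print(P @ x)  # (a): entry i is x_{sigma(i)}, here [20., 30., 10.]
print(np.allclose(P.T @ P, np.eye(n)))  # (b): P^T P = I, so P^T = P^{-1}
print(np.isclose(abs(np.linalg.det(P)), 1.0))  # (c): |det P| = 1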
