Exercise Sheet 8
Exercise 1
Show that the single-layer perceptron model

p(x) = a⊤ σ(bx + c)    (1)

with σ(x) = max{0, x} and a, b, c ∈ Rn for some n ∈ N can represent any piecewise linear
function. You may proceed as follows:
(a) Let 0 = x0 ≤ . . . ≤ xN = 1 be a set of grid points, and let C(I) and P1(I) be the
continuous and linear functions on the interval I, respectively. Show that any function
f ∈ C([0, 1]) with f|[xi−1, xi] ∈ P1([xi−1, xi]) for all i can be written as a linear
combination of the hat functions

pi(x) = { 0,                        x ≤ xi−1,
          (x − xi−1)/(xi − xi−1),   x ∈ [xi−1, xi],
          (x − xi+1)/(xi − xi+1),   x ∈ [xi, xi+1],
          0,                        x ≥ xi+1.
(Figure: sketch of the hat function pi, equal to 1 at xi and 0 outside [xi−1, xi+1].)
(b) Construct the hat function pi using functions of the form eq. (1).
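For part (a), a quick way to see what the hat functions look like is to plot one; the grid values below are only an illustrative assumption, not part of the exercise.

import numpy as np
import matplotlib.pyplot as plt

# Illustrative grid 0 = x0 < x1 < ... < xN = 1 (assumed values)
grid = np.array([0.0, 0.2, 0.5, 0.7, 1.0])

def hat(x, i, grid):
    """Hat function p_i from the piecewise definition above."""
    xl, xm, xr = grid[i - 1], grid[i], grid[i + 1]
    return np.where(x <= xl, 0.0,
           np.where(x <= xm, (x - xl) / (xm - xl),
           np.where(x <= xr, (x - xr) / (xm - xr), 0.0)))

x = np.linspace(0.0, 1.0, 401)
plt.plot(x, hat(x, 2, grid), label="p_2")
plt.xticks(grid)
plt.legend()
plt.show()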
Exercise 2
Show that a single-layer perceptron is an over-parametrized model. That is, consider models
of the form
p(x) = a⊤ σ(bx + c)    (2)
with x ∈ R, σ(y) = max{0, y}, and a, b, c ∈ Rn ; we concatenate the parameter vectors as
follows:
(a⊤, b⊤, c⊤)⊤ ∈ R3n.
In the lecture, we have already noted that a permutation of the parameter vectors a, b, c
leads to the same model. In this exercise, construct a subspace (dimension > 0) of the space
of concatenated parameter vectors, R3n , that yields the model
p(x) = σ(x)
using the representation eq. (2); to simplify the discussion, you may choose any specific n.
Exercise 3
Consider the following theorem from the book Linear Algebra and Learning from Data by
G. Strang¹:
Theorem 1 For v ∈ Rm , suppose the graph of F (v) has folds along N hyperplanes H1 , . . . , HN .
Those come from N linear equations ai⊤ v + bi = 0, in other words from ReLU at N neurons.
Then the number of linear pieces of F and regions bounded by the N hyperplanes is r(N, m):
r(N, m) = Σ_{i=0}^{m} (N choose i) = (N choose 0) + . . . + (N choose m).
(b) Using the theorem, compute the numbers for one-, two-, and three-dimensional input
vectors.
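A few lines of Python suffice to tabulate r(N, m) for part (b); the range of N values below is an arbitrary illustrative choice.

from math import comb

def r(N, m):
    """r(N, m) = sum_{i=0}^{m} C(N, i), the number of linear pieces / regions."""
    return sum(comb(N, i) for i in range(m + 1))

for m in (1, 2, 3):                                       # one-, two-, three-dimensional inputs
    print(f"m = {m}:", [r(N, m) for N in range(1, 9)])    # N = 1, ..., 8 folds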
Exercise 4
(a) Approximate the integral ∫₀¹ f(x) dx by the Monte Carlo estimate (1/N) Σ_{i=1}^{N} f(xi)
for N → ∞, where xi is sampled from a uniform distribution U(0, 1). Choose a fourth-order
polynomial and compare the numerical approximation with the exact value of the integral.
¹ Strang, Gilbert. Linear Algebra and Learning From Data. Wellesley-Cambridge Press, 2019.
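A minimal sketch of part (a); the concrete polynomial f(x) = x⁴ is an assumed example, and any other fourth-order polynomial works the same way.

import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x**4            # assumed example of a fourth-order polynomial
exact = 1.0 / 5.0             # exact value of the integral of x^4 over [0, 1]

for N in (10, 100, 1_000, 10_000, 100_000):
    x = rng.uniform(0.0, 1.0, size=N)     # x_i ~ U(0, 1)
    approx = f(x).mean()                  # (1/N) * sum_i f(x_i)
    print(f"N = {N:7d}: approx = {approx:.6f}, error = {abs(approx - exact):.2e}")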
(b) Use the same approach to compute an approximation of π by approximating
∫_{[0,1]²} χC(x) dx,
where
χC(x) = { 1, for ∥x∥2 ≤ 1,
          0, otherwise.
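The same sampling idea gives part (b): the mean of χC over uniform points in [0, 1]² approximates the area of the quarter disc, so four times this mean approximates π. A short sketch:

import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
x = rng.uniform(0.0, 1.0, size=(N, 2))        # uniform samples in [0, 1]^2
chi = (np.linalg.norm(x, axis=1) <= 1.0)      # indicator chi_C of the quarter disc
print("pi approx:", 4.0 * chi.mean())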
Exercise 5
Prove a sub-optimal approximation result, which uses a stronger assumption on f than just
continuity:
(a) Show that, for the linear interpolant p of f on Ii = [xi, xi+1],

max_{x ∈ Ii} |f(x) − p(x)| ≤ (hi²/8) max_{ξ ∈ Ii} |f″(ξ)|,

with hi = xi+1 − xi.
(b) Let f ∈ C²([0, 1]) be arbitrary and let σ(x) = max{0, x}. Prove that, for every ε > 0,
there exists some n ∈ N and a function

P(x) = a⊤ σ(bx + c), a, b, c ∈ Rn,

such that max_{x ∈ [0,1]} |f(x) − P(x)| ≤ ε.
Hint: Consider the function F(t) = f(t) − p(t) − K (t − xi)(t − xi+1)
with K ∈ R. Impose F(x̄) = 0 for some x̄ ∈ [xi, xi+1]. Then, use Rolle's theorem.
Exercise 6
Implement a multi-layer perceptron model with two hidden layers, sigmoid activation function,
and the following architecture using Python:
• input dimension: 2
• output dimension: 1
You may use any deep learning library to implement the model or implement it yourself.
In order to initialize the weights, compare the following two strategies discussed in the
lecture:
• Uniform distribution: U(−1/√ni, 1/√ni),
• Glorot and Bengio: U(−√(6/(ni + ni+1)), √(6/(ni + ni+1))).
Initialize the bias vectors with zero. Without training the network, perform the following
tasks:
(a) Visualize the output of the neural networks after initialization for five different random
seeds.
(b) Visualize the value of the loss function as a function of the diagonal entries of the weight
matrix in the first hidden layer. To this end, use the loss functions
Mean squared error (MSE):  LMSE(σW,b(xi), yi) = (1/N) Σ_{i=1}^{N} ‖σW,b(xi) − yi‖₂²,
Mean absolute error (MAE): LMAE(σW,b(xi), yi) = (1/N) Σ_{i=1}^{N} ‖σW,b(xi) − yi‖₁.
In order to compute the input and output data, take 100 points randomly sampled from
{x ∈ R2 | x1² + x2² ≤ 1} as input and the squared norm as corresponding output:
y = x1² + x2².
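A minimal NumPy sketch of the setup for this exercise; the hidden-layer width of 10 is an assumption (the sheet does not fix it), and only initialization, a forward pass, and the two losses are shown, with no training.

import numpy as np

rng = np.random.default_rng(0)
layers = [2, 10, 10, 1]                       # input, two hidden layers (assumed width 10), output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def init(layers, strategy):
    """Zero biases; weights ~ U(-1/sqrt(n_i), 1/sqrt(n_i)) ('uniform')
    or ~ U(-sqrt(6/(n_i+n_{i+1})), sqrt(6/(n_i+n_{i+1}))) ('glorot')."""
    W, b = [], []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        bound = 1 / np.sqrt(n_in) if strategy == "uniform" else np.sqrt(6 / (n_in + n_out))
        W.append(rng.uniform(-bound, bound, size=(n_out, n_in)))
        b.append(np.zeros(n_out))
    return W, b

def forward(X, W, b):
    """Forward pass: sigmoid on the hidden layers, linear output layer."""
    A = X
    for Wi, bi in zip(W[:-1], b[:-1]):
        A = sigmoid(A @ Wi.T + bi)
    return (A @ W[-1].T + b[-1]).ravel()

# 100 random input points in the unit disc and the squared norm as output
r = np.sqrt(rng.uniform(0.0, 1.0, 100))
phi = rng.uniform(0.0, 2.0 * np.pi, 100)
X = np.stack([r * np.cos(phi), r * np.sin(phi)], axis=1)
y = (X ** 2).sum(axis=1)

for strategy in ("uniform", "glorot"):
    W, b = init(layers, strategy)
    pred = forward(X, W, b)
    print(strategy, "MSE:", np.mean((pred - y) ** 2), "MAE:", np.mean(np.abs(pred - y)))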
Exercise 7
For the two functions
(a)
f(x, y, θ, r) = x + iy + r(cos(θ) + i sin(θ)),
(b) a multi-layer perceptron model with
• input dimension: 4
• dimension of each hidden layer: 2
• output dimension: 1
visualize the computational graph. Then, perform forward and backward propagation for
the input
(1, 1, π/2, 2)⊤.
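For function (a), a symbolic check of the forward value and the partial derivatives at the given input can be done with SymPy; this is only a sketch for verifying a hand computation, not the required computational graph.

import sympy as sp

x, y, theta, r = sp.symbols("x y theta r", real=True)
f = x + sp.I * y + r * (sp.cos(theta) + sp.I * sp.sin(theta))

point = {x: 1, y: 1, theta: sp.pi / 2, r: 2}
print("f =", sp.simplify(f.subs(point)))                              # forward pass
for var in (x, y, theta, r):
    print(f"df/d{var} =", sp.simplify(sp.diff(f, var).subs(point)))   # partial derivatives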
Exercise 8
Exercise 9
(a) Find a solution of the boundary value problem: find u such that
∂/∂x ( 1/(x + 1) · ∂u/∂x ) = 1   in [0, 1],
u(0) = 0,
u(1) = 7/3,
(b) Formulate the ansatz and the loss function that allow for solving the BVP using the
method of Lagaris et al.
(c) Replace the ansatz by a data loss and formalize the corresponding PINN loss function.
Exercise 10
Test the example in examples/pinn_forward/Poisson_Dirichlet_1d.py from the DeepXDE
library² for solving a one-dimensional Poisson problem using PINNs.
² https://github.com/lululxvi/deepxde
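For orientation, a DeepXDE script for a 1D Poisson problem typically has roughly the structure below. This is only a sketch, not the repository file: the concrete problem −u″ = 2 on [−1, 1] with u(±1) = 0 and the exact solution u(x) = 1 − x² are assumed examples, and the API names (dde.icbc, dde.nn, model.train(iterations=...)) follow recent DeepXDE versions.

import deepxde as dde

def pde(x, u):
    # residual of the assumed example problem -u'' = 2
    du_xx = dde.grad.hessian(u, x)
    return -du_xx - 2.0

geom = dde.geometry.Interval(-1.0, 1.0)
bc = dde.icbc.DirichletBC(geom, lambda x: 0.0, lambda x, on_boundary: on_boundary)

data = dde.data.PDE(geom, pde, bc, num_domain=32, num_boundary=2,
                    solution=lambda x: 1.0 - x ** 2, num_test=100)
net = dde.nn.FNN([1] + [50] * 3 + [1], "tanh", "Glorot uniform")

model = dde.Model(data, net)
model.compile("adam", lr=1e-3, metrics=["l2 relative error"])
losshistory, train_state = model.train(iterations=10000)
dde.saveplot(losshistory, train_state, issave=False, isplot=True)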