Math 4 Week 4 UvA Solutions
3 Differential calculus
Exercise 3.1. For each of the following functions determine the derivative $Df$:
(a) $f : \mathbb{R} \to \mathbb{R}^2$, $f(t) = (e^t \cos t, e^t \sin t)$
(b) $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = e^x(\cos y + \sin y)$
(c) $f : \mathbb{R}^2 \to \mathbb{R}^2$, $f(x, y) = (e^x \cos y, e^x \sin y)$
(d) $f : \mathbb{R}^2 \to \mathbb{R}^3$, $f(x, y) = (y^2, x \cos y, y \sin x)$
Solution:
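(a)
\[
f'(t) = \begin{pmatrix} \frac{d}{dt}(e^t \cos t) \\ \frac{d}{dt}(e^t \sin t) \end{pmatrix}
      = \begin{pmatrix} e^t(\cos t - \sin t) \\ e^t(\sin t + \cos t) \end{pmatrix}
\]
(b)
\[
Df(x, y) = \begin{pmatrix} D_x f(x, y) & D_y f(x, y) \end{pmatrix}
         = \begin{pmatrix} e^x(\cos y + \sin y) & e^x(-\sin y + \cos y) \end{pmatrix}
\]
(c)
\[
Df(x, y) = \begin{pmatrix} Df_1(x, y) \\ Df_2(x, y) \end{pmatrix}
         = \begin{pmatrix} e^x \cos y & -e^x \sin y \\ e^x \sin y & e^x \cos y \end{pmatrix}
\]
(d)
\[
Df(x, y) = \begin{pmatrix} Df_1(x, y) \\ Df_2(x, y) \\ Df_3(x, y) \end{pmatrix}
         = \begin{pmatrix} 0 & 2y \\ \cos y & -x \sin y \\ y \cos x & \sin x \end{pmatrix}
\]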
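A Jacobian like the one in (d) can be sanity-checked numerically. The sketch below compares the analytic matrix with a central-difference approximation; the helper name `jacobian_numeric`, the step size and the test point are illustrative choices, not part of the exercise.

```python
import math

def f(x, y):
    # f : R^2 -> R^3 from part (d)
    return [y**2, x * math.cos(y), y * math.sin(x)]

def jacobian_analytic(x, y):
    # Rows are the derivatives of the three components of f
    return [
        [0.0, 2 * y],
        [math.cos(y), -x * math.sin(y)],
        [y * math.cos(x), math.sin(x)],
    ]

def jacobian_numeric(func, point, h=1e-6):
    # Central differences: column j approximates the partial derivative w.r.t. variable j
    n_out = len(func(*point))
    jac = [[0.0] * len(point) for _ in range(n_out)]
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(*plus), func(*minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac

point = (0.7, -1.3)  # arbitrary test point (assumption)
exact = jacobian_analytic(*point)
approx = jacobian_numeric(f, point)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(2))
print(max_error)
```

The same check applies to (a) to (c) after swapping in the corresponding `f` and analytic matrix.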
Exercise 3.2. Let $f : \mathbb{R}^m \to \mathbb{R}^m$ be differentiable at $0$ and satisfy $f(0) = 0$. Define for all $n \in \mathbb{N}$ the function $f^n : \mathbb{R}^m \to \mathbb{R}^m$ recursively as $f^0(x) = x$ and $f^{n+1}(x) = f(f^n(x))$.
(a) If $g : \mathbb{R} \to \mathbb{R}$ is given as $g(x) = x^2$, compute $g^n(x)$ for all $n \in \mathbb{N}$.
(b) Compute $Df^0(0)$ and $Df^1(0)$. Use the chain rule to compute $Df^2(0)$ in terms of $Df(0)$.
(c) Guess an expression for $Df^n(0)$ and prove it using mathematical induction.
(d) If $g : \mathbb{R}^2 \to \mathbb{R}^2$ satisfies $g(0) = 0$ and
\[
Dg(0) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix},
\]
compute $Dg^n(0)$.
Solution:
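(a) We have $g^n(x) = x^{2^n}$ for all $n \in \mathbb{N}$.
(c) We conjecture that $f^n(0) = 0$ and $Df^n(0) = (Df(0))^n$. This is true for $n = 0$ and $n = 1$.
Assume that it has been proved for $n$. First we have $f^{n+1}(0) = f(f^n(0)) = f(0) = 0$. Since $f^n$ is differentiable at $0$ and $f$ is differentiable at $f^n(0) = 0$, the chain rule applies to $f^{n+1} = f \circ f^n$ at $0$, and in particular
\[
Df^{n+1}(0) = Df(f^n(0))\,Df^n(0) = Df(0)\,Df^n(0) = Df(0)\,(Df(0))^n = (Df(0))^{n+1}.
\]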
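For part (d), the formula from (c) can be applied directly. The following is a short worked sketch, using only the matrix $Dg(0)$ given in the exercise and the fact that powers of a diagonal matrix are taken entrywise:

```latex
\[
Dg^n(0) = (Dg(0))^n
        = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}^{n}
        = \begin{pmatrix} 2^n & 0 \\ 0 & 3^n \end{pmatrix}.
\]
```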
Exercise 3.3.
Let $f : \mathbb{R}^3 \to \mathbb{R}^2$ be the affine function defined by $f(x) = Ax + b$, where
\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}
\]
and $b = (2, 4)$. Find $Df(x)$.
Solution:
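We have
\[
f(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \end{pmatrix}
     = \begin{pmatrix} x_1 + 2x_2 + 3x_3 + 2 \\ 4x_1 + 5x_2 + 6x_3 + 4 \end{pmatrix}.
\]
Then
\[
Df(x) = \begin{pmatrix}
\frac{\partial}{\partial x_1} f_1(x) & \frac{\partial}{\partial x_2} f_1(x) & \frac{\partial}{\partial x_3} f_1(x) \\
\frac{\partial}{\partial x_1} f_2(x) & \frac{\partial}{\partial x_2} f_2(x) & \frac{\partial}{\partial x_3} f_2(x)
\end{pmatrix}
= \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} = A.
\]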
This should be no surprise. You are asked to prove it more formally in the next exercise.
Exercise 3.4.
Let $f : \mathbb{R}^m \to \mathbb{R}^k$ be the affine function defined by $f(x) = Ax + b$, where $A$ is a $k \times m$ matrix and $b \in \mathbb{R}^k$. Find $Df(x)$.
Solution:
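We have
\[
\frac{\partial}{\partial x_t} f_i(x)
= \frac{\partial}{\partial x_t} \left[ \sum_{j=1}^{m} A_{ij} x_j + b_i \right]
= \sum_{j=1}^{m} \frac{\partial}{\partial x_t} \left[ A_{ij} x_j \right] + \frac{\partial b_i}{\partial x_t}
= A_{it}.
\]
Hence
\[
Df(x) = \begin{pmatrix}
\frac{\partial}{\partial x_1} f_1(x) & \cdots & \frac{\partial}{\partial x_m} f_1(x) \\
\vdots & \ddots & \vdots \\
\frac{\partial}{\partial x_1} f_k(x) & \cdots & \frac{\partial}{\partial x_m} f_k(x)
\end{pmatrix} = A.
\]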
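The result $Df(x) = A$ can also be observed numerically. The sketch below uses the matrix and vector from Exercise 3.3 and a central-difference Jacobian; the helper names and the test point are illustrative assumptions.

```python
def affine(x, A, b):
    # f(x) = A x + b, written out with plain lists
    return [sum(A[i][j] * x[j] for j in range(len(x))) + b[i] for i in range(len(b))]

def jacobian_numeric(func, x, h=1e-6):
    # Central differences; entry [i][j] approximates d f_i / d x_j
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        plus, minus = list(x), list(x)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return jac

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [2.0, 4.0]
x0 = [0.3, -1.2, 2.5]  # arbitrary point; the Jacobian should not depend on it

J = jacobian_numeric(lambda x: affine(x, A, b), x0)
max_error = max(abs(J[i][j] - A[i][j]) for i in range(2) for j in range(3))
print(max_error)
```

Changing `x0` or `b` should leave `J` unchanged up to rounding, which reflects that the derivative of an affine map is constant.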
Define
\[
h(x) = f(x) \cdot g(x).
\]
Calculate $Dh(1, 1, 1)$. Verify that $Dh(1, 1, 1) = f(1, 1, 1)^\top Dg(1, 1, 1) + g(1, 1, 1)^\top Df(1, 1, 1)$.
Solution:
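• $f(x) = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x_1 + 2x_2 + 3x_3 + 1 \\ 3x_1 + 2x_2 + x_3 \end{pmatrix}$
• $\nabla h(x) = \begin{pmatrix} x_2(x_1 + 2x_2 + 3x_3 + 1) + x_1 x_2 + 3x_2 x_3 \\ x_1(x_1 + 2x_2 + 3x_3 + 1) + 2x_1 x_2 + x_3(3x_1 + 2x_2 + x_3) + 2x_2 x_3 \\ 3x_1 x_2 + x_2(3x_1 + 2x_2 + x_3) + x_2 x_3 \end{pmatrix}$
• $\nabla h(1, 1, 1) = \begin{pmatrix} 7 + 1 + 3 \\ 7 + 2 + 6 + 2 \\ 3 + 6 + 1 \end{pmatrix} = \begin{pmatrix} 11 \\ 17 \\ 10 \end{pmatrix}$
• $f(1, 1, 1) = (7, 6)$, $g(1, 1, 1) = (1, 1)$
• $Df(1, 1, 1) = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$
• $Dg(x_1, x_2, x_3) = \begin{pmatrix} x_2 & x_1 & 0 \\ 0 & x_3 & x_2 \end{pmatrix}$ so that $Dg(1, 1, 1) = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}$
• So check that
\[
Dh(1, 1, 1) = f(1, 1, 1)^\top Dg(1, 1, 1) + g(1, 1, 1)^\top Df(1, 1, 1)
= \begin{pmatrix} 7 & 6 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
+ \begin{pmatrix} 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}
= \begin{pmatrix} 7 & 13 & 6 \end{pmatrix} + \begin{pmatrix} 4 & 4 & 4 \end{pmatrix}
= \begin{pmatrix} 11 & 17 & 10 \end{pmatrix}.
\]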
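The product-rule computation can be reproduced in a few lines of code. The sketch below assumes $g(x) = (x_1 x_2, x_2 x_3)$, which is inferred from $g(1,1,1) = (1,1)$ and $Dg(1,1,1)$ above rather than quoted from the exercise statement; all function names are illustrative.

```python
A = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
b = [1.0, 0.0]

def f(x):
    # f(x) = A x + b
    return [sum(A[i][j] * x[j] for j in range(3)) + b[i] for i in range(2)]

def g(x):
    # Assumed form of g (inferred from the solution, see lead-in)
    return [x[0] * x[1], x[1] * x[2]]

def h(x):
    # h(x) = f(x) . g(x)
    return sum(fi * gi for fi, gi in zip(f(x), g(x)))

def Dg(x):
    # Jacobian of the assumed g
    return [[x[1], x[0], 0.0], [0.0, x[2], x[1]]]

def Dh_product_rule(x):
    # Dh(x) = f(x)^T Dg(x) + g(x)^T Df(x), with Df(x) = A
    fx, gx, dg = f(x), g(x), Dg(x)
    return [sum(fx[t] * dg[t][j] + gx[t] * A[t][j] for t in range(2)) for j in range(3)]

def Dh_numeric(x, step=1e-6):
    # Central-difference approximation of the row vector Dh(x)
    grad = []
    for j in range(3):
        plus, minus = list(x), list(x)
        plus[j] += step
        minus[j] -= step
        grad.append((h(plus) - h(minus)) / (2 * step))
    return grad

x0 = [1.0, 1.0, 1.0]
by_rule = Dh_product_rule(x0)
by_difference = Dh_numeric(x0)
print(by_rule)
print(by_difference)
```

Under the stated assumption on $g$, `by_rule` evaluates to the row vector with entries 11, 17 and 10, and `by_difference` agrees with it up to rounding.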
Hint: first figure out how this works if k = 1 and see the previous exercise.
Solution:
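We have $h(x) = \sum_{t=1}^{k} f_t(x) g_t(x)$ so that
\begin{align*}
\frac{\partial}{\partial x_j} h(x)
&= \frac{\partial}{\partial x_j} \left[ \sum_{t=1}^{k} f_t(x) g_t(x) \right]
 = \sum_{t=1}^{k} \frac{\partial}{\partial x_j} \left[ f_t(x) g_t(x) \right] \\
&= \sum_{t=1}^{k} \left[ \frac{\partial f_t}{\partial x_j}(x)\, g_t(x) + \frac{\partial g_t}{\partial x_j}(x)\, f_t(x) \right] \\
&= \sum_{t=1}^{k} \frac{\partial f_t}{\partial x_j}(x)\, g_t(x) + \sum_{t=1}^{k} \frac{\partial g_t}{\partial x_j}(x)\, f_t(x) \\
&= \frac{\partial f}{\partial x_j}(x) \cdot g(x) + \frac{\partial g}{\partial x_j}(x) \cdot f(x).
\end{align*}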
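Note that in the third equality we used the product rule for single-variable functions.
Moreover, realize that $g(x)$ and $f(x)$ are column vectors and $Dh(x)$ is a row vector, so collecting the partial derivatives over all $j$ gives $Dh(x) = g(x)^\top Df(x) + f(x)^\top Dg(x)$. In the case $k = 1$ of the hint, $Df(x)$ and $Dg(x)$ are themselves row vectors.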
Exercise 3.7. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by $f(x) = x^\top A x$ where
\[
A = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}.
\]
Determine $Df(x)$, directly and using the former exercise.
Solution:
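Write out in a scalar fashion
\[
f(x) = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} x_1 + 2x_2 \\ 4x_2 \end{pmatrix}
     = x_1(x_1 + 2x_2) + 4x_2^2 = x_1^2 + 2x_1 x_2 + 4x_2^2.
\]
Hence, $Df(x) = \begin{pmatrix} 2x_1 + 2x_2 & 2x_1 + 8x_2 \end{pmatrix}$. Now use the former exercise with the product of the functions $h(x) = x$ and $g(x) = Ax$, for which $Dh(x) = I$ and $Dg(x) = A$. Then we have
\[
Df(x) = g(x)^\top Dh(x) + h(x)^\top Dg(x) = (Ax)^\top + x^\top A = x^\top (A^\top + A)
      = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 2 & 2 \\ 2 & 8 \end{pmatrix}
      = \begin{pmatrix} 2x_1 + 2x_2 & 2x_1 + 8x_2 \end{pmatrix},
\]
which agrees with the direct computation.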
Solution:
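Consider a quadratic form $f(x) = x \cdot Ax$ where $A$ is a real (not necessarily symmetric) $m \times m$ matrix. We will show that $Df(x) = x^\top (A^\top + A)$, in two different ways:
(a) To see this, first rewrite $f(x)$: take $k \in \{1, 2, \ldots, m\}$ and split the product into parts that depend on $x_k$ and parts that don't:
\begin{align*}
f(x) = x^\top A x &= \sum_{i=1}^{m} x_i (Ax)_i = \sum_{i=1}^{m} x_i \sum_{j=1}^{m} a_{ij} x_j = \sum_{i=1}^{m} \sum_{j=1}^{m} a_{ij} x_i x_j \\
&= \sum_{j=1}^{m} a_{kj} x_k x_j + \sum_{i \neq k} \sum_{j=1}^{m} a_{ij} x_i x_j \\
&\overset{*}{=} a_{kk} x_k^2 + \sum_{j \neq k} a_{kj} x_k x_j + \sum_{i \neq k} a_{ik} x_i x_k + \sum_{i, j \neq k} a_{ij} x_i x_j.
\end{align*}
Differentiating with respect to $x_k$ gives
\begin{align*}
\frac{\partial}{\partial x_k} f(x) &= 2 a_{kk} x_k + \sum_{j \neq k} a_{kj} x_j + \sum_{i \neq k} a_{ik} x_i + 0 \\
&= \sum_{j=1}^{m} a_{kj} x_j + \sum_{i=1}^{m} a_{ik} x_i.
\end{align*}
The first sum is $(Ax)_k$ and the second is $(A^\top x)_k$, so $Df(x) = x^\top (A^\top + A)$.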
In standard OLS, the matrix A is just the identity matrix. As we will see later, minimizing
this expression requires the partial derivatives of the function h.
Find an expression for Dh(β) in terms of A, X, y.
Solution:
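Use the former exercise here. We have $Df(x) = 2x^\top A$ (for symmetric $A$) and $Dg(\beta) = -X$. So using the Chain Rule we get
\[
Dh(\beta) = Df(g(\beta))\, Dg(\beta) = -2\, g(\beta)^\top A X.
\]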
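The chain-rule expression $Df(g(\beta))\,Dg(\beta)$ can be checked numerically. The sketch below assumes $g(\beta) = y - X\beta$ and $h(\beta) = g(\beta)^\top A\, g(\beta)$ with symmetric $A$; these forms are inferred from $Dg(\beta) = -X$ and $Df(x) = 2x^\top A$, not quoted from the exercise, and the data are made up for illustration.

```python
def mat_vec(M, v):
    # Matrix-vector product with plain lists
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def residual(beta, X, y):
    # g(beta) = y - X beta (assumed form of the inner function)
    Xb = mat_vec(X, beta)
    return [y[i] - Xb[i] for i in range(len(y))]

def h(beta, X, y, A):
    # h(beta) = g(beta)^T A g(beta)
    r = residual(beta, X, y)
    Ar = mat_vec(A, r)
    return sum(r[i] * Ar[i] for i in range(len(r)))

def Dh_analytic(beta, X, y, A):
    # Row vector -2 g(beta)^T A X; relies on A being symmetric
    r = residual(beta, X, y)
    Ar = mat_vec(A, r)  # equals (r^T A)^T because A is symmetric
    return [-2 * sum(Ar[i] * X[i][j] for i in range(len(r))) for j in range(len(beta))]

def Dh_numeric(beta, X, y, A, step=1e-6):
    # Central differences in each coordinate of beta
    grad = []
    for j in range(len(beta)):
        plus, minus = list(beta), list(beta)
        plus[j] += step
        minus[j] -= step
        grad.append((h(plus, X, y, A) - h(minus, X, y, A)) / (2 * step))
    return grad

# Hypothetical data: 3 observations, 2 coefficients, symmetric weight matrix A
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 2.5]]
y = [1.0, 2.0, 2.5]
A = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.3], [0.0, 0.3, 1.5]]
beta = [0.2, 0.7]

analytic = Dh_analytic(beta, X, y, A)
numeric = Dh_numeric(beta, X, y, A)
max_error = max(abs(u - v) for u, v in zip(analytic, numeric))
print(analytic, numeric)
```

Setting `A` to the identity matrix gives the standard OLS case mentioned in the exercise.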
for $x \in \mathbb{R}^m$ and $b \in \mathbb{R}$,
\[
\sigma(t) = \frac{1}{1 + e^{-t}}.
\]
Calculate $\nabla f(w)$, and show that it can be expressed in terms of $\sigma$, $x$, and $b$.
Solution:
f is a differentiable function composed of two functions, an affine transformation of w and
a differentiable function σ.
Standard differentiation techniques give
\[
\sigma'(t) = e^{-t} \cdot \frac{1}{(1 + e^{-t})^2} = \frac{e^{-t}}{1 + e^{-t}} \cdot \frac{1}{1 + e^{-t}} = (1 - \sigma(t))\,\sigma(t).
\]
So we have
\[
\nabla f(w) = Df(w)^\top = f(w)\,(1 - f(w))\,x.
\]
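The gradient formula can be compared with finite differences. The sketch below assumes $f(w) = \sigma(w \cdot x + b)$, the affine-then-sigmoid composition described in the solution; the exact definition is not reproduced here, and the values of `x`, `b` and `w` are made up.

```python
import math

def sigma(t):
    # Logistic function
    return 1.0 / (1.0 + math.exp(-t))

def f(w, x, b):
    # Assumed form: f(w) = sigma(w . x + b)
    return sigma(sum(wi * xi for wi, xi in zip(w, x)) + b)

def grad_analytic(w, x, b):
    # f(w) (1 - f(w)) x
    fw = f(w, x, b)
    return [fw * (1.0 - fw) * xi for xi in x]

def grad_numeric(w, x, b, h=1e-6):
    # Central differences in each coordinate of w
    grad = []
    for k in range(len(w)):
        plus, minus = list(w), list(w)
        plus[k] += h
        minus[k] -= h
        grad.append((f(plus, x, b) - f(minus, x, b)) / (2 * h))
    return grad

x = [0.5, -1.0, 2.0]   # hypothetical input vector
b = 0.3                # hypothetical offset
w = [0.1, 0.4, -0.2]   # point at which the gradient is evaluated

analytic = grad_analytic(w, x, b)
numeric = grad_numeric(w, x, b)
max_error = max(abs(u - v) for u, v in zip(analytic, numeric))
print(max_error)
```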
for $x \in \mathbb{R}^m$ and $b \in \mathbb{R}$,
\[
\tanh(t) = \frac{e^{t} - e^{-t}}{e^{t} + e^{-t}}.
\]
Calculate $\nabla f(w, b)$ and express it in terms of $f$ and $x$.
Solution:
f is a differentiable function composed of two functions, an affine transformation of w and
a differentiable function tanh .
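Standard differentiation techniques give
\[
\tanh'(t) = 1 - \tanh^2(t).
\]
So we have
\[
\nabla f(w, b) = Df(w, b)^\top = \left(1 - (f(w, b))^2\right) \cdot \begin{pmatrix} x \\ 1 \end{pmatrix}.
\]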
\[
g(x, y, z) = (x - y + z,\; x^2 + y^2)
\]
\[
f(s, t) = (e^{s} + e^{-t},\; e^{-s} + e^{t})
\]
Solution:
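\[
Df(s, t) = \begin{pmatrix} e^{s} & -e^{-t} \\ -e^{-s} & e^{t} \end{pmatrix},
\qquad
Dg(x, y, z) = \begin{pmatrix} 1 & -1 & 1 \\ 2x & 2y & 0 \end{pmatrix}
\]
\begin{align*}
Dh(x, y, z) &= Df(g(x, y, z))\, Dg(x, y, z) \\
&= \begin{pmatrix} e^{x-y+z} & -e^{-(x^2+y^2)} \\ -e^{-x+y-z} & e^{x^2+y^2} \end{pmatrix}
   \begin{pmatrix} 1 & -1 & 1 \\ 2x & 2y & 0 \end{pmatrix} \\
&= \begin{pmatrix}
e^{x-y+z} - 2x e^{-(x^2+y^2)} & -e^{x-y+z} - 2y e^{-(x^2+y^2)} & e^{x-y+z} \\
-e^{-x+y-z} + 2x e^{x^2+y^2} & e^{-x+y-z} + 2y e^{x^2+y^2} & -e^{-x+y-z}
\end{pmatrix}
\end{align*}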
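The final matrix can be cross-checked by differentiating the composition numerically. The sketch below takes $h = f \circ g$, as the chain-rule product above indicates, and compares the analytic entries with central differences at an arbitrary point; the helper names and the test point are illustrative.

```python
import math

def g(x, y, z):
    return (x - y + z, x**2 + y**2)

def f(s, t):
    return (math.exp(s) + math.exp(-t), math.exp(-s) + math.exp(t))

def h(x, y, z):
    # Composition h = f o g
    return f(*g(x, y, z))

def Dh_analytic(x, y, z):
    # Entries of Df(g(x, y, z)) Dg(x, y, z) after multiplying out
    a = x - y + z
    r = x**2 + y**2
    return [
        [math.exp(a) - 2 * x * math.exp(-r), -math.exp(a) - 2 * y * math.exp(-r), math.exp(a)],
        [-math.exp(-a) + 2 * x * math.exp(r), math.exp(-a) + 2 * y * math.exp(r), -math.exp(-a)],
    ]

def Dh_numeric(point, step=1e-6):
    # Central differences; entry [i][j] approximates d h_i / d (variable j)
    jac = [[0.0] * 3 for _ in range(2)]
    for j in range(3):
        plus, minus = list(point), list(point)
        plus[j] += step
        minus[j] -= step
        h_plus, h_minus = h(*plus), h(*minus)
        for i in range(2):
            jac[i][j] = (h_plus[i] - h_minus[i]) / (2 * step)
    return jac

point = (0.3, -0.2, 0.5)  # arbitrary test point (assumption)
exact = Dh_analytic(*point)
approx = Dh_numeric(point)
max_error = max(abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(3))
print(max_error)
```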