
Chair of Electronic Design Automation

Department of Electrical and Computer Engineering


Technical University of Munich

Machine Learning: Methods and Tools

Exercise 0: Math Recap

Linear Algebra
Notation. We use the following notation in this exercise:
• Scalars are denoted as lowercase letters, e.g. $a, b, x$.
• Vectors are denoted as bold lowercase letters, e.g. $\mathbf{a}, \mathbf{b}, \mathbf{x}$.
• Matrices are denoted as bold uppercase letters, e.g. $\mathbf{A}, \mathbf{B}, \mathbf{X}$.
Problem 1
A matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is positive semi-definite (PSD), denoted $\mathbf{A} \succeq 0$, if $\mathbf{A} = \mathbf{A}^\top$ and $\mathbf{x}^\top \mathbf{A} \mathbf{x} \ge 0$ for all $\mathbf{x} \in \mathbb{R}^n$. A matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$ is positive definite (PD), denoted $\mathbf{A} \succ 0$, if $\mathbf{A} = \mathbf{A}^\top$ and $\mathbf{x}^\top \mathbf{A} \mathbf{x} > 0$ for all $\mathbf{x} \ne \mathbf{0}$.
a) Let $\mathbf{A} \in \mathbb{R}^{n \times n}$ and assume that there exists a matrix $\mathbf{B} \in \mathbb{R}^{n \times n}$ such that $\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I}$. What can you say about the eigenvalues of $\mathbf{A}$?
$\mathbf{A}\mathbf{B} = \mathbf{B}\mathbf{A} = \mathbf{I} \implies \mathbf{B} = \mathbf{A}^{-1} \implies \mathbf{A}$ is invertible, so $\det(\mathbf{A}) \ne 0$. Since the determinant is the product of the eigenvalues, none of the eigenvalues of $\mathbf{A}$ is $0$.
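As a quick numerical sanity check (my addition, not part of the original sheet): a minimal NumPy sketch that draws a random matrix, inverts it, and confirms that the determinant and all eigenvalues are nonzero. Seed and dimension are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary seed
A = rng.standard_normal((4, 4))    # a random 4x4 matrix is invertible almost surely
B = np.linalg.inv(A)               # B exists iff A is invertible

# AB = BA = I holds for the inverse
assert np.allclose(A @ B, np.eye(4)) and np.allclose(B @ A, np.eye(4))
print("det(A):", np.linalg.det(A))            # nonzero
print("eigenvalues:", np.linalg.eigvals(A))   # none of them is 0
```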

b) Let $\mathbf{A} \in \mathbb{R}^{m \times n}$. Prove that the matrix $\mathbf{B} = \mathbf{A}^\top \mathbf{A}$ is positive semi-definite (PSD) for any choice of $\mathbf{A}$.
$\mathbf{B}$ is symmetric, since $\mathbf{B}^\top = (\mathbf{A}^\top \mathbf{A})^\top = \mathbf{A}^\top \mathbf{A} = \mathbf{B}$. Moreover, for all $\mathbf{x} \in \mathbb{R}^n$,
$\mathbf{x}^\top \mathbf{B} \mathbf{x} = \mathbf{x}^\top \mathbf{A}^\top \mathbf{A} \mathbf{x} = (\mathbf{A}\mathbf{x})^\top (\mathbf{A}\mathbf{x}) = \lVert \mathbf{A}\mathbf{x} \rVert_2^2 \ge 0$
$\implies$ the matrix $\mathbf{B}$ is positive semi-definite.

c) Let $\mathbf{A} \in \mathbb{R}^{n \times n}$ be positive semi-definite (PSD) and $\mathbf{B} \in \mathbb{R}^{m \times n}$ be arbitrary, where $m, n \in \mathbb{N}$. Is $\mathbf{B}\mathbf{A}\mathbf{B}^\top$ PSD? If so, prove it. If not, give a counterexample.
$\forall \mathbf{x} \in \mathbb{R}^m: \mathbf{x}^\top \mathbf{B}\mathbf{A}\mathbf{B}^\top \mathbf{x} = (\mathbf{B}^\top \mathbf{x})^\top \mathbf{A} (\mathbf{B}^\top \mathbf{x}) = \mathbf{z}^\top \mathbf{A} \mathbf{z} \ge 0$, where $\mathbf{z} = \mathbf{B}^\top \mathbf{x} \in \mathbb{R}^n$
$\implies \mathbf{B}\mathbf{A}\mathbf{B}^\top$ is PSD.
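A small NumPy sketch (my addition) illustrating parts b) and c) together: build $\mathbf{S} = \mathbf{A}^\top \mathbf{A}$ from a random $\mathbf{A}$, then sandwich it as $\mathbf{B}\mathbf{S}\mathbf{B}^\top$; in both cases the eigenvalues come out non-negative up to round-off. Dimensions and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)    # arbitrary seed
m, n = 3, 5                       # arbitrary dimensions

# Part b): A^T A is PSD for any A in R^{m x n}
A = rng.standard_normal((m, n))
S = A.T @ A                                  # n x n, symmetric
print(np.linalg.eigvalsh(S))                 # eigvalsh assumes symmetry; all >= 0

# Part c): B S B^T is PSD when S is PSD and B is arbitrary
B = rng.standard_normal((m, n))
C = B @ S @ B.T                              # m x m
print(np.linalg.eigvalsh(C))                 # all >= 0 (up to round-off)
```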

Calculus
Problem 2
a) For $\mathbf{x} \in \mathbb{R}^n$, let $f: \mathbb{R}^n \to \mathbb{R}$ with $f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^\top \mathbf{A} \mathbf{x} + \mathbf{b}^\top \mathbf{x}$, where $\mathbf{A}$ is a symmetric matrix and $\mathbf{b} \in \mathbb{R}^n$ is a vector. What are $\nabla_{\mathbf{x}} f(\mathbf{x})$ and $\nabla_{\mathbf{x}}^2 f(\mathbf{x})$?
$\mathbf{b}^\top \mathbf{x} = b_1 x_1 + \dots + b_n x_n = \sum_{i=1}^{n} b_i x_i$

$\frac{\partial \mathbf{b}^\top \mathbf{x}}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial \mathbf{b}^\top \mathbf{x}}{\partial x_1} \\ \vdots \\ \frac{\partial \mathbf{b}^\top \mathbf{x}}{\partial x_n} \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \mathbf{b}$

Similarly,

$\mathbf{x}^\top \mathbf{A} \mathbf{x} = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j$

For $1 \le k \le n$, the partial derivative with respect to $x_k$ is

$\frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial x_k} = \frac{\partial}{\partial x_k} \left( \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j \right) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial}{\partial x_k} \left( A_{ij} x_i x_j \right)$

$= \sum_{i=1, i \ne k}^{n} \frac{\partial}{\partial x_k} \left( A_{ik} x_i x_k \right) + \sum_{j=1, j \ne k}^{n} \frac{\partial}{\partial x_k} \left( A_{kj} x_k x_j \right) + \frac{\partial}{\partial x_k} \left( A_{kk} x_k^2 \right)$

$= \sum_{i=1, i \ne k}^{n} A_{ik} x_i + \sum_{j=1, j \ne k}^{n} A_{kj} x_j + 2 A_{kk} x_k = \sum_{i=1}^{n} A_{ik} x_i + \sum_{j=1}^{n} A_{kj} x_j \overset{\mathbf{A}\ \text{symmetric}}{=} 2 \sum_{j=1}^{n} A_{kj} x_j$

Stacking the partial derivatives,

$\frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial \mathbf{x}} = \begin{pmatrix} \frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial x_1} \\ \vdots \\ \frac{\partial \mathbf{x}^\top \mathbf{A} \mathbf{x}}{\partial x_n} \end{pmatrix} = \begin{pmatrix} 2 \sum_{j=1}^{n} A_{1j} x_j \\ \vdots \\ 2 \sum_{j=1}^{n} A_{nj} x_j \end{pmatrix} = 2 \mathbf{A} \mathbf{x}$

$\nabla_{\mathbf{x}} f(\mathbf{x}) = \nabla_{\mathbf{x}} \left( \frac{1}{2} \mathbf{x}^\top \mathbf{A} \mathbf{x} + \mathbf{b}^\top \mathbf{x} \right) = \mathbf{A} \mathbf{x} + \mathbf{b}$

For the second derivatives, with $1 \le k, l \le n$,

$\frac{\partial^2 (\mathbf{x}^\top \mathbf{A} \mathbf{x})}{\partial x_k \partial x_l} = \frac{\partial}{\partial x_l} \left( 2 \sum_{j=1}^{n} A_{kj} x_j \right) = 2 A_{kl}$

$\nabla_{\mathbf{x}}^2 f(\mathbf{x}) = \nabla_{\mathbf{x}}^2 \left( \frac{1}{2} \mathbf{x}^\top \mathbf{A} \mathbf{x} + \mathbf{b}^\top \mathbf{x} \right) = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} = \mathbf{A}$

Therefore, we obtain $\nabla_{\mathbf{x}} f(\mathbf{x}) = \mathbf{A}\mathbf{x} + \mathbf{b}$ and $\nabla_{\mathbf{x}}^2 f(\mathbf{x}) = \mathbf{A}$.
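A finite-difference check of this result (my addition, not part of the sheet): compare the analytic gradient $\mathbf{A}\mathbf{x} + \mathbf{b}$ and Hessian $\mathbf{A}$ with central differences of $f$ at a random point. Seed, dimension, and step size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                 # make A symmetric, as the problem assumes
b = rng.standard_normal(n)
f = lambda x: 0.5 * x @ A @ x + b @ x

x0, eps, I = rng.standard_normal(n), 1e-5, np.eye(n)

# central differences: df/dx_k ~ (f(x + eps e_k) - f(x - eps e_k)) / (2 eps)
grad_fd = np.array([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps) for e in I])
print(np.allclose(grad_fd, A @ x0 + b, atol=1e-6))   # True: grad f = Ax + b

# second-order central differences for the Hessian
hess_fd = np.array([[(f(x0 + eps*ei + eps*ej) - f(x0 + eps*ei - eps*ej)
                      - f(x0 - eps*ei + eps*ej) + f(x0 - eps*ei - eps*ej))
                     / (4 * eps**2) for ej in I] for ei in I])
print(np.allclose(hess_fd, A, atol=1e-4))            # True: Hessian = A
```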

b) For $\mathbf{x} \in \mathbb{R}^n$, let $f: \mathbb{R}^n \to \mathbb{R}$ with $f(\mathbf{x}) = \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert_2^2 = (\mathbf{A}\mathbf{x} - \mathbf{b})^\top (\mathbf{A}\mathbf{x} - \mathbf{b})$, where $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{b} \in \mathbb{R}^m$. What is $\nabla_{\mathbf{x}} f(\mathbf{x})$?
$f(\mathbf{x}) = \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert_2^2 = (\mathbf{A}\mathbf{x} - \mathbf{b})^\top (\mathbf{A}\mathbf{x} - \mathbf{b})$
$= (\mathbf{x}^\top \mathbf{A}^\top - \mathbf{b}^\top)(\mathbf{A}\mathbf{x} - \mathbf{b})$
$= \mathbf{x}^\top \mathbf{A}^\top \mathbf{A} \mathbf{x} - \mathbf{x}^\top \mathbf{A}^\top \mathbf{b} - \mathbf{b}^\top \mathbf{A} \mathbf{x} + \mathbf{b}^\top \mathbf{b}$
$= \mathbf{x}^\top \mathbf{A}^\top \mathbf{A} \mathbf{x} - 2 \mathbf{b}^\top \mathbf{A} \mathbf{x} + \mathbf{b}^\top \mathbf{b}$
From part a) (applied with the symmetric matrix $2\mathbf{A}^\top \mathbf{A}$ and the vector $-2\mathbf{A}^\top \mathbf{b}$; the constant $\mathbf{b}^\top \mathbf{b}$ has zero gradient), we get
$\nabla_{\mathbf{x}} f(\mathbf{x}) = 2 \mathbf{A}^\top \mathbf{A} \mathbf{x} - 2 \mathbf{A}^\top \mathbf{b} = 2 \mathbf{A}^\top (\mathbf{A}\mathbf{x} - \mathbf{b})$
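Setting this gradient to zero gives the normal equations $\mathbf{A}^\top \mathbf{A} \mathbf{x} = \mathbf{A}^\top \mathbf{b}$ of least squares. A small sketch (my addition) comparing the resulting stationary point against NumPy's least-squares solver; sizes and seed are arbitrary, and $\mathbf{A}^\top \mathbf{A}$ is assumed invertible (true almost surely for a random tall $\mathbf{A}$).

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 8, 3                        # tall system: more equations than unknowns
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# grad f(x) = 2 A^T (Ax - b) = 0  =>  A^T A x = A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_normal, x_lstsq))   # True: both minimize ||Ax - b||_2^2
```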

c) Let $f(\mathbf{x}) = g(h(\mathbf{x}))$, where $g: \mathbb{R} \to \mathbb{R}$ is differentiable and $h: \mathbb{R}^n \to \mathbb{R}$ is differentiable. What is $\nabla_{\mathbf{x}} f(\mathbf{x})$?
By the chain rule,
$\nabla_{\mathbf{x}} f(\mathbf{x}) = \frac{\mathrm{d}g(h)}{\mathrm{d}h} \nabla_{\mathbf{x}} h(\mathbf{x}) = g'(h(\mathbf{x})) \, \nabla_{\mathbf{x}} h(\mathbf{x})$

d) Let $f(\mathbf{x}) = g(\mathbf{a}^\top \mathbf{x})$, where $g: \mathbb{R} \to \mathbb{R}$ is continuously differentiable and $\mathbf{a} \in \mathbb{R}^n$. What is $\nabla_{\mathbf{x}} f(\mathbf{x})$?
$\nabla_{\mathbf{x}} f(\mathbf{x}) = g'(\mathbf{a}^\top \mathbf{x}) \, \nabla_{\mathbf{x}} (\mathbf{a}^\top \mathbf{x}) = g'(\mathbf{a}^\top \mathbf{x}) \cdot \mathbf{a}$
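A concrete instance of d) (my addition): with $g = \tanh$, so $g'(t) = 1 - \tanh^2(t)$, the analytic gradient $g'(\mathbf{a}^\top \mathbf{x}) \, \mathbf{a}$ matches central finite differences. The choice of $g$, seed, and dimension are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
a = rng.standard_normal(n)
x0 = rng.standard_normal(n)

f = lambda x: np.tanh(a @ x)                    # g = tanh
grad_analytic = (1 - np.tanh(a @ x0) ** 2) * a  # g'(a^T x) * a

eps = 1e-6
grad_fd = np.array([(f(x0 + eps * e) - f(x0 - eps * e)) / (2 * eps)
                    for e in np.eye(n)])
print(np.allclose(grad_fd, grad_analytic, atol=1e-8))   # True
```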

Problem 3
Compute the derivatives for the following functions.
a) $f_1: \mathbb{R} \to \mathbb{R}$, $f_1(x) = \log(x^4) \sin(x^3)$
$f_1'(x) = \frac{4 \sin(x^3)}{x} + 3x^2 \cos(x^3) \log(x^4) = \frac{4 \sin(x^3)}{x} + 12 x^2 \cos(x^3) \log(x)$
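A one-line numerical check (my addition; the test point and step size are arbitrary):

```python
import numpy as np

f1 = lambda x: np.log(x**4) * np.sin(x**3)
df1 = lambda x: 4 * np.sin(x**3) / x + 12 * x**2 * np.cos(x**3) * np.log(x)

x, eps = 1.7, 1e-6
# central difference vs. the analytic derivative above
print(np.isclose((f1(x + eps) - f1(x - eps)) / (2 * eps), df1(x)))  # True
```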
b) $f_2: \mathbb{R} \to \mathbb{R}$, $f_2(x) = \frac{1}{1 + \exp(-x)}$
$f_2'(x) = \frac{\exp(-x)}{(1 + \exp(-x))^2} = f_2(x)(1 - f_2(x))$
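This is the logistic sigmoid; the identity $f_2' = f_2(1 - f_2)$ is the form typically used in backpropagation. A minimal check (my addition):

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
eps = 1e-6
deriv_fd = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
# finite differences agree with sigmoid(x) * (1 - sigmoid(x))
print(np.allclose(deriv_fd, sigmoid(x) * (1 - sigmoid(x))))  # True
```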

c) $f_3: \mathbb{R} \to \mathbb{R}$, $f_3(x) = \exp\left( -\frac{1}{2\sigma^2} (x - \mu)^2 \right)$
$f_3'(x) = -\frac{x - \mu}{\sigma^2} \exp\left( -\frac{1}{2\sigma^2} (x - \mu)^2 \right) = -\frac{x - \mu}{\sigma^2} f_3(x)$
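This is the unnormalized Gaussian bell curve. A quick check of the derivative (my addition; $\mu$, $\sigma$, and the test point are arbitrary):

```python
import numpy as np

mu, sigma = 0.5, 1.2              # arbitrary parameters
f3 = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2))
df3 = lambda x: -(x - mu) / sigma**2 * f3(x)

x, eps = 1.9, 1e-6
print(np.isclose((f3(x + eps) - f3(x - eps)) / (2 * eps), df3(x)))  # True
```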

d) $f_4: \mathbb{R}^3 \to \mathbb{R}^2$, $f_4(x, y, z) = (xy + 2yz, \; 2xy^2z)^\top$

$J_{f_4}(x, y, z) = \begin{pmatrix} y & x + 2z & 2y \\ 2y^2 z & 4xyz & 2xy^2 \end{pmatrix}$
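And a finite-difference check of the Jacobian (my addition; the test point is arbitrary): each column of $J_{f_4}$ is the central difference of $f_4$ along one coordinate.

```python
import numpy as np

f4 = lambda v: np.array([v[0]*v[1] + 2*v[1]*v[2], 2*v[0]*v[1]**2*v[2]])

def J_f4(v):
    x, y, z = v
    return np.array([[y,          x + 2*z, 2*y],
                     [2*y**2 * z, 4*x*y*z, 2*x*y**2]])

v0, eps = np.array([1.3, -0.7, 2.1]), 1e-6
# column k approximates the derivative along coordinate k
J_fd = np.column_stack([(f4(v0 + eps*e) - f4(v0 - eps*e)) / (2 * eps)
                        for e in np.eye(3)])
print(np.allclose(J_fd, J_f4(v0), atol=1e-8))  # True
```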

Probability Theory
Problem 4
a) Two random variables $X, Y$ are independent if $F_{XY}(x, y) = F_X(x) F_Y(y)$ for all values of $x$ and $y$. In the case of independence, the following property holds:
$E[f(X) g(Y)] = E[f(X)] \, E[g(Y)]$
Assume that $X, Y$ are two random variables that are independent and identically distributed with $X, Y \sim \mathcal{N}(0, \sigma^2)$. Prove that
$\mathrm{Var}[XY] = \mathrm{Var}[X] \, \mathrm{Var}[Y]$.
$X, Y \sim \mathcal{N}(0, \sigma^2) \Rightarrow E[X] = E[Y] = 0$
By the definition of variance, we have
$\mathrm{Var}[X] = E[X^2] - E[X]^2 \implies E[X^2] = \mathrm{Var}[X] + E[X]^2$

$\mathrm{Var}[XY] = E[(XY)^2] - E[XY]^2$
$= E[X^2] \, E[Y^2] - E[X]^2 E[Y]^2$ (by independence)
$= (\mathrm{Var}[X] + E[X]^2)(\mathrm{Var}[Y] + E[Y]^2) - E[X]^2 E[Y]^2$
$= \mathrm{Var}[X] \mathrm{Var}[Y] + \underbrace{\mathrm{Var}[X] \, E[Y]^2}_{=0} + \underbrace{E[X]^2 \, \mathrm{Var}[Y]}_{=0} - \underbrace{E[X]^2 E[Y]^2}_{=0}$
$= \mathrm{Var}[X] \mathrm{Var}[Y]$

This property comes in handy when we talk about the initialization of neural network weights.
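A Monte Carlo check (my addition; sample size, seed, and $\sigma$ are arbitrary) that $\mathrm{Var}[XY] \approx \mathrm{Var}[X] \, \mathrm{Var}[Y] = \sigma^4$ for independent zero-mean Gaussians:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, N = 1.5, 1_000_000
X = rng.normal(0.0, sigma, N)
Y = rng.normal(0.0, sigma, N)

print(np.var(X * Y))            # ~ sigma**4 = 5.0625 (up to sampling noise)
print(np.var(X) * np.var(Y))    # ~ sigma**4 as well
```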

b) Assume that the random variable $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$, i.e., $X \sim \mathcal{N}(\mu, \sigma^2)$. Let $Z = aX^2 + bX + c$; determine the mean of the random variable $Z$.
Using $\mathrm{Var}[X] = E[X^2] - E[X]^2 = E[X^2] - \mu^2 = \sigma^2$, i.e., $E[X^2] = \sigma^2 + \mu^2$, we have
$E[Z] = E[aX^2 + bX + c] = a E[X^2] + b E[X] + c = a(\sigma^2 + \mu^2) + b\mu + c$
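A Monte Carlo check of the closed form (my addition; all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
mu, sigma = 1.0, 2.0
a, b, c = 0.5, -1.0, 3.0

X = rng.normal(mu, sigma, 1_000_000)
Z = a * X**2 + b * X + c
print(Z.mean())                               # ~ 4.5 (Monte Carlo estimate)
print(a * (sigma**2 + mu**2) + b * mu + c)    # 4.5 exactly
```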
