Lecture 01
Lecture 1: Introduction
Sandjai Bhulai
Vrije Universiteit Amsterdam
[email protected]
6 September 2023
Motivation
▪ Self-customizing programs
> Amazon, Netflix product recommendations
Properties of matrices:
▪ $AA^{-1} = A^{-1}A = I$
▪ $(AB)^\top = B^\top A^\top$
▪ $(AB)^{-1} = B^{-1}A^{-1}$
▪ $(A^\top)^{-1} = (A^{-1})^\top$
▪ $(A^\top)^\top = A$
▪ $(A + B)^\top = A^\top + B^\top$
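The identities above can be spot-checked numerically. The following is a minimal sketch using NumPy; the two small matrices `A` and `B` are arbitrary invertible, non-symmetric examples chosen for illustration and are not taken from the lecture.

```python
import numpy as np

# Hypothetical example matrices: both invertible and not symmetric.
A = np.array([[2.0, 1.0], [0.0, 3.0]])   # det = 6
B = np.array([[1.0, 4.0], [2.0, 1.0]])   # det = -7
I = np.eye(2)
inv = np.linalg.inv

# Each entry compares the two sides of one identity from the list.
checks = {
    "A A^-1 = A^-1 A = I": np.allclose(A @ inv(A), I) and np.allclose(inv(A) @ A, I),
    "(AB)^T = B^T A^T": np.allclose((A @ B).T, B.T @ A.T),
    "(AB)^-1 = B^-1 A^-1": np.allclose(inv(A @ B), inv(B) @ inv(A)),
    "(A^T)^-1 = (A^-1)^T": np.allclose(inv(A.T), inv(A).T),
    "(A^T)^T = A": np.array_equal(A.T.T, A),
    "(A + B)^T = A^T + B^T": np.array_equal((A + B).T, A.T + B.T),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

A numerical check on one pair of matrices is of course not a proof; it only guards against misremembering the order of the factors.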
29 Sandjai Bhulai / Advanced Machine Learning / 6 September 2023
Linear algebra
▪ Symmetric matrix: A⊤ = A
▪ Positive definite: $x^\top A x > 0$ for all non-zero $x$, with $A$ symmetric
▪ The $i$-th element of $Ax$ is given by $(Ax)_i = \sum_{j=1}^{n} a_{ij} x_j$
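As a small illustration of these definitions, the sketch below tests positive definiteness through the eigenvalues of a symmetric matrix (a symmetric matrix is positive definite exactly when all its eigenvalues are positive) and computes $(Ax)_i$ with an explicit sum. The helper name `is_positive_definite`, the tolerance, and the example matrices are assumptions made for this example.

```python
import numpy as np

def is_positive_definite(A, tol=1e-12):
    """Return True if A is symmetric and all its eigenvalues exceed tol."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    # eigvalsh is intended for symmetric (Hermitian) matrices.
    return bool(np.all(np.linalg.eigvalsh(A) > tol))

A = np.array([[2.0, -1.0], [-1.0, 2.0]])  # symmetric, eigenvalues 1 and 3
B = np.array([[1.0, 2.0], [2.0, 1.0]])    # symmetric, eigenvalues -1 and 3

x = np.array([1.0, -1.0])
quad_A = x @ A @ x   # x^T A x = 6
quad_B = x @ B @ x   # x^T B x = -2, so B is not positive definite

# (Ax)_i written out as the sum over j of a_ij * x_j
n = x.size
Ax_manual = np.array([sum(A[i, j] * x[j] for j in range(n)) for i in range(n)])

print(is_positive_definite(A), is_positive_definite(B))
print(quad_A, quad_B, Ax_manual)
```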
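▪ vector-scalar: $\left(\frac{\partial a}{\partial x}\right)_i = \frac{\partial a_i}{\partial x}$
▪ scalar-vector: $\left(\frac{\partial x}{\partial a}\right)_i = \frac{\partial x}{\partial a_i}$
▪ vector-vector: $\left(\frac{\partial a}{\partial b}\right)_{ij} = \frac{\partial a_i}{\partial b_j}$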
Vector calculus
▪ Example: let $a \in \mathbb{R}^n$ and $x \in \mathbb{R}^n$. What is $\frac{\partial}{\partial x}(x^\top a)$?
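$\left(\frac{\partial b^\top A x}{\partial x}\right)_i = \frac{\partial b^\top A x}{\partial x_i} = \frac{\partial \sum_{k=1}^{n} \sum_{j=1}^{m} b_j a_{jk} x_k}{\partial x_i} = \sum_{j=1}^{m} b_j a_{ji} = \sum_{j=1}^{m} a^\top_{ij} b_j = (A^\top b)_i$
▪ Hence, $\frac{\partial b^\top A x}{\partial x} = b^\top A$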
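The result $\partial (b^\top A x) / \partial x = b^\top A$ can be checked against a finite-difference approximation. This is a minimal sketch: the helper `numerical_gradient`, the dimensions, and the random seed are assumptions chosen for illustration.

```python
import numpy as np

def numerical_gradient(f, x, eps=1e-6):
    """Central-difference approximation of (df/dx_1, ..., df/dx_n) for scalar f."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)   # arbitrary fixed seed
m, n = 3, 4
A = rng.normal(size=(m, n))
b = rng.normal(size=m)
x = rng.normal(size=n)

analytic = b @ A                                       # b^T A
numeric = numerical_gradient(lambda v: b @ A @ v, x)   # derivative of b^T A x

print(np.allclose(analytic, numeric, atol=1e-6))
```

Because $b^\top A x$ is linear in $x$, the gradient does not depend on the point `x` at which it is evaluated.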
Vector calculus
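▪ $\frac{\partial x}{\partial x} = I$
▪ $\frac{\partial Ax}{\partial x} = A$
▪ $\frac{\partial x^\top A}{\partial x} = A^\top$
▪ $\frac{\partial x^\top A x}{\partial x} = x^\top (A + A^\top)$, which equals $2x^\top A$ if $A$ is symmetric
▪ $\frac{\partial^2 x^\top A x}{\partial x \, \partial x^\top} = A + A^\top$, which equals $2A$ if $A$ is symmetric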
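Each identity in this list can be compared with a finite-difference Jacobian. The sketch below assumes the convention $J_{ij} = \partial f_i / \partial x_j$ from the vector-vector definition; the helper `numerical_jacobian`, the matrix size, and the seed are hypothetical choices. The last check differentiates the analytic gradient $x^\top (A + A^\top)$ once more rather than taking second differences.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of J[i, j] = d f_i / d x_j."""
    x = np.asarray(x, dtype=float)
    out_dim = np.atleast_1d(f(x)).size
    J = np.zeros((out_dim, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (np.atleast_1d(f(x + step)) - np.atleast_1d(f(x - step))) / (2 * eps)
    return J

rng = np.random.default_rng(1)   # arbitrary fixed seed
n = 3
A = rng.normal(size=(n, n))      # deliberately not symmetric
x = rng.normal(size=n)
S = A + A.T

checks = {
    "dx/dx = I": np.allclose(numerical_jacobian(lambda v: v, x), np.eye(n), atol=1e-6),
    "dAx/dx = A": np.allclose(numerical_jacobian(lambda v: A @ v, x), A, atol=1e-6),
    "dx^T A/dx = A^T": np.allclose(numerical_jacobian(lambda v: v @ A, x), A.T, atol=1e-6),
    "dx^T A x/dx = x^T (A + A^T)": np.allclose(
        numerical_jacobian(lambda v: v @ A @ v, x)[0], x @ S, atol=1e-6
    ),
    "second derivative = A + A^T": np.allclose(
        numerical_jacobian(lambda v: v @ S, x), S, atol=1e-6
    ),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```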
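▪ Proof: $\left(\frac{\partial x}{\partial x}\right)_{ij} = \frac{\partial x_i}{\partial x_j} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j \end{cases}$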
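▪ Proof: $\left(\frac{\partial Ax}{\partial x}\right)_{ij} = \frac{\partial (Ax)_i}{\partial x_j} = \frac{\partial \sum_{k=1}^{n} a_{ik} x_k}{\partial x_j} = a_{ij}$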
▪ Proof: $\frac{\partial a^\top x x^\top b}{\partial x_i} = \frac{\partial \sum_{k=1}^{n} a_k x_k \sum_{l=1}^{n} x_l b_l}{\partial x_i} = b_i \sum_{k=1}^{n} a_k x_k + a_i \sum_{l=1}^{n} x_l b_l$
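Normal distribution
$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$
Precision: $\beta = 1/\sigma^2$
$\mathcal{N}(x \mid \mu, \sigma^2) > 0$
$\int_{-\infty}^{\infty} \mathcal{N}(x \mid \mu, \sigma^2) \, dx = 1$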
Normal distribution
▪ Univariate Normal distribution
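$\mathcal{N}(x \mid \mu, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left\{ -\frac{1}{2\sigma^2} (x - \mu)^2 \right\}$
$\mathbb{E}[x] = \int_{-\infty}^{\infty} x \, \mathcal{N}(x \mid \mu, \sigma^2) \, dx = \mu$
$\mathbb{E}[x^2] = \int_{-\infty}^{\infty} x^2 \, \mathcal{N}(x \mid \mu, \sigma^2) \, dx = \mu^2 + \sigma^2$
$\operatorname{var}[x] = \mathbb{E}[x^2] - \mathbb{E}[x]^2 = \sigma^2$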
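The normalization and the two moments can be verified by numerical integration. The following is a rough sketch using only the standard library; the values of `mu` and `sigma2`, the trapezoidal rule, and the truncation of the integration range to ten standard deviations are assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Univariate Normal density N(x | mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def integrate(g, lo, hi, steps=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for k in range(1, steps):
        total += g(lo + k * h)
    return total * h

mu, sigma2 = 1.5, 4.0             # hypothetical parameter values
sigma = math.sqrt(sigma2)
lo, hi = mu - 10 * sigma, mu + 10 * sigma   # the tails beyond this are negligible

mass = integrate(lambda x: normal_pdf(x, mu, sigma2), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma2), lo, hi)
second_moment = integrate(lambda x: x ** 2 * normal_pdf(x, mu, sigma2), lo, hi)
variance = second_moment - mean ** 2

print(mass, mean, second_moment, variance)
```

Up to small numerical error, the printed values should agree with $1$, $\mu$, $\mu^2 + \sigma^2$, and $\sigma^2$ respectively.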
Normal distribution
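▪ Multivariate Normal distribution
$\mathcal{N}(x \mid \mu, \Sigma) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}$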
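The multivariate density can be evaluated directly from its definition. This is a minimal sketch; the function name `mvn_pdf` is hypothetical, and solving a linear system instead of forming $\Sigma^{-1}$ explicitly is a design choice made here, not something the slides prescribe.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of a D-dimensional Normal N(x | mu, Sigma) at the point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    D = mu.size
    diff = x - mu
    # (x - mu)^T Sigma^{-1} (x - mu), computed via a linear solve
    quad = diff @ np.linalg.solve(Sigma, diff)
    norm = (2 * np.pi) ** (D / 2) * np.sqrt(np.linalg.det(Sigma))
    return float(np.exp(-0.5 * quad) / norm)

# At the mean of a standard 2-D Normal the density is 1 / (2 * pi).
p_standard = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))

# With D = 1 the formula reduces to the univariate density.
p_1d = mvn_pdf([0.3], [1.0], [[2.0]])
p_1d_direct = np.exp(-((0.3 - 1.0) ** 2) / (2 * 2.0)) / np.sqrt(2 * np.pi * 2.0)

print(p_standard, p_1d, p_1d_direct)
```

For high-dimensional or ill-conditioned $\Sigma$, working with log-densities and a Cholesky factorization is usually more stable than the determinant used in this sketch.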
Normal distribution
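$\mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right)$, $\mathcal{N}\left( \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 & 0.8 \\ 0.8 & 1 \end{bmatrix} \right)$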
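To get a feel for the two distributions on this slide, one can draw samples and compare the empirical mean and covariance with the parameters. The sketch below assumes the slide shows a Normal with mean $(0, 0)^\top$ and identity covariance, and a Normal with mean $(0, 1)^\top$, unit variances and covariance 0.8; the sample size, seed, and dictionary keys are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)   # arbitrary fixed seed

# Assumed parameters of the two bivariate Normal distributions.
params = {
    "uncorrelated": (np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])),
    "correlated": (np.array([0.0, 1.0]), np.array([[1.0, 0.8], [0.8, 1.0]])),
}

summaries = {}
for name, (mu, Sigma) in params.items():
    samples = rng.multivariate_normal(mu, Sigma, size=200_000)
    # Each row of `samples` is one draw, hence rowvar=False.
    summaries[name] = (samples.mean(axis=0), np.cov(samples, rowvar=False))

for name, (sample_mean, sample_cov) in summaries.items():
    print(name)
    print("  mean:", np.round(sample_mean, 2))
    print("  cov: ", np.round(sample_cov, 2).tolist())
```

A scatter plot of the second sample would show an ellipse tilted along the diagonal, reflecting the positive covariance of 0.8.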