2 Probability and Linear Algebra
Jeffrey W. Miller
Department of Biostatistics
Harvard T.H. Chan School of Public Health
Outline
Probability basics
Random vectors
Linear algebra in this course
Matrices and transposes
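A is an m × n real matrix, written $A \in \mathbb{R}^{m \times n}$, if
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\]
where each entry $a_{ij}$ is a real number.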
Basic matrix properties
(AB)C = A(BC)
Consequently, we can write ABC without specifying the order in which the multiplications are performed.
A(B + C) = AB + AC
(B + C)A = BA + CA
Except in special circumstances, AB is not equal to BA.
$(AB)^T = B^T A^T$
$(A + B)^T = A^T + B^T$
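The identities above can be checked numerically. The following is a minimal sketch using NumPy; the matrices `A`, `B`, and `C` are arbitrary values chosen for illustration and are not taken from the slides.

```python
import numpy as np

# Hypothetical small matrices chosen only for illustration.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
C = np.array([[2.0, 0.0], [1.0, 3.0]])

# Associativity: (AB)C = A(BC)
assoc_ok = np.allclose((A @ B) @ C, A @ (B @ C))

# Distributivity from the left and from the right
left_dist_ok = np.allclose(A @ (B + C), A @ B + A @ C)
right_dist_ok = np.allclose((B + C) @ A, B @ A + C @ A)

# Transpose of a product reverses the order; transpose of a sum does not
transpose_prod_ok = np.allclose((A @ B).T, B.T @ A.T)
transpose_sum_ok = np.allclose((A + B).T, A.T + B.T)

# Non-commutativity: for this particular pair, AB differs from BA
commutes = np.allclose(A @ B, B @ A)
```

Here every `*_ok` flag evaluates to `True`, while `commutes` is `False`, since multiplying by this `B` on the right swaps columns of `A` and multiplying on the left swaps rows.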
Identity, inverse, and trace
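The n × n identity matrix, denoted $I_{n \times n}$ or I for short, is
\[
I = I_{n \times n} = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} \in \mathbb{R}^{n \times n}.
\]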
IA = A = AI
If it exists, the inverse of A, denoted $A^{-1}$, is a matrix such that $A^{-1}A = I$ and $AA^{-1} = I$.
If $A^{-1}$ exists, we say that A is invertible.
$(A^{-1})^T = (A^T)^{-1}$
$(AB)^{-1} = B^{-1}A^{-1}$
The trace of a square matrix $A \in \mathbb{R}^{n \times n}$, denoted tr(A), is defined as $\mathrm{tr}(A) = \sum_{i=1}^{n} A_{ii}$.
tr(AB) = tr(BA) if AB is a square matrix.
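As a quick sanity check of the inverse and trace identities, here is a small NumPy sketch. The matrices are hypothetical examples; `P` and `Q` are deliberately non-square to illustrate that tr(PQ) = tr(QP) even when PQ and QP have different sizes.

```python
import numpy as np

# Hypothetical invertible matrices (determinants 5 and 1).
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, 2.0], [0.0, 1.0]])
I = np.eye(2)

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

# Defining property of the inverse
inverse_ok = np.allclose(A_inv @ A, I) and np.allclose(A @ A_inv, I)

# Inverse commutes with transpose; inverse of a product reverses the order
inv_transpose_ok = np.allclose(np.linalg.inv(A.T), A_inv.T)
inv_product_ok = np.allclose(np.linalg.inv(A @ B), B_inv @ A_inv)

# Trace identity with non-square factors: P is 2x3, Q is 3x2,
# so PQ is 2x2 and QP is 3x3, yet the traces agree.
P = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
Q = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
trace_PQ = np.trace(P @ Q)
trace_QP = np.trace(Q @ P)
```

In numerical work one would usually prefer `np.linalg.solve` over forming an explicit inverse; `np.linalg.inv` is used here only because the identities are stated in terms of the inverse.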
Symmetric and definite matrices
A is symmetric if A = AT .
Discrete random variables
Informally, a random variable (r.v.) is a quantity that
probabilistically takes any one of a range of values.
Notation: Uppercase for r.v.s, lowercase for values taken.
Continuous random variables
A random variable X ∈ R is continuous if there is a function p(x) ≥ 0 such that $P(X \in A) = \int_A p(x)\,dx$ for all A ⊆ R.
(We will ignore measure-theoretic technicalities in this course.)
Examples: Normal, Uniform, Beta, Gamma, Exponential.
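To make the density definition concrete, the sketch below approximates P(X ∈ [a, b]) for an Exponential random variable by numerically integrating its density and compares the result with the closed form. The function names, the rate of 1, and the interval [0.5, 2] are assumptions made for illustration only.

```python
import math

def exponential_density(x, rate=1.0):
    """Density of an Exponential(rate) random variable; zero for x < 0."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def interval_probability(density, a, b, steps=100_000):
    """Approximate P(a <= X <= b) by integrating the density with the midpoint rule."""
    width = (b - a) / steps
    return sum(density(a + (k + 0.5) * width) for k in range(steps)) * width

# P(0.5 <= X <= 2) for rate = 1, numerically and in closed form
approx = interval_probability(exponential_density, 0.5, 2.0)
exact = math.exp(-0.5) - math.exp(-2.0)
```

The two values agree to many decimal places, which is what the integral definition of P(X ∈ A) predicts when A is an interval.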
Joint distributions of multiple random variables/vectors
p(x, y) denotes the joint density of $X \in \mathcal{X}$ and $Y \in \mathcal{Y}$.
$P(X = x, Y = y) = p(x, y)$ if X and Y are discrete.
$P(X \in A, Y \in B) = \int_{A \times B} p(x, y)\,dx\,dy$ if X and Y are continuous.
$P(X = x, Y \in B) = \int_B p(x, y)\,dy$ if X is discrete and Y is continuous.
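$X_1, \ldots, X_n$ are independent if
\[ p(x_1, \ldots, x_n) = p(x_1) \cdots p(x_n) \]
for all $x_1, \ldots, x_n$.
$X_1, \ldots, X_n$ are conditionally independent given Y if
\[ p(x_1, \ldots, x_n \mid y) = p(x_1 \mid y) \cdots p(x_n \mid y) \]
for all $x_1, \ldots, x_n, y$.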
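For two discrete random variables, independence means the joint probability table equals the outer product of its marginals. The sketch below tests that factorization; the helper `is_independent` and both example tables are hypothetical and serve only as an illustration.

```python
import numpy as np

def is_independent(joint, tol=1e-12):
    """Return True if a 2-D joint pmf equals the outer product of its marginals."""
    joint = np.asarray(joint, dtype=float)
    p_x = joint.sum(axis=1)   # marginal of X (sum over y)
    p_y = joint.sum(axis=0)   # marginal of Y (sum over x)
    return bool(np.allclose(joint, np.outer(p_x, p_y), atol=tol))

# Built as a product of marginals, so X and Y are independent by construction
independent_joint = np.outer([0.3, 0.7], [0.2, 0.5, 0.3])

# All mass on the diagonal: X determines Y, so they are dependent
dependent_joint = np.array([[0.5, 0.0], [0.0, 0.5]])
```

`is_independent(independent_joint)` returns `True`, while `is_independent(dependent_joint)` returns `False` because each marginal is (0.5, 0.5) and their product puts mass 0.25 on every cell.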
Expectations (a.k.a. expected values)
Random vectors
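If $Z_1, \ldots, Z_n$ are random variables, then $Z = (Z_1, \ldots, Z_n)^T$ is a random vector in $\mathbb{R}^n$.
The expectation of a random vector is taken entrywise: $E(Z) = (E(Z_1), \ldots, E(Z_n))^T$.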
The covariance matrix of a random vector $Z \in \mathbb{R}^n$ is the matrix $\mathrm{Cov}(Z) \in \mathbb{R}^{n \times n}$ with (i, j)th entry
\[ \mathrm{Cov}(Z)_{ij} = \mathrm{Cov}(Z_i, Z_j) \]
where
\[ \mathrm{Cov}(Z_i, Z_j) = E\big((Z_i - E(Z_i))(Z_j - E(Z_j))\big) = E(Z_i Z_j) - E(Z_i)E(Z_j). \]
Equivalently,
\[ \mathrm{Cov}(Z) = E\big((Z - E(Z))(Z - E(Z))^T\big) = E(ZZ^T) - E(Z)E(Z)^T. \]
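The two expressions for the covariance matrix can be compared directly on a small discrete distribution, where expectations are finite weighted sums. The support points and probabilities below are made up for illustration.

```python
import numpy as np

# Hypothetical discrete distribution on three points in R^2.
values = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]])  # each row is a value z
probs = np.array([0.2, 0.5, 0.3])                        # P(Z = z) for each row

mean = probs @ values                                    # E(Z)

# Form 1: E[(Z - E(Z))(Z - E(Z))^T]
centered = values - mean
cov_centered = (centered * probs[:, None]).T @ centered

# Form 2: E(Z Z^T) - E(Z) E(Z)^T
second_moment = (values * probs[:, None]).T @ values
cov_moment = second_moment - np.outer(mean, mean)
```

Both forms produce the same symmetric 2 × 2 matrix. In floating point the centered form is generally the safer choice, since the second form subtracts two potentially large quantities.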
$E(AZ + b) = A\,E(Z) + b$
and
$\mathrm{Cov}(AZ + b) = A\,\mathrm{Cov}(Z)\,A^T$
for any fixed (i.e., nonrandom) $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$.
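These two rules can be illustrated with sample statistics. The sample mean and sample covariance are the mean and (rescaled) covariance of the empirical distribution, so they satisfy the same identities up to floating-point error, not just approximately. The dimensions, seed, and random `A` and `b` below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, num_samples = 3, 2, 1000

Z = rng.standard_normal((n, num_samples))   # columns are samples of Z in R^3
A = rng.standard_normal((m, n))             # fixed (nonrandom once drawn) matrix
b = rng.standard_normal((m, 1))             # fixed shift, broadcast over columns

W = A @ Z + b                               # columns are samples of AZ + b

mean_Z = Z.mean(axis=1, keepdims=True)
mean_W = W.mean(axis=1, keepdims=True)
cov_Z = np.cov(Z)     # np.cov treats rows as variables, columns as observations
cov_W = np.cov(W)

mean_rule_ok = np.allclose(mean_W, A @ mean_Z + b)
cov_rule_ok = np.allclose(cov_W, A @ cov_Z @ A.T)
```

Note that the shift `b` appears in the mean rule but drops out of the covariance rule, because centering removes any constant offset.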
Multivariate normal distribution
If $\mu \in \mathbb{R}^n$ and $C \in \mathbb{R}^{n \times n}$ is symmetric positive semidefinite (SPSD), then $Z \sim N(\mu, C)$ denotes that Z is multivariate normal with E(Z) = µ and Cov(Z) = C.
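One common way to draw from N(µ, C) applies the affine rules for expectation and covariance: if ε has independent standard normal entries and C = LLᵀ, then µ + Lε has mean µ and covariance C. The sketch below uses a Cholesky factor for L, which assumes C is positive definite (a stronger condition than SPSD); the function name `sample_mvn` and the example parameters are hypothetical.

```python
import numpy as np

def sample_mvn(mu, C, num_samples, rng):
    """Draw samples from N(mu, C) as mu + L @ eps, where C = L L^T."""
    mu = np.asarray(mu, dtype=float)
    L = np.linalg.cholesky(C)                          # requires C positive definite
    eps = rng.standard_normal((mu.size, num_samples))  # independent N(0, 1) entries
    return mu[:, None] + L @ eps                       # shape (n, num_samples)

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
C = np.array([[2.0, 0.6], [0.6, 1.0]])
samples = sample_mvn(mu, C, 200_000, rng)

sample_mean = samples.mean(axis=1)   # should be close to mu
sample_cov = np.cov(samples)         # should be close to C
```

With this many samples the sample mean and covariance land close to `mu` and `C`. NumPy's `Generator.multivariate_normal` offers the same functionality and also handles covariance matrices that are only positive semidefinite.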
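If C is symmetric positive definite (and hence invertible), then $Z \sim N(\mu, C)$ has density
\[ p(z) = (2\pi)^{-n/2} \det(C)^{-1/2} \exp\!\big(-\tfrac{1}{2}(z - \mu)^T C^{-1} (z - \mu)\big) \]
for all $z \in \mathbb{R}^n$.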
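The multivariate normal density with positive definite covariance C is (2π)^(−n/2) det(C)^(−1/2) exp(−½ (z − µ)ᵀ C⁻¹ (z − µ)). A minimal sketch of evaluating it on the log scale follows; the function name `mvn_log_density` and the test point are assumptions for illustration. It uses `slogdet` and `solve` rather than an explicit determinant and inverse, a common choice for numerical stability.

```python
import numpy as np

def mvn_log_density(z, mu, C):
    """Log of the N(mu, C) density at z, assuming C is symmetric positive definite."""
    z = np.asarray(z, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n = mu.size
    diff = z - mu
    _sign, logdet = np.linalg.slogdet(C)       # log det(C), computed stably
    quad = diff @ np.linalg.solve(C, diff)     # (z - mu)^T C^{-1} (z - mu)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

# With a diagonal C the density factorizes into univariate normal densities,
# which gives an independent way to check the result.
mu = np.array([0.0, 1.0])
variances = np.array([1.0, 4.0])
C = np.diag(variances)
z = np.array([0.5, -1.0])

log_p = mvn_log_density(z, mu, C)
log_p_univariate = np.sum(
    -0.5 * np.log(2.0 * np.pi * variances) - 0.5 * (z - mu) ** 2 / variances
)
```

The two log-density values agree, reflecting that the entries of Z are independent normals when C is diagonal.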