Lect Slides #4
Chapter 6: Vector Random Variables
A vector random variable X = (X1, X2, ..., Xn) takes on vector values x = (x1, x2, ..., xn).
Examples
An MP3 codec (coder and decoder), or a cellular phone codec, processes audio in blocks of n samples, X = (X1, X2, ..., Xn). Then X is a vector random variable.
6.1 Arrivals at a Packet Switch
Packets arrive at each of three input ports according to independent Bernoulli trials with p = 1/2, and each packet is equally likely to be relayed to any of three output ports. Let X = (X1, X2, X3), where Xi is the total number of packets arriving for output port i. Then X is a vector random variable whose values are determined by the pattern of arrivals at the input ports.
6.3 Samples of an Audio Signal
Let the outcome ζ of a random experiment be an audio signal X(t).
Let the random variable Xk = X(kT) be the sample of the signal at time kT, where T is the sampling interval.
The mapping from X(t) to the samples Xk is performed by an analog-to-digital converter (A2D or ADC).
[Figure: sampling of sin(t) with T = π/5; the continuous signal X(t) and its samples Xk = X(kT) are plotted versus k.]
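To make the vector outcome in Example 6.1 concrete, here is a minimal simulation sketch; the routing rule and arrival probability follow the example, while the function and variable names are only illustrative.

import random

def packet_switch_trial(p=0.5, n_inputs=3, n_outputs=3):
    """One realization of Example 6.1: each input port receives a packet
    with probability p, and each packet is relayed to an output port
    chosen uniformly at random. Returns X = (X1, X2, X3), the number of
    packets destined for each output port."""
    counts = [0] * n_outputs
    for _ in range(n_inputs):
        if random.random() < p:                 # Bernoulli arrival at this input port
            dest = random.randrange(n_outputs)  # uniform routing choice
            counts[dest] += 1
    return tuple(counts)

# A few sample values of the vector random variable X
print([packet_switch_trial() for _ in range(5)])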
Events and Joint Probabilities
For the n-dimensional random variable X, many events of interest have the product form:
A = {X1 ∈ A1} ∩ {X2 ∈ A2} ∩ ··· ∩ {Xn ∈ An}
The joint cdf is defined for random variables of discrete, continuous, and mixed type.
A joint marginal cdf can be obtained by letting the variables that are not of interest go to +∞:
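The standard definitions behind the two statements above, written out here since the original formulas did not survive extraction (the two-component marginal is just one illustrative case):
\[
F_{\mathbf{X}}(x_1,\dots,x_n) = P[X_1 \le x_1,\dots,X_n \le x_n],
\qquad
F_{X_1,X_2}(x_1,x_2) = \lim_{x_3,\dots,x_n \to \infty} F_{\mathbf{X}}(x_1,x_2,x_3,\dots,x_n)
\]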
Joint pdf: Continuous RV
X1, ..., Xn are said to be jointly continuous random variables if the probability of any n-dimensional event A is given by an n-dimensional integral of a joint pdf, i.e., if a joint pdf exists:
The marginal pdf for the pair X1 and X3 is found by integrating the joint pdf over x2
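Written out for the three-variable case the slide refers to (standard formulas, reproduced here because the originals did not survive extraction):
\[
P[\mathbf{X} \in A] = \int \cdots \int_{\mathbf{x} \in A} f_{\mathbf{X}}(x_1,\dots,x_n)\, dx_1 \cdots dx_n,
\qquad
f_{X_1,X_3}(x_1,x_3) = \int_{-\infty}^{\infty} f_{X_1,X_2,X_3}(x_1,x_2,x_3)\, dx_2
\]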
6.1.3 Independence
RVs X1, ..., Xn are independent if and only if their joint cdf (equivalently, their joint pdf or pmf) factorizes into the product of the marginals:
Example: a joint pdf that is the product of n one-dimensional Gaussian N(0, 1) pdfs factors in this way, so the components are independent.
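The factorization condition and the Gaussian example just mentioned, written out in the standard form:
\[
F_{\mathbf{X}}(x_1,\dots,x_n) = \prod_{i=1}^{n} F_{X_i}(x_i),
\qquad
f_{\mathbf{X}}(x_1,\dots,x_n) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-x_i^2/2}
\]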
6.2 Functions of Several RVs
One Function of Several RVs: let Z = g(X1, X2, ..., Xn) be a random variable. The cdf of Z is found by
F_Z(z) = P[Z ≤ z] = P[{x : g(x) ≤ z}],
and the pdf of Z is found by differentiating F_Z(z).
Suppose X1, ..., Xn are independent with the same marginal cdf F_X, so that
P[Xi ≤ x] = F_X(x), P[Xi > x] = 1 − F_X(x), ∀i ∈ {1, 2, ..., n}.
Then,
F_W(w) = P[max(X1, ..., Xn) ≤ w] = P[X1 ≤ w] ··· P[Xn ≤ w] = {F_X(w)}^n
F_Z(z) = P[min(X1, ..., Xn) ≤ z] = P[X1 ≤ z ∪ ··· ∪ Xn ≤ z]
= P[(X1 > z ∩ ··· ∩ Xn > z)^c] = 1 − P[X1 > z ∩ ··· ∩ Xn > z]
= 1 − P[X1 > z] ··· P[Xn > z] = 1 − {1 − F_X(z)}^n
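A quick Monte Carlo check of the two formulas above; the Exp(1) choice of F_X, the value of n, and the evaluation point w are arbitrary illustration values.

import random, math

n, trials, w = 4, 200_000, 1.0       # illustration values only
F = lambda x: 1 - math.exp(-x)       # cdf of Exp(1), playing the role of F_X

count_max = count_min = 0
for _ in range(trials):
    xs = [random.expovariate(1.0) for _ in range(n)]
    count_max += max(xs) <= w
    count_min += min(xs) <= w

print("P[max <= w]:", count_max / trials, "theory:", F(w) ** n)
print("P[min <= w]:", count_min / trials, "theory:", 1 - (1 - F(w)) ** n)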
Max and Min of Two Random Variables
Z = max(X, Y) = X if X ≥ Y, and Y if X < Y.
For the minimum W = min(X, Y) of two i.i.d. random variables with common cdf F_X:
F_W(w) = 1 − P[W > w] = 1 − P[X > w, Y > w]
= 1 − P[X > w] P[Y > w]
= 1 − [P(X > w)]²
= 1 − [1 − F_X(w)]²
Example 6.12
6.3 Expected Values of Vector RVs
The expected value of a function g(X) = g(X1, ..., Xn) of a vector random variable X = (X1, ..., Xn) is given, for jointly continuous RVs, by
E[g(X)] = ∫ ··· ∫ g(x1, ..., xn) f_X(x1, ..., xn) dx1 ··· dxn
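As a small numeric illustration of this formula, the sketch below estimates E[g(X1, X2)] by Monte Carlo for the arbitrary choice g(x1, x2) = max(x1, x2) with independent Uniform(0, 1) components, whose exact expected value is 2/3.

import random

trials = 200_000
vals = [max(random.random(), random.random()) for _ in range(trials)]
print("Monte Carlo E[max(X1, X2)]:", sum(vals) / trials, "(exact: 2/3)")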
Covariance Matrix
Diagonal Matrix
The inverse of a diagonal matrix is obtained by replacing each diagonal element with its reciprocal.
The determinant of a diagonal matrix is the product of its diagonal values.
With a diagonal covariance matrix, the quadratic form (x − m)^T K⁻¹ (x − m) becomes:
([x1, x2] − [m1, m2]) · diag(1/σ1², 1/σ2²) · ([x1, x2] − [m1, m2])^T
= [x1 − m1, x2 − m2] · diag(1/σ1², 1/σ2²) · [x1 − m1, x2 − m2]^T
= (x1 − m1)²/σ1² + (x2 − m2)²/σ2²
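A small numeric sketch of these diagonal-matrix facts and of the quadratic form above; the means, variances, and the evaluation point are made-up illustration values.

import numpy as np

m = np.array([1.0, 2.0])           # illustrative means m1, m2
sigma2 = np.array([4.0, 0.25])     # illustrative variances sigma1^2, sigma2^2
K = np.diag(sigma2)                # diagonal covariance matrix

K_inv = np.diag(1.0 / sigma2)      # inverse: reciprocals on the diagonal
det_K = np.prod(sigma2)            # determinant: product of the diagonal values
print(np.allclose(K_inv, np.linalg.inv(K)), np.isclose(det_K, np.linalg.det(K)))

x = np.array([2.0, 1.5])           # an arbitrary point (x1, x2)
quad = (x - m) @ K_inv @ (x - m)   # (x - m)^T K^{-1} (x - m)
print(quad, ((x - m) ** 2 / sigma2).sum())   # equals sum of (xi - mi)^2 / sigma_i^2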
MAP and ML Estimators
x_MAP = arg max_x f_X(x|y) = arg max_x f_Y(y|x) f_X(x) / f_Y(y)
x_ML = arg max_x f_Y(y|x)
Ex 6.25 ML vs. MAP Estimators
Let X and Y be the random pair in Example 5.16. Find the MAP and ML
estimators for X in terms of Y .
f_Y(y|x) = e^(−y) / (1 − e^(−x)), 0 ≤ y ≤ x
For fixed y this decreases as x grows, so the likelihood is maximized at the smallest admissible x: X̂_ML = y.
f_X(x|y) = e^(−(x−y)), x ≥ y
Plot of f_X(x|y) for y = 2 over x ∈ [2, 20]: the maximum occurs at x = 2, which is the MAP estimate, so X̂_MAP = y.
Example: MAP Estimate
Let X be a continuous random variable with the following pdf (prior density):
f_X(x) = 2x for 0 ≤ x ≤ 1, and 0 otherwise.
The conditional likelihood of Y given X is f_Y(y|x) = x(1 − x)^(y−1), y = 1, 2, 3, ...
The MAP estimate of X lies in [0, 1] and is the value that maximizes f_Y(y|x) f_X(x) = 2x²(1 − x)^(y−1); with y = 3 this is 2x²(1 − x)².
We can find the maximum by differentiating with respect to x and setting the derivative equal to zero, which gives x_MAP = 1/2.
Example: ML Estimate
Let X be a continuous random variable with the following pdf (prior density):
f_X(x) = 2x for 0 ≤ x ≤ 1, and 0 otherwise.
The conditional likelihood of Y given X is f_Y(y|x) = x(1 − x)^(y−1), y = 1, 2, 3, ...
The ML estimate maximizes the likelihood alone; with y = 3 this is x(1 − x)².
We can find the maximum by differentiating with respect to x and setting the derivative equal to zero:
d/dx [x(1 − x)²] = d/dx [x − 2x² + x³] = 1 − 4x + 3x² = 0  ⇒  3x² − 4x + 1 = 0  ⇒  x ∈ {1, 1/3}
2nd derivative = 6x − 4 = +2 at x = 1 (a minimum) and −2 at x = 1/3 (a maximum)  ⇒  x_ML = 1/3
Comparison: the MAP estimate maximizes 2x²(1 − x)², while the ML estimate maximizes x(1 − x)².
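A brute-force numeric check of the two maximizations above (the grid resolution is an arbitrary choice):

# Grid search over x in (0, 1) for the MAP and ML objectives with y = 3
xs = [i / 100000 for i in range(1, 100000)]
map_obj = lambda x: 2 * x**2 * (1 - x)**2    # prior times likelihood
ml_obj = lambda x: x * (1 - x)**2            # likelihood alone
print("x_MAP ~", max(xs, key=map_obj))       # approx. 0.5
print("x_ML  ~", max(xs, key=ml_obj))        # approx. 0.3333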
6.5.2 Minimum MSE Linear Estimator
The estimate for X is given by a function of the observation, X̂ = g(Y), and the estimation error (usually non-zero) is
X − X̂ = X − g(Y).
The cost C(·) is a function of the error, and our goal is to minimize the expected value of the cost (expected, because the error is random).
When X and Y are continuous RVs, we often use the mean square error (MSE) as the cost function:
e = E[(X − X̂)²] = E[(X − g(Y))²]
g(Y) = a (estimation by a constant)
a* = arg min_a E[(X − a)²] = arg min_a {E[X²] − 2aE[X] + a²}
= arg min_a {a² − 2aE[X]}   (keeping only the terms that involve a)
d/da (a² − 2aE[X]) = 2a − 2E[X] = 0  ⇒  a* = E[X]   (MMSE estimate)
Minimum mean-square error:
e* = E[(X − a*)²] = E[(X − E[X])²] = VAR[X]
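A quick Monte Carlo illustration that the best constant is the mean and that its MSE is the variance; the N(3, 2²) choice for X is arbitrary.

import random

xs = [random.gauss(3.0, 2.0) for _ in range(100_000)]    # X ~ N(3, 2^2), illustrative
mean = sum(xs) / len(xs)
mse = lambda a: sum((x - a) ** 2 for x in xs) / len(xs)
# MSE at the sample mean vs. at two other constants
print(mse(mean), mse(2.0), mse(4.0))   # the first value (close to VAR[X] = 4) is smallest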
Min. MSE Estimator: Linear Case
g(Y) = aY + b
(a*, b*) = arg min_{a,b} E[(X − aY − b)²]
This can be viewed as approximating X − aY by the constant b.
From the constant case, the best value of b is b* = E[X − aY], i.e.
b* = E[X] − aE[Y]  ⇒
a* = arg min_a E[(X − aY − E[X] + aE[Y])²]
= arg min_a E[{(X − E[X]) − a(Y − E[Y])}²]   (solve for the best a)
Take the derivative w.r.t. a, equate it to 0, then
a* = COV(X, Y) / VAR(Y) = ρ_{X,Y} σ_X / σ_Y
X̂ = a*Y + b* = ρ_{X,Y} σ_X (Y − E[Y]) / σ_Y + E[X]   (linear estimator)
Minimum MSE:  e*_L = VAR(X)(1 − ρ²_{X,Y})
X̂ = ρ_{X,Y} σ_X (Y − E[Y]) / σ_Y + E[X]   (this form ensures the estimator has the correct mean, E[X̂] = E[X])
If X and Y are uncorrelated (ρ_{X,Y} = 0), then the best linear estimate for X is E[X].
If X and Y are linearly related (ρ_{X,Y} = ±1), then the best estimate for X is ±σ_X (Y − E[Y]) / σ_Y + E[X].
Orthogonality Condition
a* = arg min_a E[{(X − E[X]) − a(Y − E[Y])}²]
E[{(X − E[X]) − a*(Y − E[Y])}(Y − E[Y])] = 0
The error of the best linear estimator (the quantity inside the braces) is orthogonal to the observation (Y − E[Y]).
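A simulation sketch tying the last few slides together: it estimates a* and b* from correlated samples, compares them with the closed-form coefficients, and checks the orthogonality condition. The joint model for (X, Y) is an arbitrary illustrative choice.

import random

# Correlated samples: X hidden, Y = X + noise (illustrative model only)
n = 200_000
X = [random.gauss(1.0, 2.0) for _ in range(n)]
Y = [x + random.gauss(0.0, 1.0) for x in X]

mean = lambda v: sum(v) / len(v)
mX, mY = mean(X), mean(Y)
cov = mean([(x - mX) * (y - mY) for x, y in zip(X, Y)])
varY = mean([(y - mY) ** 2 for y in Y])

a = cov / varY      # a* = COV(X, Y) / VAR(Y)
b = mX - a * mY     # b* = E[X] - a* E[Y]
print("a*, b*:", a, b)   # for this model the theoretical values are a* = 0.8, b* = 0.2

# Orthogonality: the estimation error is orthogonal to (Y - E[Y])
err_dot_obs = mean([((x - mX) - a * (y - mY)) * (y - mY) for x, y in zip(X, Y)])
print("E[error * (Y - E[Y])]:", err_dot_obs)   # close to 0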