Lectur
1 Conditional Distribution
Consider the probability space (Ω, ℱ, P) and an event B ∈ ℱ such that P(B) > 0. Then the conditional probability of any event A ∈ ℱ given the event B was defined as
\[ P(A \mid B) = \frac{P(A \cap B)}{P(B)}. \]
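As a minimal sketch of this definition, the following Python snippet computes P(A | B) on a finite sample space. The die, the uniform measure, and the events `A` and `B` are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

# Hypothetical finite sample space: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Uniform probability measure on omega (an assumption for this sketch)."""
    return Fraction(len(event & omega), len(omega))

def cond_prob(a, b):
    """P(A | B) = P(A ∩ B) / P(B), defined only when P(B) > 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / prob(b)

A = {2, 4, 6}   # the outcome is even
B = {1, 2, 3}   # the outcome is at most 3
p_a_given_b = cond_prob(A, B)
print(p_a_given_b)  # 1/3
```

Exact rational arithmetic via `fractions.Fraction` avoids floating-point noise in such small examples.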
Consider two random variables X, Y defined on this probability space. For y ∈ ℝ such that F_Y(y) > 0, we can define the events A = X^{-1}(−∞, x] and B = Y^{-1}(−∞, y], such that
\[ P(\{X \le x\} \mid \{Y \le y\}) = \frac{F_{X,Y}(x, y)}{F_Y(y)}. \]
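The ratio of distribution functions can be checked on a small discrete example. The joint PMF below is a hypothetical choice made only for illustration.

```python
from fractions import Fraction

# Hypothetical joint PMF of (X, Y) on {0, 1} x {0, 1}; the values are illustrative.
joint_pmf = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def joint_cdf(x, y):
    """F_{X,Y}(x, y) = P{X <= x, Y <= y}."""
    return sum(p for (a, b), p in joint_pmf.items() if a <= x and b <= y)

def marginal_cdf_y(y):
    """F_Y(y) = P{Y <= y}."""
    return sum(p for (_, b), p in joint_pmf.items() if b <= y)

def cond_cdf(x, y):
    """P({X <= x} | {Y <= y}) = F_{X,Y}(x, y) / F_Y(y), requires F_Y(y) > 0."""
    fy = marginal_cdf_y(y)
    if fy == 0:
        raise ValueError("F_Y(y) must be positive")
    return joint_cdf(x, y) / fy

print(cond_cdf(0, 0))  # 1/3
```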
The key observation is that {Y ≤ y} is a non-trivial event. How do we define conditional expectation based on events such as {Y = y}? When the random variable Y is continuous, this event has zero probability measure.
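For a simple random variable Y with PMF P_Y and each y ∈ 𝒴 such that P_Y(y) > 0, the event {Y = y} has positive probability, and we define
\[ F_{X \mid Y=y}(x) \triangleq P(\{X \le x\} \mid \{Y = y\}) = \frac{P\{X \le x, Y = y\}}{P_Y(y)}, \quad \text{for all } x \in \mathbb{R}. \]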
Exercise 1.1. For a simple random variable Y ∈ 𝒴, show that the function F_{X|Y=y} conditioned on the event {Y = y} is a distribution.
Definition 1.2. For a simple random variable Y ∈ 𝒴, the distribution F_{X|Y=y} is called the conditional distribution of X given Y = y. The conditional distribution of X given Y, denoted by F_{X|Y} : Ω → [0, 1]^ℝ, is a measurable function of the random variable Y, and hence it is a random variable such that
\[ F_{X \mid Y} : \omega \mapsto F_{X \mid Y = Y(\omega)}. \]
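Example 1.3 (Conditional distribution). Consider a zero-mean Gaussian random variable N with variance σ², and another independent random variable Y ∈ {−1, 1} with PMF (1 − p, p) for some p ∈ [0, 1]. Let X = Y + N. Then the conditional distribution of X given the simple random variable Y is
\[ F_{X \mid Y} = F_{X \mid Y=-1} \, 1_{\{Y=-1\}} + F_{X \mid Y=1} \, 1_{\{Y=1\}}, \]
where F_{X|Y=µ}(x) is the distribution function of a Gaussian random variable with mean µ and variance σ², that is,
\[ F_{X \mid Y=\mu}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{x} e^{-\frac{(t-\mu)^2}{2\sigma^2}} \, dt. \]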
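A small numerical sketch of Example 1.3 follows. It evaluates the Gaussian distribution function through `math.erf` and, as an added illustration, averages the two conditional distributions over the PMF of Y to obtain the marginal of X by the law of total probability. The values of `sigma` and `p` are arbitrary assumptions.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """Distribution function of a Gaussian with mean mu and standard deviation sigma > 0."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def cond_cdf_x_given_y(x, y, sigma):
    """F_{X|Y=y}(x) for X = Y + N with N independent of Y: a Gaussian CDF centered at y."""
    return gaussian_cdf(x, mu=y, sigma=sigma)

def marginal_cdf_x(x, p, sigma):
    """F_X(x) = (1 - p) F_{X|Y=-1}(x) + p F_{X|Y=1}(x)."""
    return ((1.0 - p) * cond_cdf_x_given_y(x, -1, sigma)
            + p * cond_cdf_x_given_y(x, 1, sigma))

sigma, p = 1.0, 0.3   # illustrative values, not taken from the notes
print(cond_cdf_x_given_y(1.0, 1, sigma))   # 0.5, since 1 is the median given Y = 1
print(marginal_cdf_x(0.0, p, sigma))
```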
1.2 Conditional densities
When X, Y are jointly continuous random variables, there exists a joint density f_{X,Y}(x, y) for all (x, y) ∈ ℝ². For each y ∈ ℝ such that f_Y(y) > 0, we can define a function f_{X|Y=y} : ℝ → ℝ₊ such that
\[ f_{X \mid Y=y}(x) \triangleq \frac{f_{X,Y}(x, y)}{f_Y(y)}, \quad \text{for all } x \in \mathbb{R}. \]
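To illustrate the definition numerically, the sketch below assumes the hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square (not from the notes) and approximates the integrals with a midpoint rule. The helper `integrate` is written for this example only.

```python
def joint_density(x, y):
    """Hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(func, lo, hi, steps=1000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

def marginal_density_y(y):
    """f_Y(y), obtained by integrating the joint density over x."""
    return integrate(lambda x: joint_density(x, y), 0.0, 1.0)

def cond_density(x, y):
    """f_{X|Y=y}(x) = f_{X,Y}(x, y) / f_Y(y), requires f_Y(y) > 0."""
    fy = marginal_density_y(y)
    if fy <= 0.0:
        raise ValueError("f_Y(y) must be positive")
    return joint_density(x, y) / fy

y0 = 0.25
total_mass = integrate(lambda x: cond_density(x, y0), 0.0, 1.0, steps=200)
print(total_mass)  # close to 1, as expected of a density in x
```

For this density, f_Y(y) = 1/2 + y in closed form, so f_{X|Y=y}(x) = (x + y)/(1/2 + y) on [0, 1].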
Exercise 1.4. For continuous random variables X, Y, show that the function f_{X|Y=y} is a density of the continuous random variable X for each y ∈ ℝ such that f_Y(y) > 0.
Definition 1.5. The conditional density of X given Y for continuous random variables X, Y is defined as a measurable function of the random variable Y, and hence it is a random variable such that
\[ f_{X \mid Y} : \omega \mapsto f_{X \mid Y = Y(\omega)}. \]
2 Conditional Expectation
Since we have defined the conditional distribution and densities, we can define the conditional expectation given an event as an integration with respect to the conditional distribution given that event. In the following, we will assume that the random variables X, Y are defined on the same probability space (Ω, ℱ, P) and that E|X| < ∞, so that E[X] exists and is finite.
Definition 2.1 (Conditional expectation for conditioning on simple random variables). When Y is a simple random variable, we define the conditional expectation of X given the random variable Y as a measurable function of Y, denoted by E[X | Y] : Ω → ℝ, such that
\[ E[X \mid Y] : \omega \mapsto \int_{x \in \mathbb{R}} x \, dF_{X \mid Y = Y(\omega)}(x). \]
Remark 1. The random variable E[X | Y] takes the value E[X | Y = y] with probability P_Y(y) for all y ∈ 𝒴.
Lemma 2.2. For a simple random variable Y : Ω → 𝒴, the mean of the random variable E[X | Y] is E[X].
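One possible argument for Lemma 2.2 is sketched below. It is not taken from the notes, and it assumes that 𝒴 is finite with P_Y(y) > 0 for each y ∈ 𝒴, so that every F_{X|Y=y} is defined and the finite sum can be exchanged with the integral by linearity.

```latex
\begin{align*}
E\big[E[X \mid Y]\big]
  &= \sum_{y \in \mathcal{Y}} P_Y(y) \, E[X \mid Y = y]
   = \sum_{y \in \mathcal{Y}} P_Y(y) \int_{x \in \mathbb{R}} x \, dF_{X \mid Y=y}(x) \\
  &= \int_{x \in \mathbb{R}} x \, d\Big( \sum_{y \in \mathcal{Y}} P_Y(y) F_{X \mid Y=y}(x) \Big)
   = \int_{x \in \mathbb{R}} x \, dF_X(x) = E[X],
\end{align*}
% The last step uses the law of total probability:
% \sum_y P_Y(y) F_{X|Y=y}(x) = \sum_y P\{X \le x, Y = y\} = F_X(x).
```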
Example 2.3 (Conditional expectation). Consider a fair die being thrown, and let the random variable X take the value of the outcome of the experiment. That is, X ∈ {1, …, 6} with P[X = i] = 1/6 for i ∈ {1, …, 6}. Define another random variable Y = 1_{{X ≤ 3}}. Then the conditional expectation of X given Y is a random variable given by
\[ E[X \mid Y] = \begin{cases} E[X \mid Y = 1] = 2 & \text{w.p. } 0.5, \\ E[X \mid Y = 0] = 5 & \text{w.p. } 0.5. \end{cases} \]
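The die example can be reproduced with exact arithmetic. The sketch below computes P_Y, the two values of E[X | Y = y], and checks that the mean of E[X | Y] agrees with E[X]. All variable names are illustrative.

```python
from fractions import Fraction

pmf_x = {i: Fraction(1, 6) for i in range(1, 7)}   # fair die

def y_of(x):
    """Y = 1{X <= 3}."""
    return 1 if x <= 3 else 0

# P_Y(y) = sum of P{X = x} over the outcomes x mapped to y.
pmf_y = {}
for x, p in pmf_x.items():
    pmf_y[y_of(x)] = pmf_y.get(y_of(x), Fraction(0)) + p

# E[X | Y = y] = sum_x x P{X = x, Y = y} / P_Y(y).
cond_exp = {
    y: sum(x * p for x, p in pmf_x.items() if y_of(x) == y) / py
    for y, py in pmf_y.items()
}

mean_of_cond_exp = sum(pmf_y[y] * cond_exp[y] for y in pmf_y)
mean_x = sum(x * p for x, p in pmf_x.items())

print(cond_exp[1], cond_exp[0])    # 2 5
print(mean_of_cond_exp, mean_x)    # 7/2 7/2
```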
Lemma 2.4. For continuous random variables X, Y, the mean of the random variable E[X | Y] is E[X].
Proof. Since E[X | Y] is a function of the random variable Y, whose density is f_Y, we get
\[ E\big[E[X \mid Y]\big] = \int_{y \in \mathbb{R}} dy \, f_Y(y) \, E[X \mid Y = y] = \int_{y \in \mathbb{R}} dy \, f_Y(y) \int_{x \in \mathbb{R}} x \, f_{X \mid Y=y}(x) \, dx. \]
From the definition of the conditional density of X given Y = y, we get f_{X|Y=y}(x) f_Y(y) = f_{X,Y}(x, y). Interchanging the order of integration by Fubini's theorem, which applies since E|X| < ∞, and marginalizing over y by the law of total probability, we get
\[ E\big[E[X \mid Y]\big] = \int_{x \in \mathbb{R}} x \, dx \int_{y \in \mathbb{R}} f_{X,Y}(x, y) \, dy = \int_{x \in \mathbb{R}} x \, f_X(x) \, dx = E[X]. \]
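The two sides of this identity can be compared numerically. The sketch below assumes the hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square, for which E[X] = 7/12 in closed form, and uses a midpoint rule. It checks one example and is not a proof.

```python
def integrate(func, lo, hi, steps=400):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

def joint_density(x, y):
    """Hypothetical joint density on the unit square (assumption of this sketch)."""
    return x + y

def marginal_density_y(y):
    """f_Y(y) = integral of f_{X,Y}(x, y) over x."""
    return integrate(lambda x: joint_density(x, y), 0.0, 1.0)

def cond_expectation(y):
    """E[X | Y = y] = integral of x f_{X|Y=y}(x) over x."""
    fy = marginal_density_y(y)
    return integrate(lambda x: x * joint_density(x, y) / fy, 0.0, 1.0)

# Left side: integrate f_Y(y) E[X | Y = y] over y.
lhs = integrate(lambda y: marginal_density_y(y) * cond_expectation(y), 0.0, 1.0)
# Right side: integrate x f_X(x) over x, with f_X obtained by marginalizing over y.
rhs = integrate(
    lambda x: x * integrate(lambda y: joint_density(x, y), 0.0, 1.0), 0.0, 1.0
)
print(lhs, rhs)  # both approximate 7/12
```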
Example 2.6 (Indicator function). Let A ∈ ℱ be an event, then X = 1_A is a random variable and
\[ X^{-1}(-\infty, x] = \begin{cases} \Omega, & x \ge 1, \\ A^c, & x \in [0, 1), \\ \emptyset, & x < 0. \end{cases} \]
This implies that the smallest event space generated by this random variable is σ(X) = {∅, A, A^c, Ω}.
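The three cases can be observed on a finite sample space. In the sketch below, `omega` and `A` are hypothetical choices. Sweeping the threshold x yields only ∅, A^c and Ω, and closing under complements adds A.

```python
omega = frozenset(range(6))    # hypothetical finite sample space
A = frozenset({0, 2, 4})       # hypothetical event

def indicator(w):
    """X = 1_A evaluated at the outcome w."""
    return 1 if w in A else 0

def preimage_at_most(x):
    """X^{-1}(-inf, x] = {w in omega : X(w) <= x}."""
    return frozenset(w for w in omega if indicator(w) <= x)

# Thresholds covering the cases x < 0, x in [0, 1), and x >= 1.
thresholds = [-1.0, 0.0, 0.5, 1.0, 2.0]
preimages = {preimage_at_most(x) for x in thresholds}

# Closing under complements gives the four-element event space of the example.
sigma_x = preimages | {omega - s for s in preimages}
print(len(preimages), len(sigma_x))  # 3 4
```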
Example 2.7 (Simple random variables). Let X be a simple random variable, then X = ∑_{x∈𝒳} x 1_{A_x}, where (A_x = X^{-1}{x} ∈ ℱ : x ∈ 𝒳) is a finite partition of the sample space Ω. Without loss of generality, we can denote 𝒳 = {x_1, …, x_n} where x_1 ≤ … ≤ x_n. Then,
\[ X^{-1}(-\infty, x] = \begin{cases} \Omega, & x \ge x_n, \\ \bigcup_{j=1}^{i} A_{x_j}, & x \in [x_i, x_{i+1}), \ i \in [n-1], \\ \emptyset, & x < x_1. \end{cases} \]
Then the smallest event space generated by the simple random variable X is {∪_{x∈S} A_x : S ⊆ 𝒳}.
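As a sketch of this construction, the snippet below builds the partition (A_x : x ∈ 𝒳) for a hypothetical simple random variable on a six-point sample space and enumerates all unions of blocks. With n distinct values this gives 2^n events. The mapping `X` is an illustrative assumption.

```python
from itertools import combinations

omega = frozenset(range(6))                                # hypothetical sample space
X = {0: 1.0, 1: 1.0, 2: 2.5, 3: 2.5, 4: 2.5, 5: 4.0}       # hypothetical simple random variable

values = sorted(set(X.values()))                           # x_1 < ... < x_n
blocks = {x: frozenset(w for w in omega if X[w] == x) for x in values}   # A_x = X^{-1}{x}

def preimage_at_most(t):
    """X^{-1}(-inf, t] = {w in omega : X(w) <= t}."""
    return frozenset(w for w in omega if X[w] <= t)

def generated_event_space():
    """All unions of blocks A_x over subsets S of the value set."""
    events = set()
    for size in range(len(values) + 1):
        for subset in combinations(values, size):
            events.add(frozenset().union(*(blocks[x] for x in subset)))
    return events

sigma_x = generated_event_space()
print(len(sigma_x))                      # 8 = 2**3
print(preimage_at_most(3.0) in sigma_x)  # True
```

Each preimage X^{-1}(−∞, t] is a union of the blocks with value at most t, which is why it always lands in the enumerated collection.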