Lecture 3 (2024-03-18)

Conditional Probability and Conditional Expectation
Introduction
One of the most useful concepts in probability theory is that of conditional probability and conditional expectation. First, in practice we are often interested in calculating probabilities and expectations when some partial information is available; hence, the desired probabilities and expectations are conditional ones. Second, in calculating a desired probability or expectation it is often extremely useful to first "condition" on some appropriate random variable.

Recall that for any two events E and F, the conditional probability of E given F is defined, as long as P(F) > 0, by
$$P(E \mid F) = \frac{P(EF)}{P(F)}$$
The Discrete Case
If $X$ and $Y$ are discrete random variables, then it is natural to define the conditional probability mass function of $X$ given that $Y = y$, for all values of $y$ such that $P\{Y = y\} > 0$, by
$$p_{X|Y}(x \mid y) = P\{X = x \mid Y = y\} = \frac{P\{X = x, Y = y\}}{P\{Y = y\}} = \frac{p_{XY}(x, y)}{p_Y(y)}$$

Similarly, the conditional probability distribution function of $X$ given that $Y = y$ is defined, for all values of $y$ such that $P\{Y = y\} > 0$, by
$$F_{X|Y}(x \mid y) = P\{X \le x \mid Y = y\} = \sum_{a \le x} p_{X|Y}(a \mid y)$$

The conditional expectation of $X$ given that $Y = y$ is defined by
$$E[X \mid Y = y] = \sum_x x\, P\{X = x \mid Y = y\} = \sum_x x\, p_{X|Y}(x \mid y)$$

If $X$ is independent of $Y$, then the conditional pmf reduces to the unconditional one:
$$p_{X|Y}(x \mid y) = \frac{P\{X = x, Y = y\}}{P\{Y = y\}} = \frac{P\{X = x\}\,P\{Y = y\}}{P\{Y = y\}} = P\{X = x\}$$
Example 1
Example 3.1 Suppose that $p(x, y)$, the joint probability mass function of $X$ and $Y$, is given by $p(1,1) = 0.5$, $p(1,2) = 0.1$, $p(2,1) = 0.1$, $p(2,2) = 0.3$. Calculate the conditional probability mass function of $X$ given that $Y = 1$.

$p_Y(1) = p(1,1) + p(2,1) = 0.6$

$p_{X|Y}(1 \mid 1) = p(1,1)/p_Y(1) = 5/6$

$p_{X|Y}(2 \mid 1) = p(2,1)/p_Y(1) = 1/6$
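As a quick sanity check of these numbers, here is a minimal Python sketch (the dictionary is just the joint pmf table above):

```python
# Conditional pmf of X given Y = 1, from the joint pmf of Example 3.1.
p = {(1, 1): 0.5, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.3}

p_y1 = sum(v for (x, y), v in p.items() if y == 1)        # p_Y(1) = 0.6
p_x_given_y1 = {x: p[(x, 1)] / p_y1 for x in (1, 2)}      # {1: 5/6, 2: 1/6}
print(p_y1, p_x_given_y1)
```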
Example 2
Example 3.2 If $X_1$ and $X_2$ are independent binomial random variables with respective parameters $(n_1, p)$ and $(n_2, p)$, calculate the conditional probability mass function of $X_1$ given that $X_1 + X_2 = m$.

Hypergeometric distribution
 The number of blue balls chosen when a sample of $m$ balls is randomly drawn from an urn that contains $n_1$ blue and $n_2$ red balls; a numerical check follows below.
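Conditioning shows $P\{X_1 = k \mid X_1 + X_2 = m\} = \binom{n_1}{k}\binom{n_2}{m-k}\big/\binom{n_1+n_2}{m}$, the hypergeometric pmf. A small numerical check (the parameter values are arbitrary):

```python
from scipy.stats import binom, hypergeom

n1, n2, p, m = 5, 7, 0.3, 4   # arbitrary illustrative parameters

for k in range(m + 1):
    # Direct conditioning, using that X1 + X2 is binomial(n1 + n2, p).
    direct = binom.pmf(k, n1, p) * binom.pmf(m - k, n2, p) / binom.pmf(m, n1 + n2, p)
    # Hypergeometric: m draws from n1 + n2 balls, n1 of them blue.
    assert abs(direct - hypergeom.pmf(k, n1 + n2, n1, m)) < 1e-12
```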
Example 3
Example 3.3 If $X$ and $Y$ are independent Poisson random variables with respective means $\lambda_1$ and $\lambda_2$, calculate the conditional expected value of $X$ given that $X + Y = n$.
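For reference, the standard calculation: since $X + Y$ is Poisson with mean $\lambda_1 + \lambda_2$,
$$P\{X = k \mid X + Y = n\} = \frac{P\{X = k\}\,P\{Y = n-k\}}{P\{X + Y = n\}} = \binom{n}{k}\Big(\frac{\lambda_1}{\lambda_1+\lambda_2}\Big)^{k}\Big(\frac{\lambda_2}{\lambda_1+\lambda_2}\Big)^{n-k}$$
so $X$ given $X + Y = n$ is binomial with parameters $(n, \lambda_1/(\lambda_1+\lambda_2))$, and $E[X \mid X + Y = n] = n\lambda_1/(\lambda_1+\lambda_2)$.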
Example 2 of Probability on Lect. 1
Let A1 (A0) and B1 (B0) be the event that 1 (0) is sent and the event that 1 (0) is received, respectively.
Assumption
P(A0) = 0.8, P(A1) = 1 − P(A0) = 0.2
The probability of error is p = P(B1|A0) = P(B0|A1) = 0.1

[Channel diagram: 0 → 0 and 1 → 1 each with probability 0.9; 0 → 1 and 1 → 0 each with probability 0.1.]

Find
The error probability at the receiver
The probability that 1 is sent when the receiver decides 1.
Answer: Example 2 on Lect. 1
Parameters
P(A0) = 0.8, P(A1) = 1 − P(A0) = 0.2, P(B1|A0) = P(B0|A1) = 0.1

The error probability at the receiver:
$$P(\text{error}) = P(A_0 B_1) + P(A_1 B_0) = P(B_1 \mid A_0)P(A_0) + P(B_0 \mid A_1)P(A_1) = 0.1 \times 0.8 + 0.1 \times 0.2 = 0.1$$

The probability that 1 was sent given that the receiver decides 1 (Bayes' formula):
$$P(A_m \mid B) = \frac{P(B \mid A_m)P(A_m)}{P(B)} = \frac{P(B \mid A_m)P(A_m)}{\sum_{n=1}^{N} P(B \mid A_n)P(A_n)}$$
$$P(A_1 \mid B_1) = \frac{P(B_1 \mid A_1)P(A_1)}{P(B_1 \mid A_1)P(A_1) + P(B_1 \mid A_0)P(A_0)} = \frac{0.9 \times 0.2}{0.9 \times 0.2 + 0.1 \times 0.8} \approx 0.69$$
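The same numbers in a few lines of Python (a direct transcription of the formulas above):

```python
# Binary channel: priors and crossover probability from the example.
p_a0, p_a1, p_err = 0.8, 0.2, 0.1

p_error = p_err * p_a0 + p_err * p_a1       # total error probability: 0.1
p_b1 = (1 - p_err) * p_a1 + p_err * p_a0    # P(B1) by total probability
print(p_error, (1 - p_err) * p_a1 / p_b1)   # posterior P(A1|B1) ~ 0.692
```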
The Continuous Case
If X and Y have a joint probability density function f(x, y), then the conditional probability density function of X, given that Y = y, is defined for all values of y such that $f_Y(y) > 0$, by
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$$

To motivate this definition, multiply the left side by dx and the right side by (dx dy)/dy to get
$$f_{X|Y}(x \mid y)\,dx = \frac{f(x, y)\,dx\,dy}{f_Y(y)\,dy} \approx \frac{P\{x \le X \le x + dx,\; y \le Y \le y + dy\}}{P\{y \le Y \le y + dy\}} = P\{x \le X \le x + dx \mid y \le Y \le y + dy\}$$
In other words, for small values of dx and dy, $f_{X|Y}(x \mid y)\,dx$ is approximately the conditional probability that X is between x and x + dx given that Y is between y and y + dy.
The conditional expectation of X, given that Y = y, is defined for all values of y such that $f_Y(y) > 0$, by
$$E[X \mid Y = y] = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\, dx$$
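As an illustration of the mechanics, a sympy sketch for a hypothetical joint density $f(x, y) = x + y$ on the unit square (this density is not one of the book's examples):

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = x + y                                  # hypothetical joint density on (0,1)^2

f_y = sp.integrate(f, (x, 0, 1))           # marginal f_Y(y) = y + 1/2
f_x_given_y = f / f_y                      # conditional density f_{X|Y}(x|y)
e_x_given_y = sp.integrate(x * f_x_given_y, (x, 0, 1))
print(sp.simplify(e_x_given_y))            # E[X|Y=y] = (3*y + 2)/(6*y + 3)
```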
Example 1
Example 3.6 Suppose the joint density of X and Y is given by

Compute the conditional expectation of X given that Y = y, where 0 < y < 1.
Example 2
Example 3.7 Suppose the joint density of X and Y is given by

Compute $E[X \mid Y = y]$.


Example 3
Example 3.8 The joint density of X and Y is given by

What is $E[e^{X/2} \mid Y = 1]$?
Computing Expectations by
Conditioning
Let us denote by $E[X \mid Y]$ the function of the random variable $Y$ whose value at $Y = y$ is $E[X \mid Y = y]$.
An extremely important property of conditional expectation is that for all random variables X and Y,
$$E[X] = E\big[E[X \mid Y]\big]$$
Discrete RV: $E[X] = E\big[E[X \mid Y]\big] = \sum_y E[X \mid Y = y]\, P\{Y = y\}$
Continuous RV: $E[X] = E\big[E[X \mid Y]\big] = \int_{-\infty}^{\infty} E[X \mid Y = y]\, f_Y(y)\, dy$
Proof?
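For the discrete case, a short verification:
$$\sum_y E[X \mid Y = y]\, P\{Y = y\} = \sum_y \sum_x x\, P\{X = x \mid Y = y\}\, P\{Y = y\} = \sum_x x \sum_y P\{X = x, Y = y\} = \sum_x x\, P\{X = x\} = E[X]$$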

Compound random variable
 A random variable of the form $\sum_{i=1}^{N} X_i$: the sum of a random number $N$ of independent and identically distributed random variables that are also independent of $N$. The expected value of a compound random variable is $E[X]E[N]$. See the examples, and the simulation sketch below.
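A quick Monte Carlo check of $E\big[\sum_{i=1}^{N} X_i\big] = E[X]E[N]$; the particular distributions below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
trials = 100_000

# N ~ Poisson(4); X_i ~ Exponential(mean 2), i.i.d. and independent of N.
n = rng.poisson(4.0, size=trials)
totals = np.array([rng.exponential(2.0, size=k).sum() for k in n])

print(totals.mean())   # close to E[N] * E[X] = 4 * 2 = 8
```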
Example 1
Example 3.10 Sam will read either one chapter of his
probability book or one chapter of his history book. If the
number of misprints in a chapter of his probability book is
Poisson distributed with mean 2 and if the number of misprints
in his history chapter is Poisson distributed with mean 5, then
assuming Sam is equally likely to choose either book, what is
the expected number of misprints that Sam will come across?
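Conditioning on the chosen book $B$: $E[\text{misprints}] = E[X \mid B = \text{probability}]\cdot\tfrac{1}{2} + E[X \mid B = \text{history}]\cdot\tfrac{1}{2} = 2\cdot\tfrac{1}{2} + 5\cdot\tfrac{1}{2} = 3.5$.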
Example 2
Example 3.11 (The Expectation of the Sum of a Random
Number of Random Variables) Suppose that the expected
number of accidents per week at an industrial plant is four.
Suppose also that the numbers of workers injured in each
accident are independent random variables with a common
mean of 2. Assume also that the number of workers injured in
each accident is independent of the number of accidents that
occur. What is the expected number of injuries during a week?
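By the compound-variable result above, $E\big[\sum_{i=1}^{N} X_i\big] = E[N]E[X] = 4 \times 2 = 8$ expected injuries per week.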
Example 3
Example 3.12 (The Mean of a Geometric Distribution) A coin,
having probability p of coming up heads, is to be successively
flipped until the first head appears. What is the expected
number of flips required?
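Conditioning on the outcome of the first flip: $E[N] = p \cdot 1 + (1-p)(1 + E[N])$, which gives $E[N] = 1/p$.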
Example 4
Example 3.15 Independent trials, each of which is a success
with probability p, are performed until there are k consecutive
successes. What is the mean number of necessary trials?
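Sketch: writing $N_k$ for the number of trials needed for $k$ consecutive successes, conditioning on the trial after the first $k-1$ consecutive successes gives $E[N_k] = E[N_{k-1}] + 1 + (1-p)E[N_k]$, i.e. $E[N_k] = (E[N_{k-1}] + 1)/p$. With $E[N_1] = 1/p$ this yields $E[N_k] = \sum_{i=1}^{k} p^{-i}$.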
Computing Variances by
Conditioning
Conditional expectations can also be used to compute the variance of a random variable. Specifically, we can use
$$\mathrm{Var}(X) = E[X^2] - \big(E[X]\big)^2$$
and then use conditioning to obtain both $E[X]$ and $E[X^2]$.

Example 3.18 (Variance of the Geometric Random Variable)


Independent trials, each resulting in a success with
probability p, are performed in sequence. Let N be the trial
number of the first success. Find Var(N).
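For reference: conditioning on the first trial gives $E[N] = 1/p$ and $E[N^2] = 1 + \frac{2(1-p)}{p} + (1-p)E[N^2]$, hence $E[N^2] = (2-p)/p^2$ and $\mathrm{Var}(N) = (1-p)/p^2$.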
Computing Variances by
Conditioning
Another way to use conditioning to obtain the variance of a random variable is to apply the conditional variance formula. The conditional variance of X given that Y = y is defined by
$$\mathrm{Var}(X \mid Y = y) = E\big[(X - E[X \mid Y = y])^2 \mid Y = y\big]$$
That is, the conditional variance is defined in exactly the same manner as the ordinary variance, with the exception that all probabilities are determined conditional on the event that Y = y. Expanding the right side of the preceding and taking expectations term by term yields
$$\mathrm{Var}(X \mid Y = y) = E[X^2 \mid Y = y] - \big(E[X \mid Y = y]\big)^2$$

Letting Var(X|Y) denote the function of Y whose value when Y = y is Var(X|Y = y), we have the following result.

Proposition 3.1 (The Conditional Variance Formula)
$$\mathrm{Var}(X) = E\big[\mathrm{Var}(X \mid Y)\big] + \mathrm{Var}\big(E[X \mid Y]\big)$$
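A Monte Carlo sanity check of Proposition 3.1 on a simple hierarchical model; the model (Y Poisson, X normal around Y) is an arbitrary illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

y = rng.poisson(3.0, size=n)          # Y ~ Poisson(3)
x = rng.normal(loc=y, scale=1.0)      # X | Y = y ~ Normal(y, 1)

# E[X|Y] = Y and Var(X|Y) = 1, so the formula predicts
# Var(X) = E[Var(X|Y)] + Var(E[X|Y]) = 1 + 3 = 4.
print(x.var())                        # close to 4
```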
Computing Variances by
Conditioning
Example 3.19 (The Variance of a Compound Random Variable) Let $X_1, X_2, \ldots$ be independent and identically distributed random variables with distribution F having mean $\mu$ and variance $\sigma^2$, and assume that they are independent of the nonnegative integer-valued random variable N. As noted in Example 3.11, where its expected value was determined, the random variable $S = \sum_{i=1}^{N} X_i$ is called a compound random variable. Find its variance.
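Sketch: with $E[S \mid N] = N\mu$ and $\mathrm{Var}(S \mid N) = N\sigma^2$, the conditional variance formula gives
$$\mathrm{Var}(S) = E[N]\,\sigma^2 + \mu^2\,\mathrm{Var}(N)$$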
Computing Probabilities by
Conditioning
Not only can we obtain expectations by first conditioning on
an appropriate random variable, but we may also use this
approach to compute probabilities.

To see this, let E denote an arbitrary event and define the indicator random variable X by
$$X = \begin{cases} 1, & \text{if } E \text{ occurs} \\ 0, & \text{if } E \text{ does not occur} \end{cases}$$

It follows from the definition of X that
$$E[X] = P(E)$$
$$E[X \mid Y = y] = P(E \mid Y = y), \quad \text{for any random variable } Y$$

Therefore, we obtain
$$P(E) = \sum_y P(E \mid Y = y)\, P\{Y = y\}, \quad \text{if } Y \text{ is discrete}$$
$$P(E) = \int_{-\infty}^{\infty} P(E \mid Y = y)\, f_Y(y)\, dy, \quad \text{if } Y \text{ is continuous}$$
Examples
Example 3.21 Suppose that X and Y are independent
continuous random variables having densities fX and fY,
respectively. Compute P{X < Y}.
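Conditioning on Y gives $P\{X < Y\} = \int_{-\infty}^{\infty} P\{X < y\}\, f_Y(y)\, dy = \int_{-\infty}^{\infty} F_X(y)\, f_Y(y)\, dy$.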

Example 3.22 An insurance company supposes that the


number of accidents that each of its policyholders will have
in a year is Poisson distributed, with the mean of the Poisson
depending on the policyholder. If the Poisson mean of a
randomly chosen policyholder has a gamma distribution with
density function
$$g(\lambda) = \lambda e^{-\lambda}, \quad \lambda \ge 0$$
what is the probability that a randomly chosen policyholder
has exactly n accidents next year?
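Conditioning on the policyholder's Poisson mean $\lambda$:
$$P\{n \text{ accidents}\} = \int_0^{\infty} e^{-\lambda}\frac{\lambda^n}{n!}\,\lambda e^{-\lambda}\, d\lambda = \frac{1}{n!}\int_0^{\infty} \lambda^{n+1} e^{-2\lambda}\, d\lambda = \frac{(n+1)!}{n!\,2^{n+2}} = \frac{n+1}{2^{n+2}}$$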
Examples
Example 3.28 Let $U_1, U_2, \ldots$ be a sequence of independent uniform (0, 1) random variables, and let
$$N = \min\{n \ge 2 : U_n > U_{n-1}\} \quad \text{and} \quad M = \min\{n \ge 2 : U_1 + \cdots + U_n > 1\}$$
That is, N is the index of the first uniform random variable that is larger than its immediate predecessor, and M is the number of uniform random variables we need to sum to exceed 1.
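A sketch of the standard answer: $P\{N > n\} = P\{U_1 \ge U_2 \ge \cdots \ge U_n\} = 1/n!$, and likewise $P\{M > n\} = P\{U_1 + \cdots + U_n \le 1\} = 1/n!$, so N and M have the same distribution and $E[N] = E[M] = \sum_{n \ge 0} 1/n! = e$.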

Example 3.29 Let $X_1, X_2, \ldots$ be independent continuous random variables with a common distribution function F and density f = F′, and suppose that they are to be observed one at a time in sequence. Let
$$N = \min\{n \ge 2 : X_n = \text{second largest of } X_1, \ldots, X_n\}$$
$$M = \min\{n \ge 2 : X_n = \text{second smallest of } X_1, \ldots, X_n\}$$
Which random variable tends to be larger: $X_N$, the first random variable which when observed is the second largest of those that have been seen, or $X_M$, the first one that on observation is the second smallest of those that have been seen?
