Binomial Random Variable Approximations and Conditional Probability Density Functions
Table 4.1 Tabulated values of erf(x) (table not reproduced here).
Since x₁ < 0, from Fig. 4.1(b) the above probability is given by
$$P(2475 \le X \le 2525) = \operatorname{erf}(x_2) - \operatorname{erf}(x_1) = \operatorname{erf}(x_2) + \operatorname{erf}(|x_1|) = 2\,\operatorname{erf}\!\left(\tfrac{5}{7}\right) = 0.516,$$
where we have used Table 4.1 (erf(0.7) = 0.258).
Fig. 4.1 The standard Gaussian density $\frac{1}{\sqrt{2\pi}}\,e^{-x^{2}/2}$ with the area between $x_1$ and $x_2$ shaded: (a) $x_1 > 0,\ x_2 > 0$; (b) $x_1 < 0,\ x_2 > 0$.
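As a rough numeric check of the computation above, here is a minimal Python sketch. It assumes the setup of the preceding example (not shown on this slide): a fair coin tossed n = 5000 times, so that np = 2500 and npq = 1250. Note that the erf(x) of these notes is Φ(x) − 1/2, not Python's standard `math.erf`.

```python
from math import erf, sqrt

def erf_table(x):
    """The notes' erf(x) = (1/sqrt(2*pi)) * int_0^x e^{-y^2/2} dy = Phi(x) - 1/2.
    Python's math.erf is the standard error function, so convert via
    Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * erf(x / sqrt(2.0))

# Assumed setup (from the preceding example, not shown on this slide):
n, p = 5000, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 2500, sigma ~ 35.36

x1 = (2475 - mu) / sigma                   # ~ -0.707
x2 = (2525 - mu) / sigma                   # ~ +0.707
prob = erf_table(x2) - erf_table(x1)       # = erf(x2) + erf(|x1|) by symmetry
print(round(prob, 4))                      # ~ 0.5205; the table value erf(0.7) = 0.258 gives 0.516
```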
4.2 The Poisson Approximation
As we have mentioned earlier, for large n, the Gaussian approximation of a binomial r.v. is valid only if p is fixed, i.e., only if np ≫ 1 and npq ≫ 1. What if np is small, or if it does not increase with n?
Obviously that is the case if, for example, p → 0 as n → ∞,
such that np = λ is a fixed number.
Many random phenomena in nature in fact follow this pattern: the total number of calls on a telephone line, the number of claims filed with an insurance company, etc., tend to exhibit this type of behavior. Consider random arrivals such as telephone calls
over a line. Let n represent the total number of calls in the
interval (0, T ). From our experience, as T → ∞ we have n → ∞
so that we may assume n = µT . Consider a small interval of
duration ∆ as in Fig. 4.2. If there is only a single call
coming in, the probability p of that single call occurring in
that interval must depend on its relative size with respect to
T.

Fig. 4.2 The interval (0, T) partitioned into subintervals 1, 2, …, n, each of duration ∆.
Hence we may assume
$$p = \frac{\Delta}{T}.$$
Note that p → 0 as T → ∞; however, np = µT · (∆/T) = µ∆ remains finite, so the limit below applies.
Thus
$$\lim_{n \to \infty,\; p \to 0,\; np = \lambda} P_n(k) = e^{-\lambda}\,\frac{\lambda^{k}}{k!}, \qquad (4\text{-}8)$$
since the finite products $\left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right)\cdots\left(1-\frac{k-1}{n}\right)$ as well as $\left(1-\frac{\lambda}{n}\right)^{-k}$ tend to unity as $n \to \infty$, and
$$\lim_{n \to \infty}\left(1-\frac{\lambda}{n}\right)^{n} = e^{-\lambda}.$$
The right side of (4-8) represents the Poisson p.m.f., and the Poisson approximation to the binomial r.v. is valid in situations where the binomial parameters n and p diverge to two extremes (n → ∞, p → 0) such that their product np remains constant.
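The limit (4-8) is easy to see numerically. A minimal sketch, where the particular values n = 10000 and λ = 2 are arbitrary illustrative choices:

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    """Exact binomial P_n(k) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson limit e^(-lambda) lambda^k / k! from (4-8)."""
    return exp(-lam) * lam**k / factorial(k)

# n large, p small, with np = lambda held fixed.
n, lam = 10_000, 2.0
p = lam / n
for k in range(6):
    print(k, round(binom_pmf(k, n, p), 6), round(poisson_pmf(k, lam), 6))
# The two columns agree to about four decimal places.
```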
Example 4.2: Winning a Lottery: Suppose two million
lottery tickets are issued with 100 winning tickets among
them. (a) If a person purchases 100 tickets, what is the
probability of winning? (b) How many tickets should one
buy to be 95% confident of having a winning ticket?
Solution: The probability of buying a winning ticket is
$$p = \frac{\text{No. of winning tickets}}{\text{Total no. of tickets}} = \frac{100}{2 \times 10^{6}} = 5 \times 10^{-5}.$$
(a) With n = 100 tickets, the number X of winning tickets obeys the Poisson approximation with λ = np = 5 × 10⁻³, so the probability of winning is P(X ≥ 1) = 1 − e^{−λ} ≈ 0.005.
(b) In this case we need P(X ≥ 1) ≥ 0.95. Since
$$P(X \ge 1) = 1 - e^{-\lambda} \ge 0.95 \quad\text{implies}\quad \lambda \ge \ln 20 \simeq 3,$$
and λ = np with p = 5 × 10⁻⁵, one should buy n ≥ 3/(5 × 10⁻⁵) = 60,000 tickets to be 95% confident of having a winning ticket.

As another illustration of (4-8), for a Poisson r.v. with λ = 2 the probability of five or more occurrences is
$$P(X \ge 5) = 1 - \sum_{k=0}^{4} e^{-2}\,\frac{2^{k}}{k!} = 1 - e^{-2}\left(1 + 2 + 2 + \frac{4}{3} + \frac{2}{3}\right) = 0.052.$$
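As a sketch, the lottery numbers can be checked directly, comparing the exact binomial answer 1 − (1 − p)^n with the Poisson approximation used above (the variable names are ours):

```python
from math import exp, log, ceil

p = 100 / 2_000_000          # probability a single ticket wins, 5e-5

# (a) n = 100 tickets: lambda = np = 5e-3.
n = 100
exact = 1 - (1 - p) ** n     # exact binomial P(X >= 1)
approx = 1 - exp(-n * p)     # Poisson approximation
print(exact, approx)         # both ~ 0.005

# (b) smallest n with 1 - e^(-np) >= 0.95, i.e. np >= ln 20 ~ 3.
n_needed = ceil(log(20) / p)
print(n_needed)              # 59,915, i.e. about 60,000 tickets
```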
Conditional Probability Density Function
For any two events A and B, we have defined the conditional
probability of A given B as
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) \neq 0. \qquad (4\text{-}9)$$
Noting that the probability distribution function FX ( x ) is
given by
$$F_X(x) = P\{X(\xi) \le x\}, \qquad (4\text{-}10)$$
we may define the conditional distribution of the r.v. X given the event B as
$$F_X(x \mid B) = P\{X(\xi) \le x \mid B\} = \frac{P\{(X(\xi) \le x) \cap B\}}{P(B)}. \qquad (4\text{-}11)$$
Thus the definition of the conditional distribution depends
on conditional probability, and since it obeys all probability
axioms, it follows that the conditional distribution has the
same properties as any distribution function. In particular
$$F_X(+\infty \mid B) = \frac{P\{(X(\xi) \le +\infty) \cap B\}}{P(B)} = \frac{P(B)}{P(B)} = 1, \qquad (4\text{-}12)$$
$$F_X(-\infty \mid B) = \frac{P\{(X(\xi) \le -\infty) \cap B\}}{P(B)} = \frac{P(\phi)}{P(B)} = 0.$$
Further
$$P(x_1 < X(\xi) \le x_2 \mid B) = \frac{P\{(x_1 < X(\xi) \le x_2) \cap B\}}{P(B)} = F_X(x_2 \mid B) - F_X(x_1 \mid B), \qquad (4\text{-}13)$$
since, for x₂ ≥ x₁,
$$(X(\xi) \le x_2) = (X(\xi) \le x_1) \cup (x_1 < X(\xi) \le x_2). \qquad (4\text{-}14)$$
Defining the conditional density as the derivative $f_X(x \mid B) = \dfrac{dF_X(x \mid B)}{dx}$, this probability can also be written as
$$P(x_1 < X(\xi) \le x_2 \mid B) = \int_{x_1}^{x_2} f_X(x \mid B)\,dx. \qquad (4\text{-}17)$$
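Definitions (4-11)–(4-17) lend themselves to a quick Monte Carlo check. A minimal sketch, where the standard Gaussian X and the event B = {X(ξ) > 0} are arbitrary illustrative choices:

```python
import random

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(200_000)]
B = [x > 0 for x in samples]          # the conditioning event B = {X > 0}

def cond_cdf(x):
    """Empirical F_X(x | B) = P{(X <= x) and B} / P(B), as in (4-11)."""
    joint = sum(1 for s, b in zip(samples, B) if s <= x and b)
    return joint / sum(B)

# P(x1 < X <= x2 | B) = F(x2 | B) - F(x1 | B), as in (4-13):
x1, x2 = 0.5, 1.5
print(cond_cdf(x2) - cond_cdf(x1))    # ~ 0.48; exact value (Phi(1.5) - Phi(0.5)) / P(B) = 0.483
```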
Example 4.4: Refer to Example 3.2. Toss a coin, with X(T) = 0 and X(H) = 1. Suppose B = {H}. Determine F_X(x | B).
Solution: From Example 3.2, FX ( x ) has the following form.
We need FX ( x | B ) for all x.
For x < 0, {X (ξ ) ≤ x } = φ , so that { ( X (ξ ) ≤ x ) ∩ B } = φ ,
and FX ( x | B ) = 0.
Fig. 4.3 (a) F_X(x), with a step of height q at x = 0 and reaching 1 at x = 1; (b) F_X(x | B), a single step of height 1 at x = 1.
For 0 ≤ x < 1, {X (ξ ) ≤ x } = { T }, so that
{ ( X (ξ ) ≤ x ) ∩ B } = { T }∩ { H } = φ and FX ( x | B ) = 0.
For x ≥ 1, {X(ξ) ≤ x} = Ω, so that
$$\{(X(\xi) \le x) \cap B\} = \Omega \cap B = B \quad\text{and}\quad F_X(x \mid B) = \frac{P(B)}{P(B)} = 1$$
(see Fig. 4.3(b)).
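A tiny simulation of this example (a sketch, with the illustrative assumption of a fair coin, p = q = 1/2):

```python
import random

random.seed(1)
tosses = [random.random() < 0.5 for _ in range(100_000)]   # True = H
X = [1 if h else 0 for h in tosses]                        # X(T) = 0, X(H) = 1

def cond_cdf(x):
    """Empirical F_X(x | B) with B = {H}, per (4-11)."""
    num = sum(1 for xi, h in zip(X, tosses) if xi <= x and h)
    return num / sum(tosses)

print(cond_cdf(0.5), cond_cdf(1.0))   # 0.0 and 1.0: a single step at x = 1
```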
Example 4.5: Given FX ( x ), suppose B = {X (ξ ) ≤ a}. Find f X ( x | B ).
Solution: We will first determine FX ( x | B ). From (4-11) and
B as given above, we have
$$F_X(x \mid B) = \frac{P\{(X \le x) \cap (X \le a)\}}{P(X \le a)}. \qquad (4\text{-}18)$$
For x < a , ( X ≤ x ) ∩ ( X ≤ a ) = ( X ≤ x ) so that
$$F_X(x \mid B) = \frac{P(X \le x)}{P(X \le a)} = \frac{F_X(x)}{F_X(a)}. \qquad (4\text{-}19)$$
For x ≥ a , ( X ≤ x ) ∩ ( X ≤ a ) = ( X ≤ a ) so that FX ( x | B ) = 1.
Thus
$$F_X(x \mid B) = \begin{cases} \dfrac{F_X(x)}{F_X(a)}, & x < a, \\[2pt] 1, & x \ge a, \end{cases} \qquad (4\text{-}20)$$
and hence
$$f_X(x \mid B) = \frac{d}{dx} F_X(x \mid B) = \begin{cases} \dfrac{f_X(x)}{F_X(a)}, & x < a, \\[2pt] 0, & \text{otherwise.} \end{cases} \qquad (4\text{-}21)$$
Fig. 4.4 (a) F_X(x | B) compared with F_X(x); (b) f_X(x | B) compared with f_X(x): the density is scaled up by 1/F_X(a) for x < a and is zero beyond a.
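A short numeric sketch of (4-20)–(4-21): taking F_X to be the standard Gaussian distribution and a = 1 (both choices illustrative only), the conditional density of (4-21) indeed integrates to one:

```python
from math import erf, exp, pi, sqrt

def Phi(x):
    """Standard Gaussian CDF, playing the role of F_X(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def phi(x):
    """Standard Gaussian density, playing the role of f_X(x)."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

a = 1.0                                   # the conditioning event is B = {X <= a}

def f_cond(x):
    """f_X(x | B) from (4-21): f_X(x) / F_X(a) for x < a, else 0."""
    return phi(x) / Phi(a) if x < a else 0.0

# The conditional density integrates to one (simple Riemann sum check):
dx = 0.001
total = sum(f_cond(-8 + i * dx) * dx for i in range(int(16 / dx)))
print(round(total, 4))                    # ~ 1.0, as a density must
```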
Fig. 4.5 f_X(x | B) compared with f_X(x) for a conditioning event of the form B = {a < X(ξ) ≤ b}.
We can use the conditional p.d.f. together with Bayes' theorem to update our a priori knowledge about the probability of events in the presence of new observations. Ideally, any new information should be used to update our knowledge. As we see in the next example, the conditional p.d.f. together with Bayes' theorem allows systematic updating. For any two events A and B, Bayes' theorem gives
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}. \qquad (4\text{-}27)$$
Let B = {x₁ < X(ξ) ≤ x₂}, so that (4-27) becomes (see (4-13) and (4-17))
$$P\{A \mid (x_1 < X(\xi) \le x_2)\} = \frac{P\big((x_1 < X(\xi) \le x_2) \mid A\big)\,P(A)}{P(x_1 < X(\xi) \le x_2)} = \frac{F_X(x_2 \mid A) - F_X(x_1 \mid A)}{F_X(x_2) - F_X(x_1)}\,P(A) = \frac{\displaystyle\int_{x_1}^{x_2} f_X(x \mid A)\,dx}{\displaystyle\int_{x_1}^{x_2} f_X(x)\,dx}\,P(A). \qquad (4\text{-}28)$$
Further, let x₁ = x, x₂ = x + ε, ε > 0, so that in the limit as ε → 0,
$$\lim_{\varepsilon \to 0} P\{A \mid (x < X(\xi) \le x + \varepsilon)\} = P(A \mid X(\xi) = x) = \frac{f_X(x \mid A)}{f_X(x)}\,P(A), \qquad (4\text{-}29)$$
or
$$f_{X|A}(x \mid A) = \frac{P(A \mid X = x)\,f_X(x)}{P(A)}. \qquad (4\text{-}30)$$
Suppose the probability of heads of a coin is itself a r.v. P, uniformly distributed over (0, 1) (Fig. 4.6: the uniform density f_P(p) = 1 on 0 < p < 1), and let A denote the event "k heads in n specific tosses." Then
$$P(A \mid P = p) = p^{k} q^{n-k}, \qquad (4\text{-}34)$$
and using (4-32) we get
$$P(A) = \int_{0}^{1} P(A \mid P = p)\,f_P(p)\,dp = \int_{0}^{1} p^{k}(1-p)^{n-k}\,dp = \frac{(n-k)!\,k!}{(n+1)!}. \qquad (4\text{-}35)$$
Let B = "head occurring in the (n + 1)th toss, given that k heads have occurred in the n previous tosses."
Clearly P(B | P = p) = p, and from (4-32),
$$P(B) = \int_{0}^{1} P(B \mid P = p)\,f_P(p \mid A)\,dp. \qquad (4\text{-}37)$$
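The updating chain (4-30), (4-34)–(4-37) can be verified numerically. A minimal sketch: it evaluates f_P(p | A) on a grid with a uniform prior and then computes P(B) by (4-37); the closed form (k + 1)/(n + 2), which follows from the beta integral in (4-35), is quoted in the final comment for comparison. The values n = 10, k = 7 are arbitrary illustrative choices:

```python
# Posterior of P after observing A = "k heads in n tosses", with a
# uniform prior f_P(p) = 1 on (0, 1), evaluated on a grid.
n, k = 10, 7
dp = 1e-4
grid = [i * dp for i in range(1, int(1 / dp))]

prior = [1.0 for _ in grid]                   # f_P(p) = 1, 0 < p < 1
like = [p**k * (1 - p) ** (n - k) for p in grid]

# f_P(p | A) = P(A | P = p) f_P(p) / P(A), the continuous analogue of (4-30):
PA = sum(l * f * dp for l, f in zip(like, prior))
posterior = [l * f / PA for l, f in zip(like, prior)]

# P(B) from (4-37): probability of a head on toss n + 1, given A.
PB = sum(p * post * dp for p, post in zip(grid, posterior))
print(round(PB, 4))                           # ~ 0.6667 = (k + 1)/(n + 2)
```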