

Week 3: Probability and Some Calculations


TA: Wen Xu Email: [email protected]

A Short Review of Probability Theory


1. (σ-algebra, or σ-field). A σ-algebra, or σ-field, on a set Ω is a collection F of subsets of Ω (i.e., a subset of the power set 2^Ω) with the following properties:
(a). Ω ∈ F;
(b). A ∈ F implies A^c ∈ F;
(c). if (A_n)_{n⩾1} is a sequence in F, then ∪_{n=1}^∞ A_n also belongs to F.
Example:
(i) The smallest σ-field associated with Ω is the collection F = {∅, Ω}.
(ii) If A is any subset of Ω, then F = {∅, A, Ac , Ω} is a σ-field.

2. Consider an experiment whose sample space is Ω. For each event A of the sample space (Ω, F),
we assume that a number P (A) is defined and satisfies the following three conditions:
(i) 0 ⩽ P (A) ⩽ 1.
(ii) P (Ω) = 1.
(iii) For any sequence A_1, A_2, … of mutually exclusive events,

P(∪_{n=1}^∞ A_n) = ∑_{n=1}^∞ P(A_n).

We refer to P (A) as the probability of the event A. The triple (Ω, F, P ) is called a probability
space.

3. Properties of Probability.
(i) If A ⊂ B, then P (A) ⩽ P (B).
(ii) P (Ac ) = 1 − P (A).
(iii) P (A ∪ B) = P (A) + P (B) − P (A ∩ B).
(iv) P(∪_{i=1}^n A_i) = ∑_{i=1}^n P(A_i) if the A_i are mutually exclusive.
(v) Boole’s inequality: P(∪_{i=1}^∞ A_i) ⩽ ∑_{i=1}^∞ P(A_i).

4. Let A_1, A_2, … be an increasing sequence of events, i.e., A_1 ⊂ A_2 ⊂ …, and write A for their limit:

A = ∪_{i=1}^∞ A_i = lim_{i→∞} A_i.

Then
P(A) = lim_{i→∞} P(A_i).


5.(Law of Total Probability). For any events A and B,

P (A) = P (A | B)P (B) + P (A | B c ) P (B c ) .

More generally, if B_1, …, B_n is a partition of Ω, then

P(A) = ∑_{i=1}^n P(A | B_i) P(B_i).

6. (Bayes’ theorem). For events A and B with P(A) > 0 and P(B) > 0,

P(A | B) = P(B | A) P(A) / P(B).
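
As a tiny worked illustration of items 5 and 6, the following Python sketch evaluates both formulas with exact fractions; the numbers (1% prevalence, 95% true positive rate, 10% false positive rate) are an assumed example, not taken from the notes.

```python
from fractions import Fraction

P_B = Fraction(1, 100)                 # P(B): prevalence of a condition (assumed)
P_A_given_B = Fraction(95, 100)        # P(A | B): test positive given the condition (assumed)
P_A_given_Bc = Fraction(10, 100)       # P(A | B^c): false positive rate (assumed)

# Law of total probability: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
P_A = P_A_given_B * P_B + P_A_given_Bc * (1 - P_B)

# Bayes' theorem (stated here for B given A): P(B|A) = P(A|B)P(B) / P(A)
P_B_given_A = P_A_given_B * P_B / P_A
print(P_A, P_B_given_A, float(P_B_given_A))        # 217/2000 19/217 ~ 0.0876
```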

7. (Independence). Events A and B are called independent if P(A ∩ B) = P(A)P(B). More generally, a family {A_i : i ∈ I} is called independent if

P(∩_{i∈J} A_i) = ∏_{i∈J} P(A_i)

for any finite subset J of I.

8. (Random Variable). A random variable on the sample space (Ω, F) is a function X : Ω → R with the property that {ω ∈ Ω : X(ω) ⩽ x} ∈ F for each x ∈ R.

9. If {X ⩽ x} ∈ F for all x ∈ R, we say X is measurable with respect to F, and people often use
the shorthand notation X ∈ F .

10. For a random variable X we define σ(X) to be the smallest σ-field with respect to which X is measurable. We read σ(X) as "the σ-field generated by X". For random variables X, Y, we say that X is Y-measurable if X ∈ σ(Y).
Example. Let Ω = {a, b, c} and A = {{a, b, c}, {a, b}, {c}, ∅}, and we define X, Y, Z as follows:

ω X Y Z
a 1 1 1
b 1 2 7
c 2 2 4

(i) Which of the random variables are A-measurable?

(ii) Is X ∈ σ(Y)? Is Y ∈ σ(Z)?
Solution.


(i) Note that

{X ⩽ x} = ∅ if x < 1,   {a, b} if 1 ⩽ x < 2,   {a, b, c} if x ⩾ 2,

and thus X ∈ A, i.e., X is A-measurable. Since {Y ⩽ 1} = {a} ∉ A, Y is not A-measurable. For the same reason, Z is not A-measurable either.
(ii) Note that σ(Y) = {{a, b, c}, {a}, {b, c}, ∅}. Then X ∉ σ(Y), since {X ⩽ 1} = {a, b} ∉ σ(Y). For the second question, Y ∈ σ(Z) since σ(Z) = 2^{a,b,c}.
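
As a side note, the check in part (i) can be mechanized on a finite sample space. The minimal Python sketch below reuses the σ-field A and the table of X, Y, Z from the example (the helper name is_measurable is ours, not from the notes) and tests whether each set {X ⩽ x} lies in A.

```python
omega = ['a', 'b', 'c']
A = [set(), {'a', 'b'}, {'c'}, {'a', 'b', 'c'}]        # the sigma-field A from the example
X = {'a': 1, 'b': 1, 'c': 2}
Y = {'a': 1, 'b': 2, 'c': 2}
Z = {'a': 1, 'b': 7, 'c': 4}

def is_measurable(rv, field):
    # On a finite space it suffices to check {rv <= x} for the values x that rv attains.
    return all({w for w in omega if rv[w] <= x} in field for x in rv.values())

for name, rv in [('X', X), ('Y', Y), ('Z', Z)]:
    print(name, is_measurable(rv, A))                  # X True, Y False, Z False
```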

11. (Expectation, a higher level). If X ⩾ 0 is a random variable on (Ω, F, P ), then the


expectation of X is defined as

EX = ∫_Ω X dP,

which always makes sense, but may be ∞. If X is a general random variable, then

EX = EX^+ − EX^−,

where x^+ = max{x, 0} and x^− = max{−x, 0}, whenever EX^+ < ∞ or EX^− < ∞.

12. (Change of variable formula). Let X be a random variable with distribution F . If g is a


measurable function so that E|g(X)| < ∞ or g ⩾ 0, then

Eg(X) = ∫_R g(x) dF(x).

The integration above is known as the Riemann-Stieltjes integral, which is a generalization of the Riemann integral. If F(x) = ∫_{−∞}^x f(t) dt for some function f, then

∫_R g(x) dF(x) = ∫_R g(x) f(x) dx,

which is the expectation formula for a continuous random variable. If F is a step function with jumps at {x_i, i ⩾ 1}, then

∫_R g(x) dF(x) = ∑_{i⩾1} g(x_i)(F(x_i) − F(x_i−)) = ∑_{i⩾1} g(x_i) f(x_i),

where f(x_i) = F(x_i) − F(x_i−) is the probability mass at x_i.

13. The moment generating function (m.g.f.) of X is

ϕ(t) = E(e^{tX}) = ∑_x e^{tx} p(x) if X is discrete, or ∫_{−∞}^∞ e^{tx} f(x) dx if X is continuous.

ϕ is called the mgf because we can obtain all moments from it. In fact, by Taylor’s formula, we have

ϕ(t) = ∑_{n=0}^∞ (ϕ^{(n)}(0)/n!) t^n,


and

ϕ(t) = E(e^{tX}) = E(∑_{n=0}^∞ (tX)^n/n!) = ∑_{n=0}^∞ (E(X^n)/n!) t^n.

Thus,

E(X^n) = ϕ^{(n)}(0).

14. (Law of Total Variance).

Var(X) = E[Var(X | Y)] + Var(E[X | Y]).
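
A quick numerical illustration of this decomposition, using an assumed toy mixture (Y equally likely 0 or 1, X normal with a Y-dependent mean and standard deviation; none of this comes from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
Y = rng.integers(0, 2, size=n)                   # Y is 0 or 1, each with probability 1/2
mu = np.where(Y == 0, 1.0, 5.0)                  # E[X | Y]
sigma = np.where(Y == 0, 1.0, 2.0)               # Std(X | Y)
X = rng.normal(mu, sigma)

print(X.var())                                   # Var(X), about 6.5
print((sigma**2).mean() + mu.var())              # E[Var(X|Y)] + Var(E[X|Y]), also about 6.5
```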

Homework
13. The dice game craps is played as follows. The player throws two dice, and if the sum is seven or
eleven, then she wins. If the sum is two, three, or twelve, then she loses. If the sum is anything else,
then she continues throwing until she either throws that number again (in which case she wins) or she
throws a seven (in which case she loses). Calculate the probability that the player wins.

Solution:
Note that the distribution of the sum i of two dice is

i:     2     3     4     5     6     7     8     9     10    11    12
P(i):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Let A_i be the event that the first throw is i and the player goes on to win the game. Then

P(A_7) = 6/36 = 1/6 and P(A_11) = 2/36 = 1/18 (together 2/9), while P(A_i) = 0 for i = 2, 3, 12.

Now we calculate P(A_i) for the other values of i. Denote by p the probability of rolling i on a single throw, and by r the probability of rolling neither i nor 7, so that 1 − r = p + 1/6. Then

P(A_i) = ∑_{n=1}^∞ P(A_i, win on trial n + 1) = ∑_{n=1}^∞ p r^{n−1} p = p²/(1 − r) = p²/(1/6 + p).

Hence

P(A_i) = 1/36 for i = 4, 10,   P(A_i) = 2/45 for i = 5, 9,   P(A_i) = 25/396 for i = 6, 8,

and therefore

P(win) = ∑_{i=2}^{12} P(A_i) = 244/495 ≈ 0.49.
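
As a sanity check of the arithmetic, the same sum can be evaluated exactly with Python's fractions module (a verification sketch, not part of the original solution):

```python
from fractions import Fraction

p = {i: Fraction(6 - abs(i - 7), 36) for i in range(2, 13)}   # P(sum of two dice = i)

win = p[7] + p[11]                                # immediate win on 7 or 11
for i in (4, 5, 6, 8, 9, 10):                     # the point numbers
    win += p[i] * p[i] / (p[i] + p[7])            # P(A_i) = p^2 / (p + 1/6)
print(win, float(win))                            # 244/495, about 0.4929
```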

30. Let X be a Poisson random variable with parameter λ. Show that P(X = i) increases monotonically and then decreases monotonically as i increases, reaching its maximum when i is the largest integer not exceeding λ. Hint: Consider P(X = i)/P(X = i − 1).

Solution: Since P(X = i)/P(X = i − 1) = λ/i, the ratio is ⩾ 1 exactly when i ⩽ λ, so the probabilities increase up to i = ⌊λ⌋ and decrease afterwards. The maximum is attained at i = ⌊λ⌋.
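
A quick numerical check in plain Python (the values of λ are arbitrary and deliberately non-integer; for integer λ the maximum is attained at both λ − 1 and λ):

```python
from math import exp, factorial, floor

def pmf(lam, i):
    return exp(-lam) * lam**i / factorial(i)

for lam in (0.7, 3.2, 10.9):
    probs = [pmf(lam, i) for i in range(60)]
    print(lam, probs.index(max(probs)), floor(lam))   # argmax of the pmf equals floor(lam)
```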


33. Let X be a random variable with probability density

f(x) = c(1 − x²) for −1 < x < 1, and f(x) = 0 otherwise.

(a) What is the value of c ?


(b) What is the cumulative distribution function of X ?

Solution:
(a) Since ∫_{−1}^1 c(1 − x²) dx = 4c/3 = 1, we get c = 3/4.
(b) The cumulative distribution function is

F(y) = 0 for y ⩽ −1,   F(y) = (3/4)(y − y³/3 + 2/3) for −1 < y < 1,   F(y) = 1 for y ⩾ 1.

43. An urn contains n + m balls, of which n are red and m are black. They are withdrawn from the
urn, one at a time and without replacement. Let X be the number of red balls removed before the first
black ball is chosen. We are interested in determining E[X]. To obtain this quantity, number the red
balls from 1 to n. Now define the random variables Xi , i = 1, . . . , n, by

X_i = 1 if red ball i is taken before any black ball is chosen, and X_i = 0 otherwise.

(a) Express X in terms of the Xi .


(b) Find E[X].

Solution:
(a) X = ∑_{i=1}^n X_i.
(b)

E(X) = E(∑_{i=1}^n X_i) = ∑_{i=1}^n E(X_i)
     = ∑_{i=1}^n 1/(m + 1)   (consider the relative order of the i-th red ball and the m black balls)
     = n/(m + 1).
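
A short simulation (numpy, with arbitrarily chosen n = 6 red and m = 4 black balls) agrees with E[X] = n/(m + 1):

```python
import numpy as np

rng = np.random.default_rng(1)
n_red, m_black, trials = 6, 4, 200_000
total = 0
for _ in range(trials):
    urn = np.array([1] * n_red + [0] * m_black)   # 1 = red, 0 = black
    rng.shuffle(urn)
    total += np.argmax(urn == 0)                  # number of reds before the first black
print(total / trials, n_red / (m_black + 1))      # both about 1.2
```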


46. If X is a nonnegative integer valued random variable, show that

(a)

E[X] = ∑_{n=1}^∞ P(X ≥ n) = ∑_{n=0}^∞ P(X > n).

Hint: Define the sequence of random variables {In , n ≥ 1}, by



I_n = 1 if n ≤ X, and I_n = 0 if n > X.

Now express X in terms of the In .


(b) If X and Y are both nonnegative integer valued random variables, show that

E[XY] = ∑_{n=1}^∞ ∑_{m=1}^∞ P(X ≥ n, Y ≥ m).

Solution: (a) Let

I_n = 1 if n ≤ X, and I_n = 0 if n > X.

Note that

X = ∑_{n=1}^∞ I_n,

so

E(X) = E(∑_{n=1}^∞ I_n) = ∑_{n=1}^∞ E(I_n) = ∑_{n=1}^∞ P(X ≥ n) = ∑_{n=0}^∞ P(X > n).

(b) Let

J_m = 1 if m ≤ Y, and J_m = 0 if m > Y.

Similarly,

Y = ∑_{m=1}^∞ J_m.


Then

E(XY) = E[(∑_{n=1}^∞ I_n) Y] = ∑_{n=1}^∞ E(I_n Y)
      = ∑_{n=1}^∞ E[I_n (∑_{m=1}^∞ J_m)]
      = ∑_{n=1}^∞ ∑_{m=1}^∞ E(I_n J_m)
      = ∑_{n=1}^∞ ∑_{m=1}^∞ P(X ≥ n, Y ≥ m).
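
Both identities are easy to sanity-check numerically; the geometric samples below are an assumed example (any nonnegative integer valued variables would do), and the simulated X and Y happen to be independent:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200_000
X = rng.geometric(0.3, size=N)                    # nonnegative integer valued samples
Y = rng.geometric(0.5, size=N)

# (a) E[X] = sum_{n>=1} P(X >= n)
tail = sum((X >= n).mean() for n in range(1, X.max() + 1))
print(X.mean(), tail)

# (b) E[XY] = sum_{n,m>=1} P(X >= n, Y >= m)
double = sum(((X >= n) & (Y >= m)).mean()
             for n in range(1, X.max() + 1) for m in range(1, Y.max() + 1))
print((X * Y).mean(), double)
```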

67. Calculate the moment generating function of the uniform distribution on (0, 1). Obtain E[X] and
Var[X] by differentiating.
Solution: Note that

ϕ(t) = E(e^{tX}) = ∫_0^1 e^{tx} dx = (e^t − 1)/t.

Expanding,

ϕ(t) = (1/t) ∑_{m=1}^∞ t^m/m! = ∑_{n=0}^∞ (1/(n + 1)) t^n/n!.

Hence

ϕ^{(n)}(0) = 1/(n + 1),

so in particular ϕ′(0) = 1/2 and ϕ″(0) = 1/3. Therefore

E(X) = ϕ′(0) = 1/2,
Var(X) = E(X²) − (E(X))² = ϕ″(0) − (ϕ′(0))² = 1/3 − 1/4 = 1/12.
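
Since ϕ(t) = (e^t − 1)/t has a removable singularity at t = 0, the derivatives at 0 are computed as limits; a small sympy sketch confirming the moments above:

```python
import sympy as sp

t = sp.symbols('t')
phi = (sp.exp(t) - 1) / t                       # mgf of Uniform(0, 1)
m1 = sp.limit(sp.diff(phi, t, 1), t, 0)         # E[X]   = 1/2
m2 = sp.limit(sp.diff(phi, t, 2), t, 0)         # E[X^2] = 1/3
print(m1, m2, m2 - m1**2)                       # 1/2 1/3 1/12
```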

Supplementary exercises
1. (Some Probability Identities). Let A1 , A2 , . . . , An denote events and define the indicator
variables I_j, j = 1, …, n, by I_j = 1 if A_j occurs and I_j = 0 otherwise. Letting

N = ∑_{j=1}^n I_j,

then N denotes the number of the Aj , 1 ≤ j ≤ n, that occur. A useful identity can be obtained by
noting that (1 − 1)^N = 1 if N = 0 and (1 − 1)^N = 0 if N > 0.


But by the binomial theorem (writing C(N, i) for the binomial coefficient),

(1 − 1)^N = ∑_{i=0}^N (−1)^i C(N, i) = ∑_{i=0}^n (−1)^i C(N, i),

since C(m, i) = 0 when i > m.

Hence, if we let I = 1 if N > 0 and I = 0 if N = 0, this yields

1 − I = ∑_{i=0}^n (−1)^i C(N, i),

or

I = ∑_{i=1}^n (−1)^{i+1} C(N, i).

Taking expectations,

E[I] = E[N] − E[C(N, 2)] + ⋯ + (−1)^{n+1} E[C(N, n)].   (1)
However,

E[I] = P{N > 0} = P{at least one of the A_j occurs} = P(∪_{j=1}^n A_j),

and

E[N] = E[∑_{j=1}^n I_j] = ∑_{j=1}^n P(A_j),
E[C(N, 2)] = E[number of pairs of the A_j that occur]
           = E[∑∑_{i<j} I_i I_j]
           = ∑∑_{i<j} E[I_i I_j]
           = ∑∑_{i<j} P(A_i A_j).


In general, by the same reasoning,

E[C(N, i)] = E[number of sets of size i of the A_j that all occur]
           = E[∑_{l_1<l_2<⋯<l_i} I_{l_1} I_{l_2} ⋯ I_{l_i}]
           = ∑_{l_1<l_2<⋯<l_i} P(A_{l_1} A_{l_2} ⋯ A_{l_i}).

Hence, (1) is a statement of the well-known inclusion-exclusion identity

P(∪_{i=1}^n A_i) = ∑_{i=1}^n P(A_i) − ∑_{i<l} P(A_i A_l) + ∑_{i<j<k} P(A_i A_j A_k) − ⋯ + (−1)^{n+1} P(A_1 A_2 ⋯ A_n).
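
The identity is easy to verify on a small case; the events below (arbitrary subsets of a 12-point Ω with the uniform probability) are an assumed illustration, not taken from the notes:

```python
from fractions import Fraction
from itertools import combinations
from functools import reduce

omega = set(range(12))
events = [{0, 1, 2, 3}, {2, 3, 4, 5, 6}, {5, 6, 7}, {1, 7, 8, 9}]   # arbitrary A_1, ..., A_4
P = lambda A: Fraction(len(A), len(omega))                           # uniform probability on omega

lhs = P(set.union(*events))                       # P(A_1 u ... u A_4)
rhs = sum((-1) ** (k + 1) * sum(P(reduce(set.intersection, combo))
                                for combo in combinations(events, k))
          for k in range(1, len(events) + 1))     # inclusion-exclusion sum
print(lhs, rhs, lhs == rhs)                       # 5/6 5/6 True
```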

2. The Sum of a Random Number of Random Variables. Let X1 , X2 , . . . denote a sequence


of independent and identically distributed random variables, and let N denote a nonnegative integer valued random variable that is independent of the sequence X_1, X_2, …. We shall compute the moment generating function of Y = ∑_{i=1}^N X_i by first conditioning on N. Now
E[exp{t ∑_{i=1}^N X_i} | N = n] = E[exp{t ∑_{i=1}^n X_i} | N = n]
                                = E[exp{t ∑_{i=1}^n X_i}]   (by independence of N and the X_i)
                                = (Ψ_X(t))^n,

where Ψ_X(t) = E[e^{tX}] is the moment generating function of X. Hence,

E[exp{t ∑_{i=1}^N X_i} | N] = (Ψ_X(t))^N,

and so

Ψ_Y(t) = E[exp{t ∑_{i=1}^N X_i}] = E[(Ψ_X(t))^N].

To compute the mean and variance of Y = ∑_{i=1}^N X_i, we differentiate Ψ_Y(t) as follows:


Ψ′_Y(t) = E[N (Ψ_X(t))^{N−1} Ψ′_X(t)],
Ψ″_Y(t) = E[N(N − 1) (Ψ_X(t))^{N−2} (Ψ′_X(t))² + N (Ψ_X(t))^{N−1} Ψ″_X(t)].

Evaluating at t = 0 gives

E[Y] = E[N E[X]] = E[N] E[X]


and

E[Y²] = E[N(N − 1) E²[X] + N E[X²]] = E[N] Var(X) + E[N²] E²[X].

Hence,

Var(Y) = E[Y²] − E²[Y] = E[N] Var(X) + E²[X] Var(N).
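
A simulation of a compound sum, with assumed distributions N ~ Poisson(4) and X ~ Exponential(mean 2), matches both formulas:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, mu, trials = 4.0, 2.0, 100_000                 # N ~ Poisson(lam), X ~ Exponential(mean mu)
N = rng.poisson(lam, size=trials)
Y = np.array([rng.exponential(mu, size=n).sum() for n in N])

EX, VarX, EN, VarN = mu, mu**2, lam, lam
print(Y.mean(), EN * EX)                            # E[Y] = E[N]E[X] = 8
print(Y.var(), EN * VarX + EX**2 * VarN)            # Var(Y) = E[N]Var(X) + E^2[X]Var(N) = 32
```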
3. A miner is trapped in a mine containing three doors. The first door leads to a tunnel that
takes him to safety after two hours of travel. The second door leads to a tunnel that returns him to
the mine after three hours of travel. The third door leads to a tunnel that returns him to his mine
after five hours. Assuming that the miner is at all times equally likely to choose any one of the doors,
let us compute the moment generating function of X, the time when the miner reaches safety. Let Y
denote the door initially chosen. Then
E[e^{tX}] = (1/3)(E[e^{tX} | Y = 1] + E[e^{tX} | Y = 2] + E[e^{tX} | Y = 3]).   (2)

Now given that Y = 1, it follows that X = 2, and so

E[e^{tX} | Y = 1] = e^{2t}.

Also, given that Y = 2, it follows that X = 3 + X ′ , where X ′ is the number of additional hours
to safety after returning to the mine. But once the miner returns to his cell the problem is exactly as
before, and thus X ′ has the same distribution as X. Therefore,
E[e^{tX} | Y = 2] = E[e^{t(3+X)}] = e^{3t} E[e^{tX}].

Similarly,

E[e^{tX} | Y = 3] = e^{5t} E[e^{tX}].

Substituting back into (2) yields

E[e^{tX}] = (1/3)(e^{2t} + e^{3t} E[e^{tX}] + e^{5t} E[e^{tX}]),

or

E[e^{tX}] = e^{2t}/(3 − e^{3t} − e^{5t}).
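
Differentiating this mgf at t = 0 gives the expected time to safety. The value of 10 hours is not stated above, but it follows from the mgf (or from conditioning on Y); a one-line sympy check:

```python
import sympy as sp

t = sp.symbols('t')
phi = sp.exp(2 * t) / (3 - sp.exp(3 * t) - sp.exp(5 * t))   # the mgf derived above
print(sp.diff(phi, t).subs(t, 0))                           # E[X] = phi'(0) = 10
```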
Not only can we obtain expectations by first conditioning upon an appropriate random variable,
but we may also use this approach to compute probabilities. To see this, let E denote an arbitrary
event and define the indicator random variable X by

X = 1 if E occurs, and X = 0 if E does not occur.


It follows from the definition of X that


E[X] = P(E),
E[X | Y = y] = P(E | Y = y)   for any random variable Y.

Therefore, from Equation (1.5.1) we obtain that

P(E) = ∫ P(E | Y = y) dF_Y(y).

4. The Matching Problem. At a party n people put their hats in the center of a room where the hats are mixed together. Each person then randomly selects one. We are interested in the mean and variance of X, the number of people who select their own hat. To solve, we use the representation

X = X1 + X2 + · · · + Xn ,

where X_i = 1 if the i-th person selects his or her own hat, and X_i = 0 otherwise.
Now, as the i-th person is equally likely to select any of the n hats, it follows that P{X_i = 1} = 1/n, and so

E[X_i] = 1/n,   Var(X_i) = (1/n)(1 − 1/n) = (n − 1)/n².
Also
Cov (Xi , Xj ) = E [Xi Xj ] − E [Xi ] E [Xj ] .

Now, X_i X_j = 1 if the i-th and j-th persons both select their own hats, and X_i X_j = 0 otherwise,
and thus
E[X_i X_j] = P{X_i = 1, X_j = 1} = P{X_i = 1} P{X_j = 1 | X_i = 1} = (1/n)(1/(n − 1)).

Hence,

Cov(X_i, X_j) = 1/(n(n − 1)) − (1/n)² = 1/(n²(n − 1)).
Therefore,

E[X] = 1

and

Var(X) = (n − 1)/n + 2 C(n, 2) · 1/(n²(n − 1)) = (n − 1)/n + 1/n = 1.


Note that we have used the general variance formula

Var(∑_{i=1}^n X_i) = ∑_{i=1}^n Var(X_i) + 2 ∑∑_{i<l} Cov(X_i, X_l).

Thus both the mean and variance of the number of matches are equal to 1. (See (6) for an explanation as to why these results are not surprising.)
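
A simulation of random hat assignments (numpy permutations; n = 10 is an arbitrary choice) agrees with mean and variance both equal to 1:

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 10, 200_000
matches = np.array([(rng.permutation(n) == np.arange(n)).sum() for _ in range(trials)])
print(matches.mean(), matches.var())            # both close to 1
```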
5. Matching problem restated
Suppose in the matching problem (4), that those choosing their own hats depart, while the others
(those without a match) put their selected hats in the center of the room, mix them up, and then
reselect. If this process continues until each individual has his or her own hat, find E [Rn ] where Rn is
the number of rounds that are necessary.
We will now show that E [Rn ] = n. The proof will be by induction on n, the number of individuals.
As it is obvious for n = 1, assume that E[R_k] = k for k = 1, …, n − 1. To compute E[R_n], start by
conditioning on M , the number of matches that occur in the first round. This gives

E[R_n] = ∑_{i=0}^n E[R_n | M = i] P{M = i}.

Now, given a total of i matches in the initial round, the number of rounds needed will equal 1
plus the number of rounds that are required when n − i people remain to be matched with their hats.
Therefore,

E[R_n] = ∑_{i=0}^n (1 + E[R_{n−i}]) P{M = i}
       = 1 + E[R_n] P{M = 0} + ∑_{i=1}^n E[R_{n−i}] P{M = i}
       = 1 + E[R_n] P{M = 0} + ∑_{i=1}^n (n − i) P{M = i}   (by the induction hypothesis)
       = 1 + E[R_n] P{M = 0} + n(1 − P{M = 0}) − E[M]
       = E[R_n] P{M = 0} + n(1 − P{M = 0})   (since E[M] = 1).

Solving for E[R_n] gives E[R_n] = n, which proves the result.
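
A simulation of the repeated-selection scheme supports E[R_n] = n; each round the unmatched people are modeled as a fresh uniform permutation of their own hats, which is what the description above amounts to (an assumption of this sketch):

```python
import numpy as np

rng = np.random.default_rng(5)

def rounds(n):
    remaining, r = n, 0
    while remaining > 0:
        r += 1
        m = (rng.permutation(remaining) == np.arange(remaining)).sum()
        remaining -= m                          # the matched people depart with their hats
    return r

n, trials = 8, 50_000
print(np.mean([rounds(n) for _ in range(trials)]), n)   # sample mean close to n = 8
```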

6. The Matching Problem Revisited.


Let us reconsider the matching problem (4), in which n individuals mix their hats up and then randomly make a selection. We shall compute the probability of exactly k matches.
First let E denote the event that no matches occur, and to make explicit the dependence on n write P_n = P(E). Upon conditioning on whether or not the first individual selects his or her own hat (call these events M and M^c) we obtain

P_n = P(E) = P(E | M) P(M) + P(E | M^c) P(M^c).


Clearly, P(E | M) = 0, and so

P_n = ((n − 1)/n) P(E | M^c).

Now, P(E | M^c) is the probability of no matches when n − 1 people select from a set of n − 1 hats that does not contain the hat of one of them. This can happen in either of two mutually exclusive ways. Either there are no matches and the extra person does not select the extra hat (this being the hat of the person that chose first), or there are no matches and the extra person does select the extra hat. The probability of the first of these events is P_{n−1}, which is seen by regarding the extra hat as "belonging" to the extra person. Since the second event has probability [1/(n − 1)] P_{n−2}, we have

P(E | M^c) = P_{n−1} + (1/(n − 1)) P_{n−2},
and thus, from Equation (1.5.4),

P_n = ((n − 1)/n) P_{n−1} + (1/n) P_{n−2},

or, equivalently,

P_n − P_{n−1} = −(1/n)(P_{n−1} − P_{n−2}).

However, clearly

P_1 = 0,   P_2 = 1/2.
Thus, from Equation (1.5.5),

P_3 − P_2 = −(P_2 − P_1)/3 = −1/3!,   or   P_3 = 1/2! − 1/3!,
P_4 − P_3 = −(P_3 − P_2)/4 = 1/4!,    or   P_4 = 1/2! − 1/3! + 1/4!,

and, in general, we see that

P_n = 1/2! − 1/3! + 1/4! − ⋯ + (−1)^n/n!.
To obtain the probability of exactly k matches, we consider any fixed group of k individuals. The probability that they, and only they, select their own hats is

(1/n)(1/(n − 1)) ⋯ (1/(n − (k − 1))) P_{n−k} = ((n − k)!/n!) P_{n−k},

where P_{n−k} is the conditional probability that the other n − k individuals, selecting among their own hats, have no matches. As there are C(n, k) choices of a set of k individuals, the desired probability of exactly k matches is

C(n, k) ((n − k)!/n!) P_{n−k} = P_{n−k}/k! = [1/2! − 1/3! + ⋯ + (−1)^{n−k}/(n − k)!] / k!,

which, for n large, is approximately equal to e^{−1}/k!.


Thus for n large the number of matches has approximately the Poisson distribution with mean 1. To understand this result better, recall that the Poisson distribution with mean λ is the limiting distribution of the number of successes in n independent trials, each resulting in a success with probability p_n, when n p_n → λ as n → ∞. Now if we let



X_i = 1 if the i-th person selects his or her own hat, and X_i = 0 otherwise,

then the number of matches, ∑_{i=1}^n X_i, can be regarded as the number of successes in n trials when each is a success with probability 1/n. Now, whereas the above result is not immediately applicable because these trials are not independent, the dependence is rather weak since, for example,

P {Xi = 1} = 1/n

and
P{X_i = 1 | X_j = 1} = 1/(n − 1),   j ≠ i.

Hence we would certainly hope that the Poisson limit would still remain valid under this type of
weak dependence. The results of this example show that it does.
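
Finally, the exact probability of exactly k matches from the formula above can be compared with the Poisson(1) pmf; a short Python sketch with n = 10 chosen arbitrarily:

```python
from fractions import Fraction
from math import exp, factorial

def P_no_match(m):
    # P_m = 1/2! - 1/3! + ... + (-1)^m/m!  (equivalently sum_{j=0}^{m} (-1)^j / j!)
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(m + 1))

n = 10
for k in range(5):
    exact = P_no_match(n - k) / factorial(k)        # P(exactly k matches) among n people
    print(k, float(exact), exp(-1) / factorial(k))  # close to the Poisson(1) pmf
```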
