
AI2613 Stochastic Processes, 2020–2021 Spring Semester

Lecture 8 – Poisson Process (II)


April 19, 2021
Lecturer: 张驰豪 Scribe: 杨宽

Today we are going to talk about some interesting properties and applications of the Poisson
process. The Poisson process has three well-known transformations: thinning, superposition
and conditioning. We will discuss thinning and conditioning today, and leave superposition to
the exercises. After these two transformations we will introduce the Poisson approximation.

1 Thinning
We first review the definition of the Poisson process. A Poisson process with rate λ is a collection
of random variables (N(t))_{t≥0} s.t. (i) N(0) = 0, (ii) N(t + s) − N(s) ∼ Pois(λt), and (iii) N(t) has
independent increments. We also introduced another constructive definition in the last lecture.
Let τ_1, τ_2, ... ∼ Exponential(λ) be i.i.d. and T_n = Σ_{i=1}^n τ_i. Then N(t) = max{n : T_n ≤ t} is a Poisson process.
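To make the constructive definition concrete, here is a minimal simulation sketch (assuming numpy is available; the helper name is ours, not from the lecture) that builds N(t) from i.i.d. exponential gaps and checks that its mean and variance match λt:

import numpy as np

rng = np.random.default_rng(0)

def poisson_process_count(lam, t, runs=20000):
    # Constructive definition: arrival times are partial sums of Exponential(lam) gaps,
    # and N(t) counts how many arrival times fall in [0, t].
    gaps = rng.exponential(1.0 / lam, size=(runs, int(10 * lam * t) + 50))
    arrival_times = np.cumsum(gaps, axis=1)
    return (arrival_times <= t).sum(axis=1)

counts = poisson_process_count(lam=3.0, t=2.0)
print(counts.mean(), counts.var())   # both should be close to lam * t = 6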
Now consider the example of customers coming into a restaurant. Sometimes we have a more
detailed characterization of the customers, such as their gender. We associate an independent and
identically distributed (i.i.d.) random variable Y_i with each arrival, and then use the value of Y_i
to separate the Poisson process into several processes.
Figure 1: Poisson process with random variables associated with arrivals (inter-arrival times τ_1, ..., τ_5 with labels Y_1, ..., Y_5)

Formally, suppose that the Y_i ∈ N are i.i.d. random variables. Let p_j = Pr[Y_i = j]. For all j ∈ N
(or j ∈ Range(Y_i)), let N_j(t) denote the number of arrivals by time t whose associated value is
exactly j. Then {N_j(t)} is called the thinning of the Poisson process.
The following properties of thinning might be a bit surprising.

Theorem 1. N j (t ) are independent rate p j λ Poisson processes. Namely,


1. For all j , N j (t ) is a Poisson process with rate p j λ;

2. All N_j(t) are mutually independent.

Figure 2: Thinning (each arrival is labeled 0 or 1; the 0-labeled arrivals form N_0(t) and the 1-labeled arrivals form N_1(t))

Remark. Intuitively, it is probably easy to accept that the resulting processes are Poisson. But
the independence is a real “surprise”. The following example explains why this theorem is surprising.
Assume that customers arrive at a restaurant according to a Poisson process, and each customer
is male or female independently with probability 1/2 each. In fact we can assume
that we flip coins to determine whether arriving customers are male or female. So intuitively
one might think that a large number of men (say 40) arriving in one hour would indicate a
large volume of business and hence a larger than normal number of women arriving. However,
this theorem tells us that the number of men arriving and the number of women arriving are
independent.

Proof. For convenience we assume that Y_i ∈ {0, 1}. Then the following calculation establishes the
independence and the distribution of the N_j(t) at the same time. For all j, k ∈ N,
Pr[N_0(t) = j ∧ N_1(t) = k] = Pr[N_0(t) = j ∧ N_0(t) + N_1(t) = j + k]
= Pr[N_0(t) + N_1(t) = j + k] · Pr[N_0(t) = j | N_0(t) + N_1(t) = j + k]
= e^{−λt} · (λt)^{j+k}/(j + k)! · C(j + k, j) · p_0^j · p_1^k
= e^{−λt} · (λt)^{j+k}/(j! · k!) · p_0^j · p_1^k
= e^{−p_0 λt} · (p_0 λt)^j/j! · e^{−p_1 λt} · (p_1 λt)^k/k! .
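To see Theorem 1 empirically, here is a minimal simulation sketch of the thinning construction at a fixed time t (assuming numpy; the function name is ours): the empirical correlation between N_0(t) and N_1(t) should be close to 0, and N_1(t) should have mean and variance close to p_1λt.

import numpy as np

rng = np.random.default_rng(0)

def thinned_counts(lam, p1, t, runs=200000):
    # Total arrivals by time t, then each arrival is labeled 1 with probability p1.
    total = rng.poisson(lam * t, size=runs)
    n1 = rng.binomial(total, p1)
    n0 = total - n1
    return n0, n1

n0, n1 = thinned_counts(lam=10.0, p1=0.5, t=1.0)
print(np.corrcoef(n0, n1)[0, 1])     # close to 0, as Theorem 1 predicts
print(n1.mean(), n1.var())           # both close to p1 * lam * t = 5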

Example 1. Customers come into a restaurant at times of a Poisson process with rate 10 per
hour. Suppose that 40% of customers are male and 60% of customers are female. What is the
distribution of the number of male customers in 2 hours?

By the thinning theorem, the number of male customers arriving in 2 hours has a Poisson distribution with mean 0.4 × 10 × 2 = 8.

Example 2 (Maximum Likelihood Estimation). Now we consider the maximum likelihood estima-
tion of the Poisson process. Suppose there are two editors reading a book of 300 pages. Editor A
finds 100 typos in the book, and editor B finds 120 typos, 80 of which are in common. Suppose
that the author’s typos follow a Poisson process with some unknown rate λ per page, while the
two editors catch typos with unknown probabilities of success pA and p B respectively. Our goal
is to determine λ, pA and p B .
Clearly, there are four types of typos.
1. Type 1: found by neither editor, with probability q_1 ≜ (1 − p_A)(1 − p_B);
2. Type 2: found only by editor A, with probability q_2 ≜ p_A(1 − p_B);
3. Type 3: found only by editor B, with probability q_3 ≜ p_B(1 − p_A);
4. Type 4: found by both editors, with probability q_4 ≜ p_A p_B.
By the thinning theorem, the type-i typos occur according to independent Poisson processes with rates q_i λ.
Let A and B be the sets of typos found by editors A and B respectively. We claim that maximum likelihood
estimation yields 300 q_2 λ = |A \ B| = 20, 300 q_3 λ = |B \ A| = 40 and 300 q_4 λ = |A ∩ B| = 80,
which yield that

p A (1 − p B )λ = 20/300 ,
p B (1 − p A )λ = 40/300 ,
p A p B λ = 80/300 .

Thus we have pA = 2/3, p B = 4/5 and λ = 1/2.
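A quick numerical check of these values (a sketch; the variable names below are ours): dividing the third equation by the second gives p_A/(1 − p_A) = 80/40, dividing it by the first gives p_B/(1 − p_B) = 80/20, and λ then follows from the third equation.

# Observed counts in the typo example and the page count.
only_A, only_B, both, pages = 20, 40, 80, 300

pA = both / (both + only_B)      # from pA/(1-pA) = both/only_B  ->  2/3
pB = both / (both + only_A)      # from pB/(1-pB) = both/only_A  ->  4/5
lam = both / (pages * pA * pB)   # from pA*pB*lam = both/pages   ->  1/2
print(pA, pB, lam)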


Finally we prove our claim. Suppose X ∼ Pois(θ) with some unknown θ . Then given x , our goal
(maximum likelihood estimation) is to find arg maxθ p(x | θ) ≜ Pr[X = x | X ∼ Pois(θ)]. Note that

p(x | θ) = e^{−θ} · θ^x / x! .

So the likelihood function is L(θ) ≜ p(x | θ) and the log-likelihood function is

ℓ(θ) ≜ ln L(θ) = −θ + x ln θ − ln(x!) .

Since ℓ′(θ) = −1 + x/θ, it is easy to verify that arg max_θ p(x | θ) = arg max_θ ℓ(θ) = x.

Example 3 (Coupon Collector Problem for Non-Uniform Coupons). Suppose that the probabilities of the
different coupon types are not identical in the coupon collector problem. Specifically, suppose that

there are n distinct types of coupons and each purchase gives a coupon of type-i independently
with probability p i . Let N be the random variable that denotes the number of purchases until
collecting all n types of coupons. Our goal is to compute E[N ].
It is clear that Σ_{j=1}^n p_j = 1. Let N_j be the random variable that denotes the number of purchases
until collecting type j. Then N_j has a geometric distribution with parameter p_j and
N = max_{1≤j≤n} N_j.
However it is difficult to compute the distribution of the maximum of N j since they are not
independent. We now consider another case: the maximum of independent exponential random
variables.
Suppose that the coupons are collected at times chosen according to a Poisson process with rate
λ = 1, where each arrival of this Poisson process brings a type- j coupon independently with
probability p j . Let {P j (t )} be the thinning of this Poisson process, X j be the first time to meet a
type- j coupon, and X = max1≤ j ≤n X j . Note that X j has an exponential distribution of rate p j .
By the independence of thinning, we have

Pr[X ≤ t] = Pr[X_1 ≤ t ∧ X_2 ≤ t ∧ ··· ∧ X_n ≤ t]
= ∏_{j=1}^n Pr[X_j ≤ t]
= ∏_{j=1}^n (1 − e^{−p_j t}),

which implies that Pr[X > t] = 1 − ∏_{j=1}^n (1 − e^{−p_j t}).
We now claim that for any nonnegative random variable X, it holds that

E[X] = ∫_0^∞ Pr[X > t] dt.

Proof of our claim. It is a double-counting argument. We first prove the discrete version where X ∈ N:

E[X] = Σ_{t=0}^∞ t · Pr[X = t] = Σ_{t=0}^∞ Σ_{s=0}^{t−1} Pr[X = t] = Σ_{s=0}^∞ Σ_{t=s+1}^∞ Pr[X = t] = Σ_{s=0}^∞ Pr[X > s].

The proof of the continuous case is almost the same. Write

E[X] = E[∫_0^X 1 dt] = E[∫_0^∞ 1_{[X>t]} dt].

Fubini's theorem (or Tonelli's theorem) justifies exchanging the order of expectation and integration. Hence,

E[X] = ∫_0^∞ E[1_{[X>t]}] dt = ∫_0^∞ Pr[X > t] dt.
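As a quick sanity check of this identity (a sketch, assuming numpy and scipy are available), take X ∼ Exponential(2), for which both sides equal 1/2:

import numpy as np
from scipy.integrate import quad

rate = 2.0
tail = lambda t: np.exp(-rate * t)        # Pr[X > t] for X ~ Exponential(rate)
integral, _ = quad(tail, 0, np.inf)
print(integral, 1.0 / rate)               # both ~ 0.5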

Remark. Note that in our proof we need Fubini’s theorem or Tonelli’s theorem. The conclusions
of these two theorems are identical, but the assumptions are different. In fact, there are various
alternative statements. Here we introduce a simple one.

Theorem 2 (Fubini's Theorem). Let A × B ⊆ R² and suppose that f : A × B → R is a measurable
function such that either f ≥ 0 throughout A × B or

∫_{A×B} |f| d(x, y) < ∞.

Then it follows that ∫_{A×B} f(x, y) d(x, y) can be evaluated by way of an iterated integral in either
order, that is,

∫_A (∫_B f(x, y) dy) dx = ∫_B (∫_A f(x, y) dx) dy.

Now let’s continue our analysis of the coupon collector problem. Using our claim, we conclude
that Z Z
∞ ∞ Y
n ¡ ¢
E[X ] = Pr[X > t ] dt = 1− 1 − e−p j t dt .
0 0 j =1

Finally, we relate E[X], the expected time until collecting all types of coupons in the Poisson
process, to our goal E[N] in the coupon collector problem.
Let τ_i denote the time between the (i − 1)-th arrival of coupons and the i-th arrival of coupons. It
is clear that

X = Σ_{i=1}^N τ_i,

where τ_i has an exponential distribution with rate λ = 1. Taking the expectation of both sides
we have

E[X] = E[Σ_{i=1}^N τ_i].

If N were a constant, then we would have E[X] = Σ_{i=1}^N E[τ_i] = N · E[τ_1] immediately. However, N is a random
variable here.

Question. Is it true that

E[Σ_{i=1}^N τ_i] = E[N] · E[τ_1]

if N is a random variable?

The answer is yes if τ_1, τ_2, ... are i.i.d. random variables with finite mean, N is independent of
(τ_1, τ_2, ...), and N has finite expectation as well. Actually, this is a simple case of Wald's
equation, which we will introduce later. Here we only give a simple proof for our case.

Note that the τ_i are i.i.d. exponential random variables with rate 1, and N is independent of (τ_1, τ_2, ...),
so

E[X | N = n] = E[τ_1 + ··· + τ_N | N = n] = E[τ_1 + ··· + τ_n] = n · E[τ_1] = n.

Applying the law of total expectation, we have

E[X ] = E[E[X | N ]] = E[N ] ,

and hence conclude that

E[N] = ∫_0^∞ Pr[X > t] dt = ∫_0^∞ (1 − ∏_{j=1}^n (1 − e^{−p_j t})) dt.

Now we verify that, in the uniform case p_j = 1/n, the expected number of purchases is n times the
n-th harmonic number.
E[N] = ∫_0^∞ (1 − (1 − e^{−t/n})^n) dt
= ∫_0^∞ (1 − (1 − e^{−t/n})^n) · (−n · e^{t/n}) d(e^{−t/n})
= n ∫_0^1 (1/x − (1 − x)^n/x) dx          (substituting x = e^{−t/n})
= n Σ_{k=1}^n ∫_0^1 ((1 − x)^{k−1}/x − (1 − x)^k/x) dx
= n Σ_{k=1}^n ∫_0^1 (1 − x)^{k−1} dx
= n Σ_{k=1}^n 1/k.
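To see the general formula and the uniform special case agree, here is a small numerical sketch (assuming numpy and scipy; the function name is ours). It evaluates E[N] = ∫_0^∞ (1 − ∏_j (1 − e^{−p_j t})) dt by numerical integration and compares with n·H_n in the uniform case:

import numpy as np
from scipy.integrate import quad

def expected_purchases(p):
    # E[N] = integral over [0, inf) of 1 - prod_j (1 - exp(-p_j * t)).
    p = np.asarray(p, dtype=float)
    tail = lambda t: 1.0 - np.prod(1.0 - np.exp(-p * t))
    value, _ = quad(tail, 0, np.inf, limit=200)
    return value

n = 10
print(expected_purchases(np.full(n, 1.0 / n)))      # ~ 29.29
print(n * sum(1.0 / k for k in range(1, n + 1)))    # n * H_n = 29.29
print(expected_purchases([0.5, 0.3, 0.2]))          # a non-uniform example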

2 Conditioning
Let T_1, T_2, T_3, ... be the arrival times of a Poisson process with rate λ. Suppose that N(t) = n, that
is, 0 < T_1 < T_2 < ··· < T_n ≤ t < T_{n+1}. Let U_1, U_2, ..., U_n be n numbers chosen independently and uniformly
at random from [0, t], and let V_1, V_2, ..., V_n be the numbers obtained by rearranging U_1, U_2, ..., U_n
in increasing order. Conditioned on N(t) = n, we have the following theorem.

Theorem 3. The vector (T1 , T2 , . . . , Tn ) has the same distribution as (V1 ,V2 , . . . ,Vn ).

Proof. For any 0 < t_1 < t_2 < ··· < t_n ≤ t, by independence we have

Pr[⋀_{i=1}^n T_i = t_i | N(t) = n] = Pr[τ_1 = t_1, τ_2 = t_2 − t_1, ..., τ_n = t_n − t_{n−1}, τ_{n+1} > t − t_n] / Pr[N(t) = n]
= (λe^{−λt_1} · λe^{−λ(t_2−t_1)} · ··· · λe^{−λ(t_n−t_{n−1})} · e^{−λ(t−t_n)}) / (e^{−λt} (λt)^n / n!)
= λ^n e^{−λt} / (e^{−λt} (λt)^n / n!) = n!/t^n.

Note that the volume of the simplex S = {(t 1 , t 2 , . . . , t n ) : 0 < t 1 < t 2 < . . . < t n ≤ t } is t n /n!. So
(T1 , T2 , . . . , Tn ) has a uniform distribution over S .
It is easy to see that (V1 ,V2 , . . . ,Vn ) also has a uniform distribution over S , since for any permu-
tation π = (π1 , . . . , πn ) of [n],
Pr[U_{π_1} < U_{π_2} < ··· < U_{π_n}] = 1/n!,

and thus

Pr[⋀_{i=1}^n V_i = t_i | U_{π_1} < U_{π_2} < ··· < U_{π_n}] = Pr[⋀_{i=1}^n U_{π_i} = t_i] / Pr[U_{π_1} < U_{π_2} < ··· < U_{π_n}] = (1/t^n)/(1/n!) = n!/t^n.

Remark. Here we should point out that the counting argument for the volume of S only works for
finite sets. Since every infinite subset of R^n has a one-to-many mapping into itself, please DO
NOT compute the volume by mapping points to another set. Actually it is easy to compute the
volume of S by an induction on n:
Vol(S) = ∫_{0<t_1<t_2<···<t_n≤t} 1 d(t_1, t_2, ..., t_n)
= ∫_0^t (∫_{0<t_1<t_2<···<t_{n−1}<t_n} 1 d(t_1, t_2, ..., t_{n−1})) dt_n
= ∫_0^t t_n^{n−1}/(n − 1)! dt_n = t^n/n!.

Corollary 4. Suppose that 0 ≤ s < t and m ≤ n. Then

Pr[N(s) = m | N(t) = n] = C(n, m) · (s/t)^m · (1 − s/t)^{n−m}.

Proof. The probability is the same as the probability that exactly m of n numbers chosen independently and
uniformly from [0, t] fall in (0, s].
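A small simulation sketch of Theorem 3 / Corollary 4 (assuming numpy; the setup below is ours): generate arrival times on [0, t] from exponential gaps, keep only runs with N(t) = n, and check that N(s) then has the Binomial(n, s/t) mean and variance.

import numpy as np

rng = np.random.default_rng(1)
lam, t, s, n = 2.0, 5.0, 2.0, 10
counts = []
for _ in range(50000):
    times = np.cumsum(rng.exponential(1.0 / lam, size=50))   # arrival times
    times = times[times <= t]
    if len(times) == n:                                       # condition on N(t) = n
        counts.append(np.sum(times <= s))                     # N(s)
counts = np.array(counts)
print(counts.mean(), n * s / t)                   # both ~ 4
print(counts.var(), n * (s / t) * (1 - s / t))    # both ~ 2.4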

3 Poisson Approximation
Example 4 (Balls-into-bins, Maxload). There are m balls and n bins. Each ball is independently
and uniformly put into a bin. So for each bin, the expected number of balls in it is m/n,
and this number has a binomial distribution with parameters m and 1/n. However, the numbers of balls
in different bins are not independent. So the problem is how to analyze the joint distribution.
Formally, let X_i denote the number of balls in the i-th bin. Then we have X_i ∼ Binom(m, 1/n)
and Σ_{i=1}^n X_i = m.

Theorem 5. The distribution of (X_1, ..., X_n) is the same as the distribution of (Y_1, ..., Y_n) conditioned
on Σ_{i=1}^n Y_i = m, where Y_i ∼ Pois(λ) are independent Poisson random variables with an arbitrary rate λ.
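A quick empirical sketch of Theorem 5 (assuming numpy; the setup below is ours): throw m balls into n bins directly, and separately draw independent Pois(m/n) variables but keep only draws whose sum is m; any statistic of the two count vectors, e.g. the maximum load, should match.

import numpy as np

rng = np.random.default_rng(2)
m, n, runs = 20, 5, 100000

# Direct balls-into-bins: the counts are Multinomial(m, uniform).
direct_max = rng.multinomial(m, np.ones(n) / n, size=runs).max(axis=1)

# Independent Pois(m/n) counts, conditioned on summing to m.
Y = rng.poisson(m / n, size=(runs, n))
cond_max = Y[Y.sum(axis=1) == m].max(axis=1)

print(direct_max.mean(), cond_max.mean())   # should be close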

The proof is straightforward. But this theorem has many important applications. For example,
let’s consider the following maxload problem.
Assume that m = n , and let X = max1≤i ≤n X i . Applying Theorem 5 we will show the following
result.

Theorem 6 (Maxload). There exist two constants c_1, c_2 > 0 such that

Pr[c_1 · log n/log log n < X < c_2 · log n/log log n] = 1 − o(1/n).

Proof of the upper bound. We first prove the upper bound on X as a warm-up, that is, we are going
to show that

Pr[X ≥ c · log n/log log n] = o(1/n)

for some c > 0. For convenience, let k = c log n/log log n. Using the union bound we have

Pr[X ≥ k] = Pr[∃ i s.t. X_i ≥ k]
≤ n · Pr[X_1 ≥ k] ≤ n · C(n, k) · (1/n)^k
≤ n · (en/k)^k · n^{−k} = n · e^k/k^k.

It is sufficient to show that n · e^k/k^k < 1/n^{1+ε} for some ε > 0. Taking logarithms on both sides,
this is equivalent to

log n + k − k log k < −(1 + ε) log n.

Hence k log k > 3 log n suffices. Note that

k log k = c · (log n/log log n) · (log log n − log log log n + log c)
> c log n · (1 − log log log n/log log n) > (c/2) · log n.
So we complete our proof by letting c > 6.

The proof of the lower bound is a bit more complicated. We now introduce a powerful tool.
Theorem 7. Let f : N^n → N be an arbitrary function, and let Y_1, Y_2, ..., Y_n be n independent Poisson random
variables with rate λ = m/n, i.e., Y_i ∼ Pois(m/n). Then we have

E[f(X_1, X_2, ..., X_n)] ≤ e√m · E[f(Y_1, Y_2, ..., Y_n)].

Remark. Note that the difference between the statement of this theorem and Theorem 5 is that
we no longer need to condition on Σ_i Y_i = m.
The power of this theorem is that it bounds the expectation of any function of the X_i by the expectation
of the same function of independent Poissons. Usually, we let the function indicate some event, so the
expectation of the indicator is the probability that the event happens.
As an application of this theorem, we now prove the lower bound on X in Theorem 6.

Proof of the lower bound. We are now ready to show that there exists c > 0 s.t.

Pr[X ≤ c · log n/log log n] = o(1/n).
log log n
Let k = c log n/ log log n and f be the indicator function such that f (x 1 , x 2 , . . . , x n ) = 1 if x i ≤ k for
all i and f = 0 otherwise. Namely,

f(x_1, x_2, ..., x_n) = 1[max_{1≤i≤n} x_i ≤ k].

Hence, E[f(Z_1, Z_2, ..., Z_n)] = Pr[max_{1≤i≤n} Z_i ≤ k] for any random variables Z_1, Z_2, ..., Z_n. Since
Y_1, Y_2, ..., Y_n are independent Poissons, we have

E[f(Y_1, Y_2, ..., Y_n)] = Pr[max_{1≤i≤n} Y_i ≤ k]
= Pr[Y_1 ≤ k ∧ Y_2 ≤ k ∧ ··· ∧ Y_n ≤ k]
= Pr[Y_1 ≤ k] · Pr[Y_2 ≤ k] · ··· · Pr[Y_n ≤ k]
≤ (1 − Pr[Y_1 = k + 1])^n
= (1 − 1/((k + 1)! · e))^n
≤ e^{−n/(e(k+1)!)}.

Note that ln(k + 2) − ln k = ln(1 + 2/k) < 2/k, so for k ≥ 2,

ln((k + 1)!) = Σ_{i=1}^{k+1} ln i < ∫_1^{k+2} ln x dx = (k + 2) ln(k + 2) − k − 1 < (k + 2) ln k − k + 3.

Plugging in k = c log n/log log n and letting c = 1, we have

ln((k + 1)!) < ((c log n + 2 log log n)/log log n) · (log log n + log c − log log log n) − c log n/log log n + 3
< log n + 2 log log n − log n/log log n < log n − log log n − 2

for sufficiently large n . It follows that


e · (k + 1)! < n/(e · log n),

and thus

E[f(Y_1, Y_2, ..., Y_n)] ≤ e^{−n/(e(k+1)!)} < e^{−e·log n} = n^{−e}.

Combining with Theorem 7, we conclude that

Pr[X ≤ log n/log log n] = E[f(X_1, ..., X_n)] ≤ e√n · E[f(Y_1, ..., Y_n)] < e√n/n^e
for sufficiently large n .

As the last part of today’s lecture, let’s prove Theorem 7.

Proof of Theorem 7. Applying Theorem 5, it is easy to see that

E[f(Y_1, Y_2, ..., Y_n)] = Σ_{k=0}^∞ E[f(Y_1, Y_2, ..., Y_n) | Σ_i Y_i = k] · Pr[Σ_i Y_i = k]
≥ E[f(Y_1, Y_2, ..., Y_n) | Σ_i Y_i = m] · Pr[Σ_i Y_i = m]
= E[f(X_1, X_2, ..., X_n)] · Pr[Σ_i Y_i = m].
Now it is sufficient to verify that Pr[Σ_i Y_i = m] ≥ 1/(e√m) if Y_i ∼ Pois(m/n). The proof is straightforward
by Stirling's approximation, since Σ_i Y_i ∼ Pois(m) and thus

Pr[Σ_i Y_i = m] = e^{−m} · m^m/m! > e^{−m} · m^m/(e^{1/(12m)} √(2πm) (m/e)^m) > 1/(e√m).
Remark. Theorem 6 estimates the maxload of the balls-into-bins model, which is very useful in
computer science. For example, in the Erdős-Rényi random graph model, using the same tech-
nique we can show that the maximum degree of a random graph G(n, d /n) is Θ(log n/ log log n)
with high probability, even though the average degree d is a constant.
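To close, a short simulation sketch of the max-load phenomenon in Theorem 6 (assuming numpy): throw n balls into n bins and compare the observed maximum load with log n/log log n.

import numpy as np

rng = np.random.default_rng(3)
for n in (10**3, 10**4, 10**5):
    bins = rng.integers(0, n, size=n)              # each ball picks a uniform bin
    max_load = np.bincount(bins, minlength=n).max()
    print(n, max_load, np.log(n) / np.log(np.log(n)))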
