2016 S Poisson Distribution
Derivation of the Poisson distribution (the Law of Rare Events).
Begin with the exact result for the probability distribution governing the outcome of N tosses of a very
unfair coin. We assume that p(heads), the probability of obtaining heads on one toss, is much less than
one (p<<1). Thus, in a long sequence of coin tosses, the appearance of heads is a rare event. As shown
in class, the exact probability of obtaining n heads in N tosses (independent of the value of p) is:
$$P(N,n) = \frac{N!}{(N-n)!\,n!}\, p^n (1-p)^{N-n}$$
We make two approximations:
First, show that:
$$(1-p)^{N-n} \approx e^{-Np}$$
Derivation:
$$\ln\left[(1-p)^{N-n}\right] = (N-n)\ln(1-p)$$
$$\ln(1-p) \approx -p \quad (\text{for } p \ll 1)$$
$$(N-n)(-p) \approx -Np$$
therefore
$$(1-p)^{N-n} \approx e^{-Np}$$
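A quick numerical check can make the first approximation concrete. The following is only a sketch; the values of N, n, and p are illustrative assumptions, not taken from the notes, and the function names are hypothetical.

```python
import math

def exact_factor(N, n, p):
    """Exact value of (1 - p)^(N - n)."""
    return (1.0 - p) ** (N - n)

def approx_factor(N, p):
    """Approximate value e^(-N p), expected to hold for p << 1 and n << N."""
    return math.exp(-N * p)

# Illustrative values (assumptions): a rare event and few heads.
N, n, p = 1000, 3, 0.01
exact = exact_factor(N, n, p)
approx = approx_factor(N, p)
print(f"exact  = {exact:.4e}")
print(f"approx = {approx:.4e}")
print(f"ratio  = {exact / approx:.4f}")
```

The ratio should be close to 1 for these values and should move closer to 1 as p is made smaller with Np held fixed.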
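Second, show that:
$$\frac{N!}{(N-n)!} \approx N^n$$
Derivation (using Stirling's approximation):
$$\ln\left(\frac{N!}{(N-n)!}\right) \approx N\ln N - N - (N-n)\ln(N-n) + (N-n)$$
$$n \ll N$$
$$\ln(N-n) = \ln N + \ln(1 - n/N) \approx \ln N - n/N$$
$$(N-n)(\ln N - n/N) = N\ln N - n\ln N - n + n^2/N$$
$$n^2/N \approx 0$$
so
$$\ln\left(\frac{N!}{(N-n)!}\right) \approx n\ln N$$
thus
$$\frac{N!}{(N-n)!} \approx N^n$$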
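The second approximation can also be checked numerically. This sketch computes the exact ratio N!/(N-n)! as a product of n integers (to avoid enormous factorials) and compares it with N^n; the chosen values of N and n are illustrative assumptions.

```python
import math

def falling_factorial(N, n):
    """Exact N!/(N-n)! = N (N-1) ... (N-n+1), as an integer."""
    return math.prod(range(N - n + 1, N + 1))

# Illustrative values (assumptions): fixed small n, increasing N.
n = 3
for N in (10, 100, 1000, 10**6):
    ratio = falling_factorial(N, n) / N**n
    print(f"N={N:>7}: [N!/(N-n)!] / N^n = {ratio:.6f}")
```

For fixed n, the ratio approaches 1 as N grows, which is the regime n << N assumed in the derivation.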
Putting these results together
$$P(N,n) \approx \frac{(Np)^n e^{-Np}}{n!}$$
Often, the symbol $\lambda = Np$ is assigned, where $\lambda$ is the expected outcome. In this case, $\lambda$ is the number of
heads expected in N tosses, given the probability p that heads will come up in one toss. To put it the
other way around, if you wished to determine p, you would toss the coin N times, count the heads, and
calculate $p \approx \lambda/N$. Then,
$$P(N,n) \approx \frac{\lambda^n e^{-\lambda}}{n!}$$
P(N,n) is the Poisson distribution, an approximation giving the probability of obtaining exactly n
heads in N tosses of the coin, for p<<1.
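To see how good the approximation is, the exact coin-toss result can be compared directly with the Poisson form. This is a minimal sketch; the function names are hypothetical, `mean` stands for the expected number of heads Np, and N = 1000, p = 0.01 are illustrative values.

```python
import math

def binomial_pmf(N, n, p):
    """Exact probability of n heads in N tosses."""
    return math.comb(N, n) * p**n * (1.0 - p) ** (N - n)

def poisson_pmf(n, mean):
    """Poisson approximation with mean = N p."""
    return mean**n * math.exp(-mean) / math.factorial(n)

N, p = 1000, 0.01          # illustrative values (assumptions)
mean = N * p               # expected number of heads
for n in (0, 5, 10, 15, 20):
    exact = binomial_pmf(N, n, p)
    approx = poisson_pmf(n, mean)
    print(f"n={n:>2}: exact {exact:.5f}, Poisson {approx:.5f}")
```

The two columns should agree closely for p << 1; repeating the comparison with a larger p (say 0.3) shows where the approximation starts to fail.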
To think about how this might apply to a sequence in space or time, imagine tossing a coin that has
p=0.01, 1000 times. This will produce a long sequence of tails but occasionally a head will turn up.
TTTTTTTTTTTTTHTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTHTT
(about 10 heads are expected in total). Now consider this sequence to be seconds in time, where H means
that a customer arrives at a cash register (T means no customer). In this case, the customer arrival
rate is 10 per 1000 seconds, or 1 per 100 seconds. The Poisson distribution then gives the probability that
n = 0, 1, 2, ... customers arrive in any given 100-second interval.
Alternatively, one could consider the probability of encountering road kill per mile. H would correspond
to road kill, T=none.
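The cash register picture can be simulated directly. The sketch below tosses the p = 0.01 coin once per "second", counts arrivals in each 100-second interval, and compares the observed frequencies with the Poisson prediction. The number of intervals, the fixed random seed, and the function names are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def simulate_interval_counts(p, seconds_per_interval, n_intervals, seed=0):
    """Toss the unfair coin once per second; tally heads (arrivals) per interval."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    counts = Counter()
    for _ in range(n_intervals):
        arrivals = sum(rng.random() < p for _ in range(seconds_per_interval))
        counts[arrivals] += 1
    return counts

def poisson_pmf(n, mean):
    """Poisson probability of exactly n events when `mean` are expected."""
    return mean**n * math.exp(-mean) / math.factorial(n)

p, seconds_per_interval, n_intervals = 0.01, 100, 5000
counts = simulate_interval_counts(p, seconds_per_interval, n_intervals)
mean = p * seconds_per_interval  # expected arrivals per interval (here 1)
for n in range(4):
    observed = counts[n] / n_intervals
    print(f"n={n}: simulated {observed:.3f}, Poisson {poisson_pmf(n, mean):.3f}")
```

The simulated fractions fluctuate from run to run (change the seed to see this), but they should stay near the Poisson values.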
For example, suppose that from 10-11 am, 60 customers arrive at the bank, or 1 customer per minute.
On a per-minute basis, we can calculate the probability that exactly n = 0, 1, 2, ... customers will arrive:
$$P(n) = 1^n e^{-1}/n!$$
$$P(0) = 1/e = 0.368$$
$$P(1) = 1/e = 0.368$$
$$P(2) = e^{-1}/2! = 0.184$$
...
To calculate the probability that 1 or more customers will arrive, sum the distribution over all n ≥ 1 or,
equivalently, use 1 − P(0) = 1 − 0.368 = 0.632.
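The bank example can be reproduced in a few lines. This sketch assumes an expected count of 1 customer per minute, as in the text, and computes the probability of at least one arrival in two ways: by summing the distribution over n ≥ 1 (truncated at an arbitrary cutoff, which is an assumption) and by taking the complement of P(0).

```python
import math

def poisson_pmf(n, mean):
    """Poisson probability of exactly n arrivals when `mean` are expected."""
    return mean**n * math.exp(-mean) / math.factorial(n)

mean = 1.0  # one customer per minute, from the bank example
probs = [poisson_pmf(n, mean) for n in range(3)]
for n, prob in enumerate(probs):
    print(f"P({n}) = {prob:.3f}")

# Probability of 1 or more arrivals, two ways.
cutoff = 50  # arbitrary truncation of the infinite sum (an assumption)
p_sum = sum(poisson_pmf(n, mean) for n in range(1, cutoff))
p_complement = 1.0 - poisson_pmf(0, mean)
print(f"sum over n >= 1: {p_sum:.3f}")
print(f"1 - P(0):        {p_complement:.3f}")
```

The complement form is usually the more convenient one, since it needs only a single term.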
For one nice discussion of the properties of the Poisson distribution, see
http://en.wikipedia.org/wiki/Poisson_distribution
For an overview of the use of the Poisson distribution in Queuing Theory (i.e. supermarket cashier lines,
freeway traffic density considering on/off ramps, etc.), see
http://www.eventhelix.com/RealtimeMantra/CongestionControl/queueing_theory.htm
Interestingly, Queuing Theory begins with the simple-appearing but surprisingly non-obvious theorem
known as Little's Theorem. The following is taken from the above web site.
The average number of customers (N) can be determined from the following equation:
$$N = \lambda T$$
Here lambda is the average customer arrival rate and T is the average service time for a customer. Proof of this theorem can
be obtained from any standard textbook on queueing theory. Here we will focus on an intuitive understanding of the result.
Consider the example of a restaurant where the customer arrival rate (lambda) doubles but the customers still spend the same
amount of time in the restaurant (T). This will double the number of customers in the restaurant (N). By the same logic if the
customer arrival rate remains the same but the customers service time doubles, this will also double the total number of
customers in the restaurant.
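The doubling argument can be written out as a small calculation. This sketch assumes the standard product form of Little's Theorem, N = lambda × T; the restaurant numbers and the function name are invented purely for illustration.

```python
def average_customers(arrival_rate, service_time):
    """Little's Theorem in product form: N = lambda * T (assumed here)."""
    return arrival_rate * service_time

# Hypothetical restaurant: 2 customers arrive per minute, each stays 30 minutes.
base = average_customers(arrival_rate=2.0, service_time=30.0)
doubled_rate = average_customers(arrival_rate=4.0, service_time=30.0)
doubled_time = average_customers(arrival_rate=2.0, service_time=60.0)

print(f"baseline N:         {base:.0f}")
print(f"arrival rate x 2:   {doubled_rate:.0f}")
print(f"service time x 2:   {doubled_time:.0f}")
```

Either change doubles the average number of customers present, matching the intuitive argument in the quoted passage.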
Note: the above argument is only a rough guide, because if customers arrive more rapidly than they can
be serviced, N will continue to increase without bound. After this point, it gets more interesting!