Probability Distribution Zoo
Let X denote the total number of successes in n independent trials, each with success
probability π. Then X follows a binomial distribution with parameters n and π, where n ≥ 1 is
a known integer and 0 ≤ π ≤ 1. This is often written as:
X ∼ Bin(n, π).
If X ∼ Bin(n, π), then:
E(X) = n π.
Example
A multiple choice test has 4 questions, each with 4 possible answers. James is taking the test,
but has no idea at all about the correct answers. So he guesses every answer and, therefore,
has probability 1/4 of getting any individual question correct.
Let X denote the number of correct answers in James’ test. X follows the binomial distribution
with n = 4 and π = 0.25, i.e. we have:
X ∼ Bin(4, 0.25).
For example, what is the probability that James gets 3 of the 4 questions correct?
Here it is assumed that the guesses are independent, and each has the probability π = 0.25 of
being correct.
However, we do not care about the order of the 0s and 1s (incorrect and correct answers,
respectively), only about the number of 1s. So 1101 and 1011, for example, also count as 3
correct answers. Each of these also has the probability $\pi^3 (1 - \pi)^1$.
The total number of sequences with three 1s (and, therefore, one 0) is the number of locations
for the three 1s which can be selected in the sequence of 4 answers. This is $\binom{4}{3} = 4$
(see below).
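In general, the probability function of X ∼ Bin(n, π) is:
\[ P(X = x) = \binom{n}{x} \pi^x (1 - \pi)^{n - x} \quad \text{for } x = 0, 1, \ldots, n, \]
where $\binom{n}{x}$ is the binomial coefficient – in short, the number of ways of choosing x
objects out of n when sampling without replacement when the order of the objects does not matter.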
$\binom{n}{x}$ can be calculated as:
\[ \binom{n}{x} = \frac{n!}{x! \, (n - x)!} \]
where k! = k × (k − 1) × · · · × 3 × 2 × 1, for an integer k > 0. Also note that 0! = 1. For
example:
\[ \binom{4}{3} = \frac{4!}{3! \, (4 - 3)!} = \frac{4!}{3! \, 1!} = \frac{4 \times 3 \times 2 \times 1}{(3 \times 2 \times 1) \times 1} = \frac{24}{6 \times 1} = 4. \]
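The question about James can be answered numerically. The following is a minimal sketch, assuming the standard binomial probability function P(X = x) = C(n, x) π^x (1 − π)^(n − x); the helper name binomial_pmf is chosen here for illustration and does not come from the notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ Bin(n, pi) (illustrative helper, not from the notes)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


# Number of sequences of 4 answers containing exactly three 1s.
n_sequences = comb(4, 3)

# Probability that James guesses exactly 3 of the 4 questions correctly.
p_three_correct = binomial_pmf(3, 4, 0.25)

print(n_sequences)      # 4
print(p_three_correct)  # 0.046875
```

So, under the independence assumption, James has a little under a 5% chance of getting exactly 3 questions correct by guessing.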
Example
More generally, consider a student taking a 20-question test who has the same probability π of
answering each question correctly, so that X ∼ Bin(20, π).
The figure below shows plots of the probabilities for π = 0.25, 0.5, 0.7 and 0.9 (reflecting
students of differing abilities, i.e. the better the student, the more likely they are to get
an answer correct, and hence the higher the value of π).
Note that as π increases, the probability of obtaining a large number of correct answers increases
(and hence the probability of obtaining a small number of correct answers decreases) as we
would expect because better-prepared students tend to score higher marks. Of course, there is
an opportunity cost: it takes more time and effort to prepare, but this on average is rewarded
with high marks!
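The effect of π can also be tabulated directly. The sketch below assumes the standard binomial probability function; the threshold of 15 correct answers is an arbitrary choice made here to stand for a "large number", and the helper name is illustrative.

```python
from math import comb


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ Bin(n, pi) (illustrative helper)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


n = 20
summary = {}
for pi in (0.25, 0.5, 0.7, 0.9):
    probs = [binomial_pmf(x, n, pi) for x in range(n + 1)]
    mean = sum(x * p for x, p in enumerate(probs))  # should equal n * pi
    p_at_least_15 = sum(probs[15:])                 # P(X >= 15), arbitrary threshold
    summary[pi] = (mean, p_at_least_15)
    print(f"pi = {pi}: E(X) = {mean:.2f}, P(X >= 15) = {p_at_least_15:.4f}")
```

The computed means agree with E(X) = n π, and the probability of at least 15 correct answers rises as π increases.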
[Figure: probabilities of Bin(20, π) in four panels, for π = 0.25, 0.5, 0.7 and 0.9.
Horizontal axis: 0 to 20; vertical axis: Probability.]
Poisson distribution
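The possible values of the Poisson distribution are the non-negative integers 0, 1, 2, ....
A random variable X which has a Poisson distribution with parameter λ > 0 has the probability
function:
\[ p(x) = \frac{e^{-\lambda} \lambda^x}{x!} \quad \text{for } x = 0, 1, 2, \ldots \]
This is written as:
X ∼ Poisson(λ) or X ∼ Pois(λ).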
If X ∼ Poisson(λ), then:
E(X) = λ
Poisson distributions are used for counts of occurrences of various kinds. To give a formal
motivation, suppose that we consider the number of occurrences of some phenomenon in time,
and that the process which generates the occurrences satisfies the following conditions:
Credits to Dr. James Abdey, University of London
1. The numbers of occurrences in any two disjoint intervals of time are independent of each
other.
2. The probability of two or more occurrences at the same time is negligibly small.
3. The probability of one occurrence in any short time interval of length t is λt for some
constant λ > 0.
In essence, these state that individual occurrences should be independent, sufficiently rare, and
happen at a constant rate λ per unit of time. A process like this is a Poisson process.
If occurrences are generated by a Poisson process, then the number of occurrences X in a
randomly selected time interval of length t = 1 follows a Poisson distribution with mean λ, i.e.
X ∼ Poisson(λ).
The single parameter λ of the Poisson distribution is, therefore, the rate of occurrences per
unit of time.
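The three conditions can be mimicked with a rough simulation. This is only a discretised sketch, not an exact Poisson process: one unit of time is split into many short sub-intervals, each of which independently contains one occurrence with probability λ × (sub-interval length). The rate of 2, the number of sub-intervals, the number of repetitions and the seed are all arbitrary choices made here.

```python
import random


def simulate_count(rate: float, steps: int, rng: random.Random) -> int:
    """Count occurrences in one unit-length interval split into `steps` short
    sub-intervals, each holding one occurrence with probability rate / steps."""
    p = rate / steps
    return sum(1 for _ in range(steps) if rng.random() < p)


rng = random.Random(0)  # fixed seed so the run is repeatable
rate = 2.0              # assumed value of lambda, for illustration only
counts = [simulate_count(rate, 500, rng) for _ in range(2000)]

sample_mean = sum(counts) / len(counts)
print(f"sample mean of the counts: {sample_mean:.3f} (lambda = {rate})")
```

The sample mean of the simulated counts should lie close to λ, in line with E(X) = λ.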
Example
Examples of variables for which we might use a Poisson distribution include the following.
Because λ is the rate per unit of time, its value also depends on the unit of time (that is, the
length of interval) we consider.
Example
If X is the number of arrivals per hour and X ∼ Poisson(1.5), then if Y is the number of
arrivals per two hours, Y ∼ Poisson(2 × 1.5) = Poisson(3).
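The scaling of λ with the length of the interval can be checked numerically. The sketch below assumes the standard Poisson probability function and treats the two-hour count as the sum of two independent one-hour counts, each Poisson(1.5); function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam) (illustrative helper)."""
    return exp(-lam) * lam**x / factorial(x)


rate_per_hour = 1.5
hours = 2
lam_two_hours = rate_per_hour * hours  # rate for the two-hour interval


def pmf_sum_of_two_hours(y: int) -> float:
    """P(Y = y) when Y is the sum of two independent hourly Poisson counts."""
    return sum(
        poisson_pmf(k, rate_per_hour) * poisson_pmf(y - k, rate_per_hour)
        for k in range(y + 1)
    )


# Largest difference between the two ways of computing P(Y = y), for y = 0..14.
max_gap = max(
    abs(pmf_sum_of_two_hours(y) - poisson_pmf(y, lam_two_hours)) for y in range(15)
)
print(lam_two_hours)  # 3.0
print(max_gap < 1e-12)
```

The two calculations agree up to floating-point rounding, which is consistent with Y ∼ Poisson(3).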
Both motivations suggest that distributions with higher values of λ have higher probabilities
of large values of X.
Example
[Figure: Poisson probabilities p(x) for λ = 2 and λ = 4, plotted for x = 0 to 10.]
Example
Customers arrive at a bank on weekday afternoons randomly at an average rate of 1.6 customers
per minute. Let X denote the number of arrivals per minute and Y denote the number of
arrivals per 5 minutes.
2. What is the probability that more than two customers arrive in a one-minute interval?
Since X ∼ Poisson(1.6), P(X > 2) = 1 − P(X ≤ 2) = 1 − [P(X = 0) + P(X = 1) + P(X = 2)], which is:
\[ 1 - \frac{e^{-1.6} (1.6)^0}{0!} - \frac{e^{-1.6} (1.6)^1}{1!} - \frac{e^{-1.6} (1.6)^2}{2!} = 1 - e^{-1.6} - 1.6 e^{-1.6} - 1.28 e^{-1.6} = 1 - 3.88 e^{-1.6} = 0.2166. \]
3. What is the probability that no more than 1 customer arrives in a five-minute interval?
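Since Y ∼ Poisson(5 × 1.6) = Poisson(8), the probability P(Y ≤ 1) is:
\[ P(Y = 0) + P(Y = 1) = \frac{e^{-8} \, 8^0}{0!} + \frac{e^{-8} \, 8^1}{1!} = 9 e^{-8} = 0.0030. \]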
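The Poisson probabilities in this bank example can be checked numerically. The following is a minimal sketch, assuming the standard Poisson probability function p(x) = e^(−λ) λ^x / x!; the helper names are chosen here for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam) (illustrative helper)."""
    return exp(-lam) * lam**x / factorial(x)


def poisson_cdf(x: int, lam: float) -> float:
    """P(X <= x) for X ~ Poisson(lam), by summing the probability function."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


rate_per_minute = 1.6

# More than two arrivals in one minute: X ~ Poisson(1.6).
p_more_than_two = 1 - poisson_cdf(2, rate_per_minute)

# No more than one arrival in five minutes: Y ~ Poisson(5 * 1.6) = Poisson(8).
p_at_most_one = poisson_cdf(1, 5 * rate_per_minute)

print(f"P(X > 2)  = {p_more_than_two:.4f}")  # 0.2166
print(f"P(Y <= 1) = {p_at_most_one:.4f}")    # 0.0030
```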
There are close connections between some probability distributions, even across different
families of them. Some connections are exact, i.e. one distribution is exactly equal to another,
for particular values of the parameters. For example, Bernoulli(π) is the same distribution as
Bin(1, π).
Some connections are approximate (or asymptotic), i.e. one distribution is closely
approximated by another under some limiting conditions. We next discuss one of these, the
Poisson approximation of the binomial distribution.
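Suppose that:
• X ∼ Bin(n, π)
• n is large and π is small.
Then the distribution of X is well approximated by Poisson(λ) with λ = n π.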
The connection is exact at the limit, i.e. Bin(n, π) → Poisson(λ) if n → ∞ and π → 0 in such
a way that n π = λ remains constant.
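The limiting behaviour can be illustrated by holding n π = λ fixed while n grows. The sketch below assumes the standard binomial and Poisson probability functions; λ = 2, the values of n and the comparison range x = 0, ..., 30 are arbitrary choices made here.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, pi: float) -> float:
    """P(X = x) for X ~ Bin(n, pi); zero when x exceeds n."""
    if x > n:
        return 0.0
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


lam = 2.0  # assumed value, for illustration only
max_gap = {}
for n in (10, 100, 1000):
    pi = lam / n  # keeps n * pi equal to lam
    max_gap[n] = max(
        abs(binomial_pmf(x, n, pi) - poisson_pmf(x, lam)) for x in range(31)
    )
    print(f"n = {n:4d}, pi = {pi:.3f}: largest gap in probabilities = {max_gap[n]:.5f}")
```

The largest gap between the two sets of probabilities shrinks as n increases, as the limit result suggests.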
This ‘law of small numbers’ provides another motivation for the Poisson distribution.