Chapter 3
DISCRETE PROBABILITY DISTRIBUTIONS
Introduction
Many physical systems can be modelled by the same or similar random variables
and random experiments. The distributions of the random variables involved in these
common systems can be analyzed once, and the results of that analysis reused in
different applications and examples. In this chapter, several random experiments and
discrete random variables that often appear in applications are discussed. A discussion
of the underlying sample space of the random experiment is frequently omitted, and the
distribution of the particular random variable of interest is described directly.
Discrete Probability Distribution
A discrete distribution describes the probability of occurrence of each value of a
discrete random variable. A discrete random variable is a random variable that has
countable values, such as a list of non-negative integers.
With a discrete probability distribution, each possible value of the discrete random
variable can be associated with a non-zero probability. Thus, a discrete probability
distribution is often presented in tabular form.
Video: https://www.youtube.com/watch?v=mrCxwEZ_22o
3.1 Random Variables and Their Probability Distributions
Random Variables
In probability and statistics, a random variable is a variable whose value is subject
to variations due to chance (i.e. randomness, in a mathematical sense). As opposed to
other mathematical variables, a random variable conceptually does not have a single,
fixed value (even if unknown); rather, it can take on a set of possible different values,
each with an associated probability.
A random variable’s possible values might represent the possible outcomes of a
yet-to-be-performed experiment, or the possible outcomes of a past experiment whose
already-existing value is uncertain (for example, as a result of incomplete information or
imprecise measurements). They may also conceptually represent either the results of an
“objectively” random process (such as rolling a die), or the “subjective” randomness that
results from incomplete knowledge of a quantity. Random variables can be classified as
either discrete (that is, taking any of a specified list of exact values) or as continuous
(taking any numerical value in an interval or collection of intervals). The mathematical
function describing the possible values of a random variable and their associated
probabilities is known as a probability distribution.
Discrete Random Variables
Discrete random variables can take on either a finite or at most a countably infinite
set of discrete values (for example, the integers). Their probability distribution is given by
a probability mass function which directly maps each value of the random variable to a
probability. For example, the value of x1 takes on the probability p1, the value of x2 takes
on the probability p2, and so on. The probabilities pi must satisfy two requirements: every
probability pi is a number between 0 and 1, and the sum of all the probabilities is 1.
(p1+p2+⋯+pk=1)
Discrete Probability Distribution
This shows the probability mass function of a discrete probability distribution. The
probabilities of the singletons {1}, {3}, and {7} are respectively 0.2, 0.5, 0.3. A set not
containing any of these points has probability zero.
Examples of discrete random variables include the values obtained from rolling a
die and the grades received on a test out of 100.
Probability Distributions for Discrete Random Variables
Probability distributions for discrete random variables can be displayed as a
formula, in a table, or in a graph. A discrete random variable x has a countable number
of possible values. The probability distribution of a discrete random variable x lists the
values and their probabilities, where value x1 has probability p1, value x2 has
probability p2, and so on. Every probability pi is a number between 0 and 1, and the sum
of all the probabilities is equal to 1.
Examples of discrete random variables include:
• The number of eggs that a hen lays in a given day (it can’t be 2.3)
• The number of people going to a given soccer match
• The number of students that come to class on a given day
• The number of people in line at McDonald’s on a given day and time
A discrete probability distribution can be described by a table, by a formula, or by a
graph. For example, suppose that x is a random variable that represents the number of
people waiting in line at a fast-food restaurant and it happens to take only the values
2, 3, or 5 with probabilities 2/10, 3/10, and 5/10, respectively. This can be expressed
through the function f(x) = x/10 for x = 2, 3, 5, or through the table below.
Notice that these two representations are equivalent, and that this can be represented
graphically as in the probability histogram below.
Video: https://www.youtube.com/watch?v=cqK3uRoPtk0
Probability Histogram: This histogram displays the probabilities of each of the three
values of the discrete random variable. The formula, table, and probability histogram
satisfy the following necessary conditions of discrete probability distributions:
1. 0 ≤ f(x) ≤ 1, i.e., the values of f(x) are probabilities, hence between 0 and 1.
2. ∑f(x) = 1, i.e., adding the probabilities of all disjoint cases, we obtain the probability
of the sample space, 1.
Sometimes, the discrete probability distribution is referred to as the probability mass
function (pmf). The probability mass function serves the same purpose as the probability
histogram, displaying the specific probability for each value of the discrete random
variable. The only difference is how it looks graphically.
Probability Mass Function
This shows the graph of a probability mass function. All the values of this function
must be non-negative and sum up to 1.
x f(x)
2 0.2
3 0.3
5 0.5
Discrete Probability Distribution: This table shows the values that the discrete random
variable can take on and their corresponding probabilities.
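As a quick check, the two conditions above can be verified directly. The following Python sketch (the function name f and the dictionary name pmf_table are illustrative choices, not part of the example) tabulates f(x) = x/10 over x = 2, 3, 5 and confirms that every value lies between 0 and 1 and that the values sum to 1.

def f(x):
    """Probability mass function f(x) = x/10 for x in {2, 3, 5} (illustrative helper)."""
    return x / 10 if x in (2, 3, 5) else 0.0

support = [2, 3, 5]
pmf_table = {x: f(x) for x in support}   # {2: 0.2, 3: 0.3, 5: 0.5}

# Check the two conditions of a discrete probability distribution.
assert all(0 <= p <= 1 for p in pmf_table.values())   # 0 <= f(x) <= 1
assert abs(sum(pmf_table.values()) - 1.0) < 1e-9      # sum of f(x) equals 1

print(pmf_table)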
Example 1. A shipment of 20 similar laptop computers to a retail outlet contains 3 that
are defective. If a school makes a random purchase of 2 of these computers, find the
probability distribution for the number of defectives.
Solution:
Let X be a random variable whose values x are the possible numbers of defective
computers purchased by the school. Then x can only take the numbers 0, 1, and 2.
Now,
f(0) = P(X = 0) = (3C0 * 17C2) / 20C2 = 136/190 = 68/95,
f(1) = P(X = 1) = (3C1 * 17C1) / 20C2 = 51/190,
f(2) = P(X = 2) = (3C2 * 17C0) / 20C2 = 3/190.
Thus, the probability distribution of X is
x 0 1 2
f(x) 68/95 51/190 3/190
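The three probabilities come from counting combinations: f(x) = (3Cx * 17C(2 - x)) / 20C2 for x = 0, 1, 2. A short Python sketch of this calculation, using math.comb for the combination counts (the helper name f is an illustrative choice), is given below.

from math import comb

# Example 1: 20 laptops, 3 defective, 2 purchased at random.
N, K, n = 20, 3, 2   # population size, number of defectives, sample size

def f(x):
    # Hypergeometric count: ways to pick x defectives and n - x good ones,
    # divided by the total number of ways to pick n laptops out of N.
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

for x in range(3):
    print(x, f(x))   # approx 0.7158 (68/95), 0.2684 (51/190), 0.0158 (3/190)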
3.2 Cumulative Distribution Functions
You might recall that the cumulative distribution function is defined for discrete
random variables as:
F(x) = P(X ≤ x) = ∑_{t ≤ x} f(t)
Again, F(x) accumulates all of the probability less than or equal to x. The cumulative
distribution function for continuous random variables is just a straightforward extension of
that of the discrete case. All we need to do is replace the summation with an integral.
The cumulative distribution function ("c.d.f.") of a continuous random variable X is
defined as:
F(x) = ∫_{-∞}^{x} f(t) dt,  for -∞ < x < ∞
Example 1. Suppose that a day’s production of 850 manufactured parts contains 50 parts
that do not conform to customer requirements. Two parts are selected at random,
without replacement, from the batch. Let the random variable X equal the number of
nonconforming parts in the sample. What is the cumulative distribution function of X?
Solution:
The question can be answered by first finding the probability mass function of X:
f(0) = P(X = 0) = (800/850)(799/849) = 0.886,
f(1) = P(X = 1) = 2(50/850)(800/849) = 0.111,
f(2) = P(X = 2) = (50/850)(49/849) = 0.003.
Therefore,
F(0) = P(X ≤ 0) = 0.886,
F(1) = P(X ≤ 1) = 0.886 + 0.111 = 0.997,
F(2) = P(X ≤ 2) = 0.886 + 0.111 + 0.003 = 1.
The cumulative distribution function for this example is graphed in the figure below. Note
that F(x) is defined for all x from -∞ < x < ∞ and not only for 0, 1, and 2.
Graph of the cumulative distribution function for the above example
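To make the accumulation concrete, the following Python sketch (the function names pmf and cdf are illustrative choices) computes the probability mass function of X for this example and sums it to obtain F(x) at x = 0, 1, and 2.

from math import comb

# Manufactured-parts example: 850 parts, 50 nonconforming, 2 drawn without
# replacement; X = number of nonconforming parts in the sample.

def pmf(x, N=850, K=50, n=2):
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def cdf(x):
    """F(x) = P(X <= x): accumulate the pmf over all integer t <= x."""
    if x < 0:
        return 0.0
    return sum(pmf(t) for t in range(0, min(int(x), 2) + 1))

print([round(pmf(x), 3) for x in range(3)])   # approx [0.886, 0.111, 0.003]
print([round(cdf(x), 3) for x in range(3)])   # approx [0.886, 0.997, 1.0]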
3.3 Expected Values of Random Variables
The expected value of a random variable is the weighted average of all possible
values that this random variable can take on.
Discrete Random Variable
A discrete random variable X has a countable number of possible values. The
probability distribution of a discrete random variable X lists the values and their
probabilities, such that xi has a probability of pi. The probabilities pi must satisfy two
requirements:
1. Every probability pi is a number between 0 and 1.
2. The sum of the probabilities is 1: p1 + p2 + ⋯ + pk = 1.
Expected Value Definition
In probability theory, the expected value (or expectation, mathematical
expectation, EV, mean, or first moment) of a random variable is the weighted average of
all possible values that this random variable can take on. The weights used in computing
this average are probabilities in the case of a discrete random variable.
The expected value may be intuitively understood by the law of large numbers: the
expected value, when it exists, is almost surely the limit of the sample mean as sample
size grows to infinity. More informally, it can be interpreted as the long-run average of the
results of many independent repetitions of an experiment (e.g. a dice roll). The value may
not be expected in the ordinary sense—the “expected value” itself may be unlikely or even
impossible (such as having 2.5 children), as is also the case with the sample mean.
How To Calculate Expected Value
Suppose random variable X can take value x1 with probability p1, value x2 with
probability p2, and so on, up to value xk with probability pk. Then the expected value of
the random variable X is defined as E[X] = x1 p1 + x2 p2 + ⋯ + xk pk, which can also be
written as: E[X] = ∑ xi pi.
If all outcomes xi are equally likely (that is, p1 = p2 = ⋯ = pk), then the weighted average
turns into the simple average. This is intuitive: the expected value of a random variable is
the average of all values it can take; thus, the expected value is what one expects to
happen on average. If the outcomes xi are not equally probable, then the simple average
must be replaced with the weighted average, which takes into account the fact that some
outcomes are more likely than the others. The intuition, however, remains the same: the
expected value of X is what one expects to happen on average.
For example, let X represent the outcome of a roll of a six-sided die. The possible values
for X are 1, 2, 3, 4, 5, and 6, all equally likely (each having the probability of 1/6). The
expectation of X is:
E[X] = (1)(1/6) + (2)(1/6) + (3)(1/6) + (4)(1/6) + (5)(1/6) + (6)(1/6) = 3.5.
In this case, since all outcomes are equally likely, we could have simply averaged the
numbers together:
(1 + 2 + 3 + 4 + 5 + 6) /6 = 3.5.
Average Dice Value Against Number of Rolls
An illustration of the convergence of the sample average of rolls of a die to the
expected value of 3.5 as the number of rolls (trials) grows.
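This convergence can be illustrated with a short simulation. The Python sketch below (the seed value and sample sizes are arbitrary choices for the illustration) rolls a fair die many times and prints the sample mean, which settles near the expected value 3.5 as the number of rolls grows.

import random

random.seed(1)   # arbitrary seed, chosen only so the illustration is reproducible
for n in (100, 10_000, 1_000_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(n, sum(rolls) / n)   # sample mean drifts toward E[X] = 3.5 as n grows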
3.4 The Binomial Distribution
Binomial Experiment
A binomial experiment is a statistical experiment that has the following properties:
• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes
a success and the other, a failure.
• The probability of success, denoted by P, is the same on every trial.
• The trials are independent; that is, the outcome on one trial does not affect the
outcome on other trials.
Consider the following statistical experiment. You flip a coin 2 times and count the
number of times the coin lands on heads.
This is a binomial experiment because:
1. The experiment consists of repeated trials. We flip a coin 2 times.
2. Each trial can result in just two possible outcomes - heads or tails.
3. The probability of success is constant - 0.5 on every trial.
4. The trials are independent; that is, getting heads on one trial does not affect
whether we get heads on other trials.
The following notation is helpful when we talk about binomial probability.
• x: The number of successes that result from the binomial experiment.
• n: The number of trials in the binomial experiment.
• P: The probability of success on an individual trial.
• Q: The probability of failure on an individual trial. (This is equal to 1 - P.)
• n!: The factorial of n (also known as n factorial).
• b (x; n, P): Binomial probability - the probability that an n-trial binomial experiment
results in exactly x successes, when the probability of success on an individual
trial is P.
• nCr: The number of combinations of n things, taken r at a time.
Binomial Distribution
A binomial random variable is the number of successes x in n repeated trials of
a binomial experiment. The probability distribution of a binomial random variable is called
a binomial distribution.
Suppose we flip a coin two times and count the number of heads (successes). The
binomial random variable is the number of heads, which can take on values of 0, 1, or 2.
The binomial distribution is presented below.
Number of Heads Probability
0 0.25
1 0.50
2 0.25
The binomial distribution has the following properties:
▪ The mean of the distribution (μx) is equal to n * P.
▪ The variance (σ2x) is n * P * (1 - P).
▪ The standard deviation (σx) is sqrt [n * P * (1 - P)].
Binomial Formula and Binomial Probability
The binomial probability refers to the probability that a binomial experiment
results in exactly x successes. For example, in the above table, we see that the binomial
probability of getting exactly one head in two coin flips is 0.50.
Given x, n, and P, we can compute the binomial probability based on the binomial
formula.
Binomial Formula. Suppose a binomial experiment consists of n trials and results
in x successes. If the probability of success on an individual trial is P, then the binomial
probability is:
b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
or
b(x; n, P) = {n! / [x! (n - x)!]} * P^x * (1 - P)^(n - x)
Example 1.Suppose a die is tossed 5 times. What is the probability of getting exactly 2
fours?
Solution:
This is a binomial experiment in which the number of trials is equal to 5, the number
of successes is equal to 2, and the probability of success on a single trial is 1/6 or about
0.167. Therefore, the binomial probability is:
b(2; 5, 0.167) = 5C2 * (0.167)^2 * (0.833)^3
b(2; 5, 0.167) = 0.161
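The same computation can be carried out directly from the binomial formula. The following Python sketch (the helper name binom_pmf is an illustrative choice) uses math.comb for nCx and reproduces the result above.

from math import comb

def binom_pmf(x, n, p):
    # b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 1: n = 5 die tosses, "success" = rolling a four, P = 1/6.
print(round(binom_pmf(2, 5, 1/6), 3))   # approx 0.161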
Cumulative Binomial Probability
A cumulative binomial probability refers to the probability that the binomial
random variable falls within a specified range (e.g., is greater than or equal to a stated
lower limit and less than or equal to a stated upper limit).
For example, we might be interested in the cumulative binomial probability of obtaining
45 or fewer heads in 100 tosses of a coin. This would be the sum of the individual
binomial probabilities:
b(x ≤ 45; 100, 0.5) =
b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + ... + b(x = 44; 100, 0.5) + b(x = 45; 100, 0.5)
Example 1. What is the probability of obtaining 45 or fewer heads in 100 tosses of a
coin?
Solution:
To solve this problem, we compute 46 individual probabilities, using the binomial
formula. The sum of all these probabilities is the answer we seek. Thus,
b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + . . . + b(x = 45; 100, 0.5)
b(x ≤ 45; 100, 0.5) = 0.184
Example 2. The probability that a student is accepted to a prestigious college is 0.3. If 5
students from the same school apply, what is the probability that at most 2 are
accepted?
Solution:
To solve this problem, we compute 3 individual probabilities, using the binomial
formula. The sum of all these probabilities is the answer we seek. Thus,
b(x ≤ 2; 5, 0.3) = b(x = 0; 5, 0.3) + b(x = 1; 5, 0.3) + b(x = 2; 5, 0.3)
b(x ≤ 2; 5, 0.3) = 0.1681 + 0.3601 + 0.3087
b(x ≤ 2; 5, 0.3) = 0.8369
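Cumulative binomial probabilities are simply sums of individual binomial terms, so they can be computed by looping over x. The Python sketch below (reusing the illustrative helper binom_pmf from above) reproduces the two cumulative results in this section.

from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

# College-admission example: P(X <= 2) with n = 5 and P = 0.3.
print(round(sum(binom_pmf(x, 5, 0.3) for x in range(3)), 4))     # approx 0.8369

# Coin example: P(X <= 45) with n = 100 and P = 0.5.
print(round(sum(binom_pmf(x, 100, 0.5) for x in range(46)), 3))  # approx 0.184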
3.5 The Poisson Distribution
A Poisson distribution is the probability distribution that results from a Poisson
experiment.
Attributes of a Poisson Experiment
A Poisson experiment is a statistical experiment that has the following properties:
▪ The experiment results in outcomes that can be classified as successes or
failures.
▪ The average number of successes (μ) that occurs in a specified region is known.
▪ The probability that a success will occur is proportional to the size of the region.
▪ The probability that a success will occur in an extremely small region is virtually
zero.
Note that the specified region could take many forms. For instance, it could be a length,
an area, a volume, a period of time, etc.
Notation
The following notation is helpful when we talk about the Poisson distribution.
• e: A constant equal to approximately 2.71828. (Actually, e is the base of the natural
logarithm system.)
• μ: The mean number of successes that occur in a specified region.
• x: The actual number of successes that occur in a specified region.
• P (x; μ): The Poisson probability that exactly x successes occur in a Poisson
experiment, when the mean number of successes is μ.
Poisson Distribution
A Poisson random variable is the number of successes that result from a
Poisson experiment. The probability distribution of a Poisson random variable is called
a Poisson distribution.
Given the mean number of successes (μ) that occur in a specified region, we can
compute the Poisson probability based on the following Poisson formula.
Poisson Formula. Suppose we conduct a Poisson experiment, in which the average
number of successes within a given region is μ. Then, the Poisson probability is:
P(x; μ) = (e^-μ) (μ^x) / x!
where x is the actual number of successes that result from the experiment, and e is
approximately equal to 2.71828.
The Poisson distribution has the following properties:
▪ The mean of the distribution is equal to μ.
▪ The variance is also equal to μ.
Example 1. The average number of homes sold by the Acme Realty company is 2
homes per day. What is the probability that exactly 3 homes will be sold tomorrow?
Solution:
This is a Poisson experiment in which we know the following:
▪ μ = 2; since 2 homes are sold per day, on average.
▪ x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.
▪ e = 2.71828; since e is a constant equal to approximately 2.71828.
We plug these values into the Poisson formula as follows:
P(x; μ) = (e^-μ) (μ^x) / x!
P(3; 2) = (2.71828^-2) (2^3) / 3!
P(3; 2) = (0.13534) (8) / 6
P(3; 2) = 0.180
Thus, the probability of selling 3 homes tomorrow is 0.180.
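The same value follows directly from the Poisson formula in code. The Python sketch below (the helper name poisson_pmf is an illustrative choice) evaluates e^-μ μ^x / x! with μ = 2 and x = 3.

from math import exp, factorial

def poisson_pmf(x, mu):
    # P(x; mu) = e^(-mu) * mu^x / x!
    return exp(-mu) * mu**x / factorial(x)

print(round(poisson_pmf(3, 2), 3))   # approx 0.180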
Cumulative Poisson Probability
A cumulative Poisson probability refers to the probability that the Poisson
random variable falls within a specified range (e.g., is greater than or equal to a stated
lower limit and less than or equal to a stated upper limit).
Example. Suppose the average number of lions seen on a 1-day safari is 5. What is the
probability that tourists will see fewer than four lions on the next 1-day safari?
Solution: This is a Poisson experiment in which we know the following:
▪ μ = 5; since 5 lions are seen per safari, on average.
▪ x = 0, 1, 2, or 3; since we want to find the likelihood that tourists will see fewer
than 4 lions; that is, we want the probability that they will see 0, 1, 2, or 3 lions.
▪ e = 2.71828; since e is a constant equal to approximately 2.71828.
To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3
lions. Thus, we need to calculate the sum of four probabilities: P (0; 5) + P (1; 5) + P (2;
5) + P (3; 5).
To compute this sum, we use the Poisson formula:
P(x ≤ 3; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)
P(x ≤ 3; 5) = [(e^-5)(5^0) / 0!] + [(e^-5)(5^1) / 1!] + [(e^-5)(5^2) / 2!] + [(e^-5)(5^3) / 3!]
P(x ≤ 3; 5) = [(0.006738)(1) / 1] + [(0.006738)(5) / 1] + [(0.006738)(25) / 2] +
[(0.006738)(125) / 6]
P(x ≤ 3; 5) = [0.006738] + [0.033690] + [0.084224] + [0.140375]
P(x ≤ 3; 5) = 0.2650
Thus, the probability of seeing no more than 3 lions is 0.2650.
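As with the binomial case, the cumulative Poisson probability is a sum of individual Poisson terms. The Python sketch below (reusing the illustrative helper poisson_pmf) sums P(0; 5) through P(3; 5) and reproduces the value above.

from math import exp, factorial

def poisson_pmf(x, mu):
    return exp(-mu) * mu**x / factorial(x)

# Safari example: P(X <= 3) with mu = 5, i.e. fewer than four lions seen.
p_fewer_than_4 = sum(poisson_pmf(x, 5) for x in range(4))
print(round(p_fewer_than_4, 4))   # approx 0.2650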
REFERENCES:
Montgomery, D. C., et al. (2003). Applied Statistics and Probability for Engineers, 3rd Edition. USA: John Wiley & Sons, Inc.
Walpole, R. E., et al. (2016). Probability & Statistics for Engineers & Scientists, 9th Edition. England: Pearson Education Limited.
https://courses.lumenlearning.com/boundless-statistics/chapter/discrete-random-variables/
https://newonlinecourses.science.psu.edu/stat414/node/98/
https://stattrek.com/probability-distributions/binomial.aspx
https://stattrek.com/probability-distributions/poisson.aspx