Discrete Random Variables and Probability Distributions

A discrete random variable assigns a numerical value to each outcome of a random experiment. The probability mass function specifies the probability of each possible value and must be non-negative and sum to 1. The cumulative distribution function gives the probability that the random variable is less than or equal to each value.


Chapter 3
Discrete Random Variables and Probability Distributions

Learning Objectives
• Describe a discrete random variable
• Check if a function is a probability mass function and use it to
calculate probability
• Find the cumulative distribution function of a discrete
random variable
• Compute the mean and variance of a discrete random variable
• Determine the probability, mean, and variance of the uniform,
binomial, geometric, negative binomial, hypergeometric,
and Poisson distributions
Overview: families of discrete random variables

• Discrete uniform distribution: P(X = xi) = 1/n for i = 1, …, n
• Binomial distribution: X = number of successes in a series of n Bernoulli trials
• Geometric distribution: X = number of trials until the 1st success
• Negative binomial distribution: X = number of trials until having r successes
• Hypergeometric distribution: X = number of successes in a sample of size n from N objects
• Poisson distribution: X = number of rare events
Random Variables
• A random variable is a function that assigns a real number to
each outcome in the sample space of a random experiment.
X : S → ℝ
ω ↦ X(ω) ∈ ℝ
If X(S) = {x1, x2, …, xn} or X(S) = {x1, x2, …, xn, …},
X is called discrete.
Random variables – Ex
Flipping a coin twice.
➔ The sample space is S = {HH, HT, TH, TT}.
X : S → ℝ
X(ω) = number of heads in the outcome ω
➔ X is a discrete random variable.

Probability Distribution – Ex1
Toss a fair coin three times and let X be the number of Heads
observed, X(ω) ∈ {0, 1, 2, 3}. Then we have the following
probabilities:

Outcomes   TTT   HTT, THT, TTH   HHT, HTH, THH   HHH
X           0          1               2          3
P(X = x)   1/8        3/8             3/8        1/8

Σx P(x) = 1. This is the probability distribution for the number of heads.
Probability Distribution – Ex2
• Ex. (Digital Channel) There is a chance that a bit transmitted through a digital
transmission channel is received in error. Let X equal the number of bits in
error in the next four bits transmitted. The possible values for X are {0, 1, 2, 3,
4}. Suppose that the probabilities are

X          0        1        2        3        4
P(X = x)   0.6561   0.2916   0.0486   0.0036   0.0001

This is the probability distribution for bits in error.
The probability distribution of a random variable X is a
description of the probabilities associated with the
possible values of X.
Probability Mass Functions (pmf)
For a discrete random variable X with possible values x1, x2, …,
xn, a probability mass function is a function f such that
(1) f(xi) ≥ 0
(2) f(xi) = P(X = xi)
(3) Σi f(xi) = 1

Ex. Verify that the following function is a pmf:
f(x) = (2x + 1)/25, x = 0, 1, 2, 3, 4
(1) f(x) ≥ 0
(2) f(x) = P(X = x), e.g., P(X = 4) = f(4) = 9/25
(3) Σi f(xi) = f(0) + f(1) + f(2) + f(3) + f(4) = (1 + 3 + 5 + 7 + 9)/25 = 1
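The three pmf conditions can be checked mechanically. A minimal Python sketch (not part of the original slides; exact arithmetic via `fractions` is my choice):

```python
from fractions import Fraction

# pmf from the example: f(x) = (2x + 1)/25 for x = 0, 1, 2, 3, 4
f = {x: Fraction(2 * x + 1, 25) for x in range(5)}

assert all(p >= 0 for p in f.values())  # (1) every probability is non-negative
assert sum(f.values()) == 1             # (3) the probabilities sum to exactly 1

print(f[4])  # P(X = 4) = 9/25
```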
Probability Mass Functions (pmf) – Ex
Given the pmf f(x), determine the probabilities.
f(x) = (2x + 1)/25, x = 0, 1, 2, 3, 4

a/ P(X = 4)   b/ P(X ≤ 3)
c/ P(2 ≤ X ≤ 4)   d/ P(X > −3)
---
c/ P(2 ≤ X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4)
= (2·2 + 1)/25 + (2·3 + 1)/25 + (2·4 + 1)/25 = 21/25
Cumulative Distribution Function (cdf)
The cumulative distribution function (cdf) of a discrete random
variable X, denoted as F(x), is
F(x) = P(X ≤ x)
For a discrete random variable X, F(x) satisfies
(1) F(x) = Σ_{xi ≤ x} f(xi)
(2) 0 ≤ F(x) ≤ 1
(3) If x ≤ y, then F(x) ≤ F(y)

Ex. x      -1    0     1   2     otherwise
    f(x)   0.2   0.5   0   0.3   0
Find F(-1), F(1), F(1.9)
---
F(-1) = f(-1) = 0.2, F(1) = f(-1) + f(0) + f(1) = 0.7
F(1.9) = f(-1) + f(0) + f(1) = 0.7
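The defining sum F(x) = Σ_{xi ≤ x} f(xi) translates directly to code. A short sketch using the table above (my addition):

```python
# pmf from the example above
pmf = {-1: 0.2, 0: 0.5, 1: 0.0, 2: 0.3}

def cdf(x):
    """F(x) = P(X <= x): add up f(xi) over all mass points xi <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

print(cdf(-1))   # 0.2
print(cdf(1))    # 0.7
print(cdf(1.9))  # 0.7 (F is a step function, flat between mass points)
```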
Pmf vs cdf
x      0       1       2
f(x)   0.886   0.111   0.003

[Figure: the probability mass function (pmf) and the cumulative distribution function (cdf) for this table.]
Cdf – Ex
Determine the pmf of X from the following cdf.

x      -3   -2    -1    0     1     2     3
F(x)   0    0.2   0.2   0.7   0.7   1     1

Find f(x) from F(x): f(x) = F(x) – F(x⁻)

f(x)   0    0.2   0     0.5   0     0.3   0
Mean and Variance
• The mean or expected value of X, denoted as μ or E(X), is
μ = E(X) = Σx x f(x)
• The variance of X, denoted as σ² or V(X), is
V(X) = σ² = Σx (x – μ)² f(x) = Σx x² f(x) – μ²
• The standard deviation of X is σ = √σ²

[Figure: parts (a) and (b) illustrate pmfs with equal means, but part (a) illustrates a larger variance.]
Mean and Variance - Ex
Ex. (Digital Channel) There is a chance that a bit transmitted through a
digital transmission channel is received in error. Let X equal the number of
bits in error in the next four bits transmitted. Suppose that the probabilities
are
x 0 1 2 3 4
f(x) = P(X = x) 0.6561 0.2916 0.0486 0.0036 0.0001

μ = E(X) = Σx x f(x) = 0.4
σ² = V(X) = Σx (x – μ)² f(x) = 0.36
Mean and Variance - Ex
Ex. Given the pmf of a discrete r. v. X.

x 0 1 2 3
f(x) 0.5 0.3 0.1 0.1

Find E(X), V(X).


---
E(X) = 0*0.5 + 1*0.3 + 2*0.1 + 3*0.1 = 0.8
V(X) = 0.5·(0 – 0.8)² + 0.3·(1 – 0.8)² + 0.1·(2 – 0.8)² + 0.1·(3 – 0.8)² = 0.96
Expected Value of a Function of a
Discrete Random Variable
If X is a discrete random variable with probability mass function
f(x),
E[h(X)] = x h(x)f(x)

Ex. Given the pmf of a discrete r. v. X.


x 0 1 2 3
f(x) 0.5 0.3 0.1 0.1

Find E(X), E(X + 2), E(3X), E(3X + 2), E(X2)


E(X²) = 0²(0.5) + 1²(0.3) + 2²(0.1) + 3²(0.1) = 1.6
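A generic E[h(X)] helper reproduces all five expectations; a sketch built on the table above (the function name is mine):

```python
pmf = {0: 0.5, 1: 0.3, 2: 0.1, 3: 0.1}

def expect(h):
    """E[h(X)] = sum of h(x) f(x) over the mass points."""
    return sum(h(x) * p for x, p in pmf.items())

print(round(expect(lambda x: x), 4))          # E(X)      = 0.8
print(round(expect(lambda x: x + 2), 4))      # E(X + 2)  = 2.8
print(round(expect(lambda x: 3 * x), 4))      # E(3X)     = 2.4
print(round(expect(lambda x: 3 * x + 2), 4))  # E(3X + 2) = 4.4
print(round(expect(lambda x: x ** 2), 4))     # E(X^2)    = 1.6
```

The first four results illustrate the linearity rule E(aX + b) = aE(X) + b.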
Some Useful Properties

σ² = V(X)
= Σx (x – μ)² f(x)
= E[(X – μ)²]
= E[X² – 2μX + μ²]
= E(X²) – 2μE(X) + μ²
= E(X²) – μ²
= E(X²) – [E(X)]²
Discrete Uniform Distribution
A random variable X has a discrete uniform distribution if each of
the n values in its range, say x1, x2, …, xn, has equal probability.
Then,
f(xi) = 1/n

Ex. Roll a fair die.


Let X be the number shown.
Then X is discrete uniform on the range 1 to 6.
f(1) = f(2) = … = f(6) = 1/6
Discrete Uniform Distribution
• Suppose that X is a discrete uniform random variable on the
consecutive integers a, a + 1, a + 2, …, b for a ≤ b.
• The mean of X is μ = E(X) = (a + b)/2
• The variance of X is σ² = V(X) = [(b – a + 1)² – 1]/12

Ex. Suppose the discrete uniform random variable Y has range 5, 10, …, 30.
Let Y = 5X, where X has range 1, 2, …, 6. Then,
E(Y) = 5E(X) = 5(1 + 6)/2 = 17.5,
V(Y) = 5²V(X) = 25[(6 – 1 + 1)² – 1]/12 ≈ 72.92
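The closed forms agree with direct summation over the range; a quick sketch (my addition):

```python
a, b = 1, 6  # e.g., a fair die
n = b - a + 1
xs = range(a, b + 1)

mu = sum(x / n for x in xs)                # direct E(X)
var = sum((x - mu) ** 2 / n for x in xs)   # direct V(X)

print(round(mu, 4), (a + b) / 2)                             # both 3.5
print(round(var, 4), round(((b - a + 1) ** 2 - 1) / 12, 4))  # both 2.9167
```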
Bernoulli trials
• Bernoulli trial: A trial with only two possible outcomes (success
or failure).
Ex. The following random experiments are series of Bernoulli
trials:
• Flip a coin 10 times.
• Guess each question of a multiple-choice exam with 50 questions,
each with four choices.
• Independence: The outcome from one trial has no effect on the
outcome to be obtained from any other trial.
Bernoulli trials – Ex
For each question of a quiz, without preparation, you select at
random an answer from 4 options. Suppose the quiz has 5
questions.
What is the probability that you get 2 correct answers?
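The quiz above previews the binomial pmf defined next: n = 5 trials, success probability p = 1/4, and k = 2 correct answers. A sketch of the computation (my working, not from the slides):

```python
from math import comb

n, p, k = 5, 0.25, 2
# P(X = 2) = C(5, 2) (0.25)^2 (0.75)^3
prob = comb(n, k) * p**k * (1 - p)**(n - k)
print(round(prob, 4))  # 0.2637
```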
Binomial Distribution
(bi = two: two outcomes {success, failure})
• A random experiment consists of n Bernoulli trials such that
(1) The trials are independent.
(2) Each trial results in only two possible outcomes, labeled as
“success’’ and “failure’’.
(3) The probability of a success in each trial, denoted as p, remains
constant.
• The random variable X that equals the number of trials that result in
a success is a binomial random variable with parameters 0 < p < 1
and n = 1, 2, … The probability mass function of X is
f(x) = C(n, x) p^x (1 – p)^(n–x), x = 0, 1, …, n
Binomial Distribution – Ex
A quality control engineer tests the quality of produced
computers. Suppose that 5% of computers have defects, and
defects occur independently of each other.
Find the probability of exactly 3 defective computers in a
shipment of twenty.
P(X = 3) = C(20, 3)(0.05)³(0.95)¹⁷ ≈ 0.0596
Binomial distribution – specific cases
[Figure: binomial pmfs for n = 10 with p = 0.1, p = 0.5, and p = 0.9.]
Binomial distribution – R-Ex
The random variable X has a binomial distribution with n = 10
and p = 0.2.
Then,
(a) P(X = 4) = f(4) = C(10, 4) 0.2⁴(1 – 0.2)⁶ ≈ 0.088
(b) P(X = 6) = f(6) = C(10, 6) 0.2⁶(1 – 0.2)⁴ ≈ 0.0055
(c) P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) ≈ 0.879
(d) P(X ≥ 4) = 1 – P(X ≤ 3) ≈ 0.121
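Parts (a)-(d) can be reproduced with a small pmf/cdf pair; a sketch (function names are mine):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k): sum the pmf from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.2
print(round(binom_pmf(4, n, p), 3))      # (a) 0.088
print(round(binom_pmf(6, n, p), 4))      # (b) 0.0055
print(round(binom_cdf(3, n, p), 3))      # (c) 0.879
print(round(1 - binom_cdf(3, n, p), 3))  # (d) 0.121
```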
Binomial Distribution – Mean and Variance
Consider the indicator random variables
Xi = 1 if the i-th trial is a success, 0 otherwise

Xi      0       1
P(Xi)   1 – p   p

E(Xi) = p, V(Xi) = p(1 – p)
X = X1 + X2 + … + Xn
➔ E(X) = np
➔ V(X) = np(1 – p)

R-Ex. A lab network consisting of 20 computers was attacked by a
computer virus. This virus enters each computer with probability 0.4,
independently of other computers. Find the expected number of
computers attacked by this virus.
Let X = “number of computers attacked by the virus”
➔ X ~ binom(n = 20, p = 0.4)
➔ E(X) = np = 20(0.4) = 8 computers
Binomial distribution – summary

n = number of trials
X = number of successes
p = probability of success
P(x) = C(n, x) p^x (1 – p)^(n–x)
E(X) = np
V(X) = np(1 – p)
Geometric Distribution
In a series of Bernoulli trials (independent trials with constant probability p of
a success), let the random variable X denote the number of trials
until the first success. Then X is a geometric random variable with
parameter 0 < p < 1 and
P(the 1st success occurs on the x-th trial) is
f(x) = (1 – p)^(x–1) p, x = 1, 2, …
(x – 1 failures, then a success on the x-th trial)

Ex. A search engine goes through a list of sites looking for a given key phrase.
Suppose the search terminates as soon as the key phrase is found. The number
of sites visited has geometric distribution.
Geometric Distribution – Ex
The probability that a bit transmitted through a digital
transmission channel is received in error is 0.1. Assume the
transmissions are independent events, and let the random
variable X denote the number of bits transmitted until the first
error. Find P(X = 5).

P(X = 5) = P(OOOOE) = 0.9⁴ · 0.1 = 0.0656

O: OK bit, E: error bit
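The same value drops out of the geometric pmf directly; a one-line sketch (my addition):

```python
def geom_pmf(x, p):
    """P(first success on trial x) = (1 - p)^(x - 1) * p."""
    return (1 - p) ** (x - 1) * p

print(round(geom_pmf(5, 0.1), 4))  # 0.0656
```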


Geometric Distribution
[Figure: geometric pmfs for p = 0.2 and p = 0.6.]
Geometric Distribution – Mean
x      1   2        …   k               …
f(x)   p   (1–p)p   …   (1–p)^(k–1) p   …

The mean of X is (writing q = 1 – p)
μ = E(X) = Σ_{k=1}^∞ k p(1 – p)^(k–1) = p Σ_{k=1}^∞ k q^(k–1)
= p ∂/∂q [Σ_{k=1}^∞ q^k] = p ∂/∂q [q/(1 – q)] = p · 1/(1 – q)² = 1/p
Ex. Geometric Distribution - Mean
The probability that a bit transmitted through a digital
transmission channel is received in error is 0.1. Assume the
transmissions are independent events, and let the random
variable X denote the number of bits transmitted until the first
error. What is the expected number of bits transmitted until the
first error?

E(X) = 1/p = 10 bits


Geometric Distribution – Variance
σ² = V(X) = (1 – p)/p²
Ex. The probability that a bit transmitted through a digital transmission
channel is received in error is 0.1. Assume the transmissions are
independent events, and let the random variable X denote the number of
bits transmitted until the first error.

σ² = V(X) = (1 – p)/p² = (1 – 0.1)/(0.1)² = 90 ➔ σ ≈ 9.49

Practical interpretation when p is small:
• σ ≈ μ = 1/p, which is large
• The number of trials until the first success
may be much different from the mean.
Lack of Memory Property
The probability that a bit is transmitted in error is equal to 0.1.
For example, if 100 bits are transmitted, the probability that the
first error, after bit 100, occurs on bit 105 is the probability that
the next five outcomes are OOOOE. This probability is (0.9)⁴(0.1)
= 0.0656, which is identical to the probability that the initial error
occurs on bit 5.
P(X = 5 after 100th bit) = P(X = 5 at the beginning)
And the mean number of bits until the next error is 1/0.1 = 10.
Geometric Distribution – summary

X = number of trials until the first success
p = probability of success
P(x) = p(1 – p)^(x–1), x = 1, 2, …
E(X) = 1/p
V(X) = (1 – p)/p²
Negative binomial distribution
In a series of Bernoulli trials (independent trials, Prob(success) =
p = constant), let the random variable X denote the number of
trials until r successes occur. Then X is a negative binomial
random variable with parameters 0 < p < 1 and r = 1, 2, 3, …,
and
f(x) = C(x – 1, r – 1) p^r (1 – p)^(x–r), x = r, r + 1, …

• r = 1: a negative binomial distribution becomes a geometric
distribution

[Figure: negative binomial pmfs for (p = 0.1, r = 5), (p = 0.4, r = 5), and (p = 0.4, r = 10).
A smaller value of p means a larger number of trials; a larger value of r means a larger number of trials.]
Negative binomial distribution – Ex
Ex. Applicants for a new student internship are accepted with
probability p = 0.2 independently from person to person. Several
hundred people are expected to apply. Find the probability that it will
take no more than 100 applicants to find 10 students for the program.
Let X be the number of people who apply for the internship until the
10th student is accepted. Then X has a negative binomial distribution
with parameters r = 10 and p = 0.2.
The desired probability is
P(X ≤ 100) = Σ_{x=10}^{100} C(x – 1, 9)(0.2)¹⁰(0.8)^(x–10)
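Summing the pmf over x = 10, …, 100 gives the answer; a hedged sketch (my addition; the original slides omit the numeric value):

```python
from math import comb

def nbinom_pmf(x, r, p):
    """P(the r-th success occurs on trial x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p)**(x - r)

r, p = 10, 0.2
prob = sum(nbinom_pmf(x, r, p) for x in range(r, 101))
print(round(prob, 4))  # P(X <= 100), close to 1 since E(X) = r/p = 50
```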
Negative binomial distribution
A negative binomial random variable can be represented as a
sum of r independent geometric random variables.

Mean and Variance

E(X) = r/p
V(X) = r(1 – p)/p²

Ex. (Web Servers) A Web site contains three identical computer servers. Only one is used to operate
the site, and the other two are spares that can be activated in case the primary system fails. The
probability of a failure in the primary computer (or any activated spare system) from a request for
service is 0.0005. Assuming that each request represents an independent trial, what is the mean
number of requests until failure of all three servers?
E(X) = r/p = 3/(0.0005) = 6000 requests.
Negative binomial distribution – Ex
A Web site randomly selects among 10 products to discount each
day. The color printer of interest to you is discounted today.
(a) What is the expected number of days until this product is
again discounted? (b) What is the probability that this product is
first discounted again exactly 10 days from now? (c) If the
product is not discounted for the next five days, what is the
probability that it is first discounted again 15 days from now?
(d) What is the probability that this product is first discounted
again within three or fewer days?
Negative Binomial Distribution – summary

X = number of trials until r successes
p = probability of success
P(x) = C(x – 1, r – 1) p^r (1 – p)^(x–r), x = r, r + 1, …
E(X) = r/p
V(X) = r(1 – p)/p²
Hypergeometric Distribution
A set of N objects contains
K objects classified as successes
N – K objects classified as failures
A sample of size n objects is selected randomly (without
replacement) from the N objects.
Let X denote the number of successes in the sample. Then X is a
hypergeometric random variable and

f(x) = C(K, x) C(N – K, n – x) / C(N, n)

for max{0, n – (N – K)} ≤ x ≤ min{n, K}
Hypergeometric Distribution – selected cases
[Figure: hypergeometric pmfs for (N = 10, K = 5, n = 5), (N = 50, K = 5, n = 5), and (N = 50, K = 3, n = 5).]
Hypergeometric Distribution - Ex
Ex. A shipment of 50 computers contains 4 defective ones. Ten
are bought at random. What is the probability that two of them
will be defective?
X = the number of defective computers ➔ X is a hypergeometric
random variable with parameters N = 50, K = 4, n = 10.
The desired probability is
P(X = 2) = C(4, 2)C(46, 8)/C(50, 10) ≈ 0.152
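The hypergeometric pmf is a ratio of binomial coefficients, so it is a one-liner; a sketch (my addition):

```python
from math import comb

def hyper_pmf(x, N, K, n):
    """P(X = x): choose x successes from K and n - x failures from N - K."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

print(round(hyper_pmf(2, 50, 4, 10), 3))  # about 0.152
```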
Hypergeometric Distribution – Mean and Variance

E(X) = np
V(X) = np(1 – p) · (N – n)/(N – 1)
where p = K/N is the proportion of successes in the set of N objects
and (N – n)/(N – 1) is the finite population correction factor.

Ex. In the previous example, n = 10, p = 4/50.
➔ E(X) = 10(4/50) = 0.8
and V(X) = 10(4/50)(46/50)(40/49) ≈ 0.601
Hypergeometric vs Binomial
When n ≪ N and p = K/N is not too close to 0 or 1, the binomial
distribution approximates the hypergeometric distribution.

Hypergeometric distribution: N = 100, K = 20, n = 10
Binomial distribution: p = K/N = 0.2, n = 10

Ex. Suppose a shipment of 100 computers contains 20 defective ones. Ten
are selected at random. Find the probability that 3 of them will be defective.
Use the hypergeometric distribution (N = 100, K = 20, n = 10):
P(X = 3) = C(20, 3)C(80, 7)/C(100, 10) ≈ 0.209
Use the binomial distribution (n = 10, N = 100, p = 20/100 = 0.2):
P(X = 3) = C(10, 3)(0.2)³(0.8)⁷ ≈ 0.201
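The two probabilities can be computed side by side; a sketch (my addition, since the slide's numeric values were lost in extraction):

```python
from math import comb

N, K, n = 100, 20, 10
p = K / N

hyper = comb(K, 3) * comb(N - K, n - 3) / comb(N, n)  # exact hypergeometric
binom = comb(n, 3) * p**3 * (1 - p)**(n - 3)          # binomial approximation

print(round(hyper, 4), round(binom, 4))  # about 0.209 vs 0.201
```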


Hypergeometric Distribution – summary

N = number of objects
K = number of success-objects
n = sample size
P(x) = C(K, x) C(N – K, n – x) / C(N, n)
E(X) = np, where p = K/N
V(X) = np(1 – p) · (N – n)/(N – 1)
Poisson Distribution
The number of rare events occurring within a fixed period of time
has a Poisson distribution with parameter λ > 0.
λ: frequency, the average number of events

f(x) = e^(–λ) λ^x / x!, x = 0, 1, 2, …
μ = λ
σ² = λ

Examples of rare events: telephone calls, e-mail messages, traffic
accidents, network blackouts, virus attacks, errors in software, floods,
earthquakes, soldiers killed by horse kick, etc.
Poisson Distribution
[Figure: Poisson pmfs for selected parameter values λ = 0.1, λ = 2, and λ = 5.]
Poisson Distribution – Ex
(New accounts) The number of new accounts of an internet service
provider has a Poisson distribution with a mean of 5 accounts per day.
a/ What is the probability that there are more than 4 new accounts in one
day?
b/ What is the probability that there are 15 new accounts in 2 days?
a/ P(X > 4) = 1 – P(X ≤ 4) = 1 – P(X = 0) – P(X = 1) – P(X = 2) – P(X
= 3) – P(X = 4) ≈ 0.56
b/ Let X denote the number of new accounts in 2 days. Then X has a
Poisson distribution with λ = E(X) = 2(5) = 10.
➔ P(X = 15) = e^(–λ) λ^x / x! = e^(–10) 10¹⁵ / 15! ≈ 0.0347
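Both parts follow from the Poisson pmf; a short sketch (my addition):

```python
from math import exp, factorial

def pois_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!"""
    return exp(-lam) * lam**x / factorial(x)

# (a) more than 4 new accounts in one day (lam = 5)
print(round(1 - sum(pois_pmf(x, 5) for x in range(5)), 2))  # 0.56
# (b) exactly 15 new accounts in 2 days (lam = 2 * 5 = 10)
print(round(pois_pmf(15, 10), 4))                           # 0.0347
```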
Poisson Distribution – Exercises
The number of telephone calls that arrive at a phone exchange is
often modeled as a Poisson random variable. Assume that on the
average there are 10 calls per hour.
(a) What is the probability that there are exactly five calls in one hour?
(b) What is the probability that there are three or fewer calls in one
hour?
(c) What is the probability that there are exactly 15 calls in two hours?
(d) What is the probability that there are exactly five calls in 30
minutes?
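A sketch of the four answers (my working; the slides leave these as exercises). The key step is rescaling λ to the interval length: λ = 10 per hour, 20 per two hours, 5 per half hour:

```python
from math import exp, factorial

def pois_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

print(round(pois_pmf(5, 10), 3))                         # (a) 0.038
print(round(sum(pois_pmf(x, 10) for x in range(4)), 3))  # (b) 0.01
print(round(pois_pmf(15, 20), 3))                        # (c) 0.052
print(round(pois_pmf(5, 5), 3))                          # (d) 0.175
```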
Poisson approximation to Binomial
• The Poisson distribution (λ = np) can be effectively used to
approximate binomial probabilities when
• n is large (e.g., n ≥ 30)
• p is small (e.g., p ≤ 0.05)

[Figure: Binom(n = 30, p = 0.03) compared with Pois(λ = np = 0.9).]
Poisson approximation to Binomial
Ex. 3% of messages are transmitted with errors. What is the
probability that out of 200 messages, exactly 5 will be transmitted
incorrectly?
• Let X be the number of messages with errors.
➔ X ~ Binom(n = 200, p = 0.03) and P(X = 5) = 0.162
• Use the Poisson distribution with λ = np = 6:
P(X = 5) = e^(–6) 6⁵ / 5! ≈ 0.161
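The comparison is easy to reproduce; a sketch (my addition):

```python
from math import comb, exp, factorial

n, p, k = 200, 0.03, 5
lam = n * p  # 6

exact = comb(n, k) * p**k * (1 - p)**(n - k)  # binomial probability
approx = exp(-lam) * lam**k / factorial(k)    # Poisson approximation

print(round(exact, 3), round(approx, 3))  # 0.162 0.161
```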
THANKS
