Chapter 2 - Discrete Random Variable
1. Random Variables
Suppose the following table gives the frequency and relative frequency distributions of the number
of vehicles owned by all 2000 families living in a small town.
Suppose one family is randomly selected from this population. The process of randomly selecting a
family is called a random or chance experiment. Let x denote the number of vehicles owned by the
selected family. Then x can assume any of the five possible values (0, 1, 2, 3, and 4) listed in the first
column of the above table. The value of x depends on the outcome of a random experiment.
Consequently, x is called a random variable or a chance variable. In general, a random variable is
denoted by x or y.
Definition 1.1
A random variable is a variable whose value is determined by the outcome of a random experiment.
DEFINITION 1.2
A random variable that assumes countable values is called a discrete random variable.
In the table, the number of vehicles owned by a family is an example of a discrete random variable
because the values of the random variable x are countable: 0, 1, 2, 3, and 4.
SCHOOL OF MATHEMATICAL SCIENCES
Definition 1.3
A random variable that can assume any value contained in one or more intervals is called a continuous
random variable.
Because the number of values contained in any interval is infinite, the possible number of values that
a continuous random variable can assume is also infinite. Moreover, we cannot count these values.
Consider the life of a battery. We can measure it as precisely as we want. For instance, the life of this
battery may be 40 hours, or 40.25 hours, or 40.247 hours. Assume that the maximum life of a battery
is 200 hours. Let x denote the life of a randomly selected battery of this kind. Then, x can assume any
value in the interval 0 to 200. Consequently, x is a continuous random variable. As shown in the
diagram, every point on the line representing the interval 0 to 200 gives a possible value of x.
This chapter is limited to a discussion of discrete random variables and their probability distributions.
Continuous random variables will be discussed in Chapter 3.
Let x be a discrete random variable. The probability distribution of x describes how the probabilities
are distributed over all the possible values of x.
DEFINITION 2.1
The probability distribution of a discrete random variable lists all the possible values that the random
variable can assume and their corresponding probabilities. We will sometimes denote P( X = x) by
P( x) or f ( x) .
THEOREM 2.1
For any discrete probability distribution, the following must be true:
1. 0 ≤ P(x) ≤ 1 for all x.
2. ∑𝑥 𝑃(𝑥) = 1, where the summation is over all values of x with nonzero probability.
Note: For a random variable X of the discrete type, the probability P(X = x) is frequently denoted
by f(x), and this function f(x) is called the probability mass function (p.m.f.).
Example 2.1
Each of the following tables lists certain values of x and their probabilities. Determine whether or not
each table represents a valid probability distribution.
(a)
x P (x)
0 0.08
1 0.11
2 0.39
3 0.27
(b)
x P (x)
2 0.25
3 0.34
4 0.28
5 0.13
(c)
x P (x)
7 0.70
8 0.50
9 -0.20
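The two conditions of Theorem 2.1 can be checked mechanically. The following is a minimal sketch, not part of the original notes; the function name `is_valid_distribution` and the tolerance are illustrative assumptions.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Check Theorem 2.1: every P(x) lies in [0, 1] and the P(x) sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Probabilities copied from tables (a), (b) and (c) of Example 2.1
table_a = [0.08, 0.11, 0.39, 0.27]   # sums to 0.85
table_b = [0.25, 0.34, 0.28, 0.13]   # sums to 1.00
table_c = [0.70, 0.50, -0.20]        # sums to 1.00 but has a negative entry

for name, table in [("a", table_a), ("b", table_b), ("c", table_c)]:
    print(name, is_valid_distribution(table))
```

Under these checks only table (b) is a valid probability distribution: table (a) fails the second condition and table (c) fails the first.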
Example 2.2
The following table lists the probability distribution of the number of breakdowns per week for a
machine based on past data.
Find the probability that the number of breakdowns for this machine during a given week is
(i) exactly 2
(ii) 0 to 2
(iii) more than 1
(iv) at most 1
DEFINITION 2.2
The cumulative distribution function (CDF), F(x), of a discrete random variable X with probability
distribution P(x) is
$$F(x) = P(X \le x) = \sum_{t \le x} P(t).$$
Note: For an integer-valued X, P(X = x) = f(x) = F(x) − F(x − 1).
Example 2.3
Consider the following distribution function of X:
X 1 2 3 4
P(x) 0.4 0.3 0.2 0.1
Find the cumulative distribution of X.
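Since the cumulative distribution is a running total of the probabilities, it can be sketched with a running sum. This is an illustrative sketch only; the variable names are assumptions.

```python
from itertools import accumulate

# Distribution from Example 2.3
values = [1, 2, 3, 4]
pmf = [0.4, 0.3, 0.2, 0.1]

# F(x) = P(X <= x) is the running total of the probabilities
cdf = list(accumulate(pmf))

for x, F in zip(values, cdf):
    print(f"F({x}) = {F:.1f}")
```

The last value of the running total should equal 1, which is a quick check that the table is a valid distribution.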
3. The Expected Value (Mean) and Standard Deviation of a Discrete Random Variable
The mean of a discrete random variable, denoted by μ, is actually the mean of its probability
distribution. The mean of a discrete random variable x is also called its expected value and is denoted
by E(x). The mean (or expected value) of a discrete random variable is the value that we expect to
observe per repetition, on average, if we perform an experiment a large number of times. For example,
we may expect a car salesperson to sell, on average, 2.4 cars per week. This does not mean that every
week this salesperson will sell exactly 2.4 cars. This simply means that if we observe for many weeks,
this salesperson will sell a different number of cars during different weeks; however, the average for
all these weeks will be 2.4 cars per week.
To calculate the mean of a discrete random variable x, we multiply each value of x by the
corresponding probability and sum the resulting products. This sum gives the mean (or expected value)
of a discrete random variable x.
DEFINITION 3.1
Let X be a discrete random variable with the probability function p(x). Then the expected value of X,
E[X], is defined to be
$$E[X] = \sum_{x} x \, p(x).$$
Example 3.1
The probability distribution for a random variable Y is given in Table below. Find the mean of Y.
Example 3.2
Find E[X] where X is the outcome when we roll a fair die.
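Definition 3.1 translates directly into a weighted sum. The sketch below applies it to the fair die of Example 3.2; the helper name `expected_value` and the use of exact fractions are illustrative choices, not part of the notes.

```python
from fractions import Fraction

def expected_value(pmf):
    """E[X] = sum of x * p(x), for a dict mapping each value x to p(x)."""
    return sum(x * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = expected_value(fair_die)
print(mean, float(mean))  # 7/2 3.5
```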
Example 3.3
The distribution for the age of mathematics lecturers is summarized below. Calculate the mean age.
x = age 32 39 45 57 62
P(X =x) 0.16 0.42 0.10 0.28
Example 3.4
An insurance policy pays an individual 100 per day for up to 3 days of hospitalization and 25 per day
for each day of hospitalization thereafter. The number of days of hospitalization, X, is a discrete
random variable with probability function
$$P(X = k) = \begin{cases} \dfrac{6-k}{15}, & \text{for } k = 1, 2, 3, 4, 5 \\ 0, & \text{otherwise} \end{cases}$$
THEOREM 3.1
Let X be a discrete random variable with probability function p(x) and g(X) be a real-valued function
of X. Then the expected value of g(X) is given by
$$E[g(X)] = \sum_{\text{all } x} g(x) \, p(x).$$
Example 3.5
If X is the outcome when we roll a fair die, find the expected value of $g(X) = 2X^2 + 1$.
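Theorem 3.1 says that g is applied to each value before weighting by its probability. A minimal sketch for this example follows; the helper name `expected_g` is an assumption made for illustration.

```python
from fractions import Fraction

def expected_g(pmf, g):
    """E[g(X)] = sum of g(x) * p(x) over all values x."""
    return sum(g(x) * p for x, p in pmf.items())

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

result = expected_g(fair_die, lambda x: 2 * x**2 + 1)
print(result, float(result))
```

Note that the result, 94/3, differs from g(E[X]) = 2(3.5)^2 + 1 = 25.5, so in general E[g(X)] cannot be obtained by plugging the mean into g.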
THEOREM 3.2
Let X be a discrete random variable with probability function p(x) and mean E[X] = μ; then the
variance of a random variable X is defined to be the expected value of $(X - \mu)^2$. That is,
$$\operatorname{Var}(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - \mu^2.$$
Example 3.6
Calculate the standard deviation of Y as shown in Example 3.1
Example 3.7
Calculate the standard deviation of X, where X represents the outcome when a fair die is rolled.
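The shortcut formula Var(X) = E(X^2) − μ^2 from Theorem 3.2 can be sketched for the fair die as follows. Variable names are illustrative assumptions.

```python
import math
from fractions import Fraction

fair_die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in fair_die.items())              # E(X)
second_moment = sum(x**2 * p for x, p in fair_die.items())  # E(X^2)

variance = second_moment - mean**2   # Var(X) = E(X^2) - mu^2
std_dev = math.sqrt(variance)        # sigma = sqrt(Var(X))

print(variance, round(std_dev, 4))  # 35/12 1.7078
```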
The binomial probability distribution is one of the most widely used discrete probability distributions.
It is applied to find the probability that an outcome will occur x times in n performances of an
experiment. For example, given that the probability is 0.05 that a DVD player manufactured at a firm
is defective, we may be interested in finding the probability that in a random sample of three DVD
players manufactured at this firm, exactly one will be defective. As a second example, we may be
interested in finding the probability that a baseball player with a batting average of 0.250 will have
no hits in 10 trips to the plate.
To apply the binomial probability distribution, the random variable x must be a discrete dichotomous
random variable. In other words, the variable must be a discrete random variable, and each repetition
of the experiment must result in one of two possible outcomes. The binomial distribution is applied
to experiments that satisfy the four conditions of a binomial experiment. (These conditions are
described in Definition 4.1). Each repetition of a binomial experiment is called a trial or a Bernoulli
trial. For example, if an experiment is defined as one toss of a coin and this experiment is repeated
10 times, then each repetition (toss) is called a trial. Consequently, there are 10 total trials for this
experiment.
DEFINITION 4.1
A binomial experiment possesses the following properties:
1. The experiment consists of a fixed number, n, of identical trials. In other words, the given
experiment is repeated n times, where n is a positive integer. All these repetitions are performed under
identical conditions.
2. Each trial results in one of two outcomes: success, S, or failure, F.
3. The probability of success on a single trial is equal to some value p and remains the same from trial
to trial. The probability of a failure is equal to q = (1 − p).
4. The trials are independent. In other words, the outcome of one trial does not affect the outcome of
another trial.
Note that one of the two outcomes of a trial is called a success and the other a failure. Notice that a
success does not mean that the corresponding outcome is considered favorable or desirable. Similarly,
a failure does not necessarily refer to an unfavorable or undesirable outcome. Success and failure are
simply the names used to denote the two possible outcomes of a trial. The outcome to which the
question refers is usually called a success; the outcome to which it does not refer is called a failure.
Example 4.1
Consider the experiment consisting of 10 tosses of a coin. Determine whether or not it is a binomial
experiment.
Solution
The experiment consisting of 10 tosses of a coin satisfies all four conditions of a binomial experiment.
1. There are a total of 10 trials (tosses), and they are all identical. All 10 tosses are performed
under identical conditions. Here, n = 10.
2. Each trial (toss) has only two possible outcomes: a head and a tail. Let a head be called a
success and a tail be called a failure.
3. The probability of obtaining a head (a success) is 1/2 and that of a tail (a failure) is 1/2 for any
toss. That is,
$$p = P(H) = \tfrac{1}{2} \quad \text{and} \quad q = P(T) = \tfrac{1}{2}.$$
The sum of these two probabilities is 1.0. Also, these probabilities remain the same for each
toss.
4. The trials (tosses) are independent. The result of any preceding toss has no bearing on the
result of any succeeding toss.
Example 4.2
Five percent of all DVD players manufactured by a large electronics company are defective. Three
DVD players are randomly selected from the production line of this company. The selected DVD
players are inspected to determine whether each of them is defective or good. Is this experiment a
binomial experiment?
Solution
1. This example consists of three identical trials. A trial represents the selection of a DVD player.
2. Each trial has only two outcomes: a DVD player is defective or a DVD player is good. Let a
defective DVD player be called a success and a good DVD player be called a failure.
3. Five percent of all DVD players are defective. So, the probability p that a DVD player is
defective is 0.05. As a result, the probability q that a DVD player is good is 0.95. These two
probabilities add up to 1.
4. Each trial (DVD player) is independent. In other words, if one DVD player is defective, it
does not affect the outcome of another DVD player being defective or good. This is because
the size of the population is very large compared to the sample size.
Theorem 4.1
For a binomial experiment, the probability of exactly x successes in n trials is given by the binomial
formula
$$P(X = x) = p(x) = \binom{n}{x} p^x q^{n-x}, \quad x = 0, 1, 2, \ldots, n \ \text{ and } \ 0 \le p \le 1.$$
Here $\binom{n}{x}$ is the number of outcomes with exactly x successes among n trials, and $p^x q^{n-x}$ is the
probability of x successes among n trials for any one particular order.
To solve a binomial problem, we determine the values of n, x, n-x, p, and q and then substitute these
values in the binomial formula.
To find the probability of x successes in n trials for a binomial experiment, the only values needed
are those of n and p. These are called the parameters of the binomial probability distribution or simply
the binomial parameters. The value of q is obtained by subtracting the value of p from 1.0. Thus, q =
1-p.
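The binomial formula of Theorem 4.1 needs only n, p and x. Below is a minimal sketch, applied to the coin and DVD player settings used in Examples 4.3 and 4.4; the function name `binomial_pmf` is an illustrative assumption.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Two heads in four flips of a fair coin: n = 4, p = 0.5, x = 2
print(binomial_pmf(2, 4, 0.5))             # 0.375

# Exactly one defective DVD player among three: n = 3, p = 0.05, x = 1
print(round(binomial_pmf(1, 3, 0.05), 4))  # 0.1354
```

Summing `binomial_pmf` over x = 0, 1, ..., n should return 1, which is a useful sanity check on any implementation.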
Example 4.3
Four fair coins are flipped. If the outcomes are assumed independent, what is the probability that two
heads are obtained?
Example 4.4
Five percent of all DVD players manufactured by a large electronics company are defective. A quality
control inspector randomly selects three DVD players from the production line. What is the
probability that exactly one of these three DVD players is defective?
Example 4.5
It is known from past data that 2% of packages mailed through Express House Delivery Service do
not arrive at their destinations within the specified time. Suppose a corporation mails 10 packages
through Express House Delivery Service on a certain day. Find the probability that at most one of
these 10 packages will not arrive at its destination within the specified time.
Example 4.6
A small commuter plane has 30 seats. The probability that any particular passenger will not show up
for a flight is 0.10, independent of other passengers. The airline sells 32 tickets for the flight. Calculate
the probability that more passengers show up for the flight than the seats available.
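Questions phrased as "at most" or "more than" add up several binomial probabilities. The sketch below treats Examples 4.5 and 4.6 this way. It assumes that a "success" in Example 4.6 is a passenger showing up (p = 1 − 0.10 = 0.90), so the flight is overbooked when 31 or 32 of the 32 ticket holders appear; the names used are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Example 4.5: at most one of 10 packages is late, p = 0.02
at_most_one_late = sum(binomial_pmf(x, 10, 0.02) for x in (0, 1))

# Example 4.6: more than 30 of 32 ticket holders show up, p = 0.90
overbooked = sum(binomial_pmf(x, 32, 0.90) for x in (31, 32))

print(round(at_most_one_late, 4))
print(round(overbooked, 4))
```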
THEOREM 4.2
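Let X be a binomial random variable based on n trials and success probability p. Then
$$\mu = E[X] = np \quad \text{and} \quad \sigma^2 = \operatorname{Var}(X) = npq, \quad \text{where } q = 1 - p.$$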
Example 4.7
According to a report, 56% of teens volunteer time for charitable causes. Assume that this result is
true for the current population of teens. A sample of 60 teens is selected. Let x be the number of teens
in this sample who volunteer time for charitable causes. Find the mean and standard deviation of the
probability distribution of x.
Solution
This is a binomial experiment with a total of 60 trials. Each trial has two outcomes: (1) the selected
teen volunteers time for charitable causes or (2) the selected teen does not volunteer time for
charitable causes. The probabilities p and q for these two outcomes are 0.56 and 0.44, respectively.
Thus,
n = 60, p = 0.56, and q = 0.44
Using the formulas for the mean and standard deviation of the binomial distribution, we obtain
$$\mu = np = 60(0.56) = 33.60$$
$$\sigma = \sqrt{npq} = \sqrt{(60)(0.56)(0.44)} = 3.845$$
We could list all possible orders by rearranging the F’s and S’s except for the last outcome, which
must be the fifth success. The total number of possible orders is equal to the number of partitions of
the first six trials into two groups with 2 failures assigned to the one group and 4 successes assigned
to the other group. This can be done in $\binom{6}{4} = 15$ mutually exclusive ways. Hence, if X represents the
outcome on which the fifth success occurs, then
$$P(X = 7) = \binom{6}{4} (0.6)^5 (0.4)^2 = 0.1866.$$
Theorem 5.1
If repeated independent trials can result in a success with probability p and a failure with probability
q = 1-p, then the probability distribution of the random variable X, the number of the trial on which
the kth success occurs, is
$$b^*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k}, \quad x = k, k+1, k+2, \ldots$$
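The negative binomial formula can be checked against the value 0.1866 obtained earlier for the fifth success occurring on the seventh trial with p = 0.6. This is a minimal sketch; the function name `negative_binomial_pmf` is an illustrative assumption.

```python
from math import comb

def negative_binomial_pmf(x, k, p):
    """Probability that the k-th success occurs on trial x."""
    if x < k:
        return 0.0  # the k-th success cannot occur before trial k
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

# Fifth success on the seventh trial, p = 0.6
print(round(negative_binomial_pmf(7, 5, 0.6), 4))  # 0.1866
```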
Example 5.1
In an NBA championship series, the team that wins four games out of seven is the winner. Suppose
that teams A and B face each other in the championship games and that team A has probability 0.55
of winning a game over team B.
(a) What is the probability that team A will win the series in 6 games?
(b) What is the probability that team A will win the series?
(c) If teams A and B were facing each other in a regional playoff series, which is decided by
winning three out of five games, what is the probability that team A would win the series?
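A series of this kind ends on the game in which a team collects its final required win, so each part can be sketched with the negative binomial distribution. The code below assumes games are independent with a constant win probability of 0.55 for team A; the helper `series_win_probability` is a hypothetical name introduced for illustration.

```python
from math import comb

def negative_binomial_pmf(x, k, p):
    """Probability that the k-th success occurs on trial x (x >= k)."""
    return comb(x - 1, k - 1) * p**k * (1 - p)**(x - k)

def series_win_probability(wins_needed, p):
    """Sum over every game number on which the final required win can occur."""
    longest = 2 * wins_needed - 1
    return sum(negative_binomial_pmf(x, wins_needed, p)
               for x in range(wins_needed, longest + 1))

p_a = 0.55
win_in_six = negative_binomial_pmf(6, 4, p_a)   # part (a)
win_best_of_7 = series_win_probability(4, p_a)  # part (b)
win_best_of_5 = series_win_probability(3, p_a)  # part (c)

print(round(win_in_six, 4), round(win_best_of_7, 4), round(win_best_of_5, 4))
```

Under these assumptions the longer series favours the stronger team slightly more than the shorter one.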
THEOREM 5.2
If we define the mean of the negative binomial distribution as the average number of trials required
to produce k successes, then the mean is equal to
$$\mu = E[X] = \frac{k}{p},$$
where E[X] = μ is the mean number of trials, k is the number of successes, and p is the probability of
a success on any given trial.
The variance of a negative binomial random variable X is
$$\sigma^2 = \operatorname{Var}(X) = \frac{k(1-p)}{p^2}.$$
Example 5.2
Determine the average and the variance for the number of times a person flips a fair coin until he/she
gets the third head.
Geometric Distribution
If we consider the special case of the negative binomial distribution where k = 1, we have a probability
distribution for the number of trials required for a single success. An example would be the tossing
of a coin until a head occurs. We might be interested in the probability that the first head occurs on
the fourth toss. The negative binomial distribution reduces to the form
$$b^*(x; 1, p) = p q^{x-1}, \quad x = 1, 2, 3, \ldots$$
Since the successive terms constitute a geometric progression, it is customary to refer to this special
case as the geometric distribution and denote its values by g ( x; p ) .
Theorem 5.3
If repeated independent trials can result in a success with probability p and a failure with probability
q = 1- p, then the probability distribution of the random variable X, the number of trials until the first
success occurs, is
$$g(x; p) = p q^{x-1}, \quad x = 1, 2, 3, \ldots$$
Example 5.3
In a certain manufacturing process, it is known that, on the average, 1 in every 100 items is defective.
What is the probability that the fifth item inspected is the first defective item found?
Example 5.4
If the probability is 0.75 that an applicant for a driver’s license will pass the road test on any given
attempt, what is the probability that an applicant will finally pass the test on the fourth attempt?
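Both Example 5.3 and Example 5.4 ask for the probability that the first success occurs on a given trial, which is the geometric formula of Theorem 5.3. A minimal sketch follows; the function name `geometric_pmf` is an illustrative assumption.

```python
def geometric_pmf(x, p):
    """P(first success occurs on trial x) = p * q^(x - 1), with q = 1 - p."""
    return p * (1 - p) ** (x - 1)

# Example 5.3: first defective item is the fifth inspected, p = 1/100
print(round(geometric_pmf(5, 0.01), 4))  # 0.0096

# Example 5.4: applicant first passes on the fourth attempt, p = 0.75
print(round(geometric_pmf(4, 0.75), 4))  # 0.0117
```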
THEOREM 5.4
If X is a random variable with a geometric distribution, then
$$\mu = E[X] = \frac{1}{p} \quad \text{and} \quad \sigma^2 = \operatorname{Var}(X) = \frac{1-p}{p^2}.$$
Example 5.5
By referring to Example 5.3, determine the mean and standard deviation for the number of items
inspected until the first defective item is found.
Example 5.6
By referring to Example 5.4, determine the mean and standard deviation for the number of attempts
until the applicant passes the test.
Example 5.7
You are given a geometric random variable X with the property that P(X = 3) = 5 · P(X = 6). Find E(X).
Theorem 6.1
Let
N = total number of elements in the population
r = number of successes in the population
N – r = number of failures in the population
n = number of trials (sample size)
x = number of successes in n trials
n – x = number of failures in n trials
The probability of x successes in n trials is given by
$$P(x) = \frac{\binom{r}{x} \binom{N-r}{n-x}}{\binom{N}{n}}$$
where x is an integer 0, 1, 2, . . . , n, subject to the restrictions x ≤ r and n − x ≤ N − r .
Note:
i. The total number of ways of selecting samples of size n chosen from N items, irrespective of
whether they are successes or failures, is $\binom{N}{n}$.
→ These samples are assumed to be equally likely.
ii. There are $\binom{r}{x}$ ways of selecting x ‘successes’ from the r that are available, and for each of
these ways we can choose the n − x ‘failures’ in $\binom{N-r}{n-x}$ ways.
→ The total number of samples that contain x ‘successes’ is $\binom{r}{x}\binom{N-r}{n-x}$.
iii. Hence, $P(X = x) = \dfrac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$.
1. The types of applications of the hypergeometric are very similar to those of the binomial
distribution. Both compute the probabilities for the number of observations that fall into a
particular category.
2. The differences between the binomial distribution and the hypergeometric distribution are:
→ In the binomial distribution, independence among trials is required. The sampling must be done
with replacement.
→ In the hypergeometric distribution, the trials are not independent. Sampling is done without
replacement.
3. Applications for the hypergeometric distribution are found in many areas, with heavy use in
acceptance sampling, electronic testing, and quality assurance. For many of these
fields, testing is done at the expense of the item being tested. That is, the item is destroyed and
hence cannot be replaced in the sample.
Example 6.1
Brown Manufacturing makes auto parts that are sold to auto dealers. Last week the company shipped
25 auto parts to a dealer. Later, it found out that 5 of those parts were defective. By the time the
company manager contacted the dealer, 4 auto parts from that shipment had already been sold. What
is the probability that 3 of those 4 parts were good parts and 1 was defective?
Solution
Thus, the probability that 3 out of 4 parts sold are good and 1 is defective is 0.4506.
Example 6.2
Dawn Corporation has 12 employees who hold managerial positions. Of them, 7 are female and 5 are
male. The company is planning to send 3 of these 12 managers to a conference. If 3 managers are
randomly selected out of 12,
(a) find the probability that all 3 of them are female
(b) find the probability that at most 1 of them is a female
Solution
Let the selection of a female be called a success and the selection of a male be called a failure
(a) N = total number of managers in the population = 12
r = number of successes (females) in the population = 7
N – r = number of failures (males) in the population = 5
n = number of selections (sample size) = 3
x = number of successes (females) in three selections = 3
n – x = number of failures (males) in three selections = 0
Thus, the probability that all 3 of the managers selected are female is 0.1591.
Example 6.3
A particular part that is used as an injection device is sold in lots of 10. The producer deems a lot
acceptable if no more than one defective is in the lot. A sampling plan involves random sampling and
testing 3 of the parts out of 10. If none of the 3 is defective, the lot is accepted. Comment on the utility
of this plan.
Solution
Let us assume that the lot is truly unacceptable (i.e. that 2 out of 10 parts are defective). The
probability that the sampling plan finds the lot acceptable is
$$P(X = 0) = \frac{\binom{2}{0} \binom{8}{3}}{\binom{10}{3}} = 0.467.$$
Thus, if the lot is truly unacceptable, with 2 defective parts, this sampling plan will allow acceptance
roughly 47% of the time. As a result, this plan should be considered faulty.
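The three stated answers in Examples 6.1 to 6.3 (0.4506, 0.1591 and 0.467) can be reproduced from the hypergeometric formula of Theorem 6.1. The sketch below uses the notation N, r, n, x from that theorem; the function name `hypergeometric_pmf` is an illustrative assumption, and in Example 6.1 a good part is taken as the "success" (r = 20).

```python
from math import comb

def hypergeometric_pmf(x, N, r, n):
    """P(x successes in a sample of n, drawn without replacement
    from N items of which r are successes)."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Example 6.1: N = 25 parts, r = 20 good, n = 4 sold, x = 3 good
print(round(hypergeometric_pmf(3, 25, 20, 4), 4))  # 0.4506

# Example 6.2(a): N = 12 managers, r = 7 female, n = 3, x = 3
print(round(hypergeometric_pmf(3, 12, 7, 3), 4))   # 0.1591

# Example 6.3: N = 10 parts, r = 2 defective, n = 3, x = 0
print(round(hypergeometric_pmf(0, 10, 2, 3), 3))   # 0.467
```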
THEOREM 6.2
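If X is a random variable with a hypergeometric distribution, then
$$\mu = E[X] = \frac{nr}{N} \quad \text{and} \quad \sigma^2 = \operatorname{Var}(X) = \frac{N-n}{N-1} \cdot n \cdot \frac{r}{N}\left(1 - \frac{r}{N}\right).$$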
Example 6.4
By referring to Example 6.1, determine the mean and variance for the number of good auto parts in
the sample of 4 auto parts which was sold.
The Poisson probability distribution is another important probability distribution of a discrete random
variable that has a large number of applications. Suppose a washing machine in a laundromat breaks
down an average of three times a month. We may want to find the probability of exactly two
breakdowns during the next month. This is an example of a Poisson probability distribution problem.
Each breakdown is called an occurrence in Poisson probability distribution terminology. The Poisson
probability distribution is applied to experiments with random and independent occurrences. The
occurrences are random in the sense that they do not follow any pattern, and, hence, they are
unpredictable. Independence of occurrences means that one occurrence (or non-occurrence) of an
event does not influence the successive occurrences or non-occurrences of that event. The
occurrences are always considered with respect to an interval. In the example of the washing machine,
the interval is one month. The interval may be a time interval, a space interval, or a volume interval.
The actual number of occurrences within an interval is random and independent. If the average
number of occurrences for a given interval is known, then by using the Poisson probability
distribution, we can compute the probability of a certain number of occurrences, x, in that interval.
Note that the number of occurrences in an interval is denoted by x.
Theorem 7.1
Conditions to Apply the Poisson Probability Distribution
The following three conditions must be satisfied to apply the Poisson probability distribution.
1. x is a discrete random variable
2. The occurrences are random
3. The occurrences are independent
The following are three examples of discrete random variables for which the occurrences are random
and independent. Hence, these are examples to which the Poisson probability distribution can be
applied.
1. Consider the number of telemarketing phone calls received by a household during a given day. In
this example, the receiving of a telemarketing phone call by a household is called an occurrence, the
interval is one day (an interval of time), and the occurrences are random (that is, there is no specified
time for such a phone call to come in) and discrete. The total number of telemarketing phone calls
received by a household during a given day may be 0, 1, 2, 3, 4, and so forth. The independence of
occurrences in this example means that telemarketing phone calls are received individually and no
two (or more) of these phone calls are related.
2. Consider the number of defective items in the next 100 items manufactured on a machine. In this
case, the interval is a volume interval (100 items). The occurrences (number of defective items) are
random and discrete because there may be 0, 1, 2, …, 100 defective items in 100 items. We can
assume the occurrence of defective items to be independent of one another.
3. Consider the number of defects in a 5-foot-long iron rod. The interval, in this example, is a space
interval (5 feet). The occurrences (defects) are random because there may be any number of defects
in a 5-foot iron rod. We can assume that these defects are independent of one another.
The following examples also qualify for the application of the Poisson probability distribution.
1. The number of accidents that occur on a given highway during a 1-week period.
2. The number of customers entering a grocery store during a 1-hour interval
3. The number of television sets sold at a department store during a given week
In contrast, consider the arrival of patients at a physician’s office. These arrivals are non-random if
the patients have to make appointments to see the doctor. The arrival of commercial airplanes at an
airport is non-random because all planes are scheduled to arrive at certain times, and airport
authorities know the exact number of arrivals for any period (although this number may change
slightly because of late or early arrivals and cancellations). The Poisson probability distribution
cannot be applied to these examples.
Theorem 7.2
According to the Poisson probability distribution, the probability of x occurrences in an interval is
$$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots; \ \lambda > 0,$$
where λ is the mean number of occurrences in that interval and the value of e is approximately
2.71828.
Example 7.1
A washing machine in a laundry breaks down an average of three times per month. Find the
probability that during the next month this machine will have:
a) Exactly two breakdowns
b) At most one breakdown
c) At least one breakdown
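With λ = 3 breakdowns per month, each part is a sum (or complement) of Poisson probabilities from Theorem 7.2. This is a minimal sketch; the function name `poisson_pmf` is an illustrative assumption.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

lam = 3  # mean number of breakdowns per month

exactly_two = poisson_pmf(2, lam)                        # part (a)
at_most_one = poisson_pmf(0, lam) + poisson_pmf(1, lam)  # part (b)
at_least_one = 1 - poisson_pmf(0, lam)                   # part (c), complement rule

print(f"{exactly_two:.4f} {at_most_one:.4f} {at_least_one:.4f}")
# 0.2240 0.1991 0.9502
```

Part (c) uses the complement because "at least one" would otherwise require an infinite sum.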
Example 7.2
An insurance company has noted that, on average, ten insurance claims are received during a week
(consisting of 5 working days). Using an appropriate probability distribution, determine the
probability that
(i) less than 3 insurance claims are received in a week.
(ii) at least 2 insurance claims are received in a three-day period.
(iii) between 3 to 5 (inclusive) insurance claims are received in 2 weeks.
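In this example the interval changes from part to part, so λ has to be rescaled before the Poisson formula is applied. The sketch below assumes the mean number of claims is proportional to the length of the interval (10 per 5-day week, hence 2 per working day); that proportionality is the usual Poisson assumption but is stated here as an assumption, and all names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

weekly_rate = 10             # mean claims per 5-day week
daily_rate = weekly_rate / 5 # assumed proportional rescaling: 2 per day

# (i) fewer than 3 claims in one week: lambda = 10
part_i = sum(poisson_pmf(x, weekly_rate) for x in range(3))

# (ii) at least 2 claims in three days: lambda = 6
part_ii = 1 - sum(poisson_pmf(x, 3 * daily_rate) for x in range(2))

# (iii) between 3 and 5 claims (inclusive) in two weeks: lambda = 20
part_iii = sum(poisson_pmf(x, 2 * weekly_rate) for x in range(3, 6))

print(part_i, part_ii, part_iii)
```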
THEOREM 7.3
If X is a random variable possessing a Poisson distribution with parameter λ,
then
$$\mu = E[X] = \lambda \quad \text{and} \quad \sigma^2 = \operatorname{Var}(X) = \lambda.$$
Example 7.3
By referring to Example 7.2, determine the mean and standard deviation for the number of insurance
claims in
(a) 1 week
(b) 3 days
(c) 2 weeks
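DEFINITION 8.1
The k-th moment about the origin of a random variable X is
$$\mu_k' = E(X^k) = \begin{cases} \sum_{x} x^k p(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} x^k f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$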
Note:
i. The first moment about the origin is $\mu_1' = E(X)$.
ii. The second moment about the origin is $\mu_2' = E(X^2)$.
→ The mean is $\mu = \mu_1'$ and the variance is $\sigma^2 = \mu_2' - \mu^2$.
Although the moments of a random variable can be determined directly from Definition 8.1, an
alternative procedure exists. This procedure requires us to utilize a moment-generating function.
$$M(t) = E(e^{tX}) = \begin{cases} \sum_{x} e^{tx} p(x), & \text{if } X \text{ is discrete} \\ \int_{-\infty}^{\infty} e^{tx} f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$
The moment generating function is so-called because it generates the moments of X, as can be seen
by expanding the exponential function inside the expectation operator:
$$M_X(t) = E(e^{tX}) = E\left[1 + tX + \frac{(tX)^2}{2!} + \frac{(tX)^3}{3!} + \cdots\right]$$
$$= 1 + tE(X) + \frac{t^2}{2!}E(X^2) + \frac{t^3}{3!}E(X^3) + \cdots$$
It is clear that information about the moments of X about the origin, i.e. $E(X^k)$, k = 0, 1, 2, ..., has
been compressed into the function $M_X(t)$.
We can also use calculus to extract the k-th moment. To see this, observe that
$$M_X'(t) = E(X) + tE(X^2) + \frac{t^2}{2!}E(X^3) + \cdots$$
$$M_X''(t) = E(X^2) + tE(X^3) + \frac{t^2}{2!}E(X^4) + \cdots$$
In general,
$$M_X^{(k)}(t) = E(X^k) + tE(X^{k+1}) + \frac{t^2}{2!}E(X^{k+2}) + \cdots, \quad k = 1, 2, \ldots$$
Therefore
$$M_X'(0) = E(X), \quad M_X''(0) = E(X^2), \quad \text{and} \quad M_X^{(k)}(0) = E(X^k).$$
THEOREM 8.1
Let X be a random variable with moment-generating function $M_X(t)$. Then
$$\left.\frac{d^k M_X(t)}{dt^k}\right|_{t=0} = E(X^k) = \mu_k'.$$
Example 8.1
Given that X has the probability distribution $f(x) = \frac{1}{8}\binom{3}{x}$ for x = 0, 1, 2, and 3, find the
moment-generating function of this random variable and use it to determine $\mu_1'$ and $\mu_2'$.
Example 8.2
Let $M(t) = \frac{1}{7} + \frac{3}{7}e^{t} + \frac{2}{7}e^{3t} + \frac{1}{7}e^{4t}$. Find the mean and variance of X.
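For a discrete random variable, M(t) is a sum of terms p(x)e^{tx}, so the probability distribution can be read off the coefficients and exponents of the given function. The sketch below takes that route instead of differentiating M(t); the helper name `raw_moment` is an illustrative assumption.

```python
from fractions import Fraction

# Each term p(x) * e^(t*x) of M(t) contributes the pair x -> p(x)
pmf = {
    0: Fraction(1, 7),  # constant term 1/7 = (1/7) e^(0t)
    1: Fraction(3, 7),  # (3/7) e^(t)
    3: Fraction(2, 7),  # (2/7) e^(3t)
    4: Fraction(1, 7),  # (1/7) e^(4t)
}

def raw_moment(pmf, k):
    """k-th moment about the origin, E(X^k); equals the k-th derivative of M at t = 0."""
    return sum(x**k * p for x, p in pmf.items())

mean = raw_moment(pmf, 1)                # M'(0)
variance = raw_moment(pmf, 2) - mean**2  # M''(0) - [M'(0)]^2

print(mean, variance)  # 13/7 90/49
```

Differentiating M(t) twice and setting t = 0, as in Theorem 8.1, gives the same two moments.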
Example 8.3
Show that the moment-generating function of the Binomial random variable with parameters n and p
is given by $M_X(t) = (pe^t + q)^n$ and hence, show that E(X) = np and Var(X) = npq, where
q = 1 − p.
Solution
$$E(e^{tX}) = \sum_{k=0}^{n} e^{tk} \binom{n}{k} p^k q^{n-k} = \sum_{k=0}^{n} \binom{n}{k} (pe^t)^k q^{n-k}$$
From the binomial theorem,
$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^i = a^n + na^{n-1}b + \frac{n(n-1)}{2!}a^{n-2}b^2 + \cdots + nab^{n-1} + b^n.$$
Thus, applying the binomial theorem,
$$E(e^{tX}) = \sum_{k=0}^{n} \binom{n}{k} (pe^t)^k q^{n-k} = (pe^t + q)^n.$$
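Differentiating, $M_X'(t) = npe^t(pe^t + q)^{n-1}$ and $M_X''(t) = npe^t(pe^t + q)^{n-1} + n(n-1)p^2e^{2t}(pe^t + q)^{n-2}$.
Setting t = 0 and using p + q = 1, we have
$$E(X) = M_X'(0) = np, \quad E(X^2) = M_X''(0) = np + n(n-1)p^2,$$
so that $\operatorname{Var}(X) = np + n(n-1)p^2 - (np)^2 = np(1 - p) = npq$.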
Example 8.4
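Show that the moment-generating function of the Poisson random variable with parameter λ is given
by $M_X(t) = e^{\lambda(e^t - 1)}$ and hence, show that E(X) = λ and Var(X) = λ.
Solution
$$M_X(t) = \sum_{x=0}^{\infty} e^{tx} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!}$$
Note that $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}$, thus $\sum_{x=0}^{\infty} \frac{(\lambda e^t)^x}{x!} = e^{\lambda e^t}$. Hence
$$M_X(t) = e^{-\lambda} e^{\lambda e^t} = e^{\lambda(e^t - 1)}.$$
Differentiating, $M_X'(t) = \lambda e^t e^{\lambda(e^t - 1)}$ and $M_X''(t) = (\lambda e^t + \lambda^2 e^{2t}) e^{\lambda(e^t - 1)}$, so that
$E(X) = M_X'(0) = \lambda$ and $E(X^2) = M_X''(0) = \lambda + \lambda^2$. Therefore
$$V(X) = (\lambda + \lambda^2) - \lambda^2 = \lambda.$$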
9. Chebyshev’s Theorem
Theorem 9.1
The probability that any random variable X will assume a value within k standard deviations of the
mean is at least $1 - \frac{1}{k^2}$. That is,
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2}.$$
k
For k = 2, the theorem states that the random variable X has a probability of at least $1 - \frac{1}{2^2} = \frac{3}{4}$ of
falling within two standard deviations of the mean. That is, three-fourths or more of the observations
of any distribution lie in the interval $\mu \pm 2\sigma$. Similarly, the theorem says that at least eight-ninths of
the observations of any distribution fall in the interval $\mu \pm 3\sigma$.
Example 9.1
A random variable X has a mean μ = 8, a variance σ² = 9, and an unknown probability distribution.
Find
(a) P(−4 < X < 20)
(b) P(|X − 8| ≥ 6)
Solution
(a) $P(-4 < X < 20) = P[8 - (4)(3) < X < 8 + (4)(3)] \ge 1 - \frac{1}{4^2} = \frac{15}{16}$
(b) $P(|X - 8| \ge 6) = 1 - P(|X - 8| < 6)$
$= 1 - P(-6 < X - 8 < 6)$
$= 1 - P[8 - (2)(3) < X < 8 + (2)(3)] \le 1 - \left(1 - \frac{1}{2^2}\right) = \frac{1}{4}$
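The two bounds in this solution follow from the same quantity, 1 − 1/k², with k equal to the half-width of the interval measured in standard deviations. A minimal sketch, with the hypothetical helper name `chebyshev_lower_bound`:

```python
from fractions import Fraction

def chebyshev_lower_bound(k):
    """Lower bound 1 - 1/k^2 on P(mu - k*sigma < X < mu + k*sigma)."""
    k = Fraction(k)
    return 1 - 1 / k**2

mu, sigma = 8, 3  # mean 8, variance 9

# (a) the interval (-4, 20) is mu +/- 12, so k = 12 / sigma = 4
k_a = Fraction(20 - mu, sigma)
bound_a = chebyshev_lower_bound(k_a)      # P(-4 < X < 20) >= 15/16

# (b) |X - 8| >= 6 is the complement of mu +/- 6, so k = 6 / sigma = 2
k_b = Fraction(6, sigma)
bound_b = 1 - chebyshev_lower_bound(k_b)  # P(|X - 8| >= 6) <= 1/4

print(bound_a, bound_b)  # 15/16 1/4
```

These are bounds, not exact probabilities; the exact values depend on the unknown distribution of X.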
Chebyshev’s theorem holds for any distribution of observations, and for this reason the results are
usually weak. The value given by the theorem is a lower bound only. That is, we know that the
probability of a random variable falling within two standard deviations of the mean can be no less
than 3/4, but we never know how much more it might actually be. Only when the probability
distribution is known can we determine exact probabilities. For this reason we call the theorem a
distribution-free result. When specific distributions are assumed, as in the previous sections, the
results will be less conservative. The use of Chebyshev’s theorem is relegated to situations where the
form of the distribution is unknown.