0% found this document useful (0 votes)

232 views114 pages

Stat

1) The document is a review guide for UCLA STAT 100A final exam covering chapter 4 on random variables. 2) It provides examples of random variables such as the number of heads when flipping a coin twice, the sum of dice rolled from a pair of dice, and the number of flips needed for a weighted coin to land on heads. 3) It also discusses random variables in the context of roulette wheel outcomes and colored marbles drawn from an urn, defining the possible values and probabilities of the random variables.

Uploaded by

Shihab Hasan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

232 views114 pages

Stat

Uploaded by

Shihab Hasan

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 114

UCLA STAT 100A, Final Exam Review Guide

Chapter 4: Random Variables

4.1 Random Variables

A random variable is a “rule” (or, more technically, a function) which assigns a
number to each outcome in the sample space of an experiment. Probabilities are then
assigned to the values of the random variable.

Exercise 4.1 (Random Variables)

1. Flipping A Coin Twice. Let random variable X be the number of heads that
come up.

(a) P {HH} = (circle one) P {X = 0} / P {X = 1} / P {X = 2}

(b) P {HT, HT } = (circle one) P {X = 0} / P {X = 1} / P {X = 2}
(c) P {T T } = (circle one) P {X = 0} / P {X = 1} / P {X = 2}
(d) True / False If the coin is fair (heads come up as often as tails), the
distribution of X (the number of heads in two ﬂips of the coin) is then
x 0 1 2
1 2 1
P (X = x) 4 4 2

(e) True / False If heads come up twice as often as tails, the distribution of
X is then
x 0 1 2
1 1 1 1 2 4 2 2 4
P (X = x) 3
× 3
= 9
2× × 3 3
= 9 3
× 3
= 9

2. Rolling a Pair of Dice.

Fall, 2002.
58 Chapter 4. Random Variables

S
6 x x x x x x
5 x x x x x x
4 x x x x x x

Die 2
3 x x x x x x
2 x x x x x x
1 x x x x x x

1 2 3 4 5 6
Die 1

Figure 4.1 (Sample Space For Rolling A Pair of Dice)

(a) Let X be the sum of the dice.

P {(1, 1)} = (circle one) P {X = 1} / P {X = 2} / P {X = 3}
P {(1, 2), (2, 1)} = (circle one) P {X = 1} / P {X = 2} / P {X = 3}
P {(1, 5), (2, 4), (3, 3), (2, 4), (1, 5)} =
(circle one) P {X = 4} / P {X = 5} / P {X = 6}
P {X = 11} = (circle best one) P {(5, 6)} / P {(6, 5)} / P {(5, 6), (6, 5)}
True / False If the dice are fair (each number comes up one sixth of the
time), the distribution of X (the sum of two rolls of a pair of dice) is then
x 2 3 4 5 6 7
1 2 3 4 5 6
P (X = x) 36 36 36 36 36 36

x 8 9 10 11 12
5 4 3 2 1
P (X = x) 36 36 36 36 36

(b) Let X be the number of 4’s rolled.

P {(4, 4)} = (circle one) P {X = 0} / P {X = 1} / P {X = 2}
P {X = 1} = (circle best one)
P {(1, 4), (2, 4), (3, 4), (5, 4), (6, 4)}
P {(4, 1), (4, 2), (4, 3), (4, 5), (4, 6)}
P {(1, 4), (2, 4), (3, 4), (5, 4), (6, 4), (4, 1), (4, 2), (4, 3), (4, 5), (4, 6)}
P {X = 0} = (circle one) 11 36
/ 20
36
/ 25
36
True / False If the dice are fair, the distribution of X (the number of 4’s
in two rolls of a pair of dice) is then
x 0 1 2
25 10 1
P (X = x) 36 36 36

3. Flipping Until a Head Comes Up. A (weighted) coin has a probability of p = 0.7
of coming up heads (and so a probability of 1 − p = 0.3 of coming up tails).
This coin is flipped until a head comes up or until a total of 4 flips are made.
Let X be the number of flips.
Section 1. Random Variables 59

(a) P {X = 1} = P {H} (circle one) 0.7 / 0.3 / 0.3(0.7)

(b) P {X = 2} = P {T H} (circle one) 0.7 / 0.3 / 0.3(0.7)
(c) P {X = 3} = P {T T H} (circle one) 0.7 / 0.3(0.7) / 0.32 (0.7)
(d) P {X = 4} = P {T T T T, T T T H} (circle none, one or more)
0.7 / 0.33 (0.3 + 0.7) / 0.33
(Remember that, at most, only four (4) ﬂips can be made.)
4. Roulette. The roulette table has 38 numbers: the numbers are 1 to 36, 0 and
00. A ball is spun on a corresponding roulette wheel which, after a time, settles
down and the ball drops into one of 38 slots which correspond to the 38 numbers
on the roulette table. Consider the following roulette table.

0 3 6 9 12 15 18 21 24 27 30 33 36
2 5 8 11 14 17 20 23 26 29 32 35
00
1 4 7 10 13 16 19 22 25 28 31 34
first section second section third section

Figure 4.2 (Roulette)

(a) The sample space consists of 38 outcomes: {00, 0, 1, . . . , 35, 36}. The
event “an even comes up” (numbers 2, 4, 6, . . . , 36, but not 0 or 00)
consists of (circle one) 18 / 20 / 22 numbers.
The chance an even comes up is then (circle one) 18/38 / 20/38 / 22/38
The event “a number in the second section comes up” (12 numbers: 13,
16, 19, 14, 17, 20, 15, 18 and 21) consists of
(circle one) 12 / 20 / 22 numbers.
The chance a second section comes up is then
(circle one) 12
38
/ 20
38
/ 22
38
(b) Let random variable X be the winnings from a $1 bet placed on an even
coming up. If an even number does come up, the gambler keeps his dollar
and receives another dollar (+$1). If an odd number comes up, the gambler
loses the dollar he bets (−$1). In other words, an even pays “1 to 1”. And
so
P {X = $1} = (circle one) 1838
/ 20
38
/ 22
38
P {X = −$1} = (circle one) 18 38
/ 20
38
/ 22
38
True / False The distribution of X is then
x -$1 $1
22 20
P (X = x) 38 38
60 Chapter 4. Random Variables

(c) Let random variable Y be the winnings from a $1 bet placed on the second
section coming up. If a second section number does come up, the gambler
keeps his dollar and receives another two dollars (+$2). If a ﬁrst or third
section number comes up, the gambler loses the dollar he bets (−$1). In
other words, an second section bet pays “2 to 1”.
P {Y = $2} = (circle one) 12
38
/ 20
38
/ 26
38
P {Y = −$1} = (circle one) 1238
/ 20
38
/ 26
38
True / False The distribution of Y is then
y -$1 $2
P (Y = y) 26
38
12
38
20 26
By the way, P {Y = $1} = (circle one) 0 / 38
/ 38

5. Random Variables And Urns. Two marbles are taken, one at a time, without
replacement, from an urn which has 6 red and 10 blue marbles. We win $2 for
each red marble chosen and lose $1 for each blue marble chosen. Let X be the
winnings.

0 1marbles
(a) The chance both 0 1 are0red 1is0 1 0 10 1
@ A@
6 10 A @ 9 A@ 8 A @ A@
8 11 A
2
(circle one) 0 10 / 10 12 /
1
0 2
1
@ 16 A @ 16 A @ 16 A
2 3 3
(b) Since the winnings are X = $4, if both marbles are red, then
P {X = $4} = (circle one) 0.025 / 0.125 / 0.225
Use your calculator to work out the combinations.
(c) Choose the correct distribution below.
i. Distribution A.
x -$2 $1 $4
P (X = x) 0.500 0.375 0.125
ii. Distribution B.
x -$2 $1 $4
P (X = x) 0.375 0.500 0.125

4.2 Distribution Functions

The (cumulative) distribution function (c.d.f.) is
F (b) = P {X ≤ b}
where −∞ < b < ∞.

Exercise 4.2 (Cumulative Distribution Function)

Section 2. Distribution Functions 61

1. Flipping A Coin Twice. Recall,

x 0 1 2
1 2 1
P {X = x} 4 4 2

P {X = 0} = 0.25, P {X = 1} = 0.50, P {X = 2} = 0.25.

Consequently,

(a) P {X ≤ 0} = (circle one) 0.25 / 0.75 / 1

(b) P {X ≤ 1} = (circle one) 0.25 / 0.75 / 1
(c) P {X ≤ 2} = (circle one) 0.25 / 0.75 / 1
(d) True / False Since F (b) = P {X ≤ b},

F (0) = 0.25, F (1) = 0.75, F (2) = 1.

or

 0.25, x<0
F (x) = 0.75, 0≤x<1

1, 1≤x

(e) True / False A graph of the distribution function is given below.

0.75
0.50

0.25

0 1 2 3

Figure 4.3 (Graph of Distribution Functions)

2. Rolling a Pair of Dice.

(a) Let X be the sum of the dice. Recall,

x 2 3 4 5 6 7
1 2 3 4 5 6
P (X = x) 36 36 36 36 36 36

x 8 9 10 11 12
5 4 3 2 1
P (X = x) 36 36 36 36 36
62 Chapter 4. Random Variables

Consequently,
1 2 3
F (2)P {X ≤ 2} = (circle one) 36 / 36 / 36
1 2 3
F (3) = P {X ≤ 3} = (circle one) 36 / 36 / 36
F (11) = (circle one) 34
36
/ 35
36
/1
34 35
F (12) = (circle one) 36 / 36 / 1
True / False


 0, x < 2

 1

 , 2≤x<3
 36
 3, 3≤x<4
36
F (x) = .. ..

 . .




35
, 11 ≤ x < 12

 1,36
12 ≤ x

0 1
P {X < 2} = (circle one) 36 / 36 /1
0
P {X > 2} = (circle one) 36 / 35
36
/1
P {2 ≤ X < 4} = (circle none, one or more) F (4) − F (2) / F (3) − F (1)
/ F (3) − F (1)
(b) Let X be the number of 4’s rolled. Recall,
25 10 1
P {X = 0} = , P {X = 1} = , P {X = 2} = .
36 36 36
Consequently,
25 35
F (0) = (circle one) 36
/ 36
/1
25 35
F (1) = (circle one) 36
/ 36
/1
25 35
F (2) = (circle one) 36
/ 36
/1
True / False


 0, x<0
 25
, 0≤x<1
F (x) = 36
35

 , 1≤x<2
 36
1, 2≤x

3. Another Distribution. Let



 0, x<0
 1
, 0≤x<1
F (x) = 3
1

 , 1≤x<2
 2
1, 2≤x
1 1 1
(a) P {X = 0} = F (0) = (circle one) 6
/ 3
/ 2
1
(b) P {X = 1} = F (1) − F (0) = (circle one) 6
/ 13 / 1
2
Section 3. Discrete Random Variables 63

1 1 1
(c) P {X = 2} = F (2) − F (1) = (circle one) 6
/ 3
/ 2

4. Properties of Distribution Functions. Circle true or false.

(a) True / False If a < b, then F (a) ≤ F (b); that is, F is nondecreasing.
(b) True / False limb→∞ F (b) = 1
(c) True / False limb→−∞ F (b) = 0
(d) True / False limn→∞ F (bn ) = F (b); that is, F is right continuous (which
determines where the solid and empty endpoints are on the graph of a
distribution function)

4.3 Discrete Random Variables

The random variable is discrete if it assigns the outcomes in a sample space to a set
of ﬁnite or countably inﬁnite possible real values. We introduce the notation

p(a) = P {X = a}

Exercise 4.3 (Discrete Random Variables)

1. Chance of Seizures. The number of seizures, X, of a typical epileptic person in

any given year is given by the following probability distribution.

X 0 2 4 6 8 10
p(x) 0.17 0.21 0.18 0.11 0.16 0.17

(a) The chance a person has 8 epileptic seizures is

p(8) =(circle one) 0.17 / 0.21 / 0.16 / 0.11.
(b) The chance a person has less than 6 seizures is
(circle one) 0.17 / 0.21 / 0.56 / 0.67.
(c) P {X ≤ 4} = (circle one) 0.17 / 0.21 / 0.56 / 0.67.
(d) p(2) = (circle one) 0.17 / 0.21 / 0.56 / 0.67.
(e) p(2.1) = (circle one) 0 / 0.21 / 0.56 / 0.67.
(f) P {X > 2.1} = (circle one) 0.21 / 0.38 / 0.56 / 0.62.
(g) P {X = 0} + P {X = 2} + P {X = 4} + P {X = 6} + P {X = 8} + P {X =
10} =
(circle one) 0.97 / 0.98 / 0.99 / 1.
64 Chapter 4. Random Variables

(h) Histogram (Graph) of Distribution. Consider the probability histograms

given in the ﬁgure below.

P(X = x) P(X = x) P(X = x)

0.20 0.20 0.20

0.15 0.15 0.15

0.10 0.10 0.10

0 2 4 6 8 10 0 2 4 6 8 10 0 2 4 6 8 10

(a) (b) (c)

Figure 4.4 (Probability Histogram)

Which, if any, of the three probability histograms in the ﬁgure above,
describe the probability distribution of the number of seizures?
(circle none, one or more) (a) / (b) / (c)
(i) Function of Distribution. Which one of the following functions describes
the probability distribution of the number of seizures?
i. Function (a).

0.17, if x = 0
P (X = x) =
0.21, if x = 2
ii. Function (b).

0.18, if x = 4
P (X = x) =
0.11, if x = 6
iii. Function (c).


 0.17, if x=0



 0.21, if x=2

0.18, if x=4
P (X = x) =

 0.11, if x=6



 0.16, if x=8

0.17, if x = 10

2. Chance of Being A Smoker. Consider the following distribution of the number

of smokers in a group of three people,

x 0 1 2 3
1 3 3 1
P (X = x) 8 8 8 8
Section 3. Discrete Random Variables 65

1 3 4
(a) At exactly x = 0, P (X = 0) = (circle one) 0 / 8
/ 8
/ 8
1 3 4
(b) whereas at x = −2, P (X = −2) = (circle one) 0 / 8
/ 8
/ 8
1 3 4
(c) and, indeed, at any x < 0, P (X = x) = (circle one) 0 / 8
/ 8
/ 8
1
(d) Since, also, at x = 14 , P (X = 14 ) = (circle one) 0 / 8
/ 38 / 4
8
1 3
(e) and at x = 12 , P (X = 12 ) = (circle one) 0 / 8
/ 8
/ 48
1 3 4
(f) and, indeed, at any 0 < x < 1, P (X = x) = (circle one) 0 / 8
/ 8
/ 8
1 3 4
(g) But, at exactly x = 1, P (X = 1) = (circle one) 0 / 8 / 8 / 8
(h) whereas, at x = 1 14 , P (X = 1 14 ) = (circle one) 0 / 18 / 38 / 48

3. Number of Bikes. The number of bicycles, X, on a bike rack at lunch time

during the summer is given by the following probability distribution.
1
p(x) = , x = 5, 6, 7, 8, 9,
5
(a) The chance the bike rack has 8 bicycles is
p(8) = (circle one) 15 / 25 / 35 / 45 .
(b) The chance the bike rack has less than 8 bicycles is
(circle one) 15 / 25 / 35 / 45 .
1 2 3
(c) P {X ≤ 6} = (circle one) 5
/ 5
/ 5
/ 45 .
1 2 3
(d) p(7) = (circle one) 5
/ 5
/ 5
/ 45 .
0 2 1
(e) p(8.1) = (circle one) 5
/5
/ 35 .
5
/
(f) P {5 < X < 8} = (circle one) 05 / 15 / 2
5
/ 35 .
x=9 0 1 2 5
(g) x=5 p(x) = (circle one) 5 / 5 / 5 / 5
.

4. Flipping a Coin. The number of heads, X, in one ﬂip of a coin, is given by the
following probability distribution.

p(x) = (0.25)x (0.75)1−x , x = 0, 1

(a) The chance of ﬂipping 1 head (X = 1) is

p(1) = (0.25)1 (0.75)1−1 = (circle one) 0 / 0.25 / 0.50 / 0.75.
(b) This coin is (circle one) fair / unfair.
(c) The chance of ﬂipping no heads (X = 0) is
p(0) = (0.25)0 (0.75)1−0 = (circle one) 0 / 0.25 / 0.50 / 0.75.
(d) A “tabular” version of this probability distribution of ﬂipping a coin is
(circle one)
66 Chapter 4. Random Variables

i. Distribution A.
X 0 1
p(x) 0.25 0.75
ii. Distribution B.
X 0 1
p(x) 0.75 0.25
iii. Distribution C.
X 0 1
p(x) 0.50 0.50
(e) The number of diﬀerent ways of describing a distribution include (check
none, one or more)
i. function
ii. tree diagram
iii. table
iv. graph

(f) True / False F (a) = allx≤a p(x)

5. Rock, Scissors and Paper. Rock, scissors and paper (RSP) involves two players
in which both can either show either a “rock” (clenched ﬁst), “scissors” (V–
sign) or “paper” (open hand) simultaneously, where either does not know what
the other is going to show in advance. Rock beats scissors (crushes it), scissors
beats paper (cuts it) and paper bets rock (covers it). Whoever wins, receives
a dollar ($1). The payoﬀ matrix RSP is given below. Each element represents
the
amount player C (column) pays player R (row).

Player C → rock (1) scissors (2) paper (3)

Player R ↓
rock (1) 0 $1 -$1
scissors (2) -$1 0 $1
paper (3) $1 -$1 0

(a) According to the payoﬀ matrix, if both players C and R show “rock”, then
player C pays player R
(circle one) -$1 / $0 / $$1
(In other words, no one wins–one player does not pay the other player.)
(b) If player C shows “rock” and player R shows “paper”, then player C pays
player R
(circle one) -$1 / $0 / $$1
(c) To say player C pays player R negative one dollar, -$1, means (circle one)
Section 3. Discrete Random Variables 67

i. player C pays player R one dollar.

ii. player R pays player C one dollar.
iii. player C loses one dollar (to player R).
iv. player R wins one dollar (from player C).
(d) If each of the nine possible outcomes are equally likely (each occur with a
probability of 19 ), which is the correct probability distribution of payoﬀ X,
the amount that player C pays player R
i. Distribution A.
X -1 0 1
2 3 4
p(x) 9 9 9
ii. Distribution B.
X -1 0 1
3 3 3
p(x) 9 9 9

6. Binomial Distribution. The distribution of the binomial random variable is

given by

n
p(i) = P {X = i} = pi (1 − p)n−i , i = 0, 1, . . . , n
i
(a) If n = 10,
p =0.65, i = 4, then
10
p(5) = 0.654 0.356 = (circle one) 0.025 / 0.050 / 0.069
4
(2nd DISTR 0:binompdf(10,0.65,4) ENTER)
(b) If n = 11, p = 0.25, i = 3, then
p(3) = (circle one) 0.26 / 0.50 / 0.69
(c) True / False If n = 4, p = 0.25, then the entire distribution is given by
X 0 1 2 3 4
p(x) 0.32 0.42 0.21 0.05 0.004
(2nd DISTR 0:binompdf(4,0.25) ENTER)
7. Poisson Distribution. The distribution of the Poisson random variable is given
by
λi
p(i) = P {X = i} = e−λ , i = 0, 1, . . . , λ > 0
i!
(a) If λ = 10, i = 4, then
4
p(4) = P {X = 4} = e−10 104! = (circle one) 0.019 / 0.050 / 0.069
(2nd DISTR B:poissonpdf(10,4) ENTER)
(b) If n = 11, i = 3, then
p(3) = (circle one) 0.0026 / 0.0037 / 0.0069
68 Chapter 4. Random Variables

4.4 Expected Value

The expected value, E[X] (or mean, µ), of a random variable, X is given by

E[X] = xp(x)
x:p(x)>0

It is, roughly, a weighted average of the probability distribution.

Exercise 4.4 (Expected Value of a Discrete Random Variable)

1. Seizures. The probability mass function for the number of seizures, X, of a
typical epileptic person in any given year is given in the following table.

X 0 2 4 6 8 10
p(x) 0.17 0.21 0.18 0.11 0.16 0.17

(a) A First Look: Expected Value Is Like The Fulcrum Point of Balance.

P(X = x) P(X = x) P(X = x)

0.20 0.20 0.20

0.15 0.15 0.15

0.10 0.10 0.10

0 2 4 6 8 10 0 2 4 6 8 10 0 2 4 6 8 10

(a) (b) (c)

Figure 4.5 (Expected Value Is Like The Fulcrum Point of Balance)

If the expected value is like a fulcrum point which balances the “weight”
of the probability distribution, then the expected value is most likely close
to the point of the fulcrum given in which of the three graphs above?
Circle one. (a) / (b) / (c)
In other words, the expected value seems close to (circle one) 1 / 5 / 9
(b) Calculating The Expected Value. The expected value (mean) number of
seizures is given by
E[X] = 0(0.17) + 2(0.21) + 4(0.18) + 6(0.11) + 8(0.16) + 10(0.17)
which is equal to (circle one) 4.32 / 4.78 / 5.50 / 5.75.
(Use your calculator: STAT ENTER; type X, 0, 2, 4, 6 and 8, into L1 and
p(x), 0.17, . . . , 0.17, into L2 ; then deﬁne
L3 = L1 × L2 ; then STAT CALC
ENTER 2nd L3 ENTER; then read x = 4.78.)
Section 4. Expected Value 69

(c) General Formula For The Expected Value. True / False The general
formula for the expected value (mean) is given by

n
E[X] = xi p(xi )
i=1

2. Smokers. The number of smokers, X, in any group of three people is given by

the following probability distribution.

x 0 1 2 3
1 3 3 1
p(x) 8 8 8 8

The mean (expected) number of smokers is

1 3 3 1
E[X] = µX = ×0 + ×1 + ×2 + ×3
8 8 8 8
which is equal to (circle one) 0.5 / 1.5 / 2.5 / 3.5.
3. Another Distribution In Tabular Form. If the distribution is

x 0 1 2 3
4 2 1 1
p(x) 8 8 8 8

the mean is
4 2 1 1
E[X] = ×0 + ×1 + ×2 + ×3 =
8 8 8 8
which is equal to (circle one) 1.500 / 0.875 / 1.375 / 0.625
4. Rolling a Pair of Dice. If the dice are fair, the distribution of X (the sum of
two rolls of a pair of dice) is then

x 2 3 4 5 6 7
1 2 3 4 5 6
P (X = x) 36 36 36 36 36 36

x 8 9 10 11 12
5 4 3 2 1
P (X = x) 36 36 36 36 36

The mean (expected) sum of the roll of a pair of fair dice is then
1 2 2 1
E[X] = µX = ×2 + ×3 + · · · + ×11 + ×12
36 36 36 36
which is equal to (circle one) 5 / 6 / 7 / 8.
(Think about it: this is a symmetric distribution balanced on what number?)
70 Chapter 4. Random Variables

5. Expectation and Distribution Function. If the distribution is

3−x
p(x) = , x = 1, 2,
3
the mean is
3−1 3−2
E[X] = 1 × +2×
3 3
3 4 5 6
which is equal to (circle one) 3
/ 3
/ 3
/ 3

6. Roulette. The roulette table has 38 numbers: the numbers are 1 to 36, 0 and
00. A ball is spun on a corresponding roulette wheel which, after a time, settles
down and the ball drops into one of 38 slots which correspond to the 38 numbers
on the roulette table.

(a) Let random variable X be the winnings from a $1 bet placed on an even
coming up, where this bet pays 1 to 1. Recall,
x -$1 $1
20 18
p(x) 38 38

and so the mean is

20 18
E[X] = −1 × +1×
38 38
which is equal to (circle one) − 20
38
2
/ − 38 / 2
38
/ 20
38
(b) Let random variable X be the winnings from a $1 bet placed on a section
(with 12 numbers) coming up, where this bet pays 2 to 1. Recall,
x -$1 $2
26 12
p(x) 38 38

and so the mean is

26 12
E[X] = −1 × +2×
38 38
which is equal to (circle one) − 20
36
2
/ − 38 / 2
38
/ 20
38

7. Binomial Distribution. The distribution of the binomial random variable is

given by

n
p(i) = P {X = i} = pi (1 − p)n−i , i = 0, 1, . . . , n
i

Consider the case when n = 4 and p = 0.25, where

Section 5. Expectation of a Function of a Random Variable 71

X 0 1 2 3 4
p(x) 0.32 0.42 0.21 0.05 0.004

(a) The mean (expected value) is then

E[X] = µX = 0.32×0 + 0.42×1 + 0.21×2 + 0.05×3 + 0.004×4

which is equal to (circle closest one) 1 / 2 / 3 / 4.

(b) np = (circle closest one) 1 / 2 / 3 / 4.
(which, notice, is the same answer as above!)
(c) True / False If X is a binomial random variable, then E[X] = np.

8. Expectation and the Indicator Function. The random variable I is an indicator

function of an event A if

1, if A occurs
I=
0, if Ac occurs

and so the mean is

E[X] = 1 × P (A) + 0 × (1 − P (A))

which is equal to (circle one) 0 / 1 − P (A) / P (A)

4.5 Expectation of a Function of a Random Vari-

able
The expected value of a function g of the random variable X, E[g(X)] is given by

E[g(X)] = g(xi )p(i )
i

Exercise 4.5(Expected Value of a Function of a Discrete Random Variable)

1. Seizures. The probability mass function for the number of seizures, X, of a

typical epileptic person in any given year is given in the following table.

X 0 2 4 6 8 10
p(x) 0.17 0.21 0.18 0.11 0.16 0.17

(a) If the medical costs for each seizure, X, is $200, g(x) = 200x, the new
distribution for g(x) becomes,
72 Chapter 4. Random Variables

X 0 2 4 6 8 10
g(X) = 200x 200(0) = 0 200(2) = 400 800 1200 1600 2000
p(g(x)) 0.17 0.21 0.18 0.11 0.16 0.17
The expected value (mean) cost of seizures is then given by

E[g(X)] = E[200X] = [0](0.17) + [400](0.21) + · · · + [2000](0.17)

which is equal to (circle one) 432 / 578 / 750 / 956.

(Use your calculator: STAT ENTER; type X, 0, 2, 4, 6 and 8, into L1 and
deﬁne g(X) in L2 = 200 × L1 , and type p(x), 0.17, . . . , 0.17, into L3 ; then
deﬁneL4 = L2 × L3 ; then STAT CALC ENTER 2nd L4 ENTER; then
read x = 956.)
(b) If the medical costs for each seizure is g(x) = 200x + 1500, the new distri-
bution for g(x) becomes,

X 0 2 4 6 8 10
g(X) = 200x + 1500 200(0) + 1500 = 1500 1900 2300 2700 3100 3500
p(g(x)) 0.17 0.21 0.18 0.11 0.16 0.17
The expected value (mean) cost of seizures is then given by

E[g(X)] = E[200X + 1500] = (1500)(0.17) + (1900)(0.21) + · · · + (3500)(0.17)

which is equal to (circle one) 432 / 578 / 750 / 2456.

(c) If the medical costs for each seizure is g(x) = x2 , the new distribution for
g(x) becomes,
X 0 2 4 6 8 10
g(X) = x2 2
0 =0 4 16 36 64 100
p(g(x)) 0.17 0.21 0.18 0.11 0.16 0.17
The expected value (mean) cost of seizures is then given by

E[g(X)] = E[X 2 ] = (0)(0.17) + (4)(0.21) + · · · + (100)(0.17)

which is equal to (circle one) 34.92 / 57.83 / 75.01 / 94.56.

(E[X 2 ] is called the second moment (about the origin); E[X 3 ] is called the
third moment; E[X n ] is called the nth moment.) )
(d) If the medical costs for each seizure is g(x) = 200x2 + x − 5,
E[g(X)] = E[200X 2 + X − 5] = (circle closest one)
4320 / 5780 / 6983 / 8480.
Section 5. Expectation of a Function of a Random Variable 73

2. Flipping Until a Head Comes Up. A (weighted) coin has a probability of p = 0.7
of coming up heads (and so a probability of 1 − p = 0.3 of coming up tails).
This coin is flipped until a head comes up or until a total of 4 flips are made.
Let X be the number of flips. Then, recall,

X 1 2 3 4
2 3
p(x) 0.7 0.3(0.7) = 0.21 0.3 (0.7) = 0.063 0.3 = 0.027

(a) E[X] = (circle one) 1.417 / 2.233 / 2.539 / 4.567

(b) If g(x) = 3x + 5, E[g(X)] = (circle one) 7.417 / 8.233 / 9.251 / 10.567
(c) 3E[X] + 5 = 3(1.417) + 5 = (circle one) 7.417 / 8.233 / 9.251 / 10.567
(d) True / False aE[X] + b = E[aX + b]
(e) If g(x) = x2 the second moment is
E[g(X)] = E[X 2 ] = (circle one) 1.539 / 2.233 / 2.539 / 4.567
(f) If g(x) = x3 the third moment is
E[g(X)] = E[X 3 ] = (circle one) 1.539 / 2.233 / 2.539 / 5.809
(g) If g(x) = x4 the fourth moment is
E[g(X)] = E[X 4 ] = (circle one) 11.539 / 12.233 / 12.539 / 16.075
In general, E[X n ], is the nth moment.

3. Consider the distribution

3−x
p(x) = , x = 1, 2
3
(a) The mean is
3−1 3−2
E[X] = [1] × + [2] ×
3 3
2 3 4 5
which is equal to (circle one) 3
/ 3
/ 3
/ 3
(b) If g(x) = 3x + 5,
4
E[g(X)] = E[3X + 5] = 3E[X] + 5 = 3 × +5=
3
21 22 23 27
which is equal to (circle one) 3
/ 3
/ 3
/ 3
(c) If g(x) = 6x,
4
E[g(X)] = E[6X] = 6E[X] = 6 × =
3
21 22 23 24
which is equal to (circle one) 3
/ 3
/ 3
/ 3
74 Chapter 4. Random Variables

4.6 Variance
We will now look at the variance, V (X),

Var(X) = E[(X − µ)2 ]

and standard deviation, SD(X), of a random variable, X.

Exercise 4.6 (Standard Deviation of a Discrete Random Variable)

1. Seizures. Since the number of seizures, X, of a typical epileptic person in any

given year is given by the following probability distribution,

X 0 2 4 6 8 10
P(X = x) 0.17 0.21 0.18 0.11 0.16 0.17

and the expected value (mean) number of seizures is given by µ = E(X) = 4.78,

(a) A First Look: Variance Measures How “Dispersed” The Distribution Is.

P(X = x) P(X = x) P(X = x)

0.20 0.20 0.20

0.15 0.15 0.15

0.10 0.10 0.10

0 2 4 6 8 10 0 2 4 6 8 10 0 2 4 6 8 10

(a) seizure distribution (b) another distribution (c) and another distribution

Figure 4.6 (Variance Measures How “Dispersed” The Distribution Is)

If the variance (standard deviation) measures how “spread out” (or “dis-
persed”) the distribution is, then distribution for the number of seizures
distribution (a) above, is (circle one) more / as equally / less dispersed
than the other two distributions (b) and (c) above.
In other words, if ten (10) is “very” dispersed and zero (0) is not dispersed
(concentrated at one point), then the variance for the seizure distribution
seems close to (circle one) 0 / 7 / 10
(b) Calculating The Variance. The variance is given by,

Var(X) = (0 − 4.78)2 (0.17) + (2 − 4.78)2(0.21) + · · · + (10 − 4.78)2(0.17)

Section 6. Variance 75

which is equal to (circle one) 10.02 / 11.11 / 12.07 / 13.25.

The standard deviation is given by
√
SD(X) = 12.07

which is equal to (circle one) 3.47 / 4.11 / 5.07 / 6.25.

(Use your calculator: as above, STAT ENTER; type X, 0, 2, 4, 6 and
8, into L1 and P (X = x), 0.17, . . . , 0.17, into L2 ; then deﬁne L3 =
2
1 − 4.78) × L2 ; then STAT√CALC ENTER 2nd L3 ENTER; then read
(L
x = 12.07 for the variance; 12.07 = 3.47 gives the standard deviation.)
(c) If the medical costs for each seizure, X, is $200, g(x) = 200x, the new
distribution for g(x) becomes,
X 0 2 4 6 8 10
g(X) = 200x 200(0) = 0 200(2) = 400 800 1200 1600 2000
p(g(x)) 0.17 0.21 0.18 0.11 0.16 0.17
Since the expected value (mean) cost of seizures is E[g(X)] = 200E[X] =
200(4.78) = 956, then the variance for g(X) is given by,

Var(X) = (0 − 0)2 (0.17) + (400 − 956)2 (0.21) + · · · + (2000 − 956)2(0.17)

which is equal to (circle one) 100200 / 311100 / 4120700 / 482800.

The standard deviation is given by
√
SD(X) = 482800

which is equal to (circle closest one) 347 / 411 / 507 / 695.

(d) Using the Formula Var(aX + b) = a2 Var(X). If the medical costs for each
seizure, X, is $200, g(x) = 200x, the new distribution for g(x) becomes,
X 0 2 4 6 8 10
g(X) = 200x 200(0) = 0 200(2) = 400 800 1200 1600 2000
p(g(x)) 0.17 0.21 0.18 0.11 0.16 0.17
Since the variance of X is given by Var(X) = 12.07, the variance of g(x) =
200x is given by

Var[g(X)] = Var[200X] = 2002 Var(X) = 2002 (12.07) =

100200 / 311100 / 4120700 / 482800.

2. Smokers. Since the number of smokers, X, in any group of three people is given
by the following probability distribution.

x 0 1 2 3
1 3 3 1
P (X = x) 8 8 8 8
76 Chapter 4. Random Variables

(a) One Way To Calculate Variance. Since the mean (expected) number of
smokers is µ = 1.5, then the variance is given by,
1 3 3 1
Var(X) = (0 − 1.5)2 + (1 − 1.5)2 + (2 − 1.5)2 + (3 − 1.5)2
8 8 8 8
which is equal to (circle one) 0.02 / 0.41 / 0.59 / 0.75.
The standard
√ deviation is given by
SD(X) = 0.75 (circle one) 0.47 / 0.86 / 1.07 / 2.25.
(b) Another Way To Calculate Variance. Since the mean (expected) number
of smokers is E[X] = µ = 1.5 and the second moment is given by
1 3 3 1
E[2 ] = (0)2 + (1)2 + (2)2 + (3)2 = 3
8 8 8 8
then the variance is given by,

Var(X) = E[X 2 ] − [E(X)]2 = 3 − (1.5)2

which is equal to (circle one) 0.02 / 0.41 / 0.59 / 0.75.

The standard
√ deviation is given by
SD(X) = 0.75 (circle one) 0.47 / 0.86 / 1.07 / 2.25.

3. Rolling a Pair of Dice. If the dice are fair, the distribution of X (the sum of
two rolls of a pair of dice) is

x 2 3 4 5 6 7
1 2 3 4 5 6
P (X = x) 36 36 36 36 36 36

x 8 9 10 11 12
5 4 3 2 1
P (X = x) 36 36 36 36 36

where, remember, the expected value is 7 and so

1 2 1 1
Var(X) = (2 − 7)2 + (3 − 7)2 + · · · + (11 − 7)2 + (12 − 7)2
36 36 36 36
which is equal to (circle one) 346
/ 35
6
/ 36
6
.
The standard

deviation is given by
35
SD(X) = 6
(circle one) 0.47 / 0.86 / 1.07 / 2.42.

4. And Yet Another Distribution. Since the distribution is

3−x
P (X = x) = , x = 1, 2,
3
Section 6. Variance 77

and µ = 43 , then
2 2
4 3−1 4 3−2
Var(X) = 1 − + 2− ,
3 3 3 3

which is equal to (circle one) 29 / 39 / 49 / 59 .

Also, SD(X) = 29 = (circle one) 0.05 / 0.47 / 1.07 / 2.25.

5. Roulette.

(a) Let random variable X be the winnings from a $1 bet placed on an even
coming up, where this bet pays 1 to 1. Recall,
x -$1 $1
20 18
p(x) 38 38
2
where the mean is µ = − 38 and so
2 2
2 20 2 18
Var(X) = −1 − − + 1− − ,
38 38 38 38

which is equal to (circle one) 360 / 860 / 891 / 932 .

361 361 361 361
Also, SD(X) = 29 = (circle one) 0.051 / 0.999 / 1.573 / 2.251.
(b) Let random variable X be the winnings from a $1 bet placed on a section
(with 12 numbers) coming up, where this bet pays 2 to 1. Recall,
x -$1 $2
26 12
p(x) 38 38
2
where the mean is µ = − 38 and so
2 2
2 26 2 12
Var(X) = −1 − − + 2− − ,
38 38 38 38

which is equal to (circle one) 702 / 860 / 891 / 932 .

361 361 361 361
Also, SD(X) = 29 = (circle one) 0.05 / 0.47 / 1.39 / 2.25.

6. Mathematical Manipulations Of Variance and Expectation.

(a) If E[X] = 4 and Var(X) = 3, then

E[5X − 2] = 5E[X] − 2 = 5(4) − 2 = (circle one) 16 / 18 / 20.
Var[5X − 2] = 52 Var(X) = 25(3) = (circle one) 15 / 18 / 75.
E[7X + 5] = 7E[X] + 5 = 7(4) + 5 = (circle one) 16 / 28 / 33.
Var[7X + 5] = 72 Var(X) = 49(3) = (circle one) 134 / 118 / 147.
78 Chapter 4. Random Variables

(b) If E[X] = −2 and Var(X) = 6, then

E[5X − 2] = 5E[X] − 2 = 5(−2) − 2 = (circle one) −12 / −18 / −20.
Var[5X − 2] = 52 Var(X) = 25(6) = (circle one) 150 / 180 / 750.
E[7X + 5] = 7E[X] + 5 = 7(−2) + 5 = (circle one) −7 / −9 / −11.
Var[−7X + 5] = (−7)2 Var(X) = 49(6) = (circle one) 234 / 268 / 294.
(c) If E[X] = −2 and Var(X) = 6, then
E[X 2 ] = E[X]− Var(X) = −2 − (6) = (circle one) −6 / −8 / −20.
Review Chapter 5

Continuous Random Variables

5.1 Introduction
For all real x ∈ (−∞, ∞),

P {X ∈ B} = f (x) dx
B
where f (x) is called the probability density function and so
b
P {x ≤ X ≤ B} = f (x) dx
a
a

P {X = a} = f (x) dx = 0
a
a

P {X < a} P {X ≤ a} = f (x) dx
−∞

Exercise 5.1 (Introduction to Continuous Random Variables)

1. A First Look: Uniform Probability Distribution and Potatoes An automated
process fills one bag after another with Idaho potatoes. Although each filled
bag should weigh 50 pounds, in fact, because of the differing shapes and weights
of each potato, each bag weighs anywhere from 49 pounds to 51 pounds, as
indicated in the three graphs below.

area is 1

0.5 0.5 0.5

x x 49 x
0 49 51 0 49 50 51 0 51
49.3 50.7

(a) f(x) = 0.5 on [49, 51] (b) f(x) = 0.5 on [49, 51] (c) f(x) = 0.5 on [49, 51]

107
108 Chapter 5. Continuous Random Variables

Figure 5.1 (Uniform Distributions and Potatoes)

(a) If all of the filled bags must fall between 49 and 51 pounds, then there
is (circle one) a little / no chance that one filled bag, chosen at random
from all filled bags, will weigh 48.5 pounds.
(b) There is (circle one) a little / no chance that one filled bag, chosen at
random, will weigh 51.5 pounds.
(c) There is a (circle one) 100% / 50% / 0% chance that one randomly chosen
filled bag chosen will weigh 53.5 pounds.
(d) One randomly chosen filled bag will weigh 36 pounds with probability
(circle one) 1 / 0.5 / 0.
(e) One randomly chosen filled bag will weigh (strictly) less than 49 pounds
with probability P (x < 49) = (circle one) 1 / 0.5 / 0.
(f) One randomly chosen filled bag will weigh (strictly) more than 51 pounds
with probability P (x > 51) = (circle one) 1 / 0.5 / 0.
(g) Figure (a). One randomly chosen filled bag will weigh between 49 and 51
pounds (inclusive) with probability P (49 ≤ x ≤ 51) =
(circle one) 1 / 0.5 / 0.
(h) More Figure (a). The probability P (49 ≤ x ≤ 51) is represented by or
equal to the (circle none, one or more)
i. rectangular area equal to 1.
ii. rectangular area equal to the width (51 − 49 = 2) times the height
(0.5).
iii. definite integral of f (x) = 0.5 over the interval [49, 51].
51
iv. 49 0.5 dx = [0.5x]5149 = 0.5(51) − 0.5(49) = 1.
(i) True / False The probability density function is given by the piecewise
function,

 0 if x < 49
f (x) = 0.5 if 49 ≤ x ≤ 51

0 if x > 51
This is an example of a uniform probability density function (pdf).
(j) Figure (b). One randomly chosen filled bag will weigh between 49 and 50
(not 51!) pounds (inclusive) with probability
P (49 ≤ x ≤ 50) = (50 − 49)(0.5) = (circle one) 0 / 0.5 / 1.
(k) More Figure (b). One randomly chosen filled bag will weigh between 49
and 50 pounds (inclusive)
50 with probability
P (49 ≤ x ≤ 50) = 49 0.5 dx = [0.5x]50
49 = 0.5(50) − 0.5(49) =
(circle one) 0 / 0.5 / 1.
Section 1. Introduction 109

(l) Figure (c). One randomly chosen ﬁlled bag will weigh between 49.3 and
50.7 pounds (inclusive) with probability
P (49.3 ≤ x ≤ 50.7) = (50.7 − 49.3)(0.5) = (circle one) 0 / 0.5 / 0.7.
(m) More Figure (c). One randomly chosen ﬁlled bag will weigh between 49.3
and 50.7 pounds (inclusive)
50.7 with probability
P (49.3 ≤ x ≤ 50.7) = 49.3 0.5 dx = [0.5x]50.7
49.3 = 0.5(50.7) − 0.5(49.3) =
(circle one) 0 / 0.5 / 0.7.
50.9
(n) P (49.1 ≤ x ≤ 50.9) = 49.1 0.5 dx = [0.5x]50.9
49.1 = 0.5(50.9) − 0.5(49.1) =
(circle one) 0 / 0.5 / 0.9.
(o) Another example.
50.9
P (x ≤ 50.9) = f (x) dx
−∞
49 50.9
= f (x) dx + f (x) dx
−∞ 49
49 50.9
= 0 dx + 0.5 dx
−∞ 49
= 0+ [0.5x]50.9
49
= 0.5(50.9) − 0.5(49) =

(circle one) 0 / 0.5 / 0.95.

(p) Another example.
50.2
P (x ≤ 50.2) = f (x) dx
−∞
49 50.2
= f (x) dx + f (x) dx
−∞ 49
49 50.2
= 0 dx + 0.5 dx
−∞ 49
= 0+ [0.5x]50.2
49
= 0.5(50.2) − 0.5(49) =

(circle one) 0 / 0.6 / 0.95.

110 Chapter 5. Continuous Random Variables

(q) Another example.

∞
P (x ≥ 50.2) = f (x) dx
50.2
51 ∞
= f (x) dx + f (x) dx
50.2 51
51 ∞
= 0.5 dx + 0 dx
50.2 51
= [0.5x]51
50.2 + 0
= 0.5(51) − 0.5(50.2) =

(circle one) 0.4 / 0.6 / 0.95.

2. More Probability Density Distributions. In addition to the uniform probability

density function, there are other probability density functions, as shown in the
three graphs below.

2 area = 1 area = 1

area = 1 0
x
x x
0.751 1.69 2.09 2.515 2.930

(a) f(x) = -(9/20)x + 1.5 (b) f(x) = x2 - 2/x (c) f(x) = x2- 5

Figure 5.2 (Diﬀerent Probability Density Functions)

9
(a) Figure (a), f (x) = − 20 x + 1.5 on [0, 0.751]. The probability P ([0, 0.751])
is represented by or equal to the (circle none, one or more)
i. shaded area equal to 1.
9
ii. deﬁnite integral of f (x) = − 20 x + 1.5 deﬁned over the interval
[0, 0.751].
0.751 9 9 2 0.751
iii. 0 − 20 x + 1.5 dx = − 40 x + 1.5x 0 =1
9

(Y1 = − 20 x+1.5, WINDOW 0 4 1 0 2 1, GRAPH, 2nd CALC 7: f (x) dx)
(b) More Figure (a). True / False The probability density function is given
by the piecewise function,
9
− 20 x + 1.5 if 0 ≤ x ≤ 0.751
f (x) =
0 elsewhere
Section 1. Introduction 111

(c) More Figure (a)

0.5
9
P (0.1 ≤ x ≤ 0.5) = − x + 1.5 dx
0.1 20

0.5
9 2
= − x + 1.5x =
40 0.1

(circle one) 0.446 / 0.546 / 0.646.

(d) More Figure (a). P (x ≥ 0.4) = (circle one) 0.436 / 0.546 / 0.646.
(e) Figure (b), f (x) = x2 − x2 on [1.69, 2.09]. The probability P ([1.69, 2.09])
is represented by or equal to the (circle none, one or more)
i. shaded area equal to 1.
ii. deﬁnite integral of f (x) = x2 − x2 deﬁned over the interval [1.69, 2.09].
2.09 2.09
iii. 1.69 x2 − x2 dx = 13 x3 − 2 ln x 1.69 = 1
(Y1 = x2 − x2 ,
(f) More Figure (b). True / False The probability density function is given
by the piecewise function,
2 2
x − x if 1.69 ≤ x ≤ 2.09
f (x) =
0 elsewhere

(g) More Figure (b). The following piecewise function,

2 2
x − x if 3 ≤ x ≤ 5
f (x) =
0 elsewhere

(circle one) is / is not a probability density function because

5 2 2
3
x − x dx = 1
(h) More Figure (b). The following piecewise function,
2 2
x − x if 0 ≤ x ≤ 1
f (x) =
0 elsewhere

is not a probability density function because (circle two)

i. it is not continuous (there are “gaps” in the interval).
ii. negative.
iii. its integral the interval [0, 1] does not exactly equal 1.
(i) More Figure (b).P (0.1 ≤ x ≤ 0.5) = (circle one) 0 / 0.546 / 0.646.
(j) More Figure (b). P (x ≥ 0.4) = (circle one) 0.436 / 0.546 / 1.
112 Chapter 5. Continuous Random Variables

(k) Figure (c), f (x) = x2 − 5. The function f (x) = x2 − 5 is a probability

density function if deﬁned on the interval (circle one) [0.001, 0.251] /
[2.515, 2.930] / [1.545, 1.978]

(l) More Figure (c)

2.7
P (2.6 ≤ x ≤ 2.7) = x2 − 5 dx
2.6

2.7
1 3
= x − 5x =
3 2.6

(circle one) 0.202 / 0.546 / 0.646.

(m) More Figure (c). P (x ≥ 0.4) = (circle one) 0.436 / 0.546 / 1.

3. Normalizing Continuous Functions Into Probability Density Functions

(a) Find C such that f (x) = Cx is a probability density function over the
interval [2, 4]. In other words, ﬁnd C such that
4
P (2 ≤ x ≤ 4) = Cx dx
2

4
C 2
= x
2 2
C 2 C 2
= (4) − (2)
2 2
C 2
= (4) − (2)2
2
C
= (12)
2
= 6C
= 1
1 1
and so C = (circle one) / / 16 .
4 5 4 1
2 6
x dx = 1.
(b) Find k such that f (x) = kx is a probability density function over the
Section 1. Introduction 113

interval [1, 5]. In other words, ﬁnd k such that

5
P (1 ≤ x ≤ 5) = kx dx
1

5
k 2
= x
2 1
k 2 k 2
= (5) − (1)
2 2
k 2
= (5) − (1)2
2
k
= (24)
2
= 12k
= 1
1 1 1
and so k = (circle one) 4
/ 11
/ 12
.
2
(c) Find k such that f (x) = kx is a probability density function over the
interval [1, 5]. In other words, ﬁnd k such that
5
P (1 ≤ x ≤ 5) = kx2 dx
1

5
k 3
= x
3 1
k 3 k 3
= (5) − (1)
3 3
k 3
= (5) − (1)3
3
k
= (124)
3
124
= k
3
= 1
3 1 3
and so k = (circle one) 26
/ 11
/ 124
.
(d) Find k such that f (x) = k(x − 3) is a probability density function over the
114 Chapter 5. Continuous Random Variables

interval [1, 5]. In other words, ﬁnd k such that

5
P (1 ≤ x ≤ 5) = k(x + 3) dx
1

5
k 2
= x + 3kx
2 1

k 2 k 2
= (5) + 3k(5) − (1) + 3k(1)
2 2
k
= (25 − 1) + k(15 − 3)
2
k
= (24) + 12k
2
= 24k
= 1
3 1 1
and so k = (circle one) 26
/ 24
/ 18
.
∞
(e) Exponential Distribution and Improper Integration. 0 λe−λx dx =
(circle none, one or more)
b
limb!1 ; e;x
0
;(0)
limb!1 −e;b − −e
limb!1 −e;b + 1
∞
and so 0 λe−λx dx = (circle one) −1 / 0 / 1
and so f (x) = λe−λx , λ > 0 is a probability density function

5.2 Expectation and Variance of Continuous Ran-

dom Variables
Let f (x) be the probability density function. Then, the expected value, denoted
E[X], (or µ) is deﬁned as
∞
E[X] = xf (x) dx
−∞

and the variance, denoted Var(X), is deﬁned as

Var(X) = E[(X − µ)2 ] = E[X 2 ] − (E[X])2

Section 2. Expectation and Variance of Continuous Random Variables 115

and the standard deviation is deﬁned as the square root of the variance. Some prop-
erties include,
∞
E[g(X)] = g(x)f (x) dx
−∞
E[aX + b] = aE[X] + b
Var(X) = a2 Var(X)

Exercise 5.2 (Expected Value, Variance and Standard Deviation)

1. Expected Values For Uniform Probability Density Functions: Potatoes Again
Consider, again, the different automated processes which fill bags of Idaho pota-
toes which have different uniform probability density functions, as shown in the
three graphs below.

1/3
1/3.9

x
x 49.5 50.5 x
0 49 52 0 0 48.4 52.3
50.5 50 50.35
expected value expected value expected value

(a) f(x) = 1/3 on [49, 52] (b) f(x) = 1 on [49.5, 50.5] (c) f(x) = 1/3.9 on [48.4, 52.3]

Figure 5.3 (Expected Values of Uniform Probability Density Functions)

(a) Figure (a). Since the weight of potatoes are uniformly spread over the
interval [49, 52], we would expect the weight of a potato chosen at random
from all these potatoes to be
49+52
2
= (circle one) 50 / 50.5 / 51.
(b) Figure (a) Again. The expected weight of a potato chosen at random can
also be calculated in the following way:
∞
E[X] = xf (x) dx
−∞
49 52 ∞
= xf (x) dx + xf (x) dx + xf (x) dx
−∞ 49 52
49 52 ∞
1
= x(0) dx + x dx + x(0) dx
−∞ 49 3 52

52
1 2
= 0+ x +0
6 49
1 1
= (52)2 − (49)2 =
6 6
50 / 50.5 / 51.
116 Chapter 5. Continuous Random Variables

(c) Figure (b). Since the weight of potatoes are uniformly spread over the
interval [49.5, 50.5], we would expect the weight of a potato chosen at ran-
dom from all these potatoes to be
49.5+50.5
2
= (circle one) 50 / 50.5 / 51.
(d) Figure (b) Again. The expected weight of a potato chosen at random can
also be calculated in the following way:
∞
E[X] = xf (x) dx
−∞
49.5 50.5 ∞
= xf (x) dx + xf (x) dx + xf (x) dx
−∞ 49.5 50.5
49.5 50.5 ∞
= x(0) dx + x(1) dx + x(0) dx
−∞ 49.5 50.5

50.5
1 2
= 0+ x +0
2 49.5
1 1
= (50.5)2 − (49.5)2 =
2 2
50 / 50.5 / 51.
(e) Figure (c). The expected weight of a potato chosen at random can also be
calculated in the following way:
∞
E(x) = xf (x) dx
−∞
52.3
1
= x dx
48.4 3.9

52.3
1 2
= x
7.8 48.4
1 1
= (52.3)2 − (48.4)2 =
2(7.8) 2(7.8)

50.15 / 50.35 / 51.15.

2. Expected Values For Other Probability Density Functions Consider the following
probability density functions, as shown in the three graphs below.
Section 2. Expectation and Variance of Continuous Random Variables 117

0
x
0.751 x 1.69 2.09 x
2.515 2.930
0.359 1.93 2.77
expected value expected value expected value
(a) f(x) = -(9/20)x + 1.5 (b) f(x) = x2 - 2/x (c) f(x) = x2- 5

Figure 5.4 (Expected Values of Other Probability Density Functions)

(a) Figure (a). Since there is “more” probability on the “left” of the interval
[0, 0.751], we would expect the expected (or mean) weight of value chosen
from this distribution to be (circle one) smaller than / equal to / larger
than the middle value, 0+0.751
2
= 0.3755.
(b) Figure (a) Again. The expected value is
∞
E[X] = xf (x) dx
−∞
0 0.751 ∞
= xf (x) dx + xf (x) dx + xf (x) dx
−∞ 0 0.751
0 0.751 ∞
9
= x(0) dx + x − x + 1.5 dx + x(0) dx
−∞ 0 20 0.751
0.751
9 2
= − x + 1.5x dx
0 20

0.751
−9 3 1.5 2
= x + x =
60 2 0

0.359 / 0.376 / 0.410.

0.751 9 2
(Use MATH 9:fnInt for 0 − 20 x + 1.5x dx.)
(c) Figure (b). Since there is “more” probability on the “right” of the interval
[1.69, 2.09], we would expect the expected (or mean) weight of value chosen
from this distribution to be (circle one) smaller than / equal to / larger
than the middle value, 1.69+2.09
2
= 1.89.
(d) Figure (b) Again. The expected value is
∞
E[X] = xf (x) dx
−∞
2.09
2 2
= x x − dx
1.69 x
2.09
3
= x − 2 dx =
1.69

1.89 / 1.93 / 2.04.

118 Chapter 5. Continuous Random Variables

(e) Figure (c). The expected value is

∞
E[X] = xf (x) dx
−∞
2.930

= x x2 − 5 dx
2.515
2.930
3
= x − 5x dx =
2.515

(circle one) 2.77 / 2.93 / 3.04.

3. Expected Values For Diﬀerent Functions, E[g(x)].

(a) The expected value of g(X) = X + 2 where the probability density is
f (x) = x2 − 5 on the interval [2.515, 2.930], is
∞
E(X + 2) = (x + 2)f (x) dx
−∞
2.930

= (x + 2) x2 − 5 dx
2.515
2.930
3
= x + 2x2 − 5x − 10 dx =
2.515

(circle one) 2.77 / 2.93 / 4.79.

(b) The expected value of g(X) = X 2 where the probability density is f (x) =
x2 − 5 on the interval [2.515, 2.930], is
∞
2
E[X ] = x2 f (x) dx
−∞
2.930

= x2 x2 − 5 dx
2.515
2.930
4
= x − 5x2 dx =
2.515

(circle one) 6.77 / 7.65 / 8.79.

(c) The expected value of g(X) = 2X 3 where the probability density is f (x) =
x2 − 5 on the interval [2.515, 2.930], is
∞
3
E[2X ] = 2x3 f (x) dx
−∞
2.930

= 2x3 x2 − 5 dx
2.515
2.930
= 2x5 − 10x3 dx =
2.515
Section 2. Expectation and Variance of Continuous Random Variables 119

(circle one) 36.77 / 37.65 / 42.32.

(d) Consider the probability density f (x) = x2 −5 on the interval [2.515, 2.930].
Then,
∞
3
E[2X ] − 2 = 2x3 f (x) dx − 2
−∞
2.930

= 2x3 x2 − 5 dx − 2
2.515
2.930
5
= 2x − 10x3 dx − 2 =
2.515

(circle one) 36.77 / 40.32 / 42.32.

(e) Consider the probability density f (x) = x2 −5 on the interval [2.515, 2.930].
Then,
∞ ∞
2 2
E[X ] − 3E[X] = x f (x) dx − 3 xf (x) dx
−∞ −∞
2.930 2.930
2
2
= x x − 5 dx − 3 x x2 − 5 dx
2.515
2.930 2.515
2.930
= x4 − 5x2 dx − 3 x3 − 5x dx =
2.515 2.515

(circle one) −0.667 / 0.667 / 2.322.

4. Variance (and Standard Deviation) For Diﬀerent Probability Density Functions.

9
(a) The variance of the probability density f (x) = − 20 x + 1.5 on the interval
[0, 0.751], is

Var(X) = E[X 2 ] − [E(X)]2

∞ ∞
2
2
= x f (x) dx − xf (x) dx
−∞ −∞
0.751
2
0.751
2 9 9
= x dx −
− x + 1.5 x − x + 1.5 dx
0 20 0 20
0.751 0.751
2
9 3 2 9 2
= − x + 1.5x dx − − x + 1.5x dx
0 20 0 20

(circle one) 0.173 / 0.047 / 0.123. √

The standard deviation, then is σ = 0.047 =
(circle one) 0.173 / 0.047 / 0.216.
120 Chapter 5. Continuous Random Variables

2
(b) The variance of the probability density f (x) = x2 − x
on the interval
[1.69, 2.09], is
Var(X) = E(x2 ) − [E(x)]2
∞ ∞
2
2
= x f (x) dx − xf (x) dx
−∞ −∞

2.09 2.09
2
2 2 2 2 2
= x x − dx − x x − dx
1.69 x 1.69 x
2.09 2.09
2
4 3
= x − 2x dx − x − 2 dx =
1.69 1.69

(circle one) −0.02 / 0.02 / 0.04

(which is incorrect, due to round oﬀ error in the calculator) √
The standard deviation (assuming a variance of 0.02), is σ = 0.02 =
(circle one) 0.141 / 0.047 / 0.216.
(c) The variance of the probability density f (x) = x2 − 5 on the interval
[2.515, 2.930], is
Var(X) = E(x2 ) − [E(x)]2
∞ ∞
2
2
= x f (x) dx − xf (x) dx
−∞ −∞

2
2.930
2
2
2.930 2

= x x − 5 dx − x x − 5 dx
2.515 2.515
2.930
2
4 2
2.930 3
= x − 5x dx − x − 5x dx =
2.515 2.515

(circle one) −0.04 / 0.04 / 0.04

(which is also incorrect, due to round oﬀ error in the calculator) √
The standard deviation (assuming a variance of 0.04), is σ = 0.04 =
(circle one) 0.146 / 0.199 / 0.216.
(d) True / False. The variance (and standard deviation) provide a measure
of how “spread out” of “dispersed” the probability density function is from
the expected value.

5.3 The Uniform Random Variable

Uniform random variable X has a probability density function on the interval (α, β)
where
1
β−α
if α ≤ x ≤ β
f (x) =
0 elsewhere
Section 3. The Uniform Random Variable 121

and a distribution function,


 0 if a ≤ α
a−α
F (a) = if α < a < β
 β−α
1 if a ≥ β

where the expected value and variance are

β+α
E[X] =
2
(β − α)2
Var(X) =
12

Exercise 5.3 (Uniform Random Variable)

1. Uniform Probability Density Functions: Potatoes Again Different automated
processes which fill bags of Idaho potatoes have different uniform probability
density functions, as shown in the three graphs below.

1
area is 1 area is 1
1/3 area is 1 1/3.9

x
x x
0 49 52 0 49.5 50.5 0 48.4 52.3

(a) f(x) = 1/3 on [49, 52] (b) f(x) = 1 on [49.5, 50.5] (c) f(x) = 1/3.9 on [48.4, 52.3]

Figure 5.5 (Uniform Probability Density Functions and Potatoes)

(a) Figure (a).

i. The probability P (49 ≤ x ≤ 52) is represented by or equal to the
(circle none, one or more)
A. rectangular area equal to 1.
B. rectangular area equal to the width (52 − 49 = 1) times the height
( 13 ).
C. deﬁnite integral of f (x) = 13 over the interval [49, 52].
52 52
D. 49 31 dx = 13 x 49 = 13 (52) − 13 (49) = 1.
ii. True / False The probability density function is given by the piecewise
function,

 0 if x < 49
1
f (x) = if 49 ≤ x ≤ 52
 3
0 if x > 52
122 Chapter 5. Continuous Random Variables

iii. Probability By Integrating.

50.2
P (x ≤ 50.2) = f (x) dx
−∞
49 50.2
= f (x) dx + f (x) dx
−∞ 49
49 50.2
1
= 0 dx + dx
−∞ 49 3

50.2
1
= 0+ x
3 49
1 1
= (50.2) − (49) =
3 3
(circle one) 13 / 13:2 / 13:4 .
iv. Probability By Distribution Function.

P (x ≤ 50.2) = F (50.2)
50.2 − α
=
β−α
50.2 − 49
= =
52 − 49
(circle one) 1
3
/ 13:2 / 13:4 .
v. P (x ≥ 50.2) = (circle one) 03:8 / 13:2 / 13:8 .
vi. Expectation and Variance.
E[X] = β+α2
= 52+49
2
= (circle one) 50 / 50.5 / 51.
(β−α)2 (52−49)2
Var(X) = 12 = 12 (circle one) 0.75 / 1 / 1.25.
(b) Figure (b).
i. The probability P (49.5 ≤ x ≤ 50.5) is represented by or equal to the
(circle none, one or more)
A. rectangular area equal to 1.
B. rectangular area equal to the width (50.5 − 49.5 = 3) times the
height (1).
C. deﬁnite integral of f (x) = 1 over the interval [49.5, 50.5].
50.5
D. 49.5 1 dx = [x]50.5
49.5 = 51.5 − 49.5 = 1.
ii. True / False The probability density function is given by the piece-
wise function,

1 if 49.5 ≤ x ≤ 50.5
f (x) =
0 elsewhere
Section 3. The Uniform Random Variable 123

iii. Probability By Integration.

50.2
P (x ≤ 50.2) = f (x) dx
−∞
49.5 50.2
= f (x) dx + f (x) dx
−∞ 49.5
49.5 50.2
= 0 dx + 1 dx
−∞ 49.5
= 0+ [x]50.2
49.5
= 50.2 − 49.5 =

(circle one) 0.5 / 0.7 / 0.9.

iv. Probability By Distribution Function.

P (x ≤ 50.2) = F (50.2)
50.2 − α
=
β−α
50.2 − 49.5
= =
52 − 49.5
(circle one) 0.5 / 0.7 / 0.9.
v. P (x ≥ 50.1) = 1 − F (50.2) = (circle one) 0.2 / 0.3 / 0.4.
vi. Expectation and Variance.
E[X] = β+α2
= 50.5+49.5
2
= (circle one) 50 / 50.5 / 51.
(β−α)2 (50.5−49.5)2
Var(X) = 12 = 12
(circle one) 0.075 / 0.083 / 0.093.
(c) Figure (c).
i. The probability P ([48.4, 52.3]) = P ([48.4 ≤ x ≤ 52.3]) is represented
by or equal to the (circle none, one or more)
A. rectangular area equal to 1.
B. rectangular area equal to the width (52.3 − 48.4 = 3.9) times the
1
height ( 3.9 ).
1
C. deﬁnite integral of f (x) = 3.9 over the interval [48.4, 52.3].
52.3 1 1 52.3 1 1
D. 48.4 3.9 dx = 3.9 x 48.4 = 3.9 (52.3) − 3.9 (48.4) = 1.
ii. True / False The probability density function is given by the piece-
wise function,
1
3.9
if 48.4 ≤ x ≤ 52.3
f (x) =
0 elsewhere
124 Chapter 5. Continuous Random Variables

iii. Probability By Distribution Function.

P (x ≤ 50.2) = F (50.2)
50.2 − α
=
β−α
50.2 − 48.4
= =
52.3 − 48.4

(circle one) 31:9 / 13::29 / 23::19 .

iv. More Probability By Distribution Function.

P (49.3 ≤ x ≤ 50.2) = F (50.2) − F (47.3)

50.2 − 48.4 49.3 − 48.4
= − =
52.3 − 48.4 52.3 − 48.4
(circle one) 0.8 / 0.9 / 1.0.
(d) More Questions.
i. If a uniform density is defined on the interval [40, 50], then f (x) =
1 1 1
50 1
(circle one) 10 / 15 / 20 and zero elsewhere since 40 10 dx = 1
ii. If a uniform density is defined on the interval [0, 50], then f (x) =
1 1 1 50 1
(circle one) 30 / 40 / 50 and zero elsewhere since 0 50 dx = 1
iii. If a uniform density is defined on the interval [−10, 5050],1 then f (x) =
1 1 1
(circle one) 30 / 40 / 60 and zero elsewhere since −10 60 dx = 1
iv. If a uniform 50density is defined on the interval [−10, 50], then
P ([0, 50]) = 0 60 dx = (circle one) 30
1
60
/ 4060
/ 50
60
v. If a uniform density is defined on the interval [−2.3, 5.5], then
P ([−2.1, 5.1]) = (circle one) 77::18 / 77::28 / 77::78
vi. True / False. In general, a uniform probability density function over
the interval [a, b] is given by
1
β−α
if α ≤ x ≤ β
f (x) =
0 elsewhere

vii. A uniform probability density function has the following properties:

(circle none, one or more)
A. continuity (there are no “gaps” in the interval).
B. nonnegative (it is never negative).
C. integral over entire interval equals exactly 1.
Section 4. Normal Random Variables 125

5.4 Normal Random Variables

We now look at the (standard) normal probability density distribution,
1 2
f (x) = √ e−x /2
2π
with distribution function,
x
1 2 /2
F (x) = Φ(x) = √ e−y dy
2π −∞

One interesting property1 of the standard normal is

1 2
1 − Φ(x) ∼ √ e−x /2
x 2π
We also look at a more general version of this function called the (nonstandard)
normal probability density distribution,
1 2
f (x) = √ e−(1/2)[(x−µ)/σ]
σ 2π
with distribution function,

a−µ
FX (a) = Φ
σ

and where the expected value and variance are

E[X] = µ
Var(X) = σ 2

Exercise 5.4 (Normal Distribution)

1. Probabilities For Standard Normal: Westville Temperatures. In Westville, in

February, the temperature, x, is assumed to be standard normally distributed
with mean µ = 0o and variance σ 2 = 1o .
1 a(x)
a(x) ∼ b(x) if limx→∞ b(x) =1
126 Chapter 5. Continuous Random Variables

f(x) f(x)
P(X < 1.42) = ?

P(x < -2.11) = ?

0 x x
1.42 -2.11 0

(a) (b)

f(x)
f(x)
P(X > 0.54) = ? P(-1.73<X<1.62) = ?

0 0.54 x -1.73 0 1.62 x

Figure 5.6 (Probabilities For Standard Normal: Westville Temperatures)

(a) The standard normal distribution, in (a) of the figure above, say, is (circle
one) skewed right / symmetric / skewed left.
(b) Since the standard normal is a probability density function, the total area
under this curve is (circle one) 50% / 75% / 100% / 150%.
(c) The shape of this distribution is (circle one)
triangular / bell–shaped / rectangular.
(d) This distribution has an expected value at (circle one) µ = 0o / µ = 1o .
(e) Since this distribution is symmetric, (circle one) 25% / 50% / 75% of the
temperatures are above (to the right) of 0o .
(f) The probability of the temperature being less than 1.42o is (circle one)
greater than / about the same as / smaller than 0.50. Use (a) in
the figure above.
(g) The probability the temperature is less than 1.42o,
P {X ≤ 1.42} = F (1.42)
= Φ(1.42)
1.42
1 2
= √ e−y /2 dy =
2π −∞
0.9222 / 0.0174 / 0.2946 / 0.9056.
(It is not possible to determine this integral in an analytical way (“by
hand”) and so you must use your calculator to perform a numerical ap-
proximation for this integration: 2nd DISTR 2:normalcdf(− 2nd EE 99,
1.42); look at graph (a) of the figure above to better visualize the proba-
bility that is being determined.)
Section 4. Normal Random Variables 127

(h) P {X < −2.11} = Φ(−2.11) =

(circle one) 0.9222 / 0.0174 / 0.2946 / 0.9056.
(Use 2nd DISTR 2:normalcdf( − 2nd EE 99, −2.11).)
(i) P {x > 0.54} = 1 − Φ(0.54) =
(circle one) 0.9222 / 0.0174 / 0.2946 / 0.9056.
(Use 2nd DISTR 2:normalcdf(0.54, 2nd EE 99).)
(j) P {−1.73 < X < 1.62} = Φ(1.62) − Φ(−1.73) =
(circle one) 0.9222 / 0.0174 / 0.2946 / 0.9056.
(Use 2nd DISTR 2:normalcdf( −1.73, 1.62).)
(k) True / False The probability the temperature is exactly 1.42o , say, is zero.
This is because the probability is equal to the area under the bell–shaped
curve and there is no area “under” the “line” at 1.42o .
(l) True / False P {X < 1.42o } = P {X ≤ 1.42o}.

2. Nonstandard Normal, A First Look: IQ Scores. It has been found that IQ scores
can be distributed by a nonstandard normal distribution. The following ﬁgure
compares the two normal distributions for the 16 year olds and 20 year olds.
f(x)

16 year old IQs

σ = 16
f(x)
20 year old IQs

σ = 20

µ = 100 µ = 120 x

Figure 5.7 (Nonstandard Normal Distributions of IQ Scores)

(a) The mean IQ score for the 20 year olds is

µ = (circle one) 100 / 120 / 124 / 136.
(b) The average (or mean) IQ score for the 16 year olds is
(circle one) 100 / 120 / 124 / 136.
(c) The standard deviation in the IQ score for the 20 year olds
σ = (circle one) 16 / 20 / 24 / 36.
(d) The standard deviation in the IQ score for the 16 year olds is
(circle one) 16 / 20 / 24 / 36.
(e) The normal distribution for the 20 year old IQ scores is (circle one)
broader than / as wide as / narrower than the normal distribution
for the 16 year old IQ scores.
128 Chapter 5. Continuous Random Variables

(f) The normal distribution for the 20 year old IQ scores is (circle one) shorter
than / as tall as / taller than than the normal distribution for the 16
year old IQ scores.
(g) The total area (probability) under the normal distribution for the 20 year
old IQ scores is (circle one) smaller than / the same as / larger than
the area under the normal distribution for the 16 year old IQ scores.
(h) True / False Neither the normal distribution for the IQ scores for the
20 year old IQ scores nor the 16 year old IQ scores is a standard normal
because neither have mean zero, µ = 0, and standard deviation 1, σ = 1.
Both, however, have the same general “bell–shaped” distribution.
(i) There is (circle one) one / two / many / an infinity of nonstandard
normal distributions. The standard normal is one special case of the family
of (nonstandard) normal distributions where µ = 0 and σ = 1.

3. Probabilities For Nonstandard Normal: IQ Scores Again.

P(X > 84) = ? f(x) P(96 < X < 120) = ? f(x)

SD 16 SD 16

84 x 100 120 x
100 96
(a) (b)

P(X > 84) = ? f(x) P(96 < X < 120) = ? f(x)

SD 20 SD 20

x x
84 120 96 120

Figure 5.8 (Probabilities For Nonstandard Normal Distributions of IQ Scores)

(a) The upper two (of the four) normal curves above represent the IQ scores
for sixteen year olds. Both are nonstandard normal curves because the
(circle none, one or more)
i. the average is 100 and the SD is 16.
ii. neither the average is 0, nor is the SD equal to 1.
iii. the average is 16 and the SD is 100.
iv. the average is 0 and the SD is 1.
Section 4. Normal Random Variables 129

The lower two normal curves above represent the IQ scores for twenty year
olds (µ = 120, σ = 20).
(b) Since the sixteen year old distribution is symmetric, (circle one) 25% /
50% / 75% of the IQ scores are above (to the right) of 100.
(c) The probability of the IQ scores being less than 84, P {X < 84}, for the
sixteen year old distribution is (circle one) greater than / about the
same as / smaller than 0.50.
(d) P {X < 84} =

84 − µ
P {X > 84} = 1 − Φ
σ

84 − 100
= 1−Φ
16
84
1 2
= 1− √ e−(1/2)[(84−100)/16] dy =
σ 2π −∞

(circle one) 0.8413 / 0.1587 / −0.1587

(Use 2nd DISTR 2:normalcdf(− 2nd EE 99, 84, 100, 16).)
(e) Consider the following table of probabilities and possible values of proba-
bilities.
Column I Column II
(a) P {X > 84}, “sixteen year old” normal (a) 0.4931
(b) P {96 < X < 120}, “sixteen year old” normal (b) 0.9641
(c) P {X > 84}, “twenty year old” normal (c) 0.8413
(d) P {96 < X < 120}, “twenty year old” normal (d) 0.3849
Using your calculator and the ﬁgure above, match the four items in column
I with the items in column II.
Column I (a) (b) (c) (d)
Column II
(f) True / False P {X < 84} for standard normal equals P {X < 84} for the
nonstandard normal

4. Standardizing Nonstandard Normal Random Variables. Nonstandard random

variable X, with mean µ and standard deviation σ, can be “standardized” into
a standard random variable Z using the following formula:
X −µ
Z=
σ
130 Chapter 5. Continuous Random Variables

(a) The IQ scores for the 16 year olds are normal with µ = 100 and σ = 16.
The standardized value of the nonstandard IQ score of 110 for the 16 year
olds, then, is
Z = X−µσ
= 110−100
16
= (circle one) 0.625 / 1.255 / 3.455
and so P {X > 110} = P {Z > 0.625}.
(Compare 2nd DISTR 2:normalcdf(110, 2nd EE 99, 100, 16) with 2nd
DISTR 2:normalcdf(0.625, 2nd EE 99, 0, 1).)
(b) The IQ scores for the 20 year olds are normal with µ = 120 and σ = 20.
The standardized value of the nonstandard IQ score of 110 for the 20 year
olds, then, is
Z = X−µσ
= 110−120
20
= (circle one) 0.5 / -0.5 / 0.25.
and so P {X > 110} = P {Z > −0.5}.
(Compare 2nd DISTR 2:normalcdf(110, 2nd EE 99, 120, 20) with 2nd
DISTR 2:normalcdf(−0.5, 2nd EE 99, 0, 1).)
(c) If both a 16 year old and 20 year old score 110 on an IQ test, (check none,
one or more)
i. the 16 year old is brighter relative to his age group than the 20 year
old is relative to his age group
ii. the z–score is higher for the 16 year old than it is for the 20 year old
iii. the z–score allows us to compare the IQ score for a 16 year old with
the IQ score for a 20 year old
(d) If µ = 100 and σ =
16, then
P {X > 130} = P Z > 130−100
16
= (circle one) 0.03 / 0.31
(e) If µ = 120 and σ =
20, then
P {X > 130} = P Z > 130−120
20
= (circle one) 0.03 / 0.31
(f) If µ = 25 and σ = 5, then

P {27 < X < 32} = P 27−25 5
<Z< 32−25
5
=
(circle one) 0.03 / 0.26 / 0.31

5. Normal Approximation To Binomial. A lawyer estimates she wins 40% of her

cases (p = 0.4), and this problem is assumed to obey the conditions of a bi-
nomial experiment. If the lawyer presently represents n = 10 defendants and
X represents the number of wins (of the 10 cases), the functional form of the
probability is given by,

10
(0.4)i (0.6)10−i , i = 0, 1, 2, . . . 10
i

(a) Binomial, Exactly. Various quantities connected to this binomial, includ-

ing,
i. mean or expected value is given by µ = np (circle one) 4 / 2.4 / 3.0.
Section 4. Normal Random Variables 131

ii. standard deviation is given by σ = np(1 − p) (circle one) 4 / 2.4 /
3.0 / 1.55.
iii. P {X ≥ 5} = (circle one) 0.367 / 0.289 / 0.577.
(Use your calculator; subtract 2nd DISTR A:binomcdf(10, 0.4, 4) from
one (1).)
(b) Normal Approximation. Consider a graph of the binomial and a normal
approximation to this distribution below.

f(x)
0.30
0.30
σ = 1.55 2
0.25
P(X >
_ 5) 0.25 Ν(4,1.55 )

0.20 0.20
P(X = x)

P(X >
_ 4.5)
0.15 0.15

0.10 0.10

0.05 0.05

0.00 0.00
0 1 2 3 4 5 6 7 8 9 10 µ=4
number of cases won, x number of cases won, x

(a) binomial (b) normal approximation

Figure 5.9 (Binomial and Normal Approximation)

i. True / False The binomial is a discrete distribution, whereas the nor-
mal approximation is a continuous distribution. We will be approxi-
mating a discrete distribution by a continuous one. In particular, we
plan to approximate the shaded P (X ≥ 5) from the binomial with the
shaded P (X ≥ 4.5) from the normal.
ii. It would appear, simply by looking at the two graphs above, as though
the binomial is (circle one) skewed / symmetric. This is “good”
because we plan to approximate a (not necessarily, but, in this case,
apparently) symmetric binomial with an always symmetric normal dis-
tribution.
iii. To check to see if the binomial is symmetric “enough”, we must show
that both
np ≥ 5 and n(1 − p) ≥ 5.
In fact, these conditions are violated in the following way (circle one)
A. np ≥ 5 and n(1 − p) ≥ 5
B. np < 5 and n(1 − p) ≥ 5
C. np ≥ 5 and n(1 − p) < 5
D. np < 5 and n(1 − p) < 5
and so the binomial, in this case, is actually not symmetric enough to
be approximated by the normal.
132 Chapter 5. Continuous Random Variables

iv. However, in spite of violating the conditions required for symmetry, we

will proceed to approximate the binomial with a normal. The normal
we will use to approximate the binomial with will be a nonstandard
normal with mean equal to the mean of the binomial, µ = np =
10(4), and standard
deviation equal to the standard deviation of the
binomial, σ = np(1 − p) = (circle one) 2.4 / 1.55.
v. True / False The nonstandard normal distribution we will use to
approximate the binomial distribution has a mean of 4 and a standard
deviation of 1.55.
vi. If X in normal where µ = 4 and σ = 1.55, then P {X ≥ 5} = (circle
one) 0.374 / 0.259.
(Use 2nd DISTR 2:normalcdf(5, 2nd EE 99, 4, 1.55).)
vii. The normal approximation, P {X ≥ 5} = 0.259, is (circle one) smaller
than / about the same as / larger than the exact binomial value,
P {X ≥ 5} = 0.367 and so this is a bad normal approximation to the
binomial.
viii. To improve the continuous normal approximation to the discrete bi-
nomial, a continuity correction factor is introduced. In this case, 0.5
is subtracted from 5 and the revised normal approximation becomes
P {X ≥ 4.5} = (circle one) 0.374 / 0.259.
(Use 2nd DISTR 2:normalcdf(4.5, 2nd EE 99, 4, 1.55).)
Section 5. Exponential Random Variable 133

5.5 Exponential Random Variable

We look at the exponential random variable and Laplace (or double exponential)
random variable. We also look at the hazard rate (or failure rate) function.

Exercise 5.5 (Exponential Random Variable)

1. Exponential Random Variable. The exponential random variable X (often re-
lated to the amount of time until a speciﬁc event–a telephone call, say–occurs)
has a probability density function where
−λx
λe if x ≥ 0
f (x) =
0 if x < 0

and a distribution function,

F (a) = P {X ≤ a} = 1 − e−λa , a ≥ 0

where the expected value and variance are

1
E[X] =
λ
1
Var(X) =
λ2
(a) Sketching λe−λx : Waiting Time For Emails. The graphs of diﬀerent (dif-
ferent λ) exponential density functions are given the ﬁgure below.
5 5

(1)
(2) (2) (1)
(3) (3)
0 3 x 0 3 x
1.1

(a) waiting time densities

(b) chance of waiting less
than 1.1 minutes
Figure 5.10 (Exponential Probability Density Functions)
Match each of the exponential distribution functions ((1), (2) and (3))
given below to each of the graphs in (a) given above.
1 −x/2
P = 2
e 3e−3x 5e−4x
graph
(Hint: Use your calculators; use WINDOW 0 3 1 0 5 1.)
134 Chapter 5. Continuous Random Variables

(b) Figure (a).

1
i. For distribution f (x) = 12 e−x/2 , λ = (circle one) 2
/3/5
1 −x/2
ii. The function f (x) = 2
ecrosses the f (x)–axis at
1
f (x) = (circle one) 2 / 3 / 5
iii. For distribution f (x) = 3e−3x , λ = (circle one) 12 / 3 / 5
iv. The function f (x) = 3e−3x crosses the f (x)–axis at
f (x) = (circle one) 12 / 3 / 5
v. For distribution f (x) = 5e−5x , λ = (circle one) 12 / 3 / 5
vi. The function f (x) = 5e−5x crosses the f (x)–axis at
f (x) = (circle one) 12 / 3 / 5
vii. The function f (x) = 12 e−x/2 is (circle one)
more steeply bent towards the f (x)–axis
less steeply bent towards the f (x)–axis
as steeply bent towards the f (x)–axis
as f (x) = 5e−5x .
viii. As positive λ becomes larger, the function f (x) = λe−λx ,
bends more steeply towards the f (x)–axis
bends less steeply towards the f (x)–axis
The constant λ is called the rate of the distribution.
(c) Figure (b).
i. Probability By Integration. The probability of waiting less than 1.1
minutes for an email when λ = 12 is
1.1
P {X ≤ 1.1} = λe−λx dx
0
1.1
1 −1x
= e 2 dx =
0 2

(circle one) 0.32 / 0.42 / 0.52.

1
(Deﬁne Y1 = 12 e− 2 X , then MATH fnInt(Y1 ,X,0,1.1) ENTER)
ii. Probability By Distribution Function2 .

P {X ≤ 1.1} = F (1.1)
= 1 − e−λ(1.1)
1
= 1 − e− 2 (1.1) =

(circle one) 0.32 / 0.42 / 0.52.

2
The distribution function, F (x), is often easier to calculate than integrating the distribution,
f (x) dx.
Section 5. Exponential Random Variable 135

iii. For λ = 3, P {X < 1.1} = F (1.1) = 1 − e−3(1.1) =

(circle one) 0.32 / 0.42 / 0.96.
iv. For λ = 5, P {X < 1.1} = F (1.1) = 1 − e−5(1.1) = (circle one) 0.32 /
0.42 / 0.996.
v. The probability of waiting less than 1.1 minutes for an email when
λ = 12 is (circle one) greater than / about the same as / smaller
than probability of waiting less than 1.1 minute for an email when
λ = 5.
(d) For λ = 3,
P {X > 0.54} = 1 − F (0.54) = 1 − 1 − e−(3)(0.54) = e−(3)(0.54) =
(circle one) 0.20 / 0.22 / 0.29.
(e) For λ = 3,
P {1.13 < X < 1.62} = F (1.62) − F (1.13) = e−(3)(1.13) − e−(3)(1.62) = (circle
one) 0.014 / 0.026 / 0.29.
(f) True / False The probability the waiting time is exactly 1.42 minutes,
say, is zero.
(g) Expectation and Variance.
For λ = 12 , E[X] = λ1 = 1/2
1
= (circle one) 2 / 3 / 4.
For λ = 3, E[X] = λ = 3 = (circle one) 12 / 13 / 14 .
1 1

For λ = 12 , Var(X) = λ12 = (1/2)

1
2 (circle one) 2 / 3 / 4.

For λ = 3, Var(X) = λ2 = 32 (circle one) 12 / 15 / 19 .

1 1

(h) Memoryless Property: Batteries. A key property of the exponential ran-

dom variable is that it is memoryless; that is,

P {X > s + t|X > t} = P {X > s}; s, t ≥ 0

The exponential random variable is the only random variable to possess

this property. Suppose the distribution of the lifetime of batteries, X, is
exponential, where λ = 3.
i. t = 0, s = 10. Suppose the batteries are new when t = 0 and they are
10 hours old when s = 10. Then

P {X > 10} = 1 − F (10) = 1 − (1 − e−3(10) ) =

(circle one) e;10 / e;20 / e;30

136 Chapter 5. Continuous Random Variables

and also
P {X > 10, X > 0}
P {X > 10|X > 0} =
P {X > 0}
P {X > 10}
=
P {X > 0}
1 − F (10)
=
1 − F (0)
1 − (1 − e−3(10) )
=
1 − (1 − e−3(0) )
1 − (1 − e−3(10) )
= =
1
(circle one) e;10 / e;20 / e;30
or,

P {X > 10|X > 0} = P {X > 10}

or, in other words the chance a battery lasts at least 10 hours or more,
is the same as the chance a battery lasts at least 10 hours more, given
that it has already lasted 0 hours or more (which is not too surprising).
ii. t = 5, s = 10. Suppose the batteries are 5 hours old when t = 5 and
they are 10 hours old when s = 10. Then, once again,

P {X > 10} = 1 − F (10) = 1 − (1 − e−3(10) ) =

(circle one) e;10 / e;20 / e;30

and also
P {X > 15, X > 5}
P {X > 15|X > 5} =
P {X > 5}
P {X > 15}
=
P {X > 5}
1 − F (15)
=
1 − F (5)
1 − (1 − e−3(15) )
=
1 − (1 − e−3(5) )
e−3(15)
= =
e−3(5)
(circle one) e;10 / e;20 / e;30
or,

P {X > 15|X > 5} = P {X > 10}

Section 5. Exponential Random Variable 137

or, in other words, the chance a battery lasts at least 10 hours or more,
is the same as the chance a battery lasts at least 15 hours more, given
that it has already lasted 5 hours or more. This is kind of surprising,
because it seems to imply the battery’s life starts “fresh” after 5 hours,
as though the battery “forgot” about the ﬁrst ﬁve hours of its life.
iii. What Is Not Being Said. True / False Although

P {X > 15|X > 5} = P {X > 10}

since P {X > 15|X > 5} = P {X > 15},

P {X > 15} = P {X > 10}

or, in other words, the (unconditional) chance a battery lasts at least

10 hours or more, is not the same as the (unconditional) chance a
battery lasts at least 15 hours more.
iv. What, Then, Is The Memoryless Property of Exponential Distribu-
tions? True / False The “memoryless” property of the exponential
distribution is not so much to do with the notion that random variable
X is “forgetting” its previous lifetime in some way, as it is to do with
the shape of the exponential distribution of X that “falls oﬀ” in such
a way that the area of a particular unconditional probability, given by
the ratio of (1)/(2) in the ﬁgure below, happens to equal the area of
a particular conditional probability3 , given by the ratio of (3)/(4).

(2)
(4)
(1)
(3)

0 x
5 10 15
Figure 5.11 (Memoryless Property of Exponential)
v. An Implication Of The Memoryless Property of Exponential Distribu-
tions? True / False If

P {X > s + t|X > t} = P {X > s}; s, t ≥ 0

3
In fact, the “rate” of the fall–oﬀ of the exponential distribution is traditionally said to have a
constant failure (or hazard) rate, as will be discussed shortly.
Review Chapter 6

Jointly Distributed Random Variables

6.1 Joint Distribution Functions

We look at discrete joint densities in two variables,

p(x, y) = P {X = x, Y = y}

with corresponding marginal densities

pX (x) = P {X = x} = p(x, y)
x:p(x,y)>0

pY (y) = P {Y = y} = p(x, y)
y:p(x,y)>0

and which has the joint density function in n variables generalization,

p(x, y, . . . , xn ) = P {X = x, Y = y, . . . , Xn = xn }

We also look at continuous joint distribution functions in two variables,

F (a, b) = P {X ≤ a, Y ≤ b}, −∞ < a, b < ∞

with corresponding marginal distributions

FX (a) = P {X ≤ a} = F (a, ∞)
FY (b) = P {Y ≤ b} = F (∞, b)

165
166 Chapter 6. Jointly Distributed Random Variables

and density1

∂2
f (a, b) = F (a, b)
∂a∂b
and which has a joint density function in n variables generalization,

F (a1 , a2 , . . . , an ) = P {X1 ≤ a1 , X2 ≤ a2 , . . . , Xn ≤ an }

Exercise 6.1 (Joint Distribution Functions)

1. Discrete Joint Density: Waiting Times To Catch Fish. The joint density,
P {X, Y }, of the number of minutes waiting to catch the first ﬁsh, X, and
the number of minutes waiting to catch the second ﬁsh, Y , is given below.

P {X = i, Y = j} j row sum
1 2 3 P {X = i}
1 0.01 0.02 0.08 0.11
i 2 0.01 0.02 0.08 0.11
3 0.07 0.08 0.63 0.78
column sum P {Y = j} 0.09 0.12 0.79

(a) The (joint) chance of waiting three minutes to catch the first fish and three
minutes to catch the second fish is
P {X = 3, Y = 3} = (circle one) 0.09 / 0.11 / 0.63 / 0.78.
(b) The (joint) chance of waiting three minutes to catch the first fish and one
minute to catch the second fish is
P {X = 3, Y = 1} = (circle one) 0.07 / 0.11 / 0.63 / 0.78.
(c) The (joint) chance of waiting one minute to catch the first and three min-
utes to catch the second fish is
P {X = 1, Y = 3} = (circle one) 0.08 / 0.11 / 0.63 / 0.78.
(d) The (marginal) chance of waiting three minutes to catch the first fish is
P {X = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.78.
(e) The (marginal) chance of waiting three minutes to catch the second fish is
P {Y = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.79.
(f) The (marginal) chance of waiting three minutes to catch the second fish is
(circle none, one or more)
i. P {Y = 3} = 0.79
1
Notice that the differentiation is with respect to a and b, rather than X and Y !
Section 1. Joint Distribution Functions 167

ii. P {X = 1, Y = 3} + P {X = 2, Y = 3} + P {X = 3, Y = 3} =
0.08 + 0.08 + 0.63 = 0.79
iii. pY (3) = p(1, 3) + p(2, 3) + p(3, 3) = 0.08 + 0.08 + 0.63 = 0.79

iv. pY (3) = y:p(x,y)>0 p(x, y) = p(1, 3) + p(2, 3) + p(3, 3) = 0.08 + 0.08 +
0.63 = 0.79
(g) The (marginal) chance of waiting two minutes to catch the first fish is
(circle none, one or more)
i. P {X = 2} = 0.11
ii. P {X = 2, Y = 1} + P {X = 2, Y = 2} + P {X = 2, Y = 3} =
0.01 + 0.02 + 0.08 = 0.11
iii. pX (2) = p(2, 1) + p(2, 2) + p(2, 3) = 0.01 + 0.02 + 0.08 = 0.11

iv. pX (2) = y:p(2,y)>0 p(2, y) = p(2, 1) + p(2, 2) + p(2, 3) = 0.01 + 0.02 +
0.08 = 0.11
(h) The (marginal) chance of waiting two minutes to catch the second fish is
(circle none, one or more)
i. P {Y = 2} = 0.12
ii. P {X = 2, Y = 1} + P {X = 2, Y = 2} + P {X = 2, Y = 3} =
0.01 + 0.02 + 0.08 = 0.11
iii. pY (2) = p(1, 2) + p(2, 2) + p(3, 2) = 0.02 + 0.02 + 0.08 = 0.12

iv. pY (2) = y:p(x,2)>0 p(x, 2) = p(1, 2) + p(2, 2) + p(3, 2) = 0.02 + 0.02 +
0.08 = 0.12
(i) The chance of waiting at least two minutes to catch the first fish is (circle
none, one or more)
i. P {X ≥ 2} = 0.11 + 0.78 = 0.89
ii. P {X = 2, Y = 1} + P {X = 2, Y = 2} + P {X = 2, Y = 3} + P {X =
3, Y = 1} + P {X = 3, Y = 2} + P {X = 3, Y = 3} = 0.01 + 0.02 +
0.08 + 0.07 + 0.08 + 0.63 = 0.89
iii. pX (2) = p(2, 1) + p(2, 2) + p(2, 3) = 0.01 + 0.02 + 0.08 = 0.11

iv. pX (2) = y:p(2,y)>0 p(2, y) = p(2, 1) + p(2, 2) + p(2, 3) = 0.01 + 0.02 +
0.08 = 0.11
(j) The chance of waiting at most two minutes to catch the first fish is (circle
none, one or more)
i. P {X ≤ 2} = 0.11 + 0.11 = 0.22
ii. P {X = 1, Y = 1} + P {X = 1, Y = 2} + P {X = 1, Y = 3} + P {X =
2, Y = 1} + P {X = 2, Y = 2} + P {X = 2, Y = 3} = 0.01 + 0.02 +
0.08 + 0.01 + 0.02 + 0.08 = 0.22
iii. F (2, 3) = P {X ≤ 2, Y ≤ 3} = 0.22
168 Chapter 6. Jointly Distributed Random Variables

iv. FX (2) = F (2, ∞) = F (2, 3) = 0.22

(k) The chance of waiting at most two minutes to catch the first fish and one
minute to catch the second fish is (circle none, one or more)
i. P {X ≤ 2, Y = 1} = 0.11
ii. P {X = 1, Y = 1} + P {X = 2, Y = 1} = 0.01 + 0.02 = 0.03
iii. P {X ≤ 2, Y ≤ 1} = F (2, 1) = 0.11
iv. FX (2) = F (2, ∞) = F (2, 3) = 0.22
(l) The chance of waiting at most two minutes to catch the first fish and at
most two minutes to catch the second fish is (circle none, one or more)
i. P {X ≤ 2, Y ≤ 2} = 0.06
ii. P {X = 1, Y = 1} + P {X = 2, Y = 2} + P {X = 2, Y = 1} + P {X =
2, Y = 2} = 0.01 + 0.02 + 0.01 + 0.02 = 0.06
iii. F (2, 2) = P {X ≤ 2, Y ≤ 2} = 0.06
iv. FX (2) = F (2, ∞) = F (2, 3) = 0.22
(m) The chance of waiting at least two minutes to catch the first fish and at
least two minutes to catch the second fish is (circle none, one or more)
i. P {X ≥ 2, Y ≥ 2} = 0.81
ii. P {X > 1, Y > 1} = 0.81
iii. P {X = 2, Y = 2} + P {X = 2, Y = 3} + P {X = 3, Y = 2} + P {X =
3, Y = 3} = 0.02 + 0.08 + 0.08 + 0.63 = 0.81
iv. 1 − FX (1) − FY (1) + F (1, 1) = 1 − P {X ≤ 1} − P {Y ≤ 1} + P {X ≤
1, Y ≤ 1} = 1 − 0.11 − 0.09 + 0.01 = 0.81
Notice that P {X ≥ 2, Y ≥ 2} = 1 − P {X < 2, Y < 2} because P {X ≥
2, Y ≥ 2} is the “right–back” portion of the distribution, whereas P {X <
2, Y < 2} is the “left–front” portion of the distribution.

2. Discrete Joint Density: Coin and Dice. A fair coin, marked “1” on one side
and “2” on the other, is flipped once and, independent of this, one fair die is
rolled once. Let X be the value of the coin (either 1 or 2) flipped and let Y be
the sum of the coin flip and die roll (for example, a flip of 2 and a roll of 1 gives
Y = 3).
(a) The chance of flipping a “1” and the sum of coin and die is equal to 2 is
P {X = 1, Y = 2} = P {coin is 1, sum is 2}
= P {sum is 2|coin is 1}P {coin is 1}
= P {die is 1}P {coin is 1}
1 1
= ·
6 2
1 1 1 1
which equals (circle one) 10
/ 11
/ 12
/ 13
.
Section 1. Joint Distribution Functions 169

(b) The chance of ﬂipping a “1” and the sum of coin and die is equal to 3 is

P {X = 1, Y = 3} = P {coin is 1, sum is 3}
= P {sum is 3|coin is 1}P {coin is 1}
= P {die is 2}P {coin is 1}
1 1
= ·
6 2
1 1 1 1
which equals (circle one) 10
/ 11
/ 12
/ 13
.
(c) The chance of ﬂipping a “2” and the sum of coin and die is equal to 2 is

P {X = 2, Y = 2} = P {coin is 2, sum is 2}
= P {sum is 2|coin is 2}P {coin is 2}
= P {die is 0}P {coin is 2}
1
= 0·
2
1 1 1
which equals (circle one) 0 / 11
/ 12
/ 13
.
(d) True / False. The following table is the joint density of P {X, Y }.

P {X = i, Y = j} j row sum
2 3 4 5 6 7 8 P {X = i}
1 1/12 1/12 1/12 1/12 1/12 1/12 0 6/12
i 2 0 1/12 1/12 1/12 1/12 1/12 1/12 6/12
column sum P {Y = j} 1/12 2/12 2/12 2/12 2/12 2/12 1/12

where, notice, the sum of all probabilities is 1.

(e) Sketching P {X, Y }. The possible graphs of P {X, Y } are given the ﬁgure
below.

P{X = i, Y = j} P{X = i, Y = j}
y y
8 8
7 7
6 6
1/12 5 1/12 5
4 4
3 3
2 2

1 2 x 1 2 x

(a) joint density candidate (b) joint density candidate

Figure 6.1 (Possible P {X, Y })
The graph which corresponds to P {X, Y }, in this case, is graph
(circle one) (a) / (b).
170 Chapter 6. Jointly Distributed Random Variables

(f) The chance of ﬂipping a “1” is

1 2 3 6
P {X = 1} = (circle one) 12 / 12 / 12
/ 12
.
(g) The chance the sum is “8” is
1 2 3 6
P {Y = 8} = (circle one) 12 / 12
/ 12
/ 12
.
(h) The chance the sum is “7” is (circle none, one or more)
2
i. P {Y = 7} = 12
1 1 2
ii. P {X = 1, Y = 7} + P {X = 2, Y = 7} = 12 + 12
= 12
1 1 2
iii. pY (7) = p(1, 7) + p(2, 7) = 12 + 12 = 12
1 1 2
iv. pY (7) = y:p(x,7)>0 p(x, 7) = 12 + 12 = 12
(i) The chance the sum is at most “3” is (circle none, one or more)
3
i. P {Y ≤ 3} = 12
ii. P {X = 1, Y = 2} + P {X = 1, Y = 3} + P {X = 2, Y = 2} + P {X =
1 1 1 3
2, Y = 3} = 12 + 12 + 0 + 12 = 12
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(j) The chance the coin is at least 1 and the sum is at least “3” is (circle none,
one or more)
i. P {X ≥ 1, Y ≥ 3} = 1112
ii. P {X > 0, Y > 2} = 1112
iii. 1 − FX (0) − FY (2) + F (0, 2) = 1 − P {X ≤ 0} − P {Y ≤ 2} + P {X ≤
1
0, Y ≤ 2} = 1 − 0 − 12 + 0 = 11
12
iv. P {X = 2, Y = 2} + P {X = 2, Y = 3} + P {X = 3, Y = 2} + P {X =
3, Y = 3} = 0.02 + 0.08 + 0.08 + 0.63 = 0.81

3. Discrete Joint Density: Marbles In An Urn. Three marbles are chosen at ran-
dom without replacement from an urn consisting of 6 black and 8 blue marbles.
Let Xi equal 1 if the ith marble selected is black and let it equal 0 otherwise.

(a) Joint Density of P {X1 , X2 }.

i. The chance of, ﬁrst, choosing a black marble, X1 = 1, and, second,
also choosing a black marble, X2 = 1, is
p(1, 1) = (circle one) 146313 / 146413 / 146513 / 146613 .
ii. The chance of, ﬁrst, choosing a black marble, X1 = 1, and, second,
choosing a blue marble, X2 = 0, is
p(1, 0) = (circle one) 146613 / 146713 / 146813 / 146913 .
iii. True / False The joint density is
Section 1. Joint Distribution Functions 171

P {X1 = i, X2 = j} j row sum

0 1 P {X1 = i}
8·7 8·6 (6)(8)+(7)(8)
i 0 14·13 14·13 14·13
6·8 6·5 (5)(6)+(6)(8)
1 14·13 14·13 14·13
(6)(8)+(7)(8) (5)(6)+(6)(8)
column sum P {X2 = j} 14·13 14·13

(b) Joint Density of P {X1 , X2 , X3 }.

i. The chance of, ﬁrst, choosing a black marble, X1 = 1, and, second,
choosing a black marble, X2 = 1, and, third, also choosing a black
marble, X3 = 1, is
p(1, 1, 1) = (circle one) 14613
33
12
/ 14613
4 3
12
/ 14613
54
12
/ 14613
63
12
.
ii. The chance of, ﬁrst, choosing a black marble, X1 = 1, and, second,
choosing a black marble, X2 = 1, and, third, choosing a blue marble,
X3 = 0, is
p(1, 1, 0) = (circle one) 14613
38
12
/ 14613
4 8
12
/ 14613
58
12
/ 14613
68
12
.
iii. True / False Since either a black or blue marble can be chosen on
each of the three picks out of the urn, there are 2 × 2 × 2 = 8 possible
probabilities in the joint density P {X1, X2 , X3 }.

4. Probability Calculations and Z = XY : Speeding Tickets. The joint density,

P {X, Y }, of the number of speeding tickets a driver receives in a year, X, and
the amount of money required to payoﬀ these tickets, Y , is given below.

P {X = i, Y = j} j row sum
20 40 P {X = i}
i 1 0.2 0.3 0.5
2 0.4 0.1 0.5
P {Y = j} 0.6 0.4
column sum

Determine the density of the total amount of money spent on speeding tickets
in a year, Z = XY , and use this density to calculate P {XY > 20}.

(a) The joint distribution probabilities as well as the product, Z = XY , of the

number of speeding tickets a driver receives in a year, X, times the amount
of money required to payoﬀ these tickets, Y , is given below are combined
in the one table below.
172 Chapter 6. Jointly Distributed Random Variables

P {X = i, Y = j} j P {X = i}
Z = XY 20 40
i 1 0.2 0.3 0.5
1(20) = 20 1(40) = 40 4
2 0.4 0.1 0.5
2(20) = 40 2(40) = 80 5
P {Y = j} 0.6 0.4
The chance that the total amount paid for speeding tickets in a year is $20
is given by
(circle one) 0.1 / 0.2 / 0.4 / 0.5.
(b) The chance that the total amount paid for speeding tickets in a year is
$40, z = xy = 40, occurs in two possible ways, (2,1) and (1,2), with
probabilities (circle one)
i. 0.1 and 0.3, respectively.
ii. 0.2 and 0.3, respectively.
iii. 0.3 and 0.3, respectively.
iv. 0.4 and 0.3, respectively.
(c) Thus, the chance that the total amount paid for speeding tickets in a year
is $40, z = xy = 40, is
P{XY = 40} = 0.4 + 0.3 = (circle one) 0.4 / 0.6 / 0.7.
(d) The product, z = xy = 40, also occurs in two possible ways (circle one)
i. (2,2) and (1,2).
ii. (1,2) and (2,2).
iii. (1,1) and (2,2).
iv. (2,1) and (1,2).
(e) Complete the probability distribution of the total amount paid for speeding
tickets in a year, Z = XY ,
z = XY 20 40 80
P {XY } 0.2 0.1

5. Continuous Joint Density: Weight and Amount of Salt in Potato Chips. Three
machines ﬁlls potato chip bags. Although each bag should weigh 50 grams
each and contain 5 milligrams of salt, in fact, because of diﬀering machines,
the weight and amount of salt placed in each bag varies according to the three
graphs below.
Section 1. Joint Distribution Functions 173

volume = 1 3x + y = 155
volume = 1
f(x,y) f(x,y)
f(x,y) 0.25 y y
y 8 8
8 7 1/6 7
6 6
7 5 5
6 4 4
1/12 5 3 3
4 2 2
3
2 volume = 1
49 51 x 49 51 x
49 51 x

(b) machine B: f(x,y) = 0.25, (c) machine C: f(x,y) = 1/6,

(a) machine A: f(x,y) = 1/12 49 < x < 51, 49 < x < 51, 2 < y < 8,
49 < x < 51, 4<y<6 3x + y < 155
2<y<8

Figure 6.2 (Possible f (x, y))

(a) Machine A; Figure (a). One randomly chosen ﬁlled bag will weigh between
49 and 51 grams and contain between 2 and 8 milligrams of salt with
probability P {49 ≤ X ≤ 51, 2 ≤ Y ≤ 8} = (circle one) 1 / 0.5 / 0.
(b) More Machine A. The probability P {49 ≤ X ≤ 51, 2 ≤ Y ≤ 8} is repre-
sented by or equal to the (circle none, one or more)
i. rectangular box volume equal to 1.
ii. rectangular box volume equal to the width (51 − 49 = 2) times the
1
depth (8 − 2 = 6) times the height ( 12 ).
1
iii. deﬁnite integral of f (x) = 12 over the region (49, 51) × (2, 8).
iv. the integral,
8 51 8 51
1 x
dx dy = dy
2 49 12 2 12 49
8
51 − 49
= dy
2 12
8
2y
=
12 2
= 1

(c) More Machine A. True / False The joint probability density function is
given by,
1
12
49 ≤ x ≤ 51, 2 ≤ y ≤ 8
f (x, y) =
0 elsewhere

(d) More Machine A. The chance a potato chip bag, chosen at random, weighs
at most 50.5 grams and contains at most 4 grams of salt is (circle none,
one or more)
1 3
i. P {X ≤ 50.5, Y ≤ 4} = (1.5)(2) 12 = 12
= 0.25
174 Chapter 6. Jointly Distributed Random Variables

ii. F (50.5, 4) =
4 50.5 4 50.5
1 x
dx dy = dy
2 49 12 2 12 49
4
50.5 − 49
= dy
2 12
4
1.5y
=
12 2
3
=
12
iii. P {X = 1, Y = 1} + P {X = 2, Y = 1} = 0.01 + 0.02 = 0.03
iv. P {X ≤ 2, Y ≤ 1} = F (2, 1) = 0.11
(e) More Machine A. The chance a potato chip bag, chosen at random, weighs
at most 50.5 grams is (circle none, one or more)
1 9
i. P {X ≤ 50.5} = (1.5)(8 − 2) 12 = 12 = 0.75
ii. FX (50.5) = F (50.5, ∞)
∞ 50.5 8 50.5
1 x
dx dy = dy
2 49 12 2 12 49
8
50.5 − 49
= dy
2 12
8
1.5x
=
12 2
9
=
12
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(f) More Machine A. The chance a potato chip bag, chosen at random, con-
tains at most 4 grams is (circle none, one or more)
1 4
i. P {Y ≤ 4} = (51 − 49)(2) 12 = 12 = 0.33
ii. FY (4) = F (∞, 4)
4 ∞ 4 51
1 x
dx dy = dy
2 49 12 2 12 49
4
51 − 49
= dy
2 12
4
2x
=
12 2
4
=
12
Section 1. Joint Distribution Functions 175

3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(g) More Machine A. The chance a potato chip bag, chosen at random, weighs
at least 50.5 grams and contains at least 4 grams of salt is (circle none, one
or more)
i. P {X ≥ 50.5, Y ≥ 4} =
8 51 x 51
8
1
dx dy = dy
4 50.5 12 4 12 50.5
8
51 − 50.5
= dy
4 12
8
0.5x
=
12 4
2
=
12
ii. P {X ≥ 50.5, Y ≥ 4} =
9 4 3
1 − FX (50.5) − FY (4) + F (50.5, 4) = 1 − − +
12 12 12
2
=
12
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
Notice that P {X ≥ 50.5, Y ≥ 4} = 1 − P {X < 50.5, Y < 4} because
P {X ≥ 50.5, Y ≥ 4} is the “right–back” portion of the distribution,
whereas P {X < 50.5, Y < 4} is the “left–front” portion of the distri-
bution.
(h) Machine B; Figure (b). One randomly chosen ﬁlled bag will weigh between
49 and 51 grams and contain between 4 and 6 milligrams of salt with
probability P {49 ≤ X ≤ 51, 4 ≤ Y ≤ 6} = (circle one) 1 / 0.5 / 0.
(i) More Machine B. The probability P {49 ≤ X ≤ 51, 4 ≤ Y ≤ 6} is repre-
sented by or equal to the (circle none, one or more)
i. rectangular box volume equal to 1.
ii. rectangular box volume equal to the width (51 − 49 = 2) times the
depth (6 − 4 = 2) times the height ( 14 ).
iii. deﬁnite integral of f (x) = 14 over the region (49, 51) × (4, 6).
176 Chapter 6. Jointly Distributed Random Variables

iv. the integral,

6 51 6 51
1 x
dx dy = dy
4 49 4 4 4 49
6
51 − 49
= dy
4 4
6
2x
=
4 4
= 1

(j) More Machine B. True / False The joint probability density function is
given by,
1
4
49 ≤ x ≤ 51, 4 ≤ y ≤ 6
f (x, y) =
0 elsewhere

(k) More Machine B. The chance a potato chip bag, chosen at random, weighs
at most 50.5 grams and contains at most 5 grams of salt is (circle none,
one or more)
i. P {X ≤ 50.5, Y ≤ 5} = (1.5)(1) 14 = 1.5
4
ii. F (50.5, 5) =
5 50.5 5 50.5
1 x
dx dy = dy
4 49 4 4 4 49
5
50.5 − 49
= dy
4 4
5
1.5x
=
4 4
1.5
=
4
iii. P {X = 1, Y = 1} + P {X = 2, Y = 1} = 0.01 + 0.02 = 0.03
iv. P {X ≤ 2, Y ≤ 1} = F (2, 1) = 0.11
(l) More Machine B. The chance of potato chip bag, chosen at random, weighs
at most 50.5 grams is (circle none, one or more)
i. P {X ≤ 50.5} = (1.5)(6 − 4) 14 = 3
4
= 0.75
Section 1. Joint Distribution Functions 177

ii. FX (50.5) = F (50.5, ∞)

∞ 50.5 6 50.5
1 x
dx dy = dy
4 49 4 4 4 49
6
50.5 − 49
= dy
4 4
6
1.5y
=
4 4
3
=
4
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(m) More Machine B. The chance of potato chip bag, chosen at random, con-
tains at most 5 grams is (circle none, one or more)
i. P {Y ≤ 5} = (51 − 49)(1) 14 = 24 = 0.50
ii. FY (5) = F (∞, 5)
5 ∞ 5 51
1 x
dx dy = dy
4 49 4 4 4 49
5
51 − 49
= dy
4 12
5
2y
=
4 4
2
=
4
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(n) More Machine B. The chance of potato chip bag, chosen at random, weighs
at least 50.5 grams and contains at least 5 grams of salt is (circle none, one
or more)
i. P {X ≥ 50.5, Y ≥ 5} =
6 51 6 51
1 x
dx dy = dy
5 50.5 4 5 4 50.5
6
51 − 50.5
= dy
5 4
6
0.5y
=
4 5
0.5
=
4
178 Chapter 6. Jointly Distributed Random Variables

ii. P {X ≥ 50.5, Y ≥ 5} =
3 2 1.5
1 − FX (50.5) − FY (5) + F (50.5, 5) = 1 − − +
4 4 4
0.5
=
4
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(o) Machine C; Figure (c). One randomly chosen ﬁlled bag will weigh, X
between 49 and 51 grams and contain between 2 and 8 milligrams of salt, Y ,
and also the weight and amount of salt obeys the constraint 3X +Y < 155,
with probability P {49 ≤ X ≤ 51, 2 ≤ Y ≤ 8} = (circle one) 1 / 0.5 / 0.
(p) More Machine C. The probability P {49 ≤ X ≤ 51.2 ≤ Y ≤ 8} is repre-
sented by or equal to the (circle none, one or more)
i. pie slice volume equal to 1.
ii. pie slice volume equal to the width (51 − 49 = 2) times one–half the
depth ( 12 (8 − 2) = 3) times the height ( 16 ).
iii. deﬁnite integral of f (x) = 16 over the region 49 < X < 51, 2 < Y < 8,
3X + Y < 155.
iv. the integral,
8 8
1 1
dx dy = dx dy
2 3x+y<155 6 2 x<155/3−(1/3)y 6
8 155/3−(1/3)y
x
= dy
2 6 49
8
(155/3 − (1/3)y) − 49
= dy
2 6
8
(8/3)y (1/6)y 2
= −
6 6 2
= 1

(q) More Machine C. True / False The joint probability density function is
given by,
1
6
49 ≤ x ≤ 51, 2 ≤ y ≤ 8, 3x + y < 155
f (x, y) =
0 elsewhere

(r) More Machine C. The chance a potato chip bag, chosen at random, weighs
at most 50.5 grams and contains at most 4 grams of salt is (circle none,
one or more)
Section 1. Joint Distribution Functions 179

i. P {X ≤ 50.5, Y ≤ 4} = (50.5−49)(4−2) 16 − 12 (50.5−151/3)(4−3.5) 16 ≈

0.49305 because the “back–right pie–shaped corner” of the box volume
in (2, 4) × (49, 50.5) is “chopped oﬀ”.
ii. F (50.5, 4) =
4 4 50.5 4
1 1 1
dx dy = dx dy − dx dy
2 3x+y<155 6 2 49 6 3.5 151/3<x<155/3−(1/3)y 6
4 50.5 4 155/3−(1/3)y
x x
= dy − dy
2 6 49 3.5 6 151/3
4 4
50.5 − 49 (155/3 − (1/3)y) − 151/3
= dy − dy
2 6 3.5 6
4 4
1.5y (4/3)y (1/6)y 2
= − −
6 2 6 6 3.5
= 0.5 − 0.006944 = 0.49305

iii. P {X = 1, Y = 1} + P {X = 2, Y = 1} = 0.01 + 0.02 = 0.03

iv. P {X ≤ 2, Y ≤ 1} = F (2, 1) = 0.11
(s) More Machine C. The chance a potato chip bag, chosen at random, weighs
at most 50.5 grams is (circle none, one or more)
i. P {X ≤ 50.5} = (50.5 − 49)(8 − 2) 16 − 12 (50.5 − 49)(8 − 3.5) 16 ≈ 0.9375
because the “back–right pie–shaped corner” of the box volume in
(2, 8) × (49, 50.5) is “chopped oﬀ”.
ii. FX (50.5) =
8 8 50.5 8
1 1 1
dx dy = dx dy − dx dy
2 3x+y<155 6 2 49 6 3.5 49<x<155/3−(1/3)y 6
8 50.5 8 155/3−(1/3)y
x x
= dy − dy
2 6 49 3.5 6 49
8 8
50.5 − 49 (155/3 − (1/3)y) − 49
= dy − dy
2 6 3.5 6
8 8
1.5y (8/3)y (1/6)y 2
= − −
6 2 6 6 3.5
= 1.5 − 0.5625 = 0.9375
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(t) More Machine C. The chance a potato chip bag, chosen at random, con-
tains at most 4 grams is (circle none, one or more)
180 Chapter 6. Jointly Distributed Random Variables

i. P {Y ≤ 4} = (51 − 49)(4 − 2) 16 − 12 (51 − 151/3)(4 − 2) 16 = 59 ≈

0.55 because the “back–right pie–shaped corner” of the box volume in
(2, 4) × (49, 51) is “chopped oﬀ”.
ii. FY (4) =
4 4 51 4
1 1 1
dx dy = dx dy − dx dy
2 3x+y<155 6 2 49 6 2 50.5<x<155/3−(1/3)y 6
4 51 4 155/3−(1/3)y
x x
= dy − dy
2 6 49 2 6 50.5
4 4
51 − 49 (155/3 − (1/3)y) − 50.5
= dy − dy
2 6 2 6
4 4
2y (2/3)y (1/6)y 2
= − −
6 2 6 6 2
= 0.6666 − 0.1111 = 0.5555
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}
(u) More Machine C. The chance a potato chip bag, chosen at random, weighs
at least 50.5 grams and contains at least 4 grams of salt is (circle none, one
or more)
i. P {X ≥ 50.5, Y ≥ 4} = 0 since the joint density is not deﬁned in this
region
ii. P {X ≥ 50.5, Y ≥ 4} =

1 − FX (50.5) − FY (4) + F (50.5, 4) = 1 − 0.9375 − 0.5555 + 0.49305

= 0
3
iii. FY (3) = F (∞, 3) = F (2, 3) = 12
iv. F (2, 5) = P {X ≤ 2, Y ≤ 5}

6. More Continuous Distribution Functions.

(a) If the joint distribution function is given by

2 2
(1 − e−x )(1 − e−y ) x > 0, y > 0
F (x, y) =
0 elsewhere

Then P {x ≤ 1, y ≤ 2} = (circle one)

2 2
i. (1 − e−(1) )(1 − e−(2) ) = 0.621
2 1 2 2
ii. 0 0 (1 − e−(1) )(1 − e−(2) ) dx dy = 0.621
and the joint density f (x, y) is given by (circle none, one or more)
Section 1. Joint Distribution Functions 181

∂2
i. ∂x∂y
F (x, y)
∂ 2 2 2
ii. ∂x∂y
(1 − e−(1) )(1 − e−(2) )
−x2 2 2 +y 2 )
iii. [−(−2x)e ] × [−(2y)e−y ] = 2xye−(x
and P {1 < x < 1.25, 1.5 < y < 2} = (circle none, one or more)
i. F (1, 1.5) + F (1.25, 2) − F (1, 2) − F (1.25, 1.5)
2 2 2 2 2
ii. (1 − e−(1) )(1 − e−(1.5) ) + (1 − e−(1.25) )(1 − e−(2) ) − (1 − e−(1) )(1 −
2 2 2
e−(2) ) − (1 − e−(1.25) )(1 − e−(1.5) )
iii. 0.56549 + 0.775912 − 0.620542 − 0.707082 = 0.013778
(Hint: Draw a picture of the rectangular region of integration to convince
yourself that adding and subtracting the joint distributions as given above
is appropriate.)
(b) Determine c so that

cx(3x − y) 0 ≤ x ≤ 2, 0 ≤ y ≤ 1, x + y < 1
f (x, y) =
0 elsewhere

is a joint probability density function. Since

1 2 1 2
cx(3x − y) dx dy = c (3x2 − xy) dx dy
0 0 0 0
1
2
3 1 2
= c x − x y dy
0 2
1
0
1
= c 23 − 22 y dy
0 2

1
1 2
= c 8y − 4y
4 0

1
= c 8−
4
31
= c
4
= 1
3 4 5 6
and so c = (circle one) 31
/ 31
/ 31
/ 31
.

7. n–Variable Joint Distributions.

(a) Discrete 3–Variable Joint Distribution. Consider the joint distribution,

1
xyz x = 1, 2; y = 1, 2, 3; z = 1, 2
P {X = x, Y = y, Z = z} = 54
0 elsewhere
182 Chapter 6. Jointly Distributed Random Variables

1 4
P {X = 1, Y = 2, Z = 2} = (circle none, one or more) 54 (1)(2)(2) / 54
/
5 6
31
/ 31 .
1 5 8
P {X = 2, Y = 2, Z = 2} = (circle one) 54 / 54 / 54 / 12
54
.
1 5 8 12
P {X ≤ 2, Y = 2, Z = 2} = (circle one) 54 / 54 / 54 / 54 .
(b) Discrete Multinomial Joint Distribution. The multinomial joint distribu-
tion is given by
n!
P {X1 = x1 , X2 = x2 , . . . , Xr = xr } = pn1 pn2 · · · pnr r
n1 !n2 ! · · · nr ! 1 2

where ri=1 ni = n.
Suppose a fair die is rolled 8 times. The chance that 1 appears 3 times,
2 appears once, 3 appears once, 4 appears 3 times, and 5 or 6 does not
appear is P {X = 1, Y = 2, Z = 2} = (circle none, one or more)
i. P {X1 = 3, X2 = 1, X3 = 1, X4 = 3, X5 = 0, X6 = 0}
8!
ii. 3!1!1!3!0!0! (1/6)3 (1/6)1(1/6)1 (1/6)3 (1/6)1(1/6)0
8!
iii. 3!1!1!3!0!0! (1/6)8
iv. 0.0006668

6.2 Independent Random Variables

Random variables X and Y are independent if and only if any of the following equa-
tions are satisﬁed
P {X ∈ A, Y ∈ B} = P {X ∈ A}P {Y ∈ B}
P {X ≤ a, Y ≤ b} = P {X ≤ a}P {X ≤ b}
F (a, b) = FX (a)FY (b)
p(x, y) = pX (a)pY (b), for allx, y; (discrete case)
f (x, y) = fX (x)fY (y), (continuous case)
In general, n random variables X1 , X2 , . . . , Xn are independent if, for all sets of
real numbers,
P {X1 ∈ A1 , X2 ∈ A2 , . . . , Xn ∈ An } = Πni=1 P {Xi ∈ Ai }
P {X1 ≤ a1 , X2 ≤ a2 , . . . , Xn ≤ an } = Πni=1 P {Xi ≤ ai }

Exercise 6.2 (Joint Distribution Functions)

1. Discrete Distribution and Independence: Waiting Time To Fish. The joint
density of the number of minutes waiting to catch a ﬁsh on the ﬁrst and second
day, P {X, Y }, is given below.
Section 2. Independent Random Variables 183

P {X, Y } Y row sum

1 2 3 P {X = x}
1 0.01 0.02 0.08 0.11
X 2 0.01 0.02 0.08 0.11
3 0.07 0.08 0.63 0.78
column sum P {Y = y} 0.09 0.12 0.79

Are the waiting times on the two days independent of one another? To
demonstrate independence, we must show that P {X, Y } = P {X}P {Y } for
X, Y = 1, 2, 3.

(a) X = 3, Y = 3.
The chance of waiting three minutes to catch one fish on the first day is
P {X = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.78.
The chance of waiting three minutes to catch one fish on the second day is
P {Y = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.79.
The chance of waiting three minutes to catch one fish on the first day and
waiting three minutes to catch one fish on the second day is
P {X = 3, Y = 3} = (circle one) 0.09 / 0.11 / 0.63 / 0.78.
Since the chance of waiting three minutes to catch one fish on the first day
and waiting three minutes to catch one fish on the second day,
P {X = 3, Y = 3} = 0.63,
(circle one) does / does not equal
P {X = 3}P {Y = 3} = (0.78)(0.79) = 0.6162,
the waiting three minutes on the second day depends on the waiting three
minutes on the first day.
(b) X = 2, Y = 3.
The chance of waiting two minutes to catch a fish on the first day is
P {X = 2} = (circle one) 0.09 / 0.11 / 0.12 / 0.78.
The chance of waiting three minutes to catch a fish on the second day is
P {Y = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.79.
The chance of waiting two minutes to catch a fish on the first day and
waiting three minutes to catch a fish on the second day is
P {X = 3, Y = 3} = (circle one) 0.08 / 0.11 / 0.63 / 0.78.
Since the chance of waiting two minutes to catch a fish on the first day
and waiting three minutes to catch a fish on the second day,
P {X = 2, Y = 3} = 0.08,
(circle one) does / does not equal
P {X = 2}P {Y = 3} = (0.11)(0.79) = 0.0869,
the waiting three minutes on the second day depends on the waiting two
minutes on the first day.
184 Chapter 6. Jointly Distributed Random Variables

(c) True / False In order for the waiting time on the second day to be in-
dependent of the waiting time on the first day, it must be shown that the
waiting times of one, two or three minutes on the second day must all be
shown to be independent of the waiting times of one, two or three minutes
on the first day. If any of the waiting times on the second day are shown
to be independent of any of the waiting times on the first day, this would
demonstrate the waiting time on the second day depends on the waiting
time of the first day.
2. More Discrete Distribution and Independence: Waiting Time To Fish Again.
The joint density of the number of minutes waiting to catch a fish on the first
and second day, P {X, Y }, is given below.

P {X, Y } Y row sum

1 2 3 P {X = x}
1 0.01 0.01 0.08 0.10
X 2 0.01 0.01 0.08 0.10
3 0.08 0.08 0.64 0.80
column sum P {Y = y} 0.10 0.10 0.80

Are the waiting times on the two days independent of one another? To
demonstrate independence, we must show that P {X, Y } = P {X}P {Y } for
X, Y = 1, 2, 3.
(a) X = 2, Y = 3.
The chance of waiting two minutes to catch a fish on the first day is
P {X = 2} = (circle one) 0.09 / 0.10 / 0.12 / 0.78.
The chance of waiting three minutes to catch a fish on the second day is
P {Y = 3} = (circle one) 0.09 / 0.11 / 0.12 / 0.80.
The chance of waiting two minutes to catch a fish on the first day and
waiting three minutes to catch a fish on the second day is
P {X = 3, Y = 3} = (circle one) 0.08 / 0.11 / 0.64 / 0.80.
Since the chance of waiting two minutes to catch a fish on the first day
and waiting three minutes to catch a fish on the second day,
P {X = 2, Y = 3} = 0.08,
(circle one) does / does not equal
P {X = 2}P {Y = 3} = (0.10)(0.80) = 0.08,
the waiting three minutes on the second day is
(circle one) independent / dependent
on the waiting two minutes on the first day.
(b) True / False In fact, since P {X, Y } = P {X}P {Y } for X, Y = 1, 2, 3, the
waiting time on the second day is independent of the waiting time on the
first day.
Section 2. Independent Random Variables 185

3. More Discrete Joint Distributions and Independence: Waiting Times To Fish

Yet Again. The distribution of the number of minutes waiting to catch a ﬁsh,
X, on any day is given below.

X 1 2 3
P {X = x} 0.1 0.1 0.8

(a) The chance of waiting one minute to catch a ﬁsh is

P {X = 1} = (circle one) 0.1 / 0.4 / 0.5 / 0.8.
(b) The chance of waiting three minutes to catch another ﬁsh is
P {X = 3} = (circle one) 0.1 / 0.4 / 0.5 / 0.8.
(c) Since waiting one minute and waiting three minutes are independent of
one another,
P {X = 1, X = 3} = P {X = 1}P {X = 3} = (0.1)(0.8) =
(circle one) 0.01 / 0.04 / 0.05 / 0.08.
(d) Complete the following joint distribution table, if the waiting times on one
day are independent of the waiting times on any other day.
P {X, Y } Y
1 2 3
1 0.01 0.01 0.08
X 2 0.01
3 0.08

4. More Discrete Joint Distributions and Independence: Shooting Hoops. Suppose

two basketball players are each taking a free throw. Basketball player A has a
45% chance of making a free throw, and so the chance s/he makes a basket on
the fourth throw is (using the geometric distribution)

P {X = 4} = p(1 − p)3 = 0.45(1 − 0.45)3

Basketball player B also has a 45% chance of making a free throw, and so the
chance s/he makes a second basket on the fourth throw is (using the negative
binomial distribution)

i−1 r i−r 4−1
P {Y = 4} = p (1 − p) = 0.452 (1 − 0.45)4−2
r−1 2−1

Assume basketball player A’s free throws are independent of basketball player
B’s free throws,

(a) The chance that A makes a basket on the fourth throw and B makes a
second basket on the fourth throw is
186 Chapter 6. Jointly Distributed Random Variables

P {X = 4, Y = 4} =

P {X = 4}P
{Y = 4} = (circle one)
4−1
0.45(1 − 0.45)1 × 0.452 (1 − 0.45)4;2
2−1

2 4−1
0.45(1 − 0.45) × 0.452 (1 − 0.45)4;2

2 − 1
3 4−1
0.45(1 − 0.45) × 0.452 (1 − 0.45)4;2
2−1
(b) The chance that A makes a basket on the third throw and B makes a
second basket on the fifth throw is
P {X = 3, Y = 5} =
P {X = 3}P
{Y = 5} = (circle one)
5−1
0.45(1 − 0.45)1 × 0.452 (1 − 0.45)5;2

2 − 1
2 5−1
0.45(1 − 0.45) × 0.452 (1 − 0.45)5;2

2 − 1
3 5−1
0.45(1 − 0.45) × 0.452 (1 − 0.45)5;2
2−1
(c) Suppose A makes a basket on the third throw, starts again and, indepen-
dent of the first round of throws, makes a basket on the fifth throw on the
second round of throws and then, independent of this, on a third round of
throws, makes a basket on the first attempt. The chance of this happening
is
P {X1 = 3, X2 = 5, X3 = 1} = P {X1 = 3}P {X2 = 5}P {X3 = 1} =
(circle one)
0.45(1 − 0.45)2 × 0.45(1 − 0.45)2 × 0.45
0.45(1 − 0.45)2 × 0.45(1 − 0.45)3 × 0.45
0.45(1 − 0.45)2 × 0.45(1 − 0.45)4 × 0.45

5. Continuous Distribution and Independence. Consider the joint density,

4xy 0 < x < 1, 0 < y < 1
f (x, y) =
0 elsewhere

To demonstrate independence of X and Y , we must show that f (x, y) =

fX (x)fY (y).

(a) fX (x) =
1 1
4xy dy = 2xy 2 y=0
0

= 2x(1) − 2x(0)2
2
=

(circle one) 2 / 2x / 2y / 4xy.

Section 2. Independent Random Variables 187

(b) fY (y) =
1 1
4xy dx = 2x2 y x=0
0

= 2(1)2y − 2(0)2 y =

(circle one) 2 / 2x / 2y / 4xy.

(c) Since f (x, y) = 4xy
(circle one) does / does not equal
fX (x)fY (y) = (2x)(2y) = 4xy,
random variable X is independent of Y .
(d) Is independence symmetric? If X is independent of Y , then Y (circle one)
is / is not independent of X.

6. Another Continuous Joint Distribution and Independence. Consider the joint

density,

24xy 0 < x < 1, 0 < y < 1, 0 < x + y < 1
f (x, y) =
0 elsewhere

Are X and Y independent?

(a) fX (x) =

24xy dy = 24xy dy
x+y<1 y<1−x
1−x
= 24xy dy
0

2 1−x
= 12xy y=0

= 12x(1 − x)2 − 12x(0)2 =

(circle one) 2 / 2x / 2y / 12x(1 − x)2 .

(b) fY (y) =

24xy dx = 24xy dx
x+y<1 x<1−y
1−y
= 24xy dx
0
1−y
= 12x2 y x=0

= 12(1 − y)2 y − 12(0)2 y =

(circle one) 2 / 2x / 2y / 12y(1 − y)2 .

188 Chapter 6. Jointly Distributed Random Variables

(c) Since f (x, y) = 24xy

(circle one) does / does not equal
fX (x)fY (y) = (12x(1 − x)2 )(12y(1 − y)2),
random variable X is dependent on Y .
7. Another Continuous Joint Distribution and Independence. The joint distribu-
tion function is given by
2 2
(1 − e−x )(1 − e−y ) x > 0, y > 0
F (x, y) =
0 elsewhere
To demonstrate independence of X and Y , we must show that F (x, y) =
FX (x)FY (y).
(a) FX (x) =
F (x, ∞) = lim F (x, y)
y→∞
2 2
= lim (1 − e−x )(1 − e−y )
y→∞
2
= (1 − e−x )(1 − 0) =
(circle one) 2 / 2x / 2y / 1 − e;x .
2

(b) FY (y) =
F (∞, y) = lim F (x, y)
x→∞
2 2
= lim (1 − e−x )(1 − e−y )
x→∞
2
= (1 − 0)(1 − e−y ) =
(circle one) 2 / 2x / 2y / 1 − e;y .
2

2 2
(c) Since f (x, y) = (1 − e−x )(1 − e−y )
(circle one) does / does not equal
2 2
fX (x)fY (y) = (1 − e−x )(1 − e−y ),
random variable X is independent of Y .
8. Another Continuous Joint Distribution and Independence. If X and Y are
independent, what is the density of X/Y , if X and Y are both exponential
random variables with parameters λ and µ, respectively?
(a) General Density.
FZ (a) = P {X/Y < a}
= P {X < aY }
∞ ay
= fX (x)fY (y) dx dy
0 0
∞
= FX (ay)fY (y) dy
0
Section 3. Sums of Independent Random Variables 189

and so
∞
d
fZ (a) = FX (ay)fY (y) dy
da 0

(circle
1 one)
01 fX (ay)fY (y) dy
0 1X
f (ay)yfY (y) dy
− 0 fX (ay)yfY (y) dy.
(b) Density When X and Y Are Exponential Random Variables. If X is ex-
ponential with parameter λ and Y is exponential with parameter µ, fZ (a)
(circle
1 one);ay ;y
01 λe;ay µe ;ydy
0 1
λe yµe dy
;ay ;y
− 0 λe yµe dy.

6.3 Sums of Independent Random Variables

In the discrete case, if X and Y are independent, the distribution of X + Y is2
P {X + Y = n} = P {X = k, Y = n − k}
= P {X = k}P {Y = n − k}
In the continuous case, if X and Y are independent the distribution of X + Y is3
FX+Y (a) = P {X + Y ≤ a}
= P {X ≤ a − Y }
∞ a+y
= fX,Y (x, y) dx dy
−∞ 0
∞ a−y
= fX (x)fY (y) dx dy
−∞ 0
∞
= FX (a − y)fY (y) dy
−∞

and so
∞ ∞
d
fX+Y (a) = FX (a − y)fY (y) dy = fX (a − y)fY (y) dy
da −∞ −∞

Exercise 6.3 (Statistic and its Sampling Distribution)

2
Notice the neat trick in equating P {X + Y = n} with P {X = k, Y = n − k}: if X = k, k ≤ n,
then X + Y = k + Y = n or Y = n − k.
3
Notice that the trick in equating P {X + Y = n} with P {X = k, Y = n − k} in the discrete case
is eﬀectively repeated here in the continuous case, where fX,Y (x, y) is equated with fX (a − y)fY (y).
190 Chapter 6. Jointly Distributed Random Variables

1. Discrete Probability Distribution of Sum, Montana Fishing Trip. A ﬁsherman

takes two trips to a lake there, where the number of ﬁsh caught at a lake on either
one of these two trips, X, is a random variable with the following distribution,

x 1 2 3
P {X = x} 0.4 0.4 0.2

(a) One fish is caught on the first trip and three fish are caught on the second
trip. The sum of the number of fish caught over these two trips is
x1 + x2 = 1 + 3 = (circle one) 0.3 / 1.5 / 4.
(b) Two fish are caught on the first trip and three fish are caught on the second
trip. The sum of the number of fish caught over these two trips is
x1 + x2 = 2 + 3 = (circle one) 0.3 / 1.5 / 5.
(c) The joint distribution probabilities as well as the sum of the number of
fish caught on two trips to the lake are combined in the one table below.
P{x1 , x2 } x2
x1 + x2 1 2 3
1 0.16 0.16 0.08
2 3 4
x1 2 0.16 0.16 0.08
3 4 5
3 0.08 0.08 0.04
4 5 6
The sum for when three fish are caught on the first trip and two fish are
caught on the second trip, (3, 2), is 5 with chance given by
(circle one) 0.04 / 0.08 / 0.16.
(d) The sum, x1 + x2 = 4, occurs in three possible ways: (3,1), (2,2) and (1,3),
with probabilities (circle one)
i. 0.08, 0.08 and 0.08, respectively.
ii. 0.08, 0.16 and 0.16, respectively.
iii. 0.08, 0.16 and 0.08, respectively.
iv. 0.16, 0.16 and 0.08, respectively.
(e) Thus, the chance that the sum of the number of fish caught on two trips
to the lake is four is
P{X1 + X2 = 4} = 0.08 + 0.16 + 0.08 = (circle one) 0.04 / 0.16 / 0.32.
(f) The sum, x1 + x2 = 5, occurs in two possible ways (circle one)
i. (2,2) and (1,3).
ii. (2,3) and (2,3).
Section 3. Sums of Independent Random Variables 191

iii. (3,2) and (2,3).

iv. (2,1) and (1,3).
(g) Combining the probabilities associated with the two ways that the sum 5
can occur,
P(X1 X2 = 5) = (circle one) 0.04 / 0.16 / 0.32.
(h) Complete the probability distribution of the sum of the number of fish on
two trips to the lake, X1 + X2 ,
x1 + x2 2 3 4 5 6
P {X1 + X2 } 0.16 0.32 0.16 0.04
(i) True / False Since the number of fish on each trip to the lake, X, is a
random variable, the sum of the number of fish caught, X1 + X2 , is also a
random variable.
(j) True / False. In addition to the probability distribution for the sum,
X1 + X2 , there is also a probability distribution for other functions of X1
and X2 , such as X1 X2 , say.

2. Discrete Probability Distribution of Sum, Waiting Time To Catch a Fish. The

distribution of the number of minutes waiting to catch a ﬁsh, Y , is given below.

x 1 2 3
P {X = x} 0.1 0.1 0.8

(a) If two minutes are spent waiting for one fish and two minutes are spent
waiting for another fish, (2,2), the sum of time spent waiting is
x1 + x2 = 2 + 2 = (circle one) 0.3 / 1.5 / 4.
(b) Complete the following table of joint distribution probabilities as well as
the average times spent waiting to catch two fish.
P{x1 , x2 } x1
X1 + x2 1 2 3
1 0.01 0.01 0.08
2 3 4
x2 2 0.01 0.01 0.08
4 5
3 0.08 0.08 0.64
5 6
(c) Complete the probability distribution of the sum of the waiting time, X1 +
X2 , is
x1 + x2 2 3 4 5 6
P {X1 + X2 } 0.02 0.17 0.64
192 Chapter 6. Jointly Distributed Random Variables

(d) The method used here of determining the probability distribution of X1 +

X2 (circle one) does / does not require that the random variables X1 and
X2 are independent of one another; this method could also be applied to
random variables that are dependent.

3. Discrete: Poisson sum. The Poisson random variable has distribution given by
λi
p(i) = P {X = i} = e−λ , i = 0, 1, . . . , λ > 0
i!
with parameter λ and so the distribution of the sum X + Y is

n
P {X + Y = n} = P {X = k, Y = n − k}
k=0

n
= P {X = k}P {Y = n − k}
k=0

n
λ1 k −λ2 λ2 n−k
= e−λ1 e
k=0
k! (n − k)!
n
λ1 k λ2 n−k
−(λ1 +λ2 )
= e
k=0
k!(n − k)!
e−(λ1 +λ2 )
n
n!
= λ1 k λ2 n−k
n! k=0
k!(n − k)!
e−(λ1 +λ2 )
= (λ1 + λ2 )n binomial coeﬃcient identity
n!

(a) Suppose the number of people who live to 100 years of age in Westville
per year has a Poisson distribution with parameter λ1 = 2.5; whereas,
independent of this, in Michigan City, it has a Poisson distribution with
parameter λ2 = 3. The chance that the sum of the number of people who
live to 100 years of age in Westville and Michigan City is 4 is
P {X1 + X2 = 4} = (circle none, one or more)
−(λ +λ ) e−(2.5+3)
i. e n!1 2
(λ1 + λ2 )n = 4!
(2.5 + 3)4
ii. 0.1558
(Hint: Poissonpdf(5.5,4))
(b) For λ1 = 2.5, λ2 = 3, P {X1 + X2 ≤ 4} =
(circle one) 0.311 / 0.358 / 0.543.
(Hint: Poissoncdf(5.5,4))
(c) For λ1 = 2.5, λ2 = 6, P {X1 + X2 ≤ 4} =
(circle one) 0.074 / 0.358 / 0.543.
Section 3. Sums of Independent Random Variables 193

4. Continuous Example: gamma. The gamma random variable, with parameters

(t, λ), t ≥ 0, λ ≥ 0 is given by
−λx t−1
λe (λx)
Γ(t)
if x ≥ 0
f (x) =
0 if x < 0

and so the distribution of the sum X + Y , where Y has parameters (s, λ), is
∞
fX+Y (a) = fX (a − y)fY (y) dy
−∞
a
1
= λe−λ(a−y) (λ(a − y))s−1λe−λy (λy)t−1 dy
Γ(s)Γ(t) 0
a
−λa
= Ke (a − y)s−1y t−1 dy
0
1
−λa s+t−1 y
= Ke a (1 − x)s−1 xt−1 dx letting x =
0 a
−λa s+t−1
= Ce a
λe−λa (λa)s+t−1 1
= letting C =
Γ(s + t) Γ(s + t)

In general, ifXi is gamma with parameter (ti , λ), then ni=1 Xi is gamma with
parameter ( ni=1 t1 , λ).

(a) Suppose the number of people who live to 100 years of age in Westville per
year has a gamma distribution with parameter (t, λ) = (2, 2.5); whereas,
independent of this, in Michigan City, it has a gamma distribution with
parameter (s, λ) = (3, 2.5).
fX1 +X2 (4) = (circle none, one or more)
λe−λa (λa)s+t−1 2.5e−2.5 (2.5(4))2+3−1
i. Γ(s+t)
= Γ(2+3)
−(2.5)(4) (2.5(4))2+3−1
ii. 2.5e (5−1)!
iii. 0.0473
(b) For t = 2, s = 3, P {X1 + X2 ≤ 4} = (circle one) 0.189 / 0.358 / 0.543.
−(2.5)(4) (2.5(4))2+3−1
(Hint: fnInt( 2.5e (5−1)!
, X, 0, 4))

5. Continuous: normal. After some eﬀort, it can be shown that the distribution
n
of the sum of independent random variables, i=1 Xi , where each Xi has a
normal distribution,
n and where
n each has the parameter (µi , σi2 ), is also normal
with parameter ( i=1 µi , i=1 σi2 ). Consequently, the distribution of X1 + X2 ,
where X1 is normal with parameter (2, 3) and X2 is normal with parameter
(1, 1), is normal with parameter (circle one) (2, 3) / (3, 4) / (4, 5).
Review Chapter

Properties of Expectation

7.1 Introduction
If

P {a ≤ X ≤ b} = 1

then

a ≤ E(X) ≤ b

Exercise 7.1 (Introduction)

1. Let X be the waiting time for a bus. The chance the waiting time is between
3 and 7 minutes is 100%, P {3 ≤ X ≤ 7} = 1. This means the waiting time is
expected to between (circle one) (3, 7) / (1, 6) / (4, 8) minutes.

2. Let X be the weight of a new–born child. The chance the weight is between
4 and 10 pounds is 100%, P {3 ≤ X ≤ 10} = 1. This means the weight is
expected to between (circle one) (3, 7) / (4, 10) / (4, 8) minutes.

7.2 Expectation of Sums of Random Variables

We look at the expectation of the sums of random variables.

• In general,

y g(x, y)p(x, y) if discrete
E[g(x, y)] = ∞x ∞
−∞ −∞
g(x, y)f (x, y) dxdy if continuous

207
208 Chapter 7. Properties of Expectation

• In particular, if g(x, y) is the sum of random variables,

E(X + Y ) = E(X) + E(Y ) (7.1)

E[X1 + · · · + Xn ] = E[X1 ] + · · · + E[Xn ] (7.2)

Exercise 7.2 (Expectation of Sums of Random Variables)

1. Expectation, Discrete: Montana Fishing Trip. The joint density, P {X, Y }, of

the number of minutes waiting to catch the first fish, X, and the number of
minutes waiting to catch the second fish, Y , is given below. In addition to the
joint distribution probabilities, the average number of fish caught on two trips
to the lake, g(x, y) = x+y
2
, are also given in the one table below.

P {X = i, Y = j} j
g(x, y) = x+y
2
1 2 3
1 0.16 0.16 0.08
3
1 2
2
i 2 0.16 0.16 0.08
3 5
2
2 2
3 0.08 0.08 0.04
5
2 2
3

(a) True / False This is a probability density because the probabilities sum
to one.
(b) One fish is caught on the first trip and three fish are caught on the second
trip with probability
(circle one) 0.04 / 0.08 / 16.
The average waiting time over these two trips is
g(1, 3) = 1+3
2
= (circle one) 0.3 / 1.5 / 2.
(c) Two fish is caught on the first trip and three fish are caught on the second
trip with probability
(circle one) 0.04 / 0.08 / 16.
The average waiting time over these two trips is
g(2, 3) = 2+3
2
= (circle one) 0.3 / 1.5 / 2.5.
(d) Three fish is caught on the first trip and two fish are caught on the second
trip with probability
(circle one) 0.04 / 0.08 / 16.
The average waiting time over these two trips is
g(3, 2) = 2+3
2
= (circle one) 0.3 / 1.5 / 2.5.
Section 2. Expectation of Sums of Random Variables 209

(e) The expected average waiting time over these two trips is

E[g(x, y)] = g(x, y)p(x, y)
x y

3
= (1)(0.16) + (0.16) + (2)(0.08)
2

3 3
+ (0.16) + (2)(0.16) + (0.08)
2 2

5
+ (2)(0.08) + (0.08) + (3)(0.04) =
2
(circle one) 1 / 1.5 / 1.8.

2. More Expectation, Discrete.

(a) g(x, y) = 3xy. Consider the following joint density. Notice that g(x, y) =
3xy.
P {X = x, Y = y} y
g(x, y) = 3xy 1 2 3
1 0.16 0.16 0.08
3 6 9
x 2 0.16 0.16 0.08
6 12 18
3 0.08 0.08 0.04
9 18 27
The expected value of g(x, y) = 3xy is

E[3XY ] = 3xy p(x, y)
x y
= (3)(0.16) + (6)(0.16) + (9)(0.08)
+ (6)(0.16) + (12)(0.16) + (18)(0.08)
+ (9)(0.08) + (18)(0.08) + (27)(0.04) =
(circle one) 8.4 / 9.72 / 11.
(Hint: Type g(x, y) into L1 , P {X, Y } into L2 , define L3 = L1 × L2 , the
sum L3 by STAT CALC 1–Var Stats.)
(b) g(x, y) = 3xy. If g(x, y) = x2 y, x = 1, 2, 3 and y = 1, 2, 3, then

E[X 2 Y ] = x2 y p(x, y)
x y

= (1 )(1)(0.16) + (12 )(2)(0.16) + (12 )(3)(0.08)

+ (22 )(1)(0.16) + (22 )(2)(0.16) + (22 )(3)(0.08)

+ (32 )(1)(0.08) + (32 )(2)(0.08) + (32 )(3)(0.04) =
210 Chapter 7. Properties of Expectation

(circle one) 6.84 / 8.44 / 11.02.

(c) If g(x, y) = x/y, x = 1, 2, 3 and y = 1, 2, 3, then
x
E[X/Y ] = p(x, y)
x y
y
= (1/1)(0.16) + (1/2)(0.16) + (1/3)(0.08)
+ (2/1)(0.16) + (2/2)(0.16) + (2/3)(0.08)
+ (3/1)(0.08) + (3/2)(0.08) + (3/3)(0.04) =

(circle one) 0.8 / 1.2 / 2.5.

(d) If g(x, y) = x + y, x = 1, 2, 3 and y = 1, 2, 3, then

E[X + Y ] = (x + y) p(x, y)
x y
= (1 + 1)(0.16) + (1 + 2)(0.16) + (1 + 3)(0.08)
+ (2 + 1)(0.16) + (2 + 2)(0.16) + (2 + 3)(0.08)
+ (3 + 1)(0.08) + (3 + 2)(0.08) + (3 + 3)(0.04) =

(circle one) 0.8 / 1.2 / 3.6.

3. More Expectation, Discrete. Consider the following joint density.

P {X = i, Y = j} j row sum
1 2 3 P {X = i}
1 0.01 0.02 0.08 0.11
i 2 0.01 0.02 0.08 0.11
3 0.07 0.08 0.63 0.78
P {Y = j} 0.09 0.12 0.79
column sum

(a) The expected value of g(x, y) = 3xy is

E[3XY ] = 3xy p(x, y)
x y
= 3(1)(1)(0.01) + 3(1)(2)(0.02) + 3(1)(3)(0.08)
+ 3(2)(1)(0.01) + 3(2)(2)(0.02) + 3(2)(3)(0.08)
+ 3(3)(1)(0.07) + 3(3)(2)(0.08) + 3(3)(3)(0.63) =

(circle one) 18.43 / 21.69 / 11.22.

Section 2. Expectation of Sums of Random Variables 211

(b) The expected value of g(x, y) = x + y is

E[X + Y ] = (x + y) p(x, y)
x y
= (1 + 1)(0.01) + (1 + 2)(0.02) + (1 + 3)(0.08)
+ (2 + 1)(0.01) + (2 + 2)(0.02) + (2 + 3)(0.08)
+ (3 + 1)(0.07) + (3 + 2)(0.08) + (3 + 3)(0.63) =

(circle one) 5.37 / 8.74 / 11.2.

(c) The expected value of g(x, y) = x is

E[X] = x p(x, y)
x y
= (1)(0.01) + (1)(0.02) + (1)(0.08)
+ (2)(0.01) + (2)(0.02) + (2)(0.08)
+ (3)(0.07) + (3)(0.08) + (3)(0.63)
= (1)(0.11) + (2)(0.11) + (3)(0.78) =

= xpX (x) =
x

(circle one) 2.67 / 8.74 / 11.2.

(d) The expected value of g(x, y) = y is

E[X] = ypY (y)
y
= (1)(0.09) + (2)(0.12) + (3)(0.79) =

(circle one) 2.67 / 2.7 / 11.2.

(e) True / False E(X + Y ) = 5.37 = 2.67 + 2.7 = E(X) + E(Y )

4. Expectation, Discrete. Let

1
p(x, y, z) = , x = 1, 2; y = 1, 2, 3; z = 1, 2, 3, 4
24
and

g(x, y, z) = x + y + z
3 4 12 13 14
(a) pX (1) = y=1 z=1 p(1, y, z) = (circle one) 24
/ 24
/ 24
.
3 4 12 13 14
(b) pX (2) = y=1 z=1 p(2, y, z) = (circle one) 24
/ 24
/ 24
.
212 Chapter 7. Properties of Expectation

(c) The expected value of g(x, y, z) = x is

E[X] = x pX (x)
x

12 12
= (1) + (2) =
24 24

(circle one) 1 / 1.5 / 2.5.

(d) The expected value of g(x, y, z) = y is

E[Y ] = y pY (y)
y

8 8 8
= (1) + (2) + (3) =
24 24 24

(circle one) 1 / 1.5 / 2.

(e) The expected value of g(x, y, z) = z is

E[Z] = z pZ (z)
z

6 6 6 6
= (1) + (2) + (3) + (4) =
24 24 24 24
60
(circle one) 1 / 1.5 / 24
.
(f) E(X + Y + Z) = E(X) + E(Y ) + E(Z) = (circle one) 6 / 8.74 / 11.2.

5. Expectation, Continuous. Consider the following distribution,

2 xy
x + 3 0 < x < 1, 0 < y < 2
f (x, y) =
0 elsewhere

(a) fX (x) = (circle none, one or more)

∞ 2
i. −∞ x2 + xy 3
dy = 0 x2 + xy
3
dy
2

2
ii. yx2 + xy6
0
2x
iii. 2x2 + 3
(b) E(X) = (circle none, one or more)
∞ 1 3 2x2

i. −∞ x 2x2 + 2x 3
dx = 0
2x + 3
dx
3

1
ii. 24 x4 + 2x9
0
2 2 13
iii. 4
+ 9
= 18
Section 2. Expectation of Sums of Random Variables 213

(c) fY (y) = (circle none, one or more)

∞ 1
i. −∞ x2 + xy 3
dx = 0 x2 + xy
3
dx
2

1
ii. 13 x3 + x6y
0
1 y
iii. 3
+ 6
(d) E(Y ) = (circle none, one or more)
∞ 2 2

i. −∞ y 13 + y6 dy = 0 y3 + y6 dy

3 2
ii. 16 y 2 + y18
0
1 23 10
iii. 6
(2)2 + 18
= 9
13 10 11
(e) E(X + Y ) = E(X) + E(Y ) = 18
+ 9
= (circle one) 6
/ 3 / 4.
(f) E(X + Y ) = (circle none, one or more)
∞ ∞ 21 xy

i. −∞ −∞ (x + y)f (x, y) dx dy = 0 0 (x + y) x2 + 3
dx dy
21 2 2

ii. 0 0 x3 + 4x3 y + xy3 dx dy

2 4 3

2 2 1
iii. 0 x4 + 4x9 y + x 6y dy
2 2

0

iv. 0 14 + 4y 9
+ y6 dy
2

2
y3
v. y4 + 4y18
+ 18
= 116
0

6. Expectation of Binomial. Let X be a binomial random variable with parameters

n and p and let
X = X1 + X2 + · · · + Xn
where

1 if ith trial is a success
Xi =
0 if ith trial is a failure
(a) Each Xi is a Bernoulli where
E(Xi ) = 1(p) + 0(1 − p) = (circle one) p / 1 − p / 1.
(b) And so E(X) = E(X1 ) + · · · + E(Xn ) = (circle one) np / n(1 − p) / n.
(c) If n = 8 and p = 0.25, then E(X) = (circle one) 3 / 4 / 5.
7. Expected Number of Matches. Ten people throw ten tickets with their names on
each ticket into a jar, then draw one ticket out of the jar at random (and put
it back in the jar). Let X be the number of people who select their own ticket
out of the jar. Let
X = X1 + X2 + · · · + X10
214 Chapter 7. Properties of Expectation

where

1 if ith person selects own ticket
Xi =
0 if ith person does not select their own ticket

(a) Since each person will choose any ticket with equal chance, E(Xi ) =
P {Xi = 1} = (circle one) 0.1 / 0.2 / 0.3.
(b) And so E(X) = E(X1 ) + · · · + E(X10 ) = 10(0.1) = (circle one) 1 / 5 / 10.
In other words, we’d expect one of the ten individuals to choose their own
ticket.
(c) If, instead of ten individuals, n individuals played this game, then we would
expect E(X) = E(X1 ) + · · · + E(Xn ) = n n1 (circle one) 1 / 5 / 10.

7.3 Covariance, Variance of Sums and Correla-

tions
In this section, we look at covariance, a measure of how two random variables are
related.

• If X, Y independent, then E[g(x)h(Y )] = E(g(X))E(h(Y ))

• Covariance is defined by
Cov(X, Y ) = E[(X − E(X))(Y − E(Y ))] = E(XY ) − E(X)E(Y )
and has the following properties.

– Cov(X, Y ) = Cov(Y, X)
– Cov(X, X) = Var(X)
– Cov(aX, Y ) = aCov(X, Y )

– Cov( ni=1 Xi , ni=1 Yi ) = ni=1 nj=1 Cov(Xi , Yj )

• In general,

Var( ni=1 Xi ) = ni=1 Var(Xi ) +2 i<j Cov(Xi , Yj ),

• The correlation is given by

ρ(x, y) = √ Cov(X,Y )
Var(X)Var(Y )

Exercise 7.3 (Covariance, Variance of Sums and Correlations)

1. Covariance, Discrete. Consider the following joint density.

Section 3. Covariance, Variance of Sums and Correlations 215

P {X = i, Y = j} j row sum
1 2 3 P {X = i}
1 0.01 0.02 0.08 0.11
i 2 0.01 0.02 0.08 0.11
3 0.07 0.08 0.63 0.78
P {Y = j} 0.09 0.12 0.79
column sum

(a) The expected value of g(x, y) = xy is

E[XY ] = xy p(x, y)
x y
= (1)(1)(0.01) + (1)(2)(0.02) + (1)(3)(0.08)
+ (2)(1)(0.01) + (2)(2)(0.02) + (2)(3)(0.08)
+ (3)(1)(0.07) + (3)(2)(0.08) + (3)(3)(0.63) =
(circle one) 7.23 / 13.74 / 11.22.
(b) The expected value of g(x, y) = x is

E[X] = x pX (x) =
x
= (1)(0.11) + (2)(0.11) + (3)(0.78) =
(circle one) 2.67 / 8.74 / 11.2.
(c) The expected value of g(x, y) = y is

E[Y ] = y pY (y)
y
= (1)(0.09) + (2)(0.12) + (3)(0.79) =
(circle one) 2.67 / 2.7 / 11.2.
(d) Cov(X, Y ) = E(XY ) − E(X)E(Y ) = 7.23 − (2.67)(2.7) = (circle one)
0.021 / 0.335 / 0.545.
(e) Cov(Y, X) = E(Y X) − E(Y )E(X) = 7.23 − (2.7)(2.67) = (circle one)
0.021 / 0.335 / 0.545.
(f) True / False. Cov(Y, X) = Cov(X, Y )
(g) The expected value of g(x, y) = x2 is

E[X 2 ] = x2 pX (x) =
x
= (1 )(0.11) + (22 )(0.11) + (32 )(0.78) =
2

(circle one) 2.67 / 7.57 / 11.2.

216 Chapter 7. Properties of Expectation

(h) Cov(X, X) = Var(X) = E(X 2 ) − E(X)E(X) = 7.57 − (2.67)(2.67) =

(circle one) 0.4411 / 7.57 / 11.2.
(i) The expected value of g(x, y) = y 2 is

E[Y 2 ] = y 2 pY (y) =
y

= (1 )(0.09) + (22 )(0.12) + (32 )(0.79) =

(circle one) 2.67 / 7.57 / 7.68.

(j) Var(Y ) = E(Y 2 ) − [E(Y )]2 = (circle one) 0.39 / 0.5511 / 11.2.
(k) Cov(3X, Y ) = 3 Cov(XY ) = 3(0.021) =
(circle one) 0.021 / 0.063 / 0.545.
(l) Cov(X, 4Y ) = 4 Cov(XY ) = 4(0.021) =
(circle one) 0.021 / 0.063 / 0.084.
(m) Cov(3X, 4Y ) = (3)(4) Cov(XY ) = 12(0.021) =
(circle one) 0.021 / 0.063 / 0.252.
(n) True / False. If X, Y independent, then E[g(x)h(Y )] =
E(g(X))E(h(Y )). If E(XY ) = E(X)E(Y ), then Cov(X, Y ) = E(XY ) −
E(X)E(Y ) = 0. In other words, if X and Y are independent, then the
covariance of X and Y is zero. (The converse is not necessarily true.)
Cov(X,Y )
(o) Correlation. ρ(x, y) = √ =√ 0.021
=
Var(X)Var(Y ) (0.4411)(0.39)
(circle one) 0.021 / 0.053 / 0.084.
Correlation measure how linear the data is; ρ = 1 (positive) if Y = aX + b
where a > 0 and ρ = −1 (negative) if Y = aX + b where a < 0 and ρ = 0
(uncorrelated) if Y = aX + b where a = 0.

2. More Covariance, Discrete. Let

1
p(x, y, z) = , x = 1, 2; y = 1, 2, 3; z = 1, 2, 3, 4
24
and
g(x, y, z) = x + y + z

(a) The expected value of g(x, y, z) = x is

E[X] = x pX (x)
x

12 12
= (1) + (2) =
24 24
(circle one) 1 / 1.5 / 2.5.
Section 3. Covariance, Variance of Sums and Correlations 217

(b) The expected value of g(x, y, z) = y is

E[Y ] = y pY (y)
y

8 8 8
= (1) + (2) + (3) =
24 24 24

(circle one) 1 / 1.5 / 2.

(c) The expected value of g(x, y, z) = z is

E[Z] = z pZ (z)
z

6 6 6 6
= (1) + (2) + (3) + (4) =
24 24 24 24
60
(circle one) 1 / 1.5 / 24
.
(d) The expected value of g(x, y) = xy is

E[XY ] = xy pX,Y (x, y)
x y

4 4 4
= (1)(1) + (1)(2) + (1)(3)
24 24 24

4 4 4
+ (2)(1) + (2)(2) + (2)(3) =
24 24 24
71 72 73
(circle one) 24
/ 24
/ 24
.
(e) The expected value of g(x, z) = xz is

E[XZ] = xz pX,Z (x, z)
x z

3 3 3 3
= (1)(1) + (1)(2) + (1)(3) + (1)(4)
24 24 24 24

3 3 3 3
+ (2)(1) + (2)(2) + (2)(3) + (2)(4) =
24 24 24 24
71 72 90
(circle one) 24
/ 24
/ 24
.
72
(f) Cov(X, Y ) = E(XY ) − E(X)E(Y ) = 24
− (1.5)(2) =
(circle one) 0 / 0.335 / 0.545.
90
60
(g) Cov(X, Z) = E(XZ) − E(X)E(Z) = 24
− (1.5) 24
=
(circle one) 0 / 0.335 / 0.545.
218 Chapter 7. Properties of Expectation

(h) Since Cov( ni=1 Xi , ni=1 Yi ) = ni=1 nj=1 Cov(Xi , Yj ),
Cov(X + Y, Z) = Cov(X, Z)+ Cov(Y, Z) = 0 + 0 =
(circle one) 0 / 0.335 / 0.545.

(i) Since Cov( ni=1 Xi , ni=1 Yi ) = ni=1 nj=1 Cov(Xi , Yj ),
Cov(X, Y + Z) = Cov(X, Y )+ Cov(X, Z) = 0 + 0 =
(circle one) 0 / 0.335 / 0.545.
(j) The expected value of g(x, y, z) = x2 is

E[X 2 ] = x2 pX (x)
x

2 12 2 12
= (1) + (2) =
24 24
60
(circle one) 24
/ 1.5 / 2.5.
60
(k) Var(X) = E(X 2 ) − [E(X)]2 = 24
− (1.5)2 = (circle one) 0 / 0.25 / 0.545.
(l) The expected value of g(x, y, z) = y 2 is

E[Y 2 ] = y 2 pY (y)
y

2 8 2 8 2 8
= (1) + (2) + (3) =
24 24 24
112
(circle one) 1 / 1.5 / 24
.
(m) Var(Y ) = E(Y 2 ) − [E(Y )]2 = 112
24
− (2)2 = (circle one) 0 / 0.5 / 23 .

(n) Since Var( ni=1 Xi ) = ni=1 Var(Xi ) +2 i<j
Cov(Xi , Yj ),
2
Var(X + Y ) = Var(X) + Var(Y ) +2 Cov(X, Y ) = 0.25 + 3
+ 2(0) =
9
(circle one) 12 / 10
12
/ 11
12
.
3. Variance of Rolling Dice. Calculate the expectation and variance of the sum of
15 rolls of a fair die.

6roll, Xi = 1, 2, 3, 4, 5, 6,
(a) For the ith
E[Xi ] = j=1 xp(x) = 1(1/6) + 2(1/6) + · · · + 6(1/6) =
(circle one) 32 / 52 / 72 .
(b) and so

15
10
E Xi = E[Xi ]
i=1 i=1
= 15E[Xi ]
7
= 15 =
2
75 90 105
(circle one) 2
/ 2
/ 2
.
Section 3. Covariance, Variance of Sums and Correlations 219

(c) Since E[Xi2 ] = 6j=1 x2 p(x) = 12 (1/6) + 22 (1/6) + · · · + 62 (1/6) = 91
6
(circle one) 88
6
/ 91
6
/ 95
6
.
and so Var(Xi ) = E[Xi ] − (E[Xi ])2 = 91/6 − (7/2)2 =,
2

(circle one) 30
12
/ 35
12
/ 40
12
.
(d) Since Xi are independent,

10
10
V Xi = V [Xi ]
i=1 i=1
= 15V [Xi ]
35
= 15 =
12
165 170 175
(circle one) 4
/ 4
/ 4
.

4. Covariance, Continuous. Consider the following distribution,

2 xy
x + 3 0 < x < 1, 0 < y < 2
f (x, y) =
0 elsewhere

(a) fX (x) = (circle none, one or more)

∞ 2
i. −∞ x2 + xy 3
dy = 0 x2 + xy
3
dy

2 2
ii. yx2 + xy6
0
2 2x
iii. 2x + 3
(b) E(X) = (circle none, one or more)
∞ 1 3 2x2

i. −∞ x 2x2 + 2x 3
dx = 0
2x + 3
dx
3

1
ii. 24 x4 + 2x9
0
2 2 13
iii. 4
+ 9
= 18
(c) fY (y) = (circle none, one or more)
∞ 1 2 xy
i. −∞ x2 + xy 3
dx = 0
x + 3 dx
2

1
ii. 13 x3 + x6y
0
1 y
iii. 3
+ 6
(d) E(Y ) = (circle none, one or more)
∞ 1 y 2 y y2

i. −∞ y 3 + 6 dy = 0 3 + 6 dy

3 2
ii. 16 y 2 + y18
0
1 23 10
iii. 6
(2)2 + 18
= 9
220 Chapter 7. Properties of Expectation

(e) E(XY ) =
∞ ∞ 21 xy

i. −∞ −∞ (xy)f (x, y) dx dy = 0 0 (xy) x2 + 3
dx dy
21 2 2

ii. 0 0 x3 y + x 3y dx dy
2 4

3 2 1
iii. 0 x4y + x 9y dy
2 2

0

iv. 0 y4 + y9 dy
2

3 2
v. y8 + y27 =
0
43 44 45
(circle one) 54
/ 54
/ 54
.
43
13 10
(f) Cov(X, Y ) = E(XY ) − E(X)E(Y ) = 54
− 18 9
=
3
(circle one) −0.0062 / 0.0062 / 18 .
(g) E(X 2 ) =
∞ 2x
1 4 2x3

i. −∞ x2 2x2 + 3
dx = 0
2x + 3
dx
4

1
ii. 25 x5 + 2x
12
0
2 2
iii. 5
+ 12
=
16 17 18
(circle one) 30
/ 30
/ 30
.
17
13 2 73 17 18
(h) Var(X) = E(X 2 ) − [E(X)]2 = 30
− 18
= (circle one) 1620
/ 30
/ 30
.
(i) E(Y 2 ) =
∞ 2 2 y3

i. −∞ y 2 13 + y6 dy = 0 y3 + 6
dy

2
1 3 y4
ii. 9 y + 24
0
1 24
iii. 9
(2)3 + 24
=
13 14 15
(circle one) 9
/ 9
/ 9
.
14
10 2 26 17 18
(j) Var(Y ) = E(Y 2 ) − [E(Y )]2 = 9
− 9
= (circle one) 81
/ 30
/ 30
.
(k) Cov(5X, Y ) = 5 Cov(XY ) = 5(−0.0062) =
(circle one) −0.021 / −0.031 / −0.123.
(l) Cov(X, 4Y ) = 4 Cov(XY ) = 4(−0.0062) =
(circle one) −0.021 / −0.063 / −0.123.
Cov(X,Y )
(m) Correlation. ρ(x, y) = √ =√ −0.0062
=
Var(X)Var(Y ) (73/1620)(26/81)
(circle one) −0.024 / −0.052 / −0.084.
5. Covariance of Binomial. Let X be a binomial random variable with parameters
n and p and let
X = X1 + X2 + · · · + Xn
Section 4. Conditional Expectation 221

where

1 if ith trial is a success
Xi =
0 if ith trial is a failure

(a) Each Xi is a Bernoulli where

E(Xi ) = 1(p) + 0(1 − p) = (circle one) p / 1 − p / 1.
(b) E(Xi2 ) = 12 (p) + 02 (1 − p) = (circle one) p / 1 − p / 1.
In other words, E(Xi2 ) = E(Xi ); in fact, E(Xin ) = E(Xi ).
(c) Var(Xi ) = E(Xi2 ) − [E(Xi )]2 = p − (p)2 = (circle one) p / p(1 − p) / 1.
(d) For i = j, E(Xi Xj ) = (1)(1)(p)(p) + (1)(0)(p)(1 − p) + (0)(1)(1 − p)(p) +
02 (1 − p)2 = (circle one) p / 1 − p / p2 .
(e) Cov(Xi , Xj ) = E(Xi Xj ) − E(Xi )E(Xj ) = p2 − (p)(p) =
3
(circle one) 0 / 0.0062 / 18 .
n n
(f) Since Var( i=1 Xi ) = i=1 Var(X i ) +2 i<j Cov(Xi , Yj ), and
Cov(Xi , Xj ) = 0,
Var(X1 + · · · + Xn ) = Var(X1 ) + Var(X2 ) + · · · + Var(Xn )
= p(1 − p) + · · · + p(1 − p) = (circle one) p(1 − p) / np(1 − p) / 1.

7.4 Conditional Expectation

We learn about conditional expectation and variance; in particular, we look at the
following items.

• E[X|Y = y] = x xP {X = x|Y = y} = xpX|Y (x|y), discrete
∞
E[X|Y = y] = −∞ xfX|Y (x|y) dx, fX (x) > 0, continuous

• E[g(X)|Y = y] = g(x)pX|Y (x|y), discrete
∞
E[g(X)|Y = y] = −∞ g(x)fX|Y (x|y) dx, fX (x) > 0, continuous

• general sum: E[ ni=1 Xi |Y = y] = ni=1 E(Xi |Y = y)

• computing expectation
by conditioning,
E[X] = E[E(X|Y )] = y E(X = x|Y = y)P (Y = y), discrete
∞
E[X] = −∞ E(X = x|Y = y)fY (y) dy, continuous

• computing probability by conditioning,

if E(X) = P (E),
where X = 1 if E occurs, 0 otherwise
then E[X] = y P (E|Y = y)P (Y = y), discrete (generalization of total prob-
ability)
∞
E[X] = −∞ P (E|Y = y)fY (y), continuous
222 Chapter 7. Properties of Expectation

Exercise 7.4 (Conditional Expectation)

1. Conditional Expectation, Discrete. Consider the following joint density.

P {X = i, Y = j} j row sum
1 2 3 P {X = i}
1 0.01 0.02 0.08 0.11
i 2 0.01 0.02 0.08 0.11
3 0.07 0.08 0.63 0.78
P {Y = j} 0.09 0.12 0.79
column sum

(a) Compute E[Y |X = 2].

Since P {Y = 1|X = 2} = P {Y = 1, X = 2}/P {X = 2} = 0.01
0.11
=
1 2 7 8
(circle one) 11 / 11 / 11 / 11
and P {Y = 2|X = 2} = P {Y = 2, X = 2}/P {X = 2} = 0.02
0.11
=
1 2 7 8
(circle one) 11 / 11 / 11 / 11
and P {Y = 3|X = 2} = P {Y = 3, X = 2}/P {X = 2} = 0.08
0.11
=
1 2 7 8
(circle one) 11 / 11 / 11 / 11
1 2
then E[Y |X = 2] = yP {Y = y|X = 2} = (1) 11 + (2) 11 +
8 y
(3) 11 =
1 2
(circle one) 11 / 11 / 29
11
/ 30
11
(b) Compute E[X|Y = 1].
Since P {X = 1|Y = 1} = P {X = 1, Y = 1}/P {Y = 1} = 0.01
0.09
=
1 2 7 8
(circle one) 9 / 9 / 9 / 9
and P {X = 2|Y = 1} = P {X = 2, Y = 1}/P {Y = 1} = 0.01
0.09
=
1 2 7 8
(circle one) 9 / 9 / 9 / 9
and P {X = 3|Y = 1} = P {X = 3, Y = 1}/P {Y = 1} = 0.07
0.09
=
(circle one) 19 / 29 / 79 / 89

then E[X|Y = 1] = x xP {X = x|Y = 1} = (1) 19 + (2) 19 + (3) 79 =
(circle one) 19 / 29 / 79 / 24
9
(c) Compute E[X|Y = 2].
Since P {X = 1|Y = 2} = P {X = 1, Y = 2}/P {Y = 2} = 0.02
0.12
=
1 2 7 8
(circle one) 12 / 12 / 12 / 12
and P {X = 2|Y = 2} = P {X = 2, Y = 2}/P {Y = 2} = 0.02
0.12
=
1 2 7 8
(circle one) 12 / 12 / 12 / 12
Section 4. Conditional Expectation 223

and P {X = 3|Y = 2} = P {X = 3, Y = 2}/P {Y = 2} = 0.08

0.12
=
1 2 7 8
(circle one) 12 / 12 / 12 / 12
2 2
then E[X|Y
= 2] = x xP {X = x|Y = 2} = (1) 12
+ (2) 12
+
8
(3) 12 =
1 2
(circle one) 12 / 12 / 27
12
/ 29
12
(d) Compute E[X|Y = 3].
Since P {X = 1|Y = 3} = P {X = 1, Y = 3}/P {Y = 3} = 0.08
0.79
=
1 2 7 8
(circle one) 79 / 79 / 79 / 79
and P {X = 2|Y = 3} = P {X = 2, Y = 3}/P {Y = 3} = 0.08
0.79
=
1 2 7 8
(circle one) 79 / 79 / 79 / 79
and P {X = 3|Y = 3} = P {X = 3, Y = 3}/P {Y = 3} = 0.63
0.79
=
1 2 7
(circle one) 79 / 79 / 79 / 63
79 8 8
then E[X|Y
= 3] = x xP {X = x|Y = 3} = (1) 79 + (2) 79 +
(3) 63
79
=
1 2
(circle one) 79 / 79 / 213
79
/ 214
79
(e) Compare E[E[X|Y ]] to E[X].
Since E[E(X|Y )] = (circle none, one or more)

i. y E(X|Y = y)P (Y = y) = E(X|Y = 1)P (Y = 1) + E(X|Y =
2)P (Y = 2) + E(X|Y = 3)P (Y = 3)
29 213
ii. 249
(0.09) + 12
(0.12) + 79
(0.79)
iii. 2.67
and E[X] = (circle none, one or more)

i. x xP (X = x) = (1)P (X = 1) + (2)P (X = 2) + (3)P (X = 3)
ii. (1)(0.11) + (2)(0.11) + (3)(0.78)
iii. 2.67
In other words, E[X] (circle one) does / does not equal E[E(X|Y )].
(f) Compute E(eX |Y = 1).
E[eX |Y = 1] = x ex P {X = x|Y = 1} = e1 19 + e2 19 + e3 79 =
(circle one) 15.4 / 15.7 / 16.3 / 16.7
(g) Compute E(X 2 |Y = 1).
E[X 2 |Y = 1] = x x2 P {X = x|Y = 1} = (1)2 19 + (2)2 19 + (3)2 79 =
(circle one) 19 / 29 / 67
9
/ 68
9
(h) Compute Var(X|Y = 1).
68
24 2
Var(X|Y = 1) = E(X 2 |Y = 1) − (E(X|Y = 1))2 = 9
− 9
=
(circle one) 19 / 39 / 49 / 11
9
(i) Compute Var(X|Y = 2). 2 2
2
Since E[X
|Y = 2] = x x2 P {X = x|Y = 2} = (1)2 12 + (2)2 12 +
2 8
(3) 12 =
224 Chapter 7. Properties of Expectation

(circle one) 16 / 26 / 41 / 68
6 6
41
29 2
Var(X|Y = 2) = E(X 2 |Y = 2) − (E(X|Y = 2))2 = 6
− 12
=
1
(circle one) 144 / 143
144
/ 247
144
/ 368
144
(j) Compute Var(X|Y = 3). 2
2 2 8 2 8
Since E[X
|Y = 3] = x x P {X = x|Y = 3} = (1) 79
+ (2) 79
+
(3)2 63
79
=
1 2
(circle one) 79 / 79 / 67 / 607
79 79 2
Var(X|Y = 3) = E(X 2 |Y = 3) − (E(X|Y = 3))2 = 607 79
− 213
79
=
1 232 247 2584
(circle one) 6241 / 6241 / 6241 / 6241
(k) Show Var(X) = E[Var(X|Y )] + Var(E(X|Y )).
Since E[Var(X|Y )] = (circle none, one or more)

i. y Var(X|Y = y)P (Y = y) = Var(X|Y = 1)P (Y = 1) + Var(X|Y =
2)P (Y = 2) + Var(X|Y = 3)P (Y = 3)
4 2584
ii. 9 (0.09) + 143
144
(0.12) + 6241
(0.79)
iii. 0.486255
and Var(E(X|Y )) = E[(E[X|Y ])2 ] − (E[E(X|Y )])2 = E[(E[X|Y ])2 ] −
(E[X])2
where E[(E[X|Y ])2 ] = (circle none, one or more)
2 2
i. y (E[X|Y = y]) P (Y = y) = (E[X|Y = 1]) P (Y = 1) + (E[X|Y =
2])2 P (Y = 2) + (E[X|Y = 3])2 P (Y = 3)
2 29 2 213 2
ii. 249
(0.09) + 12
(0.12) + 79
(0.79)
iii. 7.083744
and (E[X])2 = (circle none, one or more)

i. ( x xP (X = x))2 = ((1)P (X = 1) + (2)P (X = 2) + (3)P (X = 3))2
ii. ((1)(0.11) + (2)(0.11) + (3)(0.78))2
iii. 2.672 = 7.1289
And so Var(E(X|Y )) = E[(E[X|Y ])2 ] − (E[X])2 = 7.083744 − 7.1289 =
−0.045156
And so Var(X) = E[Var(X|Y )] + Var(E(X|Y )) = 0.486255 − 0.045156 =
0.441099.

2. Conditional Expectation, Continuous. Consider the following distribution,

2 xy
x + 3 0 < x < 1, 0 < y < 2
f (x, y) =
0 elsewhere

(a) fX (x) = (circle none, one or more)

∞ 2
i. −∞ x2 + xy 3
dy = 0 x2 + xy
3
dy
Section 4. Conditional Expectation 225

2
xy 2
ii. yx2 + 6
0
x2
iii. 2x2 + 3
(b) fY (y) = (circle none, one or more)
∞ 1
i. −∞ x2 + xy 3
dx = 0 x2 + xy
3
dx

2 2 1
ii. 13 x3 + x 6y
0
1 y2
iii. 3
+ 6
(c) Compute E[X|Y = 1].
Since fX|Y (x|y) = (circle none, one or more)
f (x,y)
i. fY (y)
x2 + xy
3
ii. 1 y2
3
+ 6
6x2 +2xy
iii. 2+y 2

then E[X|Y ] = (circle none, one or more)

∞ 2 +2xy
1 3 +2x2 y
i. −∞ x 6x2+y 2 dx = 0 6x2+y 2 dx
6 4 2 3
1
x + x y
ii. 4 2+y32
x=0
6 2
+ y
4 3 9+4y
iii. 2+y 2
= 12+6y 2
6
+ 2 (1)
and so E[X|Y = 1] = 42+(1)3
2 =

(circle one) 18 / 18 / 18 / 15
12 13 14
18
(d) Compute E[X|Y = 1.5].
6
4
+ 23 (1.5)
E[X|Y = 1.5] = 2+(1.5)2
=
(circle one) 10
17
/ 13
17
/ 14
17
/ 17
17
(e) Compare E[E[X|Y ]] to E[X].
Since E[E(X|Y )] = (circle none, one or more)
∞ 2 9+4y
1 y2

i. −∞ E(X|Y )fY (y) dy = 0 12+6y 2 3

+ 6 dy
2 1 1
ii. 0 4 + 9 y dy

1 2 2
iii. 14 y + 18 y y=0
1 1 13
iv. 4
(2) + 18
(2)2 = 18
and E[X] = (circle none, one or more)
∞ 1
i. −∞ xfX (x) dx = 0 x 2x2 + 2x3
dx
1 3 2x2

ii. 0 2x + 3 dx
226 Chapter 7. Properties of Expectation

1
1 4 2x3
iii. 2
x + 9
x=0
1 2 13
iv. 2
+ 9
= 18
In other words, E[X] (circle one) does / does not equal E[E(X|Y )].
(f) Compute E(X 2 |Y = 1).
E[X 2 |Y ] = (circle none, one or more)
∞ 2
1 4 +2x3 y
i. −∞ x2 6x2+y +2xy
2 dx = 0 6x2+y 2 dx
6 5 2 4
1
x + x y
ii. 5 2+y42
x=0
6 2
+ y
5 4 12+5y
iii. 2+y 2
= 20+10y 2
12+5(1)
and so E[X|Y = 1] = 20+10(1) 2 =
12 13 14 17
(circle one) 30 / 30 / 30 / 30
(g) Compute Var(X|Y = 1).
17
13 2
Var(X|Y = 1) = E(X 2 |Y = 1) − (E(X|Y = 1))2 = 30
− 18
=
1 3 4 73
(circle one) 1620 / 1620 / 1620 / 1620
(h) Determine Var(X|Y ).
Var(X|Y ) = (circle none, one or more)
i. E(X 2 |Y ) − (E(X|Y ))2

2
12+5y 9+4y
ii. 20+10y 2 − 12+6y 2

3. Prisoner’s Escape and Three Doors. A prisoner is faced with three doors. The
first door leads to a tunnel that leads to freedom in 4 hours. The second door
leads to a tunnel that returns the prisoner back to the prison in 5 hours. The
third door leads to a tunnel that returns the prisoner back to the prison in 10
hours. Assume the prisoner is equally likely to choose any door. Let X represent
the amount of time until the prisoner reaches freedom and let Y represent the
door (1, 2 or 3) he chooses. What is the expected length of time until the
prisoner reaches safety, E[X]?

(a) Since E[X] = E[E(X|Y )] = (circle none, one or more)

(d) and E(X|Y = 3) = (circle one)

(circle one) 4 / 5 + E[X] / 10 + E[X]
(e) and so E[X] = 13 [4 + (5 + E[X]) + (10 + E[X])] and so
E[X] = (circle one) 16 / 17 / 19
Review Chapter 8

Limit Theorems

8.1 Introduction
We look at various limit theorems used in probability theory, including some laws of
large numbers and some central limit theorems.

8.2 Chebyshev’s Inequality and the Weak Law of

Large Numbers
We look at Markov’s inequality, Chebyshev’s inequality and the weak law of large
numbers, which are given below.
• Markov’s inequality: For nonnegative random variable X and for a > 0,
P {X > a} ≤ E[X]
a
.
• Chebyshev’s inequality: For random variable X with finite µ and σ 2 and for
k > 0,
2
P {|X − µ| ≥ k} ≤ σk2 .
• Weak law of large numbers: For a sequence of independent identical random
variables
Xi , each
with
finite E[Xi ] = µ and for ε > 0,
P X1 +···X
n
n
− µ ≥ ε → 0 as n → ∞.
The first two inequalities allow us to specify (very loose) bounds on probabilities
knowing only µ (Markov) or µ and σ (Chebyshev), when the distribution is not
known. The first two inequalities also are used to prove further limit results, such as
the third result, the weak law of large numbers.

Exercise 8.1 (Markov’s Inequality and Chebyshev’s Inequality)

1. Markov’s Inequality: Ph Levels In Soil. Consider the following n = 28 Ph levels
in soil samples taken at Sand Dunes Park, Indiana.

243
244 Chapter 8. Limit Theorems

4.3 5 5.9 6.5 7.6 7.7 7.7 8.2 8.3 9.5

10.4 10.4 10.5 10.8 11.5 12 12 12.3 12.6 12.6
13 13.1 13.2 13.5 13.6 14.1 14.1 15.1
Assume one of these twenty–eight samples is taken at random.
(a) True / False Markov’s inequality can be rewritten in the following ways,
E[X] µ
P {X > a} ≤ =
a a
E[X]
P {X < a} ≥ 1 −
a
(b) True / False The expected (or mean) Ph level is µ = 10.55.
(Hint: Type the Ph levels into L1 , then STAT CALC 1:1–Var Stats.)
(c) If a = 15, Markov’s inequality,
10.55
P {X > 15} ≤ ≈ 0.703
15
allows us to say, at most, a proportion of 70.3% of the 28 Ph levels should
be more than a Ph level of 15.
1
In fact, only 1 (one) of the 28 Ph levels, or 28 = 0.036 or 3.6%, are above
15 (look at the data above and check that only the Ph level 15.1 is above
15).
Markov’s inequality (circle one) has / has not been violated in this case
Although not violated, Markov’s inequality provides a (circle one) good /
bad approximation to the proportion of Ph levels above 15.
(d) If a = 11, Markov’s inequality,
10.55
P {X > 11} ≤ ≈
11
(circle one) 0.76 / 0.86 / 0.96, allows us to say, at most, a proportion of
96% of the 28 Ph levels should be more than a Ph level of 11.
In fact, 14 of the 28 Ph levels, or 14
28
= 0.50 or 50%, are above a Ph level
of 11.
(e) If a = 7, Markov’s inequality,
10.55
P {X > 7} ≤ ≈
7
(circle one) 1.51 / 1.86 / 1.96, allows us to say, at most, all1 28 Ph levels
should be more than a Ph level of 7.
In fact, 24 of the 28 Ph levels, or 24
28
= 0.86 or 86%, are above a Ph level
of 7.
1
Even though 10.55
7 ≈ 1.51, it is not possible to have more than 100% of all Ph levels have a Ph
level more than 7.
Section 2. Chebyshev’s Inequality and the Weak Law of Large Numbers 245

(f) If a = 11, Markov’s inequality,

10.55
P {X < 11} ≥ 1 − ≈
11
(circle one) 0.04 / 0.05 / 0.06, allows us to say, at least, 4% of the 28 Ph
levels should be less than a Ph level of 11.
In fact, 14 of the 28 Ph levels, or 14
28
= 0.50 or 50%, are less than a Ph
level of 11.

2. Chebyshev’s Inequality: Ph Levels In Soil. Consider the following n = 28 Ph

levels in soil samples taken at Sand Dunes Park, Indiana.

4.3 5 5.9 6.5 7.6 7.7 7.7 8.2 8.3 9.5

10.4 10.4 10.5 10.8 11.5 12 12 12.3 12.6 12.6
13 13.1 13.2 13.5 13.6 14.1 14.1 15.1

Assume one of these twenty–eight samples is taken at random.

(a) True / False Chebyshev’s inequality can be rewritten in the following

ways,

σ2
P {|X − µ| ≥ k} ≤
k2
σ2 1
P {|X − µ| ≥ σk} ≤ 2 2
= 2
σ k k
1
P {|X − µ| ≤ σk} ≥ 1 − 2
k
1
P {µ − σk ≤ X ≤ µ + σk} ≥ 1 − 2
k

(b) True / False The expected Ph level and standard deviation in Ph level
are µ = 10.55 and σ = 3.01, respectively.
(Hint: Type the Ph levels into L1 , then STAT CALC 1:1–Var Stats.)
(c) The Ph level one standard deviation above the average is equal to µ + σ =
10.55 + 3.01 = 13.56. The Ph level two standard deviations below the
average is equal to µ − 2(3.01) = 4.53. Determine (a), (b), (c), (d) and (e)
in the ﬁgure and then ﬁll in the table below.
246 Chapter 8. Limit Theorems

10.55 - 3(3.01) 10.55 + 3(3.01)

at least 89%
10.55 - 2(3.01) 10.55 + 2(3.01)

at least 75%

10.55 - 3.01 10.55 + 3.01

at least 0%

(a) 4.53 (b) 10.55 (c) (d) (e)

ave

Figure 8.1 (Ph Levels 1, 2 and 3 Standard Deviations From Mean)

(a) (b) (c) (d) (e)

(d) The smallest Ph level, 4.3, is (circle one) inside / outside the interval
between 7.54 and 13.56. Also, the Ph level, 10.5, is (circle one) inside /
outside the interval (7.54, 13.56).
(e) Ph levels that are within one standard deviation of the average, refers to
Ph levels that are (circle one) inside / outside the interval (7.54, 13.56).
Ph levels that are within two standard deviations of the average, refers to
Ph levels that are (circle one) inside / outside the interval (4.53, 16.57).
(f) Instead of saying “Ph levels that are within one standard deviation of
the mean”, it is also possible to say “Ph levels are within k standard
deviations of the mean”, where k = 1. If the Ph levels are within two
standard deviations of the average,
then k = 1 / 2 / 3
If the Ph levels are within two and a half standard deviations of the average,
then k = 1 / 1.5 / 2.5
(g) If k = 1.5, then 1 − k12 = 1 − 1.51 2 ≈ 0.56 or 56%.
If k = 2, then 1 − k12 = 14 / 24 / 34
which is equal to 25% / 50% / 75%.
(h) Chebyshev’s inequality,
1
P {µ − σk ≤ X ≤ µ + σk} ≥ 1 −
k2
1 3
P {10.55 − 2σ ≤ X ≤ 10.55 + 2σ} ≥ 1 − 2 = ,
2 4
allows us to say, at least a 1 − k12 = 0.75 proportion or 75% of the 28 Ph
levels should be within two (k = 2) standard deviation of the average.
In fact, 27 of the 28 Ph levels (look at the data above and see for yourself),
Section 2. Chebyshev’s Inequality and the Weak Law of Large Numbers 247

or 27
28
= 0.964 or 96.4%, are in the interval (4.53, 16.57). Chebyshev’s
inequality (circle one) has / has not be violated in this case.
(i) Using Chebyshev’s inequality, what proportion should fall within k = 3
standard deviations of the average?
1 − 312 = (circle one) 34 / 67 / 89
In fact, what proportion of the Ph levels are actually in this interval (count
the number in the interval (1.52,19.58))?
(circle one) 26
28
/ 27
28
/ 28
28
(j) Using Chebyshev’s inequality, what proportion should fall within k = 2.5
standard deviations of the average?
1 − 2.51 2 = (circle one) 20
25
/ 21
25
/ 22
25
In fact, what proportion of the Ph levels are actually in this interval (count
the number in the interval (3.025,18.075))?
(circle one) 2628
/ 27
28
/ 28
28
(k) Since at least 75% or 21 Ph levels are inside the interval (4.534, 16.574),
then at most
(circle one) 25% / 35% / 45%
of the levels are outside the interval (4.534, 16.574).

3. Chebyshev and the Normal Distribution. Let X be a normal random variable,

with mean µ = 5 and standard deviation σ = 2.

(a) Using your calculators, P {1 < X < 9} = (circle one) 0.68 / 0.75 / 0.95
(Hint: normalcdf(1,9,5,2))
(b) Using Chebyshev’s inequality, P {1 < X < 9} =
1
P {µ − kσ ≤ X ≤ µ + kσ} ≥ 1 −
k2
1
P {5 − 2(2) ≤ X ≤ 5 + 2(2)} ≥ 1 − 2 =
2
(circle one) 0.68 / 0.75 / 0.95
(c) So, although Chebyshev’s inequality is correct, it (circle one) is / is not
a good approximation to the correct probability in this case.

4. Weak Law of Large Numbers and Chebyshev.

(a) True / False For a sequence of independent identical random variables

Xi , each with ﬁnite E[Xi ] = µ and so

X1 + · · · Xn X 1 + · · · Xn σ2
E = µ and Var =
n n n
248 Chapter 8. Limit Theorems

σ2
then, from Chebyshev’s inequality, P {|X − µ| ≥ σk} ≤ σ2 k 2
,

X1 + · · · Xn σ2
P − µ ≥ ε ≤ 2
n nε

which tends to zero, as n → 0.

(b) Based on past experience, the mean test score is µ = 70 and the variance
in the test score is σ 2 = 20. How many students would have to take a test
to be sure, that with probability of at least 0.85, the class average would
be within ε = 5 of 70? Since

X1 + · · · + Xn σ2
P
− µ ≥ ε ≤
n nε2

X1 + · · · + Xn 20
P − 70 ≥ 5 ≤
n n52
20 20
and so 1 − 0.85 = n52
when n = (0.15)25
≈ (circle one) 3 / 4 / 5
(c) Based on past experience, the mean test score is µ = 50 and the variance
in the test score is σ 2 = 15. How many students would have to take a test
to be sure, that with probability of at least 0.95, the class average would
be within ε = 7 of 50?
15
n= ≈
(0.05)72

(circle one) 3 / 4 / 6
(d) Based on past experience, the mean test score is µ = 50 and the variance in
the test scores is σ 2 = 15. Determine the probability that the average score
of 40 students will between 39 and 61. The weak law of large numbers,

X1 + · · · Xn σ2
P − µ ≤ ε ≥ 1 − 2 ,
n nε

can be used to approximate the probability In particular,

X1 + · · · Xn 152
P
− 70 ≤ 11 ≥ 1 − =
n (40)112

(circle one) 0.95 / 0.99 / 1

Section 3. The Central Limit Theorem 249

8.3 The Central Limit Theorem

The Central Limit Theorem (CLT) says that as random sample size n increases, the
probabilitydistribution of the sum of independent identically distributed random
variables, ni=1 Xi , tends to a normal distribution,
a
X1 + · · · Xn − nµ 1 2
P √ ≤a → √ e−x /2 dx, n → ∞
σ n 2π −∞
A related result is that

MZn (t) → MX (t), implies FZn (t) → FZ (t)

Exercise 8.2 (Central Limit Theorem)

1. Using The Central Limit Theorem.

(a) Sum. Suppose X has a (any!)

distribution where
µX = 2.7 and σX = 0.64.
35
If n = 35, then determine P i=1 Xi > 99 .

i. µP Xi = nµ = 35(2.7) = (circle one) 93.5 / 94.5 / 95.5.

√
ii. σP Xi = σ n = 0.64 35 = (circle one) 3.5 / 3.8 / 4.1.
√

35
iii. P i=1 Xi > 99 ≈ (circle one) 0.09 / 0.11 / 0.15.
(2nd DISTR normalcdf(99,E99,94.5,3.8))

where µX =
(b) Another Sum. Suppose X has a (any!)
distribution −1.7 and
σX = 1.6. If n = 43, then determine P −76 < 43 i=1 X i < −71 .
i. µP Xi = nµ = 43(−1.7) = (circle one) −73.5 / −73.1 / −72.9.
√
ii. σP Xi = σ n = 1.6 43 = (circle one) 9.5 / 9.8 / 10.5.
√

iii. P −76 < 43 i=1 Xi < −71 ≈ (circle one) 0.09 / 0.11 / 0.19.
(2nd DISTR normalcdf(-76,-71,-73.1,10.5))
(c) And Yet Another Sum. Suppose X has a distribution where µX = 0.7 and
σX
= 1.1.
If n = 51, then
P 34.5 < 51 i=1 Xi < 35.1 ≈ (circle one) 0.01 / 0.02 / 0.03.

(d) Average. Suppose X has a distribution where µ = 2.7 and σ = 0.64. If

n = 35, determine the chance the average (not sum!) is larger than 2.75,
P (X̄ > 2.75).
nµ
i. µX̄ = n
=µ= (circle one) 2.7 / 2.8 / 2.9.
√
σ n
ii. σX̄ = n
= √σn
=√ 0.64
35
= (circle one) 0.11 / 0.12 / 0.13.

iii. P X̄ > 2.75 ≈ (circle one) 0.30 / 0.32 / 0.35.
(2nd DISTR normalcdf(2.75,E99,2.7,0.11))
250 Chapter 8. Limit Theorems

(e) Another Average. Suppose X has a distribution where µX = −1.7 and

σX = 1.5. If n = 49, then
P (−2 < X̄ < 2.75) ≈ (circle one) 0.58 / 0.58 / 0.92.
(f) Exponential Sum. Suppose X has an exponential distribution where
λ = 4 and where n = 35. Determine the chance of the sum of 35 in-
dependent

35 identically
distributed random variables, X, is greater than 9,
P i=1 Xi > 9 using the normal approximation.
i. E[X] = µ = λ1 = (circle one) 0.25 / 0.33 / 0.50.
ii. µP Xi = nµ = 35(0.25) = (circle one) 7.75 / 8.75 / 9.75.
iii. Var[X] = σ 2 = λ12 = (circle one) 0.0255 / 0.0335 / 0.0625.
√ √
iv. σP Xi = σ n = 0.0625 35 = (circle one) 0.0106 / 0.0335 / 1.48.
√

v. Normal

35 approximation,

P i=1 Xi > 9 ≈ (circle one) 0 / 0.11 / 0.43.
(2nd DISTR normalcdf(9,E99,8.75,1.48))
(g) Negative Binomial Sum. Suppose X has a negative binomial distribution
where the number of trials is i = 500, the required number of successes
are r = 4 and chance of success on each trial is p = 0.3. Determine
the chance of the sum of 35 independent identically

35 distributed random
variables, X, is greater than 450 successes, P i=1 X i > 450 , using the
normal approximation.
i. E[X] = µ = pr = (circle one) 12.3 / 13.3 / 14.3.
ii. µP Xi = nµ = 35(13.3) = (circle one) 465.5 / 470.5 / 495.5.
iii. Var[X] = σ 2 = r(1−p)
p2
= (circle one) 29.3 / 31.1 / 34.3.
√ √
iv. σP Xi = σ n = 31.1 35 = (circle one) 33 / 35 / 37.
√

v. Normal

35 approximation,

P i=1 Xi > 450 ≈ (circle one) 0 / 0.68 / 0.75.
(2nd DISTR normalcdf(450,E99,465.5,33))
(h) Dice Sum. What is the
chance,in 30 rolls of a fair die, that the sum is
between 100 and 105, P 100 < 30 i=1 Xi < 105 , using the normal approx-
imation.

i. E[X] = µ = 1 16 + · · · + 6 16 = (circle one) 2.3 / 3.5 / 4.3.
ii. µP Xi = nµ = 30(3.5) = (circle one) 100 / 105 / 110.

iii. E[X 2 ] = µ = 12 16 + · · · + 62 16 = (circle one) 2.3 / 3.5 / 15.2.
iv. Var[X] = E[X 2 ] − (E[X])2 = 15.2 − 3.52 (circle one) 2.9 / 3.1 / 3.3.
√ √
v. σP Xi = σ n = 2.9 30 = (circle one) 9.1 / 9.4 / 9.7.
√

vi. Normal

approximation,

P 100 < 30 i=1 Xi < 105 ≈ (circle one) 0 / 0.20 / 0.35.
(2nd DISTR normalcdf(100,105,105,9.4))
Section 3. The Central Limit Theorem 251

2. Understanding The Central Limit Theorem: Fishing in Montana.

(a) The distributions of the average number of ﬁsh caught at a lake, X̄, where
n = 1, 2, 3 are given by
x, n = 1 1 2 3
P (X = x) 0.4 0.4 0.2
where µX = 1.8 and σX = 0.75,
3 5
x̄, n = 2 1 2
2 2
3
P (X̄ = x̄) 0.16 0.32 0.32 0.16 0.04
0.75
√
where µX̄ = 1.8 and σX̄ = 2
= 0.53,
4 5 7 8
x̄, n = 3 1 3 3
2 3 3
3
P (X̄ = x̄) 0.064 0.192 0.288 0.256 0.144 0.048 0.008
where µX̄ = 1.8 and σX̄ = 0.75
√ = 0.43. The probability histograms of these
3
three sampling distributions are given below.

0.35 0.30
0.40
P(X = x) = P(X = x)

0.30 0.25
_

0.35
P(X = x)
_ _

0.30 0.25

P(X = x)
0.20
_

_ _
0.25 0.20
0.15
0.20 0.15
0.15 0.10
0.10
0.10 0.05
0.05
0.05
0.00 0.00
0.00 X 1 4/3 5/3 2 7/3 8/3 3 X
1 2 3 X 1 3/2 2 5/2 3

(a) n = 1 (b) n = 2 (c) n = 3

Figure 8.2 (Comparing Sampling Distributions Of Sample Mean)

As the random sample size, n, increases, the sampling distribution of the
average, X̄, changes shape and becomes more (circle one)
i. rectangular–shaped.
ii. bell–shaped.
iii. triangular–shaped.
In fact, the central limit theorem (CLT) says no matter what the original
distribution, the sampling distribution of the average is typically normal
when n > 30.
(b) Even though the sampling distribution becomes more normal–shaped as
the random sample size increases, the mean of the average, µX̄ = 1.8 (circle
one)
σ2
i. decreases and is equal to nX ,
ii. remains the same and is equal to µX = 1.8,
252 Chapter 8. Limit Theorems

iii. increases and is equal to nµX ,

and the standard deviation of the average, σX̄ (circle one)
σX
i. decreases and is equal to √ n
.
ii. remains the same and is equal to σX .
iii. increases and is equal to nσX .
(c) After n = 30 trips to the lake, the distribution in the average number of
fish caught is essentially normal (why?), where (circle one)
0.75
√
i. µX̄ = 1.8 and σX̄ = 3
= 0.43.
0.75
√
ii. µX̄ = 1.8 and σX̄ = 10
= 0.24.
0.75
√
iii. µX̄ = 1.8 and σX̄ = 30
= 0.14
(d) True / False. After n = 30 trips to the lake, the (approximate) chance
the average number of fish caught is greater than 2.1 fish is given by (using
your calculators)
0.75
P (X̄ > 2.1) ≈ 0.015, where µX̄ = 1.8 and σX̄ = √ = 0.14.
√ 30
(2nd DISTR 2:normalcdf(2.1,E99,1.8,0.75/ 30) ENTER)
(e) After 30 trips to the lake, the chance the average number of fish is less
than 1.95 is P (X̄ < 1.95) ≈ (circle one) 0.73 / 0.86 / 0.94.
(f) After n = 35 trips to the lake, the distribution in the average number of
fish caught is essentially normal (why?), where (circle one)
0.75
√
i. µX̄ = 1.8 and σX̄ = 30
= 0.14.
0.75
√
ii. µX̄ = 1.8 and σX̄ = 35
= 0.13.
0.75
√
iii. µX̄ = 1.8 and σX̄ = 40
= 0.12
(g) After 35 trips to the lake, the chance the average number of fish is less
than 1.95 is P (X̄ < 1.95) ≈ (circle one) 0.73 / 0.88 / 0.94.
(h) After n = 15 trips to the lake, the distribution in the average number of
fish caught (circle one) is / is not normal.
(i) The CLT is useful because (circle none, one or more):
i. No matter what the original distribution is, as long as a large enough
random sample is taken, the average of this sample follows a normal
(not a binomial or any other distribution) distribution.
ii. In practical situations where it is not known what probability distri-
bution to use, as long as a large enough random sample is taken, the
average of this sample follows a normal distribution.
iii. Rather than having to deal with many different probability distribu-
tions, as long as a large enough random sample is taken, the average
of this sample follows one distribution, the normal distribution.
Section 3. The Central Limit Theorem 253

iv. Many of the distributions in statistics rely in one way or another on

the normal distribution because of the CLT.
(j) True / False The central limit theorem requires not only that n ≥ 30,
but also that a random sample of size n ≥ 30 is used.

Chapter 4
80% (5)
Chapter 4
21 pages
Ma1252-Probability and Queueing Theory Unit - I Probability and Random Variable
100% (1)
Ma1252-Probability and Queueing Theory Unit - I Probability and Random Variable
19 pages
Random Variable and Mathematical Expectation
No ratings yet
Random Variable and Mathematical Expectation
31 pages
Sports Arbitrage Guide 04 - The Calculations
100% (2)
Sports Arbitrage Guide 04 - The Calculations
5 pages
STAT273 - CHAPTER 04 (Summer)
No ratings yet
STAT273 - CHAPTER 04 (Summer)
30 pages
Safety Integrity Level (SIL)
No ratings yet
Safety Integrity Level (SIL)
1 page
Student Notes 2.1
No ratings yet
Student Notes 2.1
5 pages
Chapter3-Probability Distribution
100% (1)
Chapter3-Probability Distribution
35 pages
ISO 4287-1997 - Surface Texture
No ratings yet
ISO 4287-1997 - Surface Texture
36 pages
Chapter 4 Random Variables
No ratings yet
Chapter 4 Random Variables
180 pages
Digital Design Interview Questions
No ratings yet
Digital Design Interview Questions
24 pages
Homological Algebra
0% (1)
Homological Algebra
279 pages
L-6 Probability Distribution
No ratings yet
L-6 Probability Distribution
58 pages
Chapter 7 Discrete Probablity
No ratings yet
Chapter 7 Discrete Probablity
24 pages
Chapter 4
No ratings yet
Chapter 4
22 pages
4.1 - Discrete Statistics
No ratings yet
4.1 - Discrete Statistics
24 pages
Book - Applied - Mathematics - ClassXII-208-382
No ratings yet
Book - Applied - Mathematics - ClassXII-208-382
175 pages
2024 Chapter4
No ratings yet
2024 Chapter4
65 pages
STA416 - Topic 4 - 1
No ratings yet
STA416 - Topic 4 - 1
24 pages
Book - Applied - Mathematics - ClassXII-208-382-1-135
No ratings yet
Book - Applied - Mathematics - ClassXII-208-382-1-135
135 pages
Probability FoundationalMathofAI S24
No ratings yet
Probability FoundationalMathofAI S24
7 pages
Stat 350 Study Guide
No ratings yet
Stat 350 Study Guide
37 pages
Elementary Probability Theory
No ratings yet
Elementary Probability Theory
18 pages
Distribution of Probability
No ratings yet
Distribution of Probability
34 pages
Chapter 3
No ratings yet
Chapter 3
35 pages
Unit 2 P&S
No ratings yet
Unit 2 P&S
82 pages
CHAPTER TWO (2) S
No ratings yet
CHAPTER TWO (2) S
69 pages
Module 05
No ratings yet
Module 05
94 pages
Week One Notes
No ratings yet
Week One Notes
10 pages
EFA 2213 Engineering Mathematics IV (Statistics) : Random Variables
No ratings yet
EFA 2213 Engineering Mathematics IV (Statistics) : Random Variables
74 pages
Chapter 3
No ratings yet
Chapter 3
8 pages
Sub BAB 4.1
No ratings yet
Sub BAB 4.1
4 pages
Week05-06 EC With Annotations
No ratings yet
Week05-06 EC With Annotations
84 pages
Ch04 Cost Volume Profit Analysis
0% (1)
Ch04 Cost Volume Profit Analysis
21 pages
@ CHPTR 03
No ratings yet
@ CHPTR 03
43 pages
Statistics and Probability - Lect5 - S2025 - Pucit
No ratings yet
Statistics and Probability - Lect5 - S2025 - Pucit
11 pages
Ch-04 - Random Variables and Their Properties
No ratings yet
Ch-04 - Random Variables and Their Properties
32 pages
Lecture Note L3
No ratings yet
Lecture Note L3
54 pages
Aem Probability PDF
No ratings yet
Aem Probability PDF
10 pages
S-11 - Random Variables and Discrete Probability Distributions
No ratings yet
S-11 - Random Variables and Discrete Probability Distributions
24 pages
Random Variables FinalNotes
No ratings yet
Random Variables FinalNotes
57 pages
4.1 - Discrete Models PDF
No ratings yet
4.1 - Discrete Models PDF
24 pages
Chapter 4 - Random Variables
No ratings yet
Chapter 4 - Random Variables
59 pages
Discrete Random Variables Class 4, 18.05 Jeremy Orloff and Jonathan Bloom 1 Learning Goals
No ratings yet
Discrete Random Variables Class 4, 18.05 Jeremy Orloff and Jonathan Bloom 1 Learning Goals
13 pages
Unit 1 Basic
No ratings yet
Unit 1 Basic
28 pages
Random Variables
No ratings yet
Random Variables
26 pages
Random Variables and Pdfs
No ratings yet
Random Variables and Pdfs
18 pages
Advanced Engineering Mathematics Vivek Vijay
No ratings yet
Advanced Engineering Mathematics Vivek Vijay
246 pages
P&S Unit 1
No ratings yet
P&S Unit 1
50 pages
Lecture 1
No ratings yet
Lecture 1
81 pages
Statistical Method
No ratings yet
Statistical Method
227 pages
Orientation - Basic Mathematics and Statistics - Probability
No ratings yet
Orientation - Basic Mathematics and Statistics - Probability
48 pages
Note Chapter 1
No ratings yet
Note Chapter 1
49 pages
PRP - Unit 2
No ratings yet
PRP - Unit 2
41 pages
FicheTD Proba 2024
No ratings yet
FicheTD Proba 2024
6 pages
Random Variable Slides
No ratings yet
Random Variable Slides
41 pages
Random Variables: Jeff Chak Fu WONG
No ratings yet
Random Variables: Jeff Chak Fu WONG
104 pages
Random Variables Apr 27
No ratings yet
Random Variables Apr 27
32 pages
Discrete Cumulative - Probability Distribution
No ratings yet
Discrete Cumulative - Probability Distribution
50 pages
Module 2
No ratings yet
Module 2
37 pages
Basic Probability Review
No ratings yet
Basic Probability Review
77 pages
List of Open Elective 2021
No ratings yet
List of Open Elective 2021
3 pages
Best Ones
No ratings yet
Best Ones
32 pages
ACF 602/622 Coursework: Group 53
No ratings yet
ACF 602/622 Coursework: Group 53
13 pages
Random Processes Ma1254
No ratings yet
Random Processes Ma1254
17 pages
Manual Moisture
No ratings yet
Manual Moisture
38 pages
Asset Pricing
No ratings yet
Asset Pricing
23 pages
Urban Transport: 4401 - Gravity Model
100% (1)
Urban Transport: 4401 - Gravity Model
21 pages
Investments Fama French Three Factor Analysis Slide Deck
No ratings yet
Investments Fama French Three Factor Analysis Slide Deck
39 pages
G7 Q1 Week 01
No ratings yet
G7 Q1 Week 01
8 pages
Lent 2020 Week 7 Test Revision Answers
No ratings yet
Lent 2020 Week 7 Test Revision Answers
4 pages
Customer Churn Analysis - Jupyter Notebook
No ratings yet
Customer Churn Analysis - Jupyter Notebook
10 pages
Audit & Accounts Order - 1952
No ratings yet
Audit & Accounts Order - 1952
3 pages
ME 2019 New Question Format Chem 2
No ratings yet
ME 2019 New Question Format Chem 2
4 pages
Samrat Class 8 Science
No ratings yet
Samrat Class 8 Science
10 pages
Python Notes 11 Dictionary Tuples and Sets 1664121924
No ratings yet
Python Notes 11 Dictionary Tuples and Sets 1664121924
21 pages
DEA With Stata
No ratings yet
DEA With Stata
14 pages
Legal Framework-Constitutional Provision
No ratings yet
Legal Framework-Constitutional Provision
40 pages
Deadlock
No ratings yet
Deadlock
67 pages
70 FTC
No ratings yet
70 FTC
6 pages
물리 교재 28단원
No ratings yet
물리 교재 28단원
26 pages
ACtion Plan
No ratings yet
ACtion Plan
14 pages
Public Expenditure Management and Budget
No ratings yet
Public Expenditure Management and Budget
15 pages
Econ 406 Final Exam Main 2017
No ratings yet
Econ 406 Final Exam Main 2017
5 pages
2015 Examinations
No ratings yet
2015 Examinations
3 pages
Respondent Independent Variables Indicators of The Variable Measuring Indicator Questions 1. Pensioner (Before iBAS) 2. Pensioner (After iBAS++)
No ratings yet
Respondent Independent Variables Indicators of The Variable Measuring Indicator Questions 1. Pensioner (Before iBAS) 2. Pensioner (After iBAS++)
4 pages
MTBF Paper
No ratings yet
MTBF Paper
76 pages
AcF 602 Coursework
No ratings yet
AcF 602 Coursework
2 pages
Lent 2020 Week 9 Questions
No ratings yet
Lent 2020 Week 9 Questions
1 page
Daily Monitoring Report For CGDF: Senior Finance Controller, Air Forces
No ratings yet
Daily Monitoring Report For CGDF: Senior Finance Controller, Air Forces
4 pages
CGDF Monitoring Report
No ratings yet
CGDF Monitoring Report
4 pages
Elementary Statistics A Step by Step Approach 9th Edition Bluman Test Bank PDF Download
100% (2)
Elementary Statistics A Step by Step Approach 9th Edition Bluman Test Bank PDF Download
65 pages
Submitted in Partial Fulfilment For The Award of Degree of
No ratings yet
Submitted in Partial Fulfilment For The Award of Degree of
13 pages
Accounting Treatment of Exchange Account Codes & Relevant Questions
No ratings yet
Accounting Treatment of Exchange Account Codes & Relevant Questions
13 pages
C27 Btest-1 Physics Paper
No ratings yet
C27 Btest-1 Physics Paper
8 pages
IBAS++ Implementation in Military Farms
No ratings yet
IBAS++ Implementation in Military Farms
6 pages
Free Damped Vibration
No ratings yet
Free Damped Vibration
50 pages
Kinematics (Motion in Straight Line) WS 1
No ratings yet
Kinematics (Motion in Straight Line) WS 1
3 pages
Caie As Level Psychology 9990 Methodology 63d5229efa0a7313631e05cb 853
No ratings yet
Caie As Level Psychology 9990 Methodology 63d5229efa0a7313631e05cb 853
9 pages
1.6 Other Types of Equations: Objectives
No ratings yet
1.6 Other Types of Equations: Objectives
19 pages
Vertopal Com EDA Project
No ratings yet
Vertopal Com EDA Project
21 pages
Anova 2
No ratings yet
Anova 2
4 pages
Jain College, Jayanagar II PUC Mock Paper I 2018 Mathematics Duration: 3hr 15 Min Max - Marks: 100 Part A I. Answer All The Questions: 1 × 10 10
No ratings yet
Jain College, Jayanagar II PUC Mock Paper I 2018 Mathematics Duration: 3hr 15 Min Max - Marks: 100 Part A I. Answer All The Questions: 1 × 10 10
3 pages
Department of Education: General Mathematics Weekly Home Learning Plan
No ratings yet
Department of Education: General Mathematics Weekly Home Learning Plan
3 pages
PROBLEMSet 092022
No ratings yet
PROBLEMSet 092022
1 page
Differential Forms
From Everand
Differential Forms
Henri Cartan
5/5 (2)
Worked Examples in Mathematics for Scientists and Engineers
From Everand
Worked Examples in Mathematics for Scientists and Engineers
G. Stephenson
No ratings yet
Pythagorean Triangles
From Everand
Pythagorean Triangles
Waclaw Sierpinski
No ratings yet