Chapter Three: Random Variables
Definition
A random variable is a quantitative variable whose value depends on the outcome of an experiment. (More
generally, it is a number whose value is not known with certainty.) For now we focus on discrete random variables.
PROBLEM 3.1
[Figure: dot diagram for Problem 3.1]
These are the important questions to consider regarding any random variable X:
1) What are the possible values of X?
2) What are the corresponding probabilities for each possible value?
3) What is the expected value of X?
4) What is the standard deviation of X?
Definition
The probability distribution of a discrete random variable X is a two-column table listing the possible values of X
and the probabilities for each.
Example
A pair of balanced dice will be rolled.
Let X = the sum of the two outcomes
Let Y = the product of the two outcomes
The probability distributions for each random variable are given below:
 X   P(X)            Y   P(Y)
 2   1/36 = .028     1   1/36 = .028
 3   2/36 = .056     2   2/36 = .056
 4   3/36 = .083     3   2/36 = .056
 5   4/36 = .111     4   3/36 = .083
 6   5/36 = .139     5   2/36 = .056
 7   6/36 = .167     6   4/36 = .111
 8   5/36 = .139     8   2/36 = .056
 9   4/36 = .111     9   1/36 = .028
10   3/36 = .083    10   2/36 = .056
11   2/36 = .056    12   4/36 = .111
12   1/36 = .028    15   2/36 = .056
    36/36 = 1       16   1/36 = .028
                    18   2/36 = .056
                    20   2/36 = .056
                    24   2/36 = .056
                    25   1/36 = .028
                    30   2/36 = .056
                    36   1/36 = .028
                        36/36 = 1
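Both tables can be rebuilt by listing all 36 equally likely outcomes of the two dice. The following is a minimal Python sketch of that idea; the helper name `distribution` is our own and does not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distribution(func):
    """Return {value: probability} for func(die1, die2) over two balanced dice."""
    outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
    counts = Counter(func(a, b) for a, b in outcomes)
    return {v: Fraction(c, len(outcomes)) for v, c in sorted(counts.items())}

dist_sum = distribution(lambda a, b: a + b)       # X = sum of the two outcomes
dist_product = distribution(lambda a, b: a * b)   # Y = product of the two outcomes

# Print the Y table in the same "k/36 = decimal" style as above.
for value, prob in dist_product.items():
    print(f"{value:2d}  {int(prob * 36)}/36 = {float(prob):.3f}")
```

Exact fractions are used so that the probabilities add to exactly 1 with no rounding error.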
PROBLEM 3.2
A fair coin is to be flipped four times.
Let X = the total number of heads
Give the probability distribution of X.
PROBLEM 3.3
You will flip a fair coin four times. If all four flips are heads, you will win $10. Otherwise, you win nothing.
Let W = the amount of money you win. Find the probability distribution of W. (Would you pay $1 to play this
game? Why or why not?)
PROBLEM 3.4
Two cards are randomly selected from a full deck of fifty-two, without replacement.
Let X = the total number of aces selected
Let Y = the total number of clubs selected
Use the Multiplication Rule from Chapter 2 to find the probability distributions of both X and Y.
Definition
The probability histogram of a random variable X is a graphical display of the possible values and probabilities.
The following probability histograms are for random variables from the previous example and problems:
• Left-skewed
• Right-skewed
• Symmetric
Sometimes we can derive a formula that gives probabilities for each possible value of a (discrete) random variable.
PROBLEM 3.5
You have a pair of balanced dice. You will roll the dice repeatedly until “doubles” comes up for the first time. Then
you stop. Let Y = the total number of times you roll the dice.
(a) What are the possible values of Y?
(b) What are the corresponding probabilities? Find a formula that gives P(Y).
PROBLEM 3.6
Suppose that 12% of the general population is left-handed. You will randomly select individuals, one at a time,
until you obtain a lefty. Let V = the total number of people you select. Find a formula that gives P(V).
Definition
The expected value of a random variable X is denoted by $\mu_X$, where
$\mu_X = \sum X \, P(X)$
It is a weighted average of the possible values, each one weighted by its own probability.
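As an illustration of the weighted average, here is a short Python sketch that applies the definition to X = the sum of two balanced dice. The function name `expected_value` and the closed-form shortcut for P(X) are our own choices; the shortcut simply reproduces the dice table from the earlier example.

```python
from fractions import Fraction

def expected_value(dist):
    """Sum of (value * probability) over a {value: probability} table."""
    return sum(x * p for x, p in dist.items())

# X = sum of two balanced dice: P(X = x) = (6 - |x - 7|) / 36 for x = 2, ..., 12
dice_sum = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

mu = expected_value(dice_sum)
print(mu)
```

Any probability distribution written as a two-column table can be passed to the same function.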
PROBLEM 3.7
Refer to the coin flipping example, where a coin is flipped four times. Let X = total number of heads. What is $\mu_X$?
PROBLEM 3.8
Refer to the example where a pair of balanced dice are to be rolled, where X = sum and Y = product.
(a) What is $\mu_X$? (b) What is $\mu_Y$?
- The Interpretation of $\mu_X$ -
Suppose that X is a random variable with respect to some experiment. If the experiment is performed repeatedly –
many, many times – then the average (i.e. mean) of the obtained values of X will be “very close” to the expected value
$\mu_X$ with “very high” probability. This is known as the Law of Large Numbers.
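The Law of Large Numbers can be seen in a simulation. The sketch below rolls a pair of simulated dice many times and averages the sums; the function name, the seed, and the particular numbers of rolls are arbitrary choices made for illustration.

```python
import random

def average_of_sums(num_rolls, seed=0):
    """Average of the sum of two simulated dice over num_rolls repetitions."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    total = sum(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(num_rolls))
    return total / num_rolls

# The average should settle near the expected value of the sum as rolls increase.
for n in (10, 1_000, 100_000):
    print(n, average_of_sums(n))
```

With only ten rolls the average can easily be off by a full point; with 100,000 rolls it is typically within a few hundredths of the expected value.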
PROBLEM 3.9
You will flip a fair coin four times. If all four flips are heads, you will win $10. Otherwise, you win nothing.
Let W = the amount of money you win. Find and interpret the expected value of W. Would you pay $1 to play
this game? Why or why not?
PROBLEM 3.10
You are playing roulette. Recall that there are thirty-eight colored possible outcomes: numbers 1 to 36, ‘0’, and ‘00’.
Of these:
18 are red
18 are black
2 are green (with numbers ‘0’ and ‘00’)
(a) You will wager $1.00 on the color of the outcome; if you bet one of the two “major” colors and win,
you get $2.00 back. Suppose that you will bet on red.
Let X = the amount of money that is handed back to you after the game. What is $\mu_X$?
(b) Suppose instead that you will bet on a number, say, ‘17’. If you win, you’ll get back $36.00.
Let Y = the amount of money that is handed to you by the dealer after the game. What is $\mu_Y$?
What about other betting strategies??? (betting on green, betting rows, columns, four-at-a-time, etc.)
Definition
The standard deviation of a random variable X is denoted by $\sigma_X$, where
$\sigma_X = \sqrt{\sum (X - \mu_X)^2 \, P(X)}$
This measures the extent to which the probability is concentrated around the expected value. The following chart shows
how the standard deviation is computed, using the random variable X = sum of two rolled balanced dice
  X     P(X)    X P(X)   $X - \mu_X$   $(X - \mu_X)^2$   $(X - \mu_X)^2 P(X)$
  2     0.028   0.056        -5             25                0.694
  3     0.056   0.167        -4             16                0.889
  4     0.083   0.333        -3              9                0.750
  5     0.111   0.556        -2              4                0.444
  6     0.139   0.833        -1              1                0.139
  7     0.167   1.167         0              0                0.000
  8     0.139   1.111         1              1                0.139
  9     0.111   1.000         2              4                0.444
 10     0.083   0.833         3              9                0.750
 11     0.056   0.611         4             16                0.889
 12     0.028   0.333         5             25                0.694
Total   1       $\mu_X = 7$                                   5.833
Standard deviation of X is $\sigma_X = \sqrt{5.833} = 2.42$
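The columns of the chart translate directly into a few lines of Python. This is a sketch under the same setup (X = sum of two balanced dice); the helper name `mean_and_sd` is ours.

```python
from fractions import Fraction
from math import sqrt

def mean_and_sd(dist):
    """Return (expected value, standard deviation) of a {value: probability} table."""
    mu = sum(x * p for x, p in dist.items())                     # column X P(X)
    variance = sum((x - mu) ** 2 * p for x, p in dist.items())   # last column
    return mu, sqrt(variance)

# X = sum of two balanced dice
dice_sum = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

mu, sigma = mean_and_sd(dice_sum)
print(mu, round(sigma, 2))
```

The total of the last column is 210/36, which is the 5.833 shown in the chart.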
To understand how to interpret $\sigma_X$, here are probability distributions for three different random variables: X, Y, Z.
The possible values are the same (3-9) and the expected value is the same (6). The only difference is the distribution of
the probabilities. The less concentrated the probabilities are around the expected value, the larger $\sigma_X$ is.
Suppose X is a randomly selected value from a population with mean $\mu$ and standard deviation $\sigma$. Then:
$\mu_X = \mu$ and $\sigma_X = \sigma$
Example
The following population consists of the ten players on a youth basketball team; each player’s age is recorded:
8 5 8 7 6 6 6 6 5 5
mean = $\mu$ = (8+5+8+7+6+6+6+6+5+5)/10 = 62/10 = 6.2 years
Now suppose that a player is selected at random, and let X = the age of the randomly selected player. What is $\mu_X$?
Solution:
 X    P(X)   X P(X)
 5    .3     1.5
 6    .4     2.4
 7    .1      .7
 8    .2     1.6
             6.2 = $\mu_X$
The expected value of X is simply the average age!!! (The same holds true for the standard deviation of X.)
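A quick numerical check of both claims, using the ten ages above. This is only a sketch; `statistics.pstdev` is used because the ten players are treated as the whole population rather than a sample.

```python
from collections import Counter
from math import sqrt
from statistics import mean, pstdev

ages = [8, 5, 8, 7, 6, 6, 6, 6, 5, 5]

# Probability distribution of X = age of one randomly selected player
dist = {age: count / len(ages) for age, count in Counter(ages).items()}

mu_x = sum(x * p for x, p in dist.items())
sigma_x = sqrt(sum((x - mu_x) ** 2 * p for x, p in dist.items()))

print(mu_x, mean(ages))        # expected value of X vs. population mean
print(sigma_x, pstdev(ages))   # SD of X vs. population standard deviation
```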
Before we continue, we must take care of some mathematical concepts: factorials and binomial coefficients.
Definition
For any positive whole number n, we define n! (read “n factorial”) to be the product of the first n positive whole numbers:
$n! = n \times (n-1) \times \cdots \times 2 \times 1$
Examples:
0! = 1 (defined specially)
1! = 1
2! = 2 x 1 = 2
3! = 3 x 2 x 1 = 6
4! = 4 x 3 x 2 x 1 = 24
5! = 5 x 4 x 3 x 2 x 1 = 120 … and so on …
Definition
For any positive whole number n and any whole number x with 0 ≤ x ≤ n, the binomial coefficient nCx is
${}_nC_x = \dfrac{n!}{x!\,(n-x)!}$
We say “n choose x”.
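The definition can be evaluated directly with factorials. The sketch below does that and compares the result with Python's built-in `math.comb`; the function name `n_choose_x` is our own.

```python
from math import comb, factorial

def n_choose_x(n, x):
    """Binomial coefficient computed straight from the factorial definition."""
    return factorial(n) // (factorial(x) * factorial(n - x))

# 5C2 = 5! / (2! 3!) = 120 / (2 * 6) = 10
print(n_choose_x(5, 2), comb(5, 2))
```

The integer division `//` is safe here because the quotient is always a whole number.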
PROBLEM 3.11
Solve: (a) 6C2 (b) 8C5 (c) 40C2 (d) 40C38 (e) 13C4 (f) 8C0
1. There are NCn ways to choose a sample of size n from a population of size N.
Example
Consider the eight students in the statistics class from Problem 7. If two students are to be selected (say, to give a
presentation), there are 8C2 = 28 different groups of size-2 (below right):
2. Suppose that there are a total of n objects of two different types; x objects of one type and (n-x) objects
of another type. There are nCx different ways to arrange these objects in a row.
Example
Suppose that you have 5 marbles; 3 are blue and 2 are green. There are 5C2 = 10 ways to arrange them in a row:
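Both counting interpretations can be confirmed by brute-force listing. In this sketch the students are simply numbered 0 through 7 and the marbles are written as the letters B and G; those labels are ours.

```python
from itertools import combinations, permutations

# Interpretation 1: groups of size 2 chosen from 8 students
groups = list(combinations(range(8), 2))

# Interpretation 2: distinct left-to-right arrangements of 3 blue and 2 green marbles
# (permutations() treats equal letters as different, so a set removes the repeats)
arrangements = sorted(set(permutations("BBBGG")))

print(len(groups), len(arrangements))
for row in arrangements:
    print("".join(row))
```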
Back to random variables: the Binomial Probability Formula
Consider an experiment that consists of a fixed, pre-determined number of repeated, independent trials.
Examples:
1) thirty rolls of a pair of dice
2) eighteen flips of a coin
3) a sample of fifty individuals off the street, one at a time
In each trial, something of interest to you either occurs (success) or does not occur (failure).
Examples:
1) “doubles” is rolled ….. “doubles” is not rolled
2) “heads” comes up ….. “heads” does not come up
3) a person with Type A blood is chosen ….. a person with Type A blood is not chosen
Let X = the total number of successes in the trials.
X is a special type of random variable called a Binomial random variable – we will derive a formula that will give
the probabilities for its possible values. Once this is done, all we have to do is recognize a particular random
variable as being a Binomial and we can just apply the formula to find all the probabilities we want!
The following experiment can be used as a model for all Binomial random variables.
Example
The pictured wheel will be spun four times. For each spin, P(S) = 1/3 = .333 and P(F) = 2/3 = .667.
Assume that the spins are independent. Let X = the total number of ‘S’s that come up.
OUTCOME   PROBABILITY
SSSS      (1/3)(1/3)(1/3)(1/3) = 1/81
SSSF      (1/3)(1/3)(1/3)(2/3) = 2/81
SSFS      (1/3)(1/3)(2/3)(1/3) = 2/81
SSFF      (1/3)(1/3)(2/3)(2/3) = 4/81
SFSS      (1/3)(2/3)(1/3)(1/3) = 2/81
SFSF      (1/3)(2/3)(1/3)(2/3) = 4/81
SFFS      (1/3)(2/3)(2/3)(1/3) = 4/81
SFFF      (1/3)(2/3)(2/3)(2/3) = 8/81
FSSS      (2/3)(1/3)(1/3)(1/3) = 2/81
FSSF      (2/3)(1/3)(1/3)(2/3) = 4/81
FSFS      (2/3)(1/3)(2/3)(1/3) = 4/81
FSFF      (2/3)(1/3)(2/3)(2/3) = 8/81
FFSS      (2/3)(2/3)(1/3)(1/3) = 4/81
FFSF      (2/3)(2/3)(1/3)(2/3) = 8/81
FFFS      (2/3)(2/3)(2/3)(1/3) = 8/81
FFFF      (2/3)(2/3)(2/3)(2/3) = 16/81
Total     81/81 = 1
P(X = 1) = P[(SFFF) or (FSFF) or (FFSF) or (FFFS)]
         = P(SFFF) + P(FSFF) + P(FFSF) + P(FFFS)
         = 8/81 + 8/81 + 8/81 + 8/81
         = 32/81 = .395
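The sixteen-row table can be generated mechanically, which also gives P(X = x) for every x at once. A minimal Python sketch follows; the variable names are our own.

```python
from fractions import Fraction
from itertools import product

spin = {"S": Fraction(1, 3), "F": Fraction(2, 3)}   # probabilities for one spin

dist = {}                                           # x -> P(X = x)
for outcome in product("SF", repeat=4):             # SSSS, SSSF, ..., FFFF
    prob = Fraction(1)
    for letter in outcome:                          # independent spins: multiply
        prob *= spin[letter]
    x = outcome.count("S")
    dist[x] = dist.get(x, 0) + prob                 # mutually exclusive outcomes: add

for x in sorted(dist):
    print(x, dist[x], round(float(dist[x]), 3))
```

Note that `Fraction` prints in lowest terms, so 24/81 appears as 8/27.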
But what if the number of trials (i.e. spins) is much larger than four?
Example
The same wheel from the previous page will be spun twelve times.
Again, let X = the total number of ‘S’s that come up.
Find P(X = 4).
Solution
NOTE: There are $2^{12} = 4096$ (yikes!) possible sequences of ‘S’s and ‘F’s. But it is not necessary to list them all.
OUTCOME        PROBABILITY
SSSSFFFFFFFF   $(1/3)^4 (2/3)^8$
SSSFSFFFFFFF   $(1/3)^4 (2/3)^8$
SSSFFSFFFFFF   $(1/3)^4 (2/3)^8$
SSFSFSFFFFFF   $(1/3)^4 (2/3)^8$
.              .
.              .
.              .
FFFFFFFFSSSS   $(1/3)^4 (2/3)^8$
The remaining question is ….. how many of these sequences with exactly 4 ‘S’s are there?
${}_{12}C_4 = \dfrac{12!}{4!\,8!} = 495$
Therefore, the probability of getting exactly 4 ‘S’s in twelve spins is equal to:
$P(X = 4) = {}_{12}C_4 \left(\tfrac{1}{3}\right)^4 \left(\tfrac{2}{3}\right)^8$
$= 495 \left(\tfrac{1}{3}\right)^4 \left(\tfrac{2}{3}\right)^8$
$= .2384$
Generally speaking: the wheel is spun n times, where for each spin P(S) = p (for any 0 < p < 1). Letting X = the total
number of ‘S’s that come up:
$P(X) = {}_nC_X \, p^X (1-p)^{n-X}$
for $X = 0, 1, 2, \ldots, n$
This is known as the Binomial Probability Formula.
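The formula is short enough to write as a single function. The sketch below (function name `binomial_pmf` is ours) checks it against the twelve-spin example, where n = 12 and p = 1/3.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials, each with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Twelve spins of the wheel with P(S) = 1/3: probability of exactly four S's
print(round(binomial_pmf(4, 12, 1/3), 4))

# The probabilities over x = 0, 1, ..., n should add up to 1 (up to rounding)
print(sum(binomial_pmf(x, 12, 1/3) for x in range(13)))
```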
Example
The pictured wheel will be spun seven times.
For each spin, P(S) = .1
P(F) = .9
Let X = the total number of ‘S’s that come up
(a) Find P(1 ≤ X ≤ 3)
(b) Give the probability distribution of X.
Solution
Here, n = 7 and p = .1, so plugging into the Binomial Probability formula we get
$P(X) = {}_7C_X \, (.1)^X (.9)^{7-X}$ for X = 0, 1, 2, 3, 4, 5, 6, 7
(a)
$P(1 \le X \le 3) = P(X = 1) + P(X = 2) + P(X = 3)$
$= {}_7C_1 (.1)^1 (.9)^6 + {}_7C_2 (.1)^2 (.9)^5 + {}_7C_3 (.1)^3 (.9)^4$
$= .372 + .124 + .023$
$= .519$ or 51.9% chance
(b)
 X    P(X)
 0    ${}_7C_0 (.1)^0 (.9)^7 = 1 \, (.1)^0 (.9)^7 = .478$
 1    ${}_7C_1 (.1)^1 (.9)^6 = 7 \, (.1)^1 (.9)^6 = .372$
 ⋮
      1.00
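The rest of the table follows the same pattern, one application of the formula per row. As a sketch, the whole distribution for n = 7 and p = .1 can be tabulated in Python as follows (variable names are ours):

```python
from math import comb

n, p = 7, 0.1

# One row per possible value of X, using the Binomial Probability Formula
dist = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

for x, px in dist.items():
    print(f"{x}  {px:.3f}")
print(f"total  {sum(dist.values()):.2f}")

# Part (a): add the rows for X = 1, 2, 3
part_a = dist[1] + dist[2] + dist[3]
print(f"P(1 <= X <= 3) = {part_a:.3f}")
```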
…. we may wish to find probabilities for the random number of successes. Call it X. Then:
$P(X) = {}_nC_X \, p^X (1-p)^{n-X}$
for $X = 0, 1, 2, \ldots, n$
Example
A fair coin is flipped fifteen times. What is the probability of getting exactly six “heads”?
Solution
Let X = the total number of heads. We want P(X = 6).
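One way the computation might be completed, applying the Binomial Probability Formula with n = 15 trials and p = .5 (the value of p follows from the coin being fair):

```latex
P(X = 6) = {}_{15}C_{6}\,(.5)^{6}(.5)^{9}
         = 5005\,(.5)^{15}
         = \frac{5005}{32768}
         \approx .1527
```

So there is roughly a 15.3% chance of exactly six heads in fifteen flips.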
Next we consider the more “important” problem of random sampling from a population.
Example
It has been estimated that 35% of all American families have a pet cat; assume this is true.
A sample of nine families is selected at random.
(a) What is the probability of selecting exactly five families that own cats?
(b) What is the probability of selecting at least two families that own cats?
Solution
Obtaining this sample amounts to performing nine (n = 9) consecutive independent trials. (why?)
For each selection, we either get a family with a cat (success) or we do not get a family with a cat (failure).
(b) $P(X \ge 2)$ = P(X = 2) + P(X = 3) + P(X = 4) + ….. + P(X = 9) (that’s a lot of work!)
= 1 – $P(X \le 1)$ (why?)
= 1 – [ P(X = 0) + P(X = 1) ]
= 1 – [ ${}_9C_0 (.35)^0 (.65)^9 + {}_9C_1 (.35)^1 (.65)^8$ ]
= 1 – [ .021 + .100 ]
= 1 – .121
= .879 or 87.9% chance
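The complement shortcut in part (b) can be checked against the "lot of work" direct sum, and part (a) can be evaluated with the same formula. This is a sketch assuming, as in the solution, that X is Binomial with n = 9 and p = .35; the function name `binomial_pmf` is ours.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 9, 0.35

# (a) exactly five families with cats
exactly_five = binomial_pmf(5, n, p)

# (b) at least two families with cats, computed two ways
direct = sum(binomial_pmf(x, n, p) for x in range(2, n + 1))
complement = 1 - (binomial_pmf(0, n, p) + binomial_pmf(1, n, p))

print(round(exactly_five, 3), round(direct, 3), round(complement, 3))
```

The two answers for (b) agree because the events "X ≥ 2" and "X ≤ 1" together cover every possible value of X exactly once.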
Suppose that a proportion p of a (large!) population has a certain characteristic, and that a random sample of n individuals is selected. What is the
probability that exactly X of them have the characteristic?
$P(X) = {}_nC_X \, p^X (1-p)^{n-X}$
for $X = 0, 1, 2, \ldots, n$
X is a Binomial (n,p) random variable!
PROBLEM 3.12
Suppose that 42% of the students at a large university commute to class. If thirteen students are randomly selected,
find the probability that
(a) exactly three are commuters
(b) exactly nine of them are commuters
(c) at least two of them are commuters
The expected value and standard deviation of a Binomial random variable
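For any discrete random variable X:
$\mu_X = \sum X \, P(X)$
$\sigma_X = \sqrt{\sum (X - \mu_X)^2 \, P(X)}$
For a Binomial (n,p) random variable, these simplify to:
$\mu_X = np$
$\sigma_X = \sqrt{np(1-p)}$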
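The shortcut formulas can be compared with the general definitions on a concrete case. The sketch below uses the twelve-spin wheel (n = 12, p = 1/3) as the test case; that choice, and the variable names, are ours.

```python
from fractions import Fraction
from math import comb, sqrt

n, p = 12, Fraction(1, 3)

# Full Binomial(n, p) distribution from the Binomial Probability Formula
dist = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# General definitions
mu = sum(x * px for x, px in dist.items())
variance = sum((x - mu) ** 2 * px for x, px in dist.items())

# Compare with the shortcuts np and sqrt(np(1 - p))
print(mu, n * p)
print(sqrt(variance), sqrt(n * p * (1 - p)))
```

Exact fractions make the agreement exact rather than approximate: both routes give a mean of 4 and a variance of 8/3.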
PROBLEM 3.13
Refer to the example on the preceding page, where we assumed that 35% of all American families own cats. If nine
families are selected at random, what is the expected value of the number of families that have pet cats?
PROBLEM 3.14
You will roll a pair of dice 1000 times. What is the expected value of the total number of “snake-eyes” rolled?
NOTE: Pascal’s Triangle – the arrangement where each number is the sum of the two above it - is composed of
binomial coefficients! (It is pictured below up to 11 rows; it goes on indefinitely.)
Each row of the Triangle gives the binomial coefficients nCx for a fixed n and x = 0, 1, 2, 3, …. , n.
(Recall the previous example when we found the probability distribution of X = number of ‘S’s that come up in n=7
spins with probability of success p =.1.) All of the binomial coefficients can be found in a row of the Triangle:
(n=0) 1
n=1 1 1
n=2 1 2 1
n=3 1 3 3 1
n=4 1 4 6 4 1
n=5 1 5 10 10 5 1
n=6 1 6 15 20 15 6 1
n=7 1 7 21 35 35 21 7 1
n=8 1 8 28 56 70 56 28 8 1
n=9 1 9 36 84 126 126 84 36 9 1
n=10 1 10 45 120 210 252 210 120 45 10 1
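The "sum of the two above it" rule is easy to turn into a short program. In this sketch (the generator name `pascal_rows` is ours), each row is built from the previous one and then compared with the binomial coefficients from `math.comb`.

```python
from math import comb

def pascal_rows(num_rows):
    """Yield the first num_rows rows of Pascal's Triangle, starting with n = 0."""
    row = [1]
    for _ in range(num_rows):
        yield row
        # each interior entry is the sum of the two entries above it
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]

for n, row in enumerate(pascal_rows(11)):
    assert row == [comb(n, x) for x in range(n + 1)]   # row n holds nC0, ..., nCn
    print(f"n={n}:", *row)
```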
Binomial random variables and bell curves
Here are three examples of binomial random variables where the number of trials is “large” (n=50 for each); the
only difference is the probability of success (p):
Example 1
Suppose that 35% of all families have pet cats.
For 50 randomly selected families, let X = number of families in sample that have pet cats
Then:
$P(X) = {}_{50}C_X \, (.35)^X (1 - .35)^{50-X}$
$\mu_X = 50(.35) = 17.5$
Example 2
Suppose that there is a 50% chance that a newborn baby is female.
For 50 randomly selected births, let X = number of females born
Then:
$P(X) = {}_{50}C_X \, (.50)^X (1 - .50)^{50-X}$
$\mu_X = 50(.50) = 25$
Example 3
Suppose that a football kicker has an 86% chance of making an extra-point attempt.
For 50 attempts, let X = number of extra points made
Then:
$P(X) = {}_{50}C_X \, (.86)^X (1 - .86)^{50-X}$
$\mu_X = 50(.86) = 43$
[Figure: probability histograms for Ex. 1, Ex. 2, and Ex. 3]
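To get a feel for where each of the three histograms is centered and how wide it is, the distributions can be computed directly. The sketch below summarizes each one by its expected value, standard deviation, and most likely value; that choice of summary, and the names used, are ours.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a Binomial(n, p) random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n = 50
summary = {}   # p -> (expected value, standard deviation, most likely value)
for p in (0.35, 0.50, 0.86):
    probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
    mode = probs.index(max(probs))            # tallest bar of the histogram
    summary[p] = (n * p, sqrt(n * p * (1 - p)), mode)
    print(f"p = {p:.2f}: mean = {n * p:.1f}, SD = {summary[p][1]:.2f}, peak at X = {mode}")
```

In each case the tallest bar sits next to the expected value np, which is where the bell shape is centered.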