Probability Distributions
Random Variable
• A random variable x takes on a defined set of
values with different probabilities.
• For example, if you roll a die, the outcome is random (not
fixed) and there are 6 possible outcomes, each of which
occurs with probability one-sixth.
• For example, if you poll people about their voting
preferences, the percentage of the sample that responds
“Yes on Proposition 100” is also a random variable (the
percentage will be slightly different every time you poll).
[Figure: bar chart of p(x) for x = 1, 2, ..., 6, with every bar at height 1/6]
$\sum_{\text{all } x} p(x) = 1$
Probability mass function (pmf)
x p(x)
1 p(x=1)=1/6
2 p(x=2)=1/6
3 p(x=3)=1/6
4 p(x=4)=1/6
5 p(x=5)=1/6
6 p(x=6)=1/6
1.0
Cumulative distribution function
(CDF)
[Figure: step plot of the cumulative probability P(x) for x = 1, 2, ..., 6, rising from 1/6 to 1.0 in steps of 1/6]
Cumulative distribution
function
x P(x≤A)
1 P(x≤1)=1/6
2 P(x≤2)=2/6
3 P(x≤3)=3/6
4 P(x≤4)=4/6
5 P(x≤5)=5/6
6 P(x≤6)=6/6
Examples
1. What’s the probability that you roll a 3 or less?
P(x≤3)=1/2
12 .25
1.0
Answer (b)
b. f(x) = (3-x)/2 for x = 1, 2, 3, 4
x f(x)
1 (3-1)/2 = 1.0
2 (3-2)/2 = .5
3 (3-3)/2 = 0
4 (3-4)/2 = -.5
Though this sums to 1, you can’t have a negative probability;
therefore, it’s not a probability function.
Answer (c)
c. f(x) = (x^2 + x + 1)/25 for x = 0, 1, 2, 3
x f(x)
0 1/25
1 3/25
2 7/25
3 13/25
24/25
Doesn’t sum to 1. Thus, it’s not a probability function.
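The two conditions used in answers (b) and (c), no negative values and a total of exactly 1, can be checked mechanically. The following is a minimal sketch; the helper name `is_probability_function` is an illustrative choice, not something from the slides, and exact fractions are used so the sum is not blurred by floating-point rounding.

```python
from fractions import Fraction

def is_probability_function(f, support):
    """Return True if f(x) >= 0 for every x in support and the values sum to 1."""
    values = [Fraction(f(x)) for x in support]
    return all(v >= 0 for v in values) and sum(values) == 1

# Answer (b): sums to 1, but f(4) is negative.
f_b = lambda x: Fraction(3 - x, 2)
# Answer (c): every value is nonnegative, but the total is 24/25.
f_c = lambda x: Fraction(x**2 + x + 1, 25)
# The fair die from the pmf table: six values of 1/6.
f_die = lambda x: Fraction(1, 6)

print(is_probability_function(f_b, [1, 2, 3, 4]))   # False
print(is_probability_function(f_c, [0, 1, 2, 3]))   # False
print(is_probability_function(f_die, range(1, 7)))  # True
```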
Practice Problem:
The number of ships to arrive at a harbor on any given day is a
random variable represented by x. The probability distribution
for x is:
x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1
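Continuous case: “probability density function” (pdf)
$p(x) = e^{-x}$ for $x \ge 0$
The total area under the curve is 1:
$\int_0^\infty e^{-x}\,dx = \left[-e^{-x}\right]_0^\infty = 0 + 1 = 1$
[Figure: plot of p(x) = e^{-x} against x, with the area between x = 1 and x = 2 shaded]
$P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left[-e^{-x}\right]_1^2 = -e^{-2} + e^{-1} = -.135 + .368 = .23$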
Cumulative distribution
function
As in the discrete case, we can specify the “cumulative
distribution function” (CDF):
$P(x \le A) = \int_0^A e^{-x}\,dx = \left[-e^{-x}\right]_0^A = -e^{-A} - (-e^{0}) = -e^{-A} + 1 = 1 - e^{-A}$
Example
[Figure: plot of p(x) = e^{-x} with the area to the left of x = 2 shaded]
$P(x \le 2) = 1 - e^{-2} = 1 - .135 = .865$
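For the exponential density with parameter 1 used in this example, the closed-form CDF can be compared against a direct numerical integration of the density. This is a rough sketch: the midpoint-rule helper and all function names are assumptions made for illustration.

```python
import math

def exp_pdf(x):
    """Density p(x) = e^(-x) for x >= 0."""
    return math.exp(-x)

def exp_cdf(a):
    """Closed-form CDF: P(x <= a) = 1 - e^(-a)."""
    return 1.0 - math.exp(-a)

def midpoint_integral(f, lo, hi, steps=10_000):
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(f(lo + (k + 0.5) * width) for k in range(steps)) * width

p_closed_form = exp_cdf(2) - exp_cdf(1)           # P(1 <= x <= 2) from the CDF
p_numerical = midpoint_integral(exp_pdf, 1, 2)    # same area, integrated numerically

print(round(p_closed_form, 3))  # 0.233
print(round(p_numerical, 3))    # 0.233
print(round(exp_cdf(2), 3))     # 0.865
```

The two routes agree because the CDF is, by definition, the accumulated area under the density.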
Example 2: Uniform
distribution
The uniform distribution: all values are equally likely
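$p(x) = 1$ for $0 \le x \le 1$
[Figure: plot of p(x) = 1 between x = 0 and x = 1]
$\int_0^1 1\,dx = x\Big|_0^1 = 1 - 0 = 1$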
Example: Uniform distribution
What’s the probability that x is between ¼ and ½?
[Figure: plot of p(x) = 1 with the area between x = ¼ and x = ½ shaded]
P(¼ ≤ x ≤ ½) = ¼
Practice Problem
4. Suppose that survival drops off rapidly in the year following diagnosis of a
certain type of advanced cancer. Suppose that the length of survival (or
time-to-death) is a random variable that approximately follows an
exponential distribution with parameter 2 (makes it a steeper drop off):
$p(x) = 2e^{-2x}$
[note: $\int_0^\infty 2e^{-2x}\,dx = \left[-e^{-2x}\right]_0^\infty = 0 + 1 = 1$]
$P(x > 1) = 1 - (1 - e^{-2(1)}) = .135$
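The probability of surviving past one year under an exponential distribution with parameter 2 can also be approximated by simulation. The sketch below assumes the standard library's `random.expovariate` as the sampler; the seed and sample size are arbitrary choices made only so the run is repeatable.

```python
import math
import random

RATE = 2.0  # exponential parameter from the practice problem

def survival(t, rate=RATE):
    """P(x > t) = 1 - CDF(t) = e^(-rate * t)."""
    return math.exp(-rate * t)

rng = random.Random(0)  # fixed seed (arbitrary) for a repeatable run
draws = [rng.expovariate(RATE) for _ in range(200_000)]
simulated = sum(d > 1.0 for d in draws) / len(draws)

print(round(survival(1.0), 3))  # 0.135
print(simulated)                # close to 0.135, varies with the seed
```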
Expected Value and Variance
[Figure: distribution curve marking the mean (µ) and one standard deviation from the mean (σ)]
Expected value, or mean
x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1
Discrete case:
$E(X) = \sum_{\text{all } x} x_i\,p(x_i)$
Continuous case:
$E(X) = \int_{\text{all } x} x\,p(x)\,dx$
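Applied to the ship-arrival distribution in the table above, the discrete formula is a probability-weighted sum. A minimal sketch follows; the dictionary layout and the function name `expected_value` are illustrative assumptions.

```python
def expected_value(dist):
    """Discrete expected value: sum of x * p(x) over all x."""
    return sum(x * p for x, p in dist.items())

# Ship arrivals per day: value -> probability (from the table)
ships = {10: 0.4, 11: 0.2, 12: 0.2, 13: 0.1, 14: 0.1}

print(round(expected_value(ships), 2))  # 11.3
```

So the harbor expects 10(.4) + 11(.2) + 12(.2) + 13(.1) + 14(.1) = 11.3 ships on an average day.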
Empirical Mean is a special case of
Expected Value…
$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} = \sum_{i=1}^{n} x_i\left(\frac{1}{n}\right)$
Extension to continuous case:
uniform distribution
[Figure: plot of p(x) = 1 between x = 0 and x = 1]
$E(X) = \int_0^1 x(1)\,dx = \frac{x^2}{2}\Big|_0^1 = \frac{1}{2} - 0 = \frac{1}{2}$
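The continuous expected value of the uniform distribution on [0, 1] can be approximated numerically by summing x * p(x) * dx over many thin slices. This is a sketch under the assumption that a simple midpoint rule is accurate enough here; the function name is hypothetical.

```python
def expected_value_continuous(pdf, lo, hi, steps=10_000):
    """Approximate the integral of x * pdf(x) over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width   # midpoint of the k-th slice
        total += x * pdf(x) * width  # x * p(x) * dx
    return total

uniform_pdf = lambda x: 1.0  # p(x) = 1 on [0, 1]
mean_uniform = expected_value_continuous(uniform_pdf, 0.0, 1.0)

print(round(mean_uniform, 4))  # 0.5
```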
Symbol Interlude
E(X) = µ
these symbols are used interchangeably
Expected Value
Expected value is an extremely useful
concept for good decision-making!
Example: the lottery
The Lottery (also known as a tax on people
who are bad at math…)
A certain lottery works by picking 6 numbers
from 1 to 49. It costs $1.00 to play the
lottery, and if you win, you win $2 million
after taxes.
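“49 choose 6”: out of 49 numbers, this is the number of distinct combinations of 6.
$\binom{49}{6} = \frac{49!}{43!\,6!} = 13{,}983{,}816$
$P(\text{win}) = \frac{1}{13{,}983{,}816} = 7.2 \times 10^{-8}$
The probability function (note, sums to 1.0):
x$ p(x)
-1 .999999928
+2 million 7.2 x 10^-8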
Expected Value
E(X) = P(win)*$2,000,000 + P(lose)*(-$1.00)
= 2.0 x 10^6 * 7.2 x 10^-8 + .999999928 (-1) = .144 - .999999928 = -$.86
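The lottery arithmetic can be reproduced in a few lines. The sketch below assumes Python 3.8 or later for `math.comb`; the variable names are illustrative.

```python
import math

combos = math.comb(49, 6)   # number of distinct 6-number combinations
p_win = 1 / combos          # one winning combination
p_lose = 1 - p_win

# Net outcomes as used in the calculation above: +$2,000,000 on a win, -$1 on a loss
expected_value = p_win * 2_000_000 + p_lose * (-1)

print(combos)                    # 13983816
print(round(expected_value, 2))  # -0.86
```

On average a player loses about 86 cents of every dollar ticket.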
A roulette wheel has the numbers 1 through 36, as well as 0 and 00. If you
bet $1 that an odd number comes up, you win or lose $1 according to
whether or not that event occurs. If random variable X denotes your net
gain, X=1 with probability 18/38 and X= -1 with probability 20/38.
E(X) = 1(18/38) – 1 (20/38) = -$.053
On average, the casino wins (and the player loses) 5 cents per game.
The casino rakes in even more if the stakes are higher:
E(X) = 10(18/38) – 10 (20/38) = -$.53
If the cost is $10 per game, the casino wins an average of 53 cents per game.
If 10,000 games are played in a night, that’s a cool $5300.
A few notes about Expected Value as a
mathematical operator:
E(cX)=cE(X)
Example: If the casino charges $10 per game instead of $1,
then the casino expects to make 10 times as much on average
from the game (See roulette example above!)
E(c + X)=c + E(X)
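Both operator rules can be confirmed exactly on the roulette bet. This is a minimal sketch using exact fractions; the shift constant 5 is an arbitrary value chosen for illustration, not something from the slides.

```python
from fractions import Fraction

def expected_value(dist):
    """Discrete expected value: sum of x * p(x) over all x."""
    return sum(x * p for x, p in dist.items())

# $1 bet on odd: win $1 with probability 18/38, lose $1 with probability 20/38
roulette = {1: Fraction(18, 38), -1: Fraction(20, 38)}

scaled = {10 * x: p for x, p in roulette.items()}   # cX with c = 10 (the $10 game)
shifted = {5 + x: p for x, p in roulette.items()}   # c + X with c = 5 (hypothetical)

e_x = expected_value(roulette)
print(e_x, float(e_x))                        # -1/19, about -0.053
print(expected_value(scaled) == 10 * e_x)     # True: E(cX) = cE(X)
print(expected_value(shifted) == 5 + e_x)     # True: E(c + X) = c + E(X)
```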
$E(x) = \sum_{i=1}^{10} i\left(\frac{1}{10}\right) = (.1)\sum_{i=1}^{10} i = (.1)\frac{10(10+1)}{2} = 55(.1) = 5.5$
Expected value isn’t
everything though…
Take the show “Deal or No Deal”
Everyone know the rules?
Let’s say you are down to two cases left. $1
and $400,000. The banker offers you
$200,000.
So, Deal or No Deal?
Deal or No Deal…
This could really be represented as a
probability distribution and a non-
random variable:
x$ p(x)
+1 .50
+$400,000 .50
x$ p(x)
+$200,000 1.0
Expected value doesn’t help…
x$ p(x)
+1 .50
+$400,000 .50
x$ p(x)
+$200,000 1.0
No deal: E(X) = 1(.50) + 400,000(.50) ≈ 200,000
Deal: E(X) = 200,000(1.0) = 200,000
How to decide?
Variance!
• If you take the deal, the variance/standard
deviation is 0.
• If you don’t take the deal, what is the average
deviation from the mean?
• What’s your gut guess?
Variance/standard deviation
“The average (expected) squared
distance (or deviation) from the mean”
$Var(x) = \sigma^2 = E[(x - \mu)^2]$
Discrete case:
$Var(X) = \sigma^2 = \sum_{\text{all } x} (x_i - \mu)^2\,p(x_i)$
Continuous case:
$Var(X) = \sigma^2 = \int_{\text{all } x} (x - \mu)^2\,p(x)\,dx$
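Applying the discrete definition to the two Deal or No Deal options shows how far apart they are in spread even though their means nearly coincide. This is a sketch; the helper names `mean` and `variance` are assumptions made for illustration.

```python
import math

def mean(dist):
    """Discrete expected value: sum of x * p(x)."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Expected squared deviation from the mean: sum of (x - mu)^2 * p(x)."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

no_deal = {1: 0.5, 400_000: 0.5}  # two cases left, equally likely
deal = {200_000: 1.0}             # the banker's offer, a sure thing

sd_no_deal = math.sqrt(variance(no_deal))
sd_deal = math.sqrt(variance(deal))

print(mean(no_deal), sd_no_deal)  # mean about 200,000; standard deviation about 200,000
print(mean(deal), sd_deal)        # mean 200,000; standard deviation 0
```

Refusing the deal keeps roughly the same expected payoff but puts the outcome about $200,000 above or below it.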
Similarity to empirical variance
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \left(\frac{1}{n-1}\right)$
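For the one-dollar roulette bet (µ = -.053):
$\sigma^2 = \sum_{\text{all } x} (x_i - \mu)^2\,p(x_i) = (1 + .053)^2(18/38) + (-1 + .053)^2(20/38) = .997$
$\sigma = \sqrt{\sum_{\text{all } x} (x_i - \mu)^2\,p(x_i)} = \sqrt{.997} = .99$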
Standard deviation is $.99. Interpretation: On average, you’re
either 1 dollar above or 1 dollar below the mean, which is just
under zero. Makes sense!
Handy calculation formula!
$Var(X) = \sum_{\text{all } x} (x_i - \mu)^2\,p(x_i) = \sum_{\text{all } x} x_i^2\,p(x_i) - \mu^2 = E(x^2) - [E(x)]^2$
Proofs:
E(x-µ)^2 = E(x^2 – 2µx + µ^2) remember “FOIL”?!
= E(x^2) – E(2µx) + E(µ^2) Use rules of expected value: E(X+Y) = E(X) + E(Y)
= E(x^2) – 2µE(x) + µ^2 E(c) = c
= E(x^2) – 2µµ + µ^2 E(x) = µ
= E(x^2) – µ^2
= E(x^2) – [E(x)]^2
OR, equivalently:
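$E(x-\mu)^2 = \sum_{\text{all } x} (x-\mu)^2\,p(x) = \sum_{\text{all } x} (x^2 - 2\mu x + \mu^2)\,p(x)$
$= \sum_{\text{all } x} x^2 p(x) - 2\mu \sum_{\text{all } x} x\,p(x) + \mu^2 \sum_{\text{all } x} p(x) = E(x^2) - 2\mu E(x) + \mu^2(1)$
$= E(x^2) - 2\mu^2 + \mu^2(1) = E(x^2) - \mu^2$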
For example, what’s the variance and
standard deviation of the roll of a die?
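x p(x)
1 p(x=1)=1/6
2 p(x=2)=1/6
3 p(x=3)=1/6
4 p(x=4)=1/6
5 p(x=5)=1/6
6 p(x=6)=1/6
1.0
[Figure: bar chart of p(x) for x = 1, ..., 6, each bar at height 1/6, with the mean and the average distance from the mean marked]
$E(x) = \sum_{\text{all } x} x_i\,p(x_i) = (1)(\tfrac{1}{6}) + 2(\tfrac{1}{6}) + 3(\tfrac{1}{6}) + 4(\tfrac{1}{6}) + 5(\tfrac{1}{6}) + 6(\tfrac{1}{6}) = \tfrac{21}{6} = 3.5$
$E(x^2) = \sum_{\text{all } x} x_i^2\,p(x_i) = (1)(\tfrac{1}{6}) + 4(\tfrac{1}{6}) + 9(\tfrac{1}{6}) + 16(\tfrac{1}{6}) + 25(\tfrac{1}{6}) + 36(\tfrac{1}{6}) = 15.17$
$\sigma^2 = E(x^2) - [E(x)]^2 = 15.17 - 3.5^2 = 2.92$
$\sigma = \sqrt{2.92} = 1.71$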
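The die calculation can be done with exact fractions, which also confirms that the definition of variance and the handy calculation formula E(x^2) - [E(x)]^2 give the same number. A minimal sketch with illustrative variable names:

```python
import math
from fractions import Fraction

die = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die pmf

mu = sum(x * p for x, p in die.items())                        # E(x)
e_x2 = sum(x**2 * p for x, p in die.items())                   # E(x^2)
var_definition = sum((x - mu) ** 2 * p for x, p in die.items())
var_shortcut = e_x2 - mu**2

print(mu, e_x2)                               # 7/2 91/6
print(var_definition, var_shortcut)           # 35/12 35/12
print(round(math.sqrt(var_definition), 2))    # 1.71
```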
Var(c+X) = Var(X)
Adding a constant to every instance of a random variable
doesn’t change the variability. It just shifts the whole
distribution by c. If everybody grew 5 inches suddenly, the
variability in the population would still be the same.
Var(cX) = c^2 Var(X)
Multiplying each instance of the random variable by c makes the
distribution c times as wide, which corresponds to c^2 times as
much variance (deviation squared). For example, if everyone
suddenly became twice as tall, there’d be twice the deviation
and 4 times the variance in heights in the population.
Var(X+Y)= Var(X) + Var(Y)
Var(X+Y) = Var(X) + Var(Y) ONLY IF X and Y are
independent!
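The three variance rules can be checked exactly on a fair die. In the sketch below the constant c = 5 and the helper name `variance` are illustrative assumptions; the last case sets Y equal to X to show what happens when the two variables are not independent.

```python
from fractions import Fraction
from itertools import product

def variance(dist):
    """Expected squared deviation from the mean of a discrete distribution."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())

die = {x: Fraction(1, 6) for x in range(1, 7)}
c = 5  # arbitrary constant for illustration

shifted = {c + x: p for x, p in die.items()}  # c + X
scaled = {c * x: p for x, p in die.items()}   # cX

# X + Y for two independent dice: joint probability is the product p(x) * p(y)
independent_sum = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    independent_sum[x + y] = independent_sum.get(x + y, 0) + px * py

# X + Y when Y is the same roll as X (fully dependent): the sum is 2X
dependent_sum = {2 * x: p for x, p in die.items()}

print(variance(shifted) == variance(die))              # True
print(variance(scaled) == c**2 * variance(die))        # True
print(variance(independent_sum) == 2 * variance(die))  # True
print(variance(dependent_sum) == 2 * variance(die))    # False: it is 4 * Var(X)
```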
x 10 11 12 13 14
P(x) .4 .2 .2 .1 .1
Answer: variance and std dev
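$E(x^2) = \sum_{i=1}^{5} x_i^2\,p(x_i) = (100)(.4) + (121)(.2) + 144(.2) + 169(.1) + 196(.1) = 129.5$
$E(x) = 10(.4) + 11(.2) + 12(.2) + 13(.1) + 14(.1) = 11.3$
$Var(x) = E(x^2) - [E(x)]^2 = 129.5 - 11.3^2 = 1.81$
$stddev(x) = \sqrt{1.81} = 1.35$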
$\sigma_{xy} = \sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)\,P(x_i, y_i)$
The Sample Covariance
The sample covariance:
$cov(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{X})(y_i - \bar{Y})}{n-1}$
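The sample covariance formula translates directly into code. The sketch below uses made-up data (`xs`, `ys`) purely for illustration, and the function name `sample_covariance` is an assumption.

```python
def sample_covariance(xs, ys):
    """Sum of (x_i - x_bar)(y_i - y_bar), divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical paired observations: ys rises whenever xs rises
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

print(sample_covariance(xs, ys))        # 5.0
print(sample_covariance(xs, ys[::-1]))  # -5.0 (ys falls as xs rises)
```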
Interpreting Covariance
Covariance between two random
variables: