Week05-06 EC With Annotations
1
Topics
Lecture Summary
2
Definitions
A random variable is a function that associates a
real number with each element in the sample space
of a statistical experiment.
Common Notation for representing random
variables:
X denotes a random variable
x denotes one of its values.
3
Random Variable Types
Random Variable
Represents a possible numerical value from a
random event
Takes on different values based on chance
[Diagram: Random Variables branch into Discrete Random Variable and Continuous Random Variable]
4
Discrete Random Variable
A discrete random variable is a variable that
can assume only a countable (e.g., finite)
number of values
Many possible outcomes:
number of complaints per day
number of TV’s in a household
number of rings before the phone is answered
Only two possible outcomes:
gender: male or female
defective: yes or no
spreads peanut butter first vs. spreads jelly
first
5
Continuous Random Variable
A continuous random variable is a variable that
can assume an uncountably infinite number of values
(any value within an interval)
thickness of an item
time required to complete a task
temperature of a solution
height, in inches
Outcomes     x Value   Probability
T T          0         1/4 = .25
T H, H T     1         2/4 = .50
H H          2         1/4 = .25
[Figure: bar chart of Probability (.25, .50, .25) against x = 0, 1, 2]
8
Discrete Probability Distribution
A list of all possible [ xi , P(xi) ] pairs
xi = Value of Random Variable (Outcome)
P(xi) = Probability Associated with Value
xi’s are mutually exclusive
(no overlap)
xi’s are collectively exhaustive
(nothing left out)
0 ≤ P(xi) ≤ 1 for each xi
Σ P(xi) = 1
9
Definitions
Discrete Sample Space: If a sample space contains a finite
number of possibilities or an unending sequence with as
many elements as there are whole numbers, it is called a
discrete sample space.
Example: Coins in the Jar Experiment from Chapter 2.
10
Relationship between Sample Space
and Random Variables
Statistical Experiment
Generates outcomes
Sample Space
Outcomes are defined in the sample space
Sample space can be discrete or continuous
Random Variable
Assigns a real number to outcomes in the sample space and is
discrete (continuous) if the sample space is discrete
(continuous).
[Diagram: Statistical Experiment → Sample Space (Discrete or Continuous) → Discrete R.V. or Continuous R.V.]
11
Associating a Random Variable with the
outcomes of a statistical experiment
Example: Consider an experiment where we flip a coin
twice. The possible outcomes from the experiment are as
follows
S = {HH, HT, TH, TT}
If we define a random variable X to be the total number of
heads observed, X can take on the possible values 0, 1, 2. The
function X(s) is shown in the table below.
s     X(s) = x
HH    2
HT    1
TH    1
TT    0
X is a discrete random variable because…..
12
Example
A stockroom clerk returns three safety helmets at
random to three steel mill employees who had
previously checked them.
If Smith, Jones, and Brown, in that order, receive one
of the three hats, list the sample points for the possible
orders of returning the helmets, and find the value m of
random variable M that represents the number of
correct matches.
Note: Try to link material from chapter 2 and
determine how many points should be in the sample
space. Then list all of the elements. 13
Sample Space
Number of points in the sample space is 3!=6
If S, J, and B stand for Smith’s, Jones’ and Brown’s
helmets, respectively, then the sample space is:
(SJB) (SBJ) (JSB) (JBS) (BSJ) (BJS)
Hence: Sample Space m
SJB 3
SBJ 1
JSB 1
JBS 0
BSJ 0
BJS 1
14
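The helmet table can be checked by brute force. The following is a minimal Python sketch, not part of the original slides; the names `employees`, `matches`, and `sample_space` are illustrative assumptions.

```python
from itertools import permutations

# Employees in the fixed receiving order: Smith, Jones, Brown.
employees = ("S", "J", "B")

def matches(order):
    """Count positions where the returned helmet belongs to the employee."""
    return sum(helmet == owner for helmet, owner in zip(order, employees))

# Map each sample point (an ordering of the helmets) to its value m.
sample_space = {"".join(p): matches(p) for p in permutations(employees)}

for point, m in sample_space.items():
    print(point, m)
```

The dictionary has 3! = 6 entries, one per sample point, and each value is the number of correct matches m.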
Distributions
Probability distributions are functions used to
evaluate the probability that a certain value or set
of values associated with a random variable
occurs.
Distributions associated with discrete random
variables:
Probability Mass Function (PMF)
Distributions associated with continuous random
variables
Probability Density function (PDF)
16
Properties of Distributions
1. The probability of random variable X taking on a
value x (or set of values) is non-negative.
We can never have a negative probability.
2. The total probability over the entire set of
values is equal to 1.
A probability can never be greater than 1.
3. Calculating probabilities
Discrete distributions – Summation
Continuous distributions – Integration
17
Discrete Distributions
A discrete r.v. assumes each of its values with a
certain probability.
Typically write probabilities associated with r.v. X
as f(x).
This function f(x) is called a probability mass
function or probability distribution of the discrete
random variable X.
The function evaluated at x yields P(X=x)
18
Discrete Probability Distributions
The set of ordered pairs (x, f(x)) is a probability function,
probability mass function, or probability distribution of
the discrete random variable X if, for each possible
outcome x,
1. $f(x) \ge 0$
2. $\sum_x f(x) = 1$
3. $P(X = x) = f(x)$
20
Using P(A)=n/N to define the
probability distribution
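\[ f(0) = P(X = 0) = \frac{\binom{3}{0}\binom{5}{2}}{\binom{8}{2}} = \frac{10}{28} \]
\[ f(1) = P(X = 1) = \frac{\binom{3}{1}\binom{5}{1}}{\binom{8}{2}} = \frac{15}{28} \]
\[ f(2) = P(X = 2) = \frac{\binom{3}{2}\binom{5}{0}}{\binom{8}{2}} = \frac{3}{28} \]
21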
Thus, the probability distribution of X is:
x      0       1       2
f(x)   10/28   15/28   3/28
22
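The counting rule P(A) = n/N can be evaluated directly. This is a minimal sketch: the slide stating the problem is not in this extract, so the group sizes 3 and 5 and the sample size 2 are read off the binomial coefficients, and the parameter names `special`, `other`, and `draws` are hypothetical.

```python
from fractions import Fraction
from math import comb

def f(x, special=3, other=5, draws=2):
    """P(X = x): choose x of the 3 items and draws - x of the 5 items."""
    favourable = comb(special, x) * comb(other, draws - x)  # n
    total = comb(special + other, draws)                    # N
    return Fraction(favourable, total)

pmf = {x: f(x) for x in range(3)}
for x, p in pmf.items():
    print(x, p)
```

`Fraction` prints reduced values, so 10/28 appears as 5/14; the probabilities still sum to 1.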
Cumulative Distribution Function
(CDF)
For every PMF we can define a CDF, which gives us
$P(X \le x)$.
The cumulative distribution F(x) of a discrete random
variable X with probability distribution f(x) is
\[ F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad -\infty < x < \infty \]
23
Example
Suppose we have the following pmf:
x f(x)
1 .4
2 .3
3 .2
4 .1
Then
\[ F(1) = P(X \le 1) = \sum_{t \le 1} f(t) = f(1) = 0.4 \]
\[ F(3) = P(X \le 3) = \sum_{t \le 3} f(t) = f(1) + f(2) + f(3) = 0.4 + 0.3 + 0.2 = 0.9 \]
25
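The same sums can be computed mechanically. The sketch below assumes the pmf from this example and uses exact fractions to avoid floating-point rounding; the function name `F` mirrors the slide notation.

```python
from fractions import Fraction

# pmf from the example: f(1) = .4, f(2) = .3, f(3) = .2, f(4) = .1
pmf = {1: Fraction(4, 10), 2: Fraction(3, 10),
       3: Fraction(2, 10), 4: Fraction(1, 10)}

def F(x):
    """P(X <= x): add f(t) over every possible value t not exceeding x."""
    return sum(p for t, p in pmf.items() if t <= x)

print(F(1), F(3))
```

Because `F` accepts any real x, a value such as `F(2.7)` picks up only f(1) and f(2).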
Discrete Cumulative Distribution
[Figure: step plot of F(x) against x = 0, 1, 2, 3, 4, with steps at 0.4, 0.7, 0.9 and 1]
26
Observations concerning the CDF
For any number x, F(x) will equal the value of F at
the closest possible value of X at or to the left of x.
Let’s consider a case where we want to evaluate
F(2.7):
\[ F(2.7) = P(X \le 2.7) = \sum_{t \le 2.7} f(t) = f(1) + f(2) = 0.4 + 0.3 = 0.7 = F(2) \]
Also observe,
\[ f(2) = F(2) - F(1) = [f(1) + f(2)] - f(1) = f(2). \]
So, if you were given a CDF you could
determine the PMF.
27
Observations concerning CDF
What if you wanted to determine, in general, $P(a \le X \le b)$?
Assume a and b are integers.
\[ P(a \le X \le b) = f(a) + f(a+1) + f(a+2) + \dots + f(b) \]
We know
\[ F(a) = P(X \le a) = \sum_{t \le a} f(t) = f(t_0) + f(t_1) + \dots + f(a-1) + f(a) \]
29
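Writing the same expansion for F(b) and for F(a − 1) and subtracting cancels every term below f(a). A short derivation, assuming as the slide does that X takes integer values and that a and b are integers:

```latex
\[
P(a \le X \le b) = \sum_{t=a}^{b} f(t)
                 = \sum_{t \le b} f(t) - \sum_{t \le a-1} f(t)
                 = F(b) - F(a-1)
\]
```

Note the a − 1: using F(a) instead would drop f(a) and give P(a < X ≤ b).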
Calculating Probabilities for Discrete
Distributions
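Example 1: If the cumulative distribution function of X is given by
\[
F(x) = \begin{cases}
0, & x < 0 \\
1/2, & 0 \le x < 1 \\
3/5, & 1 \le x < 2 \\
4/5, & 2 \le x < 3 \\
9/10, & 3 \le x < 4 \\
1, & 4 \le x
\end{cases}
\]
a. Find $P(X \le 3)$
b. Find $P(X > 2)$
c. Find $P(2 \le X \le 3)$
d. Find $P(2 < X \le 3)$
e. Find PMF of X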
30
Calculating Probabilities for Discrete
Distributions
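Using the CDF F(x) of Example 1:
\[ P(X \le 3) = F(3) = \frac{9}{10} \]
\[ P(X > 2) = 1 - P(X \le 2) = 1 - F(2) = 1 - \frac{4}{5} = \frac{1}{5} \]
\[ P(2 \le X \le 3) = F(3) - F(1) = \frac{9}{10} - \frac{3}{5} = \frac{9}{10} - \frac{6}{10} = \frac{3}{10} \]
\[ P(2 < X \le 3) = F(3) - F(2) = \frac{9}{10} - \frac{4}{5} = \frac{9}{10} - \frac{8}{10} = \frac{1}{10} \]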
31
Find PMF of X
• Step 1. We know the upper and lower bound of the distribution just from looking at the
CDF. The possible values of X are 0,1,2,3 and 4.
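The remaining steps are not in this extract. One way to finish, sketched here under the assumption that the CDF values are those of Example 1, is to take the size of each jump: f(x) = F(x) − F(x − 1). The names `cdf`, `pmf`, and `previous` are illustrative.

```python
from fractions import Fraction

# Value of F(x) at each jump point, read off the CDF in Example 1.
cdf = {0: Fraction(1, 2), 1: Fraction(3, 5), 2: Fraction(4, 5),
       3: Fraction(9, 10), 4: Fraction(1)}

pmf = {}
previous = Fraction(0)  # F(x) = 0 to the left of the smallest value
for x in sorted(cdf):
    pmf[x] = cdf[x] - previous  # jump size: f(x) = F(x) - F(x - 1)
    previous = cdf[x]

for x, p in pmf.items():
    print(x, p)
```

The jump sizes are non-negative and add up to 1, as a PMF requires.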
For the Car Agency problem, the PMF is
\[ f(x) = P(X = x) = \frac{\binom{4}{x}}{16}, \quad \text{for } x = 0, 1, 2, 3, 4 \]
33
Cumulative Distribution
34
Example
Recall Car Agency problem. Find the cumulative
distribution for the random variable X in the
problem. Using F(x), verify that f(2)=3/8.
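\[ f(x) = P(X = x) = \frac{\binom{4}{x}}{16}, \quad \text{for } x = 0, 1, 2, 3, 4 \]
\[ F(0) = f(0) = \frac{1}{16} \]
\[ F(1) = f(0) + f(1) = \frac{5}{16} \]
\[ F(2) = f(0) + f(1) + f(2) = \frac{11}{16} \]
\[ F(3) = f(0) + f(1) + f(2) + f(3) = \frac{15}{16} \]
\[ F(4) = f(0) + f(1) + f(2) + f(3) + f(4) = 1 \]
35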
Example (con’t)
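\[
F(x) = \begin{cases}
0, & \text{for } x < 0 \\
1/16, & \text{for } 0 \le x < 1 \\
5/16, & \text{for } 1 \le x < 2 \\
11/16, & \text{for } 2 \le x < 3 \\
15/16, & \text{for } 3 \le x < 4 \\
1, & \text{for } x \ge 4
\end{cases}
\]
\[ f(2) = F(2) - F(1) = \frac{11}{16} - \frac{5}{16} = \frac{3}{8} \]
36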
Bar Chart
[Figure: bar chart of f(x) against x = 0, 1, 2, 3, 4; vertical axis marked from 1/16 to 6/16]
37
Probability Histogram
[Figure: probability histogram of f(x) against x = 0, 1, 2, 3, 4; vertical axis marked from 1/16 to 6/16]
38
Discrete Cumulative Distribution
[Figure: step plot of F(x) against x = 0, 1, 2, 3, 4; vertical axis marked at 1/4, 1/2, 3/4 and 1]
39
Continuous Probability
Distributions
A continuous random variable has a probability of zero
of assuming exactly any of its values.
In dealing with continuous variables, f(x) is usually
called the probability density function (PDF), or
simply the density function of X.
A probability density function is constructed so that the
area under its curve bounded by the x axis is equal to 1
when computed over the range of X for which f(x) is
defined.
This is how you know the PDF is a valid PDF.
41
Probability Density Function
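The function f(x) is a probability density function
for the continuous random variable X, defined
over the set of real numbers R, if
1. $f(x) \ge 0$ for all $x \in R$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. $P(a < X < b) = \int_a^b f(x)\,dx$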
42
Example
Suppose that the error in the reaction temperature, in °C,
for a controlled laboratory experiment is a continuous
random variable X having the probability density
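function
\[
f(x) = \begin{cases}
\dfrac{x^2}{3}, & -1 < x < 2 \\
0, & \text{elsewhere}
\end{cases}
\]
(a) Verify that $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
(b) Find $P(0 < X \le 1)$.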
43
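\[ \text{(a)} \quad \int_{-\infty}^{\infty} f(x)\,dx = \int_{-1}^{2} \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_{-1}^{2} = \frac{8}{9} + \frac{1}{9} = 1 \]
\[ \text{(b)} \quad P(0 < X \le 1) = \int_{0}^{1} \frac{x^2}{3}\,dx = \left.\frac{x^3}{9}\right|_{0}^{1} = \frac{1}{9} \]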
44
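Both integrals can be sanity-checked numerically. The sketch below assumes the density x²/3 on (−1, 2) from this example and uses a plain midpoint rule rather than any particular library; `integrate` is a hypothetical helper name.

```python
def f(x):
    """Density from the example: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0

def integrate(func, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

total = integrate(f, -1, 2)   # part (a): should be close to 1
p_0_1 = integrate(f, 0, 1)    # part (b): should be close to 1/9

print(total, p_0_1)
```

The results are approximations; they agree with the exact answers 1 and 1/9 to several decimal places.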
Cumulative Distribution Function for
Continuous Random Variable
The cumulative distribution F(x) of a continuous random
variable X with density function f(x) is
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt \quad \text{for } -\infty < x < \infty \]
Hence
\[ P(a < X < b) = F(b) - F(a) \]
\[ f(x) = \frac{dF(x)}{dx} \]
Take the derivative of the CDF with respect to x to get the PDF.
Integrate the PDF from its lower limit to x to get the CDF.
45
Example
For the density function
\[
f(x) = \begin{cases}
\dfrac{x^2}{3}, & -1 < x < 2 \\
0, & \text{elsewhere}
\end{cases}
\]
Find F(x), and use it to evaluate P(0<X≤1)
46
Example (con’t)
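For $-1 \le x < 2$,
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt = \int_{-1}^{x} \frac{t^2}{3}\,dt = \left.\frac{t^3}{9}\right|_{-1}^{x} = \frac{x^3 + 1}{9} \]
Therefore,
\[
F(x) = \begin{cases}
0, & x < -1 \\
\dfrac{x^3 + 1}{9}, & -1 \le x < 2 \\
1, & x \ge 2
\end{cases}
\]
If the CDF is defined, we can use it to determine the probability
without any further integration.
\[ P(0 < X \le 1) = F(1) - F(0) = \frac{2}{9} - \frac{1}{9} = \frac{1}{9} \]
Note: For a continuous distribution P(X=a)=0, therefore
\[ P(a < X \le b) = P(a < X < b) = F(b) - F(a) \]
47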
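Once the CDF is in closed form, probabilities reduce to two function evaluations. This is a minimal sketch using exact fractions; the function name `F` follows the slide notation and the branch boundaries are the ones derived in this example.

```python
from fractions import Fraction

def F(x):
    """CDF for the density x^2/3 on (-1, 2)."""
    x = Fraction(x)
    if x < -1:
        return Fraction(0)
    if x < 2:
        return (x ** 3 + 1) / 9
    return Fraction(1)

# P(0 < X <= 1) without any further integration
p = F(1) - F(0)
print(p)
```

Here `p` equals F(1) − F(0) = 2/9 − 1/9.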
Objectives
Define joint probability distributions
Determine probability of jointly distributed
random variables
Define concept of statistical independence
49
Joint Distribution
Given 2 random variables, X and Y, defined on the
sample space of an experiment S, you are
interested in the probability that both X and Y take
on values simultaneously.
Note: This can be extended to more than 2 random
variables.
The function giving the probability that X = x and Y = y
occur simultaneously is f(x,y).
50
Definitions
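The function f(x, y) is a joint probability distribution or
probability mass function of the discrete random variables X and Y if
1. $f(x, y) \ge 0$ for all (x, y)
2. $\sum_x \sum_y f(x, y) = 1$
3. $P(X = x, Y = y) = f(x, y)$
For any region A in the xy plane,
\[ P[(X, Y) \in A] = \sum\sum_{A} f(x, y) \]
51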
Definitions
The function f(x, y) is a joint density function of
the continuous random variables X and Y if
1. $f(x, y) \ge 0$ for all (x, y)
2. $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
\[ \{(x, y) \mid x + y \le 1\} \]
54
Joint PMF Example: Solution
Using Concepts from Chapter 2.
N=The # of ways to select 2 refills at random.
n= The # of ways to select a specific number of
blue (x) and red (y) refills.
Write a function in terms of x and y
\[ f(x, y) = \frac{\binom{3}{x}\binom{2}{y}\binom{3}{2 - x - y}}{\binom{8}{2}} \]
55
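The counting formula translates directly into code. In this sketch the constants 3 (blue), 2 (red), and 3 (the remaining refills) are read off the binomial coefficients; the names `BLUE`, `RED`, `OTHER`, `DRAWS`, `g`, and `h` are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

BLUE, RED, OTHER, DRAWS = 3, 2, 3, 2  # 8 refills in total, 2 selected

def f(x, y):
    """Joint pmf: x blue and y red refills among the 2 selected."""
    rest = DRAWS - x - y  # refills that are neither blue nor red
    if x < 0 or y < 0 or rest < 0:
        return Fraction(0)
    ways = comb(BLUE, x) * comb(RED, y) * comb(OTHER, rest)      # n
    return Fraction(ways, comb(BLUE + RED + OTHER, DRAWS))       # n / N

def g(x):
    """Marginal pmf of X: sum f(x, y) over y."""
    return sum(f(x, y) for y in range(DRAWS + 1))

def h(y):
    """Marginal pmf of Y: sum f(x, y) over x."""
    return sum(f(x, y) for x in range(DRAWS + 1))

for y in range(DRAWS + 1):
    print([str(f(x, y)) for x in range(DRAWS + 1)], "h(y) =", h(y))
```

Each printed row is one value of y; summing a row gives h(y) and summing a column gives g(x).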
Joint PMF Example: Solution (con’t)
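The joint probabilities for the possible values of X and Y
are listed in the table below. The last column, h(y), is the marginal
distribution of y; the last row, g(x), is the marginal distribution of x.
          X
Y         0       1       2       h(y)
0         3/28    9/28    3/28    15/28
1         3/14    3/14    0       6/14
2         1/28    0       0       1/28
g(x)      10/28   15/28   3/28
56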
Joint PMF Example (con’t)
If I was only interested in P(X=x), then I can
define the marginal PMF of X (g(x)) obtained by
summing over the values of the other variable (Y).
Example:
\[ P(X = 0) = \sum_{y} f(0, y) = f(0,0) + f(0,1) + f(0,2) = \frac{10}{28} \]
57
Joint Density Function: Another Example
A candy company distributes boxes of chocolates with a
mixture of creams, toffees, and nuts coated in both
light and dark chocolate. For a randomly selected box,
let X and Y, respectively, be the proportions of the light
and dark chocolates that are creams and suppose
that the joint density function is:
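\[
f(x, y) = \begin{cases}
\dfrac{2}{5}(2x + 3y), & 0 \le x \le 1,\ 0 \le y \le 1 \\
0, & \text{elsewhere}
\end{cases}
\]
(a) Verify that $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$
(b) Find $P[(X, Y) \in A]$, where A is the region
\[ \left\{(x, y) \;\middle|\; 0 < x < \tfrac{1}{2},\ \tfrac{1}{4} < y < \tfrac{1}{2}\right\} \]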
58
Solution (part a)
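\[ \int_0^1\!\!\int_0^1 \frac{2}{5}(2x + 3y)\,dx\,dy = \int_0^1 \left[\frac{2x^2}{5} + \frac{6xy}{5}\right]_{x=0}^{x=1} dy \]
\[ = \int_0^1 \left(\frac{2}{5} + \frac{6y}{5}\right) dy \]
\[ = \left[\frac{2y}{5} + \frac{6y^2}{10}\right]_{y=0}^{y=1} = \frac{2}{5} + \frac{6}{10} = 1 \]
59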
Solution (part b)
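\[ \int_{0.25}^{0.5}\!\!\int_{0}^{0.5} \frac{2}{5}(2x + 3y)\,dx\,dy = \int_{0.25}^{0.5} \left[\frac{2x^2}{5} + \frac{6xy}{5}\right]_{x=0}^{x=0.5} dy \]
\[ = \int_{0.25}^{0.5} \left(\frac{1}{10} + \frac{3y}{5}\right) dy \]
\[ = \left[\frac{y}{10} + \frac{3y^2}{10}\right]_{y=0.25}^{y=0.5} = \frac{13}{160} \]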
60
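Parts (a) and (b) can be cross-checked with a simple two-dimensional midpoint rule. This sketch assumes the candy-box density from this example; `integrate2d` is a hypothetical helper, not a library function.

```python
def f(x, y):
    """Joint density (2/5)(2x + 3y) on the unit square, zero elsewhere."""
    return 0.4 * (2 * x + 3 * y) if 0 <= x <= 1 and 0 <= y <= 1 else 0.0

def integrate2d(func, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint-rule approximation of a double integral over a rectangle."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += func(x, y)
    return total * hx * hy

total_mass = integrate2d(f, 0, 1, 0, 1)          # part (a)
p_region = integrate2d(f, 0, 0.5, 0.25, 0.5)     # part (b)

print(total_mass, p_region, 13 / 160)
```

Because the integrand is linear in x and y, the midpoint rule is exact here up to floating-point rounding.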
Joint Density Function:
Another Example (con’t)
Now, suppose I was only interested in the marginal distribution g(x) or h(y).
To find g(x) we have to integrate over all possible y values.
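\[ g(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \frac{2}{5}\int_0^1 (2x + 3y)\,dy = \frac{2}{5}\left[2xy + \frac{3y^2}{2}\right]_{y=0}^{y=1} = \frac{2}{5}\left(2x + \frac{3}{2}\right) = \frac{4x + 3}{5} \]
\[ h(y) = \int_{-\infty}^{\infty} f(x, y)\,dx = \frac{2}{5}\int_0^1 (2x + 3y)\,dx = \frac{2}{5}\left[x^2 + 3xy\right]_{x=0}^{x=1} = \frac{2 + 6y}{5} \]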
61
Example: Joint Density Functions
on non-rectangle regions
62
Solution
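[Figure: region of integration in the xy plane; x axis marked at 1, 3 and 6, y axis marked at 6, with the annotation "x+y>6 OR"]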
63
Conditional Distribution
Let X and Y be two random variables, discrete or
continuous. The conditional distribution of the random
variable Y, given that X=x, is
\[ f(y \mid x) = \frac{f(x, y)}{g(x)}, \quad g(x) > 0 \]
Similarly, the conditional distribution of the random
variable X, given that Y=y, is
\[ f(x \mid y) = \frac{f(x, y)}{h(y)}, \quad h(y) > 0 \]
64
Conditional Distribution
Hence, for the discrete case,
\[ P(a < X < b \mid Y = y) = \sum_{a < x < b} f(x \mid y) \]
and for the continuous case,
\[ P(a < X < b \mid Y = y) = \int_a^b f(x \mid y)\,dx \]
65
Conditional Distribution
Example
Refer to Ballpoint pen refills example (slide #55-
#56). Find the conditional distribution of X, given
that Y=1, and use it to determine P(X=0|Y=1).
66
Solution
We need to find the conditional distribution of X
given y = 1.
So we want to populate the values in the
following table:
x f(x|y=1)
0 ?
1 ?
2 ?
67
Solution
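\[ P(X = x \mid Y = 1) = \frac{f(x, 1)}{h(1)} = \frac{14}{6}\, f(x, 1) \]
\[ P(X = 0 \mid Y = 1) = \frac{14}{6} \cdot \frac{3}{14} = \frac{1}{2} \]
\[ P(X = 1 \mid Y = 1) = \frac{14}{6} \cdot \frac{3}{14} = \frac{1}{2} \]
\[ P(X = 2 \mid Y = 1) = \frac{14}{6} \cdot 0 = 0 \]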
So the conditional distribution f(x|y=1) is defined as
x f(x|y=1)
0 ½
1 ½
2 0
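The conditional table can be produced by dividing one row of the joint table by its marginal. A minimal sketch, assuming the joint probabilities of the ballpoint pen refills example; the names `joint` and `conditional_x_given_y` are illustrative.

```python
from fractions import Fraction as Fr

# Joint pmf f(x, y) of the refills example, keyed by (x, y).
joint = {
    (0, 0): Fr(3, 28), (1, 0): Fr(9, 28), (2, 0): Fr(3, 28),
    (0, 1): Fr(3, 14), (1, 1): Fr(3, 14), (2, 1): Fr(0),
    (0, 2): Fr(1, 28), (1, 2): Fr(0),     (2, 2): Fr(0),
}

def conditional_x_given_y(y):
    """f(x | y) = f(x, y) / h(y) for x = 0, 1, 2."""
    h_y = sum(p for (_, yy), p in joint.items() if yy == y)  # marginal h(y)
    return {x: joint[(x, y)] / h_y for x in range(3)}

print(conditional_x_given_y(1))
```

The returned values form a valid distribution in x: they are non-negative and sum to 1.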
Given the joint density function
\[
f(x, y) = \begin{cases}
\dfrac{x(1 + 3y^2)}{4}, & 0 < x < 2,\ 0 < y < 1 \\
0, & \text{elsewhere}
\end{cases}
\]
Find g(x), h(y), f(x|y), and evaluate P(1/4<X<1/2|Y=1/3)
69
Solution
a) Find g(x)
Integrate f(x,y) over all possible values of y
\[ g(x) = \int_0^1 \frac{x(1 + 3y^2)}{4}\,dy = \left[\frac{xy}{4} + \frac{xy^3}{4}\right]_{y=0}^{y=1} = \frac{2x}{4} = \frac{x}{2}, \quad 0 < x < 2 \]
70
Solution (con’t)
b) find h(y)
Integrate f(x,y) over all possible values of x
\[ h(y) = \int_0^2 \frac{x(1 + 3y^2)}{4}\,dx = \left[\frac{(1 + 3y^2)}{4} \cdot \frac{x^2}{2}\right]_{x=0}^{x=2} = \frac{1 + 3y^2}{2}, \quad 0 < y < 1 \]
71
Solution (con’t)
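c) Find f(x|y)
\[ f(x \mid y) = \frac{f(x, y)}{h(y)} = \frac{x(1 + 3y^2)}{4} \cdot \frac{2}{1 + 3y^2} = \frac{x}{2}, \quad 0 < x < 2 \]
Note: f(x|y) = g(x), so X and Y are statistically independent.
\[ P(1/4 < X < 1/2 \mid Y = 1/3) = \int_{1/4}^{1/2} \frac{x}{2}\,dx = \left.\frac{x^2}{4}\right|_{0.25}^{0.5} = \frac{0.25 - 0.0625}{4} = 0.046875 \]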
72
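The cancellation of the (1 + 3y²) factor, and the final probability, can be confirmed with exact arithmetic. This sketch assumes the joint density of this example; `prob_x_between` uses the antiderivative x²/4 rather than numerical integration, and all function names are illustrative.

```python
from fractions import Fraction

def f(x, y):
    """Joint density x(1 + 3y^2)/4 on 0 < x < 2, 0 < y < 1."""
    return x * (1 + 3 * y * y) / 4

def h(y):
    """Marginal density of Y: (1 + 3y^2)/2."""
    return (1 + 3 * y * y) / 2

def conditional(x, y):
    """f(x | y) = f(x, y) / h(y)."""
    return f(x, y) / h(y)

def prob_x_between(a, b):
    """Integral of x/2 from a to b, via the antiderivative x^2/4."""
    return (b * b - a * a) / 4

x, y = Fraction(3, 8), Fraction(1, 3)
same_as_marginal = conditional(x, y) == x / 2  # f(x|y) does not depend on y

p = prob_x_between(Fraction(1, 4), Fraction(1, 2))
print(same_as_marginal, p, float(p))
```

The exact result is 3/64, which is the 0.046875 shown on the slide.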
Statistical Independence
Let X and Y be two random variables, discrete or
continuous, with joint probability distribution
f(x,y) and marginal distribution g(x) and h(y),
respectively. The random variables X and Y are
said to be statistically independent if and only if
f(x,y)=g(x)h(y)
for all (x,y) within their range.
73
Statistical Independence
Example
Refer to Ballpoint pen refills example. Show
that the random variables are not statistically
independent.
74
Solution
          X
Y         0       1       2       h(y)
0         3/28    9/28    3/28    15/28
1         3/14    3/14    0       6/14
2         1/28    0       0       1/28
g(x)      10/28   15/28   3/28
\[ f(0, 1) = \frac{3}{14} \ne g(0)\,h(1) = \frac{10}{28} \cdot \frac{6}{14} = \frac{15}{98} \]
So, X and Y are not statistically independent.
75
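Independence requires f(x, y) = g(x)h(y) at every point, so a single failing cell is enough to reject it. The following sketch checks all nine cells of the refills table; the variable names are assumptions for illustration.

```python
from fractions import Fraction as Fr

# Joint pmf f(x, y) of the refills example, keyed by (x, y).
joint = {
    (0, 0): Fr(3, 28), (1, 0): Fr(9, 28), (2, 0): Fr(3, 28),
    (0, 1): Fr(3, 14), (1, 1): Fr(3, 14), (2, 1): Fr(0),
    (0, 2): Fr(1, 28), (1, 2): Fr(0),     (2, 2): Fr(0),
}

# Marginals: g(x) sums over y, h(y) sums over x.
g = {x: sum(joint[(x, y)] for y in range(3)) for x in range(3)}
h = {y: sum(joint[(x, y)] for x in range(3)) for y in range(3)}

# Independent only if the product rule holds in every cell.
independent = all(joint[(x, y)] == g[x] * h[y]
                  for x in range(3) for y in range(3))

print(joint[(0, 1)], g[0] * h[1], independent)
```

For the cell (0, 1) the joint probability is 3/14 while the product of the marginals is 15/98, so `independent` is `False`.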
Statistical Independence
Let X1, X2, …,Xn be n random variables, discrete
or continuous, with joint probability distribution
f(x1,x2…xn) and marginal distributions f1(x1),
f2(x2),…,fn(xn), respectively. The random variables
X1, X2, …, Xn are said to be mutually statistically
independent if and only if
f(x1,x2,…,xn)=f1(x1)f2(x2)…fn(xn)
For all (x1,x2,…,xn) within their range.
76
Example 5.8 p.236
Show that the random variables x1 and x2 are
independent.
77
Example
Show that the random variables x1 and x2 are
independent.
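\[
f(x_1, x_2) = \begin{cases}
2(1 - x_1), & \text{for } 0 \le x_1 \le 1 \text{ and } 0 \le x_2 \le 1 \\
0, & \text{elsewhere}
\end{cases}
\]
\[ f_1(x_1) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_2 = \int_0^1 2(1 - x_1)\,dx_2 = 2(1 - x_1), \quad 0 \le x_1 \le 1 \]
\[ f_2(x_2) = \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1 = \int_0^1 2(1 - x_1)\,dx_1 = 1, \quad 0 \le x_2 \le 1 \]
Since f(x1,x2)=f1(x1)f2(x2) for all real numbers x1 and x2, X1 and X2 are
independent random variables.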
78
Example
Are independent?
79
Derived random variables
Let X have the PDF
80
Derived random variables
81
Example (cont’d)
82
Example (cont’d)
83
Summary
Defined random variables
Examined discrete/continuous random variables
Presented cumulative distribution functions for
discrete and continuous variables
Reviewed properties of jointly distributed
continuous random variables
Presented conditional probability
Determined whether two (or more) random
variables are statistically independent
Examined derived random variables
84