Chapter 4 - Multivariate Probability Distribution
Our study of random variables and their probability distributions in the preceding chapters is
restricted to one-dimensional sample spaces, where we recorded outcomes of an experiment as
values assumed by a single random variable. There will be situations, however, where we may
find it desirable to record the simultaneous outcomes of several variables. For example:
1. We might measure the amount of precipitate P and volume V of gas released from a
controlled chemical experiment, giving rise to a two-dimensional sample space
consisting of the outcomes (p, v).
2. In a study to determine the likelihood of success in college based on high school data,
we might use a three-dimensional sample space and record for each individual his or
her aptitude test score, high school class rank, and grade-point average at the end of
freshman year in college.
If X and Y are two discrete random variables, the probability distribution for their simultaneous
occurrence can be represented by a function with values f ( x, y ) for any pair of values ( x, y )
within the range of the random variables X and Y. This function is referred to as the joint
probability distribution of X and Y.
Definition 1.1
The function f ( x, y ) is a joint probability distribution or probability mass function of the
discrete random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y)
2. Σ_x Σ_y f(x, y) = 1
3. P(X = x, Y = y) = f(x, y)
For any region A in the xy-plane, P[(X, Y) ∈ A] = Σ Σ_{(x, y) ∈ A} f(x, y).
SCHOOL OF MATHEMATICAL SCIENCES
The information for a discrete joint distribution can be neatly summarized in tabular form as
follows:
              Y
          y1         …    yn         p(x)
    x1    p(x1, y1)  …    p(x1, yn)  p(x1)
X   ⋮     ⋮               ⋮          ⋮
    xm    p(xm, y1)  …    p(xm, yn)  p(xm)
    p(y)  p(y1)      …    p(yn)      1
Example 1
Given the following joint probability distribution f ( x, y) :
              Y
          0      1      4
      1   0.10   0.05   0.15
X     3   0.05   0.20   0.25
      5   0.15   0.00   0.05
Solution
(a)
              Y
          0      1      4      g(x)
      1   0.10   0.05   0.15   0.30
X     3   0.05   0.20   0.25   0.50
      5   0.15   0.00   0.05   0.20
   h(y)   0.30   0.25   0.45   1.00
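As a quick check, the joint pmf of Example 1 can be keyed into a short script and both conditions of Definition 1.1 verified directly, with the marginal totals falling out as row and column sums (a Python sketch; the nested-dict layout is just one convenient representation):

```python
# Joint pmf of Example 1, stored as f[x][y]
f = {
    1: {0: 0.10, 1: 0.05, 4: 0.15},
    3: {0: 0.05, 1: 0.20, 4: 0.25},
    5: {0: 0.15, 1: 0.00, 4: 0.05},
}

# Condition 2 of Definition 1.1: all entries sum to 1
total = sum(p for row in f.values() for p in row.values())

# Row totals give g(x); column totals give h(y)
g = {x: round(sum(row.values()), 2) for x, row in f.items()}
h = {y: round(sum(f[x][y] for x in f), 2) for y in (0, 1, 4)}

print(round(total, 10))  # 1.0
print(g)                 # {1: 0.3, 3: 0.5, 5: 0.2}
print(h)                 # {0: 0.3, 1: 0.25, 4: 0.45}
```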
Example 2
Two refills for a ballpoint pen are selected at random from a box that contains 3 blue refills, 2
red refills, and 3 green refills. If X is the number of blue refills and Y is the number of red refills
selected, find
(a) the joint probability function f(x, y), and
(b) P[(X, Y) ∈ A], where A is the region {(x, y) | x + y ≤ 1}.
Note: 0 ≤ x + y ≤ 2
Example 3
Roll a red die and a green die. Let X1 = number of dots on the red die, X2 = number of dots on
the green die. There are 36 points in the sample space.
Table: Possible Outcomes of Rolling a Red Die and a Green Die. (First number in the pair is
the number on the red die.)
Green    1      2      3      4      5      6
Red
  1    (1,1)  (1,2)  (1,3)  (1,4)  (1,5)  (1,6)
  2    (2,1)  (2,2)  (2,3)  (2,4)  (2,5)  (2,6)
  3    (3,1)  (3,2)  (3,3)  (3,4)  (3,5)  (3,6)
  4    (4,1)  (4,2)  (4,3)  (4,4)  (4,5)  (4,6)
  5    (5,1)  (5,2)  (5,3)  (5,4)  (5,5)  (5,6)
  6    (6,1)  (6,2)  (6,3)  (6,4)  (6,5)  (6,6)

The probability of (1, 1) is 1/36. The probability of (6, 3) is also 1/36.
Now consider P(2 ≤ X₁ ≤ 3, 2 ≤ X₂ ≤ 3) = f(2, 2) + f(2, 3) + f(3, 2) + f(3, 3) = 4(1/36) = 1/9.
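The 1/9 above can be confirmed by brute-force enumeration of the 36 equally likely pairs (a Python sketch using exact rational arithmetic):

```python
from fractions import Fraction

# All 36 equally likely (red, green) outcomes
outcomes = [(x1, x2) for x1 in range(1, 7) for x2 in range(1, 7)]

# P(2 <= X1 <= 3, 2 <= X2 <= 3): count qualifying pairs, each worth 1/36
event = [(x1, x2) for (x1, x2) in outcomes if 2 <= x1 <= 3 and 2 <= x2 <= 3]
prob = Fraction(len(event), 36)

print(prob)  # 1/9
```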
When X and Y are continuous random variables, the joint density function f ( x, y ) is a surface
lying above the xy-plane, and P[(X, Y) ∈ A], where A is any region in the xy-plane, is equal
to the volume of the right cylinder bounded by the base A and the surface f(x, y).
Definition 1.2
The function f ( x, y) is a joint density function of the continuous random variables X and Y if
1. f(x, y) ≥ 0 for all (x, y)
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy, for any region A in the xy-plane
Example 4
A candy company distributes boxes of chocolates with a mixture of creams, toffees, and nuts
coated in both light and dark chocolate. For a randomly selected box, let X and Y, respectively,
be the proportion of the light and dark chocolates that are creams and suppose that the joint
density function is
f(x, y) = (2/5)(2x + 3y),  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
          0,               elsewhere
(a) Verify condition 2 of Definition 1.2.
(b) Find P[(X, Y) ∈ A], where A = {(x, y) : 0 < x < 1/2, 1/4 < y < 1/2}.
Solution
(a) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = ∫_0^1 ∫_0^1 (2/5)(2x + 3y) dx dy
    = ∫_0^1 [2x²/5 + 6xy/5]_{x=0}^{x=1} dy
    = ∫_0^1 (2/5 + 6y/5) dy
    = [2y/5 + 3y²/5]_{y=0}^{y=1}
    = 2/5 + 3/5 = 1
(b) P[(X, Y) ∈ A] = P(0 < X < 1/2, 1/4 < Y < 1/2)
    = ∫_{1/4}^{1/2} ∫_0^{1/2} (2/5)(2x + 3y) dx dy
    = ∫_{1/4}^{1/2} [2x²/5 + 6xy/5]_{x=0}^{x=1/2} dy
    = ∫_{1/4}^{1/2} (1/10 + 3y/5) dy
    = [y/10 + 3y²/10]_{y=1/4}^{y=1/2}
    = (1/10)[(1/2 + 3/4) − (1/4 + 3/16)]
    = 13/160
Example 5
Let f(x, y) = kx be a joint density function on the region R in the plane described by
0 ≤ x ≤ y ≤ 1. Find the value of k.
Example 6
An insurance company insures a large number of drivers. Let X be the random variable
representing the company’s losses under collision insurance, and let Y represent the company’s
losses under liability insurance. X and Y have joint density function
f(x, y) = (2x + 2 − y)/4,  for 0 < x < 1 and 0 < y < 2
          0,               otherwise.
Example 7
Let X, Y, and Z have the joint probability density function
f(x, y, z) = kxy²z,  0 < x, y < 1, 0 < z < 2
             0,      elsewhere
(a) Find k.
(b) Find P(X < 1/4, Y > 1/2, 1 < Z < 2).
Solution
(a) ∫_{−∞}^{∞} ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y, z) dx dy dz = 1
∫_0^2 ∫_0^1 ∫_0^1 kxy²z dx dy dz = 1
∫_0^2 ∫_0^1 k [x²y²z/2]_{x=0}^{x=1} dy dz = 1
∫_0^2 ∫_0^1 k (y²z/2) dy dz = 1
∫_0^2 k [y³z/6]_{y=0}^{y=1} dz = 1
∫_0^2 k (z/6) dz = 1
k [z²/12]_{z=0}^{z=2} = 1
k (4/12) = 1
k = 12/4 = 3
(b) P(X < 1/4, Y > 1/2, 1 < Z < 2) = ∫_1^2 ∫_{1/2}^1 ∫_0^{1/4} 3xy²z dx dy dz
= ∫_1^2 ∫_{1/2}^1 [3x²y²z/2]_{x=0}^{x=1/4} dy dz
= ∫_1^2 ∫_{1/2}^1 (3y²z/32) dy dz
= ∫_1^2 [y³z/32]_{y=1/2}^{y=1} dz
= ∫_1^2 (z/32 − z/256) dz
= ∫_1^2 (7z/256) dz
= [7z²/512]_{z=1}^{z=2}
= 28/512 − 7/512 = 21/512
2. Marginal Distributions
Given the joint probability distribution f ( x, y ) of the discrete random variables X and Y, the
probability distribution g(x) of X alone is obtained by summing f ( x, y ) over the values of Y.
Similarly, the probability distribution h(y) of Y alone is obtained by summing f (x, y) over the
values of X. We define g(x) and h(y) to be the marginal distributions of X and Y, respectively.
When X and Y are continuous random variables, summations are replaced by integrals. We can
now make the following general definitions.
Definition 2.1
The marginal distributions of X alone and of Y alone are
g(x) = Σ_y f(x, y)  and  h(y) = Σ_x f(x, y)   for the discrete case, and
g(x) = ∫_{−∞}^{∞} f(x, y) dy  and  h(y) = ∫_{−∞}^{∞} f(x, y) dx   for the continuous case.
The term marginal is used here because, in the discrete case, the values of g(x) and h(y) are just
the marginal totals of the respective columns and rows when the values of f(x, y) are displayed
in a rectangular table.
Example 8
By referring to Example 1, find the marginal distributions for X and Y.
Solution
Marginal distribution of X:
   x      1      3      5
  g(x)   0.30   0.50   0.20

Marginal distribution of Y:
   y      0      1      4
  h(y)   0.30   0.25   0.45
Example 9
By referring to Example 2, find the marginal distributions of X and Y.
Example 10
Find the marginal distributions of X and Y, where
f(x, y) = (2/5)(2x + 3y),  0 ≤ x ≤ 1, 0 ≤ y ≤ 1
          0,               elsewhere.
Solution
g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_0^1 (2/5)(2x + 3y) dy
     = [4xy/5 + 6y²/10]_{y=0}^{y=1}
     = (4x + 3)/5

g(x) = (4x + 3)/5,  0 ≤ x ≤ 1
       0,           elsewhere

h(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^1 (2/5)(2x + 3y) dx
     = [2x²/5 + 6xy/5]_{x=0}^{x=1}
     = (2 + 6y)/5

h(y) = (2 + 6y)/5,  0 ≤ y ≤ 1
       0,           elsewhere
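The closed forms just derived can be spot-checked by integrating f numerically over the other variable (a Python sketch; since f is linear in y, the midpoint rule is exact here apart from rounding):

```python
# f(x, y) = (2/5)(2x + 3y) from Example 10
def f(x, y):
    return 0.4 * (2 * x + 3 * y)

def marginal_g(x, n=200):
    """Numerical g(x) = integral of f(x, y) over y in [0, 1]."""
    step = 1.0 / n
    return sum(f(x, (j + 0.5) * step) for j in range(n)) * step

# Compare against the closed form g(x) = (4x + 3)/5 at a few points
for x in (0.0, 0.5, 1.0):
    print(round(marginal_g(x), 6), round((4 * x + 3) / 5, 6))
```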
Example 11
Let f(x, y) = 6x be a joint density function on the region R in the plane described by
0 ≤ x ≤ y ≤ 1.
3. Conditional Distributions
The function f(x, y)/g(x) is strictly a function of y with x fixed and satisfies all the conditions of
a probability distribution. This is also true when f(x, y) and g(x) are the joint density and
marginal distribution, respectively, of continuous random variables. As a result, it is extremely
important that we make use of the special type of distribution of the form f(x, y)/g(x) in order to
effectively compute conditional probabilities. This type of distribution is called a conditional
probability distribution.
Definition 3.1
Let X and Y be two random variables, discrete or continuous. The conditional distribution of
the random variable Y given that X = x is
f(y | x) = f(x, y)/g(x), provided g(x) > 0
Similarly, the conditional distribution of X given that Y = y is
f(x | y) = f(x, y)/h(y), provided h(y) > 0
If we wish to find the probability that the discrete random variable X falls between a and b
when it is known that the discrete variable Y = y, we evaluate
P(a < X < b | Y = y) = Σ_{a < x < b} f(x | y),
where the summation extends over all values of X between a and b. When X and Y are
continuous, we evaluate
P(a < X < b | Y = y) = ∫_a^b f(x | y) dx.
Example 12
By referring to Example 2, find the conditional distribution of X, given that Y = 1, and use it
to determine P( X = 0 | Y = 1).
Therefore, if it is known that 1 of the 2 pen refills selected is red, we have a probability equal
to 1/2 that the other refill is not blue.
2
Example 13
The joint density for the random variables (X,Y), where X is the unit temperature change and Y
is the proportion of spectrum shift that a certain atomic particle produces is
f(x, y) = 10xy²,  0 < x < y < 1
          0,      elsewhere
a) Find the marginal densities g(x), h(y), and the conditional density f ( y | x) .
b) Find the probability that the spectrum shift is more than half, given that the temperature
is increased to 0.25 units; that is, find P(Y > 1/2 | X = 0.25).
(a) g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_x^1 10xy² dy
        = [10xy³/3]_{y=x}^{y=1} = (10/3)x(1 − x³),  0 < x < 1

    h(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^y 10xy² dx
        = [5x²y²]_{x=0}^{x=y} = 5y⁴,  0 < y < 1

    f(y | x) = f(x, y)/g(x) = 10xy²/[(10/3)x(1 − x³)] = 3y²/(1 − x³),  x < y < 1

(b) P(Y > 1/2 | X = 0.25) = ∫_{1/2}^1 f(y | x = 0.25) dy
        = ∫_{1/2}^1 3y²/(1 − 0.25³) dy = 8/9
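Since the antiderivative of 3y² is y³, part (b) reduces to exact rational arithmetic, which makes a compact check (a Python sketch; cond_prob is a hypothetical helper for the conditional density of this example):

```python
from fractions import Fraction as F

def cond_prob(a, b, x):
    """P(a < Y < b | X = x) for f(y|x) = 3y^2 / (1 - x^3) of Example 13."""
    return (F(b) ** 3 - F(a) ** 3) / (1 - F(x) ** 3)

p = cond_prob(F(1, 2), 1, F(1, 4))
print(p)  # 8/9
```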
Example 14
Given the joint density function
f(x, y) = x(1 + 3y²)/4,  0 < x < 2, 0 < y < 1
          0,             elsewhere
Find g(x), h(y), f(x | y), and evaluate P(1/4 < X < 1/2 | Y = 1/3).
Solution
g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_0^1 x(1 + 3y²)/4 dy
     = [xy/4 + xy³/4]_{y=0}^{y=1} = x/2,  0 < x < 2

h(y) = ∫_{−∞}^{∞} f(x, y) dx = ∫_0^2 x(1 + 3y²)/4 dx
     = [x²/8 + 3x²y²/8]_{x=0}^{x=2} = (1 + 3y²)/2,  0 < y < 1

f(x | y) = f(x, y)/h(y) = [x(1 + 3y²)/4]/[(1 + 3y²)/2] = x/2,  0 < x < 2,

and

P(1/4 < X < 1/2 | Y = 1/3) = ∫_{1/4}^{1/2} (x/2) dx = 3/64
4
4. Statistical Independence
If f(x | y) does not depend on y, as is the case in Example 14, then f(x | y) = g(x) and
f(x, y) = g(x)h(y). The proof follows by substituting
f(x, y) = f(x | y) h(y)
into the marginal distribution of X. That is,
g(x) = ∫_{−∞}^{∞} f(x, y) dy = ∫_{−∞}^{∞} f(x | y) h(y) dy.
If f(x | y) does not depend on y, it may be taken outside the integral, and since
∫_{−∞}^{∞} h(y) dy = 1 we obtain
g(x) = f(x | y), and then f(x, y) = g(x)h(y).
If f(x | y) does not depend on y, then the outcome of the random variable Y has no impact on
the outcome of the random variable X. In other words, we say that X and Y are independent
random variables.
Definition 4.1
Let X and Y be two random variables, discrete or continuous, with joint probability
distribution f(x, y) and marginal distributions g(x) and h(y), respectively. The random
variables X and Y are said to be statistically independent if and only if
f(x, y) = g(x)h(y)
for all (x, y) within their range.
The continuous random variables of Example 14 are statistically independent, since the product
of the two marginal distributions gives the joint density function. This is obviously not the case,
however, for the continuous random variables of Example 13. Checking for statistical
independence of discrete random variables requires a more thorough investigation, since it is
possible to have the product of the marginal distributions equal to the joint probability
distribution for some but not all combinations of ( x, y ) . If you can find any point ( x, y ) for
which f(x, y) is defined such that f(x, y) ≠ g(x)h(y), the discrete variables are not
statistically independent.
Example 15
By referring to Example 2, is the number of blue refills in the sample independent of the number
of red refills? (Is X independent of Y?)
Example 16
Let
f(x, y) = 6xy²,  0 < x < 1, 0 < y < 1
          0,     elsewhere
Determine whether X and Y are statistically independent.
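One way to settle Example 16 is mechanical: compute both marginals of f(x, y) = 6xy² numerically and test whether f factors as g(x)h(y) on a grid of points (a Python sketch; the tolerance and grid points are arbitrary choices):

```python
def f(x, y):
    return 6 * x * y ** 2

def integrate01(fun, n=2000):
    """Midpoint-rule integral of fun over [0, 1]."""
    step = 1.0 / n
    return sum(fun((i + 0.5) * step) for i in range(n)) * step

def g(x):   # marginal of X: integrate f over y
    return integrate01(lambda y: f(x, y))

def h(y):   # marginal of Y: integrate f over x
    return integrate01(lambda x: f(x, y))

independent = all(
    abs(f(x, y) - g(x) * h(y)) < 1e-4
    for x in (0.1, 0.4, 0.7) for y in (0.2, 0.5, 0.9)
)
print(independent)  # True
```

Numerically g(x) ≈ 2x and h(y) ≈ 3y², so f(x, y) = g(x)h(y) and X and Y are independent.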
5. Covariance and Correlation
Definition 5.1
Let X and Y be random variables with joint probability distribution f(x, y). The covariance of
X and Y is
σ_XY = Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]
Note:
1. Cov(X, Y) is a measure of the nature of the association between the random variables
X and Y. If large values of X often result in large values of Y, or small values of X result in
small values of Y, then positive X − μ_X will tend to pair with positive Y − μ_Y, and negative
X − μ_X with negative Y − μ_Y. Thus, the product (X − μ_X)(Y − μ_Y) will tend to be positive. On
the other hand, if large X values often result in small Y values, the product
(X − μ_X)(Y − μ_Y) will tend to be negative.
2. The sign of the covariance indicates whether the relationship between two dependent
random variables is positive or negative.
3. When X and Y are statistically independent, Cov(X, Y) = 0. The converse is not
generally true: two variables may have zero covariance and still not be statistically
independent.
4. The covariance describes only the linear relationship between two random variables.
Therefore, if the covariance between X and Y is zero, X and Y may have a nonlinear
relationship, which means that they are not necessarily independent.
Theorem 5.1
If X and Y are random variables with means μ_X and μ_Y, respectively, the covariance of X
and Y is
Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = E[XY] − E[X]E[Y] = E[XY] − μ_X μ_Y
Proof
For the discrete case, we can write
σ_XY = Σ_x Σ_y (x − μ_X)(y − μ_Y) f(x, y)
= Σ_x Σ_y xy f(x, y) − μ_X Σ_x Σ_y y f(x, y) − μ_Y Σ_x Σ_y x f(x, y) + μ_X μ_Y Σ_x Σ_y f(x, y).
Since
μ_X = Σ_x Σ_y x f(x, y), μ_Y = Σ_x Σ_y y f(x, y), and Σ_x Σ_y f(x, y) = 1,
it follows that
σ_XY = E[XY] − μ_X μ_Y − μ_Y μ_X + μ_X μ_Y = E[XY] − μ_X μ_Y.
Example 17
Find the covariance between X and Y with joint probability function:
              Y
          0      1      4
      1   0.10   0.05   0.15
X     3   0.05   0.20   0.25
      5   0.15   0.00   0.05
Solution
E[XY] = Σ_x Σ_y xy f(x, y)
= (1)(0)f(1, 0) + (1)(1)f(1, 1) + (1)(4)f(1, 4) + (3)(0)f(3, 0) + (3)(1)f(3, 1) + (3)(4)f(3, 4)
  + (5)(0)f(5, 0) + (5)(1)f(5, 1) + (5)(4)f(5, 4)
= f(1, 1) + 4f(1, 4) + 3f(3, 1) + 12f(3, 4) + 5f(5, 1) + 20f(5, 4)
= 0.05 + 4(0.15) + 3(0.20) + 12(0.25) + 5(0) + 20(0.05)
= 5.25

Marginal distribution of X:
   x      1      3      5
  g(x)   0.30   0.50   0.20

μ_X = Σ_x x g(x) = (1)(0.30) + (3)(0.50) + (5)(0.20) = 2.8

Marginal distribution of Y:
   y      0      1      4
  h(y)   0.30   0.25   0.45

μ_Y = Σ_y y h(y) = (0)(0.30) + (1)(0.25) + (4)(0.45) = 2.05
σ_XY = E[XY] − μ_X μ_Y
     = 5.25 − (2.8)(2.05)
     = −0.49
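The table computation above is easy to mechanize: store the pmf with (x, y) tuples as keys and apply Theorem 5.1 directly (a Python sketch):

```python
# Joint pmf of Example 17, keyed by (x, y)
f = {
    (1, 0): 0.10, (1, 1): 0.05, (1, 4): 0.15,
    (3, 0): 0.05, (3, 1): 0.20, (3, 4): 0.25,
    (5, 0): 0.15, (5, 1): 0.00, (5, 4): 0.05,
}

E_XY = sum(x * y * p for (x, y), p in f.items())
E_X = sum(x * p for (x, _), p in f.items())   # same as summing x*g(x)
E_Y = sum(y * p for (_, y), p in f.items())   # same as summing y*h(y)
cov = E_XY - E_X * E_Y                        # Theorem 5.1

print(round(E_XY, 2), round(E_X, 2), round(E_Y, 2))  # 5.25 2.8 2.05
print(round(cov, 2))                                 # -0.49
```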
Example 18
The fraction X of male runners and the fraction Y of female runners who compete in marathon
races are described by the joint density function
f(x, y) = 8xy,  0 < y < x < 1
          0,    elsewhere
Find the covariance of X and Y.
Solution
g(x) = 4x³,  0 < x < 1
       0,    elsewhere

h(y) = 4y(1 − y²),  0 < y < 1
       0,           elsewhere

μ_X = E[X] = ∫_0^1 4x⁴ dx = 4/5

μ_Y = E[Y] = ∫_0^1 4y²(1 − y²) dy = 8/15

E[XY] = ∫_0^1 ∫_y^1 8x²y² dx dy = 4/9

σ_XY = E[XY] − μ_X μ_Y = 4/9 − (4/5)(8/15) = 4/225
Example 19
Let X and Y be discrete random variables with the joint probability distribution shown below.
Show that X and Y are dependent but have zero covariance.

              Y
         −1     0      1
    −1   1/16   3/16   1/16
X    0   3/16   0      3/16
     1   1/16   3/16   1/16
Solution
E[XY] = Σ_x Σ_y xy f(x, y)
= (−1)(−1)f(−1, −1) + (−1)(1)f(−1, 1) + (1)(−1)f(1, −1) + (1)(1)f(1, 1)
= 1/16 − 1/16 − 1/16 + 1/16 = 0

Marginal distribution of X:
   x     −1     0      1
  g(x)   5/16   6/16   5/16

μ_X = Σ_x x g(x) = (−1)(5/16) + (1)(5/16) = 0

Marginal distribution of Y:
   y     −1     0      1
  h(y)   5/16   6/16   5/16

μ_Y = Σ_y y h(y) = (−1)(5/16) + (1)(5/16) = 0

σ_XY = E[XY] − μ_X μ_Y = 0 − 0 = 0

However, g(−1)h(−1) = (5/16)(5/16) = 25/256 but f(−1, −1) = 1/16. Hence, X and Y are not
independent, since f(−1, −1) ≠ g(−1)h(−1).
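The same bookkeeping in code makes the moral of Example 19 concrete: the covariance vanishes even though the pmf does not factor (a Python sketch; the cell values are the ones implied by the marginal totals in the solution above):

```python
from fractions import Fraction as F

# pmf of Example 19: corner cells 1/16, edge cells 3/16, centre cell 0
f = {}
for x in (-1, 0, 1):
    for y in (-1, 0, 1):
        if x != 0 and y != 0:
            f[(x, y)] = F(1, 16)      # four corner cells
        elif (x, y) == (0, 0):
            f[(x, y)] = F(0)
        else:
            f[(x, y)] = F(3, 16)      # four edge cells

E_XY = sum(x * y * p for (x, y), p in f.items())
g = {x: sum(f[(x, y)] for y in (-1, 0, 1)) for x in (-1, 0, 1)}
h = {y: sum(f[(x, y)] for x in (-1, 0, 1)) for y in (-1, 0, 1)}

print(E_XY)                        # 0  (and E[X] = E[Y] = 0, so cov = 0)
print(f[(-1, -1)], g[-1] * h[-1])  # 1/16 25/256 -> not independent
```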
Although the covariance between two random variables does provide information regarding
the nature of the relationship, the magnitude of σ_XY does not indicate anything regarding the
strength of the relationship, since σ_XY is not scale-free. Its magnitude will depend on the units
used to measure both X and Y. There is a scale-free version of the covariance called the
correlation coefficient that is used widely in statistics.
Definition 5.2
Let X and Y be random variables with covariance σ_XY and standard deviations σ_X and σ_Y,
respectively. The correlation coefficient of X and Y is
ρ_XY = σ_XY/(σ_X σ_Y), where −1 ≤ ρ_XY ≤ 1.
ρ_XY is free of the units of X and Y. It assumes a value of zero when σ_XY = 0. When there is an
exact linear dependency, say Y = a + bX, ρ_XY = 1 if b > 0 and ρ_XY = −1 if b < 0.
Example 20
Find the correlation coefficient between X and Y in Example 17.
Solution
E[X²] = (1²)(0.30) + (3²)(0.50) + (5²)(0.20) = 9.8
E[Y²] = (0²)(0.30) + (1²)(0.25) + (4²)(0.45) = 7.45
σ_X² = E[X²] − μ_X² = 9.8 − 2.8² = 1.96
σ_Y² = E[Y²] − μ_Y² = 7.45 − 2.05² = 3.2475
ρ_XY = σ_XY/(σ_X σ_Y) = −0.49/(√1.96 · √3.2475) = −0.49/[(1.4)(1.8021)] = −0.1942
Example 21
Find the correlation coefficient of X and Y in Example 18.
Solution
E[X²] = ∫_0^1 4x⁵ dx = 2/3
E[Y²] = ∫_0^1 4y³(1 − y²) dy = 1 − 2/3 = 1/3
σ_X² = E[X²] − μ_X² = 2/3 − (4/5)² = 2/75
σ_Y² = E[Y²] − μ_Y² = 1/3 − (8/15)² = 11/225
ρ_XY = σ_XY/(σ_X σ_Y) = (4/225)/√[(2/75)(11/225)] = 4/√66 = 0.4924
Note that although the covariance in Example 20 is larger in magnitude (disregarding the sign)
than that in Example 21, the relationship of the magnitudes of the correlation coefficients in
these two examples is just the reverse. This is evidence that we cannot look at the magnitude
of the covariance to decide on how strong the relationship is.
Theorem 5.2
E[X + Y] = E[X] + E[Y]
If X represents the daily production of some item from machine A and Y the daily production of
the same kind of item from machine B, then X + Y represents the total number of items produced
daily by both machines. Theorem 5.2 states that the average daily production for both machines
is equal to the sum of the average daily production of each machine.
Theorem 5.3
Let X and Y be two independent variables. Then
E XY = E X E Y
Proof
E[XY] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dx dy
      = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy g(x)h(y) dx dy   (by independence)
      = [∫_{−∞}^{∞} x g(x) dx][∫_{−∞}^{∞} y h(y) dy]
      = E[X]E[Y]
Theorem 5.3 can be illustrated for discrete variables by considering the experiment of tossing
a green die and a red die. Let the random variable X represent the outcome on the green die and
the random variable Y represent the outcome on the red die. Then XY represents the product of
the numbers that occur on the pair of dice. In the long run, the average of the products of the
numbers is equal to the product of the average number that occurs on the green die and the
average of the number that occurs on the red die.
Theorem 5.4
If X and Y are independent random variables, then
σ_XY = 0
Proof:
σ_XY = E[XY] − μ_X μ_Y = E[X]E[Y] − μ_X μ_Y   (by Theorem 5.3)
     = μ_X μ_Y − μ_X μ_Y = 0
Theorem 5.5
If X and Y are random variables with joint probability distribution f ( x, y) , then
Var(aX + bY + c) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
Proof:
Var(aX + bY + c) = E[(aX + bY + c − aμ_X − bμ_Y − c)²]
= E[(a(X − μ_X) + b(Y − μ_Y))²]
= a² E[(X − μ_X)²] + 2ab E[(X − μ_X)(Y − μ_Y)] + b² E[(Y − μ_Y)²]
= a² σ_X² + 2ab σ_XY + b² σ_Y²
Corollary:
Corollaries 1 to 3 state that the variance is unchanged if a constant is added to or subtracted from
a random variable. The addition or subtraction of a constant simply shifts the values of X to the
right or to the left but does not change their variability. However, if a random variable is
multiplied or divided by a constant, then Corollary 1 and 3 state that the variance is multiplied
or divided by the square of the constant.
The result stated in Corollary 4 is obtained from Theorem 5.5 by invoking Theorem 5.4.
Corollary 5 follows when b in Corollary 4 is replaced by -b.
Generalizing to a linear combination of n independent random variables, we have Corollary 6.
Example 22
Suppose E[X] = -3, E[X2] = 13, Var [Y] = 20, E[Y] = 4, and E [XY] = 7. Find Var [5X – 9Y].
Solution
Var[X] = E[X²] − (E[X])² = 13 − (−3)² = 4
Cov[X, Y] = E[XY] − E[X]E[Y] = 7 − (−3)(4) = 19
Var[5X − 9Y] = 25 Var[X] + 81 Var[Y] − 2(5)(9) Cov[X, Y]
= 25(4) + 81(20) − 90(19) = 100 + 1620 − 1710 = 10
Example 23
If X and Y are random variables with variances σ_X² = 2 and σ_Y² = 4 and covariance
σ_XY = −2, find the variance of the random variable Z = 3X − 4Y + 8.
Solution
Var(3X − 4Y + 8) = 9 Var(X) + 16 Var(Y) − 2(3)(4) Cov(X, Y)
= (9)(2) + (16)(4) − (24)(−2) = 130
Example 24
Let X and Y denote the amounts of two different types of impurities in a batch of a certain
chemical product. Suppose that X and Y are independent random variables with variances
σ_X² = 2 and σ_Y² = 3. Find the variance of the random variable Z = 3X − 2Y + 5.
Solution
Var(3 X − 2Y + 5) = 9Var(X ) + 4Var(Y )
= ( 9 )( 2 ) + ( 4 )( 3) = 30
6. The Multinomial Distribution
In general, if a given trial can result in any one of k possible outcomes E1, E2, …, Ek with
probabilities p1 , p2 ,..., pk , then the multinomial distribution will give the probability that E1
occurs x1 times, E2 occurs x2 times, …, and Ek occurs xk times in n independent trials, where
x1 + x2 + ... + xk = n
We shall denote this joint probability distribution by
f ( x1 , x2 ,..., xk ; p1 , p2 ,..., pk , n ) ,
where p1 + p2 + ... + pk = 1 , since the result of each trial must be one of the k possible outcomes.
The following shows the multinomial distribution.
Multinomial Distribution
If a given trial can result in the k outcomes E1, E2, …, Ek with probabilities p1 , p2 ,..., pk , then
the probability distribution of the random variables X 1 , X 2 ,..., X k representing the number of
occurrences for E1, E2, …, Ek in n independent trials, is
f(x1, x2, …, xk; p1, p2, …, pk, n) = [n!/(x1! x2! ⋯ xk!)] p1^x1 p2^x2 ⋯ pk^xk,
with
Σ_{i=1}^{k} xi = n  and  Σ_{i=1}^{k} pi = 1
Example 25
A certain city has 3 newspapers, A, B, and C. Newspaper A has 50 percent of the readers in
the city. Newspaper B, has 30 percent of the readers, and newspaper C has the remaining 20
percent. Find the probability that, among 8 randomly chosen readers in that city, 5 will read
newspaper A, 2 will read newspaper B, and 1 will read newspaper C. (assume no one reads
more than one newspaper)
Solution
f(5, 2, 1; 0.5, 0.3, 0.2, 8) = [8!/(5! 2! 1!)] (0.5)⁵ (0.3)² (0.2)¹
= 0.0945
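The multinomial pmf is short to code from the boxed formula, and reproduces the 0.0945 above (a Python sketch; multinomial_pmf is a hypothetical helper name):

```python
from math import factorial

def multinomial_pmf(xs, ps):
    """f(x1,...,xk; p1,...,pk, n) = n!/(x1!...xk!) * p1^x1 * ... * pk^xk."""
    n = sum(xs)
    coef = factorial(n)
    for x in xs:
        coef //= factorial(x)        # build the multinomial coefficient
    prob = float(coef)
    for x, p in zip(xs, ps):
        prob *= p ** x
    return prob

# Example 25: 8 readers split 5/2/1 among papers with shares 0.5/0.3/0.2
p = multinomial_pmf([5, 2, 1], [0.5, 0.3, 0.2])
print(round(p, 4))  # 0.0945
```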
Example 26
The complexity of arrivals and departures of planes at an airport is such that computer
simulation is often used to model the “ideal” conditions. For a certain airport with three
runways, it is known that in the ideal setting the following are the probabilities that the
individual runways are accessed by a randomly arriving commercial jet:
Runway 1: p1 = 2/9
Runway 2: p2 = 1/6
Runway 3: p3 = 11/18
What is the probability that 6 randomly arriving airplanes are distributed in the following
fashion?
Runway 1: 2 airplanes
Runway 2: 1 airplane
Runway 3: 3 airplanes
Solution
f(2, 1, 3; 2/9, 1/6, 11/18, 6) = [6!/(2! 1! 3!)] (2/9)² (1/6)¹ (11/18)³
= 0.1127
7. Conditional Expectations
Definition 7.1
If X and Y are any two random variables, the conditional expectation of g(X), given that
Y = y, is
E[g(X) | Y = y] = ∫_{−∞}^{∞} g(x) f(x | y) dx  if X and Y are jointly continuous, and
E[g(X) | Y = y] = Σ_x g(x) f(x | y)  if X and Y are discrete.
Taking g(X) = X and averaging over the distribution of Y gives the useful identity
E[X] = E[E[X | Y]]   (7.1)
which, when Y is discrete, can be written as
E[X] = Σ_y E[X | Y = y] P(Y = y)   (7.2)
Proof
Σ_y E[X | Y = y] P(Y = y) = Σ_y Σ_x x P(X = x | Y = y) P(Y = y)
= Σ_y Σ_x x [P(X = x, Y = y)/P(Y = y)] P(Y = y)
= Σ_y Σ_x x P(X = x, Y = y)
= Σ_x x Σ_y P(X = x, Y = y)
= Σ_x x P(X = x)
= E[X]
Example 27
A quality control plan for an assembly line involves sampling n =10 finished items per day and
counting Y, the number of defectives. If p denotes the probability of observing a defective, then
Y has a binomial distribution, assuming a large number of items are produced by the line. But
p varies from day to day and is assumed to have a uniform distribution on the interval from 0
to ¼. Find the expected value of Y.
Solution
Y ~ Bin(10, p), where p ~ Uniform(0, 1/4)

E[Y | p] = np = 10p

f(p) = 4,  0 < p < 1/4
       0,  elsewhere

E[Y] = ∫_{−∞}^{∞} E[Y | p] f(p) dp
     = ∫_0^{1/4} (10p)(4) dp
     = [20p²]_0^{1/4}
     = 20/16 = 5/4
Example 28
A professor works in Moon Township and lives in Pittsburgh. It is about a 25 mile commute.
The professor randomly chooses from 3 different routes home in a futile attempt to evade rush
hour traffic. The routes are identified by the name of a major bridge along the way. The
professor has accumulated data over a lengthy period of time on the mean drive times of the
three routes. Using the data summary given below, find the professor's expected drive time
home.

Route            Probability of Route   Expected Time of Route (in minutes)
Wickle bridge    0.2                    55
Fort bridge      0.4                    50
Liberty bridge   0.4                    45
Solution
Let X be the drive time, and let
Y = 1 if route Wickle bridge is chosen,
Y = 2 if route Fort bridge is chosen,
Y = 3 if route Liberty bridge is chosen.

E[X] = Σ_y E[X | Y = y] P(Y = y)
     = (55)(0.2) + (50)(0.4) + (45)(0.4)
     = 49 minutes
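With the route table encoded as a dictionary, the final line is a one-line weighted sum (a Python sketch):

```python
# Example 28: (P(Y = y), E[X | Y = y]) for each route
routes = {
    "Wickle bridge":  (0.2, 55),
    "Fort bridge":    (0.4, 50),
    "Liberty bridge": (0.4, 45),
}

# E[X] = sum over y of E[X | Y = y] * P(Y = y)
E_X = sum(p * t for p, t in routes.values())
print(round(E_X, 2))  # 49.0 minutes
```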