Module 5 EE Data Analysis
Module no. 5
Joint Probability Distributions
Introduction:
The theoretical development since Module 3 has primarily concerned the probability
distribution of a single random variable. However, many of the problems discussed in
Module 2 had two or more distinct aspects of interest; it would be awkward to attempt to
describe their sample spaces using a single random variable. In this module, we
develop the theory for distributions of multiple random variables, called joint probability
distributions. For simplicity, we begin by considering random experiments in which only
two random variables are studied. In later sections, we generalize the presentation to
the joint probability distribution of more than two random variables.
Objectives:
At the end of this module, the students should be able to
1. Apply joint probability mass functions and joint probability density functions to
calculate probabilities and calculate marginal probability distributions from joint
probability distributions.
2. Calculate conditional probability distributions from joint probability distributions
and assess independence of random variables.
3. Interpret and calculate covariances and correlations between random variables.
4. Calculate means and variances for linear functions of random variables and
calculate probabilities for linear functions of normally distributed random
variables.
5. Determine the distribution of a general function of a random variable.
Pre – Test
Module 5 – Joint Probability Distributions
Name: Subject:
Course/Section: Date:
Direction: Solve the following problems. Write your solutions on a clean sheet of paper.
1. Let X denote the sum of the points in two tosses of a fair die.
a. Find the probability distribution and the events corresponding to the values of X.
b. Obtain the cdf of X.
c. Find .
2. Suppose the random variables X and Y have joint pdf
Learning Activities:
5.1 Joint Probability Distributions for Two Random Variables
If X and Y are two random variables, the probability distribution that defines their
simultaneous behavior is called a joint probability distribution.
Joint Probability Mass Function. If X and Y are discrete random variables, the joint
probability distribution of X and Y is a description of the set of points (x, y) in the range
of (X, Y) along with the probability of each point. Also, P(X = x and Y = y) is usually
written as P(X = x, Y = y). The joint probability distribution of two random variables is
sometimes referred to as the bivariate probability distribution or bivariate distribution of
the random variables. One way to describe the joint probability distribution of two
discrete random variables is through a joint probability mass function f_{XY}(x, y) = P(X = x, Y = y).
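To make this definition concrete, the minimal Python sketch below (with made-up probabilities, not values from any example in this module) stores a joint probability mass function as a table of (x, y) points, checks that the probabilities sum to 1, and computes a joint probability and the two marginal probability mass functions.

```python
# Minimal sketch of a joint probability mass function for two discrete
# random variables X and Y. The probabilities below are hypothetical.
from collections import defaultdict

f_xy = {  # f_XY(x, y) = P(X = x, Y = y)
    (1, 1): 0.10, (1, 2): 0.20,
    (2, 1): 0.25, (2, 2): 0.15,
    (3, 1): 0.05, (3, 2): 0.25,
}

# Property: the probabilities over the whole range sum to 1.
assert abs(sum(f_xy.values()) - 1.0) < 1e-12

# A joint probability, e.g. P(X >= 2, Y = 1), is a sum over the qualifying points.
p = sum(prob for (x, y), prob in f_xy.items() if x >= 2 and y == 1)
print("P(X >= 2, Y = 1) =", p)

# Marginal pmfs: f_X(x) = sum over y of f_XY(x, y), and similarly for f_Y(y).
f_x, f_y = defaultdict(float), defaultdict(float)
for (x, y), prob in f_xy.items():
    f_x[x] += prob
    f_y[y] += prob
print("f_X:", dict(f_x))
print("f_Y:", dict(f_y))
```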
Joint Probability Density Function. The joint probability distribution of two continuous
random variables X and Y can be specified by providing a method for calculating the
probability that X and Y assume a value in any region R of two-dimensional space.
Analogous to the probability density function of a single continuous random variable, a
joint probability density function f_{XY}(x, y) can be defined over two-dimensional space. The double
integral of f_{XY}(x, y) over a region R provides the probability that (X, Y) assumes a value
in R.
A joint probability density function for X and Y is shown in Figure 5.2. The probability
that (X, Y) assumes a value in the region R equals the volume of the shaded region in
Figure 5.2. In this manner, a joint probability density function is used to determine
probabilities for X and Y.
Let the random variable X denote the time (in milliseconds) until a computer server
connects to your machine, and let Y denote the time until the server authorizes you as a
valid user. Each of these random variables measures the wait from a common starting
time, and X < Y. Assume that the joint probability density function for X and Y is
f_{XY}(x, y) = 6 \times 10^{-6} e^{-0.001x - 0.002y} for 0 < x < y, and f_{XY}(x, y) = 0 otherwise.
Reasonable assumptions can be used to develop such a distribution, but for now, our
focus is on only the joint probability density function.
The region with nonzero probability is shaded in Figure 5.3. The property that this
joint probability density function integrates to 1 can be verified by the integral of f_{XY}(x, y)
over this region as follows:

\int_{0}^{\infty} \int_{x}^{\infty} 6 \times 10^{-6} e^{-0.001x - 0.002y} \, dy \, dx
= \int_{0}^{\infty} 6 \times 10^{-6} \left( \frac{e^{-0.002x}}{0.002} \right) e^{-0.001x} \, dx
= 0.003 \int_{0}^{\infty} e^{-0.003x} \, dx = 0.003 \left( \frac{1}{0.003} \right) = 1
Figure 5.3
The joint probability density function of X and Y is nonzero over the shaded region.
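Assuming the density reconstructed above for this example, the following minimal sketch repeats the same verification numerically with SciPy; it is an illustration, not part of the original example.

```python
# Numerical check that the joint density integrates to 1 over the region
# 0 < x < y (density as reconstructed above for the server-time example).
from scipy import integrate
import numpy as np

def f_xy(y, x):  # dblquad integrates the inner variable (y) first
    return 6e-6 * np.exp(-0.001 * x - 0.002 * y)

# Integrate y from x to infinity, then x from 0 to infinity.
total, _ = integrate.dblquad(f_xy, 0, np.inf, lambda x: x, lambda x: np.inf)
print(total)  # approximately 1.0
```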
A probability that involves only one random variable, say, for example, P(a < X < b), can be
found from the marginal probability distribution of X or from the integral of the joint
probability distribution of X and Y as

P(a < X < b) = \int_{a}^{b} f_X(x) \, dx = \int_{a}^{b} \left[ \int_{-\infty}^{\infty} f_{XY}(x, y) \, dy \right] dx

Therefore, a probability involving only X can always be computed from the marginal
probability density function f_X(x) = \int_{-\infty}^{\infty} f_{XY}(x, y) \, dy.
Figure 5.6 The region of integration for the requested probability is darkly shaded, and
it is partitioned into two regions.
In Figure 5.5, the marginal probability distributions of X and Y are used to obtain the
means as E(X) = \sum_{x} x f_X(x) and E(Y) = \sum_{y} y f_Y(y).
Note that the conditional probabilities f_{Y|1}(y) = P(Y = y | X = 1) sum to 1. This set of
probabilities defines the conditional probability distribution of Y given that X = 1.
Example 5.5 illustrates that the conditional probabilities for Y given that X = x can
be thought of as a new probability distribution called the conditional probability mass
function for Y given X = x. For Example 5.5, the conditional probability mass function
for Y given that X = 1 consists of the four probabilities f_{Y|1}(1), f_{Y|1}(2), f_{Y|1}(3), f_{Y|1}(4).
The following definition applies these concepts to continuous random variables.
The conditional probability density function f_{Y|x}(y) provides the conditional probabilities for the
values of Y given that X = x.
Practical Interpretation: If the connect time is 1500 ms, then the expected time to be
authorized is 2000 ms.
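A minimal numerical sketch of this interpretation, again assuming the joint density reconstructed above for the server-time example: the conditional density of Y given X = x is f_{XY}(x, y)/f_X(x) for y > x, and its mean at x = 1500 should come out near 2000 ms.

```python
# Sketch: conditional mean E(Y | X = 1500) for the reconstructed server-time
# density f_XY(x, y) = 6e-6 * exp(-0.001x - 0.002y), y > x > 0.
from scipy import integrate
import numpy as np

x = 1500.0
joint = lambda y: 6e-6 * np.exp(-0.001 * x - 0.002 * y)

# Marginal density of X at x: integrate the joint density over y > x.
f_x, _ = integrate.quad(joint, x, np.inf)

# Conditional density f_{Y|x}(y) = f_XY(x, y) / f_X(x); its mean is E(Y | X = x).
mean_y_given_x, _ = integrate.quad(lambda y: y * joint(y) / f_x, x, np.inf)
print(mean_y_given_x)  # approximately 2000 ms
```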
For the discrete random variables in Example 5.1, the conditional mean of Y given
X = 1 is obtained from the conditional distribution in Example 5.5:

E(Y | X = 1) = \mu_{Y|1} = \sum_{y} y f_{Y|1}(y)

The conditional mean is interpreted as the expected response time given that one
bar of signal is present. The conditional variance of Y given X = 1 is

V(Y | X = 1) = \sigma^{2}_{Y|1} = \sum_{y} (y - \mu_{Y|1})^{2} f_{Y|1}(y)
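The specific numbers of Example 5.5 are not reproduced here, so the sketch below applies the same formulas to a hypothetical joint probability mass function: it forms the conditional pmf of Y given X = x and then the conditional mean and variance.

```python
# Sketch: conditional pmf, conditional mean, and conditional variance of Y
# given X = x, computed from a joint pmf. Probabilities are hypothetical.
f_xy = {
    (1, 1): 0.10, (1, 2): 0.20,
    (2, 1): 0.25, (2, 2): 0.15,
    (3, 1): 0.05, (3, 2): 0.25,
}

x0 = 1
f_x0 = sum(p for (x, y), p in f_xy.items() if x == x0)         # marginal P(X = x0)
cond = {y: p / f_x0 for (x, y), p in f_xy.items() if x == x0}  # f_{Y|x0}(y)

mu = sum(y * p for y, p in cond.items())                       # E(Y | X = x0)
var = sum((y - mu) ** 2 * p for y, p in cond.items())          # V(Y | X = x0)
print(cond, mu, var)
```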
The conditional probability mass function f_{Y|x}(y) is shown in Figure 5.8(b). Notice that
f_{Y|x}(y) = f_Y(y) for any x. That is, knowledge of whether or not the bill has errors does
not change the probability of the number of X-rays listed on the bill.
Figure 5.8 (a) Joint and marginal probability distributions of X and Y. (b) Conditional
probability distribution of Y given X = x.
Rectangular Range for (X, Y). Let R denote the set of points in two-dimensional space
that receive positive probability under f_{XY}(x, y). If R is not rectangular, then X and Y are
not independent because knowledge of X can restrict the range of values of Y that
receive positive probability. If R is rectangular, independence is possible but not
demonstrated. One of the conditions in Equation 5.8 must still be verified.
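The factorization condition can be checked mechanically. The sketch below (with hypothetical probabilities on a rectangular range) tests f_{XY}(x, y) = f_X(x) f_Y(y) at every point of positive probability; a single failing point would be enough to conclude that X and Y are not independent.

```python
# Sketch: check independence of discrete X and Y by testing
# f_XY(x, y) == f_X(x) * f_Y(y) at every point (hypothetical pmf).
from collections import defaultdict

f_xy = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}  # rectangular range

f_x, f_y = defaultdict(float), defaultdict(float)
for (x, y), p in f_xy.items():
    f_x[x] += p
    f_y[y] += p

independent = all(abs(p - f_x[x] * f_y[y]) < 1e-12 for (x, y), p in f_xy.items())
print("independent:", independent)  # True for this particular table
```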
5.3 Joint Probability Distributions for More Than Two Random Variables
More than two random variables can be defined in a random experiment. Results for
multiple random variables are straightforward extensions of those for two random
variables.
What is the probability that the device operates for more than 1000 hours without
any failures? The requested probability is that every component lifetime exceeds 1000
hours, which equals the multiple integral of the joint probability density function over the
region in which each x_i > 1000. The joint probability density function can be
written as a product of exponential functions, and each integral is the simple integral of
an exponential function. Therefore, the requested probability is a product of exponential
terms, one for each component.
Suppose that the joint probability density function of several continuous random
variables is a constant c over a region R (and zero elsewhere). In this special case, the
constant must equal the reciprocal of the volume of R so that the density integrates to 1.
When the joint probability density function is constant, the probability that the random
variables assume a value in a region R_0 is just the ratio of the volume of the region
R_0 \cap R to the volume of the region R for which the probability is positive.
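As a quick illustration of this volume-ratio idea, the sketch below assumes X and Y are uniform over the unit square (a hypothetical case, not an example from the module) and estimates P(X + Y < 0.5) by Monte Carlo; the estimate agrees with the ratio of the triangle's area to the square's area.

```python
# Sketch: for a constant joint density over a region R, probabilities are
# area (volume) ratios. Here R is the unit square and the event is X + Y < 0.5.
import random

random.seed(1)
n = 200_000
hits = sum(1 for _ in range(n) if random.random() + random.random() < 0.5)
print("Monte Carlo estimate:", hits / n)   # approximately 0.125
print("Area ratio:", 0.5 * 0.5 * 0.5 / 1)  # triangle area / square area = 0.125
```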
Figure 5.9 Joint probability distribution of X and Y. Points are equally likely.
Similar to the result for only two random variables, independence implies that Equation
5.13 holds for all (x_1, x_2, \ldots, x_p). If we find one point for which the equality fails,
X_1, X_2, \ldots, X_p are not independent. It is left as an exercise to show that if X_1, X_2, \ldots, X_p are
independent,

P(X_1 \in A_1, X_2 \in A_2, \ldots, X_p \in A_p) = P(X_1 \in A_1) P(X_2 \in A_2) \cdots P(X_p \in A_p)

for any regions A_1, A_2, \ldots, A_p in the range of X_1, X_2, \ldots, X_p, respectively.
That is, E[h(X, Y)] can be thought of as the weighted average of h(x, y) for each point in
the range of (X, Y). The value of E[h(X, Y)] represents the average value of h(X, Y) that
is expected in a long sequence of repeated trials of the random experiment. The covariance
between X and Y is

cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y

The covariance is defined for both continuous and discrete random variables by the
same formula.
Figure 5.10 Joint probability distributions and the sign of covariance between X and Y.
Figure 5.10 assumes all points in the joint probability distribution of X and Y are equally
likely and shows examples of pairs of random variables with positive, negative, and zero
covariance.
Covariance is a measure of the linear relationship between the random variables. If the
relationship between the random variables is nonlinear, the covariance might not be
sensitive to the relationship. This is illustrated in Figure 5.10(d): the only points with
nonzero probability are the points on the circle, and although there is an identifiable
relationship between the variables, the covariance is zero.
The equality of the two expressions for covariance in Equation 5.15 is shown for
continuous random variables as follows. By writing the expectations as integrals,

E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f_{XY}(x, y) \, dx \, dy
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} [xy - \mu_X y - x \mu_Y + \mu_X \mu_Y] f_{XY}(x, y) \, dx \, dy

Now

\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \mu_X y \, f_{XY}(x, y) \, dx \, dy = \mu_X \left[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} y \, f_{XY}(x, y) \, dx \, dy \right] = \mu_X \mu_Y

and, similarly, the integral of x \mu_Y f_{XY}(x, y) equals \mu_X \mu_Y. Therefore,

E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y = E(XY) - \mu_X \mu_Y
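The same equality can also be checked numerically for a discrete distribution. The sketch below evaluates both expressions for a hypothetical joint probability mass function.

```python
# Sketch: the two covariance expressions agree. Hypothetical joint pmf.
f_xy = {
    (1, 1): 0.10, (1, 2): 0.20,
    (2, 1): 0.25, (2, 2): 0.15,
    (3, 1): 0.05, (3, 2): 0.25,
}

E = lambda h: sum(h(x, y) * p for (x, y), p in f_xy.items())  # E[h(X, Y)]
mu_x, mu_y = E(lambda x, y: x), E(lambda x, y: y)

cov_1 = E(lambda x, y: (x - mu_x) * (y - mu_y))   # definition of covariance
cov_2 = E(lambda x, y: x * y) - mu_x * mu_y       # shortcut formula
print(cov_1, cov_2)  # identical up to rounding
```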
Example 5.17
In Example 5.1, the random variables X and Y are the number of signal bars and the
response time (to the nearest second), respectively. Interpret the covariance between
X and Y as positive or negative.
As the signal bars increase, the response time tends to decrease. Therefore, X and Y
have a negative covariance. The covariance was calculated to be −0.5815 in Example
5.16.
There is another measure of the relationship between two random variables that is
often easier to interpret than the covariance.
Two random variables with nonzero correlation are said to be correlated. Similar to
covariance, the correlation is a measure of the linear relationship between random
variables.
For independent random variables, we do not expect any relationship in their joint
probability distribution; indeed, if X and Y are independent random variables, then
\sigma_{XY} = \rho_{XY} = 0.
However, if the correlation between two random variables is zero, we cannot conclude
that the random variables are independent. Figure 5.10(d) provides an example.
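The sketch below mimics Figure 5.10(d) with four equally likely points on a circle: the covariance (and hence the correlation) is zero even though the variables are clearly dependent.

```python
# Sketch: zero covariance does not imply independence. Four equally likely
# points on a circle (as in Figure 5.10(d)): X and Y are dependent, yet cov = 0.
points = [(1, 0), (0, 1), (-1, 0), (0, -1)]
p = 1 / len(points)

mu_x = sum(x * p for x, y in points)
mu_y = sum(y * p for x, y in points)
cov = sum((x - mu_x) * (y - mu_y) * p for x, y in points)
print("covariance:", cov)  # 0.0, yet knowing X = 1 forces Y = 0 (dependence)
```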
By using Equation 5.11 for each of the terms in this expression, we obtain the following.
Note that the result for the variance in Equation 5.28 requires the random variables to
be independent.
Consequently, the standard deviation of the thickness of the final product is \sqrt{95} = 9.75
nm, and this shows how the variation in each layer is propagated to the final product.
The conclusion for \bar{X} is obtained as follows. Using Equation 5.28 with c_i = 1/n and
V(X_i) = \sigma^2 yields

V(\bar{X}) = (1/n)^2 \sigma^2 + \cdots + (1/n)^2 \sigma^2 = \sigma^2 / n

The mean and variance of Y follow from Equations 5.26 and 5.28.
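The result for \bar{X} can be checked by simulation. The sketch below uses an arbitrary normal distribution for the X_i (any distribution with finite variance would do) and compares the simulated mean and variance of \bar{X} with \mu and \sigma^2/n.

```python
# Sketch: for independent X_1, ..., X_n with mean mu and variance sigma^2,
# the sample mean has E(Xbar) = mu and V(Xbar) = sigma^2 / n.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 3.0, 25, 100_000

xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
print("mean of Xbar ~", xbar.mean())     # close to mu = 10
print("variance of Xbar ~", xbar.var())  # close to sigma^2 / n = 0.36
```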
We now consider the situation in which the random variable X is continuous and Y = h(X)
defines a one-to-one transformation. Because the integral gives the probability that Y \le a
for all values of a contained in the feasible set of values for Y, its derivative must be the
probability density of Y. Therefore, the probability distribution of Y is

f_Y(y) = f_X[u(y)] \, |J|

where x = u(y) is the solution of y = h(x) in terms of y and J = u'(y) is the Jacobian of the
transformation.
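As a hypothetical illustration of this result (not an example from the module), take X uniform on (0, 1) and Y = X^2; then u(y) = \sqrt{y} and f_Y(y) = 1/(2\sqrt{y}) on 0 < y < 1, so F_Y(y) = \sqrt{y}. The sketch below compares the analytic value of P(Y \le 0.25) with a simulation.

```python
# Sketch of the change-of-variable result f_Y(y) = f_X(u(y)) * |u'(y)|.
# Hypothetical example: X ~ Uniform(0, 1) and Y = X^2, so u(y) = sqrt(y),
# f_Y(y) = 1 / (2*sqrt(y)) and F_Y(y) = sqrt(y) on 0 < y < 1.
import numpy as np

rng = np.random.default_rng(0)
y = rng.uniform(0, 1, 200_000) ** 2  # simulate Y = X^2

print((y <= 0.25).mean())            # simulated P(Y <= 0.25), about 0.5
print(np.sqrt(0.25))                 # analytic  P(Y <= 0.25) = F_Y(0.25)
```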
Self-Evaluation:
1. Let the random variables X and Y have joint distribution
Review of Concepts:
1. Joint Probability Distributions for Two Random Variables
a. If X and Y are two random variables, the probability distribution that defines
their simultaneous behavior is called a joint probability distribution.
b. Joint Probability Mass Function:
The joint probability mass function of the discrete random variables X and Y,
denoted as f_{XY}(x, y), satisfies
(1) f_{XY}(x, y) \ge 0
(2) \sum_{x} \sum_{y} f_{XY}(x, y) = 1
(3) f_{XY}(x, y) = P(X = x, Y = y)
c. Joint Probability Density Function:
A joint probability density function for the continuous random variables X and
Y, denoted as f_{XY}(x, y), satisfies the following properties:
(1) f_{XY}(x, y) \ge 0 for all x, y
(2) \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{XY}(x, y) \, dx \, dy = 1
(3) For any region R of two-dimensional space,
P((X, Y) \in R) = \int\int_{R} f_{XY}(x, y) \, dx \, dy
d. Independence:
For random variables X and Y, if any one of the following properties is true,
the others are also true, and X and Y are independent.
(1) f_{XY}(x, y) = f_X(x) f_Y(y) for all x and y
(2) f_{Y|x}(y) = f_Y(y) for all x and y with f_X(x) > 0
(3) f_{X|y}(x) = f_X(x) for all x and y with f_Y(y) > 0
(4) P(X \in A, Y \in B) = P(X \in A) P(Y \in B) for any sets A and B in the
range of X and Y, respectively.
3. Joint Probability Distributions for More Than Two Random Variables
a. Joint Probability Density Function:
A joint probability density function for the continuous random variables
X_1, X_2, \ldots, X_p, denoted as f_{X_1 X_2 \cdots X_p}(x_1, x_2, \ldots, x_p), is nonnegative,
integrates to 1 over p-dimensional space, and gives the probability that
(X_1, X_2, \ldots, X_p) assumes a value in a region B as
P[(X_1, X_2, \ldots, X_p) \in B] = \int \cdots \int_{B} f_{X_1 X_2 \cdots X_p}(x_1, x_2, \ldots, x_p) \, dx_1 \, dx_2 \cdots dx_p
d. Distribution of a Subset of Random Variables:
If the joint probability density function of the continuous random variables
X_1, X_2, \ldots, X_p is f_{X_1 X_2 \cdots X_p}(x_1, x_2, \ldots, x_p), the probability density function of
X_1, X_2, \ldots, X_k, k < p, is
f_{X_1 X_2 \cdots X_k}(x_1, x_2, \ldots, x_k) = \int\int \cdots \int f_{X_1 X_2 \cdots X_p}(x_1, x_2, \ldots, x_p) \, dx_{k+1} \, dx_{k+2} \cdots dx_p
where the integral is over all points in the range of X_1, X_2, \ldots, X_p for which
X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k.
e. Independence:
Random variables X_1, X_2, \ldots, X_p are independent if and only if
f_{X_1 X_2 \cdots X_p}(x_1, x_2, \ldots, x_p) = f_{X_1}(x_1) f_{X_2}(x_2) \cdots f_{X_p}(x_p) for all x_1, x_2, \ldots, x_p
4. Covariance and Correlation
a. Expected Value of a Function of Two Random Variables:
E[h(X, Y)] = \sum_{x} \sum_{y} h(x, y) f_{XY}(x, y)   for X, Y discrete
E[h(X, Y)] = \int\int h(x, y) f_{XY}(x, y) \, dx \, dy   for X, Y continuous
b. Covariance:
cov(X, Y) = \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y
c. Correlation:
The correlation between random variables X and Y, denoted as \rho_{XY}, is
\rho_{XY} = \frac{cov(X, Y)}{\sqrt{V(X) V(Y)}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}
For any two random variables X and Y, -1 \le \rho_{XY} \le +1.
5. Linear Functions of Random Variables
a. Linear Function:
Given random variables X_1, X_2, \ldots, X_p and constants c_0, c_1, \ldots, c_p,
Y = c_0 + c_1 X_1 + c_2 X_2 + \cdots + c_p X_p
is a linear function of X_1, X_2, \ldots, X_p.
b. Mean of a Linear Function:
If Y = c_0 + c_1 X_1 + c_2 X_2 + \cdots + c_p X_p, then
E(Y) = c_0 + c_1 E(X_1) + c_2 E(X_2) + \cdots + c_p E(X_p)
If X_1, X_2, \ldots, X_p are independent,
V(Y) = c_1^2 V(X_1) + c_2^2 V(X_2) + \cdots + c_p^2 V(X_p)
6. General Functions of Random Variables
a. General Function of a Discrete Random Variable:
Suppose that X is a discrete random variable with probability distribution
f_X(x). Let Y = h(X) define a one-to-one transformation between the values
of X and Y so that the equation y = h(x) can be solved uniquely for x in
terms of y. Let this solution be x = u(y). Then the probability mass function
of the random variable Y is
f_Y(y) = f_X[u(y)]
References:
Douglas C. Montgomery & George C. Runger. Applied Statistics and Probability
for Engineers. 7th ed. John Wiley & Sons; 2018.
Hongshik Ahn. Probability and Statistics for Sciences & Engineering with
Examples in R. 2nd ed. Cognella, Inc.; 2018.
Jay L. Devore. Probability and Statistics for Engineering and the Sciences.
9th ed. Cengage Learning; 2016.