Introduction to Probability and Random Variables

Orhan Gazi
Electrical and Electronics Engineering Department
Ankara Medipol University
Altındağ/Ankara, Türkiye
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland
AG 2023
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by
similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this
book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
The first book about probability and random variables was written in 1937. Although probability has been known for a long time in history, the compilation of probability and random variables as written material does not go back more than a hundred years. In fact, most of the scientific developments in human history have taken place in the last 100 years; it would not be wrong to say that most of humanity's scientific output has been produced only in the last century. Developments in the basic sciences in particular took a long time in human history.
The findings in random variables and probability affected the other sciences as well. Scientists dealing with physics focused on deterministic modeling for a long time. As improvements in random variables and probability appeared, the modeling of physical events began to be performed using probabilistic models. Beforehand, physicists modeled the flows of electrons around an atom as deterministic circular paths, but the developments in probability and random variables led physicists to think about probabilistic models for the movement of electrons. It is seen that improvements in the basic sciences directly affect the other sciences as well. Modern electronic devices owe their existence to the science of probability and random variables. Without the concept of probability, it would not be possible to develop the subject of information communication. Modern electronic communication devices are developed using the fundamental concepts of probability and random variables. Shannon published his famous paper on information theory in 1948 using probabilistic modeling, and it led to the development of modern communication devices. The developments in probability science caused the science of statistics to emerge. Many disciplines, from the medical sciences to engineering, benefit from statistics. Medical doctors measure the effects of tablets by extracting statistical data from patients. Engineers model some physical phenomena using statistical measures.
In this book, we explain the fundamental concepts of probability and random variables in a clear manner. We cover the basic topics of probability and random variables. The first chapter is devoted to the explanation of experiments, sample spaces, events, and probability laws. The first chapter can be considered as the foundation of the random variable topic. Indeed, it is not possible to comprehend the concept of random variables without mastering the concepts of events, the definition of probability, and the probability axioms.
The probability topic has always been considered by students to be a difficult subject compared to other mathematics subjects. We believe that the reason for this perception is the unclear and overloaded explanation of the subject. Considering this, we tried to be brief and clear while explaining the topics. The concepts of joint experiments, writing the sample spaces of joint experiments, and determining the events from a given problem statement are important for solving probability problems.
In Chap. 2, using the basic concepts introduced in Chap. 1, we introduce some classical probability subjects such as the total probability theorem, independence, permutations and combinations, the multiplication rule, the partition rule, etc.
Chapter 3 introduces discrete random variables. We introduce the probability mass function of discrete random variables using the event concept. Expected value and variance calculations are the other topics covered in Chap. 3. Some well-known probability mass functions are also introduced in this chapter. It is easier to deal with discrete random variables than with continuous random variables, and we advise the reader to study discrete random variables before continuous random variables. Functions of random variables are explained in Chap. 4, where the joint probability mass function, cumulative distribution function, conditional probability mass function, and conditional mean value concepts are covered as well.
Continuous random variables are covered in Chap. 5. Continuous random variables can be considered as the integral form of discrete random variables. If the reader comprehends the discrete random variables covered in Chaps. 3 and 4, it will not be hard to understand the subjects covered in Chap. 5. In Chap. 6, we mainly explain the calculation of the probability density, cumulative density, conditional probability density, and conditional mean value, along with related topics, considering the case of more than one random variable. Correlation and covariance of two random variables are also covered in Chap. 6.
This book can be used as a textbook for a one-semester probability and random variables course, and it can be read by anyone interested in probability and random variables. While writing this book, we have drawn on many years of teaching experience. We tried to provide original examples while explaining the basic concepts, and we chose examples that are as simple as possible while providing succinct information. We kept the textual part of the book as small as possible, since long text passages decrease the concentration of the reader. Considering this, we tried to be as brief as possible and aimed to convey the fundamental concepts to the reader in a quick and short way without getting lost in details.
I dedicate this book to my lovely daughter Vera Gazi.
Chapter 1
Experiments, Sample Spaces, Events,
and Probability Laws
In this chapter, we provide some definitions that are very widely used in probability theory. We first consider discrete probability experiments and give definitions of discrete sample spaces to understand the concept of probability in an easy manner. Later, we consider continuous experiments.
Set
A set in its most general form is a collection of objects. These objects can be physical objects, like pencils or chairs, or they can be nonphysical objects, like integers, real numbers, etc.
Experiment
An experiment is a process used to measure a physical phenomenon.
Trial
A trial is a single performance of an experiment. If we perform an experiment once,
then we have a trial of the experiment.
Outcome, Simple Event, Sample Point
After the trial of an experiment, we have an outcome, which can be called a simple event, a sample point, or a simple outcome.
Sample Space
A sample space is defined for an experiment, and it is a set consisting of all the
possible outcomes of an experiment.
Event
A sample space is a set, and it has subsets. A subset of a sample space is called an
event. A discrete sample space, i.e., a countable sample space, consisting of N outcomes, or simple events, has 2^N events, i.e., subsets.
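The count of 2^N events can be checked mechanically. The following is a minimal Python sketch (our illustration, not part of the book) that enumerates every event of a small sample space:

```python
from itertools import combinations

def all_events(sample_space):
    """Return every event (i.e., subset) of a discrete sample space."""
    elements = list(sample_space)
    events = []
    for size in range(len(elements) + 1):
        # every subset of every size is an event
        events.extend(set(c) for c in combinations(elements, size))
    return events

S = {1, 2, 3}                # a three-outcome sample space
print(len(all_events(S)))    # 2**3 = 8 events, including the empty set and S
```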
Example 1.1: Consider the coin toss experiment. This experiment is a discrete
experiment, i.e., we have a countable number of different outcomes for this exper-
iment. Then, we have the following items for this experiment.
If A ⊄ S and B ⊄ S, then we can say that A and B are not events for the rolling-a-die experiment.
Example 1.5: For the rolling-a-die experiment, the sample space is
S = {1, 2, 3, 4, 5, 6}. Write three events for the rolling-a-die experiment.
Solution 1.5: We can write any three subsets of the sample space since events are nothing but the subsets of the sample space. For instance, E1 = {1, 2}, E2 = {2, 4, 6}, and E3 = {5} are three arbitrary events.
Since events are nothing but subsets of the sample space, the operations defined on
sets are also valid on events. If A and B are two events, then we can define the
following operations on the events:
A^c = S - A.

Note: A - B = A ∩ B^c.
Mutually Exclusive Events or Disjoint Events
Let A and B be two events. If A ∩ B = ϕ, then A and B are called mutually exclusive events, or disjoint events.
1.3 Probability and Probabilistic Law

Probability is a real-valued function, and it is usually denoted by P(·). The inputs of the probability function are the events of an experiment, and the outputs are real numbers between 0 and 1. Thus, we can say that the probability function is nothing but a mapping between events and real numbers in the range 0–1. The use of the probability function P(·) is illustrated in Fig. 1.1.

[Fig. 1.1: The probability function P(·) maps the events Event-1, Event-2, ⋯, Event-N of an experiment to real numbers between 0 and 1.]
Probabilistic Law
The probability function P(·) is not an ordinary real-valued function. For a real-valued function to be used as a probability function, it should obey some axioms, and these axioms are named probabilistic law axioms, which are outlined as follows:

Probability Axioms
Let S be the sample space of an experiment, and let A and B be two events for which the probability function P(·) is used such that
P(A) ≥ 0.  (1.1)

If A ∩ B = ϕ, then P(A ∪ B) = P(A) + P(B).  (1.2)

P(S) = 1.  (1.3)
Consider a discrete sample space

S = {s1, s2, ⋯, sN}

and an event

A = {a1, a2, ⋯, ak}  (1.4)

whose elements are simple events of S. The probability of A can be written as

P(A) = P{a1, a2, ⋯, ak}

where, employing (1.2), since the simple events are also disjoint events, we get

P(A) = P(a1) + P(a2) + ⋯ + P(ak).

If the simple events are equally probable events, i.e., P(si) = p, then according to (1.3), we have

P(S) = 1 → P(s1) + P(s2) + ⋯ + P(sN) = 1 → Np = 1 → p = 1/N

i.e.,

P(si) = 1/N.

Then, in this case, the probability of the event given in (1.4) can be calculated as

P(A) = P(a1) + P(a2) + ⋯ + P(ak) → P(A) = 1/N + 1/N + ⋯ + 1/N → P(A) = k/N.  (1.5)

Note: Equation (1.5) is valid only if the simple events are all equally likely, i.e., the simple events have equal probabilities of occurrence.
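Equation (1.5) is easy to verify numerically. Below is a small Python sketch (the helper name prob is our own, hypothetical choice) that computes P(A) = k/N for equally likely outcomes:

```python
from fractions import Fraction

def prob(event, sample_space):
    """P(A) = k/N when all simple events are equally likely, as in (1.5)."""
    assert event <= sample_space, "an event must be a subset of the sample space"
    return Fraction(len(event), len(sample_space))

S = {1, 2, 3, 4, 5, 6}   # fair die
A = {2, 4, 6}            # an even number appears
print(prob(A, S))        # 1/2
```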
1.5 Joint Experiment

Assume that we perform two different experiments. Let experiment-1 have the sample space S1 and experiment-2 have the sample space S2. If both experiments are performed at the same time, we can consider both experiments as a single
experiment, which can be considered as a joint experiment. In this case, the sample
space of the joint experiments becomes equal to
S = S1 × S2

i.e., the Cartesian product of S1 and S2. Similarly, if more than two experiments with sample spaces S1, S2, ⋯ are performed at the same time, then the sample space of the joint experiment can be calculated as

S = S1 × S2 × ⋯

If

S1 = {a1, a2, a3, ⋯}  S2 = {b1, b2, b3, ⋯}  S3 = {c1, c2, c3, ⋯}  ⋯

then a single element of S will be in the form si = aj bl cm ⋯, and the probability of si can be calculated as

P(si) = P(aj)P(bl)P(cm)⋯
That is, the probability of the simple event of the combined experiment equals the
product of the probabilities of the simple events appearing in the simple event of the
combined experiment.
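Assuming, as the text does, that the probabilities of the individual experiments multiply, the joint sample space and its simple-event probabilities can be sketched in Python as follows (an illustration, not the book's code):

```python
from itertools import product
from fractions import Fraction

S1 = ['H', 'T']                                # coin flip
S2 = ['f1', 'f2', 'f3', 'f4', 'f5', 'f6']      # die toss

# sample space of the joint experiment: the Cartesian product S1 x S2
S = list(product(S1, S2))
print(len(S))            # 12 simple events

# simple-event probability = product of the individual probabilities
P1 = {s: Fraction(1, 2) for s in S1}
P2 = {s: Fraction(1, 6) for s in S2}
P = {(a, b): P1[a] * P2[b] for (a, b) in S}
print(P[('H', 'f1')])    # 1/12
print(sum(P.values()))   # 1, as required by axiom (1.3)
```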
Example 1.6: For the fair coin toss experiment, the sample space is S = {H, T}. Simple events are {H}, {T}. The probabilities of the simple events are

P(H) = 1/2  P(T) = 1/2.
Example 1.7: We toss a coin twice. Find the sample space of this experiment.
Solution 1.7: For a single toss of the coin, the sample space is S1 = {H, T}. If we toss the coin twice, we can consider it as a combined experiment, and the sample space of the combined experiment can be calculated as

S = S1 × S1 → S = {HH, HT, TH, TT}.
Example 1.8: We toss a coin three times. Find the sample space of this experiment.
Solution 1.8: The three tosses of the coin can be considered a combined experiment. For a single toss of the coin, the sample space is S1 = {H, T}. For three tosses, the sample space can be calculated as

S = S1 × S1 × S1 → S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.
Example 1.9: For the fair die toss experiment, the sample space is S = {f1, f2, f3, f4, f5, f6}. Simple events are {f1}, {f2}, {f3}, {f4}, {f5}, {f6}. The probabilities of the simple events are

P(f1) = P(f2) = P(f3) = P(f4) = P(f5) = P(f6) = 1/6.
Example 1.10: We flip a fair coin and toss a fair die at the same time. Find the
sample space of the combined experiment, and find the probabilities of the simple
events of the combined experiment.
Solution 1.10: For the coin flip experiment, we have the sample space

S1 = {H, T}

and for the die toss experiment, we have the sample space

S2 = {f1, f2, f3, f4, f5, f6}
where the integers indicate the faces of the die. For the combined experiment, the
sample space S can be calculated as
S = S1 × S2 → S = {Hf1, Hf2, Hf3, Hf4, Hf5, Hf6, Tf1, Tf2, Tf3, Tf4, Tf5, Tf6}.

The simple events of the combined experiment are

{Hf1} {Hf2} {Hf3} {Hf4} {Hf5} {Hf6} {Tf1} {Tf2} {Tf3} {Tf4} {Tf5} {Tf6}

and their probabilities are calculated as

P(Hf1) = P(H)P(f1) → P(Hf1) = 1/2 × 1/6 → P(Hf1) = 1/12
P(Hf2) = P(H)P(f2) → P(Hf2) = 1/2 × 1/6 → P(Hf2) = 1/12
P(Hf3) = P(H)P(f3) → P(Hf3) = 1/2 × 1/6 → P(Hf3) = 1/12
P(Hf4) = P(H)P(f4) → P(Hf4) = 1/2 × 1/6 → P(Hf4) = 1/12
P(Hf5) = P(H)P(f5) → P(Hf5) = 1/2 × 1/6 → P(Hf5) = 1/12
P(Hf6) = P(H)P(f6) → P(Hf6) = 1/2 × 1/6 → P(Hf6) = 1/12
P(Tf1) = P(T)P(f1) → P(Tf1) = 1/2 × 1/6 → P(Tf1) = 1/12
P(Tf2) = P(T)P(f2) → P(Tf2) = 1/2 × 1/6 → P(Tf2) = 1/12
P(Tf3) = P(T)P(f3) → P(Tf3) = 1/2 × 1/6 → P(Tf3) = 1/12
P(Tf4) = P(T)P(f4) → P(Tf4) = 1/2 × 1/6 → P(Tf4) = 1/12
P(Tf5) = P(T)P(f5) → P(Tf5) = 1/2 × 1/6 → P(Tf5) = 1/12
P(Tf6) = P(T)P(f6) → P(Tf6) = 1/2 × 1/6 → P(Tf6) = 1/12
Example 1.11: A biased coin is flipped. The sample space is S1 = {Hb, Tb}. The
probabilities of the simple events are
P(Hb) = 2/3  P(Tb) = 1/3.
Assume that the biased coin is flipped twice. Consider the two flips as a single
experiment. Find the sample space of the combined experiment, and determine the
probabilities of the simple events for the combined experiment.
Solution 1.11: The sample space of the combined experiment can be found using
S = S1 × S1 as
S = {HbHb, HbTb, TbHb, TbTb}.
The simple events of the combined experiment are

{HbHb} {HbTb} {TbHb} {TbTb}.
The probabilities of the simple events of the combined experiment are calculated
as
P(HbHb) = P(Hb)P(Hb) → P(HbHb) = 2/3 × 2/3 → P(HbHb) = 4/9
P(HbTb) = P(Hb)P(Tb) → P(HbTb) = 2/3 × 1/3 → P(HbTb) = 2/9
P(TbHb) = P(Tb)P(Hb) → P(TbHb) = 1/3 × 2/3 → P(TbHb) = 2/9
P(TbTb) = P(Tb)P(Tb) → P(TbTb) = 1/3 × 1/3 → P(TbTb) = 1/9.
Example 1.12: We have a three-faced biased die and a biased coin. For the three-
faced biased die, the sample space is S1 = {f1, f2, f3}, and the probabilities of the
simple events are
P(f1) = 1/6  P(f2) = 1/6  P(f3) = 2/3.

For the biased coin, the sample space is S2 = {Hb, Tb}, and the probabilities of the simple events are

P(Hb) = 1/3  P(Tb) = 2/3.
We flip the coin and toss the die at the same time. Find the sample space of the
combined experiment, and calculate the probabilities of the simple events.
Solution 1.12: For the combined experiment, the sample space can be calculated
using
S = S1 × S2
as
S = {f1, f2, f3} × {Hb, Tb} → S = {f1Hb, f1Tb, f2Hb, f2Tb, f3Hb, f3Tb}.

The probabilities of the simple events of the combined experiment are calculated as

P(f1Hb) = P(f1)P(Hb) → P(f1Hb) = 1/6 × 1/3 → P(f1Hb) = 1/18
P(f1Tb) = P(f1)P(Tb) → P(f1Tb) = 1/6 × 2/3 → P(f1Tb) = 2/18
P(f2Hb) = P(f2)P(Hb) → P(f2Hb) = 1/6 × 1/3 → P(f2Hb) = 1/18
P(f2Tb) = P(f2)P(Tb) → P(f2Tb) = 1/6 × 2/3 → P(f2Tb) = 2/18
P(f3Hb) = P(f3)P(Hb) → P(f3Hb) = 2/3 × 1/3 → P(f3Hb) = 2/9
P(f3Tb) = P(f3)P(Tb) → P(f3Tb) = 2/3 × 2/3 → P(f3Tb) = 4/9.
Example 1.13: We toss a coin three times. Find the probabilities of the following
events.
(a) A = {ρi 2 S | ρi includes at least two heads}.
(b) B = {ρi 2 S | ρi includes at least one tail}.
Solution 1.13: For three tosses, the sample space can be calculated as

S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

The event A contains the outcomes with at least two heads, i.e., A = {HHH, HHT, HTH, THH}, whose probability is

P(A) = P(HHH) + P(HHT) + P(HTH) + P(THH) → P(A) = 1/8 + 1/8 + 1/8 + 1/8 → P(A) = 4/8.

The event B contains every outcome except HHH, i.e., B = S - {HHH}; hence

P(B) = 7/8.
Example 1.14: For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the simple events for the biased coin flip experiment are

P(Hb) = 2/3  P(Tb) = 1/3.
Assume that a biased coin and a fair coin are flipped together. Consider the two
flips as a single experiment. Find the sample space of the combined experiment.
Determine the probabilities of the simple events for the combined experiment, and
determine the probabilities of the following events.
(a) A = {Biased head appears in the simple event.}
(b) B = {At least two heads appear.}
Solution 1.14: For the fair coin flip experiment, the sample space is

S2 = {H, T}

and the probabilities of the simple events are

P(H) = 1/2  P(T) = 1/2.
For the combined experiment of flipping the biased and the fair coin together, the sample space can be calculated as

S = S1 × S2 → S = {HbH, HbT, TbH, TbT}.

The probabilities of the simple events for the combined experiment are calculated as

P(HbH) = P(Hb)P(H) → P(HbH) = 2/3 × 1/2 → P(HbH) = 1/3
P(HbT) = P(Hb)P(T) → P(HbT) = 2/3 × 1/2 → P(HbT) = 1/3
P(TbH) = P(Tb)P(H) → P(TbH) = 1/3 × 1/2 → P(TbH) = 1/6
P(TbT) = P(Tb)P(T) → P(TbT) = 1/3 × 1/2 → P(TbT) = 1/6.

The events A and B are

A = {HbH, HbT}  B = {HbH, HbT, TbH}

and their probabilities are

P(A) = P(HbH) + P(HbT) → P(A) = 1/3 + 1/3 → P(A) = 2/3
P(B) = P(HbH) + P(HbT) + P(TbH) → P(B) = 1/3 + 1/3 + 1/6 → P(B) = 5/6.
Exercises:
1. For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the
simple events for the biased coin toss experiment are
P(Hb) = 2/3  P(Tb) = 1/3.
Assume that a biased coin is flipped and a fair die is tossed together. Consider
the combined experiment, and find the sample space of the combined experiment.
Determine the probabilities of the simple events for the combined experiment,
and determine the probabilities of the events:
(a) A = {Biased head and odd numbers appear in the simple event.}
(b) B = {Biased tail and a number divisible by 3 appear in the simple event.}
2. For a biased coin, the sample space is S1 = {Hb, Tb}. The probabilities of the
simple events for the biased coin toss experiment are
P(Hb) = 2/3  P(Tb) = 1/3.
Assume that the biased coin is tossed three times. Find the sample space, and find the probabilities of the simple events. Calculate the probabilities of the events

A = {At least two heads appear in the simple event.}
B = {At most two tails appear in the simple event.}
Let A, B, and C be events of an experiment, and let P(·) be the probability function defined on the events of the experiment. We have the following properties for the probability function P(·):
(a) If A ⊂ B, then P(A) ≤ P(B)
(b) P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
(c) P(A ∪ B) ≤ P(A) + P(B)
(d) P(A ∪ B ∪ C) = P(A) + P(A^c ∩ B) + P(A^c ∩ B^c ∩ C)
We will prove some of these properties in examples.
Example 1.15: Prove the property P(A ∪ B) = P(A) + P(B) - P(A ∩ B).

Proof 1.15: We should keep in mind that events are nothing but subsets. Then, any operation that can be performed on sets is also valid on events.

Let S be the sample space. The event A ∪ B can be written as

A ∪ B = S ∩ (A ∪ B)
A ∪ B = (A ∪ A^c) ∩ (A ∪ B)
A ∪ B = A ∪ (A^c ∩ B)  (1.7)

where the events A and A^c ∩ B are disjoint events, i.e., A ∩ (A^c ∩ B) = ϕ. According to probability axiom-2, the probability of the event A ∪ B in (1.7) can be written as

P(A ∪ B) = P(A) + P(A^c ∩ B).  (1.8)

In a similar manner, the event B can be written as

B = S ∩ B
B = (A ∪ A^c) ∩ B
B = (A ∩ B) ∪ (A^c ∩ B)  (1.9)

where the events A ∩ B and A^c ∩ B are disjoint, so that P(B) = P(A ∩ B) + P(A^c ∩ B), i.e., P(A^c ∩ B) = P(B) - P(A ∩ B). Substituting this into (1.8), we obtain P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Note: If A and B are disjoint, i.e., mutually exclusive events, then we have

P(A ∪ B) = P(A) + P(B).
Since events of an experiment are nothing but subsets of the sample space of the
experiment, it may sometimes be easier to manipulate the events using Venn
diagrams.
Venn Diagram Illustration of Events
In Fig. 1.2, Venn diagram illustrations of the events are depicted. As can be seen
from Fig. 1.2, we can take the intersection and union of the events.
Example 1.16: Show that

P(A ∩ B) ≥ 0.  (1.14)

Example 1.17: Show that

P(A^c) = 1 - P(A).

Proof: Since

A ∪ A^c = S  (1.15)

where A and A^c are disjoint events, axioms (1.2) and (1.3) give

P(S) = 1

and

P(A) + P(A^c) = 1

which leads to

P(A^c) = 1 - P(A).
Theorem 1.1: If the events A1, A2, ⋯, Am are mutually exclusive events, then we have

P(A1 ∪ A2 ∪ ⋯ ∪ Am) = P(A1) + P(A2) + ⋯ + P(Am).
Example 1.18: For a biased die, the probabilities of the simple events are given as

P(f1) = 1/12  P(f2) = 1/12  P(f3) = 1/6  P(f4) = 1/6  P(f5) = 2/6  P(f6) = 1/6

and the events A and B are defined as

A = {f2, f4, f6}  B = {f1, f2, f4}  A ∪ B = {f1, f2, f4, f6}  A ∩ B = {f2, f4}.

The probabilities of the events A and B can be calculated as

P(A) = P(f2) + P(f4) + P(f6) → P(A) = 1/12 + 1/6 + 1/6 → P(A) = 5/12
P(B) = P(f1) + P(f2) + P(f4) → P(B) = 1/12 + 1/12 + 1/6 → P(B) = 4/12.
Example 1.19: Prove the property: if A ⊂ B, then P(A) ≤ P(B).

Proof 1.19: Since A ⊂ B, the event B can be written as

B = A ∪ B
B = (A ∪ B) ∩ S
B = (A ∪ B) ∩ (A ∪ A^c)
B = A ∪ (A^c ∩ B)  (1.16)

where the events A and A^c ∩ B are disjoint events, i.e., A ∩ (A^c ∩ B) = ϕ. Using probability law axiom-2 in (1.2) and equation (1.16), we have

P(B) = P(A) + P(A^c ∩ B)

and since P(A^c ∩ B) ≥ 0, it follows that

P(A) ≤ P(B).
1.7 Conditional Probability
Assume that we perform an experiment and obtain an outcome. Let the outcome of the experiment belong to an event B, and consider the question: What is the probability that the outcome of the experiment also belongs to another event A? To calculate this probability, we should first determine the sample space, then identify the event and calculate the probability of the event. Assume that the experiment is a fair one, i.e., the simple events are equally likely, and consider the event E = {the outcome belongs to A given that it also belongs to B}, where the condition "given that it also belongs to B" implies that the sample space equals B, i.e.,

S′ = B.

Then the probability of E can be calculated as

P(E) = N(E)/N(S′) → P(E) = N(A ∩ B)/N(B)  (1.18)

which can be written as

P(E) = (N(A ∩ B)/N(S)) / (N(B)/N(S))

leading to

P(E) = P(A ∩ B)/P(B).
If we show this special event E by the special notation A|B, then the conditional event probability can be written as

P(A|B) = P(A ∩ B)/P(B).  (1.19)
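Definition (1.19) can be checked by brute-force counting for a fair experiment. Here is a minimal Python sketch; the particular events A and B are our own illustrative choices:

```python
from fractions import Fraction
from itertools import product

# three tosses of a fair coin
S = list(product('HT', repeat=3))

A = {s for s in S if s.count('H') >= 2}   # at least two heads
B = {s for s in S if s[0] == 'H'}         # the first toss is a head

def cond_prob(A, B):
    """P(A|B) = N(A n B)/N(B) for equally likely outcomes, as in (1.18)."""
    return Fraction(len(A & B), len(B))

print(cond_prob(A, B))   # 3/4
```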
Properties

1. If A1 and A2 are disjoint events, then we have

P(A1 ∪ A2|B) = P((A1 ∪ A2) ∩ B)/P(B)

where (A1 ∪ A2) ∩ B = (A1 ∩ B) ∪ (A2 ∩ B), and since

A1 ∩ A2 = ϕ

implies

(A1 ∩ B) ∩ (A2 ∩ B) = ϕ

we can write

P(A1 ∪ A2|B) = (P(A1 ∩ B) + P(A2 ∩ B))/P(B)

leading to

P(A1 ∪ A2|B) = P(A1 ∩ B)/P(B) + P(A2 ∩ B)/P(B) → P(A1 ∪ A2|B) = P(A1|B) + P(A2|B).

2. For arbitrary events A1 and A2, we have

P(A1 ∪ A2|B) ≤ (P(A1 ∩ B) + P(A2 ∩ B))/P(B)

leading to

P(A1 ∪ A2|B) ≤ P(A1 ∩ B)/P(B) + P(A2 ∩ B)/P(B) → P(A1 ∪ A2|B) ≤ P(A1|B) + P(A2|B).
Example 1.20: There are two students A and B having an exam. The following
information is available about the students.
(a) The probability that student A can be successful in the exam is 5/8.
(b) The probability that student B can be successful in the exam is 1/2.
(c) The probability that at least one student can be successful is 3/4.
After the exam, it was announced that only one student was successful in the
exam. What is the probability that student A was successful in the exam?
Solution 1.20: For each student, having an exam can be considered as an experi-
ment. The sample spaces of individual experiments are
SA = {As, Af}  SB = {Bs, Bf}

where As, Af are the success and failure outcomes for student A, and Bs, Bf are the success and failure outcomes for student B. If we consider both students having the exam together, i.e., a joint experiment, the sample space in this case can be formed as

S = SA × SB → S = {AsBs, AsBf, AfBs, AfBf}.

Let's define the events

EA = {Student A is successful} → EA = {AsBs, AsBf}
EB = {Student B is successful} → EB = {AsBs, AfBs}
E1 = {At least one student is successful} → E1 = {AsBs, AsBf, AfBs}.

From the given information in the question, we can write the following equations:

P(EA) = 5/8 → P(AsBs) + P(AsBf) = 5/8
P(EB) = 1/2 → P(AsBs) + P(AfBs) = 4/8
P(E1) = 3/4 → P(AsBs) + P(AsBf) + P(AfBs) = 6/8.

Solving these equations, we obtain

P(AsBs) = 3/8  P(AsBf) = 2/8  P(AfBs) = 1/8.
Let's also define the event

Eo = {Only one student is successful} → Eo = {AsBf, AfBs}.

We are asked to find

P(EA|Eo) = P(EA ∩ Eo)/P(Eo)

where

P(Eo) = P(AsBf) + P(AfBs) → P(Eo) = 2/8 + 1/8 → P(Eo) = 3/8

and

P(EA ∩ Eo) = P(AsBf) → P(EA ∩ Eo) = 2/8.

Then, we obtain

P(EA|Eo) = (2/8)/(3/8) → P(EA|Eo) = 2/3.
Example 1.21: A fair coin is tossed three times. The events A and B are defined as
Since the coin is a fair one and the simple events have the same probability, the probabilities of the events A and B can be calculated using

P(A) = N(A)/N(S)  P(B) = N(B)/N(S)

where N(A), N(B), and N(S) indicate the numbers of elements in the events A, B, and the sample space S, respectively. Then, P(A) and P(B) are found as

P(A) = N(A)/N(S) → P(A) = 4/8  P(B) = N(B)/N(S) → P(B) = 4/8.

The conditional probability P(A|B) can be calculated using

P(A|B) = P(A ∩ B)/P(B)  (1.24)

where

P(A ∩ B) = N(A ∩ B)/N(S) → P(A ∩ B) = 2/8  (1.25)

so that

P(A|B) = P(A ∩ B)/P(B) → P(A|B) = (2/8)/(4/8) → P(A|B) = 2/4.
Example 1.22: Consider a metal detector security system in an airport. The probability that the security system gives an alarm in the absence of metal is 0.02, the probability that the security system gives an alarm in the presence of metal is 0.95, and the probability that the security system does not give an alarm in the presence of metal is 0.03. The probability of the presence of metal is 0.02.

Express the missed detection event mathematically, and calculate the probability of missed detection.

Solution 1.22: Considering the given information in the question, we can define the events

A = {The security system gives an alarm}  B = {Metal is present}

with P(A|B) = 0.95, P(A|B^c) = 0.02, P(A^c|B) = 0.03, and P(B) = 0.02. The missed detection event C and the false alarm event D can be written as

C = A^c ∩ B
D = A ∩ B^c

and the probability of missed detection can be calculated as

P(C) = P(B)P(A^c|B) → P(C) = 0.02 × 0.03 → P(C) = 0.0006.
Example 1.23: A box contains three white and two black balls. We pick a ball from
this box. Find the sample space of this experiment and write the events for this
sample space.
Solution 1.23: The sample space can be written as
S = {w1, w2, w3, b1, b2}.
The events are subsets of S, and there are in total 2^5 = 32 events.
Example 1.24: A box contains two white and two black balls. We pick two balls
from this box without replacement. Find the sample space of this experiment.
Solution 1.24: We perform two experiments consecutively. The sample space of
the first experiment can be written as
S1 = {w1, w2, b1, b2}.

The sample space of the second experiment depends on the outcome of the first experiment.

If the outcome of the first experiment is w1, the sample space of the second experiment is

S21 = {w2, b1, b2}.

If the outcome of the first experiment is w2, the sample space of the second experiment is

S22 = {w1, b1, b2}.

If the outcome of the first experiment is b1, the sample space of the second experiment is

S23 = {w1, w2, b2}.

If the outcome of the first experiment is b2, the sample space of the second experiment is

S24 = {w1, w2, b1}.
If the outcome of the first experiment is w1, the sample space of the combined experiment is

S = S1 × S21.

If the outcome of the first experiment is w2, the sample space of the combined experiment is

S = S1 × S22.

If the outcome of the first experiment is b1, the sample space of the combined experiment is

S = S1 × S23.

If the outcome of the first experiment is b2, the sample space of the combined experiment is

S = S1 × S24.
Continuous Experiment
For continuous experiments, the sample space includes an uncountable number of simple events. For this reason, for continuous experiments, the sample space is usually expressed either as an interval, if a one-dimensional representation is sufficient, or as an area in a two-dimensional plane.

Let's illustrate the concept with an example.

Example: A telephone call may occur at a time t which is a random point in the interval [8, 18].

Solution: The sample space of the experiment is the interval [8, 18], i.e.,

S = [8, 18].

The events A and B are subsets of S, and they are nothing but the intervals

A = [10, 16]  B = [8, 16]

whose probabilities are

P(A) = Length(A)/Length(S) → P(A) = (16 - 10)/(18 - 8) → P(A) = 6/10
P(B) = Length(B)/Length(S) → P(B) = (16 - 8)/(18 - 8) → P(B) = 8/10.
The conditional probability P(B|A) can be calculated as

P(B|A) = P(B ∩ A)/P(A) →
P(B|A) = P([8, 16] ∩ [10, 16])/P([10, 16]) → P(B|A) = P([8, 10])/P([10, 16]) → P(B|A) = 2/6.
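For such uniform continuous experiments, probabilities reduce to ratios of interval lengths. A short Python sketch of this computation (our illustration, not the book's code) follows:

```python
from fractions import Fraction

S = (8, 18)    # the call occurs at a uniformly random time in [8, 18]
A = (10, 16)

def length(interval):
    a, b = interval
    return b - a

# for a uniform continuous experiment, P(A) = Length(A)/Length(S)
print(Fraction(length(A), length(S)))   # 3/5, i.e., 6/10
```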
Problems
4. The sample space of an experiment is given as

S = {s1, s2, s3, s4, s5, s6}.

Find three mutually exclusive events E1, E2, E3 such that S = E1 ∪ E2 ∪ E3. Find the probability of each mutually exclusive event.
5. The sample space of an experiment is given as
S = {s1, s2, s3, s4, s5, s6, s7, s8}

and the event E is defined as

E = {s1, s3, s5, s6, s8}.

Write the event E as the union of two mutually exclusive events E1 and E2, i.e.,

E = E1 ∪ E2.
6. The sample space of an experiment is given as

S = {s1, s2, s3}

with the simple event probabilities

P(s1) = 1/4  P(s2) = 2/4  P(s3) = 1/4.
Write all the events for this sample space, and calculate the probability of each
event.
7. The sample space of an experiment is given as
S = {s1, s2, s3}

with the simple event probabilities

P(s1) = 1/3  P(s2) = 1/6  P(s3) = 1/2.
We perform the experiment twice. Consider the two performances of the same
experiment as a single experiment, i.e., combined experiment. Find the simple
events of the combined experiment, and calculate the probability of each simple
event of the combined experiment.
8. The sample spaces of two experiments are given as
S1 = {a, b, c}  S2 = {d, e}

with the simple event probabilities

P(a) = 1/3  P(b) = 1/6  P(c) = 1/2

P(d) = 3/4  P(e) = 1/4.
We perform the first experiment once and the second experiment twice. Consider
the three trials of the experiment as a single experiment, i.e., combined experiment.
Find the simple events of the combined experiment, and calculate the probability of
each simple event of the combined experiment.
Chapter 2
Total Probability Theorem, Independence,
Combinatorial
Definition
Partition: Let A1, A2, ⋯, AN be the events of a sample space such that Ai ∩ Aj = ϕ for i ≠ j, i, j ∈ {1, 2, ⋯, N}, and S = A1 ∪ A2 ∪ ⋯ ∪ AN. We say that the events A1, A2, ⋯, AN form a partition of S.
The partition of a sample space is graphically illustrated in Fig. 2.1.
Let A1, A2, ⋯, AN be the disjoint events that form a partition of a sample space S, and let B be any event. Then, the probability of the event B can be written as

P(B) = P(B ∩ A1) + P(B ∩ A2) + ⋯ + P(B ∩ AN).  (2.1)

Proof: Since the events form a partition, we have

S = A1 ∪ A2 ∪ ⋯ ∪ AN.

The event B can be written as

B = B ∩ S

[Fig. 2.1/2.2: Partition of the sample space S by the events A1, A2, A3, and the intersections A1 ∩ B, A3 ∩ B with an event B.]

B = B ∩ (A1 ∪ A2 ∪ ⋯ ∪ AN)
B = (B ∩ A1) ∪ (B ∩ A2) ∪ ⋯ ∪ (B ∩ AN).  (2.2)
In (2.2), the events (B ∩ Ai) and (B ∩ Aj), i, j ∈ {1, 2, ⋯, N}, i ≠ j, are disjoint events. Then, according to probability law axiom-2 in (1.2), P(B) can be written as

P(B) = P(B ∩ A1) + P(B ∩ A2) + ⋯ + P(B ∩ AN)

which, using P(B ∩ Ai) = P(Ai)P(B|Ai), leads to

P(B) = P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(AN)P(B|AN).  (2.3)
Example 2.1: In a chess tournament, there are 100 players; 20 of them are at an advanced level, 30 of them are at an intermediate level, and 50 of them are at a beginner level. You randomly choose an opponent. Define the events A, B, and C of playing against an advanced, an intermediate, and a beginner player, respectively, and show that they form a partition of the sample space.

Solution 2.1: The experiment here can be considered as playing a chess game against an opponent. The sample space is

S = {100 players}

and the events A, B, and C form a partition of S with
P(A) = N(A)/N(S) → P(A) = 20/100
P(B) = N(B)/N(S) → P(B) = 30/100
P(C) = N(C)/N(S) → P(C) = 50/100.
The sample space and its partition are depicted in Fig. 2.3.
Example 2.2: In a chess tournament, there are 100 players. Of these 100 players,
20 of them are at an advanced level, 30 of them are at an intermediate level, and 50 of
them are at a beginner level.
Your probability of winning against an advanced player is 0.2, and it is 0.5
against an intermediate player, and it is 0.7 against a beginner player.
You randomly choose an opponent and play a game. What is the probability of
winning?
Solution 2.2: The sample space is S = {100 players}, and the events are

A = {The chosen opponent is an advanced player}
B = {The chosen opponent is an intermediate player}
C = {The chosen opponent is a beginner player}
W = {You win the game}.

It is clear that

S = A ∪ B ∪ C

with

P(A) = 20/100  P(B) = 30/100  P(C) = 50/100

and the given winning probabilities are P(W|A) = 0.2, P(W|B) = 0.5, and P(W|C) = 0.7. Using the total probability theorem, the probability of winning is

P(W) = P(A)P(W|A) + P(B)P(W|B) + P(C)P(W|C) → P(W) = 0.2 × 0.2 + 0.3 × 0.5 + 0.5 × 0.7 → P(W) = 0.54.
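The total probability computation above can be reproduced in a few lines of Python; this is an illustrative sketch with dictionary keys of our own choosing:

```python
# P(W) = sum over the partition of P(A_i) P(W|A_i)
priors = {'advanced': 0.2, 'intermediate': 0.3, 'beginner': 0.5}
win_given = {'advanced': 0.2, 'intermediate': 0.5, 'beginner': 0.7}

p_win = sum(priors[level] * win_given[level] for level in priors)
print(round(p_win, 2))   # 0.54
```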
Exercise: There is a box, and inside the box there are 100 question cards. Of these
100 mathematics questions, 10 of them are difficult, 50 of them are normal, and 40 of
them are easy. Your probability of solving a difficult question is 0.2, it is 0.4 for
normal questions, and it is 0.6 for easy questions. You randomly choose a card; what
is the probability that you can solve the question on the card?
Let A1, A2, ⋯, AN be disjoint events that form a partition of a sample space S.
The conditional probability P(Ai| B) can be calculated using
P(Ai|B) = P(Ai ∩ B)/P(B)

which, using P(Ai ∩ B) = P(Ai)P(B|Ai), can be written as

P(Ai|B) = P(Ai)P(B|Ai)/P(B)

and, employing the total probability theorem for P(B), we obtain Bayes' rule:

P(Ai|B) = P(Ai)P(B|Ai)/(P(A1)P(B|A1) + P(A2)P(B|A2) + ⋯ + P(AN)P(B|AN)).  (2.4)
Example 2.3: Consider the previous example again, and assume that you played and won the game. What is the probability that your opponent was an advanced player?

Solution 2.3: We are required to find P(A|W), which can be calculated using

P(A|W) = P(A)P(W|A)/P(W)

in which, using the total probability theorem,

P(W) = P(A)P(W|A) + P(B)P(W|B) + P(C)P(W|C) → P(W) = 0.54

with

P(A) = 20/100  P(B) = 30/100  P(C) = 50/100

we obtain

P(A|W) = (0.2 × 0.2)/0.54 → P(A|W) = 0.074.
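Continuing with the same numbers, Bayes' rule in (2.4) can be sketched as follows (again an illustration, not the book's code):

```python
priors = {'advanced': 0.2, 'intermediate': 0.3, 'beginner': 0.5}
win_given = {'advanced': 0.2, 'intermediate': 0.5, 'beginner': 0.7}

p_win = sum(priors[k] * win_given[k] for k in priors)   # total probability, 0.54

# Bayes' rule: P(advanced | win) = P(advanced) P(win|advanced) / P(win)
posterior = priors['advanced'] * win_given['advanced'] / p_win
print(round(posterior, 3))   # 0.074
```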
Exercise
1. An electronic device is produced by three factories: F1, F2, and F3. The factories
F1, F2, and F3 have market shares of 30%, 30%, and 40%, respectively, and the
probabilities of F1, F2, and F3 for producing a defective device are 0.02, 0.04,
and 0.01. Assume that you purchased the electronic device produced by these
factories, and you found that the device is defective. What is the probability that
the defective device is produced by the second factory, i.e., by F2?
Example 2.4: A box contains two regular coins and one two-headed coin, i.e.,
biased coin. You pick a coin and flip it, and a head shows up. What is the probability
that the chosen coin is the two-headed coin?
Solution 2.4: The experiment for this example can be considered as choosing a coin
and flipping it. Since the box contains two fair and one two-headed coins, we can
write the sample space as
S = {H1, T1, H2, T2, Hb1, Hb2}
where H1, T1, H2, T2 correspond to the fair coins, and Hb1, Hb2 correspond to the two-headed coin. Let's define the events

A = {A head shows up} → A = {H1, H2, Hb1, Hb2}
B = {The chosen coin is the two-headed one} → B = {Hb1, Hb2}.

We are asked to find P(B|A), which can be calculated as

P(B|A) = P(B ∩ A)/P(A) → P(B|A) = P({Hb1, Hb2} ∩ {H1, H2, Hb1, Hb2})/P({H1, H2, Hb1, Hb2}) → P(B|A) = P({Hb1, Hb2})/P({H1, H2, Hb1, Hb2}) → P(B|A) = (2/6)/(4/6) → P(B|A) = 2/4.
In fact, if we inspect the event A = {H1, H2, Hb1, Hb2}, we see that half of the
heads are biased.
2.2 Multiplication Rule
Let A1, A2, ⋯, AN be events of an experiment. The multiplication rule states that

P(A1 ∩ A2 ∩ ⋯ ∩ AN) = P(A1)P(A2|A1)P(A3|A1 ∩ A2)⋯P(AN|A1 ∩ A2 ∩ ⋯ ∩ AN-1)  (2.5)

which can be written compactly as

P(∩_{i=1}^{N} Ai) = ∏_{i=1}^{N} P(Ai | ∩_{j=1}^{i-1} Aj).  (2.6)

Proof: We can show the correctness of (2.5) using the definition of the conditional probability, as in

P(A1 ∩ A2 ∩ ⋯ ∩ AN) = P(A1) × [P(A1 ∩ A2)/P(A1)] × [P(A3 ∩ A1 ∩ A2)/P(A1 ∩ A2)] × ⋯ × [P(AN ∩ A1 ∩ A2 ∩ ⋯ ∩ AN-1)/P(A1 ∩ A2 ∩ ⋯ ∩ AN-1)]

where each denominator cancels the preceding factor, leaving the left-hand side.
Example 2.5: A box contains 12 balls, 6 of which are white and 6 of which are black. We draw three balls from the box without replacement. Let Ai denote the event that the ith drawn ball is white. Find P(A1), P(A2|A1), and P(A3|A1 ∩ A2).

Solution 2.5: Since the experiment is a fair one, the probability of the event A1 can be calculated as

P(A1) = N(A1)/N(S1)  (2.7)

where N(A1) and N(S1) are the numbers of simple events in the event A1 and in the sample space S1 of the first draw, respectively. The probability in (2.7) can be calculated as

P(A1) = N(A1)/N(S1) → P(A1) = 6/12.

After the first experiment, the sample space of the second draw, S2, has one missing element, and we can write

P(A2|A1) = N(A2|A1)/N(S2) = 5/11.

In a similar manner,

P(A3|A1 ∩ A2) = 4/10.
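The multiplication rule makes such without-replacement computations one-liners. A minimal Python sketch, using the numbers of this example, follows:

```python
from fractions import Fraction

# P(A1 n A2 n A3) = P(A1) P(A2|A1) P(A3|A1 n A2):
# 6 favorable outcomes among 12, then 5 among 11, then 4 among 10
p = Fraction(6, 12) * Fraction(5, 11) * Fraction(4, 10)
print(p)   # 1/11
```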
2.3 Independence
The events A and B are said to be independent events if the occurrence of the event B does not change the probability of the occurrence of event A. That is, if

P(A|B) = P(A)  (2.8)

then the events A and B are said to be independent events. The independence condition in (2.8) can alternatively be expressed as

P(A|B) = P(A) → P(A ∩ B)/P(B) = P(A) → P(A ∩ B) = P(A)P(B).
Thus, the events A and B are independent if

P(A ∩ B) = P(A)P(B)

is satisfied.

Note: For disjoint events A and B, we have P(A ∩ B) = 0, and for independent events A and B, we have P(A ∩ B) = P(A)P(B).
Example 2.6: Show that two disjoint events A and B can never be independent
events.
Proof 2.6: Let A and B be two disjoint events with nonzero probabilities, i.e., P(A) > 0 and P(B) > 0, such that A ∩ B = ϕ

and

P(A ∩ B) = 0.

It is clear that

P(A)P(B) > 0.

Hence,

P(A ∩ B) ≠ P(A)P(B)

which means that the disjoint events A and B cannot be independent.
Example 2.7: A three-sided fair die is tossed twice. Let A be the event that the first toss shows f1, and let B be the event that the second toss shows f3. Decide whether the events A and B are independent.

Solution 2.7: The sample space of the first toss is S1 = {f1, f2, f3}. The sample space of the two tosses can be calculated as

S = S1 × S1 → S = {f1f1, f1f2, f1f3, f2f1, f2f2, f2f3, f3f1, f3f2, f3f3}.

The events A and B can be written as

A = {f1f1, f1f2, f1f3}  B = {f1f3, f2f3, f3f3}

with the probabilities

P(A) = 3/9  P(B) = 3/9.  (2.9)

The intersection of the events is

A ∩ B = {f1f3}

whose probability is

P(A ∩ B) = 1/9.  (2.10)

Since

P(A ∩ B) = P(A)P(B)

is satisfied, we can conclude that the events A and B are independent of each other.
Let A1, A2, ⋯, AN be the events of an experiment. The events A1, A2, ⋯, AN are independent of each other if

P(∩_{i∈I} Ai) = ∏_{i∈I} P(Ai)  (2.11)

for every subset I of {1, 2, ⋯, N}.
Example 2.8: If the events A1, A2, and A3 are independent of each other, then all of
the following equalities must be satisfied.
1. P(A1 ∩ A2) = P(A1)P(A2)
2. P(A1 ∩ A3) = P(A1)P(A3)
3. P(A2 ∩ A3) = P(A2)P(A3)
4. P(A1 ∩ A2 ∩ A3) = P(A1)P(A2)P(A3)
Decide whether the events A, B, and C are independent of each other or not.
2.4 Conditional Independence

The events A and B are said to be conditionally independent if, for a given event C,

P(A ∩ B|C) = P(A|C)P(B|C)  (2.12)

is satisfied. The left side of the conditional independence in (2.12) can be written as

P(A ∩ B|C) = P(A ∩ B ∩ C)/P(C)

where, employing P(A ∩ B ∩ C) = P(B ∩ C)P(A|B ∩ C), we obtain

P(A ∩ B|C) = P(B ∩ C)P(A|B ∩ C)/P(C) → P(A ∩ B|C) = P(B|C)P(A|B ∩ C)

leading to

P(A|B ∩ C) = P(A|C).

The conditional independence implies that, if the event C did occur, the additional occurrence of the event B does not have any effect on the probability of occurrence of event A.
Example 2.9: For the two tosses of a fair coin experiment, the following events are defined:

A = {A head appears in the first toss}  B = {A tail appears in the second toss}
C = {At least one head appears}.

Decide whether the events A and B are conditionally independent given the event C.

Solution 2.9: The events A, B, and C can be written as

A = {HH, HT}  B = {HT, TT}  C = {HH, HT, TH}.

For conditional independence, we check whether

P(A|B ∩ C) = P(A|C)

i.e., whether

P(A ∩ B ∩ C)/P(B ∩ C) = P(A ∩ C)/P(C)  (2.15)

is satisfied. The required probabilities are

P(A ∩ B ∩ C) = P({HT}) → P(A ∩ B ∩ C) = 1/4
P(B ∩ C) = P({HT}) → P(B ∩ C) = 1/4
P(A ∩ C) = P({HH, HT}) → P(A ∩ C) = 2/4
P(C) = P({HH, HT, TH}) → P(C) = 3/4.

Substituting these values into (2.15), we get

(1/4)/(1/4) = (2/4)/(3/4) → 1 = 2/3

which is not satisfied, i.e.,

P(A|B ∩ C) ≠ P(A|C)

which means that the events A and B given C are not conditionally independent of each other.
Example 2.10: Show that if A and B are independent events, so are A and B^c.

Proof 2.10: If A and B are independent events, then we have

P(A ∩ B) = P(A)P(B).

The event A can be written as

A = A ∩ S
A = A ∩ (B ∪ B^c) → A = (A ∩ B) ∪ (A ∩ B^c)

leading to

P(A) = P(A ∩ B) + P(A ∩ B^c) → P(A ∩ B^c) = P(A) - P(A)P(B) → P(A ∩ B^c) = P(A)(1 - P(B)) → P(A ∩ B^c) = P(A)P(B^c)

which means that the events A and B^c are independent.
Exercise: Show that if A and B are independent events, so are A^c and B^c.
Hint: A^c = A^c ∩ S and S = B ∪ B^c.

Exercise: Show that if A and B are independent events, so are A^c and B.
Hint: A^c = A^c ∩ S and S = B ∪ B^c, and use the result of the previous example.
2.5 Independent Trials and Binomial Probabilities

Assume that we perform an experiment, and at the end of the experiment, we wonder whether an event has occurred or not; for example, the flip of a fair coin and the occurrence of a head, success or failure in an exam, winning or losing a game, whether it rains or not, the toss of a die and the occurrence of an even number, etc. Let's assume that such experiments are repeated N times in a sequential manner, for instance, flipping a fair coin ten times, playing 10 chess games, etc. We wonder about the probability of the same event occurring k times out of N trials. Let's explain the topic with an example.
Example 2.11: Consider the flip of a biased coin experiment. The sample space is S1 = {H, T}, and the simple events have the probabilities

P(H) = p  P(T) = 1 - p.

Let's say that we flip the coin 5 times. In this case, the sample space is calculated by taking the 5-fold Cartesian product of S1 by itself, i.e.,

S = S1 × S1 × S1 × S1 × S1

which includes 32 elements, and each element of S consists of 5 outcomes, for instance, HHHHH, HHHHT, etc. Now think about the question: what is the probability of seeing 3 heads and 2 tails after 5 flips of the coin?

Consider the event A of having 3 heads and 2 tails; the event A can be written as

A = {HHHTT, HHTHT, HHTTH, ⋯, TTHHH}.

The probability of any simple event containing 3 heads and 2 tails equals p^3(1 - p)^2; for instance, P(HHHTT) can be calculated as

P(HHHTT) = P(H)P(H)P(H)P(T)P(T) → P(HHHTT) = p^3(1 - p)^2.

The number of simple events containing 3 heads and 2 tails equals C(5, 3), the number of ways of choosing the positions of the 3 heads among the 5 flips (here C(N, k) denotes the binomial coefficient "N choose k"). Then, the probability of the event A equals

P(A) = C(5, 3) p^3(1 - p)^2.

Thus, the probability of seeing 3 heads and 2 tails after 5 flips of the coin is

C(5, 3) p^3(1 - p)^2.
A0 = {0 heads, 5 tails}
A1 = {1 head, 4 tails}
A2 = {2 heads, 3 tails}
A3 = {3 heads, 2 tails}
A4 = {4 heads, 1 tail}
A5 = {5 heads, 0 tails}
It is obvious that the events A0, A1, A2, A3, A4, A5 are disjoint events, i.e., Ai ∩ Aj = ϕ, i, j = 0, 1, ⋯, 5, i ≠ j, and we have

S = A0 ∪ A1 ∪ A2 ∪ A3 ∪ A4 ∪ A5.

According to the probability law axioms 2 and 3 in (1.2) and (1.3), we have

P(A0) + P(A1) + P(A2) + P(A3) + P(A4) + P(A5) = 1

leading to

C(5, 0)p^0(1 - p)^5 + C(5, 1)p^1(1 - p)^4 + C(5, 2)p^2(1 - p)^3 + C(5, 3)p^3(1 - p)^2 + C(5, 4)p^4(1 - p)^1 + C(5, 5)p^5(1 - p)^0 = 1

which can be written as

∑_{k=0}^{5} C(5, k) p^k (1 - p)^{5-k} = 1.
Example 2.12: Consider the flip of a coin experiment with P(H) = p. The sample space is S1 = {H, T}. Let's say that we flip the coin N times. In this case, the sample space is calculated by taking the N-fold Cartesian product of S1 by itself, i.e.,

S = S1 × S1 × ⋯ × S1.

What is the probability of seeing k heads at the end of the N flips? Following a similar approach as in the previous example, we can write

P(k heads appear in N flips) → P(Ak) = C(N, k) p^k (1 - p)^{N-k}  (2.16)

where Ak denotes the event of seeing k heads in N flips, and

∑_{k=0}^{N} C(N, k) p^k (1 - p)^{N-k} = 1.

Now consider the event A that the number of heads appearing in N tosses is a number between k1 and k2. The event A can be written as

A = A_{k1} ∪ A_{k1+1} ∪ ⋯ ∪ A_{k2}

whose probability equals

P(A) = ∑_{k=k1}^{k2} C(N, k) p^k (1 - p)^{N-k}.

Note: Let x and y be two simple events of a sample space; then we have

{x} ∪ {y} = {x, y}

thus a union of simple events is nothing but the set collecting those outcomes.
Example 2.13: A fair die is flipped 5 times. What is the probability that a number divisible by 3 appears 4 times?

Solution 2.13: For a single flip of the die, the sample space is

S1 = {1, 2, 3, 4, 5, 6}.

The event of seeing a number divisible by 3 is

A = {3, 6}.

Letting B = A^c = {1, 2, 4, 5}, the sample space can be written as

S2 = A ∪ B

with

P(A) = 2/6  P(B) = 4/6.

When the fair die is flipped 5 times, the sample space happens to be

S = S2 × S2 × S2 × S2 × S2

leading to

P(C) = 5 × (1/3)^4 × (2/3) → P(C) = C(5, 4) × (1/3)^4 × (2/3)

where C denotes the event that a number divisible by 3 appears 4 times in 5 flips. That is, employing

C(N, k) p^k (1 - p)^{N-k}

directly for the given example, we can get the same result.
Theorem 2.1: Let S be the sample space of an experiment, and let A be an event. Assume that the experiment is performed N times. The probability of the event A occurring k times in N trials can be calculated as

P_N(A_k) = C(N, k) p^k (1 - p)^{N-k}  (2.18)

where p = Prob(A).
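Theorem 2.1 maps directly onto a short function. The sketch below is illustrative; the function name binom_pmf is our own choice:

```python
from math import comb

def binom_pmf(N, k, p):
    """P_N(A_k) = C(N, k) p^k (1 - p)^(N - k), as in (2.18)."""
    return comb(N, k) * p**k * (1 - p)**(N - k)

print(binom_pmf(5, 3, 0.5))                           # 0.3125
print(sum(binom_pmf(5, k, 0.5) for k in range(6)))    # 1.0, the PMF sums to one
```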
Example 2.14: A biased coin has the simple event probabilities
P(H) = 2/3  P(T) = 1/3.
Assume that the biased coin is flipped and a fair die is tossed together 8 times.
What is the probability that a tail and an even number appear together 5 times?
Solution 2.14: The sample space of the biased coin flip experiment is

S1 = {H, T}

and the sample space of the fair die toss experiment is

S2 = {1, 2, 3, 4, 5, 6}.

The sample space of the combined experiment is

S = S1 × S2 → S = {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}.

The event

A = {A tail and an even number appear together}

can be written as

A = {T2, T4, T6}

and the sample space S3, considering the experimental outcomes given in the question, is

S3 = A ∪ B

where B = A^c, leading to

P(A) = 1/3 × 1/6 + 1/3 × 1/6 + 1/3 × 1/6 → P(A) = 1/6.

Now consider the combined experiment, i.e., the biased coin is flipped and a fair die is tossed together 8 times. The probability that the event A occurs 5 times in 8 trials of the experiment can be calculated using

P_N(A_k) = C(N, k) p^k (1 - p)^{N-k} → P_8(A_5) = C(8, 5) (1/6)^5 (5/6)^3.
Example 2.15: We flip a biased coin and draw a ball with replacement from a box
that contains 2 red, 3 yellow, and 2 blue balls. For the biased coin, the probabilities
of the head and tail are
P(Hb) = 1/4  P(Tb) = 3/4.
If we repeat the experiment 8 times, what is the probability of seeing a tail and
drawing a blue ball together 5 times?
Solution 2.15: For the biased coin flip experiment, the sample space can be written
as
S1 = {Hb, Tb}

and for the ball drawing experiment, we can write the sample space as

S2 = {R1, R2, Y1, Y2, Y3, B1, B2}.
If we consider two experiments at the same time, i.e., the combined experiment, the
sample space can be formed as
S = S1 × S2 → S = {Hb, Tb} × {R1, R2, Y1, Y2, Y3, B1, B2} →

S = {HbR1, HbR2, HbY1, HbY2, HbY3, HbB1, HbB2, TbR1, TbR2, TbY1, TbY2, TbY3, TbB1, TbB2}.

The event of seeing a tail and drawing a blue ball is

A = {TbB1, TbB2}

whose probability is P(A) = 3/4 × 2/7 → P(A) = 3/14. In our question, the experiment is repeated 8 times, and the probability of seeing a tail and drawing a blue ball together 5 times is asked. We can calculate the asked probability as

P(A5) = C(8, 5) (3/14)^5 (11/14)^3.
Exercise: A biased coin is flipped and a 4-sided biased die is tossed together
8 times. The probabilities of the simple events for the separate experiments are
P(H) = 2/3  P(T) = 1/3

P(f1) = 2/6  P(f2) = 1/6  P(f3) = 2/6  P(f4) = 1/6.
What is the probability of seeing a head and an odd number 3 times in 8 tosses?
Example 2.16: An electronic device is produced by a factory. The probability that a produced device is defective equals 0.1. We purchase 1000 of these devices. What is the probability that the total number of defective devices is a number between 50 and 150?

Solution 2.16: Consider the coin toss experiment. The sample space is S1 = {H, T}. If you flip the coin N times, you calculate the sample space using the N-fold Cartesian product

S = S1 × S1 × ⋯ × S1.

In a similar manner, the production of a device can be considered as an experiment whose sample space is

S2 = {D, N}

where D indicates a defective device and N indicates a non-defective one. The purchase of 1000 devices can be considered as 1000 repetitions of this experiment, with sample space

S = S2 × S2 × ⋯ × S2.

The probability of a simple event with N letters in which D appears k times can be calculated as

p^k (1 - p)^{N-k}

and the probability of the event including the simple events of S in which D appears k times is calculated as

C(N, k) p^k (1 - p)^{N-k}.

And if k is a number between k1 and k2, the sum of the probabilities of all these events equals

P(A) = ∑_{k=k1}^{k2} C(N, k) p^k (1 - p)^{N-k}.

For our question, the probability that the total number of defective devices is a number between 50 and 150 can be calculated as

P(A) = ∑_{k=50}^{150} C(1000, k) 0.1^k × 0.9^{1000-k}.
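This sum is tedious by hand but immediate in code. A minimal Python sketch (ours, not the book's) follows:

```python
from math import comb

p, N = 0.1, 1000   # defect probability and number of purchased devices
prob = sum(comb(N, k) * p**k * (1 - p)**(N - k) for k in range(50, 151))
print(prob)        # very close to 1: 50..150 defectives is almost certain
```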
Assume that there are M experiments with sample spaces S1, S2, ⋯, SM. The numbers of elements in the sample spaces S1, S2, ⋯, SM are N1, N2, ⋯, NM, respectively. If the experiments are all considered together as a single experiment, then the sample space of the combined experiment is calculated as

S = S1 × S2 × ⋯ × SM  (2.19)

and the number of elements in S equals

N = N1 × N2 × ⋯ × NM.  (2.20)
Example 2.17: Consider the integer set Fq = {0, 1, 2, ⋯, q - 1}. Assume that we construct integer vectors v = [v1 v2 ⋯ vL] using the integers in Fq. How many different integer vectors can we have?

Solution 2.17: Selecting a number from the integer set Fq = {0, 1, 2, ⋯, q - 1} can be considered as an experiment, and the sample space of this experiment is

S1 = {0, 1, 2, ⋯, q - 1}.

Constructing a vector of L integers corresponds to repeating the experiment L times, so the sample space of the combined experiment is

S = S1 × S1 × ⋯ × S1.

Then, the number of different integer vectors equals

q × q × ⋯ × q (L times) → q^L.
Example 2.18: Consider the integer set F3 = {0, 1, 2}. Assume that we construct integer vectors v = [v1 v2 ⋯ v10] including 10 integers using the elements of F3. How many different integer vectors can we have?

Solution 2.18: The answer is

3 × 3 × ⋯ × 3 (10 times) = 3^10.
2.7 Permutation
Consider the integer set S1 = {1, 2, ⋯, N}. Assume that we draw an integer from the set S1 without replacement, and we repeat this experiment k times in total. The sample space of the kth draw, i.e., the kth experiment, is indicated by Sk. The sample space of the combined experiment

S = S1 × S2 × ⋯ × Sk

contains

N × (N - 1) × ⋯ × (N - k + 1)

elements, and this number can be written as

N!/(N - k)!  (2.21)
The sample space S of the combined experiment contains simple events consisting
of k distinct integers chosen from S1. Thus, at the end of the kth trial, we obtain a
sequence of k distinct integers. And the number N × (N - 1) × ⋯ × (N - k + 1)
indicates the total number of integer sequences containing k distinct integers, i.e., the
number of elements in the sample space S.
52 2 Total Probability Theorem, Independence, Combinatorial
The discussion given above can be extended to any set containing objects rather
than integers. In that case, while forming the distinct combination of objects, we pay
attention to the index of the objects.
Example 2.19: The set S1 = {1, 2, 3} is given. We draw 2 integers from the set
without replacement. Write the possible generated sequences.
Solution 2.19: Assume that at the first trial 1 is selected; then at the end of the second trial, we can get the sequences

12  13.

If at the first trial 2 is selected, then at the end of the second trial, we can get the sequences

21  23.

If at the first trial 3 is selected, then at the end of the second trial, we can get the sequences

31  32.

Thus, in total, we can get

3!/(3 - 2)! = 6

sequences.
Example 2.20: In English language, there are 26 letters. How many words can be
formed consisting of 5 distinct letters?
Solution 2.20: You can consider this question as the draw of letters from an alphabet box without replacement, where the experiment is repeated 5 times. Then, the number of words that contain 5 distinct letters can be calculated as

26 × 25 × 24 × 23 × 22.
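Such k-permutation counts can be computed directly, e.g., with Python's standard library; the snippet below is our illustration:

```python
from math import factorial, perm

# number of 5-letter words with distinct letters from a 26-letter alphabet
print(perm(26, 5))                          # 26*25*24*23*22 = 7893600
print(factorial(26) // factorial(26 - 5))   # the same value via N!/(N-k)!
```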
2.8 Combinations
Assume that there are N people, and we want to form a group consisting of k persons
selected from N people. How many different groups can we form?
The answer to this question passes through the permutation calculation. We can find the answer by calculating the k-permutation of N. However, since only the identities of the selected people matter when forming a group, some of the sequences include the same persons although their order in the sequence is different. For instance, the sequences abcd and bcda contain the same persons, and they are considered the same group. The elements of a sequence containing k distinct elements can be reordered in k! different ways.
Example 2.21: The sequence aec can be reordered in 3! = 6 different ways as

aec ace eac eca cae cea.

Thus, the number of different groups consisting of k persons selected from N people can be calculated by dividing the number of k-permutations by k!, i.e.,

N!/((N - k)! × k!)  (2.22)

which is denoted by the binomial coefficient

C(N, k) = N!/((N - k)! × k!).  (2.23)
Example 2.22: Consider the sample space S = {a, b, c, d}. The number of different
sequences containing 2 distinct letters from S can be calculated using 2 permutations
of 4 as
4!/(4 - 2)! = 12

and these sequences are
ab ac ad ba bc bd ca cb cd da db dc:
On the other hand, if reordering is not wanted, then the number of sequences
containing 2 distinct letters can be calculated using
4!/((4 - 2)! × 2!) = 6

and these sequences are
ab ac ad bc bd cd:
Example 2.23: A box contains 60 items, and of these 60 items 15 of them are
defective. Suppose that we select 23 items randomly. What is the probability that
from these 23 items 8 of them are defective?
Solution 2.23: Let's formulate the solution as follows. The sample space consists of all the ways of choosing 23 items out of 60, i.e.,

N(S) = C(60, 23).

Consider the event

A = {From the 23 selected items, 8 of them are defective and 15 of them are robust}.

The event A can be considered as the combination of two selections, i.e.,

A = A1 × A2

where A1 is the selection of 8 defective items out of the 15 defective ones, and A2 is the selection of 15 robust items out of the 45 robust ones; then

N(A) = C(15, 8) × C(45, 15)

and the probability of the event A is

P(A) = N(A)/N(S) → P(A) = [C(15, 8) × C(45, 15)] / C(60, 23).
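This ratio of binomial coefficients (a hypergeometric probability) can be evaluated with a short Python sketch (our illustration):

```python
from fractions import Fraction
from math import comb

# P(8 defective among the 23 drawn) = C(15,8) C(45,15) / C(60,23)
p = Fraction(comb(15, 8) * comb(45, 15), comb(60, 23))
print(float(p))   # numerical value of the probability
```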
Example 2.24: An urn contains 3 red and 3 green balls, each of which is labeled by
a different number. A sample of 4 balls are drawn without replacement. Find the
number of elements in the sample space.
Solution 2.24: Let’s show the content of the urn by the set {R1, R2, R3, G1, G2, G3}.
After drawing of the 4 balls, we can get the sequences

R1R2R3G1 (3R 1G)  R1R2R3G2 (3R 1G)  R1R2R3G3 (3R 1G)  R1R2G1G2 (2R 2G)  R1R2G1G3 (2R 2G)
R1R2G2G3 (2R 2G)  R2R3G1G2 (2R 2G)  R2R3G1G3 (2R 2G)  R2R3G2G3 (2R 2G)  R1R3G1G2 (2R 2G)
R1R3G1G3 (2R 2G)  R1R3G2G3 (2R 2G)  R1G1G2G3 (1R 3G)  R2G1G2G3 (1R 3G)  R3G1G2G3 (1R 3G)

which contains C(6, 4) = 15 elements in total; that is, the number of elements in the sample space is

N(S) = C(6, 4).
Example 2.25: For the previous example, consider the event A defined as

A = {2 of the drawn balls are red and 2 of them are green}.

The event A can be written as

A = A1 × A2

where A1 is the selection of 2 red balls out of the 3 red ones, and A2 is the selection of 2 green balls out of the 3 green ones. We have

N(A1) = C(3, 2)  N(A2) = C(3, 2)

and

N(A) = N(A1) × N(A2) → N(A) = C(3, 2) × C(3, 2).

Then, the probability of the event A is

P(A) = N(A)/N(S) → P(A) = [C(3, 2) × C(3, 2)] / C(6, 4).
2.9 Partitions
Assume that N objects are to be divided into r groups such that the first group contains N1 objects, the second group contains N2 objects, and so on, with N1 + N2 + ⋯ + Nr = N. The selection of each group can be considered as an experiment, and the combined experiment has the sample space

S = S1 × S2 × ⋯ × S_{Nr}

and the size of the sample space S, denoted by |S|, which indicates the number of ways these groups can be formed, can be calculated using

|S| = |S1| × |S2| × ⋯ × |S_{Nr}|

leading to

C(N, N1) × C(N - N1, N2) × C(N - N1 - N2, N3) × ⋯ × C(N - N1 - ⋯ - N_{r-1}, Nr)

which equals

[N!/((N - N1)! × N1!)] × [(N - N1)!/((N - N1 - N2)! × N2!)] × ⋯ × [(N - N1 - ⋯ - N_{r-1})!/((N - N1 - ⋯ - Nr)! × Nr!)]

leading to

N!/(N1! × N2! × ⋯ × Nr!).  (2.24)
The idea of the partitions can also be interpreted in a different way considering the permutation law. If there are N distinct objects available, the number of N-object sequences that can be formed from these N objects can be calculated as

N × (N - 1) × ⋯ × 1 = N!  (2.25)

That is, the total number of permutations for N objects equals N!.

In fact, the result in (2.25) is nothing but the number of elements in the sample space of the combined experiment; there are N experiments in total, and the sample space of the kth, k = 1, ⋯, N, experiment contains

C(N - k + 1, 1) = N - k + 1

elements.

Note: |S| indicates the number of elements in the set S.

If N1 of the objects are the same, then the total number of permutations is

N!/N1!.

If N1 objects are the same, and N2 objects are the same, then the total number of permutations is

N!/(N1! × N2!).

In a similar manner, if N1 objects are the same, N2 objects are the same, and so on up to Nr objects being the same, the total number of permutations is

N!/(N1! × N2! × ⋯ × Nr!) < N!  (2.26)
Example 2.26: In the English language, there are 26 letters. How many words can
be formed consisting of 5 distinct letters?
Solution 2.26: You can consider this question as the draw of letters from an alphabet box without replacement, where the experiment is repeated 5 times. Then, the number of words that contain 5 distinct letters can be calculated as

26 × 25 × 24 × 23 × 22.
Example 2.27: The total number of permutations of the sequence abc is 3! = 6, and these sequences are

abc acb bac bca cab cba.

On the other hand, the total number of permutations of the sequence aab is 3!/2! = 3, and these sequences are

aab aba baa

which contains 3 sequences, i.e., 6/2!; the reason for this reduction is that interchanging the two identical letters a does not produce a new sequence.
Example 2.28: The total number of permutations for the sequence abcd is 4! = 24.
That is, by reordering the items in abcd, we can write 24 distinct sequences in total.
On the other hand, the total number of permutations for the sequence abac is
4!/2! = 12, and these sequences are
abac acab abca acba aabc aacb cbaa bcaa baca caba caab baac
Exercise: For the sequences abcde and aaabbc write all the possible permutations,
and show the relation between the permutation number of both sequences.
Example 2.29: How many different letter sequences can be formed by reordering the letters of the word TELLME?

Solution 2.29: The word TELLME contains two E's and two L's; hence, the number of different letter sequences equals

6!/(2! × 2!).
Partitions Continued
Let S be the sample space of an experiment, and A1, A2, A3, ⋯, Ar be the disjoint sets forming a partition of S such that Ai ∩ Aj = ϕ, i ≠ j, and A1 ∪ A2 ∪ ⋯ ∪ Ar = S. The probabilities of the disjoint events A1, A2, A3, ⋯, Ar are

p1, p2, ⋯, pr

such that

p1 + p2 + ⋯ + pr = 1.

In terms of the partition, the sample space can be expressed as

S = {A1, A2, ⋯, Ar}.

Assume that we repeat the experiment N times. Consider the event B defined as

B = {A1 occurs N1 times, A2 occurs N2 times, ⋯, Ar occurs Nr times}.

The probability of the event B can be calculated as

P(B) = [N!/(N1! × N2! × ⋯ × Nr!)] × p1^{N1} × p2^{N2} × ⋯ × pr^{Nr}  (2.27)

where

N1 + N2 + ⋯ + Nr = N  (2.28)

and

N!/(N1! × N2! × ⋯ × Nr!)  (2.29)

is the total number of elements in B. Every element of B has the same probability of occurrence.
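Formula (2.27) translates directly into code. The sketch below is illustrative; the function name multinomial_prob is our own choice:

```python
from math import factorial

def multinomial_prob(counts, probs):
    """P(B) = N!/(N1!...Nr!) * p1^N1 * ... * pr^Nr, as in (2.27)."""
    N = sum(counts)
    coeff = factorial(N)
    for n in counts:
        coeff //= factorial(n)
    value = float(coeff)
    for n, p in zip(counts, probs):
        value *= p**n
    return value

# Example 2.30: 2 or 4 appears 5 times, 3 appears 4 times, the rest 6 times
print(multinomial_prob([5, 4, 6], [2/6, 1/6, 3/6]))
```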
Example 2.30: A fair die is tossed 15 times. What is the probability that the
numbers 2 or 4 appear 5 times and 3 appears 4 times?
Solution 2.30: For the given experiment, the sample space is S1 = {1, 2, 3, 4, 5, 6}. We can define the disjoint events A1, A2, and A3 as

A1 = {2, 4}  A2 = {3}  A3 = {1, 5, 6}.

Considering the experimental output expected in the question, we can write the sample space of the experiment as

S1 = A1 ∪ A2 ∪ A3.

The probabilities of these disjoint events are

P(A1) = 2/6  P(A2) = 1/6  P(A3) = 3/6.

We perform the experiment 15 times. The sample space of the repeated combined experiment can be found by taking the 15-fold Cartesian product of S1 by itself as

S = S1 × S1 × ⋯ × S1

which includes simple events of the form

S = {A1A1⋯A1, A1A1⋯A1A2, ⋯}.

Consider the event B = {A1 occurs 5 times, A2 occurs 4 times, and A3 occurs 6 times}. Using (2.27), the probability of B can be calculated as

P(B) = [15!/(5! × 4! × 6!)] × (2/6)^5 × (1/6)^4 × (3/6)^6

where

15!/(5! × 4! × 6!)

is the number of elements in the event B, and

(2/6)^5 × (1/6)^4 × (3/6)^6

is the probability of each element.
Example 2.31: A dart board is divided into four regions T1, T2, T3, and T4, and a dart is thrown at the board 12 times. What is the probability that each region is hit 3 times?

Solution 2.31: The sample space of a single throw is

S1 = {T1, T2, T3, T4}.

The probabilities of the simple events T1, T2, T3, T4 are given in the question as

P(T1) = 0.1  P(T2) = 0.4  P(T3) = 0.1  P(T4) = 0.4.

Throwing the dart 12 times can be considered as repeating the same experiment 12 times, and the sample space of the combined experiment in this case can be calculated by taking the 12-fold Cartesian product of S1 by itself, i.e.,

S = S1 × S1 × ⋯ × S1

which includes simple events of the form

S = {T1T1T1T1T1T1T1T1T1T1T1T1, T1T1T1T1T1T1T1T1T1T1T1T2, ⋯}.

Consider the event B = {Each region is hit 3 times}, i.e.,

B = {T1T1T1T2T2T2T3T3T3T4T4T4, T1T2T1T1T2T2T3T3T3T4T4T4, ⋯}.

Using

P(B) = [N!/(N1! × N2! × ⋯ × Nr!)] × p1^{N1} × p2^{N2} × ⋯ × pr^{Nr}

the probability of B is calculated as

P(B) = [12!/(3! × 3! × 3! × 3!)] × 0.1^3 × 0.4^3 × 0.1^3 × 0.4^3.
Exercise: A fair die is flipped 8 times. Determine the probability of an odd number
appearing 2 times and 4 appearing 3 times.
2.10 Case Study: Modeling of Binary Communication Channel

In a binary communication system, the bits 0 and 1 are transmitted through a communication channel. Let's define the events

T0 = {Transmitting a 0}  T1 = {Transmitting a 1}
R0 = {Receiving a 0}  R1 = {Receiving a 1}
E = {Error at receiver}.

[Fig. 2.4: Binary communication channel connecting the transmitted bits to the received bits.]

The events T0 and T1 are disjoint events, i.e., T0 ∩ T1 = ϕ. The error event E can be written as

E = (T0 ∩ R1) ∪ (T1 ∩ R0)

where the events T0 ∩ R1 and T1 ∩ R0 are disjoint. Since

S = T0 ∪ T1

we can write R1 as

R1 = R1 ∩ S → R1 = R1 ∩ (T0 ∪ T1)

leading to

R1 = (R1 ∩ T0) ∪ (R1 ∩ T1).

[Fig. 2.5: Binary symmetric channel with transition probabilities P(R1|T1) = 0.90, P(R0|T1) = 0.1, P(R1|T0) = 0.05.]
Example 2.32: For the binary symmetric channel shown in Fig. 2.5, if the bits “0”
and “1” have an equal probability of transmission, calculate the probability of error
at the receiver side.
Solution 2.32: Since the bits “0” and “1” have equal transmission probabilities,
then we have
1
PðT 0 Þ = PðT 1 Þ = :
2
Using the channel transition probabilities, the transmission error can be calculated as
leading to
1 1
PðE Þ = 0:05 × þ 0:1 × → PðE Þ = 0:075:
2 2
Example 2.33: For the binary symmetric channel shown in Fig. 2.6, the bits “0”
and “1” have equal probability of transmission.
If a “1” is received at the receiver side:
(a) What is the probability that a “1” was sent?
(b) What is the probability that a “0” was sent?
Solution 2.33:
(a) We are asked to find P(T1| R1), which can be calculated as
P ðT 1 \ R 1 Þ
PðT 1 jR1 Þ =
PðR1 Þ
PðR1 jT 1 ÞPðT 1 Þ
=
PðR1 jT 1 ÞPðT 1 Þ þ PðR1 jT 0 ÞPðT 0 Þ
0:95 × 0:5
=
0:95 × 0:5 þ 0:05 × 0:5
= 0:95:
64 2 Total Probability Theorem, Independence, Combinatorial
P( R0 | T1 ) = 0.1 P( R1 | T0 ) = 0.05
T1 R1
P( R1 | T1 ) = 0.90
P ðT 0 \ R 1 Þ
PðT 0 jR1 Þ =
PðR1 Þ
PðR1 jT 0 ÞPðT 0 Þ
=
PðR1 jT 1 ÞPðT 1 Þ þ PðR1 jT 0 ÞPðT 0 Þ
0:05 × 0:5
=
0:95 × 0:5 þ 0:05 × 0:5
= 0:0526:
Problems
S = fs1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 g:
S = fs1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 g
such that
S = A [ B [ C:
D = fs1 , s4 , s5 , s7 g:
Problems 65
Verify that
Write these events explicitly, and decide whether the events A and B are
independent of each other. Decide whether the events A and B are conditionally
independent of each other given the event C.
6. Assume that you get up early and go for the bus service for your job every
morning. The probability that you miss the bus service is 0.1. Calculate the
probability that you miss the bus service 5 times in 30 days, i.e., in a month.
7. A three-sided biased die is tossed. The sample space of this experiment is given as
S = ff 1 , f 2 , f 3 g
1 2 1
Pðf 1 Þ = Pðf 2 Þ = Pðf 3 Þ = :
4 4 4
66 2 Total Probability Theorem, Independence, Combinatorial
Assume that we toss the die 8 times. What is the probability that f1 appears
5 times out of these 8 tosses.
8. Using the integers in integer set F4 = {0, 1, 2, 3}, how many different integer
vectors consisting of 12 integers can be formed?
9. From a group of 10 men and 8 women, 6 people will be selected to form a jury for
a court. It is required that the jury would contain at least 2 men. In how many
different ways we can form the jury?
Chapter 3
Discrete Random Variables
Let S = {s1, s2, ⋯, sN} be the sample space of a discrete experiment, and X ðÞ be a
real valued function that maps the simple events of the sample space to real numbers.
This is illustrated in Fig. 3.1.
Example 3.1: Consider the coin flip experiment. The sample space is S = {H, T}.
A random variable X ðÞ can be defined on the simple events, i.e., outcomes, as
X ðH Þ = 3:2 X ðT Þ = - 2:4:
S = fHH b , HT b , TH b , TT b g:
Let’s define a real valued function X ðÞ on simple outcomes of the combined
experiment as
1 if si contains H b
X ðsi Þ = ð3:1Þ
3 if si contains T b :
~
X()
s
s1 s 2 s3 s4
If we denote the simple events HHb, HTb, THb, TTb by s1, s2, s3, s4, we can draw
the graph of
X ðÞ
as in Fig. 3.2.
Example 3.3: Consider the toss of a fair die experiment. The sample space of this
experiment can be written as S = {s1, s2, s3, s4, s5, s6}. Let’s define the random
variable X ðÞ on simple events of S as
2i - 1 if i is odd
X ðs i Þ =
2i þ 1 if i is even:
X ðs1 Þ = 2 × 1 - 1 → X ðs1 Þ = 1
X ðs2 Þ = 2 × 2 þ 1 → X ðs2 Þ = 5
X ðs3 Þ = 2 × 3 - 1 → X ðs3 Þ = 5
X ðs4 Þ = 2 × 4 þ 1 → X ðs4 Þ = 9
X ðs5 Þ = 2 × 5 - 1 → X ðs5 Þ = 9
X ðs6 Þ = 2 × 6 þ 1 → X ðs6 Þ = 13:
3.2 Defining Events Using Random Variables 69
Then, we can state that the random variable function X ðÞ takes the values from
the set {1, 5, 9, 13}, which can be called a range set of the random variable X and can
be denoted as
R = f1, 5, 9, 13g:
X
si jX ðsi Þ = x ð3:2Þ
which indicates the subset, i.e., event, of S consisting of si which satisfy X ðsi Þ = x.
Example 3.4: For the toss-of-a-die experiment in the previous question, the random
variable is defined as
2i - 1 if i is odd
X ðs i Þ =
2i þ 1 if i is even:
A = si jX ðsi Þ = 5
A = fs 2 , s 3 g
S = fs1 , s2 , s3 , s4 , s5 , s6 g:
1 if i is odd
X ðs i Þ =
-1 if i is even:
A = si jX ðsi Þ = - 1 :
Since
the event
A = si jX ðsi Þ = - 1
A = fs2 , s4 , s6 g:
Example 3.6: Consider the two independent flips of a fair coin. The sample space
of this experiment can be written as S = {HH, HT, TH, TT}. Let’s define the random
variable X ðÞ on simple events of S as
where si is one of the simple events of S. The random variable function can be
explicitly written as
A = si jX ðsi Þ = 1
B = si jX ðsi Þ = - 1
C = si jX ðsi Þ = 2
D = si jX ðsi Þ = 1 or X ðsi Þ = 0 :
si jX ðsi Þ = x
means finding those si that satisfy X ðsi Þ = x and using all these si forming an event
of S.
Since
X ðHT Þ = 1 X ðTH Þ = 1
the event
A = si jX ðsi Þ = 1
A = fHT, TH g:
B = si jX ðsi Þ = - 1
B = f g:
C = si jX ðsi Þ = 2
since X ðsi Þ = 2 is satisfied for only si = TT, i.e., X ðTT Þ = 2, the event C can be
written as
C = fTT g:
The event
D = si jX ðsi Þ = 1 or X ðsi Þ = 0
D = fHT, TH, TT g
since
72 3 Discrete Random Variables
The expression
si jX ðsi Þ = x
X=x : ð3:3Þ
That is, the mathematical expressions si jX ðsi Þ = x and X = x mean the same
thing, i.e.,
The expression
si jX ðsi Þ ≤ x
means making a subset of S, i.e., an event, from those si satisfying X ðsi Þ ≤ x, and the
event
si jX ðsi Þ ≤ x
X ≤x : ð3:5Þ
Example 3.7: Consider the roll-of-a-die experiment. The sample space of this
experiment can be written as S = {s1, s2, s3, s4, s5, s6}. The random variable X ðÞ
on simple events of S is defined as
X ðsi Þ = 4 × i
X ðs1 Þ = 4 × 1 → X ðs1 Þ = 4
X ðs2 Þ = 4 × 2 → X ðs2 Þ = 8
X ðs3 Þ = 4 × 3 → X ðs3 Þ = 12
X ðs4 Þ = 4 × 4 → X ðs4 Þ = 16
X ðs5 Þ = 4 × 5 → X ðs5 Þ = 20
X ðs6 Þ = 4 × 6 → X ðs6 Þ = 24:
3.2 Defining Events Using Random Variables 73
A = si jX ðsi Þ ≤ 10
B = si jX ðsi Þ ≤ 14
C = si jX ðsi Þ ≤ 20
D = si jX ðsi Þ ≤ 25 :
A = si jX ðsi Þ ≤ 10
A = fs1 , s2 g:
B = fs 1 , s 2 , s 3 g C = fs1 , s2 , s3 , s4 , s5 g D = fs1 , s2 , s3 , s4 , s5 , s6 g:
X ≤x
si jX ðsi Þ ≤ x
i.e.,
Example 3.8: The range set of the random variable X is given as R = f- 1, 1, 3g:
X
Verify that
S= X = -1 [ X=1 [ X=3 :
74 3 Discrete Random Variables
s2
~
{ X = 1} s4 1
.
.
.
~ sN
{ X = 1} . 1
.
.
Solution 3.8: The random variable function X ðÞ is defined on the simple outcomes
of the sample space S, and it is a one-to-one function between simple events and real
numbers. The events
X = -1 X =1 X =3
are disjoint events and their union gives S. This is illustrated in Fig. 3.3.
Example 3.9: Consider the two independent tosses of a fair coin. The sample space
of this experiment can be written as S = {HH, HT, TH, TT}. Let’s define the random
variable X ðÞ on simple events of S as
where si is one of the simple events of S. The random variable function can be
explicitly written as
S= X =0 [ X =1 [ X =2
X=0 \ X=1 =ϕ
X=0 \ X=2 =ϕ
X=1 \ X=2 =ϕ
X =0 \ X =1 \ X =2 =ϕ
and we have
S = fs1 , s2 , s3 , s4 , s5 , s6 g:
X= -2 X =3 X =4 :
X = - 2 [ X = 3 [ X = 4 = S:
Solution 3.10:
(a) The events X = - 2 , X = 3 , and X = 4 can be explicitly written as
X = - 2 = fs 1 , s 2 , s 3 g
76 3 Discrete Random Variables
X = 3 = fs4 g
X = 4 = fs5 , s 6 g
(b) Considering the explicit form of the events in part a, it is obvious that the events
X = - 2 , X = 3 , and X = 4 are disjoint of each other, i.e.,
X = -2 \ X=3 =ϕ
X = -2 \ X=4 =ϕ
X=3 \ X=4 =ϕ
X= -2 \ X =3 \ X =4 =ϕ
which is the sample space. The partition of the sample space is depicted in
Fig. 3.4.
The probability mass function p(x) for discrete random variable X is defined as
pðxÞ = Prob X = x
where x is a value of the random variable function X ðÞ. The probability mass
function can also be indicated as
where the subscript of p(x), i.e., X, points to a random variable to which the
probability mass function belongs to. For the easiness of the notation, we will not
use the subscript in the probability mass function expression unless otherwise
indicated.
Let’s illustrate the concept of probability mass function with an example.
3.3 Probability Mass Function for Discrete Random Variables 77
s5 , s6
~
{ X = 4}
Example 3.11: Consider the experiment, the two independent flips of a fair coin.
The sample space of this experiment can be written as S = {HH, HT, TH, TT}. Let’s
define the random variable X ðÞ on simple events of S as
where si is one of the simple events of S. The random variable function can be
explicitly written as
R = f0, 1, 2g:
X
pðxÞ = Prob X = x
where x takes one of the values from the set R = f0, 1, 2g, i.e., x can be either
X
0, or it can be 1, or it can be 2. We will consider each distinct x value for the
calculation of p(x). For x = 0, the probability mass function p(x) is calculated as
78 3 Discrete Random Variables
pðx= 0Þ = Prob X = 0
X = 0 = fTT g:
Then, we have
1
pðx= 0Þ = PfTT g → pðx= 0Þ = :
4
pðx= 1Þ = Prob X = 1
X = 1 = fHT, TH g:
Then, we have
1
pðx= 1Þ = PfHT, TH g → pðx= 1Þ = PfHT g þ PfTH g → pðx= 1Þ = :
2
pðx= 2Þ = Prob X = 2
X = 2 = fHH g:
Then, we have
1
pðx= 2Þ = PfHH g → pðx= 2Þ = PfHH g → pðx= 2Þ = :
4
Hence, the values of the probability mass function p(x) are found as
1 1 1
pðx= 0Þ = pðx= 1Þ = pðx= 2Þ = :
4 2 4
We can draw the graph of probability mass function p(x) with respect to x as
in Fig. 3.5.
3.3 Probability Mass Function for Discrete Random Variables 79
1/4
x
0 1 2
1 1 1
pðx= 0Þ þ pðx= 1Þ þ pðx= 2Þ = þ þ → pðx= 0Þ þ pðx= 1Þ þ pðx= 2Þ = 1:
4 2 4
That is,
pðxÞ = 1:
x
Theorem 3.1:
(a) The probability mass function of a discrete random variable X satisfies
pðxÞ = 1: ð3:7Þ
x
R = fx1 , x2 , x3 g:
X
S = X = x1 [ X = x2 [ X = x3 ð3:8Þ
P ð SÞ = P X = x1 þ P X = x2 þ P X = x3
=1 pðx = x1 Þ pðx = x2 Þ pðx = x3 Þ
p ð x = x 1 Þ þ pð x = x 2 Þ þ pð x = x 3 Þ = 1 → pðxÞ = 1:
x
This process can be performed for any range set with any number of elements.
F ðxÞ = Prob X ≤ x
F ðxÞ = Prob X ≤ x :
X
- 1 < x < a1
a1 ≤ x < a2
a2 ≤ x < a3
⋮
aN - 1 ≤ x < a N
aN ≤ x < 1
Example 3.12: Consider again the experiment, the two independent tosses of a fair
coin. The sample space of this experiment can be written as S = {HH, HT, TH, TT}.
Let’s define the random variable X ðÞ on simple events of S as
where si is one of the simple events of S. The random variable function can be
explicitly written as
Calculate and draw the cumulative distribution function, i.e., F(x), of the random
variable X.
Solution 3.12: The range set of the random variable X can be written as
R = f0, 1, 2g:
X
-1<x<0
0≤x<1
1≤x<2
2 ≤ x < 1:
In the second step, we determine the cumulative distribution function, i.e., CDF,
F(x) for the given intervals. To determine the CDF for the given intervals, we can
pick a value for x for the interval under concern and calculate the value of F(x). For
our example, we can proceed as follows:
3/ 4
1/ 4
x
0 1 2
-1<x<3
3≤x<7
7 ≤ x < 12
12 ≤ x < 1:
F ðxÞ = Prob X ≤ x
pðxÞ = Prob X = x
as
F ð xi Þ = pðxÞ: ð3:10Þ
x ≤ xi
R = f1, 2, 3, 4, 5g:
X
Calculate the value of the cumulative distribution function F(x) at x = 3.4 in terms
of its probability mass function p(x).
Solution 3.14: When the cumulative distribution function
F ðxÞ = Prob X ≤ x
where Prob X ≤ 3:4 means the probability of the discrete random variable X
producing values less than 3.4. Since the values produced by discrete random
variable X less than 3.4 are 1, 2,and 3,
can be written as
pðxÞ = Prob X = x
we get
The event X ≤ 3:4 can also be considered as the union of the mutually
exclusive events X = 1 , X = 2 , X = 3 , i.e.,
X ≤ 3:4 = X = 1 [ X = 2 [ X = 3
Show that the value of the cumulative distribution function F(x) on the interval
x3 < x < x4 can be written in terms of the probability mass function values as
Solution 3.15: Since the range set of the discrete random variable is given as
The sample space of the random variable can be partitioned as in Fig. 3.7.
Considering Fig. 3.7, the event
can be written as
X ≤ x = X = x1 [ X = x2 [ X = x3 ð3:11Þ
~
X = x2 ~
X = x4
F(x) and p(x) are the cumulative distribution and probability mass function of the
discrete random variable X, respectively. How do we calculate F(0.5) and F(2) using
probability mass function p(x)?
Solution 3.16: Considering the discussion in the previous example, we can write
F(0.5) and F(2) as
Example 3.17: The probability mass function values of a discrete random variable
X is given as
Write the range set of the random variable and find the value of a.
Solution 3.17: The range set of the random variable X is
R = f- 1, 2, 3:5g:
X
Since
pð x Þ = 1
x
we have
1 2 3
p ð 2Þ = p ð 5Þ = p ð 8Þ = :
6 6 6
Find the range set, and draw the cumulative distribution function of the discrete
random variable X.
Solution 3.18: The range set of the discrete random variable X can be written as
R = f2, 5, 8g:
X
To draw the cumulative distribution function, let’s first write the x-intervals as
-1<x<2
2≤x<5
5≤x<8
8 ≤ x < 1:
In the second step, we calculate the value of the probability distribution function
F ðxÞ = Prob X ≤ x
on the determined intervals. For this purpose, we can select a value on the deter-
mined interval and calculate the value of the probability distribution function on the
concerned interval as follows:
- 1 < x < 2!F ðxÞ ¼ Prob X ≤ x !F ð1Þ ¼ Prob X ≤ 1 !F ðxÞ ¼ 02 ≤ x < 5!F ðxÞ
¼ Prob X ≤ x !F ð3Þ ¼ Prob X ≤ 3 !F ðxÞ ¼ pð2Þ5 ≤ x < 8!F ðxÞ ¼ Prob X ≤ x !F ð6Þ
¼ Prob X ≤ 6 !F ðxÞ ¼ pð2Þ þ pð5Þ8 ≤ x < 1!F ðxÞ ¼ Prob X ≤ x !F ð9Þ
¼ Prob X ≤ 9 !F ðxÞ ¼ pð2Þ þ pð5Þ þ pð8Þ:
Hence, we have
3/ 6
1/ 6
x
2 5 8
The graph of the cumulative distribution function F(x) happens to be as in Fig. 3.8
Exercise: A fair and a biased coin are tossed together. For the fair coin, we have
1 1
ProbðH Þ = ProbðT Þ = :
2 2
2 1
ProbðH b Þ = ProbðT b Þ = :
3 3
(c) Find the probability density function p(x) and cumulative distribution function
F(x) of the discrete random variable X ðÞ, and draw the graphs of p(x) and F(x).
Expected value and variance are two important parameters of a random variable.
Expected or mean value is also called the probabilistic average.
Before introducing the fundamental formulas for mean and variance calculations,
let’s consider the average value calculations via some examples. Assume that we
have a digit generator machine and the generated digits are 1, 2, and 6. Besides, each
88 3 Discrete Random Variables
digit has the same probability of occurrence. Let’s say that 60 digits are generated,
i.e., each digit is generated 20 times. The arithmetic average of the generated digit
sequence can be calculated as
20 × 1 þ 20 × 2 þ 20 × 6 1þ2þ6
→ → 3: ð3:12Þ
60 3
Now assume that the probability of occurrence of digits 1 and 2 are equal to each
other and it equals half of the probability of occurrence of digit 6. In this case, out of
60 generated digits, 15 of them are 1, the other 15 of them are 2, and 30 of them are
6. The arithmetic average of the digit sequence can be calculated as
15 × 1 þ 15 × 2 þ 30 × 6 1 1 1
→ × 1 þ × 2 þ × 6 → 3:75: ð3:13Þ
60 4 4 2
When (3.12) and (3.13) are compared to each other, we see in (3.13) we have a
larger number. This is due to the higher probability of occurrence of digit 6. Now it is
time to state the expected value concept.
The probabilistic average value, i.e., expected or mean value, of a discrete
random variable X with probability mass function p(x) is calculated using
E X = xpðxÞ
x
m= xpðxÞ: ð3:14Þ
x
If the range set of the discrete random variable X contains N values, and if the
probability of occurrence of values in the range set is equal to 1/N, then the mean
value expression in (3.14) happens to be the arithmetic average expression, i.e., if
p(x) = 1/N, then we get
x 1
m= →m= x:
x N N x
3.5.2 Variance
If the variance of a random variable is a small number, it means that the generated
values are close to the mean value of the random variable; on the other hand, if the
variance of a random variable is a large number, then it means that the spread of the
3.5 Expected Value (Mean Value), Variance, and Standard Deviation 89
generated values is very wide and the generated values are neither close to each other
nor close to the mean value.
Example 3.19: Assume that the sequences
v1 = ½1 2 3 2 1 3 1 4 v2 = ½- 10 12 3 0 87 34 5 - 2
Now let’s state the variance formula. The variance of a discrete random variable X
with mean value m and probability mass function p(x) is calculated as
2
Var X = E X - m2
where
2
E X = x2 pðxÞ:
x
2
Var X = E X - m
The standard deviation of a random variable is nothing but the square root of its
variance, i.e.,
σ= Var X : ð3:15Þ
R = f- 1, 0, 2, 3g:
X
The probability mass function p(x) of the discrete random variable X is given as
2 1 2 2 1
pðx = - 1Þ = pðx = 0Þ = pðx = 1Þ = pðx = 2Þ = pðx = 3Þ = :
8 8 8 8 8
Find the mean value, i.e., probabilistic average or expected value, variance, and
standard deviation of the discrete random variable X.
Solution 3.20: The mean value of the discrete random variable X is calculated as
E X = xpðxÞ →
x
leading to
7
E X =m= :
8
where
x2 pðxÞ
x
3.6 Expected Value and Variance of Functions of a Random Variable 91
is computed as
2 1 2 2 1
x 2 p ð x Þ = ð - 1Þ 2 × þ ð0Þ2 × þ ð 1Þ 2 × þ ð 2Þ 2 × þ ð3Þ2 ×
x 8 8 8 8 8
resulting in
21
x2 pð x Þ = : ð3:17Þ
x 8
2
21 7 119
Var X = - → Var X = → Var X ≈ 1:86:
8 8 64
Since standard deviation is nothing but the square root of the variance, we can get
it as
p
119
σ= Var X → σ = → σ ≈ 1:36:
8
as
7 2 2 7 2 1 7 2
2
Var X = - 1 - × þ 0- × þ 1- ×
8 8 8 8 8 8
7 2 2 7 2 1
þ 2- × þ 3- ×
8 8 8 8
Var X ≈ 1:86:
E g X = gðxÞpðxÞ
x
m= gðxÞpðxÞ:
x
2
Var g X =E g X - m2
where
2
E g X = ½gðxÞ2 pðxÞ:
x
which is computed as
R = f- 1, 1, 2g:
X
The probability mass function p(x) of the discrete random variable X is specified
as
1 1 1
pðx= - 1Þ = pðx= 1Þ = pðx= 2Þ = :
4 4 2
3 2 2
ð aÞ E X ð bÞ E X - 1 ð cÞ E X þ 1 :
3.6 Expected Value and Variance of Functions of a Random Variable 93
E g X = gðxÞpðxÞ:
x
(a) For
3
g X =X
we calculate E g X as
3
E g X = gðxÞpðxÞ → E X = x3 pð x Þ
x x
leading to
3 3 1 1 1
E X = x3 pð x Þ → E X = ð - 1Þ 3 × þ ð 1Þ 3 × þ ð 2Þ 3 ×
x
4 4 2
resulting in
3
E X = 4:
(b) For
2
g X =X -1
we calculate E g X as
2
E g X = gðxÞpðxÞ → E X - 1 = x2- 1 pðxÞ
x x
leading to
2
E X -1 = x2- 1 pðxÞ →
x
2 1 1 1
E X - 1 = ð - 1Þ2 - 1 × þ 12- 1 × þ 22- 1 × →
4 4 2
resulting in
94 3 Discrete Random Variables
2 3
E X -1 = :
2
(c) For
2
g X =X þ 1
we calculate E g X as
2
E g X = gðxÞpðxÞ → E X þ 1 = x2 þ 1 pð x Þ
x x
leading to
2
E X þ1 = x2 þ 1 pð x Þ →
x
2 1 1 1
E X þ 1 = ð - 1Þ 2 þ 1 × þ 12 þ 1 × þ 22 þ 1 × →
4 4 2
resulting in
2 7
E X -1 = :
2
R = f- 1, 1, 2g:
X
The probability mass function p(x) of the discrete random variable X is specified
as
1 1 1
pð x = - 1Þ = pð x = 1Þ = pð x = 2Þ = :
4 4 2
g X = 2X þ 1
we calculate E g X as
3.6 Expected Value and Variance of Functions of a Random Variable 95
leading to
E 2X þ 1 = ð2x þ 1ÞpðxÞ →
x
1 1 1
E 2X þ 1 = ð2 × ð- 1Þ þ 1Þ × þ ð 2 × 1 þ 1Þ × þ ð2 × 2 þ 1Þ × →
4 4 2
resulting in
12
E 2X þ 1 = → E 2X þ 1 = 3:
4
We know that
2
Var g X =E g X - m2
where
2
E g X = ½gðxÞ2 pðxÞ:
x
For
g X = 2X þ 1
2
we calculate E g X as
2 2
E g X = ½gðxÞ2 pðxÞ → E 2X þ 1 = ð2x þ 1Þ2 pðxÞ →
x x
2 1 1 1
E 2X þ 1 = ð2 × ð - 1Þ þ 1Þ2 × þ ð2 × 1 þ 1Þ2 × þ ð2 × 2 þ 1Þ2 ×
4 4 2
resulting in
2 60
E 2X þ 1 = :
4
96 3 Discrete Random Variables
2 60
Var 2X þ 1 = E 2X þ 1 - m2 → Var 2X þ 1 = - 32 → Var 2X þ 1 = 6:
4
Example 3.23: The probability mass function of the discrete random variable X is
given as
2
Var X = E X -m
2
equals E X - m2 :
3.6 Expected Value and Variance of Functions of a Random Variable 97
2
E X-m
we get
2
E X - 2mX þ m2
2
E X - 2mX þ m2 = x
x2- 2mx þ m2 pðxÞ ð3:18Þ
2 2
E X-m =E X - m2 :
1
x= -2
4
1
pðxÞ = x=0
2
1
x=3 :
4
1 1 1 1
E X = xpðxÞ → E X = - 2 × þ 0× þ 3× →E X = :
x 4 2 4 4
2 2
Var X = E X-m →E X-m = ð x - m Þ 2 pð x Þ
x
leading to
X~ - m
2
E = ð x - m Þ 2 pð x Þ →
x
2 2
2 1 1 1 1 1
= - 2 - 14 × þ 0- × þ 3- ×
4 4 2 4 4
204
= :
16
Standard deviation σ is nothing but the square root of the variance, then we have
204
σ= → σ ≈ 3:57:
16
E Y = aE X þ b: ð3:19Þ
E g X = gðxÞpðxÞ
x
for Y = aX þ b, we have
E Y~ = E aX~ þ b
= ðax þ bÞpðxÞ
x
=a xpðxÞ þ b pð x Þ
x x
= aE X~ þ b
= amx þ b:
Thus, we obtained
3.7 Some Well-Known Discrete Random Variables in Mathematic Literature 99
my = amx þ b:
Var Y = a2 Var X :
my = amx þ b:
2
Var Y = E Y - my
in which substituting
my = amx þ b
we get
Var Y~ = E Y~ - amx - b
2
Let X be a discrete random variable and x be an integer such that x 2 {0, 1, ⋯, N},
i.e., x is an integer taking values from the integer set {0, 1, ⋯, N}. If the random
variable X has the probability mass function
N x
pð x Þ = p ð1 - pÞN - x , 0 ≤ p ≤ 1 x = 0, 1, ⋯, N ð3:21Þ
x
then X is called a binomial random variable with parameters N and p. The mass
function p(x) is called binomial distribution or binomial probability mass function.
Since
pð x Þ = 1 ð3:22Þ
x
N
px ð1 - pÞN - x = 1: ð3:23Þ
x x
The graphs of the binomial distribution, i.e., binomial probability mass function
p(x), for N = 80, p = 0.1, p = 0.5, and p = 0.9 are drawn in Fig. 3.9.
Let X be a discrete random variable and x be an integer such that x 2 {0, 1, ⋯, 1},
i.e., x is non-negative integer. If the random variable X has the probability mass
function
then X is called a geometric random variable with parameter p. The mass function
p(x) is called geometric distribution or geometric probability mass function.
The graphs of the geometric distribution, i.e., geometric probability mass function
p(x), for p = 0.2, p = 0.4, and N = 20 are drawn in Fig. 3.10.
3.7 Some Well-Known Discrete Random Variables in Mathematic Literature 101
Fig. 3.9 Binomial distribution for N = 80, p = 0.1, p = 0.5, and p = 0.9.
Let X be a discrete random variable and x be an integer such that x 2 {0, 1, ⋯, 1},
i.e., x is non-negative integer. If the random variable X has the probability mass
function
λx
pð x Þ = e - λ , x2ℕ ð3:25Þ
x!
then X is called a Poisson random variable with parameter λ. The mass function p(x)
is called the Poisson distribution or Poisson probability mass function.
The graphs of the Poisson distributions, i.e., Poisson probability mass functions,
for λ = 4 and λ = 10 are drawn in Fig. 3.11.
102 3 Discrete Random Variables
Let X be a discrete random variable and x be an integer such that x 2 {0, 1}. If the
random variable X has the probability mass function
p if x = 1
pð xÞ = ð3:26Þ
1 - p if x = 0
then X is called a Bernoulli random variable with parameter p. The mass function
p(x) is called the Bernoulli distribution or Bernoulli probability mass function.
3.7 Some Well-Known Discrete Random Variables in Mathematic Literature 103
1
if x 2 fk, k þ 1, ⋯, k þ N - 1g
pð x Þ = N ð3:27Þ
0 otherwise
then X is called a discrete uniform random variable with parameters k and N. The
mass function p(x) is called discrete uniform distribution or discrete uniform prob-
ability mass function.
Example 3.26: Calculate the mean and variance of the Bernoulli random variable.
Solution 3.26: We can calculate the mean value of the Bernoulli random variable X
using its probability mass function
p if x = 1
pð x Þ =
1-p if x = 0
as
E X = xpðxÞ → E X = 1 × p þ 0 × ð1- pÞ → E X = p:
x
2 2
Var X = E X - E X
2
where E X is computed as
2 2 2
E X = x2 pð x Þ → E X = 12 × p þ 02 × ð1- pÞ → E X = p:
x
Example 3.27: The probability mass function of a discrete uniform random vari-
able X is given as
1
if - 2 ≤ x ≤ 3
pðxÞ = 6
0 otherwise:
Find the mean and variance of the discrete uniform random variable X.
Solution 3.27: The mean value can be calculated as
1 1 1 1 1 1
E X = xpðxÞ → E X = - 2 × þ ð- 1Þ × þ 0 × þ 1 × þ 2 × þ 3 ×
x 6 6 6 6 6 6
leading to
1 3
E X = ð- 2 - 1 þ 0 þ 1 þ 2 þ 3Þ × →E X = :
6 6
2
For the variance calculation, we first compute E X as
2 2 1 2 19
E X = x2 pðxÞ → E X = ð - 2Þ2 þ ð - 1Þ2 þ 02 þ 12 þ 22 þ 32 × →E X = :
x 6 6
2 2 19 1 35
Var X = E X - E X → Var X = - → Var X = :
6 4 12
Example 3.28: Calculate the mean value of the Poisson random variable.
Solution 3.28: We can calculate the mean value of the Poisson random variable X
using its probability mass function
λx
pð x Þ = e - λ , x2ℕ
x!
as
Problems 105
E X = xpðxÞ
x
1 λx
= xe - λ
x=0 x!
1 λ x
= xe - λ
x=1 x!
1 λx
= e-λ
x=1 ðx - 1Þ!
1
λx - 1
= λe - λ
x = 1 ðx - 1Þ!
1
-λ λm
= λe m=x-1
m = 0 m!
1
λm
= λe - λ
m=0
m!
eλ
= λe - λ eλ
=λ
E X = λ: ð3:28Þ
Problems
1. A fair three-sided die is tossed twice. Write the sample space for the combined
experiment. Let si be a simple outcome of the experiment, i.e., si denotes the
integer pairs 11, 12, 13, 21, ⋯, 33. The random variable function X ~ for the
simple events is defined as
~ ð si Þ = 0
A = si j X ~ ð si Þ = 1
B = si j X ~ ðs i Þ = 2
C = si j X ~ ðs i Þ ≤ 1 :
D = si j X
(c) Verify that the events defined in part b are disjoint events and they make a
partition of the sample space, i.e., they are disjoint and S = A [ B [ C.
106 3 Discrete Random Variables
S = fs1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 g:
~ on S is defined as
The random variable X
~ ðs 1 Þ = - 1
X ~ ðs2 Þ = 0
X ~ ðs 3 Þ = 1
X ~ ðs 4 Þ = 0
X
~ ðs5 Þ = 1
X ~ ðs6 Þ = 0
X ~ ðs7 Þ = - 1
X ~ ðs 8 Þ = 1
X
~ = -1
X ~ =0
X ~ =1 :
X
~ = -1 ,
(b) Are the events X ~ = 0 , and X
X ~ = 1 disjoint?
(c) Show that
~ = -1 [ X
X ~ =0 [ X
~ = 1 = S:
S = fs1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 g:
~ on S is defined as
The random variable X
~ ðs 1 Þ = - 2
X ~ ðs2 Þ = 1
X ~ ðs 3 Þ = 1
X ~ ðs 4 Þ = 2
X
~ ðs 5 Þ = 2
X ~ ðs6 Þ = 1
X ~ ðs7 Þ = - 2
X ~ ðs8 Þ = - 2:
X
~ is given as
4. The range set of a discrete random variable X
RX~ = f- 1, 0, 2g:
~ = -1
A= X ~ =0
B= X ~ =2
C= X D = A [ B [ C:
Problems 107
5. A fair coin is flipped and a three-sided fair die is tossed at the same time. Let si
be a simple outcome of the combined experiment. The random variable X ~ for the
simple events of the sample space is defined as
~
(a) Write the range set of the random variable X.
~
(b) Calculate and draw the probability mass function of X.
~ is given as
6. The range set of a discrete random variable X
RX~ = f- 1, 0, 1, 3, 7g:
Write the x-intervals for which the cumulative distribution function F(x) is
calculated.
~ is given as
7. The probability mass function p(x) of a discrete random variable X
~ is
8. The graph of the probability mass function of a discrete random variable X
depicted in Fig. 3P.1.
(a) Write the range set of the random variable.
(b) Calculate and draw the cumulative distribution function F(x).
~
(c) Calculate the mean value, variance, and standard deviation of X:
x
2 1 0 1 2
3/8
1/ 4
x
1 0 1 2
~ is given as
10. The probability mass function p(x) of a discrete random variable X
~ is defined as Y~ = 2X
A function of X ~ þ 2.
~ equals 2.5.
11. The variance of the discrete random variable X
(a) Find the variance of Y~ = 2X:
~
~ ~
(b) Find the variance of Y = 2X þ 1:
Problems 109
2a
x
2 1 0 1 2
12. Write the distribution functions of geometric, binomial, and Poisson discrete
random variables.
13. A uniform discrete random variable is defined in the integer interval [-3, -2,
⋯, 4, 5]. Find the mean value and variance of this uniform random variable.
~ is depicted in
14. The probability mass function p(x) of a discrete random variable X
Fig. 3P.3. Without mathematically calculating the mean value of this random
variable, decide whether the mean value is a positive or negative number.
Chapter 4
Functions of Random Variables
p y ð yÞ = p ðxÞ:
fxjy = gðxÞg x
ð4:1Þ
4
x= -1
8
1
px ð xÞ = x=0
8
3
x = 1:
8
If Y~ = X
~ , determine the probability mass function of Y,
~ i.e., py( y) = ?
2
Solution 4.1: If Y~ = g X
~ , the relation between probability mass functions of X
~ and
~
Y is given as
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 111
O. Gazi, Introduction to Probability and Random Variables,
https://doi.org/10.1007/978-3-031-31816-0_4
112 4 Functions of Random Variables
7/8
1/8
y
0 1
py ð y Þ = p ð xÞ
fxjy = gðxÞg x
py ðyÞ = p ð xÞ
fxjy = x2 g x
x = - 1 → y = x2 → y = 1
x = 0 → y = x2 → y = 0
x = 1 → y = x2 → y = 1
4 3 7
py ðy = 1Þ = px ðx = - 1Þ þ px ðx = 1Þ → py ðy = 1Þ = þ → py ðy = 1Þ =
8 8 8
1
py ð y = 0 Þ = p x ð x = 0 Þ → p y ð y = 0 Þ = :
8
The graph of the probability mass function py( y) is shown in Fig. 4.1.
Example 4.2: For the discrete random variable X, ~ the probability mass function is
~ ~ 2
~ i.e., py( y), in terms of
px(x). If Y = X , determine the probability mass function of Y,
px(x).
~ and Y~ is given as
Solution 4.2: Since the relation between random variables X
Y~ = X
~2
we first consider the values x and y generated by these random variables and solve
the equation
4.2 Joint Probability Mass Function 113
y = x2
1
px ð x Þ = -1≤x≤2
4
0 otherwise:
If Y~ = X ~ i.e., py( y) = ?
~ , determine the probability mass function of Y,
Exercise: For the discrete random variable X,~ the probability mass function is px(x).
If Y~ = X ~ i.e., py( y), in terms of px(x).
~ , determine the probability mass function of Y,
3
For a given experiment, let S be the sample space of the experiment, and on this
sample space, let’s define two random variables X ~ and Y.
~ Consider the events
X~ = x and Y~ = y . The intersection of these events is
~ = x \ Y~ = y which means si j X
X ~ ðsi Þ = x \ si j Y~ ðsi Þ = y :
~ and Y~ is
The joint probability mass function for the discrete random variables X
defined as
~ = x \ Y~ = y
pðx, yÞ = Prob X ð4:2Þ
~ = x, Y~ = y
pðx, yÞ = Prob X ð4:3Þ
or as
114 4 Functions of Random Variables
~ = x and Y~ = y :
pðx, yÞ = Prob X ð4:4Þ
Example 4.3: For the two tosses of a coin experiment, the sample space is
~ and Y~ as
Let’s define the discrete random variables X
then we have,
~ ðHH Þ = 3
X ~ ðHT Þ = 1
X ~ ðTH Þ = 1
X ~ ðTT Þ = - 1
X
Y~ ðHH Þ = - 1 Y~ ðHT Þ = 1 Y~ ðTH Þ = 1 Y~ ðTT Þ = 3:
~ and Y~ can be
The joint probability mass function p(x, y) of the random variables X
calculated considering all possible values of (x, y) pairs as
~ = x, Y~ = y →
pðx, yÞ = Prob X
x = - 1, y = - 1 → pðx = - 1, y = - 1Þ = Prob X~ = - 1, Y~ = - 1 →
pðx = - 1, y = - 1Þ = ProbðfTT g \ fHH gÞ → pðx = - 1, y = - 1Þ = ProbðϕÞ →
pðx = - 1, y = - 1Þ = 0
x = - 1, y = 1 → pðx = - 1, y = 1Þ = Prob X~ = - 1, Y~ = 1 →
pðx = - 1, y = 1Þ = ProbðfTT g \ fHT, TH gÞ → pðx = - 1, y = 1Þ = ProbðϕÞ →
pðx = - 1, y = 1Þ = 0
~ = - 1, Y~ = 3 →
x = - 1, y = 3 → pðx = - 1, y = 3Þ = Prob X
pðx = - 1, y = 3Þ = ProbðfTT g \ fTT gÞ → pðx = - 1, y = 3Þ = ProbðfTT gÞ →
1
pðx = - 1, y = 3Þ =
4
~ = 1, Y~ = - 1 →
x = 1, y = - 1 → pðx = 1, y = - 1Þ = Prob X
pðx = 1, y = - 1Þ = ProbðfHT, TH g \ fHH gÞ → pðx = 1, y = - 1Þ = ProbðϕÞ →
pðx = 1, y = - 1Þ = 0
4.2 Joint Probability Mass Function 115
x=1,y=1→pðx=1,y=1Þ=Prob X ~ =1, Y~ =1 →
pðx=1,y=1Þ=ProbðfHT,TH g\ fHT,TH gÞ→pðx=1,y=1Þ=ProbðfHT,TH gÞ →
2
pðx=1,y=1Þ=
4
~ = 1, Y~ = 3 →
x = 1, y = 3 → pðx = 1, y = 3Þ = Prob X
pðx = 1, y = 3Þ = ProbðfHT, TH g \ fTT gÞ → pðx = 1, y = 3Þ = ProbðϕÞ →
pðx = 1, y = 3Þ = 0
x = 3, y = - 1 → pðx = 3, y = - 1Þ = Prob X ~ = 3, Y~ = - 1 →
pðx = 3, y = - 1Þ = ProbðfHH g \ fHH gÞ → pðx = 3, y = - 1Þ = ProbðfHH gÞ →
1
pðx = 3, y = - 1Þ =
4
~ = 3, Y~ = 1 →
x = 3, y = 1 → pðx = 3, y = 1Þ = Prob X
pðx = 3, y = 1Þ = ProbðfHH g \ fHT, TH gÞ → pðx = 3, y = 1Þ = ProbðϕÞ →
pðx = 3, y = 1Þ = 0
~ = 3, Y~ = 3 →
x = 3, y = 3 → pðx = 3, y = 3Þ = Prob X
pðx = 3, y = 3Þ = ProbðfHH g \ fTT gÞ → pðx = 3, y = 3Þ = ProbðϕÞ →
pðx = 3, y = 3Þ = 0
1
pðx= - 1, y = - 1Þ = 0 pðx= - 1, y = 1Þ = 0 pðx= - 1, y = 3Þ =
4
2
pðx= 1, y = - 1Þ = 0 pðx= 1, y = 1Þ = pðx= 1, y = 3Þ = 0
4
1
pðx= 3, y = - 1Þ = pðx= 3, y = 1Þ = 0 pðx= 3, y = 3Þ = 0:
4
S = fs1 , s2 , s3 , s4 g
~ ðs 1 Þ = 1
X ~ ðs2 Þ = 1
X ~ ðs 3 Þ = 2
X ~ ðs 4 Þ = 2
X
Y~ ðs1 Þ = - 1 Y~ ðs2 Þ = 0 Y~ ðs3 Þ = - 1 Y~ ðs4 Þ = 0:
~ =1
(c) Prob X
(d) Prob X = 1, Y~ = - 1
~
Solution 4.4:
~ =x
(a) Remembering that X ~ ðsi Þ = x , we can find X
means si j X ~ =1 and
Y~ = - 1 as
~ = 1 = fs1 , s2 g
X Y~ = - 1 = fs1 , s3 g
X = 1, Y = - 1 = X=1 \ Y = -1
= ff s 1 , s 2 g \ fs 1 , s 3 g g
= fs1 g:
Thus, we have
~ = 1, Y~ = - 1 = fs1 g
X
~ = 1 as
(c) Using the result of part-a, we can calculate Prob X
Prob X = 1 = Probfs1 , s2 g
2
= :
4
~ = 1, Y~ = - 1 as
(d) Using the result of part-c, we can calculate Prob X
Prob X = 1, Y = - 1 = Probfs1 g
1
= :
4
Theorem 4.1: The joint and marginal probability mass functions for the discrete
random variables X ~ and Y~ are denoted by p(x, y), px(x), and py( y) respectively. Show
that the marginal probability mass functions p(x) and p( y) can be obtained from the
joint probability mass function p(x, y) via
p x ð xÞ = y
pðx, yÞ py ð y Þ = x
pðx, yÞ: ð4:5Þ
4.2 Joint Probability Mass Function 117
px ð xÞ = y
pðx, yÞ:
For the simplicity of the proof, assume that the range set of the random variable
Y~ is given as
RY~ = fy1 , y2 , y3 g:
S = Y~ = y1 [ Y~ = y2 [ Y~ = y3 : ð4:6Þ
~ = x can be written as
The event X
~ =x = X
X ~ =x \ S
~ =x = X
X ~ =x \ Y~ = y1 [ Y~ = y2 [ Y~ = y3
leading to
~ =x = X
X ~ = x \ Y~ = y1 [ X
~ = x \ Y~ = y2 [ X
~ = x \ Y~ = y3 ð4:7Þ
~ = x = Prob
Prob X X ~ = x \ Y~ = y2 [ X
~ = x \ Y~ = y1 [ X ~ = x \ Y~ = y3
That is,
px ð x Þ = y
pðx, yÞ:
The proof of
py ð y Þ = x
pðx, yÞ
px ð x Þ = y
pðx, yÞ: ð4:9Þ
~ = xjY~ = y :
pðxjyÞ = Prob X ð4:10Þ
Example 4.5: Show that the joint probability mass function p(x, y) can be expressed
as
or as
~ = x, Y~ = y
pðx, yÞ = Prob X
~ = x \ Y~ = y
pðx, yÞ = Prob X
ProbðA \ BÞ = ProbðAjBÞProbðBÞ
we obtain
4.3 Conditional Probability Mass Function 119
~ = x \ Y~ = y →
pðx, yÞ = Prob X
~ = xjY~ = y Prob Y~ = y
pðx, yÞ = Prob X
which is equal to
x
pðxjyÞ = 1:
pðx, yÞ
pðxjyÞ =
py ðyÞ
into
x
pðxjyÞ
we get
pðx, yÞ
x py ð y Þ
1
pðx, yÞ
p y ð yÞ x
leading to
1 py ð yÞ
pðx, yÞ → → 1:
py ð y Þ x py ð y Þ
pðyÞ
Theorem 4.2: For the joint probability mass function p(x, y), we have
x,y
pðx, yÞ = 1: ð4:11Þ
120 4 Functions of Random Variables
a 2a
2
3a a
1
y
1 2
x,y
pðx, yÞ
can be written as
x y
pðx, yÞ
px ðxÞ
leading to
p ðxÞ → 1:
x x
Example 4.7: The joint probability mass function p(x, y) of two discrete random
variables X~ and Y~ is depicted in Fig. 4.2.
Determine the value of a, and find the marginal probability mass functions p(x)
and p( y).
Solution 4.7:
(a) Expanding
x,y
pðx, yÞ = 1
x,y
pðx, yÞ = 1 →
x y
pðx, yÞ = 1 →
x
pðx, 1Þ þ pðx, 2Þ = 1 →
(b) Using
px ð x Þ = y
pðx, yÞ py ðyÞ = x
pðx, yÞ
we get
4
px ðx= 1Þ = pðx= 1, y = 1Þ þ pðx= 1, y = 2Þ → px ðx= 1Þ = 4a → px ðx= 1Þ =
7
3
px ðx= 2Þ = pðx= 2, y = 1Þ þ pðx= 2, y = 2Þ → px ðx= 2Þ = 3a → px ðx= 2Þ =
7
and
4
py ðy = 1Þ = pðx = 1; y = 1Þ þ pðx = 2; y = 1Þ → py ðy = 1Þ = 4a → py ðy = 1Þ =
7
3
py ðy = 2Þ = pðx = 1; y = 2Þ þ pðx = 2; y = 2Þ → py ðy = 2Þ = 3a → py ðy = 2Þ =
7
For the sample space of an experiment, we can define any number of random
variables. In this case, we can define the joint probability mass functions for a
group of random variables. Assume that for the sample space S, we have four
defined random variables W, ~ X,
~ Y,
~ Z,
~ and for any group of random variables, we
~ Y,
can define a probability mass function; for example, for X, ~ Z~ we can define p(x, y,
z) as
~ = x, Y~ = y, Z~ = z
pðx, y, zÞ = Prob X ð4:12Þ
~ = x \ Y~ = y \ Z~ = z :
pðx, y, zÞ = Prob X ð4:13Þ
~ X,
For four random variables W, ~ Y,
~ Z,
~ we define p(w, x, y, z) as
~ = w, X
pðw, x, y, zÞ = Prob W ~ = x, Y~ = y, Z~ = z : ð4:14Þ
pðx, y, zÞ = w
pðw, x, y, zÞ
py ð yÞ = x,w,z
pðw, x, y, zÞ
pxy ðx, yÞ = z
pðx, y, zÞ
p x ð xÞ = y,z
pðx, y, zÞ
x,y,z
ð ⋯Þ = x y z
ð⋯Þ:
~ and Y~ as
Let’s define the function of two discrete random variables X
Z~ = g X,
~ Y~ :
~ Y~
E g X, = gðx, yÞpðx, yÞ ð4:15Þ
x,y
~ Y~
E g X, = gðx, yÞpðx, yÞ: ð4:16Þ
x y
~ and Y~ is defined as
The joint probability mass function p(x, y) for X
4.5 Functions of Two Random Variables 123
1 2 2 1
pð- 1, - 2Þ = pð- 1, 3Þ = pð1,- 2Þ = pð1, 3Þ = :
6 6 6 6
~ and Y~ is defined as
A function of two random variables X
g X, ~ Y:
~ Y~ = X ~
~ Y~ .
Calculate E g X,
Solution 4.8: Employing the formula
~ Y~
E g X, = gðx, yÞpðx, yÞ
x,y
~ Y~ = X
for g X, ~ Y,
~ we get
~ Y~ =
E X xypðx, yÞ
x y
~ Y~ =
E X x × ð- 2Þ × pðx,- 2Þ þ x × ð3Þ × pðx, 3Þ
x
leading to
E X~ Y~ = ð - 1Þ × ð - 2Þ × pð - 1; - 2Þ þ ð - 1Þ × ð3Þ × pð - 1; 3Þ
þ1 × ð - 2Þ × pð1; - 2Þ þ 1 × ð3Þ × pð1; 3Þ
in which substituting the values of the joint probability mass function, we get
1 2
E X~ Y~ = ð - 1Þ × ð - 2Þ × þ ð - 1Þ × ð3Þ ×
6 6
2 1
þ1 × ð - 2Þ × þ 1 × ð3Þ ×
6 6
resulting in
~ Y~ = - 7 :
E X
6
E aX ~ þ bE Y~ :
~ þ bY~ = aE X ð4:17Þ
124 4 Functions of Random Variables
~ Y~
E g X, = gðx, yÞpðx, yÞ
x y
~ Y~ = aX
for g X, ~ þ bY,
~ we obtain
~ Y~
E g X, = ðax þ byÞpðx, yÞ
x y
~ þ bY~ =
E aX axpðx, yÞ þ bypðx, yÞ
x y y x
leading to
~ þ bY~ = a
E aX x pðx, yÞ þ b y pðx, yÞ
x y y x
pðxÞ pðyÞ
resulting in
~ þ bY~ = a
E aX xpðxÞ þ b ypðyÞ
x y
~ þ bY~ = aE X
E aX ~ þ bE Y~ :
For a discrete experiment, let S be the sample space and A be any event. The
conditional probability mass function conditioned on the particular event A is
defined as
~ = xjA
pðxjAÞ = Prob X ð4:18Þ
which is equal to
4.6 Conditional Probability Mass Function 125
~ =x \ A
Prob X
pðxjAÞ = : ð4:19Þ
ProbðAÞ
ProbðAÞ = Prob ~ =x \ A :
X ð4:20Þ
x
~
Solution 4.10: Assume that the range set of the random variable X is
RX~ = fx1 , x2 , x3 g. Then, we have
~ = x1 [ X
S= X ~ = x2 [ X
~ = x3
~ = x1 , X
where X ~ = x2 , X
~ = x3 are disjoint events, and for the event A, we can
write
A=S \ A→A= X ~ = x2 [ X
~ = x1 [ X ~ = x3 \A
leading to
A= ~ = x1 \ A [
X ~ = x2 \ A [
X ~ = x3 \ A :
X ð4:21Þ
ProbðAÞ = Prob ~ =x \ A :
X ð4:23Þ
x
x
pðxjAÞ = 1: ð4:24Þ
~ =x \ A
Prob X
pðxjAÞ = ð4:25Þ
ProbðAÞ
1 ~ =x \ A
pðxjAÞ = Prob X
x ProbðAÞ x
1
pðxjAÞ = ProbðAÞ → pðxjAÞ = 1:
x ProbðAÞ x
Thus, we have
x
pðxjAÞ = 1:
S = f s1 , s 2 , s 3 , s 4 g
~ ðs 1 Þ = 3 X
X ~ ðs2 Þ = 1 X
~ ðs 3 Þ = 1 X
~ ðs4 Þ = - 1:
An event A is defined as
A = fs1 , s2 , s3 g:
RX~ = f- 1, 1, 3g
~ = - 1, X
and the events X ~ = 1, and X
~ = 3 can be written as
~ = - 1 = fs4 g
X ~ = 1 = fs 2 , s 3 g
X ~ = 3 = fs1 g:
X
~ =x \ A
Prob X
pðxjAÞ =
ProbðAÞ
~ = -1 \A
Prob X
pðx = - 1jAÞ = →
ProbðAÞ
Probðfs4 g \ fs1 , s2 , s3 gÞ ProbðϕÞ
pðx = - 1jAÞ = → pðx = - 1jAÞ = →
Probðfs1 , s2 , s3 gÞ Probðfs1 , s2 , s3 gÞ
pðx = - 1jAÞ = 0
and for x = 1 as
~ =1 \ A
Prob X Probðfs2 , s3 g \ fs1 , s2 , s3 gÞ
pðx= 1jAÞ = → pðx= - 1jAÞ =
ProbðAÞ Probðfs1 , s2 , s3 gÞ
Probðfs2 , s3 gÞ 2
→ pðx= - 1jAÞ = → pðx= 1jAÞ =
Probðfs1 , s2 , s3 gÞ 3
and for x = 3 as
~ =3 \ A
Prob X Probðfs1 g \ fs1 , s2 , s3 gÞ
pðx= 3jAÞ = → pðx= 3jAÞ =
ProbðAÞ Probðfs1 , s2 , s3 gÞ
Probðs1 Þ 1
→ pðx= 3jAÞ = → pðx= 3jAÞ = :
Probðfs1 , s2 , s3 gÞ 3
2 1
pðx= - 1jAÞ = 0 pðx= 1jAÞ = pðx= 3jAÞ = :
3 3
x
pðxjAÞ = 1:
Consider that we have two random variables X ~ and Y~ defined on the simple events
of the same sample space. If the event A is chosen as A = Y~ = y , then the
conditional probability mass function p(x| A) happens to be
~ = xjY~ = y
pðxjyÞ = Prob X ð4:26Þ
pðx, yÞ
pðxjyÞ = : ð4:27Þ
py ð y Þ
128 4 Functions of Random Variables
~ and Y~ be two random variables defined for the simple events of the same
Let X
sample space, and A be an event.
~ is defined as
The conditional expected value of X
~
E XjA = xpðxjAÞ ð4:28Þ
x
~ i.e., g X
and for a function of X, ~ ,E g X
~ jA is calculated using
~ jA =
E g X gðxÞpðxjAÞ: ð4:29Þ
x
~ Y~ = y =
E Xj xpðxjyÞ: ð4:30Þ
x
~ =
E X ~ Y~ = y :
pðyÞE Xj ð4:31Þ
y
pð x Þ = y
pðx, yÞ
pð x Þ = y
pðyÞpðxjyÞ: ð4:32Þ
x
xpðxÞ = x
x y
pðyÞpðxjyÞ
E ðX
~Þ
~ =
E X pð y Þ xpðxjyÞ
y x
E ðXj
~ Y~ = yÞ
~ =
E X ~ Y~ = y :
pðyÞE Xj
y
~ is a
Theorem 4.5: If A1, A2, ⋯, AN form a partition of a sample space S, and X
~
random variable, then the expected value of X can be calculated as
~ =
E X
N
~ i :
PðAi ÞE XjA ð4:33Þ
i=1
Proof 4.5: For the simplicity of the proof, assume that N = 3, i.e., there are three
disjoint events A1, A2, and A3 such that
S = A1 [ A2 [ A3 :
~ = x can be written as
Then, the event X
~ =x = X
X ~ =x \ S→ X
~ =x = X
~ = x \ fA1 [ A2 [ A3 g →
~ =x = X
X ~ = x \ A2 [ X
~ = x \ A1 [ X ~ = x \ A2
~ = x = Prob X
Prob X ~ = x \ A1 þ Prob X
~ = x \ A2 þ Prob X
~ = x \ A2
pð x Þ = Ai
pðx, Ai Þ
pð x Þ = Ai
pðxjAi ÞpðAi Þ: ð4:35Þ
x
xpðxÞ = Ai x
xpðxjAi ÞpðAi Þ ð4:36Þ
E ðX
~Þ E ðXjA
~ iÞ
resulting in
~ =
E X ~ i pðAi Þ:
E XjA ð4:37Þ
Ai
~ is given as
Exercise: Probability mass function of a discrete random variable X
1
x= -2
4
1
pð x Þ = x=1
2
1
x = 3:
4
(a) E X~ =?
(b) Var X~ =?
(c) Find and draw the cumulative distribution function F(x).
~ =X
(d) g X ~ 2 - 1, E g X
~ = ? Var g X ~ =?
(e) Prob - 2 ≤ X~ ≤2 =?
Exercise: Sample space of an experiment is given as S = {s1, s2, s3, s4, s5}. The
~ is defined as
random variable X
X ~ ðs 2 Þ = 1 X
~ ðs1 Þ = - 1 X ~ ðs3 Þ = - 1 X
~ ðs 4 Þ = 1 X
~ ðs5 Þ = 2:
A = fs1 , s2 , s5 g
X ~ ðs 2 Þ = 1 X
~ ð s1 Þ = - 1 X ~ ðs 3 Þ = - 1 X
~ ðs 4 Þ = 1 X
~ ðs5 Þ = - 1
Two discrete random variables X ~ and Y~ are independent of each other if their joint
probability mass function p(x, y) satisfies
The random variable X ~ is independent of the event A if the joint probability mass
function p(x, A) satisfies
where
~ =x \ A
pðx, AÞ = Prob X ~ =x
pðxÞ = Prob X ð4:43Þ
~ = x and A :
pðx, AÞ = Prob X ð4:44Þ
implies that
132 4 Functions of Random Variables
Example 4.12: If p(x, y| A) = p(x| A)p(y| A), show that p(x| y, A) = p(x| A).
Solution: The conditional expression p(x| y, A) can be written as
pðx, y, AÞ
pðxjy, AÞ =
pðy, AÞ
where using
we get
,
pð y A Þ
ProbðAÞ
leading to
pðxjy, AÞ = pðxjAÞ:
~ Y~ = E X
E X ~ E Y~ : ð4:47Þ
~ Y~
E g X, = gðx, yÞpðx, yÞ
x,y
~ Y~ = X
for g X, ~ Y,
~ we get
~ Y~ =
E X xy pðx, yÞ
x,y
~ Y~ =
E X xy pðxÞpðyÞ
x,y
~ Y~ =
E X xpðxÞ ypðyÞ
x y
leading to
~ Y~ = E X
E X ~ E Y~ :
x2 pðx, yÞ = ~2
x2 pð x Þ = E X ð4:48Þ
x,y x
x,y
x2 pðx, yÞ
can be written as
x y
x2 pðx, yÞ
x
x2 y
pðx, yÞ
= pðxÞ
leading to
x
x2 pð x Þ
~2 :
E X
134 4 Functions of Random Variables
~ Y~
E g X, = gðx, yÞpðx, yÞ
x,y
show that
~ 2 þ Y~ 2 þ 2X
E X ~ Y~ = E X
~ 2 þ E Y~ 2 þ E 2X
~ Y~ : ð4:49Þ
~ Y~
E g X, = gðx, yÞpðx, yÞ
x,y
for
~ Y~ = X
g X, ~ 2 þ Y~ 2 þ 2X
~ Y~
we obtain
2 2
E X þ Y þ 2XY = x2 þ y2 þ 2xy pðx, yÞ
x, y
= x pðx, yÞ þ
2
y2 pðx, yÞ þ 2 xypðx, yÞ
x, y x, y x, y
2 2
E X E Y E XY
2 2
=E X þE Y þ 2E XY :
Var Z~ = Var X
~ þ Var Y~ : ð4:50Þ
Proof 4.7: If Z~ = X
~ þ Y,
~ then using
~ Y~
E g X, = gðx, yÞpðx, yÞ
x,y
E Z~ = E X
~ þ E Y~ :
4.8 Independence of Random Variables 135
mz = mx þ my :
Var Z~ = E Z~ - m2z
2
in which substituting Z~ = X
~ þ Y~ and mz = mx + my, we get
Var Z~ = E ~ þ Y~ 2 2
X - mx þ my
Var Z~ = E X
~ 2 þ Y~ 2 þ 2X
~ Y~ - m2x - m2y - 2mx my
leading to
Var Z~ = E X
~ 2 þ E Y~ 2 þ E 2X
~ Y~ - m2x - m2y - 2mx my
Var Z~ = E X
~ 2 - m2x þ E Y~ 2 - m2y þ 2E X
~ E Y~ - 2mx my
mx my
Var ðX
~Þ Var ðY~ Þ
leading to
Var Z~ = Var X
~ þ Var Y~ :
Theorem 4.8: If X~ and Y~ are independent random variables, then the functions of
~ , h Y~ are independent of each other, i.e.,
these random variables g X
~ h Y~
E g X ~ E h Y~ :
=E g X ð4:51Þ
136 4 Functions of Random Variables
The random variables X ~ , Y~ , and Z~ are independent of each other, if joint probability
mass function p(x, y, z) satisfies
~ Y,
If the random variables X, ~ and Z~ are independent of each other, then their
functions are also independent of each other; for instance,
~ Z~
g X,
is independent of
h Y~ :
Problems
~ is given as
1. The probability mass function p(x) of a discrete random variable X
1
x= -1
4
2
pð x Þ = x=1
4
1
x = 2:
4
a 4a
2
4a a
1
y
1 2
S = fs1 , s2 , s3 g
~ ðs 1 Þ = - 1
X ~ ðs 2 Þ = 1
X ~ ðs3 Þ = - 1
X
Y~ ðs1 Þ = - 1 Y~ ðs2 Þ = 1 Y~ ðs3 Þ = 1:
4. The joint probability mass function p(x, y) of two discrete random variables X~
~
and Y is depicted in Fig. 4P.1.
Determine the value of a, find the marginal probability mass functions px(x)
and py( y), and find also the conditional probability mass functions p(x| y) and
p(y| x).
5. The range sets of the discrete random variables X~ and Y~ are given as
~ and Y~ is defined as
The joint probability mass function p(x, y) for X
138 4 Functions of Random Variables
2 3 3 2
pð- 1, 1Þ = pð- 1, 2Þ = pð1, 1Þ = pð1, 2Þ = :
8 8 8 8
~ and Y~ is defined as
A function of two random variables X
Z~ = g X, ~ Y~ 2 þ Y~ 3 :
~ Y~ = X
~
(a) Determine the range set of Z.
~ i.e., pz(z), and draw its graph.
(b) Determine the probability mass function of Z,
~ Y~ , i.e., calculate E Z~ .
(c) Calculate E g X,
~ and Y,
6. For two discrete random variables X ~ we have
~ = 2:5 E Y~ = 4:
E X
If Z~ = 2X
~ þ 3Y,
~ calculate E Z~ .
7. The sample space of an experiment is given as
S = fs1 , s2 , s3 , s4 , s5 , s6 , s7 , s8 g
X ~ ðs2 Þ = - 1
~ ðs 1 Þ = 1 X ~ ðs 3 Þ = 1 X
X ~ ðs4 Þ = 2
~ ðs5 Þ = 1
X ~ ðs6 Þ = 2
X ~ ðs 7 Þ = - 1 X
X ~ ðs8 Þ = 2:
(a) Find the conditional probability mass functions p(x| A), p(x| B), and p(x| C).
Determine the result of
~ , E XjB
(b) Calculate E XjA ~ , E XjC
~ , and E X
~ 2 þ 1jA .
(c) If the events A, B, and C are defined as
Find the conditional probability mass functions p(x| A, B) and p(x| B, C).
Problems 139
S = fs1 , s2 , s3 g
~ ðs 1 Þ = - 1 X
X ~ ðs 2 Þ = - 1 X
~ ðs3 Þ = 1 X
~ ðs4 Þ = 1
The event A is defined as A = {s1, s2}. Show that the random variables X ~ and
~
Y are conditionally independent of each other given the event A.
9. Write the criteria for the independence of four random variables from each other.
10. The variance of discrete random variable X ~ is 4. Find the variance of Y~ = 2X:
~
Chapter 5
Continuous Random Variables
The random variable functions, which are used for experiments having sample
spaces including an uncountable number of simple outcomes, are called continuous
random variables. The probability density function f(x) of a continuous random
~ satisfies
variable X
b
~ ≤b =
Prob a ≤ X f ðxÞdx: ð5:1Þ
a
~ we have
Note that for discrete random variable X,
b
Prob a ≤ X ≤ b = pðxÞ: ð5:2Þ
x=a
RX~ = ½x1 x2 :
For continuous random variables, we do not consider a single value of the random
variable; instead, we consider intervals on which the random variable can have a
value.
The probability that the continuous random variable X ~ takes a value on the
interval I ⊂ RX~ is calculated as
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 141
O. Gazi, Introduction to Probability and Random Variables,
https://doi.org/10.1007/978-3-031-31816-0_5
142 5 Continuous Random Variables
x1 x2 x
I
x1 x
a b x2
~ =x 2 I =
Prob X f ðxÞdx ð5:3Þ
I
~2I =
Prob X f ðxÞdx: ð5:4Þ
I
~ 2 I is calculated as
If the interval I equals [a b], i.e., I = [a b], then Prob X
b
~2I =
Prob X f ðxÞdx: ð5:5Þ
a
~ ≤ x2 = Prob x1 < X
Prob x1 ≤ X ~ ≤ x2 = Prob x1 ≤ X
~ < x2 = Prob x1 < X
~ < x2
5.2 Continuous Uniform Random Variable 143
f ( x1 )
Area | G f ( x1 )
x1 x1 G x
3. The total area under the probability density function equals 1, i.e.,
1
~ ≤1 =
Prob - 1 ≤ X f ðxÞdx = 1 ð5:7Þ
-1
I = ½x1 x1 þ δ
~ =x 2 I
Prob X
can be approximated as
x1 þδ
Prob X ~ ≤ x1 þ δ =
~ = x 2 I = Prob x1 ≤ X f ðxÞdx ≈ δf ðx1 Þ ð5:8Þ
x1
1 ~ ≤x þδ :
f ðxÞ = lim Prob x ≤ X ð5:9Þ
δ→0 δ
~ is
The probability density function of a continuous uniform random variable X
defined on the interval RX~ = ½a b, 0 < a < b as
144 5 Continuous Random Variables
x
a b
1
f ðxÞ = if a ≤ x ≤ b ð5:10Þ
b-a
0 otherwise
~
Example 5.1: The probability density function of a continuous random variable X
is given as
K
0≤x≤1
f ðxÞ = x1=3
0 otherwise:
1
K
dx = 1
0 x1=3
1
K 3 2 1 3 2
dx = 1 → K x3 = 1 → K = 1 → K = :
0 x1=3 2 0 2 3
5.3 Expectation and Variance for Continuous Random Variables 145
1
~ =
E X xf ðxÞdx: ð5:12Þ
-1
~ =E X
Var X ~ -m 2
ð5:13Þ
~ =E X
Var X ~ 2 - m2 ð5:15Þ
~ 2 is computed as
where E X
1
~2 =
E X x2 f ðxÞdx: ð5:16Þ
-1
~
Example 5.2: The probability density function of a continuous random variable X
is shown in Fig. 5.5. Calculate the mean value and variance of the random
~
variable X.
x
2 6
146 5 Continuous Random Variables
for the probability density function depicted in Fig. 5.5, we calculate the mean value
of the random variable as
2 6
x x 3
m = E X~ = dx þ x - þ dx
0 4 2 16 8
≈ 2:3:
2
For the variance calculation, we first evaluate E X using
1
2
E X = x2 f ðxÞdx
-1
as
2 2 6
x x 3
E X~ 2 = dx þ x2 - þ dx
0 4 2 16 8
≈ 6:7
2
Var X ≈ E X - m2
as
1
my = E g X = gðxÞf ðxÞdx: ð5:17Þ
-1
1
Var g X = ½gðxÞ2 f ðxÞdx - m2y : ð5:18Þ
-1
Var X ≥ 0: ð5:19Þ
1
f ðxÞ = if a ≤ x ≤ b ð5:20Þ
b-a
0 otherwise:
1 b
1 1 b2 - a2
E X = xf ðxÞdx → E X = xdx → E X =
-1 b-a a b-a 2
resulting in
aþb
m=E X = : ð5:21Þ
2
2
Var X = E X - m2
148 5 Continuous Random Variables
2
where E X is computed as
1 b
2 2 1 2 1 b 3 - a3
E X = x2 f ðxÞdx → E X = x2 dx → E X =
-1 b-a a b-a 3
leading to
2 b2 þ ab þ a2
E X = :
3
2
2 a2 þ ab þ b2 aþb
Var X = E X - m2 → Var X = -
3 2
resulting in
ðb - aÞ2
Var X = : ð5:22Þ
12
1 ðx - mÞ2
f ðxÞ = p e - 2σ2 ð5:23Þ
σ 2π
X N m, σ 2 ð5:24Þ
to indicate the Normal random variable with mean m, and variance σ 2. For the
normal random variable X, we have
5.5 Gaussian or Normal Random Variable 149
E X =m
leading to
1 ðx - mÞ2
1
m= p xe - 2σ 2 dx
σ 2π -1
and
Var X = σ 2
leading to
1 ðx - mÞ2
1
σ2 = p ðx - mÞ2 e - 2σ 2 dx: ð5:25Þ
σ 2π -1
The Gaussian random variable X with zero mean and unity variance is called
standard normal random variable, and it is indicated as
X N ð0, 1Þ:
Y N m, σ 2
Y = m þ σX
Y -m
X= : ð5:26Þ
σ
x
0.7
For X N ð0, 1Þ, the probability density function f(x) has the form
1 x2
f ðxÞ = p e - 2 ð5:28Þ
2π
x
1 t2
F ðxÞ = p e - 2 dt: ð5:29Þ
2π
-1
f (x)
N (0,1)
N (0,2)
N (0,4)
Fig. 5.7 Normal distributions with mean value m = 0, and variances σ 2 = 1, σ 2 = 2, and σ 2 = 4
f (x)
N (0,1) N (4,1)
x
5 0 9
4
Solution 5.8: If we add a constant to a random variable, the new random variable
owns the same variance as the added one. Just the mean value of for new random
variable is shifted by the added amount. Thus, the random variable Y has the
distribution N(1 + 2, 4) → N(3, 4).
152 5 Continuous Random Variables
f (x)
N (4,1)
N (4,2)
N (4,4)
x
0 4
Fig. 5.9 Normal distributions with mean value m = 4, and variances σ 2 = 1, σ 2 = 2, and σ 2 = 4
0
1 x2
p e - 2 dx:
2π -1
Solution 5.9: The total area under the standard normal distribution equals 1, i.e.,
1 1
1 x2
f ðxÞdx = 1 → p e - 2 dx = 1:
-1 2π -1
The integral expression given in the question corresponds to half of the area under
the Gaussian curve; for this reason, we have
0
1 x2 1
p e - 2 dx = :
2π -1 2
λe - λx if x ≥ 0
f ðxÞ = ð5:30Þ
0 otherwise
5.6 Exponential Random Variable 153
udv = uv - vdu
we get
1
1
m = E X~ = - xe - λx 0
þ e - λx dx
0
1
e - λx
=0- λ 0
1
= :
λ
2
For the variance calculation, let’s first calculate E X as follows:
1 1
2 2
E X = x2 f ðxÞdx → E X = x2 λe - λx dx
-1 -1
udv = uv - vdu
we get
1
2 1
E X = - x2 e - λx 0
þ 2xe - λx dx
0
1
2 - λx
=0 þ xλe dx
λ 0
E X
2
= 2:
λ
154 5 Continuous Random Variables
2
Var X = E X - m2
leading to
2 1 1
Var X = - → Var X = 2 :
λ2 λ2 λ
F ð xÞ = pð x i Þ ð5:32Þ
xi ≤ x
and
x
F ð xÞ = f ðt Þdt ð5:33Þ
-1
respectively.
F ð- 1Þ = 0 F ð1Þ = 1: ð5:36Þ
3. If the random variable X is a discrete one, then F(x) has a piecewise constant and
staircase shape.
4. If the random variable X is a continuous one, then F(x) has continuous form.
5. For continuous random variable X, the relation between probability density
function f(x) and cumulative distribution function F(x) can be stated as
x
dF ðxÞ
F ð xÞ = f ðxÞdx f ð xÞ = : ð5:37Þ
-1 dx
6. For discrete random variable X with range set R = fx1 , x2 , ⋯, xN g, the relation
X
between probability mass function p(x) and cumulative distribution function F(x)
can be stated as
x
1 3
156 5 Continuous Random Variables
c 2
þ 2c = 1 → c = :
2 5
To draw the cumulative distribution function F(x), let’s first consider the x-
intevals on which F(x) is determined. While determining the x-intevals, we pay
attention to the graph of the f(x), and consider the points at which function changes.
Following this idea, we can determine the x-intevals as
0≤x<1
1 ≤ x ≤ 3:
In the next step, on each interval we calculate the cumulative distribution function
F(x) employing
x
F ð xÞ = f ðt Þdt:
-1
2
f ðxÞ = x 0 ≤ x < 1
5
2
f ðxÞ = 1≤x≤3
5
x 1 x
2 2 1 2
F ð xÞ = f ðt Þdt → F ðxÞ = tdt þ dt → F ðxÞ = þ ðx- 1Þ:
-1 0 5 1 5 5 5
1/ 5
x
0 1 3
0 -1<x<0
x2
0≤x<1
5
F ð xÞ = 1 2
þ ð x - 1Þ 1≤x≤3
5 5
1 3 ≤ x < 1:
1 if x = 0
δðxÞ = ð5:40Þ
0 otherwise
which satisfies
1
δðxÞdx = 1: ð5:41Þ
-1
The shifting operation does not alter the integration property, i.e.,
1
δðx- x0 Þdx = 1: ð5:42Þ
-1
x
0 x0
1
1/ 2
x
0
1 if x > 0
1
uð x Þ = if x = 0 ð5:43Þ
2
0 otherwise:
or as
1 if x ≥ 0
uð x Þ = ð5:44Þ
0 otherwise:
duðxÞ x
δðxÞ = → uðxÞ = δðt Þdt: ð5:45Þ
dx -1
Some functions can be expressed as the sum of the shifted impulses or unit steps.
For instance, the function shown in Fig. 5.14 can be expressed in terms of shifted
unit functions as
1
x
0 1 2
0 .4
x
1 4
dgðxÞ
= δðx- 1Þ þ 2δðx- 2Þ
dx
dF ðxÞ
f ðxÞ = :
dx
For the given F(x), the calculation of f(x) is depicted in Fig. 5.17.
160 5 Continuous Random Variables
x
1 4
0 .6
0 .4
x
1 4
x
1 0 1 2 3
-1<x< -1
-1≤x<1
1≤x<2
2≤x<3
3 ≤ x < 1:
x
F ð xÞ = f ðt Þdt
-1
Hence, using the calculated values, we can write the cumulative distribution
function as
0 -1<x< -1
1
-1≤x<1
4
F ð xÞ = 3
1≤x<2
4
x 1
þ 2≤x<3
4 4
whose graph is depicted in Fig. 5.19 with the graph of probability density function.
Exercise: Draw the cumulative distribution function of a random variable whose
probability density function is depicted in Fig. 5.20.
162 5 Continuous Random Variables
1 0 1 2 3
F (x )
1
3/ 4
1/ 4
1 0 1 2 3
0.3
0.4
x
1 2 4
For continuous experiments, sample spaces and events are defined as intervals. We
can indicate the sample spaces and events using random variables. For instance, for a
discrete random variable, let
R = f- 1, 2, 5g
X
be the range set of the random variable. Then, the sample space of the random
variable can be indicated as
S= -1≤X≤5
A= X = -1
or by an interval as
B= -1≤X<3 :
For continuous random variable, the range set of the random variable is a real
number interval. For instance,
R = ½- 20 60:
X
And similar to the discrete random variables, we can use the continuous random
variable to characterize the sample space of the continuous experiment, and an event
is defined for the given sample space. For instance, using R , we can indicate the
X
sample space of the continuous experiment as
S = - 20 ≤ X ≤ 60
A = - 10 ≤ X < 20
x≤X ≤x þ δ
A \ B = a ≤ X~ ≤ b \ x ≤ X~ ≤ x þ δ
x ≤ X~ ≤ x þ δ if x 2 ½a b ð5:46Þ
→A \ B=
0 otherwise
That is,
B if x 2 A
A \ B= ð5:47Þ
0 otherwise:
164 5 Continuous Random Variables
1
f ðxÞ = lim Prob x ≤ X ≤ x þ δ : ð5:48Þ
δ→0 δ
1
f ðxjAÞ = lim Prob x ≤ X ≤ x þ δjA ð5:49Þ
δ→0 δ
lim Prob x ≤ X~ ≤ x þ δ =δ
δ→0
f ðxjAÞ = if x 2 ½a b ð5:51Þ
ProbðAÞ
0 otherwise:
f ðxÞ
if x 2 A
f ðxjAÞ = ProbðAÞ ð5:52Þ
0 otherwise
where A = [a b].
Example 5.14: Probability density function, i.e., f(x), of a continuous random
variable is depicted in Fig. 5.21.
The events A, B, and C are defined as
0 1 2 3
5.10 Conditional Probability Density Function 165
Solution 5.14: The events given in the question can be written as intervals, i.e.,
A = ½ 0 1 B = ½1 2 C = ½2 3:
1 1
1 1
ProbðAÞ = f ðxÞdx → ProbðAÞ = dx → ProbðAÞ =
0 0 3 3
3 3
1 1
ProbðBÞ = f ðxÞdx → ProbðBÞ = dx → ProbðBÞ =
2 2 3 3
3 4
1 1
ProbðCÞ = f ðxÞdx → ProbðCÞ = dx → ProbðC Þ =
3 3 3 3
f ðxÞ
if x 2 A
f ðxjAÞ = ProbðAÞ
0 otherwise
f ðxÞ
if x 2 ½0 1 3f ðxÞ if x 2 ½0 1
f ðxjAÞ = ProbðAÞ → f ðxjAÞ =
0 otherwise
0 otherwise
f ðxÞ
if x 2 ½1 2 3f ðxÞ if x 2 ½1 2
f ðxjBÞ = ProbðBÞ → f ðxjBÞ =
0 otherwise
0 otherwise
f ðxÞ
if x 2 ½0 1 3f ðxÞ if x 2 ½2 3
f ðxjAÞ = ProbðC Þ → f ðxjV Þ =
0 otherwise
0 otherwise
The graphs of f(x| A), f(x| B), and f(x| C) are depicted in Fig. 5.15.
166 5 Continuous Random Variables
f ( x | A) f (x | B) f (x | C )
1 1 1
x x x
0 1 0 1 2 0 1 2 3
Fig. 5.22 The graphs of the conditional probability density functions f(x| A), f(x| B), and f(x| C)
x
0 1 2
It is clear from Figs. 5.21 and 5.22 that the probability density function f(x) can be
written in terms of the conditional probability functions and event probabilities as
The conditional expected value for the continuous random variable X conditioned on
event A is defined as
1
E XjA = xf ðxjAÞdx ð5:53Þ
-1
and for a function of random variable X, i.e., g X , the conditional expected value is
calculated as
1
E g X jA = gðxÞf ðxjAÞdx ð5:54Þ
-1
x
0 1
1
1
ProbðAÞ = f ðxÞdx → ProbðAÞ = :
0 2
For the event A = [0 1], the conditional probability can be evaluated employing
f ðxÞ
if x 2 A
f ðxjAÞ = ProbðAÞ
0 otherwise
as
f ðxÞ
if x 2 ½0 1 2f ðxÞ if x 2 ½0 1
f ðxjAÞ = 1=2 → f ðxjAÞ =
0 otherwise
0 otherwise
E X~ jA = xf ðxjAÞdx
-1
1
=2 x2 dx
0
2
= :
3
168 5 Continuous Random Variables
2
Similarly, we can evaluate E X jA as
1
E X~ 2 jA = x2 f ðxjAÞdx
-1
1
=2 x3 dx
0
2
= :
4
Theorem 5.1: Let A1, A2, ⋯, AN be the disjoint events, i.e., disjoint intervals, with
P(Ai) ≥ 0, such that
S = A1 [ A2 [ ⋯ [ AN ð5:55Þ
then we have
N
f ðxÞ = ProbðAi Þf ðxjAi Þ: ð5:56Þ
i=1
Proof 5.1: For illustration, let's define the sample space S and the events A, B, and C as

S = A ∪ B ∪ C.

For any event D, we can write

D = D ∩ S → D = D ∩ (A ∪ B ∪ C) → D = (D ∩ A) ∪ (D ∩ B) ∪ (D ∩ C)

leading to

Prob(D) = Prob(A)Prob(D|A) + Prob(B)Prob(D|B) + Prob(C)Prob(D|C).

Choosing D = {x ≤ X ≤ x + δ}, multiplying both sides by 1/δ, and taking the limit as δ → 0, we obtain

f(x) = Σ_{i=1}^{N} Prob(A_i) f(x|A_i).    (5.58)
Multiplying both sides of (5.58) by x, we get

x f(x) = x Σ_{i=1}^{N} Prob(A_i) f(x|A_i) → x f(x) = Σ_{i=1}^{N} Prob(A_i) x f(x|A_i)    (5.61)

and integrating both sides, we obtain

∫_{-∞}^{∞} x f(x) dx = Σ_{i=1}^{N} Prob(A_i) ∫_{-∞}^{∞} x f(x|A_i) dx

leading to

E[X] = Σ_{i=1}^{N} Prob(A_i) E[X|A_i].
Example 5.16: For the triangular probability density function depicted in the accompanying figure, i.e., f(x) = x for 0 ≤ x < 1 and f(x) = -x + 2 for 1 ≤ x ≤ 2, verify that

f(x) = Σ_{i=1}^{N} Prob(A_i) f(x|A_i)

and

E[X] = Σ_i Prob(A_i) E[X|A_i]

for the events A = [0 1) and B = [1 2] satisfying

S = A ∪ B and A ∩ B = ∅.
Solution 5.16: The probabilities of the events are

Prob(A) = ∫_0^1 f(x) dx → Prob(A) = 1/2

Prob(B) = ∫_1^2 f(x) dx → Prob(B) = 1/2.
The conditional probability density functions can be evaluated using

f(x|A) = f(x)/Prob(A) if x ∈ A     f(x|B) = f(x)/Prob(B) if x ∈ B
         0            otherwise              0            otherwise

as
f(x|A) = f(x)/(1/2) for x ∈ [0 1) → f(x|A) = 2f(x) for x ∈ [0 1), 0 otherwise

f(x|B) = f(x)/(1/2) for x ∈ [1 2] → f(x|B) = 2f(x) for x ∈ [1 2], 0 otherwise

leading to

f(x|A) = 2x for x ∈ [0 1), 0 otherwise     f(x|B) = -2x + 4 for x ∈ [1 2], 0 otherwise.
The conditional expectations can then be calculated as
E[X|A] = ∫_{-∞}^{∞} x f(x|A) dx → E[X|A] = 2∫_0^1 x² dx → E[X|A] = 2/3

E[X|B] = ∫_{-∞}^{∞} x f(x|B) dx → E[X|B] = ∫_1^2 (-2x² + 4x) dx → E[X|B] = 4/3.
On the other hand, the mean value of the random variable X can be calculated using the formula

E[X] = ∫_{-∞}^{∞} x f(x) dx

as

E[X] = ∫_0^1 x f(x) dx + ∫_1^2 x f(x) dx →

E[X] = ∫_0^1 x² dx + ∫_1^2 (-x² + 2x) dx → E[X] = 1/3 - 7/3 + 3 → E[X] = 1.
Using the probability density function graphs in Figs. 5.25 and 5.26, we can show
that
f(x) = (1/2) f(x|A) + (1/2) f(x|B) → f(x) = f(x|A)Prob(A) + f(x|B)Prob(B).
Using

Prob(A) = 1/2   Prob(B) = 1/2

and

E[X] = 1   E[X|A] = 2/3   E[X|B] = 4/3

we can also verify that

E[X] = E[X|A]Prob(A) + E[X|B]Prob(B) → 1 = (2/3)×(1/2) + (4/3)×(1/2) → 1 = 1 ✓
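The identity just verified by hand can also be checked numerically. The short Python sketch below (an added illustration) reproduces Prob(A) = Prob(B) = 1/2, E[X|A] = 2/3, E[X|B] = 4/3, and E[X] = 1 for the triangular density of this example:

from scipy.integrate import quad

# Triangular density: f(x) = x on [0, 1), f(x) = -x + 2 on [1, 2]
f = lambda x: x if 0 <= x < 1 else (-x + 2 if 1 <= x <= 2 else 0.0)

pA = quad(f, 0, 1)[0]                            # Prob(A) = 1/2
pB = quad(f, 1, 2)[0]                            # Prob(B) = 1/2
E_XA = quad(lambda x: x * f(x) / pA, 0, 1)[0]    # E[X|A] = 2/3
E_XB = quad(lambda x: x * f(x) / pB, 1, 2)[0]    # E[X|B] = 4/3
E_X = quad(lambda x: x * f(x), 0, 1)[0] + quad(lambda x: x * f(x), 1, 2)[0]

print(E_X, pA * E_XA + pB * E_XB)                # both ~1.0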
5.12 Conditional Variance

The conditional variance of X given the event A is defined as

Var(X|A) = E[X²|A] - m²_{x|A}    (5.62)

where

E[X²|A] = ∫_{-∞}^{∞} x² f(x|A) dx   m_{x|A} = E[X|A] = ∫_{-∞}^{∞} x f(x|A) dx.    (5.63)
Example 5.17: For a continuous random variable X, the probability density function is depicted in Fig. 5.27, i.e., f(x) = x/4 for 0 ≤ x < 2 and f(x) = 1/2 for 2 ≤ x ≤ 3. The events A and B are given as A = {0 ≤ X < 2}, B = {2 ≤ X ≤ 3}. Find the mean value and variance of X, the probabilities and conditional probability density functions of the events, the conditional expectations and conditional variances, and verify that

E[X] = Σ_{i=1}^{N} Prob(A_i) E[X|A_i].
Solution 5.17: (a) The mean value of X can be calculated as

E[X] = ∫_{-∞}^{∞} x f(x) dx → E[X] = ∫_0^2 x (x/4) dx + ∫_2^3 x (1/2) dx → E[X] = 8/12 + 5/4 → E[X] = 23/12 = m.

The variance is

Var(X) = E[X²] - m²

where

E[X²] = ∫_{-∞}^{∞} x² f(x) dx → E[X²] = ∫_0^2 x² (x/4) dx + ∫_2^3 x² (1/2) dx → E[X²] = 1 + 19/6 = 25/6

so that Var(X) = 25/6 - (23/12)² = 71/144.
(b) The probability of the event

A = {0 ≤ X < 2}

can be calculated as

Prob(A) = ∫_0^2 f(x) dx → Prob(A) = ∫_0^2 (x/4) dx = 1/2.
Using

f(x|A) = f(x)/Prob(A) if x ∈ [a b]
         0            otherwise

we obtain

f(x|A) = 2f(x) if x ∈ [0 2]
         0     otherwise.
(c) The probability of the event

B = {2 ≤ X ≤ 3}

can be calculated as

Prob(B) = ∫_2^3 f(x) dx → Prob(B) = ∫_2^3 (1/2) dx = 1/2.
Using

f(x|B) = f(x)/Prob(B) if x ∈ [a b]
         0            otherwise

we obtain

f(x|B) = 2f(x) if x ∈ [2 3]
         0     otherwise.
(d) The conditional expectation conditioned on the event A, i.e., E[X|A], can be calculated using the conditional probability density function f(x|A) = x/2, 0 ≤ x < 2, as

E[X|A] = ∫_{-∞}^{∞} x f(x|A) dx → E[X|A] = ∫_0^2 x (x/2) dx → m_{x|A} = E[X|A] = (1/2)∫_0^2 x² dx → m_{x|A} = 8/6.
(e) The conditional variance conditioned on the event A, i.e., Var(X|A), can be calculated as

Var(X|A) = E[X²|A] - m²_{x|A}

where E[X²|A] is evaluated as

E[X²|A] = ∫_{-∞}^{∞} x² f(x|A) dx → E[X²|A] = ∫_0^2 x² (x/2) dx → E[X²|A] = ∫_0^2 (x³/2) dx = 2

leading to Var(X|A) = 2 - (8/6)² = 2/9.
(f) The conditional expectation conditioned on the event B, i.e., E[X|B], can be calculated using the conditional probability density function f(x|B) = 1, 2 ≤ x ≤ 3, as

E[X|B] = ∫_{-∞}^{∞} x f(x|B) dx → E[X|B] = ∫_2^3 x dx → m_{x|B} = E[X|B] = 5/2.
(g) The conditional variance conditioned on the event B, i.e., Var(X|B), can be calculated as

Var(X|B) = E[X²|B] - m²_{x|B}

where E[X²|B] is evaluated as

E[X²|B] = ∫_{-∞}^{∞} x² f(x|B) dx → E[X²|B] = ∫_2^3 x² dx → E[X²|B] = 19/3.

Then,

Var(X|B) = 19/3 - 25/4 → Var(X|B) = 1/12.
Finally, let's verify that

E[X] = Σ_{i=1}^{N} Prob(A_i) E[X|A_i].

We found that

E[X|A] = 8/6   E[X|B] = 5/2

Prob(A) = Prob(B) = 1/2

E[X] = 23/12.

Expanding

E[X] = Σ_{i=1}^{N} Prob(A_i) E[X|A_i]

for N = 2, we get

E[X] = Prob(A)E[X|A] + Prob(B)E[X|B]

and substituting the values, we obtain

E[X] = 8/12 + 5/4 → E[X] = 23/12 ✓
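As a cross-check of Example 5.17, the sketch below (an added illustration; the piecewise density is the one read off Fig. 5.27) recomputes the conditional moments numerically:

from scipy.integrate import quad

f = lambda x: x / 4 if 0 <= x < 2 else (0.5 if 2 <= x <= 3 else 0.0)

pA = quad(f, 0, 2)[0]                                   # Prob(A) = 1/2
pB = quad(f, 2, 3)[0]                                   # Prob(B) = 1/2
E_XA = quad(lambda x: x * f(x) / pA, 0, 2)[0]           # 8/6
E_XB = quad(lambda x: x * f(x) / pB, 2, 3)[0]           # 5/2
var_XB = quad(lambda x: x**2 * f(x) / pB, 2, 3)[0] - E_XB**2

print(pA * E_XA + pB * E_XB)     # ~1.9167 = 23/12 = E[X]
print(var_XB)                    # ~0.0833 = 1/12 = Var(X|B)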
Problems
(c) Find and draw the cumulative distribution function F(x) of this random
variable.
2. A continuous uniform random variable is defined on the interval [-2 6]. Draw
the graph of the probability density function of this random variable. Calculate
and draw the cumulative distribution function of this random variable.
3. The probability density function of a continuous random variable X is given as
f(x) = K/x^{1/4} for 0 ≤ x ≤ 1
       0         otherwise.
(a) Calculate and draw the probability density function for this random variable.
(b) Calculate the probabilities.
8. The cumulative distribution function of a continuous random variable X is
depicted in Fig. 5P.6.
(a) Calculate and draw the probability density function for this random variable.
(b) Find the mean value and variance of this random variable.
9. The probability density function of a continuous random variable X is depicted
in Fig. 5P.7.
(a) Find the value of the constant a.
(b) Calculate and draw the cumulative distribution function of this random
variable.
10. The probability density function of a continuous random variable X is depicted
in Fig. 5P.8. The events A, B, and C are defined as
6 More Than One Random Variables

We can define more than one random variable for the sample space of the continuous experiment. Let X and Y be continuous random variables defined on the same sample space. The joint probability density function of the continuous random variables X and Y is defined as

f(x, y) = lim_{δx→0, δy→0} (1/(δx δy)) Prob{x ≤ X ≤ x + δx, y ≤ Y ≤ y + δy}    (6.1)
where

{x ≤ X ≤ x + δx} and {y ≤ Y ≤ y + δy}    (6.2)
are events, i.e., subsets of the continuous sample space. Note that for continuous
experiments, events are defined using real number intervals.
Let the range sets, i.e., intervals, of the random variables X and Y be R_X = [x_b x_e] and R_Y = [y_b y_e], as illustrated in Fig. 6.1 (two continuous random variables X(·) and Y(·) defined on the same continuous sample space S = [m n]). Then the sample space can be indicated as

S = {x_b ≤ X ≤ x_e} = {y_b ≤ Y ≤ y_e}

and we have

Prob{x_b ≤ X ≤ x_e, y_b ≤ Y ≤ y_e} = Prob(S) = 1.
Let the events A and B be defined as

A = {a ≤ X ≤ b}   B = {c ≤ Y ≤ d}.

The probability

Prob(A ∩ B)

is calculated as

Prob(A ∩ B) = ∫_a^b ∫_c^d f(x, y) dy dx

i.e.,

Prob{a ≤ X ≤ b, c ≤ Y ≤ d} = ∫_a^b ∫_c^d f(x, y) dy dx.    (6.3)

More generally, for a region D of the xy-plane, we have

Prob{(X, Y) ∈ D} = ∬_{(x,y)∈D} f(x, y) dx dy.    (6.4)
Properties
1. The total volume of the geometric shape under the function f(x, y) equals 1, i.e.,

∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x, y) dx dy = 1.    (6.5)

2. Marginal probability density functions f(x) and f(y) can be obtained from the joint probability density function as

f(x) = ∫_{-∞}^{∞} f(x, y) dy   f(y) = ∫_{-∞}^{∞} f(x, y) dx.    (6.6)
Example 6.1: The joint probability density function f(x, y) of two continuous random variables X and Y is a constant, and it is defined on the region shown in Fig. 6.3. Find f(x, y), f(x), and f(y).

Solution 6.1: The region in Fig. 6.3 is detailed in Fig. 6.4. Using Fig. 6.4, we can mathematically write f(x, y) as

f(x, y) = c for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 - x
          0 otherwise.
Employing

∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x, y) dx dy = 1

we get

∫_{x=0}^{1} ∫_{y=0}^{1-x} c dy dx = 1

∫_{x=0}^{1} c(1 - x) dx = 1

leading to

c(1 - 1/2) = 1

c = 2.

Hence,

f(x, y) = 2 for 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 - x
          0 otherwise.
Note that

∫_{x=0}^{1} ∫_{y=0}^{1-x} dy dx

equals the area of the triangular region on which f(x, y) is defined, i.e., 1/2, which is why c = 2. The marginal probability density functions are found as

f(x) = ∫_0^{1-x} 2 dy → f(x) = 2(1 - x), 0 ≤ x ≤ 1

f(y) = ∫_0^{1-y} 2 dx → f(y) = 2(1 - y), 0 ≤ y ≤ 1.
Thus, we got

f(x) = 2(1 - x), 0 ≤ x ≤ 1
f(y) = 2(1 - y), 0 ≤ y ≤ 1.

Note that, for the calculated f(x) and f(y), we have

∫_{-∞}^{∞} f(x) dx = 1   ∫_{-∞}^{∞} f(y) dy = 1.
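The normalization constant and the marginals of Example 6.1 can also be obtained numerically. A brief Python sketch (an added illustration using scipy's dblquad over the triangular region of Fig. 6.4):

from scipy.integrate import dblquad, quad

c = 2.0
# f(x, y) = c on the triangle 0 <= x <= 1, 0 <= y <= 1 - x
total = dblquad(lambda y, x: c, 0, 1, lambda x: 0.0, lambda x: 1 - x)[0]
print(total)                          # 1.0, so c = 2 normalizes f(x, y)

# Marginal f(x) at a sample point, compared with 2(1 - x)
f_x = lambda x: quad(lambda y: c, 0, 1 - x)[0]
print(f_x(0.25), 2 * (1 - 0.25))      # both 1.5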
The conditional probability density function of two continuous random variables X and Y is defined as

f(x|y) = f(x, y)/f(y).    (6.7)
Substituting

f(x|y) = f(x, y)/f(y)

into

∫_{-∞}^{∞} f(x|y) dx

we get

∫_{-∞}^{∞} (f(x, y)/f(y)) dx

where employing

f(y) = ∫_{-∞}^{∞} f(x, y) dx

we obtain

f(y)/f(y) → 1

i.e., f(x|y) is a legitimate probability density function.
6.3 Conditional Expectation

The conditional expectation of X on condition Y = y is defined as

E[X|Y = y] = ∫_{-∞}^{∞} x f(x|y) dx    (6.8)

which can be considered as a function of y, i.e., g(y) = E[X|Y = y].

In a similar manner, the conditional expectation of g(X) on condition Y = y is defined as

E[g(X)|Y = y] = ∫_{-∞}^{∞} g(x) f(x|y) dx.    (6.9)
Example 6.3: The joint probability density function f(x, y) of two continuous random variables X and Y is defined on the region shown in Fig. 6.5 as f(x, y) = 2. The marginal probability density functions f(x) and f(y) are equal to

f(x) = 2(1 - x), 0 ≤ x ≤ 1
f(y) = 2(1 - y), 0 ≤ y ≤ 1.

Find f(x|y), E[X|Y = y], and E[Y|X = x].
Solution 6.3: We first calculate the conditional probability density functions and then find the conditional expectations as

f(x|y) = f(x, y)/f(y) → f(x|y) = 2/(2(1 - y)) → f(x|y) = 1/(1 - y), 0 ≤ x ≤ 1 - y

E[X|Y = y] = ∫_{-∞}^{∞} x f(x|y) dx → E[X|Y = y] = ∫_0^{1-y} x/(1 - y) dx → E[X|Y = y] = (1 - y)/2, 0 ≤ y ≤ 1

f(y|x) = f(x, y)/f(x) → f(y|x) = 2/(2(1 - x)) → f(y|x) = 1/(1 - x), 0 ≤ y ≤ 1 - x

E[Y|X = x] = ∫_{-∞}^{∞} y f(y|x) dy → E[Y|X = x] = ∫_0^{1-x} y/(1 - x) dy → E[Y|X = x] = (1 - x)/2, 0 ≤ x ≤ 1.
1. The expected values of g(X) and g(X, Y) are calculated as

E[g(X)] = ∫_{-∞}^{∞} g(x) f(x) dx   E[g(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} g(x, y) f(x, y) dx dy.    (6.10)

2. The conditional expected values E[g(X)|Y = y] and E[g(X, Y)|Y = y] can be evaluated as

E[g(X)|Y = y] = ∫_{-∞}^{∞} g(x) f(x|y) dx
E[g(X, Y)|Y = y] = ∫_{-∞}^{∞} g(x, y) f(x|y) dx.    (6.11)

3. Expectation is linear, i.e.,

E[aX + bY] = aE[X] + bE[Y].    (6.12)
Moreover,

Prob{X ∈ A | y} = ∫_A f(x|y) dx    (6.14)

and

Prob{(X, Y) ∈ D} = ∬_{(x,y)∈D} f(x, y) dx dy.    (6.15)
The Bayes rule for the conditional probability density function of continuous
random variables is given as
f(x|y) = f(x) f(y|x) / ∫_{-∞}^{∞} f(x) f(y|x) dx.    (6.17)
The conditional expectation

E[X|Y = y]

is calculated as

E[X|Y = y] = ∫_{-∞}^{∞} x f(x|y) dx.    (6.18)

The result of

E[X|Y = y]

is a function of y, i.e.,

g(y) = E[X|Y = y].    (6.19)

Replacing y by the random variable Y, we obtain the random variable

g(Y) = E[X|Y].

Example 6.4: E[X²|Y = y] is calculated as

E[X²|Y = y] = ∫_{-∞}^{∞} x² f(x|y) dx.
Expected Value of E[X|Y]

Property: The expected value of E[X|Y] equals E[X], i.e.,

E[X] = E[E[X|Y]].    (6.21)

Proof: Using

E[g(Y)] = ∫_{-∞}^{∞} g(y) f(y) dy

with g(Y) = E[X|Y], we can write

E[E[X|Y]] = ∫_{-∞}^{∞} E[X|Y = y] f(y) dy.
Substituting

E[X|Y = y] = ∫_{-∞}^{∞} x f(x|y) dx

into

∫_{-∞}^{∞} E[X|Y = y] f(y) dy

we obtain

∫_{-∞}^{∞} ∫_{-∞}^{∞} x f(x|y) f(y) dx dy

leading to

∫_{-∞}^{∞} x (∫_{-∞}^{∞} f(x, y) dy) dx

where the inner integral equals f(x), resulting in

∫_{-∞}^{∞} x f(x) dx

which is E[X]. Hence, we got

E[X] = ∫_{-∞}^{∞} E[X|Y = y] f(y) dy    (6.22)

i.e.,

E[X] = E[E[X|Y]].
The result in (6.22) can be generalized for functions of X and functions of X, Y as

E[g(X)] = ∫_{-∞}^{∞} E[g(X)|Y = y] f(y) dy
E[g(X, Y)] = ∫_{-∞}^{∞} E[g(X, Y)|Y = y] f(y) dy.    (6.23)

If Y is a discrete random variable, (6.22) takes the form

E[X] = Σ_y E[X|Y = y] p(y).    (6.24)
Variance of E[X|Y]

Next, consider the variance of the conditional expectation, i.e., Var(E[X|Y]). Using

Var(g(Y)) = E[g(Y)²] - (E[g(Y)])²

in which substituting E[X|Y] for g(Y), we obtain

Var(E[X|Y]) = E[(E[X|Y])²] - (E[E[X|Y]])²

where E[E[X|Y]] = E[X], leading to

Var(E[X|Y]) = E[(E[X|Y])²] - (E[X])².    (6.25)
Example 6.6: The joint probability density function f(x, y) of two continuous random variables X and Y is defined on the triangular region D shown in Fig. 6.6 as f(x, y) = 2. The marginal probability density functions f(x), f(y) and the conditional probability density functions f(x|y), f(y|x) are equal to

f(x) = 2(1 - x), 0 ≤ x ≤ 1
f(y) = 2(1 - y), 0 ≤ y ≤ 1
f(x|y) = 1/(1 - y), 0 ≤ x ≤ 1 - y
f(y|x) = 1/(1 - x), 0 ≤ y ≤ 1 - x.

Find E[X|Y] and E[E[X|Y]]. Verify that

E[X] = E[E[X|Y]].
Solution 6.6: First, let's calculate the conditional expected term E[X|Y = y] as

E[X|Y = y] = ∫_{-∞}^{∞} x f(x|y) dx →

E[X|Y = y] = ∫_0^{1-y} x/(1 - y) dx → E[X|Y = y] = (1 - y)/2, 0 ≤ y ≤ 1.

Replacing y by Y, we get

E[X|Y] = (1 - Y)/2.

E[E[X|Y]] can be calculated as

E[E[X|Y]] = ∫_{-∞}^{∞} E[X|y] f(y) dy →

E[E[X|Y]] = ∫_0^1 ((1 - y)/2) 2(1 - y) dy → E[E[X|Y]] = 1/3.
We can evaluate E[X] using

E[X] = ∫_{-∞}^{∞} x f(x) dx

as

E[X] = ∫_0^1 x · 2(1 - x) dx → E[X] = 1/3.

Hence, we verified that

E[X] = E[E[X|Y]].
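The identity E[X] = E[E[X|Y]] can also be observed by simulation. The sketch below (an added illustration; the rejection-sampling scheme is an assumed way of drawing uniformly from the triangle) compares the two sides:

import numpy as np

rng = np.random.default_rng(0)
# Rejection-sample points uniformly on the triangle x, y >= 0, x + y <= 1,
# where the joint density of Example 6.6 equals 2.
pts = rng.random((200000, 2))
x, y = pts[:, 0], pts[:, 1]
keep = x + y <= 1
x, y = x[keep], y[keep]

print(x.mean())                  # ~1/3 = E[X]
print(((1 - y) / 2).mean())      # ~1/3 = E[E[X|Y]], since E[X|Y] = (1 - Y)/2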
6.5 Conditional Variance

The conditional variance

Var(X|y)

is defined as

Var(X|y) = E[X²|y] - (E[X|y])²    (6.26)

which can be considered as a function of y, i.e., as y changes, the value of Var(X|y) changes. Then we can denote Var(X|y) as

g(y) = Var(X|y)

and replacing y by Y, we get

g(Y) = Var(X|Y).

That is,

Var(X|Y) = E[X²|Y] - (E[X|Y])².    (6.27)
Example 6.7: Calculate the expected value of the conditional variance, i.e., find E[Var(X|Y)].

Solution 6.7: Using

E[g(Y)] = ∫_{-∞}^{∞} g(y) f(y) dy

in which substituting E[X²|y] - (E[X|y])² for Var(X|y), we obtain

E[Var(X|Y)] = ∫_{-∞}^{∞} (E[X²|y] - (E[X|y])²) f(y) dy

leading to

E[Var(X|Y)] = E[E[X²|Y]] - E[(E[X|Y])²]

where E[E[X²|Y]] = E[X²], which is simplified as

E[Var(X|Y)] = E[X²] - E[(E[X|Y])²].    (6.28)

On the other hand, by (6.25),

Var(E[X|Y]) = E[(E[X|Y])²] - (E[X])².    (6.29)
Summing (6.28) and (6.29), we get

E[Var(X|Y)] + Var(E[X|Y]) = E[X²] - (E[X])²

where the right-hand side equals Var(X). That is,

Var(X) = E[Var(X|Y)] + Var(E[X|Y]).    (6.30)

This identity,

Var(X) = E[Var(X|Y)] + Var(E[X|Y]),    (6.31)

is known as the law of total variance.
As an example, consider again the joint probability density function f(x, y) = 2 defined on the triangular region of Example 6.6, with marginal probability density functions

f(x) = 2(1 - x), 0 ≤ x ≤ 1,
f(y) = 2(1 - y), 0 ≤ y ≤ 1.

Find

E[Var(X|Y)] = E[X²] - E[(E[X|Y])²]

and

Var(E[X|Y]) = E[(E[X|Y])²] - (E[X])²

and verify that

Var(X) = E[Var(X|Y)] + Var(E[X|Y]).
6.6 Independence of Continuous Random Variables

The continuous random variables X and Y are independent if

f(x, y) = f(x) f(y).    (6.32)

When f(x, y) = f(x|y)f(y) and f(x, y) = f(y|x)f(x) are substituted into (6.32), we get

f(x|y) = f(x)

and

f(y|x) = f(y)

respectively.
If X and Y are independent random variables and A = {a ≤ X ≤ b}, B = {c ≤ Y ≤ d} are two events, then we have

Prob{a ≤ X ≤ b, c ≤ Y ≤ d} = ∫_a^b ∫_c^d f(x, y) dy dx →

Prob{a ≤ X ≤ b, c ≤ Y ≤ d} = ∫_a^b ∫_c^d fx(x) fy(y) dy dx →

Prob{a ≤ X ≤ b, c ≤ Y ≤ d} = (∫_a^b fx(x) dx)(∫_c^d fy(y) dy) →

Prob{a ≤ X ≤ b, c ≤ Y ≤ d} = Prob{a ≤ X ≤ b} Prob{c ≤ Y ≤ d}.
The above equality can be derived in an alternative way as follows. Let A = [a b], B = [c d] be the events; since X and Y are independent, so are the events A and B, and we have

Prob(A ∩ B) = Prob(A) Prob(B).
For independent X and Y, we also have

E[g(X) k(Y)] = E[g(X)] E[k(Y)]    (6.35)

and

Var(X + Y) = Var(X) + Var(Y).    (6.36)
To see this, recall that for a function g(X, Y) we have

Var(g(X, Y)) = E[(g(X, Y))²] - m²

where

E[(g(X, Y))²] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} [g(x, y)]² f(x, y) dx dy

and

m = E[g(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} g(x, y) f(x, y) dx dy.
Let g(X, Y) = X + Y; then the mean value of g(X, Y) can be calculated as

m = E[X + Y] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} (x + y) f(x, y) dx dy →

m = ∫∫ x f(x, y) dx dy + ∫∫ y f(x, y) dx dy →

m = ∫_{-∞}^{∞} x fx(x) dx + ∫_{-∞}^{∞} y fy(y) dy →

m = mx + my.
We can compute E[(X + Y)²] as

E[(X + Y)²] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} (x + y)² f(x, y) dx dy →

= ∫∫ x² f(x, y) dx dy + ∫∫ y² f(x, y) dx dy + ∫∫ 2xy fx(x) fy(y) dx dy

= ∫ x² fx(x) dx + ∫ y² fy(y) dy + 2(∫ x fx(x) dx)(∫ y fy(y) dy)

= E[X²] + E[Y²] + 2E[X]E[Y]

where independence has been used in the cross term.
Finally, we can calculate Var(X + Y) using

Var(X + Y) = E[(X + Y)²] - m²

as

Var(X + Y) = E[X²] + E[Y²] + 2E[X]E[Y] - (mx + my)²

= (E[X²] - mx²) + (E[Y²] - my²) + 2E[X]E[Y] - 2 mx my

= Var(X) + Var(Y)

since 2E[X]E[Y] - 2 mx my = 0.
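The additivity of variances under independence, (6.36), is easy to see numerically. A minimal sketch with two assumed distributions:

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 500_000)       # Var(X) = 1/12
y = rng.exponential(2.0, 500_000)    # Var(Y) = 4, drawn independently of x

print(np.var(x) + np.var(y))         # ~4.083
print(np.var(x + y))                 # ~4.083: Var(X + Y) = Var(X) + Var(Y)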
6.7 Joint Cumulative Distribution Function

The joint cumulative distribution function of continuous random variables X and Y is defined as

F(x, y) = Prob{X ≤ x, Y ≤ y}    (6.37)

and the joint probability density function f(x, y) can be obtained from the joint cumulative distribution function via

f(x, y) = ∂²F(x, y)/∂x∂y.    (6.39)
For three continuous random variables X, Y, and Z, probabilities of the forms

Prob{X ∈ A, Y ∈ B, Z ∈ C} and Prob{(X, Y, Z) ∈ D}

with A = [a b], B = [c d], C = [e f], i.e., a ≤ x ≤ b, c ≤ y ≤ d, e ≤ z ≤ f, can be calculated as

Prob{X ∈ A, Y ∈ B, Z ∈ C} = ∭_{x∈A, y∈B, z∈C} f(x, y, z) dx dy dz

or as

Prob{(X, Y, Z) ∈ D} = ∭_{(x,y,z)∈D} f(x, y, z) dx dy dz    (6.40)

or more in detail as

Prob{a ≤ X ≤ b, c ≤ Y ≤ d, e ≤ Z ≤ f} = ∫_a^b ∫_c^d ∫_e^f f(x, y, z) dz dy dx.    (6.41)
Properties
1. Marginal probability density functions can be calculated from joint distributions as

f(x) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x, y, z) dy dz.

Joint distributions involving fewer variables can be evaluated from those joint distributions involving more variables as

f(x, y) = ∫_{-∞}^{∞} f(x, y, z) dz.

2. Conditional probability density functions are obtained as

f(x|y, z) = f(x, y, z)/f(y, z)   f(x, y|z) = f(x, y, z)/f(z).    (6.46)

3. If the random variables X, Y, and Z are independent of each other, we have

f(x, y, z) = f(x) f(y) f(z).
Assume that the function z = f(x, y) is defined on the region D shown in Fig. 6.8, and we can consider that there is a surface over the region D whose height at point (x, y) equals f(x, y). If D is bounded by the curves h(x) and g(x) for a ≤ x ≤ b, the volume of the solid whose base is indicated by D is calculated as

V = ∫_a^b ∫_{h(x)}^{g(x)} f(x, y) dy dx.    (6.49)

If the region D is surrounded by the functions g(y) and h(y) as in Fig. 6.9, then the volume of the solid with base region D whose height at point (x, y) equals f(x, y) is calculated as

V = ∫_a^b ∫_{h(y)}^{g(y)} f(x, y) dx dy.    (6.50)
For instance, let f(x, y) = cxy be defined on the region bounded by the lines h(x) = -x + 1 and g(x) = x + 1 for 0 ≤ x ≤ 1. Employing the normalization condition

∫_{x=0}^{1} ∫_{y=-x+1}^{x+1} cxy dy dx = 1

leading to

∫_{x=0}^{1} cx (∫_{y=-x+1}^{x+1} y dy) dx = 1

where the inner integral equals 2x, we get

∫_0^1 2cx² dx = 1 → c = 3/2.
The covariance of two random variables X and Y is defined as

Cov(X, Y) = E[XY] - E[X]E[Y]    (6.51)

or, equivalently, using

Cov(X, Y) = E[(X - mx)(Y - my)].    (6.52)

Indeed, expanding the product shows that

E[(X - mx)(Y - my)] = E[XY] - E[X]E[Y].    (6.53)
Property
If the random variables X, Y are independent of each other, then we have

Cov(X, Y) = E[XY] - E[X]E[Y] → Cov(X, Y) = E[X]E[Y] - E[X]E[Y] → Cov(X, Y) = 0.
The correlation coefficient for the random variables X, Y is calculated as

ρ = Cov(X, Y) / √(Var(X) Var(Y)).    (6.54)
6.8 Distribution for Functions of Random Variables

Let Y = g(X), where g is a monotonically increasing function. The cumulative distribution function of Y can be calculated as

H(y) = Prob{Y ≤ y}
     = Prob{g(X) ≤ y}
     = Prob{X ≤ g⁻¹(y)}
     = ∫_{-∞}^{g⁻¹(y)} f(x) dx
     = F(g⁻¹(y)).
Example 6.10: The random variable X is uniformly distributed on [0 1], i.e.,

f(x) = 1 for 0 ≤ x ≤ 1
       0 otherwise.

Find the probability density function of Y = √X.
Solution 6.10: If Y = √X, then we have

y = √x

and for 0 ≤ x ≤ 1, we have 0 ≤ y ≤ 1. The cumulative distribution function of X equals

F(x) = x, 0 ≤ x ≤ 1.

The cumulative distribution function of Y can be calculated as

H(y) = Prob{Y ≤ y}
     = Prob{√X ≤ y}
     = Prob{X ≤ y²}
     = F(y²), 0 ≤ y² ≤ 1
     = y², 0 ≤ y ≤ 1.

Hence, we got

H(y) = 0  for y < 0
       y² for 0 ≤ y ≤ 1
       1  for y > 1.

Differentiating, we obtain

fy(y) = dH(y)/dy → fy(y) = 2y for 0 ≤ y ≤ 1
                           0  otherwise.
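The result fy(y) = 2y can be confirmed by simulation; the sketch below (an added illustration) compares the empirical cumulative distribution of Y = √X with H(y) = y²:

import numpy as np

rng = np.random.default_rng(2)
y = np.sqrt(rng.random(500_000))     # Y = sqrt(X), X uniform on [0, 1]

for t in (0.25, 0.5, 0.9):
    print(t, (y <= t).mean(), t**2)  # empirical CDF vs. H(t) = t^2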
Note: The derivative of the composite function F(g(y)) can be calculated as

dF(g(y))/dy = F′(g(y)) g′(y).    (6.57)
Example 6.11: If

fx(x) = dF(x)/dx

find

dF(√y)/dy.

Solution 6.11: Let g(y) = √y; employing

dF(g(y))/dy = F′(g(y)) g′(y)

we obtain

dF(√y)/dy = fx(√y)(√y)′ → dF(√y)/dy = (1/(2√y)) fx(√y).
Example 6.12: If

fx(x) = dF(x)/dx

find

dF(-√y)/dy.

Solution 6.12: Let g(y) = -√y; employing

dF(g(y))/dy = F′(g(y)) g′(y)

we obtain

dF(-√y)/dy = fx(-√y)(-√y)′ → dF(-√y)/dy = -(1/(2√y)) fx(-√y).
For Y = X², the cumulative distribution function of Y can be written as

H(y) = Prob{Y ≤ y}
     = Prob{X² ≤ y}
     = Prob{-√y ≤ X ≤ √y}
     = F(√y) - F(-√y).

Differentiating via

dF(g(y))/dy = F′(g(y)) g′(y)

and using the two previous examples, we obtain

fy(y) = (1/(2√y))(fx(√y) + fx(-√y)), y > 0.
Similarly, for Y = aX + b, the cumulative distribution function can be calculated as

H(y) = Prob{Y ≤ y}
     = Prob{aX + b ≤ y}
     = Prob{X ≤ (y - b)/a} if a > 0
     = F((y - b)/a)

and

H(y) = Prob{Y ≤ y}
     = Prob{aX + b ≤ y}
     = Prob{X ≥ (y - b)/a} if a < 0
     = 1 - Prob{X ≤ (y - b)/a}
     = 1 - F((y - b)/a).
Hence, we got

H(y) = F((y - b)/a)     if a > 0
       1 - F((y - b)/a) if a < 0.

Differentiating via

dF(g(y))/dy = F′(g(y)) g′(y)

we get

fy(y) = (1/a) fx((y - b)/a)  if a > 0
        -(1/a) fx((y - b)/a) if a < 0

which can be combined as

fy(y) = (1/|a|) fx((y - b)/a).    (6.58)
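Formula (6.58) can be sanity-checked against a case where the answer is known in closed form; the sketch below (an added illustration with an assumed Gaussian fx and a < 0) compares (6.58) with the exact density of Y = aX + b ~ N(b, a²):

import numpy as np
from scipy.stats import norm

a, b = -2.0, 1.0                     # assumed constants, with a < 0
f_x = norm(0, 1).pdf

y = np.linspace(-5, 5, 11)
f_y = f_x((y - b) / a) / abs(a)      # formula (6.58)
print(np.allclose(f_y, norm(b, abs(a)).pdf(y)))   # True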
6.9 Probability Density Function for Function of Two Random Variables

In this subsection, we will inspect the probability density function of Z, which is obtained from two different continuous random variables by a function, i.e.,

Z = g(X, Y).    (6.59)

The cumulative distribution function of Z is

F(z) = Prob{g(X, Y) ≤ z}.    (6.61)

In the derivations that follow, we will need Leibniz's rule:

d/dx ∫_{a(x)}^{b(x)} f(x, y) dy = f(x, b(x)) db(x)/dx - f(x, a(x)) da(x)/dx + ∫_{a(x)}^{b(x)} ∂f(x, y)/∂x dy.    (6.66)
For constant integration limits a and b, Leibniz's rule reduces to

d/dx ∫_a^b f(x, y) dy = ∫_a^b ∂f(x, y)/∂x dy.    (6.67)
Example 6.16: For the previous example, if the random variables X and Y are independent of each other, find the probability density function of Z = X + Y.

Solution 6.16: For independent X and Y, the joint density factors as f(x, y) = fx(x)fy(y), so the cumulative distribution function of Z can be written as

F(z) = ∫_{x=-∞}^{∞} fx(x) (∫_{y=-∞}^{z-x} fy(y) dy) dx

where the inner integral equals H(z - x), the cumulative distribution function of Y evaluated at z - x, leading to

F(z) = ∫_{x=-∞}^{∞} fx(x) H(z - x) dx.

Using

f(z) = dF(z)/dz

we get

f(z) = ∫_{x=-∞}^{∞} fx(x) (dH(z - x)/dz) dx

where employing

fy(y) = H′(y)

we obtain

f(z) = ∫_{x=-∞}^{∞} fx(x) fy(z - x) dx

which is nothing but the convolution of fx(x) and fy(y), i.e.,

fz(z) = fx(z) * fy(z).
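As a numerical illustration of this convolution result, the sum of two independent uniform [0 1] random variables has the triangular density on [0 2]; the sketch below (an added illustration using a discretized convolution) recovers it:

import numpy as np

dx = 0.001
x = np.arange(0, 1, dx)
f_x = np.ones_like(x)                # uniform density on [0, 1]
f_y = np.ones_like(x)

f_z = np.convolve(f_x, f_y) * dx     # numerical convolution, z in [0, 2]
z = np.arange(len(f_z)) * dx
print(f_z[np.searchsorted(z, 1.0)])  # ~1.0: peak of the triangular density
print(f_z.sum() * dx)                # ~1.0: f_z integrates to one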
6.10 Alternative Formula for the Probability Density Function of a Random Variable

Let

Y = g(X).    (6.68)

If the equation

y = g(x)    (6.69)

has the N roots x₁, x₂, ⋯, x_N, i.e.,

y = g(x₁) = g(x₂) = ⋯ = g(x_{N-1}) = g(x_N)    (6.70)

then the probability density function of Y, i.e., fy(y), can be calculated from the probability density function of X, i.e., fx(x), as

fy(y) = fx(x₁)/|g′(x₁)| + ⋯ + fx(x_N)/|g′(x_N)|.    (6.71)
Example 6.17: If Y = aX + b, find fy(y) in terms of the probability density function fx(x).

Solution 6.17: If we solve

y = ax + b

for x, we get the single root

x₁ = (y - b)/a

and the derivative

g′(x) = a.

Then, (6.71) reduces to

fy(y) = fx(x₁)/|g′(x₁)|

leading to

fy(y) = (1/|a|) fx((y - b)/a).
Example 6.18: If Y = 1/X, find fy(y) in terms of fx(x).

Solution 6.18: Solving

y = 1/x

for x, we get the single root

x₁ = 1/y

and the derivative

g′(x) = -1/x².

Then, we have

fy(y) = fx(x₁)/|g′(x₁)| → fy(y) = fx(x₁)/(1/x₁²) → fy(y) = x₁² fx(x₁)

leading to

fy(y) = (1/y²) fx(1/y).
6.11 Probability Density Function Calculation for the Functions of Two Random Variables

In this section, we explain the probability density function calculation for the functions of two random variables using the cumulative distribution function via some examples.
Example 6.19: If Z = X/Y, find fz(z) in terms of the joint probability density function f(x, y).

Solution 6.19: The cumulative distribution function of Z can be written as

F(z) = Prob{Z ≤ z}

leading to
F(z) = Prob{X/Y ≤ z} ↔ F(z) = Prob{g(X, Y) ≤ z}.

Note: Prob{g(X, Y) ≤ z} can be calculated as

Prob{g(X, Y) ≤ z} = ∬_D f(x, y) dx dy    (6.73)

where

D = {(x, y) | g(x, y) ≤ z}.

For g(x, y) = x/y, the region D splits as

D = {(x, y) | x/y < z} → D₁ = {(x, y) | x < yz, y > 0}   D₂ = {(x, y) | x > yz, y < 0}

so that

F(z) = ∫_{y=0}^{∞} ∫_{x=-∞}^{yz} f(x, y) dx dy + ∫_{y=-∞}^{0} ∫_{x=yz}^{∞} f(x, y) dx dy.

Differentiating with respect to z, we get

fz(z) = ∫_{y=0}^{∞} (d/dz ∫_{x=-∞}^{yz} f(x, y) dx) dy + ∫_{y=-∞}^{0} (d/dz ∫_{x=yz}^{∞} f(x, y) dx) dy

leading to

fz(z) = ∫_{y=0}^{∞} y f(yz, y) dy + ∫_{y=-∞}^{0} (-y) f(yz, y) dy

i.e.,

fz(z) = ∫_{y=-∞}^{∞} |y| f(yz, y) dy.
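For independent standard Gaussian X and Y, this formula yields the Cauchy density fz(z) = 1/(π(1 + z²)); the sketch below (an added illustration; the Gaussian choice is an assumption) evaluates the integral numerically:

import numpy as np
from scipy.integrate import quad

phi = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
f_xy = lambda x, y: phi(x) * phi(y)          # independent standard normals

def f_z(z):
    # f_z(z) = integral of |y| f(yz, y) over y
    return quad(lambda y: abs(y) * f_xy(y * z, y), -np.inf, np.inf)[0]

for z in (0.0, 1.0, 2.0):
    print(f_z(z), 1 / (np.pi * (1 + z**2)))  # matches the Cauchy density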
Example 6.20: If Z = X² + Y², find fz(z) in terms of the joint probability density function f(x, y).

Solution 6.20: The cumulative distribution function of Z can be written as

F(z) = Prob{Z ≤ z}

leading to

F(z) = Prob{X² + Y² ≤ z} ↔ F(z) = Prob{g(X, Y) ≤ z}

where the region D on which the integration is performed is the interior of a circle with radius √z and can be elaborated as

D = {(x, y) | x² + y² < z} → D = {(x, y) | -√z ≤ y ≤ √z, -√(z - y²) ≤ x ≤ √(z - y²)}.

Hence,

F(z) = ∫_{y=-√z}^{√z} g(z, y) dy

where

g(z, y) = ∫_{x=-√(z-y²)}^{√(z-y²)} f(x, y) dx.
Using

f(z) = dF(z)/dz

leading to

fz(z) = ∫_{y=-√z}^{√z} (dg(z, y)/dz) dy

where

dg(z, y)/dz

can be calculated as

dg(z, y)/dz = (√(z - y²))′ f(√(z - y²), y) - (-√(z - y²))′ f(-√(z - y²), y)

leading to

dg(z, y)/dz = (1/(2√(z - y²))) f(√(z - y²), y) + (1/(2√(z - y²))) f(-√(z - y²), y).
Note:

d/dx ∫_{a(x)}^{b(x)} f(x, y) dy = f(x, b(x)) db(x)/dx - f(x, a(x)) da(x)/dx + ∫_{a(x)}^{b(x)} ∂f(x, y)/∂x dy.
Example 6.21: If Z = √(X² + Y²), the cumulative distribution function of Z can be written as

F(z) = Prob{√(X² + Y²) ≤ z}

leading to

F(z) = Prob{X² + Y² ≤ z²} ↔ F(z) = Prob{g(X, Y) ≤ z²}.

Following steps similar to those of the previous example, we obtain

fz(z) = ∫_{y=-z}^{z} (z/√(z² - y²)) [f(√(z² - y²), y) + f(-√(z² - y²), y)] dy.
If X and Y are independent zero-mean Gaussian random variables with

fx(x) = (1/√(2πσ²)) e^{-x²/(2σ²)}   fy(y) = (1/√(2πσ²)) e^{-y²/(2σ²)}

the previous expression yields

fz(z) = ∫_{y=-z}^{z} (z/√(z² - y²)) [(1/√(2πσ²)) e^{-(z²-y²)/(2σ²)} (1/√(2πσ²)) e^{-y²/(2σ²)} + (1/√(2πσ²)) e^{-(z²-y²)/(2σ²)} (1/√(2πσ²)) e^{-y²/(2σ²)}] dy

which simplifies to

fz(z) = ∫_{y=-z}^{z} (z/√(z² - y²)) (1/(πσ²)) e^{-z²/(2σ²)} dy.

With the substitution y = z cos θ, dy = -z sin θ dθ, the integral becomes

fz(z) = -∫_{θ=π}^{0} (z/√(z² - z² cos²θ)) (1/(πσ²)) e^{-z²/(2σ²)} z sin θ dθ

fz(z) = -∫_{θ=π}^{0} (z/(z sin θ)) (1/(πσ²)) e^{-z²/(2σ²)} z sin θ dθ

resulting in

fz(z) = (z/σ²) e^{-z²/(2σ²)}, z > 0    (6.75)

which is the Rayleigh probability density function. If the Gaussian random variables have nonzero means with m = √(mx² + my²), a similar calculation leads to the Rician probability density function

fz(z) = (z/σ²) e^{-(z² + m²)/(2σ²)} I₀(zm/σ²)    (6.76)

where I₀(·) denotes the zeroth-order modified Bessel function of the first kind.
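The Rayleigh result (6.75) is straightforward to confirm by simulating √(X² + Y²) for independent zero-mean Gaussians; a brief sketch (an added illustration, σ = 1 assumed):

import numpy as np

rng = np.random.default_rng(3)
sigma = 1.0
x = rng.normal(0, sigma, 500_000)
y = rng.normal(0, sigma, 500_000)
z = np.hypot(x, y)                   # Z = sqrt(X^2 + Y^2)

f_z = lambda t: (t / sigma**2) * np.exp(-t**2 / (2 * sigma**2))
hist, edges = np.histogram(z, bins=50, range=(0, 4), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - f_z(mid))))   # small: histogram matches (6.75)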
6.12 Two Functions of Two Random Variables

For two continuous random variables X and Y, we define

Z = g(X, Y)   W = h(X, Y).    (6.78)

First Method
Step 1: Solve the equation set

z = g(x, y)   w = h(x, y)

for the unknowns x and y, and denote the roots by x_i, y_i.

Step 2: The joint probability density function of Z and W can be calculated using

fzw(z, w) = Σ_i (1/|J(x_i, y_i)|) fxy(x_i, y_i)    (6.79)

where

J(x, y) = det [∂z/∂x  ∂z/∂y; ∂w/∂x  ∂w/∂y]   and   J(x_i, y_i) = J(x, y)|_{x_i, y_i}.    (6.80)
Second Method
The second method can be used if the equation set z = g(x, y), w = h(x, y) has only one pair of roots for x and y.

Step 1: Using the equations

z = g(x, y)   w = h(x, y)

we express x and y as functions of z and w, i.e., x = x(z, w), y = y(z, w).

Step 2: The joint probability density function of Z and W is calculated as

fzw(z, w) = |J(z, w)| fxy(x(z, w), y(z, w))    (6.81)

where

J(z, w) = det [∂x/∂z  ∂x/∂w; ∂y/∂z  ∂y/∂w].    (6.82)
Example 6.22: X and Y are two continuous random variables. Using these two random variables, we obtain the random variables Z and W as

Z = X + Y   W = X - Y.

Express fzw(z, w) in terms of fxy(x, y).

Solution 6.22 (First Method):
Step 1: Solving the equation set

z = x + y   w = x - y

we obtain the single pair of roots

x₁ = (z + w)/2   y₁ = (z - w)/2.

Step 2: We evaluate the Jacobian

J(x, y) = det [∂z/∂x  ∂z/∂y; ∂w/∂x  ∂w/∂y] = det [1  1; 1  -1] = -2 → J(x₁, y₁) = -2.

Then using

fzw(z, w) = Σ_i (1/|J(x_i, y_i)|) fxy(x_i, y_i)

leading to

fzw(z, w) = (1/|J(x₁, y₁)|) fxy(x₁, y₁)

resulting in

fzw(z, w) = (1/2) fxy((z + w)/2, (z - w)/2).
Second Method
Step 1: Using the equations

z = x + y   w = x - y

we express x and y as

x = (z + w)/2   y = (z - w)/2.

Step 2: We evaluate the Jacobian

J(z, w) = det [∂x/∂z  ∂x/∂w; ∂y/∂z  ∂y/∂w] = det [1/2  1/2; 1/2  -1/2] → J(z, w) = -1/2

leading to

fzw(z, w) = |J(z, w)| fxy(x, y) → fzw(z, w) = (1/2) fxy((z + w)/2, (z - w)/2).
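Both Jacobians of Example 6.22, and the fact that they are reciprocals of each other, can be reproduced symbolically; a small sympy sketch (an added illustration):

import sympy as sp

x, y = sp.symbols('x y')

# Forward Jacobian J(x, y) for z = x + y, w = x - y
J_xy = sp.Matrix([[sp.diff(x + y, x), sp.diff(x + y, y)],
                  [sp.diff(x - y, x), sp.diff(x - y, y)]]).det()
print(J_xy)                          # -2

# Inverse Jacobian J(z, w) for x = (z + w)/2, y = (z - w)/2
J_zw = sp.Matrix([[sp.Rational(1, 2), sp.Rational(1, 2)],
                  [sp.Rational(1, 2), -sp.Rational(1, 2)]]).det()
print(J_zw, J_xy * J_zw)             # -1/2 and 1: J(x, y) = 1/J(z, w)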
Example 6.23: X and Y are two continuous random variables. Using these two random variables, we obtain the random variable Z as

Z = XY.

Find fz(z) in terms of the joint probability density function f(x, y).

Solution 6.23: To be able to use the Jacobian approach, we need two equations. The first one is given as

Z = XY.

As the second equation, we define the auxiliary random variable

W = X.

First Method
Step 1: Solving the equation set

z = xy   w = x

we obtain the single pair of roots

x₁ = w   y₁ = z/w.

Step 2: We evaluate the Jacobian

J(x, y) = det [∂z/∂x  ∂z/∂y; ∂w/∂x  ∂w/∂y] = det [y  x; 1  0] = -x → J(x₁, y₁) = -w.

Then using

fzw(z, w) = Σ_i (1/|J(x_i, y_i)|) fxy(x_i, y_i)

leading to

fzw(z, w) = (1/|J(x₁, y₁)|) fxy(x₁, y₁)

resulting in

fzw(z, w) = (1/|w|) fxy(w, z/w).
Second Method
Step 1: Using the equations

z = xy   w = x

we express x and y as

x = w   y = z/w.

Step 2: We evaluate the Jacobian

J(z, w) = det [∂x/∂z  ∂x/∂w; ∂y/∂z  ∂y/∂w] = det [0  1; 1/w  -z/w²] → J(z, w) = -1/w

leading to

fzw(z, w) = (1/|w|) fxy(w, z/w).

Finally, the probability density function of Z is obtained by integrating out w, i.e.,

fz(z) = ∫_{-∞}^{∞} (1/|w|) fxy(w, z/w) dw.
Example 6.24: If

Z = X + Y   W = X/Y

express fzw(z, w) in terms of fxy(x, y).

Solution 6.24: Solving the equation set

z = x + y   w = x/y

we obtain the single pair of roots

x₁ = zw/(w + 1)   y₁ = z/(w + 1).

Applying the Jacobian method, we obtain

fzw(z, w) = (|z|/(w + 1)²) fxy(zw/(w + 1), z/(w + 1)).
Example 6.25: If Y = √X, find fy(y) in terms of fx(x).

Solution 6.25: When the equation y = √x is solved for x, we get the single root

x₁ = y².

Then,

fy(y) = fx(x₁)/|g′(x₁)| + ⋯ + fx(x_N)/|g′(x_N)|

reduces to

fy(y) = fx(x₁)/|g′(x₁)|

where

g(x) = √x → g′(x) = 1/(2√x) → g′(x₁) = g′(y²) = 1/(2y).

Then, we have

fy(y) = fx(y²)/|g′(x₁)| → fy(y) = 2y fx(y²), y > 0.
Example 6.26: If Y = X², find fy(y) in terms of fx(x).

Solution 6.26: When the equation y = x² is solved for x, we get the two roots x₁ = √y and x₂ = -√y. Then,

fy(y) = fx(x₁)/|g′(x₁)| + ⋯ + fx(x_N)/|g′(x_N)|

reduces to

fy(y) = fx(x₁)/|g′(x₁)| + fx(x₂)/|g′(x₂)|

where fx(x₁) = fx(√y), fx(x₂) = fx(-√y) and

g(x) = x² → g′(x) = 2x → |g′(x₁)| = |g′(x₂)| = 2√y.

Then, we have

fy(y) = fx(√y)/(2√y) + fx(-√y)/(2√y), y > 0.

Thus,

fy(y) = (fx(√y) + fx(-√y))/(2√y) for y > 0
        0                         otherwise.
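Applying this two-root formula to a standard Gaussian X gives the chi-square density with one degree of freedom; a brief check (an added illustration, Gaussian input assumed):

import numpy as np
from scipy.stats import norm, chi2

y = np.linspace(0.1, 5, 10)
f_y = (norm.pdf(np.sqrt(y)) + norm.pdf(-np.sqrt(y))) / (2 * np.sqrt(y))
print(np.allclose(f_y, chi2(1).pdf(y)))   # True: Y = X^2 is chi-square(1)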
Exercises:
1. If Y = e^X, find fy(y) in terms of fx(x).
2. If Y = -ln X, find fy(y) in terms of fx(x).
Problems
1. The joint probability density function f(x, y) of two continuous random variables X and Y is a constant, and it is defined on the region shown in Fig. 6P.1.
(a) Find f(x, y), f(x), f(y), and f(x|y).
(b) Calculate E[X], E[Y], and Var(X).
(c) Find E[X|Y = y] and E[Y|X = x].
(d) Find E[X|Y] and E[E[X|Y]].
(e) Find Var(X|Y = y) and Var(Y|X = x).
(f) Find Var(X|Y) and verify that Var(X) = E[Var(X|Y)] + Var(E[X|Y]).
2. Assume that we have two independent normal random variables X₁ and X₂, i.e.,

X₁ ~ N(0, 1)   X₂ ~ N(0, 1).

If Y = X₁ + X₂, what is the variance of Y?
3. Assume that we have independent random variables X, Y, each of which is uniformly distributed in the interval [0 1]. Find the probabilities

P(X < 3/7)   P(X + Y ≤ 3/8)   P(XY ≤ 1/4)   P(X/Y ≤ 1/4)   P(max(X, Y) ≤ 4/5)   P(X < Y).
4. The joint probability density function of X and Y is given as

f(x, y) = c if x > 0, y > 0, and x/2 + y < 1
          0 otherwise.

The events A and B are defined as

A = {X ≤ Y}   B = {Y ≤ 0.5}.

(a) Determine the value of the constant c and find the following:

P(B|A)   P(A|B)   E[XY]   E[X + Y]   E[X|Y]   fx(x|B)   Cov(X, Y).
Z = Y/X.
Z = aX + bY   W = cX + dY.

Express the joint probability density function of Z and W, i.e., fzw(z, w), in terms of the joint probability density function of X and Y, i.e., fxy(x, y).
13. Assume that we have three continuous random variables X, Y, Z, and the relation among these random variables is defined as

Z = XY.

Find the probability density function of Z, i.e., fz(z), in terms of the joint probability density function of X and Y, i.e., fxy(x, y).
14. Assume the random variables X and Y are normal random variables. If

Z = X + Y   W = X - Y