IMEN319 1.probability Review

The document provides lecture notes on probability theory, discussing concepts such as deterministic versus random systems, sample spaces, events, axioms of probability, conditional probability, and independence of events. It includes various examples to illustrate these concepts, emphasizing the mathematical foundation necessary for making predictions about random phenomena. Key principles such as the total probability theorem and Bayes' formula are also presented to aid in understanding probability in practical scenarios.

IMEN 319 OR-II

Lecture notes

2023 Academic Year, 1st Semester

1 Introduction
Example 1.1 Deterministic System vs. Random System
"

mmls

Sservery

Fig. 1. A simple Single Server Queue

Assume the service time is 4 minutes on average and customers arrive every 5
minutes on average. In a deterministic model (no uncertainty), there will be no
waiting. Why? But in a random system (i.e., the real world), a queue will form.
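This claim can be checked by simulation. The sketch below is my own illustration (not part of the notes): it compares the deterministic system with a random one in which interarrival and service times are exponential with the same means, a common modeling assumption.

```python
import random

def avg_wait(n_customers, arrival, service, seed=None):
    """Average waiting time in a single-server FIFO queue.

    arrival(rng) and service(rng) return the next interarrival and
    service times.
    """
    rng = random.Random(seed)
    t = 0.0          # arrival time of the current customer
    free_at = 0.0    # time at which the server becomes free
    total_wait = 0.0
    for _ in range(n_customers):
        t += arrival(rng)
        total_wait += max(0.0, free_at - t)     # wait if server still busy
        free_at = max(free_at, t) + service(rng)
    return total_wait / n_customers

# Deterministic: arrive every 5 min, serve in 4 min -> nobody ever waits.
det = avg_wait(10_000, lambda r: 5.0, lambda r: 4.0)

# Random system with the same means (exponential times) -> a queue forms.
ran = avg_wait(100_000,
               lambda r: r.expovariate(1 / 5),   # mean 5 min between arrivals
               lambda r: r.expovariate(1 / 4),   # mean 4 min of service
               seed=1)

print(det, ran)   # det is exactly 0.0; ran is far above zero
```

Even though the server is idle 20% of the time on average, randomness alone produces substantial waiting.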

How can we make precise predictions about such random phenomena whose
behavior is inherently unpredictable? In order to reason reliably about random
phenomena, it is essential to develop a rigorous mathematical foundation, which
is the goal of probability theory:
Probability theory is the mathematical study of random phenomena.

2 Review of Probability
2.1 Sample Space

Definition 1. The sample space Ω is the set of all possible outcomes of a
random experiment.

Example 2.1.1 Consider the random experiment of flipping two coins. The sample
space consists of the following four points
Ω = {(h, h), (h, t), (t, h), (t, t)}.

Example 2.1.2 Consider the random experiment of tossing two dice. We denote
by (i, j) the outcome that the first die comes up i and the second die comes up
j. Hence, we define the sample space
Ω = {(i, j) : i, j = 1, 2, 3, 4, 5, 6}.

Example 2.1.3 Consider the random experiment of waiting for a bus that will
arrive at a random time in the future. In this case, the outcome of the experiment
can be any real number t ≥ 0 (t = 0 means the bus comes immediately, t = 1.5
means the bus comes after 1.5 hours, etc.). We can therefore define the sample
space
Ω = [0, +∞).

2.2 Events

Informally, an event is a statement for which we can determine whether it is true
or false after the experiment has been performed.

Example 2.2.1 In Example 2.1.1, consider the following event:


“A head appears on the first coin.”
Note that this event occurs in a given experiment if and only if the outcome of
the experiment happens to lie in the following subset of all possible outcomes:
{(h, h), (h, t)} ⊂ Ω

Example 2.2.2 In Example 2.1.2, consider the following event:


“The sum of the numbers on the dice is 7.”
Note that this event occurs in a given experiment if and only if the outcome of
the experiment happens to lie in the following subset of all possible outcomes:
{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} ⊂ Ω

Example 2.2.3 In Example 2.1.3, consider the following event:


“The bus comes within the first hour.”
Note that this event occurs in a given experiment if and only if the outcome of
the experiment happens to lie in the following subset of all possible outcomes:
[0, 1] ⊂ Ω.

Definition 2. An event is a subset A of the sample space Ω.

Definition 2 allows us to translate any combination of events into mathematical
language (i.e., set operations).

2.3 Axioms of Probability

Definition 3. Consider an experiment whose sample space is Ω. For each
event A of the sample space Ω, we assume that a number P(A) is defined
and satisfies the following three axioms:

Axiom 1. (Nonnegativity) 0 ≤ P(A) ≤ 1 (probability is a "degree of
confidence").

Axiom 2. (Normalization) P(Ω) = 1 (we are certain that something will
happen).

Axiom 3. (Additivity) For any sequence of mutually exclusive events A1, A2, ...
(that is, events for which Ai ∩ Aj = ∅ for all i ≠ j),

P(⋃_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai)

(the probabilities of mutually exclusive events add up).

We refer to P(A) as the probability of the event A.

Example 2.3.1 Consider the random experiment of throwing a die. As the die
has six sides, the natural sample space for this model is
Ω = {1, 2, 3, 4, 5, 6}.
How can we assign probabilities to this experiment?
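The notes leave this question open; the natural answer for a fair die is the classical (equally likely) assignment P(A) = |A|/|Ω|, which can be checked against the three axioms. A small sketch of my own (not part of the notes):

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # sample space for one fair die

def P(A):
    """Classical probability: every outcome equally likely."""
    return Fraction(len(A & omega), len(omega))

even = {2, 4, 6}
print(P(even))            # 1/2

# Axiom 1: 0 <= P(A) <= 1
assert 0 <= P(even) <= 1
# Axiom 2: P(omega) = 1
assert P(omega) == 1
# Axiom 3 (two disjoint events): P(A U B) = P(A) + P(B)
A, B = {1, 2}, {5, 6}
assert A & B == set() and P(A | B) == P(A) + P(B)
```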

2.4 Conditional Probability

Example 2.4.1 (The Prisoner’s Dilemma) Three prisoners are informed by their
jailer that one of them has been chosen at random to be executed, and the other
two are to be freed. Prisoner A asks the jailer to tell him privately who is the
prisoner other than himself that will be released. The jailer refuses to answer this
question, pointing out that if A knew which of his fellows were to be set free,
then his own probability of being executed would rise from 1/3 to 1/2, since he
would then be one of two prisoners. What do you think of the jailer’s reasoning?

What would happen if we gain partial information concerning the result of an
experiment? We thus seek to construct a new probability law which takes into
account this knowledge and which, for any event A, gives us the conditional
probability of A given B, denoted by P(A|B).

Definition 4. The conditional probability P(A|B) of an event A given that
the event B occurs is defined as

P(A|B) = P(A ∩ B) / P(B),

provided P(B) > 0.

In words, out of the total probability of the elements of B, P(A|B) is the fraction
that is assigned to possible outcomes that also belong to A. When we condition
on B, only outcomes where B occurs remain possible. This restricts the set of
all possible outcomes to B, and the set of favorable outcomes to A ∩ B. Thus
the conditional probability of A given B is P(A ∩ B)/P(B): we must normalize
by P(B) to ensure that the probability of all possible outcomes is one (that is,
P(B|B) = 1). Conditional probability provides us with a way to reason about
the outcome of an experiment, based on partial information.

Example 2.4.2 We toss a fair coin three successive times. We wish to find the
conditional probability P (A|B) when A and B are the events
A = {more heads than tails come up}, B = {1st toss is a head}.
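The notes do not work this example out; since the 8 outcomes are equally likely, P(A|B) can be computed by brute-force enumeration. A sketch of my own:

```python
from itertools import product
from fractions import Fraction

omega = list(product("HT", repeat=3))   # 8 equally likely outcomes

A = {w for w in omega if w.count("H") > w.count("T")}  # more heads than tails
B = {w for w in omega if w[0] == "H"}                  # first toss is a head

def P(E):
    return Fraction(len(E), len(omega))

P_A_given_B = P(A & B) / P(B)
print(P_A_given_B)   # 3/4
```

Out of the four outcomes where the first toss is a head, three (HHH, HHT, HTH) have more heads than tails.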

Example 2.4.3 Suppose that an urn contains 8 red balls and 4 white balls. We
draw 2 balls from the urn without replacement. If we assume that at each draw,
each ball in the urn is equally likely to be chosen, what is the probability that
both balls drawn are red?
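The answer follows from P(R1 ∩ R2) = P(R1)P(R2|R1) = (8/12)(7/11) = 14/33. A quick check of my own, confirming the multiplication-rule value against brute-force enumeration:

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(R1 and R2) = P(R1) * P(R2 | R1)
p_rule = Fraction(8, 12) * Fraction(7, 11)

# Brute force: label balls 0-7 red, 8-11 white; enumerate ordered draws
draws = list(permutations(range(12), 2))
p_count = Fraction(sum(1 for a, b in draws if a < 8 and b < 8), len(draws))

print(p_rule, p_count)   # both 14/33
assert p_rule == p_count
```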

A generalization of P(A ∩ B) = P(A|B)P(B), which provides an expression for
the probability of the intersection of an arbitrary number of events, is sometimes
referred to as the multiplication rule.

The multiplication rule

P(A1 ∩ A2 ∩ A3 ∩ ··· ∩ An)
= P(A1)P(A2|A1)P(A3|A1 ∩ A2) ··· P(An|A1 ∩ ··· ∩ A_{n−1}).

Let us now derive a more interesting property of conditional probabilities. By
the definition of conditional probability, we have

P(A ∩ B) = P(A|B)P(B),
P(A ∩ B^c) = P(A|B^c)P(B^c).

As A = (A ∩ B) ∪ (A ∩ B^c), and the events A ∩ B and A ∩ B^c are clearly
mutually exclusive, we have, by Axiom 3,

P(A) = P(A ∩ B) + P(A ∩ B^c) = P(A|B)P(B) + P(A|B^c)P(B^c).

This equation states that the probability of the event A is a weighted average of
the conditional probability of A given that B has occurred and the conditional
probability of A given that B has not occurred, each conditional probability
being given as much weight as the event on which it is conditioned has of
occurring. In fact, this equation can be generalized and is referred to as the
total probability theorem.

Total Probability Theorem

Let A1, ..., An be disjoint events that form a partition of the sample space
(each possible outcome is included in one and only one of the events
A1, ..., An) and assume that P(Ai) > 0 for all i = 1, ..., n. Then, for any
event B, we have

P(B) = P(A1 ∩ B) + ··· + P(An ∩ B)
     = P(A1)P(B|A1) + ··· + P(An)P(B|An).

The total probability theorem is often used in conjunction with the following
theorem, which relates conditional probabilities of the form P(A|B) with
conditional probabilities of the form P(B|A), in which the order of the
conditioning is reversed.

Bayes' Formula

Let A1, A2, ..., An be disjoint events that form a partition of the sample
space, and assume that P(Ai) > 0 for all i. Then, for any event B such
that P(B) > 0, we have

P(Ai|B) = P(Ai)P(B|Ai) / P(B)
        = P(Ai)P(B|Ai) / [P(A1)P(B|A1) + ··· + P(An)P(B|An)].

The beauty of this formula is that it allows us to turn around the role of the
conditioning and conditioned event: that is, it expresses P (B|A) in terms of
P (A|B). This is an extremely useful formula, because its use often enables us to
determine the probability of an event by first “conditioning” upon whether or
not some second event has occurred.

Example 2.4.4 You are not feeling well, and go to the doctor. The doctor sends
you to undergo a medical test which in the present case, medical statistics show
that patients who have this disease test positive 95% of the time, while patients
who do not have the disease test positive 2% of the time. In the general pop-
ulation, one in a thousand people in your age group have this disease. A week
after, the test results come back positive. Given this information, what is the
probability that you actually have the disease?
Let A be the event that the test result is positive and B the event that you have
the disease. Then P(B) = 0.001, P(A|B) = 0.95, and P(A|B^c) = 0.02. By
Bayes' formula,

P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]
       = (0.95 × 0.001) / (0.95 × 0.001 + 0.02 × (1 − 0.001))
       ≈ 0.0454.
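The arithmetic above can be reproduced directly; this short sketch (my own, not part of the notes) applies the total probability theorem for the denominator and Bayes' formula for the final answer:

```python
p_disease = 0.001         # P(B): prevalence in your age group
p_pos_disease = 0.95      # P(A|B): positive rate among the sick
p_pos_healthy = 0.02      # P(A|B^c): positive rate among the healthy

# Total probability theorem: P(A)
p_pos = p_pos_disease * p_disease + p_pos_healthy * (1 - p_disease)

# Bayes' formula: P(B|A)
p_disease_pos = p_pos_disease * p_disease / p_pos
print(round(p_disease_pos, 4))   # 0.0454
```

Despite the positive test, the probability of disease is under 5%, because false positives from the large healthy population dominate.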

Example 2.4.5 An insurance company believes that people can be divided into
two classes: those who are accident prone and those who are not. The company's
statistics show that an accident-prone person will have an accident at some time
within a fixed 1-year period with probability 0.4, whereas this probability
decreases to 0.2 for a person who is not accident prone. (a) If we assume that
30 percent of the population is accident prone, what is the probability that a
new policyholder will have an accident within a year of purchasing a policy?
Let A1 denote the event that the policyholder will have an accident within a
year of purchasing a policy, and let A denote the event that the policyholder is
accident prone. Then P(A) = 0.3, P(A1|A) = 0.4, and P(A1|A^c) = 0.2.

(a) By the total probability theorem,

P(A1) = P(A1|A)P(A) + P(A1|A^c)P(A^c) = 0.4 × 0.3 + 0.2 × (1 − 0.3) = 0.26.

Suppose that a new policyholder has an accident within a year of purchasing a
policy. What is the probability that he or she is accident prone?

(b) By Bayes' formula,

P(A|A1) = P(A1|A)P(A) / P(A1) = (0.4 × 0.3) / 0.26 = 6/13 ≈ 0.46.
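Both parts can be checked with exact rational arithmetic; a small sketch of my own:

```python
from fractions import Fraction

p_prone = Fraction(3, 10)        # P(A): accident prone
p_acc_prone = Fraction(4, 10)    # P(A1 | A)
p_acc_not = Fraction(2, 10)      # P(A1 | A^c)

# (a) total probability theorem
p_acc = p_acc_prone * p_prone + p_acc_not * (1 - p_prone)
print(p_acc)                     # 13/50, i.e. 0.26

# (b) Bayes' formula
p_prone_given_acc = p_acc_prone * p_prone / p_acc
print(p_prone_given_acc)         # 6/13
```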

2.5 Independence

In the special case where P(A|B) does in fact equal P(A), we say that A is
independent of B. That is, A is independent of B if knowledge that B has
occurred does not change the probability that A occurs.

Definition 5. Two events A and B are said to be independent if

P(A ∩ B) = P(A)P(B),

or equivalently, if P(A|B) = P(A ∩ B)/P(B) = P(A) when P(B) > 0.

A common first thought is that two events are independent if they are disjoint,
but in fact the opposite is true: two disjoint events A and B with P(A) > 0 and
P(B) > 0 are never independent, since their intersection A ∩ B is empty and has
probability 0.

Suppose now that A is independent of B and is also independent of C. Is A
then necessarily independent of B ∩ C?
Example 2.5.1 Two fair dice are thrown. Let A denote the event that the sum of
the dice is 7. Let B denote the event that the first die equals 4 and C denote the
event that the second die equals 3. Is event A independent of B and C? What
about the event B ∩ C?

We have P(A) = 6/36 = 1/6, P(B) = 1/6, and P(C) = 1/6. Since
P(A ∩ B) = P({(4, 3)}) = 1/36 = P(A)P(B) and
P(A ∩ C) = P({(4, 3)}) = 1/36 = P(A)P(C),
A is independent of B and of C. However,
P(A ∩ (B ∩ C)) = P({(4, 3)}) = 1/36 ≠ P(A)P(B ∩ C) = (1/6)(1/36),
so A is not independent of B ∩ C.
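This conclusion can be verified by enumerating the 36 equally likely outcomes; a sketch of my own:

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

def P(E):
    return Fraction(len(E), len(omega))

A = {w for w in omega if sum(w) == 7}   # sum of the dice is 7
B = {w for w in omega if w[0] == 4}     # first die equals 4
C = {w for w in omega if w[1] == 3}     # second die equals 3

# Pairwise: A is independent of B, and of C
assert P(A & B) == P(A) * P(B)
assert P(A & C) == P(A) * P(C)

# But A is NOT independent of B ∩ C: knowing B ∩ C forces the sum to be 7
print(P(A & B & C), P(A) * P(B & C))   # 1/36 vs 1/216
assert P(A & B & C) != P(A) * P(B & C)
```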

Definition 6. Three events A, B, and C are said to be independent if

P(A ∩ B ∩ C) = P(A)P(B)P(C)
P(A ∩ B) = P(A)P(B)
P(A ∩ C) = P(A)P(C)
P(B ∩ C) = P(B)P(C)

We may also extend the definition of independence to more than three events.
The events A1, A2, ..., An are said to be independent if for every subset
A1′, A2′, ..., Ar′, r ≤ n, of these events,

P(A1′ ∩ A2′ ∩ ··· ∩ Ar′) = P(A1′)P(A2′) ··· P(Ar′)

Example 2.5.2 The gambler's ruin problem: Two gamblers, A and B, bet
on the outcomes of successive flips of a coin. On each flip, if the coin comes up
heads, A collects 1 unit from B, whereas if it comes up tails, A pays 1 unit to B.
They continue to do this until one of them runs out of money. If it is assumed
that the successive flips of the coin are independent and each flip results in a
head with probability p, what is the probability that A ends up with all the
money if he starts with i units and B starts with N − i units?


Let E denote the event that A ends up with all the money, and write Pi for
P(E) when A starts with i units, so that P0 = 0 and PN = 1. Conditioning on
the outcome of the first flip and setting q = 1 − p,

Pi = P(E|H)P(H) + P(E|T)P(T) = p Pi+1 + q Pi−1,  i = 1, ..., N − 1.

Since p + q = 1, the left side equals (p + q)Pi, and the recursion rearranges to

Pi+1 − Pi = (q/p)(Pi − Pi−1).

Iterating from P1 − P0 = P1 gives Pi+1 − Pi = (q/p)^i P1, and summing these
differences,

Pi = P1 [1 + (q/p) + (q/p)² + ··· + (q/p)^{i−1}].

Using PN = 1 to solve for P1 and substituting back,

Pi = (1 − (q/p)^i) / (1 − (q/p)^N)  if p ≠ 1/2,
Pi = i/N                            if p = 1/2.
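The closed form can be sanity-checked against the recursion Pi = p·Pi+1 + q·Pi−1 it was derived from; a sketch of my own using exact rational arithmetic:

```python
from fractions import Fraction

def ruin(i, N, p):
    """P(A ends with all N units | A starts with i), p = P(heads)."""
    q = 1 - p
    if p == Fraction(1, 2):
        return Fraction(i, N)
    r = q / p
    return (1 - r**i) / (1 - r**N)

# Verify the formula satisfies P_i = p*P_{i+1} + q*P_{i-1}, P_0 = 0, P_N = 1
p, N = Fraction(2, 3), 10
P = [ruin(i, N, p) for i in range(N + 1)]
assert P[0] == 0 and P[N] == 1
for i in range(1, N):
    assert P[i] == p * P[i + 1] + (1 - p) * P[i - 1]

print(float(ruin(5, 10, p)))   # ≈ 0.9697: a favorable coin almost surely wins
```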

2.6 Random Variables

Given an experiment and the corresponding set of possible outcomes (the sample
space), a random variable associates a particular number with each outcome.
We refer to this number as the numerical value or the experimental value of the
random variable. Mathematically, a random variable is a real-valued function of
the experimental outcome.

Definition 7. A random variable taking values in D is a function X that
assigns a value X(ω) ∈ D to every possible outcome ω ∈ Ω.

Example 2.6.1 We flip three coins. A good sample space for this problem is
Ω = {HHH, THH, HTH, HHT, TTH, THT, HTT, TTT}.
Let X be the total number of heads that are flipped in this experiment.

Definition 8. A random variable X : Ω → D is called discrete if it takes
values in a finite or countable set D.

A random variable is called discrete if its range (the set of values that it can
take) is finite or at most countably infinite. A random variable that can take an
uncountably infinite number of values is not discrete. Note that a set is countable
if its elements can be put in a one-to-one correspondence with the sequence of
positive integers.

Remark: It is customary to denote random variables by uppercase letters X, Y, ...


and nonrandom values by lowercase letters x, y, i, j, ... For example, {X = x} is
the event that the random variable X takes the given value x.

Since the value of a random variable is determined by the outcome of the
experiment, we may assign probabilities to the possible values of the random
variable. For a discrete random variable X, the probabilities of the values that it
can take are captured by the probability mass function (pmf for short) of X,
denoted pX. In particular, if x is any possible value of X, the probability mass
of x, denoted pX(x), is the probability of the event {X = x} consisting of all
outcomes that give rise to a value of X equal to x:

pX(x) = P{X = x}

For example, let the experiment consist of two independent tosses of a fair coin,
and let X be the number of heads obtained. Then the pmf of X is
pX(x) = 1/4 if x = 0 or x = 2,
        1/2 if x = 1,
        0   otherwise.

Note that

Σ_x pX(x) = 1,

where in the summation above, x ranges over all the possible numerical values
of X. This follows from the additivity and normalization axioms, because the
events {X = x} are disjoint and form a partition of the sample space, as x ranges
over all possible values of X.

Example 2.6.2 The probability mass function of a random variable X is given
by p(i) = c λ^i / i!, i = 0, 1, 2, ..., where λ is some positive value. Find
(a) P{X = 0} and (b) P{X > 2}.
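Since Σ_i c λ^i/i! = c e^λ must equal 1, normalization forces c = e^{−λ}, so P{X = 0} = e^{−λ} and P{X > 2} = 1 − e^{−λ}(1 + λ + λ²/2). A numerical check of my own, with λ = 2 as an arbitrary illustrative value (the notes leave λ unspecified):

```python
import math

lam = 2.0                      # λ: an arbitrary illustrative value
c = math.exp(-lam)             # normalization forces c = e^{-λ}

def p(i):
    return c * lam**i / math.factorial(i)

# The pmf sums to 1 (truncating the tail at i = 100 is negligible)
assert abs(sum(p(i) for i in range(100)) - 1.0) < 1e-12

p0 = p(0)                              # (a) P{X = 0} = e^{-λ}
p_gt2 = 1 - (p(0) + p(1) + p(2))       # (b) P{X > 2}
print(p0, p_gt2)
assert abs(p_gt2 - (1 - math.exp(-lam) * (1 + lam + lam**2 / 2))) < 1e-12
```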

Example 2.6.3 Four balls are to be randomly selected, without replacement,
from an urn that contains 20 balls numbered 1 through 20. If X is the largest
numbered ball selected, then X is a random variable that takes on one of the
values 4, 5, ..., 20. Specify the pmf of X and determine P(X > 10).
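The maximum equals k exactly when ball k is drawn together with 3 balls from {1, ..., k−1}, so pX(k) = C(k−1, 3)/C(20, 4). A check of my own with exact arithmetic:

```python
from fractions import Fraction
from math import comb

total = comb(20, 4)

def pmf(k):
    # X = k iff ball k is drawn and the other 3 come from {1, ..., k-1}
    return Fraction(comb(k - 1, 3), total)

# The pmf sums to 1 over k = 4, ..., 20
assert sum(pmf(k) for k in range(4, 21)) == 1

# P(X > 10) = 1 - P(X <= 10) = 1 - C(10,4)/C(20,4)
p_gt10 = 1 - Fraction(comb(10, 4), total)
assert p_gt10 == sum(pmf(k) for k in range(11, 21))
print(p_gt10, float(p_gt10))   # 309/323 ≈ 0.9567
```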

Random variables whose set of possible values is uncountable are called
continuous. Examples include the time that a train arrives at a certain stop and
the lifetime of a transistor. Let X be such a random variable. We say that X is a
continuous random variable if there exists a nonnegative function fX, defined
for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers,

P(X ∈ B) = ∫_B fX(x) dx.
The function fX is called the probability density function of X, or pdf for
short. In words, the probability that X will be in B is obtained by integrating
the pdf over the set B. In particular, the probability that the value of X falls
within an interval is

P(a ≤ X ≤ b) = ∫_a^b fX(x) dx,

and can be interpreted as the area under the graph of the pdf.

For any single value a, we have P(X = a) = ∫_a^a fX(x) dx = 0. For this reason,
including or excluding the endpoints of an interval has no effect on its probability:

P(a ≤ X ≤ b) = P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b).

To interpret the pdf, note that for an interval [x, x + δ] with very small length
δ, we have

P([x, x + δ]) = ∫_x^{x+δ} fX(t) dt ≈ fX(x) · δ,

so we can view fX(x) as the "probability mass per unit length" near x.

Example 2.6.4 Consider a random variable X with pdf

fX(x) = 1/(2√x) if 0 < x ≤ 1,
        0       otherwise.

To verify that this is a valid pdf, note that fX(x) ≥ 0 and

∫_0^1 1/(2√x) dx = [√x]_0^1 = 1.
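The normalization can also be confirmed numerically; a small sketch of my own using a midpoint rule (which handles the integrable singularity at 0):

```python
import math

def f(x):
    # pdf from the example: 1/(2*sqrt(x)) on (0, 1], 0 elsewhere
    return 1 / (2 * math.sqrt(x)) if 0 < x <= 1 else 0.0

# Midpoint-rule check that the pdf integrates to 1 over (0, 1]
n = 1_000_000
h = 1.0 / n
total = sum(f((i + 0.5) * h) for i in range(n)) * h
print(total)   # ≈ 1.0
assert abs(total - 1.0) < 1e-3
```

Note that fX is unbounded near 0; a pdf may exceed 1, since only its integral is constrained.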

2.7 Expected Value and Variance

In many cases, we would like to know how large X is “on average”, that is, what
is the average value of X over many repeated experiments?

Definition 9. Let X : Ω → D be a discrete random variable having a
probability mass function p(x). The expectation of X is defined as

E[X] = Σ_{x∈D} x p(x).

In words, the expected value of X is a weighted average of the possible values
that X can take on, each value being weighted by the probability that X
assumes it.

Example 2.7.1 For any event A, define the indicator function 1A : Ω → {0, 1} as

I = 1A = 1 if A occurs,
         0 if A does not occur.

That is, 1A is the discrete random variable that takes the value 1 if A occurs and
takes the value 0 if A does not occur. The expectation of 1A is

E[I] = E[1A] = 1 · P{1A = 1} + 0 · P{1A = 0} = P(A).

We therefore see that the probability of an event A is just the expectation of the
random variable 1A. This observation is useful in many computations.

The expected value or mean of a continuous random variable X is defined by

E[X] = ∫_{−∞}^{∞} x fX(x) dx.

Similar to the discrete case, E[X] can be interpreted as the anticipated average
value of X in a large number of independent repetitions of the experiment.

Example 2.7.2 Find E[X] when the density function of X is

fX(x) = 2x if 0 ≤ x ≤ 1,
        0  otherwise.

There are many other quantities that can be associated with a random variable
and its pmf. For example, we define the 2nd moment of the random variable X
as the expected value of the random variable X². More generally, we define the
nth moment as E[Xⁿ], the expected value of the random variable Xⁿ. With
this terminology, the 1st moment of X is just the mean.

Suppose that we are given a discrete random variable along with its probability
mass function and that we want to compute the expected value of some function
of X, say, g(X). How can we accomplish this? One way is as follows:
Since g(X) is itself a discrete random variable, it has a probability mass function,
which can be determined from the probability mass function of X. Once we have
determined the probability mass function of g(X), we can compute E[g(X)] by
using the definition of expected value.

Example 2.7.3 Let X denote a random variable that takes on any of the values
−1, 0, and 1 with respective probabilities
P{X = −1} = 0.2, P{X = 0} = 0.5, P{X = 1} = 0.3.
Compute E[X²].
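Applying the weighted-average idea of Proposition 1 below with g(x) = x² gives E[X²] = (−1)²(0.2) + 0²(0.5) + 1²(0.3) = 0.5. A one-line check of my own:

```python
from fractions import Fraction

pmf = {-1: Fraction(2, 10), 0: Fraction(5, 10), 1: Fraction(3, 10)}

# E[g(X)] = sum of g(x) p(x), here with g(x) = x^2
e_x2 = sum(x**2 * p for x, p in pmf.items())
print(e_x2)   # 1/2
```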

There is another way of thinking about E[g(X)]: Since g(X) will equal g(x)
whenever X is equal to x, it seems reasonable that E[g(X)] should just be a
weighted average of the values g(x), with g(x) being weighted by the probability
that X is equal to x.

Proposition 1. Let X : Ω → D be a discrete random variable having a
probability mass function p(x), and let g : D → R be a real-valued function.
The expectation of g(X) is given by

E[g(X)] = Σ_{x∈D} g(x) p(x)

If X is a continuous random variable with a given pdf, then any real-valued
function Y = g(X) of X is also a random variable, and

E[g(X)] = ∫_{−∞}^{∞} g(x) fX(x) dx.

A simple logical consequence of Proposition 1 is the following: if a and b are
constants, then

E[aX + b] = aE[X] + b.

Example 2.7.4 A product that is sold seasonally yields a net profit of b dollars
for each unit sold and a net loss of l dollars for each unit left unsold when the
season ends. The number of units of the product that are ordered at a specific
department store during any season is a random variable having probability mass
function p(i), i ≥ 0. If the store must stock this product in advance, determine
the number of units the store should stock so as to maximize its expected profit.

Let X denote the number of units ordered. If s units are stocked, then the
profit P(s) can be expressed as

P(s) = bX − (s − X)l  if X ≤ s,
     = sb             if X > s.

Hence, the expected profit equals

E[P(s)] = Σ_{i=0}^{s} [bi − (s − i)l] p(i) + Σ_{i=s+1}^{∞} sb p(i)
        = (b + l) Σ_{i=0}^{s} i p(i) − sl Σ_{i=0}^{s} p(i) + sb [1 − Σ_{i=0}^{s} p(i)]
        = (b + l) Σ_{i=0}^{s} i p(i) − (b + l)s Σ_{i=0}^{s} p(i) + sb
        = sb + (b + l) Σ_{i=0}^{s} (i − s) p(i)

To determine the optimum value of s, let us investigate what happens to the
profit when we increase s by 1 unit. By substitution, we see that the expected
profit in this case is given by

E[P(s + 1)] = b(s + 1) + (b + l) Σ_{i=0}^{s+1} (i − s − 1) p(i)
            = b(s + 1) + (b + l) Σ_{i=0}^{s} (i − s − 1) p(i),

since the term i = s + 1 contributes zero. Therefore,

E[P(s + 1)] − E[P(s)] = b − (b + l) Σ_{i=0}^{s} p(i).

Thus, stocking s + 1 units will be better than stocking s units whenever

Σ_{i=0}^{s} p(i) < b / (b + l).

Because the left-hand side is increasing in s while the right-hand side is constant,
the inequality will be satisfied for all values of s ≤ s*, where s* is the largest
value of s satisfying the above inequality. Since

E[P(0)] < ··· < E[P(s*)] < E[P(s* + 1)] > E[P(s* + 2)] > ···

it follows that stocking s* + 1 items will lead to a maximum expected profit.
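The stopping criterion can be compared against a brute-force search over E[P(s)]. The sketch below is my own: the values b = 5, l = 2 and the Poisson(10) demand are hypothetical choices, since the notes leave p(i) generic.

```python
import math

b, l = 5.0, 2.0                     # hypothetical unit profit and unit loss
lam = 10.0                          # hypothetical Poisson(λ) demand
p = [math.exp(-lam) * lam**i / math.factorial(i) for i in range(100)]

def expected_profit(s):
    # E[P(s)] = sum_{i<=s} [b*i - (s-i)*l] p(i) + sum_{i>s} s*b*p(i)
    return (sum((b * i - (s - i) * l) * p[i] for i in range(s + 1))
            + sum(s * b * p[i] for i in range(s + 1, len(p))))

best = max(range(40), key=expected_profit)   # brute-force optimum

# Criterion from the derivation: stock s* + 1, where s* is the largest s
# with  sum_{i=0}^{s} p(i) < b / (b + l)
cdf, s_star = 0.0, -1
for s in range(40):
    cdf += p[s]
    if cdf < b / (b + l):
        s_star = s
best_by_criterion = s_star + 1

print(best, best_by_criterion)
assert best == best_by_criterion
```

With these numbers the critical ratio b/(b + l) = 5/7 ≈ 0.71 lands the optimal stock slightly above the mean demand of 10, as the criterion predicts.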

Although E[X] yields the weighted average of the possible values of X, it does
not tell us anything about the variation, or spread, of these values. Because we
expect X to take on values around its mean E[X], it would appear that a
reasonable way of measuring the possible variation of X would be to look at
how far apart X would be from its mean, on the average.

Definition 10. Let X : Ω → D be a discrete random variable having a
probability mass function p(x). The variance of X is defined as

Var(X) = E[(X − E[X])²]

and can be calculated as

Var(X) = Σ_{x∈D} (x − E[X])² p(x).

Its square root is denoted by σ_X and is called the standard deviation.

The variance or standard deviation provides a measure of dispersion of X
around its mean. Since (X − E[X])² can only take nonnegative values, the
variance is always nonnegative. But could it be zero? Since every term in
the formula Σ_x (x − E[X])² p(x) is nonnegative, the sum is zero if and only if
(x − E[X])² p(x) = 0 for every x. This condition implies that for any x with
p(x) > 0, we must have x = E[X], and the random variable X is not really
"random": its experimental value is equal to the mean E[X], with probability 1.

An alternative formula for Var(X) is derived as follows:

Var(X) = Σ_x (x − E[X])² p(x)
       = Σ_x (x² − 2xE[X] + (E[X])²) p(x)
       = Σ_x x² p(x) − 2E[X] Σ_x x p(x) + (E[X])² Σ_x p(x)
       = E[X²] − 2(E[X])² + (E[X])²
       = E[X²] − (E[X])²

Var(X) = E[X²] − (E[X])²
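Both forms of the variance can be checked on a small pmf; a sketch of my own using the heads-in-two-tosses distribution from Section 2.6:

```python
from fractions import Fraction

# pmf of the number of heads in two independent fair-coin tosses
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mean = sum(x * p for x, p in pmf.items())
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())  # E[(X - E[X])^2]
var_alt = sum(x**2 * p for x, p in pmf.items()) - mean**2   # E[X^2] - (E[X])^2

print(mean, var_def, var_alt)   # 1, 1/2, 1/2
assert var_def == var_alt
```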

Example 2.7.5 Suppose there are m days in a year, and that each person is
independently born on day r with probability pr, r = 1, ..., m, with Σ_{r=1}^m pr = 1.
Let Ai,j be the event that persons i and j are born on the same day.

(a) Find P(A1,3)
(b) Find P(A1,3|A1,2)
(c) Show P(A1,3|A1,2) ≥ P(A1,3)
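The notes do not solve this example. Independence of birthdays gives P(A1,3) = Σ_r pr², and since A1,2 ∩ A1,3 means all three share a day, P(A1,3|A1,2) = Σ_r pr³ / Σ_r pr². A numerical check of my own, with a hypothetical non-uniform distribution over m = 4 days:

```python
from fractions import Fraction

# a hypothetical non-uniform birthday distribution over m = 4 days
p = [Fraction(4, 10), Fraction(3, 10), Fraction(2, 10), Fraction(1, 10)]
assert sum(p) == 1

# (a) P(A_{1,3}) = sum_r p_r^2
p_a = sum(pr**2 for pr in p)

# (b) P(A_{1,3} | A_{1,2}) = sum_r p_r^3 / sum_r p_r^2
p_b = sum(pr**3 for pr in p) / sum(pr**2 for pr in p)

print(p_a, p_b)   # 3/10 and 1/3

# (c) learning of one shared birthday makes another match more likely
assert p_b >= p_a
```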

Example 2.7.6 Consider the case of a uniform pdf over an interval [a, b]. Compute
the variance.

A useful identity of variance is that for any constants a and b,

Var(aX + b) = a² Var(X).
Example 2.7.7 Consider a random variable Y with probability density function:

fY(y) = (1/25) y e^{−y/5} if y > 0,
        0                 otherwise.
(a) Specify the cumulative distribution function of the random variable Y.


(b) Compute E[Y] and Var(Y).
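Integrating by parts suggests the CDF F(y) = 1 − e^{−y/5}(1 + y/5) for y > 0 (this is my derivation, not worked in the notes), with E[Y] = 10 and Var(Y) = 50. A numerical cross-check:

```python
import math

def f(y):
    return y * math.exp(-y / 5) / 25 if y > 0 else 0.0

def F(y):
    # candidate CDF from integration by parts: 1 - e^{-y/5}(1 + y/5)
    return 1 - math.exp(-y / 5) * (1 + y / 5) if y > 0 else 0.0

# Midpoint-rule checks on [0, 200] (the tail beyond 200 is negligible)
n = 200_000
h = 200.0 / n
ys = [(i + 0.5) * h for i in range(n)]
mass = sum(f(y) for y in ys) * h
mean = sum(y * f(y) for y in ys) * h
var = sum(y**2 * f(y) for y in ys) * h - mean**2

assert abs(mass - 1.0) < 1e-6                               # valid pdf
assert abs(F(10) - sum(f(y) for y in ys if y < 10) * h) < 1e-4
print(mean, var)   # ≈ 10 and ≈ 50
```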
