
Probability Review

January 26, 2023

1 / 96
Outline

• Textbook: Chapter 1 - Shreve I


• Review some basic concepts in probability
• Sample space and probability
• Filtration as record of information
• Random variable and expectation
• Conditional distribution and conditional expectation

2 / 96
Random process

A random process (or stochastic process ) is a mathematical model of a probabilistic


experiment that evolves in time and generates a sequence of numerical values.
• the sequence of daily prices of a stock
• the sequence of scores in a football game
• the sequence of failure times of a machine
Each numerical value in the sequence is modeled by a random variable, so a stochastic
process is simply a (finite or infinite) sequence of random variables

3 / 96
Why study random processes?

• Many phenomena can be modelled by random processes: understanding random
processes helps us understand those phenomena.
• Wide applications: finance, actuarial science...
• Provides a foundation for later courses: Financial Mathematics 1 and 2.

4 / 96
Course outline

• Review probability
• Introduction to random process
• Introduction to some special random process
• Poisson process
• Markov chain
• Random walk
• Brownian motion
• Stochastic calculus
• Itô integral
• Itô’s formula for functions of Brownian motion and of an Itô process
• Solve stochastic differential equations

5 / 96
Plan

1 Probability space
Probability space

2 Random variables
Random variables
Simulation
Expectation and Variance

3 Random vectors

4 Conditional distribution and conditional expectation


Conditional distribution
Conditional Expectation

6 / 96
Probability models

• Random experiment: produces uncertain outcomes under the same conditions.

• An outcome ω: a single result of the experiment.
• The sample space Ω: the set of all possible outcomes of the experiment.
• An event: a collection of outcomes.
• Probability measure P : assigns a likelihood to each event.
• P (A): how likely it is that event A occurs
7 / 96
Axiom of Probability

A rule P that assigns a number to each event of a sample space Ω is a probability
measure on Ω if it satisfies
1 0 ≤ P (A) ≤ 1 for every event A
2 P (Ω) = 1
3 If A1 , A2 , . . . are mutually exclusive (also called disjoint, i.e. Ai ∩ Aj = ∅ for all i ≠ j)
then

P ( ∪_{n=1}^{∞} An ) = Σ_{n=1}^{∞} P (An )

8 / 96
Probability measure on finite sample space

• Sample space with finitely many elements

Ω = {x1 , . . . , xn }

• For any event A,

P (A) = Σ_{xi ∈ A} p(xi )

• The probability p(xi ) of each outcome must satisfy

1 0 ≤ p(xi ) ≤ 1 for all i
2 p(x1 ) + · · · + p(xn ) = 1

9 / 96
Example 1

• Experiment: toss a fair coin three times


• Sample space: all possible outcomes of the three coin tosses

Ω = {HHH, HHT, HT H, HT T, T HH, T HT, T T H, T T T }

1 Probability of individual outcomes?


2 Probability of H on the first toss?

10 / 96
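
A short Python sketch (an addition to the slide; Python is assumed here, not used in the original) that enumerates the eight equally likely outcomes and answers both questions by summing outcome probabilities:

from itertools import product

omega = [''.join(t) for t in product('HT', repeat=3)]   # sample space of three tosses
p = {w: 1/8 for w in omega}                              # each outcome has probability 1/8

print(p['HHH'])                                          # 0.125: probability of an individual outcome
print(sum(p[w] for w in omega if w[0] == 'H'))           # 0.5: probability of H on the first toss
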
Additive rule

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

(A ∪ B: at least one of the two events occurs; A ∩ B: both events occur)

Additive rule for disjoint sets

If A and B are disjoint, i.e. A ∩ B = ∅, then they cannot occur simultaneously and

P (A ∪ B) = P (A) + P (B)

Complement rule
The complement of A is Ac = Ω − A, the event containing all outcomes that are not in A

P (Ac ) = 1 − P (A)

11 / 96
Conditional probability

For events A and B, the conditional probability of A given B is

P (A|B) = P (A ∩ B) / P (B)

defined for P (B) > 0.

Measure the likelihood of A in the new sample space B

12 / 96
Example
Roll a fair die twice. What is the probability that the first roll is a 2 given that the
sum of the rolls is 7?
Solution
• Sample space Ω = {(i, j) : 1 ≤ i, j ≤ 6}
• A: the first roll is a 2
• B: the sum of the rolls is 7
• Need to find P (A|B)
• B = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
• A = {(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6)}
• AB = {(2, 5)}

P (A|B) = P (AB) / P (B) = (1/36) / (6/36) = 1/6

13 / 96
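
The same computation done by enumeration, as a Python sketch (an addition, not part of the slides):

from itertools import product

omega = list(product(range(1, 7), repeat=2))     # 36 equally likely outcomes (i, j)
B = [w for w in omega if w[0] + w[1] == 7]       # sum of the rolls is 7
AB = [w for w in B if w[0] == 2]                 # first roll is 2 and sum is 7
print(len(AB) / len(B))                          # 0.1666... = 1/6
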
Practice

Roll a fair die twice. What is the probability that the first roll is a 2 given that the
second roll is even?
Does the result of the first roll affect the second roll?

14 / 96
Independence

Two events A and B are independent if

P (A|B) = P (A) for P (B) > 0

There is no relationship between A and B: knowing that B occurs does not change
the probability that A happens.

Equivalent condition
P (AB) = P (A)P (B)
The second condition can also be used to check the independence of A and B when
P (B) = 0

15 / 96
Multiplication rule
1 P (AB) = P (B)P (A|B)
2 General case

P (A1 A2 ...Ak ) = P (A1 )P (A2 |A1 )P (A3 |A1 A2 )...P (Ak |A1 ...Ak−1 )

16 / 96
Law of total probability
Let A1 , ..., Ak be a partition of the sample space:
• Ai ∩ Aj = ∅ for all i ≠ j (mutually exclusive (disjoint))
• ∪i Ai = Ω
Then, for any event B, we have

P (B) = Σ_{i=1}^{k} P (B ∩ Ai ) = Σ_{i=1}^{k} P (B|Ai )P (Ai ).

17 / 96
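
A small numerical illustration of the law of total probability (an added sketch with illustrative numbers, not from the slides): partition by the value of the first of two fair dice and let B be "the sum of the two rolls is 7".

# P(Ai) = 1/6 for each first-roll value i, and for each i exactly one
# second roll gives a sum of 7, so P(B|Ai) = 1/6 as well.
p_B = sum((1/6) * (1/6) for i in range(1, 7))
print(p_B)                                       # 0.1666... = 1/6, consistent with the earlier example
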
Plan

1 Probability space
Probability space

2 Random variables
Random variables
Simulation
Expectation and Variance

3 Random vectors

4 Conditional distribution and conditional expectation


Conditional distribution
Conditional Expectation

18 / 96
Random variables

Definition
A random variable is a function on the sample space

X: Ω→R
ω → X(ω)

Classify random variables


Range(X) = {X(ω) for ω ∈ Ω} is the set of all possible values of random variable X
• If Range(X) is a countable set then X is called a discrete random variable.
• If Range(X) is an uncountable set then X is called a continuous random variable.

19 / 96
Example - Binomial Asset Pricing Model
u: up factor, d: down factor

• Initial stock price S0

• Next period
• Upward: uS0
• Downward: dS0
where 0 < d < 1 < u (here d = 1/u)
• Toss a coin
• Head: move up
• Tail: move down
• With the values used in the examples below (S0 = 4, u = 2, d = 1/2),
Range(S1 ) = {2, 8}, so S1 is a discrete RV

20 / 96
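
A minimal simulation sketch of the one-period model (an addition; S0 = 4, u = 2, d = 1/2 are the values used in the examples later in the deck):

import random

S0, u, d = 4.0, 2.0, 0.5

def one_period_price(p_head=0.5):
    # Toss a coin: head -> move up to u*S0, tail -> move down to d*S0.
    return u * S0 if random.random() < p_head else d * S0

print(sorted({one_period_price() for _ in range(1000)}))   # almost surely [2.0, 8.0] = Range(S1)
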
Definition (Cumulative distribution function (cdf))
Probability that the values of X does not exceed a given value

F (x) = P (X ≤ x)

Evaluate probability of a RV with cdf


• P (X > x) = 1 − F (x)
• P (X < x) = lim_{t→x−} F (t) = F (x− )
• P (a < X ≤ b) = F (b) − F (a)
• P (a ≤ X ≤ b) = F (b) − F (a− )

21 / 96
Properties of cdf

• Increasing
• lim_{x→−∞} F (x) = 0 and lim_{x→∞} F (x) = 1
• Has left limits
• Right-continuous

22 / 96
Distribution of a discrete random variable
Range(X): countable set x1 , x2 , ...
Definition (Probability mass function (pmf))
pmf of X is a function pX given by

pX (xi ) = P (X = xi )

which satisfies
1 0 ≤ pX (xi ) ≤ 1 for all i
2 Σ_i pX (xi ) = 1

Properties
(1) P (a ≤ X ≤ b) = Σ_{a ≤ xi ≤ b} pX (xi )        (2) cdf F (x) = Σ_{xi ≤ x} pX (xi )
23 / 96
Bernoulli distribution

A random variable X has a Bernoulli distribution if it is the indicator random variable
of a trial with success probability p and failure probability 1 − p:

P (X = 0) = 1 − p
P (X = 1) = p

Denote X ∼ Ber(p)

24 / 96
Example - Bernoulli distribution

Toss a fair coin. Let X be the number of H.


Probability mass function (pmf) of X

x 0 1
P (X = x) P (T ) = 1/2 P (H) = 1/2

25 / 96
Binomial distribution

• n independent trials, each is Ber(p).

• X is the number of successes
• X is called a Binomial RV with parameters (n, p)
• Denote X ∼ Bino(n, p).

P (X = k) = C(n, k) p^k (1 − p)^(n−k) ,   k = 0, 1, . . . , n

where C(n, k) = n!/(k!(n − k)!) is the binomial coefficient

26 / 96
Example - Binomial distribution

Toss a fair coin twice. Pmf of the number of H


x P (X = x)
0 P (T T ) = 1/4
1 P (T H) + P (HT ) = 1/2
2 P (HH) = 1/4

27 / 96
Example - Binomial distribution

Toss a fair coin 3 times. Pmf of the number of H


x P (X = x)
0 P (T T T ) = 1/8
1 P (T T H) + P (T HT ) + P (HT T ) = 3/8
2 P (HHT ) + P (HT H) + P (T HH) = 3/8
3 P (HHH) = 1/8

28 / 96
Example - Payoff of European Call Option

• European call option: confers the right to buy the stock at maturity or
expiration time T = 2 for strike price K = 14 dollars. It is worth
• S2 − K if S2 − K > 0
• 0 otherwise
• Value (payoff) of the option at maturity

C2 = max(S2 − K, 0) = (S2 − K)+

where x+ = max(x, 0)
• Stock price: binomial model with S0 = 4, d = 1/2, u = 2,
p = p(H) = 1/2, q = p(T ) = 1 − p = 1/2
• Find probability distribution for payoff C2 of the European call option.

29 / 96
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of C2

x 0 2
P (C2 = x) 3/4 1/4

30 / 96
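
A sketch (an addition to the slides) that derives the pmf of S2 and of the call payoff C2 = (S2 − K)+ by enumerating the four equally likely coin-toss paths of the two-period model:

from itertools import product
from collections import Counter

S0, u, d, K = 4.0, 2.0, 0.5, 14.0
paths = list(product('HT', repeat=2))                 # HH, HT, TH, TT, each with probability 1/4

def price_S2(path):
    s = S0
    for toss in path:
        s *= u if toss == 'H' else d
    return s

pmf_S2 = Counter(price_S2(w) for w in paths)          # counts out of 4 paths
pmf_C2 = Counter(max(price_S2(w) - K, 0.0) for w in paths)
print({x: c / 4 for x, c in pmf_S2.items()})          # {16: 1/4, 4: 1/2, 1: 1/4}
print({x: c / 4 for x, c in pmf_C2.items()})          # {2: 1/4, 0: 3/4}
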
Example - European Put Option

• European put option: confers the right to sell the stock at maturity or
expiration time T = 2 for strike price K = 3 dollars. It is worth
• K − S2 if K − S2 > 0
• 0 otherwise
• Value (payoff) of the option at maturity

P2 = max(K − S2 , 0) = (K − S2 )+

• Stock price: binomial model with S0 = 4, d = 1/2, u = 2, p(H) = p(T ) = 1/2


• Find probability distribution for payoff P2 of the European put option.

31 / 96
Solution

pmf of S2

x 1 4 16
P (S2 = x) 1/4 1/2 1/4

pmf of P2

x 0 2
P (P2 = x) 3/4 1/4

32 / 96
Practice

Find the cdf of the stock price S2 , the European call option C2 and the European put
option P2 in the previous example

33 / 96
Distribution of a continuous random variable
Range(X): uncountable
Definition (Probability density function (pdf))
The pdf of X is a function f that satisfies
1 f (x) ≥ 0 for all x
2 P (−∞ < X < ∞) = ∫_{−∞}^{∞} f (x) dx = 1

Properties
• P (X = a) = P (a ≤ X ≤ a) = ∫_a^a f (x) dx = 0
• P (a ≤ X ≤ b) = ∫_a^b f (x) dx
• P (a ≤ X ≤ b) = F (b) − F (a)
• cdf

F (a) = P (X ≤ a) = ∫_{−∞}^{a} f (x) dx

34 / 96
Probability as an Area

Note that probability of any individual value is 0


35 / 96
Interpretation of p.d.f

P (a − ϵ/2 ≤ X ≤ a + ϵ/2) = ∫_{a−ϵ/2}^{a+ϵ/2} f (x) dx ≈ ϵ f (a)

f (a) is a measure of how likely it is that the random variable will be near a.

36 / 96
Continuous uniform distribution

• X is uniformly distributed on [a, b] if its pdf is

fX (x) = 1/(b − a) if x ∈ [a, b], and 0 otherwise

Any value in [a, b] is equally likely to be a value of X.

Denote X ∼ Uni[a, b].
• cdf

F (x) = 0 for x < a,   (x − a)/(b − a) for a ≤ x < b,   1 for x ≥ b

37 / 96
Exponential distribution with parameter λ

• pdf

f (x) = 0 for x < 0,   λe^(−λx) for x ≥ 0

• cdf

F (x) = 0 for x < 0,   1 − e^(−λx) for x ≥ 0

38 / 96
Normal distribution

Definition
A continuous RV X is said to be normally distributed with parameters µ and σ² if its
pdf is

f (x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²)) ,   −∞ < x < ∞

Denote X ∼ N (µ, σ²).

Properties
1 E(X) = µ
2 V ar(X) = σ 2

39 / 96
Simulate random number

Simulate uniform random numbers

• Simulate 1 random number from the uniform distribution: rand
• Simulate an m × n matrix of random numbers from the uniform distribution: rand(m,n)

Simulate random numbers from typical distributions

Search Google for the corresponding generators: binomial, Poisson, exponential, normal, ...

40 / 96
Practice

• Generate 10000 uniform random numbers

• Plot a histogram of the generated sample to compare it with the pdf of the uniform
distribution
• Do the same for binomial, Poisson, exponential and normal random numbers (see the
sketch below)

41 / 96
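
A sketch of this practice item in Python/NumPy (the slide's rand and rand(m,n) are MATLAB-style calls; numpy.random is used here as an assumed equivalent):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng()
u = rng.uniform(0, 1, size=10_000)          # 10000 uniform random numbers

plt.hist(u, bins=50, density=True)          # histogram scaled to compare with the pdf
plt.axhline(1.0, color='red')               # the pdf of Uni[0, 1] is the constant 1 on [0, 1]
plt.show()

# Similarly: rng.binomial(n, p, size), rng.poisson(lam, size),
# rng.exponential(1/lam, size), rng.normal(mu, sigma, size)
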
Definition (Expectation of random variable X)

E(X) = Σ_k xk P (X = xk ),   if X is discrete with countable values
E(X) = ∫_{−∞}^{∞} x fX (x) dx,   if X is continuous, where fX is the pdf of X

Expectation of a function of a random variable

E(g(X)) = Σ_k g(xk ) P (X = xk ),   if X is discrete
E(g(X)) = ∫_{−∞}^{∞} g(x) fX (x) dx,   if X is continuous

42 / 96
Definition (Variance of random variable X)

V ar(X) = E[(X − E[X])²]

Property

V ar(X) = E[X²] − (E[X])²

where
E(X²) = Σ_k xk² P (X = xk ),   if X is a discrete random variable
E(X²) = ∫_{−∞}^{∞} x² fX (x) dx,   if X is a continuous random variable
43 / 96
Some properties of expectation and variance

• E(g(X) + h(X)) = E(g(X)) + E(h(X))


• E[aX + b] = aE[X] + b
• V ar[aX + b] = a2 V ar[X]

44 / 96
Example

Find the expectation and variance of the stock price S2 , the European call option C2
and the European put option P2 in the previous example (a short sketch follows)

45 / 96
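
A sketch (an addition) computing expectation and variance directly from the pmf tables of the previous example:

def mean_var(pmf):
    m = sum(x * p for x, p in pmf.items())            # E(X)
    v = sum(x**2 * p for x, p in pmf.items()) - m**2  # E(X^2) - (E(X))^2
    return m, v

pmf_S2 = {1: 0.25, 4: 0.5, 16: 0.25}
pmf_C2 = {0: 0.75, 2: 0.25}
pmf_P2 = {0: 0.75, 2: 0.25}

print(mean_var(pmf_S2))   # (6.25, 33.1875)
print(mean_var(pmf_C2))   # (0.5, 0.75)
print(mean_var(pmf_P2))   # (0.5, 0.75)
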
Plan

1 Probability space
Probability space

2 Random variables
Random variables
Simulation
Expectation and Variance

3 Random vectors

4 Conditional distribution and conditional expectation


Conditional distribution
Conditional Expectation

46 / 96
Joint distribution of two discrete random variables

Definition (Joint probability mass function)

pX,Y (x, y) = P (X = x, Y = y)

Properties
For constants a < b and c < d,

P (a ≤ X ≤ b, c ≤ Y ≤ d) = Σ_{x=a}^{b} Σ_{y=c}^{d} P(X = x, Y = y)

47 / 96
Marginal pmf

The pmfs of X and Y are given by

PX (X = x) = Σ_y P(X = x, Y = y)

and

PY (Y = y) = Σ_x P(X = x, Y = y)

48 / 96
Example
Consider binomial asset pricing model
• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3

Find the joint pmf of the stock price (S1 , S2 )


49 / 96
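
A sketch for this example (an addition to the slide): build the joint pmf of (S1, S2) by enumerating the two coin tosses with p(H) = 1/3.

from itertools import product
from collections import defaultdict

S0, u, d, pH = 4.0, 2.0, 0.5, 1/3

joint = defaultdict(float)
for t1, t2 in product('HT', repeat=2):
    S1 = S0 * (u if t1 == 'H' else d)
    S2 = S1 * (u if t2 == 'H' else d)
    joint[(S1, S2)] += (pH if t1 == 'H' else 1 - pH) * (pH if t2 == 'H' else 1 - pH)

print(dict(joint))   # {(8,16): 1/9, (8,4): 2/9, (2,4): 2/9, (2,1): 4/9}
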
Joint distribution of two continuous random variables

Definition (Joint pdf)

The joint probability density function of two continuous random variables X and Y ,
denoted by fX,Y (x, y), is a function which satisfies the following properties:
• fX,Y (x, y) ≥ 0 for all x, y
• ∫_{−∞}^{+∞} ∫_{−∞}^{+∞} fX,Y (x, y) dx dy = 1
• P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y) dx dy

In particular P (a ≤ X ≤ b, c ≤ Y ≤ d) = ∫_a^b ∫_c^d fX,Y (x, y) dy dx

50 / 96
Probability as a volume
P ((X, Y ) ∈ R) = ∫∫_R fX,Y (x, y) dx dy

51 / 96
Definition (Marginal pdf)
The marginal pdfs of X and Y are given by

fX (x) = ∫_{−∞}^{∞} fX,Y (x, y) dy

and

fY (y) = ∫_{−∞}^{∞} fX,Y (x, y) dx

Definition (Joint cdf)

FX,Y (a, b) = P (X ≤ a, Y ≤ b) = ∫_{−∞}^{a} ∫_{−∞}^{b} fX,Y (x, y) dy dx

Relationship between joint pdf and joint cdf

fX,Y (x, y) = ∂²FX,Y (x, y) / ∂x∂y
52 / 96
Independence of random variables
• X and Y are independent if and only if

FX,Y (x, y) = FX (x)FY (y)

for all x, y
• If X and Y are discrete RV then X and Y are independent if and only if

P (X = x, Y = y) = P (X = x)P (Y = y)

for all x, y
• If X and Y are continuous RV then X and Y are independent if and only if

fX,Y (x, y) = fX (x)fY (y)

for all x, y
53 / 96
Definition (Expectation of a function of a random vector)

E[g(X, Y )] = Σ_x Σ_y g(x, y) P (X = x, Y = y),   if X, Y are discrete
E[g(X, Y )] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) fX,Y (x, y) dx dy,   if X, Y are continuous

Example

E[XY ] = Σ_x Σ_y xy P (X = x, Y = y),   if X, Y are discrete
E[XY ] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy fX,Y (x, y) dx dy,   if X, Y are continuous

54 / 96
Covariance and correlation coefficient

• Covariance of X and Y is cov(X, Y ) = E[XY ] − E[X]E[Y ].

• Correlation coefficient of X and Y is given by

corr(X, Y ) = cov(X, Y ) / (σX σY )

where σX , σY are the standard deviations of X and Y , and

−1 ≤ corr(X, Y ) ≤ 1

The correlation coefficient measures how strong the linear relationship between
X and Y is

55 / 96
Bivariate normal distribution

Let X1 ∼ N (µ1 , σ1²) and X2 ∼ N (µ2 , σ2²) be two normal random variables with
covariance σ12. The random vector X = (X1 , X2 )ᵀ has a bivariate normal distribution
if its joint pdf is given by

f (x1 , x2 ) = 1/(2π det(Σ)^(1/2)) · exp( −(1/2) (x − µ)ᵀ Σ⁻¹ (x − µ) )

where
• µ = (µ1 , µ2 )ᵀ
• Σ = [ σ1²  σ12 ; σ12  σ2² ] is the variance-covariance matrix of X
56 / 96
Properties

• Cov(X, Y ) = Cov(Y, X)
• Cov(X, X) = V ar(X)
• Cov(aX, Y ) = aCov(X, Y )
• Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z)
• If X and Y are independent then Cov(X, Y ) = 0

58 / 96
Variance of Sum

X1 , . . . , Xn : RVs

V ar( Σ_{i=1}^{n} Xi ) = Σ_{i=1}^{n} V ar(Xi ) + Σ_{i≠j} Cov(Xi , Xj )

In particular, for two RVs,

V ar(X + Y ) = V ar(X) + V ar(Y ) + 2Cov(X, Y )

59 / 96
Sum of normal distributions

• If (X, Y ) has a multivariate normal distribution, X ∼ N (µX , σX²), Y ∼ N (µY , σY²)
and Cov(X, Y ) = σXY , then

X + Y ∼ N (µX + µY , σX² + σY² + 2σXY )

• If X ∼ N (µX , σX²), Y ∼ N (µY , σY²) and X and Y are independent, then

X + Y ∼ N (µX + µY , σX² + σY²)

60 / 96
Example

Let X ∼ N (0, 4) and Y ∼ N (0, 1) be the daily returns of stock A and stock B respectively.

Suppose that Cov(X, Y ) = 2 and that the joint distribution of X and Y is a multivariate
normal distribution.
1 Consider a portfolio consisting of 70% stock A and 30% stock B. Then the return
of the portfolio is given by
W = 0.7X + 0.3Y
Determine the distribution of W .
2 Suppose that we aim to allocate stock A and B with weight a and 1 − a. The
return of portfolio is
U = aX + (1 − a)Y
Determine a to minimize risk of the portfolio.

61 / 96
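
A quick numerical check of both parts (an added sketch; only the variance rules from the previous slides are used):

import numpy as np

var_X, var_Y, cov_XY = 4.0, 1.0, 2.0

# Part 1: W = 0.7X + 0.3Y is normal; compute its mean and variance.
mean_W = 0.7 * 0 + 0.3 * 0
var_W = 0.7**2 * var_X + 0.3**2 * var_Y + 2 * 0.7 * 0.3 * cov_XY
print(mean_W, var_W)                       # 0.0 2.89, i.e. W ~ N(0, 2.89)

# Part 2: Var(U) = a^2 Var(X) + (1-a)^2 Var(Y) + 2a(1-a) Cov(X, Y); grid search on [0, 1].
a = np.linspace(0, 1, 1001)
var_U = a**2 * var_X + (1 - a)**2 * var_Y + 2 * a * (1 - a) * cov_XY
print(a[np.argmin(var_U)], var_U.min())    # a = 0.0 with Var(U) = 1.0: put all weight on stock B
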
Solution

1 W is normally distributed with

• E(W ) = 0.7E(X) + 0.3E(Y ) = 0

• V ar(W ) = V ar(0.7X) + V ar(0.3Y ) + 2Cov(0.7X, 0.3Y )
= (0.7)² V ar(X) + (0.3)² V ar(Y ) + 2(0.21)Cov(X, Y )
= (0.7)² × 4 + (0.3)² × 1 + (0.42) × 2 = 2.89

so W ∼ N (0, 2.89).

2 Evaluate risk via the variance

V ar(U ) = 4a² + (1 − a)² + 4a(1 − a)

Need to find a ∈ [0, 1] to minimize V ar(U )

62 / 96
Plan

1 Probability space
Probability space

2 Random variables
Random variables
Simulation
Expectation and Variance

3 Random vectors

4 Conditional distribution and conditional expectation


Conditional distribution
Conditional Expectation

63 / 96
Conditioning a RV on an event

The conditional pmf of a RV X given an event A is

pX|A (x) = P (X = x|A) = P ((X = x) ∩ A) / P (A)

if P (A) > 0

64 / 96
Example

Let X be the roll of a fair die and let A be the event that the roll is an even number.
Then

pX|A (1) = P (X = 1 and roll is even) / P (roll is even) = 0

65 / 96
Example

Let X be the roll of a fair die and let B be the event that the roll is an even number.
Then

pX|B (2) = P (X = 2 and roll is even) / P (roll is even) = (1/6) / (1/2) = 1/3

66 / 96
Conditional of a discrete RV on another

• 2 RVs X and Y
• given Y = y with P (Y = y) > 0
• conditional pmf of X:

pX|Y (x|y) = P (X = x|Y = y) = P (X = x, Y = y) / P (Y = y)

67 / 96
For each y, we view the joint pmf along the slice Y = y and renormalize so that

Σ_x pX|Y (x|y) = 1

68 / 96
Example
Consider a binomial asset pricing model with S0 = 4, d = 1/2, u = 2, p = 2/3 and
q = 1/3. Find the conditional pmf of S2 given S1 = 2.

69 / 96
Solution

x                        1      4      8
PS2|S1=2 (x|2)          1/3    2/3     0

70 / 96
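
A sketch (an addition) that recovers this conditional pmf from the joint pmf, using p = 2/3 for an up move and q = 1/3 for a down move as in the example:

from itertools import product
from collections import defaultdict

S0, u, d, p = 4.0, 2.0, 0.5, 2/3

joint = defaultdict(float)
for t1, t2 in product([True, False], repeat=2):          # True = up move
    S1 = S0 * (u if t1 else d)
    S2 = S1 * (u if t2 else d)
    joint[(S1, S2)] += (p if t1 else 1 - p) * (p if t2 else 1 - p)

p_S1_is_2 = sum(prob for (s1, _), prob in joint.items() if s1 == 2.0)
cond = {s2: prob / p_S1_is_2 for (s1, s2), prob in joint.items() if s1 == 2.0}
print(cond)                                              # {4.0: 2/3, 1.0: 1/3}
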
Practice

Consider a binomial asset pricing model with S0 = 4, u = 2, d = 1/2, p = 2/3 and
q = 1/3. Find the conditional pmf of S1 given S2 = 4.

71 / 96
Practice

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p = 1/3, q = 2/3
Find
1 The conditional probability of S2 = 16 given that S1 = 8
2 The conditional probability of S3 = 8 given that S1 = 8

72 / 96
Conditional distributions

• If X and Y are jointly distributed discrete random variables, then the conditional
probability mass function of Y given X = x is

P (Y = y|X = x) = P (X = x, Y = y) / P (X = x),   defined when P (X = x) > 0.

• For continuous random variables X and Y , the conditional density function of Y
given X = x is

fY |X (y|x) = f (x, y) / fX (x),

which provides the likelihood that Y takes values near y given that X takes values
near x

73 / 96
Example
Let X1 and X2 be jointly normal random variables with parameters µ1 , σ1 , µ2 , σ2 , and
ρ. The joint pdf is given by

f (x1 , x2 ) = 1/(2π σ1 σ2 √(1 − ρ²)) · exp( −1/(2(1 − ρ²)) [ (x1 − µ1 )²/σ1² + (x2 − µ2 )²/σ2² − 2ρ(x1 − µ1 )(x2 − µ2 )/(σ1 σ2 ) ] )

X1 has a normal distribution N (µ1 , σ1²) with pdf

fX1 (x1 ) = 1/(√(2π) σ1 ) · e^( −(x1 − µ1 )²/(2σ1²) )

The conditional pdf of X2 given X1 = x1 is

fX2 |X1 (x2 |x1 ) = f (x1 , x2 ) / fX1 (x1 ) = ...

74 / 96
Another way to find the conditional distribution of X2 given X1 = x1
Using the construction

X1 = µ1 + σ1 Z1
X2 = µ2 + σ2 (ρZ1 + √(1 − ρ²) Z2 )

with Z1 , Z2 independent standard normals. Given X1 = x1 , we have

Z1 = (x1 − µ1 )/σ1

Then

X2 = µ2 + σ2 ( ρ (x1 − µ1 )/σ1 + √(1 − ρ²) Z2 ) = µ2 + ρσ2 (x1 − µ1 )/σ1 + σ2 √(1 − ρ²) Z2

75 / 96
Since Z2 and Z1 are independent, knowing Z1 does not provide any information on Z2 .
It follows that given X1 = x1 , X2 is a linear function of Z2 , thus it is normal with mean

µ2 + ρσ2 (x1 − µ1 )/σ1

and variance

σ2² (1 − ρ²)

So

X2 |(X1 = x1 ) ∼ N ( µ2 + ρσ2 (x1 − µ1 )/σ1 , σ2² (1 − ρ²) )

76 / 96
Conditional expectation of Y |X = x

• E(Y |X = x) = Σ_y y P (Y = y|X = x),   if Y is discrete
  E(Y |X = x) = ∫_{−∞}^{∞} y fY |X (y|x) dy,   if Y is continuous
• E(Y |X = x) is a function of x, i.e., the result depends on the value of x

77 / 96
Example

Consider binomial asset pricing model


• S0 = 4, u = 2, d = 1/2
• p(H) = 1/3, p(T ) = 2/3
Find
1 The conditional expectation of S2 given that S1 = 8.
2 The conditional expectation of S3 given that S1 = 8

78 / 96
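
A sketch for this example (an addition; p(H) = 1/3 and p(T) = 2/3 as stated above): average the price over the remaining coin tosses.

from itertools import product

u, d, pH = 2.0, 0.5, 1/3

def cond_expect(s_now, steps):
    # E(price after `steps` further tosses | current price s_now)
    total = 0.0
    for tosses in product('HT', repeat=steps):
        prob, s = 1.0, s_now
        for t in tosses:
            prob *= pH if t == 'H' else 1 - pH
            s *= u if t == 'H' else d
        total += prob * s
    return total

print(cond_expect(8.0, 1))   # E(S2 | S1 = 8) ≈ 8.0
print(cond_expect(8.0, 2))   # E(S3 | S1 = 8) ≈ 8.0 (each step has mean factor (1/3)*2 + (2/3)*(1/2) = 1)
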
Example

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the
second year of stock A. Suppose W1 and W2 are independent. The cumulative log-return
of this stock is

B1 = W1
and
B2 = W1 + W2 .

Given that B1 = 1, find the conditional expectation of B2 .

79 / 96
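
A Monte Carlo check of this example (an added sketch): E(B2 | B1 = b) = b + E(W2) = b, so the answer for b = 1 is 1.

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal(1_000_000)
W2 = rng.standard_normal(1_000_000)
B1, B2 = W1, W1 + W2

near_one = np.abs(B1 - 1.0) < 0.05          # condition on B1 being close to 1
print(B2[near_one].mean())                  # close to 1.0
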
Important note

1 If X and Y are independent then E(Y |X = x) = E(Y ).
2 If g is a function then E(g(X)|X = x) = g(x).
3 If g is a function then

E(g(Y )|X = x) = Σ_y g(y) P (Y = y|X = x),   if Y is discrete
E(g(Y )|X = x) = ∫_{−∞}^{∞} g(y) fY |X (y|x) dy,   if Y is continuous

80 / 96
σ-algebra: record of information

Let Ω be the sample space of a random experiment. A collection F of subsets of Ω is
called a σ-algebra over Ω if it satisfies the following conditions:
1 Ω ∈ F
2 A ∈ F ⇒ Ac ∈ F
3 Ai ∈ F for all i = 1, 2, . . . ⇒ ∪_{i=1}^{∞} Ai ∈ F
Meaning: In measure-theoretic probability, information is modeled using σ-algebras.
The information associated with a σ-algebra F can be thought of as follows. A
random experiment is performed and an outcome ω is determined, but the value of ω
is not revealed. Instead, for each set in the σ-algebra F, we are told whether ω is in
the set. The more sets there are in F, the more information this provides.

81 / 96
Example
Some important σ-algebras of Ω, the sample space for tossing a coin three times
1 F0 = {∅, Ω}: the trivial σ-algebra - contains no information. Knowing whether the
outcome ω of the three tosses is in ∅ and whether it is in Ω tells you nothing
about ω.
2

F1 = {∅, Ω, {HHH, HHT, HT H, HT T }, {T HH, T HT, T T H, T T T }}
   = {∅, Ω, AH , AT }

where
AH = {HHH, HHT, HT H, HT T } = { H on the first toss }
AT = {T HH, T HT, T T H, T T T } = { T on the first toss }
F1 : information of the first toss, or ”information up to time 1”. For
example, you are told whether the first toss is H, and no more.
82 / 96
Example (continued)
3

F2 = {∅, Ω, {HHH, HHT }, {HT H, HT T }, {T HH, T HT }, {T T H, T T T },
      and all sets which can be built by taking unions of these}
   = {∅, Ω, AHH , AHT , AT H , AT T , and all sets which can be built by taking unions of these}

where
AHH = {HHH, HHT } = {HH on the first two tosses}
AHT = {HT H, HT T } = {HT on the first two tosses}
AT H = {T HH, T HT } = {TH on the first two tosses}
AT T = {T T H, T T T } = {TT on the first two tosses}
F2 : information of the first two tosses, or ”information up to time 2”
83 / 96
4 F3 = the set of all subsets of Ω: “full information” about the outcome of all three
tosses

84 / 96
F - measurable

Definition
A random variable X is called F-measurable if σ(X) ⊂ F
Meaning: the information in F is enough to determine the value of the random
variable X(ω), even though it may not be enough to determine the outcome ω of the
random experiment.
Example
Consider a binomial asset pricing model and let F2 be the σ-algebra generated by the
information of the first two tosses.
The asset prices S1 and S2 are F2-measurable, but S3 is not F2-measurable

85 / 96
Conditional expectation of X given a σ-algebra F
The conditional expectation E(X|F) is a random variable which satisfies
• F-measurability
Meaning: the estimate E(X|F) of X is based on the information in F
• the partial-averaging property

∫_A E(X|F) dP = ∫_A X dP   for all A ∈ F

E(X|F) is indeed an estimate of X. It gives the same averages as X over all the
sets in F.
If F has many sets, which provide a fine resolution of the uncertainty inherent in
ω, then this partial-averaging property over the ”small” sets in F says that
E(X|F) is a good estimator of X.

86 / 96
Conditional expectation given a random variable

Definition

E(Y |X) = E(Y |σ(X))

can be regarded as an estimate of the value of Y based on the knowledge of X

Find a formula for E(Y |X)

• Find g(x) = E(Y |X = x)
• Then E(Y |X) = g(X)

87 / 96
Example
Consider a binomial asset pricing model with S0 = 4, u = 1/d = 2, p(H) = 2/3 and
p(T ) = 1/3, find E(S2 |S1 )

88 / 96
Properties of conditional expectation
1 Linearity
E(aY + bZ|X) = aE(Y |X) + bE(Z|X)
2 Taking out what is known (a typical approach to verify the Markov and martingale
properties)
E(f (X)Y |X) = f (X)E(Y |X)
3 Iterated conditioning: for σ-algebras G ⊂ H,

E(E(Z|H)|G) = E(Z|G)

In particular E(Y ) = E(E(Y |G))

4 Independence
E(Y |G) = E(Y )
if Y is independent of G.
89 / 96
Example - Linearity

Consider a binomial asset pricing model with S0 = 4, u = 2, d = 1/2, p = 2/3 and q = 1/3.


Compare
E(S2 + S3 |S1 )
and
E(S2 |S1 ) + E(S3 |S1 )

90 / 96
Solution

• S1 takes two values, 8 and 2

• Given S1 = 8,

E(S2 |S1 = 8) = (2/3)(16) + (1/3)(4) = 12

E(S3 |S1 = 8) = (2/3)²(32) + (2/3)(1/3)(8) + (1/3)(2/3)(8) + (1/3)²(2) = 18

So
E(S2 |S1 = 8) + E(S3 |S1 = 8) = 12 + 18 = 30

91 / 96
• Given S1 = 8, (S2 , S3 ) takes the pairs of values (16, 32), (16, 8), (4, 8), (4, 2). So

E(S2 + S3 |S1 = 8) = (2/3)²(16 + 32) + (2/3)(1/3)(16 + 8)
                   + (1/3)(2/3)(4 + 8) + (1/3)²(4 + 2) = 30

• Hence E(S2 + S3 |S1 = 8) = E(S2 |S1 = 8) + E(S3 |S1 = 8) = 30
• Similarly E(S2 + S3 |S1 = 2) = E(S2 |S1 = 2) + E(S3 |S1 = 2) = 7.5
• Regardless of the outcome of S1 , we have

E(S2 + S3 |S1 ) = E(S2 |S1 ) + E(S3 |S1 )

92 / 96
Example - Take out what is known

Compare
E(S1 S2 |S1 )
and
S1 E(S2 |S1 )

93 / 96
Example - Iterated conditioning

Compare
E(S3 |S1 )
and
E(E(S3 |(S1 , S2 ))|S1 )

94 / 96
Example-Independence

Compare
E( S2 /S1 | S1 )

and

E( S2 /S1 )

95 / 96
Practice

Suppose that W1 ∼ N (0, 1) and W2 ∼ N (0, 1) are the log-returns of the first and the second
year of stock A. Suppose W1 and W2 are independent. The cumulative log-return of
this stock is
B1 = W1
and
B2 = W1 + W2 .
Find E(B2 |B1 ).

96 / 96
