
Financial theory and models

Basic probability theory

State space and events


The origin of probability and independence
Measure theory and stochastic variables
Stochastic variables and independence

Frank Hansen
Department of Economics
Copenhagen University
2022
The state space
The state space S represents all the different and mutually exclusive
states of the “world” under consideration. A finite state space

S = {ω1 , . . . , ωn }

allows exactly n states or possibilities ω1 , . . . , ωn to occur.


If we consider a toss of a coin then the state space

S = {0, 1}

has two points with 0 representing head and 1 representing tail.


Two tosses of a coin are described by the state space

S = {(0, 0), (0, 1), (1, 0), (1, 1)}

with four elements, where the first coordinate represents the outcome
of the first toss and the second the outcome of the second toss.
1 / 28
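For small experiments the state space can be enumerated directly on a computer. A minimal Python sketch (the variable names are illustrative, not part of the slides) building the two-toss state space:

```python
from itertools import product

# state space for a single toss: 0 represents head, 1 represents tail
single_toss = (0, 1)

# state space for two tosses: all ordered pairs of single-toss outcomes
S = list(product(single_toss, repeat=2))
print(S)  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```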
More examples

An experiment in which a card is drawn has a state space

S = {ω1 , ω2 , ω3 , . . . , ω51 , ω52 }

with 52 elements. It may be written as a union

S = S♠ ∪ S♥ ∪ S♦ ∪ S♣

of four subsets each representing one of the four suits: spades, hearts,
diamonds and clubs.
If the experiment is the measurement of the temperature of a body of
water under atmospheric pressure and in thermodynamic equilibrium
then the state space is S = [0, 100] (degrees Celsius).
The state space is thus the collection of all possible outcomes under
consideration.

2 / 28
Events

An event is a subset A ⊆ S of the state (sample) space.

[Figure: a state space S containing two overlapping events A and B]

In the picture we consider a state space S with two events A and B.


The events are not disjoint, meaning that there exist states belonging
to both events. The events are therefore not mutually exclusive.

3 / 28
Examples of events

In the experiment in which a coin is tossed twice one may consider the
event A of obtaining exactly one head. Then

A = {(0, 1), (1, 0)} ⊂ S = {(0, 0), (0, 1), (1, 0), (1, 1)}.

In the experiment in which a card is drawn one may consider the event
that the chosen card is a spade. This event has 13 elements - one for
each rank of the cards.

In the experiment in which the temperature of a body of water is
measured one may consider the event that the measurement gives a
result between 10 and 20 degrees Celsius.

In the sampling of voter preferences one may consider the event that
the first candidate is preferred by between 20 and 25 out of the 100
voters participating in the survey.

4 / 28
The origin of probability

An event in a state or sample space may be assigned a probability.


The origin of probability is the notion of equi-probable events - that is
events with the same likelihood of obtaining.
When a card is drawn, each of the 52 possibilities (specified by rank
and suit) is equally likely to obtain. We say that the game is fair.
The four mutually exclusive events club, diamond, spade and heart
each have 13 elements and should therefore be equally likely.
These four events exhaust all possibilities, and the sure event is
assigned probability one by convention.
We arrive at the conclusion that the event “the drawn card is a spade”
must be assigned probability 1/4.

5 / 28
Probability
Suppose S is a finite state space with N elements and that all states
are equally likely to obtain.
The probability P[A] of an event A ⊆ S with n elements is thus

P[A] = n/N
and the so-defined probability function P satisfies
(i) A∩B =∅ ⇒ P[A ∪ B] = P[A] + P[B]
(ii) P[S\A] = 1 − P[A]
(iii) P[∅] = 0 and P[S] = 1.

Definition
A set function P defined on the set of subsets of S is called a
probability measure if it satisfies these three conditions.

6 / 28
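For equally likely states the probability of an event reduces to counting. A small Python sketch (the suit and rank labels are illustrative choices) checking the three defining properties on the card-drawing example:

```python
from fractions import Fraction
from itertools import product

# the 52-card state space, each card identified by its (suit, rank)
suits = ("spade", "heart", "diamond", "club")
ranks = range(1, 14)
S = set(product(suits, ranks))

def P(A):
    """P[A] = n/N when all N states are equally likely and A has n elements."""
    return Fraction(len(A), len(S))

spades = {omega for omega in S if omega[0] == "spade"}
hearts = {omega for omega in S if omega[0] == "heart"}

print(P(spades))                                     # 1/4
print(P(spades | hearts) == P(spades) + P(hearts))   # (i) additivity for disjoint events
print(P(S - spades) == 1 - P(spades))                # (ii) the complement rule
print(P(set()), P(S))                                # (iii) 0 and 1
```

The exact values come out automatically because the counts are kept as Fraction objects rather than floats.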
Continued repetitions of an experiment

Suppose that we have a funny (biased) coin, where the probability of head is
p ∈ [0, 1] and the probability of tail is 1 − p.
This means that the fraction of heads among the tosses approaches p
as the number of tosses tends to infinity.
If we toss the coin twice then the state space is

S = {(0, 0), (0, 1), (1, 0), (1, 1)},

where 0 represents head and 1 represents tail.


If the experiment is repeated a large number of times, the fraction of
repetitions in which the first toss gives head approaches p.
The second toss is intuitively independent of the first, so the
fraction of tails in the second toss remains 1 − p. The fraction of
repetitions with first head and then tail is therefore p(1 − p).

7 / 28
The origin of independence
The ratios (probabilities) of the four outcomes are listed below:

(0, 0): p²            (0, 1): p(1 − p)
(1, 0): p(1 − p)      (1, 1): (1 − p)²

The event A that the first toss gives head is


A = {(0, 0), (0, 1)} with probability P[A] = p.
The event B that the second toss gives tail is
B = {(0, 1), (1, 1)} with probability P[B] = 1 − p.
The event A ∩ B = {(0, 1)}, first head then tail, has probability
P[A ∩ B] = p(1 − p) = P[A]P[B].
This property is taken as the defining notion of independence.
8 / 28
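The defining identity P[A ∩ B] = P[A]P[B] can be verified numerically for the biased coin. A short sketch, assuming the illustrative value p = 0.3 for the probability of head:

```python
# probabilities of the four outcomes of two independent tosses of a biased coin
p = 0.3  # illustrative probability of head; 0 = head, 1 = tail
prob = {(0, 0): p * p,
        (0, 1): p * (1 - p),
        (1, 0): (1 - p) * p,
        (1, 1): (1 - p) * (1 - p)}

def P(event):
    return sum(prob[omega] for omega in event)

A = {(0, 0), (0, 1)}   # first toss gives head
B = {(0, 1), (1, 1)}   # second toss gives tail

print(P(A), P(B))                            # 0.3 and 0.7
print(abs(P(A & B) - P(A) * P(B)) < 1e-12)   # True: A and B are independent
```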
Independent events

Let (S, P) be a finite state space S equipped with a probability measure P.

Definition
Two events A, B ⊆ S are said to be independent if

P[A ∩ B] = P[A]P[B].

More generally, we say that n events A1 , . . . , An ⊆ S are independent if

P[Ai1 ∩ · · · ∩ Aik ] = P[Ai1 ] · · · P[Aik ]

for arbitrary indices 1 ≤ i1 < · · · < ik ≤ n.

9 / 28
Issues regarding the definition of probability
Any distribution of probability over S defines a probability measure.
The above definition of probability works well for a finite state space,
and we notice that

Ai ∩ Aj = ∅ for i ≠ j   ⇒   P[⋃_{i∈I} Ai] = Σ_{i∈I} P[Ai]

for any family (Ai )i∈I of subsets of S.


However, if the state space is infinite then we cannot reasonably
assume absolute additivity as above.
If for example S = [0, 1] and we assume all states to be equally likely
then the probability of each state must be zero. But with absolute
additivity this implies P[S] = 0, which is absurd.
Another issue is that not all subsets of S = [0, 1] may be assigned a
probability in a meaningful way.
10 / 28
The notion of a σ-algebra
A collection F of subsets of a state space S is called a σ-algebra if
(i) The empty set ∅ belongs to F.
(ii) For any set A ∈ F the complement S\A = {ω ∈ S | ω ∉ A}
belongs to F.
(iii) For any sequence of sets A1, A2, . . . in F the union

A = ⋃_{n=1}^∞ An = {ω ∈ S | ω ∈ An for some n}

also belongs to F.

Definition
A pair (S, F) is called a measure space if S is a non-empty set and F
is a σ-algebra on S.

A subset A ⊆ S is said to be measurable if A ∈ F.


11 / 28
Countable intersection of measurable sets

Let F be a σ-algebra of subsets of a state space S.

Proposition
For any sequence of sets A1, A2, . . . in F the intersection

A = ⋂_{n=1}^∞ An = {ω ∈ S | ω ∈ An for all n}

also belongs to F.

Proof: The complement of A satisfies

S\A = ⋃_{n=1}^∞ (S\An) ∈ F.

Indeed, x ∉ A if and only if x ∉ An for some n. Therefore A ∈ F.

12 / 28
The Borel sets
Consider a family (Fi)i∈I of σ-algebras on S. It is easy to establish that
the intersection

F = ⋂_{i∈I} Fi = {A ⊆ S | A ∈ Fi ∀ i ∈ I}

is again a σ-algebra on S.

Corollary
To any system O of subsets of S there is a smallest σ-algebra σ(O)
such that O ⊆ σ(O).

Indeed, σ(O) is the intersection of all σ-algebras on S containing O; this family is non-empty since the power set of S is one such σ-algebra. σ(O) is called the σ-algebra generated by O.

Definition
The Borel sets B(Rn) in Rn form the σ-algebra generated by the open
sets in Rn.

13 / 28
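On a finite state space countable unions reduce to finite unions, so the σ-algebra σ(O) generated by a collection O can be computed by brute-force closure under complements and unions. A minimal sketch (the function name is ours, not from the slides):

```python
from itertools import combinations

def generated_sigma_algebra(S, O):
    """Return the sigma-algebra on a finite set S generated by a collection O.

    For a finite S it suffices to close the collection under complements and
    pairwise unions until nothing new appears.
    """
    S = frozenset(S)
    F = {frozenset(), S} | {frozenset(A) for A in O}
    while True:
        new = {S - A for A in F} | {A | B for A, B in combinations(F, 2)}
        if new <= F:
            return F
        F |= new

S = {1, 2, 3, 4}
F = generated_sigma_algebra(S, [{1}, {2}])
print(len(F))                           # 8 sets: all unions of the atoms {1}, {2}, {3, 4}
print(sorted(sorted(A) for A in F))
```

For S = {1, 2, 3, 4} and O = {{1}, {2}} the result is the 8-element σ-algebra consisting of all unions of the atoms {1}, {2} and {3, 4}.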
Measure and probability
Let (S, F) be a measure space. A mapping µ : F → R ∪ {∞} is called
a measure if
(i) µ[A] ≥ 0 for every set A ∈ F
(ii) µ[∅] = 0
(iii) For any sequence A1, A2, . . . of mutually disjoint sets in F the
measure of the union satisfies

µ[⋃_{n=1}^∞ An] = Σ_{n=1}^∞ µ[An].

If in addition µ[S] = 1 we call µ a probability measure.


It follows from the definition that if A ⊆ B for sets A, B ∈ F and
µ[B] < ∞ then
µ[A] = µ[B] − µ[B\A].
Note that B\A = S\(A ∪ (S\B)) ∈ F.
14 / 28
The Lebesgue measure
The following theorem is rather deep and is only given as a reference.

Theorem
There exists a σ-algebra F on R such that the Borel sets

B = B(R) ⊂ F

and a measure λ on F such that

λ[(a, b)] = b − a for any a < b.

The sets in F are the Lebesgue sets and λ is the Lebesgue measure.
There exist uncountable sets in R with Lebesgue measure zero.
The Lebesgue sets F have the nice property that

A ⊆ B ∈ F and λ[B] = 0 ⇒ A ∈ F.

15 / 28
Measurable functions

Let (S, F) be a measure space with a measure µ defined on F.


A function X : S → Rn is said to be measurable if

X⁻¹(B) = {ω ∈ S | X(ω) ∈ B} ∈ F

for any Borel set B ∈ B(Rn ).


It is a routine exercise to establish that sums, products and quotients
(when defined) of measurable functions are again measurable.

Suppose now that S is a subset of Rm for some m.


Since the Borel sets contain the open sets it follows that any
continuous function is measurable.
A point-wise limit of measurable functions is again measurable.

16 / 28
Stochastic (random) variables
Let (S, F, P) be a probability space with a probability measure P.

Definition
A real measurable function X : S → R is called a random variable.

Let X be a random variable and consider y ∈ R and a small ε > 0.


The interval (y − ε, y + ε) is a Borel set.
Since X is measurable the set

X⁻¹((y − ε, y + ε)) = {ω ∈ S | y − ε < X(ω) < y + ε}

is in F and can therefore be assigned a probability

P[{ω ∈ S | y − ε < X (ω) < y + ε}].

It is thus possible, for each real y, to calculate the probability of the
event that the random variable X takes values close to y.
17 / 28
Conditional probability
Let (S, F, P) be a measure space with a probability measure and take
an event B with P[B] > 0.

Definition
The conditional probability P[A | B] of A given B is defined by setting

P[A | B] = P[A ∩ B] / P[B]

for any other event A ∈ F.

We immediately see that

P[A | B] = P[A]P[B] / P[B] = P[A]

if A and B are independent. We therefore gain no knowledge about A
by knowing that B has occurred.
18 / 28
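The conditional probability is again just a ratio of counts when all states are equally likely. A short sketch on the card example (letting rank 13 stand for the king is an illustrative convention), which also shows the independence remark above in action:

```python
from fractions import Fraction
from itertools import product

suits = ("spade", "heart", "diamond", "club")
ranks = range(1, 14)          # rank 13 stands for the king (illustrative convention)
S = set(product(suits, ranks))

def P(A):
    return Fraction(len(A), len(S))

def P_given(A, B):
    """Conditional probability P[A | B], assuming P[B] > 0."""
    return P(A & B) / P(B)

A = {omega for omega in S if omega[1] == 13}       # the drawn card is a king
B = {omega for omega in S if omega[0] == "spade"}  # the drawn card is a spade

print(P_given(A, B))          # 1/13
print(P_given(A, B) == P(A))  # True: rank and suit are independent, so P[A | B] = P[A]
```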
Conditional expectation of a discrete variable

Let Y be a discrete stochastic variable with values y1, y2, . . . and let
B ⊆ S with P[B] > 0. The conditional expectation E[Y | B] of Y given
B is defined by setting

E[Y | B] = Σ_{k≥1} yk P[Y = yk | B].

The sets Ai = {ω ∈ S | Y(ω) = yi}, i = 1, 2, . . ., are mutually
disjoint with union S.
Let X be another discrete stochastic variable with values x1 , x2 , . . . .
The conditional expectation E[X | Y ] of X given Y is defined by setting

E[X | Y](ω) = E[X | Ai]   for ω ∈ Ai.

Note that E[X | Y ] is constant on each of the sets A1 , A2 , . . . .

19 / 28
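On a finite sample space both conditional expectations can be computed directly from the definitions. A small sketch with an illustrative four-point space and illustrative values of X and Y:

```python
# an illustrative four-point sample space with explicit probabilities
prob = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
X = {"a": 1, "b": 2, "c": 3, "d": 4}   # values of the stochastic variable X
Y = {"a": 0, "b": 0, "c": 1, "d": 1}   # values of the stochastic variable Y

def P(B):
    return sum(prob[w] for w in B)

def E_given(Z, B):
    """Conditional expectation E[Z | B] of a discrete variable Z, assuming P[B] > 0."""
    return sum(Z[w] * prob[w] for w in B) / P(B)

def E_given_variable(X, Y):
    """E[X | Y] as a function on the sample space, constant on each set {Y = y_i}."""
    return {w: E_given(X, {v for v in prob if Y[v] == Y[w]}) for w in prob}

print(E_given(Y, {"a", "b", "c"}))  # E[Y | B] for B = {a, b, c}: 0.333...
print(E_given_variable(X, Y))       # {'a': 1.5, 'b': 1.5, 'c': 3.5, 'd': 3.5}
```

The sketch sums Z(ω)P[{ω}] over ω ∈ B and divides by P[B], which agrees with the definition Σ_k y_k P[Y = y_k | B] for a discrete variable.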
Stochastic variables and independence
In the two-toss coin experiment above, let X1 denote the stochastic variable
that measures the outcome of the first toss. That is,

X1 (0, ω) = 0 and X1 (1, ω) = 1,

regardless of ω ∈ {0, 1}. Similarly,

X2 (ω, 0) = 0 and X2 (ω, 1) = 1.

We calculate the probability

P[X1 = 0, X2 = 1] = P[X1⁻¹({0}) ∩ X2⁻¹({1})] = P[A ∩ B]
                  = P[A]P[B]
                  = P[X1 = 0]P[X2 = 1]

for the two independent tosses (represented by X1 and X2 ) of the coin.

20 / 28
Independent stochastic variables

Let (S, F, P) be a measure space with a probability measure.

Definition
Let X1 and X2 be two stochastic variables on S. We say that X1 and X2
are independent if

P[X1 ∈ B1 , X2 ∈ B2 ] = P[X1 ∈ B1 ]P[X2 ∈ B2 ]

for all Borel sets B1 and B2 .

We say, more generally, that n stochastic variables X1, . . . , Xn on S are
independent if

P[Xi1 ∈ Bi1 , . . . , Xik ∈ Bik ] = P[Xi1 ∈ Bi1 ] · · · P[Xik ∈ Bik ]

for Borel sets B1 , . . . , Bn and arbitrary indices 1 ≤ i1 < · · · < ik ≤ n.

21 / 28
Functions of independent variables
Let (S, F, P) be a measure space with a probability measure and
consider independent stochastic variables X and Y on S.
Take arbitrary measurable functions f and g such that f is defined in
the range of X , and g is defined in the range of Y .
If B1 and B2 are Borel sets then so are
f⁻¹(B1) = {t ∈ R | f(t) ∈ B1} and g⁻¹(B2) = {t ∈ R | g(t) ∈ B2}.
We calculate the joint probability
P[f(X) ∈ B1, g(Y) ∈ B2] = P[X ∈ f⁻¹(B1), Y ∈ g⁻¹(B2)]
= P[X ∈ f⁻¹(B1)]P[Y ∈ g⁻¹(B2)] = P[f(X) ∈ B1]P[g(Y) ∈ B2]
and realise that f (X ) and g(Y ) are also independent.
Theorem
Measurable functions of n independent variables are independent.
22 / 28
Stochastic vectors with a density
A vector of stochastic variables X = (X1, . . . , Xn) with distribution function

FX(x) = P[X1 ≤ x1, . . . , Xn ≤ xn],   x = (x1, . . . , xn) ∈ Rn,
has a density if there is a non-negative function fX : Rn → R such that
FX(x1, . . . , xn) = ∫_{−∞}^{x1} ··· ∫_{−∞}^{xn} fX(t1, . . . , tn) dt1 ··· dtn.

Theorem
Let X = (X1 , . . . , Xn ) be a stochastic vector with density fX and let
fi(xi) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} fX(t1, . . . , ti−1, xi, ti+1, . . . , tn) dt1 ··· dti−1 dti+1 ··· dtn

be the reduced (marginal) densities. If X1 , . . . , Xn are independent then

fX (x1 , . . . , xn ) = f1 (x1 ) · · · fn (xn ) for x1 , . . . , xn ∈ R.

23 / 28
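The factorisation can be checked by simulation: for independent variables the joint distribution function is the product of the marginal distribution functions, and the density statement above follows from this. A Monte Carlo sketch with two independent uniform variables on [0, 1] (an illustrative choice of distribution):

```python
import random

random.seed(0)
N = 200_000
# two independent uniform variables on [0, 1]
samples = [(random.random(), random.random()) for _ in range(N)]

x1, x2 = 0.3, 0.7
F_joint = sum(1 for u, v in samples if u <= x1 and v <= x2) / N
F1 = sum(1 for u, _ in samples if u <= x1) / N
F2 = sum(1 for _, v in samples if v <= x2) / N

print(F_joint, F1 * F2)  # both close to 0.21 = 0.3 * 0.7
```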
Proof

We first notice that

P[Xi ≤ xi] = ∫_{−∞}^{xi} fi(t) dt,   i = 1, . . . , n.

Therefore, fi is the density of Xi . If X1 , . . . , Xn are independent then


∫_{−∞}^{x1} ··· ∫_{−∞}^{xn} fX(t1, . . . , tn) dt1 ··· dtn = FX(x1, . . . , xn)
    = FX1(x1) ··· FXn(xn) = ∫_{−∞}^{x1} f1(t1) dt1 ··· ∫_{−∞}^{xn} fn(tn) dtn
    = ∫_{−∞}^{x1} ··· ∫_{−∞}^{xn} f1(t1) ··· fn(tn) dt1 ··· dtn

from which the assertion follows.

24 / 28
E[XY ] = E[X ]E[Y ] for independent variables
Consider independent variables X and Y with finitely many values
x1, . . . , xn and y1, . . . , ym. Then the mean

E[XY] = Σ_{i=1}^n Σ_{j=1}^m xi yj P[X = xi, Y = yj]
      = Σ_{i=1}^n Σ_{j=1}^m xi yj P[X = xi] P[Y = yj] = E[X]E[Y].

By suitable approximation the argument may be extended to discrete
or continuous stochastic variables with finite first moments.

Similarly, we obtain the formula

E[X1 · · · Xn ] = E[X1 ] · · · E[Xn ]

valid for n independent variables with finite first moments.


25 / 28
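The double sum can be evaluated directly for a concrete pair of independent variables with finitely many values (the numbers below are illustrative):

```python
# two independent variables with finitely many values (illustrative numbers)
x_vals, px = [0, 1, 2], [0.2, 0.5, 0.3]
y_vals, py = [-1, 1], [0.4, 0.6]

# by independence the joint probabilities factor: P[X = x, Y = y] = P[X = x] P[Y = y]
joint = {(x, y): px[i] * py[j]
         for i, x in enumerate(x_vals) for j, y in enumerate(y_vals)}

E_XY = sum(x * y * p for (x, y), p in joint.items())
E_X = sum(x * p for x, p in zip(x_vals, px))
E_Y = sum(y * p for y, p in zip(y_vals, py))

print(E_XY, E_X * E_Y)  # both approximately 0.22 = 1.1 * 0.2
```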
Covariance
The covariance Cov[X, Y] of two stochastic variables X, Y : S → R is
defined by setting

Cov[X, Y] = E[(X − E[X])(Y − E[Y])].

The definition tacitly assumes that the means of X and Y exist. Still,
the covariance may not be well-defined if the function

ω → (X (ω) − E[X ])(Y (ω) − E[Y ])

is not integrable. Since expectation is linear we may write


Cov[X , Y ] = E[XY ] − E[XE[Y ]] − E[E[X ]Y ] + E[E[X ]E[Y ]]

= E[XY ] − E[X ]E[Y ].

Proposition
Cov[X , Y ] = 0 for independent stochastic variables X and Y .

26 / 28
The covariance matrix
Let X = (X1 , . . . , Xn ) be a stochastic vector such that the variables
have first and second moments. We introduce the covariance matrix

SX = (Cov[Xi, Xj])_{i,j=1}^n .

Take any vector a ∈ Rn. We then obtain

(SX a) · a = Σ_{i=1}^n (SX a)i ai = Σ_{i=1}^n Σ_{j=1}^n Cov[Xi, Xj] aj ai
           = Cov[Σ_{i=1}^n ai Xi, Σ_{j=1}^n aj Xj] = Var[Y] ≥ 0,

where Y = Σ_{i=1}^n ai Xi.

It follows that SX is a positive semi-definite matrix.

27 / 28
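The positive semi-definiteness can be illustrated numerically with an empirical covariance matrix; the particular construction of the three variables below is an illustrative assumption, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# three variables built from independent standard normals
Z = rng.standard_normal((3, 10_000))
X = np.vstack([Z[0], Z[0] + 0.5 * Z[1], Z[2]])   # rows are the variables X1, X2, X3

S_X = np.cov(X)              # empirical covariance matrix (Cov[Xi, Xj])
a = np.array([1.0, -2.0, 0.5])

print(a @ S_X @ a)                                 # sample variance of a1*X1 + a2*X2 + a3*X3, hence >= 0
print(np.all(np.linalg.eigvalsh(S_X) >= -1e-12))   # True: S_X is positive semi-definite
```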
The covariance matrix and independence

Suppose now that the stochastic variables X1 , . . . , Xn are independent.


For i ≠ j we calculate

Cov[Xi, Xj] = E[(Xi − E[Xi])(Xj − E[Xj])]
            = E[Xi Xj] − E[Xi E[Xj]] − E[E[Xi] Xj] + E[E[Xi] E[Xj]]
            = E[Xi Xj] − E[Xi]E[Xj] = 0.

The covariance matrix SX for a vector of independent variables is thus
a diagonal matrix with Var[X1], . . . , Var[Xn] on the diagonal.

It is important to know that the converse is not true in general.

Later we learn that the components of a normally distributed vector X
are independent if and only if the covariance matrix SX is diagonal.

28 / 28
