Basic Probability Theory
Frank Hansen
Department of Economics
Copenhagen University
2022
The state space
The state space S represents all the different and mutually exclusive
states of the "world" under consideration. A finite state space is written

S = {ω1, . . . , ωn}.

If the experiment is the tossing of a coin twice, with 0 representing tails
and 1 representing heads, the state space is the product

S = {0, 1} × {0, 1} = {(0, 0), (0, 1), (1, 0), (1, 1)}

with four elements, where the first coordinate represents the outcome
of the first toss and the second the outcome of the second toss.
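As a small illustration (not part of the original slides), the state space of
the double coin toss can be enumerated in Python with the standard
itertools module:

from itertools import product

# State space of two coin tosses: 0 represents tails, 1 represents heads.
S = set(product({0, 1}, repeat=2))
print(sorted(S))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(S))     # 4 states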
More examples
If the experiment is the drawing of a card from an ordinary deck, the
state space may be written as the union

S = S♠ ∪ S♥ ∪ S♦ ∪ S♣

of four subsets, each representing one of the four suits: spades, hearts,
diamonds and clubs.
If the experiment is the measurement of the temperature of a body of
water under atmospheric pressure and in thermodynamic equilibrium,
then the state space is the interval S = [0, 100] (degrees Celsius).
The state space is thus the collection of all possible outcomes under
consideration.
Events
An event is a subset B ⊆ S of the state space.

[Figure: Venn diagram showing an event B as a region inside the state space S.]
Examples of events
In the experiment in which a coin is tossed twice one may consider the
event A of obtaining exactly one head. Then
A = {(0, 1), (1, 0)} ⊂ S = {(0, 0), (0, 1), (1, 0), (1, 1)}.
In the experiment in which a card is drawn one may consider the event
that the chosen card is a spade. This event has 13 elements, one for
each rank of the cards.
In the sampling of voter preferences one may consider the event that
the first candidate is preferred by between 20 and 25 out of the 100
voters participating in the survey.
The origin of probability
Probability
Suppose S is a finite state space with N elements and that all states
are equally likely to obtain.
The probability P[A] of an event A ⊆ S with n elements is thus
P[A] = n / N
and the so-defined probability function P satisfies
(i) A ∩ B = ∅ ⇒ P[A ∪ B] = P[A] + P[B]
(ii) P[S\A] = 1 − P[A]
(iii) P[∅] = 0 and P[S] = 1.
Definition
A set function P defined on the set of subsets of S is called a
probability measure if it satisfies these three conditions.
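A minimal sketch of the uniform probability P[A] = n/N and the three
conditions above, using the double coin toss (the helper name prob is
our own choice):

from itertools import product

S = set(product({0, 1}, repeat=2))  # double coin toss, four equally likely states

def prob(A):
    # Uniform probability of an event A ⊆ S.
    return len(A) / len(S)

A = {(0, 1), (1, 0)}  # exactly one head
B = {(1, 1)}          # two heads, disjoint from A
assert prob(A | B) == prob(A) + prob(B)   # (i) additivity on disjoint events
assert prob(S - A) == 1 - prob(A)         # (ii) complement rule
assert prob(set()) == 0 and prob(S) == 1  # (iii) normalisation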
Continued repetitions of an experiment
The origin of independence
The ratios (probabilities) of the four outcomes, when a single toss gives
0 with probability p and 1 with probability 1 − p, are listed below:

(0, 0): p²          (0, 1): p(1 − p)
(1, 0): p(1 − p)    (1, 1): (1 − p)²
Definition
Two events A, B ⊆ S are said to be independent if
P[A ∩ B] = P[A]P[B].
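To check the definition against the table above (a sketch; the
distribution dictionary is our own encoding), the events "first toss
gives 0" and "second toss gives 0" come out independent for any p:

p = 0.3  # probability that a single toss gives 0; any value in [0, 1] works

# Joint distribution of the double toss, as in the table above.
P = {(0, 0): p * p,       (0, 1): p * (1 - p),
     (1, 0): (1 - p) * p, (1, 1): (1 - p) * (1 - p)}

def prob(E):
    return sum(P[s] for s in E)

A = {(0, 0), (0, 1)}  # first toss gives 0
B = {(0, 0), (1, 0)}  # second toss gives 0
assert abs(prob(A & B) - prob(A) * prob(B)) < 1e-12  # P[A ∩ B] = P[A]P[B]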
Issues regarding the definition of probability
Any distribution of probability over S defines a probability measure.
The above definition of probability works well for a finite state space,
and we notice that

Ai ∩ Aj = ∅ for i ≠ j ⇒ P[∪_{i∈I} Ai] = ∑_{i∈I} P[Ai]

for any finite index set I. For an infinite state space one instead
considers a system F of subsets of S, a so-called σ-algebra, such that
S ∈ F, the complement S\A of any set A ∈ F belongs to F, and for any
sequence A1, A2, . . . of sets in F the union ∪_{n=1}^∞ An
also belongs to F.
Definition
A pair (S, F) is called a measure space if S is a non-empty set and F
is a σ-algebra on S.
Proposition
For any sequence of sets A1 , A2 , . . . , An , . . . in F the intersection
A = ∩_{n=1}^∞ An = {ω ∈ S | ω ∈ An for all n}
also belongs to F.
Indeed, x ∉ A if and only if x ∉ An for some n, so the complement
S\A = ∪_{n=1}^∞ (S\An) belongs to F. Therefore A ∈ F.
The Borel sets
Consider a family (Fi )i∈I of σ-algebras on S. It is easy to establish that
also the intersection
F = ∩_{i∈I} Fi = {A ⊆ S | A ∈ Fi ∀ i ∈ I}
is a σ-algebra on S.
Corollary
To any system O of subsets of S there is a smallest σ-algebra σ(O)
such that O ⊆ σ(O), namely the intersection of all σ-algebras on S
containing O.
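For a finite state space the generated σ-algebra can be computed by
brute force, repeatedly closing the system under complements and
pairwise unions (a sketch; the function name is our own):

def sigma_algebra(S, O):
    # Smallest σ-algebra on the finite set S containing every set in O.
    F = {frozenset(), frozenset(S)} | {frozenset(A) for A in O}
    changed = True
    while changed:  # close under complements and unions
        changed = False
        for A in list(F):
            for B in list(F):
                for C in (frozenset(S - A), frozenset(A | B)):
                    if C not in F:
                        F.add(C)
                        changed = True
    return F

S = frozenset({1, 2, 3, 4})
F = sigma_algebra(S, [{1}, {2}])
print(len(F))  # 8: all unions of the atoms {1}, {2} and {3, 4}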
Definition
The Borel sets B(Rn) in Rn form the σ-algebra generated by the open
sets in Rn.
Measure and probability
Let (S, F) be a measure space. A mapping µ : F → R ∪ {∞} is called
a measure if
(i) µ[A] ≥ 0 for every set A ∈ F
(ii) µ[∅] = 0
(iii) For any sequence A1, . . . , An, . . . of mutually disjoint sets in F the
measure of the union satisfies

µ[∪_{n=1}^∞ An] = ∑_{n=1}^∞ µ[An].

A measure P with P[S] = 1 is called a probability measure, and the
triple (S, F, P) is then called a probability space.
Theorem
There exists a σ-algebra F on R containing the Borel sets,

B = B(R) ⊂ F,

and a measure λ on (R, F) with λ[(a, b)] = b − a for every open
interval (a, b). The sets in F are the Lebesgue sets and λ is the
Lebesgue measure.
There exist uncountable sets in R with zero Lebesgue measure.
The Lebesgue sets F have the nice property that every subset of a set
of Lebesgue measure zero is again a Lebesgue set.
Measurable functions
Let (S, F) be a measure space. A function X : S → R is called
measurable if the pre-image

X⁻¹(B) = {ω ∈ S | X(ω) ∈ B}

belongs to F for every Borel set B ∈ B(R).
Stochastic (random) variables
Let (S, F, P) be a probability space with a probability measure P.
Definition
A real measurable function X : S → R is called a random variable.
Definition
The conditional probability P[A | B] of A given B is defined by setting

P[A | B] = P[A ∩ B] / P[B]

provided P[B] > 0. If A and B are independent events, then

P[A | B] = P[A]P[B] / P[B] = P[A].
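Continuing the biased double toss (our own illustration), one can check
numerically that conditioning on the first toss does not change the
distribution of the second:

p = 0.3
P = {(0, 0): p * p,       (0, 1): p * (1 - p),
     (1, 0): (1 - p) * p, (1, 1): (1 - p) * (1 - p)}
prob = lambda E: sum(P[s] for s in E)

A = {(0, 0), (1, 0)}  # second toss gives 0
B = {(0, 0), (0, 1)}  # first toss gives 0

cond = prob(A & B) / prob(B)        # P[A | B]
assert abs(cond - prob(A)) < 1e-12  # equals P[A] by independence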
Stochastic variables and independence
In the previous experiment, let X1 denote the stochastic variable that
measures the outcome of the first toss. That is,

X1(ω1, ω2) = ω1 for every state (ω1, ω2) ∈ S.
Independent stochastic variables
Definition
Let X1 and X2 be two stochastic variables on S. We say that X1 and X2
are independent if

P[X1 ∈ B1, X2 ∈ B2] = P[X1 ∈ B1]P[X2 ∈ B2]

for all Borel sets B1, B2 ⊆ R.
Functions of independent variables
Let (S, F, P) be a measure space with a probability measure and
consider independent stochastic variables X and Y on S.
Take arbitrary measurable functions f and g such that f is defined on
the range of X and g is defined on the range of Y.
If B1 and B2 are Borel sets then so are
f⁻¹(B1) = {t ∈ R | f(t) ∈ B1} and g⁻¹(B2) = {t ∈ R | g(t) ∈ B2}.
We calculate the joint probability
P[f(X) ∈ B1, g(Y) ∈ B2] = P[X ∈ f⁻¹(B1), Y ∈ g⁻¹(B2)]
= P[X ∈ f⁻¹(B1)]P[Y ∈ g⁻¹(B2)] = P[f(X) ∈ B1]P[g(Y) ∈ B2]
and realise that f (X ) and g(Y ) are also independent.
Theorem
Measurable functions of n independent variables are independent.
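A quick Monte Carlo sanity check of the theorem (a sketch; the choices
f(t) = t², g(t) = sin t and the Borel sets are arbitrary):

import math
import random

random.seed(0)
N = 200_000
X = [random.gauss(0, 1) for _ in range(N)]  # samples of X
Y = [random.gauss(0, 1) for _ in range(N)]  # samples of Y, independent of X

fX = [x * x for x in X]        # f(X) with f(t) = t²
gY = [math.sin(y) for y in Y]  # g(Y) with g(t) = sin t

# Compare P[f(X) ∈ B1, g(Y) ∈ B2] with P[f(X) ∈ B1]·P[g(Y) ∈ B2]
# for B1 = (1, ∞) and B2 = (0, ∞).
joint = sum(1 for a, b in zip(fX, gY) if a > 1 and b > 0) / N
prod = (sum(1 for a in fX if a > 1) / N) * (sum(1 for b in gY if b > 0) / N)
print(abs(joint - prod))  # small, of the order 1/√N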
Stochastic vectors with a density
A vector of stochastic variables X = (X1, . . . , Xn) with distribution

FX(x) = P[X1 ≤ x1, . . . , Xn ≤ xn],   x = (x1, . . . , xn) ∈ Rn,

has a density if there is a non-negative function fX : Rn → R such that

FX(x1, . . . , xn) = ∫_{−∞}^{x1} · · · ∫_{−∞}^{xn} fX(t1, . . . , tn) dt1 · · · dtn.
Theorem
Let X = (X1, . . . , Xn) be a stochastic vector with density fX and let

fi(xi) = ∫_{−∞}^∞ · · · ∫_{−∞}^∞ fX(t1, . . . , ti−1, xi, ti+1, . . . , tn) dt1 · · · dti−1 dti+1 · · · dtn

for i = 1, . . . , n. Then fi is the density of the variable Xi.
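A numerical sketch of the theorem (the joint density f(t1, t2) = t1 + t2
on the unit square is our own choice): integrating out the second
coordinate recovers the marginal density of X1.

import numpy as np

# Joint density on [0, 1]²: fX(t1, t2) = t1 + t2 (it integrates to 1).
f = lambda t1, t2: t1 + t2

t2 = np.linspace(0.0, 1.0, 10_001)
dt = t2[1] - t2[0]
x1 = 0.4
vals = f(x1, t2)
# Trapezoidal rule for f1(x1) = ∫ fX(x1, t2) dt2; the exact value is x1 + 1/2.
f1 = float(np.sum(vals[:-1] + vals[1:]) * dt / 2)
print(f1)  # ≈ 0.9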
Proof
For x ∈ R the distribution function of Xi is

P[Xi ≤ x] = ∫_{−∞}^∞ · · · ∫_{−∞}^{x} · · · ∫_{−∞}^∞ fX(t1, . . . , tn) dt1 · · · dtn = ∫_{−∞}^{x} fi(ti) dti,

where the upper limit x applies to the i-th coordinate and the second
equality follows by carrying out the integrations over the other
coordinates first. Therefore fi is the density of Xi.
E[XY ] = E[X ]E[Y ] for independent variables
Consider independent variables X and Y with finitely many values
x1, . . . , xn and y1, . . . , ym. Then the mean

E[XY] = ∑_{i=1}^n ∑_{j=1}^m xi yj P[X = xi, Y = yj]
      = ∑_{i=1}^n ∑_{j=1}^m xi yj P[X = xi]P[Y = yj] = E[X]E[Y].
The definition of the covariance

Cov[X, Y] = E[(X − E[X])(Y − E[Y])] = E[XY] − E[X]E[Y]

tacitly assumes that the means of X and Y exist. Still, the covariance
may not be well-defined if the function (X − E[X])(Y − E[Y]) fails to
be integrable.
Proposition
Cov[X , Y ] = 0 for independent stochastic variables X and Y .
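A finite-valued check of the computation above (the values and
probabilities are invented for the example):

from itertools import product

xs, px = [1, 2, 3], [0.2, 0.5, 0.3]  # distribution of X
ys, py = [-1, 4], [0.6, 0.4]         # distribution of Y, independent of X

EX = sum(x * p for x, p in zip(xs, px))
EY = sum(y * q for y, q in zip(ys, py))
# By independence P[X = x, Y = y] = P[X = x]·P[Y = y].
EXY = sum(x * y * p * q for (x, p), (y, q) in product(zip(xs, px), zip(ys, py)))

assert abs(EXY - EX * EY) < 1e-12  # E[XY] = E[X]E[Y], hence Cov[X, Y] = 0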
The covariance matrix
Let X = (X1 , . . . , Xn ) be a stochastic vector such that the variables
have first and second moments. We introduce the covariance matrix
SX = (Cov[Xi, Xj])_{i,j=1}^n.
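With numpy the covariance matrix of sampled variables can be estimated
directly (a sketch; np.cov treats rows as variables by default):

import numpy as np

rng = np.random.default_rng(0)
n = 100_000
X1 = rng.normal(size=n)
X2 = X1 + rng.normal(size=n)  # correlated with X1
X3 = rng.normal(size=n)       # independent of X1 and X2

SX = np.cov(np.vstack([X1, X2, X3]))  # 3×3 matrix of Cov[Xi, Xj]
print(np.round(SX, 2))
# Approximately [[1, 1, 0], [1, 2, 0], [0, 0, 1]];
# the zero entries correspond to the independent pairs.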
The covariance matrix and independence