Notes
1 Measure Theory
1.1 Defining σ-algebras
σ-algebras are a special kind of set algebra. A set algebra over some set S is a field-like structure
defined to be a family of subsets F such that:
1. ∅ ∈ F and S ∈ F
2. A ∈ F → A^c ∈ F
3. For A_1, . . . , A_n ∈ F, ⋃_{i=1}^n A_i ∈ F
By De Morgan's law, for A_1, . . . , A_n ∈ F,
(⋃_{i=1}^n A_i^c)^c = ⋂_{i=1}^n A_i
and the left-hand side lies in F by closure under complement and finite union. Hence,
⋂_{i=1}^n A_i ∈ F
A σ-algebra Σ over some set S is a set algebra such that condition 3. now states that Σ is closed under
countable unions. That is, for some infinite enumerable sequence Ai ∈ Σ,
⋃_{i=1}^∞ A_i ∈ Σ
The proof that this implies closure under countable intersection is the same as above.
σ-algebras lend themselves well to intuitive notions of measurability. For some set S and subset A, if
you know how ’big’ A is and how ’big’ S is, you should be able to work out how ’big’ Ac is. If you
know how 'big' some sequence of sets in S is, you should be able to work out how 'big' their union or
intersection is.
The σ-algebra generated by a family of sets F, denoted σF, is the smallest σ-algebra containing F.
More specifically, this is the intersection of every σ-algebra that contains F.
Proof: If {C_i} is the collection of all σ-algebras on X containing a family F of sets in X, then the
intersection ⋂_i C_i is the minimal σ-algebra containing F.
First, ⋂_i C_i is a σ-algebra. If e ∈ C_i ∀i, then it is guaranteed that e^c ∈ C_i ∀i, as all the sets in
question are σ-algebras. Hence,
e ∈ ⋂_i C_i → e^c ∈ ⋂_i C_i
That is, the intersection is closed under complement. As all the sets in question are σ-algebras,
∅ ∈ ⋂_i C_i,   X ∈ ⋂_i C_i
Similarly, for any sequence A_n ∈ ⋂_i C_i, each A_n lies in every C_i, so ⋃_n A_n ∈ C_i for every i and
hence ⋃_n A_n ∈ ⋂_i C_i. That is, the intersection is closed under countable union. Hence the intersection
of σ-algebras is a σ-algebra.
Second, it is minimal. Every C_i contains F, therefore
F ⊆ ⋂_i C_i
and any σ-algebra containing F is itself one of the C_i, so it contains ⋂_i C_i.
Borel subsets are sets that can be formed from 'open sets' or 'closed sets' through countable operations
of union, intersection and complement. The general notion of 'open sets' requires the set to be a
topological space.
A family P of subsets of a parent set X forms a π-system if it is closed under finite intersections.
A λ-system D on X satisfies:
1. X ∈ D
2. (A, B ∈ D) ∧ (B ⊆ A) → A \ B ∈ D
3. For a monotone increasing sequence A_n, (∀n, A_n ∈ D) → ⋃_n A_n ∈ D
Proof: Σ is a σ-algebra ⇔ Σ is both a π-system and a λ-system.
⇒
σ-algebras are closed under countable intersections and therefore are π-systems.
Moreover, σ-algebras are closed under countable unions, contain the entire set and are closed under
complement. Hence, they are λ-systems.
⇐
Suppose Σ is both a π-system and a λ-system. Then X ∈ Σ, and for A ∈ Σ, A^c = X \ A ∈ Σ, so Σ is
closed under complement. For A, B ∈ Σ, A ∪ B = (A^c ∩ B^c)^c ∈ Σ by the π-system property, so Σ is
closed under finite unions. To show that Σ is closed under countable unions, consider some sequence
A_n ∈ Σ. To convert it to a monotone increasing sequence, set B_1 = A_1 and define recursively
B_n = B_{n−1} ∪ A_n. Each B_n ∈ Σ by closure under finite unions, and B_n is monotone increasing by
definition. As Σ is a λ-system, ⋃_i B_i ∈ Σ. Hence ⋃_i A_i ∈ Σ, so Σ is closed under countable unions,
and by De Morgan also under countable intersections. Therefore Σ is a σ-algebra.
Dynkin's π-λ theorem: for a π-system P contained in a λ-system D,
P ⊆ D → σP ⊆ D
(Proof skipped for now.)
The product of two measurable spaces (E, E) and (F, F) is the measurable space
(E × F, E ⊗ F)
where E ⊗ F := σ{A × B | A ∈ E, B ∈ F} is the σ-algebra generated by the measurable rectangles.
For a function f : E → F and S ⊆ F, the preimage is
f^{-1}(S) := {x ∈ E | f(x) ∈ S}
A function f : (E, E) → (F, F) is E/F measurable if f^{-1}(S) ∈ E for every S ∈ F.
1.2.2 Properties of measurable functions
Proof: f : (E, E) → (F, F) is E/F measurable ⇔ ∀S ∈ F0 where F = σF0 , f −1 (S) ∈ E
⇒
As F0 ⊂ F, the measurability definition guarantees this.
⇐
Consider set G := {X ∈ F | f −1 (X) ∈ E}
1. F ∈ G as E = f −1 (F )
2. If S ∈ G, then S^c ∈ G, as f^{-1}(S^c) = f^{-1}(S)^c
3. If S_n ∈ G for all n, then ⋃_n S_n ∈ G, as f^{-1}(⋃_n S_n) = ⋃_n f^{-1}(S_n)
So G is a σ-algebra. By hypothesis F0 ⊆ G, hence F = σF0 ⊆ G, so f^{-1}(S) ∈ E for every S ∈ F,
i.e. f is E/F measurable.
Proof: Composition of measurable functions. For some measurable spaces (E, E), (F, F), and (G, G)
and functions f : E → F and g : F → G, if f is E/F measurable and g is F/G measurable, then g ◦ f is
E/G measurable, since (g ◦ f)^{-1}(S) = f^{-1}(g^{-1}(S)) ∈ E for every S ∈ G.
The extended reals and the corresponding Borel σ-algebra (R, B) form a measurable space.
An E/B measurable numerical function can more succinctly be denoted as an E measurable function,
or simply f ∈ E.
1.2.4 Building measurable numerical functions
Consider measurable space (E, E)
Indicator functions: the simplest measurable numerical functions. For some A ∈ E, of the form:
1_A(x) = 1 if x ∈ A, 0 if x ∉ A
Simple functions: A simple function f : E → R is one such that, for some sets A_1, . . . , A_n ∈ E,
coefficients a_i ∈ R and n ∈ N, it can be written as a linear combination of indicator functions:
f = Σ_{i=1}^n a_i 1_{A_i}
For simple functions f and g:
f g is simple
For g such that g(x) ≠ 0 ∀x ∈ E, f/g is simple
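As a computational illustration (the finite base set, helper names and coefficients below are arbitrary choices, not part of the theory above), simple functions can be represented as lists of (coefficient, set) pairs and evaluated pointwise; a product of simple functions is again simple because it is constant on each intersection A_i ∩ B_j:

# Sketch: simple functions on a finite base set E as (coefficient, set) pairs.
E = set(range(10))

def indicator(A):
    # Indicator function 1_A
    return lambda x: 1 if x in A else 0

def simple(terms):
    # Simple function f = sum of a_i * 1_{A_i}, with terms = [(a_i, A_i), ...]
    return lambda x: sum(a * indicator(A)(x) for a, A in terms)

f = simple([(2.0, {0, 1, 2}), (5.0, {2, 3})])   # f(2) = 7.0
g = simple([(1.0, {1, 2, 3, 4})])

# f*g rewritten as a simple function over the pairwise intersections A_i ∩ B_j
fg = simple([(2.0 * 1.0, {0, 1, 2} & {1, 2, 3, 4}),
             (5.0 * 1.0, {2, 3} & {1, 2, 3, 4})])

for x in E:
    assert fg(x) == f(x) * g(x)   # the two representations agree pointwise
print([f(x) for x in E])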
Through a limiting process of sequences of simple functions a large class of E-measurable functions
can be built. Necessary tools to derive limits of sequences:
lim sup (a_n) is (very loosely) the infimum of the set of eventual supremums. Specifically,
lim sup (a_n) = inf_n sup_{m≥n} a_m,   lim inf (a_n) = sup_n inf_{m≥n} a_m
For a sequence of functions f_n: if for every x the sequence a_n := f_n(x) is increasing, that is to say
that f_n increases pointwise, then the pointwise limit always exists (possibly infinite) and is denoted
by f_n ↑ f. If every such sequence is decreasing, that is to say that f_n decreases pointwise, then the
limit always exists and is denoted by f_n ↓ f.
Proof: For some sequence of E-measurable functions fn , the pointwise defined functions inf (fn ),
sup (fn ), lim inf (fn ), and lim sup (fn ) are all E-measurable. Hence the limit, if it exists i.e. lim inf (fn ) =
lim sup (fn ), also is E-measurable.
1. sup (fn ):
Let s := sup (f_n). The Borel σ-algebra on R can be generated by the set of intervals of the form [−∞, r].
As f : (E, E) → (F, F) is E/F measurable ⇔ ∀S ∈ F0 where F = σF0, f^{-1}(S) ∈ E, it is sufficient to
prove that the preimage under s of every set of the form [−∞, r] lies in E. To do this consider that ∀r:
s^{-1}[−∞, r] = ⋂_n f_n^{-1}[−∞, r]
since s(x) ≤ r if and only if f_n(x) ≤ r for every n.
2. inf (fn )
Let i := inf (fn ). It must be noted that i = − sup (−fn ). Hence by the above proof i is E-measurable.
3. lim inf (f_n) and lim sup (f_n): combine steps 1 and 2, since lim sup (f_n) = inf_n (sup_{m≥n} f_m) and
lim inf (f_n) = sup_n (inf_{m≥n} f_m).
Hence, it has been established that limits of E-measurable functions are E-measurable.
The functions at our disposal to build all E-measurable functions are the simple functions previously
defined. To define a broad class of E-measurable functions, we will decompose them into positive and
negative functions. We will first show that all positive (and by sign change negative) E-measurable
functions can be built from simple functions. We will then show that a numerical function is E-measurable
if and only if its positive and negative parts are. Hence the positive and negative functions are sufficient
to build all E-measurable functions.
It is intuitive that any numerical f can be written as
f = f⁺ + f⁻,   where f⁺ := max(f, 0) and f⁻ := min(f, 0)
Proof: f + ∈ E+ ⇔ pointwise limit of simple functions fn
⇐ sufficiency,
If f⁺ is the pointwise limit of simple functions then it is E-measurable, as it was previously proven that
limits of measurable functions are measurable.
⇒ necessity,
The goal is to chop up the range [0, ∞] of f⁺ into smaller and smaller pieces. Based on these pieces, create
simple functions that are the lower bound of f⁺ on each piece. As the pieces keep getting smaller, the
"resolution" increases and the sequence of simple functions converges pointwise to f⁺.
First cut [0, ∞] into [0, 2^n) and [2^n, ∞] for some integer n. Now subdivide [0, 2^n) into 2^{2n} pieces,
each of width 1/2^n. These intervals look like:
I_{k,n} = [k/2^n, (k+1)/2^n),   k ∈ {0, 1, 2, . . . , 2^{2n} − 1}
Define:
E_{k,n} := (f⁺)^{-1}(I_{k,n}),   R_n := (f⁺)^{-1}([2^n, ∞])
As f⁺ is measurable, ∀k, n: E_{k,n}, R_n ∈ E.
Now we want to define f_n so that it is a lower bound of f⁺. Hence, set:
f_n = Σ_{k=1}^{2^{2n}−1} (k/2^n) 1_{E_{k,n}} + 2^n 1_{R_n}
Hence, on each interval I_{k,n} we take the indicator function of its preimage E_{k,n} and scale it by the
lower endpoint k/2^n of that interval, creating a lower bound of f⁺ that is within 1/2^n of f⁺ wherever
f⁺ < 2^n. Hence,
lim_{n→∞} f_n = f⁺
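A small numerical sketch of the dyadic approximation above (assuming f⁺ is given as an ordinary nonnegative Python function, evaluated at a few arbitrary sample points):

import math

def staircase(f_plus, n):
    # Dyadic approximation f_n: floor f+ to the grid k/2^n on [0, 2^n), cap at 2^n above it.
    def f_n(x):
        v = f_plus(x)
        if v >= 2 ** n:                  # x lies in R_n = (f+)^{-1}([2^n, inf])
            return float(2 ** n)
        k = math.floor(v * 2 ** n)       # x lies in E_{k,n} = (f+)^{-1}([k/2^n, (k+1)/2^n))
        return k / 2 ** n
    return f_n

f_plus = lambda x: x * x                 # an arbitrary positive function as a stand-in
for n in (1, 3, 6, 10):
    f_n = staircase(f_plus, n)
    print(n, [round(f_n(x), 4) for x in (0.3, 1.7, 40.0)])
# The printed values increase with n and approach f_plus at each sample point.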
Hence every positive and negative E-measurable function can be written as a limit of simple functions.
Now we will extend this to the entire class of E-measurable functions.
f ∈ E ⇔ f + ∈ E+ ∧ f − ∈ E−
Hence all E-measurable functions are limits of simple functions, and properties of simple functions that
are preserved under pointwise limits carry over to all E-measurable functions.
A monotone class of functions M on (E, E) satisfies:
1. 1_E ∈ M
2. ∀ bounded f, g ∈ M, ∀ a, b ∈ R, af + bg ∈ M
3. For positive f_n ∈ M with f_n ↑ f, f ∈ M
1.3 Measures
For some measurable space (E, E), a measure µ is a set function:
µ : E → [0, ∞]
Such that it is countably additive. That is, for some sequence of disjoint sets An :
µ(⋃_n A_n) = Σ_n µ(A_n)
With µ(∅) = 0.
A measure is called σ-finite if there exists a sequence A_n ∈ E with ⋃_n A_n = E and µ(A_n) < ∞ for
every n.
Properties of measures:
A ⊆ B ⇒ µ(A) ≤ µ(B)
Proof: B = A ∪ (B \ A) is a disjoint union, so µ(B) = µ(A) + µ(B \ A) ≥ µ(A).
∀n, A_n ⊆ A_{n+1} ⇒ lim_{n→∞} µ(A_n) = µ(⋃_n A_n)
Proof: write ⋃_n A_n as the disjoint union of the increments B_n := A_n \ A_{n−1} (with A_0 := ∅) and
apply countable additivity; the partial sums of µ(B_n) are exactly µ(A_n).
µ(⋃_n A_n) ≤ Σ_n µ(A_n)
Proof: disjointify by setting B_n := A_n \ ⋃_{i<n} A_i; the B_n are disjoint with ⋃_n B_n = ⋃_n A_n, and
µ(B_n) ≤ µ(A_n) by monotonicity.
A space (E, E, µ) is called a measure space.
A pre-measure is similar to a measure but is defined on some algebra E0 . The below allows its extension
to the entire σ-algebra generated by it:
Carathéodory extension theorem: For some algebra E0 on E, a pre-measure µ0 on E0 can be extended to
a measure on the entire space (E, σE0).
1.3.1 Dirac measure
Assigns measure 1 to sets A ∈ E containing a given x ∈ E:
δx (A) := 1A (x)
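A one-line computational sketch of the Dirac measure (the point and test sets are arbitrary):

def dirac(x):
    # Dirac measure at x: delta_x(A) = 1_A(x)
    return lambda A: 1 if x in A else 0

d = dirac(3)
print(d({1, 2, 3}), d({5, 6}))   # 1 0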
The Lebesgue pre-measure µ0 on the algebra generated by the half-open intervals is defined by
µ0(a, b] = b − a
Moreover, the finite additivity property is satisfied: for disjoint intervals,
µ0(⋃_{i=1}^n (a_i, b_i]) = Σ_{i=1}^n (b_i − a_i)
For two measure spaces (E, E, µ_E) and (F, F, µ_F), the product measure on the product space
(E × F, E ⊗ F, µ_{E⊗F}) is defined by:
µ_{E⊗F}(A × B) := µ_E(A) µ_F(B),   A ∈ E, B ∈ F
1.4 Integrals
Denoted µf, or ∫_E f dµ, or ∫_E µ(dx) f(x).
The integral is defined first for indicator functions, then for simple functions, and then for all
E-measurable functions by taking limits of simple functions.
For indicator functions:
µ1_A = µ(A)
For simple functions:
µ(Σ_{i=1}^n a_i 1_{A_i}) = Σ_{i=1}^n a_i µ(A_i)
For a positive E-measurable function f⁺, define f_n as previously. Hence define:
µf⁺ := lim_{n→∞} µf_n
To extend to all E-measurable functions, as earlier define:
µf = µf⁺ + µf⁻
In this case we also impose that at least one of µf⁺ and µf⁻ is finite, to avoid an undefined ∞ + (−∞).
If both are finite (equivalently µ|f| < ∞), then f is called µ-integrable.
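As a sketch of the construction, for a measure concentrated on finitely many points (an illustrative setting in which every function is simple), the integral reduces to a weighted sum, and µf = µf⁺ + µf⁻ can be checked directly:

# Integral against a finite discrete measure mu given by point masses (weights are illustrative).
weights = {0: 0.5, 1: 1.5, 2: 2.0, 3: 0.25}

def mu(A):
    return sum(w for x, w in weights.items() if x in A)

def integral(f):
    # mu f = sum_x f(x) * mu({x}) for a measure supported on finitely many points
    return sum(f(x) * w for x, w in weights.items())

# Simple function f = 2 * 1_{0,1} - 3 * 1_{2}: its integral is 2*mu({0,1}) - 3*mu({2}).
f = lambda x: 2 * (x in {0, 1}) - 3 * (x in {2})
print(integral(f), 2 * mu({0, 1}) - 3 * mu({2}))   # both -2.0

# Positive/negative split f = f+ + f- (with f- <= 0, as in these notes):
f_plus = lambda x: max(f(x), 0)
f_minus = lambda x: min(f(x), 0)
print(integral(f) == integral(f_plus) + integral(f_minus))   # True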
f = g µ-almost everywhere ⇒ µf = µg
Proof: Positivity
f ≥ 0 → µf ≥ 0, with µf = 0 if and only if f = 0 µ-almost everywhere
Proof: Monotonicity
f ≤ g → µf ≤ µg
Proof: Linearity
µ(af + bg) = a · µf + b · µg
Proof: Monotone Convergence
fn ↑ f → µfn ↑ µf
Characterization of the integral: a functional L on positive E-measurable functions satisfies
L(f) = µf for some measure µ on (E, E)
⇐⇒
1. f = 0 =⇒ L(f) = 0
2. L(af + bg) = aL(f) + bL(g)
3. f_n ↑ f =⇒ L(f_n) ↑ L(f)
1.4.7 Densities
Consider measure space (E, E, µ). Establish a function p ∈ E+ . Consider then that for any A ∈ E the
mapping:
A ↦ ∫_A p dµ := µ(p 1_A)
forms a measure ν, which is said to have density p w.r.t. µ; it is denoted ν = pµ and called the indefinite
integral of p.
Proof: pµ is a measure — ν(∅) = µ(p 1_∅) = 0, and for disjoint A_n countable additivity follows by
applying monotone convergence to the partial sums Σ_{k≤n} p 1_{A_k}.
vf = µ(f · p)
That is:
∫ f dν = ∫ f p dµ
Proof: this holds for indicators by the definition of ν, extends to simple functions by linearity, and to all
f ∈ E₊ by monotone convergence.
v ≪ µ : µ(A) = 0 =⇒ v(A) = 0
Called absolute continuity because for finite measures there is some relation to the definition of abso-
lute continuity of functions.
Radon-Nikodym theorem:
For measures ν and µ on (E, E) such that µ is σ-finite and ν ≪ µ, ∃p ∈ E₊ such that ν = pµ, and this
p is unique up to µ-almost-everywhere equality.
Effectively, ν is zero whenever µ is zero. What the above theorem states is that there is a (µ-almost
everywhere unique) positive E-measurable function p such that for any set A ∈ E:
ν(A) = ∫_A p dµ
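A discrete sketch (illustrative point-mass measures, not the general theorem): when µ and ν are point masses with ν ≪ µ, the density p is simply the ratio of masses wherever µ is positive, and ν(A) = ∫_A p dµ can be verified by direct summation:

# Discrete illustration of nu = p*mu: both measures are point masses on a finite set.
mu_w = {0: 1.0, 1: 2.0, 2: 0.0, 3: 4.0}
nu_w = {0: 0.5, 1: 6.0, 2: 0.0, 3: 2.0}   # nu << mu: nu vanishes wherever mu does

def p(x):
    # Radon-Nikodym derivative dnu/dmu at points where mu puts mass
    return nu_w[x] / mu_w[x] if mu_w[x] > 0 else 0.0   # value on mu-null points is arbitrary

def nu(A):
    return sum(w for x, w in nu_w.items() if x in A)

def integral_over(A, f, w):
    return sum(f(x) * wx for x, wx in w.items() if x in A)

A = {1, 2, 3}
print(nu(A), integral_over(A, p, mu_w))   # both 8.0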
Consider the familiar formula:
F([a, b]) = ∫_{[a,b]} f(x) dx,   where dF/dx = f
By analogy with this, the function p is called the Radon-Nikodym derivative of ν w.r.t. µ and is denoted
heuristically as:
dν/dµ = p
This is not a true derivative; it rather describes how the ν measure changes relative to the µ measure.
The notation also supports the formula
∫ f dν = ∫ f p dµ
as
∫ f dν = ∫ f (dν/dµ) dµ
feels intuitive.
1.4.8 Image/pushforward measures and integrating w.r.t image measures
Consider a function f : E → F between measurable spaces (E, E) and (F, F) such that f is E/F-
measurable. Endow (E, E) with measure µ. Hence a measure v on (F, F), called the image measure,
can be established by defining:
v := µ ◦ f −1
That is, for any set A ∈ F, ν(A) = µ(f^{-1}(A)).
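A sketch of the image measure for a discrete µ and an arbitrary map f (weights and map chosen purely for illustration):

# Pushforward of a discrete measure mu on E under a map f into F.
mu_w = {0: 0.2, 1: 0.3, 2: 0.5}          # measure on E = {0, 1, 2}
f = lambda x: x % 2                       # map into F = {0, 1}

def pushforward(A):
    # nu(A) = mu(f^{-1}(A))
    return sum(w for x, w in mu_w.items() if f(x) in A)

print(pushforward({0}), pushforward({1}), pushforward({0, 1}))  # 0.7 0.3 1.0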
Probability is modelled as a measure space (Ω, H, P). Ω is called the sample space. The σ-algebra H is
called the event hoard. P is a measure such that:
P : H → [0, 1]
With P(Ω) = 1 and P(∅) = 0.
One can specify P either by a direct formula for every A ∈ H, or by its values on a π-system Π such that
H = σΠ (two probability measures that agree on Π then agree on all of H).
2.1 Properties of P
Along with the basic properties of measures and finite measures, the below hold for any A, B and An
∈ H:
P(A^c) = 1 − P(A)
Follows directly from P(Ω) = 1 and finite additivity.
This is continuity from above: for a decreasing sequence A_n,
lim_{n→∞} P(A_n) = P(⋂_n A_n)
It is proved by taking complements, creating a sequence A_n^c that increases to ⋃_n A_n^c. Continuity
from below gives
lim_n P(A_n^c) = P(⋃_n A_n^c)
and the result follows since P(A_n^c) = 1 − P(A_n) and P(⋃_n A_n^c) = 1 − P(⋂_n A_n).
A stochastic process is a collection of random variables indexed by some set T,
X = {X_t | t ∈ T}
If T is finite then X is called a random vector. There are two ways to view a stochastic process. One is
as a distribution of sample paths over the sample space Ω = ×_{t∈T} E_t with hoard H = ⊗_{t∈T} E_t.
The second is as a sequence of random variables X_t taking values in (E_t, E_t).
The distribution of a random variable X taking values in (E, E) is the image measure
µ := P ◦ X^{-1}
It is often denoted P_X. For some set T ∈ E, the probability that X ∈ T is defined as:
P(X ∈ T) := P ◦ X^{-1}(T)
The cumulative distribution function is F(x) := P(X ≤ x).
Often X takes values in (R, B, λ), where λ is the Lebesgue measure. If the distribution has a density f
w.r.t. λ, then:
P(X ∈ A) = ∫_A f dx
For a discrete random variable taking values in a countable set D ∈ E, assign mass f(x) to each point
x ∈ D, so that P(X ∈ A) = Σ_{x ∈ A ∩ D} f(x). Clearly the distribution has density f w.r.t. the counting
measure on D. The function f is then called the probability mass function rather than a density function.
Marginal distribution: the individual distribution of each X_t. Denote these as µ_t and their densities
w.r.t. λ as f_t.
Consider the product space ×_{t∈T} X_t. As each marginal distribution is a measure, the product
distribution is the product measure µ := ⊗_T µ_t, and its density w.r.t. ⊗_T λ is:
f := Π_T f_t
Now we want X_i : ([0, 1), B_{[0,1)}, λ) → {0, 1} such that λ(X_i^{-1}(A)) = Ber(A) for all A ∈ 2^{{0,1}}.
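A sketch of the standard construction (assuming Ber is the fair Bernoulli measure, so Ber({1}) = 1/2): take X_i(ω) to be the i-th binary digit of ω, and estimate λ(X_i^{-1}({1})) by sampling uniformly from [0, 1):

import random

def digit(i, omega):
    # X_i(omega): the i-th binary digit of omega in [0, 1)
    return int(omega * 2 ** i) % 2

# Monte Carlo estimate of lambda(X_i^{-1}({1})) for a few digits i;
# each estimate should be close to Ber({1}) = 1/2.
random.seed(0)
samples = [random.random() for _ in range(100_000)]
for i in (1, 2, 5):
    est = sum(digit(i, w) for w in samples) / len(samples)
    print(i, round(est, 3))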
2.5 Expectation
For some numerical random variable X : (Ω, H, P) → (R, B), the expectation is defined as:
EX := PX
That is,
EX := ∫_Ω X dP
Constructing this is the exact same as constructing the integral of numerical measurable functions.
2.5.1 Properties of E
Follows all the properties of the integral:
Proof: Monotonicity
X1 ≤ X2 → EX1 ≤ EX2
Proof: Linearity
E(aX_1 + bX_2) = a EX_1 + b EX_2
Proof: Monotone Convergence
X_n ↑ X → EX_n ↑ EX
Fatou's lemma: For some sequence X_n of positive numerical random variables,
E(lim inf X_n) ≤ lim inf EX_n
Proof: apply monotone convergence to the increasing sequence Y_n := inf_{m≥n} X_m, noting that
Y_n ≤ X_n, so EY_n ≤ EX_n.
For a random variable X with distribution
µ = P ◦ X^{-1}
and an E-measurable numerical function h, the below details how to calculate Eh(X). First, use the
fact that:
∫_E h dµ = ∫_Ω h ◦ X dP
As Eh(X) = ∫_Ω h(X) dP, this gives
Eh(X) = µh
This transfer identity follows from the usual construction: it holds for h = 1_A since µ(A) = P(X^{-1}(A)),
and extends to simple functions by linearity and to all h ∈ E₊ (and then to general h) by monotone
convergence. Effectively, the expected value of a function of a random variable is its integral w.r.t. the
probability distribution, or w.r.t. its density if one exists.
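A sketch on a finite probability space (weights and values chosen arbitrarily) checking that Eh(X) computed directly over Ω agrees with µh computed from the distribution:

# Eh(X) computed two ways on a finite probability space.
P = {"a": 0.25, "b": 0.25, "c": 0.5}        # probability measure on Omega
X = {"a": -1.0, "b": 0.0, "c": 2.0}         # random variable X : Omega -> R
h = lambda x: x * x

# Directly: Eh(X) = integral of h(X) over Omega w.r.t. P
lhs = sum(h(X[w]) * p for w, p in P.items())

# Via the distribution mu = P o X^{-1}: Eh(X) = mu h
mu = {}
for w, p in P.items():
    mu[X[w]] = mu.get(X[w], 0.0) + p
rhs = sum(h(x) * p for x, p in mu.items())

print(lhs, rhs)   # both 2.25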
Consider some n-dimensional real (R^n, B^n)-valued random vector X. Suppose it has density f_X w.r.t.
the Lebesgue measure. Consider an invertible, differentiable g : R^n → R^n and define the random
variable Z := g(X). Hence, the probability density of Z w.r.t. the Lebesgue measure is:
f_Z = (f_X ◦ g^{-1}) / ‖∂Z/∂X‖
where ‖∂Z/∂X‖ denotes the absolute value of the Jacobian determinant of g, evaluated at g^{-1}(z).
By applying invertibility,
f_Z = (f_X ◦ g^{-1}) · ‖∂X/∂Z‖
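A numerical sketch of the density-transformation formula with X standard normal and g(x) = eˣ (an arbitrary illustrative choice); the transformed density should integrate to roughly 1:

import math

# X ~ N(0, 1) with density f_X; Z = g(X) = exp(X), so g^{-1}(z) = log z
# and |dZ/dX| = exp(x) = z.  The formula gives the lognormal density.
f_X = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
f_Z = lambda z: f_X(math.log(z)) / z          # (f_X o g^{-1}) / |dZ/dX|

# Crude check: f_Z should integrate to about 1 over (0, infinity).
total = sum(f_Z(0.001 + k * 0.001) * 0.001 for k in range(200_000))
print(round(total, 4))   # approximately 1.0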
2.7 Integral transforms
The MGF of a numerical random variable X : (Ω, H, P) → (R, B) with distribution µ := PX −1 is an
integral transform defined as:
M (s) := EesX
M(s) = ∫_Ω e^{sX} dP
Using the pushforward measure, M(s) = ∫_R e^{sx} µ(dx). Provided M is finite in a neighbourhood of 0,
M^{(k)}(0) = EX^k
The MGF simplifies working with sums of independent X_i:
M_{Σ_i X_i}(s) = Π_i M_{X_i}(s)
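A quick numerical check of the product rule (Bernoulli(p) summands with p = 0.3 and n = 5 chosen arbitrarily): the sum of n independent Bernoullis is Binomial(n, p), so its MGF computed from the binomial pmf should equal the n-th power of the single-Bernoulli MGF:

import math

p, n, s = 0.3, 5, 0.7          # illustrative parameter choices

# MGF of a single Bernoulli(p): E e^{sX} = (1 - p) + p e^{s}
M_bern = (1 - p) + p * math.exp(s)

# Sum of n independent Bernoullis is Binomial(n, p); compute its MGF directly from the pmf.
M_sum = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) * math.exp(s * k)
            for k in range(n + 1))

print(round(M_sum, 6), round(M_bern ** n, 6))   # equal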
The σ-algebra generated by a random variable X taking values in (E, E) is σX := {X^{-1}(A) | A ∈ E}.
Proof: σX is a σ-algebra — preimages preserve complements and countable unions, since
X^{-1}(A^c) = X^{-1}(A)^c and X^{-1}(⋃_n A_n) = ⋃_n X^{-1}(A_n), and X^{-1}(E) = Ω.
The σ-algebra generated by a stochastic process {X_t}_T is the smallest σ-algebra containing the union
of the individual σX_t, denoted
⋁_T σX_t
Doob-Dynkin lemma (proof omitted for now):
f ∈ σX ⇐⇒ f = h ◦ X for some h ∈ E