
LECTURE 1: PROBABILITY SPACES AND MEASURES

The goal of these lecture notes is to give an exposition of the material in the same order, and
at approximately the same level of completeness, as presented in class (ORF526, Fall 2024).
Always refer to the textbooks if in doubt and for a more complete treatment.
Please do not distribute these notes as they may be incomplete. You are encouraged to send
any questions on the notes and observed typos to [email protected], thanks in advance!
E. Rebrova

Definition 1.1. A probability space is a triplet (Ω, F, P), where Ω is an arbitrary set
called the sample space, F is a σ-algebra, and P is a probability measure.
The precise definitions of a σ-algebra and a probability measure below formalize the intu-
ition that:
• Ω is the set of all possible outcomes.
• F denotes all events that can be observed (an event is a set of possible outcomes).
• P assigns numerical values (probabilities, or likelihoods) to all observable events.
Sigma-algebras (F)
Let Ω be a set, and denote the collection of all subsets of Ω by 2^Ω.
Definition 1.2. A collection of subsets F ⊂ 2^Ω is a σ-algebra on Ω if:
(1) (Empty set) ∅ ∈ F.
(2) (Closed under complement) For any A ∈ F, its complement A^c := Ω \ A ∈ F.
(3) (Closed under countable union) For any countable collection of subsets Ai ∈ F, its
union ∪i Ai ∈ F.
If we require that (3) only holds for finite collections, then we call F an algebra of subsets.
We call the sets in the (σ-)algebra observable, or measurable, subsets of Ω. This is because
the probability (likelihood/measure) will be defined exactly for these sets.
An easy question: why is any σ-algebra an algebra? Think through why it is natural to
require (1)–(3) if we want to describe “all observable/possible events”. How many elements
are in the smallest possible σ-algebra on Ω?
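To make Definition 1.2 concrete, here is a minimal Python sketch (not part of the original notes; the helper is_sigma_algebra and the example collections are illustrative). On a finite Ω, closure under countable unions reduces to closure under pairwise unions, so the three conditions can be checked by brute force; the first example also confirms that the smallest possible σ-algebra is {∅, Ω}.

    def is_sigma_algebra(omega, F):
        """Check conditions (1)-(3) of Definition 1.2 for a finite Omega.
        On a finite Omega, countable unions reduce to pairwise unions."""
        omega = frozenset(omega)
        F = {frozenset(A) for A in F}
        if frozenset() not in F:                       # (1) empty set
            return False
        if any(omega - A not in F for A in F):         # (2) closed under complement
            return False
        if any(A | B not in F for A in F for B in F):  # (3) closed under (finite) union
            return False
        return True

    Omega = {1, 2, 3}
    print(is_sigma_algebra(Omega, [set(), Omega]))               # True: the smallest sigma-algebra
    print(is_sigma_algebra(Omega, [set(), {1}, {2, 3}, Omega]))  # True
    print(is_sigma_algebra(Omega, [set(), {1}, Omega]))          # False: complement {2, 3} missing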
A convenient trick to define a σ-algebra is to start from a given collection of events “that
can definitely happen”, and extend to all possible countable unions and complements.
Definition 1.3. The σ-algebra generated by A ⊆ 2^Ω is the smallest σ-algebra containing A.
Formally,
    σ(A) = ⋂ { G ⊆ 2^Ω : G is a σ-algebra and A ⊆ G }.
As an exercise to confirm that the definition above is well-posed, check that any (arbitrary)
intersection of σ-algebras is itself a σ-algebra.
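As a brief sanity check for this exercise (a sketch added here, not from the notes): if {G_α}_{α∈I} is any family of σ-algebras on Ω and G = ∩_{α∈I} G_α, then
    ∅ ∈ G_α for every α  ⟹  ∅ ∈ G,
    A ∈ G  ⟹  A ∈ G_α for every α  ⟹  A^c ∈ G_α for every α  ⟹  A^c ∈ G,
and the same argument works for countable unions. In particular, σ(A) in Definition 1.3 is well-defined: the intersection is over a non-empty family, since 2^Ω is itself a σ-algebra containing A.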

Remark 1.4. Different sets of generators can result in the same generated σ-algebra. For
example, consider Ω = {1, 2, 3}: the generating collections {{1}} and {{2, 3}} produce the
same σ-algebra. However (and this is convenient!), to show that two generated σ-algebras
coincide, it is enough to check that each of them contains the set of generators of the other
(think this through!).
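The generated σ-algebra of Definition 1.3 can be computed by brute force on a finite Ω. The Python sketch below (added here for illustration; generate_sigma_algebra is a hypothetical helper, not from the notes) closes the generators under complements and finite unions and confirms the example of Remark 1.4.

    def generate_sigma_algebra(omega, generators):
        """Close a collection of subsets of a finite Omega under complement and
        (finite) union; on a finite Omega this yields sigma(A) of Definition 1.3."""
        omega = frozenset(omega)
        F = {frozenset(), omega} | {frozenset(A) for A in generators}
        while True:
            new = F | {omega - A for A in F} | {A | B for A in F for B in F}
            if new == F:
                return F
            F = new

    Omega = {1, 2, 3}
    F1 = generate_sigma_algebra(Omega, [{1}])
    F2 = generate_sigma_algebra(Omega, [{2, 3}])
    print(sorted(map(set, F1), key=len))  # [set(), {1}, {2, 3}, {1, 2, 3}]
    print(F1 == F2)                       # True: both generators give the same sigma-algebra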

Definition 1.5. If the set Ω is endowed with a topology,¹ then the Borel σ-algebra is the
σ-algebra generated by the collection of all open sets, and is denoted by B(Ω).
Definition 1.6 (Open sets in R). We say that a set U ⊆ R is open if for every x ∈ U , there
exists an ϵ-ball centered at x that is contained in U . (That is, for every x ∈ U , there exists ϵ > 0
such that for all y ∈ R such that |x − y| < ϵ, we have y ∈ U .) We say that a set C ⊆ R is
closed if its complement C^c = R \ C is open.
Some key properties of open sets are that (arbitrary) unions of open sets are open, and
finite intersections of open sets are open. In R, examples of open sets include open intervals
of the form (a, b)—i.e., excluding the endpoints—and their unions; and examples of closed
sets include closed intervals of the form [a, b].
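Why does the second property require finiteness? A standard illustration (added here, not from the notes):
    ∩_{n≥1} (−1/n, 1/n) = {0},
a countable intersection of open intervals that is not open, since no ϵ-ball around 0 is contained in {0}.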
The standard topology in R defined above easily generalizes to R^n : we say that a set
U ⊆ R^n is open if for every x ∈ U , there exists ϵ > 0 such that for all y ∈ R^n such that
∥x − y∥ < ϵ, we have y ∈ U (here, ∥·∥ denotes the Euclidean norm in R^n ).
Exercise 1.7. Verify that all of the following alternative definitions of the Borel σ-algebra
B(R) are equivalent:
σ({(a, b) : a < b ∈ R}) = σ({[a, b] : a < b ∈ R}) = σ({(−∞, b] : b ∈ R})
= σ({(−∞, b] : b ∈ Q}) = σ({O ⊂ R open}).
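As a hint for this exercise (added here, not spelled out in the notes), the equivalences follow by combining identities such as
    [a, b] = ∩_{n≥1} (a − 1/n, b + 1/n),        (a, b) = ∪_{n≥1} [a + 1/n, b − 1/n],
    (−∞, b] = ∩_{q∈Q, q>b} (−∞, q],             (a, b] = (−∞, b] \ (−∞, a],
with the fact that every open set O ⊂ R is a countable union of open intervals (for instance, of intervals with rational endpoints), and then using the defining properties of a σ-algebra together with the minimality of the generated σ-algebra.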
(Probability) measures (P and µ)
Definition 1.8. Let F be a σ-algebra on Ω. A function P : F → [0, 1] is called a probability
measure if
(1) P(∅) = 0 and P(Ω) = 1.
(2) P(A) ≥ 0 for any A ∈ F.
(3) P is a countably additive function; namely, for any countable disjoint collection of
subsets Ai ∈ F (i.e., Ai ∩ Aj = ∅ if i ≠ j), we have P(⊔i Ai) = Σi P(Ai).²
How can we interpret these defining properties in terms of the likelihood of events?
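For a concrete finite illustration (not an example from the notes; the names Omega, F, P below are illustrative), take a fair six-sided die: Ω = {1, . . . , 6}, F = 2^Ω, and P(A) = |A|/6. The short Python sketch below checks properties (1) and (3) of Definition 1.8 on this space.

    from fractions import Fraction
    from itertools import combinations

    Omega = frozenset(range(1, 7))              # a fair six-sided die

    def powerset(s):
        """All subsets of s, giving the sigma-algebra F = 2^Omega."""
        s = list(s)
        return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

    F = powerset(Omega)                         # here every event is observable
    P = lambda A: Fraction(len(A), len(Omega))  # uniform probability measure

    print(P(frozenset()), P(Omega))             # 0 and 1: property (1)
    A, B = frozenset({1, 2}), frozenset({5})    # two disjoint events
    print(P(A | B) == P(A) + P(B))              # True: additivity, property (3)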
Remark 1.9. Some additional related definitions and notations:
• If we do not require P(Ω) = 1, then P is called a measure, which is typically denoted
by µ : F → [0, ∞], and (Ω, F, µ) is called a measure space (the pair (Ω, F) alone is
called a measurable space).
• If µ(Ω) < ∞, the measure is called finite.
• If there exists a countable family of subsets Ai such that µ(Ai ) < ∞ and ∪i Ai =
Ω, then the measure is called σ-finite (think of the real line, which is a union of
countably many length one segments).
¹ Recall that a topology for Ω is a collection of subsets X ⊂ 2^Ω such that ∅, Ω ∈ X ; any arbitrary (finite
or infinite) union of members of X belongs to X ; and any intersection of any finite number of members of X
belongs to X . Members of X are called open sets. For us, we focus on the standard topology on R defined
by the open sets such that each point has some ϵ-neighborhood that is contained in the set.
² We use the notation A ⊔ B to denote the union of two disjoint sets A and B.

• If we reformulate (3) to involve only finite collections of disjoint sets, we say that µ
is a finitely additive non-negative set-function.
• If we do not require µ to be non-negative, it is called a signed measure.
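Some concrete instances of these notions (added here for illustration): the counting measure µ(A) = |A| on (N, 2^N) is σ-finite but not finite, since N = ∪_k {k} with µ({k}) = 1 while µ(N) = ∞; Lebesgue measure (length) on R is likewise σ-finite but not finite; and the counting measure on R is not σ-finite, since the sets of finite counting measure are exactly the finite sets, and no countable union of finite sets covers R.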

Some properties of (probability) measures. Note that if A ⊂ B, then P(A) ≤ P(B).
Indeed, B = A ⊔ (B \ A) and P(B) = P(A) + P(B \ A) ≥ P(A) by non-negativity and
finite additivity of measures. This property is called monotonicity of (probability) measures.
Exercise 1.10. Prove the following properties of probability measures:
• (Sub-additivity/union bound) If A ⊆ ∪i Ai, then P(A) ≤ Σi P(Ai).
• (Continuity from below) If A1 ⊆ A2 ⊆ . . . and ∪i Ai = A, then P(An) → P(A) as n → ∞.
• (Continuity from above) If A1 ⊇ A2 ⊇ . . . and ∩i Ai = A, then P(An) → P(A) as n → ∞.
Show that for arbitrary (not necessarily probability) measures µ : F → [0, ∞], the last
property can fail but the first two still hold.
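For the failure of continuity from above, a standard counterexample (added here as a hint) is Lebesgue measure µ (length) on R with A_n = [n, ∞):
    A_1 ⊃ A_2 ⊃ . . . ,    ∩_n A_n = ∅,    but    µ(A_n) = ∞ for every n,
so µ(A_n) does not converge to µ(∅) = 0. (The usual proof of continuity from above needs µ(A_1) < ∞, which is automatic for probability measures.)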

Examples of probability spaces


Why do we define the probabilities on the events in F rather than on individual outcomes?
Why don’t we always consider F = 2^Ω ? It is possible that (a) we do not want to, or (b) we
cannot.
Let’s first discuss (a). The σ-algebra represents the amount of information available, so we
might want to emphasize the lack of information. For example, in a sequence of experiments
of unspecified length, at time t = t0 , we only have information about the results of the
experiments that have occurred up to t0 (e.g., think of a sequence of throws of a die).

Example 1.11. Consider n coin tosses. Then the sample space is Ωn = {H, T}^n, which
contains all possible sequences of n faces from the experiment. Any sequence can be observed
(i.e., F = 2^Ωn) and is equiprobable (i.e., P({w}) = 2^{−n} for any w ∈ {H, T}^n). For any event
A ∈ F, we have P(A) = Σ_{w∈A} 2^{−n}. Note that P, Ω and F all depend on n.
What if we want to model a sequence of n tosses in one space, and allow the set of
observable events to grow with time as we see more? We could fix Ω = Ωn, and define a
monotone sequence of σ-algebras F1 ⊆ F2 ⊆ · · · ⊆ Fn with Fn = F as follows. For l < n,
let Fl consist of all events A ∈ F with the following property: if ω = (ωl | τ_{n−l}) is in A, where
ωl denotes the outcomes of the first l coin tosses in the sequence ω and τ_{n−l} denotes the
remaining outcomes in ω, then any other sequence ω′ = (ωl | τ′_{n−l}) must also be in A, for any
other sequence τ′_{n−l} of n − l coin tosses (i.e., ω′ has the same first l outcomes, but the tail is
allowed to be different).
The intuition is that Fl models the case where we do not know the outcomes beyond the
l-th experiment, and hence our observable events in Fl are not able to “distinguish” between
the outcomes in the tail. Check that Fl satisfies the definition of a σ-algebra.
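The construction of the σ-algebras Fl can be made completely explicit for small n. The Python sketch below (added here for illustration; the helper F is hypothetical and not part of the notes) builds Fl for n = 3 coin tosses as the collection of all unions of “prefix blocks”, i.e., events determined by the first l tosses only, and checks that the sequence is monotone.

    from itertools import product

    n = 3
    Omega_n = list(product("HT", repeat=n))      # all 2^n sequences of n tosses

    def F(l):
        """Events determined by the first l tosses: all unions of prefix blocks."""
        prefixes = list(product("HT", repeat=l))
        blocks = [frozenset(w for w in Omega_n if w[:l] == p) for p in prefixes]
        events = set()
        for mask in range(2 ** len(blocks)):     # every subcollection of blocks
            union = frozenset().union(*(b for i, b in enumerate(blocks) if mask >> i & 1))
            events.add(union)
        return events

    print([len(F(l)) for l in range(n + 1)])     # [2, 4, 16, 256], i.e., |F_l| = 2^(2^l)
    print(F(1) <= F(2) <= F(3))                  # True: F_1 ⊆ F_2 ⊆ F_3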
What if we do not know how many tosses will be made, like in the example from the first
class? If we do not know n, we can define Ω = Ω∞ to be the set of infinite (countable)
sequences, and define the monotone sequence F1 ⊆ F2 ⊆ . . . of σ-algebras Fl , l = 1, 2, . . . ,
as before, and consider the measurable space (Ω, F) with F := σ(∪_{l≥1} Fl).³ Note that in
this case we do not have to know how many tosses were made exactly.
How can we define a probability measure on this space? Given any observable event
A ∈ Fn after n coin tosses, what should the probability of A be? (For a more concrete
example that we can formulate precisely using the definitions we have seen, consider the
probability space (Ω, Fn , Pn ) equipped with the uniform probability measure Pn . Observe
that even though the sample space is infinite, the σ-algebra Fn in this case is finite (think
this through!). What is Pn (A)?)
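One way to think about this (a hint added here, not spelled out in the notes): any A ∈ Fn is a disjoint union of “prefix” events, each of which fixes the first n tosses exactly, so if An ⊆ {H, T}^n denotes the set of length-n prefixes that appear in A, the uniform measure gives
    Pn(A) = |An| · 2^{−n},
consistent with Example 1.11.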

Example 1.12. We could also consider a countable sample space Ω and an infinite σ-algebra.
For example, this is the case for a random variable with the Poisson distribution, where we
have Ω = {0, 1, 2, 3, . . .}, F = 2^Ω, and P({k}) = (λ^k / k!) e^{−λ}. Note that in this case,
individual outcomes cannot be equiprobable: otherwise, by countable additivity of the
probability measure P (recall Definition 1.8), the probability of the whole sample space Ω
would be either 0 or infinite, contradicting P(Ω) = 1.
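As a quick consistency check against Definition 1.8 (a worked line added here): by countable additivity and the exponential series,
    P(Ω) = Σ_{k≥0} (λ^k / k!) e^{−λ} = e^{−λ} Σ_{k≥0} λ^k / k! = e^{−λ} · e^{λ} = 1.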

We will discuss (b) in the next lecture.

³ This is the cylindrical σ-algebra for the set of infinite binary sequences, generated by the cylinder sets,
which are the sets in which the outcomes of a finite number of indices (i.e., coin tosses) are specified. Note
that for any individual outcome, say ω = (T, T, T, . . . ), the singleton {ω} is not in Fl for any l ≥ 1, but {ω} ∈ F.
