0% found this document useful (0 votes)

25 views8 pages

Prob 1 Lecture 1

This document provides an introduction to probability theory and probability spaces. It discusses the axiomatic formulation of probability using measure theory and Kolmogorov's formulation of a probability space. It also covers distributions, distribution functions, and how these relate to defining probability measures on the real line.

Uploaded by

Han Jiang

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

25 views8 pages

Prob 1 Lecture 1

Uploaded by

Han Jiang

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

Lecture 1

1 Course Introduction
Probability theory has its roots in games of chance, such as coin tosses or throwing dice.
By playing these games, one develops some probabilistic intuition. Such intuition guided the
early development of probability theory, which is mostly concerned with experiments (such
as tossing a coin or throwing a die) with finitely many possible outcomes. The extension
to experiments with infinitely (even uncountably) many possible outcomes, such as sampling
a real number uniformly from the interval [0, 1], requires more sophisticated mathematical
tools. This is accomplished by Kolmogorov’s axiomatic formulation of probability theory
using measure theory, which lays the foundation for modern probability theory. Therefore we
will first recall some basic measure theory and Kolmogorov’s formulation of probability space
and random variables.
In this course, we will focus on the study of a sequence of independent real-valued random
variables. In particular, we will study the empirical average of a sequence of independent and
identically distributed (i.i.d.) real-valued random variables and prove the Law of Large Num-
bers (LLN), as well as the Central Limit Theorem (CLT) which governs the fluctuation of the
empirical average. Along the way we will study Fourier transforms of probability measures
and different notions of convergence of probability measures, in particular the weak conver-
gence. Other topics we aim to cover, which arises from the study of sums of independent
random variables, include: infinitely divisible distributions, stable distributions, large devi-
ations, extreme order statistics. See the bibliography for references, with [?, ?] being our
primary references. If time permits, we will also show how to use measure theory to construct
conditional probabilities/expectations when we condition on events of probability 0, which
is needed when we study experiments (random variables) with uncountably many possible
outcomes.
Topics on dependent random variables, such as Markov chains, martingales, and stationary
processes, will be covered in a second course. Topics on continuous time processes, in particular
stochastic calculus and stochastic differential equations, is usually covered in a third course.
Other topics, such as Lévy processes, large deviations, Malliavin calculus, interacting particle
systems, models from statistical physics such as percolation and the Ising model, population
genetic models, random graphs, random matrices, stochastic partial differential equations,
etc, are the subjects of special topics courses.

2 Probability Space
Let us first motivate the measure-theoretic formulation of probability theory. Let Ω be the
space of possible outcomes for a random experiment. If the experiment is the throw of a die,
then we take Ω := {1, 2, · · · , 6}. We also specify a probability mass function f : Ω → [0, 1]
P
with i∈Ω f (i) = 1 such that f (i) is the probability of seeing the outcome i. If the experiment
has an uncountable number of outcomes, such as drawing a random number uniformly from
[0, 1], then we take Ω := [0, 1]. However, there is no sensible way of defining a probability

1
P
mass function f : Ω → [0, 1] with f (x) = 0 for all x ∈ [0, 1] and x∈Ω f (x) = 1 (the sum is
undefined).
An alternative is to define probabilities for sets of outcomes in Ω, also called events. Thus
we also introduce F, a collection of events (subsets of Ω), and a set function P : F → [0, 1]
such that P (A) is the probability of the event A ∈ F. Since F is the collection of events for
which we can determine the probabilities using P (·), the larger is F, the more information we
have.
We expect F and P to satisfy some natural conditions:

• Ω ∈ F and ∅ ∈ F (with P (Ω) = 1 and P (∅) = 0),

• If A ∈ F, then Ac ∈ F as well (with P (Ac ) = 1 − P (A) ≥ 0),

• If A, B ∈ F, then A ∩ B, A ∪ B ∈ F (with P (A ∪ B) = P (A) + P (B) if A ∩ B = ∅).

A collection of sets F satisfying the above properties is called an algebra (or field). A set
function P (·) satisfying the above properties is called a finitely-additive probability measure.
An important technical condition we need to further impose on F is that, F is a σ-algebra
(or σ-field), i.e., ∪n∈N An ∈ F (or equivalently ∩n∈N An ∈ F) if An ∈ F for each n ∈ N.
Similarly, we need to further assume that P is a countably-additive probability measure on
the σ-algebra F, i.e., if (An )n∈N is a sequence of pairwise-disjoint sets in F, then P(∪n∈N An ) =
P
n∈N P (An ). It is easy to see that

Exercise 2.1 A finitely-additive probability measure P on a σ-algebra F satisfies countable

P
additivity, i.e., P(∪n∈N An ) = n∈N P (An ) for any sequence of pairwise-disjoint sets An ∈
F, if and only if P (Bn ) ↓ 0 for any sequence of Bn ∈ F decreasing to the empty set ∅.

We can now give Kolmogorov’s formulation of a probability space:

Definition 2.2 [Probability space] A probability space is a triple (Ω, F, P ), where Ω is a

set, F is a σ-algebra on Ω, and P is a probability measure (always assumed to be countably-
additive) on the measurable space (Ω, F).

Remark. Typically we will assume that the probability measure P is complete, i.e., if A ∈ F
has P (A) = 0 (called a null set), then we assume that B ∈ F for all B ⊂ A. If P is not
complete, then we can always complete it by enlarging F to include all subsets of null sets.

If Ω is a finite or countable set, then a natural choice of F is the collection of all subsets of
Ω, and specifying P becomes equivalent to specifying a probability mass function f : Ω → [0, 1]
P
with x∈Ω f (x) = 1. When Ω is uncountable, a natural question is: How to construct σ-fields
F on Ω and countably-additive probability measures P on F? The first part of this question
is addressed in

Exercise 2.3 If B is an algebra on Ω, then there is a unique σ-algebra F such that it is the
smallest σ-algebra containing B. We call F the σ-algebra generated by B.

The second part of the above question is addressed by

Theorem 2.4 [Caratheodory Extension Theorem] If P is a countably-additive proba-

bility measure on an algebra B, then P extends uniquely to a countably-additive probability
measure on the σ-algebra F generated by B.

2
The proof of Theorem ?? can be found in any of the references in the bibliography; see [?,
Sec. 1.2] for a proof sketch. Theorem ?? reduces the construction of countably additive prob-
ability measures on σ-algebras to the construction of countably additive probability measures
on algebras.
We now focus on the case Ω = R. A natural σ-algebra on R (in fact for any topological
space) is the Borel σ-algebra B, which is the smallest σ-algebra containing all the open and
closed sets. It turns out that to specify a probability measure on (R, B), it suffices to specify
its distribution function on R.

Definition 2.5 [Distribution Function] Let P be a countably additive probability measure

on (R, B). Then F : R → [0, 1] defined by F (x) := P ((−∞, x]) is called the distribution
function of P .

Theorem 2.6 [Correspondence between Distribution Functions and Probability

Measures on R] If F is the distribution function of a countably-additive probability measure P
on (R, B), then F is non-decreasing and right-continuous, with F (−∞) := limx→−∞ F (x) = 0
and F (∞) = 1. Conversely, any non-decreasing right-continuous function F : R → [0, 1]
with F (−∞) = 0 and F (∞) = 1 defines a unique countably-additive probability measure P on
(R, B) with P ((−∞, x]) = F (x) for all x ∈ R.

Proof. If F is the distribution function of P , then F (y) − F (x) = P ((x, y]) ≥ 0 for all x ≤ y,
while the countable-additivity of P implies F (−∞) = limx→−∞ P ((−∞, x]) = 0, F (∞) = 1,
and F (x + ) − F (x) = P ((x, x + ]) ↓ 0 as ↓ 0.
Conversely, if F is non-decreasing and right-continuous with F (−∞) = 0 and F (∞) = 1,
then we can define a set function P on intervals of the form (x, y], with x ≤ y, by P ((x, y]) :=
F (y) − F (x). Note that finite disjoint unions of such intervals (including ∅) form an algebra
I, and P extends to a finitely-additive probability measure on I. Futhermore, we note that
I generates (via countable union and countable intersection) open and closed intervals on
R, and hence B is the σ-algebra generated by I. Therefore it only remains to show that P
is countably-additive on I, so that we can then apply Caratheodory Extension Theorem to
conclude that P extends uniquely to a probability measure on (R, B).
Let An ∈ I with An ↓ ∅. By Exercise ??, we need to show that P (An ) ↓ 0. First we claim
that it suffices to verify P (An ∩ (−l, l]) ↓ 0 for any l > 0, which allows us to replace An by its
truncation Aln := An ∩ (−l, l]. Indeed, note that

P (An ) ≤ P (Aln ) + P ((−∞, −l]) + P ((l, ∞)) = P (Aln ) + F (−l) + (1 − F (l)),

where we can first send n → ∞ and then make F (−l) + (1 − F (l)) arbitrarily small by picking
l sufficiently large (possible because F (−∞) = liml→∞ F (−l) = 0 and F (∞) = 1).
We may now assume An ↓ ∅ and there exists l > 0 such that An ⊂ (−l, l] for each n ∈ N.
Suppose that P (An ) ↓ α > 0. We will derive a contradiction by constructing a decreasing
sequence of non-empty closed subsets Dn ⊂ An , which necessarily has ∩n Dn 6= ∅. Since
An ∈ I, we can write An as the disjoint union of intervals ∪ki=1 n
(an,i , bn,i ]. Since the right-
continuity of F implies that for any x ∈ R, P ((x, x + ]) = F (x + ) − F (x) ↓ 0 as ↓ 0, we
can choose en,i ∈ (an,i , bn,i ) such that Bn := ∪ki=1
n
(en,i , bn,i ] ⊂ An has P (An \Bn ) ≤ α/10n .
n
Let En := ∩i=1 Bi . Then

En = ∩ni=1 Ai \(Ai \Bi ) ⊃ ∩ni=1 Ai \ ∪ni=1 (Ai \Bi ),

3
and hence
n
X ∞
X
P (En ) ≥ P (∩ni=1 Ai ) − P (∪ni=1 (Ai \Bi )) ≥ P (An ) − P (Ai \Bi ) ≥ α − α/10i > α/2.
i=1 i=1

Therefore En 6= ∅. Note that B̄i ⊂ Ai , and hence Dn := Ēn = ∩ni=1 B̄i ⊂ ∩li=1 Ai = An ,
which is a decreasing sequence of closed subsets of the compact interval [−l, l]. Therefore
∩∞ ∞
n=1 Dn ⊂ ∩n=1 An must contain at least one point, which contradicts our assumption.

Remark 2.7 Similar to R, on Rd , finite disjoint unions of rectangles of the form (a1 , b1 ] ×
(a2 , b2 ] × · · · × (ad , bd ] forms an algebra which generates the Borel σ-algebra B on Rd .

Definition 2.8 [Atomic Measures] A distribution function F , with F (x) = 0 for x < a
and F (x) = 1 for x ≥ a, defines the so-called delta measure at a and is denoted by δa .
P
A probability measure µ is called atomic if µ = n∈N cn δan for some sequence an ∈ R and
P
cn ≥ 0 with n cn = 1.

Definition 2.9 [Absolutely Continuous, Singular, and Singular Continuous Mea-

sures] A probability measure µ on (R, B) is called absolutely continuous with respect to
the Lebesgue measure λ, denoted by µ << λ, if for all A ∈ B(R) with λ(A) = 0, we also have
µ(A) = 0. We say µ is singular with respect to λ, denoted by µ ⊥ λ, if there exists a set
A0 ∈ B(R) with λ(A0 ) = 0, such that µ(A0 ) = 1. We say µ is singular continuous with
respect to λ, if µ ⊥ λ and µ contains no atoms, i.e., µ({a}) = 0 for all a ∈ R.

Remark 2.10 Lebesgue’s Decomposition Theorem implies that every probability measure µ
on (R, B) can be written uniquely as µ = αµ1 + (1 − α)µ2 for some α ∈ [0, 1], where µ1 << λ
and µ2 ⊥ λ. Furthermore, µ2 = βν1 + (1 − β)ν2 for some β ∈ [0, 1], where ν1 is atomic and ν2
is singular continuous.

Exercise 2.11 Show that a probability measure µ on (R, B) contains no atoms if and only if
its distribution function F is continuous. Construct a probability measure µ which is singular
continuous with respect to the Lebesgue measure.

3 Random Variable, Distribution and Expectation

In the previous section, we interpreted Ω as the space of possible outcomes of an experiment.
When we have multiple experiments with different spaces for their outcomes, it’s more useful
to use an abstract probability space (Ω, F, P) from which all randomness originate. The
experiments we perform are then realized as deterministic functions of the outcome of the
experiment we perform on the abstract probability space (Ω, F, P) (which can be interpreted
as a master experiment). This leads to the formulation of an “experiment” as

Definition 3.1 [Random Variable] A real-valued random variable X is a measurable map

from X : (Ω, F) → (R, B), i.e., for each Borel set A ∈ B, we have X −1 (A) ∈ F.

Remark 3.2 A random variable X taking values in a general measurable space (E, G)
(e.g., (Rd , B) or any complete separable metric space equipped with the Borel σ-algebra) is
just a measurable map from (Ω, F) to (E, G). Multiple measurable functions can be defined on
(Ω, F), leading to (generally dependent) random variables taking values in possibly different
spaces.

4
Exercise 3.3 Let X be a measurable map from (Ω, F, P) to a measurable space (E, G), equipped
with a σ-algebra G. Then the set function Q : G → [0, 1] defined by Q(A) := (P ◦ X −1 )(A) =
P(X −1 (A)), for all A ∈ G, is a probability measure on (E, G).

Definition 3.4 [Distribution of a Random Variable] Let X : (Ω, F, P) → (E, G) be an

E-valued random variable. The induced probability measure P ◦ X −1 on (E, G) is called the
distribution of X under P.

Remark 3.5 If X1 , · · · , Xd : (Ω, F, P) → (R, B) are d real-valued random variables, then it

can be shown that X := (X1 , · · · , Xd ) : Ω → Rd is an (Rd , B(Rd ))-valued random variable.
The induced probability measure P ◦ X −1 on (Rd , B(Rd )) is called the joint distribution of
X1 , X2 , · · · and Xd .

Remark 3.6 The study of random variables on a nice enough measurable space (E, G) (in
particular, complete separable metric space with Borel σ-algebra) can be reduced to the study
of real-valued random variables. All we need to do is to apply to X a sufficiently large class
of measurable test functions {fi }i∈I , with fi : (E, G) → (R, B), so that we can determine the
distribution of X from the joint distribution of {fi (X)}i∈I . Note that for measurable fi , fi (X)
is a real-valued random variable.

For a real-valued random variable X : (Ω, F, P) → (R, B), we need to define the classic
notion of expectation (or R average) in our current measure-theoretic setting. This amounts
to defining the integral ω∈Ω X(ω)P(dω) of X on the probability space (Ω, F, P), which calls
for the theory of Lebesgue integration on a general measure space. Let us recall briefly how
Lebesgue integration is constructed.
Firstly, for X of the form (called simple functions) X(ω) := ki=1 ci 1Ai (ω), with Ai ∈ F
P

and ci ∈ R, we can define the integral

Z k
X
X(ω)P(dω) := ci P(Ai ).
i=1

Note that linear combinations of simple functions are still simple, and the integral defined
above is a linear operator on simple functions. Furthermore, the integral is a bounded operator
on the space of simple function equipped with the supremum norm k · k. More precisely, if X
is simple, then Z
X(ω)P(dω) ≤ kXk,

where kXk := supω∈Ω |X(ω)|. Consequently, if Xn are simple functions with kXn −Xk → 0 for
some Rlimiting function X on Ω (note that the limit of measurable functions
R is also measurable),
then Xn P(dω) must converge to a limit, which we define to be XP(dω).
We then observe that every bounded measurable function X can be approximated in
supremum norm by simple functions. Indeed, if we assume w.l.o.g. that kXk = 1, then we
can approximate X by Xn := n+1 i
P
i=−n−1 n 1An,i (ω), with An,i := {ω : X(ω) ∈ [i/n, (i + 1)/n)}.
Having constructed the integral for bounded measurable functions, we can then construct
the integral for arbitrary non-negative measurable functions X by
Z nZ o
XP(dω) := sup f (ω)P(dω) : 0 ≤ f ≤ X, kf k < ∞ ,

5
R
and X is said to be integrable if XP(dω) < ∞. A general measurable function X is said
to be integrable if its positive part X + := X ∨ 0 and negative part X − = (−X) ∨ 0 are both
integrable, which is equivalent to |X| being integrable. In this case we then define
Z Z Z
XP(dω) := X P(dω) − X − P(dω).
+

R
The L1 (Ω, F, P)-norm of a random variable X is defined by |X|1 := |X|P(dω), which is
finite if and only if X is integrable.
For integrable random variables defined on the probability space (Ω, F, P), we will intro-
duce the notation E[X] to denote the integral of X over Ω w.r.t. P, which is also called the
expectation or mean of X. If we let α denote the probability distribution of X on R under
P, i.e., α = P ◦ X −1 , then not surprisingly, one can show that
R R
Exercise 3.7 E[X] = R x α(dx), and E[g(X)] = R g(x)α(dx) for any g : (R, B) → (R, B)
such that g is integrable w.r.t. α on R.

We collect here some basic properties of the expectation E, which is a functional on

integrable real-valued random variables.

Theorem 3.8 Suppose that X, Y : (Ω, F, P) → (R, B) are integrable. Then

(i) If X ≥ 0 a.s., then E[X] ≥ 0.

(ii) For all a, b ∈ R, E[aX + bY ] = aE[X] + b E[Y ].

(iii) | E[X] | ≤ E[ |X| ].

Note that (i) and (ii) imply that E[X] = 0 if X = 0 a.s., and E[X] ≤ E[Y ] if X ≤ Y a.s.
We now collect some important inequalities.

Theorem 3.9 [Jensen’s Inequality] If X is an integrable random variable, and φ : R → R

is a convex function such that φ(X) is also integrable, then E[φ(X)] ≥ φ(E[X]).

1 1
Theorem 3.10 [Hólder’s Inequality] If p, q ∈ (1, ∞) with p + q = 1, and X, Y are two
random variables with E[ |X|p ] < ∞ and E[ |Y |q ] < ∞, then
1 1
E[ |X| · |Y | ] ≤ E[ |X|p ] p E[ |Y |q ] q .
1
In particular, E[ |X| ] ≤ E[ |X|p ] p for any p ≥ 1. The case p = q = 2 is the Cauchy-Schwarz
inequality.

Theorem 3.11 [Markov’s Inequality] Let X be a non-negative real-valued random vari-

able. Then
E[X1{X≥a} ] E[X]
P(X ≥ a) ≤ ≤ for all a > 0.
a a
In particular, if Y is any real-valued random variable and φ is any non-negative increasing
function defined on [0, ∞), then

E[φ(|Y |)1{|Y |≥a} ] E[φ(|Y |)]

P(|Y | ≥ a) ≤ ≤ for all a > 0.
φ(a) φ(a)

6
When φ(x) = x2 , this is also called Chebyshev’s inequality.

Exercise 3.12 Prove the following one-sided Markov inequality: If E[Y ] = 0 and E[Y 2 ] =
2
σ 2 and a > 0, then P(Y ≥ a) ≤ σ2σ+a2 , and equality holds for some random variable Y (Hint:
First try to construct a Y that achieves equality).

Exercise 3.13 Prove the Paley-Zygmund inequality: If X ≥ 0 and E[X 2 ] < ∞, then for
2
any 0 ≤ a < E[X], we have P(X > a) ≥ (E[X]−a)
E[X 2 ]
. A special case is when a = 0, which is
called the second moment method for lower bounding P(X > 0).

4 Convergence of Random Variables and Their Expectations

For a sequence of random variables (Xn )n∈N defined on a probability space (Ω, F, P), taking
values in a metric space (E, ρ) equipped with Borel σ-algebra G, we have several notions of
convergence. The first is the notion of everywhere convergence, i.e.,

∀ ω ∈ Ω, X(ω) := lim Xn (ω) exists.

n→∞

We leave it as an exercise to check that X is also a random variable. However, this notion of
convergence is too strong because it is insensitive toward the probability measure P. A more
sensible notion is

Definition 4.1 [Almost Sure Convergence] A sequence of random variables (Xn )n∈N de-
fined on (Ω, F, P) is said to converge almost surely (abbreviated by a.s.) to a random variable
X, if there exists a set Ωo ∈ F with P(Ωo ) = 1, such that Xn (ω) → X(ω) for every ω ∈ Ωo .

Almost sure convergence allows us to ignore what happens on a set of probability 0 w.r.t. P.
A weaker notion is

Definition 4.2 [Convergence in Probability] A sequence of random variables (Xn )n∈N

defined on (Ω, F, P) is said to converge in probability to a random variable X, if

lim P{ω : |Xn (ω) − X(ω)| ≥ } = 0 ∀ > 0.

n→∞

Example 4.3 If we take (Ω, F, P) := ([0, 1], B, λ), the unit interval with Borel σ-algebra and
Lebesgue measure, then Xn : [0, 1] → R, defined by Xn (ω) = n on [0, 1/n] and Xn (ω) = 0
on (1/n, 1], is a sequence of random variables converging a.s. to X ≡ 0. If we define instead
Xn (ω) := n on the interval ( n−1
P Pn
i=1 1/i, i=1 1/i] projected onto [0, 1] by identifying (k, k + 1]
with (0, 1] for each k ∈ Z, and Xn (ω) = 0 for other choices of ω, then Xn converges in
probability (but not almost surely!) to X ≡ 0.

Remark 4.4 [Convergence in Distribution] Both a.s. convergence and convergence in

probability require the sequence of random variables Xn to be defined on the same probability
space (Ω, F, P). To phrase it in another way, we say that (Xn )n∈N are coupled. However,
what we are often interested in is rather the distribution of Xn , and whether µn := P ◦ Xn−1
converges in a suitable sense on the metric space (E, ρ) where (Xn )n∈N takes their values. This
leads to the notion of convergence of Xn in distribution to X, or weak convergence
of µn to µ := P ◦ X −1 . The convergence of Xn to X in distribution is a statement about
the distributions µn and µ on (E, ρ), and has nothing to do with how (Xn )n∈N and X are

7
coupled on (Ω, F, P). However, if (Xn )n∈N and X are coupled in such a way that Xn → X in
probability (or even a.s.), then we can conclude that Xn converges to X in distribution. We
will study in detail the notion of convergence in distribution for real-valued random variables
when we come to study the Central Limit Theorem.

We now collect some important results on the relation between the convergence of a se-
quence of real-valued random variables (Xn )n∈N , and the convergence of their expectations.
We shall assume below that all random variables are real-valued and defined on a probability
space (Ω, F, P), with expectation denoted by E[·].

Theorem 4.5 [Bounded Convergence Theorem] If (Xn )n∈N is a sequence of uniformly

bounded random variables, and Xn converges in probability to X, then limn→∞ E[Xn ] = E[X].

Theorem 4.6 [Fatou’s Lemma] If (Xn )n∈N is a sequence of non-negative random variables
and Xn → X in probability, then E[X] ≤ lim inf n→∞ E[Xn ].

An easy way to remember the direction of the inequality above is to consider Example ??.

Theorem 4.7 [Monotone Convergence Theorem] If (Xn )n∈N is a sequence of non-

negative random variables such that Xn ↑ X a.s., then limn→∞ E[Xn ] = E[X].

Theorem 4.8 [Dominated Convergence Theorem] If (Xn )n∈N is a sequence of random

variables converging in probability to X, and there exists a random variable Y with E[|Y |] < ∞
and |Xn | ≤ Y a.s. for all n ∈ N, then limn→∞ E[Xn ] = E[X].

In practice, when we apply Theorems ??–??, it’s usually the case that Xn → X in the
stronger sense of a.s. convergence. The proof of the above theorems can be found in any
graduate textbook on analysis, or in any of the references below.

Exercise 4.9 Find counter-examples to Theorems ?? and ?? when we remove the non-
negativity assumption on Xn .

References
[1] R. Durrett. Probability: Theory and Examples, Duxbury Press.

[2] S.R.S. Varadhan. Probability Theory, Courant Lecture Notes 7.

[3] A. Klenke. Probability Theory–A Comprehensive Course, Springer.

[4] O. Kallenberg. Foundations of Modern Probability, Springer.

[5] L. Breiman. Probability, Society for Industrial and Applied Mathematics.

[6] W. Feller. An Introduction to Probability Theory and Its Applications, Vol II, John Riley
& Sons, Inc.

[7] K.L. Chung. A Course in Probability Theory, Academic Press.

Unit B: Quadrilaterals and Triangles
No ratings yet
Unit B: Quadrilaterals and Triangles
1 page
Overview of Platonic Solids
No ratings yet
Overview of Platonic Solids
4 pages
1D Array Practice Problems in C
No ratings yet
1D Array Practice Problems in C
1 page
Understanding Polygons: Types & Names
No ratings yet
Understanding Polygons: Types & Names
5 pages
Intersecting Circles: Input
No ratings yet
Intersecting Circles: Input
1 page
Angles of A Polygon
100% (1)
Angles of A Polygon
3 pages
Hexagon Radius Calculation in Circle
No ratings yet
Hexagon Radius Calculation in Circle
1 page
Triangle Centers Jigsaw Activity
No ratings yet
Triangle Centers Jigsaw Activity
3 pages
Pink Republic Poetry
No ratings yet
Pink Republic Poetry
23 pages
Grade 6 Physics Quiz Questions
No ratings yet
Grade 6 Physics Quiz Questions
4 pages
Max and Min Best Fit Lines
No ratings yet
Max and Min Best Fit Lines
2 pages
3rd Grade Math: Unit Fractions
No ratings yet
3rd Grade Math: Unit Fractions
2 pages
Understanding Quadrilaterals and Angles
No ratings yet
Understanding Quadrilaterals and Angles
1 page
Maths Syllabus Class 6
No ratings yet
Maths Syllabus Class 6
3 pages
Geometry Line and Coordinate Problems
No ratings yet
Geometry Line and Coordinate Problems
3 pages
Chemistry Basics for Students
No ratings yet
Chemistry Basics for Students
1 page
Name: - Mtap Grade 1 Read Each Item Carefully. Write The Correct Answer On The Blank Before The Number
No ratings yet
Name: - Mtap Grade 1 Read Each Item Carefully. Write The Correct Answer On The Blank Before The Number
1 page
Mathematics Q2 MELC: Content Standards
100% (1)
Mathematics Q2 MELC: Content Standards
2 pages
Integration Quiz for Additional Math Form 5
No ratings yet
Integration Quiz for Additional Math Form 5
2 pages
Properties of Rhombus Discovery Activity
No ratings yet
Properties of Rhombus Discovery Activity
2 pages
Shrinking Polygons: Input
No ratings yet
Shrinking Polygons: Input
2 pages
Instructor'S Document: Practice
No ratings yet
Instructor'S Document: Practice
3 pages
Newton's Interpolation Algorithm Explained
No ratings yet
Newton's Interpolation Algorithm Explained
5 pages
Stemology Week1
No ratings yet
Stemology Week1
1 page
Math 5 Decimal Comparison Worksheet
No ratings yet
Math 5 Decimal Comparison Worksheet
3 pages
Triangle Types and Area Calculation
No ratings yet
Triangle Types and Area Calculation
4 pages
Graph and Function
No ratings yet
Graph and Function
3 pages
ECE2111 Assignment
No ratings yet
ECE2111 Assignment
4 pages
Data Structures Quiz for Students
No ratings yet
Data Structures Quiz for Students
3 pages
Numbers
No ratings yet
Numbers
1 page

Prob 1 Lecture 1

Uploaded by

Prob 1 Lecture 1

Uploaded by

Lecture 1

• Ω ∈ F and ∅ ∈ F (with P (Ω) = 1 and P (∅) = 0),

• If A ∈ F, then Ac ∈ F as well (with P (Ac ) = 1 − P (A) ≥ 0),

• If A, B ∈ F, then A ∩ B, A ∪ B ∈ F (with P (A ∪ B) = P (A) + P (B) if A ∩ B = ∅).

Exercise 2.1 A finitely-additive probability measure P on a σ-algebra F satisfies countable

We can now give Kolmogorov’s formulation of a probability space:

Definition 2.2 [Probability space] A probability space is a triple (Ω, F, P ), where Ω is a

The second part of the above question is addressed by

Theorem 2.4 [Caratheodory Extension Theorem] If P is a countably-additive proba-

Definition 2.5 [Distribution Function] Let P be a countably additive probability measure

Theorem 2.6 [Correspondence between Distribution Functions and Probability

P (An ) ≤ P (Aln ) + P ((−∞, −l]) + P ((l, ∞)) = P (Aln ) + F (−l) + (1 − F (l)),

En = ∩ni=1 Ai \(Ai \Bi ) ⊃ ∩ni=1 Ai \ ∪ni=1 (Ai \Bi ),

Definition 2.9 [Absolutely Continuous, Singular, and Singular Continuous Mea-

3 Random Variable, Distribution and Expectation

Definition 3.1 [Random Variable] A real-valued random variable X is a measurable map

Definition 3.4 [Distribution of a Random Variable] Let X : (Ω, F, P) → (E, G) be an

Remark 3.5 If X1 , · · · , Xd : (Ω, F, P) → (R, B) are d real-valued random variables, then it

and ci ∈ R, we can define the integral

We collect here some basic properties of the expectation E, which is a functional on

Theorem 3.8 Suppose that X, Y : (Ω, F, P) → (R, B) are integrable. Then

(i) If X ≥ 0 a.s., then E[X] ≥ 0.

(ii) For all a, b ∈ R, E[aX + bY ] = aE[X] + b E[Y ].

(iii) | E[X] | ≤ E[ |X| ].

Theorem 3.9 [Jensen’s Inequality] If X is an integrable random variable, and φ : R → R

Theorem 3.11 [Markov’s Inequality] Let X be a non-negative real-valued random vari-

E[φ(|Y |)1{|Y |≥a} ] E[φ(|Y |)]

4 Convergence of Random Variables and Their Expectations

∀ ω ∈ Ω, X(ω) := lim Xn (ω) exists.

Definition 4.2 [Convergence in Probability] A sequence of random variables (Xn )n∈N

lim P{ω : |Xn (ω) − X(ω)| ≥ } = 0 ∀  > 0.

Remark 4.4 [Convergence in Distribution] Both a.s. convergence and convergence in

Theorem 4.5 [Bounded Convergence Theorem] If (Xn )n∈N is a sequence of uniformly

Theorem 4.7 [Monotone Convergence Theorem] If (Xn )n∈N is a sequence of non-

Theorem 4.8 [Dominated Convergence Theorem] If (Xn )n∈N is a sequence of random

[2] S.R.S. Varadhan. Probability Theory, Courant Lecture Notes 7.

[3] A. Klenke. Probability Theory–A Comprehensive Course, Springer.

[4] O. Kallenberg. Foundations of Modern Probability, Springer.

[5] L. Breiman. Probability, Society for Industrial and Applied Mathematics.

[7] K.L. Chung. A Course in Probability Theory, Academic Press.

You might also like

lim P{ω : |Xn (ω) − X(ω)| ≥ } = 0 ∀ > 0.