Week 2

This document summarizes a lecture on statistics. It begins by introducing random variables and their distributions. Random variables are defined as functions that map outcomes from a sample space to real numbers. Cumulative distribution functions and probability density functions are also defined for both discrete and continuous random variables. The document then discusses the mean or expected value of a random variable. It proves that the expected value is a linear operator. Finally, it defines variance as a measure of how much a random variable deviates from its mean and lists some properties of variance.


Week 2 - Lecture 1

MATH2901 - Higher Theory of Statistics

Libo, Li

June 5, 2020

Random Variables and Their Distribution

We assume that we work on a probability space (Ω, A, P) and denote the outcomes in Ω by ω.

Definition
A random variable (r.v.) X is a function from Ω to R such that ∀x ∈ R, the set A_x = {ω ∈ Ω : X(ω) ≤ x} belongs to the σ-algebra A.

In this course, random variables are usually denoted by
upper-case characters and non-random objects are denoted by
lower-case characters.
Remark
What is important is that a random variable is just a function
from Ω to R. It is not some magical object that is hard to get
your hands on.

Example
Let the sample space be Ω = [0, 1] and let the probability P be simply the function which measures length. That is, for any subset of [0, 1] of the form (a, b) where a, b ∈ [0, 1],

P((a, b)) = b − a.

Then the function Y : Ω → R defined by Y(ω) := ω² is a random variable on the sample space ([0, 1], P).
Cumulative Distribution Function

Definition
The cumulative distribution function of a r.v. X is defined by

FX(x) := P({ω : X(ω) ≤ x}) = P(X ≤ x)

Example
Consider flipping three coins; the sample space is

Ω = {HHH, HHT, HTH, THH, TTH, THT, HTT, TTT}
  = {ω1, ω2, ω3, ω4, ω5, ω6, ω7, ω8}

We define a function X on Ω by X(ω) = number of H's. We see that X(ω1) = X(HHH) = 3 and X(ω2) = X(HHT) = 2.

Example
Assume that heads and tails are equally likely and that consecutive tosses are independent. Therefore

FX(0) = P(X ≤ 0) = P({TTT}) = 1/8
FX(1) = P(X ≤ 1) = P(X = 0) + P(X = 1)
      = P({HTT}) + P({THT}) + P({TTH}) + P({TTT}) = 1/2
FX(2) = P(X ≤ 2) = 1 − P(X = 3) = 1 − P({HHH}) = 7/8
FX(3) = 1
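These values are easy to confirm by simulation; a minimal sketch using only the Python standard library, comparing the empirical CDF of the number of heads with the values computed above:

    import random

    def num_heads():
        # Three independent fair tosses; count the heads.
        return sum(random.random() < 0.5 for _ in range(3))

    n = 100_000
    samples = [num_heads() for _ in range(n)]
    for x in range(4):
        ecdf = sum(s <= x for s in samples) / n
        print(f"F_X({x}) ~ {ecdf:.3f}")   # expect 0.125, 0.5, 0.875, 1.0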

Theorem
Suppose FX is the cumulative distribution function of X, then
1. it is bounded between zero and one, and

   lim_{x→−∞} FX(x) = 0  and  lim_{x→∞} FX(x) = 1

2. it is non-decreasing, that is, if x ≤ y then FX(x) ≤ FX(y).
3. for any x < y,

   P(x < X ≤ y) = P(X ≤ y) − P(X ≤ x) = FX(y) − FX(x)

Theorem (continued)
4. it is right continuous, that is,

   lim_{n→∞} FX(x + 1/n) = FX(x)

5. it has a finite left limit at every point, and

   P(X < x) = lim_{n→∞} FX(x − 1/n)

   which we denote by FX(x−).

Remark
A useful observation is that

P(X = x) = FX(x) − FX(x−) =: ∆FX(x)

That is, the probability of X = x is the size of the jump/change of the cumulative distribution function at the point x.
Discrete Random Variables

Definition
A r.v. X is said to be discrete if the image of X consists of countably many values x for which P(X = x) > 0.

For example, the number of heads in three consecutive tosses of a coin is a discrete random variable.

Definition
The probability function of a discrete r.v. X is the function ∆FX(x) = P(X = x), and it satisfies

Σ_{all possible x} P(X = x) = 1
Continuous Random Variables

Definition
A r.v. X is said to be continuous if the image of X takes a continuum of values.

Definition
The probability density function of a continuous r.v. X is a real-valued function fX on R with the property that

P(X ∈ A) = ∫_A fX(y) dy

for any 'Borel' subset A of R.
For a function f : R → R to be a valid density function, the function f must satisfy the following properties:
1. for all x ∈ R, f(x) ≥ 0
2. ∫_{−∞}^{∞} f(x) dx = 1

Example
For x ∈ [0, 1] consider f(x) = 3x² (and f(x) = 0 otherwise); then f is a valid probability density function.
Useful Properties (for continuous random variables): For any continuous random variable X with density fX,
1. by taking A = (−∞, x], P(X ∈ (−∞, x]) = P(X ≤ x) and

   FX(x) = ∫_{−∞}^{x} fX(y) dy

2. for any a < b ∈ R, one can compute P(a < X ≤ b) by

   FX(b) − FX(a) = ∫_{a}^{b} fX(x) dx

3. from the fundamental theorem of calculus and 1. we have

   F′X(x) = (d/dx) ∫_{−∞}^{x} fX(y) dy = fX(x).
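These properties can be seen concretely; a minimal sympy sketch, assuming the density f(x) = 3x² on [0, 1] from the example above (with f = 0 outside [0, 1]):

    import sympy as sp

    x, y = sp.symbols("x y")
    f = 3 * y**2                                  # density on [0, 1]

    # The density integrates to one over its support.
    print(sp.integrate(f, (y, 0, 1)))             # -> 1

    # Property 1: F_X(x) is the integral of the density up to x (for x in [0, 1]).
    F = sp.integrate(f, (y, 0, x))                # -> x**3
    print(F)

    # Property 2: P(1/2 < X <= 1) = F_X(1) - F_X(1/2)
    print(F.subs(x, 1) - F.subs(x, sp.Rational(1, 2)))   # -> 7/8

    # Property 3: differentiating the CDF recovers the density.
    print(sp.diff(F, x))                          # -> 3*x**2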
Example
The lifetime (in hours) X of a light bulb is assumed to have density function fX(x) = e^(−x) for x > 0. The probability that the lifetime of the light bulb is between 2 and 3 hours is given by

P(2 < X ≤ 3) = ∫_2^3 e^(−y) dy = [−e^(−y)]_2^3 = e^(−2) − e^(−3)
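As a sanity check, one can evaluate this integral numerically; a minimal sketch, assuming scipy is available:

    import math
    from scipy.integrate import quad

    # P(2 < X <= 3) for the density f(x) = exp(-x), x > 0
    p, _err = quad(lambda y: math.exp(-y), 2, 3)
    print(p)                                   # ~0.0855
    print(math.exp(-2) - math.exp(-3))         # closed form, same value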
Example
Consider a r.v. X such that

X = 5  with probability 1/5
X = 10 with probability 4/5

If we have 100 observations, then we would expect to obtain 5 twenty times and 10 eighty times. If we take the average,

(5 × 20 + 10 × 80)/100 = 5 × (20/100) + 10 × (80/100)
                       = 5 × (1/5) + 10 × (4/5)

which is the sum over outcomes of (outcome × the probability of that outcome).
Definition
The expectation of a r.v. X is denoted by E(X) and it is computed by
1. Let X be a discrete r.v., then

   E(X) := Σ_{all possible x} x P(X = x) = Σ_{all possible x} x ∆FX(x)

2. Let X be a continuous r.v. with density function fX(x), then

   E(X) := ∫_{−∞}^{∞} x fX(x) dx
We often call the expectation of X the mean of X. Note that the mean and the average are slightly different: the mean is a property of the distribution, while the average is computed from observations.

Interpretation: The expectation E(X) has the interpretation of being the long-run average of the outcomes of X. That is, the average of observations from the r.v. X converges to E(X).

In physical models, E(X) has the interpretation of the center of mass for the function fX.
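The long-run-average interpretation is easy to see numerically; a minimal sketch (standard library only) simulating the two-point r.v. from the earlier example, whose mean is 5 × 1/5 + 10 × 4/5 = 9:

    import random

    def draw():
        # X = 5 with probability 1/5, X = 10 with probability 4/5
        return 5 if random.random() < 0.2 else 10

    for n in (10, 1_000, 100_000):
        avg = sum(draw() for _ in range(n)) / n
        print(n, avg)   # the running average approaches E(X) = 9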
Lemma
Suppose g : R → R, then the expectation of the transformed r.v. g(X) is

E(g(X)) = ∫_R g(x) fX(x) dx     (continuous)
E(g(X)) = Σ_x g(x) P(X = x)     (discrete)

Usually one is interested in computing E(X^r) for r ∈ N, which is called the r-th moment of X.
Lemma
The expectation E is linear, i.e., for any constants a, b ∈ R,

E(aX + b) = aE(X ) + b

Proof.
We shall prove only the continuous case. Using the previous lemma with g(x) = ax + b, we have

E(aX + b) = ∫_{−∞}^{∞} (ax + b) fX(x) dx
          = a ∫_{−∞}^{∞} x fX(x) dx + b ∫_{−∞}^{∞} fX(x) dx   (by linearity of the integral)
          = a E(X) + b

where the last line follows from the definition of the expectation and the fact that ∫_{−∞}^{∞} fX(x) dx = 1.
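A quick numerical illustration of linearity; a minimal sketch reusing the exponential density e^(−x) from the examples above, for which E(X) = 1:

    import random

    # Draw from the exponential distribution with E(X) = 1.
    n = 200_000
    xs = [random.expovariate(1.0) for _ in range(n)]

    a, b = 3.0, 2.0
    lhs = sum(a * x + b for x in xs) / n        # E(aX + b), estimated
    rhs = a * (sum(xs) / n) + b                 # aE(X) + b, estimated
    print(lhs, rhs)                             # both ~5.0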

Week 2 - Lecture 2

MATH2901 - Higher Theory of Statistics

Libo, Li

June 5, 2020

Variance

Definition
Let X be a r.v. and set µ = E(X). The variance of X is denoted by Var(X) and

Var(X) := E((X − µ)²)

and the standard deviation of X is the square root of the variance.

Intuitively, the variance measures on average how much the random variable deviates from its expectation/mean.
Variance

Lemma
Given a random variable X, for any constants a, b ∈ R,
1. Var(X) = E(X²) − (E(X))².
2. Var(aX) = a²Var(X)
3. Var(X + b) = Var(X)
4. Var(b) = 0

Example
Suppose X is a random variable with density fX(x) = e^(−x) for x ≥ 0. Then from

Var(X) = E(X²) − E(X)²
       = ∫_0^∞ x² e^(−x) dx − ( ∫_0^∞ x e^(−x) dx )²

we need to use integration by parts:

E(X²) = ∫_0^∞ x² e^(−x) dx
      = [−x² e^(−x)]_0^∞ + ∫_0^∞ 2x e^(−x) dx = 2E(X)
Example
We now compute E(X) by using integration by parts:

E(X) = ∫_0^∞ x e^(−x) dx
     = [−x e^(−x)]_0^∞ + ∫_0^∞ e^(−x) dx
     = 1

Therefore Var(X) = 2 − 1 = 1.
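A numerical cross-check of this calculation; a minimal sketch assuming scipy is available (the density e^(−x) is scipy's expon distribution with default parameters):

    from scipy.integrate import quad
    from scipy.stats import expon

    # Moments by numerical integration against f(x) = exp(-x), x >= 0.
    m1, _ = quad(lambda x: x * expon.pdf(x), 0, float("inf"))
    m2, _ = quad(lambda x: x**2 * expon.pdf(x), 0, float("inf"))
    print(m1, m2, m2 - m1**2)     # ~1.0, ~2.0, variance ~1.0
    print(expon.var())            # scipy agrees: 1.0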
Moment Generating Functions

The moments of a random variable, that is,

E[X^r],  r = 1, 2, . . . ,

measure the shape of the distribution: mean, variance, skewness, kurtosis, etc.
Moment Generating Functions

Definition
The moment generating function (MGF) of a r.v. X is denoted by

MX(u) := E(e^(uX))

and we say that the MGF of X exists if MX(u) is finite in some interval containing zero.

Remark
The moment generating function of X exists if there exists h > 0 such that MX(u) is finite for u ∈ [−h, h].
Example
Let X be a r.v. with density function fX(x) = e^(−x) for x > 0. Then the moment generating function of X is

E(e^(uX)) = ∫_0^∞ e^(ux) e^(−x) dx
          = ∫_0^∞ e^((u−1)x) dx = [e^((u−1)x)/(u−1)]_0^∞

from which we see that

E(e^(uX)) = 1/(1−u)  for u < 1,  and  ∞  for u ≥ 1.
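One can reproduce this computation symbolically; a minimal sympy sketch:

    import sympy as sp

    u = sp.Symbol("u", real=True)
    x = sp.Symbol("x", positive=True)
    mgf = sp.integrate(sp.exp(u * x) * sp.exp(-x), (x, 0, sp.oo))
    # sympy returns a piecewise result: 1/(1 - u) under the condition u < 1,
    # and a divergent integral otherwise.
    print(mgf)
    print(mgf.subs(u, sp.Rational(1, 2)))   # -> 2, i.e. 1/(1 - 1/2)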
Lemma
Suppose the moment generating function of a r.v. X exists, then

E(X^r) = lim_{u→0} (d^r/du^r) MX(u) =: lim_{u→0} MX^(r)(u)
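Differentiating the MGF from the previous example at zero recovers the moments of the density e^(−x); a minimal sympy sketch:

    import sympy as sp

    u = sp.Symbol("u")
    M = 1 / (1 - u)           # MGF of the density exp(-x), valid for u < 1

    for r in (1, 2, 3):
        moment = sp.limit(sp.diff(M, u, r), u, 0)
        print(r, moment)      # -> 1, 2, 6, i.e. E(X^r) = r!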
Week 2 - Lecture 3

MATH2901 - Higher Theory of Statistics

Libo, Li

June 5, 2020

Moment Generating Function - Continued

Given that the MGF of X exists, E[X^r] < ∞ for r = 1, 2, . . .

The r-th moment of X can be computed by differentiating the moment generating function r times and substituting in zero.
Theorem
Let X and Y be two r.v.s such that the moment generating functions of X and Y exist and MY(u) = MX(u) for all u in some interval containing zero. Then FX(x) = FY(x) for all x ∈ R.

Remark
The above theorem tells you that if the moment generating
function exists then it uniquely characterises the cumulative
distribution function of the random variable.

Moment Generating Function

If the MGF exists then all moments can be computed. The converse is not always true:
all the moments of a random variable may exist (i.e. be finite) and yet the MGF may not exist.
That is, there exist r.v.s such that for all r ∈ N the r-th moment is finite, but the MGF does not exist.¹

¹In the sense that there does not exist an interval containing zero such that the MGF is finite on that interval.
Example
A random variable Y is said to be a log-normal random variable with parameters (µ, σ) if the probability density function is

fY(y) = (1/(σy√(2π))) e^(−(ln y − µ)²/(2σ²)),   y > 0

From direct computation of E(Y^r), one finds that for any r ∈ N the r-th moment is given by

E(Y^r) = e^(rµ + σ²r²/2) < ∞

However, the MGF E(e^(uY)) is infinite for any u > 0, i.e. it is not finite on any interval containing zero.
Example
Consider two random variables X1 and X2 with probability density functions

fX1(x) = (1/(σx√(2π))) e^(−[ln(x)]²/(2σ²)),   x > 0
fX2(x) = fX1(x)(1 + sin(2π ln(x))),   x > 0.

For all r ∈ N, we have E(X1^r) = E(X2^r), but FX1 ≠ FX2.
From the previous example, we have for all r ∈ N,

E(X1^r) = e^(σ²r²/2).

We show that E(X2^r) gives the same answer:

E(X2^r) = ∫_0^∞ x^r fX2(x) dx = ∫_0^∞ x^r fX1(x)(1 + sin(2π ln(x))) dx
        = E(X1^r) + ∫_0^∞ x^r fX1(x) sin(2π ln(x)) dx.

We need only show that

∫_0^∞ x^r fX1(x) sin(2π ln(x)) dx = 0.
This can be done by making the substitution y = ln(x) − r (taking σ = 1 for simplicity) and using the fact that the function sin(x) is 2π-periodic and odd:

∫_{−∞}^{∞} e^((y+r)r) (1/√(2π)) e^(−(y+r)²/2) sin(2π(y + r)) dy
  = ∫_{−∞}^{∞} e^(−y²/2 + r²/2) (1/√(2π)) sin(2πy) dy = 0,

since the final integrand is an odd function of y.
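A numerical check of this cancellation; a minimal sketch assuming scipy is available (with µ = 0 and σ = 1, as in the computation above; numerical integration of the oscillatory part can be delicate, so treat the output as approximate):

    import math
    from scipy.integrate import quad

    def f1(x):
        # log-normal density with mu = 0, sigma = 1
        return math.exp(-math.log(x) ** 2 / 2) / (x * math.sqrt(2 * math.pi))

    def f2(x):
        return f1(x) * (1 + math.sin(2 * math.pi * math.log(x)))

    r = 2
    mom1, _ = quad(lambda x: x**r * f1(x), 0, float("inf"))
    mom2, _ = quad(lambda x: x**r * f2(x), 0, float("inf"))
    print(mom1, mom2, math.exp(r**2 / 2))   # all three agree (~7.389)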
Useful Inequalities
Lemma
(The Markov inequality, or Chebyshev's first inequality) For any non-negative r.v. X and a > 0,

P(X ≥ a) ≤ E(X)/a
Useful Inequalities
Lemma
(Chebyshev's second inequality) Suppose X is any r.v. with E(X) = µ, Var(X) = σ² and k > 0. Then

P(|X − µ| > kσ) ≤ 1/k²
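Both inequalities are easy to probe by simulation; a minimal sketch (standard library only) using the exponential distribution, for which µ = σ = 1:

    import random

    n = 200_000
    xs = [random.expovariate(1.0) for _ in range(n)]   # E(X) = 1, Var(X) = 1

    # Markov: P(X >= a) <= E(X)/a
    a = 3.0
    print(sum(x >= a for x in xs) / n, "<=", 1 / a)

    # Chebyshev: P(|X - mu| > k*sigma) <= 1/k^2
    k = 2.0
    print(sum(abs(x - 1.0) > k for x in xs) / n, "<=", 1 / k**2)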
Definition
A function h is convex (concave) if for any λ ∈ [0, 1] and x1 and x2 in the domain of h, we have

h(λx1 + (1 − λ)x2) ≤ (≥) λh(x1) + (1 − λ)h(x2)

Lemma
(Jensen's inequality) Suppose h is a convex (concave) function and X is a r.v. Then

h(E(X)) ≤ (≥) E(h(X))
Example
Let X be a r.v. such that for p ∈ [0, 1],

X = x1 with probability p
X = x2 with probability 1 − p

This gives E[X] = x1 p + x2 (1 − p), and if h is a convex function, we have from the definition of a convex function

h(E(X)) = h(x1 p + x2 (1 − p)) ≤ p h(x1) + (1 − p) h(x2) = E(h(X))
Application of Jensen's inequality

By using Jensen's inequality, one can show

Arithmetic Mean ≥ Geometric Mean ≥ Harmonic Mean.

That is, given a sequence of positive numbers (ai)_{i=1,...,n}, we have

(1/n) Σ_{i=1}^n ai ≥ ( Π_{i=1}^n ai )^(1/n) ≥ n ( Σ_{i=1}^n ai^(−1) )^(−1)
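A quick numerical check of the three means; a minimal sketch (standard library only, Python 3.8+ for math.prod):

    import math

    a = [1.0, 2.0, 4.0, 8.0]
    n = len(a)

    am = sum(a) / n                     # arithmetic mean
    gm = math.prod(a) ** (1 / n)        # geometric mean
    hm = n / sum(1 / x for x in a)      # harmonic mean
    print(am, gm, hm)                   # 3.75 >= 2.828... >= 2.133...
    assert am >= gm >= hm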
Proof.
To obtain the first inequality:
Let X be a r.v. such that P(X = ln ai) = 1/n.
Take h(x) = e^x.
The Geometric Mean can be rewritten as

( Π_{i=1}^n ai )^(1/n) = e^((1/n) Σ_{i=1}^n ln ai) = e^(E(X)) = h(E(X))

From Jensen's inequality we have

h(E(X)) ≤ E(h(X)) = (1/n) Σ_{i=1}^n e^(ln ai) = (1/n) Σ_{i=1}^n ai
To prove the last inequality, we apply the first inequality to the sequence (1/ai)_{i=1,...,n}. That is,

(1/n) Σ_{i=1}^n (1/ai) ≥ ( Π_{i=1}^n (1/ai) )^(1/n)  ⟹  ( Π_{i=1}^n ai )^(1/n) ≥ n ( Σ_{i=1}^n ai^(−1) )^(−1).
