
MATH2901 Revision Seminar

Part I: Probability Theory

Jeffrey Yang

UNSW Society of Statistics/UNSW Mathematics Society

August 2020

Note

The theoretical content was mostly sourced from Libo’s notes.

Most of the examples were taken from the slides Rui Tong (2018 StatSoc
Team) made for that year’s revision seminar.

All examples are presented at the end so that it isn’t as obvious which
techniques/methods should be used.

It is recommended that you refer to the official lecture notes when quoting
definitions/results.

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Basic Probability Theory

1 Axioms and Basic Results
2 Conditional Probability and Independence
3 Basic Laws
4 Bayes’ Formula

Introduction to Probability Theory

Probability theory is about modelling and analysing random experiments.


We can do this mathematically by specifying an appropriate probability
space consisting of three components:
1 a sample space, Ω, which is the set of all possible outcomes.
2 an event space, F, which is the set of all events "of interest". Here,
we can understand F as being a set of subsets of Ω which satisfies
certain conditions.
3 a probability function, P : F → [0, 1], which assigns each event in
the event space a probability.
Of course, for the model to be meaningful, P should satisfy certain axioms.

Axioms

Definition (Probability Space)


A probability space is the triple (Ω, F, P). Here, P satisfies the axioms
1 P(A) ≥ 0 for all A ∈ F
2 P(Ω) = 1
3 P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai) for mutually exclusive events A1, A2, ... ∈ F

Exercise for the reader: derive the elementary results on the next slide from the axioms.

Elementary Results

From the axioms, we are able to derive the following fundamental results:
1 If A1 , A2 , ..., Ak are mutually exclusive,
P(∪_{i=1}^k Ai) = Σ_{i=1}^k P(Ai).

2 P(∅) = 0.
3 For any A ⊆ Ω, 0 ≤ P(A) ≤ 1 and P(Ā) = 1 − P(A).
4 If B ⊂ A, then P(B) ≤ P(A). Hence, if the occurrence of B implies the
occurrence of A, then P(B) ≤ P(A).
These results can be used without proof.
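
As an illustration of how these follow from the axioms, the complement rule
is immediate: A and Ā are mutually exclusive with A ∪ Ā = Ω, so

1 = P(Ω) = P(A ∪ Ā) = P(A) + P(Ā).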

Conditional Probability

Definition (Conditional Probability)


The conditional probability that an event A occurs, given that an event B
has occurred is
P(A|B) = P(A ∩ B) / P(B).

Here, we require P(B) ≠ 0.

Given that B has occurred, the total probability for the possible results of
an experiment equals P(B). Visually, we see that the only outcomes in A
that are now possible are those in A ∩ B.
Independence

Definition (Independent Events)


Events A and B are independent if P(A ∩ B) = P(A)P(B).

Recall that for any two events A and B, we have P(A ∩ B) = P(A|B)P(B).
Thus, A and B are independent if and only if P(A|B) = P(A) (or
equivalently, P(B|A) = P(B)). Intuitively, this means that knowing event A
has occurred does not give us any information on the probability of event B
occurring (and vice versa).

Independence

Definition (Pairwise Independent Events)


For a countable sequence of events {Ai }, the events are pairwise
independent if
P(Ai ∩ Aj ) = P(Ai )P(Aj )
for all i ≠ j.

Definition ((Mutually) Independent Events)


For a countable sequence of events {Ai }, the events are (mutually)
independent if for any collection Ai1 , Ai2 , ..., Ain , we have

P(Ai1 ∩ ... ∩ Ain ) = P(Ai1 )...P(Ain ).

Independence implies pairwise independence but not vice versa. Can you
think of an example of where pairwise independence does not imply
independence?
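
One classic example, for readers who want to check their answer: toss two
fair coins and let A = {first coin is heads}, B = {second coin is heads},
C = {both coins show the same face}. Any two of these events are
independent, yet P(A ∩ B ∩ C) = 1/4 ≠ 1/8 = P(A)P(B)P(C).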
The Multiplicative Law

Definition (The Multiplicative Law)


For events A1 , A2 , we have

P(A1 ∩ A2 ) = P(A2 |A1 )P(A1 ).

For events A1 , A2 , A3 , we have

P(A1 ∩ A2 ∩ A3 ) = P(A3 ∩ A2 ∩ A1 )
= P(A3 |A2 ∩ A1 )P(A2 ∩ A1 )
= P(A3 |A1 ∩ A2 )P(A2 |A1 )P(A1 ).

We trust that the reader is able to generalise this to cases involving a
greater number of events.

The Multiplicative Law is highly useful when dealing with a sequence of
dependent trials.
The Additive Law

Definition (The Additive Law)


For events A and B, we have

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

You might have noticed that the Additive Law resembles the
Inclusion-exclusion principle. The following quote from Wikipedia sheds
some light on this:

"As finite probabilities are computed as counts relative to the cardinality of
the probability space, the formulas for the principle of inclusion–exclusion
remain valid when the cardinalities of the sets are replaced by finite
probabilities."
https://en.wikipedia.org/wiki/Inclusion-exclusion_principle

The Law of Total Probability

Definition (The Law of Total Probability)


Suppose A1, A2, ..., Ak are mutually exclusive (Ai ∩ Aj = ∅ for all i ≠ j)
and exhaustive (∪_{i=1}^k Ai = Ω = sample space); that is, A1, ..., Ak forms a
partition of Ω. Then, for any event B, we have

P(B) = Σ_{i=1}^k P(B|Ai)P(Ai).

The Law of Total Probability relates marginal probabilities to conditional
probabilities and is often used in calculations involving Bayes’ Formula.

Bayes’ Formula/Bayes’ Theorem/Bayes’ Law

Definition (Bayes’ Theorem)


For a partition A1 , A2 , ..., Ak and an event B,

P(Aj|B) = P(B|Aj)P(Aj) / Σ_{i=1}^k P(B|Ai)P(Ai) = P(B|Aj)P(Aj) / P(B)

In essence, Bayes’ Theorem allows us to reverse the order of conditioning,
provided that we know the prior probabilities P(Ai) and the conditional
probabilities P(B|Ai). When dealing with problems involving Bayes’ Theorem,
it is recommended that one draws a tree diagram.

Here, we shall adopt the frequentist interpretation of probability where
probability measures a proportion of outcomes – as opposed to the Bayesian
interpretation where probability measures a degree of belief.
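
For concreteness, a minimal sketch in R (the prior and conditional
probabilities below are hypothetical numbers, not from any particular
problem):

    prior <- c(0.02, 0.98)        # P(A1), P(A2) for a two-event partition
    likelihood <- c(0.99, 0.02)   # P(B|A1), P(B|A2)
    marginal <- sum(likelihood * prior)          # Law of Total Probability: P(B)
    posterior <- likelihood * prior / marginal   # Bayes: P(A1|B), P(A2|B)
    posterior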

Review of Random Variables

1 Discrete Random Variables
2 Continuous Random Variables
3 Cumulative Distribution Function
4 Expectation and Moments

Discrete Random Variables

Definition (Discrete Random Variable)


The random variable X is discrete if there are countably many values x for
which P(X = x) > 0.

The probability structure of X is typically described by its probability
(mass) function.

Definition (Probability (Mass) Function)


The probability function of the discrete random variable X is given by

fX (x) = P(X = x).

The following two properties are important and apply to all discrete random
variables:
1 fX(x) ≥ 0 for all x ∈ R
2 Σ_{all x} fX(x) = 1

Continuous Random Variables
A continuous random variable has a continuum of possible values.
Continuous random variables do not have a probability (mass) function, but
have the analogous (probability) density function.

Definition ((Probability) Density Function)


The density function of a continuous random variable is a real-valued
function fX on R with the property
∫_A fX(x) dx = P(X ∈ A)

for any (measurable) set A ⊆ R.

The following two properties are important and apply to all continuous
random variables:
1 fX(x) ≥ 0 for all x ∈ R.
2 ∫_{−∞}^∞ fX(x) dx = 1.

Cumulative Distribution Function (cdf)

Definition (Cumulative Distribution Function)


The cumulative distribution function (cdf) of a random variable X is
defined by
FX (x) = P(X ≤ x).
Here, X can be either continuous or discrete.

In the case where X is a continuous random variable, we have the following
important results:
1 FX(x) = ∫_{−∞}^x fX(t) dt.
2 fX(x) = F′X(x).
3 P(a ≤ X ≤ b) = ∫_a^b fX(x) dx = area under fX between a and b.
Thus, if we know one of FX or fX , we are able to derive the other. Moreover,
once we know these, we are able to derive any probability/property of X .

Important Remarks on Continuous Random Variables

Suppose X is a continuous random variable. Then we have

P(X = a) = 0 for any a ∈ R.

Hence, it is only meaningful to talk about the probability of X lying in some
subinterval(s) of R. Consequently, we have

P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b).

Thus, we typically do not have to worry about whether an interval contains
its boundary points or not.

This is NOT the case for discrete random variables.

Expectation

Definition (Expected Value of a Discrete Random Variable)


The expected value or mean of a discrete random variable X is

EX = E[X] = Σ_{all x} x · P(X = x) = Σ_{all x} x · fX(x),

where fX is the probability function of X .

Definition (Expected Value of a Continuous Random Variable)


The expected value or mean of a continuous random variable X is

E[X] = ∫_{−∞}^∞ x · fX(x) dx,

where fX is the density function of X .

In either case, E[X ] can be interpreted as the long run average of X .


Expectation of Transformed Random Variables
Often, we are interested in a transformation of a random variable. In
particular, we often examine the r-th moment of X about some constant a,
E[(X − a)^r]. The following results provide a method of calculating the
expectation of a transformation of a random variable.

Result (Transformation of a Discrete Random Variable)


Let X be a discrete random variable and let g be a function of X . Then
Eg(X) = E[g(X)] = Σ_{all x} g(x) · fX(x).

Result (Transformation of a Continuous Random Variable)


Let X be a continuous random variable and let g be a function of X . Then
Eg(X) = E[g(X)] = ∫_{−∞}^∞ g(x) fX(x) dx.

Properties of Expectation

Result (Linearity of Expectation)


Let X , Y be random variables and a, b be constants. Then

E[aX + bY ] = aE[X ] + bE[Y ].

Result (Expected Value of a Constant)


For a constant c,
E[c] = c.

In general, for dependent random variables X , Y ,

E[XY] ≠ E[X]E[Y].

Also, if g is a transformation of X , then typically,

E[g(X)] ≠ g(E[X]).

Variance and Standard Deviation

Definition (Variance)
Let µ = E[X ]. Then the variance of X , denoted Var[X ], is defined as

Var[X ] = E[(X − µ)2 ].

Observe that this is the second moment of X about µ.

Definition (Standard Deviation)


The standard deviation of X is the square root of its variance:
σ = √Var[X].

Both variance and standard deviation are measures of the spread of a
random variable. Standard deviations are in the same units as X and thus
can be more readily interpreted. However, variances are easier to work with
theoretically.
Properties of Variance

Result (Alternative Formula for Variance)


Let X be a random variable and let µ = E[X ]. Then

Var[X] = E[X^2] − µ^2.

The variance will often be calculated using this formula.

Result (Nonlinearity of Variance)


Let X be a random variable and let a be a constant. Then

Var[X + a] = Var[X ]
Var[aX] = a^2 Var[X].

Pay attention to the nonlinearity of variance. A common mistake is treating
variance as being linear.
Moment Generating Functions
Definition (Moment Generating Function)
The moment generating function (mgf) of a random variable X is

mX(u) = E[e^{uX}].

We say that the moment generating function of X exists if mX(u) is finite
in some interval containing zero.

The following result regarding the r-th moment of X, E[X^r], shows why the
moment generating function is called as such.
Result (r-th Moment of a Random Variable)
Let X be a random variable whose mgf exists. Then

E[X^r] = mX^(r)(0)

for r = 0, 1, 2, ....
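
As a quick numerical sanity check (a sketch in R, assuming X ∼ Exp(1),
whose mgf is mX(u) = 1/(1 − u) for u < 1), finite differences of the mgf at
0 should recover E[X] = 1 and E[X^2] = 2:

    m <- function(u) 1 / (1 - u)      # mgf of Exp(1), valid for u < 1
    h <- 1e-4
    (m(h) - m(-h)) / (2 * h)          # central difference ~ m'(0) = E[X] = 1
    (m(h) - 2 * m(0) + m(-h)) / h^2   # second difference ~ m''(0) = E[X^2] = 2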
Properties of Moment Generating Functions

Result (Uniqueness)
Let X and Y be two random variables all of whose moments exist. If

mX (u) = mY (u)

for all u in a neighbourhood of 0 (i.e., for all |u| < ε for some ε > 0), then

FX(x) = FY(x) for all x ∈ R.

That is, the moment generating function of a random variable uniquely
determines its distribution.

Properties of Moment Generating Functions

Result (Convergence)
Let {Xn : n = 1, 2, ...} be a sequence of random variables, each with
moment generating function mXn (u). Furthermore, suppose that

lim_{n→∞} mXn(u) = mX(u) for all u in a neighbourhood of 0

and mX(u) is the moment generating function of a random variable X. Then

lim_{n→∞} FXn(x) = FX(x) for all x ∈ R.

That is, convergence of moment generating functions implies convergence
of cumulative distribution functions.

Location and Scale Families of Densities
Result (Location Family of Densities)
A location family of densities based on the random variable U is the
family of densities fX (x) where X = U + c for all possible c. Here, fX (x) is
given by:
fX (x) = fU (x − c).

Result (Scale Family of Densities)


A scale family of densities based on the random variable U is the family of
densities fX(x) where X = cU for all possible c > 0. fX(x) is given by:

fX(x) = (1/c) fU(x/c).
In essence, the density function of a continuous random variable may belong
to a family of density functions that all have a similar form. This relates to
the concept of parameters in statistics.
Proving the above results is not difficult and is a good exercise.
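
A quick numerical check in R (a sketch; the standard normal for U and the
values c = 3, x = 1.7 are arbitrary choices):

    c <- 3; x <- 1.7
    dnorm(x, mean = c); dnorm(x - c)            # location: fX(x) = fU(x - c)
    dnorm(x, sd = c); (1 / c) * dnorm(x / c)    # scale: fX(x) = (1/c) fU(x/c)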
Computations for Common Distributions

1 Common distributions summary
2 Computing the MGF of random variables
3 Computing the moments directly
4 Computing the distribution of simple transformed random variables
5 Bivariate density computations
6 Computing the sum of random variables using the convolution
formula, bivariate transformation and MGFs
7 Identifying distributions from the MGF

Bernoulli Distribution
Definition (Bernoulli Distribution)
For a Bernoulli trial, define the random variable
(
1 if the trial results in success
X =
0 otherwise

Then X is said to have a Bernoulli distribution.

Result (Probability Function of X )


If X is a Bernoulli random variable defined according to a Bernoulli trial
with success probability 0 < p < 1 then the probability function of X is
fX(x) = p if x = 1, and fX(x) = 1 − p if x = 0.

An equivalent way of writing this is fX(x) = p^x (1 − p)^{1−x}, x = 0, 1.


Binomial Distribution

Definition (Binomial Random Variable)


Consider a sequence of n independent Bernoulli trials, each with success
probability p. If
X = total number of successes
then X is a Binomial random variable with parameters n and p. A
common shorthand is
X ∼ Bin(n, p).

Here, "∼" has the meaning "is distributed as" or "has distribution".
Results
If X ∼ Bin(n, p) then
1 fX(x) = C(n, x) p^x (1 − p)^{n−x}, x = 0, ..., n,
2 E[X ] = np,
3 Var(X ) = np(1 − p).
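
A one-line sanity check of the probability function in R (a sketch; n = 10,
p = 0.3 and x = 4 are arbitrary choices):

    n <- 10; p <- 0.3; x <- 4
    choose(n, x) * p^x * (1 - p)^(n - x)   # the formula above
    dbinom(x, size = n, prob = p)          # built-in version, should agree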
Geometric Distribution

Definition (Geometric Distribution)


If
X = number of trials until first success,
then X is said to have a geometric distribution with parameter p (the
probability of success on each trial).

Results
If X ∼ Geom(p), then
1 fX(x; p) = p(1 − p)^{x−1}, x = 1, 2, ...
2 E[X] = 1/p,
3 Var(X) = (1 − p)/p^2.

Poisson Distribution
Definition (Poisson Distribution)
The random variable X has a Poisson distribution with parameter λ > 0
if its probability function is

fX(x; λ) = P(X = x) = e^{−λ} λ^x / x!, x = 0, 1, 2, ...
A common abbreviation is X ∼ Poisson(λ).

The Poisson distribution is a model of the occurrence of point events in a
continuum. The number of points occurring in a time interval t is a random
variable with a Poisson(λt) distribution.
Results
If X ∼ Poisson(λ), then
1 E(X ) = λ,
2 Var(X ) = λ.
Exponential Distribution
The exponential distribution is useful for describing the probability structure
of positive random variables. The exponential distribution is closely related
to the Poisson distribution; the time until the next event has an exponential
distribution with parameter β = 1/λ.

Definition
A random variable X is said to have an exponential distribution with
parameter β > 0 if X has density function:
fX(x; β) = (1/β) e^{−x/β}, x > 0.

Results
1 E(X ) = β,
2 Var(X) = β^2,
3 Memoryless property: P(X > s + t|X > s) = P(X > t).
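
The memoryless property follows in one line from the survival function
P(X > t) = e^{−t/β}:

P(X > s + t | X > s) = P(X > s + t) / P(X > s)
                     = e^{−(s+t)/β} / e^{−s/β} = e^{−t/β} = P(X > t).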
Uniform Distribution

Definition (Uniform Distribution)


A continuous random variable X that can take values in the interval (a, b)
with equal likelihood is said to have a uniform distribution on (a,b). A
common shorthand is
X ∼ Unif(a, b).

Results
If X ∼ Unif(a, b), then
1 fX(x; a, b) = 1/(b − a), a < x < b,
2 E(X) = (a + b)/2,
3 Var(X) = (b − a)^2/12.

Gamma Function

The Gamma Function extends the factorial function to the real numbers.
Definition (Gamma Function)
The Gamma function at x > 0 is given by

Γ(x) = ∫_0^∞ t^{x−1} e^{−t} dt.

Results
1 Γ(x) = (x − 1)Γ(x − 1),
2 Γ(n) = (n − 1)!, n = 1, 2, 3, ...
3 Γ(1/2) = √π,
4 ∫_0^∞ x^m e^{−x} dx = m! for m = 0, 1, 2, ....

Beta Function

Definition (Beta Function)


The Beta function at x, y > 0 is given by

B(x, y) = ∫_0^1 t^{x−1} (1 − t)^{y−1} dt.

Result
For all x, y > 0,

B(x, y) = Γ(x)Γ(y) / Γ(x + y).

Phi Function

Definition (Φ)
For all x ∈ R,

Φ(x) = (1/√(2π)) ∫_{−∞}^x e^{−t^2/2} dt.

We cannot simplify the above expression for Φ(x) as there is no closed-form
anti-derivative.
Φ gives the cumulative distribution function of the standard normal
distribution.
Results
1 lim_{x→−∞} Φ(x) = 0,
2 lim_{x→∞} Φ(x) = 1,
3 Φ(0) = 1/2,
4 Φ is monotonically increasing over R.

Normal Distribution

Definition (Normal Distribution)


The random variable X is said to have a normal distribution with
parameters µ and σ^2 (where −∞ < µ < ∞ and σ^2 > 0) if X has density
function

fX(x; µ, σ) = (1/(σ√(2π))) e^{−(x−µ)^2/(2σ^2)}, −∞ < x < ∞.

A common shorthand is
X ∼ N(µ, σ^2).

Results
If X ∼ N(µ, σ^2), then
1 E(X) = µ,
2 Var(X) = σ^2.

Computing Normal Distribution Probabilities

Result
If Z ∼ N(0, 1) then

P(Z ≤ x) = FZ (x) = Φ(x).

In other words, the Φ function is the cumulative distribution function of the
N(0, 1) random variable.

Result
If X ∼ N(µ, σ^2) then

Z = (X − µ)/σ ∼ N(0, 1).
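
A quick check in R (a sketch; µ = 5, σ = 2 and x = 6.5 are arbitrary
choices) that computing a normal probability directly or via standardisation
agrees:

    mu <- 5; sigma <- 2; x <- 6.5
    pnorm(x, mean = mu, sd = sigma)   # P(X <= x) directly
    pnorm((x - mu) / sigma)           # Phi((x - mu)/sigma), should agree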

Gamma Distribution
Definition (Gamma Distribution)
A random variable X is said to have a Gamma distribution with
parameters α and β (where α, β > 0) if X has density function:

fX(x; α, β) = x^{α−1} e^{−x/β} / (Γ(α) β^α), x > 0.

A common shorthand is:

X ∼ Gamma(α, β).

Results
If X ∼ Gamma(α, β) then
1 E(X ) = αβ,
2 Var(X) = αβ^2.
Moreover, Y has an Exponential(β) distribution iff Y ∼ Gamma(1, β).
Beta Distribution

Definition (Beta Distribution)


A random variable X is said to have a Beta distribution with parameters
α, β > 0 if its density function is

fX(x; α, β) = x^{α−1} (1 − x)^{β−1} / B(α, β), 0 < x < 1.

The Beta distribution generalises the Unif(0, 1) distribution, which can be
thought of as a Beta distribution with α = β = 1.
Results
If X has a Beta distribution with parameters α and β, then
1 E(X) = α/(α + β),
2 Var(X) = αβ / ((α + β + 1)(α + β)^2).
.

Joint Probability Function and Density Function

Definition (Joint Probability Function)


If X and Y are discrete random variables, then the joint probability
function of X and Y is

fX ,Y (x, y ) = P(X = x, Y = y ),

the probability that X = x and Y = y .

Definition (Joint Density Function)


The joint density function of continuous random variables X and Y is a
bivariate function fX ,Y with the property
∫∫_A fX,Y(x, y) dx dy = P((X, Y) ∈ A)

for any (measurable) subset A of R^2.


Joint Cumulative Distribution Function

Definition (Joint Cumulative Distribution Function)


The joint cdf of X and Y is

FX,Y(x, y) = P(X ≤ x, Y ≤ y)
           = Σ_{u≤x} Σ_{v≤y} P(X = u, Y = v)        (X, Y discrete)
           = ∫_{−∞}^y ∫_{−∞}^x fX,Y(u, v) du dv     (X, Y continuous).

Result
1 If X and Y are discrete random variables, then
Σ_{all x} Σ_{all y} fX,Y(x, y) = 1.
2 If X and Y are continuous random variables, then
∫_{−∞}^∞ ∫_{−∞}^∞ fX,Y(x, y) dx dy = 1.

Expectation of Joint Functions

Result (Expectation of Joint Functions)


If g is any function of X and Y, then

E[g(X, Y)] = Σ_{all x} Σ_{all y} g(x, y) P(X = x, Y = y)     (discrete)
E[g(X, Y)] = ∫_{−∞}^∞ ∫_{−∞}^∞ g(x, y) fX,Y(x, y) dx dy     (continuous)

Marginal Probability Function

Result (Marginal Probability Function)


If X and Y are discrete, then fX (x) and fY (y ) can be calculated from
fX ,Y (x, y ) as follows:
fX(x) = Σ_{all y} fX,Y(x, y)
fY(y) = Σ_{all x} fX,Y(x, y).

fX (x) is sometimes referred to as the marginal probability function of X .

Marginal Density Function

Result (Marginal Density Function)


If X and Y are continuous, then fX (x) and fY (y ) can be calculated from
fX ,Y (x, y ) as follows:
fX(x) = ∫_{−∞}^∞ fX,Y(x, y) dy
fY(y) = ∫_{−∞}^∞ fX,Y(x, y) dx.

fX (x) is sometimes referred to as the marginal density function of X .

Conditional Probability Function

Definition (Conditional Probability Function)


If X and Y are discrete, the conditional probability function of X given
Y = y is

fX|Y(x|y) = P(X = x|Y = y) = P(X = x, Y = y) / P(Y = y) = fX,Y(x, y) / fY(y).

Similarly,

fY|X(y|x) = P(Y = y|X = x) = fX,Y(x, y) / fX(x).

Result

P(Y ∈ A|X = x) = Σ_{y∈A} fY|X(y|x).

Conditional Density Function

Definition (Conditional Density Function)


If X and Y are continuous, the conditional density function of X given
Y = y is
fX|Y(x|Y = y) = fX,Y(x, y) / fY(y).

Similarly,

fY|X(y|X = x) = fX,Y(x, y) / fX(x).

Shorthand: fY|X(y|x).

Result

P(a ≤ Y ≤ b|X = x) = ∫_a^b fY|X(y|x) dy.

Conditional Expected Value and Variance

Result (Conditional Expected Value)


The conditional expected value of X given Y = y is
E(X|Y = y) = Σ_{all x} x P(X = x|Y = y)      if X is discrete
E(X|Y = y) = ∫_{−∞}^∞ x fX|Y(x|y) dx         if X is continuous

Result (Conditional Variance)

The conditional variance of X given Y = y is

Var(X|Y = y) = E(X^2|Y = y) − [E(X|Y = y)]^2

where
E(X^2|Y = y) = Σ_{all x} x^2 P(X = x|Y = y)  (discrete)
E(X^2|Y = y) = ∫_{−∞}^∞ x^2 fX|Y(x|y) dx     (continuous).

Independent Random Variables

Definition (Independent)
Random variables X and Y are independent if and only if for all x, y

fX ,Y (x, y ) = fX (x)fY (y ).

Result
Random variables X and Y are independent if and only if for all x, y

fY|X(y|x) = fY(y).

Result
If X and Y are independent, then

FX ,Y (x, y ) = FX (x) · FY (y ).
Covariance

Definition (Covariance)
The covariance of X and Y is

Cov(X , Y ) = E[(X − µX )(Y − µY )].

Cov(X , Y ) measures how much X and Y vary about their means and also
how much they vary together linearly.
Results
1 Cov(X , X ) = Var(X )
2 Cov(X , Y ) = E(XY ) − µX µY
3 If X and Y are independent, then Cov(X , Y ) = 0.
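
A small simulation sketch in R illustrating results 1 and 2 (the particular
distributions are arbitrary choices):

    set.seed(1)
    x <- rexp(1e5); y <- x + rnorm(1e5)          # a dependent pair
    cov(x, x); var(x)                            # result 1: Cov(X, X) = Var(X)
    cov(x, y); mean(x * y) - mean(x) * mean(y)   # result 2, approximately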

Correlation

Definition (Correlation)
The correlation between X and Y is
Corr(X, Y) = Cov(X, Y) / √(Var(X) · Var(Y)).

Corr(X, Y) measures the strength of the linear association between X and
Y. Independent random variables are uncorrelated, but uncorrelated
variables are not necessarily independent (consider the case where
E(X) = 0 and Y = X^2).

Results
1 | Corr(X , Y )| ≤ 1
2 | Corr(X , Y )| = 1 if and only if P(Y = a + bX ) = 1 for some constants
a, b.

Transformations

Result
For discrete X ,
fY(y) = P(Y = y) = P[h(X) = y] = Σ_{x:h(x)=y} P(X = x).

Result
For continuous X , if h is monotonic over the set {x : fX (x) > 0} then

fY(y) = fX(x) |dx/dy|
      = fX(h^{−1}(y)) |dx/dy|

for y such that fX (h−1 (y )) > 0.

Linear Transformation

Result
For a continuous random variable X , if Y = aX + b is a linear
transformation of X with a ≠ 0, then

fY(y) = (1/|a|) fX((y − b)/a)

for all y such that fX((y − b)/a) > 0.

Leading into Bivariate Transformations...

If X and Y have joint density fX,Y(x, y) and U is a function of X and Y,
we can find the density of U by calculating FU(u) = P(U ≤ u) and
differentiating.

Bivariate Transformations

Result
If U and V are functions of continuous random variables X and Y , then

fU,V (u, v ) = fX ,Y (x, y ) · |J|

where

J = | ∂x/∂u  ∂x/∂v |
    | ∂y/∂u  ∂y/∂v |

is a determinant called the Jacobian of the transformation.

The full specification of fU,V(u, v) requires that the range of (u, v) values
corresponding to those (x, y) for which fX,Y(x, y) > 0 is determined.

Bivariate Transformations

To find fU(u) by bivariate transformation:
1 Define some bivariate transformation (U, V).
2 Find fU,V(u, v).
3 We want the marginal distribution of U, so now find ∫_{−∞}^∞ fU,V(u, v) dv.
Using a bivariate transformation to find the distribution of U is often more
convenient than deriving it via the cumulative distribution function. Using
the cdf requires double integration, which we can avoid when we use a
bivariate transformation.

Sum of Independent Random Variables - Probability
Function/Density Function Approach

Result (Discrete Convolution Formula)


Suppose that X and Y are independent random variables taking only
non-negative integer values, and let Z = X + Y. Then

fZ(z) = Σ_{y=0}^z fX(z − y) fY(y), z = 0, 1, ...

Result (Continuous Convolution Formula)

Suppose X and Y are independent continuous random variables with
X ∼ fX(x) and Y ∼ fY(y). Then Z = X + Y has density

fZ(z) = ∫_{all possible y} fX(z − y) fY(y) dy.
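
A quick sketch of the discrete formula in R (the Poisson rates 1.5 and 2 are
arbitrary choices): convolving two independent Poisson probability functions
gives the Poisson(3.5) probability function, since sums of independent
Poissons are Poisson.

    z <- 0:15
    fZ <- sapply(z, function(k) sum(dpois(k - 0:k, 1.5) * dpois(0:k, 2)))
    max(abs(fZ - dpois(z, 3.5)))   # ~ 0: the convolution matches Poisson(3.5)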

Sum of Gammas

Result
If X1 , X2 , ..., Xn are independent with Xi ∼ Gamma(αi , β), then
Σ_{i=1}^n Xi ∼ Gamma(Σ_{i=1}^n αi, β).

Sum of Independent Random Variables - Moment
Generating Function Approach

Result
Suppose that X and Y are independent random variables with moment
generating functions mX and mY . Then

mX +Y (u) = mX (u)mY (u).

This generalises to n independent random variables.

Result
If X ∼ N(µX, σX^2) and Y ∼ N(µY, σY^2) are independent then

X + Y ∼ N(µX + µY, σX^2 + σY^2).

This also generalises.

Inequalities

1 Jensen’s Inequality
2 Markov’s inequality
3 Chebyshev’s inequality

Jensen’s Inequality

Result (Jensen’s Inequality)


If h is a convex function (i.e., concave up) and X is a random variable, then

E[h(X )] ≥ h(E[X ]).

For example, taking h(x) = x^2 gives E[X^2] ≥ (E[X])^2, i.e., Var(X) ≥ 0.

There are several formulations of Jensen’s inequality and the above
formulation is the one most relevant for probability theory. Students
studying MATH2701 next term will be delighted to re-encounter Jensen’s
inequality, albeit in a different formulation.

Markov’s Inequality

Result (Markov’s Inequality)


Let X be a nonnegative random variable and a > 0. Then

P(X ≥ a) ≤ E[X]/a.
That is, the probability that X is at least a is at most the expectation of X
divided by a.
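
A one-line check in R (a sketch; X ∼ Exp(1), so E[X] = 1, and a = 3 are
arbitrary choices):

    a <- 3
    1 - pexp(a)   # P(X >= a) = e^{-3} ~ 0.0498
    1 / a         # Markov bound E[X]/a ~ 0.333, comfortably larger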

Chebychev’s Inequality

Result (Chebychev’s Inequality)


If X is any random variable with E[X] = µ and Var(X) = σ^2, then

P(|X − µ| > kσ) ≤ 1/k^2.

That is, the probability that X is more than k standard deviations from its
mean is at most 1/k^2.

The significance of Chebychev’s Inequality is that we can make specific
probabilistic statements about a random variable given only its mean and
standard deviation – observe that we have not made any assumptions on
the distribution of X.
Interestingly, Chebychev’s Inequality can be easily derived as a corollary of
Markov’s Inequality (Hint: consider the random variable (X − E[X])^2).
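
A numerical illustration in R (a sketch; X ∼ Exp(1), which has µ = σ = 1,
and k = 2 are arbitrary choices):

    k <- 2
    pexp(1 - k) + (1 - pexp(1 + k))   # P(|X - mu| > k*sigma); first term is 0 here
    1 / k^2                           # Chebychev bound 0.25, comfortably larger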

Limit of Random Variables

1 Almost Sure Convergence
2 Convergence in Probability
3 Convergence in Distribution
4 Central Limit Theorem
5 Law of Large Numbers
6 The Statement and Application of the Delta Method

Almost Sure Convergence

Definition (Almost Sure Convergence)


The sequence of numerical random variables X1, X2, ... is said to converge
almost surely to a numerical random variable X, denoted Xn →a.s. X, if

P(ω : lim_{n→∞} Xn(ω) = X(ω)) = 1.

Within the context of probability theory, an event is said to happen almost
surely if it happens with probability 1. Note that it is possible for the set of
exceptions to be non-empty as long as it has probability 0.

Almost Sure Convergence

The previous definition of almost sure convergence can be difficult to work
with, so we will make use of an alternative definition.
Result (Alternative Definition of Almost Sure Convergence)
Xn →a.s. X if and only if for every ε > 0,

lim_{n→∞} P(sup_{k≥n} |Xk − X| > ε) = 0.

Almost Sure Convergence is the mode of convergence used in the Strong
Law of Large Numbers.

Convergence in Probability

The main idea behind Convergence in Probability is that the probability of
an “unusual” event becomes smaller as the sequence progresses.

Definition (Convergence in Probability)

The sequence of random variables X1, X2, ... converges in probability to a
random variable X if, for all ε > 0,

lim_{n→∞} P(|Xn − X| > ε) = 0.

This is usually written as Xn →P X.

For context, an estimator is called consistent if it converges in probability
to the quantity being estimated. Furthermore, convergence in probability is
the type of convergence established by the Weak Law of Large Numbers.

Relationship between Almost Sure Convergence and
Convergence in Probability

Result (Almost Sure Convergence implies Convergence in Probability)

Xn →a.s. X =⇒ Xn →P X
and
Xn →a.s. 0 ⇐⇒ sup_{k≥n} |Xk| →P 0.

To understand why almost sure convergence is stronger than convergence in
probability: almost sure convergence depends on a joint distribution, whereas
convergence in probability depends only on a marginal distribution.
Wise words of wisdom: almost sure convergence means no noodle leaves
the strip (for large enough n); convergence in probability means the
proportion of noodles leaving the strip goes to 0 (as n → ∞).

Convergence in Distribution

Convergence in Distribution is concerned with whether the distributions of
the Xi converge to the distribution of some random variable X. In other
words, we increasingly expect the next outcome in a sequence of random
experiments to be better modelled by a given probability distribution.

Definition (Convergence in Distribution)

Let X1, X2, ... be a sequence of random variables. We say that Xn
converges in distribution to X if

lim_{n→∞} FXn(x) = FX(x)

for all x where FX is continuous. A common shorthand is Xn →d X.
We say that FX is the limiting distribution of Xn.

Convergence in distribution often arises in applications of the central limit
theorem.
More on Convergence in Distribution

Convergence in distribution allows us to make approximate probability
statements about Xn, for large n, if we can derive the limiting distribution
FX(x).

Result (Establishing Convergence in Distribution using Moments)

Let {Xn}_{n∈N} be a sequence of random variables, each with mgf mXn(t).
Furthermore, suppose that

lim_{n→∞} mXn(t) = mX(t).

If mX(t) is a moment generating function, then there is a unique FX (which
gives a random variable X) whose moments are determined by mX(t), and
for all points of continuity of FX(x) we have

lim_{n→∞} FXn(x) = FX(x).

Relationship between Convergence in Probability and
Convergence in Distribution

Result (Convergence in Probability implies Convergence in Distribution)

Xn →P X =⇒ Xn →d X.

Convergence in probability is concerned with the convergence of the actual
values (the Xi's), whereas convergence in distribution is concerned with the
convergence of the distributions (the FXi(x)'s).

Weak Law of Large Numbers

Result (Weak Law of Large Numbers)


Suppose X1, X2, ... are independent, each with mean µ and variance
0 < σ^2 < ∞. If

X̄n = (1/n) Σ_{i=1}^n Xi, then X̄n →P µ.

The Weak Law of Large Numbers describes how the sample average
converges to the distributional average as the sample size increases.
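
A simulation sketch in R (Exp(1) samples, so µ = 1, and tolerance ε = 0.1
are arbitrary choices): the proportion of sample means further than ε from µ
shrinks as n grows.

    set.seed(1)
    eps <- 0.1
    for (n in c(10, 100, 1000, 10000)) {
      xbar <- replicate(500, mean(rexp(n)))   # 500 sample means of size n
      print(mean(abs(xbar - 1) > eps))        # estimate of P(|Xbar_n - mu| > eps)
    }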

Slutsky’s Theorem

Result (Slutsky’s Theorem)


Let X1, X2, ... be a sequence of random variables such that Xn →d X for
some random variable X, and let Y1, Y2, ... be another sequence of random
variables such that Yn →P c for some constant c. Then
1 Xn + Yn →d X + c
2 Xn Yn →d cX.

Slutsky’s Theorem extends some properties of algebraic operations on
convergent sequences of real numbers to sequences of random variables and
is useful for establishing convergence in distribution results.

Strong Law of Large Numbers

The Weak Law corresponds to convergence in probability, while the Strong
Law corresponds to almost sure convergence.

Result (Strong Law of Large Numbers)

Let X1, X2, ... be independent with common mean E[X] = µ and variance
Var(X) = σ^2 < ∞. Then

X̄n →a.s. µ.

Central Limit Theorem

Result (Central Limit Theorem)


Suppose X1, X2, ... are independent and identically distributed random
variables with common mean µ = E(Xi) and common variance
σ^2 = Var(Xi) < ∞. For each n ≥ 1 let X̄n = (1/n) Σ_{i=1}^n Xi. Then

(X̄n − µ)/(σ/√n) →d Z

where Z ∼ N(0, 1). It is common to write

(X̄n − µ)/(σ/√n) →d N(0, 1).

Note that E(X̄n) = µ and Var(X̄n) = σ^2/n, so the Central Limit Theorem
states that the limiting distribution of any standardised average of i.i.d.
random variables is the standard Normal distribution.
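
A simulation sketch in R (Exp(1) summands, for which µ = σ = 1, and
n = 200 are arbitrary choices): the standardised mean behaves like N(0, 1).

    set.seed(1)
    n <- 200
    z <- replicate(5000, (mean(rexp(n)) - 1) / (1 / sqrt(n)))
    mean(z <= 1.96); pnorm(1.96)   # empirical vs N(0,1) probability, roughly equal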
Alternative Forms of the Central Limit Theorem

Sometimes, probabilities involving related quantities such as the sum
Σ_{i=1}^n Xi are required. Since Σ_{i=1}^n Xi = nX̄, the Central Limit Theorem
also applies to the sum of a sequence of random variables.
Results
1 √n(X̄ − µ) →d N(0, σ^2)
2 (Σ_{i=1}^n Xi − nµ)/(σ√n) →d N(0, 1)
3 (Σ_{i=1}^n Xi − nµ)/√n →d N(0, σ^2)

Applications of the Central Limit Theorem

Central Limit Theorem for Binomial Distribution

Suppose X ∼ Bin(n, p). Then

(X − np)/√(np(1 − p)) →d N(0, 1).

Normal Approximation to the Poisson Distribution

Suppose X ∼ Poisson(λ). Then

lim_{λ→∞} P((X − λ)/√λ ≤ x) = P(Z ≤ x)

where Z ∼ N(0, 1).
Continuity correction: add 1/2 to the numerator.
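
A quick check of the approximation in R (a sketch; λ = 12.5 and x = 9 are
arbitrary choices, echoing Example 6 below):

    lambda <- 12.5; x <- 9
    ppois(x, lambda)                           # exact P(X <= 9) ~ 0.2014
    pnorm((x + 0.5 - lambda) / sqrt(lambda))   # normal approx with continuity correction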

The Delta Method

The Delta Method


Let Y1, Y2, ... be a sequence of random variables such that

√n(Yn − θ)/σ →d N(0, 1).

Suppose the function g is differentiable in a neighbourhood of θ and
g′(θ) ≠ 0. Then

√n(g(Yn) − g(θ)) →d N(0, σ^2 [g′(θ)]^2).

Alternatively,

(g(Yn) − g(θ)) / (σ g′(θ)/√n) →d N(0, 1).

There are several ways of stating the Delta method.
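
A simulation sketch in R (here Yn is a mean of n Exp(1) variables, so
θ = σ = 1, and g(y) = y^2 is an arbitrary choice): the standard deviation of
g(Yn) should be close to |g′(θ)|σ/√n = 2/√n.

    set.seed(1)
    n <- 10000
    y <- replicate(2000, mean(rexp(n)))   # theta = 1, sigma = 1
    sd(y^2)                               # simulated sd of g(Yn)
    2 / sqrt(n)                           # delta-method prediction: 0.02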

Tips and Assorted Examples
General Tips

1 When finding a pdf/cdf, always specify the domain over which it is
defined.
2 Make sure that the relevant conditions are satisfied before applying
some result/theorem (e.g., are the random variables i.i.d.? Does a
sequence of random variables have the right mode of convergence?)
3 Does the pdf in question belong to a known family? If so, you can
typically make use of some known results.
4 →a.s. =⇒ →P =⇒ →d
5 Know how to do calculations involving bivariate distributions.
6 Know how to apply (bivariate) transformations.
7 Know how to sum independent random variables using the convolution
formula approach and the moment generating function approach.
8 Know the Laws of Large Numbers, the CLT and the Delta method.

Example 1

Example (MATH1251)


99% of the people with the disease receive a positive test. 98% of those
without receive a negative test. If 2% of the population have the disease,
determine the probability of someone having the disease given they received
a positive test.

Let D denote having the disease and T denote receiving a positive test. We
require

P(D | T) = P(T | D)P(D) / P(T).

P(T) = P(T | D)P(D) + P(T | D^c)P(D^c)
     = P(T | D)P(D) + (1 − P(T^c | D^c))P(D^c)
     = 0.99 × 0.02 + (1 − 0.98) × 0.98 = 0.0394

∴ P(D | T) = (0.99 × 0.02) / 0.0394 ≈ 0.5025

A lot of people get stuck with Bayes’ law, especially when used with other
results. Use a tree diagram!

Example 2

Example
Given the distribution of X below, compute its expectation and standard
deviation.
x 0 3 9 27
P(X = x) 0.3 0.1 0.5 0.1

E[X] = Σ_{all x} x P(X = x)
     = 0 × 0.3 + 3 × 0.1 + 9 × 0.5 + 27 × 0.1
     = 7.5

E[X^2] = 0^2 × 0.3 + 3^2 × 0.1 + 9^2 × 0.5 + 27^2 × 0.1
       = 114.3

σX = √(E[X^2] − (E[X])^2) = √(114.3 − 7.5^2) = √58.05 ≈ 7.619

Example 2

Example (2901 oriented)


Let X ∼ Geom(p). Prove that E[X] = 1/p.

Recall: P(X = x) = p(1 − p)^{x−1} for x = 1, 2, . . .

E[X] = Σ_{all x} x P(X = x) = Σ_{x=1}^∞ x p(1 − p)^{x−1}

     = Σ_{y=0}^∞ (y + 1) p (1 − p)^y          (y = x − 1)
     = (1 − p) Σ_{y=0}^∞ (y + 1) p (1 − p)^{y−1}

     = (1 − p) Σ_{y=0}^∞ y p (1 − p)^{y−1} + (1 − p) Σ_{y=0}^∞ p (1 − p)^{y−1}

     = (1 − p) Σ_{y=1}^∞ y p (1 − p)^{y−1}
       + (1 − p) (Σ_{y=1}^∞ p (1 − p)^{y−1} + p(1 − p)^{−1})   (evaluating at y = 0)
     = (1 − p)E[X] + (1 − p)(1 + p(1 − p)^{−1})

∴ pE[X] = (1 − p) + p = 1

E[X] = 1/p

Example 3

In general, this can be done with the aid of Taylor series or the binomial
theorem. But preferably just do this:

Method (Deriving Expected Value from definition) (2901)

Keep rearranging the expression until you make the entire density, or E[X],
appear again.
appear again.
Discrete case - Use a change of summation index at some point
Continuous case - Use integration by parts (or occasionally integration
by substitution)

Example 4

Example
Let fX(x) = (2/θ^2) x for 0 < x < θ. Compute the MGF and (2901) assert its
existence.

Integrate by parts:

mX(u) = E[e^{uX}] = (2/θ^2) ∫_0^θ x e^{ux} dx
       = (2/θ^2) ([x e^{ux}/u]_0^θ − ∫_0^θ (e^{ux}/u) dx)

Slowly tidy everything up:

       = 2e^{uθ}/(uθ) − (2/θ^2) [e^{ux}/u^2]_0^θ

       = 2(uθe^{uθ} − e^{uθ} + 1) / (u^2 θ^2)

So, for u ≠ 0,

mX(u) = 2(uθe^{uθ} − e^{uθ} + 1) / (u^2 θ^2).

Idea: check that the limit as u → 0 is finite. The finiteness of the limit
implies the required result.

lim_{u→0} 2(uθe^{uθ} − e^{uθ} + 1) / (u^2 θ^2)
   = lim_{u→0} 2(θe^{uθ} + uθ^2 e^{uθ} − θe^{uθ}) / (2uθ^2)   (L'Hôpital)
   = lim_{u→0} e^{uθ}
   = 1

Example 5

Example
Use the MGF of X ∼ Bin(n, p) to prove that E[X ] = np.

E[X] = lim_{u→0} d/du (1 − p + pe^u)^n

     = lim_{u→0} n(1 − p + pe^u)^{n−1} · pe^u
     = n(1 − p + p)^{n−1} · p
     = np

Example 6

Example
A busy switchboard receives 150 calls an hour on average. Assume that
calls are independent from each other and can be modelled with a Poisson
distribution. Find the probability of
1 Exactly 3 calls in a given minute
2 At least 10 calls in a given 5 minute period.

Naive: X ∼ Poisson(150). (But this rate is per hour, while the questions
involve minutes.)

In Q1, take X ∼ Poisson(150/60) = Poisson(2.5). Then,

P(X = 3) = e^{−2.5} (2.5^3 / 3!) ≈ 0.2138

In Q2, take Y ∼ Poisson(2.5 × 5) = Poisson(12.5). Then,

P(Y ≥ 10) = 1 − P(Y ≤ 9)
          = 1 − e^{−12.5} (12.5^0/0! + · · · + 12.5^9/9!)

In R:

P(Y ≥ 10) = 1 - ppois(9, lambda = 12.5, lower.tail = TRUE)
          ≈ 0.7985689

Example 7

Example (2901 course pack)


If, on average, 5 servers go offline during the day, what is the chance that
no servers will go offline in the next hour?

The number of servers going offline in a day is X ∼ Poisson(5).

So the time taken for the next server to go offline is T ∼ Exp(0.2),
measured in days (β = 1/λ = 1/5 = 0.2).

∴ We require P(T > 1/24).

P(T > 1/24) = ∫_{1/24}^∞ 5e^{−5t} dt
            = e^{−5/24}

Example 8

Formula (Transforming a Discrete r.v.)


P(h(X) = y) = Σ_{x:h(x)=y} P(X = x)

Um, ye wat?

Example
A random variable has the following distribution:
x -1 0 1 2
P(X = x) 0.38 0.21 0.14 0.27

Determine the distribution of Y = X^3 and Z = X^2.


If X can take the values −1, 0, 1, 2,
then Y = X^3 takes the values −1, 0, 1, 8.

P(Y = −1) = P(X^3 = −1) = P(X = −1) = 0.38

Similarly, P(Y = 0) = 0.21, P(Y = 1) = 0.14, P(Y = 8) = 0.27.

On the other hand, X^2 can only take the values 0, 1, 4.

P(Z = 0) = P(X^2 = 0) = P(X = 0) = 0.21
P(Z = 1) = P(X^2 = 1) = P(X = ±1) = 0.38 + 0.14 = 0.62
...and P(Z = 4) is still equal to 0.27.

On the other hand, Z = X^2 can only take the values 0, 1, 4.

P(Z = 0) = P(X^2 = 0) = P(X = 0) = 0.21

P(Z = 1) = P(X^2 = 1) = P(X = ±1) = 0.38 + 0.14 = 0.52

...and P(Z = 4) is still equal to 0.27. (Check: 0.21 + 0.52 + 0.27 = 1.)
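
The same bookkeeping can be done mechanically in R (a minimal sketch; tapply sums the probabilities within each group of x values sharing the same transformed value):

    x  <- c(-1, 0, 1, 2)
    px <- c(0.38, 0.21, 0.14, 0.27)
    tapply(px, x^3, sum)   # distribution of Y = X^3: 0.38, 0.21, 0.14, 0.27
    tapply(px, x^2, sum)   # distribution of Z = X^2: 0.21, 0.52, 0.27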

Example 8

Just to think about... (2901 oriented)

If X ∼ Poisson(λ), what must be the distribution of Y = X^2?

P(Y = y) = \begin{cases} \dfrac{e^{-\lambda} \lambda^{\sqrt{y}}}{(\sqrt{y})!} & \text{if } y = 0, 1, 4, 9, \dots \\ 0 & \text{otherwise} \end{cases}
Example 9

Method 1 (Continuous random variable transform theorem)
Consider the transform y = h(x). If h is monotonic wherever f_X(x) is non-zero, then the density of Y = h(X) is

f_Y(y) = f_X(h^{-1}(y)) \left| \frac{dx}{dy} \right|

Example
Let X ∼ Exp(λ). What is the density of Y = X^2?
Example 9

Example
Let X ∼ Exp(λ). What is the density of Y = X^2?

f_X(x) = \frac{1}{\lambda} e^{-x/\lambda} for all x > 0.

h(x) = x^2 is invertible for all x > 0, with h^{-1}(y) = \sqrt{y}.

x = \sqrt{y}, so \frac{dx}{dy} = \frac{1}{2\sqrt{y}}.

\therefore f_Y(y) = f_X(\sqrt{y}) \cdot \frac{1}{2\sqrt{y}} = \frac{1}{\lambda} e^{-\sqrt{y}/\lambda} \cdot \frac{1}{2\sqrt{y}} = \frac{1}{2\lambda\sqrt{y}} e^{-\sqrt{y}/\lambda}
Since x > 0 and y = x^2, the support of f_Y is y > 0.
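
A quick Monte Carlo check of this density (a hedged R sketch; λ = 2 is an arbitrary choice, and note that rexp is rate-parametrised, so rate = 1/λ gives mean λ):

    set.seed(2)
    lambda <- 2
    y <- rexp(1e6, rate = 1/lambda)^2   # simulate Y = X^2
    mean(y <= 4)                        # ≈ P(X <= 2) = 1 - exp(-2/lambda) ≈ 0.632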
Example 10

Example
Let X ∼ Unif(−10, 10). What is the density of Y = X^2?

Here h(x) = x^2 is not monotonic on (−10, 10), so we sum the contributions from the two branches x = ±\sqrt{y}:

f_Y(y) = \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right] \frac{1}{2\sqrt{y}} = \left( \frac{1}{20} + \frac{1}{20} \right) \frac{1}{2\sqrt{y}} = \frac{1}{20\sqrt{y}}

Since −10 < x < 10 and y = x^2, we must have 0 < y < 100.
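
Again easy to verify by simulation (a minimal R sketch): P(Y ≤ 25) should equal \int_0^{25} \frac{1}{20\sqrt{y}} \, dy = \frac{\sqrt{25}}{10} = 0.5.

    set.seed(3)
    y <- runif(1e6, min = -10, max = 10)^2   # simulate Y = X^2
    mean(y <= 25)                            # ≈ 0.5, matching the density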
Example 11

Example
The joint probability distribution of X and Y is
y
0 1 2
0 1/16 1/8 1/8
x 1 1/8 1/16 0
2 3/16 1/4 1/16

Determine P(X = 0, Y = 1), P(X ≥ 1, Y < 1) and P(X − Y = 1)

1
P(X = 0, Y = 1) =
8

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics


MATH2901Society)
Revision Seminar August 2020 97 / 116
Example 11

Example
The joint probability distribution of X and Y is
y
0 1 2
0 1/16 1/8 1/8
x 1 1/8 1/16 0
2 3/16 1/4 1/16

Determine P(X = 0, Y = 1), P(X ≥ 1, Y < 1) and P(X − Y = 1)

P(X ≥ 1, Y < 1) = P(X = 1, Y = 0) + P(X = 2, Y = 0)


1 3 5
= + =
8 16 16

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics


MATH2901Society)
Revision Seminar August 2020 97 / 116
Example 11

Example
The joint probability distribution of X and Y is
y
0 1 2
0 1/16 1/8 1/8
x 1 1/8 1/16 0
2 3/16 1/4 1/16

Determine P(X = 0, Y = 1), P(X ≥ 1, Y < 1) and P(X − Y = 1)

P(X − Y = 1) = P(X = 2, Y = 1) + P(X = 1, Y = 0)


1 1 3
= + =
4 8 8

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics


MATH2901Society)
Revision Seminar August 2020 97 / 116
Example 12

Joint continuous distributions

Unless you know how to use indicator functions really well (2901), sketch the region!

Example
f_{X,Y}(x, y) = \frac{1}{x^2 y^2}, \quad x ≥ 1, y ≥ 1

is the joint density of the continuous r.v.s X and Y. Find P(X < 2, Y ≥ 4) and P(X ≤ Y^2).
P(X < 2, Y ≥ 4) = \int_1^2 \int_4^{\infty} \frac{1}{x^2 y^2} \, dy \, dx = \int_1^2 \frac{1}{4x^2} \, dx = \frac{1}{8}
P(X ≤ Y^2) = \int_1^{\infty} \int_1^{y^2} \frac{1}{x^2 y^2} \, dx \, dy = \int_1^{\infty} \left( \frac{1}{y^2} - \frac{1}{y^4} \right) dy = \frac{2}{3}
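
Both answers are easy to check by simulation, since X and Y here are independent (verified in Example 16 below) with marginal CDF F(x) = 1 − 1/x on [1, ∞), so 1/U with U ∼ Unif(0, 1) has the right distribution (a minimal R sketch using inverse-CDF sampling):

    set.seed(4)
    x <- 1 / runif(1e6)    # F(x) = 1 - 1/x, so 1/U has this distribution
    y <- 1 / runif(1e6)
    mean(x < 2 & y >= 4)   # ≈ 1/8
    mean(x <= y^2)         # ≈ 2/3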
Example 13

Example
Find E[Y^2 \ln X] for the following distribution

           y
        1      2
x  1   1/10   1/5
   2   3/10   2/5

E[Y^2 \ln X] = 1^2 \ln 1 \cdot P(X = 1, Y = 1) + 2^2 \ln 1 \cdot P(X = 1, Y = 2)
             + 1^2 \ln 2 \cdot P(X = 2, Y = 1) + 2^2 \ln 2 \cdot P(X = 2, Y = 2)
             = \left( \frac{3}{10} + 4 \times \frac{2}{5} \right) \ln 2 = \frac{19 \ln 2}{10}

(Note that the \ln 1 terms vanish.)
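
A direct R computation of the double sum (a minimal sketch) confirms the value:

    x <- c(1, 2); y <- c(1, 2)
    pxy <- matrix(c(1/10, 1/5, 3/10, 2/5), nrow = 2, byrow = TRUE)  # rows indexed by x
    sum(outer(log(x), y^2) * pxy)   # = 19 * log(2) / 10 ≈ 1.3169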
Example 14

Problem
Examine the existence of E[XY ] for the earlier example:

f_{X,Y}(x, y) = \frac{1}{x^2 y^2} \quad \text{for } x, y ≥ 1.
Example 15

Definition (Cumulative Distribution Function)
The CDF F_{X,Y}(x, y) is the function given by

F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)

Finding a CDF (Continuous case)

F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v) \, dv \, du

Example
For the earlier example, F_{X,Y}(x, y) = 0 if x < 1 or y < 1. Else:

F_{X,Y}(x, y) = \int_1^x \int_1^y \frac{1}{u^2 v^2} \, dv \, du = \left( 1 - \frac{1}{x} \right) \left( 1 - \frac{1}{y} \right)
Example 16

Recall that events A and B are independent when P(A ∩ B) = P(A)P(B).

Definition (Independence of random variables)
Two random variables are independent when, for all x and y:

P(X = x, Y = y) = P(X = x)P(Y = y)   (discrete case)

f_{X,Y}(x, y) = f_X(x) f_Y(y)   (continuous case)

Example
Test if X and Y are independent, for

f_{X,Y}(x, y) = \frac{1}{x^2 y^2}, \quad x, y ≥ 1.
f_X(x) = \int_1^{\infty} \frac{1}{x^2 y^2} \, dy = \frac{1}{x^2}, \quad x ≥ 1

Similarly f_Y(y) = \frac{1}{y^2}, \quad y ≥ 1.

Therefore, since f_{X,Y}(x, y) = f_X(x) f_Y(y), X and Y are independent.
Example 17

Example
Determine P(X = x | Y = 2), i.e. f_{X|Y}(x | 2), for

           y
        1      2
x  1   1/10   1/5
   2   3/10   2/5

P(Y = 2) = P(X = 1, Y = 2) + P(X = 2, Y = 2) = \frac{1}{5} + \frac{2}{5} = \frac{3}{5}

P(X = 1 | Y = 2) = \frac{P(X = 1, Y = 2)}{P(Y = 2)} = \frac{1/5}{3/5} = \frac{1}{3}

P(X = 2 | Y = 2) = \frac{P(X = 2, Y = 2)}{P(Y = 2)} = \frac{2/5}{3/5} = \frac{2}{3}
Example 18

Lemma (Independence of random variables)
Two random variables are independent if and only if, for all x and y,

f_{Y|X}(y | x) = f_Y(y)

or, equivalently,

f_{X|Y}(x | y) = f_X(x)

Investigation
For the earlier example with fX ,Y (x, y ) = x −2 y −2 for x ≥ 1, y ≥ 1, prove
the independence of X and Y using this lemma instead.
Example 19

Definition (Conditional Expectation)

E[X | Y = y] = \begin{cases} \displaystyle\sum_{\text{all } x} x \, P(X = x | Y = y) & \text{discrete case} \\ \displaystyle\int_{-\infty}^{\infty} x \, f_{X|Y}(x | y) \, dx & \text{continuous case} \end{cases}

Definition (Conditional Variance)

Var(X | Y = y) = E[X^2 | Y = y] - \left( E[X | Y = y] \right)^2

(And similarly for Y. Basically, just add the condition to the original formula.)
Example 19

Example
Find E[X | Y = 2] and Var(X | Y = 2) for

           y
        1      2
x  1   1/10   1/5
   2   3/10   2/5

E[X | Y = 2] = 1 \cdot P(X = 1 | Y = 2) + 2 \cdot P(X = 2 | Y = 2) = 1 \times \frac{1}{3} + 2 \times \frac{2}{3} = \frac{5}{3}
E[X^2 | Y = 2] = 1^2 \cdot P(X = 1 | Y = 2) + 2^2 \cdot P(X = 2 | Y = 2) = 1^2 \times \frac{1}{3} + 2^2 \times \frac{2}{3} = 3
Var(X | Y = 2) = 3 - \left( \frac{5}{3} \right)^2 = \frac{2}{9}
Example 20

Example
Let f_{X,Y}(x, y) = xy for x ∈ [0, 1], y ∈ [0, 2]. Determine their covariance the old-fashioned way.

Step 1: Determine the marginal densities

f_X(x) = \int_0^2 xy \, dy = 2x \quad (0 ≤ x ≤ 1)

f_Y(y) = \int_0^1 xy \, dx = \frac{y}{2} \quad (0 ≤ y ≤ 2)
Step 2: Find the marginal expectations E[X] and E[Y]

E[X] = \int_0^1 2x^2 \, dx = \frac{2}{3}

E[Y] = \int_0^2 \frac{y^2}{2} \, dy = \frac{4}{3}
Step 3: Find E[XY]

E[XY] = \int_0^1 \int_0^2 xy \cdot f_{X,Y}(x, y) \, dy \, dx = \int_0^1 \int_0^2 x^2 y^2 \, dy \, dx = \frac{8}{9}

Step 4: Plug in:

Cov(X, Y) = E[XY] - E[X]E[Y] = \frac{8}{9} - \frac{2}{3} \times \frac{4}{3} = 0.
That was a horrible idea. Quicker routes:

Can prove that X and Y are independent (the density factorises as f_{X,Y}(x, y) = xy = (2x)(y/2) on a rectangle), so Cov(X, Y) = 0 immediately.

Can use the Fubini–Tonelli theorem to just check that E[XY] equals E[X]E[Y].
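
A Monte Carlo sanity check (a minimal R sketch; inverse-CDF sampling, since F_X(x) = x^2 on [0, 1] and F_Y(y) = y^2/4 on [0, 2]):

    set.seed(5)
    n <- 1e6
    x <- sqrt(runif(n))       # density 2x on [0, 1]
    y <- 2 * sqrt(runif(n))   # density y/2 on [0, 2]
    cov(x, y)                 # ≈ 0, as expected for independent X and Y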
Example 21

Example (2901)
Let Z ∼ N(0, 1) and W satisfy P(W = 1) = P(W = −1) = 1/2. Suppose that W and Z are independent and define X := WZ.

Show that Cov(X, Z) = 0.

Noting that E[Z] = 0,

Cov(X, Z) = E[XZ] - E[X]E[Z] = E[XZ]

Subbing in X = WZ and using independence gives

Cov(X, Z) = E[WZ^2] = E[W]E[Z^2]
Observe that

E[W] = 1 \cdot P(W = 1) - 1 \cdot P(W = −1) = 0.

Hence Cov(X, Z) = E[W]E[Z^2] = 0.
Example 22
Theorem (Bivariate Transform Formula)
Suppose X and Y have joint density function f_{X,Y} and let U and V be transforms of these random variables. Then the joint density of U, V is

f_{U,V}(u, v) = f_{X,Y}(x, y) \, |\det(J)|

where J is the Jacobian matrix

J = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix}

Remember: x above y, and u to the left of v.

Example (Course pack)
Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U = \frac{1}{2}(X − Y) and V = Y.
We have y = v and

u = \frac{1}{2}(x - v) \implies x = 2u + v.

\therefore J = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \det(J) = 2.
f_{X,Y}(x, y) = \frac{1}{16} e^{-(x+y)/4}

Since y = v and x = 2u + v, we get x + y = 2u + 2v. Therefore

f_{U,V}(u, v) = \frac{1}{8} e^{-(u+v)/2}.
We know that y > 0. Since v = y, it immediately follows that v > 0. However, x > 0 and x = 2u + v. Therefore:

2u + v > 0 \iff u > -\frac{v}{2}
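
As a numerical sanity check (a minimal R sketch): integrating the density over the corner u > 1, v > 1 of the support gives e^{-1}/2 ≈ 0.1839, and simulation agrees:

    set.seed(6)
    x <- rexp(1e6, rate = 1/4)   # Exp with mean 4 (rexp is rate-parametrised)
    y <- rexp(1e6, rate = 1/4)
    u <- (x - y) / 2
    v <- y
    mean(u > 1 & v > 1)          # ≈ exp(-1)/2 ≈ 0.1839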
Example 23

Example
Let X and Y be i.i.d. Geom(p). Use convolutions to find the probability function of Z := X + Y.

The probability functions are P(X = x) = p(1 − p)^{x−1} for x = 1, 2, 3, ..., and P(Y = y) = p(1 − p)^{y−1} for y = 1, 2, 3, .... Therefore:

P(X = z − y) = p(1 − p)^{z−y−1}

for z − y = 1, 2, 3, ..., i.e.

y − z = ..., −3, −2, −1 ⟺ y = ..., z − 3, z − 2, z − 1
Hence P(X = z − y)P(Y = y) = p(1 − p)^{z−y−1} \, p(1 − p)^{y−1} = p^2 (1 − p)^{z−2}, when

y = 1, 2, 3, ... and y = ..., z − 3, z − 2, z − 1.

Therefore, y = 1, 2, ..., z − 2, z − 1.
\therefore P(Z = z) = \sum_{y=1}^{z-1} p^2 (1 − p)^{z−2} = (z − 1) \, p^2 (1 − p)^{z−2}

(the summand does not depend on y, so the sum just counts the z − 1 possible values of y!)

Since x = 1, 2, ... and y = 1, 2, ..., i.e. x and y are natural numbers greater than or equal to 1, z = x + y = 2, 3, 4, ...
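
This matches simulation (a minimal R sketch; rgeom counts failures starting at 0, so we shift by 1 to get support {1, 2, ...}):

    set.seed(7)
    p <- 0.3
    z <- (rgeom(1e6, p) + 1) + (rgeom(1e6, p) + 1)
    mean(z == 4)   # ≈ (4 - 1) * p^2 * (1 - p)^(4 - 2) = 0.1323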
Example 24

Example
Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows a Gamma(2, 1) distribution using a convolution.

The densities are f_X(x) = e^{-x} for x > 0, and f_Y(y) = e^{-y} for y > 0. Therefore:

f_X(z − y) = e^{-z+y}, for z − y > 0, i.e. y < z

Hence f_X(z − y) f_Y(y) = e^{-z} when y < z and y > 0, i.e.

f_X(z − y) f_Y(y) = e^{-z} for 0 < y < z
\therefore f_Z(z) = \int_0^z e^{-z} \, dy = z e^{-z} = \frac{e^{-z/1} z^{2-1}}{\Gamma(2) \, 1^2}

Since x > 0 and y > 0, z = x + y > 0. Thus Z has the density of a Gamma(2, 1) random variable.
Example 25

Theorem (MGF of a sum)
If X and Y are independent random variables, then

m_{X+Y}(u) = m_X(u) \, m_Y(u)

Example
Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows a Gamma(2, 1) distribution from quoting MGFs.
m_X(u) = \frac{1}{1-u} and m_Y(u) = \frac{1}{1-u} for u < 1. So clearly

m_Z(u) = m_X(u) m_Y(u) = \left( \frac{1}{1-u} \right)^2,

which is the MGF of a Gamma(2, 1) distribution. Hence, by uniqueness of MGFs, Z follows this distribution as well.
Example 26

Example
Let U_1, ..., U_n be i.i.d. Unif(0, 1) random variables. Define Y_n = n \min\{U_1, ..., U_n\}. Prove that Y_n \to^d Y, where Y ∼ Exp(1).

For 0 ≤ y ≤ n,

F_{Y_n}(y) = P(Y_n ≤ y) = P(\min\{U_1, ..., U_n\} ≤ y/n)
           = 1 - P(\min\{U_1, ..., U_n\} > y/n)
           = 1 - P(U_1 > y/n, ..., U_n > y/n)
           = 1 - P(U_1 > y/n) \cdots P(U_n > y/n)   (independence)
           = 1 - \left[ P(U_1 > y/n) \right]^n   (identically distributed)
           = 1 - \left[ \int_{y/n}^{1} 1 \, dt \right]^n = 1 - \left( 1 - \frac{y}{n} \right)^n

(Watch the direction of the inequality: \min\{u_1, ..., u_n\} > x exactly when every u_i > x, whereas \min\{u_1, ..., u_n\} ≤ x does not mean that every u_i ≤ x.)

\therefore \lim_{n \to \infty} F_{Y_n}(y) = 1 - e^{-y} = F_Y(y) for every y ≥ 0

Hence Y_n \to^d Y.
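
The convergence is easy to see numerically (a minimal R sketch comparing the simulated CDF of Y_n at y = 1 with the Exp(1) CDF):

    set.seed(8)
    n  <- 1000
    yn <- replicate(1e4, n * min(runif(n)))
    mean(yn <= 1)   # ≈ pexp(1) = 1 - exp(-1) ≈ 0.632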
Example 27

Example (Libo's notes)
Australians have average weight about 68 kg and variance about 16 kg². Suppose 40 random Australians are chosen. What is the (approximate) probability that the average weight of these Australians is over 80 kg?

Let X_1, ..., X_{40} be the weights of the Australians. Then n = 40, µ = 68 and σ = 4, so by the CLT:

\frac{\bar{X}_{40} - 68}{4/\sqrt{40}} \to^d Z

where Z ∼ N(0, 1).
\therefore P(\bar{X}_{40} > 80) = P\left( \frac{\bar{X}_{40} - 68}{4/\sqrt{40}} > \frac{80 - 68}{4/\sqrt{40}} \right) ≈ P\left( Z > \frac{80 - 68}{4/\sqrt{40}} \right) = P(Z > 3\sqrt{40})

= 1-pnorm(3*sqrt(40)), or pnorm(3*sqrt(40), lower.tail=FALSE), which is effectively 0 (3\sqrt{40} ≈ 19 standard deviations above the mean).
Example 28

Lemma (Normal Approximation to Binomial)
Let X ∼ Bin(n, p), which is a sum of n independent Ber(p) r.v.s. Then

\frac{X - np}{\sqrt{np(1 - p)}} \to^d N(0, 1)
Example 28
Example
An unfortunate soul decided to sit his exam despite having a migraine and the flu. Fortunately, it was not a university exam, and the paper involved only 200 multiple choice questions with 5 options each. Therefore, he randomly guesses every answer. What is the (approximate) probability he fails?

Let X be how many he gets correct. Then X ∼ Bin(200, 1/5).

We may approximate X with Y ∼ N(40, 32). Then, taking "fails" to mean fewer than 100 correct,

P(X < 100) ≈ P(Y < 100) = P\left( \frac{Y - 40}{\sqrt{32}} < \frac{100 - 40}{\sqrt{32}} \right) = P\left( Z < \frac{60}{\sqrt{32}} \right) = P(Z < 10.6066) Oh my...
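
For comparison, the exact binomial probability and the normal approximation with a continuity correction (a minimal R sketch):

    pbinom(99, size = 200, prob = 1/5)      # exact P(X <= 99): indistinguishable from 1
    pnorm(99.5, mean = 40, sd = sqrt(32))   # approximation with continuity correction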
Example 29

Example (Libo's notes)
Let X_1, X_2, ... be a sequence of i.i.d. random variables with mean 2 and variance 7. Obtain a large-sample approximation for the distribution of (\bar{X}_n)^3.
The CLT gives

\sqrt{n}(\bar{X}_n - 2) \to^d N(0, 7)

Applying the Delta Method with g(x) = x^3 leads to g'(x) = 3x^2 and then

\sqrt{n}[(\bar{X}_n)^3 - 2^3] \to^d N(0, 7 \cdot (3 \cdot 2^2)^2).

Simplifying, we have

\sqrt{n}[(\bar{X}_n)^3 - 8] \to^d N(0, 1008).

Thus, for large n, the approximate distribution of (\bar{X}_n)^3 is N(8, 1008/n).
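
A simulation check of the delta-method variance (a hedged R sketch; the underlying distribution is an arbitrary choice so long as it has mean 2 and variance 7 — normal is used here for convenience):

    set.seed(9)
    n <- 500
    xbar3 <- replicate(1e4, mean(rnorm(n, mean = 2, sd = sqrt(7)))^3)
    n * var(xbar3)   # ≈ 1008, the delta-method asymptotic variance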
Example 30

Example (More 2801 focused)
The Riemann zeta function is defined for complex s with real part greater than 1 by the absolutely convergent infinite series

\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.

Prove that the real part of every non-trivial zero of the Riemann zeta function is 1/2.
Example 30