
maths.org/step

STEP Support Programme

STEP III Statistics Topic Notes

Algebra of Expectations

Not in the formula book (X, Y do not have to be independent for the first one):

• E(aX + bY + c) = aE(X) + bE(Y) + c

• Var(aX + b) = a^2 Var(X)

In the formula book!

If X and Y are independent random variables:


• E(XY) = E(X)E(Y)

• Var(aX ± bY) = a^2 Var(X) + b^2 Var(Y)

Covariance (all of these are in the formula book!)

• Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − µ_X µ_Y

• Var(aX ± bY) = a^2 Var(X) + b^2 Var(Y) ± 2ab Cov(X, Y)

• If X = aX′ + b and Y = cY′ + d then Cov(X, Y) = ac Cov(X′, Y′)

• Product moment correlation coefficient ρ = Cov(X, Y)/(σ_X σ_Y)

Here µ_X = E(X), σ_X^2 = Var(X), etc. Note that if X and Y are independent then Cov(X, Y) = 0.
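
None of this is in the original notes, but the covariance form of the variance formula is easy to check with a quick simulation (the coefficients a = 2, b = 3 and the relationship between X and Y below are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)
a, b = 2, 3                                   # arbitrary illustrative coefficients
X = rng.normal(size=1_000_000)
Y = 0.5 * X + rng.normal(size=1_000_000)      # Y deliberately correlated with X

lhs = np.var(a * X + b * Y)
rhs = a**2 * np.var(X) + b**2 * np.var(Y) + 2 * a * b * np.cov(X, Y, bias=True)[0, 1]
print(lhs, rhs)                               # the two values agree to within sampling error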

Distribution functions

The probability distribution function is also called the probability density function.

There are lots of formulae on pages 7 & 8 of the formulae book!

If you are given the probability distribution function of X, then you can find the distribution function of related random variables, such as Y = X^2. You do this via the cumulative distribution function, e.g.:

F_Y(y) = P(Y ≤ y)
       = P(X^2 ≤ y)
       = P(X ≤ √y)
       = ∫_{−∞}^{√y} f(t) dt

The lower limit might be 0 or something else depending on how X is distributed (and if X can take negative values you will need P(−√y ≤ X ≤ √y) instead).

Once you have the C.D.F. for Y you can differentiate (with respect to y) and hence find the P.D.F.
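
For instance (a worked example added here, assuming X ∼ Uniform(0, 1) so that f(x) = 1 for 0 ≤ x ≤ 1): F_Y(y) = P(X ≤ √y) = √y for 0 ≤ y ≤ 1, and differentiating gives f_Y(y) = 1/(2√y). The same calculation in sympy:

import sympy as sp

y = sp.symbols('y', positive=True)
# C.D.F. of Y = X^2 when X ~ Uniform(0, 1): F_Y(y) = P(X <= sqrt(y)) = sqrt(y), 0 <= y <= 1
F_Y = sp.sqrt(y)
f_Y = sp.diff(F_Y, y)    # differentiate the C.D.F. to get the P.D.F.
print(f_Y)               # 1/(2*sqrt(y)), valid for 0 < y < 1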


Probability Generating Functions (Discrete Random Variables)

The probability generating function of X is G_X(t) = E(t^X), so if the random variable takes integer values greater than or equal to 0 we have:

G_X(t) = P(X = 0) × t^0 + P(X = 1) × t^1 + P(X = 2) × t^2 + · · ·

We have G_X(1) = 1 as this is the sum of all the possible probabilities. The probability P(X = r) is the coefficient of t^r in the P.G.F. expansion.

Differentiating with respect to t gives:

G'_X(t) = 0 × P(X = 0) + 1 × P(X = 1) + 2 × P(X = 2) × t + 3 × P(X = 3) × t^2 + · · ·

and substituting t = 1 gives G'_X(1) = E(X).

Differentiating again and substituting t = 1 gives:

G''_X(1) = 2 × P(X = 2) + 3 × 2 × P(X = 3) + 4 × 3 × P(X = 4) + · · ·

Adding together G''_X(1) and G'_X(1) gives:

G'_X(1) + G''_X(1) = P(X = 1) + 2 × P(X = 2) + 3 × P(X = 3) + 4 × P(X = 4) + · · ·
                            + 2 × P(X = 2) + 3 × 2 × P(X = 3) + 4 × 3 × P(X = 4) + · · ·
                   = P(X = 1) + 2^2 × P(X = 2) + 3^2 × P(X = 3) + 4^2 × P(X = 4) + · · ·

Hence G''_X(1) + G'_X(1) = E(X^2), and so Var(X) = G''_X(1) + G'_X(1) − (G'_X(1))^2.


These formulae, and the generating functions for the Binomial, Poisson and Geometric distributions, are given in the formula book.
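
As an illustration (added here, not in the original notes), take the Poisson distribution, whose P.G.F. is G_X(t) = e^{λ(t−1)}; the derivative formulas above recover its mean and variance:

import sympy as sp

t, lam = sp.symbols('t lambda', positive=True)
G = sp.exp(lam * (t - 1))                 # P.G.F. of a Poisson(lambda) random variable

EX  = sp.diff(G, t).subs(t, 1)            # G'(1) = E(X)
EX2 = sp.diff(G, t, 2).subs(t, 1) + EX    # G''(1) + G'(1) = E(X^2)
print(EX, sp.simplify(EX2 - EX**2))       # mean and variance both come out as lambda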

Moment Generating Functions (Continuous Random Variables)

The M.G.F. of a continuous random variable with P.D.F. f(x) is:


M_X(t) = E(e^{tX}) = ∫ f(x) e^{tx} dx

Substituting t = 0 gives M_X(0) = ∫ f(x) dx = 1.

Differentiating with respect to t gives:


M'_X(t) = ∫ f(x) × (d/dt) e^{tx} dx = ∫ f(x) x e^{tx} dx

Substituting t = 0 gives M'_X(0) = ∫ f(x) × x dx = E(X).

In a similar way you can differentiate with respect to t again and show that M''_X(0) = E(X^2) and hence Var(X) = M''_X(0) − (M'_X(0))^2.


These formulae, and the M.G.F.s for the Uniform, Exponential and Normal distributions, are given in the formula book.
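
As an illustration (added here, not in the original notes), for the Exponential(λ) distribution with f(x) = λe^{−λx} the integral gives M_X(t) = λ/(λ − t) for t < λ, and the derivative formulas recover the familiar mean and variance:

import sympy as sp

x, t, lam = sp.symbols('x t lambda', positive=True)
f = lam * sp.exp(-lam * x)                # P.D.F. of the Exponential(lambda) distribution

# M_X(t) = integral of f(x) e^{tx} dx over x >= 0; converges for t < lambda
M = sp.integrate(f * sp.exp(t * x), (x, 0, sp.oo), conds='none')

EX  = sp.diff(M, t).subs(t, 0)            # M'(0)  = E(X)
EX2 = sp.diff(M, t, 2).subs(t, 0)         # M''(0) = E(X^2)
print(sp.simplify(EX), sp.simplify(EX2 - EX**2))   # 1/lambda and 1/lambda**2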


Sampling

If n independent observations X_1, X_2, . . ., X_n are made from a distribution with mean µ and variance σ^2 then the sample mean X̄ has E(X̄) = µ and Var(X̄) = σ^2/n. When n (the sample size) is “large” the Central Limit Theorem says that the sample mean X̄ is approximately normal, i.e. X̄ is approximately N(µ, σ^2/n), no matter what distribution X has.
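
A quick simulation (an illustrative sketch added here, not part of the original notes) makes this concrete: starting from the heavily skewed Exponential(1) distribution (for which µ = σ^2 = 1), the sample means still cluster like N(µ, σ^2/n):

import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 100_000                                 # sample size and number of repeated samples
samples = rng.exponential(scale=1.0, size=(reps, n))  # skewed parent distribution, mu = sigma^2 = 1
means = samples.mean(axis=1)

print(means.mean(), means.var())   # close to mu = 1 and sigma^2/n = 1/50 = 0.02
# a histogram of `means` looks approximately normal, as the C.L.T. predicts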

Geometric Distribution

The Geometric distribution models the number of trials needed up to and including the first “success”. It has one parameter, p, the probability of success. It is not on all A-level specifications, but does sometimes appear in STEP questions.

For example, X could be the number of rolls of a dice until a six is rolled. Then the probability of success is p = 1/6 and X ∼ Geo(1/6).

The probability of a six occurring on the rth roll (and not before) is P(X = r) = (1 − p)^{r−1} p = (5/6)^{r−1} × (1/6) (as the first r − 1 rolls are all “not sixes” followed by a six on the rth roll).
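
A small simulation (added here for illustration) of the dice example bears this out, comparing empirical frequencies of X with (5/6)^{r−1} × (1/6):

import numpy as np

rng = np.random.default_rng(2)
reps = 200_000
rolls = rng.integers(1, 7, size=(reps, 100))   # 100 rolls each; a run with no six is vanishingly unlikely
X = (rolls == 6).argmax(axis=1) + 1            # position of the first six, counted from 1

for r in range(1, 5):
    print(r, (X == r).mean(), (5/6)**(r - 1) * (1/6))   # empirical vs theoretical P(X = r)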

Writing q = 1 − p we have:

P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + · · · = p + qp + q^2 p + q^3 p + · · ·
                                                  = p(1 + q + q^2 + q^3 + · · ·)
                                                  = p × 1/(1 − q)     (sum of an infinite GP)
                                                  = p × 1/p = 1       as expected!

To find the expectation we want:

E(X) = P(X = 1) + 2P(X = 2) + 3P(X = 3) + 4P(X = 4) + · · ·
     = p + 2qp + 3q^2 p + 4q^3 p + · · ·
     = p(1 + 2q + 3q^2 + 4q^3 + · · ·).

This looks a lot like a derivative! Starting with 1 + q + q^2 + q^3 + q^4 + · · · = (1 − q)^{−1} we can differentiate with respect to q to get:

0 + 1 + 2q + 3q^2 + 4q^3 + · · · = (1 − q)^{−2}     (*)

We then have E(X) = p × 1/(1 − q)^2 = p/p^2 = 1/p.

To find Var(X) start by differentiating (∗) to get:

2 + 3 × 2q + 4 × 3q^2 + 5 × 4q^3 + · · · = 2(1 − q)^{−3}     (†)

Then (†) − (∗) gives:

1 + 2 × 2q + 3 × 3q^2 + 4 × 4q^3 + · · · = 2/p^3 − 1/p^2 = (2 − p)/p^3.


Hence E(X^2) = p + 2^2 qp + 3^2 q^2 p + 4^2 q^3 p + · · · = (2 − p)/p^2 and

Var(X) = E(X^2) − (E(X))^2
       = (2 − p)/p^2 − (1/p)^2
       = (1 − p)/p^2
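
These closed forms are easy to verify symbolically; here is a quick check (added, not from the original notes) using the dice example p = 1/6, for which E(X) = 6 and Var(X) = (5/6)/(1/6)^2 = 30:

import sympy as sp

r = sp.symbols('r', integer=True, positive=True)
p = sp.Rational(1, 6)              # the dice example: probability of rolling a six
q = 1 - p

EX  = sp.summation(r * q**(r - 1) * p, (r, 1, sp.oo))      # E(X)   -> 6
EX2 = sp.summation(r**2 * q**(r - 1) * p, (r, 1, sp.oo))   # E(X^2) -> 66
print(EX, EX2 - EX**2)             # 6 and 30, matching 1/p and (1 - p)/p**2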

