Formula Sheet (1) Descriptive Statistics: Quartiles (n+1) /4 (n+1) /2 (The Median) 3 (n+1) /4
FORMULA SHEET
[1] DESCRIPTIVE STATISTICS
Sample Mean (ungrouped): $\bar{x} = \frac{\sum x}{n}$
Population Mean (ungrouped):
Population Mean (grouped):
Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Population variance (ungrouped): $\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2 - N\mu^2}{N} = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$
Population Variance (grouped): $\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{\sum fM^2 - \frac{(\sum fM)^2}{N}}{N}$
The standard deviation is the positive square root of the variance.
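The definitional and computational forms of the variance give the same result, which can be checked numerically. The following is a minimal sketch; the function names and the data values are illustrative assumptions, not part of the sheet.

```python
import math

def sample_variance(xs):
    """s^2 from the definitional form: sum((x - mean)^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """s^2 from the computational form: (sum(x^2) - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def population_variance(xs):
    """sigma^2 from the definitional form: sum((x - mu)^2) / N."""
    N = len(xs)
    mu = sum(xs) / N
    return sum((x - mu) ** 2 for x in xs) / N

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
s = math.sqrt(sample_variance(data))          # sample standard deviation
sigma = math.sqrt(population_variance(data))  # population standard deviation
```

Treating the same list as a sample or as a whole population only changes the divisor (n - 1 versus N).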
Quartiles
First quartile position: $Q_1 = (n+1)/4$
Second quartile position: $Q_2 = (n+1)/2$ (the median)
Third quartile position: $Q_3 = 3(n+1)/4$
Where n is the number of observed values (sample size).
Rule 1: If the result is an integer, then the quartile is equal to the ranked value at that position.
Rule 2: If the result is a fractional half, then the quartile is equal to the mean of the two
corresponding ranked values.
Rule 3: If the result is neither an integer nor a fractional half, round the result to the nearest
integer and select that ranked value.
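The three rules can be sketched in code as follows. This is a minimal illustration, not a reference implementation: the function name `quartile` is hypothetical, and clamping positions to the range 1..n for very small samples is an added assumption the sheet does not state.

```python
def quartile(values, k):
    """Return the k-th quartile (k = 1, 2, 3) using position k(n+1)/4 and Rules 1-3."""
    ranked = sorted(values)
    n = len(ranked)
    # Work in quarters to avoid floating-point issues: position = whole + rem/4
    whole, rem = divmod(k * (n + 1), 4)

    def at(pos):
        # Ranked positions are 1-based; clamping to [1, n] is an assumption
        return ranked[min(max(pos, 1), n) - 1]

    if rem == 0:   # Rule 1: integer position
        return at(whole)
    if rem == 2:   # Rule 2: fractional half, average the two neighbours
        return (at(whole) + at(whole + 1)) / 2
    # Rule 3: x.25 rounds down (rem == 1), x.75 rounds up (rem == 3)
    return at(whole if rem == 1 else whole + 1)
```

For ten ranked values the positions are 2.75, 5.5 and 8.25, so Rule 3, Rule 2 and Rule 3 apply in turn.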
$\mu = \frac{\sum x}{N}$ (population mean, ungrouped)
$\mu_{grouped} = \frac{\sum fM}{N}$ (population mean, grouped)
Coefficient of Variation: $CV = \frac{s}{\bar{X}} \cdot 100$
Standardized value or z score: $z = \frac{X - \mu}{\sigma}$ or $z = \frac{X - \bar{X}}{s}$
and for a given z value: $X = \mu + z\sigma$
The Empirical Rule: The interval of values one standard deviation either side of the
mean, $\bar{X} \pm 1S$, contains approximately 68% of the items or people
in the sample.
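As a small illustration of the coefficient of variation and the z score conversions, the sketch below uses Python's standard `statistics` module. The function names and the sample values are assumptions made for the example.

```python
import math
import statistics

def coefficient_of_variation(xs):
    """CV = (s / mean) * 100, using the sample standard deviation."""
    return statistics.stdev(xs) / statistics.mean(xs) * 100

def z_score(x, mean, sd):
    """Standardized value: how many standard deviations x lies from the mean."""
    return (x - mean) / sd

def value_from_z(z, mean, sd):
    """Invert the z score: X = mean + z * sd."""
    return mean + z * sd

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
cv = coefficient_of_variation(data)
```

`value_from_z(z_score(x, m, s), m, s)` returns the original x, which is a quick way to check both directions.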
[ 2 ] PROBABILITY THEORY
If N is the total number of opportunities for the event to occur and x is the number of
times the event has occurred, then
$P(E) = \frac{x}{N}$
Addition Law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
= P(A) + P(B) if A and B are mutually exclusive
Conditional Probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) \cdot P(B \mid A)}{P(B)}$
= P(A) if A and B are independent
Multiplication Rule: $P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$
= P(A) P(B) if A and B are independent
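The addition law, conditional probability and multiplication rule can be checked on a small equally likely sample space. The sketch below assumes a fair six-sided die with A = "even" and B = "greater than 3"; these events and the helper `prob` are illustrative choices, not from the sheet.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # faces of a fair die (assumed example)
A = {2, 4, 6}                # event: even number
B = {4, 5, 6}                # event: greater than 3

def prob(event):
    """P(E) = x / N for equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))

# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
p_union = prob(A) + prob(B) - prob(A & B)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = prob(A & B) / prob(B)

# Multiplication rule: P(A and B) = P(B) * P(A | B)
p_joint = prob(B) * p_a_given_b
```

Because P(A | B) differs from P(A) here, A and B are not independent, so the shortcut P(A)P(B) would not give P(A and B).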
PROBABILITY DISTRIBUTIONS
Mean or expected value of discrete distribution:
$\mu = E(x) = \sum [x P(x)]$
Variance of a discrete distribution:
$\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)]$
The Covariance
Definition formula: $\sigma_{XY} = \sum_{i=1}^{N} [X_i - E(X)][Y_i - E(Y)]\,P(X_i Y_i)$
Calculation formula: $\sigma_{XY} = \sum_{i=1}^{N} X_i Y_i\,P(X_i Y_i) - E(X)E(Y)$
Where $X_i$, $Y_i$ = the $i$th outcome of the discrete random variables X and Y respectively
$P(X_i Y_i)$ = probability of the $i$th occurrence of X and Y
Portfolio Expected Return and Portfolio Risk
Portfolio expected return (weighted average return): $E(P) = wE(X) + (1 - w)E(Y)$
Portfolio risk (weighted variability): $\sigma_P = \sqrt{w^2\sigma_X^2 + (1 - w)^2\sigma_Y^2 + 2w(1 - w)\sigma_{XY}}$
Where E(P) = portfolio expected return
w = portion of portfolio value in asset X
(1 - w) = portion of portfolio value in asset Y
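The expected value, variance, covariance and portfolio formulas chain together naturally, so a short numerical sketch may help. The three-scenario distribution below and all function names are assumptions chosen for illustration.

```python
import math

# Hypothetical joint distribution: three scenarios with returns for assets X and Y
probs = [0.2, 0.5, 0.3]
X = [-100, 100, 250]
Y = [200, 50, -100]

def expected(values, p):
    """E(x) = sum of x * P(x)."""
    return sum(v * q for v, q in zip(values, p))

def variance(values, p):
    """sigma^2 = sum of (x - mu)^2 * P(x)."""
    mu = expected(values, p)
    return sum((v - mu) ** 2 * q for v, q in zip(values, p))

def covariance(xs, ys, p):
    """Calculation formula: sum of X*Y*P(XY) minus E(X)E(Y)."""
    joint = sum(x * y * q for x, y, q in zip(xs, ys, p))
    return joint - expected(xs, p) * expected(ys, p)

def portfolio(w, xs, ys, p):
    """Return (expected return, risk) for weight w in X and (1 - w) in Y."""
    ret = w * expected(xs, p) + (1 - w) * expected(ys, p)
    var = (w ** 2 * variance(xs, p) + (1 - w) ** 2 * variance(ys, p)
           + 2 * w * (1 - w) * covariance(xs, ys, p))
    return ret, math.sqrt(var)

expected_return, risk = portfolio(0.5, X, Y, probs)
```

With these made-up numbers the covariance is negative, so an equal split has far lower risk than either asset alone.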
Binomial Distribution: $P(X) = \frac{n!}{X!\,(n - X)!}\, p^x q^{n-x} = {}^{n}C_{X}\, p^x (1 - p)^{n-x}$ (if using a calculator)
$\mu = np$ and $\sigma^2 = np(1 - p)$
q = 1 - p
The standard deviation is the positive square root of the variance.
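The binomial formula maps directly onto `math.comb` from the Python standard library. The sketch below is illustrative; the function names are assumptions.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_mean_variance(n, p):
    """Return (mu, sigma^2) = (n*p, n*p*(1 - p))."""
    return n * p, n * p * (1 - p)

mu, var = binomial_mean_variance(4, 0.5)
sd = sqrt(var)  # standard deviation is the positive square root of the variance
```

Summing `binomial_pmf(x, n, p)` over x = 0..n should give 1, which is a useful sanity check when computing by hand.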
[ 3 & 4 ] STATISTICAL INFERENCE
SAMPLING DISTRIBUTIONS
SAMPLE MEAN
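$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\right)$
If the population standard deviation $\sigma$ is not known, we use the unbiased estimate s to
find the estimated value of $\sigma_{\bar{X}}$. The test statistics we use are either
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ (when $\sigma$ is known)
or
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$ (when $\sigma$ is not known)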
THE SAMPLE PROPORTION
$p \sim N\left(\mu_p = \pi,\ \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}\right)$
The test statistic we use is
$z = \frac{p - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$
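The test statistics for the sample mean and the sample proportion all share the form (estimate minus hypothesized value) divided by the standard error. A minimal sketch, with hypothetical function names and made-up inputs:

```python
import math

def z_for_mean(xbar, mu, sigma, n):
    """z = (xbar - mu) / (sigma / sqrt(n)), population sigma known."""
    return (xbar - mu) / (sigma / math.sqrt(n))

def t_for_mean(xbar, mu, s, n):
    """t = (xbar - mu) / (s / sqrt(n)), sigma estimated by s (n - 1 degrees of freedom)."""
    return (xbar - mu) / (s / math.sqrt(n))

def z_for_proportion(p, pi, n):
    """z = (p - pi) / sqrt(pi * (1 - pi) / n)."""
    return (p - pi) / math.sqrt(pi * (1 - pi) / n)
```

The z and t functions compute the same arithmetic; the difference lies in which table (normal or t with n - 1 degrees of freedom) the result is compared against.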
UNIFORM DISTRIBUTION
WORKING WITH FINITE POPULATIONS
Whenever we have n / N > 0.05, the standard error of the sample estimate is multiplied
by the following term:
Finite Population Correction Factor = $\sqrt{\frac{N - n}{N - 1}}$
INTERVAL ESTIMATION
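[$\mu$] Sample Estimate ($\bar{X}$) $\pm\ z_{\alpha/2}\,\sigma_{\bar{X}}$ = $\bar{X} \pm z_{\alpha/2}\,(\sigma / \sqrt{n})$
or
Sample Estimate ($\bar{X}$) = $\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
[$\pi$] Sample Estimate (p) $\pm\ z_{\alpha/2}\,\sigma_p$ = $p \pm z_{\alpha/2}\sqrt{\frac{pq}{n}}$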
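The z-based intervals above can be sketched with `statistics.NormalDist` from the standard library, which supplies the critical value. The function names and default confidence level are assumptions; the t-based interval is omitted because the t critical value would have to come from a table or another library.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_interval(xbar, sigma, n, confidence=0.95):
    """xbar +/- z_{alpha/2} * sigma / sqrt(n), population sigma known."""
    margin = z_critical(confidence) * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

def proportion_interval(p, n, confidence=0.95):
    """p +/- z_{alpha/2} * sqrt(p * q / n), with q = 1 - p."""
    margin = z_critical(confidence) * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin
```

For example, `mean_interval(50, 10, 25)` gives an interval centred on 50 whose half-width is roughly 1.96 times the standard error of 2.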
Confidence Interval for the Population Total
Population Total = $N\bar{X}$
Confidence Interval Estimate:
$N\bar{X} \pm N\,(t_{n-1})\,\frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$
Confidence Interval for Total Difference
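Total Difference = $N\bar{D}$
Mean Difference: $\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$
Where: $D_i$ = audited value - original value
Confidence Interval Estimate:
$N\bar{D} \pm N\,(t_{n-1})\,\frac{s_D}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$
Where: $s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$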
ESTIMATING SAMPLE SIZE
MEAN: $n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2}$
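The sample size formula for the mean can be evaluated as below. This is a sketch under two assumptions not stated on the sheet: the critical value is taken from the standard normal distribution via `statistics.NormalDist`, and the result is rounded up to the next whole observation.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, error, confidence=0.95):
    """n = z_{alpha/2}^2 * sigma^2 / E^2, rounded up (rounding up is an assumption)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * sigma ** 2 / error ** 2)
```

Halving the acceptable error E roughly quadruples the required sample size, since E enters the formula squared.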
$\hat{y} = b_0 + b_1 x$
KEY VALUES USED IN REGRESSION CALCULATIONS
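$SS_{XY} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$
$SS_{XX} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_{YY} = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$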
ESTIMATION FORMULAE
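Slope: $b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS_{XY}}{SS_{XX}}$
Intercept: $b_0 = \bar{y} - b_1\bar{x} = \frac{\sum y}{n} - b_1\frac{(\sum x)}{n}$
$SSE = \sum y^2 - b_0 \sum y - b_1 \sum xy$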
Standard error of estimate: $s_e = \sqrt{\frac{SSE}{n - 2}}$
Coefficient of determination: $r^2 = 1 - \frac{SSE}{SS_{YY}} = 1 - \frac{SSE}{\sum y^2 - \frac{(\sum y)^2}{n}}$
Computational formula for $r^2$: $r^2 = \frac{b_1^2\,SS_{XX}}{SS_{YY}}$
Pearson Product-Moment Correlation Coefficient: $r = \frac{SS_{XY}}{\sqrt{SS_{XX}\,SS_{YY}}}$
Adjusted $R^2$: $r^2_{adj} = 1 - \left[(1 - r^2)\,\frac{n - 1}{n - k - 1}\right]$
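The regression quantities above build on one another, so it can help to see them computed in sequence. The sketch below uses the computational (sum-based) forms; the function name, the returned dictionary keys and the data are illustrative assumptions.

```python
import math

def simple_regression(xs, ys):
    """Compute the key simple linear regression quantities from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_y2 = sum(y * y for y in ys)

    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n

    b1 = ss_xy / ss_xx                        # slope
    b0 = sum_y / n - b1 * sum_x / n           # intercept
    sse = sum_y2 - b0 * sum_y - b1 * sum_xy   # sum of squared errors
    return {
        "b0": b0, "b1": b1, "sse": sse,
        "ss_xx": ss_xx, "ss_yy": ss_yy,
        "s_e": math.sqrt(sse / (n - 2)),      # standard error of estimate
        "r2": 1 - sse / ss_yy,                # coefficient of determination
        "r": ss_xy / math.sqrt(ss_xx * ss_yy),
    }

fit = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # illustrative data
```

The two expressions for the coefficient of determination, 1 - SSE/SS_YY and b1^2 SS_XX / SS_YY, should agree up to rounding, which is a convenient check on hand calculations.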
Sampling Distribution for the estimated slope
t test for slope: $t = \frac{b_1}{s_{b_1}}$ where $s_{b_1} = \frac{s_e}{\sqrt{SS_{XX}}}$
t test for correlation: $t = \frac{r - \rho}{\sqrt{\frac{1 - r^2}{n - 2}}}$
where:
$r = +\sqrt{r^2}$ if $b_1 > 0$
$r = -\sqrt{r^2}$ if $b_1 < 0$
COMPLETE MODEL EVALUATION FORMULAE
Total variation = Explained Variation + Unexplained Variation
SST = SSR + SSE
Where $SSR = \sum (\hat{y} - \bar{y})^2$
These results are used in the following definition
Coefficient of determination: $r^2 = 1 - \frac{SSE}{SST} = 1 - \frac{SSE}{\sum y^2 - \frac{(\sum y)^2}{n}}$
INTERVAL ESTIMATION
CONFIDENCE INTERVAL FOR THE CONDITIONAL MEAN
$\hat{y}_i \pm t_{n-2}\, s_{YX} \sqrt{h_i}$
where $h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
PREDICTION INTERVAL FOR AN INDIVIDUAL RESPONSE
$\hat{y}_i \pm t_{n-2}\, s_{YX} \sqrt{1 + h_i}$
where $h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
[6] TIME SERIES FORECASTING AND INDEX NUMBERS
Mean Absolute Deviation = $\frac{\sum_{i=1}^{n} |e_i|}{n}$
Mean Square Error = $\frac{\sum_{i=1}^{n} e_i^2}{n}$
Exponential Smoothing:
$F_2 = X_1$
$F_{t+1} = aX_t + (1 - a)F_t$
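The smoothing recursion and the two error measures can be sketched together. The function names are hypothetical, and taking each error as actual minus forecast (X_t - F_t) is an assumption, since the sheet does not define e_i.

```python
def exponential_smoothing(xs, a):
    """Return forecasts [F_2, ..., F_{n+1}] for observations [X_1, ..., X_n]."""
    forecasts = [xs[0]]                   # F_2 = X_1
    for x in xs[1:]:                      # X_2, X_3, ..., X_n
        # F_{t+1} = a * X_t + (1 - a) * F_t
        forecasts.append(a * x + (1 - a) * forecasts[-1])
    return forecasts

def mean_absolute_deviation(errors):
    return sum(abs(e) for e in errors) / len(errors)

def mean_square_error(errors):
    return sum(e * e for e in errors) / len(errors)

series = [10, 12, 14]                     # illustrative observations X_1..X_3
forecasts = exponential_smoothing(series, 0.5)
# Errors for periods that have both an observation and a forecast (t = 2..n);
# defining the error as X_t - F_t is an assumption
errors = [x - f for x, f in zip(series[1:], forecasts)]
```

The last element of `forecasts` is the forecast for the period after the final observation, so it has no matching error yet.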
Simple Weighted Index: $I_i = \frac{X_i}{X_0} \cdot 100$
Weighted Aggregate Price Index: $I_i = \frac{\sum P_i Q_i}{\sum P_0 Q_0}\,(100)$
Laspeyres Price Index: $I_L = \frac{\sum P_i Q_0}{\sum P_0 Q_0}\,(100)$
Paasche Price Index: $I_P = \frac{\sum P_i Q_i}{\sum P_0 Q_i}\,(100)$
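The Laspeyres and Paasche indexes differ only in which period's quantities are used as weights, which a short sketch makes visible. The function names and the two-item basket below are made up for illustration.

```python
def simple_index(x_i, x_0):
    """I_i = (X_i / X_0) * 100."""
    return x_i / x_0 * 100

def laspeyres(p0, pi, q0):
    """Base-period quantities as weights: sum(Pi*Q0) / sum(P0*Q0) * 100."""
    return sum(p * q for p, q in zip(pi, q0)) / sum(p * q for p, q in zip(p0, q0)) * 100

def paasche(p0, pi, qi):
    """Current-period quantities as weights: sum(Pi*Qi) / sum(P0*Qi) * 100."""
    return sum(p * q for p, q in zip(pi, qi)) / sum(p * q for p, q in zip(p0, qi)) * 100

# Hypothetical basket of two items
base_prices, current_prices = [1, 2], [2, 3]
base_qty, current_qty = [10, 5], [8, 6]

index_l = laspeyres(base_prices, current_prices, base_qty)
index_p = paasche(base_prices, current_prices, current_qty)
```

The two indexes coincide only when the quantity weights lead to the same ratio, for instance when quantities are unchanged between periods.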
Non-Linear Trend Forecasting:
Quadratic form: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$
Exponential trend: $Y_i = \beta_0\,\beta_1^{X_i}\,\varepsilon_i$
Exponential trend (logged transformation): $\log(Y_i) = \log(\beta_0) + X_i \log(\beta_1) + \log(\varepsilon_i)$
Model selection:
First differences (linear trend): $(Y_2 - Y_1) = (Y_3 - Y_2) = \dots = (Y_n - Y_{n-1})$
Second differences (quadratic): $[(Y_3 - Y_2) - (Y_2 - Y_1)] = [(Y_4 - Y_3) - (Y_3 - Y_2)] = \dots = [(Y_n - Y_{n-1}) - (Y_{n-1} - Y_{n-2})]$
Percentage differences (exponential): $\frac{(Y_2 - Y_1)}{Y_1} \cdot 100\% = \frac{(Y_3 - Y_2)}{Y_2} \cdot 100\% = \dots = \frac{(Y_n - Y_{n-1})}{Y_{n-1}} \cdot 100\%$
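The model selection checks amount to computing a sequence of differences and seeing which one is roughly constant. A minimal sketch, with hypothetical function names and made-up series:

```python
def first_differences(ys):
    """Y_2 - Y_1, Y_3 - Y_2, ...; roughly constant for a linear trend."""
    return [b - a for a, b in zip(ys, ys[1:])]

def second_differences(ys):
    """Differences of the first differences; roughly constant for a quadratic trend."""
    return first_differences(first_differences(ys))

def percentage_differences(ys):
    """(Y_t - Y_{t-1}) / Y_{t-1} * 100; roughly constant for an exponential trend."""
    return [(b - a) / a * 100 for a, b in zip(ys, ys[1:])]

linear = [3, 5, 7, 9]            # illustrative series
quadratic = [1, 4, 9, 16]
exponential = [100, 110, 121]
```

With real data the differences are never exactly equal, so the comparison is a judgement about which set varies least.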