
Formula Sheet (1) Descriptive Statistics: Quartiles (n+1) /4 (n+1) /2 (The Median) 3 (n+1) /4

This document contains a formula sheet with statistical formulas organized into several sections: 1) Descriptive statistics formulas including formulas for sample mean, population mean, variance, standard deviation, and quartiles. 2) Probability theory formulas including addition law, conditional probability, and multiplication rule. 3) Probability distributions including formulas for mean, variance, and binomial distribution parameters. 4) Statistical inference formulas including sampling distributions, confidence intervals, and sample size determination.

Uploaded by

Tom Afa
Copyright
© Attribution Non-Commercial (BY-NC)

FORMULA SHEET (Statistics) 1

FORMULA SHEET
[1] DESCRIPTIVE STATISTICS
Sample Mean (ungrouped): $\bar{x} = \frac{\sum x}{n}$

Population Mean (ungrouped): $\mu = \frac{\sum x}{N}$

Population Mean (grouped): $\mu_{grouped} = \frac{\sum fM}{N}$

Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$

Population variance (ungrouped): $\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{\sum x^2 - N\mu^2}{N} = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$

Population Variance (grouped): $\sigma^2 = \frac{\sum f(M - \mu)^2}{N} = \frac{\sum fM^2 - \frac{(\sum fM)^2}{N}}{N}$

The standard deviation is the positive square root of the variance.

Quartiles
First quartile position: $Q_1 = (n+1)/4$
Second quartile position: $Q_2 = (n+1)/2$ (the median)
Third quartile position: $Q_3 = 3(n+1)/4$
Where n is the number of observed values (sample size).
Rule 1: If the result is an integer, then the quartile is equal to the ranked value at that position.
Rule 2: If the result is a fractional half (2.5, 3.5, etc.), then the quartile is equal to the mean of the two corresponding ranked values.
Rule 3: If the result is neither an integer nor a fractional half, round the result to the nearest integer and select that ranked value.
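The variance formulas and the three quartile rules in this section can be sketched in Python as follows. This is only an illustration: the function names and the sample data are assumptions made for the example, and the quartile helper assumes at least two observations.

```python
import math

def sample_variance(xs):
    """Definition form: sum of squared deviations divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """Computational form: (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)

def quartile(xs, k):
    """k-th quartile (k = 1, 2, 3) using the position k(n+1)/4 and Rules 1-3."""
    ranked = sorted(xs)
    pos = k * (len(ranked) + 1) / 4          # 1-based position in the ranked data
    if pos == int(pos):                      # Rule 1: integer position
        return ranked[int(pos) - 1]
    if pos - math.floor(pos) == 0.5:         # Rule 2: fractional half
        lower = math.floor(pos)
        return (ranked[lower - 1] + ranked[lower]) / 2
    return ranked[math.floor(pos + 0.5) - 1] # Rule 3: round to nearest position

# Hypothetical data, chosen only to exercise Rules 2 and 3 (n = 10).
data = [11, 12, 13, 16, 16, 17, 18, 21, 22, 25]
q1, q2, q3 = (quartile(data, k) for k in (1, 2, 3))
```

With n = 10 the positions are 2.75, 5.5 and 8.25, so Q1 and Q3 fall under Rule 3 and the median under Rule 2.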
Coefficient of Variation: $CV = \frac{\sigma}{\mu} \cdot 100$
Standardized value or z score: $z = \frac{X - \mu}{\sigma}$ or $z = \frac{X - \bar{X}}{s}$
and for a given z value $X = \mu + z\sigma$

The Empirical Rule: The interval of values one standard deviation either side of the mean, $\bar{X} \pm 1S$, contains approximately 68% of the items or people in the sample.
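A minimal Python sketch of the coefficient of variation and z score formulas above. The function names and the numbers in the usage lines are hypothetical and serve only to show the round trip between a value and its z score.

```python
def coefficient_of_variation(mean, sd):
    """CV: standard deviation as a percentage of the mean."""
    return sd / mean * 100

def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def value_from_z(z, mean, sd):
    """Invert the z score: X = mean + z * sd."""
    return mean + z * sd

# Hypothetical values: mean 100, standard deviation 15.
z = z_score(130, 100, 15)
x = value_from_z(z, 100, 15)
```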


[ 2 ] PROBABILITY THEORY
If N is the total number of opportunities for the event to occur and x is the number of
times the event has occurred
$P(E) = \frac{x}{N}$

Addition Law: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$= P(A) + P(B)$ if A and B are mutually exclusive

Conditional Probability: $P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} = \frac{P(A \cap B)}{P(B)}$
$= P(A)$ if A and B are independent

Multiplication Rule: $P(A \cap B) = P(B)\,P(A \mid B) = P(A)\,P(B \mid A)$
$= P(A)\,P(B)$ if A and B are independent
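The probability laws above can be checked on a small example. The sketch below assumes a hypothetical experiment (one card drawn from a standard 52-card deck) that is not part of the formula sheet, and uses exact fractions so the identities hold without rounding.

```python
from fractions import Fraction

# Hypothetical events for one card drawn from a standard 52-card deck.
p_a = Fraction(13, 52)       # A: the card is a heart
p_b = Fraction(12, 52)       # B: the card is a face card (J, Q, K)
p_a_and_b = Fraction(3, 52)  # A and B: the card is a heart face card

# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Equivalent form: P(A | B) = P(A) * P(B | A) / P(B)
p_a_given_b_alt = p_a * p_b_given_a / p_b

# Here P(A | B) equals P(A), so A and B are independent
# and the multiplication rule reduces to P(A) * P(B).
independent = p_a_given_b == p_a
```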

PROBABILITY DISTRIBUTIONS

Mean or expected value of discrete distribution: $\mu = E(x) = \sum [x\,P(x)]$

Variance of a discrete distribution: $\sigma^2 = \sum [(x - \mu)^2 \cdot P(x)]$

The Covariance

Definition formula: $\sigma_{XY} = \sum_{i=1}^{N} [X_i - E(X)][Y_i - E(Y)]\,P(X_i Y_i)$
Calculation formula: $\sigma_{XY} = \sum_{i=1}^{N} X_i Y_i\,P(X_i Y_i) - E(X)\,E(Y)$

Where $X_i$, $Y_i$ = the i-th outcome of the discrete random variables X and Y respectively
$P(X_i Y_i)$ = probability of the i-th occurrence of X and Y


Portfolio Expected Return and Portfolio Risk

Portfolio expected return (weighted average return): $E(P) = wE(X) + (1 - w)E(Y)$

Portfolio risk (weighted variability): $\sigma_P = \sqrt{w^2\sigma_X^2 + (1 - w)^2\sigma_Y^2 + 2w(1 - w)\sigma_{XY}}$

Where E(P) = portfolio expected return
w = portion of portfolio value in asset X
(1 - w) = portion of portfolio value in asset Y
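To illustrate how the expected value, variance, covariance and portfolio formulas fit together, here is a short Python sketch. The joint distribution of the two assets is made-up example data, and all names are assumptions for the illustration.

```python
import math

# Hypothetical joint distribution: (probability, outcome of X, outcome of Y).
outcomes = [(0.2, -100, 200), (0.5, 100, 50), (0.3, 250, -100)]

def expected_value(pairs):
    """E(x) = sum of x * P(x); pairs are (probability, value)."""
    return sum(p * x for p, x in pairs)

def variance(pairs):
    """sigma^2 = sum of (x - mu)^2 * P(x)."""
    mu = expected_value(pairs)
    return sum((x - mu) ** 2 * p for p, x in pairs)

x_dist = [(p, x) for p, x, _ in outcomes]
y_dist = [(p, y) for p, _, y in outcomes]
e_x, e_y = expected_value(x_dist), expected_value(y_dist)

# Covariance by the definition formula and by the calculation formula.
cov_definition = sum((x - e_x) * (y - e_y) * p for p, x, y in outcomes)
cov_calculation = sum(x * y * p for p, x, y in outcomes) - e_x * e_y

def portfolio(w):
    """Expected return and risk with weight w in X and (1 - w) in Y."""
    expected_return = w * e_x + (1 - w) * e_y
    risk = math.sqrt(
        w ** 2 * variance(x_dist)
        + (1 - w) ** 2 * variance(y_dist)
        + 2 * w * (1 - w) * cov_definition
    )
    return expected_return, risk

expected_return, risk = portfolio(0.5)
```

In this example the covariance is strongly negative, so an equal split has far lower risk than either asset held alone.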


Binomial Distribution: $P(X) = \frac{n!}{X!\,(n - X)!}\,p^X q^{n-X} = {}^{n}C_X\,p^X (1 - p)^{n-X}$ (if using a calculator)
$\mu = np$ and $\sigma^2 = np(1 - p)$
$q = 1 - p$
The standard deviation is the positive square root of the variance.
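The binomial formula can be evaluated directly with the standard library. The sketch below assumes Python 3.8 or later for `math.comb`; the function names and the example parameters are hypothetical.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_summary(n, p):
    """Mean, variance and standard deviation of a binomial distribution."""
    mean = n * p
    var = n * p * (1 - p)
    return mean, var, sqrt(var)

# Hypothetical example: exactly 2 successes in 4 trials with p = 0.5.
prob = binomial_pmf(2, 4, 0.5)
mean, var, sd = binomial_summary(4, 0.5)
```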


[ 3 & 4 ] STATISTICAL INFERENCE
SAMPLING DISTRIBUTIONS
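SAMPLE MEAN
$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\right)$

If the population standard deviation is not known we use the unbiased estimate s when we find the estimated value of $\sigma_{\bar{X}}$. The Testing Statistics we use are either
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ (when $\sigma$ is known)
or
$t = \frac{\bar{X} - \mu}{s / \sqrt{n}}$ (when $\sigma$ is not known)

THE SAMPLE PROPORTION
$\hat{p} \sim N\left(\mu_{\hat{p}} = \pi,\ \sigma_{\hat{p}} = \sqrt{\frac{\pi(1 - \pi)}{n}}\right)$
The Testing Statistic we use is
$z = \frac{\hat{p} - \pi}{\sqrt{\frac{\pi(1 - \pi)}{n}}}$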
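As a rough sketch, the test statistics for a sample mean and a sample proportion can be computed as below. The function names and the numbers are assumptions for illustration; the same function serves for z and t because only the standard deviation that is plugged in differs (the population value for z, the sample estimate s for t).

```python
from math import sqrt

def statistic_for_mean(x_bar, mu, sd, n):
    """(x_bar - mu) / (sd / sqrt(n)); z if sd is sigma, t if sd is s."""
    return (x_bar - mu) / (sd / sqrt(n))

def z_for_proportion(p_hat, pi, n):
    """(p_hat - pi) / sqrt(pi * (1 - pi) / n)."""
    return (p_hat - pi) / sqrt(pi * (1 - pi) / n)

# Hypothetical numbers: sample mean 52 against mu = 50, sigma = 10, n = 25.
z_mean = statistic_for_mean(52, 50, 10, 25)
# Hypothetical numbers: sample proportion 0.6 against pi = 0.5, n = 100.
z_prop = z_for_proportion(0.6, 0.5, 100)
```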


UNIFORM DISTRIBUTION




WORKING WITH FINITE POPULATIONS
Whenever we have n / N > 0.05, the standard error of the sample estimate is multiplied
by the following term
Finite Population Correction Factor = $\sqrt{\frac{N - n}{N - 1}}$





INTERVAL ESTIMATION
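[ $\mu$ ] Sample Estimate ($\bar{X}$) $\pm\ z_{\alpha/2}\,\sigma_{\bar{X}}$ = $\bar{X} \pm z_{\alpha/2}\,(\sigma / \sqrt{n})$
or
Sample Estimate ($\bar{X}$) $\pm\ t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$

[ p ] Sample Estimate ($\hat{p}$) $\pm\ z_{\alpha/2}\,\sigma_{\hat{p}}$ = $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$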
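A minimal sketch of the z-based interval estimates, assuming Python 3.8 or later so that the critical value can be taken from `statistics.NormalDist`. The function names and example inputs are hypothetical, and the t-based interval is left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} of the standard normal."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def mean_interval(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- z_{alpha/2} * sigma / sqrt(n), sigma assumed known."""
    margin = z_critical(confidence) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

def proportion_interval(p_hat, n, confidence=0.95):
    """p_hat +/- z_{alpha/2} * sqrt(p_hat * q_hat / n), with q_hat = 1 - p_hat."""
    margin = z_critical(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical inputs.
mean_low, mean_high = mean_interval(50, 10, 25)
prop_low, prop_high = proportion_interval(0.4, 100)
```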



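Confidence Interval for the Population Total

Population Total = $N\bar{X}$

Confidence Interval Estimate: $N\bar{X} \pm N\,(t_{n-1})\,\frac{s}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$

Confidence Interval for Total Difference

Total Difference = $N\bar{D}$

Mean Difference: $\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$

Where: $D_i$ = audited value - original value

Confidence Interval Estimate: $N\bar{D} \pm N\,(t_{n-1})\,\frac{s_D}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$

Where: $s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$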
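The interval for a population total combines the t-based margin with the finite population correction factor. In the sketch below the critical value `t_crit` is passed in by the caller because it has to be looked up in a t table (1.9842 for 99 degrees of freedom at 95% confidence is used as an assumed value); the function names and all inputs are hypothetical.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def total_interval(N, n, x_bar, s, t_crit):
    """N * x_bar +/- N * t * (s / sqrt(n)) * fpc."""
    center = N * x_bar
    margin = N * t_crit * (s / sqrt(n)) * fpc(N, n)
    return center - margin, center + margin

# Hypothetical audit: population of 1000 items, sample of 100.
low, high = total_interval(N=1000, n=100, x_bar=50, s=10, t_crit=1.9842)
```

The same function covers the total difference if the mean difference and its standard deviation are passed in place of `x_bar` and `s`.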


ESTIMATING SAMPLE SIZE
MEAN: $n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2}$, where $E = (\bar{X} - \mu)$ is the error of estimation

PROPORTION: $n = \frac{z_{\alpha/2}^2\,\pi(1 - \pi)}{E^2}$, where E is the largest value of $(\hat{p} - \pi)$ we will tolerate
In both cases we take the solution for n given by the formula and ROUND UP.
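The rounding-up step is easy to get wrong by hand, so here is a short sketch of both sample size formulas. The function names and inputs are assumptions for the example, and z = 1.96 is the usual table value for 95% confidence.

```python
from math import ceil

def sample_size_mean(z, sigma, error):
    """n = z^2 * sigma^2 / E^2, rounded up to the next whole unit."""
    return ceil(z ** 2 * sigma ** 2 / error ** 2)

def sample_size_proportion(z, pi, error):
    """n = z^2 * pi * (1 - pi) / E^2, rounded up to the next whole unit."""
    return ceil(z ** 2 * pi * (1 - pi) / error ** 2)

# Hypothetical inputs.
n_mean = sample_size_mean(z=1.96, sigma=15, error=3)           # formula gives 96.04
n_prop = sample_size_proportion(z=1.96, pi=0.5, error=0.05)    # formula gives 384.16
```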

[ 5 ] REGRESSION ANALYSIS
POPULATION REGRESSION FUNCTION (PRF)
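$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

SAMPLE REGRESSION FUNCTION (SRF)
$\hat{y} = b_0 + b_1 x$

KEY VALUES USED IN REGRESSION CALCULATIONS
$SS_{xy} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$
$SS_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_{yy} = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$

ESTIMATION FORMULAE
Slope: $b_1 = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} = \frac{SS_{xy}}{SS_{xx}}$

Intercept: $b_0 = \bar{y} - b_1\bar{x} = \frac{\sum y}{n} - b_1\frac{(\sum x)}{n}$

$SSE = \sum y^2 - b_0 \sum y - b_1 \sum xy$

Standard error of estimate: $s_e = \sqrt{\frac{SSE}{n - 2}}$

Coefficient of determination: $r^2 = 1 - \frac{SSE}{SS_{yy}} = 1 - \frac{SSE}{\sum y^2 - \frac{(\sum y)^2}{n}}$

Computational formula for $r^2$: $r^2 = \frac{b_1^2\,SS_{xx}}{SS_{yy}}$

Pearson Product-Moment Correlation Coefficient: $r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$

Adjusted $R^2$: $r^2_{adj} = 1 - \left[(1 - r^2)\,\frac{n - 1}{n - k - 1}\right]$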
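The regression quantities above build on one another, which the following Python sketch walks through in order. The five data points are invented for the example and the variable names are assumptions; `k = 1` reflects a single explanatory variable.

```python
from math import sqrt

# Hypothetical data set.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)

# Key sums of squares (computational forms).
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# Estimated slope and intercept.
b1 = ss_xy / ss_xx
b0 = sum_y / n - b1 * sum_x / n

# Error sum of squares and standard error of estimate.
sse = sum_y2 - b0 * sum_y - b1 * sum_xy
s_e = sqrt(sse / (n - 2))

# Coefficient of determination, three equivalent routes.
r_squared = 1 - sse / ss_yy
r_squared_computational = b1 ** 2 * ss_xx / ss_yy
r = ss_xy / sqrt(ss_xx * ss_yy)

# Adjusted r^2 with k explanatory variables.
k = 1
r_squared_adj = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
```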

Sampling Distribution for the estimated slope
t test for slope: $t = \frac{b_1 - \beta_1}{s_b}$, where $s_b = \frac{s_e}{\sqrt{SS_{xx}}}$

t test for correlation: $t = \frac{r - \rho}{\sqrt{\frac{1 - r^2}{n - 2}}}$

where: $r = +\sqrt{r^2}$ if $b_1 > 0$, and $r = -\sqrt{r^2}$ if $b_1 < 0$
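A brief sketch of the two t statistics, assuming the usual null hypotheses of a zero slope and a zero correlation as defaults. The function names and the inputs are hypothetical; the inputs correspond to a five-point data set with slope 0.6, SSE 2.4, SS_xx 10 and r^2 0.6.

```python
from math import sqrt

def t_for_slope(b1, s_e, ss_xx, beta1=0.0):
    """t = (b1 - beta1) / s_b, with s_b = s_e / sqrt(SS_xx)."""
    s_b = s_e / sqrt(ss_xx)
    return (b1 - beta1) / s_b

def t_for_correlation(r, n, rho=0.0):
    """t = (r - rho) / sqrt((1 - r^2) / (n - 2))."""
    return (r - rho) / sqrt((1 - r ** 2) / (n - 2))

def signed_r(r_squared, b1):
    """r takes the sign of the slope b1."""
    return sqrt(r_squared) if b1 > 0 else -sqrt(r_squared)

# Hypothetical inputs from a simple regression with n = 5.
t_slope = t_for_slope(b1=0.6, s_e=sqrt(2.4 / 3), ss_xx=10)
t_corr = t_for_correlation(r=signed_r(0.6, 0.6), n=5)
```

In simple linear regression with both null values set to zero, the two statistics agree, which makes a useful consistency check on hand calculations.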


COMPLETE MODEL EVALUATION FORMULAE
Total variation = Explained Variation + Unexplained Variation
SST = SSR + SSE
Where $SSR = \sum (\hat{y} - \bar{y})^2$

These results are used in the following definition
Coefficient of determination: $r^2 = 1 - \frac{SSE}{SST} = 1 - \frac{SSE}{\sum y^2 - \frac{(\sum y)^2}{n}}$


INTERVAL ESTIMATION
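CONFIDENCE INTERVAL FOR THE CONDITIONAL MEAN
$\hat{y}_i \pm t_{n-2}\,s_{YX}\sqrt{h_i}$, where $h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$

PREDICTION INTERVAL FOR AN INDIVIDUAL RESPONSE
$\hat{y}_i \pm t_{n-2}\,s_{YX}\sqrt{1 + h_i}$, where $h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$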
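The two regression intervals differ only in the term under the square root, as this sketch shows. The critical value is supplied by the caller (3.1824 is an assumed table value for 3 degrees of freedom at 95% confidence), and the function names, data and fitted values are hypothetical.

```python
from math import sqrt

def h_value(x_i, xs):
    """h_i = 1/n + (x_i - x_bar)^2 / sum of (x - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    return 1 / n + (x_i - x_bar) ** 2 / sum((x - x_bar) ** 2 for x in xs)

def conditional_mean_interval(y_hat, t_crit, s_yx, h):
    """y_hat +/- t * s_yx * sqrt(h)."""
    margin = t_crit * s_yx * sqrt(h)
    return y_hat - margin, y_hat + margin

def prediction_interval(y_hat, t_crit, s_yx, h):
    """y_hat +/- t * s_yx * sqrt(1 + h)."""
    margin = t_crit * s_yx * sqrt(1 + h)
    return y_hat - margin, y_hat + margin

# Hypothetical fit: x values 1..5, fitted value 4.6 at x = 4, s_yx = sqrt(0.8).
h = h_value(4, [1, 2, 3, 4, 5])
ci = conditional_mean_interval(4.6, 3.1824, sqrt(0.8), h)
pi = prediction_interval(4.6, 3.1824, sqrt(0.8), h)
```

The prediction interval is always the wider of the two because it also allows for the scatter of an individual response around the conditional mean.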






[6] TIME SERIES FORECASTING AND INDEX NUMBERS

Mean Absolute Deviation = $\frac{\sum_{i=1}^{n} |e_i|}{n}$

Mean Square Error = $\frac{\sum_{i=1}^{n} e_i^2}{n}$
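Both accuracy measures take a list of forecast errors (actual minus forecast). A minimal sketch, with hypothetical function names and made-up errors:

```python
def mean_absolute_deviation(errors):
    """MAD = sum of |e_i| divided by n."""
    return sum(abs(e) for e in errors) / len(errors)

def mean_square_error(errors):
    """MSE = sum of e_i^2 divided by n."""
    return sum(e ** 2 for e in errors) / len(errors)

# Hypothetical forecast errors.
errors = [2, -1, 3, -2]
mad = mean_absolute_deviation(errors)
mse = mean_square_error(errors)
```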


Exponential Smoothing:

$F_2 = X_1$

$F_{t+1} = aX_t + (1 - a)F_t$
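The recursion can be sketched as a short loop. The function name, the observations and the smoothing constant a = 0.5 are assumptions for the example; the list it returns holds the forecasts for periods 2 through n + 1.

```python
def exponential_smoothing(xs, a):
    """Forecasts [F_2, ..., F_{n+1}] for observations [X_1, ..., X_n]."""
    forecasts = [xs[0]]                  # F_2 = X_1
    for x_t in xs[1:]:
        # F_{t+1} = a * X_t + (1 - a) * F_t
        forecasts.append(a * x_t + (1 - a) * forecasts[-1])
    return forecasts

# Hypothetical series of three observations.
forecasts = exponential_smoothing([10, 12, 14], a=0.5)
```

A larger a makes the forecast follow recent observations more closely, while a smaller a gives a smoother series.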


Simple Weighted Index: $I_i = \frac{X_i}{X_0} \cdot 100$
Weighted Aggregate Price Index: $I_i = \frac{\sum P_i Q_i}{\sum P_0 Q_0}\,(100)$

Laspeyres Price Index: $I_L = \frac{\sum P_i Q_0}{\sum P_0 Q_0}\,(100)$

Paasche Price Index: $I_P = \frac{\sum P_i Q_i}{\sum P_0 Q_i}\,(100)$
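The Laspeyres and Paasche indices differ only in which period's quantities are used as weights. The following sketch uses a made-up basket of two goods; the function names and all prices and quantities are hypothetical.

```python
def laspeyres_index(p0, pi, q0):
    """Current prices over base prices, both weighted by base-period quantities."""
    numerator = sum(p * q for p, q in zip(pi, q0))
    denominator = sum(p * q for p, q in zip(p0, q0))
    return numerator / denominator * 100

def paasche_index(p0, pi, qi):
    """Current prices over base prices, both weighted by current-period quantities."""
    numerator = sum(p * q for p, q in zip(pi, qi))
    denominator = sum(p * q for p, q in zip(p0, qi))
    return numerator / denominator * 100

# Hypothetical basket of two goods.
base_prices, base_quantities = [2, 5], [10, 4]
current_prices, current_quantities = [3, 6], [8, 5]

laspeyres = laspeyres_index(base_prices, current_prices, base_quantities)
paasche = paasche_index(base_prices, current_prices, current_quantities)
```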



Non-Linear Trend Forecasting:

Quadratic form: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$
Exponential trend: $Y_i = \beta_0\,\beta_1^{X_i}\,\varepsilon_i$
Exponential trend (logged transformation): $\log(Y_i) = \log(\beta_0) + X_i \log(\beta_1) + \log(\varepsilon_i)$

Model selection:

First differences (linear trend): $(Y_2 - Y_1) = (Y_3 - Y_2) = \cdots = (Y_n - Y_{n-1})$
Second differences (quadratic): $[(Y_3 - Y_2) - (Y_2 - Y_1)] = [(Y_4 - Y_3) - (Y_3 - Y_2)] = \cdots = [(Y_n - Y_{n-1}) - (Y_{n-1} - Y_{n-2})]$
Percentage differences (exponential): $\frac{(Y_2 - Y_1)}{Y_1}\,100\% = \frac{(Y_3 - Y_2)}{Y_2}\,100\% = \cdots = \frac{(Y_n - Y_{n-1})}{Y_{n-1}}\,100\%$
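The model selection checks amount to computing three sets of differences and seeing which is roughly constant. A minimal sketch, with hypothetical function names and idealised series chosen so that each check comes out exactly constant:

```python
def first_differences(ys):
    """Y_2 - Y_1, Y_3 - Y_2, ...; roughly constant for a linear trend."""
    return [b - a for a, b in zip(ys, ys[1:])]

def second_differences(ys):
    """Differences of the first differences; roughly constant for a quadratic trend."""
    return first_differences(first_differences(ys))

def percentage_differences(ys):
    """(Y_{t+1} - Y_t) / Y_t * 100%; roughly constant for an exponential trend."""
    return [(b - a) / a * 100 for a, b in zip(ys, ys[1:])]

# Hypothetical idealised series.
linear = first_differences([3, 5, 7, 9])
quadratic = second_differences([1, 4, 9, 16])
exponential = percentage_differences([2, 4, 8, 16])
```

Real data will not give perfectly equal differences, so in practice the check is which set varies least.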