Number of classes: $k = 1 + 3.3\log_{10} N$
Range: $R = x_m - x_0$ ($x_m$ = maximum value; $x_0$ = minimum value)
Class interval: $h = \dfrac{R}{k}$
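The class-count, range, and class-interval formulas can be sketched in Python as follows. This is only an illustration: the function name `class_layout`, the sample data, and the choice to round $k$ up to a whole number are assumptions, not part of the sheet.

```python
import math

def class_layout(data):
    """Return (k, R, h) with k = 1 + 3.3*log10(N), R = max - min, h = R / k."""
    n = len(data)
    # Rounding k up to an integer is an assumption; the sheet gives only the raw formula.
    k = math.ceil(1 + 3.3 * math.log10(n))
    r = max(data) - min(data)
    h = r / k
    return k, r, h

sample = [12, 15, 21, 24, 30, 33, 37, 41, 46, 52]  # hypothetical data, N = 10
k, r, h = class_layout(sample)
print(k, r, h)
```

In practice the class interval is often rounded to a convenient width after this calculation.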
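Population mean: $\mu = \dfrac{\sum x}{N}$ (for ungrouped data; $N$ = population size)
Sample mean: $\bar{x} = \dfrac{\sum x}{n}$ (for ungrouped data)
Sample mean: $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$ (for grouped data)
Weighted mean: $\bar{x}_w = \dfrac{\sum x_i w_i}{\sum w_i}$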
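Geometric mean: $G = \sqrt[n]{x_1 x_2 \cdots x_n}$
$\log G = \dfrac{1}{n}\left(\log x_1 + \log x_2 + \cdots + \log x_n\right) = \dfrac{1}{n}\sum \log x_i$
$G = \operatorname{antilog}\left[\dfrac{1}{n}\sum \log x_i\right]$ (for ungrouped data)
$G = \operatorname{antilog}\left[\dfrac{\sum f_i \log x_i}{n}\right]$ (for grouped data; $n = \sum f_i$)
Harmonic mean: $H = \dfrac{n}{\sum \dfrac{1}{x_i}}$ (for ungrouped data)
$H = \dfrac{n}{\sum f_i \dfrac{1}{x_i}}$ (for grouped data; $n = \sum f_i$)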
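The ungrouped mean formulas can be compared side by side with a small Python sketch. The function names and the data are hypothetical; the geometric mean uses natural logarithms, which give the same result as the base-10 antilog form.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def weighted_mean(xs, ws):
    # sum of x_i * w_i divided by the sum of the weights
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

def geometric_mean(xs):
    # antilog of the mean of the logs (requires positive values)
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    # n divided by the sum of reciprocals (requires non-zero values)
    return len(xs) / sum(1 / x for x in xs)

values = [2, 4, 8]    # hypothetical positive data
weights = [1, 1, 2]   # hypothetical weights

print(arithmetic_mean(values), weighted_mean(values, weights))
print(geometric_mean(values), harmonic_mean(values))
```

For positive data the three averages are ordered as harmonic mean ≤ geometric mean ≤ arithmetic mean, which is a quick sanity check on any hand calculation.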
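Median: $\tilde{x} = \left(\dfrac{n}{2}\right)$th or $\left(\dfrac{n+1}{2}\right)$th value (for ungrouped data)
Median: $\tilde{x} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right)$ (for grouped data)
($l$ = lower class boundary of the class containing the median;
$h$ = class interval;
$f$ = frequency of the class containing the median;
$n$ = sample size;
$C$ = cumulative frequency of the class preceding the class containing the median)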
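A minimal sketch of the grouped-data median, assuming equal class widths and classes listed in ascending order. The function name `grouped_median` and the frequency table are hypothetical; `C` is taken as the cumulative frequency up to the class before the median class.

```python
def grouped_median(lower_bounds, h, freqs):
    """Median = l + (h / f) * (n/2 - C) for a frequency table with equal class width h."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    cum = 0  # C: cumulative frequency of the classes before the current one
    for l, f in zip(lower_bounds, freqs):
        if cum + f >= n / 2:          # first class whose cumulative frequency reaches n/2
            return l + (h / f) * (n / 2 - cum)
        cum += f
    raise ValueError("median class not found")

# Hypothetical table: classes 0-10, 10-20, 20-30, 30-40
median = grouped_median([0, 10, 20, 30], 10, [5, 8, 12, 5])
print(median)
```

Here $n = 30$, so the median class is 20-30 with $l = 20$, $f = 12$, and $C = 13$.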
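Lower quartile: $Q_1 = \left(\dfrac{n+1}{4}\right)$th value
Middle quartile (median): $Q_2 = \left(\dfrac{2(n+1)}{4}\right)$th value
Upper quartile: $Q_3 = \left(\dfrac{3(n+1)}{4}\right)$th value
Mode = most frequent or repeated value (for ungrouped data)
Mode $= l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$ (for grouped data)
Mode $= l + \dfrac{f_2}{f_1 + f_2} \times h$
Mean $-$ Mode $= 3(\text{Mean} - \text{Median})$
Mode $= 3\,\text{Median} - 2\,\text{Mean}$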
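Coefficient of dispersion $= \dfrac{x_m - x_0}{x_m + x_0}$
Quartile deviation: $\text{Q.D} = \dfrac{Q_3 - Q_1}{2}$
Coefficient of $\text{Q.D} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
Mean deviation: $\text{M.D} = \dfrac{\sum |x_i - \bar{x}|}{n}$ (for ungrouped sample data)
Mean deviation: $\text{M.D} = \dfrac{\sum |x_i - \mu|}{N}$ (for ungrouped population data)
Mean deviation: $\text{M.D} = \dfrac{\sum f_i |x_i - \bar{x}|}{n}$ (for grouped sample data)
Coefficient of $\text{M.D} = \dfrac{\text{M.D}}{\text{Mean}}$ or $\dfrac{\text{M.D}}{\text{Median}}$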
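Population variance: $\sigma^2 = \dfrac{\sum (x_i - \mu)^2}{N}$ (for ungrouped population data)
Sample variance: $S^2 = \dfrac{\sum (x_i - \bar{x})^2}{n}$ (for ungrouped sample data)
Sample variance: $S^2 = \dfrac{\sum f_i (x_i - \bar{x})^2}{n}$ (for grouped sample data)
Population standard deviation: $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$
Sample standard deviation: $S = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n}}$
Sample standard deviation: $S = \sqrt{\dfrac{\sum f_i (x_i - \bar{x})^2}{n}}$ (for grouped data)
Coefficient of $\text{S.D} = \dfrac{\text{S.D}}{\text{Mean}}$
Coefficient of variation: $\text{C.V} = \dfrac{S}{\bar{x}} \times 100$ (sample); $\text{C.V} = \dfrac{\sigma}{\mu} \times 100$ (population)
Population standard variable: $Z_i = \dfrac{x_i - \mu}{\sigma}$
Sample standard variable: $z_i = \dfrac{x_i - \bar{x}}{S}$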
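The variance, standard deviation, coefficient of variation, and standardized values can be checked with a short Python sketch. It follows the divisor $n$ used for $S^2$ in this sheet (the $n - 1$ divisor appears only later, for the small-sample estimate $s$). Function names and data are hypothetical.

```python
import math

def variance(xs):
    # Sum of squared deviations from the mean, divided by n (the sheet's convention here)
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def std_dev(xs):
    return math.sqrt(variance(xs))

def coeff_variation(xs):
    # C.V = (S / mean) * 100
    return std_dev(xs) / (sum(xs) / len(xs)) * 100

def z_scores(xs):
    m, s = sum(xs) / len(xs), std_dev(xs)
    return [(x - m) / s for x in xs]

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5
print(variance(data), std_dev(data), coeff_variation(data))
print(z_scores(data))
```

Standardized values always have mean 0, which is a convenient check on the computation.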
Permutation: ${}^{n}P_k = \dfrac{n!}{(n-k)!}$
Combination: ${}^{n}C_k = \dbinom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$
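As a quick check of the permutation and combination formulas, Python's standard library provides `math.perm` and `math.comb` (available from Python 3.8). The values of `n` and `k` below are arbitrary illustrations.

```python
import math

n, k = 5, 2  # hypothetical values

# Factorial formulas written out explicitly
perm_formula = math.factorial(n) // math.factorial(n - k)
comb_formula = math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Built-in equivalents
perm_builtin = math.perm(n, k)
comb_builtin = math.comb(n, k)

print(perm_formula, perm_builtin)
print(comb_formula, comb_builtin)
```

Each combination corresponds to $k!$ permutations, so the two counts differ by exactly that factor.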
Probability of event $A$: $P(A) = \dfrac{m}{n} = \dfrac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$
Axiom (i): For any event $E_i$, $0 \le P(E_i) \le 1$
Axiom (ii): $P(S) = 1$ for the sure event $S$
Axiom (iii): If $A$ and $B$ are mutually exclusive events (subsets) then $P(A \cup B) = P(A) + P(B)$
$P(A) = \dfrac{\text{Number of sample points in } A}{\text{Number of sample points in } S} = \dfrac{n(A)}{n(S)}$
$A \cup \bar{A} = S$
$P(A \cup \bar{A}) = P(S)$
$P(A) + P(\bar{A}) = 1$
$P(\bar{A}) = 1 - P(A)$
$P(A \cup B) = P(A) + P(B)$ (for mutually exclusive events)
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
$P(A \cap B) = \dfrac{n(A \cap B)}{n(S)}$
Probability of $A$ given that event $B$ has occurred: $P(A/B) = \dfrac{P(A \cap B)}{P(B)}$
Probability of $B$ given that event $A$ has occurred: $P(B/A) = \dfrac{P(A \cap B)}{P(A)}$
$P(A \cap B) = P(A)\,P(B/A) = P(B)\,P(A/B)$ (dependent events)
$P(A \cap B) = P(A)\,P(B)$ (independent events)
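The addition and conditional-probability rules can be verified by counting sample points. The sketch below uses a single roll of a fair die as a hypothetical example, with exact fractions to avoid rounding; the events and the helper `prob` are illustrative only.

```python
from fractions import Fraction

S = set(range(1, 7))               # sample space of one fair die roll
A = {x for x in S if x % 2 == 0}   # event A: even number
B = {x for x in S if x > 3}        # event B: number greater than 3

def prob(event):
    # P(E) = n(E) / n(S) for equally likely sample points
    return Fraction(len(event), len(S))

p_union = prob(A) + prob(B) - prob(A & B)   # general addition rule
p_a_given_b = prob(A & B) / prob(B)         # P(A/B)
p_b_given_a = prob(A & B) / prob(A)         # P(B/A)

print(p_union, p_a_given_b, p_b_given_a)
print(prob(A & B) == prob(A) * prob(B))     # False here: A and B are dependent
```

Because $P(A \cap B) = 1/3$ differs from $P(A)P(B) = 1/4$, these two events are dependent, so the multiplication rule with the conditional probability applies.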
Binomial Distribution: (1) There are 2 outcomes, i.e. success and failure
(2) Probability of success remains constant (called $p$)
(3) Successive trials are independent
(4) Experiment is repeated a fixed number of times (called $n$)
Expectation: $E(x) = \sum x\,P(X = x)$ (discrete r.v. $X$)
Expectation: $E(x) = \int x f(x)\,dx$ (continuous r.v. $X$)
$\int x^n\,dx = \dfrac{x^{n+1}}{n+1}$
Variance: $Var(x) = E(x^2) - (E(x))^2$
$P(X = x) = {}^{n}C_x\,p^x q^{n-x}$; $x = 0, 1, 2, \ldots, n$ for the binomial distribution
($x$ = number of successes;
$p$ = probability of success;
$q = 1 - p$ = probability of failure)
Expectation: $E(x) = np$
Variance: $Var(x) = npq$
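The binomial results $E(x) = np$ and $Var(x) = npq$ can be confirmed numerically from the general definitions of expectation and variance. The parameters below are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    # P(X = x) = nCx * p^x * q^(n - x), with q = 1 - p
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical number of trials and success probability
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))                  # E(x) = sum of x * P(X = x)
var = sum(x * x * px for x, px in enumerate(pmf)) - mean ** 2   # E(x^2) - (E(x))^2

print(sum(pmf), mean, var)
print(n * p, n * p * (1 - p))
```

The two printed lines should agree up to floating-point rounding, and the probabilities should sum to 1.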
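Poisson Distribution: if $n \to \infty$ and $p \to 0$; if $np < 7$
$P(X = x) = \dfrac{e^{-\mu}\mu^{x}}{x!}$; $x = 0, 1, 2, \ldots, \infty$; $e = 2.7183$; $\mu = np$
Normal Distribution: Mean = Median = Mode
p.d.f.: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$; $-\infty < x < \infty$; $-\infty < \mu < \infty$; $\sigma > 0$
$X \sim N(\mu, \sigma^2)$, i.e. $X$ is normally distributed with mean $\mu$ and variance $\sigma^2$
Standard normal variable: $Z = \dfrac{x - \mu}{\sigma}$
$\because E(Z) = 0$ and $Var(Z) = 1$
Standard normal distribution: $f(z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}Z^2}$; $-\infty < Z < \infty$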
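To illustrate when the Poisson distribution stands in for the binomial (large $n$, small $p$), the sketch below compares the two probability functions and also evaluates the normal density. All parameter values and function names are hypothetical.

```python
import math

def poisson_pmf(x, mu):
    # P(X = x) = e^(-mu) * mu^x / x!
    return math.exp(-mu) * mu ** x / math.factorial(x)

def binomial_pmf(x, n, p):
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_pdf(x, mu, sigma):
    # Density of N(mu, sigma^2), written in terms of the standardized value z
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

n, p = 1000, 0.002   # large n, small p
mu = n * p           # Poisson mean, here 2.0

# Largest pointwise gap between the two distributions over x = 0..10
gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(11))
print(gap)
print(normal_pdf(0, 0, 1))  # peak of the standard normal density
```

The gap is small for these parameters, which is the sense in which the Poisson formula approximates the binomial one.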
Acceptance region $= 1 - \alpha$
Rejection region = critical region $= \alpha$
Level of significance $\alpha$ = probability of rejecting the null hypothesis when it is true.
$\alpha = 5\%$ when not given
If the alternative hypothesis is $H_1: \mu < \mu_0$, it is a one-tailed (left-tailed) test
If the alternative hypothesis is $H_1: \mu > \mu_0$, it is a one-tailed (right-tailed) test
If the alternative hypothesis is $H_1: \mu \ne \mu_0$, it is a two-tailed test
The null hypothesis $H_0$ always contains the "=" sign while $H_1$ does not.
Type I error: when a true $H_0$ is rejected
Type II error: when a false $H_0$ is accepted
General procedure for testing a hypothesis:
1) State the hypotheses, i.e. $H_0$ and $H_1$
2) Choose/define $\alpha$
3) Define the test statistic
4) Perform the computation
5) Define the critical region
6) State the conclusion
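Hypothesis testing about the population mean when $n$ is large, i.e. $n \ge 30$
$Z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$ under $H_0$ (when $\sigma$ is known)
$Z = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} \sim N(0,1)$ under $H_0$ (when $\sigma$ is unknown)
$\sigma/\sqrt{n} = \sqrt{\sigma^2/n}$ and $S/\sqrt{n} = \sqrt{S^2/n}$
Reject $H_0$ when: (1) $Z_c < -Z_{\alpha/2}$ or $Z_c > Z_{\alpha/2}$ (for two-tailed test)
(2) $Z_c < -Z_{\alpha}$ (for left-tailed test)
(3) $Z_c > Z_{\alpha}$ (for right-tailed test)
If $\alpha = 0.10$ then $Z_{\alpha/2} = 1.645$ and $Z_{\alpha} = 1.28$
If $\alpha = 0.05$ then $Z_{\alpha/2} = 1.96$ and $Z_{\alpha} = 1.645$
If $\alpha = 0.01$ then $Z_{\alpha/2} = 2.58$ and $Z_{\alpha} = 2.33$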
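The large-sample test about a mean can be sketched as below for the two-tailed case with known $\sigma$. The function name, its parameters, and the sample figures are hypothetical; the default critical value 1.96 is the tabulated $Z_{\alpha/2}$ for $\alpha = 0.05$.

```python
import math

def z_test_two_tailed(x_bar, mu0, sigma, n, z_crit=1.96):
    """Return (Z_c, reject) for H0: mu = mu0 against H1: mu != mu0."""
    z_c = (x_bar - mu0) / (sigma / math.sqrt(n))
    # Reject H0 when Z_c falls in either tail of the critical region
    reject = z_c < -z_crit or z_c > z_crit
    return z_c, reject

# Hypothetical sample: n = 36, sample mean 52, known sigma 6, H0: mu = 50
z_c, reject = z_test_two_tailed(52, 50, 6, 36)
print(z_c, reject)
```

With these figures the standard error is $6/\sqrt{36} = 1$, so $Z_c = 2$, which lies just beyond 1.96.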
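Hypothesis testing about the difference between two means when $n$ is large, i.e. $n \ge 30$
$H_0: \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$ or $\mu_1 - \mu_2 = \Delta_0$
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0,1)$ under $H_0$ (when $\sigma$ is known)
$Z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \sim N(0,1)$ under $H_0$ (when $\sigma$ is unknown)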
Hypothesis testing about the population proportion, $p$, when $n$ is large, i.e. $n \ge 30$
$H_0: p = p_0$
$Z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} \sim N(0,1)$ under $H_0$ ($\hat{p} = \dfrac{x}{n}$ = sample proportion; $p$ = population proportion)
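Hypothesis testing about the difference between two proportions when $n$ is large, i.e. $n \ge 30$
$H_0: p_1 = p_2$ or $p_1 - p_2 = 0$ or $p_1 - p_2 = \Delta_0$
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}} \sim N(0,1)$ under $H_0$ (when $H_0: p_1 - p_2 = \Delta_0 \ne 0$)
$Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{p_c q_c\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0,1)$ under $H_0$ (when $H_0: p_1 - p_2 = 0$)
$p_c$ = estimated common proportion $= \dfrac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2}$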
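For the case $H_0: p_1 - p_2 = 0$, the pooled statistic can be sketched as follows. The function name and the counts are hypothetical, and 1.96 is the two-tailed critical value for $\alpha = 0.05$.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: p1 - p2 = 0."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)   # estimated common proportion
    qc = 1 - pc
    se = math.sqrt(pc * qc * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Hypothetical counts: 60 successes out of 100 versus 45 out of 100
z = two_proportion_z(60, 100, 45, 100)
reject_two_tailed = abs(z) > 1.96
print(z, reject_two_tailed)
```

Pooling is appropriate only under the hypothesis of equal proportions; for a non-zero hypothesized difference the unpooled standard error applies instead.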
Probability value ($p$-value) = maximum probability of rejecting $H_0$ when it is true.
Reject $H_0$ if $p$-value $\le \alpha$
Accept $H_0$ if $p$-value $> \alpha$
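Assumptions for t-test: (1) Sample is selected randomly
(2) Population is normally distributed
(3) In case of two samples, both populations have the same variance
Hypothesis testing about the mean when $n$ is small, i.e. $n < 30$, and $\sigma$ is unknown
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, which under $H_0$ follows the t-distribution with $v = n - 1$ degrees of freedom
$s$ = estimated standard deviation $= \sqrt{\dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}}$; $\sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$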
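A minimal sketch of the small-sample t statistic, using the shortcut $\sum x_i^2 - (\sum x_i)^2/n$ for the sum of squared deviations. The function name and the data are hypothetical; the critical value still has to be read from a t table for $v = n - 1$.

```python
import math

def one_sample_t(xs, mu0):
    """Return (t, v) for H0: mu = mu0, with s computed using the n - 1 divisor."""
    n = len(xs)
    x_bar = sum(xs) / n
    # Shortcut form of the sum of squared deviations about the mean
    ss = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s = math.sqrt(ss / (n - 1))
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

t, v = one_sample_t([10, 12, 9, 11, 13], 10)  # hypothetical sample, H0: mu = 10
print(t, v)
```

Here $\bar{x} = 11$ and the sum of squared deviations is 10, so $s = \sqrt{2.5}$.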
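Hypothesis testing about the difference of means of normal distributions
when $n$ is small, i.e. $n < 30$, and $\sigma$ is unknown
$H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 - \mu_2 = \Delta_0$ or $\mu_1 = \mu_2$
Assume $\sigma_1 = \sigma_2 = \sigma$; $s_p$ = pooled estimate of $\sigma$
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{\sum_{i=1}^{n_1}(x_{1i} - \bar{x}_1)^2 + \sum_{i=1}^{n_2}(x_{2i} - \bar{x}_2)^2}{n_1 + n_2 - 2}$
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$, which under $H_0$ follows the t-distribution with $v = n_1 + n_2 - 2$ degrees of freedom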
Hypothesis testing about two means of normal distributions with paired or dependent observations
when $n$ is small, i.e. $n < 30$, and $\sigma$ is unknown
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$, which under $H_0$ follows the t-distribution with $v = n - 1$ degrees of freedom, where $x = x_1 - x_2$ or $x_2 - x_1$
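Assumptions for F-test: (1) Sample is selected randomly
(2) Population is normally distributed
Hypothesis testing that population variances are equal, i.e. $\sigma_1^2 = \sigma_2^2$
$H_0: \sigma_1^2 = \sigma_2^2$ or $\dfrac{\sigma_1^2}{\sigma_2^2} = 1$
$F = \dfrac{s_1^2}{s_2^2}$ where $s_1^2 > s_2^2$, or $F = \dfrac{s_2^2}{s_1^2}$ where $s_2^2 > s_1^2$, which under $H_0$ follows the
$F$ distribution with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom
$E(s_1^2) = \sigma_1^2$ and $E(s_2^2) = \sigma_2^2$
$F_{1-\alpha}(v_1, v_2) = \dfrac{1}{F_{\alpha}(v_2, v_1)}$
Critical regions:
Two-tailed: $H_1: \sigma_1^2 \ne \sigma_2^2$; $F > F_{\alpha/2}(v_1, v_2)$ or $F < F_{1-\alpha/2}(v_1, v_2) = \dfrac{1}{F_{\alpha/2}(v_2, v_1)}$
One-tailed (right): $H_1: \sigma_1^2 > \sigma_2^2$; $F > F_{\alpha}(v_1, v_2)$
One-tailed (left): $H_1: \sigma_1^2 < \sigma_2^2$; $F > F_{\alpha}(v_2, v_1)$, where $F = \dfrac{s_2^2}{s_1^2}$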
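Hypothesis testing of equality of $k$ ($k > 2$) population means (ANOVA = Analysis of Variance)
$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
$H_1$: Not all means are equal, i.e. not all $\mu_j$'s are equal ($j = 1, 2, \ldots, k$)
$F = \dfrac{s_b^2}{s_w^2}$, which under $H_0$ follows the $F$ distribution with $v_1 = k - 1$ and $v_2 = \left(\sum_{j=1}^{k} n_j - k\right)$ degrees of freedom
$s_b^2$ = between-samples mean square $= \dfrac{\sum_{j=1}^{k} n_j(\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$
$s_w^2$ = within-samples mean square $= \dfrac{\sum_{j=1}^{k}(n_j - 1)s_j^2}{\sum_{j=1}^{k} n_j - k}$
$\bar{\bar{x}}$ = grand mean $= \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{n_1 + n_2 + \cdots + n_k}$
Critical region: $F > F_{\alpha}(v_1, v_2)$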
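The one-way ANOVA statistic can be computed directly from the between-samples and within-samples quantities. This is a sketch with a hypothetical function name and made-up groups; it uses the fact that $\sum (n_j - 1)s_j^2$ equals the within-group sum of squared deviations.

```python
def anova_f(groups):
    """Return (F, v1, v2) for one-way ANOVA on a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_total   # grand mean

    # Between-samples mean square: sum of n_j * (mean_j - grand)^2, divided by k - 1
    sb2 = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means)) / (k - 1)

    # Within-samples mean square: pooled squared deviations, divided by (sum of n_j) - k
    sw2 = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n_total - k)

    return sb2 / sw2, k - 1, n_total - k

f_stat, v1, v2 = anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])  # hypothetical samples
print(f_stat, v1, v2)
```

The resulting $F$ is compared with the tabulated $F_{\alpha}(v_1, v_2)$ to decide whether it falls in the critical region.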
Simple Linear Regression Model: $y = \beta_0 + \beta_1 x$
Estimated Linear Regression Model: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
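The sheet does not state how the coefficients are estimated. Assuming ordinary least squares, which is the usual choice, the slope is $S_{xy}/S_{xx}$ and the intercept is $\bar{y} - \hat{\beta}_1\bar{x}$. The sketch below applies those assumed formulas to hypothetical data; `fit_line`, `b0`, and `b1` are illustrative names.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x (assumed method)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate
    return b0, b1

b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # hypothetical points on y = 1 + 2x
fitted = [b0 + b1 * x for x in [1, 2, 3, 4]]
print(b0, b1, fitted)
```

Since the example points lie exactly on a line, the fitted values reproduce the observed ones.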