Chapter 2 Statistics
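1. $\bar{X} = \dfrac{\sum fX}{\sum f}$ (for grouped data)
2. $\bar{X} = A + \dfrac{\sum fD}{\sum f}$, where $D = X - A$ (short-cut method)
3. $\bar{X} = A + \dfrac{\sum fU}{\sum f} \times h$, where $U = \dfrac{X - A}{h}$ (short-cut method)*
*For equal-interval data only.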
Example: Find the arithmetic mean of the marks obtained by 9 students, given below:
45, 32, 37, 46, 39, 36, 41, 48, 36
First Method: $\bar{X} = \dfrac{\sum X}{n} = \dfrac{360}{9} = 40$ marks.
Second Method: Let $A = 39$, $D = X - A = X - 39$.

 X     D
32    -7
36    -3
36    -3
37    -2
39     0
41     2
45     6
46     7
48     9
$\sum X = 360$, $\sum D = 9$

$\bar{X} = A + \dfrac{\sum D}{n} = 39 + \dfrac{9}{9} = 40$
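The two methods above can be checked with a short Python sketch. The function names `mean_direct` and `mean_shortcut` are illustrative choices, not part of the notes.

```python
marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]

def mean_direct(values):
    """Direct method: sum of the values divided by their count."""
    return sum(values) / len(values)

def mean_shortcut(values, assumed_mean):
    """Short-cut method: A + (sum of deviations D = X - A) / n."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)

print(mean_direct(marks))        # 40.0
print(mean_shortcut(marks, 39))  # 40.0
```

Any assumed mean A gives the same answer; choosing A near the centre of the data only keeps the deviations small for hand calculation.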
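Example with $n = 10$, $A = 30$ and $h = 6$:

 X     D = X - 30    U = D/6
 6       -24           -4
12       -18           -3
18       -12           -2
24        -6           -1
30         0            0
36         6            1
42        12            2
48        18            3
54        24            4
60        30            5
$\sum X = 330$, $\sum D = 30$, $\sum U = 5$

Second Method: $\bar{X} = A + \dfrac{\sum D}{n} = 30 + \dfrac{30}{10} = 33$

Third Method: $U = \dfrac{X - A}{h} = \dfrac{X - 30}{6}$, so $\bar{X} = A + \dfrac{\sum U}{n} \times h = 30 + \dfrac{5}{10} \times 6 = 33$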
$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3 + \dots + n_k\bar{X}_k}{n_1 + n_2 + n_3 + \dots + n_k} = \dfrac{\sum n\bar{X}}{\sum n}$
4) Arithmetic mean is affected by change of origin and scale.
(i) If $Y = X \pm b$ then $\bar{Y} = \bar{X} \pm b$
(ii) If $Y = aX$ then $\bar{Y} = a\bar{X}$
OR if $Y = aX \pm b$ then $\bar{Y} = a\bar{X} \pm b$
5) The mean of constant values is the constant itself, i.e. if $X = a$ then $\bar{X} = a$.
Weighted Mean: $\bar{X}_w = \dfrac{W_1X_1 + W_2X_2 + W_3X_3 + \dots + W_kX_k}{W_1 + W_2 + W_3 + \dots + W_k} = \dfrac{\sum WX}{\sum W}$
Example: Calculate the weighted mean.

Item              X      W       WX
Food             290    7.5    2175
Rent              54    2.0     108
Clothing          98    1.5     147
Fuel and light    75    1.0      75
Other items       75    0.5      37.5
Total                  12.5    2542.5

$\bar{X}_w = \dfrac{\sum WX}{\sum W} = \dfrac{2542.5}{12.5} = 203.4$
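The weighted mean of this example can be reproduced with a few lines of Python. This is a sketch; the tuple layout `(item, X, W)` and the name `weighted_mean` are illustrative choices.

```python
# (item, X, W) rows taken from the example table
rows = [
    ("Food", 290, 7.5),
    ("Rent", 54, 2.0),
    ("Clothing", 98, 1.5),
    ("Fuel and light", 75, 1.0),
    ("Other items", 75, 0.5),
]

def weighted_mean(table):
    """Weighted mean: sum(W * X) / sum(W)."""
    total_wx = sum(x * w for _, x, w in table)
    total_w = sum(w for _, _, w in table)
    return total_wx / total_w

print(weighted_mean(rows))  # 203.4
```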
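Example: change of origin and scale with $Y = 2X + 3$.

 X    X - \bar{X}     Y
11       -8          25
13       -6          29
14       -5          31
18       -1          39
24        5          51
26        7          55
27        8          57
$\sum X = 133$, $\sum (X - \bar{X}) = 0$, $\sum Y = 287$

$\bar{X} = \dfrac{\sum X}{n} = \dfrac{133}{7} = 19$
$\bar{Y} = \dfrac{\sum Y}{n} = \dfrac{287}{7} = 41$
$\bar{Y} = 2\bar{X} + 3 = 2(19) + 3 = 41$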
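A minimal Python check of the same example, assuming the transformation Y = 2X + 3 used in the worked figures. The variable names are illustrative.

```python
x = [11, 13, 14, 18, 24, 26, 27]
y = [2 * v + 3 for v in x]          # change of scale (2) and origin (3)

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

print(mean_x)                        # 19.0
print(mean_y)                        # 41.0
print(2 * mean_x + 3)                # 41.0, same as mean_y
print(sum(v - mean_x for v in x))    # 0.0, deviations from the mean sum to zero
```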
Combined Mean is $\bar{X} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2 + n_3\bar{x}_3}{n_1 + n_2 + n_3} = \dfrac{7249}{120} = 60.4$
Geometric Mean: The geometric mean, G, of a set of n positive values X1, X2, X3, ..., Xn is the
nth root of the product of the values. Its formula is given by:
$G = \text{antilog}\left(\dfrac{\sum \log X}{n}\right)$ (for ungrouped data)
$G = \text{antilog}\left(\dfrac{\sum f \log X}{\sum f}\right)$ (for grouped data)
Example: The marks obtained by 9 students are given below:
45, 32, 37, 46, 39, 36, 41, 48, 36
$G = \sqrt[9]{45 \times 32 \times \dots \times 36} = 39.68$
Alternative Method: $G = \text{antilog}\left(\dfrac{\sum \log X}{n}\right) = \text{antilog}(1.59856) = 39.68$
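The logarithm method can be sketched in Python as follows. The function name `geometric_mean` is an illustrative choice; it rejects zero or negative observations, for which the geometric mean is not defined.

```python
import math

def geometric_mean(values):
    """Antilog of the mean of the base-10 logarithms of the values."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log

marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
print(round(geometric_mean(marks), 2))  # 39.68
```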
The geometric mean is used when we measure the average growth or depreciation in the data.
Note: It is not possible to find the geometric mean of the data if
i- any observation in the data is zero
ii- any observation in the data is negative
Harmonic mean: The harmonic mean, H, of a set of n values X1, X2, ..., Xn is the reciprocal of the
arithmetic mean of the reciprocals of the values.
$H = \dfrac{n}{\sum \frac{1}{X}}$ (for ungrouped data)
$H = \dfrac{\sum f}{\sum \frac{f}{X}}$ (for grouped data)
The harmonic mean is used when we measure an average speed or rate in the data.
Example: Suppose a car is running at 15 km/hr for the first 30 km, 20 km/hr for the second 30 km and
25 km/hr for the third 30 km. The distance is constant but the times are variable. Therefore, the
harmonic mean is the correct average.
$H = \text{reciprocal of } \dfrac{\frac{1}{15} + \frac{1}{20} + \frac{1}{25}}{3} = 19.15$ km/hr
Note:- It is not possible to find Harmonic mean of the data if any observation in the data is zero
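The car example can be verified in Python by comparing the harmonic mean of the speeds with total distance divided by total time. This is a sketch; `harmonic_mean` is written out by hand to mirror the formula.

```python
def harmonic_mean(values):
    """n divided by the sum of reciprocals; undefined if any value is zero."""
    if any(x == 0 for x in values):
        raise ValueError("harmonic mean is undefined when a value is zero")
    return len(values) / sum(1 / x for x in values)

speeds = [15, 20, 25]      # km/hr, each held over an equal 30 km stretch
leg_km = 30

h = harmonic_mean(speeds)
total_time = sum(leg_km / s for s in speeds)       # hours
overall_speed = leg_km * len(speeds) / total_time  # km/hr

print(round(h, 2))              # 19.15
print(round(overall_speed, 2))  # 19.15
```

The two figures agree because equal distances make the time on each leg proportional to the reciprocal of the speed.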
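Example: Given the following frequency distribution
Solution:
$\bar{X} = \dfrac{\sum fX}{\sum f} = 122.5$, with $\sum f = 60$
$H = \dfrac{\sum f}{\sum \frac{f}{X}} = \dfrac{60}{0.53044} = 113.11$
$G = \text{antilog}\left(\dfrac{\sum f \log X}{\sum f}\right) = \text{antilog}\left(\dfrac{124.2483}{60}\right) = \text{antilog}(2.0708) = 117.7$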
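Mode: The mode is defined as that value in the data which occurs the greatest number of times, provided
such a value exists. It is denoted by $\hat{x}$. A data set may have more than one mode or no mode at all.
Mode for grouped data: $\hat{x} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
where l is the lower class boundary of the modal class, $f_m$ is the frequency of the modal class, $f_1$ and $f_2$
are the frequencies of the classes preceding and following it, and h is the class width.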
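Median: The median of a set of values arranged in ascending order of magnitude is defined as the middle
value if the number of values is odd and the mean of the two middle values if the number of values is even.
For grouped data: $\tilde{x} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - c\right)$
where l is the lower class boundary of the median class, h its width, f its frequency, and c the cumulative
frequency of the class preceding it.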
(i) If n is odd (or $\frac{n}{2}$ is not an integer), the median is the middle value of the array in ascending order, i.e.
Median = $\left(\dfrac{n+1}{2}\right)$th term (or round $\dfrac{n}{2}$ off to the next integer).
(ii) If n is even (or $\frac{n}{2}$ is an integer), the median is the average of the two middle terms, i.e. the average of
the $\dfrac{n}{2}$th and $\left(\dfrac{n}{2} + 1\right)$th terms.
QUANTILES:
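For ungrouped data:
$Q_1$ = value of the $\dfrac{n}{4}$th item, $Q_2$ = median, $Q_3$ = value of the $\dfrac{3n}{4}$th item
For grouped data:
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - c\right)$, $Q_2$ = median, $Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - c\right)$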
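Deciles:
If $\dfrac{jn}{10}$ is not an integer then $D_j$ is the observation with ordinal number $\left[\dfrac{jn}{10}\right] + 1$,
OR round $\dfrac{jn}{10}$ off to the next integer.
If $\dfrac{jn}{10}$ is an integer then $D_j$ is the average of the two observations with ordinal numbers $\dfrac{jn}{10}$ and $\dfrac{jn}{10} + 1$.
For grouped data: $D_1 = l + \dfrac{h}{f}\left(\dfrac{n}{10} - C\right)$, ..., $D_9 = l + \dfrac{h}{f}\left(\dfrac{9n}{10} - C\right)$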
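Percentiles:
If $\dfrac{jn}{100}$ is not an integer then $P_j$ is the observation with ordinal number $\left[\dfrac{jn}{100}\right] + 1$,
OR round $\dfrac{jn}{100}$ off to the next integer.
If $\dfrac{jn}{100}$ is an integer then $P_j$ is the average of the two observations with ordinal numbers $\dfrac{jn}{100}$ and $\dfrac{jn}{100} + 1$.
For grouped data: $P_1 = l + \dfrac{h}{f}\left(\dfrac{n}{100} - c\right)$, ..., $P_{99} = l + \dfrac{h}{f}\left(\dfrac{99n}{100} - c\right)$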
Quantiles: Collectively the quartiles, deciles, percentiles and other values obtained by equal sub division
of the data are called quantiles.
Example: The marks obtained by 9 students are given below:
45, 32, 37, 46, 39, 36, 41, 48, 36
Arranged in ascending order: 32, 36, 36, 37, 39, 41, 45, 46, 48
Here n = 9 and $\dfrac{n}{2} = 4.5$ is not an integer (round off to the next integer).
Hence Median = value of the 5th term = 39.
Now $\dfrac{n}{4} = 2.25$ is not an integer (round off to the next integer), so $Q_1$ = 3rd item = 36.
Now $\dfrac{3n}{4} = 6.75$ is not an integer (round off to the next integer), so $Q_3$ = 7th item = 45.
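The ordinal-number rule used for the median, quartiles, deciles and percentiles of ungrouped data can be sketched as one Python function. The name `quantile_by_position` and its parameters `j` and `parts` are illustrative; the sketch assumes 0 < j < parts.

```python
def quantile_by_position(values, j, parts):
    """j-th quantile when the data are split into `parts` equal parts.

    If j*n/parts is not an integer, take the observation at the next
    integer position; if it is an integer, average that observation
    and the following one. Positions are 1-based, as in the notes.
    """
    data = sorted(values)
    whole, remainder = divmod(j * len(data), parts)
    if remainder:                      # not an integer: round up
        return data[whole]             # 1-based position whole + 1
    return (data[whole - 1] + data[whole]) / 2

marks = [45, 32, 37, 46, 39, 36, 41, 48, 36]
print(quantile_by_position(marks, 1, 2))  # median -> 39
print(quantile_by_position(marks, 1, 4))  # Q1 -> 36
print(quantile_by_position(marks, 3, 4))  # Q3 -> 45
```

Statistical packages often use interpolation rules that differ from this one, so their quartiles may not match the hand-calculated values exactly.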
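Example: Given the following frequency distribution
Mean $= \bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{?}{905}$
Median $= l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right) = 59.5 + \dfrac{10}{304}(452.5 - 285) = 65$
Mode $= l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h = 59.5 + \dfrac{304 - 190}{(304 - 190) + (304 - 211)} \times 10 = 65$
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - c\right) = 49.5 + \dfrac{10}{190}(226.25 - 95) = 56$
$Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - c\right) = 69.5 + \dfrac{10}{211}(678.75 - 589) = 74$
$D_8 = l + \dfrac{h}{f}\left(\dfrac{8n}{10} - c\right) = 69.5 + \dfrac{10}{211}(724 - 589) = 76$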
$P_{68} = l + \dfrac{h}{f}\left(\dfrac{68n}{100} - c\right) = ?$
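The grouped-data formulas in this example share one pattern, which the following Python sketch makes explicit. The function names and argument order are illustrative; the class boundaries, frequencies and cumulative frequencies are the ones quoted in the worked figures, with n = 905.

```python
def grouped_quantile(l, h, f, n, c, fraction):
    """l + (h / f) * (fraction * n - c) for the class containing the quantile."""
    return l + (h / f) * (fraction * n - c)

def grouped_mode(l, h, fm, f1, f2):
    """l + (fm - f1) / ((fm - f1) + (fm - f2)) * h for the modal class."""
    return l + (fm - f1) / ((fm - f1) + (fm - f2)) * h

n = 905
median = grouped_quantile(59.5, 10, 304, n, 285, 1 / 2)
q1 = grouped_quantile(49.5, 10, 190, n, 95, 1 / 4)
q3 = grouped_quantile(69.5, 10, 211, n, 589, 3 / 4)
d8 = grouped_quantile(69.5, 10, 211, n, 589, 8 / 10)
mode = grouped_mode(59.5, 10, 304, 190, 211)

# Rounded to whole marks, as in the worked example
print(round(median), round(mode), round(q1), round(q3), round(d8))
```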
Question: Find Mean, Median, Mode, Q1 , Q3 , D4 , P74 for the following frequency distribution
The Semi-Interquartile Range or the Quartile Deviation: Q.D. $= \dfrac{Q_3 - Q_1}{2}$
The Mean Deviation from the Mean: M.D. $= \dfrac{\sum |X - \bar{X}|}{n}$
The Standard Deviation:
$s = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n - 1}}$ standard deviation (unbiased)
$S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$ standard deviation (biased)
The Variance: The variance is defined as the square of the standard deviation, i.e. the mean of the squared
deviations from the mean. It is given by
$s^2 = \dfrac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$ (for ungrouped data)
$S^2 = \dfrac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$ (for ungrouped data)
$S^2 = \dfrac{1}{n} \sum_{i=1}^{k} f_i (X_i - \bar{X})^2 = \dfrac{\sum f (X - \bar{X})^2}{n}$ (for grouped data)
where $\text{Cov}(X, Y) = \dfrac{1}{N} \sum (X - \bar{X})(Y - \bar{Y})$.
v) The variance has the minimal property. This means that the variance or the standard deviation
is a minimum if and only if the deviations are taken from the mean. In other words,
$\dfrac{1}{n} \sum_{i=1}^{n} (X_i - a)^2$ is a minimum if and only if $a = \bar{X}$.
i) Mean Deviation = $\dfrac{4}{5}$ (Standard Deviation)
ii) Semi-Interquartile Range = $\dfrac{2}{3}$ (Standard Deviation)
iii) Semi-Interquartile Range = $\dfrac{5}{6}$ (Mean Deviation)
Example: The marks obtained by a sample of n = 9 students are given below:
45, 32, 37, 46, 39, 36, 41, 48, 36
Sample Mean: $\bar{X} = \dfrac{\sum X}{n} = \dfrac{360}{9} = 40$
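Example: A population of N = 10 values: 7, 8, 10, 13, 14, 19, 20, 25, 26, 28
Population Mean: $\mu = \dfrac{\sum X}{N} = \dfrac{170}{10} = 17$
Population Standard Deviation: $\sigma = \sqrt{\dfrac{\sum (X - \mu)^2}{N}} = \sqrt{\dfrac{534}{10}} = 7.31$
Population Variance: $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N} = \dfrac{534}{10} = 53.4$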
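The population figures can be reproduced with Python's standard `statistics` module, whose `pvariance` and `pstdev` divide by N. The sample versions `variance` and `stdev` divide by n - 1 and are shown only for contrast.

```python
import statistics

population = [7, 8, 10, 13, 14, 19, 20, 25, 26, 28]

mu = statistics.mean(population)
squared_deviations = sum((x - mu) ** 2 for x in population)

print(mu)                                          # 17
print(squared_deviations)                          # 534
print(statistics.pvariance(population))            # divides by N
print(round(statistics.pstdev(population), 2))     # 7.31
print(round(statistics.variance(population), 2))   # divides by n - 1
```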
Example: Array the data in ascending order: 32, 36, 36, 37, 39, 41, 45, 46, 48
Winsorized Mean $= \dfrac{36 + 36 + 36 + 37 + 39 + 41 + 45 + 45 + 45}{9} = 40$
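A minimal Python sketch of this calculation, assuming (as the replaced values suggest) that observations below Q1 = 36 are raised to 36 and observations above Q3 = 45 are lowered to 45. The function name and the clamping limits are illustrative.

```python
def winsorized_mean(values, low, high):
    """Mean after clamping every value into the interval [low, high]."""
    clamped = [min(max(x, low), high) for x in values]
    return sum(clamped) / len(clamped)

marks = [32, 36, 36, 37, 39, 41, 45, 46, 48]
print(winsorized_mean(marks, 36, 45))  # 40.0
```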
X X
2
Q No 4: By adding 5 to each of the numbers in the set 3, 6, 2, 1, 5, 7, we obtain the set 8, 11, 7, 6, 10, 12.
Show that the two sets have the same standard deviation but different means.
Solve: Denoting the first and second sets by X and Y respectively, $\bar{X} = \sum X / n = 24/6 = 4$ and
$\bar{Y} = \sum Y / n = 54/6 = 9$. The relation between X and Y is Y = 5 + X, so $\bar{Y} = 5 + \bar{X} = 5 + 4 = 9$.
Since the standard deviation is independent of the origin (5 in this case), it will remain the same
for the two sets.
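The claim can be checked numerically with the standard `statistics` module. This sketch uses the population standard deviation `pstdev`; the sample version would show the same invariance.

```python
import statistics

x = [3, 6, 2, 1, 5, 7]
y = [v + 5 for v in x]   # change of origin by 5

print(statistics.mean(x), statistics.mean(y))      # 4 9
print(statistics.pstdev(x), statistics.pstdev(y))  # identical values
```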
Q No 5: The scores obtained by five students on a set of examination papers were 70, 50, 60, 70, 50.
These scores are changed by (i) adding 10 points to all scores (ii) increasing all scores by
10%. What effect will these changes have on the standard deviation?
Solve: (i) Denoting the given scores by X and changed scores by adding 10 by Y, then Y = X + 10.
By property of the variance
Var(Y) = Var (X + 10) = Var (X) or S.D. (Y) = S.D. (X).
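(ii) $Y = \dfrac{1}{3}(4 + 3X) = \dfrac{4}{3} + X$, so $\bar{Y} = \dfrac{4}{3} + \bar{X} = \dfrac{34}{3}$, and $\text{Var}(Y) = \text{Var}\left(\dfrac{1}{3}(4 + 3X)\right) = \text{Var}\left(\dfrac{4}{3} + X\right) = \text{Var}(X)$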
Q No 7: A manufacturer of T.V. tubes has two types of tubes, A and B. The tubes have respective mean
lifetimes $\bar{X}_A = 1495$ hours and $\bar{X}_B = 1895$ hours, with $S_A = 280$ and $S_B = 310$. Which tube has the
greater (i) absolute dispersion (ii) relative dispersion?
Solve: (i) Since $S_B$ is greater than $S_A$, tube B has the greater absolute dispersion.
(ii) C.V. (A) $= \dfrac{S_A}{\bar{X}_A} \times 100 = \dfrac{280}{1495} \times 100 = 18.73\%$ (coefficient of variation)
C.V. (B) $= \dfrac{S_B}{\bar{X}_B} \times 100 = \dfrac{310}{1895} \times 100 = 16.36\%$
Since C.V. (A) is greater than C.V. (B), tube A has the greater relative dispersion.
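The comparison can be sketched in Python; the function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion as a percentage: (S / mean) * 100."""
    return std_dev / mean * 100

cv_a = coefficient_of_variation(280, 1495)
cv_b = coefficient_of_variation(310, 1895)

print(round(cv_a, 2))  # 18.73
print(round(cv_b, 2))  # 16.36
print("A" if cv_a > cv_b else "B", "has the greater relative dispersion")
```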
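Moments about the origin (raw moments): $\mu'_r = \dfrac{\sum X_i^r}{N}$ for population data
Moments about the mean (central moments): $\mu_r = \dfrac{\sum (x_i - \mu)^r}{N}$ for population data
$m_r = \dfrac{\sum (x_i - \bar{x})^r}{n}$ for sample data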
The relationship between Raw Moments and Central Moments:
$\mu_1 = 0$
$\mu_2 = \mu'_2 - (\mu'_1)^2$
$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$
$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$
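These identities can be checked numerically. The Python sketch below uses the small data set 3, 4, 6, 8, 11, 12, 14, 15 as illustrative input and compares the converted raw moments with central moments computed directly; the function names are illustrative.

```python
def raw_moment(values, r):
    """r-th moment about the origin."""
    return sum(x ** r for x in values) / len(values)

def central_moment(values, r):
    """r-th moment about the mean."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** r for x in values) / len(values)

data = [3, 4, 6, 8, 11, 12, 14, 15]
r1, r2, r3, r4 = (raw_moment(data, r) for r in (1, 2, 3, 4))

# Central moments obtained from raw moments
mu2 = r2 - r1 ** 2
mu3 = r3 - 3 * r2 * r1 + 2 * r1 ** 3
mu4 = r4 - 4 * r3 * r1 + 6 * r2 * r1 ** 2 - 3 * r1 ** 4

for r, converted in ((2, mu2), (3, mu3), (4, mu4)):
    print(r, round(converted, 3), round(central_moment(data, r), 3))
```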
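Variance: $S^2 = m_2$ and $\sigma^2 = \mu_2$ (the variance is equal to the second moment about the mean).
Moment Ratios: $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$, $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$ (for a population) and $b_1 = \dfrac{m_3^2}{m_2^3}$, $b_2 = \dfrac{m_4}{m_2^2}$ (for a sample).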
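Example: From the sample data 32, 36, 36, 37, 39, 41, 45, 46, 48, find the first four moments about the mean.
Here $\bar{X} = 40$ and $n = 9$.
$m_1 = \dfrac{\sum (X - \bar{X})}{n} = 0$, $m_2 = \dfrac{\sum (X - \bar{X})^2}{n} = \dfrac{232}{9} = 25.78$
$m_3 = \dfrac{\sum (X - \bar{X})^3}{n} = 20.67$, $m_4 = \dfrac{\sum (X - \bar{X})^4}{n} = 1189.78$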
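The four sample moments about the mean can be recomputed with a short Python sketch; `central_moment` is an illustrative helper that divides by n, as in the formula for $m_r$.

```python
def central_moment(values, r):
    """r-th sample moment about the mean, dividing by n."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)

marks = [32, 36, 36, 37, 39, 41, 45, 46, 48]
m1, m2, m3, m4 = (central_moment(marks, r) for r in (1, 2, 3, 4))

print(m1)            # 0.0
print(round(m2, 2))  # 25.78
print(round(m3, 2))  # 20.67
print(round(m4, 2))  # 1189.78
```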
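Example: A population data set in ascending order: 3, 4, 6, 8, 11, 12, 14, 15. Find the first four moments about the mean.
$\mu = \dfrac{\sum X}{N} = \dfrac{73}{8} = 9.125$
$\mu_1 = \dfrac{\sum (X - \mu)}{N} = 0$, $\mu_2 = \dfrac{\sum (X - \mu)^2}{N} = \dfrac{144.875}{8} = 18.11$
$\mu_3 = \dfrac{\sum (X - \mu)^3}{N} = \dfrac{-47.344}{8} = -5.92$, $\mu_4 = \dfrac{\sum (X - \mu)^4}{N} = \dfrac{4031.08}{8} = 503.89$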
KURTOSIS
Kurtosis: shows the degree of peakedness of the distribution.
Leptokurtic Distribution:
A distribution having a relatively high peak is called a leptokurtic distribution ($\beta_2 > 3$).
Platykurtic Distribution:
A distribution which is flat-topped is called a platykurtic distribution ($\beta_2 < 3$).
Mesokurtic Distribution (or Normal):
A distribution which is neither highly peaked nor flat-topped is called a mesokurtic distribution ($\beta_2 = 3$).
(I) Moment Coefficient of Kurtosis: $\beta_2 = \dfrac{m_4}{m_2^2}$
(II) Percentile Coefficient of Kurtosis: $k = \dfrac{Q.D}{P_{90} - P_{10}}$
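As a sketch, the moment coefficient of kurtosis can be computed and classified in Python. The sample of marks 32, 36, 36, 37, 39, 41, 45, 46, 48 is used as illustrative input, and the function names are assumptions made for this example.

```python
def central_moment(values, r):
    """r-th moment about the mean, dividing by n."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** r for x in values) / len(values)

def moment_kurtosis(values):
    """Moment coefficient of kurtosis: m4 / m2**2."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

def classify(b2, tolerance=1e-9):
    """Compare the coefficient with 3, the value for a normal curve."""
    if abs(b2 - 3) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if b2 > 3 else "platykurtic"

marks = [32, 36, 36, 37, 39, 41, 45, 46, 48]
b2 = moment_kurtosis(marks)
print(round(b2, 2), classify(b2))  # 1.79 platykurtic
```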
FORMULAS SHEET
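Arithmetic Mean
Grouped data: $\bar{X} = \dfrac{\sum fX}{\sum f}$
Ungrouped data: $\bar{X} = \dfrac{\sum X}{N}$
Short-cut or deviation method, where D is X - a:
Grouped data: $\bar{X} = a + \dfrac{\sum fD}{\sum f}$
Ungrouped data: $\bar{X} = a + \dfrac{\sum D}{N}$
Step-deviation or coding method, where $u = \dfrac{x - a}{h}$:
Grouped data: $\bar{X} = a + \dfrac{\sum fu}{\sum f} \times h$
Ungrouped data: $\bar{X} = a + \dfrac{\sum u}{N} \times h$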
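Weighted Mean
Grouped data: $\bar{X}_w = \dfrac{\sum WX}{\sum W}$
Ungrouped data: $\bar{X}_w = \dfrac{\sum WX}{\sum W}$
Geometric Mean
Grouped data: $G = \text{antilog}\left(\dfrac{\sum f \log X}{\sum f}\right)$
Ungrouped data: $G = \sqrt[n]{X_1 X_2 \dots X_n} = (X_1 \times X_2 \times X_3 \times \dots \times X_n)^{1/n}$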
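Median
Grouped data: $\tilde{X} = l + \dfrac{h}{f}\left(\dfrac{n}{2} - C\right)$
Ungrouped data: $\tilde{X} = \left(\dfrac{n + 1}{2}\right)$th value
Harmonic Mean
Grouped data: $H.M = \dfrac{\sum f}{\sum \frac{f}{X}}$
Ungrouped data: $H.M = \dfrac{n}{\sum \frac{1}{X}}$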
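Quartiles
Grouped data: $Q_k = l + \dfrac{h}{f}\left(\dfrac{kn}{4} - C\right)$
Ungrouped data: $Q_k = k\left(\dfrac{n + 1}{4}\right)$th value
Deciles
Grouped data: $D_k = l + \dfrac{h}{f}\left(\dfrac{kn}{10} - C\right)$
Ungrouped data: $D_k = k\left(\dfrac{n + 1}{10}\right)$th value
Percentiles
Grouped data: $P_k = l + \dfrac{h}{f}\left(\dfrac{kn}{100} - C\right)$
Ungrouped data: $P_k = k\left(\dfrac{n + 1}{100}\right)$th value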
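Mode
Grouped data: $\hat{X} = l + \dfrac{f_m - f_1}{(f_m - f_1) + (f_m - f_2)} \times h$
Ungrouped data: most common value
Range
$R = X_m - X_0$
Coefficient of dispersion $= \dfrac{X_m - X_0}{X_m + X_0}$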
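Mean Deviation (from the mean)
Grouped data: $M.D = \dfrac{\sum f|X - \bar{X}|}{\sum f}$
Ungrouped data: $M.D = \dfrac{\sum |X - \bar{X}|}{n}$
Quartile Deviation
$Q.D = \dfrac{Q_3 - Q_1}{2}$, or semi-interquartile range
Coefficient of Quartile Deviation $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$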
Variance
Population: $\sigma^2 = \dfrac{\sum (X - \mu)^2}{N}$
Sample: $S^2 = \dfrac{\sum (X - \bar{X})^2}{n}$
Karl Pearson's Coefficient of Skewness
Karl Pearson's 1st coefficient of skewness: $S.K = \dfrac{\text{Mean} - \text{Mode}}{S.D}$
Karl Pearson's 2nd coefficient of skewness: $S.K = \dfrac{3(\text{Mean} - \text{Median})}{S.D}$
Bowley's or quartile coefficient of skewness: $S.K = \dfrac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$