UNIT 4 Stats Text


MEASURES OF DISPERSION


where $P_{90}$ and $P_{10}$ are the 90th and 10th percentiles respectively. The semi-percentile range, i.e., $\frac{P_{90} - P_{10}}{2}$, can also be used, but is not commonly employed.

The Mean Deviation

The two methods of dispersion discussed above, namely, range and quartile deviation, are not measures of dispersion in the strict sense of the term because they do not show the scatterness around an average. However, to study the formation of a distribution we should take the deviations from an average. The two other measures, namely, the average deviation and the standard deviation, help us in achieving this goal.

The mean deviation is also known as the average deviation. It is the average difference between the items in a distribution and the median or mean of that series. Theoretically there is an advantage in taking the deviations from median because the sum of deviations of items from median is minimum when signs are ignored. However, in practice, arithmetic mean is more frequently used in calculating the value of average deviation and this is the reason why it is more commonly called mean deviation. In any case, the average used must be clearly stated in a given problem so that any possible confusion in meaning is avoided.

Computation of Mean Deviation-Individual Observations.* If $X_1, X_2, X_3, \ldots, X_N$ are N given observations then the deviation about an average A is given by

$$\text{M.D.} = \frac{\sum |X - A|}{N} \quad \text{or} \quad \frac{\sum |D|}{N}$$

where $|D| = |X - A|$. Read as mod $(X - A)$, it is the modulus value or absolute value of the deviation ignoring plus and minus signs.

Steps.
(i) Compute the median of the series.
(ii) Take the deviations of items from median ignoring ± signs and denote these deviations by $|D|$.
(iii) Obtain the total of these deviations, i.e., $\sum |D|$.
(iv) Divide the total obtained in step (iii) by the number of observations.
If a distribution is normal, the mean ± mean deviation is the range that will include 57.5 per cent of the items in the series. If it is moderately skewed, then we may expect approximately 57.5 per cent of the items to fall within this range. Hence, if average deviation is small, the distribution is highly compact or uniform, since more than half of the cases are concentrated within a small range around the mean.

*If the mean deviation is computed from mean then in that case $|D|$ shall denote deviations of the items from mean, ignoring signs.
STATISTICAL METHODS

The relative measure corresponding to the mean deviation, called the coefficient of mean deviation, is obtained by dividing mean deviation by the particular average used in computing mean deviation. Thus, if mean deviation has been computed from median, the coefficient of mean deviation shall be obtained by dividing mean deviation by median.

$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Median}}$$

If mean has been used while calculating the value of mean deviation, in such a case coefficient of mean deviation shall be obtained by dividing mean deviation by the mean.
Illustration 6. Calculate the mean deviation and its coefficient of the two income groups of five and seven members given below:

Group I (Rs.)    14,000  14,800  15,200  16,000  18,800
Group II (Rs.)   15,500  16,000  16,200  17,000  17,500  18,000  19,000

Solution. CALCULATION OF MEAN DEVIATION

            Group I                               Group II
Income (Rs.)   Deviation from median     Income (Rs.)   Deviation from median
               15,200, |D|                              17,000, |D|
14,000         1,200                     15,500         1,500
14,800           400                     16,000         1,000
15,200             0                     16,200           800
16,000           800                     17,000             0
18,800         3,600                     17,500           500
                                         18,000         1,000
                                         19,000         2,000
N = 5          Σ|D| = 6,000              N = 7          Σ|D| = 6,800
Mean Deviation: Group I

$$\text{M.D.} = \frac{\sum |D|}{N}$$

$|D|$ = Deviation from median ignoring signs.
Median = Size of $\frac{N+1}{2}$th item = $\frac{5+1}{2}$ = 3rd item
Size of 3rd item is 15,200.

$$\text{M.D.} = \frac{6{,}000}{5} = 1{,}200$$

This means that the average deviation of the individual incomes from the median income is Rs. 1,200.

Mean Deviation: Group II

Median = Size of $\frac{N+1}{2}$th item = $\frac{7+1}{2}$ = 4th item
Size of 4th item is 17,000.
$\sum |D| = 6{,}800$, N = 7.

$$\text{M.D.} = \frac{6{,}800}{7} = 971.43$$

Note. To compute the coefficient of mean deviation we divide the mean deviation by the median. Thus for the first group:

$$\text{Coefficient of M.D.} = \frac{1{,}200}{15{,}200} = 0.079$$

and for the second group:

$$\text{Coefficient of M.D.} = \frac{971.43}{17{,}000} = 0.057.$$
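The steps for individual observations can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `mean_deviation` and `coefficient_of_md` are assumptions made here, not names from the text, and the median is used as the average unless another one is passed in.

```python
from statistics import median

def mean_deviation(values, center=None):
    """Average absolute deviation of the items about `center`.

    The median is used when no average is supplied, as in the steps above.
    """
    if center is None:
        center = median(values)
    return sum(abs(x - center) for x in values) / len(values)

def coefficient_of_md(values):
    """Mean deviation about the median divided by the median."""
    med = median(values)
    return mean_deviation(values, med) / med

# Income data of Illustration 6 (Rs.)
group_1 = [14000, 14800, 15200, 16000, 18800]
group_2 = [15500, 16000, 16200, 17000, 17500, 18000, 19000]

print(mean_deviation(group_1))               # 1200.0
print(round(mean_deviation(group_2), 2))     # 971.43
print(round(coefficient_of_md(group_1), 3))  # 0.079
print(round(coefficient_of_md(group_2), 3))  # 0.057
```

Passing `center=sum(values) / len(values)` gives the mean deviation about the arithmetic mean instead.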

Calculation of Mean Deviation-Discrete Series. In discrete series the formula for calculating mean deviation is:

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

(by the same logic as given before) $|D|$ denotes deviation from median ignoring signs.

Steps.
(i) Calculate the median of the series.
(ii) Take the deviations of the items from median ignoring signs and denote them by $|D|$.
(iii) Multiply these deviations by the respective frequencies and obtain the total $\sum f|D|$.
(iv) Divide the total obtained in step (iii) by the number of observations. This gives us the value of mean deviation.
Illustration 7. (a) Calculate mean deviation from the following series:

X    10   11   12   13   14
f     3   12   18   12    3
(B.A. (Hons.) Econ., Madras Univ., 2009)

Solution. CALCULATION OF MEAN DEVIATION

X      f     |D|    f|D|    c.f.
10     3     2      6       3
11     12    1      12      15
12     18    0      0       33
13     12    1      12      45
14     3     2      6       48
       N = 48       Σf|D| = 36

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

Median = Size of $\frac{N+1}{2}$th item = $\frac{48+1}{2}$ = 24.5th item
Size of 24.5th item is 12, hence Median = 12.

$$\text{M.D.} = \frac{36}{48} = 0.75.$$
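For a discrete series the same calculation can be sketched as below. The helper names are assumptions made for illustration. One simplification is labelled in the code: when the (N + 1)/2-th position falls exactly between two different values, this sketch returns the upper one rather than averaging them.

```python
def discrete_median(values, freqs):
    """Size of the (N + 1)/2-th item, located through cumulative frequencies.

    Simplification: if the position falls between two different values,
    the upper value is returned instead of their average.
    """
    position = (sum(freqs) + 1) / 2
    cumulative = 0
    for x, f in sorted(zip(values, freqs)):
        cumulative += f
        if cumulative >= position:
            return x

def discrete_mean_deviation(values, freqs):
    """Sum of f|D| divided by N, with deviations taken from the median."""
    med = discrete_median(values, freqs)
    total = sum(f * abs(x - med) for x, f in zip(values, freqs))
    return total / sum(freqs)

# Series of Illustration 7(a)
X = [10, 11, 12, 13, 14]
f = [3, 12, 18, 12, 3]

print(discrete_median(X, f))          # 12
print(discrete_mean_deviation(X, f))  # 0.75
```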
(b) Calculate the mean deviation from the mean for the following data:

Size         4    6    8    10   12   14   16
Frequency    6    2    5    3    2    1    1
(B.Com. (H), Madras Univ., 2009)

Solution. CALCULATION OF MEAN DEVIATION FROM MEAN

X      f     fX     |D| = |X - 8|    f|D|
4      6     24     4                24
6      2     12     2                4
8      5     40     0                0
10     3     30     2                6
12     2     24     4                8
14     1     14     6                6
16     1     16     8                8
       N = 20   ΣfX = 160            Σf|D| = 56

$$\bar{X} = \frac{\sum fX}{N} = \frac{160}{20} = 8$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{56}{20} = 2.8.$$
Calculation of Mean Deviation-Continuous Series

For calculating mean deviation in continuous series the procedure remains the same as discussed above. The only difference is that here we have to obtain the mid-point of the various classes and take deviations of these points from median. The formula is same, i.e.,

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

Illustration 8. (a) Find the median and mean deviation of the following data:

Size     Frequency     Size     Frequency
0-10     7             40-50    16
10-20    12            50-60    14
20-30    18            60-70    8
30-40    25
(B.Com., Mysore Univ., 2004)

Solution. CALCULATION OF MEDIAN AND MEAN DEVIATION

Size     f     c.f.    m.p. (m)    |D| = |m - 35.2|    f|D|
0-10     7     7       5           30.2                211.4
10-20    12    19      15          20.2                242.4
20-30    18    37      25          10.2                183.6
30-40    25    62      35          0.2                 5.0
40-50    16    78      45          9.8                 156.8
50-60    14    92      55          19.8                277.2
60-70    8     100     65          29.8                238.4
         N = 100                                       Σf|D| = 1314.8

Med. = Size of $\frac{N}{2}$th item = $\frac{100}{2}$ = 50th item
Median lies in the class 30-40.

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

L = 30, N/2 = 50, c.f. = 37, f = 25, i = 10

$$\text{Med.} = 30 + \frac{50 - 37}{25} \times 10 = 30 + 5.2 = 35.2$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{1314.8}{100} = 13.148$$
(b) Calculate the mean deviation and its coefficient from the following data:

Class    Frequency     Class    Frequency
0-10     5             40-50    20
10-20    8             50-60    14
20-30    12            60-70    12
30-40    15            70-80    6
(B.Com., Andhra Univ., 2005)

Since nothing is specified, we will calculate mean deviation from median.

Solution. CALCULATION OF MEAN DEVIATION

Class    Frequency (f)    c.f.    m.p. (m)    |D| = |m - 43|    f|D|
0-10     5                5       5           38                190
10-20    8                13      15          28                224
20-30    12               25      25          18                216
30-40    15               40      35          8                 120
40-50    20               60      45          2                 40
50-60    14               74      55          12                168
60-70    12               86      65          22                264
70-80    6                92      75          32                192
         N = 92                                                 Σf|D| = 1414

Med. = Size of $\frac{N}{2}$th item = $\frac{92}{2}$ = 46th item
Median lies in the class 40-50.

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

L = 40, N/2 = 46, c.f. = 40, f = 20, i = 10

$$\text{Med.} = 40 + \frac{46 - 40}{20} \times 10 = 40 + 3 = 43$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{1414}{92} = 15.37$$

$$\text{Coeff. of M.D.} = \frac{\text{M.D.}}{\text{Median}} = \frac{15.37}{43} = 0.357.$$
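The continuous-series procedure (interpolated median, mid-points, then Σf|D|/N) can be sketched as follows. The function names and the `(lower, upper)` tuple layout for classes are assumptions chosen for this example.

```python
def grouped_median(classes, freqs):
    """Med. = L + (N/2 - c.f.) / f * i, interpolated inside the median class."""
    half = sum(freqs) / 2
    cumulative = 0  # c.f. of the classes preceding the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f >= half:
            return lower + (half - cumulative) * (upper - lower) / f
        cumulative += f

def grouped_mean_deviation(classes, freqs):
    """Mean deviation of class mid-points about the interpolated median."""
    med = grouped_median(classes, freqs)
    total = sum(f * abs((lo + hi) / 2 - med)
                for (lo, hi), f in zip(classes, freqs))
    return total / sum(freqs), med

# Data of Illustration 8(b)
classes = [(0, 10), (10, 20), (20, 30), (30, 40),
           (40, 50), (50, 60), (60, 70), (70, 80)]
freqs = [5, 8, 12, 15, 20, 14, 12, 6]

md, med = grouped_mean_deviation(classes, freqs)
print(med)                 # 43.0
print(round(md, 2))        # 15.37
print(round(md / med, 3))  # 0.357
```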

Merits and Limitations

Merits. The outstanding advantage of the average deviation is its relative simplicity. It is simple to understand and easy to compute. Anyone familiar with the concept of the average can readily appreciate the meaning of the average deviation. If a situation requires a measure of dispersion that will be presented to the general public or any group not very familiar with statistics, the average deviation is useful.

It is based on each and every item of the data. Consequently change in the value of any item would change the value of mean deviation.

Mean deviation is less affected by the values of extreme items than the standard deviation.

Since deviations are taken from a central value, comparison about formation of different distributions can easily be made.

Limitations. The greatest drawback of this method is that algebraic signs are ignored while taking the deviations of the items. For example, if from twenty, fifty is deducted we write 30 and not -30. This is mathematically wrong and makes the method non-algebraic. If the signs of the deviations are not ignored, the net sum of deviations will be zero if the reference point is the mean, or approximately zero if the reference point is median.

This method may not give us very accurate results. The reason is that mean deviation gives us best results when deviations are taken from median. But median is not a satisfactory measure when the degree of variability in a series is very high. And if we compute mean deviation from mean that is also not desirable because the sum of the deviations from mean (ignoring signs) is greater than the sum of the deviations from median (ignoring signs). If mean deviation is computed from mode that is also not scientific because the value of mode cannot always be determined.

It is not capable of further algebraic treatment.

It is rarely used in sociological studies.

Because of these limitations its use is limited and it is overshadowed as a measure of variation by the superior standard deviation.

*A distinction is often made between population standard deviation and sample standard deviation. Population standard deviation is denoted by σ whereas sample standard deviation by S.

Usefulness. The serious drawbacks of the average deviation should not blind us to its practical utility, because of its simplicity in meaning and computation. It is especially effective in reports presented to the general public or to groups not familiar with statistical methods. This measure is useful for small samples with no elaborate analysis required. Incidentally, it may be mentioned that the National Bureau of Economic Research has found, in its work on forecasting business cycles, that the average deviation is the most practical measure of dispersion to use for this purpose.

The Standard Deviation

The standard deviation concept was introduced by Karl Pearson in 1893. It is by far the most important and widely used measure of studying dispersion. Its significance lies in the fact that it is free from those defects from which the earlier methods suffer and satisfies most of the properties of a good measure of dispersion. Standard deviation is also known as root mean square deviation for the reason that it is the square root of the mean of the squared deviations from the arithmetic mean. Standard deviation is denoted by the small Greek letter σ (read as sigma).
The standard deviation measures the absolute dispersion (or variability) of a distribution; the greater the amount of dispersion or variability, the greater the standard deviation, for the greater will be the magnitude of the deviations of the values from their mean. A small standard deviation means a high degree of uniformity of the observations as well as homogeneity of a series; a large standard deviation means just the opposite. Thus, if we have two or more comparable series with identical or nearly identical means, it is the distribution with the smallest standard deviation that has the most representative mean. Hence standard deviation is extremely useful in judging the representativeness of the mean.
Difference between Mean Deviation and Standard Deviation

Both these measures of dispersion are based on each and every item of the distribution. But they differ in the following respects:

(i) Algebraic signs are ignored while calculating mean deviation whereas in the calculation of standard deviation signs are taken into account.
(ii) Mean deviation can be computed either from median or mean. The standard deviation, on the other hand, is always computed from the arithmetic mean because the sum of squares of the deviations of items from arithmetic mean is the least.
Calculation of Standard Deviation-Individual Observations

In case of individual observations standard deviation may be computed by applying any of the following two methods:
1. By taking deviations of the items from the actual mean.
2. By taking deviations of the items from an assumed mean.
Deviations taken from Actual Mean. When deviations are taken from actual mean the following formula is applied:

$$\sigma = \sqrt{\frac{\sum x^2}{N}}$$

where $x = (X - \bar{X})$.

Steps
(i) Calculate the actual mean of the series, i.e., $\bar{X}$.
(ii) Take the deviations of the items from the mean, i.e., find $(X - \bar{X})$. Denote these deviations by x.
(iii) Square these deviations and obtain the total $\sum x^2$.
(iv) Divide $\sum x^2$ by the total number of observations, i.e., N, and extract the square-root. This gives us the value of standard deviation.
actual i11ran is in
When the
Mean.
UIAlions taken from Assuned to take deviations
fract 123.674 it would
be too c u m b e r s o m e
it is either
OS. Say, these devialions. In Such a c a s e
rom
a n d then obtain squares of De taken from an
the or clsethe deviations
nean may be approximated the vialue ot the
assu n e d mean and the necessary adjustnent 111ade in
STATISTICAL
METHODS
288 a p p r o x i m a t i o n
is to>Ss rate
ol
Inelhod
are taken from
ar
former
assumed
d e v i a t i o n s

The case
deviation. a
standard
invariably
in sucl
followina c
the
and.

mean
thentore.

taken
from
assumed
mean
rmula is
deviations are
When

applied = 1 d _ / 2 d 2

o VN N

Steps.
(i) Take the deviations of the items from an assumed mean, i.e., obtain $(X - A)$. Denote these deviations by d. Take the total of these deviations, i.e., obtain $\sum d$.
(ii) Square these deviations and obtain the total $\sum d^2$.
(iii) Substitute the values of $\sum d^2$, $\sum d$ and N in the above formula.
Illustration 9. Blood serum cholesterol levels of 10 persons are as under:
240, 260, 290, 245, 255, 288, 272, 263, 277, 251.
Calculate standard deviation with the help of assumed mean. (MBA, HPTU, 2014)

Solution. CALCULATION OF STANDARD DEVIATION BY THE ASSUMED MEAN METHOD

X        d = (X - 264)*    d²
240      -24               576
260      -4                16
290      +26               676
245      -19               361
255      -9                81
288      +24               576
272      +8                64
263      -1                1
277      +13               169
251      -13               169
ΣX = 2641    Σd = +1       Σd² = 2689

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$

$\sum d^2 = 2689$, $\sum d = +1$, N = 10

$$\sigma = \sqrt{\frac{2689}{10} - \left(\frac{1}{10}\right)^2} = \sqrt{268.9 - 0.01} = 16.398.$$

*The assumed mean should be taken as near to the actual mean as possible to minimise calculations. In this case the actual mean is 264.1 and we have taken 264 as assumed mean.

Illustration 10. Calculate the standard deviation from the following observations:
240.12   240.13   240.15   240.12   240.17
240.15   240.17   240.16   240.22   240.21
Solution. CALCULATION OF STANDARD DEVIATION

X          d = (X - 240)    d²
240.12     +0.12            .0144
240.13     +0.13            .0169
240.15     +0.15            .0225
240.12     +0.12            .0144
240.17     +0.17            .0289
240.15     +0.15            .0225
240.17     +0.17            .0289
240.16     +0.16            .0256
240.22     +0.22            .0484
240.21     +0.21            .0441
N = 10     Σd = +1.60       Σd² = 0.2666

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{0.2666}{10} - \left(\frac{1.6}{10}\right)^2} = \sqrt{0.02666 - 0.0256} = 0.033$$
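The two methods for individual observations give the same answer, which a short Python sketch can confirm. The function names are illustrative assumptions; the formulas are the ones stated above, with divisor N.

```python
from math import sqrt

def sd_actual_mean(values):
    """sigma = sqrt(sum(x^2) / N), with x = X - mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_assumed_mean(values, assumed):
    """sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2), with d = X - A."""
    n = len(values)
    d = [x - assumed for x in values]
    return sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

# Illustration 9: cholesterol levels, assumed mean A = 264
cholesterol = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]
print(round(sd_assumed_mean(cholesterol, 264), 3))  # 16.398

# Illustration 10: assumed mean A = 240
readings = [240.12, 240.13, 240.15, 240.12, 240.17,
            240.15, 240.17, 240.16, 240.22, 240.21]
print(round(sd_assumed_mean(readings, 240), 3))     # 0.033
```

Whatever assumed mean is chosen, the correction term $(\sum d / N)^2$ brings the result back to the value obtained from the actual mean, up to floating-point rounding.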

Calculation of Standard Deviation-Discrete Series. For calculating standard deviation in discrete series, any of the following methods may be applied:
1. Actual mean method.
2. Assumed mean method.
3. Step deviation method.

(a) Actual Mean Method. When this method is applied, deviations are taken from the actual mean, i.e., we find $(X - \bar{X})$ and denote these deviations by x. These deviations are then squared and multiplied by the respective frequencies. The following formula is applied:

$$\sigma = \sqrt{\frac{\sum fx^2}{N}}$$

where $x = (X - \bar{X})$.

However, in practice this method is rarely used because if the actual mean is in fractions the calculations take a lot of time.
(b) Assumed Mean Method. When this method is used, the following formula is applied:

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$

where $d = (X - A)$.

Steps.
(i) Take the deviations of the items from an assumed mean and denote these deviations by d.
(ii) Multiply these deviations by the respective frequencies and obtain the total, $\sum fd$.
(iii) Obtain the squares of the deviations, i.e., calculate $d^2$.
(iv) Multiply the squared deviations by the respective frequencies, and obtain the total, $\sum fd^2$.
(v) Substitute the values in the above formula.

Illustration 11. Calculate the standard deviation from the data given below:

Size of item    Frequency     Size of item    Frequency
3.5             3             7.5             85
4.5             7             8.5             32
5.5             22            9.5             8
6.5             60
(BBA, GGSIP Un., 206
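Solution. CALCULATION OF STANDARD DEVIATION

Size of item (X)    f      d = (X - 6.5)    fd      fd²
3.5                 3      -3               -9      27
4.5                 7      -2               -14     28
5.5                 22     -1               -22     22
6.5                 60     0                0       0
7.5                 85     +1               +85     85
8.5                 32     +2               +64     128
9.5                 8      +3               +24     72
                    N = 217                 Σfd = +128    Σfd² = 362

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2}$$

$\sum fd^2 = 362$, $\sum fd = 128$, N = 217

$$\sigma = \sqrt{\frac{362}{217} - \left(\frac{128}{217}\right)^2} = \sqrt{1.668 - 0.348} = 1.149.$$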
(c) Step Deviation Method. When this method is used we take deviations of midpoints from an assumed mean and divide these deviations by the width of class interval, i.e., 'i'. In case class intervals are unequal, we divide the deviations of midpoints by the lowest common factor and use 'c' instead of 'i' in the formula for calculating standard deviation. The formula for calculating standard deviation is:

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$$

where $d = \frac{(X - A)}{i}$ and i = class interval.

The use of the above formula simplifies calculations.

Illustration 12. The annual salaries of a group of employees are given in the following table:

Salaries (in Rs. 000)    45   50   55   60   65   70   75   80
Number of persons        3    5    8    7    9    7    4    7

Calculate the standard deviation of the salaries.
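Solution. CALCULATION OF STANDARD DEVIATION

Salaries (Rs. 000)    No. of persons    d = (X - 60)/5    fd      fd²
X                     f
45                    3                 -3                -9      27
50                    5                 -2                -10     20
55                    8                 -1                -8      8
60                    7                 0                 0       0
65                    9                 +1                +9      9
70                    7                 +2                +14     28
75                    4                 +3                +12     36
80                    7                 +4                +28     112
                      N = 50                              Σfd = 36    Σfd² = 240

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$$

Here, $\sum fd^2 = 240$, N = 50, $\sum fd = 36$, i = 5

$$\sigma = \sqrt{\frac{240}{50} - \left(\frac{36}{50}\right)^2} \times 5 = \sqrt{4.8 - 0.5184} \times 5 = 10.35$$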
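The assumed mean and step deviation methods differ only in the divisor applied to the deviations, so one small function can cover both. This is a sketch under that reading; `sd_step_deviation` and its parameter names are assumptions made for this example, and `step=1` reduces it to the plain assumed mean method.

```python
from math import sqrt

def sd_step_deviation(values, freqs, assumed, step=1):
    """sigma = sqrt(sum(f d^2)/N - (sum(f d)/N)^2) * i, with d = (X - A) / i."""
    n = sum(freqs)
    d = [(x - assumed) / step for x in values]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * step

# Illustration 11: A = 6.5, no common factor (step = 1)
sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
size_freqs = [3, 7, 22, 60, 85, 32, 8]
print(round(sd_step_deviation(sizes, size_freqs, 6.5), 3))            # 1.149

# Illustration 12: A = 60, i = 5
salaries = [45, 50, 55, 60, 65, 70, 75, 80]
persons = [3, 5, 8, 7, 9, 7, 4, 7]
print(round(sd_step_deviation(salaries, persons, 60, step=5), 2))     # 10.35
```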
Calculation of Standard Deviation-Continuous Series. In continuous series any of the methods discussed above for discrete frequency distribution can be used. However, in practice it is the step deviation method that is most used. The formula is

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$$

where $d = \frac{(m - A)}{i}$, i = class interval.

Steps.
(i) Find the mid-points of various classes.
(ii) Take the deviations of these mid-points from an assumed mean and denote these deviations by d.
(iii) Wherever possible take a common factor and denote this column by d.
(iv) Multiply the frequencies of each class with these deviations and obtain $\sum fd$.
(v) Square the deviations and multiply them with the respective frequencies of each class and obtain $\sum fd^2$.

Thus the only difference in procedure in case of continuous series is to find mid-points of the various classes.

Illustration 13. Calculate mean, median and standard deviation from the following data:

Profits (Rs. lakh)    0-10   10-20   20-30   30-40   40-50   50-60
No. of companies      12     17      23      39      16      3
(MBA, Univ. of Lucknow, 2009)
Solution. CALCULATION OF MEAN, MEDIAN AND STANDARD DEVIATION

Profits (Rs. lakh)    m.p.    No. of companies (f)    d = (m - 35)/10    fd      fd²     c.f.
0-10                  5       12                      -3                 -36     108     12
10-20                 15      17                      -2                 -34     68      29
20-30                 25      23                      -1                 -23     23      52
30-40                 35      39                      0                  0       0       91
40-50                 45      16                      +1                 +16     16      107
50-60                 55      3                       +2                 +6      12      110
                              N = 110                                    Σfd = -71    Σfd² = 227

Calculation of $\bar{X}$

$$\bar{X} = A + \frac{\sum fd}{N} \times i$$

A = 35, $\sum fd = -71$, N = 110, i = 10

$$\bar{X} = 35 - \frac{71}{110} \times 10 = 35 - 6.45 = 28.55.$$

Calculation of Median

Med. = Size of $\frac{N}{2}$th item = $\frac{110}{2}$ = 55th item
Median lies in the class 30-40.

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

L = 30, N/2 = 55, c.f. = 52, f = 39, i = 10

$$\text{Med.} = 30 + \frac{55 - 52}{39} \times 10 = 30 + 0.77 = 30.77.$$

Calculation of Standard Deviation

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$$

$\sum fd^2 = 227$, N = 110, $\sum fd = -71$, i = 10

$$\sigma = \sqrt{\frac{227}{110} - \left(\frac{-71}{110}\right)^2} \times 10 = \sqrt{2.064 - 0.417} \times 10 = 1.283 \times 10 = 12.83.$$

Hence mean or $\bar{X}$ of the distribution is Rs. 28.55 lakh, median = Rs. 30.77 lakh and standard deviation is 12.83 lakh.
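The three calculations for a continuous series can be gathered into one sketch. The function name `grouped_summary` and the tuple layout are assumptions for illustration, and the code assumes equal class widths, as in the data of Illustration 13.

```python
from math import sqrt

def grouped_summary(classes, freqs, assumed):
    """Return (mean, median, sd) of a continuous series with equal class widths.

    Uses step deviations d = (m - A) / i of the class mid-points m.
    """
    n = sum(freqs)
    width = classes[0][1] - classes[0][0]  # i, assumed equal for all classes
    d = [((lo + hi) / 2 - assumed) / width for lo, hi in classes]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))

    mean = assumed + sum_fd / n * width
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width

    # Median by interpolation inside the class containing the N/2-th item.
    median, cumulative = None, 0
    for (lo, hi), f in zip(classes, freqs):
        if cumulative + f >= n / 2:
            median = lo + (n / 2 - cumulative) * (hi - lo) / f
            break
        cumulative += f
    return mean, median, sd

profit_classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
companies = [12, 17, 23, 39, 16, 3]

mean, median, sd = grouped_summary(profit_classes, companies, assumed=35)
print(round(mean, 2), round(median, 2), round(sd, 2))  # 28.55 30.77 12.83
```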
Illustration 14. Find the standard deviation from the following data:

Age under               10   20   30   40   50    60    70    80
No. of persons dying    15   30   53   75   100   110   115   125
(B.Com., Bangalore Univ.; B.Com., Madras Univ., 2009; MBA, Pune Univ., 2015)

Solution. CALCULATION OF STANDARD DEVIATION

Age      m.p. (m)    f      d = (m - 35)/10    fd      fd²
0-10     5           15     -3                 -45     135
10-20    15          15     -2                 -30     60
20-30    25          23     -1                 -23     23
30-40    35          22     0                  0       0
40-50    45          25     +1                 +25     25
50-60    55          10     +2                 +20     40
60-70    65          5      +3                 +15     45
70-80    75          10     +4                 +40     160
                     N = 125                   Σfd = 2    Σfd² = 488

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{488}{125} - \left(\frac{2}{125}\right)^2} \times 10$$

$$= \sqrt{3.904 - 0.0003} \times 10 = 1.976 \times 10 = 19.76.$$

Illustration 15. Find the standard deviation of the following distribution:

Age               20-25   25-30   30-35   35-40   40-45   45-50
No. of persons    170     110     80      45      40      35
Take assumed average = 32.5. (B.Com., PTU, 2014)

Solution. CALCULATION OF STANDARD DEVIATION

Age      m.p. (m)    No. of persons (f)    d = (m - 32.5)/5    fd       fd²
20-25    22.5        170                   -2                  -340     680
25-30    27.5        110                   -1                  -110     110
30-35    32.5        80                    0                   0        0
35-40    37.5        45                    +1                  +45      45
40-45    42.5        40                    +2                  +80      160
45-50    47.5        35                    +3                  +105     315
                     N = 480                                   Σfd = -220    Σfd² = 1310

Note. When we are asked in the question to take a specified value as assumed mean we should take deviations only from that value.

$$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{1310}{480} - \left(\frac{-220}{480}\right)^2} \times 5$$

$$= \sqrt{2.729 - 0.21} \times 5 = \sqrt{2.519} \times 5 = 1.587 \times 5 = 7.935.$$

Mathematical Properties of Standard Deviation

Standard deviation has some very important mathematical properties which considerably enhance its utility in statistical work.

1. Combined Standard Deviation. Just as it is possible to compute combined mean of two or more than two groups, similarly we can also compute combined standard deviation of two or more groups. Combined standard deviation is denoted by $\sigma_{12}$ and is computed as follows:

$$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

$\sigma_{12}$ = combined standard deviation;
$\sigma_1$ = standard deviation of first group;
$\sigma_2$ = standard deviation of second group;
$d_1 = |\bar{X}_1 - \bar{X}_{12}|$;
$d_2 = |\bar{X}_2 - \bar{X}_{12}|$.

The above formula can be extended to find out the standard deviation of three or more groups. For example, combined standard deviation of three groups would be:

$$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

where $d_1 = |\bar{X}_1 - \bar{X}_{123}|$; $d_2 = |\bar{X}_2 - \bar{X}_{123}|$; $d_3 = |\bar{X}_3 - \bar{X}_{123}|$.
Illustration 16. The following are some of the particulars of the distribution of weight of boys and girls in a class:

               Boys     Girls
Number         100      50
Mean weight    60 kg    45 kg
Variance       9        4

(a) Find the standard deviation of the combined data.
(b) Which of the two distributions is more variable? (B.Com., MD Univ., 2009)

Solution. (a) Combined S.D.

$$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

For finding combined standard deviation, we have to calculate combined mean.

$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{100(60) + 50(45)}{100 + 50} = \frac{6000 + 2250}{150} = 55$$

$N_1 = 100$, $\sigma_1^2 = 9$, $N_2 = 50$, $\sigma_2^2 = 4$,
$d_1 = |\bar{X}_1 - \bar{X}_{12}| = |60 - 55| = 5$,
$d_2 = |\bar{X}_2 - \bar{X}_{12}| = |45 - 55| = 10$.

Substituting the values:

$$\sigma_{12} = \sqrt{\frac{100(9) + 50(4) + 100(5)^2 + 50(10)^2}{100 + 50}} = \sqrt{\frac{900 + 200 + 2500 + 5000}{150}} = \sqrt{\frac{8600}{150}} = 7.57$$

(b) For finding which distribution is more variable compare the coefficient of variation of the two distributions.

$$\text{C.V. (Boys)} = \frac{\sigma}{\bar{X}} \times 100 = \frac{3}{60} \times 100 = 5.00$$

$$\text{C.V. (Girls)} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2}{45} \times 100 = 4.44$$

Since coefficient of variation is more for distribution of weight of boys, this distribution shows greater variability.
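Illustration 17. The number of workers employed, the mean wage (in Rs.) per month and standard deviation (in Rs.) in each section of a factory are given below. Calculate the mean wages and standard deviation of all the workers taken together.

Section    No. of workers employed    Mean wages (in Rs.)    Standard deviation (in Rs.)
A          50                         11130                  600
B          60                         11200                  700
C          90                         11150                  800

Solution.

$$\bar{X}_{123} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3} = \frac{(50 \times 11130) + (60 \times 11200) + (90 \times 11150)}{50 + 60 + 90}$$

$$= \frac{5{,}56{,}500 + 6{,}72{,}000 + 10{,}03{,}500}{200} = \frac{22{,}32{,}000}{200} = \text{Rs. } 11{,}160.$$

Combined standard deviation of three series:

$$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

$d_1 = |\bar{X}_1 - \bar{X}_{123}|$ or $|11130 - 11160| = 30$
$d_2 = |\bar{X}_2 - \bar{X}_{123}|$ or $|11200 - 11160| = 40$
$d_3 = |\bar{X}_3 - \bar{X}_{123}|$ or $|11150 - 11160| = 10$

$$\sigma_{123} = \sqrt{\frac{50(600)^2 + 60(700)^2 + 90(800)^2 + 50(30)^2 + 60(40)^2 + 90(10)^2}{50 + 60 + 90}}$$

$$= \sqrt{\frac{1{,}80{,}00{,}000 + 2{,}94{,}00{,}000 + 5{,}76{,}00{,}000 + 45{,}000 + 96{,}000 + 9{,}000}{200}}$$

$$= \sqrt{\frac{10{,}51{,}50{,}000}{200}} = \sqrt{5{,}25{,}750} = 725.09$$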
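The combined standard deviation formula extends to any number of groups, which the following sketch makes explicit. The function name `combined_mean_sd` is an assumption for illustration; it takes the group sizes, means and standard deviations and applies the formula above term by term.

```python
from math import sqrt

def combined_mean_sd(sizes, means, sds):
    """Combined mean and combined standard deviation of several groups.

    Each group contributes N_k * (sigma_k^2 + d_k^2), where d_k is the
    distance of the group mean from the combined mean.
    """
    n = sum(sizes)
    mean = sum(k * m for k, m in zip(sizes, means)) / n
    total = sum(k * (s ** 2 + (m - mean) ** 2)
                for k, m, s in zip(sizes, means, sds))
    return mean, sqrt(total / n)

# Illustration 16: variances 9 and 4 give standard deviations 3 and 2
mean_w, sd_w = combined_mean_sd([100, 50], [60, 45], [3, 2])
print(mean_w, round(sd_w, 2))   # 55.0 7.57

# Illustration 17: three factory sections
mean_f, sd_f = combined_mean_sd([50, 60, 90], [11130, 11200, 11150],
                                [600, 700, 800])
print(mean_f, round(sd_f, 2))   # 11160.0 725.09
```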
2. Standard Deviation of n Natural Numbers.* The standard deviation of the first n natural numbers can be obtained by the following formula:

$$\sigma = \sqrt{\frac{1}{12}(N^2 - 1)}$$

Thus the standard deviation of natural numbers 1 to 10 will be

$$\sigma = \sqrt{\frac{1}{12}(10^2 - 1)} = \sqrt{\frac{1}{12} \times 99} = \sqrt{8.25} = 2.87.$$

Note. The answer would be the same when direct method of calculating standard deviation is used. But this holds good only for natural numbers.

3. The Sum of the Squares of the Deviations of Items in the Series from their Arithmetic Mean is Minimum. In other words, the sum of the squares of the deviations of items of any series from a value other than the arithmetic mean would always be greater. This is the reason why standard deviation is always computed from the arithmetic mean.

4. The Standard Deviation Enables us to Determine, with a Great Deal of Accuracy, where the Values of a Frequency Distribution are Located. With the help of Tchebycheff's theorem given by mathematician P.L. Tchebycheff (1821-1894), no matter what the shape of the distribution is, at least 75 per cent of the values will fall within ±2 standard deviations from the mean of the distribution, and at least 89 per cent of the values will be within ±3 standard deviations from the mean. With the help of the normal curve we can measure with even greater precision the number of items that fall within specific ranges.

For a symmetrical (normal) distribution, the following relationships hold good:
Mean ± 1σ covers 68.27% of the items.
Mean ± 2σ covers 95.45% of the items.
Mean ± 3σ covers 99.73% of the items.

This can be illustrated by the following diagram:

[Figure: Distribution of the Items in Terms of Mean and Standard Deviation. Normal curve with the horizontal axis marked from $\bar{X} - 3\sigma$ to $\bar{X} + 3\sigma$; 68.27% of the items lie within ±1σ, 95.45% within ±2σ and 99.73% within ±3σ of the mean.]
Relation between Measures of Dispersion

In a normal distribution there is a fixed relationship between the three most commonly used measures of dispersion. The quartile deviation is smallest, the mean deviation next and the standard deviation is largest, in the following proportion:

$$\text{Q.D.} = \frac{2}{3}\sigma \quad \text{and} \quad \text{M.D.} = \frac{4}{5}\sigma$$

or

$$\sigma = \frac{3}{2}\,\text{Q.D.} \quad \text{or} \quad \sigma = \frac{5}{4}\,\text{M.D.}$$

These relationships can be easily memorized because of the sequence 2, 3, 4, 5. The same proportions tend to hold true for many distributions that are not quite normal. They are useful in estimating one measure of dispersion when another is known, or in checking roughly the accuracy of a calculated value. If the computed σ differs very widely from its value estimated from Q.D. or M.D. either an error has been made or the distribution differs considerably from normal.

*By natural numbers we mean positive integers.

Another comparison may be made of the proportion of items that are typically included within the range of one Q.D., M.D. or S.D. measured both above and below the mean. In a normal distribution,
$\bar{X}$ ± Q.D. includes 50 per cent of the items.
$\bar{X}$ ± M.D. includes about 57.5 per cent of the items.
$\bar{X}$ ± σ includes 68.27 per cent or about two-thirds of items.
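The proportions 2/3 and 4/5 are rounded values. As a hedged check, the sketch below computes the exact normal-curve ratios, using two standard results that the text does not state: for a normal distribution the quartile deviation is the 75th-percentile z-value times σ, and the mean deviation is $\sqrt{2/\pi}\,\sigma$. The variable and function names are assumptions for illustration.

```python
from math import pi, sqrt
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Ratios Q.D./sigma and M.D./sigma for a normal distribution
qd_ratio = std_normal.inv_cdf(0.75)  # upper quartile of the standard normal
md_ratio = sqrt(2 / pi)              # mean absolute deviation of the standard normal

def coverage(k):
    """Proportion of items within mean +/- k sigma under the normal curve."""
    return 2 * std_normal.cdf(k) - 1

print(round(qd_ratio, 4))                  # 0.6745, close to 2/3
print(round(md_ratio, 4))                  # 0.7979, close to 4/5
print(round(coverage(qd_ratio) * 100, 1))  # 50.0
print(round(coverage(md_ratio) * 100, 1))  # 57.5
print(round(coverage(1) * 100, 2))         # 68.27
```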
Illustration 18. The following table gives the length of life of 400 radio tubes:

Length of life (hours)    No. of Radio tubes    Length of life (hours)    No. of Radio tubes
1,000-1,199               12                    2,000-2,199               55
1,200-1,399               30                    2,200-2,399               36
1,400-1,599               65                    2,400-2,599               25
1,600-1,799               78                    2,600-2,799               9
1,800-1,999               90

Calculate (i) the average length of life of a radio tube, (ii) the standard deviation of the length of life of a tube, and (iii) the percentage number of tubes where length of life of a tube falls within $\bar{X} \pm 2\sigma$. (MBA, Annamalai Univ., 2015)

Solution. CALCULATION OF MEAN AND STANDARD DEVIATION

Length of life (hours)    m.p. (m)    No. of Radio tubes (f)    d = (m - 1899.5)/200    fd       fd²
1000-1199                 1099.5      12                        -4                      -48      192
1200-1399                 1299.5      30                        -3                      -90      270
1400-1599                 1499.5      65                        -2                      -130     260
1600-1799                 1699.5      78                        -1                      -78      78
1800-1999                 1899.5      90                        0                       0        0
2000-2199                 2099.5      55                        +1                      +55      55
2200-2399                 2299.5      36                        +2                      +72      144
2400-2599                 2499.5      25                        +3                      +75      225
2600-2799                 2699.5      9                         +4                      +36      144
                                      N = 400                                           Σfd = -108    Σfd² = 1368

(i) $$\bar{X} = A + \frac{\sum fd}{N} \times i = 1899.5 - \frac{108}{400} \times 200 = 1899.5 - 54 = 1845.5$$

(ii) $$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{1368}{400} - \left(\frac{-108}{400}\right)^2} \times 200$$

$$= \sqrt{3.42 - 0.073} \times 200 = \sqrt{3.347} \times 200 = 1.829 \times 200 = 365.8$$

(iii) $\bar{X} \pm 2\sigma = 1845.5 \pm 2(365.8)$ = 1113.9 to 2577.1. For this we have to determine the percentage of tubes lying between 1113.9 and 2577.1. We make an assumption that the items are equally distributed within each class. In the class 1000-1199 there are 12 frequencies. At 1113.9 or 1114 there would be $\frac{12}{200} \times 114 = 6.84$ frequencies. Frequencies between 1114 and 1199 are 12 - 6.84 = 5.16. Between 2400 and 2599, there are 25 frequencies. At 2577.1 frequencies would be $\frac{25}{200} \times 177 = 22.1$.

Thus the total of frequencies between 1113.9 and 2577.1 would be 5.16 + 30 + 65 + 78 + 90 + 55 + 36 + 22.1 = 381.26 or $\frac{381.26}{400} \times 100 = 95.32$ per cent.
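Part (iii) is a linear interpolation inside the two end classes, which can be sketched as below. The names are assumptions for illustration. Following the worked solution, mid-points are taken as 1099.5, 1299.5, and so on, while interpolation treats each class as running 200 hours from its lower limit. Because the code does not round σ part-way through, its percentage differs from 95.32 in the second decimal place.

```python
from math import sqrt

lowers = [1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400, 2600]
tubes = [12, 30, 65, 78, 90, 55, 36, 25, 9]
WIDTH = 200

def grouped_mean_sd(lowers, freqs, width):
    """Mean and standard deviation from class mid-points (1099.5, 1299.5, ...)."""
    n = sum(freqs)
    mids = [lo + (width - 1) / 2 for lo in lowers]
    mean = sum(f * m for f, m in zip(freqs, mids)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(freqs, mids)) / n
    return mean, sqrt(variance)

def count_below(x, lowers, freqs, width):
    """Number of items below x, assuming items are spread evenly in each class."""
    total = 0.0
    for lower, f in zip(lowers, freqs):
        if x >= lower + width:
            total += f                          # whole class lies below x
        elif x > lower:
            total += f * (x - lower) / width    # proportional share of the class
    return total

mean, sd = grouped_mean_sd(lowers, tubes, WIDTH)
low, high = mean - 2 * sd, mean + 2 * sd
inside = count_below(high, lowers, tubes, WIDTH) - count_below(low, lowers, tubes, WIDTH)
percentage = inside / sum(tubes) * 100

print(round(mean, 1))        # 1845.5
print(round(sd, 1))          # 365.9
print(round(percentage, 1))  # 95.3
```

The result is close to the 95.45 per cent expected within ±2σ for a normal distribution.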

Coefficient of Variation

The standard deviation discussed above is an absolute measure of dispersion. The corresponding relative measure is known as the coefficient of variation. This measure developed by Karl Pearson is the most commonly used measure of relative variation. It is used in such problems where we want to compare the variability of two or more than two series. That series (or group) for which the coefficient of variation is greater is said to be more variable or conversely less consistent, less uniform, less stable or less homogeneous. On the other hand, the series for which coefficient of variation is less is said to be less variable or more consistent, more uniform, more stable or more homogeneous. Coefficient of variation is denoted by C.V. and is obtained as follows:

$$\text{Coefficient of variation or C.V.} = \frac{\sigma}{\bar{X}} \times 100$$

It may be pointed out that although any measure of dispersion can be used in conjunction with any average in computing relative dispersion, statisticians, in fact, almost always use the standard deviation as the measure of dispersion and the arithmetic mean as the average. When the relative dispersion is stated in terms of the arithmetic mean and the standard deviation, the resulting percentage is known as the coefficient of variation or coefficient of variability.
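As a small illustrative sketch, the coefficient of variation can be computed either from a known mean and standard deviation or from raw observations. The function names are assumptions; the summary figures are the boys' and girls' weights of Illustration 16, and the raw-data helper uses the population standard deviation (divisor N), matching the formulas of this chapter.

```python
from statistics import mean, pstdev

def coefficient_of_variation(sd, average):
    """C.V. = sigma / mean * 100, the standard deviation as a percentage of the mean."""
    return sd * 100 / average

def cv_from_data(values):
    """C.V. of raw observations, using the population standard deviation."""
    return coefficient_of_variation(pstdev(values), mean(values))

# Illustration 16: boys (mean 60 kg, sigma 3), girls (mean 45 kg, sigma 2)
cv_boys = coefficient_of_variation(3, 60)
cv_girls = coefficient_of_variation(2, 45)
print(round(cv_boys, 2), round(cv_girls, 2))  # 5.0 4.44

# The series with the larger C.V. is the more variable one.
print("boys" if cv_boys > cv_girls else "girls")  # boys
```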
Illustration 19. The following table shows the monthly expenditures of 80 students of a University on morning breakfast:

Expenditure (in Rs.)    No. of Students    Expenditure (in Rs.)    No. of Students
780-820                 2                  530-570                 13
730-770                 6                  480-520                 9
680-720                 7                  430-470                 7
630-670                 12                 380-420                 4
580-620                 18                 330-370                 2

Calculate arithmetic mean, standard deviation and coefficient of variation of the above data.

Solution. CALCULATION OF $\bar{X}$, S.D. and C.V.

Expenditure (Rs.)    m.p. (m)    f      d = (m - 600)/50    fd      fd²
780-820              800         2      +4                  +8      32
730-770              750         6      +3                  +18     54
680-720              700         7      +2                  +14     28
630-670              650         12     +1                  +12     12
580-620              600         18     0                   0       0
530-570              550         13     -1                  -13     13
480-520              500         9      -2                  -18     36
430-470              450         7      -3                  -21     63
380-420              400         4      -4                  -16     64
330-370              350         2      -5                  -10     50
                                 N = 80                     Σfd = -26    Σfd² = 352

$$\text{Mean: } \bar{X} = A + \frac{\sum fd}{N} \times i = 600 - \frac{26}{80} \times 50 = 600 - 16.25 = 583.75$$

$$\text{S.D.: } \sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{352}{80} - \left(\frac{-26}{80}\right)^2} \times 50$$

$$= \sqrt{4.4 - 0.106} \times 50 = 2.072 \times 50 = 103.6$$

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{103.6}{583.75} \times 100 = 17.75\%.$$
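Illustration 20. From the prices of shares of X and Y below find out which is more stable in value:

X    35    54    52    53    56    58    52    50    51    49
Y    108   107   105   105   106   107   104   103   104   101
(B.Com., Bangalore Univ., 2005)

Solution. In order to find out which shares are more stable, we have to compare coefficient of variations.

CALCULATION OF COEFFICIENT OF VARIATION

X      x = (X - X̄)    x²      Y      y = (Y - Ȳ)    y²
35     -16            256     108    +3             9
54     +3             9       107    +2             4
52     +1             1       105    0              0
53     +2             4       105    0              0
56     +5             25      106    +1             1
58     +7             49      107    +2             4
52     +1             1       104    -1             1
50     -1             1       103    -2             4
51     0              0       104    -1             1
49     -2             4       101    -4             16
ΣX = 510   Σx = 0     Σx² = 350      ΣY = 1050   Σy = 0    Σy² = 40

Coefficient of Variation X:

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100; \quad \bar{X} = \frac{\sum X}{N} = \frac{510}{10} = 51; \quad \sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{350}{10}} = 5.916$$

$$\text{C.V.} = \frac{5.916}{51} \times 100 = 11.6$$

Coefficient of Variation Y:

$$\text{C.V.} = \frac{\sigma}{\bar{Y}} \times 100; \quad \bar{Y} = \frac{\sum Y}{N} = \frac{1050}{10} = 105; \quad \sigma = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{40}{10}} = 2$$

$$\text{C.V.} = \frac{2}{105} \times 100 = 1.905$$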
Since coefficient of variation is much less in case of shares Y, hence they are more stable in value.

Illustration 21. Goals scored by two teams in a football match were as follows:

No. of Goals Scored         No. of Football Matches Played
in a Football Match         Team 'A'        Team 'B'
0                           15              20
1                           10              10
2                           07              05
3                           05              04
4                           03              02
5                           02              01
Total                       42              42

Calculate coefficient of variation and state which team is more consistent.

Solution. In order to find out which team is more consistent we shall have to compare the coefficient of variation.
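CALCULATION OF COEFFICIENT OF VARIATION

X      x = (X - 7)    x²      Y      y = (Y - 7)    y²
15     +8             64      20     +13            169
10     +3             9       10     +3             9
7      0              0       5      -2             4
5      -2             4       4      -3             9
3      -4             16      2      -5             25
2      -5             25      1      -6             36
ΣX = 42   Σx = 0      Σx² = 118      ΣY = 42   Σy = 0    Σy² = 252

Team A:

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100; \quad \bar{X} = \frac{42}{6} = 7; \quad \sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{118}{6}} = 4.43$$

$$\text{C.V.} = \frac{4.43}{7} \times 100 = 63.29$$

Team B:

$$\text{C.V.} = \frac{\sigma}{\bar{Y}} \times 100; \quad \bar{Y} = \frac{42}{6} = 7; \quad \sigma = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{252}{6}} = 6.48$$

$$\text{C.V.} = \frac{6.48}{7} \times 100 = 92.57$$

Since coefficient of variation is less for Team A, Team A is more consistent.

Illustration 22. Two brands of tyres are tested with the following results:

Life (in '000 miles)    Brand of Tyres
                        X       Y
20-25                   1       0
25-30                   22      24
30-35                   64      76
35-40                   10      0
40-45                   3       0

(a) Which brand of tyres have greater average life?
(b) Compare the variability and state which brand of tyres would you use on your fleet of trucks?

Solution. In order to answer part (a) we have to compare the means and to answer part (b) compare the coefficient of variation.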
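CALCULATION OF MEAN AND COEFFICIENT OF VARIATION (BRAND X)

Life            m.p.     f    (m-32.5)/5     fd     fd²
('000 miles)     m                d
20-25           22.5     1       -2          -2      4
25-30           27.5    22       -1         -22     22
30-35           32.5    64        0           0      0
35-40           37.5    10       +1         +10     10
40-45           42.5     3       +2          +6     12
                      N = 100           Σfd = -8   Σfd² = 48

$\bar{X} = A + \frac{\sum fd}{N} \times i$
A = 32.5, Σfd = -8, N = 100, i = 5
$\bar{X} = 32.5 - \frac{8}{100} \times 5 = 32.5 - 0.4 = 32.1$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} \times 5 = \sqrt{0.48 - 0.0064} \times 5 = 0.6882 \times 5 = 3.441$
$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{3.441}{32.1} \times 100 = 10.72$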
CALCULATION OF MEAN AND COEFFICIENT OF VARIATION (BRAND Y)

Life            m.p.     f    (m-32.5)/5     fd     fd²
('000 miles)     m                d
20-25           22.5     0       -2           0      0
25-30           27.5    24       -1         -24     24
30-35           32.5    76        0           0      0
35-40           37.5     0       +1           0      0
40-45           42.5     0       +2           0      0
                      N = 100           Σfd = -24  Σfd² = 24

$\bar{X} = A + \frac{\sum fd}{N} \times i$
A = 32.5, Σfd = -24, N = 100, i = 5
$\bar{X} = 32.5 - \frac{24}{100} \times 5 = 32.5 - 1.2 = 31.3$
$\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{24}{100} - \left(\frac{-24}{100}\right)^2} \times 5 = \sqrt{0.24 - 0.0576} \times 5 = 0.4271 \times 5 = 2.136$
$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.136}{31.3} \times 100 = 6.824$
(a) Since arithmetic mean is more for brand X of tyres, they have greater average life.
(b) Since coefficient of variation is less for brand Y of tyres, they are more consistent and should be preferred for use.
Illustration 23. An analysis of the weekly wages paid to workers in two firms A and B, belonging to the same industry, gave the following results:

                                                   Firm A         Firm B
Number of wage earners                               550            650
Average weekly wage                               Rs. 1450       Rs. 1400
Standard deviation of the distribution of wages   Rs. √10,000    Rs. √19,600

Answer the following questions with proper justifications:
(a) Which firm, A or B, pays out larger amount as weekly wages?
(b) In which firm, A or B, is there greater variability in individual wages?
(c) What are the measures of (i) average weekly wages and (ii) standard deviation of individual wages of all workers in the two firms taken together?
Solution. (a) In order to find out which firm, A or B, pays larger amount of weekly wages we compare the total wage bill of firm A and firm B.
Firm A: Total wage bill = 550 x 1450 = Rs. 7,97,500
Firm B: Total wage bill = 650 x 1400 = Rs. 9,10,000
Hence firm B pays larger amount as weekly wages.
(b) To determine the firm in which there is greater variability in individual wages, we shall compare the coefficient of variation.
Firm A: $\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{\sqrt{10,000}}{1450} \times 100 = 6.90$ per cent
Firm B: $\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{\sqrt{19,600}}{1400} \times 100 = 10$ per cent.
Since coefficient of variation is more in case of firm B, there is greater variation in the distribution of wages of firm B.
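(c) Combined Mean and Standard Deviation
$\bar{X}_{12} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2}{N_1 + N_2} = \frac{(550 \times 1450) + (650 \times 1400)}{550 + 650} = \frac{7,97,500 + 9,10,000}{1200} = \frac{17,07,500}{1200} = \text{Rs. } 1422.92$
$\sigma_{12} = \sqrt{\frac{N_1 \sigma_1^2 + N_2 \sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$
$d_1 = |\bar{X}_1 - \bar{X}_{12}| = |1450 - 1422.92| = 27.08$
$d_2 = |\bar{X}_2 - \bar{X}_{12}| = |1400 - 1422.92| = 22.92$
$\sigma_{12} = \sqrt{\frac{(550 \times 10,000) + (650 \times 19,600) + 550(27.08)^2 + 650(22.92)^2}{1200}}$
$= \sqrt{\frac{55,00,000 + 1,27,40,000 + 4,03,329.52 + 3,41,462.16}{1200}} = \sqrt{\frac{1,89,84,791.68}{1200}} = 125.78$
Hence combined mean is Rs. 1422.92 and standard deviation Rs. 125.78.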
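The combined mean and standard deviation formulas can be sketched in Python as follows. The function name `combine_groups` is hypothetical; the inputs are the two firms' sizes, means and variances (σ² = 10,000 and 19,600) from the illustration.

```python
from math import sqrt

def combine_groups(n1, mean1, var1, n2, mean2, var2):
    """Return (combined mean, combined standard deviation) of two groups."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1 = mean1 - mean   # deviation of group 1 mean from the combined mean
    d2 = mean2 - mean   # deviation of group 2 mean from the combined mean
    var = (n1 * var1 + n2 * var2 + n1 * d1 ** 2 + n2 * d2 ** 2) / n
    return mean, sqrt(var)

mean12, sd12 = combine_groups(550, 1450, 10_000, 650, 1400, 19_600)
print(round(mean12, 2), round(sd12, 2))
```

Because d1 and d2 are squared, their signs do not matter, which is why the text takes absolute values.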
Variance. The term variance was used to describe the square of the standard deviation by R.A. Fisher in 1918. The concept of variance is highly important in advanced work where it is possible to split the total into several parts, each attributable to one of the factors causing variation in their original series.* Variance is defined as follows:
$\text{Variance} = \frac{\sum (X - \bar{X})^2}{N}$
* For details please refer to chapter on 'Analysis of Variance'.

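Thus, variance is nothing but the square of the standard deviation:
$\text{Variance} = \sigma^2$ or $\sigma = \sqrt{\text{Variance}}$
In a frequency distribution where deviations are taken from assumed mean, variance may directly be computed as follows:
$\text{Variance} = \left\{ \frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2 \right\} \times i^2$
where $d = \frac{X - A}{i}$ and i = class interval.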

Variance and Standard Deviation Compared. Both the variance and the standard deviation are measures of variability in a population. These two measures are closely related as is clear from the formula: Variance = σ². Variance is the average squared deviation from the arithmetic mean and standard deviation is the square root of the variance. In a subsequent chapter the significance of variance analysis will be discussed at length.
The smaller the value of σ² the lesser the variability or greater the uniformity in the population.
Illustration 24. The following table gives the marks obtained by a group of 80 students in an examination. Calculate the variance.

Marks obtained   No. of students   Marks obtained   No. of students
10-14                  2           34-38                 10
14-18                  4           38-42                  8
18-22                  4           42-46                  4
22-26                  8           46-50                  6
26-30                 12           50-54                  2
30-34                 16           54-58                  4
(MSW, Delhi Univ., 2005)

Solution.          CALCULATION OF VARIANCE

Marks      m.p.     f    (m-32)/4     fd     fd²
            m               d
10-14       12      2      -5        -10     50
14-18       16      4      -4        -16     64
18-22       20      4      -3        -12     36
22-26       24      8      -2        -16     32
26-30       28     12      -1        -12     12
30-34       32     16       0          0      0
34-38       36     10      +1        +10     10
38-42       40      8      +2        +16     32
42-46       44      4      +3        +12     36
46-50       48      6      +4        +24     96
50-54       52      2      +5        +10     50
54-58       56      4      +6        +24    144
                 N = 80           Σfd = 30  Σfd² = 562
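The worked solution stops at the table here. As a minimal sketch (not the book's own working), the following Python carries the table totals through the assumed-mean variance formula, with A = 32 and i = 4 as in the table heading, and cross-checks the result against the direct definition of variance.

```python
# Mid-points 12, 16, ..., 56 and the frequencies from the marks table
midpoints = list(range(12, 60, 4))
freqs = [2, 4, 4, 8, 12, 16, 10, 8, 4, 6, 2, 4]
A, i = 32, 4                      # assumed mean and class interval

n = sum(freqs)
d = [(m - A) // i for m in midpoints]          # exact integers here
sum_fd = sum(f * di for f, di in zip(freqs, d))
sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))

# Variance = {sum(fd^2)/N - (sum(fd)/N)^2} x i^2
variance = (sum_fd2 / n - (sum_fd / n) ** 2) * i ** 2

# Cross-check: average squared deviation from the actual mean
mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
direct = sum(f * (m - mean) ** 2 for f, m in zip(freqs, midpoints)) / n

print(n, sum_fd, sum_fd2, round(variance, 2))
```

Both routes give the same figure, which shows that the choice of assumed mean does not affect the variance.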

* For details please refer to chapter on 'Theoretical Distributions'.
INTRODUCTION

In the previous two chapters we have discussed measures of central tendency and variability. However, they do not reveal the entire story. There are two other comparable characteristics called skewness and kurtosis that help us to understand a distribution. Two distributions may have the same mean and standard deviation but may differ widely in their appearance as can be seen from the following:
[Figure: two frequency diagrams side by side, "Symmetrical Distribution" and "Asymmetrical Distribution", each with X̄ = 15 and σ = 6; the horizontal axes run from 5 to 30.]
In both these distributions the value of mean and standard deviation is the
same (X= 15, o =6). But it does not imply that the distributions are alike in
nature. The distribution on the left-hand side is a symmetrical one whereas
the distribution on the right-hand side is asymmetrical or skewed. Measures of
skewness help us to distinguish between different types of distributions. Some
important definitions of skewness are as follows:
1. "When a series is not symmetrical it is said to be asymmetrical or skewed."
- Croxton & Cowden
2. "Skewness refers to the asymmetry or lack of symmetry in the shape of a frequency distribution."
- Morris Hamburg
3. "Measures of skewness tell us the direction and the extent of skewness. In symmetrical distribution the mean, median and mode are identical. The more the mean moves away from the mode, the larger the asymmetry or skewness."
- Simpson & Kafka
4. "A distribution is said to be 'skewed' when the mean and the median fall at different points in the distribution, and the balance (or centre of gravity) is shifted to one side or the other, to left or right."
- Garrett
The analysis of above definitions shows that the term 'skewness' refers to lack of symmetry, i.e., when a distribution is not symmetrical (or is asymmetrical) it is called a skewed distribution. Any measure of skewness indicates the difference between the manner in which items are distributed in a particular distribution compared with a symmetrical (or normal) distribution. If, for example, skewness is positive, the frequencies in the distribution are spread out over a greater range of values on the high-value end of the curve (the right-hand side) than they are on the low-value end. If the curve is normal, the spread will be the same on both sides of the centre point and the mean, median and mode will all have the same value. The concept of skewness gains importance from the fact that statistical theory is often based upon the assumption of the normal distribution. A measure of skewness is, therefore, necessary in order to guard against the consequences of this assumption.
The concept of skewness will be clear from the following three diagrams showing a symmetrical distribution, a positively skewed distribution and a negatively skewed distribution.
1. Symmetrical Distribution. It is clear from the diagram that in a symmetrical distribution the values of mean, median and mode coincide. The spread of the frequencies is the same on both sides of the centre point of the curve.
[Figure: symmetrical curve, X̄ = Med = Mode]
2. Asymmetrical Distribution. A distribution which is not symmetrical is called a skewed distribution and such a distribution could either be positively skewed or negatively skewed as would be clear from the following diagrams.
3. Positively Skewed Distribution. In the positively skewed distribution the value of the mean is maximum and that of mode least; the median lies in between the two as is clear from the following diagram.
[Figure: Positively Skewed Distribution, with Mo, Med and X̄ marked]
4. Negatively Skewed Distribution. The following is the shape of negatively skewed distribution:
[Figure: Negatively Skewed Distribution, with X̄, Med and Mo marked]
In a negatively skewed distribution the value of mode is maximum and that of mean least; the median lies in between the two. In the positively skewed distribution the frequencies are spread out over a greater range of values on the high-value end of the curve (the right-hand side) than they are on the low-value end. In the negatively skewed distribution the position is reversed, i.e., the excess tail is on the left-hand side.
It should be noted that in moderately symmetrical distributions the interval between the mean and the median is approximately one-third of the interval between the mean and the mode. It is this relationship which provides a means of measuring the degree of skewness.
Difference between Dispersion and Skewness

Dispersion is concerned with the amount of variation rather than with its direction. Skewness tells us about the direction of the variation or the departure from symmetry. In fact, measures of skewness are dependent upon the amount of dispersion.
It may be noted that although skewness is an important characteristic for defining the precise pattern of a distribution, it is rarely calculated in business and economic series. Variation is by far the most important characteristic of a distribution.

TESTS OF SKEWNESS

In order to ascertain whether a distribution is skewed or not the following tests may be applied. Skewness is present if:
1. The values of mean, median and mode do not coincide.
2. When the data are plotted on a graph they do not give the normal bell-shaped form, i.e., when cut along a vertical line through the centre the two halves are not equal.
3. The sum of the positive deviations from the median is not equal to the sum of the negative deviations.
4. Quartiles are not equidistant from the median.
5. Frequencies are not equally distributed at points of equal deviation from the mode.
Conversely stated, when skewness is absent, i.e., in case of a symmetrical distribution, the following conditions are satisfied:
1. The values of mean, median and mode coincide.
2. Data when plotted on a graph give the normal bell-shaped form.
3. Sum of the positive deviations from the median is equal to the sum of the negative deviations.
4. Quartiles are equidistant from the median.
5. Frequencies are equally distributed at points of equal deviations from the mode.

MEASURES OF SKEWNESS

Measures of skewness tell us the direction and extent of asymmetry in a series, and permit us to compare two or more series with regard to these. They may either be absolute or relative.

Absolute Measures of Skewness


Skewness can be measured in absolute terms by taking the difference between mean and mode. Symbolically,
$\text{Absolute } Sk = \bar{X} - \text{Mode}$
When skewness is based on quartiles, absolute skewness is given by the formula:
$\text{Absolute } Sk = Q_3 + Q_1 - 2\,\text{Median}$
If the value of mean is greater than mode skewness will be positive, i.e., we get a plus sign in the above formula. Conversely, if the value of mode is greater than mean, we shall get a minus sign meaning thereby that the distribution is negatively skewed.
The reason why the difference between mean and mode can be used to measure skewness is that in a symmetrical distribution the values of mean, median and mode are alike, but the mean moves away from the mode when the observations are asymmetrical. Consequently, the distance between the mean and the mode could be used to measure skewness: the greater this distance, whether positive or negative, the more asymmetrical the distribution. However, such a measure is unsatisfactory on two counts:
1. It would be expressed in the unit of value of the distribution and could, therefore, not be compared with another comparable series expressed in different units.
2. Distributions vary greatly and the difference between, say, the Mean and the Mode in absolute terms might be considerable in one series and small in another, although the frequency curves of two distributions were similarly skewed.
If the absolute differences were expressed in relation to some measure of the spread of values in their respective distributions, the measure would be relative and can be used directly for comparison. This leads us to the discussion of the relative measures of skewness.

Relative Measures of Skewness

There are four important measures of relative skewness, namely,
1. The Karl Pearson's coefficient of skewness.
2. The Bowley's coefficient of skewness.
3. The Kelly's coefficient of skewness.
4. Measure of skewness based on moments.*
These measures of skewness should mainly be used for making comparison between two or more distributions. As a description of one distribution alone the interpretation of a measure of skewness is necessarily vague as "slight skewness", "marked skewness", or "moderate skewness".
A good measure of skewness should have the following three properties. It should:
1. be a pure number in the sense that its value should be independent of the units of the series and also of the degree of variation in the series;
2. have a zero value, when the distribution is symmetrical; and
3. have some meaningful scale of measure so that we could easily interpret the measured value.

* For details please refer to the section 'Moments' in this chapter.
Karl Pearson's Coefficient of Skewness

This method of measuring skewness, also known as Pearsonian Coefficient of Skewness, was suggested by Karl Pearson*, a great British Biometrician and Statistician. It is based upon the difference between mean and mode. This difference is divided by standard deviation to give a relative measure. The formula thus becomes
$Sk_P = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$ ...(i)
$Sk_P$ = Karl Pearson's coefficient of skewness.
There is no limit to this measure in theory and this is a slight drawback. But in practice the value given by this formula is rarely very high and usually lies between ±1.
When a distribution is symmetrical, the values of mean, median and mode coincide and, therefore, the coefficient of skewness will be zero. When a distribution is positively skewed, the coefficient of skewness shall have plus sign and when it is negatively skewed, the coefficient of skewness shall have minus sign. The degree of skewness shall be obtained by the numerical value, say, 0.8 or 0.2 etc. Thus, this formula gives both the direction as well as the extent of skewness.
The above method of measuring skewness cannot be used where mode is ill defined. However, in moderately skewed distribution the averages have the following relationship:
Mode = 3 Median - 2 Mean
and, therefore, if this value of mode is substituted in the above formula we arrive at another formula for finding out skewness,
$Sk_P = \frac{\bar{X} - (3\,\text{Med.} - 2\bar{X})}{\sigma} = \frac{\bar{X} - 3\,\text{Med.} + 2\bar{X}}{\sigma} = \frac{3(\bar{X} - \text{Med.})}{\sigma}$ ...(ii)
Theoretically, the value of this coefficient varies between ±3; however, in practice it is rare that the coefficient of skewness obtained by the above method exceeds ±1.
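The two forms of Pearson's coefficient can be sketched in Python as below. The function names and the sample figures (mean 50, median 48, standard deviation 10) are hypothetical and chosen only to show that formulas (i) and (ii) agree when the mode is taken from the empirical relation Mode = 3 Median - 2 Mean.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Formula (i): (Mean - Mode) / Standard Deviation."""
    return (mean - mode) / sd

def pearson_skewness_median(mean, median, sd):
    """Formula (ii): 3 (Mean - Median) / Standard Deviation."""
    return 3 * (mean - median) / sd

# Hypothetical summary figures, not taken from the text
mean, median, sd = 50, 48, 10
empirical_mode = 3 * median - 2 * mean   # Mode = 3 Median - 2 Mean

sk_mode = pearson_skewness_mode(mean, empirical_mode, sd)
sk_median = pearson_skewness_median(mean, median, sd)
print(empirical_mode, sk_mode, sk_median)
```

When the mode is well defined and read directly from the data, the two formulas need not give identical values, since the empirical relation holds only approximately.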

Bowley's Coefficient of Skewness


An alternative measure of skewness has been proposed by the late Professor Bowley. Bowley's measure is based on quartiles. In a symmetrical distribution first and third quartiles are equidistant from the median as can be seen from the following diagram.
[Figure: symmetrical curve with the median marked at the centre]

* Around 1890, the British Mathematician Karl Pearson (1857-1936) began to publish his work that applied statistics to biological problems of heredity and evolution. Karl Pearson has come to be regarded as the founder of modern statistics.
In a symmetrical distribution the third quartile is the same distance above the median as the first quartile is below it, i.e.,
$Q_3 - \text{Med} = \text{Med} - Q_1$
If the distribution is positively skewed the top 25 per cent of the values will tend to be farther from the median than the bottom 25 per cent, i.e., $Q_3$ will be farther from the median than $Q_1$ is from the median, and the reverse for negative skewness. Hence a possible measure of skewness is
$Sk_B = \frac{(Q_3 - \text{Med.}) - (\text{Med.} - Q_1)}{(Q_3 - \text{Med.}) + (\text{Med.} - Q_1)} = \frac{Q_3 + Q_1 - 2\,\text{Med.}}{Q_3 - Q_1}$
$Sk_B$ = Bowley's Coefficient of skewness.
It must be remembered that the results obtained by these two measures are not to be compared with one another. Especially, the numerical values are not related to one another, since Bowley's measure, because of its computational basis, is limited to values between -1 and +1, while Pearson's measure has no such limits. Not only do the numerical values obtained from these two formulae bear no necessary relationship to one another but, on rare occasions, with unusually shaped distributions, it is possible for them to emerge with opposite signs.
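A minimal Python sketch of Bowley's coefficient follows. The function name `bowley_skewness` and the quartile values used are hypothetical illustrations, not figures from the text.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Hypothetical quartiles: Q3 lies farther from the median than Q1 does
sk_positive = bowley_skewness(q1=20, median=30, q3=50)

# Hypothetical quartiles equidistant from the median
sk_symmetric = bowley_skewness(q1=20, median=35, q3=50)

print(round(sk_positive, 3), sk_symmetric)
```

Since the median always lies between Q1 and Q3, the numerator can never exceed the denominator in size, which is why the result stays within -1 and +1.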
Kelly's Coefficient of Skewness

Bowley's measure discussed above neglects the two extreme quarters of the data. It would be better for a measure to cover the entire data, especially because in measuring skewness we are often interested in the more extreme items. Bowley's measure can be extended by taking any two deciles equidistant from the median or any two percentiles equidistant from the median. Kelly has suggested the following formula for measuring skewness upon the 10th and the 90th percentiles (or the first and ninth deciles):
$Sk_K = \frac{P_{10} + P_{90} - 2\,\text{Med.}}{P_{90} - P_{10}}$, also $Sk_K = \frac{D_1 + D_9 - 2\,\text{Med.}}{D_9 - D_1}$
$Sk_K$ = Kelly's coefficient of skewness.
This measure of skewness has one theoretical attraction if skewness is to be based on percentiles. However, this method is not popular in practice and generally Karl Pearson's method is used.
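Kelly's formula can be tried on raw data with the following Python sketch. The function name `kelly_skewness` and both data sets are hypothetical; `statistics.quantiles` with `method="inclusive"` is one of several conventions for locating deciles and may differ slightly from the hand method used elsewhere in the book.

```python
from statistics import median, quantiles

def kelly_skewness(data):
    """Kelly's coefficient: (D1 + D9 - 2 Median) / (D9 - D1)."""
    deciles = quantiles(data, n=10, method="inclusive")  # nine cut points
    d1, d9 = deciles[0], deciles[-1]
    return (d1 + d9 - 2 * median(data)) / (d9 - d1)

symmetric = list(range(1, 12))                       # 1, 2, ..., 11
right_tailed = [1, 2, 2, 3, 3, 4, 5, 7, 10, 15, 30]  # long upper tail

sk_symmetric = kelly_skewness(symmetric)
sk_right = kelly_skewness(right_tailed)
print(sk_symmetric, round(sk_right, 3))
```

A positive value indicates that the ninth decile lies farther from the median than the first decile does.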
Measure of Skewness Based on the Third Moment

A measure of skewness may be obtained by making use of the third moment about the mean. This would be discussed under the head 'Moments'.
The following illustrations shall explain the application of the above methods.
Illustration 1. Calculate Pearson's coefficient of skewness:
X :  12.5   17.5   22.5   27.5   32.5   37.5   42.5   47.5
f :   28     42     54    108    129     61     45     33
(B.Com., Kerala Univ., 2006)
Solution.          CALCULATION OF COEFFICIENT OF SKEWNESS

  X       f    (X-27.5)/5     fd      fd²
                   d
12.5     28       -3          -84     252
17.5     42       -2          -84     168
22.5     54       -1          -54      54
27.5    108        0            0       0
32.5    129       +1         +129     129
37.5     61       +2         +122     244
42.5     45       +3         +135     405
47.5     33       +4         +132     528
      N = 500            Σfd = 296  Σfd² = 1780

$\text{Coeff. of Sk.} = \frac{\text{Mean} - \text{Mode}}{\sigma}$
Mean: $\bar{X} = A + \frac{\sum fd}{N} \times i$
A = 27.5, Σfd = 296, N = 500, i = 5.
$\bar{X} = 27.5 + \frac{296}{500} \times 5 = 30.46$
Mode: Since the maximum frequency is 129, the corresponding value of X, i.e., 32.5 is the modal value.
S.D.: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$
Σfd² = 1780, N = 500, Σfd = 296, i = 5
$\sigma = \sqrt{\frac{1780}{500} - \left(\frac{296}{500}\right)^2} \times 5 = \sqrt{3.56 - 0.35} \times 5 = 8.96$
$\text{Coeff. of Sk.} = \frac{30.46 - 32.5}{8.96} = \frac{-2.04}{8.96} = -0.228$
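The same result can be reproduced directly from the X and f columns with a few lines of Python. This is a sketch for checking the arithmetic; it works from the actual mean rather than the assumed mean, so no step-deviation is needed.

```python
from math import sqrt

values = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
freqs = [28, 42, 54, 108, 129, 61, 45, 33]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, values)) / n
# For a discrete series the mode is the value with the highest frequency
mode = values[freqs.index(max(freqs))]
sd = sqrt(sum(f * (x - mean) ** 2 for f, x in zip(freqs, values)) / n)
skewness = (mean - mode) / sd

print(round(mean, 2), mode, round(sd, 2), round(skewness, 3))
```

The negative sign confirms that the mean lies below the mode, i.e., the distribution is negatively skewed.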
Illustration 2. Calculate Karl Pearson's coefficient of skewness from the following data:
Profits (Rs. Lakhs)   No. of Cos.   Profits (Rs. Lakhs)   No. of Cos.
70-80                     12        110-120                   50
80-90                     18        120-130                   45
90-100                    35        130-140                   30
100-110                   42        140-150                    8
(M.Com., Nalanda Univ., 2013)
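Solution.  CALCULATION OF COEFF. OF SKEWNESS BY KARL PEARSON'S METHOD

Profits        m.p.     f    (m-115)/10     fd     fd²
(Rs. Lakhs)     m                d
70-80           75     12       -4         -48     192
80-90           85     18       -3         -54     162
90-100          95     35       -2         -70     140
100-110        105     42       -1         -42      42
110-120        115     50        0           0       0
120-130        125     45       +1         +45      45
130-140        135     30       +2         +60     120
140-150        145      8       +3         +24      72
                     N = 240           Σfd = -85  Σfd² = 773

Mean: $\bar{X} = A + \frac{\sum fd}{N} \times i = 115 - \frac{85}{240} \times 10 = 115 - 3.54 = 111.46$
Mode: By inspection mode lies in the class 110-120.
$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
L = 110, $\Delta_1 = |50 - 42| = 8$, $\Delta_2 = |50 - 45| = 5$, i = 10
$\text{Mode} = 110 + \frac{8}{8 + 5} \times 10 = 110 + 6.15 = 116.15$
S.D.: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{773}{240} - \left(\frac{-85}{240}\right)^2} \times 10 = \sqrt{3.221 - 0.125} \times 10 = 1.7595 \times 10 = 17.595$
$\text{Coeff. of Sk.} = \frac{111.46 - 116.15}{17.595} = \frac{-4.69}{17.595} = -0.267$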

Illustration 3. Calculate Karl Pearson's coefficient of skewness:


Variable Frequency Variable Frequency
70-80 11 30-40 21
60-70 22 20-30 11
50-60 30 10-20 6
40-50 35 0-10 5
(MBA, Kumaun Univ., 2009)

Solution. Rearranging the given data in ascending order and calculating Karl Pearson's
coefficient of skewness.

CALCULATION OF COEFFICIENT OF SKEWNESS


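Class      m.p.     f    (m-35)/10     fd     fd²
            m               d
0-10         5      5      -3         -15      45
10-20       15      6      -2         -12      24
20-30       25     11      -1         -11      11
30-40       35     21       0           0       0
40-50       45     35      +1         +35      35
50-60       55     30      +2         +60     120
60-70       65     22      +3         +66     198
70-80       75     11      +4         +44     176
                 N = 141          Σfd = 167  Σfd² = 609

$\text{Coeff. of Sk.} = \frac{\text{Mean} - \text{Mode}}{\sigma}$
Mean: $\bar{X} = A + \frac{\sum fd}{N} \times i$
A = 35, Σfd = 167, N = 141, i = 10
$\bar{X} = 35 + \frac{167}{141} \times 10 = 35 + 11.84 = 46.84$
Mode: By inspection mode lies in the class 40-50.
$\text{Mo} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$
L = 40, $\Delta_1 = |35 - 21| = 14$, $\Delta_2 = |35 - 30| = 5$, i = 10
$\text{Mo} = 40 + \frac{14}{14 + 5} \times 10 = 40 + 7.37 = 47.37$
Standard deviation: $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i = \sqrt{\frac{609}{141} - \left(\frac{167}{141}\right)^2} \times 10$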
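The text breaks off before the standard deviation is evaluated. The following Python sketch (an illustration, not the book's own working) carries the same formulas through to the coefficient. The function name `grouped_pearson_skewness` is hypothetical, and the sketch assumes the modal class is not the first or last class.

```python
from math import sqrt

def grouped_pearson_skewness(lower_limits, freqs, width, assumed_mean):
    """Return (mean, mode, sd, skewness) for equal-width grouped data."""
    n = sum(freqs)
    mids = [low + width / 2 for low in lower_limits]
    d = [(m - assumed_mean) / width for m in mids]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di ** 2 for f, di in zip(freqs, d))
    mean = assumed_mean + sum_fd / n * width
    sd = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * width
    # Modal class by inspection: the class with the highest frequency
    k = freqs.index(max(freqs))
    delta1 = freqs[k] - freqs[k - 1]   # excess over the preceding class
    delta2 = freqs[k] - freqs[k + 1]   # excess over the succeeding class
    mode = lower_limits[k] + delta1 / (delta1 + delta2) * width
    return mean, mode, sd, (mean - mode) / sd

lower_limits = [0, 10, 20, 30, 40, 50, 60, 70]
freqs = [5, 6, 11, 21, 35, 30, 22, 11]
mean, mode, sd, sk = grouped_pearson_skewness(lower_limits, freqs, 10, 35)
print(round(mean, 2), round(mode, 2), round(sd, 2), round(sk, 3))
```

The coefficient comes out slightly negative because the mean falls just below the mode.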
