UNIT 4 Stats Text
monly enployed.
The two methods of dispersion discussed above, namely, range and quartile deviation, are not measures of dispersion in the strict sense of the term because they do not show the scatter around an average. However, to study the formation of a distribution we should take the deviations from an average. The two other measures, namely, the average deviation and the standard deviation, help us in achieving this goal.

The mean deviation is also known as the average deviation. It is the average difference between the items in a distribution and the median or mean of that series. Theoretically there is an advantage in taking the deviations from the median because the sum of deviations of items from the median is minimum when signs are ignored. However, in practice, the arithmetic mean is more frequently used in calculating the value of average deviation and this is the reason why it is more commonly called mean deviation. In any case, the average used must be clearly stated in a given problem so that any possible confusion in meaning is avoided.
Steps.
(i) Compute the median of the series.
(ii) Take the deviations of items from median ignoring ± signs and denote these deviations by |D|.
(iii) Obtain the total of these deviations, i.e., Σ|D|.
(iv) Divide the total obtained in step (iii) by the number of observations.
If a distribution is normal, the mean ± mean deviation is the range that will include 57.5 per cent of the items in the series. If it is moderately skewed, then we may expect approximately 57.5 per cent of the items to fall within this range. Hence, if average deviation is small, the distribution is highly compact or uniform, since more than half of the cases are concentrated within a small range around the mean.
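The 57.5 per cent figure can be checked numerically. The sketch below assumes the standard result that, for a normal distribution, the mean deviation about the mean equals σ√(2/π); the variable names are illustrative only.

```python
from math import pi, sqrt
from statistics import NormalDist

# Assumption: for a normal distribution, M.D. about the mean = sigma * sqrt(2 / pi).
md_in_sigmas = sqrt(2 / pi)

# Share of a standard normal distribution lying within mean +/- M.D.
z = NormalDist()
share_within_md = z.cdf(md_in_sigmas) - z.cdf(-md_in_sigmas)

print(round(md_in_sigmas, 4), round(100 * share_within_md, 1))
```

The printed share is close to 57.5 per cent, in line with the statement above.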
The relative measure corresponding to mean deviation, called the coefficient of mean deviation, is obtained by dividing mean deviation by the particular average used in computing mean deviation. Thus, if mean deviation has been computed from median, the coefficient of mean deviation shall be obtained by dividing mean deviation by median:

$$\text{Coefficient of M.D.} = \frac{\text{M.D.}}{\text{Median}}$$
Illustration 6. Calculate the mean deviation of the two income groups of five and seven members given below:
Group I (Rs.):   14,000  14,800  15,200  16,000  18,800
Group II (Rs.):  15,500  16,000  16,200  17,000  17,500  18,000  19,000
Solution. CALCULATION OF MEAN DEVIATION
Group I                                  Group II
Income (Rs.)   |D| from median 15,200    Income (Rs.)   |D| from median 17,000
14,000         1,200                     15,500         1,500
14,800           400                     16,000         1,000
15,200             0                     16,200           800
16,000           800                     17,000             0
18,800         3,600                     17,500           500
                                         18,000         1,000
                                         19,000         2,000
N = 5          Σ|D| = 6,000              N = 7          Σ|D| = 6,800
Mean Deviation: Group I

$$\text{M.D.} = \frac{\sum |D|}{N}$$

where |D| = deviation from median ignoring signs.
Median = size of (N + 1)/2 th item = (5 + 1)/2 = 3rd item. Size of 3rd item is 15,200.
M.D. = 6,000 / 5 = 1,200
This means that the average deviation of the individual incomes from the median income is Rs. 1,200.
Mean Deviation: Group II
Median = size of (7 + 1)/2 = 4th item. Size of 4th item is 17,000.
Σ|D| = 6,800, N = 7.
M.D. = 6,800 / 7 = 971.43
Note. If we were to compute the coefficient of mean deviation we shall divide mean deviation by median. Thus for the first group:
Coefficient of M.D. = 1,200 / 15,200 = 0.079
and for the second group:
Coefficient of M.D. = 971.43 / 17,000 = 0.057
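The steps of Illustration 6 can be sketched in Python as follows. The function name `mean_deviation_from_median` is illustrative and not part of the text; the figures are the two income groups given above.

```python
from statistics import median

def mean_deviation_from_median(values):
    """Return (median, mean deviation about the median, coefficient of M.D.)."""
    med = median(values)
    # Sum of deviations from the median, ignoring signs, divided by N.
    md = sum(abs(x - med) for x in values) / len(values)
    return med, md, md / med

group_1 = [14000, 14800, 15200, 16000, 18800]
group_2 = [15500, 16000, 16200, 17000, 17500, 18000, 19000]

med_1, md_1, coeff_1 = mean_deviation_from_median(group_1)
med_2, md_2, coeff_2 = mean_deviation_from_median(group_2)

print(med_1, md_1, round(coeff_1, 3))
print(med_2, round(md_2, 2), round(coeff_2, 3))
```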
Calculation of Mean Deviation - Discrete Series
In discrete series the formula for calculating mean deviation is:

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

(by the same logic as given before), where |D| denotes deviation from median ignoring signs.
Steps.
(i) Calculate the median of the series.
(ii) Take the deviations of the items from median ignoring signs and denote them by |D|.
(iii) Multiply these deviations by the respective frequencies and obtain the total Σf|D|.
(iv) Divide the total obtained in step (iii) by the number of observations. This gives us the value of mean deviation.
Illustration 7. (a) Calculate mean deviation from the following series:
X:  10  11  12  13  14
f:   3  12  18  12   3
(B.A. (Hons.) Econ., Madras Univ., 2009)
Solution. CALCULATION OF MEAN DEVIATION
X     f     |D|   f|D|   c.f.
10    3     2     6      3
11    12    1     12     15
12    18    0     0      33
13    12    1     12     45
14    3     2     6      48
      N = 48      Σf|D| = 36

$$\text{M.D.} = \frac{\sum f|D|}{N}$$

Median = size of (N + 1)/2 th item = (48 + 1)/2 = 24.5th item.
Size of 24.5th item is 12, hence Median = 12.
M.D. = 36 / 48 = 0.75.
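A minimal sketch of the same discrete-series calculation, using the data of Illustration 7 (a). Expanding the frequency table into individual items is simply one convenient way to obtain the median here; it is an assumption of this sketch, not a method prescribed by the text.

```python
from statistics import median

sizes = [10, 11, 12, 13, 14]
freqs = [3, 12, 18, 12, 3]

# Expand the table so the median is taken over all N = 48 items.
items = [x for x, f in zip(sizes, freqs) for _ in range(f)]
med = median(items)

n = sum(freqs)
# Multiply each absolute deviation from the median by its frequency.
sum_f_abs_d = sum(f * abs(x - med) for x, f in zip(sizes, freqs))
md = sum_f_abs_d / n

print(med, sum_f_abs_d, md)
```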
(b) Calculate the mean deviation from the mean for the following data:
Size 4 6 8 10 12 14 16
Frequency 2 5 3 2 1
(B.Com. (H), Madras Univ., 2009)
CALCULATION OF MEAN DEVIATION FROM MEAN
X-8 f D
IX DI
Solution 12
6
2 8
2
24
40 6
2
5
30 A 8
3
10 24 6
2
12 14 8
1 8
14 16
1
ΣfX = 160, Σf|D| = 56, N = 20

$$\bar{X} = \frac{\sum fX}{N} = \frac{160}{20} = 8$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{56}{20} = 2.8$$
Calculation of Mean Deviation - Continuous Series
For calculating mean deviation in continuous series the procedure remains the same as discussed above. The only difference is that here we have to obtain the mid-point of the various classes and take deviations of these points from median. The formula is same, i.e.,

$$\text{M.D.} = \frac{\sum f|D|}{N}$$
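Σf|D| = 1314.8
Med. = size of N/2 th item = 100/2 = 50th item
Median lies in the class 30-40.

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

L = 30, N/2 = 50, c.f. = 37, f = 25, i = 10

$$\text{Med.} = 30 + \frac{50 - 37}{25} \times 10 = 30 + 5.2 = 35.2$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{1314.8}{100} = 13.148$$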
Calculate the mean deviation and its coefficient from the following data:
Class    Frequency    Class    Frequency
0-10     5            40-50    20
10-20    8            50-60    14
20-30    12           60-70    12
30-40    15           70-80    6
(B.Com., Andhra Univ., 2005)

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

N/2 = 46, c.f. = 40, f = 20, L = 40, i = 10

$$\text{Med.} = 40 + \frac{46 - 40}{20} \times 10 = 40 + 3 = 43$$

$$\text{M.D.} = \frac{\sum f|D|}{N} = \frac{1414}{92} = 15.37$$

$$\text{Coeff. of M.D.} = \frac{\text{M.D.}}{\text{Median}} = \frac{15.37}{43} = 0.357$$
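The continuous-series procedure can be sketched as below for the Andhra Univ. data above. The helper `grouped_median` is a hypothetical name; it applies the interpolation formula Med. = L + (N/2 − c.f.)/f × i, and the deviations are taken from class mid-points.

```python
# (lower limit, upper limit, frequency) for each class
classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 15),
           (40, 50, 20), (50, 60, 14), (60, 70, 12), (70, 80, 6)]

def grouped_median(classes):
    """Median by interpolation: L + (N/2 - c.f.) / f * i."""
    half = sum(f for _, _, f in classes) / 2
    cum = 0  # cumulative frequency of the classes before the current one
    for lower, upper, f in classes:
        if cum + f >= half:
            return lower + (half - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty frequency table")

n = sum(f for _, _, f in classes)
med = grouped_median(classes)

# Deviations of class mid-points from the median, weighted by frequency.
sum_f_abs_d = sum(f * abs((lo + hi) / 2 - med) for lo, hi, f in classes)
md = sum_f_abs_d / n
coeff = md / med

print(round(med, 2), round(sum_f_abs_d, 2), round(md, 2), round(coeff, 3))
```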
deviation. Population
landard S.
Sample standard deviation by
requires a mea
286 s i l u a t i o n
r C or
a or any up not
d e v i a t i o n . I general
PuDne
1s
useful.
mean of presented
will be the
average
data.
C Consequently chaIge
dispersion
lhat s l a t i s t i c s ,
of the devi
with in ol lI1Can
Very fanmiliar
item vallue
every the
cach and change iten
InS
It is based on item
would
value
o
Cxtrene
than
V a u e of any the
in the allected by
deviation is less value, comparison
c onparison about
value,
Can
the standard deviation. central
from a made.
taken be
are can
easily
Since deviations d i s t r i b u t i o n s
different algebraic es
Tornation of method
is t h a t signs
drawback
of this F o r example if
The standard deviation measures the absolute dispersion (or variability); the greater the amount of dispersion or variability, the greater the standard deviation. A small standard deviation means a high degree of uniformity; a large standard deviation means just the opposite. Thus, if we have two or more comparable series with identical or nearly identical means, it is the distribution with the smallest standard deviation that has the most representative mean. Hence standard deviation is extremely useful in judging the representativeness of the mean.
Difference between Mean Deviation and Standard Deviation
Both these measures of dispersion are based on each and every item of the distribution. But they differ in the following respects:
(i) Algebraic signs are ignored while calculating mean deviation, whereas in the calculation of standard deviation signs are taken into account.
(ii) Mean deviation can be computed either from median or mean. The standard deviation, on the other hand, is always computed from the arithmetic mean because the sum of squares of the deviations of items from arithmetic mean is the least.
Calculation of Standard Deviation - Individual Observations
In case of individual observations standard deviation may be computed by applying any of the following two methods:
1. By taking deviations of the items from the actual mean.
2. By taking deviations of the items from an assumed mean.
Deviations taken from Actual Mean. When deviations are taken from actual mean the following formula is applied:

$$\sigma = \sqrt{\frac{\sum x^2}{N}}, \quad \text{where } x = (X - \bar{X})$$

Steps.
(i) Calculate the actual mean of the series, i.e., X̄.
(ii) Take the deviations of the items from the mean, i.e., find (X − X̄). Denote these deviations by x.
(iii) Square these deviations and obtain the total Σx².
(iv) Divide Σx² by the total number of observations, i.e., N, and extract the square-root. This gives us the value of standard deviation.
Deviations taken from Assumed Mean. When the actual mean is in fractions, say, 123.674, it would be too cumbersome to take deviations from it and then obtain squares of these deviations. In such a case either the mean may be approximated or else the deviations be taken from an assumed mean and the necessary adjustment made in the value of the standard deviation. The former method of approximation is less accurate and, therefore, invariably in such a case deviations are taken from assumed mean.
When deviations are taken from assumed mean the following formula is applied:

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$

Steps.
(i) Take the deviations of the items from an assumed mean, i.e., obtain (X − A). Denote these deviations by d. Take the total of these deviations, i.e., obtain Σd.
(ii) Square these deviations and obtain the total Σd².
(iii) Substitute the values of Σd², Σd and N in the above formula.
Illustration 9. Blood serum cholesterol levels of 10 persons are as under:
240, 260, 290, 245, 255, 288, 272, 263, 277, 251.
Calculate standard deviation with the help of assumed mean. (MBA, HPTU, 2014)
Solution. CALCULATION OF STANDARD DEVIATION BY THE ASSUMED MEAN METHOD
X       d = (X − 264)   d²
240     −24             576
260     −4              16
290     +26             676
245     −19             361
255     −9              81
288     +24             576
272     +8              64
263     −1              1
277     +13             169
251     −13             169
ΣX = 2641   Σd = +1     Σd² = 2689

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$

Σd² = 2689, Σd = +1, N = 10

$$\sigma = \sqrt{\frac{2689}{10} - \left(\frac{1}{10}\right)^2} = \sqrt{268.9 - 0.01} = 16.398$$

Note. The assumed mean should be taken as near to the actual mean as possible to minimise calculations. In this case the actual mean is 264.1 and the assumed mean is 264.
Illustration 10. Calculate the standard deviation from the following observations:
240.12  240.13  240.15  240.12  240.17
240.15  240.17  240.16  240.22  240.21
Solution. CALCULATION OF STANDARD DEVIATION
Deviations are taken from 240, i.e., d = (X − 240), which gives Σd = 1.6 and Σd² = 0.2666.

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} = \sqrt{\frac{0.2666}{10} - \left(\frac{1.6}{10}\right)^2} = \sqrt{0.02666 - 0.0256} = 0.033$$
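The two methods for individual observations can be compared with a short sketch. The function names `sd_actual_mean` and `sd_assumed_mean` are illustrative; the data are the cholesterol levels of Illustration 9, with 264 as the assumed mean.

```python
from math import sqrt

def sd_actual_mean(values):
    """sigma = sqrt(sum(x^2) / N), with x = X - mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / n)

def sd_assumed_mean(values, assumed):
    """sigma = sqrt(sum(d^2)/N - (sum(d)/N)^2), with d = X - A."""
    n = len(values)
    d = [v - assumed for v in values]
    return sqrt(sum(x * x for x in d) / n - (sum(d) / n) ** 2)

cholesterol = [240, 260, 290, 245, 255, 288, 272, 263, 277, 251]
sigma_actual = sd_actual_mean(cholesterol)
sigma_assumed = sd_assumed_mean(cholesterol, 264)

print(round(sigma_actual, 3), round(sigma_assumed, 3))
```

Both functions give the same value, which illustrates that the choice of assumed mean affects only the arithmetic, not the result.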
$$\sigma = \sqrt{\frac{\sum f x^2}{N}}, \quad \text{where } x = (X - \bar{X})$$

However, in practice this method is rarely used because if the actual mean is in fractions the calculations take a lot of time.
(b) Assumed Mean Method. When this method is used, the following formula is applied:

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}, \quad \text{where } d = (X - A)$$
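Solution.
Size of item X   f     d = (X − 6.5)   fd     fd²
3.5              3     −3              −9     27
4.5              7     −2              −14    28
5.5              22    −1              −22    22
6.5              60    0               0      0
7.5              85    +1              +85    85
8.5              32    +2              +64    128
9.5              8     +3              +24    72
                 N = 217               Σfd = +128   Σfd² = 362

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$

Σfd² = 362, Σfd = 128, N = 217

$$\sigma = \sqrt{\frac{362}{217} - \left(\frac{128}{217}\right)^2} = \sqrt{1.668 - 0.348} = 1.149$$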
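The assumed mean method for a discrete series can be sketched as follows, using the table above (A = 6.5). Variable names are illustrative.

```python
from math import sqrt

sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
freqs = [3, 7, 22, 60, 85, 32, 8]
assumed_mean = 6.5

n = sum(freqs)
d = [x - assumed_mean for x in sizes]                   # d = X - A
sum_fd = sum(f * di for f, di in zip(freqs, d))         # total of fd
sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))   # total of fd^2

sigma = sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
print(n, sum_fd, sum_fd2, round(sigma, 3))
```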
(c) Step Deviation Method. When this method is used we take deviations of midpoints from an assumed mean and divide these deviations by the width of class interval, i.e., 'i'. In case class intervals are unequal, we divide the deviations of midpoints by the lowest common factor and use 'c' instead of 'i' in the formula for calculating standard deviation. The formula for calculating standard deviation is:

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$$

Here, Σfd² = 240, N = 50, Σfd = 36, i = 5

$$\sigma = \sqrt{\frac{240}{50} - \left(\frac{36}{50}\right)^2} \times 5 = \sqrt{4.8 - 0.5184} \times 5 = 10.35$$
Calculation of Standard Deviation - Continuous Series. In continuous series any of the methods discussed above for discrete frequency distribution can be used. However, in practice it is the step deviation method that is most used. The formula is:

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$$

Steps.
(i) Find the mid-points of various classes.
(ii) Take the deviations of these mid-points from an assumed mean and denote these deviations by d.
(iii) Wherever possible take a common factor and denote this column by d.
(iv) Multiply the frequencies of each class with these deviations and obtain Σfd.
(v) Square the deviations and multiply them with the respective frequencies of each class and obtain Σfd².
Thus the only difference in procedure in case of continuous series is to find mid-points of the various classes.
Illustration 13. Calculate mean, median and standard deviation from the following data:
Profits (Rs. lakh):   0-10   10-20   20-30   30-40   40-50   50-60
No. of companies:     12     17      23      39      16      3
(MBA, Univ. of Lucknow, 2009)
Solution. CALCULATION OF MEAN, MEDIAN AND STANDARD DEVIATION
Profits (Rs. lakh)   m.p. m   No. of companies f   d = (m − 35)/10   fd     fd²    c.f.
0-10                 5        12                   −3                −36    108    12
10-20                15       17                   −2                −34    68     29
20-30                25       23                   −1                −23    23     52
30-40                35       39                   0                 0      0      91
40-50                45       16                   +1                +16    16     107
50-60                55       3                    +2                +6     12     110
                              N = 110                                Σfd = −71   Σfd² = 227
Calculation of X̄

$$\bar{X} = A + \frac{\sum f d}{N} \times i$$

A = 35, Σfd = −71, N = 110, i = 10

$$\bar{X} = 35 - \frac{71}{110} \times 10 = 35 - 6.45 = 28.55$$

Calculation of Median
Med. = size of N/2 th item = 110/2 = 55th item
Median lies in the class 30-40.

$$\text{Med.} = L + \frac{N/2 - c.f.}{f} \times i$$

L = 30, N/2 = 55, c.f. = 52, f = 39, i = 10

$$\text{Med.} = 30 + \frac{55 - 52}{39} \times 10 = 30 + 0.77 = 30.77$$

Calculation of Standard Deviation

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{227}{110} - \left(\frac{-71}{110}\right)^2} \times 10 = \sqrt{2.064 - 0.417} \times 10 = 1.283 \times 10 = 12.83$$

Hence mean or X̄ of the distribution is Rs. 28.55 lakh, median = Rs. 30.77 lakh and standard deviation is Rs. 12.83 lakh.
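The step deviation method of Illustration 13 can be sketched in Python as below. The assumed mean A = 35 and class width i = 10 follow the solution table; the helper name `grouped_median` is hypothetical.

```python
from math import sqrt

# (lower limit, upper limit, frequency): profits in Rs. lakh
classes = [(0, 10, 12), (10, 20, 17), (20, 30, 23),
           (30, 40, 39), (40, 50, 16), (50, 60, 3)]
A, i = 35, 10  # assumed mean and class width

n = sum(f for _, _, f in classes)
steps = [((lo + hi) / 2 - A) / i for lo, hi, _ in classes]   # d = (m - A) / i
sum_fd = sum(f * d for (_, _, f), d in zip(classes, steps))
sum_fd2 = sum(f * d * d for (_, _, f), d in zip(classes, steps))

mean = A + sum_fd / n * i
sigma = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * i

def grouped_median(classes):
    """Median by interpolation: L + (N/2 - c.f.) / f * i."""
    half = sum(f for _, _, f in classes) / 2
    cum = 0
    for lower, upper, f in classes:
        if cum + f >= half:
            return lower + (half - cum) / f * (upper - lower)
        cum += f
    raise ValueError("empty frequency table")

med = grouped_median(classes)
print(round(mean, 2), round(med, 2), round(sigma, 2))
```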
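Illustration 14. Find the standard deviation from the following data:
Age under:              10   20   30   40   50    60    70    80
No. of persons dying:   15   30   53   75   100   110   115   125
(B.Com., Bangalore Univ.; B.Com., Madras Univ., 2009; MBA, Pune Univ., 2015)
Solution. The cumulative frequencies are first converted into simple frequencies.
CALCULATION OF STANDARD DEVIATION
Age      m.p. m   f     d = (m − 35)/10   fd     fd²
0-10     5        15    −3                −45    135
10-20    15       15    −2                −30    60
20-30    25       23    −1                −23    23
30-40    35       22    0                 0      0
40-50    45       25    +1                +25    25
50-60    55       10    +2                +20    40
60-70    65       5     +3                +15    45
70-80    75       10    +4                +40    160
                  N = 125                 Σfd = 2   Σfd² = 488

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{488}{125} - \left(\frac{2}{125}\right)^2} \times 10 = \sqrt{3.904 - 0.0003} \times 10 = 1.976 \times 10 = 19.76$$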
30-35 32.5 80
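$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{1310}{480} - \left(\frac{220}{480}\right)^2} \times 5 = \sqrt{2.729 - 0.21} \times 5 = \sqrt{2.519} \times 5 = 1.587 \times 5 = 7.935$$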
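$$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

σ₁₂ = combined standard deviation;
σ₁ = standard deviation of first group;
σ₂ = standard deviation of second group;
d₁ = |X̄₁ − X̄₁₂|;
d₂ = |X̄₂ − X̄₁₂|.
The above formula can be extended to find out the standard deviation of three or more groups. For example, combined standard deviation of three groups would be:

$$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

where d₁ = |X̄₁ − X̄₁₂₃|; d₂ = |X̄₂ − X̄₁₂₃|; d₃ = |X̄₃ − X̄₁₂₃|.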
Illustration 16. The following are some of the particulars of the distribution of weight of boys and girls in a class:
               Boys     Girls
Number         100      50
Mean weight    60 kg    45 kg
Variance       9        4
(a) Find the standard deviation of the combined data.
(b) Which of the two distributions is more variable?
(B.Com., MD. Univ., 2009)
Solution. (a) Combined S.D.

$$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{100(60) + 50(45)}{100 + 50} = \frac{6000 + 2250}{150} = 55$$

N₁ = 100, σ₁² = 9, N₂ = 50, σ₂² = 4,
d₁ = |X̄₁ − X̄₁₂| = |60 − 55| = 5,
d₂ = |X̄₂ − X̄₁₂| = |45 − 55| = 10.
Substituting the values:

$$\sigma_{12} = \sqrt{\frac{100(9) + 50(4) + 100(5)^2 + 50(10)^2}{100 + 50}} = \sqrt{\frac{900 + 200 + 2500 + 5000}{150}} = \sqrt{\frac{8600}{150}} = 7.57$$

(b) For finding which distribution is more variable compare the coefficient of variation of the two distributions:

$$\text{C.V. (Boys)} = \frac{\sigma}{\bar{X}} \times 100 = \frac{3}{60} \times 100 = 5.00$$

$$\text{C.V. (Girls)} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2}{45} \times 100 = 4.44$$

Since coefficient of variation is more for distribution of weight of boys, this distribution shows greater variability.
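The combined mean and combined standard deviation can be sketched for any number of groups. The function `combine_groups` is a hypothetical helper that applies the formula above group by group; it is checked here against the figures of Illustration 16.

```python
from math import sqrt

def combine_groups(groups):
    """groups: list of (N, mean, sd). Returns (combined mean, combined sd)."""
    total_n = sum(n for n, _, _ in groups)
    mean = sum(n * m for n, m, _ in groups) / total_n
    # Each group contributes N * sigma^2 + N * d^2, with d = group mean - combined mean.
    total = sum(n * (sd ** 2 + (m - mean) ** 2) for n, m, sd in groups)
    return mean, sqrt(total / total_n)

# Boys: N = 100, mean 60, variance 9. Girls: N = 50, mean 45, variance 4.
combined_mean, combined_sd = combine_groups([(100, 60, 3), (50, 45, 2)])

cv_boys = 3 / 60 * 100
cv_girls = 2 / 45 * 100
print(combined_mean, round(combined_sd, 2), round(cv_boys, 2), round(cv_girls, 2))
```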
Illustration 17. The number of workers employed, the mean wage (in Rs.) per month and standard deviation (in Rs.) in each section of a factory are given below. Calculate the mean wages and standard deviation of all the workers taken together.
Section   No. of workers employed   Mean wages (in Rs.)   Standard deviation (in Rs.)
A         50                        11130                 600
B         60                        11200                 700
C         90                        11150                 800
Solution.

$$\bar{X}_{123} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3} = \frac{(50 \times 11130) + (60 \times 11200) + (90 \times 11150)}{50 + 60 + 90}$$

$$= \frac{5,56,500 + 6,72,000 + 10,03,500}{200} = \frac{22,32,000}{200} = \text{Rs. } 11,160$$

Combined standard deviation of three series:

$$\sigma_{123} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 + N_1 d_1^2 + N_2 d_2^2 + N_3 d_3^2}{N_1 + N_2 + N_3}}$$

d₁ = |X̄₁ − X̄₁₂₃| or |11130 − 11160| = 30
d₂ = |X̄₂ − X̄₁₂₃| or |11200 − 11160| = 40
d₃ = |X̄₃ − X̄₁₂₃| or |11150 − 11160| = 10
[Figure: area under the normal curve. X̄ ± 1σ covers 68.27%, X̄ ± 2σ covers 95.45% and X̄ ± 3σ covers 99.73% of the items.]
Relationship between Measures of Dispersion
In a normal distribution there is a fixed relationship between the three most commonly used measures of dispersion. The quartile deviation is smallest, the mean deviation next and the standard deviation is largest, in the following proportion:

$$\text{Q.D.} = \frac{2}{3}\sigma \quad \text{or} \quad \sigma = \frac{3}{2}\,\text{Q.D.}; \qquad \text{M.D.} = \frac{4}{5}\sigma \quad \text{or} \quad \sigma = \frac{5}{4}\,\text{M.D.}$$

These relationships can be easily memorized because of the sequence of natural numbers 2, 3, 4, 5. The same proportions tend to hold true for many distributions that are not quite normal. They are useful in estimating one measure of dispersion when another is known, or in checking roughly the accuracy of a calculated value. If the computed σ differs very widely from its value estimated from Q.D. or M.D., either an error has been made or the distribution differs considerably from normal.
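The proportions 2/3 and 4/5 are rounded values. A brief check, assuming the standard normal-distribution results Q.D. = 0.6745σ (the 75th percentile of the standard normal) and M.D. = σ√(2/π), is sketched below; the variable names are illustrative.

```python
from math import pi, sqrt
from statistics import NormalDist

# Quartile deviation of a standard normal distribution: distance from the
# median to the third quartile (Q1 and Q3 are symmetric about the median).
qd_in_sigmas = NormalDist().inv_cdf(0.75)

# Mean deviation about the mean for a normal distribution.
md_in_sigmas = sqrt(2 / pi)

print(round(qd_in_sigmas, 4), round(md_in_sigmas, 4))
```

Both values lie within about 0.01 of 2/3 and 4/5 respectively, which is why the rule of thumb works for roughly normal data.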
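1,000-1,199   12    2,000-2,199   55
1,200-1,399   30    2,200-2,399   36
1,400-1,599   65    2,400-2,599   25
1,600-1,799   78    2,600-2,799   9
1,800-1,999   90
Calculate (i) the average length of life of a radio tube, (ii) the standard deviation of the length of life of a tube, and (iii) the percentage number of tubes where length of life of a tube falls within X̄ ± 2σ. (MBA, Annamalai Univ., 2015)
Length of life   m.p. m   f     d = (m − 1899.5)/200   fd      fd²
1000-1199        1099.5   12    −4                     −48     192
1200-1399        1299.5   30    −3                     −90     270
1400-1599        1499.5   65    −2                     −130    260
1600-1799        1699.5   78    −1                     −78     78
1800-1999        1899.5   90    0                      0       0
2000-2199        2099.5   55    +1                     +55     55
2200-2399        2299.5   36    +2                     +72     144
2400-2599        2499.5   25    +3                     +75     225
2600-2799        2699.5   9     +4                     +36     144
                          N = 400                      Σfd = −108   Σfd² = 1368

$$\bar{X} = A + \frac{\sum f d}{N} \times i = 1899.5 - \frac{108}{400} \times 200 = 1845.5$$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{1368}{400} - \left(\frac{-108}{400}\right)^2} \times 200 = \sqrt{3.42 - 0.073} \times 200 = \sqrt{3.347} \times 200 = 1.829 \times 200 = 365.8$$

(iii) X̄ ± 2σ = 1845.5 ± 2(365.8) = 1113.9 to 2577.1. For this we have to find the frequencies lying between 1113.9 and 2577.1. The part of the class 1000-1199 lying above 1113.9 accounts for 5.16 frequencies. Between 2400 and 2599 there are 25 frequencies, so up to 2577.1 there are (177/200) × 25 = 22.1 frequencies. Thus the total of frequencies between 1113.9 and 2577.1 would be
5.16 + 30 + 65 + 78 + 90 + 55 + 36 + 22.1 = 381.26, or (381.26/400) × 100 = 95.32 per cent.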
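$$\text{Mean: } \bar{X} = A + \frac{\sum f d}{N} \times i = 600 - \frac{26}{80} \times 50 = 600 - 16.25 = 583.75$$

$$\text{S.D.: } \sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{352}{80} - \left(\frac{-26}{80}\right)^2} \times 50 = \sqrt{4.4 - 0.106} \times 50 = 2.072 \times 50 = 103.6$$

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{103.6}{583.75} \times 100 = 17.75\%$$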
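Illustration 20. From the prices of shares of X and Y below find out which is more stable in value:
X:   35    54    52    53    56    58    52    50    51    49
Y:   108   107   105   105   106   107   104   103   104   101
(B.Com., Bangalore Univ., 2005)
Solution. In order to find out which shares are more stable, we have to compare coefficient of variations.
CALCULATION OF COEFFICIENT OF VARIATION
X      x = (X − X̄)   x²     Y      y = (Y − Ȳ)   y²
35     −16           256    108    +3            9
54     +3            9      107    +2            4
52     +1            1      105    0             0
53     +2            4      105    0             0
56     +5            25     106    +1            1
58     +7            49     107    +2            4
52     +1            1      104    −1            1
50     −1            1      103    −2            4
51     0             0      104    −1            1
49     −2            4      101    −4            16
ΣX = 510   Σx = 0    Σx² = 350   ΣY = 1050   Σy = 0   Σy² = 40
Coefficient of Variation X:

$$\bar{X} = \frac{\sum X}{N} = \frac{510}{10} = 51, \qquad \sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{350}{10}} = 5.916$$

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{5.916}{51} \times 100 = 11.6$$

Coefficient of Variation Y:

$$\bar{Y} = \frac{\sum Y}{N} = \frac{1050}{10} = 105, \qquad \sigma = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{40}{10}} = 2$$

$$\text{C.V.} = \frac{\sigma}{\bar{Y}} \times 100 = \frac{2}{105} \times 100 = 1.90$$

Since the coefficient of variation is less for Y, shares Y are more stable in value.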
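The comparison of share prices X and Y in Illustration 20 can be sketched with the standard library. The function name `coefficient_of_variation` is illustrative; `pstdev` is used because the text divides by N rather than N − 1.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """C.V. = (population standard deviation / mean) * 100."""
    return pstdev(values) / mean(values) * 100

shares_x = [35, 54, 52, 53, 56, 58, 52, 50, 51, 49]
shares_y = [108, 107, 105, 105, 106, 107, 104, 103, 104, 101]

cv_x = coefficient_of_variation(shares_x)
cv_y = coefficient_of_variation(shares_y)

# The series with the smaller C.V. is the more stable one.
more_stable = "Y" if cv_y < cv_x else "X"
print(round(cv_x, 1), round(cv_y, 2), more_stable)
```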
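Illustration 21. Goals scored by two teams, A and B, in football matches:
Team A                          Team B
X     x = (X − 7)   x²          Y     y = (Y − 7)   y²
15    +8            64          20    +13           169
10    +3            9           10    +3            9
7     0             0           5     −2            4
5     −2            4           4     −3            9
3     −4            16          2     −5            25
2     −5            25          1     −6            36
ΣX = 42   Σx = 0    Σx² = 118   ΣY = 42   Σy = 0    Σy² = 252
Team A:

$$\bar{X} = \frac{42}{6} = 7, \qquad \sigma = \sqrt{\frac{\sum x^2}{N}} = \sqrt{\frac{118}{6}} = 4.43, \qquad \text{C.V.} = \frac{4.43}{7} \times 100 = 63.29$$

Team B:

$$\bar{Y} = \frac{42}{6} = 7, \qquad \sigma = \sqrt{\frac{\sum y^2}{N}} = \sqrt{\frac{252}{6}} = 6.48, \qquad \text{C.V.} = \frac{6.48}{7} \times 100 = 92.57$$

Illustration 22. Two brands of tyres are tested with the following results:
Life (in '000 miles)   Brand X   Brand Y
20-25                  1         0
25-30                  22        24
30-35                  64        76
35-40                  10        0
40-45                  3         0
(a) Which brand of tyres have greater average life?
(b) Compare the variability and state which brand of tyres would you use on your fleet of trucks?
Solution. In order to answer part (a) we have to compare the means and to answer part (b) we have to compare the coefficient of variation.
Brand X
Life     m.p. m   f     d = (m − 32.5)/5   fd     fd²
20-25    22.5     1     −2                 −2     4
25-30    27.5     22    −1                 −22    22
30-35    32.5     64    0                  0      0
35-40    37.5     10    +1                 +10    10
40-45    42.5     3     +2                 +6     12
                  N = 100                  Σfd = −8   Σfd² = 48

$$\bar{X} = A + \frac{\sum f d}{N} \times i$$

A = 32.5, Σfd = −8, N = 100, i = 5

$$\bar{X} = 32.5 - \frac{8}{100} \times 5 = 32.5 - 0.4 = 32.1$$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{48}{100} - \left(\frac{-8}{100}\right)^2} \times 5 = \sqrt{0.48 - 0.0064} \times 5 = 0.6882 \times 5 = 3.441$$

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{3.441}{32.1} \times 100 = 10.72$$

Brand Y

$$\bar{X} = A + \frac{\sum f d}{N} \times i$$

A = 32.5, Σfd = −24, N = 100, i = 5

$$\bar{X} = 32.5 - \frac{24}{100} \times 5 = 32.5 - 1.2 = 31.3$$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{24}{100} - \left(\frac{-24}{100}\right)^2} \times 5 = \sqrt{0.24 - 0.0576} \times 5 = 0.4271 \times 5 = 2.136$$

$$\text{C.V.} = \frac{\sigma}{\bar{X}} \times 100 = \frac{2.136}{31.3} \times 100 = 6.824$$

(a) Since arithmetic mean is more for brand X of tyres, they have greater average life.
(b) Since coefficient of variation is less for brand Y of tyres, they are more consistent and should be preferred for use.
Illustration 23. An analysis of the weekly wages paid to workers in two firms A and B, belonging to the same industry, gave the following results:
                                                  Firm A         Firm B
Number of wage earners                            550            650
Average weekly wage                               Rs. 1450       Rs. 1400
Standard deviation of the distribution of wages   Rs. √10,000    Rs. √19,600

$$\bar{X}_{12} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N_1 + N_2} = \frac{(550 \times 1450) + (650 \times 1400)}{550 + 650} = \frac{7,97,500 + 9,10,000}{1200} = \frac{17,07,500}{1200} = \text{Rs. } 1422.92$$

$$\sigma_{12} = \sqrt{\frac{N_1\sigma_1^2 + N_2\sigma_2^2 + N_1 d_1^2 + N_2 d_2^2}{N_1 + N_2}}$$

d₁ = |X̄₁ − X̄₁₂| = |1450 − 1422.92| = 27.08
d₂ = |X̄₂ − X̄₁₂| = |1400 − 1422.92| = 22.92

$$\sigma_{12} = \sqrt{\frac{(550 \times 10,000) + (650 \times 19,600) + 550(27.08)^2 + 650(22.92)^2}{1200}}$$

$$= \sqrt{\frac{55,00,000 + 1,27,40,000 + 4,03,329.52 + 3,41,462.16}{1200}} = \sqrt{\frac{1,89,84,791.68}{1200}} = 125.78$$

Hence combined mean is Rs. 1422.92 and standard deviation Rs. 125.78.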
Variance. The term variance was used to describe the square of the standard deviation by R.A. Fisher in 1918. The concept of variance is highly important in advanced work where it is possible to split the total into several parts, each attributable to one of the factors causing variation in their original series. Variance is defined as follows:

$$\text{Variance} = \frac{\sum (X - \bar{X})^2}{N}$$

For details please refer to chapter on 'Analysis of Variance'.
Solution. CALCULATION OF VARIANCE
Marks    m.p. m   f     d = (m − 32)/4   fd     fd²
10-14    12       2     −5               −10    50
14-18    16       4     −4               −16    64
18-22    20       4     −3               −12    36
22-26    24       8     −2               −16    32
26-30    28       12    −1               −12    12
30-34    32       16    0                0      0
34-38    36       10    +1               +10    10
38-42    40       8     +2               +16    32
42-46    44       4     +3               +12    36
46-50    48       6     +4               +24    96
50-54    52       2     +5               +10    50
54-58    56       4     +6               +24    144
                  N = 80                 Σfd = 30   Σfd² = 562
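The totals in the table above can be turned into the variance with the step deviation form of the formula, variance = [Σfd²/N − (Σfd/N)²] × i². The sketch below assumes A = 32 and i = 4, as in the column heading; the resulting figure is computed here and is not quoted from the text.

```python
mid_points = list(range(12, 60, 4))            # 12, 16, ..., 56
freqs = [2, 4, 4, 8, 12, 16, 10, 8, 4, 6, 2, 4]
A, i = 32, 4                                   # assumed mean and class width

n = sum(freqs)
steps = [(m - A) / i for m in mid_points]      # d = (m - 32) / 4
sum_fd = sum(f * d for f, d in zip(freqs, steps))
sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))

# Variance is the square of the standard deviation, so the factor is i squared.
variance = (sum_fd2 / n - (sum_fd / n) ** 2) * i ** 2
print(n, sum_fd, sum_fd2, round(variance, 2))
```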
For details please refer to chapter on Theoretical Distributions.
INTRODUCTION
In the previous two chapters we have discussed measures of central tendency and variability. However, they do not reveal the entire story. There are two other comparable characteristics called skewness and kurtosis that help us to understand a distribution. Two distributions may have the same mean and standard deviation but may differ widely in their appearance as can be seen from the following:
[Figure: two frequency distributions plotted over the values 5 to 30, each with X̄ = 15 and σ = 6.]
In both these distributions the value of mean and standard deviation is the same (X̄ = 15, σ = 6). But it does not imply that the distributions are alike in nature. The distribution on the left-hand side is a symmetrical one whereas the distribution on the right-hand side is asymmetrical or skewed. Measures of skewness help us to distinguish between different types of distributions.
Some important definitions of skewness are as follows:
1. "When a series is not symmetrical it is said to be asymmetrical or skewed."
- Croxton & Cowden
2. "Skewness refers to the asymmetry or lack of symmetry in the shape of a frequency distribution."
- Morris Hamburg
3. "Measures of skewness tell us the direction and the extent of skewness. In symmetrical distribution the mean, median and mode are identical. The more the mean moves away from the mode, the larger the asymmetry or skewness."
- Simpson & Kafka
4. "A distribution is said to be 'skewed' when the mean and the median fall at different points in the distribution, and the balance (or centre of gravity) is shifted to one side or the other, to left or right."
- Garrett
The analysis of above definitions shows that the term 'skewness' refers to lack of symmetry, i.e., when a distribution is not symmetrical (or is asymmetrical) it is called a skewed distribution. Any measure of skewness indicates the difference between the manner in which items are distributed in a particular distribution compared with a symmetrical (or normal) distribution. If, for example, skewness is positive, the frequencies in the distribution are in the
SKEWNESS, MOMENTS AND KURTOSIS
1. Symmetrical Distribution. It is clear from the diagram that in a symmetrical distribution the values of mean, median and mode coincide. The spread of the frequencies is the same on both sides of the centre point of the curve.
[Figure: symmetrical distribution, X̄ = Med = Mode.]
2. Asymmetrical Distribution. A distribution which is not symmetrical is called a skewed distribution and such a distribution could either be positively skewed or negatively skewed as would be clear from the following diagrams.
3. Positively Skewed Distribution. In the positively skewed distribution the value of the mean is maximum and that of mode least; the median lies in between the two as is clear from the following diagram.
[Figure: Positively Skewed Distribution, with Mo, Med and X̄ marked from left to right.]
4. Negatively Skewed Distribution. The following is the shape of negatively skewed distribution:
[Figure: Negatively Skewed Distribution, with X̄, Med and Mo marked from left to right.]
In a negatively skewed distribution the value of mode is maximum and that of mean least; the median lies in between the two. In the positively skewed distribution the frequencies are spread out over a greater range of values on the high-value end of the curve (the right-hand side) than they are on the low-value end. In the negatively skewed distribution the position is reversed, i.e., the excess tail is on the left-hand side.
Difference between Dispersion and Skewness
Dispersion is concerned with the amount of variation rather than with its direction. Skewness tells us about the direction of the variation or the departure from symmetry. In fact, measures of skewness are dependent upon the amount of dispersion.
It may be noted that although skewness is an important characteristic for defining the precise pattern of a distribution, it is rarely calculated in business and economic series. Variation is by far the most important characteristic of a distribution.
TESTS OF SKEWNESS
In order to ascertain whether a distribution is skewed or not the following tests may be applied. Skewness is present if:
1. The values of mean, median and mode do not coincide.
2. When the data are plotted on a graph they do not give the normal bell-shaped form, i.e., when cut along a vertical line through the centre the two halves are not equal.
3. The sum of the positive deviations from the median is not equal to the sum of the negative deviations.
4. Quartiles are not equidistant from the median.
5. Frequencies are not equally distributed at points of equal deviation from the mode.
Conversely stated, when skewness is absent, i.e., in case of a symmetrical
MEASURES OF SKEWNESS
Measures of skewness tell us the direction and extent of asymmetry in a series, and permit us to compare two or more series with regard to these. They may either be absolute or relative.
When skewness is based on the mean and the mode, the absolute measure is Sk = X̄ − Mode. If the value of mean is greater than mode skewness will be positive, i.e., we get a plus sign in the above formula. Conversely, if the value of mode is greater than mean, we shall get a minus sign, meaning that skewness is negative.
Thus the distance between the mean and the mode could be used to measure skewness: the greater this distance, whether positive or negative, the more asymmetrical the distribution. However, such a measure is unsatisfactory on two counts:
1. It would be expressed in the unit of value of the distribution and could, therefore, not be compared with another comparable series expressed in different units.
2. Distributions vary greatly and the difference between, say, the Mean and the Mode in absolute terms might be considerable in one
$$Sk_p = \frac{\text{Mean} - \text{Mode}}{\sigma}$$

Sk_p = Karl Pearson's coefficient of skewness.
There is no limit to this measure in theory and this is a slight drawback. But in practice the value given by this formula is rarely very high and usually lies between ± 1.
When a distribution is symmetrical, the values of mean, median and mode coincide and, therefore, the coefficient of skewness will be zero. When a distribution is positively skewed, the coefficient of skewness shall have plus sign and when it is negatively skewed, the coefficient of skewness shall have minus sign. The degree of skewness shall be obtained by the numerical value, say, 0.8 or 0.2 etc. Thus, this formula gives both the direction as well as the extent of skewness.
The above method of measuring skewness cannot be used where mode is ill-defined. However, in moderately skewed distribution the averages have the following relationship:
Mode = 3 Median − 2 Mean

$$Sk_p = \frac{\bar{X} - (3\,\text{Med.} - 2\bar{X})}{\sigma} = \frac{\bar{X} - 3\,\text{Med.} + 2\bar{X}}{\sigma} = \frac{3(\bar{X} - \text{Med.})}{\sigma}$$

Theoretically, the value of this coefficient varies between ± 3; however, in practice it is rare that the coefficient of skewness obtained by the above method exceeds ± 1.
Median
farther r o posslbln
Henr a
skrwness
$$Sk_B = \frac{(Q_3 - \text{Med.}) - (\text{Med.} - Q_1)}{(Q_3 - \text{Med.}) + (\text{Med.} - Q_1)} = \frac{Q_3 + Q_1 - 2\,\text{Med.}}{Q_3 - Q_1}$$

Sk_B = Bowley's coefficient of skewness.
It must be remembered that the results obtained by these two measures are not to be compared with one another. Especially, the numerical values are not related to one another, since Bowley's measure, because of its computational basis, is limited to values between −1 and +1, while Pearson's measure has no such limits. Not only do the numerical values obtained from these two formulae bear no necessary relationship to one another but, on rare occasions, with unusually shaped distributions, it is possible for them to emerge with opposite signs.
Kelly's Coefficient of Skewness
Bowley's measure discussed above neglects the two extreme quarters of the data. It would be better for a measure to cover the entire data, especially because in measuring skewness we are often interested in the more extreme items. Bowley's measure can be extended by taking any two deciles equidistant from the median or any two percentiles equidistant from the median. Kelly has suggested the following formula for measuring skewness based upon the 10th and the 90th percentiles (or the first and ninth deciles):

$$Sk_K = \frac{P_{10} + P_{90} - 2\,\text{Med.}}{P_{90} - P_{10}} \quad \text{also} \quad Sk_K = \frac{D_1 + D_9 - 2\,\text{Med.}}{D_9 - D_1}$$

Sk_K = Kelly's coefficient of skewness.
This measure of skewness has one theoretical attraction if skewness is to be based on percentiles. However, this method is not popular in practice and generally Karl Pearson's method is used.
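Bowley's and Kelly's coefficients share the same form and differ only in the positional values used. The sketch below is a minimal illustration; the function names and the quartile and percentile figures are hypothetical and are not taken from the text.

```python
def bowley_skewness(q1, med, q3):
    """Sk_B = (Q3 + Q1 - 2 Med) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * med) / (q3 - q1)

def kelly_skewness(p10, med, p90):
    """Sk_K = (P90 + P10 - 2 Med) / (P90 - P10)."""
    return (p90 + p10 - 2 * med) / (p90 - p10)

# Hypothetical positional values for a positively skewed series.
q1, med, q3 = 20, 28, 40
p10, p90 = 12, 56

sk_b = bowley_skewness(q1, med, q3)
sk_k = kelly_skewness(p10, med, p90)
print(round(sk_b, 3), round(sk_k, 3))

# When the quartiles are equidistant from the median, the coefficient is zero.
print(bowley_skewness(20, 30, 40))
```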
Measure of Skewness Based on the Third Moment
A measure of skewness may be obtained by making use of the third moment about the mean. This would be discussed under the head Moments. The following illustrations shall explain the application of the above methods.
Illustration 1. Calculate Pearson's coefficient of skewness:
X:   12.5   17.5   22.5   27.5   32.5   37.5   42.5   47.5
f:   28     42     54     108    129    61     45     33
(B.Com., Kerala Univ., 2006)
Solution. CALCULATION OF COEFFICIENT OF SKEWNESS
X       f      d = (X − 27.5)/5   fd      fd²
12.5    28     −3                 −84     252
17.5    42     −2                 −84     168
22.5    54     −1                 −54     54
27.5    108    0                  0       0
32.5    129    +1                 +129    129
37.5    61     +2                 +122    244
42.5    45     +3                 +135    405
47.5    33     +4                 +132    528
        N = 500                   Σfd = 296   Σfd² = 1780

$$\text{Coeff. of Sk.} = \frac{\text{Mean} - \text{Mode}}{\sigma}$$

$$\text{Mean: } \bar{X} = A + \frac{\sum f d}{N} \times i$$

A = 27.5, Σfd = 296, N = 500, i = 5.

$$\bar{X} = 27.5 + \frac{296}{500} \times 5 = 30.46$$

Mode: Since the maximum frequency is 129, the corresponding value of X, i.e., 32.5 is the modal value.

$$\text{S.D.: } \sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i$$

Σfd² = 1780, N = 500, Σfd = 296, i = 5

$$\sigma = \sqrt{\frac{1780}{500} - \left(\frac{296}{500}\right)^2} \times 5 = \sqrt{3.56 - 0.35} \times 5 = 8.96$$

$$\text{Coeff. of Sk.} = \frac{30.46 - 32.5}{8.96} = \frac{-2.04}{8.96} = -0.228$$
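The calculation in Illustration 1 can be sketched as follows. Taking the mode as the value with the highest frequency follows the solution above; the variable names are illustrative.

```python
from math import sqrt

values = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]
freqs = [28, 42, 54, 108, 129, 61, 45, 33]
A, i = 27.5, 5  # assumed mean and common factor

n = sum(freqs)
steps = [(x - A) / i for x in values]          # d = (X - 27.5) / 5
sum_fd = sum(f * d for f, d in zip(freqs, steps))
sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))

mean = A + sum_fd / n * i
sigma = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * i

# Discrete series: the mode is the value carrying the maximum frequency.
mode = values[freqs.index(max(freqs))]

skewness = (mean - mode) / sigma
print(round(mean, 2), mode, round(sigma, 2), round(skewness, 3))
```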
Illustration 2. Calculate Karl Pearson's coefficient of skewness from the following data:
Profits (Rs. Lakhs)   No. of Cos.   Profits (Rs. Lakhs)   No. of Cos.
70-80                 12            110-120               50
80-90                 18            120-130               45
90-100                35            130-140               30
100-110               42            140-150               8
(M.Com., Nalanda Univ., 2013)
Solution. CALCULATION OF COEFF. OF SKEWNESS BY KARL PEARSON'S METHOD
Profits (Rs. Lakhs)   m.p. m   f     d = (m − 115)/10   fd     fd²
70-80                 75       12    −4                 −48    192
80-90                 85       18    −3                 −54    162
90-100                95       35    −2                 −70    140
100-110               105      42    −1                 −42    42
110-120               115      50    0                  0      0
120-130               125      45    +1                 +45    45
130-140               135      30    +2                 +60    120
140-150               145      8     +3                 +24    72
                               N = 240                  Σfd = −85   Σfd² = 773

$$\bar{X} = A + \frac{\sum f d}{N} \times i = 115 - \frac{85}{240} \times 10 = 115 - 3.54 = 111.46$$

By inspection mode lies in the class 110-120.

$$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

L = 110, Δ₁ = |50 − 42| = 8, Δ₂ = |50 − 45| = 5, i = 10

$$\text{Mode} = 110 + \frac{8}{8 + 5} \times 10 = 110 + 6.15 = 116.15$$

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2} \times i = \sqrt{\frac{773}{240} - \left(\frac{-85}{240}\right)^2} \times 10 = \sqrt{3.221 - 0.125} \times 10 = 1.7595 \times 10 = 17.595$$

$$\text{Coeff. of Sk.} = \frac{111.46 - 116.15}{17.595} = \frac{-4.69}{17.595} = -0.267$$
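For a continuous series the mode is found by interpolation within the modal class. The sketch below reproduces Illustration 2 under the assumption that the modal class is the one with the highest frequency and is neither the first nor the last class; variable names are illustrative.

```python
from math import sqrt

# (lower limit, upper limit, frequency): profits in Rs. lakhs
classes = [(70, 80, 12), (80, 90, 18), (90, 100, 35), (100, 110, 42),
           (110, 120, 50), (120, 130, 45), (130, 140, 30), (140, 150, 8)]
A, i = 115, 10  # assumed mean and class width

freqs = [f for _, _, f in classes]
n = sum(freqs)
steps = [((lo + hi) / 2 - A) / i for lo, hi, _ in classes]
sum_fd = sum(f * d for f, d in zip(freqs, steps))
sum_fd2 = sum(f * d * d for f, d in zip(freqs, steps))

mean = A + sum_fd / n * i
sigma = sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * i

# Mode = L + delta1 / (delta1 + delta2) * i, within the modal class.
k = freqs.index(max(freqs))
delta1 = abs(freqs[k] - freqs[k - 1])
delta2 = abs(freqs[k] - freqs[k + 1])
mode = classes[k][0] + delta1 / (delta1 + delta2) * i

skewness = (mean - mode) / sigma
print(round(mean, 2), round(mode, 2), round(sigma, 3), round(skewness, 3))
```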
Solution. Rearranging the given data in ascending order and calculating Karl Pearson's
coefficient of skewness.
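Class    m.p. m   f     d = (m − 35)/10   fd     fd²
30-40    35       21    0                 0      0
40-50    45       35    +1                +35    35
50-60    55       30    +2                +60    120
60-70    65       22    +3                +66    198
70-80    75       11    +4                +44    176
                  N = 141                 Σfd = 167   Σfd² = 609

$$\text{Coeff. of Sk.} = \frac{\text{Mean} - \text{Mode}}{\sigma}$$

$$\text{Mean: } \bar{X} = A + \frac{\sum f d}{N} \times i$$

A = 35, Σfd = 167, N = 141, i = 10

$$\bar{X} = 35 + \frac{167}{141} \times 10 = 35 + 11.84 = 46.84$$

Mode: By inspection mode lies in the class 40-50.

$$\text{Mo} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times i$$

L = 40, Δ₁ = |35 − 21| = 14, Δ₂ = |35 − 30| = 5, i = 10

$$\text{Mo} = 40 + \frac{14}{14 + 5} \times 10 = 40 + 7.37 = 47.37$$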