Measures of Dispersion or Variation: Vijay - Gahlawat@yahoo - Co.in


Measures of Dispersion or Variation

Dr. Vijay Kumar


Dispersion: The degree to which data tend to spread
about an average value is called the variation or
dispersion of the data.

Measures of Dispersion: Techniques used to measure
the extent of variation, that is, the deviation of
each value in the data set from a measure of central
tendency (mean, median).

Why do we need measures of variation?

• To determine the reliability of an average.
• To serve as a basis for the control of variability.
• To compare two or more series with regard to their
variability.
• To facilitate the use of other statistical measures
(correlation analysis, the testing of hypotheses, the
analysis of fluctuations, techniques of production
control, cost control, etc.).

Properties of a good measure of dispersion
• It should be simple to understand.
• It should be easy to compute.
• It should be rigidly defined.
• It should be based on each and every observation of
the distribution.
• It should be capable of further algebraic treatment.
• It should have sampling stability.
• It should not be unduly affected by extreme values.
Methods of studying variation / dispersion

1. The Range
2. The Interquartile Range
3. The Average Deviation
4. The Standard Deviation
5. The Lorenz Curve

Absolute measures of variation

Absolute measures of variation are expressed in the
same statistical unit in which the original data are
given, such as rupees, kilograms, meters, or tonnes.
These values may be used to compare the variation
in two or more distributions, provided the
variables are expressed in the same units and have
almost the same average value.

Relative measures of variation
When two data sets are expressed in different
units (for example, quintals of sugar versus tonnes
of sugarcane), or when their average values differ
greatly (such as a manager's salary versus a worker's
salary), relative measures of variation are
used.
A measure of relative variation is the ratio of a
measure of absolute variation to an average. It is
sometimes called a coefficient of variation (a pure
number independent of the unit of measurement).

1. The Range

$\text{Range} = L - S$
L = Largest Value and
S = Smallest Value

The relative measure corresponding to range, or the
coefficient of range, is

$\text{Coefficient of range} = \dfrac{L - S}{L + S}$

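A minimal Python sketch of the range and the coefficient of range as defined above. The function names and the share-price data are illustrative assumptions, not taken from the slides.

```python
def value_range(values):
    """Absolute measure: largest value minus smallest value (L - S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Relative measure: (L - S) / (L + S), a unit-free number."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


# Hypothetical daily share prices, in rupees
prices = [200, 210, 208, 160, 220, 250]

print(value_range(prices))           # 250 - 160 = 90
print(coefficient_of_range(prices))  # 90 / 410
```

Only the two extreme values enter either formula, which is why the range says nothing about how the remaining observations are spread.
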
Merits/Advantages of Range

• Range is the simplest to understand and the
easiest to compute among all the methods of
studying variation.
• It takes minimum time to calculate the value of
range.
• For getting a quick rather than a very accurate
picture of variability, one may compute range.

Demerits/Disadvantages of Range

• Range is not based on each and every observation
of the distribution.
• It is subject to fluctuations of considerable
magnitude from sample to sample.
• Range cannot be computed in case of open-end
distributions.
• Range cannot tell us anything about the character
of the distribution between the two extreme observations.

Uses of Range

• Quality Control
• Fluctuations in the share market
• Weather Forecasts

2. Interquartile Range
The range that includes the middle 50% of the observations.

$\text{Interquartile range} = Q_3 - Q_1$

Semi-interquartile range or quartile deviation:

$\text{Q.D.} = \dfrac{Q_3 - Q_1}{2}$

The relative measure corresponding to this measure:

$\text{Coefficient of Q.D.} = \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$

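The three quartile-based measures above can be sketched in Python as follows. The helper name `quartile_measures` and the data are illustrative assumptions. The quartiles come from the standard library's `statistics.quantiles` (Python 3.8+) with its default "exclusive" method; textbooks use several quartile conventions, so hand-computed values of Q1 and Q3 may differ slightly.

```python
import statistics


def quartile_measures(q1, q3):
    """Return (interquartile range, Q.D., coefficient of Q.D.)."""
    iqr = q3 - q1
    return iqr, iqr / 2, iqr / (q3 + q1)


data = [2, 4, 6, 8, 10, 12, 14]  # illustrative data

# n=4 splits the data into quartiles and returns the cut points Q1, Q2, Q3
q1, _median, q3 = statistics.quantiles(data, n=4)

iqr, qd, coeff_qd = quartile_measures(q1, q3)
print(q1, q3)             # 4.0 12.0
print(iqr, qd, coeff_qd)  # 8.0 4.0 0.5
```
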
Merits of Q.D.

• It is superior to range.
• Q.D. can be calculated in case of open-end
distributions.
• It is not affected by the presence of extreme
observations.

Demerits of Q.D.
• It ignores 50% of the observations.
• It is not capable of further mathematical
manipulation.
• Its value is very much affected by sampling
fluctuations.
• It is not a true measure of variation, as it does not
show the scatter around an average but rather a
distance on the scale; i.e. Q.D. is not itself
measured from an average, but is a positional average.

3. The Mean Absolute Deviation / Average
Deviation
A.D. is obtained by calculating the absolute
deviations of each observation from the median or
mean, and then averaging these deviations by
taking their arithmetic mean.
Computation of Mean Absolute Deviation:
Ungrouped Data

$\text{MAD (Mean)} = \dfrac{\sum |X - \bar{X}|}{N}$

or

$\text{MAD (Median)} = \dfrac{\sum |X - \text{Med.}|}{N}$

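A short Python sketch of the ungrouped mean absolute deviation, taken once about the mean and once about the median. The function name `mad` and the data are illustrative assumptions.

```python
import statistics


def mad(values, center):
    """Mean absolute deviation of `values` about `center`."""
    return sum(abs(x - center) for x in values) / len(values)


data = [4, 6, 8, 10, 22]  # illustrative data

mean = statistics.mean(data)      # 50 / 5 = 10
median = statistics.median(data)  # middle value = 8

mad_mean = mad(data, mean)        # (6 + 4 + 2 + 0 + 12) / 5 = 4.8
mad_median = mad(data, median)    # (4 + 2 + 0 + 2 + 14) / 5 = 4.4
print(mad_mean, mad_median)
```

In this example the deviation about the median is the smaller of the two, which is the general pattern: the sum of absolute deviations is least when taken from the median.
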
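Computation of Mean Absolute Deviation: Grouped Data

$\text{MAD (Mean)} = \dfrac{\sum f\,|X - \bar{X}|}{N}$

or

$\text{MAD (Median)} = \dfrac{\sum f\,|X - \text{Med.}|}{N}$

where N is the sum of the frequencies, i.e. $N = \sum f$.

The relative measure corresponding to this measure:

$\text{Coefficient of MAD (Mean)} = \dfrac{\text{MAD (Mean)}}{\bar{X}}$

For the median:

$\text{Coefficient of MAD (Median)} = \dfrac{\text{MAD (Median)}}{\text{Med.}}$
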
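For grouped data the same calculation is weighted by the frequencies. The sketch below assumes X is the class mid-point. The frequency table and the function name `grouped_mad` are hypothetical.

```python
def grouped_mad(midpoints, freqs, center):
    """Frequency-weighted mean absolute deviation about `center`."""
    n = sum(freqs)  # N = sum of the frequencies
    return sum(f * abs(x - center) for x, f in zip(midpoints, freqs)) / n


# Hypothetical frequency table: class mid-points X and frequencies f
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

n = sum(freqs)                                           # 10
mean = sum(f * x for x, f in zip(midpoints, freqs)) / n  # 190 / 10 = 19.0

mad_mean = grouped_mad(midpoints, freqs, mean)  # (28 + 12 + 24 + 16) / 10 = 8.0
coeff_mad_mean = mad_mean / mean                # relative measure, 8 / 19
print(mean, mad_mean, coeff_mad_mean)
```

Passing the median instead of the mean as `center` gives MAD (Median), and dividing that by the median gives its coefficient.
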
Merits of A.D.

• It is simple to understand and easy to compute.
• It is based on each and every observation of
the data.
• A.D. is less affected by the values of extreme
observations.

Demerits of A.D.
• The greatest drawback is that algebraic signs are
ignored.
• This method may not give very accurate results
(A.D. gives the best results when deviations
are taken from the median).
• It is not capable of further algebraic treatment.
• It is rarely used in sociological and business studies.

4. The Standard Deviation

The most widely used measure for studying variation. Its
significance lies in the fact that it is free from those
defects from which the earlier methods suffer and
satisfies most of the properties of a good measure of
variation.
It is a measure of how much spread or variability is
present in the sample.

Standard deviation is also known as root mean
square deviation because it is the square
root of the mean of the squared deviations from the
arithmetic mean. Standard deviation is denoted by the
small Greek letter σ (read as sigma) and is defined
as

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{N}}$

If we square the standard deviation, we get the variance.
Hence variance $= \sigma^2$, or $\sigma = \sqrt{\text{var.}}$

Calculation of SD - Ungrouped Data:-

(a) Deviations taken from Actual Mean:-

$\sigma = \sqrt{\dfrac{\sum (x - \bar{x})^2}{N}}$

or

$\sigma = \sqrt{\dfrac{\sum x^2}{N} - \left(\dfrac{\sum x}{N}\right)^2}$

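The two ungrouped formulas above are algebraically equivalent. A minimal Python sketch, with illustrative function names and data, shows them giving the same result. Both use the divisor N, as on the slides.

```python
import math


def sd_from_mean(values):
    """sigma = sqrt( sum((x - mean)^2) / N )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_shortcut(values):
    """sigma = sqrt( sum(x^2)/N - (sum(x)/N)^2 )."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, mean = 5

print(sd_from_mean(data))  # sqrt(32 / 8) = 2.0
print(sd_shortcut(data))   # sqrt(232 / 8 - 25) = 2.0
```

The standard library's `statistics.pstdev` computes the same divisor-N quantity, while `statistics.stdev` divides by N - 1 and will give a slightly larger value.
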
(b) Deviations taken from Assumed Mean:-

$\sigma = \sqrt{\dfrac{\sum d^2}{N} - \left(\dfrac{\sum d}{N}\right)^2}$

where $d = x - A$
and A is the assumed mean or arbitrary point.

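A brief Python sketch of the assumed-mean method, with an illustrative function name and data. It suggests that the choice of A does not change the result; A only needs to keep the deviations d small and convenient.

```python
import math


def sd_assumed_mean(values, assumed_mean):
    """sigma = sqrt( sum(d^2)/N - (sum(d)/N)^2 ), with d = x - A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(di * di for di in d) / n - (sum(d) / n) ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data

# Three different arbitrary points A all give sigma = 2.0
for a in (0, 4, 6):
    print(a, sd_assumed_mean(data, a))
```
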
Calculation of SD - Grouped Data
Deviations taken from Actual Mean:-

$\sigma = \sqrt{\dfrac{\sum f (x - \bar{x})^2}{N}} = \sqrt{\dfrac{\sum f x^2}{N} - \left(\dfrac{\sum f x}{N}\right)^2}$

where f is the frequency
and $N = \sum f$.

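A Python sketch of both grouped-data forms, assuming x is the class mid-point. The frequency table and function names are hypothetical.

```python
import math


def grouped_sd(midpoints, freqs):
    """sigma = sqrt( sum(f * (x - mean)^2) / N ), with N = sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    return math.sqrt(sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n)


def grouped_sd_shortcut(midpoints, freqs):
    """sigma = sqrt( sum(f * x^2)/N - (sum(f * x)/N)^2 )."""
    n = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(midpoints, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))
    return math.sqrt(sum_fx2 / n - (sum_fx / n) ** 2)


# Hypothetical frequency table: class mid-points x and frequencies f
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

print(grouped_sd(midpoints, freqs))           # sqrt(840 / 10) = sqrt(84)
print(grouped_sd_shortcut(midpoints, freqs))  # sqrt(445 - 361) = sqrt(84)
```
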
Deviations taken from Assumed Mean:-

$\sigma = \sqrt{\dfrac{\sum f d^2}{N} - \left(\dfrac{\sum f d}{N}\right)^2} \times h$

where $d = \dfrac{x - A}{h}$,
A is the assumed mean or arbitrary point,
h is the class interval, and $N = \sum f$.

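A Python sketch of the step-deviation method, assuming equal class widths h and class mid-points x. The table, the choice A = 15, and the function name are hypothetical.

```python
import math


def grouped_sd_step(midpoints, freqs, assumed_mean, h):
    """sigma = sqrt( sum(f*d^2)/N - (sum(f*d)/N)^2 ) * h, with d = (x - A) / h."""
    n = sum(freqs)
    d = [(x - assumed_mean) / h for x in midpoints]
    sum_fd = sum(f * di for di, f in zip(d, freqs))
    sum_fd2 = sum(f * di * di for di, f in zip(d, freqs))
    return math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2) * h


# Hypothetical frequency table with class interval h = 10
midpoints = [5, 15, 25, 35]
freqs = [2, 3, 4, 1]

# With A = 15: d = -1, 0, 1, 2; sum(f*d) = 4; sum(f*d^2) = 10
sigma = grouped_sd_step(midpoints, freqs, assumed_mean=15, h=10)
print(sigma)  # sqrt(1 - 0.16) * 10 = sqrt(84)
```

Dividing by h keeps the working figures small, and the final multiplication by h restores the original scale.
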
Coefficient of Variation
The relative measure corresponding to S.D. is known as the
coefficient of variation. It is the most commonly used
measure of relative variation.
It is used in problems where we want to compare
the variability of two or more series.
The coefficient of variation, denoted by C.V., is obtained as
follows:

$\text{C.V.} = \dfrac{\sigma}{\bar{x}} \times 100$

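A small Python sketch comparing two series by C.V. The salary figures are hypothetical, and σ is computed with the divisor N via `statistics.pstdev`.

```python
import statistics


def coefficient_of_variation(values):
    """C.V. = (sigma / mean) * 100, using the divisor-N standard deviation."""
    return statistics.pstdev(values) / statistics.mean(values) * 100


# Hypothetical monthly salaries, in rupees
workers = [9000, 10000, 11000]
managers = [95000, 100000, 105000]

cv_workers = coefficient_of_variation(workers)
cv_managers = coefficient_of_variation(managers)

print(statistics.pstdev(workers), statistics.pstdev(managers))
print(cv_workers, cv_managers)
```

Here the managers' salaries have the larger S.D. in rupees, but the workers' salaries have the larger C.V., so relative to their average the workers' salaries are the more variable series.
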
Mathematical Properties of S.D.

• We can obtain the combined S.D. of two or more groups.
• The sum of the squares of the deviations of all the
observations from their arithmetic mean is
minimum. In other words, the sum of the squares
of the deviations taken from any value other than
the A.M. would always be greater.
• S.D. is independent of change of origin but not
of scale.

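The three properties can be checked numerically. The sketch below is illustrative only: the slides do not state a combined-S.D. formula, so the standard two-group formula, $\sigma_{12} = \sqrt{\dfrac{N_1(\sigma_1^2 + d_1^2) + N_2(\sigma_2^2 + d_2^2)}{N_1 + N_2}}$ with $d_i = \bar{x}_i - \bar{x}_{12}$, is assumed. Data and function names are hypothetical.

```python
import math
import statistics


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """S.D. of two groups pooled together, from their sizes, means and S.D.s."""
    mean12 = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1, d2 = mean1 - mean12, mean2 - mean12
    return math.sqrt((n1 * (sd1**2 + d1**2) + n2 * (sd2**2 + d2**2)) / (n1 + n2))


def sum_sq_dev(values, point):
    """Sum of squared deviations of `values` from `point`."""
    return sum((x - point) ** 2 for x in values)


group1, group2 = [2, 4, 4, 4], [5, 5, 7, 9]
data = group1 + group2
sigma = statistics.pstdev(data)

# 1. Combined S.D. from the group summaries equals the S.D. of the pooled data
combined = combined_sd(
    len(group1), statistics.mean(group1), statistics.pstdev(group1),
    len(group2), statistics.mean(group2), statistics.pstdev(group2),
)
print(combined, sigma)

# 2. The sum of squared deviations is smallest about the arithmetic mean (5)
print(sum_sq_dev(data, 5), sum_sq_dev(data, 4), sum_sq_dev(data, 6))  # 32 40 40

# 3. Change of origin leaves S.D. unchanged; change of scale multiplies it
shifted = [x + 100 for x in data]
scaled = [3 * x for x in data]
print(statistics.pstdev(shifted), statistics.pstdev(scaled))
```
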
Merits of S.D.
• S.D. is the best measure of variation because of its
mathematical characteristics.
• It is based on every observation of the distribution.
• It is capable of further algebraic treatment.
• It is less affected by sampling fluctuations.
• For comparing the variability of two or more
distributions, the coefficient of variation is considered
to be most appropriate, and this measure is based on
the mean and S.D.
• S.D. is most prominently used in further
statistical work (skewness, correlation, etc.).
• It is a keynote in sampling and provides a unit
of measurement for the normal distribution.

Demerits of S.D.

• Compared to other measures, it is difficult to
compute. However, this does not reduce the
importance of this measure, because of the high
degree of accuracy of the results it gives.
• It gives more weight (importance) to extreme
values and less to those which are near the mean.

5. Lorenz Curve
• It is a graphic method of studying variation. It was
devised by Max O. Lorenz (economic statistician).
This curve was used by him for the first time to
measure the distribution of wealth and income.
• Now the curve is also used to study the distribution of
profits, wages, turnovers, etc.
• The most common use of this curve is in the study
of the degree of inequality in the distribution of
income and wealth between countries or between
different periods of time.
• It is a cumulative percentage curve in which
the percentage of items is combined with the percentage of
other things such as wealth, profits, turnovers,
etc.

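A minimal Python sketch of the points that make up a Lorenz curve: the cumulative percentage of items against the cumulative percentage of the total, with items sorted from smallest to largest. The income figures and the function name are hypothetical, and plotting is left out.

```python
def lorenz_points(values):
    """Return (cumulative % of items, cumulative % of total) pairs from (0, 0)."""
    ordered = sorted(values)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(ordered, start=1):
        running += value
        points.append((100 * i / n, 100 * running / total))
    return points


incomes = [10, 20, 30, 40, 100]  # hypothetical incomes of five people

for pct_people, pct_income in lorenz_points(incomes):
    print(pct_people, pct_income)
```

With perfectly equal incomes every point would lie on the diagonal where the two percentages are equal. In this example the poorest 80% of people hold only 50% of the income, so the curve sags below that line, and the further it sags, the greater the inequality.
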
