Chapter 4
Measures of Dispersion
Suppose we compare two different data sets (for now, measured in the same
units, e.g. kg, km, etc.). By chance, it may happen that the two data
sets have the same mean, median or mode.
Does this mean that the two data sets are the same, or that they have the same
features?
No.
Here we need some extra insight into the data; as a first step, we need to
measure their respective dispersions (variabilities) about the center and
then compare them.
What are Absolute Dispersion and Relative
Dispersion
Absolute & relative dispersion are two different ways to measure the
spread of a data set. They are used extensively in biological statistics, as
biological phenomena almost always show some variation and spread.
The easiest way to tell absolute dispersion from relative dispersion is to check
whether the statistic carries units. Absolute measures always have the units of the
data, while relative measures are unit-free.
Most commonly used absolute measures of
dispersion
1. Range
2. Mid-range
3. Inter-quartile Range (also called the fourth-spread)
4. Semi-inter-quartile Range (or Quartile Deviation)
5. Mean Deviation
6. Variance
7. Standard Deviation
1: Range
The range R is defined as:
the difference between the largest and smallest observations in a set of
data.
Symbolically, it is given by the relation:
$R = x_{\max} - x_{\min}$
where
$x_{\max}$ stands for the largest observation
$x_{\min}$ stands for the smallest observation
Mid-range
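It is just the average of the two extreme values, i.e.
$\text{mid-range} = \frac{\text{largest observation} + \text{smallest observation}}{2} = \frac{x_{\max} + x_{\min}}{2}$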
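As a quick illustration, the range and mid-range can be computed in a few lines of Python. This is only a sketch: the variable names (`x_max`, `x_min`, `data_range`, `mid_range`) are chosen here for illustration, and the observations are the ones used in the ungrouped worked example of this chapter.

```python
# Observations taken from the ungrouped worked example in this chapter.
data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

x_max = max(data)  # largest observation
x_min = min(data)  # smallest observation

data_range = x_max - x_min       # range: largest minus smallest
mid_range = (x_max + x_min) / 2  # average of the two extreme values

print(data_range, mid_range)
```

Both quantities depend only on the two extreme observations, so a single unusual value can change them a lot.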
Inter-quartile Range
The inter-quartile range is a measure of spread and is defined as:
the difference between the third and first quartiles.
It is denoted by IQR; symbolically,
$\text{IQR} = Q_3 - Q_1$
Quartile Deviation
The quartile deviation is a measure of spread defined as half of the inter-quartile range. It is denoted by Q.D.; symbolically,
$\text{Q.D.} = \frac{Q_3 - Q_1}{2}$
It is also called the semi-inter-quartile range (SIQR) because it is just half of the IQR.
Co-efficient of Quartile Deviation
The pure measure (free of units of measurement) is the co-efficient of
quartile deviation.
It is defined as
$\text{Co-efficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$
This measure is free of measurement units and can be used to compare
two or more data sets with different units of measurement.
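The three quartile-based measures can be sketched in Python as follows. The notes do not say which rule is used to locate the quartiles, so this example assumes the "exclusive" method of the standard library function `statistics.quantiles`; other conventions can give slightly different values of Q1 and Q3 for the same data. The data are the observations from the ungrouped worked example of this chapter, and the variable names are illustrative only.

```python
import statistics

data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

# With n=4 the function returns the three quartile cut points Q1, Q2, Q3.
# Assumption: the "exclusive" interpolation rule; other rules exist.
q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")

iqr = q3 - q1                    # inter-quartile range
qd = iqr / 2                     # quartile deviation (semi-inter-quartile range)
coef_qd = (q3 - q1) / (q3 + q1)  # unit-free co-efficient of quartile deviation

print(q1, q3, iqr, qd, round(coef_qd, 3))
```

IQR and Q.D. carry the units of the data, while the co-efficient is a pure number, which is what makes it usable across data sets measured in different units.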
Mean Deviation:
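Mean (or median) deviation (MD), also called mean absolute deviation (MAD), is a
measure of dispersion defined as
the average of the absolute differences (deviations) between the data values and
the data center (usually the mean or the median).
Mathematically, using the mean $\bar{x}$ as the data center,
Mean deviation from mean
For ungrouped data:
$\text{M.D.} = \frac{\sum |x - \bar{x}|}{n}$
For grouped data:
$\text{M.D.} = \frac{\sum f\,|x - \bar{x}|}{\sum f}$
Mean deviation from median (with $\tilde{x}$ denoting the median):
For ungrouped data:
$\text{M.D. (median)} = \frac{\sum |x - \tilde{x}|}{n}$
For grouped data:
$\text{M.D. (median)} = \frac{\sum f\,|x - \tilde{x}|}{\sum f}$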
Example
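Find the mean deviation from the mean (MD) and from the median (MedD) for the following simple data.
65 55 89 56 35 14 56 55 87 45 92
Solution:
Let us denote the data by x.
What we need first are the mean and the median. The mean is
$\bar{x} = \frac{\sum x}{n} = \frac{649}{11} = 59$
Since n = 11 is odd, the median is just the middle (6th) observation of the ordered data,
14 35 45 55 55 56 56 65 87 89 92
hence the median is $\tilde{x} = 56$.
$\text{M.D.} = \frac{\sum |x - \bar{x}|}{n} = \frac{194}{11} \approx 17.6$
$\text{M.D. (median)} = \frac{\sum |x - \tilde{x}|}{n} = \frac{185}{11} \approx 16.8$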
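The arithmetic of this example can be checked with a short Python sketch. The helper name `mean_deviation` is hypothetical (it is not a library function); it simply averages the absolute deviations from whichever center is passed in.

```python
import statistics

def mean_deviation(values, center):
    """Average of the absolute deviations of the values from a chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

mean = statistics.mean(data)      # 649 / 11
median = statistics.median(data)  # middle value of the sorted data

md_mean = mean_deviation(data, mean)      # deviation about the mean
md_median = mean_deviation(data, median)  # deviation about the median

print(mean, median, round(md_mean, 1), round(md_median, 1))
```

The deviation about the median comes out smaller than the deviation about the mean, which is expected: the median is the center that minimizes the average absolute deviation.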
Example
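Find the mean deviation from the mean (MD) and from the median (MedD) for the following grouped data.
x: 14 35 45 55 56 65 87 89 92
f:  4  7 11 13 18 13  8  6  3
Solution: Again, we first need the mean and the median to calculate the necessary columns. With total frequency $\sum f = 83$, the mean is
$\bar{x} = \frac{\sum f x}{\sum f} = \frac{4870}{83} \approx 58.7$
and the median is the 42nd ordered observation,
$\tilde{x} = 56$
$\text{M.D.} = \frac{\sum f\,|x - \bar{x}|}{\sum f} \approx \frac{1181.5}{83} \approx 14.2$
$\text{M.D. (median)} = \frac{\sum f\,|x - \tilde{x}|}{\sum f} = \frac{1120}{83} \approx 13.5$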
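The grouped calculation can be sketched the same way. The helper names `grouped_mean_deviation` and `grouped_median` are hypothetical; the median helper is a simplification that assumes the values are listed in ascending order and that the total frequency is odd, as it is here.

```python
def grouped_mean_deviation(values, freqs, center):
    """Frequency-weighted average of absolute deviations from the center."""
    total = sum(freqs)
    return sum(f * abs(x - center) for x, f in zip(values, freqs)) / total

def grouped_median(values, freqs):
    """Value at which the cumulative frequency first reaches (n + 1) / 2.

    Assumes values are in ascending order and the total frequency is odd.
    """
    half = (sum(freqs) + 1) / 2
    cumulative = 0
    for x, f in zip(values, freqs):
        cumulative += f
        if cumulative >= half:
            return x

x = [14, 35, 45, 55, 56, 65, 87, 89, 92]
f = [4, 7, 11, 13, 18, 13, 8, 6, 3]

n = sum(f)                                        # total frequency
mean = sum(xi * fi for xi, fi in zip(x, f)) / n   # weighted mean
median = grouped_median(x, f)

md_mean = grouped_mean_deviation(x, f, mean)
md_median = grouped_mean_deviation(x, f, median)

print(n, round(mean, 1), median, round(md_mean, 1), round(md_median, 1))
```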
Variance
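Variance is defined as:
the mean of the squared deviations of all the observations from the mean.
The population variance is denoted by $\sigma^2$;
the sample variance is denoted by $S^2$.
Mathematically,
For small samples (n <= 30)
$S^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$ for ungrouped data
$S^2 = \frac{\sum f\,(x - \bar{x})^2}{\sum f - 1}$ for grouped data
For large samples (n > 30)
$S^2 = \frac{\sum (x - \bar{x})^2}{n}$ for ungrouped data
$S^2 = \frac{\sum f\,(x - \bar{x})^2}{\sum f}$ for grouped data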
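The small-sample and large-sample cases can be compared in a short Python sketch. It assumes the common convention that the sum of squared deviations is divided by n - 1 for small samples and by n for large samples; the function name `variance` and its `small_sample` flag are illustrative, not part of any library. The data are the observations from the ungrouped mean-deviation example.

```python
import math
import statistics

def variance(values, small_sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 or n.

    Assumption: divisor n - 1 for small samples, n for large samples.
    """
    n = len(values)
    mean = sum(values) / n
    squared = sum((x - mean) ** 2 for x in values)
    return squared / (n - 1) if small_sample else squared / n

data = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

s2_small = variance(data, small_sample=True)   # divisor n - 1 (n = 11 <= 30)
s2_large = variance(data, small_sample=False)  # divisor n, shown for comparison

# Cross-check against the standard library equivalents.
print(math.isclose(s2_small, statistics.variance(data)))
print(math.isclose(s2_large, statistics.pvariance(data)))
```

In the standard library, `statistics.variance` uses the n - 1 divisor and `statistics.pvariance` uses n, so the two checks line up with the two cases.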
Standard deviation (SD)
It is a widely used measure of variability or diversity, used in statistics and
probability theory.
It shows how much variation or “dispersion” exists from the average (mean, or
expected value).
A low standard deviation indicates that the data points tend to be very close to the
mean, whereas a high standard deviation indicates that the data points are spread
out over a large range of values.
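A small sketch makes the contrast concrete. The two data sets below are made up purely for illustration (they do not come from these notes); they share the same mean but differ greatly in spread. `statistics.pstdev` is the standard library function for the standard deviation with divisor n.

```python
import statistics

# Hypothetical data sets with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)

sd_tight = statistics.pstdev(tight)  # points cluster near the mean
sd_wide = statistics.pstdev(wide)    # points spread over a large range

print(mean_tight, mean_wide)
print(round(sd_tight, 2), round(sd_wide, 2))
```

The means alone cannot distinguish the two data sets; the standard deviations can.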
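Formulas for SD
For small samples (n <= 30)
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ for ungrouped data
$S = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{\sum f - 1}}$ for grouped data
For large samples (n > 30)
$S = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ for ungrouped data
$S = \sqrt{\frac{\sum f\,(x - \bar{x})^2}{\sum f}}$ for grouped data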
In every case, S is taken as the positive square root of the corresponding variance:
$S = +\sqrt{S^2}$ for both ungrouped and grouped data
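For grouped data, the standard deviation can be sketched in Python as below. The function `grouped_sd` and its `small_sample` flag are hypothetical names, and the choice of divisor (n - 1 for small samples, n for large ones) is an assumption following the common convention. The frequency table is the one from the grouped mean-deviation example.

```python
import math
import statistics

def grouped_sd(values, freqs, small_sample=False):
    """Standard deviation of discrete grouped data (values with frequencies)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    squared = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    divisor = n - 1 if small_sample else n  # assumed convention
    return math.sqrt(squared / divisor)     # positive square root

x = [14, 35, 45, 55, 56, 65, 87, 89, 92]
f = [4, 7, 11, 13, 18, 13, 8, 6, 3]

sd = grouped_sd(x, f)  # total frequency is 83 > 30, so the divisor is n

# Cross-check: expand the table into raw observations and use the library.
expanded = [xi for xi, fi in zip(x, f) for _ in range(fi)]
print(round(sd, 2), round(statistics.pstdev(expanded), 2))
```

Expanding the table into individual observations and applying the ungrouped formula gives the same result, which is a handy way to check a grouped calculation.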