Chapt 4

Chapter 4 discusses measures of dispersion, which quantify the spread of data around an average value. It introduces absolute and relative measures of dispersion, including range, quartile deviation, mean deviation, and variance, along with their merits and demerits. The chapter emphasizes the importance of these measures in statistical analysis for comparing variability and assessing the reliability of central tendency measures.

CHAPTER 4

Measures of Dispersion
Introduction and objectives of measuring Variation

•Dispersion or variation is the scatter or spread of the items of a distribution.
•The degree to which numerical data tend to spread about an average value is called the dispersion or variation of the data.
•Measures of dispersion are statistical measures that quantify the extent to which data are dispersed or spread out.
Objectives of measuring Variation
•To judge the reliability of measures of central tendency.
•To control variability itself.
•To compare two or more groups of numbers in terms of their variability.
•To make further statistical analysis.
04/22/2025 By Getahun G Woldemariam(AU W
Absolute and Relative Measures of Dispersion
•Absolute measures of dispersion are expressed in the original units of the series.
•Such measures are not suitable for comparing the variability of two distributions that are expressed in different units of measurement and have different average sizes.
•Relative measures of dispersion are the ratio, or percentage, of a measure of absolute dispersion to an appropriate measure of central tendency.
•They are thus pure numbers, independent of the units of measurement.
•For comparing the variability of two distributions (even if they are measured in the same unit), we compute the relative measure of dispersion instead of the absolute measure of dispersion.
Types of Measures of Dispersion
•The most commonly used measures of dispersion are:
1. Range and relative range
2. Quartile deviation and coefficient of Quartile deviation
3. Mean deviation and coefficient of Mean deviation
4. Standard deviation and coefficient of variation.
The Range (R)
•It is the largest score minus the smallest score.
•It is a quick and dirty measure of variability.
•Because the range is greatly affected by extreme scores, it may give a distorted picture of the scores.
•The following two distributions have the same range, 13, yet appear to differ greatly in the amount of variability.

Distribution 1: 32 35 36 36 37 38 40 42 42 43 43 45

Distribution 2: 32 32 33 33 33 34 34 34 34 34 35 45

For this reason, among others, the range is not the most important
measure of variability.
R L  S , L l arg est observation
S smallest observation
Range for grouped data:
If data are given in the shape of continuous frequency distribution,
the range is computed as:
R UCLk  LCL1 , UCLk is upperclass lim it of the last class.
UCL1 is lower class lim it of the first class.
This is some times expressed as:
R X k  X1 , X k is class mark of the last class.
04/22/2025 XBy Getahun
1 is classmark of the first class.
G Woldemariam(AU W
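As a quick check of the raw-data definition R = L - S, the following minimal sketch computes the range of the two distributions listed earlier. The helper `middle_spread` is a hypothetical illustration, not a measure defined in this chapter: it simply recomputes the range after dropping the single smallest and largest scores, to show how much one extreme score drives the result.

```python
# Scores copied from Distribution 1 and Distribution 2 above.
dist_1 = [32, 35, 36, 36, 37, 38, 40, 42, 42, 43, 43, 45]
dist_2 = [32, 32, 33, 33, 33, 34, 34, 34, 34, 34, 35, 45]


def value_range(scores):
    """Range R = L - S: largest observation minus smallest observation."""
    return max(scores) - min(scores)


def middle_spread(scores):
    """Hypothetical helper: range after dropping the smallest and largest score."""
    trimmed = sorted(scores)[1:-1]
    return max(trimmed) - min(trimmed)


range_1 = value_range(dist_1)
range_2 = value_range(dist_2)
print("Range:", range_1, range_2)
print("Range without the two extremes:", middle_spread(dist_1), middle_spread(dist_2))
```

Both ranges are equal, but once the extremes are set aside the second distribution is far more tightly packed, which is the distortion the slide warns about.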
Merits and Demerits of range
Merits:
•It is rigidly defined.
•It is easy to calculate and simple to understand.
Demerits:
•It is not based on all observations.
•It is highly affected by extreme observations.
•It is affected by fluctuations in sampling.
•It cannot be computed for open-ended distributions.
•It is very sensitive to the size of the sample.
Relative Range (RR)
It is also sometimes called the coefficient of range and is given by:
$$RR = \frac{L - S}{L + S} = \frac{R}{L + S}$$
Example:
1. Find the relative range of the above two distributions. (Exercise!)
2. If the range and relative range of a series are 4 and 0.25 respectively, what is the value of:
a) the smallest observation
b) the largest observation
Solution: (2)
$$R = 4 \Rightarrow L - S = 4 \quad (1)$$
$$RR = 0.25 \Rightarrow \frac{4}{L + S} = 0.25 \Rightarrow L + S = 16 \quad (2)$$
Solving (1) and (2) simultaneously gives L = 10 and S = 6.
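The solution to question 2 can be checked numerically. This is a small sketch under the definitions above (R = L - S and RR = R / (L + S)); the function name `extremes_from_range` is made up for illustration.

```python
def extremes_from_range(r, rr):
    """Recover (largest, smallest) from the range r and the relative range rr."""
    total = r / rr               # L + S, because RR = R / (L + S)
    largest = (total + r) / 2    # add the equations L + S = total and L - S = r
    smallest = (total - r) / 2   # subtract them
    return largest, smallest


largest, smallest = extremes_from_range(4, 0.25)
print(largest, smallest)
```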

The Quartile Deviation (Semi-interquartile range), Q.D

•The interquartile range is the difference between the third and the first quartiles of a set of items, and the semi-interquartile range is half of the interquartile range.
$$Q.D = \frac{Q_3 - Q_1}{2}$$
Coefficient of Quartile Deviation (C.Q.D)
$$C.Q.D = \frac{(Q_3 - Q_1)/2}{(Q_3 + Q_1)/2} = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
•The quartile deviation gives the average amount by which the two quartiles differ from the median.
Example: Compute Q.D and its coefficient for the following distribution.
Values     Freq.
140-150    17
150-160    29
160-170    42
170-180    72
180-190    84
190-200    107
200-210    49
210-220    34
220-230    31
230-240    16
240-250    12
Solutions:
In the previous chapter we obtained the values of all quartiles as:
Q1 = 174.90, Q2 = 190.23, Q3 = 203.83
$$Q.D = \frac{Q_3 - Q_1}{2} = \frac{203.83 - 174.90}{2} = 14.47$$
$$C.Q.D = \frac{2 \times Q.D}{Q_3 + Q_1} = \frac{2 \times 14.47}{203.83 + 174.90} = 0.076$$
Remark: Q.D or C.Q.D includes only the middle 50% of the observations.
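The quartile values quoted in the solution can be reproduced from the frequency table. The sketch below assumes the usual linear-interpolation formula for grouped data, Q_j = lower limit + (jN/4 - cumulative frequency before the class) / class frequency × class width. That formula is not restated in this chapter, so treat it as an assumption; the function name is also illustrative.

```python
# Lower class limits and frequencies copied from the table above.
lower_limits = [140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240]
frequencies = [17, 29, 42, 72, 84, 107, 49, 34, 31, 16, 12]
class_width = 10


def grouped_quartile(j, lower_limits, frequencies, width):
    """Return the j-th quartile (j = 1, 2, 3) by linear interpolation."""
    total = sum(frequencies)
    target = j * total / 4        # position of the quartile in the ordered data
    cumulative = 0                # frequency accumulated before the current class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative + freq >= target:
            return lower + (target - cumulative) / freq * width
        cumulative += freq
    raise ValueError("quartile position exceeds the total frequency")


q1 = grouped_quartile(1, lower_limits, frequencies, class_width)
q2 = grouped_quartile(2, lower_limits, frequencies, class_width)
q3 = grouped_quartile(3, lower_limits, frequencies, class_width)

qd = (q3 - q1) / 2             # quartile deviation
cqd = (q3 - q1) / (q3 + q1)    # coefficient of quartile deviation

print(round(q1, 2), round(q2, 2), round(q3, 2))
print(round(qd, 2), round(cqd, 3))
```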



The Mean Deviation (M.D):
The mean deviation of a set of items is defined as the arithmetic mean of the absolute deviations from a given average. Depending on the type of average used, we have different mean deviations.
A. Mean Deviation about the mean
•Denoted by $M.D(\bar{X})$ and given by
$$M.D(\bar{X}) = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$$
For the case of a frequency distribution it is given as:
$$M.D(\bar{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \bar{X}|}{n}$$
Steps to calculate $M.D(\bar{X})$:
1. Find the arithmetic mean, $\bar{X}$.
2. Find the deviation of each reading from $\bar{X}$.
3. Find the arithmetic mean of the deviations, ignoring sign.
B. Mean Deviation about the median
•Denoted by $M.D(\tilde{X})$ and given by
$$M.D(\tilde{X}) = \frac{\sum_{i=1}^{n} |X_i - \tilde{X}|}{n}$$
For the case of a frequency distribution it is given as:
$$M.D(\tilde{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \tilde{X}|}{n}$$
Steps to calculate $M.D(\tilde{X})$:
1. Find the median, $\tilde{X}$.
2. Find the deviation of each reading from $\tilde{X}$.
3. Find the arithmetic mean of the deviations, ignoring sign.


C. Mean Deviation about the mode
•Denoted by $M.D(\hat{X})$ and given by
$$M.D(\hat{X}) = \frac{\sum_{i=1}^{n} |X_i - \hat{X}|}{n}$$
For the case of a frequency distribution it is given as:
$$M.D(\hat{X}) = \frac{\sum_{i=1}^{k} f_i |X_i - \hat{X}|}{n}$$
Steps to calculate $M.D(\hat{X})$:
1. Find the mode, $\hat{X}$.
2. Find the deviation of each reading from $\hat{X}$.
3. Find the arithmetic mean of the deviations, ignoring sign.


Examples:
1. The following are the numbers of visits made by ten mothers to the local doctor's surgery: 8, 6, 5, 5, 7, 4, 5, 9, 7, 4.
Find the mean deviation about the mean, median and mode.
Solutions:
First calculate the three averages:
$$\bar{X} = 6, \quad \tilde{X} = 5.5, \quad \hat{X} = 5$$
Then take the absolute deviations of each observation from these averages.
X_i           4    4    5    5    5    6    7    7    8    9    Total
|X_i - 6|     2    2    1    1    1    0    1    1    2    3    14
|X_i - 5.5|   1.5  1.5  0.5  0.5  0.5  0.5  1.5  1.5  2.5  3.5  14
|X_i - 5|     1    1    0    0    0    1    2    2    3    4    14

$$M.D(\bar{X}) = \frac{\sum_{i=1}^{10} |X_i - 6|}{10} = \frac{14}{10} = 1.4$$
$$M.D(\tilde{X}) = \frac{\sum_{i=1}^{10} |X_i - 5.5|}{10} = \frac{14}{10} = 1.4$$
$$M.D(\hat{X}) = \frac{\sum_{i=1}^{10} |X_i - 5|}{10} = \frac{14}{10} = 1.4$$
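The three steps (find the average, take the deviations, average them ignoring sign) can be sketched with Python's standard `statistics` module. The function name `mean_deviation` is an illustrative choice; the data are the ten visit counts from example 1.

```python
import statistics

# Number of visits made by the ten mothers in example 1.
visits = [8, 6, 5, 5, 7, 4, 5, 9, 7, 4]


def mean_deviation(data, center):
    """Arithmetic mean of the absolute deviations of data from center."""
    return sum(abs(x - center) for x in data) / len(data)


md_mean = mean_deviation(visits, statistics.mean(visits))      # about the mean, 6
md_median = mean_deviation(visits, statistics.median(visits))  # about the median, 5.5
md_mode = mean_deviation(visits, statistics.mode(visits))      # about the mode, 5

print(md_mean, md_median, md_mode)
```

For this particular data set all three mean deviations coincide, which matches the totals of 14 in the table.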
2. Find the mean deviation about the mean, median and mode for the following distribution. (Exercise)
Class    Frequency
40-44    7
45-49    10
50-54    22
55-59    15
60-64    12
65-69    6
70-74    3
Remark: The mean deviation about the median is never larger than the mean deviation about any other average.
Coefficient of Mean Deviation (C.M.D)
$$C.M.D = \frac{M.D}{\text{Average about which deviations are taken}}$$
$$C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}}, \quad C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}}, \quad C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}}$$
Example: Calculate the C.M.D about the mean, median and mode for the data in example 1 above.
Solutions:
$$C.M.D(\bar{X}) = \frac{M.D(\bar{X})}{\bar{X}} = \frac{1.4}{6} = 0.233$$
$$C.M.D(\tilde{X}) = \frac{M.D(\tilde{X})}{\tilde{X}} = \frac{1.4}{5.5} = 0.255$$
$$C.M.D(\hat{X}) = \frac{M.D(\hat{X})}{\hat{X}} = \frac{1.4}{5} = 0.28$$

Exercise: Identify the merits and demerits of Mean Deviation



The Variance
Population Variance
•If we divide the variation (the sum of the squared deviations from the mean) by the number of values in the population, we get the population variance.
•This variance is the "average squared deviation from the mean".
$$\text{Population variance: } \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - \mu)^2$$
For the case of a frequency distribution it is expressed as:
$$\text{Population variance: } \sigma^2 = \frac{1}{N} \sum_{i=1}^{k} f_i (X_i - \mu)^2$$
Sample Variance
•One would expect the sample variance to simply be the population variance with the population mean replaced by the sample mean.
•However, one of the major uses of a statistic is to estimate the corresponding parameter.
•A formula that divides by n has the problem that its estimates tend to fall below the population variance.
•To counteract this, the sum of the squares of the deviations is divided by one less than the sample size.
$$\text{Sample variance: } S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$
For the case of a frequency distribution it is expressed as:
$$\text{Sample variance: } S^2 = \frac{1}{n-1} \sum_{i=1}^{k} f_i (X_i - \bar{X})^2$$
We usually use the following shortcut formulas.
$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n-1}, \text{ for raw data.}$$
$$S^2 = \frac{\sum_{i=1}^{k} f_i X_i^2 - n\bar{X}^2}{n-1}, \text{ for frequency distribution.}$$
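The shortcut form for raw data follows from expanding the squared deviations and using $\sum X_i = n\bar{X}$. A brief derivation, added here as a reading aid rather than taken from the slides:

```latex
\begin{aligned}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - 2n\bar{X}^2 + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - n\bar{X}^2
\end{aligned}
```

Dividing both sides by n - 1 gives the shortcut formula; the frequency-distribution version follows the same steps with each term weighted by $f_i$ and $\sum f_i X_i = n\bar{X}$.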



Examples: Find the variance and standard deviation of the following sample data.
1. 5, 17, 12, 10.
2. The data are given in the form of a frequency distribution.
Class Frequency
40-44 7
45-49 10
50-54 22
55-59 15
60-64 12
65-69 6
70-74 3
Solutions:
1. $\bar{X} = 11$
$X_i$                   5     10    12    17    Total
$(X_i - \bar{X})^2$     36    1     1     36    74
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1} = \frac{74}{3} = 24.67$$
$$S = \sqrt{S^2} = \sqrt{24.67} = 4.97$$
2. $\bar{X} = 55$
$X_i$ (C.M)                42     47    52    57    62    67    72    Total
$f_i (X_i - \bar{X})^2$    1183   640   198   60    588   864   867   4400
$$S^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{X})^2}{n-1} = \frac{4400}{74} = 59.46$$
$$S = \sqrt{S^2} = \sqrt{59.46} = 7.71$$
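Both worked examples can be reproduced in a few lines. This is a minimal sketch: `statistics.variance` from the Python standard library divides by n - 1, matching the sample variance defined above, while `grouped_sample_variance` is an illustrative helper that applies the frequency-distribution formula to the class marks.

```python
import math
import statistics

# Example 1: raw sample data.
sample = [5, 17, 12, 10]
s2_raw = statistics.variance(sample)   # sum of squared deviations / (n - 1)
s_raw = math.sqrt(s2_raw)

# Example 2: class marks and frequencies of the grouped data.
class_marks = [42, 47, 52, 57, 62, 67, 72]
freqs = [7, 10, 22, 15, 12, 6, 3]


def grouped_sample_variance(marks, freqs):
    """Sample variance of a frequency distribution, using class marks as X_i."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(marks, freqs)) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in zip(marks, freqs))
    return squared_dev / (n - 1)


s2_grouped = grouped_sample_variance(class_marks, freqs)
s_grouped = math.sqrt(s2_grouped)

print(round(s2_raw, 2), round(s_raw, 2))
print(round(s2_grouped, 2), round(s_grouped, 2))
```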
Special properties of Standard deviations
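1. The sum of the squared deviations is smallest when the deviations are taken about the mean:
$$\frac{\sum (X_i - \bar{X})^2}{n-1} < \frac{\sum (X_i - A)^2}{n-1}, \quad A \neq \bar{X}$$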
2. For a normal (symmetric) distribution the following holds.
•Approximately 68.27% of the data values fall within one standard deviation of the mean, i.e. within $(\bar{X} - S, \bar{X} + S)$.
•Approximately 95.45% of the data values fall within two standard deviations of the mean, i.e. within $(\bar{X} - 2S, \bar{X} + 2S)$.
•Approximately 99.73% of the data values fall within three standard deviations of the mean, i.e. within $(\bar{X} - 3S, \bar{X} + 3S)$.
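These three percentages can be recovered from the normal distribution itself. The sketch below uses the standard identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z, which is not stated on the slides and is used here only as a check; `normal_coverage` is an illustrative name.

```python
import math


def normal_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


# Percentage of values within 1, 2 and 3 standard deviations of the mean.
coverage = {k: round(100 * normal_coverage(k), 2) for k in (1, 2, 3)}
print(coverage)
```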
3. Chebyshev's Theorem
For any data set, no matter what the pattern of variation, the proportion of the values that fall within k standard deviations of the mean, i.e. within $(\bar{X} - kS, \bar{X} + kS)$, will be at least $1 - \frac{1}{k^2}$, where k is a number greater than 1. That is, the proportion of items falling beyond k standard deviations of the mean is at most $\frac{1}{k^2}$.
Example: Suppose a distribution has mean 50 and standard deviation 6. What percent of the numbers are:
a. Between 38 and 62
b. Between 32 and 68
c. Less than 38 or more than 62.
d. Less than 32 or more than 68.
Solutions:
a. 38 and 62 are at equal distance from the mean, 50, and this distance is 12.
$$kS = 12 \Rightarrow k = \frac{12}{S} = \frac{12}{6} = 2$$
Applying the above theorem, at least $\left(1 - \frac{1}{k^2}\right) \times 100\% = 75\%$ of the numbers lie between 38 and 62.
b. Similarly done.
c. It is just the complement of a), i.e. at most $\frac{1}{k^2} \times 100\% = 25\%$ of the numbers are less than 38 or more than 62.
d. Similarly done.
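Parts b and d follow the same pattern with k = 3. A minimal sketch of the calculation, assuming the interval is symmetric about the mean as in all four parts; `chebyshev_bounds` is a hypothetical helper name.

```python
def chebyshev_bounds(mean, sd, low, high):
    """Return (at least % inside, at most % outside) for an interval symmetric about mean."""
    k = (high - mean) / sd
    if abs((mean - low) / sd - k) > 1e-12:
        raise ValueError("interval must be symmetric about the mean")
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    inside = (1 - 1 / k**2) * 100
    return inside, 100 - inside


inside_a, outside_c = chebyshev_bounds(50, 6, 38, 62)   # k = 2
inside_b, outside_d = chebyshev_bounds(50, 6, 32, 68)   # k = 3

print(round(inside_a, 2), round(outside_c, 2))
print(round(inside_b, 2), round(outside_d, 2))
```

With k = 3 the bound is at least 1 - 1/9, about 88.89%, inside the interval and at most about 11.11% outside it.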
Exercise: The scores on a special test of knowledge of wood refinishing have a mean of 53 and a standard deviation of 6. Find the range of values in which at least 75% of the scores will lie.



4. If the standard deviation of $X_1, X_2, \ldots, X_n$ is S, then the standard deviation of
a) $X_1 + k, X_2 + k, \ldots, X_n + k$ will also be S
b) $kX_1, kX_2, \ldots, kX_n$ would be $|k|S$
c) $a + kX_1, a + kX_2, \ldots, a + kX_n$ would be $|k|S$
Exercise: Verify each of the above relationships, considering k and a as constants.
Examples:
1. The mean and standard deviation of n Tetracycline capsules $X_1, X_2, \ldots, X_n$ are known to be 12 gm and 3 gm respectively. A new set of capsules of another drug is obtained by the linear transformation $Y_i = 2X_i - 0.5$ (i = 1, 2, …, n). What will be the standard deviation of the new set of capsules?
2. The mean and the standard deviation of a set of numbers are 500 and 10 respectively.
a. If 10 is added to each of the numbers in the set, what will be the variance and standard deviation of the new set?
b. If each of the numbers in the set is multiplied by -5, what will be the variance and standard deviation of the new set?
Solutions:
1. Using c) above, the new standard deviation $= |k|S = 2 \times 3 = 6$.
2. a. They will remain the same: the variance is $10^2 = 100$ and the standard deviation is 10.
b. New standard deviation $= |k|S = 5 \times 10 = 50$, so the new variance is $50^2 = 2500$.
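The three relationships can also be checked numerically. The data set below is hypothetical, chosen only for illustration; the constants mirror the examples above (adding 10, multiplying by -5, and the transformation Y = 2X - 0.5).

```python
import statistics

# Hypothetical sample, used only to illustrate the relationships.
x = [9, 12, 15, 10, 14]

s = statistics.stdev(x)                                     # S of the original values
s_shifted = statistics.stdev([xi + 10 for xi in x])         # X_i + k with k = 10
s_scaled = statistics.stdev([-5 * xi for xi in x])          # k * X_i with k = -5
s_linear = statistics.stdev([-0.5 + 2 * xi for xi in x])    # a + k * X_i with a = -0.5, k = 2

print(s, s_shifted, s_scaled, s_linear)
```

Adding a constant leaves the standard deviation unchanged, while multiplying by k scales it by |k| regardless of the sign of k.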
Coefficient of Variation (C.V)
•It is defined as the ratio of the standard deviation to the mean, usually expressed as a percentage.
$$C.V = \frac{S}{\bar{X}} \times 100$$
•The distribution having the smaller C.V is said to be less variable or more consistent.
Example: An analysis of the monthly wages paid (in Birr) to workers in two firms A and B belonging to the same industry gives the following results.
Value Firm A Firm B
Mean wage 52.5 47.5
Median wage 50.5 45.5
Variance 100 121

In which firm, A or B, is there greater variability in individual wages?
Solutions:
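Calculate the coefficient of variation for both firms, using $S_A = \sqrt{100} = 10$ and $S_B = \sqrt{121} = 11$.
$$C.V_A = \frac{S_A}{\bar{X}_A} \times 100 = \frac{10}{52.5} \times 100 = 19.05\%$$
$$C.V_B = \frac{S_B}{\bar{X}_B} \times 100 = \frac{11}{47.5} \times 100 = 23.16\%$$
Since $C.V_A < C.V_B$, in firm B there is greater variability in individual wages.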
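The comparison can be sketched in code. Note that the table gives variances, so the standard deviation is the square root of each; the function name `coefficient_of_variation` is illustrative.

```python
import math


def coefficient_of_variation(sd, mean):
    """C.V = S / mean * 100, expressed as a percentage."""
    return sd / mean * 100


# Mean wage and variance for each firm, taken from the table above.
cv_a = coefficient_of_variation(math.sqrt(100), 52.5)
cv_b = coefficient_of_variation(math.sqrt(121), 47.5)

more_variable = "A" if cv_a > cv_b else "B"
print(round(cv_a, 2), round(cv_b, 2), more_variable)
```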
Exercise: A meteorologist interested in the consistency of temperatures in three cities during a given week collected the following data. The temperatures for the five days of the week in the three cities were:
City 1    25    24    23    26    17
City 2    22    21    24    22    20
City 3    32    27    35    24    28
Which city has the most consistent temperature, based on these data?
Standard Scores (Z-scores)
•If X is a measurement from a distribution with mean $\bar{X}$ and standard deviation S, then its value in standard units is
$$Z = \frac{X - \mu}{\sigma}, \text{ for population.}$$
$$Z = \frac{X - \bar{X}}{S}, \text{ for sample.}$$
•Z gives the deviation from the mean in units of standard deviation.
•Z gives the number of standard deviations a particular observation lies above or below the mean.
•It is used to compare two observations coming from different groups.
Examples:
Two sections were given introduction to statistics examinations.
The following information was given.
Value Section 1 Section 2
Mean 78 90
Stan.deviation 6 5

Student A from section 1 scored 90 and student B from section 2 scored 95. Relatively speaking, who performed better?
Solutions:
Calculate the standard score of both students.
$$Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{90 - 78}{6} = 2$$
$$Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{95 - 90}{5} = 1$$
Student A performed better relative to his section, because the score of student A is two standard deviations above the mean score of his section, while the score of student B is only one standard deviation above the mean score of his section.
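The standard scores in this example can be computed directly from the sample formula Z = (X - mean) / S. A minimal sketch, with `z_score` as an illustrative function name:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_a = z_score(90, 78, 6)   # student A, section 1
z_b = z_score(95, 90, 5)   # student B, section 2

better = "A" if z_a > z_b else "B"
print(z_a, z_b, better)
```

A higher standard score is better here because higher exam scores are better; for a measure such as completion time, where smaller is better, the more negative standard score would indicate the stronger performance.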
Two groups of people were trained to perform a certain task
and tested to find out which group is faster to learn the task. For
the two groups the following information was given:
Value Group one Group two
Mean 10.4 min 11.9 min
Stan.dev. 1.2 min 1.3 min

Relatively speaking:
a. Which group is more consistent in its performance?
b. Suppose person A from group one takes 9.2 minutes while person B from group two takes 9.3 minutes. Who was faster in performing the task? Why?
Solutions:
a. Use the coefficient of variation.
$$C.V_1 = \frac{S_1}{\bar{X}_1} \times 100 = \frac{1.2}{10.4} \times 100 = 11.54\%$$
$$C.V_2 = \frac{S_2}{\bar{X}_2} \times 100 = \frac{1.3}{11.9} \times 100 = 10.92\%$$
Since $C.V_2 < C.V_1$, group 2 is more consistent.
b. Calculate the standard scores of A and B.
$$Z_A = \frac{X_A - \bar{X}_1}{S_1} = \frac{9.2 - 10.4}{1.2} = -1$$
$$Z_B = \frac{X_B - \bar{X}_2}{S_2} = \frac{9.3 - 11.9}{1.3} = -2$$
Person B is faster relative to their group, because the time taken by person B is two standard deviations shorter than the average time taken by group 2, while the time taken by person A is only one standard deviation shorter than the average time taken by group 1.
