CH 4

The document discusses measures of variation and dispersion in data. It introduces concepts like range, mean deviation, variance, and standard deviation. Examples are provided to demonstrate how to calculate these measures from data sets.

Uploaded by

Byhiswill

Measure of variation(Ch 4)

4. MEASURES OF VARIATION (DISPERSION)


Introduction and objectives of measuring variation
We have seen that averages represent a frequency distribution, but they fail to give a complete picture
of it. They tell us nothing about the spread or dispersion of the observations within the distribution.
Suppose that we have the yields (kg per plot) of two rice varieties from 5 plots each.
Variety 1: 45 42 42 41 40
Variety 2: 54 48 42 33 30
The mean yields are almost the same: 42 kg for variety 1 and, for the values listed, 41.4 kg for variety 2.
The mean yield of variety 1 is close to every value in that variety. The mean yield of variety 2, on the
other hand, is far from most of the values in variety 2. The mean does not tell us how close the
observations are to each other. This example shows that a measure of central tendency alone is not
sufficient to describe a frequency distribution. We therefore also need a measure of the spread of the
observations. There are different measures of dispersion.
Objectives of measuring variation
• To describe the dispersion (variability) in a data set.
• To compare the spread of two or more distributions.
• To determine the reliability of an average.
Note: The desirable properties of a good measure of variation are almost identical to those of a good
measure of central tendency.
Absolute and relative measures
Absolute measures of dispersion are expressed in the same unit of measurement as the original data.
They may be used to compare the variation in two distributions provided that the variables are in the
same units and of the same average size. If the two sets of data are expressed in different units, such
as quintals of sugar versus tonnes of sugarcane, or if the average sizes are very different, such as
managers' salaries versus workers' salaries, the absolute measures of dispersion are not comparable.
In such cases measures of relative dispersion should be used. A measure of relative dispersion is the
ratio of a measure of absolute dispersion to an appropriate measure of central tendency. It is a
unitless measure.
Types of Measures of Variation
i. The range and relative range
Range (R) is defined as the difference between the maximum and minimum observations in a set of data:

$$R = x_{\max} - x_{\min}$$

It is the crudest absolute measure of variation. It is widely used in the construction of quality control
charts and in the description of daily temperature.
Properties of range
• It is affected by extreme values.
• It does not take into account all observations.
• It is easy to calculate and simple to understand.
• It does not tell anything about the distribution of values in the set of data relative to some
measure of central tendency.
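
As a quick illustration, the range of the two rice varieties from the introduction can be computed as follows. This is a minimal Python sketch; the function name `value_range` is chosen here for illustration and does not come from the text.

```python
def value_range(values):
    """Range R: largest observation minus smallest observation."""
    return max(values) - min(values)


# Yields (kg per plot) of the two rice varieties from the introduction.
variety_1 = [45, 42, 42, 41, 40]
variety_2 = [54, 48, 42, 33, 30]

print(value_range(variety_1))  # 5
print(value_range(variety_2))  # 24
```

Variety 2 has a much larger range even though its average yield is close to that of variety 1, which is exactly the difference the mean alone does not show.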

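Relative range (RR), also called the coefficient of range, is commonly defined as

$$RR = \frac{x_{\max} - x_{\min}}{x_{\max} + x_{\min}}$$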

ii. The mean deviation and coefficient of mean deviation
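Mean deviation (MD) is the average of the absolute deviations taken from a central value, generally the
mean or the median. In each formula below, the first expression is for ungrouped data with n
observations and the second is for grouped data with k classes, class mid-points m_i and class
frequencies f_i.

Mean deviation about the mean:
$$MD_{\bar{X}} = \frac{\sum_{i=1}^{n} |x_i - \bar{X}|}{n} \quad\text{or}\quad MD_{\bar{X}} = \frac{\sum_{i=1}^{k} f_i\,|m_i - \bar{X}|}{n}$$

Mean deviation about the median:
$$MD_{\tilde{X}} = \frac{\sum_{i=1}^{n} |x_i - \tilde{X}|}{n} \quad\text{or}\quad MD_{\tilde{X}} = \frac{\sum_{i=1}^{k} f_i\,|m_i - \tilde{X}|}{n}$$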
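Example: Calculate the mean deviation about the mean and about the median of the following scores of
students in a certain test: 6, 7, 7, 10, 10.
Solution: The mean is $\bar{X} = 40/5 = 8$ and the median is $\tilde{X} = 7$.

$$MD_{\bar{X}} = \frac{\sum_{i=1}^{n} |x_i - \bar{X}|}{n} = \frac{|6-8| + 2\times|7-8| + 2\times|10-8|}{5} = \frac{8}{5} = 1.6$$

$$MD_{\tilde{X}} = \frac{\sum_{i=1}^{n} |x_i - \tilde{X}|}{n} = \frac{|6-7| + 2\times|7-7| + 2\times|10-7|}{5} = \frac{7}{5} = 1.4$$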
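
The example above can be checked with a short Python sketch. The helper name `mean_deviation` is an illustrative choice, not something defined in the text; it simply averages the absolute deviations from whatever central value is passed in.

```python
from statistics import mean, median


def mean_deviation(values, center):
    """Average absolute deviation of the values from a given central value."""
    return sum(abs(x - center) for x in values) / len(values)


scores = [6, 7, 7, 10, 10]

md_about_mean = mean_deviation(scores, mean(scores))      # centre = 8
md_about_median = mean_deviation(scores, median(scores))  # centre = 7

print(md_about_mean, md_about_median)  # 1.6 1.4
```

Passing the central value as an argument keeps one function usable for both the mean-based and the median-based versions.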

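Note: In the case of grouped data, the mid-point of each class interval is treated as representative of
the class, and the grouped form of the formula above is used. Besides, the coefficient of mean deviation,
the corresponding relative measure, is the mean deviation divided by the average about which it is
taken (the mean or the median).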
Properties of mean deviation
• It is relatively simple to understand compared with the standard deviation.
• Its computation is simple.
• It is less affected by extreme values than the standard deviation.
• It is better than the range and relative range since it is based on all observations.
• It is not suitable for further statistical treatment.
iii. Variance, standard deviation and coefficient of variation
The variance is the average of the squared distances of the values from the mean. The symbol for the
population variance is σ². Let x_1, x_2, …, x_N be the measurements on N population units; then the
population variance is given by the formula:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} x_i^2 - \dfrac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}}{N},$$

where μ is the population mean and N is the population size.

Let x_1, x_2, …, x_n be the measurements on n sample units; then the sample variance is denoted by S²,
and its formula is

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1},$$

where $\bar{x}$ is the sample mean and n is the sample size.

Standard deviation, denoted by σ or S, is the square root of the variance. That is, the population
standard deviation is $\sigma = \sqrt{\sigma^2}$ and the sample standard deviation is $S = \sqrt{S^2}$.
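Example: For a newly created position, a manager interviewed the following numbers of applicants
each day over a five-day period: 16, 19, 15, 15, and 14. Find the variance and standard deviation.
Solution:

$$\bar{x} = \frac{\sum_{i=1}^{5} x_i}{5} = \frac{16+19+15+15+14}{5} = \frac{79}{5} = 15.8$$

$$S^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{5-1} = \frac{(16-15.8)^2 + \dots + (14-15.8)^2}{4} = \frac{14.8}{4} = 3.7 \;\Rightarrow\; S = \sqrt{3.7} \approx 1.92$$

Alternatively, using the computational formula:

$$S^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{(16^2 + \dots + 14^2) - \dfrac{(16 + \dots + 14)^2}{5}}{4} = \frac{1263 - \dfrac{6241}{5}}{4} = \frac{14.8}{4} = 3.7$$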
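
Both routes to the sample variance can be reproduced in a few lines of Python. This is a sketch only: the two function names are illustrative choices, and the standard-library `statistics.variance` is used purely as an independent cross-check.

```python
from math import sqrt
from statistics import variance


def sample_variance_definition(values):
    """S^2 from the definition: squared deviations from the mean over n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """S^2 from the computational form: (sum x^2 - (sum x)^2 / n) over n - 1."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


applicants = [16, 19, 15, 15, 14]

s2_definition = sample_variance_definition(applicants)
s2_shortcut = sample_variance_shortcut(applicants)
s = sqrt(s2_definition)

print(round(s2_definition, 4), round(s2_shortcut, 4))  # 3.7 3.7
print(round(s, 2))                                     # 1.92
print(round(variance(applicants), 4))                  # 3.7, library cross-check
```

The results are rounded before printing because floating-point arithmetic can leave tiny differences between the two algebraically equal formulas.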

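• For a grouped frequency distribution, the formula for the variance is

$$S^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{k} f_i x_i^2 - \dfrac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}}{n-1},$$

where k is the number of classes, x_i is the class mark of class i, f_i is the frequency of class i, and
$n = \sum_{i=1}^{k} f_i$.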

Properties of variance
• The unit of measurement of the variance is the square of the unit of measurement of the observed
values. This is one of its limitations.
• The variance gives more weight to extreme values than to those near the mean, because the
deviations are squared.
• It is based on all observations in the data set.
Properties of standard deviation
• Standard deviation is considered to be the best measure of dispersion and is widely used.
• There is, however, one difficulty with it. If the units of measurement of the variables in two series
are not the same, their variability cannot be compared by comparing the values of the standard
deviation.

Uses of the variance and standard deviation
• The variance and standard deviation can be used to determine the spread of data, the consistency
of a variable, and the proportion of data values that fall within a specified interval in a distribution.
• If the variance or standard deviation is large, the data are more dispersed. This information is
useful in comparing two or more data sets to determine which is more (most) variable.
• Finally, the variance and standard deviation are used quite often in inferential statistics.
Coefficient of variation (CV)
The standard deviation is an absolute measure of dispersion, and it cannot be used to compare the
variability of two or more different series because it is affected by the measurement scale. When we
need to compare the variability of two or more different data series, we use a relative measure known
as the coefficient of variation (CV). The coefficient of variation is the ratio of the standard deviation
to the arithmetic mean, usually expressed in percent:

$$CV = \frac{S}{\bar{x}} \times 100\%$$

A distribution with a smaller coefficient of variation is said to be less variable, or more consistent,
more uniform, or more homogeneous.
Example: Last semester, the students of two departments, A and B, took the Stat 276 course. At the end
of the semester, the following information was recorded.
              Dept A   Dept B
Mean score    79       64
SD            23       11
Compare the relative dispersion of the two departments.
Solution:

$$CV_A = \frac{23}{79} \times 100\% \approx 29.11\%, \qquad CV_B = \frac{11}{64} \times 100\% \approx 17.19\%$$

Since CV_A > CV_B, the variation in department A is greater; in other words, the distribution of the
marks in department B is more consistent.
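
The comparison can be reproduced with a small Python sketch. The function name `coefficient_of_variation` is an illustrative choice; it takes the standard deviation and the mean and returns the ratio as a percentage.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation in percent: (standard deviation / mean) * 100."""
    return sd / mean * 100


cv_a = coefficient_of_variation(sd=23, mean=79)  # Dept A
cv_b = coefficient_of_variation(sd=11, mean=64)  # Dept B

print(round(cv_a, 2), round(cv_b, 2))  # 29.11 17.19
print("A is more variable" if cv_a > cv_b else "B is more variable")
```

Because the ratio is unitless, the same function could be used to compare series measured in different units, such as the weights and heights in the exercise that follows.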
Exercise: The mean weight of 20 children was found to be 30 kg with a variance of 16 kg², and their mean
height was 150 cm with a variance of 25 cm². Compare the variability of the weights and heights of these
children.
iv. The standard scores
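A standard score is a measure that describes the relative position of a single score in the entire
distribution of scores in terms of the mean and standard deviation. It gives the number of standard
deviations by which a particular observation lies above or below the mean.

Standard score:
$$Z = \frac{x - \mu}{\sigma} \ \text{(population)}, \qquad Z = \frac{x - \bar{X}}{S} \ \text{(sample)},$$
where x is the value of the observation, μ and $\bar{X}$ are the population and sample means, and σ and S
are the standard deviations of the population and sample, respectively.
Interpretation: A positive Z means the observation lies above the mean, a negative Z means it lies below
the mean, and Z = 0 means it is equal to the mean.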

Example: Two sections were given an exam in a course. The average score was 72 with a standard
deviation of 6 for section 1, and 85 with a standard deviation of 5 for section 2. Student A from
section 1 scored 84 and student B from section 2 scored 90. Who performed better relative to his/her group?
Solution:
Section 1: mean = 72, S = 6, and the score of student A is x_A = 84.
Section 2: mean = 85, S = 5, and the score of student B is x_B = 90.

Z-score of student A: $Z_A = \dfrac{84 - 72}{6} = 2$

Z-score of student B: $Z_B = \dfrac{90 - 85}{5} = 1$
From these two standard scores, we can conclude that student A performed better relative to his/her
section, because his/her score is two standard deviations above the mean score of section 1, while the
score of student B is only one standard deviation above the mean score of section 2.
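
The two standard scores in this example can be computed with a short Python sketch. The function name `z_score` is an illustrative choice, assuming the usual definition of a standard score as the deviation from the mean divided by the standard deviation.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_a = z_score(x=84, mean=72, sd=6)  # student A, section 1
z_b = z_score(x=90, mean=85, sd=5)  # student B, section 2

print(z_a, z_b)  # 2.0 1.0
print("Student A" if z_a > z_b else "Student B", "did better relative to the section")
```

Student B has the higher raw score, but the comparison is made on the standardized values, which put both scores on a common scale.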
Exercise: A student scored 65 on a calculus test that had a mean of 50 and a standard deviation of 10;
she scored 30 on an algebra test with a mean of 25 and a standard deviation of 5. Compare her relative
positions on the two tests.
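v. Skewness and kurtosis
• Skewness refers to lack of symmetry in a distribution. If a distribution is not symmetrical, we call it
a skewed distribution. Note that for a symmetrical and unimodal distribution: Mean = median = mode.
Measure of skewness:

The Pearson coefficient of skewness (Pcsk) is defined as:

$$Pcsk = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} \approx \frac{3(\text{mean} - \text{median})}{\text{standard deviation}}$$

Interpretation: If Pcsk > 0, the distribution is positively skewed; if Pcsk < 0, it is negatively skewed;
if Pcsk = 0, it is symmetrical.

In moderately skewed distributions: Mode = mean − 3(mean − median), which is where the second form of
the coefficient comes from.

Note: In a negatively skewed distribution larger values are more frequent than smaller values. In a
positively skewed distribution smaller values are more frequent than larger values.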
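
A small Python sketch of the Pearson coefficient of skewness follows. It assumes the usual definitions, (mean − mode) / SD and its median-based counterpart 3(mean − median) / SD. The function names and the summary values are hypothetical, chosen so that the empirical relation Mode = mean − 3(mean − median) holds exactly.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Pearson coefficient of skewness based on the mode."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Pearson coefficient of skewness based on the median."""
    return 3 * (mean - median) / sd


# Hypothetical summary values (not from the text); note 44 = 50 - 3 * (50 - 48).
mean, median, mode, sd = 50, 48, 44, 12

print(pearson_skewness_mode(mean, mode, sd))      # 0.5
print(pearson_skewness_median(mean, median, sd))  # 0.5
```

Both values are positive, so under these assumptions the hypothetical distribution would be described as positively skewed.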
Exercise: The mean, mode and standard deviation of a frequency distribution are 70.2, 73.6, and 6.4,
respectively. What can you state about its skewness?

• Kurtosis refers to the degree of peakedness of a distribution. When the values of a distribution are
closely bunched around the mode in such a way that the peak of the distribution becomes relatively
high, the distribution is said to be leptokurtic. If it is flat-topped, we call it platykurtic. A
distribution which is neither highly peaked nor flat-topped is known as a mesokurtic (normal) distribution.
