Lecture 5 Notes

Uploaded by

sirajamalif

Introduction to Dispersion

What is Dispersion?
• Measures of average (such as the median and mean) represent the typical
value for a dataset. But within the dataset, the actual values usually differ from
one another and from the average value itself.

• The extent to which the central values are good representatives of the values
in the original dataset depends upon the variability or dispersion in the
original data.

• Dispersion is the spread or scatter of item values around a measure of central
tendency. Dispersion is usually measured as an average of deviations about
some central value.

• Dispersion is thus a type of average and is sometimes called a second-order
average. Datasets are said to have high dispersion or variation when they
contain values considerably higher and lower than the central value.

Example 1

• Let us consider two groups of students with their scores in a particular
examination, as shown in the table.

Group 1    49    50    50    51
Group 2     0     0   100   100

• The AM for each group is 50.

• It is clear from the data that the first group consists of students of
near-average ability, while the second group is made up of very bright and
very dull students.

• It is evident that the distributions of both groups have the same AM.

• But they differ in variation about $\bar{x}$; such variation is usually measured by a
measure of dispersion.
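
The contrast between the two groups can be checked numerically. The sketch below uses Python's standard `statistics` module and, as an assumption made here, summarises the spread with the population standard deviation (a measure introduced later in these notes).

```python
from statistics import mean, pstdev

group_1 = [49, 50, 50, 51]
group_2 = [0, 0, 100, 100]

mean_1, mean_2 = mean(group_1), mean(group_2)

# pstdev treats each group as a complete population (divides by N)
spread_1, spread_2 = pstdev(group_1), pstdev(group_2)

print(mean_1, mean_2)      # both means are 50
print(spread_1, spread_2)  # the second group is far more spread out
```

Both groups share the same average, so the average alone cannot distinguish them; only the spread does.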

Example 2

[Figure: two charts, each titled "Size of Tutorial Groups", one for semester 1 and one for semester 2]

• In the two charts, the numbers of tutorial groups of each size in semester 1
and semester 2 are presented.

• In both semesters the mean and median tutorial group size is 5 students;
however, the groups in semester 2 show more dispersion (or variability in size)
than those in semester 1.

Characteristics of a good measure of dispersion

Dispersion within a dataset can be measured or described in several ways, including
the range, inter-quartile range, variance, and standard deviation.

The following are the characteristics of an ideal measure of variation or dispersion:

i. It should be easy to understand.

ii. It should be easy to calculate.

iii. It should be based upon all observations.

iv. It should be rigidly defined.

v. It should not be unduly affected by extreme values.

vi. It should be suitable for further algebraic treatment.

vii. It should be less affected by sampling fluctuation.

Purpose of measures of dispersion

Measures of dispersion are important for the following purposes:

i. To determine the reliability of an average.

ii. To compare the variability.

iii. To compare two or more series with regard to their variability.

iv. To facilitate the use of other statistical measures.

v. To characterize a frequency distribution: dispersion is one of the most
important quantities used for this.

Types of measure of dispersion

• A measure of dispersion or variation may be either absolute or relative.

• An absolute measure of variation is expressed in the same statistical unit in
which the original data are given, such as takas, kilograms, or tonnes, and
may be used to compare the variation in two distributions, provided the
variables are expressed in the same units and are of the same average size.

• On the other hand, it is often necessary to compare the dispersion in two or
more frequency distributions whose variables are expressed in different
units.

• In such a case, dispersion is calculated by dividing the absolute measure of
dispersion by a measure of central tendency, which generates a pure number
that is independent of the unit of measurement. The resultant numerical
value is a relative measure of dispersion.
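
A small sketch can illustrate why a relative measure is a pure number. The weights below are hypothetical, and the choice of standard deviation divided by mean as the relative measure is an assumption made for illustration; any absolute measure divided by a central value behaves the same way.

```python
from statistics import mean, pstdev

weights_kg = [52.0, 60.0, 68.0, 75.0]        # hypothetical data, in kilograms
weights_g = [w * 1000 for w in weights_kg]   # the same data, in grams


def relative_dispersion(values):
    """Absolute measure (SD) divided by a measure of central tendency (mean)."""
    return pstdev(values) / mean(values)


sd_kg, sd_g = pstdev(weights_kg), pstdev(weights_g)
rel_kg, rel_g = relative_dispersion(weights_kg), relative_dispersion(weights_g)

print(sd_kg, sd_g)    # the absolute measure changes with the unit
print(rel_kg, rel_g)  # the relative measure does not
```

Changing the unit rescales the absolute measure by the conversion factor, while the ratio stays the same.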

Which measures of Dispersion to choose?

Absolute Measure of Dispersion:
When dealing with data, if one's objective is "only to determine" the variation
of a single set of variables/information, s/he can choose to use an absolute
measure of dispersion.

Relative Measure of Dispersion:
When dealing with data, if one's objective is "to determine and compare" the
variations of multiple sets of variables/information expressed in the same or
different unit(s), s/he can choose to use a relative measure of dispersion.

Different types of Absolute and Relative measure of dispersion

Absolute Measure of Dispersion          Relative Measure of Dispersion
1. Range                                1. Coefficient of range
2. Quartile deviation                   2. Coefficient of quartile deviation
3. Variance and standard deviation      3. Coefficient of variation and standard deviation

Range
• The range of a set of data values is the difference between the highest and
the lowest values in the set.

• If $X_L$ and $X_S$ are the largest and the smallest values respectively in a set, then the
range $R$ is defined as $R = X_L - X_S$.

• For grouped data, the range is taken either as the difference between the upper
boundary of the last class and the lower boundary of the first class, or as the
difference between the highest and the lowest mid-values.

Coefficient of Range

The coefficient of dispersion corresponding to the range is called the coefficient of range:

$$\text{Coefficient of Range} = \frac{X_L - X_S}{X_L + X_S}$$

where $X_L$ = largest value and $X_S$ = smallest value.
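
As a minimal sketch, the two definitions translate directly into code. The function names are chosen here for illustration, and the data are the two groups from Example 1.

```python
def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """(largest - smallest) / (largest + smallest), a unit-free ratio."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


group_1 = [49, 50, 50, 51]
group_2 = [0, 0, 100, 100]

print(value_range(group_1), coefficient_of_range(group_1))  # 2 0.02
print(value_range(group_2), coefficient_of_range(group_2))  # 100 1.0
```

Note that the coefficient is undefined when the largest and smallest values sum to zero; the sketch does not guard against that case.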

Range

Merits
• Easy to understand and calculate.
• It is based only on extreme observations and no detailed information is required.
• It gives us a quick idea of the variability of a set of data.

Demerits
• It is not based on all observations.
• Range does not give any indication of the character of the distribution
within the two extreme observations.
• Range is subject to fluctuations from sample to sample.
• Cannot be computed in case of open-end classes.

Quartile Deviation

• Quartiles divide the observations into four equal parts when the observations are
arranged in order of magnitude.

• The median, denoted by $Q_2$, is the middlemost observation, and $Q_1$ and $Q_3$ are the
middlemost observations of the lower and upper halves respectively.

• Therefore $Q_2 - Q_1$ and $Q_3 - Q_2$ give us some measure of dispersion.

• The AM of these two measures gives us the quartile deviation, denoted
by QD:

$$QD = \frac{(Q_2 - Q_1) + (Q_3 - Q_2)}{2} = \frac{Q_3 - Q_1}{2}$$

Coefficient of Quartile Deviation

• The coefficient of dispersion corresponding to the quartile deviation is called the
coefficient of quartile deviation and is defined as

$$\text{Coefficient of } QD = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$

Merits
• It is superior to range as a measure of dispersion.
• It is applicable in open-end classes.
• Easy to understand and compute.
• Not affected by extreme values.

Demerits
• It ignores 50% of the items, that is, the first 25% and the last 25% of
observations.
• Very much affected by sampling fluctuations.
• Not suited for further algebraic treatment.

Test Yourself 1
The following data represent the annual wages of the workers of two factories, X and Y.
For the given information:
I. Determine the range and coefficient of range.
II. Determine the quartile deviation and coefficient of quartile deviation.

Table 1: Annual wages of Factory X workers (in '000 Tk)

 91   70   74   79   86   93
 60   71   76   79   87   96
112   72  127   79   87   62
 68   72   77   79   90   76
 69   73   77   85   48  157

Table 2: Annual wages of Factory Y workers (in '000 Tk)

 97   78   85   92   97  105
 72   79   85   92   97  107
112   79   87   92   97   72
113   80   90   96   68   75
 78   82   90   97  100

Hints:
1. Find the lowest and highest value for each table.
2. Calculate the quartiles ($Q_1$, $Q_2$, $Q_3$) for each table.

I. Determine the range and coefficient of range (in '000 Tk).

For table 1:
$R = 157 - 48 = 109$
$$\text{Coefficient of Range} = \frac{157 - 48}{157 + 48} = 0.5317$$

For table 2:
$R = 113 - 68 = 45$
$$\text{Coefficient of Range} = \frac{113 - 68}{113 + 68} = 0.2486$$

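II. Determine the quartile deviation and coefficient of quartile deviation.

For table 1 ($n = 30$), with the values arranged in ascending order:

First quartile $Q_1$: $\frac{in}{4} = \frac{1 \times 30}{4} = 7.5$, not an integer, so $Q_1$ is the 8th ordered value.
$Q_1 = 72$

Second quartile $Q_2$: $\frac{in}{4} = \frac{2 \times 30}{4} = 15$, an integer, so $Q_2$ is the mean of the 15th and 16th ordered values.
$Q_2 = \frac{1}{2}[77 + 79] = 78$

Third quartile $Q_3$: $\frac{in}{4} = \frac{3 \times 30}{4} = 22.5$, not an integer, so $Q_3$ is the 23rd ordered value.
$Q_3 = 87$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{87 - 72}{2} = 7.5$$

$$\text{Coefficient of } QD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{87 - 72}{87 + 72} = 0.09434$$

For table 2 ($n = 29$), with the values arranged in ascending order:

First quartile $Q_1$: $\frac{in}{4} = \frac{1 \times 29}{4} = 7.25$, not an integer, so $Q_1$ is the 8th ordered value.
$Q_1 = 79$

Second quartile $Q_2$: $\frac{in}{4} = \frac{2 \times 29}{4} = 14.5$, not an integer, so $Q_2$ is the 15th ordered value.
$Q_2 = 90$

Third quartile $Q_3$: $\frac{in}{4} = \frac{3 \times 29}{4} = 21.75$, not an integer, so $Q_3$ is the 22nd ordered value.
$Q_3 = 97$

$$QD = \frac{Q_3 - Q_1}{2} = \frac{97 - 79}{2} = 9$$

$$\text{Coefficient of } QD = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{97 - 79}{97 + 79} = 0.10227$$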
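
The quartile positions used in this solution can be sketched in code. The rule below is inferred from the worked answers, not stated in the notes: compute i*n/4; if it is an integer k, average the k-th and (k+1)-th ordered values; otherwise round up and take that ordered value. Other textbooks and libraries use different quartile conventions, so results may differ slightly elsewhere.

```python
def quartile(values, i):
    """Return the i-th quartile (i = 1, 2, 3) using the i*n/4 position rule."""
    ordered = sorted(values)
    numerator = i * len(ordered)
    if numerator % 4 == 0:
        # i*n/4 is an integer k: average the k-th and (k+1)-th ordered values
        k = numerator // 4
        return (ordered[k - 1] + ordered[k]) / 2
    # otherwise round the position up to the next integer
    k = numerator // 4 + 1
    return ordered[k - 1]


def quartile_deviation(values):
    return (quartile(values, 3) - quartile(values, 1)) / 2


def coefficient_of_qd(values):
    q1, q3 = quartile(values, 1), quartile(values, 3)
    return (q3 - q1) / (q3 + q1)


factory_x = [91, 70, 74, 79, 86, 93, 60, 71, 76, 79, 87, 96, 112, 72, 127,
             79, 87, 62, 68, 72, 77, 79, 90, 76, 69, 73, 77, 85, 48, 157]
factory_y = [97, 78, 85, 92, 97, 105, 72, 79, 85, 92, 97, 107, 112, 79, 87,
             92, 97, 72, 113, 80, 90, 96, 68, 75, 78, 82, 90, 97, 100]

for name, wages in (("X", factory_x), ("Y", factory_y)):
    print(name, quartile_deviation(wages), round(coefficient_of_qd(wages), 5))
```

Sorting inside `quartile` keeps the function safe to call on unordered tables such as the two above.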

Variance and Standard Deviation

Note: For computational convenience we will use the following formulae.

Population, ungrouped data:

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right]$$

Population, grouped data:

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right]$$

Sample, ungrouped data:

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$

Sample, grouped data:

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right]$$

Variance

• Variance provides an average measure of the squared difference between each
observation and the arithmetic mean.

• In other words, the variance shows, on average, how close the values of a
variable are to the arithmetic mean.

Calculating Variance

• If $x_1, x_2, x_3, \ldots, x_n$ are sample values and $\bar{x}$ is the sample mean, then
the deviation of the value $x_i$ from the sample mean $\bar{x}$ is $(x_i - \bar{x})$ and the
squared deviation is $(x_i - \bar{x})^2$.

• The sum of squared deviations is $\sum_{i=1}^{n} (x_i - \bar{x})^2$. The following graph shows
the squared deviations of the values from their mean.

[Figure: squared deviations of the values from their mean]

Standard Deviation

• The variance is expressed in squared units and is therefore not an appropriate
measure of dispersion when we wish to express dispersion in
terms of the original unit.

• The standard deviation is another measure of dispersion. It is the positive
square root of the variance and is expressed in the
original unit of the data.

• Standard deviation of a variable $X$: $SD(X) = \sqrt{Var(X)}$

Population: Ungrouped Data

• If $X_1, X_2, X_3, \ldots, X_N$ are the N values of a population of size N, then the
population variance, commonly designated as $\sigma^2$, is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}, \quad \text{where } \mu = \text{population mean}$$

Recommended formula:

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right]$$

• For the same population, the standard deviation (SD) of the population,
commonly designated as $\sigma$, is defined as

$$\sigma = \sqrt{\text{Variance}} = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
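
The notes give the recommended formula without showing why it agrees with the definition. A short derivation, written here with lowercase $x_i$ for the population values and using only $\mu = \frac{1}{N}\sum x_i$, may help:

```latex
\begin{align*}
\sum_{i=1}^{N} (x_i - \mu)^2
  &= \sum_{i=1}^{N} x_i^2 - 2\mu \sum_{i=1}^{N} x_i + N\mu^2 \\
  &= \sum_{i=1}^{N} x_i^2 - \frac{2\left(\sum_{i=1}^{N} x_i\right)^2}{N}
     + \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N} \\
  &= \sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}
\end{align*}
```

Dividing both sides by $N$ gives the recommended formula for $\sigma^2$. The same expansion, with $\bar{x}$ in place of $\mu$ and a divisor of $n - 1$, underlies the sample version.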

Test Yourself 2
A population of 10 students got the marks in the examination given in the
table below. Find the variance and standard deviation of the given data.
13 15 14 16 2 8 9 23 28 12

Answer:

$x_i$     $x_i^2$
13        169
15        225
14        196
16        256
 2          4
 8         64
 9         81
23        529
28        784
12        144

$\sum x_i = 140$, $\sum x_i^2 = 2452$

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right] = \frac{1}{10}\left[2452 - \frac{(140)^2}{10}\right] = 49.2$$

$SD = \sigma = \sqrt{49.2} = 7.01427$
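
As a quick check of this answer, the sketch below computes the population variance twice: once from the definition (mean of squared deviations) and once from the recommended formula. Variable names are chosen here for illustration.

```python
import math

marks = [13, 15, 14, 16, 2, 8, 9, 23, 28, 12]
N = len(marks)

# Definitional form: mean of squared deviations from the population mean
mu = sum(marks) / N
var_definition = sum((x - mu) ** 2 for x in marks) / N

# Recommended (computational) form: needs only sum(x) and sum(x^2)
sum_x = sum(marks)
sum_x2 = sum(x * x for x in marks)
var_shortcut = (sum_x2 - sum_x ** 2 / N) / N

sd = math.sqrt(var_shortcut)
print(sum_x, sum_x2, var_shortcut, round(sd, 5))  # 140 2452 49.2 7.01427
```

The computational form avoids computing each deviation, which is why it suits hand calculation with a two-column table.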

Population: Grouped Data

• In the case of grouped data, if $X_1, X_2, X_3, \ldots, X_k$ are values that occur with
frequencies $f_1, f_2, f_3, \ldots, f_k$ respectively in a population of size N, then
the population variance $\sigma^2$ is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{\sum_{i=1}^{k} f_i} = \frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N}$$

Recommended Formula:

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right]$$

• For the same population, the standard deviation (SD) of the population, $\sigma$, is
defined as

$$\sigma = \sqrt{\text{Variance}} = \sqrt{\sigma^2} = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \mu)^2}{N}}$$

Test Yourself 3
A population of 40 students got marks in the examination as given in the table
below. Find the variance and standard deviation of the given data.

$X_i$    15    20    25    30    35
$f_i$     6     8    15     7     4

$x_i$    $f_i$    $f_i x_i^2$    $f_i x_i$
15        6       1350            90
20        8       3200           160
25       15       9375           375
30        7       6300           210
35        4       4900           140

$\sum f_i x_i^2 = 25125$, $\sum f_i x_i = 975$

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{N}\right] = \frac{1}{40}\left[25125 - \frac{(975)^2}{40}\right] = 33.9844$$

$SD = 5.8296$
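
A frequency table is only a compact way of writing repeated values. The sketch below applies the grouped formula and then, as a cross-check, expands the table into 40 individual marks and uses `statistics.pvariance` from the standard library; the function name `grouped_population_variance` is chosen here for illustration.

```python
import math
from statistics import pvariance

values = [15, 20, 25, 30, 35]
freqs = [6, 8, 15, 7, 4]


def grouped_population_variance(xs, fs):
    """Population variance from a frequency table (recommended formula)."""
    n = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    return (sum_fx2 - sum_fx ** 2 / n) / n


variance = grouped_population_variance(values, freqs)
sd = math.sqrt(variance)

# Cross-check: write every mark out individually and treat it as ungrouped data
expanded = [x for x, f in zip(values, freqs) for _ in range(f)]

print(round(variance, 4), round(sd, 4))  # 33.9844 5.8296
print(len(expanded), pvariance(expanded))
```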

Sample: Ungrouped Data

• If $X_1, X_2, X_3, \ldots, X_n$ are the values of a sample of size n, then the sample
variance $s^2$ is defined as

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{x})^2}{n - 1}, \quad \text{where } \bar{x} = \text{sample mean}$$

Recommended Formula:

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$$

For the same sample, the standard deviation (SD) of the sample, $s$, is defined
as

$$s = \sqrt{\text{Variance}} = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{x})^2}{n - 1}}$$

Test Yourself 4
A sample of 10 students got the marks in the examination given in the table
below. Find the variance and standard deviation of the given data.
13 15 14 16 2 8 9 23 28 12

$x_i$     $x_i^2$
13        169
15        225
14        196
16        256
 2          4
 8         64
 9         81
23        529
28        784
12        144

$\sum x_i = 140$, $\sum x_i^2 = 2452$

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right] = \frac{1}{9}\left[2452 - \frac{(140)^2}{10}\right] = 54.66667$$

$SD = 7.39369$
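
The same ten marks were treated as a population in Test Yourself 2, so the two results differ only in the divisor. The sketch below makes that relationship explicit and cross-checks against `statistics.variance`, which divides by n - 1.

```python
import math
from statistics import variance

marks = [13, 15, 14, 16, 2, 8, 9, 23, 28, 12]
n = len(marks)
sum_x = sum(marks)
sum_x2 = sum(x * x for x in marks)

corrected_sum_sq = sum_x2 - sum_x ** 2 / n   # shared numerator of both formulas
s2 = corrected_sum_sq / (n - 1)              # sample variance
sigma2 = corrected_sum_sq / n                # population variance, for comparison
s = math.sqrt(s2)

print(round(s2, 5), round(s, 5))  # 54.66667 7.39369
print(sigma2 * n / (n - 1))       # rescaling the population variance gives s2
```

Because n - 1 is smaller than n, the sample variance is always the larger of the two for the same data.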

Sample: Grouped Data

In the case of grouped data, if $X_1, X_2, X_3, \ldots, X_k$ are values that occur with
frequencies $f_1, f_2, f_3, \ldots, f_k$ respectively in a sample of size n, then the sample
variance $s^2$ is defined as

$$s^2 = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1} = \frac{\sum_{i=1}^{k} f_i (X_i - \bar{x})^2}{n - 1}$$

Recommended Formula:

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right]$$

For the same sample, the standard deviation (SD) of the sample, $s$, is defined
as

$$s = \sqrt{\text{Variance}} = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{k} f_i (X_i - \bar{x})^2}{n - 1}}$$

Test Yourself 5
A sample of 40 students got marks in the examination as given in the table
below. Find the variance and standard deviation of the given data.

$X_i$    15    20    25    30    35
$f_i$     6     8    15     7     4

$x_i$    $f_i$    $f_i x_i^2$    $f_i x_i$
15        6       1350            90
20        8       3200           160
25       15       9375           375
30        7       6300           210
35        4       4900           140

$\sum f_i x_i^2 = 25125$, $\sum f_i x_i = 975$

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{39}\left[25125 - \frac{(975)^2}{40}\right] = 34.8558$$

$SD = 5.9039$
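
Since the grouped population and sample formulas differ only in the outer divisor, one function with a switch can cover both. This is a sketch; the function name and the `sample` flag are choices made here, not part of the notes.

```python
import math


def grouped_variance(xs, fs, sample=True):
    """Variance from a frequency table; divides by n - 1 if sample, else by n."""
    n = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    divisor = n - 1 if sample else n
    return (sum_fx2 - sum_fx ** 2 / n) / divisor


marks = [15, 20, 25, 30, 35]
freqs = [6, 8, 15, 7, 4]

s2 = grouped_variance(marks, freqs, sample=True)
sigma2 = grouped_variance(marks, freqs, sample=False)

print(round(s2, 4), round(math.sqrt(s2), 4))          # 34.8558 5.9039
print(round(sigma2, 4), round(math.sqrt(sigma2), 4))  # 33.9844 5.8296
```

The second line reproduces the population answer for the same table, which makes the effect of the divisor easy to see side by side.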

Population or Sample?

• As we can see, the formulas for variance and SD differ between population
and sample.

• For a population, the variance (or SD) is a parameter, whereas for a sample, it is a statistic.

• Unless clearly mentioned otherwise, datasets are samples taken from a larger
population.

• So the formula for populations should be used only:

o If the question directly mentions that the data are for the whole
population, or

o If the question indicates that the data were collected for all members of a
population (e.g. all students of a class) and then asks for the variance/SD of
that population.

• Otherwise, always use the formula for samples, especially when nothing is
mentioned about sample/population.

Why $n - 1$, not $n$?

• We see that both formulas for the sample variance/SD (grouped/ungrouped) have
$n - 1$ as the denominator, instead of $n$.

• But for a population, it is simply $N$, the population size.

• The reason for using $n - 1$ is complex and out of scope, but three basic points
can be mentioned:

o The average of the variances computed from all possible samples should be equal to
the variance of the population. If we divide by $n$, it is not.

o With $n - 1$, the sample variance is, on average, closer to the population's.

o The deviations from the sample mean always sum to zero, so once $n - 1$ of them
are known, the last one is determined by the others. Thus the degrees of freedom
of the set are 1 less than the size, i.e. $n - 1$.
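
The first point can be illustrated by brute force on a tiny case. The sketch below assumes a hypothetical population of three values and enumerates every ordered sample of size 2 drawn with replacement (an assumption; the notes do not specify the sampling scheme). Exact fractions are used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]   # hypothetical tiny population
N = len(population)
mu = Fraction(sum(population), N)
pop_var = sum((x - mu) ** 2 for x in population) / N


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))   # all 9 ordered samples

avg_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)
avg_n = sum(variance(s, n) for s in samples) / len(samples)

print(pop_var)        # 8/3
print(avg_n_minus_1)  # 8/3, matches the population variance
print(avg_n)          # 4/3, too small on average
```

Dividing by n systematically underestimates the population variance here, while dividing by n - 1 recovers it exactly on average.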

Test Yourself 6
An advertising company is looking for a group of extras to shoot a sequence for a
movie. The ages of the first 20 candidates to be interviewed are

50 56 44 49 52 57 56 57 56 59
54 55 61 60 51 59 62 52 54 49

$x_i$     $x_i^2$
50        2500
56        3136
44        1936
49        2401
52        2704
57        3249
56        3136
57        3249
56        3136
59        3481
54        2916
55        3025
61        3721
60        3600
51        2601
59        3481
62        3844
52        2704
54        2916
49        2401

$\sum x_i = 1093$, $\sum x_i^2 = 60137$

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right] = \frac{1}{19}\left[60137 - \frac{(1093)^2}{20}\right] = 21.29211$$

$SD = 4.6143$

The director suggested that a standard deviation of 3 years would be acceptable.
Since the SD of this group (4.6143 years) exceeds 3 years, this group of extras will not qualify.

Coefficient of Variation (CV)

• The coefficient of variation (CV) is defined as the ratio of the standard
deviation to the mean:

$$\text{Population CV: } cv = \frac{\sigma}{\mu} \times 100 = \text{value in } x\%$$

$$\text{Sample CV: } cv = \frac{s}{\bar{x}} \times 100 = \text{value in } x\%$$

• It shows the extent of variability in relation to the mean of the population.

• The coefficient of variation should be computed only for data measured on a
ratio scale.

• For comparison between data sets with different units or widely different
means, one should use the coefficient of variation instead of the standard
deviation.

Comparing Coefficients

• C.V. is free of unit (unit-less): the SD is divided by the corresponding AM
to make it comparable across different means.

• To compare the variability of two sets of data (i.e. to determine which set is
more variable), we need to calculate the AM and SD of both sets.

• Then we can calculate the CVs for both sets.

• The data set with the larger value of CV has the larger relative variation,
which is expressed as a percentage.

• For example, the relative variability of data set 1 will be larger than that of data
set 2 if $CV_1 > CV_2$, and vice versa.

Test Yourself 7
In a university, students can take any number of courses per semester. For two
samples of 30 students each, the number of courses each student takes is given
below:

Course Number    2    3    4    5    6
Sample 1         2    5   10   12    1
Sample 2         1    6    8   13    2

For which sample of students is the relative variability of course numbers
higher?

Sample 1:

$x_i$    $f_i$    $f_i x_i^2$    $f_i x_i$
2         2         8              4
3         5        45             15
4        10       160             40
5        12       300             60
6         1        36              6

$\sum f_i x_i^2 = 549$, $\sum f_i x_i = 125$

$$s_1^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{29}\left[549 - \frac{(125)^2}{30}\right] = 0.971264$$

$SD_1 = 0.985527$
$AM_1 = \frac{125}{30} = 4.1667$
$CV_1 = \frac{SD_1}{AM_1} = 0.2365$

Sample 2:

$x_i$    $f_i$    $f_i x_i^2$    $f_i x_i$
2         1         4              2
3         6        54             18
4         8       128             32
5        13       325             65
6         2        72             12

$\sum f_i x_i^2 = 583$, $\sum f_i x_i = 129$

$$s_2^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} f_i x_i^2 - \frac{\left(\sum_{i=1}^{k} f_i x_i\right)^2}{n}\right] = \frac{1}{29}\left[583 - \frac{(129)^2}{30}\right] = 0.9759$$

$SD_2 = 0.9879$
$AM_2 = \frac{129}{30} = 4.3$
$CV_2 = \frac{SD_2}{AM_2} = 0.2297$

Here,
$CV_1 > CV_2$
So, the relative variability is higher in sample 1.
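
The comparison can be reproduced with a short sketch that computes the sample CV straight from each frequency table. The function name `grouped_cv` is chosen here for illustration, and the CV is returned as a ratio rather than a percentage, matching the figures in this answer.

```python
import math

courses = [2, 3, 4, 5, 6]
sample_1 = [2, 5, 10, 12, 1]   # frequencies for sample 1
sample_2 = [1, 6, 8, 13, 2]    # frequencies for sample 2


def grouped_cv(xs, fs):
    """Sample coefficient of variation (SD / mean) from a frequency table."""
    n = sum(fs)
    sum_fx = sum(f * x for x, f in zip(xs, fs))
    sum_fx2 = sum(f * x * x for x, f in zip(xs, fs))
    mean = sum_fx / n
    s2 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
    return math.sqrt(s2) / mean


cv_1 = grouped_cv(courses, sample_1)
cv_2 = grouped_cv(courses, sample_2)

print(round(cv_1, 4), round(cv_2, 4))  # 0.2365 0.2297
print("sample 1" if cv_1 > cv_2 else "sample 2", "has the higher relative variability")
```

Note that the two SDs are almost equal; sample 1 comes out as more variable mainly because its mean is lower.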

Combined Standard Deviation

The combined standard deviation of two sets of data containing $n_1$ and $n_2$
observations with means $\mu_1$ and $\mu_2$ and standard deviations $\sigma_1$ and $\sigma_2$ respectively
is given by

$$\sigma_{12} = \sqrt{\frac{n_1\left(\sigma_1^2 + d_1^2\right) + n_2\left(\sigma_2^2 + d_2^2\right)}{n_1 + n_2}}$$

where $\sigma_{12}$ = combined standard deviation,

$$d_1 = \mu_{12} - \mu_1$$
$$d_2 = \mu_{12} - \mu_2$$
$$\mu_{12} = \frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2}$$

• This formula for the combined standard deviation of two sets of data can be
extended to compute the standard deviation of more than two sets of data along
the same lines.
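
The extension to more than two sets can be sketched as a function that takes one (size, mean, SD) summary per set. The function name and the third data set below are hypothetical; the first two sets are the groups from Example 1. The check at the end relies on the SDs being population SDs (divisor N), in which case pooling the summaries should agree with the SD of all the values taken together.

```python
import math
from statistics import mean, pstdev


def combined_sd(groups):
    """groups: list of (n, mean, sd) triples, one per data set."""
    total_n = sum(n for n, _, _ in groups)
    combined_mean = sum(n * m for n, m, _ in groups) / total_n
    # each set contributes n * (sd^2 + d^2), with d = combined mean - own mean
    pooled = sum(n * (sd ** 2 + (combined_mean - m) ** 2) for n, m, sd in groups)
    return math.sqrt(pooled / total_n)


set_a = [49, 50, 50, 51]
set_b = [0, 0, 100, 100]
set_c = [10, 20, 30]          # hypothetical third set

summaries = [(len(s), mean(s), pstdev(s)) for s in (set_a, set_b, set_c)]
from_summaries = combined_sd(summaries)
from_raw_data = pstdev(set_a + set_b + set_c)

print(from_summaries, from_raw_data)  # the two values agree up to rounding
```

The practical value of the formula is that only the three summary figures per set are needed, not the raw observations.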

Test Yourself 8
From the analysis of monthly wages paid to employees in two service
organizations X and Y, the following results were obtained:

                                         Organization X    Organization Y
Number of wage-earners                   550               650
Average monthly wages                    5000              4500
Variance of the distribution of wages    900               1600

a. Which organization pays a larger amount as monthly wages?

Organization X pays: 550 * 5000 = 2750000

Organization Y pays: 650 * 4500 = 2925000

Organization Y pays the larger amount as monthly wages.

b. Determine the combined variance of all the employees taken together.

For organization X:
$\mu_1 = 5000$
Variance, $\sigma_1^2 = 900$
SD, $\sigma_1 = 30$
$n_1 = 550$

For organization Y:
$\mu_2 = 4500$
Variance, $\sigma_2^2 = 1600$
SD, $\sigma_2 = 40$
$n_2 = 650$

Now,

$$\mu_{12} = \frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2} = \frac{550 \times 5000 + 650 \times 4500}{550 + 650} = 4729.16667$$

$d_1 = \mu_{12} - \mu_1 = -270.83333$

$d_2 = \mu_{12} - \mu_2 = 229.16667$

$$\text{Combined variance, } \sigma_{12}^2 = \frac{n_1\left(\sigma_1^2 + d_1^2\right) + n_2\left(\sigma_2^2 + d_2^2\right)}{n_1 + n_2} = \frac{550(900 + 73350.6944) + 650(1600 + 52517.3611)}{1200} = 63345.1389$$

Test Yourself 9
For a group of 50 male workers, the mean and standard deviation of their
monthly wages are Tk. 6300 and Tk. 600 respectively. For a group of 40 female
workers, these are Tk. 5400 and Tk. 600 respectively. Find the standard deviation
of monthly wages for the combined group of workers.

For Group 1:
$\mu_1 = 6300$
SD, $\sigma_1 = 600$
$n_1 = 50$

For Group 2:
$\mu_2 = 5400$
SD, $\sigma_2 = 600$
$n_2 = 40$

Now,

$$\mu_{12} = \frac{n_1 \mu_1 + n_2 \mu_2}{n_1 + n_2} = \frac{50 \times 6300 + 40 \times 5400}{50 + 40} = 5900$$

$d_1 = \mu_{12} - \mu_1 = -400$

$d_2 = \mu_{12} - \mu_2 = 500$

$$\text{Combined standard deviation, } \sigma_{12} = \sqrt{\frac{n_1\left(\sigma_1^2 + d_1^2\right) + n_2\left(\sigma_2^2 + d_2^2\right)}{n_1 + n_2}} = \sqrt{\frac{50(600^2 + 400^2) + 40(600^2 + 500^2)}{90}} = 748.3315$$
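
The two-group formula used in Test Yourself 8 and 9 can be evaluated with a short sketch. The function name and argument order are choices made here; it takes variances rather than SDs so that both exercises can use it directly.

```python
import math


def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined variance of two data sets from their sizes, means and variances."""
    mean12 = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean12 - mean1
    d2 = mean12 - mean2
    return (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / (n1 + n2)


# Test Yourself 8: organizations X and Y (variances given directly)
var_xy = combined_variance(550, 5000, 900, 650, 4500, 1600)

# Test Yourself 9: male and female workers (SDs given, so square them first)
var_workers = combined_variance(50, 6300, 600 ** 2, 40, 5400, 600 ** 2)
sd_workers = math.sqrt(var_workers)

print(var_xy)
print(var_workers, sd_workers)
```

In both exercises most of the combined variance comes from the gap between the two group means rather than from the spread within each group.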

Application of Standard Deviation

• In stock charts, standard deviation (SD) is a measure of volatility. Chartists
can use SD to measure expected risk and determine the significance of
certain price movements.

• Bollinger Bands show the upper and lower limits of 'normal' price
movements based on the SD of prices.

• In many fields of science and business studies, SD is significant. From the
variance and SD we can understand the fitness of a statistical model when we
deal with the dependencies of data.
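
As an illustration of the Bollinger Bands idea, the sketch below builds bands as a moving average plus and minus a multiple of the moving SD. The notes do not give a formula, so the construction, the default window of 20 periods, the multiplier of 2, and the use of the population SD are all assumptions based on common charting practice; the prices are hypothetical.

```python
from statistics import mean, pstdev


def bollinger_bands(prices, window=20, width=2.0):
    """Return (lower, middle, upper) for each full window of prices."""
    bands = []
    for end in range(window, len(prices) + 1):
        recent = prices[end - window:end]
        middle = mean(recent)              # moving average of the window
        spread = width * pstdev(recent)    # band half-width in SD units
        bands.append((middle - spread, middle, middle + spread))
    return bands


prices = [10.0, 11.0, 12.0, 11.0, 10.0, 9.0]   # hypothetical closing prices
bands = bollinger_bands(prices, window=3, width=2.0)

for lower, middle, upper in bands:
    print(round(lower, 3), round(middle, 3), round(upper, 3))
```

A short window is used only to keep the example small; wider bands indicate periods where recent prices were more dispersed.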

Variance

Merits
• Rigidly defined.
• Based upon all observations.
• Easy to understand.
• Less affected by sampling fluctuations.
• Suitable for further algebraic treatment.

Demerits
• Difficult to calculate.
• Affected by extreme values.
• Difficult to calculate for open-end classes.
