Stat Unit1 - 5

This document discusses measures of dispersion, or variation, in data sets. It describes several measures: [1] Range is the difference between the largest and smallest values; [2] Quartile deviation is half the difference between the upper and lower quartiles; [3] Variance and standard deviation measure how far values are from the mean. It also distinguishes between absolute measures (with the same units as data) and relative measures like coefficient of variation (unit-free). Examples are provided to demonstrate calculating the range, quartiles, quartile deviation, and coefficient of quartile deviation for data sets.

Uploaded by

Reayan Banday

MEASURES OF DISPERSION (OR VARIATION)

Dr. Soya Mathew


DISPERSION:
To analyze and compare the variability in data sets we need a
measure of dispersion: a single number that effectively
summarizes the scatter of the values in the set or distribution.
The measure should indicate the extent to which observations
spread over an interval rather than cluster around a central
value.
Types of Measures of Dispersion:
• Range ($R$)
• Quartile Deviation ($QD$)
• Variance ($V$) and Standard Deviation ($SD$)
• Coefficient of Variation ($CV$)
Note: $R$, $QD$ and $SD$ are absolute measures of variation, as they have
the same unit of measurement as the original observations ($V$ is also an
absolute measure, but it is expressed in squared units), whereas $CV$
is a relative measure of variation, as it is free from the units of
measurement.
Thus, measures of dispersion are categorized mainly into two types:
• Measures of Absolute Dispersion: These express the variation in the units of the
original observations. The variation so determined is called 'absolute dispersion'.
• Measures of Relative Dispersion: The measures of absolute dispersion cannot be used
to compare the variation of two or more series measured in different units. For example,
the standard deviation of the heights of students (in inches) cannot be compared with
the standard deviation of their weights (in pounds).
To compare the variation of two or more series, we need a measure of relative
dispersion. It is defined as:
$$\text{Relative Dispersion} = \frac{\text{Absolute Dispersion}}{\text{Average}}$$
Range:
The range is the simplest measure of dispersion. It is defined as the difference
between the largest value and the smallest value in the data:
$$\text{Range} = L - S$$
where $L$ = largest observation and $S$ = smallest observation.
$$\text{Coefficient of Range} = \frac{L - S}{L + S}$$

• It provides knowledge of the total spread of the data.

• If the range is low, the individual values are close to each other, so it
helps in understanding the precision of the data.
Example:
1. Compute the range and the coefficient of range for each data set:
a) 5, 3, 8, 6, 7, 11, 23, 6, 25
Range $= L - S = 25 - 3 = 22$
Coefficient of Range $= \frac{L - S}{L + S} = \frac{22}{28} = 0.7857$
b) 6, 6, 6, 6, 6
Range $= L - S = 6 - 6 = 0$
Coefficient of Range $= \frac{L - S}{L + S} = \frac{0}{12} = 0$
c) 8, 7, 6, 5, 4
Range $= L - S = 8 - 4 = 4$
Coefficient of Range $= \frac{L - S}{L + S} = \frac{4}{12} = 0.333$
d) −7, −3, 0, 5, 8, 10, 22
Range $= L - S = 22 - (-7) = 29$
Coefficient of Range $= \frac{L - S}{L + S} = \frac{29}{15} = 1.9333$
e) 0, 2, 4, 5, 6, 8, 10
Range $= L - S = 10 - 0 = 10$
Coefficient of Range $= \frac{L - S}{L + S} = \frac{10}{10} = 1$
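The range calculations above can be checked with a few lines of Python. This is a minimal sketch, not part of the original notes; the function name `range_and_coefficient` is made up for illustration, and returning `None` when $L + S = 0$ is an assumption about how to handle the undefined case.

```python
def range_and_coefficient(values):
    """Return (range, coefficient of range) for a list of numbers."""
    largest, smallest = max(values), min(values)
    spread = largest - smallest            # Range = L - S
    total = largest + smallest             # denominator L + S
    # Assumption: the coefficient is undefined when L + S = 0, so return None.
    coefficient = spread / total if total != 0 else None
    return spread, coefficient


# Data sets (a) and (c) from the example above.
print(range_and_coefficient([5, 3, 8, 6, 7, 11, 23, 6, 25]))
print(range_and_coefficient([8, 7, 6, 5, 4]))
```

Data set (d) shows that the coefficient of range can exceed 1 when the data contain negative values, so it is easiest to interpret for non-negative data.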
Example:
2. Find the range of the following data: (2018)
Class:       40 – 45   45 – 50   50 – 55   55 – 60   60 – 65   65 – 70
Frequency:      4        13        14        12         5         2

Solution:
For grouped data, $L$ is the upper limit of the highest class and $S$ is the lower limit of the lowest class.
Range $= L - S = 70 - 40 = 30$
Quartile Deviation (𝑸𝑫):
Meaning of Quartile:
Quartiles are the three points that divide an ordered data set into four equal parts,
so that each part contains an equal number of observations. The median
serves as the basis for calculating the quartiles.
Median ⇢ divides the variates into two equal parts
Quartiles ⇢ divide the variates into four equal parts



Quartile Deviation (𝑸𝑫):
The difference between the upper and lower quartiles, $Q_3 - Q_1$, is known as the
inter-quartile range.

Quartile Deviation (or Semi-Interquartile Range) is half of the difference between
the upper quartile ($Q_3$) and the lower quartile ($Q_1$). Thus,

$$Q.D = \frac{Q_3 - Q_1}{2}$$

For comparative studies of the variability of two distributions, we make use of a
relative measure, known as the Coefficient of Quartile Deviation. It is defined as

$$\text{Coefficient of } Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
Quartile Deviation in Individual Series

1. Arrange the data in ascending order.

2. $Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term.

3. $Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term.

4. Calculate $Q.D = \frac{Q_3 - Q_1}{2}$.
Problems:
1. Data were collected from 11 school children on the number of hours they spend
watching television in one week. The data are 3, 8.5, 12, 9, 16.5, 9, 14, 20, 18, 19,
20. Find the quartile deviation and the coefficient of quartile deviation.

Solution:

Arranging the data in ascending order: 3, 8.5, 9, 9, 12, 14, 16.5, 18, 19, 20, 20

$Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term = size of the $\left(\frac{12}{4}\right)^{th}$ term = size of the 3rd term
⟹ $Q_1 = 9$

$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term = size of the $3\left(\frac{12}{4}\right)^{th}$ term = size of the 9th term
⟹ $Q_3 = 19$

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{19 - 9}{2} = \frac{10}{2} = 5$ hours

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{19 - 9}{19 + 9} = \frac{10}{28} = 0.3571$
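The procedure for an individual series can be sketched in Python as follows. This is an illustrative sketch only: the helper names are hypothetical, it assumes at least three observations, and linear interpolation between neighbouring terms for fractional positions is an assumption that the notes do not spell out (it is not needed for this problem, where both positions are whole numbers).

```python
def term(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    lower = int(position)                  # whole part of the position
    fraction = position - lower
    value = sorted_values[lower - 1]
    if fraction > 0 and lower < len(sorted_values):
        # Assumption: interpolate linearly between neighbouring terms.
        value += fraction * (sorted_values[lower] - value)
    return value


def quartile_deviation(values):
    """Return (Q1, Q3, Q.D, coefficient of Q.D) using the (N + 1)/4 rule."""
    data = sorted(values)                  # step 1: ascending order
    n = len(data)
    q1 = term(data, (n + 1) / 4)
    q3 = term(data, 3 * (n + 1) / 4)
    return q1, q3, (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


hours = [3, 8.5, 12, 9, 16.5, 9, 14, 20, 18, 19, 20]
q1, q3, qd, coeff = quartile_deviation(hours)
print(q1, q3, qd, round(coeff, 4))
```

Statistical libraries use several different quartile conventions, so their results may differ slightly from the $(N+1)/4$ rule used in these notes.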
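Quartile Deviation in Discrete Series

1. Arrange the values in ascending order and calculate $N$, the total frequency, together with the cumulative frequencies.

2. $Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term, i.e. the first value whose cumulative frequency reaches that position.

3. $Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term, located in the same way.

4. Calculate $Q.D = \frac{Q_3 - Q_1}{2}$.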
Example:
1. Compute the Quartile Deviation and the Coefficient of Quartile Deviation using the
following data.
Wages in Rs.:      10   20   30   40   50   60
No. of Workers:     4    7   15    8    7    2
Solution:
Wages in Rs.   No. of Workers   Cumulative Frequency
     10               4                  4
     20               7                 11
     30              15                 26
     40               8                 34
     50               7                 41
     60               2                 43
                                      N = 43

$Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term = size of the $\left(\frac{44}{4}\right)^{th}$ term = size of the 11th term
⟹ $Q_1 = 20$ Rs

$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term = size of the $3\left(\frac{44}{4}\right)^{th}$ term = size of the 33rd term
⟹ $Q_3 = 40$ Rs

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{40 - 20}{2} = \frac{20}{2} = 10$ Rs

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{40 - 20}{40 + 20} = \frac{20}{60} = 0.3333$
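For a discrete series the quartile is read off the cumulative frequency column. The sketch below mirrors that lookup; the function name `discrete_quartile` is hypothetical, and the explicit sort is included because the cumulative frequencies are only meaningful when the values are in ascending order.

```python
def discrete_quartile(values, freqs, k):
    """k-th quartile (k = 1, 2, 3) of a discrete frequency distribution."""
    pairs = sorted(zip(values, freqs))     # values in ascending order
    n = sum(freqs)
    position = k * (n + 1) / 4             # position of the required term
    cumulative = 0
    for value, freq in pairs:
        cumulative += freq
        if cumulative >= position:         # first value covering the position
            return value
    return pairs[-1][0]                    # fallback for very small N


wages = [10, 20, 30, 40, 50, 60]
workers = [4, 7, 15, 8, 7, 2]
q1 = discrete_quartile(wages, workers, 1)
q3 = discrete_quartile(wages, workers, 3)
print(q1, q3, (q3 - q1) / 2, round((q3 - q1) / (q3 + q1), 4))
```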
2. From the following data calculate the range and the semi-interquartile range.
Ages:             20   30    40    50    60   70   80
No. of Persons:    3   61   132   153   140   51    3
Solution:
Range $= L - S = 80 - 20 = 60$ years

Ages   No. of Persons   Cumulative Frequency
 20           3                   3
 30          61                  64
 40         132                 196
 50         153                 349
 60         140                 489
 70          51                 540
 80           3                 543

$Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term = size of the $\left(\frac{544}{4}\right)^{th}$ term = size of the 136th term
⟹ $Q_1 = 40$ years

$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term = size of the $3\left(\frac{544}{4}\right)^{th}$ term = size of the 408th term
⟹ $Q_3 = 60$ years

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{60 - 40}{2} = \frac{20}{2} = 10$ years

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{60 - 40}{60 + 40} = \frac{20}{100} = 0.2$
3. Prices of shares of a company were as under from Monday to Saturday
for 40 weeks. Find the QD of the share prices.

Day:            Mon   Tue   Wed   Thu   Fri   Sat
Price:          150   200   190   210   230   180
No. of Weeks:     5     5     8    10     5     7

Solution:
The prices must be arranged in ascending order before the cumulative frequencies are formed.

Price   No. of Weeks   Cumulative Frequency
 150          5                  5
 180          7                 12
 190          8                 20
 200          5                 25
 210         10                 35
 230          5                 40

$Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term = size of the $\left(\frac{41}{4}\right)^{th}$ term = size of the 10.25th term
⟹ $Q_1 = 180$ Rs

$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term = size of the $3\left(\frac{41}{4}\right)^{th}$ term = size of the 30.75th term
⟹ $Q_3 = 210$ Rs

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{210 - 180}{2} = \frac{30}{2} = 15$ Rs

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{210 - 180}{210 + 180} = \frac{30}{390} = 0.077$
Quartile Deviation in Continuous Series
1. Find the cumulative frequency for each class.
2. Find the lower quartile class. It is the first class interval for which the cumulative frequency is $\geq \frac{N}{4}$.
3. To calculate the lower quartile, the following formula is used:

$$Q_1 = L + \frac{h}{f}\left(\frac{N}{4} - c\right)$$

where $N = \sum f$,
$L$ = lower limit of the $Q_1$ class,
$h$ = width of the $Q_1$ class,
$f$ = frequency of the $Q_1$ class,
$c$ = cumulative frequency of the class preceding the $Q_1$ class.

4. Find the upper quartile class. It is the first class interval for which the cumulative frequency is $\geq \frac{3N}{4}$.
5. To calculate the upper quartile, the following formula is used:

$$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - c\right)$$

where $L$ = lower limit of the $Q_3$ class,
$h$ = width of the $Q_3$ class,
$f$ = frequency of the $Q_3$ class,
$c$ = cumulative frequency of the class preceding the $Q_3$ class.


Problems:
1. Compute the Quartile Deviation and the Coefficient of Quartile Deviation using the following
data.
Class Interval:   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60   60 – 70   70 – 80
Frequency:           12        19         5        10         9         6         6

Solution:
Class Interval   Frequency   Cumulative Frequency
10 – 20             12               12
20 – 30             19               31
30 – 40              5               36
40 – 50             10               46
50 – 60              9               55
60 – 70              6               61
70 – 80              6               67

Here $\frac{N}{4} = \frac{67}{4} = 16.75$
∴ 20 – 30 is the $Q_1$ class.
$Q_1 = L + \frac{h}{f}\left(\frac{N}{4} - c\right)$
⟹ $Q_1 = 20 + \frac{10}{19}(16.75 - 12) = 20 + \frac{47.5}{19} = 20 + 2.5$
⟹ $Q_1 = 22.5$

Now $\frac{3N}{4} = 3(16.75) = 50.25$
∴ 50 – 60 is the $Q_3$ class.
$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - c\right)$
⟹ $Q_3 = 50 + \frac{10}{9}(50.25 - 46) = 50 + \frac{42.5}{9} = 50 + 4.72$
⟹ $Q_3 = 54.72$

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{54.72 - 22.5}{2} = \frac{32.22}{2} = 16.11$ units

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{54.72 - 22.5}{54.72 + 22.5} = \frac{32.22}{77.22} = 0.417$
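The interpolation formula $L + \frac{h}{f}\left(\frac{kN}{4} - c\right)$ can be sketched in Python as below. The function name `grouped_quartile` is hypothetical, and the sketch assumes that all classes have the same width and are listed in ascending order.

```python
def grouped_quartile(lower_limits, width, freqs, k):
    """k-th quartile of a continuous series: L + (h / f) * (k * N / 4 - c)."""
    n = sum(freqs)
    target = k * n / 4
    cumulative = 0                         # c: cumulative frequency before the class
    for lower, freq in zip(lower_limits, freqs):
        if freq > 0 and cumulative + freq >= target:
            # This is the quartile class: interpolate inside it.
            return lower + (width / freq) * (target - cumulative)
        cumulative += freq
    raise ValueError("no quartile class found")


lowers = [10, 20, 30, 40, 50, 60, 70]
freqs = [12, 19, 5, 10, 9, 6, 6]
q1 = grouped_quartile(lowers, 10, freqs, 1)
q3 = grouped_quartile(lowers, 10, freqs, 3)
print(round(q1, 2), round(q3, 2), round((q3 - q1) / 2, 2))
```

Calling the same function with `k = 2` gives the median, since the median formula has the same form with $\frac{N}{2}$ in place of $\frac{N}{4}$.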
Problems:

2. A survey of domestic consumption of electricity in a colony gave the following
distribution of units consumed. Find the quartile deviation.

Units              No. of Consumers   Cumulative Frequency
Below 100                 20                   20
100 – 200                 21                   41
200 – 300                 30                   71
300 – 400                 46                  117
400 – 500                 20                  137
500 – 600                 25                  162
600 – 700                 16                  178
Above 700                 10                  188

Solution:
Here $\frac{N}{4} = \frac{188}{4} = 47$
∴ 200 – 300 is the $Q_1$ class.
$Q_1 = L + \frac{h}{f}\left(\frac{N}{4} - c\right)$
⟹ $Q_1 = 200 + \frac{100}{30}(47 - 41) = 200 + \frac{600}{30} = 200 + 20$
⟹ $Q_1 = 220$

Here $\frac{3N}{4} = 3(47) = 141$
∴ 500 – 600 is the $Q_3$ class.
$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - c\right)$
⟹ $Q_3 = 500 + \frac{100}{25}(141 - 137) = 500 + \frac{400}{25} = 500 + 16$
⟹ $Q_3 = 516$

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{516 - 220}{2} = \frac{296}{2} = 148$ units



Problems:

3. The following are the marks obtained by 50 students in statistics. Obtain the
semi-interquartile range and its coefficient.

Marks less than:         10   20   30   40   50   60
Cumulative Frequency:     4   10   30   40   47   50

Solution:
The given frequencies are cumulative (less-than type). Converting them to class frequencies:

Marks      Frequency   Cumulative Frequency
0 – 10         4                 4
10 – 20        6                10
20 – 30       20                30
30 – 40       10                40
40 – 50        7                47
50 – 60        3                50

Here $\frac{N}{4} = \frac{50}{4} = 12.5$
∴ 20 – 30 is the $Q_1$ class.
$Q_1 = L + \frac{h}{f}\left(\frac{N}{4} - c\right)$
⟹ $Q_1 = 20 + \frac{10}{20}(12.5 - 10) = 20 + \frac{25}{20} = 20 + 1.25$
⟹ $Q_1 = 21.25$ marks

Here $\frac{3N}{4} = 3(12.5) = 37.5$
∴ 30 – 40 is the $Q_3$ class.
$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - c\right)$
⟹ $Q_3 = 30 + \frac{10}{10}(37.5 - 30) = 30 + 7.5$
⟹ $Q_3 = 37.5$ marks

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{37.5 - 21.25}{2} = \frac{16.25}{2} = 8.125$ marks

Coefficient of $Q.D = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{37.5 - 21.25}{37.5 + 21.25} = \frac{16.25}{58.75} = 0.277$
Standard Deviation (SD):
Karl Pearson introduced the concept of standard deviation in 1893.
It is the most commonly used measure of dispersion.

Standard deviation is the positive square root of the arithmetic mean of the
squares of the deviations of the observations from their arithmetic mean. It
is therefore also called Root-Mean-Square Deviation, Mean Error or Mean Square
Error.

The standard deviation is denoted by the small Greek letter 'σ' (read as
sigma).
Standard Deviation in Individual Series
Deviation taken from Actual Mean

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$$
where $N$ = number of observations.

Alternatively, we can find the standard deviation from the values directly, i.e., without
computing any deviations:

$$\sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2} \implies \sigma = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2}$$

Deviation taken from Assumed Mean

$$\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2}$$
where $d = x - A$ and $A$ = assumed mean.
Problems:
• A survey recorded the number of road accidents in a major city during 11 successive
weeks. The results are 8, 6, 3, 0, 5, 9, 2, 1, 3, 5, 2. Calculate the SD of road accidents.

Solution:
$x$:      8    6    3    0    5    9    2    1    3    5    2      $\sum x = 44$
$x^2$:   64   36    9    0   25   81    4    1    9   25    4      $\sum x^2 = 258$

$\bar{x} = \frac{\sum x}{N} = \frac{44}{11} = 4$

$\sigma = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{258}{11} - 4^2} = \sqrt{7.45} = 2.73$
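As a check on the two formulas for an individual series, the sketch below computes the standard deviation both from the definition and from the shortcut $\sqrt{\sum x^2 / N - \bar{x}^2}$. The function name `population_sd` is hypothetical. Both versions divide by $N$, as in these notes, not by $N - 1$.

```python
import math


def population_sd(values):
    """Return (mean, SD by definition, SD by the shortcut formula)."""
    n = len(values)
    mean = sum(values) / n
    # Definition: root of the mean squared deviation from the mean.
    by_definition = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    # Shortcut: mean of squares minus square of the mean.
    shortcut = math.sqrt(sum(x * x for x in values) / n - mean ** 2)
    return mean, by_definition, shortcut


accidents = [8, 6, 3, 0, 5, 9, 2, 1, 3, 5, 2]
mean, sd_def, sd_short = population_sd(accidents)
print(mean, round(sd_def, 2), round(sd_short, 2))
```

Python's standard library function `statistics.pstdev` computes the same population standard deviation.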
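Standard Deviation in Discrete Series

$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$$
where $N = \sum f_i$.

Alternatively,

$$\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2} = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}$$

Deviation taken from Assumed Mean

$$\sigma = \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$

where $d = x - A$ and $A$ = assumed mean.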
Problems:

• 25 students were given an arithmetic test. The time in minutes taken to complete the test is as
follows. Calculate the SD of the time taken to complete the test.

Time in minutes:    1   2    3   4   5
No. of Students:    4   3   10   5   3

Solution:
 $x$     $f$    $x^2$   $fx$   $fx^2$
  1      4       1       4       4
  2      3       4       6      12
  3     10       9      30      90
  4      5      16      20      80
  5      3      25      15      75
Total  N = 25           75     261

$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{75}{25} = 3$

We have $\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{261}{25} - 3^2} = \sqrt{10.44 - 9} = \sqrt{1.44}$

⟹ $\sigma = 1.2$ minutes
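The frequency-weighted formula can be sketched in Python as follows. The function name `frequency_sd` is hypothetical; for a continuous series the `values` argument would hold the class mid-points.

```python
import math


def frequency_sd(values, freqs):
    """Mean and SD of a frequency distribution: sqrt(sum(f x^2)/N - mean^2)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    mean_square = sum(f * x * x for x, f in zip(values, freqs)) / n
    return mean, math.sqrt(mean_square - mean ** 2)


# Arithmetic test: time in minutes and number of students.
mean, sd = frequency_sd([1, 2, 3, 4, 5], [4, 3, 10, 5, 3])
print(mean, round(sd, 2))
```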
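Standard Deviation in Continuous Series

$$\sigma = \sqrt{\frac{\sum f_i (x_i - \bar{x})^2}{N}}$$
where $N = \sum f_i$ and $x_i$ is the mid-point of the $i$-th class.

Alternatively,

$$\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \left(\frac{\sum f_i x_i}{N}\right)^2} = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}$$

Deviation taken from Assumed Mean

$$\sigma = h \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$$

where $h$ = width of the class interval,
$d = \frac{x_i - A}{h}$ and $A$ = assumed mean.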

Problems:

1. A study of 100 engineering companies gives the following information. Find the SD of the
profit earned.

Profit (in Crore):    0 – 10   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60
No. of Companies:        8        12        20        30        20        10

Solution:
Class Intervals   $f$    Mid Point ($x$)   $x^2$    $fx$     $fx^2$
0 – 10             8          5              25       40       200
10 – 20           12         15             225      180      2700
20 – 30           20         25             625      500     12500
30 – 40           30         35            1225     1050     36750
40 – 50           20         45            2025      900     40500
50 – 60           10         55            3025      550     30250
Total          N = 100                              3220    122900

$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{3220}{100} = 32.2$

We have $\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{122900}{100} - 32.2^2} = \sqrt{1229 - 1036.84} = \sqrt{192.16}$

⟹ $\sigma = 13.86$ crore
2. The profits (in lakhs) earned by 100 companies are shown below. Compute the
standard deviation.
Profits:             20 – 30   30 – 40   40 – 50   50 – 60   60 – 70   70 – 80   80 – 90   90 – 100
No. of Companies:       4         8        18        30        15        10         8         7

Solution:
Class Intervals   $f$    Mid Point ($x$)   $x^2$    $fx$     $fx^2$
20 – 30            4         25             625      100      2500
30 – 40            8         35            1225      280      9800
40 – 50           18         45            2025      810     36450
50 – 60           30         55            3025     1650     90750
60 – 70           15         65            4225      975     63375
70 – 80           10         75            5625      750     56250
80 – 90            8         85            7225      680     57800
90 – 100           7         95            9025      665     63175
Total          N = 100                              5910    380100

$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{5910}{100} = 59.1$

We have $\sigma = \sqrt{\frac{\sum f_i x_i^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{380100}{100} - 59.1^2} = \sqrt{3801 - 3492.81} = \sqrt{308.19}$

⟹ $\sigma = 17.56$ lakhs
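Problems:
3. Find the coefficient of dispersion based on S.D. from the following information.

Wages in Rs.:      70 – 80   80 – 90   90 – 100   100 – 110   110 – 120   120 – 130
No. of Persons:       12        18        35          42          50          45

Solution:
Taking $A = 105$ and $h = 10$, with $d = \frac{x_i - A}{h}$:

Class Intervals   $f$    Mid Point ($x$)   $d$   $d^2$   $fd$   $fd^2$
70 – 80           12         75            −3      9     −36     108
80 – 90           18         85            −2      4     −36      72
90 – 100          35         95            −1      1     −35      35
100 – 110         42        105             0      0       0       0
110 – 120         50        115             1      1      50      50
120 – 130         45        125             2      4      90     180
Total            202                                       33     445

We have $\sigma = h \sqrt{\frac{\sum f d^2}{N} - \left(\frac{\sum f d}{N}\right)^2}$

⟹ $\sigma = 10 \sqrt{\frac{445}{202} - \left(\frac{33}{202}\right)^2} = 10 \sqrt{2.203 - 0.027} = 10 \sqrt{2.176}$

⟹ $\sigma = 14.75$ Rs

The mean is $\bar{x} = A + h \frac{\sum f d}{N} = 105 + 10 \times \frac{33}{202} = 106.63$ Rs, so the coefficient
of dispersion based on S.D. is $\frac{\sigma}{\bar{x}} = \frac{14.75}{106.63} = 0.138$ (that is, C.V = 13.83).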
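The assumed-mean (step-deviation) method can be sketched in Python as below. The function name `step_deviation_sd` is hypothetical. The mean is recovered with the standard identity $\bar{x} = A + h \frac{\sum f d}{N}$, which is a well-known companion of the step-deviation method and is stated here as an addition to the formulas given for $\sigma$.

```python
import math


def step_deviation_sd(midpoints, freqs, assumed_mean, width):
    """Mean and SD via step deviations d = (x - A) / h."""
    n = sum(freqs)
    d = [(x - assumed_mean) / width for x in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))
    mean = assumed_mean + width * sum_fd / n
    sd = width * math.sqrt(sum_fd2 / n - (sum_fd / n) ** 2)
    return mean, sd


midpoints = [75, 85, 95, 105, 115, 125]
persons = [12, 18, 35, 42, 50, 45]
mean, sd = step_deviation_sd(midpoints, persons, assumed_mean=105, width=10)
print(round(mean, 2), round(sd, 2), round(sd / mean * 100, 2))
```

The result does not depend on the choice of $A$; any convenient mid-point gives the same mean and standard deviation, only the intermediate sums change.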
Variance:
Variance is the square of the standard deviation. It is calculated as
$$\text{Variance} = \sigma^2$$
Coefficient of Variation (C.V):
Standard deviation is an absolute measure of dispersion. The
corresponding relative measure is known as the coefficient of
variation.
The data set (or group) for which the coefficient of variation is
greater is said to be more variable, or less consistent, less
uniform, less stable. On the other hand, the data set for which the
coefficient of variation is smaller is said to be less variable, or
more consistent, more uniform, more stable.
It is widely used for comparing two or more data sets.

$$\text{C.V} = \frac{\sigma}{\bar{x}} \times 100$$

where $\sigma$ = standard deviation and $\bar{x}$ = arithmetic mean.

Note: $\bar{x}$ is taken as the measure of efficiency, while C.V can be
used to measure consistency.
Problems:

1. Prices of a particular commodity over five years in two cities are given below.

(a) In which city are the prices higher?

(b) In which city are the prices more stable?

Price in City A (in Rs):   20   22   19   23   16
Price in City B (in Rs):   10   20   18   12   15

Solution:
      City A               City B
  $x$     $x^2$        $x$     $x^2$
   20      400          10      100
   22      484          20      400
   19      361          18      324
   23      529          12      144
   16      256          15      225
  100     2030          75     1193     (totals: $\sum x$, $\sum x^2$)

City A: $\bar{x} = \frac{\sum x}{N} = \frac{100}{5} = 20$ Rs, $\sigma = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2} = \sqrt{\frac{2030}{5} - 20^2} = 2.45$
⟹ C.V $= \frac{\sigma}{\bar{x}} \times 100 = \frac{2.45}{20} \times 100 = 12.25$

City B: $\bar{x} = \frac{75}{5} = 15$ Rs, $\sigma = \sqrt{\frac{1193}{5} - 15^2} = 3.69$
⟹ C.V $= \frac{3.69}{15} \times 100 = 24.6$

(a) Since the A.M. for City A is more than the A.M. for City B, the prices in City A are higher.

(b) Since C.V of City A < C.V of City B, prices in City A are more stable.
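A comparison of this kind can be sketched in Python as below. The function name `coefficient_of_variation` is hypothetical, and the standard deviation divides by $N$, following the formula used in these notes.

```python
import math


def coefficient_of_variation(values):
    """Return (mean, SD, C.V in percent) for raw data."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum(x * x for x in values) / n - mean ** 2)
    return mean, sd, sd / mean * 100


city_a = [20, 22, 19, 23, 16]
city_b = [10, 20, 18, 12, 15]
mean_a, sd_a, cv_a = coefficient_of_variation(city_a)
mean_b, sd_b, cv_b = coefficient_of_variation(city_b)

# Higher mean -> higher prices; lower C.V -> more stable prices.
print("higher prices:", "A" if mean_a > mean_b else "B")
print("more stable:", "A" if cv_a < cv_b else "B")
```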
Problems:

2. The following two data sets give the number of customers in two different
shops. Find out which set is more consistent.

Day of the Week:               Mon   Tue   Wed   Thur   Fri    Sat   Sun
Number of Customers, Set A:    200   500   300   1000    400    -     -
Number of Customers, Set B:    100   300   400   1000   1500   500   100

Solution:
       Set A                    Set B
  $x$       $x^2$          $x$       $x^2$
  200       40000          100       10000
  500      250000          300       90000
  300       90000          400      160000
 1000     1000000         1000     1000000
  400      160000         1500     2250000
    -           -          500      250000
    -           -          100       10000
 2400     1540000         3900     3770000     (totals: $\sum x$, $\sum x^2$)

Set A: $\bar{x} = \frac{\sum x}{N} = \frac{2400}{5} = 480$, $\sigma = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2} = \sqrt{\frac{1540000}{5} - 480^2} = 278.57$
⟹ C.V $= \frac{\sigma}{\bar{x}} \times 100 = \frac{278.57}{480} \times 100 = 58.04$

Set B: $\bar{x} = \frac{3900}{7} = 557.14$, $\sigma = \sqrt{\frac{3770000}{7} - 557.14^2} = 477.67$
⟹ C.V $= \frac{477.67}{557.14} \times 100 = 85.74$

Here C.V of Set A < C.V of Set B

⟹ Data Set A is more consistent.


Problems:

3. Goals scored by two teams A and B in a football season are as follows. Find out which
team is better and which team is more consistent.

No. of Goals:   0    1   2   3   4
Team A:        27    9   8   5   4
Team B:        17    9   6   5   3

Solution:
No. of goals          Team A                      Team B
    $x$         $f$   $x^2$  $fx^2$  $fx$     $f$   $x^2$  $fx^2$  $fx$
     0          27      0      0      0       17      0      0      0
     1           9      1      9      9        9      1      9      9
     2           8      4     32     16        6      4     24     12
     3           5      9     45     15        5      9     45     15
     4           4     16     64     16        3     16     48     12
   Total        53            150     56       40            126     48

Team A: $\bar{x} = \frac{\sum fx}{N} = \frac{56}{53} = 1.06$, $\sigma = \sqrt{\frac{\sum fx^2}{N} - \bar{x}^2} = \sqrt{\frac{150}{53} - 1.06^2} = 1.306$
⟹ C.V $= \frac{\sigma}{\bar{x}} \times 100 = \frac{1.306}{1.06} \times 100 = 123.21$

Team B: $\bar{x} = \frac{48}{40} = 1.2$, $\sigma = \sqrt{\frac{126}{40} - 1.2^2} = 1.307$
⟹ C.V $= \frac{1.307}{1.2} \times 100 = 108.92$

Since the mean for Team B is more than the mean for Team A, Team B is the better team.

Here C.V of Team B < C.V of Team A

⟹ Team B is more consistent in scoring goals as compared to Team A.



Problems:

4. Lives of 2 models of refrigerators were studied in a survey. Based on the data given
below, which model has a longer life and which has more uniformity?

Life (No. of years):          0 – 2   2 – 4   4 – 6   6 – 8   8 – 10   10 – 12
No. of Refrigerators, A:        5      16      13       7       5         4
No. of Refrigerators, B:        2       7      12      19       9         1

Solution:
Life (No.   Mid Point        Refrigerator A                Refrigerator B
of Years)     $x$       $f$   $x^2$  $fx^2$  $fx$      $f$   $x^2$  $fx^2$  $fx$
0 – 2           1         5      1      5      5         2      1      2      2
2 – 4           3        16      9    144     48         7      9     63     21
4 – 6           5        13     25    325     65        12     25    300     60
6 – 8           7         7     49    343     49        19     49    931    133
8 – 10          9         5     81    405     45         9     81    729     81
10 – 12        11         4    121    484     44         1    121    121     11
Total                  N = 50        1706    256      N = 50        2146    308

Refrigerator A: $\bar{x} = \frac{\sum fx}{N} = \frac{256}{50} = 5.12$, $\sigma = \sqrt{\frac{\sum fx^2}{N} - \bar{x}^2} = \sqrt{\frac{1706}{50} - 5.12^2} = 2.81$
⟹ C.V $= \frac{\sigma}{\bar{x}} \times 100 = \frac{2.81}{5.12} \times 100 = 54.88$

Refrigerator B: $\bar{x} = \frac{308}{50} = 6.16$, $\sigma = \sqrt{\frac{2146}{50} - 6.16^2} = 2.23$
⟹ C.V $= \frac{2.23}{6.16} \times 100 = 36.21$

Here the mean of Refrigerator B > the mean of Refrigerator A, and C.V of Refrigerator B < C.V of Refrigerator A

⟹ Refrigerator B has a longer life and more uniformity.
Problems:

5. An analysis of the monthly wages of the workers of two organizations C and D gave the
following results.

(a) Which organization, C or D, pays the larger amount as monthly wages?

(b) Which organization has more homogeneity in wages?

                                      C          D
No. of Workers                       500        600
Average monthly wage               Rs. 186    Rs. 175
Variance of distribution of wages    81         100

Solution:
Given            C          D
$N$             500        600
$\bar{x}$     Rs. 186    Rs. 175
$\sigma^2$       81        100

(a) Total wage (C) = mean × number of workers = 186 × 500 = Rs. 93000
Total wage (D) = mean × number of workers = 175 × 600 = Rs. 105000

Organization D pays the larger amount as monthly wages.

(b) C.V (C) $= \frac{\sigma}{\bar{x}} \times 100 = \frac{9}{186} \times 100 = 4.84$
C.V (D) $= \frac{\sigma}{\bar{x}} \times 100 = \frac{10}{175} \times 100 = 5.71$

Here C.V of C < C.V of D

Hence organization C has more homogeneity in wages.


Problems:

6. The following information relating to wages per employee is given for two factories.
(a) Which factory's wage distribution is more consistent?
(b) Which factory's weekly wage bill is lower?

                                                     Factory A   Factory B
Number of employees                                      50         100
Average wages per employee per week (Rs.)              1200         850
Variance of the wages per employee per week              81          64

Solution:
Given         Factory A   Factory B
$N$               50         100
$\bar{x}$       1200         850
$\sigma^2$        81          64

(a) C.V (Factory A) $= \frac{\sigma}{\bar{x}} \times 100 = \frac{9}{1200} \times 100 = 0.75$
C.V (Factory B) $= \frac{\sigma}{\bar{x}} \times 100 = \frac{8}{850} \times 100 = 0.94$

Here C.V of Factory A < C.V of Factory B

Hence the wage distribution of Factory A is more consistent.

(b) Total wage (Factory A) = mean × number of employees = 1200 × 50 = Rs. 60000
Total wage (Factory B) = mean × number of employees = 850 × 100 = Rs. 85000

The wage bill of Factory A is lower.


SKEWNESS:
Consider the following frequency distributions which give the scores obtained by the students
who were studying in Commerce, Humanities and Science groups.

1. Scores : 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60

No. of students : 5 8 15 15 8 5

2. Scores : 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60

No. of students : 4 7 16 11 7 5

3. Scores : 0 – 10 10 – 20 20 – 30 30 – 40 40 – 50 50 – 60

No. of students : 5 7 11 1 6 7
The mean and variance of the above distributions are the same, but they differ widely in their overall
appearance, as can be seen from the following diagrams.
In figure (1) the parts to the right and left of the mode (highest ordinate) are perfect mirror
images of one another. Such distributions are called symmetric distributions.
In figure (2) there are more items on the right side of the mode, and the curve has a
longer tail to the right side of the mode.
In figure (3) there are more items on the left of the mode, and the curve has a longer tail to the left of
the mode.
When frequency curves are drawn for different frequency distributions,
there is an apparent common characteristic, which is striking to the eye,
called symmetry or lack of symmetry.

The lack of symmetry of a distribution is known as skewness.

Here it is clear that figures (2) and (3) are not symmetric or they are
skewed. Thus, there are 2 types of skewness:

1) Positive skewness (long tail to the right)

2) Negative skewness (long tail to the left)


Symmetric Frequency Curve:
• Mean = Median = Mode

• The highest ordinate (mode) divides the total area under the curve into two equal parts.
Positive Skewness:

The frequency curve is said to be positively skewed if more items are
found to the right side of the mode. In this case the frequency curve
will have a longer tail to the right. Also, Mode, Median and Mean are
in ascending order of magnitude (Mean > Median > Mode).
Negative Skewness:

The frequency curve is said to be negatively skewed if more items
are found to the left side of the mode. In this case the curve will have
a longer tail to the left, and Mode, Median and Mean are in
descending order of magnitude (Mean < Median < Mode).
Measures of Skewness:
Measures of skewness indicate to what extent and in what direction the
distribution of a variable departs from the symmetry of a frequency curve. They
give information about the shape of the distribution and the degree of
variation on either side of the central value.

There are 3 important measures of Skewness

1. Karl Pearson’s Co-efficient of Skewness

2. Bowley’s Co-efficient of Skewness

3. Co-efficient of Skewness based on Moments.


Karl Pearson’s Co-efficient of Skewness:
Karl Pearson derived a coefficient of skewness, denoted by $S_k$, which is defined as

$$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$$

Since $Z = 3M - 2\bar{x}$ (the empirical relation between mean, median and mode),

$$S_k = \frac{3(\bar{x} - M)}{\sigma}$$

where $M$ is the median.

Note:

• For positive skewness, $S_k > 0$ (since Mean > Mode)

• For negative skewness, $S_k < 0$ (since Mean < Mode)

• For symmetry, $S_k = 0$ (since Mean = Mode)


Example:
1. For a distribution, Mean = 30, Mode = 26.8 and Variance = 64. Find the coefficient of
skewness. Interpret the result.

Solution:

Given $\bar{x} = 30$, Mode $= 26.8$, $\sigma^2 = 64$

Then S.D, $\sigma = 8$

Karl Pearson’s coefficient of skewness

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$

⟹ $S_k = \frac{30 - 26.8}{8} = 0.4$

Since $S_k > 0$, the distribution is positively skewed.


2. For a group of 20 items, $\sum x = 1452$, $\sum x^2 = 144280$ and Mode = 63.7. Obtain Karl
Pearson’s coefficient of skewness.
Solution:

Given $N = 20$, $\sum x = 1452$, $\sum x^2 = 144280$, Mode = 63.7

Then, $\bar{x} = \frac{\sum x}{N} = \frac{1452}{20} = 72.6$

S.D, $\sigma = \sqrt{\frac{\sum x^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{144280}{20} - 72.6^2} = \sqrt{7214 - 5270.76}$

⟹ $\sigma = 44.08$

Karl Pearson’s coefficient of skewness

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$

⟹ $S_k = \frac{72.6 - 63.7}{44.08}$

⟹ $S_k = 0.2019$

Since $S_k > 0$, the distribution is positively skewed.


3. The numbers of accidents reported at a city hospital in a week are as follows: 40, 62,
40, 25, 40, 34 and 60. Calculate Karl Pearson’s coefficient of skewness.

Solution:

The observations in ascending order are 25, 34, 40, 40, 40, 60, 62
Here, Mode = 40 (the most frequently occurring item)

Arithmetic mean, $\bar{x} = \frac{\sum x}{n} = \frac{301}{7} = 43$

S.D, $\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$

⟹ $\sigma = \sqrt{\frac{14025}{7} - 43^2}$

⟹ $\sigma = \sqrt{2003.57 - 1849}$

⟹ $\sigma = 12.43$

Karl Pearson’s coefficient of skewness

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$

⟹ $S_k = \frac{43 - 40}{12.43} = 0.24$

Since $S_k > 0$, the distribution is positively skewed.
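For raw data, Karl Pearson's coefficient can be sketched in Python as below. The function name `pearson_skewness` is hypothetical, and taking the most frequent value as the mode assumes that the data have a single clear mode, as in this example.

```python
import math
from collections import Counter


def pearson_skewness(values):
    """Karl Pearson's coefficient (mean - mode) / SD for raw data."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum(x * x for x in values) / n - mean ** 2)
    # Assumption: the data have one clear mode (most frequent value).
    mode = Counter(values).most_common(1)[0][0]
    return (mean - mode) / sd


accidents = [40, 62, 40, 25, 40, 34, 60]
sk = pearson_skewness(accidents)
print(round(sk, 2), "positive" if sk > 0 else "negative" if sk < 0 else "symmetric")
```

When the mode is ill-defined, the median-based form $S_k = \frac{3(\bar{x} - M)}{\sigma}$ given earlier is the usual alternative.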


4. Calculate Karl Pearson’s coefficient of skewness for the following data.
Age:              20   30    40    50    60   70   80
No. of Persons:    3   61   132   153   140   51    3

Solution:
 $x$     $f$    $x^2$    $fx$     $fx^2$
 20       3      400       60      1200
 30      61      900     1830     54900
 40     132     1600     5280    211200
 50     153     2500     7650    382500
 60     140     3600     8400    504000
 70      51     4900     3570    249900
 80       3     6400      240     19200
Total   543             27030   1422900

Here Mode = 50 (the value with the highest frequency)

$\bar{x} = \frac{\sum fx}{N} = \frac{27030}{543} = 49.78$

$\sigma = \sqrt{\frac{\sum fx^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{1422900}{543} - 49.78^2}$

⟹ $\sigma = 11.93$

Karl Pearson’s coefficient of skewness

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$

⟹ $S_k = \frac{49.78 - 50}{11.93} = -0.018$

Since $S_k < 0$, the distribution is negatively skewed.


5. The monthly income distribution of 100 persons living in a village is given below.
Determine (a) Mode (b) Standard Deviation (c) Coefficient of Skewness of this
distribution.
Income (in 1000’s):   0 – 10   10 – 20   20 – 30   30 – 40   40 – 50   50 – 60
No. of Persons:         12       18        27        20        17         6

Solution:
(a) Here the highest frequency = 27

Hence the modal class is 20 – 30

Mode $= L + h \cdot \frac{f_1 - f_0}{2f_1 - f_0 - f_2}$

where $f_1$ is the frequency of the modal class, and $f_0$ and $f_2$ are the frequencies of the classes before and after it.

⟹ Mode $= 20 + 10 \cdot \frac{27 - 18}{2(27) - 18 - 20} = 20 + 10 \cdot \frac{9}{54 - 38} = 20 + 10 \cdot \frac{9}{16}$

⟹ Mode $= 20 + 5.625$

⟹ Mode $= 25.63$

(b)
Class      Mid Point ($x$)   Frequency ($f$)   $x^2$    $fx$    $fx^2$
0 – 10           5                12              25      60      300
10 – 20         15                18             225     270     4050
20 – 30         25                27             625     675    16875
30 – 40         35                20            1225     700    24500
40 – 50         45                17            2025     765    34425
50 – 60         55                 6            3025     330    18150
Total                            100                    2800    98300

Now, $\bar{x} = \frac{\sum fx}{N} = \frac{2800}{100} = 28$

$\sigma = \sqrt{\frac{\sum fx^2}{N} - \bar{x}^2}$

⟹ $\sigma = \sqrt{\frac{98300}{100} - 28^2} = \sqrt{983 - 784}$

⟹ $\sigma = 14.11$

(c) Karl Pearson’s coefficient of skewness

$S_k = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}} = \frac{\bar{x} - Z}{\sigma}$

⟹ $S_k = \frac{28 - 25.63}{14.11} = 0.168$

Since $S_k > 0$, the distribution is positively skewed.
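The three steps of this problem (mode of grouped data, standard deviation, coefficient of skewness) can be sketched in Python as below. The function names are hypothetical, equal class widths are assumed, and treating a missing neighbouring class as having frequency 0 is a common convention that the notes do not state.

```python
import math


def grouped_mode(lower_limits, width, freqs):
    """Mode = L + h * (f1 - f0) / (2 * f1 - f0 - f2) for the modal class."""
    i = max(range(len(freqs)), key=freqs.__getitem__)    # modal class index
    f1 = freqs[i]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f0 = freqs[i - 1] if i > 0 else 0
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0
    return lower_limits[i] + width * (f1 - f0) / (2 * f1 - f0 - f2)


def grouped_mean_sd(midpoints, freqs):
    """Mean and SD from class mid-points and frequencies."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    mean_square = sum(f * x * x for x, f in zip(midpoints, freqs)) / n
    return mean, math.sqrt(mean_square - mean ** 2)


lowers = [0, 10, 20, 30, 40, 50]
persons = [12, 18, 27, 20, 17, 6]
mode = grouped_mode(lowers, 10, persons)
mean, sd = grouped_mean_sd([low + 5 for low in lowers], persons)
sk = (mean - mode) / sd
print(mode, mean, round(sd, 2), round(sk, 3))
```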


Bowley’s Co-efficient of Skewness:
Sir Arthur Bowley derived a measure of skewness based on quartiles, which is
known as Bowley’s coefficient of skewness, denoted by $S_B$, and is defined as:
$$S_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$$

Note:

• The value of $S_B$ lies between −1 and +1.

• If $S_B > 0$, then the distribution is positively skewed.

• If $S_B < 0$, then the distribution is negatively skewed.

• If $S_B = 0$, then the distribution is symmetric.


Example:
1. For a certain distribution the upper and lower quartiles are 56 and 44
respectively. If the median for the same data is 55, identify the
nature of skewness.
Solution:
Given
$Q_1 = 44$, $Q_3 = 56$, Median $= 55$
Bowley’s coefficient of skewness,
$S_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

⟹ $S_B = \frac{56 + 44 - 2(55)}{56 - 44} = \frac{100 - 110}{12} = \frac{-10}{12}$

⟹ $S_B = -0.83$

Since $S_B < 0$, the distribution is negatively skewed.
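Bowley's coefficient needs only the three quartile values, so it is short to sketch in Python. The function names below are hypothetical.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2 * Median) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


def describe(coefficient):
    """Interpret the sign of a skewness coefficient."""
    if coefficient > 0:
        return "positively skewed"
    if coefficient < 0:
        return "negatively skewed"
    return "symmetric"


sb = bowley_skewness(q1=44, median=55, q3=56)
print(round(sb, 2), describe(sb))
```

Because the median always lies between $Q_1$ and $Q_3$, the numerator can never exceed the denominator in absolute value, which is why $S_B$ stays between −1 and +1.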


2. Calculate Bowley’s coefficient of skewness for the following data.
No. of Calls:   0    1    2    3    4    5    6    7
Frequency:     14   21   25   43   51   40   39   12

Solution:
No. of Calls   Frequency   Cumulative Frequency
     0            14               14
     1            21               35
     2            25               60
     3            43              103
     4            51              154
     5            40              194
     6            39              233
     7            12              245

Here $N = 245$

$Q_1$ = size of the $\left(\frac{N+1}{4}\right)^{th}$ term = size of the $\left(\frac{246}{4}\right)^{th}$ term = size of the 61.5th term
⟹ $Q_1 = 3$

$Q_3$ = size of the $3\left(\frac{N+1}{4}\right)^{th}$ term = size of the $3(61.5)^{th}$ term = size of the 184.5th term
⟹ $Q_3 = 5$

Median = size of the $\left(\frac{N+1}{2}\right)^{th}$ term = size of the $\left(\frac{246}{2}\right)^{th}$ term = size of the 123rd term
⟹ Median $= 4$

∴ Bowley’s coefficient of skewness,
$S_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

⟹ $S_B = \frac{5 + 3 - 2(4)}{5 - 3} = \frac{8 - 8}{2}$

⟹ $S_B = 0$

Since $S_B = 0$, the given distribution is symmetric.
3. Calculate the quartile deviation and Bowley’s measure of skewness from
the following data.

Commission (Rs.)   No. of Salesmen   Cumulative Frequency
110 – 115                 4                   4
115 – 120                10                  14
120 – 125                20                  34
125 – 130                49                  83
130 – 135                72                 155
135 – 140                90                 245
140 – 145                52                 297
145 – 150                33                 330
150 – 155                17                 347
155 – 160                 7                 354

Solution:
Here $\frac{N}{4} = \frac{354}{4} = 88.5$
∴ 130 – 135 is the $Q_1$ class.
$Q_1 = L + \frac{h}{f}\left(\frac{N}{4} - c\right)$
⟹ $Q_1 = 130 + \frac{5}{72}(88.5 - 83) = 130 + \frac{27.5}{72} = 130 + 0.38$
⟹ $Q_1 = 130.38$

Here $\frac{3N}{4} = 3(88.5) = 265.5$
∴ 140 – 145 is the $Q_3$ class.
$Q_3 = L + \frac{h}{f}\left(\frac{3N}{4} - c\right)$
⟹ $Q_3 = 140 + \frac{5}{52}(265.5 - 245) = 140 + \frac{102.5}{52} = 140 + 1.97$
⟹ $Q_3 = 141.97$

Here $\frac{N}{2} = \frac{354}{2} = 177$
∴ 135 – 140 is the Median class.
Median $= L + \frac{h}{f}\left(\frac{N}{2} - c\right)$
⟹ Median $= 135 + \frac{5}{90}(177 - 155) = 135 + \frac{110}{90} = 135 + 1.22$
⟹ Median $= 136.22$

∴ $Q.D = \frac{Q_3 - Q_1}{2} = \frac{141.97 - 130.38}{2} = \frac{11.59}{2} = 5.80$ Rs

∴ Bowley’s coefficient of skewness,

$S_B = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$

⟹ $S_B = \frac{141.97 + 130.38 - 2(136.22)}{141.97 - 130.38} = \frac{272.35 - 272.44}{11.59} = \frac{-0.09}{11.59}$

⟹ $S_B = -0.008$

Since $S_B < 0$, the given distribution is negatively skewed.


KURTOSIS:
Two or more distributions might have the same measures of central
tendency, dispersion and skewness, yet show different degrees
of concentration of the observations around the mode, and hence
different degrees of peakedness.
Kurtosis is the measure of peakedness or flatness of the
frequency distribution.
In figure 1 the curve is more peaked than the others and is called Leptokurtic.
In figure 2 the curve is less peaked than the others (it is more flat-topped) and is called
Platykurtic.
In figure 3 the curve is neither more peaked nor more flat-topped (it is
moderately peaked) and is called Mesokurtic (or Natural Curve or Normal Curve).
