3 - Measures of Variation

The document discusses measures of variation and dispersion in data sets, emphasizing the importance of understanding variability alongside central tendency. It covers various measures such as range, interquartile range, variance, standard deviation, and coefficient of variation, providing examples and calculations for clarity. Additionally, it explains the five-number summary and the identification of outliers using quartiles and IQR.

Uploaded by

khadijashafique72


Measures of Variation / Dispersion / Spread

• The arithmetic mean is a concise way of presenting data, but it is inadequate on its own because it gives no indication of its reliability.
• Two data sets can have the same mean and still be quite different with respect to the variation among the values within each data set.

Data 1   Data 2
49       20
50       50
51       80

Comparing means (both are 50), the two data sets look the same, but they are quite different in terms of the variability among the values within each data set.
Measures of Variation / Dispersion / Spread
• Measures of variation express the variation present among the values in a data set as a single number, so they are summary measures of the spread of values in the data.
• A measure of central tendency along with a measure of dispersion gives an adequate description of statistical data.

Measures of Variability
• Distance Based Measures of Spread
– Range
– Interquartile Range
• Centre Based Measures of Spread
– Variance
– Standard Deviation
– Coefficient of Variation
Range
The range of a data set is the difference between the
largest and smallest data values.
$\text{Range} = X_{\text{Largest}} - X_{\text{Smallest}}$
Example: The following data sets represent the plant height
in cm for two varieties of wheat, Pak 81 and LU 26

Pak 81 80 82 83 85 86 88 89 90
LU 26 50 81 83 85 85 87 88 90

Range Pak81=90-80= 10 cm
Range LU 26=90-50= 40 cm
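
The range calculation for the two wheat varieties above can be checked with a minimal Python sketch. The helper name `data_range` is a hypothetical choice for illustration, not something from the slides.

```python
# Plant heights in cm, copied from the example above.
pak_81 = [80, 82, 83, 85, 86, 88, 89, 90]
lu_26 = [50, 81, 83, 85, 85, 87, 88, 90]


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print("Range Pak 81:", data_range(pak_81), "cm")
print("Range LU 26:", data_range(lu_26), "cm")
```

Only the two extreme values enter the calculation, which is why the single short plant (50 cm) in LU 26 is enough to quadruple its range.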
Dot Plots for Both Data Sets
[Figure: dot plot of height (cm) by variety, LU 26 and Pak 81]
Disadvantages of the Range

• Ignores the intermediate observations
  [Figure: two dot plots on a scale from 7 to 12 whose intermediate values differ]
  Range = 12 - 7 = 5 for both data sets

• Sensitive to outliers
  1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5
  Range = 5 - 1 = 4

  1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 120
  Range = 120 - 1 = 119

InterQuartile Range (IQR)
• The interquartile range of a data set is the difference
between the third quartile (Q3) and the first quartile (Q1).
• It is the range for the middle 50% of the data.
• It overcomes the range's sensitivity to extreme data values.

Interquartile range (IQR) = Q3 - Q1

(a) Not affected by the presence of outliers in the data

(b) Based on the middle 50% of the observations
IQR
Example: The following data sets represent the plant height
in cm for two varieties of wheat, Pak 81 and LU 26

Pak 81 80 82 83 85 86 88 89 90
LU 26 50 81 83 85 85 87 88 90

Range Pak 81 = 90 - 80 = 10 cm
Range LU 26 = 90 - 50 = 40 cm

IQR Pak 81 = 88.75 - 82.25 = 6.50 cm

IQR LU 26 = 87.75 - 81.50 = 6.25 cm
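
The quartiles quoted above are consistent with placing the p-th quantile at position (n + 1)p in the ordered data and interpolating linearly. That rule is inferred from the numbers, not stated on the slides. Under that assumption, Python's `statistics.quantiles` with `method="exclusive"` gives the same values. The helper name `iqr` is hypothetical.

```python
from statistics import quantiles

# Plant heights in cm, copied from the example above.
pak_81 = [80, 82, 83, 85, 86, 88, 89, 90]
lu_26 = [50, 81, 83, 85, 85, 87, 88, 90]


def iqr(values):
    """Interquartile range Q3 - Q1 using the (n + 1)p position rule."""
    q1, _q2, q3 = quantiles(values, n=4, method="exclusive")
    return q3 - q1


print("Quartiles Pak 81:", quantiles(pak_81, n=4, method="exclusive"))
print("Quartiles LU 26:", quantiles(lu_26, n=4, method="exclusive"))
print("IQR Pak 81:", iqr(pak_81), "cm")
print("IQR LU 26:", iqr(lu_26), "cm")
```

Other quartile conventions (for example `method="inclusive"`) give slightly different quartiles for small samples, so the rule in use should be stated when reporting an IQR.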
The quartiles split the ordered data into four parts, each holding 25% of the observations.

         Minimum   Q1      Median (Q2)   Q3      Maximum   IQR
Pak 81   80        82.25   85.50         88.75   90        88.75 - 82.25 = 6.50 cm
LU 26    50        81.50   85            87.75   90        87.75 - 81.50 = 6.25 cm
Centre Based Measures
Idea: Select a single value as the reference value (the ideal reference
value is the mean of the data), take the deviations of the values from
the mean, and sum these deviations.
Example: The following data represent the yield per plot (in Kg) of
three wheat varieties A, B and C. Compare the yield
performance of the three varieties.
XA XB XC
40 30 10
50 50 50
60 70 90

$X_A$   $X_B$   $X_C$   $X_A - \bar{X}_A$   $X_B - \bar{X}_B$   $X_C - \bar{X}_C$
40      30      10      -10                 -20                 -40
50      50      50      0                   0                   0
60      70      90      10                  20                  40
Problem: The sum of the deviations from the mean is always 0,
$\sum (X - \bar{X}) = 0$
because positive and negative deviations cancel, regardless of the spread of
the values in the data. Hence the sum of the deviations from the mean cannot
be used as a measure of spread in the data.
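
One way to see why the deviations always cancel, assuming the usual definition of the mean, $\bar{X} = \frac{1}{n}\sum X_i$ (it is not written out on the slide):

```latex
\sum_{i=1}^{n} (X_i - \bar{X})
  = \sum_{i=1}^{n} X_i - n\bar{X}
  = n\bar{X} - n\bar{X}
  = 0
```

For variety A this is (-10) + 0 + 10 = 0, and the same holds for B and C even though their values are far more spread out.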
[Figure: dot plot of yield for varieties A, B and C]
Solution: Square the deviations and then sum the squared
deviations to get rid of the sign-cancelling problem:
$\sum (X - \bar{X})^2$

$X_A$   $X_B$   $X_C$   $(X_A - \bar{X}_A)^2$   $(X_B - \bar{X}_B)^2$   $(X_C - \bar{X}_C)^2$
40      30      10      100                     400                     1600
50      50      50      0                       0                       0
60      70      90      100                     400                     1600
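Variance: Average of the squared deviations from the mean

$S_A^2 = \frac{\sum (X_A - \bar{X}_A)^2}{n} = \frac{200}{3} = 66.67 \text{ Kg}^2$

$S_B^2 = \frac{\sum (X_B - \bar{X}_B)^2}{n} = \frac{800}{3} = 266.67 \text{ Kg}^2$

$S_C^2 = \frac{\sum (X_C - \bar{X}_C)^2}{n} = \frac{3200}{3} = 1066.67 \text{ Kg}^2$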
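
As a check on the three variances, here is a minimal Python sketch that averages the squared deviations with divisor n, as the formulas above do. The function name `variance_n` is hypothetical. The standard library's `statistics.pvariance` uses the same divisor and is shown for comparison.

```python
from statistics import pvariance

# Yield per plot for the three wheat varieties, from the example above.
yields = {
    "A": [40, 50, 60],
    "B": [30, 50, 70],
    "C": [10, 50, 90],
}


def variance_n(values):
    """Average of the squared deviations from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


for name, values in yields.items():
    # Both columns should agree: hand-rolled formula vs. library routine.
    print(name, round(variance_n(values), 2), round(pvariance(values), 2))
```

Note that `statistics.variance` (without the `p`) divides by n - 1 instead and would give larger values for the same data.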
Alternative formula (Desk formula) for variance

       X      X²
       40     1600
       50     2500
       60     3600
Sum    150    7700

$S^2 = \frac{\sum (X_A - \bar{X}_A)^2}{n} = 66.67$

$S^2 = \frac{1}{n}\left(\sum X^2 - \frac{(\sum X)^2}{n}\right) = \frac{1}{3}\left(7700 - \frac{(150)^2}{3}\right) = 66.67$

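
The desk formula is not derived on the slide. A short derivation, assuming only $\bar{X} = \frac{1}{n}\sum X$, shows why the two forms give the same sum of squares:

```latex
\sum (X - \bar{X})^2
  = \sum X^2 - 2\bar{X}\sum X + n\bar{X}^2
  = \sum X^2 - \frac{2(\sum X)^2}{n} + \frac{(\sum X)^2}{n}
  = \sum X^2 - \frac{(\sum X)^2}{n}
```

Dividing both sides by n gives the desk formula. For variety A the right-hand side is 7700 - 7500 = 200, matching the sum of squared deviations 100 + 0 + 100.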
Variance
• The variance is a measure of variability that
utilizes all the data.
• It is based on the squared difference between
the value of each observation (xi) and the
mean of the data
• The variance is denoted by $S^2$.
Problem With Variance
Variance measures the variation in the data in the square of the
units of measurement of the data, so it is difficult to interpret
directly.

Solution: Take the positive square root of the variance, known as the
standard deviation and denoted by S.
It has the same units as the measurements themselves.
Standard Deviation
• The standard deviation of a data set is the positive
square root of the variance.
• It is measured in the same units as the data,
which makes it easier than the variance to compare
with the mean.
• The standard deviation is denoted S.

$S_A = \sqrt{66.67} = 8.16 \text{ Kg}$
$S_B = \sqrt{266.67} = 16.33 \text{ Kg}$
$S_C = \sqrt{1066.67} = 32.66 \text{ Kg}$
Coefficient of Variation (CV)
• Shows relative variability, that is, variability
relative to the magnitude of the data, i.e. variation
relative to the mean
• Always expressed as a percentage (%)
• Unit-free measure of variation
• Can be used to compare two or more sets of data
– measured in different units
– measured in the same units but with different average size

$CV = \frac{S}{\bar{X}} \times 100$
Coefficient of Variation
The following data represent the length (in inches) and
weight (in Kg) for a sample of 10 fish of the same
species after using a particular type of fish feed.

Fish     1     2     3     4     5     6     7     8     9     10
Weight   1.8   1.9   2.1   2.4   2.5   2.6   2.7   2.8   3.1   3.2
Length   11    12    12    13    15    15    16    17    18    18

Which characteristic, weight or length, is relatively
more variable?

         Standard deviation (S)   Mean           CV (%)
Weight   0.472 Kg                 2.51 Kg        18.82
Length   2.584 inches             14.70 inches   17.58

Weight has the larger CV, so weight is relatively more variable than length.
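
The table can be reproduced with a short Python sketch. One observation from recomputing the figures: the tabulated standard deviations match the sample standard deviation (divisor n - 1, `statistics.stdev`), not the divisor-n form used for the wheat yields earlier, so the sketch assumes n - 1. The function name `coefficient_of_variation` is hypothetical.

```python
from statistics import mean, stdev

# Fish data copied from the example above.
weight_kg = [1.8, 1.9, 2.1, 2.4, 2.5, 2.6, 2.7, 2.8, 3.1, 3.2]
length_in = [11, 12, 12, 13, 15, 15, 16, 17, 18, 18]


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation (n - 1) over the mean."""
    return stdev(values) / mean(values) * 100


for label, values in (("Weight", weight_kg), ("Length", length_in)):
    print(label,
          round(stdev(values), 3),
          round(mean(values), 2),
          round(coefficient_of_variation(values), 2))
```

Because the CV is a ratio of two quantities in the same units, the Kg and inches cancel and the two percentages can be compared directly.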
Example: The following data represent the height in feet of sugarcane and wheat
plants after applying a particular type of fertilizer. Compare which crop's plants
are more variable in height.

Sugarcane 10.3 12.1 12.3 12.5 12.6 12.8 13.0 13.2

Wheat 3.1 3.5 3.8 3.9 4.0 4.3 4.5 4.9

[Figure: dot plot of height (feet) by crop, SugarCane and Wheat, annotated with the values below]

            SD           Mean         $CV = \frac{S}{\bar{X}} \times 100$
SugarCane   0.90 feet    12.23 feet   7.37%
Wheat       0.567 feet   4.0 feet     14.21%
Five Number Summary
The five number summary of a data set consists of
1. Minimum value
2. Maximum value
3. Q1
4. Q2
5. Q3
The graph of the five number summary is called a Box-Whisker Plot.

Five Number Summary
Example: The following data set shows the marks obtained by
students
25 41 27 32 43 66 35 31 15 5
34 26 32 38 16 30 38 30 20 21
Determine the five number summary.
The array of the above data is given below:
5 15 16 20 21 25 26 27 30 30
31 32 32 34 35 37 38 41 43 66

1. Minimum value   5
2. Maximum value   66
3. Q1              22
4. Q2              30.5
5. Q3              36.5
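
A minimal sketch of the five number summary, computed from the sorted array exactly as listed above. The raw listing shows 38 twice while the sorted array has 37 and 38; the sorted array is the one used here. The quartile rule, position (n + 1)p with linear interpolation, is an assumption inferred from the quoted Q1 = 22 and Q2 = 30.5. The function name `five_number_summary` is hypothetical.

```python
from statistics import quantiles

# Sorted array of marks, copied from the example above.
marks_sorted = [5, 15, 16, 20, 21, 25, 26, 27, 30, 30,
                31, 32, 32, 34, 35, 37, 38, 41, 43, 66]


def five_number_summary(values):
    """Return (minimum, Q1, Q2, Q3, maximum) using the (n + 1)p rule."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="exclusive")
    return ordered[0], q1, q2, q3, ordered[-1]


print(five_number_summary(marks_sorted))
```

Under this rule the third quartile of the sorted array comes out as 36.5.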
Construction of Box Whisker Plot
1. Start the box at Q1 and end it at Q3
2. Within the box, draw a line to represent Q2
3. Draw the lower whisker from the minimum value up to Q1
4. Draw the upper whisker from Q3 up to the maximum value

For the marks data:
1. Q1 = 22.0, Q3 = 36.5
2. Q2 = 30.5
3. Minimum value = 5.0
4. Maximum value = 66.0

[Figure: box-whisker plot of the marks, vertical axis from 0 to 70]
Interpretation of Box-Whisker Plot

A Box-Whisker Plot is useful to identify:
• From the upper and lower whiskers:
  the maximum and minimum values in the data
• From the line within the box, i.e. Q2:
  the average size of the data
• From the length of the graph: Range = Max - Min
• From the length of the box, i.e. Q3 - Q1 = IQR:
  the variability in the data (a longer box indicates more variability)
• From the position of the line within the box:
  the shape of the data
  Line at the centre of the box: symmetrical
  Line above the centre of the box: negatively skewed

[Figure: box-whisker plot of the marks, vertical axis from 0 to 70]
Outliers
Outliers are values that fall well outside the
overall pattern of the data. An outlier may be
• The result of a measurement or recording error
• A member of a different population than the rest of
the sample
• Simply an unusual extreme value

Professor John Wilder Tukey suggested a method for
defining outliers: use the quartiles and the IQR =
Q3 - Q1 to identify them.

Determine Inner and Outer Fences
If Q1 = 22.0, Q2 = 30.5 and Q3 = 36.5, then IQR = Q3 - Q1 = 14.5.
The inner fences and outer fences are defined as follows:

Inner fences:
$\text{Lower inner fence} = Q_1 - 1.5\,\text{IQR} = 0.25$
$\text{Upper inner fence} = Q_3 + 1.5\,\text{IQR} = 58.25$

Outer fences:
$\text{Lower outer fence} = Q_1 - 3\,\text{IQR} = -21.5$
$\text{Upper outer fence} = Q_3 + 3\,\text{IQR} = 80.0$

That is, in a box and whisker diagram the inner fences are
constructed below and above the box at a distance
of 1.5 times the IQR, and the outer fences are constructed
below and above the box at a distance of 3 times the IQR.

Identification of Suspected and Sure Outliers
1. The values that lie within the inner fences are normal values
2. The values that lie outside the inner fences but inside the outer fences
are possible / suspected / mild outliers
3. The values that lie outside the outer fences are sure outliers

In the marks data, only 66 is a mild outlier.

Question: If the mark is 96 instead of 66, can this value be considered an outlier?

Plot each suspected outlier with an asterisk
and each sure outlier with a hollow dot.

[Figure: box-whisker plot of the marks with 66 plotted as an asterisk, vertical axis from 0 to 80]

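
The fence rules can be sketched in Python as follows. The function names `tukey_fences` and `classify` are hypothetical, and treating a value that falls exactly on a fence as lying inside it is an assumption, since the slides do not cover that case.

```python
def tukey_fences(q1, q3):
    """Return ((lower inner, upper inner), (lower outer, upper outer))."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


def classify(value, q1, q3):
    """Label a value as normal, a suspected outlier, or a sure outlier."""
    inner, outer = tukey_fences(q1, q3)
    # Assumption: a value exactly on a fence counts as inside it.
    if inner[0] <= value <= inner[1]:
        return "normal"
    if outer[0] <= value <= outer[1]:
        return "suspected outlier"
    return "sure outlier"


# Quartiles of the marks data: Q1 = 22.0, Q3 = 36.5.
print(tukey_fences(22.0, 36.5))
for mark in (5, 66, 96):
    print(mark, classify(mark, 22.0, 36.5))
```

With these quartiles, 66 lies between the upper inner fence (58.25) and the upper outer fence (80.0), while 96 lies beyond the outer fence.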
Example: A study of the effects of smoking on sleep
patterns is conducted. The measure observed is the
time, in minutes, that it takes to fall asleep. These
data are obtained:
Smokers:
69.3 56.0 22.1 47.6 53.2 48.1 52.7 34.4
60.2 43.8 23.2 13.8
Nonsmokers:
28.6 25.1 26.4 34.9 29.8 28.4 38.5 30.2
30.6 31.8 41.6 21.1 36.0 37.9 13.9
Graphical Analysis
[Figure: dot plot of time and boxplot of time (minutes, axis 10 to 70) by habit, Non-Smoker and Smoker]
Statistical Analysis
Statistics
Habit        Mean    StDev   CoefVar   Median
Non-Smoker   30.32   7.13    23.51     30.20
Smoker       43.70   16.93   38.74     47.85
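
The summary table can be reproduced with the sketch below. The tabulated standard deviations match the sample form (divisor n - 1), so `statistics.stdev` is used; that choice is inferred from the numbers, not stated in the source. The function name `summarize` is hypothetical.

```python
from statistics import mean, median, stdev

# Minutes to fall asleep, copied from the example above.
smokers = [69.3, 56.0, 22.1, 47.6, 53.2, 48.1, 52.7, 34.4,
           60.2, 43.8, 23.2, 13.8]
nonsmokers = [28.6, 25.1, 26.4, 34.9, 29.8, 28.4, 38.5, 30.2,
              30.6, 31.8, 41.6, 21.1, 36.0, 37.9, 13.9]


def summarize(values):
    """Mean, sample SD (n - 1), CV in percent, and median."""
    m = mean(values)
    s = stdev(values)
    return {"mean": m, "stdev": s, "coef_var": s / m * 100,
            "median": median(values)}


for label, values in (("Non-Smoker", nonsmokers), ("Smoker", smokers)):
    stats = summarize(values)
    print(label, {key: round(val, 2) for key, val in stats.items()})
```

Smokers show both a higher mean time to fall asleep and a higher coefficient of variation, so their times are more variable in absolute and in relative terms.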
Standard Variable
• A variable that has mean 0 and variance 1 is called a standard
variable
• The values of a standard variable are called standard scores
• Standard scores are unit-less
• Construction:

$Z = \frac{\text{Variable} - \text{Mean of variable}}{\text{Standard deviation of variable}}$

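       $X$    $(X - \bar{X})^2$   $Z$       $(Z - \bar{Z})^2$
       3      25                  -1.3624   1.8561
       6      4                   -0.5450   0.2970
       11     9                   0.8174    0.6682
       12     16                  1.0899    1.1879
Sum    32     54                  0         4.009

$\bar{X} = \frac{\sum X}{n} = \frac{32}{4} = 8$

$S_x^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{54}{4} = 13.5, \quad S_x = 3.67$

$Z = \frac{X - \bar{X}}{S_x} = \frac{X - 8}{3.67}$

$\bar{Z} = \frac{\sum Z}{n} = 0$

$S_z^2 = \frac{\sum (Z - \bar{Z})^2}{n} = \frac{4.009}{4} \approx 1$

Variable Z has mean 0 and variance 1, so Z is a
standard variable.

Standard score at X = 3:
$Z = \frac{X - \bar{X}}{S_x} = \frac{3 - 8}{3.67} = -1.3624$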
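
The standardization can be checked numerically. The sketch below uses the divisor-n standard deviation (`statistics.pstdev`), as the worked example does. The function name `standardize` is hypothetical. Because the example rounds $S_x$ to 3.67 before dividing, its Z values differ from the unrounded ones in the third decimal place.

```python
from statistics import mean, pstdev, pvariance

x = [3, 6, 11, 12]


def standardize(values):
    """Convert values to standard scores: (value - mean) / SD, divisor n."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


z = standardize(x)
print([round(score, 4) for score in z])
print("mean of Z:", round(mean(z), 10))
print("variance of Z:", round(pvariance(z), 10))
```

Without intermediate rounding the variance of Z is 1 up to floating-point error, which is why the hand calculation's 4.009 / 4 is only approximately 1.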
Using z scores to evaluate performance
The industry in which sales rep Mr. Bilal works has
mean annual sales = $2,500 with standard deviation
= $500.
The industry in which sales rep Mr. Perviz works has
mean annual sales = $4,800 with standard
deviation = $600.
Last year Mr. Bilal's sales were $4,000 and
Mr. Perviz's sales were $6,000.
Which of the representatives would you hire if
you had one sales position to fill?

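Standard Units
                     Sales person Bilal   Sales person Perviz
Industry mean        $2,500               $4,800
Standard deviation   $500                 $600
Sales last year      $4,000               $6,000

$Z_B = \frac{X_B - \bar{X}_B}{S_B} = \frac{4{,}000 - 2{,}500}{500} = 3$

$Z_P = \frac{X_P - \bar{X}_P}{S_P} = \frac{6{,}000 - 4{,}800}{600} = 2$

Mr. Bilal is the best choice: his sales are 3 standard deviations above his industry mean, against 2 for Mr. Perviz.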

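Example: The following data represent the performance of
different batsmen in ODI and T-20 matches. Evaluate
in which format (ODI or T-20) Babar Azam performs
better compared with the other batsmen.

Batsman          ODI     T-20
Imam             42      25
Hadir            54      34
Azhar            30      16
Shafiq           45      41
Asif             62      36
Babar Azam       55      40
Mean             48      32
SD               10.41   8.851
Z (Babar Azam)   0.673   0.904

Babar Azam's standard score is higher in T-20 (0.904) than in ODI (0.673), so relative to the other batsmen he performs better in T-20.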
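
A minimal sketch of the z-score comparison for Babar Azam. The tabulated SDs match the divisor-n form, so `statistics.pstdev` is used. The function name `z_score` is hypothetical.

```python
from statistics import mean, pstdev

# Scores copied from the table above.
odi = {"Imam": 42, "Hadir": 54, "Azhar": 30,
       "Shafiq": 45, "Asif": 62, "Babar Azam": 55}
t20 = {"Imam": 25, "Hadir": 34, "Azhar": 16,
       "Shafiq": 41, "Asif": 36, "Babar Azam": 40}


def z_score(player, scores):
    """Standard score of one player relative to everyone in `scores`."""
    values = list(scores.values())
    return (scores[player] - mean(values)) / pstdev(values)


print("ODI  z:", round(z_score("Babar Azam", odi), 3))
print("T-20 z:", round(z_score("Babar Azam", t20), 3))
```

Comparing raw scores (55 against 40) would suggest the opposite conclusion. Standardizing puts both formats on a common, unit-less scale before comparing.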
