Chapter 4 Measures of Variability PDF

This document discusses measures of variability, which quantify how spread out or dispersed a data set is. It introduces the range, interquartile range, variance, standard deviation, and coefficient of variation. The range is the simplest measure but can be misleading. The interquartile range is less influenced by outliers. The variance and standard deviation take all values into account, and the standard deviation has the same units as the original data. The coefficient of variation allows comparison of variability between data sets measured in different units.
Chapter 4

Measures of Variability
• It is often desirable to consider measures of variability (dispersion), as well as measures of location.
Measures of Variability
• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
Measures of Variation
• Variation is measured by the range, variance, standard deviation, and coefficient of variation.
• Measures of variation give information on the spread or variability or dispersion of the data values.
[Figure: distributions with the same center but different variation]
Range
• The range of a data set is the difference between the largest and smallest data values.
• It is the simplest measure of variability.
• It is very sensitive to the smallest and largest data values.
Measures of Variation:
The Range
• Simplest measure of variation
• Difference between the largest and the smallest values:

Range = X_largest - X_smallest

Example (values plotted on an axis from 0 to 14; smallest value 1, largest value 13):

Range = 13 - 1 = 12
Measures of Variation:
Why The Range Can Be Misleading

• Ignores the way in which data are distributed
[Figure: two dot plots on an axis from 7 to 12 with differently distributed values; in both, Range = 12 - 7 = 5]

• Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
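
The outlier effect can be checked with a short Python sketch. The two lists are the data sets shown on this slide; the helper name data_range is made up for illustration.

```python
# Two data sets from the slide; they differ only in their last value.
without_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 5]
with_outlier = [1] * 11 + [2] * 8 + [3] * 4 + [4, 120]


def data_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


print(data_range(without_outlier))  # 4
print(data_range(with_outlier))     # 119
```

Changing one value out of 25 moves the range from 4 to 119, even though the rest of the data is identical.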
Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Note: Data is in ascending order.


Interquartile Range
• The interquartile range of a data set is the difference between the third quartile and the first quartile.
• It is the range for the middle 50% of the data.
• It overcomes the sensitivity to extreme data values.
Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Note: Data is in ascending order.
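
The range and interquartile range for these 70 values can be reproduced with the Python sketch below. The slides do not state which quartile rule they use; the sketch assumes the common textbook position rule i = (p/100)n (round up when i is not an integer, otherwise average positions i and i + 1), which happens to reproduce Q1 = 445 and Q3 = 525. Other quartile conventions can give slightly different values. The function name percentile is an illustrative choice.

```python
import math

# The 70 values from the slide, already in ascending order.
data = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]


def percentile(sorted_values, p):
    """p-th percentile (0 < p < 100) using the assumed position rule."""
    n = len(sorted_values)
    i = p * n / 100
    if i != int(i):
        # Not an integer: round up to the next 1-based position.
        return sorted_values[math.ceil(i) - 1]
    # Integer: average the values at 1-based positions i and i + 1.
    i = int(i)
    return (sorted_values[i - 1] + sorted_values[i]) / 2


data_range = max(data) - min(data)
q1 = percentile(data, 25)
q3 = percentile(data, 75)
iqr = q3 - q1

print(data_range)   # 190
print(q1, q3, iqr)  # 445 525 80
```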


Variance

The variance is a measure of variability that utilizes all the data.

It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).
Measures of Variation:
The Sample Variance
The variance is (approximately) the average of the squared differences between each data value and the mean. For a sample the sum is divided by n - 1 rather than n.

– Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

Where $\bar{X}$ = arithmetic mean
n = sample size
$X_i$ = ith value of the variable X

– Population variance:

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Where $\mu$ = population mean
N = population size
$X_i$ = ith value of the variable X
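
The difference between the two divisors can be seen with Python's standard statistics module, where variance() divides by n - 1 and pvariance() divides by N. This is only a sketch: the data values are hypothetical and do not come from the slides.

```python
import statistics

# Hypothetical values chosen so the arithmetic is easy to follow:
# mean = 5, sum of squared deviations = 32.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sample_variance = statistics.variance(values)       # 32 / (8 - 1)
population_variance = statistics.pvariance(values)  # 32 / 8

print(f"{sample_variance:.3f}")      # 4.571
print(f"{population_variance:.3f}")  # 4.000
```

The sample variance is always at least as large as the population variance computed from the same numbers, because it divides the same sum by a smaller number.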
Standard Deviation

The standard deviation of a data set is the positive square root of the variance.

It is measured in the same units as the data, making it more easily interpreted than the variance.
Measures of Variation:
The Sample Standard Deviation

• Most commonly used measure of variation
• Shows variation about the mean
• Is the square root of the variance
• Has the same units as the original data

– Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Measures of Variation:
The Standard Deviation
Steps for Computing Standard Deviation

1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get the sample standard deviation.
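
The five steps above can be followed line by line in a small Python sketch. The sample values are hypothetical, picked so that the intermediate numbers are whole; the variable names are illustrative.

```python
import math

values = [1, 3, 5, 7, 9]  # hypothetical sample, not from the slides
n = len(values)
mean = sum(values) / n    # 5.0

differences = [x - mean for x in values]   # step 1: value minus mean
squared = [d ** 2 for d in differences]    # step 2: square each difference
total = sum(squared)                       # step 3: add the squares (40.0)
sample_variance = total / (n - 1)          # step 4: divide by n - 1 (10.0)
sample_sd = math.sqrt(sample_variance)     # step 5: take the square root

print(sample_variance)     # 10.0
print(f"{sample_sd:.3f}")  # 3.162
```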
Standard Deviation

The standard deviation is computed as follows:

$s = \sqrt{s^2}$ for a sample

$\sigma = \sqrt{\sigma^2}$ for a population


Measures of Variation:
Sample Standard Deviation:
Calculation Example
Sample Data ($X_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{X}$ = 16

A measure of the “average” scatter around the mean
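
The extracted slide gives only the data, n, and the mean. Applying the sample standard deviation formula to those values gives the arithmetic below; it is worked out here from the listed data, so treat it as a reconstruction rather than the slide's own text.

$$S = \sqrt{\frac{(10-16)^2 + (12-16)^2 + (14-16)^2 + \cdots + (24-16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} \approx 4.3095$$

The squared differences are 36, 16, 4, 1, 1, 4, 4, and 64, which add up to 130.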
Measures of Variation:
Comparing Standard Deviations
[Figure: three data sets, each plotted on an axis from 11 to 21]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.570
Measures of Variation:
The Coefficient of Variation

• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare the variability of two or more sets of data measured in different units
Coefficient of Variation

The coefficient of variation indicates how large the standard deviation is in relation to the mean.

The coefficient of variation is computed as follows:

$\left(\frac{s}{\bar{x}} \times 100\right)\%$ for a sample

$\left(\frac{\sigma}{\mu} \times 100\right)\%$ for a population
Measures of Variation:
Comparing Coefficients of Variation
• Stock A:
– Average price last year = $50
– Standard deviation = $5
• Stock B:
– Average price last year = $100
– Standard deviation = $5

Both stocks have the same standard deviation, but stock B is less variable relative to its price.
Measures of Variation:
Comparing Coefficients of Variation
• Stock A:
– Average price last year = $50
– Standard deviation = $5
• Stock C:
– Average price last year = $8
– Standard deviation = $2

Stock C has a much smaller standard deviation but a much higher coefficient of variation.
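
The comparisons among stocks A, B, and C can be checked by applying the sample formula (s / mean × 100)% to the prices and standard deviations given on these slides. The function name and dictionary layout in this sketch are illustrative choices.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean


# (average price, standard deviation) in dollars, as given on the slides
stocks = {"A": (50, 5), "B": (100, 5), "C": (8, 2)}

cv = {
    name: coefficient_of_variation(sd, mean)
    for name, (mean, sd) in stocks.items()
}

for name, value in cv.items():
    print(f"Stock {name}: CV = {value:.1f}%")
# Stock A: CV = 10.0%
# Stock B: CV = 5.0%
# Stock C: CV = 25.0%
```

Stocks A and B share a $5 standard deviation, yet B's coefficient of variation is half of A's. Stock C has the smallest standard deviation of the three and the largest coefficient of variation.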
Grouped Data
• The weighted mean computation can be used to obtain approximations of the mean, variance, and standard deviation for grouped data.
• To compute the weighted mean, we treat the midpoint of each class as though it were the mean of all items in the class.
• We compute a weighted mean of the class midpoints using the class frequencies as weights.
• Similarly, in computing the variance and standard deviation, the class frequencies are used as weights.
Variance for Grouped Data
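For sample data:

$$s^2 = \frac{\sum f_i (M_i - \bar{x})^2}{n - 1}$$

For population data:

$$\sigma^2 = \frac{\sum f_i (M_i - \mu)^2}{N}$$

where $f_i$ is the frequency of class i and $M_i$ is the midpoint of class i.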
Sample Variance for Grouped Data

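Rent ($)    Frequency f_i    Midpoint M_i    M_i - mean    f_i (M_i - mean)^2
20-40             5               30            -76              28880
40-60             6               50            -56              18816
60-80             5               70            -36               6480
80-100           10               90            -16               2560
100-120           5              110              4                 80
120-140           5              130             24               2880
140-160           4              150             44               7744
160-180           5              170             64              20480
180-200           5              190             84              35280
Total            50                                             123200

The deviations imply a grouped-data sample mean of 106.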
Sample Variance for Grouped Data

Sample Variance

$s^2 = 123200/(50 - 1) = 2514.286$

Sample Standard Deviation

$s = \sqrt{2514.286} \approx 50.14$
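
The grouped-data calculation can be reproduced with a short Python sketch that uses the class midpoints and frequencies from the rent table as weights. The variable names are illustrative. As with any grouped-data calculation, the results only approximate what the raw observations would give.

```python
import math

# (class midpoint, frequency) pairs from the rent table
classes = [
    (30, 5), (50, 6), (70, 5), (90, 10), (110, 5),
    (130, 5), (150, 4), (170, 5), (190, 5),
]

n = sum(f for _, f in classes)                           # 50
mean = sum(m * f for m, f in classes) / n                # weighted mean of midpoints
sum_sq = sum(f * (m - mean) ** 2 for m, f in classes)    # weighted squared deviations
variance = sum_sq / (n - 1)                              # sample variance
std_dev = math.sqrt(variance)                            # sample standard deviation

print(mean)               # 106.0
print(sum_sq)             # 123200.0
print(f"{variance:.3f}")  # 2514.286
print(f"{std_dev:.2f}")   # 50.14
```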
Measures of Variation:
Comparing Standard Deviations

[Figure: two distributions, one with a smaller standard deviation and one with a larger standard deviation]
Measures of Variation:
Summary Characteristics
• The more the data are spread out, the greater the range, variance, and standard deviation.
• The more the data are concentrated, the smaller the range, variance, and standard deviation.
• If the values are all the same (no variation), all these measures will be zero.
• None of these measures are ever negative.
