CH 03
A Decision-Making Approach
Chapter 3
Describing Data Using
Numerical Measures
Chapter Goals
After completing this chapter, you should be able to:
Compute and interpret the mean, median, and mode for a set of data
Compute the range, variance, and standard deviation and know what these values mean
Construct and interpret a box and whisker graph
Compute and explain the coefficient of variation and z-scores
Use numerical measures along with graphs, charts, and tables to describe data
Chapter Topics
Measures of Center and Location
    Mean, median, mode
Other Measures of Location
    Weighted mean, percentiles, quartiles
Measures of Variation
    Range, interquartile range, variance and standard deviation, coefficient of variation
Using the mean and standard deviation together
    Coefficient of variation, z-scores
Summary Measures
[Diagram of summary measures; recoverable label: Coefficient of Variation]
Measures of Center and Location
Overview
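Center and Location
Mean (sample and population):
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]
Weighted mean (sample and population):
\[ \bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} \qquad \mu_W = \frac{\sum w_i x_i}{\sum w_i} \]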
Mean (Arithmetic Average)
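The mean is the arithmetic average of data values
Population mean (N = population size):
\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N} \]
Sample mean (n = sample size):
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]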
Mean (Arithmetic Average)
(continued)
[Figure: two dot plots on an axis from 0 to 10, one with Mean = 3 and one with Mean = 4]
\[ \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3 \qquad \frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4 \]
Median
[Figure: two dot plots on an axis from 0 to 10; Median = 3 in both]
Median
(continued)
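To find the median, sort the n data values from low to high (the sorted data are called a data array)
Find the value in the i = (1/2)n position
The ith position is called the Median Index Point
If i is not an integer, round up to the next higher integer; the median is the value in that position
If i is an integer, the median is the average of the values in positions i and i + 1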
Median Example
(continued)
Data array:
4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24
Note that n = 13
Find the i = (1/2)n position:
i = (1/2)(13) = 6.5
Since 6.5 is not an integer, round up to 7
The median is the value in the 7th position:
Md = 12
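The Median Index Point procedure can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `median_by_index_point` is hypothetical, and for an even n it averages the values in positions i and i + 1, which is the usual convention.

```python
import math

def median_by_index_point(values):
    """Median via the Median Index Point rule, i = (1/2)n (1-based position)."""
    data = sorted(values)            # the sorted "data array"
    n = len(data)
    i = n / 2                        # Median Index Point
    if i != int(i):
        # i is not an integer: round up and take the value in that position
        position = math.ceil(i)
        return data[position - 1]    # convert 1-based position to 0-based index
    # i is an integer: average the values in positions i and i + 1
    i = int(i)
    return (data[i - 1] + data[i]) / 2

data_array = [4, 4, 5, 5, 9, 11, 12, 14, 16, 19, 22, 23, 24]
median = median_by_index_point(data_array)
print(median)  # 12
```

With n = 13 the index point is 6.5, which rounds up to position 7, matching the worked example.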
Shape of a Distribution
Describes how data is distributed
Symmetric or skewed
[Figure: two dot plots illustrating the mode; one on an axis from 0 to 14 with Mode = 5, one on an axis from 0 to 6 with No Mode]
Weighted Mean
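Example: Sample of 26 Repair Projects

Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2

Weighted Mean Days to Complete:
\[ \bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days} \]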
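The weighted mean for the repair-project example can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative, and the frequencies are used as the weights.

```python
# Days to complete (x_i) and how many projects took that long (w_i)
days = [5, 6, 7, 8]
frequency = [4, 12, 8, 2]

weighted_sum = sum(w * x for w, x in zip(frequency, days))  # sum of w_i * x_i
total_weight = sum(frequency)                               # sum of w_i
weighted_mean = weighted_sum / total_weight

print(weighted_sum, total_weight, round(weighted_mean, 2))  # 164 26 6.31
```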
Review Example
Five houses on a hill by the beach
[Figure: five houses labeled $2,000 K, $500 K, $300 K, $100 K, $100 K]
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
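The house-price figures can be reproduced with Python's standard `statistics` module. This is a small sketch for checking the arithmetic; the mode is computed as well for completeness, although it is not part of the summary shown here.

```python
import statistics

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

total = sum(prices)                        # 3,000,000
mean_price = statistics.mean(prices)       # total / 5
median_price = statistics.median(prices)   # middle value of the ranked data
mode_price = statistics.mode(prices)       # most frequent value (not in the summary above)

print(total, mean_price, median_price, mode_price)
```

The single $2,000,000 house pulls the mean well above the median, which is the usual reason for reporting the median when a data set contains extreme values.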
Which measure of location
is the “best”?
Other Location Measures
Other Measures of Location: Percentiles, Quartiles
Percentiles
The pth percentile in an ordered array of n values is the value in the ith position, where
\[ i = \frac{p}{100}(n) \]
If i is not an integer, round up to the next higher integer value
[Figure: ordered data divided by the quartiles Q1, Q2, Q3]
Quartiles
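Q1 = 25th percentile, so find i:
\[ i = \frac{25}{100}(9) = 2.25 \]
Since 2.25 is not an integer, round up: Q1 is the value in the 3rd position of the ordered data (n = 9)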
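The percentile position rule, i = (p/100)(n) rounded up when it is not an integer, can be sketched as a small Python helper. The name `percentile_position` is hypothetical, and the helper returns only the 1-based position, not the data value.

```python
import math

def percentile_position(p, n):
    """1-based position i = (p/100) * n, rounded up when i is not an integer."""
    i = p * n / 100
    # math.ceil leaves an integer i unchanged and rounds a fractional i up
    return math.ceil(i)

q1_position = percentile_position(25, 9)
print(q1_position)  # 3
```

For Q1 with n = 9 the raw index is 2.25, so the first quartile is the value in the 3rd position of the ordered data.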
Box and Whisker Plot
A graphical display of data using a central “box”
and extended “whiskers”:
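Example:
[Figure: box and whisker plot. Labeled from left to right: outliers (*), lower limit, 1st quartile, median, 3rd quartile, upper limit, outliers (*). Each of the four sections of the plot is marked 25%.]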
Constructing the
Box and Whisker Plot
Distribution Shape and
Box and Whisker Plot
[Figure: three box and whisker plots, each marked Q1, Q2, Q3]
Box-and-Whisker Plot Example
Below is a box-and-whisker plot for the following data:
0 2 2 2 3 3 4 5 6 11 27
Min = 0, Q1 = 2, Q2 = 3, Q3 = 6, Max = 27
[Figure: box and whisker plot on an axis marked 0, 2, 3, 6, 12, 27, with 27 plotted as an outlier (*)]
Upper limit = Q3 + 1.5 (Q3 – Q1) = 6 + 1.5 (6 – 2) = 12
27 is above the upper limit, so it is shown as an outlier
This data is right skewed, as the plot depicts
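The quartiles and the upper limit in this example can be recomputed with a short Python sketch. It uses the round-up position rule from the percentile discussion. The helper name `value_at_percentile` is hypothetical, and the lower limit, Q1 – 1.5 (Q3 – Q1), is the usual counterpart of the upper limit and is an assumption here, since only the upper limit appears in the example.

```python
import math

def value_at_percentile(sorted_data, p):
    """Value in position i = (p/100) * n, rounding i up when it is not an integer."""
    i = p * len(sorted_data) / 100
    position = math.ceil(i)           # an integer i is used as-is in this sketch
    return sorted_data[position - 1]  # 1-based position to 0-based index

data = [0, 2, 2, 2, 3, 3, 4, 5, 6, 11, 27]  # already sorted, n = 11

q1 = value_at_percentile(data, 25)  # i = 2.75, so position 3
q2 = value_at_percentile(data, 50)  # i = 5.5, so position 6
q3 = value_at_percentile(data, 75)  # i = 8.25, so position 9

iqr = q3 - q1
upper_limit = q3 + 1.5 * iqr
lower_limit = q1 - 1.5 * iqr        # assumed counterpart of the upper limit
outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(q1, q2, q3, upper_limit, outliers)  # 2 3 6 12.0 [27]
```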
Measures of Variation
[Diagram of measures of variation; recoverable labels: Sample Variance, Sample Standard Deviation]
Variation
Same center,
different variation
Range
Simplest measure of variation
Difference between the largest and the smallest observations:
Range = largest observation - smallest observation
Example:
[Figure: dot plot on an axis from 0 to 14]
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two dot plots on an axis from 7 to 12, each with Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
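The sensitivity of the range to a single outlier can be seen by computing it for both data sets. This is a minimal sketch with a hypothetical helper name, `data_range`.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

# The two data sets differ only in their last value (5 versus 120)
without_outlier = [1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5]
with_outlier = [1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120]

print(data_range(without_outlier))  # 4
print(data_range(with_outlier))     # 119
```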
Interquartile Range
Interquartile Range Example
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
Interquartile range = Q3 – Q1 = 57 – 30 = 27
Variance
Average of squared deviations of values from the mean
Population variance:
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
Sample variance:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
Standard Deviation
Most commonly used measure of variation
Shows variation about the mean
Has the same units as the original data
Population standard deviation:
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}} \]
Sample standard deviation:
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
Calculation Example:
Sample Standard Deviation
Sample Data (x_i): 10 12 14 15 17 18 18 24
n = 8    Mean = x̄ = 16
\[ s = \sqrt{\frac{(10 - \bar{x})^2 + (12 - \bar{x})^2 + (14 - \bar{x})^2 + \cdots + (24 - \bar{x})^2}{n - 1}} = \sqrt{\frac{130}{7}} = 4.3095 \]
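The sample standard deviation calculation can be followed step by step in Python. This sketch spells out the sum of squared deviations and then cross-checks the result against `statistics.stdev`, which also uses the n - 1 divisor.

```python
import math
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]

n = len(sample)
x_bar = sum(sample) / n                                   # sample mean
squared_devs = sum((x - x_bar) ** 2 for x in sample)      # sum of (x_i - x_bar)^2
sample_variance = squared_devs / (n - 1)                  # divide by n - 1 for a sample
sample_std_dev = math.sqrt(sample_variance)

print(x_bar, squared_devs, round(sample_std_dev, 4))      # 16.0 130.0 4.3095

# Library cross-check (stdev is the sample version, pstdev the population version)
library_std_dev = statistics.stdev(sample)
```

Dividing by N instead of n - 1 (the population formula) would give a slightly smaller value for the same numbers.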
Comparing Standard Deviations
Same mean, but different standard deviations:
[Figure: three dot plots on an axis from 11 to 21]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
Coefficient of Variation
Measures relative variation
Always in percentage (%)
Shows variation relative to mean
Is used to compare two or more sets of data
measured in different units
Population:
\[ CV = \left( \frac{\sigma}{\mu} \right) \cdot 100\% \]
Sample:
\[ CV = \left( \frac{s}{\bar{x}} \right) \cdot 100\% \]
Comparing Coefficients
of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
\[ CV_A = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\% \]
Stock B:
Average price last year = $100
Standard deviation = $5
\[ CV_B = \left( \frac{s}{\bar{x}} \right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\% \]
Both stocks have the same standard deviation, but stock B is less variable relative to its price
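The two coefficients of variation can be computed with a one-line helper. This is a sketch; the function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: s = $5, mean = $100

print(cv_a, cv_b)  # 10.0 5.0
```

Because the dollar units cancel in the ratio, the same helper can compare data sets measured in different units.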
The Empirical Rule
If the data distribution is bell-shaped, then the interval:
μ ± 1σ contains about 68% of the values in the population or the sample
[Figure: bell-shaped curve centered at μ, with 68% marked over the interval μ ± 1σ]
The Empirical Rule
μ ± 2σ contains about 95% of the values in the population or the sample
μ ± 3σ contains about 99.7% of the values in the population or the sample
[Figure: two bell-shaped curves, one with 95% marked over μ ± 2σ and one with 99.7% marked over μ ± 3σ]
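The 68%, 95%, and 99.7% figures can be checked numerically. The sketch below assumes that "bell-shaped" means an exactly normal distribution, which is a modeling assumption, and uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Assumption: the bell-shaped distribution is modeled as a standard normal
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    # Roughly 68.3, 95.4, and 99.7 percent respectively
    print(k, round(100 * share_within(k), 1))
```

Real data that are only approximately bell-shaped will match these proportions only approximately.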
Tchebysheff’s Theorem
Regardless of how the data are distributed, at least (1 - 1/k²) of the values will fall within k standard deviations of the mean
Examples:
At least                   within
(1 - 1/1²) = 0% .......... k = 1 (μ ± 1σ)
(1 - 1/2²) = 75% ......... k = 2 (μ ± 2σ)
(1 - 1/3²) = 89% ......... k = 3 (μ ± 3σ)
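The bound 1 - 1/k² is easy to tabulate. This is a minimal sketch with a hypothetical function name, `tchebysheff_lower_bound`.

```python
def tchebysheff_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    return 1 - 1 / k ** 2

bounds = {k: tchebysheff_lower_bound(k) for k in (1, 2, 3)}
for k, bound in bounds.items():
    print(k, f"{bound:.0%}")  # prints 0%, 75%, 89% for k = 1, 2, 3
```

These are guaranteed minimums for any distribution, so they are lower than the bell-shaped figures of 68%, 95%, and 99.7%.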
Standardized Data Values
Standardized Population Values
\[ z = \frac{x - \mu}{\sigma} \]
where:
x = original data value
μ = population mean
σ = population standard deviation
z = standard score
Standardized Sample Values
\[ z = \frac{x - \bar{x}}{s} \]
where:
x = original data value
x̄ = sample mean
s = sample standard deviation
z = standard score
Standardized Value Example
IQ scores in a large population have a bell-
shaped distribution with mean μ = 100 and
standard deviation σ = 15
Find the standardized score (z-score) for a
person with an IQ of 121.
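The standardized score for this example follows directly from z = (x - μ)/σ. A short Python sketch, with an illustrative function name `z_score`, works it through:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

z = z_score(121, mean=100, std_dev=15)
print(z)  # 1.4
```

An IQ of 121 is therefore 1.4 standard deviations above the population mean of 100.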
Using Excel
Select:
Data / Data Analysis / Descriptive Statistics
Using Excel
(continued)
Click OK
Excel output
Microsoft Excel
descriptive statistics output,
using the house price data:
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Chapter Summary
Chapter Summary
(continued)