Chapter 03
The Mean
Advantages:
• Simple to calculate
• Summarizes the data with a single value
Disadvantages:
• With only a summary value you lose information about
the original data.
• Sample 1 with n = 3: 999, 1000, 1001; mean = 1000
• Sample 2 with n = 3: 0, 1000, 2000; mean = 1000
• Knowing only the mean tells you nothing about how the
underlying data are spread out.
• The value of the mean is sensitive to outliers (values
that are much higher or lower than most of the data).
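The two samples above can be checked with a short Python sketch. The third list is a hypothetical variation (not from the slides) that replaces one value of Sample 1 with an outlier to show how far a single extreme value can pull the mean.

```python
from statistics import mean

# The two samples from the slide: same mean, very different spread.
sample_1 = [999, 1000, 1001]
sample_2 = [0, 1000, 2000]

mean_1 = mean(sample_1)
mean_2 = mean(sample_2)
print(mean_1, mean_2)  # both equal 1000

# Hypothetical data (an assumption for illustration): Sample 1 with its
# largest value replaced by an outlier.
with_outlier = [999, 1000, 100001]
mean_outlier = mean(with_outlier)
print(mean_outlier)  # 34000, far from the two typical values
```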
21 27 27 28 34 45 50
Number of children   Frequency
0                    4
1                    5
2                    8
3                    4
4                    2
5                    1
The value that appears most often is 2 (occurs 8 times),
so the mode = 2 children.
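As a minimal sketch, the mode can be read off a frequency table by picking the value with the highest frequency. The dictionary below copies the counts from the slide; the variable names are illustrative.

```python
# Frequency table from the slide: number of children -> frequency.
frequency = {0: 4, 1: 5, 2: 8, 3: 4, 4: 2, 5: 1}

# The mode is the value whose frequency is largest.
mode = max(frequency, key=frequency.get)

# Collect every value that reaches the top frequency, in case of ties.
top = max(frequency.values())
modes = [value for value, count in frequency.items() if count == top]

print(mode, frequency[mode])  # 2 8
```

Keeping the `modes` list makes it visible when a data set has more than one mode; here it contains only the value 2.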
Distribution Shape
• Symmetric
• Skewed: left-skewed or right-skewed
Measures of Variability
[Figure: number line from 0 to 14 showing the data]
Range = highest value - lowest value = 13 - 1 = 12
Copyright © 2020, 2015, 2013 Pearson Education, Inc.
The Range
Advantages:
• Easy to calculate and understand
Disadvantages:
• Only based on two numbers in the data set
(Ignores the way in which data are distributed)
• Sensitive to outliers
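Both disadvantages can be illustrated with a small sketch. All three data sets below are hypothetical (chosen only so that the first two have a lowest value of 1 and a highest value of 13); they are not taken from the slides.

```python
def value_range(data):
    """Range = highest value - lowest value."""
    return max(data) - min(data)

# Hypothetical data sets: same two extremes, very different distributions.
clustered = [1, 7, 7, 7, 7, 7, 13]
spread_out = [1, 3, 5, 7, 9, 11, 13]

range_clustered = value_range(clustered)
range_spread_out = value_range(spread_out)
print(range_clustered, range_spread_out)  # 12 12

# Adding one hypothetical outlier changes the range dramatically.
range_with_outlier = value_range(spread_out + [40])
print(range_with_outlier)  # 39
```

The first two results are identical because the range looks only at the two extreme values; the third shows how one outlier dominates it.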
Example:
Sample data ($x_i$): 4 6 8 9 11 12 12 18
$n = 8$, mean $\bar{x} = \frac{\sum x_i}{n} = \frac{80}{8} = 10$
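Assuming this example is meant to lead into the sample variance and standard deviation, the following sketch computes them from the definition: sum the squared deviations from the mean and divide by n - 1. The variable names are illustrative.

```python
from math import sqrt

# Sample data from the slide.
data = [4, 6, 8, 9, 11, 12, 12, 18]
n = len(data)

# Sample mean: 80 / 8 = 10.
x_bar = sum(data) / n

# Squared deviations from the mean; they sum to 130.
squared_deviations = [(x - x_bar) ** 2 for x in data]

# Sample variance divides by n - 1; the standard deviation is its square root.
sample_variance = sum(squared_deviations) / (n - 1)  # 130 / 7
sample_std = sqrt(sample_variance)

print(x_bar, round(sample_variance, 2), round(sample_std, 2))
```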
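Short-cut formula for the sample variance:
$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n - 1}$
Short-cut formula for the population variance:
$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}$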
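A quick check, assuming the usual computational form of the short-cut (sum of squares minus the squared sum divided by the count, then divided by n - 1 for a sample or N for a population). The sketch reuses the eight values from the earlier example and compares the result with Python's `statistics.variance` and `statistics.pvariance`; treating the same values as a population is done here only for illustration.

```python
from statistics import pvariance, variance

data = [4, 6, 8, 9, 11, 12, 12, 18]
n = len(data)

sum_x = sum(data)                    # 80
sum_x_sq = sum(x * x for x in data)  # 930

# Short-cut numerator: sum of squares minus (sum squared / count) = 130.
numerator = sum_x_sq - sum_x ** 2 / n

shortcut_sample = numerator / (n - 1)  # sample variance
shortcut_population = numerator / n    # population variance (illustration only)

# Both short-cut results should agree with the definitional versions.
print(shortcut_sample, variance(data))
print(shortcut_population, pvariance(data))
```

The short-cut avoids computing each deviation from the mean, which is convenient by hand; in code, the library functions are usually the safer choice.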
Coefficient of Variation:
$CV = \frac{s}{\bar{x}} \times 100\%$
Nike vs. Google: although Google had a larger standard
deviation, it had the more consistent price (the smaller
coefficient of variation).
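The coefficient of variation expresses the standard deviation as a percentage of the mean, which is how a series with a larger standard deviation can still be the more consistent one. The figures below are hypothetical placeholders, not the Nike and Google values from the slide, which did not survive in the text.

```python
def coefficient_of_variation(std_dev, mean_value):
    """CV as a percentage: standard deviation relative to the mean."""
    return 100 * std_dev / mean_value

# Hypothetical prices (assumptions for illustration only).
cv_a = coefficient_of_variation(std_dev=5, mean_value=50)    # lower-priced stock
cv_b = coefficient_of_variation(std_dev=20, mean_value=500)  # higher-priced stock

print(cv_a, cv_b)  # 10.0 4.0

# Stock B has the larger standard deviation but the smaller CV,
# so relative to its price level it is the more consistent of the two.
more_consistent = "B" if cv_b < cv_a else "A"
print(more_consistent)
```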
3.4 Working with Grouped Data
Class             Frequency
1 to under 5      6
5 to under 9      12
9 to under 13     10
13 to under 17    4
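A minimal sketch of the grouped-data calculation, assuming the second column is the class frequency and that each class is represented by its midpoint (the standard approximation when the raw values are unavailable). The results are therefore estimates, not exact statistics.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each class in the table.
classes = [(1, 5, 6), (5, 9, 12), (9, 13, 10), (13, 17, 4)]

midpoints = [(low + high) / 2 for low, high, _ in classes]  # 3, 7, 11, 15
freqs = [f for _, _, f in classes]
n = sum(freqs)  # 32 observations in total

# Grouped mean: frequency-weighted average of the midpoints (272 / 32).
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Grouped sample variance: weighted squared deviations over n - 1 (440 / 31).
grouped_variance = sum(
    f * (m - grouped_mean) ** 2 for f, m in zip(freqs, midpoints)
) / (n - 1)
grouped_std = sqrt(grouped_variance)

print(grouped_mean, round(grouped_variance, 2), round(grouped_std, 2))
```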
Percentiles and Quartiles
n = 15
Q1 = 2.37
Similarly, we find
Q2 = 3.27
Q3 = 4.26
Min    Q1     Q2     Q3     Max    Outlier
0.59   2.37   3.27   4.26   5.97   11.31
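The slide marks 11.31 as an outlier without stating the rule. A common convention, assumed here, flags any value more than 1.5 times the interquartile range below Q1 or above Q3. The sketch applies that rule to the quartiles from the summary above.

```python
# Quartiles taken from the five-number summary on the slide.
q1, q3 = 2.37, 4.26

# Interquartile range and the 1.5 * IQR fences (an assumed convention).
iqr = q3 - q1                   # about 1.89
lower_fence = q1 - 1.5 * iqr    # about -0.465
upper_fence = q3 + 1.5 * iqr    # about 7.095

def is_outlier(x):
    """True when x falls outside the 1.5 * IQR fences."""
    return x < lower_fence or x > upper_fence

for value in (0.59, 5.97, 11.31):
    print(value, is_outlier(value))
```

Under this rule only 11.31 lies beyond the upper fence, which is consistent with 5.97 being reported as the maximum of the remaining data.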