Chapter 3 A
Data Description
Outline
Describe data, using measures of variation, such as
the range, variance, and standard deviation.
Introduction
3-1 Measures of Central Tendency
Measures: statistics (computed from a sample) and parameters (computed from a population).
Thus, the average household (HH) income obtained from a sample of households is a statistic, and the average household income obtained from the entire population of households is a parameter.
Measures of Central Tendency:
The Mean
The mean, also known as the arithmetic average, is found by adding the values of the data and dividing by the total number of values.
Example 3-1 (page 112): Avian Flu Cases
Example 3-1: Days Off per Year
The data represent the number of days off per year for a
sample of individuals selected from nine different
countries. Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30
\[ \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]
\[ \bar{X} = \frac{20 + 26 + 40 + 36 + 23 + 42 + 35 + 24 + 30}{9} = \frac{276}{9} \approx 30.7 \]
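As a quick check of the arithmetic above, the same mean could be computed in Python. This is only an illustrative sketch; the variable names are chosen here and are not part of the original example.

```python
from statistics import mean

# Days off per year for the nine sampled countries (Example 3-1).
days_off = [20, 26, 40, 36, 23, 42, 35, 24, 30]

total = sum(days_off)   # numerator of the formula: sum of all values
n = len(days_off)       # denominator: number of values
x_bar = total / n       # sample mean

# statistics.mean applies the same definition.
library_mean = mean(days_off)

print(total, n, round(x_bar, 1))
```

Rounded to one decimal place, the result agrees with the hand calculation.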
Measures of Central Tendency:
mean for grouped data
Finding the Mean for Grouped Data
Example 3-3: Miles Run
\[ \bar{X} = \frac{\sum f \cdot X_m}{n} = \frac{490}{20} = 24.5 \text{ miles} \]
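The frequency table for this example is not reproduced in this section, so the sketch below assumes one. The class midpoints and frequencies are hypothetical values, chosen only because they agree with the totals quoted above (n = 20 and a sum of f times X_m equal to 490). Under that assumption, the grouped mean could be computed like this:

```python
# Hypothetical frequency table (an assumption, not given in this section):
# class midpoints X_m and frequencies f that reproduce n = 20 and sum(f * X_m) = 490.
midpoints = [8, 13, 18, 23, 28, 33, 38]
frequencies = [1, 2, 3, 5, 4, 3, 2]

n = sum(frequencies)                                              # total frequency
sum_f_xm = sum(f * xm for f, xm in zip(frequencies, midpoints))   # sum of f * X_m
grouped_mean = sum_f_xm / n

print(n, sum_f_xm, grouped_mean)
```

The midpoint stands in for every value in its class, which is why a grouped mean is an approximation of the mean of the raw data.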
Measures of Central Tendency: Weighted Mean
Example 3-14: Grade Point Average
\[ \bar{X} = \frac{\sum wX}{\sum w} = \frac{3 \cdot 4 + 3 \cdot 2 + 4 \cdot 3 + 2 \cdot 1}{3 + 3 + 4 + 2} = \frac{32}{12} \approx 2.7 \]
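The weighted mean above can be sketched as a small Python function. The reading of the numbers is an assumption made here: the first factor in each product (3, 3, 4, 2) is treated as the weight, for example credit hours, and the second (4, 2, 3, 1) as the value, for example grade points. The function and variable names are illustrative.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Assumed interpretation of Example 3-14: weights are credit hours,
# values are grade points.
grade_points = [4, 2, 3, 1]
credit_hours = [3, 3, 4, 2]

gpa = weighted_mean(grade_points, credit_hours)
print(round(gpa, 1))
```

With equal weights the function reduces to the ordinary mean, which is a convenient sanity check.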
Example 3-4: Tablet Sales
Example 3-5: Tornadoes in the U.S.
Examples: Mode
Unimodal: Find the mode of the signing bonuses of eight NFL players for a specific year. The bonuses, in millions of dollars, are
18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10
You may find it easier to sort first.
10, 10, 10, 11.3, 12.4, 14.0, 18.0, 34.5
The value 10 occurs three times, more often than any other value, so the mode is $10 million.
No mode: Find the mode for the number of branches that six banks have.
401, 344, 209, 201, 227, 353
Each value occurs only once, so there is no mode.
Note: Do not say that the mode is zero. That would be incorrect, because in some data, such as temperature, zero can be an actual value.
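The two cases above (a single mode and no mode) can be sketched with a frequency count. The helper below is illustrative only. It follows the convention used in these slides that a data set in which every value occurs once has no mode, and it signals that with an empty list.

```python
from collections import Counter

def modes(values):
    """Return the sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs once: report no mode (not a mode of zero).
        return []
    return sorted(v for v, c in counts.items() if c == highest)

bonuses = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]
branches = [401, 344, 209, 201, 227, 353]

print(modes(bonuses))    # one mode
print(modes(branches))   # no mode
```

Because the function returns a list, it also covers bimodal and multimodal data, where two or more values tie for the highest frequency.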
Example 3-7: Licensed Nuclear Reactors
104 and 109 both occur the most (5 times). The data
set is said to be bimodal.
Measures of Central Tendency: Midrange
Properties and Uses of Central Tendency
The Mean
• Defined rigorously by a mathematical formula, which makes it highly
amenable to mathematical treatment
• Uses all data values.
• Varies less than the median or mode when samples are taken
from the same population and all three measures are
computed for these samples.
• Used in computing other statistics, such as the variance
• Unique, usually not one of the data values
• Cannot be used with open-ended classes
• Affected by extremely high or low values, called outliers
Properties of the Median
Properties of the Mode
Properties of the Midrange
➢Easy to compute.
➢Gives the midpoint.
➢Affected by extremely high or low values in a
data set
Distribution Shapes
Exercise
3. High Temperatures The reported high temperatures (in degrees Fahrenheit) for
selected world cities on an October day are shown below. Find (i) the mean, (ii) the
median, (iii) the mode, and (iv) the mid-range. Which measure of central tendency
do you think best describes these data?
62 72 66 79 83 61 62 85 72 64 74 71
42 38 91 66 77 90 74 63 64 68 42
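Solution:
(i) Mean:
\[ \bar{X} = \frac{62 + 72 + 66 + \cdots + 42}{23} = \frac{1566}{23} \approx 68.1 \]
(ii) Arrange the observations in ascending order:
38 42 42 61 62 62 63 64 64 66 66 68 71 72 72 74 74 77
79 83 85 90 91
With 23 values, the median is the 12th value: Median = 68.
(iii) The modes are 42, 62, 64, 66, 72, and 74 (each occurs twice).
(iv) Midrange:
\[ \text{Midrange} = \frac{38 + 91}{2} = \frac{129}{2} = 64.5 \]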
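The four answers can be cross-checked with Python's standard `statistics` module. This is a sketch for verification rather than part of the original solution. `multimode` requires Python 3.8 or later, and the variable names are chosen here.

```python
from statistics import mean, median, multimode

# High temperatures (degrees Fahrenheit) from Exercise 3.
temps = [62, 72, 66, 79, 83, 61, 62, 85, 72, 64, 74, 71,
         42, 38, 91, 66, 77, 90, 74, 63, 64, 68, 42]

temp_mean = mean(temps)
temp_median = median(temps)                       # middle value of the sorted data
temp_modes = sorted(multimode(temps))             # every value tied for the top count
temp_midrange = (min(temps) + max(temps)) / 2     # (lowest + highest) / 2

print(round(temp_mean, 1), temp_median, temp_modes, temp_midrange)
```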
Exercise
Exercise
Solution:
Extending the concept
36. If the mean of five values is 64, find the sum of the values. Answer: 320
37. If the mean of five values is 8.2 and four of the values are 6, 10, 7, and 12, find the fifth value.
38. Find the mean of 10, 20, 30, 40, and 50.
a. Add 10 to each value and find the mean. Answer: 40
b. Subtract 10 from each value and find the mean. Answer: 20
c. Multiply each value by 10 and find the mean. Answer: 300
d. Divide each value by 10 and find the mean. Answer: 3
e. Make a general statement about each situation.
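The exercises above all rest on the identity sum = mean × n. A short Python sketch, with names chosen here for illustration, shows one way to check the arithmetic:

```python
from statistics import mean

# Exercise 36: the sum equals the mean times the number of values.
total_36 = 64 * 5

# Exercise 37: the fifth value is the required total minus the four known values.
fifth_value = 8.2 * 5 - sum([6, 10, 7, 12])

# Exercise 38: apply each transformation to every value, then take the mean.
values = [10, 20, 30, 40, 50]
base_mean = mean(values)
added = mean([v + 10 for v in values])
subtracted = mean([v - 10 for v in values])
multiplied = mean([v * 10 for v in values])
divided = mean([v / 10 for v in values])

print(total_36, round(fifth_value, 1))
print(base_mean, added, subtracted, multiplied, divided)
```

Comparing each transformed mean with the base mean is a useful starting point for part e.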
3-2 Measures of Variation
         Group 1   Group 2
            45        70
           100        75
            80        80
Total      225       225
The mean score for both groups of students is 75, but their performance is not the same.
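To make the contrast concrete, the sketch below computes the mean and the range (highest value minus lowest value) for each group. The function name is chosen here for illustration.

```python
group_1 = [45, 100, 80]
group_2 = [70, 75, 80]

def mean_and_range(scores):
    """Return (mean, range) where range = highest value - lowest value."""
    return sum(scores) / len(scores), max(scores) - min(scores)

mean_1, range_1 = mean_and_range(group_1)
mean_2, range_2 = mean_and_range(group_2)

print(mean_1, range_1)
print(mean_2, range_2)
```

Both groups share the same mean, but the first group's scores are spread over a much wider interval, which is exactly what a measure of variation is meant to capture.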
Measures of Variation: Range
The average for both brands is the same, but the range
for Brand A is much greater than the range for Brand B.
Bluman, Chapter 3
Measures of Variation: Variance & Standard Deviation
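The population variance is
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} \]
and the population standard deviation is
\[ \sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}} \]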
Uses or purposes:
- To determine the spread of the data.
- To determine the consistency of a variable.
- To determine the number of data values that fall within a specified interval in a distribution (Chebyshev’s Theorem).
- Used in inferential statistics.
Example 3-21: Outdoor Paint
Find the variance and standard deviation for the data set for
Brand A paint. 10, 60, 50, 30, 40, 20
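\[ \mu = \frac{\sum X}{N} = \frac{210}{6} = 35 \]
Months, X    X − µ    (X − µ)²
   10         −25       625
   60          25       625
   50          15       225
   30          −5        25
   40           5        25
   20         −15       225
Total 210       0      1750
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} \approx 291.7 \]
\[ \sigma = \sqrt{\frac{1750}{6}} \approx 17.1 \]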
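The table-based calculation can be mirrored step by step in Python. This is a sketch with variable names chosen here. The last two lines compare the result with the standard library's `pvariance` and `pstdev`, which implement the population formulas.

```python
from statistics import pstdev, pvariance

# Brand A paint: months each sample lasted (Example 3-21).
months = [10, 60, 50, 30, 40, 20]

N = len(months)
mu = sum(months) / N                               # population mean
deviations = [x - mu for x in months]              # X - mu column
squared_deviations = [d ** 2 for d in deviations]  # (X - mu)^2 column

variance = sum(squared_deviations) / N             # divide by N (population)
std_dev = variance ** 0.5

print(mu, sum(squared_deviations), round(variance, 1), round(std_dev, 1))

# Cross-check against the library implementations.
library_variance = pvariance(months)
library_std_dev = pstdev(months)
```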
Measures of Variation: Variance & Standard Deviation
(Sample Theoretical Model)
• The sample variance is
\[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} \]
• The sample standard deviation is
\[ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} \]
Variance & Standard Deviation
(Shortcut or sample computational formula for s² and s)
\[ s^2 = \frac{n \sum X^2 - \left( \sum X \right)^2}{n(n - 1)} \]
\[ s = \sqrt{s^2} \]
Example: European Auto Sales
   X        X²
  11.2    125.44
  11.9    141.61
  12.0    144.00
  12.8    163.84
  13.4    179.56
  14.3    204.49
Total 75.6    958.94
\[ s^2 = \frac{n \sum X^2 - \left( \sum X \right)^2}{n(n - 1)} = \frac{6(958.94) - (75.6)^2}{6(5)} \approx 1.28 \]
\[ s = \sqrt{1.28} \approx 1.13 \]
Note that \( \sum X^2 \) is not the same as \( \left( \sum X \right)^2 \).
- The notation \( \sum X^2 \) means to square the values first, then sum.
- The notation \( \left( \sum X \right)^2 \) means to sum the values first, then square the sum.
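The distinction between the sum of squares and the square of the sum is easy to see in code. The sketch below, with names chosen here, computes the sample variance of the auto sales data twice: once from the definitional formula (squared deviations from the mean divided by n − 1) and once from the shortcut formula. It then compares both with `statistics.variance`.

```python
from statistics import variance

sales = [11.2, 11.9, 12.0, 12.8, 13.4, 14.3]
n = len(sales)

# Definitional formula: sum of squared deviations from the mean, over n - 1.
x_bar = sum(sales) / n
s2_definition = sum((x - x_bar) ** 2 for x in sales) / (n - 1)

# Shortcut formula: needs only the sum of X and the sum of X squared.
sum_x = sum(sales)                    # sum first ...
sum_x2 = sum(x * x for x in sales)    # ... versus square first, then sum
s2_shortcut = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

s = s2_shortcut ** 0.5
library_s2 = variance(sales)          # sample variance from the standard library

print(round(sum_x, 1), round(sum_x2, 2), round(s2_shortcut, 2), round(s, 2))
```

The two formulas are algebraically equivalent. In floating-point arithmetic the shortcut form can lose precision when the mean is large relative to the spread, so the definitional form is usually the safer choice in software.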
(Shortcut or sample computational formula for s² and s for grouped data)
• The steps for finding the variance and standard deviation for grouped data are summarized in this Procedure Table.
Example 3-22: Miles run per week
Find the variance and the standard deviation for the frequency
distribution of the data in Example 2–7. The data represent the
number of miles that 20 runners ran during one week.
Solution of Example 3-22
\[ s^2 = \frac{n \left( \sum f \cdot X_m^2 \right) - \left( \sum f \cdot X_m \right)^2}{n(n - 1)} = \frac{20(13{,}310) - (490)^2}{20(20 - 1)} \approx 68.7 \]
Take the square root to get the standard deviation:
\[ s = \sqrt{68.7} \approx 8.3 \]
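The frequency distribution itself is not shown in this section, so the following sketch relies on an assumed table. The midpoints and frequencies are hypothetical, picked only because they reproduce the three totals used above: n = 20, a sum of f times X_m equal to 490, and a sum of f times X_m squared equal to 13,310. With that caveat, the grouped shortcut formula could be evaluated like this:

```python
# Hypothetical frequency table (an assumption, not given in this section).
midpoints = [8, 13, 18, 23, 28, 33, 38]     # class midpoints X_m
frequencies = [1, 2, 3, 5, 4, 3, 2]         # class frequencies f

n = sum(frequencies)
sum_f_xm = sum(f * xm for f, xm in zip(frequencies, midpoints))
sum_f_xm2 = sum(f * xm ** 2 for f, xm in zip(frequencies, midpoints))

# Shortcut formula for grouped data.
s2 = (n * sum_f_xm2 - sum_f_xm ** 2) / (n * (n - 1))
s = s2 ** 0.5

print(n, sum_f_xm, sum_f_xm2, round(s2, 1), round(s, 1))
```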
Measures of Variation: Coefficient of Variation
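Whenever two samples have the same units of measure, the variance and standard deviation for each can be compared directly. When the units differ, the coefficient of variation, which expresses the standard deviation as a percentage of the mean, is used instead:
\[ \text{CVar} = \frac{s}{\bar{X}} \cdot 100\% \]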
Example 3-23: Sales of Automobiles
Sales:
\[ \text{CVar} = \frac{5}{87} \cdot 100\% \approx 5.7\% \]
Commissions:
\[ \text{CVar} = \frac{773}{5225} \cdot 100\% \approx 14.8\% \]
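The two coefficients can be reproduced with a small helper. The function name is chosen here for illustration, and the inputs are the standard deviations and means that appear in the fractions above.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a mean of 0")
    return std_dev / mean * 100

cvar_sales = coefficient_of_variation(5, 87)
cvar_commissions = coefficient_of_variation(773, 5225)

print(round(cvar_sales, 1), round(cvar_commissions, 1))
```

Because the coefficient is unit-free, the two percentages can be compared directly: the larger value indicates more variability relative to the mean.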
Range Rule of Thumb
Example: Range Rule of Thumb
The range rule of thumb can be used to estimate the largest and smallest
data values of a data set. The smallest data value will be approximately 2
standard deviations below the mean, and the largest data value will be
approximately 2 standard deviations above the mean of the data set.
For example, with a mean of 10 and a standard deviation of 3:
Low ≈ 10 − 2(3) = 4
High ≈ 10 + 2(3) = 16
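The rule described above amounts to two subtractions and additions, sketched below with the mean of 10 and standard deviation of 3 that appear in the calculation. The function name is chosen here for illustration.

```python
def range_rule_of_thumb(mean, std_dev):
    """Estimate (low, high) as the mean minus/plus two standard deviations."""
    return mean - 2 * std_dev, mean + 2 * std_dev

low, high = range_rule_of_thumb(10, 3)
print(low, high)
```

These are rough estimates only. Actual data sets can contain values outside this interval.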
Measures of Variation: Chebyshev’s Theorem
Chebyshev’s Theorem
Example 3-25: Prices of Homes
Thus, at least 75% of all homes sold in the area will have a price between $30,000 and $70,000.
Example 3-26: Travel Allowances
Thus, at least 84% of the data values will fall between $0.20 and $0.30.
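Chebyshev's theorem states that at least 1 − 1/k² of the values in any data set lie within k standard deviations of the mean, for k > 1. The sketch below applies it to the two examples. The means and standard deviations are not given in this section, so the values used here are assumptions chosen to match the stated intervals: mean $50,000 and standard deviation $10,000 with k = 2, and mean $0.25 and standard deviation $0.02 with k = 2.5.

```python
import math

def chebyshev_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2

def chebyshev_interval(mean, std_dev, k):
    """Return (low, high), the interval k standard deviations around the mean."""
    return mean - k * std_dev, mean + k * std_dev

# Assumed inputs (not stated in this section), consistent with the quoted results.
homes_interval = chebyshev_interval(50_000, 10_000, 2)
homes_bound = chebyshev_bound(2)

travel_interval = chebyshev_interval(0.25, 0.02, 2.5)
travel_bound = chebyshev_bound(2.5)

print(homes_interval, f"{homes_bound:.0%}")
print([round(v, 2) for v in travel_interval], f"{travel_bound:.0%}")
```

The bound holds for any distribution shape, which is why it is weaker than the percentages given by the empirical rule for normal distributions.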
Measures of Variation: Empirical Rule (Normal)
Linear transformation of the data
Example:
Suppose you own a store with five employees whose hourly salaries are $10, $13, $10, $11, $16.
x̄ = $12, s = 2.550
You then decide to give each employee a raise of $1.00 per hour, so the new salaries are $11, $14, $11, $12, $17.
x̄ = $13, s = 2.550
The mean increases by the amount added to the data, but the standard deviation does not change.
Example:
Suppose that the five employees worked the following numbers of hours per week: 15, 12, 18, 20, 10
x̄ = 15, s = 4.123
If every value is doubled (30, 24, 36, 40, 20), then
x̄ = 30, s = 8.246
The mean and the standard deviation are both doubled.
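Both transformation examples can be checked with the standard library. The sketch below assumes, as the reported results suggest, that the second example doubles every hours value. Variable names are chosen here for illustration.

```python
from statistics import mean, stdev

# Adding a constant: each hourly salary is raised by $1.
salaries = [10, 13, 10, 11, 16]
raised = [x + 1 for x in salaries]

# Multiplying by a constant: each weekly hours value is doubled (assumed).
hours = [15, 12, 18, 20, 10]
doubled = [2 * x for x in hours]

print(mean(salaries), round(stdev(salaries), 3))
print(mean(raised), round(stdev(raised), 3))     # mean shifts, s unchanged
print(mean(hours), round(stdev(hours), 3))
print(mean(doubled), round(stdev(doubled), 3))   # mean and s both scale
```

In general, adding a constant c shifts the mean by c and leaves s unchanged, while multiplying by a constant c multiplies the mean by c and s by |c|.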