Module 1.3 - Central Tendency and Variability of Data
Mean
o For a sample of n values:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$$
o For a finite population of N values:
$$\mu = \frac{X_1 + X_2 + \dots + X_N}{N}$$
Mean
o For ungrouped frequency distribution:
$$\bar{X} = \frac{\sum (f \cdot X)}{n}$$
Mean
o For ungrouped frequency distribution:
Scores, X Frequency, f
0 2
1 4
2 12
3 4
4 3
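The table above can be worked through in a few lines. This is a minimal sketch of the ungrouped-frequency mean; the variable names are illustrative and are not part of the original slides.

```python
# Scores X and their frequencies f, taken from the table above.
scores = [0, 1, 2, 3, 4]
freqs = [2, 4, 12, 4, 3]

n = sum(freqs)                                             # total number of observations
weighted_sum = sum(f * x for x, f in zip(scores, freqs))   # sum of f * X
mean = weighted_sum / n                                    # X-bar = sum(f * X) / n

print(f"n = {n}, sum(f*X) = {weighted_sum}, mean = {mean:.2f}")
```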
Mean
o For grouped frequency distribution, where X_m is the class midpoint:
$$\bar{X} = \frac{\sum (f \cdot X_m)}{n}$$
Mean
o For grouped frequency distribution:
Class Frequency, f
15.5 – 20.5 3
20.5 – 25.5 5
25.5 – 30.5 4
30.5 – 35.5 3
35.5 – 40.5 2
Mean
o For grouped frequency distribution:
Class          Frequency, f   X_m    f ∗ X_m
15.5 – 20.5    3              18     54
20.5 – 25.5    5              23     115
25.5 – 30.5    4              28     112
30.5 – 35.5    3              33     99
35.5 – 40.5    2              38     76
$$\bar{X} = \frac{\sum (f \cdot X_m)}{n}$$
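The grouped-mean calculation can be sketched as follows. The code assumes each class is represented by its midpoint, taken as the average of the two class boundaries, which reproduces the X_m values 18, 23, 28, 33 and 38 in the table. The variable names are illustrative.

```python
# Class boundaries and frequencies from the grouped table above.
classes = [(15.5, 20.5), (20.5, 25.5), (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
freqs = [3, 5, 4, 3, 2]

# Midpoint of each class: average of its lower and upper boundary.
midpoints = [(low + high) / 2 for low, high in classes]

# f * X_m for each class, then the grouped mean sum(f * X_m) / n.
products = [f * xm for f, xm in zip(freqs, midpoints)]
n = sum(freqs)
grouped_mean = sum(products) / n

print(f"n = {n}, sum(f*Xm) = {sum(products)}, mean = {grouped_mean:.2f}")
```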
Median
o When a data set is ordered, it is called a data array.
o Solution:
o Data array: 180, 186, 191, 201, 202, 209, 219, 220.
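As a quick check on the data array above, here is a small sketch using Python's standard `statistics` module. With an even number of values, the median is taken as the average of the two middle values of the ordered data.

```python
import statistics

# Data array from the solution above (already ordered).
data_array = [180, 186, 191, 201, 202, 209, 219, 220]

# Even count: average the two middle values (positions 4 and 5).
middle = len(data_array) // 2
manual_median = (data_array[middle - 1] + data_array[middle]) / 2

# The library function applies the same rule.
median = statistics.median(data_array)

print(f"median = {median}")
```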
Solution:
o Mode = 8 days
Mode
Example: Six strains of bacteria were tested to see how
long they could remain alive outside their normal
environment. The time, in minutes, is given below. Find the
mode.
Solution:
o Data set: 15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26.
Solution:
o There are two modes (bimodal). The values are 18 and 24.
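The bimodal result above can be reproduced with `statistics.multimode` (available from Python 3.8), which returns every value tied for the highest frequency. This is only an illustrative check, not part of the original solution.

```python
import statistics
from collections import Counter

# Data set from the solution above.
data = [15, 18, 18, 18, 20, 22, 24, 24, 24, 26, 26]

# Frequency of each distinct value.
counts = Counter(data)

# All values that share the highest frequency.
modes = statistics.multimode(data)

print(f"frequencies = {dict(counts)}")
print(f"modes = {modes}")
```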
Mode
o For ungrouped frequency distribution:
Scores, X Frequency, f
0 2
1 4
2 12
3 4
4 3
Mode = 2
Mode
o For grouped frequency distribution:
Class Frequency, f
15.5 – 20.5 3
20.5 – 25.5 5
25.5 – 30.5 4
30.5 – 35.5 3
35.5 – 40.5 2
Solution:
o The modal class is 20.5 – 25.5, since it has the highest frequency (f = 5).
Distribution Shapes
o Frequency distributions can assume many shapes.
o The three most important shapes are positively skewed,
symmetrical, and negatively skewed.
[Figures: positively skewed, symmetrical, and negatively skewed distributions]
Variability of Data
Measures of Variation
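o So far we have presented statistics that describe the central
tendency of a data set. Two data sets can share the same mean
yet differ in how widely their values are spread, as Sets A and
B below show.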
SET A SET B
No. Score No. Score
1 12 1 15
2 10 2 9
3 13 3 10
4 12 4 7
5 15 5 16
6 14 6 17
7 12 7 18
8 15 8 19
9 12 9 8
10 10 10 6
Mean 12.5 Mean 12.5
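The contrast between the two sets can be made concrete with a short sketch. It assumes, for illustration only, that each set is treated as a complete population, so `statistics.pstdev` is used. The range (largest value minus smallest value) is included as a simpler measure of spread.

```python
import statistics

# Scores from the SET A / SET B table above.
set_a = [12, 10, 13, 12, 15, 14, 12, 15, 12, 10]
set_b = [15, 9, 10, 7, 16, 17, 18, 19, 8, 6]

for name, scores in (("SET A", set_a), ("SET B", set_b)):
    mean = statistics.mean(scores)
    value_range = max(scores) - min(scores)
    # Assumption: the ten scores are the whole population.
    pop_sd = statistics.pstdev(scores)
    print(f"{name}: mean = {mean}, range = {value_range}, SD = {pop_sd:.2f}")
```

Both sets have the same mean, but SET B has a much larger range and standard deviation.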
Measures of Variation
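o Variance and standard deviation (SD) can be used to determine the
spread of the data.
o For a population of N values with mean μ:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}, \qquad \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$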
Population Variance and Standard Deviation
Example: Consider the following data to constitute the
population: 10, 60, 50, 30, 40, 20. Find the mean and
variance.
Solution:
$$\mu = \frac{10 + 60 + 50 + 30 + 40 + 20}{6} = 35$$
X      X − μ    (X − μ)²
10     -25      625
60     +25      625
50     +15      225
30     -5       25
40     +5       25
20     -15      225
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{1750}{6} = 291.67$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{291.67} = 17.08$$
Note: Do not round off at an early stage of the computation. For the
standard deviation, use the exact value of the variance from your
calculator. The two-decimal values are shown only to present the solution.
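The population example can be checked step by step. This sketch follows the definition directly (mean, squared deviations, then division by N) and keeps full precision until the final printout, in line with the note on rounding. Variable names are illustrative.

```python
import math

# Population values from the example above.
population = [10, 60, 50, 30, 40, 20]
N = len(population)

mu = sum(population) / N                            # population mean
squared_devs = [(x - mu) ** 2 for x in population]  # (X - mu)^2 for each value
variance = sum(squared_devs) / N                    # sigma^2, divide by N
sd = math.sqrt(variance)                            # sigma, from the unrounded variance

print(f"mean = {mu}, variance = {variance:.2f}, SD = {sd:.2f}")
```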
Sample Variance
o The sample variance is an unbiased estimator of the population
variance: a statistic whose expected value equals the population
variance. It is denoted by $s^2$, where
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
Alternative Formula:
$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$$
Sample Standard Deviation
o The sample standard deviation is the square root of the
sample variance.
$$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
o Alternative Formula:
$$s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$$
Sample Variance and Sample Standard
Deviation
Example: Find the variance and standard deviation for the
following sample: 16, 19, 15, 15, 14.
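Solution:
$$\sum X = 79, \qquad \sum X^2 = 1263$$
$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1} = \frac{1263 - \frac{(79)^2}{5}}{5 - 1} = 3.7$$
$$s = \sqrt{3.7} = 1.92$$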
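The sample example can be verified with both the alternative (shortcut) formula and the definition. In this sketch `statistics.variance` supplies the definitional value, since it divides by n − 1. The two results should agree up to floating-point error.

```python
import math
import statistics

# Sample from the example above.
sample = [16, 19, 15, 15, 14]
n = len(sample)

sum_x = sum(sample)                    # sum of X
sum_x2 = sum(x ** 2 for x in sample)   # sum of X^2

# Alternative formula: s^2 = (sum X^2 - (sum X)^2 / n) / (n - 1)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = math.sqrt(s2_shortcut)

# Definitional formula via the standard library (divides by n - 1).
s2_definition = statistics.variance(sample)

print(f"sum X = {sum_x}, sum X^2 = {sum_x2}")
print(f"s^2 = {s2_shortcut:.2f}, s = {s:.2f}")
```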
Sample Standard Deviation
o For ungrouped frequency distribution:
$$s^2 = \frac{\sum (f \cdot X^2) - \frac{\left[\sum (f \cdot X)\right]^2}{n}}{n - 1}, \qquad s = \sqrt{s^2}$$
Sample Standard Deviation
o For ungrouped frequency distribution:
X      Frequency, f
5      2
6      3
7      8
8      1
9      6
10     4
Sample Standard Deviation
o For ungrouped frequency distribution:
Solution:
X      f      f ∗ X    f ∗ X²
5      2      10       50
6      3      18       108
7      8      56       392
8      1      8        64
9      6      54       486
10     4      40       400
n = 24, Σ(f ∗ X) = 186, Σ(f ∗ X²) = 1500
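The solution table stops at the column values, so the remaining arithmetic is sketched here. The code applies the ungrouped-frequency formula for s² to the table above; the names are illustrative.

```python
import math

# Values X and frequencies f from the solution table above.
values = [5, 6, 7, 8, 9, 10]
freqs = [2, 3, 8, 1, 6, 4]

n = sum(freqs)
sum_fx = sum(f * x for x, f in zip(values, freqs))        # sum of f * X
sum_fx2 = sum(f * x ** 2 for x, f in zip(values, freqs))  # sum of f * X^2

# s^2 = (sum(f * X^2) - (sum(f * X))^2 / n) / (n - 1)
s2 = (sum_fx2 - sum_fx ** 2 / n) / (n - 1)
s = math.sqrt(s2)

print(f"n = {n}, sum(f*X) = {sum_fx}, sum(f*X^2) = {sum_fx2}")
print(f"s^2 = {s2:.2f}, s = {s:.2f}")
```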
Sample Standard Deviation
o For grouped frequency distribution:
Solution:
Class          Frequency, f   X_m    f ∗ X_m   f ∗ X_m²
5.5 – 10.5     1              8      8         64
10.5 – 15.5    2              13     26        338
15.5 – 20.5    3              18     54        972
20.5 – 25.5    5              23     115       2645
25.5 – 30.5    4              28     112       3136
30.5 – 35.5    3              33     99        3267
35.5 – 40.5    2              38     76        2888
n = 20
$$s = \sqrt{\frac{\sum (f \cdot X_m^2) - \frac{\left[\sum (f \cdot X_m)\right]^2}{n}}{n - 1}}$$
$$s = \sqrt{68.68}$$
$$s = 8.29$$
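One way to see why the grouped formula works is to expand each class into its midpoint repeated f times and compute an ordinary sample standard deviation. This sketch compares the two routes, under the usual assumption that every value in a class equals the class midpoint.

```python
import math
import statistics

# Midpoints X_m and frequencies f from the grouped table above.
midpoints = [8, 13, 18, 23, 28, 33, 38]
freqs = [1, 2, 3, 5, 4, 3, 2]

n = sum(freqs)
sum_fxm = sum(f * xm for xm, f in zip(midpoints, freqs))        # sum of f * X_m
sum_fxm2 = sum(f * xm ** 2 for xm, f in zip(midpoints, freqs))  # sum of f * X_m^2

# Grouped formula for the sample variance and standard deviation.
s2 = (sum_fxm2 - sum_fxm ** 2 / n) / (n - 1)
s = math.sqrt(s2)

# Cross-check: repeat each midpoint f times and use the plain sample SD.
expanded = [xm for xm, f in zip(midpoints, freqs) for _ in range(f)]
s_expanded = statistics.stdev(expanded)

print(f"s^2 = {s2:.2f}, s = {s:.2f}, expanded-data s = {s_expanded:.2f}")
```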
Coefficient of Variation
o The coefficient of variation (CV) is the standard
deviation divided by the mean, expressed as a
percentage:
$$CV = \frac{s}{\bar{X}} \cdot 100\%$$
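As an illustration, the coefficient of variation can be computed for the sample 16, 19, 15, 15, 14 used earlier in this module. Reusing that sample here is an assumption made for the sketch; the slides do not pair it with this definition.

```python
import statistics

# Sample reused from the earlier sample-variance example (illustrative choice).
sample = [16, 19, 15, 15, 14]

mean = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)     # sample standard deviation (n - 1 divisor)

# Coefficient of variation: standard deviation over mean, as a percentage.
cv_percent = s / mean * 100

print(f"mean = {mean}, s = {s:.2f}, CV = {cv_percent:.2f}%")
```

Because it is a ratio, the coefficient of variation has no units, which makes it useful for comparing spread across data sets measured on different scales.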
Chebyshev’s Theorem
o The proportion of values from a data set that will fall within
k standard deviations of the mean will be at least $1 - 1/k^2$, where
k is any number greater than 1.
Solution:
For k = 2, at least 1 − 1/2² = 75% of the values will lie within 2
standard deviations of the mean:
$50,000 + 2($10,000) = $70,000
$50,000 − 2($10,000) = $30,000
So at least 75% of the values lie between $30,000 and $70,000.
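The calculation above generalizes to any k greater than 1. The helper below is a hypothetical function written for this sketch; it returns the minimum proportion 1 − 1/k² together with the interval endpoints.

```python
def chebyshev_interval(mean, sd, k):
    """Return (minimum proportion, lower bound, upper bound) for k > 1."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    proportion = 1 - 1 / k ** 2          # Chebyshev's lower bound
    return proportion, mean - k * sd, mean + k * sd

# Values from the solution above: mean $50,000, standard deviation $10,000, k = 2.
proportion, low, high = chebyshev_interval(50_000, 10_000, 2)
print(f"At least {proportion:.0%} of values lie between ${low:,} and ${high:,}")
```

The bound holds for any distribution shape, which is why it is stated as "at least" rather than as an exact proportion.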
Solution:
Note: Do not round off at an early stage of the computation. Use the exact value from
your calculator. The two-decimal values are shown only to present the solution.