MMW Descriptive Statistics
• Measures of Central Tendency – summary measures that describe a whole set of data with a single quantity
that represents the middle or center of its distribution. They show how the data cluster around a central
value, that is, where the center of the data set is located. The most commonly used measures of central
tendency are the mean, median, and mode.
Mean
Example 1: Six friends in a biology class of 20 students receive test grades of 92, 84, 65, 76, 88, and 90. Find the
mean of these test scores.
$$\bar{x} = \frac{\sum X}{N} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5$$
Example 2: The ages of five contestants in a Statistics Quiz Bee are the following: 18, 17, 18, 19, and 18.
$$\bar{x} = \frac{\sum X}{N} = \frac{18 + 17 + 18 + 19 + 18}{5} = \frac{90}{5} = 18$$
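The two mean computations above can be checked with a short Python sketch. The helper name `arithmetic_mean` is an illustrative choice, not something from the notes; it simply applies x̄ = ∑X / N.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many values there are."""
    return sum(values) / len(values)

# Data copied from Example 1 (test grades) and Example 2 (contestant ages).
grades = [92, 84, 65, 76, 88, 90]
ages = [18, 17, 18, 19, 18]

grade_mean = arithmetic_mean(grades)
age_mean = arithmetic_mean(ages)

print(grade_mean)  # 82.5
print(age_mean)    # 18.0
```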
Median
Example 1 (Odd)
Example 2 (Even):
Example 3: Seven mothers were selected and given a blood pressure check. Their blood pressures were recorded:
135, 121, 119, 130, 121, 131,
𝐱̃ = 𝟒𝟏𝟏
Mode
Example 1: Find the mode of the data set: 15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33, 27, 25, 22, and 30.
Arranged in ascending order: 15, 22, 22, 22, 25, 25, 27, 28, 30, 33, 34, 39, 43, 43, 44, 48, 49
Mode = 22 (Unimodal)
Example 2: The typing speeds per minute of ten stenographers are as follows: 121, 110, 120, 119, 112, 121, 118,
115, 107, 115.
Arranged in ascending order: 107, 110, 112, 115, 115, 118, 119, 120, 121, 121
Modes = 115 and 121 (Bimodal)
None
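As a rough check of the two mode examples, the sketch below uses `statistics.multimode` from the Python standard library (Python 3.8 or later), which returns every value that ties for the highest frequency. The variable names are illustrative.

```python
from statistics import multimode

# Data copied from Example 1 and Example 2.
scores = [15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33, 27, 25, 22, 30]
typing_speeds = [121, 110, 120, 119, 112, 121, 118, 115, 107, 115]

# multimode lists the most frequent value(s); sorting makes the output easier to read.
score_modes = sorted(multimode(scores))
speed_modes = sorted(multimode(typing_speeds))

print(score_modes)  # [22]
print(speed_modes)  # [115, 121]
```

For data in which no value repeats, `multimode` returns every value, which corresponds to a data set with no mode.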
Weighted mean
• The weighted mean of the n numbers X1, X2, X3, …, Xn with the respective assigned weights W1, W2, W3, …, Wn is
$$\text{Weighted Mean} = \frac{\sum (x \times w)}{\sum w}$$
• ∑(x × w) is the sum of the products formed by multiplying each number by its assigned weight
• ∑w is the sum of all the weights
Example 1: Many colleges use the 4-point grading system: A = 4, B = 3, C = 2, D = 1, F = 0. Find the grade point
average of Dillon’s grades in the given semester.
Course      Grade   Units
English     B       4
History     A       3
Chemistry   D       3
Algebra     C       4
$$\frac{\sum (x \times w)}{\sum w} = \frac{(3 \times 4) + (4 \times 3) + (1 \times 3) + (2 \times 4)}{14} = \frac{35}{14} = 2.5$$
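The grade point average in this example is a weighted mean with the grade points as the values and the units as the weights. A minimal sketch of that computation follows; the dictionary and variable names are assumptions made for illustration.

```python
# Grade points on the 4-point system described in the example.
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# (course, letter grade, units) copied from the table of Dillon's grades.
courses = [
    ("English", "B", 4),
    ("History", "A", 3),
    ("Chemistry", "D", 3),
    ("Algebra", "C", 4),
]

# Numerator: sum of (grade point x units); denominator: sum of the units.
weighted_sum = sum(grade_points[grade] * units for _, grade, units in courses)
total_units = sum(units for _, _, units in courses)
gpa = weighted_sum / total_units

print(weighted_sum, total_units)  # 35 14
print(gpa)                        # 2.5
```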
Frequency Distribution
• It lists the observed events and the frequency of occurrence of each observed event
• Often used to organize raw data
Frequency Distribution
Number of computers Number of households
x f
0 5
1 12
2 14
3 3
4 2
5 3
6 0
8 1
N = 40
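Median of grouped data:
$$\tilde{x} = lb_{mc} + \left[ \frac{\frac{\sum f}{2} - cf}{f_{mc}} \right] cw$$
where:
• ∑f = total frequencies
Mode of grouped data:
$$\hat{x} = lb_{mo} + \left[ \frac{D_1}{D_1 + D_2} \right] cw$$
where:
• lbmo = lower boundary of the modal class
• D1 = difference between the frequencies of the modal class and the class preceding it
• D2 = difference between the frequencies of the modal class and the class succeeding it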
Scores    Frequency (f)   Lower boundary (lb)   Class mark (x)   fx    Cumulative frequency (cf)
11 – 15   1               10.5                  13               13    1
16 – 20   2               15.5                  18               36    3
21 – 25   5               20.5                  23               115   8
26 – 30   11              25.5                  28               308   19
31 – 35   12              30.5                  33               396   31
36 – 40   11              35.5                  38               418   42
41 – 45   5               40.5                  43               215   47
46 – 50   1               45.5                  48               48    48
N = 48, ∑fx = 1549
• Lower boundary (lb) = first score of the class − 0.5
• Class mark (x) = sum of the two scores of the class, divided by 2
• fx = frequency multiplied by the class mark
Example: Compute the mean, median, and mode of the scores of the students in a basic statistics test.
Mean: $$\bar{x} = \frac{\sum fx}{N}$$
Scores    Frequency (f)   Lower boundary (lb)   Class mark (x)   fx
11 – 15   1               10.5                  13               13
16 – 20   2               15.5                  18               36
21 – 25   5               20.5                  23               115
26 – 30   11              25.5                  28               308
31 – 35   12              30.5                  33               396
36 – 40   11              35.5                  38               418
41 – 45   5               40.5                  43               215
46 – 50   1               45.5                  48               48
N = 48, ∑fx = 1549
• ∑fx = 1549
• N = 48
$$\bar{x} = \frac{1549}{48} \approx 32.27$$
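The grouped mean can be reproduced directly from the class limits and frequencies. The sketch below is one way to do it; the tuple layout `(lower limit, upper limit, frequency)` is an assumption chosen for illustration.

```python
# Each class is (lower limit, upper limit, frequency), copied from the score table.
classes = [
    (11, 15, 1), (16, 20, 2), (21, 25, 5), (26, 30, 11),
    (31, 35, 12), (36, 40, 11), (41, 45, 5), (46, 50, 1),
]

# N is the total frequency; each class mark is the midpoint of its two limits.
total_frequency = sum(f for _, _, f in classes)
sum_fx = sum(f * (low + high) / 2 for low, high, f in classes)
grouped_mean = sum_fx / total_frequency

print(total_frequency, sum_fx)  # 48 1549.0
print(round(grouped_mean, 2))   # 32.27
```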
Median: $$\tilde{x} = lb_{mc} + \left[ \frac{\frac{\sum f}{2} - cf}{f_{mc}} \right] cw$$
• ∑f / 2 = 48 / 2 = 24 (this tells which class is the median class)
• lbmc = 30.5
• fmc = 12
• cf = 19 (cumulative frequency before the median class)
• cw = 5 (31 to 35)
Scores    Frequency (f)   Lower boundary (lb)   Cumulative frequency (cf)
11 – 15   1               10.5                  1
16 – 20   2               15.5                  3
21 – 25   5               20.5                  8
26 – 30   11              25.5                  19
31 – 35   12              30.5                  31    (Median class)
36 – 40   11              35.5                  42
41 – 45   5               40.5                  47
46 – 50   1               45.5                  48
∑f = 48
Solution:
$$\tilde{x} = 30.5 + \left[ \frac{24 - 19}{12} \right] 5 = 30.5 + \left[ \frac{5}{12} \right] 5 = 30.5 + 2.08 = 32.58$$
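The median steps (find ∑f / 2, locate the median class through the cumulative frequency, then interpolate) can be sketched as a small function. The name `grouped_median` and the tuple layout are illustrative assumptions, and the code assumes integer class limits so that the lower boundary is the lower limit minus 0.5.

```python
def grouped_median(classes):
    """Median of grouped data; classes are (lower limit, upper limit, frequency)."""
    half = sum(f for _, _, f in classes) / 2
    cumulative = 0  # cumulative frequency before the current class
    for low, high, f in classes:
        if cumulative + f >= half:
            lower_boundary = low - 0.5
            class_width = high - low + 1
            return lower_boundary + (half - cumulative) / f * class_width
        cumulative += f
    raise ValueError("classes must contain at least one observation")

classes = [
    (11, 15, 1), (16, 20, 2), (21, 25, 5), (26, 30, 11),
    (31, 35, 12), (36, 40, 11), (41, 45, 5), (46, 50, 1),
]

median_score = grouped_median(classes)
print(round(median_score, 2))  # 32.58
```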
Mode: $$\hat{x} = lb_{mo} + \left[ \frac{D_1}{D_1 + D_2} \right] cw$$
Solution:
$$\hat{x} = 30.5 + \left[ \frac{1}{1 + 1} \right] 5 = 30.5 + \left[ \frac{1}{2} \right] 5 = 30.5 + 2.5 = 33$$
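The mode formula can be sketched the same way. The function name `grouped_mode` is hypothetical, and the sketch assumes a single modal class whose neighbours have lower frequencies, as in this example, so that D1 + D2 is not zero.

```python
def grouped_mode(classes):
    """Mode of grouped data; classes are (lower limit, upper limit, frequency)."""
    frequencies = [f for _, _, f in classes]
    i = frequencies.index(max(frequencies))  # position of the modal class
    low, high, f = classes[i]
    before = frequencies[i - 1] if i > 0 else 0
    after = frequencies[i + 1] if i < len(frequencies) - 1 else 0
    d1 = f - before  # modal frequency minus the preceding class frequency
    d2 = f - after   # modal frequency minus the succeeding class frequency
    lower_boundary = low - 0.5
    class_width = high - low + 1
    return lower_boundary + d1 / (d1 + d2) * class_width

classes = [
    (11, 15, 1), (16, 20, 2), (21, 25, 5), (26, 30, 11),
    (31, 35, 12), (36, 40, 11), (41, 45, 5), (46, 50, 1),
]

mode_score = grouped_mode(classes)
print(mode_score)  # 33.0
```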
Measures of Position
• It tells where a score stands relative to the others in a set of data
• Measures whether a value is about average, or whether it is unusually high or low
• Used for quantitative data that fall on some numerical scale
• Can be applied to other variables
• Methods: General Method, Linear Interpolation, Mendenhall and Sincich Method
Quartiles
Example 1: The owner of a coffee shop recorded the number of customers who came into his café each hour in a day. The
results were: 14, 10, 12, 9, 17, 5, 8, 9, 14, 10, and 11. Find the lower quartile and upper quartile of the data.
Arrange the scores in ascending order: 5, 8, 9, 9, 10, 10, 11, 12, 14, 14, 17
Example 2: Consider the set of scores in a quiz in Math 10 of Section Rizal: 11, 13, 14, 15, 15, 16, 19, 19, 20. Find
Q1, Q2, and Q3.
Deciles
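Example 1: Find the 3rd decile of the following test scores of a random sample of ten students: 35, 42, 40, 28, 15, 23,
33, 20, 18, and 28.
Arranged in ascending order: 15, 18, 20, 23, 28, 28, 33, 35, 40, 42
Solution:
Position of $D_k = \frac{kn}{10}$, so the position of $D_3 = \frac{3(10)}{10} = \frac{30}{10} = 3$
If the position is a whole number, average that score and the next one:
$$D_3 = \frac{3\text{rd} + 4\text{th}}{2} = \frac{20 + 23}{2} = \frac{43}{2} = 21.5$$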
Example 2: Mrs. Rogon, a Mathematics teacher, gives a 60-item test to a remedial class. The scores of 15
students are 20, 35, 55, 28, 46, 32, 25, 56, 55, 28, 37, 60, 47, 52, 17. Find the values of the 2nd decile, 7th decile,
and 8th decile.
Arranged in ascending order: 17, 20, 25, 28, 28, 32, 35, 37, 46, 47, 52, 55, 55, 56, 60
Solution: 2nd Decile
Position of $D_2 = \frac{2(15)}{10} = \frac{30}{10} = 3$
If the position is a whole number, average that score and the next one:
$$D_2 = \frac{3\text{rd} + 4\text{th}}{2} = \frac{25 + 28}{2} = \frac{53}{2} = 26.5$$
Solution: 7th Decile
Position of $D_7 = \frac{7(15)}{10} = \frac{105}{10} = 10.5$, which rounds up to the 11th score
$D_7 = 52$
Solution: 8th Decile
Position of $D_8 = \frac{8(15)}{10} = \frac{120}{10} = 12$
If the position is a whole number, average that score and the next one:
$$D_8 = \frac{12\text{th} + 13\text{th}}{2} = \frac{55 + 55}{2} = \frac{110}{2} = 55$$
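The decile examples all follow the same position rule: compute kn / 10; if it is a whole number, average that score with the next one, otherwise round up to the next score. The sketch below encodes that rule as it appears in these examples. The function name `decile` is an assumption, and other textbooks use different interpolation rules.

```python
def decile(data, k):
    """Return the k-th decile (k = 1..9) using the position rule from the examples."""
    values = sorted(data)
    # Position = k * n / 10, kept as an integer quotient and remainder.
    whole, remainder = divmod(k * len(values), 10)
    if remainder == 0:
        # Whole-number position: average that score and the next one.
        return (values[whole - 1] + values[whole]) / 2
    # Fractional position: round up to the next score (0-based index `whole`).
    return values[whole]

sample = [35, 42, 40, 28, 15, 23, 33, 20, 18, 28]
remedial = [20, 35, 55, 28, 46, 32, 25, 56, 55, 28, 37, 60, 47, 52, 17]

print(decile(sample, 3))    # 21.5
print(decile(remedial, 2))  # 26.5
print(decile(remedial, 7))  # 52
print(decile(remedial, 8))  # 55.0
```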
Percentiles
Example 1: The list shows the number of bottles of strawberry jam sold in a day by 14 different vendors: 9, 6, 10,
12, 15, 13, 9, 11, 17, 15, 18, 20. Solve for P43, P60, and P75
9, 10, 11, 12, 13, 15, 15, 16, 17, 18, 19, 20
Solution: 43rd Percentile
Position of $P_{43} = \frac{43(12)}{100} = \frac{516}{100} = 5.16$, which rounds up to the 6th value
$P_{43} = 15$
Solution: 60th Percentile
Position of $P_{60} = \frac{60(12)}{100} = \frac{720}{100} = 7.2$, which rounds up to the 8th value
$P_{60} = 16$
Solution: 75th Percentile
Position of $P_{75} = \frac{75(12)}{100} = \frac{900}{100} = 9$
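The percentile positions can be checked with the same kind of position rule, dividing by 100 instead of 10. The sketch below uses the twelve values in the ascending list given for this example. Averaging the 9th and 10th values when the position is a whole number is an assumption carried over from the decile examples; the notes stop at position 9 for P75. The function name `percentile` is illustrative.

```python
def percentile(data, k):
    """Return the k-th percentile (k = 1..99) using the kn/100 position rule."""
    values = sorted(data)
    whole, remainder = divmod(k * len(values), 100)
    if remainder == 0:
        # Assumed rule for a whole-number position: average with the next value.
        return (values[whole - 1] + values[whole]) / 2
    # Fractional position: round up to the next value.
    return values[whole]

# The ascending list used in the percentile solutions (n = 12).
bottles = [9, 10, 11, 12, 13, 15, 15, 16, 17, 18, 19, 20]

print(percentile(bottles, 43))  # 15
print(percentile(bottles, 60))  # 16
print(percentile(bottles, 75))  # 17.5
```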
• A measure of variability of a set of data is a number that conveys the idea of spread for the data set
• Range, variance, and standard deviation
Range
• It measures the distance between the largest and smallest values and, as such, gives an idea of the spread of
the data set
• Does not use the concept of deviation
• It is affected by outliers and does not consider all values in the data set
• Least reliable; not a very useful measure of variability
• Range (R) = highest value – lowest value
Variance
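Find the variance and standard deviation: The following numbers were obtained by sampling a population: 2, 4, 7, 12,
15.
$$\bar{x} = \frac{2 + 4 + 7 + 12 + 15}{5} = 8$$
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{118}{4} = 29.5$$
$$s = \sqrt{29.5} \approx 5.43$$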
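The sample variance can be traced step by step in a few lines of Python. This is a minimal sketch with illustrative variable names; it divides by n − 1 because the numbers are a sample.

```python
import math

sample = [2, 4, 7, 12, 15]
n = len(sample)

sample_mean = sum(sample) / n
# Sum of squared deviations from the mean: the numerator of the variance.
squared_deviations = sum((x - sample_mean) ** 2 for x in sample)
sample_variance = squared_deviations / (n - 1)
sample_sd = math.sqrt(sample_variance)

print(sample_mean, squared_deviations, sample_variance)  # 8.0 118.0 29.5
print(round(sample_sd, 2))                               # 5.43
```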
Example 2: A consumer group has tested a sample of 8 size-D batteries from each of 3 companies. The results of the
tests are shown in the following table. According to these tests, which company produces batteries for which the
values representing hours of constant use have the smallest standard deviation?
Company Hours of constant use per battery
EverSoBright 6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3
Dependable 6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2
Beacon 6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5
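Each sample has a mean of 7 hours.
EverSoBright:
$$s = \sqrt{\frac{(6.2 - 7)^2 + (6.4 - 7)^2 + (7.1 - 7)^2 + (5.9 - 7)^2 + (8.3 - 7)^2 + (5.3 - 7)^2 + (7.5 - 7)^2 + (9.3 - 7)^2}{7}} = \sqrt{\frac{12.34}{7}} \approx 1.33 \text{ h}$$
Dependable:
$$s = \sqrt{\frac{(6.8 - 7)^2 + (6.2 - 7)^2 + (7.2 - 7)^2 + (5.9 - 7)^2 + (7.0 - 7)^2 + (7.4 - 7)^2 + (7.3 - 7)^2 + (8.2 - 7)^2}{7}} = \sqrt{\frac{3.62}{7}} \approx 0.72 \text{ h}$$
Beacon:
$$s = \sqrt{\frac{(6.1 - 7)^2 + (6.6 - 7)^2 + (7.3 - 7)^2 + (5.7 - 7)^2 + (7.1 - 7)^2 + (7.6 - 7)^2 + (7.1 - 7)^2 + (8.5 - 7)^2}{7}} = \sqrt{\frac{5.38}{7}} \approx 0.88 \text{ h}$$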
Company        Standard deviation
EverSoBright   s = 1.33 h
Dependable     s = 0.72 h
Beacon         s = 0.88 h
The batteries from Dependable have the smallest standard deviation. According to these results, the Dependable
company produces the most consistent batteries with regard to life expectancy under constant use.
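The comparison of the three companies can be reproduced with `statistics.stdev`, which computes the sample standard deviation (dividing by n − 1). The dictionary layout below is an illustrative choice.

```python
from statistics import stdev

# Hours of constant use per battery, copied from the table.
hours = {
    "EverSoBright": [6.2, 6.4, 7.1, 5.9, 8.3, 5.3, 7.5, 9.3],
    "Dependable": [6.8, 6.2, 7.2, 5.9, 7.0, 7.4, 7.3, 8.2],
    "Beacon": [6.1, 6.6, 7.3, 5.7, 7.1, 7.6, 7.1, 8.5],
}

# Sample standard deviation for each company, rounded to two decimals.
deviations = {company: round(stdev(values), 2) for company, values in hours.items()}
most_consistent = min(deviations, key=deviations.get)

print(deviations)       # {'EverSoBright': 1.33, 'Dependable': 0.72, 'Beacon': 0.88}
print(most_consistent)  # Dependable
```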
Lesson 3: Measures of Variation: Range, Variance, and Standard Deviation for Grouped Data
Range
Scores of 40 students in a 60-point quiz
R = 58.5 – 4.5 = 54
$$\bar{x} = \frac{\sum fx}{N} = \frac{1212}{40} = 30.3$$
Variance: $s^2 = 160.98$
Standard Deviation: $s = \sqrt{160.98} \approx 12.69$
Solution:
$$\frac{4.2 - 1.4}{4.2 + 1.4} = 0.5$$
$$0.5 \times 100 = 50\%$$
So,
$$\frac{3.625 - 1.8}{3.625 + 1.8} \times 100 = 33.64\%$$
So,
$$\frac{M.D(\bar{x})}{\bar{x}} \times 100 = \frac{0.76}{2.9} \times 100 \approx 26.207\%$$
About median:
$$M.D(\tilde{x}) = \frac{\sum |x - \tilde{x}|}{n} = \frac{7.61}{10} = 0.761$$
$\tilde{x} = 3.15$
So,
$$\frac{M.D(\tilde{x})}{\tilde{x}} \times 100 = \frac{0.761}{3.15} \times 100 \approx 24.159\%$$
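4. Coefficient of Variation
$$\frac{\sigma}{\bar{x}} \times 100$$
$$\sigma = \sqrt{\frac{\sum x^2}{n} - \left( \frac{\sum x}{n} \right)^2} = \sqrt{\frac{92.1}{10} - (2.9)^2} = 0.894$$
So,
$$\frac{\sigma}{\bar{x}} \times 100 = \frac{0.894}{2.9} \times 100 \approx 30.828\%$$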
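The coefficient of variation can be recomputed from the summary values quoted in the solution (∑x² = 92.1, n = 10, x̄ = 2.9). The sketch below rounds σ to three decimals before dividing, as the worked solution does; the variable names are illustrative.

```python
import math

sum_of_squares = 92.1  # sum of x squared, from the solution
n = 10
mean = 2.9

# Population standard deviation from the shortcut formula, rounded as in the notes.
sigma = round(math.sqrt(sum_of_squares / n - mean ** 2), 3)
coefficient_of_variation = sigma / mean * 100

print(sigma)                               # 0.894
print(round(coefficient_of_variation, 3))  # 30.828
```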