Chapter 4
Data Management
Introduction
Data management tools play an important role in analyzing and interpreting data.
Using these tools properly helps clarify ideas that might otherwise be
misunderstood.
This module deals with measures of central tendency, measures of dispersion,
measures of relative position, and normal distribution.
Learning Outcomes:
At the end of this chapter, you are expected to:
LESSON 1:
MEASURES OF CENTRAL TENDENCY
The mean (also known as the arithmetic mean) is the most commonly used
measure of central position. It is the sum of the measures divided by the number of
measures in a variable. It is symbolized as x̄ (read as “x bar”). The mean is appropriate
to use when the data are measured on at least an interval scale.
To find the mean of ungrouped data, use the formula
Remember!
\[ \bar{x} = \frac{\sum x}{n} \]
where:
∑x = sum of entries
n = number of entries
Example 1: The grades in Chemistry of 10 students are 87, 84, 85, 85, 86, 90, 79, 82,
78, 76. What is the average grade of the 10 students?
Solution:
\[ \bar{x} = \frac{87+84+85+85+86+90+79+82+78+76}{10} = \frac{832}{10} = 83.2 \]
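The computation in Example 1 can be checked with a short Python sketch. The variable names are illustrative only; the data are the ten grades given in the example.

```python
# Grades in Chemistry from Example 1
grades = [87, 84, 85, 85, 86, 90, 79, 82, 78, 76]

# Mean of ungrouped data: sum of entries divided by number of entries
mean = sum(grades) / len(grades)
print(mean)  # 83.2
```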
Weighted Mean
Occasionally, we want to find the mean of a set of values wherein each value
or measurement has a different weight or degree of importance. We call this the
weighted mean and the formula for computing it is as follows:
Remember!
\[ \bar{x} = \frac{\sum xW}{\sum W} \]
where:
x = measurement or value
W = weight
Example 2: Below are Maria’s subjects and the corresponding number of units and
grades she got for the first grading period. Compute her grade point
average.
Solution:
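\[ \bar{x} = \frac{\sum xW}{\sum W} \]
\[ \bar{x} = \frac{80(1.5) + 82(1.5) + 83(1) + 81(2) + 80(1) + 85(1.5) + 82(2)}{10.5} \]
\[ \bar{x} = \frac{859.5}{10.5} \]
\[ \bar{x} = 81.86 \]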
Therefore, Maria has the GPA of 81.86 for the first grading period.
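As a rough check of Example 2, the sketch below recomputes the weighted mean in Python. The grade and unit pairs are read off the solution above; the variable names are assumptions made for illustration.

```python
# (grade x, units W) pairs as they appear in the solution of Example 2
grades_units = [(80, 1.5), (82, 1.5), (83, 1), (81, 2), (80, 1), (85, 1.5), (82, 2)]

total_weighted = sum(x * w for x, w in grades_units)  # sum of xW
total_weight = sum(w for _, w in grades_units)        # sum of W

gpa = total_weighted / total_weight
print(total_weighted, total_weight, round(gpa, 2))  # 859.5 10.5 81.86
```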
When the number of items in a set of data is too big, items are grouped for
convenience. The manner of computing for the mean of grouped data is given by the
formula:
Remember!
\[ \bar{x} = \frac{\sum fx}{\sum f} \]
where:
x = class mark (midpoint of a class interval)
f = frequency of each class
Example 3: Compute the mean of the scores of the students in a Mathematics test.
Solution: The frequency distribution for the data is given. The columns x and fx are
added.
Class Interval f x fx
46 - 50 1 48 48
41 - 45 5 43 215
36 - 40 11 38 418
31 - 35 12 33 396
26 - 30 11 28 308
21 - 25 5 23 115
16 - 20 2 18 36
11 - 15 1 13 13
Ʃ𝑓 = 48 Ʃ𝑓𝑥 = 1,549
\[ \bar{x} = \frac{\sum fx}{\sum f} \]
\[ \bar{x} = \frac{1{,}549}{48} \]
\[ \bar{x} = 32.27 \]
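The grouped-data mean of Example 3 can be reproduced with the following sketch. The class marks and frequencies are copied from the table above; the names used are illustrative.

```python
# (class mark x, frequency f) pairs from the table in Example 3
classes = [(48, 1), (43, 5), (38, 11), (33, 12), (28, 11), (23, 5), (18, 2), (13, 1)]

sum_fx = sum(f * x for x, f in classes)  # total of the fx column
sum_f = sum(f for _, f in classes)       # total frequency

grouped_mean = sum_fx / sum_f
print(sum_f, sum_fx, round(grouped_mean, 2))  # 48 1549 32.27
```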
Example 4: Solve for the mean gross sale of Aling Mely’s Sari-sari Store for one
month.
Solution: The frequency distribution for the data is given below. The columns x and fx
are added.
Sales in Pesos f x fx
4,501 - 5,000 3 4,750 14,250
4,001 - 4,500 4 4,250 17,000
3,501 - 4,000 6 3,750 22,500
3,001 - 3,500 5 3,250 16,250
2,501 - 3,000 7 2,750 19,250
2,001 - 2,500 3 2,250 6,750
1,501 - 2,000 1 1,750 1,750
1,001 - 1,500 1 1,250 1,250
Ʃ𝑓 = 30 Ʃ𝑓𝑥 = 99,000
\[ \bar{x} = \frac{\sum fx}{\sum f} \]
\[ \bar{x} = \frac{99{,}000}{30} \]
\[ \bar{x} = 3{,}300 \]
The median is the middle entry or term in a set of data arranged in either
increasing or decreasing order. The median is a positional measure, so it depends on
the order and number of the measures rather than on the size of the extreme values.
This measure is appropriate to use when the data are measured on at least an ordinal
scale, since ranking of the data is involved.
To find the median of a given set of data, take note of the following:
Example 5: The number of books borrowed in the library from Monday to Friday last
week were 58, 60, 54, 35, and 97 respectively. Find the median.
Example 6: Cora’s quizzes for the second quarter are 8, 7, 6, 10, 9, 5, 9, 6, 10, and 7.
Find the median.
5, 6, 6, 7, 7, 8, 9, 9, 10, 10
Since the number of measures is even, then the median is the average of
the two middle scores.
\[ Md = \frac{7 + 8}{2} = 7.5 \]
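The two cases (odd and even number of measures) can be sketched in Python as follows. The function name `median` is a hypothetical helper written for illustration; the data come from Examples 5 and 6.

```python
def median(values):
    """Middle entry of the ordered data, or the average of the two middle entries."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                      # odd count: a single middle entry
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two

books = [58, 60, 54, 35, 97]                  # Example 5
quizzes = [8, 7, 6, 10, 9, 5, 9, 6, 10, 7]    # Example 6
print(median(books))    # 58
print(median(quizzes))  # 7.5
```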
Median of Grouped Data
To find the median of grouped data, identify first the median class, the class
interval holding the median. Since the median divides the distribution into two equal
parts, first get 50% of the total number of cases or scores. Then identify the interval
containing the score where 50% of the cases would fall below this value.
In computing for the median of grouped data, the following formula is used:
Remember!
\[ Md = lb_{mc} + \left( \frac{\frac{\sum f}{2} - cf}{f_{mc}} \right) i \]
where:
lb_mc = true lower limit, or lower class boundary, of the median class
cf = cumulative frequency of the class immediately below the median class
f_mc = frequency of the median class
f = frequency of each class
i = class size
The median class is the class that contains the \( \frac{\sum f}{2} \)th quantity. The
computed median must be within the median class.
Example 7: Compute the median of the scores of the students in a Mathematics test.
Solution: The frequency distribution for the data is given below. The columns for lb
and “less than” cumulative frequency are added.
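Since \( \frac{\sum f}{2} = \frac{48}{2} = 24 \), the 24th quantity is in the class 31 - 35.
Hence, the median class is 31 - 35.
\[ Md = lb_{mc} + \left( \frac{\frac{\sum f}{2} - cf}{f_{mc}} \right) i \]
\[ Md = 30.5 + \left( \frac{\frac{48}{2} - 19}{12} \right) 5 \]
\[ Md = 30.5 + 2.08 \]
\[ Md = 32.58 \]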
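A minimal sketch of the grouped-median formula is given below, using the Mathematics test distribution (lower boundaries and frequencies as tabulated in this lesson). The helper `grouped_median` is hypothetical and assumes the classes are listed from lowest to highest.

```python
def grouped_median(classes, size):
    """classes: (lower boundary, frequency) pairs, lowest class first."""
    half = sum(f for _, f in classes) / 2
    cf = 0  # cumulative frequency of the classes below the current one
    for lb, f in classes:
        if cf + f >= half:              # this class holds the (sum f / 2)th quantity
            return lb + (half - cf) / f * size
        cf += f

scores = [(10.5, 1), (15.5, 2), (20.5, 5), (25.5, 11),
          (30.5, 12), (35.5, 11), (40.5, 5), (45.5, 1)]
print(round(grouped_median(scores, 5), 2))  # 32.58
```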
Example 8: Solve for the median gross sale of Aling Mely’s Sari-sari Store for one
month.
Solution: The frequency distribution for the data is given below. The columns for lb
and “less than” cumulative frequency are added.
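Since \( \frac{\sum f}{2} = \frac{30}{2} = 15 \), the 15th quantity is in the class
3,001 - 3,500. Hence, the median class is 3,001 - 3,500.
\[ Md = lb_{mc} + \left( \frac{\frac{\sum f}{2} - cf}{f_{mc}} \right) i \]
\[ Md = 3{,}000.5 + \left( \frac{\frac{30}{2} - 12}{5} \right) 500 \]
\[ Md = 3{,}000.5 + 300 \]
\[ Md = 3{,}300.5 \]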
The mode is another measure of central position. It is the measure or value
which occurs most frequently in a set of data, that is, the value with the greatest
frequency. The mode is appropriate to use when the variable is measured on a nominal
scale.
Solution: The mode is 6 since it is the shoe size that occurred the most number of
times.
Example 10: The sizes of 9 classes in a certain school are 50, 52, 55, 50, 51, 54, 55,
53, and 54. Find the mode.
Solution: The modes are 50, 54, and 55 since each of these three measures occurred
twice, more often than any other measure. The distribution is trimodal.
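The mode of grouped data can be approximated using the following formula:
Remember!
\[ Mo = Lb_{mo} + \left( \frac{D_1}{D_1 + D_2} \right) i \]
where:
Lb_mo = lower class boundary of the modal class
D_1 = difference between the frequency of the modal class and the frequency of the next lower class
D_2 = difference between the frequency of the modal class and the frequency of the next higher class
i = class size
The modal class is the class with the highest frequency. If bimodal classes
exist, any of these classes may be considered as the modal class.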
Example 11: Compute the mode of the scores of the students in a Mathematics test.
Solution: The frequency distribution for the data is given below. The column for lb is
added.
Class Interval f lb
46 - 50 1 45.5
41 - 45 5 40.5
36 - 40 11 35.5
31 - 35 12 30.5
26 - 30 11 25.5
21 - 25 5 20.5
16 - 20 2 15.5
11 - 15 1 10.5
Since class 31 - 35 has the highest frequency, the modal class is 31 - 35.
\[ Mo = Lb_{mo} + \left( \frac{D_1}{D_1 + D_2} \right) i \]
\[ Mo = 30.5 + \left( \frac{1}{1 + 1} \right) 5 \]
\[ Mo = 30.5 + 2.5 \]
\[ Mo = 33 \]
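The computation in Example 11 can be sketched in Python as shown below. The helper `grouped_mode` is hypothetical. It assumes that D1 is the modal class frequency minus the frequency of the next lower class and that D2 is the modal class frequency minus the frequency of the next higher class, which matches the values D1 = 1 and D2 = 1 used in the solution.

```python
def grouped_mode(classes, size):
    """classes: (lower boundary, frequency) pairs, lowest class first."""
    freqs = [f for _, f in classes]
    k = freqs.index(max(freqs))  # first class with the highest frequency
    below = freqs[k - 1] if k > 0 else 0
    above = freqs[k + 1] if k + 1 < len(freqs) else 0
    d1 = freqs[k] - below        # assumed meaning of D1
    d2 = freqs[k] - above        # assumed meaning of D2
    return classes[k][0] + d1 / (d1 + d2) * size

scores = [(10.5, 1), (15.5, 2), (20.5, 5), (25.5, 11),
          (30.5, 12), (35.5, 11), (40.5, 5), (45.5, 1)]
print(grouped_mode(scores, 5))  # 33.0
```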
Example 12. Solve for the modal gross sale of Aling Mely’s Sari-sari Store for one
month.
Solution: The frequency distribution for the data is given below. The column for lb is
added.
Sales in Pesos f lb
4,501 - 5,000 3 4,500.5
4,001 - 4,500 4 4,000.5
3,501 - 4,000 6 3,500.5
3,001 - 3,500 5 3,000.5
2,501 - 3,000 7 2,500.5
2,001 - 2,500 3 2,000.5
1,501 - 2,000 1 1,500.5
1,001 - 1,500 1 1,000.5
Since the class 2,501 - 3,000 has the highest frequency, the modal class is
2,501 - 3,000.
\[ Mo = Lb_{mo} + \left( \frac{D_1}{D_1 + D_2} \right) i \]
\[ Mo = 2{,}500.5 + \left( \frac{4}{4 + 2} \right) 500 \]
\[ Mo = 2{,}500.5 + 333.33 \]
\[ Mo = 2{,}833.83 \]
LESSON 2:
MEASURES OF DISPERSION
The measures that describe the degree of spread of the data are called
measures of dispersion, measures of variability, or measures of spread. These
measures are used to determine how scattered the values are in the distribution. In
this topic, we will consider four measures of dispersion, namely: range, average
deviation, variance, and standard deviation.
Remember!
𝑅 = 𝐻 − 𝐿
where:
H = highest measure
L = lowest measure
The main disadvantage of the range is that it does not consider every measure in
the data.
Example 13: Consider the four data sets presented below. Find the range of each
data set.
Data Set
Data Set 1 11 12 13 14 15
Data Set 2 13 14 15 17 19
Data Set 3 10 15 18 20 22
Data Set 4 21 23 25 27 30
Solution:
Comparing the data sets, Data Set 1 has the least variation because it has the
smallest value of R. On the other hand, Data Set 3 has the most variation
because it has the largest value of R.
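The ranges behind this comparison can be computed with a short sketch. The dictionary layout and names are illustrative; the values are the four data sets of Example 13.

```python
data_sets = {
    "Data Set 1": [11, 12, 13, 14, 15],
    "Data Set 2": [13, 14, 15, 17, 19],
    "Data Set 3": [10, 15, 18, 20, 22],
    "Data Set 4": [21, 23, 25, 27, 30],
}

# R = H - L for each data set
ranges = {name: max(values) - min(values) for name, values in data_sets.items()}
for name, r in ranges.items():
    print(name, r)
# Data Set 1 4
# Data Set 2 6
# Data Set 3 12
# Data Set 4 9
```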
The range of grouped data is simply the difference between the upper class
boundary of the top interval and the lower class boundary of the bottom interval.
Example 14: Find the range of the scores in Midterm Exam of BEEd First Year
Students.
Solution:
Upper class boundary (UCB) = 50.5
Lower class boundary (LCB) = 20.5
R = UCB − LCB = 50.5 − 20.5 = 30
Remember!
\[ AD = \frac{\sum |x - \bar{x}|}{n - 1} \]
Example 15. The raw scores of eight students in Statistics are given as follows: 17,
17, 26, 28, 30, 30, 31, and 37. Compute the average deviation.
Solution:
\[ AD = \frac{\sum |x - \bar{x}|}{n - 1} \]
\[ AD = \frac{42}{8 - 1} \]
\[ AD = \frac{42}{7} \]
\[ AD = 6 \]
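The intermediate values of Example 15 (the mean and the sum of absolute deviations) can be recovered with the sketch below. It follows the chapter's formula, which divides by n − 1; the variable names are illustrative.

```python
scores = [17, 17, 26, 28, 30, 30, 31, 37]  # raw scores from Example 15

n = len(scores)
mean = sum(scores) / n
total_abs_dev = sum(abs(x - mean) for x in scores)  # sum of |x - mean|

# Average deviation as defined in this chapter (divisor n - 1)
ad = total_abs_dev / (n - 1)
print(mean, total_abs_dev, ad)  # 27.0 42.0 6.0
```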
Example 16. The scores of nine students in Psychology are given as follows: 15, 19,
20, 24, 28, 30, 32, 32, and 40. Calculate the average deviation.
Solution:
\[ AD = \frac{\sum |x - \bar{x}|}{n - 1} \]
\[ AD = \frac{57.33}{9 - 1} \]
\[ AD = \frac{57.33}{8} \]
\[ AD = 7.17 \]
For the grouped data or scores organized in the form of frequency distribution,
the average deviation is computed as follows:
Remember!
\[ AD = \frac{\sum f_i |x_i - \bar{x}|}{n - 1} \]
The steps in determining the average deviation of grouped data are as follows:
Class Interval   f_i   x_i   x_i − x̄   |x_i − x̄|   f_i|x_i − x̄|
36 – 40           7    38     12.60      12.60        88.2
31 – 35          10    33      7.60       7.60        76
26 – 30           5    28      2.60       2.60        13
21 – 25          14    23     -2.40       2.40        33.6
16 – 20           6    18     -7.40       7.40        44.4
11 – 15           8    13    -12.40      12.40        99.2
x̄ = 25.40        n = 50                               Σ f_i|x_i − x̄| = 354.4
\[ AD = \frac{\sum f_i |x_i - \bar{x}|}{n - 1} \]
\[ AD = \frac{354.4}{50 - 1} \]
\[ AD = \frac{354.4}{49} \]
\[ AD = 7.23 \]
We say that the scores deviate from the mean of 25.40 by an average of 7.23
units.
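The table and the result above can be reproduced with a short sketch. The class marks and frequencies are taken from the table; the names are illustrative, and the divisor n − 1 follows the chapter's formula.

```python
# (class mark x_i, frequency f_i) pairs from the table above
classes = [(38, 7), (33, 10), (28, 5), (23, 14), (18, 6), (13, 8)]

n = sum(f for _, f in classes)
mean = sum(f * x for x, f in classes) / n
total = sum(f * abs(x - mean) for x, f in classes)  # sum of f_i * |x_i - mean|

ad = total / (n - 1)
print(n, round(mean, 2), round(total, 1), round(ad, 2))  # 50 25.4 354.4 7.23
```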
Another way to avoid a sum of zero for the deviation scores is to square each
deviation score and get the average of all squared deviation scores. The resulting
measure is called the “variance,” which has a squared unit. In symbols, \( s^2 \).
To compute the variance of ungrouped data, the following formula may be used:
Remember!
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \]
Example 18. Consider the data set below. Compute the variance of each data set.
Data Set
Data Set 1 13 16 14 10 15
Data Set 2 22 25 23 27 29
Solution:
Data Set 1:
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \]
\[ s^2 = \frac{21.2}{5} \]
\[ s^2 = 4.24 \]
Data Set 2:
\[ s^2 = \frac{\sum (x - \bar{x})^2}{n} \]
\[ s^2 = \frac{32.8}{5} \]
\[ s^2 = 6.56 \]
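The sums of squared deviations used in Example 18 can be verified with the sketch below. The helper `variance` is hypothetical and uses the divisor n, as in the ungrouped formula of this lesson.

```python
def variance(values):
    """Average of the squared deviations from the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

data_set_1 = [13, 16, 14, 10, 15]
data_set_2 = [22, 25, 23, 27, 29]
print(round(variance(data_set_1), 2))  # 4.24
print(round(variance(data_set_2), 2))  # 6.56
```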
When the data are presented in a frequency distribution, the following formula
is used:
Remember!
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} \]
Class Interval    f     x    x − x̄    (x − x̄)²    f(x − x̄)²
36 – 40           7    38    12.60     158.76     1,111.32
31 – 35          10    33     7.60      57.76       577.6
26 – 30           5    28     2.60       6.76        33.8
21 – 25          14    23    -2.40       5.76        80.64
16 – 20           6    18    -7.40      54.76       328.56
11 – 15           8    13   -12.40     153.76     1,230.08
x̄ = 25.40        Σf = 50                          Σf(x − x̄)² = 3,362
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} \]
\[ s^2 = \frac{3{,}362}{49} \]
\[ s^2 = 68.61 \]
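A minimal sketch of the grouped variance computation follows. It uses the class marks and frequencies from the table above and the divisor n − 1 from the grouped formula; the names are illustrative.

```python
# (class mark x, frequency f) pairs from the table above
classes = [(38, 7), (33, 10), (28, 5), (23, 14), (18, 6), (13, 8)]

n = sum(f for _, f in classes)
mean = sum(f * x for x, f in classes) / n
sum_sq = sum(f * (x - mean) ** 2 for x, f in classes)  # sum of f(x - mean)^2

grouped_variance = sum_sq / (n - 1)
print(round(sum_sq, 2), round(grouped_variance, 2))  # 3362.0 68.61
```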
Example 20: Consider the frequency distribution below. Calculate the variance of the
distribution.
Class Interval    f     x    x − x̄    (x − x̄)²    f(x − x̄)²
33 – 37           6    35    10.78    116.2084    697.2504
28 – 32           9    30     5.78     33.4084    300.6756
23 – 27          12    25     0.78      0.6084      7.3008
18 – 22           8    20    -4.22     17.8084    142.4672
13 – 17          10    15    -9.22     85.0084    850.084
x̄ = 24.22        Σf = 45                          Σf(x − x̄)² = 1,997.778
\[ s^2 = \frac{\sum f(x - \bar{x})^2}{n - 1} \]
\[ s^2 = \frac{1{,}997.778}{44} \]
\[ s^2 = 45.40 \]
Recall that, in the computation of the variance, the deviation was squared. This
implies that the variance is expressed in squared units. Extracting the square root of
the value of the variance will give the value of the standard deviation. In symbol, 𝑠.
To take the standard deviation of ungrouped data, extract the square root of
the variance. In mathematical formula,
Remember!
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]
Data Set
Data Set 1 13 16 14 10 15
Data Set 2 22 25 23 27 29
Solution:
Data Set 1:
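\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]
\[ s = \sqrt{4.24} \]
\[ s = 2.06 \]
Data Set 2:
\[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n}} \]
\[ s = \sqrt{6.56} \]
\[ s = 2.56 \]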
On the basis of the obtained standard deviation, we say that the scores in
Data Set 1 deviate from the mean by 2.06 units, on the average. For Data Set 2, the
scores deviate from the mean by an average of 2.56 units.
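These two values can also be checked with Python's standard library. The sketch below uses `statistics.pstdev`, which divides by n and therefore matches the ungrouped formula used here; the variable names are illustrative.

```python
from statistics import pstdev

data_set_1 = [13, 16, 14, 10, 15]
data_set_2 = [22, 25, 23, 27, 29]

# pstdev takes the square root of the variance computed with divisor n
s1 = pstdev(data_set_1)
s2 = pstdev(data_set_2)
print(round(s1, 2), round(s2, 2))  # 2.06 2.56
```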
To take the standard deviation of grouped data, extract the square root of the
variance. In mathematical formula,
Remember!
\[ s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} \]
Class Interval Frequency
36 – 40 7
31 – 35 10
26 – 30 5
21 – 25 14
16 – 20 6
11 – 15 8
Ʃ𝑓 = 50
Solution:
\[ s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} \]
\[ s = \sqrt{68.61} \]
\[ s = 8.28 \]
Example 23: Consider the frequency distribution below. Calculate the standard
deviation of the distribution.
Solution:
\[ s = \sqrt{\frac{\sum f(x - \bar{x})^2}{n - 1}} \]
\[ s = \sqrt{45.40} \]
\[ s = 6.74 \]
LESSON 3:
MEASURES OF RELATIVE POSITION
Quartiles (Q)
Quartiles are the score points which divide the distribution into four equal
parts. Each set of observations has 3 quartiles, denoted by Q1, Q2, and Q3.
Q1 Q2 Q3
a. 25% of the distribution has a value ≤ Q1 (lower quartile or the first quartile).
b. 50% of the distribution has a value ≤ Q2 (median or middle quartile).
c. 75% of the distribution has a value ≤ Q3 (upper quartile or the last quartile).
Deciles (D)
Deciles are the score points which divide the distribution into ten equal
parts. Each set of observations has 9 deciles, denoted by D1, D2, D3, …, D9.
D1 D2 D3 D4 D5 D6 D7 D8 D9
Percentiles (P)
Percentiles are the score points which divide the distribution into one
hundred equal parts. Each set of observations has 99 percentiles, denoted by
P1, P2, P3, …, P99.
Relationship Among Percentile, Decile, and Quartile
P10 = D1
P20 = D2
P25 = Q1
P50 = D5 = Q2 = median
P75 = Q3
P90 = D9
Remember!
𝑃𝑉 = 𝑋𝑗 + 𝑔 (𝑋𝑗+1 − 𝑋𝑗 )
Thus, PV is the number in the jth position (Xj) of the ordered data plus g
multiplied by the difference between the succeeding value (Xj+1) and (Xj).
Example 24: Find Q1, D5, P80, and P99 for the following data:
45 67 78 55 88 90 56 68 99 40
65 70 86 99 59 75 45 84 69 50
40 45 45 50 55 56 59 65 67 68
69 70 75 78 84 86 88 90 99 99
Since Q1 = P25, we get 25% (20+1) = 5.25, which means that j = 5 and
g = .25. Thus, Q1 is the 5th score (55) plus 0.25 of the
difference between the 6th score (56) and the 5th score (55). Hence,
𝑃𝑉 = 𝑋𝑗 + 𝑔 (𝑋𝑗+1 − 𝑋𝑗 )
𝑄1 = 55 + .25 (56 − 55)
𝑄1 = 55 + .25
𝑄1 = 55.25
Hence, 25% of the scores in the distribution are below 55.25.
Solving for D5.
𝑃𝑉 = 𝑋𝑗 + 𝑔 (𝑋𝑗+1 − 𝑋𝑗 )
𝐷5 = 68 + .50 (69 − 68)
𝐷5 = 68 + .50
𝐷5 = 68.50
For P80, we get 80% (20+1) = 16.8, which means that j = 16 and g
= .80. Thus, P80 is the 16th score (86) plus 0.80 of the
difference between the 17th score (88) and the 16th score (86).
Hence,
𝑃𝑉 = 𝑋𝑗 + 𝑔 (𝑋𝑗+1 − 𝑋𝑗 )
𝑃80 = 86 + .80 (88 − 86)
𝑃80 = 86 + 1.60
𝑃80 = 87.60
For P99, we get 99% (20+1) = 20.79. Thus, P99 is the score which
is .79 of the way from the 20th score to the next score. Since we do
not have a score beyond the 20th score we take the 20th score as the
value of P99. Therefore, P99 = 99.
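The procedure of Example 24 can be sketched in Python as follows. The helper `percentile_value` is hypothetical; it takes j and g as the whole-number and fractional parts of p%(n + 1), as in the solution above, and falls back to the first or last score when the position lies outside the data.

```python
def percentile_value(ordered, p):
    """ordered: data sorted in increasing order; p: percentile rank, e.g. 25 for P25."""
    n = len(ordered)
    position = p * (n + 1) / 100
    j = int(position)   # whole-number part of the position
    g = position - j    # fractional part of the position
    if j < 1:           # position before the first score
        return ordered[0]
    if j >= n:          # no score beyond the last one
        return ordered[-1]
    return ordered[j - 1] + g * (ordered[j] - ordered[j - 1])

data = sorted([45, 67, 78, 55, 88, 90, 56, 68, 99, 40,
               65, 70, 86, 99, 59, 75, 45, 84, 69, 50])
for label, p in [("Q1", 25), ("D5", 50), ("P80", 80), ("P99", 99)]:
    print(label, round(percentile_value(data, p), 2))
# Q1 55.25
# D5 68.5
# P80 87.6
# P99 99
```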
Remember!
\[ P_x = LL + \left( \frac{x\%(n) - F_b}{f} \right) c \]
where:
LL = true lower limit of the class interval containing Px
Fb = the sum of all frequencies below the interval containing Px (or the <cf
directly below the interval containing Px)
f = frequency of the interval containing Px
c = class size
n = total number of cases
Example 25. Consider the frequency distribution below. Find Q1, D4, and P90.
Solution:
\[ P_x = LL + \left( \frac{x\%(n) - F_b}{f} \right) c \]
\[ Q_1 = 15.5 + \left( \frac{12.5 - 8}{6} \right) 5 \]
\[ Q_1 = 15.5 + 3.75 \]
\[ Q_1 = 19.25 \]
Solving for D4.
\[ P_x = LL + \left( \frac{x\%(n) - F_b}{f} \right) c \]
\[ D_4 = 20.5 + \left( \frac{20 - 14}{14} \right) 5 \]
\[ D_4 = 20.5 + 2.14 \]
\[ D_4 = 22.64 \]
For P90, we first get 90%(n) to determine the class interval containing
P90. Note that 90% (50) = 45. With reference to the “<cf” column, 45 is
between 43 and 50, so the interval 35.5 – 40.5 contains P90. Thus, with
reference to this interval, we have LL = 35.5; Fb = 43; f = 7; and c = 5.
\[ P_x = LL + \left( \frac{x\%(n) - F_b}{f} \right) c \]
\[ P_{90} = 35.5 + \left( \frac{45 - 43}{7} \right) 5 \]
\[ P_{90} = 35.5 + 1.43 \]
\[ P_{90} = 36.93 \]
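The three results of Example 25 can be reproduced with the sketch below. The helper `grouped_percentile` is hypothetical. The distribution is an assumption: it is the 50-case distribution used in Lesson 2, which agrees with the LL, Fb, and f values quoted in the solution.

```python
def grouped_percentile(classes, size, p):
    """classes: (true lower limit, frequency) pairs, lowest class first; p: percentile rank."""
    n = sum(f for _, f in classes)
    target = p * n / 100          # x%(n)
    fb = 0                        # sum of frequencies below the current interval
    for ll, f in classes:
        if fb + f >= target:      # this interval contains P_x
            return ll + (target - fb) / f * size
        fb += f

# Assumed distribution (n = 50), consistent with the values in the solution
dist = [(10.5, 8), (15.5, 6), (20.5, 14), (25.5, 5), (30.5, 10), (35.5, 7)]
print(round(grouped_percentile(dist, 5, 25), 2))  # 19.25  (Q1)
print(round(grouped_percentile(dist, 5, 40), 2))  # 22.64  (D4)
print(round(grouped_percentile(dist, 5, 90), 2))  # 36.93  (P90)
```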
2. Draw line segment in the box that marks the median Q2.
3. Draw line segment (called whiskers) that extend from the box to the smallest
and largest values of the data.
40 45 45 50 55 56 59 65 67 68
69 70 75 78 84 86 88 90 99 99
Solution:
1. Smallest value: 40
2. Q1 = 55.25
3. Q2 = 68.50
4. Q3 = 85.5
5. Largest value: 99
The resulting box plot is:
[Figure: box plot drawn over a scale from 30 to 100, marking the smallest value (S), Q1, Q2, Q3, and the largest value (L)]
LESSON 4:
NORMAL DISTRIBUTION
Normal Distribution
[Figure: normal curve showing the asymptotic tails. Source: thoughtco.com]
The properties of the normal distribution are as follows:
1. It is bell-shaped and is symmetric with respect to the vertical line that
passes through the highest point of curve.
2. It is unimodal and the mean, median and mode are equal.
3. It is asymptotic with respect to the baseline, which means that the tails of
the distribution get closer and closer to the baseline without crossing the
baseline.
4. The total area under the curve and above the baseline is always equal to
1.0.
Empirical Rule
Because the area under the normal curve and above the baseline is 1.0, we consider
the normal curve as the graphic picture of the proportion of scores in a distribution. We
state below a common property of all normal curves with a given mean µ and standard
deviation σ. This property is called the empirical rule which highlights one
interpretation of the standard deviation as a concept of “distance”.
a. about 68.27% of all the cases are expected to fall between µ - σ and µ + σ.
b. about 95.45% of all the cases are expected to fall between µ - 2σ and µ +
2σ.
c. about 99.73% of all the cases are expected to fall between µ - 3σ and µ +
3σ.
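These percentages can be recovered numerically. The sketch below relies on the standard identity that, for a normal distribution, the proportion of cases within k standard deviations of the mean equals erf(k/√2); the identity and the variable names are not part of the module and are used here only as a check.

```python
import math

# Proportion of cases within k standard deviations of the mean
for k in (1, 2, 3):
    share = math.erf(k / math.sqrt(2))
    print(k, round(share * 100, 2))
# 1 68.27
# 2 95.45
# 3 99.73
```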
Example 27: Suppose the first-year college class consisting of 120 students posted a
mean score of 70 with a standard deviation of 9 in their final exam in
Math. Assuming that the scores are continuously and normally
distributed,
Solution:
b. Again, from the empirical rule, we expect about 95% of the scores to fall
between the values µ - 2σ and µ + 2σ. Since, µ - 2σ = 70 – 2(9) = 52 and µ
+ 2σ = 70 + 2(9) = 88, then, about 95% of the pupils are expected to score
between 52 and 88.
The standard score is the distance of the score from the mean in terms of the
standard deviation. It tells how many standard deviations the observed value lies
above or below the mean of its distribution. The standard score is useful in comparing
observed values from different distributions. To be able to find areas under the normal
curve, observed values must first be converted into standard scores, and these would
help solve statistical problems.
To change an observed value into standard score, you use the following
equation:
Remember!
\[ z = \frac{x - \bar{x}}{s} \]
Note: A positive z-score will mean that the score/observed value is above the
mean.
A negative z-score will mean that the score/observed value is below
the mean.
Example 28. In a given distribution, the mean is 65 and the standard deviation is 6.
Find the corresponding standard score of:
a. 68 b. 59
Solution:
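a. The corresponding z-score of 68 is
\[ z_{68} = \frac{x - \bar{x}}{s} = \frac{68 - 65}{6} = \frac{3}{6} = 0.5 \]
b. The corresponding z-score of 59 is
\[ z_{59} = \frac{x - \bar{x}}{s} = \frac{59 - 65}{6} = \frac{-6}{6} = -1.0 \]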
Example 29: On the final examination in Math, the mean grade was 82 and the
standard deviation was 8. In English, the mean grade was 86 and the
standard deviation was 10. Joseph scored 88 in Math and 92 in English.
In which subject was his standing higher?
Solution: The first thing that has to be done is to change the scores into standard
scores.
For English
\[ z_E = \frac{x - \bar{x}}{s} = \frac{92 - 86}{10} = \frac{6}{10} = 0.6 \]
For Math
\[ z_M = \frac{x - \bar{x}}{s} = \frac{88 - 82}{8} = \frac{6}{8} = 0.75 \]
His standing in Math was higher than his standing in English. He was 0.6
standard deviation above the mean in English and 0.75 standard deviation
above the mean in Math.
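The comparison in Example 29 can be sketched in Python. The helper `z_score` is hypothetical; the means and standard deviations are those stated in the example (Math: 82 and 8; English: 86 and 10).

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in units of the standard deviation."""
    return (x - mean) / sd

z_math = z_score(88, 82, 8)
z_english = z_score(92, 86, 10)
print(z_math, z_english)   # 0.75 0.6
print(z_math > z_english)  # True: the standing in Math is higher
```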
Remember!
\[ z = \frac{x - \mu}{\sigma} \]
interval, converting the interval to a z scale and then compute the probability by using
the standard normal distribution table.
Example 30. Find the area under the standard normal curve between the mean and
each given value of z:
a. z = -1.33
b. z = 1.75
Solution:
a. To find the area between the mean z = 0 and z = -1.33, we read the z
value of 1.3 on the first column, then the z value of 0.03 on the first row of
Table 1. The intersection of the identified row and column yields the
number 0.4082.
[Figure: standard normal curve with the area 0.4082 shaded between z = -1.33 and z = 0]
Thus, the area from the mean up to the value of z = -1.33 is 0.4082 or
40.82%
b. For z = 1.75, we read the z value of 1.7 on the first column, then the z
value of 0.05 on the first row. The intersection of the row and column yields
the number 0.4599.
[Figure: standard normal curve with the area 0.4599 shaded between z = 0 and z = 1.75]
Thus, the area from the mean up to the value of z = 1.75 is 0.4599 or
45.99%
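The table look-ups in Example 30 can be cross-checked with Python's standard library. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the helper `area_from_mean` is hypothetical and returns the area between the mean (z = 0) and a given z.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_from_mean(z):
    """Area under the standard normal curve between z = 0 and the given z."""
    return abs(std_normal.cdf(z) - 0.5)

print(round(area_from_mean(-1.33), 4))  # 0.4082
print(round(area_from_mean(1.75), 4))   # 0.4599
```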
Example 31: Find the area under the standard normal curve
Solution:
a. The area to the left of z = 2.0 includes the area from z = 0 to z = 2.0 plus
half of the entire area under the normal curve. From the table, the area
from the mean up to z = 2.0 is 0.4772. Therefore, the entire area to the
left of z = 2.0 is 0.5 + 0.4772 = 0.9772 or 97.72%.
[Figure: standard normal curve with the area 0.9772 shaded to the left of z = 2.0]
b. The area to the right of z = -1.0 includes the area from the mean down to
z = -1.0 plus half of the entire area under the normal curve. By symmetry,
the area from the mean down to z = -1.0 is equal to the area from the
mean up to z = 1.0, which is 0.3413. Thus, the entire area to the right of z
= -1.0 is 0.5 + 0.3413 = 0.8413 or 84.13%.
[Figure: standard normal curve with the area 0.8413 shaded to the right of z = -1.0]
c. To find the area to the right of z = 1.96, we first note that the area from the
mean to the entire right is 0.5. If we subtract the area from the mean up to
z = 1.96 from 0.5, we get the desired area to the right of z = 1.96. Using
the normal table, the area from the mean up to z = 1.96 is 0.4750.
Therefore, the area to the right of z = 1.96 is 0.5 – 0.4750 = 0.025 or
2.5%.
[Figure: standard normal curve with the area 0.025 shaded to the right of z = 1.96]
d. The area to the left of z = -2.56 is equal to the area to the right of z
= 2.56 by symmetry. Using the normal table, the area from the mean up to
z = 2.56 is 0.4948. Therefore, the area to the left of z = -2.56 is 0.5 – 0.4948
= 0.0052 or 0.52%.
[Figure: standard normal curve with the area 0.0052 shaded to the left of z = -2.56]
e. To find the area between z = 1.5 and z = 2.75, we get the area from the
mean up to z = 2.75, then subtract the area from the mean up to z = 1.5.
Using the normal table, the area from the mean up to z = 2.75 is 0.4970
while the area from the mean up to z = 1.5 is 0.4332. Therefore, the
desired area is given by 0.4970 – 0.4332 = 0.0638 or 6.38%.
[Figure: standard normal curve with the area 0.0638 shaded between z = 1.5 and z = 2.75]
f. The area from z = -1.0 to z = 2.0 can be obtained by adding the area
from the mean down to z = -1.0 and the area from the mean up to z =
2.0. By symmetry, the area from the mean down to z = -1.0 is equal to
the area from the mean up to z = 1.0, which is 0.3413. Also, the area
from the mean up to z = 2.0 is 0.4772. Therefore, the desired area is
given by 0.3413 + 0.4772 = 0.8185 or 81.85%.
[Figure: standard normal curve with the area 0.8185 shaded between z = -1.0 and z = 2.0]
Example 32. The average PAG-IBIG salary loan for RFS Pharmacy Inc. employees
is ₱23,000. If the debt is normally distributed with a standard deviation
of ₱2,500, find the probability that the employee owes less than ₱18,500.
Solution:
\[ z = \frac{x - \mu}{\sigma} = \frac{18{,}500 - 23{,}000}{2{,}500} = \frac{-4{,}500}{2{,}500} = -1.80 \]
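Step 3. Find the appropriate area. The area obtained from the Standardized
Normal Distribution Table is 0.4641, which corresponds to the area
between z = 0 and z = -1.80. Subtracting this from the area of the left half
of the curve gives the area to the left of z = -1.80:
P(x < 18,500) = 0.5000 - 0.4641
= 0.0359
[Figure: normal curve with the area 0.0359 shaded to the left of x = 18,500; the mean is at 23,000]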
Hence, the probability that the employee owes less than ₱18,500 in PAG-IBIG
salary loan is 0.0359 or 3.59%.
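The result of Example 32 can be cross-checked without a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative, and the mean and standard deviation are those given in the example.

```python
from statistics import NormalDist

# Loan amounts assumed normal with mean 23,000 and standard deviation 2,500
loans = NormalDist(mu=23000, sigma=2500)

z = (18500 - loans.mean) / loans.stdev
p_less = loans.cdf(18500)  # probability of owing less than 18,500
print(z, round(p_less, 4))  # -1.8 0.0359
```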
Example 33: The average age of bank managers is 40 years. Assume the variable is
normally distributed. If the standard deviation is 5 years, find the
probability that the age of a randomly selected bank manager will be in
the range between 35 and 46 years old.
Solution: Assume that ages of bank managers are normally distributed; then cut off
points are as shown in the figure below.
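[Figure: normal curve with cut-off points at 35, 40, and 46]
Step 2. Find the two z values.
\[ z = \frac{x - \mu}{\sigma} = \frac{35 - 40}{5} = \frac{-5}{5} = -1.00 \]
\[ z = \frac{x - \mu}{\sigma} = \frac{46 - 40}{5} = \frac{6}{5} = 1.20 \]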
Step 3. Find the appropriate area for z = -1.00 and z = 1.20 using the table.
P(35 < x < 46) = 0.3413 + 0.3849
= 0.7262
[Figure: normal curve with the area shaded between x = 35 (z = -1.00) and x = 46 (z = 1.20); the mean is at x = 40]
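As a cross-check of Example 33, the sketch below computes the same probability with `statistics.NormalDist`. Because the table values are rounded to four decimal places, the direct computation may differ from the table-based answer in the last digit, so the result is shown to three decimal places only. The variable names are illustrative.

```python
from statistics import NormalDist

# Ages assumed normal with mean 40 and standard deviation 5
ages = NormalDist(mu=40, sigma=5)

# Probability that a randomly selected manager is between 35 and 46 years old
p_between = ages.cdf(46) - ages.cdf(35)
print(round(p_between, 3))  # 0.726
```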
Chapter Exercises
Directions: Answer the following. Show all pertinent solutions.
1. Find the mean, median, and mode/modes of each of the following sets of
data:
a. 10, 12, 15, 16, 20, 25
b. 65, 73, 82, 76, 90, 32, 65, 70
c. 33, 45, 56, 39, 38, 33, 45, 54, 39, 32
d. 103, 234, 156, 365, 234, 268, 333, 103, 256, 365
e. 18, 24, 25, 16, 35, 21, 24, 33, 34, 25, 45, 33, 28, 17, 18, 16, 21, 45
2. The final grades of a student in six subjects where he was enrolled are shown
below. Find his/her grade point average.
3. Consider the following distribution below. Find the mean, median, and mode.
Class Interval f
80 - 89 8
70 - 79 15
60 - 69 29
50 - 59 45
40 - 49 39
30 - 39 31
20 - 29 19
10 - 19 9
Class Interval f
94- 99 2
88- 93 7
82 - 87 19
76- 81 8
70- 75 10
64- 69 28
58- 63 37
52- 57 19
46- 51 8
40- 45 2
5. Find the range, average deviation, variance, and standard deviation of the
following sets of data:
a. 23, 21, 18, 17, 19, 21, 20, 18, 19, 24
b. 70, 65, 69, 73, 90, 87, 81, 89.
c. 24, 27, 32, 29, 31, 35, 27, 32, 23, 25, 30, 24.
6. The salaries of all the 130 employees of a company are tabulated in a
frequency distribution, as shown on the next page:
8. The table below gives the age distribution of 100 individuals living in the vicinity
of Escolta.
Age Frequency
55 - 59 2
50 - 54 5
45 - 49 10
40 - 44 12
35 - 39 15
30 - 34 16
25 - 29 13
20 - 24 10
15 - 19 4
10 - 14 4
9. Find the area under the normal curve which lies:
a. Between z = -0.63 and z = 0.63
b. Between z = 0 and z = -1.25
c. To the right of z = -1.75
d. To the left of z = -1.30
e. To the left of z = 1.05
10. In a given distribution, the mean is 65 and the standard deviation is 6. Find the
corresponding standard score of:
a. 77
b. 47
11. For a certain type of computer, the length of time between charges of the
battery is normally distributed with a mean of 50 hours and a standard
deviation of 15 hours. John owns one of these computers and wants to know
the probability that the length of time will be between 50 and 70 hours.