Module 345stat Roxas1stsem23 24

Module 3

Measures of Central Tendency

This chapter is concerned with another way of describing numerical data. In order to
transform the set of data into a meaningful form, the raw data must be organized into a frequency
distribution table and presented textually and graphically. The measure of central tendency is a
single number, which represents the general level of performance of the group.
The Central Tendency
The central tendency is the center or concentration of scores in the set of gathered data. It
is a single value that represents the set of data.
The three measures of central tendency are the mean, the median and the mode.
The Mean – the arithmetic mean or average is the sum of the values in the group of data
divided by the number of values. The mean is the most reliable central tendency measure.

The Mean of Ungrouped Data

The Population Mean is used when a study involves all the persons, animals, objects or
things in the group of interest. Raw data with fewer than 30 values are treated as ungrouped data.
Example: The researcher wants to find out the study habits of the female basketball players in the
university. Even though there are only 12 players, they comprise the population.

Population Mean = (Sum of all the values in the population) / (Number of values in the population)

$$\mu = \frac{\sum x}{n}$$
where:
μ = population mean
n = total number of observations in the population
x = a particular value
Σx = sum of population values of x

Sample Mean
The sample mean is the sum of all the sample values divided by the total number
of values in the sample.

Sample mean = (Sum of all the values in the sample) / (Number of values in the sample)

$$\bar{x} = \frac{\sum x}{n}$$
Where: x̄ = sample mean
Σx = sum of sample values
n = total number of sample values
Example:

Consider the following data on the performance in Mathematics and Science of the fourth
year high school students in a certain public school.
Student   Performance in Mathematics   Performance in Science
1 28 23
2 23 17
3 30 27
4 21 15
5 16 18
6 26 25
7 17 20
8 19 18
9 23 20
10 16 17
11 23 24
12 20 15
1. What is the arithmetic mean performance in Mathematics?
2. What is the arithmetic mean performance in Science?
3. Is the data considered as a population parameter? Why?

Solution:
1. Arithmetic mean of performance in Mathematics.
$$\bar{x} = \frac{\sum x}{n} = \frac{262}{12} = 21.83$$
2. Arithmetic mean of performance in Science.
$$\bar{x} = \frac{\sum x}{n} = \frac{239}{12} = 19.92$$

3. No, because we are only considering 12 fourth year high school students, and they represent a
sample of the population. Many public schools have more than 12 graduating students.
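The two means in the solution can be checked with a short Python sketch. The scores are copied from the table above; the function and variable names are illustrative only and are not part of the module.

```python
# Scores copied from the table above (12 students).
math_scores = [28, 23, 30, 21, 16, 26, 17, 19, 23, 16, 23, 20]
science_scores = [23, 17, 27, 15, 18, 25, 20, 18, 20, 17, 24, 15]

def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    return sum(values) / len(values)

mean_math = arithmetic_mean(math_scores)
mean_science = arithmetic_mean(science_scores)
print(round(mean_math, 2), round(mean_science, 2))  # 21.83 19.92
```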
Weighted Mean
Each value is weighted according to its importance.

$$\bar{X}_w = \frac{x_1 w_1 + x_2 w_2 + x_3 w_3 + \cdots + x_n w_n}{w_1 + w_2 + w_3 + \cdots + w_n}$$
Example:

The Mc Bee Fast Food chain pays its employees on an hourly basis, at rates of P15.00,
P16.50 and P18.50 an hour. Fifteen employees earn P15.00 an hour, 22 are paid P16.50 an hour and
7 are paid P18.50 an hour. Find the weighted mean.

Solution:

To compute for the weighted mean, find the sum of the products of 15 by P15.00, 22 by
P16.50 and 7 by P18.50, and divide by the sum of 15, 22 and 7. The resulting average is called
the weighted mean.

$$\bar{X}_w = \frac{15(15.00) + 22(16.50) + 7(18.50)}{15 + 22 + 7} = \frac{225.00 + 363.00 + 129.50}{44} = \frac{717.50}{44} = \text{Php } 16.31$$
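As a quick check of the computation above, here is a minimal Python sketch of the weighted mean. The hourly rates are the values and the numbers of employees are the weights; the names are hypothetical.

```python
# Hourly rates (values) and number of employees paid each rate (weights).
rates = [15.00, 16.50, 18.50]
employees = [15, 22, 7]

def weighted_mean(values, weights):
    """Sum of value-times-weight products divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)

wm = weighted_mean(rates, employees)
print(round(wm, 2))  # 16.31
```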

Median

The median is the middlemost value after the values have been ranked from the lowest to the
highest, or vice versa.
Ranking is the process of arranging the data from the highest to the lowest or vice versa
based on certain criteria. Such criteria can be ordered in terms of quantity, quality, appraised
value and chronology.

Steps in ranking:
1. Arrange the data to be ranked in descending or ascending order.
2. Assign consecutive numbers to the items, starting from the highest or from the lowest.
3. An item occurring once takes its consecutive number as its rank.
4. Items occurring two or more times share one rank: add their consecutive
numbers and divide by the number of tied items.
Prices of denim jeans, highest to lowest     Prices of denim jeans, lowest to highest
Number   Data     Rank                       Number   Data     Rank
1        850.00   1                          1        550.00   1
2        750.00   2                          2        600.00   2.5
3        725.00   3                          3        600.00   2.5
4        675.00   4                          4        675.00   4
5        600.00   5.5                        5        725.00   5
6        600.00   5.5                        6        750.00   6
7        550.00   7                          7        850.00   7

What is the middle most price of denim jeans? (P 675.00)


What is the median of the prices? (the median = P 675.00)
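The ranking steps and the median above can be sketched in Python. The helper `average_ranks` is a hypothetical name that applies step 4 (tied items share the average of their consecutive numbers); `statistics.median` is the standard-library median.

```python
from statistics import median

prices = [850.00, 750.00, 725.00, 675.00, 600.00, 600.00, 550.00]

def average_ranks(values, descending=True):
    """Map each distinct value to its rank; ties share the average position."""
    ordered = sorted(values, reverse=descending)
    positions = {}
    for number, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(number)
    return {value: sum(p) / len(p) for value, p in positions.items()}

ranks_high_to_low = average_ranks(prices, descending=True)
ranks_low_to_high = average_ranks(prices, descending=False)
print(ranks_high_to_low[600.00], ranks_low_to_high[600.00])  # 5.5 2.5
print(median(prices))  # 675.0
```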

Mode

The mode is the value or score that occurs most frequently among the observations in the
study. When two values tie for the highest frequency, the distribution is called bimodal; when
three values tie, it is trimodal, and so on.

Example: The following data are the annual salaries of the office secretary in private companies:

36,000    48,000    60,000    30,000
75,000    54,000    33,000    54,000
108,000   80,000    45,000    60,000
54,000    30,000    54,000    39,000
72,000    54,000    48,000    42,000

What is the modal annual salary?

The annual salary of P54,000 appears 5 times, more often than any other
salary. It is the mode of the annual salaries stated.
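The count behind this answer can be reproduced with a short Python sketch using `collections.Counter`. Collecting every value that reaches the highest count also covers the bimodal and trimodal cases; the variable names are illustrative.

```python
from collections import Counter

salaries = [36000, 48000, 60000, 30000,
            75000, 54000, 33000, 54000,
            108000, 80000, 45000, 60000,
            54000, 30000, 54000, 39000,
            72000, 54000, 48000, 42000]

counts = Counter(salaries)
highest = max(counts.values())
# Every value that reaches the highest frequency is a mode.
modes = sorted(value for value, count in counts.items() if count == highest)
print(modes, highest)  # [54000] 5
```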
Exercises 3a:

Name______________________________________________ Score_____________
Course/year________________________________________ Date_____________

Answer the following

1. All the students in Integral Calculus are considered in the study. Their grades are 2.25,
1.75, 3.0, 2.50, 2.75, 2.50, 2.0, 3.0.

a. State the formula for the population mean.


b. What is the median grade?
c. Compute the mean grade.
d. Find the mode.
e. Is the mean you are computing an estimate or a parameter? Why?

2. There are six persons employed in an insurance company. The reported transactions for
the month (rounded to the nearest hundreds) are as follows: P35,000, P50,000, 75,000,
28,000, 19,000, and 42,000.

a. Compute the mean


b. What is the median?
c. What is the mode?
d. Is the data a sample or a population? Why?

3. The daily salary of the workers in the construction company are P120, 150, 175, 130, 150,
200, 280, 225, 160, 200, 175, 120.

a. Find the mean salary of the workers.


b. What is the median salary?
c. What is the mode salary?

4. The grades of Peter in his subjects at the end of the school year are as follows: English – 88,
Filipino – 90, Mathematics – 89, Science – 85, T.H.E. – 85, Social Studies – 89, Values Education
– 88, P.E.H.M. – 87, and R.H.G.P. – 88. The units per subject are: one (1) unit each for
English, Filipino, Mathematics, Social Studies, Values Education, and P.E.H.M.; two (2) units each for
Science and T.H.E.; and two (2) units for R.H.G.P.

a. What is the General Average (mean) of Peter?

Grouped Data

The Mean
To determine the mean of interval and ratio data organized into a
frequency distribution, use the summation of the products of the frequency and the midpoint of
each class, or the frequency midpoint method.
The frequency midpoint method
Formula:
$$\bar{x} = \frac{\sum fm}{n}$$
Where:
x̄ = the arithmetic mean
m = the midpoint of each class
f = the frequency of each class
n = the total number of frequencies

Using the same data in table 5

Class interval   Frequency (f)   Midpoint (m)   Frequency × Midpoint (fm)
95 – 99 2 97 194
90 – 94 3 92 276
85 – 89 6 87 522
80 – 84 8 82 656
75 – 79 10 77 770
70 – 74 14 72 1008
65 – 69 12 67 804
60 – 64 7 62 434
55 – 59 6 57 342
50 – 54 4 52 208
45 – 49 3 47 141
n = 75 Σfm = 5355

In the first row, the product 194 (fm) is obtained by multiplying 2 (f) by 97 (m). The same is
done for every row until the last row is completed.

$$\bar{x} = \frac{\sum fm}{n} = \frac{5355}{75} = 71.4$$
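The frequency midpoint method can be sketched in Python as follows. Each class is entered as a hypothetical (lower limit, upper limit, frequency) triple taken from the table above, and the midpoint is computed as the average of the two limits.

```python
# (lower limit, upper limit, frequency) for each class in the table above.
classes = [(95, 99, 2), (90, 94, 3), (85, 89, 6), (80, 84, 8),
           (75, 79, 10), (70, 74, 14), (65, 69, 12), (60, 64, 7),
           (55, 59, 6), (50, 54, 4), (45, 49, 3)]

n = sum(f for _, _, f in classes)
# Midpoint m = (lower + upper) / 2; accumulate f * m over all classes.
sum_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
grouped_mean = sum_fm / n
print(n, sum_fm, round(grouped_mean, 2))  # 75 5355.0 71.4
```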
Grouped Data

Median
The median is the point where half of the values lie above it and the other half
lie below it. When the raw data have been organized into a frequency distribution, the
exact median cannot be determined. However, the median can be estimated by interpolating
within the median class, the class that contains the N/2-th value.

Formula:
$$Md = L_l + \left[\frac{N/2 - Cf}{F}\right] I$$
Where:
Md = the median
Ll = the exact lower limit of the median class
Cf = the cumulative frequency of the class just below the median class
F = the frequency of the median class
I = the class interval (class width)
N = the total number of frequencies

Class interval   Frequency (f)   Midpoint (m)   Cumulative Frequency (CF<)   Cumulative Frequency (CF>)
95 – 99   2    97   75   2
90 – 94   3    92   73   5
85 – 89   6    87   70   11
80 – 84   8    82   64   19
75 – 79   10   77   56   29
70 – 74   14   72   46   43
65 – 69   12   67   32   55
60 – 64   7    62   20   62
55 – 59   6    57   13   68
50 – 54   4    52   7    72
45 – 49   3    47   3    75
n = 75
Substituting the following values in the formula:
Ll = (69 + 70)/2 = 139/2 = 69.5
Cf< = 32
F = 14
I = 5
N = 75

$$Md = L_l + \left[\frac{N/2 - Cf}{F}\right] I$$
$$Md = 69.5 + \left[\frac{75/2 - 32}{14}\right] 5$$
$$Md = 69.5 + \left[\frac{37.5 - 32}{14}\right] 5$$
$$Md = 69.5 + \frac{(5.5)(5)}{14}$$
$$Md = 69.5 + \frac{27.5}{14}$$
$$Md = 69.5 + 1.96$$
$$Md = 71.46$$
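The substitution above can be reproduced with a small Python sketch. It assumes whole-number class limits, so the exact lower limit is taken as the stated lower limit minus 0.5; the function name `grouped_median` is hypothetical.

```python
# (lower limit, upper limit, frequency), lowest class first.
classes = [(45, 49, 3), (50, 54, 4), (55, 59, 6), (60, 64, 7),
           (65, 69, 12), (70, 74, 14), (75, 79, 10), (80, 84, 8),
           (85, 89, 6), (90, 94, 3), (95, 99, 2)]

def grouped_median(classes):
    """Md = Ll + ((N/2 - Cf) / F) * I, with Cf the CF< below the median class."""
    n = sum(f for _, _, f in classes)
    width = classes[1][0] - classes[0][0]   # class interval I
    cf_below = 0
    for lower, _, f in classes:
        if cf_below + f >= n / 2:           # this is the median class
            exact_lower = lower - 0.5       # assumption: integer class limits
            return exact_lower + (n / 2 - cf_below) / f * width
        cf_below += f

md = grouped_median(classes)
print(round(md, 2))  # 71.46
```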

The Mode

Recall that the mode is the most frequent score or value. For data grouped into a
frequency distribution, the mode can be taken as the midpoint of the class with the largest frequency.
In the aforementioned problem, the largest class frequency is located in the class interval
69.5 – 74.5, with a frequency of 14. The midpoint is 72. Therefore, the mode is 72.
When two values occur most frequently with the same frequency, there
are two modes, and the distribution is called bimodal.
Exercises 3b
Name____________________________________________________ Score_______
Course/year______________________________________________ Date ________

1. Determine the mean, the median and the mode of the following daily wages of sales clerks in
the following frequency distribution.
Daily wages Number of workers
130 - 149 5
150 - 169 26
170 - 189 30
190 - 209 42
210 - 229 20
230 – 249 8
250 – 269 4

2. The new management of a radio station changed its format from drama to commentaries. A
recent sample of radio listeners revealed the following age distribution. Find the mean, the
median and the mode age of listeners.
Age Frequency
10-19 15
20-29 29
30-39 38
40-49 62
50-59 40
60-70 19

3. There are 464 students in second year high school. The following is the frequency distribution
of their scores in the English test.
Class Interval Frequency
28-30 5
25-27 8
22-24 10
19-21 20
16-18 17
13-15 20
10-12 18
7-9 12
4-6 5
1-3 1
N=
a. What is the mean score in English?
b. Determine the median score.
c. What is the modal score?
d. Is this a sample estimate or a population parameter?
4. In the class of 30 students, the following are their scores in mathematics
90 87 89 85 88 74 59 82 92 87 79 76 83 71 85
81 69 82 75 70 73 76 91 83 75 75 88 71 83 90

Prepare a worksheet and compute for the mean, median and the mode. Use 55 as the lower
limit of the lowest step interval and the class interval of 5.

Exercises 3c
Name____________________________________________________ Score_______
Course/year______________________________________________ Date ________
1. The following data on the daily wages of construction worker are presented in the
frequency distribution.
Daily wages Number of workers
120 – 134 2
135 – 149 16
150 – 164 25
165 – 179 39
180 – 194 53
195 – 209 28
210 – 224 12
225 – 239 3
a. Find Q1 and Q3.
b. Find D1 and D9.
c. Find P34 and P69.
2. Given the following frequency distribution. Find the quartile 1, quartile 3, decile 3, decile 8,
percentile 8, and percentile 98.
Classes Frequency
10.5 – 15.5 15
15.5 – 20.5 29
20.5 – 25.5 38
25.5 – 30.5 62
30.5 – 35.5 40
35.5 – 40.5 19
40.5 – 45.5 18
45.5 – 50.5 12
3. There are 464 students in second year high school. The following is the frequency
distribution of their scores in the English test.
Class interval Frequency
28 – 30 5
25 – 27 8
22 – 24 10
19 – 21 20
16 – 18 17
13 – 15 20
10 – 12 18
7–9 12
4–6 5
1–3 1
a. What are P29 and P89?
b. Determine D1 and D7.
c. What are Q1 and Q3?
d. Is this a sample estimate or a population parameter? Give a reason for your answer.

Measures of Location

The measures of location are those points that divide a class frequency distribution of a
variable into a number of equal parts. Generally, the number of parts (4, 10, or 100) is a factor
of 100, that is, it divides 100 exactly. The more common point measures are the quartile, the
decile, and the percentile.

The Quartile
The quartile is a point measure that divides the class frequency distribution of a variable
theoretically into four equal parts. There are two quartiles that need to be computed, the first
quartile and the third quartile. The second quartile is the median.

[Figure: a scale from the lowest to the highest value, divided at ¼ (Quartile 1, Q1), ½ (Quartile 2, the median), and ¾ (Quartile 3, Q3).]

First Quartile
The computation of the first quartile is the same as the computation of the median;
however, only ¼ of the scale is considered instead of ½.
The scores of 110 students in a multiple choice test in Philippine History, grouped into a
class frequency distribution, are shown in Table 5.1.

Table 5.1
Frequency Distribution of the Multiple Choice
Test Scores in History

Class Interval Frequency Midpoint CF<


42.5 – 45.5 3 44 110
39.5 – 42.5 8 41 107
36.5 – 39.5 10 38 99
33.5 – 36.5 12 35 89
30.5 – 33.5 15 32 77
27.5 – 30.5 16 29 62
24.5 – 27.5 14 26 46
21.5 – 24.5 11 23 32
18.5 – 21.5 9 20 21
15.5 – 18.5 7 17 12
12.5 – 15.5 5 14 5
n= 110

Procedure:
1. Use the formula:

$$Q_1 = L_l + \left(\frac{N/4 - Cf}{F}\right) I$$

2. Find the values of the symbols in the formula as follows:

a. N/4 = 110/4 = 27.5. N is divided by 4 because Q1 marks one fourth of the population.
b. Cf = 21, the cumulative frequency, counting from the bottom, that is next lower than N/4 = 27.5.
c. Ll = 21.5, the exact lower limit of the class immediately above that Cf, 21.5 – 24.5, the first quartile class.
d. F = 11, the frequency of the first quartile class.
e. I = 3, the interval, the difference between the lower limits of any two adjacent classes.

3. Substitute the values in the formula and compute.

$$Q_1 = 21.5 + \left(\frac{27.5 - 21}{11}\right) 3 = 21.5 + \frac{(6.5)(3)}{11} = 21.5 + \frac{19.5}{11} = 21.5 + 1.77 = 23.27$$

Third Quartile
Using the same data given in Table 5.1, follow the process used for computing quartile 1 (Q1).
To compute the third quartile (Q3), consider ¾ of the scale from the lowest to the highest values.

Formula:
$$Q_3 = L_l + \left(\frac{3N/4 - Cf}{F}\right) I$$
1. Find the values of the symbols in the formula as follows, starting with ¾N.

a. ¾N = ¾(110) = 82.5, which is ¾ of 110.
b. Cf = 77, the cumulative frequency that is next lower than 82.5.
c. Ll = 33.5, the exact lower limit of the class immediately above that Cf, which contains Q3, the third quartile.
d. F = 12, the frequency of the third quartile class.
e. I = 3, the class interval.

2. Substitute the values in the formula and compute.

$$Q_3 = 33.5 + \left(\frac{82.5 - 77}{12}\right) 3 = 33.5 + \frac{(5.5)(3)}{12} = 33.5 + 1.375 = 34.875$$

The Decile
The term deci means ten and the decile is a point measure that divides the class frequency
distribution of a variable into ten equal parts.

[Figure: a scale from the lowest to the highest value, divided by the nine deciles D1 to D9, with the median at D5.]
The formula:
$$D_d = L_l + \left(\frac{dN/10 - Cf}{F}\right) I$$
Note that D_d stands for the d-th decile, as in D1, D2, D3, …, D9.

The process for computing the decile, illustrated with D3 for the data in Table 5.1:

1. 3N/10 = (3)(110)/10 = 330/10 = 33
2. Cf = 32, the cumulative frequency that is next lower than 33.
3. Ll = 24.5, the exact lower limit of the class next higher than the Cf of 32, 24.5 – 27.5.
4. F = 14, the frequency of the decile 3 class, 24.5 – 27.5.
5. I = 3, the interval, the difference between the lower limits of any two adjacent classes.
6. Compute D3 by substituting the values in the formula.

$$D_3 = 24.5 + \left(\frac{33 - 32}{14}\right) 3 = 24.5 + \frac{(1.0)(3)}{14} = 24.5 + \frac{3}{14} = 24.5 + 0.21 = 24.71$$

The Percentile
Percentiles are points that divide the distribution theoretically into 100 equal parts.
Hence, percentile means one-hundredth.

[Figure: a scale from the lowest to the highest value, marked with the deciles D1 to D9 and the median, with the percentiles P45 and P84 located on it.]

Computation of Percentile
The formula:
$$P_r = L_l + \left(\frac{rN/100 - Cf}{F}\right) I$$

Note: r stands for the percentile rank to be computed. So if percentile rank 45 is to be computed, Pr should be P45; if
percentile rank 84, Pr should be P84, etc.

Table 5.1
Frequency Distribution of the Multiple Choice
Test Scores in History

Class Interval Frequency Midpoint CF<


42.5 – 45.5 3 44 110
39.5 – 42.5 8 41 107
36.5 – 39.5 10 38 99
33.5 – 36.5 12 35 89
30.5 – 33.5 15 32 77
27.5 – 30.5 16 29 62
24.5 – 27.5 14 26 46
21.5 – 24.5 11 23 32
18.5 – 21.5 9 20 21
15.5 – 18.5 7 17 12
12.5 – 15.5 5 14 5
n= 110

If P45 is to be computed, use the formula:

$$P_{45} = L_l + \left(\frac{45N/100 - Cf}{F}\right) I$$
1. Find the values of the symbols in the formula, starting with 45N/100, as follows:

a. 45N/100 = (45)(110)/100 = 4950/100 = 49.5
b. Cf = 46, the cumulative frequency that is next lower than 49.5.
c. Ll = 27.5, the exact lower limit of the class immediately higher than the Cf of 46.
d. F = 16, the frequency of the 45th percentile class, 27.5 – 30.5.
e. I = 3, the difference between the lower limits of any two adjacent classes.

2. Substitute the values computed for their respective symbols in the formula and solve.

$$P_{45} = 27.5 + \left(\frac{49.5 - 46}{16}\right) 3 = 27.5 + \frac{(3.50)(3)}{16} = 27.5 + \frac{10.5}{16} = 27.5 + 0.656 = 28.156$$
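The quartile, decile and percentile formulas share one pattern: locate the class in which the target count kN/parts falls, then interpolate inside it. The following Python sketch, with the hypothetical helper `grouped_quantile`, applies that pattern to Table 5.1. It is an illustration of the formulas above under that reading, not a prescribed procedure.

```python
# Table 5.1 as (exact lower limit, frequency), lowest class first.
table_5_1 = [(12.5, 5), (15.5, 7), (18.5, 9), (21.5, 11), (24.5, 14),
             (27.5, 16), (30.5, 15), (33.5, 12), (36.5, 10), (39.5, 8),
             (42.5, 3)]
WIDTH = 3  # class interval I

def grouped_quantile(table, width, k, parts):
    """Ll + ((k*N/parts - Cf) / F) * I for the k-th of `parts` point measures."""
    n = sum(f for _, f in table)
    target = k * n / parts
    cf_below = 0
    for lower, f in table:
        if cf_below + f >= target:      # class that contains the target count
            return lower + (target - cf_below) / f * width
        cf_below += f
    raise ValueError("k must not exceed parts")

q1 = grouped_quantile(table_5_1, WIDTH, 1, 4)
q3 = grouped_quantile(table_5_1, WIDTH, 3, 4)
d3 = grouped_quantile(table_5_1, WIDTH, 3, 10)
p45 = grouped_quantile(table_5_1, WIDTH, 45, 100)
print(round(q1, 2), round(q3, 3), round(d3, 2), round(p45, 3))
```

The four results agree with the hand computations for Q1, Q3, D3 and P45 when rounded the same way.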

Exercise 4a
Name____________________________________________ Score____________
Course/year________________________________________Date_____________

1. The following data on the daily wages of construction worker are presented in the
frequency distribution.

Daily wages Number of workers


120-134 2
135-149 16
150-164 25
165-179 39
180-194 53
195-209 28
210-224 12
225-239 3

Find the following: Q1 & Q3, D1 & D9, P34 & P69 .

2. Given the following frequency distribution. Find the quartile 1, quartile 3, decile 3, decile
8, percentile 8, and percentile 98.

Classes Frequency
10.5-15.5 15
15.5-20.5 29
20.5-25.5 38
25.5-30.5 62
30.5-35.5 40
35.5-40.5 19
40.5-45.5 18
45.5-50.5 12
Measures of Variability

This chapter will present several measures that describe the dispersion, variability, or the
spread of the data. Discussed in this chapter are the range, quartile deviation, percentile range,
mean absolute deviation and standard deviation, the box plot, coefficient of variation, and
skewness.
A measure of variability, or measure of variation, is a method of measuring the degree to
which quantitative data tend to spread out from the point of central tendency or cluster about
the mean. It is also called a measure of dispersion.
The most common measures of variation are the following:
1. The Range
2. The Quartile Deviation
3. The Percentile range
4. The Average Deviation
5. The Standard Deviation

1. The Range
A. Ungrouped Data

The simplest measure of variability is the range. In ungrouped data, it is the
difference between the highest and the lowest values, that is, the highest score minus the
lowest score:

R = H – L
Where:
R = the range
H = the highest data
L = the lowest data

Find the range of the following set of numbers:

18, 25, 12, 15, 9, 20, 16, 12, 23, 18, 20, 14

R = H – L
R = 25 – 9
R = 16

The Range
B. Grouped Data
The range for grouped data is the difference between the upper boundary of the highest
class (UBh) and the lower boundary of the lowest class (LBl):

R = UBh – LBl
Find the range for the following grouped data.
Table 5.1
Frequency Distribution of a Multiple Choice
Test Scores in History
Class Interval Frequency Midpoint CF<
42.5 – 45.5 3 44 110
39.5 – 42.5 8 41 107
36.5 – 39.5 10 38 99
33.5 – 36.5 12 35 89
30.5 – 33.5 15 32 77
27.5 – 30.5 16 29 62
24.5 – 27.5 14 26 46
21.5 – 24.5 11 23 32
18.5 – 21.5 9 20 21
15.5 – 18.5 7 17 12
12.5 – 15.5 5 14 5

N= 110

Solution:
R = UBh – LBl
R = 45.5 – 12.5
R = 33

2. The Quartile Deviation


The quartile deviation measures the amount of dispersion in the middle 50 percent of the data.
It is also called the semi-interquartile range. The formula follows:

$$Q.D. = \frac{Q_3 - Q_1}{2}$$
Where:
Q.D. = the quartile deviation
Q1 = the quartile 1
Q3 = the quartile 3

Quartile 1 (Q1) separates the lower one-fourth of the data from the rest of the
distribution, while quartile 3 (Q3) separates the upper one-fourth. To compute Q1,
use the following formula:

$$Q_1 = L_l + \left[\frac{N/4 - Cf}{F}\right] I$$
The formula for quartile 3 (Q3) is as follows:
$$Q_3 = L_l + \left[\frac{3N/4 - Cf}{F}\right] I$$
Find the quartile deviation (Q.D.) of the following scores in the arithmetic computation test of 100
students, as shown in the frequency distribution table below.

Table 6.1
Frequency Distribution of Arithmetic Computation Test

LL – HL       f    M    CF<
94.5 – 99.5   1    97   100
89.5 – 94.5   3    92   99
84.5 – 89.5   8    87   96
79.5 – 84.5   11   82   88
74.5 – 79.5   13   77   77
69.5 – 74.5   19   72   64
64.5 – 69.5   17   67   45
59.5 – 64.5   12   62   28
54.5 – 59.5   9    57   16
49.5 – 54.5   5    52   7
44.5 – 49.5   2    47   2
N = 100

First quartile class: 59.5 – 64.5 (lower limit of Q1 = 59.5, frequency of Q1 = 12, Cf of Q1 = 16).
Third quartile class: 74.5 – 79.5 (lower limit of Q3 = 74.5, frequency of Q3 = 13, Cf of Q3 = 64).

Solution:
Computation of Quartile 1
N/4 = 100/4 = 25

$$Q_1 = L_l + \left[\frac{N/4 - Cf}{F}\right] I = 59.5 + \left[\frac{25 - 16}{12}\right] 5 = 59.5 + \frac{(9)(5)}{12} = 63.25$$

Computation of Quartile 3
3N/4 = 3(100)/4 = 75

$$Q_3 = L_l + \left[\frac{3N/4 - Cf}{F}\right] I = 74.5 + \left[\frac{75 - 64}{13}\right] 5 = 74.5 + \frac{(11)(5)}{13} = 78.73$$

Quartile Deviation

$$Q.D. = \frac{78.73 - 63.25}{2} = \frac{15.48}{2} = 7.74$$

3. Percentile Range
The percentile range is the distance between the 10th percentile and the 90th percentile.
The symbol P10 is used to represent the 10th percentile and P90 the 90th percentile.

The formulas for the two percentiles are:

$$P_{10} = L_l + \left(\frac{10N/100 - Cf}{F}\right) I$$
$$P_{90} = L_l + \left(\frac{90N/100 - Cf}{F}\right) I$$

Using the same data in Table 6.1:

$$P_{10} = 54.5 + \left(\frac{10(100)/100 - 7}{9}\right) 5 = 54.5 + \left(\frac{10 - 7}{9}\right) 5 = 54.5 + \frac{(3)(5)}{9} = 54.5 + 1.67 = 56.17$$

$$P_{90} = 84.5 + \left(\frac{90(100)/100 - 88}{8}\right) 5 = 84.5 + \left(\frac{90 - 88}{8}\right) 5 = 84.5 + \frac{(2)(5)}{8} = 84.5 + 1.25 = 85.75$$

Percentile Range = P90 – P10
= 85.75 – 56.17
= 29.58
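The quartile deviation and the percentile range for Table 6.1 can be checked together. This Python sketch builds the CF< column with `itertools.accumulate` and finds each class with `bisect_left`; the helper name `point_measure` is hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate

# Table 6.1, lowest class first.
lower_limits = [44.5, 49.5, 54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5, 89.5, 94.5]
frequencies = [2, 5, 9, 12, 17, 19, 13, 11, 8, 3, 1]
WIDTH = 5
cumulative = list(accumulate(frequencies))   # the CF< column
n = cumulative[-1]

def point_measure(k, parts):
    """Ll + ((k*N/parts - Cf) / F) * I, e.g. (1, 4) for Q1 or (90, 100) for P90."""
    target = k * n / parts
    i = bisect_left(cumulative, target)      # first class whose CF< reaches target
    cf_below = cumulative[i - 1] if i > 0 else 0
    return lower_limits[i] + (target - cf_below) / frequencies[i] * WIDTH

q1, q3 = point_measure(1, 4), point_measure(3, 4)
p10, p90 = point_measure(10, 100), point_measure(90, 100)
quartile_deviation = (q3 - q1) / 2
percentile_range = p90 - p10
print(round(quartile_deviation, 2), round(percentile_range, 2))  # 7.74 29.58
```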

4. Average Deviation (AD) or Mean Absolute Deviation (MAD)

A. Ungrouped Data
The average deviation is considered more appropriate than the quartile
deviation because it takes into account all the individual values of the distribution. The mean
absolute deviation (MAD) measures how far, on average, the individual values in the distribution
deviate from the mean of that distribution.
The formula:

$$MAD = \frac{\sum |X - \bar{x}|}{n}$$

To compute the mean absolute deviation for ungrouped data, first arrange the values in a
column and find the mean (x̄). Determine the deviation of each raw score from the
mean (X – x̄). Take the absolute value of each deviation, |X – x̄|. Finally, get the sum
of the absolute deviations and divide the sum by the total number of values (n).

Mean Absolute Deviation (MAD)

B. Grouped Data
When the data are presented in a frequency distribution, first compute the mean of the
distribution and find the absolute deviation of each midpoint from the mean. Multiply each
absolute deviation by the corresponding class frequency. Finally, divide the sum of the products
by the total number of observations.

The formula:

$$MAD = \frac{\sum f\,|M - \bar{x}|}{n}$$

Where:
MAD = mean absolute deviation
f = class frequency
M = class mark (midpoint)
x̄ = sample mean
n = total class frequency

LL – HL       f    M    fM     |M – x̄|   f|M – x̄|
94.5 – 99.5   1    97   97     26.1      26.1
89.5 – 94.5   3    92   276    21.1      63.3
84.5 – 89.5   8    87   696    16.1      128.8
79.5 – 84.5   11   82   902    11.1      122.1
74.5 – 79.5   13   77   1001   6.1       79.3
69.5 – 74.5   19   72   1368   1.1       20.9
64.5 – 69.5   17   67   1139   3.9       66.3
59.5 – 64.5   12   62   744    8.9       106.8
54.5 – 59.5   9    57   513    13.9      125.1
49.5 – 54.5   5    52   260    18.9      94.5
44.5 – 49.5   2    47   94     23.9      47.8
N = 100            ΣfM = 7090            Σf|M – x̄| = 881.0

Solutions:

$$\bar{x} = \frac{\sum fM}{n} = \frac{7090}{100} = 70.9$$
$$MAD = \frac{\sum f\,|M - \bar{x}|}{n} = \frac{881.00}{100} = 8.81$$
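The grouped mean absolute deviation can be verified with a few lines of Python. The midpoints and frequencies are copied from the table above; the variable names are illustrative.

```python
# Class marks (M) and frequencies (f) from the table above, highest class first.
midpoints = [97, 92, 87, 82, 77, 72, 67, 62, 57, 52, 47]
frequencies = [1, 3, 8, 11, 13, 19, 17, 12, 9, 5, 2]

n = sum(frequencies)
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
# Weight each absolute deviation |M - mean| by its class frequency.
mad = sum(f * abs(m - mean) for f, m in zip(frequencies, midpoints)) / n
print(round(mean, 1), round(mad, 2))  # 70.9 8.81
```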
The Standard Deviation (Sd)

The standard deviation is the most stable and is considered the most important measure of
variability. Compared to the mean absolute deviation, it is relatively easier to handle
mathematically.

To solve for the standard deviation, first find the sample mean (x̄). Then find the deviation
of each X value from the mean (X – x̄). Square each deviation, that is, (X – x̄)². Find the total of the
squared deviations, that is, Σ(X – x̄)². Divide the sum by the total number of observed values.
Extract the square root of the quotient. The resulting root is the value of the standard deviation.
Formula:

$$Sd = \sqrt{\frac{\sum (X - \bar{x})^2}{n}} \qquad \sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$$

Where:

Sd = standard deviation
X = observed value
x̄ = sample mean
n = total number of observed values

Ungrouped Data

Find the standard deviation of the grades of sample students in high school physics

Grades
X     (X – x̄)   (X – x̄)²
75    -6        36
77    -4        16
78    -3        9
79    -2        4
80    -1        1
81    0         0
83    2         4
84    3         9
85    4         16
88    7         49

ΣX = 810        Σ(X – x̄)² = 144

Solutions:

$$\bar{x} = \frac{\sum X}{n} = \frac{810}{10} = 81 \text{ (the mean)}$$

$$Sd = \sqrt{\frac{\sum (X - \bar{x})^2}{n}} = \sqrt{\frac{144}{10}} = \sqrt{14.4} = 3.79$$
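The steps above translate directly into Python. This sketch divides by n, as the formula in this section does, and compares the result with the standard library's `statistics.pstdev`, which uses the same divisor.

```python
from math import sqrt
from statistics import pstdev

grades = [75, 77, 78, 79, 80, 81, 83, 84, 85, 88]

mean = sum(grades) / len(grades)
sum_squared_dev = sum((x - mean) ** 2 for x in grades)   # sum of (X - mean)^2
sd = sqrt(sum_squared_dev / len(grades))                 # divide by n, then root
print(mean, sum_squared_dev, round(sd, 2))  # 81.0 144.0 3.79
```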

Grouped Data

Computation of the SD from Grouped Data by the Midpoint Method

The formula for computing the SD from grouped data using the midpoint method is as
follows:

$$Sd = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}}$$
Where:
Sd = standard deviation
M = the midpoint of a class
f = the class frequency
n = the total number of sample observations

Data given from Table 6.1:

LL – HL       f    M    fM     fM²
94.5 – 99.5   1    97   97     9409
89.5 – 94.5   3    92   276    25392
84.5 – 89.5   8    87   696    60552
79.5 – 84.5   11   82   902    73964
74.5 – 79.5   13   77   1001   77077
69.5 – 74.5   19   72   1368   98496
64.5 – 69.5   17   67   1139   76313
59.5 – 64.5   12   62   744    46128
54.5 – 59.5   9    57   513    29241
49.5 – 54.5   5    52   260    13520
44.5 – 49.5   2    47   94     4418
N = 100            ΣfM = 7090   ΣfM² = 514510

B. Procedure:

1. Find the midpoint of the classes.
2. Multiply the midpoints by their respective frequencies to find fM (f × M = fM). See
columns f, M, and fM.
3. Sum up the fM values to find ΣfM.
4. Multiply each fM by its midpoint M to find fM², and sum these values to find ΣfM².
5. n = 100. This is found by adding the frequencies.
6. Substitute the symbols with their respective values in the formula.

$$Sd = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}} = \sqrt{\frac{514510 - \frac{(7090)^2}{100}}{100 - 1}} = \sqrt{\frac{11829}{99}} = \sqrt{119.48} = 10.93$$
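The midpoint method can be reproduced in Python from the f and M columns alone. This is a minimal sketch that follows the formula as written in this section, including its n – 1 divisor; the variable names are illustrative.

```python
from math import sqrt

# Class marks (M) and frequencies (f) from Table 6.1, highest class first.
midpoints = [97, 92, 87, 82, 77, 72, 67, 62, 57, 52, 47]
frequencies = [1, 3, 8, 11, 13, 19, 17, 12, 9, 5, 2]

n = sum(frequencies)
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))        # sum of fM
sum_fm2 = sum(f * m * m for f, m in zip(frequencies, midpoints))   # sum of fM^2
variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
sd = sqrt(variance)
print(sum_fm, sum_fm2, round(sd, 2))  # 7090 514510 10.93
```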

The Box plot


The box plot is a graphical display based on quartiles. It enables the researcher to picture
the set of data or observations being considered.

How to construct a Box Plot

Constructing a box plot requires five values: the minimum value, the first quartile (Q1),
the median (Md), the third quartile (Q3), and the maximum value.

The results of a 100 – item mathematics test are shown with the following information. (Results
of computation were rounded off to the nearest whole number):

Lowest score 28
Quartile 1 (Q1) 38
Median (Md) 47
Quartile 3 (Q3) 60
Highest score 95
Steps in constructing a box plot
a. Make an appropriate scale along the horizontal axis.
b. Draw a box that starts at Q1 (38) and ends at Q3 (60).
c. Inside the box, place a vertical line to represent the median (47).
d. Extend horizontal lines from the box out to the minimum score (28) and to the maximum score (95).

[Figure: box plot on a score axis from 20 to 100, marking the minimum score, Q1, the median, Q3, and the maximum score.]
The distance between the ends of the box is 22, that is, the interquartile range. We can
now conclude that 50 percent of the scores lie between 38 and 60.
The box plot also shows that the distribution is positively skewed, because the
line from the right of the box at Q3 (60) to the highest score of 95 is longer than the line from
the left of the box at Q1 (38) to the lowest score of 28.
Another way of showing the skewness of the distribution is that the 25 percent of the data
greater than the third quartile is more spread out than the 25 percent of the data less than the first quartile.
Another indication of positive skewness is that the median is not located at the center of
the box. The distance from the first quartile to the median is shorter than the distance from the
median to the third quartile.

Skewness
Another characteristic that can be measured in a set of data is the skewness of the
distribution. If the frequency distribution is symmetrical, it has no skewness; the skewness is zero
(0). If one or more values in the data are extremely large, the mean of the distribution becomes greater
than the median or mode. Such a distribution is said to be positively skewed.
On the other hand, if one or more extremely small values are present and the mean is the
smallest of the three measures of central tendency, the distribution is said to be negatively
skewed.

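[Figure: Graphical Presentation. Three frequency curves with the median and mode marked: Symmetrical (daily wages, s = P15.50, axis value P175), Positive skewness (ages, s = 4.0, axis values 36, 38, 39), and Negative skewness (teaching experience, s = 3.0, axis values 10, 11, 13).]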

The Coefficient of Skewness, a measure used to describe the degree of skewness,
was developed by Karl Pearson.

The formula:

$$Sk = \frac{3(\bar{x} - Md)}{Sd}$$
Where:
Sk = coefficient of skewness
x̄ = the mean
Md = the median
Sd = the standard deviation

The scores of 75 students in a mathematics test had a mean of 71.4, a median of 71.46
and a mode of 72. The standard deviation was 10.5.
a. Determine if the distribution is symmetrical, positively skewed, or negatively skewed.
b. What is the coefficient of skewness?

Solution:
a. The distribution is approximately symmetrical because there is only a slight difference
among the mean, the median and the mode. The three measures of central tendency lie
almost on the same point.
b. The coefficient of skewness

$$Sk = \frac{3(\bar{x} - Md)}{Sd} = \frac{3(71.4 - 71.46)}{10.5} = \frac{3(-0.06)}{10.5} = \frac{-0.18}{10.5} = -0.017$$

Interpretation: The result of -.017 shows a negligible amount of (negative) skewness. The result is
almost zero. The coefficient of skewness generally lies between -3 and +3.
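The coefficient can be computed with a one-line function. A minimal Python sketch follows, with a hypothetical function name and the values stated in the example.

```python
def pearson_skewness(mean, median, sd):
    """Pearson's coefficient of skewness: Sk = 3 * (mean - median) / Sd."""
    return 3 * (mean - median) / sd

sk = pearson_skewness(mean=71.4, median=71.46, sd=10.5)
print(round(sk, 3))  # -0.017
```

A negative sign means the mean is smaller than the median; a value this close to zero indicates an almost symmetrical distribution.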

Exercise 5a
Name________________________________________________ Score___________
Course/year___________________________________________Date____________

1. The following is the frequency distribution table of multiple choice test scores in History.

Class Interval Frequency Midpoint CF<


42.5 – 45.5 3 44 110
39.5 – 42.5 8 41 107
36.5 – 39.5 10 38 99
33.5 – 36.5 12 35 89
30.5 – 33.5 15 32 77
27.5 – 30.5 16 29 62
24.5 – 27.5 14 26 46
21.5 – 24.5 11 23 32
18.5 – 21.5 9 20 21
15.5 – 18.5 7 17 12
12.5 – 15.5 5 14 5

N = 110

a. Make a box plot of the above data


b. Compute the mean
c. Compute the standard deviation
d. Find the coefficient of variation
e. Find the coefficient of skewness

2. Given are the following data . . .

Classes Frequency Midpoint Cf<


21-27 16
28-34 28
35-41 39
42-48 40
49-55 39
56-62 33
63-69 10
N

a. Compute the quartile 1


b. Compute the quartile 3
c. Make a box plot
d. Compute the percentile range

3. The following grades of the students in Filipino in three sections

Statistical measures Section A Section B Section C

Mean 81.5 80.25 82


Median 83 80.25 80
Mode 82 80 79
Standard deviation 2.75 0.5 3.25
Mean Deviation 2.25 0.3 2.5
Quartile deviation 1.5 0.2 1.9
N 40 36 42

1. Find the coefficient of variation for sections A, B, C.


2. What section has the symmetrical or bell shaped distribution?
3. Determine the skewness of the different sections A, B, and C.
