Chapter 2: Frequency Distribution and Measures of Central Tendency
2.1 A FREQUENCY DISTRIBUTION is a tabular arrangement of data whereby the data are grouped
into different intervals and the number of observations that belong to each interval is then
determined. Data presented in this manner are known as grouped data.
Example 1.
The following data give the result of a sample survey regarding the behavior of 30 students of
BSIE III-2 inside the classroom. The letters A, B and C represent three categories, excellent, fair
and poor, respectively.
A B A A C C A C C C
C B C B B C B B B C
B C C A C C C B C A
Solution:
The categories are the letters. Record these categories in the first column. Then read each
result from the given data and mark a tally, denoted by “I”, in the second column next to the
corresponding category. The tallies are marked in blocks of five for counting convenience.
Lastly, record the total number of tallies for each category in the third column. This column is
called the frequency column.
Category    Tally                   Frequency
A           IIII – I                6
B           IIII – IIII             9
C           IIII – IIII – IIII –    15
The sum of the entries in the frequency column gives the sample size or the total frequency.
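The tallying procedure can be sketched in Python. This is only an illustration, assuming the 30 survey results are typed in exactly as listed in the example; `Counter` stands in for the tally column, and the variable names are chosen here for convenience.

```python
from collections import Counter

# Survey results from the example, one string per row of the listing.
rows = [
    "A B A A C C A C C C",
    "C B C B B C B B B C",
    "B C C A C C C B C A",
]
responses = " ".join(rows).split()

# Counter does the tallying: one running count per category.
frequency = Counter(responses)

for category in sorted(frequency):
    print(category, frequency[category])

# The frequencies add up to the sample size (total frequency).
total = sum(frequency.values())
print("Total", total)
```

The printed total should agree with the number of students surveyed, which is a quick check that no response was skipped or counted twice.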
2.2 Definition
A quantity which varies from one observation to another within a given domain and under a given
condition is called a VARIABLE, while a quantity which does not vary under a given condition is
called a CONSTANT.
2.3 Definition
The actual values of a variable are called VARIATES. Variates which are not organized and are
often recorded in the order observed are called RAW DATA. These data are often called
UNGROUPED DATA; they show no pattern at all, so they are difficult to describe and analyze.
2.4 Definition
The arrangement of data according to magnitude is called an ARRAY. Arrays are very useful in
constructing a frequency distribution because the extreme values can be determined easily, so
that the class width and midpoints can be chosen.
Example 1.
Raw Data:
Table 1
8 12 9 8 12 14
10 11 13 11 9 15
13 10 14 13 12 13
12 8 15 10 11 14
14 9 12 9 13 10
Solution:
Array:
Table 2
8 9 10 12 13 14
8 9 11 12 13 14
8 10 11 12 13 14
9 10 11 12 13 15
9 10 12 13 14 15
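As a rough sketch, the array can be produced by sorting the raw data. The snippet below assumes the 30 values of Table 1 and lays the sorted values out in five rows so that reading down each column gives ascending order, as in Table 2. The names `raw_data`, `array` and `table` are illustrative only.

```python
# Raw data from Table 1, in the order recorded.
raw_data = [
    8, 12, 9, 8, 12, 14,
    10, 11, 13, 11, 9, 15,
    13, 10, 14, 13, 12, 13,
    12, 8, 15, 10, 11, 14,
    14, 9, 12, 9, 13, 10,
]

# An array is simply the data arranged according to magnitude.
array = sorted(raw_data)

# The extreme values are now the first and last entries.
lowest, highest = array[0], array[-1]

# Lay the sorted values out in 5 rows; reading down each column is ascending.
n_rows = 5
table = [array[r::n_rows] for r in range(n_rows)]
for row in table:
    print(*row)
```

Once the data are sorted, the lowest and highest values needed for the range can be read off directly.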
2.5 Definition
The smallest value that can belong to a given interval is called the LOWER CLASS LIMIT, while the
largest value that can belong to the interval is called the UPPER CLASS LIMIT.
2.6 Definition
The difference between the upper class limit and the lower class limit is called the CLASS WIDTH.
2.7 Definition
The midpoint of the gap between the upper limit of one class and the lower limit of the next class
is called a CLASS BOUNDARY, which is also commonly called a TRUE LIMIT.
2.8 Definition
The class size or class width is the difference between the lower and upper class boundaries:
$$c = B_2 - B_1$$
where $B_2$ is the upper boundary and $B_1$ is the lower boundary of the given class. The
class width is often affected by the nature of the data and the number of classes.
2.9 Definition
The point halfway between the class limits or class boundaries is called the CLASS MARK or the
MIDPOINT.
$$x = \frac{L_1 + L_2}{2} \quad \text{or} \quad x = \frac{B_1 + B_2}{2}$$
where $x$ is the midpoint, $L_1$ and $L_2$ are the class limits (lower limit and upper limit,
respectively), and $B_1$ and $B_2$ are the class boundaries.
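The definitions of class boundaries, class width and midpoint can be illustrated with a short sketch. It assumes whole-number data, so the gap between consecutive classes is 1 and each boundary lies 0.5 beyond the class limit. The class 27 to 35 and the helper name `class_summary` are chosen here only for illustration.

```python
def class_summary(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary, class width, midpoint).

    `unit` is the smallest step between recorded values (assumed 1 here),
    so each boundary sits half a unit beyond the class limit.
    """
    half_unit = unit / 2
    lower_boundary = lower_limit - half_unit
    upper_boundary = upper_limit + half_unit
    width = upper_boundary - lower_boundary       # c = B2 - B1
    midpoint = (lower_limit + upper_limit) / 2    # x = (L1 + L2) / 2
    return lower_boundary, upper_boundary, width, midpoint


b1, b2, c, x = class_summary(27, 35)
print(b1, b2, c, x)  # 26.5 35.5 9.0 31.0
```

Computing the midpoint from the boundaries, (26.5 + 35.5) / 2, gives the same value of 31, which is why either form of the formula can be used.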
Steps in constructing a frequency distribution:
1. Decide on the number of classes your frequency table will have. Usually, it is between 5
and 20.
2. Determine the highest and the lowest value in the given set of raw data and determine
the range (RANGE is the difference between the highest score and the lowest score).
3. Find the class width by dividing the range by the desired number of classes. All classes
should have the same size. The class width should be an odd number. This ensures that the
midpoint of each class has the same place value as the data. Class intervals are chosen so
that the class marks or class midpoints coincide with actual observed data.
4. Select a starting point, either the lowest score or a convenient lower class limit. Add the
class width to the starting point to get the second lower class limit, and continue in the
same way for the remaining classes. Then enter the corresponding upper class limits.
5. Find the boundaries by subtracting 0.5 of the unit difference from each lower class limit
and adding 0.5 of the unit difference to the upper class limit.
6. Represent each score by a tally.
7. Count the total frequency for each class to determine the number of observations
within each class. This is called the CLASS FREQUENCY. It is the number of times a
certain observation occurred.
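The steps above can be sketched as a small Python function. This is a minimal illustration rather than a definitive procedure: the data set `scores` is hypothetical, the class width is supplied directly instead of being derived from a desired number of classes, and whole-number data (unit of 1) is assumed.

```python
def build_frequency_table(data, width, start=None, unit=1):
    """Return rows of (lower limit, upper limit, lower boundary,
    upper boundary, class frequency), starting from `start`
    (the lowest value by default)."""
    start = min(data) if start is None else start
    rows = []
    lower = start
    while lower <= max(data):
        upper = lower + width - unit                 # step 4: class limits
        lower_boundary = lower - unit / 2            # step 5: boundaries
        upper_boundary = upper + unit / 2
        # steps 6-7: count the observations falling in this class
        frequency = sum(lower <= value <= upper for value in data)
        rows.append((lower, upper, lower_boundary, upper_boundary, frequency))
        lower += width
    return rows


# Hypothetical scores, used only to exercise the function.
scores = [12, 15, 21, 18, 25, 30, 14, 22, 27, 19, 16, 24]
rows = build_frequency_table(scores, width=5)
for row in rows:
    print(row)
```

With a width of 5 the classes are 12 to 16, 17 to 21, 22 to 26 and 27 to 31, and the class frequencies add up to the number of scores.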
Example 1
Given below is the set of scores in a unit test in statistics of 50 students of BSA III-1.
Construct a frequency distribution.
53 82 43 49 69 90
48 64 71 69 51 88
73 40 43 57 60 74
43 31 55 39 67 80
71 91 76 71 59 76
42 27 45 52 69 83
59 58 85 59 49 78
44 45 62 46 86 79
56 68 61 33 36 91
61 67 39 70 63 75
Solution:
Range = 91 - 27
Range = 64
c. $$c = \frac{\text{Range}}{\text{no. of desired classes}} = \frac{64}{7} \approx 9.14$$
Rounded to an odd whole number, $c = 9$.
$$x = \frac{L_1 + L_2}{2} \quad \text{or} \quad x = \frac{B_1 + B_2}{2}$$
$$x = \frac{27 + 35}{2} = \frac{62}{2} = 31$$
f. Determine the less-than cumulative frequency (<cf) by starting with the frequency of the
lowest class and successively adding the frequencies of the higher classes.
g. Determine the greater-than cumulative frequency (>cf) by starting with the frequency of the
highest class and successively adding the frequencies of the lower classes.
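The computations in this solution can be checked with a short sketch. It uses the scores exactly as listed above (60 values), a class width of 9 and a starting lower limit of 27, the lowest score. These choices are assumptions made for illustration. With them, an eighth class (90 to 98) is needed to cover the two highest scores.

```python
from itertools import accumulate

# Scores as listed in the example, row by row.
scores = [
    53, 82, 43, 49, 69, 90,
    48, 64, 71, 69, 51, 88,
    73, 40, 43, 57, 60, 74,
    43, 31, 55, 39, 67, 80,
    71, 91, 76, 71, 59, 76,
    42, 27, 45, 52, 69, 83,
    59, 58, 85, 59, 49, 78,
    44, 45, 62, 46, 86, 79,
    56, 68, 61, 33, 36, 91,
    61, 67, 39, 70, 63, 75,
]

value_range = max(scores) - min(scores)   # 91 - 27 = 64
width = round(value_range / 7)            # 64 / 7 is about 9.14, taken as 9

# Lower class limits: 27, 36, 45, ... up to the class containing 91.
lower_limits = list(range(min(scores), max(scores) + 1, width))
frequencies = [
    sum(lo <= s <= lo + width - 1 for s in scores) for lo in lower_limits
]

# <cf accumulates from the lowest class upward; >cf from the highest downward.
less_than_cf = list(accumulate(frequencies))
greater_than_cf = list(accumulate(frequencies[::-1]))[::-1]

for lo, f, lcf, gcf in zip(lower_limits, frequencies, less_than_cf, greater_than_cf):
    print(f"{lo}-{lo + width - 1}  f={f}  <cf={lcf}  >cf={gcf}")
```

The last <cf entry and the first >cf entry should both equal the total number of scores, which is a useful check on the table.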
2.10 Definition
The relative frequency distribution is similar to the frequency distribution, except that instead
of the number of observations belonging to a particular interval, the ratio of the number of
observations in the interval to the total number of observations, also known as the relative
frequency, is determined. The percentage frequency distribution is obtained by multiplying the
relative frequency of each interval by 100%.
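A brief sketch of the calculation follows, using the category counts tallied from the classroom behavior survey in Section 2.1 (A = 6, B = 9, C = 15 out of 30). The dictionary names are illustrative only.

```python
# Frequencies tallied from the 30 survey responses in Section 2.1.
frequencies = {"A": 6, "B": 9, "C": 15}
total = sum(frequencies.values())

# Relative frequency: share of all observations in each category.
relative = {k: f / total for k, f in frequencies.items()}

# Percentage frequency: relative frequency multiplied by 100.
percentage = {k: 100 * f / total for k, f in frequencies.items()}

for k in frequencies:
    print(k, frequencies[k], round(relative[k], 3), round(percentage[k], 1))
```

The relative frequencies always sum to 1 and the percentage frequencies to 100, apart from rounding.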
2.11 Definition
The CUMULATIVE FREQUENCY DISTRIBUTION gives, for each interval, the total number of
observations falling in that interval or in any preceding interval. From a frequency
distribution, it is obtained by simply adding together the frequency of the interval and the
frequencies of all preceding intervals (i.e., intervals whose values are less than the values of
that interval). We can also calculate the relative cumulative frequency distribution and the
percentage cumulative frequency distribution from the cumulative frequency distribution.
Example 1
Solution:
Note that the following values range from 0 to 10.0. Therefore, we can form the
following 10 classes:
We assume that a measurement that falls on the border between two intervals belongs
to the previous interval (e.g., a value of 4.0 belongs to class 4, the interval 3.0 – 4.0, instead
of class 5). By counting the number of observations that fall into each class, we get the
following frequency distribution:
The following table shows the cumulative frequency distribution for the above set of
measurements:
Measurements   Cumulative Frequency   Relative Cumulative Frequency   Percentage Cumulative Frequency (%)
0.0 – 1.0      3                      0.075                           7.5
1.0 – 2.0      7                      0.175                           17.5
2.0 – 3.0      11                     0.275                           27.5
3.0 – 4.0      18                     0.450                           45.0
4.0 – 5.0      24                     0.600                           60.0
5.0 – 6.0      29                     0.725                           72.5
6.0 – 7.0      34                     0.850                           85.0
7.0 – 8.0      35                     0.875                           87.5
8.0 – 9.0      37                     0.925                           92.5
9.0 – 10.0     40                     1.000                           100.0
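The table can be reproduced with a short sketch. The class frequencies used below are not given directly in the text; they are recovered here by differencing the cumulative frequencies in the table, which is an assumption of this illustration.

```python
from itertools import accumulate

# Cumulative frequencies as they appear in the table.
cumulative_from_table = [3, 7, 11, 18, 24, 29, 34, 35, 37, 40]

# Recover the class frequencies by differencing consecutive entries.
frequencies = [cumulative_from_table[0]] + [
    later - earlier
    for earlier, later in zip(cumulative_from_table, cumulative_from_table[1:])
]

# Rebuild the cumulative, relative cumulative and percentage cumulative columns.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]
relative_cumulative = [cf / total for cf in cumulative]
percentage_cumulative = [100 * cf / total for cf in cumulative]

for cf, rcf, pcf in zip(cumulative, relative_cumulative, percentage_cumulative):
    print(cf, f"{rcf:.3f}", f"{pcf:.1f}")
```

The final cumulative frequency equals the total number of measurements, so the last relative cumulative frequency is 1 and the last percentage is 100.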
Frequency distribution tables can also be utilized for qualitative data. Since qualitative
data are not ordinal (i.e., they cannot be ordered), the concept of cumulative frequency does
not apply when dealing with qualitative data.
Example 2
The following table illustrates the use of frequency distributions with qualitative data.