Lecture 7b Data Analysis
We compile a frequency table for the data and then draw some
type of corresponding frequency graph.
❖ Mode: the score with the highest frequency. Read it off from the frequency
column.
❖ Median: the middle score when the scores are arranged in order from the
smallest to the largest. (Accumulate the frequency column on the FDT to find the
middle score.)
Notice how the mean is affected by the extreme high score in part (c).
Example 3
The median value for an odd number of scores is always one of the scores (as in
example 2 above) but if we have an even number of scores the median is the average
of the two middle scores.
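The odd/even rule can be sketched in Python as follows. The function name median_of_scores and the sample scores are assumptions made for illustration only; they are not taken from the lecture examples.

```python
def median_of_scores(scores):
    """Return the median of a list of numeric scores."""
    ordered = sorted(scores)  # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: the median is the single middle score.
        return ordered[middle]
    # Even number of scores: average the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


# Hypothetical scores, chosen only to show both cases.
odd_scores = [7, 3, 9, 5, 1]        # 5 scores
even_scores = [7, 3, 9, 5, 1, 11]   # 6 scores
print(median_of_scores(odd_scores))   # 5
print(median_of_scores(even_scores))  # 6.0
```

Note that in the even case the result (6.0) is not one of the scores, which matches the remark above.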
3. Analysis Of Data (Analyze Data on FDT)
Example 4 cont.
We add the cumulative frequency column to the table to help us to find these scores.
Number of Magazines   Frequency   Cumulative
x                     f           frequency
0                     2           2
1                     12          14
2                     49          63
3                     64          127
4                     43          170
5                     20          190
6                     9           199
7                     1           200
                      ∑f = 200
From the cumulative frequency column, the 100th score is a 3 and the 101st score is also a 3.
Median = middle score
       = (100th score + 101st score) / 2
       = (3 + 3) / 2
       = 3
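As a rough check of this working, the cumulative frequency column and the median can be computed from the x and f columns of the magazine table. This is only a sketch: the helper score_at is a hypothetical name introduced here, and the data are the eight rows of the table.

```python
from itertools import accumulate

# x and f columns of the magazine frequency table in Example 4
magazines = [0, 1, 2, 3, 4, 5, 6, 7]
frequency = [2, 12, 49, 64, 43, 20, 9, 1]

cumulative = list(accumulate(frequency))  # running total of f
total = cumulative[-1]                    # sum of f


def score_at(position):
    """Return the score at a 1-based position in the ordered data (hypothetical helper)."""
    for x, cf in zip(magazines, cumulative):
        if position <= cf:
            return x
    raise ValueError("position is beyond the number of scores")


if total % 2 == 0:
    # Even number of scores: average the two middle scores.
    median = (score_at(total // 2) + score_at(total // 2 + 1)) / 2
else:
    median = score_at((total + 1) // 2)

# Mode: the score with the highest frequency.
mode = magazines[frequency.index(max(frequency))]

print(cumulative)           # [2, 14, 63, 127, 170, 190, 199, 200]
print(total, median, mode)  # 200 3.0 3
```

The 100th and 101st scores both fall in the row whose cumulative frequency first reaches 127, which is why both are 3.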
(c) To calculate the mean, another column is added to the distribution table.
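A minimal sketch of the extra column, assuming part (c) refers to the magazine table from Example 4: each frequency f is multiplied by its score x, and the mean is the total of the fx column divided by the total frequency. The variable names are illustrative.

```python
# x and f columns of the magazine frequency table in Example 4
magazines = [0, 1, 2, 3, 4, 5, 6, 7]
frequency = [2, 12, 49, 64, 43, 20, 9, 1]

# The added column: f times x for each row.
fx = [f * x for x, f in zip(magazines, frequency)]

sum_f = sum(frequency)   # total frequency
sum_fx = sum(fx)         # total of the fx column
mean = sum_fx / sum_f

print(fx)                   # [0, 12, 98, 192, 172, 100, 54, 7]
print(sum_fx, sum_f, mean)  # 635 200 3.175
```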
The mean (average) is the best known and most commonly used. It is easy to
compute and takes all measurements into consideration. By multiplying the mean
by the total frequency the sum can be found. Sometimes abnormal scores can
have an exaggerated effect on the mean and this is the main shortcoming of the
mean as a measure of central tendency (Example 2(c) above).
Another example: if you scored three HDs and a CP, the CP will pull your GPA
down.
The median on the other hand is not influenced by abnormal or extreme scores and
in statistics it is often desirable to disregard extreme scores which may be due to
unusual circumstances. Thus the median probably indicates the score of the
majority more accurately than the mean.
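The contrast between the two measures can be seen in a small experiment. The scores below are hypothetical, invented only to show the effect of one extreme value; they are not the data of Example 2.

```python
from statistics import median

# Hypothetical scores, invented for illustration.
typical = [4, 5, 5, 6, 7]
with_outlier = typical + [45]   # the same scores plus one extreme score

mean_typical = sum(typical) / len(typical)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_typical, median(typical))       # 5.4 5
print(mean_outlier, median(with_outlier))  # 12.0 5.5
```

One extreme score more than doubles the mean, while the median moves only from 5 to 5.5.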
Although the median is easy to find, it does not lend itself to further arithmetic
calculations. For example, if we know the median of twenty scores, we cannot find
the sum of these scores.
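A short illustration of this point, using two hypothetical sets of five scores (fewer than twenty, for brevity): both sets have the same median but different sums, whereas the mean multiplied by the number of scores recovers the sum in each case.

```python
from statistics import median

# Two hypothetical sets of scores with the same median.
set_a = [1, 2, 5, 8, 9]
set_b = [4, 5, 5, 6, 10]

results = []
for scores in (set_a, set_b):
    n = len(scores)
    mean = sum(scores) / n
    # mean * n recovers the sum; round() guards against floating-point error.
    results.append((median(scores), mean, round(mean * n), sum(scores)))

for row in results:
    print(row)
# (5, 5.0, 25, 25)
# (5, 6.0, 30, 30)
```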
The mode, like the median, is easy to find, easy to understand and is unaffected by
abnormal scores. The mode is not used as often as the mean and median, but
shopkeepers would be very interested in the mode in order to stock the items that
are most often chosen by customers, e.g. the most common shoe sizes or the most
popular sizes of soft drink bottle.
Group Data (Analyze Data on FDT)
Example
A small airline is making a survey of its services. The following table shows a summary of
passenger distribution on 150 flights over a certain route.
Number of Passengers   Number of Flights   Class Centre
(Class)                f                   x
5-9                    6                   7
10-14                  10                  12
15-19                  18                  17
20-24                  28                  22
25-29                  22                  27
30-34                  16                  32
35-39                  32                  37
40-44                  18                  42
                       ∑f = 150
Because there is a wide spread in the number of passengers carried, the data is presented in 8
groupings: 5-9, 10-14, 15-19, etc.
This makes it impossible to say exactly how many passengers were on a particular flight, but the
lack of detail is compensated for by the more compact arrangement of the data, which gives an
overall picture of the passenger distribution on that route.
(a) The class with the greatest frequency is called the modal class. What is the modal class for the
distribution?
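One way to check an answer to part (a) is to look up the class with the largest frequency. This is a sketch using the class and frequency columns of the airline table; the variable names are illustrative.

```python
# Class and frequency columns of the airline table
classes = ["5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44"]
flights = [6, 10, 18, 28, 22, 16, 32, 18]

total_flights = sum(flights)

# Modal class: the class with the greatest frequency.
modal_class = classes[flights.index(max(flights))]

print(total_flights)  # 150
print(modal_class)    # 35-39
```

The frequencies add up to the 150 flights surveyed, and the largest frequency (32) belongs to the class 35-39.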
(b) Complete the table and calculate the mean.
To calculate the mean of a grouped frequency distribution we assume that the midpoint of
a class (or group) will be characteristic of that class. Such an assumption would rarely be true
for any distribution but research has shown that the mean calculated in this way is usually
very close to the true mean.
The class centres are obtained by finding the average of the end points of a class.
For example, the class centre for the class 5-9 is (5 + 9)/2 = 7.
In the same way the other class centres are found as shown in the table:
(10 + 14)/2, (15 + 19)/2, (20 + 24)/2, (25 + 29)/2, etc.
The difference between consecutive class centres is equal to the class interval.
In this case the class interval is 5.
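The class centres and the class interval can be reproduced with a few lines of Python. This is only a sketch; the list class_limits restates the end points of the eight classes, and the variable names are assumptions.

```python
# End points of each class in the airline table
class_limits = [(5, 9), (10, 14), (15, 19), (20, 24),
                (25, 29), (30, 34), (35, 39), (40, 44)]

# Class centre: the average of the two end points of a class.
centres = [(low + high) / 2 for low, high in class_limits]

# Differences between consecutive class centres.
intervals = [b - a for a, b in zip(centres, centres[1:])]

print(centres)    # [7.0, 12.0, 17.0, 22.0, 27.0, 32.0, 37.0, 42.0]
print(intervals)  # [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
```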
The mean of grouped data is then calculated just as for simple data.
Number of Passengers   Number of Flights   Class Centre   f × x
(Class)                f                   x
5-9                    6                   7              6 × 7 = 42
10-14                  10                  12             10 × 12 = 120
15-19                  18                  17             18 × 17 = 306
20-24                  28                  22             28 × 22 = 616
25-29                  22                  27             22 × 27 = 594
30-34                  16                  32             16 × 32 = 512
35-39                  32                  37             32 × 37 = 1184
40-44                  18                  42             18 × 42 = 756
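The grouped mean can be checked with a short computation. This is a sketch using the f and x columns of the airline table; the totals are computed here rather than quoted, and the variable names are illustrative.

```python
# f (number of flights) and x (class centre) columns of the airline table
flights = [6, 10, 18, 28, 22, 16, 32, 18]
centres = [7, 12, 17, 22, 27, 32, 37, 42]

# f times x for each class, e.g. 22 * 27 = 594 for the class 25-29.
fx = [f * x for f, x in zip(flights, centres)]

sum_f = sum(flights)   # total number of flights
sum_fx = sum(fx)       # total of the f * x column
mean = sum_fx / sum_f

print(fx)              # [42, 120, 306, 616, 594, 512, 1184, 756]
print(sum_f, sum_fx)   # 150 4130
print(round(mean, 1))  # 27.5
```

Under the class-centre assumption, the mean is therefore about 27.5 passengers per flight.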