Lecture 7b Data Analysis

The document discusses the analysis of data, focusing on frequency distribution tables and statistical measures. It explains measures of central tendency (mode, median, mean) and provides examples for calculating these measures, as well as discussing the relative merits of each. Additionally, it covers measures of dispersion, including range, variance, and standard deviation, to analyze the spread of scores from the mean.

Uploaded by

pkoralyostevelyn

3. Analysis Of Data (Analyze Data on FDT)

❖ Data collected must be reduced to an understandable form which can be quickly grasped.

❖ We compile a frequency table for the data and then draw some type of corresponding frequency graph.

❖ It is useful to add further clarity by finding certain measures which describe important features of the distribution.

❖ To help analyze data summarized by a frequency distribution, a variety of statistical measures are used.

❖ Two of the most important of these are:
(i) Measures of Central Tendency (Stage 1)
(ii) Measures of Dispersion (Stage 2)
Stage 1: Measures of Central Tendency
(A central point in the distribution of the scores)
Mode, Median & Mean

❖ Mode: the score with the highest frequency. Read it off from the frequency column.

❖ Median: the middle score when the scores are arranged in order from the smallest to the largest. (Accumulate the frequency column on the FDT to find the middle score.)

❖ Mean/Average: sum of the scores / number of scores.

✓ The most commonly used of the three is the MEAN/AVERAGE (central value).


Example 1
A simple example. Suppose the following numbers represent the marks out of 10 a student obtained over 7 tests:
7, 4, 6, 5, 9, 8, 8
What was the mode score, the mean score, and the median score?
1. The mode, being the score occurring most often, is 8.
2. The mean involves adding all the scores (47) and then dividing by the number of scores (7): 47/7 = 6 5/7 ≈ 6.71.
3. The median is the middle score when the scores are arranged in order: 4 5 6 (7) 8 8 9, so the median is 7.
Note that there is the same number of scores on either side of the median, in this case 3 scores to the left (4 5 6) and 3 scores to the right (8 8 9).
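
The three measures for these seven marks can be checked with a short Python sketch. It relies on the standard `statistics` and `fractions` modules; the variable names are illustrative and not part of the lecture.

```python
from fractions import Fraction
from statistics import median, mode

scores = [7, 4, 6, 5, 9, 8, 8]  # marks out of 10 over 7 tests

mode_score = mode(scores)                        # most frequent score
mean_score = Fraction(sum(scores), len(scores))  # exact value of 47/7
median_score = median(scores)                    # middle score after sorting

# Split the exact mean into a whole part and a fractional remainder.
whole, remainder = divmod(mean_score.numerator, mean_score.denominator)

print(f"mode   = {mode_score}")
print(f"mean   = {whole} {remainder}/{mean_score.denominator} (about {float(mean_score):.2f})")
print(f"median = {median_score}")
```

Keeping the mean as a `Fraction` shows the exact value 6 5/7 before it is rounded to a decimal.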
Example 1 cont.
What would happen to the mean, median and mode if the student sat another test and scored 5?
There are now two modes, 5 and 8, since each occurs twice.
The mean is now 52/8 = 6½ = 6.5.
The median is a bit tricky. The scores in order are now 4 5 5 6 7 8 8 9.
There is no "middle score". For half the scores to be to the left and half to the right, the division would have to be
4 5 5 6 | 7 8 8 9
between the scores of 6 and 7. We say the median is the average of the two scores on either side, i.e. (6 + 7)/2 = 6.5.
So the median is 6.5 (it is just a coincidence that the mean was also 6.5!).
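
A minimal sketch of the same eight scores in Python, assuming the standard `statistics` module: `multimode` returns every score tied for the highest frequency, which suits a case like this one where there are two modes.

```python
from statistics import mean, median, multimode

scores = [7, 4, 6, 5, 9, 8, 8]
scores.append(5)  # the extra test score

modes = sorted(multimode(scores))  # all scores tied for the highest frequency

print("modes  =", modes)
print("mean   =", mean(scores))
print("median =", median(scores))  # average of the two middle scores, 6 and 7
```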
Example 2
Calculate the mean, median and mode for the following sets of scores:
a) 10, 14, 6, 17, 20, 11, 8, 10, 16

1. The score which occurs most frequently is 10, therefore Mode = 10.
2. Arranged in order of size: 6, 8, 10, 10, 11, 14, 16, 17, 20. The middle score is 11, therefore Median = 11.
3. Mean = ∑fx / ∑f = 112 / 9 ≈ 12.44
Example 2 cont.
Calculate the mean, median and mode for the following sets of scores:
b) 7, 6, 8, 7, 7, 6, 7, 8, 7 (Ans: Mean, mode, median = 7, 7, 7)
c) 8, 8, 6, 9, 7, 6, 8, 7, 8, 9, 210 (Ans: Mean, mode, median = 26, 8, 8)

Notice how the mean is affected by the extreme high score in part (c).
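
To see how strongly the single score of 210 affects each measure, the sketch below recomputes the three measures for set (c) with and without that score. Dropping the score is done here only for comparison; it is not a step the lecture prescribes.

```python
from statistics import mean, median, mode

scores_c = [8, 8, 6, 9, 7, 6, 8, 7, 8, 9, 210]
without_extreme = [s for s in scores_c if s != 210]  # comparison set only

for label, data in [("with 210", scores_c), ("without 210", without_extreme)]:
    print(f"{label:12} mean={mean(data):.1f} median={median(data)} mode={mode(data)}")
```

The mean moves a long way when the extreme score is removed, while the median and mode stay at 8.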

Example 3
The median value for an odd number of scores is always one of the scores (as in example 2 above), but if we have an even number of scores the median is the average of the two middle scores.

For instance, the median of 1, 2, 3, 4, 5 is 3, but the median of 1, 2, 3, 4, 5, 6 is (3 + 4)/2 = 3.5.
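
The odd/even rule can be written out by hand as a small function. This is a sketch for illustration; `median_of` is a hypothetical name, and Python's built-in `statistics.median` applies the same rule.

```python
def median_of(scores):
    """Return the median using the odd/even rule described above."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd count: the middle score itself.
        return ordered[middle]
    # Even count: average of the two middle scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([1, 2, 3, 4, 5]))
print(median_of([1, 2, 3, 4, 5, 6]))
```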
Example 4.
Two hundred students were asked to state how many magazines they purchased during the last term. The results are given in the table below.

Number of Magazines    Number of Students
0                      2
1                      12
2                      49
3                      64
4                      43
5                      20
6                      9
7                      1

(a) What number of magazines was most commonly purchased?
3. This is the mode of the distribution.

(b) Find the median for the frequency distribution table.
Total frequency = 200, which is even, therefore the median will be the average of the two middle scores, the 100th and the 101st scores.
We add the cumulative frequency column to the table to help us to find these scores.

Number of Magazines (x)    Frequency (f)    Cumulative frequency
0                          2                2
1                          12               14
2                          49               63
3                          64               127
4                          43               170
5                          20               190
6                          9                199
7                          1                200
∑f = 200

From the cumulative frequency column, the 100th score is a 3 and the 101st score is also a 3.

Median = middle score
= (100th score + 101st score) / 2
= (3 + 3) / 2
= 3
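
The cumulative-frequency method for the median of the magazine data can be sketched in Python as follows. The helper `nth_score` is a hypothetical name introduced for illustration: it finds the first row whose cumulative frequency reaches a given position.

```python
from bisect import bisect_left
from itertools import accumulate

magazines = [0, 1, 2, 3, 4, 5, 6, 7]      # score x
students = [2, 12, 49, 64, 43, 20, 9, 1]  # frequency f

cumulative = list(accumulate(students))   # running total of f
total = cumulative[-1]


def nth_score(n):
    """Score held by the n-th student (1-based) when all scores are in order."""
    # First row whose cumulative frequency is at least n.
    return magazines[bisect_left(cumulative, n)]


if total % 2 == 0:
    # Even total: average the two middle scores.
    median = (nth_score(total // 2) + nth_score(total // 2 + 1)) / 2
else:
    median = nth_score((total + 1) // 2)

print(cumulative)
print(median)
```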
(c) To calculate the mean, another column is added to the distribution table.

Number of Magazines (Score x)    Frequency (f)    Frequency × Score (f × x)
0                                2                2 × 0 = 0
1                                12               12 × 1 = 12
2                                49               49 × 2 = 98
3                                64               64 × 3 = 192
4                                43               43 × 4 = 172
5                                20               20 × 5 = 100
6                                9                9 × 6 = 54
7                                1                1 × 7 = 7
∑f = 200, ∑fx = 635

∑f means the sum of the frequencies and ∑fx gives the sum of all the scores.

To calculate the mean of the distribution:
Mean (or Average) = sum of all the scores / number of scores = ∑fx / ∑f

The average number of magazines purchased = ∑fx / ∑f = 635 / 200 = 3.175
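
The f × x column and the mean of the magazine data can be reproduced with a few lines of Python. This is a sketch with illustrative variable names; it also reads the mode off the frequency column.

```python
magazines = [0, 1, 2, 3, 4, 5, 6, 7]      # score x
students = [2, 12, 49, 64, 43, 20, 9, 1]  # frequency f

fx = [f * x for x, f in zip(magazines, students)]  # the added f × x column
sum_f = sum(students)                              # ∑f, the number of scores
sum_fx = sum(fx)                                   # ∑fx, the sum of all the scores
mean = sum_fx / sum_f

# The mode is the score with the highest frequency.
mode_x = magazines[students.index(max(students))]

print(fx)
print(sum_f, sum_fx, mean, mode_x)
```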
Discussion
Relative Merits of Mean, Median and Mode.

The mean (average) is the best known and most commonly used. It is easy to
compute and takes all measurements into consideration. By multiplying the mean
by the total frequency the sum can be found. Sometimes abnormal scores can
have an exaggerated effect on the mean and this is the main shortcoming of the
mean as a measure of central tendency (Example 2(c) above).

Another example: if you scored 3 HDs and a CP, the CP will pull your GPA down.

The median, on the other hand, is not influenced by abnormal or extreme scores, and in statistics it is often desirable to disregard extreme scores which may be due to unusual circumstances. Thus the median probably indicates the score of the majority more accurately than the mean.
Although the median is easy to find, it does not lend itself to further arithmetic
calculations. For example, if we know the median of twenty scores, we cannot find
the sum of these scores.

The mode, like the median, is easy to find, easy to understand and is unaffected by abnormal scores. The mode is not used as often as the mean and median, but shopkeepers would be very interested in the mode in order to stock the items that are most often chosen by customers, e.g. the most common shoe sizes or the most popular sizes of soft drink bottle.
Group Data (Analyze Data on FDT)
Example
A small airline is making a survey of its services. The following table shows a summary of the passenger distribution on 150 flights over a certain route.

Number of Passengers (Class)    Number of Flights (f)    Class Centre (x)
5-9                             6                        7
10-14                           10                       12
15-19                           18                       17
20-24                           28                       22
25-29                           22                       27
30-34                           16                       32
35-39                           32                       37
40-44                           18                       42
∑f = 150

Because there is a wide spread in the number of passengers carried, the data is presented in 8 groupings: 5-9, 10-14, 15-19, etc. This makes it impossible to say exactly how many passengers were on a particular flight, but the lack of detail is compensated for by the more compact arrangement of the data, which gives an overall picture of the passenger distribution on that route.

(a) The class with the greatest frequency is called the modal class. What is the modal class for the distribution? (Ans: 35-39, which has the highest frequency, 32 flights)
(b) Complete the table and calculate the mean.
To calculate the mean of a grouped frequency distribution we assume that the mid point of
a class (or group) will be characteristic of that class. Such an assumption would rarely be true
for any distribution but research has shown that the mean calculated in this way is usually
very close to the true mean.
The class centers are obtained by finding the average of the end points of a class.

For example, the class center for the class 5-9 is (5 + 9)/2 = 7.

In the same way the other class centers are found as shown in the table:
(10 + 14)/2 = 12, (15 + 19)/2 = 17, (20 + 24)/2 = 22, (25 + 29)/2 = 27, etc.

The difference between consecutive class centers is equal to the class interval. In this case the class interval is 5.
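
As a quick check of the class centers and the class interval, the following Python sketch averages the end points of each class and collects the gaps between consecutive centers. The variable names are illustrative.

```python
# End points (lowest, highest) of each passenger class.
classes = [(5, 9), (10, 14), (15, 19), (20, 24),
           (25, 29), (30, 34), (35, 39), (40, 44)]

# Class center = average of the two end points.
centres = [(low + high) / 2 for low, high in classes]

# Differences between consecutive centers; a single value means a constant class interval.
gaps = {later - earlier for earlier, later in zip(centres, centres[1:])}

print(centres)
print(gaps)
```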
The mean of grouped data is then calculated just as for simple data.

Number of Passengers (Class)    Number of Flights (f)    Class Centre (x)    f × x
5-9                             6                        7                   6 × 7 = 42
10-14                           10                       12                  10 × 12 = 120
15-19                           18                       17                  18 × 17 = 306
20-24                           28                       22                  28 × 22 = 616
25-29                           22                       27                  22 × 27 = 594
30-34                           16                       32                  16 × 32 = 512
35-39                           32                       37                  32 × 37 = 1184
40-44                           18                       42                  18 × 42 = 756
∑f = 150, ∑fx = 4130

The mean is then ∑fx / ∑f = 4130 / 150 ≈ 27.533

Note that the only difference with a grouped frequency table is that we calculate the CLASS CENTERS and use them as the x values.
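
The grouped mean can be recomputed directly from the f and x columns of the airline table. The sketch below uses illustrative variable names; it multiplies each frequency by its class center, sums the products, and also picks out the modal class.

```python
classes = ["5-9", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40-44"]
flights = [6, 10, 18, 28, 22, 16, 32, 18]  # frequency f for each class
centres = [7, 12, 17, 22, 27, 32, 37, 42]  # class centers used as the x values

fx = [f * x for f, x in zip(flights, centres)]
sum_f = sum(flights)
sum_fx = sum(fx)
grouped_mean = sum_fx / sum_f

# Modal class: the class with the greatest frequency.
modal_class = classes[flights.index(max(flights))]

for name, f, x, product in zip(classes, flights, centres, fx):
    print(f"{name:6} {f:3} {x:3} {product:5}")
print(sum_f, sum_fx, round(grouped_mean, 3), modal_class)
```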
Stage 2: Measures of Dispersion
(Spread of the scores from the MEAN)

Here we investigate the spread or scatter of values from the mean.
(Range, Variance, Standard Deviation)

❖ Range: the difference between the highest and the lowest score. It is not commonly used, but it does appear in weather/temperature data.

❖ Variance: the average of the squared deviations from the mean.

❖ Standard Deviation: the root-mean-square deviation, i.e. the square root of the variance.
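
A minimal sketch of the three measures of dispersion, reusing the eight test scores from Example 1. It assumes the variance is the plain average of the squared deviations (dividing by the number of scores, as the definition above states); some texts divide by one less than the number of scores when working with a sample, which gives a slightly larger value.

```python
from statistics import mean, pstdev, pvariance

scores = [7, 4, 6, 5, 9, 8, 8, 5]  # the eight test scores from Example 1

score_range = max(scores) - min(scores)  # highest score minus lowest score
centre = mean(scores)

# Variance: average of the squared deviations from the mean (divide by N).
variance = sum((s - centre) ** 2 for s in scores) / len(scores)

# Standard deviation: square root of the variance.
std_dev = variance ** 0.5

print(score_range, variance, round(std_dev, 4))

# Cross-check against the population versions in the statistics module.
print(pvariance(scores), round(pstdev(scores), 4))
```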
