
Lesson 5

The document discusses calculating measures of central tendency, variability, and location for grouped data. It defines key terms like frequency distribution, class intervals, class limits, and class boundaries. The goal is to learn how to calculate measures like mean, median, mode, range, and standard deviation for grouped data.

Uploaded by

sheilayvonne45

JOMO KENYATTA UNIVERSITY OF AGRICULTURE & TECHNOLOGY
JKUAT SODeL
SCHOOL OF OPEN, DISTANCE AND eLEARNING
P.O. Box 62000, 00200 Nairobi, Kenya
E-mail: [email protected]
©2014

STA 2100 Probability and Statistics I
LAST REVISION ON February 13, 2014

This presentation is intended to be covered within one week. The notes, examples and exercises should be supplemented with a good textbook. Most of the exercises have solutions/answers appearing elsewhere and accessible by clicking the green Exercise tag. To move back to the same page, click the same tag appearing at the end of the solution/answer.

Errors and omissions in these notes are entirely the responsibility of the author, who should only be contacted through the Department of Curricula & Delivery (SODeL). Suggested corrections may be e-mailed to [email protected].

JKUAT: Setting trends in higher Education, Research and Innovation

LESSON 5
Numerical Summaries of Data (Grouped Frequency Distributions)

Learning outcomes
Upon completing this topic, you should be able to:
• Calculate and interpret the measures of central tendency for grouped data.
• Identify and calculate the measures of variability for grouped data.
• Identify and calculate the measures of location for grouped data.

5.1. Introduction
In our last lecture we looked at the measures of central tendency, dispersion/variability and location for ungrouped data. We saw different ways of computing the mean, namely the arithmetic mean, the harmonic mean and the geometric mean. We also saw the importance of each of these measures in terms of the data at hand. In addition, we saw how to compute the measures of dispersion and location, and their merits and demerits in a given situation. All of this was done using simple frequency distributions, also referred to as ungrouped data.
In this lesson, we focus on the same measures as in the last lesson, but we now consider grouped frequency distributions. To help us solve the problems in this lesson, we refer to Lesson 3, under the section "Constructing a Frequency Distribution", where we showed how to construct a frequency distribution (also known as grouped frequency) table. Therefore, in this lesson we go straight to the use of these tables, assuming that we are now familiar with their construction.

5.2. Measures of central tendency for grouped data

As previously indicated, measures of central tendency generally show the tendency of statistical data to concentrate around certain values. To begin with, let us remind ourselves of some of the useful terms that we need in this lesson by defining the following:

Frequency distribution: When summarizing large masses of raw data, it is often useful to distribute the data into classes or categories and determine the number of individuals belonging to each class. Such a number is what we refer to as the class frequency. When data are arranged in classes, together with the corresponding class frequency for each class, in a table, then such a table is referred to as a frequency distribution or frequency table.

Example. Consider the following data, which give the masses of 100 male JKUAT IT students as recorded in the frequency table below.

Mass (kg)   No. of students   Relative frequency   Cumulative frequency
60-62       5                 0.05                 5
63-65       18                0.18                 23
66-68       42                0.42                 65
69-71       27                0.27                 92
72-74       8                 0.08                 100
Total       100               1.00


Data that are organized and summarized as in this table are often called grouped data. Much of the original detail of the raw data may be lost because of the grouping. However, an important advantage of using grouped data is that a clear overall picture of the data emerges, and certain vital relationships become evident.

Class intervals and class limits: The symbol defining a class, such as 60-62, is called a class interval. The numbers 60 and 62 are called class limits. The value 60 is the Lower Class Limit (LCL) while 62 is the Upper Class Limit (UCL). Sometimes a class has no UCL or no LCL. Such a class is called an open-ended class, for instance the class "30 years and over".
Class boundaries: In our table, the masses were recorded correct to the nearest kg. The interval 60-62 therefore theoretically includes all masses from 59.5 to 62.5; e.g. a mass of 59.8 kg belongs to this class. The numbers 59.5 and 62.5 are called class boundaries or true class limits. In practice, a class boundary is obtained by averaging the upper class limit of one class and the lower class limit of the next, e.g. (62 + 63)/2 = 62.5.
Class size or interval or width: This is the difference between the Upper Class Boundary (UCB) and the Lower Class Boundary (LCB), for instance 62.5 − 59.5 = 3.
Class mark/mid-mark: This is the midpoint of the class interval, obtained by taking the average of the UCB and LCB, for instance (59.5 + 62.5)/2 = 61. In grouped data, when computing the mean and standard deviation, we assume that all observations in a class coincide with its midpoint.

5.2.1. Computing the Mean


Example. The speeds, to the nearest mile per hour, of 120 vehicles passing a check point were recorded and grouped as follows:

Speed (mph)   21-25   26-30   31-35   36-45   46-60
No. of cars   22      48      25      16      9

Estimate the mean of this distribution.

Solution
First, we need to work out the mid-interval values. For the first interval, 21-25, the LCB is 20.5 and the UCB is 25.5.
Thus the midpoint for this class is $\frac{1}{2}(20.5 + 25.5) = 23$.
We therefore assume that all values in the interval 21-25 are represented by the value 23.
Similarly, we get the midpoints for the remaining classes as shown below.

Speed (mph)   Midpoint, x   f                fx
21-25         23            22               506
26-30         28            48               1344
31-35         33            25               825
36-45         40.5          16               648
46-60         53            9                477
Total                       $\sum f = 120$   $\sum fx = 3800$

Hence, mean $\bar{x} = \sum fx / \sum f = 3800/120 = 31\tfrac{2}{3}$ mph.
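
The calculation above can be checked with a few lines of Python. This is only a sketch: the function name `grouped_mean` is our own, and the class boundaries are written out by hand from the speed table.

```python
# Grouped-data mean: each class is represented by its midpoint.
# Boundaries and frequencies come from the speed example above.
boundaries = [(20.5, 25.5), (25.5, 30.5), (30.5, 35.5), (35.5, 45.5), (45.5, 60.5)]
freqs = [22, 48, 25, 16, 9]


def grouped_mean(boundaries, freqs):
    """Return sum(f * x) / sum(f), where x is the midpoint of each class."""
    midpoints = [(lo + hi) / 2 for lo, hi in boundaries]
    total_fx = sum(f * x for f, x in zip(freqs, midpoints))
    return total_fx / sum(freqs)


mean_speed = grouped_mean(boundaries, freqs)
print(round(mean_speed, 2))  # 31.67
```

Note that the classes here have unequal widths; the midpoint method does not require equal widths.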

5.2.2. Assumed mean and coding method

Suppose we guess that the mean of a distribution is $a$ (the assumed mean). Then each value $x_i$ can be written as $x_i = a + d_i$, where $d_i$ is the deviation of $x_i$ from the assumed mean $a$.
Thus, the mean is $\bar{x} = \sum f_i x_i / \sum f_i = \sum f_i (a + d_i) / \sum f_i$. Expanding and simplifying this expression, we get $\bar{x} = a + \sum f_i d_i / \sum f_i$, which we can simply write as $\bar{x} = a + \sum fd / \sum f$ without the subscripts.
So the mean $\bar{x}$ = assumed mean + mean of the deviations from the assumed mean.
If we use grouped data, all values falling in a class interval are considered as coincident with the class mark of the interval. The last formula can be adjusted further if all classes have the same class width (interval), equal to a constant $c$. Writing $u_i = d_i / c$, we have $d_i = c u_i$.
Thus we have $\bar{x} = a + \sum fd / \sum f = a + \sum f c u / \sum f$.
Since $c$ is a constant, we have $\bar{x} = a + c\left(\sum fu / \sum f\right) = a + c\bar{u}$, where $\bar{u}$ is the mean of $u$.
The formula $\bar{x} = a + c\bar{u}$ is what we refer to as the coding method for computing the mean and other measures from a frequency distribution table. It is mainly useful when the class intervals are equal.



Example. Consider the data that we saw in Example 1 (the masses of the 100 students). Using the assumed mean and coding method, we can obtain the mean as follows, taking $a = 67$ and $c = 3$:

Mass (kg)   No. of students, f   Class mark, x   d = x − a   u = d/3   fu
60-62       5                    61              -6          -2        -10
63-65       18                   64              -3          -1        -18
66-68       42                   67              0           0         0
69-71       27                   70              3           1         27
72-74       8                    73              6           2         16
Total       $\sum f = 100$                                             $\sum fu = 15$

Thus, using the coding formula we have $\bar{x} = a + c\bar{u}$, where $\bar{u} = \sum fu / N = 15/100 = 0.15$ (remember $\sum f = N$).
This implies $\bar{x} = a + c\bar{u} = 67 + 3(0.15) = 67.45$ kg.
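
As a quick check of the coding method, the sketch below recomputes the mean from the class marks and frequencies of the masses table. The function name `coded_mean` is our own choice, and the values a = 67 and c = 3 are the ones used in the example.

```python
# Coding method for the mean: x = a + c*u, so mean(x) = a + c * mean(u).
class_marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]


def coded_mean(class_marks, freqs, a, c):
    """Mean via the assumed mean a and the common class width c."""
    u = [(x - a) / c for x in class_marks]          # coded values
    u_bar = sum(f * ui for f, ui in zip(freqs, u)) / sum(freqs)
    return a + c * u_bar


mean_mass = coded_mean(class_marks, freqs, a=67, c=3)
print(round(mean_mass, 2))  # 67.45
```

Any other choice of a gives the same mean; picking the class mark of a central class simply keeps the coded values small.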

5.2.3. Median for a grouped frequency
As previously mentioned, grouping loses the individual values; hence we cannot calculate the exact value of the median, but it can be estimated using two methods:
1. Using the interpolation formula
2. By graphical interpolation (check on the assignment for this)
Using the interpolation formula
Given grouped frequency data, the best that we can do is to identify the class that contains the median item and hence obtain a 'theoretical' value. To achieve this objective, we proceed as follows:
Step 1: Form a Cumulative Frequency (CF) column.
Step 2: Find $N/2$.
Step 3: Find the CF value that first exceeds $N/2$; this identifies the median class $M$.
Step 4: Calculate the median using the formula
$\text{median} = L_M + \left(\frac{N/2 - F_{M-1}}{f_M}\right) C_M$
Where:
$L_M$: is the lower class boundary of the median class
$F_{M-1}$: is the cumulative frequency of the class just prior to the median class
$f_M$: is the observed frequency of the median class
$C_M$: is the class interval/width of the median class

Example. Estimate the median for the following data, which represent the ages of 130 representatives who took part in a statistical survey.

Age in years   20-25   25-30   30-35   35-40   40-45   45-50
No. of reps    2       14      29      43      33      9

Solution:
Using the procedure illustrated above, we have

Age in years   20-25   25-30   30-35   35-40   40-45   45-50
No. of reps    2       14      29      43      33      9
CF             2       16      45      88      121     130

Thus, $N/2 = 130/2 = 65$.
The CF value which first exceeds 65 is 88; thus the class with CF = 88 is the median class.
The median class is therefore the class 35-40.
So, $\text{median} = L_M + \left(\frac{N/2 - F_{M-1}}{f_M}\right) C_M = 35 + \left(\frac{130/2 - 45}{43}\right)5 = 35 + \left(\frac{20}{43}\right)5 = 35 + 2.33 = 37.33$ years.
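
The four steps can be sketched in Python as follows. The function name `grouped_median` is our own, and we assume that all classes share the same width, as they do in the ages table.

```python
from itertools import accumulate

# Ages example: lower class boundaries, common width and frequencies.
lower_boundaries = [20, 25, 30, 35, 40, 45]
width = 5
freqs = [2, 14, 29, 43, 33, 9]


def grouped_median(lower_boundaries, freqs, width):
    """Interpolated median: L_M + ((N/2 - F_{M-1}) / f_M) * C_M."""
    cum = list(accumulate(freqs))                           # Step 1: CF column
    half = cum[-1] / 2                                      # Step 2: N/2
    m = next(i for i, cf in enumerate(cum) if cf > half)    # Step 3: median class
    prev_cf = cum[m - 1] if m > 0 else 0                    # F_{M-1}
    return lower_boundaries[m] + (half - prev_cf) / freqs[m] * width  # Step 4


median_age = grouped_median(lower_boundaries, freqs, width)
print(round(median_age, 2))  # 37.33
```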

5.2.4. Mode for a grouped frequency
Sometimes a set of data is obtained where it is appropriate to measure a representative value in terms of 'popularity'. The mode of a set of data is the value which occurs most often or, equivalently, has the largest frequency.
As with the median, the mode for grouped data cannot be determined exactly, but it can be estimated by use of the interpolation technique or graphically using a histogram.
An estimate can therefore be obtained as follows:
Step 1: Determine the modal class (the class with the highest frequency).
Step 2: Calculate $D_1$ = difference between the largest frequency and the frequency immediately preceding it.
Step 3: Calculate $D_2$ = difference between the largest frequency and the frequency immediately following it.
Step 4: Use the interpolation formula $\text{mode} = L_m + \left(\frac{D_1}{D_1 + D_2}\right) C_m$
Where:
$L_m$: is the lower class boundary of the modal class
$C_m$: is the modal class interval/width

Example. Estimate the mode for the following data, which represent the ages of 130 representatives who took part in a statistical survey.

Age in years   20-25   25-30   30-35   35-40   40-45   45-50
No. of reps    2       14      29      43      33      9

Solution:
The modal class is 35-40, with frequency 43.
$D_1 = 43 - 29 = 14$
$D_2 = 43 - 33 = 10$
$L_m = 35$
$C_m = 5$
Hence, $\text{mode} = 35 + \left(\frac{14}{14 + 10}\right)5 = 35 + \left(\frac{14}{24}\right)5 = 37.92$ years.
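
The same steps can be written as a short Python sketch. The function name `grouped_mode` is our own. Treating a missing neighbouring class as having frequency 0 (when the modal class is the first or last class) is an assumption on our part; the notes do not cover that case.

```python
# Ages example: lower class boundaries, common width and frequencies.
lower_boundaries = [20, 25, 30, 35, 40, 45]
width = 5
freqs = [2, 14, 29, 43, 33, 9]


def grouped_mode(lower_boundaries, freqs, width):
    """Interpolated mode: L_m + (D1 / (D1 + D2)) * C_m."""
    m = max(range(len(freqs)), key=lambda i: freqs[i])      # Step 1: modal class
    prev_f = freqs[m - 1] if m > 0 else 0                   # assumption: 0 if no class before
    next_f = freqs[m + 1] if m < len(freqs) - 1 else 0      # assumption: 0 if no class after
    d1 = freqs[m] - prev_f                                  # Step 2
    d2 = freqs[m] - next_f                                  # Step 3
    return lower_boundaries[m] + d1 / (d1 + d2) * width     # Step 4


mode_age = grouped_mode(lower_boundaries, freqs, width)
print(round(mode_age, 2))  # 37.92
```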

5.3. Measures of variability for grouped data

In this section, we want to see how we can find the variance and standard deviation of our data given a grouped frequency table. We shall also consider the coding method for computing the variance and standard deviation. In addition, we talk briefly about the mean absolute deviation (MAD).

5.3.1. Variance and Standard deviation
In most cases, we are interested in a measure that can be used for further statistical analysis of a set of data. The variance and standard deviation are measures that can be used for this purpose.
Variance is defined as $\mathrm{var}(x) = \sum f(x - \bar{x})^2 / \sum f$, while the standard deviation is $\mathrm{sd}(x) = \sqrt{\sum f(x - \bar{x})^2 / \sum f}$. Writing $n = \sum f$, this formula can be written further as follows:
$\mathrm{sd}(x) = \sqrt{\frac{1}{n}\left[\sum fx^2 - \left(\sum fx\right)^2 / n\right]}$
or $\mathrm{sd}(x) = \sqrt{\sum fx^2 / \sum f - \left(\sum fx / \sum f\right)^2}$
This can then be used to find the variance of the data as shown in the following example.

Example. The data below relate to the number of successful sales made by the salesmen employed by a large microcomputer firm in a particular quarter. Calculate the standard deviation of the number of sales.

No. of sales         0 to 4   5 to 9   10 to 14   15 to 19   20 to 24   25 to 29
No. of salesmen, f   1        14       23         21         15         6

Solution:
We can solve this problem by first finding the midpoints, then computing the mean and then the variance.

No. of sales   No. of salesmen, f   Midpoint, x   fx     $fx^2$
0 to 4         1                    2             2      4
5 to 9         14                   7             98     686
10 to 14       23                   12            276    3312
15 to 19       21                   17            357    6069
20 to 24       15                   22            330    7260
25 to 29       6                    27            162    4374
Total          80                                 1225   21,705

Hence, mean $\bar{x} = \frac{1225}{80} = 15.31$ sales
$\mathrm{sd} = \sqrt{\frac{21705}{80} - \left(\frac{1225}{80}\right)^2} = \sqrt{271.31 - 234.47} = \sqrt{36.84} \approx 6.1$ sales
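
The column totals and the resulting standard deviation can be verified with the following sketch. The function name `grouped_sd` is our own; it uses the population form (dividing by n), matching the formula above.

```python
from math import sqrt

# Sales example: class midpoints and frequencies.
midpoints = [2, 7, 12, 17, 22, 27]
freqs = [1, 14, 23, 21, 15, 6]


def grouped_sd(midpoints, freqs):
    """Population sd: sqrt(sum(f*x^2)/n - (sum(f*x)/n)^2), with n = sum(f)."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    variance = sum_fx2 / n - (sum_fx / n) ** 2
    return sqrt(variance)


sd_sales = grouped_sd(midpoints, freqs)
print(round(sd_sales, 1))  # 6.1
```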

Note: In this case we have assumed that we are dealing with the whole population, hence we divide by $n$ and not $n - 1$.
We can use the coding method to find the standard deviation and variance of grouped data as follows:
Let $x = a + cu$, so that $\bar{x} = a + c\bar{u}$.
Therefore, variance $= \sum f(x - \bar{x})^2 / \sum f$. Substituting the above expressions, we have
$= \sum f(a + cu - a - c\bar{u})^2 / \sum f$. Simplifying, we have
$= c^2 \sum f(u - \bar{u})^2 / \sum f$
$= c^2\,\mathrm{var}(u)$, implying that
$\mathrm{sd}(x) = c\sqrt{\mathrm{var}(u)}$

Example. Consider once more the data that we saw in Example 1 (the masses of the 100 students). Using the above formulas, we can obtain the standard deviation as follows:
First, remember that we already obtained the mean $\bar{x} = 67.45$ kg.

Mass (kg)   No. of students, f   Class mark, x   $x - \bar{x}$   $(x - \bar{x})^2$   $f(x - \bar{x})^2$
60-62       5                    61              -6.45           41.6025             208.0125
63-65       18                   64              -3.45           11.9025             214.2450
66-68       42                   67              -0.45           0.2025              8.5050
69-71       27                   70              2.55            6.5025              175.5675
72-74       8                    73              5.55            30.8025             246.4200
Total       100                                                                      852.7500

Variance $= \sum f(x - \bar{x})^2 / \sum f = 852.75/100 = 8.5275$
Standard deviation $= \sqrt{\text{variance}} = \sqrt{8.5275} = 2.92$ kg

Alternatively, we can use the coding method.

Example. Use the coding method to find the standard deviation of the same data set.

Mass (kg)   No. of students, f   Class mark, x   u = (x − 67)/3   fu    $fu^2$
60-62       5                    61              -2               -10   20
63-65       18                   64              -1               -18   18
66-68       42                   67              0                0     0
69-71       27                   70              1                27    27
72-74       8                    73              2                16    32
Total       100                                                   15    97

Again, recall that $\bar{u} = 15/100 = 0.15$ and that we already obtained the mean $\bar{x} = 67.45$ kg.
Thus, variance $= c^2\left[\left(\sum fu^2 / \sum f\right) - \bar{u}^2\right] = 9\left[97/100 - 0.15^2\right] = 8.5275$, so the standard deviation is $\sqrt{8.5275} = 2.92$ kg, as before.
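
The coding-method variance can be checked with the sketch below. The function name `coded_variance` is our own, and we take a = 67 and c = 3, which are the values that the u column of the table corresponds to.

```python
from math import sqrt

# Masses example: class marks and frequencies.
class_marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]


def coded_variance(class_marks, freqs, a, c):
    """var(x) = c^2 * (sum(f*u^2)/n - u_bar^2), where u = (x - a) / c."""
    n = sum(freqs)
    u = [(x - a) / c for x in class_marks]
    u_bar = sum(f * ui for f, ui in zip(freqs, u)) / n
    var_u = sum(f * ui * ui for f, ui in zip(freqs, u)) / n - u_bar ** 2
    return c * c * var_u


var_mass = coded_variance(class_marks, freqs, a=67, c=3)
print(round(var_mass, 4))        # 8.5275
print(round(sqrt(var_mass), 2))  # 2.92
```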
5.3.2. Mean absolute deviation (MAD)
The mean absolute deviation, also known as the mean deviation, is defined by $\sum f|x - \bar{x}| / \sum f$ or $\sum f|x - \bar{x}| / N$,
where $|x - \bar{x}|$ is the absolute difference between the value $x$ and the mean $\bar{x}$.
The function $|x|$, or absolute value of $x$, is defined by
$|x| = x$ if $x \ge 0$
$|x| = -x$ if $x < 0$
For instance, $|-56| = 56$, $|9| = 9$ and $|-3.8| = 3.8$.

Example. Given the data in Example 3 (the masses of the 100 students), obtain the MAD for the data.
Solution:

Mass (kg)   No. of students, f   Class mark, x   $x - \bar{x}$   $|x - \bar{x}|$   $f|x - \bar{x}|$
60-62       5                    61              -6.45           6.45              32.25
63-65       18                   64              -3.45           3.45              62.10
66-68       42                   67              -0.45           0.45              18.90
69-71       27                   70              2.55            2.55              68.85
72-74       8                    73              5.55            5.55              44.40
Total       $\sum f = 100$                                                         226.50

Hence, MAD $= 226.5/100 = 2.265 \approx 2.27$ kg
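
A short Python sketch of the MAD calculation is given below as a way of checking each product in the last column. The function name `grouped_mad` is our own.

```python
# Masses example: class marks and frequencies.
class_marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]


def grouped_mad(class_marks, freqs):
    """Mean absolute deviation: sum(f * |x - mean|) / sum(f)."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, class_marks)) / n
    return sum(f * abs(x - mean) for f, x in zip(freqs, class_marks)) / n


mad_mass = grouped_mad(class_marks, freqs)
print(round(mad_mass, 3))  # 2.265
```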

5.4. Measures of location/Position
5.4.1. Quartiles and Percentiles
We have already discussed how to find the median of grouped data. The process of obtaining the quartiles and percentiles of grouped data is quite similar to what we have seen with the median. For instance, if we are interested in the 1st quartile, then instead of using $N \times 2/4 = N/2$ we use $N \times 1/4 = N/4$, and the rest of the procedure is the same as for the median. Remember, the median is the 2nd quartile. Similarly, for the percentiles, we work in hundredths of $N$. For instance, the 10th percentile uses $N \times 10/100 = N/10$, and the 1st percentile uses $N \times 1/100 = N/100$.
Thus,
$Q_1$ is the $\frac{1}{4}n$th value
$Q_2$ is the $\frac{2}{4}n$th value
$Q_3$ is the $\frac{3}{4}n$th value
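
The median procedure generalizes directly, as the sketch below illustrates using the ages data from Section 5.2.3. The function name `grouped_quantile` and its `fraction` parameter are our own; we assume equal class widths and locate the class whose cumulative frequency first reaches the target position.

```python
from itertools import accumulate

# Ages example: lower class boundaries, common width and frequencies.
lower_boundaries = [20, 25, 30, 35, 40, 45]
width = 5
freqs = [2, 14, 29, 43, 33, 9]


def grouped_quantile(lower_boundaries, freqs, width, fraction):
    """Interpolated value at position fraction * N (0 < fraction <= 1)."""
    cum = list(accumulate(freqs))
    target = fraction * cum[-1]                 # e.g. N/4 for Q1, N/10 for P10
    k = next(i for i, cf in enumerate(cum) if cf >= target)
    prev_cf = cum[k - 1] if k > 0 else 0        # CF of the class just before
    return lower_boundaries[k] + (target - prev_cf) / freqs[k] * width


q1 = grouped_quantile(lower_boundaries, freqs, width, 1 / 4)
q2 = grouped_quantile(lower_boundaries, freqs, width, 2 / 4)
q3 = grouped_quantile(lower_boundaries, freqs, width, 3 / 4)
p10 = grouped_quantile(lower_boundaries, freqs, width, 10 / 100)
print(round(q1, 2), round(q2, 2), round(q3, 2))  # 32.84 37.33 41.44
```

With `fraction = 2/4` this reproduces the median of 37.33 years obtained earlier.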
Exercise 1. Jua Kali Solicitors monitored the time spent on consultations with a random sample of 120 of their clients. The times spent, to the nearest minute, are summarized in the following table.

Time (min)   No. of clients
10-14        2
15-19        5
20-24        17
25-29        33
30-34        27
35-44        25
45-59        7
60-89        3
90-119       1

(a) Obtain the estimates of the median and quartiles of this distribution.
(b) Comment on the skewness of the distribution.

5.5. Combining sets of Data
There are some instances where we have been given (a) the number of observations, (b) the mean and (c) the standard deviation for each of several data sets, and we need to combine the data.
We then have to find the mean and standard deviation for all the data in the combined set.

Example. The number of errors, $x$, on each of 200 pages of a book was noted and the results summarized as follows:
$\sum x = 920$, $\sum x^2 = 5032$
(a) Calculate the mean and standard deviation of the number of errors per page.
A further 50 pages were added and checked, and it was found that the mean was 4.4 errors with a standard deviation of 2.2 errors.
(b) Find the mean and standard deviation of the number of errors per page for the 250 pages.
Solution:
(a) $\bar{x} = \sum x / n = 920/200 = 4.6$
$s^2 = \sum x^2 / n - \bar{x}^2 = 5032/200 - 4.6^2 = 4$
The mean is 4.6 errors per page and the standard deviation is 2 errors.
(b) For the errors, $y$, on the further 50 pages:
Mean = 4.4
Therefore, $4.4 = \sum y / 50$, which implies that $\sum y = 4.4 \times 50 = 220$.
The standard deviation = 2.2
Meaning, $2.2^2 = \sum y^2 / 50 - 4.4^2$, implying $\sum y^2 = 50(2.2^2 + 4.4^2) = 1210$.
For the combined set of 250 pages:
Total number of errors $= \sum x + \sum y = 920 + 220 = 1140$
Mean $= 1140/250 = 4.56$ and
$(\text{standard deviation})^2 = \left(\sum x^2 + \sum y^2\right)/250 - 4.56^2 = \frac{5032 + 1210}{250} - 4.56^2 = 4.1744$
Standard deviation $= \sqrt{4.1744} = 2.04$ (3 s.f.)
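
The working above can be sketched in Python as follows. The helper names `sums_from_summary` and `combine` are our own; both use the population form of the variance, as in the solution.

```python
from math import sqrt


def sums_from_summary(n, mean, sd):
    """Recover (sum of y, sum of y^2) from n, the mean and the population sd."""
    return n * mean, n * (sd ** 2 + mean ** 2)


def combine(n1, sum_x, sum_x2, n2, sum_y, sum_y2):
    """Mean and population sd of the two data sets pooled together."""
    n = n1 + n2
    mean = (sum_x + sum_y) / n
    variance = (sum_x2 + sum_y2) / n - mean ** 2
    return mean, sqrt(variance)


# First 200 pages: sums given directly. Further 50 pages: mean 4.4, sd 2.2.
sum_y, sum_y2 = sums_from_summary(50, 4.4, 2.2)
mean_all, sd_all = combine(200, 920, 5032, 50, sum_y, sum_y2)
print(round(mean_all, 2), round(sd_all, 2))  # 4.56 2.04
```

Note that the combined standard deviation is not a simple average of the two standard deviations; the sums $\sum x^2$ and $\sum y^2$ have to be recovered first.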


Exercise 2. Cartons of orange juice are advertised as containing 1 litre. A random sample of 100 cartons gave the following results for the volume, $x$:
$\sum x = 101.4$, $\sum x^2 = 102.83$
Calculate the mean and standard deviation of the volume of orange juice in these 100 cartons.

5.6. Summary
The relationship between the mean, median and mode is as follows. For a moderately skewed distribution, the median lies between the mean and the mode, about twice as far from the mode as from the mean. Hence the relationship median − mode = 2(mean − median) is approximately true.
We can therefore express the following relationships:
• median = (2(mean) + mode)/3
• mode = 3(median) − 2(mean)
• mean = (3(median) − mode)/2
To comment on the skewness of the distribution of a data set, we may use the quartile coefficient of skewness, given by $\frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1}$.
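
Both summary formulas are easy to experiment with, as in the sketch below. The function names are our own and the quartile values 10, 14 and 20 are hypothetical, chosen only to illustrate a positive coefficient. The median and mode passed to the second function are the estimates obtained for the ages data in Sections 5.2.3 and 5.2.4.

```python
def quartile_skewness(q1, q2, q3):
    """Quartile coefficient of skewness: ((Q3 - Q2) - (Q2 - Q1)) / (Q3 - Q1)."""
    return ((q3 - q2) - (q2 - q1)) / (q3 - q1)


def mean_from_median_mode(median, mode):
    """Approximate mean from the empirical relation mean = (3*median - mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical quartiles: the upper half is more spread out, so the skew is positive.
print(quartile_skewness(10, 14, 20))  # 0.2

# Ages example: median 37.33 and mode 37.92, as estimated earlier.
approx_mean = mean_from_median_mode(37.33, 37.92)
print(round(approx_mean, 1))  # 37.0
```

A coefficient of 0 indicates a symmetric arrangement of the quartiles, a positive value a positive skew and a negative value a negative skew.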


Learning Activities
• Briefly show how you can use the coding method to obtain the mean and standard deviation of a simple frequency distribution table.
• With relevant examples, discuss the interpolation method of finding the median and mode of grouped frequency data.
• With relevant examples, briefly discuss how we can compute the quartiles and percentiles for a grouped frequency table.

Solutions to Exercises
Exercise 1. For grouped continuous data with $n = 120$,
$Q_1$ is the $\frac{1}{4}n$th value, i.e. the 30th value,
$Q_2$ is the $\frac{2}{4}n$th value, i.e. the 60th value,
$Q_3$ is the $\frac{3}{4}n$th value, i.e. the 90th value.
Using class boundaries and cumulative frequencies, our table should now look like this:

Time (min)   CF
9.5-14.5     2
14.5-19.5    7
19.5-24.5    24
24.5-29.5    57
29.5-34.5    84
34.5-44.5    109
44.5-59.5    116
59.5-89.5    119
89.5-119.5   120

Solution (a)
$Q_1$ lies in the interval 24.5-29.5 (width = 5).
There are 33 items in this interval, and 24 items below it.
So $Q_1 = 24.5 + \frac{30 - 24}{33} \times 5 = 25.4$ min
$Q_2$ lies in the interval 29.5-34.5 (width = 5).
There are 27 items in this interval, and 57 items below it.
So $Q_2 = 29.5 + \frac{60 - 57}{27} \times 5 = 30.1$ min
$Q_3$ lies in the interval 34.5-44.5 (width = 10).
There are 25 items in this interval, and 84 items below it.
So $Q_3 = 34.5 + \frac{90 - 84}{25} \times 10 = 36.9$ min
This is an application of the formula $\text{median} = L_M + \left(\frac{N/2 - F_{M-1}}{f_M}\right) C_M$, with $N/2$ replaced by $N/4$ for $Q_1$ and by $3N/4$ for $Q_3$.
Solution (b)
$Q_3 - Q_2 = 36.9 - 30.1 = 6.8$ min, $Q_2 - Q_1 = 30.1 - 25.4 = 4.7$ min
Since $Q_3 - Q_2 > Q_2 - Q_1$, the distribution has a positive skew.

Exercise 2. $\sum x = 101.4$, $\sum x^2 = 102.83$, $n = 100$
Therefore, $\bar{x} = \sum x / n = 101.4/100 = 1.014$
So the mean volume is 1.014 litres.
$s = \sqrt{\sum x^2 / n - \bar{x}^2} = \sqrt{102.83/100 - 1.014^2} \approx 0.0102$
Thus, the standard deviation is 0.010 litres (2 s.f.)