Chapter 5 - Descriptive Statistics
CHAPTER 5
DESCRIPTIVE STATISTICS
________________________________________________________________________
At the end of this chapter, students should be able to:
- construct a histogram, frequency polygon and ogive
- distinguish between ungrouped data and grouped data
- find the mean, median and mode for ungrouped data and grouped data
- calculate the variance and standard deviation for ungrouped data and grouped data
________________________________________________________________________
Statistics is concerned with the collection, ordering and analysis of data. Data consist of sets of
recorded observations or values. Any quantity that can take a number of values is a variable. A
variable may be one of two kinds:
a) Discrete - a variable whose possible values can be counted
b) Continuous - a variable whose values can be measured on a continuous scale.
5.1 Arrangement of Data
Table of Values
A set of data such as
28 31 29 27 30 29 29 26 30 28
28 29 27 26 32 28 32 31 25 30
27 30 29 30 28 29 31 27 28 28
can be arranged in ascending order:
25 26 26 27 27 27 27 28 28 28
28 28 28 28 29 29 29 29 29 29
30 30 30 30 30 31 31 31 32 32
Once the data are in ascending order, they can be entered into a table. The number of
occasions on which any particular value occurs is called the frequency, denoted by f.
Value    Frequency, f
25       1
26       2
27       4
28       7
29       6
30       5
31       3
32       2
Tally Diagrams
When dealing with a large number of readings, instead of writing all the values in ascending order,
it is more convenient to compile a tally diagram, recording the range of values of the variable
and adding a stroke for each occurrence of that reading. Strokes are bundled in fives.
Value    Tally marks
25       /
26       //
27       ////
28       ///// //
29       ///// /
30       /////
31       ///
32       //
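The counting behind a frequency table or tally diagram can be sketched in Python. This is only an illustration, not part of the original notes: the function names `frequency_table` and `tally` are hypothetical, and the readings are the 30 values from the ascending list above.

```python
from collections import Counter

# The 30 readings from the table of values, already in ascending order.
readings = [
    25, 26, 26, 27, 27, 27, 27, 28, 28, 28,
    28, 28, 28, 28, 29, 29, 29, 29, 29, 29,
    30, 30, 30, 30, 30, 31, 31, 31, 32, 32,
]


def frequency_table(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


def tally(frequency):
    """Render a frequency as strokes, bundled in fives."""
    bundles, rest = divmod(frequency, 5)
    groups = ["/////"] * bundles + (["/" * rest] if rest else [])
    return " ".join(groups)


# Print one row per distinct value: value, tally marks, frequency.
for value, f in frequency_table(readings):
    print(f"{value:>5}  {tally(f):<10}  {f}")
```

The frequencies should add up to the number of readings, which is a quick check that no value was missed.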
Grouped Data
If the range of values of the variable is large, it is often helpful to consider these values arranged
in regular groups or classes.
Hours      Tally marks            Frequency, f
10 - 19    //                     2
20 - 29    ////                   4
30 - 39    ///// ///// ////       14
40 - 49    ///// ///// ///// /    16
50 - 59    ///// ///              8
60 - 69    ///// /                6
                                  n = Σf = 50
Class Boundaries
A class (or group) boundary lies midway between the upper limit of one class and the lower limit
of the next. For example, consider the classes
7.1 - 7.3
7.4 - 7.6
The class values 7.1 and 7.4 are the lower limits of the classes.
The class values 7.3 and 7.6 are the upper limits of the classes.
The class width is the difference between the lower and upper limits of a class.
The class boundaries are 0.05 below the lower limit and 0.05 above the upper limit, because
$\frac{7.4 - 7.3}{2} = 0.05$.
The class interval is the difference between the lower and upper class boundaries.
The central value (or midpoint) of the class interval is one half of the sum of the lower and
upper limits; for the class 7.4 - 7.6 it is $\frac{7.4 + 7.6}{2} = 7.5$.
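For the class 7.4 - 7.6:
Lower class boundary    7.35
Lower limit             7.4
Central value           7.5
Upper limit             7.6
Upper class boundary    7.65
The class width runs from the lower limit to the upper limit; the class interval runs from the
lower class boundary to the upper class boundary.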
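The quantities defined above can be computed for any class once the gap between neighbouring classes is known. The sketch below is illustrative only: `class_summary` is a hypothetical helper, and the gap of 0.1 is taken from the example (7.4 - 7.3). Results are rounded to suppress floating-point noise.

```python
def class_summary(lower_limit, upper_limit, gap):
    """Boundaries, central value, width and interval of one class.

    gap is the distance between the upper limit of one class and the
    lower limit of the next (0.1 for the classes 7.1 - 7.3, 7.4 - 7.6).
    """
    half_gap = gap / 2
    lower_boundary = lower_limit - half_gap
    upper_boundary = upper_limit + half_gap
    return {
        "lower_boundary": round(lower_boundary, 6),
        "upper_boundary": round(upper_boundary, 6),
        # Midpoint: half the SUM of the two limits.
        "central_value": round((lower_limit + upper_limit) / 2, 6),
        # Class width as used in this chapter: between the limits.
        "class_width": round(upper_limit - lower_limit, 6),
        # Class interval: between the boundaries.
        "class_interval": round(upper_boundary - lower_boundary, 6),
    }


summary = class_summary(7.4, 7.6, gap=0.1)
print(summary)
```

For the class 7.4 - 7.6 this reproduces the boundaries 7.35 and 7.65 and the central value 7.5 from the example.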
Histogram
A histogram is a graphical representation of a frequency distribution in which vertical
rectangular blocks are drawn so that:
- the centre of the base indicates the central value of the class, and
- the area of the rectangle represents the class frequency.
For example, the measurement of the lengths of 50 brass rods gave the following frequency
distribution:
Length (mm), x    Lower class boundary    Upper class boundary    Centre value    Frequency, f
3.45 - 3.47       3.445                   3.475                   3.460           2
3.48 - 3.50       3.475                   3.505                   3.490           6
3.51 - 3.53       3.505                   3.535                   3.520           12
3.54 - 3.56       3.535                   3.565                   3.550           14
3.57 - 3.59       3.565                   3.595                   3.580           10
3.60 - 3.62       3.595                   3.625                   3.610           5
3.63 - 3.65       3.625                   3.655                   3.640           1
[Figure: Histogram/Frequency Polygon. Frequency (0 to 16) against Length (mm), with the class centre values 3.460 to 3.640 on the horizontal axis.]
[Figure: Ogive. Cumulative frequency (0 to 60) against Length (mm), with the values 3.460 to 3.640 on the horizontal axis.]
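An ogive plots running totals of the class frequencies. The sketch below computes those totals for the brass-rod data. Pairing each total with the upper class boundary is one common convention and is an assumption here; the variable names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from the brass-rod table.
upper_boundaries = [3.475, 3.505, 3.535, 3.565, 3.595, 3.625, 3.655]
frequencies = [2, 6, 12, 14, 10, 5, 1]

# Running total of the frequencies gives the cumulative frequency.
cumulative = list(accumulate(frequencies))

# Points (upper boundary, cumulative frequency) to plot for the ogive.
ogive_points = list(zip(upper_boundaries, cumulative))

for boundary, cf in ogive_points:
    print(f"{boundary:.3f}  {cf}")
```

The last cumulative frequency equals the total number of rods, 50.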
5.2 Data Description
This section discusses measures of central tendency (mean, median and mode) and measures of
dispersion (variance and standard deviation) for ungrouped data and grouped data.
5.2.1 Ungrouped Data
Mean, $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$.
Mode is the data value that occurs most frequently.
Median is the middle score in ranked data.
Variance, $\sigma^2 = \frac{\sum_{i=1}^{n} (\bar{x} - x_i)^2}{n}$.
Standard deviation, $\sigma = \sqrt{\text{variance}}$.
Example 5.1(a): Find the mean, mode and median for the set of data
2 4 6 8 10
Solution:
Mean, $\bar{x} = \frac{2 + 4 + 6 + 8 + 10}{5} = \frac{30}{5} = 6$
No mode.
Median is 6
Example 5.1(b): Find the mean, mode, median, variance and standard deviation for the set of data
1 2 2 2 3 4 5 5 5 5
Solution:
Mean, $\bar{x} = \frac{1 + 2 + 2 + 2 + 3 + 4 + 5 + 5 + 5 + 5}{10} = \frac{34}{10} = 3.4$
Mode is 5
Median is $\frac{3 + 4}{2} = 3.5$
Variance, $\sigma^2 = \frac{(3.4-1)^2 + (3.4-2)^2 + (3.4-2)^2 + (3.4-2)^2 + (3.4-3)^2 + (3.4-4)^2 + (3.4-5)^2 + (3.4-5)^2 + (3.4-5)^2 + (3.4-5)^2}{10} = \frac{22.4}{10} = 2.24$
Standard deviation, $\sigma = \sqrt{2.24} \approx 1.497$
Example 5.1(c): Find the mean, mode, median, variance and standard deviation for the set of data
1 1 1 2 3 4 5 5 5
Solution:
Mean, $\bar{x} = \frac{1 + 1 + 1 + 2 + 3 + 4 + 5 + 5 + 5}{9} = \frac{27}{9} = 3$
Modes are 1 and 5
Median is 3
Variance, $\sigma^2 = \frac{(3-1)^2 + (3-1)^2 + (3-1)^2 + (3-2)^2 + (3-3)^2 + (3-4)^2 + (3-5)^2 + (3-5)^2 + (3-5)^2}{9} = \frac{26}{9} \approx 2.889$
Standard deviation, $\sigma = \sqrt{2.889} \approx 1.700$
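The calculations in Examples 5.1(a) to 5.1(c) can be checked with a short script. This is a minimal sketch, not the notes' own method. The function names are illustrative, the variance divides by n as in the formula of this section, and returning an empty list when every value occurs equally often is an assumption made to match the "No mode" answer of Example 5.1(a).

```python
from collections import Counter
from math import sqrt


def mean(data):
    return sum(data) / len(data)


def median(data):
    """Middle value of the ranked data (average of the two middle values if n is even)."""
    ordered = sorted(data)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(data):
    """Most frequent values; empty list if all distinct values are equally frequent."""
    counts = Counter(data)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # treated as "no mode", as in Example 5.1(a)
    return sorted(v for v, c in counts.items() if c == highest)


def variance(data):
    """Sum of squared deviations from the mean, divided by n."""
    m = mean(data)
    return sum((m - x) ** 2 for x in data) / len(data)


def standard_deviation(data):
    return sqrt(variance(data))


data_b = [1, 2, 2, 2, 3, 4, 5, 5, 5, 5]  # Example 5.1(b)
print(mean(data_b), median(data_b), modes(data_b))
print(round(variance(data_b), 3), round(standard_deviation(data_b), 3))
```

The same functions can be applied to the data of Example 5.1(d) to check a hand calculation.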
Example 5.1(d): Find the mean, mode, median, variance and standard deviation for the set of data
11 12 14 14 16 18 19 20 21 25
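5.2.2 Grouped Data
Notation:
$f_i$            frequency of the i-th class
$f_m$            frequency of the median class
$f_{Mod}$        frequency of the mode class
$f_{Mod-1}$      frequency of the class before the mode class
$f_{Mod+1}$      frequency of the class after the mode class
$f_c$            cumulative frequency before the median class
$x_i$            midpoint of the i-th class
$n = \sum_{i=1}^{k} f_i$    total frequency, where k is the number of classes
$LB_{Mod}$       lower boundary of the mode class
$LB_{Med}$       lower boundary of the median class
$c$              class size (difference between the upper and lower class boundaries)
$\Delta_1 = f_{Mod} - f_{Mod-1}$, $\Delta_2 = f_{Mod} - f_{Mod+1}$

Mean, $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$
Mode $= LB_{Mod} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c$
Median $= LB_{Med} + \left( \frac{\frac{n}{2} - f_c}{f_m} \right) c$
Variance, $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - (\bar{x})^2$
Standard deviation, $\sigma = \sqrt{\text{variance}}$.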
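The grouped-data variance is usually computed in the form that subtracts the squared mean, since it needs only the column totals of a table. As a brief sketch of why the deviation form and this computational form agree, expand the square and use $\sum f_i = n$ and $\sum f_i x_i = n\bar{x}$:

```latex
\begin{aligned}
\sigma^2 &= \frac{1}{n}\sum_{i=1}^{k} f_i (x_i - \bar{x})^2 \\
         &= \frac{1}{n}\sum_{i=1}^{k} f_i x_i^2
            - 2\bar{x}\,\frac{1}{n}\sum_{i=1}^{k} f_i x_i
            + \bar{x}^2\,\frac{1}{n}\sum_{i=1}^{k} f_i \\
         &= \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - 2\bar{x}^2 + \bar{x}^2 \\
         &= \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - \bar{x}^2 .
\end{aligned}
```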
Example 5.2(a): The table below shows a frequency distribution for the weights of 50 fish caught in
fish farms at UPMKB. The weights are given correct to the nearest kg. Find the mean, median,
mode, variance and standard deviation of the weights of the fish.
Weight (kg)    Frequency, f
1 - 5          8
6 - 10         15
11 - 15        20
16 - 20        5
21 - 25        2
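Solution:
Weight (kg)   Frequency, f   Midpoint, x   fx     x^2    fx^2    Lower boundary   Cumulative frequency
1 - 5         8              3             24     9      72      0.5              8
6 - 10        15             8             120    64     960     5.5              23
11 - 15       20             13            260    169    3380    10.5             43
16 - 20       5              18            90     324    1620    15.5             48
21 - 25       2              23            46     529    1058    20.5             50
Totals: $n = \sum f = 50$, $\sum fx = 540$, $\sum fx^2 = 7090$

Mean, $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n} = \frac{540}{50} = 10.80$ kg
Mode $= LB_{Mod} + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) c = 10.5 + \left( \frac{5}{5 + 15} \right) 5 = 11.75$ kg
Median $= LB_{Med} + \left( \frac{\frac{n}{2} - f_c}{f_m} \right) c = 10.5 + \left( \frac{25 - 23}{20} \right) 5 = 11.00$ kg
Variance, $\sigma^2 = \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - (\bar{x})^2 = \frac{7090}{50} - (10.8)^2 = 25.16$
Standard deviation, $\sigma = \sqrt{\text{variance}} = \sqrt{25.16} \approx 5.02$.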
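The solution of Example 5.2(a) can be reproduced in code as a check on the arithmetic. The sketch below is illustrative rather than a general-purpose implementation: `grouped_statistics` is a hypothetical name, and it assumes at least two classes, equal class sizes and a single class with the highest frequency.

```python
from math import sqrt


def grouped_statistics(lower_limits, upper_limits, frequencies):
    """Mean, mode, median, variance and standard deviation of grouped data."""
    n = sum(frequencies)
    # Boundaries lie half the gap between neighbouring classes outside the limits.
    half_gap = (lower_limits[1] - upper_limits[0]) / 2
    lower_boundaries = [lo - half_gap for lo in lower_limits]
    c = (upper_limits[0] + half_gap) - lower_boundaries[0]  # class size
    midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2

    # Mode class: the class with the highest frequency.
    m = frequencies.index(max(frequencies))
    f_before = frequencies[m - 1] if m > 0 else 0
    f_after = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    delta1 = frequencies[m] - f_before
    delta2 = frequencies[m] - f_after
    mode = lower_boundaries[m] + delta1 / (delta1 + delta2) * c

    # Median class: first class whose cumulative frequency reaches n/2.
    f_c = 0  # cumulative frequency before the median class
    for i, f in enumerate(frequencies):
        if f_c + f >= n / 2:
            median = lower_boundaries[i] + (n / 2 - f_c) / f * c
            break
        f_c += f

    return {
        "mean": mean,
        "mode": mode,
        "median": median,
        "variance": variance,
        "standard_deviation": sqrt(variance),
    }


# Data of Example 5.2(a): weights of 50 fish, in kg.
result = grouped_statistics(
    lower_limits=[1, 6, 11, 16, 21],
    upper_limits=[5, 10, 15, 20, 25],
    frequencies=[8, 15, 20, 5, 2],
)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The function takes class limits rather than boundaries so that the input matches the way the frequency tables in this chapter are written.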
Example 5.2(b): The table below shows a frequency distribution for the marks of 100 students in
UPMKB. Find the mean, median, mode, variance and standard deviation of the marks.
Marks      Frequency, f
0 - 19     6
20 - 39    14
40 - 59    31
60 - 79    42
80 - 99    7
Exercise 5:
1. Find mean, mode, median, variance and standard deviation for the following set of data
a) 1, 2, 3, 4, 5
b) 11, 13, 13, 13, 15, 15, 16, 18
c) 3, 5, 9, 15, 17, 17, 23, 25, 29, 29, 31
2. The table below shows a frequency distribution for the heights of papaya trees. Find the mean,
median, mode, variance and standard deviation.
Height (m)    Frequency, f
1.0 - 1.2     4
1.3 - 1.5     8
1.6 - 1.8     16
1.9 - 2.0     22
2.1 - 2.3     15
2.4 - 2.6     5
3. The table below shows a frequency distribution for volumes of water. Find the mean, median,
mode, variance and standard deviation.
Volume (ml)    Frequency, f
1 - 100        2
101 - 200      5
201 - 300      14
301 - 400      19
401 - 500      26
501 - 600      25
601 - 700      16
701 - 800      13