Chapter 3: Descriptive Statistics
Outline
Data Collection
Methods of Data Presentation
Data Collection: the process of measuring, counting, gathering, and
assembling (ordering) the raw data for a statistical investigation.
Data Presentation: the process of organizing collected and edited data
into a readily comprehensible, condensed form that helps in drawing
inferences, interpretations, and conclusions.
Data can be presented in three ways:
Tabular (table) presentation,
Diagrammatic presentation (chart, bar diagram…), and
Graphic presentation (ogive, bar graph, histogram…).
For proper data presentation, it is necessary to classify the data.
Categorical Frequency Distribution: it organizes data that can be
placed in specific categories, such as nominal or ordinal data, e.g.
marital status.
The marital status of 20 workers is recorded as follows:
M S S D S
S W D S M
W W D D S
W M M S S
Since the data are categorical, the discrete classes of marital
status in the distribution are M, S, D, and W.
The procedure used to construct the frequency distribution is:
List the data and count the frequency of each class.
Find the percentage of values in each class.
Class  Frequency  Percent (%)
M      4          20
S      8          40
D      4          20
W      4          20
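The two steps above can be sketched in Python. This is only an illustration and not part of the original slides: the variable names are assumptions, and the counting relies on `collections.Counter` from the standard library.

```python
from collections import Counter

# Marital status of the 20 workers, copied from the data above.
statuses = (
    "M S S D S "
    "S W D S M "
    "W W D D S "
    "W M M S S"
).split()

# Step 1: list the data and count the frequency of each class.
frequency = Counter(statuses)

# Step 2: find the percentage of values in each class.
total = len(statuses)
percent = {cls: 100 * count / total for cls, count in frequency.items()}

for cls in ("M", "S", "D", "W"):
    print(f"{cls}  {frequency[cls]:>2}  {percent[cls]:.0f}%")
```

The percentages should add up to 100, which is a quick check that no observation was missed.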
Ungrouped Frequency Distribution: a distribution in which each
distinct value that actually occurs in the raw data is listed with its
frequency, without classifying the data into groups.
It is often constructed for a small data set of a discrete variable.
The marks of 20 students are recorded as follows:
80 76 90 85 80
70 60 62 70 85
65 60 90 74 75
76 70 70 80 85
Construct an ungrouped frequency distribution.
Class Frequency Class Frequency
60 2 75 1
62 1 76 2
65 1 80 3
70 4 85 3
74 1 90 2
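The counts for the 20 marks can be checked with a short Python sketch. It is an illustration only: the names `marks` and `ungrouped` are assumptions, and each distinct mark is treated as its own class.

```python
from collections import Counter

# Marks of the 20 students, copied from the data above.
marks = [
    80, 76, 90, 85, 80,
    70, 60, 62, 70, 85,
    65, 60, 90, 74, 75,
    76, 70, 70, 80, 85,
]

# Each distinct mark is its own class; sort the classes for presentation.
ungrouped = sorted(Counter(marks).items())

for mark, count in ungrouped:
    print(mark, count)
```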
Grouped Frequency Distribution: an organization of the data in
which the data are classified into non-overlapping intervals called
classes, with the number of observations in each class recorded as
its frequency.
It summarizes the data in a condensed form that can be readily comprehended.
Some Statistical Terms in Grouped Frequency Distribution
Units of Measurement (U): the distance between two possible
consecutive measures.
It is usually taken as 1, 0.1, 0.01, 0.001, etc.
Class Boundaries: the true class limits, defined so that no gap exists
between classes.
There is no gap between the upper boundary of one class and the lower
boundary of the next class.
The lower class boundary is found by subtracting U/2 from the
corresponding lower class limit, and the upper class boundary is found
by adding U/2 to the corresponding upper class limit.
Class Width/Interval: the difference between the upper and lower
class boundaries of any class.
It also equals the difference between the lower limits of any two
consecutive classes, or the difference between any two
consecutive class marks.
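The three ways of finding the class width can be compared in a short Python sketch. The two classes 6 – 11 and 12 – 17 with U = 1 are assumed purely for illustration, and the class mark is taken as the midpoint of the class limits.

```python
# Two illustrative consecutive classes, 6-11 and 12-17, with U = 1 (assumed).
unit = 1
(lower_1, upper_1), (lower_2, upper_2) = [(6, 11), (12, 17)]

# Way 1: upper class boundary minus lower class boundary of one class.
width_from_boundaries = (upper_1 + unit / 2) - (lower_1 - unit / 2)

# Way 2: difference between the lower limits of two consecutive classes.
width_from_limits = lower_2 - lower_1

# Way 3: difference between two consecutive class marks (midpoints).
mark_1 = (lower_1 + upper_1) / 2
mark_2 = (lower_2 + upper_2) / 2
width_from_marks = mark_2 - mark_1

print(width_from_boundaries, width_from_limits, width_from_marks)
```

All three values agree. Note that the width is not the upper limit minus the lower limit of a class, which would give 5 here.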
Cumulative Frequency Above: the total frequency of all values
greater than or equal to the lower class boundary of a given class.
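Relative Frequency (RF): the frequency of each item or class of items
in the data set as a proportion of the total number of observations.
RCF can be expressed in decimal, fraction, or percentage form.
RF = f / n, where f is the frequency of the class and n is the total
number of observations.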
different classes.
The classes must be all-inclusive; that is, every data value must be
included.
The classes must be continuous until all data set included in the
6. Find the class boundaries by subtracting U/2 units from the lower
limits and adding U/2 units to the upper limits.
The boundaries are also halfway between the upper limit of one class
and the lower limit of the next class.
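Step 6 can be sketched in Python as follows. The helper name `class_boundaries` and the classes 6 – 11 and 12 – 17 are assumptions made for this illustration only.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) of one class.

    `unit` is the unit of measurement U; half of it is subtracted from
    the lower limit and added to the upper limit.
    """
    half_unit = unit / 2
    return lower_limit - half_unit, upper_limit + half_unit


# Illustrative classes 6-11 and 12-17, measured in whole units (U = 1).
first = class_boundaries(6, 11)
second = class_boundaries(12, 17)

# The shared boundary is halfway between upper limit 11 and lower limit 12.
halfway = (11 + 12) / 2

print(first, second, halfway)
```

The upper boundary of the first class and the lower boundary of the second class coincide, so no gap is left between the classes.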
8. The cumulative frequencies below (CF below) can be calculated by
taking the frequency of the first class as its CF, and then adding the
frequency of each consecutive class to the CF of the preceding class to
get the CF of that class.
9. The cumulative frequencies above (CF above) can be calculated by
adding all frequencies to get the CF of the first class, and then
subtracting the frequency of the preceding class from the corresponding
(preceding) cumulative frequency to get the CF above of each remaining
class.
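Steps 8 and 9 can be sketched in Python as follows. The list of class frequencies is an assumed illustration, ordered from the lowest class to the highest, and `itertools.accumulate` from the standard library produces the running totals.

```python
from itertools import accumulate

# Illustrative class frequencies, lowest class first (assumed values).
frequencies = [2, 2, 7, 4, 3, 2]

# Step 8, CF below: the first CF is the first frequency; each later CF
# adds the class frequency to the CF of the preceding class.
cf_below = list(accumulate(frequencies))

# Step 9, CF above: the first CF is the sum of all frequencies; each later
# CF subtracts the preceding class frequency from the preceding CF.
cf_above = [sum(frequencies)]
for preceding_frequency in frequencies[:-1]:
    cf_above.append(cf_above[-1] - preceding_frequency)

print(cf_below)
print(cf_above)
```

The last CF below and the first CF above both equal the total number of observations.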
Construct a grouped frequency distribution for the following data:
11, 29, 6, 33, 14, 31, 22, 27, 19, 20, 18, 17, 22, 38, 23, 21, 26, 34, 39, 27.
Select the starting point for the lowest class limit, here the smallest
data value, and add the class width to it repeatedly to get the
lower limit of each subsequent class until the required 6 classes are reached.
o The lower class limits are 6, 12, 18, 24, 30, 36.
Subtract one unit from the lower limit of the second class to get the
upper limit of the first class, that is, 12 – 1 = 11.
o Then add the class width to each upper limit to get the next
upper limit.
o Therefore, the upper limits of the required classes are 11, 17, 23,
29, 35, 41.
Construct the required classes by pairing the lower class limits
with the upper class limits.
o Therefore, the required class limits are 6 – 11, 12 – 17, 18 – 23,
24 – 29, 30 – 35, and 36 – 41.
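The construction of the class limits can be sketched in Python as follows. The class width of 6 is an assumption read off the spacing of the lower limits above, and the variable names are hypothetical.

```python
smallest = 6      # smallest data value, used as the first lower limit
class_width = 6   # assumed from the spacing of the lower limits (6, 12, ...)
num_classes = 6   # required number of classes
unit = 1          # the data are whole numbers, so U = 1

# Lower limits: start at the smallest value and keep adding the class width.
lower_limits = [smallest + i * class_width for i in range(num_classes)]

# Upper limits: one unit below the lower limit of the next class.
upper_limits = [lower + class_width - unit for lower in lower_limits]

# Pair lower and upper limits to form the classes.
class_limits = list(zip(lower_limits, upper_limits))
print(class_limits)
```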
Find the class boundaries by subtracting U/2 = 0.5 from each lower
class limit to get the lower class boundary and adding U/2 = 0.5 to
each upper class limit to get the upper class boundary.
o Therefore, the lower and upper class boundaries of the first class are
6 – 0.5 = 5.5 and 11 + 0.5 = 11.5, respectively.
o Then continue adding the class width to both boundaries to
obtain the remaining boundaries: 5.5 – 11.5, 11.5 – 17.5, 17.5 – 23.5,
23.5 – 29.5, 29.5 – 35.5, and 35.5 – 41.5.
Count the number of observations that lie within each class and
fill in the frequencies and cumulative frequencies.
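The counting step can be sketched in Python as follows. This is an illustration only: the variable names are assumptions, and the strict comparison is safe here because the data are whole numbers, so no observation falls exactly on a boundary.

```python
# The 20 observations of the example.
data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]

# Class boundaries found above.
boundaries = [(5.5, 11.5), (11.5, 17.5), (17.5, 23.5),
              (23.5, 29.5), (29.5, 35.5), (35.5, 41.5)]

# Count the observations lying between the boundaries of each class.
frequencies = [
    sum(1 for value in data if low < value < high)
    for low, high in boundaries
]

print(frequencies)
```

The frequencies should sum to the number of observations, which confirms that the classes are all-inclusive and do not overlap.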
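Class Limit  Class Boundary  Class Mark  Frequency  C. Freq. (less than)  C. Freq. (more than)  Relative Freq.  Relative C. Freq. (less than)  Relative C. Freq. (more than)
6 – 11       5.5 – 11.5      8.5         2          2                     20                    0.1             0.1                            1
12 – 17      11.5 – 17.5     14.5        2          4                     18                    0.1             0.2                            0.9
18 – 23      17.5 – 23.5     20.5        7          11                    16                    0.35            0.55                           0.8
24 – 29      23.5 – 29.5     26.5        4          15                    9                     0.2             0.75                           0.45
30 – 35      29.5 – 35.5     32.5        3          18                    5                     0.15            0.9                            0.25
36 – 41      35.5 – 41.5     38.5        2          20                    2                     0.1             1                              0.1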
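The class marks and the relative columns for this example can be recomputed with a short Python sketch. It is an illustration under stated assumptions: the class mark is taken as the midpoint of the class limits, the relative values are each count divided by the number of observations, and all names are hypothetical.

```python
from itertools import accumulate

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
class_limits = [(6, 11), (12, 17), (18, 23), (24, 29), (30, 35), (36, 41)]
n = len(data)

# Class mark: midpoint of the lower and upper class limits.
class_marks = [(low + high) / 2 for low, high in class_limits]

# Frequency: number of observations inside each class (limits inclusive).
frequencies = [sum(1 for x in data if low <= x <= high)
               for low, high in class_limits]

# Relative frequency: class frequency divided by the number of observations.
relative = [f / n for f in frequencies]

# Relative cumulative frequencies, "less than" and "more than".
rcf_below = [c / n for c in accumulate(frequencies)]
rcf_above = [c / n for c in accumulate(reversed(frequencies))][::-1]

for (low, high), mark, f, rf, below, above in zip(
        class_limits, class_marks, frequencies, relative, rcf_below, rcf_above):
    print(f"{low:>2} - {high:<2}  {mark:>4}  {f:>2}  "
          f"{rf:.2f}  {below:.2f}  {above:.2f}")
```

The relative frequencies sum to 1, and the two relative cumulative columns end and start at 1, respectively.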