Chapter 3: Descriptive Statistics

Uploaded by

bantaleme

Chapter Three

Data Collection and Presentation

Outline
• Data Collection
• Methods of Data Presentation
  o Tabular Methods of Data Presentation
    • Frequency Distributions
  o Graphic Methods of Data Presentation
  o Diagrammatic Methods of Data Presentation

Data Collection: the process of measuring, counting, gathering, and
assembling (ordering) raw data for statistical investigation.
Data Presentation: the organization of collected and edited data into a
condensed, readily comprehensible form that helps in drawing
inferences, interpretations, and conclusions.
• Data can be presented in three ways:
  • Tabular (table) presentation,
  • Diagrammatic presentation (chart, bar diagram, ...), and
  • Graphic presentation (ogive, bar graph, histogram, ...).

• For proper data presentation, it is necessary to classify the data.

Classification: the process of systematically arranging data into
classes or categories on the basis of their similarities.
• In the classification of data, the important terms are:

Raw Data: recorded or collected information in its original form.

Frequency: the number of times a value occurs in a specific class of
the distribution.
• It is the number of times each observation is repeated in the data set.

Frequency Distribution: the organization of raw data in tabular form
using classes and frequencies.
Percentage Frequency Distribution: a tabular summary of the data
showing the percentage of the total number of items in each of
several non-overlapping classes.

Probability Distribution: a description of how probability is
distributed over the values of a random variable.
• There are three basic types of frequency distributions:
  • Categorical frequency distribution
  • Ungrouped frequency distribution
  • Grouped frequency distribution

Categorical Frequency Distribution: organizes data that can be placed
in specific categories, such as nominal or ordinal data, e.g. marital
status.
• The marital status of 20 workers can be presented as:

M S S D S
S W D S M
W W D D S
W M M S S

• Since the data are categorical, the classes of marital status in the
distribution are M, S, D, and W.
• The procedure used to construct the frequency distribution is:
  • List the data and count the frequency of each class.
  • Find the percentage of values in each class:
    $\% = \frac{f}{n} \times 100$, where f is the frequency of the class and n is the total number of observations.
  • Construct the categorical frequency distribution:

Class/Category   Frequency   Percent
M                4           20
S                8           40
D                4           20
W                4           20
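The counting and percentage steps can be sketched in Python as below. This is only an illustration, not part of the original slides; the function name `categorical_distribution` is hypothetical, and the data are the 20 marital status codes listed above.

```python
from collections import Counter

# Marital status codes of the 20 workers, copied from the slide.
statuses = "M S S D S S W D S M W W D D S W M M S S".split()

def categorical_distribution(values):
    """Return {category: (frequency, percent)} where percent = f / n * 100."""
    counts = Counter(values)          # frequency f of each category
    n = len(values)                   # total number of observations
    return {cat: (f, 100 * f / n) for cat, f in counts.items()}

table = categorical_distribution(statuses)
for cat in ["M", "S", "D", "W"]:
    f, pct = table[cat]
    print(f"{cat}  {f:>2}  {pct:.0f}")
```

`Counter` does the tallying in one pass, so the same function should work for any list of category labels.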
Ungrouped Frequency Distribution: a distribution in which each
distinct value that actually occurs in the raw data is listed
separately with its frequency, without classifying the data into groups.
• It is often constructed for a small data set of a discrete variable.
• The steps to construct an ungrouped frequency distribution are:
  • Identify the smallest and largest raw scores in the collected data
    and compute the range.
  • Arrange the data in order of magnitude (ascending order) and
    count the frequency of each value.

• The marks of 20 students can be presented as:

80 76 90 85 80
70 60 62 70 85
65 60 90 74 75
76 70 70 80 85

• Construct the ungrouped frequency distribution:

Step 1: Put the data in ascending order.

Step 2: Count the frequency of each value.

Step 3: Construct the distribution table.

Class   Frequency     Class   Frequency
60      2             75      1
62      1             76      2
65      1             80      3
70      4             85      3
74      1             90      2

• Because each individual value is presented separately, this is an
ungrouped frequency distribution.

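As a rough cross-check, the three steps can be sketched in Python by tallying the 20 marks directly from the raw data. The function name `ungrouped_distribution` is hypothetical and used only for illustration.

```python
from collections import Counter

# Marks of the 20 students, copied from the slide.
marks = [80, 76, 90, 85, 80,
         70, 60, 62, 70, 85,
         65, 60, 90, 74, 75,
         76, 70, 70, 80, 85]

def ungrouped_distribution(values):
    """Return (value, frequency) pairs sorted in ascending order of value."""
    counts = Counter(values)          # Step 2: count each distinct value
    return sorted(counts.items())     # Step 1: ascending order

data_range = max(marks) - min(marks)  # largest score minus smallest score
distribution = ungrouped_distribution(marks)

# Step 3: print the distribution table.
for value, frequency in distribution:
    print(f"{value:>3}  {frequency}")
```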
Grouped Frequency Distribution: an organization of the data in which
the data are classified into non-overlapping intervals called classes,
and the number of observations in each class, called the frequency,
is recorded.
• It summarizes the data in a condensed form that can be readily
understood and easily interpreted.
• It is usually used when the range of the data (or the number of
observations) is large, so that the data must be grouped into classes
that are more than one unit wide.

Some Statistical Terms in Grouped Frequency Distribution

Grouped Frequency Distribution: a frequency distribution in which
several values are grouped into one class.

Class: each category of the frequency distribution.

Frequency: the number of data values falling within each class.
• A class may also cover values that do not actually appear in the
data set.

Total Frequency: the sum of all class frequencies.

Class Limits: the values that separate one class in a grouped
frequency distribution from another.
• The limits can actually appear in the data, and there is a gap between
the upper limit of one class and the lower limit of the next. The limits
have the same number of decimal places as the data values.

Units of Measurement (U): the distance between two possible
consecutive measurements.
• It is usually taken as 1, 0.1, 0.01, 0.001, etc.

Class Boundaries: the true class limits, defined so that no gap exists
between classes.
• There is no gap between the upper boundary of one class and the lower
boundary of the next class.
• The lower class boundary is found by subtracting U/2 from the
corresponding lower class limit, and the upper class boundary is found
by adding U/2 to the corresponding upper class limit.
Class Width (Interval): the difference between the upper and lower
class boundaries of any class.
• It also equals the difference between the lower limits of any two
consecutive classes, or the difference between any two
consecutive class marks.

Class Mark (Midpoint): the average of the lower and upper class
limits, or equivalently the average of the lower and upper class
boundaries.

Cumulative Frequency: the number of observations less than or equal
to (or greater than or equal to) a specific value.

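The class boundary, class width, and class mark definitions above can be illustrated with a short Python sketch. The function name `class_summary` is hypothetical, and the class limits 6 and 11 with U = 1 are chosen purely for illustration.

```python
def class_summary(lower_limit, upper_limit, unit=1):
    """Derive boundaries, width, and mark of one class from its limits.

    unit is the unit of measurement U (1, 0.1, 0.01, ...).
    """
    half = unit / 2
    lower_boundary = lower_limit - half        # lower limit - U/2
    upper_boundary = upper_limit + half        # upper limit + U/2
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,
        "mark": (lower_limit + upper_limit) / 2,
    }

summary = class_summary(6, 11, unit=1)
print(summary)
```

Averaging the two boundaries instead of the two limits gives the same class mark, because the U/2 adjustments cancel.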
Cumulative Frequency Above: the total frequency of all values
greater than or equal to the lower class boundary of a given class.

Cumulative Frequency Below: the total frequency of all values
less than or equal to the upper class boundary of a given class.

Cumulative Frequency Distribution (CFD): the tabular arrangement
of class intervals together with their corresponding cumulative
frequencies.
• It is a table that shows the absolute frequencies accumulated over
successive classes.
• It can be of the "more than" or "less than" type, depending on the
type of cumulative frequency used.
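A minimal Python sketch of the two cumulative types follows. The function names are hypothetical, and the list `freqs` is an illustrative set of class frequencies, ordered from the lowest class to the highest.

```python
from itertools import accumulate

def cumulative_below(frequencies):
    """'Less than' type: running total from the first class upward."""
    return list(accumulate(frequencies))

def cumulative_above(frequencies):
    """'More than' type: running total from the last class downward."""
    return list(accumulate(reversed(frequencies)))[::-1]

freqs = [2, 2, 7, 4, 3, 2]            # illustrative class frequencies
cf_below = cumulative_below(freqs)
cf_above = cumulative_above(freqs)
print(cf_below)
print(cf_above)
```

The last "less than" value and the first "more than" value both equal the total frequency, which is a quick consistency check.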
Relative Frequency (rf): the frequency divided by the total frequency.

Relative Cumulative Frequency (rcf): the cumulative frequency
divided by the total frequency.

RCF Distribution: a table that shows the cumulative number of
occurrences of each item or class of items in the data set as a
proportion of the total number of observations.
• RCF can be expressed in decimal, fraction, or percentage form.

$RF = \frac{AF}{TF}$, where:

RF: relative frequency

AF: absolute (or cumulative) frequency

TF: total frequency (total number of observations)
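The ratio RF = AF/TF and its three forms (decimal, fraction, percentage) can be sketched in Python as below. The function name `relative` is hypothetical, and the frequencies 4, 8, 4, 4 are the marital status counts from the categorical example earlier in the chapter.

```python
from fractions import Fraction

def relative(absolute_frequencies, total):
    """Divide each absolute (or cumulative) frequency by the total frequency."""
    return [Fraction(af, total) for af in absolute_frequencies]

freqs = [4, 8, 4, 4]                  # M, S, D, W
total = sum(freqs)                    # TF = 20

rf = relative(freqs, total)
as_decimal = [float(r) for r in rf]
as_percent = [float(r * 100) for r in rf]

# Relative cumulative frequency: divide the running totals by TF instead.
running = [sum(freqs[:i + 1]) for i in range(len(freqs))]
rcf = relative(running, total)

print(rf, as_decimal, as_percent, rcf)
```

`Fraction` keeps the ratios exact, so the relative frequencies add up to exactly 1.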
Guidelines for Classes
• There should be between 5 and 20 classes.
• The classes must be mutually exclusive.
  o No data value can fall into two different classes.
• The classes must be all-inclusive; that is, all data values must be
included.
• The classes must be continuous until all the data are included;
that is, there are no gaps in the frequency distribution.
• The classes must be equal in width, except possibly the first or last class.
Steps for Constructing Grouped Frequency Distribution
1. Compute the range: R = maximum value - minimum value.
2. Select the number of classes desired, usually between 5 and 20.
   • Use Sturges' rule, k = 1 + 3.32 log(n) (logarithm to base 10),
   rounding up (not off), where k is the number of classes desired and
   n is the total number of observations in the sample.
3. Find the class width by dividing the range by the number of classes
and rounding up (not off): w = R/k.
4. Pick a suitable starting point, usually the minimum value, as the
lower limit of the first class.
   • Then keep adding the class width to the lower limit of the first
   class to get the lower limits of the remaining classes.
5. To find the upper limit of the first class, subtract U from the lower
limit of the second class.
   • Then keep adding the class width to the upper limit of the first
   class to get the rest of the upper limits for the required classes.

6. Find the class boundaries by subtracting U/2 from the lower limits
and adding U/2 to the upper limits.
   • The boundaries also lie halfway between the upper limit of one
   class and the lower limit of the next class.

7. Find the frequencies by counting the number of data values included
in each class.

8. Calculate the cumulative frequencies (CF below): take the frequency
of the first class as its CF, then add each successive class frequency
to the CF of the preceding class to get the CF of that class.
9. Calculate the cumulative frequencies (CF above): take the sum of all
frequencies as the CF of the first class, then for each remaining class
subtract the frequency of the preceding class from the cumulative
frequency above of the preceding class.

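Steps 1 to 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the data are assumed to be whole numbers with U = 1, and the data set is the 20 observations of the worked example that follows. The widening guard for an exact quotient R/k is an assumption that goes beyond the slides.

```python
import math

def sturges_classes(n):
    """Step 2: number of classes by Sturges' rule, rounded up."""
    return math.ceil(1 + 3.32 * math.log10(n))

def class_limits(data, unit=1):
    """Steps 1-5: return (lower limit, upper limit) pairs for each class."""
    low, high = min(data), max(data)
    data_range = high - low                      # step 1: R = max - min
    k = sturges_classes(len(data))               # step 2
    width = math.ceil(data_range / k)            # step 3: w = R/k, rounded up
    # Assumption beyond the slides: if R/k is already a whole number, the
    # last class would stop short of the maximum, so widen by one unit.
    if low + k * width - unit < high:
        width += unit
    # Steps 4-5: lower limits start at the minimum; each upper limit is
    # one unit below the next lower limit.
    return [(low + i * width, low + (i + 1) * width - unit) for i in range(k)]

def class_boundaries(limits, unit=1):
    """Step 6: subtract U/2 from lower limits, add U/2 to upper limits."""
    return [(lo - unit / 2, hi + unit / 2) for lo, hi in limits]

data = [11, 29, 6, 33, 14, 31, 22, 27, 19, 20,
        18, 17, 22, 38, 23, 21, 26, 34, 39, 27]
limits = class_limits(data)
boundaries = class_boundaries(limits)
print(limits)
print(boundaries)
```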
• Construct a grouped frequency distribution for the following data:
11, 29, 6, 33, 14, 31, 22, 27, 19, 20, 18, 17, 22, 38, 23, 21, 26, 34, 39, 27.

• Find the range: R = H – L = 39 – 6 = 33.

• Compute the number of classes desired using Sturges' formula:
  k = 1 + 3.32 log(n) = 1 + 3.32 log(20) = 5.32 ≈ 6.

• Find the class width: w = R/k = 33/6 = 5.5 ≈ 6.

• Select the starting point for the lowest class limit (the smallest
data value), and keep adding the class width to get the lower limit of
each next class until the required 6 classes are reached.
  o The lower class limits are 6, 12, 18, 24, 30, 36.

• Subtract one unit from the lower limit of the second class to get the
upper limit of the first class; that is, 12 – 1 = 11.
  o Then add the class width to each upper limit to get the next
  upper limit.
  o Therefore, the upper limits of the required classes are 11, 17, 23,
  29, 35, 41.
• Construct the required classes by combining the lower and upper
class limits.
  o Therefore, the required class limits are 6 – 11, 12 – 17, 18 – 23,
  24 – 29, 30 – 35, and 36 – 41.

• Find the class boundaries by subtracting U/2 = 0.5 from each lower
class limit (to get the lower class boundary) and adding U/2 = 0.5 to
each upper class limit (to get the upper class boundary).
  o Therefore, the lower and upper class boundaries of the first class
  are 6 – 0.5 = 5.5 and 11 + 0.5 = 11.5, respectively.
  o Then keep adding the class width to both boundaries to obtain the
  rest: the class boundaries are 5.5 – 11.5, 11.5 – 17.5, 17.5 – 23.5,
  23.5 – 29.5, 29.5 – 35.5, and 35.5 – 41.5.
• Count the number of observations that lie within each class, and
fill in the frequencies and cumulative frequencies.
Class     Class          Class   Freq.   C. Freq.      C. Freq.      Relative   Relative C. Freq.   Relative C. Freq.
Limit     Boundary       Mark            (less than)   (more than)   Freq.      (less than)         (more than)
6 – 11    5.5 – 11.5     8.5     2       2             20            0.10       0.10                1.00
12 – 17   11.5 – 17.5    14.5    2       4             18            0.10       0.20                0.90
18 – 23   17.5 – 23.5    20.5    7       11            16            0.35       0.55                0.80
24 – 29   23.5 – 29.5    26.5    4       15            9             0.20       0.75                0.45
30 – 35   29.5 – 35.5    32.5    3       18            5             0.15       0.90                0.25
36 – 41   35.5 – 41.5    38.5    2       20            2             0.10       1.00                0.10

Relative Freq. = freq./TF and Relative C. Freq. = CF/TF, with TF = 20.