Lec 6 ORGANIZATION AND PRESENTATION OF DATA
Meaning of Data
• "Data" is a plural word and conveys the idea of a collection of pieces
of information on some variables.
• Data are the raw, disorganized facts and figures collected from any
field of inquiry.
By
Dr. Md. Siddikur Rahman, Associate Professor, Dept. of Statistics, BRUR
We may further attempt to record their ages or measure their heights
and weights.
Some information can be obtained simply by observing, i.e., we may
observe whether a given day is rainy or sunny.
All such information constitutes data.
Different Types of Data
All statistical data may be classified into two broad categories:
qualitative and quantitative data.
Sources of Data
Depending on the source, statistical data are of two types:
• Primary data
The data which are originally collected by an investigator or an
agent for the first time for the purpose of statistical enquiry are
known as primary data. Such data are thus original in character.
• Secondary data
The data which are collected or obtained from some published or
unpublished sources are called secondary data. This type of data is
not original in character.
For example, the reports and publications made by the Bangladesh Bureau
of Statistics (BBS) are primary for that organization but secondary for
those who use them.
Methods of Data Collection
Census method
• In a census, every unit of the population is studied.
Sample Survey
• In the sampling method, only a part of the population is studied
instead of every unit, and conclusions about the entire population are
drawn on the basis of the sample.
• Sampling is a technique to select some representative units from
population units.
SCHEDULE
A schedule, also known as an interview schedule, is an instrument that is
not given to the respondents. It is filled in by the interviewer, who
reads the questions to the respondents and records the answers as
provided by the respondents.
ORGANIZATION AND PRESENTATION OF DATA
Having obtained the data, the most usual questions one might ask are:
• How many of the workers are below 30 years of age? Over 50?
• How many of them earn between 74 and 81 taka?
• How many of them have a secondary level of education?
• Are the workers frequently absent from work?
The answers to the above questions can be given by simply counting the
cases that appear in the table. But this is a cumbersome job, and
sometimes an impossible one, if the number of cases is very large. What
would one then expect us to do with this large volume of data?
Most of us would wish that someone had classified, categorized or
summarized the data in a more convenient and readily interpretable
form.
Tabular and graphical procedures provide useful ways and means of
organizing and describing the data such that they are more easily used
and interpreted.
The concept of frequency distribution is introduced here as a tabular
method of summarizing data. This frequency distribution can also be
displayed graphically employing a number of diagrams, charts, plots and
curves.
Presentation of data
1. Arrangement
   • Ascending (lowest to highest)
   • Descending (highest to lowest)
2. Tabulation
   • Frequency table
   • Frequency distribution
   • Cross tabulation
3. Graphs
   • Histogram
   • Frequency curve (polygon)
   • Line chart
   • Scatter diagram
4. Diagrams
   • Bar diagram
   • Pie diagram
Frequency distribution
A Frequency distribution is a set of mutually exclusive classes or
categories together with the frequency of occurrences of items, values or
observations in each class or category in a given set of data, presented
usually in a tabular form.
Frequency Distribution:
• A tabular presentation of data showing the number of observations in
each class.
• A grouping of data into mutually exclusive classes showing the
number of observations in each class presented usually in a tabular
form
Category   Tally marks                     Frequency   Percent (%)
Large      ///// ///// ///// /             16          32
Medium     ///// ///// ///// ///// ////    24          48
Small      ///// /////                     10          20
Total                                      50          100
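As a rough illustration of how such a table can be produced, the sketch below counts categories and converts the counts to percentages. The raw list `observations` is hypothetical: it is built only to match the counts 16, 24 and 10 shown in the table, and the helper name `frequency_table` is an assumption, not part of the lecture.

```python
from collections import Counter

# Hypothetical raw observations, constructed to match the counts in the table.
observations = ["Large"] * 16 + ["Medium"] * 24 + ["Small"] * 10


def frequency_table(values):
    """Return {category: (frequency, percent)} for a list of categories."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}


table = frequency_table(observations)
for category, (n, pct) in table.items():
    print(f"{category:<8}{n:>4}{pct:>8.1f}")
print(f"{'Total':<8}{len(observations):>4}{100.0:>8.1f}")
```

The classes are mutually exclusive because each observation carries exactly one category label, so the frequencies add up to the total number of observations.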
Frequency distribution of workers by religion
Religion    Tally marks                                   Number of workers   Percent (%)
Muslim      ///// ///// ///// ///// ///// ///// ///// /   36                  72
Hindu       ///// ////                                    9                   18
Christian   /////                                         5                   10
Total                                                     50                  100
Table: Array of daily wage data
50 63 70 75 84
51 65 71 75 85
54 65 72 76 86
56 66 72 77 87
56 67 72 79 88
57 68 73 80 88
59 68 73 81 89
60 69 74 82 93
61 69 74 82 93
62 70 74 83 97
… …
… …
97 1
Formation of grouped discrete frequency distribution
• Grouping, however, has limitations too.
• One disadvantage of a grouped distribution is the loss of information.
Example: The numbers of complete days the workers were absent from
their work are arranged below in an ascending array:
5 8 9 9 10 10 10 10 11 11
12 12 12 13 13 13 14 14 14 15
15 15 15 16 16 16 16 14 17 17
17 18 18 18 18 18 19 19 19 19
20 21 21 22 23 24 26 27 29 33
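The grouping step can be sketched as follows. The values are copied exactly as printed in the array above; the starting point 5 and the class width 5 are assumptions chosen for illustration, since the lecture does not state the classes here, and `group_discrete` is a hypothetical helper name.

```python
absent_days = [
    5, 8, 9, 9, 10, 10, 10, 10, 11, 11,
    12, 12, 12, 13, 13, 13, 14, 14, 14, 15,
    15, 15, 15, 16, 16, 16, 16, 14, 17, 17,
    17, 18, 18, 18, 18, 18, 19, 19, 19, 19,
    20, 21, 21, 22, 23, 24, 26, 27, 29, 33,
]


def group_discrete(values, start, width):
    """Count values in inclusive classes start..start+width-1, and so on."""
    counts = {}
    for v in values:
        lower = start + ((v - start) // width) * width
        key = (lower, lower + width - 1)
        counts[key] = counts.get(key, 0) + 1
    return dict(sorted(counts.items()))


# Assumed classes: 5-9, 10-14, 15-19, ... (width 5).
grouped = group_discrete(absent_days, start=5, width=5)
for (lower, upper), n in grouped.items():
    print(f"{lower:>2}-{upper:<2} {n}")
```

The loss of information is visible in the result: the class 15-19 reports 20 workers, but the individual values inside that class can no longer be recovered from the grouped table.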
It doesn’t maintain the continuity of the age data. So we have to
reconstruct the frequency distribution with the correction factor
$$C = \frac{\text{Lower limit of second class} - \text{Upper limit of first class}}{2} = \frac{30 - 29}{2} = 0.5$$
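A minimal sketch of applying the correction factor: subtract it from every lower limit and add it to every upper limit to obtain continuous class boundaries. The age classes below are hypothetical; only the limits 29 and 30 come from the text, and the function names are assumptions.

```python
def correction_factor(limits):
    """Half the gap between the first class's upper limit and the second's lower limit."""
    (_, first_upper), (second_lower, _) = limits[0], limits[1]
    return (second_lower - first_upper) / 2


def class_boundaries(limits):
    """Convert inclusive class limits to continuous class boundaries."""
    c = correction_factor(limits)
    return [(lower - c, upper + c) for lower, upper in limits]


# Hypothetical age classes; only 29 and 30 appear in the lecture.
age_limits = [(20, 29), (30, 39), (40, 49)]
print(correction_factor(age_limits))
print(class_boundaries(age_limits))
```

After the correction, the upper boundary of each class coincides with the lower boundary of the next, which restores the continuity of the variable.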
Class interval/        Absolute     Percentage   Relative
boundaries (Wages)     frequency    frequency    frequency
49.5 – 57.5            6            12.0         0.12
57.5 – 65.5            7            14.0         0.14
65.5 – 73.5            14           28.0         0.28
73.5 – 81.5            10           20.0         0.20
81.5 – 89.5            10           20.0         0.20
89.5 – 97.5            3            6.0          0.06
Total                  50           100.0        1.00
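The table can be reproduced from the array of daily wage data given earlier in the lecture. The sketch below assumes that array is read column by column (which yields an ascending sequence of 50 values); `grouped_distribution` is a hypothetical helper name.

```python
# Daily wage array from the lecture, read column by column (an assumption).
wages = [
    50, 51, 54, 56, 56, 57, 59, 60, 61, 62,
    63, 65, 65, 66, 67, 68, 68, 69, 69, 70,
    70, 71, 72, 72, 72, 73, 73, 74, 74, 74,
    75, 75, 76, 77, 79, 80, 81, 82, 82, 83,
    84, 85, 86, 87, 88, 88, 89, 93, 93, 97,
]
boundaries = [49.5, 57.5, 65.5, 73.5, 81.5, 89.5, 97.5]


def grouped_distribution(values, edges):
    """Return rows of (lower, upper, frequency, percent, relative frequency)."""
    n = len(values)
    rows = []
    for lower, upper in zip(edges, edges[1:]):
        freq = sum(lower <= v < upper for v in values)
        rows.append((lower, upper, freq, 100 * freq / n, freq / n))
    return rows


rows = grouped_distribution(wages, boundaries)
for lower, upper, freq, pct, rel in rows:
    print(f"{lower:>5} - {upper:<5} {freq:>3} {pct:>6.1f} {rel:>5.2f}")
```

Because the wages are whole numbers and the boundaries end in .5, no observation can fall exactly on a boundary, so every value belongs to exactly one class.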
Thus the cumulative frequency 27 in column 3 states that 27 workers
received less than taka 73.5. In other words, 54% of the workers received
less than that amount.
2. More than frequency distribution for wage data
Class interval/        Frequency    Cumulative   % Cumulative
boundaries (Wages)                  frequency    frequency
49.5 – 57.5            6            50           100.0
57.5 – 65.5            7            44           88.0
65.5 – 73.5            14           37           74.0
73.5 – 81.5            10           23           46.0
81.5 – 89.5            10           13           26.0
89.5 – 97.5            3            3            6.0
Total                  50
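Both kinds of cumulative frequency can be obtained from the class frequencies with a running sum, as the following sketch suggests. The variable names are assumptions; the frequencies are those of the wage distribution.

```python
from itertools import accumulate

frequencies = [6, 7, 14, 10, 10, 3]  # class frequencies of the wage data
total = sum(frequencies)

# "Less than" type: running total from the first class downwards.
less_than = list(accumulate(frequencies))

# "More than" type: running total from the last class upwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

less_than_pct = [100 * f / total for f in less_than]
more_than_pct = [100 * f / total for f in more_than]

print(less_than)
print(more_than)
print(more_than_pct)
```

Each "less than" value is read against the upper boundary of its class (27 workers received less than taka 73.5), while each "more than" value is read against the lower boundary (37 workers received more than taka 65.5).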
PRESENTATION OF QUALITATIVE DATA
Bar diagram/Bar Chart (The classes are reported on the horizontal
axis and the class frequencies on the vertical axis)
[Figure: bar chart of number sold by laptop type (Foreign, Domestic)]
[Figure: pie chart of laptop type: Foreign 37%, Domestic 63%]
PRESENTATION OF QUANTITATIVE DATA
Diagrams for discrete data
• Dot diagram
Diagrams for continuous data
• Histogram
Histogram: A graph in which the classes are marked on the horizontal
axis and the class frequencies on the vertical axis.
[Figure: histogram of number of laptops against selling price (thousand Tk), with classes 14 – 24, 24 – 34, 34 – 44, 44 – 54, 54 – 64 and 64 – 74]
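A text-only sketch of the same idea is given below: one bar per class, with the bar length equal to the class frequency. The classes and frequencies are taken from the laptop selling-price table in the frequency polygon part of this lecture; the helper `text_histogram` and the `#` symbol are assumptions made for illustration, and the bars are drawn sideways for convenience.

```python
classes = [(14, 24), (24, 34), (34, 44), (44, 54), (54, 64), (64, 74)]
frequencies = [5, 2, 7, 6, 5, 5]


def text_histogram(classes, frequencies, symbol="#"):
    """Return one line per class: the class interval, a bar, and the frequency."""
    lines = []
    for (lower, upper), freq in zip(classes, frequencies):
        lines.append(f"{lower:>2} - {upper:<2} | {symbol * freq} {freq}")
    return lines


for line in text_histogram(classes, frequencies):
    print(line)
```

Since the classes are adjacent and of equal width, the bars touch each other in a proper histogram and their heights can be compared directly.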
concentrations of the scores, the shape of the distribution, and presence
of any outliers in the distribution.
Compared to other graphical methods, a stem and leaf plot is an easy and
quick way of displaying data.
Example:
The following data represent the marks obtained by 20 students in a
statistics test:
84 17 38 45 47 53 76 54 75 22 66 65 55 54 51 33 39 19 54 72
Use a stem and leaf plot to display the data.
Solution:
Here, the lowest score is 17 and the highest score is 84. The leading
digit (tens) of each score is taken as the stem and the trailing digit
(units) as the leaf. For the score 84, for example, the stem is 8 and
the leaf is 4.
The complete diagram in unordered sequence is:
Stem | Leaf
1    | 7 9
2    | 2
3    | 8 3 9
4    | 5 7
5    | 3 4 5 4 1 4
6    | 6 5
7    | 6 5 2
8    | 4
The final figure in ordered sequence is
Stem | Leaf
1    | 7 9
2    | 2
3    | 3 8 9
4    | 5 7
5    | 1 3 4 4 4 5
6    | 5 6
7    | 2 5 6
8    | 4
There are more scores in the fifties than any other group; 8 scores are
less than 50, and only 4 scores are above 70.
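The construction can be sketched in a few lines: sort the marks, split each into a tens digit (stem) and a units digit (leaf), and collect the leaves under their stems. The function name `stem_and_leaf` is an assumption, and the sketch only handles two-digit scores like those in the example.

```python
from collections import defaultdict

marks = [84, 17, 38, 45, 47, 53, 76, 54, 75, 22,
         66, 65, 55, 54, 51, 33, 39, 19, 54, 72]


def stem_and_leaf(values):
    """Return {stem: [leaves in ascending order]} for two-digit values."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit, units digit
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(marks)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
```

Unlike a grouped frequency table, this display keeps every original score: stem 5 with leaves 1 3 4 4 4 5 can be read back as 51, 53, 54, 54, 54 and 55.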
Frequency polygon
A frequency polygon also shows the shape of a distribution and is
similar to a histogram. The midpoint of each class is scaled on the X-
axis and the class frequencies on the Y-axis.
Selling price         Midpoint   Number of laptops sold
(in thousand taka)               (Frequency)
14 – 24               19         5
24 – 34               29         2
34 – 44               39         7
44 – 54               49         6
54 – 64               59         5
64 – 74               69         5
Total                            30
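The points of the polygon can be computed as below: each midpoint is the average of the class limits, and it is paired with the class frequency. Closing the polygon with zero-frequency points one class width before the first midpoint and after the last is a common convention assumed here; the lecture text does not state it.

```python
classes = [(14, 24), (24, 34), (34, 44), (44, 54), (54, 64), (64, 74)]
frequencies = [5, 2, 7, 6, 5, 5]

# Midpoint of a class = (lower limit + upper limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in classes]
points = list(zip(midpoints, frequencies))

# Assumed convention: anchor the polygon to the X-axis at both ends.
width = classes[0][1] - classes[0][0]
closed = [(midpoints[0] - width, 0)] + points + [(midpoints[-1] + width, 0)]

for x, y in closed:
    print(x, y)
```

Joining the points in `closed` with straight line segments gives the frequency polygon.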
Construct a cumulative frequency polygon (ogive).
Total 30
Line diagram
A line graph is particularly useful for numerical data if we wish to show
time series data.
Census of Population of Bangladesh: 1901-1991
Year Population (in millions)
1901 28.9
1911 31.6
1921 33.2
1931 35.6
1941 42
1951 44.2
1961 55.2
1974 76.4
1981 89.9
1991 111.5
[Figure: line chart of population (in millions) against census year]
Scatter diagram
Scatter diagrams are useful for displaying information on two quantitative
variables that are believed to be inter-related. Height and weight, age and
height, and income and expenditure are some pairs of variables that are
assumed to be related to each other. The data below relate to the age at
marriage of 20 couples obtained in a survey. A scatter diagram displays
these data. The diagram shows that as the age of the husband increases, the
wife's age also increases, implying that a positive relationship exists
between husband's age and wife's age.
[Figure: scatter diagram of wife's age against husband's age]
Solve self-review 2-1 (page 26), self-review 2-2 (page 31), self-review 2-4
(page 33), exercises 15-18 (page 39) and self-review 2-6 (page 42).