Chapter 02 - Data Representation
On the basis of presentation, data can be classified as:
• Ungrouped data
• Grouped data
Ungrouped Data:
Ungrouped data is the data you first gather from an experiment or
study.
Here the data is in raw form; that is, it is not sorted into categories,
classified, or otherwise grouped.
An ungrouped set of data is basically a list of numbers.
Example:
The marks of 20 students in the STA-240 midterm are:
100,92,41,75,98,85,45,55,85,100,75,74,72,88,94,67,41,84,87,87
Grouped Data:
Grouped data is data that has been bundled together into categories,
classes, or class intervals.
Example:
The marks of 20 students in the STA-240 midterm, grouped into classes, are:
Marks     No. of students
41-50     3
51-60     1
61-70     1
71-80     4
81-90     6
91-100    5
Total     20
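The grouped table can be reproduced from the raw marks of the ungrouped example. The following is a minimal sketch, not part of the original slides; the function name group_marks and the list of class limits are illustrative assumptions, and both limits of each class are treated as included.

```python
# Raw marks from the ungrouped data example.
marks = [100, 92, 41, 75, 98, 85, 45, 55, 85, 100,
         75, 74, 72, 88, 94, 67, 41, 84, 87, 87]

# Class limits as used in the grouped table (both limits included).
classes = [(41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]


def group_marks(values, class_limits):
    """Count how many values fall in each class, limits included."""
    return {
        f"{low}-{high}": sum(low <= v <= high for v in values)
        for low, high in class_limits
    }


table = group_marks(marks, classes)
for label, count in table.items():
    print(label, count)
print("Total", sum(table.values()))
```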
Frequency:
The frequency is the number of times a particular data point occurs in the
set of data.
Example: Consider the list of numbers 1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9.
In this list, the frequency of the number 9 is 5, because it occurs
5 times.
Frequency Distribution:
A tabular representation of data that shows the frequency of each value
or class is known as a frequency distribution.
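As an illustration, the frequencies for the list of numbers in the example can be counted with Python's collections.Counter. This is only a sketch; the variable names are chosen here for illustration.

```python
from collections import Counter

# The list of numbers from the frequency example.
numbers = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(numbers)

# Print the frequency distribution as a small two-column table.
print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:<6} {frequency[value]}")
```

Looking up frequency[9] gives the frequency of the number 9 discussed in the example.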
Frequency Distribution:
• Qualitative frequency distribution: associated with a qualitative variable/data
• Quantitative frequency distribution: associated with a quantitative variable/data
Quantitative Frequency Distribution
Terms associated with a quantitative frequency distribution table:
• Class interval
• Number of classes
• Range
• Class width
• Tally
Class interval:
The size of each class into which the range of a variable is divided.
Marks (class interval)    No. of students
41-50                     3
51-60                     1
61-70                     1
71-80                     4
81-90                     6
91-100                    5
A class interval (CI) can be of two types:
• Inclusive CI
• Exclusive CI
Inclusive Class Interval:
• When both the lower and the upper class limits are included in the class,
it is an inclusive class interval.
• Usually, for a discrete variate or data, the inclusive type of CI is used.
Example:
The marks of 15 students in the STA-240 midterm are
100, 92, 75, 98, 85, 85, 100, 75, 74, 72, 88, 94, 84, 87, 87, which we arrange as:
Marks     No. of students
71-80     4
81-90     6
91-100    5
Total     15
Exclusive Class Interval:
• When the lower limit is included but the upper limit is excluded, it
is an exclusive class interval.
• Usually, for a continuous variate or data, the exclusive type of class
interval is used.
Example:
The marks of 15 students in the STA-240 midterm are
100, 92, 75.6, 98, 85, 80, 100, 75, 74, 72, 88, 94.5, 84, 90, 87, which we
arrange as:
Marks     No. of students
70-80     4
80-90     5
90-100    6
Total     15
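The exclusive tally can be sketched in code as follows. This is an illustrative sketch rather than the method used on the slides; the function name tally_exclusive is hypothetical. One assumption is needed to reproduce the table: the two marks of 100 are counted in the last class, so the last class is treated as also including its upper limit.

```python
# Marks from the exclusive class interval example.
marks = [100, 92, 75.6, 98, 85, 80, 100, 75, 74, 72, 88, 94.5, 84, 90, 87]

# Class limits: lower limit included, upper limit excluded.
classes = [(70, 80), (80, 90), (90, 100)]


def tally_exclusive(values, class_limits):
    """Count values per class with low <= v < high.

    Assumption: the last class also includes its upper limit,
    so that the maximum mark (100) is not dropped from the table.
    """
    counts = {}
    last = len(class_limits) - 1
    for i, (low, high) in enumerate(class_limits):
        if i == last:
            n = sum(low <= v <= high for v in values)
        else:
            n = sum(low <= v < high for v in values)
        counts[f"{low}-{high}"] = n
    return counts


table = tally_exclusive(marks, classes)
for label, count in table.items():
    print(label, count)
print("Total", sum(table.values()))
```

Note that the mark 80 goes to the class 80-90 and the mark 90 goes to the class 90-100, which is the defining behaviour of exclusive class intervals.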
**Steps to construct a quantitative frequency distribution:
1. Find the highest and the lowest value of your data set.
2. Calculate the range by using
Range = (Highest value - Lowest value) + 1
3. Determine the number of classes by using the following formula:
number of classes, k = 1 + 3.322 log N (log to base 10), where N = number
of observations in your data set.
4. Determine the class width by using the following formula:
class width = Range / k
5. Sort or tally the items into the appropriate class intervals.
6. Count the tallies of each CI, introduce a frequency column, and
display the result in a table.
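Steps 1 to 4 can be sketched as a short calculation. The sketch below applies them to the 20 midterm marks from the ungrouped data example; the function name number_of_classes is hypothetical, and rounding k up to a whole number is an assumption made here, since the steps do not state a rounding rule.

```python
import math

# The 20 midterm marks from the ungrouped data example.
marks = [100, 92, 41, 75, 98, 85, 45, 55, 85, 100,
         75, 74, 72, 88, 94, 67, 41, 84, 87, 87]


def number_of_classes(n):
    """Number of classes k = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)


# Steps 1 and 2: highest value, lowest value, and range.
highest, lowest = max(marks), min(marks)
data_range = (highest - lowest) + 1

# Step 3: number of classes (about 5.32 for N = 20).
k = number_of_classes(len(marks))

# Assumption: round k up to the next whole number of classes.
k_rounded = math.ceil(k)

# Step 4: class width.
class_width = data_range / k_rounded

print("Range:", data_range)
print("k:", round(k, 2), "rounded up to", k_rounded)
print("Class width:", class_width)
```

Under these assumptions the marks give 6 classes of width 10, which agrees with the classes 41-50 to 91-100 used in the grouped data example.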
Problem-01
The ages of the 35 workers of a certain company are given below:
25, 28, 29, 32, 33, 37, 42, 37, 42, 46, 35, 37, 42, 46, 35, 38, 43,
46, , 40, 25, 56, 65, 54, 41, 37, 27, 54, 55, 58, 42, 31, 27, 28, 25,
28.
Qualitative Frequency Distribution:
1. Uni-variate frequency distribution
2. Bi-variate frequency distribution
3. Tri-variate frequency distribution
Uni-variate frequency distribution:
• It involves a single variable.
• Example: Religious view of 162 people of Uttara Sector-10:
Hindu 52
Christian 10
Bi-variate frequency distribution:
• It involves two variables.
• Example: Frequency distribution of 125 cases by two factors
(cancer status and smoking status).
Smoking status
Smoker 50 25 75
Non smoker 10 40 50
Total 60 65 125
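The row and column totals of a bi-variate table can be checked with a few lines of code. This is only a sketch: the extracted table does not label its two count columns, so they are kept here as unnamed positions rather than guessed, and the variable names are illustrative.

```python
# Counts from the table: rows are smoking status.
# The two count columns are not labelled in the table, so they are
# stored by position only (first column, second column).
table = {
    "Smoker": [50, 25],
    "Non smoker": [10, 40],
}

# Row totals: one total per smoking status.
row_totals = {status: sum(counts) for status, counts in table.items()}

# Column totals: add the counts position by position.
column_totals = [sum(column) for column in zip(*table.values())]

# Grand total: all cases in the table.
grand_total = sum(row_totals.values())

print("Row totals:", row_totals)
print("Column totals:", column_totals)
print("Grand total:", grand_total)
```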
Graphical Representation of Data:
• Graphical representation of data is a visual display of
data and statistical results.
• It is often more effective than a tabular
representation of data.
• There are different types of graphical representation,
depending on the nature of the data.
Data Representation:
• Qualitative data representation methods: Bar Diagram, Multiple Bar Diagram, Pie Chart
• Quantitative data, discrete data representation methods: Line Diagram, Dot Diagram
• Quantitative data, continuous data representation methods: Histogram, Frequency Polygon, Ogive Line
Graphs.pdf
Problem-02
The accompanying table shows the stock position of finished goods
in metric tons as of June 2004 of the Bangladesh Chemical
Industries Corporation. Represent the data by a bar diagram.
Finished goods    Quantity (metric tons)
TSP 8916
SSP 18455
Paper 2660
Cement 7048
Insulator 1462
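In a bar diagram, the length of each bar is proportional to the quantity it represents. The sketch below illustrates that scaling with a plain-text bar diagram; the function name bar_lengths and the choice of 40 characters for the longest bar are assumptions made for illustration, and a plotting library would normally be used for the actual drawing.

```python
# Stock of finished goods in metric tons, from the table.
stock = {
    "TSP": 8916,
    "SSP": 18455,
    "Paper": 2660,
    "Cement": 7048,
    "Insulator": 1462,
}


def bar_lengths(values, max_width=40):
    """Scale each value so that the largest bar is max_width characters."""
    largest = max(values.values())
    return {name: round(v / largest * max_width) for name, v in values.items()}


lengths = bar_lengths(stock)

# Draw one horizontal bar per finished good.
for name, length in lengths.items():
    print(f"{name:<10} {'#' * length} {stock[name]}")
```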
Problem 03
Given below are the 2010 census populations of the Chittagong and
Dhaka divisions by sex. Display them by a multiple bar diagram.
Population in 2010
[Figure: multiple bar diagram; vertical axis from 0 to 20000; legend: Male, Female; categories: Chittagong, Dhaka; bar labels shown: 17634, 16306]
Problem 04
The production of major crops of Bangladesh in 2017-18, as provided
by BBS, is:
Major Crops    Production (Lac Metric tons)
Aush 18.32
Amon 115.2
Boro 124.5
Wheat 15.07
Potato 31.53
Draw a Pie Chart.
Crops    Production    Relative frequency    Percentage frequency    Angle
Aush 18.32 0.06 6 21.6
Amon 115.2 0.38 38 136.7
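The columns of this table can be computed as sketched below. The sketch assumes the relative frequency is rounded to two decimals first and the angle is then taken as relative frequency times 360 degrees; the function name pie_rows is hypothetical. Because of rounding, some computed angles can differ in the last digit from the values printed in the table.

```python
# Production of major crops (Lac Metric tons), from Problem 04.
production = {
    "Aush": 18.32,
    "Amon": 115.2,
    "Boro": 124.5,
    "Wheat": 15.07,
    "Potato": 31.53,
}


def pie_rows(values):
    """Return (relative frequency, percentage, angle) for each category.

    Assumption: the relative frequency is rounded to two decimals
    before the percentage and the angle are derived from it.
    """
    total = sum(values.values())
    rows = {}
    for name, amount in values.items():
        relative = round(amount / total, 2)
        percentage = round(relative * 100)
        angle = round(relative * 360, 1)
        rows[name] = (relative, percentage, angle)
    return rows


rows = pie_rows(production)
for crop, (relative, percentage, angle) in rows.items():
    print(f"{crop:<8} {production[crop]:>7} {relative:>5} {percentage:>3} {angle:>6}")
```

The angles of all sectors should add up to 360 degrees, which is a quick check on the calculation.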
[Figure: Pie chart for crops; legend: Aush, Amon, Boro, Wheat, Potato; sector angles shown: 21.6, 136.7, 147.7, 18, 36]
Problem 05:
A food and beverage company asked 150 randomly sampled customers to take a taste test and select the
beverage they preferred most. The results are shown in the table:
Beverage    No. of people
Lemon lime 20
Cola plus 30
Coca cola 60
Pepsi 40
Total 150
Problem 0
The following is a distribution of the final examination scores that 200
students obtained in a 3-week course in Statistics.
Scores No. of Students
1-20 24
21-40 55
41-60 76
61-80 32
81-100 13
Total 200
a) Convert the distribution into a percentage
distribution.
b) Draw a histogram for this data set.
c) Draw a frequency polygon and Ogive line for this
data set.
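The numbers needed for parts a) to c) can be prepared as sketched below. This is a sketch under the usual conventions, which are stated here as assumptions: the percentage of a class is its frequency times 100 divided by the total, the frequency polygon is plotted at class midpoints, and the ogive uses cumulative frequencies. All variable names are illustrative.

```python
from itertools import accumulate

# Score distribution of the 200 students, from the table.
scores = {"1-20": 24, "21-40": 55, "41-60": 76, "61-80": 32, "81-100": 13}

total = sum(scores.values())

# a) Percentage distribution: share of each class in the total.
percentages = {label: f * 100 / total for label, f in scores.items()}

# c) Class midpoints (x-values for the frequency polygon).
midpoints = {}
for label in scores:
    low, high = (int(limit) for limit in label.split("-"))
    midpoints[label] = (low + high) / 2

# c) Cumulative frequencies (y-values for the ogive).
cumulative = list(accumulate(scores.values()))

print("Scores   Frequency  Percent  Midpoint  Cumulative")
for (label, f), c in zip(scores.items(), cumulative):
    print(f"{label:<8} {f:>9} {percentages[label]:>8} {midpoints[label]:>9} {c:>11}")
```

The last cumulative frequency should equal the total number of students, and the percentages should add up to 100.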
Data
Primary Data
Data that is collected by a researcher from first-hand sources, using methods like surveys,
interviews, or experiments.
Example: An organization doing market research about a new product (say phone) they are about
to release will need to collect data like purchasing power, feature preferences, daily phone usage,
etc. from the target market. The data from past surveys are not used because the product differs.
Secondary data
Secondary data is data gathered from studies, surveys, or experiments that have been run by
other people or for other research.
Example: You want to do a project to determine the cost of living index for the people of Dhaka city.
**Methods of collecting primary data:
1. Observation method
2. Questionnaire method
3. Interviewing Method
a. Direct Interview
b. Telephone Interview
c. Indirect Interview.
Some Sources of secondary Data in Bangladesh:
1. World Bank
2. BBS (Bangladesh Bureau of Statistics)
3. BDHS (Bangladesh Demographic and Health Survey)
4. NIPORT (National Institute of Population Research and Training)
5. BIRDEM (Bangladesh Institute of Research and Rehabilitation for
Diabetes, Endocrine and Metabolic Disorders)
6. BRAC
7. Icddr,b