Collection and Presentation of Data-3
We come across a lot of information every day from different sources. Newspapers,
TV, phones, the Internet and so on are the sources of information in our lives.
This information can be about anything, from bowling averages in cricket to the
profits of a company over the years. These facts and figures are often numerical and
are called data. Statistics is the study of data. Let’s look into this in detail.
Collection of Data
Most of the time, we collect data for an experiment with an objective in mind. The
data usually falls into one of these two categories:
1. Categorical Data
2. Numerical Data
Categorical Data
This data represents the characteristics of an entity. For example, if we are
collecting data about some people, the categorical data might include the gender of
the person, marital status, etc. These attributes have values that are not numerical,
often “Yes/No” or, in this case, “Male/Female”. Since they are not numerical, they
cannot be added together.
Numerical Data
This data comes out of measurement or counting and is numerical in nature. For
example, the weight of a person, stock prices, marks of students of class XII, etc.
This data is also called quantitative data. It can be broken down further into two types:
1. Continuous Data
2. Discrete Data
Continuous Data: This data can take any value within an interval. The possible
values cannot be counted. For example, a length measured with a ruler can be
anything between 0 and 100 cm: 30 cm, 30.11 cm and so on. There are infinitely
many possible values.
Discrete Data: This data takes only certain values. For example, if a coin is tossed
three times and we count the number of heads, only a handful of values are
possible: 0, 1, 2 or 3. It cannot take 2.2 or any other value. So, there are only
finitely many possible values.
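The discrete example can be checked by brute force. The sketch below labels each toss as H or T and lists every outcome of three tosses. The names outcomes and possible_head_counts are illustrative and do not come from the text.

```python
from itertools import product

# Enumerate all 2**3 = 8 outcomes of tossing a coin three times.
outcomes = list(product("HT", repeat=3))

# Collect the distinct numbers of heads seen across those outcomes.
possible_head_counts = sorted({outcome.count("H") for outcome in outcomes})

print(len(outcomes))         # 8
print(possible_head_counts)  # [0, 1, 2, 3]
```

However the coins land, the count of heads is one of these four values, which is what makes this data discrete.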
Presentation of Data
After collecting the data, we need to present it in a meaningful way. Let’s take an
example.
Suppose we have the heights of the students in a class:
140, 161, 152, 184, 135, 168 and 144.
We need to answer the following questions related to the data:
1. What is the height of the tallest student in the class?
2. What is the height of the shortest student in the class?
3. What is the average height?
It is a little difficult to analyze the data in this format. Data in this form is called
raw data. Analyzing raw data can take a long time when the data set is big. It
becomes easier if we sort the data in ascending or descending order. Thus, the
presentation of data affects the information we get and the time taken to
extract it from the data.
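As a minimal sketch of how sorting helps, the snippet below answers the three questions for the heights listed above. The variable names are illustrative.

```python
heights = [140, 161, 152, 184, 135, 168, 144]

# Sorting puts the shortest student first and the tallest last.
sorted_heights = sorted(heights)
shortest = sorted_heights[0]
tallest = sorted_heights[-1]

# The average is the sum of all heights divided by how many there are.
average = sum(heights) / len(heights)

print(sorted_heights)     # [135, 140, 144, 152, 161, 168, 184]
print(tallest, shortest)  # 184 135
print(round(average, 2))  # 154.86
```

Once the list is sorted, the first two answers can be read directly from its ends. The average does not depend on the order.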
If the data were even bigger, it would be very difficult to organize it in sorted
order. In such cases, we can use a frequency table.
Un-Grouped Frequency Distribution
In this type of frequency table, we take the values as they are and count the
number of times each occurs in the data. We don’t group the data. Let’s see this
through an example.
Question: Let’s say we have the marks of students of class XII. The marks are out
of 40.
29 20 21 15 7 8 10
24 31 40 5 11 13 20
13 24 27 15 38 33 29
Marks Frequency
5 1
7 1
8 1
10 1
11 1
13 2
15 2
20 2
21 1
24 2
27 1
29 2
31 1
33 1
38 1
40 1
Notice that in this table we have not grouped the data; instead, we have taken the
exact values and their frequencies. This type of representation is called an
ungrouped frequency distribution.
Grouped Frequency Distribution
The ungrouped table is definitely an improvement over raw data but, as seen in the
above example, such tables can get pretty big. Tally marks and grouping can also be
used to represent the data more compactly.
Question: We have the data for the number of covid cases on a particular day
in 16 cities.
10 21 25 33
15 8 16 20
0 5 38 28
5 0 16 23
Group Frequency
0-5 2
5-10 3
10-15 1
15-20 3
20-25 3
25-30 2
30-35 1
35-40 1
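The intervals like 0-5, 5-10 and so on given in the above example are called class
intervals. The larger number of an interval is called the upper limit and the smaller
number is called the lower limit. In this table, a value equal to a limit is counted in
the interval where it is the lower limit, so 5 is counted in 5-10, not in 0-5.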
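To make the idea of class intervals concrete, here is a small sketch of how a value can be assigned to its interval. It assumes intervals of equal width that start at 0, include their lower limit and exclude their upper limit, which is how 5 is counted in 5-10 in the table above. The function name class_interval and its width parameter are hypothetical.

```python
def class_interval(value, width=5):
    """Return the (lower, upper) limits of the interval containing value.

    Assumption: intervals start at 0, all have the same width, and a value
    equal to a limit belongs to the interval where it is the lower limit.
    """
    lower = (value // width) * width
    return lower, lower + width


print(class_interval(0))   # (0, 5)
print(class_interval(5))   # (5, 10)
print(class_interval(38))  # (35, 40)
```

Counting how many values fall in each interval then gives the grouped frequency table.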
Let’s see some sample problems on these concepts.
Sample Problems
Problem 1: The table below gives some data. Represent this data in the form
of a suitable frequency distribution.
3 4 3 3
2 4 4 3
2 2 2 3
Solution:
We can see from the data given above that there are only three distinct values: 2, 3
and 4. These values occur multiple times throughout the data. Since there are very
few distinct values, we can represent this data in the form of an un-grouped
frequency table.
Value Frequency
2 4
3 5
4 3
Total 12
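The counting in this solution can be sketched as a simple tally loop. This is one possible way to build the table, with illustrative variable names.

```python
data = [3, 4, 3, 3,
        2, 4, 4, 3,
        2, 2, 2, 3]

# Tally: each time a value is seen, add one to its running count.
frequency = {}
for value in data:
    frequency[value] = frequency.get(value, 0) + 1

# Print the table in ascending order of value, followed by the total.
for value in sorted(frequency):
    print(value, frequency[value])
print("Total", sum(frequency.values()))
```

The total of the frequencies should equal the number of observations, which is a quick way to check that nothing was missed.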
Problem 2: The data given below represents the blood groups of the 20 students
of class XI.
O A AB A
AB AB O B
A A O B
B O B A
B AB O B
Represent the data given above in the form of a frequency table. Which blood
group has the highest frequency among the students?
Solution:
There are four blood groups in the data:
O, A, AB and B
So, we will use an ungrouped frequency distribution table to represent the data.
O 5
A 5
AB 4
B 6
Total 20
From the frequency distribution table, we can tell that B is the most commonly
occurring blood group among the students.
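Categorical data can be counted in the same way as numbers. The sketch below uses Counter from Python's standard library. It reads the second row of the data as AB, AB, O, B, which is an assumption about the layout that matches the totals in the table.

```python
from collections import Counter

blood_groups = [
    "O", "A", "AB", "A",
    "AB", "AB", "O", "B",   # assumed reading of the second row
    "A", "A", "O", "B",
    "B", "O", "B", "A",
    "B", "AB", "O", "B",
]

# Counter maps each distinct blood group to its number of occurrences.
frequency = Counter(blood_groups)

# most_common(1) returns a list holding the (group, count) pair with the highest count.
most_common_group, count = frequency.most_common(1)[0]

print(dict(frequency))
print(most_common_group, count)  # B 6
```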
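Problem 3: The table represents the weights of the students of class X. Prepare a
grouped frequency distribution for the data. In which range do most students lie?
How many students are overweight (weight greater than 70) and how many are
underweight (weight less than 50)?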
60 73 62 54
48 88 49 52
55 60 62 63
77 47 65 59
Solution:
Let’s make a grouped frequency distribution table for this data.
We assume intervals like 0-10, 10-20 and so on, divide the data into these
intervals and count the frequency.
0-10 0
10-20 0
20-30 0
30-40 0
40-50 3
50-60 4
60-70 6
70-80 2
80-90 1
Total 16
The above table is a grouped frequency table. Now, answering the
questions:
1. Most students lie in the range 60-70.
2. For overweight students, we need to count the number of students with weight
greater than 70. It can be observed from the table that there are three such
students.
For underweight students, the number of students with weight less than 50 is also
three.
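The grouping and the two counts can be reproduced with a short sketch. It assumes intervals of width 10 that include the lower limit and exclude the upper limit. The variable names are illustrative.

```python
from collections import Counter

weights = [60, 73, 62, 54,
           48, 88, 49, 52,
           55, 60, 62, 63,
           77, 47, 65, 59]
width = 10

# Map every weight to the lower limit of its class interval and count them.
grouped = Counter((w // width) * width for w in weights)

# Print the grouped table; a Counter returns 0 for intervals with no data.
for lower in range(0, 90, width):
    print(f"{lower}-{lower + width}", grouped[lower])
print("Total", sum(grouped.values()))

# Counts used to answer the second question.
overweight = sum(1 for w in weights if w > 70)
underweight = sum(1 for w in weights if w < 50)
print(overweight, underweight)  # 3 3
```

No student weighs exactly 70 or exactly 50 in this data, so counting the 70-80 and 80-90 rows, or the 40-50 row, gives the same answers as checking each weight directly.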
Problem 4: Three coins are tossed 20 times. The number of heads that occurred
each time is recorded and given below. Prepare a frequency
distribution for the given data.
0 2 1 3
2 1 1 1
3 2 0 3
2 3 2 2
2 0 1 2
Solution:
We know that at most three heads are possible in each trial of this experiment, so
the only possible values are 0, 1, 2 and 3. So, we can make an ungrouped frequency
distribution for this data.
0 3
1 5
2 8
3 4
Total 20
Thus, the table above represents the frequency table for this data.