
Ch.04 Organisation of Data

The document discusses the organization and classification of data, emphasizing the importance of arranging raw data into meaningful formats for analysis. It outlines various classification methods, including geographical, chronological, qualitative, and quantitative classifications, along with the characteristics of good classification. Additionally, it explains the concept of variables, types of statistical series, and the differences between individual and frequency series.

Uploaded by

Bhoomi Jain

Ch.04
ORGANISATION OF DATA

When an investigator collects data for an investigation, these are just raw data. Raw data are not
capable of offering any meaningful conclusion. Data are to be organised before these are
presented for final observations or conclusions. “Organisation of the data refers to the
arrangement of figures in such a form that comparison of the mass of similar data may be
facilitated and further analysis may be possible.” An important method of organisation of data is to
distribute these into different classes on the basis of their characteristics. This process is called
classification of data.

WHAT IS CLASSIFICATION?
“Classification is the process of arranging things (either actually or notionally) in groups or classes
according to their resemblances and affinities and gives expression to the unity of attributes that may
exist amongst a diversity of individuals.” This definition suggests two important features of
classification:
i. Data are divided into different groups. For example, on the basis of education, persons
may be classified as educated and uneducated.
ii. Data are grouped or classified on the basis of their class similarities. All similar units are
put in one class and as the similarity changes, class also changes.

Objectives of classification
1. Brief and Simple
2. Utility
3. Distinctiveness
4. Comparability
5. Scientific Arrangement
6. Attractive and Effective

Characteristics of Good Classification
1. Comprehensiveness
2. Clarity
3. Homogeneity
4. Suitability
5. Stability
6. Elasticity

Basis of Classification
There may be different bases of classifying statistical information, as shown in the chart below.

Basis of Classification
  (1) Geographical (or Spatial)
  (2) Chronological
  (3) Qualitative
        (a) Simple
        (b) Manifold
  (4) Quantitative or Numerical

1. Geographical (or Spatial) Classification: This classification of data is based on the
geographical or locational differences of the data. To illustrate, data relating to the number
of firms producing bicycles in India would be classified as under:

Number of Firms Producing Bicycles in 2018 across Different Locations

Place      Number of Firms
Punjab     30
Haryana    20
UP         25

2. Chronological Classification: When data are classified on the basis of time, it is known as
chronological classification. This is illustrated in the following Table 2.

Year    Sales (₹)
2016    80 lakh
2017    90 lakh
2018    95 lakh

3. Qualitative Classification: This classification is according to qualities or attributes of the
data. For example, data may be classified on the basis of occupation, religion or level of
intelligence of the population. This classification may be of two types:
i. Simple Classification: It is called classification according to dichotomy. This is
because data are divided on the basis of existence or absence of a quality. Male-
female, healthy-unhealthy, educated-uneducated are examples of dichotomy.

ii. Manifold Classification: When classification according to quality of data involves
more than one characteristic, it is called manifold classification or multiple
classification. As a result, there may be more than two classes. To illustrate,
factory workers may be classified as ‘skilled’ and ‘unskilled’. These may be further
classified as literate or illiterate and still further as rural or urban. This classification
may take the following form:

Classification of the Factory Workers: An Example of Manifold Qualitative Classification

Factory workers
  Skilled
    Literate:   Rural, Urban
    Illiterate: Rural, Urban
  Unskilled
    Literate:   Rural, Urban
    Illiterate: Rural, Urban

4. Quantitative or Numerical Classification is done on the basis of numerical values of the
facts.

Annual Profit of Small Scale Firms in the State of UP
(hypothetical data, just for an illustration of quantitative classification)

Annual Profit (₹)      Number of Firms
0-1,00,000             5
1,00,000-2,00,000      150
2,00,000-3,00,000      1500
3,00,000-4,00,000      800
4,00,000-5,00,000      400
Above 5,00,000         200

2. CONCEPT OF VARIABLE
A characteristic or a phenomenon which is capable of being measured and changes its
value over time is called a variable. Thus, a variable refers to that quantity which is subject to
change and which can be measured by some unit. If we measure the weight of students of
Class XI, then the weight of the students will be called a variable.

1. Discrete Variable: Discrete variables are those variables that increase in jumps or in
complete numbers. For example, the number of students in class XI could be
1, 2, 3, 10, 11, 15 or 20, etc., but cannot be 1¼, 1½, 1¾, etc.
2. Continuous Variable: Variables that assume a range of values or increase not in
jumps but continuously or in fractions are called continuous variables. For example,
height of the boys in a school is expressed as 5’1”, 5’2”, 5’3”, and so on.
In short, while the values of discrete variables are in complete numbers (1, 2, 3, etc.),
values of continuous variables are in fractions (5’4”, 5’2”, etc.) or are in any range
such as 10-15, 15-20, etc.

3. RAW DATA
A mass of data in its crude form is called raw data. It is an unorganised mass of the various
items. These are yet to be organised by the investigator.

Marks obtained by the students of class XI in statistics

30 20 40 20 15 20
25 10 20 15 25 20
15 45 10 30 20 25
30 20 30 20 15 35
25 10 25 15 35 10

Data presented in this table are raw data. These are not homogeneous data or data
classified into different groups or classes with similarities. No meaningful conclusion is
possible from these data. To draw any conclusion from these data, an investigator has to first
organise them. To do so, an investigator has to classify the same in the form of series.
“A series as used statistically may be defined as things or attributes of things arranged
according to some logical order.”

4. CONVERSION OF RAW DATA INTO STATISTICAL SERIES


Classification of data implies conversion of raw data into statistical series.

Types of Statistical Series
Broadly, statistical series are of two types:
1. Individual Series
2. Frequency Series
Frequency series are further divided as:
1. Discrete Series or Frequency Array, and
2. Frequency Distribution or Series with Class-intervals.

Types of Series
  Individual Series
  Frequency Series
    Discrete Series (Frequency Array)
    Frequency Distribution

1. Individual Series
Individual series are those series in which the items are listed singly. These series may
be presented in two ways:
i. According to Serial Numbers: One way of presenting an individual series is that
all the items are arranged in a serial order.

Marks Obtained by the Students in Statistics

Roll Number    Marks
1              20
2              25
3              15
4              30
5              25
6              20
7              10
8              45

ii. Ascending or Descending Order of Data: The other way of presenting an
individual series is a simple ascending or descending order.

Organisation of data in the form of individual series is a very simple form of presentation
of data. But this method is not of much use when the number of items is very large.

2. Frequency Series
Frequency series or series with frequencies may be of two types:
i. Discrete Series or Frequency Array, and
ii. Frequency Distribution.
Before we discuss these two types of series, let us understand the meaning of the
following terms:
(a) Frequency: Frequency is the number of times an item occurs (or repeats
itself) in the series.
(b) Class frequency: The number of times an item repeats itself corresponding to
a range of values (or class interval) is called class frequency. For example, if
there are 4 students securing marks between 10-15, then 4 is the frequency
corresponding to the class interval 10-15. Thus, 4 will be called the class
frequency.
(c) Tally Bars: Every time an item occurs, a tally bar (I) is marked against that
item. After four bars, the fifth occurrence is marked by a stroke across the four,
thus making a group of five. This method of marking and counting is known as
the Four and Cross method.

Four and Cross Method of Converting Raw Data into Frequency Series (data
in Table 5 or 6)
(IIII/ denotes four bars crossed by a fifth, i.e. a group of five.)

Marks    Tally Bars    Frequency
10       IIII          4
15       IIII/         5
20       IIII/ III     8
25       IIII/         5
30       IIII          4
35       II            2
40       I             1
45       I             1
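
The counting step above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the list `marks` copies the 30 raw marks of class XI given earlier, and the helper `tally` and its `IIII/` notation for a crossed group of five are assumptions made for this example.

```python
from collections import Counter

# Raw marks of the 30 students, copied from the raw-data table.
marks = [
    30, 20, 40, 20, 15, 20,
    25, 10, 20, 15, 25, 20,
    15, 45, 10, 30, 20, 25,
    30, 20, 30, 20, 15, 35,
    25, 10, 25, 15, 35, 10,
]


def tally(count):
    """Render a count in four-and-cross style.

    'IIII/' stands for four bars crossed by a fifth (a group of five);
    this text notation is an assumption made for the sketch.
    """
    groups, rest = divmod(count, 5)
    parts = ["IIII/"] * groups
    if rest:
        parts.append("I" * rest)
    return " ".join(parts)


# Counter maps each distinct mark to the number of times it occurs.
frequency = Counter(marks)

for mark in sorted(frequency):
    print(f"{mark:<8} {tally(frequency[mark]):<12} {frequency[mark]}")
```

Sorting the distinct marks before printing gives the same ascending layout as the tally table.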

i. Discrete Series or Frequency Array
A discrete series or frequency array is that series in which data are presented
in a way that exact measurements of items are clearly shown. In such series
there are no class intervals, and a particular item in the series is numbered
rather than measured with some range.

Illustration.
Twenty students of class XI have secured the following marks:
11, 12, 14, 16, 11, 17, 16, 17, 14, 11,
17, 18, 20, 14, 20, 17, 20, 17, 14, 20.

Present the data as a frequency array.

Discrete Series or Frequency Array
(IIII/ denotes four bars crossed by a fifth, i.e. a group of five.)

Marks    Tally Bars of the Frequency    Frequency (Total)
11       III                            3
12       I                              1
14       IIII                           4
16       II                             2
17       IIII/                          5
18       I                              1
20       IIII                           4
Total                                   20

ii. Frequency Distribution
It is that series in which items cannot be exactly measured. The items assume
a range of values and are placed within that range or its limits. In other words,
data are classified into different classes, each with a range; this range is called
the class interval.

Frequency Distribution

Marks Frequency
10-15 4
15-20 5
20-25 8
25-30 5
30-35 4
35-40 2
40-45 1
45-50 1
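
As a rough sketch of how such a table can be built from the raw marks, the snippet below groups values into classes of equal width in which the upper limit is excluded (so a mark of 15 falls in 15-20, not 10-15). The function name `exclusive_distribution` and its parameters are hypothetical names chosen for this illustration.

```python
from collections import Counter

# Raw marks of the 30 students, copied from the raw-data table.
marks = [
    30, 20, 40, 20, 15, 20,
    25, 10, 20, 15, 25, 20,
    15, 45, 10, 30, 20, 25,
    30, 20, 30, 20, 15, 35,
    25, 10, 25, 15, 35, 10,
]


def exclusive_distribution(values, start, width, n_classes):
    """Count values into classes [lower, upper) of equal width.

    Values below `start` or at/above the last upper limit are ignored;
    the data used here all lie inside the classes.
    """
    # Integer division gives the index of the class a value falls in.
    counts = Counter((v - start) // width for v in values)
    return [
        (start + k * width, start + (k + 1) * width, counts.get(k, 0))
        for k in range(n_classes)
    ]


distribution = exclusive_distribution(marks, start=10, width=5, n_classes=8)
for lower, upper, freq in distribution:
    print(f"{lower}-{upper}  {freq}")
```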
Some important Terms
i. Class: A range of values which incorporates a set of items is called a class. For example, 5-10 and
10-15 are classes.
ii. Class Limits: The extreme values of a class are its limits. Every class interval has two limits, lower
limit and upper limit. Of the class interval 5-10 in the above example, the lower limit is 5 and
the upper limit is 10.
iii. Magnitude of a class interval: Magnitude of a class interval is the difference between the upper
limit and the lower limit of a class. For example, in a class interval 10-15, the magnitude of the
class interval would be 15 - 10 = 5. Thus,
Magnitude of a class interval ($i$) = Upper limit ($l_2$) $-$ Lower limit ($l_1$)
iv. Mid-value: Mid-value is the average value of the upper and lower limits. It is found by adding
up the upper limit and lower limit values and dividing the total by 2. Thus,

$$\text{Mid-value} = \frac{\text{Upper limit} + \text{Lower limit}}{2}, \qquad m = \frac{l_2 + l_1}{2}$$

Where, $m$ = mid-value; $l_2$ = upper limit; $l_1$ = lower limit.

For example, mid-value of the 10-20 class interval $= \frac{20 + 10}{2} = 15$
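
The two formulas above can be checked with a tiny sketch. The function names `magnitude` and `mid_value` are hypothetical, chosen only for this illustration.

```python
def magnitude(lower, upper):
    """Magnitude of a class interval: i = l2 - l1."""
    return upper - lower


def mid_value(lower, upper):
    """Mid-value of a class interval: m = (l2 + l1) / 2."""
    return (upper + lower) / 2


print(magnitude(10, 15))  # class interval 10-15
print(mid_value(10, 20))  # class interval 10-20
```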

5. TYPES OF FREQUENCY DISTRIBUTION

Frequency Distribution
  (1) Exclusive Series
  (2) Inclusive Series
  (3) Open End Series
  (4) Cumulative Frequency Series
  (5) Mid-Values Frequency Series

1. Exclusive Series
Exclusive series is that series in which every class interval excludes items corresponding to
its upper limit. For example, a student securing 15 marks is counted in the class 15-20, not in
the class 10-15.

Exclusive Series

Marks Frequency
10-15 4
15-20 5
20-25 8
25-30 5
30-35 4
Total = 26

2. Inclusive Series
An inclusive series is that series which includes all items up to its upper limit. In such
series, the upper limit of class interval does not repeat itself as a lower limit of the next
class interval. Thus, there is a gap between the upper limit of a class interval and the lower
limit of the next class interval.

In short, while in the exclusive series there is an overlapping of the class limits (upper class
limit of one class interval being the lower class limit of the next class interval), there is no
such overlapping in the inclusive series.

Inclusive Series
Marks Frequency
10-14 4
15-19 5
20-24 8
25-29 5
30-34 4
Total = 26

Conversion of Inclusive Series into Exclusive Series
The following steps are involved in the conversion of an inclusive series into an exclusive
series:
i. First, we find the difference between the upper limit of a class interval and the lower limit
of the next class interval.
ii. Half of the difference is added to the upper limit of each class interval and half is subtracted
from the lower limit of each class interval.

In the inclusive series above, the difference is 15 - 14 = 1, so 0.5 is subtracted from every
lower limit and added to every upper limit.

Conversion of the above inclusive series into an exclusive series:

Marks        Frequency
9.5-14.5     4
14.5-19.5    5
19.5-24.5    8
24.5-29.5    5
29.5-34.5    4
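
The two conversion steps can be sketched as follows. This is a minimal illustration that assumes the classes are listed in ascending order with the same gap between every pair of neighbouring classes; the function name `to_exclusive` is hypothetical.

```python
def to_exclusive(inclusive_classes):
    """Convert inclusive class limits to exclusive (continuous) limits.

    `inclusive_classes` is a list of (lower, upper) pairs in ascending
    order. The gap between one upper limit and the next lower limit is
    assumed to be the same throughout the series.
    """
    # Step i: difference between an upper limit and the next lower limit.
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    half = gap / 2
    # Step ii: subtract half from each lower limit, add half to each upper limit.
    return [(lower - half, upper + half) for lower, upper in inclusive_classes]


inclusive = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]
exclusive = to_exclusive(inclusive)
for lower, upper in exclusive:
    print(f"{lower}-{upper}")
```

The frequencies are unchanged by the conversion; only the class limits move.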

3. Open End Series


In some series, the lower class limit of the first class interval and the upper limit of the last
class interval are missing. Instead, ‘less than’ or ‘below’ is specified in place of the lower
class limit of the first class interval and ‘more than’ or ‘above’ is specified in place of the
upper class limit of the last class interval. Such series are called ‘Open-end’ series.

Open End Series

Marks           Frequency
Below 5         1
5-10            3
10-15           4
15-20           6
20 and above    1

4. Cumulative Frequency Series


Cumulative frequency series is that series in which the frequencies are continuously added
corresponding to each class interval in the series.
Let us proceed with an illustration of converting Simple Frequency Series into a Cumulative
Frequency Series.

Simple Frequency Series

Marks Frequency
5-10 3
10-15 8
15-20 9
20-25 4
25-30 4

Cumulative Frequency Series

Method I                               Method II
Marks           Number of Students     Marks           Number of Students
Less than 10    0 + 3 = 3              More than 5     28
Less than 15    3 + 8 = 11             More than 10    28 - 3 = 25
Less than 20    11 + 9 = 20            More than 15    25 - 8 = 17
Less than 25    20 + 4 = 24            More than 20    17 - 9 = 8
Less than 30    24 + 4 = 28            More than 25    8 - 4 = 4
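
Both methods can be reproduced with a short sketch. The variable names are assumptions for this illustration; the frequencies are those of the simple frequency series above.

```python
from itertools import accumulate

classes = [(5, 10), (10, 15), (15, 20), (20, 25), (25, 30)]
frequencies = [3, 8, 9, 4, 4]

# Method I ('less than'): running total of frequencies up to each upper limit.
less_than = list(accumulate(frequencies))

# Method II ('more than'): total minus everything below each lower limit.
total = sum(frequencies)
more_than = [total - below for below in [0] + less_than[:-1]]

for (lower, upper), lt, mt in zip(classes, less_than, more_than):
    print(f"Less than {upper}: {lt:<4} More than {lower}: {mt}")
```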

Conversion of Cumulative Frequency series into Simple Frequency Series

Illustration.
Convert the following cumulative frequency series into a simple frequency series.
4 students obtained less than 10 marks
20 students obtained less than 20 marks
40 students obtained less than 30 marks
48 students obtained less than 40 marks
50 students obtained less than 50 marks

Conversion of Cumulative Frequency Series into a Simple Frequency Series

Cumulative Frequency Series               Simple Frequency Series
Marks less than    Number of students     Marks    Number of students
10                 4                      0-10     4
20                 20                     10-20    20 - 4 = 16
30                 40                     20-30    40 - 20 = 20
40                 48                     30-40    48 - 40 = 8
50                 50                     40-50    50 - 48 = 2
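
The subtraction of successive cumulative frequencies can be sketched as below. The function name `to_simple` and the assumption that the first class starts at 0 are made for this illustration only.

```python
def to_simple(upper_limits, cumulative, first_lower=0):
    """Turn a 'less than' cumulative series into a simple frequency series.

    Each class frequency is the cumulative frequency at its upper limit
    minus the cumulative frequency at the previous upper limit.
    `first_lower` (assumed 0 here) is the lower limit of the first class.
    """
    rows = []
    prev_limit, prev_cum = first_lower, 0
    for limit, cum in zip(upper_limits, cumulative):
        rows.append((prev_limit, limit, cum - prev_cum))
        prev_limit, prev_cum = limit, cum
    return rows


upper_limits = [10, 20, 30, 40, 50]
cumulative = [4, 20, 40, 48, 50]

simple = to_simple(upper_limits, cumulative)
for lower, upper, freq in simple:
    print(f"{lower}-{upper}  {freq}")
```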
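5. Mid-Values Frequency Series
In a mid-values frequency series, the mid-values of the class intervals are given in place of
the class intervals themselves.

Illustration

Mid-value    5    15    25    35    45
Frequency    6    5     11    9     8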

Such series may be converted into simple frequency series using the following method: (i)
First, the mutual difference between mid-values ($i$) is determined; and (ii) Second, the
difference so obtained is reduced to half ($\frac{1}{2} i$), which when deducted from the mid-value
gives the lower limit of the class interval and when added to the mid-value gives the
corresponding upper limit.

Thus, Lower limit: $l_1 = m - \frac{1}{2} i$
Upper limit: $l_2 = m + \frac{1}{2} i$
Where, $m$ = mid-value; $i$ = difference between mid-values; $l_1$ = lower limit and
$l_2$ = upper limit.

Conversion of a Series with Mid-values into a Series with Class Intervals

Mid-value    Frequency    Classes    Technique
5            6            0-10       $l_1 = 5 - \frac{10}{2} = 0$, $l_2 = 5 + \frac{10}{2} = 10$
15           5            10-20      $l_1 = 15 - \frac{10}{2} = 10$, $l_2 = 15 + \frac{10}{2} = 20$
25           11           20-30      $l_1 = 25 - \frac{10}{2} = 20$, $l_2 = 25 + \frac{10}{2} = 30$
35           9            30-40      $l_1 = 35 - \frac{10}{2} = 30$, $l_2 = 35 + \frac{10}{2} = 40$
45           8            40-50      $l_1 = 45 - \frac{10}{2} = 40$, $l_2 = 45 + \frac{10}{2} = 50$
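
The same technique can be sketched in code. This sketch assumes the mid-values are equally spaced, so that the difference between the first two mid-values applies to the whole series; the function name `classes_from_mid_values` is hypothetical.

```python
def classes_from_mid_values(mid_values):
    """Recover class limits from equally spaced mid-values.

    i is the difference between neighbouring mid-values; each class runs
    from m - i/2 (lower limit) to m + i/2 (upper limit).
    """
    i = mid_values[1] - mid_values[0]
    half = i / 2
    return [(m - half, m + half) for m in mid_values]


mid_values = [5, 15, 25, 35, 45]
frequencies = [6, 5, 11, 9, 8]

for (lower, upper), freq in zip(classes_from_mid_values(mid_values), frequencies):
    print(f"{lower:g}-{upper:g}  {freq}")
```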
