
Organisation of Data

Classification involves arranging data into groups based on common characteristics. It has several objectives like simplifying data, explaining similarities and differences, and preparing data for further analysis. There are different methods of classification like geographical, chronological, qualitative, and quantitative. Variables refer to characteristics that can vary, and can be discrete or continuous. Frequency distribution arranges the frequencies and associated values of a variable. Statistical series organize classified data in a logical order.

Uploaded by

Tanu Bhati

ORGANISATION OF DATA

Meaning of classification

• Classification is the process of arranging data into sequences and groups according to their common characteristics, or separating them into different but related parts.
• On the basis of the chosen characteristics, the similarities and dissimilarities among the various items are noted, and items exhibiting similarity are grouped together in one class. Through classification, we try to strike a note of homogeneity in the elements of the collected information.
Objectives of classification

• To simplify and condense the mass of data.
• To explain similarity and dissimilarity of data.
• To facilitate comparisons.
• To study the relationship.
• To prepare the data for tabulation.
• To present a mental picture.
Requisite of a good classification
Methods of classification
Geographical classification – Data is classified according to geographical location or region.
Population of 5 states of India (as per Census 2011)

State             Population (in 000)
Andhra Pradesh    84665
Tamil Nadu        72138
Rajasthan         68621
Karnataka         61130
Gujarat           60383

Chronological classification – Data is classified with respect to different periods of time. It is also known as temporal classification.

Year    Population (in 000)
1951    1744
1961    2659
1971    4066
1981    6220
1991    9421
• Qualitative classification – Here data is classified on the basis of descriptive characteristics or attributes such as sex, literacy, region, caste, education etc., which cannot be quantified.
• This type of classification is of two types –
Simple classification – when facts are classified into two classes according to one attribute only, the classification is said to be simple. For example:
Population
    Male
    Female
• Manifold classification – when facts are classified in accordance with more than one attribute, or when each class is subdivided into more than two sub-classes, the classification is said to be manifold.
Quantitative classification

• In this classification, data is classified on the basis of characteristics which can be measured, such as height, weight, income, expenditure, production, sales etc. For example:
Age distribution of 500 students of a school

Age (in years)    Number of students
10-12             150
12-14             130
14-16             100
16-18             120
Total students    500
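As a rough illustration of how a table like the age distribution above could be built from raw observations, the following Python sketch counts values into equal-width classes. The function name `classify`, the sample `ages`, and the convention that the upper limit of each class is excluded are all assumptions made for this sketch; the raw ages behind the table are not given in the source.

```python
from collections import Counter

def classify(values, lower, width, n_classes):
    """Count values into n_classes equal-width classes starting at `lower`.

    Assumed convention: a value equal to a class's upper limit is counted
    in the next class (for example, 12 goes into 12-14, not 10-12).
    """
    counts = Counter()
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_classes:  # values outside the classes are ignored
            counts[index] += 1
    table = []
    for i in range(n_classes):
        lo = lower + i * width
        table.append((f"{lo}-{lo + width}", counts[i]))
    return table

# Hypothetical ages (in years) of ten students, not taken from the table.
ages = [10, 11.5, 12, 13, 13.9, 14, 15, 16, 17, 17.5]

age_table = classify(ages, lower=10, width=2, n_classes=4)
for label, frequency in age_table:
    print(label, frequency)
```

The same function works for any measurable characteristic (height, income, sales) by changing `lower`, `width` and `n_classes`.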
Concept of a variable

• A variable refers to a quantity or characteristic whose value varies from one investigation to another. Age, price, height, weight, wages, expenditure, imports, production etc. are examples of variables.
It may be noted that different variables are measured in different units.
Variables are of two kinds –
Discrete variable (discontinuous variable) – variables which are capable of taking only exact values and not any fractional value are termed discrete variables.
Continuous variable – variables which can take all the possible values in a given specified range.
Frequency and frequency distribution

• Frequency refers to the number of times a given value appears in a distribution.
• A table in which the frequencies and the associated values of a variable are written side by side is known as a frequency distribution.
• A frequency distribution can be discrete or continuous, depending upon whether the variable is discrete or continuous.
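A discrete frequency distribution can be sketched in a few lines of Python. The data below (number of children in twelve families) is hypothetical and only serves to show values and their frequencies written side by side.

```python
from collections import Counter

# Hypothetical data: number of children in each of 12 families.
children = [2, 1, 3, 2, 0, 2, 1, 4, 2, 3, 1, 2]

# Counter records how many times each value appears (its frequency).
frequency = Counter(children)

# Sort by value so the table reads from the smallest value to the largest.
distribution = sorted(frequency.items())

print("Value  Frequency")
for value, count in distribution:
    print(f"{value:<6} {count}")
```

The frequencies always add up to the number of observations, which is a quick check on any frequency table.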
Statistical Series on the basis of construction
• The arrangement of classified data in some logical order, such as according to size, according to time, or according to some other measurable or non-measurable characteristic, is known as a statistical series.
Types of statistical series on the basis of construction
Statistical series
    Individual series
    Discrete series
    Continuous series
        Exclusive series
        Inclusive series
        Open end series
        Cumulative frequency series
        Equal and unequal class interval series
        Mid value series
Statistical series on the basis of characteristics
• Time series – if the different values that a variable has taken over a period of time are arranged in chronological order, the series so obtained is called a time series.
• Spatial series – data arranged according to location and geographical considerations form a spatial series.
• Condition series – in this series, data is classified according to the changes occurring under certain conditions.

Types of continuous series
Exclusive series – classes of the type 10-20, 20-30, 30-40 etc., wherein the upper limit of one class interval becomes the lower limit of the next class, are known as exclusive classes. Such classification ensures continuity of data because the upper limit of one class is the lower limit of the succeeding class. An observation equal to the upper limit of a class is excluded from that class and counted in the next one (for example, 20 falls in 20-30).
Inclusive series – classes of the type 10-19, 20-29, 30-39 etc., wherein all observations with magnitude greater than or equal to the lower limit and less than or equal to the upper limit of a class are included in it, are known as inclusive classes.
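The difference between the two conventions shows up only at the class boundaries. The sketch below is a minimal illustration, assuming whole-number marks and a class width of 10; the function names are hypothetical.

```python
def exclusive_class(value, width=10):
    """Classes of the type 10-20, 20-30: lower limit included, upper excluded."""
    lower = (value // width) * width
    return f"{lower}-{lower + width}"

def inclusive_class(value, width=10):
    """Classes of the type 10-19, 20-29: both limits included.

    Assumes whole-number data, so no value falls between 19 and 20.
    """
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"

# Compare how boundary values are placed under each convention.
for mark in (19, 20, 29):
    print(mark, exclusive_class(mark), inclusive_class(mark))
```

A mark of 20 lands in 20-30 under the exclusive convention and in 20-29 under the inclusive one; in both cases it is kept out of the class that ends at or just below 20.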
Open-end distribution – In a frequency distribution, if the lower limit of the first class and the upper limit of the last class are not given, it is known as an open-end distribution.

Marks       No. of students
Below 20    7
20-30       6
30-40       12
40-50       15
Above 50    8

Why is an open-end distribution used?

When a few items of the data are very small or considerably large, they are known as extreme items or extreme values. If we include all the extreme values, we may need so many class intervals that the frequency distribution becomes unwieldy. To avoid these unwanted classes, open-end classes are used.
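The effect of open-end classes can be sketched as follows. The marks are hypothetical, and placing a mark of exactly 50 in "Above 50" is an assumption that follows the exclusive convention (the class 40-50 excludes 50).

```python
from collections import Counter

def open_end_class(mark):
    """Assign a mark to the classes Below 20, 20-30, 30-40, 40-50, Above 50."""
    if mark < 20:
        return "Below 20"   # first class has no lower limit
    if mark >= 50:
        return "Above 50"   # last class has no upper limit (assumed to include 50)
    lower = (mark // 10) * 10
    return f"{lower}-{lower + 10}"

# Hypothetical marks with two extreme values (3 and 97).
marks = [3, 18, 25, 40, 49, 50, 97]

open_end_table = Counter(open_end_class(m) for m in marks)
for label in ("Below 20", "20-30", "30-40", "40-50", "Above 50"):
    print(label, open_end_table[label])
```

With closed classes of width 10, the same data would need ten classes from 0-10 to 90-100, most of them empty; the two open-end classes absorb the extreme values instead.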
• Less than cumulative frequency distribution – In a less than cumulative frequency distribution, the frequencies of each class interval are added successively from top to bottom. Each cumulative frequency represents the number of observations less than the upper limit of the class to which it relates.

Marks    No. of students
0-10     2
10-20    5
20-30    10
30-40    12
40-50    17
50-60    4

Marks           No. of students
Less than 10    2
Less than 20    2 + 5 = 7
Less than 30    2 + 5 + 10 = 17
Less than 40    2 + 5 + 10 + 12 = 29
Less than 50    2 + 5 + 10 + 12 + 17 = 46
Less than 60    2 + 5 + 10 + 12 + 17 + 4 = 50
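The successive addition in the table above is a running total, which can be checked with a short Python sketch. The frequencies are those of the marks table in this section; the variable names are chosen for illustration.

```python
from itertools import accumulate

# Upper limits and frequencies of the classes 0-10, 10-20, ..., 50-60.
upper_limits = [10, 20, 30, 40, 50, 60]
frequencies = [2, 5, 10, 12, 17, 4]

# Running total from the first class to the last.
less_than = list(accumulate(frequencies))

for limit, total in zip(upper_limits, less_than):
    print(f"Less than {limit}: {total}")
```

The last cumulative frequency equals the total number of students (50).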
• More than cumulative frequency distribution – In a more than cumulative frequency distribution, the cumulative frequency of each class interval is obtained by adding the frequencies successively, starting from the highest value of the variable and moving to the lowest. Each cumulative frequency represents the number of observations at or above the lower limit of the class to which it relates.

Marks           No. of students
More than 0     50
More than 10    48
More than 20    43
More than 30    33
More than 40    21
More than 50    4
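The more than column can be reproduced by accumulating the class frequencies from the last class back to the first. This is a minimal sketch using the class frequencies of the marks distribution (2, 5, 10, 12, 17, 4 for the classes 0-10 to 50-60); the variable names are illustrative.

```python
from itertools import accumulate

# Lower limits and frequencies of the classes 0-10, 10-20, ..., 50-60.
lower_limits = [0, 10, 20, 30, 40, 50]
frequencies = [2, 5, 10, 12, 17, 4]

# Accumulate from the highest class down, then restore the original order.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for limit, total in zip(lower_limits, more_than):
    print(f"More than {limit}: {total}")
```

Equivalently, each entry is the total (50) minus the frequencies of all classes below that lower limit, for example 50 - 2 = 48 for "More than 10".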
• Bivariate frequency distribution – when data is classified on the basis of two variables, such as height and weight or marks in statistics and marks in economics, the distribution is known as a bivariate frequency distribution.
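A bivariate frequency distribution can be sketched by counting pairs of class labels. The marks below are hypothetical, and the helper `class_label` assumes exclusive classes of width 10.

```python
from collections import Counter

def class_label(value, width=10):
    """Exclusive class (lower limit included, upper excluded) containing value."""
    lower = (value // width) * width
    return f"{lower}-{lower + width}"

# Hypothetical (statistics, economics) marks of eight students.
marks = [(12, 25), (18, 14), (25, 28), (27, 35),
         (33, 31), (38, 22), (15, 11), (22, 24)]

# Each cell of the two-way table is keyed by (statistics class, economics class).
bivariate = Counter((class_label(s), class_label(e)) for s, e in marks)

classes = ["10-20", "20-30", "30-40"]
print("Stat \\ Eco  " + "  ".join(classes))
for row in classes:
    cells = "  ".join(f"{bivariate[(row, col)]:^5}" for col in classes)
    print(f"{row:<11} {cells}")
```

Each row total gives the frequency distribution of statistics marks alone, and each column total gives that of economics marks alone.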
