
12-12-2022 Types of Biological Data 1

The document discusses different types of biological data. It distinguishes between qualitative and quantitative data. Qualitative data is based on non-numerical attributes that can be assessed as present or absent, while quantitative data has numerical features. Qualitative data can be on a nominal or ordinal scale, and quantitative data on a metric scale. Nominal data cannot be ranked, ordinal data can be ranked but not based on numeric values, and metric data uses actual numeric measurements with units. The document provides examples of different types of data related to clinical conditions like diabetes.

Uploaded by

Yoshita Singh
Copyright © All Rights Reserved

Types of Biological data

Data are the raw material of statistics and usually consist of numbers.

Data can be obtained from measurements made with instruments, such as blood glucose, height or weight, or from counts. Any group of measurements of interest can constitute data.

A single recorded observation is referred to as a datum; observations taken collectively are called data.

Statistics deals with the collection, organization, summarization and analysis of data. It is the use of data by a decision maker to reach better decisions. It also deals with drawing inferences about a whole body of data from a portion of it.

In biostatistics, data are obtained from plant sciences, animal sciences and medicine and are used to manage conditions that are uncertain.

Data are broadly classified into qualitative and quantitative data. Qualitative data are based on non-numerical features, or qualitative observations of elementary units, called attributes, which can be assessed as present or absent, whereas quantitative data possess numerical features.

The individual items in a population are referred to as elementary units.

Qualitative data can be represented on a nominal scale or an ordinal scale.

Data on a nominal scale can be neither graded nor ranked; each category is described only by its name. At times nominal data have a dichotomous (binary) classification. Data on an ordinal scale can be graded or ranked, but the ranks are not numeric measurements.

Data based on a metric scale of measurement are quantitative data and are reported with the unit of measurement.

To differentiate between nominal, ordinal and quantitative data, let us consider the example of the clinical condition diabetes.

The occurrence of the disease is nominal data. Classifying the condition as present or absent gives binary data. Characterizing the extent of the disease as borderline, mild, moderate or severe grades or ranks the condition, which makes it ordinal data; for convenience the grades can be denoted by codes such as 0, 1, 2, 3 while tabulating the data. Quantitative data can also be expressed as ordinal data. The exact measurement of blood glucose is on a metric scale. In metric measurements every value is an approximation, and it need not be recorded more precisely than required.
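
As a minimal sketch of the three kinds of data in the diabetes example, the Python snippet below stores one hypothetical patient record. The field names, the code assignment 0 to 3 and the glucose value are assumptions made only for illustration; they are not clinical standards.

```python
# Ordinal codes for severity; the order of the codes carries the ranking.
# The 0-3 assignment is an illustrative assumption, not a clinical standard.
SEVERITY_CODES = {"borderline": 0, "mild": 1, "moderate": 2, "severe": 3}

# One hypothetical patient record (values invented for illustration).
patient = {
    "diabetes_present": True,      # nominal, binary: present or absent
    "severity": "moderate",        # ordinal: can be ranked
    "blood_glucose_mg_dl": 182.0,  # metric: numeric value with a unit
}


def severity_code(label):
    """Return the ordinal code used when tabulating a severity label."""
    return SEVERITY_CODES[label]


def is_more_severe(a, b):
    """Ordinal data support ranking: compare two severity labels."""
    return severity_code(a) > severity_code(b)


print(severity_code(patient["severity"]))  # 2
print(is_more_severe("severe", "mild"))    # True
```

The codes only preserve order: a code of 2 does not mean the condition is twice as severe as a code of 1.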

Qualitative data are easy to obtain. Different types of data play different roles in managing an uncertain condition.

Blood pressure is measured using a sphygmomanometer. For more accuracy it can be measured to the nearest mm Hg, but it need not be measured in decimals. A quantitative presentation such as 96 mm Hg and 98 mm Hg (diastolic pressure) is required to monitor the prognosis of a patient. For convenience it can also be presented as mild, moderate or severe hypertension.

Blood biochemical parameters such as cholesterol, creatinine, phosphorus and hemoglobin, and various enzymes that increase after myocardial infarction, such as creatine phosphokinase, lactate dehydrogenase and serum glutamate oxaloacetate transaminase, are measured quantitatively with specific instruments and have to be reported with the appropriate unit, such as IU or katal (the SI unit) for enzyme activity.

Measurements on a metric scale are of two types: ratio scale and interval scale.

Ratio scale

1. Measurements have a constant interval size and a true zero, viz. age and number of children.
2. Measurements are easy to handle: they can be added, multiplied and divided.
3. A 15 cm plant is half as tall as a 30 cm plant.
4. A plant with 90 leaves has three times as many leaves as a plant with 30 leaves.
5. A difference in weight from 40 to 50 kg is the same as the difference from 50 to 60 kg.

Interval scale

1. Measurements have a constant interval size but no absolute zero, viz. a temperature of 0 °C does not mean no temperature.
2. It cannot be said that 44 °C is twice as hot as 22 °C.
3. Some interval scales in biological data are circular, viz. the 24 hr circadian rhythm and the 365 days of the year.
4. The difference between 2 pm and 4 pm is the same as the difference between 10 pm and 12 midnight.
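
The contrast between the two scales can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the conversion to kelvin is brought in as an example of a temperature scale that does have a true zero.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to kelvin (ratio scale, true zero)."""
    return c + 273.15


def circular_hours_between(start, end, period=24):
    """Elapsed hours from start to end on a circular clock of given period."""
    return (end - start) % period


# Ratio scale: a 30 cm plant is twice as tall as a 15 cm plant.
print(30 / 15)  # 2.0

# Interval scale: 44 / 22 = 2, but the ratio of absolute temperatures
# is only slightly above 1, so 44 deg C is not "twice as hot" as 22 deg C.
ratio = celsius_to_kelvin(44) / celsius_to_kelvin(22)
print(round(ratio, 3))

# Circular scale: 14:00 to 16:00 and 22:00 to midnight (hour 0) are both 2 h.
print(circular_hours_between(14, 16))  # 2
print(circular_hours_between(22, 0))   # 2
```

The modulo operation is what makes the clock circular: without it, the difference between 22:00 and 0:00 would come out as -22 hours.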

A characteristic that differs from one biological entity to another is called a variable. The number of leaves counted in a group of plants and the height measured are examples of such characteristics. A variable is a characteristic that may take different values at different places, times and situations. It varies in amount and magnitude in a frequency distribution.

A variable can be qualitative or quantitative. When only two possibilities exist, the measurement is binary, e.g. gender recorded as male or female, or recovery or non-recovery from a disease.

Data are categorized on the basis of the characteristic studied. When a variable can take only distinct, countable values and cannot assume fractional or decimal values, it is discrete, e.g. the number of children in a family. A continuous variable, on the other hand, can take every possible fractional value within its range, e.g. the height and weight of persons. The series of data obtained for a discrete variable, based on counts, is a discrete frequency distribution, and the series obtained for a continuous variable, based on measurement, is a continuous frequency distribution.
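
A discrete frequency distribution is simply a tally of how often each value occurs. The following sketch builds one from a small list of raw counts; the list is invented for illustration and is not data from this document.

```python
from collections import Counter

# Hypothetical raw counts: number of children in each of 12 families.
children_per_family = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2, 4, 1]


def discrete_frequency(values):
    """Return {value: frequency} sorted by value for a discrete variable."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


freq = discrete_frequency(children_per_family)
for value, count in freq.items():
    print(value, count)
```

Because the variable is discrete, every observed value can be its own class and no class intervals are needed.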

Discrete frequency distribution

No. of children in a family    No. of families
0                              50
1                              150
2                              100
3                              40
4                              25
5                              20
6                              10

Continuous frequency distribution

Height (cm)    No. of persons
100-110        25
110-120        36
120-130        55
130-140        70
140-150        84
150-160        90
160-170        60

The number of observations in each class is its frequency. A frequency distribution is a table in which data are grouped into classes and the number in each class is recorded. If the number of items in each class is expressed as a proportion or percentage of the total, the distribution is a relative frequency distribution or percentage distribution.
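
As a short illustration, the relative frequencies for the table of children per family given earlier can be computed as follows. The function and variable names are hypothetical.

```python
# Frequencies from the discrete table: children per family -> no. of families.
families = {0: 50, 1: 150, 2: 100, 3: 40, 4: 25, 5: 20, 6: 10}


def relative_frequencies(freq):
    """Return each class frequency as a proportion of the total frequency."""
    total = sum(freq.values())
    return {cls: count / total for cls, count in freq.items()}


rel = relative_frequencies(families)
for cls, proportion in rel.items():
    # Show both the proportion and the equivalent percentage.
    print(cls, f"{proportion:.3f}", f"{proportion * 100:.1f}%")
```

The proportions add up to 1 (or 100 per cent), which is a quick check that no class has been left out.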

Continuous frequency distribution

The following terms are to be understood when a continuous frequency distribution is formed or when
data classification is done according to class intervals.

Class limits: The lowest and the highest values included in each class constitute the class limits. If the weight is recorded as 4.0 to 4.2, the lowest value is 4.0 and the highest 4.2. The two boundaries are known as the lower and the upper limit. The lower limit is the value below which there can be no observation in the particular class, and the upper limit is the value above which there can be no observation.

Class intervals:

The class interval is the difference between the upper and the lower limit of the class. If the class is 50 to 100, the class interval is 50. The width of the class interval is important when a continuous frequency distribution is constructed. It depends on the following factors.

1. Range of the data (difference between the largest and the smallest item)
2. Details required
3. Number of classes
4. Ease of classification for further statistical calculations
5. Number of observations

The formula for obtaining the class interval i is

i = (L - S) / k

where L is the largest value, S the smallest value and k the number of classes.

If the maximum weight of a chick is 5.0 and the minimum is 3.5, the number of observations is 100, and 10 classes are to be formed,

i = (5.0 - 3.5) / 10 = 0.15
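
The calculation above can be sketched in Python. The function names and the rounding to two decimals are assumptions made for this illustration.

```python
def class_width(largest, smallest, k):
    """Class interval i = (L - S) / k."""
    return (largest - smallest) / k


def class_limits(smallest, width, k, ndigits=2):
    """Return k (lower, upper) pairs of equal width, starting at smallest."""
    return [
        (round(smallest + j * width, ndigits),
         round(smallest + (j + 1) * width, ndigits))
        for j in range(k)
    ]


i = class_width(5.0, 3.5, 10)
limits = class_limits(3.5, i, 10)
print(round(i, 2))  # 0.15
print(limits[:2])   # first two classes
print(limits[-1])   # last class ends at the largest value
```

Rounding the limits keeps floating-point noise out of the printed class boundaries.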

The starting class would be 3.5 to 3.65, the next 3.65 to 3.80, and so on.

How to fix the number of classes?

The number of classes can be fixed arbitrarily, keeping the nature of the problem in mind, or based on Sturges' rule.

The number of classes is decided using the formula

k = 1 + 3.322 log N

where N is the total number of observations and log N is the logarithm of N to base 10.

If 10 observations are made, the number of classes will be

k = 1 + 3.322 log 10 = 1 + 3.322 x 1 = 4.322 or 4

If 100 observations are made, the number of classes will be

k = 1 + 3.322 log 100 = 1 + 3.322 x 2 = 7.644 or 8

The number of classes should be between 4 and 20. It should not be less than 4 even if the number of observations is less than 10, and if N is 10 lakh (1,000,000), k will be 1 + 3.322 x 6 = 20.932.
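
The rule can be evaluated for any N with a few lines of Python. This is a minimal sketch; the function name is hypothetical, and the coefficient 3.322 is the one used in the text.

```python
import math


def sturges_classes(n):
    """Number of classes by Sturges' rule: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)


for n in (10, 100, 1_000_000):
    k = sturges_classes(n)
    # Print the raw value and the nearest whole number of classes.
    print(n, round(k, 3), round(k))
```

Because k grows with the logarithm of N, a hundred-thousand-fold increase in observations (from 10 to 10 lakh) only raises the number of classes from about 4 to about 21.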

To determine the magnitude of the class interval, the following formula is used.

i = Range / (1 + 3.322 log N)

where the range is the difference between the largest and the smallest value. In the above example,

i = (5 - 3.5) / (1 + 3.322 x 2) = 1.5 / 7.644 = 0.196 or 0.2. If we take a class interval of 0.2, the number of classes formed would be 1.5 / 0.2 = 7.5 or 8.

The application of the above formula may give a value involving fractions which has to be approximated.
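
A small sketch of this calculation for the chick weights, including the approximation step, is given below. The function name and the choice to round the width to one decimal are assumptions for illustration.

```python
import math


def sturges_width(largest, smallest, n):
    """Class interval i = range / (1 + 3.322 * log10(n))."""
    return (largest - smallest) / (1 + 3.322 * math.log10(n))


i = sturges_width(5.0, 3.5, 100)
rounded = round(i, 1)  # approximate to a convenient width
# Round the class count up so the largest value is still covered.
n_classes = math.ceil((5.0 - 3.5) / rounded)
print(round(i, 3), rounded, n_classes)  # 0.196 0.2 8
```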

Class frequency

The number of observations corresponding to a particular class is its frequency. In the above example, if the number of chicks weighing between 3.5 and 3.7 is 15, the frequency of that class is 15.

If all the individual class frequencies are added together, the total frequency is obtained. Here the total frequency is 100.

Class Midpoint or Class Mark

It is the value lying halfway between the upper limit and the lower limit.

Mid-point of a class = (upper limit of the class + lower limit of the class) / 2

For further calculations, the midpoint is taken to represent the class in continuous data.
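
For example, the midpoint can be computed as follows (the function name is hypothetical):

```python
def class_midpoint(lower, upper):
    """Midpoint (class mark) = (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


print(class_midpoint(100, 110))            # 105.0
print(round(class_midpoint(3.5, 3.7), 2))  # 3.6
```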

There are two methods of classifying the data according to class intervals.

1. Exclusive method
2. Inclusive method

Exclusive method

If the class intervals are fixed so that the upper limit of one class becomes the lower limit of the next class, it is the exclusive method of classification.

Weight (kg)
3.5 - 3.7
3.7 - 3.9
3.9 - 4.1

The exclusive method ensures continuity of data because the upper limit of one class becomes the lower limit of the next class. Thus in the above example, there are 15 chicks whose weight lies between 3.5 and 3.699 kg. A chick whose weight is 3.7 kg would be included in the class 3.7 to 3.9 kg. This method is widely followed in statistics, although it may confuse a layman who has no statistical background. If the upper limit of one class and the lower limit of the next are the same, the exclusive method of data entry should be followed. The upper limit is always exclusive, and an item with that value is not included in that class.

A better way of expressing data in the exclusive method is

3.5 but under 3.7

3.7 but under 3.9

This avoids confusion, and in practice this approach is preferred.
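
The "lower limit but under upper limit" convention corresponds to half-open intervals. The sketch below assigns a weight to its class in that way, using the standard-library bisect module; the function name and the handling of out-of-range values are assumptions for illustration.

```python
from bisect import bisect_right

# Class boundaries for the chick weights (kg); each class is [lower, upper).
BOUNDARIES = [3.5, 3.7, 3.9, 4.1]


def exclusive_class(weight, boundaries=BOUNDARIES):
    """Return the (lower, upper) class containing weight, or None if outside.

    The upper limit is exclusive: a value equal to a boundary falls in the
    class that starts at that boundary.
    """
    if weight < boundaries[0] or weight >= boundaries[-1]:
        return None
    idx = bisect_right(boundaries, weight) - 1
    return boundaries[idx], boundaries[idx + 1]


print(exclusive_class(3.699))  # (3.5, 3.7)
print(exclusive_class(3.7))    # (3.7, 3.9)
```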

Inclusive method

In the inclusive method, the upper limit of one class is included in that class itself. In the above example, if the class is 3.5 to 3.69, weights of chicks from 3.5 to 3.69 would be included in this class, and a chick with a weight of 3.7 would be placed in the next class.

3.5 - 3.69

3.7 - 3.89

Whether the exclusive or the inclusive method is used depends on whether the variable is continuous or discrete. For a continuous variable the exclusive method is preferred, and for a discrete variable the inclusive method is applicable.

To ensure continuity of the data recorded, inclusive data are converted to exclusive form by using a correction factor.

Correction factor = (Lower limit of succeeding class - Upper limit of preceding class) / 2

Ex. CF = (3.7 - 3.69) / 2 = 0.01 / 2 = 0.005

The obtained value is subtracted from all the lower limits and added to all the upper limits to obtain
the class intervals in exclusive form.

3.495 – 3.695
3.695 – 3.895

The difference between the limits in the inclusive method is 0.19, but after applying the correction factor and converting to exclusive form the difference between the limits becomes 0.2.

Ex.

Variable    Frequency
1-9         10
10-19       25
20-29       36
30-39       20

CF = (10 - 9) / 2 = 0.5

After adjustment, the classes will be

Variable       Frequency
0.5 - 9.5      10
9.5 - 19.5     25
19.5 - 29.5    36
29.5 - 39.5    20
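
The conversion from inclusive to exclusive form can be sketched as follows. The function name is hypothetical, and the sketch assumes the gap between adjacent classes is the same throughout the table.

```python
def to_exclusive(classes):
    """Convert inclusive (lower, upper) classes to exclusive form.

    Correction factor = (lower limit of the next class
                         - upper limit of the current class) / 2.
    It is subtracted from every lower limit and added to every upper limit.
    Assumes the same gap between all adjacent classes.
    """
    cf = (classes[1][0] - classes[0][1]) / 2
    return cf, [(lower - cf, upper + cf) for lower, upper in classes]


inclusive = [(1, 9), (10, 19), (20, 29), (30, 39)]
cf, exclusive = to_exclusive(inclusive)
print(cf)         # 0.5
print(exclusive)  # [(0.5, 9.5), (9.5, 19.5), (19.5, 29.5), (29.5, 39.5)]
```

After the correction, the upper limit of each class equals the lower limit of the next, so no value can fall into a gap between classes.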

The class intervals should be 5, 10 or other multiples of 5, so that the distribution is easy to understand and stratify.

Class intervals should not have values like 3, 7, 19, 26, etc.


When a large data set has a few very small or very large values, has most observations concentrated within a particular range, or has gaps with few or no values, the data can be grouped using open-end class intervals or unequal-sized class intervals.

It is preferable to have equal-sized class intervals to facilitate comparison of data.

If the salary of employees is the criterion for grouping the data, the end classes can be left open to include the few observations with very small and very large values, and class intervals of varying sizes can be used to cover the range where most values fall and to avoid constructing a distribution with too many classes.

Salary             No. of employees
< 5000             15
5000 - 15000       35
15000 - 25000      50
25000 - 35000      80
35000 - 45000      65
45000 - 70000      20
70000 - 150000     10
> 150000           5
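
Open-end and unequal classes can be handled in the same way as equal ones once the inner boundaries are listed. The sketch below is illustrative: the names are hypothetical, and it assumes the exclusive convention, so a salary of exactly 150000 is placed in the top class (written here as ">= 150000").

```python
from bisect import bisect_right

# Inner boundaries from the salary table; the first and last classes are open.
EDGES = [5000, 15000, 25000, 35000, 45000, 70000, 150000]
LABELS = ["< 5000", "5000 - 15000", "15000 - 25000", "25000 - 35000",
          "35000 - 45000", "45000 - 70000", "70000 - 150000", ">= 150000"]


def salary_class(salary):
    """Return the class label, treating each upper limit as exclusive."""
    return LABELS[bisect_right(EDGES, salary)]


print(salary_class(4000))    # < 5000
print(salary_class(5000))    # 5000 - 15000
print(salary_class(200000))  # >= 150000
```

The two open classes need no lower or upper bound in the code: anything below the first edge or at or above the last edge falls into them automatically.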
