Applied Biostatistics

The document provides an overview of applied biostatistics, detailing statistical methods, classifications, applications, and limitations. It explains the importance of descriptive and inferential statistics, the stages of statistical investigation, and various types of data. Additionally, it covers methods for presenting data and measures of central tendency, emphasizing the role of statistics in health sciences and other fields.

Uploaded by

Rinki Chaudhary
Applied Biostatistics

Rinki Chaudhary
Assistant Professor
School of Physics, Humanities and Applied Sciences
Shobhit Deemed University, MEERUT - 250110
Content
• Introduction
• Types of Statistical Methods
• Classification of Biostatistics
• Applications of Statistics
• Limitations of Statistics
• Types of Data
• Methods of presentation of data
• Central Tendency
What Is Statistics?

1. Collecting data (e.g., survey)
2. Presenting data (e.g., charts and tables)
3. Characterizing data (e.g., average)

Why? Data → Analysis → Decision-making

Applied Statistics: the application of statistical methods to solve real
problems involving randomly generated data, and the development of
new statistical methodology motivated by real problems.
Biostatistics is the branch of applied statistics directed toward
applications in the health sciences and biology. Biostatistics in
agriculture is the application of statistical methods to analyze data from
agricultural experiments.
Biostatistics: The tools of statistics are employed in many fields, such as
business, education, psychology, agriculture, and economics, to
mention only a few. When the data being analyzed are derived from
public health, the biological sciences, and medicine, we use the term
biostatistics to distinguish this particular application of statistical tools
and concepts.
Statistical Methods

Statistical methods are of two kinds:
• Descriptive statistics
• Inferential statistics

© 2011 Pearson Education, Inc


Classification of Biostatistics

Descriptive statistics:
A statistical method concerned with the collection, organization,
summarization, and analysis of data from a sample of a population.

Inferential statistics:
A statistical method concerned with drawing conclusions about a
particular population by selecting and measuring a random sample
from that population.
Descriptive Statistics:
Statistical procedures used to summarize, organize, and simplify data.
This process should be carried out in such a way that it reflects the overall
findings.
• Raw data are made more manageable
• Raw data are presented in a logical form
• Patterns can be seen from organized data
Some statistical summaries that are especially common in descriptive
analyses are:
• Measures of central tendency
• Measures of dispersion
• Measures of association
• Cross-tabulation (contingency table)
• Histogram
• Quantiles
Inferential Statistics
• This branch of statistics deals with techniques for drawing conclusions about the
population.
• Inferential statistics builds upon descriptive statistics.
• Inferences are drawn from particular properties of the sample to particular
properties of the population.
• Inferential statistics are used to make generalizations from a sample to a
population.
• They encompass a variety of procedures to ensure that the inferences are sound
and rational, even though they may not always be correct.
Stages in statistical investigation

There are five stages or steps in any statistical investigation

1. Collection of data
• The process of obtaining measurements or counts.

2. Organization of data
• Includes editing, classifying, and tabulating the data collected

3. Presentation of data
• Gives an overall view of what the data actually look like
• Facilitates further statistical analysis
• Can be done in the form of tables, graphs, or diagrams
4. Analysis of data
• To dig out useful information for decision making
• It involves extracting relevant information from the data (like mean, median,
mode, range, variance. . . )

5. Interpretation of data
• Concerned with drawing conclusions from the data collected and analyzed, and
giving meaning to the results of the analysis
• A difficult task that requires a high degree of skill and experience
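
The analysis stage (stage 4) can be illustrated with a short sketch. The data below are hypothetical ages of 10 patients, not values from these slides, and the summaries are computed with Python's standard `statistics` module.

```python
import statistics

# Hypothetical sample (assumption): ages in years of 10 patients.
ages = [23, 35, 35, 41, 29, 35, 52, 41, 38, 31]

# The summaries named in stage 4: mean, median, mode, range, variance.
summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    "range": max(ages) - min(ages),
    "variance": statistics.variance(ages),  # sample variance (divides by n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Interpretation (stage 5) is still left to the analyst: the code only extracts the numbers.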
Definition of Some basic terms
Population:
• Population is the complete set of possible measurements for which inferences are
to be made.

Census:
• a complete enumeration of the population. But in most real problems it cannot be
realized, hence we take sample.

Sample:
• A sample from a population is the set of measurements that are actually collected
in the course of an investigation.

Parameter:
• Characteristic or measure obtained from a population.
Data:
• Refers to a collection of facts, values, observations, or measurements that the
variables can assume.

Statistics:
• Statistics is a branch of mathematics dealing with data collection, organization,
analysis, interpretation and presentation.

Sampling:
• The process or method of sample selection from the population.

Sample Size:
• The number of elements or observations to be included in the sample.
Applications of Statistics

• Statistics is applied in almost all fields of human endeavor.
• Almost everyone encounters numerical facts in daily life,
e.g. about prices.
• It is applicable in processes such as the development of certain drugs
and the assessment of the extent of environmental pollution.
• It is used in industry, especially in quality control.
Uses of Statistics
The main function of statistics is to enlarge our knowledge of complex phenomena. The
following are some uses of statistics:

• Presenting facts in a definite and precise form.
• Data reduction.
• Measuring the magnitude of variations in data.
• Furnishing a technique of comparison.
• Estimating unknown population characteristics.
• Formulating and testing hypotheses.
• Studying the relationship between two or more variables.
• Forecasting future events.
Limitations of Statistics
As a science statistics has its own limitations. The following are some of
the limitations:

• Deals only with quantitative information.
• Deals only with aggregates of facts and not with individual data items.
• Statistical data are only approximately, not mathematically, correct.
• Statistics can be easily misused and therefore should be used by experts.
Types of Data

Data are of two types:
• Quantitative data
• Qualitative data

Quantitative Data

Measured on a numeric scale.
• Number of defective items in a lot.
• Salaries of CEOs of oil companies.
• Ages of employees at a company.
Qualitative Data
Classified into categories.
• College major of each student in a class.
• Gender of each employee at a company.
• Method of payment (cash, check, credit card).

Source of Data

• Primary data: collected directly from the items or individual respondents
for the purpose of a particular study.

• Secondary data: collected and statistically treated by another person or
agency, and then used for a different purpose.
Frequency: the number of times that something occurs.
The notation fx denotes the frequency, or number of times, the value x occurs.

Relative frequency: the frequency divided by the sum of all frequencies, that is,
by the sample size n.
• For example, with n = 30: 1/30 × 100 ≈ 3.3% and 7/30 × 100 ≈ 23.3%

Frequency distribution: a table listing all observed values of the variable being
studied and how many times each value is observed.

Cumulative frequency: the running total of the frequencies up to and including
each value or category.

Cumulative relative frequency: the sum of all relative frequencies up to and
including each category.
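
As a rough sketch of these definitions, the function below builds a frequency distribution with relative, cumulative, and cumulative relative frequencies. The observations (number of clinic visits for 10 patients) and the name `frequency_table` are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical observations (assumption): clinic visits for 10 patients.
visits = [1, 2, 2, 3, 1, 2, 4, 3, 2, 1]

def frequency_table(values):
    """Return rows of (value, f, relative f, cumulative f, cumulative relative f)."""
    counts = Counter(values)   # frequency of each distinct value
    n = len(values)            # sample size = sum of all frequencies
    rows = []
    cumulative = 0
    for value in sorted(counts):
        f = counts[value]
        cumulative += f        # running total of frequencies
        rows.append((value, f, f / n, cumulative, cumulative / n))
    return rows

table = frequency_table(visits)
for value, f, rel, cum, cum_rel in table:
    print(f"{value:>5} {f:>3} {rel:>6.1%} {cum:>4} {cum_rel:>7.1%}")
```

The last row's cumulative frequency equals n and its cumulative relative frequency equals 1, which is a quick check that no observation was lost.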
Methods of presentation of data

1. Numerical presentation
2. Graphical presentation
3. Mathematical presentation
1- Numerical presentation
Tabular presentation (simple or complex)

Simple frequency distribution table (S.F.D.T.)

Layout of the table:
• Title
• First column: name of variable (units of variable), one row per category
• Second column: frequency
• Third column: %
• Last row: total
Table (I): Distribution of 50 patients at the surgical department of
Alexandria hospital in May 2008 according to their blood groups

Blood group   Frequency   %
A             12          24
B             18          36
AB            5           10
O             15          30
Total         50          100
Table (II): Distribution of 50 patients at the surgical department
of Alexandria hospital in May 2008 according to their age

Age (years)   Frequency   %
20-<30        12          24
30-<40        18          36
40-<50        5           10
50+           15          30
Total         50          100
Complex frequency distribution Table

Table (III): Distribution of 20 lung cancer patients at the chest department
of Alexandria hospital and 40 controls in May 2008 according to smoking

                  Lung cancer
Smoking       Cases          Control        Total
              No.    %       No.    %       No.    %
Smoker        15     75      8      20      23     38.33
Non-smoker    5      25      32     80      37     61.67
Total         20     100     40     100     60     100
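
The percentages in Table (III) are column percentages: each count is divided by the total of its own column. A minimal sketch of that calculation follows; the counts are taken from the table, while the function name `column_percentages` is a hypothetical helper.

```python
# Counts from Table (III): smoking status among cases and controls.
counts = {
    "Smoker": {"Cases": 15, "Control": 8},
    "Non-smoker": {"Cases": 5, "Control": 32},
}

def column_percentages(table):
    """Add a Total column, then express each cell as a % of its column total."""
    with_total = {
        row: {**cells, "Total": sum(cells.values())}
        for row, cells in table.items()
    }
    columns = list(next(iter(with_total.values())))
    col_totals = {c: sum(r[c] for r in with_total.values()) for c in columns}
    return {
        row: {c: round(100 * cells[c] / col_totals[c], 2) for c in columns}
        for row, cells in with_total.items()
    }

percentages = column_percentages(counts)
for row, cells in percentages.items():
    print(row, cells)
```

Column percentages suit this table because the groups being compared (cases and controls) are the columns.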


2- Graphical presentation

• Graphs drawn using Cartesian coordinates
   • Line graph
   • Frequency polygon
   • Frequency curve
   • Histogram
   • Bar graph
   • Scatter plot
• Pie chart
• Statistical maps
Line Graph

Year   MMR/1000
1960   50
1970   45
1980   26
1990   15
2000   12

Figure: Maternal mortality rate of (country), 1960-2000
(x-axis: year; y-axis: MMR/1000)
Frequency polygon

Age (years)   Males       Females     Mid-point of interval
20-30         3 (12%)     2 (10%)     (20+30) / 2 = 25
30-40         9 (36%)     6 (30%)     (30+40) / 2 = 35
40-50         7 (28%)     5 (25%)     (40+50) / 2 = 45
50-60         4 (16%)     3 (15%)     (50+60) / 2 = 55
60-70         2 (8%)      4 (20%)     (60+70) / 2 = 65
Total         25 (100%)   20 (100%)
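
A frequency polygon joins the points (mid-point of interval, percentage). The sketch below recomputes those points from the counts in the table; `polygon_points` is a hypothetical helper name. Note that 7 of 25 males works out to 28%.

```python
# Class limits and counts from the age-by-sex table above.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
males = [3, 9, 7, 4, 2]
females = [2, 6, 5, 3, 4]

def polygon_points(classes, freqs):
    """Return (mid-point, percentage) pairs, the points a frequency polygon joins."""
    total = sum(freqs)
    return [((lo + hi) / 2, 100 * f / total) for (lo, hi), f in zip(classes, freqs)]

male_points = polygon_points(classes, males)
female_points = polygon_points(classes, females)

for (mid, m_pct), (_, f_pct) in zip(male_points, female_points):
    print(f"mid-point {mid:.0f}: males {m_pct:.0f}%, females {f_pct:.0f}%")
```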
Figure: Distribution of 45 patients at (place), in (time) by age and sex
(frequency polygon; x-axis: age, at the class mid-points 25, 35, 45, 55, 65;
y-axis: %; one line for males and one for females)
Frequency curve

Figure: frequency curves for females and males
(x-axis: age in years, classes 20-, 30-, 40-, 50-, 60-69; y-axis: frequency)
Histogram

Distribution of a group of cholera patients by age

Age (years)   Frequency   %
25-30         3           14.3
30-40         5           23.8
40-45         7           33.3
45-60         4           19.0
60-65         2           9.5
Total         21          100

Figure: Distribution of 21 cholera patients at (place), in (time) by age
(x-axis: age in years; y-axis: %)
Bar Chart (Bar Graph):

• Place categories on the horizontal axis.

• Place frequency (or relative frequency) on the vertical axis.

• Construct vertical bars of equal width, one for each category.

• The height of each bar is proportional to the frequency (or relative
frequency) of its category.
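Bar chart

Figure: simple bar chart (x-axis: marital status, with categories Single,
Married, Divorced, Widowed; y-axis: %)

Bar chart

Figure: multiple bar chart with one bar for males and one for females in each
category (x-axis: marital status, with categories Single, Married, Divorced,
Widowed; y-axis: %)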
Steps to create a pie-chart
• Construct a frequency table

• Calculate the relative frequency (%) of each category

• Convert the percentages into degrees,
where: degrees = (percentage / 100) × 360°.
• Draw a circle and divide it accordingly

For a single variable:
For example, in a class of 40 students, 15 are boys and 25 are
girls, so boys make up 37.5% of the circle and girls 62.5%.
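
The conversion from frequencies to slice angles can be sketched as follows, using the class of 40 students from the example. The function name `pie_angles` is a hypothetical helper.

```python
def pie_angles(frequencies):
    """Convert category frequencies to pie-chart angles in degrees."""
    total = sum(frequencies.values())
    # relative frequency × 360° gives the angle of each slice
    return {category: 360 * f / total for category, f in frequencies.items()}

angles = pie_angles({"Boys": 15, "Girls": 25})
for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

The angles always add up to 360°, which is a convenient check before drawing the circle.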
Measure of Central Tendency

Measures of central tendency, as the name suggests, are numerical measurements of
the central part of the distribution. They are also called averages or measures of
location, because they show the location of the center of the distribution from
which the data were sampled.
According to Professor Bowley, averages are "statistical constants which enable us
to comprehend in a single effort the significance of the whole." In other words, they
are numbers that tell us where the majority of values in the distribution are located.
An example is the average mark in the distribution of marks of all the students of a
class. The averages commonly used in biostatistics are:
1. Mean or arithmetic mean
2. Median
3. Mode
MEAN ‘OR’ ARITHMETIC MEAN
MEAN OF INDIVIDUAL ITEM
MEAN IN DIFFERENT FREQUENCY DISTRIBUTION
MEAN IN CONTINUOUS DISTRIBUTION

In a continuous distribution, the class intervals and their corresponding
frequencies are given. We first find the mid-values of the classes and treat them as
the variable values, then apply formula (2) to calculate the arithmetic mean.
Example 3. For the data given in Table 2 below on the systolic BP of 68
patients, calculate the arithmetic mean.

Systolic BP (mm Hg)   Frequency
90-100                3
100-110               5
110-120               7
120-130               10
130-140               15
140-150               11
150-160               9
160-170               6
170-180               2
Solution. For the calculation of the mean we prepare the following table:

Systolic BP (mm Hg)   Frequency (f)   Mid value (x)   fx
90-100                3               95              285
100-110               5               105             525
110-120               7               115             805
120-130               10              125             1250
130-140               15              135             2025
140-150               11              145             1595
150-160               9               155             1395
160-170               6               165             990
170-180               2               175             350
Total                 68                              9220

$$\bar{X} = \frac{\sum_{i=1}^{n} f_i x_i}{N} = \frac{9220}{68} \approx 135.6 \text{ mm Hg}$$
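
The same calculation can be checked with a short sketch: mid-values stand in for the class intervals, and the mean is the sum of f·x divided by the total frequency. The classes and frequencies are those of Example 3; `grouped_mean` is a hypothetical helper name.

```python
# Class intervals (mm Hg) and frequencies from Example 3.
classes = [(90, 100), (100, 110), (110, 120), (120, 130), (130, 140),
           (140, 150), (150, 160), (160, 170), (170, 180)]
freqs = [3, 5, 7, 10, 15, 11, 9, 6, 2]

def grouped_mean(classes, freqs):
    """Arithmetic mean of grouped data, using class mid-values as x."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    total_fx = sum(f * x for f, x in zip(freqs, mids))  # sum of f * x
    return total_fx / sum(freqs)                        # divide by N

mean_bp = grouped_mean(classes, freqs)
print(f"Mean systolic BP: {mean_bp:.1f} mm Hg")
```

Because each observation is replaced by its class mid-value, the result is an approximation to the mean of the raw readings.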
WEIGHTED MEAN
Example: The following table gives the platelet count (in lakh/cmm) from
the analysis of blood samples on five different days in a pathology
laboratory. Find the average platelet count per patient.

Day   Platelet count in lakh/cmm (x)   No. of patients (w)
1     0.5                              65
2     0.75                             80
3     1.00                             95
4     1.5                              90
5     2.00                             70
Solution: The table for the calculation of the weighted mean is:

Day     Platelet count in lakh/cmm (x)   No. of patients (w)   wx
1       0.5                              65                    32.5
2       0.75                             80                    60
3       1.00                             95                    95
4       1.5                              90                    135
5       2.00                             70                    140
Total                                    400                   462.5

$$\bar{X}_w = \frac{\sum w x}{\sum w} = \frac{462.5}{400} \approx 1.16 \text{ lakh/cmm}$$
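
A minimal sketch of the weighted mean for this example: each day's platelet count x is weighted by the number of patients w seen that day. The function name `weighted_mean` is a hypothetical helper.

```python
# Platelet counts (lakh/cmm) and numbers of patients for the five days.
x = [0.5, 0.75, 1.00, 1.5, 2.00]
w = [65, 80, 95, 90, 70]

def weighted_mean(values, weights):
    """Weighted mean: sum of w * x divided by sum of w."""
    total_wx = sum(wi * xi for wi, xi in zip(weights, values))
    return total_wx / sum(weights)

avg_platelets = weighted_mean(x, w)
print(f"Average platelet count per patient: {avg_platelets:.2f} lakh/cmm")
```

An unweighted mean of the five daily counts would be 1.15 lakh/cmm; weighting by patients shifts it slightly because the days did not have equal numbers of patients.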
COMBINED MEAN
CORRECTED MEAN

Sometimes a mean is calculated from data in which a value was recorded
wrongly. In that case we replace the wrong value with the correct one
in the total and recompute the mean. The procedure will be clear from
the example given below:

Example: A student calculates the mean of 20 observations as 25.2. Later
he finds that he misread one observation as 34 instead of 43. Find the
correct mean.
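
One way to work this example is sketched below. It assumes that 34 is the wrongly recorded value and 43 is the true value, which is one reading of the problem statement; `corrected_mean` is a hypothetical helper name.

```python
def corrected_mean(reported_mean, n, wrong_value, correct_value):
    """Recover the total, swap the wrong value for the correct one, re-average."""
    reported_total = reported_mean * n                        # 25.2 * 20 = 504
    corrected_total = reported_total - wrong_value + correct_value
    return corrected_total / n

# Assumption: 34 was recorded by mistake and 43 is the true observation.
result = corrected_mean(25.2, 20, wrong_value=34, correct_value=43)
print(f"Corrected mean: {result:.2f}")
```

Under that assumption the total rises from 504 to 513, so the corrected mean is 513 / 20 = 25.65.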
