
1 Introduction To Biostatistics

Biostatistics is the practical application of statistical concepts and techniques to topics in biology. Because biology is such a broad field — studying all forms of life from viruses to trees to fleas to mice to people — biostatistics covers a very wide area, including designing biological experiments, safely conducting research on human beings, collecting and verifying data from those studies, summarizing and displaying that data, and analyzing the data to draw meaningful conclusions from it.

Uploaded by

kriss Wong

INTRODUCTION TO BIOSTATISTICS
Dr. S. Shaffi Ahamed
Asst. Professor
Dept. of Family and Comm. Medicine
KKUH

This session covers:

Origin and development of Biostatistics
Definition of Statistics and Biostatistics
Reasons to know about Biostatistics
Types of data
Graphical representation of data
Frequency distribution of data

Statistics is the science which deals with collection, classification and tabulation of numerical facts as the basis for explanation, description and comparison of phenomena.
------ Lovitt

Origin and development of statistics in Medical Research
In 1929 a huge paper on the application of statistics was published by Dunn in Physiology Journal.
In 1937, 15 articles on statistical methods by Austin Bradford Hill were published in book form.
In 1948, an RCT of Streptomycin for pulmonary TB was published, in which Bradford Hill had a key influence.
Then, from 1952 to 1982, the use of statistics in medicine grew 8-fold.

Douglas Altman

Gauss

Ronald Fisher

Karl Pearson

C.R. Rao

BIOSTATISTICS
(1) Statistics arising out of biological
sciences, particularly from the fields of
Medicine and public health.
(2) The methods used in dealing with
statistics in the fields of medicine, biology
and public health for planning,
conducting and analyzing data which
arise in investigations of these branches.

Reasons to know about biostatistics:
Medicine is becoming increasingly quantitative.
The planning, conduct and interpretation of much of medical research are becoming increasingly reliant on statistical methodology.
Statistics pervades the medical literature.

Example: Evaluation of Penicillin (treatment A) vs Penicillin & Chloramphenicol (treatment B) for treating bacterial pneumonia in children < 2 yrs.
What is the sample size needed to demonstrate the significance of one group against the other?
Is treatment A better than treatment B, or vice versa?
If so, how much better?
What is the normal variation in clinical measurement (mild, moderate & severe)?
How reliable and valid is the measurement (clinical & radiological)?
What is the magnitude and effect of laboratory and technical error?
How does one interpret abnormal values?
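
The sample size question above can be made concrete with a small sketch. It uses the standard normal-approximation formula for comparing two proportions with a two-sided test. The cure rates (70% for treatment A, 85% for treatment B), the 5% significance level and the 80% power are hypothetical values chosen only for illustration; they do not come from the slides, and the function name is likewise illustrative.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Patients needed per group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # value matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical cure rates, NOT taken from the slides.
n_per_group = sample_size_two_proportions(0.70, 0.85)
print(n_per_group)
```

A smaller expected difference between the treatments, or a higher desired power, increases the required number of patients per group.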

CLINICAL MEDICINE
Documentation of medical history of
diseases.
Planning and conduct of clinical studies.
Evaluating the merits of different
procedures.
Providing methods for the definition of normal and abnormal.

PREVENTIVE MEDICINE
To provide the magnitude of any health
problem in the community.
To find out the basic factors underlying the
ill-health.
To evaluate the health programs which were introduced in the community (success/failure).
To introduce and promote health legislation.

WHAT DOES STATISTICS COVER?
Planning
Design
Execution (Data collection)
Data Processing
Data analysis
Presentation
Interpretation
Publication

HOW CAN A BIOSTATISTICIAN HELP?

Design of study
Sample size & power calculations
Selection of sample and controls
Designing a questionnaire
Data Management
Choice of descriptive statistics & graphs
Application of univariate and multivariate
statistical analysis techniques

INVESTIGATION
Data Collection
Data Presentation
  Tabulation
  Diagrams
  Graphs
Descriptive Statistics
  Measures of Location
  Measures of Dispersion
  Measures of Skewness & Kurtosis
Inferential Statistics
  Estimation
    Point estimate
    Interval estimate
  Hypothesis Testing
    Univariate analysis
    Multivariate analysis

TYPES OF DATA
QUALITATIVE DATA
DISCRETE QUANTITATIVE
CONTINUOUS QUANTITATIVE

QUALITATIVE
Nominal
Example: Sex ( M, F)
Exam result (P, F)
Blood Group (A,B, O or AB)
Color of Eyes (blue, green,
brown, black)

ORDINAL
Example:
Response to treatment
(poor, fair, good)
Severity of disease
(mild, moderate, severe)
Income status (low, middle,
high)

QUANTITATIVE (DISCRETE)
Example: The no. of family members
The no. of heart beats
The no. of admissions in a day

QUANTITATIVE (CONTINUOUS)
Example: Height, Weight, Age, BP, Serum Cholesterol and BMI

Discrete data -- Gaps between possible values (example: Number of Children)
Continuous data -- Theoretically, no gaps between possible values (example: Hb)

CONTINUOUS DATA converted to DISCRETE DATA
Wt. (in kg): underweight, normal & overweight
Ht. (in cm): short, medium & tall

Table 1 Distribution of blunt injured patients according to hospital length of stay

Hospital length of stay    Number    Percent
1-3 days                     5891       43.3
4-7 days                     3489       25.6
2 weeks                      2449       18.0
3 weeks                       813        6.0
1 month                       417        3.1
More than 1 month             545        4.0
Total                       13604      100.0

Mean = 7.85, SE = 0.10
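
The Percent column of a table like Table 1 is each count divided by the sum of the counts. A minimal sketch, using the six counts from Table 1 (the variable names are illustrative):

```python
# Counts per length-of-stay category, as listed in Table 1.
counts = {
    "1-3 days": 5891,
    "4-7 days": 3489,
    "2 weeks": 2449,
    "3 weeks": 813,
    "1 month": 417,
    "More than 1 month": 545,
}

total = sum(counts.values())

# Percentage of patients in each category, rounded to one decimal place.
percents = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{label:<20} {n:>6} {percents[label]:>6.1f}")
print(f"{'Total':<20} {total:>6}")
```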

Scale of measurement
Qualitative variable:
A categorical variable
Nominal (classificatory) scale
- gender, marital status, race
Ordinal (ranking) scale
- severity scale, good/better/best

Scale of measurement
Quantitative variable:
A numerical variable: discrete; continuous
Interval scale:
Data are placed in meaningful intervals and order. The units of measurement are arbitrary.
- Temperature: the differences 37 C - 36 C and 38 C - 37 C are equal
- No implication of ratio (30 C is not twice as hot as 15 C)
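
One way to see why ratios are not meaningful on an interval scale is to re-express the same temperatures on the Kelvin scale, which has a true zero. The sketch below is an illustration added here, not part of the original slides; the helper function is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin (absolute scale)."""
    return celsius + 273.15


# Differences are meaningful on an interval scale: both steps are one degree.
step_low = 37 - 36
step_high = 38 - 37

# Ratios are not: the apparent "twice as hot" depends on the arbitrary zero.
ratio_celsius = 30 / 15
ratio_kelvin = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(step_low == step_high)
print(ratio_celsius, round(ratio_kelvin, 3))
```

On the absolute scale, 30 C is only about 5% hotter than 15 C, even though the Celsius numbers have a ratio of 2.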

Ratio scale:
Data is presented in frequency distribution in
logical order. A meaningful ratio exists.
- Age, weight, height, pulse rate
- pulse rate of 120 is twice as fast as 60
- person with weight of 80kg is twice as heavy
as the one with weight of 40 kg.

Scales of Measure
Nominal - qualitative classification of equal value: gender, race, color, city
Ordinal - qualitative classification which can be rank ordered: socioeconomic status of families
Interval - numerical or quantitative data: can be rank ordered and sizes compared: temperature
Ratio - quantitative interval data along with ratio: time, age.

INVESTIGATION
Data Collection
Data Presentation
  Tabulation
  Diagrams
  Graphs
Descriptive Statistics
  Measures of Location
  Measures of Dispersion
  Measures of Skewness & Kurtosis
Inferential Statistics
  Estimation
    Point estimate
    Interval estimate
  Hypothesis Testing
    Univariate analysis
    Multivariate analysis

Frequency Distributions
Data distribution: pattern of variability
- the center of a distribution
- the ranges
- the shapes
Simple frequency distributions
Grouped frequency distributions
- midpoint

Tabulate the hemoglobin values of 30 adult male patients listed below

Patient No   Hb (g/dl)    Patient No   Hb (g/dl)    Patient No   Hb (g/dl)
 1           12.0         11           11.2         21           14.9
 2           11.9         12           13.6         22           12.2
 3           11.5         13           10.8         23           12.2
 4           14.2         14           12.3         24           11.4
 5           12.3         15           12.3         25           10.7
 6           13.0         16           15.7         26           12.5
 7           10.5         17           12.6         27           11.8
 8           12.8         18            9.1         28           15.1
 9           13.2         19           12.9         29           13.4
10           11.2         20           14.6         30           13.1

Steps for making a table
Step 1: Find Minimum (9.1) & Maximum (15.7)
Step 2: Calculate difference: 15.7 - 9.1 = 6.6
Step 3: Decide the number and width of the classes (7 class intervals): 9.0-9.9, 10.0-10.9, ...
Step 4: Prepare dummy table: Hb (g/dl), Tally marks, No. of patients

DUMMY TABLE

Hb (g/dl)      Tally marks      No. of patients
9.0-9.9
10.0-10.9
11.0-11.9
12.0-12.9
13.0-13.9
14.0-14.9
15.0-15.9
Total

TALLY MARKS TABLE

Hb (g/dl)      Tally marks      No. of patients
9.0-9.9        l                 1
10.0-10.9      lll               3
11.0-11.9      lllll l           6
12.0-12.9      lllll lllll      10
13.0-13.9      lllll             5
14.0-14.9      lll               3
15.0-15.9      ll                2
Total                           30

Table Frequency distribution of 30 adult male patients by Hb

Hb (g/dl)      No. of patients
9.0-9.9         1
10.0-10.9       3
11.0-11.9       6
12.0-12.9      10
13.0-13.9       5
14.0-14.9       3
15.0-15.9       2
Total          30
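
The four steps for building this frequency table can be sketched in a few lines. The Hb values are the 30 readings listed earlier, in patient order; the class width of 1 g/dl matches the classes 9.0-9.9, 10.0-10.9 and so on, and the variable names are illustrative.

```python
from collections import Counter
from math import floor

# Hb (g/dl) of patients 1 to 30, in patient order.
hb = [12.0, 11.9, 11.5, 14.2, 12.3, 13.0, 10.5, 12.8, 13.2, 11.2,
      11.2, 13.6, 10.8, 12.3, 12.3, 15.7, 12.6, 9.1, 12.9, 14.6,
      14.9, 12.2, 12.2, 11.4, 10.7, 12.5, 11.8, 15.1, 13.4, 13.1]

# Steps 1 and 2: minimum, maximum and their difference.
lowest, highest = min(hb), max(hb)
value_range = round(highest - lowest, 1)

# Step 3: classes of width 1 g/dl, identified by their lower limit.
tally = Counter(floor(value) for value in hb)

# Step 4: fill in the table, one tally stroke per patient.
for lower in range(floor(lowest), floor(highest) + 1):
    label = f"{lower}.0-{lower}.9"
    print(f"{label:<10} {'l' * tally[lower]:<10} {tally[lower]:>3}")
print(f"{'Total':<10} {'':<10} {sum(tally.values()):>3}")
```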

Table Frequency distribution of adult patients by Hb and gender:

Hb (g/dl)      Gender              Total
               Male     Female
<9.0             0         2          2
9.0-9.9          1         3          4
10.0-10.9        3         5          8
11.0-11.9        6         8         14
12.0-12.9       10         6         16
13.0-13.9        5         4          9
14.0-14.9        3         2          5
15.0-15.9        2         0          2
Total           30        30         60
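
The Total row and Total column of a two-way table like this one are simple sums of the cell counts. A minimal sketch using the male and female counts from the table (variable names are illustrative):

```python
classes = ["<9.0", "9.0-9.9", "10.0-10.9", "11.0-11.9",
           "12.0-12.9", "13.0-13.9", "14.0-14.9", "15.0-15.9"]
male = [0, 1, 3, 6, 10, 5, 3, 2]
female = [2, 3, 5, 8, 6, 4, 2, 0]

# Row totals: patients of both genders in each Hb class.
row_totals = [m + f for m, f in zip(male, female)]

# Column totals and grand total.
male_total, female_total = sum(male), sum(female)
grand_total = sum(row_totals)

for label, m, f, t in zip(classes, male, female, row_totals):
    print(f"{label:<10} {m:>5} {f:>7} {t:>6}")
print(f"{'Total':<10} {male_total:>5} {female_total:>7} {grand_total:>6}")
```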

Elements of a Table
An ideal table should have: Number, Title, Column headings, Foot-notes
Number - Table number for identification in a report
Title, place, time period - Describe the body of the table, variables (what, how classified, where and when)
Column heading - Variable name, No., Percentages (%), etc.
Foot-note(s) - to describe some column/row headings, special cells, source, etc.

Table II. Distribution of 120 (Madras) Corporation divisions according to annual death rate based on registered deaths in 1975 and 1976

Figures in parentheses indicate percentages

DIAGRAMS/GRAPHS
Discrete data
--- Bar charts (one or two groups)
Continuous data
--- Histogram
--- Frequency polygon (curve)
--- Stem-and-leaf plot
--- Box-and-whisker plot

Example data (ages of 60 subjects)
68 79 43 28 49 16 49 30
63 27 25 25 38 24 28 43
42 22 74 45 42 64 23 49
27 28 51 12 27 47 19 12
30 24 36 57 31 23 11
36 25 42 51 50 22 52
28 44 28 12 38 43 46
32 65 31 32 21 27 31

Histogram
[Figure 1 Histogram of ages of 60 subjects; x-axis: Age (ticks at 11.5, 21.5, 31.5, 41.5, 51.5, 61.5, 71.5); y-axis: Frequency (0 to 20)]

Polygon
[Figure: frequency polygon; x-axis: Age (ticks at 11.5, 21.5, 31.5, 41.5, 51.5, 61.5, 71.5); y-axis: Frequency (0 to 20)]

Stem and leaf plot

Stem-and-leaf of Age    N = 60
Leaf Unit = 1.0

1 | 122269
2 | 1223344555777788888
3 | 00111226688
4 | 2223334567999
5 | 01127
6 | 3458
7 | 49
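
A stem-and-leaf plot like this one can be built by splitting each age into its tens digit (stem) and units digit (leaf). The sketch below uses the 60 example ages; the variable names are illustrative.

```python
from collections import defaultdict

# The 60 example ages.
ages = [68, 79, 43, 28, 49, 16, 49, 30,
        63, 27, 25, 25, 38, 24, 28, 43,
        42, 22, 74, 45, 42, 64, 23, 49,
        27, 28, 51, 12, 27, 47, 19, 12,
        30, 24, 36, 57, 31, 23, 11,
        36, 25, 42, 51, 50, 22, 52,
        28, 44, 28, 12, 38, 43, 46,
        32, 65, 31, 32, 21, 27, 31]

# Sort first so the leaves on each stem come out in ascending order.
stems = defaultdict(list)
for age in sorted(ages):
    stem, leaf = divmod(age, 10)  # tens digit, units digit
    stems[stem].append(str(leaf))

plot = {stem: "".join(leaves) for stem, leaves in stems.items()}

for stem in sorted(plot):
    print(f"{stem} | {plot[stem]}")
```

The length of each row of leaves is the frequency of that ten-year age class, so the plot doubles as a sideways histogram.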

Box plot
[Figure: box plot of Age; y-axis: Age (10 to 80)]

Descriptive statistics report: Boxplot
- minimum score
- maximum score
- lower quartile
- upper quartile
- median
- mean
- the skew of the distribution:
  positive skew: mean > median & high-score whisker is longer
  negative skew: mean < median & low-score whisker is longer
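
The quantities reported in a boxplot can be computed directly from the 60 example ages. This is a sketch using Python's standard library; quartile conventions differ between textbooks and software, and the code assumes the default "exclusive" method of statistics.quantiles, so quartiles from other packages may differ slightly.

```python
from statistics import mean, median, quantiles

# The 60 example ages.
ages = [68, 79, 43, 28, 49, 16, 49, 30,
        63, 27, 25, 25, 38, 24, 28, 43,
        42, 22, 74, 45, 42, 64, 23, 49,
        27, 28, 51, 12, 27, 47, 19, 12,
        30, 24, 36, 57, 31, 23, 11,
        36, 25, 42, 51, 50, 22, 52,
        28, 44, 28, 12, 38, 43, 46,
        32, 65, 31, 32, 21, 27, 31]

lowest, highest = min(ages), max(ages)
lower_quartile, _, upper_quartile = quantiles(ages, n=4)  # assumption: default method
middle = median(ages)
average = mean(ages)

# Direction of skew, using the mean-versus-median rule from the slide.
if average > middle:
    skew = "positive"
elif average < middle:
    skew = "negative"
else:
    skew = "none"

print(lowest, lower_quartile, middle, upper_quartile, highest)
print(round(average, 2), skew)
```

For these ages the mean exceeds the median and the distance from the upper quartile to the maximum is larger than the distance from the minimum to the lower quartile, which matches the slide's description of positive skew.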

Pie Chart
Circular diagram: total = 100%
Divided into segments, each representing a category
Decide adjacent category
The size of each slice is proportional to the amount for its category
[Figure: pie chart with slices of 10%, 20% and 70%; legend: Mild, Moderate, Severe]
The prevalence of different degrees of hypertension in the population

Bar Graphs
The height of each bar indicates frequency
Frequency on the Y axis and categories of the variable on the X axis
The bars should be of equal width and should not touch the other bars
[Figure: bar graph; y-axis: Number (0 to 25); x-axis: Risk factor (Smo, Alc, Chol, DM, HTN, No Exer, F-H); bar labels: 20, 20, 16, 12, 12, 8]
The distribution of risk factors among cases with cardiovascular diseases

HIV cases enrollment in USA by gender
[Figure: bar chart]

HIV cases enrollment in USA by gender
[Figure: stacked bar chart]

Graphic Presentation of Data
the frequency polygon (quantitative data)
the histogram (quantitative data)
the bar graph (qualitative data)

General rules for designing graphs
A graph should have a self-explanatory legend
A graph should help the reader to understand the data
Axes labeled, units of measurement indicated
Scales are important. Start with zero (otherwise use a // break)
Avoid graphs with a three-dimensional impression; they may be misleading (the reader visualizes them less easily)

Any Questions?
