Basics of Biostatistics - 1
Introduction to Biostatistics
Department: Department of Epidemiology and Biostatistics
Instructor: Haider Abbas BSc (Hon), MSc, MSPH, PhD Scholar
Date: May-12, 2025
Course Objective
Applied Statistics:
The application of statistical methods to solve real problems involving randomly generated
data, and the development of new statistical methodologies motivated by real problems.
Biostatistics:
A branch of applied statistics directed toward applications in the health sciences and biology.
Use of Statistical Tools in Biostatistics:
• The tools of statistics are applied in many fields: business, education, psychology,
agriculture, and economics.
• When data comes from public health, biological sciences, or medicine, the term
biostatistics describes the specific application of statistical tools and concepts.
Classification of Biostatistics
1. Descriptive Statistics
• A statistical method concerned with:
• Collection
• Organization
• Summarization
• Analysis of data from a sample of the population
2. Inferential Statistics
• A statistical method concerned with:
• Drawing conclusions/inferences about a population
• Based on measurements from a random sample of that population
Descriptive Statistics
• Statistical procedures used to summarize, organize, and simplify data.
• The process should reflect overall findings effectively.
Key Points:
• Raw data is made more manageable.
• Raw data is presented in a logical form.
• Patterns can be identified from organized data.
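As a minimal sketch of how raw data become more manageable, the snippet below organizes a small list of blood groups into a frequency table. The data values and variable names are made up for illustration and are not part of the course material.

```python
from collections import Counter

# Hypothetical raw data: blood groups recorded for 12 patients (illustrative only)
blood_groups = ["A", "O", "B", "O", "AB", "A", "O", "B", "O", "A", "O", "B"]

# Organize: count how often each category occurs
frequency = Counter(blood_groups)
n = len(blood_groups)

# Summarize: counts and relative frequencies, most frequent category first
frequency_table = [
    (group, count, round(100 * count / n, 1))
    for group, count in frequency.most_common()
]

# Present: one row per category
for group, count, percent in frequency_table:
    print(f"{group:<4}{count:>3}{percent:>7}%")
```

Twelve raw values reduce to four rows, which makes the pattern (group O is the most common in this made-up list) easy to see.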
Inferential Statistics
• This branch of statistics deals with techniques for drawing conclusions about a
population from a sample.
• Inferential statistics builds upon descriptive statistics.
Key Points:
• Inferences are drawn from sample properties to population properties.
• Used to make generalizations from a sample to a population.
• Includes a variety of procedures to ensure that inferences are sound and rational, though
not always correct.
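The step from sample to population can be sketched as follows. The blood pressure values are hypothetical, and the t critical value (about 2.262 for 9 degrees of freedom at the 95% level) is hard-coded as an assumption rather than looked up by the code. This is one possible inferential procedure, a confidence interval for a mean, and not the only one.

```python
import math
import statistics

# Hypothetical sample: systolic blood pressure (mmHg) of 10 randomly selected adults
sample = [118, 125, 130, 122, 135, 128, 120, 132, 126, 124]

n = len(sample)
sample_mean = statistics.mean(sample)   # descriptive: summarizes the sample itself
sample_sd = statistics.stdev(sample)    # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)

# Inferential: approximate 95% confidence interval for the population mean.
# Assumption: t critical value for n - 1 = 9 degrees of freedom is about 2.262.
t_critical = 2.262
lower = sample_mean - t_critical * standard_error
upper = sample_mean + t_critical * standard_error

print(f"Sample mean: {sample_mean}")
print(f"95% CI for the population mean: ({lower:.2f}, {upper:.2f})")
```

The interval is a statement about the population, not the sample. It rests on the sample being random, and it can still miss the true mean, which is why such inferences are sound but not always correct.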
Inferential Statistics (continued)
Descriptive statistics and inferential statistics:
• Planning
• Design
• Execution (Data collection)
• Data Processing
• Data analysis
• Presentation
• Interpretation
• Publication
HOW A “BIOSTATISTICIAN” CAN HELP?
• Design of study
• Sample size & power calculations
• Selection of sample and controls
• Designing a questionnaire
• Data Management
• Choice of descriptive statistics & graphs
• Application of univariate and multivariate
statistical analysis techniques
Stages in statistical investigation
There are five stages in any statistical investigation:
1. Collection of data: The process of obtaining measurements or counts.
2. Organization of data: Includes editing, classifying, and tabulating the data collected.
3. Presentation of data: Gives an overall view of what the data actually look like and
facilitates further statistical analysis. It can be done in the form of tables, graphs, or
diagrams.
4. Analysis of data: Extracts relevant information from the data (such as the mean, median,
mode, range, and variance) to support decision making.
5. Interpretation of data: Concerned with drawing conclusions from the data collected and
analyzed, and giving meaning to the analysis results. This is a difficult task that requires
a high degree of skill and experience.
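The analysis stage can be illustrated with a short sketch that computes the summary measures named in step 4. The clinic visit counts are hypothetical, and the sample variance (n - 1 denominator) is assumed here because the data are treated as a sample.

```python
import statistics

# Hypothetical data: number of clinic visits in a year for 9 patients
visits = [2, 4, 4, 5, 7, 3, 4, 6, 1]

summary = {
    "mean": statistics.mean(visits),
    "median": statistics.median(visits),
    "mode": statistics.mode(visits),
    "range": max(visits) - min(visits),
    "variance": statistics.variance(visits),  # sample variance (n - 1 denominator)
}

for measure, value in summary.items():
    print(f"{measure}: {value}")
```

Interpretation, the fifth stage, is not something the code can do: deciding what these numbers mean for the patients still requires judgment.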
Definitions of some basic terms
Population
Census
Sample
Parameter
Statistic, Statistics
Sampling
Sample size
Variable
Data
Definitions of some basic terms (continued)
• Population: The complete set of possible measurements for which inferences are to be
made.
• Census: A complete enumeration of the population. In most real problems a census is not
feasible, so a sample is taken instead.
• Sample: A sample from a population is the set of measurements that are actually collected
in the course of an investigation.
• Parameter: Characteristic or measure obtained from a population.
• Statistic: A statistic refers to a numerical quantity computed from sample data (e.g. the
mean, the median, the maximum...).
• Data: Refers to a collection of facts, values, observations, or measurements that the
variables can assume.
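The difference between a parameter and a statistic can be sketched with simulated data. The population below (5,000 birth weights drawn from a normal distribution with mean 3200 g and standard deviation 450 g) is entirely hypothetical and serves only to make the terms concrete.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: birth weights (grams) of 5,000 newborns
population = [random.gauss(3200, 450) for _ in range(5000)]

# Parameter: a measure computed from the whole population (requires a census)
population_mean = statistics.mean(population)

# Sample: the measurements actually collected (simple random sample of size 100)
sample = random.sample(population, 100)

# Statistic: a measure computed from the sample, used to estimate the parameter
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.1f} g")
print(f"Statistic (sample mean):     {sample_mean:.1f} g")
```

In practice the population mean is unknown because a census is rarely possible. The sample mean is what the investigator observes, and it will usually differ somewhat from the parameter.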
Uses of statistics
The main function of statistics is to enlarge our knowledge of complex phenomena. The
following are some uses of statistics:
I. It presents facts in a definite and precise form.
II. Data reduction.
III. Measuring the magnitude of variations in data.
IV. Furnishes a technique of comparison.
V. Estimating unknown population characteristics.
VI. Formulating and testing hypotheses.
VII. Studying the relationship between two or more variables.
VIII. Forecasting future events.
Limitations of statistics
As a science, statistics has its own limitations. The following are some of the
limitations:
I. Deals with only quantitative information.
II. Deals with only the aggregate of facts and not with individual data items.
III. Statistical data are only approximate and not mathematically correct.
IV. Statistics can be easily misused and therefore should be used by experts.
Types of Variables and Measurement Scales
Variable: A variable is a characteristic or attribute that can assume different values in
different persons, places, or things.
Example:
Age, Diastolic blood pressure,
Heart rate,
The height of adult males,
The weights of preschool children,
Gender of Biostatistics students,
Marital status of instructors at the PIMS,
Ethnic group of patients
Types of Variables
A. Depending on the characteristic being measured, a variable can be:
1. Qualitative (Categorical) variable:
A variable or characteristic that cannot be measured in quantitative form but can only be identified by
name or categories,
for instance, place of birth, ethnic group, type of drug, stages of breast cancer (I, II, III, or IV), degree of
pain (minimal, moderate, severe or unbearable).
The categories should be clear-cut, not overlapping, and cover all the possibilities. For example, sex (male
or female), vital status (alive or dead), disease stage (depends on disease), ever smoked (yes or no).
2. Quantitative (Numerical) variable:
It is one that can be measured and expressed numerically.
Example:
survival time
systolic blood pressure
Number of children in a family
height, age, and body mass index.
Quantitative (Numerical) variable:
Quantitative variables can be of two types:
1. Discrete Variables:
Have a set of possible values that is either finite or countably infinite.
The values of a discrete variable are usually whole numbers.
Numerical discrete data occur when the observations are integers that correspond with a count of
some sort.
Examples of discrete variables
Number of pregnancies,
The number of bacteria colonies on a plate,
The number of cells within a prescribed area upon microscopic examination,
The number of heart beats within a specified time interval,
A mother’s history of the number of births (parity) and pregnancies (gravidity),
The number of episodes of illness a patient experiences during some time period, etc.
Quantitative (Numerical) variable:
2. Continuous variable:
A continuous variable has a set of possible values that includes all values in an interval of
the real line.
There are no gaps between possible values.
Each observation theoretically falls somewhere along a continuum.
Observations are not restricted to certain numerical values; they are often measurements
(e.g., height, weight, age).
Continuous data report a measurement of the individual that can take on any value within an
acceptable range.
Types of Variables:
B. On the basis of scales of measurement:
There are four types of measurement scales: nominal, ordinal, interval, and ratio.
Example (interval scale):
Body temperature measured in degrees Fahrenheit or Celsius.
Differences between values are meaningful.
On the basis of scale of measurement
4. Ratio scale of measurement:
The highest level of measurement scale, characterized by the fact that equality of ratios as
well as equality of intervals can be determined.
There is a true zero point, i.e., zero is absolute.
Example:
volume
height
weight
length
time until death, etc...
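Why a true zero matters can be checked with a small calculation. The sketch below compares the ratio of two temperatures (interval scale) and the ratio of two weights (ratio scale) before and after a change of units. The specific values and the helper function names are illustrative assumptions.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kilograms):
    """Convert a weight from kilograms to pounds (approximate factor)."""
    return kilograms * 2.20462


# Interval scale: the zero point is arbitrary, so ratios depend on the unit
temp_a_c, temp_b_c = 20.0, 10.0
ratio_celsius = temp_a_c / temp_b_c
ratio_fahrenheit = celsius_to_fahrenheit(temp_a_c) / celsius_to_fahrenheit(temp_b_c)

# Ratio scale: zero is absolute, so ratios are the same in any unit
weight_a_kg, weight_b_kg = 80.0, 40.0
ratio_kg = weight_a_kg / weight_b_kg
ratio_lb = kg_to_lb(weight_a_kg) / kg_to_lb(weight_b_kg)

print(f"Temperature ratio: {ratio_celsius} in Celsius, {ratio_fahrenheit:.2f} in Fahrenheit")
print(f"Weight ratio: {ratio_kg} in kg, {ratio_lb:.2f} in lb")
```

The temperature ratio changes with the unit (20 °C is not "twice as hot" as 10 °C), while the weight ratio stays at 2 in both units. This is the sense in which equality of ratios can be determined only on a ratio scale.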
Variables
• Categorical (Qualitative)
• Nominal (e.g., gender, ethnic group)
• Ordinal (e.g., educational level)
• Numerical (Quantitative)
• Interval (e.g., temperature in °C or °F)
• Ratio (e.g., biparietal diameter)
Types of Variables
C. On the basis of the source of data:
1. Primary Data:
Data generated for the first time, originally for the study in question.
It requires the direct involvement of the researcher. Censuses and sample surveys are sources
of primary data.
2. Secondary Data:
Data obtained from pre-existing, previously collected sources such as newspapers, magazines,
DHS, hospital records, and existing data like:
Mortality reports
Morbidity reports
Epidemic reports
Reports of laboratory utilization (including laboratory test results)
Statistics