Unit-1 Biostatistics Descriptive
Unit-1: Introduction to Biostatistics
Objectives:
1. Define Biostatistics
2. Define statistics
3. Describe types of statistics
4. Explain population and sample
5. Differentiate between parameter and statistic
6. Describe inclusion and exclusion criteria
7. Define data and variable, and discuss their types
8. Discuss scales of measurement in statistics (Nominal, Ordinal, Continuous)
Biostatistics:
The word "Biostatistics" combines two parts:
"Bio" comes from the Greek word "βίος" (bios), meaning "life".
"Statistics" traces back to the Latin word "status", meaning "state" or "condition", by way of the
New Latin "statisticum" and the German word "Statistik", which originally referred to the analysis
of data about the state.
Together, "Biostatistics" literally means "the analysis of data related to life".
More broadly, it is the branch of applied mathematics that deals with collecting, classifying,
analyzing, summarizing, and interpreting data, applied to the life sciences and health.
Biostatistics:
Definition
"The application of statistical principles and methods to analyze and interpret data in the life
sciences, medicine, and health sciences, to improve understanding, prediction, and decision-
making."
World Health Organization (WHO):
"Biostatistics is the application of statistical methods to the analysis of data related to health and
disease."
Biostatisticians use statistical techniques to:
1. Analyze and interpret life sciences data
2. Identify patterns and trends
3. Make predictions and recommendations
4. Inform decision-making in medicine, public health, and healthcare policy
In biostatistics, you'll often deal with data related to:
1. Human health
2. Disease
3. Mortality
4. Morbidity
5. Survival
6. Genetics
7. Environmental factors
8. Healthcare systems
Statistics:
Definitions
Applied Mathematics dealing with collecting, classifying, analyzing, summarizing and interpreting
data.
American Statistical Association (ASA): "Statistics is the science of collecting, analyzing, interpreting,
and presenting data to aid in decision-making."
Statistics is not a political science itself, but it plays a crucial role in political science research.
Political scientists use statistical methods to analyze data, test hypotheses, and draw conclusions
about political behavior, public opinion, election results, and policy impacts. Statistics helps to
quantify and interpret complex social phenomena, making it an essential tool in the field of political
science.
The aggregate of facts regarding patients attending a clinic encompasses various key data points,
including demographics that reflect a diverse mix of ages, genders, and socioeconomic
backgrounds; visit frequency that highlights patterns of regular check-ups and acute health issues;
and common conditions such as diabetes and hypertension prevalent among patients. Additionally,
it considers insurance status, which affects access to care, as well as health outcomes like recovery
rates and patient satisfaction scores. Appointment adherence rates reveal the effectiveness of
reminder systems, while referral patterns provide insights into care coordination with specialists.
Collectively, these factors enable clinics to better understand their patient population, identify
trends, and improve the quality of care delivered.
Mean: The average value of the data set, calculated by summing all values and dividing by the
number of observations. It is a measure of central tendency.
Standard Deviation (SD): A measure of the dispersion or spread of the data points around the
mean. A low SD indicates that the data points are close to the mean, while a high SD suggests
greater variability.
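As a minimal sketch of these two definitions, the Python snippet below computes the mean and the standard deviation of a small set of systolic blood pressure readings. The readings are made-up illustrative values, not data from any study, and the n - 1 divisor is an assumption that treats the values as a sample rather than a whole population.

```python
import math

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
readings = [118, 122, 130, 126, 124]

# Mean: sum of all values divided by the number of observations.
mean = sum(readings) / len(readings)

# Sample standard deviation: square root of the summed squared deviations
# from the mean, divided by n - 1.
squared_deviations = [(x - mean) ** 2 for x in readings]
sd = math.sqrt(sum(squared_deviations) / (len(readings) - 1))

print(f"Mean = {mean:.1f} mmHg, SD = {sd:.2f} mmHg")
```

A small SD relative to the mean, as here, indicates that the readings cluster closely around the average.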
Importance of Statistics:
Statistics are essential for informed decision-making, enabling individuals and organizations to
analyze data trends and patterns. They support evidence-based research in fields like medicine and
social sciences, guiding practice and theory. In policymaking, statistics inform data-driven strategies
that allocate resources effectively. In business, they help optimize operations and understand
customer behavior, while in healthcare, they improve outcomes through rigorous analysis of clinical
data and treatment effectiveness. Overall, statistics provide a foundation for effective analysis and
strategic planning across various sectors.
1. Data Summarization: Statistics helps summarize large datasets into understandable formats,
making complex information more accessible.
2. Experimental Design: It aids in designing effective laboratory and field experiments as well as
surveys, ensuring reliable and valid results.
3. Effective Planning: Statistics supports sound planning in various fields by providing data-driven
insights.
4. Conclusions and Predictions: It assists in drawing conclusions and making predictions based on
data, helping to forecast outcomes under specific conditions.
5. Data Analysis: Statistical techniques are essential for analyzing numerical data and are widely
used across disciplines like biology, physics, chemistry, and business.
6. Informed Decision-Making: Administrators in both public and private sectors rely on statistical
data to inform decisions with a factual basis.
7. Political Argumentation: Politicians use statistics to support their arguments and clarify issues,
enhancing their credibility.
8. Social Science Applications: Social scientists employ statistical methods to study various socio-
economic aspects, emphasizing that a strong grasp of statistics is vital for accurate analysis in
their field.
Overall, statistics is crucial for informed decision-making, effective research, and understanding
trends across multiple disciplines.
1. To inform the general public: Statistics provide clear, quantifiable information that helps the
public understand issues, trends, and policies, making complex data more accessible. Example:
Health departments publish statistics on vaccination rates for diseases like measles, helping the
public understand the importance of immunizations in preventing outbreaks.
2. To explain things that have happened: Statistical analyses help clarify past events by
summarizing data, identifying patterns, and providing insights into causes and effects. Example:
Nurses might use statistics to explain an increase in hospital readmission rates for heart failure
patients, analyzing data to identify contributing factors such as patient education and follow-up
care.
3. To justify a claim: Statistical evidence can support arguments or claims by demonstrating
trends, correlations, or outcomes, lending credibility to assertions. Example: A nursing
professional may present statistics on the effectiveness of hand hygiene protocols in reducing
hospital-acquired infections to support claims for increased funding for infection control
initiatives.
4. To provide general comparison: Statistics enable comparisons between different groups,
periods, or regions, helping to highlight differences or similarities in various contexts. Example:
Clinical researchers compare the outcomes of different pain management strategies across
various hospitals to determine which methods lead to better patient satisfaction and pain
relief.
5. To guide decisions regarding future outcomes: Using historical data, statistical models can
forecast potential future events or trends, aiding decision-making in various fields. Example: A
hospital uses statistical models to predict patient admission rates during flu season, helping to
prepare staffing and resources accordingly.
6. To estimate unknown quantities: Statistics allow researchers to make informed guesses about
populations or phenomena based on samples, helping to fill in gaps in knowledge. Example:
Nurses conduct surveys to estimate the prevalence of depression among patients in a clinic,
using sample data to infer mental health trends in the broader population.
7. To establish associations/relationships between factors: Statistical methods can identify and
quantify relationships between variables, helping to understand how different factors influence
each other. Example: Researchers may analyze data to establish a correlation between nurse
staffing levels and patient outcomes, for instance finding that higher staffing ratios are
associated with lower mortality rates.
8. To evaluate program effectiveness: Statistics are used to assess the impact of policies or
interventions by measuring outcomes and comparing them to benchmarks or control groups.
Example: A hospital evaluates the effectiveness of a new discharge planning program by
comparing readmission rates before and after its implementation, using statistical analysis to
measure improvements.
9. To enhance resource allocation: By analyzing data on needs and outcomes, statistics help
organizations and governments allocate resources more effectively to areas that require them
most. Example: Nursing administrators analyze patient care data to identify high-demand areas
in the hospital, allowing for better allocation of nursing staff and resources to those units.
10. To support evidence-based decision-making: Statistical analysis provides the empirical
evidence needed for policymakers and organizations to make informed decisions, ensuring that
strategies are grounded in data rather than intuition. Example: Nurses use systematic reviews
of statistical studies on wound care to decide on the most effective dressing materials and
techniques to improve patient healing outcomes.
Characteristics of statistics
Types of statistics:
There are two types of statistics: descriptive statistics and inferential statistics.
1. Descriptive Statistics
Descriptive statistics involve methods for summarizing and organizing data. This branch focuses
on presenting data in a meaningful way, making it easier to understand and interpret. Key
components include:
Measures of Central Tendency: These include the mean (average), median (middle value), and
mode (most frequent value), which provide insights into the center of the data.
Measures of Dispersion: These include the range, variance, and standard deviation, which indicate
how spread out the data points are.
Graphical Displays: Descriptive statistics often use visual representations, such as histograms, pie
charts, and box plots, to illustrate the data distribution and trends clearly.
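The measures listed above can be sketched with Python's standard `statistics` module. The lengths of stay below are hypothetical values chosen for illustration, and the variance uses the sample (n - 1) form, which is an assumption about how the data were obtained.

```python
import statistics

# Hypothetical lengths of hospital stay in days, for illustration only.
stay_days = [2, 3, 3, 4, 5, 7, 11]

# Measures of central tendency.
median_stay = statistics.median(stay_days)   # middle value of the sorted data
mode_stay = statistics.mode(stay_days)       # most frequent value

# Measures of dispersion.
range_stay = max(stay_days) - min(stay_days)     # largest minus smallest value
variance_stay = statistics.variance(stay_days)   # sample variance (n - 1 divisor)

print(median_stay, mode_stay, range_stay, round(variance_stay, 2))
```

Note how the single long stay of 11 days pulls the mean (5 days) above the median (4 days), which is one reason both measures are usually reported.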
2. Inferential Statistics
Inferential statistics involve methods that allow researchers to make conclusions or inferences
about a population based on sample data. This type includes:
Hypothesis Testing: This method tests assumptions or claims about a population using sample data,
determining whether there is enough evidence to support a specific hypothesis.
Confidence Intervals: These provide a range of values that likely contain the population parameter,
offering a measure of uncertainty around the sample estimate.
Regression Analysis: This technique explores relationships between variables and makes
predictions based on the data.
Both descriptive and inferential statistics are essential for analyzing and interpreting data.
Descriptive statistics help summarize and present data, while inferential statistics facilitate drawing
conclusions and making predictions based on that data.
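To make the regression idea concrete, the sketch below fits a straight line by ordinary least squares and uses it for a prediction. The ages and blood pressures are hypothetical, and `fit_line` is an illustrative helper name, not part of any library.

```python
# Hypothetical paired observations, for illustration only:
# age in years (x) and systolic blood pressure in mmHg (y).
ages = [30, 40, 50, 60, 70]
pressures = [118, 124, 129, 137, 142]


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, pressures)

# Prediction for a 55-year-old, inside the range of observed ages.
predicted_at_55 = intercept + slope * 55
print(f"slope = {slope:.2f}, intercept = {intercept:.1f}, prediction = {predicted_at_55:.2f}")
```

The fitted line describes the sample; generalizing it to a population is the inferential step and would also need a measure of uncertainty, such as a confidence interval for the slope.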
Differences:
Descriptive Statistics
1. A cricket player wants to find his score average for the last 20 matches.
   Description: Summarizes his performance over those specific matches.
2. Mujeeb wants to describe the variation in his four test scores in statistics.
   Description: Provides a summary of his scores using measures like mean or standard deviation.
3. Mrs. Imran wants to determine the average weekly amount she spent on groceries in the past
   6 months.
   Description: Calculates the average based on actual spending data from the past.
Inferential Statistics
1. A cricket player wants to estimate his chance of scoring based on his current season average.
   Description: Uses his past performance to predict future outcomes.
2. Based on his first four scores, Mujeeb would like to predict the variation in his final statistics
   test scores.
   Description: Makes predictions about future scores based on current data.
3. Based on the last six months' grocery bills, Mrs. Imran would like to predict the average amount
   she will spend on groceries for the upcoming year.
   Description: Extends her past spending data to forecast future expenses.
Populations
The word population, like the word populace, derives from the Latin populus, meaning
"people."
Definition:
Population is a term used to refer to a group of individuals of the same species living in a specific
geographic area at a certain time. This group shares resources, interacts, and often interbreeds if
it's a biological context. In social sciences, "population" generally refers to a group of people in a
particular area, like a country or city, and can be studied for various demographic factors such as
size, density, distribution, and growth rates.
In statistics, a population is the entire set of individuals or items that are of interest in a particular
study, from which samples can be drawn.
Target Population (also referred to as Reference Population, Universe, or Source Population) is the
entire group or set of individuals, items, or events that researchers aim to study or make inferences
about. It defines the boundaries of the group from which they wish to gather information or to
which they intend to generalize their findings.
For example:
In a public health study assessing the prevalence of a disease, the target population might be all
individuals within a specific age range in a country.
Study Population (or Accessible Population) is the portion of the target population that
researchers can realistically reach for data collection.
Example:
If the target population is all high school students in a country, the study population might be
students in selected schools within one city where the researchers are conducting the study.
Sample:
A Sample is a smaller group selected from a study population to represent the larger target
population in a study.
External Population
External Population refers to a group outside the study population to which researchers aim to
generalize their findings.
Examples:
KMC students for addiction: If a study on drug addiction is conducted among KMC students, the
external population could be all medical students in Pakistan, assuming the findings can apply to
them.
Immigrant population of Pakistan in England: If researchers study this group, they might aim to
extrapolate results back to the broader Pakistani population, suggesting that findings about
addiction trends among immigrants may also reflect trends in Pakistan.
Here’s a table summarizing the categories and criteria for a study population focused on diabetic
patients under 60 years of age:
Category              Description
General Population    All individuals living in the community
Included Population   Individuals aged less than 60 years with diabetes
Study Population      Sample of diabetic patients aged < 60 from clinics
Inclusion Criteria    - Age less than 60 years
                      - Diagnosed with diabetes
Exclusion Criteria    - Age 60 years or older
                      - Individuals with other chronic illnesses
                      - Pregnant women
Sample                Randomly selected diabetic patients under 60 years from local clinics
This table provides a clear outline of the different populations and criteria for a hypothetical study
on diabetes.
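The inclusion and exclusion criteria in this table can be read as a filter applied before sampling. The sketch below is one way to express that in Python; the patient records, field names, and the helper `is_eligible` are all hypothetical and exist only to illustrate the logic.

```python
import random

# Hypothetical patient records, for illustration only.
patients = [
    {"id": 1, "age": 45, "diabetic": True, "other_chronic_illness": False, "pregnant": False},
    {"id": 2, "age": 63, "diabetic": True, "other_chronic_illness": False, "pregnant": False},
    {"id": 3, "age": 38, "diabetic": False, "other_chronic_illness": False, "pregnant": False},
    {"id": 4, "age": 52, "diabetic": True, "other_chronic_illness": True, "pregnant": False},
    {"id": 5, "age": 29, "diabetic": True, "other_chronic_illness": False, "pregnant": True},
    {"id": 6, "age": 57, "diabetic": True, "other_chronic_illness": False, "pregnant": False},
    {"id": 7, "age": 41, "diabetic": True, "other_chronic_illness": False, "pregnant": False},
]


def is_eligible(p):
    """Apply the inclusion criteria first, then rule out anyone who meets an exclusion criterion."""
    included = p["age"] < 60 and p["diabetic"]
    excluded = p["age"] >= 60 or p["other_chronic_illness"] or p["pregnant"]
    return included and not excluded


# Patients who pass the criteria and can therefore be sampled.
eligible = [p for p in patients if is_eligible(p)]

# Sample: a simple random draw from the eligible patients.
rng = random.Random(1)  # fixed seed so the draw is repeatable
sample = rng.sample(eligible, k=2)

print([p["id"] for p in eligible])
```

Only patients 1, 6, and 7 pass both sets of criteria, so the sample can only be drawn from them.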
Data:
The word "data" is derived from:
Latin: "Datum" (singular) and "Data" (plural) In Latin, "datum" means "something given" or "a fact."
The plural form "data" was used to describe a collection of facts or information.
Definition:
In statistics, data is a collection of observations or measurements that are used to make decisions
and analyze information.
Collection Method:
Primary Data Collection Methods
1. Direct Personal Investigations
- Personal observations and recordings
2. Indirect Investigation/Personal Interviews
- Face-to-face/phone interviews
3. Questionnaire
- Standardized questions for respondents
4. Enumerators
- Trained individuals collect data
5. Local Sources
- Local authorities, records, documents
6. Laboratory Experiments
- Controlled experiments in artificial settings
7. Field Experiments
- Experiments in natural settings
Variables:
The term "variable" is derived from:
Latin: "Variabilis," meaning "changeable" or "capable of being changed."
Definition:
A variable is a characteristic, attribute, or quantity that can change or vary, having multiple
possible values, and is used to represent or describe a concept or phenomenon in various contexts,
including statistics, mathematics, and research.
Ordinal:
Ordinal data can be arranged in a meaningful order, but the intervals between the data points
are not defined or meaningful.
Example: Severity of cough (mild, moderate, severe); economic status (poor, middle, or upper
class).
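A short sketch of what "meaningful order without meaningful intervals" implies in practice: ordinal values can be sorted and a median category can be found, but averaging the codes would assume equal spacing between categories. The observations below are hypothetical.

```python
# Ordinal categories in their meaningful order (from the cough example).
SEVERITY_ORDER = ["mild", "moderate", "severe"]
rank = {level: position for position, level in enumerate(SEVERITY_ORDER)}

# Hypothetical observations, for illustration only.
observed = ["severe", "mild", "moderate", "mild", "severe", "moderate", "moderate"]

# Sorting by rank is meaningful for ordinal data.
ordered = sorted(observed, key=rank.get)

# The median category is the middle observation of the ordered list
# (there is an odd number of observations here, so the middle is unique).
median_category = ordered[len(ordered) // 2]

print(ordered)
print("Median category:", median_category)
```

The ranks 0, 1, and 2 serve only to order the categories; the gap between "mild" and "moderate" is not assumed to equal the gap between "moderate" and "severe", which is why no mean is computed.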
Example:
Scientists will manipulate the vitamin C intake in a group of, say, 100 people who are over the
age of 40 years. 50 people will be given a daily high dose of vitamin C and 50 people will be given a
placebo pill over a period of 25 years. The goal is to see if the high vitamin C dosage affects the
people's life span. Can the independent variable (daily vitamin C intake) determine the
dependent variable (life span)?
Solution:
In this example, we have an experiment designed to study the effects of vitamin C on lifespan.
Here’s how to identify and interpret the variables:
Objective:
To determine if a high dose of vitamin C impacts the lifespan of people over age 40 compared to
those who do not take it.
So this experiment will observe if manipulating the independent variable (vitamin C intake) results
in changes to the dependent variable (lifespan), providing insights into the potential health benefits
of vitamin C.
1. Absolute Measurement:
The exact value or count observed.
Example: Measuring a patient’s temperature as 37.5°C.
2. Relative Measurement:
Compares an observed value to a standard or another value. It is always expressed as a proportion
or percentage, with no unit of measurement.
Example: Patient’s weight increased by 5% over a month.
3. Absolute Error:
The difference between the measured value and the actual value.
Example: If a patient’s true temperature is 37.3°C but recorded as 37.5°C, the absolute error is
0.2°C.
4. Relative Error:
The absolute error divided by the true value, often expressed as a percentage.
Example: If a patient’s weight is misrecorded by 1 kg on a base of 50 kg, the relative error is 2%.
5. Biased/Cumulative/Systematic Error:
Consistent error due to faulty equipment or techniques.
Example: A thermometer that consistently reads 0.3°C higher. This type of error accumulates as
the number of observations grows and does not cancel out with a larger sample size.
6. Random/Accidental Error:
Unpredictable variations in measurements.
Example: Slight changes in blood pressure readings due to patient movement. On average, this
error tends to cancel out as the sample size increases.
7. Accurate Measurement:
Free from errors, providing a true reflection of values.
Example: Using calibrated equipment and proper technique to obtain an accurate blood pressure reading.
8. Significant Digits:
Digits in a measurement that convey meaningful precision.
Example: Recording a patient’s height as 170.2 cm instead of 170 cm if precise measurement is
critical.
9. Misuse of Statistics:
Using data in a misleading way, leading to incorrect conclusions.
Example: Concluding that a treatment is effective without considering all patient variables.
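Items 3 to 6 in the list above lend themselves to a small numerical sketch. The first part reproduces the absolute and relative error examples from the text. The second part is a simulation under assumed values (a true temperature of 37.0°C, a bias of 0.3°C, and random noise with a standard deviation of 0.5°C) showing that averaging more readings shrinks the random error but leaves the systematic error in place. The function names are illustrative.

```python
import random


def absolute_error(measured, true_value):
    """Size of the difference between the measured and the true value."""
    return abs(measured - true_value)


def relative_error_percent(measured, true_value):
    """Absolute error divided by the true value, as a percentage."""
    return absolute_error(measured, true_value) / abs(true_value) * 100


temp_error = absolute_error(37.5, 37.3)          # about 0.2 (degrees C)
weight_error = relative_error_percent(51, 50)    # 2% for a 1 kg error on 50 kg

# Simulation with assumed values, for illustration only.
rng = random.Random(0)
TRUE_TEMP = 37.0
BIAS = 0.3       # systematic error: thermometer reads 0.3 degrees too high
NOISE_SD = 0.5   # spread of the random error


def mean_error(n):
    """Average error of n simulated readings relative to the true temperature."""
    readings = [TRUE_TEMP + BIAS + rng.gauss(0, NOISE_SD) for _ in range(n)]
    return sum(readings) / n - TRUE_TEMP


for n in (10, 1000, 100000):
    print(n, round(mean_error(n), 3))
```

As n grows, the printed average error settles near the bias of 0.3 rather than near zero: a larger sample removes random error from the average but cannot correct a faulty instrument.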
Are birth weights of newborns of mothers exposed to smoking lower than those whose
mothers are not exposed?
Is there any relationship between mothers' nutritional status and low birth weight babies?
Is the average duration of stay of a newborn male baby in ICU longer than that of a newborn female
baby?
Is there any effect of the mother's education on the prevalence of diarrhea among children under 5
years?
Solutions:
Are birth weights of newborns of mothers exposed to smoking lower than those whose
mothers are not exposed?
Type of Data: Quantitative (birth weights in grams or kilograms), Categorical (smoking
exposure: yes or no).
Analysis: Comparison of means or medians (birth weights) between two groups (exposed vs.
non-exposed mothers).
Is there any relationship between mothers' nutritional status and low birth weight babies?
Type of Data: Categorical (mothers' nutritional status: underweight, normal, overweight),
Categorical (low birth weight: yes or no).
Analysis: Association or correlation between nutritional status categories and birth weight
categories.
Does the average duration of stay of a newborn male baby in ICU differ from that of a newborn
female baby?
Type of Data: Quantitative (duration of ICU stay in hours or days), Categorical (gender: male or
female).
Analysis: Comparison of means for ICU stay duration between male and female newborns.
Is there any effect of the mother’s education on the prevalence of diarrhea among children
under 5 years?
Type of Data: Categorical (mother’s education level: no education, primary, secondary, etc.),
Categorical (diarrhea prevalence: yes or no).
Analysis: Comparison of diarrhea prevalence across different levels of maternal education.
Each question aims to establish a relationship between variables, either by comparing groups
or analyzing associations, using a mix of quantitative and categorical data types.
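As a sketch of the first analysis (a quantitative outcome compared across the two levels of a categorical variable), the snippet below computes the mean birth weight in each exposure group. The records are hypothetical, and `group_mean` is an illustrative helper.

```python
from statistics import mean

# Hypothetical records, for illustration only: birth weight in grams
# (quantitative) and maternal smoking exposure (categorical: yes/no).
newborns = [
    {"weight_g": 2900, "exposed": True},
    {"weight_g": 2700, "exposed": True},
    {"weight_g": 3100, "exposed": True},
    {"weight_g": 3300, "exposed": False},
    {"weight_g": 3500, "exposed": False},
    {"weight_g": 3100, "exposed": False},
]


def group_mean(records, exposed):
    """Mean birth weight among newborns with the given exposure status."""
    return mean(r["weight_g"] for r in records if r["exposed"] == exposed)


mean_exposed = group_mean(newborns, True)
mean_unexposed = group_mean(newborns, False)
difference = mean_unexposed - mean_exposed

print(f"Exposed: {mean_exposed} g, not exposed: {mean_unexposed} g, difference: {difference} g")
```

A difference in sample means is only the descriptive step. Deciding whether it reflects a real difference in the population would require an inferential method such as a hypothesis test.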
EXAMPLE
A researcher was interested in studying the proportion of adult smokers in Peshawar. A random
sample of size 10000 adults was selected. It was observed that 42% of the adults were smokers.
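In this example the 42% is a statistic, because it is computed from the sample, while the proportion of smokers among all adults in Peshawar is the parameter it estimates. The sketch below quantifies the uncertainty of that estimate with a 95% confidence interval. The normal approximation is an assumption of this sketch, reasonable for a large random sample, and is not stated in the original example.

```python
import math
from statistics import NormalDist

n = 10_000     # sample size (from the example)
p_hat = 0.42   # sample proportion of smokers (from the example)

# Number of smokers observed in the sample.
smokers_in_sample = round(n * p_hat)

# Standard error of a sample proportion.
se = math.sqrt(p_hat * (1 - p_hat) / n)

# 95% confidence interval using the normal approximation.
z = NormalDist().inv_cdf(0.975)   # about 1.96
lower = p_hat - z * se
upper = p_hat + z * se

print(f"{smokers_in_sample} smokers; 95% CI: {lower:.3f} to {upper:.3f}")
```

Under these assumptions the interval runs from roughly 41% to 43%, so the large sample pins down the population proportion quite tightly.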
TUTORIAL
A researcher was interested in studying the average income of households in Pakistan. A random
sample of 1,000 households was selected. The average income of these households was Rs 3000
per month.
Q. If you are asked to select a random sample of 1,000 households from the population, what would
be your strategy to select a representative sample? Explain.
Explanation:
To select a representative sample of 1,000 households, I would use a stratified random sampling
approach. This involves dividing the population into relevant subgroups, or strata, such as
geographic region or income level, that reflect key characteristics of the population. I would then
determine the sample size proportionally for each stratum (e.g., if 60% of the population is urban,
select 600 urban households) and use simple random sampling within each stratum to select
households. This ensures that all segments of the population are represented, creating a more
accurate and reliable sample.
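The strategy described above can be sketched in a few lines of Python. The sampling frame, the two strata, and the function name `stratified_sample` are hypothetical; a real frame would come from a census or household register.

```python
import random


def stratified_sample(households, total_size, seed=0):
    """Proportional stratified random sample.

    `households` maps each stratum name to the list of household IDs in it.
    """
    rng = random.Random(seed)
    population_size = sum(len(ids) for ids in households.values())
    sample = {}
    for stratum, ids in households.items():
        # Proportional allocation: the stratum's share of the sample
        # equals its share of the population.
        k = round(total_size * len(ids) / population_size)
        # Simple random sampling without replacement within the stratum.
        sample[stratum] = rng.sample(ids, k)
    return sample


# Hypothetical sampling frame: 60% urban and 40% rural households.
frame = {
    "urban": [f"U{i}" for i in range(6000)],
    "rural": [f"R{i}" for i in range(4000)],
}

chosen = stratified_sample(frame, total_size=1000)
print({stratum: len(ids) for stratum, ids in chosen.items()})
```

With these shares the allocation is 600 urban and 400 rural households. When the shares do not divide evenly, rounding can leave the total one or two off the target, so a real design would adjust the largest stratum to compensate.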
HOME WORK
An epidemiological study was carried out in Peshawar to see whether obesity is associated with
hypertension in young adults. The investigators decided that it was not feasible to take a sample
from all the young adults in Peshawar. The investigators felt that commercial fitness
centres might provide a good source of young adults.
Explanation:
For this study, investigators are exploring the relationship between obesity and hypertension
among young adults in Peshawar, using a sample from commercial fitness centers as a practical
approach. Since sampling the entire population of young adults in Peshawar isn’t feasible, fitness
centers offer an accessible subgroup where young adults concerned with fitness or health may
already be more prevalent. However, this sampling method introduces selection bias because the
sample may not represent all young adults, especially those who do not attend fitness centers. This
bias might affect the generalizability of the findings, as individuals at fitness centers could differ
from the broader population in terms of health awareness and lifestyle habits. To mitigate this,
investigators could diversify their sample sources or adjust for potential biases in their analysis.
Example:
A random sample of young adults was studied from several fitness centres, randomly selected
from a detailed list of fitness centres in Peshawar, and the body weight and blood pressure of the
selected subjects were measured. Answer the following questions:
Solved:
1. What is the target population for this study? All young adults in Peshawar.
2. What is the study population in this study? Young adults attending fitness centers in
Peshawar.
3. Does the sample represent the study population? Yes, the sample likely represents the study
population, as the sample was chosen randomly from a detailed list of fitness centers, reducing
sampling bias within this specific group.
4. Does the study population represent the target population? Not entirely. The study population may
not fully represent the target population, as young adults attending fitness centers might differ in
health-related behaviors and awareness compared to those who do not. This could affect the
generalizability of the findings to all young adults in Peshawar.
At the health mela held at Khyber Medical University, an opinion poll was
conducted about the health policies of the government. From the following data sheet,
determine the types of data collected for the poll.
Read newspapers  Opinion  Education  Socio-economic status  Sex  Age in years  Study p
2                3        2          2                      2    70            1
2                1        2          3                      2    42            2
1                4        3          3                      2    35            3
1                1        2          3                      2    54            4
1                3        1          2                      1    35            5
1                2        1          2                      2    55            6
1                5        3          1                      2    40            7
1                1        3          3                      2    40            8
1                1        1          3                      2    25            9
2                1        1          2                      1    49            1
2                5        1          1                      2    35            1
2                2        2          3                      2    50            1
1                4        4          3                      2    30            1
1                5        1          1                      2    38            1
2                5        5          1                      2    55            1
1                5        5          1                      2    40            1
1                5        1          1                      2    47            1
1                1        2          2                      2    51            1
1                1        1          3                      1    62            1
2                4        3          3                      2    65            2
2                1        4          2                      2    35            2
1                1        2          1                      2    60            2
1                1        2          3                      2    65            2
1                3        1          5                      1    21            2
1                5        2          1                      2    50            2
Sex
1. Male  2. Female
Level of Education
Opinion scale
Read newspapers
1. Yes  2. No
Solved
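One way to approach this exercise is to classify each column by its scale and then summarize it with a measure suited to that scale. The sketch below does this for three of the columns, using values copied from the data sheet. The classification itself (sex as nominal, opinion as ordinal, age as continuous) is one reasonable reading of the sheet, not an answer key taken from the original notes.

```python
from collections import Counter
from statistics import mean, median

# Values copied column by column from the data sheet (25 respondents).
sex = [2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2]
opinion = [3, 1, 4, 1, 3, 2, 5, 1, 1, 1, 5, 2, 4, 5, 5, 5, 5, 1, 1, 4, 1, 1, 1, 3, 5]
age = [70, 42, 35, 54, 35, 55, 40, 40, 25, 49, 35, 50, 30,
       38, 55, 40, 47, 51, 62, 65, 35, 60, 65, 21, 50]

# Nominal (sex): the codes are labels only, so count the categories.
sex_counts = Counter(sex)

# Ordinal (opinion): the order is meaningful, so the median code is a fair summary.
opinion_median = median(opinion)

# Continuous (age in years): arithmetic is meaningful, so the mean applies.
age_mean = mean(age)

print(dict(sex_counts), opinion_median, round(age_mean, 2))
```

The remaining columns can be treated the same way: reading newspapers is a two-category nominal variable, while education and socio-economic status are ordered categories.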