Basic Biostatistics Part I

The document is an introductory course on biostatistics, covering essential topics such as types of variables, data organization, measures of central tendency, and probability principles. It emphasizes the importance of statistics in health sciences, including applications in clinical trials, epidemiological studies, and survey design. Additionally, it outlines the limitations of statistics and various data sources, including primary and secondary data.

Uploaded by

Benyam Zenebe

Basics for Biostatistics Part I

Zeytu Gashaw Asfaw (PhD)


Department of Epidemiology and Bio-statistics
School of Public Health, Addis Ababa University
Addis Ababa, Ethiopia

November 6, 2024

Table of contents

1 Introduction to the course


2 Types of variables
3 Data and types of data
4 Scales of measurement
5 Methods of data organization and presentation
6 Measures of central tendency
7 Measures of dispersion
8 Basic principles of probability
9 References

Introduction to the course

What is Statistics?

Statistics is a science that deals with collecting, organizing, analyzing, and drawing
meaningful inferences from data, which lead to good decisions.
Statistics is the art and science of making decisions in the face of
uncertainty.
The field of statistics provides some of the most fundamental tools
and techniques of the scientific method:
forming hypotheses,
designing experiments and observational studies,
gathering data,
summarizing data,
drawing inferences from data (e.g., testing hypotheses)


Roughly speaking, the field of statistics can be divided into:
Mathematical Statistics: the study and development of statistical
theory and methods in the abstract; and
Applied Statistics: the application of statistical methods to solve real
problems involving randomly generated data, and the development of
new statistical methodology motivated by real problems.
Biostatistics: statistics as applied to the life and health sciences
Biostatistics is the branch of applied statistics directed toward
applications in the health sciences and biology.
Biostatistics is sometimes distinguished from the field of biometry
based upon whether applications are in the health sciences
(biostatistics) or in broader biology (biometry; e.g., agriculture,
ecology, wildlife biology).


Stages in Statistical Investigation

Data Collection
Organization and Presentation of data
Data Analysis
Interpretation of the results


Health Service Statistics

Health statistics are very useful to improve the health situation of the
population of a given country. For example, the following questions could
not be answered correctly unless the health statistics of a given area are
consolidated and given due emphasis.
What is the leading cause of death in the area?
Is it malaria, tuberculosis, etc.?
At what age is the mortality highest, and from what disease?
Are certain diseases affecting specified groups of the population more
than others? (This might apply, for example, to women or children, or
to individuals following a particular occupation.)


In comparison with similar areas, is this area healthier or not?


Are the health institutions in the area able to cope with the disease
problem?
Is there any season at which various diseases have a tendency to
break out? If so, can these be distinguished?
What are the factors involved in the incidence of certain diseases, like
malaria, tuberculosis, etc.?


Health service statistics are used to:

describe the level of community health


diagnose community ills
discover solutions to health problems and find clues for administrative
action
determine priorities for health programmes
promote health legislation
determine the met and unmet health needs
disseminate information on the health situation and health
programmes
determine success or failure of specific health programmes


Classification of Statistics

Descriptive Statistics
It helps to describe a given set of data without going beyond that data.
It consists of the collection, organization, summarization, and analysis of
data.

Inferential Statistics
It helps to make inferences/conclusions about a population based on a
selected sample.
It consists of predicting and forecasting values of population parameters, testing
hypotheses about values of population parameters, and making decisions.


Definition of some basic terms

A population consists of the set of all measurements/elements under
study.
A sample is a subset of the measurements selected from the
population.
A census is a complete enumeration of every item in a population.
Sampling is the process of taking a sample from a population.
A parameter is a statistical measure obtained from population data.
A statistic is a statistical measure obtained from sample data.
A variable is a characteristic under study that assumes different values
for different elements.
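
The terms above can be illustrated with a minimal Python sketch. The population here is simulated (hypothetical systolic blood pressure values), and the sizes 1000 and 50 are arbitrary assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: simulated systolic blood pressure (mmHg) of 1,000 people.
# These values are generated for illustration only; they are not real data.
random.seed(42)
population = [random.gauss(120, 15) for _ in range(1000)]

# Parameter: a measure computed from the whole population (as in a census).
population_mean = statistics.mean(population)

# Sampling: select a subset of the population without replacement.
sample = random.sample(population, 50)

# Statistic: the same measure computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"Parameter (population mean): {population_mean:.1f}")
print(f"Statistic (sample mean):     {sample_mean:.1f}")
```

The sample mean will usually be close to, but not equal to, the population mean, which is why a statistic is treated as an estimate of the corresponding parameter.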


Essential Application of Statistics

Research Interpretations and Conclusions


Meta-Analysis of Literature Reviews
Clinical Trial Design
Designing Surveys
Epidemiological Studies
Statistical Modeling


Research Interpretations and Conclusions

Statistics forms an important part of most sciences, helping
researchers test hypotheses, confirm (or reject) theories, and arrive at
reliable conclusions.
The data generated from experiments and studies are never
straightforward: one has to take into account randomness and
uncertainty, rule out coincidences, and arrive at the most accurate
findings.
Statistical analysis helps reduce or eliminate errors so that researchers
can confidently draw conclusions that will then direct further
research.


Meta-Analysis of Literature Reviews

Before a researcher or scientist embarks on new research, it is
customary to perform a comprehensive literature search of all the
available published information on a specific topic.
However, it is always difficult to make one definitive conclusion from
multiple studies, especially if the studies follow different research
methodologies, have been published in different journals (leading to
publication bias), or are spread over a large time range.
A statistical analysis of these studies helps extract the common truth
underlying all these studies, or uncover a hidden pattern or
relationship.


Clinical Trial Design

One of the most important applications of statistical analysis is in
designing clinical trials.
When a new drug or treatment is discovered, it has to first be tested
on a group/groups of people to understand its efficacy and safety.
A clinical trial involves selecting a population/sample size, defining
the time range over which to monitor the treatment, designing the
phases, and selecting parameters that will help decide how effective
the treatment is and if it is better than an existing one.
Biostatisticians can take on the task of performing a statistical
analysis of the study, helping not only to design it but also analyze
and determine the outcomes.


Designing Surveys

Do people who go to the gym lead a healthier, happier life? How safe
is the city of Addis Ababa? How effective is your HIV-awareness
programme? Questions like these cannot be answered without
the help of statistics.
Surveys require careful design and implementation, considerations
about the survey format, accounting for bias and fatigue, etc.
Data collected from surveys have to be carefully studied by statistical
analysis experts who also use their own discretion and experience to
derive the most meaningful information from a survey.
Through surveys, governments can determine the effectiveness of an
initiative, businesses can understand the response to a particular
product, and social scientists can perform quantitative research.


Epidemiological Studies

Epidemiological studies help determine the link between the cause
and effect of a disease, especially in outbreaks and epidemics.
A statistical analysis involves identifying the most likely cause of a
disease - for example, the link between smoking and lung cancer.
This information is used to develop public health policies and
implement preventive healthcare programmes.
Data visualization and statistical analysis also played an important
role in understanding the Ebola epidemic in West Africa.


Statistical Modeling

Statistical modeling involves building predictive models based on
pattern recognition and knowledge discovery.
It is used in environmental and geographical studies, predicting
election outcomes, survival analysis of populations, and more.
Meteorologists use statistical tools to help them predict the weather.
The line between statistical modeling and machine learning is
becoming increasingly blurry - Robert Tibshirani, a statistician at
Stanford, called machine learning "glorified statistics".


Limitation of Statistics

It does not study qualitative characteristics directly.
It does not deal with single individuals but with aggregates of
facts.
Statistical findings are approximate.
It is sensitive to misuse.


Sources of Data

In the age of information, data has become the driving force behind
decision-making and innovation.
Whether in business, science, healthcare, or government, data serves
as the foundation for insights and progress.
As a researcher, you need to understand the various sources of data as
they are essential for conducting comprehensive and impactful studies.


Sources of data are:
Primary Data Sources
Secondary Data Sources
Tertiary Data Sources
Emerging Data Sources


Primary Data Sources

Primary data sources refer to original data collected firsthand by
researchers specifically for their research purposes.
These sources provide fresh and relevant information tailored to the
study’s objectives.
Examples of primary data sources include surveys and questionnaires,
direct observations, experiments, interviews, and focus groups.
As a researcher, you must be familiar with primary data sources,
which are original data collected firsthand specifically for your
research purposes.


Secondary Data Sources

Secondary data sources involve data collected by someone else for
purposes other than your specific research.
Therefore, secondary data complements primary data and can provide
valuable context and insights to your research.
Examples of Secondary Data Sources
Published literature: Published literature refers to academic papers,
books, and reports published by researchers and scholars in various
fields.
Government sources: Government agencies collect and maintain
vast amounts of data on a wide range of topics.
Online databases: The internet has opened up access to a wealth of
data through online databases, data repositories, and open data
initiatives.
Market research reports: Market research companies conduct
surveys and gather data to analyze market trends, consumer behavior,
and industry insights.

Tertiary Data Sources

In addition to primary and secondary data, you should be aware of
tertiary data sources, which play a critical role in aggregating and
organizing existing data from various origins.
Tertiary data sources focus on collecting, curating, and preserving
data for easy access and analysis.
Examples of Tertiary Data Sources
Data aggregators: Data aggregators are companies or organizations
that specialize in collecting and compiling data from multiple sources
into centralized databases. These sources can include government
agencies, research institutions, businesses, and other data providers.
Data brokers: The best way to describe data brokers is that they are
entities that buy and sell data, often without the direct consent or
knowledge of the individuals whose data is being traded.
Data archives: Data archives serve as repositories for historical data
and research findings.

Emerging Data Sources

As you delve into the world of data collection, it's important to know
the emerging sources that have gained prominence in recent years.
These newer data sources provide valuable insights and opportunities
for research across various domains. Below are some of these
emerging data sources:
Examples of Emerging Data Sources
Internet of Things (IoT): The Internet of Things (IoT) has changed
data collection in the 21st century through the everyday connection
of devices and objects to the Internet. Smart devices like sensors,
wearables, and home appliances generate vast amounts of data in
real-time.
Social media and web data: Social media platforms and websites
host a wealth of information generated by users worldwide.
Sensor data: Sensor data is becoming increasingly relevant in various
fields, including environmental monitoring, urban planning, and
healthcare.
Types of variables

Quantitative variables: are quantifiable
1 Discrete
Assumes countable/countably infinite values.
Discrete data (whole numbers): Only certain values are possible (there
are gaps between the possible values). Implies counting.
Example: number of students in a class; number of cars in a parking
lot, etc.
2 Continuous
Assumes values in intervals.
Continuous data (decimal points): Theoretically, with a fine enough
measuring device, any value within an interval is possible. Implies measuring.
Example: weight, height, length, temperature, etc.
Qualitative / Categorical variables: are non-quantifiable
Example: color, nationality, sex,...

Data and types of data

Types of Data in Statistics?

Data are classified into four major categories:
Nominal data
Ordinal data
Discrete data
Continuous data


Qualitative or Categorical Data

Qualitative or categorical data are not numerical.
Qualitative or categorical data are summarized as counts of the number of
participants or observations in each category.
This data is often described with percentages or other ratios (e.g.,
risks).
Categorical data can fall into 2 classifications: nominal or ordinal.
The objects being studied are grouped into categories based on some
qualitative trait.
The resulting data are merely labels or categories.
Nominal and Ordinal scales will be used for categorical data or
qualitative data.
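
As a minimal sketch of how categorical data are summarized by counts and percentages, the following Python snippet tallies a hypothetical list of smoking-status observations. The ten observations are invented for illustration.

```python
from collections import Counter

# Hypothetical smoking-status observations for 10 participants (illustrative only).
smoking_status = [
    "smoker", "non-smoker", "non-smoker", "smoker", "non-smoker",
    "non-smoker", "non-smoker", "smoker", "non-smoker", "non-smoker",
]

# Count the observations in each category.
counts = Counter(smoking_status)
total = len(smoking_status)

# Express each count as a percentage of all observations.
percentages = {category: 100 * n / total for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category}: {n} ({percentages[category]:.0f}%)")
```

Only counting is used here; no arithmetic is performed on the category labels themselves, because they are merely names.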


Examples of categorical variables are

race - White, Black or African American, American Indian,...
sex - male or female
age group - Neonates (birth to 1 month), Infants (1 month to 1
year), Children (1 - 12 yrs), Adolescents (13 - 17 yrs), Adults (18
years or older), Older adults (65 and older)*
educational level - BSc, MSc, PhD
HIV test, COVID test, Pregnancy test - positive or negative
Smoking - smoker or non-smoker
Questionnaire response - agree, disagree, neutral
Nationality - Ethiopian, Kenyan, Norwegian,...
Colour - red, white, green,...


Quantitative Data

The objects being studied are "measured" based on some quantitative
trait.
The resulting data are a set of numbers.
Interval and Ratio scales will be used to measure quantitative data.
Examples
Pulse rate
Exam marks
Height
Time to complete a biostatistics exam
Age
Number of cigarettes smoked


Quantitative data are of two types:
Discrete
Continuous


Discrete data - gaps between possible values. Examples are:
Number of children in a family
Number of students passing a stats exam
Number of crimes reported to the police
Number of bicycles sold in a day
Examples of Continuous Data: ( All clinical data examples)
Age (in years)
Height (in cm)
Weight (in kg)
Sys.BP, Hb, etc
Generally, continuous data comes from measurements.


[Figure: Quantitative (Numerical) vs Qualitative (Categorical)]

Scales of measurement

In statistics, variables or numbers are defined and categorised
using different scales of measurement.
Each level of measurement has specific properties that
determine which statistical analyses can be used.
What is a scale?
A scale is a device or an object used to measure or quantify any event
or another object.


Levels of Measurements

There are four different scales of measurement. The data can be defined
as being one of the four scales. The four types of scales are:
Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale


Hierarchical Data Order

These levels of measurement can be placed in a hierarchical order:
Ratio > Interval > Ordinal > Nominal
Nominal data is the least complex and gives a simple measure of
whether objects are the same or different.
Ordinal data maintains the principles of nominal data but adds a
measure of order to what is being observed.
Interval data builds on ordinal data by allowing us to measure the
distance between observations.
Ratio data adds to interval data by including an absolute zero.
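
The cumulative nature of this hierarchy can be sketched in Python. The names `LEVELS`, `ADDED_PROPERTY`, and `properties` are hypothetical and introduced only for this illustration; the property labels paraphrase the descriptions above.

```python
# Levels of measurement, from least to most informative.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The single property that each level adds on top of the levels below it.
ADDED_PROPERTY = {
    "nominal": "same/different",
    "ordinal": "order",
    "interval": "distance",
    "ratio": "absolute zero",
}


def properties(level):
    """Return the cumulative properties of a measurement level."""
    position = LEVELS.index(level)
    return [ADDED_PROPERTY[name] for name in LEVELS[: position + 1]]


for level in LEVELS:
    print(f"{level:<8} -> {', '.join(properties(level))}")
```

Each level inherits every property of the levels before it, which is why ratio data supports the widest range of analyses and nominal data the narrowest.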


Nominal Scale

A nominal scale is the 1st level of measurement scale in which the
numbers serve as "tags" or "labels" to classify or identify the objects.
A nominal scale usually deals with non-numeric variables or with
numbers that do not have any quantitative value.

Characteristics of Nominal Scale

A nominal scale variable is classified into two or more categories.
In this measurement mechanism, each answer should fall into exactly one of
the classes.
It is qualitative.
The only permissible aspect of numbers in the nominal scale is
"counting."

Examples of Nominal Scale

Color - red, blue, green, yellow,...


Sex - male, female
Religion - Christianity, Islam, ...
Eye color - Blue, brown, black, green, etc.
Smoking status: Smoker, non-smoker


Ordinal Scale

The ordinal scale is the 2nd level of measurement that reports the
ordering and ranking of data without establishing the degree of
variation between them.
Ordinal represents the "order."
Ordinal data is known as qualitative data or categorical data.
It can be grouped, named and also ranked.
Characteristics of the Ordinal Scale

The ordinal scale shows the relative ranking of the variables


It identifies and describes the magnitude of a variable
Along with the information provided by the nominal scale, ordinal
scales give the rankings of those variables
The interval properties are not known
The surveyors can quickly analyse the degree of agreement concerning
the identified order of variables

Examples of Ordinal Scale

Ranking of school students 1st, 2nd, 3rd, etc.


Ratings in restaurants, hospitals,...
Evaluating the frequency of occurrences
Very often, Often, Not often, Not at all
Assessing the degree of agreement
Totally agree, Agree, Neutral, Disagree, Totally disagree
Economic status (poor, medium, higher)
military ranks...
Grades in an exam: A+, A, B+, B, C+, C, D+, D, and fail.
Degree of illness; none, mild, moderate, acute, chronic.
Opinion of students about stat classes; Very unhappy, unhappy,
neutral, happy, ecstatic!


Nominal data (Binary) and Ordinal data:

1 What is your gender? Nominal


Male
Female
2 Did you enjoy the teaching session? Nominal
Yes
No
3 What is your level of satisfaction with the new curriculum at the medical
school? Ordinal
Very satisfied
Somewhat satisfied
Neutral
Somewhat dissatisfied
Very dissatisfied
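
The difference between nominal and ordinal responses affects which summaries make sense. The Python sketch below uses seven hypothetical responses to the satisfaction question: the mode needs only counting, while the median relies on the category order.

```python
from collections import Counter

# Ordered categories for the satisfaction question (lowest to highest).
SATISFACTION = [
    "Very dissatisfied", "Somewhat dissatisfied", "Neutral",
    "Somewhat satisfied", "Very satisfied",
]

# Hypothetical responses from seven students (illustrative only).
responses = [
    "Neutral", "Very satisfied", "Somewhat satisfied", "Neutral",
    "Somewhat dissatisfied", "Somewhat satisfied", "Somewhat satisfied",
]

# Mode: the most frequent category. It needs no ordering, so it also works for nominal data.
mode = Counter(responses).most_common(1)[0][0]

# Median: sort the responses by rank and take the middle one (the number of responses is odd).
ranked = sorted(responses, key=SATISFACTION.index)
median = ranked[len(ranked) // 2]

print("Mode:", mode)
print("Median:", median)
```

A median of gender or yes/no answers would be meaningless because those categories have no order, whereas a mean of the satisfaction labels is not defined because the distances between categories are unknown.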


Interval Scale

The interval scale is the 3rd level of measurement scale.
It is defined as a quantitative measurement scale in which the
difference between two values is meaningful.
In other words, the values are measured in an exact manner, not
in a relative way, but the position of zero is arbitrary.
Characteristics of Interval Scale

The interval scale is quantitative as it can quantify the difference


between the values
It allows calculating the mean and median of the variables
To understand the difference between the variables, you can subtract
the values between the variables
The interval scale is the preferred scale in Statistics as it helps to
assign any numerical values to arbitrary assessment such as feelings,
calendar types, etc.

Examples of Interval Scale

Intelligence (IQ test score of 100, 110, 120, etc.)


Pain level (1-10 scale)
temperature (Fahrenheit)
temperature (Celsius)
pH


Ratio Scale

The ratio scale is the 4th level of measurement scale, which is
quantitative.
It is a type of variable measurement scale.
It allows researchers to compare the differences or intervals.
The ratio scale has a unique feature:
it possesses a true origin, or zero point.


Characteristics of Ratio Scale

Ratio scale has a feature of absolute zero.
It doesn’t have negative numbers, because of its zero-point feature.
It affords unique opportunities for statistical analysis.
The values can be meaningfully added, subtracted, multiplied, and divided.
Mean, median, and mode can be calculated using the ratio scale.
Ratio scale has unique and useful properties.
One such feature is that it allows unit conversions like kilogram -
calories, gram calories, etc.
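
Why the absolute zero matters can be shown with a small Python sketch comparing temperature on interval scales (Celsius, Fahrenheit) with the kelvin scale, which has a true zero. The two readings, 10 and 20 degrees Celsius, are arbitrary values chosen for illustration.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


def celsius_to_kelvin(c):
    """Convert degrees Celsius to kelvin."""
    return c + 273.15


low_c, high_c = 10.0, 20.0

# Interval scales: the ratio of two readings changes with the (arbitrary) zero point.
ratio_c = high_c / low_c
ratio_f = celsius_to_fahrenheit(high_c) / celsius_to_fahrenheit(low_c)

# Ratio scale: kelvin starts at absolute zero, so this ratio is meaningful.
ratio_k = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(f"Celsius ratio:    {ratio_c:.2f}")
print(f"Fahrenheit ratio: {ratio_f:.2f}")
print(f"Kelvin ratio:     {ratio_k:.2f}")
```

The same pair of temperatures gives a different ratio in Celsius and in Fahrenheit, so saying that 20 degrees is "twice as hot" as 10 degrees is not justified on an interval scale; only the kelvin ratio reflects a real physical proportion.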


Examples of Ratio scale

weight
pulse rate
respiratory rate
body temperature (K)
body length in infants or height in adults.
enzyme activity
dose amount
reaction rate
flow rate
concentration


Data types - Important?

Why do we need to know what type of data we are dealing with?


The data type or level of measurement influences the type of
statistical analysis techniques that can be used when analysing data.


To conclude

The types of variables in any data set are:
Categorical (qualitative) - Nominal and Ordinal
Quantitative - Discrete and Continuous
The scales used to measure these two types of variables are:
Nominal scales
Ordinal scales
Interval scales
Ratio scales.


Data Quality

The following are some of the key characteristics of high quality data:
1 Data accuracy
2 Data completeness
3 Data consistency
4 Data coherence
5 Data timeliness
6 Clear and accessible data definitions
7 Data relevance
8 Data reliability


[Figure: visual representation of the attributes of a high-quality data set]

Methods of data organization and presentation

Method of data presentation

Collected data are often too voluminous to use in raw form, so they should be organized.

Data presentation is used to:
1 display the points of similarity and dissimilarity
2 condense the data and suppress irrelevant detail
3 enable one to form a mental picture of the objects of perception
4 prepare the ground for comparison and inference


...Method of data presentation

There are two main methods of presenting data on a variable:
1 Tabulation/Tabular Presentation
2 Drawing/Graphical Presentation


Tabulation

Tabulation is a device for summarizing and presenting a mass of statistical data.
Preparing a frequency distribution table is the first requirement for it.
Tables are simple or complex depending on whether they record a single characteristic or several characteristics.


Frequency Distribution

The distribution of the total number of observations among the various categories is termed a frequency distribution.
Frequency distribution is a very important step in statistical analysis.
It groups a sizable number of observations from the master table and presents the data concisely, giving all the information at a glance.
It records how frequently a characteristic or an event occurs in persons of the same group.
Data are often recorded in the form of a frequency table.
In short, a frequency distribution collects and summarizes a large amount of data.
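
As a minimal sketch of how a frequency table might be built in practice, the following counts how often each category occurs and adds the relative frequency. The blood-group values are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Hypothetical raw observations (one blood group per subject).
blood_groups = ["A", "O", "B", "O", "AB", "A", "O", "B", "O", "A"]

# Absolute frequency of each category.
frequency = Counter(blood_groups)
total = sum(frequency.values())

# Rows of (category, frequency, relative frequency), most frequent first.
table = [(category, count, count / total)
         for category, count in frequency.most_common()]

print(f"{'Group':<6}{'Freq':>5}{'Rel. freq':>11}")
for category, count, relative in table:
    print(f"{category:<6}{count:>5}{relative:>11.2f}")
```

The relative frequencies sum to 1, which is a quick check that no observation was dropped while tabulating.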


Types of Tabulation

1 Simple Tabulation
2 Complex Tabulation
Simple Tabulation
Simple tabulation is when the data are tabulated according to one characteristic.
For example, a survey that determines the number of employees of a firm owning different brands of mobile phones.


Graphs for quantitative data

Histogram
A histogram is a graphical display of data using bars of various
heights.
In a histogram, each bar groups numbers into ranges. Taller bars show that more data fall in that range.
A histogram displays the shape and spread of continuous sample data.
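
The grouping step behind a histogram can be sketched without any plotting library: each value is assigned to a range (bin) and the bar height is the count in that bin. The ages, the bin start, and the bin width below are illustrative assumptions.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # values outside the bins are ignored
            counts[index] += 1
    return counts


ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
counts = histogram_counts(ages, start=20, width=10, n_bins=5)

# Text rendering: one row per bin, one '#' per observation.
for i, count in enumerate(counts):
    low = 20 + 10 * i
    print(f"{low}-{low + 10}: {'#' * count}")
```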


...Graphs for quantitative data

Frequency Polygon
A frequency polygon is a graph constructed by using lines to join the
midpoints of every interval or bin.
The heights of the points depict the frequencies.
A frequency polygon is usually created from the histogram or by
calculating the midpoints of the bins from the frequency distribution
table.
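
The points of a frequency polygon can be derived directly from a frequency distribution table, as in this sketch. The bin edges and frequencies are assumed values for illustration.

```python
# Assumed class boundaries and the frequency of each class.
bin_edges = [20, 30, 40, 50, 60, 70]
frequencies = [3, 4, 1, 1, 1]

# The x-coordinate of each point is the midpoint of its class.
midpoints = [(low + high) / 2 for low, high in zip(bin_edges, bin_edges[1:])]

# The polygon is drawn by joining these (midpoint, frequency) points with lines.
polygon_points = list(zip(midpoints, frequencies))
print(polygon_points)
```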

[Figure: Frequency Polygon]

...Graphs for quantitative data

Frequency Curve
A frequency curve is a smooth curve for which the total area is taken to be unity.
It is a limiting form of a histogram or frequency polygon.
The frequency curve for a distribution is obtained by drawing a smooth freehand curve through the mid-points of the upper sides of the rectangles forming the histogram.

[Figure: Frequency Curve]

...Graphs for quantitative data

Line Chart
A line chart connects a series of data points with a continuous line; in finance it is a graphical representation of an asset's historical price action.
It is the most basic type of chart used in finance and typically depicts only a security's closing prices over time.

[Figure: Line Chart]

...Graphs for quantitative data

Normal Distribution Curve


A normal distribution is a type of continuous probability distribution
for a real-valued random variable.
A normal distribution is usually informally called a bell curve.

[Figure: Normal Distribution Curve]

...Graphs for quantitative data

Cumulative Distribution Curve


In statistics, the cumulative distribution function of a real-valued random variable X, or just distribution function of X, evaluated at x, is the probability that X will take a value less than or equal to x:
\[ F_X(x) = P(X \le x) \]
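
For a sample, the analogous quantity is the empirical cumulative distribution: the proportion of observations less than or equal to a given value. The sketch below computes this sample version, not the theoretical function; the ages and the helper name `empirical_cdf` are illustrative assumptions.

```python
def empirical_cdf(sample, x):
    """Proportion of observations in `sample` that are less than or equal to x."""
    return sum(1 for value in sample if value <= x) / len(sample)


ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Evaluating at each sorted distinct value gives the steps of the cumulative curve.
for x in sorted(set(ages)):
    print(f"F({x}) = {empirical_cdf(ages, x):.1f}")
```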

[Figure: Cumulative Distribution Curve]

...Graphs for quantitative data

Scatter Diagram
A graph in which the values of two variables are plotted along two axes, the pattern of the resulting points revealing any correlation present.

[Figure: Scatter Diagram]

Diagrams for qualitative data

Bar Chart
A bar chart or bar graph is a chart or graph that presents categorical
data with rectangular bars with heights or lengths proportional to the
values that they represent.
The bars can be plotted vertically or horizontally.
A vertical bar chart is sometimes called a column chart.

[Figure: Bar Chart]

...Diagrams for qualitative data

Pictogram
A pictogram is a chart that uses pictures to represent data.
Pictograms are set out in the same way as bar charts, but instead of bars they use columns of pictures to show the numbers involved.

[Figure: Pictogram]

...Diagrams for qualitative data

Pie Chart
A pie chart is a type of graph in which a circle is divided into sectors, each representing a proportion of the whole.
Pie charts are a useful way to organize data in order to see the size of components relative to the whole, and are particularly good at showing percentage or proportional data.
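
The size of each sector follows from the category's share of the total: the percentage is the share times 100 and the central angle is the share times 360 degrees. The counts in this sketch are hypothetical.

```python
# Hypothetical counts for a two-category variable.
counts = {"Male": 45, "Female": 55}
total = sum(counts.values())

# For each category: (percentage of the whole, central angle in degrees).
sectors = {
    label: (count * 100 / total, count * 360 / total)
    for label, count in counts.items()
}

for label, (percent, angle) in sectors.items():
    print(f"{label}: {percent:.1f}% of the circle, {angle:.1f} degrees")
```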

[Figure: Pie Chart]

...Diagrams for qualitative data

Map Diagram
A map diagram represents the distribution of an event by means of diagrams placed on a map within its territorial divisions; each diagram expresses the summarized value of the event within the bounds of its territorial unit.

[Figure: Map Diagram]

Which chart or graph would be appropriate to display the proportion of males versus females among the shoppers?
A) A bar graph
B) A time plot
C) A pie chart
D) Choices (A) and (C)
E) Choices (A), (B) and (C)

Measures of central tendency

Measures of Central Tendency

Measures of central tendency (averages) help to condense a mass of data into a single representative value.
An average is a single value intended to represent a data set as a whole.
It characterizes the average or typical behavior of the data.


...Measures of Central Tendency

The best averages are:

based on all the observations
simple to understand and easy to interpret
easily manipulated algebraically
little affected by fluctuations of sampling
not unduly influenced by extreme values


...Measures of Central Tendency

There are many types of central tendency measures:


Arithmetic mean
Weighted arithmetic mean
Geometric mean
Median
Mode


Arithmetic Mean (AM)

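The arithmetic mean of a set of n numbers X_1, X_2, ..., X_n:

\[ \mathrm{AM} = \frac{X_1 + X_2 + \cdots + X_n}{n} \]

Arithmetic mean for a population (size N) and a sample (size n):

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N}, \qquad \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]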


Properties of Arithmetic Mean

It requires at least the interval scale


All values are used
It is unique
It is easy to calculate and allows easy mathematical treatment
The sum of the deviations from the mean is 0
The arithmetic mean is the only measure of central tendency where
the sum of the deviations of each value from the mean is zero!
It is easily affected by extremes, such as very big or small numbers in
the set (non-robust).


How Do Extremes Affect the Arithmetic Mean?

The mean of the values 1,1,1,1,100 is 20.8.


However, 20.8 does not represent the typical behavior of this data set!
Numbers that are extreme relative to the rest of the data are called outliers.
Examination of data for possible outliers serves many useful purposes,
including
Identifying strong skew in the distribution.
Identifying data collection or entry errors.
Providing insight into interesting properties of the data.
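
The effect of the outlier in the example above can be checked with Python's standard `statistics` module. This is only a sketch; the comparison with the median is added here for contrast.

```python
import statistics

values = [1, 1, 1, 1, 100]

mean_value = statistics.mean(values)      # pulled towards the outlier 100
median_value = statistics.median(values)  # stays at the typical value

# The deviations from the mean always sum to zero (up to rounding error).
sum_of_deviations = sum(v - mean_value for v in values)

print(f"mean = {mean_value}, median = {median_value}")
print(f"sum of deviations from the mean = {sum_of_deviations:.10f}")
```

Removing the single value 100 would bring the mean down to 1, which shows how strongly one extreme observation drives it.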


Weighted Arithmetic Mean (WAM)

The Weighted Arithmetic Mean (WAM) of a set of n numbers X_1, ..., X_n with weights w_1, ..., w_n:

\[ \mathrm{WAM} = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} \]

This formula will be used to calculate the mean and variance for
grouped data
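
A minimal sketch of the weighted arithmetic mean follows. The scores and weights are hypothetical, and the function name `weighted_mean` is an assumption made for illustration.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical scores, with later scores counting more heavily.
scores = [70, 80, 90]
weights = [1, 2, 3]

wam = weighted_mean(scores, weights)
print(f"weighted mean = {wam:.2f}")
```

With equal weights the formula reduces to the ordinary arithmetic mean (80 for these scores).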


Geometric Mean (GM)

Given a set of n numbers x_1, ..., x_n, the geometric mean is given by the following formula:

\[ \mathrm{GM} = \sqrt[n]{x_1 x_2 \cdots x_n} \]

If we know the initial and final values over n periods (instead of the individual numbers in each period), then

\[ \mathrm{GM} = \sqrt[n]{\frac{\text{final value}}{\text{initial value}}} \]
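
Both forms of the geometric mean can be sketched as follows. The growth factors and the initial and final values are hypothetical, the function names are assumptions, and the product form is computed through logarithms purely as a numerical convenience.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_from_endpoints(initial, final, n_periods):
    """n-th root of final/initial: the average growth factor per period."""
    return (final / initial) ** (1 / n_periods)


# Hypothetical growth factors over three periods (+10%, +20%, +30%).
factors = [1.10, 1.20, 1.30]
gm = geometric_mean(factors)

# The same average factor recovered from the end points alone:
# 100 grows to 100 * 1.10 * 1.20 * 1.30 = 171.6.
gm_endpoints = geometric_mean_from_endpoints(100.0, 171.6, 3)

print(f"GM from factors:    {gm:.4f}")
print(f"GM from end points: {gm_endpoints:.4f}")
```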


Arithmetic Mean vs Geometric Mean: the AM-GM inequality:

If x_1, ..., x_n ≥ 0, then

\[ \mathrm{AM} = \frac{x_1 + x_2 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 x_2 \cdots x_n} = \mathrm{GM} \]

with equality if and only if x_1 = x_2 = ... = x_n.


Properties of Geometric Mean

Similar to the arithmetic mean, except that it is used in different scenarios
It requires interval level
All values are used
It is unique
It is easy to calculate and allows easy mathematical treatment


Median

The Median is the midpoint of the values after they have been
ordered from the smallest to the largest
Equivalently, the Median is a number which divides the data set into two equal parts: each item in one part is no greater than this number, and each item in the other part is no less than it.


Two-step process to find the median

Step 1. Sort the data in nondecreasing order.
Step 2.
If the total number of items n is odd, then the number in the (n+1)/2-th position is the median;
If n is even, then the average of the two numbers in the (n/2)-th and (n/2 + 1)-th positions is the median.
For ordinal-level data, choose either of the values in the two middle positions.
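
The two steps above can be sketched as a small function for numeric data. The function name `median_of` and the sample values are illustrative assumptions; positions in the slides are 1-based, whereas Python indexes from 0.

```python
def median_of(values):
    """Median by the two-step rule: sort, then take the middle position(s)."""
    ordered = sorted(values)            # Step 1: nondecreasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n + 1) / 2 is 0-based index n // 2.
        return ordered[middle]
    # Even n: average of 1-based positions n/2 and n/2 + 1.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_sample = [1, 1, 1, 1, 100]
even_sample = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

print(median_of(odd_sample))   # middle value of five items
print(median_of(even_sample))  # average of the 5th and 6th ordered values
```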


Properties of Median

It requires at least the ordinal scale


All values are used
It is unique
It is easy to calculate but does not allow easy mathematical treatment
It is not affected by extremely large or small numbers (robust)


Mode

The mode is the number that occurs most often in a data set.
The number that has the highest frequency.
It is the value which occurs the maximum number of times in a given
data set.


Example - Mode

The exam scores for ten students are: 81, 93, 84, 75, 68, 87, 81, 75,
81, 87
The score of 81 occurs the most often. It is the Mode!
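
The example can be verified by counting how often each score occurs. The sketch uses `statistics.multimode` from the Python standard library (available from Python 3.8), which returns every value that shares the highest frequency; the second data set is an added illustration of a tie.

```python
import statistics
from collections import Counter

scores = [81, 93, 84, 75, 68, 87, 81, 75, 81, 87]

# Frequency of every score, most frequent first.
score_counts = Counter(scores).most_common()
modes = statistics.multimode(scores)

print(score_counts[:3])
print(f"mode(s): {modes}")

# When two values tie for the highest frequency, both are returned.
tied_modes = statistics.multimode([1, 1, 2, 2, 3])
print(f"mode(s) of 1,1,2,2,3: {tied_modes}")
```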


Properties of Mode

Even nominal data have mode(s)


All values are used
It is not unique
Modeless: if all data have different values, such as 1,2,3
Multimodal: if more than one value shares the highest frequency, such as 1,1,2,2,3.
It is easy to calculate but does not allow easy mathematical treatment
It is not affected by extremely large or small numbers (robust)


Mean for grouped data

The mean of a sample of data organized in a frequency distribution is computed by the following formula:

\[ \bar{x} = \frac{f_1 x_1 + \cdots + f_k x_k}{f_1 + \cdots + f_k} \]

where f_i is the frequency of Class i and x_i is the class mid-point of Class i.
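
The grouped-data formula is the weighted arithmetic mean with class mid-points as values and class frequencies as weights. A minimal sketch, assuming hypothetical classes of width 10:

```python
def grouped_mean(class_limits, frequencies):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i), with x_i the class mid-point."""
    if len(class_limits) != len(frequencies):
        raise ValueError("one frequency is needed per class")
    midpoints = [(low + high) / 2 for low, high in class_limits]
    return sum(f * x for f, x in zip(frequencies, midpoints)) / sum(frequencies)


# Hypothetical frequency distribution (class limits and class frequencies).
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
class_frequencies = [3, 4, 1, 1, 1]

mean_grouped = grouped_mean(classes, class_frequencies)
print(f"grouped mean = {mean_grouped}")
```

Because every observation in a class is represented by the class mid-point, the grouped mean is generally an approximation to the mean of the raw data.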

Measures of dispersion

Measures of dispersion

What are measures of dispersion?
Why measures of dispersion?
How are measures of dispersion calculated?
Range
Quartile deviation or semi inter-quartile range,
Mean deviation and
Standard deviation.
Methods for detecting outliers
Measure of Relative Standing
Measure of shape


Measures of dispersion

describe the homogeneity or heterogeneity of the distribution,


understand the reliability of the mean,
compare the distributions as regards the variability.
describe the relative standing of the data and also shape of the
distribution.


What are measures of dispersion?

Central tendency measures do not reveal the variability present in the data.
Dispersion is the scatteredness of the data series around its average.
Dispersion is the extent to which values in a distribution differ from
the average of the distribution.


Why measures of dispersion? (Significance)

Determine the reliability of an average


Serve as a basis for the control of the variability
Compare the variability of two or more series
Facilitate the use of other statistical measures.


Characteristics of an Ideal Measure of Dispersion

It should be rigidly defined.


It should be easy to understand and easy to calculate.
It should be based on all the observations of the data.
It should be easily subjected to further mathematical treatment.
It should be least affected by sampling fluctuations.
It should not be unduly affected by the extreme values


How dispersions are measured?

Measures of dispersion:
Absolute: measures the dispersion in the original unit of the data.
Variability in two or more distributions can be compared provided they are given in the same unit and have the same average.
Relative: a measure of dispersion that is free from the unit of measurement of the data.
It is the ratio of a measure of absolute dispersion to the average from which the absolute deviations are measured.
It is called the coefficient of dispersion.


Types of measures of dispersion

The two types of measures of dispersion are -


Absolute measures of dispersion
Relative measures of dispersion

[Figure: Types of measures of dispersion]

Absolute measures of dispersion

Absolute measures of dispersion are expressed in the same unit in which the observations are given.
These measures are useful for comparing variation in two or more distributions where the units of measurement are the same.
These measures cannot be used for comparing the variability of distributions expressed in dissimilar units.


Absolute measures of dispersion

Distance deviation measures or measures of limits


These use distance of spread between two values in the data set.
This distance becomes a measure of variability, or measure of dispersion.
The larger the distance between two values the greater is the variability.
The methods for the study of measure of dispersion by distance include
range, percentiles, quartile deviation, semi-interquartile deviations.
Average deviation measures
They are averages of the deviations measured from a measure of central tendency.
They are used more commonly for measuring variability or dispersion.
These include mean deviation, standard deviation and variance.


Relative measures of dispersion

These are expressed as a ratio or percentage: the coefficient of an absolute measure of dispersion.
Therefore, relative measures of dispersion are also called coefficients of dispersion.
These are pure, unitless numbers.
Relative measures are used for comparing variability in two or more
distributions having different units of measurements.


Range

The difference between the values of the two extreme items of a series.
Range = Maximum value − Minimum value
Example: The ages of a sample of 10 subjects from a population of 169 subjects are:
X1  X2  X3  X4  X5  X6  X7  X8  X9  X10
42  28  28  61  31  23  50  34  32  37
The youngest subject in the sample is 23 years old and the oldest is 61 years, so the range is:
Range = Xmax − Xmin = 61 − 23 = 38
Coefficient of range:
\[ \text{Coefficient of range} = \frac{X_{\max} - X_{\min}}{X_{\max} + X_{\min}} = \frac{61 - 23}{61 + 23} = \frac{38}{84} = 0.452 \]
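
The worked example can be reproduced in a few lines. This sketch uses the ten ages from the example; the variable names are assumptions.

```python
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

x_max, x_min = max(ages), min(ages)

# Absolute measure: expressed in the unit of the data (years).
value_range = x_max - x_min

# Relative measure: a unit-free ratio.
coefficient_of_range = (x_max - x_min) / (x_max + x_min)

print(f"range = {value_range}")
print(f"coefficient of range = {coefficient_of_range:.3f}")
```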


Characteristics of Range

Simplest and most crude measure of dispersion


It is not based on all the observations.
Unduly affected by the extreme values and fluctuations of sampling.
The range may increase as the number of observations grows, but it can never decrease.
Gives an idea of the variability very quickly


Percentiles, Quartiles (Measure of Relative Standing) and Interquartile Range

Descriptive measures that locate the relative position of an


observation in relation to the other observations are called measures
of relative standing.
They are quartiles, deciles and percentiles
The quartiles and the median divide the array into four equal parts,
deciles into ten equal groups, and percentiles into one hundred equal
groups.
Given a set of n observations X1, X2, ..., Xn, the p-th percentile P is the
value of X such that p percent of the observations are less than P and
100 − p percent of the observations are greater than P.
25th percentile = 1st Quartile i.e., Q1
50th percentile = 2nd Quartile i.e., Q2
75th percentile = 3rd Quartile i.e., Q3
Measures of dispersion

Percentiles, Quartiles (Measure of Relative Standing) and Interquartile Range

$Q_1 = \left(\frac{n+1}{4}\right)^{th}$ ordered observation

$Q_2 = \left(\frac{2(n+1)}{4}\right)^{th}$ ordered observation

$Q_3 = \left(\frac{3(n+1)}{4}\right)^{th}$ ordered observation

Interquartile Range (IQR): The difference between the 3rd and 1st
quartile. $IQR = Q_3 - Q_1$
Semi-interquartile range: $\frac{Q_3 - Q_1}{2}$
Coefficient of quartile deviation: $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
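As an illustration, the sketch below applies the quartile position rule to the ten ages used in the range example. One assumption is added here that the slides do not state: when the position k(n+1)/4 is not a whole number, the value is found by linear interpolation between the two neighbouring ordered observations. The function names are hypothetical.

```python
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def ordered_value(ordered, position):
    """Value at a 1-based, possibly fractional, position in sorted data.

    Fractional positions are resolved by linear interpolation
    (an assumption; other conventions exist).
    """
    if position <= 1:
        return ordered[0]
    if position >= len(ordered):
        return ordered[-1]
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Q1, Q2, Q3 at positions (n+1)/4, 2(n+1)/4 and 3(n+1)/4."""
    ordered = sorted(values)
    n = len(ordered)
    return tuple(ordered_value(ordered, k * (n + 1) / 4) for k in (1, 2, 3))


q1, q2, q3 = quartiles(ages)
iqr = q3 - q1
semi_iqr = iqr / 2
coefficient_qd = (q3 - q1) / (q3 + q1)
print(q1, q2, q3, iqr, semi_iqr, round(coefficient_qd, 3))
```

Statistical packages use several slightly different quartile conventions, so results from software may differ a little from this hand rule.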

Measures of dispersion

Interquartile Range

Merits:
It is superior to range as a measure of dispersion.
It is especially useful for measuring variation in open-ended distributions,
or in distributions where the data can be ranked but not measured quantitatively.
Useful in erratic or badly skewed distribution.
The Quartile deviation is not affected by the presence of extreme
values.

Measures of dispersion

...Interquartile Range

Limitations:
As the value of the quartile deviation does not depend upon every item of
the series, it cannot be regarded as a good method of measuring
dispersion.
It is not capable of mathematical manipulation.
Its value is very much affected by sampling fluctuation.

Measures of dispersion

Z-score

Another measure of relative standing is the z-score (or standard score)
for an observation.
It describes how far an individual item in a distribution departs from the
mean of the distribution.
The standard score gives the number of standard deviations a
particular observation lies below or above the mean.
The standard score (or z-score) is defined as follows:
For a population: $z = \frac{X - \mu}{\sigma}$, where X = the observation from the
population, µ = the population mean, σ = the population s.d.
For a sample: $z = \frac{X - \bar{X}}{s}$, where X = the observation from the
sample, X̄ = the sample mean, s = the sample s.d.
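A short Python sketch of the z-score, using the ten ages from the range example. It assumes the standard deviation is computed with divisor n (`statistics.pstdev`), which matches the root mean squared deviation formula used in this course; many texts use divisor n − 1 for a sample (`statistics.stdev`), which would give slightly smaller z-scores in absolute value.

```python
import statistics

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


mean_age = statistics.mean(ages)
sd_age = statistics.pstdev(ages)   # divisor n (assumption, see lead-in)

z_oldest = z_score(61, mean_age, sd_age)
z_youngest = z_score(23, mean_age, sd_age)
print(round(z_oldest, 2), round(z_youngest, 2))
```

A positive z-score means the observation is above the mean and a negative one means it is below.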

Measures of dispersion

Mean Absolute Deviation (MAD) or Mean Deviation (MD)

The average of the differences of the values of items from some average of
the series (ignoring negative signs), i.e. the arithmetic mean of the
absolute differences of the values from their average.
Note
1 MD is based on all values and hence cannot be calculated for
open-ended distributions.
2 It uses an average but ignores signs and hence appears unmethodical.
3 MD can be calculated from the mean as well as from the median, for
ungrouped data using the direct method and for continuous distributions
using the assumed mean method and the short-cut method.
4 The average used is either the arithmetic mean or the median.

Measures of dispersion

Computation of Mean absolute Deviation

For individual series: X1, ..., Xn
$MAD = \frac{\sum |X_i - \bar{X}|}{n}$
For discrete series: X1, ..., Xn with corresponding frequencies
f1, ..., fn
$MAD = \frac{\sum f_i |X_i - \bar{X}|}{\sum f_i}$
X̄: Mean of the data series.

Measures of dispersion

Computation of Mean absolute Deviation:

For continuous grouped data: m1, ..., mn are the class midpoints with
corresponding class frequencies f1, ..., fn
$MAD = \frac{\sum f_i |m_i - \bar{X}|}{\sum f_i}$
X̄: Mean of the data series.
Coefficient of MAD: $\frac{MAD}{\text{Average}}$
where Average is the average from which the deviations are calculated.
It is a relative measure of dispersion and is comparable to the similar
measure of other series.
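The three MAD formulas differ only in whether frequencies are attached to the values, so one function can cover them. The sketch below is illustrative: the ages come from the range example, while the class midpoints and frequencies in the grouped example are made-up numbers used only to show the call.

```python
def mean_absolute_deviation(values, frequencies=None):
    """Return (MAD about the mean, mean).

    values: raw observations, distinct values, or class midpoints.
    frequencies: matching frequencies; omitted for an individual series.
    """
    if frequencies is None:
        frequencies = [1] * len(values)
    total = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total
    mad = sum(f * abs(x - mean) for x, f in zip(values, frequencies)) / total
    return mad, mean


# Individual series: the ten ages.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]
mad_ages, mean_ages = mean_absolute_deviation(ages)
coefficient_of_mad = mad_ages / mean_ages

# Grouped data: hypothetical class midpoints and frequencies.
midpoints = [5, 15, 25]
class_frequencies = [2, 5, 3]
mad_grouped, mean_grouped = mean_absolute_deviation(midpoints, class_frequencies)

print(round(mad_ages, 2), round(coefficient_of_mad, 3), mad_grouped)
```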

Measures of dispersion

Merits and Limitations of MAD

Simple to understand and easy to compute.


Based on all observations.
MAD is less affected by the extreme items than the Standard
deviation.
Its greatest drawback is that the algebraic signs are ignored.
Not amenable to further mathematical treatment.
MAD gives the best result when deviations are taken from the median, but
the median is not satisfactory when variability in the data is large. If MAD
is computed from the mode, the value of the mode cannot always be
determined.

Measures of dispersion

Standard Deviation (S)

It is the positive square root of the average of squares of deviations of
the observations from the mean. This is also called root mean
squared deviation (S).
For individual series: X1, ..., Xn
$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$, $S = \sqrt{\frac{\sum X_i^2}{n} - \left(\frac{\sum X_i}{n}\right)^2}$
For discrete series: X1, ..., Xn with corresponding frequencies
f1, ..., fn
$S = \sqrt{\frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i}}$, $S = \sqrt{\frac{\sum f_i X_i^2}{\sum f_i} - \left(\frac{\sum f_i X_i}{\sum f_i}\right)^2}$

Measures of dispersion

Standard Deviation (S)

For continuous grouped series with class midpoints m1, ..., mn and
corresponding frequencies f1, ..., fn
$S = \sqrt{\frac{\sum f_i (m_i - \bar{X})^2}{\sum f_i}}$, $S = \sqrt{\frac{\sum f_i m_i^2}{\sum f_i} - \left(\frac{\sum f_i m_i}{\sum f_i}\right)^2}$

Variance: It is the square of the s.d.

Coefficient of Variation (CV): The corresponding relative measure of
dispersion.
$CV = \frac{S}{\bar{X}} \times 100$
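The definitional and short-cut forms of S give the same number, which can be checked numerically. The sketch below uses the ten ages from the range example and the divisor-n formulas exactly as written in these slides; the function names are illustrative.

```python
import math

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def sd_definition(values):
    """Root mean squared deviation about the mean (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_shortcut(values):
    """Short-cut form: sqrt(mean of squares - square of mean)."""
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


s = sd_definition(ages)
variance = s ** 2
cv_percent = s / (sum(ages) / len(ages)) * 100   # coefficient of variation, in %

print(round(s, 3), round(variance, 2), round(cv_percent, 1))
```

The short-cut form is convenient for hand calculation, but with very large values it can lose precision in floating-point arithmetic, so the definitional form is usually the safer choice in software.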

Measures of dispersion

Characteristics of Standard Deviation:

SD is a very satisfactory and the most widely used measure of dispersion.
Amenable to mathematical manipulation.
It is independent of origin, but not of scale.
If SD is small, there is a high probability of getting a value close to
the mean; if it is large, values tend to lie farther away from the mean.
Does not ignore the algebraic signs and is less affected by
fluctuations of sampling.
SD can be calculated by:
Direct method
Assumed mean method.
Step deviation method.

Measures of dispersion

Uses of Standard deviation

Basic rule – More spread will yield a larger SD


Uses of the standard deviation
The standard deviation enables us to determine, with a great deal of
accuracy, where the values of a frequency distribution are located in
relation to the mean.
We can do this according to a theorem devised by the Russian
mathematician P.L. Chebyshev (1821- 1894).
Chebyshev’s Theorem
For any data set with mean µ and standard deviation s, at least 75% of
the values will fall within 2 standard deviations of the mean (µ ± 2s), and
at least 89% of the values will fall within 3 standard deviations of the
mean (µ ± 3s).
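The 75% and 89% figures are the k = 2 and k = 3 cases of the general Chebyshev bound 1 − 1/k² for k > 1. The sketch below computes the bound and, as a check, the observed share of the ten ages from the range example that lie within k standard deviations of their mean. Using the divisor-n standard deviation (`statistics.pstdev`) is an assumption made to match the formula for S in this course.

```python
import statistics

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


def fraction_within(values, k):
    """Observed fraction of values within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [x for x in values if abs(x - mean) <= k * sd]
    return len(inside) / len(values)


for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 2), fraction_within(ages, k))
```

The observed fractions are at least as large as the bound, as the theorem guarantees; for bell-shaped data they are typically much larger.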

Measures of dispersion

Measure of Shape

The fourth important numerical characteristic of a data set is its
shape:
1 Skewness
2 Kurtosis.
Skewness is a measure of symmetry, or more precisely, the lack of
symmetry. A distribution, or data set, is symmetric if it looks the
same to the left and right of the center point.
Kurtosis is a measure of whether the data are heavy-tailed or
light-tailed relative to a normal distribution.

Measures of dispersion

Skewness

Skewness characterizes the degree of asymmetry of a distribution
around its mean.
For sample data, the skewness is defined by the formula
$\text{Skewness} = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s}\right)^3$

where n = the number of observations in the sample, x_i = the i-th observation
in the sample, s = standard deviation of the sample, x̄ = sample mean

Measures of dispersion

Kurtosis

Kurtosis characterizes the relative peakedness or flatness of a
distribution compared with the bell-shaped distribution (normal
distribution).
Kurtosis of a sample data set is calculated by the formula:
$\text{Kurtosis} = \left[\frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} \left(\frac{X_i - \bar{X}}{S}\right)^4\right] - \frac{3(n-1)^2}{(n-2)(n-3)}$

Positive kurtosis indicates a relatively peaked distribution.
Negative kurtosis indicates a relatively flat distribution.
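The skewness and kurtosis formulas above can be evaluated directly. The sketch below is a hedged illustration: it assumes that the sample standard deviation in both formulas uses divisor n − 1 (`statistics.stdev`), which is the convention these bias-adjusted formulas are normally paired with, and the data sets are made-up examples.

```python
import statistics


def sample_skewness(values):
    """n / ((n-1)(n-2)) * sum(((x - mean) / s)^3); needs n >= 3."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # divisor n - 1 (assumption)
    total = sum(((x - mean) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total


def sample_kurtosis(values):
    """Excess kurtosis from the formula above; needs n >= 4."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)          # divisor n - 1 (assumption)
    total = sum(((x - mean) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * total - correction


symmetric = [1, 2, 3, 4, 5]        # symmetric about 3
right_skewed = [1, 2, 3, 4, 10]    # one large value pulls the right tail

print(sample_skewness(symmetric), sample_kurtosis(symmetric))
print(sample_skewness(right_skewed))
```

A symmetric data set gives skewness 0, and a long right tail gives positive skewness; the evenly spaced set is flatter than a normal curve, so its kurtosis is negative.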

Measures of dispersion

What is the skewness and kurtosis for normal?

Values of skewness (asymmetry) and kurtosis between -2 and +2 are
considered acceptable evidence of a normal univariate distribution
(George and Mallery, 2010).
Hair et al. (2010) and Byrne (2010) argued that data can be considered
normal if skewness is between -2 and +2 and kurtosis is between -7
and +7.

Basic principles of probability

What is Probability?

Probability is the branch of mathematics concerning numerical
descriptions of how likely an event is to occur, or how likely it is that
a proposition is true.
Probability theory is the branch of mathematics concerned with the
analysis of random phenomena.
The outcome of a random event cannot be determined before it
occurs, but it may be any one of several possible outcomes.
The actual outcome is considered to be determined by chance.
The probability of an event is a number between 0 and 1, where,
roughly speaking, 0 indicates impossibility of the event and 1
indicates certainty.
The higher the probability of an event, the more likely it is that the
event will occur.
Basic principles of probability

Basic concepts of probability

Random Experiments
Sample space
Events
1 Mutually exclusive events (Disjoint events)
2 Equally likely events - equal chance to occur.
3 Favourable events - the number of outcomes favourable to an event in
an experiment is the number of outcomes which entail the happening
of the event
4 Exhaustive events - outcomes are said to be exhaustive when they
include all possible outcomes.
5 Independent events - if the occurrence or non-occurrence of an event
does not affect the occurrence or non-occurrence of the other.

Basic principles of probability

Examples - Applications Of Probability in Real Life

Probability is used to answer the following types of questions:


What is the chance that it will rain tomorrow?
What is the chance that a stock will go up in price?
What is the chance that I will have a heart attack?
What is the chance that I will live longer than 70 years?
What is the likelihood that when rolling a pair of dice, I will roll
doubles?
What is the probability that I will win the lottery?
What is the probability that I will become diabetic?

Basic principles of probability

...Examples - Applications Of Probability in Real Life

Forecasting the weather.


Sports outcomes. Coaches use probability to decide the best possible
strategy to pursue in a game.
Card games and other games of chance.
Insurance.
Medical diagnosis.
This is one of the most noble applications of probability in real life.
How does your doctor know that your cough is just because of an
infection and not because of something more serious?
Election results.
Lottery probability.

Basic principles of probability

Counting rules

To assign probabilities to an event, the possible outcomes of a
random experiment should be counted.
The following principles help to determine the number of possible
outcomes favoring a given event.

Addition Rule
Multiplication Principle
Permutation
Combination

Basic principles of probability

Addition Rule

If a task can be accomplished by k distinct procedures, where the i-th
procedure has n_i alternatives, then the total number of ways of
accomplishing the task equals
$n_1 + n_2 + ... + n_k$
Example: Suppose that a man wants to make a journey from Addis
Ababa to Djibouti. The following are the means of transportation.
Air transport: 2 flights; Vehicles: 4 alternatives; Train: 2 alternatives.
(The total alternatives are 2+4+2=8)

Basic principles of probability

Multiplication Principle

If a choice consists of k steps of which the 1st can be made in n_1
ways, the 2nd can be made in n_2 ways, ..., and the k-th can be made in
n_k ways, then the whole choice can be made in
$n_1 \times n_2 \times ... \times n_k$ ways
Example: If a test consists of 10 multiple choice questions, with each
permitting 4 possible answers, how many ways are there in which a
student gives his/her answers?
$4 \times 4 \times 4 \times ... \times 4 = 4^{10} = 1,048,576$ ways of completing the
exam
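The addition rule and the multiplication principle reduce to a sum and a product, as the following small Python sketch shows. It reuses the Addis Ababa to Djibouti transport example and the 10-question test example; the variable names are illustrative.

```python
import math

# Addition rule: alternative procedures, only one of which is used.
transport_alternatives = {"air": 2, "vehicle": 4, "train": 2}
total_travel_options = sum(transport_alternatives.values())

# Multiplication principle: a sequence of steps, all of which are performed.
options_per_question = [4] * 10          # 10 questions, 4 answers each
ways_to_answer_test = math.prod(options_per_question)

print(total_travel_options)   # 2 + 4 + 2
print(ways_to_answer_test)    # 4 ** 10
```

A quick way to choose between the two: use addition when the alternatives are connected by "or", and multiplication when the steps are connected by "and then".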

Basic principles of probability

Permutation

It is the possible ordered selections of r objects out of a total of n
objects.
The number of permutations of n objects taken r at a time is denoted
by nPr, where
${}_nP_r = \frac{n!}{(n-r)!}$
The number of permutations of n objects taken all at a time is given
by: ${}_nP_n = n!$
In permutation, order is important.

Basic principles of probability

...Permutation

Example 5.8: Suppose that we have four letters a, b, c, d. What is the
number of possible arrangements of these letters taken all at a time?
$4! = 4 \times 3 \times 2 \times 1 = 24$
What is the number of possible arrangements of these letters if we
use only three of the letters at a time?
${}_4P_3 = \frac{4!}{(4-3)!} = 24$
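Both answers in the example can be checked in Python, either with the formula (`math.perm`, available from Python 3.8) or by listing the arrangements explicitly with `itertools.permutations`. This is a minimal sketch for illustration.

```python
import math
from itertools import permutations

letters = ["a", "b", "c", "d"]

# Formula: nPr = n! / (n - r)!
all_at_a_time = math.perm(4, 4)
three_at_a_time = math.perm(4, 3)

# Enumeration: every ordered selection of 3 of the 4 letters.
arrangements = list(permutations(letters, 3))

print(all_at_a_time, three_at_a_time, len(arrangements))
print(arrangements[:3])   # first few ordered selections
```

Both counts are 24 because the last remaining letter can be placed in only one way once three have been arranged.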

Basic principles of probability

Combination

It is the possible selections of r items from a group of n items
regardless of the order of selection.
The number of combinations is denoted nCr and is read as "n choose r".
The number of combinations of r out of n elements is:
${}_nC_r = \frac{n!}{r!(n-r)!}$
Order is not important.

Basic principles of probability

...Combination

Example: How many different committees of 3 can be formed from
Tolosa, Bethelhiem, Kebede and Lensa?
${}_4C_3 = \frac{4!}{3!\,1!} = 4$ possible committees
Example: From a group of 5 men and 7 women, how many different
committees consisting of 2 men and 3 women can be formed?
${}_5C_2 \times {}_7C_3 = 10 \times 35 = 350$ possible committees.
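The two committee counts can be verified with `math.comb` (Python 3.8 or later), and the first one can also be listed explicitly. A minimal sketch for illustration:

```python
import math
from itertools import combinations

people = ["Tolosa", "Bethelhiem", "Kebede", "Lensa"]

# Committees of 3 out of 4 people: order does not matter.
committees_of_three = list(combinations(people, 3))
count_by_formula = math.comb(4, 3)

# 2 of 5 men AND 3 of 7 women: combine the two counts by multiplication.
mixed_committees = math.comb(5, 2) * math.comb(7, 3)

for committee in committees_of_three:
    print(committee)
print(count_by_formula, mixed_committees)
```

The second example uses the multiplication principle on top of combinations: choosing the men and choosing the women are two separate steps.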

Basic principles of probability

Approaches to measuring Probability

There are four different conceptual approaches to study probability theory.


The classical approach.
The frequencies approach.
The axiomatic approach.
The subjective approach.

Basic principles of probability

The Classical Approach

All outcomes are equally likely and mutually exclusive.
The total number of outcomes is finite, say N.
The probability that event A occurs, denoted P(A), is defined as:
$P(A) = \frac{n(A)}{N} = \frac{\text{No. of outcomes favourable to } A}{\text{Total number of outcomes}}$

Basic principles of probability

The Classical Approach

Limitations: the classical approach cannot be used
If it is not possible to enumerate all the possible outcomes of an
experiment.
If the sample points (outcomes) are not mutually exclusive.
If the total number of outcomes is infinite.
If the outcomes are not all equally likely.

Basic principles of probability

The Classical Approach

Example: A survey of employees of a company is summarized in the
following table.

                 D     R     I     Row Totals
Executive (E)    5    34     9         48
Worker (W)      63    21    27        111
Column Totals   68    55    36        159

D = Democrat, R = Republican, I = Independent

Basic principles of probability

The Classical Approach

What is the probability that a randomly selected employee is an
executive? We use the method of relative frequency. There are 48
executives among the 159 employees. Hence
$P(E) = \frac{48}{159}$

What is the probability that a randomly selected employee is a
Republican? There are 55 Republicans among the 159 employees.
Hence
$P(R) = \frac{55}{159}$

Basic principles of probability

The Classical Approach

What is the probability that a randomly selected employee is an
executive and a Democrat? Among the 159 employees, only 5 are
both an executive and a Democrat. Hence
$P(E \text{ and } D) = \frac{5}{159}$

What is the probability that a randomly selected employee is either a
worker or a Republican? There is some ambiguity in the use of the
word "or" in the English language. Sometimes it is used in the
exclusive sense of either one or the other, but not both. For example,
when your teacher says "either pass the course or you will have to
take it again," you naturally expect that only one of the two
alternatives will occur.

Basic principles of probability

The Classical Approach

Sometimes the word "or" is used in an inclusive sense. For example,
if a curriculum guideline states that a student may either take Physics
35 or Chemistry 32 to satisfy their science requirement, then one
believes that this allows the student to take both science courses.
In order to avoid any confusion, in mathematics the word "or" is
always used in the inclusive sense of one or the other or both. Hence
$P(W \text{ or } R) = \frac{111 + 55 - 21}{159} = \frac{145}{159}$

Notice the subtraction of the number 21. This is necessary to avoid
"double counting." When we counted the 111 workers and then
added to that number the 55 Republicans, we counted the 21 workers
that were also Republicans twice. Subtracting 21 corrects for that
double count.
Basic principles of probability

The Classical Approach

What is the probability that the employee is a Democrat, given that
the employee is a worker? Notice here that we have been given some
additional information, namely that the employee must be a worker.
What do we do with that information? We only consider the workers;
that is, the sample size is reduced from 159 to 111, the number of
workers. Hence
$P(D, \text{ given } W) = \frac{63}{111}$

Notice that the denominator changed to reflect the given information.
Notice also that the numerator only reflects the Democrats that are
workers. A common mistake is to put 68 in the numerator, but that
ignores the information that we have been given. Once we have been
given that the chosen employee is a worker, we focus only on the
workers and ignore everybody else.
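All of the probabilities in this employee example come from counting cells of the same table, which the sketch below makes explicit. The counts are those given in the survey table; the helper names `prob` and `cond_prob` are hypothetical, and exact fractions are used so the results can be compared with the slides.

```python
from fractions import Fraction

# Cell counts keyed by (position, party).
counts = {
    ("E", "D"): 5,  ("E", "R"): 34, ("E", "I"): 9,
    ("W", "D"): 63, ("W", "R"): 21, ("W", "I"): 27,
}
total = sum(counts.values())   # 159 employees


def prob(event):
    """P(event): favourable count over the grand total."""
    favourable = sum(c for key, c in counts.items() if event(*key))
    return Fraction(favourable, total)


def cond_prob(event, given):
    """P(event | given): the denominator shrinks to the 'given' group."""
    given_count = sum(c for key, c in counts.items() if given(*key))
    both = sum(c for key, c in counts.items() if given(*key) and event(*key))
    return Fraction(both, given_count)


p_executive = prob(lambda position, party: position == "E")
p_republican = prob(lambda position, party: party == "R")
p_exec_and_dem = prob(lambda position, party: position == "E" and party == "D")
p_worker_or_rep = prob(lambda position, party: position == "W" or party == "R")
p_dem_given_worker = cond_prob(lambda position, party: party == "D",
                               lambda position, party: position == "W")

print(p_executive, p_republican, p_exec_and_dem, p_worker_or_rep, p_dem_given_worker)
```

Python's `or` is inclusive, like the mathematical "or", so the 21 workers who are Republicans are counted once and no explicit subtraction is needed. `Fraction` prints values in lowest terms, so 48/159 is shown as 16/53.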
Basic principles of probability

The Classical Approach

Exercise: A fair die is tossed once. What is the probability of getting
Number 4?
An odd number?
A number greater than 4?
Either 1 or 2 or ... or 6?
Number 8?

Basic principles of probability

The Frequencies Approach

Relative frequency probability: If some process is repeated a large
number of times, n, and some resulting event E occurs m times, the
relative frequency of E, $\frac{m}{n}$, will be approximately equal to the
probability of E.
Symbolically:
$P(E) = \frac{m}{n}$

Basic principles of probability

The Frequencies Approach

Example
Suppose that of 158 people who attended a dinner party, 99 were ill
due to food poisoning.
The probability of illness for a person selected at random is
$Pr(\text{illness}) = \frac{99}{158} = 0.63$, or 63%
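The relative frequency idea can be illustrated in two ways: by computing m/n for the dinner party data, and by simulating a process many times to see m/n settle near the true probability. The simulation below is an added illustration (a fair die, which is not part of the dinner party example), with a fixed seed so that it is repeatable.

```python
import random


def relative_frequency(event_count, trials):
    """m / n: the share of trials in which the event occurred."""
    return event_count / trials


# Dinner party example: 99 of 158 guests were ill.
p_illness = relative_frequency(99, 158)


def simulated_frequency_of_six(trials, seed=0):
    """Roll a fair die `trials` times and return the share of sixes."""
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
    return relative_frequency(sixes, trials)


print(round(p_illness, 2))
for n in (100, 10_000, 100_000):
    print(n, simulated_frequency_of_six(n))
```

As n grows, the simulated relative frequency moves closer to 1/6, which is the sense in which m/n approximates P(E) for a large number of repetitions.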

Basic principles of probability

Axiomatic Approach

Let E be a random experiment and S be a sample space associated with E.
For each event A:
P(A) ≥ 0
0 ≤ P(A) ≤ 1
P(S) = 1
If A and B are mutually exclusive events, P(A ∪ B) = P(A) + P(B)
P(A) = 1 − P(A′), where A′ is the complement of A
P(∅) = 0; ∅ is the impossible event

Basic principles of probability

Subjective Approach

Subjective probability: measures the confidence that a
particular individual has in the truth of a particular proposition.
Example: If someone says that he is 95% certain that a cure for
AIDS will be discovered within 5 years, then he means that
Pr(discovery of cure of AIDS within 5 years) = 95%
It is usually set from intuition, educated guesses, or estimates
Although the subjective view of probability has enjoyed increased
attention over the years, it has not been fully accepted by scientists

Basic principles of probability

Conditional probability and Independence

Conditional probability
Conditional Events: If the occurrence of one event has an effect on
the occurrence of the other event, then the two events are
conditional or dependent events.
The formula for calculating a sample conditional probability is easy to
use:
$P(A|B) = \frac{P(A \cap B)}{P(B)}$

Basic principles of probability

Conditional probability and Independence

Independence
Two events, A and B, are independent if the occurrence or
non-occurrence of either one does not affect the probability of the
occurrence of the other.
Two events A and B are independent if and only if
P(A ∩ B) = P(A) × P(B)
Example 1: Given that P(A) = 0.4, P(B) = 0.2 and P(A ∩ B) = 0.08, are A and B
independent?
Solution:
P(A) × P(B) = 0.4 × 0.2 = 0.08 = P(A ∩ B)
Hence, A and B are independent

Basic principles of probability

Conditional probability and Independence

Independence
Example 2: P(C) = 0.5, P(D) = 0.3, P(C ∩ D) = 0.1. Are C and D
independent?
Solution: P(C) × P(D) = 0.5 × 0.3 = 0.15
P(C ∩ D) = 0.1 ≠ 0.15
Hence, C and D are dependent
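The independence check compares the joint probability with the product of the two marginal probabilities. The sketch below applies it to the two examples; for Example 1 it takes P(A ∩ B) = 0.08, the value appearing in that example's solution. `math.isclose` is used because decimal values such as 0.4 × 0.2 are not stored exactly in floating point. The function name is illustrative.

```python
import math


def is_independent(p_a, p_b, p_a_and_b):
    """True when P(A and B) equals P(A) * P(B), up to rounding error."""
    return math.isclose(p_a_and_b, p_a * p_b)


example_1 = is_independent(0.4, 0.2, 0.08)   # A and B
example_2 = is_independent(0.5, 0.3, 0.10)   # C and D

print(example_1, example_2)
```

Note that the joint probability must be known from the data or the problem statement; it cannot be derived from P(A) and P(B) alone unless independence is already assumed.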

Basic principles of probability

Some probability axioms, postulates and theorems

The probability of an event is a non-negative real number:
P(event) ≥ 0.
P(S) = 1
For any two mutually exclusive events, P(A ∪ B) = P(A) + P(B)
If A1, A2, ..., Ak are pairwise mutually exclusive events in S, then:
P(A1 ∪ A2 ∪ A3 ∪ ... ∪ Ak) = P(A1) + P(A2) + P(A3) + ... + P(Ak)
If A is an event in a sample space S, and A′ is its complement, then
P(A′) = 1 − P(A)

Basic principles of probability

axioms, postulates and theorems

For any sample space S, P(∅) = 0
An event which cannot occur (an impossible event) has a probability
of 0
For any event A, the probability must satisfy 0 ≤ P(A) ≤ 1
Example: The probabilities that a student will score an A, B, C, D or
F in a given course are 0.22, 0.32, 0.25, 0.15 and 0.06 respectively.
Find the probability that the student will get:
a grade lower than C
a pass grade
Solution: Since the five events are mutually exclusive,
P(grade lower than C) = P(D or F) = P(D) + P(F) = 0.15 + 0.06 = 0.21
P(pass grade) = P(A or B or C) = P(A) + P(B) + P(C) = 0.22 +
0.32 + 0.25 = 0.79

Basic principles of probability

axioms, postulates and theorems

Example: In a given community, the probability that a person gets
affected by disease X is 0.08, that a person gets affected by disease Y
is 0.05, and that a person gets affected by both diseases is 0.02.
Find the probability that a randomly selected person from this
community:
Gets affected by either or both diseases;
Gets affected by only one of the diseases.

Basic principles of probability

axioms, postulates and theorems

Solution: Let A be the event that a person gets affected by disease
X, and B be the event that a person gets affected by disease Y. We
are given that P(A) = 0.08, P(B) = 0.05 and P(A ∩ B) = 0.02

P(getting either or both) = P(A ∪ B)
= P(A) + P(B) − P(A ∩ B)
= 0.08 + 0.05 − 0.02 = 0.11

Basic principles of probability

axioms, postulates and theorems

P(getting only one of the diseases) = P[(A ∩ B′) ∪ (B ∩ A′)]
= P(A ∩ B′) + P(B ∩ A′)

P(only X) = P(A ∩ B′)
= P(A) − P(A ∩ B) = 0.08 − 0.02 = 0.06

P(only Y) = P(B ∩ A′)
= P(B) − P(A ∩ B) = 0.05 − 0.02 = 0.03

P(getting only one of the diseases) = 0.06 + 0.03 = 0.09
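The disease example can be recomputed in a few lines, which also makes the relationship between the two answers visible: the probability of exactly one disease is the probability of either or both, minus the probability of both. A minimal sketch with illustrative variable names:

```python
p_x = 0.08        # P(A): affected by disease X
p_y = 0.05        # P(B): affected by disease Y
p_both = 0.02     # P(A and B): affected by both

# General addition rule: subtract the overlap so it is not counted twice.
p_either_or_both = p_x + p_y - p_both

# Only one disease: remove the overlap from each event separately.
p_only_x = p_x - p_both
p_only_y = p_y - p_both
p_exactly_one = p_only_x + p_only_y

print(round(p_either_or_both, 2), round(p_only_x, 2),
      round(p_only_y, 2), round(p_exactly_one, 2))
```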


Basic principles of probability

Simple/unconditional and conditional probabilities

Unconditional Probability
When the size of the total group or grand total (n) serves as the
denominator to calculate a probability, the probability is termed as
unconditional probability.
Example: A study was conducted to investigate the effect of
prolonged exposure to bright light on retinal damage in premature
infants. Eighteen out of 21 premature infants exposed to bright light
developed retinopathy, while 21 of 39 premature infants exposed to
reduced light level developed retinopathy. What is the probability of
developing retinopathy?

Basic principles of probability

Simple/unconditional and conditional probabilities

Unconditional Probability
$P(\text{retinopathy}) = \frac{\text{No. of infants with retinopathy}}{\text{Total number of infants}} = \frac{18 + 21}{21 + 39} = \frac{39}{60} = 0.65$

Basic principles of probability

Unconditional and conditional probabilities

Conditional probabilities are probabilities that are based on the
knowledge that some other event has occurred. In this case a subset of
the total group is taken as the denominator.
We want to compare the probability of retinopathy, given that the
infant was exposed to bright light, with the probability of retinopathy,
given that the infant was exposed to reduced light.
Exposure to bright light and exposure to reduced light are
conditioning events (i.e. events we want to take into account when
calculating conditional probabilities).
Conditional probabilities are denoted by P(A/B) (read as "probability
of A given B") or P(Event/Conditioning event).

The formula for calculating a sample conditional probability is easy to use:

$$P(\text{event}/\text{conditioning event}) = \frac{\text{No. of observations for which both events occur}}{\text{No. of observations for which the conditioning event occurs}}$$

$$P(A/B) = \frac{P(A \cap B)}{P(B)}, \quad \text{if } P(B) > 0$$

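Example: For the retinopathy data, the conditional probability of
retinopathy, given exposure to bright light, is:

$$P(\text{retinopathy}/\text{bright light}) = \frac{\text{No. of infants with retinopathy who were exposed to bright light}}{\text{No. of infants exposed to bright light}} = \frac{18}{21} = 0.86$$

Similarly, given exposure to reduced light:

$$P(\text{retinopathy}/\text{reduced light}) = \frac{\text{No. of infants with retinopathy who were exposed to reduced light}}{\text{No. of infants exposed to reduced light}} = \frac{21}{39} = 0.54$$
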
The conditional probabilities suggest that premature infants exposed
to bright light have a higher risk of retinopathy than premature
infants exposed to reduced light.
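
The contrast between the unconditional and the conditional probabilities in the retinopathy example comes down to which denominator is used. The following Python sketch recomputes the three values from the counts given in the example; the dictionary layout and the function names `unconditional` and `conditional` are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# Counts from the retinopathy example:
# exposure group -> (infants with retinopathy, infants exposed)
counts = {
    "bright": (18, 21),
    "reduced": (21, 39),
}


def unconditional(counts):
    """All cases divided by the grand total n."""
    cases = sum(c for c, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return Fraction(cases, total)


def conditional(counts, exposure):
    """Cases within one exposure group divided by that group's size."""
    cases, group_size = counts[exposure]
    return Fraction(cases, group_size)


p_retino = unconditional(counts)
p_bright = conditional(counts, "bright")
p_reduced = conditional(counts, "reduced")

print(round(float(p_retino), 2))   # 0.65
print(round(float(p_bright), 2))   # 0.86
print(round(float(p_reduced), 2))  # 0.54
```

The unconditional value (0.65) lies between the two conditional values because it pools both exposure groups.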

Multiplication Rule of probability

The definition of P(A/B) can also be used to derive the general
multiplication rule.
If A and B are any two events in a sample space S, then

$$P(A \cap B) = P(B) \times P(A/B), \quad P(B) \neq 0$$

$$P(A \cap B) = P(A) \times P(B/A), \quad P(A) \neq 0$$

If A, B and C are any three events in a sample space S such that
P(A) ≠ 0 and P(A ∩ B) ≠ 0, then

$$P(A \cap B \cap C) = P(A) \times P(B/A) \times P(C/A \cap B)$$
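
As a numerical check of the two-event multiplication rule, the sketch below reuses the counts from the retinopathy example (18 of 21 bright-light infants and 21 of 39 reduced-light infants developed retinopathy). Taking A as "develops retinopathy" and B as "exposed to bright light" is an assumption made here for illustration; the variable names are likewise not from the slides.

```python
from fractions import Fraction

total = 21 + 39        # all infants in the study
n_bright = 21          # event B: exposed to bright light
n_retino = 18 + 21     # event A: developed retinopathy
n_both = 18            # A and B: retinopathy among bright-light infants

p_a = Fraction(n_retino, total)
p_b = Fraction(n_bright, total)
p_a_and_b = Fraction(n_both, total)

# Conditional probabilities from the definition P(A/B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Both factorizations recover the same joint probability.
assert p_b * p_a_given_b == p_a_and_b
assert p_a * p_b_given_a == p_a_and_b

print(p_a_and_b)  # 3/10
```

Either conditioning order gives the same joint probability, which is why the rule can be written with P(B) or with P(A) as the leading factor.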

Example

Table 1 shows the frequency of cocaine use by gender among adult
cocaine users.

Lifetime frequency of cocaine use    Male    Female    Total
1 - 9 times                            32         7       39
20 - 99 times                          18        20       38
more than 100 times                    25         9       34
Total                                  75        36      111

Questions

1. What is the probability that a randomly picked person is male?
2. What is the probability that a randomly picked person has used
cocaine more than 100 times?
3. Given that the selected person is male, what is the probability that
the person has used cocaine more than 100 times?
4. Given that the person has used cocaine less than 100 times, what is
the probability that the person is female?
5. What is the probability that a randomly picked person is male and
has used cocaine more than 100 times?

Answer

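1. Probability that a randomly picked person is male:

$$\Pr(m) = \frac{\text{Total adult males}}{\text{Total adult cocaine users}} = \frac{75}{111} = 0.68$$

2. Probability that a randomly picked person has used cocaine more than 100 times:

$$\Pr(c > 100) = \frac{\text{All adults who used cocaine more than 100 times}}{\text{Total adult cocaine users}} = \frac{34}{111} = 0.31$$

3. Probability of cocaine use more than 100 times, given that the person is male:

$$\Pr(c > 100 / m) = \frac{25}{75} = 0.33$$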

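4. Probability of being female, given cocaine use less than 100 times. The conditioning event is "less than 100 times", so the denominator is the 39 + 38 = 77 people in the two lower-use categories:

$$\Pr(f / c < 100) = \frac{7 + 20}{39 + 38} = \frac{27}{77} = 0.35$$

5. Probability that a randomly picked person is male and has used cocaine more than 100 times, by the multiplication rule:

$$\Pr(m \cap c > 100) = \Pr(m) \times \Pr(c > 100 / m) = \frac{75}{111} \times \frac{25}{75} = \frac{25}{111} = 0.23$$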
