Chapter 1 Presentation
Biostatistics
In applying statistics to a scientific, industrial,
societal or health problem, it is necessary to begin
with a population to be studied.
Once a sample of the population is
determined, data is collected for the sample
members.
This data can then be subjected to statistical
analysis, serving two related purposes:
description and inference.
Descriptive statistics summarize the population data
by describing what was observed in the sample
numerically or graphically.
Inferential statistics uses patterns revealed through
analysis of sample data to draw inferences about the
population represented.
The raw materials for any statistical analysis are the
data.
Thus, the first thing we have to do is collect relevant
data using appropriate methods.
Once the tedious task of collecting data is completed,
this collection of raw data in itself reveals very little.
It is extremely difficult to determine the true meaning
of a bunch of numbers that have simply
been recorded on a piece of paper.
It remains for us to organize and describe these data in
a concise, meaningful manner.
In order to determine their significance, we must display the
data in the form of tables, graphs and charts (so that we can
have a good overall picture of the data).
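As a minimal sketch of organizing raw data into a table and a short numerical summary, the Python snippet below builds a frequency table and a few summary measures. The blood-group and age values are hypothetical and exist only for illustration; they do not come from any study.

```python
from collections import Counter
import statistics

# Hypothetical raw data: blood groups recorded for 12 patients.
blood_groups = ["A", "O", "B", "O", "A", "AB", "O", "A", "O", "B", "O", "A"]

# Frequency table: count and relative frequency for each category.
counts = Counter(blood_groups)
total = len(blood_groups)
frequency_table = {
    group: (count, count / total) for group, count in counts.most_common()
}

for group, (count, share) in frequency_table.items():
    print(f"{group:>2}  {count:>2}  {share:.1%}")

# Hypothetical numeric data: ages in years, summarized numerically.
ages = [23, 35, 41, 29, 52, 35, 47, 31]
age_summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "min": min(ages),
    "max": max(ages),
}
print(age_summary)
```

The table shows at a glance which categories dominate, which is hard to see in the raw list of recorded values.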
Limitations of Statistics
• Although statistics is widely applied and has shown
its merit in planning, policy making, marketing
decisions, quality control, medical studies, etc., it has
some limitations:
Major steps in a statistical investigation
b) Telephone: This involves trained interviewers phoning
people to collect data. This method is quicker and less
expensive than face-to-face interviewing.
2. Self-completed (written questionnaire): In this
method, written questions are mailed or hand-delivered to
respondents.
(a) Mail survey: Here questionnaires are mailed to people and
mailed back by the respondents after completion.
– It is a relatively inexpensive method of collecting data: one
can distribute a large number of questionnaires in a short
time.
– This requires the questionnaire to be simple and
straightforward.
– A major disadvantage of a mail survey is that it usually has
lower response rates than other data collection methods.
– Also, people with limited ability to read or write
may experience problems.
(b) Hand-delivered questionnaire: This is a
self-enumerated survey where questionnaires are hand-
delivered to people and mailed back by the
respondents after completion.
– The questions should read well and have a good flow.
– The words should be simple, direct and familiar to all
respondents.
– The questions should be clear and as specific as
possible.
– Questions should not be double-barreled.
• Example: Does your company provide training for
new employees and re-training for existing staff?
• This example is double-barreled as it asks two
questions rather than one.
– Questions should not be leading.
– If the questions are close-ended, the response
categories should be mutually exclusive and
exhaustive.
– Open-ended questions give respondents an
opportunity to answer the question in their own
words.
– Close-ended questions give respondents a choice
of answers and the respondent is supposed to
select one.
Sources of Data
Primary Data: When an individual, agency or
organization controls the design and data collection
processes
Secondary Data: When you use data previously collected
by others for their own purposes
In this case, data are obtained from already collected sources such as
newspapers, magazines, CSA, DHS, hospital records and existing data
such as:
Mortality reports
Morbidity reports
Epidemic reports
Reports of laboratory utilization (including laboratory test results)
– As a general rule, primary data sources are preferred
to secondary sources since the primary source
contains much pertinent information about:
– collection methods and
Some Basic Terms in Biostatistics
In collecting data concerning the characteristics of a group of
individuals or objects, it is often impossible or impractical (from
the point of view of time and cost) to observe the entire group.
In such cases, instead of examining the entire group, called the
population, we examine only a small part of the population,
called a sample.
Sampling Techniques
A census is a complete enumeration of the entire population.
There are several reasons for taking a sample instead of a
complete enumeration of the whole population or census. These
include:
– A census may be very expensive.
– A census may require too much time.
– A carefully obtained sample may be more accurate than a
census.
For example, in a large inventory census or in a study of the
prevalence of HIV among adolescents in Ethiopia, errors due to
fatigue or carelessness on the part of the census taker may
introduce a serious bias in the results.
Broadly speaking, there are two types of
sampling techniques: random sampling
and non-random sampling.
In random sampling, the elements to be
included in the sample entirely depend on
chance.
Random sampling techniques often yield
samples that are representative of the
population from which they are drawn.
In non-random sampling, the units in the
sample are chosen by the investigator based on
his/her personal convenience and beliefs.
A. Random Sampling Techniques
1. Simple Random Sampling
This is a method of sampling in which every member
of the population has the same chance of being
included in the sample.
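A minimal sketch of simple random sampling in Python, assuming the population is available as a list (a sampling frame). The function name `simple_random_sample`, the patient ID list and the seed are hypothetical and used only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    return rng.sample(population, n)  # sampling without replacement

# Hypothetical sampling frame: patient IDs 1 to 100.
patient_ids = list(range(1, 101))
srs = simple_random_sample(patient_ids, 10, seed=1)
print(srs)
```

Because the selection is without replacement, no member can appear twice in the sample.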
2. Systematic Random Sampling
In some instances, the most practical way of
sampling is to select, say, every 20th name on a list,
every 12th house on one side of a street, every 50th
item coming off a production line, and so
on.
This is called systematic sampling.
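Selecting every k-th item can be sketched in Python as below. Choosing the starting position at random within the first k items is a common convention and is an assumption here, not something stated above. The function name `systematic_sample` and the list of names are hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Take every k-th item, starting from a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed random start: an index from 0 to k - 1
    return items[start::k]

# Hypothetical list of 200 names; selecting every 20th name gives 10 names.
names = [f"name_{i:03d}" for i in range(1, 201)]
every_20th = systematic_sample(names, 20, seed=1)
print(every_20th)
```

If the list has a periodic pattern that matches the interval k, a sample drawn this way can be unrepresentative, so the ordering of the list deserves a quick check.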
3. Stratified Random Sampling
Stratified random sampling is the procedure of dividing the
population into relatively homogeneous groups, called strata,
and then taking a simple random sample from each stratum.
If the population elements are homogeneous, then there is
no need to apply this technique.
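A minimal Python sketch of the stratified procedure: group the units by stratum, then draw a simple random sample within each group. Taking the same number of units from every stratum is an assumption made for simplicity; other allocation rules are possible. The function name, the `stratum_of` parameter and the patient records are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n_per_stratum, seed=None):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # Assumption: equal sample size per stratum, capped at the stratum size.
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical population: 60 patients recorded as (id, sex).
patients = [(i, "female" if i % 2 == 0 else "male") for i in range(1, 61)]
by_sex = stratified_sample(
    patients, stratum_of=lambda p: p[1], n_per_stratum=5, seed=1
)
print(by_sex)
```

Every stratum is guaranteed to appear in the result, which a single simple random sample of the whole population does not guarantee.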
To obtain a sample from each stratum:
4. Cluster Sampling
This is a method of sampling in which the total population is
divided into relatively small subdivisions, called clusters, and
then some of these clusters are randomly selected using
simple random sampling.
Once the clusters are selected, one possibility is to use all the
elements in the selected clusters.
Example: Suppose we want to make a survey
of households in Addis Ababa.
Collecting information on each and every
household is impractical from the point of view of
cost and time.
What we do is divide the city into a number of
relatively small subdivisions, say, Kebeles. So the
Kebeles are our clusters.
Then we randomly select, say, 20 Kebeles using
simple random sampling.
To collect information about individual households,
we have two options:
a) We visit all households in these 20 Kebeles
b) We randomly select households from each of these
20 selected Kebeles using simple random sampling.
– This method is called two-stage sampling since
simple random sampling is applied twice (first,
to select a sample of Kebeles and second, to
select a sample of households from the
selected Kebeles)
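The two options above can be sketched in Python as follows. The numbers of Kebeles and households, the identifiers and the function name `two_stage_sample` are hypothetical; only the choice of 20 Kebeles follows the example.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters. Stage 2: randomly select units in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen  # stage 2: pick households within each chosen cluster
    }

# Hypothetical frame: 50 Kebeles with 40 households each.
kebeles = {
    f"kebele_{k:02d}": [f"k{k:02d}_hh{h:02d}" for h in range(1, 41)]
    for k in range(1, 51)
}
households = two_stage_sample(kebeles, n_clusters=20, n_per_cluster=10, seed=1)
print(len(households), "Kebeles selected")
```

Setting `n_per_cluster` to at least the size of the largest Kebele returns every household in the selected Kebeles, which corresponds to option (a).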
B. Non-random Sampling Techniques
Convenience or Accidental sampling (members of
the population are chosen based on their relative
ease of access)
Judgment or Purposive sampling – (The researcher
chooses the sample based on who he/she thinks
would be appropriate for the study)
– Purposive sampling starts with a purpose in mind and the
sample is thus selected to include people or objects of
interest and exclude those who do not suit the purpose.
Purposive sampling can be subject to bias and error.
– Case study (The research is limited to one
group, often with a similar characteristic or
of small size.)
• The difference between non-probability and
probability sampling is that non-probability
sampling does not involve random selection
and probability sampling does.
With non-probability samples, we may or may
not represent the population well, and it will
often be hard for us to know how well we've
done so.
Criteria for the acceptability of a
sampling method