Inferential Statistics 20032024
23/02/2024 1
Course outline
• Probability theories & laws
• Probability distributions
• Introduction to making statistical inference
• Sampling and Sampling techniques
Probability theories and laws
• Introduction
• Types/Approaches to probability
• Laws of probability
Introduction
• Decisions are made in public health daily based on
data/information available to us
Introduction 2
• Probability
• Defined as the chance of an event occurring
• It is a measure of how likely a particular outcome is to
occur
• Deals with chance
• It takes a value between 0 and 1 (inclusive)
• The probability of an event is never negative
• The probabilities of all outcomes of an experiment always sum to 1
Definition of terms
• Experiment
• Any procedure that can be infinitely repeated and has a well-
defined set of possible outcomes e.g., flipping a coin, rolling a die,
or drawing a card from a deck
• Trial
• A particular act of any experiment e.g., flipping a coin once, rolling
one die once
• Outcome
• The result of a single trial of a probability experiment e.g.,
• When a coin is tossed, there are two possible outcomes: head or tail
• In the roll of a single die, there are six possible outcomes: 1, 2, 3, 4, 5, or 6
• Event
• A set of one or more outcomes of an experiment, e.g., getting an odd number when a die is rolled
• Sample space
• The set of all possible outcomes of a probability experiment
Definition of terms 2
• Some sample spaces for various probability
experiments are shown here
Experiment | Sample space
Toss one coin | Head, tail
Roll two dice | Ordered pairs (Die 1, Die 2) listed below
Die 1 (rows) \ Die 2 (columns): 1 2 3 4 5 6
1 (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
2 (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
3 (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
4 (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
5 (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
6 (6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
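The two-dice sample space in the table can be generated rather than written out by hand. The sketch below is only an illustration; the variable names are arbitrary choices and not part of the lecture.

```python
from itertools import product

# Each outcome is an ordered pair (die 1, die 2).
faces = range(1, 7)
sample_space = list(product(faces, repeat=2))

print(len(sample_space))   # number of outcomes in the sample space
print(sample_space[:6])    # first row of the table: die 1 shows 1
```

Because the two dice are distinguishable, (1,2) and (2,1) are counted as different outcomes.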
Approaches to probability
• Classical
• Relative frequency
• Subjective probability
Classical
• Classical probability assumes that all outcomes in
the sample space are equally likely to occur
• The concept assumes a probability space containing
a number of events that are equally likely (same chance),
mutually exclusive (cannot occur together at the same
time) and collectively exhaustive (all
possibilities considered)
• Probability is the ratio of the number of outcomes favorable to the
event to the total number of possible outcomes
Classical 2
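• The probability of any event E is P(E) = n(E)/n(S), i.e., the number of outcomes in E divided by the total number of outcomes in the sample space S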
• For example,
• when a single die is rolled, each outcome has the
same probability of occurring
• when a card is selected from an ordinary deck of 52
cards, you assume that the deck has been shuffled,
and each card has the same probability of being
selected
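A minimal sketch of the classical approach for the die and card examples, assuming every outcome is equally likely. The helper name `classical_probability` and the card encoding are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favourable outcomes divided by total outcomes (equally likely outcomes only)."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

# Single die: probability of rolling a 4.
die = [1, 2, 3, 4, 5, 6]
p_four = classical_probability({4}, die)

# Ordinary deck of 52 cards: probability of selecting an ace.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = [(rank, suit) for rank in ranks for suit in suits]
aces = {card for card in deck if card[0] == "A"}
p_ace = classical_probability(aces, deck)

print(p_four, p_ace)
```

Using `Fraction` keeps the answers as exact ratios, which matches how the slides present them.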
Relative Frequency
• Also called empirical probability, it is
• The probability of an event expressed as the proportion of times the
event occurs in a long series of trials
• A mathematical likelihood that an event will occur in
repeated trials under similar conditions at a
determinable frequency
• In empirical probability, one might roll a given die
6000 times, observe the various frequencies, and
use these frequencies to determine the probability
of an outcome
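The die-rolling idea can be imitated with a simulation. This is a sketch under stated assumptions: a fair simulated die, and a fixed seed (an arbitrary choice) so that the run is repeatable.

```python
import random
from collections import Counter

random.seed(2024)   # arbitrary seed, only so the result is reproducible
n_rolls = 6000

# Roll a simulated fair die 6000 times and tabulate the frequencies.
rolls = [random.randint(1, 6) for _ in range(n_rolls)]
counts = Counter(rolls)

# Relative frequency = observed frequency / number of trials.
relative_freq = {face: counts[face] / n_rolls for face in range(1, 7)}

for face, rf in relative_freq.items():
    print(face, counts[face], round(rf, 3))
```

Each relative frequency should lie close to 1/6 (about 0.167), and the agreement generally improves as the number of rolls grows.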
Relative Frequency 2
• Given a frequency distribution, the probability of an
event being in a given class is:
• P(E) = frequency for the class / total frequencies in the distribution
• OR
• P(E) = f/n
Relative Frequency 3
• Example
• Choice of travel mode
from Kaduna to Abuja:
• Probability of selecting
a person who is
travelling by air is:
• 6/50=0.12
Relative Frequency 4
• In a sample of 50 people,
• 21 had type O blood,
• 22 had type A blood,
• 5 had type B blood, and
• 2 had type AB blood.
• Set up a frequency
distribution and find the
following probabilities:
• A person has type O blood.
• A person has type A or type
B blood.
• A person has neither type A
nor type O blood.
• A person does not have type
AB blood.
Relative frequency 5
i. P(O)=f/n=21/50=0.42
ii. P(A or B)=22/50+5/50=27/50
iii. P(neither A nor O)=P(B or AB)=5/50+2/50=7/50
iv. P(not AB)=1-P(AB)=1-2/50=48/50
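The four answers can be checked from the frequency distribution directly. The sketch below uses only the counts given in the question; the dictionary layout and variable names are illustrative.

```python
from fractions import Fraction

# Frequency distribution of blood types in the sample of 50 people.
freq = {"O": 21, "A": 22, "B": 5, "AB": 2}
n = sum(freq.values())

# Relative-frequency probability of each class: f / n.
p = {blood_type: Fraction(f, n) for blood_type, f in freq.items()}

p_o = p["O"]                            # i.   type O
p_a_or_b = p["A"] + p["B"]              # ii.  type A or type B
p_neither_a_nor_o = p["B"] + p["AB"]    # iii. neither A nor O
p_not_ab = 1 - p["AB"]                  # iv.  not AB

print(p_o, p_a_or_b, p_neither_a_nor_o, p_not_ab)
```

`Fraction` reduces results automatically, so 48/50 is printed as 24/25.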
Relative frequency 6
• Assignment
• Hospital records indicated
that maternity patients
stayed in the hospital for
the number of days shown
in the distribution
• Find these probabilities:
• A patient stayed exactly 5
days
• A patient stayed at most 4
days
• A patient stayed less than 6
days
• A patient stayed at least 5
days
Classical vs Relative frequency
• The difference between classical and empirical
probability is that classical probability assumes that
certain outcomes are equally likely (such as the
outcomes when a die is rolled), while empirical
probability relies on actual experience to determine
the likelihood of outcomes
Subjective probability
• Definition
• It is the degree of one’s belief in the certainty of occurrence of events.
Subjective probability 2
• Expresses the degree of one’s belief in certainty or
uncertainty
• For example
• A climatologist may say that there is a 25% probability
that it will rain in the month of February 2024 in Zaria
• A physician might say that, on the basis of her diagnosis,
there is a 30% chance the patient will need an operation
Summary of approaches to
probability
• With known probability space, probability is the
ratio of favorable elementary events to total
number of elementary events - classical
• With unknown probability space, a probability is a
relative frequency – relative frequency concept
• Without any objective information, probability is a
subjective assessment based on cumulative
personal experience or intuition - subjective
Probability rules
• Probability Rule 1
• The probability of any event E is a number (either a fraction
or decimal) between and including 0 and 1
• For any event E, the range of possible probabilities is
denoted by: 0 ≤ P(E) ≤ 1
• Probability Rule 2
• If an event E cannot occur (i.e., the event contains no
members in the sample space), its probability is 0
• Probability Rule 3
• If an event E is certain, then the probability of E is 1
• Probability Rule 4
• The sum of the probabilities of all the outcomes in the sample
space is equal to one
Probability laws
• Laws of probability
1. Addition law (or, +)
2. Multiplication law (and, x)
3. Conditional probability
4. Complementary probability
Addition Law
• The probability of two or more events occurring is
the sum of their respective probabilities if the
events are mutually exclusive
• P (A or B) = P(A) + P(B)
Addition Law 2
• When two events are not mutually exclusive, we
must subtract the probability of the
outcomes that are common to both events, since
they have been counted twice
• Then the probability of either event A or B
happening is given as
• P (A or B) = P(A) + P(B) – P(A and B)
Addition Law 3
• The events of getting a 4 and getting a 6 when a
single card is drawn from a deck are mutually
exclusive events, since a single card cannot be both
a 4 and a 6
• On the other hand, the events of getting a 4 and
getting a heart on a single draw are not mutually
exclusive, since you can select the 4 of hearts when
drawing a single card from an ordinary deck
Addition Law 4
Addition Law 5
• Determine which events are mutually exclusive and
which are not, when a single card is drawn from a
deck.
1. Getting a 7 and getting a jack
2. Getting a club and getting a king
3. Getting a face card and getting an ace
4. Getting a face card and getting a spade
Addition law 6
• A box contains 3 glazed doughnuts, 4 jelly
doughnuts, and 5 chocolate doughnuts. If a person
selects a doughnut at random, find the probability
that it is either a glazed doughnut or a chocolate
doughnut.
• Since the box contains 3 glazed doughnuts, 5
chocolate doughnuts, and a total of 12 doughnuts,
• P(glazed or chocolate) = P(glazed) + P(chocolate)
=3/12+5/12=8/12=2/3
• Remember, the events are mutually exclusive
Addition law 7
At a political rally, there are 20 Republicans, 13
Democrats, and 6 Independents. If a person is
selected at random, find the probability that he or
she is either a Democrat or an Independent
• P(Democrat or Independent)= P(Democrat) +
P(Independent) =13/39+6/39=19/39
Addition law 8
• A single card is drawn at random from an ordinary
deck of cards. Find the probability that it is either
an ace or a black card.
• Remember the events are not mutually exclusive
• P(ace or black card)=P(ace) + P(black card) - P(black
aces)= 4/52+26/52-2/52=7/13
Addition law 9
• For example, in a hospital
unit there are
• 8 nurses and 5 physicians;
• 7 nurses and 3 physicians are
females
• If a staff member is selected at random,
find the probability that the
subject is a nurse or a male
• P(nurse or male) = P(nurse) + P(male) - P(male nurse)
= 8/13 + 3/13 - 1/13 = 10/13
Addition law 10
• On New Year’s Eve, the probability of a person
driving while intoxicated is 0.32, the probability of
a person having a driving accident is 0.09, and the
probability of a person having a driving accident
while intoxicated is 0.06. What is the probability of
a person driving while intoxicated or having a
driving accident?
• P(intoxicated or accident) = P(intoxicated) + P(accident)
- P(intoxicated and accident) = 0.32 + 0.09 - 0.06 = 0.35
Addition law 11
• In summary, when the two events are
mutually exclusive, use addition rule 1: P(A or B) = P(A) + P(B)
• When the events are not mutually exclusive, use
addition rule 2: P(A or B) = P(A) + P(B) – P(A and B)
• The probability rules can be extended to three or
more events
• For three mutually exclusive events A, B, and C,
• P(A or B or C) =P(A)+P(B)+ P(C)
• For three events that are not mutually exclusive,
• P(A or B or C)= P(A)+P(B)+P(C)-P(A and B)-P(A and C)-P(B and
C)+P(A and B and C)
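The addition rules for events that are not mutually exclusive can be checked by counting cards directly. This is an illustrative sketch: the deck encoding and the choice of a third event (hearts) are assumptions made here, not part of the slides.

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "spades", "hearts", "diamonds"]
deck = [(rank, suit) for rank in ranks for suit in suits]

def prob(event):
    """Classical probability of a set of cards from one random draw."""
    return Fraction(len(event), len(deck))

aces = {c for c in deck if c[0] == "A"}
black = {c for c in deck if c[1] in ("clubs", "spades")}
hearts = {c for c in deck if c[1] == "hearts"}

# Two events: counting the union directly vs using the addition rule.
two_direct = prob(aces | black)
two_by_rule = prob(aces) + prob(black) - prob(aces & black)

# Three events: the extended rule adds back the triple overlap.
three_direct = prob(aces | black | hearts)
three_by_rule = (prob(aces) + prob(black) + prob(hearts)
                 - prob(aces & black) - prob(aces & hearts) - prob(black & hearts)
                 + prob(aces & black & hearts))

print(two_direct, two_by_rule)
print(three_direct, three_by_rule)
```

Both pairs agree, and the two-event result reproduces the 7/13 obtained for "ace or black card".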
Multiplication law
The multiplication rules can be used to find the
probability of two or more events that occur in
sequence
The joint probability of independent events A and B
occurring is the product of the probabilities
• P (A and B) = P (A) x P (B)
• P (A and B) = p (AB)
Multiplication law 2
Tossing a Coin
• A coin is flipped, and a die is rolled. Find the probability
of getting a head on the coin and a 4 on the die.
• P(head and 4) = P(head) × P(4) = (1/2) × (1/6) = 1/12
• What are the sample spaces for the coin and the die?
• Note that the sample space for the coin is H, T; and for the
die it is 1, 2, 3, 4, 5, 6.
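As a check on the multiplication rule for independent events, the joint sample space of the coin and the die can be enumerated. The sketch below is illustrative only.

```python
from fractions import Fraction
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Joint sample space: every (coin, die) pair, all equally likely.
joint = list(product(coin, die))

# Count the pairs that are a head and a 4.
favourable = [pair for pair in joint if pair == ("H", 4)]
p_by_counting = Fraction(len(favourable), len(joint))

# Multiplication rule for independent events.
p_by_rule = Fraction(1, 2) * Fraction(1, 6)

print(len(joint), p_by_counting, p_by_rule)
```

The joint space has 12 outcomes, only one of which is (H, 4), so counting and the rule give the same answer.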
Conditional probability
• Conditional probability
• probability of occurrence of an event given that another
event has occurred
• When the outcome or occurrence of the first event
affects the outcome or occurrence of the second event
in such a way that the probability is changed, the events
are said to be dependent events
• To find probabilities when events are dependent,
use the multiplication rule with a modification in
notation
Conditional probability 2
• The conditional probability of an event B in
relationship to an event A is the probability that
event B occurs after event A has already occurred
• The notation for conditional probability is P(B│A)
• When two events are dependent, the probability of
both occurring is
• P(A and B)=P(A)*P(B│A)
Conditional probability 3
• For example, the probability of getting an ace on
the first draw is 4/52, and the probability of getting
a king on the second draw is 4/51
• By the multiplication rule, the probability of both
events occurring is:
• =(4/52)*(4/51)=(4/663)
• The event of getting a king on the second draw
given that an ace was drawn the first time is called
a conditional probability
Conditional probability 4
Drawing Cards
• Three cards are drawn from an ordinary deck
and not replaced. Find the probability of these
events.
a. Getting 3 jacks
b. Getting an ace, a king, and a queen in order
c. Getting a club, a spade, and a heart in order
d. Getting 3 clubs
• Solutions
a. P(3 jacks) = (4/52)*(3/51)*(2/50) = (24/132,600) = (1/5525)
b. P(ace and king and queen)=
c. P(club and spade and heart)=
d. P(3 clubs)=
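One way to work through parts (a) to (d) is to multiply the conditional probabilities draw by draw, reducing the deck by one card each time. The helper `draw_in_order` is a hypothetical name used only for this sketch.

```python
from fractions import Fraction

def draw_in_order(favourable_counts, deck_size=52):
    """Multiply conditional probabilities for successive draws without replacement.

    favourable_counts[i] is the number of cards still in the deck that
    satisfy the requirement for draw i, given the earlier draws succeeded.
    """
    p = Fraction(1)
    for i, favourable in enumerate(favourable_counts):
        p *= Fraction(favourable, deck_size - i)
    return p

p_three_jacks = draw_in_order([4, 3, 2])            # a. 4, then 3, then 2 jacks left
p_ace_king_queen = draw_in_order([4, 4, 4])         # b. different ranks, 4 of each remain
p_club_spade_heart = draw_in_order([13, 13, 13])    # c. different suits, 13 of each remain
p_three_clubs = draw_in_order([13, 12, 11])         # d. clubs are used up one by one

print(p_three_jacks, p_ace_king_queen, p_club_spade_heart, p_three_clubs)
```

Only the denominators shrink in (b) and (c), because drawing an ace does not remove any kings, and drawing a club does not remove any spades.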
Complementary probability
• The complement of an event E is the set of outcomes in
the sample space that are not included in the outcomes
of event E
• The complement of E is denoted by Ē (read “E bar”)
• For instance, the sample space consists of the
outcomes 1, 2, 3, 4, 5, and 6
• The event E of getting odd numbers consists of the outcomes
1, 3, and 5.
• The event of not getting an odd number is called the
complement of event E, and it consists of the outcomes 2, 4,
and 6
• Complementary probability
• P(Ē) = 1 - P(E)
Complementary probability 2
• If the probability of an event or the probability of
its complement is known, then the other can be
found by subtracting the probability from 1
• i.e., P(Ē) = 1 - P(E)
• OR P(E) = 1 - P(Ē)
• OR P(E) + P(Ē) = 1
Complementary probability 3
• Find the complement of each event:
• Rolling a die and getting a 4
• Selecting a letter of the alphabet and getting a vowel
• Selecting a month and getting a month that begins with
a J
• Selecting a day of the week and getting a weekday
Exercise
• The probability that a man has HIV is 0.2 and the
probability that his wife also has HIV is 0.3. The
probability that the wife got HIV from her infected
husband is 0.05. What is the probability that:
1. Both have HIV
2. Either of them has HIV
3. The husband got it from his infected wife
Course outline
• Probability theories & laws
• Probability distributions
• Introduction to statistical inference
• Sampling and sampling techniques
Probability distributions
• Introduction
• Types of probability distributions
• Binomial distribution
• Normal distribution
• Standard normal distribution
Probability distributions
• Probability distribution of a random variable is the
distribution of the probability of all possible
outcomes of the random variable
• Expressed as a
• Table
• Graph
• Mathematical equation
https://en.wikipedia.org/wiki/Statistical_inference
Probability distributions 2
• Random variable
• A quantity that takes on various values in an apparently
unpredictable sequence and reveals no pattern
• The observed value is the result of chance
Probability distributions 3
• A continuous probability distribution has the following properties
• The observations can fall anywhere within an interval
• The quantity usually required is the probability that the
variable takes a value anywhere within an interval bounded
by a and b
• A graphical representation of the distribution of the
random variable is the histogram, and its limiting form
is a smooth curve
• The function describing this curve is called the probability
density function (PDF)
• The total area under the PDF = unity (1)
• The probability of a value between a and b is the area
under the curve (AUC) between a and b
Types of probability distributions
Discrete
• Random variable of interest is discrete
• e.g., geometric, binomial, Poisson
Continuous
• Random variable of interest is continuous
• e.g., normal, exponential
Binomial distribution
• A probability distribution that shows the probability
of every possible number of successes in an
experiment in which there are only two possible
outcomes (success or failure), and the probability
of success is known and the same in each of the
repeated number of independent trials
Binomial distribution 2
• Distribution for dichotomous variables e.g.,
• Positive or negative
• Success or failure
• Only 2 outcomes are possible, and they are
mutually exclusive
• Probability of success is the same from trial to trial
• Consists of repeated trials which are independent
• Number of trials (n)
• Number of successes (r)
• Probability of success (p)
• Probability of failure (1-p)
Binomial distribution 3
• Binomial distribution is the probability distribution
of the number of successes (r) in n trials
• Mean = np
• Variance = np(1-p) = npq, where q = 1-p
• p + q = 1
• p=
• Pr(r) = nCr × p^r × q^(n-r), where r = 0, 1, 2, …, n and nCr = n!/(r!(n-r)!)
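A short sketch of the binomial distribution in its standard form, Pr(r) = C(n, r) p^r (1-p)^(n-r), which also checks the mean and variance formulas numerically. The values n = 10 and p = 0.3 are hypothetical and chosen only for illustration.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

n, p = 10, 0.3   # hypothetical number of trials and probability of success

# Probability of every possible number of successes, r = 0, 1, ..., n.
pmf = [binomial_pmf(r, n, p) for r in range(n + 1)]

# Mean and variance computed from the distribution itself.
mean = sum(r * pr for r, pr in enumerate(pmf))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))

print(round(sum(pmf), 6))            # probabilities sum to 1
print(round(mean, 6), n * p)         # compare with np
print(round(variance, 6), n * p * (1 - p))   # compare with np(1-p)
```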
Normal Distribution 3
https://analystprep.com/cfa-level-1-exam/quantitative-methods/key-properties-normal-distribution/
Normal Distribution 4
Normal Distribution 5
• With increasing sample size and decreasing
class width, the histograms will look like
the ones shown in the figure
Properties of a normal distribution
• Bell-shaped
• Completely symmetrical
about its mean
• All the measures of
central tendency coincide
at the same value
(Mean=Median=Mode)
• The total area under the
curve is 1 or 100%
• Determined by its mean
and variance
Properties of a normal distribution 2
• Approximately 68.26%
of the total frequency
of observation are
within 1 SD from mean
• Approximately 95.44%,
within 2 SD from the
mean
• Approximately 99.74 %
within 3 SD from the
mean
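The three percentages can be reproduced from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library. The figures quoted above come from rounded table values, so the last digit may differ slightly from a direct computation.

```python
from statistics import NormalDist

standard_normal = NormalDist()   # mean 0, SD 1

# Area under the curve within k standard deviations of the mean.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4%}")
```

Because any normal distribution can be standardized, the same areas apply whatever the mean and SD are.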
Standard Normal Distribution
• Standard normal distribution or Z distribution
• It is a specified normal distribution whose
• Mean = 0
• SD =1
• Given by Z = (x - μ)/σ
• Necessary to compare values in one sample with
that of another or some standard values
Standard Normal Distribution 2
https://www.scribbr.com/statistics/normal-distribution/
Standard Normal Distribution 3
• Exercise
• In a study of blood pressures, a group of boys had a
mean DBP of 105.8 mmHg and an SD of 13.4 mmHg. What
proportion of boys would be expected to have a DBP
of <120 mmHg if BP has a normal distribution?
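One possible way to work the exercise is to standardize 120 mmHg to a Z value and then read the area to its left. The sketch below does this with `statistics.NormalDist` instead of a printed Z table; the variable names are illustrative.

```python
from statistics import NormalDist

mean_dbp = 105.8   # mmHg, from the exercise
sd_dbp = 13.4      # mmHg, from the exercise
cutoff = 120.0     # mmHg

# Standardize: how many SDs the cutoff lies above the mean.
z = (cutoff - mean_dbp) / sd_dbp

# Proportion expected below the cutoff = area to the left of z.
proportion_below = NormalDist(mean_dbp, sd_dbp).cdf(cutoff)

print(round(z, 2), round(proportion_below, 3))
```

With a Z table, the same answer comes from looking up the area to the left of the rounded Z value.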
Symmetric distribution
https://statisticsbyjim.com/basics/skewed-distribution/#Skewed%20Probability%20Distributions%20and%20Hypothesis%20Tests
Symmetric and asymmetric distributions
• Symmetric
• When the data values are evenly distributed about the
mean, a distribution is said to be a symmetric
distribution - A normal distribution is symmetric
• Asymmetric/skewed
• When most values are not spread evenly but rather tilt
towards a particular side, we have a skewed distribution
• When majority of the data values fall to the left or right
of the mean, the distribution is said to be skewed
Negatively or left-skewed
distribution
https://statisticsbyjim.com/basics/skewed-distribution/#Skewed%20Probability%20Distributions%20and%20Hypothesis%20Tests
Positively or right-skewed distribution
https://statisticsbyjim.com/basics/skewed-distribution/#Skewed%20Probability%20Distributions%20and%20Hypothesis%20Tests
Course outline
• Probability theories & laws
• Probability distributions
• Introduction to statistical inference
• Sampling and sampling techniques
Introduction to statistical
inference
• Introduction
• Hypothesis testing
• Estimation method using confidence intervals
Introduction to statistical
inference
• Statistical inference
• Statistical inference is the process of making
propositions/conclusions about a population, using data
drawn from the population with some form of sampling
Introduction to statistical
inference 2
Introduction to statistical
inference 3
Introduction to statistical
inference 4
• Approaches to statistical inference include
• Test of hypothesis
• Estimation method using confidence intervals
Introduction to statistical
inference 5
Hypothesis testing
• Hypothesis
• Statement about a phenomenon for which the truth has
not been verified e.g., Fear increases hypertension, Sore
throat is more common in winter
• Statistical hypothesis
• A statement about the parameters describing a
population
• Hypothesis testing
• This is the process or procedure of investigating the
truth of a hypothesis.
Hypothesis testing 2
• Procedure for testing a hypothesis
1. State the null hypothesis - H0
2. State the alternative hypothesis - HA
3. State the level of statistical error or level of
significance - α
4. Choose appropriate test statistics
5. Obtain critical value and set decision rule
6. Compute test statistics
7. Take decision to reject or fail to reject the null
hypothesis
8. Conclusion
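The eight steps can be followed in code for one simple case. The sketch below assumes a two-sided one-sample Z test with a known population SD; every number in it (hypothesized mean, SD, sample size, sample mean) is hypothetical and chosen only to illustrate the procedure.

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-2: H0: population mean = 120; HA: population mean != 120 (hypothetical).
mu0 = 120.0
# Step 3: level of significance.
alpha = 0.05
# Step 4: test statistic chosen here is Z (population SD assumed known).
sigma = 15.0          # hypothetical population SD
n = 36                # hypothetical sample size
sample_mean = 126.0   # hypothetical sample mean

# Step 5: critical value and decision rule (reject H0 if |Z| > critical value).
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Step 6: compute the test statistic.
standard_error = sigma / sqrt(n)
z_stat = (sample_mean - mu0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

# Step 7: decision.
reject_null = abs(z_stat) > critical_value

# Step 8: conclusion.
print(f"Z = {z_stat:.2f}, critical value = {critical_value:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

Other test statistics (t, chi-square and so on) follow the same steps with a different formula in step 6 and a different reference distribution in step 5.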
Null and alternate hypothesis
1. Null hypothesis
• Hypothesis of no difference
• Denoted by H0
2. Alternative hypothesis
• Hypothesis to consider if null hypothesis is rejected
• Usually the research question
• Denoted by HA
Null and alternate hypothesis 2
Null and alternate hypothesis 3
Research question: Does tooth flossing affect the number of cavities?
• Null hypothesis: Tooth flossing has no effect on the number of cavities.
• Alternative hypothesis: Tooth flossing has an effect on the number of cavities.
Research question: Does daily meditation decrease the incidence of depression?
• Null hypothesis: Daily meditation has no effect on the incidence of depression.
• Alternative hypothesis: Daily meditation has an effect on the incidence of depression.
https://www.scribbr.com/statistics/null-and-alternative-hypotheses/
Level of significance
3. State the level of statistical error (level of
significance)
• Types of error in hypothesis testing
Type 1 error/alpha (α) error
• Rejecting the null hypothesis when it is indeed true
• The probability of this error is the level of significance (α), usually set at 5% or 0.05
Type 2 error/beta(β) error
• Failure to reject the null hypothesis when it is false
• 1- β = power of the test
• Ability to reject a null hypothesis when it is false
Types of error in hypothesis testing
https://uen.pressbooks.pub/ebpresearchmethods/chapter/part-ii-data-analysis-methods-in-quantitative-research/
Types of error in hypothesis testing 2
https://www.reddit.com/r/funny/comments/7imtol/each_semester_i_put_new_images_on_my_office_door/
Choosing an appropriate
statistical test
4. Choose appropriate statistical test
• The choice of a test depends on
• Study objective (research question)
• Type of data (qualitative or quantitative)
• Sample size (small or large)
• Study design
Study objective (research question)
Determining differences/independence between groups
Type of variable | Unpaired observations | Paired observations
Nominal | Chi-square test | Sign test or McNemar’s chi-square
Ordinal, 2 groups | Mann-Whitney U test | Wilcoxon signed-rank test
Ordinal, > 2 groups | Kruskal-Wallis 1-way ANOVA | Friedman 2-way ANOVA
Numeric, 2 groups | T-test | Paired t-test
Measuring association between variables
Type of variable | Test statistic
Nominal | Chi-square test
Ordinal | Spearman’s rank correlation coefficient
Numeric | Pearson’s correlation coefficient
Hypothesis testing 5
5. Set decision rule
6. Apply data and evaluate test statistics
7. Take decision to reject or not, the null hypothesis
8. Conclude
Confidence intervals
• A confidence interval (CI) is a range of values that is
expected to contain the true value of the population
parameter with a stated degree of confidence
https://www.scribbr.com/statistics/confidence-interval/
Confidence intervals 2
• The accuracy of the confidence interval depends on
the scientific rigor of your research process
• Measurement of variables
• Sampling
• Generic formula
• CI = Estimate ± Critical value for test statistic x Standard
Error (SE)
Confidence intervals 3
• Estimates
• Derived from the sample
• Descriptive statistic e.g., mean, median, proportions,
difference of means, difference of proportions, correlation
coefficient etc.
• Confidence – certainty
• Confidence level = 1 − α
• If α = 0.05
• Confidence level = 1 − 0.05 = 0.95, or 95%.
• A 95% CI means that if the study were repeated many times,
about 95 out of 100 of the resulting intervals would contain
the true population value.
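The generic form, estimate ± critical value × standard error, can be sketched for a sample mean. The mean and SD below are reused from the blood pressure exercise; the sample size of 100 is a hypothetical value, and using the Z critical value assumes a large sample.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 105.8   # mmHg, reused from the DBP exercise
sample_sd = 13.4      # mmHg, reused from the DBP exercise
n = 100               # hypothetical sample size

alpha = 0.05
confidence_level = 1 - alpha

# Critical value of Z for a two-sided interval (about 1.96 for 95%).
critical_value = NormalDist().inv_cdf(1 - alpha / 2)

# Standard error of the mean.
standard_error = sample_sd / sqrt(n)

margin = critical_value * standard_error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"{confidence_level:.0%} CI: {lower:.2f} to {upper:.2f} mmHg")
```

For small samples the critical value would normally come from the t distribution rather than Z, which widens the interval.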
Confidence intervals 4
• Formula varies depending on the type of data e.g.,
Central limit theorem
• States
• The sampling distribution of the sample mean is approximately
normal when the sample size is large, regardless of the
underlying population distribution of the variable
• The mean of the sampling distribution (μx̄) is equal to
the true population mean (μ)
• x̄ = (∑xi/n); μx̄ = μ
• The standard deviation of the sampling distribution (the
standard error, SE) is directly proportional to the population
standard deviation (σ) and inversely proportional to the square
root of the sample size (n)
• SE(x̄) = σ/√n
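The theorem can be illustrated by simulation. The sketch below draws repeated samples from a clearly non-normal population (exponential with mean 1 and SD 1, an assumption made only for this example) and compares the spread of the sample means with σ/√n. The seed, sample size and number of samples are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)       # arbitrary seed, only for reproducibility
n = 50               # size of each sample
n_samples = 2000     # number of repeated samples

# Exponential population with rate 1: population mean = 1, population SD = 1.
population_mean, population_sd = 1.0, 1.0

# Take many samples and record the mean of each one.
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(n_samples)
]

observed_mean = mean(sample_means)    # should be close to the population mean
observed_se = stdev(sample_means)     # should be close to sigma / sqrt(n)
theoretical_se = population_sd / sqrt(n)

print(round(observed_mean, 3), population_mean)
print(round(observed_se, 3), round(theoretical_se, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly right-skewed.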
Course outline
• Probability theories & laws
• Probability distributions
• Introduction to statistical inference
• Sampling and sampling techniques
Introduction to sampling
• Sample
– A subgroup of the population selected for a study
– Any subgroup of the population, technically speaking, can
be called a sample
• A sample should have the following characteristics
– It must be truly representative of the population
– It must be logical and accurate
– It must be large and adequate
– It must contain no bias.
Introduction to sampling 2
• Study population:
– Group of individual units to be investigated
– All subjects (Human or otherwise) under study
• Reference population:
– broader population to which the findings from the study
population are to be generalized.
• Sampling frame:
– List of all the units of the population from which samples
will be drawn
– The list of all elementary units in a population to be
studied
Sampling
• Sampling
– The process by which a sample is selected from a
population
Simple Random sampling
• Every subject in a population has equal probability of
appearing in the selection
• Steps
– Make a numbered list of the units in the population that you want to
sample – sampling frame
– Decide the sample size – sample size formula
– Then select the determined number of sampling units, n using any of
the following three methods:
• Balloting
• Random Number Table
• Computer-generated list
Simple Random sampling 2
• Balloting
– A simple random sample of 50 students is to be
selected from a school of 250 students.
– Using a list of all 250 students, each student is given a
number (1 to 250), and these numbers are written on
small pieces of paper.
– All the 250 papers are put in a box, after which the
box is shaken vigorously, to ensure randomisation.
– Then, 50 papers are taken out of the box, and the
numbers are recorded.
– The students belonging to these numbers will
constitute the sample.
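The balloting procedure corresponds to the "computer-generated list" option when done in software. The sketch below is illustrative; the seed is an arbitrary choice so the draw can be repeated.

```python
import random

random.seed(7)   # arbitrary seed, only so the same sample can be reproduced

# Sampling frame: students numbered 1 to 250.
sampling_frame = list(range(1, 251))

# Draw 50 numbers without replacement, like taking papers out of the box.
sample = random.sample(sampling_frame, k=50)

print(sorted(sample))
```

`random.sample` never returns the same number twice, which mirrors not putting a paper back into the box.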
Simple Random sampling 3
• Random number table
– First, decide how large a number you need.
– Next, count if it is a one, two or larger digit
number. For example, if your sampling frame
consists of 10 units, you must choose from
numbers 1-10, (inclusive).
– You must use two digits to ensure that 10 has an
equal chance of being included.
– You also use two digits for a sampling frame
consisting of 0-99 units.
Simple Random sampling 4
• Random number table cont’d
– If, however, your sampling frame has 0-999 units, then
you obviously need to choose from three digits. In this
case, you take an extra digit from the table to make up
the required three digits. For example, the number in
columns 10&11, row 27: 74, would become 748; going
down, the next numbers would be 229, 059 etc.
– You would do the same if you needed a four-digit
number, for a sampling frame 0-9999 units. In our
example of the number on columns 10, 11, 12, 13 row
27 of the table: 748, this would now become 7488,
the next down 2294, and so on.
Simple Random sampling 5
• Random number table cont’d
– Decide beforehand whether you are going to go
across the page to the right →, down the page ↓,
across the page to the left ←, or up the page ↑
– Without looking at the table, and using a pencil, pen,
stick, or even your finger, pin-point a number
– If this number is within the range you need, take it. If
not, continue to the next number in the direction you
chose before-hand, (across, up or down the page),
until you find a number that is within the range you
need.
Random number table
Simple Random sampling 6
• Advantages
– It requires a minimum knowledge of population
– It is free from subjectivity and personal error
– It can be used for inferential purpose
• Disadvantages
– Does not make use of existing knowledge about the population
– Inferential accuracy of the findings depends on sample size
– Time-consuming and resource-intensive
– Difficult to implement in population-based studies
– Developing sampling frame is expensive in population-
based studies
Systematic sampling
• It is the selection of every kth subject or unit, at regular intervals,
from the sampling frame (k is the sampling interval)
• Steps
– Prepare a list of all the units or elements or members of the
population and assign numbers to each element from 1 to N
– Determine the size of the sample n to be selected from the population
of size N
– Determine the sampling interval
– Starting point is gotten by selecting the first number at random
Systematic sampling 2
• A sampling interval (or fraction)
– It is determined by dividing the total population N by the size of
the sample n
• Sampling interval = Population size (N) / Sample size (n)
– e.g., A systematic sample is to be selected from 1200 students
at a school.
– The calculated sample size is 100.
– The sampling interval is
• Sampling interval = 1200/100 = 12
– The first student to be selected in the sample is chosen
randomly, for example by balloting one out of twelve pieces of
paper, numbered 1 to 12
– If number 6 is picked, then every twelfth student will be
selected in the sample, starting with student number 6, until
100 students are selected: the numbers selected would be 6,
18, 30, 42, etc.
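The school example can be sketched in a few lines. The symbol `k` for the sampling interval and the seed are choices made for this illustration.

```python
import random

N = 1200   # population size (students in the school)
n = 100    # required sample size
k = N // n # sampling interval

# Worked example from the slide: starting point 6.
example = list(range(6, N + 1, k))
print(example[:4])

# In practice the starting point is chosen at random between 1 and k.
random.seed(3)   # arbitrary seed for reproducibility
start = random.randint(1, k)
selected = list(range(start, N + 1, k))
print(start, len(selected))
```

Whatever the starting point between 1 and 12, exactly 100 students are selected.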
Systematic sampling 3
Example: systematic sampling
Systematic sampling 4
• Advantages:
– It is a simple method of selecting a sample
– It reduces the field cost
– Inferential statistics may be used
– May be used for drawing conclusions and generalization
• Disadvantages:
– It is not free from error
– It cannot ensure representativeness
– There is a risk in drawing conclusions from the sample
– Regularity/periodicity in the data can produce biased
estimates
Stratified sampling
• A stratified sample is a sample obtained by dividing the population into
subgroups, called strata (age, sex, rural/urban residence, geopolitical
zone) according to various homogeneous characteristics and then
selecting members from each stratum for the sample
• Used for heterogeneous populations in which certain characteristics are
important to the study, e.g., sex, age, occupation
• Steps:
– Identify the sub-groups or strata
– Construct a sampling frame in each stratum
– Perform sampling within each stratum:
• Equal Allocation
• Proportional Allocation
• Optimal Allocation
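Of the three allocation options, proportional allocation is the simplest to sketch: each stratum contributes to the sample in proportion to its share of the population. The strata names, stratum sizes and total sample size below are hypothetical.

```python
# Hypothetical strata and their population sizes.
strata_sizes = {"urban": 6000, "rural": 4000}
total_sample = 200   # hypothetical overall sample size

population_size = sum(strata_sizes.values())

# Proportional allocation: n_h = n * (N_h / N) for each stratum h.
allocation = {
    stratum: round(total_sample * size / population_size)
    for stratum, size in strata_sizes.items()
}

print(allocation)
```

With less convenient numbers, rounding can make the allocations add up to slightly more or less than the intended total, so a final adjustment may be needed. Equal allocation would instead give every stratum the same number of units.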
Stratified sampling 2
• Advantages
– It ensures representation of all population subgroups
that are important to the study
– It is an improvement over the earlier methods
– It is an objective method of sampling
– It can be used for inferential purpose
• Disadvantages
– Difficulty in deciding relevant criterion for
stratification
– It is a costly and time-consuming method
– There is risk in generalization
Cluster sampling
• A cluster sample is a sample obtained by selecting a pre-
existing or natural group, called a cluster, and using the
members in the cluster for the sample.
– Typically, the population is divided into mutually exclusive and
exhaustive clusters
– Each cluster should be as internally heterogeneous as possible (members very
different within a cluster, but clusters alike from one to another with respect to
the outcome of interest)
– The number of units in the cluster is the size of the cluster
– Clusters are often geographic units (e.g., districts, villages) or
organizational units (e.g., clinics, training groups, schools, classrooms,
housing units, neighborhood)
Cluster sampling 2
Cluster sampling 3
Advantages
• It is an easy method
• It is an economical method
• It is practicable and can simplify fieldwork
• It can be used for inferential purpose
• Good when individual sampling frame not available (e.g. no listing
of persons but HH, building addresses, city blocks, etc.)
• May be multi-stage
• Useful when dealing with scattered populations, e.g. Nomads, hamlets
(small scattered settlements)
Disadvantages
• It is not free from errors
• It is not comprehensive
• It may not be a good representative of the population
• Larger sampling errors especially with multi-stage models
Multistage sampling
• Multistage cluster sampling: selection is done in stages until
the final sampling units are arrived at
• In multistage sampling, the researcher uses a combination of
sampling methods
• Useful in very large and diverse populations, where sampling may be
done in two or more stages
• This is often the case in population-based studies, in which people are
to be interviewed from different houses, which have to be
chosen from different communities that have to be chosen
from LGAs or wards (e.g. districts, wards, streets or blocks)
Multistage sampling 2
• For example, suppose a research organization wants to conduct a
nationwide survey for a new product being manufactured.
• A sample can be obtained by using the following combination of
methods
– First the researchers divide the country into six geopolitical zones (or
clusters)
– Then three states from each zone are selected at random.
– Next from each of the three selected states, three LGAs are further
selected.
– Next, in each of the selected LGA, three settlements are selected
– Finally, fifteen houses/households are selected at random from each
of the selected settlement
– The families living in these houses/households are selected at
random, given samples of the product to test, and asked to
report the results.
– This hypothetical example illustrates a typical multistage sampling
method.
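The number of units selected at each stage of this hypothetical design follows by multiplication. The sketch below only does that arithmetic; the stage labels are taken from the example and the variable names are illustrative.

```python
# Units selected per parent unit at each stage of the example.
stages = [
    ("geopolitical zones", 6),
    ("states per zone", 3),
    ("LGAs per state", 3),
    ("settlements per LGA", 3),
    ("households per settlement", 15),
]

# Running product gives the total number of units at each stage.
totals = []
running = 1
for label, per_parent in stages:
    running *= per_parent
    totals.append((label, running))
    print(f"{label}: {running} in total")

total_households = running
```

The running totals are 6 zones, 18 states, 54 LGAs, 162 settlements and 2430 households.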
Multistage sampling 3
• Advantages
– It is a good representative of the population
– It is an improvement over the other methods
– It is an objective procedure of sampling
– It can be used for inferential purpose
• Disadvantages
– It is a difficult and complex method of sampling
– It involves errors from various stages of sampling
Non-probability sampling techniques
• Sampling that does not use random selection
of subjects
– Not based on probability, so cannot calculate
confidence intervals
Non-probability sampling techniques 2
• Types
– Convenience: accidental or haphazard based on
accessibility to subjects
– Purposive sampling: purposively selected
– Quota sampling: allocation of fixed amount
– Snowball: referrals from initial subjects
– Judgment sampling
– Volunteer sampling: free or incentivized to participate
Non-probability sampling techniques 3
• Advantage
– Quick, inexpensive, convenient and useless for
epidemiologic studies
• Disadvantage
– It does not allow generalization to be made
References and
Acknowledgements
• Adapted from
• Prof Dr Tukur DAHIRU Biostatistics for Anatomy,
Physiology and Nursing Sciences Students
• Dr Sunday Joseph Slides for 600L MBBS Community
diagnosis Group D 2014
• E A Bambogye Medical Statistics second edition -
2014 Ibadan University Press
• https://www.scribbr.com/statistics/confidence-interval/