Unit-4-Estimation-and-Hypothesis-testing Revised

The document covers inferential biostatistics, focusing on estimation and hypothesis testing, including sample size calculation and sampling techniques. It distinguishes between descriptive and inferential statistics, emphasizing the importance of using samples to infer population results. Various sampling methods, both probability and non-probability, are outlined along with examples of inferential statistics applications such as regression analysis, hypothesis testing, and confidence intervals.

INFERENTIAL BIOSTATISTICS

Estimation and Hypothesis Testing
Sample Size and Sampling Techniques
LEARNING OBJECTIVES
• Define estimation and hypothesis testing.
• Compute the standard error of the mean.
• Differentiate probability and non-probability sampling.
• Compute sample size using Gay's Rule and an online Lynch Formula calculator.

• Researchers can rarely obtain data from the entire study population, so they usually rely on samples.
• Although a study uses a sample, its real target is the entire population.
• Inferential statistics is the use of statistical tools to infer population results from sample results.
There are two types of statistics:
1- Descriptive statistics: this includes calculating measures of central tendency and measures of dispersion, and presenting the data in tables and graphs.
2- Inferential statistics: this is the main aim of studying statistics. It is the use of statistical tools and descriptive measures to infer population results from sample results.
ADVANTAGES OF USING INFERENTIAL STATISTICS
1. A precise tool for estimating population values
• The main purpose of inferential statistics is to estimate population values. Applied to a representative sample, these methods give accurate and precise estimates that describe the actual conditions in the population.
2. Highly structured analytical methods
• Inferential statistics have well-defined formulas and structure. The methods are tested mathematically, and many standard estimators, such as the sample mean, are unbiased.
DIFFERENCES BETWEEN INFERENTIAL STATISTICS AND DESCRIPTIVE STATISTICS
1. Descriptive statistics aim to describe the characteristics of the data, while inferential statistics aim to draw conclusions about the population by analyzing the sample.
2. Descriptive statistics are usually presented as tables and graphs, and the statistics used are fairly simple, such as averages and variances. The methods of inferential statistics are more complex and require careful study before they are used.
• Therefore, the tools of descriptive analysis alone cannot be used to draw inferences about the whole population.
PROCEDURE FOR USING INFERENTIAL STATISTICS
1. Determine the population we want to examine.
2. Determine a sample size that is representative of the population.
3. Select an analysis that matches the purpose of the study and the type of data we have.
4. Draw conclusions from the results of the analysis.
INFERENTIAL STATISTICS EXAMPLES
1. Regression Analysis
• Regression analysis is one of the most popular analysis tools. It is used to model the relationship between independent variables and a dependent variable, and to predict the dependent variable from the independent ones.
• Using this analysis, we can determine which variables have a significant effect in a study.
• For example, you want to know which factors influence the decline in poverty (the dependent variable). You use variables such as road length, economic growth, electrification ratio, number of teachers, number of medical personnel, etc. (the independent variables).
• After the analysis, you will find which variables have an influence in reducing the poverty rate.
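The idea can be sketched with a least-squares fit on a single independent variable. This is only a minimal illustration: the function name `fit_line` and the data values are made up for the example, and a real poverty study would use several independent variables and a statistics package.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative values: economic growth (%) and poverty rate (%)
growth = [1.0, 2.0, 3.0, 4.0, 5.0]
poverty = [20.0, 18.5, 16.0, 15.5, 13.0]

slope, intercept = fit_line(growth, poverty)
print(f"poverty = {intercept:.2f} + ({slope:.2f}) * growth")
```

A negative slope in such a fit would suggest that higher growth goes with a lower poverty rate in the sample. Whether the effect is significant is a separate question that needs a hypothesis test on the slope.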
INFERENTIAL STATISTICS EXAMPLES
2. Hypothesis test
• Hypothesis testing is a statistical procedure for checking whether an assumption or opinion is supported by data. It is often used to examine a claim circulating in the community.
• Hypothesis testing also helps us assess whether the things we believe are supported or contradicted by the evidence.
• For example, we often hear the assumption that female students tend to have higher mathematics scores than male students. Is that right?
• To check this, you can take a representative sample and analyze the mathematics scores of the students sampled.
• By using a hypothesis test, you can draw conclusions about the actual conditions.
• Could you instead analyze the mathematics scores of all students? Yes, that is allowed.
• However, reaching a conclusion will take longer, because collecting data on the whole population requires substantial time.
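As a rough sketch of how such a comparison might be computed, the block below calculates Welch's t statistic for two independent samples. The choice of Welch's test, the function name `welch_t`, and the scores are assumptions made for illustration; the slides do not specify a particular test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    # Standard error of the difference, without assuming equal variances
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Hypothetical mathematics scores (illustrative values only)
female_scores = [78, 85, 90, 72, 88, 81]
male_scores = [75, 80, 70, 83, 77, 74]

t = welch_t(female_scores, male_scores)
print(f"t = {t:.2f}")
```

The further t is from zero, the harder it is to explain the difference by chance alone. Turning t into a p-value requires the t distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and reports the p-value.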
INFERENTIAL STATISTICS EXAMPLES
3. Confidence Interval:
• A confidence interval is a range of values, computed from a sample, that is used to estimate a population value. The confidence level (for example, 95 percent) states how often intervals built this way capture the actual population value.
• A confidence interval has a lower limit and an upper limit, and we expect the population value we are estimating to lie between them.
• A 95 percent confidence level means that if we repeated the sampling many times, about 95 percent of the intervals obtained from the formula would contain the true population value.
• For example, we want to estimate the average expenditure per person in city X. Research is conducted by taking a sample, and the results vary from one sample to another.
• Therefore, we report an estimated range for the actual average expenditure. The hope is that the actual average value falls within the range we have calculated.
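A minimal sketch of the calculation for a mean is shown below. It assumes a normal approximation (reasonable for large samples; small samples would normally use the t distribution instead), and the function name and expenditure values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, about 1.96 for 95 percent confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin


# Hypothetical monthly expenditures of sampled residents of city X
expenditures = [410, 380, 455, 390, 430, 470, 405, 420, 445, 395]

low, high = mean_confidence_interval(expenditures)
print(f"95% CI for the mean: {low:.1f} to {high:.1f}")
```

Raising the confidence level (for example to 0.99) widens the interval, and a larger sample narrows it.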
INFERENTIAL STATISTICS EXAMPLES
4. Time series analysis:
• One type of data, classified by time, is time series data. Such data often show repeating or characteristic patterns, which makes them interesting to study in depth.
• Time series analysis is a type of statistical analysis that tries to predict a future event based on pre-existing data. With this method, we can predict a value or event that will appear in the future.
• Example: every year, policymakers estimate economic growth, both quarterly and yearly. By using time series analysis, we can use data from the past 20 to 30 years to estimate how the economy will grow in the future.
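One of the simplest forecasting rules is a moving average: predict the next value as the mean of the most recent observations. The sketch below is only an illustration of the idea, with a hypothetical function name and made-up growth figures; it is not the method policymakers actually use.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Made-up yearly economic growth figures (%)
growth = [5.1, 4.8, 5.0, 5.3, 4.9, 5.2]

forecast = moving_average_forecast(growth, window=3)
print(f"Forecast for next year: {forecast:.2f}%")
```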
Estimating population parameters from sample statistics
Parameter: a characteristic or measure calculated from the population under study, e.g. the percentage of a disease in a certain population, or the mean age of a group of people (population mean).
Statistic: a characteristic or measure calculated from a sample drawn from a population, e.g. the mean age of a sample of patients (sample mean).
Inferential statistics is divided into two major subjects:
1- Estimation
2- Testing of hypotheses
Both are done by drawing a sample, or a number of samples, from the study population to estimate its parameters or to test hypotheses about that population.
Estimation is used to estimate the population parameters when those parameters are unknown.
A test of hypothesis is used to decide whether or not to reject a statistical hypothesis about a population parameter.
Estimation: a statistical method, based on statistical theory, that uses sample statistics to estimate a parameter.
There are two methods of estimating an unknown parameter:
- Point estimation: using the sample statistics to estimate the population parameter with a single value.
- Interval estimation: using the sample statistics to estimate the population parameter with an interval of values.
One of the most important parameters is the population mean (µ).
We can use the sample mean ($\bar{x}$) as a point estimate of the population mean µ.
The sample mean is not expected to equal the population mean exactly.
The difference between a statistic and the parameter it estimates is the sampling error; the standard error measures how large this difference typically is across repeated samples.
Every sample statistic has a standard error; for example, there is a standard error of the sample mean, of the sample standard deviation, and of the correlation coefficient. We will focus on the standard error of the sample mean.
Standard error of the mean: it is the standard deviation of the means of different samples. It can also be defined as the deviation of the means of different samples from the population mean.
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
The value of the standard deviation of the population ($\sigma$) is usually not known, therefore we estimate its value using the standard deviation of the sample ($s$):
$s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Example:
Calculate the standard error of the mean for a sample of size 49 drawn from a population with standard deviation 14.
Answer:
The standard error of the mean is
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{14}{\sqrt{49}} = 2$
Example 2:
The performance of a certain skill was measured for a sample of size 36. The mean was found to be 30 and the standard deviation 9. Calculate the standard error of the mean.
Answer: n = 36, mean = 30 and s = 9
$s_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{9}{\sqrt{36}} = 1.5$
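The two worked examples can be checked with a few lines of code. The helper name `standard_error` is chosen here for illustration; it divides a standard deviation by the square root of the sample size, as in the formulas for the standard error of the mean.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean: standard deviation over sqrt(sample size)."""
    return sd / sqrt(n)


print(standard_error(14, 49))  # Example 1: population SD 14, n = 49
print(standard_error(9, 36))   # Example 2: sample SD 9, n = 36
```

Note that the sample mean (30 in Example 2) does not enter the calculation; only the standard deviation and the sample size do.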
SAMPLE SIZE AND SAMPLING TECHNIQUES
SAMPLE SIZE?
• The more heterogeneous a population is, the larger the sample needs to be.
• It depends on the topic: how frequently does the characteristic of interest occur?
• For probability sampling, the larger the sample size, the better.
• Nonprobability samples are not generalizable regardless of size, but the stability of the results should still be considered.
WHY SAMPLE?
• The population of interest is usually too large to attempt to survey all of its members.
• A carefully chosen sample can be used to represent the population.
• The sample reflects the characteristics of the population from which it is drawn.
PROBABILITY VERSUS NONPROBABILITY
• Probability samples: each member of the population has a known non-zero probability of being selected.
  - Methods include random sampling, systematic sampling, stratified sampling and cluster sampling.
• Nonprobability samples: members are selected from the population in some nonrandom manner.
  - Methods include convenience sampling, judgment sampling, quota sampling, and snowball sampling.
TYPES OF PROBABILITY SAMPLES
1- Random sampling (simple)
2- Systematic sampling
3- Stratified sampling
4- Cluster sampling
RANDOM SAMPLING
• Random sampling is the purest form of probability sampling.
• Each member of the population has an equal and known chance of being selected.
• When the population is very large, it is often difficult to identify every member, so the pool of available subjects becomes biased.
• You can use software such as Minitab to generate random numbers or to draw a sample directly from the data columns.
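The same draw can be sketched in Python instead of Minitab. This is a minimal illustration: the function name, the sampling frame of patient IDs, and the seed are all hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical sampling frame: patient IDs 1 to 500
patient_ids = list(range(1, 501))

sample = simple_random_sample(patient_ids, 20, seed=1)
print(sorted(sample))
```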
SYSTEMATIC SAMPLING
• Systematic sampling is often used instead of random sampling. It is also called an Nth name selection technique.
• After the required sample size has been calculated, every Nth record is selected from a list of population members.
• As long as the list does not contain any hidden order, this sampling method is as good as the random sampling method.
• Its only advantage over the random sampling technique is simplicity (and possibly cost effectiveness).
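A sketch of the selection step follows. It assumes the common variant in which the interval is the population size divided by the sample size (rounded down) and the starting point is chosen at random within the first interval; the function name and the list of records are hypothetical.

```python
import random


def systematic_sample(population, n, seed=None):
    """Select every k-th record after a random start, where k = N // n."""
    if not 0 < n <= len(population):
        raise ValueError("n must be between 1 and the population size")
    k = len(population) // n                  # sampling interval
    start = random.Random(seed).randrange(k)  # random start in the first interval
    return population[start::k][:n]


# Hypothetical list of 100 records, sample size 10, so every 10th record
records = list(range(1, 101))

sample = systematic_sample(records, 10, seed=7)
print(sample)
```

If the list has a hidden periodic order that matches the interval, every selected record shares the same position in the cycle, which is the bias described above.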
STRATIFIED SAMPLING
• Stratified sampling is a commonly used probability method that can be superior to simple random sampling because it can reduce sampling error.
• A stratum is a subset of the population whose members share at least one common characteristic, such as males or females.
• Identify the relevant strata and their actual representation in the population.
• Random sampling is then used to select a sufficient number of subjects from each stratum.
• Stratified sampling is often used when one or more of the strata in the population have a low incidence relative to the other strata.
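The steps above can be sketched as follows. The sketch assumes proportional allocation (the same sampling fraction in every stratum) with at least one subject per stratum; the function name, the `stratum_of` parameter, and the population are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by stratum
    sample = []
    for members in strata.values():
        # At least one subject per stratum, so rare strata are not lost
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 60 females and 40 males
people = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]

sample = stratified_sample(people, lambda person: person[0], 0.1, seed=3)
print(sample)
```

When a stratum has a very low incidence, a researcher may deliberately sample it at a higher fraction than the others; that variant is not shown here.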
CLUSTER SAMPLING
• Cluster sample: a probability sample in which each sampling unit is a collection of elements.
• Effective under the following conditions:
  - A good sampling frame of individual elements is unavailable or costly to obtain, while a frame listing clusters is easily obtained.
  - The cost of obtaining observations increases as the distance separating the elements increases.
• Examples of clusters:
  - City blocks – political or geographical
  - Housing units – college students
  - Hospitals – illnesses
  - Automobile – set of four tires
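A minimal sketch of one-stage cluster sampling is given below: whole clusters are selected at random and every element in a selected cluster is observed. The function name and the hospital data are made up for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}


# Hypothetical frame of clusters: hospitals and their patient IDs
hospitals = {
    "Hospital A": [101, 102, 103],
    "Hospital B": [201, 202],
    "Hospital C": [301, 302, 303, 304],
    "Hospital D": [401],
}

selected = cluster_sample(hospitals, 2, seed=5)
print(selected)
```

Only the list of clusters is needed as a sampling frame, which is why the method helps when no list of individual elements exists.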
TYPES OF NON-PROBABILITY SAMPLES
1- Convenience sampling
2- Judgment sampling (purposive)
3- Quota sampling
4- Snowball sampling
CONVENIENCE SAMPLING
• Convenience sampling is used in exploratory research where the researcher is interested in getting an inexpensive approximation.
• Subjects are selected because they are convenient to reach.
• It is a nonprobability method.
  - It is often used during preliminary research efforts to get an estimate without incurring the cost or time required to select a random sample.
JUDGMENT SAMPLING
• Judgment sampling is a common nonprobability method.
• The sample is selected based on the researcher's judgment.
  - It is an extension of convenience sampling.
• When using this method, the researcher must be confident that the chosen sample is truly representative of the entire population.
QUOTA SAMPLING
• Quota sampling is the nonprobability equivalent of stratified sampling.
• First identify the strata and their proportions as they are represented in the population.
• Then convenience or judgment sampling is used to select the required number of subjects from each stratum.
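The two steps can be sketched as filling fixed quotas from subjects in the order they happen to become available, which stands in for convenience selection. The function name, the `stratum_of` parameter, and the data are hypothetical.

```python
def quota_sample(arrivals, stratum_of, quotas):
    """Accept subjects in arrival order until every stratum's quota is full."""
    remaining = dict(quotas)  # copy so the caller's quotas are not modified
    sample = []
    for unit in arrivals:
        stratum = stratum_of(unit)
        if remaining.get(stratum, 0) > 0:
            sample.append(unit)
            remaining[stratum] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return sample


# Hypothetical subjects in the order they became available
arrivals = [("F", 1), ("F", 2), ("M", 3), ("F", 4), ("M", 5), ("F", 6), ("M", 7)]

sample = quota_sample(arrivals, lambda person: person[0], {"F": 2, "M": 2})
print(sample)
```

Unlike stratified sampling, no random selection happens inside a stratum, so selection probabilities are unknown.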
SNOWBALL SAMPLING
• Snowball sampling is a special nonprobability method used when the desired sample characteristic is rare.
• It may be extremely difficult or cost prohibitive to locate respondents in these situations.
• This technique relies on referrals from initial subjects to generate additional subjects.
• It lowers search costs; however, it introduces bias because the technique itself reduces the likelihood that the sample will represent a good cross section of the population.
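The referral process can be sketched as a walk over a referral list, starting from the initial subjects and following their referrals until the target sample size is reached. The function name, the names of the subjects, and the referral data are invented for illustration.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Recruit subjects by following referrals, starting from the seeds."""
    sample = []
    seen = set()
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        sample.append(person)
        queue.extend(referrals.get(person, []))  # ask this subject for referrals
    return sample


# Hypothetical referral lists: who each subject can refer
referrals = {
    "Ana": ["Ben", "Cara"],
    "Ben": ["Dev"],
    "Cara": ["Ben", "Eli"],
}

sample = snowball_sample(referrals, ["Ana"], max_size=4)
print(sample)
```

Everyone in the sample is connected to the initial subjects, which illustrates why the result may not be a good cross section of the population.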
