Sampling and It

This document discusses research methodology, specifically sampling techniques. It covers key topics such as defining a sample and ensuring it is representative of the population. Methods for determining sample size are examined, including approaches based on desired precision or statistical power. Formulas for calculating sample sizes to estimate population means or proportions are provided. Different sampling types and techniques are also outlined, along with important terminology and notations used in sampling.

Uploaded by

Nouman Shahid

ADVANCED RESEARCH METHODOLOGY

SAMPLING AND ITS TECHNIQUES
Assignment 3

Nayab Khan (Mphil)


11/1/2016
TABLE OF CONTENTS

1. What Is a Sample?
1.1. Attributes of Sample
1.2. Representative Sample
2. Determining Sample
2.1. Why to Determine Sample
2.2. Sample Size Determination
2.3. How to Determine Sample Size
3. Calculating Sample Size
3.1. Approaches to Calculate Sample Size
3.2. Sample Size to Estimate a Population Mean
4. Basic Terms and Notations
5. Types of Sampling
6. Methods of Sampling
7. Event Sampling Methodology
8. Response Rate
9. Strategies for Sampling
10. Errors in Sampling
10.1. Bias in Sampling
10.2. Statistical Errors
11. Software Used for Calculating Sample
1. SAMPLE
• A sample is “a smaller (but hopefully representative) collection of units from a
population used to determine truths about that population” (Field, 2005).
• A sample is a “subgroup of a population” (Frey et al. 125).
1.1. ATTRIBUTES OF SAMPLE

• Every individual in the chosen population should have an equal chance to be
included in the sample.
• Ideally, the choice of one participant should not affect the chance of another's
selection. Hence we try to select the sample randomly. Note that random sampling
does not describe the sample or its size so much as it describes how the sample
is chosen.

1.2. REPRESENTATIVE SAMPLE


• A representative sample is one whose characteristics correspond to, or reflect, those
of the original population or reference population.
• To ensure representativeness, the sample may be either completely random or
stratified, depending upon the conceptualized population and the sampling
objective.
• The aim of any sample is to represent the characteristics of the sample frame.
• There are a number of different methods used to generate a sample.
• As a researcher you will have to select the method that best meets the
requirements of your research.
2. DETERMINING SAMPLE

2.1. WHY TO DETERMINE SAMPLE


• Resources (time, money) and workload
• Gives results with known accuracy that can be calculated mathematically
• The population of interest is usually too large to attempt to survey all of its
members.
• A carefully chosen sample can be used to represent the population.
• The sample reflects the characteristics of the population from which it is drawn.
• When it’s impossible to study the whole population
2.2. SAMPLE SIZE DETERMINATION

Sample size determination is the act of choosing the number of observations or
replicates to include in a statistical sample. The sample size is an important feature of
any empirical study in which the goal is to make inferences about a population from a
sample. In practice, the sample size used in a study is determined based on the expense of
data collection and the need to have sufficient statistical power. In complicated studies
there may be several different sample sizes involved: for example, in a stratified survey
there would be a different sample size for each stratum. In a census, data are collected on
the entire population; hence the sample size is equal to the population size. In
experimental design, where a study may be divided into different treatment groups, there
may be different sample sizes for each group.
Sample sizes may be chosen in several different ways:

• Expedience: for example, including those items readily available or convenient to
collect. A choice of small sample sizes, though sometimes necessary, can result in
wide confidence intervals or risks of errors in statistical hypothesis testing.
• Using a target variance for an estimate to be derived from the sample eventually
obtained.
• Using a target for the power of a statistical test to be applied once the sample is
collected.

2.3. HOW TO DETERMINE SAMPLE SIZE


To determine the sample size for estimating a proportion, three things must be specified:

• The margin of error, for example “We need a margin of error less than 2.5%”. Typical
surveys have margins of error ranging from less than 1% to something of the order of 4%.
We can choose any margin of error we like, but we need to specify it.
• The confidence level. 95% confidence intervals are typical but not in any way mandatory;
we could use 90%, 99% or something else entirely. For this example, we assume 95%.
• A prior estimate of the proportion being measured. This may be guided by past surveys or
general knowledge of public opinion. Let’s suppose the answer is 30%.
3. CALCULATING SAMPLE SIZE
3.1. APPROACHES TO CALCULATE SAMPLE SIZE
There are two approaches to sample size calculations:
• Precision-based: With what precision do you want to estimate the proportion, mean
difference . . . (or whatever it is you are measuring)?
• Power-based: How small a difference is it important to detect and with what degree of
certainty?

Generally, the sample size for any study depends on the:[1]

• Acceptable level of significance
• Power of the study
• Expected effect size
• Underlying event rate in the population
• Standard deviation in the population.

Before calculating a sample size, you need to determine a few things about the target population
and the sample you need:

1. Population Size: How many total people fit your demographic? For instance, if you
want to know about mothers living in the US, your population size would be the total
number of mothers living in the US. Don’t worry if you are unsure about this number. It
is common for the population to be unknown or approximated.

2. Margin of Error (Confidence Interval): No sample will be perfect, so you need to
decide how much error to allow. The confidence interval determines how much higher or
lower than the population mean you are willing to let your sample mean fall. If you’ve
ever seen a political poll on the news, you’ve seen a confidence interval. It will look
something like this: “68% of voters said yes to Proposition Z, with a margin of error of
+/- 5%.”

3. Confidence Level: How confident do you want to be that the actual mean falls within
your confidence interval? The most common confidence levels are 90%, 95%, and 99%.

4. Standard Deviation: How much variance do you expect in your responses? Since
we haven’t actually administered our survey yet, the safe decision is to use 0.5; this is the
most forgiving number and ensures that your sample will be large enough.

We already know that, at 95% confidence, the margin of error is 1.96 times the standard error
and that the standard error is

$$SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

In general the formula is

$$ME = z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

where:

• ME is the desired margin of error

• z is the z-score, e.g. 1.645 for a 90% confidence interval, 1.96 for a 95% confidence interval,
2.58 for a 99% confidence interval (see Table 8.2, page 369)

• $\hat{p}$ is our prior judgment of the correct value of p.

• n is the sample size (to be found)
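
Rearranging the margin-of-error formula for n gives n = z² p̂(1 − p̂) / ME², rounded up to the next whole respondent. The Python sketch below applies that rearrangement to the running example (margin of error 2.5%, 95% confidence, prior estimate 30%). The function name `sample_size_for_proportion` and its parameters are illustrative assumptions, not part of the source. The z-score is computed from the standard normal distribution instead of being read from a table, and no finite-population correction is applied.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95, p_prior=0.5):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin_of_error.

    Hypothetical helper: all arguments are proportions in (0, 1).
    """
    for name, value in (("margin_of_error", margin_of_error),
                        ("confidence", confidence),
                        ("p_prior", p_prior)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Two-sided z-score, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # n = z^2 * p(1-p) / ME^2, rounded up to a whole respondent.
    n = z ** 2 * p_prior * (1 - p_prior) / margin_of_error ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Running example: ME = 2.5%, 95% confidence, prior estimate 30%.
    print(sample_size_for_proportion(0.025, 0.95, 0.30))
    # Most conservative prior (p = 0.5) with a 5% margin of error.
    print(sample_size_for_proportion(0.05, 0.95, 0.50))
```

Under these assumptions the running example needs about 1,291 respondents, and the conservative 5% case needs 385. Using p̂ = 0.5 maximizes p̂(1 − p̂), which is why it is the safe choice when no prior estimate is available.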

3.2. SAMPLE SIZE TO ESTIMATE A POPULATION MEAN


The issues are similar if we are designing a survey or an experiment to estimate a population mean. In
this case, the formula is

$$ME = t\,\frac{s}{\sqrt{n}}$$

where:

• ME is the desired margin of error

• t is the t-score that we use to calculate the confidence interval, which depends on both the degrees of
freedom and the desired confidence level,

• s is the standard deviation,

• n is the sample size we want to find.
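
Solving ME = t·s/√n for n gives n = (t·s/ME)². Because the t-score itself depends on n through the degrees of freedom, the sketch below makes a simplifying assumption and substitutes the normal z-score for t. This is a common first approximation that slightly understates n for small samples, so it is not the exact method described above. The function name and the example values (s = 15, ME = 2) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(margin_of_error, std_dev, confidence=0.95):
    """Approximate n for estimating a mean to within margin_of_error.

    Assumption: the z-score replaces the t-score, which is reasonable
    once n is more than a few dozen.
    """
    if margin_of_error <= 0 or std_dev <= 0:
        raise ValueError("margin_of_error and std_dev must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")

    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z * s / ME)^2, rounded up to a whole observation.
    return math.ceil((z * std_dev / margin_of_error) ** 2)


if __name__ == "__main__":
    print(sample_size_for_mean(2, 15))  # hypothetical s = 15, ME = 2
    print(sample_size_for_mean(1, 15))  # halving ME roughly quadruples n
```

Because n grows with the square of s/ME, halving the margin of error requires roughly four times as many observations.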

4. BASIC TERMS AND NOTATIONS


• Target Population:
The population to be studied, to which the investigator wants to generalize the results
• Sampling Unit:
Smallest unit from which a sample can be selected
• Sampling Frame:
List of all the sampling units from which the sample is drawn
• Sampling Scheme:
Method of selecting sampling units from the sampling frame
• Population:
A population can be defined as including all people or items with the characteristic one
wishes to understand.
• Parameters:
Numerical characteristics of a population
• Statistics:
Numerical characteristics of a sample
• Data:
The measurements that are collected by the investigator
Notation: Population
σ: The known standard deviation of the population.
σ²: The known variance of the population.
P: The true population proportion.
N: The number of observations in the population.
Notation: Sample
x̄: The sample estimate of the population mean.
s: The sample estimate of the standard deviation of the population.
s²: The sample estimate of the population variance.
p: The proportion of successes in the sample.
n: The number of observations in the sample.
SD: The standard deviation of the sampling distribution.
SE: The standard error. (This is an estimate of the standard deviation of the sampling
distribution.)
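
To make the sample notation concrete, the following Python sketch computes n, x̄, s², s and SE for a small set of measurements, and p with its standard error for a count of successes. The helper names and the data are hypothetical. The standard errors use the usual formulas s/√n for a mean and √(p(1 − p)/n) for a proportion.

```python
import math
from statistics import mean, stdev, variance


def describe_sample(values):
    """Return the sample quantities n, x_bar, s2, s and SE of the mean."""
    n = len(values)
    s = stdev(values)  # sample standard deviation (divides by n - 1)
    return {
        "n": n,
        "x_bar": mean(values),   # sample estimate of the population mean
        "s2": variance(values),  # sample estimate of the population variance
        "s": s,
        "SE": s / math.sqrt(n),  # standard error of the mean
    }


def describe_proportion(successes, n):
    """Return the sample proportion p and its standard error."""
    p = successes / n
    return {"n": n, "p": p, "SE": math.sqrt(p * (1 - p) / n)}


if __name__ == "__main__":
    print(describe_sample([2, 4, 4, 4, 5, 5, 7, 9]))  # hypothetical data
    print(describe_proportion(30, 100))               # 30 successes in 100
```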

5. TYPES OF SAMPLING
There are two basic types of sampling, which further have many sub-methods for
sampling.
1. Probability Sampling
A probability sampling scheme is one in which every unit in the population has a
chance (greater than zero) of being selected in the sample, and this probability can
be accurately determined.
• Methods include random sampling, systematic sampling, and stratified
sampling.
• They are considered to be:
   • Objective
   • Empirical
   • Scientific
   • Quantitative
   • Representative
2. Non-Probability Sampling
Any sampling method where some elements of the population have no chance of
selection (these are sometimes referred to as 'out of coverage'/'under-covered'), or
where the probability of selection can't be accurately determined. It involves the
selection of elements based on assumptions regarding the population of interest,
which form the criteria for selection. Hence, because the selection of elements is
nonrandom, non-probability sampling does not allow the estimation of sampling
errors.
• Methods include convenience sampling, judgment sampling, quota
sampling, and snowball sampling.
• They are considered to be:
   • Interpretive
   • Subjective
   • Not scientific
   • Qualitative
   • Unrepresentative
6. METHODS OF SAMPLING
6.1. PROBABILITY SAMPLING METHODS

• Simple random sampling. Simple random sampling refers to any sampling method that
has the following properties.
   • The population consists of N objects.
   • The sample consists of n objects.
   • If all possible samples of n objects are equally likely to occur, the sampling
   method is called simple random sampling.

There are many ways to obtain a simple random sample. One way would be the lottery
method. Each of the N population members is assigned a unique number. The numbers
are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n
numbers. Population members having the selected numbers are included in the sample.

• Stratified sampling. With stratified sampling, the population is divided into groups,
based on some characteristic. Then, within each group, a probability sample (often a
simple random sample) is selected. In stratified sampling, the groups are called strata.

As an example, suppose we conduct a national survey. We might divide the population
into groups or strata, based on geography: north, east, south, and west. Then, within each
stratum, we might randomly select survey respondents.
• Cluster sampling. With cluster sampling, every member of the population is assigned to
one, and only one, group. Each group is called a cluster. A sample of clusters is chosen,
using a probability method (often simple random sampling). Only individuals within
sampled clusters are surveyed.

Note the difference between cluster sampling and stratified sampling. With stratified
sampling, the sample includes elements from each stratum. With cluster sampling, in
contrast, the sample includes elements only from sampled clusters.

• Multistage sampling. With multistage sampling, we select a sample by using
combinations of different sampling methods.

For example, in Stage 1, we might use cluster sampling to choose clusters from a
population. Then, in Stage 2, we might use simple random sampling to select a subset of
elements from each chosen cluster for the final sample.

• Systematic random sampling. With systematic random sampling, we create a list of
every member of the population. From the list, we randomly select the first sample
element from the first k elements on the population list. Thereafter, we select
every kth element on the list.
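
The five probability methods above can be sketched in Python using only the standard library. Everything named here is an assumption for illustration: the population is a hypothetical list of 40 (id, region) pairs, the four regions stand in for strata or clusters, and the helper names are not from the source. A seeded `random.Random` is used so that a run can be repeated.

```python
import random
from collections import defaultdict


def group_by(population, key):
    """Partition units into groups (strata or clusters) by key(unit)."""
    groups = defaultdict(list)
    for unit in population:
        groups[key(unit)].append(unit)
    return groups


def simple_random_sample(population, n, rng):
    """Lottery method: every subset of n units is equally likely."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Random start among the first k units, then every kth unit."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(population, key, n_per_stratum, rng):
    """Simple random sample of n_per_stratum units from every stratum."""
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(population, key, n_clusters, rng):
    """Randomly choose whole clusters and survey every unit in them."""
    clusters = group_by(population, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def multistage_sample(population, key, n_clusters, n_per_cluster, rng):
    """Stage 1: sample clusters. Stage 2: random sample within each."""
    clusters = group_by(population, key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        sample.extend(rng.sample(clusters[name], n_per_cluster))
    return sample


# Hypothetical population: 40 (id, region) pairs, 10 per region.
REGIONS = ["north", "east", "south", "west"]
POPULATION = [(i, REGIONS[i % 4]) for i in range(40)]


def region_of(unit):
    return unit[1]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the sketch is repeatable
    print(simple_random_sample(POPULATION, 5, rng))
    print(systematic_sample(POPULATION, 8, rng))
    print(stratified_sample(POPULATION, region_of, 2, rng))
    print(len(cluster_sample(POPULATION, region_of, 2, rng)))
    print(multistage_sample(POPULATION, region_of, 2, 3, rng))
```

Note how the stratified sample always contains units from all four regions, while the cluster and multistage samples contain units only from the regions that were drawn.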

6.2. NON-PROBABILITY SAMPLING METHODS


• Voluntary sample. A voluntary sample is made up of people who self-select into
the survey. Often, these people have a strong interest in the main topic of the
survey.

Suppose, for example, that a news show asks viewers to participate in an online
poll. This would be a voluntary sample. The sample is chosen by the viewers, not
by the survey administrator.
• Convenience sample. A convenience sample is made up of people who are easy
to reach.

Consider the following example. A pollster interviews shoppers at a local mall. If
the mall was chosen because it was a convenient site from which to solicit survey
participants and/or because it was close to the pollster's home or business, this
would be a convenience sample.
• Quota sampling

This method of sampling is often used by market researchers. Interviewers are
given a quota of subjects of a specified type to attempt to recruit. For example, an
interviewer might be told to go out and select 20 adult men and 20 adult women,
10 teenage girls and 10 teenage boys so that they could interview them about their
television viewing. There are several flaws with this method, but most
importantly it is not truly random.[2]

• Snowball sampling

This method is commonly used in social sciences when investigating hard-to-reach
groups. Existing subjects are asked to nominate further subjects known to
them, so the sample increases in size like a rolling snowball. For example, when
carrying out a survey of risk behaviours amongst intravenous drug users,
participants may be asked to nominate other users to be interviewed.

7. EVENT SAMPLING METHODOLOGY


Event sampling methodology (ESM) is a new form of sampling method that allows researchers
to study ongoing experiences and events that vary across and within days in their naturally
occurring environment. Because of the frequent sampling of events inherent in ESM, it enables
researchers to measure the typology of activity and detect the temporal and dynamic
fluctuations of work experiences. The popularity of ESM as a form of research design has
increased in recent years because it addresses a shortcoming of cross-sectional research:
researchers can now detect intra-individual variance across time, which was not previously
possible. In ESM, participants are asked to record their experiences and perceptions in
a paper or electronic diary.
8. RESPONSE RATE
Response rate (also known as completion rate or return rate) in survey research refers to
the number of people who answered the survey divided by the number of people in the
sample. It is usually expressed in the form of a percentage. The term is also used in direct
marketing to refer to the number of people who responded to an offer.
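
As a minimal sketch of the definition above, the response rate can be computed as respondents divided by sample size, expressed as a percentage. The function name and the example counts are hypothetical.

```python
def response_rate(respondents, sample_size):
    """Percentage of the sample that answered the survey."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= respondents <= sample_size:
        raise ValueError("respondents must be between 0 and sample_size")
    return 100 * respondents / sample_size


if __name__ == "__main__":
    # Hypothetical survey: 300 of 1,200 sampled people answered.
    print(f"{response_rate(300, 1200):.1f}%")
```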
8.1. IMPORTANCE

A survey’s response rate is the result of dividing the number of people who were
interviewed by the total number of people in the sample who were eligible to participate
and should have been interviewed.[1] A low response rate can give rise to sampling bias if
the nonresponse is unequal among the participants regarding exposure and/or outcome.
Such bias is known as non-response bias.
For many years, a survey's response rate was viewed as an important indicator of survey
quality. Many observers presumed that higher response rates assure more accurate survey
results (Aday 1996; Babbie 1990; Backstrom and Hursh 1963; Rea and Parker 1997). But
because measuring the relation between nonresponse and the accuracy of a survey
statistic is complex and expensive, few rigorously designed studies provided empirical
evidence to document the consequences of lower response rates until recently.
Such studies have finally been conducted in recent years, and several conclude that the
expense of increasing the response rate frequently is not justified given the difference in
survey accuracy.

9. STRATEGIES FOR SAMPLING


There are three broad approaches to selecting a sample for a qualitative study.
Convenience sample
This is the least rigorous technique, involving the selection of the most accessible
subjects. It is the least costly to the researcher, in terms of time, effort and money, but
may result in poor quality data and lacks intellectual credibility. There is an element of
convenience sampling in many qualitative studies, but a more thoughtful approach to
selection of a sample is usually justified.
Judgment sample
Also known as purposeful sample, this is the most common sampling technique. The
researcher actively selects the most productive sample to answer the research question.
This can involve developing a framework of the variables that might influence an
individual's contribution and will be based on the researcher's practical knowledge of the
research area, the available literature and evidence from the study itself. This is a more
intellectual strategy than the simple demographic stratification of epidemiological
studies, though age, gender and social class might be important variables. If the subjects
are known to the researcher, they may be stratified according to known public attitudes or
beliefs. It may be advantageous to study a broad range of subjects (maximum variation
sample), outliers (deviant sample), subjects who have specific experiences (critical case
sample[6]) or subjects with special expertise (key informant sample). Subjects may be able
to recommend useful potential candidates for study (snowball sample). During
interpretation of the data it is important to consider subjects who support emerging
explanations and, perhaps more importantly, subjects who disagree (confirming and
disconfirming samples).
Theoretical sample
The iterative process of qualitative study design means that samples are usually theory
driven to a greater or lesser extent. Theoretical sampling necessitates building
interpretative theories from the emerging data and selecting a new sample to examine and
elaborate on this theory. It is the principal strategy for the grounded theoretical approach[3]
but will be used in some form in most qualitative investigations necessitating
interpretation.
10. ERRORS IN SAMPLING
In statistics, sampling error is the error caused by observing a sample instead of the whole
population. The sampling error is the difference between a sample statistic used to
estimate a population parameter and the actual but unknown value of the parameter
(Burns & Grove, 2009).
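
The definition can be illustrated with a small simulation: draw a simple random sample from a fully known population and subtract the population mean from the sample mean. The population below is synthetic and the function name is hypothetical. In real studies the parameter is unknown, so the sampling error can only be estimated.

```python
import random
from statistics import mean


def sampling_error(population, sample_size, rng):
    """Sample mean minus population mean for one simple random sample."""
    sample = rng.sample(population, sample_size)
    return mean(sample) - mean(population)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Synthetic population of 10,000 measurements (an assumption).
    population = [rng.gauss(50, 10) for _ in range(10_000)]
    # The last size is a census, where the sampling error vanishes.
    for n in (10, 100, 1000, len(population)):
        print(n, round(sampling_error(population, n, rng), 3))
```

Larger samples tend to give smaller sampling errors, and a census (sample size equal to population size) gives none.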

There are five common errors of sampling, as follows:

• Population Specification Error: This error occurs when the researcher does not
understand who she should survey. For example, imagine a survey about breakfast cereal
consumption. Who should she survey? It might be the entire family, the mother, or the
children. The mother probably makes the purchase decision, but the children influence
her choice.

• Sample Frame Error: A frame error occurs when the wrong sub-population is used to
select a sample. A classic frame error occurred in the 1936 presidential election between
Roosevelt and Landon. The sample frame was drawn from car registrations and telephone
directories. In 1936, many Americans did not own cars or telephones, and those who did
were largely Republicans. The results wrongly predicted a Republican victory.

• Selection Error: This occurs when respondents self-select their participation in the
study, so that only those who are interested respond. Selection error can be controlled by
going to extra lengths to get participation. A typical survey process includes initiating
pre-survey contact requesting cooperation, actual surveying, post-survey follow-up if a
response is not received, a second survey request, and finally interviews using alternate
modes such as telephone or person to person.

• Non-Response: Non-response errors occur when respondents are different from those
who do not respond. This may occur because the potential respondent either was not
contacted or refused to respond. The extent of this non-response error can be
checked through follow-up surveys using alternate modes.

• Sampling Errors: These errors occur because of variation in the number or
representativeness of the sample that responds. Sampling errors can be controlled by (1)
careful sample designs, (2) large samples, and (3) multiple contacts to assure
representative response.

10.1. BIAS IN SAMPLING

There are five important potential sources of bias that should be considered when selecting a
sample, by whatever method.

1. Any changes from the pre-agreed sampling rules can introduce bias.
2. Bias is introduced if people in hard-to-reach groups are omitted.
3. Replacing selected individuals with others, for example if they are difficult to contact, also
introduces bias.
4. It is important to try to maximize the response rate to a survey; low response rates can
introduce bias.
5. If an out-of-date list is used as the sample frame, it may also introduce bias if, for example,
it excludes people who have recently moved to an area.

10.2. STATISTICAL ERRORS


• Type I error, also known as a “false positive”: the error of rejecting a null hypothesis
when it is actually true. In other words, this is the error of accepting an alternative
hypothesis (the real hypothesis of interest) when the results can be attributed to
chance. Plainly speaking, it occurs when we are observing a difference when in truth
there is none (or more specifically, no statistically significant difference). So the
probability of making a type I error in a test with rejection region R is
$P(R \mid H_0 \text{ is true})$.
• Type II error, also known as a "false negative": the error of not rejecting a null
hypothesis when the alternative hypothesis is the true state of nature. In other words,
this is the error of failing to accept an alternative hypothesis when you don't have
adequate power. Plainly speaking, it occurs when we are failing to observe a difference
when in truth there is one. So the probability of making a type II error in a test with
rejection region R is $1 - P(R \mid H_a \text{ is true})$. The power of the test is
$P(R \mid H_a \text{ is true})$.
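
As a worked illustration of these three probabilities, the sketch below evaluates them for one specific test chosen here as an assumption: a one-sided z-test of H0: μ = 0 against Ha: μ = effect > 0 with known σ, whose rejection region R is z > z_crit. The function name and the example numbers are hypothetical. Other tests have different rejection regions, but the relationship type II error = 1 − power holds in general.

```python
from statistics import NormalDist


def z_test_error_rates(effect, sigma, n, alpha=0.05):
    """Return (type I error, type II error, power) for a one-sided z-test.

    Assumed setup: H0: mu = 0, Ha: mu = effect > 0, sigma known,
    rejection region R = {z > z_crit}.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)

    # Under Ha the z statistic is normal with this mean and unit variance.
    shift = effect * n ** 0.5 / sigma

    type_1 = 1 - std_normal.cdf(z_crit)         # P(R | H0 is true)
    power = 1 - std_normal.cdf(z_crit - shift)  # P(R | Ha is true)
    type_2 = 1 - power                          # 1 - P(R | Ha is true)
    return type_1, type_2, power


if __name__ == "__main__":
    # Hypothetical study: effect of 0.5 units, sigma = 1, n = 25.
    t1, t2, pw = z_test_error_rates(0.5, 1.0, 25)
    print(f"type I = {t1:.3f}, type II = {t2:.3f}, power = {pw:.3f}")
```

Increasing n raises the shift under Ha and therefore the power, which is the basis of the power-based approach to choosing a sample size.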
11. SOFTWARE USED FOR CALCULATING SAMPLE
• The Survey System
• Raosoft, Inc.
• Vanderbilt
• Power-analysis
