
Lecture PDF On Sampling and Coding

The document discusses the importance of sampling in statistics, defining a sample as a subset of a larger population used to make inferences. It outlines the characteristics of a good sample, the sampling process, and different types of sampling methods, including probability and non-probability sampling. Key concepts such as sampling frame, representativeness, and the need for randomness are emphasized to ensure accurate data collection and analysis.

Uploaded by

Shalu
Copyright: © All Rights Reserved

Sampling

Dr. Manish Agarwal


Statistics is a tool for converting data into information:

Data → Statistics → Information

But:
• Where then does data come from?
• How is it gathered?
• How do we ensure it is accurate?
• Is the data reliable?
• Is it representative of the population from which it was drawn?
Sample
• A sample is a subset of a larger population of objects, individuals, households, businesses, organizations and so forth. In short, a sample is the subset of the population being studied, from which data is collected to make inferences about the population.
• Sampling is the process of selecting a representative group of individuals from the population to be included in the sample. Sampling enables researchers to estimate unknown characteristics of the population in question.
• A finite group is called a population, whereas a non-finite (infinite) group is called a universe.
• A census is an investigation of all the individual elements of a population.
[Figure: a sample shown as a subset of the population]
Sampling
• Identify the population you want to study.
• The sample must be representative of the population you want to study.
Why is Sampling Required?
[Figure labels: Population (what you want to talk about); Sampling Frame; Sampling Process; Sample (what you actually observe in the data); Inference]

Inference: using data to say something with confidence about a whole (the population), based on the study of only a few (the sample).
Why sampling?

Get information about large populations:
• Lower cost
• Less field time
• More accuracy, i.e., a better job of data collection can be done
• It may be impossible to study the whole population
Characteristics of a Good Sample:
1. Representativeness: A good sample should be
representative of the population it is intended to represent.
The sample should include individuals from all relevant
subgroups of the population, in appropriate proportions, to
ensure that the sample accurately reflects the
characteristics of the population.
2. Adequacy: The sample should be of sufficient size to
provide statistically reliable results. A larger sample size
increases the precision of the estimates and reduces the
potential for sampling errors.
3. Randomness: The sample should be selected randomly to
avoid bias in the selection process. Random sampling
ensures that every member of the population has an equal
chance of being selected for the sample.
Characteristics of a Good Sample:
4. Relevance: The sample should be relevant to the research question
being investigated. For example, if the research is focused on a
particular age group, the sample should be selected from that age
group to ensure that the results are relevant to the research question.
5. Feasibility: The sample should be feasible to obtain. The cost and
time involved in obtaining the sample should be reasonable and
feasible within the constraints of the research project.
6. Accessibility: The sample should be accessible to the researcher. The
sample should be easily accessible and willing to participate in the
research. If the sample is not willing to participate, the quality of the
data collected may be compromised.
7. Ethical Considerations: The sample should be selected in a way that
is ethical and respects the privacy and confidentiality of the
participants. The researcher should obtain informed consent from the
participants, and take steps to protect their privacy and
confidentiality.
What is a Sampling Frame?
• A sampling frame is a list or set of elements from which a
sample is selected for a research study.
• It is a complete list of all the individuals, units, or elements in
the population from which the sample is to be drawn.
• The sampling frame is the foundation of the sampling process,
and it is important to ensure that it is comprehensive and
representative of the population.
• For example, if a researcher wants to study the opinions of
college students in a particular university, the sampling frame
would be a list of all the students enrolled in the university.
• The sampling frame could be built from a roster of all the students or from the university's enrollment records.
Practical approach to determine the expected
sample frame
1. Define the population: The first step is to define the population being
studied. This can be done by specifying the inclusion and exclusion
criteria.

2. Identify the sampling units: The next step is to identify the sampling
units. The sampling units are the individual elements or units that will be
included in the sample. For example, if the population is students in a
university, the sampling units could be individual students, classes, or
departments.

3. Obtain a list of sampling units: Once the sampling units have been
identified, a list of all the sampling units should be obtained. This can be
done by obtaining a roster or list of all members of the population, or by
creating a list through other means such as internet searches or directories.
Practical approach to determine the expected
sample frame
4. Evaluate the sampling frame: The sampling frame should be evaluated
to determine its adequacy and representativeness. The sampling frame
should include all members of the population, and should be
representative of the population in terms of relevant characteristics such
as age, gender, education, income, and so on.

5. Adjust the sampling frame: If the sampling frame is not adequate or representative, adjustments may be needed. This could involve adding or removing sampling units, or using alternative sources to obtain a better sampling frame.

6. Finalize the sampling frame: Once the sampling frame is determined, it should be finalized and used to select the sample for the research study.
What is sampling?
• If a sample of a population is to provide useful information
about that population, then the sample must contain
essentially the same variation as the population.

• The more heterogeneous a population is…
– The greater the chance that a sample may not adequately describe the population, so we could be wrong in the inferences we make about it.

• And…
– The larger the sample needs to be to adequately describe the population: we need more observations to be able to make accurate inferences.
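
The link between population variability and sample size can be illustrated with the common textbook formula for estimating a mean, n = (z·σ / E)². This formula does not appear in the slides. The sketch below assumes simple random sampling from a large population and a roughly known standard deviation σ; the function name and all numbers are illustrative.

```python
import math

def required_sample_size(sigma, margin_of_error, z=1.96):
    """Approximate n needed to estimate a population mean.

    Assumes simple random sampling from a large population.
    sigma: assumed population standard deviation (hypothetical input)
    margin_of_error: acceptable half-width of the confidence interval
    z: z-score for the confidence level (1.96 is roughly 95%)
    """
    return math.ceil((z * sigma / margin_of_error) ** 2)

# A more heterogeneous population (larger sigma) needs a larger sample
# to reach the same margin of error.
n_homogeneous = required_sample_size(sigma=10, margin_of_error=2)
n_heterogeneous = required_sample_size(sigma=20, margin_of_error=2)
print(n_homogeneous, n_heterogeneous)
```

Doubling the assumed spread roughly quadruples the required sample, which is the point the slide makes in words.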
What is Sampling?
• Sampling is the process of selecting observations (a sample) to provide an adequate description of, and robust inferences about, the population.
– The sample must be representative of the population.

• There are 2 types of sampling:
– Non-probability sampling
– Probability sampling
The Language of Sampling
• Sample element: a case or a single unit that is selected from a population and measured in some way; it is the basis of analysis (e.g., a person, thing, specific time, etc.).

• Universe: the theoretical aggregation of all possible elements, unspecified as to time and space (e.g., Rohilkhand University).

• Population: the theoretical aggregation of specified elements as defined for a given survey, bounded by time and space (e.g., Rohilkhand University students and staff in 2008).

• Sample or Target population: the aggregation of the population from which the sample is actually drawn (e.g., MBA students and faculty in the 2008-09 academic year).

• Sample frame: a specific list that closely approximates all elements in the population; from this the researcher selects units to create the study sample (e.g., a database of MBA students and faculty in 2008-09).

• Sample: a set of cases that is drawn from a larger pool and used to make generalizations about the population.
Probability Sampling
• Probability sampling is a sampling method in which each member of the population has a known, non-zero chance of being selected for the sample. In the simplest designs these chances are also equal.
• The simple random sample, in which each member of the population has
an equal probability of being selected, is the best-known probability
sample.
• A sample must be representative of the population with respect to the
variables of interest.
• A sample tends to be representative of the population from which it is selected if each member of the population has an equal chance (probability) of being selected.
• Probability samples are generally more accurate than non-probability samples
– They remove conscious and unconscious sampling bias.
• Probability samples allow us to estimate the accuracy of the sample.
• Probability samples permit the estimation of population parameters.
Types of Probability Sampling
• Simple Random Sample
• Stratified Random Sample
• Cluster Sample
• Systematic Sample
Simple Random Sample
• Every subset of a specified size n from the
population has an equal chance of being
selected
Simple Random Sampling…
• Simple random sampling is a probability sampling technique used
to create a sample of individuals or items from a larger population.
• In simple random sampling, every individual or item in the
population has an equal and independent chance of being selected
for the sample.
• Drawing three names from a hat containing all the names of the students in the class is an example of a simple random sample: any group of three names is as likely to be picked as any other group of three names.
• VERY EASY TO DEFINE!
• VERY, VERY DIFFICULT TO DO!
• Random sample of 100 Coke bottles today at the Coke plant.
• Random sample of 50 pine trees in a 1000-acre forest.
• Random sample of 5 deer in a national forest.

Simple Random Sampling…
A government income tax auditor must choose a sample of 5 of 11 returns to audit. [This can be done in many different ways.] One way: generate a random number for each person, sort by that number, and take the first 5 (ranks 1 to 5 below).

Generated              Sorted
Person   Random #      Rank  Person   Random #
baker    0.87487       1     mark     0.08350
george   0.89068       2     ralph    0.11597
ralph    0.11597       3     joe      0.24662
mary     0.58635       4     sally    0.34346
sally    0.34346       5     aaron    0.37239
joe      0.24662             andrea   0.47609
andrea   0.47609             greg     0.53542
mark     0.08350             mary     0.58635
greg     0.53542             kim      0.73809
kim      0.73809             baker    0.87487
aaron    0.37239             george   0.89068
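
The auditor example can be reproduced in a few lines. This is a minimal sketch that reuses the random numbers printed on the slide, so the result is fixed. The function names are illustrative, and `simple_random_sample` is an assumed alternative that lets Python's standard `random` module do the drawing directly.

```python
import math
import random

# Random numbers as listed on the slide (one per tax return).
generated = {
    "baker": 0.87487, "george": 0.89068, "ralph": 0.11597, "mary": 0.58635,
    "sally": 0.34346, "joe": 0.24662, "andrea": 0.47609, "mark": 0.08350,
    "greg": 0.53542, "kim": 0.73809, "aaron": 0.37239,
}

def pick_by_random_key(random_keys, n):
    """Sort units by their random number and keep the n smallest."""
    ranked = sorted(random_keys, key=random_keys.get)
    return ranked[:n]

audit_sample = pick_by_random_key(generated, 5)
print(audit_sample)  # ['mark', 'ralph', 'joe', 'sally', 'aaron']

# Number of distinct samples of 5 from 11; under simple random sampling
# each of them is equally likely.
print(math.comb(11, 5))  # 462

def simple_random_sample(units, n, seed=None):
    """Equivalent shortcut: draw n distinct units without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(units), n)
```

Sorting by an independent random number and drawing without replacement both give every subset of 5 returns the same chance of selection.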
Simple Random Sampling Vs. Complex Random Sampling
• Simple random sampling involves randomly selecting individual units from the population to create a sample. This can be done using a variety of methods, including random number generators or random number tables. For example, if you wanted to survey students at a university, you could randomly select students from a list of all enrolled students to create your sample.
• Complex random sampling designs, on the other hand, involve multiple stages of sampling to create a sample. This may be necessary when the population is large or when it is not feasible to obtain a complete list of all units in the population. In a complex random sampling design, the population is divided into smaller groups or clusters, and a sample of clusters is randomly selected. Then, a sample of individuals or items is selected from within each selected cluster. For example, if you wanted to survey households in a city, you could divide the city into neighborhoods, randomly select a sample of neighborhoods, and then randomly select households within each selected neighborhood to create your sample.
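
The neighborhood example corresponds to a two-stage design, which might be sketched as follows. The data, the function name and the sample sizes are all hypothetical; the only point is the order of the two random draws.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly pick clusters. Stage 2: randomly pick units in each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))  # small clusters are taken whole
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical city: 10 neighborhoods with 20 households each.
city = {
    f"neighborhood_{i}": [f"house_{i}_{j}" for j in range(20)]
    for i in range(10)
}
picked = two_stage_sample(city, n_clusters=3, n_per_cluster=5, seed=42)
for neighborhood, households in picked.items():
    print(neighborhood, households)
```

Only the selected neighborhoods need a household list, which is why this design works without a complete frame for the whole city.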
Simple Random Sampling Vs. Complex Random Sampling
Criteria compared:

• Sampling method
– Simple random sampling: Randomly selecting individual units from the population.
– Complex random sampling: Randomly selecting clusters from the population, and then randomly selecting individuals or items within each cluster.

• Number of stages
– Simple random sampling: One stage.
– Complex random sampling: Multiple stages.

• Sample representativeness
– Simple random sampling: Good representation of the population since all individuals have an equal chance of being selected.
– Complex random sampling: Good representation of the population since all clusters and individuals within them have an equal chance of being selected.

• Population size limitation
– Simple random sampling: Suitable for small to medium-sized populations.
– Complex random sampling: Suitable for large or geographically dispersed populations.

• Sampling error
– Simple random sampling: Low sampling error if the sample is representative of the population.
– Complex random sampling: Can introduce more sampling error if clusters or individuals within them are not representative of the population.

• Complexity and resources required
– Simple random sampling: Simple and less time-consuming.
– Complex random sampling: More complex and time-consuming, may require more resources.

• Examples
– Simple random sampling: Randomly selecting students from a list of all enrolled students to create a sample for a survey.
– Complex random sampling: Dividing a city into neighborhoods, randomly selecting a sample of neighborhoods, and then randomly selecting households within each selected neighborhood to create a sample for a survey.
Systematic Sample
• Systematic sampling is a probability sampling technique used to create a
sample of individuals or items from a larger population.
• In systematic sampling, every kth individual or item is selected from the population to be included in the sample.
• A starting point is selected by a random process, and then every kth element on the list is selected.
• For example, with k = 5, every 5th person is selected from a list of all population members.
The steps involved in systematic sampling are as follows:

1. Define the population: The population is the group of individuals or items that you want to study.
2. Determine the sample size: The sample size is the number of
individuals or items you need to include in your sample.
3. Calculate the sampling interval: The sampling interval is
calculated by dividing the population size by the desired sample
size. For example, if the population size is 1000 and you want a
sample size of 100, the sampling interval would be 10.
4. Select a random starting point: Select a random number
between 1 and the sampling interval k.
5. Select every kth individual or item: Starting from the randomly
selected starting point, select every kth individual or item from the
population until the desired sample size is reached.
6. Collect data: Collect data from the selected individuals or items
in the sample.
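
Steps 3 to 5 can be sketched in code. This is a minimal illustration, assuming the frame is an ordered list and using integer division for the interval when the population size is not an exact multiple of the sample size; the function name is hypothetical.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Pick a random start in 1..k, then take every kth element of the frame."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    k = len(frame) // sample_size          # step 3: sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)              # step 4: random start (1-based)
    picks = [frame[i] for i in range(start - 1, len(frame), k)]  # step 5
    return picks[:sample_size]

# The slide's numbers: population of 1000, sample of 100, so k = 10.
population = list(range(1, 1001))
sample = systematic_sample(population, 100, seed=5)
print(sample[:5])
```

Because only the starting point is random, the whole sample is determined by one draw; that is what makes the method quick, and also what makes it sensitive to periodic patterns in the list.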
Systematic Random Sampling (SS)
• Method:
– Starting from a random point on a sampling frame, every kth element in the frame is selected at equal intervals (the sampling interval).

• The sampling interval tells the researcher how to select elements from the frame (1 in k elements is selected).
– It depends on the sample size needed.

• Example:
– You have a sampling frame (list) of 10,000 people and you need a sample of 1000 for your study. What sampling interval should you follow?
– Every 10th person listed (1 in 10 persons), since 10,000 / 1000 = 10.

• In practice it often gives results close to SRS and is easier to carry out, provided the list has no periodic pattern that lines up with the sampling interval.


Stratified Random Sample
• The population is divided into two or more groups called strata, according to some criterion, such as geographic location, grade level, age, or income, and subsamples are randomly selected from each stratum.
Stratified Sampling (groups of similar people)

• Method:
– Divide the population by certain characteristics into homogeneous subgroups (strata) (e.g., PhD students, Masters students, Bachelors students).
– Elements within each stratum are homogeneous, but are heterogeneous across strata.
– A simple random or a systematic sample is taken from each stratum, in proportion to that stratum's share of the population.

• Researchers use stratified sampling
– When a stratum of interest is a small percentage of a population and random processes could miss the stratum by chance.
– When enough is known about the population that it can be easily broken into subgroups or strata.
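
Proportional allocation, as described in the method above, might look like the sketch below. The population of 1000 split 400/600 between two strata is an assumed example, and rounding can make the total differ slightly from the requested size when the shares are not whole numbers.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, total_n, seed=None):
    """Draw a simple random sample from each stratum, proportional to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for name, members in strata.items():
        share = len(members) / len(units)
        # At least one unit per stratum, never more than the stratum holds.
        n_h = min(len(members), max(1, round(total_n * share)))
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical population: 400 units in stratum_1 and 600 in stratum_2.
units = [(i, "stratum_1") for i in range(400)]
units += [(i, "stratum_2") for i in range(400, 1000)]
sample = stratified_sample(units, stratum_of=lambda u: u[1], total_n=100, seed=7)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'stratum_1': 40, 'stratum_2': 60}
```

The `max(1, ...)` guard reflects the slide's first use case: a small stratum of interest is never missed by chance.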
[Figure: POPULATION, N = 1000, equal intensity: STRATUM 1, n = 500; STRATUM 2, n = 500]
[Figure: POPULATION, N = 1000, proportional to size: STRATUM 1, n = 400; STRATUM 2, n = 600]
STRATIFIED SAMPLING…

Draw a sample from each stratum.

Cluster sampling
[Figure: a population divided into five clusters, labelled Section 1 to Section 5]
Cluster Sample
• The population is divided into subgroups (clusters) like
families. A simple random sample is taken of the subgroups
and then all members of the cluster selected are surveyed.
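
A one-stage cluster sample of this kind can be sketched as follows, assuming the clusters are families stored in a dictionary. All names and data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and survey every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    surveyed = [person for name in chosen for person in clusters[name]]
    return chosen, surveyed

# Hypothetical families acting as clusters.
families = {
    "family_A": ["A1", "A2", "A3"],
    "family_B": ["B1", "B2"],
    "family_C": ["C1", "C2", "C3", "C4"],
    "family_D": ["D1"],
}
chosen, surveyed = cluster_sample(families, n_clusters=2, seed=3)
print(chosen, surveyed)
```

Note that the final sample size is not fixed in advance: it depends on how large the selected clusters happen to be.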
Cluster sampling
• Some populations are spread out (over a state or country).

• Elements occur in clumps (towns, districts), called primary sampling units (PSU).

• Elements are hard to reach and identify.

• You cannot assume that any one clump is better or worse than another clump.
Cluster sampling
• Used when:
– Researchers lack a good sampling frame for a dispersed
population.
– The cost to reach an element to sample is very high.

• Ideally, each cluster is internally heterogeneous (as varied as the population itself) and similar to all the other clusters.
• Usually less expensive than SRS but not as accurate
– Each stage in cluster sampling introduces sampling error—the
more stages there are, the more error there tends to be.
Restricted & Unrestricted Sampling
• Restricted sampling involves selecting a sample from a limited or
restricted population, such as a particular group of people, a specific
location, or a certain time period.
• The researcher may have limited access to the population, either due to
logistical or ethical reasons, and thus must work within those constraints.
• Restricted sampling can be useful in situations where the population is
difficult to access or when the research question is specific to a certain
group or location.
• E.g. Suppose a researcher wants to study the effect of a new exercise
program on people who have recently had a heart attack. The researcher
may only have access to patients who are currently receiving treatment at
a particular hospital, and thus must select a sample from this restricted
population. This is an example of restricted sampling.
Restricted & Unrestricted Sampling
• Unrestricted sampling, on the other hand, involves selecting a
sample from an unrestricted or general population, without any
restrictions or limitations.
• The researcher has complete access to the population and can select
the sample from any individual or item within the population.
• Unrestricted sampling is commonly used in research studies that aim
to generalize the findings to a larger population.
• E.g., suppose a researcher wants to study the prevalence of smartphone use among adults in a particular city. The researcher has access to a list of all adults in the city and can select a sample from any individual in the population. This is an example of unrestricted sampling.
• As another example, suppose a researcher wants to study the effectiveness of a new teaching method on students' academic performance. The researcher has access to a list of all students in a particular school district and can select a sample from any student in the district. This is also an example of unrestricted sampling.
Non-probability sampling
• Non-probability sampling is a sampling method in which the members of the population do not have a known chance of being selected for the sample; some members may have no chance at all.
• Non-probability sampling methods are often used when the
population is difficult to define or access.
• The selection of sampling units in nonprobability sampling is
quite arbitrary, as researchers rely heavily on personal judgment.
• Technically, no appropriate statistical techniques exist for
measuring random sampling error from a nonprobability sample.
• Therefore, projecting the data beyond the sample is, technically
speaking, statistically inappropriate.
Convenience Sample
• A convenience sample is a non-probability sampling method where individuals are selected based on their availability and willingness to participate in a study.
• It is a type of sampling where the sample is selected based on who is easiest or most convenient to access. Convenience sampling is generally considered a weak sampling method because the sample may not accurately represent the population being studied. Here are some examples of convenience sampling:
– A researcher who wants to study the effects of a new drug on patients may recruit participants from the waiting room of a medical clinic. This may be a convenient way to find participants, but the sample is not representative of the broader population of patients who might use the drug.
– A student who wants to conduct a survey on campus may ask their classmates or friends to participate in the survey. This is a convenient way to gather data, but the sample may not accurately represent the views of all students on campus.
– A researcher who wants to study the use of social media by teenagers may recruit participants through social media platforms. This may be a convenient way to access participants, but the sample may not accurately represent all teenagers who use social media.
– A market researcher who wants to study the purchasing habits of shoppers in a mall may recruit participants in the mall during a specific time of day. This may be a convenient way to access shoppers, but the sample may not accurately represent all shoppers who visit the mall.
Convenience Sample
• The sampling procedure of obtaining those people or units that are most conveniently available.
• I.e., selection of whichever individuals are easiest to reach.
• It is done at the “convenience” of the researcher.
Judgment or Purposive Sample
• The sampling procedure in which an experienced researcher selects the sample based on some appropriate characteristic of sample members, to serve a purpose.
• Judgment sampling is often used in attempts to forecast election results.


Judgment or Purposive Sample

• A judgment or purposive sample is a non-probability sampling method in which the researcher selects participants based on their specific characteristics or expertise related to the research question.
• In other words, the researcher uses their judgment to select individuals who are likely to provide useful information or who meet specific criteria related to the research question. Here are some examples of judgment or purposive sampling:
– A researcher who is conducting a study on the experiences of refugees may select participants who have recently fled their home country and have resettled in the host country. The researcher may use their judgment to select participants who are most likely to provide valuable information about the refugee experience.
– A researcher who is conducting a study on the experiences of cancer patients may select participants who have been diagnosed with a particular type of cancer and are undergoing chemotherapy. The researcher may use their judgment to select participants who are most likely to provide valuable information about the effects of chemotherapy.
– A researcher who is conducting a study on the experiences of successful entrepreneurs may select participants who have founded successful startups and have a track record of generating high revenue. The researcher may use their judgment to select participants who are most likely to provide valuable information about the factors that contribute to entrepreneurial success.
Quota Sample
• A quota sample is a non-probability sampling method in which the researcher selects participants based on predetermined quotas for certain demographic groups.
• The goal of quota sampling is to ensure that the sample reflects the proportions of certain characteristics found in the larger population being studied.
• For example, an interviewer in a particular city may be assigned 100 interviews: 35 with owners of Sony TVs, 30 with owners of Samsung TVs, 18 with owners of Panasonic TVs, and the rest with owners of other brands.
• The interviewer is responsible for finding enough people to meet the quota.
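
The TV example can be sketched as a loop that accepts whoever comes along until each quota is full. The quota of 17 for other brands is derived here as 100 minus the three stated quotas, and the stream of passers-by is made-up data; nothing in the selection is random, which is what makes this a non-probability method.

```python
from collections import Counter
from itertools import cycle, islice

def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until every group quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        if not any(remaining.values()):
            break  # all quotas met
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
    return selected, remaining

# Quotas from the slide; "Other" = 100 - (35 + 30 + 18) = 17.
quotas = {"Sony": 35, "Samsung": 30, "Panasonic": 18, "Other": 17}

# Hypothetical stream of 200 passers-by, cycling through the brands.
brands = cycle(["Sony", "Samsung", "Panasonic", "Other"])
passersby = [(f"person_{i}", brand) for i, brand in enumerate(islice(brands, 200))]

selected, remaining = quota_sample(passersby, quotas, group_of=lambda p: p[1])
counts = Counter(brand for _, brand in selected)
print(dict(counts))
```

The sample matches the quotas exactly, yet who ends up in it depends entirely on who the interviewer happens to meet first.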
Snowball sampling (friend of a friend, etc.)
• The sampling procedure in which the initial respondents are chosen by probability or non-probability methods, and then additional respondents are obtained by information provided by the initial respondents.
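
Referral chains of this kind can be modelled as a walk over a referral network. The sketch below assumes a small, made-up network stored as a dictionary and follows referrals breadth-first until a target sample size is reached; the names are hypothetical.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from the initial respondents and follow their referrals in order."""
    selected = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit anyone twice
                seen.add(contact)
                queue.append(contact)
    return selected

# Hypothetical referral network: each respondent names further contacts.
referrals = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
}
sample = snowball_sample(seeds=["A"], referrals=referrals, max_size=5)
print(sample)  # ['A', 'B', 'C', 'D', 'E']
```

Everyone in the sample is connected to the initial respondent, which illustrates why snowball samples reach hidden populations but may not represent them.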
Probability Sampling Techniques: Definition, Advantages, Disadvantages, Examples

• Simple random sampling
– Definition: Every member of the population has an equal chance of being selected for the sample.
– Advantages: Representative of the population, easy to understand and implement.
– Disadvantages: Requires a complete list of the population, may not capture specific subgroups of the population.
– Example: Rolling a die to select individuals for a survey.

• Systematic sampling
– Definition: Members of the population are selected at regular intervals.
– Advantages: Easy to understand and implement, requires less time and effort than simple random sampling.
– Disadvantages: May not capture specific subgroups of the population, potential bias if there is a pattern in the population.
– Example: Selecting every 10th name from a phone directory.

• Stratified sampling
– Definition: The population is divided into subgroups based on certain characteristics and a random sample is selected from each subgroup.
– Advantages: Ensures representation of specific subgroups of the population, reduces variability within subgroups.
– Disadvantages: Requires knowledge of the population, more time-consuming than simple random sampling.
– Example: Sampling 50 males and 50 females from each age group (18-24, 25-34, 35-44, 45-54, 55 and above).

• Cluster sampling
– Definition: The population is divided into clusters and a random sample of clusters is selected for the study.
– Advantages: Efficient for large populations and geographically dispersed populations.
– Disadvantages: May not be representative of the population if the clusters are not homogeneous.
– Example: Selecting a random sample of schools in a district and then selecting a sample of students from each school.

• Multi-stage sampling
– Definition: A combination of different sampling techniques used in sequence.
– Advantages: Efficient for large and complex populations.
– Disadvantages: More complex and requires more time and resources than other techniques.
– Example: Sampling households in a city, then selecting individuals within each household.
Non-probability Sampling Techniques: Definition, Advantages, Disadvantages, Examples

• Convenience sampling
– Definition: Participants are selected based on their availability and willingness to participate.
– Advantages: Easy and inexpensive to implement, useful for preliminary studies.
– Disadvantages: May not be representative of the population, potential bias in sample selection.
– Example: Recruiting students from a class for a survey.

• Judgment or purposive sampling
– Definition: Participants are selected based on the researcher's judgement and knowledge of the population.
– Advantages: Useful for specific research questions and hard-to-reach populations.
– Disadvantages: Potential bias in sample selection, may not be representative of the population.
– Example: Recruiting participants based on their expertise or experience.

• Quota sampling
– Definition: Participants are selected based on pre-specified characteristics to achieve a certain distribution in the sample.
– Advantages: Useful for studying specific subgroups of the population.
– Disadvantages: Potential bias in sample selection, may not be representative of the population.
– Example: Recruiting equal numbers of males and females for a survey.

• Snowball sampling
– Definition: Participants are recruited through referrals from other participants in the study.
– Advantages: Useful for hard-to-reach populations.
– Disadvantages: Potential bias in sample selection, may not be representative of the population.
– Example: Recruiting individuals who have overcome addiction through referrals from others who have overcome addiction.
Strengths and Weaknesses of Basic Sampling Techniques

Nonprobability sampling

• Convenience sampling
– Strengths: Least expensive, least time-consuming, most convenient.
– Weaknesses: Selection bias, sample not representative, not recommended for descriptive or causal research.

• Judgmental sampling
– Strengths: Low cost, convenient, not time-consuming.
– Weaknesses: Does not allow generalization, subjective.

• Quota sampling
– Strengths: Sample can be controlled for certain characteristics.
– Weaknesses: Selection bias, no assurance of representativeness.

• Snowball sampling
– Strengths: Can estimate rare characteristics.
– Weaknesses: Time-consuming.

Probability sampling

• Simple random sampling (SRS)
– Strengths: Easily understood, results projectable.
– Weaknesses: Difficult to construct sampling frame, expensive, lower precision, no assurance of representativeness.

• Systematic sampling
– Strengths: Can increase representativeness, easier to implement than SRS, sampling frame not necessary.
– Weaknesses: Can decrease representativeness.

• Stratified sampling
– Strengths: Includes all important subpopulations, precision.
– Weaknesses: Difficult to select relevant stratification variables, not feasible to stratify on many variables, expensive.

• Cluster sampling
– Strengths: Easy to implement, cost effective.
– Weaknesses: Imprecise, difficult to compute and interpret results.
Issues in Sample Design and Selection (1)

• Accuracy – Samples should be representative of the target population (less accuracy is required for exploratory research than for conclusive research projects).

• Resources – Time, money and individual or institutional capacity are very important considerations because they are limited.
Issues in Sample Design and Selection (2)
• Availability of Information – Often information on potential sample participants in the form of lists, directories etc. is unavailable (especially in developing countries), which makes some sampling techniques (e.g. systematic sampling) impossible to undertake.

• Geographical Considerations – The number and dispersion of population elements may determine the sampling technique used (e.g. cluster sampling).

• Statistical Analysis – This should be performed only on samples which have been created through probability sampling (i.e. not through non-probability sampling).
Sampling and Non-Sampling Errors…
Two major types of error can arise when a sample of observations is
taken from a population:
Sampling error and Non-Sampling error.
• Sampling error refers to the error that occurs when a sample is used to estimate characteristics of a population. It is the difference between the value of a sample statistic (such as the mean or proportion) and the value of the corresponding population parameter. Sampling error is unavoidable, but it can be reduced by increasing the sample size and using random sampling techniques. It is random, and we have no control over how it falls in any single sample.
• E.g., if a researcher wants to estimate the average age of all college students in a city and selects a random sample of 100 students, the sample mean age may not be the same as the population mean age due to sampling error. The larger the sample size, the smaller the sampling error tends to be.
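
The effect of sample size on sampling error can be checked with a small simulation. This is a sketch on made-up data: the population of 10,000 student ages is generated artificially, and the function name, seeds and repeat count are arbitrary choices.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, repeats=200, seed=0):
    """Average |sample mean - population mean| over repeated random samples."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = [
        abs(statistics.fmean(rng.sample(population, n)) - true_mean)
        for _ in range(repeats)
    ]
    return statistics.fmean(errors)

# Hypothetical population: ages of 10,000 college students (17 to 30).
rng = random.Random(1)
ages = [rng.randint(17, 30) for _ in range(10_000)]

for n in (25, 100, 400):
    print(n, round(mean_abs_sampling_error(ages, n), 3))
```

Each fourfold increase in sample size should roughly halve the typical error, in line with the standard error of the mean shrinking as 1/√n.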
Sampling Error…
• Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.
• I.e., it refers to the difference between the sample result and the result of a census conducted using identical procedures.

Increasing the sample size will reduce this type of error.
Sampling and Non-Sampling Errors…
• Non-sampling error, on the other hand, refers to all
other sources of error in a research study that are not
related to sampling.
• Non-sampling errors are more serious and are due to
mistakes made in the acquisition of data or due to the
sample observations being selected improperly.
• Non-sampling errors can occur due to various reasons,
such as measurement errors, data processing errors,
non-response bias, and selection bias. These errors can
result in inaccurate or misleading conclusions and affect
the validity of the study.
• Most likely caused by poor planning, sloppy work, etc.
Types of Non-Sampling Errors
1. Measurement error: This occurs when the instrument used to
measure a variable does not accurately measure the intended
concept. For example, if a survey asks about the level of stress
a person experiences, but the response options are not clear or
are ambiguous, this may result in measurement error.
2. Data processing errors: These errors can occur during the data
entry or data analysis stage of the study. For instance, if data is
entered incorrectly, this can result in errors in the final analysis.
3. Non-response bias: This occurs when individuals who do not
respond to a survey or study differ from those who do respond
in important ways. For example, if a survey on healthcare
access is conducted online, people with poor internet access
may be less likely to respond, and they may have different
healthcare access than those who do respond.
Types of Non-Sampling Errors
4. Selection bias: This occurs when the sample selected
for the study is not representative of the population
being studied. For example, if a study on the prevalence
of smoking in a particular area is conducted by
surveying individuals in a shopping mall, this may not
be representative of the general population as smokers
may be less likely to be found in a shopping mall.
5. Confounding variables: This occurs when an outside
factor affects the relationship between the variables
being studied. For example, if a study is conducted on
the relationship between sleep and academic
performance, the results may be affected by
confounding variables such as diet or exercise.
Graphical Depiction of Sampling Errors
[Figure: nested regions labelled Total Population, Sampling Frame, Planned Sample, and Respondents (actual sample), with the gaps between them labelled Sampling Frame Error, Random Sampling Error, and Non-Response Error.]
Comparative Differences between Probability
and Non-probability Sampling Methods
Choosing Non-Probability Vs. Probability Sampling

                                                         Conditions Favoring the Use of
Factors                                                  Nonprobability sampling          Probability sampling
Nature of research                                       Exploratory                      Conclusive
Relative magnitude of sampling and nonsampling errors    Nonsampling errors are larger    Sampling errors are larger
Variability in the population                            Homogeneous (low)                Heterogeneous (high)
Statistical considerations                               Unfavorable                      Favorable
Operational considerations                               Favorable                        Unfavorable


Proportional versus Disproportional
Sampling
• Proportional stratified sample: A stratified
sample in which the number of sampling units
drawn from each stratum is in proportion to
the population size of that stratum.
• Disproportional stratified sample: A stratified
sample in which the sample size for each
stratum is allocated according to analytical
considerations (for example, drawing more units
from strata with higher variability) rather than
stratum size alone.
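Proportional allocation is simple arithmetic, and a short sketch may make it concrete. The function name `proportional_allocation` and the stratum sizes below are hypothetical, chosen only for illustration. Fractional allocations are rounded with a largest-remainder rule so that the stratum sample sizes still add up to the total sample size.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a total sample of n across strata in proportion to stratum size.

    Uses largest-remainder rounding so the allocations sum exactly to n.
    """
    total = sum(stratum_sizes.values())
    exact = {name: n * size / total for name, size in stratum_sizes.items()}
    alloc = {name: int(value) for name, value in exact.items()}  # round down first
    leftover = n - sum(alloc.values())
    # Hand the remaining units to the strata with the largest fractional parts
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical population of 1,000 units in three strata
strata = {"managers": 50, "supervisors": 150, "entry_level": 800}
allocation = proportional_allocation(strata, 100)
print(allocation)  # {'managers': 5, 'supervisors': 15, 'entry_level': 80}
```

A disproportional design would replace the stratum sizes with some other set of weights, for instance oversampling the small managers stratum so that it can be analysed on its own.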
Sampling Plan
. . . the process used to
select sampling units from
the population.

• Survey Sampling is a limited-service supplier firm that
specializes in helping firms with their sampling plans.
www.surveysampling.com
Steps in Developing a Sampling Plan
1. Define the population: The first step in developing a sampling plan is
to clearly define the population of interest. The population should be
well-defined and easily identifiable.
2. Determine the sample size: Once the population is defined, the
researcher needs to determine the sample size. The sample size should
be sufficient to provide a representative sample of the population while
also being practical in terms of time, cost, and resources.
3. Select the sampling method: There are various sampling methods
available, such as simple random sampling, stratified random sampling,
cluster sampling, and so on. The researcher needs to select the most
appropriate sampling method based on the research question, the
population characteristics, and the available resources.
4. Develop a sampling frame: The sampling frame is a list of all the
members of the population from which the sample will be drawn. It is
important to ensure that the sampling frame is comprehensive,
accurate, and up-to-date.
Steps in Developing a Sampling Plan
5. Select the sample: Once the sampling frame is
developed, the researcher needs to select the sample. This
involves using the chosen sampling method to randomly
select the individuals who will be included in the sample.
6. Implement the sampling plan: After selecting the
sample, the researcher needs to implement the sampling
plan. This involves contacting the selected individuals
and collecting data from them.
7. Monitor and adjust the sampling plan: Throughout the
research process, the researcher needs to monitor the
sampling plan to ensure that it is being implemented
correctly and to make any necessary adjustments to the
plan.
Steps in Developing a Sampling Plan (Example)
1. Define the population: Let's say we want to conduct a study on
the job satisfaction of employees in a particular company. The
population of interest would be all the employees in that company.
2. Determine the sample size: We might decide that we need a
sample size of 100 employees to get a representative sample. This
decision could be based on statistical power calculations or other
factors.
3. Select the sampling method: We might choose stratified random
sampling, where we divide the employees into strata based on job
role (e.g., managers, supervisors, entry-level employees) and then
randomly sample a proportion of employees from each stratum.
4. Develop a sampling frame: To create the sampling frame, we
would need a list of all the employees in the company, organized
by job role.
Steps in Developing a Sampling Plan (Example)

5. Select the sample: Using the stratified random sampling
method, we would randomly select a proportion of
employees from each job role stratum to ensure that our
sample is representative of the population.
6. Implement the sampling plan: Once we have selected our
sample, we would need to contact each selected employee
and ask them to participate in our study. We could collect
data through surveys, interviews, or other methods.
7. Monitor and adjust the sampling plan: Throughout the
study, we would need to monitor our sampling plan to
ensure that it is being implemented correctly and that our
sample is representative of the population. If we find that
our sample is not representative, we might need to adjust
our sampling plan or collect additional data.
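Steps 3 to 5 of the worked example above can be sketched in a few lines. Everything specific here is an assumption made for illustration: the company size (1,000 employees), the split across job roles, the 10% sampling fraction, and the helper name `stratified_sample`. The sampling frame is represented as a mapping from job role to a list of employee IDs.

```python
import random


def stratified_sample(frame, fraction, rng):
    """Draw a simple random sample of the same fraction from each stratum.

    frame maps stratum name -> list of unit IDs (the sampling frame by job role).
    """
    sample = {}
    for stratum, units in frame.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[stratum] = rng.sample(units, k)    # without replacement
    return sample


rng = random.Random(42)
# Hypothetical sampling frame: employee IDs organized by job role
frame = {
    "managers": [f"M{i}" for i in range(50)],
    "supervisors": [f"S{i}" for i in range(150)],
    "entry_level": [f"E{i}" for i in range(800)],
}

sample = stratified_sample(frame, 0.10, rng)
sizes = {stratum: len(ids) for stratum, ids in sample.items()}
print(sizes)  # {'managers': 5, 'supervisors': 15, 'entry_level': 80}
```

With these numbers the three strata contribute 5, 15, and 80 employees, giving the sample of 100 used in the example. Fixing the random seed makes the draw reproducible, which helps when monitoring or auditing the plan in step 7.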
Central Limit Theorem
• The central limit theorem is a fundamental concept in statistics that describes
the behavior of the sampling distribution of the sample mean.
• It states that as the sample size increases, the distribution of the sample mean
approaches a normal distribution, regardless of the shape of the population
distribution, provided the observations are independent and the population has
a finite variance.
• In simpler terms, the central limit theorem tells us that if we take repeated
samples of the same size from a population and calculate the mean of each
sample, the distribution of those sample means will be approximately normal,
even if the population distribution is not normal.
• The central limit theorem is important because it allows us to make statistical
inferences about the population based on the sample data, even if the
population distribution is not known or is not normal.
• For example, if we want to estimate the mean height of all students in a
university, we can take a random sample of students, calculate the mean height
of that sample, and use the central limit theorem to determine the likelihood of
that sample mean being close to the true population mean.
• The central limit theorem has numerous practical applications in statistics, such
as hypothesis testing, confidence intervals, and regression analysis. It is also
used extensively in quality control, finance, engineering, and many other fields.
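A quick simulation can illustrate the theorem. The sketch below assumes a strongly skewed population (an exponential distribution with mean 1, chosen only as an example) and compares the skewness of the sample means for a small and a larger sample size. The helper names `sample_means` and `skewness` are hypothetical.

```python
import random
import statistics


def sample_means(draw, n, reps, rng):
    """Means of `reps` independent samples, each of size n."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]


def skewness(xs):
    """Population skewness: mean of cubed standardized values (0 for a symmetric shape)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


def draw_exponential(rng):
    # Assumed population: exponential with mean 1, which is far from normal
    return rng.expovariate(1.0)


rng = random.Random(1)
means_small = sample_means(draw_exponential, 2, 5000, rng)
means_large = sample_means(draw_exponential, 50, 5000, rng)

skew_small = skewness(means_small)
skew_large = skewness(means_large)
print(f"skewness of sample means: n=2 -> {skew_small:.2f}, n=50 -> {skew_large:.2f}")
```

The sample means for n = 50 are much less skewed than those for n = 2, and they cluster around the population mean of 1, which is the behaviour the theorem describes.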
Standard Error
• The standard error is a measure of the precision of an estimate of a population
parameter based on a sample.
• It is a measure of the variability of the sample mean: it is the standard deviation
of the sampling distribution of the mean, usually estimated as the sample standard
deviation divided by the square root of the sample size.
• In other words, the standard error tells us how much the sample mean is likely to vary
from the true population mean.
• The larger the standard error, the more uncertain we are about our estimate of the
population mean.
• The significance of the standard error in sampling analysis is that it helps us to make
inferences about the population based on the sample data.
• By calculating the standard error, we can estimate the margin of error for our estimate
of the population mean, and construct confidence intervals to indicate the range of
values that the population mean is likely to fall within.
• For example, suppose we want to estimate the average salary of all employees in a
company. We take a random sample of 100 employees and calculate the sample mean
salary to be $50,000, with a standard error of $1,000. Using this information, we can
construct a 95% confidence interval for the population mean salary, which would be
approximately $50,000 ± (1.96 × $1,000), or $48,040 to $51,960. This means that
we can be 95% confident that the true population mean salary falls within this range.
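The interval arithmetic is easy to check in code. The sketch below assumes a large-sample normal approximation, so it uses the multiplier 1.96 rather than a t-value. The function names and the five-value toy sample are hypothetical.

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def confidence_interval(mean, se, z=1.96):
    """Normal-approximation interval: mean ± z × standard error (z = 1.96 for 95%)."""
    margin = z * se
    return mean - margin, mean + margin


# Summary figures from the salary example: mean $50,000, standard error $1,000
low, high = confidence_interval(50_000, 1_000)
print(f"95% CI: ${low:,.0f} to ${high:,.0f}")

# The same calculation starting from raw data (toy values, in thousands of dollars)
toy_sample = [48, 52, 50, 49, 51]
toy_mean = statistics.fmean(toy_sample)
toy_se = standard_error(toy_sample)
toy_low, toy_high = confidence_interval(toy_mean, toy_se)
```

With a mean of $50,000 and a standard error of $1,000, the margin of error is 1.96 × $1,000 = $1,960, so the interval runs from $48,040 to $51,960. For a sample as small as the five-value toy example, a t-multiplier would be more appropriate than 1.96.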
