Chapter 1

This document discusses different methods for collecting data and sampling from populations. It covers surveys, sampling plans including simple random sampling, stratified random sampling, and cluster sampling. Simple random sampling involves randomly selecting samples from a population where each member has an equal chance of selection. Stratified random sampling divides the population into subgroups and then randomly samples from each subgroup. Cluster sampling randomly selects clusters of elements rather than individual elements from a population.

Chapter 1: Data Collection and Sampling

1. Surveys
2. Sampling
3. Sampling Plans
4. Sampling and Non-Sampling Errors

Copyright © 2009 Cengage Learning
Recall…

Statistics is a tool for converting data into information:

Data → Statistics → Information

But where does the data come from? How is it gathered? How do we ensure it is accurate? Is the data reliable? Is it representative of the population from which it was drawn?

This chapter explores some of these issues.
Surveys…

⚫ A survey solicits information from people.

⚫ The Response Rate (i.e. the proportion of all people selected who
complete the survey) is a key survey parameter.

⚫ Surveys may be administered in a variety of ways, e.g.
– Personal interview
– Telephone interview
– Self-administered questionnaire (mailed to a sample of people)
– Online surveys

Questionnaire Design…
The questionnaire must be well designed. Some basic points to
consider:
▪ Keep the questionnaire as short as possible (to encourage respondents to complete it).

▪ Ask short, simple, and clearly worded questions (so that respondents can answer quickly).

▪ Use yes/no and multiple-choice questions (useful because of their simplicity).

▪ Avoid leading questions, which steer the respondent toward a particular answer (e.g. "Wouldn't you agree that the statistics exam was too difficult?").

▪ Pretest the questionnaire on a small number of people (to uncover potential problems).
Errors to avoid during data collection

https://www.youtube.com/watch?v=7onVHIkS1YY

Sampling…

❑ Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population.

❑ The methodology used to sample from a larger population depends on the type of analysis being performed, but may include simple random sampling or other methods.
Sampling…

⚫ Recall that statistical inference allows us to draw conclusions about a population based on a sample.

⚫ Sampling (i.e. selecting a subset of a whole population) is often done for reasons of cost (it is less expensive to sample 1,000 television viewers than 100 million TV viewers).

⚫ The sampled population and the target population should be similar to one another.
Sampling Plans…

⚫ A sampling plan is just a method or procedure for specifying how a sample will be taken from a population.

⚫ We will focus our attention on these three methods:
1) Simple random sampling
2) Stratified random sampling
3) Cluster sampling

1) Simple Random Sampling…

⚫ A simple random sample is a sample selected in such a way that every possible sample of the same size is equally likely to be chosen. (In particular, each object in the population has an equal chance of being chosen.)

⚫ One way to conduct a simple random sample is to assign a number to each element in the population, write these numbers on individual slips of paper, toss them into a hat, and draw the required number of slips (the sample size n) from it.
Simple Random Sampling Example

⚫ An organization has 500 employees. We want to extract a sample of 100 from them.
✓ Step 1: Make a list of all the employees working in the organization. (The list must contain 500 names.)
✓ Step 2: Assign a sequential number to each employee (1, 2, 3, …, 500). This is your sampling frame (the list from which you draw your simple random sample).

Simple Random Sampling Example

✓ Step 3: Decide on the sample size. (In this case, the sample size is 100.)
✓ Step 4: Write the employee numbers on individual slips of paper.
✓ Step 5: Toss these 500 slips of paper into a hat, and draw the required number of slips (100) from it.
▪ Note: Instead of drawing slips, we can also use a table of random numbers or a random number generator to select the required number of employees.
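
The five steps above can be sketched with a random number generator in place of the hat. This is a minimal illustration, not part of the original slides: the function name `simple_random_sample` and the seed value are assumptions, and employees are represented only by their numbers 1 to 500.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame without replacement.

    random.sample picks each possible subset of size n with equal
    probability, which is the defining property of a simple random sample.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Steps 1-2: the sampling frame is the list of employee numbers 1..500.
sampling_frame = list(range(1, 501))

# Steps 3-5: draw 100 "slips". The seed only makes the run repeatable.
sample = simple_random_sample(sampling_frame, 100, seed=42)

print(len(sample), "employees selected; first five:", sorted(sample)[:5])
```

Because the draw is without replacement, no employee can appear twice, just as a slip cannot be drawn from the hat twice.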

Table of Random Numbers (Example)

Simple Random Sampling

⚫ In making inferences about a population, we attempt to extract as much information as possible from a sample.
⚫ Simple random sampling often accomplishes this goal at low cost. Other methods, however, can be used to increase the amount of information about the population.
⚫ Simple random sampling is relatively easy for a small population but relatively difficult and time-consuming for a large population.

2) Stratified Random Sampling…

⚫ The stratified sampling technique is generally applied to obtain a representative sample when the population from which the sample is to be selected is not a homogeneous group.

⚫ A stratified random sample is obtained by separating the population into mutually exclusive sets, or strata (layers), and then drawing simple random samples from each stratum.

Stratified Random Sampling…

⚫ Examples of criteria for separating a population into strata:

1) Gender     2) Age     3) Occupation
Male          < 20       professional
Female        20-30      clerical
              31-40      blue collar
              41-50      other
              51-60
              > 60

Make comparisons across strata.
Example 1: Stratified Random Sampling

Gender                        Population    Sample Size
                              Proportion    n = 100            n = 1000
Stratum 1: Female students    60%           100 * 0.6 = 60     1000 * 0.6 = 600
Stratum 2: Male students      40%           100 * 0.4 = 40     1000 * 0.4 = 400

▪ After the population has been stratified, we can use simple random sampling within each stratum to generate the complete sample.
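
The proportional allocation in Example 1 can be sketched as follows. This is an illustration under stated assumptions: the function name `proportional_stratified_sample`, the population of 5,000 students, and the student identifiers are all hypothetical; only the 60%/40% split and the sample sizes come from the example.

```python
import random


def proportional_stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum.

    strata maps a stratum name to the list of its members. Each stratum
    contributes a share of n equal to its share of the population.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: 3,000 female and 2,000 male students (60% / 40%).
strata = {
    "female": [f"F{i}" for i in range(3000)],
    "male": [f"M{i}" for i in range(2000)],
}

sample = proportional_stratified_sample(strata, 100, seed=1)
sizes = {name: len(members) for name, members in sample.items()}
print(sizes)
```

With n = 100 the strata contribute 60 and 40 students, matching the table. For proportions that do not divide evenly, rounding can make the stratum sizes add up to slightly more or less than n, so a real allocation would need a rule for the remainder.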

Stratified Random Sampling (Example 2)

[Figure: a population divided into Stratum 1, Stratum 2, and Stratum 3]

After the population has been stratified, we can use simple random sampling within each stratum to generate the complete sample.

Advantages and disadvantages of stratified random sampling

- Advantage:
⚫ The aim of the stratified random sample is to reduce the potential for human bias in the selection of cases to be included in the sample.
- Disadvantage:
⚫ A stratified random sample can only be carried out if a complete list of the population is available.
3) Cluster Sampling…
⚫ A cluster sample is a simple random sample of groups or clusters of elements (vs. a simple random sample of individual objects).

⚫ This method is useful when the population elements are widely dispersed geographically.

⚫ It is also useful when a list of the elements of the population is not available but a list of clusters is easy to obtain.

⚫ The clusters are constructed such that the sampling units are heterogeneous within the clusters and homogeneous among the clusters. This is the opposite of how strata are constructed in stratified sampling.
Example: Cluster Sampling

⚫ A firm is interested in estimating the average per capita income in a certain city. No list of resident adults is available.
⚫ The city is marked off into 60 rectangular blocks.
⚫ The researchers decide that each of the city blocks will be considered a cluster.
⚫ The clusters are numbered from 1 to 60, and the budget allows for sampling n = 20 clusters and interviewing every household within each sampled cluster.
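
The selection step of this example can be sketched as below. It is only an illustration: the function name `cluster_sample`, the household labels, and the assumption of 10 households per block are hypothetical; the 60 blocks and n = 20 come from the example.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random, then keep every element in them.

    clusters maps a cluster id to the list of elements it contains.
    """
    rng = random.Random(seed)
    chosen = sorted(rng.sample(sorted(clusters), n_clusters))
    elements = [element for c in chosen for element in clusters[c]]
    return chosen, elements


# Hypothetical city: blocks numbered 1..60, each with 10 households.
city = {
    block: [f"block{block}-house{h}" for h in range(1, 11)]
    for block in range(1, 61)
}

chosen_blocks, households = cluster_sample(city, 20, seed=3)
print(len(chosen_blocks), "blocks chosen,", len(households), "households to interview")
```

Note that the random draw is made over the list of blocks, not over households, which is why no list of residents is needed in advance.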
Sample Size…

Numerical techniques for determining sample sizes will be described in Business Stat II, but suffice it to say that the larger the sample size is, the more accurate we can expect the sample estimates to be.

Sampling and Non-Sampling Errors…

⚫ Two major types of error can arise when a sample of observations is taken from a population: sampling error and non-sampling error.

Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.

Increasing the sample size will, on average, reduce this error.
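
A small simulation can make the effect of sample size on sampling error concrete. Everything here is an assumption for illustration: the population is 10,000 made-up values drawn uniformly between 0 and 100, the function name `mean_abs_sampling_error` is hypothetical, and sampling error is measured as the average absolute gap between the sample mean and the population mean over 200 repeated simple random samples.

```python
import random
import statistics


def mean_abs_sampling_error(population, n, repetitions, rng):
    """Average |sample mean - population mean| over repeated random samples."""
    mu = statistics.fmean(population)
    errors = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 values.
population = [rng.uniform(0, 100) for _ in range(10_000)]

error_small = mean_abs_sampling_error(population, 10, 200, rng)
error_large = mean_abs_sampling_error(population, 1000, 200, rng)

print(f"n = 10:   average sampling error {error_small:.2f}")
print(f"n = 1000: average sampling error {error_large:.2f}")
```

The samples of 1,000 land much closer to the population mean than the samples of 10. Any single small sample can still happen to be accurate, which is why the comparison is made on the average over many samples.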


Nonsampling Error…

⚫ Non-sampling errors are more serious and are due to mistakes made in the collection of data or to the sample observations being selected improperly.
⚫ Three types of non-sampling errors:
– Errors in data collection
– Nonresponse errors
– Selection bias

⚫ Note: increasing the sample size will not reduce this type of error.
Errors in data collection

⚫ …arise from the recording of incorrect responses, due to:
– incorrect measurements being taken because of faulty equipment,
– inaccurate recording of data,
– inaccurate responses to questions.

Nonresponse Error…

⚫ …refers to error (or bias) introduced when responses are not obtained from some members of the sample, i.e. the sample observations that are collected may not be representative of the target population.

⚫ As mentioned earlier, the Response Rate (i.e. the proportion of all people selected who complete the survey) is a key survey parameter. It helps in understanding the validity of the survey and the sources of nonresponse error.
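
As a quick arithmetic illustration of the response rate, the sketch below divides the number of completed surveys by the number of people selected. The function name `response_rate` and the figures (1,000 people selected, 240 completed surveys) are hypothetical and not taken from the slides.

```python
def response_rate(completed, selected):
    """Proportion of all people selected who completed the survey."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    if not 0 <= completed <= selected:
        raise ValueError("completed must be between 0 and selected")
    return completed / selected


# Hypothetical survey: 1,000 people selected, 240 completed questionnaires.
rate = response_rate(completed=240, selected=1000)
print(f"Response rate: {rate:.1%}")
```

A low rate such as this one means most of the selected people are missing from the data, so it is worth asking whether those who answered differ from those who did not.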

Selection Bias…

Selection bias is the bias that occurs in survey or experimental data when the selection of data points isn't sufficiently random to draw a general conclusion. (E.g. voters without telephones were excluded from possible inclusion in the sample taken.)

➢ If you survey your friends, they may not be representative of the population.