DATA COLLECTION 1

IT MS02: Quantitative Methods


Topics
• Sampling Techniques
• Data Collection Methods
Data Collection
• The process of gathering information for
a specific purpose.
• Mainly driven by PURPOSE and DESIRED
OUTCOME

• Other things to consider:
  • Do we investigate the whole population
  (Census) or a part of the population
  (Sample)?
  • Sampling design, if needed
  • Collection methodology
  • Time and cost issues
Sampling Techniques
Population vs Sample
Population
• Requires attention to the details of the collection method
• More accurate results
• Time-consuming
• Expensive
Sample
• Requires attention to the details of the collection method
• The only option if items have to be destroyed
• Saves time and money
• Sampling design determines the level of accuracy
Notations
• N = number of elements in the
population

• n = number of elements in the sample


Sampling Methods
Sampling
• Probability Sampling
  • Simple Random Sampling
  • Systematic Sampling
  • Stratified Sampling
  • Cluster Sampling
• Non-Probability Sampling
  • Convenience Sampling
  • Purposive Sampling
  • Quota Sampling
Probability vs Non-Probability Sampling
Probability Sampling
• The sample is chosen based on known probabilities
• Every element in the sampling frame has a chance to be part of the sample
• Minimizes bias in sample selection
Non-Probability Sampling
• Subjective selection of the sample
• Elements in the sample are chosen without regard to the probability of occurrence
• Also referred to as non-random sampling
Probability vs Non-Probability Sampling
Probability Sampling
• Used mostly for quantitative research where insights and conclusions are extended to the population
Non-Probability Sampling
• Used when there is limited access to the population and/or resources
• Mostly used for exploratory analysis
Probability Sampling
• Simple Random Sampling
• Systematic Sampling
• Stratified Sampling
• Cluster Sampling

• Sampling Frame – list of all members of the population


Simple Random Sampling
• All elements in the population have an equal chance of being
part of a sample
• Can be done with or without replacement
• Advantage: Simple and easy to understand
• Disadvantages:
  • Needs an exhaustive list of all elements in the population
  • Sample size must be large
  • Can require more resources if elements are widely spread
  geographically
Simple Random Sampling
Steps:
1. List the elements of the population and
number from 1 to N
2. Select n numbers from 1 to N, using a
randomization technique like a table of
random numbers or random number
generators
3. Sample will consist of elements
corresponding to the selected numbers
Table of Random Numbers
• In Excel, we can use the
functions:

=RAND()
• This yields a value from 0 up to, but not including, 1
• =INT(RAND()*N)+1 generates a
random whole number from 1 to
N

=RANDBETWEEN([min],[max])
• Returns a random integer number
between the numbers you specify.
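
For readers working outside Excel, the two formulas above can be mirrored in Python. This is only a rough sketch using the standard `random` module; the function names `rand_whole_number` and `rand_between` are made up for illustration and are not Excel or Python built-ins.

```python
import math
import random


def rand_whole_number(N):
    """Mirror =INT(RAND()*N)+1: a whole number from 1 to N inclusive."""
    u = random.random()           # float in [0, 1), like RAND()
    return math.floor(u * N) + 1  # u*N is in [0, N), so the result is in 1..N


def rand_between(low, high):
    """Mirror =RANDBETWEEN(low, high): both endpoints can be returned."""
    return random.randint(low, high)


# Draw many values to see that they stay inside 1..6.
draws = [rand_whole_number(6) for _ in range(1000)]
print(min(draws), max(draws))
```

Because `random.random()` never returns exactly 1, the `+1` shift is what makes N reachable while keeping 0 out of the range.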
Example: Simple Random Sampling
• Imagine you are a quality control manager at a
computer monitor factory, and you want to test
the quality of the monitors. You have a large
production line, and you want to select a sample
of monitors for testing. You assign a unique serial
number to each monitor, and then you use a
random number generator to select 20 monitors
from the entire production. This ensures that each
product has an equal chance of being tested.
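
The monitor example follows the three simple random sampling steps (number the elements, pick n numbers at random, take the matching elements), and could be sketched in Python as below. The production size of 500, the serial number format, and the helper name `simple_random_sample` are assumptions for illustration only.

```python
import random


def simple_random_sample(population, n, replace=False, seed=None):
    """Return n elements, each chosen with equal probability."""
    rng = random.Random(seed)  # seed is only here to make demos repeatable
    N = len(population)        # step 1: elements are numbered 1..N
    if replace:
        # With replacement: the same number may be selected more than once.
        numbers = [rng.randint(1, N) for _ in range(n)]
    else:
        # Without replacement: n distinct numbers from 1..N.
        numbers = rng.sample(range(1, N + 1), n)
    # Step 3: the sample is the elements matching the selected numbers.
    return [population[i - 1] for i in numbers]


# Hypothetical production run of 500 monitors with made-up serial numbers.
monitors = [f"SN-{i:04d}" for i in range(1, 501)]
tested = simple_random_sample(monitors, 20, seed=42)
print(tested)
```

Passing `replace=True` switches to sampling with replacement, in which case a monitor could appear in the list more than once.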
Systematic Sampling
• Select the first sampled element randomly, then take every kth
element thereafter

• Advantages:
• Easy to identify the elements to be included in the sample
• The sample is distributed evenly over the entire population
• Can be done even without an available list of all elements in the population
(example: choosing households in a certain community)

• Disadvantages:
• Requires information on the arrangement of the elements in the sampling
frame
• Periodic irregularities in the list will affect the reliability of the results
Systematic Sampling
Steps:
1. Assign a unique number from 1 to N
to each element of the population
2. Determine the sampling interval k:
k = N/n
3. Obtain the first element in the
sample using a randomization
technique
4. Take every kth element from the
random start until the desired
sample size is met
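
The steps above can be sketched in Python as follows. Two choices here are assumptions rather than part of the steps: the interval is rounded down when N/n is not a whole number, and the random start is drawn from the first k elements so that the last selection never runs past N. The name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(population, n, seed=None):
    """Take every kth element after a random start in the first interval."""
    rng = random.Random(seed)
    N = len(population)        # step 1: elements are numbered 1..N
    k = N // n                 # step 2: sampling interval (rounded down)
    start = rng.randint(1, k)  # step 3: random start between 1 and k
    # Step 4: start, start+k, start+2k, ... until n elements are selected.
    numbers = [start + j * k for j in range(n)]
    return [population[i - 1] for i in numbers]


# 100 numbered elements, sample of 10, so k = 10.
users = list(range(1, 101))
surveyed = systematic_sample(users, 10, seed=3)
print(surveyed)
```

With N = 100 and n = 10, the selected numbers are always exactly 10 apart; only the starting point changes between runs.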
Example: Systematic Sampling

• During the early launch of a
mobile app, the developer
decided to collect feedback from
the first 100 app users. Every
10th person who installed the
app will receive a short survey.
Stratified Sampling
• Elements of the sample are taken from the different strata
• Strata should be different from each other, with elements
within each stratum being highly similar

• Advantages:
• Assured representation of items across the entire
population
• Can facilitate easier administration of data collection
• Disadvantage: Need information on the stratification variable
to identify the stratum of each element
Stratified Sampling
Steps:
1. Divide the population into non-overlapping
strata
• Every element belongs to one and only
one stratum, based on a common characteristic
(stratification variable)

2. Obtain a simple random sample for each
stratum, with sample sizes proportional to
stratum sizes
• Example: If Stratum 1 has 40% of the population,
then 40% of the sample should be selected from
Stratum 1 too
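
A minimal Python sketch of these two steps with proportional allocation is shown below. The function name `stratified_sample`, the `stratum_of` callback, and the 40/60 toy population are all illustrative assumptions. Each stratum's share is rounded to a whole number, so when the shares are not whole numbers the total can differ slightly from n.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, n, seed=None):
    """Simple random sample within each stratum, sized proportionally."""
    rng = random.Random(seed)
    # Step 1: put every element into exactly one stratum.
    strata = defaultdict(list)
    for element in population:
        strata[stratum_of(element)].append(element)
    N = len(population)
    sample = []
    # Step 2: draw from each stratum in proportion to its size.
    for members in strata.values():
        n_h = round(n * len(members) / N)  # this stratum's share of n
        sample.extend(rng.sample(members, n_h))
    return sample


# Toy population: stratum "A" holds 40% of the elements, "B" holds 60%.
population = [("A", i) for i in range(40)] + [("B", i) for i in range(60)]
sample = stratified_sample(population, lambda e: e[0], 10, seed=0)
print(sample)
```

Here stratum "A" contributes 4 of the 10 sampled elements and stratum "B" contributes 6, matching the 40% and 60% population shares.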
Example: Stratified Sampling
We want to observe consumer behavior in one of the
municipalities in Region 3. We check the proportion of females,
the proportion of young and old residents, and the proportions by
average income in the population. Then, we divide the population
into subgroups according to gender, age, and income. After that,
we apply SRS or systematic sampling to select a certain
number of people from each subgroup we have created.
The aim is to ensure the sample has the same subgroup proportions
as the population. If 10% of the population are young females with
high income, then we want 10% of our sample to be young
females with high income.
Cluster Sampling
• Selecting clusters or groups to represent the population
• Advantages:
• Only needs a list of clusters, not a list of elements
• More cost-effective in terms of transportation and listing, especially
if the population is geographically widespread
• Disadvantage:
• Often requires a larger sample size than other probability sampling
techniques for the same level of precision
Cluster Sampling
Steps:
1. Divide the population into
non-overlapping clusters, each
representative of the population
2. Select a sample of clusters using
simple random sampling
3. Sample consists of all the elements in
the selected clusters
Example: Cluster Sampling
• When studying a network’s performance, you can randomly select
specific departments or servers as clusters to analyze the
performance metrics.
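
As a rough illustration of the cluster sampling steps applied to this network example, the Python sketch below treats departments as clusters and servers as elements. The department names, server names, and the helper `cluster_sample` are invented for the sketch.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Randomly pick m clusters and keep every element inside them."""
    rng = random.Random(seed)
    # Step 2: simple random sample of cluster names (sorted for repeatability).
    chosen = rng.sample(sorted(clusters), m)
    # Step 3: the sample is all elements of the selected clusters.
    sample = [element for name in chosen for element in clusters[name]]
    return chosen, sample


# Step 1: hypothetical departments (clusters) and their servers (elements).
departments = {
    "Finance": ["fin-srv-1", "fin-srv-2"],
    "HR": ["hr-srv-1"],
    "Engineering": ["eng-srv-1", "eng-srv-2", "eng-srv-3"],
    "Sales": ["sales-srv-1", "sales-srv-2"],
}
chosen, servers = cluster_sample(departments, 2, seed=7)
print(chosen, servers)
```

Note that only the list of departments is needed up front; the servers are listed only for the departments that end up selected.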
Non-Probability Sampling
• Convenience Sampling
• Purposive Sampling
• Quota Sampling
Convenience Sampling
• Sample consists of elements
that are most accessible or
easiest to contact.
• Subjects who are available and
willing to participate in the study
• Mostly used in research in biology
and social sciences as
participation is limited for these
areas of study.
Example: Convenience Sampling
• Suppose you work for a multinational
company with offices around the globe.
You were tasked to redesign the current
database of the company to address
the concerns of employees on the
frontline. However, only the inputs and
suggestions from employees in your
local office were considered.
Purposive Sampling
• Handpick individuals who are
considered to be the most
knowledgeable or related to
the study topic
• Frequently utilized in
qualitative analysis or when
specialists’ evaluations are
required
• Can have high researcher bias
and may not be appropriate for
generalization
Example: Purposive Sampling
• A researcher is conducting a study on the
performance of employees in a big company.
Instead of interviewing random employees, only
high-performing and recognized employees were
interviewed. The researcher deliberately selects
participants who have specific characteristics
relevant to the research topic.

Quota Sampling
• Similar to stratified sampling; however, sample selection
within each stratum does not use a probability sampling
method.
• There is a set quota, or required number of sampling units, for
each group, and convenience sampling is used to select units
within each group.
Example: Quota Sampling
When gathering feedback on a software product, quotas on
feedback can be set for different demographics (age,
occupation, location) of users.
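
One way to picture quota sampling for this feedback example is the Python sketch below: users are taken in the order they happen to arrive (the convenience part) until each group's quota is filled. The occupations, user IDs, quota sizes, and the function name `quota_sample` are assumptions made up for illustration.

```python
def quota_sample(respondents, group_of, quotas):
    """Accept respondents in arrival order until every quota is filled."""
    filled = {group: 0 for group in quotas}
    sample = []
    for person in respondents:  # no random draw: whoever shows up first
        group = group_of(person)
        if group in filled and filled[group] < quotas[group]:
            sample.append(person)
            filled[group] += 1
        if filled == quotas:    # stop once every group has met its quota
            break
    return sample


# Hypothetical stream of users who submitted feedback, in arrival order.
feedback_queue = [
    ("u1", "student"), ("u2", "student"), ("u3", "developer"),
    ("u4", "student"), ("u5", "developer"), ("u6", "developer"),
]
picked = quota_sample(feedback_queue, lambda p: p[1],
                      {"student": 2, "developer": 2})
print([user for user, _ in picked])  # ['u1', 'u2', 'u3', 'u5']
```

User u4 is skipped because the student quota is already full, which shows how the result depends on arrival order rather than on known selection probabilities.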
Data Collection Methods
• Observation
• Surveys
• Interviews
• Use of Secondary Sources
Observation
• Done when the population consists of
animals or inanimate objects such as
machines, files, and documents
• Observer watches some activity and
records what happens
  • Counting occurrences of events
  • Taking some measurements
  • Seeing how something works
• Observers can be human or automatic
recorders
Surveys
• Done by asking people to answer questionnaires which usually
contain close-ended questions
• Can be administered through printed questionnaires or online
surveys
Interviews
• Asking questions verbally
• One of the most reliable ways of
getting data but expensive
• Almost anyone can come up with a
list of questions, but the key to
efficient interviews is knowing what
to ask
• Allow follow-up questions, making
them more customized
• Can be done in person, via calls, or a
web chat interface
Use of Secondary Sources
• Obtain data from previous studies of individuals, private
organizations, and government agencies.
• Documented data can be in the form of published or written
reports, unpublished documents, existing databases, or journals
to name a few.
Usability Testing
• Testing the functionality of a website, app, or other digital product by
observing real users as they attempt to complete tasks on it
• Companies usually use this during or after the development of products
or services.
• If they choose to use it during development, it might be to determine where
users find the product challenging to navigate.
• They might also use it after product release to track needed updates.
• Done with other data collection methods
• Observation while users are completing tasks
• Survey after completing tasks
QUESTIONS?
