Sasa Module-2

The document outlines the processes and methods of data collection and sampling design, emphasizing the importance of systematic data gathering to answer research questions accurately. It differentiates between primary and secondary data sources, describes various data collection methods, and discusses sampling techniques including random, stratified, and cluster sampling. Additionally, it highlights the criteria for determining sample size and the consequences of improperly collected data.

Uploaded by

Raiven Justine Convento


DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN
Anecdotal means that the information being conveyed
is based on casual observation, not scientific research.

Data collection is the process of gathering and
measuring information on variables of interest, in
an established systematic fashion that enables one
to answer stated research questions, test
hypotheses, and evaluate outcomes.
CONSEQUENCES FROM IMPROPERLY COLLECTED DATA
• Inability to answer research questions accurately.
• Inability to repeat and validate the study.
• Distorted findings resulting in wasted resources.
• Misleading other researchers to pursue fruitless avenues of
investigation.
• Compromising decisions for public policy.
• Causing harm to human participants and animal subjects.
STEPS IN DATA GATHERING
1. Set the objectives for collecting data.
2. Determine the data needed based on the set objectives.
3. Determine the method to be used in data gathering and define the
comprehensive data collection points.
4. Design data gathering forms to be used.
5. Collect data.
SOURCES OF DATA
PRIMARY Sources
• Provide a first-hand account of an event or time
period and are considered to be authoritative.
• They represent original thinking, reports on
discoveries or events, or they can share new
information.
• They are usually the first formal appearance of
original research.
SECONDARY Sources
• Offer an analysis, interpretation or a restatement of
primary sources and are considered to be persuasive.
• They often involve generalization, synthesis,
interpretation, commentary or evaluation in an attempt
to convince the reader of the creator's argument.
• They often attempt to describe or explain primary
sources.
The primary data can be collected by
the following five methods:
1. DIRECT PERSONAL INTERVIEWS - The researcher
has direct contact with the interviewee. The
researcher gathers information by asking questions to
the interviewee.
2. INDIRECT/QUESTIONNAIRE METHOD - In this method,
respondents give written answers to a prepared set of
questions, so the researcher does not need direct
contact with each respondent.
Open-ended question - Has no response categories and is appropriate
for collecting subjective data.
Closed-ended question - Includes a list of response categories from
which the respondent selects an answer. It is appropriate for
collecting objective data.
3. FOCUS GROUP - A group interview of
approximately six to twelve people who share
similar characteristics or common interests. A
facilitator guides the group based on a
predetermined set of topics.
4. EXPERIMENT - A method of collecting data
where there is direct human intervention on the
conditions that may affect the values of the
variable of interest.
5. OBSERVATION - A technique that involves
systematically selecting, watching and recording
behaviors of people or other phenomena and
aspects of the setting in which they occur, for the
purpose of gaining specified information. It
includes all methods from simple visual
observations to the use of high-level machines.
The secondary data can be collected from the
following five sources:
1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of the government
departments.
5. Information from official publications.
SAMPLE SIZE
“How many participants should be chosen for a survey?”
• Typically denoted by n, and it is always a positive integer.
• Can vary in different research settings.
Take Note!
- Representativeness, not size, is the more important consideration.
Choosing of sample size depends on:
• Non-statistical considerations - It may include availability of
resources, manpower, budget, ethics and sampling frame.
• Statistical considerations - It will include the desired
precision of the estimate.
THREE CRITERIA need to be specified to
determine the appropriate sample size:
1. LEVEL OF PRECISION - Also called sampling error, the
level of precision is the range in which the true
value of the population is estimated to be.
2. CONFIDENCE LEVEL - It is a statistical measure of
the number of times out of 100 that results can
be expected to be within a specified range.
For example, a confidence level of 90% means that,
if the sampling were repeated many times, about 90
out of 100 samples would give a range that contains
the true population value.
To find the right z-score to use, refer to the
z-table. The examples in this module use 1.96 for a
95% confidence level and 2.58 for a 99% confidence level.
3. DEGREE OF VARIABILITY - The degree of variability
depends on the target population and the attributes
under consideration, and it can vary considerably.
- Reflects how much individual data points differ from
one another and from their mean.
- The more heterogeneous a population is, the larger
the sample size required to attain a given level of
precision.
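The z-score that goes with a chosen confidence level can also be computed instead of read from a table. The sketch below is a minimal illustration, assuming a two-sided interval and the standard normal distribution; the function name `z_score` is hypothetical and not part of the module.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided z-score for a confidence level given as a fraction (e.g. 0.95)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # The probability left over is split equally between the two tails,
    # so we look up the point with (1 + level) / 2 of the area below it.
    cumulative_area = (1 + confidence_level) / 2
    return NormalDist().inv_cdf(cumulative_area)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_score(level):.3f}")
# 90% confidence -> z = 1.645
# 95% confidence -> z = 1.960
# 99% confidence -> z = 2.576
```

The 99% value is usually rounded to 2.58 in printed z-tables, which is the figure used in the worked examples.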
METHODS IN DETERMINING THE SAMPLE SIZE
• Estimating the Mean or Average
The sample size required to estimate the population mean µ
with a given level of confidence and a specified margin of
error e is given by
$$ n \ge \left( \frac{Z\sigma}{e} \right)^2 $$
where:
Z is the z-score corresponding to the level of confidence.
σ is the population standard deviation.
e is the level of precision.
Take Note: When σ is unknown, it is common practice
to conduct a preliminary survey to determine s and
use it as an estimate of σ, or to use results from
previous studies to obtain an estimate of σ. When using
this approach, the size of the sample should be at
least 30. The formula for the sample standard deviation
s is
$$ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} $$
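As a rough sketch of how s could be computed from a preliminary survey, the function below follows the sample standard deviation formula term by term and compares the result with Python's built-in `statistics.stdev`. The data values are hypothetical and the list is kept short for readability; a real preliminary survey would have at least 30 observations, as noted above.

```python
import math
from statistics import stdev


def sample_std(values):
    """Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical pilot measurements, for illustration only.
pilot = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_std(pilot), 4))  # 2.1381
print(round(stdev(pilot), 4))       # same value from the standard library
```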
Example
A soft drink machine is regulated so that the amount of
drink dispensed is approximately normally distributed with
a standard deviation equal to 0.5 ounce. Determine the
sample size needed if we wish to be 95% confident that our
sample mean will be within 0.03 ounce from the true mean.
SOLUTION:
The z-score for confidence level 95% in the z-table is 1.96.
$$ n \ge \left( \frac{Z\sigma}{e} \right)^2 = \left( \frac{(1.96)(0.5)}{0.03} \right)^2 = 1067.11 $$
Because n must be a whole number that is at least 1067.11, we
round up: we need a sample of 1,068 for our study.
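The calculation in this example can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name `sample_size_mean` is hypothetical, and the result is rounded up because n must be a whole number no smaller than the computed bound.

```python
import math


def sample_size_mean(z: float, sigma: float, e: float) -> int:
    """Smallest whole n satisfying n >= (z * sigma / e) ** 2."""
    if e <= 0:
        raise ValueError("the margin of error e must be positive")
    if sigma < 0:
        raise ValueError("the standard deviation sigma cannot be negative")
    bound = (z * sigma / e) ** 2
    # Round up: rounding down would give a sample smaller than the bound.
    return math.ceil(bound)


# Soft drink machine: 95% confidence (z = 1.96), sigma = 0.5 oz, e = 0.03 oz.
print(sample_size_mean(z=1.96, sigma=0.5, e=0.03))  # 1068
```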
Estimating Proportion (Infinite Population)
The sample size required to obtain a confidence interval
for p with specified margin of error e is given by
$$ n \ge \left( \frac{Z}{e} \right)^2 p(1 - p) $$
Where:
Z is the z-score corresponding to the level of confidence.
e is the level of precision.
p is the population proportion.
Note: There is a dilemma in this formula: it depends on
p = x/N, which we know only after we have taken the sample.
Example
Suppose we are doing a study on the inhabitants of a large
town and want to find out how many households serve
breakfast in the mornings. We don’t have much information
on the subject to begin with, so we’re going to assume that
half of the families serve breakfast: this gives us maximum
variability. So p = 0.5. We want 99% confidence and at
least 1% precision.
SOLUTION:
The z-score for confidence level 99% in the z-table is 2.58.
$$ n \ge \left( \frac{Z}{e} \right)^2 p(1 - p) = \left( \frac{2.58}{0.01} \right)^2 (0.5)(1 - 0.5) = 16641 $$
We need a sample of 16,641 for our study.
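The breakfast example can be reproduced with a short sketch. The function name `sample_size_proportion` is hypothetical, and the default p = 0.5 reflects the maximum-variability assumption used above. Exact fractions are used here as a design choice, because ordinary floating-point arithmetic can land a hair above a whole number and make the rounded-up answer one too large.

```python
import math
from fractions import Fraction


def sample_size_proportion(z, e, p=0.5) -> int:
    """Smallest whole n satisfying n >= (z / e) ** 2 * p * (1 - p)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if e <= 0:
        raise ValueError("the margin of error e must be positive")
    # Going through str() keeps decimal inputs such as 2.58 and 0.01 exact.
    z, e, p = (Fraction(str(value)) for value in (z, e, p))
    bound = (z / e) ** 2 * p * (1 - p)
    return math.ceil(bound)


# Breakfast study: 99% confidence (z = 2.58), e = 0.01, p = 0.5.
print(sample_size_proportion(z=2.58, e=0.01, p=0.5))  # 16641
```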
SLOVIN’S FORMULA
Slovin’s formula is used to calculate the sample size
n given the population size and error. It is computed
as
$$ n \ge \frac{N}{1 + Ne^2} $$
Where:
N is the total population.
e is the level of precision.
Example
A researcher plans to conduct a survey about food
preference of BS Stat students. If the population of
students is 1000, find the sample size if the error is 5%.
SOLUTION:
$$ n \ge \frac{N}{1 + Ne^2} = \frac{1000}{1 + 1000(0.05)^2} = 285.71 $$
The researcher needs to survey 286 BS Stat students.
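Slovin's formula is simple enough to sketch directly. The function name `slovin` is hypothetical; the result is rounded up so that the sample is never smaller than the computed value.

```python
import math


def slovin(population_size: int, e: float) -> int:
    """Smallest whole n satisfying n >= N / (1 + N * e ** 2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if e <= 0:
        raise ValueError("the error e must be positive")
    return math.ceil(population_size / (1 + population_size * e ** 2))


# BS Stat students: N = 1000, e = 5%.
print(slovin(1000, 0.05))  # 286
```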
TWO TYPES OF SAMPLE
• Random
• Non-Random
BASIC SAMPLING TECHNIQUE OF PROBABILITY SAMPLING
SIMPLE RANDOM SAMPLING
- Most basic method of drawing a probability sample.
- Assigns equal probabilities of selection to each possible sample.
Advantage: It is very simple and easy to use.
Disadvantage: The sample chosen may be distributed over a wide
geographic area.
When to use: This is preferable if the population is not
widely spread geographically. It is more appropriate if the
population is more or less homogeneous with respect to the
characteristic under study.
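A simple random sample can be drawn with Python's standard library. This is only a sketch: the population of 500 numbered students is hypothetical, and the optional `seed` argument is included so that a draw can be repeated.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    # random.sample selects without replacement.
    return rng.sample(list(population), n)


# Hypothetical sampling frame: students numbered 1 to 500.
students = list(range(1, 501))
sample = simple_random_sample(students, 50, seed=42)
print(len(sample), len(set(sample)))  # 50 50
```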
SYSTEMATIC RANDOM SAMPLING
- This method uses the kth interval formula.
- The sampling interval is the standard distance between elements
chosen for the sample.
Advantage - Easy to sample and administer in the field.
- Samples are evenly distributed across the population.
Disadvantage - May lack precision if
unexpected periodicity exists.
When to use - Advisable to use if the
ordering of the population is
essentially random and when
stratification with numerous strata is used.
Obtaining a Systematic Random Sample
1. Decide on a method of assigning a unique serial number, from 1 to N,
to each one of the elements in the population.
2. Compute the sampling interval k = N/n, where n is the desired sample size.
3. Select a number r, from 1 to k, using a randomization mechanism. The
element in the population assigned to this number is the first element of
the sample. The other elements of the sample are those assigned to the
numbers r + k, r + 2k, r + 3k, and so on until you get a sample of size n.
EXAMPLE
Select a sample of 50 students from 500 students. Under this method, every
kth item is picked from the sampling frame, where k = N/n = 500/50 = 10.
We start the sample from a random number i between 1 and k and take every
kth unit subsequently. Suppose the random number i is 5; then we select
5, 15, 25, 35, 45, ...
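The procedure for a systematic random sample can be sketched as follows. The function name `systematic_sample` is hypothetical, and the sketch assumes the interval is taken as the whole-number part of N/n when N is not an exact multiple of n, a case the slides do not cover.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the serial numbers start, start + k, start + 2k, ... (n of them)."""
    k = N // n  # sampling interval (assumption: whole-number part of N / n)
    if k < 1:
        raise ValueError("the sample size cannot exceed the population size")
    if start is None:
        # Random start between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    return [start + j * k for j in range(n)]


# 50 students out of 500 gives k = 10; with a random start of 5:
chosen = systematic_sample(N=500, n=50, start=5)
print(chosen[:5])   # [5, 15, 25, 35, 45]
print(chosen[-1])   # 495
```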
STRATIFIED RANDOM SAMPLING
- It is obtained by separating the population into non-overlapping
groups called strata and then obtaining a simple random sample from
each stratum.
- The individuals within each stratum should be homogeneous (or
similar) in some way.
Advantages
Precision: Stratification enhances the accuracy of population estimates.
Flexible Sampling Designs: Different sampling methods can be applied to
each stratum.
Ease of Use: Functions similarly to random sampling.
Disadvantages
Data Availability: Stratification variables may be hard to obtain, especially
in homogeneous populations.
Representation Issues: Some strata may lack adequate representation.
High Costs: Transportation costs can escalate if the population is
geographically dispersed.
EXAMPLE
A sample of 50 students is to be drawn from a population consisting of
500 students belonging to two institutions A and B. The number of
students in the institution A is 200 and the institution B is 300. How will
you draw the sample using proportional allocation?
Solution: There are two strata in this case.
Given: N_1 = 200, N_2 = 300, N = 500, n = 50
$$ n_1 = \frac{n}{N} N_1 = \frac{50}{500}(200) = 20 $$
$$ n_2 = \frac{n}{N} N_2 = \frac{50}{500}(300) = 30 $$
The sample sizes are 20 from A and 30 from B. Then the units from each
institution are to be selected by simple random sampling.
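Proportional allocation followed by a simple random sample within each stratum can be sketched as below. The function names and the student labels are hypothetical. The sketch rounds each allocation to the nearest whole number, which is an assumption; in awkward cases the rounded parts may not add up to exactly n.

```python
import random


def proportional_allocation(stratum_sizes, n):
    """Return n_h = (n / N) * N_h for each stratum, rounded to whole units."""
    N = sum(stratum_sizes.values())
    return {name: round(n * size / N) for name, size in stratum_sizes.items()}


def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample from each stratum using proportional allocation."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(units, allocation[name]) for name, units in strata.items()}


allocation = proportional_allocation({"A": 200, "B": 300}, 50)
print(allocation)  # {'A': 20, 'B': 30}

# Hypothetical student labels for the two institutions.
strata = {
    "A": [f"A-{i}" for i in range(1, 201)],
    "B": [f"B-{i}" for i in range(1, 301)],
}
drawn = stratified_sample(strata, 50, seed=1)
print(len(drawn["A"]), len(drawn["B"]))  # 20 30
```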
CLUSTER SAMPLING
- You take the sample from naturally occurring groups in your
population.
- The clusters are constructed such that the sampling units are
heterogeneous within the cluster and homogeneous among the clusters.
1. Divide the population into non-overlapping clusters.
2. Number the clusters in the population from 1 to N.
3. Select n distinct numbers from 1 to N using a randomization
mechanism. The selected clusters are the clusters associated
with the selected numbers.
4. The sample will consist of all the elements in the selected
clusters.
Advantage: There is no need to come up with a list of units in the population; all
that is needed is a list of the clusters. It is also less costly since the
elements are physically closer together.
Disadvantage: In actual field applications, adjacent households tend to have more
similar characteristics than households distantly apart.
When to use: If the population can be grouped into clusters where individual
population elements are known to be different with respect to the characteristics
under study, this is preferable to use.
Example:
Randomly select 3 schools from
the population, then sample all
students in each school.
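The school example can be sketched in code. The ten schools and their student labels are hypothetical, as is the function name `cluster_sample`; the point is only that whole clusters are selected at random and every element of a selected cluster enters the sample.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select n_clusters clusters and take every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample


# Hypothetical population: 10 schools with 30 students each.
schools = {
    f"School {s}": [f"S{s}-student{i}" for i in range(1, 31)]
    for s in range(1, 11)
}
chosen_schools, students_sampled = cluster_sample(schools, 3, seed=7)
print(len(chosen_schools), len(students_sampled))  # 3 90
```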
MULTI-STAGE SAMPLING
- Selection of the sample is done in two or more steps or stages,
with sampling units varying in each stage.
Advantage: It is easier to generate adequate sampling frames.
Transportation costs are greatly reduced since there is some form
of clustering among the ultimate or final samples; i.e., they are in
the sample lower-stage units.
Disadvantage: Its complexity in theory may make it difficult to apply in
the field. Estimation procedures may be difficult for
non-statisticians to follow.
When to use: If no population list is available and if the population
covers a wide area.
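A two-stage design is the simplest case of multi-stage sampling and can be sketched as follows. Everything named here is hypothetical: the schools, the student labels, and the function `two_stage_sample`. Stage one selects schools at random; stage two draws a simple random sample of students inside each selected school, so the sampling unit changes between stages.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each picked cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        units = clusters[name]
        # Never ask for more units than the cluster contains.
        size = min(n_per_cluster, len(units))
        result[name] = rng.sample(units, size)
    return result


# Hypothetical population: 10 schools with 30 students each.
schools = {
    f"School {s}": [f"S{s}-student{i}" for i in range(1, 31)]
    for s in range(1, 11)
}
two_stage = two_stage_sample(schools, n_clusters=3, n_per_cluster=5, seed=11)
print({name: len(units) for name, units in two_stage.items()})
```

Only a list of schools is needed for the first stage, and student lists are needed only for the schools actually selected.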
BASIC SAMPLING TECHNIQUE OF NON-PROBABILITY SAMPLING
CONVENIENCE SAMPLING
It is a process of picking out people in the most convenient
and fastest way to get reactions immediately.
E.g. Telephone interview to get the immediate reactions.
SNOWBALL SAMPLING
• Works the same way as a referral/recruitment system.
• Starts with a few participants and continues to get larger
until the desired sample size is met.
E.g. Women who earn at least 3 million per year.
PURPOSIVE SAMPLING
• Based on the selective judgement of the researchers, which is why
it is also called Judgmental Sampling.
• The researcher sets criteria that are relevant to the topic of
their study.
E.g. Suppose you’re studying Buddhism as a religion. So, you select
people from Malaysia, where nearly a fifth of the population
practices the religion.
QUOTA SAMPLING
Researchers identify population sections or strata and decide how
many participants are required from each section, for example based
on gender, age, educational attainment, etc.
E.g. A cigarette company wants to find out what age group prefers
what brand of cigarettes in a particular city. They apply survey
quotas on the age groups of 21-30, 31-40, 41-50, and 51+.
