
Sasa Module-2

The document outlines the processes and methods of data collection and sampling design, emphasizing the importance of systematic data gathering to answer research questions accurately. It differentiates between primary and secondary data sources, describes various data collection methods, and discusses sampling techniques including random, stratified, and cluster sampling. Additionally, it highlights the criteria for determining sample size and the consequences of improperly collected data.

DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN
Anecdotal means that the information being conveyed
is based on casual observation, not scientific research.

Data collection is the process of gathering and
measuring information on variables of interest, in
an established systematic fashion that enables one
to answer stated research questions, test
hypotheses, and evaluate outcomes.
CONSEQUENCES FROM IMPROPERLY COLLECTED DATA
• Inability to answer research questions accurately.
• Inability to repeat and validate the study.
• Distorted findings resulting in wasted resources.
• Misleading other researchers to pursue fruitless avenues of investigation.
• Compromising decisions for public policy.
• Causing harm to human participants and animal subjects.
STEPS IN DATA GATHERING
1. Set the objectives for collecting data.
2. Determine the data needed based on the set objectives.
3. Determine the method to be used in data gathering and define the comprehensive data collection points.
4. Design data gathering forms to be used.
5. Collect data.
SOURCES OF DATA
PRIMARY Sources
• Provide a first-hand account of an event or time
period and are considered to be authoritative.
• They represent original thinking, reports on
discoveries or events, or they can share new
information.
• They are usually the first formal appearance of
original research.
SECONDARY Sources
• Offer an analysis, interpretation, or restatement of
primary sources and are considered to be persuasive.
• They often involve generalization, synthesis,
interpretation, commentary or evaluation in an attempt
to convince the reader of the creator's argument.
• They often attempt to describe or explain primary
sources.
The primary data can be collected by
the following five methods:
1. DIRECT PERSONAL INTERVIEWS - The researcher
has direct contact with the interviewee. The
researcher gathers information by asking questions to
the interviewee.
2. INDIRECT/QUESTIONNAIRE METHOD - The researcher
gathers information through a written set of questions
(a questionnaire) that respondents answer on their own,
without direct contact with the researcher.
Open-ended question – Has no response categories and is appropriate
for collecting subjective data.
Closed-ended question – Includes a list of response categories from
which the respondent selects an answer. It is appropriate for
collecting objective data.
3. FOCUS GROUP - A group interview of
approximately six to twelve people who share
similar characteristics or common interests. A
facilitator guides the group based on a
predetermined set of topics.
4. EXPERIMENT - A method of collecting data
where there is direct human intervention on the
conditions that may affect the values of the
variable of interest.
5. OBSERVATION - A technique that involves
systematically selecting, watching, and recording
behaviors of people or other phenomena and
aspects of the setting in which they occur, for the
purpose of gaining specified information. It
includes all methods from simple visual
observations to the use of high-level machines.
Secondary data can be obtained from the
following five sources:
1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of the government
departments.
5. Information from official publications.
SAMPLE SIZE
“How many participants should be chosen for a survey?”
• Typically denoted by n and it is always a positive integer.
• Can vary in different research settings.
Take Note!
- Representativeness, not size, is the more important consideration.
Choosing of sample size depends on:
• Non-statistical considerations – It may include availability of resources, manpower, budget, ethics and sampling frame.
• Statistical considerations – It will include the desired precision of the estimate.
THREE CRITERIA need to be specified to
determine the appropriate sample size:
1. LEVEL OF PRECISION - Also called sampling error,
the level of precision is the range in which the
true value of the population is estimated to be.
2. CONFIDENCE LEVEL - A statistical measure of
the number of times out of 100 that results can
be expected to be within a specified range.
For example, a confidence level of 90% means
that results will fall within the specified range
about 90% of the time.
To find the right z – score to use, refer to the
table:
3. DEGREE OF VARIABILITY - Depending upon the
target population and attributes under
consideration, the degree of variability varies
considerably.
- Reflects how much individual data points differ from
one another and from their mean.
- The more heterogeneous a population is, the
larger the sample size required to obtain a given
level of precision.
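The z-score for a given confidence level can also be computed instead of read from a printed table. The sketch below assumes a two-sided interval and uses Python's standard-library `statistics.NormalDist`; the helper name `z_score` is an illustrative choice, not something defined in this module.

```python
from statistics import NormalDist


def z_score(confidence_level: float) -> float:
    """Two-sided critical value z for a confidence level such as 0.95."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Split the leftover probability equally between the two tails.
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z = {z_score(level):.3f}")
```

The three values come out at roughly 1.645, 1.960, and 2.576; the 99% value is commonly rounded to 2.58.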
METHODS IN DETERMINING THE SAMPLE SIZE
• Estimating the Mean or Average
The sample size required to estimate the population mean µ with a given level of confidence and a specified margin of error e is given by
$$n \ge \left(\frac{Z\sigma}{e}\right)^2$$
where:
Z is the z-score corresponding to the level of confidence.
σ is the population standard deviation.
e is the level of precision.
Take Note: When σ is unknown, it is common practice
to conduct a preliminary survey to determine s and
use it as an estimate of σ, or to use results from
previous studies to obtain an estimate of σ. When using
this approach, the size of the sample should be at
least 30. The formula for the sample standard deviation
s is
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
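As a rough illustration of the sample standard deviation, the sketch below computes s with the n − 1 denominator. The function name `sample_std` and the `pilot` values are made up for illustration; a real preliminary survey would use at least 30 observations, as noted above.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation s, using the n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


# Hypothetical pilot measurements, invented purely for illustration.
pilot = [11.8, 12.1, 12.4, 11.9, 12.0, 12.3]
s = sample_std(pilot)

# The standard library's statistics.stdev uses the same n - 1 definition.
print(math.isclose(s, statistics.stdev(pilot)))
```

The resulting s can then stand in for σ in the sample size formula.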
Example
A soft drink machine is regulated so that the amount of drink dispensed is approximately normally distributed with a standard deviation equal to 0.5 ounce. Determine the sample size needed if we wish to be 95% confident that our sample mean will be within 0.03 ounce from the true mean.
SOLUTION:
The z-score for confidence level 95% in the z-table is 1.96.
$$n \ge \left(\frac{Z\sigma}{e}\right)^2 = \left(\frac{1.96 \times 0.5}{0.03}\right)^2 = 1067.11$$
Rounding up so that the inequality holds, we need a sample of at least 1068 for our study.
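The soft drink calculation can be checked with a few lines of code. This is a minimal sketch of the formula n ≥ (Zσ/e)²; the function name `sample_size_for_mean` is an assumption for illustration. It rounds up to the next whole number so that the inequality is satisfied.

```python
import math


def sample_size_for_mean(z: float, sigma: float, e: float) -> int:
    """Smallest whole number n satisfying n >= (z * sigma / e) ** 2."""
    if z <= 0 or sigma <= 0 or e <= 0:
        raise ValueError("z, sigma and e must all be positive")
    return math.ceil((z * sigma / e) ** 2)


raw = (1.96 * 0.5 / 0.03) ** 2
print(round(raw, 2))                          # unrounded bound from the formula
print(sample_size_for_mean(1.96, 0.5, 0.03))  # rounded up to a whole number
```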
Estimating Proportion (Infinite Population)
The sample size required to obtain a confidence interval
for p with specified margin of error e is given by
$$n \ge \left(\frac{Z}{e}\right)^2 p(1 - p)$$
Where:
Z is the z-score corresponding to the level of confidence.
e is the level of precision.
p is the population proportion.
Note: There is a dilemma in this formula: it depends on $p = \frac{x}{N}$, which we know only after we have taken the sample.
Example
Suppose we are doing a study on the inhabitants of a large town and want to find out how many households serve breakfast in the mornings. We don’t have much information on the subject to begin with, so we’re going to assume that half of the families serve breakfast: this gives us maximum variability. So p = 0.5. We want 99% confidence and at least 1% precision.
SOLUTION:
The z-score for confidence level 99% in the z-table is 2.58.
$$n \ge \left(\frac{Z}{e}\right)^2 p(1 - p) = \left(\frac{2.58}{0.01}\right)^2 (0.5)(1 - 0.5) = 16641$$
We need a sample of 16,641 for our study.
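The breakfast example can be reproduced with a short function. This is a sketch only: the name `sample_size_for_proportion` and the default of p = 0.5 (the maximum-variability assumption used in the example) are illustrative choices.

```python
import math


def sample_size_for_proportion(z: float, e: float, p: float = 0.5) -> int:
    """Smallest whole number n satisfying n >= (z / e) ** 2 * p * (1 - p)."""
    if z <= 0 or e <= 0:
        raise ValueError("z and e must be positive")
    if not 0 <= p <= 1:
        raise ValueError("p must lie between 0 and 1")
    raw = (z / e) ** 2 * p * (1 - p)
    # Round away floating-point noise before taking the ceiling, so an
    # exact whole-number result is not pushed up by a tiny rounding error.
    return math.ceil(round(raw, 6))


print(sample_size_for_proportion(z=2.58, e=0.01, p=0.5))
```

Because p(1 − p) is largest at p = 0.5, any other assumed proportion gives a smaller required sample for the same z and e.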
SLOVIN’S FORMULA
Slovin’s formula is used to calculate the sample size
n given the population size and error. It is computed
as
$$n \ge \frac{N}{1 + Ne^2}$$
Where:
N is the total population.
e is the level of precision.
Example
A researcher plans to conduct a survey about food preference of BS Stat students. If the population of students is 1000, find the sample size if the error is 5%.
SOLUTION:
$$n \ge \frac{N}{1 + Ne^2} = \frac{1000}{1 + 1000(0.05)^2} = 285.71$$
The researcher needs to survey 286 BS Stat students.
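Slovin's formula needs only the population size and the error, so it translates directly into code. The function name `slovin` below is an illustrative assumption; the result is rounded up to a whole number of respondents.

```python
import math


def slovin(population_size: int, e: float) -> int:
    """Smallest whole number n satisfying n >= N / (1 + N * e ** 2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < e < 1:
        raise ValueError("e must be a proportion between 0 and 1")
    return math.ceil(population_size / (1 + population_size * e ** 2))


print(slovin(1000, 0.05))  # the BS Stat survey example
```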
TWO TYPES OF SAMPLE
Random Non-Random
BASIC SAMPLING TECHNIQUE OF PROBABILITY SAMPLING
SIMPLE RANDOM SAMPLING
- Most basic method of drawing a probability sample.
- Assigns equal probabilities of selection to each possible sample.
Advantage: It is very simple and easy to use.
Disadvantage: The sample chosen may be distributed over a wide
geographic area.
When to use: This is preferable
if the population is not widely spread
geographically. It is more appropriate
if the population is more or less
homogeneous with respect to the
characteristics under study.
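A simple random sample can be drawn with Python's standard `random` module. The sketch below is one possible way to do it; the sampling frame of 500 student labels is hypothetical, and the `seed` parameter is included only to make a run repeatable.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical sampling frame of 500 students.
students = [f"student_{i:03d}" for i in range(1, 501)]
chosen = simple_random_sample(students, 50, seed=42)
print(len(chosen), len(set(chosen)))  # sample size and number of distinct units
```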
SYSTEMATIC RANDOM SAMPLING
- This method uses the kth interval formula.
- The sampling interval is the standard distance between elements
chosen for the sample.
Advantages - Easy to sample and administer in the field.
- Samples are evenly distributed across the population.
Disadvantage - May lack precision if
unexpected periodicity exists.
When to use - Advisable to use if the
ordering of the population is
essentially random and when
stratification with numerous data is used.
Obtaining a Systematic Random Sample
1. Decide on a method of assigning a unique serial number, from 1 to N,
to each one of the elements in the population.
2. Compute for the sampling interval $k = N/n$.
3. Select a number r, from 1 to k, using a randomization mechanism. The
element in the population assigned to this number is the first element of
the sample. The other elements of the sample are those assigned to the
numbers r + k, r + 2k, and so on until you get a sample of size n.
EXAMPLE
Select a sample of 50 students from 500 students. Under this method, every kth
item is picked from the sampling frame, where k = 500/50 = 10.
We start the sample from a random number i between 1 and k and then take every
kth unit after it. Suppose the random number i is 5; then we select 5, 15, 25,
35, 45, ...
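The systematic procedure can be sketched in code as follows. This is an illustration under assumptions: the function name `systematic_sample` and its `start` parameter are hypothetical, and the interval is taken as the whole-number part of N/n.

```python
import random


def systematic_sample(population, n, start=None, seed=None):
    """Take every k-th element after a random start between 1 and k."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the population size")
    k = N // n  # sampling interval (whole-number part of N / n)
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    # Serial numbers run from 1 to N, so subtract 1 for list indexing.
    serials = [start + j * k for j in range(n)]
    return [population[s - 1] for s in serials]


frame = list(range(1, 501))  # serial numbers 1..500
sample = systematic_sample(frame, 50, start=5)
print(sample[:5])
```

Leaving `start` unset lets the function pick the random start itself, which is what the procedure calls for in practice.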
STRATIFIED RANDOM SAMPLING
- It is obtained by separating the population into non-overlapping
groups called strata and then obtaining a simple random sample from
each stratum.
- The individuals within each stratum should be homogeneous (or
similar) in some way.
Advantages
Precision: Stratification enhances the accuracy of population estimates.
Flexible Sampling Designs: Different sampling methods can be applied to
each stratum.
Ease of Use: Functions similarly to random sampling.
Disadvantages
Data Availability: Stratification variables may be hard to obtain, especially
in homogeneous populations.
Representation Issues: Some strata may lack adequate representation.
High Costs: Transportation costs can escalate if the population is
geographically dispersed.
EXAMPLE
A sample of 50 students is to be drawn from a population consisting of
500 students belonging to two institutions A and B. The number of
students in the institution A is 200 and the institution B is 300. How will
you draw the sample using proportional allocation?
Solution: There are two strata in this case.
Given: $N_1 = 200$, $N_2 = 300$, $N = 500$, $n = 50$
$$n_1 = \frac{n}{N} N_1 = \frac{50}{500}(200) = 20$$
$$n_2 = \frac{n}{N} N_2 = \frac{50}{500}(300) = 30$$
The sample sizes are 20 from A and 30 from B. Then the units from each
institution are to be selected by simple random sampling.
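Proportional allocation followed by simple random sampling within each stratum could be sketched as below. The function names and the student labels are hypothetical. Note one assumption: each stratum size is rounded to the nearest whole number, so in cases that do not divide evenly the stratum sizes may not add up to exactly n.

```python
import random


def proportional_allocation(stratum_sizes, n):
    """n_h = (n / N) * N_h for each stratum, rounded to a whole number."""
    N = sum(stratum_sizes.values())
    return {name: round(n * size / N) for name, size in stratum_sizes.items()}


def stratified_sample(strata, n, seed=None):
    """Simple random sample within each stratum, sized proportionally."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {
        name: rng.sample(units, allocation[name])
        for name, units in strata.items()
    }


# Hypothetical frames for institutions A (200 students) and B (300 students).
strata = {
    "A": [f"A{i}" for i in range(1, 201)],
    "B": [f"B{i}" for i in range(1, 301)],
}
print(proportional_allocation({"A": 200, "B": 300}, 50))
picked = stratified_sample(strata, 50, seed=1)
print({name: len(units) for name, units in picked.items()})
```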
CLUSTER SAMPLING
- You take the sample from naturally occurring groups in your population.
- The clusters are constructed such that the sampling units are heterogeneous within the cluster and homogeneous among the clusters.
1. Divide the population into non-overlapping clusters.
2. Number the clusters in the population from 1 to N.
3. Select n distinct numbers from 1 to N using a randomization mechanism. The selected clusters are the clusters associated with the selected numbers.
4. The sample will consist of all the elements in the selected clusters.
Advantage: There is no need to come up with a list of units in the population; all
that is needed is a list of the clusters. It is also less costly since the
elements are physically closer together.
Disadvantage: In actual field applications, adjacent households tend to have more
similar characteristics than households that are far apart.
When to use: If the population can be grouped into clusters where individual
population elements are known to be different with respect to the characteristics
under study, this is preferable to use.
Example:
Randomly select 3 schools from
the population, then sample all
students in each school
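The school example can be sketched in code: pick whole clusters at random, then keep every element inside them. The function name `cluster_sample` and the frame of 10 schools with 40 students each are hypothetical, chosen only to make the sketch runnable.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample


# Hypothetical population: 10 schools with 40 students each.
schools = {
    f"school_{i}": [f"s{i}_{j}" for j in range(1, 41)]
    for i in range(1, 11)
}
chosen, sample = cluster_sample(schools, 3, seed=7)
print(chosen, len(sample))
```

Only the list of schools is needed up front; the students are listed only for the schools that end up selected.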
MULTI-STAGE SAMPLING
- Selection of the sample is done in two or more steps or stages,
with sampling units varying in each stage.
Advantage: It is easier to generate adequate sampling frames.
Transportation costs are greatly reduced since there is some form
of clustering among the ultimate or final samples; i.e., they are in
the same lower-stage units.
Disadvantage: Its complexity in theory may be difficult to apply in
the field. Estimation procedures may be difficult for
non-statisticians to follow.
When to use: If no population list is available and if the population
covers a wide area.
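A two-stage design is the simplest case of multi-stage sampling. The sketch below assumes schools as first-stage units and students as second-stage units; the function name `two_stage_sample` and the frame are hypothetical, and real designs may use different selection methods at each stage.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each."""
    rng = random.Random(seed)
    stage_one = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in stage_one:
        units = clusters[name]
        # Take every unit if a cluster is smaller than the requested size.
        size = min(n_per_cluster, len(units))
        sample[name] = rng.sample(units, size)
    return sample


# Hypothetical frame: 10 schools with 40 students each.
schools = {
    f"school_{i}": [f"s{i}_{j}" for j in range(1, 41)]
    for i in range(1, 11)
}
picked = two_stage_sample(schools, n_clusters=3, n_per_cluster=10, seed=7)
print({name: len(units) for name, units in picked.items()})
```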
BASIC SAMPLING TECHNIQUE OF NON-PROBABILITY SAMPLING
CONVENIENCE SAMPLING
It is a process of picking out people in the most convenient and fastest way to get reactions immediately.
E.g. Telephone interview to get the immediate reactions.
SNOWBALL SAMPLING
• Works the same way as a referral/recruitment system.
• Starts with a few participants and continues to get larger until the desired sample size is met.
E.g. women who earn at least 3 million per year
PURPOSIVE SAMPLING
• Based on the selective judgement of the researchers, which is why it is also called Judgmental Sampling.
• The researcher sets criteria that are relevant to the topic of their study.
E.g. Suppose you’re studying Buddhism as a religion. So, you select people from Malaysia, where nearly a fifth of the population practices the religion.
QUOTA SAMPLING
Researchers identify population sections or strata and decide how many participants are required from each section, for example based on gender, age, educational attainment, etc.
E.g. a cigarette company wants to find out what age group prefers what brand of cigarettes in a particular city. They apply survey quota on the age groups of 21-30, 31-40, 41-50, and 51+.
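Quota sampling with age groups can be sketched as accepting respondents in arrival order until every group's quota is filled. Everything named here (`quota_sample`, `age_group`, the respondent list, and the quota counts) is hypothetical and made up for illustration; only the age brackets come from the example.

```python
def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until each group's quota is full."""
    remaining = dict(quotas)
    accepted = []
    for person in respondents:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota has been met
    return accepted


def age_group(person):
    """Map a respondent to one of the age brackets, or None if outside them."""
    age = person["age"]
    if 21 <= age <= 30:
        return "21-30"
    if 31 <= age <= 40:
        return "31-40"
    if 41 <= age <= 50:
        return "41-50"
    if age >= 51:
        return "51+"
    return None


# Hypothetical respondents in the order they were approached.
ages = [25, 34, 19, 52, 28, 45, 33, 61, 29, 47]
respondents = [{"id": i, "age": a} for i, a in enumerate(ages, start=1)]
quotas = {"21-30": 2, "31-40": 1, "41-50": 1, "51+": 1}
accepted = quota_sample(respondents, quotas, age_group)
print([p["age"] for p in accepted])
```

Because respondents are taken in whatever order they arrive, the selection is not random, which is what makes this a non-probability technique.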
