
Data Collection UGC - NET Paper-1

1. Data collection involves gathering numerical data from various sources to aid statistical analysis and policymaking. Common collection methods include interviews, questionnaires, and surveys.
2. Several sampling methods exist to select subsets of a population for data collection including simple random sampling, stratified sampling, cluster sampling, and non-probability methods. Proper sampling aims to reduce bias and error.
3. Potential sources of bias and error in sampling include failing to follow agreed rules, omitting hard-to-reach groups, low response rates, and using outdated sample frames. Selection error can occur when only interested respondents choose to participate.

DATA COLLECTION

• Collection of data is a statistical requirement.


Statistics are sets or series of numerical data that
facilitate policy-making. In other words, numerical
data are the basis of statistics. Numerical data are
processed and manipulated before they aid
decision making. Hence, numerical data are the
raw material of statistics. This raw material
can originate from various sources, and statisticians
and analysts collect it using different
methods.
METHODS OF COLLECTION

TECHNIQUE: INTERVIEWS
KEY FACTS:
• Interviews can be conducted in person or over the telephone
• Interviews can be done formally (structured), semi-structured, or informally
• Questions should be focused, clear, and encourage open-ended responses
• Interviews are mainly qualitative in nature
EXAMPLE: One-on-one conversation with the parent of an at-risk youth who can help you understand the issue

TECHNIQUE: QUESTIONNAIRES AND SURVEYS
KEY FACTS:
• Responses can be analyzed with quantitative methods by assigning numerical values to Likert-type scales
• Results are generally easier (than qualitative techniques) to analyze
• Pretest/Posttest can be compared and analyzed
EXAMPLE: Results of a satisfaction survey or opinion survey
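As a rough illustration of assigning numerical values to Likert-type responses and comparing a pretest with a posttest, here is a minimal Python sketch. The five-point coding, the response labels, and the sample answers are all assumptions made for the example and do not come from the source.

```python
from statistics import mean

# Assumed 5-point coding; the labels and values are illustrative only.
LIKERT_SCORES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

def to_scores(responses):
    """Convert Likert labels to their numerical values."""
    return [LIKERT_SCORES[label] for label in responses]

# Hypothetical answers from the same four respondents, before and after.
pretest = ["Disagree", "Neutral", "Agree", "Disagree"]
posttest = ["Agree", "Agree", "Strongly agree", "Neutral"]

pre_mean = mean(to_scores(pretest))
post_mean = mean(to_scores(posttest))
change = post_mean - pre_mean
print(f"pretest mean={pre_mean}, posttest mean={post_mean}, change={change}")
```

Comparing means is only one possible summary; whether a mean is appropriate for ordinal Likert data is a judgement the analyst has to make.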
SAMPLING
NEED FOR SAMPLING
• Sampling is used in practice for a variety of reasons, such as:
• Sampling can save time and money. A sample study is usually less
expensive than a census study and produces results relatively
quickly.
• Sampling may enable more accurate measurements, because a sample
study is generally conducted by trained and experienced
investigators.
• Sampling is the only option when the population contains infinitely
many members.
• Sampling is the only option when a test involves the
destruction of the item under study.
• Sampling usually makes it possible to estimate the sampling error and, thus,
assists in obtaining information about some characteristic of
the population.
METHOD OF SAMPLING
• Simple random sampling: In this case each individual
is chosen entirely by chance and each member of the
population has an equal chance, or probability, of
being selected. One way of obtaining a random sample
is to give each individual in a population a number, and
then use a table of random numbers to decide which
individuals to include. For example, if you have a
sampling frame of 1000 individuals, labelled 0 to 999,
use groups of three digits from the random number
table to pick your sample. So, if the first three digits
from the random number table were 094, select the
individual labelled “94”, and so on.
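The procedure above can be sketched in Python. This is a minimal illustration in which a pseudo-random number generator stands in for the printed random number table; the function name, the sample size of 50, and the seed are assumptions chosen for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals so that each one is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement

# Sampling frame of 1000 individuals labelled 0 to 999, as in the text.
frame = list(range(1000))
sample = simple_random_sample(frame, n=50, seed=1)
print(sorted(sample)[:5])
```

Sampling without replacement means no individual can be picked twice, which matches skipping a label that has already come up in the table.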
Systematic sampling
• Individuals are selected at regular intervals from
the sampling frame. The intervals are chosen to
ensure an adequate sample size. If you need a
sample of size n from a population of size x, you
should select every (x/n)th individual for the
sample. For example, if you wanted a sample size
of 100 from a population of 1000, select every
1000/100 = 10th member of the sampling frame.
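A minimal Python sketch of the interval rule above. The random starting point within the first interval is a common refinement assumed here, not something the text specifies, and the function name is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th individual from the frame, where k = len(frame) // n."""
    k = len(frame) // n  # sampling interval, x/n in the text
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start within the first interval
    return frame[start::k][:n]  # trim in case len(frame) is not a multiple of n

# Population of 1000 and sample of 100, so every 10th member is selected.
frame = list(range(1000))
sample = systematic_sample(frame, n=100, seed=0)
print(sample[:5])
```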
Stratified sampling
• In this method, the population is first divided into subgroups (or strata)
whose members share a similar characteristic. It is used when we might reasonably
expect the measurement of interest to vary between the different
subgroups, and we want to ensure representation from all the subgroups.
For example, in a study of stroke outcomes, we may stratify the population
by sex, to ensure equal representation of men and women. The study
sample is then obtained by taking equal sample sizes from each stratum.
In stratified sampling, it may also be appropriate to choose non-equal
sample sizes from each stratum. For example, in a study of the health
outcomes of nursing staff in a county, if there are three hospitals each
with different numbers of nursing staff (hospital A has 500 nurses, hospital
B has 1000 and hospital C has 2000), then it would be appropriate to
choose the sample numbers from each hospital proportionally (e.g. 10
from hospital A, 20 from hospital B and 40 from hospital C). This ensures a
more realistic and accurate estimation of the health outcomes of nurses
across the county.
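The proportional allocation in the hospital example can be worked through in Python. This is a sketch under stated assumptions: the total sample of 70 is simply the sum of the figures in the text (10 + 20 + 40), the nurse identifiers are made up, and simple rounding is used, which in general may not add up exactly to the requested total.

```python
import random

def proportional_allocation(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in stratum_sizes.items()
    }

def stratified_sample(strata, allocation, seed=None):
    """Draw a simple random sample of the allocated size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, allocation[name]) for name, members in strata.items()}

# Nursing staff per hospital, as given in the text.
sizes = {"A": 500, "B": 1000, "C": 2000}
allocation = proportional_allocation(sizes, total_sample=70)

# Hypothetical staff lists, one per hospital (stratum).
strata = {name: [f"{name}-{i}" for i in range(size)] for name, size in sizes.items()}
sample = stratified_sample(strata, allocation, seed=0)
print(allocation)
```

For equal representation, as in the stroke example, the same `stratified_sample` function could be called with an allocation that assigns the same number to every stratum.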
Cluster sampling
• In a clustered sample, subgroups of the population are used as the
sampling unit, rather than individuals. The population is divided
into subgroups, known as clusters, which are randomly selected to
be included in the study. Clusters are usually already defined, for
example individual GP practices or towns could be identified as
clusters. In single-stage cluster sampling, all members of the chosen
clusters are then included in the study. In two-stage cluster
sampling, a selection of individuals from each cluster is then
randomly selected for inclusion. Clustering should be taken into
account in the analysis. The General Household Survey, which is
undertaken annually in England, is a good example of a (one-stage)
cluster sample. All members of the selected households (clusters)
are included in the survey.
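The difference between single-stage and two-stage cluster sampling can be sketched as follows. The GP practices, patient identifiers, and cluster counts are hypothetical data invented for the illustration.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly select whole clusters, then keep all or some of their members.

    per_cluster=None keeps every member (single-stage);
    an integer draws that many members at random from each cluster (two-stage).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    sample = {}
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            sample[name] = list(members)
        else:  # stage 2: pick individuals within the chosen cluster
            sample[name] = rng.sample(members, min(per_cluster, len(members)))
    return sample

# Hypothetical population: 20 GP practices with 30 patients each.
clusters = {
    f"practice-{i}": [f"practice-{i}/patient-{j}" for j in range(30)]
    for i in range(20)
}
single_stage = cluster_sample(clusters, n_clusters=5, seed=3)
two_stage = cluster_sample(clusters, n_clusters=5, per_cluster=10, seed=3)
print(len(single_stage), len(two_stage))
```

The sketch only covers selection; accounting for clustering in the analysis, as the text advises, is a separate step.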
Non-Probability Sampling Methods
• Convenience sampling is perhaps the easiest
method of sampling, because participants are
selected based on availability and willingness to
take part. Useful results can be obtained, but the
results are prone to significant bias, because
those who volunteer to take part may be
different from those who choose not to
(volunteer bias), and the sample may not be
representative of other characteristics, such as
age or sex. Note: volunteer bias is a risk of all
non-probability sampling methods.
Quota sampling
• This method of sampling is often used by market
researchers. Interviewers are given a quota of subjects of a
specified type to attempt to recruit. For example, an
interviewer might be told to go out and select 20 adult
men, 20 adult women, 10 teenage girls and 10 teenage
boys so that they could interview them about their
television viewing. Ideally the quotas chosen would
proportionally represent the characteristics of the
underlying population.
• Whilst this has the advantage of being relatively
straightforward and potentially representative, the chosen
sample may not be representative of other characteristics
that weren’t considered (a consequence of the non-random
nature of sampling).
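A minimal sketch of how an interviewer might fill the quotas in the television-viewing example. The stream of passers-by is hypothetical and deliberately artificial (categories simply cycle in order); the point is only that people are recruited in the order they turn up, not at random.

```python
from collections import Counter

def quota_sample(stream, quotas):
    """Recruit people in arrival order until every category's quota is filled."""
    remaining = dict(quotas)
    recruited = []
    for person_id, category in stream:
        if remaining.get(category, 0) > 0:
            recruited.append((person_id, category))
            remaining[category] -= 1
        if not any(remaining.values()):  # all quotas filled, stop recruiting
            break
    return recruited

# Quotas from the example in the text.
quotas = {"adult man": 20, "adult woman": 20, "teenage girl": 10, "teenage boy": 10}

# Hypothetical passers-by: (id, category) pairs in the order they are met.
categories = list(quotas)
passersby = [(i, categories[i % len(categories)]) for i in range(200)]

recruited = quota_sample(passersby, quotas)
counts = dict(Counter(category for _, category in recruited))
print(counts)
```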
JUDGEMENT SAMPLING
• Judgement (or Purposive) Sampling
• Also known as selective, or subjective, sampling, this technique
relies on the judgement of the researcher when choosing who to
ask to participate. Researchers may thus implicitly choose a
“representative” sample to suit their needs, or specifically approach
individuals with certain characteristics. This approach is often used
by the media when canvassing the public for opinions and in
qualitative research.
• Judgement sampling has the advantage of being time- and
cost-effective to perform whilst resulting in a range of responses
(particularly useful in qualitative research). However, in addition to
volunteer bias, it is also prone to errors of judgement by the
researcher and the findings, whilst being potentially broad, will not
necessarily be representative.
SNOWBALL SAMPLING
• This method is commonly used in social sciences when
investigating hard-to-reach groups. Existing subjects are
asked to nominate further subjects known to them, so the
sample increases in size like a rolling snowball. For example,
when carrying out a survey of risk behaviours amongst
intravenous drug users, participants may be asked to
nominate other users to be interviewed.
• Snowball sampling can be effective when a sampling frame
is difficult to identify. However, by selecting friends and
acquaintances of subjects already investigated, there is a
significant risk of selection bias (choosing a large number of
people with similar characteristics or views to the initial
individual identified).
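The nomination process can be sketched as a breadth-first walk over a referral network. The network below is invented for the illustration; note that two people who are not connected to the initial subject are never reached, which is one way the selection bias described above shows up.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Grow a sample by following nominations outward from the seed subjects."""
    start = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep their order
    seen = set(start)
    queue = deque(start)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for nominee in referrals.get(person, []):
            if nominee not in seen:  # each person is recruited at most once
                seen.add(nominee)
                queue.append(nominee)
    return sample

# Hypothetical referral network: each subject nominates people known to them.
referrals = {
    "p1": ["p2", "p3"],
    "p2": ["p3", "p4"],
    "p3": ["p5"],
    "p4": [],
    "p5": ["p1"],
    "p6": ["p7"],  # p6 and p7 are not connected to p1
}
sample = snowball_sample(referrals, seeds=["p1"], max_size=10)
print(sample)
```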
BIAS
• Bias in sampling
• There are five important potential sources of bias that
should be considered when selecting a sample, irrespective
of the method used. Sampling bias may be introduced
when:
• Any pre-agreed sampling rules are deviated from
• People in hard-to-reach groups are omitted
• Selected individuals are replaced with others, for example if
they are difficult to contact
• There are low response rates
• An out-of-date list is used as the sample frame (for
example, if it excludes people who have recently moved to
an area)
ERROR
• Population Specification Error—This error
occurs when the researcher does not
understand who they should survey. For
example, imagine a survey about breakfast
cereal consumption. Who to survey? It might
be the entire family, the mother, or the
children. The mother might make the
purchase decision, but the children influence
her choice.
SAMPLE FRAME ERROR
• Sample Frame Error—A frame error occurs
when the wrong sub-population is used to
select a sample. A classic frame error occurred
in the 1936 US presidential election between
Roosevelt and Landon. The sample frame was
drawn from car registrations and telephone
directories. In 1936, many Americans did not
own cars or telephones, and those who did
were largely Republicans. The results wrongly
predicted a Republican victory.
SELECTION ERROR
• Selection Error—This occurs when respondents
self-select their participation in the study – only
those that are interested respond. Selection error
can be controlled by going to extra lengths to get
participation. A typical survey process includes
initiating pre-survey contact requesting
cooperation, actual surveying, and post-survey
follow-up. If a response is not received, a second
survey request follows, and perhaps interviews
using alternate modes such as telephone or
person-to-person.
NON RESPONSE
• Non-Response—Non-response errors occur
when respondents differ from those
who do not respond. This may occur because
either the potential respondent was not
contacted or they refused to respond. The
extent of this non-response error can be
checked through follow-up surveys using
alternate modes.
SAMPLING ERROR
• Sampling Errors—These errors occur because
of variation in the number or
representativeness of the sample that
responds. Sampling errors can be controlled
by (1) careful sample designs, (2) large
samples, and (3) multiple contacts to ensure
representative response.
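The effect of point (2), larger samples, can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is synthetic (10,000 values drawn from a normal distribution), error is measured as the average absolute gap between the sample mean and the population mean, and only simple random sampling is considered. Sample design and multiple contacts are not modelled.

```python
import random
import statistics

def mean_abs_error(population, n, trials=200, seed=0):
    """Average absolute gap between a sample mean and the population mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    gaps = [
        abs(statistics.fmean(rng.sample(population, n)) - true_mean)
        for _ in range(trials)
    ]
    return statistics.fmean(gaps)

# Synthetic population; the distribution and its parameters are assumptions.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

errors = {n: mean_abs_error(population, n) for n in (25, 100, 400)}
for n, err in errors.items():
    print(f"n={n}: average error of the sample mean = {err:.2f}")
```

In runs like this the average error shrinks as the sample grows, roughly in proportion to one over the square root of the sample size.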
