chapter13-non-sampling-errors
It is a general assumption in the sampling theory that the true value of each unit in the population can be
obtained and tabulated without any errors. In practice, this assumption may be violated due to several
reasons and practical constraints. This results in errors in the observations as well as in the tabulation.
Such errors which are due to the factors other than sampling are called non-sampling errors.
Non-sampling errors are unavoidable in censuses and surveys. The data collected by complete
enumeration in a census are free from sampling error but are not free from non-sampling errors.
The data collected through sample surveys can have both sampling errors and non-sampling
errors. The non-sampling errors arise from factors other than the inductive process of inferring
about the population from a sample.
In general, the sampling error decreases as the sample size increases, whereas the non-sampling error
increases as the sample size increases.
In some situations, the non-sampling errors may be large and deserve greater attention than the sampling
error.
In any survey, it is assumed that the value of the characteristic to be measured has been defined precisely
for every population unit. Such a value exists and is unique. This is called the true value of the
characteristic for the population unit. In practical applications, data collected on the selected units are
called survey values and they differ from the true values. Such difference between the true and observed
values is termed as the observational error or response error. Such an error arises mainly from the lack
of precision in measurement techniques and variability in the performance of the investigators.
The following sources are not exhaustive, but they indicate the possible sources of error.
(a) Specification errors: These errors occur at planning stage due to various reasons, e.g., inadequate
and inconsistent specification of data with respect to the objectives of surveys/census, omission or
duplication of units due to imprecise definitions, faulty method of enumeration/interview/ambiguous
schedules etc.
(b) Ascertainment errors: These errors occur at field stage due to various reasons, e.g., lack of
trained and experienced investigators, recall errors and other types of errors in data collection, lack of
adequate inspection and lack of supervision of primary staff etc.
(c) Tabulation errors: These errors occur at tabulation stage due to various reasons, e.g., inadequate
scrutiny of data, errors in processing the data, errors in publishing the tabulated results, graphs etc.
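The ascertainment errors may be further divided into:
(i) Coverage errors arising from the omission or duplication of units.
(ii) Content errors relating to wrong entries due to errors on the part of investigators and
respondents.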
Same division can be made in the case of tabulation error also. There is a possibility of missing data or
repetition of data at tabulation stage which gives rise to coverage errors and also of errors in coding,
calculations etc. which gives rise to content errors.
Total error: Difference between the sample survey estimate and the parametric true value being
estimated is termed as total error.
Sampling error:
If complete accuracy can be ensured in the procedures such as determination, identification and
observation of sample units and the tabulation of collected data, then the total error would consist only
of the error due to sampling, termed as sampling error.
The measure of sampling error is the mean squared error (MSE). The MSE is the expected value of the
squared difference between the estimator and the true value, and it has two components:
- square of sampling bias.
- sampling variance.
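The two components can be checked numerically. The following is a minimal sketch, assuming a hypothetical four-unit population and a deliberately biased estimator (0.9 times the sample mean) chosen only for illustration. It enumerates every sample of size 2 drawn without replacement and confirms that the MSE equals the squared sampling bias plus the sampling variance.

```python
from itertools import combinations
from statistics import fmean

population = [2.0, 4.0, 6.0, 9.0]      # hypothetical true values
true_mean = fmean(population)

def estimator(sample):
    # Deliberately biased estimator, used only to make the bias term non-zero.
    return 0.9 * fmean(sample)

# All equally likely samples of size 2 (simple random sampling without replacement).
estimates = [estimator(s) for s in combinations(population, 2)]

expected = fmean(estimates)                               # expected value of the estimator
bias = expected - true_mean                               # sampling bias
variance = fmean([(e - expected) ** 2 for e in estimates])  # sampling variance
mse = fmean([(e - true_mean) ** 2 for e in estimates])    # mean squared error

print(f"bias^2 + variance = {bias ** 2 + variance:.6f}, MSE = {mse:.6f}")
```

Because the enumeration covers every possible sample, the identity holds exactly (up to floating-point rounding) rather than approximately, as it would in a simulation.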
If the results are also subjected to non-sampling errors, then the total error would have both sampling and
non-sampling error.
Total bias:
The difference between the expected value of the estimator and the true value is termed as total bias. This
consists of sampling bias and non-sampling bias.
Non-sampling bias:
Sampling Theory| Chapter 13 | Non Sampling Errors| Shalabh, IIT Kanpur
For the sake of simplicity, assume that the two following steps are involved in the randomization:
(i) for selecting the sample of units and
(ii) for selecting the survey personnel.
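Let $\hat{Y}_{sr}$ be the estimate of the population mean based on the $s$-th sample of units supplied by the
$r$-th sample of the survey personnel. The conditional expected value of $\hat{Y}_{sr}$ taken over the second
step of randomization, for a fixed sample of units, is
$$E_r(\hat{Y}_{sr}) = \hat{Y}_{so},$$
which may be different from $\hat{Y}_s$ based on true values of the units in the sample. The expected value
of $\hat{Y}_{so}$ taken over the first step of randomization is
$$E_s(\hat{Y}_{so}) = \bar{Y}^*,$$
which is the value for which an unbiased estimator can be had by the specified survey process. The value
$\bar{Y}^*$ may be different from the true population mean $\bar{Y}$, and the total bias is given as
$$B_t(\hat{Y}_{sr}) = \bar{Y}^* - \bar{Y}.$$
The sampling bias is $B_s(\hat{Y}_s) = E_s(\hat{Y}_s) - \bar{Y}$, and the non-sampling bias is the remainder,
$$B_r(\hat{Y}_{sr}) = B_t(\hat{Y}_{sr}) - B_s(\hat{Y}_s) = \bar{Y}^* - E_s(\hat{Y}_s) = E_s(\hat{Y}_{so} - \hat{Y}_s).$$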
In the case of complete enumeration, there is no sampling bias and the total bias consists only of non-
sampling bias.
In the case of sample surveys, the total bias consists of both the sampling bias and the non-sampling bias.
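The split of the total bias can be illustrated with a small exact calculation. This is a sketch under stated assumptions: the population values, the three enumerators and their additive reporting offsets are all hypothetical, and every sample of units and every enumerator is taken to be equally likely. The estimator is the plain sample mean, so the sampling bias comes out as zero and the whole of the total bias is non-sampling bias.

```python
from itertools import combinations
from statistics import fmean

true_values = [2.0, 4.0, 6.0, 9.0]   # hypothetical true values of the population units
offsets = [0.5, 1.0, 0.0]            # hypothetical additive reporting error of each enumerator

true_mean = fmean(true_values)
samples = list(combinations(true_values, 2))   # first step: samples of units

# Estimate computed from the true values of each sample (no non-sampling error).
est_true = [fmean(s) for s in samples]

# Second step averaged out: expected estimate over enumerators, for each fixed sample.
est_expected = [fmean([fmean(s) + d for d in offsets]) for s in samples]

value_estimated = fmean(est_expected)          # expectation over both steps
total_bias = value_estimated - true_mean
sampling_bias = fmean(est_true) - true_mean
non_sampling_bias = total_bias - sampling_bias

print(f"total bias        = {total_bias:.4f}")
print(f"sampling bias     = {sampling_bias:.4f}")
print(f"non-sampling bias = {non_sampling_bias:.4f}")
```

With additive offsets, the non-sampling bias equals the average enumerator offset. Replacing the list of samples with the single sample containing every unit gives the census case, in which the sampling bias vanishes by construction.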
The non-sampling bias in a census can be estimated by surveying a sample of units in the population
using better techniques of data collection and compilation than those adopted under general census
conditions. Post-enumeration surveys, which are usually conducted just after the
census for studying the quality of census data, may be used for this purpose.
In a large scale sample survey, the ascertainment bias can be estimated by resurveying a sub-sample of
the original sample using better survey techniques.
Another method of checking survey data is to compare the values of the units obtained in the two surveys
and to reconcile the discrepant figures by further investigation. This method of checking is termed a
reconciliation (check) survey.
Non-sampling variance:
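The MSE of $\hat{Y}_{sr}$, based on the $s$-th sample of units and supplied by the $r$-th sample of the survey
personnel, is
$$\mathrm{MSE}(\hat{Y}_{sr}) = E(\hat{Y}_{sr} - \bar{Y})^2,$$
where $\bar{Y}$ is the true value being estimated. This takes into account both the sampling and the
non-sampling errors, i.e.,
$$\mathrm{MSE}(\hat{Y}_{sr}) = \mathrm{Var}(\hat{Y}_{sr}) + (\bar{Y}^* - \bar{Y})^2 = E(\hat{Y}_{sr} - \bar{Y}^*)^2 + (\bar{Y}^* - \bar{Y})^2,$$
where $\bar{Y}^*$ is the expected value of the estimator taken over both steps of randomization, so that the
second term is the square of the total bias.
Note that
$$\mathrm{Var}(\hat{Y}_{sr}) = E_s E_r(\hat{Y}_{sr} - \hat{Y}_{so})^2 + E_s(\hat{Y}_{so} - \bar{Y}^*)^2,$$
where $\hat{Y}_{so} = E_r(\hat{Y}_{sr})$, and $E_s$ and $E_r$ denote expectation over the samples of units and over the
samples of survey personnel, respectively.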
In the complete census, the MSE is composed of only the non-sampling variance and square of the non-
sampling bias.
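The decomposition can be verified by exact enumeration. The sketch below assumes a hypothetical population and three hypothetical enumerators, each of whom scales the reported figures by a fixed factor. It splits the total variance into a term due to the survey personnel and a term due to the choice of units, and then repeats the calculation for a census, where only one "sample" (the whole population) exists.

```python
from itertools import combinations
from statistics import fmean

true_values = [2.0, 4.0, 6.0, 9.0]   # hypothetical population
factors = [1.00, 1.10, 0.95]         # hypothetical reporting factor of each enumerator

def decompose(samples):
    """Return (total variance, personnel term, unit term, squared total bias, MSE)."""
    true_mean = fmean(true_values)
    # Every (sample of units, enumerator) pair is treated as equally likely.
    estimates = {s: [f * fmean(s) for f in factors] for s in samples}
    cond_mean = {s: fmean(v) for s, v in estimates.items()}   # expectation over personnel
    grand_mean = fmean(list(cond_mean.values()))              # expectation over both steps
    everything = [e for v in estimates.values() for e in v]
    total_var = fmean([(e - grand_mean) ** 2 for e in everything])
    personnel_term = fmean(
        [fmean([(e - cond_mean[s]) ** 2 for e in v]) for s, v in estimates.items()]
    )
    unit_term = fmean([(m - grand_mean) ** 2 for m in cond_mean.values()])
    bias_sq = (grand_mean - true_mean) ** 2
    mse = fmean([(e - true_mean) ** 2 for e in everything])
    return total_var, personnel_term, unit_term, bias_sq, mse

survey = decompose(list(combinations(true_values, 2)))   # sample survey, n = 2
census = decompose([tuple(true_values)])                 # complete enumeration

for name, (tv, pt, ut, b2, mse) in (("survey", survey), ("census", census)):
    print(f"{name}: Var = {tv:.4f} = {pt:.4f} + {ut:.4f}; bias^2 = {b2:.4f}; MSE = {mse:.4f}")
```

In the census run the term due to the choice of units is zero, so the MSE reduces to the personnel (non-sampling) variance plus the squared non-sampling bias, in line with the statement above.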
Non-response error:
The non-response error may occur because respondents refuse to give information or because the sampling
units are inaccessible. This error arises because the set of units getting excluded may have characteristics so
different from the set of units actually surveyed as to make the results biased. This error is termed as
non-response error since it arises from the exclusion of some of the anticipated units in the sample or
population. One way of dealing with the problem of non-response is to make every effort to collect
information from a sub-sample of the units not responding in the first attempt.
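One common way to use such a sub-sample is the Hansen and Hurwitz approach, which is not named in the text and is shown here only as an illustration: the mean of the first-attempt respondents and the mean of the followed-up sub-sample are weighted by the shares of respondents and non-respondents in the original sample. The function name `two_phase_mean` and all figures below are hypothetical.

```python
from statistics import fmean

def two_phase_mean(respondent_values, n_nonrespondents, followup_values):
    """Weighted mean combining first-attempt respondents with a followed-up
    sub-sample of the non-respondents (Hansen-Hurwitz style weighting)."""
    n1 = len(respondent_values)
    n = n1 + n_nonrespondents
    return (n1 * fmean(respondent_values)
            + n_nonrespondents * fmean(followup_values)) / n

respondents = [10.0, 12.0, 11.0, 13.0]   # units that answered at the first attempt
n_nonrespondents = 6                     # units that did not answer
followup = [18.0, 20.0, 19.0]            # sub-sample of non-respondents, reached later

naive = fmean(respondents)               # ignores the non-respondents entirely
adjusted = two_phase_mean(respondents, n_nonrespondents, followup)
print(naive, adjusted)
```

In this made-up data the non-respondents differ markedly from the respondents, so the naive mean understates the adjusted estimate. This is the kind of bias the paragraph above describes.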
1. Consistency checks:
Certain items in the questionnaires can be added, which may serve as a check on the quality of the
collected data. To locate the doubtful observations, the data can be arranged in increasing order of some
basic variable. Then they can be plotted against each sample unit. Such graph is expected to follow a
certain pattern and any deviation from this pattern would help in spotting the discrepant values.
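The ordering-and-pattern idea can be sketched in code. The example below is a minimal sketch that assumes the expected pattern is rough proportionality between the basic variable and the reported value; the records, the function name `flag_discrepant` and the 50% tolerance are hypothetical choices, and in practice the pattern would be judged from a plot or from subject knowledge.

```python
from statistics import median

# (unit id, basic variable, reported value); all figures are hypothetical
records = [
    (1, 2, 40.0), (2, 3, 62.0), (3, 4, 79.0),
    (4, 5, 400.0), (5, 6, 118.0), (6, 1, 21.0),
]

def flag_discrepant(records, tolerance=0.5):
    """Sort by the basic variable and flag units whose value-to-basic-variable
    ratio deviates from the median ratio by more than `tolerance` (relative)."""
    ordered = sorted(records, key=lambda r: r[1])
    ratios = [value / basic for _, basic, value in ordered]
    typical = median(ratios)
    return [
        unit_id
        for (unit_id, _, _), ratio in zip(ordered, ratios)
        if abs(ratio - typical) / typical > tolerance
    ]

flagged = flag_discrepant(records)
print(flagged)   # unit 4 reports a value far above the common pattern
```

A flagged unit is only a candidate for scrutiny. Whether its value is an error or a genuine extreme has to be settled by going back to the schedule or the respondent.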
2. Sample check:
An independent duplicate census or sample survey can be conducted on a comparatively smaller group by
trained and experienced staff. If the sample is properly designed and if the checking operation is
efficiently carried out, then it is possible to detect the presence of non-sampling errors and to get an idea
of their magnitude. Such a procedure is termed as the method of sample check.
7. Interpenetrating sub-samples:
The use of interpenetrating sub-sample technique helps in providing an appraisal of the quality of
information as the interpenetrating sub-samples can be used to secure information on non-sampling errors
such as differences arising from differential interviewer bias, different methods of eliciting information
etc. After the sub-samples have been surveyed by different groups of investigators and processed by
different teams of workers at the tabulation stage, a comparison of the final estimates based on the
sub-samples provides a broad check on the quality of the survey results.
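The comparison of sub-sample estimates can be sketched as follows. This is an illustration under stated assumptions: the sample is split at random into k equal sub-samples, each sub-sample is handled by a different investigator, and the investigator offsets and unit values are hypothetical. The spread of the k estimates is also used for a rough variance estimate of their mean, which treats the sub-sample estimates as independent.

```python
import random
from statistics import fmean

def split_interpenetrating(units, k, seed=0):
    """Randomly split the sampled units into k interpenetrating sub-samples."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def compare(subsample_values):
    """Return the sub-sample means, their overall mean, and a rough variance
    estimate of the overall mean based on the spread between sub-samples."""
    estimates = [fmean(v) for v in subsample_values]
    k = len(estimates)
    overall = fmean(estimates)
    variance = sum((t - overall) ** 2 for t in estimates) / (k * (k - 1))
    return estimates, overall, variance

true_values = {uid: 50.0 + uid for uid in range(12)}   # hypothetical sampled units
investigator_offset = [0.0, 0.0, 5.0]                  # hypothetical: third investigator over-reports

subsamples = split_interpenetrating(list(true_values), k=3)
observed = [
    [true_values[u] + investigator_offset[i] for u in sub]
    for i, sub in enumerate(subsamples)
]
estimates, overall, variance = compare(observed)
print(estimates, overall, variance)
```

Because every sub-sample is a random part of the same sample, a sub-sample estimate that stands apart from the others points to a difference in investigators or processing rather than in the units surveyed.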