Sampling, Data Collection and Processing
• In the first stage, districts are chosen at random in all the
states, followed by a random selection of Talukas. From among the
chosen Talukas, villages are chosen randomly.
Multistage sampling
E.g. For hookworm survey in school children in a district.
• Choose 10% of Talukas randomly and then
• Choose 10% of the villages situated in the chosen Talukas
• Among the chosen villages, choose 10% of the schools randomly
• From the chosen schools, 10% of the students are selected
randomly. Because 10% is taken at each of the four stages, the final
sample is about 0.01% of the district's school children
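The staged selection above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the district layout (10 Talukas, each with 10 villages, each with 10 schools of 100 students) and the helper name `multistage_sample` are hypothetical, and every stage uses simple random sampling without replacement.

```python
import random

def multistage_sample(district, fraction=0.10, seed=0):
    """Pick `fraction` of the units at each stage: Talukas, villages, schools, students."""
    rng = random.Random(seed)

    def pick(units):
        # Simple random sample without replacement; always keep at least one unit.
        k = max(1, round(len(units) * fraction))
        return rng.sample(units, k)

    students = []
    for taluka in pick(district):              # stage 1: Talukas
        for village in pick(taluka):           # stage 2: villages
            for school in pick(village):       # stage 3: schools
                students.extend(pick(school))  # stage 4: students
    return students

# Hypothetical district: 10 Talukas x 10 villages x 10 schools x 100 students.
district = [
    [
        [
            [f"t{t}-v{v}-s{s}-p{p}" for p in range(100)]
            for s in range(10)
        ]
        for v in range(10)
    ]
    for t in range(10)
]

sample = multistage_sample(district)
total_students = 10 * 10 * 10 * 100
print(len(sample), "of", total_students, "students sampled")
```

With this layout, one Taluka, one village and one school are retained, and 10 students are drawn from that school, so the stage-wise 10% fractions compound to a very small overall sampling fraction.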
Multi-phase sampling
• In this method, part of the information is collected from the whole
population and part from a sub-sample.
E.g. In a TB survey:
• Mantoux test may be done in all subjects of the study population
in the first phase;
• In the second phase, Mantoux positive cases undergo x-ray of the
chest
• Among the x-ray positive cases, sputum may be examined in the
third phase.
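The three phases of the TB survey can be pictured as successive filters, each applied only to the subjects who were positive in the previous phase. The sketch below uses simulated outcomes; the positivity rates, field names and population size are assumptions made purely for illustration, not survey data.

```python
import random

rng = random.Random(1)

# Hypothetical study population. Outcomes are simulated up front for simplicity;
# in a real survey each test is performed only on the subjects reaching that phase.
population = [
    {
        "id": i,
        "mantoux_positive": rng.random() < 0.30,
        "xray_positive": rng.random() < 0.20,
        "sputum_positive": rng.random() < 0.50,
    }
    for i in range(1000)
]

phase1 = population                                      # Mantoux test on everyone
phase2 = [p for p in phase1 if p["mantoux_positive"]]    # chest x-ray for Mantoux positives
phase3 = [p for p in phase2 if p["xray_positive"]]       # sputum examined for x-ray positives
confirmed = [p for p in phase3 if p["sputum_positive"]]  # sputum-positive subjects

print(len(phase1), len(phase2), len(phase3), len(confirmed))
```

Each phase works on a subset of the previous one, which is why the costlier examinations are needed for far fewer subjects than the first-phase test.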
Cluster sampling
• This sampling technique is used when a large study population is
"naturally" divided into clusters and the clusters in turn have a
homogeneous population
E.g. a country is sub-divided into towns, cities, wards, villages etc.,
and relatively homogeneous groupings are evident in these villages,
slums etc.
In this technique,
• The total population of each cluster must be known
• The complete list (sampling frame) of all the individuals in the
country is not necessary
• A fixed number of clusters is chosen at random
• A small sample is then selected from the chosen clusters using
simple random sampling, or the entire population of each chosen
cluster may be surveyed
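A minimal sketch of these steps follows. The slides do not say how the clusters are drawn; the sketch assumes systematic selection with probability proportional to cluster size, which is one common choice and the reason the population of each cluster has to be known. The village data and the function names `select_clusters_pps` and `cluster_sample` are hypothetical.

```python
import random

def select_clusters_pps(sizes, n_clusters, rng):
    """Systematic selection with probability proportional to size.

    Returns indices into `sizes`; a cluster larger than the sampling
    interval can be selected more than once.
    """
    interval = sum(sizes) / n_clusters
    start = rng.random() * interval
    chosen, cumulative, idx = [], 0, 0
    for i in range(n_clusters):
        target = start + i * interval
        # Walk forward until the cumulative population range covers the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen

def cluster_sample(clusters, n_clusters, per_cluster, seed=0):
    """Choose clusters, then take a simple random sample inside each one."""
    rng = random.Random(seed)
    names = list(clusters)
    sizes = [len(clusters[name]) for name in names]
    sample = []
    for idx in select_clusters_pps(sizes, n_clusters, rng):
        members = clusters[names[idx]]
        k = min(per_cluster, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical villages of 50 to 240 people; only the chosen villages need a full list.
clusters = {
    f"village{v}": [f"v{v}-p{i}" for i in range(50 + 10 * v)]
    for v in range(20)
}

sample = cluster_sample(clusters, n_clusters=5, per_cluster=7)
print(len(sample), "people sampled from 5 villages")
```

Setting `per_cluster` to a very large number surveys each chosen cluster completely, which corresponds to the second option in the last bullet.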
The advantage of cluster sampling is that
• it is cheap, quick, and easy: instead of sampling the entire
country, as simple random sampling would require, the researcher can
allocate resources to the few selected clusters.
• Sampling error depends on the size of the sample (as the size of the
sample increases, sampling error decreases)
• It also depends on the natural variability of the individual readings
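Both factors appear in the usual formula for the standard error of a sample mean, SE = s / sqrt(n), where s is the standard deviation of the individual readings and n is the sample size. The short sketch below uses illustrative numbers only; the function name `standard_error` is an assumption.

```python
import math

def standard_error(sd, n):
    """Standard error of a sample mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Larger samples give a smaller sampling error (standard deviation fixed at 10).
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))   # 25 2.0, then 100 1.0, then 400 0.5

# More variable readings give a larger sampling error (sample size fixed at 100).
print(standard_error(20.0, 100))        # 2.0
```

Quadrupling the sample size halves the standard error, while doubling the variability of the readings doubles it.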
Non-Sampling Errors
• This error has no relationship to the sampling technique
1. Defining the Population:
• For example, if a kitchen appliances firm wants to conduct a survey to ascertain
the demand for its microwave ovens, it may define the population as ‘all women
above the age of 20 who cook’ (assuming that very few men cook).
• The definition can be further refined at the sampling unit level, to
all women above the age of 20 who cook and whose monthly
household income exceeds Rs.20,000.
• The population definition can be refined further by specifying the area from
which the researcher has to draw the sample, that is, households located
in Hyderabad.
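Each refinement of the population definition acts as one more condition that a person must satisfy. A minimal sketch, assuming a hypothetical `Respondent` record whose field names are chosen for illustration:

```python
from dataclasses import dataclass

@dataclass
class Respondent:
    gender: str
    age: int
    cooks: bool
    monthly_household_income: int  # in rupees
    city: str

def in_target_population(r):
    """Apply the refined definition: women over 20 who cook,
    household income above Rs.20,000, living in Hyderabad."""
    return (
        r.gender == "female"
        and r.age > 20
        and r.cooks
        and r.monthly_household_income > 20000
        and r.city == "Hyderabad"
    )

# Hypothetical records, not survey data.
candidates = [
    Respondent("female", 34, True, 45000, "Hyderabad"),
    Respondent("female", 19, True, 60000, "Hyderabad"),  # too young
    Respondent("female", 41, True, 15000, "Hyderabad"),  # income too low
    Respondent("female", 29, True, 30000, "Chennai"),    # outside the area
    Respondent("male", 38, True, 52000, "Hyderabad"),    # excluded by the definition
]

target = [r for r in candidates if in_target_population(r)]
print(len(target))
```

Writing the definition as an explicit rule makes it easy to see how each added condition narrows the population.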
2. Specifying the Sampling Frame:
• A sampling frame is the list of elements from which the sample may be
drawn.
• Continuing with the microwave oven example, an ideal sampling frame would be a
database that contains all the households that have a monthly income
above Rs.20,000.
• In general, researchers use easily available sampling frames like
telephone directories and lists of credit card and mobile phone users.
4. Selection of the Sampling Method:
• The sampling method outlines the way in which the sample units
are to be selected.
• The choice of the sampling method is influenced by the objectives
of the business research, availability of financial resources, time
constraints, and the nature of the problem to be investigated.
• All sampling methods can be grouped under two distinct heads,
that is, probability and non-probability sampling.
5. Determination of Sample Size:
6. Specifying the Sampling Plan:
• The specifications of the sampling plan are guidelines that help the researcher
in every step of the process.
• As the interviewers and their co-workers will be on field duty most of the
time, a proper specification of the sampling plan would make their work
easy, and they would not have to refer back to their seniors when faced with
operational problems.
7. Selecting the Sample:
• This is the final step in the sampling process, where the actual
selection of the sample elements is carried out.
• At this stage, it is necessary that the interviewers stick to the rules
outlined for the smooth implementation of the business research.
• This step involves implementing the sampling plan to select the
sample required for the survey.
Data and Information
Data
• Data is an individual unit that contains raw materials which do not
carry any specific meaning.
• Data does not depend on information.
• Raw data alone is insufficient for decision making.
• An example of data is a student's test score.
Information
• Information is a group of data that collectively carries a logical
meaning.
• Information depends on data.
• Information is sufficient for decision making.
• The average score of a class is the information derived from the
given data.
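The test-score example can be made concrete with a small sketch. The scores below are made-up values used only to show the step from data to information.

```python
from statistics import mean

# Data: individual test scores (hypothetical values), each with no meaning on its own.
scores = [72, 85, 90, 64, 79]

# Information: a summary derived from the data that can support a decision.
class_average = mean(scores)
print("Class average:", class_average)
```

The individual scores are the data; the class average computed from them is the information.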
Thurstone scale
• The Thurstone scale is defined as a unidimensional scale that is used to
track a respondent's behavior, attitude or feeling towards a subject.
• This scale consists of statements about a particular issue or topic
where each statement has a numerical value that indicates the
respondent's attitude towards the topic as favorable or unfavorable.
• Respondents indicate the statements that they agree with.
• The mean of the values of those statements is calculated and taken as
the attitude of the respondent towards the topic.
Thurstone scale: Example
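A minimal sketch of how a Thurstone score could be computed. The statement labels and their scale values are hypothetical (on an assumed range where 1 is least favorable and 11 is most favorable), and `thurstone_score` is an illustrative name rather than a standard function.

```python
from statistics import mean

# Hypothetical statements with illustrative scale values
# (1 = least favorable, 11 = most favorable).
scale_values = {
    "S1": 1.5,
    "S2": 4.0,
    "S3": 6.5,
    "S4": 9.0,
    "S5": 10.5,
}

def thurstone_score(scale_values, agreed):
    """Mean scale value of the statements the respondent agreed with."""
    if not agreed:
        raise ValueError("respondent agreed with no statements")
    return mean(scale_values[s] for s in agreed)

# A respondent who agrees with statements S3 and S4.
score = thurstone_score(scale_values, ["S3", "S4"])
print(score)  # 7.75
```

A higher score indicates a more favorable attitude, because the respondent endorsed statements with higher scale values.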
Likert scale
• A Likert scale is a rating scale used to measure survey participants'
opinions, attitudes, motivations, and more. It offers a range of
answer options running from one extreme attitude to the other,
sometimes including a moderate or neutral option. Scales with 4 to
7 points are the most popular.
The Guttman Scale
The Guttman scale is a commonly used unidimensional scale, like the
Likert scale and the Thurstone scale. The Guttman scale is also known
as cumulative scaling or scalogram analysis. It is an ordinal scale with
a number of statements placed in a hierarchical order. The order is
arranged so that if a respondent agrees with a statement, they will
also agree with all of the statements that fall below it in extremity.
The first statement that indicates disagreement shows the
respondent’s position on the subject.
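The cumulative logic described above can be sketched as follows. The response patterns are hypothetical, and the function names `guttman_position` and `is_consistent` are chosen for illustration only.

```python
def guttman_position(responses):
    """Count the statements agreed with before the first disagreement.

    `responses` is a list of booleans ordered from the least extreme
    statement to the most extreme one.
    """
    position = 0
    for agrees in responses:
        if not agrees:
            break
        position += 1
    return position

def is_consistent(responses):
    """A perfect Guttman pattern has no agreement after the first disagreement."""
    first_no = guttman_position(responses)
    return not any(responses[first_no:])

# Agrees with the three mildest statements, disagrees with the two most extreme.
print(guttman_position([True, True, True, False, False]))  # 3
print(is_consistent([True, True, True, False, False]))     # True
print(is_consistent([True, False, True]))                  # False
```

A pattern that fails the consistency check does not fit the cumulative ordering the scale assumes.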
Semantic Differential Scale
• A semantic differential scale is a survey question used to measure
people’s attitudes, feelings, or perceptions by having them submit
a rating between two opposing adjectives. Each end of the scale
features a pair of contrasting terms—such as “unreliable” and
“reliable”—with a fixed number of points in between, typically
seven. However, five-point scales are also used, as they are more
user-friendly, especially for respondents on mobile devices.