Chapter 2 Types of Data
- Abdi-Khalil Edriss
TYPES OF DATA
I. Data
The purpose of this chapter is to discuss the most common classifications of data, and
to describe when and how each type of data is generally helpful in identifying analysis
techniques. In addition, it explains how to design a questionnaire and when
to use open-ended and close-ended questions for data collection, and it offers
some insights into how to plan a survey.
2.2. Is it true that the type of data determines the type of
statistical or econometric model?
Yes. The data set usually speaks for itself. We have to be careful about what
type of model (static or time-series) we apply or fit to what type of data
set. We classify data into three major types, namely cross-sectional (or
spot) data, time-series (or longitudinal) data and panel data (a hybrid of
cross-sectional and time-series data).
1
Data is the plural form of datum.
i. Cross-sectional data
o Usually contain independent observations
o Exclude time factors (contain no element of time), and hence are
named spot data
o Are analyzed through static models such as regression models,
qualitative models, simultaneous models, etc. (refer to Chapter
Eight for more).
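As a rough illustration of how the three data types differ in layout, the sketch below builds each one with plain Python structures. The household labels, years and yield figures are invented purely for illustration, and the helper `describe` is a hypothetical name, not something defined in this book.

```python
# All figures below are made up; they only show the shape of each data type.

# Cross-sectional: many units observed once (no time index).
cross_section = [
    {"household": "H1", "yield_kg": 420},
    {"household": "H2", "yield_kg": 310},
    {"household": "H3", "yield_kg": 505},
]

# Time series: one unit observed over several periods.
time_series = [
    {"year": 1998, "yield_kg": 400},
    {"year": 1999, "yield_kg": 420},
    {"year": 2000, "yield_kg": 450},
]

# Panel: several units, each observed over several periods.
panel = [
    {"household": h, "year": y, "yield_kg": v}
    for h, series in {"H1": [400, 420], "H2": [300, 310]}.items()
    for y, v in zip([1999, 2000], series)
]

def describe(rows):
    """Return (number of distinct units, number of distinct periods)."""
    units = {r.get("household", "single unit") for r in rows}
    periods = {r.get("year", "single period") for r in rows}
    return len(units), len(periods)

for label, rows in [("cross-section", cross_section),
                    ("time series", time_series),
                    ("panel", panel)]:
    print(label, describe(rows))
```

Cross-sectional data vary only across units, time-series data only across periods, and panel data across both, which is why each calls for a different family of models.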
iii. Valid – a good questionnaire provides valid data by ensuring that the
respondent understands what information is being sought. Validity
implies that the question elicits a true and accurate response that
measures what you are interested in measuring.
2.5. What are the main criteria for a good questionnaire?
Now it would not be unusual to find more people supporting B if asked in the order
(B,A) than if asked in the order (A,B).
o For many survey questions the order of the possible responses (or
choices) to a particular question is as important as the position of the
question on the questionnaire. For example, if a person being
interviewed is presented with a long list of possible choices, or if each
possible choice is wordy or difficult to interpret, a person is likely to
respond with the most recent choice (the last one on the list). But, if a
respondent must choose items from a long written list, then the items
appearing toward the top of the list have a selection advantage.
An example of a closed question is: Which of the following is the most important
problem facing your country? (Check one)
a. Crime
b. Unemployment
c. Inflation
d. Budget deficits
e. Drought
One may see that any closed form of question will limit the respondent’s response,
and may force a respondent into an answer that would not necessarily be a first
choice. Or, the respondent’s choices could be forced into predetermined categories.
POINTS TO PONDER
Generally, a good plan for designing a closed question with appropriate alternatives
is to use a similar open question in a pretest or pilot survey, and then to choose as
the fixed alternatives those that most nearly represent the choices expressed in the
open answers.
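The idea of deriving fixed alternatives from open pilot answers can be sketched as a simple tally. This is only a minimal illustration: the pilot answers are invented, the function name `fixed_alternatives` is hypothetical, and in practice open answers usually need to be grouped by hand before they can be counted.

```python
from collections import Counter

def fixed_alternatives(open_answers, k=5):
    """Return the k most frequent pilot answers, normalised to lower case."""
    cleaned = [a.strip().lower() for a in open_answers if a.strip()]
    return [answer for answer, _ in Counter(cleaned).most_common(k)]

# Hypothetical answers to the open question
# "What is the most important problem facing your country?"
pilot = ["Crime", "unemployment", "Crime ", "Drought", "inflation",
         "crime", "Unemployment", "Drought", "Corruption"]

print(fixed_alternatives(pilot, k=3))
```

Answers that fall outside the chosen alternatives suggest that an "Other (specify)" option may still be worth keeping on the final questionnaire.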
POINTS TO PONDER
Training of field workers or interviewers can greatly reduce the duration of the
interview, as they become more familiar with the questionnaire. This training should
take place before the pre-test of the survey, at which time the length of time required
for each interview should be estimated. Because of the distances involved, fieldwork in
rural areas is usually much slower than in urban areas.
SAMPLE STRUCTURED QUESTIONNAIRE
QUESTIONNAIRES
A. HOUSEHOLD CHARACTERISTICS
Household Identification
Region _______________________________
ADD ________________________________
RDP _________________________________
EPA _________________________________
Village ________________________________
Household Number _______________________
Interviewer’s name ________________________
A. IDENTIFICATION
Note: this part of the questionnaire should be posed to the household head. Only if the head is
absent or is expected to be absent for the next two days should the questions be posed to
another adult member of the household, preferably the spouse of the household head.
A8. How many children do you have? 1. One 2. Two 3. Three
4. More than 3
A9. How many people live in this household?
1. 1 2. 2-4 3. 5-6 4. More than 6
B1. Do you have land for cultivation? 1. Yes 2. No If “YES”, what is the
holding size of your land? 1. Below 1 hectare 2. 1-2 hectares 3. 2.1-5 hectares
4. 5.1-10 hectares 5. Over 10 hectares
B13. If you have not received credit, what are the reasons?
1. Not willing 2. Not important 3. Not attending meetings 4. Unknown
1. Less than MK100 2. MK101-500 3. MK501-1000 4. MK1001-2000
5. Over MK2000
C3. After harvesting, who is fully involved in the processing of the groundnuts?
1. Only father 2. Only mother 3. Children 4. All
C4. How much groundnut did you produce in the 1999/2000 season? __________Kg, and
how much of this have you sold? _______________
C5. If the groundnuts or part of the groundnuts are for sale, who controls the income?
1. Father 2. Mother 3. Both
C6. Do you exchange the groundnuts with other crop when trading?
1. Yes 2. No
If “YES”, specify which crop(s) ____________________
C9. Have you received any crop loans from any of the above programs?
1. Yes 2. No If “YES”, specify _______________
C10. If your groundnut is for sale, which market gives you higher prices for your groundnut
production?
1. Private traders 2. ADMARC 3. Local market 4. Fellow farmers
D1. Do you grow more than one groundnut variety? If “No”, go to question D5
1. Yes 2. No
D4. Have you grown groundnuts for more than one season?
1. Yes 2. No
D5. Which crop(s) do you grow with groundnut? (Multiple answers allowed)
1. Soya beans 2. Pigeonpea 3. Maize 4. Cowpea 5. None
D6. Have you adopted any of the following technologies? (Multiple answers allowed)
1. Intercropping (mixed) 2. Ridging 3. Variety (Improved) crops 4. Rotation
D8. When did you receive the first training in groundnut technology adoption?
1. Before 1990 2. 1990/3 seasons 3. 1994/6 seasons 4. 1997/99 seasons
D11. What is the spacing between the plants in a single row (ridge) that you use?
1. Less than 15 cm 2. 15-20 cm 3. 20-30 cm 4. Over 30 cm
D15. Did Field Assistants make you aware of the groundnut technologies?
1. Little 2. Much 3. Very much 4. None
D16. How did you benefit from the groundnut technologies? (Multiple answers allowed)
1. Yield increment 2. Income increment 3. Fertility increment 4. None
i. Personal Interview
POINTS TO PONDER
Aside from the cost involved, the major limitation of the personal interview is that
interviewers who are not thoroughly trained may deviate from the required protocol
and introduce bias into the sampled data.
o Getting data from objective sources that are not affected by the
respondents themselves.
o May not involve measurements on people, but data from other
sources, for example laboratory experiments, records, income
information, counting of objects, etc.
POINTS TO PONDER
The disadvantage of using an observer is the possibility of errors in observations; this
may include over-reporting, under-reporting or no reporting on particular or subtle
issues.
o Involves getting data from the respondent directly, that is, the
questionnaires are completed by the respondents themselves.
POINTS TO PONDER
The disadvantage of using self-administered questionnaires is non-response or a
low rate of response. This may introduce bias in the data; in addition, the respondents
may not be representative of the population of interest. To eliminate some of these
biases, investigators frequently contact the non-respondents through follow-up letters,
telephone interviews or personal interviews (though this is costly in many respects).
III. A Check List for Planning a Survey
The Frame – select the frame (or frames) so that the list of sampling units and
the target population show close agreement. Keep in mind that multiple frames
may make the sampling more efficient. For example, residents of a city can be
sampled from a list of city blocks coupled with a list of residents within
blocks.
Sample Design – choose the design of the sample, including the number of
sample elements, so that the sample provides sufficient information for the
objectives of the survey.
The Pretest – select a small sample for a pretest. The pretest is crucial, since
it allows you to field-test the questionnaire or other measurement device, to
screen interviewers, and to check on the management of field operations. The
results of the pretest usually suggest that some modifications must be made
before full-scale sampling or the actual survey is undertaken.
Organization of Fieldwork – plan the fieldwork in detail. Any large-scale
survey involves numerous people working as interviewers, coordinators or
data managers. The various jobs should be carefully organized and lines of
authority clearly established before the survey is begun.
Data Analysis – outline the analyses (descriptive and/or advanced) that are to
be completed. Closely related to the previous item, this step gives detailed
specifications of the analyses to be performed, the models to be used and the
like. It also involves discussion and interpretation of the results.
POINTS TO PONDER
Note that building a house without a proper plan and architectural design leads
to disastrous results and much wasted resources. Similarly, a survey without a
proper check list leads to unplanned research that follows no scientific procedure
and yields biased and inconsistent results with high standard errors.
The study population is groundnut farmers in the Central Region of Malawi². The
survey will be conducted in Lilongwe and Salima districts in the Central Region
of the country. These two districts lie within Lilongwe Agricultural Development
Division and Salima Agricultural Development Division (ADD)³, which together
account for 70% of groundnut production in Malawi (Ministry of Agriculture, 1998).
Lilongwe ADD is situated at an altitude of about 600 meters above sea level, while
Salima ADD lies on the lakeshore flood plain at about 200 meters above sea level.
The Central Region has warm to hot, cloudy weather with light to heavy rains;
rainfall ranges from 600 to 1,000 mm per annum and falls in one rainy season from
November to March. This rainfall pattern supports crops such as groundnut,
tobacco and maize that are planted early in the growing season.
2
Malawi is divided into three regions: Northern, Central and Southern Regions.
The choice of the two districts is necessitated by the need to cover as many
diverse factors as possible that might affect the household’s decision to grow
groundnuts. These factors are income levels, input and output prices, access to
land and socio-cultural factors related to labour transactions within family groups.
3
Malawi is divided into eight ADDs that form different agro-ecological zones. These ADDs lie within the three regions of the
country. The ADDs constitute the primary management unit of extension services. The ADDs are subdivided into Rural
Development Projects (RDPs), which are further subdivided into Extension Planning Areas (EPAs). Extension agents called
Field Assistants supervise at the EPA level.
Sample Frame
The sampling frame (or comprehensive sampling units) for this research is:
Sample Size⁴
$$ n = \frac{Z^2 (1 - p)\, p}{e^2} = \frac{1.96^2 \, (1 - 0.7)(0.7)}{0.05^2} \approx 323 $$
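The sample size calculation above can be checked numerically. The sketch below plugs in the same values (Z = 1.96, the usual value for 95% confidence, p = 0.7 and e = 0.05); the function name `sample_size` is a hypothetical helper, and rounding up to the next whole household is an assumption about how the figure of 323 was reached.

```python
import math

def sample_size(z, p, e):
    """Sample size for estimating a proportion: n = z^2 * (1 - p) * p / e^2."""
    return z ** 2 * (1 - p) * p / e ** 2

n_raw = sample_size(z=1.96, p=0.7, e=0.05)
n = math.ceil(n_raw)  # round up so the margin of error is not exceeded

print(round(n_raw, 2), n)
```

Because e enters the formula squared, halving the margin of error to 0.025 would roughly quadruple the required sample.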
Sampling Method
4
For detailed sample size calculations, refer to Chapter 4.
5
EPAs with relatively high groundnut production are identified from maps prepared by Famine Early
Warning System (FEWS) office in Agro-Economic Survey (AES), Department of the Ministry of
Agriculture and Irrigation, Malawi.
This criterion is chosen with the need for active farmer participation in mind,
both from farmers with a commercial orientation and from farmers with a
subsistence orientation toward the groundnut crop. Because groundnut is a minor
crop in area terms, it is important to choose EPAs where the areas allocated to
groundnuts are relatively large, in order to improve the likelihood that farmers
will be motivated to participate actively in the research.
Only two EPAs, Chafumbwa from Lilongwe ADD and Chinguluwe from Salima
ADD, are provisionally selected so as to capture variation in population,
which is strongly associated with the adoption of improved technologies, since
the alternative of increasing production by expanding the cultivated area is not
available. The rest of the EPAs are selected randomly from the list of EPAs
obtained from the Field Assistants. These are Mkorera, Mchenchi and Cheseka
EPAs from Lilongwe ADD and Kaphateya EPA from Salima ADD.
The second stage is the probability sampling of villages (or clusters) from
the list of villages in the six EPAs selected above. A total of ten villages are
selected from the EPAs, ensuring that larger villages have a proportionally⁶
greater chance of being selected than small villages.
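One common way to give larger villages a proportionally greater chance of selection is systematic sampling with probability proportional to size (PPS). The text does not say which procedure was actually used in this study, so the sketch below is an assumption offered for illustration; the village names, household counts and the function name `pps_systematic` are all hypothetical.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes maps a unit name to its size measure (e.g. number of households).
    A unit larger than the sampling interval may be selected more than once.
    """
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    for i in range(n):
        target = start + i * interval
        cumulative = 0
        for name, size in sizes.items():
            cumulative += size
            if target < cumulative:
                break
        # If rounding pushes target past the total, the last unit is used.
        selected.append(name)
    return selected

# Hypothetical village list with made-up household counts.
villages = {"Village A": 40, "Village B": 120, "Village C": 65,
            "Village D": 210, "Village E": 90, "Village F": 75}

sample = pps_systematic(villages, n=3, rng=random.Random(2024))
print(sample)
```

If the same number of households is then interviewed in every selected village, each household ends up with roughly the same overall chance of selection, which is what makes such a design self-weighting.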
Training Enumerators
Prior to data collection, the enumerators are first trained by the principal
researcher in techniques of administering the questionnaire for collecting
agronomic and socio-economic data. This involves explaining all the questions
to the enumerators in detail. This is done to ensure that the enumerators
understand what each question is asking for and that they all ask the
respondents the same questions, thereby minimizing enumerator bias and other
errors.
Pre-testing of the questionnaires follows the training of the enumerators. It
is aimed at detecting problems in the wording of questions, bearing in mind that
the questionnaire is written in English but is to be administered in Chichewa⁷.
Pre-testing also exposes the enumerators to real field situations and lets them
get used to the questionnaire. After these exercises, all the necessary changes
are made to the questionnaires, which are then administered to the selected
(sampled) households in the villages.
6
This type of sample is self-weighting, which simplifies the analysis and improves the
representativeness of the sample.
Studies and analyses drawing upon data collected in different places, at different
times or utilizing different data collection systems are beset by questions of
comparability. Lack of comparability occurs most often as a result of dissimilarities
of definitions, but may arise from dissimilarity of coverage, differential accuracy and
differential reliability and validity of the data-collecting instruments.
Quality refers primarily to the accuracy of the data, that is, the extent to which the
recorded observation corresponds to the characteristic or attribute of the unit observed,
but also to the validity and reliability of the recording or data-collecting instrument or
technique.
Over the years social scientists have conducted research to collect primary data from
communities in both urban and rural areas. Quantitative data have been collected for
years and used to explain the setting in particular situations. During the mid-1980s,
there was a realization that quantitative data in themselves may not be adequate; there
is usually underlying information that is not collected but that can help to explain a
situation. Hence the promotion of qualitative data collection through Participatory
Rural Appraisal, a tool with which researchers discuss issues with the communities in
greater detail.
A transect walk involves both the researcher and the community members
walking across the village(s) to see and note the various resources available.
One could also notice present problems, for example, soil degradation.
2.14. What are historical timelines?
Seasonal calendars involve the people in identifying what activities they
undertake at particular times of the year. The information is essential when
development agencies want to introduce a project: they should consider in
which periods of the year people are busy and in which they have time
available for development activities. For example, introducing a ‘Food for
Work’ project during the growing season would be disastrous; people would
rather concentrate on their own fields than do community work for food.
Problems in rural areas are many, so they need to be ranked so that the
most important ones are attended to first. Matrix ranking (pair-wise and
scoring/voting) helps the researcher, together with the people, to select the
pressing problems to be addressed as a matter of urgency. The community
members can score or vote for the most serious problem using stones or seeds.
Pair-wise ranking compares each problem with each of the others, one pair at a
time, on the basis of the importance or severity of the problem. The problem
mentioned most often then ranks highest and may consequently require the
earliest intervention.
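The pair-wise ranking procedure described above amounts to counting how often each problem is preferred across all possible pairs. The sketch below is a minimal illustration under stated assumptions: the list of problems and the recorded choices are invented, and `pairwise_rank` is a hypothetical helper name.

```python
from itertools import combinations

def pairwise_rank(problems, recorded):
    """Rank problems by the number of pair-wise comparisons each one wins.

    recorded maps a frozenset of two problems to the one the community
    judged more important or severe.
    """
    wins = {p: 0 for p in problems}
    for a, b in combinations(problems, 2):
        wins[recorded[frozenset((a, b))]] += 1
    ranking = sorted(problems, key=lambda p: wins[p], reverse=True)
    return ranking, wins

# Hypothetical problems and community choices for every pair.
problems = ["Drought", "Lack of credit", "Poor roads", "Crop pests"]
recorded = {
    frozenset(("Drought", "Lack of credit")): "Drought",
    frozenset(("Drought", "Poor roads")): "Drought",
    frozenset(("Drought", "Crop pests")): "Drought",
    frozenset(("Lack of credit", "Poor roads")): "Lack of credit",
    frozenset(("Lack of credit", "Crop pests")): "Crop pests",
    frozenset(("Poor roads", "Crop pests")): "Crop pests",
}

ranking, wins = pairwise_rank(problems, recorded)
print(ranking)
print(wins)
```

With k problems there are k(k-1)/2 pairs to discuss, so the exercise becomes lengthy for long lists; ties in the win counts would need to be settled by further discussion with the community.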
2.16. What is a Venn diagram?
A Venn diagram is yet another participatory method, in which the
interrelationship of individuals or organizations is studied. The further a
circle lies from the center, the weaker the relationship. The size of the circle
around the individual or the organization also signifies the strength of the
relationship.
POINTS TO PONDER
Qualitative and quantitative methods of collecting socio-economic data have
advantages as well as disadvantages. The use of qualitative methods in gathering
data is not a panacea for all the limitations associated with quantitative methods. Using
both methods helps to overcome some of the limitations and to ensure that
reasonably accurate data are collected.
========================================================
MENTAL GYMNASTICS
CHAPTER TWO
=======================================================
What is the problem associated with underdevelopment
in Africa? The number one problem is the disconnection
between Africa's intellectuals, its politicians and its
natural resources. The accumulated national human
resources and the natural resources could not
communicate, and are therefore exposed, directly or
indirectly, to external exploitation that undermines
national development.