Unit 3 - Assignment With Answers
In this assignment, you will practice identifying who (units) and what (variables) we study in
social research.
Read this assignment carefully and answer all questions.
Hand in the assignment via Canvas.
1. In social science research, variables characterize units. In the next sentences, please indicate
the unit (of analysis) and the variable(s):
Sentence: The Dutch railways keep track of train delays. Each journey of a train is observed. The
delay is recorded and used for improving services.
Unit: Train journeys
Variable(s): Time (delays, probably in minutes)

Sentence: Seligman and his colleagues discovered the phenomenon of learned helplessness when dogs
that had been conditioned to fear a tone did not try to escape a shock after hearing the tone, while
dogs that had not been conditioned to fear the tone did.
Unit: Dogs
Variable(s): Being conditioned to fear a tone (or not); trying to escape (or not)
2. Units of observation can, but do not have to be, identical to the units of analysis. The units of
analysis are the units we are interested in. We use the units of observation to collect and
store data. Indicate the units of analysis and the units of observation for the following
examples. They may be the same.
Example: A researcher wants to compare the quality of study programs in The Netherlands. She
therefore collects data from students, program managers and from colleagues teaching in similar
programs. These data are used to give a grade for the study programs she is studying.
Units of analysis: Study programs
Units of observation: Individual students (variable 1), managers (variable 2) and colleagues
(variable 3)

Example: In a panel study about COVID 19, individuals (18-75) are asked about their opinions
regarding measures reducing the spread of COVID 19. They are asked the same questions at three
points in time: March, August and October 2020.
Units of analysis: Individuals (18-75)
Units of observation: There are (at least) two different ways the UoO can be conceptualized:
1. Individuals
2. Individuals x time
In the first, each individual is ONE UoO and all variables have a time indication too (opinion in
March, for example).
In the second, each individual is THREE units of observation (March, August and October). In that
case variables do NOT have a time indication.
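
The two ways of conceptualizing the units of observation in the panel study correspond to what is often called a "wide" and a "long" data layout. The sketch below illustrates this under stated assumptions: the opinion scores, the variable names (opinion_March and so on) and the helper wide_to_long are all hypothetical and not part of the study.

```python
# Hypothetical panel data: ids and opinion scores are invented for illustration.
WAVES = ["March", "August", "October"]

# Option 1: one row per individual; every variable carries a time indication.
wide = [
    {"id": 1, "opinion_March": 4, "opinion_August": 3, "opinion_October": 5},
    {"id": 2, "opinion_March": 2, "opinion_August": 2, "opinion_October": 1},
]


def wide_to_long(rows, waves):
    """Option 2: one row per individual x time; variables carry no time indication."""
    long_rows = []
    for row in rows:
        for wave in waves:
            long_rows.append(
                {"id": row["id"], "wave": wave, "opinion": row[f"opinion_{wave}"]}
            )
    return long_rows


long = wide_to_long(wide, WAVES)

print(len(wide), "units of observation when the UoO is the individual")
print(len(long), "units of observation when the UoO is individual x time")
```

Both layouts hold the same information. Only the definition of a row, and therefore of the unit of observation, changes.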
3. Failing to distinguish between units of analysis and the units of observation can lead to
problems. In general, inference should not be made to lower levels of aggregation. Doing
that is referred to as the ecological fallacy. For the following three statements about
aggregate data, formulate an alternative conclusion about the individual level that shows
that the original conclusion might be invalid.
Statement: (US 1900s) In cities with a high percentage of immigrants, the level of literacy is
higher.
Conclusion about individuals: Immigrants are more literate.
Alternative conclusion: Immigrants move to cities with high levels of economic growth and these are
generally cities with a highly literate population.

Statement: Researchers found that death rates from breast cancer were significantly increased in
countries where fat consumption was high when compared with countries where fat consumption was
low.
Conclusion about individuals: Individual fat consumption is associated with breast cancer.
Alternative conclusion: You do not know this. It may be that a third variable (economic prosperity)
reduces fat consumption AND death rates because of breast cancer. Thus fat consumption and death
because of breast cancer need not be related on an individual level.
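
The literacy example can be made concrete with a small calculation. The sketch below uses made-up counts for two hypothetical cities, chosen only to show that an aggregate pattern (more immigrants, higher literacy) can coexist with the opposite pattern among individuals. None of these numbers come from real data.

```python
# Made-up counts, chosen only to illustrate the ecological fallacy.
# Each entry is (number of people, number of literate people).
cities = {
    "City A": {"native": (900, 540), "immigrant": (100, 30)},
    "City B": {"native": (600, 570), "immigrant": (400, 200)},
}

# Aggregate level: one record per city.
city_stats = {}
for name, groups in cities.items():
    people = sum(n for n, _ in groups.values())
    literate = sum(lit for _, lit in groups.values())
    city_stats[name] = {
        "immigrant_share": groups["immigrant"][0] / people,
        "literacy": literate / people,
    }


def group_rate(group):
    """Individual level: literacy rate of one group pooled over all cities."""
    people = sum(c[group][0] for c in cities.values())
    literate = sum(c[group][1] for c in cities.values())
    return literate / people


for name, stats in city_stats.items():
    print(name, stats)
print("immigrants:", group_rate("immigrant"), "natives:", group_rate("native"))
```

In these invented numbers the city with more immigrants is the more literate one, yet immigrants are less literate than natives in both cities and overall.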
Note: the interpretations in this exercise are written for educational purposes. This does not mean that
they are necessarily true. For example: we do not think students cause so much nuisance that others
4. The values or attributes of variables have to be exhaustive (a.k.a. complete) and mutually
exclusive. Indicate for the following examples whether they fulfil these requirements. If a
requirement is not met, indicate how you would fix it.
(very happy)
Variable: Religion; protestant, catholic, Christian, other, none
Exhaustive: Yes (other)
Mutually exclusive: No, Protestants and Catholics are Christians too.
Improved version: Religion; protestant, catholic, other, none

Variable: Level of education you followed: primary education, lower vocational, secondary
education, higher education, other
Exhaustive: Yes
Mutually exclusive: No, it is possible that you've followed several
Improved version: Highest level of education, measured in the following categories: primary
education, secondary education, higher education, other

Variable: Weight; 3-50 kg, 50-100 kg, >100 kg, >150 kg
Exhaustive: No, < 3 kg is missing
Mutually exclusive: No, 50 kg is in two categories, and > 150 and > 100 overlap
Improved version: Weight, measured in the categories: 0-50 kg, 51-100 kg, 101-150 kg, >150 kg
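
The two requirements can also be checked mechanically for the weight example: a value that falls in no category shows the categories are not exhaustive, and a value that falls in more than one shows they are not mutually exclusive. The sketch below is one way to do this. It assumes weights are recorded in whole kilograms and that range endpoints are inclusive. The function name check is hypothetical.

```python
# Categories as membership tests. Reading "3-50 kg" as 3 <= w <= 50 is an assumption.
original = {
    "3-50 kg": lambda w: 3 <= w <= 50,
    "50-100 kg": lambda w: 50 <= w <= 100,
    ">100 kg": lambda w: w > 100,
    ">150 kg": lambda w: w > 150,
}
improved = {
    "0-50 kg": lambda w: 0 <= w <= 50,
    "51-100 kg": lambda w: 51 <= w <= 100,
    "101-150 kg": lambda w: 101 <= w <= 150,
    ">150 kg": lambda w: w > 150,
}


def check(categories, test_values):
    """Return (values in no category, values in more than one category)."""
    uncovered, overlapping = [], []
    for w in test_values:
        hits = sum(1 for belongs in categories.values() if belongs(w))
        if hits == 0:
            uncovered.append(w)
        elif hits > 1:
            overlapping.append(w)
    return uncovered, overlapping


weights = range(0, 201)  # whole kilograms from 0 to 200, an assumption

for label, cats in [("original", original), ("improved", improved)]:
    uncovered, overlapping = check(cats, weights)
    print(label, "| not exhaustive for:", uncovered[:5],
          "| not exclusive for:", overlapping[:5])
```

Note that the improved categories are only exhaustive if weight is rounded to whole kilograms: a value such as 50.5 kg would fall between 0-50 and 51-100.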
Because sometimes data are missing (due to non-response or a filter question for example).
Think of the variable: which party did you vote for (after the filter question: did you cast a
valid vote in the most recent election).
7. Let us assume that we are studying residents of the UTwente campus. The collected data are
stored in a data matrix. The following table is the codebook that belongs to the data file. A
codebook contains all kinds of ‘meta data’ about the data: which data were collected, the
meaning of the variables, etc. Can you add the measurement level for each of the variables in
the codebook below?
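Variable: Student
Question: Are you a student?
Values: Yes / No
Measurement level: Dichotomous

Variable: Employee
Question: Are you an employee?
Values: Yes / No
Measurement level: Dichotomous

Variable: Age
Question: What is your age in years?
Values: 0 - 76
Measurement level: Ratio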
8. Given the codebook above, can you provide an example of what a dataset might look like? Fill
each cell of the matrix below.
Make sure you add the variables in the top row. Each following row contains the data on one
individual. You can add one variable that has an identifier for each participant. Note that the
code -99 is used for a missing value.
In each column we find a variable and all observations done on that variable.
In the cells we find values that we observe for each variable and each unit (campus residents). In
other words, we can read facts about the campus inhabitants that we observed in the cells.
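
A data matrix of this kind could look as follows. This is a minimal sketch: the identifier column ID and all answers are invented for illustration, and the helpers column and valid are hypothetical. It also shows why the missing code -99 has to be filtered out before calculating with a variable.

```python
MISSING = -99  # code used for a missing value

# Top row: the variables. Each following row: one campus resident.
columns = ["ID", "Student", "Employee", "Age"]
rows = [
    [1, "Yes", "No", 21],
    [2, "No", "Yes", 45],
    [3, "Yes", "Yes", MISSING],  # this resident did not report an age
    [4, "No", "No", 76],
]


def column(name):
    """All observations on one variable (one column of the matrix)."""
    idx = columns.index(name)
    return [row[idx] for row in rows]


def valid(values):
    """Drop the missing-value code before doing any calculation."""
    return [v for v in values if v != MISSING]


ages = valid(column("Age"))
mean_age = sum(ages) / len(ages)
naive_mean = sum(column("Age")) / len(column("Age"))  # wrongly treats -99 as an age

print("mean age, missing values excluded:", round(mean_age, 2))
print("mean age, -99 treated as a real value:", naive_mean)
```

Treating -99 as a real age pulls the mean far below any plausible value, which is why the codebook has to document the missing code.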
12. Below we have added some excerpts of information from a (simulated) study about student
experiences with different types of student housing in Enschede. Can you use this
information to construct a codebook? What other information do you need to complete the
codebook?
Do you live in shared living or independent living?
How many roommates do you have?
If you focus on the past month, how do you rate your experience of living here?
Dataset
You can guess that the rating of housing experiences was done on a scale of 0 to 20 and that higher
numbers represent better ratings. However, based on the material here, you cannot be certain. It
would be important to look at the original questionnaire material. For example, you may need to
access the survey environment online.