Introduction To Sample Surveys, Lab 6
Weighting
The Parents and Teens 2004 Survey was sponsored by the Pew Internet and American
Life Project. Telephone interviews were conducted with a nationally representative
sample of 1,100 teens aged 12 to 17 and their parents, living in telephone households
in the continental United States. The interviewed sample was weighted to match national
parameters for both parent and child demographics. The parent demographics used for
weighting were sex, age, education, race, Hispanic origin, marital status and region
(U.S. Census definitions). The child demographics used for weighting were gender and
age.
Open the Parents and Teens data set in the UTS Online folder.
You can see that there are 185 variables, one of which is a weighting variable.
6.1. Produce frequency tables for the census region (cregion: variable 11), the parent’s
education (EDUC: variable 45) and the parent’s race/ethnicity (RACE: variable 47).
When you produce these tables, click Paste rather than OK in the dialogue box. This
opens the Syntax window and pastes the commands into it. Select Run > All to get
your output.
Make a note of the percentages in each category in your tables.
How many total respondents do the tables say are in the sample?
With the weights applied, there are smaller proportions of Northeast and West
respondents and a larger proportion of Midwest respondents. The proportion of people
with lower levels of education has decreased and the proportions of those with higher
levels of education have increased. The proportion of “White” respondents has increased,
while the proportions of other respondents have gone down.
6.3. To reproduce the weighted frequency table for census region, type the following
commands in the Syntax window and run them.
WEIGHT BY weight.
FREQUENCIES VARIABLES=cregion.
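To see what weighting does to a frequency table, the arithmetic can be sketched outside SPSS. The Python below is an illustration only: the function name frequency_table and the toy data are hypothetical and are not taken from the Parents and Teens file. It assumes that each case contributes its weight, rather than 1, to its category total.

```python
from collections import defaultdict


def frequency_table(values, weights=None):
    """Return {category: percentage}.

    Each case counts as its weight; with no weights, each case counts as 1.
    """
    if weights is None:
        weights = [1.0] * len(values)
    totals = defaultdict(float)
    for value, w in zip(values, weights):
        totals[value] += w
    grand_total = sum(totals.values())
    return {cat: 100.0 * t / grand_total for cat, t in totals.items()}


# Hypothetical toy data (four cases), NOT from the survey file.
region = ["Northeast", "Midwest", "Midwest", "South"]
weight = [0.5, 1.5, 1.0, 1.0]

unweighted = frequency_table(region)          # every case counts once
weighted = frequency_table(region, weight)    # every case counts as its weight

for cat in unweighted:
    print(f"{cat:10s} unweighted {unweighted[cat]:5.1f}%  weighted {weighted[cat]:5.1f}%")
```

In this toy example the Northeast case has a weight below 1, so its share falls once the weights are applied, while the Midwest share rises.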
Produce the unweighted and weighted frequency tables for questions k5a to k5d and
see whether there are any marked differences between the weighted and unweighted
percentages. Summarise these differences below.
The response proportions are slightly lower in the weighted sample (86.7%) than in the
unweighted sample (88.3%). For QK5a – QK5c, the proportion of people who need help
is slightly lower in the weighted sample than in the unweighted sample. In QK5d, the
proportion of people who need help is lower in the weighted sample than in the
unweighted sample.
6.5. Create a new column using Transform > Recode into Different Variables to
recode the cregion variable into a new weight variable called newweight. Repeat Q6.1
using the new weights (that is, run the command WEIGHT BY newweight. first).
Note that weights are usually calculated using more than one demographic variable.
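The lab does not state here how the region weights are derived. One common approach, shown as a minimal sketch under that assumption, is to divide each region's population share by its sample share. The function name poststratification_weights and all figures below are hypothetical; they are not the survey's actual counts or census values.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each category = population share / sample share."""
    n = sum(sample_counts.values())
    return {
        cat: population_shares[cat] / (sample_counts[cat] / n)
        for cat in sample_counts
    }


# Hypothetical sample counts and population shares, for illustration only.
sample_counts = {"Northeast": 200, "Midwest": 300, "South": 300, "West": 200}
population_shares = {"Northeast": 0.20, "Midwest": 0.25, "South": 0.35, "West": 0.20}

weights = poststratification_weights(sample_counts, population_shares)

# The weighted count in each region is the sample count times the weight.
weighted_counts = {cat: sample_counts[cat] * weights[cat] for cat in sample_counts}
weighted_total = sum(weighted_counts.values())

for cat, w in weights.items():
    print(f"{cat:10s} weight {w:.3f}  weighted count {weighted_counts[cat]:.0f}")
```

Weights built this way leave the weighted total equal to the original sample size, and the weighted region percentages match the population shares. A region that is over-represented in the sample receives a weight below 1, and an under-represented region receives a weight above 1.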