Lab 5 Solutions

This document provides instructions for an exercise in data cleaning and error detection using a survey data file. It includes: 1) Details on the survey questions and data file with instructions to add an ID variable. 2) A list of 12 deliberate errors inserted into the data file for students to find and describe. 3) Guidance that errors can be found through frequency tables, cross tabulations, and summing ranks. 4) Instructions to search the ABS website for information on geographic classifications and census data. 5) Details on how people, families, and dwellings are counted differently in census tables on the ABS website. 6) The number of dwellings in a particular Local Government Area

Uploaded by

dan

35100 Introduction to Sample Surveys – Lab 5

Error Detection: Cleaning Data and Searching for Information on the ABS Website

5.1. An updated version of the Attitudes to the Library questionnaire with 8 questions is
attached to this lab sheet, and a data file representing the responses of 30 survey
participants can be found on UTS Online. We are going to start by inserting a variable at
the start of the data giving the questionnaire id number.

Set up column 1 in the data as id. Go to Variable View and right-click the 1 to the left of
the first row. Select “Insert Variable” from the menu. Name the new variable id. Click on
Transform > Compute, select id as the Target Variable and, as the Numeric
Expression, type $casenum.
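
For readers working outside SPSS, the same idea can be sketched in Python. This is only an illustration: the records and the variable names q1 and q2 are made up, and the sketch simply numbers the cases in file order, which is what $casenum does.

```python
# Hypothetical responses: each dict is one questionnaire, in data-entry order.
# The variable names (q1, q2) and values are assumptions for illustration only.
responses = [
    {"q1": 1, "q2": 4},
    {"q1": 2, "q2": 2},
    {"q1": 1, "q2": 5},
]

# Mirror id = $casenum: number the cases 1, 2, 3, ... in file order and
# place id as the first variable of each record.
with_id = [{"id": casenum, **row} for casenum, row in enumerate(responses, start=1)]

for row in with_id:
    print(row)
```

Because the id is tied to the order of the rows, it should be created before the file is sorted or filtered.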

There are 12 deliberate data entry errors in the data file. Find these errors. For each error,
give the questionnaire number and the variable involved, describe what the error is, and state
which method you used to find it. This is an exercise in data cleaning. Use frequency
tables, cross tabulations and any other logic tests to find the errors. (Just scanning the
data file is not a good method, particularly if the data file is very large.)

Error 1: Q1 Case 15: Value = 21, which is not a valid code for sex
Error 2: Q2 Case 28: Value = 6 when the range is 1-5
Error 3: Q5a Case 27: Value = 45, when the range is 1-5
Error 4: Q6b Case 5: Value = 5 when there are four things to rank
Error 5: Q6 Case 1: Sum of ranks is 12 (ranks 1-4 should sum to 10)
Error 6: Q6 Case 5: Sum of ranks is 12 (ranks 1-4 should sum to 10)
Error 7: Q6 Case 19: Sum of ranks is 13 (ranks 1-4 should sum to 10)
Error 8: Q3,4 Case 13: Q3 = 0 and Q4 = 1 (didn’t go to the library but used a facility)
Error 9: Q5 Case 11: Choice 1 = Choice 2
Error 10: Q5 Case 20: Choice 1 = Choice 2
Error 11: Q7,8 Case 8: Q7 is Yes, but answers Q8 (which is conditional on Q7 = No)
Error 12: Q7,8 Case 24: Q7 is Yes, but answers Q8 (which is conditional on Q7 = No)

Errors 1-4 can be found by constructing frequency tables. Errors 5-7 can be found by
summing ranks. Errors 8-12 can be found by constructing crosstabs.
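
The three kinds of check can also be sketched in Python. This is a minimal illustration rather than the method used in the lab: the records, the variable names (q2, q6, q7, q8) and the coding of a blank Q8 as None are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical records; names, codes and values are assumptions for illustration.
# q6 holds the four ranks; q8 is None when the question was left blank.
records = [
    {"id": 1, "q2": 4, "q6": [1, 2, 3, 4], "q7": "No", "q8": 2},
    {"id": 2, "q2": 6, "q6": [2, 1, 4, 3], "q7": "Yes", "q8": None},
    {"id": 3, "q2": 3, "q6": [1, 2, 5, 4], "q7": "Yes", "q8": 1},
]

def frequency_table(rows, var):
    """Count how often each value of var occurs."""
    return Counter(row[var] for row in rows)

def out_of_range(rows, var, valid):
    """Return the ids of cases whose value of var is not a valid code."""
    return [row["id"] for row in rows if row[var] not in valid]

def bad_rank_sums(rows, var="q6", expected=10):
    """Return (id, sum) for cases whose ranks do not sum to 1+2+3+4 = 10."""
    return [(row["id"], sum(row[var])) for row in rows if sum(row[var]) != expected]

def crosstab_q7_q8(rows):
    """Cross-tabulate the Q7 answer against whether Q8 was answered."""
    return Counter((row["q7"], row["q8"] is not None) for row in rows)

def skip_violations(rows):
    """Return the ids of cases that answered Q8 although Q7 is Yes."""
    return [row["id"] for row in rows if row["q7"] == "Yes" and row["q8"] is not None]

print(frequency_table(records, "q2"))           # the value 6 stands out
print(out_of_range(records, "q2", range(1, 6)))  # [2]
print(bad_rank_sums(records))                    # [(3, 12)]
print(crosstab_q7_q8(records))
print(skip_violations(records))                  # [3]
```

A rank sum of 10 is a necessary check only: ranks such as 1, 1, 4, 4 also sum to 10, so a frequency table of each rank variable is still worth running alongside it.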

35100 Introduction to Sample Surveys week 6 1


Q.1 Please tick the appropriate box, indicating your sex.

Male

Female

Q.2 Please tick the box that best represents your view on the statement below:

Statement: The library offers good service

Strongly Agree    Agree    Neutral    Disagree    Strongly Disagree

Q.3 How many times have you visited the library during the last week?

Q.4 On your most recent visit to the library during the last week, which of the following
features of the library did you use?
(You may tick more than one box.)
Closed reserve

Study areas

Search facilities

Journal collections

Photocopiers

Q.5 From the following list of services provided by the Library choose two which you
feel are the most important. (Indicate each of your two choices with a tick.)

Closed reserve

Study areas

Search facilities

Journal collections

Photocopiers



Q6 Please rank the following items relating to the Library in terms of their importance to
you, with 1 being the most important and 4 being the least important.

Item Rank

Being able to find a desk to work at by myself ---

Being able to find a desk to work as part of a group ---

Being able to easily find the books on my reading list ---

Being able to find the most recent copies of journals ---

Q7 Have you ever used the online service to reserve a book at the Library?

Yes

No

Q8 If you answered “No” to question 7, please indicate your reason for not using the
service.

Too difficult to use

Didn’t know I could

No need to reserve

Other



5.2. Statisticians should be familiar with the wide range of resources available from the
Australian Bureau of Statistics. Go to the ABS website. Use the search engine to find
information on the Australian Standard Geographical Classification (ASGC). You might
find it useful to search for the “ASGC Structural Chart”. If we are including a geographic
question on any survey we may conduct, we should try to make the codes compatible
with this standard classification. The “main” structure of the ASGC has six hierarchical
levels at Population Census times. What are these six levels?

Australia
State/Territory (S-T)
Statistical Division (SD)
Statistical Subdivision (SSD)
Statistical Local Area (SLA)
Census Collection District (CD)

5.3. The ABS makes a huge amount of data freely available. Go to the Census page. You
can then select a geographic area to obtain a summary of that geographic area using the
QuickStats Search. Enter “Rockdale” in the search box, and select “Rockdale LGA”
(towards the bottom of the list) and select “GO”.

You will now see some basic information followed by three sections: “People”, “Families”
and “Dwellings”. What is different about the tables in these three sections?

The tables in the “People” section count individual people. The tables in the
“Families” section count each family as a single observation. The tables in the “Dwellings”
section count each whole dwelling as a single observation.

Now, find how many (total) dwellings in the Rockdale LGA at the 2011 Census had 3 or
more motor vehicles per dwelling.

There were 3482 dwellings with three or more motor vehicles. This is a lower proportion
than both NSW as a whole and Australia…
