Week 10 Tutorial Questions Chapter 6

Uploaded by

joehe2625

CHAPTER 6

TRANSFORMING DATA

DISCUSSION QUESTIONS

1. Why is transforming data necessary and why does it take so much time? What
ways can you think of to reduce the time needed to transform data?

2. What are the strengths and weaknesses of each of the four data validation
procedures discussed in this chapter? What are other possible ways to validate
data?

Data Validation Procedure | Strengths | Weaknesses
Visual Inspection | |
Basic Statistical Tests | |
Audit a Sample | |
Advanced Testing Techniques | |
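
As an illustration of what "basic statistical tests" can look like in practice, here is a minimal Python sketch. The sample amounts and the allowable range of 0 to 10,000 are assumptions made up for this example; they do not come from the chapter.

```python
from statistics import mean

# Hypothetical column of amounts; None marks a missing value.
amounts = [1200.0, 950.5, None, 1100.0, -40.0, 98000.0]

# Keep only the values that are actually present.
present = [a for a in amounts if a is not None]

# Simple descriptive statistics that can reveal problems at a glance.
summary = {
    "count": len(amounts),
    "missing": len(amounts) - len(present),
    "min": min(present),
    "max": max(present),
    "mean": mean(present),
}

# Assumed rule for this illustration: amounts must lie between 0 and 10,000.
LOW, HIGH = 0.0, 10_000.0
out_of_range = [a for a in present if not LOW <= a <= HIGH]

print(summary)
print("Out of range:", out_of_range)
```

A negative minimum or a maximum far above the rest of the values is the kind of signal that would prompt a closer look at individual records.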

3. Match the following terms with the appropriate definition or example:

Term | Definition or example
1. aggregate data | a. process of analyzing data to make certain the data has the properties of high-quality data: accuracy, completeness, consistency, timeliness, and validity.
2. cryptic data values | b. data values that are correctly formatted but not listed in the correct field.
3. data cleaning | c. all types of errors that come from inputting data incorrectly.
4. data concatenation | d. examining data using human vision to see if there are problems.
5. data consistency | e. the process of tracing extracted or transformed values back to their original source.
6. data contradiction errors | f. data items that have no meaning without understanding a coding scheme.
7. data de-duplication | g. the process of changing data into a common format so that it is useful for decision-making.
8. data entry errors | h. a technique that rotates data from a state of rows to a state of columns.
9. data filtering | i. errors that occur when a secondary attribute in a row of data does not match the primary attribute.
10. data imputation | j. a data field that contains only two responses, typically a 0 or 1. Also called a dichotomous variable.
11. data parsing | k. the principle that every value in a field should be stored in the same way.
12. data pivoting | l. the process of changing the organization and relationships among data fields to prepare the data for analysis.
13. data standardization | m. data errors that occur when a data value falls outside an allowable level.
14. data structuring | n. a data field that contains only two responses, typically a 0 or 1. Also called a dummy variable.
15. data threshold violations | o. the process of updating data to be consistent, accurate and complete.
16. data validation | p. data that is inconsistent, inaccurate, or incomplete.
17. dichotomous variable | q. separating data combined in a single field into multiple fields.
18. dirty data | r. the combining of data from two or more fields into a single field.
19. dummy variable | s. the process of replacing a null or missing value with a substituted value.
20. misfielded data values | t. the process of removing records or fields of information from a data source.
21. violated attribute dependencies | u. the process of analyzing data and removing two or more records that contain identical information.
22. visual inspection | v. the presentation of data in a summarized form.
 | w. an error that exists when the same entity is described in two conflicting ways.
 | x. the process of ordering data to reveal unexpected values.

4. Excel Project: Data Pivoting


You are a data analyst for the city of Burlington, Vermont. Download the data file
“P6-2BurlingtonVermontData.xlsx” from the student download page at
http://www.pearsonglobaleditions.com. The file contains the annual account balance
information for city departments for six fiscal years. For this problem, use the sheet
titled “Annual Data”, which contains data aggregated to the annual level for the
different city departments.

REQUIRED
Using this data, prepare a different sheet that has a PivotTable that answers each of
the following questions:
a) How have total department budgets changed each year? To answer the question,
create a PivotTable that shows the budgeted amount of expenditures for each
fiscal year. Do not include grand totals. Add conditional formatting data bars to
show which amounts are the greatest.
b) Which funds have the largest expense budgets for fiscal year 6? Create a
PivotTable that shows fund names and budgeted amounts for fiscal year 6. Sort
the data so the greatest budget amounts are listed at the top.
c) Regardless of department, organization, or fund, which type of activity was
most costly over the entire period (hint: use the “Detail Description” field for
this question)? How much did the city pay for this activity?
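
The project itself is meant to be done with Excel PivotTables. For readers who want to cross-check their results, the same summarize-by-field logic can be sketched in Python. This is only a sketch: the rows and the field names (`fiscal_year`, `fund`, `budget`) are hypothetical and are not the workbook's real headers, and `pivot_sum` is a helper invented for this example.

```python
from collections import defaultdict

# Hypothetical rows; the field names are assumptions, not the workbook's headers.
rows = [
    {"fiscal_year": 5, "fund": "General Fund", "budget": 500.0},
    {"fiscal_year": 6, "fund": "General Fund", "budget": 650.0},
    {"fiscal_year": 6, "fund": "Water Fund", "budget": 900.0},
    {"fiscal_year": 6, "fund": "General Fund", "budget": 100.0},
]


def pivot_sum(records, key, value):
    """Sum `value` for each distinct `key`, like a one-field PivotTable."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[value]
    return dict(totals)


# Part (a): budgeted amount per fiscal year.
by_year = pivot_sum(rows, "fiscal_year", "budget")

# Part (b): fiscal year 6 only, summarized by fund, largest budget first.
year6 = [r for r in rows if r["fiscal_year"] == 6]
by_fund = sorted(
    pivot_sum(year6, "fund", "budget").items(),
    key=lambda item: item[1],
    reverse=True,
)

print(by_year)
print(by_fund)
```

Part (c) follows the same pattern: group by the description field over all years and sort the totals in descending order.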

5. Excel Project: Aggregating Data at Different Levels


You are an internal auditor for the city of Burlington, Vermont. Download the data
file “P6-3BurlingtonVermontData.xlsx” from the student download page at
http://www.pearsonglobaleditions.com. The file contains all the annual account
balance information for city departments for six fiscal years. The workbook has
two sheets. The “Annual Data” sheet contains data aggregated to the annual level
for the different city departments. The “Monthly Data” sheet contains data
aggregated to the monthly level for the different city departments. You are
planning to perform audit procedures on the “Monthly Data” sheet. Before you do,
you need to verify that the data in this sheet matches the data in the “Annual Data”
sheet, which you have already verified is correct.

REQUIRED
Analyze the two sheets and, based on your analysis, answer the following questions:
a) Under what circumstances can you not use the “Annual Data” sheet for your
audit? Said differently, why might you need the data in the “Monthly Data”
sheet?
b) On a separate worksheet in Excel, create a summary of the data that shows the
total dollar amount of transactions for each of the two sheets. Are the totals
the same for both data sets?
c) Does the total amount for transactions differ for the different departments and
years? Create two sheets: the first sheet should compare departments and the
second sheet should compare years. What do you learn from these analyses?
d) Based on your analysis of the previous questions, suggest the areas that you
believe are most important to investigate further. Why do you believe these
areas are the most important to investigate further?
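
The comparison in parts (b) and (c) amounts to rolling the monthly data up to the annual level and looking for totals that disagree. A minimal Python sketch of that reconciliation follows. The departments, amounts, and tuple layouts are hypothetical, and the `reconcile` helper and its rounding tolerance are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical data: (department, fiscal_year, amount) for the annual sheet and
# (department, fiscal_year, month, amount) for the monthly sheet.
annual = [("Police", 1, 1200.0), ("Parks", 1, 300.0), ("Police", 2, 1300.0)]
monthly = [
    ("Police", 1, 1, 600.0), ("Police", 1, 2, 600.0),
    ("Parks", 1, 1, 150.0), ("Parks", 1, 2, 100.0),
    ("Police", 2, 1, 1300.0),
]


def reconcile(annual_rows, monthly_rows, tolerance=0.005):
    """Return {(department, year): (annual_total, monthly_total)} for mismatches."""
    annual_totals = defaultdict(float)
    for dept, year, amount in annual_rows:
        annual_totals[(dept, year)] += amount

    # Roll the monthly rows up to the same (department, year) level.
    monthly_totals = defaultdict(float)
    for dept, year, _month, amount in monthly_rows:
        monthly_totals[(dept, year)] += amount

    mismatches = {}
    for key in sorted(set(annual_totals) | set(monthly_totals)):
        a = annual_totals.get(key, 0.0)
        m = monthly_totals.get(key, 0.0)
        # Ignore differences smaller than the assumed rounding tolerance.
        if abs(a - m) > tolerance:
            mismatches[key] = (a, m)
    return mismatches


differences = reconcile(annual, monthly)
print(differences)
```

Keys that appear in only one sheet are reported with a total of 0.0 on the other side, so missing departments or years surface as mismatches rather than being skipped.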

6. For the S&S Case discussed in this chapter, you receive the following output
containing basic descriptive statistics for some of the columns in the full dataset
(the chapter example problem contained only a small excerpt of the data; this problem
uses more). S&S has a total of 60 products that customers purchase across 3
categories.

REQUIRED
List the concerns you have with the data and discuss what steps you would take for
each concern you identified.
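
One way to turn such concerns into repeatable checks is sketched below in Python. The expected counts of 60 products and 3 categories come from the problem statement. Everything else is an assumption for illustration: the records, the column names, and the `find_concerns` helper are hypothetical and do not reflect the actual S&S dataset.

```python
# Hypothetical records; the column names are assumptions for this sketch.
sales = [
    {"product_id": "P001", "category": "Audio", "quantity": 2, "unit_price": 19.99},
    {"product_id": "P002", "category": "audio", "quantity": -1, "unit_price": 24.50},
    {"product_id": "P003", "category": "Video", "quantity": 1, "unit_price": None},
    {"product_id": "P004", "category": "Appliances", "quantity": 3, "unit_price": 89.00},
]

EXPECTED_PRODUCTS = 60   # from the problem statement
EXPECTED_CATEGORIES = 3  # from the problem statement


def find_concerns(records):
    """Return a list of human-readable data quality concerns."""
    concerns = []

    # More distinct values than expected suggests inconsistent coding or typos.
    products = {r["product_id"] for r in records}
    categories = {r["category"] for r in records}
    if len(products) > EXPECTED_PRODUCTS:
        concerns.append(f"{len(products)} distinct products, expected at most {EXPECTED_PRODUCTS}")
    if len(categories) > EXPECTED_CATEGORIES:
        concerns.append(f"{len(categories)} distinct categories, expected at most {EXPECTED_CATEGORIES}")

    # Negative quantities and missing prices are threshold and completeness issues.
    negatives = sum(1 for r in records if r["quantity"] is not None and r["quantity"] < 0)
    if negatives:
        concerns.append(f"{negatives} row(s) with negative quantity")
    missing = sum(1 for r in records if r["unit_price"] is None)
    if missing:
        concerns.append(f"{missing} row(s) with missing unit price")
    return concerns


concerns = find_concerns(sales)
for concern in concerns:
    print(concern)
```

In this made-up sample, "Audio" and "audio" count as two categories, which is how an inconsistent spelling would push the distinct count above the expected 3.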
