
Accounting Information Systems

Fifteenth Edition, Global Edition

Chapter 6
Transforming Data

• Copyright © 2021 Pearson Education Ltd.


Data Structuring
• Data structuring is the process of changing the
organization and relationships among data fields to
prepare the data for analysis.
• Extracted data often needs to be structured in a way
that enables analysis.

• Aggregate data is the presentation of data in a
summarized form.
• Data joining is the process of combining different data
sources.
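The two structuring operations above can be sketched in plain Python. The sales records and office lookup below are hypothetical, for illustration only:

```python
from collections import defaultdict

# Hypothetical detail-level sales records and a second lookup source
sales = [
    {"region": "East", "amount": 100.0},
    {"region": "East", "amount": 250.0},
    {"region": "West", "amount": 300.0},
]
region_offices = {"East": "NY office", "West": "SF office"}

# Aggregation: summarize detail rows into one total per region
totals = defaultdict(float)
for row in sales:
    totals[row["region"]] += row["amount"]

# Joining: combine the aggregated totals with the second source
joined = [
    {"region": region, "total": total, "office": region_offices[region]}
    for region, total in totals.items()
]
```

Note that aggregation reduces many detail rows to one summary row per group, while joining widens each row with fields drawn from another source.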



Figure 6.4 Examples of Different
Levels of Aggregating Data



Data Standardization (1 of 3)
• Data standardization is the process of standardizing the
structure and meaning of each data element so it can be
analyzed and used in decision making.
– It is particularly important when merging data from
several sources.
– It may involve changing data to a common format, data
type, or coding scheme.
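As a minimal sketch of converting to a common format, the function below standardizes dates arriving in several source formats into one ISO-8601 form. The three input formats are an assumption for illustration:

```python
from datetime import datetime

# Hypothetical mix of formats arriving from different source systems
KNOWN_FORMATS = ["%d/%m/%Y", "%Y-%m-%d", "%m/%d/%Y"]

def standardize_date(value):
    """Return the value as an ISO-8601 date string.

    Tries each known source format in order; an ambiguous value
    (e.g. 08/11/2023) resolves to the first format that matches.
    """
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {value!r}")
```

For example, `standardize_date("12/30/2023")` fails the day-first format and resolves under the month-first format instead.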



Data Standardization (2 of 3)
• Data parsing involves separating data from a single field
into multiple fields.

• Data concatenation is the combining of data from two or
more fields into a single field.
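Both operations can be sketched with basic string handling; the name and address values below are hypothetical:

```python
# Data parsing: split one "Last, First" name field into two fields
full_name = "Samara, Zeina"            # hypothetical value
last_name, first_name = [p.strip() for p in full_name.split(",")]

# Data concatenation: combine two address fields into one field
city, country = "Berlin", "Germany"    # hypothetical values
location = f"{city}, {country}"
```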



Figure 6.7 Data Parsing Example



Figure 6.8 Data Concatenation Example



Data Standardization (3 of 3)
• Cryptic data values are data items that have no meaning
without understanding a coding scheme.
– When a field contains only two different responses, typically
0 or 1, this field is called a dummy variable or dichotomous
variable.
• Misfielded data values are data values that are correctly
formatted but not listed in the correct field.
  Country    City      Zip code
  Berlin     German    ZL1340

• Data consistency is the principle that every value in a field
should be stored in the same way.

  Date format dd/mm/yyyy:  21/07/2023
                           08/11/2023
                           12/30/2023 ✘
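These three ideas can be sketched together: decoding a dummy (dichotomous) variable with its coding scheme, and flagging values that break the field's dd/mm/yyyy convention. The coding scheme and order records are hypothetical:

```python
import re

# Decoding a dummy (dichotomous) variable via its coding scheme
coding = {0: "domestic", 1: "international"}   # hypothetical scheme
orders = [{"id": 1, "intl": 0}, {"id": 2, "intl": 1}]
decoded = [{**order, "intl": coding[order["intl"]]} for order in orders]

# Data consistency: flag dates that break the dd/mm/yyyy convention
DD_MM_YYYY = re.compile(r"^(0[1-9]|[12]\d|3[01])/(0[1-9]|1[0-2])/\d{4}$")
dates = ["21/07/2023", "08/11/2023", "12/30/2023"]
inconsistent = [d for d in dates if not DD_MM_YYYY.match(d)]
```

Here 12/30/2023 is flagged because 30 is not a valid month under the field's convention.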
Data Cleaning (1 of 2)
• Data cleaning is the process of updating data to be
consistent, accurate, and complete.
– Dirty data is data that is inconsistent, inaccurate, or
incomplete.
– To be useful, dirty data must be cleaned.
• Data de-duplication is the process of analyzing data and
removing redundant copies of records that contain identical
information, so that only one copy remains.
• Data filtering is the process of removing records or fields
of information from a data source.
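De-duplication and filtering can be sketched as follows. The vendor records are hypothetical, and the "required field" rule used for filtering is an assumption for illustration:

```python
# Hypothetical vendor records: one exact duplicate, one incomplete row
records = [
    {"id": 1, "vendor": "Acme"},
    {"id": 1, "vendor": "Acme"},   # identical to the first record
    {"id": 2, "vendor": ""},       # missing a required field
]

# De-duplication: keep only the first copy of identical records
seen, deduped = set(), []
for rec in records:
    key = tuple(sorted(rec.items()))
    if key not in seen:
        seen.add(key)
        deduped.append(rec)

# Filtering: remove records that lack a required vendor name
cleaned = [rec for rec in deduped if rec["vendor"]]
```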



Data Cleaning (2 of 2)

• Data contradiction errors are errors that exist when the
same entity is described in two conflicting ways.
• Data threshold violations are data errors that occur when
a data value falls outside an allowable level.
• Violated attribute dependencies are errors that occur
when a secondary attribute in a row of data does not
match the primary attribute.
• Data entry errors are all types of errors that come from
inputting data incorrectly.
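Threshold and attribute-dependency checks can be sketched as simple rule tests. The timesheet rows, the 0–24 hours rule, and the valid-ZIP mapping below are hypothetical:

```python
# Hypothetical timesheet rows; the valid ZIP mapping is an assumption
rows = [
    {"city": "Berlin", "zip": "10115", "hours": 8},
    {"city": "Berlin", "zip": "99999", "hours": 30},
]
valid_zips = {"Berlin": {"10115", "10117"}}

# Data threshold violation: hours in a day must fall within 0-24
threshold_errors = [r for r in rows if not 0 <= r["hours"] <= 24]

# Violated attribute dependency: the ZIP must belong to the row's city
dependency_errors = [r for r in rows if r["zip"] not in valid_zips[r["city"]]]
```

The second row fails both checks: 30 hours exceeds the allowable level, and the ZIP does not match its city.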



Data Validation
Data validation is the process of analyzing data to make
certain the data has the properties of high-quality data:

• Visual inspection is the process of examining data using
human vision to see if there are problems.
• Basic statistical tests can be performed to validate the
data.
• Auditing a sample is one of the best techniques for
assuring data quality.
• Advanced testing techniques are possible with a deeper
understanding of the content of data.
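A basic statistical test, such as flagging values more than two standard deviations from the mean, can be sketched as follows (the payment amounts are hypothetical):

```python
from statistics import mean, stdev

# Hypothetical payment amounts; one value is far from the others
amounts = [100, 102, 98, 101, 99, 5000]

mu, sigma = mean(amounts), stdev(amounts)
# Flag any amount more than two standard deviations from the mean
outliers = [a for a in amounts if abs(a - mu) > 2 * sigma]
```

Flagged values are not necessarily errors; they are candidates for closer review, such as auditing that sample of records.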



Key Terms
• Data structuring
• Aggregate data
• Data pivoting
• Data standardization
• Data parsing
• Data concatenation
• Cryptic data values
• Dummy variable or dichotomous variable
• Misfielded data values
• Data consistency
• Dirty data
• Data cleaning
• Data de-duplication
• Data filtering
• Data imputation
• Data contradiction errors
• Data threshold violations
• Violated attribute dependencies
• Data entry errors
• Data validation
• Visual inspection
