CHAPTER 8 Data Structures and CAATTs
DATA STRUCTURES AND CAATTS FOR DATA EXTRACTION
LEARNING OBJECTIVES
After studying this chapter, you should:
• Understand the components of data structures and how these are
used to achieve data processing operations.
• Be familiar with structures used in flat-file systems, including
sequential, indexed, hashing, and pointer structures.
• Be familiar with relational database structures and the principles of
normalization.
• Understand the features, advantages, and disadvantages of the
embedded audit module approach to data extraction.
• Know the capabilities and primary features of generalized audit
software.
• Become familiar with the more commonly used features of Audit
Command Language (ACL).
Data structures have two fundamental components: organization and
access method.
Flat-File Structure
End users in this environment own their data files rather
than share them with other users.
Data files are structured, formatted, and arranged to suit
the specific needs of the owner or primary user.
Sequential Structure
• All records in contiguous storage spaces in specified
sequence (key field)
• Sequential files are simple & easy to process
• Application reads from beginning in sequence
Indexed Structure
So named because, in addition to the actual data file,
there is a separate index that is itself a file of
record addresses.
The data file itself may be organized either
sequentially or randomly.
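The relationship between the index and the data file can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "data file" is an in-memory list, a list position stands in for a disk address, and the names `data_file`, `index`, `read_direct`, and `read_in_key_order` are hypothetical.

```python
# Toy data file: records sit in arrival (random) order.
# A list position stands in for a disk address in this sketch.
data_file = [
    {"key": 1305, "name": "Hinge"},
    {"key": 1021, "name": "Bracket"},
    {"key": 1187, "name": "Washer"},
]

# The index is a separate structure holding (key -> record address) entries.
index = {record["key"]: address for address, record in enumerate(data_file)}


def read_direct(key):
    """Look up the address in the index, then fetch that single record."""
    return data_file[index[key]]


def read_in_key_order():
    """Batch-style processing: walk the index in key sequence."""
    return [data_file[index[k]] for k in sorted(index)]
```

Even though the data file is unordered, the index supports both direct access to one record and processing in key sequence.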
Virtual Storage Access Method
(VSAM) Structure
Used for very large files that require routine batch
processing and a moderate degree of individual record
processing.
Hashing Structure
Employs an algorithm that converts the primary key of a
record directly into a storage address
• Advantage: Fast access speed
• Disadvantage: Inefficient use of storage space
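To make the key-to-address conversion concrete, here is a minimal sketch that assumes a division-remainder hashing algorithm and a tiny in-memory storage area. The slot count, the sample keys, and the function names are all hypothetical; a real system would also need a strategy for handling collisions.

```python
NUM_SLOTS = 11  # assumed number of storage locations (a prime is a common choice)

storage = [None] * NUM_SLOTS


def hash_address(primary_key: int) -> int:
    """Convert a primary key directly into a storage address (division-remainder)."""
    return primary_key % NUM_SLOTS


def store(record):
    """Place a record at its hashed address; this sketch simply rejects collisions."""
    slot = hash_address(record["key"])
    if storage[slot] is not None:
        raise ValueError(f"collision at slot {slot}")
    storage[slot] = record
    return slot


def fetch(primary_key):
    """Recompute the address from the key: no index and no sequential search."""
    return storage[hash_address(primary_key)]


for key in (15943, 20872, 33418):
    store({"key": key})

unused_slots = storage.count(None)  # empty locations illustrate the storage cost
```

Retrieval needs only one calculation, which is the speed advantage. The empty slots, and the fact that two different keys can hash to the same address, illustrate the storage inefficiency.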
Pointer Structures
Stores the address (pointer) of a related record in a
field within each data record.
Pointers provide connections between records and
may be used to link records between files.
Three types of pointers:
1. Physical address pointer: contains the actual disk storage location, which allows
direct access to the record.
Advantage: Access speed.
Disadvantages: If the related record moves, the pointer must be changed. Because
physical pointers have no logical relationship to the records they identify, if a
pointer is lost or destroyed, the record it references is also lost.
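The physical address pointer can be illustrated with a small sketch. It assumes a dictionary standing in for disk storage (address mapped to record) and a hypothetical `next` field that holds the address of the next related record.

```python
# Toy "disk": address -> record. Each record carries a pointer field
# holding the address of the next related record.
disk = {
    100: {"invoice": "A-1", "next": 340},
    340: {"invoice": "A-2", "next": 215},
    215: {"invoice": "A-3", "next": None},  # None marks the end of the chain
}


def follow_chain(start_address):
    """Visit related records by following pointers from record to record."""
    found = []
    address = start_address
    while address is not None:
        record = disk[address]
        found.append(record["invoice"])
        address = record["next"]
    return found


def move_record(old_address, new_address):
    """Relocating a record means every pointer to it must be updated as well."""
    disk[new_address] = disk.pop(old_address)
    for record in disk.values():
        if record["next"] == old_address:
            record["next"] = new_address
```

If `move_record` skipped the pointer update, the chain starting at address 100 would break at the moved record, which is the maintenance burden noted above.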
Linkages Between Relational Tables
User Views
A user view (external schema) is the set of data that a particular user sees:
a view of part or all of the contents of a database, specified to
facilitate a particular purpose or user activity.
Anomalies, Structural Dependencies and Data
Normalization
Database Anomalies
A database anomaly is an inconsistency in the data resulting
from an operation like an update, insertion, or deletion.
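A small example may help show how such inconsistencies arise. The sketch below assumes an unnormalized inventory table in which supplier details repeat on every row; the data and field names are hypothetical.

```python
# Unnormalized rows: supplier details repeat on every inventory row.
inventory = [
    {"part": 1, "supplier": 22, "supplier_city": "Tulsa"},
    {"part": 2, "supplier": 22, "supplier_city": "Tulsa"},
    {"part": 3, "supplier": 27, "supplier_city": "Boise"},
]

# Update anomaly: supplier 22 moves, but only one of its rows is changed,
# so the table now holds two different cities for the same supplier.
inventory[0]["supplier_city"] = "Denver"
cities_for_22 = {row["supplier_city"] for row in inventory if row["supplier"] == 22}

# Deletion anomaly: removing part 3 also erases the only record of supplier 27.
inventory = [row for row in inventory if row["part"] != 3]
suppliers_left = {row["supplier"] for row in inventory}
```

An insertion anomaly is the mirror image: in this layout a new supplier cannot be recorded until at least one part is purchased from it.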
Update Anomaly
Insertion Anomaly
Deletion Anomaly
Auditors and Data Normalization
DESIGNING RELATIONAL DATABASES
Identify Entities
The four key entities:
Inventory (Inventory Status Report)
Supplier
Inventory Purchases (Purchase Order)
Inventory Receipts (Receiving Report).
Construct a Data Model Showing Entity Associations
(Cardinality)
Add Primary Keys and Attributes to the Model
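The design steps above can be sketched as a small schema using Python's built-in `sqlite3` module. The four tables mirror the four entities; every column name is an assumption made for illustration, and linking each purchase order to a single part is a deliberate simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE inventory (
    part_id          INTEGER PRIMARY KEY,
    description      TEXT NOT NULL,
    quantity_on_hand INTEGER NOT NULL DEFAULT 0
);
-- Each purchase order is placed with one supplier (foreign key).
CREATE TABLE purchase_order (
    po_id       INTEGER PRIMARY KEY,
    supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id),
    part_id     INTEGER NOT NULL REFERENCES inventory(part_id),
    quantity    INTEGER NOT NULL
);
-- Each receiving report refers back to the purchase order it fulfils.
CREATE TABLE receiving_report (
    receipt_id        INTEGER PRIMARY KEY,
    po_id             INTEGER NOT NULL REFERENCES purchase_order(po_id),
    quantity_received INTEGER NOT NULL
);
""")
```

The primary keys identify each entity occurrence, and the foreign keys implement the associations between entities: a purchase order that names a nonexistent supplier is rejected by the database.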
COMMERCIAL DATABASE SYSTEM
Designed to comply with proven industry best practices and to satisfy the most
common needs of different client organizations. For example, all
organizations that sell products to customers will need an inventory table,
a customer table, a supplier table, and so forth.
Identify important transactions live while they are being processed
and extract them.
Examples:
• Errors
• Fraud
• Compliance
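The idea of capturing transactions while they are live can be sketched as an audit hook inside ordinary processing code. Everything here is hypothetical: the materiality threshold, the `audit_file` list, and the function names are assumptions chosen only to illustrate the approach.

```python
audit_file = []  # transactions set aside for the auditor's later review


def is_of_audit_interest(txn, threshold=50_000):
    """Hypothetical selection rule: flag transactions at or above a threshold."""
    return txn["amount"] >= threshold


def process_transaction(txn, ledger):
    """Normal application processing with an embedded audit hook."""
    if is_of_audit_interest(txn):
        audit_file.append(dict(txn))  # copy captured while the transaction is live
    ledger.append(txn)  # regular processing continues either way
```

Because the hook lives inside the production program, any change to that program can affect the module, which is why its integrity must be re-verified after maintenance.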
EMBEDDED AUDIT MODULE DISADVANTAGES
Verifying EAM integrity, such as in environments with a high level of program
maintenance
GENERALIZED AUDIT SOFTWARE
Generalized audit software (GAS) is the most widely used CAATT for IS auditing. GAS
allows auditors to access electronically coded data files and perform various
operations on their contents.
Some of the more common uses for GAS include:
• Footing and balancing entire files or selected data items
• Selecting and reporting detailed data contained in files
• Selecting stratified statistical samples from data files
• Formatting results of tests into reports
• Printing confirmations in either standardized or special wording
• Screening data and selectively including or excluding items
• Comparing multiple files and identifying any differences
• Recalculating data fields
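Two of the uses listed above, footing a file and comparing two files, can be sketched in plain Python. This is not how any particular GAS product is implemented; the record layout, the sample data, and the function names are assumptions for illustration.

```python
from decimal import Decimal


def foot(records, field):
    """Total a numeric field across every record in a file."""
    return sum((Decimal(r[field]) for r in records), Decimal("0"))


def compare_files(file_a, file_b, key):
    """Return keys found only in A, keys found only in B, and keys whose records differ."""
    a = {r[key]: r for r in file_a}
    b = {r[key]: r for r in file_b}
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    changed = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return only_a, only_b, changed


# Hypothetical invoice files; amounts are kept as strings and totalled as Decimal.
ledger = [
    {"inv": 1, "amount": "100.10"},
    {"inv": 2, "amount": "250.25"},
    {"inv": 3, "amount": "75.00"},
]
subledger = [
    {"inv": 1, "amount": "100.10"},
    {"inv": 2, "amount": "205.25"},
    {"inv": 4, "amount": "10.00"},
]

ledger_total = foot(ledger, "amount")
differences = compare_files(ledger, subledger, "inv")
```

The footed total would be balanced against a control total, and each key returned by `compare_files` becomes an item for follow-up.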
GENERALIZED AUDIT SOFTWARE
is popular because…
• GAS software is easy to use and requires little computer background.
• Many GAS products are platform independent and work on mainframes and PCs.
• Auditors can perform tests independently of IT staff.
• GAS can be used to audit the data currently being stored in most file
structures and formats.
USING GAS TO ACCESS SIMPLE FILE STRUCTURE
USING GAS TO ACCESS COMPLEX FILE STRUCTURE
AUDIT COMMAND LANGUAGE (ACL) SOFTWARE
• A proprietary version of GAS
• Leader in the industry
• Designed as an auditor-friendly language
• Access to data generally easy with the Open Database Connectivity
(ODBC) interface
DATA DEFINITION
One of ACL’s strengths is the ability to read data stored in most
formats. ACL uses the data definition feature for this purpose. The data
definition screen allows the auditor to define important characteristics of
the source file.
CUSTOMIZING A VIEW
A view is simply a way of looking at data in a file; auditors seldom
need to use all the data contained in a file. ACL allows the auditor to
customize the original view created during data definition to one that
better meets his or her audit needs.
FILTERING DATA
ACL provides powerful options for filtering data that support various
audit tests. Filters are expressions that search for records that meet
the filter criteria. ACL’s expression builder allows the auditor to use
logical operators such as AND, OR, NOT and others to define and
test conditions of any complexity and to process only those records
that match specific conditions.
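The logic of a compound filter can be illustrated in Python rather than in ACL's own expression syntax, which is not reproduced here. The sales records and the filter condition below are hypothetical.

```python
# Hypothetical sales records.
sales = [
    {"invoice": 1, "amount": 12000, "region": "East", "approved": True},
    {"invoice": 2, "amount": 800, "region": "West", "approved": False},
    {"invoice": 3, "amount": 15000, "region": "West", "approved": False},
    {"invoice": 4, "amount": 300, "region": "East", "approved": True},
]


def matches(record):
    """Filter: (amount > 10,000 OR region is West) AND NOT approved."""
    return (record["amount"] > 10_000 or record["region"] == "West") and not record["approved"]


# Only records that satisfy the filter are passed on to later audit tests.
filtered = [record for record in sales if matches(record)]
```

Here invoices 2 and 3 pass the filter; invoice 1 is large but already approved, so the NOT condition excludes it.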
STRATIFYING DATA
ACL’s stratification feature allows the auditor to view the distribution of
records that fall into specified strata. Data can be stratified on any numeric
field such as sales price, unit cost, quantity sold, and so on. The data are
summarized and classified by strata, which can be equal in size (called
intervals) or vary in size (called free).
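A rough sketch of stratifying on equal-size intervals follows. It is an illustration in Python, not ACL's implementation; the function name, the sample unit costs, and the choice to fold the upper boundary into the last stratum are assumptions.

```python
def stratify(values, minimum, maximum, intervals):
    """Count and total the values that fall into equal-width strata."""
    width = (maximum - minimum) / intervals
    strata = [
        {"low": minimum + i * width, "high": minimum + (i + 1) * width,
         "count": 0, "total": 0}
        for i in range(intervals)
    ]
    for value in values:
        if not minimum <= value <= maximum:
            continue  # out-of-range values are ignored in this sketch
        # A value equal to the maximum is placed in the last stratum.
        i = min(int((value - minimum) / width), intervals - 1)
        strata[i]["count"] += 1
        strata[i]["total"] += value
    return strata


unit_costs = [5, 12, 18, 27, 33, 40, 95, 100]  # hypothetical numeric field
strata = stratify(unit_costs, 0, 100, 4)
```

The count and total per stratum show where the records, and the monetary value, are concentrated, which feeds directly into the choice of sampling method.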
STATISTICAL ANALYSIS
ACL offers many sampling methods for statistical analysis.
Two of the most frequently used are record sampling and
monetary unit sampling (MUS). Each method allows
random and interval sampling. The choice of method
depends on the auditor’s strategy and the composition of the
file being audited. For example, when records in a file
are fairly evenly distributed across strata, the auditor may want
an unbiased sample and will thus choose the record sampling
approach.
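The difference between the two methods can be sketched with interval selection in Python. This is a simplified illustration, not ACL's algorithm: the function names, the sample invoices, and the fixed starting points are assumptions, and the MUS sketch assumes all amounts are positive.

```python
def record_sample(records, interval, start):
    """Interval record sampling: every `interval`-th record, regardless of value."""
    return records[start::interval]


def monetary_unit_sample(records, field, interval, start):
    """Interval MUS: select the record that contains every `interval`-th monetary unit."""
    selected, cumulative, next_hit = [], 0, start
    for record in records:
        cumulative += record[field]
        if cumulative >= next_hit:
            selected.append(record)
            while next_hit <= cumulative:  # one large item can absorb several hits
                next_hit += interval
    return selected


# Hypothetical invoice file with one very large item.
amounts = [100, 50, 5000, 200, 150, 75]
invoices = [{"id": i + 1, "amount": a} for i, a in enumerate(amounts)]

by_record = record_sample(invoices, interval=3, start=1)
by_money = monetary_unit_sample(invoices, "amount", interval=1000, start=400)
```

Record sampling gives every invoice the same chance of selection, while MUS weights selection by amount, so the 5,000 invoice is certain to be picked whenever the interval is smaller than its value.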
SUMMARY