Chapter 9

The document discusses the evolution from flat-file data management to database systems, highlighting the advantages of databases such as reduced data redundancy, single updates, and improved data currency. It outlines the database design process, including normalization to eliminate anomalies, and the importance of a Database Management System (DBMS) for managing access and data integrity. Additionally, it addresses distributed databases, their benefits and challenges, and the implications for accounting practices in maintaining accurate records.

Uploaded by

JM TOME
Copyright
© All Rights Reserved

Objectives for Chapter 9

Problems inherent in the flat file approach to data management
that gave rise to the database concept
Relationships among the defining elements of the database
environment
Anomalies caused by unnormalized databases and the need for
data normalization
Stages in database design: entity identification, data modeling,
constructing the physical database, and preparing user views
Features of distributed databases and issues to consider in
deciding on a particular database configuration
Flat-File Versus Database Environments
Computer processing involves two components: data and
instructions (programs).
Conceptually, there are two methods for designing the
interface between program instructions and data:
File-oriented processing: A specific data file is created for
each application.
Data-oriented processing: A single data repository is created to
support numerous applications.
Disadvantages of file-oriented processing include
redundant data and programs and varying formats for
storing the redundant data.
Flat-File Environment
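[Figure: each user's transactions are processed by a separate program with its own data file.
User 1: Program 1, data A, B, C
User 2: Program 2, data X, B, Y
User 3: Program 3, data L, B, M
Data element B is stored in all three files.]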
Data Redundancy and Flat-File Problems
Data Storage - creates excessive storage costs for
paper documents and/or magnetic media
Data Updating - any changes or additions must be
performed multiple times
Currency of Information - potential problem of
failing to update all affected files
Task-Data Dependency - user’s inability to obtain
additional information as his or her needs change
Database Approach

[Figure: Users 1, 2, and 3 submit transactions to Programs 1, 2, and 3.
All three programs reach a single shared database (A, B, C, X, Y, L, M) through the DBMS.]
Advantages of the Database Approach
Data sharing through a centralized database resolves flat-file problems:
No data redundancy: Data is stored only once, eliminating data
redundancy and reducing storage costs.
Single update: Because data is in only one place, it requires only a
single update, reducing the time and cost of keeping the
database current.
Current values: A change to the database made by any user yields
current data values for all other users.
Task-data independence: As users’ information needs expand, the
new needs can be more easily satisfied than under the flat-file
approach.
Disadvantages of the Database Approach
Can be costly to implement
additional hardware, software, storage, and network
resources are required
Can only run in certain operating environments
may make it unsuitable for some system configurations
Because it is so different from the
file-oriented approach, the database
approach requires user training
there may be inertia or resistance
Elements of the Database Environment
[Figure: users submit transactions through application programs to the DBMS.
The DBMS comprises the data definition language, the data manipulation language, and the query language.
The DBMS works through the host operating system to reach the physical database.
Users may also submit queries directly through the query language.
The database administrator handles system requests and is linked to the system development process.]
Internal Controls and DBMS
The database management system (DBMS) stands between
the user and the database per se.
Thus, commercial DBMSs (e.g., Access or Oracle) actually
consist of a database plus:
software to manage the database, especially controlling
access and other internal controls
software to generate reports, create data-entry forms,
etc.
The DBMS has special software that knows which data
elements each user is authorized to access and denies
unauthorized requests for data.
DBMS Features
Program Development - user-created applications
Backup and Recovery - copies the database
Database Usage Reporting - captures statistics on
database usage (who, when, etc.)
Database Access - authorizes access to sections of the
database
Also…
User Programs - makes the presence of the DBMS
transparent to the user
Direct Query - allows authorized users to access data
without programming
Data Definition Language (DDL)
DDL is a programming language used to define
the database per se.
It identifies the names and the relationship of all data
elements, records, and files that constitute the
database.
DDL defines the database on three viewing levels
Internal view – physical arrangement of records (1
view)
Conceptual view (schema) – representation of
database (1 view)
User view (subschema) – the portion of the database
each user views (many views)
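The conceptual and user views can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table, column, and view names are hypothetical, and SQLite manages the internal (physical) view itself, so it does not appear in the code.

```python
import sqlite3

# Hypothetical schema: all names below are illustrative, not from the chapter.
conn = sqlite3.connect(":memory:")

# Conceptual view (schema): the logical definition of the whole table.
conn.execute("""
    CREATE TABLE customer (
        cust_num     INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        credit_limit REAL NOT NULL
    )
""")

# User view (subschema): a sales clerk sees names but not credit limits.
conn.execute("CREATE VIEW sales_clerk_view AS SELECT cust_num, name FROM customer")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Supply', 5000.0)")

cursor = conn.execute("SELECT * FROM sales_clerk_view")
clerk_columns = [col[0] for col in cursor.description]
clerk_rows = cursor.fetchall()
```

Many user views can be defined over the same conceptual schema, one per user group.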
Data Manipulation Language (DML)
DML is the proprietary programming language
that a particular DBMS uses to retrieve, process,
and store data to / from the database.
Entire user programs may be written in the
DML, or selected DML commands can be
inserted into programs written in universal
languages, such as COBOL and FORTRAN.
Can be used to ‘patch’ third party applications to
the DBMS
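A minimal sketch of DML commands inserted into a host-language program. As assumptions for illustration, Python stands in for the universal language, SQL statements run through sqlite3 stand in for a DBMS-specific DML, and the inventory table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (part_num INTEGER PRIMARY KEY, descr TEXT, qty INTEGER)"
)

# Store: add new records to the database.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [(1, "bracket", 100), (2, "hinge", 40)],
)

# Process: the host program supplies the quantity sold; the DML command applies it.
qty_sold = 15
conn.execute("UPDATE inventory SET qty = qty - ? WHERE part_num = ?", (qty_sold, 1))

# Retrieve: read the stored values back into the host program.
rows = conn.execute("SELECT part_num, qty FROM inventory ORDER BY part_num").fetchall()
on_hand = dict(rows)
```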
Query Language
The query capability permits end users and
professional programmers to access data in the
database without the need for conventional programs.
Can be an internal control issue since users may be
making an ‘end run’ around the controls built into the
conventional programs
IBM’s structured query language (SQL) is a fourth-
generation language that has emerged as the standard
query language.
Adopted by ANSI as the standard language for all
relational databases
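An ad hoc SQL query might look like the sketch below. The sales table and its values are made up for illustration; the query answers a one-off question without a conventional application program being written for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (invoice_num INTEGER PRIMARY KEY, cust_num INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(101, 1, 250.0), (102, 2, 75.0), (103, 1, 125.0)],
)

# Ad hoc question: which customers have bought more than 100 in total?
query = """
    SELECT cust_num, SUM(amount) AS total
    FROM sales
    GROUP BY cust_num
    HAVING SUM(amount) > 100
    ORDER BY cust_num
"""
big_customers = conn.execute(query).fetchall()
```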
Database Conceptual Models
Refers to the particular method used to organize
records in a database
A.k.a. “logical data structures”
Objective: develop the database efficiently so that data
can be accessed quickly and easily
There are three main models:
hierarchical (tree structure)
network
relational
Most existing databases are relational. Some legacy systems
use hierarchical or network databases.
The Relational Model
The relational model portrays data in the form of
two dimensional ‘tables’.
Its strength is the ease with which tables may be
linked to one another.
By contrast, linking records is a major weakness of
hierarchical and network databases.
Relational model is based on the relational
algebra functions of restrict, project, and join.
Relational Algebra
RESTRICT – filtering out rows (shown in dark blue in the original figure)
PROJECT – filtering out columns (shown in light blue in the original figure)
JOIN – build a new table or data set from multiple existing tables

Table 1     Table 2     JOIN result
X1 Y1       Y1 Z1       X1 Y1 Z1
X2 Y2       Y2 Z2       X2 Y2 Z2
X3 Y1       Y3 Z3       X3 Y1 Z1
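The three functions can be sketched in a few lines of Python over the tables from the JOIN illustration. The function and variable names are assumptions for this sketch; a real DBMS performs these operations internally.

```python
# Tables from the JOIN illustration, stored as lists of rows (dicts).
table_xy = [{"X": "X1", "Y": "Y1"}, {"X": "X2", "Y": "Y2"}, {"X": "X3", "Y": "Y1"}]
table_yz = [{"Y": "Y1", "Z": "Z1"}, {"Y": "Y2", "Z": "Z2"}, {"Y": "Y3", "Z": "Z3"}]

def restrict(table, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in table if predicate(row)]

def project(table, columns):
    """Keep only the named columns of every row (duplicates are not removed here)."""
    return [{col: row[col] for col in columns} for row in table]

def join(left, right, key):
    """Combine rows of left and right that share the same value of key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

joined = join(table_xy, table_yz, "Y")
only_y1 = restrict(joined, lambda row: row["Y"] == "Y1")
x_and_z = project(only_y1, ["X", "Z"])
```

Row Y3 Z3 has no match in the first table, so it does not appear in the joined result.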
Associations and Cardinality
Association – the labeled line connecting two entities
or tables in a data model
Describes the nature of the association between them
Represented with a verb, such as ships, requests, or
receives
Cardinality – the degree of association between two
entities
The number of possible occurrences in one table that are
associated with a single occurrence in a related table
Used to determine primary keys and foreign keys
Properly Designed Relational Tables
Each row in the table must be unique in at least one
attribute, which is the primary key.
Tables are linked by embedding the primary key into
the related table as a foreign key.
The attribute values in any column must all be of the
same class or data type.
Each column in a given table must be uniquely named.
Tables must conform to the rules of normalization,
i.e., free from structural dependencies or anomalies.
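The first two rules can be illustrated with sqlite3. This is a sketch under assumptions: the customer and sales_invoice tables are hypothetical, and SQLite enforces foreign keys only when the PRAGMA shown is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE customer (cust_num INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# The customer primary key is embedded in the related table as a foreign key.
conn.execute("""
    CREATE TABLE sales_invoice (
        invoice_num INTEGER PRIMARY KEY,
        amount      REAL NOT NULL,
        cust_num    INTEGER NOT NULL REFERENCES customer(cust_num)
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme Supply')")
conn.execute("INSERT INTO sales_invoice VALUES (101, 250.0, 1)")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A second row with the same primary key value is not unique, so it is refused.
duplicate_key_rejected = is_rejected("INSERT INTO customer VALUES (1, 'Duplicate Co')")
# An invoice pointing at a customer that does not exist is refused as well.
orphan_invoice_rejected = is_rejected("INSERT INTO sales_invoice VALUES (102, 90.0, 99)")
```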
Three Types of Anomalies
Insertion Anomaly: A new item cannot be added
to the table until at least one entity uses a particular
attribute item.
Deletion Anomaly: If an attribute item used by
only one entity is deleted, all information about
that attribute item is lost.
Update Anomaly: A modification on an attribute
must be made in each of the rows in which the
attribute appears.
Anomalies can be corrected by creating additional
relational tables.
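A small sketch of the three anomalies, using a single unnormalized inventory table held as a Python list. All part and supplier values are made up for illustration, and the primary key is assumed to be the part number.

```python
# Supplier details repeat on every part row of this unnormalized table.
inventory = [
    {"part": 1, "descr": "bracket", "supplier": 22, "supplier_addr": "12 Oak St"},
    {"part": 2, "descr": "hinge",   "supplier": 22, "supplier_addr": "12 Oak St"},
    {"part": 3, "descr": "gasket",  "supplier": 27, "supplier_addr": "9 Elm Ave"},
]

def can_insert(row):
    """A row needs a primary key value (part), so it cannot be stored without one."""
    return row.get("part") is not None

# Insertion anomaly: a new supplier that supplies no part yet cannot be recorded.
insertion_anomaly = not can_insert({"part": None, "supplier": 30, "supplier_addr": "4 Ash Ln"})

# Update anomaly: changing the address on one row leaves the other row stale.
inventory[0]["supplier_addr"] = "50 Pine Rd"
addresses_for_22 = {row["supplier_addr"] for row in inventory if row["supplier"] == 22}
update_anomaly = len(addresses_for_22) > 1

# Deletion anomaly: deleting the only part bought from supplier 27 erases the supplier.
inventory = [row for row in inventory if row["part"] != 3]
deletion_anomaly = all(row["supplier"] != 27 for row in inventory)
```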
Advantages of Relational Tables
Removes all three types of
anomalies
Various items of interest
(customers, inventory, sales) are
stored in separate tables.
Space is used efficiently.
Very flexible – users can form ad
hoc relationships
The Normalization Process
A process which systematically splits
unnormalized complex tables into smaller tables
that meet two conditions:
all nonkey (secondary) attributes in the table are
dependent on the primary key
all nonkey attributes are independent of the other
nonkey attributes
When unnormalized tables are split and reduced to
third normal form, they must then be linked together
by foreign keys.
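As a rough sketch of the idea, the Python below splits a table whose supplier address depends on the supplier number (a nonkey attribute) rather than on the primary key. The data and names are hypothetical, and only the transitive dependency is removed here.

```python
# Unnormalized rows: supplier_addr depends on supplier, not on the primary key (part).
flat = [
    {"part": 1, "descr": "bracket", "supplier": 22, "supplier_addr": "12 Oak St"},
    {"part": 2, "descr": "hinge",   "supplier": 22, "supplier_addr": "12 Oak St"},
    {"part": 3, "descr": "gasket",  "supplier": 27, "supplier_addr": "9 Elm Ave"},
]

# Supplier table: one row per supplier, keyed by supplier number.
suppliers = {row["supplier"]: {"supplier_addr": row["supplier_addr"]} for row in flat}

# Part table: nonkey attributes depend only on part; supplier stays as a foreign key.
parts = {row["part"]: {"descr": row["descr"], "supplier": row["supplier"]} for row in flat}

# Following the foreign key rebuilds the original rows, so no information was lost.
rebuilt = [
    {"part": p, "descr": a["descr"], "supplier": a["supplier"], **suppliers[a["supplier"]]}
    for p, a in parts.items()
]
```

Each supplier address is now stored once, so a change of address is a single update.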
Steps in Normalization
Unnormalized table with repeating groups
  Remove repeating groups
First normal form (1NF)
  Remove partial dependencies
Second normal form (2NF)
  Remove transitive dependencies
Third normal form (3NF)
  Remove remaining anomalies
Higher normal forms
Accountants and Data Normalization
Update anomalies can generate conflicting and obsolete
database values.
Insertion anomalies can result in unrecorded
transactions and incomplete audit trails.
Deletion anomalies can cause the loss of accounting
records and the destruction of audit trails.
Accountants should understand the data normalization process
and be able to determine whether a database is properly
normalized.
Six Phases in Designing Relational Databases
1. Identify entities
identify the primary entities of the
organization
construct a data model of their relationships
2. Construct a data model showing entity
associations
determine the associations between entities
model associations into an ER diagram
3. Add primary keys and attributes
assign primary keys to all entities in the model to
uniquely identify records
every attribute should appear in one or more user
views
4. Normalize and add foreign keys
remove repeating groups, partial and transitive
dependencies
assign foreign keys to be able to link tables
5. Construct the physical database
create physical tables
populate tables with data
6. Prepare the user views
normalized tables should support all
required views of system users
user views restrict users from having access to
unauthorized data
Distributed Data Processing (DDP)
Data processing is organized around several information
processing units (IPUs) distributed throughout the
organization.
Each IPU is placed under the control of the end user.
DDP does not always mean total decentralization.
IPUs in a DDP system are still connected to one another and
coordinated.
Typically, DDP systems use a centralized database.
Alternatively, the database can be distributed, similar to the
distribution of the data processing capability.
Centralized Databases in DDP Environment
The data is retained in a central location.
Remote IPUs send requests for data.
Central site services the needs of the remote IPUs.
The actual processing of the data is performed at the remote
IPU.
Advantages of DDP
Cost reductions in hardware and data entry tasks
Improved cost control responsibility
Improved user satisfaction since control is closer to
the user level
Backup of data can be improved through the use of
multiple data storage sites
Disadvantages of DDP
Loss of control
Mismanagement of resources
Hardware and software incompatibility
Redundant tasks and data
Consolidating incompatible tasks
Difficulty attracting qualified personnel
Lack of standards
Data Currency
Data currency problems occur in DDP with a centralized database.
During transaction processing, data will
temporarily be inconsistent as records are read
and updated.
Database lockout procedures are necessary to
keep IPUs from reading inconsistent data and
from writing over a transaction being written
by another IPU.
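A lockout can be sketched with a lock around the read-update-write sequence. In this illustration two Python threads stand in for two IPUs and a single variable stands in for a shared record; a real DBMS would lock records or tables rather than a program variable.

```python
import threading

balance = 1000                    # a shared record held at the central site
record_lock = threading.Lock()    # the lockout on that record

def post_payment(amount, times):
    """Simulate an IPU that repeatedly reads, updates, and rewrites the record."""
    global balance
    for _ in range(times):
        with record_lock:               # other IPUs wait until this update completes
            current = balance           # read
            balance = current - amount  # write

ipus = [threading.Thread(target=post_payment, args=(1, 500)) for _ in range(2)]
for ipu in ipus:
    ipu.start()
for ipu in ipus:
    ipu.join()
```

Because each update holds the lock from read to write, neither IPU can overwrite the other's change.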
Distributed Databases: Partitioning
Splits the central database into segments
that are distributed to their primary users
Advantages:
users’ control is increased by having data stored
at local sites
transaction processing response time is
improved
volume of transmitted data between IPUs is
reduced
reduces the potential data loss from a disaster
The Deadlock Phenomenon
Especially a problem with partitioned databases
Occurs when multiple sites lock each other out of
data that they are currently using
One site needs data locked by another site.
Special software is needed to analyze and resolve
conflicts.
Transactions may be terminated and restarted.
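One common way for such software to analyze a lockout is to look for a cycle in a "wait-for" graph. The sketch below assumes that technique; the site names and the choice of which transaction to terminate are hypothetical.

```python
def find_deadlock(waits_for):
    """Return a list of sites forming a cycle in the wait-for graph, or None."""
    visiting, done = [], set()

    def visit(site):
        if site in visiting:                      # back at a site on the current path
            return visiting[visiting.index(site):]
        if site in done:
            return None
        visiting.append(site)
        for other in waits_for.get(site, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(site)
        return None

    for site in waits_for:
        cycle = visit(site)
        if cycle:
            return cycle
    return None

# Hypothetical state: each site waits for data locked by the site(s) listed.
waits_for = {"site_1": ["site_2"], "site_2": ["site_3"], "site_3": ["site_1"]}
cycle = find_deadlock(waits_for)

# Resolve by terminating one transaction in the cycle; it would be restarted later.
victim = cycle[-1]
remaining = {s: [o for o in w if o != victim] for s, w in waits_for.items() if s != victim}
still_deadlocked = find_deadlock(remaining)
```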
Distributed Databases: Replication
The duplication of the entire
database for multiple IPUs
Effective for situations with a high
degree of data sharing, but no
primary user
Supports read-only queries
Data traffic between sites is reduced
considerably.
Concurrency Problems and Control Issues
Database concurrency is the presence of complete
and accurate data at all IPU sites.
With replicated databases, maintaining current data
at all locations is difficult.
Time stamping is used to serialize transactions.
Prevents and resolves conflicts created by updating data
at various IPUs
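A simplified sketch of serializing by timestamp: every replica applies the same updates in timestamp order, whatever order they arrive in. The update records and account names are hypothetical, and timestamps are assumed to be unique.

```python
# Hypothetical updates made at different IPUs: (timestamp, site, account, new_balance)
updates = [
    (1003, "site_B", "acct_1", 700),
    (1001, "site_A", "acct_1", 900),
    (1002, "site_C", "acct_2", 150),
]

def apply_in_timestamp_order(replica, pending):
    """Apply updates to a replica in timestamp order, regardless of arrival order."""
    for _, _, account, new_balance in sorted(pending):
        replica[account] = new_balance
    return replica

start = {"acct_1": 1000, "acct_2": 100}

# The two replicas receive the same updates in different arrival orders.
replica_a = apply_in_timestamp_order(dict(start), updates)
replica_b = apply_in_timestamp_order(dict(start), list(reversed(updates)))
```

Both replicas end with the same values because the latest timestamp for each account is applied last.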
Distributed Databases and the Accountant
The following database options impact the
organization’s ability to maintain database integrity,
to preserve audit trails, and to have accurate
accounting records.
Centralized or distributed data?
If distributed, replicated or partitioned?
If replicated, total or partial replication?
If partitioned, what allocation of the data segments
among the sites?
