Chapt 1 Database Models
Chapter One
DATABASE MODELS
Overview
Flat Files
Hierarchical Database Model
Network Database Models
ISAM
Relational Database Model
Object-Oriented Database Model
Characteristics of a Relational Database: Basic Concepts
Characteristics of a Relational Database: Commercial Products
Characteristics of a Relational Database: Codd's Rules
In its simplest form, a flat-file database is nothing more than a single, large
table (e.g., a spreadsheet).
A flat file contains only one record structure; there are no links between separate
records.
Access to data is done in a sequential manner; random access is not supported.
Access times are slow because the entire file must be scanned to locate the desired
data.
Terrence Brunton
Access times can be improved if the data is sorted, but this introduces the potential
for error (e.g., one or more records may be misfiled).
Other problems with a flat-file database include 1) data redundancy, 2) data
maintenance, and 3) data integrity.
EXAMPLE: An orders file might require fields for the order number, date,
customer name, customer address, quantity, part description, price, etc. In this
example, each record must repeat the name and address of the customer (data
redundancy). If the customer's address changed, it would have to be changed in
multiple locations (data maintenance). If the customer name were spelled differently
in different orders (e.g., Acme, Acme Inc, Acme Co.), then the data would be
inconsistent (data integrity). (See the Normalization Exercise for a practical example
of these deficiencies.)
Flat-file data structures are only viable for small data processing requirements.
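The orders example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the file contents, field names, and helper functions (load_orders, find_order, names_at_address) are hypothetical and are not part of the original notes. It shows a sequential scan to locate a record and how repeated customer data drifts into inconsistent spellings.

```python
import csv
import io

# Hypothetical flat orders file: every record repeats the customer details.
FLAT_FILE = """order_no,date,customer_name,customer_address,quantity,part,price
1001,2024-01-05,Acme,12 Main St,4,Widget,2.50
1002,2024-01-09,Acme Inc,12 Main St,1,Gasket,0.75
1003,2024-01-12,Bolt Bros,7 High St,2,Widget,2.50
1004,2024-01-20,Acme Co.,12 Main St,6,Flange,4.00
"""

def load_orders(text):
    """Read the whole flat file into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))

def find_order(orders, order_no):
    """Sequential scan: inspect records one by one until the key matches."""
    for scanned, row in enumerate(orders, start=1):
        if row["order_no"] == order_no:
            return row, scanned
    return None, len(orders)

def names_at_address(orders, address):
    """Collect every spelling of the customer name recorded at one address."""
    return sorted({r["customer_name"] for r in orders
                   if r["customer_address"] == address})

orders = load_orders(FLAT_FILE)
row, scanned = find_order(orders, "1004")
print(row["part"], scanned)                    # part found, records scanned
print(names_at_address(orders, "12 Main St"))  # three spellings, one customer
```

Locating the last order requires reading every record before it, and the single customer at 12 Main St appears under three different names.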
The hierarchical and network database models preceded the relational model;
today very few commercial databases use either of these models.
A hierarchical database is a series of flat files linked in structured 'tree'
relationships.
IBM's IMS (Information Management System) database, often referred to by the
name of its proprietary language, DL/I (Data Language I), is the only viable
commercial hierarchical database still in use today, primarily on older mainframe
computers.
The concept for the hierarchical database model was originally based on a bill of
materials (BOM).
Data is represented as a series of parent/child relationships.
This concept is fundamentally different from that used in the relational model, where
data resides in a collection of tables that have no hierarchy and are physically
independent of each other.
In the hierarchical model, a database 'record' is a tree that consists of one or more
groupings of fields called 'segments'.
Segments make up the individual 'nodes' of the tree (e.g., the 'customers' record
may consist of 'customer' and 'order' segments).
The model requires that each child segment can be linked to only one parent and
a child can only be reached through its parent.
The requirement for a one-to-many relationship between parent and child can
result in redundant data (e.g., 'orders' might be a child of 'customers' as well as a child
of 'parts').
To get around the data redundancy problem, data is stored in one place and
referenced by links or physical pointers in other places (e.g., the 'customers' record
contains actual data in the 'orders' segment while the 'parts' record contains a pointer
to the 'orders' data in 'customers').
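A minimal sketch of these rules in Python, assuming a hypothetical Segment class (it is not IMS or DL/I syntax): each child has exactly one parent, a child is reached by walking down from its parent, and a second hierarchy holds a pointer to the data rather than a copy.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Segment:
    """One node of a hierarchical record: a named group of fields."""
    name: str
    fields: dict
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)
    children: list = field(default_factory=list, repr=False, compare=False)

    def add_child(self, child):
        # Each child segment may be linked to only one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

def find(root, path):
    """Reach a segment by walking down from the root, one level at a time."""
    node = root
    for name in path:
        node = next(c for c in node.children if c.name == name)
    return node

customer = Segment("customer", {"name": "Acme"})
order = customer.add_child(Segment("order", {"order_no": 1001}))

# The 'parts' hierarchy cannot adopt the same child, so it stores a pointer
# (here simply a Python reference) to the data held under 'customer'.
part = Segment("part", {"description": "Widget", "order_ref": order})

print(find(customer, ["order"]).fields["order_no"])
print(part.fields["order_ref"] is order)
```

Calling part.add_child(order) raises ValueError in this sketch, which mirrors the single-parent restriction that forces the pointer workaround.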
The Indexed Sequential Access Method (ISAM) is a disk storage and access
method, not a database model.
There is much confusion surrounding the term ISAM and in practice it is used to
refer to desktop and/or file-based databases like Microsoft's FoxPro and Jet-based
Access, Clipper, dBase, Paradox, and Btrieve.
It also is used to refer to navigational database applications that rely on a
procedural approach to data access and retrieval.
Possibly the only 'true' ISAM-based products are IBM's Information Management
System (IMS) and Btrieve.
Under ISAM, records are located using a key value. A smaller index file stores the
keys along with pointers to the records in the larger data file. The index file is first
searched for the key and then the associated pointer is used to locate the desired
record.
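The two-step lookup can be sketched as follows. The record layout, the sample data, and the lookup function are assumptions made for illustration; real ISAM implementations keep the index and the data in files on disk, while this sketch uses in-memory byte strings.

```python
import bisect

# The "data file": fixed-width records stored in arrival order (hypothetical data).
RECORD_SIZE = 16
records = [(1003, "Bolt Bros"), (1001, "Acme"), (1002, "Crane Ltd")]
data_file = b"".join(
    f"{key:04d}{name:<12}".encode("ascii") for key, name in records
)

# The "index file": sorted (key, byte offset) pairs, much smaller than the data.
index = sorted((key, pos * RECORD_SIZE) for pos, (key, _) in enumerate(records))
index_keys = [key for key, _ in index]

def lookup(key):
    """Search the index for the key, then follow the pointer into the data."""
    i = bisect.bisect_left(index_keys, key)
    if i == len(index_keys) or index_keys[i] != key:
        return None
    offset = index[i][1]
    raw = data_file[offset:offset + RECORD_SIZE].decode("ascii")
    return int(raw[:4]), raw[4:].rstrip()

print(lookup(1002))
print(lookup(9999))
```

Only the small sorted index is searched; the data file is touched once, at the offset the index supplies.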
ISAM is more concerned with the access and storage of data; it does not represent
a full database management system (DBMS).
Data access and storage methods are discussed below.
The theory behind the relational database model is discussed in the section entitled
'Characteristics of a Relational Database'; this section focuses on the distinctions
between the relational model and other database models.
In a relational database, the logical design is independent of the physical design.
Queries against a relational database management system (RDBMS) are based
on logical relationships and processing those queries does not require pre-defined
access paths among the data (i.e., pointers).
The relational database provides flexibility that allows changes to the database
structure to be easily accommodated.
Because the data reside in tables, the structure of the database can be changed
without having to change any applications that were based on that structure.
EXAMPLE: You add a new field for e-mail address in the customers table. If you
are using a non-relational database, you probably have to modify the application that
will access this information by including 'pointers' to the new data. With a relational
database, the information is immediately accessible because it is automatically related
to the other data by virtue of its position in the table. All that is required to access the
new e-mail field is to add it to a SELECT list.
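The e-mail example can be tried with Python's built-in sqlite3 module. The table name, column names, and sample values below are hypothetical; SQLite simply stands in for any RDBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Acme')")

# A query written before the schema change.
OLD_QUERY = "SELECT customer_id, name FROM customers"
before = conn.execute(OLD_QUERY).fetchall()

# Add the new field; no access paths or pointers have to be rebuilt.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
conn.execute("UPDATE customers SET email = 'orders@acme.example' WHERE name = 'Acme'")

# The old query still runs unchanged, and the new field is reachable
# simply by naming it in the SELECT list.
after = conn.execute(OLD_QUERY).fetchall()
with_email = conn.execute("SELECT name, email FROM customers").fetchall()

print(before == after)
print(with_email)
```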
The structural flexibility of a relational database allows combinations of data to be
retrieved that were never anticipated at the time the database was initially designed.
In contrast, the database structure in older database models is "hard-coded" into
the application; if you add new fields to a non-relational database, any application
that accesses the database will have to be updated.
In practice, there is significant confusion as to what constitutes a relational
database management system (RDBMS).
Dr. E.F. Codd provided 12 rules that define the basic characteristics of a relational
database, but implementation of these rules varies from vendor to vendor.
No RDBMS product is fully compliant with all 12 rules, but some are more so
than others.
When distinguishing DBMS products, there are typically three key areas on which
to focus: 1) query formulation and data storage/access; 2) application integration; and
3) processing architecture.
Object-Oriented Database
The most significant limitation of the relational model is its limited ability to
deal with BLOBs.
Binary Large OBjects or BLOBs are complex data types such as images,
spreadsheets, documents, CAD, e-mail messages, and directory structures.
At its most basic level, 'data' is a sequence of bits ('1s' and '0s') residing in some
sort of storage structure.
'Traditional' databases are designed to support small bit streams representing
values expressed as numeric or small character strings.
Bit stream data is atomic; it cannot be broken down into smaller pieces.
BLOBs are large and non-atomic data; they have parts and subparts and are not
easily represented in a relational database.
There is no specific mechanism in the relational model to allow for the retrieval of
parts of a BLOB.
A relational database can store BLOBs but they are stored outside the database
and referenced by pointers.
The pointers allow the relational database to be searched for BLOBs, but the
BLOB itself must be manipulated by conventional file I/O methods.
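The pointer arrangement described above might look like the following sketch. The table, the file name, and the byte content are all hypothetical; the point is only that the database is searched for the pointer and the BLOB is then read with ordinary file I/O.

```python
import os
import sqlite3
import tempfile

# Write a stand-in "BLOB" (a few bytes of made-up image data) to a file.
blob_dir = tempfile.mkdtemp()
blob_path = os.path.join(blob_dir, "logo.bin")
with open(blob_path, "wb") as f:
    f.write(b"\x89PNG-not-a-real-image")

# The table holds searchable descriptive columns plus a pointer (the path).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (doc_id INTEGER PRIMARY KEY, title TEXT, path TEXT)"
)
conn.execute(
    "INSERT INTO documents (title, path) VALUES (?, ?)", ("Company logo", blob_path)
)

# Step 1: search the database to locate the BLOB.
(path,) = conn.execute(
    "SELECT path FROM documents WHERE title LIKE '%logo%'"
).fetchone()

# Step 2: manipulate the BLOB itself with conventional file I/O.
with open(path, "rb") as f:
    header = f.read(4)

print(header)
```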
Object-oriented databases provide native support for BLOBs.
Unfortunately, there is no clear model or framework for the object-oriented
database like the one Codd provided for the relational database.
Under the general concept of an object-oriented database, everything is treated as
an object that can be manipulated.
Objects inherit characteristics of their class and have a set of behaviors (methods)
and properties that can be manipulated.
The hierarchical notion of classes and subclasses in the object-oriented
database model replaces the relational concept of atomic data types.
The object-oriented approach provides a natural way to represent the hierarchies
that occur in complex data.
EXAMPLE: A Word document object consists of paragraph objects and has a
method to 'draw' itself.
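The document example can be sketched with ordinary Python classes. The class names (DocumentObject, Paragraph, Document) are hypothetical and this is not the API of any object-oriented database product; it only illustrates inheritance, composition, and a 'draw' method.

```python
class DocumentObject:
    """Base class: every object knows how to draw itself."""

    def draw(self):
        raise NotImplementedError


class Paragraph(DocumentObject):
    def __init__(self, text):
        self.text = text

    def draw(self):
        return self.text


class Document(DocumentObject):
    """A document is composed of paragraph objects (a natural hierarchy)."""

    def __init__(self, title):
        self.title = title
        self.paragraphs = []

    def add(self, paragraph):
        self.paragraphs.append(paragraph)

    def draw(self):
        # The document draws itself by asking each part to draw itself.
        body = "\n".join(p.draw() for p in self.paragraphs)
        return f"== {self.title} ==\n{body}"


doc = Document("Database Models")
doc.add(Paragraph("Flat files hold a single record structure."))
doc.add(Paragraph("Relational tables are logically related."))
print(doc.draw())
```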
There are a limited number of commercial object-oriented database systems
available, mostly for engineering or CAD applications.
In a way, the object-oriented concept represents a 'Back to the Future' approach in
that it is very similar to the old hierarchical database design.
Relational databases are not obsolete and may simply evolve by adding additional
support for BLOBs.
Overview
This section discusses the following topics with respect to a relational database:
commercial products, basic concepts, and Codd's rules.
Although IBM did most of the research, Oracle delivered the first commercial
relational database in 1979.
IBM delivered its first product, 'SQL/Data System', in 1982.
Microsoft originally worked in partnership with Sybase to deliver SQL Server 4.2
in 1992.
In 1993 the partnership broke up with Sybase going after the Unix market and
Microsoft pursuing the Windows NT market.
Microsoft SQL Server 6.0 was released in 1995; version 6.5 shipped in 1996.
Microsoft SQL Server 7.0 shipped in November of 1998.
A relational database is more than just data organized into related tables.
The relational database model is based firmly in the mathematical theory of
relational algebra and calculus.
The original concept for the model was proposed by Dr. E.F. Codd in a 1970
paper entitled 'A Relational Model of Data for Large Shared Data Banks'.
Later Dr. Codd clarified his model by defining twelve rules (Codd's Rules) that a
database management system (DBMS) must meet in order to be considered a
relational database.
In practice, many database products are considered 'relational' even if they do not
strictly adhere to all 12 rules.
A summary of Dr. Codd's 12 rules is presented below:
A set of related tables forms a database and all data is represented as tables; the
data can be viewed in no other way.
A table (a.k.a. relation or entity) is a logical grouping of related data in tabular
form (rows and columns).
Each row (a.k.a. record or tuple) describes an item (person, place or thing) and
each row contains information about a single item in the table.
Each column (a.k.a. field or attribute) describes a single characteristic about an
item.
Each value (datum) is defined by the intersection of a row and column.
Data is atomic; there is no more than one value associated with the intersection of
a row and column.
There is no hierarchical ranking of tables.
The relationships among tables are logical; there are no physical relationships
among tables.
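A short sqlite3 sketch of a purely logical relationship follows; the table names, column names, and rows are hypothetical. The two tables are connected only by matching values in a shared column, and the query states that relationship at the time it is run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY, customer_id INTEGER, part TEXT);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Bolt Bros');
    INSERT INTO orders VALUES (1001, 1, 'Widget'), (1002, 2, 'Gasket'), (1003, 1, 'Flange');
""")

# The tables are related only by matching customer_id values;
# the query names the relationship, and no stored pointer is followed.
rows = conn.execute("""
    SELECT c.name, o.part
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_no
""").fetchall()

print(rows)
```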
There must be a single language that handles all communication with the database
management system.
The language must support relational operations with respect to data
manipulation (i.e., SELECT, INSERT, UPDATE, DELETE), data definition (i.e.,
CREATE, ALTER, DROP), and administration (i.e., GRANT, REVOKE, DENY,
BACKUP, RESTORE).
Structured Query Language (SQL) is the de facto standard for a relational
database language.
SQL is a nonprocedural or declarative language; it allows users to express
what they want from the RDBMS without specifying the details about where it's
located or how to get it.
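The difference between declarative and procedural access can be illustrated with a small, hypothetical parts table in sqlite3. The first query states what is wanted; the loop that follows spells out how to obtain the same result step by step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part TEXT, price REAL)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [("Widget", 2.5), ("Gasket", 0.75), ("Flange", 4.0)],
)

# Declarative: state WHAT is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT part FROM parts WHERE price > 1 ORDER BY part"
).fetchall()

# Procedural equivalent: the program spells out HOW, step by step.
procedural = []
for part, price in conn.execute("SELECT part, price FROM parts"):
    if price > 1:
        procedural.append((part,))
procedural.sort()

print(declarative)
print(declarative == procedural)
```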
The SQL language is discussed in more detail here.
A relational database must not be limited to source tables when presenting data to
the user.
Views are virtual tables or abstractions of the source tables.
A view is an alternative way of looking at data from one or more tables.
A view definition does not duplicate data; a view is not a copy of the data in the
source tables.
Once created, a view can be manipulated in the same way as a source table.
If you change data in a view, you are changing the underlying data in the source
table (although there are limits on how data can be modified from a view).
Views allow the creation of custom tables that are tailored to special needs.
EXAMPLE: By not including the columns with sensitive information in a view
definition, a view can be used to restrict a user's access to the data.
Views can be used to simplify data access by predefining complex joins; the
concept is similar to that of a 'saved query'.
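A minimal sqlite3 sketch of a view that hides a sensitive column is given below. The employees table and its rows are hypothetical. Note that SQLite views are read-only, so this sketch only reads through the view; updating through a view depends on the DBMS in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL);
    INSERT INTO employees VALUES (1, 'Ann', 5200.0), (2, 'Raj', 4800.0);

    -- The view omits the sensitive salary column.
    CREATE VIEW employee_directory AS
        SELECT emp_id, name FROM employees;
""")

# The view is queried like a table, but it stores no data of its own.
directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY emp_id"
).fetchall()
print(directory)

# A change to the source table is visible through the view immediately.
conn.execute("UPDATE employees SET name = 'Ann Lee' WHERE emp_id = 1")
updated = conn.execute(
    "SELECT name FROM employee_directory WHERE emp_id = 1"
).fetchone()
print(updated)
```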
Rows are treated as sets for data manipulation operations (SELECT, INSERT,
UPDATE, DELETE).
A relational database must support basic relational algebra operations (selection,
projection, and join) and set operations (union, intersection, division, and difference).
Set operations and relational algebra are used to operate on 'relations' (tables) to
produce other relations.
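These operations can be sketched directly in Python, treating a relation as a list of rows. The function names (select, project, join) and the sample relations are hypothetical; division is omitted to keep the sketch short. Each operation takes relations as input and returns a relation.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy a condition."""
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        reduced = tuple((c, row[c]) for c in columns)
        if reduced not in seen:
            seen.add(reduced)
            result.append(dict(reduced))
    return result

def join(left, right, column):
    """Equi-join: pair rows whose values in the shared column match."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]

customers = [{"cust_id": 1, "name": "Acme"}, {"cust_id": 2, "name": "Bolt Bros"}]
orders = [{"order_no": 1001, "cust_id": 1}, {"order_no": 1002, "cust_id": 1}]

joined = join(customers, orders, "cust_id")
late_names = project(select(joined, lambda r: r["order_no"] > 1001), ["name"])
print(late_names)

# Set operations work on whole relations with compatible rows.
a = {(1, "Acme"), (2, "Bolt Bros")}
b = {(2, "Bolt Bros"), (3, "Crane Ltd")}
print(sorted(a | b), sorted(a & b), sorted(a - b))
```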
Logical independence means the relationships among tables can change without
impairing the function of applications and ad hoc queries.
The database schema or structure of tables and relationships (logical) can change
without having to re-create the database or the applications that use it.
There cannot be other paths into the database that subvert data integrity; in other
words, you can't get in through the 'back door' and change the data in a way that
violates data integrity.
The DBMS must prevent data from being modified by machine language
intervention.
EXAMPLE ONE
MOVIE WORLD DVD CLUB
This first example is particularly useful for helping students to identify entities, attributes
and relationships. The importance of naming will also be highlighted particularly in the
context of the business rules.
Identification and naming of entities and attributes is a crucial challenge for the systems
analyst since it establishes the structure of the database.
Entity Relationship Diagramming in particular is helpful in the formation of the database
structure. The key device used to draw out the entities and attributes is a type of
narrative that we call the business rules.
BACKGROUND
Movie World DVD Club is a small business establishment engaged in the rental of
DVDs. The enterprise is located on the UWI St. Augustine Campus, upstairs in the Student
Activity Centre. Though the firm mainly handles the rental of DVDs, it also sells DVDs
and movie posters.
They have a wide range of DVDs available to their customers. There is a broad selection
of movies (over 2,000), copies of which are readily available. Some of the different
genres of movies are science fiction, comedy, drama, Indian, action, horror/thriller, etc.
DVD rentals are characterized by high volume, low-cost transactions. Because each
rental transaction affects inventory, inventory updates need to be frequent. Inventory
management is an integral part of the firm's operations. At all times it is important to
know which movies are available, need to be ordered, or are out on loan. In this
particular business, inventory management also involves managing loans of the movies
amongst the firm's customer base.
Periodic shipments and the efficient management of back orders, by producing copies of
DVDs, are critical to Movie World's business operations. Movie World requires detailed
reporting capability in order to achieve this and to manage its inventory effectively.
Presently, the operations at Movie World are performed manually. There has been no
computerization of the registration, borrowing or returning processes, and the firm does
not have a database for its operations. Records are kept via manual bookkeeping.
Hence, to begin building a database at Movie World, the records of both
members and all the different movies in stock for rental would have to be
entered into MS Access.
New movies are sourced periodically and are stored in an updated DVD list. All
inventory and transaction information are documented in different log books.
The customers consist entirely of students and staff of the UWI. The customer base now
stands at just over 600 and is relatively stable. The owner sees the need to increase this
customer base. Problems with some customers, such as bad attitudes and dishonesty,
make business harder from time to time. The club has grown substantially since
inception, as is evident from its present customer base and product mix, which it tries
to increase frequently. The owner anticipates a further 33% growth in the coming year.
2. Rentals and other transactions are recorded manually in different log books.
Solution: The database will keep track of and calculate each member's rentals, fines
due, amount paid, balance due, etc.
3. All members of the Movie Club are manually recorded in a Member Log book.
Solution: The database will contain a MEMBER table that will contain all the
members of the DVD Club with their respective MEMBER IDs.
4. Movie World faces a major growth problem that is evident from its relatively stagnant
customer base. The owner cited the need for more members. More members would
mean more revenue pouring into the firm, which could fuel further development.
Solution: The database marketing system can help the organization to achieve this
growth. It will allow the firm to precisely analyze rental trends. With this database,
marketers will be able to project rentals based on rental trends and repeat rentals by
customers (Aaker, Kumar & Day, 2001).
5. So much effort is put into performing day-to-day activities that the owner finds it
difficult to formulate and execute strategic plans for development, while continuing to
run the business single-handedly.
Solution: Divestiture and delegation of responsibility may assist in this instance. The
owner and manager could lessen the amount of control they exercise in day-to-day
tasks that they may be involved in at present. Instead, once they have sufficient trust
in the front-desk employee, they can leave the more routine duties to this worker and
thus be able to focus their attention and energies on more strategic areas of
management, e.g. properly utilizing and operationalising the marketing function of
the database we are proposing, so that the business can experience the growth it desires
and needs.
6. Resources, in terms of computer hardware and software, finances for training in
database maintenance, etc. are not available at present at Movie World.
Solution: In order to put this database model into effect, Movie World needs to
acquire the necessary computer hardware, software, peripherals and network(s). They
already have sufficient physical space for the installation of this computer equipment,
with some minor structural adjustments and renovations. However, in addition to the
acquisition of this equipment, staff needs to be trained and certified in Microsoft
applications, in particular MS Access. This is necessary for them to be able to
properly manage the database system in terms of basic operation, database backups,
access management and security.
7. The fine for damaged and lost DVDs is somewhat high.
Solution: Movie World should consider lowering these fines, especially if the DVD
is not an original. Thus, fines for damaged/lost originals should be higher than for
copies.
Business Rules
1. When a new member joins the Movie World DVD Club, personal data is entered in
the Members Table.
This prompts the naming of an entity MEMBER.
2. Each member is issued a membership ID card bearing the Member ID
number. This establishes the entity MEMBER. MEMBER ID is
the Primary Key (PK) for the Member Registration.
Note: The hardcopy of the Member ID is a printed form from
the database. A hardcopy of a Member's Contract will also be
printed from a form in the database.
(Margin comment on the note above: this is more appropriate for a Physical DFD
with specification of output.)
[Entity-relationship diagrams. Entities shown: MEMBER, INVENTORY, RENTAL, RENTAL INVOICE,
DAMAGE/FINE, INVOICE LINE, SUPPLIER, GENRE, DVD INVENTORY, ORDER. Relationship labels
shown: browses, selects, provides, ships.]
Relational Schema
MEMBER (Member-ID, UWI ID, First Name, Last Name, Address, Phone, Status)
RENTAL (Rental ID, Member ID, DVD ID, Date Out, Date In, Amount Due, Condition upon
return, Damage Fines, Amount Paid, Balance Due)
SUPPLIER (Order ID, DVD Title, Order Quantity, Company Name, Contact, Address, Email,
Phone, Fax, Delivery Method, Order Cost, Order Date, Expected Delivery Date,
Delivery Status)
INVENTORY (DVD_ID, DVD_Title, Category, Main Actor, Quantity in stock, Quantity on
order, Order ID)
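As a rough illustration, the MEMBER and RENTAL tables could be declared as follows using Python's sqlite3 module in place of MS Access. The column types, the lower-case column names, the sample rows, and the choice to derive Balance Due as amount due plus damage fines minus amount paid are all assumptions made for this sketch; they are not specified in the case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (
        member_id  INTEGER PRIMARY KEY,
        uwi_id     TEXT,
        first_name TEXT,
        last_name  TEXT,
        address    TEXT,
        phone      TEXT,
        status     TEXT
    );
    CREATE TABLE rental (
        rental_id           INTEGER PRIMARY KEY,
        member_id           INTEGER REFERENCES member(member_id),
        dvd_id              INTEGER,
        date_out            TEXT,
        date_in             TEXT,
        amount_due          REAL,
        condition_on_return TEXT,
        damage_fines        REAL DEFAULT 0,
        amount_paid         REAL DEFAULT 0
    );
""")
conn.execute(
    "INSERT INTO member (member_id, first_name, last_name) VALUES (1, 'Ann', 'Lee')"
)
conn.execute(
    "INSERT INTO rental (member_id, dvd_id, amount_due, damage_fines, amount_paid) "
    "VALUES (1, 42, 10.0, 5.0, 8.0)"
)

# Assumption: Balance Due is derived from the other amounts rather than stored.
balance = conn.execute("""
    SELECT m.last_name, SUM(r.amount_due + r.damage_fines - r.amount_paid)
    FROM member AS m
    JOIN rental AS r ON r.member_id = m.member_id
    GROUP BY m.member_id
""").fetchone()
print(balance)
```

Deriving the balance in a query avoids storing a value that could fall out of step with the amounts it depends on, which is the data integrity concern raised for flat files earlier in the chapter.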
Relational Schema
MEMBER (Member-ID, UWI ID, First Name, Last Name, Address, Phone, Status)
RENTAL INVOICE (Invoice ID, Member ID, Invoice Line ID, Amount Due, Condition upon
return, Damage Fines, Amount Paid, Balance Due, Invoice Total)
DAMAGE/FINE (Dam./Fine ID, Fine Amount, Amount Paid, Balance)
INVOICE LINE (Invoice Line ID, DVD ID, Price, Date Out)
ORDER (Order ID, DVD Title, DVD ID, Order Quantity, Company Name, Contact, Address,
Email, Phone, Fax, Delivery Method, Order Cost, Order Date, Expected Delivery Date,
Delivery Status)
ORDER LINE (Order Line ID, DVD ID, Price)
INVENTORY OF DVDs (DVD_ID, DVD_Title, Category (Genre), Main Actor, Quantity in stock,
Quantity on order, Order ID)