Fundamentals of Database System: Name: - Section
DATABASE SYSTEM
MODULE 2
Name: _________________________________
In this module, we look at an outline of the stages involved in the development of a database.
We consider the broader issue of how to decide what should be in a database and how to structure
the tables that should be included. Our aim is to give you a basic development method so that you
can see how a basic database system is developed. We don't argue that this specific method is the
most applicable to any given situation – however, we do consider that this method is straightforward
and will allow you to contextualise or, by comparison, consider a range of database development
techniques.
Before we consider the development method in more detail let's discuss why we need to take a
formal approach to database development. After all, it is quite simple to use structured query
language (SQL) CREATE TABLE statements to define tables, or to use the facilities of a database
tool to define them for you. Once developed, the tables can be manipulated and displayed in many
different ways, again using SQL statements, a database tool or an application development tool.
However, uncontrolled ad hoc creation of tables by end users leads to an unmanageable and
unusable database environment, and can result in the inclusion of multiple copies of potentially
inconsistent data. In effect, this can create islands of data within which the end users cannot find the
data that they require.
SQL is a special kind of computer language used for relational databases. The initials originally
stood for 'structured query language'. Although this phrase is no longer used, the initials SQL still
are. SQL is an essential part of the practical understanding of relational databases, but we are only
concerned that you appreciate its role in defining and accessing a database.
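As a small illustration of these two roles of SQL, defining and accessing a database, the sketch below
uses Python's built-in sqlite3 module with an in-memory database. The table name member and its
columns are hypothetical, chosen only for this example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Defining: a CREATE TABLE statement describes the structure of a table.
conn.execute(
    "CREATE TABLE member ("
    " member_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " department TEXT NOT NULL)"
)

# Accessing: INSERT adds records, SELECT retrieves them.
conn.executemany(
    "INSERT INTO member (member_id, name, department) VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "Accounts")],
)
rows = conn.execute(
    "SELECT name FROM member WHERE department = ? ORDER BY name", ("Sales",)
).fetchall()
print(rows)  # [('Ana',)]
```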
OBJECTIVES:
Describe the key points of the waterfall model applied to database development.
Appreciate the roles of various development artefacts, such as the data requirements
document, conceptual data model and such like used to communicate between activities in the
database development life cycle.
Communicate effectively about aspects of the development of databases.
Directions: Read and understand the following questions carefully. Encircle the letter of the
correct answer.
1. What is a database?
a. A piece of software to do calculations
b. A structured collection of information
c. A device to load information onto the computer
d. A calculator
2. Which of the following describes a relational database?
a. Access
b. There are two or more tables that are linked using the primary keys
c. A database that has a table
d. There is more than one table, and the tables have duplicate data
3. A department needs to store information about each member. The individual information about
each member is called a:
a. Field
b. Table
c. File
d. Record
4. One of the employees has decided to resign. What would the secretary do with the record for
that member on the database?
a. Amend the record
b. Delete the record
c. Delete the field
d. Delete the file
5. The bank has recruited a new member. How will they add their details to the database?
a. Add another field
b. Add a new record
c. Amend a record
Lesson 2
Database Development Life Cycle
A database is one component of most information systems, and developing a database system involves
several steps. Here more emphasis is given to the design phases of the database system development
life cycle. The major steps in database system development are:
1. Planning: identifying an information gap in an organization and proposing a database solution to
solve the problem.
2. Analysis: concentrates on fact finding about the problem or the opportunity. Feasibility
analysis, requirement determination and structuring, and selection of the best design method are also
performed in this phase.
3. Design: in database system development more emphasis is given to this phase. The phase is
further divided into three sub-phases.
A. Conceptual Design: a concise description of the data, the data types, the relationships between
data and the constraints on the data.
There is no implementation or physical detail consideration.
Used to elicit and structure all information requirements.
B. Logical Design: the conceptual design expressed in a specific database model selected to
implement the data structure.
It is independent of any particular DBMS and involves no other physical
considerations.
C. Physical Design: the physical implementation of the higher-level design of the database with
respect to the internal storage and file structure of the database for the selected DBMS.
Used to develop all technology and organizational specifications.
4. Implementation: the testing and deployment of the designed database for use.
5. Operation and Support: administering and maintaining the operation of the database system and
providing support to users.
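The three design sub-phases can be sketched in miniature as follows. This is only an illustration
under stated assumptions: the Department and Employee entities, their attributes and the index name
are hypothetical, and SQLite (through Python's sqlite3 module) stands in for the selected DBMS.

```python
import sqlite3

# Conceptual design: entities, attributes and relationships, with no
# implementation detail. (Hypothetical example entities.)
conceptual = {
    "entities": {
        "Department": ["dept_no", "dept_name"],
        "Employee": ["emp_no", "emp_name", "salary"],
    },
    "relationships": [("Employee", "works in", "Department")],
}

# Logical design: the same information expressed in a chosen database model
# (here the relational model), still without storage details.
logical = [
    "CREATE TABLE department ("
    " dept_no TEXT PRIMARY KEY, dept_name TEXT NOT NULL)",
    "CREATE TABLE employee ("
    " emp_no TEXT PRIMARY KEY, emp_name TEXT NOT NULL, salary INTEGER,"
    " dept_no TEXT NOT NULL REFERENCES department(dept_no))",
]

# Physical design: storage and access choices for the selected DBMS
# (here, an index to speed up look-ups by department).
physical = ["CREATE INDEX employee_dept_idx ON employee(dept_no)"]

conn = sqlite3.connect(":memory:")
for statement in logical + physical:
    conn.execute(statement)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
indexes = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master"
    " WHERE type = 'index' AND name NOT LIKE 'sqlite_%'")]
print(tables)   # ['department', 'employee']
print(indexes)  # ['employee_dept_idx']
```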
Basic Concepts
There may be several types of architectures of database systems. However, the American National
Standards Institute/Standards Planning and Requirements Committee (ANSI/SPARC) architecture is
applicable to most modern database systems. It consists of three levels: the external level, the
conceptual level and the internal level.
All users should be able to access the same data. This is important because a database holds
shared data: all the data is stored in one location, and each user has their
own customized way of interacting with the data.
A user’s view is unaffected by, or immune to, changes made in other views. Since the requirements
of one user are independent of those of the others, a change made in one user’s view should not
affect other users.
Users should not need to know physical database storage details. As there are naïve users of
the system, hardware level or physical details should be a black-box for such users.
The DBA should be able to change database storage structures without affecting the users’ views.
A change in file organization or access method should not affect the structure of the data, which
in turn means it has no effect on the users.
The DBA should be able to change the conceptual structure of the database without affecting all
users. In any database system, the DBA has the privilege to change the structure of the database,
such as adding tables, adding or deleting attributes, or changing the specification of the objects in
the database.
All the above and many other functionalities are possible due to the three level ANSI-SPARC
Database System Architectures.
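The last requirement in the list can be sketched with an SQL view, using Python's sqlite3 module.
This is a minimal illustration only: the employee table, the payroll_view view and the added
address column are hypothetical names, not part of the architecture itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES ('E00001', 'Ana', 30000)")

# A user's view: only the columns this user cares about.
conn.execute("CREATE VIEW payroll_view AS SELECT emp_no, salary FROM employee")


def read_payroll():
    """What the user's application sees through its view."""
    return conn.execute("SELECT * FROM payroll_view ORDER BY emp_no").fetchall()


before = read_payroll()

# The DBA changes the conceptual structure by adding an attribute.
conn.execute("ALTER TABLE employee ADD COLUMN address TEXT")

after = read_payroll()
print(before == after)  # True: the user's view is unaffected
```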
Three-level ANSI-SPARC Architecture of a Database System
The database system architecture consists of three levels: the external level, the conceptual level
and the internal level.
External Level:
o The external level is the one closest to the users, i.e., it is the one concerned with the way the
data is viewed by individual users. An external view is the content of the database as seen by
some particular user (i.e., to that user, the view they work with is the
database).
o Each external view is defined by means of an external schema, which consists basically of
definitions of each of the various external record types in that external view. The external
schema is written using the external DDL portion of the user’s data sublanguage.
o External level is users' view of the database. Describes that part of database that is relevant to
a particular user. Different users have their own customized view of the database independent
of other users.
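In a relational DBMS, external views are often realised as SQL views. The sketch below, using
Python's sqlite3 module, shows two users with different customized views of the same stored data.
The view names payroll_user and planning_user and the column aliases are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number TEXT PRIMARY KEY,"
    " department_number TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES ('E00001', 'D001', 30000)")

# External schema for a payroll user: no department information.
conn.execute(
    "CREATE VIEW payroll_user AS"
    " SELECT employee_number AS empno, salary FROM employee"
)
# External schema for a planning user: no salary information.
conn.execute(
    "CREATE VIEW planning_user AS"
    " SELECT employee_number AS empno, department_number AS deptno FROM employee"
)

payroll_rows = conn.execute("SELECT * FROM payroll_user").fetchall()
planning_rows = conn.execute("SELECT * FROM planning_user").fetchall()
print(payroll_rows)   # [('E00001', 30000)]
print(planning_rows)  # [('E00001', 'D001')]
```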
Conceptual Level:
o The conceptual level is found in between the other two. It is a representation of the entire
information content of the database, including how the data items relate to one another, and the
security and integrity rules.
o It is the view of the data as it really is, in its entirety, rather than as users are forced to see it by
the constraints of (for example) the particular language or hardware they might be using.
o The conceptual view is defined by means of the conceptual schema, which is written using
another DDL, the conceptual DDL of the data sublanguage in use. If data independence is to be
achieved, then the conceptual DDL definitions must not involve any considerations of storage
structure or access technique. Thus there must be no reference in the conceptual schema to stored
field representations, stored record sequence, indexing, hash addressing, pointers or any other
storage and access details.
o The conceptual schema includes a great many additional features, such as the security and
integrity rules.
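As a rough illustration of integrity rules living in the schema rather than in each application, the
sketch below declares them in SQL through Python's sqlite3 module. Note that the declarations say
nothing about storage or access paths. The specific rules (non-negative salary, a department that
must exist) are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE department (department_number TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number TEXT PRIMARY KEY,"
    " department_number TEXT NOT NULL REFERENCES department(department_number),"
    " salary INTEGER NOT NULL CHECK (salary >= 0))"
)
conn.execute("INSERT INTO department VALUES ('D001')")
conn.execute("INSERT INTO employee VALUES ('E00001', 'D001', 30000)")


def try_insert(row):
    """Return True if the row satisfies the integrity rules, else False."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


negative_salary_ok = try_insert(("E00002", "D001", -5))
unknown_department_ok = try_insert(("E00003", "D999", 1000))
valid_ok = try_insert(("E00004", "D001", 1000))
print(negative_salary_ok, unknown_department_ok, valid_ok)  # False False True
```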
Internal Level:
o Is the one closest to the physical storage, i.e., it is concerned with the way the data is
physically stored.
o Is a low-level representation of the entire database.
o The internal view is described by means of the internal schema, which not only defines the
various stored record types but also specifies what indexes exist, how stored fields are
represented, what physical sequence the stored records are in, and so on. The internal
schema is written using yet another DDL, the internal DDL.
o There will be many distinct external views, each consisting of a more or less abstract
representation of some portion of the total database, and there will be precisely one
conceptual view, consisting of a similarly abstract representation of the database in its entirety.
Note that most users will not be interested in the total database, but only in some restricted
portion of it. Likewise, there will be precisely one internal view, representing the total database
as physically stored. The following example will clarify the levels to some extent.
o At the conceptual level, the database contains information concerning an entity type called
employee. Each individual employee occurrence has an employee_number (six characters), a
department_number (four characters), and a salary (five decimal digits).
o At the internal level, employees are represented by a stored record type called stored_emp,
twenty bytes long. Stored_emp contains four stored fields: a six-byte prefix (presumably
containing control information such as flags or pointers), and three data fields corresponding to
the three properties of employees. In addition, stored_emp records are indexed on the emp#
field by an index called empx, whose definition is not shown.
The Pascal user has an external view of the database in which each employee is represented by a
Pascal record containing two fields (department numbers are of no interest to this user and have
therefore been omitted from the view). The record type is defined according to the syntax and
declaration rules in Pascal.
Similarly, the COBOL user has an external view in which each employee is represented by a COBOL
record containing two fields (this time salary is not needed by this user and is omitted). The record
type is defined according to COBOL rules.
Notice that corresponding objects can have different names at each level. The employee
number is referred to as empno in the Pascal view, as emp# in the internal view and as
employee_number in the conceptual view. In general, defining the correspondence between the
conceptual view and the internal view, and between the conceptual view and each external view,
requires an operation called mapping. The mappings are important because, for example, fields can
have different data types, field and record names can be changed, several conceptual fields can be
combined into a single external field, and so on.
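The employee example and its mappings can be sketched in Python. This is an illustration under
assumptions, not the textbook definition: the six-byte prefix and the twenty-byte total come from the
description above, while storing the salary as a four-byte integer, the deptno field name and the
function names are choices made only for this sketch.

```python
import struct
from dataclasses import dataclass


@dataclass
class Employee:
    """Conceptual record: the three properties named in the text."""
    employee_number: str    # six characters
    department_number: str  # four characters
    salary: int             # five decimal digits


# Internal level: a 20-byte stored record. The 6-byte prefix and the 20-byte
# total come from the text; the 4-byte integer salary is an assumption.
STORED_EMP = struct.Struct("<6s6s4si")


def conceptual_to_internal(emp):
    """Conceptual/internal mapping: build the stored record."""
    prefix = b"\x00" * 6  # placeholder for control information
    return STORED_EMP.pack(
        prefix,
        emp.employee_number.encode("ascii"),
        emp.department_number.encode("ascii"),
        emp.salary,
    )


def internal_to_conceptual(stored):
    """Conceptual/internal mapping in the other direction."""
    _prefix, emp_no, dept_no, salary = STORED_EMP.unpack(stored)
    return Employee(emp_no.decode("ascii"), dept_no.decode("ascii"), salary)


def external_payroll_view(emp):
    """External view without the department number (like the Pascal user's)."""
    return {"empno": emp.employee_number, "salary": emp.salary}


def external_department_view(emp):
    """External view without the salary (like the COBOL user's)."""
    return {"empno": emp.employee_number, "deptno": emp.department_number}


emp = Employee("E00001", "D001", 30000)
stored = conceptual_to_internal(emp)
print(len(stored))  # 20
print(external_payroll_view(internal_to_conceptual(stored)))
# {'empno': 'E00001', 'salary': 30000}
```

Each function plays the part of one mapping: a change to the stored layout would be absorbed by the
two conceptual/internal functions, leaving the external views as they are.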
Internal level is the physical representation of the database on the computer. Describes how the
data is stored in the database.
The following example can be taken as an illustration for the difference between the three
levels in the ANSI-SPARC database system Architecture. Where:
o The first level is concerned about the group of users and their respective data requirement
independent of the other.
o The second level is describing the whole content of the database where one piece of
information will be represented once.
o The third level
Differences between Three Levels of ANSI-SPARC Architecture
Internal schema: at the internal level to describe physical storage structures and access
paths.
Typically uses a physical data model.
Conceptual schema: at the conceptual level to describe the structure and constraints for the
whole database for a community of users. Uses a conceptual or an implementation data
model.
External schema: at the external level to describe the various user views. Usually uses the
same data model as the conceptual level.
Data Independence
Data independence is defined as the immunity of applications to changes in storage structure and
access technique: these can be changed without modifying the main application.
In older systems, the way in which the data is organized in secondary storage and the technique for
accessing it are both dictated by the requirements of the application under consideration; moreover,
knowledge of that data organization and that access technique is built into the application logic
and code. In such systems, it is impossible to change the storage structure (how the data is
physically stored) or the access technique (how it is accessed) without affecting the application.
The applications mentioned are simply programs designed to perform specific tasks, with all
knowledge of the data structure and the access mechanism defined within the program itself.
In database systems, it would be extremely undesirable to allow applications to be data dependent.
Major reasons are:
Different applications will need different views of the same data. Suppose we have employee data
stored with the data items employee_id, employee_name, employee_salary, employee_address, etc.
One user may need only the employee_name and employee_salary data items,
whereas another user requires only the employee_name and employee_address data items. For
data-dependent applications, meeting both needs would mean changing the main application and
creating two different copies of it, one for each user.
The Database Administrator (DBA) must have the freedom to change the storage structure or
access technique in response to changing requirements, without having to modify existing
applications. For example, new kinds of data might be added to the database, new standards might
be adopted; new types of storage devices might become available, and so on.
The ability to modify the physical schema without changing the logical schema.
Applications depend on the logical schema.
In general, the interfaces between the various levels and components should be well defined
so that changes in some parts do not seriously influence others.
The capacity to change the internal schema without having to change the conceptual schema.
Refers to the immunity of the conceptual schema to changes in the internal schema.
Internal schema changes, e.g. using different file organizations or storage structures/devices,
should not require changes to the conceptual or external schemas.
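A minimal sketch of this idea, using Python's sqlite3 module: the application's query stays the same
while the DBA adds an index, which changes only how the data is reached. The table contents are
made up for the example, and EXPLAIN QUERY PLAN is SQLite's own facility for showing the chosen
access path; its exact wording varies between SQLite versions, so the sketch only checks whether the
index name appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no TEXT, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(f"E{i:05d}", f"name{i}", 1000 + i) for i in range(100)],
)

QUERY = "SELECT name, salary FROM employee WHERE emp_no = ?"


def application_lookup(emp_no):
    """Application code: it names tables and columns, never storage details."""
    return conn.execute(QUERY, (emp_no,)).fetchall()


def access_plan():
    """Ask SQLite how it intends to run the application's query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("E00042",)).fetchall()
    return " ".join(str(row[-1]) for row in rows)


result_before, plan_before = application_lookup("E00042"), access_plan()

# The DBA changes the internal schema: a new access path is added.
conn.execute("CREATE INDEX empx ON employee(emp_no)")

result_after, plan_after = application_lookup("E00042"), access_plan()
print(result_before == result_after)  # True: the application is unaffected
print("empx" in plan_before, "empx" in plan_after)  # False True
```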
Database Model
A database model is a conceptual description of how the database works. It describes how the data
elements are stored in the database and how the data is presented to the user and programmer for
access; and the relationship between different items in the database.
A specific DBMS has its own specific Data Definition Language, but this type of language is too low
level to describe the data requirements of an organization in a way that is readily understandable by
a variety of users. We need a higher-level language. Such a higher-level language is called a
database model.
Database Model: a set of concepts to describe the structure of a database, and certain constraints
that the database should obey.
A database model is a description of the way that data is stored in a database. A database model
helps to understand the relationships between entities and to create the most effective structure to
hold data. It provides tools for describing:
Data
Data relationships
Data semantics
Data constraints
The main purpose of a database model is to represent the data in an understandable way.
Categories of database models include:
Object-based
Record-based
Physical
Record-based Data Models
Consist of a number of fixed format records. Each record type defines a fixed number of fields; each
field is typically of a fixed length. The following are examples of this database model category.
1. Hierarchical Model
In this model, the data is organized in a tree structure that originates from a root, and each
class of data resides at different levels along a particular branch of the root. The data structure
at each class level is called a node. There is always a single root node which is usually owned
by the system or DBMS. Each of the pointers in the root then points to (child) nodes, thereby
depicting a parent-child sort of relationship. Searches are done by traversing the tree up
and down with known search algorithms and modules supplied by the DBMS or may, for
special cases, be designed by the application programmer. The initial structure of the database
must be defined by the application programmer when the database is created. From this point
on, the parent-children structure can’t be changed without redesigning the whole structure.
A relationship is established by creating a physical link between stored records (each is
stored with a predefined access path to other records).
To add a new record type or relationship, the database must be redefined and then stored in
a new form.
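The tree structure and its predefined access paths can be sketched in a few lines of Python. This is
a toy in-memory model, not how a hierarchical DBMS is implemented; the Node class, the find_path
function and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One record in the hierarchy; each node has at most one parent."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, child):
        self.children.append(child)
        return child


def find_path(node, target, path=()):
    """Depth-first search from the root; returns the access path to target."""
    path = path + (node.name,)
    if node.name == target:
        return path
    for child in node.children:
        found = find_path(child, target, path)
        if found:
            return found
    return None


root = Node("Company")
sales = root.add_child(Node("Sales"))
accounts = root.add_child(Node("Accounts"))
sales.add_child(Node("Ana"))
accounts.add_child(Node("Ben"))

print(find_path(root, "Ben"))  # ('Company', 'Accounts', 'Ben')
```

Every record is reached only by following links down from the root, which is why changing the
parent-child structure means rebuilding the tree.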
2. Network Model
The network model is a conceptual description of databases where many-to-many (multiple parent-
children) relationships exist. To make this model easier to understand, the relationships
between the different data items are commonly referred to as sets, to distinguish them from the
strictly parent-child relationships defined by the hierarchical database model (HDBM).
The network model uses pointers to map the relationships between the different data items.
The flexibility of the network database model (NDBM) in showing many-to-many relationships is its
greatest strength, though the flexibility comes at a price: the interrelationships between the
different data sets become extremely complex and difficult to map.
Like the HDBM, NDBMs can be searched very quickly, especially through the use of index
pointers that lead directly to the first item in a set being searched. The NDBM suffers from the
same structural problem as the HDBM: the initial design of the database is arbitrary, and once
it is set up, any changes to the different sets require the programmer to create an entirely new
structure. The dual problems of duplicated data and inflexible structure led to the development
of a database model that minimizes both problems by making relationships between the
different data items the foundation for how the database is structured.
Allows record types to have more than one parent, unlike the hierarchical model.
A network database model sees records as set members.
Each set has an owner and one or more members.
Allows/supports many-to-many relationships between entities.
Like the hierarchical model, the network model is a collection of physically linked records.
Allows member records to have more than one owner.
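The owner/member idea can be sketched as a toy in-memory structure in Python. The class names
Record and SetOccurrence, the set names and the sample data are hypothetical, and real network
DBMSs link records with stored pointers rather than Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    name: str


@dataclass
class SetOccurrence:
    """A named set: one owner record linked to its member records."""
    set_name: str
    owner: Record
    members: list = field(default_factory=list)


def owners_of(record, sets):
    """Follow the links backwards: every owner whose set contains the record."""
    return [s.owner.name for s in sets if any(m is record for m in s.members)]


ana = Record("Ana")
ben = Record("Ben")
sales = Record("Sales department")
project_x = Record("Project X")

sets = [
    SetOccurrence("works_in", sales, [ana, ben]),
    SetOccurrence("assigned_to", project_x, [ana]),
]

print(owners_of(ana, sets))  # ['Sales department', 'Project X']
print(owners_of(ben, sets))  # ['Sales department']
```

Here the record for Ana is a member of two sets with different owners, which a strict tree could
not express without duplicating the record.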
3. Relational Model
The primary purpose behind the relational database model is the preservation of data integrity.
To be considered truly relational, a DBMS must completely prevent access to the data by any
means other than queries handled by the DBMS itself. While the relational model does not
specify how the data is stored on the disk, the preservation of data integrity implies that the
data must be stored in a format that prevents it from being accessed from outside the DBMS
that created it.
Developed by Dr. Edgar Frank Codd in 1970 (famous paper, 'A Relational Model of Data for
Large Shared Data Banks').
Terminology originates from the branch of mathematics dealing with set theory and relations.
Can define more flexible and complex relationships.
Viewed as a collection of tables called “Relations”, equivalent to a collection of record
types.
Relation: a two-dimensional table.
Data value: the value of an attribute.
Records are related by the data stored jointly in the fields of records in two tables or
files. The related tables contain information that creates the relation.
The tables appear independent but are related through the data values they share.
No physical consideration of the storage is required by the user.
Tables can be joined to produce a new virtual view of the relationship.
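A short sketch of how two tables are related purely through shared data values, using Python's
sqlite3 module. The department and employee tables and their contents are hypothetical; no pointer
or physical link between the tables is declared anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_no TEXT PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee (emp_no TEXT PRIMARY KEY, name TEXT, dept_no TEXT)"
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [("D001", "Sales"), ("D002", "Accounts")],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("E00001", "Ana", "D001"), ("E00002", "Ben", "D002")],
)

# The two tables are related only by the dept_no values they share.
joined = conn.execute(
    "SELECT e.name, d.dept_name"
    " FROM employee AS e JOIN department AS d ON e.dept_no = d.dept_no"
    " ORDER BY e.name"
).fetchall()
print(joined)  # [('Ana', 'Sales'), ('Ben', 'Accounts')]
```

The result of the join is a new table that exists only for the duration of the query.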
Over the years, relational databases have gotten better, faster, stronger, and easier to work with. But
they’ve also gotten more complex, and administering the database has long been a full-time job.
Instead of using their expertise to focus on developing innovative applications that bring value to the
business, developers have had to spend most of their time on the management activity needed to
optimize database performance.
Today, autonomous technology is building upon the strengths of the relational model, cloud
database technology, and machine learning to deliver a new type of relational database. The self-
driving database (also known as the autonomous database) maintains the power and advantages of
the relational model but uses artificial intelligence (AI), machine learning, and automation to monitor
and improve query performance and management tasks. For example, to improve query
performance, the self-driving database can hypothesize and test indexes to make queries faster, and
then push the best ones into production—all on its own. The self-driving database makes these
improvements continuously, without the need for human involvement.
Autonomous technology frees up developers from the mundane tasks of managing the database. For
instance, they no longer have to determine infrastructure requirements in advance. Instead, with a
self-driving database, they can add storage and compute resources as needed to support database
growth. With just a few steps, developers can easily create an autonomous relational database,
accelerating the time for application development.
Direction: Identify the following items. Write your answer on the space provided.
_____________________________6. This is simple to construct and operate on.