DBMS 1
A database system is a computer-based system to record and maintain information. The information
concerned can be anything of significance to the organization for whose use it is intended.
Schema
Data
The schema is the structure of the data, whereas the data are the "facts". A schema can be complex to
understand, but it really just states the rules that the data must obey.
Imagine a case where we want to store facts about employees in a company. Such facts could include
their name, address, date of birth, and salary. In a database all the information on all employees would
be held in a single storage "container", called a table. This table is a tabular object like a spreadsheet
page, with different employees as the rows, and the facts (e.g. their names) as columns... Let's call this
table EMP, and it could look something like:
From this information the schema would define that EMP has four components, "NAME", "ADDRESS",
"DOB", "SALARY". As designers we can call the columns what we like, but making them meaningful
helps. In addition to the name, we want to try and make sure that people don’t accidentally store a
name in the DOB column, or some other silly error. Protecting the database against rubbish data is one
of the most important database design steps. From what we know about the facts, we can say things
like:
Such rules can be enforced by a database. During the design phase of a database schema, these and
more complex rules are identified and, where possible, implemented. The more rules there are, the
harder it is to enter poor-quality data.
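As a rough illustration of rules being enforced by the database itself, the sketch below declares EMP in SQLite through Python's sqlite3 module. The four column names come from the text; the specific constraints (a date pattern for DOB, a non-negative SALARY), the helper try_insert, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names are from the text; the constraints are illustrative assumptions.
conn.execute("""
    CREATE TABLE EMP (
        NAME    TEXT NOT NULL,
        ADDRESS TEXT NOT NULL,
        DOB     TEXT NOT NULL
                CHECK (DOB GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'),
        SALARY  REAL NOT NULL CHECK (SALARY >= 0)
    )
""")

def try_insert(row):
    """Return True if the database accepts the row, False if a rule rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO EMP VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Made-up rows: one valid, one with a name in the DOB column, one with a negative salary.
good = try_insert(("A. Example", "1 Sample Street", "1980-01-31", 30000))
bad_dob = try_insert(("B. Example", "2 Sample Street", "B. Example", 30000))
bad_salary = try_insert(("C. Example", "3 Sample Street", "1975-06-15", -5))
```

Only the first row is stored. The other two are rejected by the CHECK rules before they can reach the table, which is the "protection against rubbish data" described above.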
User Types
Application programmer, responsible for writing programs in some high-level language such as
Java, C++, etc.
End-user, who accesses the database via a query language
Database administrator (DBA), who controls all operations on the database
Database Architecture
External: concerned with the way individual users see the data
Conceptual: can be regarded as a community user view, i.e. a formal description of the data of interest
to the organization, independent of any storage considerations.
Internal: concerned with the way in which the data is actually stored
External View
A user is anyone who needs to access some portion of the data. Users may range from application
programmers to casual users with ad hoc queries. Each user has a language at his/her disposal.
The application programmer may use a high level language while the casual user will probably use a
query language.
Regardless of the language used, it will include a data sublanguage (DSL): the subset of the
language that is concerned with storage and retrieval of information in the database. The DSL may or
may not be apparent to the user. In principle it combines two languages:
a data definition language (DDL) - provides for the definition or description of database objects
a data manipulation language (DML) - supports the manipulation or processing of database
objects.
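The split between DDL and DML can be seen in SQL, where both are embedded in one language. The sketch below is only an illustration using Python's sqlite3 module; the cut-down EMP table and its rows are assumptions, not part of the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines a database object (a cut-down, hypothetical EMP table).
conn.execute("CREATE TABLE EMP (NAME TEXT NOT NULL, SALARY REAL NOT NULL)")

# DML: manipulates the contents of that object.
conn.execute("INSERT INTO EMP VALUES (?, ?)", ("A. Example", 30000))
conn.execute("INSERT INTO EMP VALUES (?, ?)", ("B. Example", 42000))
conn.execute("UPDATE EMP SET SALARY = ? WHERE NAME = ?", (31000, "A. Example"))
conn.execute("DELETE FROM EMP WHERE NAME = ?", ("B. Example",))

rows = conn.execute("SELECT NAME, SALARY FROM EMP").fetchall()
```

Here the host language (Python) carries the data sublanguage (SQL) as strings, so the DSL is visible to the programmer; in a query tool the end-user would type the same statements directly.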
Each user sees the data in terms of an external view, defined by an external schema. This consists basically
of descriptions of each of the various types of external record in that external view, together with a definition
of the mapping between the external schema and the underlying conceptual schema.
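In SQL systems, a view is one common way to approximate an external schema. The sketch below assumes the EMP table from earlier and a hypothetical mailing application that should see only names and addresses; the view name EMP_MAILING and the sample row are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for the conceptual level: the full EMP table.
conn.execute("CREATE TABLE EMP (NAME TEXT, ADDRESS TEXT, DOB TEXT, SALARY REAL)")
conn.execute(
    "INSERT INTO EMP VALUES (?, ?, ?, ?)",
    ("A. Example", "1 Sample Street", "1980-01-31", 30000),
)

# External view for the hypothetical mailing application. The SELECT is the
# mapping from the external record type to the underlying table.
conn.execute("CREATE VIEW EMP_MAILING AS SELECT NAME, ADDRESS FROM EMP")

cursor = conn.execute("SELECT * FROM EMP_MAILING")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
```

The mailing application never sees DOB or SALARY, and it does not need to know how EMP itself is defined.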
Conceptual View
Internal View
The internal view is a low-level representation of the entire database consisting of multiple occurrences
of multiple types of internal (stored) records.
It is, however, at one remove from the physical level, since it does not deal in terms of physical records or
blocks, or with any device-specific constraints such as cylinder or track sizes. The detailed mapping to physical
storage is highly implementation-specific and is not expressed in the three-level architecture.
Mappings
DBMS
Database Administrator
The database administrator (DBA) is the person (or group of people) responsible for overall control of
the database system. The DBA's responsibilities include the following:
deciding the information content of the database, i.e. identifying the entities of interest to the
enterprise and the information to be recorded about those entities. This is defined by writing
the conceptual schema using the DDL
deciding the storage structure and access strategy, i.e. how the data is to be represented, by
writing the storage structure definition. The associated internal/conceptual mapping must also
be specified using the DDL
liaising with users, i.e. to ensure that the data they require is available and to write the
necessary external schemas and conceptual/external mapping (again using DDL)
defining authorization checks and validation procedures. Authorization checks and validation
procedures are extensions to the conceptual schema and can be specified using the DDL
defining a strategy for backup and recovery, for example periodic dumping of the database to a
backup tape and procedures for reloading the database from backup. A log file can also be kept,
where each log record contains the values of database items before and after a change and can be
used for recovery purposes
monitoring performance and responding to changes in requirements, i.e. changing details of
storage and access thereby organizing the system so as to get the performance that is “best for
the enterprise”
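The backup-and-log strategy in the list above can be sketched in a few lines. This is a toy model, not how any particular DBMS implements recovery: the database is a Python dictionary, and the item name "balance", the values, and the functions update, redo and undo are all assumptions for the example.

```python
database = {"balance": 100}   # hypothetical database item
log = []                      # each record holds the before and after values

def update(item, new_value):
    """Append a log record, then apply the change."""
    log.append({"item": item, "before": database.get(item), "after": new_value})
    database[item] = new_value

def redo(backup, records):
    """Reload a backup copy and reapply the logged after values in order."""
    restored = dict(backup)
    for record in records:
        restored[record["item"]] = record["after"]
    return restored

def undo(current, records):
    """Roll back by restoring before values, newest record first."""
    restored = dict(current)
    for record in reversed(records):
        restored[record["item"]] = record["before"]
    return restored

backup = dict(database)       # the periodic dump
update("balance", 80)
update("balance", 50)

recovered = redo(backup, log)        # backup plus log reaches the latest state
rolled_back = undo(database, log)    # before values restore the original state
```

Keeping both values in each log record is what allows recovery in either direction: forward from an old backup, or backward from a damaged current state.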
The facilities offered by a DBMS vary a great deal, depending on its level of sophistication. In general,
however, a good DBMS should provide the following advantages over a conventional system:
Independence of data and program - This is a prime advantage of a database. Both the
database and the user program can be altered independently of each other thus saving time and
money which would be required to retain consistency.
Data shareability and non-redundancy of data - The ideal situation is to enable applications to
share an integrated database containing all the data needed by the applications and thus
eliminate as much as possible the need to store data redundantly.
Integrity - With many different users sharing various portions of the database, it is impossible
for each user to be responsible for the consistency of the values in the database and for
maintaining the relationships of the user's data items to all other data items, some of which
the user may not know about or may even be prohibited from accessing.
Centralized control - With central control of the database, the DBA can ensure that standards
are followed in the representation of data.
Security - Having control over the database, the DBA can ensure that access to the database is
through proper channels and can define the access rights of any user to any data items or
defined subset of the database. The security system must prevent corruption of the existing
data, whether accidental or malicious.
Performance and Efficiency - In view of the size of databases and of demanding database
accessing requirements, good performance and efficiency are major requirements. Knowing the
overall requirements of the organization, as opposed to the requirements of any individual user,
the DBA can structure the database system to provide an overall service that is “best for the
enterprise”.
Data Independence
This is a prime advantage of a database. Both the database and the user program can be altered
independently of each other.
In a conventional system, applications are data-dependent. This means that the way in which the
data is organized in secondary storage and the way in which it is accessed are both dictated by
the requirements of the application, and, moreover, that knowledge of the data organization
and access technique is built into the application logic.
For example, if a file is stored in indexed sequential form, then an application must know:
o that the index exists
o the file sequence (as defined by the index)
The internal structure of the application will be built around this knowledge. If, for example, the file was
to be replaced by a hash-addressed file, major modifications would have to be made to the application.
Such an application is data-dependent - it is impossible to change the storage structure (how the data is
physically recorded) or the access strategy (how it is accessed) without affecting the application,
probably drastically. The portions of the application requiring alteration are those that communicate
with the file handling software - the difficulties involved are quite irrelevant to the problem the
application was written to solve.
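By contrast, an application written against a DBMS states what it wants and leaves the access strategy to the system. The sketch below uses SQLite through Python's sqlite3 module as a stand-in: SQLite's index is not the indexed sequential or hash-addressed file discussed above, and the table contents, the index name EMP_NAME_IDX, and the function salary_of are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (NAME TEXT, SALARY REAL)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?)",
    [("A. Example", 30000), ("B. Example", 42000)],
)

def salary_of(name):
    """Application logic: says what is wanted, not how to find it."""
    row = conn.execute(
        "SELECT SALARY FROM EMP WHERE NAME = ?", (name,)
    ).fetchone()
    return row[0] if row else None

before = salary_of("B. Example")

# Change the access strategy by adding an index. The application code above
# is not modified in any way.
conn.execute("CREATE INDEX EMP_NAME_IDX ON EMP (NAME)")
after = salary_of("B. Example")

# Remove the index again; the application still works.
conn.execute("DROP INDEX EMP_NAME_IDX")
after_drop = salary_of("B. Example")
```

The function returns the same answer in all three cases. Only the work done inside the DBMS changes, which is the sense in which the application is independent of the storage structure and access strategy.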
Data Redundancy
In non-database systems each application has its own private files. This can often lead to redundancy in
stored data, with resultant waste in storage space. In a database the data is integrated.
The database may be thought of as a unification of several otherwise distinct data files, with any
redundancy among those files partially or wholly eliminated.
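A small sketch of the difference, using Python dictionaries as stand-ins for files. The two applications (payroll and mailing), the employee, and the addresses are all made up for the illustration.

```python
# Non-database style: two applications each keep a private copy of the same fact.
payroll_file = {"A. Example": {"address": "1 Sample Street", "salary": 30000}}
mailing_file = {"A. Example": {"address": "1 Sample Street"}}

# Only the mailing application is told about a change of address.
mailing_file["A. Example"]["address"] = "9 New Road"
copies_agree = (
    payroll_file["A. Example"]["address"] == mailing_file["A. Example"]["address"]
)

# Integrated style: one shared record, so there is only one copy to update.
shared = {"A. Example": {"address": "1 Sample Street", "salary": 30000}}
shared["A. Example"]["address"] = "9 New Road"
payroll_sees = shared["A. Example"]["address"]
mailing_sees = shared["A. Example"]["address"]
```

With private files the address is stored twice and the copies drift apart after one update. With the integrated record, both applications read the same stored value.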
Data integration is generally regarded as an important characteristic of a database. The avoidance of
redundancy should be an aim; however, the vigor with which this aim should be pursued is open to
question.
Redundancy is
Data Integrity
This describes the problem of ensuring that the data in the database is accurate...
Inconsistencies between two entries representing the same “fact” give an example of lack of
integrity (caused by redundancy in the database).
Integrity constraints can be viewed as a set of assertions to be obeyed when updating a DB to
preserve an error-free state.
Even if redundancy is eliminated, the DB may still contain incorrect data.
The important integrity checks are checks on data items and on record types.
type checks
o e.g. ensuring a numeric field is numeric and not a character - this check should be
performed automatically by the DBMS.
redundancy checks
o direct or indirect (see data redundancy) - this check is not automatic in most cases.
range checks
o e.g. to ensure a data item value falls within a specified range of values, such as checking
dates of birth so that, say, (age > 0 AND age < 110).
comparison checks
o in this check a function of a set of data item values is compared against a function of
another set of data item values. For example, the max salary for a given set of
employees must be less than the min salary for the set of employees on a higher salary
scale.
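The type, range and comparison checks above can be written as small predicates. This is a sketch in plain Python rather than a DBMS feature; the function names and the sample values used with them are assumptions, while the age bounds and the salary-scale rule are taken from the text.

```python
def type_check(value):
    """Type check: the value must be numeric, not a character string."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def range_check(age):
    """Range check from the text: age > 0 AND age < 110."""
    return 0 < age < 110

def comparison_check(lower_scale_salaries, higher_scale_salaries):
    """Comparison check: the max salary on the lower scale must be less
    than the min salary on the higher scale."""
    return max(lower_scale_salaries) < min(higher_scale_salaries)
```

For example, range_check(35) holds while range_check(110) does not, and comparison_check([20000, 27000], [26000, 30000]) fails because one lower-scale salary exceeds the lowest higher-scale salary.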
A record type may have constraints on the total number of occurrences, or on the insertions and
deletions of records. For example, in a patient database there may be a limit on the number of x-ray
results for each patient, or the details of a patient's visit to hospital may have to be kept for a
minimum of 5 years before they can be deleted.
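The two patient-database constraints could be checked along these lines. The limit of 3 x-ray results is a hypothetical number (the text gives none), the 5-year minimum is from the text, and the function names are assumptions.

```python
from datetime import date

MAX_XRAYS_PER_PATIENT = 3   # hypothetical limit; the text gives no number
MIN_RETENTION_YEARS = 5     # minimum retention period from the text

def can_add_xray(existing_results):
    """Insertion constraint: refuse a new result once the limit is reached."""
    return len(existing_results) < MAX_XRAYS_PER_PATIENT

def can_delete_visit(visit_date, today):
    """Deletion constraint: a visit must be at least five years old."""
    try:
        earliest = visit_date.replace(year=visit_date.year + MIN_RETENTION_YEARS)
    except ValueError:
        # 29 February has no counterpart five years later; wait until 1 March.
        earliest = date(visit_date.year + MIN_RETENTION_YEARS, 3, 1)
    return today >= earliest
```

A DBMS would run checks like these as part of every insert or delete on the record type, rather than relying on each application to remember them.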
Centralized control of the database helps maintain integrity, and permits the DBA to define
validation procedures to be carried out whenever any update operation is attempted (update
covers modification, creation and deletion).
Integrity is important in a database system - an application run without validation procedures
can produce erroneous data which can then affect other applications using that data.
Database Models
Example:
(Please Provide)
The network model allows each record to have multiple parent and child records, forming a
generalized graph structure.
It allows a more natural modeling of relationships between entities.
Its schema is viewed as a graph in which object types are nodes and relationship types are arcs,
and it is not restricted to being a hierarchy or lattice.
Example:
(Please Provide)
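One possible illustration of the network model, sketched in Python: records are nodes and each relationship is an arc from a parent (owner) record to a child (member) record. The supplier, part and shipment records and the relationship names are hypothetical, chosen only to show a record with more than one parent.

```python
# Each arc is (parent record, child record, relationship type).
arcs = [
    ("Supplier S1", "Shipment 1", "supplies"),
    ("Part P1", "Shipment 1", "shipped_in"),
    ("Supplier S1", "Shipment 2", "supplies"),
    ("Part P2", "Shipment 2", "shipped_in"),
]

def parents(record):
    """All parent records of a record, across every relationship type."""
    return sorted(parent for parent, child, _ in arcs if child == record)

def children(record):
    """All child records of a record, across every relationship type."""
    return sorted(child for parent, child, _ in arcs if parent == record)

shipment_parents = parents("Shipment 1")
supplier_children = children("Supplier S1")
```

Shipment 1 has two parents, a supplier and a part, which a strict hierarchy could not represent without duplicating the shipment record.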