Database System and Architecture
The architecture of DBMS packages has evolved from the early systems, where the whole DBMS
software package was one tightly integrated system, to the modern DBMS packages that are modular in
design, with a client/server system architecture.
In a basic client/server DBMS architecture, the system functionality is distributed between two types of
modules:
1. Client module- typically designed so that it will run on a user workstation or personal computer.
Typically, application programs and user interfaces that access the database run in the client
module. Hence, the client module handles user interaction and provides the user-friendly interfaces
such as forms- or menu-based GUIs (graphical user interfaces).
2. Server module- typically handles data storage, access, search, and other functions.
DATA ABSTRACTION
Data abstraction generally refers to the suppression of details of data organization and storage, and the
highlighting of the essential features for an improved understanding of data. One of the main characteristics
of the database approach is to support data abstraction so that different users can perceive data at their
preferred level of detail.
DATA MODEL
A data model is a collection of concepts that can be used to describe the structure of a database. It
provides the necessary means to achieve this abstraction. Most data models also include a set of basic
operations for specifying retrievals and updates on the database.
1. High-level or Conceptual data models – They provide concepts that are close to the way many
users perceive data. Conceptual data models use concepts such as entities, attributes, and
relationships.
An entity represents a real-world object or concept, such as an employee or a project
from the mini world that is described in the database.
An attribute represents some property of interest that further describes an entity,
such as the employee’s name or salary.
A relationship among two or more entities represents an association among the
entities, for example, a works-on relationship between an employee and a project.
2. Low-level or Physical data models- They provide concepts that describe the details of how
data is stored on the computer storage media, typically magnetic disks. Concepts provided by
low-level data models are generally meant for computer specialists, not for end users.
3. Representational (or implementation) data models- Between these two extremes is a class
of representational (or implementation) data models, which provide concepts that may be
easily understood by end users but that are not too far removed from the way data is organized
in computer storage.
Representational data models hide many details of data storage on disk but can be
implemented on a computer system directly. Representational or implementation data models
are the models used most frequently in traditional commercial DBMSs. These include the
widely used relational data model as well as the so-called legacy data models (the network
and hierarchical models) that have been widely used in the past.
Representational data models represent data by using record structures and hence are
sometimes called record-based data models.
4. Object data model - We can regard the object data model as an example of a new family of
higher-level implementation data models that are closer to conceptual data models. A standard
for object databases called the ODMG object model has been proposed by the Object Data
Management Group (ODMG). Object data models are also frequently utilized as high-level
conceptual models, particularly in the software engineering domain.
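The entity, attribute, and relationship concepts from item 1 can be sketched in a few lines of Python. This is only an illustration: the class and field names (Employee, Project, WorksOn, hours) are assumptions chosen to match the examples in the text, not part of any data model standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Entity type: a real-world object from the mini world."""
    name: str      # attribute
    salary: float  # attribute


@dataclass(frozen=True)
class Project:
    """Entity type."""
    title: str     # attribute


@dataclass(frozen=True)
class WorksOn:
    """Relationship: associates one employee with one project."""
    employee: Employee
    project: Project
    hours: float   # hypothetical attribute attached to the relationship


alice = Employee("Alice", 5200.0)
payroll = Project("Payroll")
assignment = WorksOn(alice, payroll, hours=12.5)

print(assignment.employee.name, "works on", assignment.project.title)
```

A conceptual model stays at this level: it names what exists and how things are associated, and says nothing about files, records, or indexes.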
SCHEMAS, INSTANCES, AND DATABASE STATE
In any data model, it is important to distinguish between the description of the database
and the database itself.
1. The description of a database is called the database schema, which is specified during database
design and is not expected to change frequently.
2. Most data models have certain conventions for displaying schemas as diagrams. A displayed
schema is called a schema diagram.
3. The data in the database at a particular moment in time is called a database state or snapshot.
It is also called the current set of occurrences or instances in the database.
4. In a given database state, each schema construct has its own current set of instances.
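The difference between a schema and a database state can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the employee table and its rows are assumptions made up for the illustration. SQLite keeps schema descriptions in its sqlite_master catalog table, which stands in here for the DBMS catalog.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary REAL)")


def schema(connection):
    """Return the stored table definitions (the description of the database)."""
    rows = connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def state(connection):
    """Return the rows present right now (the current database state)."""
    return connection.execute(
        "SELECT name, salary FROM employee ORDER BY name").fetchall()


schema_before = schema(conn)
state_before = state(conn)   # the empty state, right after definition

conn.execute("INSERT INTO employee VALUES ('Alice', 5200.0)")
conn.execute("INSERT INTO employee VALUES ('Bob', 4100.0)")

schema_after = schema(conn)
state_after = state(conn)    # a new state; the schema is unchanged

print(schema_before == schema_after, state_before, state_after)
```

Every insertion, deletion, or update produces a new state, while the schema changes only when the design itself is revised.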
THREE-SCHEMA ARCHITECTURE
The goal of the three-schema architecture is to separate the user applications from the physical
database. In this architecture, schemas can be defined at the following three levels:
1. The internal level has an internal schema, which describes the physical storage structure of
the database. The internal schema uses a physical data model and describes the complete
details of data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the structure of the whole
database for a community of users. The conceptual schema hides the details of physical storage
structures and concentrates on describing entities, data types, relationships, user operations,
and constraints. Usually, a representational data model is used to describe the conceptual
schema when a database system is implemented. This implementation conceptual schema is
often based on a conceptual schema design in a high-level data model.
3. The external or view level includes a number of external schemas or user views. Each
external schema describes the part of the database that a particular user group is interested in
and hides the rest of the database from that user group. As in the previous level, each external
schema is typically implemented using a representational data model, possibly based on an
external schema design in a high-level data model.
The three-schema architecture is a convenient tool with which the user can visualize the schema levels
in a database system. Most DBMSs do not separate the three levels completely and explicitly, but
support the three-schema architecture to some extent.
Some older DBMSs may include physical-level details in the conceptual schema.
The three-level ANSI architecture has an important place in database technology development because
it clearly separates the users’ external level, the database’s conceptual level, and the internal storage
level for designing a database.
DATA INDEPENDENCE
The three-schema architecture can be used to further explain the concept of data independence, which
can be defined as the capacity to change the schema at one level of a database system without having to
change the schema at the next higher level. We can define two types of data independence:
1. Logical data independence is the capacity to change the conceptual schema without having to
change external schemas or application programs. We may change the conceptual schema to
expand the database (by adding a record type or data item), to change constraints, or to reduce
the database (by removing a record type or data item). In the last case, external schemas that
refer only to the remaining data should not be affected. Only the view definition and the
mappings need to be changed in a DBMS that supports logical data independence. After the
conceptual schema undergoes a logical reorganization, application programs that reference the
external schema constructs must work as before. Changes to constraints can be applied to the
conceptual schema without affecting the external schemas or application programs.
2. Physical data independence is the capacity to change the internal schema without having to
change the conceptual schema. Hence, the external schemas need not be changed either.
Changes to the internal schema may be needed because some physical files were reorganized—
for example, by creating additional access structures—to improve the performance of retrieval
or update. If the same data as before remains in the database, we should not have to change the
conceptual schema.
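Both kinds of data independence can be approximated in a small sqlite3 sketch. The mapping is an assumption made for the illustration: the table plays the role of the conceptual schema, the view plays an external schema, and the index is an internal-level access structure. All table, view, and index names are hypothetical, and SQLite does not separate the three levels as strictly as the architecture describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary REAL);
    INSERT INTO employee VALUES ('Alice', 5200.0), ('Bob', 4100.0);
    CREATE VIEW employee_names AS SELECT name FROM employee;
""")


def names(connection):
    """Query written against the external schema (the view) only."""
    return connection.execute(
        "SELECT name FROM employee_names ORDER BY name").fetchall()


baseline = names(conn)

# Physical change: create an additional access structure.
# Neither the table definition nor the view has to be touched.
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after_index = names(conn)

# Logical change: expand the conceptual schema with a new data item.
# The view refers only to the remaining data, so it is unaffected.
conn.execute("ALTER TABLE employee ADD COLUMN department TEXT")
after_column = names(conn)

print(baseline == after_index == after_column)
```

The query in names() is never edited, which is the point: changes at a lower level are absorbed by the mappings rather than by the applications.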
DATABASE LANGUAGE
The DBMS must provide appropriate languages and interfaces for each category of users. Once the
design of a database is completed and a DBMS is chosen to implement the database, the first step is to
specify conceptual and internal schemas for the database and any mappings between the two.
1. In many DBMSs where no strict separation of levels is maintained, one language, called the data
definition language (DDL), is used by the DBA (Database Administrator) and by database
designers to define both schemas. The DBMS will have a DDL compiler whose function is to
process DDL statements in order to identify descriptions of the schema constructs and to store
the schema description in the DBMS catalog. In DBMSs where a clear separation is maintained
between the conceptual and internal levels, the DDL is used to specify the conceptual schema
only.
2. Another language, the storage definition language (SDL), is used to specify the internal
schema. The mappings between the two schemas may be specified in either one of these
languages. In most relational DBMSs today, there is no specific language that performs the role
of SDL. Instead, the internal schema is specified by a combination of functions, parameters, and
specifications related to storage. These permit the DBA staff to control indexing choices and
mapping of data to storage.
3. For a true three-schema architecture, we would need a third language, the view definition
language (VDL), to specify user views and their mappings to the conceptual schema, but in
most DBMSs the DDL is used to define both conceptual and external schemas. In relational DBMSs,
SQL is used in the role of VDL to define user or application views as results of predefined
queries.
Once the database schemas are compiled and the database is populated with data, users must have
some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion,
and modification of the data. The DBMS provides a set of operations or a language called the data
manipulation language (DML) for these purposes.
In current DBMSs, the preceding types of languages are usually not considered distinct languages;
rather, a comprehensive integrated language is used that includes constructs for conceptual schema
definition, view definition, and data manipulation. Storage definition is typically kept separate, since it
is used for defining physical storage structures to fine-tune the performance of the database system,
which is usually done by the DBA staff.
A typical example of a comprehensive database language is the SQL relational database language, which
represents a combination of DDL, VDL, and DML, as well as statements for constraint specification,
schema evolution, and other features. The SDL was a component in early versions of SQL but has been
removed from the language to keep it at the conceptual and external levels only.
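To make the DDL, VDL, and DML roles concrete, the sketch below runs one statement of each kind through Python's sqlite3 module. The project table, the large_project view, and the budget figures are assumptions invented for the example; the comments mark which role each SQL statement plays.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL role: define part of the conceptual schema, including a constraint.
conn.execute("""
    CREATE TABLE project (
        title  TEXT PRIMARY KEY,
        budget REAL CHECK (budget >= 0)
    )
""")

# VDL role: a user view defined as the result of a predefined query.
conn.execute("""
    CREATE VIEW large_project AS
    SELECT title FROM project WHERE budget > 10000
""")

# DML role: insertion, modification, deletion, and retrieval.
conn.execute(
    "INSERT INTO project VALUES ('Payroll', 8000), ('Audit', 20000), ('Intranet', 500)")
conn.execute("UPDATE project SET budget = 15000 WHERE title = 'Payroll'")
conn.execute("DELETE FROM project WHERE title = 'Intranet'")
large = conn.execute("SELECT title FROM large_project ORDER BY title").fetchall()

# Constraint specification: a negative budget violates the CHECK constraint.
try:
    conn.execute("INSERT INTO project VALUES ('Broken', -1)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

print(large, constraint_enforced)
```

All four roles are served by the same language and the same connection, which is what "comprehensive integrated language" means in practice.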
There are two main types of DMLs:
1. A high-level or nonprocedural DML can be used on its own to specify complex database
operations concisely. Many DBMSs allow high-level DML statements either to be entered
interactively from a display monitor or terminal or to be embedded in a general-purpose
programming language. In the latter case, DML statements must be identified within the
program so that they can be extracted by a pre-compiler and processed by the DBMS. High-level
DMLs, such as SQL, can specify and retrieve many records in a single DML statement; therefore,
they are called set-at-a-time or set-oriented DMLs. A query in a high-level DML often specifies
which data to retrieve rather than how to retrieve it; therefore, such languages are also called
declarative.
2. A low-level or procedural DML must be embedded in a general-purpose programming language.
This type of DML typically retrieves individual records or objects from the database and
processes each separately, so it needs programming language constructs, such as looping, to
retrieve and process each record from a set of records. Low-level DMLs are therefore also
called record-at-a-time DMLs.
Whenever DML commands, whether high-level or low-level, are embedded in a general-purpose
programming language, that language is called the host language and the DML is called the data
sublanguage. On the other hand, a high-level DML used in a standalone interactive manner is called a
query language. In general, both retrieval and update commands of a high-level DML may be used
interactively and are hence considered part of the query language.
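The host language and data sublanguage pairing can be sketched with Python as the host and SQL as the sublanguage. Two caveats apply: Python's sqlite3 module passes SQL to a library call as a string rather than through a pre-compiler, and the second half of the sketch only imitates the one-record-at-a-time style of a low-level DML on top of SQL. The employee data is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, salary REAL);
    INSERT INTO employee VALUES
        ('Alice', 5200.0), ('Bob', 4100.0), ('Cara', 6100.0);
""")

# Set-at-a-time: one declarative statement says WHICH rows are wanted;
# the DBMS decides how to find them.
set_result = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM employee WHERE salary > ? ORDER BY name", (5000.0,))
]

# Record-at-a-time style: the host program fetches each record in a loop
# and applies the selection logic itself.
record_result = []
cursor = conn.execute("SELECT name, salary FROM employee ORDER BY name")
while True:
    row = cursor.fetchone()
    if row is None:
        break
    name, salary = row
    if salary > 5000.0:
        record_result.append(name)

print(set_result, record_result)
```

Both approaches return the same names, but only the first leaves the choice of access strategy to the DBMS.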
Casual end users typically use a high-level query language to specify their requests, whereas
programmers use the DML in its embedded form. For naive and parametric users, there usually are
user-friendly interfaces for interacting with the database; these can also be used by casual users or
others who do not want to learn the details of a high-level query language.
DBMS INTERFACES
1. Menu-Based Interfaces for Web Clients or Browsing - These interfaces present the user with
lists of options (called menus) that lead the user through the formulation of a request. Menus
do away with the need to memorize the specific commands and syntax of a query language;
rather, the query is composed step by step by picking options from a menu that is displayed by
the system. Pull-down menus are a very popular technique in Web-based user interfaces.
They are also often used in browsing interfaces, which allow a user to look through the
contents of a database in an exploratory and unstructured manner.
2. Forms-Based Interfaces- A forms-based interface displays a form to each user. Users can fill
out all of the form entries to insert new data, or they can fill out only certain entries, in which
case the DBMS will retrieve matching data for the remaining entries. Forms are usually designed
and programmed for naive users as interfaces to canned transactions. Many DBMSs have forms
specification languages, which are special languages that help programmers specify such forms.
SQL*Forms is a form-based language that specifies queries using a form designed in conjunction
with the relational database schema. Oracle Forms is a component of the Oracle product suite
that provides an extensive set of features to design and build applications using forms. Some
systems have utilities that define a form by letting the end user interactively construct a
sample form on the screen.
3. Speech Input and Output- Limited use of speech as an input query and speech as an answer to
a question or result of a request is becoming commonplace. Applications with limited
vocabularies, such as inquiries for telephone directory, flight arrival/departure, and credit card
account information, are allowing speech for input and output.
4. Graphical User Interfaces- A GUI typically displays a schema to the user in diagrammatic form.
The user then can specify a query by manipulating the diagram. In many cases, GUIs utilize both
menus and forms. Most GUIs use a pointing device, such as a mouse, to select certain parts of
the displayed schema diagram.
6. Interfaces for Parametric Users-Parametric users, such as bank tellers, often have a small set
of operations that they must perform repeatedly. For example, a teller is able to use single
function keys to invoke routine and repetitive transactions such as account deposits or
withdrawals, or balance inquiries. Systems analysts and programmers design and implement a
special interface for each known class of naive users. Usually a small set of abbreviated
commands is included, with the goal of minimizing the number of keystrokes required for each
request. For example, function keys in a terminal can be programmed to initiate various
commands. This allows the parametric user to proceed with a minimal number of keystrokes.
7. Interfaces for the DBA-Most database systems contain privileged commands that can be used
only by the DBA staff. These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and reorganizing the storage
structures of a database.
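The interface for parametric users described above can be sketched as a table that maps function keys to canned transactions. Everything here is hypothetical: the key assignments (F1, F2, F3), the account table, and the function names are assumptions for illustration, and a real teller system would add authorization, logging, and transaction control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL NOT NULL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")


def balance(account_id):
    """Canned transaction: balance inquiry."""
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)).fetchone()
    return row[0]


def deposit(account_id, amount):
    """Canned transaction: account deposit."""
    conn.execute(
        "UPDATE account SET balance = balance + ? WHERE id = ?",
        (amount, account_id))
    return balance(account_id)


def withdraw(account_id, amount):
    """Canned transaction: withdrawal, refused if funds are insufficient."""
    if amount > balance(account_id):
        raise ValueError("insufficient funds")
    conn.execute(
        "UPDATE account SET balance = balance - ? WHERE id = ?",
        (amount, account_id))
    return balance(account_id)


# One keystroke selects the whole transaction; the teller supplies
# only the parameters (hence "parametric" user).
FUNCTION_KEYS = {"F1": deposit, "F2": withdraw, "F3": balance}


def press(key, *args):
    """Dispatch a function key to its canned transaction."""
    return FUNCTION_KEYS[key](*args)


after_deposit = press("F1", 1, 50.0)
after_withdraw = press("F2", 1, 30.0)
current = press("F3", 1)

print(after_deposit, after_withdraw, current)
```

The user never sees SQL; the query text is fixed in advance by the programmers who designed the interface.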