
DATABASE LANGUAGES

WACHIRA DAVIS

DATABASE LANGUAGES
A data sublanguage consists of two parts: a Data Definition Language (DDL) and a Data Manipulation
Language (DML). The DDL is used to specify the database schema and the DML is used to both read and
update the database. These languages are called data sublanguages because they do not include
constructs for all computing needs, such as conditional or iterative statements, which are provided by
high-level programming languages. Many DBMSs have a facility for embedding the sublanguage in a
high-level programming language such as COBOL, Fortran, Pascal, Ada, C, C++, Java, or Visual Basic. In
this case, the high-level language is sometimes referred to as the host language.

The Data Definition Language (DDL)

DDL is a language that allows the DBA or user to describe and name the entities, attributes, and
relationships required for the application, together with any associated integrity and security
constraints.

The DDL is used to define a schema or to modify an existing one. It cannot be used to manipulate data.
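
As a brief illustration, the following SQL DDL statements sketch how a schema might be defined and
then modified; the Staff table and its columns are illustrative names only, and exact data types vary
by DBMS:

CREATE TABLE Staff (                         -- define a new relation
    staffNo   VARCHAR(5)  PRIMARY KEY,
    fName     VARCHAR(30) NOT NULL,
    lName     VARCHAR(30) NOT NULL,
    branchNo  VARCHAR(4)
);

ALTER TABLE Staff ADD salary DECIMAL(9,2);   -- modify the existing schema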

The result of the compilation of the DDL statements is a set of tables stored in special files collectively
called the system catalog. The system catalog integrates the metadata (that is, data that describes
objects in the database) and makes it easier for those objects to be accessed or manipulated. The
metadata contains definitions of records, data items, and other objects that are of interest to users or
are required by the DBMS. The DBMS normally consults the system catalog before the actual data is
accessed in the database. The terms data dictionary and data directory are also used to describe the
system catalog, although the term 'data dictionary' usually refers to a more general software system
than a catalog for a DBMS.

The Data Manipulation Language (DML)

DML is a language that provides a set of operations supporting the basic manipulation of data held in
the database. Data manipulation operations usually include the following (each is illustrated in SQL
after the list):
 insertion of new data into the database;
 modification of data stored in the database;
 retrieval of data contained in the database;
 deletion of data from the database.
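
A minimal SQL sketch of these four operations, reusing the illustrative Staff table from the DDL
example above (all values are hypothetical):

INSERT INTO Staff (staffNo, fName, lName, branchNo)      -- insertion
VALUES ('SG37', 'Ann', 'Beech', 'B003');

UPDATE Staff SET salary = salary * 1.05                  -- modification
WHERE branchNo = 'B003';

SELECT staffNo, fName, lName FROM Staff                  -- retrieval
WHERE branchNo = 'B003';

DELETE FROM Staff WHERE staffNo = 'SG37';                -- deletion
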
Data manipulation applies to the external, conceptual, and internal levels. At the internal level we must
define rather complex low-level procedures that allow efficient data access. At higher levels, emphasis is
placed on ease of use and effort is directed at providing efficient user interaction with the system.

The part of a DML that involves data retrieval is called a query language. A query language can be
defined as a high-level special-purpose language used to satisfy diverse requests for the retrieval of data
held in the database. The term 'query' is therefore reserved to denote a retrieval statement expressed in
a query language.

We can have two types of DML: procedural and non-procedural. Typically, procedural languages treat
records individually, whereas non-procedural languages operate on sets of records. A procedural DML
requires the user to tell the system both what data is needed and exactly how to retrieve it, while a
non-procedural DML allows the user to state what data is needed without specifying how it is to be
retrieved.
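
The contrast can be sketched as follows; the SQL statement is genuinely non-procedural, while the
record-at-a-time steps in the comments are only an informal illustration of the procedural style:

-- Non-procedural (set-oriented): one statement; the DBMS decides how
SELECT fName, lName
FROM   Staff
WHERE  branchNo = 'B003';

-- A procedural DML would instead navigate record by record, e.g.:
--   find the first Staff record for branch 'B003';
--   while a record is found: process it; find the next record.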

Fourth-Generation Languages (4GLs)

There is no consensus about what constitutes a fourth-generation language. An operation that requires
hundreds of lines in a third-generation language (3GL), such as COBOL, generally requires significantly
fewer lines in a 4GL.

Compared with a 3GL, which is procedural (the user defines the steps that a program needs to perform a
task), a 4GL is non-procedural: the user defines what is to be done, not how, and supplies parameters
to tools that use them to generate an application program.

Fourth-generation languages encompass:


 presentation languages, such as query languages and report generators;
 speciality languages, such as spreadsheets and database languages;
 application generators that define, insert, update, and retrieve data from the database to build
applications;
 very high-level languages that are used to generate application code.
SQL and QBE are examples of 4GLs.

Forms generators
A forms generator is an interactive facility for rapidly creating data input and display layouts for screen
forms. The forms generator allows the user to define what the screen is to look like, what information is
to be displayed, and where on the screen it is to be displayed. It may also allow the definition of colors
for screen elements and other characteristics, such as bold, underline, blinking, reverse video, and so on.
It may also allow the creation of derived attributes, perhaps using arithmetic operators or
aggregates, and the specification of validation checks for data input.
Report generators
A report generator is a facility for creating reports from data stored in the database. It is similar to a
query language in that it allows the user to ask questions of the database and retrieve information from
it for a report. However, in the case of a report generator, we have much greater control over what the
output looks like. We can let the report generator automatically determine how the output should look
or we can create our own customized output reports using special report-generator command
instructions.


Graphics generators
A graphics generator is a facility to retrieve data from the database and display the data as a graph
showing trends and relationships in the data. It allows the user to create bar charts, pie charts, line
charts, scatter charts, and so on.
Application generators

An application generator is a facility for producing a program that interfaces with the database. The use
of an application generator can reduce the time it takes to design an entire software application.
Application generators typically consist of prewritten modules that comprise fundamental functions that
most programs use. These modules, usually written in a high-level language, constitute a 'library' of
functions to choose from. The user specifies what the program is supposed to do; the application
generator determines how to perform the tasks.

Functions of a DBMS

There are several services that should be provided by any full-scale DBMS. They include the following:

1. Data storage, retrieval, and update: A DBMS must furnish users with the ability to store, retrieve,
and update data in the database. This is the fundamental function of a DBMS.

2. A user-accessible catalog: A DBMS must furnish a catalog in which descriptions of data items are
stored and which is accessible to users. The catalog is expected to be accessible to users as well as to
the DBMS. A system catalog, or data dictionary, is a repository of information describing the data in
the database: that is, the 'data about the data', or metadata. The amount of information and the way
the information is used vary with the DBMS.

Typically, the system catalog stores the following (a query sketch follows the list):

 names, types, and sizes of data items;
 names of relationships;
 integrity constraints on the data;
 names of authorized users who have access to the data;
 the data items that each user can access and the types of access allowed; for example,
insert, update, delete, or read access;
 external, conceptual, and internal schemas and the mappings between the schemas;
 usage statistics, such as the frequencies of transactions and counts on the number of
accesses made to objects in the database.
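
Many relational DBMSs expose the system catalog through standard INFORMATION_SCHEMA views, so the
catalog itself can be queried with ordinary SQL. A minimal sketch (the Staff table name is
illustrative, and catalog view names differ between systems):

SELECT table_name, column_name, data_type     -- names and types of data items
FROM   information_schema.columns
WHERE  table_name = 'Staff';
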
The DBMS system catalog is one of the fundamental components of the system. Many of the software
components rely on the system catalog for information. Some benefits of a system catalog are:

 Information about data can be collected and stored centrally. This helps to maintain
control over the data as a resource.


 The meaning of data can be defined, which will help other users understand the purpose
of the data.
 Communication is simplified, since exact meanings are stored.
 The system catalog may also identify the user or users who own or access the data.
 Redundancy and inconsistencies can be identified more easily since the data is centralized.
 Changes to the database can be recorded.
 The impact of a change can be determined before it is implemented, since the system catalog
records each data item, all its relationships, and all its users.
 Security can be enforced.
 Integrity can be ensured.
 Audit information can be provided.

3. Transaction support: A DBMS must furnish a mechanism which will ensure either that all the
updates corresponding to a given transaction are made or that none of them is made.

A transaction is a series of actions, carried out by a single user or application program, which
accesses or changes the contents of the database. If a transaction that makes several changes to the
database fails during execution, perhaps because of a computer crash, the database will be in an
inconsistent state: some changes will have been made and others not. Consequently, the changes that
have been made will have to be undone to return the database to a consistent state again.
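
As a sketch of how this mechanism is exposed in SQL (exact keywords vary by system, e.g. BEGIN or
START TRANSACTION, and the Account table is hypothetical):

BEGIN;                                                          -- start the transaction
UPDATE Account SET balance = balance - 100 WHERE accNo = 'A1';
UPDATE Account SET balance = balance + 100 WHERE accNo = 'A2';
COMMIT;                                                         -- make both updates permanent

-- If anything fails before COMMIT, a ROLLBACK (explicit or automatic) undoes
-- the partial work, so either both updates happen or neither does.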

4. Concurrency control services: A DBMS must furnish a mechanism to ensure that the database is
updated correctly when multiple users are updating the database concurrently. DBMS enables many
users to access shared data concurrently. Concurrent access is relatively easy if all users are only
reading data, as there is no way that they can interfere with one another. However, when two or
more users are accessing the database simultaneously and at least one of them is updating data,
there may be interference that can result in inconsistencies. The DBMS must ensure that, when
multiple users are accessing the database, interference cannot occur.
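
One widely available (though not universal) mechanism is explicit row locking, in which a transaction
locks the rows it intends to change so that concurrent updaters must wait; a sketch reusing the
hypothetical Account table:

BEGIN;
SELECT balance FROM Account WHERE accNo = 'A1' FOR UPDATE;   -- lock the row
UPDATE Account SET balance = balance - 100 WHERE accNo = 'A1';
COMMIT;                                                      -- release the lock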

5. Recovery services: A DBMS must furnish a mechanism for recovering the database in the event
that the database is damaged in any way. This could be due to a system crash, media failure, a
hardware or software error causing the DBMS to stop, or it may be the result of the user detecting
an error during the transaction and aborting the transaction before it completes. In all these cases,
the DBMS must provide a mechanism to recover the database to a consistent state.

6. Authorization services: A DBMS must furnish a mechanism to ensure that only authorized users
can access the database.
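
In SQL-based systems, authorization is typically expressed with GRANT and REVOKE statements; a
minimal sketch, where manager_role is a hypothetical authorization identifier:

GRANT SELECT, UPDATE ON Staff TO manager_role;    -- allow reads and updates
REVOKE UPDATE ON Staff FROM manager_role;         -- later withdraw update access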

7. Support for data communication: A DBMS must be capable of integrating with communication
software. Most users access the database from workstations. Sometimes these workstations are
connected directly to the computer hosting the DBMS. In other cases, the workstations are at
remote locations and communicate with the computer hosting the DBMS over a network. In either
case, the DBMS receives requests as communications messages and responds in a similar way.

8. Integrity services: A DBMS must furnish a means to ensure that both the data in the database and
changes to the data follow certain rules. Database integrity refers to the correctness and
consistency of stored data: it can be considered as another type of database protection. While
integrity is related to security, it has wider implications: integrity is concerned with the quality of
data itself. Integrity is usually expressed in terms of constraints, which are consistency rules that the
database is not permitted to violate. For example, we may want to specify a constraint that no
member of staff can manage more than 100 properties at any one time. Here, we would want the DBMS to
check, when we assign a property to a member of staff, that this limit would not be exceeded, and to
prevent the assignment from occurring if the limit has been reached.
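
A sketch of how such a rule might be declared. The CREATE ASSERTION statement below is part of the
SQL standard but is not implemented by many DBMSs, which would typically enforce the same rule with a
trigger instead; PropertyForRent and staffNo are illustrative names:

CREATE ASSERTION StaffNotOverloaded
CHECK (NOT EXISTS (
    SELECT staffNo
    FROM   PropertyForRent
    GROUP BY staffNo
    HAVING COUNT(*) > 100          -- no member of staff manages more than 100
));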

9. Services to promote data independence: A DBMS must include facilities to support the
independence of programs from the actual structure of the database.
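
One common facility supporting data independence is the view, which gives programs a stable external
schema even if the underlying base tables are restructured; a sketch reusing the illustrative Staff
table:

CREATE VIEW Staff3 AS              -- external view of branch B003 staff
SELECT staffNo, fName, lName
FROM   Staff
WHERE  branchNo = 'B003';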

10. Utility services: A DBMS should provide a set of utility services. Utility programs help the DBA to
administer the database effectively. Examples of utilities are: import facilities, to load the database
from flat files, and export facilities, to unload the database to flat files; monitoring facilities, to
monitor database usage and operation; statistical analysis programs, to examine performance or
usage statistics; index reorganization facilities, to reorganize indexes and their overflows; garbage
collection and reallocation, to remove deleted records physically from the storage devices, to
consolidate the space released, and to reallocate it where it is needed.

Components of a DBMS

DBMSs are highly complex and sophisticated pieces of software that aim to provide various services. It is
not possible to generalize the component structure of a DBMS as it varies greatly from system to
system. However, it is useful when trying to understand database systems to try to view the
components and the relationships between them.

A DBMS is partitioned into several software components (or modules), each of which is assigned a
specific operation. Some of the functions of the DBMS are supported by the underlying operating
system. However, the operating system provides only basic services and the DBMS must be built on top
of it. Thus, the design of a DBMS must take into account the interface between the DBMS and the
operating system.

The major software components in a DBMS environment are given in the following diagram and
discussed as follows:


[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

 Query processor: This is a major DBMS component that transforms queries into a series of low-level
instructions directed to the database manager.

 Database manager (DM): The DM interfaces with user-submitted application programs and queries.
The DM accepts queries and examines the external and conceptual schemas to determine what
conceptual records are required to satisfy the request. The DM then places a call to the file manager
to perform the request.

 File manager: The file manager manipulates the underlying storage files and manages the allocation
of storage space on disk. It establishes and maintains the list of structures and indexes defined in
the internal schema. If hashed files are used, it calls on the hashing functions to generate record
addresses. However, the file manager does not directly manage the physical input and output of data.
Rather, it passes the requests on to the appropriate access methods, which either read data from or
write data into the system buffer (or cache).

 DML preprocessor: This module converts DML statements embedded in an application program into
standard function calls in the host language. The DML preprocessor must interact with the query
processor to generate the appropriate code.


 DDL compiler: The DDL compiler converts DDL statements into a set of tables containing metadata.
These tables are then stored in the system catalog while control information is stored in data file
headers.

 Catalog manager: The catalog manager manages access to and maintains the system catalog. The
system catalog is accessed by most DBMS components.

Database manager

The major software components of the database manager are shown in the following diagram and include
the following:

Authorization control: This module checks that the user has the necessary authorization to carry out the
required operation.

Command processor: Once the system has checked that the user has authority to carry out the
operation, control is passed to the command processor.

Integrity checker: For an operation that changes the database, the integrity checker checks that the
requested operation satisfies all necessary integrity constraints (such as key constraints).

Query optimizer: This module determines an optimal strategy for the query execution.

Transaction manager: This module performs the required processing of operations it receives from
transactions.

Scheduler: This module is responsible for ensuring that concurrent operations on the database proceed
without conflicting with one another. It controls the relative order in which transaction operations are
executed.

Recovery manager: This module ensures that the database remains in a consistent state in the presence
of failures. It is responsible for transaction commit and abort.

Buffer manager: This module is responsible for the transfer of data between main memory and
secondary storage, such as disk and tape. The recovery manager and the buffer manager are sometimes
referred to collectively as the data manager. The buffer manager is sometimes known as the cache
manager.

[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

Multi-User DBMS Architectures

The common architectures that are used to implement multi-user database management systems are:
teleprocessing, file-server, and client-server.

Teleprocessing
The traditional architecture for multi-user systems was teleprocessing, where there is one computer
with a single central processing unit (CPU) and a number of terminals.

Teleprocessing is shown using the following diagram:


[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

User terminals are typically 'dumb', incapable of functioning on their own and are cabled to the central
computer that performs all processing. This architecture placed a tremendous burden on the central
computer, which had to run the application programs and the DBMS, and also had to carry out a
significant amount of work on behalf of the terminals. Such work includes formatting data for display on
the screen.

In recent years, there have been significant advances in the development of high-performance personal
computers and networks. There is a trend in industry towards replacing expensive mainframe
computers with more cost-effective networks of personal computers that achieve the same, or even
better, results. This trend has given rise to two architectures: file-server and client-server.

File-Server Architecture

In a file-server environment, the processing is distributed about the network, typically a local area
network (LAN). The file-server holds the files required by the applications and the DBMS. However, the
applications and the DBMS run on each workstation, requesting files from the file-server when
necessary. In this way, the file-server acts simply as a shared hard disk drive. The DBMS on each
workstation sends requests to the file-server for all data.

The following is a diagram of the file-server architecture:


[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).
This approach can generate a significant amount of network traffic, which can lead to performance
problems. For example, consider a user request that requires the names of staff who work in the branch
at 163 Main St. We can express this request in SQL as:

SELECT fName, lName
FROM   Branch b, Staff s
WHERE  b.branchNo = s.branchNo AND b.street = '163 Main St';

As the file-server has no knowledge of SQL, the DBMS has to request the files corresponding to the
Branch and Staff relations from the file-server, rather than just the staff names that satisfy the query.

The file-server architecture, therefore, has three main disadvantages:

1. There is a large amount of network traffic.
2. A full copy of the DBMS is required on each workstation.
3. Concurrency, recovery, and integrity control are more complex because there can be multiple
DBMSs accessing the same files.

Traditional Two-Tier Client-Server Architecture

To overcome the disadvantages of the first two approaches and accommodate an increasingly
decentralized business environment, the client-server architecture was developed. In this architecture,
there is a client process, which requires some resource, and a server, which provides the resource.
There is no requirement that the client and server must reside on the same machine. In practice, it is
quite common to place a server at one site in a local area network and the clients at the other sites. The
following diagram illustrates the client-server architecture:

[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

Data-intensive business applications consist of four major components: the database, the transaction
logic, the business and data application logic, and the user interface. The traditional two-tier client-
server architecture provides a very basic separation of these components.

The following diagram illustrates the two-tier client-server architecture:

[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

The client (tier 1) is primarily responsible for the presentation of data to the user, and the server (tier 2)
is primarily responsible for supplying data services to the client.

Presentation services handle user interface actions and the main business and data application logic.
Data services provide limited business application logic, typically validation that the client is unable to
carry out due to lack of information, and access to the requested data, independent of its location. The
data can come from relational DBMSs, object-relational DBMSs, object-oriented DBMSs, legacy
DBMSs, or proprietary data access systems. Typically, the client would run on end-user desktops and
interact with a centralized database server over a network.

A typical interaction between client and server is as follows. The client takes the user's request, checks
the syntax and generates database requests in SQL or another database language appropriate to the
application logic. It then transmits the message to the server, waits for a response, and formats the
response for the end-user. The server accepts and processes the database requests, then transmits the
results back to the client. The processing involves checking authorization, ensuring integrity, maintaining
the system catalog, and performing query and update processing. In addition, it also provides
concurrency and recovery control. Functions of client and server are summarized as follows:

Client
 Manages the user interface
 Accepts and checks syntax of user input
 Processes application logic
 Generates database requests and transmits to server
 Passes response back to user

Server
 Accepts and processes database requests from clients
 Checks authorization
 Ensures integrity constraints not violated
 Performs query/update processing and transmits response to client
 Maintains system catalog
 Provides concurrent database access
 Provides recovery control

There are many advantages to this type of architecture.

1. It enables wider access to existing databases.


2. Increased performance - if the clients and server reside on different computers then different
CPUs can be processing applications in parallel. It should also be easier to tune the server
machine if its only task is to perform database processing.
3. Hardware costs may be reduced - it is only the server that requires storage and processing
power sufficient to store and manage the database.
4. Communication costs are reduced - applications carry out part of the operations on the client
and send only requests for database access across the network, resulting in less data being sent
across the network.
5. Increased consistency - the server can handle integrity checks, so that constraints need be
defined and validated only in the one place, rather than having each application program
perform its own checking.
6. It maps onto an open-systems architecture quite naturally.

Three-Tier Client-Server Architecture

The need for enterprise scalability challenged the traditional two-tier client-server model. In the
mid-1990s, as applications became more complex and could potentially be deployed to hundreds or
thousands of end-users, the client side presented two problems that prevented true scalability:

 A 'fat' client, requiring considerable resources on the client's computer to run effectively. This
includes disk space, RAM, and CPU power.
 A significant client-side administration overhead.

By 1995, a new variation of the traditional two-tier client-server model appeared to solve the problem
of enterprise scalability. This new architecture proposed three layers, each potentially running on a
different platform as shown in the following diagram:

Diagram showing three-tier client-server architecture:


[Diagram omitted.] Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).

In the three-tier client-server architecture:

 The user interface layer runs on the end-user's computer (the client).
 The business logic and data processing layer runs on a server, often called the application server.
 A DBMS stores the data required by the middle tier; this tier may run on a separate server called
the database server.
The client is now responsible only for the application's user interface and perhaps performing some
simple logic processing, such as input validation, thereby providing a 'thin' client. The core business logic
of the application now resides in its own layer, physically connected to the client and database server
over a local area network (LAN) or wide area network (WAN). One application server is designed to
serve multiple clients.

The three-tier design has many advantages over the other traditional designs and these include:

1. The need for less expensive hardware because the client is 'thin'.
2. Application maintenance is centralized with the transfer of the business logic for many end-
users into a single application server. This eliminates the concerns of software distribution that
are problematic in the traditional two-tier client-server model.
3. The added modularity makes it easier to modify or replace one tier without affecting the
other tiers.


4. Load balancing is easier with the separation of the core business logic from the database
functions.
Another advantage of the three-tier architecture is that it maps quite naturally to the Web environment,
with a Web browser acting as the 'thin' client, and a Web server acting as the application server.

The three-tier architecture can be extended to n-tiers, with additional tiers added to provide more
flexibility and scalability. For example, the middle tier of the three-tier architecture could be split into
two, with one tier for the Web server and another for the application server. This three-tier architecture
has proved more appropriate for some environments, such as the Internet and corporate intranets
where a Web browser can be used as a client.

Reference:

Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation and
Management. Addison Wesley, New York, 2005.
