Module1 Notes

The document provides an overview of Database Management Systems (DBMS), detailing the properties and functions of databases, including data definition, construction, manipulation, and sharing. It discusses the characteristics of the database approach, the roles of various database users, and the advantages of using a DBMS, such as controlling redundancy and providing security. Additionally, it covers data models, schemas, data independence, and the languages and interfaces used in DBMS environments.

DBMS Module1 CASP

DBMS (Database Management System)

Introduction

A database is a collection of related data. By data, we mean known
facts that can be recorded and that have implicit meaning.

Example: The names, telephone numbers, and addresses of the people
you know.
The collection of related data with an implicit meaning is a database.

A database has the following properties:

■ A database represents some aspect of the real world.
■ A database is a logically related collection of data with some inherent
meaning.
■ A database is designed, built, and populated with data for a specific
purpose.
A database can be of any size and complexity. For example, it
may be a list of names and addresses consisting of only a few hundred
records, each with a simple structure, or a computerized catalog of a
large library with half a million entries organized under different
categories.
An example of a large commercial database is Amazon.com. A database
may be generated and maintained manually or it may be computerized.
A database management system (DBMS) is a collection of programs
that enables users to create and maintain a database. The DBMS is a
general-purpose software system that facilitates the processes of
defining, constructing, manipulating, and sharing databases among
various users and applications.

Defining a database involves specifying the data types, structures, and
constraints of the data to be stored in the database. The database
definition or descriptive information is also stored by the DBMS in the
form of a database catalog or dictionary; it is called meta-data.
Constructing the database is the process of storing the data on some
storage medium that is controlled by the DBMS.
Manipulating a database includes functions such as querying the
database to retrieve specific data, updating the database and generating
reports from the data.
Sharing a database allows multiple users and programs to access the
database simultaneously.
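
The functions listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the student table, its columns, and the sample rows are invented here and do not come from these notes. Sharing is not shown, because an in-memory database is visible to a single connection only.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")

# Defining: specify data types, structures, and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " phone TEXT)"
)

# Constructing: store the data on a medium controlled by the DBMS.
conn.executemany(
    "INSERT INTO student (roll_no, name, phone) VALUES (?, ?, ?)",
    [(1, "Asha", "555-0101"), (2, "Ravi", "555-0102")],
)
conn.commit()

# Manipulating: update the data, then query it to produce a small report.
conn.execute("UPDATE student SET phone = ? WHERE roll_no = ?", ("555-0199", 2))
conn.commit()
rows = conn.execute(
    "SELECT roll_no, name, phone FROM student ORDER BY roll_no"
).fetchall()
for roll_no, name, phone in rows:
    print(roll_no, name, phone)
```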

An application program accesses the database by sending queries or
requests for data to the DBMS. A query typically causes some data to
be retrieved; a transaction may cause some data to be read and some
data to be written into the database. Other important functions provided
by the DBMS include protecting the database and maintaining it over a
long period of time.

The database and DBMS software together are called a database system.

Characteristics of the Database Approach
(Database Approach vs. File Processing)

1. Self-Describing Nature of a Database System

The database system contains the complete definition or description of
the database structure and constraints. This definition is stored in the
DBMS catalog or data dictionary. The information stored in the catalog is
called meta-data, and it describes the structure of the primary database.
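
As a small illustration of a self-describing database, the sketch below reads meta-data back from SQLite, whose catalog is exposed as the sqlite_master table and the table_info pragma. The course table is an assumed example; other DBMSs expose their catalogs through different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_no TEXT PRIMARY KEY, title TEXT NOT NULL)"
)

# sqlite_master holds one row per schema object (tables, indexes, views).
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(catalog)
print(columns)
```

The program never hard-codes the structure of course when reading it back; it asks the catalog instead.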

2. Insulation between Programs and Data, and Data Abstraction

The structure of data files is stored in the DBMS catalog separately
from the access programs.

3. Support Multiple Views of the Data

A database typically has many users, each of whom may require a
different view of the database.
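
One common way to give user groups different views is the SQL CREATE VIEW statement. The sketch below uses Python's sqlite3 module; the table, the two views, and the idea that an office and an examiner group use them are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT, grade TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', '555-0101', 'A')")

# Hypothetical office view: contact details only, no grades.
conn.execute(
    "CREATE VIEW contact_view AS SELECT roll_no, name, phone FROM student"
)
# Hypothetical examiner view: grades only, no contact details.
conn.execute("CREATE VIEW grade_view AS SELECT roll_no, grade FROM student")

contact_rows = conn.execute("SELECT * FROM contact_view").fetchall()
grade_rows = conn.execute("SELECT * FROM grade_view").fetchall()
print(contact_rows)
print(grade_rows)
```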

4. Sharing of Data and Multiuser Transaction Processing

A multiuser DBMS must allow multiple users to access the
database at the same time. The DBMS must include concurrency control
software to ensure that several users trying to update the same data do
so in a controlled manner.

Database Users
Many people are involved in the design, use, and maintenance of a large
database with hundreds of users.

1. Database Administrators
Administering the database resources is the responsibility of the
database administrator (DBA). The DBA is responsible for authorizing
access to the database, coordinating and monitoring its use, and
acquiring software and hardware resources as needed. The DBA is
responsible for problems such as security breaches and poor system
response time. In large organizations, the DBA is assisted by a staff that
carries out these functions.

2. Database Designers
Database designers are responsible for identifying the data to be
stored in the database. They choose appropriate structures to represent
and store this data. It is the responsibility of database designers to
communicate with all database users in order to understand their
requirements and to create a design that meets these requirements.
3. End Users
End users are the people whose jobs require access to the database for
querying, updating, and generating reports. The database primarily exists
for their use.
There are several categories of end users:
■ Casual end users occasionally access the database, but they may
need different information each time. They use a sophisticated database
query language to specify their requests.
■ Naive or parametric end users make up a major portion of
database end users. They use program interfaces to use the database
system. Examples: bank tellers and reservation agents for airlines,
hotels, and car rental companies.
■ Sophisticated end users include engineers, scientists, business
analysts, and others who thoroughly familiarize themselves with the
facilities of the DBMS. They implement their own applications to meet
their complex requirements.
■ Standalone users maintain personal databases by using ready-
made program packages that provide easy-to-use menu-based or
graphics-based interfaces.
A typical DBMS provides multiple facilities to access a database.
Naive end users need to learn very little about the facilities provided by
the DBMS; they simply have to understand the user interfaces of the
standard transactions designed and implemented for their use.
Casual users learn only a few facilities that they may use repeatedly.

Sophisticated users try to learn most of the DBMS facilities in order to
achieve their complex requirements.
Standalone users typically become very proficient in using a specific
software package.
4. System Analysts and Application Programmers
System analysts determine the requirements of end users, especially
naive and parametric end users, and develop specifications for standard
transactions that meet these requirements. Application programmers
implement these specifications as programs. Such analysts and
programmers should be familiar with the full range of capabilities
provided by the DBMS to accomplish their tasks.
Advantages of Using the DBMS Approach
In addition to the four main characteristics, a DBMS provides some
additional capabilities. They can be listed as follows:
1. Controlling Redundancy
Redundancy in storing the same data multiple times leads to several
problems.
First, there is the need to perform a single logical update multiple
times. This leads to duplication of effort. Second, storage space is
wasted when the same data is stored repeatedly. Third, files that
represent the same data may become inconsistent.
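
A minimal sketch of how storing each fact once avoids these problems, using Python's sqlite3 module. The two tables and their contents are assumed for illustration: the student name lives only in student, so one update is enough and every report that joins the tables sees it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE grade_report (
    roll_no INTEGER REFERENCES student(roll_no),
    course TEXT,
    grade TEXT
);
INSERT INTO student VALUES (1, 'Asha');
INSERT INTO grade_report VALUES (1, 'DBMS', 'A');
INSERT INTO grade_report VALUES (1, 'OS', 'B');
""")

# The single logical update is performed exactly once.
conn.execute("UPDATE student SET name = 'Asha K' WHERE roll_no = 1")

# The report looks the name up instead of keeping its own copy.
report = conn.execute(
    "SELECT s.name, g.course, g.grade"
    " FROM grade_report g JOIN student s ON s.roll_no = g.roll_no"
    " ORDER BY g.course"
).fetchall()
print(report)
```

If grade_report carried its own copy of the name, the same change would need one update per copy, and a missed copy would leave the data inconsistent.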
2. Restricting Unauthorized Access
When multiple users share a large database, it is likely that most
users will not be authorized to access all information in the database.
Here a DBMS should provide a security and authorization subsystem.
3. Providing Persistent Storage for Program Objects
Databases can be used to provide persistent storage for program
objects and data structures. This is one of the main reasons for object-
oriented database systems.
4. Providing Storage Structures and Search Techniques for
Efficient Query Processing
Database systems must provide capabilities for efficiently
executing queries and updates. The DBMS must provide specialized
data structures and search techniques to speed up disk search for the
desired records.
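
An index is one such specialized structure. The sketch below, which assumes a student table and an index named idx_student_name, asks SQLite for its query plan before and after the index is created. The exact wording of the plan text differs between SQLite versions, so the example only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO student (name, phone) VALUES (?, ?)",
    [(f"student{i}", f"555-{i:04d}") for i in range(1000)],
)

def plan(sql, params=()):
    """Return the detail text (last column) of SQLite's query plan rows."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

query = "SELECT * FROM student WHERE name = ?"
before = plan(query, ("student42",))   # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_student_name ON student (name)")
after = plan(query, ("student42",))    # the plan can now use the index

print(before)
print(after)
```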
5. Providing Backup and Recovery
A DBMS must provide facilities for recovering from hardware or
software failures. The backup and recovery subsystem of the DBMS is
responsible for recovery.
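
The idea of restoring the database to its state before a failed update can be imitated with a transaction rollback. This sketch uses Python's sqlite3 module with an assumed account table, and the failure is simulated by raising an exception; a real recovery subsystem also has to handle crashes and media failures, which this does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

try:
    # "with conn" commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE account SET balance = balance - 200 WHERE acc_no = 1")
        # Simulated failure before the matching credit to account 2.
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The half-finished transfer has been undone.
balances = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()
print(balances)
```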
6. Providing Multiple User Interfaces
Because many types of users with varying levels of technical knowledge
use a database, a DBMS should provide a variety of user interfaces.
7. Representing Complex Relationships among Data
A DBMS must have the capability to represent a variety of complex
relationships among the data, to define new relationships as they arise,
and to retrieve and update related data easily and efficiently.
8. Enforcing Integrity Constraints
Most database applications have certain integrity constraints that
must hold for the data. A DBMS should provide capabilities for defining
and enforcing these constraints.
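
A short sketch of declaring constraints and letting the DBMS enforce them, using Python's sqlite3 module. The tables, the rule that year must lie between 1 and 4, and the try_insert helper are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    year INTEGER CHECK (year BETWEEN 1 AND 4),
    dept_no INTEGER NOT NULL REFERENCES department(dept_no)
);
INSERT INTO department VALUES (10, 'Computer Science');
""")

def try_insert(row):
    """Hypothetical helper: report whether the DBMS accepts the row."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "Asha", 2, 10)),  # satisfies every constraint
    try_insert((2, "Ravi", 7, 10)),  # violates the CHECK on year
    try_insert((3, "Mina", 1, 99)),  # refers to a department that does not exist
    try_insert((1, "Dev", 1, 10)),   # duplicates an existing primary key
]
print(results)
```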
9. Permitting Inferencing and Actions Using Rules
Some database systems provide capabilities for defining
deduction rules for inferencing new information from the stored
database facts. Such systems are called deductive database systems.

Additional Implications of Using the Database Approach

a) Potential for Enforcing Standards.
The database approach permits the DBA to define and enforce standards
among database users in a large organization.
b) Reduced Application Development Time.
Once a database is up and running, substantially less time is generally
required to create new applications using DBMS facilities.
c) Flexibility.
It may be necessary to change the structure of a database as
requirements change. Modern DBMSs allow certain types of changes to
the structure of the database without affecting the stored data and the
existing application programs.

d) Availability of Up-to-Date Information.
A DBMS makes the database available to all users. As soon as one
user’s update is applied to the database, all other users can immediately
see this update.

e) Economies of Scale.
The DBMS approach enables the whole organization to invest in more
powerful processors, storage devices, or communication gear, rather than
having each department purchase its own (lower performance) equipment.
This reduces overall costs of operation and management.

Data Models, Schemas, and Instances

Data abstraction generally refers to the suppression of details of data
organization and storage, and the highlighting of the essential features
for an improved understanding of data.
A data model is a collection of concepts that can be used to
describe the structure of a database, and it provides the necessary means
to achieve data abstraction. By structure of a database we mean the data
types, relationships, and constraints that apply to the data. Most data
models also include a set of basic operations for specifying retrievals
and updates on the database.

Categories of Data Models

Data models can be classified into three categories:
1. High-level/conceptual/object-based models
2. Representational/implementation/record-based models
3. Physical models
High-level or conceptual data models provide concepts that are close to
the way many users visualize data. (Object-Based Model)

Low-level or physical data models provide concepts that describe the
details of how data is stored on the computer storage media, typically
magnetic disks. Concepts provided by low-level data models are
generally meant for computer specialists, not for end users. (Physical
Model)
Representational (or implementation) data models provide concepts
that may be easily understood by end users. Representational data
models hide many details of data storage on disk but can be
implemented on a computer system directly. They are the models used
most frequently in traditional commercial DBMSs. Representational data
models represent data by using record structures and hence are called
record-based data models.

Schemas, Instances, and Database State

The description of a database is called the database schema, which is
specified during database design and is not expected to change
frequently.

A displayed schema is called a schema diagram. The figure shows a schema
diagram for a sample database. The diagram displays the structure of
each record type but not the actual instances of records.

A schema diagram displays only some aspects of a schema, such as
the names of record types and data items, and some types of constraints.
The data in the database at a particular moment in time is called
a database instance, database state, or snapshot. It is also called the
current set of occurrences or instances in the database. In a given
database state, each schema construct has its own current set of instances;
for example, the STUDENT construct will contain the set of individual
student entities (records) as its instances. Every time we insert or delete
a record or change the value of a data item in a record, we change one
state of the database into another state.
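
The difference between schema and state can be made concrete with a small sketch in Python's sqlite3 module. The student table and the helper functions schema() and state() are assumed for illustration: inserting a record changes the state, while the stored schema text stays the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema():
    """The stored description of the student construct (the intension)."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()[0]

def state():
    """The current set of student instances (the extension)."""
    return conn.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no"
    ).fetchall()

schema_before, state_before = schema(), state()
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
schema_after, state_after = schema(), state()

print(state_before, "->", state_after)
print(schema_before == schema_after)
```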

The DBMS stores the descriptions of the schema constructs and
constraints, also called the meta-data, in the DBMS catalog so that
DBMS software can refer to the schema whenever it needs to. The
schema is sometimes called the intension, and a database state is called
an extension of the schema. Although the schema is not supposed to
change frequently, it is not uncommon that changes occasionally need to
be applied to the schema as the application requirements change.

Three-Schema Architecture and Data Independence

In the three-schema architecture, schemas can be defined at the following
three levels:
1. The internal or physical level has an internal schema, which
describes the physical storage structure of the database. The internal
schema uses a physical data model and describes the complete details of
data storage and access paths for the database.
2. The conceptual level has a conceptual schema, which describes the
structure of the whole database. The conceptual schema hides the details
of physical storage structures and concentrates on describing entities,
data types, relationships, user operations, and constraints.
3. The external or view level includes a number of external schemas or
user views. Each external schema describes the part of the database that
a particular user group is interested in and hides the rest of the database
from that user group.
The three-schema architecture is a convenient tool with which
the user can visualize the schema levels in a database system. Most
DBMSs do not separate the three levels completely and explicitly, but
support the three-schema architecture to some extent.
In a DBMS based on the three-schema architecture, each user
group refers to its own external schema. Hence, the DBMS must
transform a request specified on an external schema into a request
against the conceptual schema, and then into a request on the internal
schema for processing over the stored database.
Data Independence
Data independence can be defined as the capacity to change the schema
at one level of a database system without having to change the schema at
the next higher level.
There are two types of data independence:
1. Logical data independence is the capacity to change the conceptual
schema without having to change external schemas or application
programs. We may change the conceptual schema to expand the
database (by adding a record type or data item), to change
constraints, or to reduce the database (by removing a record type or
data item). After the conceptual schema undergoes a logical
reorganization, application programs that reference the external
schema constructs must work as before. Changes to constraints can be
applied to the conceptual schema without affecting the external
schemas or application programs.
2. Physical data independence is the capacity to change the internal
schema without having to change the conceptual schema. Hence,
the external schemas need not be changed either. Changes to the
internal schema may be needed because some physical files were
reorganized, for example, by creating additional access structures to
improve the performance of retrieval or update.

Physical data independence exists in most databases and file
environments where physical details such as the exact location of data
on disk, and hardware details of storage encoding, placement,
compression, splitting, merging of records, and so on are hidden from
the user. Applications remain unaware of these details.
On the other hand, logical data independence is harder to achieve
because it allows structural and constraint changes without affecting
application programs, which is a much stricter requirement.
Data independence occurs because when the schema is changed at some
level, the schema at the next higher level remains unchanged; only the
mapping between the two levels is changed. Hence, application
programs referring to the higher-level schema need not be changed.
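
Logical data independence can be sketched with a view acting as the external schema. In this Python sqlite3 example the table, the view student_directory, and the directory() function standing in for an application program are all assumptions: a data item is added to the base table, and the application gets the same result because it refers only to the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', '555-0101')")

# External schema: the only thing the application refers to.
conn.execute(
    "CREATE VIEW student_directory AS SELECT roll_no, name FROM student"
)

def directory():
    """Stand-in for an application program written against the view."""
    return conn.execute(
        "SELECT roll_no, name FROM student_directory ORDER BY roll_no"
    ).fetchall()

before = directory()
# Conceptual schema change: expand the database by adding a data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after = directory()

print(before == after)
```

Removing a column that the view depends on would be a different matter, which is one reason logical data independence is the harder of the two to achieve.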

Database Languages and Interfaces


The DBMS must provide appropriate languages and interfaces for
each category of users.
DBMS Languages
In many DBMSs where no strict separation of levels is maintained,
the data definition language (DDL) is used by the DBA and by
database designers to define both the conceptual and internal schemas.
The DBMS will have a DDL compiler whose function is to process DDL
statements in order to identify descriptions of the schema constructs
and to store the schema description in the DBMS catalog.
In DBMSs where a clear separation is maintained between the
conceptual and internal levels, the DDL is used to specify the conceptual
schema only. The storage definition language (SDL) is used to specify
the internal schema. The mappings between the two schemas may be
specified in either one of these languages.
A view definition language (VDL) is used to specify user views and
their mappings to the conceptual schema, but in most DBMSs the DDL is
used to define both conceptual and external schemas.

Typical database manipulations include retrieval, insertion, deletion,
and modification of the data. The DBMS provides a set of operations or
a language called the data manipulation language (DML) for these
purposes.
There are two main types of DMLs. A high-level or nonprocedural
DML can be used on its own to specify complex database operations
concisely. Because one statement can specify and retrieve many records,
such DMLs are also called set-at-a-time DMLs.
A low-level or procedural DML must be embedded in a general-purpose
programming language. This type of DML typically retrieves
individual records or objects from the database and processes each
separately, so it is also called a record-at-a-time DML.
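
The contrast between the two styles can be imitated in Python with sqlite3. SQL is itself a high-level DML, so the loop below only mimics record-at-a-time processing from a host language; it is not a genuine procedural DML. The account table and the bonus amounts are assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 2000), (3, 5000)]
)

# Set-at-a-time style: one statement states what to change, not how.
conn.execute("UPDATE account SET balance = balance + 10 WHERE balance >= 1000")

# Record-at-a-time style: the host program fetches the records and
# processes each one separately.
for acc_no, balance in conn.execute(
    "SELECT acc_no, balance FROM account"
).fetchall():
    if balance < 1000:
        conn.execute(
            "UPDATE account SET balance = ? WHERE acc_no = ?",
            (balance + 5, acc_no),
        )

balances = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()
print(balances)
```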
The Database System Environment

A DBMS is a complex software system. The database system
environment includes the types of software components that constitute a
DBMS and the types of computer system software with which the
DBMS interacts.
DBMS Component Modules
The figure illustrates, in a simplified form, the typical DBMS
components. It is divided into two parts.
1. The top part of the figure refers to the various users of the database
environment and their interfaces.
2. The lower part shows the internals of the DBMS responsible for
storage of data and processing of transactions.

The top part of the figure shows interfaces for the DBA staff, casual
users who work with interactive interfaces to formulate queries,
application programmers who create programs using some host programming
languages, and parametric users who do data entry work by supplying
parameters to predefined transactions.
The DBA staff works on defining the database and making
changes to its definition using the DDL and other privileged commands.
The DDL compiler processes schema definitions, specified in the DDL,
and stores descriptions of the schemas (meta-data) in the DBMS catalog.

The catalog includes information such as the names and
sizes of files, names and data types of data items, storage details of each
file, mapping information among schemas, and constraints. In addition,
the catalog stores many other types of information that are needed by the
DBMS modules, which can then look up the catalog information as
needed.
Casual users and persons with occasional need for
information from the database interact using the interactive query
interface. These queries are parsed and validated for correctness of the
query syntax by a query compiler that compiles them into an internal
form. This internal query is subjected to query optimization. The query
optimizer is concerned with the rearrangement and possible reordering
of operations, elimination of redundancies, and use of correct algorithms
and indexes during execution.
Application programmers write programs in host
languages such as Java, C, or C++ that are submitted to a precompiler.
The precompiler extracts DML commands from an application program
written in a host programming language. These commands are sent to
the DML compiler for compilation into object code for database access.
The rest of the program is sent to the host language compiler. The object
codes for the DML commands and the rest of the program are linked,
forming a canned transaction.
Canned transactions are executed repeatedly by parametric users,
who simply supply the parameters to the transactions. Each execution is
considered to be a separate transaction. An example is a bank
withdrawal transaction, where the account number and the amount may be
supplied as parameters.
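
A canned transaction of this kind might be sketched as a function whose only inputs are the parameters a teller supplies. The example uses Python's sqlite3 module; the account table, the withdraw function, and its validation rules are assumptions made for illustration, not a description of any real banking system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.commit()

def withdraw(acc_no, amount):
    """Hypothetical canned transaction: the user supplies only the parameters."""
    with conn:  # each call runs as a separate transaction
        row = conn.execute(
            "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
        ).fetchone()
        if row is None:
            raise ValueError("unknown account")
        if amount <= 0 or amount > row[0]:
            raise ValueError("invalid amount")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
            (amount, acc_no),
        )
        return row[0] - amount

new_balance = withdraw(1, 200)
print(new_balance)
```

The values are passed as query parameters (the ? placeholders) instead of being pasted into the SQL text, so the same precompiled statement shape is reused on every execution.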
In the lower part, the runtime database processor executes (1) the
privileged commands, (2) the executable query plans, and (3) the canned
transactions with runtime parameters.

It works with the system catalog. It also works with the stored data
manager, which in turn uses basic operating system services. The
runtime database processor handles other aspects of data transfer, such
as management of buffers in the main memory. Some DBMSs have their
own buffer management module while others depend on the OS for
buffer management.

Concurrency control and backup and recovery systems are integrated
into the working of the runtime database processor.
It is common for the client program that accesses the DBMS to run
on a separate computer from the one on which the database resides.
The former is called the client computer, which runs DBMS client
software, and the latter is called the database server. In some cases, the
client accesses a middle computer, called the application server, which
in turn accesses the database server.
The DBMS interacts with the operating system when disk
accesses are needed. If the computer system is mainly dedicated to
running the database server, the DBMS will control main memory
buffering of disk pages. The DBMS also interfaces with compilers for
general purpose host programming languages, and with application
servers and client programs running on separate machines through the
system network interface.
