CHAPTER 1

Overview of the DBMS

This chapter discusses the basic concepts of the database management system (DBMS).
After studying this chapter, students will understand the following:

• Basics of DBMS
• File System vs. DBMS
• Architecture of DBMS
• Data Independence
• Database Languages
• Database Users
• Advantages and Disadvantages of DBMS

1.0 INTRODUCTION
Over the last two decades, the number and importance of database applications have
grown enormously. Databases are used in almost every application in every
organization, including business, healthcare, education, government, the military and
libraries. In a highly competitive environment, organizations require accurate and
reliable data for effective and efficient decision making. Many organizations today
build separate databases, called data warehouses, for this type of decision-support
application; these are outside the scope of this book.

Before defining the DBMS, we must have a clear understanding of what the DBMS
manages: the database. Data are distinct pieces of information. They can be stored in
a variety of ways, such as numbers or text on pieces of paper, or as bits and bytes in
electronic memory. A collection of such data, designed to be used by different people
in different applications, is called a database. More precisely, a database is a
collection of inter-related data stored together with limited duplication, which can be
used efficiently by many users in one or more applications. To access the information
that applications need from the database, we need some mechanism or set of methods,
which is termed the management system. A management system is a collection of
programs that enables users to create and maintain the database efficiently and quickly.

The primary purpose of a DBMS is to allow a user to store, update and retrieve data in
abstract terms and thus make it easy to maintain and retrieve information from a
database. A DBMS relieves the user from having to know about exact physical
representations of data and having to specify detailed algorithms for storing, updating
and retrieving data.

In summary, a DBMS can be defined as software that provides secure, survivable
services for creating and accessing the database while maintaining all the required
features of the data. A DBMS provides facilities such as:
i) Creating a database.
ii) Modifying or removing the database.
iii) Inserting, updating or deleting the data.
iv) Maintaining data integrity and security.
v) Retrieving data from existing databases.

Besides these, a DBMS also provides advanced features such as transaction
management, concurrency management, recovery management and storage
management. All these features are explained in the chapter on Transaction and
Concurrency Control in this book.

A DBMS provides data integrity services to ensure that the data is not corrupted by
outside events such as a power failure or disk crash. It also protects data against
unauthorized users, which means that only authorized users can access the given data
in the database. Recovery management is an activity that comes into play when a
transaction fails. It returns the system to a consistent state.
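
The idea of returning to a consistent state after a failed transaction can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `account` table, the `transfer` helper and the simulated failure are hypothetical and do not come from this chapter.

```python
import sqlite3

# Hypothetical ACCOUNT table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as a single transaction."""
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        if fail_midway:
            # Simulated failure between the two updates.
            raise RuntimeError("simulated transaction failure")
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()
        return True
    except RuntimeError:
        conn.rollback()  # undo the partial update so the data stays consistent
        return False


def balances(conn):
    """Return {account id: balance} for all accounts."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

When the transfer fails half-way, the rollback discards the first update, so no money disappears; a successful transfer applies both updates together.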

Fig. 1.1 General Overview of a DBMS

The data stored in a database can be described as a group of related items, called
fields. The fields together constitute a record. A database can have many records.
For example, consider the STUDENT database shown below:

Enrollment_No   Stud_Name   F_Name       Class   Stream
1               Ram         Mr. Shyam    XII     Science
2               Tapan       Mr. Vikas    XII     Arts
3               Aryan       Mr. Devesh   X       Science

(Each column is a field; each row is a record.)

Each record in the database stores the information about a single student. The fields
in each record store the important information about that student. For example, Ram
has Enrollment Number 1. His father’s name is Mr. Shyam and he is studying in
class XII in the Science stream.
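
As a rough sketch of the record and field terminology, the STUDENT data can be modelled in Python with one named tuple per record. The Python attribute names (such as `student_class`) are assumptions chosen for readability; only the data values come from the table above.

```python
from typing import NamedTuple


class Student(NamedTuple):
    """One record; each attribute is a field."""
    enrollment_no: int
    stud_name: str
    f_name: str
    student_class: str
    stream: str


# The three records of the STUDENT database.
students = [
    Student(1, "Ram", "Mr. Shyam", "XII", "Science"),
    Student(2, "Tapan", "Mr. Vikas", "XII", "Arts"),
    Student(3, "Aryan", "Mr. Devesh", "X", "Science"),
]

first_record = students[0]          # the record for Ram
field_names = Student._fields       # the five field names
```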
The data in the database will be both integrated and shared. An organization that
wishes to establish a database in a modern computer system will want all of its data
to be considered for inclusion in a common pool. This common pool is accessible to
all parts of the organization that need the information. The database therefore needs
to be integrated in such a way that the data represent not just isolated facts about the
real world but also the naturally occurring relationships among them.

For example, a given database might contain two files: one for students, which
contains Enrollment_No, Stud_Name, Class and F_Name, and another for books, which
contains Book_Id, Title, Author, Publisher and Price. These two files are used in two
different applications, say a Library Management System and an Accounts Management
System. In the Library Management System, the STUDENT and BOOK data files are
used to issue books to students. In the Accounts Management System, the STUDENT
file is used for fee purposes while the BOOK file is used for billing purposes (for
calculating the total amount spent on purchasing books). The same files can thus be
used in multiple applications.

[Figure: the Student Data File and the Book Data File are both connected to the
Library Information System.]

Fig. 1.2 An application containing more than one data file.

[Figure: the Student Data File is connected to both the Library Information System
and the Account System.]

Fig. 1.3 A data file being used in more than one application.
1.1 FILE SYSTEM
When computers were first introduced into the business world, data were kept in
large files. In a file system, each field or data item is stored sequentially on disk in
one large file. To find a particular item, the system has to search the entire file from
the beginning. It can also keep a pointer (a locator on the disk) to the last data item
retrieved, so that searches for more occurrences of the same data type don’t have to
begin at the start of the file.
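
The sequential search with a saved pointer described above can be sketched as follows. This is a minimal illustration, assuming a comma-separated flat file held in memory; the `find_next` helper and the file layout are hypothetical.

```python
import io


def find_next(f, wanted, start=0):
    """Scan line by line from byte offset `start`.

    Returns (matching line, offset just after it), or (None, end offset)
    when no further match exists.
    """
    f.seek(start)
    while True:
        line = f.readline()
        if not line:                      # end of file reached
            return None, f.tell()
        if wanted in line:
            return line.rstrip(b"\n"), f.tell()


# A small flat file: one student per line.
data = io.BytesIO(b"1,Ram,XII,Science\n2,Tapan,XII,Arts\n3,Aryan,X,Science\n")

first, pos = find_next(data, b"Science")         # search from the beginning
second, pos = find_next(data, b"Science", pos)   # resume after the last hit
third, pos = find_next(data, b"Science", pos)    # no more occurrences
```

Keeping `pos` plays the role of the pointer to the last item retrieved: the second search does not re-read the first line.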

The applications that interacted with these data files were written mostly in COBOL.
It was very complex to write such programs for all data-related activities such as
inserting, updating and querying the data. Moreover, each application had its own
master files, causing huge redundancy because the same data files might be needed by
many applications.

Several disadvantages are associated with conventional file processing systems. These
disadvantages are as follows:

a) Limited Data Sharing: With the traditional file processing approach, each
application has its own private files and users have little opportunity to share
data outside their own applications. In addition, a major management effort
may also be required since different organizational units may own these
different files.
b) Program Data Dependence: File descriptions are stored within each
application program that accesses a given file. As a consequence, any change
to a file structure requires changes to the file descriptions for all programs that
access the file. It is often difficult even to locate all programs affected by such
changes.
c) Duplication of Data: Because the system could not give different applications
access to the same data, each application kept its own copy.
d) Occurrence of Inconsistencies: Duplicated data leads to inconsistencies and other
errors in data files, because a change of information in one place requires
the same update in all the copies. To overcome such problems, user
requirements must be assessed and considered prior to the development and
implementation of software packages.
e) Lengthy Development Time: With traditional file processing system, there is
little opportunity to leverage previous development efforts. Each new
application requires that the developer essentially start from scratch by
designing new file formats and descriptions, and then writing the file access
logic for each new program. The lengthy development times required are often
inconsistent with today’s fast-paced business environment, in which time to
market (or time to production for an information system) is a key business
success factor.
f) Inadequate Security Options: In a file system, security can be provided only
at the operating system level. We cannot authorize users to access
different subsets of the data, which many applications require.
g) Excessive Program Maintenance: Special programs had to be written to
answer each query required by the application. These programs
were generally very complex because of the large volume of data to be
searched.
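
Program-data dependence (item b above) can be illustrated with a small sketch. It assumes a fixed-width binary record file; the two layouts and the `read_names` helper are hypothetical and are not taken from any real system.

```python
import struct

# Layout compiled into the "old" program: int enrollment number + 15-byte name.
OLD_LAYOUT = struct.Struct("<i15s")
# The file owner later adds a 15-byte stream field to every record.
NEW_LAYOUT = struct.Struct("<i15s15s")


def read_names(raw, layout):
    """Decode every fixed-width record in `raw` using the given layout."""
    names = []
    for offset in range(0, len(raw) - layout.size + 1, layout.size):
        fields = layout.unpack_from(raw, offset)
        names.append(fields[1].rstrip(b"\0").decode())
    return names


# A file written with the new layout.
new_file = b"".join(
    NEW_LAYOUT.pack(no, name, stream)
    for no, name, stream in [(1, b"Ram", b"Science"), (2, b"Tapan", b"Arts")]
)

correct = read_names(new_file, NEW_LAYOUT)   # program updated for the new layout
stale = read_names(new_file, OLD_LAYOUT)     # program still using the old layout
```

The stale program runs without an error but misreads the record boundaries, which is why every program holding a copy of the file description has to be found and changed.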
1.2 ARCHITECTURE OF DBMS
The generally accepted method of explaining the architecture of a database system
was formalized by the ANSI/SPARC committee in 1975 and subsequently in more
detail in 1978. Knowledge of this architecture is extremely useful in describing the
general concepts of databases and the structure of individual systems. A major
purpose of a database system is to provide users with an abstract view of the data.
The system hides certain details of how the data is stored and maintained. The
ANSI/SPARC model of a database identifies three distinct levels at which data items
can be described. A brief overview of these three levels is given here.

Internal Level or Physical Level: At the lowest level, the data elements appear as
disk storage. This level contains the specifications for how data are actually stored
in a computer’s secondary memory. The internal view is expressed by the internal
schema, which contains the definition of the stored record, the method of representing
the data fields, and the access aids used. In general, the main aim of this level is to
describe how we intend to implement the logical database design physically. For the
STUDENT database, the internal level contains the following physical description of
the structure for the conceptual record, expressed in a high-level language.

    struct STUDENT {
        int Enrollment_No;
        char Stud_Name[15];
        char F_Name[15];
        char Class[15];
        char Stream[15];
        struct STUDENT *next; /* pointer to the next record in the chain */
    };
The physical structure contains a “pointer”, next. This will be simply the memory
address at which the next record is stored. Thus the set of student records may be
physically linked together to form a chain.
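
The chain of records can be sketched in Python, where an object reference stands in for the memory address held by `next`. This is only an illustration; the `StudentRecord` class and `chain_names` helper are hypothetical, and only two of the five fields are kept for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StudentRecord:
    enrollment_no: int
    stud_name: str
    next: Optional["StudentRecord"] = None  # plays the role of the `next` pointer


def chain_names(head):
    """Follow the `next` links from `head` and collect the student names."""
    names = []
    node = head
    while node is not None:
        names.append(node.stud_name)
        node = node.next
    return names


# Build the chain back to front so each record can point at its successor.
third = StudentRecord(3, "Aryan")
second = StudentRecord(2, "Tapan", third)
head = StudentRecord(1, "Ram", second)
```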

In summary, the internal level is concerned with:

• Allocating storage space for data and indexes.
• Describing the forms that records will take when stored.
• Record placement, that is, assembling records into files.
• Data compression and encryption techniques.
• Interfacing with the OS to place data on the storage devices,
build the indexes, retrieve the data, etc.

Below the internal level is the physical level, which is managed by the OS under the
direction of the DBMS. It deals with the mechanics of physically storing data on a
device such as a disk.

Conceptual Level or Conceptual Schema: The conceptual schema is a detailed
specification of the overall structure of organizational data. It is an enterprise-wide
representation of data, which defines the whole database for a community of users in
terms of relatively small structures, without reference to how data are stored in
secondary memory. It is a complete view of the data requirements of the organization
that is independent of any storage considerations. There is only one conceptual
schema per database, which also contains the method of deriving the objects in the
conceptual view from the objects in the internal view. It describes all the records and
relationships included in the conceptual view and therefore in the database. The
conceptual schema of the STUDENT database is given as:

Enrollment_No   Stud_Name   F_Name   Class   Stream

Conceptual Level

In summary, the conceptual level represents:

– All entities, their attributes, and their relationships.
– The constraints on the data.
– Security and integrity information.

Fig. 1.4 shows a graphical representation of the three levels.

Fig. 1.4 Three level architecture of DBMS

View Level or External Schema: This is a logical description of some portion of the
database that is required by a user to perform some task. This user view is
independent of database technology and typically contains a subset of the associated
conceptual schema, relevant to a particular user or group of users. In this way, we can
have many user views for a given conceptual design. For example, a large
organisation may have finance and stock control departments. Workers in finance
will not usually view stock details, as they are more concerned with the accounting
side of things. Thus, workers in each department will require a different user
interface to the information stored in the database. Two external schemas of
STUDENT are given in the following figure:

Enrollment_No   Stud_Name   Class

External View 1

Stud_Name   F_Name

External View 2

Views may provide different representations of the same data. For example, some
users might view dates in the form (day/month/year) while others prefer
(year/month/day). Some views might include derived or calculated data. For example,
a person’s age might be calculated from his date of birth since storing his age would
require it to be updated each year.
Each external view is described by means of a schema called an external schema or
subschema. The external schema consists of the definition of the logical records and
the relationships in the external view. The external schema also contains the method
of deriving the objects in the external view from the objects in the conceptual view.
The objects include entities, attributes, and relationships.
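
In a relational DBMS, external views of this kind are commonly defined as SQL views over the conceptual schema. The following is a minimal sketch using Python's built-in sqlite3 module; the view names `external_view_1` and `external_view_2` are assumptions chosen to match the two external schemas shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    Enrollment_No INTEGER PRIMARY KEY,
    Stud_Name TEXT, F_Name TEXT, Class TEXT, Stream TEXT
);
INSERT INTO student VALUES
    (1, 'Ram',   'Mr. Shyam',  'XII', 'Science'),
    (2, 'Tapan', 'Mr. Vikas',  'XII', 'Arts'),
    (3, 'Aryan', 'Mr. Devesh', 'X',   'Science');

-- Each view derives its objects from the conceptual-level table.
CREATE VIEW external_view_1 AS
    SELECT Enrollment_No, Stud_Name, Class FROM student;
CREATE VIEW external_view_2 AS
    SELECT Stud_Name, F_Name FROM student;
""")

view1 = conn.execute("SELECT * FROM external_view_1 ORDER BY Enrollment_No").fetchall()
view2 = conn.execute("SELECT * FROM external_view_2 ORDER BY Stud_Name").fetchall()
```

Users of `external_view_2` never see the class or stream of a student, even though those fields exist in the underlying table.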

1.4 DATABASE APPLICATIONS


A database application is an application program or a set of related programs that is
used to perform a series of activities on behalf of database users. These activities
vary from application to application according to need, and may include:

Insert: Add a new entry to the database.
Update: Modify existing data.
Delete: Delete data from the database.
Read: Read data from the database and present it in a useful format on the screen or
printer.

Database applications fall into various categories, from stand-alone applications
(for a single user only) to enterprise applications whose scope is the entire
organization. The size of the database may also vary, ranging from megabytes to
several terabytes depending upon the nature of the application.
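
The four activities listed above can be sketched as a tiny stand-alone application using Python's built-in sqlite3 module. The table layout is a simplified assumption based on the STUDENT example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (Enrollment_No INTEGER PRIMARY KEY, Stud_Name TEXT, Class TEXT)"
)

# Insert: add new entries.
conn.execute("INSERT INTO student VALUES (?, ?, ?)", (1, "Ram", "XI"))
conn.execute("INSERT INTO student VALUES (?, ?, ?)", (2, "Tapan", "XII"))

# Update: modify existing data.
conn.execute("UPDATE student SET Class = ? WHERE Enrollment_No = ?", ("XII", 1))

# Delete: remove data.
conn.execute("DELETE FROM student WHERE Enrollment_No = ?", (2,))
conn.commit()

# Read: fetch the data for display.
rows = conn.execute("SELECT Enrollment_No, Stud_Name, Class FROM student").fetchall()
```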

1.5 DATA INDEPENDENCE


The separation of data descriptions from the application programs that use the data is
called data independence. More simply, the ability to modify a schema definition at
one level without affecting a schema definition at the next higher level is called data
independence. Data independence can be defined at two levels: logical and
physical. A well-designed system maintains data independence both physically and
logically. A brief overview of both levels is given here:

a) Logical Data Independence: Logical data independence refers to the immunity
of external schemas to changes in the conceptual schema. Changes to the
conceptual schema (adding or removing records, columns, or relationships) should
be possible without having to change existing external schemas or rewrite
application programs. The database can change and grow to reflect changes in
reality without requiring user intervention or changes in the application
programs or requests. For example, we can delete a field from our database if no
application program is using that field. Logical data independence is more difficult
to achieve than physical data independence, since programs are heavily dependent
on the logical structure of the database.

b) Physical Data Independence: Physical data independence refers to the immunity
of the conceptual schema to changes in the internal schema. Changes to the internal
schema (using different storage structures or file organisations) should be possible
without having to change the conceptual or external schemas.
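
Both kinds of data independence can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the view `class_list` stands in for an external schema, adding a column stands in for a conceptual-schema change, and creating an index stands in for an internal-schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (Enrollment_No INTEGER PRIMARY KEY, Stud_Name TEXT, Class TEXT);
INSERT INTO student VALUES (1, 'Ram', 'XII'), (2, 'Tapan', 'XII'), (3, 'Aryan', 'X');
CREATE VIEW class_list AS SELECT Stud_Name, Class FROM student;
""")

# The "application program": it only knows about the external view.
QUERY = "SELECT Stud_Name FROM class_list WHERE Class = 'XII' ORDER BY Stud_Name"
before = conn.execute(QUERY).fetchall()

# Conceptual-schema change: add a field. The view and the query are untouched.
conn.execute("ALTER TABLE student ADD COLUMN Stream TEXT")
after_logical = conn.execute(QUERY).fetchall()

# Internal-schema change: add an access aid (an index). Nothing above changes.
conn.execute("CREATE INDEX idx_student_class ON student (Class)")
after_physical = conn.execute(QUERY).fetchall()
```

The same query text returns the same answer before and after both changes, which is the behaviour that data independence promises.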

1.6 DATABASE LANGUAGE


To provide various facilities to different types of users, a DBMS normally
provides one or more specialized programming languages, often called database
languages. Database languages come in different forms. One language is needed to
describe the database to the DBMS, as well as to provide facilities for changing the
database and for defining and changing physical data structures. Another language is
needed for manipulating and retrieving data stored in the DBMS. These languages are
called the Data Definition Language (DDL) and the Data Manipulation Language (DML)
respectively. A brief description of these two languages is given in the following
section.

Data Definition Language: As the name suggests, DDL is used to define the
conceptual schema. It also lets the designer give details of how to implement this
schema on the physical devices used to store the data. This definition includes:
- the name of the schema
- the attributes of the schema along with their data types
- constraint specifications
- relationships among various schemas

When a DDL statement is compiled, a schema (usually a table) is created. The DBMS
maintains the information about all such schemas in a special file called the data
dictionary.

A data dictionary is a file that contains metadata, that is, data about data. It works
like a central repository for the database. For any data manipulation operation, the
DBMS has to consult this data dictionary first. For example, if a constraint is applied
on the STUDENT schema that students can enroll in only two courses, B.A. or M.A.,
then any attempt to register a student who wishes to study a course other than these
two will be checked against the data dictionary and will produce a proper error
message.
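
The B.A./M.A. constraint and the role of stored metadata can be sketched with Python's built-in sqlite3 module. This is an illustration under assumptions: the `Course` column is hypothetical, and SQLite's catalog table `sqlite_master` is used here as one concrete example of where a DBMS keeps schema information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: schema name, attributes with their types, and a constraint.
conn.execute("""
CREATE TABLE student (
    Enrollment_No INTEGER PRIMARY KEY,
    Stud_Name     TEXT NOT NULL,
    Course        TEXT NOT NULL CHECK (Course IN ('B.A.', 'M.A.'))
)
""")

# SQLite records every schema object in its catalog table sqlite_master.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

conn.execute("INSERT INTO student VALUES (1, 'Ram', 'B.A.')")
try:
    # Violates the constraint: only B.A. and M.A. are allowed.
    conn.execute("INSERT INTO student VALUES (2, 'Tapan', 'B.Sc.')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The invalid registration is refused by the DBMS itself, so no application program has to repeat the check.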

There is a special type of DDL, called DSDL (Data Storage and Definition Language),
which is used to define the storage structure (internal schema) and access methods.
The compiled internal schema specifies the implementation details of the internal
database, including the access methods employed. This information is handled by the
DBMS; the user need not be aware of these details.

Data Manipulation Language: DML is a language that is used to access or
manipulate data. We can do the following by using DML:

- Retrieve data from the database. This is called a query: a statement in
the DML that requests the retrieval of data from the database as per the
application requirement.
- Insert new data into the database.
- Modify existing data.
- Delete data from the database.

The DML provides the commands to perform all the activities listed above. These
commands can be used in an interactive mode, so that a result is returned immediately
following the execution of the command, or embedded in a programming language
such as COBOL or C. Embedded DML may provide the programmer with more control
over the timing of report generation. There are two types of DML. The first is
procedural DML, which requires writing procedures or methods that specify what data
is needed and how to get it. A procedural DML has all the features of any other
high-level language, including control constructs and error handling. The second type
is non-procedural DML, which requires the user to specify only what data is needed,
without specifying how to access it.
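
The contrast between the two styles can be sketched in Python with the built-in sqlite3 module. This is only an approximation: the loop below imitates a procedural style in host-language code, while the single SQL statement is a genuinely non-procedural request. The table contents follow the STUDENT example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (Enrollment_No INTEGER PRIMARY KEY, Stud_Name TEXT, Stream TEXT);
INSERT INTO student VALUES (1, 'Ram', 'Science'), (2, 'Tapan', 'Arts'), (3, 'Aryan', 'Science');
""")

# Procedural style: the program spells out HOW to find the data
# (visit every record, test it, collect the matches, sort them).
procedural = []
for enrollment_no, name, stream in conn.execute(
    "SELECT Enrollment_No, Stud_Name, Stream FROM student"
):
    if stream == "Science":
        procedural.append(name)
procedural.sort()

# Non-procedural style: the statement says only WHAT is wanted;
# the DBMS decides how to access it.
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT Stud_Name FROM student WHERE Stream = 'Science' ORDER BY Stud_Name"
    )
]
```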
1.7 DATABASE USERS
A primary goal of a database system is to provide an improved environment for
retrieving information from and storing new information into the database. A
number of different kinds of users interact with the system.

System Analysts and Application Programmers: They determine the requirements
of end users and develop specifications for application programs that meet
the requirements. Application programmers develop the applications by using the host
or data languages and DBMS tools. These applications are used by the end users. Such
applications access the database by issuing the appropriate request, typically an SQL
statement, to the DBMS. These applications may be computer-aided design systems,
knowledge-base or expert systems, applications with complex business rules, or
applications handling graphics or audio data.

End Users: Usually, these are not computer professionals. They access the
database for querying, updating and generating reports. Their main activities
include constantly querying and updating the database, using standard types of queries
and updates called canned transactions that have been carefully programmed and
tested. End users come from a diverse and increasing number of areas. They simply
use applications written by database application programmers, and so require little
technical knowledge about DBMS software. For example, reservation clerks for
airlines, hotels, and car rental companies check availability for given requests and
make reservations if available. Here the user does not know the logic behind the
application.

Database Administrators
The DBMS is at the center of most modern application systems. Technology and
business requirements come together to deliver business solutions with the DBMS as
the central point of convergence. And the DBA is the guardian of the DBMS.

Each database requires at least one database administrator (DBA) to administer it.
Because a database management system can be large and can have many users, this is
often not a one-person job. In such cases, there is a group of DBAs who share
responsibility. The DBA must possess a mixture of technical expertise, political
savvy, leadership and business knowledge to succeed.
A database administrator’s responsibilities can include the following tasks:

• Database Designing: The DBA may be required to design logical application
models, create physical schema structures and maintain user connectivity. For
this, he/she must have a thorough understanding of the business needs and how to
implement them efficiently, and has to interact with the users of the
system to understand what data is to be stored and how it is likely to be used.
The DBA creates appropriate storage structures and methods for accessing the data.
• Granting Permissions for Data Access: The DBA provides adequate security
for the company’s data so that only those individuals who require access have
it. In a practical situation, it may be required that only authorized
users can access the database. All these permissions are granted or revoked by the
DBA.
• Integrity Constraint Specifications: The data values stored in the database must
satisfy certain consistency constraints. Such constraints must be specified
explicitly by the database administrator.
• Backup and Recovery: Backup and recovery is probably the most important
thing in database application development. Backup and recovery must meet
the business needs of the data. The DBA is responsible for recovering the data if
the system fails. He/she is also responsible for taking backups of the data
periodically to limit the loss of data.
• Performance and Tuning: Sometimes, because a database is un-normalized or very
large, the performance of database operations degrades. It is then the DBA’s
responsibility to make the necessary changes to the database and to other
performance-related settings to make the database efficient. The faster a DBA can
find a problem, the sooner he/she can correct it. The best approach is to
have a monitoring solution in place that warns the DBA of impending
problems. Many issues are easy to fix before they become serious. This proactive
approach to problem detection saves a considerable amount of time and
frustration for the DBA, who can plan certain types of corrections and can
virtually eliminate ‘fire fighting’, that is, dealing with a sudden problem that
needs immediate attention.

• Monitoring of Growth: The DBA must also be aware of the growth of the
database. The DBA is expected to provide management with growth
forecasts so that any additional hardware needed can be ordered in a timely
manner.

To accurately build databases, and then manage data quality, integrity and security, a
thorough understanding of the data from a business perspective is mandatory. DBAs
have difficult jobs that require a delicate balance of business and technology;
leadership and understanding. Indeed, the role of the DBA is changing.

1.8 ELEMENTS OF A DBMS


The major components of a DBMS are as follows:

DML Precompiler: Converts DML statements embedded in an application program
to normal procedure calls in a host language. The precompiler interacts with the
query processor.

DDL Compiler: Converts DDL statements to a set of tables containing metadata
stored in a data dictionary.

File Manager: The file manager is responsible for the allocation of space on disk
storage and for the data structures used to represent information stored on physical
media.

Data Manager: The data manager (also called the database manager) is the program or
program unit that provides the interface between the physical level and the
conceptual level. It translates DML statements into low-level file system commands
to interact with the data stored in the database.
Functions of the database manager:
• interaction with the file manager (file system)
    ▪ minimizing file reads and writes, as disk access is slower than main
      memory access
    ▪ translating DML commands to file operations
• integrity enforcement
    ▪ checking that consistency constraints are satisfied
    ▪ taking some action when they aren't
• security enforcement
    ▪ preventing unauthorized access to data
    ▪ example: through a password and security classification system
• backup and recovery
    ▪ detect when information in the database or data dictionary is lost or
      corrupted due to disk crash, power failure, software errors ...
    ▪ restore the database to a previous consistent state
• concurrency control - making sure that concurrent updates don't give
  surprising or inconsistent results.
The database manager for a small system typically does not implement all of these
functions.

Query Processor: To retrieve the desired information from the database, the user has
to write a query in DML, either in interactive mode or in embedded form. The query
processor transforms this query into an equivalent, correct and efficient execution
strategy and sends it to the data manager for execution.

Data Dictionary: The data dictionary is used to store the metadata, that is, the data
about the data. It contains a list of all schemas in the database, the number of records
in each file, the names and types of each field, and the relationships between
different data structures. Most database management systems keep the data dictionary
hidden from users to prevent them from accidentally destroying its contents.
Data dictionaries do not contain any actual data from the database, only bookkeeping
information for managing it. Without a data dictionary, however, a database
management system cannot access data from the database.

1.9 ADVANTAGES OF DBMS


1 Reduction of Redundancies: Unlike the file systems used previously, a DBMS
lets us integrate our data into a single logical structure, thus providing a
centralized system. This system stores each fact in only one place
in the database. As a result, it reduces the total amount of data storage required.
With centralized data, the extra processing needed to trace the required data is
also reduced. Although a DBMS does not eliminate redundancy entirely, it
allows redundancy to be reduced considerably.
2 Improved Data Sharing: A DBMS allows the same data to be shared by any
number of application programs.
3 Data Integrity: Integrity means that the data in the database is accurate.
Centralized control of the data allows the administrator to define
integrity constraints on the data in the database. For example, in a customer
database we can enforce an integrity constraint that only customers from
the cities of New Delhi and Mumbai are accepted.
4 Security: Having complete authority over the operational data enables the
Database Administrator (DBA) to ensure that the only means of access to the
database is through proper channels. The DBA can define authorization
checks to be carried out whenever access to sensitive data is attempted.
Different checks can be established for each type of access (retrieve, modify,
delete, etc.) to every piece of information in the database.
5 Data Consistency: By eliminating (or controlling) data redundancy, we
greatly reduce the opportunities for inconsistency. For example, if a customer
address is stored only once, we cannot have disagreement on the stored values.
Updating data values is also greatly simplified when each value is stored in
one place only. Finally, we avoid the wasted storage space that results from
redundant data storage.
6 Efficient Data Access: In a database system, the data is managed by the
DBMS and all access to the data is through the DBMS providing a key to
effective data processing. This contrasts with conventional data processing
systems where each application program has direct access to the data it reads
or manipulates.
7 Enforcement of Standards: With centralized control of the data, the DBA
can establish and enforce data standards, which may include naming
conventions, data quality standards, etc.
8 Data Independence: In conventional data processing applications,
the programs are usually based on considerable knowledge of the data
structure and format. In such an environment, any change of data structure or
format would require appropriate changes to the application programs. If
major changes were to be made to the data, the application programs may need
to be rewritten.
In a database system, the database management system provides the interface
between the application programs and the data. When changes are made to the
data representation, the metadata maintained by the DBMS is changed but the
DBMS continues to provide data to application programs in the previously
used way. The DBMS handles the task of transformation of data wherever
necessary.

9 Reduced Application Development Time and Program Maintenance:
A DBMS supports many important functions that are common to many
applications accessing data stored in the DBMS, which facilitates the quick
development of applications.

1.10 DISADVANTAGES
A database system generally provides on-line access to the database for many users.
In contrast, a conventional system is often designed to meet a specific need and
therefore generally provides access to only a small number of users. Because of the
larger number of users accessing the data when a database is used, the enterprise may
face additional risks, compared to a conventional data processing system, in the
following areas.
1. Confidentiality, Privacy and Security: When information is centralized and is
made available to users from remote locations, the possibilities of abuse are often
more than in a conventional system. To reduce the chances of unauthorized users
accessing sensitive information, it is necessary to take technical, administrative and,
possibly, legal measures. Most databases store valuable information that must be
protected against deliberate trespass and destruction.
2. Data Quality: Since the database is accessible to users remotely, adequate controls
are needed to control users updating data and to control data quality. With increased
number of users accessing data directly, there are enormous opportunities for users to
damage the data. Unless there are suitable controls, the data quality may be
compromised.
3. Data Integrity: Since a large number of users could be using a database
concurrently, technical safeguards are necessary to ensure that the data remain
correct during operation. The main threat to data integrity comes from several
different users attempting to update the same data at the same time. The database
therefore needs to be protected against inadvertent changes by the users.
4. Enterprise Vulnerability: Centralizing all data of an enterprise in one database
may mean that the database becomes an indispensable resource. The survival of
the enterprise may depend on reliable information being available from its
database. The enterprise therefore becomes vulnerable to the destruction of the
database or to unauthorized modification of the database.
5. The Cost of Using a DBMS: Conventional data processing systems are typically
designed to run a number of well-defined, preplanned processes. Such systems are
often "tuned" to run efficiently for the processes that they were designed for.
Although conventional systems are usually fairly inflexible, in that new
applications may be difficult to implement and/or expensive to run, they are
usually very efficient for the applications they are designed for.
The database approach, on the other hand, provides a flexible alternative where
new applications can be developed relatively inexpensively. This flexibility
is not without its costs, and one of these is the additional cost of running the
applications that the conventional system was designed for. Using standardized
software is almost always less machine efficient than specialized software.

EXERCISE

1. Explain the three layer architecture of DBMS.
2. Discuss the advantages of DBMS over traditional file system.
3. What is data independence? Explain in detail.
4. Differentiate between procedural and non-procedural database languages.
5. Discuss the role of Database Administrator.
