Database Management Systems Notes

CA-201 DBMS - SIRT Dr. ANUPAMA JAIN

Objectives:
• To understand the basic concepts and the applications of database systems
• To master the basics of SQL and construct queries using SQL
• To understand the relational database design principles
• To become familiar with the basic issues of transaction processing and concurrency control
• To become familiar with database storage structures and access techniques

UNIT I:
Database System Applications, Purpose of Database Systems, View of Data – Data
Abstraction –Instances and Schemas – Data Models – the ER Model – Relational Model –
Other Models – Database Languages – DDL – DML – Database Access for Application
Programs – Database Users and Administrator – Transaction Management – Database
Architecture – Storage Manager – the Query Processor.
Introduction to the Relational Model – Structure – Database Schema, Keys – Schema
Diagrams.
Database design and ER diagrams – ER Model - Entities, Attributes and Entity sets –
Relationships and Relationship sets – ER Design Issues – Concept Design – Conceptual
Design with relevant Examples. Relational Query Languages, Relational Operations.

UNIT II:
Relational Algebra – Selection and projection set operations – renaming – Joins – Division –
Examples of Algebra overviews – Relational calculus – Tuple Relational Calculus (TRC) –
Domain relational calculus (DRC).
Overview of the SQL Query Language – Basic Structure of SQL Queries, Set Operations,
Aggregate Functions – GROUP BY – HAVING, Nested Subqueries, Views, Triggers,
Procedures.

UNIT III:
Normalization – Introduction, Non loss decomposition and functional dependencies, First,
Second, and third normal forms – dependency preservation, Boyce/Codd normal form.
Higher Normal Forms - Introduction, Multi-valued dependencies and Fourth normal form,
Join dependencies and Fifth normal form

UNIT IV:
Transaction Concept – Transaction State – Implementation of Atomicity and Durability –
Concurrent Executions – Serializability – Recoverability – Implementation of Isolation –
Testing for Serializability – Lock-Based Protocols – Timestamp-Based Protocols – Validation-Based
Protocols – Multiple Granularity.

UNIT V:
Recovery and Atomicity – Log-Based Recovery – Recovery with Concurrent Transactions
– Checkpoints – Buffer Management – Failure with Loss of Nonvolatile Storage – Advanced
Recovery Systems – ARIES Algorithm, Remote Backup Systems.
File organization – various kinds of indexes - B+ Trees- Query Processing – Relational Query
Optimization.

TEXT BOOKS:
1. Database System Concepts, Silberschatz, Korth, McGraw-Hill, Sixth Edition. (All units
except Unit III)
2. Database Management Systems, Raghu Ramakrishnan, Johannes Gehrke, Tata
McGraw-Hill, 3rd Edition.

REFERENCE BOOKS:
1. Fundamentals of Database Systems, Elmasri, Navathe, Pearson Education.
2. An Introduction to Database Systems, C.J. Date, A. Kannan, S. Swamynathan,
Pearson, Eighth Edition (for Unit III).

Outcomes:
• Demonstrate the basic elements of a relational database management system
• Ability to identify the data models for relevant problems
• Ability to design entity-relationship diagrams, convert them into an RDBMS, and formulate SQL queries on the respective data
• Apply normalization for the development of application software

INDEX

UNIT I
1. INTRODUCTION TO DATABASE MANAGEMENT SYSTEM
2. VIEW OF DATA
3. INSTANCES AND SCHEMAS
4. ENTITY-RELATIONSHIP MODEL
5. DATABASE SCHEMA

UNIT II
1. PRELIMINARIES
2. RELATIONAL ALGEBRA
3. RELATIONAL CALCULUS
4. THE FORM OF A BASIC SQL QUERY
5. INTRODUCTION TO VIEWS
6. TRIGGERS

UNIT III
1. SCHEMA REFINEMENT
2. FUNCTIONAL DEPENDENCIES
3. NORMAL FORMS
4. DECOMPOSITIONS
5. DEPENDENCY-PRESERVING DECOMPOSITION INTO 3NF
6. OTHER KINDS OF DEPENDENCIES

UNIT IV
1. TRANSACTION CONCEPT
2. CONCURRENT EXECUTION
3. TRANSACTION CHARACTERISTICS
4. RECOVERABLE SCHEDULES
5. RECOVERY SYSTEM
6. TIMESTAMP-BASED PROTOCOLS
7. MULTIPLE GRANULARITY

UNIT V
1. FAILURE WITH LOSS OF NON-VOLATILE STORAGE
2. REMOTE BACKUP
3. RECOVERY AND ATOMICITY
4. LOG-BASED RECOVERY
5. RECOVERY WITH CONCURRENT TRANSACTIONS
6. DBMS FILE STRUCTURE

UNIT-1
Introduction to Database Management System
As the name suggests, the database management system consists of two parts. They are:
1. Database and
2. Management System

What is a Database?
To find out what database is, we have to start from data, which is the basic building block of
any DBMS.

Data: Facts, figures, statistics etc. having no particular meaning (e.g., 1, ABC, 19 etc).
Record: Collection of related data items, e.g. in the above example the three data items had no
meaning. But if we organize them in the following way, then they collectively represent meaningful
information.
Roll  Name  Age
1     ABC   19

Table or Relation: Collection of related records.

Roll  Name  Age
1     ABC   19
2     DEF   22
3     XYZ   28

The columns of this relation are called Fields, Attributes or Domains. The rows
are called Tuples or Records.
Database: Collection of related relations. Consider the following collection of tables:
T1

Roll  Name  Age
1     ABC   19
2     DEF   22
3     XYZ   28

T2

Roll  Address
1     KOL
2     DEL
3     MUM

T3

Roll  Year
1     I
2     II
3     I

T4

Year  Hostel
I     H1
II    H2

Age and Hostel attributes are in different tables, but the tables are related through the shared Roll and Year attributes.
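
Because the tables share attributes, data from several of them can be combined. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS (the notes do not prescribe one), that loads T1, T3 and T4 as shown above and looks up the hostel of the youngest student.

```python
import sqlite3

# In-memory database holding three of the four tables from the notes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Roll INTEGER PRIMARY KEY, Name TEXT, Age INTEGER);
    CREATE TABLE T3 (Roll INTEGER PRIMARY KEY, Year TEXT);
    CREATE TABLE T4 (Year TEXT PRIMARY KEY, Hostel TEXT);
    INSERT INTO T1 VALUES (1, 'ABC', 19), (2, 'DEF', 22), (3, 'XYZ', 28);
    INSERT INTO T3 VALUES (1, 'I'), (2, 'II'), (3, 'I');
    INSERT INTO T4 VALUES ('I', 'H1'), ('II', 'H2');
""")

# Roll links T1 to T3, and Year links T3 to T4.
youngest = conn.execute("""
    SELECT T1.Name, T4.Hostel
    FROM T1
    JOIN T3 ON T3.Roll = T1.Roll
    JOIN T4 ON T4.Year = T3.Year
    ORDER BY T1.Age
    LIMIT 1
""").fetchone()
print(youngest)
```

Age lives in T1 and Hostel in T4, yet one query can relate them through the common Roll and Year columns.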

A database in a DBMS could be viewed by lots of different people with different responsibilities.

Figure 1.1: Employees are accessing Data through DBMS

For example, within a company there are different departments, as well as customers, who each need
to see different kinds of data. Each employee in the company will have different levels of access to the
database with their own customized front-end application.

In a database, data is organized strictly in row and column format. The rows are called Tuple or
Record. The data items within one row may belong to different data types. On the other hand, the
columns are often called Domain or Attribute. All the data items within a single attribute are of the
same data type.

What is a Management System?

A database-management system (DBMS) is a collection of interrelated data and a set of programs to
access those data. The collection of data, usually referred to as the database, contains information
relevant to an enterprise. By data, we mean known facts that can be recorded and that have implicit
meaning; a collection of related data with an implicit meaning is therefore a database. The primary
goal of a DBMS is to provide a way to store and retrieve database information that is both convenient
and efficient.

Database systems are designed to manage large bodies of information. Management of data involves
both defining structures for storage of information and providing mechanisms for the manipulation of
information. In addition, the database system must ensure the safety of the information stored, despite
system crashes or attempts at unauthorized access. If data are to be shared among several users, the
system must avoid possible anomalous results.

Database Management System (DBMS) and Its Applications:

A database management system is a computerized record-keeping system. It is a repository or a
container for a collection of computerized data files. The overall purpose of a DBMS is to allow the users
to define, store, retrieve and update the information contained in the database on demand. Information
can be anything that is of significance to an individual or organization.

Databases touch all aspects of our lives. Some of the major areas of application are as follows:
1. Banking
2. Airlines
3. Universities
4. Manufacturing and selling
5. Human resources

• Enterprise Information
◦ Sales: For customer, product, and purchase information.
◦ Accounting: For payments, receipts, account balances, assets and other accounting information.
◦ Human resources: For information about employees, salaries, payroll taxes, and benefits, and for
generation of paychecks.
◦ Manufacturing: For management of the supply chain and for tracking production of items in factories,
inventories of items in warehouses and stores, and orders for items.
◦ Online retailers: For sales data noted above plus online order tracking, generation
of recommendation lists, and maintenance of online product evaluations.

◦ Banking: For customer information, accounts, loans, and banking transactions.
◦ Credit card transactions: For purchases on credit cards and generation of monthly statements.
◦ Finance: For storing information about holdings, sales, and purchases of financial instruments such
as stocks and bonds; also, for storing real-time market data to enable online trading by customers
and automated trading by the firm.
• Universities: For student information, course registrations, and grades (in addition to standard
enterprise information such as human resources and accounting).
• Airlines: For reservations and schedule information. Airlines were among the first to use databases
in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.

Purpose of Database Systems

Database systems arose in response to early methods of computerized management of commercial
data. As an example of such methods, typical of the 1960s, consider part of a university organization
that, among other data, keeps information about all instructors, students, departments, and course
offerings. One way to keep the information on a computer is to store it in operating system files. To
allow users to manipulate the information, the system has a number of application programs that
manipulate the files, including programs to:

• Add new students, instructors, and courses
• Register students for courses and generate class rosters
• Assign grades to students, compute grade point averages (GPA), and generate transcripts

This typical file-processing system is supported by a conventional operating system. The system
stores permanent records in various files, and it needs different application programs to extract records
from, and add records to, the appropriate files. Before database management systems (DBMSs) were
introduced, organizations usually stored information in such systems. Keeping organizational
information in a file-processing system has a number of major disadvantages:

Data redundancy and inconsistency. Since different programmers create the files and application
programs over a long period, the various files are likely to have different structures and the programs
may be written in several programming languages. Moreover, the same information may be duplicated
in several places (files). For example, if a student has a double major (say, music and mathematics) the
address and telephone number of that student may appear in a file that consists of student records of
students in the Music department and in a file that consists of student records of students in the
Mathematics department. This redundancy leads to higher storage and access cost. In addition, it may
lead to data inconsistency; that is, the various copies of the same data may no longer agree. For
example, a changed student address may be reflected in the Music department records but not
elsewhere in the system.

Difficulty in accessing data. Suppose that one of the university clerks needs to find out the names of
all students who live within a particular postal-code area. The clerk asks the data-processing
department to generate such a list. Because the designers of the original system did not anticipate this
request, there is no application program on hand to meet it. There is, however, an application program
to generate the list of all students.

Data isolation. Because data are scattered in various files, and files may be in different formats,
writing new application programs to retrieve the appropriate data is difficult.

Integrity problems. The data values stored in the database must satisfy certain types of consistency
constraints. Suppose the university maintains an account for each department, and records the balance
amount in each account. Suppose also that the university requires that the account balance of a
department may never fall below zero. Developers enforce these constraints in the system by adding
appropriate code in the various application programs. However, when new constraints are added, it is
difficult to change the programs to enforce them. The problem is compounded when constraints
involve several data items from different files.
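
By contrast, a DBMS lets such a rule be declared once, next to the data, instead of being repeated in every application program. The following is a hedged sketch, assuming SQLite through Python's sqlite3 module; the table layout and the starting balance of 300 are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint "balance may never fall below zero" is declared with the table.
conn.execute("""
    CREATE TABLE department (
        dept_name TEXT PRIMARY KEY,
        balance   NUMERIC NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO department VALUES ('Music', 300)")  # illustrative value

try:
    # Any program attempting an overdraft is rejected by the database itself.
    conn.execute(
        "UPDATE department SET balance = balance - 500 WHERE dept_name = 'Music'"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute("SELECT balance FROM department").fetchone()[0]
print(rejected, balance)
```

Adding a new constraint then means changing the table definition rather than hunting through application code.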

Atomicity problems. A computer system, like any other device, is subject to failure. In many
applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that existed
prior to the failure.

Consider a program to transfer $500 from the account balance of department A to the account balance
of department B. If a system failure occurs during the execution of the program, it is possible that the
$500 was removed from the balance of department A but was not credited to the balance of department
B, resulting in an inconsistent database state. Clearly, it is essential to database consistency that either
both the credit and debit occur, or that neither occur.

That is, the funds transfer must be atomic—it must happen in its entirety or not at all. It is difficult to
ensure atomicity in a conventional file-processing system.
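
A database system addresses this with transactions. The sketch below is one possible illustration, assuming SQLite via Python's sqlite3 module; the `transfer` helper, the `fail_midway` flag and the starting balances are hypothetical and exist only to simulate a failure between the debit and the credit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO department VALUES ('A', 1000), ('B', 200)")  # assumed balances
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst; both updates take effect or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE department SET balance = balance - ? WHERE dept_name = ?",
                (amount, src),
            )
            if fail_midway:
                raise RuntimeError("simulated system failure")
            conn.execute(
                "UPDATE department SET balance = balance + ? WHERE dept_name = ?",
                (amount, dst),
            )
    except RuntimeError:
        pass  # the debit above has already been rolled back

transfer(conn, "A", "B", 500, fail_midway=True)
balances = dict(conn.execute("SELECT dept_name, balance FROM department"))
print(balances)
```

After the simulated failure, department A still holds its original balance, so the database never shows the debit without the matching credit.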

Concurrent-access anomalies. For the sake of overall performance of the system and faster response,
many systems allow multiple users to update the data simultaneously. Indeed, today, the largest
Internet retailers may have millions of accesses per day to their data by shoppers. In such an
environment, interaction of concurrent updates is possible and may result in inconsistent data.
Consider department A, with an account balance of $10,000. If two department clerks debit the
account balance (by say $500 and $100, respectively) of department A at almost exactly the same
time, the result of the concurrent executions may leave the budget in an incorrect (or inconsistent)
state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce
that value by the amount being withdrawn, and write the result back. If the two programs run
concurrently, they may both read the value $10,000, and write back $9500 and $9900, respectively.
Depending on which one writes the value last, the account balance of department A may contain either $9500 or
$9900, rather than the correct value of $9400. To guard against this possibility, the system must
maintain some form of supervision. But supervision is difficult to provide because data may be
accessed by many different application programs that have not been coordinated previously.
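
The arithmetic of this lost update can be traced with a small simulation. This is only a sketch in plain Python; the function names are hypothetical and no real concurrency is involved, the interleaving is written out by hand.

```python
def interleaved_debits(balance, debit_1, debit_2, last_writer):
    """Both clerks read the same old balance before either one writes back."""
    read_1 = balance            # clerk 1 reads the old balance
    read_2 = balance            # clerk 2 reads the same old balance
    write_1 = read_1 - debit_1  # clerk 1 computes its new balance
    write_2 = read_2 - debit_2  # clerk 2 computes its new balance
    # Whichever write happens last overwrites the other one.
    return write_1 if last_writer == 1 else write_2

def serial_debits(balance, debit_1, debit_2):
    """The debits run one after the other, so the second sees the first."""
    balance = balance - debit_1
    balance = balance - debit_2
    return balance

print(interleaved_debits(10000, 500, 100, last_writer=1))
print(interleaved_debits(10000, 500, 100, last_writer=2))
print(serial_debits(10000, 500, 100))
```

In both interleaved cases one debit disappears; only the serial execution applies both.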

Security problems. Not every user of the database system should be able to access all the data. For
example, in a university, payroll personnel need to see only that part of the database that has financial
information. They do not need access to information about academic records. But, since application
programs are added to the file-processing system in an ad hoc manner, enforcing such security
constraints is difficult.

These difficulties, among others, prompted the development of database systems. In what follows, we
shall see the concepts and algorithms that enable database systems to solve the problems with file-
processing systems.

Advantages of DBMS:

Controlling of Redundancy: Data redundancy refers to the duplication of data (i.e., storing the same data
multiple times). In a database system, a centralized database and centralized control of data
by the DBA avoid unnecessary duplication of data. This also reduces the extra time needed to
process duplicated data and saves storage space.

Improved Data Sharing: DBMS allows a user to share the data in any number of application programs.

Data Integrity: Integrity means that the data in the database is accurate. Centralized control of the
data lets the administrator define integrity constraints on the data in the database. For
example, in a customer database we can enforce an integrity constraint that only customers
from the cities of Noida and Meerut are accepted.

Security: Having complete authority over the operational data enables the DBA to ensure that the
only means of access to the database is through proper channels. The DBA can define authorization
checks to be carried out whenever access to sensitive data is attempted.
Data Consistency: By eliminating data redundancy, we greatly reduce the opportunities for
inconsistency. For example, if a customer address is stored only once, we cannot have disagreement
on the stored values. Also, updating data values is greatly simplified when each value is stored in one
place only. Finally, we avoid the wasted storage that results from redundant data storage.

Efficient Data Access: In a database system, the data is managed by the DBMS and all access to the
data is through the DBMS, which is key to effective data processing.

Enforcement of Standards: With centralized control of data, the DBA can establish and enforce data
standards, which may include naming conventions, data quality standards, etc.

Data Independence: In a database system, the DBMS provides the interface
between the application programs and the data. When changes are made to the data representation, the
metadata maintained by the DBMS is changed, but the DBMS continues to provide the data to
application programs in the previously used way. The DBMS handles the task of transforming the data
wherever necessary.

Reduced Application Development and Maintenance Time: A DBMS supports many important
functions that are common to applications accessing data stored in the DBMS, which facilitates
quick application development.

Disadvantages of DBMS
1) It is somewhat complex. Since it supports many functions, the underlying software has become
complex. Designers and developers need thorough knowledge of the software to get the most out of it.
2) Because of its complexity and functionality, it uses a large amount of memory, and it needs that
memory to run efficiently.
3) A DBMS typically works as a centralized system, i.e., all users access the same database.
Hence any failure of the DBMS will impact all the users.
4) A DBMS is generalized software, i.e., it is written to work across entire systems rather than for
one specific application. Hence some applications will run slowly.

View of Data

A database system is a collection of interrelated data and a set of programs that allow users to access
and modify these data. A major purpose of a database system is to provide users with an abstract view
of the data. That is, the system hides certain details of how the data are stored and maintained.
Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has led designers
to use complex data structures to represent data in the database. Since many database-system users
are not computer trained, developers hide the complexity from users through several levels of
abstraction, to simplify users’ interactions with the system:

Figure 1.2: Levels of Abstraction in a DBMS

• Physical level (or Internal View / Schema): The lowest level of abstraction describes how the data
are actually stored. The physical level describes complex low-level data structures in detail.

• Logical level (or Conceptual View / Schema): The next-higher level of abstraction describes what
data are stored in the database, and what relationships exist among those data. The logical level thus
describes the entire database in terms of a small number of relatively simple structures. Although
implementation of the simple structures at the logical level may involve complex physical-level
structures, the user of the logical level does not need to be aware of this complexity. This is referred
to as physical data independence.
• View level (or External View / Schema): The highest level of abstraction describes only part of the
entire database. Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database. Many users of the database system do not need
all this information; instead, they need to access only a part of the database. The view level of
abstraction exists to simplify their interaction with the system. The system may provide many views
for the same database.
For example, we may describe a record as follows:
type instructor = record
    ID: char (5);
    name: char (20);
    dept_name: char (20);
    salary: numeric (8,2);
end;

This code defines a new record type called instructor with four fields. Each field has a name
and a type associated with it. A university organization may have several such record types,
including

• department, with fields dept_name, building, and budget
• course, with fields course_id, title, dept_name, and credits
• student, with fields ID, name, dept_name, and tot_cred

At the physical level, an instructor, department, or student record can be described as a block of
consecutive storage locations.

At the logical level, each such record is described by a type definition, as in the previous code
segment, and the interrelationship of these record types is defined as well.

Finally, at the view level, computer users see a set of application programs that hide details of the
data types. At the view level, several views of the database are defined, and a database user sees
some or all of these views.
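
The logical and view levels described above can be sketched as follows, assuming SQLite via Python's sqlite3 module. The instructor table mirrors the record definition given earlier; the view name `instructor_public` and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: what data are stored (mirrors the instructor record).
    CREATE TABLE instructor (
        ID        CHAR(5) PRIMARY KEY,
        name      CHAR(20),
        dept_name CHAR(20),
        salary    NUMERIC(8,2)
    );
    INSERT INTO instructor VALUES ('00001', 'A. Teacher', 'Music', 50000);

    -- View level: one possible view that hides the salary field.
    CREATE VIEW instructor_public AS
        SELECT ID, name, dept_name FROM instructor;
""")

cursor = conn.execute("SELECT * FROM instructor_public")
view_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(view_columns, rows)
```

A user who is given only the view never sees the salary column, and neither level says anything about how the rows are laid out on disk (the physical level).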

Instances and Schemas

Databases change over time as information is inserted and deleted. The collection of information
stored in the database at a particular moment is called an instance of the database. The overall design
of the database is called the database schema. Schemas are changed infrequently, if at all. The
concept of database schemas and instances can be understood by analogy to a program written in a
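programming language. A database schema corresponds to the variable declarations (along with
associated type definitions) in a program.

Each variable has a particular value at a given instant. The values of the variables in a program at a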
point in time correspond to an instance of a database schema. Database systems have several
schemas, partitioned according to the levels of abstraction. The physical schema describes the
database design at the physical level, while the logical schema describes the database design at the
logical level. A database may also have several schemas at the view level, sometimes called
subschemas, which describe different views of the database. Of these, the logical schema is by far
the most important, in terms of its effect on application programs, since programmers construct
applications by using the logical schema. Application programs are said to exhibit physical data
independence if they do not depend on the physical schema, and thus need not be rewritten if the
physical schema changes.
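
The difference between a schema and an instance can be observed directly. A minimal sketch, assuming SQLite via Python's sqlite3 module: the stored table definition plays the role of the schema and the current rows play the role of the instance. The helper names and the sample student row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (ID TEXT PRIMARY KEY, name TEXT, dept_name TEXT, tot_cred INTEGER)"
)

def schema(conn):
    # SQLite keeps the table definition in its sqlite_master catalog table.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'student'"
    ).fetchone()[0]

def instance(conn):
    # The instance is whatever rows are stored at this moment.
    return conn.execute("SELECT * FROM student ORDER BY ID").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES ('S1', 'ABC', 'Music', 12)")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the schema did not change
print(instance_before, instance_after)  # the instance did
```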
Data Models

Underlying the structure of a database is the data model: a collection of conceptual tools for
describing data, data relationships, data semantics, and consistency constraints.

The data models can be classified into four different categories:

• Relational Model. The relational model uses a collection of tables to represent both data and the
relationships among those data. Each table has multiple columns, and each column has a unique
name. Tables are also known as relations. The relational model is an example of a record-based
model.

• Entity-Relationship Model. The entity-relationship (E-R) data model uses a collection of basic
objects, called entities, and relationships among these objects.

Suppose that each department has offices in several locations and we want to record the locations at
which each employee works. The ER diagram for this variant of Works In, which we call Works In2,
is an example of a ternary relationship.

Figure: Example - ternary

Figure: E-R Model (Railway Booking System)

Figure: E-R Model (Banking Transaction System)

• Object-Based Data Model. Object-oriented programming (especially in Java, C++, or C#) has
become the dominant software-development methodology. This led to the development of an object-
oriented data model that can be seen as extending the E-R model with notions of encapsulation,
methods (functions), and object identity.

• Semi-structured Data Model. The semi-structured data model permits the specification of data
where individual data items of the same type may have different sets of attributes. This is in contrast
to the data models mentioned earlier, where every data item of a particular type must have the same
set of attributes. The Extensible Markup Language (XML) is widely used to represent semi-
structured data.
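
As an illustration of semi-structured data, the sketch below parses a small XML fragment with Python's standard xml.etree.ElementTree module. The element names and values are made up for this example; the point is only that two items of the same type carry different sets of attributes.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different sets of attributes:
# the first has an age, the second has a hostel instead.
doc = """
<students>
  <student><roll>1</roll><name>ABC</name><age>19</age></student>
  <student><roll>2</roll><name>DEF</name><hostel>H2</hostel></student>
</students>
"""

root = ET.fromstring(doc)
attribute_sets = [sorted(child.tag for child in student) for student in root]
print(attribute_sets)
```

In a relational table both students would need the same columns, with missing values left empty.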

Historically, the network data model and the hierarchical data model preceded the relational data
model. These models were tied closely to the underlying implementation, and complicated the task of
modeling data. As a result, they are used little now, except in old database code that is still in
service in some places.

Database Languages
A database system provides a data-definition language to specify the database
schema and a data-manipulation language to express database queries and updates. In practice,
the data-definition and data-manipulation languages are not two separate languages; instead they
simply form parts of a single database language, such as the widely used SQL language.

Data-Manipulation Language

A data-manipulation language (DML) is a language that enables users to access or manipulate data
as organized by the appropriate data model. The types of access are:
• Retrieval of information stored in the database
• Insertion of new information into the database
• Deletion of information from the database
• Modification of information stored in the database

There are basically two types:

• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also referred to as nonprocedural DMLs) require a user to specify what data
are needed without specifying how to get those data.

A query is a statement requesting the retrieval of information. The portion of a DML that involves
information retrieval is called a query language. Although technically incorrect, it is common practice
to use the terms query language and data-manipulation language synonymously.
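
The contrast between the two styles can be sketched as follows. This is an illustration under assumptions: a hand-written Python loop stands in for a procedural DML, and an SQL query run through Python's sqlite3 module stands in for a declarative one. The student rows reuse the Roll/Name/Age example from earlier in these notes.

```python
import sqlite3

rows = [(1, "ABC", 19), (2, "DEF", 22), (3, "XYZ", 28)]

# Procedural style: we spell out HOW to find the data (scan, test, collect).
procedural = []
for roll, name, age in rows:
    if age > 20:
        procedural.append(name)

# Declarative style: we state WHAT we want; the system decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (Roll INTEGER, Name TEXT, Age INTEGER)")
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
declarative = [
    r[0]
    for r in conn.execute("SELECT Name FROM student WHERE Age > 20 ORDER BY Roll")
]

print(procedural, declarative)
```

Both produce the same names, but only the loop commits to a particular access strategy.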

Data-Definition Language (DDL)


We specify a database schema by a set of definitions expressed by a special language called a data-
definition language (DDL). The DDL is also used to specify additional properties of the data.

• Domain Constraints. A domain of possible values must be associated with every attribute (for
example, integer types, character types, date/time types). Declaring an attribute to be of a particular
domain acts as a constraint on the values that it can take. Domain constraints are the most elementary
form of integrity constraint. They are tested easily by the system whenever a new data item is entered
into the database.

• Referential Integrity. There are cases where we wish to ensure that a value that appears in one
relation for a given set of attributes also appears in a certain set of attributes in another relation
(referential integrity). For example, the department listed for each course must be one that actually
exists. More precisely, the dept_name value in a course record must appear in the dept_name attribute
of some record of the department relation.
• Assertions. An assertion is any condition that the database must always satisfy. Domain constraints
and referential-integrity constraints are special forms of assertions. However, there are many
constraints that we cannot express by using only these special forms. For example, “Every
department must have at least five courses offered every semester” must be expressed as an
assertion.

• Authorization. We may want to differentiate among the users as far as the type of access they are
permitted on various data values in the database. These differentiations are expressed in terms of
authorization, the most common being: read authorization, which allows reading, but not
modification, of data; insert authorization, which allows insertion of new data, but not modification
of existing data; update authorization, which allows modification, but not deletion, of data; and
delete authorization, which allows deletion of data. We may assign the user all, none, or a
combination of these types of authorization.
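
The referential-integrity example above (every course must name an existing department) can be sketched in DDL. This assumes SQLite via Python's sqlite3 module, where foreign-key enforcement has to be switched on per connection; the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT, budget NUMERIC);
    CREATE TABLE course (
        course_id TEXT PRIMARY KEY,
        title     TEXT,
        dept_name TEXT REFERENCES department(dept_name),  -- referential integrity
        credits   INTEGER
    );
    INSERT INTO department VALUES ('Music', 'B1', 1000);
    INSERT INTO course VALUES ('MU-101', 'Harmony', 'Music', 3);
""")

try:
    # 'Physics' does not appear in department, so this row must be rejected.
    conn.execute("INSERT INTO course VALUES ('PH-101', 'Mechanics', 'Physics', 4)")
    violation = False
except sqlite3.IntegrityError:
    violation = True

course_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
print(violation, course_count)
```

The column types in the same statements act as simple domain declarations. Assertions and authorization are not shown, since SQLite does not provide general assertions or GRANT-style privileges.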

The DDL, just like any other programming language, gets as input some instructions (statements) and
generates some output. The output of the DDL is placed in the data dictionary, which contains
metadata—that is, data about data.
Data Dictionary

We can define a data dictionary as a DBMS component that stores the definition of data
characteristics and relationships. You may recall that such “data about data” were labeled metadata.
The DBMS data dictionary provides the DBMS with its self-describing characteristic. In effect, the
data dictionary resembles an X-ray of the company’s entire data set, and is a crucial element in the
data administration function.
For example, the data dictionary typically stores descriptions of all:
• Data elements that are defined in all tables of all databases. Specifically, the data dictionary stores
the name, data type, display format, internal storage format, and validation rules of each element. It
also records where an element is used, by whom it is used, and so on.
• Tables defined in all databases. For example, the data dictionary is likely to store the name of the
table creator, the date of creation, access authorizations, the number of columns, and so on.
• Indexes defined for each database table. For each index the DBMS stores at least the index name,
the attributes used, the location, specific index characteristics, and the creation date.
• Defined databases: who created each database, the date of creation, where the database is located,
who the DBA is, and so on.
• End users and administrators of the database.
• Programs that access the database, including screen formats, report formats, application formats,
SQL queries, and so on.
• Access authorizations for all users of all databases.
• Relationships among data elements: which elements are involved, whether the relationships are
mandatory or optional, the connectivity and cardinality, and so on.
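To see that DDL output really lands in a catalog, one can query the catalog of a small embedded DBMS. The sketch below uses Python's built-in sqlite3 module; the sqlite_master table and PRAGMA table_info are specific to SQLite (other systems expose their data dictionary differently), and the table and index names are chosen only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements: their definitions are recorded in the catalog.
conn.execute(
    "CREATE TABLE instructor ("
    "ID TEXT PRIMARY KEY, name TEXT NOT NULL, dept_name TEXT, salary REAL)"
)
conn.execute("CREATE INDEX idx_instructor_dept ON instructor (dept_name)")

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns column-level metadata; fields 1 and 2 are the
# column name and its declared type.
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(instructor)")]
```

Here catalog lists the table and its indexes, and columns holds the name and declared type of every column: metadata that was never inserted by the user but was generated from the DDL.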
Database Administrators and Database Users
A primary goal of a database system is to retrieve information from and store new information in the
database.
Database Users and User Interfaces

There are four different types of database-system users, differentiated by the way they expect to
interact with the system. Different types of user interfaces have been designed for the different types
of users.
Naive users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously. For example, a bank teller who needs to
transfer $50 from account A to account B invokes a program called transfer.
Application programmers are computer professionals who write application programs. Application
programmers can choose from many tools to develop user interfaces. Rapid application
development (RAD) tools are tools that enable an application programmer to construct forms and
reports without writing a program.
Sophisticated users interact with the system without writing programs. Instead, they form their
requests in a database query language. They submit each such query to a query processor, whose
function is to break down DML statements into instructions that the storage manager understands.
Analysts who submit queries to explore data in the database fall in this category.
Online analytical processing (OLAP) tools simplify analysts’ tasks by letting them view summaries
of data in different ways. For instance, an analyst can see total sales by region (for example, North,
South, East, and West), or by product, or by a combination of region and product (that is, total sales
of each product in each region).
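The three summaries mentioned above (by region, by product, and by region and product together) are all groupings of the same data. A minimal sketch with Python's sqlite3 module follows; the sales table, its columns, and its rows are made up for illustration, and real OLAP tools offer far more than a GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "pen", 10), ("North", "ink", 5),
     ("South", "pen", 7), ("South", "pen", 3)],
)


def totals(*dims):
    """Total sales grouped by the given columns (trusted column names only)."""
    cols = ", ".join(dims)
    query = (f"SELECT {cols}, SUM(amount) FROM sales "
             f"GROUP BY {cols} ORDER BY {cols}")
    return conn.execute(query).fetchall()


by_region = totals("region")             # one row per region
by_product = totals("product")           # one row per product
by_both = totals("region", "product")    # one row per (region, product) pair
```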
Database Architecture:
The architecture of a database system is greatly influenced by the underlying computer system on
which the database system runs. Database systems can be centralized, or client-server, where one
server machine executes work on behalf of multiple client machines. Database systems can also be
designed to exploit parallel computer architectures. Distributed databases span multiple
geographically separated machines.

Figure 1.3: Database System Architecture

A database system is partitioned into modules that deal with each of the responsibilities of the overall
system. The functional components of a database system can be broadly divided into the storage
manager and the query processor components. The storage manager is important because databases
typically require a large amount of storage space. The query processor is important because it helps
the database system simplify and facilitate access to data.

Figure 1.4: Two-tier and three-tier architectures.

Query Processor:
The query processor components include:
· DDL interpreter, which interprets DDL statements and records the definitions in the data
dictionary.
· DML compiler, which translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands.
A query can usually be translated into any of a number of alternative evaluation plans that all give
the same result. The DML compiler also performs query optimization; that is, it picks the
lowest-cost evaluation plan from among the alternatives.
· Query evaluation engine, which executes low-level instructions generated by the DML compiler.
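The choice among alternative evaluation plans can be observed in a real system. The sketch below asks SQLite (through Python's sqlite3 module) for the plan it picks for the same query before and after an index exists. EXPLAIN QUERY PLAN is a SQLite feature whose exact wording varies between versions, and the table and index names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary REAL)"
)


def plan(sql):
    """Return the steps SQLite chose for `sql` as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last field is the description


query = "SELECT name FROM instructor WHERE dept_name = 'Physics'"

# Only one strategy is available: read the whole table.
plan_without_index = plan(query)

# With an index there are two alternatives; the optimizer prefers the index.
conn.execute("CREATE INDEX idx_dept ON instructor (dept_name)")
plan_with_index = plan(query)
```

Both plans return the same rows; only the cost of producing them differs, which is why the selection is left to the DML compiler rather than to the user.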

Storage Manager:

A storage manager is a program module that provides the interface between the low-level data stored in
the database and the application programs and queries submitted to the system. The storage manager is
responsible for the interaction with the file manager.
The storage manager components include:

· Authorization and integrity manager, which tests for the satisfaction of integrity constraints
and checks the authority of users to access data.
· Transaction manager, which ensures that the database remains in a consistent (correct) state
despite system failures, and that concurrent transaction executions proceed without conflicting.
· File manager, which manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.
· Buffer manager, which is responsible for fetching data from disk storage into main memory,
and deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database to handle data sizes that are much larger than the size
of main memory.
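The buffer manager's job of deciding what to keep in main memory can be sketched as a small cache in front of the disk. The class below is a toy model only: the text does not name a replacement policy, so least-recently-used (LRU) eviction is assumed here, and the dictionary standing in for the disk is hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool with LRU eviction (the policy is an assumption)."""

    def __init__(self, disk, capacity):
        self.disk = disk              # block_id -> contents, stands in for disk
        self.capacity = capacity      # number of blocks that fit in memory
        self.pool = OrderedDict()     # cached blocks, least recently used first
        self.disk_reads = 0

    def get_block(self, block_id):
        if block_id in self.pool:             # hit: no disk access needed
            self.pool.move_to_end(block_id)   # mark as most recently used
            return self.pool[block_id]
        if len(self.pool) >= self.capacity:   # pool full: evict the LRU block
            self.pool.popitem(last=False)
        self.disk_reads += 1                  # miss: fetch the block from disk
        self.pool[block_id] = self.disk[block_id]
        return self.pool[block_id]
```

With a capacity of two blocks and a disk of five, the pool still serves every request; it simply reads from disk whenever the requested block is not currently cached, which is how a database larger than main memory remains usable.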

Transaction Manager:

A transaction is a collection of operations that performs a single logical function in a database
application. Each transaction is a unit of both atomicity and consistency. Thus, we require that
transactions do not violate any database-consistency constraints.
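Atomicity and consistency can be illustrated with the funds transfer used earlier in these notes. The sketch below uses Python's sqlite3 module; the account table, the starting balances, and the rule that a balance may not become negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()


def transfer(src, dst, amount):
    """Move `amount` from src to dst as one transaction; True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:  # the CHECK constraint was violated
        return False


def balances():
    return dict(conn.execute("SELECT name, balance FROM account"))
```

A transfer of 50 from A to B succeeds and changes both rows. A transfer of 500 fails on the debit, and the credit that was already applied to B is undone as well, so the database never shows a half-finished transfer.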

Conceptual Database Design - Entity Relationship (ER) Modeling:

Database Design Techniques


1. ER Modeling (Top-down approach)
2. Normalization (Bottom-up approach)

What is ER Modeling?
ER modeling is a graphical technique for understanding and organizing data independently of the actual
database implementation.
We need to be familiar with the following terms to go further.
Entity
Anything that has an independent existence and about which we collect data. It is also known as entity
type.
In ER modeling, an entity is drawn as a rectangle.

Entity instance
An entity instance is a particular member of the entity type.
Example for entity instance: a particular employee.
Regular Entity
An entity which has its own key attribute is a regular entity.
Example for regular entity: Employee.
Weak entity
An entity which depends on another entity for its existence and doesn't have any key attribute of its own is
a weak entity.

Example for a weak entity: In a parent/child relationship, a parent is considered as a strong entity
and the child is a weak entity.
In ER modeling, a weak entity is drawn as a double-bordered rectangle.

Attributes
Properties/characteristics which describe entities are called attributes.
In ER modeling, an attribute is drawn as an ellipse (oval) connected to its entity.

Domain of Attributes
The set of possible values that an attribute can take is called the domain of the attribute. For
example, the attribute day may take any value from the set {Monday, Tuesday, ..., Friday}. Hence this
set can be termed the domain of the attribute day.
Key attribute
The attribute (or combination of attributes) which is unique for every entity instance is called key
attribute.
E.g. the employee_id of an employee, the pan_card_number of a person, etc. If the key attribute
consists of two or more attributes in combination, it is called a composite key.
In ER modeling, a key attribute is drawn as an ellipse with the attribute name underlined.

Simple attribute
If an attribute cannot be divided into simpler components, it is a simple attribute.
Example for simple attribute: employee_id of an employee.
Composite attribute
If an attribute can be split into components, it is called a composite attribute.
Example for composite attribute: name of the employee, which can be split into First_name,
Middle_name, and Last_name.
Single-valued Attributes

If an attribute can take only a single value for each entity instance, it is a single-valued attribute.
Example for single-valued attribute: age of a student. It can take only one value for a particular student.

Multi-valued Attributes

If an attribute can take more than one value for each entity instance, it is a multi-valued attribute.
Example for multi-valued attribute: telephone number of an employee; a particular employee may
have multiple telephone numbers.
In ER modeling, a multi-valued attribute is drawn as a double-bordered ellipse.

Stored Attribute
An attribute which needs to be stored permanently is a stored attribute.
Example for stored attribute: name of a student.
Derived Attribute
An attribute which can be calculated or derived based on other attributes is a derived attribute.
Example for derived attribute: age of employee which can be calculated from date of birth and current
date.
In ER modeling, a derived attribute is drawn as a dashed ellipse.
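The attribute kinds described above can be seen side by side in a small data structure. This is only an illustrative sketch in Python: the Employee and Name classes and the age_on method are invented for the example and are not part of any ER tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:
    """Composite attribute: made up of simpler components."""
    first_name: str
    middle_name: str
    last_name: str


@dataclass
class Employee:
    employee_id: int          # simple, single-valued key attribute
    name: Name                # composite attribute
    date_of_birth: date       # stored attribute
    phone_numbers: list = field(default_factory=list)  # multi-valued attribute

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_pending = (today.month, today.day) < (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - int(birthday_pending)
```

Because age is derived from the date of birth and the current date, it is passed a date and recomputed on demand instead of being kept as a field that would go stale.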

Relationships
Associations between entities are called relationships.
Example: An employee works for an organization. Here "works for" is a relationship between the
entities employee and organization.
In ER modeling, a relationship is drawn as a diamond connected to the participating entities.

However, to connect a weak entity with other entities in ER modeling, a weak (identifying)
relationship is used, drawn as a double-bordered diamond.

Degree of a Relationship
Degree of a relationship is the number of entity types involved. The n-ary relationship is the
general form for degree n. Special cases are unary, binary, and ternary, where the degree is 1, 2,
and 3, respectively.
Example for unary relationship: An employee is a manager of another employee.
Example for binary relationship: An employee works for a department.
Example for ternary relationship: A customer purchases an item from a shopkeeper.
Cardinality of a Relationship
Relationship cardinalities specify how many instances of each entity type are allowed to take part in a
relationship. Relationships can have four possible connectivities as given below.
1. One to one (1:1) relationship
2. One to many (1:N) relationship
3. Many to one (M:1) relationship
4. Many to many (M:N) relationship
The minimum and maximum values of this connectivity are called the cardinality of the relationship.

Example for Cardinality – One-to-One (1:1)


Employee is assigned a parking space.

One employee is assigned only one parking space and one parking space is assigned to
only one employee. Hence it is a 1:1 relationship and the cardinality is One-to-One (1:1).

In an ER diagram, this is shown by writing 1 on each side of the relationship.
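Once such a relationship is implemented as tables, the one-to-one rule can be enforced by a uniqueness constraint. The sketch below uses Python's sqlite3 module; the table names, column names, and sample rows are assumptions made for the example, and it shows only one of several possible table designs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, name TEXT)"
)
# The primary key allows each space only once; UNIQUE on employee_id allows
# each employee only once. Together they give a 1:1 relationship.
conn.execute("""
    CREATE TABLE parking_space (
        space_id    INTEGER PRIMARY KEY,
        employee_id INTEGER UNIQUE REFERENCES employee (employee_id)
    )""")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])
conn.execute("INSERT INTO parking_space VALUES (10, 1)")


def assign(space_id, employee_id):
    """Try to assign a space to an employee; False if a 1:1 rule is broken."""
    try:
        conn.execute("INSERT INTO parking_space VALUES (?, ?)",
                     (space_id, employee_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Giving employee 1 a second space, or giving space 10 to a second employee, is rejected; assigning the free space 11 to employee 2 is accepted.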

Example for Cardinality – One-to-Many (1:N)
Organization has employees

One organization can have many employees, but one employee works in only one organization.
Hence it is a 1:N relationship and the cardinality is One-to-Many (1:N).
In an ER diagram, this is shown by writing 1 on the organization side and N on the employee side.

Example for Cardinality – Many-to-One (M :1)


It is the reverse of the One-to-Many relationship: employee works in organization.

One employee works in only one organization, but one organization can have many employees.
Hence it is an M:1 relationship and the cardinality is Many-to-One (M:1).

In an ER diagram, this is shown by writing M on the employee side and 1 on the organization side.

Example for Cardinality – Many-to-Many (M:N)
Students enroll for courses

One student can enroll for many courses and many students can enroll for one course. Hence
it is an M:N relationship and the cardinality is Many-to-Many (M:N).
In an ER diagram, this is shown by writing M on the student side and N on the course side.
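When a many-to-many relationship is turned into tables, it is commonly represented by a separate table holding one row per (student, course) pair. The sketch below uses Python's sqlite3 module; all table names, course codes, and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id  TEXT PRIMARY KEY, title TEXT);

    -- The relationship itself: one row per enrollment.
    CREATE TABLE enrolls (
        student_id INTEGER REFERENCES student (student_id),
        course_id  TEXT    REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course  VALUES ('CA-201', 'DBMS'), ('CA-202', 'Networks');
    INSERT INTO enrolls VALUES (1, 'CA-201'), (1, 'CA-202'), (2, 'CA-201');
""")

# One student, many courses.
courses_of_student_1 = [row[0] for row in conn.execute(
    "SELECT course_id FROM enrolls WHERE student_id = 1 ORDER BY course_id")]

# One course, many students.
students_in_ca201 = [row[0] for row in conn.execute(
    "SELECT student_id FROM enrolls WHERE course_id = 'CA-201' "
    "ORDER BY student_id")]
```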

Relationship Participation
1. Total
In total participation, every instance of the entity type is connected through the relationship to an
instance of the other participating entity type.
2. Partial
In partial participation, only some instances of the entity type are connected through the relationship.
Example for relationship participation
Consider the relationship: Employee is head of the department.
Not every employee is the head of a department; only one employee heads each department. In
other words, only a few instances of the employee entity participate in the above relationship, so
the employee entity's participation in this relationship is partial.
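The difference between the two kinds of participation can be checked on sample data. The function below is a hypothetical helper written for these notes; note that participation is a rule stated at design time, so a check on one set of rows only illustrates the idea and cannot prove the rule.

```python
def participation(entity_ids, relationship_pairs, position):
    """Classify how the entities at `position` of each pair participate.

    Returns "total" if every entity appears in some pair, else "partial".
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_ids) <= participating else "partial"


# Hypothetical data for "Employee is head of the department".
employees = [1, 2, 3]
departments = ["CS", "EE"]
is_head_of = [(1, "CS"), (3, "EE")]   # (employee_id, department)

employee_side = participation(employees, is_head_of, 0)
department_side = participation(departments, is_head_of, 1)
```

In this data employee 2 heads no department, so the employee side is partial, while every department has a head, so the department side is total.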

Advantages and Disadvantages of ER Modeling (Merits and Demerits of ER Modeling)


Advantages
1. ER modeling is simple and easily understandable. It is expressed in the business user’s language
and can be understood by non-technical specialists.
2. It is intuitive and helps in physical database creation.
3. It can be generalized and specialized based on needs.
4. It can help in database design.
5. It gives a higher-level description of the system.
Disadvantages
1. A physical design derived from an E-R model may contain ambiguities or inconsistencies.
2. Diagrams may sometimes lead to misinterpretation.

Relational Model
The relational model is today the primary data model for commercial data processing applications. It
attained its primary position because of its simplicity, which eases the job of the programmer,
compared to earlier data models such as the network model or the hierarchical model.
Structure of Relational Databases:

A relational database consists of a collection of tables, each of which is assigned a unique name. For
example, consider the instructor table of Figure 1.5, which stores information about instructors. The
table has four column headers: ID, name, dept_name, and salary. Each row of this table records
information about an instructor, consisting of the instructor’s ID, name, dept_name, and salary.

Database Schema

When we talk about a database, we must differentiate between the database schema, which is the
logical design of the database, and the database instance, which is a snapshot of the data in the
database at a given instant in time. The concept of a relation corresponds to the programming-
language notion of a variable, while the concept of a relation schema corresponds to the
programming-language notion of type definition.
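The analogy with types and variables can be made literal in a few lines of Python. In this sketch the Instructor class plays the role of the relation schema and the set named instructor plays the role of the relation; the three sample rows are illustrative values, not data taken from these notes.

```python
from typing import NamedTuple


class Instructor(NamedTuple):
    """Relation schema: fixed attribute names and types, like a type definition."""
    ID: str
    name: str
    dept_name: str
    salary: float


# The relation is like a variable; its contents at a given instant are the
# relation instance.
instructor = {
    Instructor("10101", "Srinivasan", "Comp. Sci.", 65000.0),
    Instructor("12121", "Wu", "Finance", 90000.0),
}
snapshot_before = frozenset(instructor)   # instance at one point in time

instructor.add(Instructor("15151", "Mozart", "Music", 40000.0))
snapshot_after = frozenset(instructor)    # a later instance, same schema
```

The schema (Instructor) did not change between the two snapshots; only the instance did.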
Keys
A superkey is a set of one or more attributes that, taken collectively, allow us to identify uniquely a
tuple in the relation. For example, the ID attribute of the relation instructor is sufficient to distinguish
one instructor tuple from another. Thus, ID is a superkey. The name attribute of instructor, on the
other hand, is not a superkey, because several instructors might have the same name.

A superkey may contain extraneous attributes. For example, the combination of ID and name is a
superkey for the relation instructor. If K is a superkey, then so is any superset of K. We are often
interested in superkeys for which no proper subset is a superkey. Such minimal superkeys are called
candidate keys. The candidate key that the database designer chooses as the principal means of
identifying tuples within a relation is called the primary key.
It is customary to list the primary key attributes of a relation schema before the other attributes; for
example, the dept_name attribute of department is listed first, since it is the primary key. Primary key
attributes are also underlined. A relation, say r1, may include among its attributes the primary key of
another relation, say r2. This attribute is called a foreign key from r1, referencing r2.
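The definitions of superkey and candidate key can be tested mechanically on a small table. The two functions below are hypothetical helpers written for these notes. They examine a single instance, whereas a key is really a property of the schema that must hold for every legal instance, so the result is only an illustration.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all attributes in `attrs`."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Minimal superkeys of this instance, trying smaller sets first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                found.append(attrs)
    return found


# Made-up instructor rows: two instructors share a name and a department.
rows = [
    {"ID": "1", "name": "Kim", "dept_name": "Physics"},
    {"ID": "2", "name": "Kim", "dept_name": "Physics"},
    {"ID": "3", "name": "Wu", "dept_name": "Finance"},
]
keys = candidate_keys(rows, ["ID", "name", "dept_name"])
```

Here ID alone is a superkey, name alone is not, and the pair (ID, name) is a superkey but not a candidate key because its subset ID already identifies every row.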

Schema Diagrams
A database schema, along with primary key and foreign key dependencies, can be depicted by
schema diagrams. Figure 1.12 shows the schema diagram for our university organization.

Figure 1.12: Schema diagram for the university database.

Referential integrity constraints other than foreign key constraints are not shown explicitly in schema
diagrams. We will study a different diagrammatic representation called the entity-relationship diagram.
