Unit I DBMS Notes
File system versus DBMS:
1. File system: A file system is a more unstructured data store for storing arbitrary,
probably unrelated data.
DBMS: A database is generally used for storing related, structured data, with well-defined
data formats, in an efficient manner for insert, update and/or retrieval.
2. File system: A file system provides much looser guarantees about consistency, isolation
and durability.
DBMS: Databases are consistent at any instant in time and provide isolated transactions and
durable writes.
3. File system: A file-processing system coordinates only the physical access to the data.
DBMS: A database management system coordinates both the physical and the logical access to
the data.
4. File system: Redundancy cannot be controlled in a file-processing system.
DBMS: Redundancy can be controlled in a DBMS.
✔ A DBMS catalog stores the description of a particular database (e.g. data structures,
types, and constraints)
✔ The description is called meta-data.
✔ Used by the DBMS software and also by database users who need information about
the database structure.
✔ This allows the DBMS software to work with any number of database applications –
for example a banking database or a company database.
✔ These definitions are specified by the database designer prior to creating the actual
database and are stored in the catalog.
✔ In traditional file processing, data definition is typically part of the application
programs themselves. Hence, these programs are constrained to work with only one
specific database.
✔ For Example, the database catalog of fig 1.4 is represented in fig 1.5.
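As a rough illustration of a catalog holding meta-data, the sketch below uses Python's built-in sqlite3 module. The account table and its columns are hypothetical, chosen only to echo the banking example above. SQLite records table definitions in its sqlite_master catalog table, and PRAGMA table_info reports the column descriptions.

```python
import sqlite3

# In-memory database; the "account" table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no INTEGER PRIMARY KEY,"
    " holder TEXT NOT NULL,"
    " balance REAL)"
)

# The catalog stores the definition (meta-data), not the rows themselves.
catalog_rows = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'account'"
).fetchall()

# Column-level meta-data: one row per column, with the column name at
# position 1 and the declared type at position 2.
columns = conn.execute("PRAGMA table_info(account)").fetchall()
column_types = {row[1]: row[2] for row in columns}

print(catalog_rows[0][1])
print(column_types)
```

Because the description lives in the catalog rather than in the program, the same code could inspect any other table without being rewritten.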
4. Specialized Users:
Specialized users are sophisticated users who write specialized database applications that
do not fit into the traditional data-processing framework. Among these applications are
computer-aided design systems, knowledge-base and expert systems, etc.
Query Processor:
Queries let users pick out only the data they need from the large volume of data in the
database, so the query processor, which handles these query requests, is an important part of
the DBMS. The query processor has the following components:
1. DDL Interpreter:
It interprets the DDL statements and records the definitions in the data dictionary.
2. DML Compiler: It translates the DML statements in a query language into an evaluation
plan consisting of low-level instructions that the query evaluation engine understands. It also
1. File Manager - The file manager manages the file space and takes care of the structure of
the files. It manages the allocation of space on disk storage and the data structures used to
represent information stored on other media.
2. Buffer Manager - It transfers blocks between disk (or other devices) and main memory.
DMA (Direct Memory Access) is a form of input/output that controls the exchange of
blocks. When the processor receives a request to transfer a block, it passes the request to
the DMA controller, which transfers the block without involving the processor further.
3. Authorization and Integrity Manager - This component of the storage manager checks
the authority of users to access and modify information, and also checks integrity
constraints (keys, etc.).
4. Disk Manager - The disk manager transfers the blocks requested by the file manager.
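The role of the buffer manager can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the BufferManager class, the 4-byte block size, and the least-recently-used eviction rule are all hypothetical choices for illustration and are not taken from the notes, and a bytes object stands in for the disk.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # bytes per block; tiny on purpose for illustration


class BufferManager:
    """Keeps a bounded number of disk blocks in memory (LRU eviction assumed)."""

    def __init__(self, disk: bytes, capacity: int):
        self.disk = disk            # stands in for the disk file
        self.capacity = capacity    # maximum number of blocks held in memory
        self.pool = OrderedDict()   # block number -> block contents
        self.disk_reads = 0

    def get_block(self, block_no: int) -> bytes:
        if block_no in self.pool:
            # Already buffered: mark as most recently used, no disk access.
            self.pool.move_to_end(block_no)
            return self.pool[block_no]
        if len(self.pool) >= self.capacity:
            # Evict the least recently used block to make room.
            self.pool.popitem(last=False)
        start = block_no * BLOCK_SIZE
        block = self.disk[start:start + BLOCK_SIZE]
        self.disk_reads += 1
        self.pool[block_no] = block
        return block


bm = BufferManager(disk=b"AAAABBBBCCCCDDDD", capacity=2)
bm.get_block(0)
bm.get_block(1)
bm.get_block(0)  # served from the buffer, no disk read
bm.get_block(2)  # buffer is full, so block 1 is evicted
```

A real buffer manager also has to write modified blocks back to disk before evicting them; that part is left out here.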
The structures maintained by the storage manager are:
1. Data Files - Data files contain the data portion of the database.
2. Data Dictionary - A DBMS must provide a data dictionary function. The dictionary contains
data about the data rather than just raw data: information about attributes, entities,
mappings and cross-references is kept in the data dictionary.
3. Indices or Indexing and Access Aids - An index is a small table with two columns: the
first column contains a copy of the primary or candidate key of a table, and the
second column contains a set of pointers holding the addresses of the disk blocks where each
key value can be found. The advantage of an index is that it makes search
operations much faster.
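The two-column index described above can be sketched as a sorted list of (key, block address) pairs searched with binary search. The key values, block addresses, and the lookup helper are hypothetical; real systems typically use richer structures, so treat this as a minimal model of the idea only.

```python
import bisect

# Hypothetical index: first column is the key, second is the disk block address.
index = [(101, 0), (205, 3), (317, 1), (420, 2)]
keys = [key for key, _ in index]  # kept sorted so binary search works


def lookup(key):
    """Return the disk block address holding key, or None if the key is absent."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return index[pos][1]
    return None


print(lookup(317))
```

Binary search over the sorted keys needs only about log2(n) comparisons, instead of scanning every block of the data file.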
The three-schema architecture is used to visualize the schema levels in a database. The
three schemas are only descriptions of data; the data itself actually exists only at the
physical level.
Each user group refers only to its own external schema. The DBMS must transform a
request specified on an external schema into a request against the conceptual schema, and
then into a request on the internal schema for processing over the database. The process
of transforming requests and results between levels is called mapping.
1.9 DBMS - Data Abstraction
For the system to be usable, it must retrieve data efficiently. The need for efficiency has
led designers to use complex data structures to represent data in the database. Since many
database-systems users are not computer trained, developers hide the complexity from
users through several levels of abstraction, to simplify users’ interactions with the
system:
● Physical Level: The lowest level of abstraction describes how the data are
actually stored. The physical level describes complex low-level data structures in detail.
● Logical Level: The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The logical level thus describes
the entire database in terms of a small number of relatively simple structures. Although
implementation of the simple structures at the logical level may involve complex
physical-level structures, the user of the logical level does not need to be aware of this
complexity. Database administrators, who must decide what information to keep in the
database, use the logical level of abstraction.
● View Level: The highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of the
variety of information stored in a large database. Many users of the database system do
not need all this information; instead, they need to access only a part of the database. The
view level of abstraction exists to simplify their interaction with the system. The system
may provide many views for the same database.
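The difference between the logical level and the view level can be sketched with Python's built-in sqlite3 module. The employee table, the employee_directory view, and the sample rows are hypothetical; the view exposes only part of the stored data, as the view level is meant to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full (hypothetical) employee relation.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Ravi", 48000.0)],
)

# View level: a directory that hides the salary column from its users.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name FROM employee"
)

directory = conn.execute(
    "SELECT * FROM employee_directory ORDER BY emp_id"
).fetchall()
print(directory)
```

How SQLite lays out the rows on disk (the physical level) never appears in this code, which is the point of the abstraction.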
1.10 DBMS Languages
A DBMS must provide appropriate languages and interfaces for each category of users to
express database queries and updates. Database languages are used to create and maintain
a database on a computer. Many database systems, such as Oracle, MySQL, MS Access, dBase
and FoxPro, provide such languages. SQL statements commonly used in Oracle and
MS Access can be categorized as data definition language (DDL), data control language
(DCL) and data manipulation language (DML).
Data Definition Language (DDL)
It is a language that allows users to define data and their relationship to other types of
data. It is mainly used to create files, databases, the data dictionary and tables within
databases.
It is also used to specify the structure of each table, the set of values associated with each
attribute, integrity constraints, security and authorization information for each table and
the physical storage structure of each table on disk.
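A few DDL statements can be tried out with Python's built-in sqlite3 module. The department and employee tables are hypothetical, and the sketch covers only table structure and integrity constraints; authorization and physical storage options are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of each table and its integrity constraints.
conn.execute(
    "CREATE TABLE department ("
    " dept_id INTEGER PRIMARY KEY,"
    " dept_name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL CHECK (salary >= 0),"
    " dept_id INTEGER REFERENCES department(dept_id))"
)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# The DBMS enforces the declared constraint: a negative salary is rejected.
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
try:
    conn.execute("INSERT INTO employee VALUES (1, 'Asha', -5, 1)")
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True
```

The CHECK clause restricts the set of values allowed for salary, which is one way of stating the values associated with an attribute.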
Data Manipulation Language (DML)
A Data Manipulation Language is a database language which enables users to access and
modify stored data in a database. The types of access are as follows,
Retrieval of information stored in the database.
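Retrieval is the access type listed above; insertion, modification and deletion are the other operations a DML commonly provides. The sketch below runs one statement of each kind through Python's built-in sqlite3 module, using a hypothetical student table and made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# Insertion of new information.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Meena", 72), (2, "Kiran", 65), (3, "Arun", 80)],
)

# Modification of stored information.
conn.execute("UPDATE student SET marks = marks + 5 WHERE roll_no = ?", (2,))

# Deletion of information.
conn.execute("DELETE FROM student WHERE roll_no = ?", (3,))

# Retrieval of information.
rows = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
print(rows)
```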
● One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B; however, an entity from entity set B can be associated with at most
one entity of entity set A.
● Many-to-one − An entity from entity set A can be associated with at most
one entity of entity set B; however, an entity from entity set B can be associated with
more than one entity from entity set A.
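The cardinality definitions can be checked mechanically on a small relationship instance. In the sketch below, the mapping_cardinality function and the sample pairs are hypothetical; it reports only what a given set of (A, B) pairs exhibits, whereas a real cardinality constraint is a rule stated for the schema.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs, A on the left."""
    a_to_b = defaultdict(set)
    b_to_a = defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)

    # True if some entity of A is linked to more than one entity of B.
    a_has_many = any(len(partners) > 1 for partners in a_to_b.values())
    # True if some entity of B is linked to more than one entity of A.
    b_has_many = any(len(partners) > 1 for partners in b_to_a.values())

    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# One department (A) employing two employees (B).
print(mapping_cardinality([("d1", "e1"), ("d1", "e2")]))
```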
ER Diagram Representation
Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for
example, entities, attributes of an entity, relationship sets, and attributes of relationship sets, can
be represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every
ellipse represents one attribute and is directly connected to its entity (rectangle).
If an attribute is composite, it is further divided in a tree-like structure: each component
attribute is drawn as its own ellipse and connected to the ellipse of the composite
attribute.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written
inside the diamond. All the entities (rectangles) participating in a relationship are connected
to it by a line.
● Many-to-one − When more than one instance of an entity is associated with the
relationship, it is marked as 'N:1'. The following image reflects that more than one
instance of an entity on the left and only one instance of an entity on the right can be
associated with the relationship. It depicts a many-to-one relationship.
● Many-to-many − The following image reflects that more than one instance of an entity
on the left and more than one instance of an entity on the right can be associated with the
relationship. It depicts a many-to-many relationship.
Participation Constraints
● Total Participation − Each entity in the entity set is involved in the relationship. Total
participation is represented by double lines.
● Partial participation − Not all entities in the entity set are involved in the relationship.
Partial participation is represented by single lines.
GENERALIZATION
Generalization is a bottom-up approach in which two lower level entities combine to form a
higher level entity. In generalization, the higher level entity can also combine with other lower
In our Employee example, we have seen different types of employees such as Engineer, Accountant,
Salesperson, Clerk etc. Similarly, each employee belongs to one of several departments. We can
represent this in an ER diagram as below. Seen for the first time, such a diagram is hard to
follow: a reader needs time to understand it and may misread some requirement.
What if we group all the sub-departments into one department and the different employee types
into one employee? The sub-departments share the same features within their own domain, and so
do the employee types. So if we merge the child entities into their parent, the diagram becomes
simpler and hence easier to understand. This method of merging the branches into one is called
generalization. Generalization is thus a bottom-up approach that helps to capture the
requirement at a high level, so that it can be understood quickly.
SPECIALIZATION
Specialization is the opposite of generalization. It is a top-down approach in which one higher
level entity can be broken down into two or more lower level entities. In specialization, some
higher level entities may not have lower-level entity sets at all. Take a group ‘Person’ for
example. A person has a name, date of birth, gender, etc. These properties are common to all
persons. But in a company, persons can be identified as employee, employer, customer, or vendor,
based on the role they play in the company.
As another example of specialization, Person can be further divided into
STUDENT, TEACHER, ENGINEER, SOLDIER etc. (Merging STUDENT, TEACHER,
ENGINEER etc. into PERSON is an example of generalization.)
Inheritance
We use all the above features of the ER model in order to create classes of objects in
object-oriented programming. The details of entities are generally hidden from the user; this
process is known as abstraction.
Inheritance is an important feature of Generalization and Specialization. It allows lower-level
entities to inherit the attributes of higher-level entities.
For example, the attributes of a Person class such as name, age, and gender can be inherited by
lower-level entities such as Student or Teacher.
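Since the notes compare this to classes in object-oriented programming, attribute inheritance can be sketched with Python dataclasses. Person, Student and Teacher come from the example above; the extra attributes roll_no and subject and the sample values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Attributes of the higher-level entity.
    name: str
    age: int
    gender: str


@dataclass
class Student(Person):
    # Inherits name, age and gender, and adds its own attribute.
    roll_no: int


@dataclass
class Teacher(Person):
    subject: str


s = Student(name="Meena", age=20, gender="F", roll_no=17)
student_attributes = [f.name for f in fields(Student)]
print(student_attributes)
```

The list of Student attributes starts with the ones inherited from Person, followed by the attribute specific to Student.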
Aggregation
Aggregation is a process in which a relation between two entities is treated as a single entity.
Here the relation between Center and Course acts as an entity in relation with Visitor.
Look at the ER diagram below of STUDENT, COURSE and SUBJECTS. What does it imply?
A student attends a course and has some subjects to study. At the same time, the course offers
some subjects. Here a relation is defined on a relation. But an ER diagram does not allow such a
relation: it supports mapping between entities, not between relations. So what can we do in this
case?
From the point of view of SUBJECTS, there is no difference between STUDENT and COURSE: it
offers its subjects to both of them. So what we can do here is merge STUDENT
and COURSE into one entity. This process of merging is called aggregation. It is completely
different from generalization. In generalization, we merge entities of the same domain into one
entity. In this case we merge related entities into one entity.
Here we have merged STUDENT and COURSE into one entity, STUDENT_COURSE. This new
entity forms the mapping with SUBJECTS. The new entity STUDENT_COURSE in turn contains
two entities, STUDENT and COURSE, with the ‘Attends’ relationship.
Attributes vs. Entity-Sets
• When to represent a value as an attribute?
• When to represent a value as a separate entity-set?
• Representing as a separate entity-set allows details to be added later
• Example:
– Phone number is a good candidate for representing as a separate entity-set
– Employee name definitely should stay an attribute
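One way to picture this choice is with tables in Python's built-in sqlite3 module. The table layout, the kind column, and the sample numbers are hypothetical: name stays an attribute of employee, while phone becomes its own table so that extra details and several phones per employee can be recorded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Employee name stays a plain attribute of the employee entity-set.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Phone is a separate entity-set, so details such as its kind can be added.
conn.execute(
    "CREATE TABLE phone ("
    " phone_no TEXT PRIMARY KEY,"
    " kind TEXT,"
    " emp_id INTEGER REFERENCES employee(emp_id))"
)

conn.execute("INSERT INTO employee VALUES (1, 'Asha')")
conn.executemany(
    "INSERT INTO phone VALUES (?, ?, ?)",
    [("555-0100", "office", 1), ("555-0101", "mobile", 1)],
)

phones = conn.execute(
    "SELECT phone_no, kind FROM phone WHERE emp_id = ? ORDER BY phone_no", (1,)
).fetchall()
print(phones)
```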