E-Content - DBMS - Unit - 1
UNIT-1
1.1 Introduction:
Database: A collection of related data is called a database.
Database management system: It is a collection of interrelated data and a set of
programs to access those data.
FILE PROCESSING SYSTEM: It is supported by a conventional operating system. The
system stores permanent records in various files. It has several disadvantages:
1. Data redundancy and inconsistency
2. Difficulty in accessing data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Concurrent access anomalies
7. Security problems
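The first disadvantage in the list can be illustrated with a minimal sketch. It assumes two hypothetical "files" (plain Python lists standing in for files kept by separate programs); the customer name and addresses are made up for illustration.

```python
# Two separate "files" kept by different programs; the records are hypothetical.
savings_file = [{"customer": "Anita", "address": "12 Lake Road"}]
loans_file = [{"customer": "Anita", "address": "12 Lake Road"}]  # redundant copy

# Only the savings program is told about the change of address.
savings_file[0]["address"] = "7 Hill Street"

# The two copies now disagree: redundancy has led to inconsistency.
inconsistent = savings_file[0]["address"] != loans_file[0]["address"]
print(inconsistent)
```

In a DBMS the address would normally be stored once, so a single update would be seen by every program.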
1.4 VIEW OF DATA: The database system provides an abstract view of data to users. The
system hides the details of how the data are stored and maintained. Two topics are
covered here:
1.4.1. Data abstraction: Data is hidden through various levels of abstraction, namely:
a. Physical level
b. Logical level
c. View level
a) Physical level- The lowest level of abstraction describes how the data are physically
stored. The physical level describes complex low-level data structures in detail.
Figure-1.1
b) Logical level- The next level of abstraction describes what data are stored in the
database, and what relationships exist among those data. It describes the entire database
in terms of a small number of relatively simple structures.
c) View level- The highest level of abstraction describes only part of the entire database.
The view level of abstraction exists to simplify the users' interaction with the system. The
system may provide many views for the same database.
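The logical level and the view level can be sketched as follows. This is only an illustration, assuming Python's built-in sqlite3 module and a hypothetical employee table; the table, view and column names are not from these notes. The physical level (how SQLite lays out rows on storage) stays hidden from the code.

```python
import sqlite3

# In-memory database; all names below are illustrative assumptions.
conn = sqlite3.connect(":memory:")

# Logical level: the full structure of the data.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000.0)")

# View level: one group of users sees only ids and names, not salaries.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

public_rows = conn.execute("SELECT * FROM employee_public").fetchall()
print(public_rows)
```

Several such views can be defined over the same table, one for each group of users.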
1.5 Instances and schema: The collection of information stored in the database at a
particular moment is called an instance of the database. The overall design of the database
is called the database schema. The types of schema are:
1. Physical schema- describes the database design at the physical level.
2. Logical schema- describes the database design at the logical level.
3. Subschema- describes the database design at the view level.
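The difference between an instance and a schema can be sketched as below, assuming Python's built-in sqlite3 module and a hypothetical student table. The helper functions schema() and instance() are made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")

def schema(connection):
    # The schema is the design; SQLite records it in its catalog table.
    row = connection.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'student'"
    ).fetchone()
    return row[0]

def instance(connection):
    # The instance is whatever rows exist at this moment.
    return connection.execute(
        "SELECT roll_no, name FROM student ORDER BY roll_no"
    ).fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO student VALUES (1, 'Ravi')")
schema_after, instance_after = schema(conn), instance(conn)

# The instance changed with the insert; the schema did not.
print(instance_before, instance_after, schema_before == schema_after)
```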
Data Independence:
A database system normally contains a lot of data in addition to users' data. For example,
it stores data about data, known as metadata, to locate and retrieve data easily. It is
rather difficult to modify or update a set of metadata once it is stored in the database. But
as a DBMS expands, it needs to change over time to satisfy the requirements of the users.
If all the data were interdependent, this would become a tedious and highly complex job.
Metadata itself follows a layered architecture, so that a change to the data at one layer
does not affect the data at another layer. The layers are independent of each other but
mapped to each other.
Figure-1.2
Logical Data Independence
Logical data is data about the database; that is, it stores information about how data is
managed inside. An example is a table (relation) stored in the database together with all
the constraints applied on that relation.
Logical data independence is a mechanism that decouples the logical data from the actual
data stored on the disk. If we make some changes to the table format, it should not change
the data residing on the disk.
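A change of table format that leaves the stored data untouched can be sketched as follows. The example assumes Python's built-in sqlite3 module and a hypothetical book table; it is an illustration, not a complete treatment of logical data independence.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO book VALUES (1, 'DBMS Basics')")

query = "SELECT id, title FROM book"
before = conn.execute(query).fetchall()

# Change the table format by adding a column.
conn.execute("ALTER TABLE book ADD COLUMN publisher TEXT")

# The rows already stored, and the existing query, are unaffected.
after = conn.execute(query).fetchall()
print(before == after)
```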
Physical Data Independence
All the schemas are logical, and the actual data is stored in bit format on the disk. Physical
data independence is the ability to change the physical data without impacting the schema
or logical data.
For example, in case we want to change or upgrade the storage system itself − suppose
we want to replace hard disks with SSDs − it should not have any impact on the logical data
or schemas.
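Replacing a disk cannot be shown in a few lines, so the sketch below uses a smaller physical change as a stand-in: adding an index. It assumes Python's built-in sqlite3 module and a hypothetical account table. The index changes how rows are located in storage, while the table definition and the query stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER, holder TEXT)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [(101, "Meena"), (102, "Kumar")],
)

query = "SELECT holder FROM account WHERE acc_no = 102"
before = conn.execute(query).fetchall()

# Physical change only: an index alters the access path, not the schema.
conn.execute("CREATE INDEX idx_account_acc_no ON account (acc_no)")

after = conn.execute(query).fetchall()
print(before == after)
```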
1.6 DATA MODELS: A data model is a collection of concepts that can be used to describe
the structure of a database.
It provides the necessary means to achieve the abstraction.
By the structure of a database we mean the data types, relationships and constraints that
apply to the data.
In short, a data model describes the design of a database.
Data models are classified into three types:
1. Object based model
   a. Object oriented model
   b. E-R model
   c. Functional model
2. Record based model
   a. Relational model
   b. Network model
   c. Hierarchical model
3. Physical model
   a. Unifying model
   b. Frame-memory model
Object oriented model:
This data model is another method of representing real world objects. It treats each
entity in the real world as an object and keeps the objects separate from one another. It
groups the related functionality of an object together and allows that functionality to be
inherited by related sub-groups.
Figure 1.3
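Grouping data with its functions, and inheritance by a sub-group, can be sketched with two small classes. The class names Person and Student and their fields are hypothetical, chosen only for illustration.

```python
class Person:
    """An object groups its data with the functions that act on it."""

    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"{self.name} is a person"


class Student(Person):
    """A sub-group inherits the data and functions of Person."""

    def __init__(self, name, course):
        super().__init__(name)
        self.course = course


s = Student("Latha", "DBMS")
description = s.describe()  # describe() is inherited from Person
print(description)
```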
Functional model:
In the functional model, the data is stored and accessed in the form of functions.
Hierarchical model:
In this data model, the entities are represented in a hierarchical fashion. We first identify
a parent entity and its child entities. We then drill down to identify the next level of child
entities, and so on. This model can be imagined as folders inside a folder.
Figure 1.4
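The "folders inside a folder" picture can be sketched as a nested structure in which every child has exactly one parent. The college and department names are hypothetical, and find_path is a helper written only for this illustration.

```python
# Each node has exactly one parent; children are nested under it.
college = {
    "name": "College",
    "children": [
        {"name": "Computer Science", "children": [
            {"name": "DBMS", "children": []},
            {"name": "Networks", "children": []},
        ]},
        {"name": "Mathematics", "children": []},
    ],
}

def find_path(node, target, path=()):
    """Return the single root-to-target path, or None if target is absent."""
    path = path + (node["name"],)
    if node["name"] == target:
        return path
    for child in node["children"]:
        found = find_path(child, target, path)
        if found:
            return found
    return None

dbms_path = find_path(college, "DBMS")
print(dbms_path)
```

Because each child has only one parent, there is only one path from the root to any record.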
Network model:
This is an enhanced version of the hierarchical data model. It is designed to address the
drawbacks of the hierarchical model, in particular by supporting M:N relationships. This
data model is also drawn as a hierarchy, but it does not have the single-parent concept: any
child in the tree can have multiple parents.
Figure-1.5
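A child with more than one parent can be sketched with a list of parent-child links. The department and project names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Links run from parent to child; all names are hypothetical.
links = [
    ("Computer Science", "DBMS Project"),
    ("Mathematics", "DBMS Project"),  # the same child has a second parent
    ("Computer Science", "Compiler Project"),
]

parents = defaultdict(set)
children = defaultdict(set)
for parent, child in links:
    parents[child].add(parent)
    children[parent].add(child)

# One parent has many children and one child has many parents (M:N).
print(sorted(parents["DBMS Project"]))
print(sorted(children["Computer Science"]))
```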
Relational model:
This model is designed to overcome the drawbacks of the hierarchical and network models. Its
design is completely different from those two models. Those models define how the records
are structured physically in the database and how they are inter-related. In the relational
model, we are not concerned with how the data are physically structured. It is based purely
on how the records in each table are related. It isolates the physical structure from the
logical structure. The logical structure defines how records are grouped and distributed.
Figure-1.6
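Relating records purely through the values in the tables can be sketched as below, assuming Python's built-in sqlite3 module and hypothetical department and student tables. No pointers or storage details appear anywhere in the code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE student ("
    "roll_no INTEGER PRIMARY KEY, name TEXT, "
    "dept_id INTEGER REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (10, 'Computer Science')")
conn.execute("INSERT INTO student VALUES (1, 'Ravi', 10)")

# The two records are related only because they share the value dept_id = 10.
rows = conn.execute(
    "SELECT s.name, d.dept_name "
    "FROM student s JOIN department d ON s.dept_id = d.dept_id"
).fetchall()
print(rows)
```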
Transaction States:
A transaction passes through the following states.
Active − In this state, the transaction is being executed. This is the initial state of
every transaction.
Partially Committed − When a transaction executes its final operation, it is said to
be in a partially committed state.
Failed − A transaction is said to be in a failed state if any of the checks made by
the database recovery system fails. A failed transaction can no longer proceed
further.
Aborted − If any of the checks fails and the transaction has reached a failed state,
then the recovery manager rolls back all its write operations on the database to
bring the database back to the state it was in prior to the execution of the
transaction. Transactions in this state are called aborted. The database recovery
module can select one of two operations after a transaction aborts:
1. Restart the transaction
2. Kill the transaction
Committed − If a transaction executes all its operations successfully, it is said to be
committed. All its effects are now permanently established on the database system.
Figure-1.7
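The states described above can be sketched as a small state machine. The transition table ALLOWED is an assumption that follows the usual textbook state diagram (for example, that a partially committed transaction can still fail); the names State, ALLOWED and move are made up for this illustration.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    FAILED = "failed"
    ABORTED = "aborted"
    COMMITTED = "committed"

# Assumed transition table, following the usual textbook state diagram.
ALLOWED = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: set(),    # restarting begins a new transaction
    State.COMMITTED: set(),  # effects are permanent
}

def move(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target

# A successful run: active, then partially committed, then committed.
state = State.ACTIVE
state = move(state, State.PARTIALLY_COMMITTED)
state = move(state, State.COMMITTED)
print(state.value)
```

Trying to move a committed transaction back to the active state raises an error, which matches the statement that its effects are permanent.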
The DBMS architecture is mainly divided into two modules:
1. Query processor
2. Storage manager
Figure-1.8
What is Database Design?
Database Design is a collection of processes that facilitate the designing,
development, implementation and maintenance of enterprise data management
systems.
It helps produce database systems that
1. Meet the requirements of the users
2. Have high performance.
The main objective of database design is to produce logical and physical design
models of the proposed database system.
The logical model concentrates on the data requirements and the data to be stored
independent of physical considerations. It does not concern itself with how the data will be
stored or where it will be stored physically.
The physical data design model involves translating the logical design of the database onto
physical media using hardware resources and software systems such as database
management systems (DBMS).
1.11 Attributes:
Figure-1.9
Key Attribute:
Figure-1.10
Composite Attribute:
Figure-1.11
Derived Attribute:
Figure-1.12
Multivalued attribute:
Figure-1.13
Descriptive Attribute:
An attribute of a relationship is called a descriptive attribute.
Figure-1.14
1.12 Entities: An entity set is a set of entities of the same type (e.g., all persons
having an account at a bank).
An entity set that does not have sufficient attributes to form a primary key is called a
weak entity set.
An entity set that has a primary key is called a strong entity set.
Figure-1.14
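One common way to map a weak entity set to tables is to combine its partial key with the key of the strong entity set it depends on. The sketch below assumes Python's built-in sqlite3 module and hypothetical loan (strong) and payment (weak) tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Strong entity set: loan_no alone identifies a loan.
conn.execute("CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, amount REAL)")

# Weak entity set: payment_no alone is not unique, so the key of the
# owning loan is added to form the primary key.
conn.execute("""
    CREATE TABLE payment (
        loan_no INTEGER NOT NULL REFERENCES loan(loan_no),
        payment_no INTEGER NOT NULL,
        paid REAL,
        PRIMARY KEY (loan_no, payment_no)
    )
""")
conn.execute("INSERT INTO loan VALUES (1, 5000.0)")
conn.execute("INSERT INTO loan VALUES (2, 8000.0)")
conn.execute("INSERT INTO payment VALUES (1, 1, 500.0)")
conn.execute("INSERT INTO payment VALUES (2, 1, 800.0)")  # same payment_no, other loan

try:
    conn.execute("INSERT INTO payment VALUES (1, 1, 900.0)")  # repeats (1, 1)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
print(duplicate_rejected, payment_count)
```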
Relationship
A relationship is represented by a diamond-shaped box. The name of the relationship is
written inside the diamond. All the entities (rectangles) participating in a relationship are
connected to it by a line.
Binary Relationship and Cardinality:
A relationship where two entities are participating is called a binary relationship.
Cardinality is the number of instances of one entity that can be associated, through the
relationship, with an instance of the other entity.
One-to-one − When only one instance of an entity is associated with the
relationship, it is marked as '1:1'. The following image reflects that only one
instance of each entity should be associated with the relationship. It depicts a
one-to-one relationship.
Figure-1.15
Figure-1.15
Many-to-one − When more than one instance of an entity is associated with the
relationship, it is marked as 'N:1'. The following image reflects that more than one
instance of an entity on the left and only one instance of an entity on the right can
be associated with the relationship. It depicts a many-to-one relationship.
Figure-1.16
Many-to-many − The following image reflects that more than one instance of an
entity on the left and more than one instance of an entity on the right can be
associated with the relationship. It depicts a many-to-many relationship.
Figure-1.17
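The many-to-one and many-to-many cases can be sketched in tables as follows. The example assumes Python's built-in sqlite3 module; the department, employee, course and attends tables are hypothetical and show one common mapping, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY);
    -- Many-to-one: each employee row holds at most one dept_id.
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    CREATE TABLE course (course_id INTEGER PRIMARY KEY);
    -- Many-to-many: a separate table pairs employees with courses.
    CREATE TABLE attends (
        emp_id INTEGER REFERENCES employee(emp_id),
        course_id INTEGER REFERENCES course(course_id),
        PRIMARY KEY (emp_id, course_id)
    );
    INSERT INTO department VALUES (10);
    INSERT INTO employee VALUES (1, 10), (2, 10);
    INSERT INTO course VALUES (100), (200);
    INSERT INTO attends VALUES (1, 100), (1, 200), (2, 100);
""")

def count(sql):
    return conn.execute(sql).fetchone()[0]

employees_in_dept = count("SELECT COUNT(*) FROM employee WHERE dept_id = 10")
courses_of_emp1 = count("SELECT COUNT(*) FROM attends WHERE emp_id = 1")
attendees_of_100 = count("SELECT COUNT(*) FROM attends WHERE course_id = 100")
print(employees_in_dept, courses_of_emp1, attendees_of_100)
```

A one-to-one relationship could be mapped in a similar way by also declaring the referencing column UNIQUE.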
Participation Constraints
Total Participation − Each entity is involved in the relationship. Total participation
is represented by double lines.
Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.
Figure-1.18
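Participation constraints can be approximated in tables. In the sketch below, which assumes Python's built-in sqlite3 module and hypothetical department and employee tables, a NOT NULL column makes every employee take part in the relationship (total participation), while a department may have no employees (partial participation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY)")

# Total participation of employee: dept_id may not be left empty.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, "
    "dept_id INTEGER NOT NULL REFERENCES department(dept_id))"
)
conn.execute("INSERT INTO department VALUES (10)")
conn.execute("INSERT INTO department VALUES (20)")  # has no employees
conn.execute("INSERT INTO employee VALUES (1, 10)")

try:
    conn.execute("INSERT INTO employee VALUES (2, NULL)")  # employee without department
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

# Partial participation of department: some departments appear in no pair.
empty_departments = conn.execute(
    "SELECT dept_id FROM department "
    "WHERE dept_id NOT IN (SELECT dept_id FROM employee)"
).fetchall()
print(null_rejected, empty_departments)
```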
1.13 Additional features of ER Model:
Generalization:
Aggregation:
TEXT BOOK:
1. Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, 3rd Edition,
Tata McGraw-Hill.
2. Web pages