DBMS Unit I Notes
Database:
A database is a collection of interrelated data, organized so that the data can be
retrieved, inserted, and deleted efficiently. The data is organized in the form of
tables, schemas, views, reports, etc.
● A DBMS provides protection and security for the database. When there are multiple
users, it also maintains data consistency.
Database Applications
Physical: This is the lowest level of data abstraction. It tells us how the data is
actually stored in memory. Access methods such as sequential or random access and
file organization methods such as B+ trees and hashing are used for this. Usability,
size of memory, and the number of times the records are accessed are factors that we
need to know while designing the database.
Suppose we need to store the details of an employee. Blocks of storage and the
amount of memory used for these purposes are kept hidden from the user.
Logical: This level comprises the information that is actually stored in the
database in the form of tables. It also stores the relationship among the data
entities in relatively simple structures. At this level, the information available to
the user at the view level is unknown.
We can store the various attributes of an employee, and relationships, e.g. with the
manager, can also be stored.
View: This is the highest level of abstraction. Only a part of the actual database is
viewed by the users. This level exists to ease the accessibility of the database by an
individual user. Users view data in the form of rows and columns. Tables and
relations are used to store data. Multiple views of the same database may exist.
Users can just view the data and interact with the database; storage and
implementation details are hidden from them.
The main purpose of data abstraction is to achieve data independence in order to save
time and cost required when the database is modified or altered.
Two levels of data independence arise from these levels of abstraction:
Logical level data independence: It refers to the characteristic of being able to
modify the logical schema without affecting the external schema or application
programs. The user view of the data would not be affected by any changes to the
conceptual view of the data. These changes may include insertion or deletion of
attributes, or altering table structures, entities, or relationships in the logical
schema.
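As a minimal sketch of logical data independence, the example below uses Python's built-in sqlite3 module. The table name employee, the view name employee_view, and the sample row are assumptions made for illustration; the view stands in for the external schema, and adding a column stands in for a change to the logical schema.

```python
import sqlite3


def view_rows_before_and_after_schema_change():
    """Return the rows seen through a view before and after the base table changes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO employee (id, name) VALUES (1, 'Asha')")  # hypothetical row

    # The view plays the role of the external schema seen by the user.
    conn.execute("CREATE VIEW employee_view AS SELECT id, name FROM employee")
    before = conn.execute("SELECT * FROM employee_view").fetchall()

    # Change the logical schema: insert a new attribute into the base table.
    conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")
    after = conn.execute("SELECT * FROM employee_view").fetchall()

    conn.close()
    return before, after
```

The view names its columns explicitly, so a program reading from it sees the same rows after the new attribute is added.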
DBMS Architecture
● The DBMS design depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers and other components that are connected with networks.
● The client/server architecture consists of many PCs and a workstation which are
connected via the network.
● DBMS architecture depends upon how users are connected to the database to get
their requests served.
1-Tier Architecture
● In this architecture, the database is directly available to the user. It means the
user works directly on the DBMS.
● Any changes made here are applied directly to the database itself. It doesn't
provide a handy tool for end users.
● The 1-Tier architecture is used for developing local applications, where
programmers can communicate directly with the database for a quick
response.
2-Tier Architecture
● The user interfaces and application programs are run on the client-side.
● The server side is responsible for providing functionalities such as query
processing and transaction management.
3-Tier Architecture
● The 3-Tier architecture contains another layer between the client and server. In
this architecture, the client can't communicate directly with the server.
Structure of a DBMS
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system. The functional components of a database
system can be broadly divided into the storage manager and the query processor
components. The storage manager is important because databases typically require
a large amount of storage space. The query processor is important because it helps
the database system simplify and facilitate access to data.
It is the job of the database system to translate updates and queries written in a
nonprocedural language, at the logical level, into an efficient sequence of operations
at the physical level.
Query Processor
DDL interpreter, which interprets DDL statements and records the definitions in
the data dictionary.
Storage Manager
A storage manager is a program module that provides the interface between the
low-level data stored in the database and the application programs and queries
submitted to the system. The storage manager is responsible for the interaction with
the file manager. The raw data are stored on the disk using the file system, which is
usually provided by a conventional operating system. The storage manager
translates the various DML statements into low-level file-system commands. Thus,
the storage manager is responsible for storing, retrieving, and updating data in the
database.
The storage manager components include:
Authorization and integrity manager, which tests for the satisfaction of integrity
constraints and checks the authority of users to access data.
Transaction manager, which ensures that the database remains in a consistent
(correct) state despite system failures, and that concurrent transaction executions
proceed without conflicting.
File manager, which manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer manager, which is responsible for fetching data from disk storage into main
memory, and deciding what data to cache in main memory. The buffer manager is a
critical part of the database system, since it enables the database to handle data sizes
that are much larger than the size of main memory.
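The buffer manager's job of deciding what to keep in main memory can be sketched as a small cache. The notes do not name a replacement policy, so the least-recently-used (LRU) policy below is an assumption, and the class name BufferManager and the read_block callback are hypothetical stand-ins for a real disk read.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool that keeps at most `capacity` blocks in memory (LRU eviction)."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # hypothetical: block_id -> data, stands in for a disk read
        self.cache = OrderedDict()    # block_id -> data, least recently used first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.cache:
            # Cache hit: mark the block as most recently used.
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Cache miss: fetch from "disk" and cache the block.
        data = self.read_block(block_id)
        self.disk_reads += 1
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            # Evict the least recently used block.
            self.cache.popitem(last=False)
        return data
```

With a capacity of two blocks, the sketch can serve a three-block "disk", which is a small-scale version of handling data larger than main memory.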
Transaction Manager
Data models in DBMS help us understand the design at the conceptual, physical, and
logical levels, as they provide a clear picture of the data, making it easier for
developers to create a physical database.
Data models are used to describe how the data is stored, accessed, and updated in a
DBMS. A set of symbols and text is used to represent them so that all the members
of an organization can understand how the data is organized. A data model provides a
set of conceptual tools that are widely used to describe data.
There are many types of data models that are used in the industry.
Hierarchical Model
The hierarchical data model is one of the oldest data models, developed by IBM in
the 1960s. In this data model, the data is organized in a hierarchical tree-like
structure. This data model can be easily visualized because each record has exactly
one parent (except the root) and zero or more children, as shown in the image given
below.
The above-given image represents the data model of the Vehicle database: vehicles
are classified into two types, viz. two-wheelers and four-wheelers, and then they are
further classified.
The main drawback we can see here is that we can only have one-to-many
relationships under this model; hence the hierarchical data model is very rarely
used nowadays.
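The tree structure described above can be sketched in a few lines of Python. The class name Record is hypothetical, and the leaf "Bike" is an assumed example, since the notes do not list the further classifications.

```python
class Record:
    """A record in a hierarchical model: one parent (or none for the root), many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the single path from the root down to this record."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " > ".join(reversed(parts))


def build_vehicle_tree():
    vehicle = Record("Vehicle")
    two = Record("Two-wheeler", vehicle)
    Record("Four-wheeler", vehicle)
    bike = Record("Bike", two)  # assumed leaf, not taken from the notes
    return vehicle, bike
```

Because each record stores exactly one parent, a record cannot belong to two parents at once, which is the one-to-many restriction mentioned above.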
Network Model
A network model is a generalization of the hierarchical data model. It allows
many-to-many relationships, so in this model a record can have more than one
parent.
The network model in DBMS can be represented as a graph: it replaces
the hierarchical tree with a graph in which object types are the nodes and
relationships are the edges.
Here you can see all three departments are linked with the director, which was not
possible in the hierarchical data model.
In the network model, there can be many possible paths to reach a node from the
root node (College is the root node in the above case); therefore the data can be
accessed efficiently when compared to the hierarchical data model. On the
other hand, the process of insertion and deletion of data is quite complex.
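A rough sketch of the graph idea follows. The department names and the direction of the edges (departments pointing to the director) are assumptions, since the original image is not available; only College as the root and the three departments linked to the director come from the notes.

```python
# Hypothetical adjacency list: each key points to the records below it.
COLLEGE = {
    "College": ["Dept A", "Dept B", "Dept C"],
    "Dept A": ["Director"],
    "Dept B": ["Director"],
    "Dept C": ["Director"],
}


def parents_of(graph, node):
    """Return every record that has `node` as a child (more than one is allowed)."""
    return sorted(parent for parent, children in graph.items() if node in children)


def all_paths(graph, start, goal, path=()):
    """Return every path from `start` to `goal`, without revisiting a node."""
    path = path + (start,)
    if start == goal:
        return [list(path)]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:
            paths.extend(all_paths(graph, nxt, goal, path))
    return paths
```

Here the Director record has three parents and can be reached from the root along three different paths, which a strict tree would not allow.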
Relational Model
This is the most widely accepted data model. In this model, the database is
represented as a collection of relations in the form of rows and columns of a
two-dimensional table. Each row is known as a tuple (a tuple contains all the data
for an individual record) while each column represents an attribute. For example -
The above table shows a relation "STUDENT" with attributes such as Stu. Id,
Name, and Branch; it consists of 4 records, or tuples.
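A relation like STUDENT can be sketched with Python's built-in sqlite3 module. The column names follow the attributes mentioned above; the four rows are invented placeholders, since the original table is not reproduced in these notes.

```python
import sqlite3


def build_student_relation():
    """Create an in-memory STUDENT relation with four placeholder tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE student ("
        "stu_id INTEGER PRIMARY KEY, name TEXT NOT NULL, branch TEXT NOT NULL)"
    )
    rows = [  # hypothetical data, not taken from the notes
        (1, "Asha", "CSE"),
        (2, "Ravi", "ECE"),
        (3, "Meena", "CSE"),
        (4, "John", "ME"),
    ]
    conn.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)
    return conn


def attributes(conn):
    """Return the attribute (column) names of the relation."""
    cursor = conn.execute("SELECT * FROM student")
    return [column[0] for column in cursor.description]


def tuples(conn):
    """Return all tuples (rows) of the relation."""
    return conn.execute("SELECT * FROM student ORDER BY stu_id").fetchall()
```

Each row returned by tuples() is one tuple of the relation, and each name returned by attributes() is one attribute.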
Object-Oriented Model
Since data is stored as objects, we can easily store audio, video, images, etc. in the
database, which was very difficult and inconvenient to do in the relational model.
As shown in the image below, two objects are connected with each other through
links.
In the above image, we have two objects, Employee and Department, in
which all the data is contained in a single unit (object). They are linked with each
other as they share a common attribute.
For example -
It provides data structures and operations used in the relational model and also
provides features of object-oriented models like classes, inheritance, etc. The only
drawback of this data model is that it is complex and quite difficult to handle.
For example, suppose we design a school database. In this database, the student will
be an entity with attributes like address, name, id, age, etc. The address can be
another entity with attributes like city, street name, pin code, etc., and there will be a
relationship between them.
Components of an ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
An entity that depends on another entity is called a weak entity. The weak entity
doesn't contain any key attribute of its own. The weak entity is represented by a
double rectangle.
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used to
represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
c. Multivalued Attribute
An attribute can have more than one value. Such attributes are known as
multivalued attributes. A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from another attribute is known as a derived
attribute. It can be represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another
attribute like Date of birth.
3. Relationship
a. One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is
known as a one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left and more than one instance of the
entity on the right are associated with the relationship, it is known as a
one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is made by
one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left and only one instance of the
entity on the right are associated with the relationship, it is known as a many-to-one
relationship.
For example, a student enrolls in only one course, but a course can have many
students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one instance of
the entity on the right are associated with the relationship, it is known as a
many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have
many employees.
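One common way to store a many-to-many relationship in tables is a separate linking table, sketched here with Python's built-in sqlite3 module. The table name assignment, the column names, and all rows are assumptions for illustration; the notes only name the Employee and Project entities.

```python
import sqlite3


def build_assignments():
    """Create employee, project, and a linking table for the many-to-many relationship."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE project (proj_id INTEGER PRIMARY KEY, title TEXT);
        -- One row per (employee, project) pair; the pair itself is the key.
        CREATE TABLE assignment (
            emp_id INTEGER REFERENCES employee(emp_id),
            proj_id INTEGER REFERENCES project(proj_id),
            PRIMARY KEY (emp_id, proj_id)
        );
        INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
        INSERT INTO project VALUES (10, 'Payroll'), (20, 'Inventory');
        INSERT INTO assignment VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    return conn


def projects_of(conn, emp_id):
    """Return the titles of all projects an employee is assigned to."""
    rows = conn.execute(
        "SELECT p.title FROM project p "
        "JOIN assignment a ON a.proj_id = p.proj_id "
        "WHERE a.emp_id = ? ORDER BY p.title",
        (emp_id,),
    )
    return [row[0] for row in rows]


def employees_on(conn, proj_id):
    """Return the names of all employees assigned to a project."""
    rows = conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN assignment a ON a.emp_id = e.emp_id "
        "WHERE a.proj_id = ? ORDER BY e.name",
        (proj_id,),
    )
    return [row[0] for row in rows]
```

In this sample data one employee has two projects and one project has two employees, so the relationship is many on both sides.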
Keys
For example, ID is used as a key in the Student table because it is unique for each
student. In the PERSON table, passport_number, license_number, SSN are keys since
they are unique for each person.
Types of keys:
1. Primary key
○ It is the key chosen to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The
most suitable key among them becomes the primary key.
○ In the EMPLOYEE table, ID can be the primary key since it is unique for each
employee. We could instead select License_Number or Passport_Number as the
primary key, since they are also unique.
○ For each entity, the primary key selection is based on requirements and
developers.
2. Candidate key
○ A candidate key is an attribute, or a minimal set of attributes, that can uniquely
identify a tuple. The primary key is chosen from the candidate keys; the remaining
candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. The other
unique attributes, like SSN, Passport_Number, License_Number, etc., are also
candidate keys.
3. Super Key
Super key is an attribute set that can uniquely identify a tuple. A super key is a
superset of a candidate key.
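The difference between a super key and a candidate key can be checked mechanically on sample data. The sketch below is only illustrative: the rows in EMPLOYEES are invented, and uniqueness in one sample does not prove that an attribute set is a key for every possible state of the table.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows used only for this sketch.
EMPLOYEES = [
    {"id": 1, "ssn": "S1", "name": "Asha", "dept": "CSE"},
    {"id": 2, "ssn": "S2", "name": "Ravi", "dept": "CSE"},
    {"id": 3, "ssn": "S3", "name": "Asha", "dept": "ECE"},
    {"id": 4, "ssn": "S4", "name": "Asha", "dept": "CSE"},
]


def is_super_key(rows, attrs):
    """True if no two rows agree on every attribute in `attrs`."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys, smallest attribute sets first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found
```

In this sample, (id, name) is a super key but not a candidate key, because id alone already identifies each row.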
4. Foreign key
○ A foreign key is a column of a table that points to the primary key of
another table.
○ In the EMPLOYEE table, Department_Id is the foreign key, and the two tables
are related through it.
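A foreign key can be tried out with Python's built-in sqlite3 module. The sketch assumes the referenced table is called department and uses invented rows; SQLite checks foreign keys only after PRAGMA foreign_keys = ON is set on the connection.

```python
import sqlite3


def try_insert_employee(department_id):
    """Return True if an employee with this department_id can be inserted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.execute(
        "CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE employee ("
        "id INTEGER PRIMARY KEY, name TEXT, "
        "department_id INTEGER REFERENCES department(department_id))"
    )
    conn.execute("INSERT INTO department VALUES (1, 'Research')")  # hypothetical row
    try:
        conn.execute(
            "INSERT INTO employee VALUES (100, 'Asha', ?)", (department_id,)
        )
        return True
    except sqlite3.IntegrityError:
        # The value does not match any primary key in department.
        return False
    finally:
        conn.close()
```

Inserting an employee for department 1 succeeds, while a department_id with no matching department row is rejected.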
5. Alternate key
For example, the employee relation has two attributes, Employee_Id and PAN_No, that
act as candidate keys. In this relation, Employee_Id is chosen as the primary key, so
the other candidate key, PAN_No, acts as the Alternate key.
6. Composite key
Generalization –
Generalization is the process of extracting common properties from a set of entities
and creating a generalized entity from them. It is a bottom-up approach in which two
or more entities can be generalized to a higher level entity if they have some attributes
in common. For Example, STUDENT and FACULTY can be generalized to a higher
level entity called PERSON as shown in Figure 1. In this case, common attributes
like P_NAME, P_ADD become part of higher entity (PERSON) and specialized
attributes like S_FEE become part of specialized entity (STUDENT).
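Generalization is loosely analogous to inheritance in an object-oriented language, which gives a quick way to see where each attribute lives. The sketch below uses Python dataclasses; the attribute names follow the example above, except f_sal on Faculty, which is a hypothetical specialized attribute not given in the notes.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level entity holding the common attributes."""
    p_name: str
    p_add: str


@dataclass
class Student(Person):
    """Lower-level entity: common attributes plus its own."""
    s_fee: float


@dataclass
class Faculty(Person):
    f_sal: float  # hypothetical attribute, not taken from the notes


def attribute_names(entity_class):
    """Return all attribute names of an entity, inherited ones first."""
    return [f.name for f in fields(entity_class)]
```

A Student object carries P_NAME and P_ADD through Person and adds S_FEE itself, mirroring how the common attributes move up to the generalized entity.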
Specialization –
In specialization, an entity is divided into sub-entities based on their characteristics.
It is a top-down approach where a higher level entity is specialized into two or more
lower level entities. For example, the EMPLOYEE entity in an Employee management
system can be specialized into DEVELOPER, TESTER, etc., as shown in Figure 2. In
this case, common attributes like E_NAME, E_SAL, etc. become part of the higher
entity (EMPLOYEE) and specialized attributes like TES_TYPE become part of the
specialized entity (TESTER).
Aggregation –
An ER diagram is not capable of representing a relationship between an entity and a
relationship, which may be required in some scenarios. In those cases, a relationship
with its corresponding entities is aggregated into a higher level entity. Aggregation is
an abstraction through which we can represent relationships as higher level entity
sets.
For example, an employee working for a project may require some machinery. So, a
REQUIRE relationship is needed between the relationship WORKS_FOR and the entity
MACHINERY. Using aggregation, the WORKS_FOR relationship with its entities
EMPLOYEE and PROJECT is aggregated into a single entity, and the relationship
REQUIRE is created between the aggregated entity and MACHINERY.
Participation Constraint
1. Total Participation – Each entity in the entity set must participate in the
relationship. If each student must enroll in a course, the participation of students will
be total. Total participation is shown by a double line in the ER diagram.
2. Partial Participation – An entity in the entity set may or may NOT participate in
the relationship. If some courses have no students enrolled in them, the
participation of the course will be partial.
The diagram depicts the ‘Enrolled in’ relationship set with Student Entity set having
total participation and Course Entity set having partial participation.
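When such a diagram is turned into tables, total participation is often approximated with a NOT NULL foreign key. The sketch below uses Python's built-in sqlite3 module and assumes each student enrolls in exactly one course, as in the many-to-one example earlier; the table names, column names, and rows are hypothetical.

```python
import sqlite3


def build_enrollment():
    """Student has total participation (course_id NOT NULL); Course has partial."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,
            name TEXT,
            -- NOT NULL: every student must be enrolled in some course.
            course_id INTEGER NOT NULL REFERENCES course(course_id)
        );
        INSERT INTO course VALUES (1, 'DBMS'), (2, 'Networks');
        INSERT INTO student VALUES (1, 'Asha', 1);
        """
    )
    return conn


def can_add_student_without_course(conn):
    """Try to insert a student with no course; total participation should forbid it."""
    try:
        conn.execute("INSERT INTO student VALUES (2, 'Ravi', NULL)")
        return True
    except sqlite3.IntegrityError:
        return False


def courses_without_students(conn):
    """Courses may exist with no students, which is partial participation."""
    rows = conn.execute(
        "SELECT c.title FROM course c "
        "LEFT JOIN student s ON s.course_id = c.course_id "
        "WHERE s.student_id IS NULL ORDER BY c.title"
    )
    return [row[0] for row in rows]
```

The database refuses a student without a course, while a course with no students is accepted without complaint.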
The database can be represented using the notations, and these notations can be
reduced to a collection of tables.
In the database, every entity set or relationship set can be represented in tabular
form.
In the STUDENT table, Age is the derived attribute. It can be calculated at any point
of time by calculating the difference between current date and Date of Birth.
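The calculation of Age from Date of Birth can be sketched as a small Python function. The function name age_on is hypothetical, and the sketch counts completed years only.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Return the age in completed years on the given day."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Because the value is computed on demand, the table only needs to store Date of Birth, and Age never goes out of date.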
Using these rules, you can convert the ER diagram to tables and columns and assign
the mapping between the tables. Table structure for the given ER diagram is as
below: