DBMS_Unit_1_notes
Topics covered: Overview of DBMS, Advantages of DBMS over a file processing system,
Introduction & applications of DBMS, Purpose of database system, Views of data, Database
system architecture, Data independence, Evolution of data models, Degree of data abstraction,
Database users & DBA, Database languages, Database design, Design process, ER
model/diagram, Keys, Attributes, Constraints, Mapping cardinality, Extended ER diagram,
Generalization, Specialization & Aggregation, ER diagram issues, Weak entity, Relational
model, Conversion of ER to relational table.
What is a Database?
A database is a collection of inter-related data, organized so that the data can be retrieved,
inserted and deleted efficiently. It also organizes the data in the form of tables, schemas,
views, reports, etc.
For example: A college database organizes the data about the admin, staff, students,
faculty, etc.
Using the database, you can easily retrieve, insert, and delete the information.
What is RDBMS?
● Widely used database management systems such as MS SQL Server, IBM DB2, Oracle,
MySQL and Microsoft Access are relational database management systems.
● It is called a Relational Database Management System (RDBMS) because it is based on
the relational model introduced by E.F. Codd.
● Data is represented in terms of tuples (rows) in RDBMS.
● The relational database is the most commonly used type of database. It contains a
number of tables, and each table has its own primary key.
● Because the data is organized as a collection of tables, it can be accessed easily in RDBMS.
● Between 1970 and 1972, E.F. Codd published papers proposing the use of the relational
database model.
● RDBMS is based on E.F. Codd's relational model.
● The RDBMS database uses tables to store data. A table is a collection of related data
entries and contains rows and columns to store data.
● A table is the simplest example of data storage in RDBMS.
1 Ajeet 24 B.Tech
2 Aryan 20 C.A
3 Mahesh 21 BCA
4 Ratan 22 MCA
5 Vimal 26 BSC
o Data Definition: It is used for creation, modification, and removal of definition that
defines the organization of data in the database.
o Data Updation: It is used for the insertion, modification, and deletion of the actual
data in the database.
o Data Retrieval: It is used to retrieve the data from the database which can be used
by applications for various purposes.
o User Administration: It is used for registering and monitoring users, maintaining
data integrity, enforcing data security, dealing with concurrency control, monitoring
performance and recovering information corrupted by unexpected failure.
Characteristics of DBMS
o It uses a digital repository established on a server to store and manage the
information.
o It can provide a clear and logical view of the process that manipulates data.
o DBMS contains automatic backup and recovery procedures.
o It supports the ACID properties (atomicity, consistency, isolation, durability), which
keep data in a consistent state in case of failure.
o It can reduce the complex relationship between data.
o It is used to support manipulation and processing of data.
o It is used to provide security of data.
o It allows the database to be viewed from different viewpoints according to the
requirements of the user.
Advantages of DBMS
o Controls database redundancy: It can control data redundancy because all the data
is stored in one centralized database rather than being duplicated across separate
files.
o Data sharing: In DBMS, the authorized users of an organization can share the data
among themselves.
o Easy maintenance: It is easily maintainable due to the centralized nature of
the database system.
o Reduce time: It reduces development time and maintenance needs.
o Backup: It provides backup and recovery subsystems which create automatic
backups of data to guard against hardware and software failures and restore the
data if required.
o Multiple user interfaces: It provides different types of user interfaces, such as
graphical user interfaces and application program interfaces.
File-based systems were an early attempt to computerize the manual system. This is also
called the traditional approach: a decentralized approach in which each department stored
and controlled its own data with the help of a data processing specialist. The main role of
a data processing specialist was to create the necessary computer file structures, manage
the data within those structures, and design application programs that create reports
based on the file data.
Consider an example of a student's file system. The student file will contain information
regarding the student (i.e. roll no, student name, course etc.). Similarly, we have a subject
file that contains information about the subject and the result file which contains the
information regarding the result.
Some fields are duplicated in more than one file, which leads to data redundancy. So to
overcome this problem, we need to create a centralized system, i.e. DBMS approach.
DBMS:
The differences between DBMS and file systems are as follows:
Meaning
o DBMS: DBMS is a collection of data. In DBMS, the user is not required to write the
procedures.
o File system: The file system is a collection of data. In this system, the user has to
write the procedures for managing the data.
Sharing of data
o DBMS: Due to the centralized approach, data sharing is easy.
o File system: Data is distributed in many files, and it may be of different formats, so it
isn't easy to share data.
Data Abstraction
o DBMS: DBMS gives an abstract view of data that hides the details.
o File system: The file system exposes the details of the data representation and
storage of data.
Security and Protection
o DBMS: DBMS provides a good protection mechanism.
o File system: It isn't easy to protect a file under the file system.
Recovery Mechanism
o DBMS: DBMS provides a crash recovery mechanism, i.e., DBMS protects the user from
system failure.
o File system: The file system doesn't have a crash recovery mechanism, i.e., if the
system crashes while entering some data, then the content of the file will be lost.
Manipulation Techniques
o DBMS: DBMS contains a wide variety of sophisticated techniques to store and retrieve
the data.
o File system: The file system can't efficiently store and retrieve the data.
Concurrency Problems
o DBMS: DBMS takes care of concurrent access of data using some form of locking.
o File system: In the file system, concurrent access has many problems, like redirecting
the file while deleting some information or updating some information.
Where to use
o DBMS: The database approach is used in large systems that interrelate many files.
o File system: The file system approach is suited to small systems with few
interrelated files.
Cost
o DBMS: The database system is expensive to design.
o File system: The file system approach is cheaper to design.
Data Redundancy and Inconsistency
o DBMS: Due to the centralization of the database, the problems of data redundancy and
inconsistency are controlled.
o File system: The files and application programs are created by different programmers,
so there is a lot of duplication of data, which may lead to inconsistency.
Structure
o DBMS: The database structure is complex to design.
o File system: The file system approach has a simple structure.
Data Independence
o DBMS: Data independence exists, and it can be of two types: Logical Data Independence
and Physical Data Independence.
o File system: In the file system approach, there exists no data independence.
Integrity Constraints
o DBMS: Integrity constraints are easy to apply.
o File system: Integrity constraints are difficult to implement in a file system.
Flexibility
o DBMS: Changes to the content of the data stored in any system are often a necessity,
and these changes are made more easily with a database approach.
o File system: The flexibility of the system is less as compared to the DBMS approach.
3. Banking –
A database management system is used to store the transaction information of customers
in the database.
Besides that, students' registration details, grades, courses, fees, attendance, results,
and so forth are all stored in the database.
8. Finance –
A database management system is used for storing information about sales, holdings and
purchases of financial instruments, such as stocks and bonds, in a database.
them. They store information about employees' salaries, tax, and work with the
help of a database management system (DBMS).
11. Manufacturing –
Manufacturing companies make different types of products and sell them regularly. To
keep information about their products, such as bills, purchases, quantities, and supply
chain management, a database management system (DBMS) is used.
The diagram given below explains how data is transformed into information, then into
knowledge, and then into action in the DBMS −
5. Various Views of Data
This refers to how data is actually stored in the database, and what data and data
structures the database uses. To describe all of this, the database provides the user with
views, and these are
● Data abstraction
● Instances and schemas.
Data abstraction
Data in a database is stored using very complex data structures, so a user who wants to
access some data could not do so easily if he had to go through these data structures. To
simplify the interaction between the user and the database, the DBMS hides information
that is not of interest to the user; this is called data abstraction. The developer thus
hides the complexity from the user and presents an abstract view of the data.
● Physical level / internal level: This is the lowest level of data abstraction, which
describes how data is actually stored in the database. This level basically describes
the data structures and the access paths / indexing used for accessing files.
● Logical level / conceptual level: The next level of abstraction describes what
data is stored in the database and what relationships exist among
those data.
● View level / external level: At this level the user only interacts with the database and
the complexity remains hidden. The user sees the data, and there may be many views of
the same data, such as a chart and a graph.
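The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration: the student table, the index and the view (and their names) are assumptions made up for the example, not part of these notes. The index stands in for a physical-level decision, the table definition for the logical level, and the view for the view level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Logical level: what data is stored and how it is structured.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER, fee REAL)"
)

# Physical level: an access path (index). Queries never have to mention it.
conn.execute("CREATE INDEX idx_student_name ON student(name)")

# View level: one group of users sees only part of the data.
conn.execute("CREATE VIEW student_public AS SELECT roll_no, name FROM student")

conn.execute("INSERT INTO student VALUES (1, 'Ajeet', 24, 5000.0)")
view_rows = conn.execute("SELECT * FROM student_public").fetchall()
print(view_rows)
```

Dropping or changing the index would not change what the view returns, which is the point of separating the levels.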
● The DBMS design depends upon its architecture. The basic client/server
architecture is used to deal with a large number of PCs, web servers, database
servers and other components that are connected with networks.
● The client/server architecture consists of many PCs and a workstation which are
connected via the network.
● DBMS architecture depends upon how users are connected to the database to get
their request done.
Types of DBMS Architecture
Database architecture can be seen as single-tier or multi-tier. Logically, however, database
architecture is of two types: 2-tier architecture and 3-tier architecture.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user
works directly on the DBMS and uses it.
o Any changes done here will directly be done on the database itself. It doesn't provide
a handy tool for end users.
o The 1-Tier architecture is used for development of the local application, where
programmers can directly communicate with the database for the quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as basic client-server. In the two-tier architecture,
applications on the client end can directly communicate with the database at the
server side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs are run on the client side.
o The server side is responsible for providing functionalities such as query processing
and transaction management.
o To communicate with the DBMS, the client-side application establishes a connection
with the server side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this
architecture, the client can't directly communicate with the server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application
server. The database also has no idea about any other user beyond the application.
o The 3-Tier architecture is used for large web applications.
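A minimal single-process sketch of the 3-tier idea, assuming Python's sqlite3 module. The account table and the functions get_balance and client_show_balance are hypothetical names invented for this illustration; in a real deployment the three tiers would run on separate machines and talk over a network.

```python
import sqlite3

# Database tier: only the application tier holds this connection.
_db = sqlite3.connect(":memory:")
_db.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)")
_db.execute("INSERT INTO account VALUES (1, 'Ratan', 100.0)")


def get_balance(account_id):
    """Application tier: validates the request, then queries the database."""
    if not isinstance(account_id, int) or account_id <= 0:
        raise ValueError("invalid account id")
    row = _db.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return None if row is None else row[0]


def client_show_balance(account_id):
    """Client tier: knows only the application function, not the database."""
    balance = get_balance(account_id)
    return "no such account" if balance is None else f"Balance: {balance:.2f}"


message = client_show_balance(1)
print(message)
```

The client code never sees a table name or an SQL statement, which mirrors the statement that the end user has no idea about the database behind the application server.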
o Physical data independence is used to separate conceptual levels from the internal
levels.
o Physical data independence occurs at the logical interface level.
Managing data effectively was essential, so data models originated to solve the issues of
file systems. Here are the Data Models in DBMS −
Hierarchical Model
● In the Hierarchical Model, a hierarchical relation is formed by a collection of relations
and forms a tree-like structure.
● One of the first and most popular hierarchical database systems is the Information
Management System (IMS), developed by IBM.
Example: The hierarchy shows an Employee can be an Intern, on Contract or Full-Time.
Sub-levels show that a Full-Time Employee can be hired as a Writer, Senior Writer or
Editor:
Disadvantages
● Implementation is complex.
● This model has to deal with anomalies like Insert, Update and Delete.
● Maintenance is difficult, since changes made in the database may require changes to
the entire database structure.
Network Model
● The Hierarchical Model creates hierarchical tree with parent/ child relationship,
whereas the Network Model has graph and links.
● Relationships are defined in the form of links, and the model handles many-to-many
relations. This means that a record can have more than one parent.
Example
Disadvantages
● Pointers bring complexity, since the records are based on pointers and graphs.
● Changes in the database are not easy, which makes it hard to achieve structural
independence.
Relational Model
● A relational model groups data into one or more tables. These tables are related to
each other through common fields.
● The data is represented in the form of rows and columns, i.e. tables:
Example
Let us see an example of two relations, <Employee> and <Department>, linked to each
other by DepartmentID, which is a foreign key of the <Employee> table and the primary key
of the <Department> table.
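The link between the two relations can be sketched with Python's sqlite3 module. Only the table names and DepartmentID come from the text above; the other columns and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE Department (DepartmentID INTEGER PRIMARY KEY, DepartmentName TEXT)")
conn.execute("""CREATE TABLE Employee (
    EmployeeID INTEGER PRIMARY KEY,
    EmployeeName TEXT,
    DepartmentID INTEGER REFERENCES Department(DepartmentID))""")

conn.execute("INSERT INTO Department VALUES (10, 'Finance')")
conn.execute("INSERT INTO Employee VALUES (1, 'Mahesh', 10)")

# Join the two relations through the shared DepartmentID column.
joined = conn.execute("""SELECT e.EmployeeName, d.DepartmentName
                         FROM Employee e JOIN Department d
                         ON e.DepartmentID = d.DepartmentID""").fetchall()

# An employee pointing at a non-existent department is rejected.
try:
    conn.execute("INSERT INTO Employee VALUES (2, 'Vimal', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(joined, rejected)
```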
Advantages
● The Relational Model largely avoids the issues we saw in the previous two
models, i.e. update, insert and delete anomalies.
● Changes in the database do not require you to change the complete database.
● Implementation of a Relational Model is easy.
● Maintaining a Relational Model is not a tiresome task.
Disadvantages
● Inefficiencies in the database can stay hidden and surface only when the model holds
large volumes of data.
● The overheads of the relational data model bring the cost of using powerful
hardware and devices.
The main purpose of data abstraction is to achieve data independence in order to save time
and cost required when the database is modified or altered.
10. Database Users & Database Administrator (DBA)
Database users are categorized based upon their interaction with the database.
There are seven types of database users in DBMS.
Database Administrator (DBA):
A Database Administrator (DBA) is a person or team who defines the schema and also
controls the 3 levels of the database. The DBA creates a new account ID and
password for a user who needs to access the database. The DBA is also responsible for
providing security to the database and allows only authorized users to
access or modify the database. The DBA also monitors recovery and backup and provides
technical support. The DBA has a DBA account in the DBMS, which is called a system or
superuser account. The DBA repairs damage caused by hardware and/or software
failures.
Naive / Parametric End Users:
Parametric end users are unsophisticated users who don't have any DBMS knowledge but
frequently use database applications in their daily life to get the desired results.
For example, users booking railway tickets are naive users. A clerk in a bank is a naive
user because he or she doesn't have any DBMS knowledge but still uses the database to
perform the given task.
System Analyst:
System Analyst is a user who analyzes the requirements of parametric end users. They
check whether all the requirements of end users are satisfied.
Sophisticated Users:
Sophisticated users can be engineers, scientists, or business analysts who are familiar with
the database. They can develop their own database applications according to their
requirements. They don't write program code; instead they interact with the database by
writing SQL queries directly through the query processor.
Application Programmers:
Application programmers are the back-end programmers who write the code for the
application programs. They are computer professionals. These programs could be written in
programming languages such as Visual Basic, Developer, C, FORTRAN, COBOL etc.
11. Database Languages
o A DBMS has appropriate languages and interfaces to express database queries and
updates.
o Database languages can be used to read, store and update the data in the database.
Types of Database Language
These commands are used to update the database schema, which is why they come under Data
Definition Language.
2. Data Manipulation Language
DML stands for Data Manipulation Language. It is used for accessing and manipulating data
in a database. It handles user requests.
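As a small illustration of the difference between defining a schema and manipulating data, here is a sketch using Python's sqlite3 module. The student table and its rows are assumptions made for the example. The CREATE TABLE statement is data definition; the INSERT, UPDATE, DELETE and SELECT statements are data manipulation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines the schema.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, course TEXT)")

# DML: works on the rows stored under that schema.
conn.execute("INSERT INTO student VALUES (1, 'Ajeet', 'B.Tech')")
conn.execute("INSERT INTO student VALUES (2, 'Aryan', 'C.A')")
conn.execute("UPDATE student SET course = 'MCA' WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")

remaining = conn.execute("SELECT roll_no, name, course FROM student").fetchall()
print(remaining)
```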
(But in Oracle database, the execution of data control language does not have the
feature of rolling back.)
There are the following operations which have the authorization of Revoke:
TCL (Transaction Control Language) is used to manage the changes made by DML
statements. These statements can be grouped into a logical transaction.
o Commit: It is used to save the transaction to the database.
o Rollback: It is used to restore the database to its state as of the last Commit.
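Commit and rollback can be tried out with Python's sqlite3 module, whose connection object exposes commit() and rollback(). The account table and the amounts below are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()  # Commit: the inserted row is now saved

conn.execute("UPDATE account SET balance = balance - 500 WHERE id = 1")
conn.rollback()  # Rollback: undo everything done since the last commit

balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
print(balance)
```

After the rollback the balance is back at the committed value, because the UPDATE was never committed.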
12. Database Design
Database design can be generally defined as a collection of tasks or processes that support
the design, development, implementation, and maintenance of an enterprise data
management system. A properly designed database reduces maintenance cost, improves
data consistency, and makes cost-effective use of disk storage space. A database therefore
needs to be designed with care. The designer should follow the constraints and decide how
the data elements correlate and what kind of data must be stored.
The main objective of database design is to produce logical and physical design
models of the proposed database system. The logical model concentrates on the
requirements of the data and is developed independently of physical considerations such
as how the data is stored. The physical database design model, on the other hand, translates
the logical design model into an implementation on physical media, using hardware
resources and software systems such as a Database Management System (DBMS).
The importance of database design can be explained in terms of the following points.
1. Database designs provide the blueprints of how the data is going to be stored in a
system. A proper design of a database highly affects the overall performance of any
application.
2. The designing principles defined for a database give a clear idea of the behavior of
any application and how the requests are processed.
3. A proper database design meets all the requirements of users.
4. Lastly, the processing time of an application is greatly reduced if the constraints of
designing a highly efficient database are properly implemented.
Life Cycle
Requirement Analysis
First of all, planning has to be done on the basic requirements of the project
under which the design of the database is to be taken forward. The stages can be defined
as:
Planning - This stage is concerned with planning the entire DDLC (Database Development
Life Cycle). The strategic considerations are taken into account before proceeding.
System definition - This stage covers the boundaries and scopes of the proper database
after planning.
Database Designing
The next step involves designing the database according to the user requirements
and splitting the design into various models so that load or heavy dependencies are not
imposed on a single aspect. This model-centric approach is where the logical and physical
models play a crucial role.
Physical Model - The physical model is concerned with the practices and implementations
of the logical model.
Logical Model - This stage is primarily concerned with developing a model based on the
proposed requirements. The entire model is designed on paper without any
implementation or adopting DBMS considerations.
Implementation
The last step covers the implementation methods and checking that the behavior
matches our requirements. This is ensured by continuous integration testing of the database
with different data sets and by conversion of data into a machine-understandable form.
Data manipulation is the primary focus of these steps, where queries are run to
check whether the application is designed satisfactorily or not.
Data conversion and loading - This section is used to import and convert data from the
old to the new system.
Testing - This stage is concerned with error identification in the newly implemented
system. Testing is a crucial step because it checks the database directly against the
requirement specifications.
The process of designing a database involves various conceptual approaches that need
to be kept in mind. An ideal and well-structured database design must be able to:
Logical
A logical data model generally describes the data in as many details as possible, without
having to be concerned about the physical implementations in the database. Features of
logical data model might include:
14. Entity-Relationship Model
Entity
An entity is a real-world object, either animate or inanimate, that is easily
identifiable. For example, in a school database, students, teachers, classes, and courses
offered can be considered as entities. All these entities have some attributes or properties
that give them their identity.
An entity set is a collection of similar types of entities. An entity set may contain entities
with attribute sharing similar values. For example, a Students set may contain all the
students of a school; likewise a Teachers set may contain all the teachers of a school from
all faculties. Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot
be negative, etc.
Types of Attributes
● Simple attribute − Simple attributes are atomic values, which cannot be divided
further. For example, a student's phone number is an atomic value of 10 digits.
● Composite attribute − Composite attributes are made of more than one simple
attribute. For example, a student's complete name may have first_name and
last_name.
● Derived attribute − Derived attributes are the attributes that do not exist in the
physical database, but their values are derived from other attributes present in the
database. For example, average_salary in a department should not be saved directly
in the database; instead it can be derived. As another example, age can be derived
from date_of_birth.
● Single-value attribute − Single-value attributes contain a single value. For example
− Social_Security_Number.
● Multi-value attribute − Multi-value attributes may contain more than one value.
For example, a person can have more than one phone number, email_address, etc.
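The attribute types above can be sketched with Python dataclasses. The class names, the last name "Kumar", the phone numbers and the dates are all assumptions invented for the example; only first_name, last_name, date_of_birth and the idea of a derived age come from the list.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:  # composite attribute built from simple attributes
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_no: int          # simple, single-value attribute
    name: Name            # composite attribute
    date_of_birth: date   # stored attribute
    phone_numbers: list = field(default_factory=list)  # multi-value attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        dob = self.date_of_birth
        had_birthday = (today.month, today.day) >= (dob.month, dob.day)
        return today.year - dob.year - (0 if had_birthday else 1)


s = Student(1, Name("Ajeet", "Kumar"), date(2000, 5, 20), ["9000000001", "9000000002"])
print(s.age(date(2024, 5, 19)), s.age(date(2024, 5, 20)))
```

Keeping age as a method rather than a field follows the point made above: a derived value is computed when needed instead of being saved.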
Relationship
The association among entities is called a relationship. For example, an
employee works_at a department, a student enrolls in a course. Here, Works_at and
Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship
too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the
relationship.
● Binary = degree 2
● Ternary = degree 3
● n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set which can be associated with
the number of entities of the other set via a relationship set.
● One-to-one − One entity from entity set A can be associated with at most one entity
of entity set B and vice versa.
● One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B; however, an entity from entity set B can be associated with at
most one entity of entity set A.
● Many-to-one − More than one entity from entity set A can be associated with at
most one entity of entity set B; however, an entity from entity set B can be
associated with more than one entity from entity set A.
● Many-to-many − One entity from A can be associated with more than one entity
from B and vice versa.
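The four cases can be told apart mechanically. The following is a sketch only: the function mapping_cardinality is a hypothetical helper, and it classifies what a given set of (a, b) pairs shows, whereas a mapping cardinality in a design is a rule that every possible set of pairs must obey.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify a relationship set, given as (a, b) pairs, from entity set A to B."""
    a_to_b, b_to_a = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())    # some a linked to several b
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())  # some b linked to several a
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# A student enrolled in two courses, and a course taken by two students.
print(mapping_cardinality([("s1", "c1"), ("s1", "c2"), ("s2", "c1")]))
```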
Let us now learn how the ER Model is represented by means of an ER diagram. Any object,
for example, entities, attributes of an entity, relationship sets, and attributes of
relationship sets, can be represented with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set
they represent.
Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses.
Every ellipse represents one attribute and is directly connected to its entity (rectangle).
If the attributes are composite, they are further divided in a tree like structure. Every
node is then connected to its attribute. That is, composite attributes are represented by
ellipses that are connected with an ellipse.
Multivalued attributes are depicted by a double ellipse.
Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is
written inside the diamond box. All the entities (rectangles) participating in a relationship
are connected to it by a line.
● One-to-many − When more than one instance of an entity on one side is associated
with the relationship, it is marked as '1:N'. The following image reflects that only one
instance of an entity on the left and more than one instance of an entity on the right
can be associated with the relationship. It depicts a one-to-many relationship.
● Many-to-one − When more than one instance of an entity on one side is associated
with the relationship, it is marked as 'N:1'. The following image reflects that more than
one instance of an entity on the left and only one instance of an entity on the right can
be associated with the relationship. It depicts a many-to-one relationship.
● Many-to-many − The following image reflects that more than one instance of an
entity on the left and more than one instance of an entity on the right can be
associated with the relationship. It depicts a many-to-many relationship.
Participation Constraints
● Total participation − Each entity is involved in the relationship. Total participation
is represented by double lines.
● Partial participation − Not all entities are involved in the relationship. Partial
participation is represented by single lines.
Example of an ER-Diagram
● Airline Reservation System
EER is a high-level data model that incorporates extensions to the original ER
model. Enhanced ER diagrams are high-level models that represent the requirements and
complexities of a complex database.
These concepts are used to create EE-R diagrams.
Super class shape has sub groups: Triangle, Square and Circle.
Sub classes are the group of entities with some unique attributes. Sub class inherits the
properties and attributes from super class.
Generalization –
Generalization is the process of extracting common properties from a set of entities and
creating a generalized entity from them. It is a bottom-up approach in which two or more
entities can be generalized to a higher level entity if they have some attributes in
common. For example, STUDENT and FACULTY can be generalized to a higher level
entity called PERSON as shown in Figure 1. In this case, common attributes like
P_NAME, P_ADD become part of the higher entity (PERSON) and specialized attributes like
S_FEE become part of the specialized entity (STUDENT).
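In object-oriented terms this resembles inheritance. A rough sketch with Python dataclasses follows; the attribute f_sal on Faculty and the sample values are assumptions added only so that each lower-level entity has something of its own.

```python
from dataclasses import dataclass


@dataclass
class Person:            # higher-level (generalized) entity
    p_name: str
    p_add: str


@dataclass
class Student(Person):   # lower-level entity: inherits p_name and p_add
    s_fee: float


@dataclass
class Faculty(Person):   # lower-level entity
    f_sal: float         # hypothetical specialized attribute


st = Student("Ratan", "Pune", 45000.0)
fa = Faculty("Vimal", "Delhi", 90000.0)
print(st, fa)
```

The common attributes are declared once on Person, while s_fee exists only on Student.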
Specialization –
In specialization, an entity is divided into sub-entities based on their characteristics. It is a
top-down approach where higher level entity is specialized into two or more lower level
entities. For Example, EMPLOYEE entity in an Employee management system can be
specialized into DEVELOPER, TESTER etc. as shown in Figure 2. In this case, common
attributes like E_NAME, E_SAL etc. become part of higher entity (EMPLOYEE) and
specialized attributes like TES_TYPE become part of specialized entity (TESTER).
18. Aggregation
An ER diagram is not capable of representing a relationship between an entity and a
relationship, which may be required in some scenarios. In those cases, a relationship
with its corresponding entities is aggregated into a higher level entity. Aggregation is an
abstraction through which we can represent relationships as higher level entity sets.
For example, an employee working for a project may require some machinery. So, a
REQUIRE relationship is needed between the relationship WORKS_FOR and the entity
MACHINERY. Using aggregation, the WORKS_FOR relationship with its entities EMPLOYEE
and PROJECT is aggregated into a single entity, and the relationship REQUIRE is created
between the aggregated entity and MACHINERY.
19. ER Design Issues
Here, we will discuss the basic design issues of an ER database schema in the following
points:
'parent' that may relate to a child, his father, as well as his mother. Such relationship
can also be represented by two binary relationships i.e., mother and father that may
relate to their child. Thus, it is possible to represent a non-binary relationship by a
set of distinct binary relationships.
Thus, it requires the overall knowledge of each part that is involved in designing and
modeling an ER diagram. The basic requirement is to analyze the real-world enterprise and
the connectivity of one entity or attribute with other.
Strong Entity
A strong entity has a primary key, and its existence does not depend on any other entity.
Weak entities are dependent on a strong entity.
Continuing our previous example, Professor is a strong entity here, and the primary key is
Professor_ID.
Weak Entity
A weak entity in DBMS does not have a primary key of its own and depends on a parent
(strong) entity for its existence.
Continuing our previous example, Professor is a strong entity, and the primary key is
Professor_ID. However, another entity is Professor_Dependents, which is our Weak
Entity.
<Professor_Dependents>
This is a weak entity since its existence is dependent on another entity Professor, which
we saw above. A Professor has Dependents.
Example of Strong and Weak Entity
The example of a strong and weak entity can be understood by the below figure.
ID is the primary key (represented with a line) and the Name in the Dependent entity is
called the partial key (represented with a dotted line).
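One common way to store a weak entity in a relational database is to combine the owner's primary key with the partial key. A sketch using Python's sqlite3 module follows; the column Professor_Name, the sample rows and the ON DELETE CASCADE rule are assumptions for illustration, not something these notes prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE Professor (Professor_ID INTEGER PRIMARY KEY, Professor_Name TEXT)")

# Weak entity: Name is only a partial key, so the owner's key joins it in the primary key.
conn.execute("""CREATE TABLE Professor_Dependents (
    Professor_ID INTEGER NOT NULL REFERENCES Professor(Professor_ID) ON DELETE CASCADE,
    Name TEXT NOT NULL,
    PRIMARY KEY (Professor_ID, Name))""")

conn.execute("INSERT INTO Professor VALUES (1, 'Mahesh')")
conn.execute("INSERT INTO Professor VALUES (2, 'Ratan')")
conn.execute("INSERT INTO Professor_Dependents VALUES (1, 'Asha')")
conn.execute("INSERT INTO Professor_Dependents VALUES (2, 'Asha')")  # same Name, different owner

# Deleting the owner removes its dependents as well.
conn.execute("DELETE FROM Professor WHERE Professor_ID = 1")
left = conn.execute("SELECT Professor_ID, Name FROM Professor_Dependents").fetchall()
print(left)
```

Two dependents may share a Name because Name alone does not identify a row, and a dependent cannot outlive its professor.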
The database can be represented using the notations, and these notations can be reduced to
a collection of tables.
In the database, every entity set or relationship set can be represented in tabular form.
There are some points for converting the ER diagram to the table:
In the given ER diagram, LECTURE, STUDENT, SUBJECT and COURSE form individual
tables.
In the STUDENT entity, STUDENT_NAME and STUDENT_ID form the columns of the STUDENT
table. Similarly, COURSE_NAME and COURSE_ID form the columns of the COURSE table and so
on.
In the given ER diagram, COURSE_ID, STUDENT_ID, SUBJECT_ID, and LECTURE_ID are the
key attributes of their entities.
o The multivalued attribute is represented by a separate table.
In the given ER diagram, student address is a composite attribute. It contains CITY, PIN,
DOOR#, STREET, and STATE. In the STUDENT table, each of these component attributes
becomes an individual column.
In the STUDENT table, Age is the derived attribute. It can be calculated at any point of time
by calculating the difference between current date and Date of Birth.
Using these rules, you can convert the ER diagram to tables and columns and assign the
mapping between the tables. Table structure for the given ER diagram is as below:
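As an illustration of these rules rather than the full table structure, here is a sketch of the STUDENT part using Python's sqlite3 module. The DOB column name, the STUDENT_PHONE table (standing in for some multivalued attribute), the sample row and the fixed reference date are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Entity -> table; key attribute -> primary key;
# composite address -> its simple parts as columns; derived Age is not stored.
conn.execute("""CREATE TABLE STUDENT (
    STUDENT_ID INTEGER PRIMARY KEY,
    STUDENT_NAME TEXT,
    DOB TEXT,
    DOOR TEXT, STREET TEXT, CITY TEXT, STATE TEXT, PIN TEXT)""")

# Multivalued attribute -> separate table (PHONE is a hypothetical example).
conn.execute("""CREATE TABLE STUDENT_PHONE (
    STUDENT_ID INTEGER REFERENCES STUDENT(STUDENT_ID),
    PHONE TEXT,
    PRIMARY KEY (STUDENT_ID, PHONE))""")

conn.execute(
    "INSERT INTO STUDENT VALUES (1, 'Ajeet', '2000-05-20', '12', 'MG Road', 'Pune', 'MH', '411001')"
)
conn.executemany(
    "INSERT INTO STUDENT_PHONE VALUES (?, ?)", [(1, "9000000001"), (1, "9000000002")]
)

# Derived attribute: approximate age in whole years at a fixed reference date.
age = conn.execute(
    "SELECT CAST((julianday('2024-06-30') - julianday(DOB)) / 365.25 AS INTEGER) FROM STUDENT"
).fetchone()[0]
phones = [r[0] for r in conn.execute("SELECT PHONE FROM STUDENT_PHONE ORDER BY PHONE")]
print(age, phones)
```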