Dbms Notes 1
Presented by:
Anil Kr. Chanchal
Assistant Professor
Computer Engineering & Applications Department,
GLA University, Mathura
Syllabus: Module 1
Basic Concepts:
– Characteristics of the Database,
– Database & Database users, DBA
– Schema & Instances,
– DBMS Architecture & Data Independence,
– Data Base Languages,
– Data Models- Relational, Network, Hierarchical.
– Data Modeling using the Entity Relationship Approach.
Relational Model concepts:
– Relational Data Model Concepts, Relational Algebra
File Organization Techniques:
– Sequential file organization,
– Index File Organization,
– Random file organization.
Syllabus: Module 2
Introduction on SQL:
– Data definition and Data manipulation command in SQL,
– views and queries in SQL,
– Specifying Constraints & index in SQL.
Normalization:
– Functional dependencies, Lossless join & dependency preserving
decomposition.
– Normal forms based on keys (1NF, 2NF, 3NF & BCNF),
– De-normalization
Transaction:
– Introduction,
– Properties (Atomicity, Consistency, Isolation, Durability),
– Transaction State.
Types of Database
– Concept of object oriented data base, Distributed database, Client server
database.
Books to Follow
Text Book:
• Database System Concepts 7th Edition Avi Silberschatz Henry
F. Korth, S. Sudarshan 2019 McGraw-Hill.
• Fundamentals of Database Systems, 7th Edition, Ramez
Elmasri, Shamkant B. Navathe, 2016 Pearson.
Reference Books:
• Bipin Desai, (2006), An Introduction to Database System, West
Pub. Co.
• Jeff Parkins and Bryan Morgan, Teach Yourself SQL in 14 days
Data
• Known facts that can be recorded and that have implicit meaning.
Fundamental of
Database Management System
BCAC0005
Lecture- 2
Definitions
• Data:
– Known facts that can be recorded and have an implicit meaning.
• Database:
– A collection of related data.
• Database Management System (DBMS):
DBMS contains information about a particular enterprise
– Collection of interrelated data
– Set of programs to access the data
– An environment that is both convenient and efficient to use
• Mini-world:
– Some part of the real world about which data is stored in a database.
For example, student grades and transcripts at a university.
University Database Example
• Atomicity of updates
– Failures may leave database in an inconsistent state with partial updates carried
out
– Example: Transfer of funds from one account to another should either complete or
not happen at all
• Concurrent access by multiple users
– Concurrent access needed for performance
– Uncontrolled concurrent accesses can lead to inconsistencies
• Example: Two people reading a balance (say 100) and updating it by
withdrawing money (say 50 each) at the same time
• Security problems
– Hard to provide user access to some, but not all, data
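The funds-transfer example above can be sketched with a transaction. This is a minimal illustration using Python's built-in sqlite3 module; the table name account, the account ids and the balances are hypothetical and not part of the lecture.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst`, or change nothing at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 20)])
conn.commit()

ok = transfer(conn, "A", "B", 50)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # both updates are rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second call debits and credits inside the transaction, then fails the balance check, so neither update survives.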
Database Users
1. Database Administrator (DBA)
• A person who has central control over the system is called a
database administrator (DBA).
• Functions of a DBA include:
– Schema definition
– Storage structure and access-method definition
– Schema and physical-organization modification
– Granting of authorization for data access
– Routine maintenance
– Periodically backing up the database
– Ensuring that enough free disk space is available for normal
operations, and upgrading disk space as required
– Monitoring jobs running on the database
Database Users
2. Database designers
• Responsible for identifying the type of data to be stored in
the database and for choosing appropriate structures to
represent and store this data.
• Communicate with all prospective database users in order to
understand their requirements and to create a design that
meets these requirements.
• In many cases, the designers are on the staff of the DBA and
may be assigned other staff responsibilities after the database
design is completed.
Database Users
3. End Users
a) Casual end users
b) Naive or parametric end users
c) Sophisticated end users
View of Data
• A major purpose of a database system is to provide users with
an abstract view of the data.
– Data models
• A collection of conceptual tools for describing data,
data relationships, data semantics, and consistency
constraints.
– Data abstraction
• Hide the complexity of data structures to represent
data in the database from users through several levels
of data abstraction.
Source: https://selfstudynote.blogspot.com/2016/05/database-schema-and-instance.html
BCAC00020 Fundamental of DBMS
Logical Data Independence
• Logical data is data about database, that is, it stores
information about how data is managed inside.
• For example, a table (relation) stored in the database and all
its constraints, applied on that relation.
• Logical data independence is a kind of mechanism, which
liberalizes itself from actual data stored on the disk.
• If we do some changes on table format, it should not change
the data residing on the disk.
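One way to picture logical data independence is an application that reads through a view while the underlying table gains a column. This is a sketch using sqlite3; the names student, student_names and program are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")
# The application only ever reads through this view.
conn.execute("CREATE VIEW student_names AS SELECT id, name FROM student")

def read_names(conn):
    """Application query; it knows nothing about the table's full layout."""
    return conn.execute("SELECT id, name FROM student_names ORDER BY id").fetchall()

before = read_names(conn)
# Change the table format: add a column to the logical schema.
conn.execute("ALTER TABLE student ADD COLUMN program TEXT")
after = read_names(conn)
```

The application query returns the same rows before and after the schema change, and the existing row did not need to be rewritten by the application.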
Fundamental of
Database Management System
BCAC0020
Lecture- 5 / DBMS Architecture
DBMS Architecture
• A database architect develops and implements software to
meet the needs of users.
• The design of a DBMS depends on its architecture.
• It can be centralized or decentralized or hierarchical.
• The architecture of a DBMS can be classified into:
1-tier architecture
2-tier architecture
3-tier architecture
Source: https://medium.com/oceanize-geeks/concepts-of-database-architecture-dfdc558a93e4
1-tier architecture
• The database is directly available to the user.
• Any changes done here will directly be done on the database itself.
• It doesn't provide a handy tool for end users.
• The 1-tier architecture is used for development of the local
application, where programmers can directly communicate with
the database for a quick response.
[Figure: Application Program + Database]
2-tier architecture
• Same as the basic client-server model.
• Client side: the user interfaces and application programs.
• Server side: query processing and transaction management
over the database.
• Applications on the client end can directly communicate with
the database at the server side.
• To communicate with the DBMS, the client-side application
establishes a connection with the server side.
• For this interaction, APIs such as ODBC and JDBC are used.
2-tier architecture
[Figure: the Client (user interfaces + application programs) communicates over the network, through ODBC/JDBC, with the Server (query processing, transaction management, database)]
2-tier architecture
• Advantage
– Maintenance and understanding are easier
• Disadvantage
– Poor performance when there are a large number of users.
3-tier architecture
[Figure: 3-tier layout with the labels Server (database, query processing, transaction management), Application Server, Network, Application Client, and Client (user interfaces + application programs)]
3-tier architecture
• Application layer (business logic layer) between the user and
the DBMS
• The application layer is responsible for communicating the user's
request to the DBMS and sending the response from the
DBMS back to the user.
• The application layer processes functional logic, constraints, and
rules before passing data to the user or down to the DBMS.
• Three tier architecture is the most popular DBMS
architecture.
3-tier architecture
The goal of Three-tier architecture is:
• To separate the user applications and physical database
• Proposed to support DBMS characteristics
• Program-data independence
• Support of multiple views of the data
• This type of architecture is used in case of large web
applications.
Question
Where do we use 1-tier, 2-tier, and 3-tier architecture?
Answer
1 tier: Development of a local application
2 tier: Attendance Management System, Library Management System
3 tier: Any large website (IRCTC, Facebook, etc.)
Fundamental of
Database Management System
BCAC0020
Lecture- 6 / Database Languages
Database Language
1. Data Definition Language
• Define database structure.
• It is used to create,
– Schema, tables, indexes, constraints, etc.
in the database.
• Used to store metadata.
• Metadata includes:
– the number of tables and schemas,
– their names,
– indexes,
– columns in each table,
– constraints, etc.
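As a small illustration of DDL and the metadata it produces, the sketch below creates a table and an index in SQLite and then reads the catalog back. The names student and idx_student_name are hypothetical; sqlite_master and PRAGMA table_info are SQLite-specific, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# Metadata: SQLite records every table and index in its catalog table.
objects = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name").fetchall()

# Metadata: the columns of one table (the column name is the second field).
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
```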
Tasks that come under DDL
Database model
• It shows the logical structure of a database, including the
relationships and constraints
• It determines how data can be stored and accessed.
• Most data models can be represented by an accompanying
database diagram.
Types of database models
Design Phases
1. Conceptual Design
Once all the requirements have been collected and analyzed, the next step is to
create a conceptual schema for the database, using a high level conceptual data
model. The result of this phase is an Entity-Relationship (ER) diagram or UML
class diagram. It describes how different entities (objects, items) are related to
each other. It also describes what attributes (features) each entity has. It includes
the definitions of all the concepts (entities, attributes) of the
application area. During or after the conceptual schema design, the basic data
model operations can be used to specify the high-level user operations identified
during the functional analysis. This also serves to confirm that the conceptual
schema meets all the identified functional requirements.
Design Phases
2. Logical Design
The result of the logical design phase (or data model mapping
phase) is a set of relation schemas. The ER diagram or class
diagram is the basis for these relation schemas. Creating the
relation schemas is a largely mechanical operation: there are
rules for how the ER model or class diagram is translated into
relation schemas. The relation schemas are the basis for
table definitions. In this phase (if not done in the previous phase)
the primary keys and foreign keys are defined.
Design Phases
Normalization
Normalization is the last part of the logical design. The goal of
normalization is to eliminate redundancy and potential
update anomalies. Redundancy means that the same data is
saved more than once in a database. Update anomaly is a
consequence of redundancy. If a piece of data is saved in
more than one place, the same data must be updated in
more than one place. Normalization is a technique by which
one can modify the relation schema to reduce the
redundancy. Each normalization phase adds more relations
(tables) into the database.
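A small worked illustration of the idea, in plain Python: one wide relation that repeats the course name is split into three relations, and joining them back recovers the original rows. The sample rows and attribute layout are hypothetical.

```python
# One unnormalized relation: (student_id, student_name, course_id, course_name).
rows = [
    (1, "Asha", "C1", "DBMS"),
    (2, "Ravi", "C1", "DBMS"),
    (1, "Asha", "C2", "Networks"),
]

# Decompose into three relations so each fact is stored once.
students = {sid: sname for sid, sname, _, _ in rows}
courses = {cid: cname for _, _, cid, cname in rows}
enrollments = {(sid, cid) for sid, _, cid, _ in rows}

# Redundancy: how many places hold the course name "DBMS"?
copies_before = sum(1 for r in rows if r[3] == "DBMS")
copies_after = sum(1 for name in courses.values() if name == "DBMS")

# Joining the three relations gives back exactly the original rows.
rejoined = sorted(
    (sid, students[sid], cid, courses[cid]) for sid, cid in enrollments)
```

Renaming the course now means one update in courses instead of one update per enrolled student, which is the update anomaly the text describes.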
Design Phases
Physical Design
The goal of the last phase of database design, physical design,
is to implement the database. At this phase one must know
which database management system (DBMS) is
used. For example, different DBMSs offer different data types
and use different names for them. The SQL statements to
create the database are written. The indexes, the integrity
constraints (rules) and the users' access rights are defined.
Finally, the data to test the database is added.
In parallel with these activities, application programs are
designed. The implementation of the programs can start
when the database is created and data has been added in.
Design Phases
• Initial phase -- characterize fully the data needs of the
prospective database users.
• Second phase -- choosing a data model
– Applying the concepts of the chosen data model
– Translating these requirements into a conceptual schema
of the database.
– A fully developed conceptual schema indicates the
functional requirements of the enterprise.
• Describe the kinds of operations (or transactions) that
will be performed on the data.
Design Phases (Cont.)
Final Phase -- Moving from an abstract data model to the
implementation of the database
– Logical Design – Deciding on the database schema.
• Database design requires that we find a “good” collection
of relation schemas.
Business decision – What attributes should we record in
the database?
Computer Science decision – What relation schemas
should we have and how should the attributes be
distributed among the various relation schemas?
– Physical Design – Deciding on the physical layout of the
database
Design Alternatives
In designing a database schema, we must ensure that we avoid
two major pitfalls:
• Redundancy: a bad design may result in repeated
information.
Redundant representation of information may lead to
data inconsistency among the various copies of the
information.
• Incompleteness: a bad design may make certain aspects of
the enterprise difficult or impossible to model.
Avoiding bad designs is not enough. There may be a large
number of good designs from which we must choose.
Design Approaches
Entity Relationship Model
• Models an enterprise as a collection of entities and
relationships
Entity: a “thing” or “object” in the enterprise that is
distinguishable from other objects
• Described by a set of attributes
Relationship: an association among several entities
• Represented diagrammatically by an entity-relationship
diagram:
Normalization Theory
• Formalize what designs are bad, and test for them
Outline of the ER Model
ER diagram
• ER diagram or Entity Relationship diagram is a conceptual
model that gives the graphical representation of the logical
structure of the database.
• It shows all the constraints and relationships that exist among
the different components.
Components of ER diagram
Student Table
1. Partial Participation-
• Partial participation is represented using a single line
between the entity set and relationship set.
2. Total Participation-
• Total participation is represented using a double line
between the entity set and relationship set.
ER Diagram Symbols –
Specialization and Generalization
• In ER diagram,
– Attributes are associated with an entity set.
– Attributes describe the properties of entities in the entity set.
– Based on the values of certain attributes, an entity can be identified
uniquely.
Types of Entity Sets
1. Strong Entity Set
• A strong entity set is an entity set that contains sufficient
attributes to uniquely identify all its entities.
• In other words, a primary key exists for a strong entity set.
• Primary key of a strong entity set is represented by
underlining it.
1. Strong Entity Set
Symbols Used-
• A single rectangle is used for representing a strong entity set.
• A diamond symbol is used for representing the relationship
that exists between two strong entity sets.
• A single line is used for representing the connection of the
strong entity set with the relationship set.
• A double line is used for representing the total participation
of an entity set with the relationship set.
• Total participation may or may not exist in the relationship.
1. Strong Entity Set
In this ER diagram,
• Two strong entity sets “Student” and “Course” are related to each other.
• Student ID and Student name are the attributes of entity set “Student”.
• Student ID is the primary key using which any student can be identified
uniquely.
• Course ID and Course name are the attributes of entity set “Course”.
• Course ID is the primary key using which any course can be identified
uniquely.
• Double line between Student and relationship set signifies total
participation.
• It suggests that each student must be enrolled in at least one course.
• Single line between Course and relationship set signifies partial
participation.
• It suggests that there might exist some courses for which no enrollments
are made.
2. Weak Entity Set
• A weak entity set is an entity set that does not contain
sufficient attributes to uniquely identify its entities.
• In other words, a primary key does not exist for a weak entity
set.
• However, it contains a partial key called a discriminator.
• The discriminator can identify a group of entities from the entity
set.
• The discriminator is represented by underlining with a dashed line.
2. Weak Entity Set
NOTE-
• The combination of the discriminator and the primary key of the strong
entity set makes it possible to uniquely identify all entities of the weak
entity set.
• Thus, this combination serves as a primary key for the weak entity
set.
• Clearly, this primary key is not formed entirely from the weak entity
set's own attributes.
2. Weak Entity Set
Symbols Used-
• A double rectangle is used for representing a weak entity set.
• A double diamond symbol is used for representing the
relationship that exists between the strong and weak entity
sets and this relationship is known as identifying relationship.
• A double line is used for representing the connection of the
weak entity set with the relationship set.
• Total participation always exists in the identifying relationship.
2. Weak Entity Set
• In this ER diagram,
• One strong entity set “Building” and one weak entity set “Apartment” are
related to each other.
• Strong entity set “Building” has building number as its primary key.
• Door number is the discriminator of the weak entity set “Apartment”.
• This is because the door number alone cannot identify an apartment uniquely,
as apartments in several different buildings may have the same door number.
• Double line between Apartment and relationship set signifies total
participation.
• It suggests that each apartment must be present in at least one building.
• Single line between Building and relationship set signifies partial
participation.
• It suggests that there might exist some buildings which have no apartments.
2. Weak Entity Set
• To uniquely identify any apartment,
– First, building number is required to identify the particular building.
– Secondly, door number of the apartment is required to uniquely
identify the apartment.
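The Building/Apartment example can be sketched as tables in which the weak entity's primary key combines the owner's key with the discriminator. This is a minimal sqlite3 illustration; the column names building_no and door_no are assumed spellings of the attributes named above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Strong entity set: has its own primary key.
conn.execute("CREATE TABLE building (building_no INTEGER PRIMARY KEY)")

# Weak entity set: key = owner's primary key + discriminator (door number).
conn.execute("""
    CREATE TABLE apartment (
        building_no INTEGER NOT NULL REFERENCES building (building_no),
        door_no     INTEGER NOT NULL,
        PRIMARY KEY (building_no, door_no)
    )""")

conn.executemany("INSERT INTO building VALUES (?)", [(1,), (2,)])
# The same door number may appear in different buildings.
conn.executemany("INSERT INTO apartment VALUES (?, ?)", [(1, 101), (2, 101)])

try:
    conn.execute("INSERT INTO apartment VALUES (1, 101)")  # same building, same door
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

apartment_count = conn.execute("SELECT COUNT(*) FROM apartment").fetchone()[0]
```

The NOT NULL foreign key also reflects total participation: an apartment row cannot exist without a building.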
Strong entity set | Weak entity set
It contains sufficient attributes to form its primary key. | It does not contain sufficient attributes to form its primary key.
A single line is used for the representation of the connection between the strong entity set and the relationship. | A double line is used for the representation of the connection between the weak entity set and the relationship set.
Total participation may or may not exist in the relationship. | Total participation always exists in the identifying relationship.
Important Note
• In ER diagram, weak entity set is always present in total
participation with the identifying relationship set.
• So, we always have the picture like shown here-
Relationship in DBMS
• A relationship is defined as an association among several
entities.
Example
• ‘Enrolled in’ is a relationship that exists between
entities Student and Course.
Relationship Set-
• A relationship set is a set of relationships of same type.
Example
• Set representation of above ER diagram is-
Degree of a Relationship Set
• The number of entity sets that participate in a relationship set
is termed as the degree of that relationship set.
Example
• Here, all the attributes are simple attributes as they can not
be divided further.
2. Composite Attributes
• Composite attributes are those attributes which are
composed of many other simple attributes.
Example-
• Here, all the attributes are single valued attributes as they can
take only one specific value for each entity.
4. Multi Valued Attributes
• Multi valued attributes are those attributes which can take
more than one value for a given entity from an entity set.
Example-
• Here, the attributes “Mob_no” and “Email_id” are multi valued attributes
as they can take more than one value for a given entity.
5. Derived Attributes
• Derived attributes are those attributes which can be derived
from other attribute(s).
Example-
Example 2
• A university registrar’s office maintains data about the following
entities:
– courses, including number, title, credits, syllabus, and prerequisites;
– course offerings, including course number, year, semester, section
number, instructor(s), timings, and classroom;
– students, including student-id, name, and program;
– instructors, including identification number, name, department, and
title.
• Further, the enrollment of students in courses and grades awarded
to students in each course they are enrolled for must be
appropriately modeled.
• Construct an E-R diagram for the registrar’s office.
• Document all assumptions that you make about the mapping
constraints.
Entities
• Student(sid, name, program)
• Course(c_number, title, credits, syllabus)
• Course offerings(c_number, year, semester, section_number, timings,
classroom)
• Instructor(iid, name, department, title)
Relationships
• A student enrolls in course offerings, and a grade is then allotted.
• An instructor teaches course offerings.
• A course is offered as course offerings.
• A main course requires a prerequisite course.
Example 3
• Construct an E-R diagram for a car-insurance company whose
customers own one or more cars each.
• Each car has associated with it zero to any number of
recorded accidents.
• Construct appropriate tables for the above ER
diagram.
• Car insurance tables:
– person (driver-id, name, address)
– car (license, year, model)
– accident (report-number, date, location)
– participated (driver-id, license, report-number,
damage-amount)
Example 4
• Construct an E-R diagram for a hospital with a set of patients
and a set of medical doctors.
• Associate with each patient a log of the various tests and
examinations conducted.
• Construct appropriate tables for the above ER
diagram:
– Patient(SS#, name, insurance)
– Physician ( name, specialization)
– Test-log( SS#, test-name, date, time)
– Doctor-patient (physician-name, SS#)
– Patient-history(SS#, test-name, date)
Example 5
• Draw the E-R diagram which models an online
bookstore.
Converting ER Diagrams to Tables
Way-01:
AR ( a1 , a2 , b1 )
B ( b1 , b2 )
Way-02:
A ( a1 , a2 )
BR ( a1 , b1 , b2 )
Thumb Rules to Remember
• While determining the minimum number of tables required
for binary relationships with given cardinality ratios, following
thumb rules must be kept in mind-
– For a binary relationship with cardinality ratio m : n , a separate
table will be drawn for each entity set and for the relationship set.
– For binary relationship with cardinality ratio either m : 1 or 1 : n ,
always remember “many side will consume the relationship” i.e. a
combined table will be drawn for many side entity set and relationship
set.
– For binary relationship with cardinality ratio 1 : 1 , two tables will be
required. You can combine the relationship set with any one of the
entity sets.
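The thumb rules above can be written down as a small lookup. This is only a restatement of the three rules in code; the function name min_tables is hypothetical, and the sketch ignores participation constraints, which the thumb rules above do not cover.

```python
def min_tables(cardinality):
    """Minimum number of tables for a binary relationship, per the thumb rules."""
    ratio = cardinality.replace(" ", "").lower()
    if ratio == "m:n":
        # One table per entity set plus one for the relationship set.
        return 3
    if ratio in ("m:1", "1:n"):
        # The many side consumes the relationship.
        return 2
    if ratio == "1:1":
        # The relationship is combined with either one of the entity sets.
        return 2
    raise ValueError(f"unknown cardinality ratio: {cardinality!r}")
```

For example, min_tables("m : n") gives 3, while min_tables("1 : n") and min_tables("1 : 1") both give 2.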
Rule-06: For Binary Relationship With Both Cardinality
Constraints and Participation Constraints
• In the following diagram we have two entities Student and College
and their relationship.
• The relationship between Student and College is many to one, as a
college can have many students; however, a student cannot study in
multiple colleges at the same time.
• Student entity has attributes such as Stu_Id, Stu_Name & Stu_Addr
and College entity has attributes such as Col_ID & Col_Name.
Component of ER Diagram
1. Strong Entity:
• An entity may be any object, class, person or place.
• In the ER diagram, an entity can be represented as rectangles.
• Consider an organization as an example- manager, product,
employee, department etc. can be taken as an entity.
a. Weak Entity
• An entity that depends on another entity is called a weak entity.
• The weak entity doesn't contain any key attribute of its own.
• The weak entity is represented by a double rectangle.
Ex. Installment, Dependent of employee, etc.
2. Attribute
• The attribute is used to describe the property of an entity.
• Ellipse is used to represent an attribute.
Example, id, age, contact number, name, etc. can be attributes
of a student.
a. Key Attribute
• The key attribute is used to represent the main characteristics
of an entity.
• It represents a primary key.
• The key attribute is represented by an ellipse with the text
underlined.
b. Composite Attribute
• An attribute that is composed of many other attributes is known as a
composite attribute.
• The composite attribute is represented by an ellipse, and the ellipses
of its component attributes are connected to that ellipse.
c. Multi-valued Attribute
• An attribute that can have more than one value is known as a
multi-valued attribute.
• The double oval is used to represent a multi-valued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
• An attribute that can be derived from other attribute is known
as a derived attribute.
• It can be represented by a dashed ellipse.
Example, A person's age changes over time and can be
derived from another attribute like Date of birth.
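A derived attribute is computed rather than stored. The sketch below derives age from date of birth; the function name age_on is hypothetical, and the reference date is passed in explicitly so the result does not depend on when the code runs.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Age in completed years, derived from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

dob = date(2000, 5, 20)  # stored attribute (hypothetical value)
age_day_before = age_on(dob, date(2024, 5, 19))
age_on_birthday = age_on(dob, date(2024, 5, 20))
```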
3. Relationship
• A relationship is used to describe the relation between
entities. Diamond or rhombus is used to represent the
relationship.
Recursive Relationship
• It is possible for the same entity to participate more than once in a
relationship, in different roles.
• This is termed a recursive relationship.
Employee entity
• Employee no
• Employee surname
• Employee forename
• Employee DOB
• Employee dept number
• Manager no * (this is the employee no of the employee's manager)
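The Employee entity above can be sketched as a table whose manager column refers back to the same table. This is a minimal sqlite3 illustration; the column names are assumed spellings of the attributes listed, only a subset of them is included, and the sample employees are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE employee (
        employee_no INTEGER PRIMARY KEY,
        surname     TEXT NOT NULL,
        -- recursive relationship: the manager is another employee
        manager_no  INTEGER REFERENCES employee (employee_no)
    )""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Rao", None), (2, "Singh", 1), (3, "Khan", 1)],
)

# Join the table with itself: one copy plays the employee, the other the manager.
reports = conn.execute("""
    SELECT e.surname, m.surname
    FROM employee AS e
    JOIN employee AS m ON e.manager_no = m.employee_no
    ORDER BY e.employee_no""").fetchall()
```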
Mapping Constraints(Cardinality)
• A mapping constraint is a data constraint that expresses the
number of entities to which another entity can be related via
a relationship set.
• It is most useful in describing binary relationship sets, although it
can also help describe relationship sets that involve more than two
entity sets.
• For binary relationship set R on an entity set A and B, there
are four possible mapping cardinalities.
• These are as follows:
– One to one (1:1)
– One to many (1:M)
– Many to one (M:1)
– Many to many (M:M)
a. One-to-One Relationship
• When each instance of one entity is associated with at most one
instance of the other entity, and vice versa, it is known as a one to
one relationship.
• For example, a person has only one passport and a passport is
given to one person.
b. One-to-many relationship
• When one instance of the entity on the left can be associated with
more than one instance of the entity on the right, but each instance
on the right is associated with only one instance on the left, this is
known as a one-to-many relationship.
• For example, a scientist can make many inventions, but each
invention is made by one specific scientist.
c. Many-to-one relationship
• When more than one instance of the entity on the left, and
only one instance of an entity on the right associates with the
relationship then it is known as a many-to-one relationship.
• For example, Student enrolls for only one course, but a course
can have many students.
d. Many-to-many relationship
• When more than one instance of the entity on the left, and
more than one instance of an entity on the right associates
with the relationship then it is known as a many-to-many
relationship.
• For example, an employee can be assigned to many projects, and a
project can have many employees.
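The four mapping cardinalities can be told apart by counting partners on each side. The sketch below classifies a set of observed (left, right) relationship instances; the function name and the sample pairs are hypothetical. It describes the given instances only, whereas a real mapping constraint is a rule about all instances that may ever exist.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify (a, b) relationship instances as 1:1, 1:M, M:1 or M:N."""
    partners_of_a = defaultdict(set)
    partners_of_b = defaultdict(set)
    for a, b in pairs:
        partners_of_a[a].add(b)
        partners_of_b[b].add(a)
    # Does some left entity relate to several right entities, and vice versa?
    a_has_many = any(len(bs) > 1 for bs in partners_of_a.values())
    b_has_many = any(len(as_) > 1 for as_ in partners_of_b.values())
    if a_has_many and b_has_many:
        return "M:N"
    if a_has_many:
        return "1:M"
    if b_has_many:
        return "M:1"
    return "1:1"

person_passport = [("p1", "pass1"), ("p2", "pass2")]
scientist_invention = [("s1", "inv1"), ("s1", "inv2")]
student_course = [("st1", "c1"), ("st2", "c1")]
employee_project = [("e1", "pr1"), ("e1", "pr2"), ("e2", "pr1")]
```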
Notation of ER diagram
• In ER diagram, many notations are used to express the cardinality.
• Cardinality specifies how many instances of an entity relate to one
instance of another entity.
Purchase order
Keys
• Keys play an important role in the relational database.
• They are used to uniquely identify any record or row of data from
the table. They are also used to establish and identify relationships
between tables.
• For example: In Student table, ID is used as a key because it is
unique for each student. In PERSON table, passport_number,
license_number, SSN are keys since they are unique for each
person.
Types of key
1. Primary key
• It is the key chosen to identify one and only one
instance of an entity uniquely.
• An entity can contain multiple keys, as we saw in the PERSON
table.
• The most suitable of those keys becomes the
primary key.
• In the EMPLOYEE table, ID can be the primary key since it is
unique for each employee. We could also select
License_Number or Passport_Number as the
primary key, since they are also unique.
• For each entity, the selection of the primary key is based on
the requirements and the developers.
2. Candidate key
• A candidate key is an attribute or a minimal set of attributes which
can uniquely identify a tuple.
• The primary key is chosen from the candidate keys; the candidate
keys that are not chosen are also called alternate keys.
• The candidate keys are as strong as the primary key.
• For example: In the EMPLOYEE table, id is best suited for the
primary key. The other uniquely identifying attributes, like SSN,
Passport_Number, and License_Number, are also candidate keys.
3. Super Key
• A super key is a set of attributes which can uniquely identify a
tuple. A super key is a superset of a candidate key.
• For example: In the above EMPLOYEE table, for (EMPLOYEE_ID,
EMPLOYEE_NAME) the names of two employees can be the
same, but their EMPLOYEE_ID can't be the same. Hence, this
combination can also be a key.
• The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID,
EMPLOYEE_NAME), etc.
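The difference between super keys and candidate keys can be checked on sample data. The sketch below tests uniqueness on a few hypothetical EMPLOYEE rows and keeps only the minimal unique attribute sets. Keys are really a property of the schema, so a check on sample rows is only an illustration.

```python
from itertools import combinations

def is_super_key(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal super keys: no proper subset is itself a super key."""
    attrs = sorted(rows[0])
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            contains_smaller_key = any(set(k) <= set(combo) for k in keys)
            if is_super_key(rows, combo) and not contains_smaller_key:
                keys.append(combo)
    return keys

# Hypothetical sample rows: two employees share a name.
employees = [
    {"employee_id": 1, "employee_name": "Asha", "passport_number": "P10"},
    {"employee_id": 2, "employee_name": "Asha", "passport_number": "P20"},
]

pair_is_super_key = is_super_key(employees, ("employee_id", "employee_name"))
name_is_super_key = is_super_key(employees, ("employee_name",))
keys = candidate_keys(employees)
```

Here (employee_id, employee_name) is a super key but not a candidate key, because employee_id alone is already unique.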
4. Foreign key
• A foreign key is a column (or set of columns) of a table which
points to the primary key of another table.
• In a company, every employee works in a specific department,
and employee and department are two different entities.
• So we can't store the information of the department in the
employee table.
• That's why we link these two tables through the primary key
of one table.
• We add the primary key of the DEPARTMENT table,
Department_Id as a new attribute in the EMPLOYEE table.
• Now in the EMPLOYEE table, Department_Id is the foreign
key, and both the tables are related.
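The EMPLOYEE/DEPARTMENT link can be sketched as follows with sqlite3. The column names other than Department_Id, and the sample rows, are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; this is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE department (
        Department_Id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        -- foreign key: points to the primary key of DEPARTMENT
        Department_Id INTEGER NOT NULL REFERENCES department (Department_Id)
    )""")

conn.execute("INSERT INTO department VALUES (10, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")

try:
    # Department 99 does not exist, so this row would be an orphan.
    conn.execute("INSERT INTO employee VALUES (2, 'Ravi', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```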
Steps to Create an ERD
• In a university, a Student enrolls in Courses.
• A student must be assigned to at least one
Course.
• Each course is taught by a single Professor.
• To maintain instruction quality, a Professor can
deliver only one course
Step 1) Entity Identification
• We have three entities
– Student
– Course
– Professor
Step 2) Relationship Identification
• We have the following two relationships
– The student is assigned a course
– Professor delivers a course
Step 3) Cardinality Identification
• From the problem statement we know that,
– A student can be assigned multiple courses
– A Professor can deliver only one course
Step 4) Identify Attributes
• You need to study the files, forms, reports, data currently
maintained by the organization to identify attributes.
• You can also conduct interviews with various stakeholders to
identify entities.
• Initially, it's important to identify the attributes without
mapping them to a particular entity.
• Once you have a list of Attributes, you need to map them to
the identified entities.
• Once the mapping is done, identify the primary Keys.
• If a unique key is not readily available, create one.
Entity | Primary Key | Attribute
Student | Student_ID | StudentName
Professor | Employee_ID | ProfessorName
Course | Course_ID | CourseName
Step 5) Create the ERD
• A more modern representation of the ER
diagram
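The entities, keys and cardinalities identified in the steps above could be mapped to tables roughly as follows. This is one possible sketch in sqlite3, not the only valid mapping: the table name assignment and the sample rows are hypothetical, and the rule that a student has at least one course is not enforced by this schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student   (Student_ID  INTEGER PRIMARY KEY, StudentName   TEXT NOT NULL);
CREATE TABLE professor (Employee_ID INTEGER PRIMARY KEY, ProfessorName TEXT NOT NULL);
CREATE TABLE course (
    Course_ID   INTEGER PRIMARY KEY,
    CourseName  TEXT NOT NULL,
    -- each course has a single professor; UNIQUE means one course per professor
    Employee_ID INTEGER NOT NULL UNIQUE REFERENCES professor (Employee_ID)
);
-- a student can be assigned multiple courses
CREATE TABLE assignment (
    Student_ID INTEGER NOT NULL REFERENCES student (Student_ID),
    Course_ID  INTEGER NOT NULL REFERENCES course (Course_ID),
    PRIMARY KEY (Student_ID, Course_ID)
);
""")

conn.execute("INSERT INTO student VALUES (1, 'Asha')")
conn.execute("INSERT INTO professor VALUES (7, 'Rao')")
conn.execute("INSERT INTO course VALUES (100, 'DBMS', 7)")
conn.execute("INSERT INTO assignment VALUES (1, 100)")

try:
    # The same professor cannot deliver a second course.
    conn.execute("INSERT INTO course VALUES (101, 'Networks', 7)")
    second_course_rejected = False
except sqlite3.IntegrityError:
    second_course_rejected = True
```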
ER diagram example
• Suppose you are given the following requirements for a
simple database for the National Hockey League (NHL):
– the NHL has many teams,
– each team has a name, a city, a coach, a captain, and a set of
players,
– each player belongs to only one team,
– each player has a name, a position (such as left wing or goalie),
a skill level, and a set of injury records,
– a team captain is also a player,
– a game is played between two teams (referred to as host_team
and guest_team) and has a date (such as May 11th, 1999) and a
score (such as 4 to 2).
• Entities:
– Team(t_name, city, coach )
– Player(p_name, position, skill_level)
– Injury record (Weak entity, depend on player)
• Relationships:
– Each team has a captain which is also a player
– Each team has many players
– A game is played between two teams(host and guest),
and has date and score(attributes)
– A player has injury record
[ER diagram for the NHL example. Attribute labels shown: Date, Score, Coach, Description, T_name, City, Position, P_name, P_no, Skill_level]
Example 2
• A university registrar’s office maintains data about the following
entities:
– courses, including number, title, credits, syllabus, and prerequisites;
– course offerings, including course number, year, semester, section
number, instructor(s), timings, and classroom;
– students, including student-id, name, and program;
– instructors, including identification number, name, department, and
title.
• Further, the enrollment of students in courses and grades awarded
to students in each course they are enrolled for must be
appropriately modeled.
• Construct an E-R diagram for the registrar’s office.
• Document all assumptions that you make about the mapping
constraints.
Entities
• Student(sid, name, program)
• Course(C_number, title, credits, syllabus)
• course offerings( c_number, year, semester, section_number, timings, and
classroom)
• Instructor(iid, name, department, title)
Relationships
• Students enroll in course offerings and are then awarded a grade.
• Instructors teach course offerings.
• A course is offered as course offerings.
• A main course requires a prerequisite course.
Example 3
• Construct an E-R diagram for a car-insurance
company whose customers own one or more
cars each.
• Each car has associated with it zero to any
number of recorded accidents.
• Construct appropriate tables for the above ER
Diagram.
• Car insurance tables:
– person (driver-id, name, address)
– car (license, year, model)
– accident (report-number, date, location)
– participated(driver-id, license, report-number,
damage-amount)
Example 4
• Construct an E-R diagram for a hospital with a set of patients
and a set of medical doctors.
• Associate with each patient a log of the various tests and
examinations conducted.
• Construct appropriate tables for the above ER
Diagram :
– Patient(SS#, name, insurance)
– Physician ( name, specialization)
– Test-log( SS#, test-name, date, time)
– Doctor-patient (physician-name, SS#)
– Patient-history(SS#, test-name, date)
Example 5
• Draw the E-R diagram which models an online
bookstore.
Fundamental of
Database Management System
BCAC0020
Lecture - 1
Fundamentals of database
management system
BCAC0020
Module-2
File Organization Techniques:
– Sequential file organization,
– Index File Organization,
– Random file organization.
Relational data model concept:
– Relational Algebra (select operation, project operation)
• They contain all the attributes of a row, i.e., a student
record will have the student's id, name, address, course, DOB etc.
Advantages of ISAM
• Since each record has its data block address, searching for a
record in a large database is easy and quick.
• This method gives the flexibility of using any column as the key field,
and the index will be generated based on that. In addition to the
primary key and its index, we can have indexes generated for
other fields too.
• It supports range retrieval and partial retrieval of records.
• Since the index is based on the key value, we can retrieve the
data for a given range of values.
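A minimal sketch of how a sorted index supports both single-key lookup and range retrieval. It assumes an in-memory dictionary stands in for the data blocks on disk; the block addresses, record values and function names are hypothetical and only illustrate the idea.

```python
import bisect

# Hypothetical data blocks: block address -> record (stand-in for disk blocks).
blocks = {
    100: {"id": 7, "name": "Asha"},
    101: {"id": 3, "name": "Ravi"},
    102: {"id": 12, "name": "Meena"},
    103: {"id": 9, "name": "John"},
}

# The index keeps (key value, block address) pairs sorted by key value.
index = sorted((rec["id"], addr) for addr, rec in blocks.items())
keys = [key for key, _ in index]


def lookup(key):
    """Return the record with this key (or None) via binary search on the index."""
    pos = bisect.bisect_left(keys, key)
    if pos < len(keys) and keys[pos] == key:
        return blocks[index[pos][1]]
    return None


def range_lookup(low, high):
    """Return the records whose key lies in [low, high], in key order."""
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_right(keys, high)
    return [blocks[addr] for _, addr in index[start:end]]


print(lookup(9))
print(range_lookup(5, 10))
```

Because the index is ordered by key, a range query only needs two binary searches and a slice, while the data blocks themselves can stay in any order.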
Indexed Sequential Access Method (ISAM)
Disadvantages of ISAM
• An extra cost has to be paid to maintain the index, i.e., we
need extra disk space to store the index values.
When there are multiple key-index combinations, the disk
space required also increases.
• As new records are inserted, these files have to be
restructured to maintain the sequence.
• Similarly, when a record is deleted, the space used by it
needs to be released. Otherwise, the performance of the database
will degrade.
3. Random (Heap) file organization technique
• Here, records are inserted at the end of the file as and when
they arrive.
• There is no sorting or ordering of the records.
• Once the data block is full, the next record is stored in the
new block.
• This new block need not be the very next block.
• This method can select any block in the memory to
store the new records.
Random(Heap) file organization technique
Relational model concepts
Important Results
– We can not insert a record into a referencing relation if the
corresponding record does not exist in the referenced
relation.
– We can not delete or update a record of the referenced
relation if the corresponding record exists in the
referencing relation.
• Here, relation ‘Student’ references the relation ‘Department’.
Student (referencing relation)
Roll_no  Name    Age  Branch_Code
1        Rahul   22   CS
2        Anjali  21   CS
3        Teena   20   IT
4        James   23   ME
Department (referenced relation)
Branch_Code  Branch_Name
CS           Computer Science
EE           Electronics Engineering
IT           Information Technology
CE           Civil Engineering
Cause-02: Deletion from a Referenced Relation
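Student
Roll_no  Name   Age  Branch_Code
1        Rahul  22   CS
Department
Branch_Code  Branch_Name
CS           Computer Science
EE           Electronics Engineering
Student
Roll_no  Name   Age  Branch_Code
1        Rahul  22   CS
Department
Branch_Code  Branch_Name
CSE          Computer Science
EE           Electronics Engineering
• In the second pair of tables, the Department key CS appears as CSE, so the Student row with Branch_Code CS no longer has a matching Department row.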
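The insertion, deletion and update cases for Student and Department can be tried out with a small sketch. It assumes SQLite (through Python's standard sqlite3 module) as the DBMS and uses the column names from the tables above; the helper name `violates` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE Department (Branch_Code TEXT PRIMARY KEY, Branch_Name TEXT)"
)
conn.execute(
    "CREATE TABLE Student ("
    " Roll_no INTEGER PRIMARY KEY, Name TEXT, Age INTEGER,"
    " Branch_Code TEXT REFERENCES Department(Branch_Code))"
)
conn.execute("INSERT INTO Department VALUES ('CS', 'Computer Science')")
conn.execute("INSERT INTO Student VALUES (1, 'Rahul', 22, 'CS')")


def violates(sql):
    """Run one statement; return True if it is rejected with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Insert into the referencing relation with no matching Department row.
insert_rejected = violates("INSERT INTO Student VALUES (4, 'James', 23, 'ME')")
# Delete a referenced Department row that a Student row still points to.
delete_rejected = violates("DELETE FROM Department WHERE Branch_Code = 'CS'")
# Change a referenced key value that a Student row still points to.
update_rejected = violates(
    "UPDATE Department SET Branch_Code = 'CSE' WHERE Branch_Code = 'CS'"
)
print(insert_rejected, delete_rejected, update_rejected)
```

All three statements are rejected here because the foreign key is declared without any ON DELETE or ON UPDATE action; other systems, or other declared actions such as cascading, would behave differently.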
σ_{Dno=4}(EMPLOYEE)
• To select the EMPLOYEE tuples whose salary is greater than $30,000
σ_{Salary>30000}(EMPLOYEE)
The SELECT Operation
Example
• Select all employees who either work in department 4 and
make over $25,000 per year, or work in department 5 and
make over $30,000
σ_{(Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000)}(EMPLOYEE)
The SELECT Operation
• The degree of the relation resulting from a SELECT
operation—its number of attributes—is the same as the
degree of R.
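• The SELECT operation is commutative:
σ_{<cond1>}(σ_{<cond2>}(R)) = σ_{<cond2>}(σ_{<cond1>}(R))
• A cascade of SELECT operations can be combined into a single SELECT with a conjunctive (AND) condition:
σ_{<cond1>}(σ_{<cond2>}(...(σ_{<condn>}(R))...)) = σ_{<cond1> AND <cond2> AND ... AND <condn>}(R)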
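A minimal sketch of the SELECT operation on a relation held as a list of dictionaries. The EMPLOYEE tuples below are hypothetical sample data; only the attribute names Dno and Salary come from the examples above.

```python
# Hypothetical EMPLOYEE tuples; Dno and Salary follow the examples in the notes.
EMPLOYEE = [
    {"Name": "A", "Dno": 4, "Salary": 26000},
    {"Name": "B", "Dno": 4, "Salary": 24000},
    {"Name": "C", "Dno": 5, "Salary": 31000},
    {"Name": "D", "Dno": 5, "Salary": 28000},
]


def select(relation, condition):
    """SELECT: keep exactly the tuples for which the condition is true."""
    return [t for t in relation if condition(t)]


def in_dept_4(t):
    return t["Dno"] == 4


def earns_over_25000(t):
    return t["Salary"] > 25000


# (Dno=4 AND Salary>25000) OR (Dno=5 AND Salary>30000)
result = select(
    EMPLOYEE,
    lambda t: (t["Dno"] == 4 and t["Salary"] > 25000)
    or (t["Dno"] == 5 and t["Salary"] > 30000),
)

# Applying two conditions in either order gives the same tuples (commutativity),
# and both equal one SELECT with the combined AND condition (cascade).
one_way = select(select(EMPLOYEE, in_dept_4), earns_over_25000)
other_way = select(select(EMPLOYEE, earns_over_25000), in_dept_4)
combined = select(EMPLOYEE, lambda t: in_dept_4(t) and earns_over_25000(t))
print([t["Name"] for t in result])
```

Every tuple in the result keeps all of its attributes, which is why the degree of the result equals the degree of the input relation.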
σ_{rating>8}(S2)
• Duplicate elimination
– Result of PROJECT operation is a set of distinct
tuples
Projection
S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
π_{sname,rating}(S2)
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10
π_{age}(S2)
age
35.0
55.5
Selection & Projection
S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
π_{sname,rating}(σ_{rating>8}(S2))
sname  rating
yuppy  9
rusty  10
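The projection and selection results for S2 can be reproduced with a short sketch. The relation is held as a list of dictionaries, and the helper names `select` and `project` are illustrative choices, not part of any library.

```python
S2 = [
    {"sid": 28, "sname": "yuppy", "rating": 9, "age": 35.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 44, "sname": "guppy", "rating": 5, "age": 35.0},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]


def select(relation, condition):
    """SELECT: keep the tuples that satisfy the condition."""
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    """PROJECT: keep only the listed attributes and eliminate duplicate tuples."""
    seen = set()
    out = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in seen:  # duplicate elimination
            seen.add(row)
            out.append(dict(zip(attrs, row)))
    return out


ages = project(S2, ["age"])
high_rated = project(select(S2, lambda t: t["rating"] > 8), ["sname", "rating"])
print(ages)
print(high_rated)
```

Projecting on age leaves two tuples rather than four, because three sailors share the value 35.0 and the result of PROJECT is a set of distinct tuples.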
QUIZ…….??
S1 S2
πR.A←A, R.B←B(S).
• Otherwise, attribute names in relational algebra do not automatically
contain the relation name.
Joins: used to combine relations
• Condition Join: R ⋈_c S = σ_c(R × S)
[Join example on relations Courses and HoD, followed by a division example r ÷ s; tables not reproduced]
Another Division Example
Relations r(A, B, C, D, E) and s(D, E); the result r ÷ s has attributes (A, B, C).
[Tables for r, s and r ÷ s not reproduced]
Example of Division
π_{sname}(Tempboats ⋈ Reserves ⋈ Sailors)
4) Find sailors who’ve reserved a red and a green boat
π_{sname}((Tempred ∩ Tempgreen) ⋈ Sailors)
5) Find the names of sailors who’ve reserved all boats
π_{sname}(Tempsids ⋈ Sailors)
R1 ∪ R2
Find the sids of suppliers who supply
some red part and some green part.
ρ(R1, π_{sid}((π_{pid} σ_{color=red} Parts) ⋈ Catalog))
(π_{sid,pid} Catalog) / (π_{pid} σ_{color=red ∨ color=green} Parts)
Functional Dependency in DBMS
• In any relation, a functional dependency
α → β holds if
• Two tuples having same value of attribute α also have same
value for attribute β.
Mathematically,
• If α and β are the two sets of attributes in a relational
table R where:
α⊆R
β⊆R
• Then, for a functional dependency to exist from α to
β: if t1[α] = t2[α], then t1[β] = t2[β]
• fd : α → β
α        β
t1[α]    t1[β]
t2[α]    t2[β]
…….      …….
Types Of Functional Dependencies
1. Trivial Functional Dependencies
• A functional dependency X → Y is said to be trivial if and only if Y ⊆
X.
Examples
• AB → A
• AB → B
• AB → AB
2. Non-Trivial Functional Dependencies
• A functional dependency X → Y is said to be non-trivial if and
only if Y ⊈ X.
• Thus, if there exists at least one attribute in the RHS of a
functional dependency that is not a part of the LHS, then it is
called a non-trivial functional dependency.
Examples
• AB → BC
• AB → CD
Inference Rules
Reflexivity
• If B is a subset of A, then A → B always holds.
Transitivity
• If A → B and B → C, then A → C always holds.
Augmentation
• If A → B, then AC → BC always holds.
Decomposition
• If A → BC, then A → B and A → C always hold.
Composition
• If A → B and C → D, then AC → BD always holds.
Additive
• If A → B and A → C, then A → BC always holds.
Rules for Functional Dependency
Rule-01:
• A functional dependency X → Y will always hold if all the
values of X are unique (different) irrespective of the values of
Y.
A  B  C  D  E
5  4  3  2  2
8  5  3  2  1
1  9  3  3  5
4  7  3  3  8
• Dependencies that hold here (all values of A are unique):
A → B
A → BC
A → CD
A → BCD
A → DE
A → BCDE
Rule-02:
• A functional dependency X → Y will always hold if all the
values of Y are same irrespective of the values of X.
A  B  C  D  E
5  4  3  2  2
8  5  3  2  1
1  9  3  3  5
4  7  3  3  8
• Dependencies that hold here (all values of C are the same):
A → C
AB → C
ABDE → C
DE → C
AE → C
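Rule-01 and Rule-02 can be checked mechanically against the table above. The sketch below tests whether a dependency holds in one given set of rows; the function name `holds` is an illustrative choice. Note that a single table instance can only refute a dependency, it cannot prove that the dependency holds for every legal state of the relation.

```python
ATTRS = "ABCDE"
ROWS = [
    (5, 4, 3, 2, 2),
    (8, 5, 3, 2, 1),
    (1, 9, 3, 3, 5),
    (4, 7, 3, 3, 8),
]


def holds(lhs, rhs, rows=ROWS, attrs=ATTRS):
    """True if lhs -> rhs holds in these rows: equal lhs values imply equal rhs values."""
    seen = {}
    for row in rows:
        t = dict(zip(attrs, row))
        key = tuple(t[a] for a in lhs)
        value = tuple(t[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


print(holds("A", "BCDE"))  # Rule-01: values of A are all different
print(holds("DE", "C"))    # Rule-02: values of C are all the same
print(holds("D", "E"))     # D = 2 appears with E = 2 and E = 1
```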
Closure of an Attribute Set
• The set of all those attributes which can be functionally
determined from an attribute set is called as a closure of that
attribute set.
• Closure of attribute set {X} is denoted as {X}+
Steps to Find Closure of an Attribute Set
Step-01:
• Add the attributes contained in the attribute set for which
closure is being calculated to the result set.
Step-02
• Recursively add the attributes to the result set which can be
functionally determined from the attributes already contained
in the result set.
Example
• Consider a relation R ( A , B , C , D , E , F , G ) with the
functional dependencies-
A → BC
BC → DE
D→F
CF → G
• Now, let us find the closure of some attributes and attribute
sets
Closure of attribute A
A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
• A+ = { A , B , C , D , E , F , G }
Closure of attribute D
D+ = { D }
= { D , F } ( Using D → F )
• We can not determine any other attribute using attributes D and F
contained in the result set. Thus,
D+ = { D , F }
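Step-01 and Step-02 can be written as a short loop. This is a minimal sketch that assumes single-letter attribute names and represents each functional dependency as a (LHS, RHS) pair of strings; the function name `closure` is an illustrative choice.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)  # Step-01: start with the attribute set itself
    changed = True
    while changed:  # Step-02: repeat until nothing new can be added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Functional dependencies of R(A, B, C, D, E, F, G) from the example above.
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", FDS)))
print(sorted(closure("D", FDS)))
```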
Super Key
• If the closure result of an attribute set contains all the
attributes of the relation, then that attribute set is called as a
super key of that relation.
• Thus, we can say-
• “The closure of a super key is the entire relation schema.”
Candidate Key
• If an attribute set is a super key and no proper subset of it has a closure that
contains all the attributes of the relation, then that attribute
set is called a candidate key of that relation.
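The two definitions can be tested with the closure computation. The sketch below is a brute-force search, reasonable only for small schemas; it reuses the relation R(A, B, C, D, E, F, G) from the closure example, and the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_super_key(attrs, schema, fds):
    """A super key is an attribute set whose closure is the whole schema."""
    return closure(attrs, fds) >= set(schema)


def candidate_keys(schema, fds):
    """Minimal super keys, found by trying attribute sets in increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_super_key(attrs, schema, fds):
                keys.append(attrs)
    return keys


SCHEMA = "ABCDEFG"
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(is_super_key("AD", SCHEMA, FDS))  # a super key, but not minimal
print(candidate_keys(SCHEMA, FDS))
```

Trying smaller sets first means any super key that does not contain an already found key must itself be minimal.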
Problem
Consider the given functional dependencies-
• AB → CD
• AF → D
• DE → F
• C→G
• F→E
• G→A
• Given below are the examples of super keys since each set
can uniquely identify each student in the Student table-
Here, using partial key Emp_no, we can not identify a tuple uniquely
but we can select a bunch of tuples from the table.
E1 Ajay Father
E2 Vijay Father
E2 Ankush Son
7. Composite Key
• A primary key comprising multiple attributes, and not just a
single attribute, is called a composite key.
8. Unique Key
• Unique key is a key with the following properties-
– It is unique for all the records of the table.
– Once assigned, its value can not be changed i.e. it is non-updatable.
– It may have a NULL value.
8. Unique Key
Example
• The best example of a unique key is the Aadhaar Card Number
• The Aadhaar Card Number is unique for all the citizens (tuples)
of India (table).
• If it gets lost and another duplicate copy is issued, then the
duplicate copy always has the same number as before.
• Thus, it is non-updatable.
• A few citizens may not have got their Aadhaar cards, so for them
its value is NULL.
9. Surrogate Key
• Surrogate key is a key with the following properties-
• It is unique for all the records of the table.
• It is updatable.
• It can not be NULL i.e. it must have some value.
Example
• Mobile Number of students in a class where every student
owns a mobile phone.
10. Secondary Key
• Secondary key is required for the indexing purpose for better
and faster searching.
Finding Candidate Keys
• A set of minimal attribute(s) that can identify each tuple
uniquely in the given relation is called as a candidate key.
OR
• A minimal super key is called as a candidate key.
Finding Candidate Key
Step-01
• Determine all essential attributes of the given relation.
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
R1 ( A , C ) and R2 ( B , C )
R1 ⋈ R2 ⊃ R
R1           R2
A  C         B  C
1  1         2  1
2  3         5  3
3  3         3  3
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-
R1 ⋈ R2
A  B  C
1  2  1
2  5  3
2  3  3
3  5  3
3  3  3
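The natural join of R1(A, C) and R2(B, C) can be recomputed with a short sketch. The relations are held as lists of dictionaries and the function name `natural_join` is an illustrative choice.

```python
R1 = [{"A": 1, "C": 1}, {"A": 2, "C": 3}, {"A": 3, "C": 3}]
R2 = [{"B": 2, "C": 1}, {"B": 5, "C": 3}, {"B": 3, "C": 3}]


def natural_join(r, s):
    """Combine every pair of tuples that agree on all common attributes."""
    out = []
    for t in r:
        for u in s:
            common = set(t) & set(u)
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out


joined = natural_join(R1, R2)
rows = sorted((t["A"], t["B"], t["C"]) for t in joined)
print(rows)
```

R1 and R2 have three tuples each, yet the join has five. The extra tuples arise because the common attribute C has the value 3 in two tuples of each sub relation, so C alone cannot pair the rows back up correctly.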
Determining Whether Decomposition Is Lossless Or Lossy
Condition-01
• Union of both the sub relations must contain all the attributes that
are present in the original relation R.
R1 ∪ R2 = R
Condition-02
• Intersection of both the sub relations must not be null.
• In other words, there must be some common attribute which is
present in both the sub relations.
R1 ∩ R2 ≠ ∅
Condition-03
• Intersection of both the sub relations must be a super key of either
R1 or R2 or both.
R1 ∩ R2 = Super key of R1 or R2
If any of these conditions fail, then the
decomposition is lossy.
Problem-01
• Consider a relation schema R ( A , B , C , D )
with the functional dependencies A → B and C → D.
Determine whether the decomposition of R into R1 ( A , B )
and R2 ( C , D ) is lossless or lossy.
Solution
Condition-01
• According to condition-01, union of both the sub relations must
contain all the attributes of relation R.
R1 ( A , B ) ∪ R2 ( C , D ) = R ( A , B , C , D )
• Clearly, the union of the sub relations contains all the attributes of
relation R.
• Thus, condition-01 is satisfied.
Condition-02
• According to condition-02, intersection of both the sub relations
must not be null.
R1 ( A , B ) ∩ R2 ( C , D ) = ∅
• Clearly, intersection of the sub relations is null.
• So, condition-02 fails.
• Thus, we conclude that the decomposition is lossy.
Problem-02
• Consider a relation schema R ( A , B , C , D ) with the following
functional dependencies-
A→B
B→C
C→D
D→B
• Determine whether the decomposition of R into R1 ( A , B ) ,
R2 ( B , C ) and R3 ( B , D ) is lossless or lossy.
Solution
Condition 1
• R1 ( A , B ) ∪ R2 ( B , C ) ∪ R3 ( B , D ) = R ( A , B , C , D )
Condition 2
• R1 ∩ R2 ≠ ∅ True
• (R1 ∪ R2) ∩ R3 ≠ ∅ True
Condition 3
• R1 ∩ R2 = {B} and {B}+ = {B, C, D}, so B is a super key of R2 ( B , C )
• (R1 ∪ R2) ∩ R3 = {B} and {B}+ = {B, C, D}, so B is a super key of R3 ( B , D )
• Hence the decomposition is lossless
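Condition-01 to Condition-03 can be combined into one check for a decomposition into two sub relations. This is a minimal sketch assuming single-letter attribute names; the function names are illustrative, and the three-way decomposition of Problem-02 is handled by applying the two-way test twice, as in the solution above.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, schema, fds):
    """Apply the three conditions to a decomposition of schema into r1 and r2."""
    r1, r2, schema = set(r1), set(r2), set(schema)
    if r1 | r2 != schema:  # Condition-01: union covers all attributes
        return False
    common = r1 & r2
    if not common:  # Condition-02: intersection is not empty
        return False
    determined = closure(common, fds)
    # Condition-03: the common attributes determine all of r1 or all of r2
    return r1 <= determined or r2 <= determined


# Problem-01: R(A, B, C, D), A -> B, C -> D, decomposed into R1(A, B), R2(C, D)
p1 = is_lossless_binary("AB", "CD", "ABCD", [("A", "B"), ("C", "D")])

# Problem-02: first R1(A, B) with R2(B, C), then the result with R3(B, D)
FDS2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
p2_step1 = is_lossless_binary("AB", "BC", "ABC", FDS2)
p2_step2 = is_lossless_binary("ABC", "BD", "ABCD", FDS2)
print(p1, p2_step1, p2_step2)
```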
Normalization in DBMS
• Reducing the redundancies
• Ensuring the integrity of data through lossless decomposition
• Normalization is done through normal forms.
First Normal Form
• A given relation is called in First Normal Form (1NF)
– if each cell of the table contains only an atomic value.
OR
– if the attribute of every tuple is either single valued or a null value.
First Normal Form
Example
• In other words,
A → B is called a partial dependency if and only if-
– A is a proper subset of some candidate key
– B is a non-prime attribute.
• If either condition fails, then it is not a partial
dependency.
Second Normal Form
Example
• Consider a relation- R ( V , W , X , Y , Z ) with functional
dependencies-
VW → XY
Y→V
WX → YZ
• The possible candidate keys for this relation are- VW , WX , WY
• Prime attributes = { V , W , X , Y }
• Non-prime attributes = { Z }
• Now, if we observe the given dependencies-
• There is no partial dependency.
• Thus, we conclude that the given relation is in 2NF.
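The 2NF check for R(V, W, X, Y, Z) can be sketched in code: find the candidate keys, then look for a non-prime attribute determined by a proper subset of some candidate key. The function names are illustrative, the search is brute force, and the second relation at the end is a hypothetical counter-example that does not come from the notes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal super keys, found by trying attribute sets in increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= set(schema):
                keys.append(attrs)
    return keys


def partial_dependencies(schema, fds):
    """List (part of a candidate key, non-prime attributes it determines)."""
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    non_prime = set(schema) - prime
    found = []
    for key in keys:
        for size in range(1, len(key)):  # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                dependent = (closure(part, fds) - set(part)) & non_prime
                if dependent:
                    found.append(("".join(part), "".join(sorted(dependent))))
    return found


R_FDS = [("VW", "XY"), ("Y", "V"), ("WX", "YZ")]
keys = candidate_keys("VWXYZ", R_FDS)
partials = partial_dependencies("VWXYZ", R_FDS)

# Hypothetical counter-example: R(A, B, C, D) with AB -> C and B -> D.
bad = partial_dependencies("ABCD", [("AB", "C"), ("B", "D")])
print(keys, partials, bad)
```

An empty list of partial dependencies means the relation is in 2NF, which agrees with the conclusion above; in the counter-example, B is part of the key AB and determines the non-prime attribute D.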
Third Normal Form
• A given relation is called in Third Normal Form (3NF)
if and only if-
– Relation already exists in 2NF.
– No transitive dependency exists for non-prime attributes.
{AB}+ = {ABCD}
hence AB is a Candidate Key
Prime Attributes: A, B
Non-Prime Attributes: C, D
Definition of 2NF: No non-prime attribute
should be partially dependent on a Candidate Key
{ PQ → R, S → T }
{PQS}+ = {PQRST}
{PQS}+ = {PQRSTUVWXY}
{X}+ = {X, Y, Z}
X is the Candidate Key
FDs are X → Y and Y → Z
So, we can write X → Z
X (prime) → Y (non-prime) → Z (non-prime)
Hence the relation is not in 3NF
• Now check whether the above table is in 2NF.
• FD: X → Y satisfies 2NF (the key X is a single attribute, so Y is fully
functionally dependent on it)
• FD: Y → Z also satisfies 2NF (Y is not part of a candidate key, so this is
not a partial dependency)
• Hence the above table R( X, Y, Z ) is in 2NF but not in 3NF.
Convert the table R( X, Y, Z) into 3NF:
• Since the table was not in 3NF due to FD: Y → Z, let's
decompose the table
• FD: Y → Z was creating the issue, hence one table R2(Y, Z)
• Create one table for key X, R1(X, Y), since X → Y
• Hence the decomposed tables, which are in 3NF, are:
R1(X, Y)
R2(Y, Z)
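The 3NF check and the effect of the decomposition can be sketched in code. Instead of tracing transitive chains, the sketch uses the equivalent standard test: every non-trivial dependency must either have a super key on its left side or a prime attribute on its right side. The function names are illustrative and the candidate key search is brute force.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Minimal super keys, found by trying attribute sets in increasing size."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(attrs, fds) >= set(schema):
                keys.append(attrs)
    return keys


def violations_3nf(schema, fds):
    """Return (lhs, attribute) pairs that break 3NF; an empty list means 3NF."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        lhs_is_super_key = closure(lhs, fds) >= set(schema)
        for attr in sorted(set(rhs) - set(lhs)):  # skip the trivial part
            if not lhs_is_super_key and attr not in prime:
                bad.append((lhs, attr))
    return bad


before = violations_3nf("XYZ", [("X", "Y"), ("Y", "Z")])
after_r1 = violations_3nf("XY", [("X", "Y")])
after_r2 = violations_3nf("YZ", [("Y", "Z")])
print(before, after_r1, after_r2)
```

For R(X, Y, Z) the only violation is Y → Z, and both decomposed tables come back with no violations.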
• Question 2: Given a relation R( X, Y, Z, W, P)
and Functional Dependency set FD = { X → Y, Y
→ P, and Z → W }, determine whether the
given R is in 3NF. If not, convert it into 3NF.
R( X, Y, Z, W, P) and FD = { X → Y, Y → P, and Z → W}
{XZ}+ = {X, Z, Y, P, W}
XZ is the Candidate Key
{ X → Y, Y → P, and Z → W }
X (prime) → Y (non-prime) → P (non-prime)
Hence the relation is not in 3NF
Transaction
• Transaction is a set of operations which are all
logically related.
OR