
DBMS DC Unit 3

This document outlines the course structure for 'Database Management Systems' at RMK Group of Educational Institutions, detailing course objectives, prerequisites, syllabus, and outcomes. It includes a comprehensive lecture plan, activity-based learning, and assessment schedules, emphasizing practical applications and real-life case studies. The document is confidential and intended for educational purposes only.

Uploaded by

athishasecerf1

Please read this disclaimer before proceeding:

This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only to the respective group /
learning community as intended. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
22IT202
DATABASE MANAGEMENT SYSTEMS

Created by:
Dr. J. Jeno Jasmine
Mr.D.Kirubakaran
1.TABLE OF CONTENTS
1. Contents
2. Course Objectives

3. Pre Requisites

4. Syllabus

5. Course outcomes

6. CO- PO/PSO Mapping

7. Lecture Plan

8. Activity based learning

9. Lecture Notes

10. Assignments

11. Part A Question & Answer

12. Part B Question & Answer

13. Supportive online Certification courses

14. Real time Applications in day to day life and to Industry

15. Contents beyond the Syllabus

16. Assessment Schedule

17. Prescribed Text Books & Reference Books

18. Mini Project suggestions


2. COURSE OBJECTIVES

To understand the basic concepts of Data modeling and Database Systems.

To understand SQL and effective relational database design concepts.

To learn relational algebra, calculus and normalization.

To know the fundamental concepts of transaction processing, concurrency control techniques, recovery procedure and data storage techniques.

To understand query processing, efficient data querying and advanced databases.

3. PRE REQUISITES

PRE-REQUISITE

• 22CS101 Problem Solving and C++ Programming

• 20CS102 Software Development Practices

4. SYLLABUS

DATABASE MANAGEMENT SYSTEMS

UNIT I DATABASE CONCEPTS 9+6


Concept of Database and Overview of DBMS - Characteristics of Databases - Data
Models, Schemas and Instances - Three-Schema Architecture - Database Languages
and Interfaces - Introduction to Data Model Types - ER Model - ER Diagrams –
Enhanced ER Model - Reducing ER to Tables - Applications: ER Model of University
Database Application – Relational Database Design by ER- and EER-to-Relational
Mapping.

List of Exercise/Experiments
Case study using any one of the real-life database applications from the
following list:
a) Inventory Management for an EMart Grocery Shop
b) Society Financial Management
c) Cop Friendly App – Eseva
d) Property Management – eMall
e) Star Small and Medium Banking and Finance
● Build the Entity Model diagram. The diagram should align with the
business and functional goals stated in the application.

UNIT II STRUCTURED QUERY LANGUAGE 9+6


SQL Data Definition and Data Types – Constraints – Queries – INSERT, UPDATE,
and DELETE in SQL - Views - Integrity Procedures, Functions, Cursor and Triggers -
Embedded SQL - Dynamic SQL.

List of Exercise/Experiments
Case study using any one of the real-life database applications from the following
list, with the following exercises:
a) Inventory Management for an EMart Grocery Shop
b) Society Financial Management
c) Cop Friendly App – Eseva
d) Property Management – eMall
e) Star Small and Medium Banking and Finance

1. Data Definition Commands, Data Manipulation Commands for inserting, deleting,
updating and retrieving Tables and Transaction Control statements
2. Database Querying – Simple queries, Nested queries, Sub queries and Joins
3. Views, Sequences, Synonyms
4. Database Programming: Implicit and Explicit Cursors
5. Procedures and Functions
6. Triggers
7. Exception Handling
UNIT III RELATIONAL ALGEBRA, CALCULUS AND NORMALIZATION 9+6

Relational Algebra – Operations - Domain Relational Calculus - Tuple Relational Calculus -
Fundamental Operations. Relational Database Design - Functional Dependency –
Normalization (1NF, 2NF, 3NF and BCNF) – Multivalued Dependency and 4NF – Join
Dependencies and 5NF - De-normalization.

List of Exercise/Experiments
1. Case study using any one of the real-life database applications from the following list:
● Inventory Management for an EMart Grocery Shop
● Society Financial Management
● Cop Friendly App – Eseva
● Property Management – eMall
● Star Small and Medium Banking and Finance
● Apply normalization rules in designing the tables in scope.
UNIT IV TRANSACTIONS, CONCURRENCY CONTROL AND DATA STORAGE 9+6
Transaction Concepts – ACID Properties – Schedules based on Recoverability,
Serializability – Concurrency Control – Need for Concurrency – Locking Protocols – Two
Phase Locking – Transaction Recovery – Concepts – Deferred Update – Immediate
Update. Organization of Records in Files – Unordered, Ordered – Hashing Techniques –
RAID – Ordered Indexes – Multilevel Indexes - B+ tree Index Files – B tree Index Files.

List of Exercise/Experiments
Case study using any one of the real-life database applications from the following list:
a) Inventory Management for an EMart Grocery Shop
b) Society Financial Management
c) Cop Friendly App – Eseva
d) Property Management – eMall
e) Star Small and Medium Banking and Finance
Showcase the ACID properties with sample queries and appropriate settings
for the above scenario.
UNIT V QUERY OPTIMIZATION AND ADVANCED DATABASES 9+6
Query Processing Overview – Algorithms for SELECT and JOIN operations – Query
optimization using Heuristics. Distributed Database Concepts – Design – Concurrency Control
and Recovery – NOSQL Systems – Document-Based NOSQL Systems and MongoDB.

List of Exercise/Experiments
Case study using any one of the real-life database applications from the following list:
a) Inventory Management for an EMart Grocery Shop
b) Society Financial Management
c) Cop Friendly App – Eseva
d) Property Management – eMall
e) Star Small and Medium Banking and Finance

● Build PL/SQL stored procedures for complex functionalities, e.g. EOD batch
processing for calculating the EMI for Gold Loan for each eligible customer.
TOTAL: 45+30=75 PERIODS
5. COURSE OUTCOMES

CO1: Map ER model to Relational model to perform database design effectively.

CO2: Implement SQL and effective relational database design concepts.

CO3: Apply relational algebra, calculus and normalization techniques in database design.

CO4: Understand the concepts of transaction processing, concurrency control, recovery procedure and data storage techniques.

CO5: Apply query optimization techniques and understand advanced databases.


6. CO- PO/PSO MAPPING

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 2 1 1 1 1 1 1 2 2 2 2 2
CO2 3 2 2 1 1 1 1 2 2 2 2 2
CO3 2 1 1 1 1 1 1 2 2 2 2 2
CO4 2 1 1 1 1 1 1 2 2 2 2 2
CO5 2 1 1 1 1 1 1 2 2 2 2 2
CO6 2 1 1 1 1 1 1 2 2 2 2 2

PSO1 PSO2 PSO3


CO1 2 2 2
CO2 2 3 2
CO3 2 2 2
CO4 2 2 2
CO5 2 2 2
CO6 2 2 2
5. COURSE OUTCOME

Course Code | Course Outcome Statement | Cognitive/Affective Level of the Course Outcome | Expected Level of Attainment

Course Outcome Statements in Cognitive Domain

C212.1 | Map ER model to Relational model to perform database design effectively | Analyse K4 | 60%
C212.2 | Implement SQL and effective relational database design concepts. | Apply K3 | 60%
C212.3 | Apply relational algebra, calculus and normalization techniques in database design. | Analyse K4 | 60%
C212.4 | Understand the concepts of transaction processing, concurrency control, recovery procedure and data storage techniques. | Understand K1 | 60%
C212.5 | Apply query optimization techniques and understand advanced databases. | Apply K3 | 60%

Course Outcome Statements in Affective Domain

C212.7 | Attend the classes regularly | Respond (A2) | 95%
C212.8 | Submit the Assignments regularly. | Respond (A2) | 95%
C212.9 | Participation in Seminar/Quiz/Group Discussion/Collaborative learning and content beyond syllabus | Valuing (A3) | 95%
6. CO-PO/PSO MAPPING

Correlation Matrix of the Course Outcomes to


Programme Outcomes and Programme Specific
Outcomes Including Course Enrichment Activities

Programme Outcomes (POs), Programme Specific Outcomes (PSOs)

Course Outcomes (COs) | Level | PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3

Level of PO/PSO | | K3 K4 K5 K5 K3/K5 A2 A3 A3 A3 A3 A3 A2 K3 K3 K3

C212.1 K4 3 3 2 2 3 3 3

C212.2 K3 3 2 1 1 3 3 3

C212.3 K4 3 3 2 2 3 3 3
C212.4 K4 3 3 2 2 3 3 3

C212.5 K4 3 3 2 2 3 3 3

C212.6 K4 3 3 2 2 3 3 3

C212.7 A2 3

C212.8 A2 2 2 2 3

C212.9 A3 3 3 3 3 3

C305 3 3 2 2 3 3 3

7. LECTURE PLAN

S.No | Topic | No. of Periods | Proposed Date | Actual Lecture Date | CO | Taxonomy Level | Mode of Delivery

1 | Relational Algebra | 1 | | | CO2 | K3 | PPT
2 | Domain Relational Calculus | 1 | | | CO2 | K3 | PPT
3 | Tuple Relational Calculus, Fundamental Operations | 1 | | | CO2 | K3 | PPT
4 | Relational Database Design | 1 | | | CO2 | K3 | PPT
5 | Functional Dependencies, Non-loss Decomposition | 1 | | | CO3 | K4 | PPT
6 | First, Second, Third Normal Forms, Dependency Preservation | 1 | | | CO3 | K4 | PPT
7 | Boyce-Codd Normal Form, Multi-valued Dependencies and Fourth Normal Form | 1 | | | CO3 | K4 | PPT
8 | Join Dependencies and Fifth Normal Form | 1 | | | CO3 | K4 | PPT
8. ACTIVITY BASED LEARNING
1. Crossword Puzzle
Across
3. Attributes whose values are obtained from other attribute values.
5. If not all / only a few entities in E participate in the relation R.
6. A level that describes how the data is actually stored on disk.
9. The level of how the relationship between data.
10. Results of analysis and synthesis of data.
11. Organized data sets based on a relationship structure.
15. The DBMS component that evaluates the query.
19. Properties of an entity.
22. Records of existing or occurring phenomena or facts.
24. Objects that distinguish from other objects.
26. Minimal set of attributes that can uniquely distinguish each row of data in a
table.
28. Which is used to uniquely distinguish an entity from other entities in the
entity set.
29. Language for manipulating data.
30. Atomic attributes which cannot be further divided into smaller subsections.
DOWN
1. One example of a DBMS.
2. Relationships that point to the same entity.
4. Languages for creating database schemas.
7. DBMS component that provides an interface between application programs and
data stored in the database.
8. Attributes that have only one single value.
12. The set of one or more attributes that can uniquely distinguish each row of
data in a table.
13. Attributes that can be further divided into smaller sub-attributes, which have
meaning.
14. has only a few values.
16. Relationships between several entities.
17. One of the advantages of databases.
18. The main tool for identifying entities in an entity set.
20. The value contained in the database at one time
21. Users who are responsible for database management (abbreviated).
23. Data regarding data.
25. Complete description of the terrain structure, records, and data relationships
in the database.
27. The software creates and maintains databases.
9. LECTURE NOTES
1. RELATIONAL ALGEBRA

Relational algebra is a procedural query language. It consists of a set of
operations that take one or two relations as input and produce a new relation as
their result.

The fundamental operations in relational algebra are:

select

project

union

set difference

Cartesian product

rename

The select, project and rename operations are called unary operations
because they operate on one relation. The other three operations operate on
pairs of relations and are therefore called binary operations.
2. Operations
1) The select operation:

The select operation selects tuples that satisfy a given predicate.

The lowercase Greek letter sigma (σ) is used to denote selection. The predicate
appears as a subscript to σ.

Comparisons are allowed in the predicate using the relational operators =, ≠, <, ≤, >,
≥.

Several predicates can be combined into a larger predicate by using the connectives
∧ (and), ∨ (or), ¬ (not).

Question 1:- Select those tuples of the loan relation where the branch-name
is Perryridge.

Relational algebra query:

The equivalent relational algebra query for the given question is,

σ branch-name = “Perryridge” (loan)

The result of the query is given below:

Question 2:- Find all the tuples in which the amount lent is more than 1200
in loan relation.
Relational algebra query:
The equivalent relational algebra query for the given question is,
σ amount >1200 (loan)
The result of the query is given below.
loan-number branch-name Amount

L-14 Downtown 1500


L-15 Perryridge 1500

L-16 Perryridge 1300

L-23 RedWood 3000

Question 3:- Find those tuples pertaining to loans of more than 1200 made by the
Perryridge branch.

Relational algebra query:

The equivalent relational algebra query for the given question is,
σ branch-name = “Perryridge” ∧ amount > 1200 (loan)

The result of the query is given below.

loan-number branch-name Amount

L-15 Perryridge 1500

L-16 Perryridge 1300
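
The select operation can be sketched in Python as a filter over a list of tuples. This is only an illustration, not part of relational algebra itself: the function name `select` and the dictionary layout are assumptions, and the row L-11 is a made-up loan added so that the filter has something to reject. The other four rows are the ones shown in the results above.

```python
# A relation is modelled as a list of dicts; each dict is one tuple.
loan = [
    {"loan_number": "L-11", "branch_name": "Round Hill", "amount": 900},  # hypothetical row
    {"loan_number": "L-14", "branch_name": "Downtown", "amount": 1500},
    {"loan_number": "L-15", "branch_name": "Perryridge", "amount": 1500},
    {"loan_number": "L-16", "branch_name": "Perryridge", "amount": 1300},
    {"loan_number": "L-23", "branch_name": "RedWood", "amount": 3000},
]

def select(relation, predicate):
    """sigma_predicate(relation): keep the tuples for which predicate is true."""
    return [t for t in relation if predicate(t)]

# sigma branch-name = "Perryridge" AND amount > 1200 (loan)
big_perryridge = select(
    loan, lambda t: t["branch_name"] == "Perryridge" and t["amount"] > 1200
)
print([t["loan_number"] for t in big_perryridge])  # ['L-15', 'L-16']
```

The predicate is passed as a function, which mirrors how the subscript of σ can be any combination of comparisons joined by and, or and not.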


2) The Project Operation:

The project operation is a unary operation that returns its argument relation with
certain attributes left out. Projection is denoted by the Greek letter pi (Π). The
attributes that should appear in the result are listed as a subscript to Π. The
argument relation follows in parentheses. Since a relation is a set, duplicate rows
in the result are eliminated.

Composition of relational operations:

The relational operations can be composed together into a relational algebra
expression.

Question 1:- List all the loan-numbers and the amount of each loan.

Relational algebra query:

The equivalent relational algebra query for the given question is,

Π loan-number, amount (loan)

The result of the query is given


Question 2:- Find the names of the customers who live in Harrison.
Relational algebra query:
The equivalent relational algebra query for the given question is,
Π customer-name (σ customer-city = “Harrison” (customer))
Here an expression, rather than the name of a relation, is given as the argument
to the projection operation.
Result:
customer-name
Hayes

Jones
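
Projection and the composition of operations can be sketched in Python as follows. The helper names `select` and `project` are assumptions made for this sketch, and the customer rows are borrowed from the customer relation that appears in the natural join example of these notes.

```python
# Each tuple is (customer-name, customer-street, customer-city).
customer = [
    ("Adams", "North", "Brooklyn"),
    ("Hayes", "Main", "Harrison"),
    ("Brooks", "Alma", "Stanford"),
    ("John", "Main", "Woodside"),
    ("Jones", "North", "Harrison"),
]
NAME, STREET, CITY = 0, 1, 2  # column positions

def select(relation, predicate):
    """sigma: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *columns):
    """pi: keep only the listed columns; the set removes duplicate rows."""
    return {tuple(t[c] for c in columns) for t in relation}

# pi customer-name (sigma customer-city = "Harrison" (customer))
harrison_names = project(select(customer, lambda t: t[CITY] == "Harrison"), NAME)
print(sorted(harrison_names))  # [('Hayes',), ('Jones',)]
```

The output of `select` is passed directly to `project`, which is the same idea as giving an expression, instead of a relation name, as the argument of Π.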

3) The Union Operation:

The union operation is a binary operation that combines two relations. It is
denoted by the symbol (∪). For a union operation r ∪ s to be valid, two
conditions must hold.

The relations r and s must be of the same arity. That is, they must have the
same number of attributes.

The domains of the ith attribute of r and the ith attribute of s must be the same
for all i.

Question 1:- Find the names of all customers who have either an
account or a loan or both.

Relational algebra query:

The equivalent relational algebra query for the given question is,

Π customer-name (borrower) ∪ Π customer-name (depositor)


Result:

4) The set difference operation:

The set-difference operation, denoted by (–), finds the tuples that are in one
relation but are not in another. It is a binary operation. For a set difference
operation r – s to be valid, the relations r and s must be of the same arity and
the domains of the ith attribute of r and the ith attribute of s must be the same.

Question 1:- Find all the customers of the bank who have an account but not
a loan.

Relational algebra query:

The equivalent relational algebra query for the given question is,

Π customer-name (depositor) – Π customer-name (borrower)

Result:
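
The union and set-difference queries above can be tried out with Python sets. This is a sketch under assumptions: the borrower and depositor rows are borrowed from the Cartesian product and natural join examples elsewhere in these notes, and the helper `names` stands in for the projection on customer-name.

```python
# borrower(customer-name, loan-number) and depositor(customer-name, acc-no)
borrower = {("Adams", "L-16"), ("Hayes", "L-93"), ("Jackson", "L-15")}
depositor = {("Hayes", "A-102"), ("John", "A-101"), ("John", "A-201"), ("Jones", "A-217")}

def names(relation):
    """pi customer-name: keep the first column; the set drops duplicates."""
    return {t[0] for t in relation}

# Both operands have arity 1 and the same domain, so union and difference are valid.
either = names(borrower) | names(depositor)        # union
account_only = names(depositor) - names(borrower)  # set difference

print(sorted(either))        # ['Adams', 'Hayes', 'Jackson', 'John', 'Jones']
print(sorted(account_only))  # ['John', 'Jones']
```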
5) The Cartesian product operation:

The Cartesian product operation, denoted by a cross (×), allows combining
information from any two relations.

The Cartesian product of r1 and r2 is written as r1 × r2.

If r1 contains n1 tuples and r2 contains n2 tuples, then there are n1*n2 ways of
choosing a pair of tuples, one tuple from each relation.

So, there are n1*n2 tuples in r1 × r2.

Example:

Consider the borrower and loan relations given below.

borrower relation:

customer-name loan-no
Adams L-16
Hayes L-93
Jackson L-15

loan relation:

loan-no branch-name amt
L-93 Round Hill 900
L-16 Downtown 1500
L-15 Perryridge 1300

Question 1:- Find the names of all customers who have a loan at the Perryridge branch.

Computation of relational algebra query:

This question needs the information in both the loan relation and the borrower relation.

Suppose the query is first written as

σ branch-name = “Perryridge” (borrower × loan)

The Cartesian product borrower × loan is:

customer-name borrower.loan-number loan.loan-number branch-name amount
Adams L-16 L-93 Round Hill 900
Adams L-16 L-16 Downtown 1500
Adams L-16 L-15 Perryridge 1300
Hayes L-93 L-93 Round Hill 900
Hayes L-93 L-16 Downtown 1500
Hayes L-93 L-15 Perryridge 1300
Jackson L-15 L-93 Round Hill 900
Jackson L-15 L-16 Downtown 1500
Jackson L-15 L-15 Perryridge 1300

Result of σ branch-name = “Perryridge” (borrower × loan):

customer-name borrower.loan-number loan.loan-number branch-name amount
Adams L-16 L-15 Perryridge 1300
Hayes L-93 L-15 Perryridge 1300
Jackson L-15 L-15 Perryridge 1300

The above relation contains only tuples for the Perryridge branch. However, the
customer-name column may contain customers who do not have a loan at
the Perryridge branch, because the Cartesian product pairs every borrower tuple
with every loan tuple. Therefore, to obtain the correct result, the query has
to be written as below.
σ borrower.loan-number = loan.loan-number (σ branch-name = “Perryridge”
(borrower × loan))
The result of the above query is given below:

customer-name borrower.loan-number loan.loan-number branch-name amount
Jackson L-15 L-15 Perryridge 1300
Finally, since only the customer-name is needed, the projection operation is used as
below.

Π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name = “Perryridge” (borrower × loan)))

Final result:

Customer-name

Jackson
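
The three steps of this query (product, two selections, projection) can be traced in Python with `itertools.product`. The variable names are assumptions chosen for this sketch; the data are the borrower and loan relations shown above.

```python
from itertools import product

borrower = [("Adams", "L-16"), ("Hayes", "L-93"), ("Jackson", "L-15")]
loan = [("L-93", "Round Hill", 900), ("L-16", "Downtown", 1500), ("L-15", "Perryridge", 1300)]

# borrower x loan: every borrower tuple is paired with every loan tuple.
# Columns: 0 customer-name, 1 borrower.loan-number, 2 loan.loan-number,
#          3 branch-name, 4 amount
pairs = [b + l for b, l in product(borrower, loan)]

# sigma branch-name = "Perryridge": still contains unrelated customers
at_perryridge = [t for t in pairs if t[3] == "Perryridge"]

# sigma borrower.loan-number = loan.loan-number: keep matching pairs only
matching = [t for t in at_perryridge if t[1] == t[2]]

# pi customer-name
customer_names = {t[0] for t in matching}

print(len(pairs), len(at_perryridge), customer_names)  # 9 3 {'Jackson'}
```

The counts show why the second selection is needed: the product has 3 × 3 = 9 tuples, and filtering on the branch alone still leaves three customers.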

6) The Rename Operation:

The rename operator is used to give a name to the result of a relational algebra
expression and, optionally, to rename its attributes. It is denoted by the
lowercase Greek letter rho (ρ). Given a relational algebra expression E, the
expression ρ x (E) returns the result of expression E under the name x.

A second form of the rename operation is as follows.

ρ x(A1,A2,…,An) (E)

It returns the result of expression E under the name x, and with the attributes
renamed to A1, A2,….,An.

Examples:

Consider the account relation.


acc-no branch-name balance

A-101 Downtown 500

A-102 Perryridge 400

A-201 Brighton 900


Question 1:- Find the largest account balance in the bank.

Computation:

This query requires us to (1) first compute a temporary relation consisting of those
balances that are not the largest and (2) take the set difference between the
relation Π balance (account) and the temporary relation just computed, to obtain
the result.

Step 1:

To compute the temporary relation, the values of all account balances must be
compared. This comparison is done by computing the Cartesian product
account × account and forming a selection to compare the values of any two
balances appearing in one tuple. The rename operation is used to rename one
reference to the account relation, so that the two copies can be told apart.

The expression for the temporary relation that consists of the balances that are not
the largest is

Π account.balance (σ account.balance < d.balance (account × ρ d (account)))

This expression gives those balances in the account relation for which a larger
balance appears somewhere in the account relation renamed as d. The result
contains all balances except the largest one, as shown next.

The result of account × ρ d (account) is

account.acc-no account.branch-name account.balance d.acc-no d.branch-name d.balance
A-101 Downtown 500 A-101 Downtown 500
A-101 Downtown 500 A-102 Perryridge 400
A-101 Downtown 500 A-201 Brighton 900
A-102 Perryridge 400 A-101 Downtown 500
A-102 Perryridge 400 A-102 Perryridge 400
A-102 Perryridge 400 A-201 Brighton 900
A-201 Brighton 900 A-101 Downtown 500
A-201 Brighton 900 A-102 Perryridge 400
A-201 Brighton 900 A-201 Brighton 900

The result of σ account.balance < d.balance (account × ρ d (account))

account.acc-no account.branch-name account.balance d.acc-no d.branch-name d.balance
A-101 Downtown 500 A-201 Brighton 900
A-102 Perryridge 400 A-101 Downtown 500
A-102 Perryridge 400 A-201 Brighton 900

The result of Π account.balance (σ account.balance < d.balance (account × ρ d (account))),
after projection eliminates the duplicate value 400, is

account.balance
500
400
Step 2:

The query to find the largest account balance in the bank can be written as:

Π balance (account) – Π account.balance (σ account.balance < d.balance (account × ρ d (account)))

The result of this query is given below

balance

900
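
The same two-step computation can be sketched in Python. Binding the relation to a second name `d` plays the role of the rename operation in this sketch; the variable names are assumptions, and the rows are the account relation shown above.

```python
from itertools import product

# account(acc-no, branch-name, balance)
account = [("A-101", "Downtown", 500), ("A-102", "Perryridge", 400), ("A-201", "Brighton", 900)]
d = account  # rho d (account): a second reference to the same relation
BALANCE = 2

# Step 1: balances for which some larger balance exists in d
not_largest = {a[BALANCE] for a, b in product(account, d) if a[BALANCE] < b[BALANCE]}

# Step 2: pi balance (account) minus the temporary relation
largest = {a[BALANCE] for a in account} - not_largest

print(sorted(not_largest), largest)  # [400, 500] {900}
```

Because `not_largest` is a set, the value 400 appears only once even though two pairs of tuples produce it.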

Question 2:- Find the names of all customers who live on the same street
and in the same city as Smith.

Consider the customer relation.

customer-name customer-street customer-city
Adams Spring Harrison
Curry North Rye
Brooks Main Harrison
Smith North Rye

Computation:
Smith's street and city can be obtained by,
Π customer-street, customer-city (σ customer-name = “Smith” (customer))
In order to find other customers with this street and city, the customer relation must be
referred to a second time. The rename operation is used for this purpose. The resulting
expression is given below:

Π customer.customer-name (σ customer.customer-street = smith-addr.street ∧ customer.customer-city = smith-addr.city (customer × ρ smith-addr(street, city) (Π customer-street, customer-city (σ customer-name = “Smith” (customer)))))
In the previous expression, customer-street and customer-city are renamed to street
and city. The result of the preceding expression is given below.

customer-name
Curry
Smith

Additional Operations:

The additional operations in relational algebra are:

Set intersection operation

Natural join operation

Division operation

Assignment operation

Set intersection operation:

The set intersection operation is denoted by (∩). It returns the tuples that are
common to both relations. It is a binary operation. Set intersection can be
expressed with a pair of set difference operations:

r ∩ s = r – (r – s)

Example: Consider the borrower and depositor relations.

Question 1: Find all customers who have both a loan and an account.

Relational algebra expression:

The equivalent relational algebra query for the given question is,

Π customer-name (borrower) ∩ Π customer-name (depositor)
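
The identity that expresses intersection through two set differences can be checked with Python sets. The two sets below are assumed to be the customer-name projections of the borrower and depositor relations used elsewhere in these notes.

```python
r = {"Adams", "Hayes", "Jackson"}  # pi customer-name (borrower)
s = {"Hayes", "John", "Jones"}     # pi customer-name (depositor)

via_difference = r - (r - s)  # remove from r everything that is not in s
direct = r & s                # Python's built-in intersection

print(via_difference, via_difference == direct)  # {'Hayes'} True
```

This is why intersection is listed as an additional operation: it adds convenience, not expressive power.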


Natural join operation:

The natural join operation is a binary operation that allows combining
certain selections and a Cartesian product into one operation. It is denoted by the
“join” symbol (⋈).

The natural join operation forms a Cartesian product of its two arguments,
performs a selection forcing equality on those attributes that appear in both
relation schemas and finally removes duplicate attributes.

Example 1: Consider the loan and borrower relations. Refer to the Cartesian product
operation for the relations.

Question 1:- Find the names of all customers who have a loan at the bank,
and find the amount of the loan.

Relational algebra expression:

Π customer-name, loan-number, amount (borrower ⋈ loan)

Since the schemas for borrower and loan have the attribute loan-number in
common, the natural join operation considers only pairs of tuples that have the
same value for loan-number. It combines each such pair of tuples into a single
tuple on the union of the two schemas. After performing the projection, the following
result is obtained.

Result:
Customer-name Loan-number Amount

Adams L-16 1500

Hayes L-93 900

Jackson L-15 1300
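
A natural join can be sketched in Python by matching tuples on every attribute name the two relations share. The function `natural_join` and the dictionary keys are assumptions made for this illustration; the data are the borrower and loan relations from the Cartesian product example.

```python
borrower = [
    {"customer_name": "Adams", "loan_number": "L-16"},
    {"customer_name": "Hayes", "loan_number": "L-93"},
    {"customer_name": "Jackson", "loan_number": "L-15"},
]
loan = [
    {"loan_number": "L-93", "branch_name": "Round Hill", "amount": 900},
    {"loan_number": "L-16", "branch_name": "Downtown", "amount": 1500},
    {"loan_number": "L-15", "branch_name": "Perryridge", "amount": 1300},
]

def natural_join(r, s):
    """Pair tuples that agree on all shared attributes; merge each pair into one dict."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])  # attribute names present in both schemas
    return [
        {**t, **u}                  # merging keeps a single copy of shared attributes
        for t in r
        for u in s
        if all(t[a] == u[a] for a in common)
    ]

joined = natural_join(borrower, loan)
result = {(t["customer_name"], t["loan_number"], t["amount"]) for t in joined}
print(sorted(result))
# [('Adams', 'L-16', 1500), ('Hayes', 'L-93', 900), ('Jackson', 'L-15', 1300)]
```

If the two relations share no attribute, `common` is empty, every pair matches, and the function degenerates to the Cartesian product, which agrees with the definition given above.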


Example 2: Consider the customer, account and depositor relations
given below:

customer relation:

customer-name customer-street customer-city
Adams North Brooklyn
Hayes Main Harrison
Brooks Alma Stanford
John Main Woodside
Jones North Harrison

account relation:

acc-no branch-name balance
A-101 Downtown 500
A-102 Perryridge 400
A-201 Brighton 900
A-217 Brighton 750

depositor relation:

customer-name acc-no
Hayes A-102
John A-101
John A-201
Jones A-217
Question 2:- Find the names of all branches with customers who have an
account in the bank and who live in Harrison.
Relational algebra expression:
Π branch-name (σ customer-city = “Harrison” (customer ⋈ account ⋈ depositor))
Result:
Branch-name

Brighton

Perryridge

3. The division operation:

The division operation, denoted by (÷), is suited to queries that include the phrase
“for all”.
Example:
Consider the branch, account and depositor relations. Refer to the natural join
operation for the account and depositor relations.
branch relation:

branch-name branch-city assets
Brighton Brooklyn 7100000
Downtown Brooklyn 900000
Mianus Horseneck 400000
North Rye 170000


Question 1:- Find all customers who have an account at all the branches located in
Brooklyn.

Relational algebra expression:

The expression for obtaining all branches in Brooklyn is,

r1 = Π branch-name (σ branch-city = “Brooklyn” (branch))

The result of this expression is,

branch-name
Brighton
Downtown

The (customer-name, branch-name) pairs of all customers who have an account at a branch
can be found by the expression,

r2 = Π customer-name, branch-name (depositor ⋈ account)

The result of the above expression is,

customer-name branch-name
Hayes Perryridge
John Downtown
John Brighton
Jones Brighton

To find the customers who appear in r2 with every branch name in r1, the divide operation is
used as given below.

Π customer-name, branch-name (depositor ⋈ account) ÷ Π branch-name (σ branch-city = “Brooklyn” (branch))

Result:
Customer-name
John
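
The meaning of division can be sketched in Python: a customer is kept only if it is paired in r2 with every branch in r1. The function name `divide` is an assumption for this sketch; r1 and r2 hold the values computed above.

```python
# r2(customer-name, branch-name) and r1(branch-name)
r2 = {("Hayes", "Perryridge"), ("John", "Downtown"), ("John", "Brighton"), ("Jones", "Brighton")}
r1 = {"Brighton", "Downtown"}

def divide(r, s):
    """r / s: first-column values of r that occur together with every value in s."""
    candidates = {name for name, _ in r}
    return {name for name in candidates if all((name, b) in r for b in s)}

print(divide(r2, r1))  # {'John'}
```

Jones is rejected because the pair (Jones, Downtown) is missing, and Hayes is rejected because neither Brooklyn branch appears with Hayes.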
The Assignment Operation:

The assignment operation, denoted by ←, works like assignment in a programming
language. For example, r – s can be written as
temp1 ← r

temp2 ← s

Result ← temp1 – temp2


5. Extended Relational Algebra Operations:

Generalized Projection:

The generalized projection operation extends the projection operation by
allowing arithmetic functions to be used in the projection list. The generalized
projection operation has the form

Π F1,F2,….,Fn (E)

where E is any relational-algebra expression, and each of F1, F2,…, Fn is an
arithmetic expression involving constants and attributes in the schema of E.

Example: Consider the relation credit-info.

Question:

The credit-info relation lists the credit limit and the expenses incurred so far. To find how
much more each person can spend, the following expression is written:

Π customer-name, limit – credit-balance (credit-info)

The attribute resulting from the expression limit – credit-balance does not have a
name. The rename operation can be applied for this purpose as below.

Π customer-name, (limit – credit-balance) as credit-available (credit-info)

Result:
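
Generalized projection can be sketched in Python as a comprehension that computes a new column from existing ones. The credit-info rows below are made-up values used only for illustration, and the key `credit_available` corresponds to the name given with "as" in the expression above.

```python
# Hypothetical credit-info(customer-name, limit, credit-balance) rows
credit_info = [
    {"customer_name": "Jones", "limit": 6000, "credit_balance": 700},
    {"customer_name": "Smith", "limit": 2000, "credit_balance": 400},
]

# pi customer-name, (limit - credit-balance) as credit-available (credit-info)
credit_available = [
    {
        "customer_name": t["customer_name"],
        "credit_available": t["limit"] - t["credit_balance"],  # arithmetic in the projection list
    }
    for t in credit_info
]

print(credit_available[0])  # {'customer_name': 'Jones', 'credit_available': 5300}
```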
6. Aggregate Functions:

Aggregate functions take a collection of values and return a single value as a result.

avg: average value
min: minimum value
max: maximum value
sum: sum of values
count: number of values

Aggregate functions in relational algebra: The symbol 𝒢 (calligraphic G) is used for
aggregate operations.

Question:

To find the total sum of the salaries of all part-time employees in the bank, the
following relational algebra expression is used.

𝒢 sum(salary) (pt-works)
7. Groups:
The result can be grouped based on some attribute. For example, to partition the
relation pt-works into groups based on the branch, and to apply aggregation on
each group, the query is written as below.

branch-name 𝒢 sum(salary) (pt-works)

The result of the expression is given below:

First the relation is grouped based on branch-name, without performing aggregation,
as shown below.

Final result after aggregation:
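
The grouping and aggregation steps can be sketched in Python. The pt-works rows below (names, branches and salaries) are illustrative assumptions, not data from these notes; only the attribute names branch-name and salary come from the expressions above.

```python
from collections import defaultdict

# Hypothetical pt-works(employee-name, branch-name, salary) rows
pt_works = [
    ("Rao", "Austin", 1500),
    ("Sato", "Austin", 1600),
    ("Johnson", "Downtown", 1500),
    ("Peterson", "Downtown", 2500),
]

# G sum(salary) (pt-works): one value for the whole relation
total_salary = sum(salary for _, _, salary in pt_works)

# branch-name G sum(salary) (pt-works): first group by branch, then sum each group
groups = defaultdict(list)
for _, branch, salary in pt_works:
    groups[branch].append(salary)
sum_by_branch = {branch: sum(salaries) for branch, salaries in groups.items()}

print(total_salary)   # 7100
print(sum_by_branch)  # {'Austin': 3100, 'Downtown': 4000}
```

The intermediate `groups` dictionary corresponds to the grouped relation before aggregation, and `sum_by_branch` to the final result with one tuple per branch.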


8. Outer join:
The outer join operation is an extension of the join operation to
deal with missing information.
Example: Consider the relations employee and ft-works as below.

To generate a single relation from the above two relations, a possible
approach is to use the natural join operation. The expression is
given below.
employee ⋈ ft-works
The result of this expression is given below.
In the above relation the street and city information about Smith is lost, since the
tuple describing Smith is absent from the ft-works relation. Similarly, the branch
name and salary information about Gates is lost, since the tuple describing Gates
is absent from the employee relation. The outer join operation can be used to
avoid this loss of information. There are three forms of the outer join operation. They
are:

(i) Left outer join (ii) Right outer join (iii) Full outer join

(i) The left outer join (⟕): This takes all tuples in the left relation that did not match
any tuple in the right relation, pads them with null values for all the
attributes from the right relation, and adds them to the result of the natural join.
The result of employee ⟕ ft-works is given below.

(ii) The right outer join (⟖): It is symmetric with the left outer join. It pads tuples
from the right relation that did not match any from the left relation with nulls and
adds them to the result of the natural join. The result of employee ⟖ ft-works is
given below.
(iii) The full outer join (⟗): It does both of the above operations, padding tuples from
the left relation that did not match any from the right relation, as well as tuples
from the right relation that did not match any from the left relation, and adding
them to the result of the join. The relation below shows the result of employee ⟗
ft-works.
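
The three outer joins can be sketched in Python, with `None` standing in for null. All the rows below are hypothetical: only the names Smith (present in employee only) and Gates (present in ft-works only) follow the description above, and the function `left_outer_join` and the attribute names are assumptions made for this sketch.

```python
employee = [
    {"name": "Coyote", "street": "Toon", "city": "Hollywood"},
    {"name": "Smith", "street": "Revolver", "city": "Death Valley"},
]
ft_works = [
    {"name": "Coyote", "branch": "Mesa", "salary": 1500},
    {"name": "Gates", "branch": "Redmond", "salary": 5300},
]

def left_outer_join(r, s, key):
    """Left outer join of two non-empty lists of dicts on one shared attribute."""
    pad = dict.fromkeys((a for a in s[0] if a != key), None)  # nulls for s's attributes
    out = []
    for t in r:
        matches = [u for u in s if u[key] == t[key]]
        if matches:
            out.extend({**t, **u} for u in matches)  # ordinary join tuples
        else:
            out.append({**t, **pad})                 # unmatched tuple padded with nulls
    return out

left = left_outer_join(employee, ft_works, "name")
right = left_outer_join(ft_works, employee, "name")  # right outer join = left join with operands swapped
full = left + [t for t in right if t not in left]    # add right-only tuples once

print([t["name"] for t in full])  # ['Coyote', 'Smith', 'Gates']
```

In `left`, Smith is kept with `branch` and `salary` set to `None`; in `right`, Gates is kept with `street` and `city` set to `None`; `full` contains both padded tuples plus the matched Coyote tuple.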
3. Relational Calculus

A relational calculus expression creates a new relation, which is specified in
terms of variables that range over rows of the stored database relations (in tuple
calculus) or over columns of the stored relations (in domain calculus).

In a calculus expression, there is no order of operations to specify how to retrieve
the query result; a calculus expression specifies only what information the result
should contain.

This is the main distinguishing feature between relational algebra and
relational calculus.

Relational calculus is considered to be a nonprocedural language.

This differs from relational algebra, where we must write a sequence of
operations to specify a retrieval request; hence relational algebra can be
considered a procedural way of stating a query.

4. Tuple Relational Calculus

The tuple relational calculus is based on specifying a number of tuple variables.

Each tuple variable usually ranges over a particular database relation, meaning
that the variable may take as its value any individual tuple from that relation.

A simple tuple relational calculus query is of the form

{t | COND(t)}

where t is a tuple variable and COND(t) is a conditional expression involving t.

The result of such a query is the set of all tuples t that satisfy COND(t).
For example, to find all employees whose salary is above $50,000, we can write
the following tuple calculus expression:

{t | EMPLOYEE(t) AND t.Salary>50000}

The condition EMPLOYEE(t) specifies that the range relation of tuple variable t
is EMPLOYEE.

Each EMPLOYEE tuple t that satisfies the condition t.Salary>50000 will be
retrieved. Notice that t.Salary references attribute Salary of tuple variable t.

To retrieve only some of the attributes, say the first and last names, we write

{t.Fname, t.Lname | EMPLOYEE(t) AND t.Salary>50000}

Informally, we need to specify the following information in a tuple
relational calculus expression:

For each tuple variable t, the range relation R of t. This value is specified by a
condition of the form R(t).

A condition to select particular combinations of tuples. As tuple variables range
over their respective range relations, the condition is evaluated for every possible
combination of tuples to identify the selected combinations for which the
condition evaluates to TRUE.

A set of attributes to be retrieved, the requested attributes. The values of these
attributes are retrieved for each selected combination of tuples.

Query 0. Retrieve the birth date and address of the employee (or employees)
whose name is John B. Smith.

Q0: {t.Bdate, t.Address | EMPLOYEE(t) AND t.Fname=‘John’ AND t.Minit=‘B’ AND t.Lname=‘Smith’}

In tuple relational calculus, we first specify the requested attributes t.Bdate and
t.Address for each selected tuple t. Then we specify the condition for selecting a
tuple following the bar (|)—namely, that t be a tuple of the EMPLOYEE relation
whose Fname, Minit, and Lname attribute values are ‘John’, ‘B’, and ‘Smith’,
respectively.
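The two expressions above map closely onto set comprehensions. The following is a minimal sketch in Python, assuming a made-up EMPLOYEE state; the `Employee` type and the sample rows are illustrative and are not part of the notes.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    Fname: str
    Minit: str
    Lname: str
    Bdate: str
    Address: str
    Salary: int


# Made-up sample state of the EMPLOYEE relation (illustrative only).
EMPLOYEE = {
    Employee("John", "B", "Smith", "1965-01-09", "731 Fondren, Houston, TX", 30000),
    Employee("Franklin", "T", "Wong", "1955-12-08", "638 Voss, Houston, TX", 40000),
    Employee("James", "E", "Borg", "1937-11-10", "450 Stone, Houston, TX", 55000),
}

# {t.Fname, t.Lname | EMPLOYEE(t) AND t.Salary>50000}
high_earners = {(t.Fname, t.Lname) for t in EMPLOYEE if t.Salary > 50000}

# Q0: {t.Bdate, t.Address | EMPLOYEE(t) AND t.Fname='John' AND t.Minit='B'
#      AND t.Lname='Smith'}
q0 = {
    (t.Bdate, t.Address)
    for t in EMPLOYEE
    if t.Fname == "John" and t.Minit == "B" and t.Lname == "Smith"
}

print(high_earners)  # {('James', 'Borg')}
print(q0)            # {('1965-01-09', '731 Fondren, Houston, TX')}
```

The part before `for` plays the role of the requested attributes, `for t in EMPLOYEE` plays the role of the range condition EMPLOYEE(t), and the `if` clause is the selection condition.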

Expressions and Formulas in Tuple Relational Calculus

A general expression of the tuple relational calculus is of the form

{t1.Aj, t2.Ak, ... , tn.Am | COND(t1, t2, ..., tn, tn+1, tn+2, ..., tn+m)}

where t1, t2, … , tn, tn+1, … , tn+m are tuple variables, each Ai is an attribute of
the relation on which ti ranges, and COND is a condition or formula of the tuple
relational calculus.

A formula is made up of predicate calculus atoms, which can be one of the
following:

1. An atom of the form R(ti), where R is a relation name and ti is a tuple variable.
This atom identifies the range of the tuple variable ti as the relation whose name is
R. It evaluates to TRUE if ti is a tuple in the relation R, and evaluates to FALSE
otherwise.

2. An atom of the form ti.A op tj.B, where op is one of the comparison operators in
the set {=, <, ≤, >, ≥, ≠}, ti and tj are tuple variables, A is an attribute of the
relation on which ti ranges, and B is an attribute of the relation on which tj ranges.

3. An atom of the form ti.A op c or c op tj.B, where op is one of the comparison
operators in the set {=, <, ≤, >, ≥, ≠}, ti and tj are tuple variables, A is an attribute
of the relation on which ti ranges, B is an attribute of the relation on which tj
ranges, and c is a constant value.

Each of the preceding atoms evaluates to either TRUE or FALSE for a specific
combination of tuples; this is called the truth value of an atom.

In general, a tuple variable t ranges over all possible tuples in the universe. For
atoms of the form R(t), if t is assigned to a tuple that is a member of the specified
relation R, the atom is TRUE; otherwise, it is FALSE.

In atoms of types 2 and 3, if the tuple variables are assigned to tuples such that
the values of the specified attributes of the tuples satisfy the condition, then the
atom is TRUE.

A formula (Boolean condition) is made up of one or more atoms connected via
the logical operators AND, OR, and NOT and is defined recursively by Rules 1
and 2 as follows:
Rule 1: Every atom is a formula.
Rule 2: If F1 and F2 are formulas, then so are (F1 AND F2), (F1 OR F2), NOT
(F1), and NOT (F2). The truth values of these formulas are derived from their
component formulas F1 and F2 as follows:
a. (F1 AND F2) is TRUE if both F1 and F2 are TRUE; otherwise, it is FALSE.
b. (F1 OR F2) is FALSE if both F1 and F2 are FALSE; otherwise, it is TRUE.
c. NOT (F1) is TRUE if F1 is FALSE; it is FALSE if F1 is TRUE.
d. NOT (F2) is TRUE if F2 is FALSE; it is FALSE if F2 is TRUE.
The Existential and Universal Quantifiers

In addition, two special symbols called quantifiers can appear in formulas; these
are the universal quantifier (∀) and the existential quantifier (∃).

We define a tuple variable in a formula as free or bound according to the following
rules:

An occurrence of a tuple variable in a formula F that is an atom is free in F.

An occurrence of a tuple variable t is free or bound in a formula made up of
logical connectives, namely (F1 AND F2), (F1 OR F2), NOT(F1), and NOT(F2), depending
on whether it is free or bound in F1 or F2 (if it occurs in either).

Notice that in a formula of the form F = (F1 AND F2) or F = (F1 OR F2), a tuple
variable may be free in F1 and bound in F2, or vice versa; in this case, one
occurrence of the tuple variable is bound and the other is free in F.

All free occurrences of a tuple variable t in F are bound in a formula F′ of the
form F′ = (∃t)(F) or F′ = (∀t)(F). The tuple variable is bound to the quantifier
specified in F′. For example, consider the following formulas:

F1: d.Dname = ‘Research’

F2: (∃t)(d.Dnumber = t.Dno)

F3: (∀d)(d.Mgr_ssn = ‘333445555’)

The tuple variable d is free in both F1 and F2, whereas it is bound to the (∀)
quantifier in F3. Variable t is bound to the (∃) quantifier in F2.
Rule 3: If F is a formula, then so is (∃t)(F), where t is a tuple variable. The formula
(∃t)(F) is TRUE if the formula F evaluates to TRUE for some (at least one) tuple
assigned to free occurrences of t in F; otherwise, (∃t)(F) is FALSE.

Rule 4: If F is a formula, then so is (∀t)(F), where t is a tuple variable. The formula
(∀t)(F) is TRUE if the formula F evaluates to TRUE for every tuple (in the universe)
assigned to free occurrences of t in F; otherwise, (∀t)(F) is FALSE.

The (∃) quantifier is called an existential quantifier because a formula (∃t)(F) is
TRUE if there exists some tuple that makes F TRUE. For the universal quantifier,
(∀t)(F) is TRUE if every possible tuple that can be assigned to free occurrences of
t in F is substituted for t, and F is TRUE for every such substitution. It is called the
universal or for all quantifier because every tuple in the universe of tuples must
make F TRUE to make the quantified formula TRUE.

Sample Queries in Tuple Relational Calculus

Query 1. List the name and address of all employees who work for the ‘Research’
department.

Q1: {t.Fname, t.Lname, t.Address | EMPLOYEE(t) AND (∃d)(DEPARTMENT(d)
AND d.Dname=‘Research’ AND d.Dnumber=t.Dno)}

Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name, birth
date, and address.

Q2: {p.Pnumber, p.Dnum, m.Lname, m.Bdate, m.Address | PROJECT(p) AND
EMPLOYEE(m) AND p.Plocation=‘Stafford’ AND ((∃d)(DEPARTMENT(d) AND
p.Dnum=d.Dnumber AND d.Mgr_ssn=m.Ssn))}
Using the Universal Quantifier in Queries

Whenever we use a universal quantifier, it is quite judicious to follow a few rules
to ensure that our expression makes sense. We discuss these rules with respect
to the query Q3.

Query 3. List the names of employees who work on all the projects controlled by
department number 5. One way to specify this query is to use the universal
quantifier as shown:

Q3: {e.Lname, e.Fname | EMPLOYEE(e) AND ((∀x)(NOT(PROJECT(x)) OR NOT
(x.Dnum=5) OR ((∃w)(WORKS_ON(w) AND w.Essn=e.Ssn AND
x.Pnumber=w.Pno))))}

Query 4. List the names of employees who have no dependents.

Q4: {e.Fname, e.Lname | EMPLOYEE(e) AND (NOT (∃d)(DEPENDENT(d) AND
e.Ssn=d.Essn))}

Query 5. List the names of managers who have at least one dependent.

Q5: {e.Fname, e.Lname | EMPLOYEE(e) AND ((∃d)(∃ρ)(DEPARTMENT(d) AND
DEPENDENT(ρ) AND e.Ssn=d.Mgr_ssn AND ρ.Essn=e.Ssn))}

This query is handled by interpreting managers who have at least one dependent
as managers for whom there exists some dependent.
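The quantifiers in Q3 and Q4 correspond roughly to Python's `all()` and `any()`. The sketch below assumes tiny made-up relation states (the rows and the reduced attribute lists are illustrative only). In the calculus, x ranges over the whole universe and the disjunct NOT(PROJECT(x)) makes every non-project tuple satisfy the formula trivially; the code gets the same effect by iterating over PROJECT only.

```python
# Made-up relation states, reduced to the attributes the queries use.
EMPLOYEE = [
    {"Ssn": "1", "Lname": "Smith"},
    {"Ssn": "2", "Lname": "Wong"},
    {"Ssn": "3", "Lname": "Borg"},
]
PROJECT = [
    {"Pnumber": 10, "Dnum": 5},
    {"Pnumber": 20, "Dnum": 5},
    {"Pnumber": 30, "Dnum": 4},
]
WORKS_ON = [
    {"Essn": "1", "Pno": 10},
    {"Essn": "1", "Pno": 20},
    {"Essn": "2", "Pno": 10},
    {"Essn": "2", "Pno": 30},
]
DEPENDENT = [{"Essn": "1", "Dependent_name": "Alice"}]

# Q3: employees who work on ALL projects controlled by department 5.
# (forall x)(NOT(x.Dnum=5) OR (exists w)(w.Essn=e.Ssn AND x.Pnumber=w.Pno))
q3 = {
    e["Lname"]
    for e in EMPLOYEE
    if all(
        x["Dnum"] != 5
        or any(w["Essn"] == e["Ssn"] and w["Pno"] == x["Pnumber"] for w in WORKS_ON)
        for x in PROJECT
    )
}

# Q4: employees with no dependents, i.e. NOT (exists d)(e.Ssn=d.Essn).
q4 = {
    e["Lname"]
    for e in EMPLOYEE
    if not any(d["Essn"] == e["Ssn"] for d in DEPENDENT)
}

print(sorted(q3))  # ['Smith']
print(sorted(q4))  # ['Borg', 'Wong']
```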

The Domain Relational Calculus

Domain calculus differs from tuple calculus in the type of variables used in
formulas: rather than having variables range over tuples, the variables range over
single values from domains of attributes.

To form a relation of degree n for a query result, we must have n of these domain
variables, one for each attribute. An expression of the domain calculus is of the
form

{x1, x2, ..., xn | COND(x1, x2, ..., xn, xn+1, xn+2, ..., xn+m)}

where x1, x2, … , xn, xn+1, xn+2, … , xn+m are domain variables that range over
domains (of attributes), and COND is a condition or formula of the domain
relational calculus.

A formula is made up of atoms. The atoms of a formula are slightly different from
those for the tuple calculus and can be one of the following:

1. An atom of the form R(x1, x2, … , xj), where R is the name of a relation of
degree j and each xi, 1 ≤ i ≤ j, is a domain variable. This atom states that a list of
values of <x1, x2, … , xj> must be a tuple in the relation whose name is R, where xi
is the value of the ith attribute of the tuple. To make a domain calculus
expression more concise, we can drop the commas in a list of variables; thus, we
can write

{x1, x2, ..., xn | R(x1 x2 x3) AND ...}

instead of

{x1, x2, ... , xn | R(x1, x2, x3) AND ...}

2. An atom of the form xi op xj, where op is one of the comparison operators in the
set {=, <, ≤, >, ≥, ≠}, and xi and xj are domain variables.

3. An atom of the form xi op c or c op xj, where op is one of the comparison
operators in the set {=, <, ≤, >, ≥, ≠}, xi and xj are domain variables, and c is a
constant value.

As in tuple calculus, atoms evaluate to either TRUE or FALSE for a
specific set of values, called the truth values of the atoms. In case 1, if the domain
variables are assigned values corresponding to a tuple of the specified relation R,
then the atom is TRUE. In cases 2 and 3, if the domain variables are assigned values
that satisfy the condition, then the atom is TRUE.

Query 0. List the birth date and address of the employee whose name is ‘John
B. Smith’.

Q0: {u, v | (∃q) (∃r) (∃s) (∃t) (∃w) (∃x) (∃y) (∃z) (EMPLOYEE(qrstuvwxyz) AND
q=‘John’ AND r=‘B’ AND s=‘Smith’)}

Query 1. Retrieve the name and address of all employees who work for the
‘Research’ department.

Q1: {q, s, v | (∃z) (∃l) (∃m) (EMPLOYEE(qrstuvwxyz) AND DEPARTMENT(lmno)
AND l=‘Research’ AND m=z)}

Query 2. For every project located in ‘Stafford’, list the project number, the
controlling department number, and the department manager’s last name, birth
date, and address.

Q2: {i, k, s, u, v | (∃j)(∃m)(∃n)(∃t)(PROJECT(hijk) AND EMPLOYEE(qrstuvwxyz)
AND DEPARTMENT(lmno) AND k=m AND n=t AND j=‘Stafford’)}

Query 6. List the names of employees who have no dependents.

Q6: {q, s | (∃t)(EMPLOYEE(qrstuvwxyz) AND (NOT(∃l)(DEPENDENT(lmnop)
AND t=l)))}
6. RELATIONAL DATABASE DESIGN

A relation schema corresponds to the programming-language notion of a type
definition.

The term relation instance refers to a specific instance of a relation, i.e., one
containing a specific set of rows.

A1, A2, …, An are attributes

R = (A1, A2, …, An) is a relation schema

• Formally, given sets D1, D2, …, Dn, a relation r is a subset of
D1 x D2 x … x Dn

• A relation is thus a set of n-tuples (a1, a2, …, an) where each ai ∈ Di

• That is, a relation is a subset of the cartesian product of its domains

Example:

instructor = (ID, name, dept_name, salary)

Relation Instance:

• The current values (relation instance) of a relation are specified by a table

• An element t of r is a tuple, represented by a row in a table

• Order of tuples is irrelevant (tuples may be stored in an arbitrary order)


Undesirable Properties of Relations
A bad design may have several properties, including:
• Repetition of information.
• Inability to represent certain information.
• Loss of information.
7. FUNCTIONAL DEPENDENCIES:
DEFINITION: Let r be a relation and let X and Y be arbitrary subsets of the set
of attributes of r. Then we say that Y is functionally dependent on X, i.e. X->Y (X
functionally determines Y) if and only if each X value in r has associated with it
precisely one Y value in r.

X 🡒 Y holds if whenever two tuples have the same value for X, they must have
the same value for Y

For any two tuples t1 and t2 in any relation instance r(R): If t1[X]=t2[X], then
t1[Y]=t2[Y]

X 🡒 Y in R specifies a constraint on all relation instances r(R)

Examples of FD constraints

Social security number (SSN) determines employee name

SSN 🡒 ENAME

Project number determines project name and location

PNUMBER 🡒 {PNAME, PLOCATION}

Employee ssn and project number determines the hours per week that the
employee works on the project

{SSN, PNUMBER} 🡒 HOURS
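The tuple-based definition (if t1[X]=t2[X], then t1[Y]=t2[Y]) can be checked mechanically against a single relation instance. The sketch below uses a hypothetical helper `fd_holds` and made-up rows. Note that an FD is a constraint on all legal instances, so a check like this can only show that one instance violates, or does not violate, the FD.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}  # maps each X value to the Y value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X value, different Y value
    return True


# Made-up instance (illustrative only).
rows = [
    {"SSN": "123456789", "PNUMBER": 1, "HOURS": 32.5, "ENAME": "Smith"},
    {"SSN": "123456789", "PNUMBER": 2, "HOURS": 7.5, "ENAME": "Smith"},
    {"SSN": "666884444", "PNUMBER": 3, "HOURS": 40.0, "ENAME": "Narayan"},
]

print(fd_holds(rows, ["SSN"], ["ENAME"]))             # True
print(fd_holds(rows, ["SSN", "PNUMBER"], ["HOURS"]))  # True
print(fd_holds(rows, ["SSN"], ["HOURS"]))             # False
```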

TRIVIAL AND NONTRIVIAL DEPENDENCIES

One way to reduce the size of the set of FDs is to eliminate the
trivial dependencies. An FD is trivial if and only if the right side is a subset of the
left side.

EX: {supplier-no, part-no}->supplier-no

An FD is non-trivial if and only if the right side is not a subset of the left side.

EX: {supplier-no, part-no}->project-no

Generally it is advisable to avoid trivial dependencies.


CLOSURE OF A SET OF DEPENDENCIES:

Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F.

There are circumstances in which some FDs imply others.

For example, {supplier-no, part-no}->{city, qty} implies both of the following.

{supplier-no, part-no}->{city}

{supplier-no, part-no}->{qty}

As another example, consider the relation R with attributes A,B and C, such that
the FDs A->B and B->C both hold for R. Then it is easy to see that the FD A->C
also holds for R. The FD A->C is an example of a transitive FD i.e. C is said to
depend on A transitively via B. The set of all FDs that are implied by a given set S
of FDs is called the closure of S, written S+

The closure S+ can be computed from S by applying the following inference rules
(Armstrong's axioms and the rules derived from them):

Let A,B and C be arbitrary subsets of the set of attributes of given relation
R and let AB mean the union of A and B. Then we have:

Reflexivity: if B is a subset of A then A->B

Augmentation: If A->B, then AC->BC.

Transitivity: If A->B and B->C, then A->C.

Self-determination: A->A

Decomposition: If A->BC, then A->B and A->C.

Union: If A->B and A->C, then A->BC.

Composition: If A->B and C->D, then AC->BD.

Pseudotransitive: If A->B and BC->D, then AC->D


Example:

Let R be the relation with attributes A,B,C,D,E,F and the FDs are:

A->BC B->E CD->EF

We now show that the FD AD->F holds for R and is thus a member of the closure of
the given set:

1. A->BC (given)

2. A->C (1, decomposition)

3. AD->CD (2, augmentation)

4. CD->EF (given)

5. AD->EF (3 and 4, transitivity)

6. AD->F (5, decomposition)

CLOSURE OF A SET OF ATTRIBUTES:

Closure of a set of attributes X with respect to F is the set X+ of all attributes that
are functionally determined by X.

X+ can be calculated by repeatedly applying the reflexivity, augmentation, and
transitivity rules using the FDs in F.

The closure S+ of a given set S of FDs can be computed by means of an
algorithm that says “Repeatedly apply the rules from the previous section until
they stop producing new FDs”.

Let R be a relation, Z be a set of attributes of R, and S be the set of
FDs that hold for R. From these we can determine the set of all attributes of R
that are functionally dependent on Z, i.e. the closure Z+ of Z under S.


A simple algorithm for computing this closure is given in the pseudocode below:

CLOSURE[Z,S] = Z;
do “forever”;
   for each FD X->Y in S
   do;
      if X ⊆ CLOSURE[Z,S]
      then CLOSURE[Z,S] = CLOSURE[Z,S] ∪ Y;
   end;
   if CLOSURE[Z,S] did not change on this iteration
   then leave the loop;
end;
Example:
Suppose we are given a relation R with attributes A,B,C,D,E,F and FDs are:
A->BC
E->CF
B->E
CD->EF

We now compute the closure {A,B}+ of the set of attributes {A,B} under this set of
FDs.

We initialize the result CLOSURE[Z,S] to {A,B}.

We now go round the inner loop four times, once for each of the given FDs. On the
first iteration (for the FD A->BC), we find that the left side is a subset of
CLOSURE[Z,S], so we add the attributes B and C to the result. CLOSURE[Z,S] is now
the set {A,B,C}.

On the second iteration (for the FD E->CF), we find that the left side is not a subset
of the result, which thus remains unchanged.
On the third iteration (for the FD B->E), we add E to CLOSURE[Z,S], which now has
the value {A,B,C,E}.

On the fourth iteration (for the FD CD->EF), CLOSURE[Z,S] remains unchanged.

Now we go round the inner loop four times again. On the first iteration, the result
does not change; on the second, it expands to {A,B,C,E,F}, on the third and
fourth it does not change.

Now we go round the inner loop four times again. CLOSURE[Z,S] does not change,
and so the whole process terminates with {A,B}+ = {A,B,C,E,F}.
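The pseudocode and the worked example can be sketched in Python as follows. The function name `attribute_closure` and the encoding of each FD as a pair of attribute strings (one character per attribute) are choices made for this illustration only.

```python
def attribute_closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (left side, right side) pairs, each side a string
    of single-letter attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:                 # the outer "do forever" loop
        changed = False
        for lhs, rhs in fds:       # one pass over every FD X->Y
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True     # something was added, so loop again
    return result


FDS = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]

print(sorted(attribute_closure("AB", FDS)))  # ['A', 'B', 'C', 'E', 'F']
```

The loop stops after the first full pass that adds nothing, which is the "leave the loop" condition in the pseudocode.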

Thus if Z is a set of attributes of relation R and S is a set of FDs that hold for R, then
the set of FDs that hold for R with Z as the left side is the set consisting of all FDs of
the form Z->Z’, where Z’ is some subset of the closure Z+ of Z under S. The
closure S+ of the original set S of FDs is then the union of all such sets of FDs,
taken over all possible attribute sets Z.

Two sets of FDs F and G are equivalent if:

• Every FD in F can be inferred from G, and

• Every FD in G can be inferred from F

• Hence, F and G are equivalent if F+ =G+

Definition (Covers):

F covers G if every FD in G can be inferred from F (i.e., if G+ subset-of F+)

F and G are equivalent if F covers G and G covers F
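Cover and equivalence can be tested without enumerating F+ or G+: F covers G exactly when, for every FD X->Y in G, Y is contained in the closure of X under F. The sketch below assumes FDs encoded as pairs of attribute strings; the helper names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g can be inferred from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    return covers(f, g) and covers(g, f)


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C"), ("AC", "D")]
G = [("A", "B"), ("B", "C"), ("A", "D")]
H = [("A", "B")]

print(equivalent(F, G))  # True
print(covers(F, H))      # True
print(covers(H, F))      # False
```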


A set of FDs F is minimal if it satisfies the following conditions:

1. Every dependency in F has a single attribute for its RHS.

2. We cannot remove any dependency from F and still have a set of dependencies
that is equivalent to F.

3. We cannot replace any dependency X -> A in F with a dependency Y -> A, where
Y is a proper subset of X, and still have a set of dependencies that is
equivalent to F.

IRREDUCIBLE SETS OF DEPENDENCIES:

Let S1 and S2 be two sets of FDs. If every FD implied by S1 is
implied by S2, i.e. if S1+ is a subset of S2+, we say that S2 is a cover of S1.
This means that if the DBMS enforces the FDs in S2, then it will
automatically be enforcing the FDs in S1.

If S2 is a cover of S1 and S1 is a cover of S2, i.e. if S1+ = S2+,
we say that S1 and S2 are equivalent. In this case, if the DBMS enforces the FDs
in S2 it will automatically be enforcing the FDs in S1, and vice versa.

A set S of FDs is said to be irreducible if and only if it satisfies
the following three properties:

1. The right side of every FD in S involves just one attribute.

2. The left side of every FD in S is irreducible in turn, meaning that no attribute can
be discarded from the determinant without changing the closure S+. This type of
FD is called left irreducible.

3. No FD in S can be discarded from S without changing the closure S+.

Example:

Consider the relation PARTS for which the following FDs hold:

PART-NO->PART-NAME
PART-NO->COLOUR

PART-NO->WEIGHT

PART-NO->CITY

This set of FDs is easily seen to be irreducible. The right side is
a single attribute in each case, the left side is irreducible in turn, and none
of the FDs can be discarded without changing the closure.

The following sets of FDs are not irreducible:

1. The right side of the first FD is not a singleton set:

PART-NO->{PART-NAME,COLOUR}
PART-NO->WEIGHT
PART-NO->CITY

2. The first FD can be simplified by dropping PART-NAME from the left side
without changing the closure:

{PART-NO,PART-NAME}->COLOUR
PART-NO->PART-NAME
PART-NO->WEIGHT
PART-NO->CITY

3. The first FD can be discarded without changing the closure:

PART-NO->PART-NO
PART-NO->PART-NAME
PART-NO->COLOUR
PART-NO->WEIGHT
PART-NO->CITY

For every set of FDs there exists at least one equivalent set
that is irreducible.
Example:

Consider the relation R with attributes A,B,C,D and FDs:

A->BC

B->C

A->B

AB->C

AC->D

We now compute the irreducible set of FDs that is equivalent to the given set:

The first step is to rewrite the FDs such that each has a singleton right side:

A->B

A->C

B->C

A->B

AB->C

AC->D

In this, the FD A->B occurs twice, so one occurrence can be eliminated.

Next, attribute C can be eliminated from the left side of the FD AC->D, because we
have A->C, which gives

A->AC (augmenting both sides with A)

AC->D (given)

So, A->D by transitivity. Thus C on the left side of AC->D is redundant.


Next AB->C can be eliminated, because we have

A->C (given)

AB->CB (by augmentation)

So, AB->C (by decomposition)

Finally, the FD A->C is implied by the FDs A->B and B->C, so it can also be
eliminated.

The final irreducible set of FDs is:

A->B

B->C

A->D
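The three reduction steps can be sketched in Python as below. This is one possible ordering of the steps under the assumption that FDs are encoded as pairs of attribute strings; the function names are hypothetical. Intermediate results may differ from the hand derivation (for instance, AB->C is first reduced to B->C instead of being dropped outright), and in general a set of FDs can have more than one irreducible equivalent.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: singleton right sides, dropping duplicate FDs.
    work = []
    for lhs, rhs in fds:
        for attr in sorted(rhs):
            fd = (frozenset(lhs), frozenset(attr))
            if fd not in work:
                work.append(fd)
    # Step 2: remove redundant attributes from each left side.
    for i, (lhs, rhs) in enumerate(work):
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= closure(smaller, work):
                lhs = smaller
        work[i] = (lhs, rhs)
    work = list(dict.fromkeys(work))  # step 2 may have created duplicates
    # Step 3: remove FDs implied by the remaining ones.
    for fd in list(work):
        rest = [f for f in work if f != fd]
        if fd[1] <= closure(fd[0], rest):
            work = rest
    return work


given = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C"), ("AC", "D")]
cover = minimal_cover(given)
print(["".join(sorted(l)) + "->" + "".join(sorted(r)) for l, r in cover])
# ['A->B', 'B->C', 'A->D']
```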

NOTE:

The irreducible sets can also be represented by the terms minimal sets,
minimal cover and canonical cover.
ANOMALIES

Database anomalies are problems that arise due to the limitations or flaws within a given
database. Anomalies can be classified into insertion anomalies, deletion anomalies, and
modification anomalies. These anomalies are discussed based on the EMP_DEPT relation
given below.

Insertion Anomalies:

Insertion anomalies can be differentiated into two types, illustrated by the following
examples based on the EMP_DEPT relation:

To insert a new employee tuple into EMP_DEPT, we must include either the attribute values
for the department that the employee works for, or NULLs (if the employee does not work
for a department as yet). For example, to insert a new tuple for an employee who works
in department number 5, we must enter all the attribute values
of department 5 correctly so that they are consistent with the corresponding values for
department 5 in other tuples in EMP_DEPT.

It is difficult to insert a new department that has no employees as yet in the EMP_DEPT
relation. The only way to do this is to place NULL values in the attributes for employee.
This violates the entity integrity for EMP_DEPT because SSN is its primary key.

Deletion Anomalies:

If we delete from EMP_DEPT an employee tuple that happens to represent the last
employee working for a particular department, the information concerning that
department is lost from the database.

For example, if we delete the details of James E. Borg, who works for Headquarters, then the
details of that department are lost.

Modification Anomalies:

In EMP_DEPT, if we change the value of one of the attributes of a particular department (say,
the manager of department 5), we must update the tuples of all employees who work in
that department; otherwise, the database will become inconsistent. If we fail to update
some tuples, the same department will be shown to have two different values for
manager in different employee tuples, which would be wrong.

8. NORMALIZATION

Normalization of data can be considered a process of analyzing the given
relation schemas based on their FDs and primary keys to achieve the desirable properties
of (1) minimizing redundancy and (2) minimizing the insertion, deletion, and
update anomalies.

Any attribute involved in a candidate key is a prime attribute

All other attributes are called non-prime attributes.

A superkey of a relation schema R = {A1, A2, ...., An} is a set of attributes S subset-of R
with the property that no two distinct tuples t1 and t2 in any legal relation state r of R
will have t1[S] = t2[S].

A key K is a superkey with the additional property that removal of any attribute from K will
cause K not to be a superkey any more.
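When the FDs of a schema are known, these definitions can be tested with attribute closures: a set S is a superkey if its closure contains every attribute of R, and a key is a superkey with no smaller superkey inside it. The brute-force sketch below uses hypothetical helper names and, as sample input, the relation R(A,B,C,D,E,F) with the FDs from the closure example earlier in these notes. It is only practical for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    return closure(attrs, fds) >= set(schema)


def candidate_keys(schema, fds):
    """Enumerate minimal superkeys, smallest attribute sets first."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            # Skip supersets of a key already found: they are not minimal.
            if any(key <= attrs for key in keys):
                continue
            if is_superkey(attrs, schema, fds):
                keys.append(attrs)
    return keys


SCHEMA = "ABCDEF"
FDS = [("A", "BC"), ("E", "CF"), ("B", "E"), ("CD", "EF")]

keys = candidate_keys(SCHEMA, FDS)
prime = set().union(*keys)          # attributes that belong to some key
print([sorted(k) for k in keys])    # [['A', 'D']]
print(sorted(prime))                # ['A', 'D']
```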

There are two important properties of decompositions:

(a) Non-additive or losslessness of the corresponding join

(b) Preservation of the functional dependencies.

Note that:

Property (a) is extremely important and cannot be sacrificed.

Property (b) is less stringent and may be sacrificed.

8.1. First Normal Form


Definition:

First normal form (1NF) disallows multi-valued attributes, composite
attributes, and their combinations. It states that the domain of an attribute must
include only atomic (simple, indivisible) values.

Figure.1 (a)
Description:

Consider the relation DEPARTMENT in Figure 1(a). We assume that each
department can have a number of locations. This relation is not in 1NF because
Dlocations is not an atomic attribute, as illustrated by the first tuple in Figure 1(b).

There are three main techniques to achieve first normal form for such a relation:

1. Expand the key so that there will be a separate tuple in the original DEPARTMENT
relation for each location of a DEPARTMENT, as shown in Figure 1(c). In this case,
the primary key becomes the combination {Dnumber, Dlocation}. This solution
has the disadvantage of introducing redundancy in the relation.

2. If a maximum number of values is known for the attribute (for example, if it is
known that at most three locations can exist for a department), replace the
Dlocations attribute by three atomic attributes: Dlocation1, Dlocation2, and
Dlocation3. This solution has the disadvantage of introducing NULL values if most
departments have fewer than three locations.

3. Remove the attribute Dlocations that violates 1NF and place it in a separate
relation DEPT_LOCATIONS along with the primary key Dnumber of DEPARTMENT.
The primary key of this relation is the combination {Dnumber, Dlocation}, as
shown in Figure 2 below. A distinct tuple in DEPT_LOCATIONS exists for each
location of a department. This decomposes the non-1NF relation into two 1NF
relations.

Figure.2 Decomposition of 1 (a)


Among the three, the third is considered to be the best solution because it does not
suffer from redundancy.

8.2 Second Normal Form:

Definition.

A relation schema R is in 2NF if it is in 1NF and satisfies full functional dependency,
i.e., every nonprime attribute A in R is fully functionally dependent on the primary
key of R and not on just a part of it.

Figure 15.10(a) Normalizing EMP_PROJ into 2NF relations

The EMP_PROJ relation in Figure 15.10 (a) is in 1NF but is not in 2NF. The nonprime
attribute Ename violates 2NF because of FD2, as do the nonprime attributes Pname
and Plocation because of FD3. The functional dependencies FD2 and FD3 make
Ename, Pname, and Plocation partially dependent on the primary key {Ssn,
Pnumber} of EMP_PROJ, thus violating the 2NF.

If a relation schema is not in 2NF, it can be second normalized or 2NF normalized into a
number of 2NF relations in which nonprime attributes are associated only with the
part of the primary key on which they are fully functionally dependent.

Therefore, the functional dependencies FD1, FD2, and FD3 in Figure 15.10(a) lead
to the decomposition of EMP_PROJ into the three relation schemas EP1, EP2, and
EP3 shown in Figure 15.10(a), each of which is in 2NF.
8.3 Third Normal Form:

Definition: A relation schema R is in 3NF if it satisfies 2NF and no nonprime
attribute of R is transitively dependent on the primary key.

Description: The dependency Ssn→Dmgrssn is transitive through Dnumber in
EMP_DEPT in Figure 15.10(b), because both the dependencies Ssn→Dnumber and
Dnumber→Dmgrssn hold and Dnumber is neither a key itself nor a subset of the
key of EMP_DEPT.

The relation schema EMP_DEPT in Figure 15.10(b) is in 2NF, since no partial
dependencies on a key exist. However, EMP_DEPT is not in 3NF because of the
transitive dependency of Dmgrssn (and also Dname) on Ssn via Dnumber.

We can normalize EMP_DEPT by decomposing it into the two 3NF relation schemas
ED1 and ED2 shown in Figure 15.10(b). ED1 and ED2 represent independent entity
facts about employees and departments. A NATURAL JOIN operation on ED1 and
ED2 will recover the original relation EMP_DEPT.

NOTE:

In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y
is not a candidate key.

When Y is a candidate key, there is no problem with the transitive dependency


8.4. BOYCE/CODD NORMAL FORM (BCNF):
A relation schema R is in BCNF with respect to a set F of functional dependencies if
for all functional dependencies in F+ of the form α → β,
where α ⊆ R and β ⊆ R, at least one of the following holds:

• α → β is trivial (i.e., β ⊆ α)

• α is a superkey for R

A relation is in BCNF if and only if every determinant is a candidate key.

Decomposing a Schema into BCNF

Let R be a schema that is not in BCNF. Let α → β be the FD that causes a
violation of BCNF.
We decompose R into:
• (α ∪ β)
• (R − (β − α))

Example:

Consider the relation SJT{student, subject, teacher}

The following constraints apply to the relation:

• For a single subject, multiple students can register.

• Each teacher teaches only one subject.

• For each subject, each student of that subject is taught by only one teacher.

• Each subject can be taught by multiple teachers.

The table below shows sample values for SJT:


Student   Subject   Teacher
123       Physics   Faculty1
123       Music     Faculty2
456       Biology   Faculty3
789       Physics   Faculty4
999       Physics   Faculty1

The two functional dependencies that follow from the constraints are:

{student, subject}-> teacher

teacher->subject

The FD diagram for SJT is shown below:

S Student

J Subject

T Teacher

The relation suffers from the following anomalies:

Insertion anomaly: if a new faculty member (say Faculty5) joins and no subject is
assigned, the faculty member cannot be inserted, as a prime attribute cannot be NULL.

Deletion anomaly: if the tuple for the student with id 789 is deleted, then the
information that Faculty4 teaches Physics is also deleted.

This difficulty is caused by the fact that the attribute teacher is a determinant but
not a candidate key. Whenever a determinant that is not a candidate key determines
one or more prime attributes, the relation violates BCNF. Here teacher, which is not
a candidate key, determines the prime attribute subject.

The solution to the problem is to split or decompose the original relation into two
BCNF projections as below:

ST{student, teacher}

TJ{Teacher, subject}

ST
Student   Teacher
123       Faculty1
123       Faculty2
456       Faculty3
789       Faculty4
999       Faculty1

TJ
Teacher    Subject
Faculty1   Physics
Faculty2   Music
Faculty3   Biology
Faculty4   Physics

The anomalies can be overcome by decomposing the relation SJT into the two
relations ST and TJ
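As a quick check on the sample data, the two projections can be computed and joined back on teacher. The sketch below models each relation as a Python set of tuples; the variable names are chosen for this illustration.

```python
# Sample SJT state from the table above: (student, subject, teacher).
SJT = {
    ("123", "Physics", "Faculty1"),
    ("123", "Music", "Faculty2"),
    ("456", "Biology", "Faculty3"),
    ("789", "Physics", "Faculty4"),
    ("999", "Physics", "Faculty1"),
}

# Projections; a set drops duplicate rows automatically.
ST = {(student, teacher) for student, _subject, teacher in SJT}
TJ = {(teacher, subject) for _student, subject, teacher in SJT}

# Natural join of ST and TJ on the common attribute teacher.
rejoined = {
    (student, subject, teacher)
    for student, teacher in ST
    for teacher2, subject in TJ
    if teacher == teacher2
}

print(len(ST), len(TJ))   # 5 4
print(rejoined == SJT)    # True
```

Because teacher->subject holds, each ST row matches exactly one TJ row, so the join produces no spurious tuples for this state. A fact such as "Faculty5 teaches Chemistry" can now be stored in TJ alone.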

9. MULTIVALUED DEPENDENCIES

Multivalued Dependency (MVD) represents a dependency between attributes
(for example X, Y and Z) in a relation, such that for each value of X there is a set of
values for Y and a set of values for Z. However, the sets of values for Y and Z
are independent of each other.

MVD is represented as

X-->>Y (X multidetermines Y)

By symmetry, whenever X-->>Y holds in R, so does X-->>Z. Hence it can
be written as X-->>Y|Z.

Multivalued dependencies are a consequence of first normal form
(1NF), which disallows an attribute in a tuple from having a set of values. If two or
more multivalued independent attributes are present in the same relation, we
have to repeat every value of one of the attributes with every
value of the other attribute to keep the relation state consistent and to maintain
the independence among the attributes involved.

For example, consider the relation Emp shown below:

Emp
Ename   Pname   Dname
Smith   X       john
Smith   Y       Anna
Smith   X       Anna
Smith   Y       john

A tuple in this Emp relation represents the fact that an employee whose name is Ename
works on the project whose name is Pname and has a dependent whose name is Dname.

An employee may work on several projects and may have several
dependents, and the employee’s projects and dependents are independent of one
another. To keep the relation state consistent we must have a separate tuple to represent
every combination of an employee’s dependent and an employee’s project.

This constraint is specified as a multivalued dependency on the Emp relation. In
the example, the MVDs are Ename-->>Pname and Ename-->>Dname, or
Ename-->>Pname|Dname.

The employee with Ename Smith works on projects with Pname X and Y and has two
dependents with Dname ‘john’ and ‘Anna’. If we stored only the first two tuples in Emp
(<‘Smith’, ‘X’, ‘john’> and <‘Smith’, ‘Y’, ‘Anna’>), we would incorrectly show
associations between project X and john and between project Y and Anna. These should
not be conveyed, because no such meaning is intended in this relation.

Hence we must store the other two tuples (<‘Smith’, ‘X’, ‘Anna’> and
<‘Smith’, ‘Y’, ‘john’>) to show that {X, Y} and {john, Anna} are associated only
with Smith, i.e., there is no association between Pname and Dname, which means that the
two attributes are independent.

An MVD X-->>Y in R is called a trivial MVD if

Y is a subset of X, or X ∪ Y = R.

An MVD that satisfies neither of these conditions is nontrivial.

9.1 FOURTH NORMAL FORM:

A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if, for all multivalued dependencies in D+ of the form α-->>β,
where α ⊆ R and β ⊆ R, at least one of the following holds:
• α-->>β is trivial (i.e., β ⊆ α or α ∪ β = R)
• α is a superkey for schema R
If a relation is in 4NF, it is also in BCNF.

The Emp relation in the example is not in 4NF because in the nontrivial MVDs
Ename-->>Pname and Ename-->>Dname, Ename is not a superkey of Emp.

Therefore, the Emp relation is decomposed into Emp_proj and Emp_dep as given
below:

Emp_proj

Ename   Pname
Smith   X
Smith   Y

Emp_dep

Ename   Dname
Smith   John
Smith   Anna

The above relations are in 4NF because the MVDs Ename-->>Pname in Emp_proj and Ename-->>Dname in Emp_dep are trivial: in each relation the two sides together cover all attributes (X ∪ Y = R).
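
The Emp example can be checked mechanically. The following is a minimal Python sketch, not part of the original notes: the function names `satisfies_mvd`, `decompose` and `natural_join` are illustrative assumptions, and the relation is modelled as a set of (Ename, Pname, Dname) tuples. It tests whether every project of an employee appears with every dependent, and whether joining Emp_proj and Emp_dep gives back Emp.

```python
from itertools import product

# Emp relation from the example, as (Ename, Pname, Dname) tuples.
EMP = {
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
}


def satisfies_mvd(rows):
    """Check Ename -->> Pname | Dname on a set of (ename, pname, dname) tuples.

    The MVD holds when, for each employee, every project appears
    together with every dependent.
    """
    by_emp = {}
    for ename, pname, dname in rows:
        projects, dependents = by_emp.setdefault(ename, (set(), set()))
        projects.add(pname)
        dependents.add(dname)
    required = {
        (ename, p, d)
        for ename, (projects, dependents) in by_emp.items()
        for p, d in product(projects, dependents)
    }
    return required == set(rows)


def decompose(rows):
    """Project Emp onto (Ename, Pname) and (Ename, Dname)."""
    emp_proj = {(e, p) for e, p, _ in rows}
    emp_dep = {(e, d) for e, _, d in rows}
    return emp_proj, emp_dep


def natural_join(emp_proj, emp_dep):
    """Natural join of the two projections on Ename."""
    return {(e, p, d) for e, p in emp_proj for e2, d in emp_dep if e == e2}


emp_proj, emp_dep = decompose(EMP)
print(satisfies_mvd(EMP))
print(natural_join(emp_proj, emp_dep) == EMP)
```

Storing only the first two tuples makes `satisfies_mvd` return False, which is the inconsistency described in the text.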

10. JOIN DEPENDENCIES AND FIFTH NORMAL FORM:


Join dependency:

Join dependencies constrain the set of legal relations over a schema R to those relations
for which a given decomposition is a lossless-join decomposition.

Let R be a relation schema and R1, R2, ..., Rn be a decomposition of R. If R = R1 ∪ R2 ∪ ... ∪ Rn, we say that a relation r(R) satisfies the join dependency *(R1, R2, ..., Rn)
if:

r = π_R1(r) ⋈ π_R2(r) ⋈ ... ⋈ π_Rn(r)

A join dependency is trivial if one of the Ri is R itself.


10.1 FIFTH NORMAL FORM

A relation schema R is in fifth normal form (5NF), also called project-join normal form
(PJNF), with respect to a set F of functional, multivalued, and join dependencies
if, for every nontrivial join dependency JD(R1, R2, ..., Rn) in F+ (that is, implied by
F), every Ri is a superkey of R.

Consider the Supply relation below, which satisfies a join dependency.

Supply:

Sname     Partname   Projname
Smith     Bolt       Proj x
Smith     Nut        Proj y
Adamsky   Bolt       Proj y
Walton    Nut        Proj z
Adamsky   Nail       Proj x
Adamsky   Bolt       Proj x
Smith     Bolt       Proj y

The Supply relation, which satisfies the join dependency JD(R1, R2, R3), is decomposed into
three relations R1, R2 and R3 that are each in 5NF.
The natural join of any two of these relations produces spurious tuples, but
applying the natural join to all three together does not.

The natural join of all three produces the state of the original relation.

R1

Sname     Partname
Smith     Bolt
Smith     Nut
Adamsky   Bolt
Walton    Nut
Adamsky   Nail

R2

Sname     Projname
Smith     Proj x
Smith     Proj y
Adamsky   Proj y
Walton    Proj z
Adamsky   Proj x

R3

Partname   Projname
Bolt       Proj x
Nut        Proj y
Bolt       Proj y
Nut        Proj z
Nail       Proj x

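R1 ⋈ R2

Sname     Partname   Projname
Smith     Bolt       Proj x
Smith     Nut        Proj y
Smith     Bolt       Proj y
Smith     Nut        Proj x
Adamsky   Bolt       Proj y
Adamsky   Bolt       Proj x
Walton    Nut        Proj z
Adamsky   Nail       Proj y
Adamsky   Nail       Proj x

(R1 ⋈ R2) ⋈ R3

Sname     Partname   Projname
Smith     Bolt       Proj x
Smith     Nut        Proj y
Smith     Bolt       Proj y
Adamsky   Bolt       Proj y
Adamsky   Bolt       Proj x
Walton    Nut        Proj z
Adamsky   Nail       Proj x

The tuples (Smith, Nut, Proj x) and (Adamsky, Nail, Proj y) in R1 ⋈ R2 are spurious: they do not occur in Supply. Joining with R3 removes them, because R3 contains neither (Nut, Proj x) nor (Nail, Proj y).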
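
The join behaviour of the Supply example can be reproduced with a few set operations. This is an illustrative Python sketch only: the helper `project` and the variable names are assumptions, and tuples are ordered as (Sname, Partname, Projname).

```python
# Supply relation from the example, as (Sname, Partname, Projname) tuples.
SUPPLY = {
    ("Smith", "Bolt", "Proj x"),
    ("Smith", "Nut", "Proj y"),
    ("Adamsky", "Bolt", "Proj y"),
    ("Walton", "Nut", "Proj z"),
    ("Adamsky", "Nail", "Proj x"),
    ("Adamsky", "Bolt", "Proj x"),
    ("Smith", "Bolt", "Proj y"),
}


def project(rows, positions):
    """Projection: keep only the attribute positions given, dropping duplicates."""
    return {tuple(row[i] for i in positions) for row in rows}


r1 = project(SUPPLY, (0, 1))  # (Sname, Partname)
r2 = project(SUPPLY, (0, 2))  # (Sname, Projname)
r3 = project(SUPPLY, (1, 2))  # (Partname, Projname)

# Natural join of R1 and R2 on the shared attribute Sname.
r1_join_r2 = {(s, p, j) for s, p in r1 for s2, j in r2 if s == s2}

# Joining the result with R3 keeps only tuples whose (Partname, Projname)
# pair also appears in R3.
all_three = {(s, p, j) for s, p, j in r1_join_r2 if (p, j) in r3}

# Tuples produced by the two-way join that are not in the original relation.
spurious = r1_join_r2 - SUPPLY

print(sorted(spurious))
print(all_three == SUPPLY)
```

The two-way join yields nine tuples, two of them spurious, while the three-way join returns exactly the seven tuples of Supply. This is what it means for this relation state to satisfy the join dependency.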
11. Denormalization
Denormalization is a database optimization technique in which we add
redundant data to one or more tables. This can help us avoid costly joins in a
relational database. Note that denormalization does not mean ‘reversing
normalization’ or ‘not to normalize’. It is an optimization technique that is
applied after normalization.
The process of taking a normalized schema and making it non-normalized is
called denormalization; designers use it to tune the
performance of systems to support time-critical operations.
In a traditional normalized database, we store data in separate logical tables
and attempt to minimize redundant data. We may strive to have only one
copy of each piece of data in a database.
For example, in a normalized database, we might have a Courses table and a
Teachers table. Each entry in Courses would store the teacherID for a Course
but not the teacherName. When we need to retrieve a list of all Courses with
the Teacher’s name, we would do a join between these two tables. In some
ways, this is great; if a teacher changes his or her name, we only have to
update the name in one place.
The drawback is that if tables are large, we may spend an unnecessarily long
time doing joins on tables.
Denormalization, then, strikes a different compromise. Under
denormalization, we decide that we’re okay with some redundancy and some
extra effort to update the database in order to get the efficiency advantages
of fewer joins.
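
The Courses/Teachers trade-off can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under assumptions, not a recommended schema: the table `CoursesDenorm`, the columns `courseID` and `title`, and the sample rows are hypothetical, chosen only to mirror the teacherID/teacherName example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalized design: the teacher's name is stored only in Teachers.
cur.execute("CREATE TABLE Teachers (teacherID INTEGER PRIMARY KEY, teacherName TEXT)")
cur.execute(
    "CREATE TABLE Courses (courseID INTEGER PRIMARY KEY, title TEXT, "
    "teacherID INTEGER REFERENCES Teachers(teacherID))"
)
cur.execute("INSERT INTO Teachers VALUES (1, 'A. Rao')")  # hypothetical sample row
cur.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [(10, "DBMS", 1), (11, "Networks", 1)],  # hypothetical sample rows
)

# Listing courses with the teacher's name needs a join.
normalized = cur.execute(
    "SELECT c.title, t.teacherName FROM Courses c "
    "JOIN Teachers t ON c.teacherID = t.teacherID ORDER BY c.courseID"
).fetchall()

# Denormalized design: teacherName is copied into every course row.
cur.execute(
    "CREATE TABLE CoursesDenorm (courseID INTEGER PRIMARY KEY, title TEXT, "
    "teacherID INTEGER, teacherName TEXT)"
)
cur.execute(
    "INSERT INTO CoursesDenorm "
    "SELECT c.courseID, c.title, c.teacherID, t.teacherName "
    "FROM Courses c JOIN Teachers t ON c.teacherID = t.teacherID"
)

# The same listing now reads a single table, with no join.
denormalized = cur.execute(
    "SELECT title, teacherName FROM CoursesDenorm ORDER BY courseID"
).fetchall()

# Renaming the teacher: one row changes in the normalized design,
# but every redundant copy must change in the denormalized one.
cur.execute("UPDATE Teachers SET teacherName = 'A. Rao-Iyer' WHERE teacherID = 1")
normalized_rows_touched = cur.rowcount
cur.execute("UPDATE CoursesDenorm SET teacherName = 'A. Rao-Iyer' WHERE teacherID = 1")
denormalized_rows_touched = cur.rowcount

print(normalized == denormalized)
print(normalized_rows_touched, denormalized_rows_touched)
```

Both queries return the same listing, but the rename touches one row in the normalized design and one row per course in the denormalized design. Missing any of those copies is how the data becomes inconsistent.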
Pros of Denormalization:
• Retrieving data is faster since we do fewer joins.
• Queries to retrieve data can be simpler (and therefore less likely to have bugs),
since we need to look at fewer tables.
Cons of Denormalization:
• Updates and inserts are more expensive.
• Denormalization can make update and insert code harder to write.
• Data may become inconsistent.
• Data redundancy necessitates more storage.
In a system that demands scalability, like that of any major tech company, we
almost always use elements of both normalized and denormalized databases.
10. ASSIGNMENT

1. Compute the closure of the following set of functional dependencies
for a relation scheme R(A,B,C,D,E), F = {A->BC, CD->E, B->D, E->A}.

2. Compute the closure of the following set of functional dependencies
for a relation scheme R(A,B,C,D,E), F = {A->BC, CD->E, B->D, E->A},
and find the candidate keys.

3. Consider the relation R = {A, B, C, D, E, F, G, H, I, J} and the set of
functional dependencies F = {{A, B} -> C, A -> {D, E}, B -> F, F -> {G, H}, D -> {I, J}}.
   1. What is the key for R? Demonstrate it using the inference rules.
   2. Decompose R into 2NF, then 3NF relations.

4. Consider a relation R(ABC) with the following FDs: A->B, B->C and C->A.
What is the normal form of R?

5. Prove that any relational schema with two attributes is in BCNF.
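
Questions about closures and keys are usually approached through the attribute closure X+ of a set of attributes X. The following Python sketch shows the standard fixed-point algorithm, using the dependencies from questions 1 and 2. The function names and the encoding of each dependency as a (left side, right side) pair of strings are assumptions made for illustration.

```python
def attribute_closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("CD", "E") for CD -> E.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derivable, add the right side.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


# F = {A->BC, CD->E, B->D, E->A} over R(A, B, C, D, E)
F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
R = set("ABCDE")


def is_superkey(attrs, fds=F, schema=R):
    """A set of attributes is a superkey when its closure is the whole schema."""
    return attribute_closure(attrs, fds) == schema


print(sorted(attribute_closure("A", F)))
print(sorted(attribute_closure("B", F)))
print(is_superkey("CD"), is_superkey("D"))
```

A candidate key is a superkey with no proper subset that is itself a superkey, so the same helper can be applied to subsets of increasing size to find the keys.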


11. Part A Questions & Answers

S.No Question and Answers CO K

1 Define relational algebra. CO3 K2

The relational algebra is a procedural query language. It
consists of a set of operations that take one or two relations
as input and produce a new relation as output.
2 Write short notes on domain relational calculus. CO3 K1
The domain relational calculus uses domain variables that
take on values from an attribute domain rather than values
for an entire tuple.
3 Define tuple variable. CO3 K1
A tuple variable is a variable whose domain is the set of all
tuples.

4 Define the term Domain. CO3 K1


For each attribute there is a set of permitted values called
the domain of that attribute.
5 What is the difference between tuple relational calculus and domain relational calculus? CO3 K1
The tuple-oriented calculus uses tuple variables, i.e.,
variables whose only permitted values are tuples of a given
relation (e.g., QUEL). The domain-oriented calculus uses
domain variables, i.e., variables that range over the
underlying domains instead of over relations (e.g., ILL,
DEDUCE).
6 List the possible operations in Relational Algebra. CO3 K1
• Select operation
• Project operation
• Union operation
• Set Difference operation
• Cartesian Product operation
• Rename operation
• Set-Intersection operation
• Natural-join operation
• Division
• Assignment operation
7 What is a SELECT operation? CO3 K2
The select operation selects tuples that satisfy a given
predicate. We use the lowercase Greek letter sigma (σ) to
denote selection.

8 What is a PROJECT operation? CO3 K1

The project operation is a unary operation that returns its
argument relation with certain attributes left out. Projection
is denoted by the uppercase Greek letter pi (Π).

9 Define the terms i) Key attribute ii) Value set CO2 K1


Key attribute: An entity type usually has an attribute
whose values are distinct for each individual entity in the
collection. Such an attribute is called a key attribute.
Value set: Each simple attribute of an entity type is associated
with a value set that specifies the set of values that may be
assigned to that attribute for each individual entity.

10 What does the cardinality ratio specify? CO2 K1


Mapping cardinalities or cardinality ratios express the
number of entities to which another entity can be
associated. Mapping cardinalities must be one of the
following: One to one, One to many, Many to one and Many
to many.

11 What is first normal form? CO3 K1


A relation is in first normal form (1NF) if the domain of each
attribute includes only atomic (simple, indivisible) values.

12 What is meant by domain key normal form? CO3 K1

Domain/key normal form (DKNF) is a normal form used in
database normalization which requires that the database
contains no constraints other than domain constraints and
key constraints.

13 What is 2NF? CO3 K1

A relation schema R is in 2NF if it is in 1NF and every non-prime
attribute A in R is fully functionally dependent on the
primary key.

14 What is meant by the degree of relationship set? CO2 K1


The degree of relationship type is the number of
participating entity types.

15 Explain trivial dependency. CO3 K2

A functional dependency of the form α -> β is trivial if β ⊆ α.
Trivial functional dependencies are satisfied by all
relations.

16 What is meant by computing the closure of a set of functional dependencies? CO3 K1

The closure of F, denoted by F+, is the set of all functional
dependencies logically implied by F.
17 Define single valued and multivalued attributes. CO3 K1
Single-valued attributes: Attributes with a single value for a
particular entity are called single-valued attributes.
Multivalued attributes: Attributes with a set of values for a
particular entity are called multivalued attributes.

18 What does the cardinality ratio specify? CO3 K1


Mapping cardinalities or cardinality ratios express the
number of entities to which another entity can be
associated. Mapping cardinalities must be one of the
following: • One to one • One to many • Many to one •
Many to many

19 What are axioms? CO3 K1


Axioms or rules of inference provide a simpler technique for
reasoning about functional dependencies.

20 What are the uses of functional dependencies? CO3 K1


• To test relations to see whether they are legal under a
given set of functional dependencies.
• To specify constraints on the set of legal relations.

21 Explain trivial dependency. CO3 K1

A functional dependency of the form α -> β is trivial if β ⊆ α.
Trivial functional dependencies are satisfied by all
relations.

22 What is meant by normalization of data? CO3 K1

It is the process of analyzing the given relation schemas based
on their functional dependencies (FDs) and primary keys to
achieve the following properties:
• Minimizing redundancy
• Minimizing insertion, deletion and update anomalies

23 What is Attribute preservation condition? CO3 K1


Each attribute in R will appear in at least one relation
Schema Ri in the decomposition so that no attributes are
“lost”.

24 Define Fourth Normal Form. CO3 K1


A relation schema R is in 4NF with respect to a set D of functional
and multivalued dependencies if, for all multivalued dependencies in D+
of the form A ->> B, where A is contained in R and B is contained
in R, at least one of the following holds:
• A ->> B is a trivial MVD.
• A is a superkey for schema R.

25 Define 5NF or Join Dependencies. CO3 K1

Let R be a relation schema and R1, R2, …, Rn be a
decomposition of R. The join dependency *(R1, R2, …, Rn) is
used to restrict the set of legal relations to those for which
R1, R2, …, Rn is a lossless-join decomposition of R. Formally, if
R = R1 U R2 U … U Rn, we say that a relation r(R) satisfies the
join dependency *(R1, R2, …, Rn) if r = π_R1(r) ⋈ π_R2(r) ⋈ … ⋈ π_Rn(r).
A join dependency is trivial if one of the Ri is R itself.

26 Define - Irreducible Set of Dependencies. CO3 K1

A set S of functional dependencies is irreducible if it has the
following three properties:
• The right-hand side of each functional dependency in S contains
only one attribute.
• The left-hand side of each functional dependency in S is
irreducible, i.e., removing any attribute from the left-hand side
changes the closure of S (S loses some information).
• No functional dependency can be removed from S without changing
the closure of S.

27 List the pitfalls in Relational Database Design. CO3 K1


Repetition of information
Inability to represent certain information

28 List the properties of decomposition. CO3 K1


Lossless join
Dependency Preservation
No repetition of information
29 Define 1st Normal Form. CO3 K1
If the relation R contains only atomic fields, then
R is in first normal form.
E.g., R = (account no, balance) is in first normal form.
30 Define 3rd Normal Form. CO3 K1
A relation schema R is in 3NF with respect to a set F of FDs
if, for all FDs of the form A -> B, where A is contained in R
and B is contained in R, at least one of the following holds:
• A -> B is a trivial FD.
• A is a superkey for schema R.
• Each attribute in B – A is contained in a candidate key for R.
31 Define BCNF. CO3 K1
A relation schema R is in BCNF with respect to a set F of FDs
if, for all FDs of the form A -> B, where A is contained in R
and B is contained in R, at least one of the following holds:
• A -> B is a trivial FD.
• A is a superkey for schema R.

32 Define Second Normal Form. CO3 K1

A relation schema R is in 2NF if it is in 1NF and every non-prime
attribute of R is fully functionally dependent on every candidate
key of R, i.e., no non-prime attribute depends on a proper subset
of a candidate key.

33 Define Functional Dependency. CO3 K1

Functional dependencies are constraints on the set of legal
relations. They allow us to express facts about the enterprise
that we are modeling with our database.
Syntax: A -> B, e.g., account no -> balance for the account table.
12. Part B Questions
S.No Question and Answers CO K
1 What is meant by Relational Algebra? List the relational CO3 K2
algebra operations with examples.
2 Explain the Select and Intersection operations of Relational Algebra CO3 K3
with examples.
3 Explain the Project and Union operations of Relational Algebra with CO3 K2
examples.
4 Consider the following Relational Database. CO3 K2
Student (roll_no, name,city,marks,c_no)
Course (c_no,cname,fees)
Construct Queries into Relational algebra.
a) List Student Details enrolled for ‘BBA (C.A)’ Course.
b) List the Course having fees < 20000
c) Display all students living in either ‘Nasik’ or ‘Pune’ city.
d) Display Course detail for student ‘Gaurav Sharma’.
5 Consider the following Relational Database. CO3 K2
Doctor (dno, dname,address,dcity)
Hospital (hno,hname,street,hcity)
Dochosp (dno,hno,date)
Construct Queries into Relational algebra.
a) Find the names of the hospitals that ‘Dr. Mehata’ has visited.
b) Find all the doctors who have visited hospitals in the
same city.
c) List all the doctors who visited ‘Krishna’ on ‘1-1-19’.
d) List the names of the hospitals that ‘Dr. Aman’ visited on ‘5-3-2019’.
6 Discuss Domain Relational Calculus in detail. CO3 K2

7 Construct an E-R diagram for a hospital with a set of patients CO3 K3


and a set of medical doctors. Associate with each patient a log
of the various tests and examinations conducted. Also construct
appropriate tables for the ER diagram you have drawn
8 Discuss the correspondence between the ER model constructs CO3 K3
and the relational model constructs. Show how each ER model
construct can be mapped to the relational model. Discuss the
options for mapping EER constructs.
9 State the need for Normalization of a database and explain the CO3 K2
various normal forms (1st, 2nd, 3rd, BCNF, 4th, 5th and
Domain-key) with suitable examples.

10 What are normal-forms? Explain the types of Normal form with CO3 K2
an example.

11 What are the pitfalls in the relational database design? With a CO3 K2
suitable example, explain the role of functional dependency in
the process of normalization.

12 Briefly explain about the functional dependency concepts CO3 K2

13 Consider the relation R(A,B,C,D,E) with functional CO3 K3
dependencies {A→BC, CD→E, B→D, E→A}. Identify the super
keys. Find Fc and F+.
Explain Boyce-Codd normal form with an example. Also state
how it differs from 3NF.

14 Consider the universal relation R = {A, B, C, D, E, F, G, H, I, J} CO3 K3
and the set of functional dependencies F = {{A,B} → {C}, {A}
→ {D,E}, {B} → {F}, {F} → {G,H}, {D} → {I,J}}. What is the
key for R? Decompose R into 2NF, then 3NF relations.

15 Explain the principles of: (i) lossless-join decomposition, (ii) CO3 K2
join dependencies, (iii) fifth normal form.
13. SUPPORTIVE ONLINE CERTIFICATION COURSES

Sl. No.   Name of the Institute   Name of the Course   Website Link

1.   Coursera   Database Management Essentials   https://www.coursera.org/learn/database-management
2.   Coursera   Database Systems Specialization   https://www.coursera.org/specializations/database-systems
3.   Udemy   Introduction to Database Engineering   https://www.udemy.com/course/database-engines-crash-course/
4.   Udemy   Relational Database Design   https://www.udemy.com/course/relational-database-design/
5.   Udemy   Database Design   https://www.udemy.com/course/database-design/
6.   Udemy   Database Design Introduction   https://www.udemy.com/course/cwdatabase-design-introduction/
7.   Udemy   The Complete Database Design & Modeling Beginners Tutorial   https://www.udemy.com/course/the-complete-database-modeling-and-design-beginners-tutorial/
8.   Udemy   Database Design and MySQL   https://www.udemy.com/course/calebthevideomaker2-database-and-mysql-classes/
9.   NPTEL   Data Base Management System   https://onlinecourses.nptel.ac.in/noc21_cs04/preview
14. REAL TIME APPLICATIONS IN DAY TO DAY LIFE AND TO INDUSTRY
Applications and Uses of Database Management Systems (DBMS)

• Railway reservation systems
• Library management systems
• Banking systems
• University and college management systems
• Credit card transactions
• Social media sites
• Telecommunications
• Finance applications
15. CONTENT BEYOND SYLLABUS

Introduction to Hierarchical Database Model


The Hierarchical Database Model, as the name suggests, is a database model in which
the data is arranged in a hierarchical tree structure. Every record except the root
has exactly one parent, and each parent can have one or more child
records. The data is accessed by following the tree structure,
always starting from the root (the first parent). Hence this model is named the
Hierarchical Database Model.

What is Hierarchical Database Model

It is a data model in which data is represented in a tree-like structure. In this
model, data is stored in the form of records, which are collections of fields. The
records are connected through links, and the record type determines which fields
a record contains. Each field can contain only one value.

Each child node must have exactly one parent, but a parent node can have more
than one child. Multiple parents are not allowed. This is the major difference
between the hierarchical and network database models. The first node of the tree is
called the root node. When data needs to be retrieved, the tree is
traversed starting from the root node. This model represents one-to-many
relationships.

Let us see one example: Let us assume that we have a main directory which
contains other subdirectories. Each subdirectory contains more files and directories.
Each directory or file can be in one directory only i.e. it has only one parent.

Here A is the main directory, i.e., the root node. B1 and B2 are its children
(subdirectories). B1 and B2 in turn have two children each: C1, C2 and C2, C3 respectively.
These may be directories or files. This depicts one-to-many relationships.
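
The directory example can be modelled as a tree in which each node keeps a single parent pointer. The Python sketch below is illustrative only: the `Node` class, the `find` helper and the leaf names C1 to C4 are assumptions, with the leaves named so that every node has exactly one parent.

```python
class Node:
    """A record in a hierarchical model: exactly one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # None only for the root
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow parent pointers up to the root; the path to a node is unique."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


def find(root, name):
    """Retrieve a record by traversing the tree, always starting from the root."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name == name:
            return node
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))
    return None


# Hypothetical directory tree: A is the root, B1 and B2 are subdirectories.
root = Node("A")
b1 = Node("B1", root)
b2 = Node("B2", root)
Node("C1", b1)
Node("C2", b1)
Node("C3", b2)
Node("C4", b2)

print(find(root, "C3").path())
```

Because a node stores only one parent reference, a many-to-many relationship cannot be represented without duplicating records. This is the limitation that the network model addresses by allowing multiple parents.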

Uses of Hierarchical Database Model

The hierarchical database model was widely used during the mainframe
era. Today, it is used mainly for storing file systems and geographic information. It is
used in applications where high performance is required, such as telecommunications
and banking. A hierarchical database is also used for the Windows Registry in the
Microsoft Windows operating system. It is useful where the following two conditions
are met:

• The data should be in a hierarchical pattern, i.e., a parent-child relationship must be
present.
• The data in a hierarchical pattern must be accessed through a single path only.

Advantages

A few advantages are listed below.

• Data can be retrieved easily due to the explicit links present between the table
structures.
• Referential integrity is always maintained, i.e., any changes made in the parent table
are automatically reflected in the child table.
• Promotes data sharing.
• It is conceptually simple due to the parent-child relationship.
• Database security is enforced.
• Efficient with 1:N relationships.
• A clear chain of command or authority.
• High performance.

Disadvantages

Some of the disadvantages are given below.

• If the parent table and child table are unrelated, adding a new entry in the child
table is difficult because an additional entry must also be added in the parent table.
• Complex relationships are not supported.
• Redundancy, which can result in inaccurate information.
• A change in structure leads to changes in all application programs.
• M:N relationships are not supported.
• No data manipulation or data definition language.


Features

Some features are pointed out below:

• Many-to-many relationships: The model supports only one-to-many relationships.
Many-to-many relationships are not supported.
• Problem in deletion: If a parent is deleted, its children are automatically
deleted as well.
• Hierarchy of data: Data is represented in a hierarchical tree-like structure.
• Parent-child relationship: Each child can have only one parent, but a parent can
have more than one child.
• Pointer: Pointers link records and indicate which record is the parent and which
is the child.
• Disk input and output is minimized: Parent and child records are stored
close to each other on the storage device, which minimizes hard disk input
and output.
• Fast navigation: Because parent and child records are stored close to each other, access time
is reduced and navigation becomes faster.
Examples

Let us take an example of college students who take different courses. A course can
be assigned to only a single student, but a student can take as many courses as
they want, so this is a one-to-many relationship.
Now we can represent the above hierarchical model as relational tables as shown
below:

Student

Course
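
One possible relational rendering of the Student/Course hierarchy is sketched below with Python's built-in sqlite3 module. The column names (`student_id`, `name`, `course_id`, `title`) and the sample rows are hypothetical; only the two table names come from the text. The foreign key places each course under exactly one student, and `ON DELETE CASCADE` mimics the hierarchical rule that deleting a parent deletes its children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table (hypothetical columns).
conn.execute(
    "CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# Child table: each course row points to exactly one student (its parent).
conn.execute(
    """
    CREATE TABLE Course (
        course_id  INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        student_id INTEGER NOT NULL
                   REFERENCES Student(student_id) ON DELETE CASCADE
    )
    """
)

conn.execute("INSERT INTO Student VALUES (1, 'Student One')")
conn.executemany(
    "INSERT INTO Course VALUES (?, ?, ?)",
    [(101, "DBMS", 1), (102, "Networks", 1)],
)

courses_before = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

# Deleting the parent removes its child rows as well.
conn.execute("DELETE FROM Student WHERE student_id = 1")
courses_after = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

print(courses_before, courses_after)
```

In a realistic design a course would normally be shared by many students through a separate enrolment table. The one-parent restriction here follows the hierarchical example, not general practice.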
16. ASSESSMENT SCHEDULE

Assessment Type Proposed Date


Assessment 1 May 2023

Assessment 2 June 2023

Model Exam July 2023


17. PRESCRIBED TEXT BOOKS & REFERENCE BOOKS

TEXT BOOKS:

1. Abraham Silberschatz, Henry F. Korth, S. Sudarshan, "Database System Concepts", Sixth Edition, Tata McGraw Hill, 2011.

2. Ramez Elmasri, Shamkant B. Navathe, "Fundamentals of Database Systems", Sixth Edition, Pearson Education, 2011.

REFERENCES:

1. C. J. Date, A. Kannan, S. Swamynathan, "An Introduction to Database Systems", Eighth Edition, Pearson Education, 2006.

2. Raghu Ramakrishnan, "Database Management Systems", Fourth Edition, McGraw-Hill College Publications, 2015.

3. G. K. Gupta, "Database Management Systems", Tata McGraw Hill, 2011.


18. MINI PROJECT SUGGESTIONS

Design an E-R model for each of the following and apply normalization.
1) Blood bank management system

Hospitals register to request the blood they need, and donors sign up with the blood bank to
donate blood. Donors are available to donate in particular areas according to their
registered data. When a hospital requests blood, the blood bank provides the details of
donors near the hospital. The blood bank also shows hospitals the availability of each
blood group. Records of the blood donated to hospitals can also be maintained.

2) School management system

Staff details are stored in the system with an id and can be retrieved at any time using
that id. Student information and student marks are also stored in the system. The system
also handles salary management for the staff members of the school and the fees of the
students. Another feature stores section information and each section's class teacher.

3) Payroll management system

Create a system in which the manager is the admin. The manager logs in with an id,
adds the details of all employees, and can add any new employees who
join the organization. Add a feature to calculate the salaries of the employees based on
their designation and attendance. Add a feature to display the details of all the employees in
the organization, along with the salaries calculated for the current month.

4) Railway system

Users can book train tickets to reach their destination. The booking option includes the
present station, the destination station and the train the user wants to travel in. Let the
user check the details of a train using the train id, including its arrival time, the
platform at which it arrives and its departure time. Also add an option that allows the
user to book a meal while traveling on the train, and an option that shows the price range
of the different booking classes such as AC, second class, sleeper, and others. Try to
think of further options to add yourself.
5) Hospital Data Management
Assign unique IDs to the patients and store the relevant information under them. Add the
patient's name, personal details, contact number, disease name, and the treatment the patient
is going through. Record the hospital department the patient is under (such as cardiac,
gastro, etc.). Add information about the hospital's doctors. A doctor can treat multiple patients
and has a unique ID as well. Doctors are also classified into different
departments. Add the information of ward boys and nurses working in the hospital and
assigned to different rooms. Patients are admitted into rooms, so add that information to
your database too.
Thank you
