DBMS Unit-3
Relational Decomposition
o When a relation in the relational model is not in an appropriate normal form, it
needs to be decomposed.
o Decomposition breaks a table into multiple tables.
o If a relation is not decomposed properly, it may lead to problems such as loss of
information.
o Decomposition is used to eliminate some of the problems of bad design, such as anomalies,
inconsistencies, and redundancy.
Types of Decomposition
Lossless-join decomposition
Lossless-join decomposition is a decomposition of a relation into two or more relations
such that the original relation can be reconstructed exactly by joining them.
“This property guarantees that no information is lost from the original relation during the
decomposition.”
It is also known as non-additive join decomposition.
When the sub relations are joined again, the resulting relation must be the same as the
original relation before decomposition.
Example: consider a relation R (A, B, C):
A B C
12 25 34
10 36 09
12 42 30
It is decomposed into two sub relations:
R1 (A, B)
A B
12 25
10 36
12 42
R2 (B, C)
B C
25 34
36 09
42 30
Now, we can check the first condition for lossless-join decomposition.
1. The union of the attributes of sub relations R1 and R2 is the same as the attributes of relation R.
R1 U R2 = R
Joining R1 and R2 on the common attribute B, we get the following result:
A B C
12 25 34
10 36 09
12 42 30
This relation is the same as the original relation R. Hence, the above decomposition is a
lossless-join decomposition.
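The check above can be tried out in a few lines of Python. This is only a minimal sketch: relations are modelled as sets of tuples, the variable names are our own, and the value 09 is written as 9 because Python does not allow leading zeros in integers.

```python
# Relation R(A, B, C) from the example, as a set of tuples.
R = {(12, 25, 34), (10, 36, 9), (12, 42, 30)}

# Project R onto (A, B) and (B, C) to get the two sub relations.
R1 = {(a, b) for a, b, _ in R}
R2 = {(b, c) for _, b, c in R}

# Natural join of R1 and R2 on the common attribute B.
joined = {(a, b, c) for a, b in R1 for b2, c in R2 if b == b2}

# The decomposition is lossless for this data if the join gives back R exactly.
is_lossless = joined == R
print(sorted(joined), is_lossless)
```

If the join produced extra (spurious) tuples, `is_lossless` would be False.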
Dependency Preserving
o It is an important constraint of the database.
o In dependency preservation, every functional dependency of the main table must be
satisfied by at least one decomposed table or sub table.
o If a relation R is decomposed into relations R1 and R2, then each dependency of R must
either be a part of R1 or R2, or be derivable from the combination of the functional
dependencies of R1 and R2.
o For example, suppose there is a relation R (A, B, C, D) with the functional dependency set
(A->BC).
o The relation R is decomposed into R1 (A, B, C) and R2 (A, D), which is dependency
preserving because the FD A->BC of R is a part of relation R1 (A, B, C).
o i.e. all the attributes of the functional dependency A->BC of R appear in R1 (A, B, C).
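The "part of R1 or R2" case can be sketched as a simple subset test. The function name `directly_preserved` is our own, and the sketch deliberately ignores the second case (dependencies that are only derivable from the combined dependencies of the sub relations), which needs an attribute-closure computation.

```python
def directly_preserved(fd, schemas):
    """Return True if all attributes of the FD appear together in one schema."""
    lhs, rhs = fd
    needed = set(lhs) | set(rhs)
    return any(needed <= set(schema) for schema in schemas)

# FD A -> BC on R(A, B, C, D), decomposed into R1(A, B, C) and R2(A, D).
fd = ("A", "BC")
preserved = directly_preserved(fd, ["ABC", "AD"])

# A different split, R1(A, B) and R2(C, D), has no table holding A, B and C together.
not_preserved = directly_preserved(fd, ["AB", "CD"])
print(preserved, not_preserved)
```

A False result here only means the dependency is not contained in a single table; it may still be derivable.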
1. Loss of Information
Non-loss decomposition: When a relation is decomposed into two or more smaller
relations, and the original relation can be perfectly reconstructed by taking the
natural join of the decomposed relations, it is termed a lossless
decomposition. If not, it is termed a "lossy decomposition."
Example: Consider a table `R(A, B, C)` with a dependency `A → B`. If you
decompose it into `R1(A, B)` and `R2(B, C)`, the decomposition can be lossy: the
common attribute `B` is not a key of either `R1` or `R2`, so the natural join may
produce tuples that were not in the original table.
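A small sketch makes the lossy case concrete. The two rows below are hypothetical data chosen so that `A → B` holds; they are not taken from the notes.

```python
# Hypothetical R(A, B, C) in which A -> B holds (1 -> X, 2 -> X).
R = {(1, "X", "P"), (2, "X", "Q")}

# Decompose into R1(A, B) and R2(B, C).
R1 = {(a, b) for a, b, _ in R}
R2 = {(b, c) for _, b, c in R}

# Natural join on the common attribute B.
joined = {(a, b, c) for a, b in R1 for b2, c in R2 if b == b2}

# Tuples that appear after the join but were never in R.
spurious = joined - R
print(sorted(spurious))
```

The join returns four tuples instead of two, so the original table cannot be told apart from the spurious rows.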
Example: Consider a relation R(A,B,C) with the following data:
| A | B | C |
|----|----|----|
| 1 | X | P |
| 1 | Y | P |
| 2 | Z | Q |
Suppose R is decomposed into R1(A, B) and R2(A, C).
R1(A, B):
| A | B |
|----|----|
| 1 | X |
| 1 | Y |
| 2 | Z |
R2(A, C):
| A | C |
|----|----|
| 1 | P |
| 2 | Q |
Now, if we take the natural join of R1 and R2 on attribute A, we get back the
original relation R. Therefore, this is a lossless decomposition.
3. Increased Complexity
Decomposition leads to an increase in the number of tables, which can complicate
queries and maintenance tasks. While tools and ORM (Object-Relational Mapping)
libraries can mitigate this to some extent, it still adds complexity.
4. Redundancy
Incorrect decomposition might not eliminate redundancy, and in some cases, can
even introduce new redundancies.
5. Performance Overhead
An increased number of tables, while aiding normalization, can also lead to more
complex SQL queries involving multiple joins, which can introduce performance
overheads.
EMPLOYEE table:
14 John 7272826385, 9064738238 UP
The decomposition of the EMPLOYEE table into 1NF is shown below:
14 John 7272826385 UP
14 John 9064738238 UP
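The conversion to 1NF can be sketched in Python as follows. The field names (`emp_id`, `name`, `phones`, `state`) are assumptions, since the column headings are not shown above, and the sketch assumes the two phone numbers are stored in one comma-separated field.

```python
# One EMPLOYEE row with a multi-valued phone field (field names are assumed).
employee = [(14, "John", "7272826385, 9064738238", "UP")]

def to_1nf(rows):
    """Emit one row per phone number so every field holds a single value."""
    result = []
    for emp_id, name, phones, state in rows:
        for phone in phones.split(","):
            result.append((emp_id, name, phone.strip(), state))
    return result

employee_1nf = to_1nf(employee)
for row in employee_1nf:
    print(row)
```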
Example: Let's assume a school stores the data of teachers and the subjects they teach.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
The above table is not in 2NF. Its candidate key is {TEACHER_ID, SUBJECT}, but the non-prime
attribute TEACHER_AGE depends on TEACHER_ID alone, which is a proper subset of the candidate
key (a partial dependency).
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
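The partial dependency and the decomposition can be checked with a short sketch. The helper `fd_holds` is our own name; it tests a functional dependency against the rows that happen to be in the table, which shows the idea but is not a proof for all possible data.

```python
# TEACHER(TEACHER_ID, SUBJECT, TEACHER_AGE) as a set of tuples.
teacher = {
    (25, "Chemistry", 30), (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38), (83, "Computer", 38),
}

def fd_holds(rows, lhs, rhs):
    """Check that equal values at the lhs positions imply equal values at rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs)
        value = tuple(row[i] for i in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True

# TEACHER_ID -> TEACHER_AGE holds, TEACHER_ID -> SUBJECT does not.
age_depends_on_id = fd_holds(teacher, lhs=[0], rhs=[2])
subject_depends_on_id = fd_holds(teacher, lhs=[0], rhs=[1])

# The 2NF decomposition is a pair of projections.
teacher_detail = {(tid, age) for tid, _, age in teacher}
teacher_subject = {(tid, subject) for tid, subject, _ in teacher}

# Joining on TEACHER_ID gives back the original table.
rejoined = {(tid, subject, age)
            for tid, subject in teacher_subject
            for tid2, age in teacher_detail if tid == tid2}
print(age_depends_on_id, subject_depends_on_id, rejoined == teacher)
```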
Rollno Game Feestructure
1 Basketball 500
2 Basketball 500
3 Basketball 500
4 Cricket 600
5 Cricket 600
6 Cricket 600
7 Tennis 400
The above student table is in 1NF because there are no multivalued attributes.
The student table is also in 2NF because all non-key attributes are fully functionally dependent
on the primary key (rollno).
But the table is not in 3NF because a transitive dependency exists in it: rollno → game and
game → feestructure.
So divide the student table into R1 (rollno, game) and R2 (game, feestructure).
Table:R1
Rollno Game
1 Basketball
2 Basketball
3 Basketball
4 Cricket
5 Cricket
6 Cricket
7 Tennis
Table:R2
Game Feestructure
Basketball 500
Cricket 600
Tennis 400
No transitive dependency exists in the above two tables, so both tables follow the
3NF rule.
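The effect of this decomposition can be sketched in Python. The variable names are our own; the point of the sketch is that each fee is stored once after the split, and the original rows can still be rebuilt by a lookup on game.

```python
# Student(rollno, game, feestructure) rows from the example.
student = [
    (1, "Basketball", 500), (2, "Basketball", 500), (3, "Basketball", 500),
    (4, "Cricket", 600), (5, "Cricket", 600), (6, "Cricket", 600),
    (7, "Tennis", 400),
]

# R1(rollno, game) and R2(game, feestructure) are projections of the table.
r1 = {(rollno, game) for rollno, game, _ in student}
r2 = {(game, fee) for _, game, fee in student}

# Because game -> feestructure, R2 can act as a lookup table.
fee_of = dict(r2)
rejoined = sorted((rollno, game, fee_of[game]) for rollno, game in r1)

print(sorted(r2))
print(rejoined == sorted(student))
```

With the fee kept in one row per game, changing the Cricket fee means updating a single row of R2 instead of three rows of the student table.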
Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key, so the left-hand side of each functional dependency is not a super key.
To convert the given table into BCNF, we decompose it into two tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
264 India
EMP_DEPT table:
Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Super keys:
STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF and also in BCNF, but COURSE and HOBBY are two
independent attributes. Hence, there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math,
and two hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID,
which leads to unnecessary repetition of data.
So to bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
Relational Algebra
Relational algebra is a procedural query language. It gives a step by step process to obtain the
result of the query. It uses operators to perform queries.
Types of Relational operation
Example:
A B C
1 2 4
2 2 3
3 2 3
4 3 4
For the above relation, σ(C>3)R will select the tuples which have C greater than 3.
A B C
1 2 4
4 3 4
Note: The selection operator chooses which tuples (rows) appear in the result; it does not
choose attributes. To pick specific attributes (columns), the projection operator is used.
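Selection and projection on the relation R above can be sketched with Python comprehensions. This is only an illustration of the two operators, with our own variable names; a set is used for the projection because a relation does not keep duplicate tuples.

```python
# Relation R(A, B, C) from the example.
R = [(1, 2, 4), (2, 2, 3), (3, 2, 3), (4, 3, 4)]

# Selection: keep the tuples that satisfy the condition C > 3.
selected = [(a, b, c) for a, b, c in R if c > 3]

# Projection onto (B, C): keep only those attributes and drop duplicates.
projected = {(b, c) for _, b, c in R}

print(selected)
print(sorted(projected))
```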
2 4
2 3
3 4
3. Union(U): Union operation in relational algebra is the same as union operation in set theory.
Example: Consider the following tables of students having different optional subjects in their course.
FRENCH
Student_Name Roll_Number
Ram 01
Mohan 02
Vivek 13
Geeta 17
GERMAN
Student_Name Roll_Number
Vivek 13
Geeta 17
Shyam 21
Rohan 25
π(Student_Name)FRENCH U π(Student_Name)GERMAN
Student_Name
Ram
Mohan
Vivek
Geeta
Shyam
Rohan
Note: The only constraint in the union of two relations is that both relations must have the
same set of Attributes.
4. Set Difference(-): Set Difference in relational algebra is the same set difference operation as
in set theory.
Example: From the above table of FRENCH and GERMAN, Set Difference is used as follows
π(Student_Name)FRENCH - π(Student_Name)GERMAN
Student_Name
Ram
Mohan
5. Set Intersection(∩): Set Intersection in relational algebra is the same set intersection
operation in set theory.
Example: From the above table of FRENCH and GERMAN, the Set Intersection is used as
follows
π(Student_Name)FRENCH ∩ π(Student_Name)GERMAN
Student_Name
Vivek
Geeta
Note: As with union, the only constraint on Set Difference and Set Intersection between two
relations is that both relations must have the same set of attributes.
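Since these three operators behave like their set-theory counterparts, they map directly onto Python's built-in set operations. A minimal sketch using the FRENCH and GERMAN tables (variable names are our own):

```python
# FRENCH and GERMAN as sets of (Student_Name, Roll_Number) tuples.
french = {("Ram", "01"), ("Mohan", "02"), ("Vivek", "13"), ("Geeta", "17")}
german = {("Vivek", "13"), ("Geeta", "17"), ("Shyam", "21"), ("Rohan", "25")}

# Projection onto Student_Name.
french_names = {name for name, _ in french}
german_names = {name for name, _ in german}

union = french_names | german_names          # students in either course
difference = french_names - german_names     # in FRENCH but not in GERMAN
intersection = french_names & german_names   # in both courses

print(sorted(union))
print(sorted(difference))
print(sorted(intersection))
```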
6. Rename(ρ): Rename is a unary operation used for renaming attributes of a relation.
ρ(a/b)R will rename the attribute 'b' of the relation R to 'a'.
7. Cross Product(X): The cross product is taken between two relations, say A and B. A X B
results in a relation with all the attributes of A followed by all the attributes of B.
Each record of A is paired with every record of B.
Example:
A
Name Age Gender
Ram 14 M
Sona 15 F
Kim 20 M
B
ID Course
1 DS
2 DBMS
A X B
Name Age Gender ID Course
Ram 14 M 1 DS
Ram 14 M 2 DBMS
Sona 15 F 1 DS
Sona 15 F 2 DBMS
Kim 20 M 1 DS
Kim 20 M 2 DBMS
Note: If A has 'n' tuples and B has 'm' tuples, then A X B will have n*m tuples.
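The cross product can be sketched with `itertools.product` from the Python standard library, which pairs every element of one sequence with every element of another. Variable names are our own.

```python
from itertools import product

# Relations A(Name, Age, Gender) and B(ID, Course) from the example.
A = [("Ram", 14, "M"), ("Sona", 15, "F"), ("Kim", 20, "M")]
B = [(1, "DS"), (2, "DBMS")]

# Each tuple of A is concatenated with every tuple of B.
cross = [a + b for a, b in product(A, B)]

for row in cross:
    print(row)
print(len(cross) == len(A) * len(B))
```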
Relational Calculus
1. Relational calculus is a non-procedural query language.
2. In a non-procedural query language, the user is not concerned with the details
of how to obtain the end results.
3. Relational calculus tells what to do but never explains how to do it.
{ t | P(t) }
Where t is a tuple variable and P(t) is a logical formula that describes the
conditions that the tuples in the result must satisfy. The curly braces {} are
used to indicate that the expression is a set of tuples.
For example, let’s say we have a table called “Employees” with the
following attributes:
Employee ID
Name
Salary
Department ID
To retrieve the names of all employees who earn more than $50,000 per year,
we can use the following TRC query:
{ t.Name | Employees(t) ∧ t.Salary > 50000 }
In this query, the “Employees(t)” expression specifies that the tuple variable t
represents a row in the “Employees” table. The “∧” symbol is the logical AND
operator, which is used to combine the condition “t.Salary > 50000” with the
table selection. Writing “t.Name” before the bar keeps only the Name attribute
of each qualifying tuple.
The result of this query will be a set of tuples, where each tuple contains the
Name attribute of an employee who earns more than $50,000 per year.
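A query that returns the names of employees earning more than 50000 reads much like a Python set comprehension: the output expression comes first, then the range of the tuple variable, then the condition. The sketch below uses hypothetical employee rows (the names and salaries are made up for illustration).

```python
from typing import NamedTuple

class Employee(NamedTuple):
    employee_id: int
    name: str
    salary: int
    department_id: int

# Hypothetical rows of the Employees table.
employees = [
    Employee(1, "Asha", 60000, 10),
    Employee(2, "Ravi", 45000, 10),
    Employee(3, "Meera", 72000, 20),
]

# "t ranges over Employees, and t.salary > 50000; return t.name".
high_earners = {t.name for t in employees if t.salary > 50000}
print(sorted(high_earners))
```

The comprehension states which tuples qualify, not the order in which Python should scan them, which mirrors the non-procedural style of relational calculus.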
Output: This query will yield the article, page, and subject from the
relation javatpoint, where the subject is a database.
Deletion Anomaly
If the details of the students in this table are deleted, then the details
of the college will also get deleted, which should not happen. This
anomaly occurs when the deletion of a data record results in losing
some unrelated information that was stored as part of the record that
was deleted from a table.
Updation Anomaly
Suppose the rank of the college changes; then the change will have to
be made all over the database, which will be time-consuming and
computationally costly.
Student_ID Name Contact College Course Rank
All places should be updated. If the update does not occur at all
places, then the database will be in an inconsistent state.
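The inconsistent state can be illustrated with a short sketch. The rows below are hypothetical (the notes do not show the table data), and `inconsistent_colleges` is our own helper name.

```python
# Hypothetical rows; the college rank is repeated in every student row.
students = [
    {"Student_ID": 1, "Name": "Asha", "College": "XYZ", "Rank": 1},
    {"Student_ID": 2, "Name": "Ravi", "College": "XYZ", "Rank": 1},
]

# The rank changes, but only one of the copies is updated.
students[0]["Rank"] = 2

def inconsistent_colleges(rows):
    """Return the colleges that now appear with more than one rank."""
    ranks = {}
    for row in rows:
        ranks.setdefault(row["College"], set()).add(row["Rank"])
    return {college for college, seen in ranks.items() if len(seen) > 1}

conflicts = inconsistent_colleges(students)
print(conflicts)
```

If the rank were stored once in a separate college table, a single update would be enough and this conflict could not arise.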
Problems Caused Due to Redundancy
Data Inconsistency: Redundancy can lead to data inconsistencies,
where the same data is stored in multiple locations, and changes to
one copy of the data are not reflected in the other copies. This can
result in incorrect data being used in decision-making processes and
can lead to errors and inconsistencies in the data.
Storage Requirements: Redundancy increases the storage
requirements of a database. If the same data is stored in multiple
places, more storage space is required to store the data. This can
lead to higher costs and slower data retrieval.
Update Anomalies: Redundancy can lead to update anomalies,
where changes made to one copy of the data are not reflected in
the other copies. This can result in incorrect data being used in
decision-making processes and can lead to errors and
inconsistencies in the data.
Performance Issues: Redundancy can also lead to performance
issues, as the database must spend more time updating multiple
copies of the same data. This can lead to slower data retrieval and
slower overall performance of the database.
Security Issues: Redundancy can also create security issues, as
multiple copies of the same data can be accessed and manipulated
by unauthorized users. This can lead to data breaches and
compromise the confidentiality, integrity, and availability of the
data.
Maintenance Complexity: Redundancy can increase the
complexity of database maintenance, as multiple copies of the same
data must be updated and synchronized. This can make it more
difficult to troubleshoot and resolve issues and can require more
time and resources to maintain the database.
Data Duplication: Redundancy can lead to data duplication, where
the same data is stored in multiple locations, resulting in wasted
storage space and increased maintenance complexity. This can also
lead to confusion and errors, as different copies of the data may
have different values or be out of sync.
Data Integrity: Redundancy can also compromise data integrity,
as changes made to one copy of the data may not be reflected in
the other copies. This can result in inconsistencies and errors and
can make it difficult to ensure that the data is accurate and up-to-
date.
Usability Issues: Redundancy can also create usability issues, as
users may have difficulty accessing the correct version of the data
or may be confused by inconsistencies and errors. This can lead to
frustration and decreased productivity, as users spend more time
searching for the correct data or correcting errors.
Fifth normal form (5NF)
o A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining
is lossless.
o 5NF is satisfied when all the tables are broken into as many tables as possible
in order to avoid redundancy.
o 5NF is also known as Project-join normal form (PJ/NF).
Example
SUBJECT LECTURER SEMESTER
Computer Anshika Semester 1
In the above table, John takes both the Computer and Math classes for Semester 1,
but he doesn't take the Math class for Semester 2. In this case, a combination of
all these fields is required to identify valid data.
Suppose we add a new semester, Semester 3, but do not know the subject or who
will be teaching it, so we leave Lecturer and Subject as NULL. But all three
columns together act as the primary key, so we can't leave the other two
columns blank.
So to make the above table into 5NF, we can decompose it into three
relations P1, P2 & P3:
P1
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
P2
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
P3
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen
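A short sketch shows why all three relations are needed when joining back. It models P1, P2 and P3 as Python sets (so repeated rows collapse) and uses our own variable names.

```python
# P1(SEMESTER, SUBJECT), P2(SUBJECT, LECTURER), P3(SEMESTER, LECTURER).
p1 = {("Semester 1", "Computer"), ("Semester 1", "Math"),
      ("Semester 1", "Chemistry"), ("Semester 2", "Math")}
p2 = {("Computer", "Anshika"), ("Computer", "John"), ("Math", "John"),
      ("Math", "Akash"), ("Chemistry", "Praveen")}
p3 = {("Semester 1", "Anshika"), ("Semester 1", "John"),
      ("Semester 2", "Akash"), ("Semester 1", "Praveen")}

# Join P1 and P2 on SUBJECT, giving (SUBJECT, LECTURER, SEMESTER) tuples.
p1_p2 = {(subject, lecturer, semester)
         for semester, subject in p1
         for subject2, lecturer in p2 if subject == subject2}

# Join with P3: keep only tuples whose (SEMESTER, LECTURER) pair is in P3.
joined = {row for row in p1_p2 if (row[2], row[1]) in p3}

print(len(p1_p2), len(joined))
for row in sorted(joined):
    print(row)
```

Joining only P1 and P2 gives seven tuples, including pairs such as John teaching Math in Semester 2; the join with P3 removes these and leaves five tuples.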