JNTUH Dbms Unit4
We will present a number of sample queries using the following table definitions:
Queries:
CREATE TABLE Sailors (sid int PRIMARY KEY, sname varchar(50),
rating int, age int);
CREATE TABLE Boats (bid int PRIMARY KEY, bname varchar(50),
color varchar(50));
CREATE TABLE Reserves (sid int, bid int, day date,
PRIMARY KEY (sid, bid, day),
FOREIGN KEY (sid) REFERENCES Sailors(sid),
FOREIGN KEY (bid) REFERENCES Boats(bid));
This section presents the syntax of a simple SQL query and explains its meaning
through a conceptual evaluation strategy. A conceptual evaluation strategy is a
way to evaluate the query that is intended to be easy to understand rather than
efficient. A DBMS would typically execute a query in a different and more
efficient way.
The basic form of an SQL query is as follows:
SELECT [DISTINCT] select-list
FROM from-list
WHERE qualification
Example:
SELECT S.sname, S.age FROM Sailors S
AS is an optional keyword that assigns a temporary name to a table, a column,
or both. This is known as creating an alias in SQL.
Q: Find the names of sailors who have reserved boat number 103.
Q: Find the names of sailors who have reserved a red boat.
A: SELECT S.sname FROM Sailors S, Reserves R, Boats B WHERE S.sid = R.sid
AND R.bid = B.bid AND B.color = 'red';
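The conceptual evaluation strategy described earlier (take the cross product of the tables in the FROM list, keep the rows that satisfy the WHERE condition, then project onto the SELECT list) can be sketched in Python and compared with what a real engine returns. This is only a minimal sketch: it assumes SQLite through Python's built-in sqlite3 module rather than MySQL or Oracle, and the sample rows are hypothetical.

```python
import sqlite3
from itertools import product

# Hypothetical sample rows (not taken from the notes).
sailors = [(22, "Dustin", 7, 45), (31, "Lubber", 8, 55), (58, "Rusty", 10, 35)]
boats = [(101, "Interlake", "blue"), (102, "Interlake", "red"), (103, "Clipper", "green")]
reserves = [(22, 102, "2024-10-10"), (22, 103, "2024-10-08"),
            (31, 102, "2024-11-10"), (58, 103, "2024-11-12")]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT,
                       PRIMARY KEY (sid, bid, day),
                       FOREIGN KEY (sid) REFERENCES Sailors(sid),
                       FOREIGN KEY (bid) REFERENCES Boats(bid));
""")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", sailors)
conn.executemany("INSERT INTO Boats VALUES (?, ?, ?)", boats)
conn.executemany("INSERT INTO Reserves VALUES (?, ?, ?)", reserves)

# What the DBMS returns for the red-boat query.
sql_names = sorted(row[0] for row in conn.execute("""
    SELECT S.sname FROM Sailors S, Reserves R, Boats B
    WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'"""))

# Conceptual evaluation: cross product, then WHERE filter, then projection.
conceptual_names = sorted(
    s[1]                                                 # step 3: project onto sname
    for s, r, b in product(sailors, reserves, boats)     # step 1: cross product
    if s[0] == r[0] and r[1] == b[0] and b[2] == "red"   # step 2: WHERE condition
)

print(sql_names, conceptual_names)
```

Both lists contain the same names, which is the point of the conceptual strategy: it defines the answer, while the DBMS is free to compute it in a more efficient way.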
Like Operator:
The LIKE operator is used in a WHERE clause to search for a specified pattern in
a column.
There are two wildcards often used in conjunction with the LIKE operator:
The percent sign (%) represents zero, one, or multiple characters
The underscore sign (_) represents one, single character
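The two wildcards can be tried out with a short script. This is a sketch that assumes SQLite (Python's built-in sqlite3 module) and a hypothetical list of sailor names; note that LIKE in SQLite ignores case for ASCII letters by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT)")
# Hypothetical names chosen to exercise each pattern.
conn.executemany("INSERT INTO Sailors VALUES (?, ?)",
                 [(1, "Dustin"), (2, "Lubber"), (3, "Horatio"), (4, "Paula"), (5, "Bob")])

def names_like(pattern):
    """Return the sailor names that match a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT sname FROM Sailors WHERE sname LIKE ? ORDER BY sname", (pattern,))
    return [row[0] for row in rows]

contains_a = names_like("%a%")    # 'a' anywhere in the name
second_is_u = names_like("_u%")   # exactly one character, then 'u', then anything
three_chars = names_like("___")   # exactly three characters

print(contains_a, second_is_u, three_chars)
```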
SELECT * FROM Sailors WHERE sname LIKE '%a%'
INTERSECT
SELECT * FROM Sailors WHERE sname NOT LIKE '%u%';
UNION, INTERSECT AND EXCEPT
UNION:
UNION is an operator that combines the results of two or more SELECT queries
into a single result set. By default it removes duplicate rows from the result
set. MySQL uses the column names of the first SELECT statement as the column
names of the result set (output).
The number and order of the columns must be the same in all the SELECT
queries that are combined.
The data types in corresponding positions of each SELECT query must be
compatible.
The columns selected in the different SELECT queries must be in the
same order.
Syntax:
SELECT expression1, expression2, expression_n
FROM tables
[WHERE conditions]
UNION
SELECT expression1, expression2, expression_n
FROM tables
[WHERE conditions];
Q: Find the sid’s and names of sailors who have reserved a red or a green boat.
A: SELECT S.sid,S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
union
SELECT S2.sid,S2.sname
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
INTERSECT:
The INTERSECT operator is used to fetch the records that are in common between
two SELECT statements or data sets. If a record exists in one query and not in the
other, it will be omitted from the INTERSECT results.
Syntax:
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions]
INTERSECT
SELECT expression1, expression2, ... expression_n
FROM tables
[WHERE conditions];
The number and order of the columns must be the same in all the SELECT
queries that are combined.
The data types in corresponding positions of each SELECT query must be
compatible.
The columns selected in the different SELECT queries must be in the
same order.
Q: Find the sid’s and names of sailors who have reserved a red and a green boat.
A: SELECT S.sid,S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
INTERSECT
SELECT S2.sid,S2.sname
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
EXCEPT:
The EXCEPT operator combines two SELECT statements and returns the rows of the
first SELECT query that are not present in the result of the second SELECT
query. In other words, it retrieves all rows from the first SELECT query and
drops those that also appear in the second; duplicate rows are removed.
This operator behaves the same as the minus (set difference) operator does in
mathematics.
Q: Find the sids of all sailors who have reserved red boats but not green boats.
A: SELECT S.sid
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = 'red'
MINUS -- MINUS in Oracle; EXCEPT in SQL Server
SELECT S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = 'green';
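The three set operators can be compared side by side on one small data set. The sketch below assumes SQLite (Python's sqlite3 module), which spells set difference EXCEPT, and uses hypothetical rows in which sailor 22 reserved a red and a green boat, sailor 31 only a red one, and sailor 58 only a green one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical data for illustration only.
INSERT INTO Boats VALUES (102, 'red'), (103, 'green');
INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 102), (58, 103);
""")

# sids of sailors who reserved a boat of the given color.
BY_COLOR = ("SELECT R.sid FROM Reserves R, Boats B "
            "WHERE R.bid = B.bid AND B.color = '{}'")

def combine(operator):
    """Run (red query) operator (green query) and return the sorted sids."""
    sql = BY_COLOR.format("red") + " " + operator + " " + BY_COLOR.format("green")
    return sorted(row[0] for row in conn.execute(sql))

red_or_green = combine("UNION")        # reserved a red or a green boat
red_and_green = combine("INTERSECT")   # reserved a red and a green boat
red_not_green = combine("EXCEPT")      # reserved a red but no green boat

print(red_or_green, red_and_green, red_not_green)
```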
Nested Queries
A nested query is a query that has another query embedded within it; the embedded
query is called a sub query. A sub query is known as the inner query, and the query
that contains sub query is known as the outer query.
Syntax:
Based on how they are executed, sub queries are classified in two ways.
2. Correlated Queries.
In a correlated query, the outer query is processed first and the inner query
is then evaluated for each row of the outer query, i.e. the inner query always
depends on the current row of the outer query.
The SQL IN condition tests a value against multiple values in a WHERE clause.
It can be used in place of multiple OR conditions.
(Q) Find the names of sailors who have reserved a red boat.
(Q) Find the names of sailors who have not reserved a red boat.
The SQL NOT IN condition excludes rows whose value matches any of the given
values in a WHERE clause.
Example:
(Q) Find the names of sailors who have reserved boat number 103
SELECT S.sname FROM Sailors S WHERE
EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103 AND R.sid = S.sid)
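The nested forms discussed in this part (IN, NOT IN and the correlated EXISTS query) can be run on a small instance. The sketch assumes SQLite via Python's sqlite3 module; the rows and the helper name `names` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT);
CREATE TABLE Boats (bid INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical data for illustration only.
INSERT INTO Sailors VALUES (22, 'Dustin'), (31, 'Lubber'), (58, 'Rusty');
INSERT INTO Boats VALUES (102, 'red'), (103, 'green');
INSERT INTO Reserves VALUES (22, 102), (22, 103), (31, 102), (58, 103);
""")

def names(sql):
    """Run a query that returns one column and give back the sorted values."""
    return sorted(row[0] for row in conn.execute(sql))

# Sub query that is independent of the outer query: sids with a red-boat reservation.
RED_SIDS = ("SELECT R.sid FROM Reserves R WHERE R.bid IN "
            "(SELECT B.bid FROM Boats B WHERE B.color = 'red')")

reserved_red = names("SELECT S.sname FROM Sailors S WHERE S.sid IN (" + RED_SIDS + ")")
not_reserved_red = names("SELECT S.sname FROM Sailors S WHERE S.sid NOT IN (" + RED_SIDS + ")")

# Correlated sub query: the inner query refers to S.sid of the outer row.
reserved_103 = names("""
    SELECT S.sname FROM Sailors S
    WHERE EXISTS (SELECT * FROM Reserves R WHERE R.bid = 103 AND R.sid = S.sid)""")

print(reserved_red, not_reserved_red, reserved_103)
```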
Aggregation Operators:
Queries:
SELECT COUNT(*) FROM employees; -- Output=10
SELECT COUNT(name) FROM employees; -- Output=9
SELECT COUNT(id) FROM employees; -- Output=10
SELECT COUNT(DISTINCT id) FROM employees; -- Output=9
SELECT COUNT(*) FROM employees WHERE collegename = 'kitsw'; -- Output=5
SELECT MAX(sal) FROM employees; -- Output=65000
SELECT MAX(sal) AS Maximum_salary FROM employees; -- Output=65000
SELECT MAX(sal) FROM employees WHERE collegename = 'kits'; -- Output=65000
SELECT MIN(sal) FROM employees; -- Output=8000
SELECT MIN(sal) FROM employees WHERE collegename = 'kits'; -- Output=8000
SELECT SUM(sal) FROM employees; -- Output=369000
SELECT SUM(sal) AS total_salary FROM employees; -- Output=369000
SELECT AVG(sal) FROM employees; -- Output=36900
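The difference between COUNT(*) and COUNT(column), and the behaviour of the other aggregates, can be checked on a tiny table. This sketch assumes SQLite via Python's sqlite3 module; the five employee rows are hypothetical and are not the table used for the outputs above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT, sal INTEGER)")
# Hypothetical rows; employee 3 has no name recorded (NULL).
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", [
    (1, "Asha", "CSE", 30000),
    (2, "Ravi", "CSE", 50000),
    (3, None, "ECE", 20000),
    (4, "Kiran", "ECE", 40000),
    (5, "Meena", "EEE", 60000),
])

def scalar(sql):
    """Return the single value produced by an aggregate query."""
    return conn.execute(sql).fetchone()[0]

count_rows = scalar("SELECT COUNT(*) FROM employees")       # counts every row
count_names = scalar("SELECT COUNT(name) FROM employees")   # skips NULL names
max_sal = scalar("SELECT MAX(sal) FROM employees")
min_sal = scalar("SELECT MIN(sal) FROM employees")
sum_sal = scalar("SELECT SUM(sal) FROM employees")
avg_sal = scalar("SELECT AVG(sal) FROM employees")

print(count_rows, count_names, max_sal, min_sal, sum_sal, avg_sal)
```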
Order by Clause:
The ORDER BY clause is used to sort the rows in either ascending or descending
order (the default is ascending).
Example:
SELECT * from employees ORDER BY sal ASC;
Group By clause:
The GROUP BY statement in SQL arranges identical data into groups, usually so
that aggregate functions can be applied to each group, i.e. if a particular
column has the same value in different rows, those rows are placed in one group.
Important Points:
The GROUP BY clause is used with the SELECT statement.
In the query, the GROUP BY clause is placed after the WHERE clause.
In the query, the GROUP BY clause is placed before the ORDER BY clause, if
one is used.
Syntax:
SELECT column1, function_name(column2)
FROM table_name
WHERE condition
GROUP BY column1, column2
ORDER BY column1, column2;
Examples:
SELECT dept,COUNT(*) AS No_of_employees FROM employees GROUP BY
dept;
Q: Display the Number of male and female faculty.
A: SELECT gender,COUNT(*) FROM employees GROUP BY gender;
Q: What are the highest and lowest salaries paid by each dept?
A: SELECT dept,MAX(sal),MIN(sal) FROM employees GROUP BY dept;
Q: Display the number of male and female faculty from each Department.
A: SELECT dept,gender,COUNT(*) FROM employees GROUP BY dept, gender;
Q: Display the number of male and female faculty from each college.
A: SELECT collegename,gender,COUNT(*) FROM employees GROUP BY
collegename, gender;
Having Clause:
The HAVING clause is like WHERE but operates on grouped records returned by
a GROUP BY. HAVING applies to summarized group records, whereas WHERE
applies to individual records. Only the groups that meet the HAVING criteria will
be returned.
Query: SELECT dept,MAX(age) FROM employees GROUP BY dept HAVING
MAX(sal)>30;
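GROUP BY and HAVING can be seen working together in a short script. This is a sketch that assumes SQLite via Python's sqlite3 module and a hypothetical employees table; the salary threshold of 45000 is likewise chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, gender TEXT, sal INTEGER)")
# Hypothetical rows for illustration only.
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", [
    ("Asha", "CSE", "F", 30000),
    ("Ravi", "CSE", "M", 50000),
    ("Kiran", "ECE", "M", 20000),
    ("Divya", "ECE", "F", 40000),
    ("Meena", "EEE", "F", 60000),
])

# One output row per department.
per_dept = conn.execute(
    "SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept").fetchall()

# One output row per gender.
per_gender = conn.execute(
    "SELECT gender, COUNT(*) FROM employees GROUP BY gender ORDER BY gender").fetchall()

# HAVING filters whole groups after aggregation; WHERE would filter single rows.
high_paying = conn.execute(
    "SELECT dept, MAX(sal) FROM employees GROUP BY dept "
    "HAVING MAX(sal) > 45000 ORDER BY dept").fetchall()

print(per_dept, per_gender, high_paying)
```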
NULL
We have assumed that column values in a row are always known. In practice, column
values can be unknown. For example, when a sailor, say Dan, joins a yacht club, he
may not yet have a rating assigned. Since the definition for the Sailors table has a
rating column, what row should we insert for Dan? What is needed here is a special
value that denotes unknown. Suppose the Sailors table definition was modified to
include a maiden-name column. However, only married women who take their
husband's last name have a maiden name. For women who do not take their
husband's name and for men, the maiden-name column is inapplicable. Again, what
value do we include in this column for the row representing Dan? SQL provides a
special column value called null to use in such situations. We use null when the
column value is either unknown or inapplicable. Using our Sailors table definition,
we might enter the row (98, Dan, null, 39) to represent Dan. The presence of null
values complicates many issues, and we consider the impact of null values on SQL
in this section.
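A few of the complications caused by null values can be observed directly. The sketch below assumes SQLite via Python's sqlite3 module (Python's None is stored as NULL); the row for Dustin is hypothetical, and Dan's row is the one from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER PRIMARY KEY, sname TEXT, rating INTEGER, age INTEGER)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(22, "Dustin", 7, 45),    # hypothetical row
                  (98, "Dan", None, 39)])   # Dan has no rating yet

def names(condition):
    """Names of the sailors whose row satisfies the given WHERE condition."""
    sql = "SELECT sname FROM Sailors WHERE " + condition + " ORDER BY sname"
    return [row[0] for row in conn.execute(sql)]

# Comparing null with anything, even with null itself, yields unknown (NULL).
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

rated_above_5 = names("rating > 5")            # Dan is not selected ...
not_rated_above_5 = names("NOT (rating > 5)")  # ... and not selected here either
unrated = names("rating IS NULL")              # IS NULL is the way to find him

# COUNT(*) counts rows; COUNT(rating) ignores null ratings.
counts = conn.execute("SELECT COUNT(*), COUNT(rating) FROM Sailors").fetchone()

print(null_equals_null, rated_above_5, not_rated_above_5, unrated, counts)
```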
OUTER JOIN:
In an outer join, along with tuples that satisfy the matching criteria, we also include
some or all tuples that do not match the criteria.
The right outer join keeps all tuples of the right relation. If no matching
tuple is found in the left relation, the attributes of the left relation in the
join result are filled with null values.
In a full outer join, all tuples from both relations are included in the result,
irrespective of the matching condition.
SELECT * FROM student LEFT OUTER JOIN marks ON student.id = marks.id
UNION
SELECT * FROM student RIGHT OUTER JOIN marks ON student.id = marks.id;
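The effect of combining a left and a right outer join can be checked on a small instance. This sketch assumes SQLite via Python's sqlite3 module; because older SQLite versions lack RIGHT OUTER JOIN, the right outer join is written here as a left outer join with the tables swapped. The rows are hypothetical (apart from student 103, Karthik, with mark 99) and the column name `score` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER, name TEXT);
CREATE TABLE marks (id INTEGER, score INTEGER);
-- Student 101 has no marks; marks row 104 has no matching student.
INSERT INTO student VALUES (101, 'Ravi'), (102, 'Sita'), (103, 'Karthik');
INSERT INTO marks VALUES (102, 85), (103, 99), (104, 70);
""")

LEFT = ("SELECT s.id, s.name, m.id, m.score "
        "FROM student s LEFT OUTER JOIN marks m ON s.id = m.id")
# Right outer join of student and marks, expressed with the tables swapped.
RIGHT = ("SELECT s.id, s.name, m.id, m.score "
         "FROM marks m LEFT OUTER JOIN student s ON s.id = m.id")

left_rows = conn.execute(LEFT).fetchall()
full_rows = conn.execute(LEFT + " UNION " + RIGHT).fetchall()        # duplicates removed
all_rows = conn.execute(LEFT + " UNION ALL " + RIGHT).fetchall()     # matches appear twice

print(len(left_rows), len(full_rows), len(all_rows))
```

With UNION each matching pair appears once, which is the result expected of a full outer join; with UNION ALL the matching rows are returned by both halves and so appear twice.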
Triggers (MySQL)
Triggers are SQL code that is automatically executed in response to certain
events on a particular table. They are used to maintain the integrity of the data.
John is the marketing officer in a company. When a new customer data is entered
into the company’s database he has to send the welcome message to each new
customer. If it is one or two customers John can do it manually, but what if the
count is more than a thousand? Well in such scenario triggers come in handy.
Thus, now John can easily create a trigger which will automatically send a
welcome email to the new customers once their data is entered into the database.
Important points:
Triggers can be defined for INSERT, UPDATE and DELETE statements in SQL. We have two
types of triggers:
Parts of Triggers:
Trigger Event: Insert / Update / Delete
Trigger action: Before / after
Trigger body: Logic / functionality (What Trigger is doing)
Triggers allow access to values from the table for comparison purposes using
NEW and OLD. The availability of the modifiers depends on the trigger event you
use:
NEW: the new value of a column (available in INSERT and UPDATE triggers).
OLD: the old value of a column (available in UPDATE and DELETE triggers).
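Syntax (general Oracle PL/SQL form; MySQL supports only BEFORE and AFTER
triggers, always declared FOR EACH ROW):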
CREATE [OR REPLACE ] TRIGGER trigger_name
{BEFORE | AFTER | INSTEAD OF }
{INSERT [OR] | UPDATE [OR] | DELETE}
[OF col_name]
ON table_name
[REFERENCING OLD AS o NEW AS n]
[FOR EACH ROW]
WHEN (condition)
DECLARE
Declaration-statements
BEGIN
Executable-statements
EXCEPTION
Exception-handling-statements
END;
Examples:
BEFORE INSERT:
AFTER INSERT:
DROP TABLE student;
CREATE TABLE student(id int,name char(30),gmail char(20),pwd
char(20));
INSERT INTO student
values(101,"Deepak","[email protected]","123456");
SELECT * FROM student;
INSERT INTO student
VALUES(102,"Sagar","[email protected]","996655");
SELECT * FROM student;
BEFORE UPDATE:
AFTER UPDATE:
CREATE TABLE log(user char(30), status char(40));
CREATE TRIGGER auexample AFTER UPDATE ON emp
FOR EACH ROW
INSERT INTO log VALUES(current_user(), concat('Updated by ', old.name, ' ', now()));
SELECT * FROM emp;
BEFORE DELETE
CREATE TABLE salaries(empno int,name varchar(20),salary int);
INSERT INTO salaries VALUES
(501,"Raju",12000),
(502,"Rajesh",13000),
(503,"Raghu",14000),
(504,"Ranjith",15000);
AFTER DELETE:
drop table salaries;
CREATE TABLE salaries(empno int,salary int);
INSERT INTO salaries values (501,3000),(502,2000),(503,1000);
CREATE table salary_avg(sal int);
INSERT INTO salary_avg (sal) SELECT SUM(salary) FROM salaries;
SELECT * from salary_avg;
DELIMITER //
CREATE TRIGGER adelete
AFTER DELETE
ON salaries FOR EACH ROW
BEGIN
-- subtract the deleted row's salary from the stored total
UPDATE salary_avg SET sal = sal - OLD.salary;
END //
DELIMITER ;
SELECT * FROM salaries;
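The AFTER DELETE example can be exercised end to end. The sketch below assumes SQLite via Python's sqlite3 module, whose trigger syntax needs no DELIMITER command and so differs slightly from MySQL's; the table and trigger names are the ones used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salaries (empno INTEGER, salary INTEGER);
INSERT INTO salaries VALUES (501, 3000), (502, 2000), (503, 1000);
CREATE TABLE salary_avg (sal INTEGER);
INSERT INTO salary_avg (sal) SELECT SUM(salary) FROM salaries;

-- After each deleted row, subtract its salary from the stored total.
CREATE TRIGGER adelete AFTER DELETE ON salaries
FOR EACH ROW
BEGIN
    UPDATE salary_avg SET sal = sal - OLD.salary;
END;
""")

total_before = conn.execute("SELECT sal FROM salary_avg").fetchone()[0]
conn.execute("DELETE FROM salaries WHERE empno = 502")   # fires the trigger once
total_after = conn.execute("SELECT sal FROM salary_avg").fetchone()[0]

print(total_before, total_after)
```

Despite its name, salary_avg holds the sum of the salaries, so the trigger keeps a running total consistent with the salaries table.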
SCHEMA REFINEMENT AND NORMALISATION
1. Schema Refinement:
Schema refinement is the process of redefining the schema of a relation. The
most common technique of schema refinement is decomposition.
Normalization or Schema Refinement is a technique of organizing the data in
the database. It is a systematic approach of decomposing tables to eliminate data
redundancy and undesirable characteristics like Insertion, Update and Deletion
Anomalies.
Redundancy refers to repetition of same data or duplicate copies of same data
stored in different locations.
Three types of redundancy:
1. File level
2. Entire record level
3. A few attributes have redundancy.
Here all the data is stored in a single table, which causes redundancy of data and
hence anomalies, as SID and Sname are repeated for each CID. Let us discuss
the anomalies one by one.
1. Insertion anomalies: It may not be possible to store some information unless
some other information is stored as well.
2. Redundant storage: Some information is stored repeatedly.
3. Update anomalies: If one copy of redundant data is updated, then an
inconsistency is created unless all redundant copies of the data are updated.
4. Deletion anomalies: It may not be possible to delete some information
without losing some other information as well.
Update anomaly: If the fee is changed from 5000 to 7000, then we have to update
the FEE column in all the rows; otherwise the data will become inconsistent.
Insertion anomaly and deletion anomaly: These anomalies exist only because of
redundancy; otherwise they do not exist.
Insertion anomaly: Suppose a new course c4 with course name DB is introduced, but
no student has taken the DB subject yet.
To insert this data, we are forced to insert some other dummy data as well.
Deletion Anomaly:
Deleting student S3 also deletes the course.
Because some data is deleted, some other useful data is lost as well.
Problems related to decomposition:
1. Do we need to decompose a relation?
2. What problems (if any) does a given decomposition cause?
Properties of Decomposition:
Consider there is a relation R which is decomposed into sub relations R1, R2 , …. , Rn.
This decomposition is called lossless join decomposition when the join of the sub
relations results in the same relation R that was decomposed.
For lossless join decomposition, we always have
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
where ⋈ is a natural join operator
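If a relation R is decomposed into two relations R1 and R2, the decomposition is
lossless iff the common attributes R1 ∩ R2 functionally determine all the
attributes of R1 or all the attributes of R2 (that is, R1 ∩ R2 is a super key of
R1 or of R2).
Example: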
Consider the following relation R(A, B, C):
A B C
1 2 1
2 5 3
3 3 3
Consider this relation decomposed into two sub relations R1(A, B) and R2(B, C):
R1(A, B)
A B
1 2
2 5
3 3
R2(B, C)
B C
2 1
5 3
3 3
Now, if we perform the natural join (⋈) of the sub relations R1 and R2, we get
A B C
1 2 1
2 5 3
3 3 3
This is the original relation R(A, B, C), so the decomposition is lossless.
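Example - 2:
R(A, B, C)
A B C
1 1 1
2 1 2
3 2 1
4 3 2
R1(A, B)
A B
1 1
2 1
3 2
4 3
R2(B, C)
B C
1 1
1 2
2 1
3 2
Here the natural join of R1 and R2 contains the extra tuples (1, 1, 2) and
(2, 1, 1), which are not in R, so this decomposition is not lossless.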
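Both examples can be checked mechanically by projecting R onto the two sub schemas and joining the projections back. The following Python sketch does this for one particular instance; the function names are hypothetical, and a check on a single instance does not prove that a decomposition is lossless for every instance of the schema.

```python
def project(rows, schema, attrs):
    """Project a set of tuples onto the given attributes, removing duplicates."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, s1, r2, s2):
    """Natural join of r1 (schema s1) and r2 (schema s2)."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return joined, list(s1) + extra

def rejoin(rows, schema, s1, s2):
    """Decompose rows into s1 and s2, join them back, and restore column order."""
    joined, out_schema = natural_join(project(rows, schema, s1), s1,
                                      project(rows, schema, s2), s2)
    return {tuple(t[out_schema.index(a)] for a in schema) for t in joined}

schema = ["A", "B", "C"]
example1 = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}
example2 = {(1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2)}

lossless1 = rejoin(example1, schema, ["A", "B"], ["B", "C"]) == example1
rejoined2 = rejoin(example2, schema, ["A", "B"], ["B", "C"])
spurious2 = sorted(rejoined2 - example2)   # tuples that were not in R

print(lossless1, spurious2)
```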
Let a relation R(A, B, C, D) and a set of FDs F = { A → B , A → C , C → D } be given.
The relation R is decomposed into -
Functional dependencies
Functional dependency is a relationship that exists when one attribute uniquely
determines another attribute.
A functional dependency X → Y in a relation holds if any two tuples having
the same value of attribute X also have the same value of attribute Y.
i.e., if X and Y are two sets of attributes in a relation R where
X ⊆ R, Y ⊆ R
Then, for a functional dependency to exist from X to Y,
if t1.X=t2.X then t1.Y=t2.Y where t1,t2 are tuples and X,Y are attributes.
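The tuple condition above translates directly into a check over the rows of a table. The sketch below is one possible way to write it in Python; the function name `fd_holds` and the sample rows are hypothetical. A dependency that holds in one instance is not thereby guaranteed for the schema, but a single violating pair of tuples is enough to rule it out.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the rows satisfy lhs -> rhs.

    rows is a list of dicts (one per tuple); lhs and rhs are lists of
    attribute names. Two tuples that agree on lhs must agree on rhs.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y value seen for this X value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical enrolment table: a student (sid) may take several courses (cid).
rows = [
    {"sid": 1, "sname": "Asha", "cid": "c1"},
    {"sid": 1, "sname": "Asha", "cid": "c2"},
    {"sid": 2, "sname": "Ravi", "cid": "c1"},
]

sid_to_sname = fd_holds(rows, ["sid"], ["sname"])          # same sid, same name
sid_to_cid = fd_holds(rows, ["sid"], ["cid"])              # sid 1 has two courses
sid_cid_to_sname = fd_holds(rows, ["sid", "cid"], ["sname"])

print(sid_to_sname, sid_to_cid, sid_cid_to_sname)
```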
Example:
A B C D E
a 2 3 4 5
2 a 3 4 5
a 2 3 6 5
a 2 3 6 6
Types of Functional Dependencies
There are two types of functional dependencies
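Axioms / Inference rules: Inference rules are used to derive new functional
dependencies from a given set of functional dependencies.
Armstrong's axioms, developed by Armstrong in 1974, consist of three basic rules.
Together with the three additional rules that follow from them, they give the six
rules listed below, from which all functional dependencies implied by a given set
can be derived.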
1. Reflexivity Rule
In the reflexive rule, if Y is a subset of X, then X determines Y.
If X ⊇ Y then X → Y
Example:
X = {A, B, C, D}
Y = {B, C}
ABCD → BC
2. Augmentation Rule
In augmentation, if X determines Y, then XZ determines YZ for any set of
attributes Z.
If X → Y then XZ → YZ
Example:
For R (ABCD), if A → B then AC → BC
3. Transitivity Rule
In the transitive rule, if X determines Y and Y determine Z, then X must also
determine Z.
If X → Y and Y → Z then X → Z
Additional rules:
1. Union Rule
Union rule says, if X determines Y and X determines Z, then X must also
determine Y and Z.
If X→ Y and X→Z then X→YZ
2. Composition
If W → Z, X → Y, then WX → ZY.
3. Decomposition
If X→YZ then X→Y and X→Z
Closure of a Set of Attributes(X+) / Attribute closure:
X+ is the set of all attributes that can be determined using the given set X
(attributes).
Example – 1:
We are given the relation R(A, B, C, D, E). This means that the table R has five
columns: A, B, C, D, and E. We are also given the set of functional dependencies:
{A->B, B->C, C->D, D->E}.
What is {A} +?
First, we add A to {A}+.
What columns can be determined given A? We have A -> B, so we can
determine B. Therefore, {A}+ is now {A, B}.
What columns can be determined given A and B? We have B -> C in the
functional dependencies, so we can determine C. Therefore, {A}+ is now
{A, B, C}.
Now, we have A, B, and C. What other columns can we determine? Well,
we have C -> D, so we can add D to {A}+.
Now, we have A, B, C, and D. Can we add anything else to it? Yes, since D
-> E, we can add E to {A}+.
We have now used all of the columns in R and all of the functional
dependencies. {A}+ = {A, B, C, D, E}.
Example – 2:
Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-
A → BC
BC → DE
D→F
CF → G
Now, let us find the closure of some attributes and attribute sets
Closure of attribute A
A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
R(ABCDEFG)
A→B
BC→DE
AEG→G
(AC)+={AC} (Using Reflexivity)
={ABC} (Using A→B)
={ABCDE} (Using BC→DE)
R(ABCDE)
A→BC
CD→E
B→D
E→A
B+={B} (Using Reflexivity)
={BD} (Using B→D)
R(ABCDEF)
AB→C
BC→AD
D→E
CF→B
(AB)+={AB} (Using Reflexivity)
={ABC} (Using AB→C)
={ABCD} (Using BC→AD)
={ABCDE} (Using D→E)
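The step-by-step procedure used in these closure examples can be written as a short loop: keep adding the right-hand side of every dependency whose left-hand side is already contained in the result, until nothing changes. The sketch below is one way to do it in Python; it assumes single-letter attribute names so that a set of attributes can be written as a string, and the helper name `show` is hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds.

    attrs is an iterable of single-letter attribute names, and fds is a list
    of (lhs, rhs) pairs written as strings, e.g. ("BC", "DE") for BC -> DE.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs when lhs is contained in the result so far.
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return result

def show(attr_set):
    """Write a set of attributes as a sorted string, e.g. 'ABCDE'."""
    return "".join(sorted(attr_set))

ex1 = show(closure("A", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]))
ex2 = show(closure("A", [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]))
ac = show(closure("AC", [("A", "B"), ("BC", "DE"), ("AEG", "G")]))
b = show(closure("B", [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]))
ab = show(closure("AB", [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]))

print(ex1, ex2, ac, b, ab)
```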
Super key: an attribute or set of attributes whose closure contains all the
attributes of the given relation.
Step - 1
Determine all essential attributes of the given relation (attributes which do
not appear on the R.H.S. of any functional dependency).
Essential attributes of the relation are C and E.
So, attributes C and E will definitely be a part of every candidate key.
Step - 2
Now,
We will check if the essential attributes together can determine all remaining
non-essential attributes.
To check, we find the closure of CE.
So, we have
{ CE }+ ={C,E}
= { C , E , F } ( Using C → F )
= { A , C , E , F } ( Using E → A )
= { A , C , D , E , F } ( Using EC → D )
= { A , B , C , D , E , F } ( Using A → B )
We conclude that CE can determine all the attributes of the given relation.
So, CE is the only possible candidate key of the relation.
More Examples:
R(ABCD)
AB→CD
D→A
Candidate keys are AB,BD
R(ABCD)
AB→CD
C→A
D→B
Candidate keys are AB, AD, BC,DC
R(WXYZ)
Z→W
Y→XZ
WX→Y
Candidate keys are Y, WX,ZX
R(ABCD)
A→B
B→C
C→A
Candidate keys are AD, BD, CD
R(ABCDE)
AB→CD
D→A
BC→DE
Candidate keys are AB, BC, BD
R(ABCDE)
AB→C
C→D
D→E
A→B
C→A
Candidate keys are A,C
R(ABCDE)
A→D
AB→C
B→E
D→C
E→A
Candidate key is B
R(ABCDEF)
AB→C
DC→AE
E→F
Candidate keys are ABD, BCD
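The candidate keys in these examples can be confirmed by brute force: try every attribute set in order of increasing size and keep the ones whose closure is the whole relation and which do not contain a smaller key. The sketch below is a simple exponential search meant only for small textbook relations; it assumes single-letter attribute names, and the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            # A superset of an already found key is a super key but not minimal.
            if any(set(key).issubset(combo) for key in keys):
                continue
            if closure(combo, fds) == set(attributes):
                keys.append("".join(combo))
    return keys

keys_ce = candidate_keys("ABCDEF", [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")])
keys_1 = candidate_keys("ABCD", [("AB", "CD"), ("D", "A")])
keys_2 = candidate_keys("ABCD", [("AB", "CD"), ("C", "A"), ("D", "B")])
keys_3 = candidate_keys("WXYZ", [("Z", "W"), ("Y", "XZ"), ("WX", "Y")])
keys_4 = candidate_keys("ABCDEF", [("AB", "C"), ("DC", "AE"), ("E", "F")])

print(keys_ce, keys_1, keys_2, keys_3, keys_4)
```

Attributes inside each key are printed in alphabetical order, so the key written ZX in the list above is reported as XZ.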
Normalization
Normalization is a database design technique that reduces data redundancy and
eliminates undesirable characteristics like Insertion, Update and Deletion
Anomalies. Normalization rules divide larger tables into smaller tables and link
them using relationships. The purpose of normalization in SQL is to eliminate
redundant (repetitive) data and ensure data is stored logically.
Relation is in 1NF
Note:
By default, every relation is in 1NF.
This is because formal definition of a relation states that value of all the
attributes must be atomic.
Partial Dependency
A partial dependency is a dependency where some (but not all) attributes of a
candidate key determine non-prime attribute(s).
Or
A partial dependency is a dependency where a portion of a candidate key (an
incomplete candidate key) determines non-prime attribute(s).
In other words,
X → a is called a partial dependency if and only if-
1. X is a proper subset of some candidate key and
2. a is a non-prime attribute (non-key attribute)
If either condition fails, then it is not a partial dependency.
NOTE
To avoid partial dependency, incomplete candidate key must not determine any
non-prime attribute.
However, incomplete candidate key can determine prime attributes.
Example - 1:
Consider a relation- R (V, W, X, Y, Z) with functional dependencies-
VW → XY
Y→V
WX → YZ
The possible candidate keys for this relation are
VW, WX, WY
From here,
Prime attributes = { V , W , X , Y }
Non-prime attributes = { Z }
Now, if we observe the given dependencies-
There is no partial dependency.
This is because there exists no dependency where incomplete candidate key
determines any non-prime attribute.
Thus, we conclude that the given relation is in 2NF.
Example - 2:
Consider a relation- R (A, B, C, D, E, F) with functional dependencies-
A→B
B→C
C→D
D→E
The possible candidate key for this relation is
AF
From here,
Prime attributes = {A, F}
Non-prime attributes = { B, C, D, E}
Now, if we observe the given dependencies-
There is a partial dependency.
This is because an incomplete candidate key (attribute A) determines a
non-prime attribute (B).
Thus, we conclude that the given relation is not in 2NF.
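The 2NF test used in these examples can be automated: find the candidate keys, derive the prime attributes, and look for a proper subset of a key whose closure contains a non-prime attribute. The following Python sketch is a brute-force version for small relations, assuming single-letter attribute names; the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(key).issubset(combo) for key in keys):
                continue
            if closure(combo, fds) == set(attributes):
                keys.append("".join(combo))
    return keys

def partial_dependencies(attributes, fds):
    """Map each incomplete candidate key to the non-prime attributes it determines."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = {}
    for key in keys:
        for size in range(1, len(key)):            # proper, non-empty subsets only
            for part in combinations(key, size):
                determined = (closure(part, fds) - set(part)) & non_prime
                if determined:
                    found["".join(part)] = "".join(sorted(determined))
    return found   # an empty dict means the relation is in 2NF

example1 = partial_dependencies("VWXYZ", [("VW", "XY"), ("Y", "V"), ("WX", "YZ")])
example2 = partial_dependencies("ABCDEF", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])

print(example1, example2)
```

For Example 2 the result also lists C, D and E, because A determines them through the chain of dependencies starting at A → B.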
Example – 3:
Consider a relation- R (A, B, C, D) with functional dependencies-
AB → CD
C→A
D→B
The possible candidate keys for this relation are
AB, AD, BC, CD
All the attributes are prime, so no partial dependency is possible and the
relation is in 2NF.
Example - 4:
Consider a relation- R (A, B, C, D) with functional dependencies-
AB → D
B→C
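The only possible candidate key is AB, so C and D are non-prime attributes.
B → C is a partial dependency, because B is a proper subset of the candidate
key AB. Thus, the given relation is not in 2NF.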
3. Third Normal Form(3 NF):
A table is in 3NF
If
1. It is in 2NF
2. For every non trivial dependency
X → a, X is either super key or a is a prime attribute.
Or
A table is said to be in 3NF
If
1. It should be in 2NF
2. It should not have any transitive dependency
Or
A table is said to be in 3NF when it has neither a partial dependency nor a
transitive dependency.
Transitive dependency: When a non-prime attribute determines another non-prime
attribute, it is called a transitive dependency.
Example-
Consider a relation- R ( A , B , C , D , E ) with functional dependencies-
A → BC
CD → E
B→D
E→A
The possible candidate keys for this relation are-
A, E, CD, BC
From here,
Prime attributes = { A , B , C , D , E }
There are no non-prime attributes
Now,
It is clear that there are no non-prime attributes in the relation.
In other words, all the attributes of relation are prime attributes.
Thus, all the attributes on RHS of each functional dependency are prime
attributes.
Thus, we conclude that the given relation is in 3NF.
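The 3NF condition stated above (for every non-trivial dependency X → a, X is a super key or a is a prime attribute) can be checked dependency by dependency. The sketch below does so in Python for the given set of dependencies of a single relation; it assumes single-letter attribute names, and the function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs).issubset(result) and not set(rhs).issubset(result):
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure covers every attribute."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(key).issubset(combo) for key in keys):
                continue
            if closure(combo, fds) == set(attributes):
                keys.append("".join(combo))
    return keys

def third_nf_violations(attributes, fds):
    """List the (X, a) pairs where X is not a super key and a is not prime."""
    prime = set().union(*candidate_keys(attributes, fds))
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == set(attributes):
            continue                                  # X is a super key
        for a in sorted(set(rhs) - set(lhs)):         # skip the trivial part
            if a not in prime:
                violations.append((lhs, a))
    return violations   # an empty list means the relation is in 3NF

in_3nf = third_nf_violations(
    "ABCDE", [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")])
not_in_3nf = third_nf_violations(
    "ABCDEF", [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])

print(in_3nf, not_in_3nf)
```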
Example:
Consider a relation- R ( A , B , C ) with the functional dependencies-
A→B
B→C
C→A
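The possible candidate keys for this relation are- A, B, C
The left side of every dependency (A, B and C) is a candidate key, so the
relation is in BCNF and therefore also in 3NF.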
Fourth Normal Form:
A table is said to be in 4NF
If
1. It should be in BCNF
2. It should not have any Multi – valued dependency.
Multi – valued dependency:
1. A dependency A →→ B is multi-valued if, for a single value of A, multiple
values of B exist.
2. The table should have at least 3 columns.
3. For a relation (A, B, C), if there is a multi-valued dependency
A →→ B, then B and C should be independent of each other.
Fifth normal form:
A database is said to be in 5NF, if and only if,
It should be in 4NF
If we decompose the table further to eliminate redundancy and anomalies,
then when we re-join the decomposed tables by means of candidate keys, we
should not lose any of the original data and no new records should arise.
In simple words, joining two or more decomposed tables should neither lose
records nor create new records. Or
It should not have a join dependency
5NF is also known as Project-join normal form (PJ/NF).
Join dependency:
Let R be a relation schema and R1, R2, …, Rn be a decomposition of R. R is said
to satisfy the join dependency *(R1, R2, …, Rn) if and only if
∏R1(R) ⋈ ∏R2(R) ⋈ … ⋈ ∏Rn(R) = R
Now, each of the combinations is in a different table. If we need to identify who
is teaching which subject to which semester, we need to join the tables on their
key columns to get the result.
For example, to find who teaches Physics to Semester 1, we select Physics
and Semester 1 from table 3 above and join with table 1 using Subject to filter out the
lecturer names. We then join with table 2 using Lecturer to get the correct lecturer name.
That is, we joined the key columns of each table to get the correct data. Hence no
data is lost and no new data is created, which satisfies the 5NF condition.