DBMS Unit 3 Notes
UNIT – III
SQL: QUERIES, CONSTRAINTS, TRIGGERS: form of basic SQL query, UNION, INTERSECT,
and EXCEPT, Nested Queries, aggregation operators, NULL values, complex integrity constraints in
SQL, triggers and active databases. Schema Refinement: Problems caused by redundancy,
decompositions, problems related to decomposition, reasoning about functional dependencies, FIRST,
SECOND, THIRD normal forms, BCNF, lossless join decomposition, multi-valued dependencies,
FOURTH normal form, FIFTH normal form.
1. SQL COMMANDS
Structured Query Language (SQL) is the database language used to create a database and
to perform operations on the existing database. SQL commands are instructions used to
communicate with the database to perform specific tasks and queries with data. These SQL
commands are categorized into five categories as:
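[Figure: classification of SQL commands]
i. DDL (Data Definition Language): DDL commands define and modify the structure of database objects. The DDL commands are: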
CREATE: It is used to create the database or its objects (such as tables, indexes, functions,
views, stored procedures and triggers).
DROP: It is used to delete objects from the database.
ALTER: It is used to alter the structure of the database.
TRUNCATE: It is used to remove all records from a table and to release the space
allocated for those records.
ii. DQL (Data Query Language): DQL statements are used for performing queries on the
data within schema objects. The purpose of a DQL command is to get data from a schema
relation based on the query passed to it. The DQL commands are:
SELECT – is used to retrieve data from the database.
iii. DML (Data Manipulation Language): The SQL commands that deal with the
manipulation of data present in the database belong to DML or Data Manipulation Language,
and this includes most of the SQL statements. The DML commands are INSERT, UPDATE and DELETE.
2. DDL COMMANDS
DDL or Data Definition Language consists of the SQL commands that can be used to define
the database schema. It deals with descriptions of the database schema and is used to
create and modify the structure of database objects in the database. The DDL commands are:
i. CREATE: It is used to create the database or its objects such as tables, indexes, functions, views,
stored procedures and triggers.
Note: The content in the square brackets indicates it is optional. If not required, you can skip it.
Column constraints
Table constraints
Example 4: Using NOT NULL and UNIQUE as column constraints, and PRIMARY KEY and CHECK as table
constraints.
CREATE TABLE Voter_list
(
VoterID numeric(10),
AdhaarNo numeric(12) NOT NULL UNIQUE,
Name varchar(20) NOT NULL,
Age int,
Mobile numeric(10) UNIQUE,
City varchar(20),
PRIMARY KEY (VoterID),
CHECK (Age > 18)
);
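The effect of these constraints can be tried out with a small sketch. It runs the Voter_list definition through Python's built-in sqlite3 module; the sample rows (names, numbers, cities) are hypothetical and invented only for illustration, and the exact error text differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Voter_list (
        VoterID  numeric(10),
        AdhaarNo numeric(12) NOT NULL UNIQUE,
        Name     varchar(20) NOT NULL,
        Age      int,
        Mobile   numeric(10) UNIQUE,
        City     varchar(20),
        PRIMARY KEY (VoterID),
        CHECK (Age > 18)
    )""")

def try_insert(row):
    """Insert one row; return 'ok' or the constraint error text."""
    try:
        conn.execute("INSERT INTO Voter_list VALUES (?, ?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as err:
        return str(err)

# Hypothetical rows, invented for this sketch.
results = [
    try_insert((1, 111122223333, "Asha", 25, 9000000001, "Pune")),     # valid row
    try_insert((2, 444455556666, "Ravi", 17, 9000000002, "Delhi")),    # violates CHECK
    try_insert((3, 111122223333, "Kiran", 30, 9000000003, "Mumbai")),  # duplicate AdhaarNo
    try_insert((4, 777788889999, None, 40, 9000000004, "Chennai")),    # NULL Name
]
for outcome in results:
    print(outcome)
```

Only the first row is stored; each of the other three is rejected by one of the declared constraints.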
c) The ‘CREATE TABLE AS’ Statement: You can also create a table from another
existing table. The newly created table also contains data of existing table.
Syntax: CREATE TABLE NewTableName AS(SELECT Column1, column2, ..., ColumnN
FROM ExistingTableName
WHERE [condition]);
b) The ‘DROP TABLE’ Statement: This statement is used to drop an existing table. When
you use this statement, complete information present in the table will be lost.
Syntax: DROP TABLE TableName;
iii. TRUNCATE: This command is used to delete the information present in the table but
does not delete the table. So, once you use this command, your information will be lost, but
not the table.
Syntax: TRUNCATE TABLE TableName;
iv. ALTER: This command is used to add, delete or modify column(s) in an existing table. It
can also be used to rename the existing table and also to rename the existing column name.
a) The ‘ALTER TABLE’ with ADD column: You can use this command to add a new
column to the existing table.
Syntax: ALTER TABLE TableName
ADD ColumnName Datatype;
c) The ‘ALTER TABLE’ with MODIFY COLUMN: This statement is used to change
the data type or size of data type of an existing column in a table.
Example 2: Changing the data type of column ‘EmployeeID’ in the table ‘Employee_info’
from int to char(10).
d) The ‘ALTER TABLE’ with CHANGE column name: This statement is used to change
the column name of an existing column in a table.
e) The ‘ALTER TABLE’ with RENAME table name: This statement is used to change
the table name in the database.
3. DML COMMANDS: The SQL commands that deal with the manipulation of data
present in the database belong to DML or Data Manipulation Language, and this includes
most of the SQL statements. The DML commands are:
i. INSERT: This statement is used to insert new record (row) into the table.
Example1 :
INSERT INTO Employee_Info( EmployeeID, EmployeeName, PhoneNumber, City,Country)
VALUES ('06', 'Sanjana', '9921321141', 'Chennai', 'India');
Example2 : When inserting all column values as per their order in the table, you can omit
column names.
INSERT INTO Employee_Info
VALUES ('07', 'Sayantini','9934567654', 'Pune', 'India');
ii. DELETE: This statement is used to delete the existing records in a table.
Example:
DELETE FROM Employee_Info
WHERE EmployeeName='Preeti';
Note: If the WHERE condition is omitted from the DELETE command, all the rows of the table are deleted.
If it is used, only the rows which satisfy the condition are deleted.
iii. UPDATE: This statement is used to modify the record values already present in the table.
Example:
UPDATE Employee_Info
SET EmployeeName = 'Jhon', City= 'Ahmedabad'
WHERE EmployeeID = 1;
Note: If the WHERE condition is omitted from the UPDATE command, then in all the rows the Employee Name
changes to 'Jhon' and the City name changes to 'Ahmedabad'. If it is used, only the
rows which satisfy the condition are updated.
4. DQL COMMAND: The purpose of DQL Command is to get data from one or more
tables based on the query passed to it.
i. SELECT: This statement is used to select data from a database and the data returned is
stored in a result table, called the result-set.
Example 2:
SELECT EmployeeID, EmployeeName
FROM Employee_Info;
The ‘SELECT with DISTINCT’ Statement: This statement is used to display only
distinct (unique) values. It means that duplicate values are not displayed.
The ‘ORDER BY’ Statement: The ‘ORDER BY’ statement is used to sort the required
results in ascending or descending order. The results are sorted in ascending order by
default. Yet, if you wish to get the required results in descending order, you have to use
the DESC keyword.
Example
AGGREGATE FUNCTIONS:
The SQL allows summarizing data through a set of functions called aggregate
functions. The commonly used aggregate functions are: MIN( ), MAX( ), COUNT( ),
SUM( ), AVG( ).
MIN() Function: The MIN function returns the smallest value of the selected column in
a table.
MAX( ) Function: The MAX function returns the largest value of the selected column in
a table.
COUNT( ) Function: The COUNT function returns the number of rows which match the
specified criteria.
SUM( ) Function: The SUM function returns the total sum of a numeric column that you
choose.
AVG( ) Function: The AVG function returns the average value of a numeric column that
you choose.
The ‘GROUP BY’ Statement: This ‘GROUP BY’ statement is used with the aggregate
functions to group the result-set by one or more columns.
Example:
The ‘HAVING’ Clause: The ‘HAVING’ clause is used in SQL along with the GROUP BY
clause. It is similar to the WHERE clause, but it filters groups after aggregation, whereas WHERE
filters individual rows before grouping.
Example
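A minimal sketch of GROUP BY with HAVING, run through Python's built-in sqlite3 module. The rows are those of the Employee_Info table shown in the JOINS section; the query itself is an illustration and not one of the examples from the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Employee_Info (
    EmployeeID TEXT PRIMARY KEY, EmployeeName TEXT,
    PhoneNumber TEXT, City TEXT, Country TEXT)""")
conn.executemany(
    "INSERT INTO Employee_Info VALUES (?, ?, ?, ?, ?)",
    [("01", "Shravya", "9898765612", "Mumbai", "India"),
     ("02", "Vijay", "9432156783", "Delhi", "India"),
     ("03", "Preeti", "9764234519", "Bangalore", "India"),
     ("04", "Vijay", "9966442211", "Hyderabad", "India"),
     ("05", "Manasa", "9543176246", "Kolkata", "India")])

# COUNT(*) is computed once per group; HAVING then keeps only the
# groups whose count exceeds 1 (names shared by several employees).
name_counts = conn.execute("""
    SELECT EmployeeName, COUNT(*) AS NameCount
    FROM Employee_Info
    GROUP BY EmployeeName
    HAVING COUNT(*) > 1""").fetchall()
print(name_counts)
```

A WHERE clause could not express this filter, because the count is not known until the rows have been grouped.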
Operators in SQL:
Arithmetic operators
Bitwise operators
Comparison operator
Compound operator
Logical operator
Arithmetic Operators:
Operator Description
% Modulus [A % B]
/ Division [A / B]
* Multiplication [A * B]
– Subtraction [A – B]
+ Addition [A + B]
Bitwise Operators:
Operator Description
^ Bitwise Exclusive OR (XOR) [A ^ B]
| Bitwise OR [A | B]
& Bitwise AND [A & B]
Comparison Operators:
Operator Description
<> Not Equal to [A <> B]
<= Less than or equal to [A <= B]
>= Greater than or equal to [A >= B]
< Less than [A < B]
> Greater than [A > B]
= Equal to [A = B]
Compound Operators:
Operator Description
|*= Bitwise OR equals [A |*= B]
^-= Bitwise Exclusive equals [A ^-= B]
&= Bitwise AND equals [A &= B]
%= Modulo equals [A %= B]
/= Divide equals [A /= B]
*= Multiply equals [A*= B]
-= Subtract equals [A-= B]
+= Add equals [A+= B]
Logical Operators: The Logical operators present in SQL are as follows: AND, OR, NOT,
BETWEEN, LIKE, IN, EXISTS, ALL, ANY.
AND Operator: This operator is used to filter records that rely on more than one condition. This
operator displays the records, which satisfy all the conditions separated by AND, and give the
output TRUE.
Syntax:
Example:
OR Operator: This operator displays all those records which satisfy any of the conditions
separated by OR and give the output TRUE.
Example:
NOT Operator: The NOT operator is used, when you want to display the records which do not
satisfy a condition.
Example:
NOTE: You can also combine the above three operators and write a query as follows:
SELECT * FROM Employee_Info
WHERE NOT Country='India' AND (City='Bangalore' OR City='Hyderabad');
BETWEEN Operator: The BETWEEN operator is used, when you want to select values within a
given range. Since this is an inclusive operator, both the starting and ending values are
considered.
Example:
LIKE Operator
The LIKE operator is used in a WHERE clause to search for a specified pattern in a column of a
table. There are mainly two wildcards that are used in conjunction with the LIKE operator:
% (matches zero or more characters) and _ (matches exactly one character).
Syntax:
SELECT ColumnName(s)
FROM TableName
WHERE ColumnName LIKE pattern;
Refer to the following table for the various patterns that you can mention with the LIKE
operator.
Example:
SELECT * FROM Employee_Info
WHERE EmployeeName LIKE 'S%';
IN Operator: This operator is used for multiple OR conditions. This allows you to specify
multiple values in a WHERE clause.
Example:
SELECT * FROM Employee_Info
WHERE City IN ('Mumbai', 'Bangalore', 'Hyderabad');
NOTE: You can also use IN while writing Nested Queries.
EXISTS Operator: The EXISTS operator is used to test if a record exists or not.
Example:
SELECT City
FROM Employee_Info
WHERE EXISTS (SELECT City
FROM Employee_Info
WHERE EmployeeId = 05 AND City = 'Kolkata');
ALL Operator: The ALL operator is used with a WHERE or HAVING clause and returns TRUE
if all of the subquery values meet the condition.
Example:
SELECT EmployeeName
FROM Employee_Info
WHERE EmployeeID = ALL ( SELECT EmployeeID
FROM Employee_Info
WHERE City = 'Hyderabad');
ANY Operator: Similar to the ALL operator, the ANY operator is also used with a WHERE or
HAVING clause and returns TRUE if any of the subquery values meet the condition.
Example:
SELECT EmployeeName
FROM Employee_Info
WHERE EmployeeID = ANY (SELECT EmployeeID
FROM Employee_Info
WHERE City = 'Hyderabad' OR City = 'Kolkata');
Aliases Statement: Aliases are used to give a column or table a temporary name, which
exists only for the duration of the query.
Example:
5. NESTED QUERIES
Nested queries are those queries which have an outer query and inner subquery. So,
basically, the subquery is a query which is nested within another query.
i. UNION: This operator is used to combine the result-set of two or more SELECT
statements.
Syntax: SELECT ColumnName(s) FROM Table1 WHERE condition
UNION
SELECT ColumnName(s) FROM Table2 WHERE condition;
ii. INTERSECT: This operator is used to combine two SELECT statements and return the
intersection of the data-sets of both the SELECT statements.
iii. EXCEPT: This operator returns those tuples that are returned by the first SELECT
operation, and are not returned by the second SELECT operation.
Note: UNION, INTERSECT and EXCEPT operations are possible only if the first SELECT
query and the second SELECT query produce the same number of columns, in the same order,
with compatible data types. Otherwise the query gives an error. The column names need not
match; the result takes its column names from the first SELECT.
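The three set operators can be compared side by side with a short sketch. It uses Python's built-in sqlite3 module and the employee and technology IDs from the tables in the JOINS section; the helper name run is hypothetical. The two SELECT lists use differently named columns (EmployeeID and EmpID), which SQLite accepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee_Info (EmployeeID TEXT, EmployeeName TEXT);
    CREATE TABLE Technologies (TechID INTEGER, EmpID TEXT, TechName TEXT);
    INSERT INTO Employee_Info VALUES
        ('01', 'Shravya'), ('02', 'Vijay'), ('03', 'Preeti'),
        ('04', 'Vijay'), ('05', 'Manasa');
    INSERT INTO Technologies VALUES
        (1, '01', 'DevOps'), (2, '03', 'Blockchain'),
        (3, '04', 'Python'), (4, '06', 'Java');
""")

def run(op):
    """Combine the two ID columns with the given set operator."""
    sql = (f"SELECT EmployeeID FROM Employee_Info {op} "
           "SELECT EmpID FROM Technologies ORDER BY 1")
    return [row[0] for row in conn.execute(sql)]

union_ids = run("UNION")          # IDs present in either table
intersect_ids = run("INTERSECT")  # IDs present in both tables
except_ids = run("EXCEPT")        # employees with no technology row
print(union_ids, intersect_ids, except_ids)
```

UNION removes duplicates, so each ID appears once even when it occurs in both tables.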
7. JOINS
JOINS are used to combine rows from two or more tables, based on a related column between
those tables. The following are the types of joins:
INNER JOIN: This join returns those records which have matching values in both the
tables.
FULL JOIN: This join returns all those records which either have a match in the left or
the right table.
LEFT JOIN: This join returns records from the left table, and also those records which
satisfy the condition from the right table.
RIGHT JOIN: This join returns records from the right table, and also those records
which satisfy the condition from the left table.
Let’s consider the below Technologies and the Employee_Info table, to understand the syntax
of joins.
Employee_Info
EmployeeID EmployeeName PhoneNumber City Country
01 Shravya 9898765612 Mumbai India
02 Vijay 9432156783 Delhi India
03 Preeti 9764234519 Bangalore India
04 Vijay 9966442211 Hyderabad India
05 Manasa 9543176246 Kolkata India
Technologies
TechID EmpID TechName ProjectStartDate
1 01 DevOps 04-01-2019
2 03 Blockchain 06-07-2019
3 04 Python 01-03-2019
4 06 Java 10-10-2019
INNER JOIN or EQUI JOIN: This is a simple JOIN in which the result is based on matched
data as per the equality condition specified in the SQL query. It is the most commonly used join.
NATURAL JOIN is a type of INNER JOIN that joins on the columns having the same name in both
tables; it gives the same result when the joining columns share a name.
Syntax
SELECT ColumnName(s)
FROM Table1
INNER JOIN Table2 ON Table1.ColumnName = Table2.ColumnName;
Example
FULL OUTER JOIN: The full outer join returns a result-set table with the matched data of
the two tables, followed by the remaining rows of both the left table and the right table, with the
missing values filled with NULL.
Syntax
SELECT ColumnName(s)
FROM Table1
FULL OUTER JOIN Table2 ON Table1.ColumnName = Table2.ColumnName;
Example
LEFT JOIN: The left outer join returns a result-set table with the matched data from the two
tables and then the remaining rows of the left table with null for the right table's columns.
Syntax:
SELECT ColumnName(s)
FROM Table1
LEFT JOIN Table2 ON Table1.ColumnName = Table2.ColumnName;
Example:
SELECT E.EmployeeID, E.EmployeeName, T.TechID
FROM Employee_Info E
LEFT JOIN Technologies T ON E.EmployeeID = T.EmpID;
RIGHT JOIN: The right outer join returns a result-set table with the matched data from the
two tables being joined, then the remaining rows of the right table and null for the remaining left
table's columns.
Syntax:
SELECT ColumnName(s)
FROM Table1
RIGHT JOIN Table2 ON Table1.ColumnName = Table2.ColumnName;
Example:
SELECT E.EmployeeID, E.EmployeeName, T.TechID
FROM Employee_Info E
RIGHT JOIN Technologies T ON E.EmployeeID = T.EmpID;
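The difference between an inner and an outer join is easiest to see on the sample data. The sketch below loads the Employee_Info and Technologies rows into an in-memory SQLite database through Python's sqlite3 module (phone numbers and dates are left out for brevity). Only INNER JOIN and LEFT JOIN are run, because RIGHT JOIN and FULL OUTER JOIN are supported only from SQLite 3.39 onwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee_Info (EmployeeID TEXT, EmployeeName TEXT, City TEXT);
    CREATE TABLE Technologies (TechID INTEGER, EmpID TEXT, TechName TEXT);
    INSERT INTO Employee_Info VALUES
        ('01', 'Shravya', 'Mumbai'), ('02', 'Vijay', 'Delhi'),
        ('03', 'Preeti', 'Bangalore'), ('04', 'Vijay', 'Hyderabad'),
        ('05', 'Manasa', 'Kolkata');
    INSERT INTO Technologies VALUES
        (1, '01', 'DevOps'), (2, '03', 'Blockchain'),
        (3, '04', 'Python'), (4, '06', 'Java');
""")

# INNER JOIN keeps only employees that have a matching technology row.
inner_rows = conn.execute("""
    SELECT E.EmployeeID, E.EmployeeName, T.TechName
    FROM Employee_Info E
    INNER JOIN Technologies T ON E.EmployeeID = T.EmpID
    ORDER BY E.EmployeeID""").fetchall()

# LEFT JOIN keeps every employee; TechID is NULL (None) where nothing matches.
left_rows = conn.execute("""
    SELECT E.EmployeeID, T.TechID
    FROM Employee_Info E
    LEFT JOIN Technologies T ON E.EmployeeID = T.EmpID
    ORDER BY E.EmployeeID""").fetchall()

print(inner_rows)
print(left_rows)
```

Employees 02 and 05 appear only in the LEFT JOIN result, and the Java row (EmpID 06) appears in neither, since it has no matching employee.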
8. TRIGGERS
A trigger is a stored program in the database that is invoked automatically whenever a specified event
occurs in the database. For example, a trigger can be invoked when a row is inserted into a
specified table or when certain table columns are being updated. So, a trigger can be invoked
either BEFORE or AFTER the data is changed by an INSERT, UPDATE or DELETE statement.
Syntax:
CREATE TRIGGER [TriggerName]
[BEFORE | AFTER]
{INSERT | UPDATE | DELETE}
on [TableName]
[FOR EACH ROW]
[TriggerBody]
Explanation of syntax:
EXAMPLE:
A trigger called ‘nb’ is created to alert the user when account details with a negative
balance value are inserted into the accounts table. Before the insert, the trigger is activated if the
condition is true. When a trigger is activated, the action part of the trigger is executed.
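A trigger of this kind might look as follows. This is a sketch in SQLite's trigger dialect, run through Python's sqlite3 module; the column names of the accounts table (acc_no, name, balance), the sample rows and the error message are assumptions, since the original example's code is not reproduced in these notes. Other database systems use a different trigger body syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (acc_no INTEGER PRIMARY KEY, name TEXT, balance REAL)")

# BEFORE INSERT trigger: the WHEN condition is checked for each new row,
# and RAISE(ABORT, ...) cancels the insert with an error message.
conn.execute("""
    CREATE TRIGGER nb
    BEFORE INSERT ON accounts
    FOR EACH ROW
    WHEN NEW.balance < 0
    BEGIN
        SELECT RAISE(ABORT, 'Negative balance is not allowed');
    END""")

conn.execute("INSERT INTO accounts VALUES (1, 'Ram', 500.0)")  # accepted

try:
    conn.execute("INSERT INTO accounts VALUES (2, 'Jhon', -50.0)")
    rejected, message = False, ""
except sqlite3.DatabaseError as err:
    rejected, message = True, str(err)

stored = conn.execute("SELECT acc_no FROM accounts").fetchall()
print(rejected, message, stored)
```

The row with the negative balance never reaches the table; only account 1 is stored.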
9. NORMALIZATION
Normalization is the process of minimizing the redundancy in a relation or set of
relations.
It is used to eliminate the Insertion, Update and Deletion Anomalies.
Normalization divides a larger table into smaller tables and links them using
relationships.
Normalization is done with the help of different normal forms.
The inventor of the relational model, Edgar Codd, proposed the theory of normalization with
the introduction of the First Normal Form, and he continued to extend the theory with the Second and
Third Normal Forms. Later he joined Raymond F. Boyce to develop the theory of Boyce-Codd
Normal Form. In the software industry, tables are usually normalized only up to third normal form and
sometimes up to Boyce-Codd Normal Form.
Redundancy means having multiple copies of the same data in the database. This problem arises
when a database is not normalized. Redundancy leads to the following problems.
Wastage of Memory: Disk space is wasted by storing the same data multiple times.
Storage cost increases: When multiple copies of the same data are stored, more disk
space is needed and the storage cost increases.
Update anomaly: When the address of a student is stored in several places, a change in the
address must be made in all those places. Changing the address in some places and leaving
the others unchanged leads to inconsistency.
Insertion Anomaly: The nature of a database may be such that it is not possible to add a
required piece of data unless another piece of unavailable data is also added. For
example, a library database cannot store the details of a new student until that student has
taken at least one book from the library.
Deletion Anomaly: When some data is deleted, other data is also deleted unintentionally.
For example, deleting the details of a book from a library database also deletes the details
of the students who have taken that book previously.
1NF Example:
The above table is not in 1NF because 501 and 502 have two values in the mobile column. If we
add a new column for an alternative mobile number to the above table, then for 503 the alternative
mobile number is NULL. Moreover, if a student has ‘n’ mobile numbers, then adding ‘n’ extra
columns is not practical. It is better to add extra rows. If we add an extra row for each of 501 and 502,
then the table looks like
But the above table violates the primary key constraint. Therefore, instead of adding either columns
or rows, the best solution is to split the table into two tables as shown below. With this split,
a student having any number of mobile numbers can also be stored.
HTNO  FIRST NAME  LAST NAME
501   Jhansi      Rani
502   Ajay        Kumar
503   Priya       Verma

HTNO  MOBILE
501   9999988888
501   7777799999
502   8888888881
502   7897897897
503   9898989898
2NF Example:
The above table is not in 2NF because there exist partial functional dependencies. HTNO is the
key attribute of the above table. If every non-key attribute depends on the whole key and on nothing
less, then we say it is fully functionally dependent on the key. In this table, {Name, DOB,
DeptNo, DeptName, Location} depends on HTNO. But {DeptName, Location} also depends
on DeptNo.
[Figure: dependency diagram. HTNO determines Name, DOB, DeptNo, DeptName and Location; DeptNo determines DeptName and Location.]
It is clear that DeptName and Location depend not only on HTNO but also on DeptNo.
So, there exists a partial functional dependency. (Strictly, a dependency of one non-key attribute on
another, such as DeptNo → DeptName, is a transitive dependency; a partial dependency in the textbook
sense requires a composite key.) This dependency can be removed by splitting the above table into
two tables as follows.
Transitive functional dependency means that we have the following relationships in the table: B
is functionally dependent on A (A→B), and C is functionally dependent on B (B→C). In this
case, C is transitively dependent on A via B (A→B and B→C together imply A→C).
3NF Example:
Consider the following book details table example:
BOOK_DETAILS
BookID GenreID GenreType Price
1 1 Gardening 250.00
2 2 Sports 149.00
3 1 Gardening 100.00
4 3 Travel 160.00
5 2 Sports 320.00
The above table is not in 3NF because there exists a transitive dependency. In the above table,
BookID determines GenreID { BookID → GenreID }
GenreID determines GenreType. { GenreID → GenreType }
BookID determines GenreType via GenreID. { BookID → GenreType }
It implies that a transitive functional dependency exists and the structure does not satisfy
third normal form. To bring this table into third normal form, we split the table into two as
follows:
BOOK_DETAILS
BookID GenreID Price
1 1 250.00
2 2 149.00
3 1 100.00
4 3 160.00
5 2 320.00

GENRE_DETAILS
GenreID GenreType
1 Gardening
2 Sports
3 Travel
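Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below tests the dependencies discussed above against the original BOOK_DETAILS rows; the function name fd_holds is hypothetical. A check on sample data can only refute a dependency, it cannot prove that the dependency holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """True if every pair of rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and
        # returns the stored one on later rows with the same key.
        if seen.setdefault(key, value) != value:
            return False
    return True

book_details = [
    {"BookID": 1, "GenreID": 1, "GenreType": "Gardening", "Price": 250.00},
    {"BookID": 2, "GenreID": 2, "GenreType": "Sports", "Price": 149.00},
    {"BookID": 3, "GenreID": 1, "GenreType": "Gardening", "Price": 100.00},
    {"BookID": 4, "GenreID": 3, "GenreType": "Travel", "Price": 160.00},
    {"BookID": 5, "GenreID": 2, "GenreType": "Sports", "Price": 320.00},
]

book_to_genre = fd_holds(book_details, ["BookID"], ["GenreID"])
genre_to_type = fd_holds(book_details, ["GenreID"], ["GenreType"])
genre_to_price = fd_holds(book_details, ["GenreID"], ["Price"])
print(book_to_genre, genre_to_type, genre_to_price)
```

The first two dependencies form the transitive chain BookID → GenreID → GenreType; the third fails because two Gardening books have different prices, which is why Price stays with BookID after the split.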
A relation (table) is said to be in BCNF if and only if it satisfies the following conditions:
it is in 3NF, and for every non-trivial functional dependency X → Y, X is a super key.
Example: Below we have a Patient table of a hospital. A patient can go to hospital many
times to take treatment. On a single day many patients can take treatment.
PatientID Name EmailID AdmittedDate Drug Quantity
101 Ram [email protected] 30/10/1998 A-10 10
102 Jhon [email protected] 30/10/1998 X-90 10
101 Ram [email protected] 10/06/2001 X-90 20
103 Sowmya [email protected] 05/03/2002 Y-30 15
102 Jhon [email protected] 05/03/2002 A-10 15
In the above table, {PatientID, AdmittedDate} acts as the primary key. But if we know the
EmailID value, we can find the PatientID value.
In other words, BCNF can also be described informally as requiring that candidate keys do not
overlap. If you consider the original table (before splitting), we get two candidate keys, {PatientID,
AdmittedDate} and {EmailID, AdmittedDate}.
As the candidate keys overlap, the table is not in BCNF. To bring it into BCNF,
we split it into two tables as shown above.
If all these three conditions are true for any relation (table), then it contains a multi-valued
dependency. The multi-valued dependency can be explained with an example. Let the relation R
contain three columns A, B, C and four rows s, t, u, v.
A B C
s a1 b1 c1
t a1 b1 c2
u a1 b2 c1
v a1 b2 c2
If s(A) = t(A) = u(A) = v(A)
s(B) = t(B) and u(B) = v(B)
s(C) = u(C) and t(C) = v(C), then there exists a multi-valued dependency.
Example: Consider the below college enrolment table with columns HTNO, Subject and
Hobby.
[Figure: 501 has subjects Java and C# and hobbies Cricket and Dancing; 502 has subjects Python and Android and hobbies Chess and Singing.]
As shown in the above figure, 501 opted for the subjects Java and C#, and the hobbies
of 501 are Cricket and Dancing. Similarly, 502 opted for the subjects Python and
Android, and the hobbies of 502 are Chess and Singing. This can be written as a table with
three columns as follows:
HTNO Subject Hobby
501 Java Cricket
501 Java Dancing
501 C# Cricket
501 C# Dancing
502 Python Chess
502 Python Singing
502 Android Chess
502 Android Singing
As there exists a multi-valued dependency, the above table is decomposed into two tables,
one with the columns (HTNO, Subject) and one with the columns (HTNO, Hobby).
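The decomposition on HTNO can be verified on the enrolment rows above: projecting onto (HTNO, Subject) and (HTNO, Hobby) and joining the two projections on HTNO should give back exactly the original eight rows. A minimal sketch in plain Python, with hypothetical variable names:

```python
enrolment = {
    ("501", "Java", "Cricket"), ("501", "Java", "Dancing"),
    ("501", "C#", "Cricket"), ("501", "C#", "Dancing"),
    ("502", "Python", "Chess"), ("502", "Python", "Singing"),
    ("502", "Android", "Chess"), ("502", "Android", "Singing"),
}

# Projections: each stores one independent multi-valued fact about a student.
subjects = {(htno, subject) for htno, subject, _ in enrolment}
hobbies = {(htno, hobby) for htno, _, hobby in enrolment}

# Natural join of the two projections on HTNO.
rejoined = {(h1, subject, hobby)
            for h1, subject in subjects
            for h2, hobby in hobbies
            if h1 == h2}

print(len(subjects), len(hobbies), rejoined == enrolment)
```

Each projection holds four rows instead of eight, and the join loses nothing and adds nothing, which is what the multi-valued dependencies HTNO →→ Subject and HTNO →→ Hobby guarantee.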
15. 5NF
A relation is said to be in 5-NF if and only if it satisfies the following conditions
A table is decomposed into multiple small tables to eliminate redundancy, and when
we re-join the decomposed tables, there should not be any loss of the original data, nor
should any new data be created. In simple words, joining two or more decomposed tables
should neither lose records nor create new records.
Example: Consider a table which contains a record of Subject, Professor and Semester in
three columns. The primary key is the combination of all three columns. No column by itself is
a candidate key or a super key.
In the table, DBMS is taught by Ravindar and Uma Rani in semester 4, and DS by Sindhusha and
Venu in semester 3. In this case, the combination of all these fields is required to identify valid data.
So, to bring the table into 5-NF, we decompose it into three relations.
R
A B C
1 2 1
2 5 3
3 3 3
is decomposed into R1 and R2:
R1
A B
1 2
2 5
3 3
R2
B C
2 1
5 3
3 3
Now, let us check whether this decomposition is lossless or not. For lossless decomposition,
we must have: R1 ⋈ R2 = R. Now, if we perform the natural join ( ⋈ ) of the sub relations
R1 and R2, we get
A B C
1 2 1
2 5 3
3 3 3
This relation is the same as the original relation R.
Thus, we conclude that the above decomposition is a lossless join decomposition. This is
because the resultant relation after joining the sub relations is the same as the original
relation. No extraneous tuples (rows) appear after joining the sub relations.
R
A B C
1 2 1
2 5 3
3 3 3
is decomposed into R1 and R2:
R1
A C
1 1
2 3
3 3
R2
B C
2 1
5 3
3 3
Now, let us check whether this decomposition is lossless or not. For lossless decomposition,
we must have: R1 ⋈ R2 = R. Now, if we perform the natural join ( ⋈ ) of the sub relations
R1 and R2, we get
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not the same as the original relation R.
Thus, we conclude that the above decomposition is not a lossless join decomposition. This is
because the resultant relation after joining the sub relations is not the same as the original
relation. Extraneous tuples (rows) appear after joining the sub relations.
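Both checks above can be reproduced with a small natural-join routine. This is a sketch in plain Python; the helper names project, natural_join and rejoin are hypothetical, and relations are modelled as sets of tuples.

```python
def project(rows, attrs, keep):
    """Project a set of tuples onto the attributes listed in keep."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(rows1, attrs1, rows2, attrs2):
    """Natural join of two relations; returns (attribute list, set of tuples)."""
    common = [a for a in attrs1 if a in attrs2]
    out_attrs = attrs1 + [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t1 in rows1:
        d1 = dict(zip(attrs1, t1))
        for t2 in rows2:
            d2 = dict(zip(attrs2, t2))
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                joined.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, joined

ATTRS = ["A", "B", "C"]
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}

def rejoin(split1, split2):
    """Decompose R into two projections and join them back, in A, B, C order."""
    out_attrs, joined = natural_join(project(R, ATTRS, split1), split1,
                                     project(R, ATTRS, split2), split2)
    return project(joined, out_attrs, ATTRS)

first = rejoin(["A", "B"], ["B", "C"])   # R1(A, B), R2(B, C)
second = rejoin(["A", "C"], ["B", "C"])  # R1(A, C), R2(B, C)
print(first == R)
print(sorted(second - R))
```

The first split returns R unchanged, while the second produces the two extraneous tuples (2, 3, 3) and (3, 5, 3), because the shared value C = 3 cannot tell rows 2 and 3 apart.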
Ravindar.M, Asso.Prof, CSE Dept, JITS-KNR
PROBLEMS
Consider a relation R is decomposed into two sub relations R1 and R2.
Condition-01: Union of both the sub relations must contain all the attributes that are present
in the original relation R.
R1 ∪ R2 = R
Condition-02: Intersection of both the sub relations must not be null. In other words, there
must be some common attribute which is present in both the sub relations.
R1 ∩ R2 ≠ ∅
Condition-03: Intersection of both the sub relations must be a super key of either R1 or R2 or
both.
************************************************************************
Solution: To determine whether the decomposition is lossless or lossy, we will check all the
conditions one by one. If any of the conditions fail, then the decomposition is lossy otherwise
lossless.
Condition-01: According to condition-01, union of both the sub relations must contain all the
attributes of relation R. So, we have:
R1 ( A , B ) ∪ R2 ( C , D ) = R(A,B,C,D)
Clearly, union of the sub relations contains all the attributes of relation R. Thus, condition-01
is satisfied.
Condition-02: According to condition-02, intersection of both the sub relations must not be
null. So, we have-
R1 ( A , B ) ∩ R2 ( C , D ) = ∅
Clearly, intersection of the sub relations is null. So, condition-02 fails. Thus, we conclude that
the decomposition is lossy.
************************************************************************
Problem-02: Consider a relation schema R ( A , B , C , D ) with the following functional
dependencies
A→B B→C C→D D→B
Strategy to Solve: When a given relation is decomposed into more than two sub relations,
then
Consider any one of the possible ways in which the relation might have been decomposed
into those sub relations.
First, divide the given relation into two sub relations.
Then, divide the sub relations according to the sub relations given in the question.
As a rule of thumb, remember:
Any relation can be decomposed into only two sub relations at a time.
Consider that the original relation R was decomposed into the given sub relations as shown:
Condition-01: According to condition-01, union of both the sub relations must contain all the
attributes of relation R. So, we have
R' ( A , B , C ) ∪ R3 ( B , D ) = R ( A , B , C , D )
Clearly, union of the sub relations contains all the attributes of relation R. Thus, condition-01
is satisfied.
Condition-02: According to condition-02, intersection of both the sub relations must not be
null. So, we have
R' ( A , B , C ) ∩ R3 ( B , D ) = B
Clearly, intersection of the sub relations is not null. Thus, condition-02 is satisfied.
Condition-03: According to condition-03, intersection of both the sub relations must be the
super key of one of the two sub relations or both. So, we have
R' ( A , B , C ) ∩ R3 ( B , D ) = B
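The closure of B under the given dependencies is { B }+ = { B , C , D }, which contains every
attribute of R3 ( B , D ), so B is a super key of R3.
Clearly, intersection of the sub relations is a super key of one of the sub relations.
So, condition-03 is satisfied. Thus, we conclude that the decomposition is lossless.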
Condition-01: According to condition-01, union of both the sub relations must contain all the
attributes of relation R'. So, we have
R1 ( A , B ) ∪ R2 ( B , C ) = R' ( A , B , C )
Clearly, union of the sub relations contains all the attributes of relation R'. Thus, condition-01
is satisfied.
Condition-02: According to condition-02, intersection of both the sub relations must not be
null.
So, we have
R1 ( A , B ) ∩ R2 ( B , C ) = B
Clearly, intersection of the sub relations is not null. Thus, condition-02 is satisfied.
Condition-03: According to condition-03, intersection of both the sub relations must be the
super key of one of the two sub relations or both. So, we have
R1 ( A , B ) ∩ R2 ( B , C ) = B
Now, the closure of attribute B is: { B }+ = { B , C , D }
So,
Attribute ‘B’ cannot determine attribute ‘A’ of sub relation R1.
Thus, it is not a super key of the sub relation R1.
Attribute ‘B’ can determine all the attributes of sub relation R2.
Thus, it is a super key of the sub relation R2.
Clearly, intersection of the sub relations is a super key of one of the sub relations. So,
condition-03 is satisfied. Thus, we conclude that the decomposition is lossless.
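The three conditions can be turned into a small checker for a split into two sub relations. The sketch below is plain Python; the function names closure and is_lossless are hypothetical, attributes are single letters, and each functional dependency is written as a (left side, right side) pair of strings. It is applied to the two splits of Problem-02 and to the split R1(A, B), R2(C, D) checked earlier.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r, r1, r2, fds):
    """Check the three conditions for splitting r into r1 and r2."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:                    # Condition-01: no attribute is lost
        return False
    common = r1 & r2
    if not common:                      # Condition-02: a common attribute exists
        return False
    reach = closure(common, fds)        # Condition-03: common part is a super key
    return r1 <= reach or r2 <= reach   # of at least one sub relation

fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
first_split = is_lossless("ABCD", "ABC", "BD", fds)   # R into R' and R3
second_split = is_lossless("ABC", "AB", "BC", fds)    # R' into R1 and R2
# No common attribute, so the dependencies do not matter for this check.
no_common = is_lossless("ABCD", "AB", "CD", [])
print(first_split, second_split, no_common)
```

Both steps of Problem-02 pass because { B }+ = { B , C , D } covers R3 and R2 respectively, while the split with no common attribute fails at Condition-02.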
The set of all those attributes which can be functionally determined from an attribute
set is called as a closure of that attribute set. Closure of attribute set {X} is denoted as {X}+.
Steps to Find Closure of an Attribute Set: Following steps are followed to find the
closure of an attribute set:
Step-01: Add the attributes contained in the attribute set for which closure is being
calculated to the result set.
Step-02: Recursively add the attributes to the result set which can be functionally determined
from the attributes already contained in the result set.
Solution:
Closure of {A, B}:
{ AB }+ = { A , B }
= { A , B , C , D } ( Using AB → CD )
= { A , B , C , D , G } ( Using C → G )
Thus, { AB }+ = { A , B , C , D , G }
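The two steps for finding a closure translate directly into a loop that keeps adding attributes until nothing changes. In this sketch the function name closure is hypothetical, and only the two dependencies that the solution above actually uses (AB → CD and C → G) are included; the full dependency set of the original question is not reproduced here.

```python
def closure(attrs, fds):
    """Closure of an attribute set under fds, a list of (lhs, rhs) strings."""
    result = set(attrs)          # Step-01: start with the attributes themselves
    changed = True
    while changed:               # Step-02: repeat until no dependency adds anything
        changed = False
        for lhs, rhs in fds:
            # A dependency fires when its whole left side is already in the result.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "CD"), ("C", "G")]
ab_closure = closure("AB", fds)
a_closure = closure("A", fds)
print(sorted(ab_closure))
print(sorted(a_closure))
```

With these two dependencies alone, A by itself determines nothing new, because AB → CD needs both A and B on its left side.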
Super Key:
If the closure of an attribute set contains all the attributes of the relation,
then that attribute set is called a super key of that relation.
Thus, we can say, “The closure of a super key is the entire relation schema.”
Example: In the above example (Question 1),
The closure of attribute A is the entire relation schema.
Thus, attribute A is a super key for that relation.
Candidate Key:
If an attribute set is a super key and no proper subset of it is a super key (that is, no proper
subset has a closure containing all the attributes of the relation), then that attribute set is
called a candidate key of that relation.
Example: In the above example (Question 1),
No proper subset of { A } has a closure that contains all the attributes of the relation.
Thus, attribute A is also a candidate key for that relation.
Essential attributes are those attributes which are not present on RHS of any functional
dependency.
Essential attributes are always a part of every candidate key. This is because they
cannot be determined by other attributes.
The RHS of all the above functional dependencies contain only B, D and E. The
attributes which are not present on RHS of any functional dependency are A, C and F.
So, essential attributes are: A, C and F.
The attributes of the relation that are present in RHS are non-essential attributes. They
can be determined by using essential attributes.
Now, following two cases are possible-
Case-01: If all essential attributes together can determine all remaining non-essential
attributes, then
o The combination of essential attributes is the candidate key.
o It is the only possible candidate key.
Case-02: If all essential attributes together can not determine all remaining non-
essential attributes, then-
The set of essential attributes and some non-essential attributes will be the candidate
key(s).
In this case, multiple candidate keys are possible.
To find the candidate keys, we check different combinations of essential and non-
essential attributes.
Solution: We will find candidate keys of the given relation in the following steps-
Step-01: Determine all essential attributes of the given relation.
Step-02: Now, we will check if the essential attributes together can determine all remaining
non-essential attributes. To check, we find the closure of CE. So,
{ CE }+ = { C , E }
= { C , E , F } ( Using C → F )
= { A , C , E , F } ( Using E → A )
= { A , C , D , E , F } ( Using EC → D )
= { A , B , C , D , E , F } ( Using A → B )
We conclude that CE can determine all the attributes of the given relation. So, CE is the only
possible candidate key of the relation.
Every super key must contain both C and E, and each of the remaining attributes A, B, D and F
may be either present or absent. So, the number of super keys possible = 2 x 2 x 2 x 2 = 16.
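The result can be cross-checked by brute force over all attribute subsets. The sketch below assumes the relation R(A, B, C, D, E, F) with the four dependencies used in the closure computation above (C → F, E → A, EC → D, A → B); the function names are hypothetical. Enumeration is exponential in the number of attributes, so it suits only small textbook relations.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def super_keys(attrs, fds):
    """All attribute subsets whose closure is the whole relation."""
    found = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            if closure(combo, fds) == set(attrs):
                found.append(set(combo))
    return found

def candidate_keys(attrs, fds):
    """Super keys that contain no smaller super key."""
    keys = []
    for sk in super_keys(attrs, fds):   # visited in order of increasing size
        if not any(k < sk for k in keys):
            keys.append(sk)
    return keys

attrs = "ABCDEF"
fds = [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")]
all_super_keys = super_keys(attrs, fds)
keys = candidate_keys(attrs, fds)
print([sorted(k) for k in keys], len(all_super_keys))
```

The enumeration agrees with the counting argument: CE is the only candidate key, and exactly 16 subsets are super keys.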
Solution: We will find candidate keys of the given relation in the following steps-
Step-02:
We will check if the essential attributes together can determine all remaining non-
essential attributes.
To check, we find the closure of EFH.
So, we have-
{ EFH }+ = { E , F , H }
= { E , F , G , H } ( Using EF → G )
= { E , F , G , H , I , J } ( Using F → IJ )
= { E , F , G , H , I , J , K , L } ( Using EH → KL )
= { E , F , G , H , I , J , K , L , M } ( Using K → M )
= { E , F , G , H , I , J , K , L , M , N } ( Using L → N )
We conclude that EFH can determine all the attributes of the given relation. So, EFH is the
only possible candidate key of the relation.
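The essential-attribute shortcut used in this solution can be sketched as follows. The relation is assumed to have the attributes E to N and the five dependencies applied in the closure steps above; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) attribute strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def essential_attributes(attrs, fds):
    """Attributes that never appear on the right side of any dependency."""
    rhs_attrs = set()
    for _, rhs in fds:
        rhs_attrs |= set(rhs)
    return set(attrs) - rhs_attrs

attrs = "EFGHIJKLMN"
fds = [("EF", "G"), ("F", "IJ"), ("EH", "KL"), ("K", "M"), ("L", "N")]

essential = essential_attributes(attrs, fds)
# Case-01 applies when the essential attributes alone determine everything.
is_only_key = closure(essential, fds) == set(attrs)
print(sorted(essential), is_only_key)
```

Since the essential attributes E, F and H already determine the whole relation, Case-01 applies and no combinations with non-essential attributes need to be tried.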