DBMS Unit3
SQL: QUERIES, CONSTRAINTS, TRIGGERS: form of basic SQL query, UNION, INTERSECT,
and EXCEPT, Nested Queries, aggregation operators, NULL values, complex integrity constraints
in SQL, triggers and active databases.
The basic form of an SQL query:
SELECT [DISTINCT] {* | column_name1, column_name2, ...} FROM table_name
[WHERE condition] [GROUP BY column_list] [HAVING condition] [ORDER BY column_list];
• SELECT specifies which columns are to appear in the output; DISTINCT eliminates duplicate rows.
• FROM specifies the tables to be used.
• WHERE filters the rows according to the condition. The WHERE condition is a boolean
combination (using AND, OR, and NOT) of conditions of the form expression op expression,
where op is one of the comparison operators (<, <=, =, <>, >=, >).
• GROUP BY forms groups of rows with the same column value.
• HAVING filters the groups.
• ORDER BY sorts the output.
Set Operations:
SQL set operations combine the results of two or more SQL SELECT statements.
• Union
• Union All
• Except (minus)
• Intersect
Union
• The SQL UNION operation combines the results of two or more SQL SELECT queries.
• Both queries must return the same number of columns, with compatible data types in
corresponding positions.
• UNION eliminates duplicate rows from its result set; UNION ALL keeps them.
• Syntax :
SELECT column_name FROM table1
UNION
SELECT column_name FROM table2;
• Example:
SELECT * FROM First_Table
UNION
SELECT * FROM Second_Table;
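The four set operations listed above can be compared side by side. The following is a minimal sketch that runs them in an in-memory SQLite database from Python; the columns and rows of First_Table and Second_Table are hypothetical sample data, not the tables from the original example.

```python
import sqlite3

# Hypothetical sample data; the column names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE First_Table  (id INTEGER, name TEXT);
    CREATE TABLE Second_Table (id INTEGER, name TEXT);
    INSERT INTO First_Table  VALUES (1, 'Jack'), (2, 'Harry'), (3, 'Jackson');
    INSERT INTO Second_Table VALUES (3, 'Jackson'), (4, 'Stephan'), (5, 'David');
""")

def run(op):
    """Combine the two SELECTs with the given set operator, sorted by id."""
    sql = f"SELECT * FROM First_Table {op} SELECT * FROM Second_Table ORDER BY id"
    return conn.execute(sql).fetchall()

union_rows     = run("UNION")      # duplicates removed: 5 rows
union_all_rows = run("UNION ALL")  # duplicates kept: 6 rows
intersect_rows = run("INTERSECT")  # rows present in both tables
except_rows    = run("EXCEPT")     # rows only in First_Table

print(union_rows, union_all_rows, intersect_rows, except_rows, sep="\n")
```

Row (3, 'Jackson') appears in both tables, so it is the only row that distinguishes UNION from UNION ALL here. Oracle spells EXCEPT as MINUS.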
SQL Queries/Commands
• SQL commands are instructions used to communicate with the database and to perform specific
tasks, functions, and queries on data.
• SQL can perform tasks such as creating a table, adding data to tables, dropping a table, modifying
a table, and setting permissions for users.
• There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.
Data Definition Language (DDL)
• DDL changes the structure of a table: creating a table, deleting a table, altering a table, etc.
• In systems such as Oracle, DDL commands are auto-committed, which means their changes are
saved permanently in the database.
• Here are some commands that come under DDL:
o CREATE - It is used to create a new table in the database
Syntax: CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);
Ex:
CREATE TABLE EMPLOYEE(Name VARCHAR2(20), Email VARCHAR2(100), DOB DATE);
o ALTER - It is used to alter the structure of the database. The change could be either to
modify the characteristics of an existing attribute or to add a new attribute.
Syntax:
To add a new column in the table
ALTER TABLE table_name ADD column_name COLUMN-definition;
To modify existing column in the table:
ALTER TABLE table_name MODIFY(column_definitions....);
Ex:
ALTER TABLE STU_DETAILS ADD(ADDRESS VARCHAR2(20));
ALTER TABLE STU_DETAILS MODIFY (NAME VARCHAR2(20));
o DROP - It is used to delete both the structure and the records stored in the table.
Syntax: DROP TABLE table_name;
Ex: DROP TABLE EMPLOYEE;
o TRUNCATE - It is used to delete all the rows from the table and free the space
containing the table.
Syntax: TRUNCATE TABLE table_name;
Ex: TRUNCATE TABLE EMPLOYEE;
Data Manipulation Language
• DML commands are used to modify the data in the database. They are responsible for all changes
to the stored data.
• DML commands are not auto-committed, which means their changes are not permanent until they
are committed. They can be rolled back.
• Here are some commands that come under DML:
o INSERT - The INSERT statement is a SQL query. It is used to insert data into the row of
a table.
Syntax: INSERT INTO TABLE_NAME (col1, col2, col3,.... col N) VALUES (value1,
value2, value3, .... valueN);
Or
INSERT INTO TABLE_NAME VALUES (value1, value2, value3, .... valueN);
Ex: INSERT INTO javatpoint (Author, Subject) VALUES ('Sonoo', 'DBMS');
o UPDATE - This command is used to update or modify the value of a column in the table.
Syntax: UPDATE table_name SET column_name1 = value1, ..., column_nameN = valueN
[WHERE condition];
For example: UPDATE students SET User_Name = 'Sonoo' WHERE Student_Id = '3';
o DELETE - It is used to remove one or more rows from a table.
Syntax: DELETE FROM table_name [WHERE condition];
For example: DELETE FROM javatpoint WHERE Author = 'Sonoo';
Data Control Language
• DCL commands are used to grant and take back authority from any database user.
• Here are some commands that come under DCL:
o Grant - It is used to give user access privileges to a database.
Ex: GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;
o Revoke - It is used to take back permissions from the user.
Ex: REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2;
Transaction Control Language
• TCL commands are used only with DML commands such as INSERT, DELETE and UPDATE.
• DDL operations are committed automatically in the database, which is why TCL commands cannot
be used while creating or dropping tables.
• Here are some commands that come under TCL:
o COMMIT - Commit command is used to save all the transactions to the database.
Syntax: COMMIT;
Ex: DELETE FROM CUSTOMERS WHERE AGE = 25;
COMMIT;
o ROLLBACK – Rollback command is used to undo transactions that have not already been
saved to the database.
Syntax: ROLLBACK;
Ex: DELETE FROM CUSTOMERS WHERE AGE = 25; ROLLBACK;
o SAVEPOINT – It is used to roll the transaction back to a certain point without rolling back
the entire transaction.
Syntax: SAVEPOINT SAVEPOINT_NAME;
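How COMMIT, ROLLBACK and SAVEPOINT interact is easiest to see in one transaction. The sketch below uses Python's sqlite3 module with an in-memory database; the CUSTOMERS columns, the sample rows and the savepoint name SP1 are assumptions made for illustration.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / SAVEPOINT / ROLLBACK / COMMIT statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER, NAME TEXT, AGE INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO CUSTOMERS VALUES (1, 'Asha', 25)")
conn.execute("INSERT INTO CUSTOMERS VALUES (2, 'Ravi', 30)")
conn.execute("SAVEPOINT SP1")                         # mark a point to return to
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")  # removes Asha for now
conn.execute("ROLLBACK TO SP1")                       # undo only the DELETE
conn.execute("COMMIT")                                # make both INSERTs permanent

remaining = conn.execute(
    "SELECT ID, NAME, AGE FROM CUSTOMERS ORDER BY ID"
).fetchall()
print(remaining)
```

Rolling back to the savepoint discards the DELETE but keeps the two INSERTs that came before it, so both rows are still present after the COMMIT.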
Data Query Language
• DQL is used to fetch the data from the database.
• It uses only one command:
o SELECT - The column list corresponds to the projection operation of relational algebra, and the
WHERE clause to the selection operation. It is used to retrieve the attributes of the rows that
satisfy the condition described by the WHERE clause.
Syntax: SELECT expressions FROM TABLES WHERE conditions;
Ex: SELECT emp_name FROM employee WHERE age > 20;
https://www.javatpoint.com/dbms-sql-command
NESTED QUERIES
A nested query is a query that has another query embedded within it. The embedded query is called a
subquery.
Inner or sub query returns a value which is used by the outer query.
A subquery typically appears within the WHERE clause of a query. It can sometimes appear in the
FROM clause or HAVING clause.
Types of subqueries:-
• Single row subquery - The inner select statement returns only one row.
It is used with single-row operators (>, =, <, <=, >=, <>).
Ex: - select ename, sal, job from emp where sal > (select sal from emp where empno = 7566);
• Multiple row subquery - A subquery that returns more than one row is called a multiple row
subquery. In this case multiple-row operators are used.
o IN: equal to any member of the list.
o ANY: compares the value to each value returned by the subquery.
< any means less than the maximum value
> any means greater than the minimum value
Ex: - select empno, ename, job from emp where sal < any (select sal from emp where job = 'clerk');
• Multiple column subquery - The subquery returns multiple columns.
Ex: - select ename, deptno from emp where (empno, deptno) in (select empno, deptno from emp
where sal > 1200);
• Inline subquery - The subquery appears in the select list or in the from clause.
Ex: - select ename, sal, deptno from (select ename, sal, deptno, mgr, hiredate from emp);
• Correlated subquery - Values from the outer select take part in the condition of the inner
select.
Ex: - select deptno, ename, sal from emp x where sal > (select avg(sal) from emp
where x.deptno = deptno) order by deptno;
Here the outer query supplies the deptno of each candidate row to the inner query; the inner
query is then evaluated for that row and returns its result to the outer query.
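The correlated subquery can be tried on a small table. This is a minimal sketch using Python's sqlite3 module; the emp rows below are hypothetical and only the columns used by the query are created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical employees: department 10 averages 2000, department 20 averages 3000.
    CREATE TABLE emp (empno INTEGER, ename TEXT, deptno INTEGER, sal INTEGER);
    INSERT INTO emp VALUES
        (1, 'A', 10, 1000), (2, 'B', 10, 3000),
        (3, 'C', 20, 2000), (4, 'D', 20, 2000), (5, 'E', 20, 5000);
""")

# For every outer row x, the inner query recomputes the average salary
# of x's own department and the outer WHERE compares against it.
rows = conn.execute("""
    SELECT deptno, ename, sal
    FROM emp x
    WHERE sal > (SELECT AVG(sal) FROM emp WHERE deptno = x.deptno)
    ORDER BY deptno
""").fetchall()
print(rows)
```

Only the employees who earn more than the average of their own department are returned, one from each department in this data.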
Ex:
Student Table
create table student(id number(10), name varchar2(20), classID number(10), marks varchar2(20));
Insert into student values(1,'pinky',3,2.4);
Insert into student values(2,'bob',3,1.44);
Insert into student values(3,'Jam',1,3.24);
Insert into student values(4,'lucky',2,2.67);
Insert into student values(5,'ram',2,4.56);
select * from student;
Id Name classID Marks
1 Pinky 3 2.4
2 Bob 3 1.44
3 Jam 1 3.24
4 Lucky 2 2.67
5 Ram 2 4.56
Teacher Table
Create table teacher(id number(10), name varchar(20), subject varchar2(10), classID number(10), salary
number(30));
Insert into teacher values(1,'bhanu','computer',3,5000);
Insert into teacher values(2,'rekha','science',1,5000);
Insert into teacher values(3,'siri','social',NULL,4500);
Insert into teacher values(4,'kittu','maths',2,5500);
select * from teacher;
Id Name Subject classID Salary
1 Bhanu Computer 3 5000
2 Rekha Science 1 5000
3 Siri Social NULL 4500
4 Kittu Maths 2 5500
Class Table (partial)
2 9 3 40
3 10 1 38
Examples:
1. Select AVG(noofstudents) from class where teacherID IN(
Select id from teacher Where subject='science' OR subject='maths');
Output - 20.0
2. SELECT * FROM student WHERE classID = (
SELECT id FROM class WHERE noofstudents = (
SELECT MAX(noofstudents) FROM class));
Output - 4|lucky |2|2.67
5|ram |2|4.56
https://www.tutorialspoint.com/explain-about-nested-queries-in-dbms
Aggregation Operators
An SQL aggregate function performs a calculation on multiple rows of a single column of a
table and returns a single value.
It is also used to summarize data.
Types of SQL Aggregation Function:
o COUNT - The COUNT function counts the number of rows in a database table. It works
on both numeric and non-numeric data types.
COUNT(*) returns the count of all the rows in the specified table, including duplicates and rows
that contain NULL. COUNT(expression) counts only the rows where the expression is not NULL.
Syntax - COUNT(*) or COUNT( [ALL|DISTINCT] expression )
Table: PRODUCT_MAST
PRODUCT COMPANY QTY RATE COST
Item1 Com1 2 10 20
Item2 Com2 3 25 75
Item3 Com1 2 30 60
Item4 Com3 5 10 50
Item5 Com2 2 20 40
Item6 Com1 3 25 75
Item7 Com1 5 30 150
Item8 Com1 3 10 30
Item9 Com2 2 25 50
Item10 Com3 4 30 120
Ex:
SELECT COUNT(*) FROM PRODUCT_MAST; Output: 10
SELECT COUNT(*) FROM PRODUCT_MAST WHERE RATE>=20; Output: 7
COUNT() with DISTINCT:
SELECT COUNT(DISTINCT COMPANY) FROM PRODUCT_MAST; Output: 3
COUNT() with GROUP BY:
SELECT COMPANY, COUNT(*) FROM PRODUCT_MAST GROUP BY COMPANY;
Output:
Com1 5
Com2 3
Com3 2
COUNT() with HAVING:
SELECT COMPANY, COUNT(*) FROM PRODUCT_MAST GROUP BY COMPANY
HAVING COUNT(*)>2;
Output:
Com1 5
Com2 3
o SUM - The SUM function calculates the sum of the values in the selected column. It works on
numeric fields only.
Syntax - SUM( [ALL|DISTINCT] expression )
Ex:
SELECT SUM(COST) FROM PRODUCT_MAST; Output: 670
SUM() with WHERE:
SELECT SUM(COST) FROM PRODUCT_MAST WHERE QTY>3; Output: 320
SUM() with GROUP BY:
SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST WHERE QTY>3 GROUP BY COMPANY;
Output:
Com1 150
Com3 170
SUM() with HAVING:
SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST GROUP BY COMPANY
HAVING SUM(COST)>=170;
Output:
Com1 335
Com3 170
o AVG - The AVG function is used to calculate the average value of the numeric type. AVG
function returns the average of all non-Null values.
Syntax - AVG( [ALL|DISTINCT] expression )
Ex:
SELECT AVG(COST) FROM PRODUCT_MAST; Output: 67.00
o MAX - MAX function is used to find the maximum value of a certain column. This function
determines the largest value of all selected values of a column.
Syntax - MAX( [ALL|DISTINCT] expression )
Ex:
SELECT MAX(RATE) FROM PRODUCT_MAST; Output: 30
o MIN - MIN function is used to find the minimum value of a certain column. This function
determines the smallest value of all selected values of a column.
Syntax - MIN( [ALL|DISTINCT] expression )
Ex:
SELECT MIN(RATE) FROM PRODUCT_MAST; Output: 10
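The aggregate results quoted in this section can be reproduced by loading PRODUCT_MAST into an in-memory SQLite database from Python. This is a sketch under two assumptions: the column types are plain integers and text, and Item6 belongs to Com1, as the grouped outputs (Com1 5, Com2 3, Com3 2) imply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST "
    "(PRODUCT TEXT, COMPANY TEXT, QTY INTEGER, RATE INTEGER, COST INTEGER)"
)
products = [
    ("Item1", "Com1", 2, 10, 20),  ("Item2", "Com2", 3, 25, 75),
    ("Item3", "Com1", 2, 30, 60),  ("Item4", "Com3", 5, 10, 50),
    ("Item5", "Com2", 2, 20, 40),  ("Item6", "Com1", 3, 25, 75),
    ("Item7", "Com1", 5, 30, 150), ("Item8", "Com1", 3, 10, 30),
    ("Item9", "Com2", 2, 25, 50),  ("Item10", "Com3", 4, 30, 120),
]
conn.executemany("INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)", products)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

total_rows = scalar("SELECT COUNT(*) FROM PRODUCT_MAST")
companies  = scalar("SELECT COUNT(DISTINCT COMPANY) FROM PRODUCT_MAST")
total_cost = scalar("SELECT SUM(COST) FROM PRODUCT_MAST")
avg_cost   = scalar("SELECT AVG(COST) FROM PRODUCT_MAST")
max_rate, min_rate = conn.execute(
    "SELECT MAX(RATE), MIN(RATE) FROM PRODUCT_MAST"
).fetchone()

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
qty_groups = conn.execute(
    "SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST "
    "WHERE QTY > 3 GROUP BY COMPANY ORDER BY COMPANY"
).fetchall()
big_groups = conn.execute(
    "SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST "
    "GROUP BY COMPANY HAVING SUM(COST) >= 170 ORDER BY COMPANY"
).fetchall()

print(total_rows, companies, total_cost, avg_cost, max_rate, min_rate)
print(qty_groups, big_groups)
```

With QTY > 3 only Item4, Item7 and Item10 survive the WHERE filter, so the grouped sums are 150 for Com1 and 170 for Com3.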
https://www.javatpoint.com/dbms-sql-aggregate-function
NULL values
A field with a NULL value is a field with no value.
If a field in a table is optional, it is possible to insert a new record or update a record without adding a
value to this field. Then, the field will be saved with a NULL value.
How to Test for NULL Values
It is not possible to test for NULL values with comparison operators, such as =, <, or <>, because
any comparison with NULL evaluates to unknown rather than true.
We have to use the IS NULL and IS NOT NULL operators instead.
o The IS NULL Operator - The IS NULL operator is used to test for empty values (NULL values).
Syntax: SELECT column_names FROM table_name WHERE column_name IS NULL;
Ex: SELECT CustomerName, ContactName, Address FROM Customers WHERE Address IS
NULL;
o The IS NOT NULL Operator - The IS NOT NULL operator is used to test for non-empty values
(NOT NULL values).
Syntax: SELECT column_names FROM table_name WHERE column_name IS NOT NULL;
Ex: SELECT CustomerName, ContactName, Address FROM Customers WHERE Address IS
NOT NULL;
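The difference between = NULL and IS NULL shows up as soon as a table contains a missing value. A minimal sketch with Python's sqlite3 module follows; the two Customers rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: the second customer has no address.
    CREATE TABLE Customers (CustomerName TEXT, ContactName TEXT, Address TEXT);
    INSERT INTO Customers VALUES ('Alpha Ltd', 'Maria', '12 Main St');
    INSERT INTO Customers VALUES ('Beta Inc',  'Tom',   NULL);
""")

def names(where):
    """Return the customer names that satisfy the given WHERE condition."""
    sql = f"SELECT CustomerName FROM Customers WHERE {where}"
    return [row[0] for row in conn.execute(sql)]

eq_null  = names("Address = NULL")       # comparison with NULL is never true
is_null  = names("Address IS NULL")
not_null = names("Address IS NOT NULL")

# COUNT(*) counts every row; COUNT(Address) skips rows where Address is NULL.
counts = conn.execute("SELECT COUNT(*), COUNT(Address) FROM Customers").fetchone()
print(eq_null, is_null, not_null, counts)
```

The Address = NULL query returns no rows even though one address is missing, which is why IS NULL is required.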
https://www.w3schools.com/sql/sql_null_values.asp
o Key constraints - A key is an attribute (or set of attributes) used to identify an entity uniquely
within its entity set.
An entity set can have multiple keys, of which one is chosen as the primary key. A primary
key must contain unique values and cannot contain NULL in the relational table.
https://www.javatpoint.com/dbms-integrity-constraints
https://www.geeksforgeeks.org/sql-constraints/
Triggers
A trigger in Structured Query Language is a set of procedural statements that are executed
automatically in response to certain events on a particular table in the database. Triggers
are used to protect data integrity in the database.
In SQL, this concept is the same as a trigger in real life. For example, when we pull the trigger of a gun,
the bullet is fired.
In Structured Query Language, triggers are invoked either before or after the following events:
INSERT Event: This event occurs when a new row is entered in the table.
UPDATE Event: This event occurs when an existing record is changed or modified in the
table.
DELETE Event: This event occurs when an existing record is removed from the table.
Types of Triggers in SQL
AFTER INSERT Trigger - This trigger is invoked after the insertion of data in the table.
AFTER UPDATE Trigger - This trigger is invoked in SQL after the modification of the data in
the table.
AFTER DELETE Trigger - This trigger is invoked after deleting the data from the table.
BEFORE INSERT Trigger - This trigger is invoked before inserting the record in the table.
BEFORE UPDATE Trigger - This trigger is invoked before updating the record in the table.
BEFORE DELETE Trigger - This trigger is invoked before deleting the record from the table.
Syntax of Trigger in SQL
CREATE TRIGGER Trigger_Name
[ BEFORE | AFTER ] [ Insert | Update | Delete]
ON [Table_Name]
[ FOR EACH ROW | FOR EACH STATEMENT ]
AS
Set of SQL Statement
Example:
Student_Trigger table
CREATE TABLE Student_Trigger
(
Student_RollNo INT NOT NULL PRIMARY KEY,
Student_FirstName Varchar (100),
Student_EnglishMarks INT,
Student_PhysicsMarks INT,
Student_ChemistryMarks INT,
Student_MathsMarks INT,
Student_TotalMarks INT,
Student_Percentage INT );
The following query fires a trigger before the insertion of the student record in the table:
CREATE TRIGGER Student_Table_Marks
BEFORE INSERT
ON
Student_Trigger
FOR EACH ROW
SET new.Student_TotalMarks = new.Student_EnglishMarks + new.Student_PhysicsMarks +
new.Student_ChemistryMarks + new.Student_MathsMarks,
new.Student_Percentage = ( new.Student_TotalMarks / 400) * 100;
The following query inserts the record into Student_Trigger table:
INSERT INTO Student_Trigger (Student_RollNo, Student_FirstName, Student_EnglishMarks,
Student_PhysicsMarks, Student_ChemistryMarks, Student_MathsMarks, Student_TotalMarks,
Student_Percentage) VALUES ( 201, 'Surya', 88, 75, 69, 92, 0, 0);
To check the output of the above INSERT statement, you have to type the following SELECT
statement:
SELECT * FROM Student_Trigger;
Student_RollNo Student_FirstName Student_EnglishMarks Student_PhysicsMarks Student_ChemistryMarks Student_MathsMarks Student_TotalMarks Student_Percentage
201 Surya 88 75 69 92 324 81
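The same behaviour can be tried without a MySQL server. The sketch below recreates the example in SQLite through Python's sqlite3 module. It is an adaptation, not the trigger from these notes: SQLite cannot assign to NEW columns in a BEFORE trigger, so an AFTER INSERT trigger updates the freshly inserted row instead, and the Student_Percentage column is assumed to be an integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student_Trigger (
        Student_RollNo         INTEGER NOT NULL PRIMARY KEY,
        Student_FirstName      TEXT,
        Student_EnglishMarks   INTEGER,
        Student_PhysicsMarks   INTEGER,
        Student_ChemistryMarks INTEGER,
        Student_MathsMarks     INTEGER,
        Student_TotalMarks     INTEGER,
        Student_Percentage     INTEGER   -- assumed type
    );

    -- SQLite cannot SET NEW.column, so this AFTER INSERT trigger
    -- updates the row that was just inserted.
    CREATE TRIGGER Student_Table_Marks
    AFTER INSERT ON Student_Trigger
    FOR EACH ROW
    BEGIN
        UPDATE Student_Trigger
        SET Student_TotalMarks = NEW.Student_EnglishMarks + NEW.Student_PhysicsMarks
                               + NEW.Student_ChemistryMarks + NEW.Student_MathsMarks,
            Student_Percentage = (NEW.Student_EnglishMarks + NEW.Student_PhysicsMarks
                               + NEW.Student_ChemistryMarks + NEW.Student_MathsMarks)
                               * 100 / 400
        WHERE Student_RollNo = NEW.Student_RollNo;
    END;

    INSERT INTO Student_Trigger VALUES (201, 'Surya', 88, 75, 69, 92, 0, 0);
""")

row = conn.execute(
    "SELECT Student_TotalMarks, Student_Percentage "
    "FROM Student_Trigger WHERE Student_RollNo = 201"
).fetchone()
print(row)
```

The INSERT supplies 0 for both computed columns; the trigger replaces them with the total 324 and the percentage 81.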
Advantages of Triggers in SQL
Triggers provide an alternate way of maintaining data and referential integrity in the tables.
Triggers help in executing scheduled tasks because they are called automatically.
They catch errors in the database layer of various businesses.
They allow database users to validate values before inserting and updating.
Disadvantages of Triggers in SQL
They are not compiled.
Errors in triggers are difficult to find and debug.
Complex code in a trigger makes the application run slower.
Triggers increase the load on the database system.
https://www.javatpoint.com/trigger-in-sql
Active Databases
An active database is a database that has a set of associated triggers. Such databases are difficult to
maintain because of the complexity of understanding the combined effect of the triggers. Before
executing a statement that modifies the database, the DBMS checks whether any trigger is activated
by that statement. If a trigger is activated, the DBMS evaluates its condition part and executes the
action part only if the condition evaluates to true. A single statement can activate more than one
trigger; in that case the DBMS processes the triggers in some arbitrary order. Executing the action part
of a trigger may in turn activate other triggers, or the same trigger that initiated the action. A trigger
that activates itself is called a ‘recursive trigger’. The DBMS executes such chains of triggers in some
pre-defined manner, but they make the behaviour of the database harder to understand.
Schema refinement
Schema refinement is just a fancy term for saying polishing tables. It is the last step before
considering physical design/tuning with typical workloads:
1) Requirement analysis : user needs
2) Conceptual design : high-level description, often using E/R diagrams
3) Logical design : from graphs to tables (relational schema)
4) Schema refinement : checking tables for redundancies and anomalies
Let’s see an example of redundancies and anomalies. Consider the following table where the
client’s name is the primary key.
The table is presenting information on employees (sales reps) and their clients.
If we want to insert data, we notice that:
• each row requires an entry in the client field
• we can’t insert data for newly hired sales reps until they’ve been assigned to one or more
clients
• if sales reps are in training, even if they’ve already been hired, they can’t be added to the
database because they need an assigned client… unless “dummy” clients are created.
If we want to update data, we notice that:
• the sales rep’s name is repeated for each client.
• what if, for a given client, we misspelled the name of the sales rep as Crosby instead of
Cosby… how can we fix that without affecting all the sales reps genuinely called Crosby?
If we want to delete data, what if Mary doesn’t have a client anymore because she’s taking a year
off? We are forced to either
• create a dummy client
• incorrectly show her with a client she no longer handles
• delete Mary’s record (even though she’s still an employee)
Notice that we cannot use “null” as a client, since primary key fields cannot store null.
When we deal with schema refinement, we often notice that the main problem is
redundancy. In order to identify schemas with such problems, we’ll introduce the notion of
functional dependencies: a relationship that exists when one attribute uniquely determines
another attribute. A functional dependency is simply a new type of constraint between two
attributes.
Say that R is a relation with attributes X and Y. We say that there is a functional dependency X ->
Y when Y is functionally dependent on X (where X is the determinant set and Y is the dependent
attribute).
http://blog.dancrisan.com/intro-to-database-systems-schema-refinement-functional-dependencies
Insertion Anomaly: Insertion anomaly arises when you are trying to insert some data
into the database, but you are not able to insert it. Example: If you want to add the details
of the student in the above table, then you must know the details of the department;
otherwise, you will not be able to add the details because student details are dependent on
department details.
Deletion Anomaly: Deletion anomaly arises when you delete some data from the
database, but some unrelated data is also deleted; that is, there will be a loss of data due
to deletion anomaly. Example: If we want to delete the student detail, which has
student_id 2, we will also lose the unrelated data, i.e., department_id 102, from the above
table.
Updating Anomaly: An update anomaly arises when you update some data in the
database, but the data is partially updated, which causes data inconsistency. Example: If
we want to update the details of dept_head from Jaspreet Kaur to Ankit Goyal for
Dept_id 104, then we have to update it everywhere else; otherwise, the data will get
partially updated, which causes data inconsistency.
Advantages of data redundancy
Provides Data Security
Provides Data Reliability
Create Data Backup
Disadvantages of data redundancy
Data corruption
Wastage of storage
High cost
Ways to reduce data redundancy
Database Normalization: We can normalize the data using the normalization method. In
this method, the data is broken down into pieces, which means a large table is divided
into two or more small tables to remove redundancy. Normalization removes insert
anomaly, update anomaly, and delete anomaly.
Deleting Unused Data: It is important to remove redundant data from the database as it
generates data redundancy in the DBMS. It is a good practice to remove unwanted data to
reduce redundancy.
Master Data: The data administrator shares master data across multiple systems.
Although this does not remove data redundancy, it updates the redundant data whenever
the data is changed.
https://www.javatpoint.com/redundancy-in-dbms
Decompositions
Problems related to decomposition
When a relation in the relational model is not in an appropriate normal form, decomposition
of the relation is required.
In a database, decomposition breaks a table into multiple tables.
If a relation is not decomposed properly, it may lead to problems like loss of
information.
Decomposition is used to eliminate some of the problems of bad design like anomalies,
inconsistencies, and redundancy.
Types of decomposition
Lossless Decomposition - If no information is lost from the relation that is decomposed, then
the decomposition is lossless.
A lossless decomposition guarantees that joining the decomposed relations yields exactly the
relation that was decomposed.
A decomposition is said to be lossless if the natural join of all the decomposed relations gives the
original relation.
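The lossless-join condition can be checked mechanically: project the relation onto each part of the decomposition, natural-join the projections, and compare the result with the original. The following is a minimal Python sketch; the relation R and its rows are hypothetical sample data, and project and natural_join are helper names introduced here for illustration.

```python
def project(relation, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    return {frozenset((a, row[a]) for a in attrs) for row in relation}

def natural_join(r1, r2):
    """Natural join of two relations given as sets of frozenset((attr, value))."""
    joined = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                joined.add(frozenset({**d1, **d2}.items()))
    return joined

# Hypothetical relation R(roll_no, dept_name, dept_building).
R = [
    {"roll_no": 42, "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 46, "dept_name": "EC", "dept_building": "B2"},
    {"roll_no": 47, "dept_name": "ME", "dept_building": "B2"},
]
original = project(R, ["roll_no", "dept_name", "dept_building"])

# Split on dept_name, which determines dept_building: the join restores R.
good = natural_join(project(R, ["roll_no", "dept_name"]),
                    project(R, ["dept_name", "dept_building"]))

# Split on dept_building: EC and ME share B2, so the join produces
# spurious tuples such as (46, ME, B2) that were never in R.
bad = natural_join(project(R, ["roll_no", "dept_building"]),
                   project(R, ["dept_name", "dept_building"]))

lossless_good = good == original
lossless_bad = bad == original
print(lossless_good, lossless_bad, len(bad))
```

The second decomposition is lossy in the technical sense even though it yields more tuples than the original: the extra tuples mean the original information can no longer be recovered.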
Functional dependencies
In a relational database management system, a functional dependency specifies the
relationship between two sets of attributes where one set determines the value of the
other. It is denoted as X → Y, where the attribute set on the left side of the arrow, X, is called the
Determinant, and Y is called the Dependent.
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
45 xyz IT A3
46 mno EC B2
47 jkl ME B2
From the above table we can conclude some valid functional dependencies:
roll_no → { name, dept_name, dept_building }: Here, roll_no can determine the values of the fields
name, dept_name and dept_building, hence this is a valid functional dependency.
roll_no → dept_name: Since roll_no can determine the whole set {name, dept_name,
dept_building}, it can also determine its subset dept_name.
dept_name → dept_building: dept_name identifies the dept_building uniquely, since every row
with the same dept_name has the same dept_building.
More valid functional dependencies: roll_no → name, {roll_no, name} → {dept_name,
dept_building}, etc.
Here are some invalid functional dependencies:
name → dept_name Students with the same name can have different dept_name, hence this is
not a valid functional dependency.
dept_building → dept_name There can be multiple departments in the same building. Example,
in the above table departments ME and EC are in the same building B2, hence dept_building →
dept_name is an invalid functional dependency.
More invalid functional dependencies: name → roll_no, {name, dept_name} → roll_no,
dept_building → roll_no, etc.
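Whether a dependency holds in a given table instance can be tested by checking that no two rows agree on the determinant but differ on the dependent. Below is a minimal Python sketch over the six rows of the table above; fd_holds is a helper name introduced here. Note that a table instance can only refute a dependency: a dependency that holds in the sample rows is still a claim about all possible data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this table instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time a key is seen; a later row
        # with the same key but a different val violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

columns = ("roll_no", "name", "dept_name", "dept_building")
data = [
    (42, "abc", "CO", "A4"), (43, "pqr", "IT", "A3"), (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"), (46, "mno", "EC", "B2"), (47, "jkl", "ME", "B2"),
]
students = [dict(zip(columns, r)) for r in data]

valid_1 = fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"])
valid_2 = fd_holds(students, ["dept_name"], ["dept_building"])
invalid_1 = fd_holds(students, ["name"], ["dept_name"])           # xyz is in CO and IT
invalid_2 = fd_holds(students, ["dept_building"], ["dept_name"])  # B2 hosts EC and ME
print(valid_1, valid_2, invalid_1, invalid_2)
```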
Armstrong’s axioms/properties of functional dependencies:
Reflexivity: If Y is a subset of X, then X→Y holds by reflexivity rule
Example, {roll_no, name} → name is valid.
Augmentation: If X → Y is a valid dependency, then XZ → YZ is also valid by the
augmentation rule.
Example, {roll_no, name} → dept_building is valid, hence {roll_no, name, dept_name} →
{dept_building, dept_name} is also valid.
Transitivity: If X → Y and Y → Z are both valid dependencies, then X→Z is also valid by the
Transitivity rule.
Example, roll_no → dept_name & dept_name → dept_building, then roll_no → dept_building is
also valid.
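Applying the three axioms repeatedly gives the closure of an attribute set: every attribute it determines. The sketch below is the standard fixed-point computation written in Python; the function name closure and the encoding of dependencies as pairs of sets are choices made for this illustration, and the two dependencies are the ones used in the examples above.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)  # reflexivity: every attribute determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow
            # (augmentation plus transitivity).
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [
    ({"roll_no"}, {"name", "dept_name"}),
    ({"dept_name"}, {"dept_building"}),
]
roll_closure = closure({"roll_no"}, fds)
name_closure = closure({"name"}, fds)
print(sorted(roll_closure), sorted(name_closure))
```

The closure of roll_no contains dept_building, which is the transitivity example above; the closure of name contains only name, so name determines nothing else.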
Types of Functional Dependencies in DBMS
Trivial functional dependency - In Trivial Functional Dependency, a dependent is always a
subset of the determinant. i.e. If X → Y and Y is the subset of X, then it is called trivial
functional dependency
Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Here, {roll_no, name} → name is a trivial functional dependency, since the dependent name is a
subset of determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an example of
trivial functional dependency.
Non-Trivial functional dependency - In Non-trivial functional dependency, the dependent is
strictly not a subset of the determinant. i.e. If X → Y and Y is not a subset of X, then it is called
Non-trivial functional dependency.
Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a
subset of determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional
dependency, since age is not a subset of {roll_no, name}
Multivalued functional dependency - In Multivalued functional dependency, entities of the
dependent set are not dependent on each other. i.e. If a → {b, c} and there exists no functional
dependency between b and c, then it is called a multivalued functional dependency.
For example,
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18
45 abc 19
Here, roll_no → {name, age} is a multivalued functional dependency, since the dependents name
and age are not dependent on each other (i.e. neither name → age nor age → name holds).
Transitive functional dependency - In transitive functional dependency, dependent is indirectly
dependent on determinant. i.e. If a → b & b → c, then according to axiom of transitivity, a → c.
This is a transitive functional dependency.
For example,
enrol_no name dept building_no
42 abc CO 4
43 pqr EC 2
44 xyz IT 1
45 abc EC 2
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of transitivity,
enrol_no → building_no is a valid functional dependency. This is an indirect functional
dependency, hence called Transitive functional dependency.
Fully Functional Dependency - In a full functional dependency, an attribute or set of attributes
depends on a set of attributes X and on no proper subset of X. For example, if a relation R has
attributes X, Y, Z and {X, Y} → Z holds while neither X → Z nor Y → Z holds, then Z is fully
functionally dependent on {X, Y}.
Partial Functional Dependency - In a partial functional dependency, a non-key attribute depends
on only part of a composite key rather than on the whole key. If a relation R has attributes X, Y, Z,
where {X, Y} is the composite key and Z is a non-key attribute, then X → Z is a partial
functional dependency in an RDBMS.
Advantages
Data Normalization
Query Optimization
Consistency of Data
Data Quality Improvement
https://www.google.com/amp/s/www.geeksforgeeks.org/types-of-functional-dependencies-in-dbms/amp/
Normalization
Types of normal forms
A large database defined as a single relation may result in data duplication. This repetition of
data may result in:
Very large relations.
Difficulty in maintaining and updating data, as it would involve searching many records in the
relation.
Wastage and poor utilization of disk space and resources.
An increased likelihood of errors and inconsistencies.
To handle these problems, we should analyze and decompose the relations with redundant data
into smaller, simpler, and well-structured relations that satisfy desirable properties.
Normalization is a process of decomposing the relations into relations with fewer attributes.
Normalization
Normalization is the process of organizing the data in the database.
Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
Normalization divides the larger table into smaller and links them using relationships.
The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is removing these anomalies. Failure to eliminate
anomalies leads to data redundancy and can cause data integrity and other problems as the
database grows. Normalization consists of a series of guidelines that helps to guide you in
creating a good database structure.
Data modification anomalies can be categorized into three types:
Insertion Anomaly: An insertion anomaly occurs when one cannot insert a new tuple into a
relation due to lack of data.
Deletion Anomaly: A deletion anomaly occurs when the deletion of data results
in the unintended loss of some other important data.
Update Anomaly: An update anomaly occurs when an update of a single data value requires
multiple rows of data to be updated.
Types of Normal Forms:
1NF A relation is in 1NF if every attribute contains only atomic values.
2NF A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally
dependent on the primary key.
3NF A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF A stronger definition of 3NF is known as Boyce-Codd normal form.
4NF A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued
dependency.
5NF A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining
is lossless.
Advantages
Normalization helps to minimize data redundancy.
Greater overall database organization.
Data consistency within the database.
Much more flexible database design.
Enforces the concept of relational integrity.
Disadvantages
You cannot start building the database before knowing what the user needs.
The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
It is very time-consuming and difficult to normalize relations of a higher degree.
Careless decomposition may lead to a bad database design, leading to serious problems.
First Normal Form (1NF)
A relation is in 1NF if every attribute contains only atomic values.
It states that an attribute of a table cannot hold multiple values. It must hold only a single
value.
First normal form disallows multi-valued attributes, composite attributes, and their
combinations.
Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385, 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389, 8589830302 Punjab
The decomposition of the EMPLOYEE table into 1NF has been shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
20 Harry 8574783832 Bihar
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
Second Normal Form (2NF)
For 2NF, the relation must be in 1NF.
In the second normal form, all non-key attributes are fully functionally dependent on the primary
key.
Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38
In the given table, the non-prime attribute TEACHER_AGE depends on TEACHER_ID alone, which
is a proper subset of the candidate key {TEACHER_ID, SUBJECT}. This partial dependency violates 2NF.
To convert the given table into 2NF, we decompose it into two tables:
TEACHER_DETAIL table:
TEACHER_ID TEACHER_AGE
25 30
47 35
83 38
TEACHER_SUBJECT table:
TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
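As a rough sketch, the partial dependency and the decomposition can be checked against the sample rows in Python. The helper names fd_holds and project are hypothetical. Note that a check on sample data can only refute a functional dependency; it cannot prove that the dependency holds for all future data.

```python
COLS = ("TEACHER_ID", "SUBJECT", "TEACHER_AGE")
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]


def fd_holds(rows, cols, lhs, rhs):
    """Return True if lhs -> rhs is not contradicted by any pair of rows."""
    li = [cols.index(c) for c in lhs]
    ri = [cols.index(c) for c in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in li)
        val = tuple(row[i] for i in ri)
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, cols, keep):
    """Project rows onto the columns in keep, dropping duplicate rows."""
    idx = [cols.index(c) for c in keep]
    out = []
    for row in rows:
        t = tuple(row[i] for i in idx)
        if t not in out:
            out.append(t)
    return out


# TEACHER_AGE depends on only part of the key {TEACHER_ID, SUBJECT}.
partial = fd_holds(teacher, COLS, ["TEACHER_ID"], ["TEACHER_AGE"])
id_fixes_subject = fd_holds(teacher, COLS, ["TEACHER_ID"], ["SUBJECT"])

teacher_detail = project(teacher, COLS, ["TEACHER_ID", "TEACHER_AGE"])
teacher_subject = project(teacher, COLS, ["TEACHER_ID", "SUBJECT"])
print(partial, id_fixes_subject)
print(teacher_detail)
```

The projection onto (TEACHER_ID, TEACHER_AGE) stores each age once, while the projection onto (TEACHER_ID, SUBJECT) keeps all five teaching assignments.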
Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and contains no transitive dependency of a non-prime attribute on a key.
3NF is used to reduce data duplication and to preserve data integrity.
If a relation in 2NF has no transitive dependency for its non-prime attributes, it is in third
normal form.
A relation is in third normal form if at least one of the following conditions holds for every
non-trivial functional dependency X → Y:
X is a super key.
Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Example:
EMPLOYEE_DETAIL table:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY
222 Harry 201010 UP Noida
333 Stephan 02228 US Boston
444 Lan 60007 US Chicago
555 Katharine 06389 UK Norwich
666 John 462007 MP Bhopal
Super key:{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on
Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP depends on
EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on
the super key (EMP_ID), which violates the rule of third normal form.
That is why we move EMP_CITY and EMP_STATE to a new EMPLOYEE_ZIP
table, with EMP_ZIP as its primary key.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_ZIP
222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
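The two 3NF conditions can be tested mechanically using attribute closure. The sketch below is a minimal illustration: the function names closure and violates_3nf are hypothetical, and the dependency EMP_ID → EMP_NAME is assumed from the fact that EMP_ID is the candidate key.

```python
def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(fds, all_attrs, candidate_keys):
    """Return the FDs X -> Y where X is no super key and Y is not all prime."""
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == all_attrs
        nontrivial_rhs = set(rhs) - set(lhs)
        if not is_superkey and not nontrivial_rhs <= prime:
            bad.append((lhs, rhs))
    return bad


ALL = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
FDS = [
    (("EMP_ID",), ("EMP_NAME", "EMP_ZIP")),    # assumed: EMP_ID is the key
    (("EMP_ZIP",), ("EMP_STATE", "EMP_CITY")),
]

# EMP_ID reaches EMP_STATE and EMP_CITY only through EMP_ZIP (transitively).
emp_id_closure = closure({"EMP_ID"}, FDS)
violations = violates_3nf(FDS, ALL, [{"EMP_ID"}])
print(sorted(emp_id_closure))
print(violations)
```

Only the dependency on EMP_ZIP is reported, which matches the decomposition above: once EMP_STATE and EMP_CITY live in a table keyed by EMP_ZIP, that dependency has a super key on its left side.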
Boyce Codd normal form (BCNF)
BCNF is an advanced version of 3NF; it is stricter than 3NF.
A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the table.
That is, the table should be in 3NF, and the LHS of every FD must be a super key.
Example: Let's assume there is a company where employees work in more than one department.
EMPLOYEE table:
EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO
264 India Designing D394 283
264 India Testing D394 300
364 UK Stores D283 232
364 UK Developing D283 549
In the above table Functional dependencies are as follows:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key, yet each appears on the left side of a functional dependency.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:
EMP_ID EMP_COUNTRY
264 India
364 UK
EMP_DEPT table:
EMP_DEPT DEPT_TYPE EMP_DEPT_NO
Designing D394 283
Testing D394 300
Stores D283 232
Developing D283 549
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
264 Designing
264 Testing
364 Stores
364 Developing
Functional dependencies:
EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
Now, the design is in BCNF because the left side of each functional dependency is a key of its own table.
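One way to sanity-check the decomposition is to project the original rows onto the three tables and join them back. The sketch below assumes the sample rows of the original EMPLOYEE table; the variable names are illustrative only.

```python
# Original EMPLOYEE rows:
# (EMP_ID, EMP_COUNTRY, EMP_DEPT, DEPT_TYPE, EMP_DEPT_NO)
employee = [
    (264, "India", "Designing", "D394", 283),
    (264, "India", "Testing", "D394", 300),
    (364, "UK", "Stores", "D283", 232),
    (364, "UK", "Developing", "D283", 549),
]

# Projections onto the three BCNF tables (sets remove duplicate rows).
emp_country = sorted({(e, c) for e, c, _, _, _ in employee})
emp_dept = sorted({(d, t, n) for _, _, d, t, n in employee})
emp_dept_mapping = sorted({(e, d) for e, _, d, _, _ in employee})

# EMP_ID and EMP_DEPT are keys of their tables, so plain lookups suffice.
country_of = dict(emp_country)
dept_info = {d: (t, n) for d, t, n in emp_dept}

# Natural join of the three tables, driven by the mapping table.
rejoined = {(e, country_of[e], d, *dept_info[d]) for e, d in emp_dept_mapping}

print(emp_country)
print(rejoined == set(employee))
```

Because the join reproduces exactly the original rows, no information is lost, and each employee's country is now stored only once.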
Fourth normal form (4NF)
A relation is in 4NF if it is in Boyce-Codd normal form and has no non-trivial multi-valued
dependency.
A multi-valued dependency A →→ B exists when a single value of A is associated with a set of
values of B, independently of the other attributes of the relation.
Example: STUDENT
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent entities.
Hence, there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math,
and two hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID,
which leads to unnecessary repetition of data.
So to bring the above table into 4NF, we decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
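The split can be sketched in Python by projecting the STUDENT rows and joining the two projections back on STU_ID. This is an illustrative sketch with made-up variable names, and it treats courses and hobbies as fully independent, as the text does.

```python
# STUDENT rows: (STU_ID, COURSE, HOBBY)
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]

# The two 4NF tables are projections of the original rows.
student_course = sorted({(s, c) for s, c, _ in student})
student_hobby = sorted({(s, h) for s, _, h in student})

# Joining on STU_ID pairs every course of a student with every hobby.
rejoined = sorted(
    (s, c, h)
    for s, c in student_course
    for s2, h in student_hobby
    if s == s2
)

# Pairings implied by independence but not listed in the original table.
implied_only = sorted(set(rejoined) - set(student))
print(rejoined)
print(implied_only)
```

For student 21 the join yields all four course-hobby pairings, while the sample table lists only two of them. Strictly speaking, the single-table design would have to store all four rows to express the same independence, which is the repetition that 4NF removes.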
Fifth normal form (5NF)
A relation is in 5NF if it is in 4NF and contains no join dependency other than those implied by
its candidate keys; any decomposition must rejoin losslessly.
A relation in 5NF cannot be decomposed into smaller tables any further without losing
information.
5NF is also known as Project-join normal form (PJ/NF).
Example
SUBJECT LECTURER SEMESTER
Computer Anshika Semester 1
Computer John Semester 1
Math John Semester 1
Math Akash Semester 2
Chemistry Praveen Semester 1
In the above table, John takes both the Computer and Math classes in Semester 1, but he does not take
the Math class in Semester 2. In this case, the combination of all three fields is required to identify a valid
row.
Suppose we add a new semester, Semester 3, but do not yet know the subject or who will
teach it, so we would leave LECTURER and SUBJECT as NULL. But all three columns together
act as the primary key, so we cannot leave the other two columns blank.
So to bring the above table into 5NF, we decompose it into three relations P1, P2 and P3:
P1
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
P2
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
P3
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen
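A short Python sketch can show why all three projections are needed: joining only P1 and P2 produces spurious rows, and filtering by P3 removes them. The variable names are illustrative, and sets are used so that duplicate rows disappear automatically.

```python
# Original rows: (SUBJECT, LECTURER, SEMESTER)
table = {
    ("Computer", "Anshika", "Semester 1"),
    ("Computer", "John", "Semester 1"),
    ("Math", "John", "Semester 1"),
    ("Math", "Akash", "Semester 2"),
    ("Chemistry", "Praveen", "Semester 1"),
}

p1 = {(sem, subj) for subj, _, sem in table}   # SEMESTER, SUBJECT
p2 = {(subj, lect) for subj, lect, _ in table}  # SUBJECT, LECTURER
p3 = {(sem, lect) for _, lect, sem in table}    # SEMESTER, LECTURER

# Join P1 and P2 on SUBJECT only.
two_way = {
    (subj, lect, sem)
    for sem, subj in p1
    for subj2, lect in p2
    if subj == subj2
}
spurious = two_way - table

# Keep only rows whose (SEMESTER, LECTURER) pair also appears in P3.
three_way = {row for row in two_way if (row[2], row[1]) in p3}

print(sorted(spurious))
print(three_way == table)
```

With this sample data the two-way join invents two rows for Math, and the three-way join returns exactly the original five rows.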
https://www.javatpoint.com/dbms-normalization
https://www.javatpoint.com/dbms-first-normal-form
https://www.javatpoint.com/dbms-second-normal-form
https://www.javatpoint.com/dbms-third-normal-form
https://www.javatpoint.com/dbms-boyce-codd-normal-form
https://www.javatpoint.com/dbms-forth-normal-form
https://www.javatpoint.com/dbms-fifth-normal-form
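Example: consider a table with the following tuples.
Tuple name project hobby
t1 Geeks MS Reading
t2 Geeks Oracle Music
t3 Geeks MS Music
t4 Geeks Oracle Reading
Project and hobby are multivalued attributes, as they have more than one value for a single
person, i.e., Geeks.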
Multi Valued Dependency (MVD) :
We can say that a multivalued dependency exists if the following conditions are met.
Conditions for MVD :
An attribute a multi-determines another attribute b (written a →→ b) if, in any legal relation r(R), for all pairs of
tuples t1 and t2 in r such that
t1[a] = t2[a]
there exist tuples t3 and t4 in r such that
t1[a] = t2[a] = t3[a] = t4[a]
t1[b] = t3[b]; t2[b] = t4[b]
t1[c] = t4[c]; t2[c] = t3[c], where c = R − (a ∪ b)
Then the multivalued dependency (MVD) a →→ b exists.
To check the MVD in given table, we apply the conditions stated above and we check it with the
values in the given table.
Condition-1 for MVD –
t1[a] = t2[a] = t3[a] = t4[a]
Finding from table,
t1[a] = t2[a] = t3[a] = t4[a] = Geeks
So, condition 1 is Satisfied.
Condition-2 for MVD –
t1[b] = t3[b]
And
t2[b] = t4[b]
Finding from table,
t1[b] = t3[b] = MS
And
t2[b] = t4[b] = Oracle
So, condition 2 is Satisfied.
Condition-3 for MVD –
For c = R − (a ∪ b), where R is the set of attributes in the relational table:
t1[c] = t4[c]
And
t2[c] = t3[c]
Finding from table,
t1[c] = t4[c] = Reading
And
t2[c] = t3[c] = Music
So, condition 3 is Satisfied.
All conditions are satisfied, therefore,
a →→ b
According to the table, we get
name →→ project
And for
a →→ c
we get
name →→ hobby
Hence, we know that MVDs exist in the above table, and they can be stated as
name →→ project
name →→ hobby
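The three conditions can be turned into a small checker. The sketch below is an assumption-laden illustration: the function name mvd_holds is hypothetical, and the four rows are inferred from the values quoted in the walkthrough (MS and Oracle for project, Reading and Music for hobby).

```python
def mvd_holds(rows, a, b):
    """Check a ->> b: for all t1, t2 with t1[a] == t2[a] there must be a
    tuple t3 with t3[a] == t1[a], t3[b] == t1[b] and t3[c] == t2[c],
    where c is every attribute not in a or b."""
    attrs = list(rows[0].keys())
    c = [x for x in attrs if x not in a and x not in b]
    present = {
        (
            tuple(r[x] for x in a),
            tuple(r[x] for x in b),
            tuple(r[x] for x in c),
        )
        for r in rows
    }
    for a1, b1, _ in present:
        for a2, _, c2 in present:
            if a1 == a2 and (a1, b1, c2) not in present:
                return False
    return True


rows = [
    {"name": "Geeks", "project": "MS", "hobby": "Reading"},      # t1
    {"name": "Geeks", "project": "Oracle", "hobby": "Music"},    # t2
    {"name": "Geeks", "project": "MS", "hobby": "Music"},        # t3
    {"name": "Geeks", "project": "Oracle", "hobby": "Reading"},  # t4
]

name_project = mvd_holds(rows, ("name",), ("project",))
name_hobby = mvd_holds(rows, ("name",), ("hobby",))
# Dropping t4 breaks the dependency: (Oracle, Reading) is then missing.
without_t4 = mvd_holds(rows[:3], ("name",), ("project",))
print(name_project, name_hobby, without_t4)
```

The check only needs to look for t3, because swapping the roles of t1 and t2 in the loop covers the t4 case as well.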
https://www.geeksforgeeks.org/multivalued-dependency-mvd-in-dbms/