Functional Dependency
Relational constraints are the restrictions imposed on the database contents and operations.
They ensure the correctness of data in the database.
1. Domain constraint
2. Tuple Uniqueness constraint
3. Key constraint
4. Entity Integrity constraint
5. Referential Integrity constraint
1. Domain Constraint-
Example-
Consider the following Student table-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul A
Here, value ‘A’ is not allowed since only integer values can be taken by the age attribute.
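The domain check described above can be sketched in a few lines. This is only an illustration: the tuple layout (id, name, age) and the helper name domain_violations are assumptions, since the table above shows only the values.

```python
# Assumed layout of each row: (student id, name, age).
students = [
    ("S001", "Akshay", 20),
    ("S002", "Abhishek", 21),
    ("S003", "Shashank", 20),
    ("S004", "Rahul", "A"),
]


def domain_violations(rows, column, expected_type):
    """Return the rows whose value at index `column` is not of `expected_type`."""
    return [row for row in rows if not isinstance(row[column], expected_type)]


# The age attribute (index 2) is expected to hold integers only.
bad_rows = domain_violations(students, 2, int)
print(bad_rows)
```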
2. Tuple Uniqueness Constraint-
Tuple Uniqueness constraint specifies that all the tuples must be necessarily unique in any relation.
Example-01:
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation satisfies the tuple uniqueness constraint since here all the tuples are unique.
Example-02:
S001 Akshay 20
S001 Akshay 20
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the tuple uniqueness constraint since not all the tuples are unique (the first two tuples are identical).
3. Key Constraint-
Example-
S001 Akshay 20
S001 Abhishek 21
S003 Shashank 20
S004 Rahul 20
This relation does not satisfy the key constraint since not all the values of the primary key are unique (S001 appears twice).
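A small sketch of how such a key violation could be detected. It assumes the first column of each row is the primary key; the helper name duplicate_keys is hypothetical.

```python
from collections import Counter


def duplicate_keys(rows, key_index=0):
    """Return the primary key values that occur more than once."""
    counts = Counter(row[key_index] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


students = [
    ("S001", "Akshay", 20),
    ("S001", "Abhishek", 21),
    ("S003", "Shashank", 20),
    ("S004", "Rahul", 20),
]

dups = duplicate_keys(students)
print(dups)
```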
4. Entity Integrity Constraint-
Entity integrity constraint specifies that no attribute of the primary key may contain a null value in any relation.
This is because a tuple whose primary key is null cannot be identified uniquely.
Example-
S001 Akshay 20
S002 Abhishek 21
S003 Shashank 20
NULL Rahul 20
This relation does not satisfy the entity integrity constraint as here the primary key contains a NULL value.
5. Referential Integrity Constraint-
This constraint is enforced when a foreign key references the primary key of a relation.
It specifies that all the values taken by the foreign key must either be available in the relation of the primary key or be null.
Important Results-
The following two important results emerge from the referential integrity constraint-
We cannot insert a record into a referencing relation if the corresponding record does not exist in the referenced relation.
We cannot delete or update a record of the referenced relation if the corresponding record exists in the referencing relation.
Example-
Student
Department
Dept_no Dept_name
D10 ASET
D11 ALS
D12 ASFL
D13 ASHS
Here,
The relation ‘Student’ does not satisfy the referential integrity constraint.
This is because in relation ‘Department’, no value of primary key specifies department no. 14.
Thus, referential integrity constraint is violated.
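The check can be sketched as follows. The Department values are taken from the table above; the Student rows are hypothetical stand-ins (the original rows are not shown here), with one student assumed to reference a department D14 that does not exist.

```python
# Referenced relation: Dept_no -> Dept_name (from the Department table above).
departments = {"D10": "ASET", "D11": "ALS", "D12": "ASFL", "D13": "ASHS"}

# Referencing relation: hypothetical rows of (student id, name, Dept_no).
students = [
    ("S001", "Akshay", "D10"),
    ("S002", "Abhishek", "D10"),
    ("S003", "Shashank", "D11"),
    ("S004", "Rahul", "D14"),
    ("S005", "Suman", None),   # a null foreign key is allowed
]


def dangling_references(referencing_rows, fk_index, referenced_keys):
    """Return rows whose foreign key is neither null nor present in the referenced relation."""
    return [
        row for row in referencing_rows
        if row[fk_index] is not None and row[fk_index] not in referenced_keys
    ]


violations = dangling_references(students, 2, departments.keys())
print(violations)
```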
To ensure the correctness of the database, it is important to handle the violation of referential integrity constraint
properly.
Closure of an Attribute Set-
The set of all those attributes which can be functionally determined from an attribute set is called the closure of that attribute set.
Closure of attribute set {X} is denoted as {X}+.
Step-01:
Add the attributes contained in the attribute set for which closure is being calculated to the result set.
Step-02:
Recursively add the attributes to the result set which can be functionally determined from the attributes already
contained in the result set.
Example-
A → BC
BC → DE
D→F
CF → G
Now, let us find the closure of some attributes and attribute sets-
Closure of attribute A-
A+ = { A }
={A,B,C} ( Using A → BC )
={A,B,C,D,E} ( Using BC → DE )
={A,B,C,D,E,F} ( Using D → F )
={A,B,C,D,E,F,G} ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Closure of attribute set { B , C }-
{ B , C }+ = { B , C }
={B,C,D,E} ( Using BC → DE )
={B,C,D,E,F} ( Using D → F )
={B,C,D,E,F,G} ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
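The closure procedure and the three results worked out above can be reproduced with a short sketch. It assumes every attribute is a single letter, so a dependency such as BC → DE is written as the pair ("BC", "DE"); the function name closure is our own choice.

```python
def closure(attributes, fds):
    """Return the closure of `attributes` under the functional dependencies `fds`.

    Each dependency is a (left, right) pair of strings of single-letter attributes.
    """
    result = set(attributes)          # Step-01: start with the given attributes
    changed = True
    while changed:                    # Step-02: repeat until nothing new is added
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", fds)))
print(sorted(closure("D", fds)))
print(sorted(closure("BC", fds)))
```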
Super Key-
If the closure of an attribute set contains all the attributes of the relation, then that attribute set is called a super key of that relation.
Thus, we can say-
“The closure of a super key is the entire relation schema.”
Example-
Candidate Key-
If the closure of an attribute set contains all the attributes of the relation, and no proper subset of that attribute set has a closure containing all the attributes of the relation, then that attribute set is called a candidate key of that relation.
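A minimal sketch of both definitions in terms of closures. It assumes single-letter attributes and tries attribute sets in increasing size, so a set is reported as a candidate key only if it is a super key and contains no smaller key. The function names are our own, and the brute-force search is meant for small textbook relations only.

```python
from itertools import combinations


def closure(attributes, fds):
    """Closure of `attributes` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def is_super_key(attributes, relation, fds):
    """A super key is a set whose closure is the entire relation schema."""
    return closure(attributes, fds) == set(relation)


def candidate_keys(relation, fds):
    """Return all minimal super keys, smallest sets first."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue              # contains a smaller key, so it is not minimal
            if is_super_key(candidate, relation, fds):
                keys.append(candidate)
    return keys


fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]
print(is_super_key("AB", "ABCDEFG", fds))
print(candidate_keys("ABCDEFG", fds))
```

With these dependencies the attribute A never appears on a right side, so it has to be part of every key.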
Example-
Problem-
AB → CD
AF → D
DE → F
C→G
F→E
G→A
(A) { CF }+ = { A , C , D , E , F , G }
(B) { BG }+ = { A , B , C , D , G }
(C) { AF }+ = { A , C , D , E , F , G }
(D) { AB }+ = { A , C , D , F ,G }
Solution-
Option-(A):
{ CF }+ = { C , F }
={C,F,G} ( Using C → G )
={C,E,F,G} ( Using F → E )
={A,C,E,F,G} ( Using G → A )
={A,C,D,E,F,G} ( Using AF → D )
Since the result set we obtained is the same as the given result set, option (A) is correctly given.
Option-(B):
{ BG }+ = { B , G }
={A,B,G} ( Using G → A )
={A,B,C,D,G} ( Using AB → CD )
Since the result set we obtained is the same as the given result set, option (B) is correctly given.
Option-(C):
{ AF }+ = { A , F }
={A,D,F} ( Using AF → D )
={A,D,E,F} ( Using F → E )
Since the result set we obtained is different from the given result set, option (C) is not correctly given.
Option-(D):
{ AB }+ = { A , B }
={A,B,C,D} ( Using AB → CD )
={A,B,C,D,G} ( Using C → G )
Since the result set we obtained is different from the given result set, option (D) is not correctly given.
Thus,
Options (C) and (D) are not correctly given.
Keys in DBMS-
A key is a set of attributes that can identify each tuple uniquely in the given relation.
NOTE-
1. Super Key-
A super key is a set of attributes that can identify each tuple uniquely in the given relation.
A super key is not restricted to have any specific number of attributes.
Thus, a super key may consist of any number of attributes.
Example-
Given below are the examples of super keys since each set can uniquely identify each student in the Student table-
NOTE-
All the attributes in a super key are definitely sufficient to identify each tuple uniquely in the given relation but all of
them may not be necessary.
2. Candidate Key-
A minimal super key is called a candidate key.
OR
A set of minimal attribute(s) that can identify each tuple uniquely in the given relation is called a candidate key.
Example-
Given below are the examples of candidate keys since each set consists of minimal attributes required to identify
each student uniquely in the Student table-
NOTES-
All the attributes in a candidate key are sufficient as well as necessary to identify each tuple uniquely.
Removing any attribute from the candidate key fails in identifying each tuple uniquely.
The value of candidate key must always be unique.
The value of candidate key can never be NULL.
It is possible to have multiple candidate keys in a relation.
Those attributes which appear in some candidate key are called prime attributes.
3. Primary Key-
A primary key is a candidate key that the database designer selects while designing the database.
OR
Candidate key that the database designer implements is called as a primary key.
NOTES-
Remember-
4. Alternate Key-
Candidate keys that are left unimplemented or unused after implementing the primary key are called alternate keys.
OR
5. Foreign Key-
An attribute ‘X’ is called as a foreign key to some other attribute ‘Y’ when its values are dependent on the values of
attribute ‘Y’.
The attribute ‘X’ can assume only those values which are assumed by the attribute ‘Y’.
Here, the relation in which attribute ‘Y’ is present is called as the referenced relation.
The relation in which attribute ‘X’ is present is called as the referencing relation.
The attribute ‘Y’ might be present in the same table or in some other table.
Example-
Here, t_dept can take only those values which are present in dept_no in Department table since only those
departments actually exist.
NOTES-
6. Partial Key-
A partial key is a key that cannot, by itself, identify every record of the table uniquely.
However, a bunch of related tuples can be selected from the table using the partial key.
Example-
E1 Suman Mother
E1 Ajay Father
E2 Vijay Father
E2 Ankush Son
Here, using partial key Emp_no, we can not identify a tuple uniquely but we can select a bunch of tuples from the
table.
7. Composite Key-
A primary key comprising multiple attributes and not just a single attribute is called a composite key.
8. Unique Key-
The Aadhaar Card Number is unique for all the citizens (tuples) of India (table).
If it gets lost and another duplicate copy is issued, then the duplicate copy always has the same number as before.
Thus, it is non-updatable.
A few citizens may not have got their Aadhaar cards, so for them its value is NULL.
9. Surrogate Key-
Example-
Mobile Number of students in a class where every student owns a mobile phone.
10. Secondary Key-
Secondary key is required for the indexing purpose for better and faster searching.
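Mathematically,
If α and β are two sets of attributes of a relation R such that-
α ⊆ R
β ⊆ R
Then, for a functional dependency to exist from α to β,
if t1[α] = t2[α], then t1[β] = t2[β] for every pair of tuples t1 and t2.
fd : α → β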
Examples-
The examples of trivial functional dependencies are-
AB → A
AB → B
AB → AB
Examples-
The examples of non-trivial functional dependencies are-
AB → BC
AB → CD
Reflexivity-
If B is a subset of A, then A → B always holds.
Transitivity-
If A → B and B → C, then A → C always holds.
Augmentation-
If A → B, then AC → BC always holds.
For example, from A → B, augmenting both sides by A gives A → AB.
Decomposition-
If A → BC, then A → B and A → C always hold.
Composition-
If A → B and C → D, then AC → BD always holds.
Additive (Union)-
If A → B and A → C, then A → BC always holds.
Rules for Functional Dependency-
Rule-01:
A functional dependency X → Y will always hold if all the values of X are unique (different) irrespective of the values
of Y.
Example-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies will always hold since all the values of attribute ‘A’ are unique-
A→B
A → BC
A → CD
A → BCD
A → DE
A → BCDE
In general, we can say the following functional dependency will always hold-
A → any combination of the attributes A, B, C, D, E
Rule-02:
A functional dependency X → Y will always hold if all the values of Y are same irrespective of the values of X.
Example-
Consider the following table-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies will always hold since all the values of attribute ‘C’ are same-
A→C
AB → C
ABDE → C
DE → C
AE → C
In general, we can say the following functional dependency will always hold true-
Any combination of the attributes A, B, C, D, E → C
Rule-03:
For a functional dependency X → Y to hold, if two tuples in the table agree on the value of attribute X, then they must
also agree on the value of attribute Y.
Rule-04:
For a functional dependency X → Y, violation will occur only when for two or more same values of X, the
corresponding Y values are different.
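Rule-03 and Rule-04 translate directly into a check on table data. The sketch below is one possible way to write it: for each value of X it remembers the first Y value seen and reports a violation when a later tuple with the same X carries a different Y. The table is the one used in the examples above; the function name fd_holds is our own.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in `rows`.

    `rows` is a list of dicts; `lhs` and `rhs` are strings of single-letter column names.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Two tuples that agree on X must also agree on Y.
        if seen.setdefault(x, y) != y:
            return False
    return True


columns = "ABCDE"
data = [
    (5, 4, 3, 2, 2),
    (8, 5, 3, 2, 1),
    (1, 9, 3, 3, 5),
    (4, 7, 3, 3, 8),
]
rows = [dict(zip(columns, values)) for values in data]

print(fd_holds(rows, "A", "BCDE"))   # all values of A are unique (Rule-01)
print(fd_holds(rows, "DE", "C"))     # all values of C are the same (Rule-02)
print(fd_holds(rows, "D", "E"))      # D = 2 appears with E = 2 and E = 1
```

Note that such a check only tells us whether a dependency is violated by the current data; it cannot prove that the dependency is part of the schema design.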
In DBMS,
Two different sets of functional dependencies for a given relation may or may not be equivalent.
If F and G are the two sets of functional dependencies, then following 3 cases are possible-
Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)
Step-01:
Step-03:
Step-01:
Step-02:
Step-03:
Set F-
A→C
AC → D
E → AD
E→H
Set G-
A → CD
E → AH
(A) G ⊇ F
(B) F ⊇ G
(C) F = G
Solution-
Step-01:
Step-02:
Step-03:
Comparing the results of Step-01 and Step-02, we find-
Functional dependencies of set F can determine all the attributes which have been determined by the functional
dependencies of set G.
Thus, we conclude F covers G i.e. F ⊇ G.
Step-01:
Step-02:
Step-03:
Functional dependencies of set G can determine all the attributes which have been determined by the functional
dependencies of set F.
Thus, we conclude G covers F i.e. G ⊇ F.
Since F covers G and G covers F, both the sets are equivalent i.e. F = G.
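The comparison of the two sets can be sketched with closures: F covers G when, for every dependency X → Y in G, the closure of X computed using F contains Y. The sets F and G below are the ones from the problem above; single-letter attributes and the function names are assumptions of this sketch.

```python
def closure(attributes, fds):
    """Closure of `attributes` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def covers(f, g):
    """Return True if every dependency of g can be derived from f."""
    return all(set(right) <= closure(left, f) for left, right in g)


def equivalent(f, g):
    """Two sets are equivalent when each covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(covers(F, G), covers(G, F), equivalent(F, G))
```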
In DBMS,
A canonical cover is a simplified and reduced version of the given set of functional dependencies.
Since it is a reduced version, it is also called an irreducible set.
Characteristics-
Need-
Working with the set containing extraneous functional dependencies increases the computation time.
Therefore, the given set is reduced by eliminating the useless functional dependencies.
This reduces the computation time and working with the irreducible set becomes easier.
Step-01:
Write the given set of functional dependencies in such a way that each functional dependency contains exactly one
attribute on its right side.
Example-
X→Y
X→Z
Step-02:
Consider each functional dependency one by one from the set obtained in Step-01.
Determine whether it is essential or non-essential.
To determine whether a functional dependency is essential or not, compute the closure of its left side-
Once by considering that the particular functional dependency is present in the set
Once by considering that the particular functional dependency is not present in the set
Then, the following two cases are possible-
Case-01: The two closures are the same-
It means that the presence or absence of that functional dependency does not create any difference.
Thus, it is non-essential.
Eliminate that functional dependency from the set.
NOTE-
Eliminate the non-essential functional dependency from the set as soon as it is discovered.
Do not consider it while checking the essentiality of other functional dependencies.
Case-02: The two closures are different-
It means that the presence or absence of that functional dependency creates a difference.
Thus, it is essential.
Do not eliminate that functional dependency from the set.
Mark that functional dependency as essential.
Step-03:
Consider the newly obtained set of functional dependencies after performing Step-02.
Check if there is any functional dependency that contains more than one attribute on its left side.
Case-01: No-
There exists no functional dependency containing more than one attribute on its left side.
In this case, the set obtained in Step-02 is the canonical cover.
Case-02: Yes-
There exists at least one functional dependency containing more than one attribute on its left side.
In this case, consider all such functional dependencies one by one.
Check if their left side can be reduced.
Problem-
The following functional dependencies hold true for the relational scheme R ( W , X , Y , Z ) –
X→W
WZ → XY
Y → WXZ
Solution-
Step-01:
Write all the functional dependencies such that each contains exactly one attribute on its right side-
X→W
WZ → X
WZ → Y
Y→W
Y→X
Y→Z
Step-02:
For X → W:
Considering X → W, (X)+ = { X , W }
Ignoring X → W, (X)+ = { X }
Now, the two results are different.
Thus, X → W is essential and cannot be eliminated.
For WZ → X:
Considering WZ → X, (WZ)+ = { W , X , Y , Z }
Ignoring WZ → X, (WZ)+ = { W , X , Y , Z }
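Now, the two results are the same.
Thus, WZ → X is non-essential and can be eliminated.
Eliminating WZ → X, our set of functional dependencies reduces to-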
X→W
WZ → Y
Y→W
Y→X
Y→Z
For WZ → Y:
Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
Ignoring WZ → Y, (WZ)+ = { W , Z }
Now, the two results are different.
Thus, WZ → Y is essential and cannot be eliminated.
For Y → W:
Considering Y → W, (Y)+ = { W , X , Y , Z }
Ignoring Y → W, (Y)+ = { W , X , Y , Z }
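Now, the two results are the same.
Thus, Y → W is non-essential and can be eliminated.
Eliminating Y → W, our set of functional dependencies reduces to-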
X→W
WZ → Y
Y→X
Y→Z
For Y → X:
Considering Y → X, (Y)+ = { W , X , Y , Z }
Ignoring Y → X, (Y)+ = { Y , Z }
Now, the two results are different.
Thus, Y → X is essential and cannot be eliminated.
For Y → Z:
Considering Y → Z, (Y)+ = { W , X , Y , Z }
Ignoring Y → Z, (Y)+ = { W , X , Y }
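Now, the two results are different.
Thus, Y → Z is essential and cannot be eliminated.
From here, our essential functional dependencies are-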
X→W
WZ → Y
Y→X
Y→Z
Step-03:
Consider the functional dependencies having more than one attribute on their left side.
Check if their left side can be reduced.
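In our set,
Only WZ → Y contains more than one attribute on its left side.
Considering WZ → Y, (WZ)+ = { W , X , Y , Z }
Now,
Consider all the possible proper subsets of WZ and check whether any of them has the same closure as (WZ)+.
(W)+ = { W }
(Z)+ = { Z }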
Clearly,
None of the subsets has the same closure as the entire left side.
Thus, we conclude that we can not write WZ → Y as W → Y or Z → Y.
Thus, set of functional dependencies obtained in step-02 is the canonical cover.
X→W
WZ → Y
Y→X
Y→Z
Canonical Cover
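The three steps can be sketched as follows and checked against the problem solved above. This is only one possible implementation under the assumption of single-letter attributes. For Step-03 it uses the usual extraneous attribute test (an attribute is dropped from the left side if the remaining attributes already determine the right side), which is a slightly different formulation from comparing closures of subsets. The result may depend on the order in which dependencies are examined.

```python
def closure(attributes, fds):
    """Closure of `attributes` under `fds` (pairs of attribute collections)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def canonical_cover(fds):
    # Step-01: exactly one attribute on every right side.
    work = [(frozenset(left), attr) for left, right in fds for attr in right]

    # Step-02: drop a dependency as soon as it is found to be non-essential.
    i = 0
    while i < len(work):
        left, attr = work[i]
        rest = work[:i] + work[i + 1:]
        if attr in closure(left, rest):
            work = rest               # non-essential: eliminate immediately
        else:
            i += 1                    # essential: keep it

    # Step-03: try to shrink left sides that have more than one attribute.
    for i in range(len(work)):
        left, attr = work[i]
        for extra in sorted(left):
            smaller = left - {extra}
            if smaller and attr in closure(smaller, work):
                left = smaller
                work[i] = (left, attr)

    return list(dict.fromkeys(work))  # remove duplicates, keep order


fds = [("X", "W"), ("WZ", "XY"), ("Y", "WXZ")]
cover = canonical_cover(fds)
for left, attr in cover:
    print("".join(sorted(left)), "->", attr)
```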
The process of breaking up or dividing a single relation into two or more sub relations is called decomposition of a relation.
Properties of Decomposition-
The following two properties must be followed when decomposing a given relation-
1. Lossless decomposition-
2. Dependency Preservation-
None of the functional dependencies that hold on the original relation are lost.
The sub relations still hold or satisfy the functional dependencies of the original relation.
Types of Decomposition-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Consider this relation is decomposed into two sub relations R1( A , B ) and R2( B , C )-
A B
1 2
2 5
3 3
R1 ( A , B )
B C
2 1
5 3
3 3
R2 ( B , C )
R1 ⋈ R 2 = R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
NOTE-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
Example-
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Consider this relation is decomposed into two sub relations as R1( A , C ) and R2( B , C )-
A C
1 1
2 3
3 3
R1 ( A , C )
B C
2 1
5 3
3 3
R2 ( B , C )
R1 ⋈ R 2 ⊃ R
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not same as the original relation R and contains some extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
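Both examples can be replayed with a small natural join. The sketch below represents each tuple as a dict; the helper names project, natural_join and as_set are our own.

```python
def project(rows, attributes):
    """Project `rows` onto `attributes`, removing duplicate tuples."""
    result = []
    for row in rows:
        t = {a: row[a] for a in attributes}
        if t not in result:
            result.append(t)
    return result


def natural_join(r1, r2):
    """Combine every pair of tuples that agree on all common attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            if all(t1[c] == t2[c] for c in common):
                joined.append({**t1, **t2})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparisons."""
    return {tuple(sorted(row.items())) for row in rows}


R = [dict(A=1, B=2, C=1), dict(A=2, B=5, C=3), dict(A=3, B=3, C=3)]

lossless = natural_join(project(R, "AB"), project(R, "BC"))
lossy = natural_join(project(R, "AC"), project(R, "BC"))

print(as_set(lossless) == as_set(R))   # join gives back exactly R
print(as_set(lossy) > as_set(R))       # join contains R plus extraneous tuples
```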
NOTE-
We have discussed-
Decomposition is a process of dividing a single relation into two or more sub relations.
Decomposition of a relation can be completed in the following two ways-
1. Lossless Join Decomposition
2. Lossy Join Decomposition
In this article, we will learn how to determine whether the decomposition is lossless or lossy.
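Consider a relation R that is decomposed into two sub relations R1 and R2.
Then, the decomposition is lossless if all of the following conditions are satisfied-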
Condition-01:
Union of both the sub relations must contain all the attributes that are present in the original relation R.
Thus,
R1 ∪ R2 = R
Condition-02:
R1 ∩ R2 ≠ ∅
Condition-03:
Intersection of both the sub relations must be a super key of either R1 or R2 or both.
Thus,
R1 ∩ R2 = Super key of R1 or R2
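The three conditions can be checked mechanically once the functional dependencies are known. The sketch below assumes single-letter attributes and tests condition-03 by computing the closure of the common attributes; the function name is_lossless_binary and the sample inputs are chosen for illustration.

```python
def closure(attributes, fds):
    """Closure of `attributes` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def is_lossless_binary(r1, r2, relation, fds):
    """Apply the three conditions to a decomposition of `relation` into r1 and r2."""
    r1, r2 = set(r1), set(r2)
    if r1 | r2 != set(relation):      # Condition-01: union must give back R
        return False
    common = r1 & r2
    if not common:                    # Condition-02: intersection must not be empty
        return False
    reach = closure(common, fds)      # Condition-03: common attributes form a super key
    return r1 <= reach or r2 <= reach


fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(is_lossless_binary("ABC", "BD", "ABCD", fds))
print(is_lossless_binary("AB", "CD", "ABCD", [("A", "B"), ("C", "D")]))
```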
Solution-
A B C D
R1 a1 a2 b13 b14
R2 b21 b22 a3 a4
A → B (no change)
C → D (no change)
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.
So, we have-
R1 ( A , B ) ∪ R2 ( C , D )
=R(A,B,C,D)
Clearly, union of the sub relations contains all the attributes of relation R.
Thus, condition-01 is satisfied.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R1 ( A , B ) ∩ R2 ( C , D )
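= Φ
Clearly, the intersection of the sub relations is null.
Thus, condition-02 is not satisfied and the decomposition is lossy.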
Problem-02:
A→B
B→C
C→D
D→B
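Initial table-
A B C D
R1 a1 a2 b13 b14
R2 b21 a2 a3 b24
R3 b31 a2 b33 a4
After applying B → C-
A B C D
R1 a1 a2 a3 b14
R2 b21 a2 a3 b24
R3 b31 a2 a3 a4
After applying C → D-
A B C D
R1 a1 a2 a3 a4
R2 b21 a2 a3 a4
R3 b31 a2 a3 a4
D → B (no change)
The row of R1 now contains only 'a' symbols, so the decomposition is lossless.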
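The table method used above can be sketched in code. Each sub relation gets one row, with the symbol 'a' in its own columns and a distinct 'b' symbol elsewhere; whenever two rows agree on the left side of a dependency, their right side symbols are made equal, preferring 'a'. The decomposition is lossless if some row ends up with 'a' in every column. Single-letter attributes and the function name chase_is_lossless are assumptions of this sketch.

```python
def chase_is_lossless(attributes, decomposition, fds):
    """Table (chase) test for a decomposition into any number of sub relations."""
    rows = [
        {x: ("a" if x in sub else ("b", i)) for x in attributes}
        for i, sub in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            for s in rows:
                for t in rows:
                    if s is t or any(s[x] != t[x] for x in left):
                        continue
                    for y in right:
                        if s[y] == t[y]:
                            continue
                        old = (s[y], t[y])
                        new = "a" if "a" in old else s[y]
                        # Make the two symbols equal everywhere in this column.
                        for r in rows:
                            if r[y] in old:
                                r[y] = new
                        changed = True
    return any(all(v == "a" for v in row.values()) for row in rows)


fds = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(chase_is_lossless("ABCD", ["AB", "BC", "BD"], fds))
print(chase_is_lossless("ABCD", ["AB", "CD"], [("A", "B"), ("C", "D")]))
```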
Solution-
Strategy to Solve
When a given relation is decomposed into more than two sub relations, then-
Consider any one of the possible ways in which the relation might have been decomposed into those sub relations.
First, divide the given relation into two sub relations.
Then, divide the sub relations according to the sub relations given in the question.
As a thumb rule, remember-
Any relation can be decomposed only into two sub relations at a time.
Consider the original relation R was decomposed into the given sub relations as shown-
Decomposition of R(A, B, C, D) into R'(A, B, C) and R3(B, D)-
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.
So, we have-
R' ( A , B , C ) ∪ R3 ( B , D )
= R ( A , B , C , D )
Clearly, union of the sub relations contains all the attributes of relation R.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R' ( A , B , C ) ∩ R3 ( B , D )
=B
Condition-03:
According to condition-03, intersection of both the sub relations must be the super key of one of the two sub relations
or both.
So, we have-
R' ( A , B , C ) ∩ R3 ( B , D )
=B
B+ = { B , C , D }
Now, we see-
Attribute ‘B’ cannot determine attribute ‘A’ of sub relation R'.
Thus, it is not a super key of the sub relation R'.
Attribute ‘B’ can determine all the attributes of sub relation R3.
Thus, it is a super key of the sub relation R3.
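Clearly, intersection of the sub relations is a super key of one of the sub relations.
Thus, all the three conditions are satisfied and this decomposition is lossless.
Decomposition of R'(A, B, C) into R1(A, B) and R2(B, C)-
Condition-01: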
According to condition-01, union of both the sub relations must contain all the attributes of relation R'.
So, we have-
R1 ( A , B ) ∪ R2 ( B , C )
= R' ( A , B , C )
Clearly, union of the sub relations contains all the attributes of relation R'.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R1 ( A , B ) ∩ R2 ( B , C )
=B
Condition-03:
According to condition-03, intersection of both the sub relations must be the super key of one of the two sub relations
or both.
So, we have-
R1 ( A , B ) ∩ R2 ( B , C )
=B
B+ = { B , C , D }
Now, we see-
Attribute ‘B’ can not determine attribute ‘A’ of sub relation R1.
Thus, it is not a super key of the sub relation R1.
Attribute ‘B’ can determine all the attributes of sub relation R2.
Thus, it is a super key of the sub relation R2.
Clearly, intersection of the sub relations is a super key of one of the sub relations.
Conclusion-
Overall decomposition of relation R into sub relations R1, R2 and R3 is lossless.
Normal Forms-
There exist several other normal forms beyond BCNF, but generally we normalize only up to BCNF.
A given relation is said to be in First Normal Form (1NF) if each cell of the table contains only an atomic value.
OR
A given relation is said to be in First Normal Form (1NF) if every attribute of every tuple is either single valued or null.
Example-
However,
Relation is in 1NF
NOTE-
A given relation is said to be in Second Normal Form (2NF) if and only if-
1. The relation is already in 1NF.
2. No partial dependency exists in the relation.
Partial Dependency
A partial dependency is a dependency where a proper subset of a candidate key determines non-prime attribute(s).
OR
A partial dependency is a dependency where a portion of the candidate key (an incomplete candidate key) determines non-prime attribute(s).
In other words,
A → B is a partial dependency if and only if A is a proper subset of some candidate key and B is a non-prime attribute.
NOTE-
To avoid partial dependency, incomplete candidate key must not determine any non-prime attribute.
However, incomplete candidate key can determine prime attributes.
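Example-
Consider a relation R ( V , W , X , Y , Z ) with functional dependencies-
VW → XY
Y → V
WX → YZ
The possible candidate keys for this relation are-
VW , WX , WY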
From here,
Prime attributes = { V , W , X , Y }
Non-prime attributes = { Z }
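A given relation is said to be in Third Normal Form (3NF) if and only if-
1. The relation is already in 2NF.
2. No transitive dependency exists for non-prime attributes.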
Transitive Dependency
NOTE-
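OR
A relation is in Third Normal Form (3NF) if and only if at least one of the following conditions holds for each non-trivial functional dependency A → B-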
1. A is a super key
2. B is a prime attribute
Example-
A → BC
CD → E
B→D
E→A
B->C
A , E , CD , BC
From here,
Prime attributes = { A , B , C , D , E }
There are no non-prime attributes
Now,
Example-
A→B
B→C
C→A
The possible candidate keys for this relation are-
A , B , C
Now, we can observe that the LHS of each given functional dependency is a candidate key.
Thus, the given relation is in BCNF.
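The BCNF and 3NF conditions can be tested with closures. The sketch below assumes single-letter attributes and checks only the listed dependencies, which is enough when testing the relation as a whole. BCNF requires the left side of every non-trivial dependency to be a super key; 3NF additionally accepts a dependency whose right side attributes are all prime. The second example takes A → BC, CD → E, B → D and E → A on R(A, B, C, D, E) as its assumed input. Function names are our own.

```python
from itertools import combinations


def closure(attributes, fds):
    """Closure of `attributes` under `fds` (pairs of single-letter attribute strings)."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for left, right in fds:
            if set(left) <= result and not set(right) <= result:
                result |= set(right)
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal super keys, found by trying smaller attribute sets first."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            s = set(combo)
            if not any(k <= s for k in keys) and closure(s, fds) == set(relation):
                keys.append(s)
    return keys


def is_bcnf(relation, fds):
    """Every non-trivial dependency must have a super key on its left side."""
    return all(
        set(right) <= set(left) or closure(left, fds) == set(relation)
        for left, right in fds
    )


def is_3nf(relation, fds):
    """Left side is a super key, or every new right side attribute is prime."""
    prime = set().union(*candidate_keys(relation, fds))
    return all(
        closure(left, fds) == set(relation) or (set(right) - set(left)) <= prime
        for left, right in fds
    )


print(is_bcnf("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
print(is_3nf("ABCDE", fds), is_bcnf("ABCDE", fds))
```

In the second relation every attribute is prime, so it is in 3NF, but B → D has a left side that is not a super key, so it is not in BCNF.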
We have discussed-
In this article, we will discuss some important points about normal forms.
Point-01:
Point-03:
Point-04:
Point-05:
Singleton keys are those that consist of only a single attribute.
If all the candidate keys of a relation are singleton candidate keys, then it will always be in 2NF at least.
This is because no partial dependency can exist.
The candidate keys will either fully appear or fully disappear from the dependencies.
Thus, an incomplete candidate key will never determine a non-prime attribute.
Point-06:
If all the attributes of a relation are prime attributes, then it will always be in 2NF at least.
This is because no partial dependency can exist.
Since there are no non-prime attributes, there can be no functional dependency that determines a non-prime attribute.
Point-07:
If all the attributes of a relation are prime attributes, then it will always be in 3NF at least.
This is because no transitive dependency can exist for non-prime attributes.
Point-08:
Third Normal Form (3NF) is considered adequate for normal relational database design.
Point-09:
Every binary relation (a relation with only two attributes) is always in BCNF.
Point-10:
BCNF is free from redundancies arising out of functional dependencies (zero redundancy).
Point-11:
Point-12:
BCNF decomposition is always lossless but not always dependency preserving.
Point-13:
Point-14:
There exist many more normal forms even after BCNF like 4NF and more.
But in the real world database systems, it is generally not required to go beyond BCNF.
Point-15:
Point-16:
Unlike BCNF, lossless and dependency preserving decomposition into 3NF and 2NF is always possible.
Point-17:
Point-18:
If a relation consists of only singleton candidate keys and it is in 3NF, then it must also be in BCNF.
Point-19:
If a relation consists of only one candidate key and it is in 3NF, then the relation must also be in BCNF.