Module 3
o Integrity constraints are a set of rules used to maintain the quality of information.
o Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
o Thus, integrity constraints guard against accidental damage to the database.
1. Domain constraints
o A domain constraint defines the valid set of values for an attribute.
o The data type of a domain can be string, character, integer, time, date, currency, etc.
The value of the attribute must belong to the corresponding domain.
Example:
2. Entity integrity constraints
o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation,
and if the primary key has a null value, then we can't identify those rows.
o Fields other than the primary key may contain null values.
Example:
4. Key constraints
o A key is an attribute (or set of attributes) used to uniquely identify an entity within its
entity set.
o An entity set can have multiple keys, but one of them is chosen as the primary
key. A primary key value must be unique and cannot be null in the relational table.
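As a rough illustration of the domain, entity integrity, and key constraints described above, the following sketch uses Python's built-in sqlite3 module. The student table, its columns, and the allowed age range are hypothetical names and values chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE student (
        -- entity integrity and key constraint: unique and never null
        student_id TEXT PRIMARY KEY NOT NULL,
        name       TEXT NOT NULL,
        -- domain constraint: age must be an integer in a valid range
        age        INTEGER CHECK (typeof(age) = 'integer' AND age BETWEEN 16 AND 100)
    )
    """
)

def try_insert(row):
    """Insert one row and report whether a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO student VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"

results = [
    try_insert(("S1", "Alice", 20)),   # valid row
    try_insert(("S2", "Bob", "A")),    # violates the domain of age
    try_insert((None, "Carol", 21)),   # violates entity integrity (null key)
    try_insert(("S1", "Dave", 22)),    # violates the key constraint (duplicate key)
]
for line in results:
    print(line)
```

Only the first insert succeeds; the other three are rejected by the database itself rather than by application code.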
Example:
Redundancy means having multiple copies of the same data in the database. This problem arises
when a database is not normalized. Suppose a student-details table has the attributes: student
ID, student name, college name, college rank, and course opted.
As can be observed, the values of the attributes college name, college rank, and course are
repeated, which can lead to problems. The problems caused by redundancy are the insertion
anomaly, the deletion anomaly, and the update anomaly.
1. Insertion Anomaly –
If the details of a student whose course has not yet been decided have to be inserted, the
insertion will not be possible until the course is decided for that student.
This problem occurs when a data record cannot be inserted without adding
some additional, unrelated data to the record.
2. Deletion Anomaly –
If the details of the students in this table are deleted, the details of the college are also
deleted, which should not happen.
This anomaly occurs when deleting a data record results in losing some unrelated
information that was stored as part of the deleted record.
3. Update Anomaly –
If the rank of the college changes, the change has to be made in every row that stores it,
which is time-consuming and computationally costly.
If the update is not applied in all places, the database will be in an inconsistent state.
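The update and deletion anomalies can be reproduced in a few lines of Python. This is only a sketch: the rows, names, and ranks below are made up for illustration and are not taken from the student table discussed above.

```python
from collections import defaultdict

# Hypothetical unnormalized rows: college details are repeated per student.
flat = [
    {"student_id": 1, "student_name": "S1", "college": "C1", "college_rank": 10, "course": "X"},
    {"student_id": 2, "student_name": "S2", "college": "C1", "college_rank": 10, "course": "Y"},
]

def colleges_with_conflicting_rank(rows):
    """Return colleges that appear with more than one rank."""
    ranks = defaultdict(set)
    for row in rows:
        ranks[row["college"]].add(row["college_rank"])
    return sorted(name for name, seen in ranks.items() if len(seen) > 1)

# Update anomaly: the rank is changed in only one of the copies.
flat[0]["college_rank"] = 8
conflicts = colleges_with_conflicting_rank(flat)
print(conflicts)  # ['C1']

# Normalized layout: college details are stored once and referenced by name.
colleges = {"C1": {"college_rank": 10}}
students = [
    {"student_id": 1, "student_name": "S1", "college": "C1", "course": "X"},
    {"student_id": 2, "student_name": "S2", "college": "C1", "course": "Y"},
]
colleges["C1"]["college_rank"] = 8   # one update, no inconsistency possible
students.clear()                     # deleting students keeps the college
college_survives = "C1" in colleges
print(college_survives)  # True
```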
Normalization
o Normalization divides a larger table into smaller tables and links them using
relationships.
o Normal forms are used to reduce redundancy in database tables.
Example: Relation EMPLOYEE is not in 1NF because of the multi-valued attribute
EMP_PHONE.
EMPLOYEE table:
o In the second normal form, all non-key attributes are fully functionally dependent on the
primary key.
Table: StudentCourse
StudentID CourseID StudentName Grade
Candidate Key:
Normalization:
To remove partial dependency and achieve 2NF, you would split the table as follows:
Table 1: Student
StudentID StudentName
101 Alice
102 Bob
103 Charlie
Table 2: CourseEnrollment
StudentID CourseID Grade
101 CS101 A
101 MATH101 B
102 CS101 A
103 MATH101 C
Now, both tables satisfy 2NF since all non-prime attributes are fully functionally dependent
on their respective primary keys.
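The split above can be expressed as two projections. In this sketch the unsplit StudentCourse rows are an assumption: they are reconstructed by joining the Student and CourseEnrollment tables shown above, since the original rows are not listed.

```python
# Assumed unsplit rows: (StudentID, CourseID, StudentName, Grade)
student_course = [
    (101, "CS101", "Alice", "A"),
    (101, "MATH101", "Alice", "B"),
    (102, "CS101", "Bob", "A"),
    (103, "MATH101", "Charlie", "C"),
]

def project(rows, indexes):
    """Keep only the given column positions and drop duplicate rows."""
    return sorted({tuple(row[i] for i in indexes) for row in rows})

# StudentName depends only on StudentID (a partial dependency), so it moves out.
student = project(student_course, (0, 2))
course_enrollment = project(student_course, (0, 1, 3))

print(student)
print(course_enrollment)
```

Alice's name is stored once in the Student projection instead of once per enrollment.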
Primary Key:
Transitive Dependency:
Table 1: StudentCourse
StudentID CourseID StudentName InstructorID
Table 2: Instructor
InstructorID InstructorName
Explanation:
Table 1: The transitive dependency is removed by storing only the InstructorID in the
StudentCourse table.
Table 2: The Instructor table contains the details about the instructor, ensuring no
redundant data.
Key Terms:
Why BCNF?
CourseID Instructor StudentID StudentName
Functional Dependencies:
Problem:
CourseID Instructor
C101 S1 Alice
C102 S2 Bob
C101 S2 Bob
Advantages of BCNF:
Eliminates redundancy.
Avoids anomalies in data insertion, deletion, or updates.
Fourth normal form (4NF)
o A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued
dependency.
o For a dependency A → B, if multiple values of B exist for a single value of A, then
the relation has a multi-valued dependency.
Example
Fifth normal form (5NF)
A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is
lossless.
5NF is satisfied when all the tables are broken into as many tables as possible in order to
avoid redundancy.
Example
Math Akash Semester 2
In the above table, John takes both the Computer and Math classes for Semester 1, but he doesn't
take the Math class for Semester 2. In this case, the combination of all these fields is required
to identify valid data.
Suppose we add a new semester, Semester 3, but do not know the subject or who
will be taking it, so we leave Lecturer and Subject as NULL. But all three columns
together act as the primary key, so we can't leave the other two columns blank.
So to make the above table into 5NF, we can decompose it into three relations P1, P2 & P3:
P1
SEMESTER SUBJECT
Semester 1 Computer
Semester 1 Math
Semester 1 Chemistry
Semester 2 Math
P2
SUBJECT LECTURER
Computer Anshika
Computer John
Math John
Math Akash
Chemistry Praveen
P3
SEMESTER LECTURER
Semester 1 Anshika
Semester 1 John
Semester 2 Akash
Semester 1 Praveen
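To see how the three projections fit back together, the sketch below joins P1, P2, and P3 in Python. The relation contents are copied from the tables above (with the repeated P3 row stored once, since a relation is a set); the variable names are illustrative.

```python
# P1(SEMESTER, SUBJECT)
p1 = {
    ("Semester 1", "Computer"),
    ("Semester 1", "Math"),
    ("Semester 1", "Chemistry"),
    ("Semester 2", "Math"),
}
# P2(SUBJECT, LECTURER)
p2 = {
    ("Computer", "Anshika"),
    ("Computer", "John"),
    ("Math", "John"),
    ("Math", "Akash"),
    ("Chemistry", "Praveen"),
}
# P3(SEMESTER, LECTURER)
p3 = {
    ("Semester 1", "Anshika"),
    ("Semester 1", "John"),
    ("Semester 2", "Akash"),
    ("Semester 1", "Praveen"),
}

# Natural join of all three: match P1 and P2 on SUBJECT,
# then keep only the (SEMESTER, LECTURER) pairs present in P3.
joined = sorted(
    (semester, subject, lecturer)
    for (semester, subject) in p1
    for (subject2, lecturer) in p2
    if subject2 == subject and (semester, lecturer) in p3
)
for row in joined:
    print(row)
```

Joining only P1 and P2 would also produce (Semester 1, Math, Akash) and (Semester 2, Math, John); the third relation P3 is what filters these spurious rows out.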
The above diagram also implies-
● BCNF is stricter than 3NF.
● 3NF is stricter than 2NF.
● 2NF is stricter than 1NF.
Functional dependency in DBMS
Example
The following is an example that would make it easier to understand functional dependency:
We have a <Department> table with two attributes: DeptId and DeptName.
DeptId = Department ID
DeptName = Department Name
The DeptId is our primary key. Here, DeptId uniquely identifies the DeptName attribute.
This is because if you want to know the department name, then at first you need to have
the DeptId.
Therefore, the functional dependency between DeptId and DeptName can be
stated as: DeptName is functionally dependent on DeptId:
DeptId -> DeptName
Examples-
Example
We are considering the same <Department> table with two attributes to understand the
concept of trivial dependency.
The following is a trivial functional dependency since DeptId is a subset
of { DeptId, DeptName }:
{ DeptId, DeptName } -> DeptId
Non-Trivial Functional Dependency
● A functional dependency X → Y is said to be non-trivial if and only if Y ⊄ X.
● Thus, if there exists at least one attribute in the RHS of a functional dependency that is not
a part of LHS, then it is called as a non-trivial functional dependency.
Examples-
Example
DeptId -> DeptName
The above is a non-trivial functional dependency since DeptName is not a subset of DeptId.
Completely Non-Trivial Functional Dependency
It occurs when A ∩ B is empty (A and B share no attributes) in:
A -> B
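The three categories can be told apart with simple set tests. The function name classify_fd below is a hypothetical helper written for this illustration.

```python
def classify_fd(lhs, rhs):
    """Classify the functional dependency lhs -> rhs by comparing attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                  # RHS is a subset of LHS
    if lhs & rhs:
        return "non-trivial"              # some, but not all, RHS attributes are in LHS
    return "completely non-trivial"       # LHS and RHS share no attributes

print(classify_fd({"DeptId", "DeptName"}, {"DeptId"}))  # trivial
print(classify_fd({"DeptId"}, {"DeptName"}))            # completely non-trivial
print(classify_fd({"A", "B"}, {"B", "C"}))              # non-trivial
```

Note that a completely non-trivial dependency such as DeptId -> DeptName is also non-trivial; the function simply reports the most specific label.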
Inference Rules-
Reflexivity-
If B is a subset of A, then A → B always holds.
Transitivity-
If A → B and B → C, then A → C always holds.
Augmentation-
If A → B, then AC → BC always holds.
Decomposition-
If A → BC, then A → B and A → C always hold.
Composition-
If A → B and C → D, then AC → BD always holds.
Additive-
If A → B and A → C, then A → BC always holds.
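One common way to apply these rules mechanically is to compute the closure of an attribute set: X → Y follows from a set of dependencies exactly when Y is contained in the closure of X. The sketch below assumes single-letter attribute names, so a string such as "AC" stands for the set {A, C}; the helper names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already known, the RHS becomes known too.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be derived from fds."""
    return set(rhs) <= closure(lhs, fds)

fds = [("A", "B"), ("B", "C")]
print(implies(fds, "A", "C"))                            # transitivity
print(implies(fds, "AC", "BC"))                          # augmentation
print(implies([("A", "BC")], "A", "B"))                  # decomposition
print(implies([("A", "B"), ("C", "D")], "AC", "BD"))     # composition
print(implies(fds, "C", "A"))                            # not derivable
```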
Rules for Functional Dependency-
Rule-01:
A functional dependency X → Y will always hold if all the values of X are unique (different)
irrespective of the values of Y.
Example-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies will always hold since all the values of attribute ‘A’
are unique-
● A→B
● A → BC
● A → CD
● A → BCD
● A → DE
● A → BCDE
In general, we can say following functional dependency will always hold-
A → Any combination of attributes A, B, C, D, E
Rule-02:
A functional dependency X → Y will always hold if all the values of Y are same irrespective
of the values of X.
Example-
Consider the following table-
A B C D E
5 4 3 2 2
8 5 3 2 1
1 9 3 3 5
4 7 3 3 8
The following functional dependencies will always hold since all the values of attribute ‘C’
are same-
● A→C
● AB → C
● ABDE → C
● DE → C
● AE → C
In general, we can say following functional dependency will always hold true-
Any combination of attributes A, B, C, D, E → C
Rule-03:
For a functional dependency X → Y to hold, if two tuples in the table agree on the value of
attribute X, then they must also agree on the value of attribute Y.
Rule-04:
For a functional dependency X → Y, violation will occur only when for two or more same
values of X, the corresponding Y values are different.
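Rule-03 and Rule-04 translate directly into a check over the rows of a table. The sketch below uses the example table from Rule-01 and Rule-02; the function name holds is an illustrative choice, and it only tests whether a dependency holds in this particular set of rows.

```python
# The example table used above, one dict per row.
rows = [
    {"A": 5, "B": 4, "C": 3, "D": 2, "E": 2},
    {"A": 8, "B": 5, "C": 3, "D": 2, "E": 1},
    {"A": 1, "B": 9, "C": 3, "D": 3, "E": 5},
    {"A": 4, "B": 7, "C": 3, "D": 3, "E": 8},
]

def holds(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for this X; any different Y is a violation.
        if seen.setdefault(x, y) != y:
            return False
    return True

print(holds(rows, "A", "BCDE"))  # True: all values of A are unique (Rule-01)
print(holds(rows, "DE", "C"))    # True: all values of C are the same (Rule-02)
print(holds(rows, "D", "E"))     # False: D = 2 appears with E = 2 and E = 1
print(holds(rows, "C", "A"))     # False: C = 3 appears with several values of A
```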
Equivalence of Two Sets of Functional Dependencies-
● In DBMS, two different sets of functional dependencies for a given relation may or may not be
equivalent.
● If F and G are the two sets of functional dependencies, then following 3 cases are possible-
Case-01: F covers G (F ⊇ G)
Case-02: G covers F (G ⊇ F)
Case-03: Both F and G cover each other (F = G)
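Whether one set covers another can be checked with attribute closures: F covers G when every dependency in G can be derived from F. The sets F, G, and H in this sketch are hypothetical, and attribute names are assumed to be single letters.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every dependency in g follows from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f and g cover each other."""
    return covers(f, g) and covers(g, f)

F = [("A", "B"), ("B", "C")]
G = [("A", "B"), ("B", "C"), ("A", "C")]   # A -> C is redundant
H = [("A", "B")]

print(equivalent(F, G))            # True: Case-03
print(covers(F, H), covers(H, F))  # True False: Case-01 only
```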
Step-01:
Step-02:
Step-03:
Step-01:
Step-02:
Step-03:
● Compare the results of Step-01 and Step-02.
● If the functional dependencies of set G has determined all those attributes that were
Decomposition of a Relation-
Properties of Decomposition-
The following two properties must be followed when decomposing a given relation-
1. Lossless decomposition-
2. Dependency Preservation-
Functional dependency if
(F1 ∪ F2 ∪ … ∪ Fm)+ = F+.
Consider a relation R
R ---> F{...with some functional dependency(FD)....}
R is decomposed or divided into R1 with FD { f1 } and R2 with { f2 }, then
there can be three cases:
f1 U f2 = F -----> Decomposition is dependency preserving.
f1 U f2 is a subset of F -----> Not Dependency preserving.
f1 U f2 is a super set of F -----> This case is not possible.
Problem: Let R(A, B, C, D) be a relation with functional dependencies
{AB –> C, C –> D, D –> A}.
Relation R is decomposed into R1(A, B, C) and R2(C, D). Check whether the
decomposition is dependency preserving or not.
Solution:
R1(A, B, C) and R2(C, D)
Let us find closure of F1 and F2
To find closure of F1, consider all combination of
ABC. i.e., find closure of A, B, C, AB, BC and AC
Note ABC is not considered as it is always ABC
closure(A) = { A } // Trivial
closure(B) = { B } // Trivial
closure(C) = {C, A, D} but D can't be in closure as D is not present R1.
= {C, A}
C--> A // Removing C from right side as it is trivial attribute
closure(AB) = {A, B, C, D}
= {A, B, C}
AB --> C // Removing AB from right side as these are trivial attributes
closure(BC) = {B, C, D, A}
= {A, B, C}
BC --> A // Removing BC from right side as these are trivial attributes
closure(AC) = {A, C, D}
= {A, C} // D is dropped as it is not present in R1, so AC gives no new dependency
F1 {C--> A, AB --> C, BC --> A}.
Similarly F2 { C--> D }
In the original Relation Dependency { AB --> C , C --> D , D --> A}.
AB --> C is present in F1.
C --> D is present in F2.
D --> A is not preserved.
F1 U F2 is a subset of F. So the given decomposition is not dependency preserving.
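The same conclusion can be reached programmatically. The sketch below does not enumerate F1 and F2 as in the solution above; instead it uses a commonly taught alternative test that, for each dependency X → Y, repeatedly grows X using only what each sub-relation can see. Attribute names are assumed to be single letters, and the helper names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(lhs, rhs, fds, decomposition):
    """True if lhs -> rhs can be enforced using the sub-relations alone."""
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            # Attributes of Ri determined by the part of z that lies inside Ri.
            gained = (closure(z & ri, fds) & ri) - z
            if gained:
                z |= gained
                changed = True
    return set(rhs) <= z

F = [("AB", "C"), ("C", "D"), ("D", "A")]
decomposition = ["ABC", "CD"]

status = {f"{lhs}->{rhs}": preserved(lhs, rhs, F, decomposition) for lhs, rhs in F}
print(status)
```

For this problem the check reports AB → C and C → D as preserved and D → A as not preserved, in agreement with the manual solution.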
Types of Decomposition-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
Example-
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Consider this relation is decomposed into two sub relations R1( A , B ) and R2( B , C )-
A B
1 2
2 5
3 3
R1( A , B )
B C
2 1
5 3
3 3
R2( B , C )
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-
A B C
1 2 1
2 5 3
3 3 3
NOTE-
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R
Example-
A B C
1 2 1
2 5 3
3 3 3
R( A , B , C )
Consider this relation is decomposed into two sub relations as R1( A , C ) and R2( B , C )-
A C
1 1
2 3
3 3
R1( A , C )
B C
2 1
5 3
3 3
R2( B , C )
Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-
A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3
This relation is not same as the original relation R and contains some extraneous tuples.
Clearly, R1 ⋈ R2 ⊃ R.
Thus, we conclude that the above decomposition is lossy join decomposition.
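Both examples can be replayed in Python by projecting R onto the sub-relations and joining them back. The helpers project and natural_join below are written for this illustration and assume single-letter attribute names.

```python
# R(A, B, C) from the examples above
R = {(1, 2, 1), (2, 5, 3), (3, 3, 3)}
ATTRS = "ABC"

def project(rows, attrs, keep):
    """Project rows (tuples over attrs) onto the attributes in keep."""
    idx = [attrs.index(a) for a in keep]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, attrs1, r2, attrs2, out_attrs):
    """Join two relations on their common attributes."""
    common = [a for a in attrs1 if a in attrs2]
    result = set()
    for t1 in r1:
        d1 = dict(zip(attrs1, t1))
        for t2 in r2:
            d2 = dict(zip(attrs2, t2))
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                result.add(tuple(merged[a] for a in out_attrs))
    return result

# Decomposition into R1(A, B) and R2(B, C)
lossless = natural_join(project(R, ATTRS, "AB"), "AB", project(R, ATTRS, "BC"), "BC", ATTRS)
# Decomposition into R1(A, C) and R2(B, C)
lossy = natural_join(project(R, ATTRS, "AC"), "AC", project(R, ATTRS, "BC"), "BC", ATTRS)

print(lossless == R)      # True: the join gives back exactly R
print(sorted(lossy - R))  # [(2, 3, 3), (3, 5, 3)]: the extraneous tuples
```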
NOTE-
Condition-01:
Union of both the sub relations must contain all the attributes that are present in the original
relation R.
Thus,
R1 ∪ R2 = R
Condition-02:
R1 ∩ R2 ≠ ∅
Condition-03:
Intersection of both the sub relations must be a super key of either R1 or R2 or both.
Thus,
R1 ∩ R2 = Super key of R1 or R2
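The three conditions can be combined into a single check for a decomposition into two sub-relations. This is a sketch under stated assumptions: attribute names are single letters, the function name is illustrative, and the dependency B → C used in the second and third calls is hypothetical (it is merely consistent with the example rows of R( A , B , C ) shown earlier).

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r, r1, r2, fds):
    """Apply Condition-01 to Condition-03 to a two-way decomposition."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:                 # Condition-01: no attribute may be lost
        return False
    common = r1 & r2
    if not common:                   # Condition-02: intersection must not be empty
        return False
    determined = closure(common, fds)
    # Condition-03: the common attributes form a super key of R1 or of R2
    return r1 <= determined or r2 <= determined

print(is_lossless_binary("ABCD", "AB", "CD", []))            # False: Condition-02 fails
print(is_lossless_binary("ABC", "AB", "BC", [("B", "C")]))   # True: B is a super key of R2
print(is_lossless_binary("ABC", "AC", "BC", [("B", "C")]))   # False: C determines nothing else
```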
Problem-01:
Solution-
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of
relation R.
So, we have-
R1 ( A , B ) ∪ R2 ( C , D )
=R(A,B,C,D)
Clearly, the union of the sub relations contains all the attributes of relation R.
Thus, condition-01 is satisfied.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R1 ( A , B ) ∩ R2 ( C , D )
= ∅
Clearly, intersection of the sub relations is null.
So, condition-02 fails.
Thus, we conclude that the decomposition is lossy.