DBMS Unit-2

The document outlines the syllabus for a Database Management System course, focusing on database design, normalization, and functional dependencies. It details various normal forms (1NF, 2NF, 3NF, BCNF, 4NF) and their purposes in reducing data redundancy and eliminating anomalies. Additionally, it explains concepts such as multivalued dependencies and the importance of lossless and dependency-preserving decompositions in database design.

Uploaded by

Alpha Ayush

MIT School of Computing

Department of Computer Science & Engineering

Third Year Engineering

21BTCS502-Database Management System

Class - T.Y.PLD(Division-)

AY 2024-2025
SEM-I

Unit – II

Database Design


Syllabus
● Functional Dependency, Purpose of Normalization, Data
Redundancy and Update Anomalies, Functional
Dependency Single Valued Dependencies.
● Single Valued Normalization: 1NF, 2NF, 3NF, BCNF.
● Decomposition: lossless join decomposition and dependency
preservation.
● Multi valued Normalization (4NF), Join Dependencies and
the Fifth Normal Form.

Functional Dependency
Functional dependency in DBMS refers to the relationship between
attributes within a table where one attribute's value uniquely determines
another's.
Key points include:
● Functional Dependency (FD) defines how one attribute relates to
another within a database.
● It ensures data integrity by linking attributes in a structured
manner.
● Denoted by an arrow (→), e.g., X → Y, where X determines Y.
● X is the determinant attribute, and Y is the dependent attribute.
● Example: sid → sname means the student ID sid uniquely determines
the student's name sname.
Types of Functional Dependency
1.Multivalued dependency
2.Trivial functional dependency
3.Non-trivial functional dependency
4.Transitive dependency
1. Multivalued dependency - A multivalued dependency occurs when a
single table holds two or more multivalued attributes that are
independent of each other.
2. Trivial functional dependency - A functional dependency X -> Y is
trivial if Y is a subset of X, e.g. {sid, sname} -> sid.
3. Non-trivial functional dependency - A functional dependency A -> B
is non-trivial when B is not a subset of A, e.g. sid -> sname.

4. Transitive dependency - A transitive dependency is a functional
dependency that is formed indirectly by two other functional
dependencies: if A -> B and B -> C hold, then A -> C is a transitive
dependency.
Properties of functional dependencies
1. Reflexivity: If Y is a subset of X, then X → Y always holds.
e.g. sid → sid
2. Augmentation: If X → Y, then XZ → YZ.
e.g. {sid, phoneno} → {sname, phoneno}
3. Transitivity: If X → Y and Y → Z, then X → Z.
e.g. sid → sname and sname → city, then sid → city
4. Union: If X → Y and X → Z, then X → YZ.
5. Decomposition: If X → YZ, then X → Y and X → Z.

Purpose of Normalization
• If a database design is poor, it may contain anomalies, which make
the database difficult to maintain and to keep consistent.
• Update anomalies: If the same data item is stored in several
places, an update that changes only some of the copies leaves the
database inconsistent.
• Deletion anomalies: Deleting a record also removes other facts that
were stored only in that record, or leaves behind copies of the data
that are saved somewhere else.
• Insert anomalies: A fact cannot be inserted because the record it
must be attached to does not exist yet.
Data Redundancy and Update Anomalies
Normalization
• Normalization is a database design technique that reduces data
redundancy and eliminates undesirable characteristics like
Insertion, Update and Deletion Anomalies.
Types of Normal forms
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF (Boyce Codd Normal Form)
5. Fourth Normal Form
6. Fifth Normal Form
1st Normal Form (1NF)

• A table is in First Normal Form if every attribute in it is atomic.
• Here, atomicity means that a single cell cannot hold multiple
values. It must hold only a single value.
• The First Normal Form disallows multi-valued attributes,
composite attributes, and their combinations.

Let’s understand the First Normal Form with the help of an example.
Below is a students’ record table that has information about
student roll number, student name, student course, and age of
the student.

In the students' record table, the course column holds two values in
a single cell. Thus the table does not follow the First Normal Form.
Applying the First Normal Form to the previous table gives the table
below as a result.

By applying the First Normal Form, you achieve atomicity: every
column holds exactly one value in each row.
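The conversion to 1NF can be sketched in a few lines of Python. The rows below are hypothetical sample data invented for illustration (they are not taken from the slides), and the helper name `to_1nf` is also an assumption. The sketch emits one row per student and course so that no cell holds more than one value.

```python
# Hypothetical rows: (roll_no, name, course, age). The course cell of the
# first row holds two values, so the table is not in 1NF.
unnormalized = [
    (1, "Asha", "C, C++", 20),
    (2, "Ravi", "Java", 21),
]


def to_1nf(rows):
    """Emit one row per (student, course) pair so every cell is atomic."""
    result = []
    for roll_no, name, courses, age in rows:
        for course in courses.split(","):
            result.append((roll_no, name, course.strip(), age))
    return result


students_1nf = to_1nf(unnormalized)
for row in students_1nf:
    print(row)
```

Note that the name and age now repeat for a student with several courses. That repetition is the redundancy that the later normal forms remove.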
Before proceeding with the Second Normal Form, get familiar with
Candidate Key and Super Key.
Candidate Key
A candidate key is a minimal set of one or more columns that can
identify a record uniquely in a table, and you can use any candidate
key as the Primary Key.
Super Key
A super key is any set of one or more columns that can identify a
record uniquely in a table. Every candidate key, and therefore the
Primary Key, is a super key from which no column can be removed.
For example, in an employee table where EMP_ID identifies each row,
{EMP_ID} is a candidate key, while {EMP_ID, EMP_NAME} is a super key
but not a candidate key.
Second Normal Form (2NF)

•In 2NF, the relation must be in 1NF.
•In the second normal form, every non-key attribute is fully
functionally dependent on the primary key, i.e. it depends on the
whole key and not on a part of it.
Example: Let's assume, a school can store the data of teachers and
the subjects they teach. In a school, a teacher can teach more than
one subject.
• TEACHER table

TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30
25 Biology 30
47 English 35
83 Math 38
83 Computer 38

In the given table, the candidate key is {TEACHER_ID, SUBJECT}. The
non-prime attribute TEACHER_AGE is dependent on TEACHER_ID alone,
which is a proper subset of the candidate key. That's why it violates
the rule for 2NF.
To convert the given table into 2NF, we decompose it into
two tables:

TEACHER_DETAIL table:

TEACHER_ID TEACHER_AGE
25 30
47 35
83 38

TEACHER_SUBJECT table:

TEACHER_ID SUBJECT
25 Chemistry
25 Biology
47 English
83 Math
83 Computer
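The decomposition above amounts to two projections of the TEACHER table. The Python sketch below uses the rows from the example. The variable names and the choice of sets of tuples are assumptions made for illustration. It shows that each teacher's age is stored once after the split and that joining the two tables on TEACHER_ID gives back the original rows.

```python
# Rows of the TEACHER table: (TEACHER_ID, SUBJECT, TEACHER_AGE)
teacher = [
    (25, "Chemistry", 30),
    (25, "Biology", 30),
    (47, "English", 35),
    (83, "Math", 38),
    (83, "Computer", 38),
]

# Projection onto (TEACHER_ID, TEACHER_AGE); the set drops duplicate rows.
teacher_detail = {(tid, age) for tid, _subject, age in teacher}
# Projection onto (TEACHER_ID, SUBJECT).
teacher_subject = {(tid, subject) for tid, subject, _age in teacher}

# Natural join of the two projections on TEACHER_ID.
rejoined = {
    (tid, subject, age)
    for tid, subject in teacher_subject
    for tid2, age in teacher_detail
    if tid == tid2
}

print(sorted(teacher_detail))
print(rejoined == set(teacher))
```

TEACHER_DETAIL has three rows instead of five, so changing a teacher's age touches exactly one row.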
Third Normal Form (3NF)

•A relation is in 3NF if it is in 2NF and does not contain any
transitive dependency of a non-prime attribute on a candidate key.
•3NF is used to reduce data duplication. It is also used to
achieve data integrity.
•If a relation in 2NF has no transitive dependency for non-prime
attributes, then the relation is in third normal form.
A relation is in third normal form if it satisfies at least one of the
following conditions for every non-trivial functional dependency
X → Y.
1.X is a super key.
2.Y is a prime attribute, i.e., each element of Y is part of some
candidate key.
Example:
EMPLOYEE_DETAIL table:
EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY
222 Harry 201010 UP Noida
333 Stephan 02228 US Boston
444 Lan 60007 US Chicago
555 Katharine 06389 UK Norwich
666 John 462007 MP Bhopal

Super keys in the table above: {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.
Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.

Here, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent
on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively
dependent on the super key (EMP_ID), which violates the rule of third normal form.
That's why we need to move EMP_CITY and EMP_STATE to the new
<EMPLOYEE_ZIP> table, with EMP_ZIP as the Primary key.
EMPLOYEE table:

EMP_ID EMP_NAME EMP_ZIP


222 Harry 201010
333 Stephan 02228
444 Lan 60007
555 Katharine 06389
666 John 462007

EMPLOYEE_ZIP table:

EMP_ZIP EMP_STATE EMP_CITY


201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
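The dependencies behind this example can be checked against the sample rows. The sketch below is illustrative only: the helper `fd_holds` is a hypothetical name, and ZIP codes are stored as strings so that leading zeros survive. Sample data can only refute a dependency, never prove that it holds for every future row, so a passing check means "consistent with the data" and nothing more.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the rows (a list of dicts) are consistent with lhs -> rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns
        # the stored value, so a later mismatch exposes a violation.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ["EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"]
data = [
    (222, "Harry", "201010", "UP", "Noida"),
    (333, "Stephan", "02228", "US", "Boston"),
    (444, "Lan", "60007", "US", "Chicago"),
    (555, "Katharine", "06389", "UK", "Norwich"),
    (666, "John", "462007", "MP", "Bhopal"),
]
employee_detail = [dict(zip(columns, row)) for row in data]

id_to_zip = fd_holds(employee_detail, ["EMP_ID"], ["EMP_ZIP"])
zip_to_place = fd_holds(employee_detail, ["EMP_ZIP"], ["EMP_STATE", "EMP_CITY"])
state_to_city = fd_holds(employee_detail, ["EMP_STATE"], ["EMP_CITY"])
print(id_to_zip, zip_to_place, state_to_city)
```

The first two checks are consistent with the chain EMP_ID → EMP_ZIP → (EMP_STATE, EMP_CITY). The third fails because the state US appears with two different cities.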
Boyce Codd Normal Form (BCNF)
Boyce Codd Normal Form is also known as 3.5 NF. It is the
superior version of 3NF and was developed by Raymond F.
Boyce and Edgar F. Codd to tackle certain types of anomalies
which were not resolved with 3NF.

The first condition for a table to be in Boyce Codd Normal
Form is that the table should be in the third normal form.
Secondly, for every non-trivial functional dependency, the
Left-Hand Side (LHS) must be a super key of that particular table.
For example:
You have a functional dependency X → Y. For this functional
dependency, X has to be a super key of the provided table.
Consider the subject table:

The subject table follows these conditions:
•Each student can enroll in multiple subjects.
•Multiple professors can teach a particular subject.
•For each subject, a professor is assigned to the student.

In the above table, student_id and subject together form the primary
key because using student_id and subject; you can determine all the
table columns.
Another important point to be noted here is that one professor
teaches only one subject, but one subject may have two professors.
This shows that there is a dependency between subject and professor,
i.e. subject depends on the professor's name.
This table follows all the Normal forms except the Boyce Codd
Normal Form.
As you can see, stuid and subject form the primary key, which
means the subject attribute is a prime attribute.
However, there exists yet another dependency: professor → subject.
The table is not in BCNF because professor, the determinant of this
dependency, is a non-prime attribute and not a super key, even though
subject is a prime attribute.
To transform the table into BCNF, you divide the table
into two parts.
The first table holds stuid, which already exists, and a newly
created column profid. The second table has the columns profid,
subject, and professor, which satisfies BCNF.
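The BCNF condition can be tested mechanically: a non-trivial dependency violates BCNF when the closure of its left side does not cover every attribute of the table. The Python sketch below assumes the attribute names student_id, subject and professor and the two dependencies described in this example. The function names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_super_key = closure(lhs, fds) == set(all_attrs)
        if not trivial and not is_super_key:
            violations.append((lhs, rhs))
    return violations


attrs = {"student_id", "subject", "professor"}
fds = [
    (frozenset({"student_id", "subject"}), frozenset({"professor"})),
    (frozenset({"professor"}), frozenset({"subject"})),
]
violations = bcnf_violations(attrs, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))
```

For these inputs the only reported violation is professor → subject, because professor alone determines just professor and subject, not student_id.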
Multivalued Dependency
•Multivalued dependency occurs when two attributes in a table are
independent of each other but, both depend on a third attribute.
•A multivalued dependency consists of at least two attributes that
are dependent on a third attribute that's why it always requires at
least three attributes.
Example: Suppose there is a bike manufacturer company which
produces two colors(white and black) of each model every year.
BIKE_MODEL MANUF_YEAR COLOR
M2001 2008 White
M2001 2008 Black
M3001 2013 White
M3001 2013 Black
M4006 2017 White
M4006 2017 Black
Here columns COLOR and MANUF_YEAR are dependent on
BIKE_MODEL and independent of each other.
In this case, these two columns are said to be multivalued
dependent on BIKE_MODEL.
The representation of these dependencies is shown below:
1.BIKE_MODEL →→ MANUF_YEAR
2.BIKE_MODEL →→ COLOR
This can be read as "BIKE_MODEL multidetermines
MANUF_YEAR" and "BIKE_MODEL multidetermines
COLOR".
Fourth normal form (4NF)
•A relation is in 4NF if it is in Boyce Codd normal form and
has no non-trivial multi-valued dependency other than on a super key.
•For a dependency A → B, if multiple values of B exist for a single
value of A, then the dependency is a multi-valued dependency.
Example:
The given STUDENT table is in 3NF, but COURSE and HOBBY are two
independent entities. Hence, there is no relationship between
COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two
courses, Computer and Math, and two hobbies, Dancing and Singing.
So there is a multi-valued dependency on STU_ID, which leads to
unnecessary repetition of data.
So to make the above table into 4NF, we can decompose it into
two tables, one pairing STU_ID with COURSE and one pairing STU_ID
with HOBBY:
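The effect of the 4NF decomposition can be sketched in Python. The rows below are assumed for illustration: they are built only from the values mentioned for STU_ID 21 (courses Computer and Math, hobbies Dancing and Singing) and list every course and hobby combination, which is what independence of the two attributes requires. The variable names are hypothetical.

```python
# Assumed rows (STU_ID, COURSE, HOBBY): every course paired with every hobby.
student = {
    (21, "Computer", "Dancing"),
    (21, "Computer", "Singing"),
    (21, "Math", "Dancing"),
    (21, "Math", "Singing"),
}

# One table per independent multi-valued fact.
student_course = {(sid, course) for sid, course, _hobby in student}
student_hobby = {(sid, hobby) for sid, _course, hobby in student}

# Joining on STU_ID regenerates every course and hobby combination.
rejoined = {
    (sid, course, hobby)
    for sid, course in student_course
    for sid2, hobby in student_hobby
    if sid == sid2
}

print(len(student_course), len(student_hobby))
print(rejoined == student)
```

With m courses and n hobbies, the single table needs m × n rows per student while the two tables need m + n, and adding a hobby becomes a single insert.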
Relational Decomposition
•When a relation in the relational model is not in an appropriate
normal form, decomposition of the relation is required.
•In a database, decomposition breaks a table into multiple tables.
•If the relation is not decomposed properly, it may lead to
problems like loss of information.
•Decomposition is used to eliminate some of the problems of bad
design like anomalies, inconsistencies, and redundancy.

Types of Decomposition

1. Lossless Decomposition
2. Dependency Preserving
Lossless Decomposition
•If no information is lost from the relation that is decomposed,
then the decomposition is lossless.
•A lossless decomposition guarantees that joining the decomposed
relations results in the same relation that was decomposed.
•A decomposition is said to be lossless if the natural join of
all the decomposed relations gives the original relation.
Example:
EMPLOYEE_DEPARTMENT table:
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME
22 Denim 28 Mumbai 827 Sales
33 Alina 25 Delhi 438 Marketing
46 Stephan 30 Bangalore 869 Finance
52 Katherine 36 Mumbai 575 Production
60 Jack 40 Noida 678 Testing
The above relation is decomposed into two relations EMPLOYEE and
DEPARTMENT.
EMPLOYEE table:

EMP_ID EMP_NAME EMP_AGE EMP_CITY

22 Denim 28 Mumbai
33 Alina 25 Delhi
46 Stephan 30 Bangalore
52 Katherine 36 Mumbai
60 Jack 40 Noida
DEPARTMENT table

DEPT_ID EMP_ID DEPT_NAME

827 22 Sales
438 33 Marketing
869 46 Finance
575 52 Production
678 60 Testing
Now, when these two relations are joined on the common column
"EMP_ID", then the resultant relation will look like:
Employee ⋈ Department
EMP_ID EMP_NAME EMP_AGE EMP_CITY DEPT_ID DEPT_NAME

22 Denim 28 Mumbai 827 Sales

33 Alina 25 Delhi 438 Marketing

46 Stephan 30 Bangalore 869 Finance

52 Katherine 36 Mumbai 575 Production

60 Jack 40 Noida 678 Testing

Hence, the decomposition is Lossless join decomposition.
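The lossless property can be verified directly on the sample rows, and a contrasting split shows what goes wrong otherwise. In the Python sketch below, the first decomposition is the one from the example. The second one, which shares only EMP_CITY between the two tables, is a hypothetical counter-example added for illustration. All variable names are assumptions.

```python
employee_department = {
    (22, "Denim", 28, "Mumbai", 827, "Sales"),
    (33, "Alina", 25, "Delhi", 438, "Marketing"),
    (46, "Stephan", 30, "Bangalore", 869, "Finance"),
    (52, "Katherine", 36, "Mumbai", 575, "Production"),
    (60, "Jack", 40, "Noida", 678, "Testing"),
}

# Decomposition from the example: both tables keep EMP_ID.
employee = {(e, n, a, c) for e, n, a, c, _d, _dn in employee_department}
department = {(d, e, dn) for e, _n, _a, _c, d, dn in employee_department}

joined = {
    (e, n, a, c, d, dn)
    for e, n, a, c in employee
    for d, e2, dn in department
    if e == e2
}
is_lossless = joined == employee_department

# Hypothetical counter-example: the tables share only EMP_CITY, which is
# not a key of either table.
by_city = {(c, d, dn) for _e, _n, _a, c, d, dn in employee_department}
lossy_join = {
    (e, n, a, c, d, dn)
    for e, n, a, c in employee
    for c2, d, dn in by_city
    if c == c2
}
spurious = lossy_join - employee_department

print(is_lossless, len(lossy_join), len(spurious))
```

Joining on EMP_ID returns exactly the five original rows. Joining on EMP_CITY returns seven rows, because the two Mumbai employees are each paired with both Mumbai departments, and the two extra rows are spurious tuples that were never in the original relation.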


Dependency Preserving

•It is an important constraint of the database.
•In dependency preservation, every dependency must be satisfied by
at least one decomposed table.
•If a relation R is decomposed into relations R1 and R2, then each
dependency of R must either be a part of R1 or R2 or be
derivable from the combination of the functional dependencies of R1
and R2.
•For example, suppose there is a relation R (A, B, C, D) with
functional dependency set (A->BC). The relation R is decomposed
into R1(ABC) and R2(AD), which is dependency preserving because
FD A->BC is a part of relation R1(ABC).
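Dependency preservation can also be tested without joining any tables. The Python sketch below follows a commonly taught procedure: starting from the left side of a dependency, repeatedly add whatever can be derived while looking at one decomposed relation at a time, then check whether the right side was reached. The function names are hypothetical, and the second test case (A → B, B → C split into AB and AC) is an assumed counter-example that does not come from the slides.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves(fd, fds, decomposition):
    """Check whether fd can be enforced using the decomposed relations alone."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for relation in decomposition:
            # Attributes derivable from what is visible inside this relation,
            # restricted again to the attributes the relation contains.
            gained = closure(result & relation, fds) & relation
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


# Example from the text: R(A, B, C, D), A -> BC, split into R1(ABC), R2(AD).
fds = [({"A"}, {"B", "C"})]
decomposition = [{"A", "B", "C"}, {"A", "D"}]
all_preserved = all(preserves(fd, fds, decomposition) for fd in fds)

# Assumed counter-example: R(A, B, C), A -> B, B -> C, split into AB and AC.
fds2 = [({"A"}, {"B"}), ({"B"}, {"C"})]
decomposition2 = [{"A", "B"}, {"A", "C"}]
b_to_c_preserved = preserves(fds2[1], fds2, decomposition2)

print(all_preserved, b_to_c_preserved)
```

In the counter-example B and C never appear together in one table, and nothing derivable inside either table links them, so B → C is lost even though that decomposition is lossless.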
Fifth normal form (5NF)
•A relation is in 5NF if it is in 4NF, does not contain any join
dependency, and can be joined losslessly.
•5NF is satisfied when all the tables are broken into as many tables
as possible in order to avoid redundancy.
•5NF is also known as Project-join normal form (PJ/NF).
In the above table, John takes both the Computer and Math classes for
Semester 1, but he doesn't take the Math class for Semester 2.
In this case, the combination of all these fields is required to
identify valid data.

Suppose we add a new semester, Semester 3, but do not know
about the subject and who will be taking that subject, so we leave
Lecturer and Subject as NULL. But all three columns together act
as the primary key, so we can't leave the other two columns blank.

So to make the above table into 5NF, we can decompose it into
three relations P1, P2 & P3.
Summary
Normal Form Description
1NF A relation is in 1NF if every attribute contains only atomic values.
2NF A relation is in 2NF if it is in 1NF and all non-key
attributes are fully functionally dependent on the primary key.
3NF A relation is in 3NF if it is in 2NF and no transitive
dependency exists.
BCNF A stronger definition of 3NF is known as Boyce Codd's normal
form.
4NF A relation is in 4NF if it is in Boyce Codd's normal form
and has no non-trivial multi-valued dependency other than on a super key.
5NF A relation is in 5NF if it is in 4NF, does not contain any join
dependency, and joining is lossless.
