
Database Unit 4 Normalization

This document covers database normalization, its importance, and the various normal forms (1NF, 2NF, 3NF, BCNF, and 4NF) to reduce data redundancy and improve data integrity. It discusses functional dependencies, types of anomalies (insertion, deletion, modification), and the advantages and disadvantages of normalization. The document also provides examples and explanations of each normal form and the conditions required to achieve them.

Uploaded by

pravesh koirala

UNIT IV DATABASE NORMALIZATION

4.1 Definition and Importance of Normalization
4.2 Functional Dependencies
4.3 Normalization: 1NF, 2NF, 3NF, BCNF, and 4NF

4.1 Definition and Importance of Normalization
Database normalization, or simply normalization, is the process of restructuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity.
Data redundancy is the storage of the same piece of data in more than one place, beyond what is needed to represent the actual data.
Data integrity is the maintenance of, and the assurance of the accuracy and consistency of, data over its entire life cycle.
Normalization divides larger tables into smaller tables and links them using relationships.
Need for Data Normalization
• By removing errors and anomalies, normalization simplifies what are usually complicated information analyses.
• This results in a well-functioning system full of quality, reliable, and usable data.
• Since data normalization makes your workflows and teams more efficient, you can dedicate more resources to increasing your data extraction capabilities.
• As a result, more quality data enters your system and you get better insights on important aspects, enabling you to make more low-risk, data-backed decisions. Ultimately, you see major improvements in how your company is run.
Advantages of Normalization
• A smaller database can be maintained, as normalization eliminates duplicate data. The overall size of the database is reduced as a result.
• As databases become smaller, passes through the data become faster and shorter, thereby improving response time and speed.
• Redundant fields or columns are avoided.
• The data structure is more flexible, i.e. we should be able to add new rows and data values easily.
• The data is better understood.
• The data structure is easier to maintain, i.e. it is easy to perform operations, and complex queries can be handled easily.
Disadvantages of Normalization
• Database systems are complex, difficult, and time-consuming to design.
• Substantial hardware and software start-up costs.
• Initial training is required for all programmers and users.
• On normalizing the relations to higher normal forms, e.g. 4NF and 5NF, the performance degrades.
• Normalizing relations of higher degree is a very time-consuming and difficult process.
• Careless decomposition may lead to a bad database design, which may lead to serious problems.
4.2 Functional Dependencies
• A functional dependency in a DBMS, as the name suggests, is a relationship between attributes of a table in which one attribute depends on another.
• Introduced by E. F. Codd, it helps prevent data redundancy and reveals bad designs.
• For example, consider the following table.
• In this example, if we know the value of Employee number, we can obtain Employee Name, city, salary, etc.
• From this, we can say that city, Employee Name, and salary are functionally dependent on Employee number.
• A functional dependency is denoted by an arrow →.
• The functional dependency of Y on X is represented by X → Y.
• If column A of a table uniquely identifies column B of the same table, this can be represented as A → B (attribute B is functionally dependent on attribute A).
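As a minimal sketch of the definition above, the following Python function tests whether X → Y is respected by a set of rows. The function name holds, the column names, and the employee rows are hypothetical and are not the table from the slide. Note that sample data can only disprove a dependency; whether it truly holds is a property of the schema.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs (lhs -> rhs)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The same determinant value must always map to the same dependent value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical employee rows, used only for illustration.
employees = [
    {"emp_no": 1, "emp_name": "Dana", "city": "Pokhara", "salary": 4000},
    {"emp_no": 2, "emp_name": "Bob", "city": "Kathmandu", "salary": 5000},
    {"emp_no": 3, "emp_name": "Dana", "city": "Lalitpur", "salary": 4500},
]

print(holds(employees, ["emp_no"], ["emp_name", "city", "salary"]))  # True
print(holds(employees, ["emp_name"], ["city"]))  # False
```

The second call returns False because the same emp_name appears with two different cities, so emp_name does not determine city in this data.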
Types of Functional Dependencies
• Multivalued dependency
• Trivial functional dependency
• Non-trivial functional dependency
• Transitive dependency
Multivalued Dependency
• A multivalued dependency occurs when two attributes in a table are independent of each other but both depend on a third attribute.
• A multivalued dependency involves at least two attributes that depend on a third attribute, which is why it always requires at least three attributes.
• Here the columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent of each other.
• In this case, these two columns are said to be multivalued dependent on BIKE_MODEL.
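One way to make "independent of each other" concrete: for every BIKE_MODEL, the rows must contain every combination of that model's colors and years. The sketch below checks this for a three-column relation. The function name mvd_holds and the sample rows are assumptions made for illustration, not data from the slide.

```python
from itertools import product


def mvd_holds(rows, x, y, z):
    """Check x ->> y in a relation whose only other attribute is z."""
    groups = {}
    for row in rows:
        groups.setdefault(row[x], set()).add((row[y], row[z]))
    for pairs in groups.values():
        ys = {p[0] for p in pairs}
        zs = {p[1] for p in pairs}
        # Independence means every (y, z) combination appears for this x value.
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical rows: each model appears with every color/year combination.
bikes = [
    {"BIKE_MODEL": "M1001", "MANUF_YEAR": 2007, "COLOR": "Black"},
    {"BIKE_MODEL": "M1001", "MANUF_YEAR": 2007, "COLOR": "Red"},
    {"BIKE_MODEL": "M2012", "MANUF_YEAR": 2008, "COLOR": "Black"},
    {"BIKE_MODEL": "M2012", "MANUF_YEAR": 2008, "COLOR": "Red"},
]

# Hypothetical counter-example: color and year are tied together for M1001.
tied = [
    {"BIKE_MODEL": "M1001", "MANUF_YEAR": 2007, "COLOR": "Black"},
    {"BIKE_MODEL": "M1001", "MANUF_YEAR": 2008, "COLOR": "Red"},
]

print(mvd_holds(bikes, "BIKE_MODEL", "COLOR", "MANUF_YEAR"))  # True
print(mvd_holds(tied, "BIKE_MODEL", "COLOR", "MANUF_YEAR"))  # False
```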
Trivial Functional Dependency
• A functional dependency is called trivial if the dependent attributes are already included in the determining set of attributes.
• So, X → Y is a trivial functional dependency if Y is a subset of X.
• In other words, a trivial functional dependency arises when an attribute is described as depending on a collection of attributes that includes the attribute itself.
• Consider a table with two columns, emp_id and emp_name: {emp_id, emp_name} → emp_name is trivial, because emp_name is a subset of {emp_id, emp_name}.
Transitive Dependency
• If non-primary-key attributes depend on other non-primary-key attributes, a transitive dependency occurs.
• A transitive dependency is a type of functional dependency that is formed indirectly by two functional dependencies.
• A transitive functional dependency exists when changing a non-key column might cause another non-key column to change.
• Consider the table: changing the non-key column Full Name may change Salutation.
• {Company} → {CEO} (if we know the company, we know its CEO's name)
• {CEO} → {Age} (if we know the CEO, we know the CEO's age)
• Therefore, according to the rule of transitive dependency, {Company} → {Age} should hold. That makes sense, because if we know the company name, we can find its CEO's age.
Note: A transitive dependency can only occur in a relation of three or more attributes.
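The Company, CEO, and Age chain can be followed mechanically by computing an attribute closure: start from a set of attributes and keep adding whatever the known dependencies let us derive. The sketch below is one common way to do this; the function name closure and the representation of dependencies as pairs of sets are choices made here for illustration.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [({"Company"}, {"CEO"}), ({"CEO"}, {"Age"})]

print(sorted(closure({"Company"}, fds)))  # ['Age', 'CEO', 'Company']
print(sorted(closure({"CEO"}, fds)))  # ['Age', 'CEO']
```

Age appears in the closure of {Company} even though no dependency lists it directly, which is exactly the transitive dependency {Company} → {Age}.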
Database Anomalies
• Database anomalies are problems in relations that occur due to redundancy in the relations.
• They can occur in poorly planned, un-normalised databases where all the data is stored in one table.
• These anomalies affect the process of inserting, deleting, and modifying data in the relations.
• Some important data may be lost when a relation that contains anomalies is updated.
• It is important to remove these anomalies so that the relations can be processed without any problem.
Types of Anomalies
• Insertion Anomalies
• Deletion Anomalies
• Modification Anomalies
Insertion Anomalies
• An insertion anomaly occurs when certain attributes cannot be inserted into the database without the presence of other attributes.
• For example, we can't add a new course unless we have at least one student enrolled on the course.
• If we want to add a new course, the student details would have to be null. So, a course can't be inserted without student details. This scenario forms an insertion anomaly.
Deletion Anomalies
• A deletion anomaly exists when certain attributes are lost because of the deletion of other attributes.
• For example, consider what happens if Student S13 is the last student to leave the course: all information about the course is lost.
Modification Anomalies
• A modification anomaly occurs when a record in the relation is updated.
• In this anomaly, modifying the value of a specific attribute requires modifying every record in which that value occurs.
• For example, if we update the cid of a student, then we need to update the cname of that student too.
So, the normalization process is required to eliminate anomalies from the database.
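The modification and deletion anomalies can be reproduced with a small in-memory SQLite table that keeps student and course details together. This is only a sketch: the table name enrolment, its columns, and all rows except the student id S13 are assumptions made for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE enrolment (sid TEXT, sname TEXT, cid TEXT, cname TEXT)")
con.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?, ?)",
    [
        ("S11", "Asha", "C1", "Databases"),
        ("S12", "Bikram", "C1", "Databases"),
        ("S13", "Chandra", "C2", "Networks"),
    ],
)

# Modification anomaly: renaming the course in only one row leaves C1 with two names.
con.execute("UPDATE enrolment SET cname = 'DBMS' WHERE sid = 'S11'")
names_for_c1 = con.execute(
    "SELECT COUNT(DISTINCT cname) FROM enrolment WHERE cid = 'C1'"
).fetchone()[0]
print(names_for_c1)  # 2

# Deletion anomaly: removing the last student of C2 also removes the course itself.
con.execute("DELETE FROM enrolment WHERE sid = 'S13'")
c2_rows = con.execute(
    "SELECT COUNT(*) FROM enrolment WHERE cid = 'C2'"
).fetchone()[0]
print(c2_rows)  # 0
```

With a separate course table, the course name would be stored once and would survive the deletion of any student.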
First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow these rules:
a) It should only have single (atomic) valued attributes/columns.
b) All the columns in a table should have unique names.
• Consider the following table Student.
• The above table does not satisfy 1NF because the column phone contains multiple values. Hence, we need to create a new table, contact, to store phone numbers.
• The following are the normalized tables that satisfy 1NF.
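As a rough illustration of the split described above, the Python sketch below turns a multi-valued phone column into one row per phone number in a separate contact list. The column names sid and sname, the comma-separated storage, and the sample values are assumptions, since the original tables are not reproduced here.

```python
# Hypothetical un-normalized rows: the phone column holds several values.
students = [
    {"sid": 1, "sname": "Ram", "phone": "9800000001, 9800000002"},
    {"sid": 2, "sname": "Maya", "phone": "9800000003"},
]

# Student table without the multi-valued column.
student = [{"sid": s["sid"], "sname": s["sname"]} for s in students]

# Contact table: one atomic phone value per row, linked back by sid.
contact = [
    {"sid": s["sid"], "phone": p.strip()}
    for s in students
    for p in s["phone"].split(",")
]

for row in contact:
    print(row)
```

Every value in contact is now atomic, and a student can have any number of phone numbers without changing the table structure.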
Second Normal Form (2NF)
For a table to be in the Second Normal Form, it must satisfy two conditions:
a) The table should be in the First Normal Form.
b) There should be no partial dependency.
• A partial functional dependency can occur only in relations with composite keys.
• A partial functional dependency occurs when one or more non-key attributes depend on only a part of the primary key.
Example: Table: Stud_id, Course_id, Stud_name, Course_name
Where: Primary Key = Stud_id + Course_id
To determine the name of a student we use only Stud_id, which is part of the primary key: {Stud_id} → {Stud_name}
Hence, Stud_name depends on only a part of the primary key (Stud_id). This is called partial dependency.
Our Student table does not satisfy 2NF, because its primary key is sid + sub_id.
Here, to determine the name of a subject, we use only sub_id, which is a part of the primary key. Hence, sub_name is partially dependent on the primary key.
So, we need to remove the partial dependency from the Student table and create a new table, subject, to store subject details.
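A partial dependency can be looked for by testing every proper subset of the composite key against every non-key column. The sketch below does this on sample rows. The function names, the marks column, and the data are hypothetical, and a check on sample data can only suggest a dependency, not prove that it holds for the schema.

```python
from itertools import combinations


def determines(rows, lhs, attr):
    """True if rows that agree on lhs always agree on attr."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        if seen.setdefault(key, row[attr]) != row[attr]:
            return False
    return True


def partial_dependencies(rows, key):
    """List (key_part, attribute) pairs where part of the key determines attribute."""
    non_key = [a for a in rows[0] if a not in key]
    found = []
    for size in range(1, len(key)):
        for part in combinations(key, size):
            for attr in non_key:
                if determines(rows, part, attr):
                    found.append((part, attr))
    return found


# Hypothetical Student rows with composite key (sid, sub_id).
rows = [
    {"sid": 1, "sub_id": 10, "sname": "Ram", "sub_name": "DBMS", "marks": 80},
    {"sid": 1, "sub_id": 11, "sname": "Ram", "sub_name": "OS", "marks": 75},
    {"sid": 2, "sub_id": 10, "sname": "Maya", "sub_name": "DBMS", "marks": 90},
]

print(partial_dependencies(rows, ["sid", "sub_id"]))
# [(('sid',), 'sname'), (('sub_id',), 'sub_name')]
```

In this sample, marks needs the whole key, while sname and sub_name each depend on one part of it, so they are the columns to move into their own tables.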
Third Normal Form (3NF)
The official qualifications for 3NF are:
a) The table is already in 2NF.
b) Non-primary-key attributes do not depend on other non-primary-key attributes (i.e. no transitive dependencies).
• All transitive dependencies are removed by moving the dependent attributes to another table.
• Consider the following student table.
• In the above table, there is a transitive dependency between sname and salutation. Both are non-primary-key attributes. A change in sname might cause a change in salutation. For example, if we change the name Ram to Maya, then we need to change the salutation too.
Fourth Normal Form (4NF)
• A table is in 4NF if it is in BCNF (and therefore in 3NF) and has no non-trivial multivalued dependencies other than those whose determinant is a super key.
• A multivalued dependency exists when there are at least 3 attributes (like X, Y and Z) in a relation and, for a value of X, there is a well-defined set of values of Y and a well-defined set of values of Z. However, the set of values of Y is independent of the set of values of Z, and vice versa.
• Suppose a student can have more than one subject and more than one activity.
• Note that all three attributes make up the primary key.
• Note that Student_Id can be associated with many subjects as well as many activities. This scenario is a multivalued dependency.
• Databases with multivalued dependencies thus exhibit redundancy.
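The usual remedy is to split the relation into one table per independent fact. The sketch below, using hypothetical student, subject, and activity values, splits the three-column relation into two and confirms that joining them on the student id gives back the original rows.

```python
# Hypothetical (student_id, subject, activity) rows with all three columns as key.
rows = {
    ("S1", "Maths", "Football"),
    ("S1", "Maths", "Chess"),
    ("S1", "Science", "Football"),
    ("S1", "Science", "Chess"),
    ("S2", "Maths", "Music"),
}

# Decompose into two tables, one per independent multivalued fact.
student_subject = {(s, sub) for s, sub, _ in rows}
student_activity = {(s, act) for s, _, act in rows}

# Natural join on student_id to confirm that nothing was lost or invented.
rejoined = {
    (s, sub, act)
    for s, sub in student_subject
    for s2, act in student_activity
    if s == s2
}

print(len(rows), len(student_subject), len(student_activity))  # 5 3 3
print(rejoined == rows)  # True
```

Each subject and each activity is now recorded once per student instead of once per combination.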
Boyce-Codd Normal Form (BCNF)
• Boyce-Codd Normal Form (BCNF) is one of the forms of database normalization. A database table is in BCNF if and only if there are no non-trivial functional dependencies of attributes on anything other than a superset of a candidate key (that is, a super key).
• BCNF is also sometimes referred to as 3.5NF, or 3.5 Normal Form.
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:
a) It should be in the Third Normal Form.
b) For any non-trivial dependency A → B, A should be a super key.
The second point sounds a bit tricky, right? In simple words, it means that for a dependency A → B, A cannot be a non-prime attribute if B is a prime attribute.
Below we have a college enrolment table with columns student_id, subject and professor.
• In the above table, student_id and subject together form the primary key, because using student_id and subject we can find all the columns of the table.
• Also, there is a dependency between subject and professor, where subject depends on the professor name.
• This table satisfies the 1st Normal Form because all the values are atomic, column names are unique, and all the values stored in a particular column are of the same domain.
• This table also satisfies the 2nd Normal Form as there is no partial dependency.
• And there is no transitive dependency, hence the table also satisfies the 3rd Normal Form.
• But this table is not in Boyce-Codd Normal Form.
Why doesn't this table satisfy BCNF?
• In the table above, student_id and subject form the primary key, which means the subject column is a prime attribute.
• But there is one more dependency, professor → subject.
• And while subject is a prime attribute, professor is a non-prime attribute, which is not allowed by BCNF.
• To make this relation (table) satisfy BCNF, we will decompose this table into two tables, a student table and a professor table.
• Below we have the structure for both the tables.
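The super key condition can be checked mechanically: a dependency violates BCNF when it is non-trivial and the closure of its left side does not cover every attribute. The sketch below applies this to the enrolment example. The function names are chosen here for illustration, and the layout assumed for the professor table (professor, subject) is a guess, since the slide's table structures are not reproduced.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the dependencies in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial dependencies whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(attributes)
    ]


attributes = {"student_id", "subject", "professor"}
fds = [
    ({"student_id", "subject"}, {"professor"}),
    ({"professor"}, {"subject"}),
]
print(bcnf_violations(attributes, fds))  # [({'professor'}, {'subject'})]

# Assumed professor table after decomposition: (professor, subject).
professor_fds = [({"professor"}, {"subject"})]
print(bcnf_violations({"professor", "subject"}, professor_fds))  # []
```

In the original table, professor determines subject but not student_id, so professor is not a super key. In the assumed professor table, professor determines every column, so the same dependency no longer violates BCNF.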
The End. Any queries?
