
Unit II-Normalization

Normalization is a multi-step process aimed at eliminating data redundancy and enhancing data integrity in databases. It consists of several levels, known as normal forms, including First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF), each with specific guidelines to organize data effectively. The document also discusses examples and the importance of lossless decomposition in maintaining data integrity during normalization.

Uploaded by

Pradisha P

Normalization

Normalization is the process of eliminating data redundancy and enhancing data integrity in a table.

Normalization also helps to organize the data in the database.

It is a multi-step process that puts the data into tabular form and removes duplicated data from the relational tables.
Levels of Normalization

Normalization works through a series of stages called normal forms.

There are several levels of normalization, and each normal form has its own set of guidelines.

First Normal Form (1NF):

This is the most basic level of normalization. In 1NF, each table cell should contain only a single value, and each column should have a unique name. The first normal form helps to eliminate duplicate data and simplify queries.
1st Normal Form (1NF)

· A table is in First Normal Form if every cell in it is atomic.

· Atomicity means that a single cell cannot hold multiple values. It must hold only a single-valued attribute.

· The First Normal Form disallows multi-valued attributes, composite attributes, and their combinations.
Example - 1NF
Student Table
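
The atomicity rule can be sketched in a few lines of Python. The slide's Student Table did not survive extraction, so the rows, column names, and the helper `to_1nf` below are hypothetical and only illustrate the idea: a cell holding several comma-separated courses is split so that each row carries exactly one value per cell.

```python
# Hypothetical rows (not from the slide): the "courses" cell holds
# several values at once, which violates 1NF.
unnormalized = [
    {"stu_id": 1, "name": "Asha", "courses": "Math, Physics"},
    {"stu_id": 2, "name": "Ravi", "courses": "Chemistry"},
]

def to_1nf(rows, column):
    """Return one row per single value of `column`, so every cell is atomic."""
    result = []
    for row in rows:
        for value in row[column].split(","):
            atomic_row = dict(row)          # copy the other columns unchanged
            atomic_row[column] = value.strip()
            result.append(atomic_row)
    return result

students_1nf = to_1nf(unnormalized, "courses")
for row in students_1nf:
    print(row)
```

After the split, stu_id alone no longer identifies a row, so the key of the 1NF table becomes the pair (stu_id, courses).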
Second Normal Form (2NF)

Second Normal Form (2NF):

2NF eliminates redundant data by requiring that each non-key attribute depend on the whole primary key. This means that each column should be related to the entire primary key, and not to only part of a composite key.

The first condition for a table to be in Second Normal Form is that it is already in First Normal Form. The second is that the table has no partial dependency, that is, no non-key attribute depends on only part of a composite primary key.
Example - 2NF
Location Table

Composite primary key: cust_id, storeid

Non-key attribute: store_location

The table does not fulfill the second normal form, because store_location depends only on storeid, which is just part of the composite primary key.

To remove the partial functional dependency from the location table, split the table into two parts.
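
A minimal sketch of that split, assuming store_location depends only on storeid. The column names come from the slide; the row values and the helper `project` are hypothetical.

```python
# Hypothetical rows for the location table; the key is (cust_id, storeid).
location = [
    {"cust_id": 1, "storeid": "D1", "store_location": "Toronto"},
    {"cust_id": 2, "storeid": "D3", "store_location": "Miami"},
    {"cust_id": 3, "storeid": "D1", "store_location": "Toronto"},
]

def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate rows but keeping order."""
    seen, result = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result

# store_location depends only on storeid, so it moves to its own table.
customer_store = project(location, ["cust_id", "storeid"])
store = project(location, ["storeid", "store_location"])

print(customer_store)
print(store)
```

The location of store D1 is now stored once instead of once per customer.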
Third Normal Form (3NF)
Third Normal Form (3NF):

3NF builds on 2NF by requiring that all non-key attributes are independent of each other. This means that each column should be directly related to the primary key, and not to any other columns in the same table.

The main condition is that there should be no transitive dependency for non-prime attributes, which means that non-prime attributes should not depend on other non-prime attributes in a table.

A transitive dependency is a functional dependency in which A → C (A determines C) holds indirectly, because A → B and B → C.
Example - 3NF

In the student table, stu_id determines subid, and subid determines sub. Hence stu_id determines sub only transitively, through subid, so the table is not in 3NF.
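
The transitive chain can be checked on sample data and then removed. In this sketch the column names stu_id, subid, and sub follow the slide, while the rows and the helper `fd_holds` are hypothetical. Note that sample rows can only refute a functional dependency, never prove that it holds in general.

```python
# Hypothetical student rows; column names follow the slide.
student = [
    {"stu_id": 1, "subid": 11, "sub": "SQL"},
    {"stu_id": 2, "subid": 12, "sub": "Java"},
    {"stu_id": 3, "subid": 11, "sub": "SQL"},
]

def fd_holds(rows, lhs, rhs):
    """Check lhs -> rhs on these rows: equal lhs values must give equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

# stu_id -> subid and subid -> sub both hold, so stu_id -> sub is transitive.
transitive = fd_holds(student, ["stu_id"], ["subid"]) and fd_holds(student, ["subid"], ["sub"])

# Removing the transitive dependency: sub moves to a table keyed by subid.
student_subject = [{"stu_id": r["stu_id"], "subid": r["subid"]} for r in student]
subject = {r["subid"]: r["sub"] for r in student}

print(transitive)
print(student_subject)
print(subject)
```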
Boyce-Codd Normal Form (BCNF)

BCNF is a standard for organizing database tables to minimize data repetition.

BCNF is the advanced version of 3NF. It is stricter than 3NF.

A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the table.

In other words, for BCNF the table should be in 3NF, and the left-hand side (LHS) of every functional dependency (FD) should be a super key.
Example

A company where employees work in more than one department.

EMP_ID  EMP_COUNTRY  EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
264     India        Designing   D394       283
264     India        Testing     D394       300
364     UK           Stores      D283       232
364     UK           Developing  D283       549

Functional dependencies

EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
BCNF Conversion

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key, yet each appears on the left-hand side of a functional dependency.

To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table
EMP_DEPT table
EMP_DEPT_MAPPING table
Three Tables

EMP_COUNTRY table

EMP_ID  EMP_COUNTRY
264     India
364     UK

EMP_DEPT table

EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
Designing   D394       283
Testing     D394       300
Stores      D283       232
Developing  D283       549

EMP_DEPT_MAPPING table

EMP_ID  EMP_DEPT
264     Designing
264     Testing
364     Stores
364     Developing
Functional Dependencies

Functional dependencies:

EMP_ID → EMP_COUNTRY
EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:

For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}

The resulting tables are in BCNF because the left-hand side of both functional dependencies is a key.
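
The super key test can be sketched with an attribute closure: X is a super key of a table when the closure of X contains every attribute of that table. The functional dependencies and table layouts are the ones listed in the example; the function names `closure` and `is_bcnf` are illustrative. As a simplification, the check looks only at the listed dependencies restricted to each table, not at every dependency that could be derived from them.

```python
def closure(attrs, fds):
    """Return every attribute determined by `attrs` under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(attributes, fds):
    """True if each listed FD that applies non-trivially to the table has a super key on its LHS."""
    attributes = set(attributes)
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        nontrivial = (rhs & attributes) - lhs   # part of the RHS that lives in this table
        if lhs <= attributes and nontrivial and not attributes <= closure(lhs, fds):
            return False
    return True

fds = [
    ({"EMP_ID"}, {"EMP_COUNTRY"}),
    ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"}),
]
original = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
decomposed = {
    "EMP_COUNTRY": {"EMP_ID", "EMP_COUNTRY"},
    "EMP_DEPT": {"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"},
    "EMP_DEPT_MAPPING": {"EMP_ID", "EMP_DEPT"},
}

print(is_bcnf(original, fds))  # False: EMP_ID alone is not a super key
print({name: is_bcnf(attrs, fds) for name, attrs in decomposed.items()})
```

The closure of {EMP_ID, EMP_DEPT} covers all five attributes, which is why that pair is the candidate key of the original table.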
Fourth normal form (4NF)

Multivalued dependencies are handled by 4NF.

A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.

A multi-valued dependency A →→ B exists when, for a single value of A, multiple values of B exist and those values are independent of the other attributes in the relation.
Example
STUDENT

STU_ID  COURSE     HOBBY
21      Computer   Dancing
21      Math       Singing
34      Chemistry  Dancing
74      Biology    Cricket
59      Physics    Hockey

The STUDENT table is in 3NF, but COURSE and HOBBY are two independent entities. Hence, there is no relationship between COURSE and HOBBY.

To bring the STUDENT table into 4NF, decompose it into two tables:

STUDENT_COURSE
STUDENT_HOBBY
Decomposition

STUDENT_COURSE

STU_ID  COURSE
21      Computer
21      Math
34      Chemistry
74      Biology
59      Physics

STUDENT_HOBBY

STU_ID  HOBBY
21      Dancing
21      Singing
34      Dancing
74      Cricket
59      Hockey
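
The two projections, and what joining them back produces, can be sketched as follows. The rows are those of the STUDENT table in the example; the variable names are illustrative.

```python
# Rows of the STUDENT table from the example: (STU_ID, COURSE, HOBBY).
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]

# Project onto (STU_ID, COURSE) and (STU_ID, HOBBY), removing duplicates.
student_course = sorted({(s, c) for s, c, _ in student})
student_hobby = sorted({(s, h) for s, _, h in student})

# Natural join on STU_ID: every course of a student pairs with every hobby.
rejoined = sorted(
    (s, c, h)
    for s, c in student_course
    for s2, h in student_hobby
    if s == s2
)

print(student_course)
print(student_hobby)
print(rejoined)
```

For student 21 the join yields all four course and hobby combinations, whereas the STUDENT table lists only two of them. If COURSE and HOBBY really are independent, as the example states, the two smaller tables carry the full information and the extra combinations are implied by that independence.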
Fifth Normal Form (5NF)

Join dependencies are handled by 5NF.

A relation is in 5NF if it is in 4NF, contains no join dependency other than those implied by its candidate keys, and can be rejoined losslessly.

5NF is satisfied when all the tables are broken into as many tables as possible in order to avoid redundancy.

5NF is also known as Project-join normal form (PJ/NF).
Lossless Decomposition

If no information is lost from the relation that is decomposed, then the decomposition is lossless.

A lossless decomposition guarantees that joining the decomposed relations yields the same relation that was decomposed.

A decomposition is said to be lossless if the natural join of all the decomposed relations gives the original relation.
EMPLOYEE_DEPARTMENT table

EMP_ID  EMP_NAME   EMP_AGE  EMP_CITY   DEPT_ID  DEPT_NAME
22      Denim      28       Mumbai     827      Sales
33      Alina      25       Delhi      438      Marketing
46      Stephan    30       Bangalore  869      Finance
52      Katherine  36       Mumbai     575      Production
60      Jack       40       Noida      678      Testing

The relation is decomposed into two relations, EMPLOYEE and DEPARTMENT.
Employee and Department

EMPLOYEE table

EMP_ID  EMP_NAME   EMP_AGE  EMP_CITY
22      Denim      28       Mumbai
33      Alina      25       Delhi
46      Stephan    30       Bangalore
52      Katherine  36       Mumbai
60      Jack       40       Noida

DEPARTMENT table

DEPT_ID  EMP_ID  DEPT_NAME
827      22      Sales
438      33      Marketing
869      46      Finance
575      52      Production
678      60      Testing
Joining Operation

When these two relations are joined on the common column "EMP_ID", the resultant relation is:

EMP_ID  EMP_NAME   EMP_AGE  EMP_CITY   DEPT_ID  DEPT_NAME
22      Denim      28       Mumbai     827      Sales
33      Alina      25       Delhi      438      Marketing
46      Stephan    30       Bangalore  869      Finance
52      Katherine  36       Mumbai     575      Production
60      Jack       40       Noida      678      Testing

Hence, the decomposition is a lossless join decomposition.
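
The lossless property can be checked mechanically: project the relation onto the two column sets, take the natural join, and compare the result with the original. The sketch below uses the EMPLOYEE_DEPARTMENT rows from the example; the helpers `project`, `natural_join`, and `as_set` are illustrative, and `natural_join` assumes both inputs are non-empty.

```python
columns = ["EMP_ID", "EMP_NAME", "EMP_AGE", "EMP_CITY", "DEPT_ID", "DEPT_NAME"]
employee_department = [
    (22, "Denim", 28, "Mumbai", 827, "Sales"),
    (33, "Alina", 25, "Delhi", 438, "Marketing"),
    (46, "Stephan", 30, "Bangalore", 869, "Finance"),
    (52, "Katherine", 36, "Mumbai", 575, "Production"),
    (60, "Jack", 40, "Noida", 678, "Testing"),
]
rows = [dict(zip(columns, values)) for values in employee_department]

def project(rows, cols):
    """Project onto `cols`, removing duplicate rows."""
    unique = {tuple(row[c] for c in cols) for row in rows}
    return [dict(zip(cols, values)) for values in unique]

def natural_join(left, right):
    """Join two non-empty relations on all the columns they share."""
    common = [c for c in left[0] if c in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[c] == r[c] for c in common)
    ]

def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

employee = project(rows, ["EMP_ID", "EMP_NAME", "EMP_AGE", "EMP_CITY"])
department = project(rows, ["DEPT_ID", "EMP_ID", "DEPT_NAME"])

lossless = as_set(natural_join(employee, department)) == as_set(rows)
print(lossless)  # True
```

The join is lossless here because the shared column EMP_ID is a key of EMPLOYEE. Splitting on a non-key column such as EMP_CITY would instead produce extra rows for the two employees in Mumbai.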
END of the Session
