0% found this document useful (0 votes)
39 views24 pages

Normalization Forms - Intro To DB

Data normalization is the process of structuring data to avoid duplication and inconsistencies. It involves decomposing relations with anomalies into smaller, well-structured relations. The goals are to achieve first, second, and third normal form which minimize redundancy and avoid anomalies like insertion, deletion, and modification anomalies. Higher normal forms like Boyce-Codd, fourth, and fifth further reduce anomalies by eliminating additional dependencies between attributes.

Uploaded by

тнє Sufi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
39 views24 pages

Normalization Forms - Intro To DB

Data normalization is the process of structuring data to avoid duplication and inconsistencies. It involves decomposing relations with anomalies into smaller, well-structured relations. The goals are to achieve first, second, and third normal form which minimize redundancy and avoid anomalies like insertion, deletion, and modification anomalies. Higher normal forms like Boyce-Codd, fourth, and fifth further reduce anomalies by eliminating additional dependencies between attributes.

Uploaded by

тнє Sufi
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 24

Data Normalization

• Primarily a tool to validate and improve a


logical design so that it satisfies certain
constraints that avoid unnecessary
duplication of data
• The process of decomposing relations with
anomalies to produce smaller, well-
structured relations
Tanvir Ahmed 1
Well-Structured Relations
• A relation that contains minimal data redundancy
and allows users to insert, delete, and update rows
without causing data inconsistencies
• Goal is to avoid anomalies
– Insertion Anomaly – adding new rows forces user to
create duplicate data
– Deletion Anomaly – deleting rows may cause a loss of
data that would be needed for other future rows
– Modification Anomaly – changing data in a row
forces changes to other rows because of duplication

General rule of thumb: a table should not pertain to


more than one entity type
Tanvir Ahmed 2
Example – Figure 5.2b

Question – Is this a relation? Answer – Yes: unique rows and no multivalued


attributes

Question – What’s the primary key? Answer – Composite: Emp_ID,


Course_Title

Tanvir Ahmed 3
Anomalies in this Table
• Insertion – can’t enter a new employee without
having the employee take a class
• Deletion – if we remove employee 140, we lose
information about the existence of a Tax Acc class
• Modification – giving a salary increase to
employee 100 forces us to update multiple records
Why do these anomalies exist?
Because we’ve combined two themes (entity types)
into one relation. This results in duplication, and an
unnecessary dependency between the entities
Tanvir Ahmed 4
Tanvir Ahmed 5
Tanvir Ahmed 6
Tanvir Ahmed 7
Tanvir Ahmed 8
Tanvir Ahmed 9
Tanvir Ahmed 10
Tanvir Ahmed 11
Tanvir Ahmed 12
Functional Dependencies and Keys
• Functional Dependency: The value of one
attribute (the determinant) determines the
value of another attribute
• Candidate Key:
– A unique identifier. One of the candidate keys will
become the primary key
• E.g. perhaps there is both credit card number and SS# in
a table…in this case both are candidate keys
– Each non-key field is functionally dependent on
every candidate key

Tanvir Ahmed 13
5.22 -Steps in
normalization

Tanvir Ahmed 14
First Normal Form
• No multivalued attributes
• Every attribute value is atomic
• Fig. 5-2a is not in 1st Normal Form
(multivalued attributes)  it is not a relation
• Fig. 5-2b is in 1st Normal form
• All relations are in 1st Normal Form

Tanvir Ahmed 15
Second Normal Form
• 1NF plus every non-key attribute is fully
functionally dependent on the ENTIRE
primary key
– Every non-key attribute must be defined by the
entire key, not by only part of the key
– No partial functional dependencies
• Fig. 5-2b is NOT in 2nd Normal Form (see fig
5-23b)

Tanvir Ahmed 16
Fig 5.23(b) – Functional Dependencies
in EMPLOYEE2
Dependency on entire primary key

EmpID CourseTitle Name DeptName Salary DateCompleted

Dependency on only part of the key

EmpID, CourseTitle  DateCompleted


EmpID  Name, DeptName, Salary

Therefore, NOTTanvir
inAhmed
2nd Normal Form!! 17
Getting it into 2nd Normal Form
• See p193 – decomposed into two separate relations

Both are full


functional
EmpID Name DeptName Salary
dependencies

EmpID CourseTitle DateCompleted

Tanvir Ahmed 18
Third Normal Form
• 2NF PLUS no transitive dependencies
(one attribute functionally determines a
second, which functionally determines a
third)
• Fig. 5-24, 5-25

Tanvir Ahmed 19
Figure 5-24 -- Relation with transitive dependency
(a) SALES relation with simple data

Tanvir Ahmed 20
Figure 5-24(b) Relation with transitive dependency

CustID  Name
CustID  Salesperson BUT
CustID  Region
CustID  Salesperson  Region
All this is OK Transitive dependency
(2nd NF) (not 3rd NF)
Tanvir Ahmed 21
Figure 5.25 -- Removing a transitive dependency
(a) Decomposing the SALES relation

Tanvir Ahmed 22
Figure 5.25(b) Relations in 3NF

Salesperson  Region

CustID  Name
CustID  Salesperson

Now, there are no transitive dependencies…


Both relations are in 3rd NF
Tanvir Ahmed 23
Other Normal Forms
(from Appendix B)
• Boyce-Codd NF
– All determinants are candidate keys…there is no determinant
that is not a unique identifier
• 4th NF
– No multivalued dependencies
• 5th NF
– No “lossless joins”
• Domain-key NF
– The “ultimate” NF…perfect elimination of all possible
anomalies

Tanvir Ahmed 24

You might also like