Database Normalization 1 Running Head: Database Normalization
Database Normalization
Lyndsey Parish
2009-07-CIS-311-OL009
Abstract
This paper describes what database normalization is, what it does, and why it is important when
building a database. It also describes what happens to a database that has not been normalized, and
then describes the first (1NF), second (2NF), and third (3NF) normal forms.
Database Normalization
When designing a relational database, it is common at first to run into problems such as data
redundancy and data anomalies. To reduce these problems, database designers use a process called
normalization. The process eliminates data that is stored unnecessarily in more than one table, and
with it the inconsistency that such duplication causes. Normalization works through a series of
stages, each of which evaluates and corrects the table structure. Database normalization is
essentially the process of organizing the data in a database to increase its consistency and integrity.
Redundant data wastes time and space. If a data item exists in more than one place and needs to be
changed, it must be changed in exactly the same way in every place it exists. If it is not, the
result is data inconsistency: a customer, for example, could have two different addresses, leaving
the end user unsure which one is correct. Redundant data not only creates update anomalies, but can
also create insertion and deletion anomalies. These all pose a
The normalization process ensures that all the relations are well formed. To be considered
normalized, the relations must have certain characteristics: each table must represent a single
subject, no data item may be stored unnecessarily in more than one table, all nonprime attributes in
a table must depend on the primary key, and no table may have insertion, update, or deletion
anomalies. All these characteristics
The normalization stages are defined by rules called “normal forms.” Seven normal forms are usually
listed, but only the first three are in common use. A relational database design starts at first
normal form (1NF) and progresses to third normal form (3NF). First normal form is the least
restrictive; adding restrictions moves the design into second normal form. All relations in second
normal form are also in first normal form, because they still meet first normal form's
requirements. Likewise, all relations in third normal form are also in second and first normal
form. The pattern continues into the higher normal forms.
When normalizing a relational database, any table that qualifies as a relation is in 1NF. Each
cell can contain only a single value. Every entry in a column must be of the same kind, and every
column must have a unique name. The order of the columns and rows does not matter, as long as no two
rows are the same. Because these requirements are minimal, almost every table qualifies for and
begins at 1NF. A table in 1NF can still suffer from many modification anomalies, so normalization
must continue.
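The kind of modification anomaly that a 1NF table still allows can be sketched with a small example. The table, column names, and data below are hypothetical and are used only for illustration; the sketch assumes Python's built-in sqlite3 module and an in-memory database.

```python
import sqlite3

# Hypothetical orders table: every cell holds a single value, so it is in 1NF,
# but the customer's address is repeated on every order row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_name TEXT,"
    " customer_address TEXT,"
    " product TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Ann Lee", "12 Oak St", "Keyboard"),
        (2, "Ann Lee", "12 Oak St", "Monitor"),
    ],
)

# Update anomaly: the address is changed on one row but not on the other.
conn.execute(
    "UPDATE orders SET customer_address = '9 Elm Ave' WHERE order_id = 1"
)

# The same customer now appears with two different addresses.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT customer_address FROM orders"
        " WHERE customer_name = 'Ann Lee' ORDER BY customer_address"
    )
]
print(addresses)
```

After the partial update the query returns two addresses for one customer, which is the inconsistency that further normalization is meant to prevent.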
Second normal form (2NF) adds more restrictions, focusing on the removal of duplicate data. For a
relation to be in 2NF, it must first meet the requirements of 1NF. In addition, every non-key
attribute must depend on the entire primary key, not on just one component of a composite key.
Attributes that depend on only part of the key are moved into separate tables, which are related
back to the original table with a foreign key. To get rid of most of the remaining anomalies, every
determinant must be a key, a stricter condition than 2NF itself requires. Even after creating keys,
2NF can still have anomalies, so we
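A move from 1NF to 2NF might look like the following sketch. The tables and columns are hypothetical, chosen only to show a partial dependency: product_name depends on product_id alone, which is just one component of the composite key (order_id, product_id). The sketch again assumes Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: product_name depends only on product_id, which is just
    -- part of the composite key (order_id, product_id).
    CREATE TABLE order_items_1nf (
        order_id     INTEGER,
        product_id   INTEGER,
        product_name TEXT,
        quantity     INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_items_1nf VALUES
        (1, 10, 'Keyboard', 2),
        (2, 10, 'Keyboard', 1),
        (2, 20, 'Monitor', 1);

    -- After: the partially dependent attribute moves to its own table,
    -- and a foreign key relates the two tables.
    CREATE TABLE products (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT
    );
    CREATE TABLE order_items (
        order_id   INTEGER,
        product_id INTEGER REFERENCES products(product_id),
        quantity   INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO products
        SELECT DISTINCT product_id, product_name FROM order_items_1nf;
    INSERT INTO order_items
        SELECT order_id, product_id, quantity FROM order_items_1nf;
""")

# Each product name is now stored once, however many orders mention it.
product_rows = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
item_rows = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
print(product_rows, item_rows)
```

In this sketch the three order lines are kept, while the product names shrink from three stored copies to two rows, one per product.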
Third normal form (3NF) adds further restrictions to eliminate fields that do not depend directly
on the key. For a relation to be in 3NF, it must also qualify for 2NF. Anomalies can remain that
are caused by problems with keys and dependencies. Attributes in a record that depend on another
non-key attribute, rather than directly on the record's key, must be moved out of the table,
because they do not belong there. This process eliminates transitive dependencies, putting the
relational database in 3NF. A table in 3NF may still harbor anomalies; however, moving on to higher
normal forms may not be practical for many databases. While not moving onto higher normal forms may
not create the perfect
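Removing a transitive dependency can be sketched as follows. The employee and department tables are hypothetical: dept_name depends on dept_id, which in turn depends on the key emp_id, so dept_name depends on the key only transitively. The sketch assumes Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: emp_id -> dept_id -> dept_name is a transitive dependency.
    CREATE TABLE employees_2nf (
        emp_id    INTEGER PRIMARY KEY,
        emp_name  TEXT,
        dept_id   INTEGER,
        dept_name TEXT
    );
    INSERT INTO employees_2nf VALUES
        (1, 'Ann', 100, 'Sales'),
        (2, 'Bob', 100, 'Sales'),
        (3, 'Cy', 200, 'Support');

    -- After: dept_name lives in a table keyed by its own determinant.
    CREATE TABLE departments (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employees (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT,
        dept_id  INTEGER REFERENCES departments(dept_id)
    );
    INSERT INTO departments
        SELECT DISTINCT dept_id, dept_name FROM employees_2nf;
    INSERT INTO employees
        SELECT emp_id, emp_name, dept_id FROM employees_2nf;
""")

# Renaming a department now touches exactly one row.
cur = conn.execute(
    "UPDATE departments SET dept_name = 'Retail' WHERE dept_id = 100"
)
rows_changed = cur.rowcount

# A join reassembles the original view of the data.
joined = conn.execute(
    "SELECT e.emp_name, d.dept_name FROM employees e"
    " JOIN departments d ON d.dept_id = e.dept_id ORDER BY e.emp_id"
).fetchall()
print(rows_changed, joined)
```

The join needed to rebuild the combined view is also the cost of the design: each additional table is one more join for queries that want the data back together.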
Normalizing a database to 3NF is usually the most practical choice, because each round of
normalization creates more tables, and those tables require additional space. If the extra tables
take up too much space, normalization may not be worth the effort, because the database uses as
much space as it did when it held redundant data. If a database changes often, normalizing it too
much can also reduce its performance. Therefore, most of the time, the database designer must find a
balance
Database normalization is an important part of the database design process. The normal forms
determine to what degree the database is vulnerable to inconsistent data and data anomalies. The
higher the normal form, the less vulnerable the database is, meaning it has higher integrity and
consistency. However, too much normalization can reduce performance and increase the size of the
database. Therefore, the most commonly used normal form is third normal form, which for most
databases offers a practical balance of data integrity and performance.