Chapter 17 Normalization
Chapter 17 Normalization
Database Tables and Normalization Eliminate repeating groups, eliminate nulls by making sure No attribute is dependent on only portion of primary key
that each repeating group attribute contains an appropriate Conversion to Third Normal Form
Normalization data value Data anomalies created are easily eliminated by completing
Process for evaluating and correcting table structures to three steps
minimize data redundancies Step 2: Identify the Primary Key
o Reduces data anomalies Primary key must uniquely identify attribute value Step 1: Identify Each New Determinant
Works through a series of stages called normal forms: New key must be composed For every transitive dependency, write its determinant as PK for
new table
o First normal form (1NF)
Determinant. Any attribute whose value determines other values
o Second normal form (2NF) Step 3: Identify All Dependencies
within a row
o Third normal form (3NF) Dependencies can be depicted with help of a diagram
Dependency diagram: Step 2: Identify the Dependent Attributes
2NF is better than 1NF; 3NF is better than 2NF o Depicts all dependencies found within given table Identify attributes dependent on each determinant identified in
For most business database design purposes, 3NF is as structure Step 1 and identify dependency
high as we need to go in normalization process o Helpful in getting bird’s-eye view of all relationships Name table to reflect its contents and function
Highest level of normalization is not always most desirable among table’s attributes
o Makes it less likely that will overlook an important Step 3: Remove the Dependent Attributes from Transitive
The Need for Normalization dependency Dependencies
Example: Company that manages building projects Eliminate all dependent attributes in transitive relationship(s)
First normal form describes tabular format in which: from each of the tables that have such a transitive relationship
Charges its clients by billing hours spent on each contract
All key attributes are defined Draw new dependency diagram to show all tables defined in
Hourly billing rate is dependent on employee’s position
Steps 1–3
Periodically, report is generated that contains information There are no repeating groups in the table
Check new tables as well as tables modified in Step 3 to make
displayed in Table 5.1 All attributes are dependent on primary key
sure that each table has determinant and that no table contains
inappropriate dependencies
Structure of data set in Figure 5.1 does not handle data very All relational tables satisfy 1NF requirements
well A table is in third normal form (3NF) when both of the following are
The table structure appears to work; report generated with Some tables contain partial dependencies true:
ease Dependencies based on only part of the primary key It is in 2NF
Unfortunately, report may yield different results depending Sometimes used for performance reasons, but should be It contains no transitive dependencies
on what data anomaly has occurred used with caution
Still subject to data redundancies Improving the Design
The Normalization Process Table structures are cleaned up to eliminate troublesome initial
Each table represents a single subject Conversion to Second Normal Form partial and transitive dependencies
Relational database design can be improved by converting Normalization cannot, by itself, be relied on to make good
No data item will be unnecessarily stored in more than one
the database into second normal form (2NF) designs
table
It is valuable because its use helps eliminate data redundancies
All attributes in a table are dependent on the primary key
Two steps
Issues to address in order to produce a good normalized set of tables:
Conversion to First Normal Form Evaluate PK Assignments
Repeating group Step 1: Write Each Key Component on a Separate Line
Evaluate Naming Conventions
Derives its name from the fact that a group of multiple Write each key component on separate line, then write
Refine Attribute Atomicity
entries of same type can exist for any single key original (composite) key on last line
Identify New Attributes
attribute occurrence Each component will become key in new table
Identify New Relationships
Relational table must not contain repeating groups Refine Primary Keys as Required for Data Granularity
Normalizing table structure will reduce data redundancies Step 2: Assign Corresponding Dependent Attributes Maintain Historical Accuracy
Determine those attributes that are dependent on other Evaluate Using Derived Attributes
Normalization is three-step procedure attributes
At this point, most anomalies have been eliminated Surrogate Key Considerations
Step 1: Eliminate the Repeating Groups When primary key is considered to be unsuitable, designers use
Present data in tabular format, where each cell has single Table is in second normal form (2NF) when: surrogate keys
value and there are no repeating groups It is in 1NF and Data entries in Table 5.3 are inappropriate because they
It includes no partial dependencies: duplicate existing records
o Yet there has been no violation of either entity
integrity or referential integrity