Normalisation
Normalisation is defined briefly but accurately in the following statement:

The Key, the Whole Key and Nothing but the Key (so help me Codd).

The literature on normalisation typically covers many levels of normalisation (nine is not uncommon), but this seems to me to be a race amongst academics to identify as many levels as possible. In 99 cases out of 100, three levels of normalisation are all that is required.

1st Normal Form: Converting an un-normalised data structure, such as a report or an order form, into 1st Normal Form (1NF) is commonly referred to as removing repeating groups, but it may also involve removing complex groups such as the Address Group described in rule 2 (see chapter 5). The aim is to ensure that each item is atomic.

2nd Normal Form: Converting a 1NF data structure into 2nd Normal Form (2NF) involves looking at each non-primary key attribute and ensuring that it depends on the whole of the key and not just part of it.

3rd Normal Form: Converting a 2NF data structure into 3rd Normal Form (3NF) involves looking at the interrelationships between non-key attributes to see if any non-key attributes depend only on each other.

This is all best described by looking at an example. Consider the following table, which has been built up by an order entry clerk:
Cust#  Name  Ord#  Date  Part#  Desc  Qty  Price  Supp#  Name
             123   20/3  1      AA    2           23
                         2      BB    3           23
                         3      CC    4           24
             456   21/3  4      DD    5           25
                         5      EE    6           26
       John  789   21/3  4      DD    7           25
                         6      FF    8           27
This table structure could be implemented quite easily in Cobol or in a network DBMS as shown in chapter 5, with all the associated problems.
CUSTOMERS(Customer_Number, Customer_Name, (Order_Number, Order_Date, (Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)))
The internal brackets represent repeating groups and the underline represents a primary key. This is called an un-normalised or 0NF data structure.
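To make the nesting concrete, here is a minimal sketch of the same 0NF structure as nested Python records. The class and field names are my own, and the customer numbers, the name "Mary", the prices and the supplier names are invented for illustration; which customer placed order 456 is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PartLine:
    part_number: int
    description: str
    quantity: int
    price: float            # hypothetical value
    supplier_number: int
    supplier_name: str      # hypothetical value


@dataclass
class Order:
    order_number: int
    order_date: str
    parts: list = field(default_factory=list)   # repeating group of PartLine


@dataclass
class Customer:
    customer_number: int    # hypothetical value
    customer_name: str
    orders: list = field(default_factory=list)  # repeating group of Order


customers = [
    Customer(1, "Mary", [
        Order(123, "20/3", [
            PartLine(1, "AA", 2, 1.50, 23, "Supplier 23"),
            PartLine(2, "BB", 3, 2.00, 23, "Supplier 23"),
            PartLine(3, "CC", 4, 0.75, 24, "Supplier 24"),
        ]),
        Order(456, "21/3", [
            PartLine(4, "DD", 5, 3.25, 25, "Supplier 25"),
            PartLine(5, "EE", 6, 1.10, 26, "Supplier 26"),
        ]),
    ]),
    Customer(2, "John", [
        Order(789, "21/3", [
            PartLine(4, "DD", 7, 3.25, 25, "Supplier 25"),
            PartLine(6, "FF", 8, 4.00, 27, "Supplier 27"),
        ]),
    ]),
]

# Reaching any part line means walking customer -> order -> part.
part_lines = [
    (c.customer_number, o.order_number, p.part_number)
    for c in customers
    for o in c.orders
    for p in o.parts
]
print(len(part_lines), "part lines held inside", len(customers), "customer records")
```

Note that the details of part 4 are stored twice, once under each order that mentions it, which is exactly the kind of redundancy the following steps remove.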
There are two approaches to converting this 0NF structure to 1NF. The first involves replicating the values in the table as follows:
Ord#  Date  Part#  Desc  Qty  Supp#
123   20/3  1      AA    2    23
123   20/3  2      BB    3    23
123   20/3  3      CC    4    24
456   21/3  4      DD    5    25
456   21/3  5      EE    6    26
789   21/3  4      DD    7    25
789   21/3  6      FF    8    27
However this seems to be a clumsy approach and results in a three part key consisting of Cust#, Ord# and Part#. A simpler approach is to separate the repeating groups out into separate tables.
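CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number, Order_Date)
ORDER_PARTS(Order_Number, Part_Number, Part_Description, Part_Quantity, Part_Price, Supplier_Number, Supplier_Name)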
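The separation of repeating groups into CUSTOMERS, ORDERS and ORDER_PARTS tables can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `to_first_normal_form` is my own, and the customer numbers, the name "Mary", the prices and the supplier names in the sample data are invented.

```python
# Nested 0NF input:
# (cust_no, name, [(ord_no, date, [(part_no, desc, qty, price, supp_no, supp_name)])])
# Customer numbers, "Mary", prices and supplier names are hypothetical.
NESTED = [
    (1, "Mary", [
        (123, "20/3", [(1, "AA", 2, 1.50, 23, "Supplier 23"),
                       (2, "BB", 3, 2.00, 23, "Supplier 23"),
                       (3, "CC", 4, 0.75, 24, "Supplier 24")]),
        (456, "21/3", [(4, "DD", 5, 3.25, 25, "Supplier 25"),
                       (5, "EE", 6, 1.10, 26, "Supplier 26")]),
    ]),
    (2, "John", [
        (789, "21/3", [(4, "DD", 7, 3.25, 25, "Supplier 25"),
                       (6, "FF", 8, 4.00, 27, "Supplier 27")]),
    ]),
]


def to_first_normal_form(nested):
    """Split each repeating group into its own table of flat rows."""
    customers, orders, order_parts = [], [], []
    for cust_no, name, cust_orders in nested:
        customers.append((cust_no, name))
        for ord_no, date, parts in cust_orders:
            # The parent key travels down so the order can find its customer.
            orders.append((ord_no, cust_no, date))
            for part in parts:
                # Key of ORDER_PARTS is (Order_Number, Part_Number).
                order_parts.append((ord_no, *part))
    return customers, orders, order_parts


customers, orders, order_parts = to_first_normal_form(NESTED)
for row in order_parts:
    print(row)
```

Each inner group becomes a table of its own and carries the key of its parent, so no row needs a variable number of columns.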
The structure is now in 1NF since there are no repeating or complex group items (each item depends on the key). The next step is to convert the structure into 2NF, by examining each non primary key attribute to ensure that each depends on the whole of the key.
The CUSTOMERS and ORDERS tables each have a single column making up their primary key and are therefore by definition in 2NF. However, looking at the ORDER_PARTS table, it can be seen that Part_Description, Part_Price, Supplier_Number and Supplier_Name depend only on Part_Number, i.e. their values are the same regardless of Order_Number. (Part_Quantity depends on the whole of the key, since different quantities can appear on different orders.) To convert to 2NF, a separate table is created for part descriptions, prices and supplier details.
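CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number, Order_Date)
ORDER_PARTS(Order_Number, Part_Number, Part_Quantity)
PARTS(Part_Number, Part_Description, Part_Price, Supplier_Number, Supplier_Name)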
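The "whole of the key" test can be tried out on sample rows. The sketch below uses a hypothetical helper, `determines`, which reports whether a set of columns fixes the value of another column in the data given; prices and supplier names are invented. Bear in mind that sample data can only disprove a dependency. Whether a dependency really holds is a business rule, not something the rows can prove.

```python
COLUMNS = ("Order_Number", "Part_Number", "Part_Description", "Part_Quantity",
           "Part_Price", "Supplier_Number", "Supplier_Name")

# 1NF ORDER_PARTS rows. Prices and supplier names are hypothetical.
ORDER_PARTS = [dict(zip(COLUMNS, row)) for row in [
    (123, 1, "AA", 2, 1.50, 23, "Supplier 23"),
    (123, 2, "BB", 3, 2.00, 23, "Supplier 23"),
    (123, 3, "CC", 4, 0.75, 24, "Supplier 24"),
    (456, 4, "DD", 5, 3.25, 25, "Supplier 25"),
    (456, 5, "EE", 6, 1.10, 26, "Supplier 26"),
    (789, 4, "DD", 7, 3.25, 25, "Supplier 25"),
    (789, 6, "FF", 8, 4.00, 27, "Supplier 27"),
]]


def determines(rows, determinant, attribute):
    """True if each combination of determinant values maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


# Which non-key attributes are fixed by Part_Number alone (part of the key)?
part_only = {col: determines(ORDER_PARTS, ["Part_Number"], col)
             for col in COLUMNS[2:]}
moved = [col for col, fixed in part_only.items() if fixed]

# Those attributes move to PARTS; the rest stay in ORDER_PARTS.
parts = {row["Part_Number"]: {col: row[col] for col in moved}
         for row in ORDER_PARTS}
order_parts_2nf = [{col: row[col] for col in COLUMNS if col not in moved}
                   for row in ORDER_PARTS]

print(part_only)
```

In this data Part_Quantity fails the check because part 4 appears with two different quantities, so it stays with the full key, while the description, price and supplier columns move out.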
The structures are now in 2NF since every non-primary key attribute depends on the whole of the key. The next step is to convert the structure into 3NF by ensuring that each non-primary key attribute depends on nothing but the key.
The CUSTOMERS table is patently in 3NF because there is no other non-primary key attribute for Customer_Name to depend on. The ORDERS table is in 3NF because there is no dependency between Order_Date and Customer_Number (a customer can place different orders on different dates). The ORDER_PARTS table is in 3NF because the quantity ordered is dependent on both the order number and the part number. Looking at the PARTS table, however, it can be seen that the Supplier_Name attribute depends on the Supplier_Number and has nothing to do with the part number. To convert the structure into 3NF, a separate table is created containing supplier details.
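CUSTOMERS(Customer_Number, Customer_Name)
ORDERS(Order_Number, Customer_Number, Order_Date)
ORDER_PARTS(Order_Number, Part_Number, Part_Quantity)
PARTS(Part_Number, Part_Description, Part_Price, Supplier_Number)
SUPPLIERS(Supplier_Number, Supplier_Name)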
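As a rough check that nothing has been lost along the way, the 3NF structure can be built in an in-memory SQLite database and joined back into the clerk's original order lines. This is a sketch using Python's standard `sqlite3` module; the column types are my own choice, and the customer numbers, the name "Mary", the prices and the supplier names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE CUSTOMERS (Customer_Number INTEGER PRIMARY KEY, Customer_Name TEXT);
CREATE TABLE SUPPLIERS (Supplier_Number INTEGER PRIMARY KEY, Supplier_Name TEXT);
CREATE TABLE PARTS (
    Part_Number INTEGER PRIMARY KEY,
    Part_Description TEXT,
    Part_Price REAL,
    Supplier_Number INTEGER REFERENCES SUPPLIERS(Supplier_Number));
CREATE TABLE ORDERS (
    Order_Number INTEGER PRIMARY KEY,
    Customer_Number INTEGER REFERENCES CUSTOMERS(Customer_Number),
    Order_Date TEXT);
CREATE TABLE ORDER_PARTS (
    Order_Number INTEGER REFERENCES ORDERS(Order_Number),
    Part_Number INTEGER REFERENCES PARTS(Part_Number),
    Part_Quantity INTEGER,
    PRIMARY KEY (Order_Number, Part_Number));
""")

# Hypothetical customer numbers, first customer name, prices and supplier names.
conn.executemany("INSERT INTO CUSTOMERS VALUES (?, ?)", [(1, "Mary"), (2, "John")])
conn.executemany("INSERT INTO SUPPLIERS VALUES (?, ?)",
                 [(n, f"Supplier {n}") for n in (23, 24, 25, 26, 27)])
conn.executemany("INSERT INTO PARTS VALUES (?, ?, ?, ?)", [
    (1, "AA", 1.50, 23), (2, "BB", 2.00, 23), (3, "CC", 0.75, 24),
    (4, "DD", 3.25, 25), (5, "EE", 1.10, 26), (6, "FF", 4.00, 27)])
conn.executemany("INSERT INTO ORDERS VALUES (?, ?, ?)",
                 [(123, 1, "20/3"), (456, 1, "21/3"), (789, 2, "21/3")])
conn.executemany("INSERT INTO ORDER_PARTS VALUES (?, ?, ?)", [
    (123, 1, 2), (123, 2, 3), (123, 3, 4), (456, 4, 5),
    (456, 5, 6), (789, 4, 7), (789, 6, 8)])

# Join the five tables back into one line per part ordered.
report = conn.execute("""
    SELECT c.Customer_Name, o.Order_Number, p.Part_Description,
           op.Part_Quantity, s.Supplier_Name
    FROM ORDER_PARTS op
    JOIN ORDERS o ON o.Order_Number = op.Order_Number
    JOIN CUSTOMERS c ON c.Customer_Number = o.Customer_Number
    JOIN PARTS p ON p.Part_Number = op.Part_Number
    JOIN SUPPLIERS s ON s.Supplier_Number = p.Supplier_Number
    ORDER BY o.Order_Number, op.Part_Number
""").fetchall()

for line in report:
    print(line)
```

Each fact is now stored once: a supplier's name lives in a single SUPPLIERS row, so changing it is one update rather than one per part or per order line.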