Week 5 - Lab 4 - Logical Models & Normalization - Updated
Welcome to your fourth lab in IS680! During Lab #3 we saw how we can transform a conceptual
model into a logical model. This gave us a higher level of detail and opened the opportunity to create our
actual database and explore it using SQL.
Before we go into the actual implementation of a relational database, we must first evaluate the
quality of our logical models through the technique of normalization. This process will enable us to
detect flaws in our original model and our transformation and, more importantly, it will minimize
redundancy, which in turn minimizes the risks to the integrity of our data.
We start this lab by reviewing what normalization is and what the different normal
forms (steps in the normalization process) are. We will then complete 3 exercises to practice
normalizing tables into well-structured relations.
Lab Goals (continued from Lab #3)
SLO1- To describe the differences between conceptual and logical models.
SLO2- To transform conceptual models into logical models (i.e. transform EERDs into relations).
SLO3- To create effective logical models with tables and relational integrity constraints.
SLO4- To use the normalization process to correct poor logical models into well-structured
relations.
What is normalization?
A technique used to divide tables into smaller tables that are connected through relationships.
Specifically, it is:
In short, the goal of normalization is to reduce redundancy in our databases ensuring the integrity of
the data in the process. It does this by dividing tables with many attributes into smaller more focused
tables reducing the need to repeat information throughout our database.
Normalization is meant to ensure that the database has no major flaws, but it is important to note
that if we developed an appropriate conceptual model and followed the transformation steps from Lab
#3, we WILL get a logical model that is normalized to a significant extent. Therefore, it is important to
start our database design by:
3) transforming the conceptual model into a logical model following the guidelines from Lab #3
Normal forms refer to the specific steps that we follow to minimize redundancy within the
normalization technique. Each step consists of reviewing the database for a particular type of
error/problem and correcting it appropriately. Practitioners typically aim to normalize their
databases to third normal form, as it provides significant benefits with minimal impact on performance (the
more tables and relationships we need to describe a particular scenario, the slower performance will
typically be).
Understanding Normal Forms
Now that we mentioned the concept of normal forms, it is important that we review what they are:
NOTE: Keep in mind that each normal form builds on top of the previous one.
Tables are not typically considered relations unless they are in first normal form.
SOLUTION: We need to eliminate composite and multivalued attributes by breaking them into
different rows. For example:
Examples gathered from “studytonight”. Please visit them for more details at:
- https://www.studytonight.com/dbms/second-normal-form.php
In the picture above, the subject is a multivalued attribute (Ckon knows only 1 subject which is Java
while Akon and Bkon know 2 and all of them know different subjects). We solve this by separating each
value within the subject into a different row so that we end up with atomic values (1 value that cannot
be divided further within each cell).
While we had to repeat values in the other cells (including the roll_no which is arguably the
identifier), we now have atomic values in ALL cells. We will solve the repetition automatically as we
progress through the next 2 normal forms.
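The row-splitting step described above can be sketched in a few lines of Python. This is only an illustration: the roll numbers and subject names are hypothetical values (the original picture is not reproduced here), and the helper name to_first_normal_form is made up for this sketch.

```python
# Hypothetical rows: the "subject" column holds several values in one cell,
# so the table is not yet in first normal form.
unnormalized = [
    {"roll_no": 101, "name": "Akon", "subject": "OS, CN"},
    {"roll_no": 102, "name": "Bkon", "subject": "C, C++"},
    {"roll_no": 103, "name": "Ckon", "subject": "Java"},
]


def to_first_normal_form(rows, multivalued_column):
    """Return one row per value stored in the multivalued column."""
    atomic_rows = []
    for row in rows:
        for value in row[multivalued_column].split(","):
            new_row = dict(row)  # copy the other cells unchanged
            new_row[multivalued_column] = value.strip()
            atomic_rows.append(new_row)
    return atomic_rows


first_nf = to_first_normal_form(unnormalized, "subject")
for row in first_nf:
    print(row)
```

Each student with 2 subjects now appears in 2 rows, so the roll_no and name values repeat, which is exactly the repetition that the next normal forms deal with.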
Every attribute that is not a key (primary key or foreign key) must be functionally
dependent on the entire primary key (i.e. it must be fully functionally dependent on the
primary key).
Moreover, the value of an attribute cannot be dependent on only part of the primary key
(that is called a partial functional dependency).
A functional dependency simply means that the value of a given attribute in a row is determined by the
value of some other attribute (typically the primary key).
A full functional dependency means that the value for an attribute depends on the entire primary
key (if the key is composite, the attribute MUST depend on all components of the primary key and not
just 1 or some of them).
A partial functional dependency refers to an attribute’s value being dependent on only part of the
primary key rather than on its entirety (e.g. if the composite primary key of an associative table is made
of 2 FKs, an attribute in that table that depends on just 1 of them is partially dependent; to avoid this,
every attribute must depend on both components of the primary key).
RECAP: Therefore, for a table to be in 2nd normal form, it must be in 1st normal form and all
attributes must be fully functionally dependent on the primary key. That is not a problem for simple
primary keys (i.e. with 1 column), but if we have a composite primary key we must ensure that the
values for any attribute recorded in that table are dependent on all components of the primary key
rather than just 1. If that is not the case, we have a partial functional dependency.
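As a rough aid for spotting partial dependencies, the sketch below checks sample rows for attributes that are already determined by a single component of a composite key. The table, column names, and both helper functions are hypothetical. Keep in mind that sample data can only contradict a dependency, never prove one, so the final decision still depends on what the attributes mean.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows contradict determinant -> dependent."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        if seen.setdefault(key, row[dependent]) != row[dependent]:
            return False
    return True


def partial_dependencies(rows, primary_key, non_key_columns):
    """Map each non-key column to the single key components that determine it.

    Only single components are tested, which is enough for a 2-column key.
    """
    found = {}
    for column in non_key_columns:
        parts = [part for part in primary_key if holds(rows, [part], column)]
        if parts:
            found[column] = parts
    return found


# Hypothetical table with the composite primary key (employee_id, project_id).
assignments = [
    {"employee_id": 1, "project_id": "P1", "hours_worked": 10, "project_name": "Website"},
    {"employee_id": 1, "project_id": "P2", "hours_worked": 5, "project_name": "Payroll"},
    {"employee_id": 2, "project_id": "P1", "hours_worked": 7, "project_name": "Website"},
    {"employee_id": 2, "project_id": "P2", "hours_worked": 5, "project_name": "Payroll"},
]

key = ["employee_id", "project_id"]
suspects = partial_dependencies(assignments, key, ["hours_worked", "project_name"])
print(suspects)
```

In this sample, hours_worked needs both key components, while project_name is determined by project_id alone, so it is the candidate to move into its own table.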
To solve a partial functional dependency, we simply separate the attributes that are only partially
dependent on the primary key and place them in their own table, which will continue to be referenced
through a foreign key in the original table.
EXAMPLE:
In this example from the book, we have a table with 10 attributes including 2 that combined form
the primary key for the table (OrderID + ProductID). The “Ordered Quantity” is dependent on both
components of the primary key (i.e. a full dependency) since how many items were ordered depends
entirely on both which item and which order we are referring to.
Nevertheless, the values for all other attributes are dependent on either the OrderID or the
ProductID rather than the combination of both, making them partial dependencies. That is, OrderDate,
CustomerID, CustomerName, & CustomerAddress depend on the OrderID and are entirely independent
from the Product that was ordered (the ProductID). At the same time, ProductDescription,
ProductFinish, and Product Standard Price are all dependent on the Product that we are referring to
(identified by the ProductID) and have nothing to do with the OrderID.
To solve this, we keep the attributes that form a full dependency in the table, but extract the partial
dependencies into new tables. For example, OrderDate, CustomerID, CustomerName, and
CustomerAddress are extracted together with the key they depend on (the OrderID) into a new table.
The same happens for the ProductID partial dependency.
Finally, we keep a primary key/foreign key relationships between the newly formed tables and the
original one. We can clearly see this below where Order has its own table with PK OrderID, Product has
its own table with PK ProductID, and the original table with the composite key becomes an associative
entity between them with a PK formed by combining the foreign keys OrderID and ProductID.
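The resulting 2NF structure can be sketched with Python's built-in sqlite3 module. The attribute names follow the example above, but the table names (CustomerOrder, Product, OrderLine), the data types, and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;

-- Attributes that depend only on OrderID
CREATE TABLE CustomerOrder (
    OrderID         INTEGER PRIMARY KEY,
    OrderDate       TEXT,
    CustomerID      INTEGER,
    CustomerName    TEXT,
    CustomerAddress TEXT
);

-- Attributes that depend only on ProductID
CREATE TABLE Product (
    ProductID            INTEGER PRIMARY KEY,
    ProductDescription   TEXT,
    ProductFinish        TEXT,
    ProductStandardPrice REAL
);

-- Associative table: OrderedQuantity depends on the whole composite key
CREATE TABLE OrderLine (
    OrderID         INTEGER REFERENCES CustomerOrder(OrderID),
    ProductID       INTEGER REFERENCES Product(ProductID),
    OrderedQuantity INTEGER,
    PRIMARY KEY (OrderID, ProductID)
);
""")

# Hypothetical sample data
conn.executemany("INSERT INTO CustomerOrder VALUES (?, ?, ?, ?, ?)", [
    (1006, "2024-10-24", 2, "Value Furniture", "Plano, TX"),
    (1007, "2024-10-25", 6, "Furniture Gallery", "Boulder, CO"),
])
conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)", [
    (4, "Entertainment Center", "Natural Maple", 650.0),
    (7, "Dining Table", "Natural Ash", 800.0),
])
conn.executemany("INSERT INTO OrderLine VALUES (?, ?, ?)", [
    (1006, 4, 1), (1006, 7, 2), (1007, 4, 3),
])

# Joining on the foreign keys rebuilds the original wide rows when needed.
rows = conn.execute("""
    SELECT ol.OrderID, o.OrderDate, p.ProductDescription, ol.OrderedQuantity
    FROM OrderLine AS ol
    JOIN CustomerOrder AS o ON o.OrderID = ol.OrderID
    JOIN Product AS p ON p.ProductID = ol.ProductID
    ORDER BY ol.OrderID, ol.ProductID
""").fetchall()
for row in rows:
    print(row)
```

Each product description and order date is now stored once, and the join shows that no information was lost by splitting the table.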
For example: In the example for 2NF we created a table for customers’ orders which included: OrderID,
OrderDate, CustomerID, CustomerName, and CustomerAddress. While we determined that these
attributes depended on the OrderID alone rather than on both the OrderID and the ProductID (forming a
partial dependency on OrderID), the reality is a bit more complex than that.
Specifically, while all attributes can be correctly identified by the OrderID, the attributes
CustomerName and CustomerAddress are actually dependent on who the customer is (meaning, they
are dependent on the CustomerID) and are only indirectly related to the OrderID (since the CustomerID
is the one related to the OrderID). That makes it a transitive dependency.
To solve it, we simply extract the transitive dependency from the Customer Order table, and
connect both through a primary key/foreign key relationship. Therefore, we take the CustomerName,
CustomerAddress, and CustomerID and create a new table for the Customer. In the original
Customer Order table, we keep the attributes/columns dependent on the OrderID, including a Foreign
Key (FK) connecting it to the newly created Customer table.
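A minimal sketch of this step, again using Python's sqlite3 module, is shown below. The table name CustomerOrder2NF, the data types, and the sample rows are hypothetical; only the attribute names come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
PRAGMA foreign_keys = ON;

-- Starting point: the 2NF table that still holds the transitive dependency
CREATE TABLE CustomerOrder2NF (
    OrderID         INTEGER PRIMARY KEY,
    OrderDate       TEXT,
    CustomerID      INTEGER,
    CustomerName    TEXT,
    CustomerAddress TEXT
);

-- New table for the attributes that depend on CustomerID
CREATE TABLE Customer (
    CustomerID      INTEGER PRIMARY KEY,
    CustomerName    TEXT,
    CustomerAddress TEXT
);

-- Order table keeps only what depends on OrderID, plus the FK to Customer
CREATE TABLE CustomerOrder (
    OrderID    INTEGER PRIMARY KEY,
    OrderDate  TEXT,
    CustomerID INTEGER REFERENCES Customer(CustomerID)
);
""")

# Hypothetical sample data: customer 2 placed two orders.
conn.executemany("INSERT INTO CustomerOrder2NF VALUES (?, ?, ?, ?, ?)", [
    (1006, "2024-10-24", 2, "Value Furniture", "Plano, TX"),
    (1007, "2024-10-25", 6, "Furniture Gallery", "Boulder, CO"),
    (1008, "2024-10-26", 2, "Value Furniture", "Plano, TX"),
])

# Extract each customer once, then the orders that reference them.
conn.execute("""
    INSERT INTO Customer
    SELECT DISTINCT CustomerID, CustomerName, CustomerAddress
    FROM CustomerOrder2NF
""")
conn.execute("""
    INSERT INTO CustomerOrder
    SELECT OrderID, OrderDate, CustomerID
    FROM CustomerOrder2NF
""")

customers = conn.execute("SELECT * FROM Customer ORDER BY CustomerID").fetchall()
orders = conn.execute("SELECT * FROM CustomerOrder ORDER BY OrderID").fetchall()
print(customers)
print(orders)
```

The customer's name and address were repeated for orders 1006 and 1008 in the 2NF table; after the split they are stored once, so a change of address only has to be made in 1 row.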
Part 2: Practicing Normalization
The manager of a company dinner club would like to have an information system that helps him
plan the meals, keep track of who attends the dinners, and so on.
Because the manager is not an IS expert, the following table is used to store the information. As a
member can attend many dinners and a member will not attend more than 1 dinner on the same date, the
primary key of the following table is Member ID and Dinner ID. Dinners can have many courses,
from a one-course dinner to as many courses as the chef desires.
Member = customer
Dinner = meal
Venue = location
Food = specific food item
PK = MEMBERID + DINNERID
TEST: Does each attribute depend on both components of the Primary Key, or just a portion?
1. Is the above table considered a relation? Why or why not? Is it in any normal form? If so, which
one?
c. 1st normal form as we do not have any multi-valued entries, but we still have
dependencies.
2. Transform the table above into first normal form 1NF. (To do this, check if there are multivalued
attributes and transform the table to get rid of them)
b. Identify the dependencies and which type they are (full dependencies, partial
dependencies, transitive dependencies).
3. Transform the table above into second normal form 2NF. (To do this, separate partial
dependencies into separate tables)
4. Transform the table above into third normal form 3NF. (To do this, remove the transitive
dependencies by creating separate tables and relate them with the common attribute)
Figure 4-4 shows a relation called GRADE REPORT for a university. Your assignment is as follows:
a. Draw a logical model for the table above, and graph and explain (with arrows) the
functional dependencies in the relation.
d. Draw a model for your 3NF relations and show the primary key/foreign key
relationships.
Exercise 3: Normalizing a Shipping Manifest
Figure 4-5 shows a relation for a shipping manifest.
a. Draw a relational schema and diagram the functional dependencies in the relation.
b. In what normal form is this relation?
1NF
d. Draw a relational schema for your 3NF relations and show the referential integrity
constraints.
a. Draw your answer to part d using Microsoft Visio (or any other tool specified by your
instructor).