Week 6 lecture Normalization
Week 6
Normalisation (Part Two)
Topics
• Normalisation (Revision)
• Unnormalized Form (UNF)
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Anomalies (Insert, Delete, Update)
• Advantages and Disadvantages
Introduction
• Normalisation puts data into a simple form that accurately reflects the separate entity types, their attributes and the relationships between them, so that unnecessary duplication of data is avoided.
• It starts from pre-documented sets of attributes and groups and regroups them, without introducing data inconsistencies, so that anomalies are avoided.
Normalisation Revision (From UNF to 3NF)
Applying UNF (Unnormalised Form)
Step 1: List all the attributes of the unnormalised form/structure in a single relation, enclosed in ( ), and name the relation.
Step 2: Choose a suitable unique identifier for this relation.
Step 3: Show each repeating group within { } in this relation.
NOTE: A repeating group is an attribute, or a group of attributes, in an unnormalised table that can have multiple values for a single instance/occurrence of the table's nominated key attribute (primary key).
Unnormalised Form (UNF): Things to remember
• All attributes, including those in repeating groups, are listed in a single relation.
• If the UNF relation has no repeating group, it is already in First Normal Form (1NF).
• One repeating group may be nested within another repeating group.
• One repeating group may be followed by another, separate repeating group.
• The unique key in UNF may be single or composite.
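As a rough illustration of the { } notation, a UNF occurrence can be pictured as one record with a nested list. The sketch below uses two rows from the project table that appears later in this lecture; the attribute name `people` and the helper `has_repeating_group` are hypothetical names chosen for this example.

```python
# One PROJECT occurrence in UNF: a single key value with a nested repeating group.
project_unf = {
    "project_code": "IC5001",  # unique identifier chosen in Step 2
    "project_type": "New Dev",
    "description": "Develop Claims System",
    # Repeating group (the part written inside { }): many people per project code.
    # "people" is a hypothetical name for the group.
    "people": [
        {"person_number": 2146, "name": "Jones", "grade": "A1"},
        {"person_number": 3145, "name": "Smith", "grade": "A2"},
    ],
}


def has_repeating_group(record):
    """Return True if any attribute holds a list of values (a repeating group)."""
    return any(isinstance(value, list) for value in record.values())


needs_1nf = has_repeating_group(project_unf)
```

A record with no list-valued attribute has no repeating group, which matches the point above that such a relation is already in 1NF.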
Applying First Normal Form (1NF)
Step 1: Remove each repeating group to a separate relation (entity) and name that relation (entity).
Step 2: Assign a key attribute to the new relation from the repeating group of attributes that has been separated.
Step 3: Carry the key attribute forward from the UNF relation to the new relation.
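The three 1NF steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the UNF record is modelled as a dictionary, the repeating group is stored under the hypothetical attribute name `people`, and `to_1nf` is a helper invented for this example. The data comes from the project table used later in this lecture.

```python
def to_1nf(record, key, group):
    """Split one UNF record into a parent row and a list of child rows.

    key   -- name of the UNF key attribute
    group -- name of the attribute holding the repeating group
    """
    # Step 1: the repeating group is removed from the original relation.
    parent = {k: v for k, v in record.items() if k != group}
    # Step 3: the UNF key is carried forward into every row of the new relation.
    children = [{key: record[key], **member} for member in record[group]]
    return parent, children


project_unf = {
    "project_code": "IC5001",
    "project_type": "New Dev",
    "description": "Develop Claims System",
    "people": [  # hypothetical name for the repeating group
        {"person_number": 2146, "name": "Jones", "grade": "A1"},
        {"person_number": 3145, "name": "Smith", "grade": "A2"},
    ],
}

project_1, allocation_1 = to_1nf(project_unf, key="project_code", group="people")
```

Step 2 (choosing the key of the new relation) is a design decision rather than a computation: here it would be the combination of `project_code` and `person_number`.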
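Applying Second Normal Form (2NF)
Step 1: Check for partial dependencies in 1NF. Check for relations that have a composite primary key.
Step 2: Identify and separate each partial functional dependency (attributes that are wholly dependent on only part of the composite primary key) into a new relation and name that relation.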
Full Functional Dependency
Example: orderNo, productNo → orderQty, lineTotal
• A relation that is in First Normal Form is also in Second Normal Form if any one of the following conditions applies:
• The primary key consists of only one attribute (such as the attribute ORDER-NO in ORDER).
• The relation has no non-key attributes.
• Every non-key attribute is functionally dependent on the full set of primary key attributes.
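The three conditions above can be written as a small check. This is a sketch, not a general normalisation tool: `is_2nf` is a hypothetical helper, and it assumes the caller supplies every functional dependency that determines a non-key attribute, as (determinant, dependents) pairs.

```python
def is_2nf(primary_key, fds):
    """Check the 2NF conditions for a relation that is already in 1NF.

    primary_key -- list of key attribute names
    fds         -- list of (determinant, dependents) pairs covering
                   every non-key attribute (assumed complete)
    """
    if len(primary_key) == 1:
        return True  # single-attribute key: no partial dependency is possible
    if not fds:
        return True  # no non-key attributes to depend on anything
    key = set(primary_key)
    # A determinant that is a proper subset of the key is a partial dependency.
    return not any(set(determinant) < key for determinant, _ in fds)


# Full functional dependency from the example above.
order_line_ok = is_2nf(
    ["orderNo", "productNo"],
    [(["orderNo", "productNo"], ["orderQty", "lineTotal"])],
)

# Dependencies of the 1NF project allocation relation worked through in this lecture.
allocation_ok = is_2nf(
    ["project-code", "person-number"],
    [
        (["project-code", "person-number"], ["date-join-project", "alloc-time"]),
        (["person-number"], ["name", "grade", "salary-scale"]),
    ],
)
```

Here `order_line_ok` is true, while `allocation_ok` is false because `person-number` on its own determines some non-key attributes.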
Applying Third Normal Form (3NF)
Step 1: Check the 2NF relations for transitive dependencies. Only relations that have more than one non-key attribute need to be checked.
Step 2: Identify and separate each transitive functional dependency, as per the 3NF rule, into a new relation.
A → B → C (Transitive Dependency)   e.g. orderNo → custNo → custName, custAddress
A → B (Existing relation)           e.g. orderNo → custNo
B → C (New Relation)                e.g. custNo → custName, custAddress
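The split in Step 2 can be sketched on attribute lists alone. `split_transitive` is a hypothetical helper written for this example; it assumes the transitive dependency has already been identified by hand and simply produces the two resulting attribute lists.

```python
def split_transitive(attributes, determinant, dependents):
    """Split a relation on the dependency determinant -> dependents.

    Returns (existing, new): the existing relation keeps the determinant
    as a link, and the new relation is keyed on the determinant.
    """
    existing = [a for a in attributes if a not in dependents]
    new = list(determinant) + list(dependents)
    return existing, new


# orderNo -> custNo -> custName, custAddress
order = ["orderNo", "custNo", "custName", "custAddress"]
existing, new = split_transitive(order, ["custNo"], ["custName", "custAddress"])
# existing is ["orderNo", "custNo"]; new is ["custNo", "custName", "custAddress"]
```

Keeping `custNo` in the existing relation is what allows the two relations to be joined back together later.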
Project Code  Project Type  Description            Person Number  Name   Grade  Salary Scale  Date-join Project  Alloc-time
IC5001        New Dev       Develop Claims System  2146           Jones  A1     $2K-$4K       1/11/2024          24
IC5001        New Dev       Develop Claims System  3145           Smith  A2     $4K-$6K       2/10/2024          24
IC5001        New Dev       Develop Claims System  6126           Black  B1     $6K-$9K       7/11/2024          18
IC5001        New Dev       Develop Claims System  1214           Brown  A2     $4K-$6K       3/10/2024          12
IC5001        New Dev       Develop Claims System  8191           Green  A1     $2K-$4K       12/11/2024         18
PAY22         Maint         Maintain Payments      6142           Jacks  A2     $4K-$6K       9/11/2024          6
PAY22         Maint         Maintain Payments      3169           White  B2     $9K-$10K      4/11/2024          12
PAY22         Maint         Maintain Payments      6145           Dean   B3     $10K-$11K     8/10/2024          6
Applying UNF
PROJECT (project-code, project-type, description, {person-number, name, grade, salary-scale, date-join-project, alloc-time})
Applying 1NF
PROJECT contains PROJECT-ALLOCATION
1NF
PROJECT-1 (project-code, project-type, description)
PROJECT-ALLOCATION-1 (project-code, person-number, name, grade, salary-scale, date-join-project, alloc-time)
Applying 2NF
Checking Dependencies:
project-code, person-number → date-join-project, alloc-time
project-code → (nothing)
person-number → name, grade, salary-scale
2NF
PROJECT-2 (project-code, project-type, description)
PROJECT-ALLOCATION-2 (project-code, person-number, date-join-project, alloc-time)
PERSON-2 (person-number, name, grade, salary-scale)
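The 2NF split of PROJECT-ALLOCATION-1 amounts to two projections of its rows. The sketch below uses the first three rows of the table above; `projection` is a hypothetical helper, and attribute names use underscores only because hyphens are awkward in Python.

```python
def projection(rows, attributes):
    """Project rows onto the given attributes, dropping duplicate results."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


COLUMNS = ("project_code", "person_number", "name", "grade",
           "salary_scale", "date_join_project", "alloc_time")
allocation_1 = [dict(zip(COLUMNS, values)) for values in [
    ("IC5001", 2146, "Jones", "A1", "$2K-$4K", "1/11/2024", 24),
    ("IC5001", 3145, "Smith", "A2", "$4K-$6K", "2/10/2024", 24),
    ("IC5001", 6126, "Black", "B1", "$6K-$9K", "7/11/2024", 18),
]]

# Attributes that depend on the whole key stay with the allocation.
allocation_2 = projection(
    allocation_1,
    ["project_code", "person_number", "date_join_project", "alloc_time"],
)
# Attributes that depend on person_number alone move to PERSON-2.
person_2 = projection(allocation_1, ["person_number", "name", "grade", "salary_scale"])
```

Duplicate removal makes no difference in this sample because each person appears once, but it would matter as soon as one person were allocated to more than one project.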
Applying 3NF
Checking Transitive Dependencies:
person-number → grade → salary-scale (transitive dependency)
person-number → grade (existing relation)
grade → salary-scale (new relation)
3NF
PROJECT-3 (project-code, project-type, description)
PROJECT-ALLOCATION-3 (project-code, person-number, date-join-project, alloc-time)
PERSON-3 (person-number, name, grade*)
GRADE-SALARY-3 (grade, salary-scale)
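One way to try out the four 3NF relations is to create them as tables in an in-memory SQLite database through Python's standard `sqlite3` module. This is a sketch under assumptions: the table and column names are the relation names with underscores, the column types are guesses, and only one row per table is loaded from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign keys are declared for documentation; SQLite only enforces them
# when PRAGMA foreign_keys is switched on.
conn.executescript("""
CREATE TABLE grade_salary (grade TEXT PRIMARY KEY, salary_scale TEXT);
CREATE TABLE person (
    person_number INTEGER PRIMARY KEY,
    name TEXT,
    grade TEXT REFERENCES grade_salary(grade)
);
CREATE TABLE project (
    project_code TEXT PRIMARY KEY,
    project_type TEXT,
    description TEXT
);
CREATE TABLE project_allocation (
    project_code TEXT REFERENCES project(project_code),
    person_number INTEGER REFERENCES person(person_number),
    date_join_project TEXT,
    alloc_time INTEGER,
    PRIMARY KEY (project_code, person_number)
);
""")
conn.execute("INSERT INTO grade_salary VALUES ('A1', '$2K-$4K')")
conn.execute("INSERT INTO person VALUES (2146, 'Jones', 'A1')")
conn.execute("INSERT INTO project VALUES ('IC5001', 'New Dev', 'Develop Claims System')")
conn.execute("INSERT INTO project_allocation VALUES ('IC5001', 2146, '1/11/2024', 24)")

# Rebuilding one line of the original report takes two joins.
row = conn.execute("""
    SELECT p.name, g.salary_scale, a.alloc_time
    FROM project_allocation AS a
    JOIN person AS p ON p.person_number = a.person_number
    JOIN grade_salary AS g ON g.grade = p.grade
    WHERE a.project_code = 'IC5001'
""").fetchone()
```

The salary scale for each grade is now stored once, at the cost of an extra join whenever it has to be shown next to a person.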
Project Allocation - ER Model
• The final list of 3NF relations can be represented by the following ER model:
[ER diagram: PROJECT "scheduled" PROJECT ALLOCATION; PERSON "works-on" PROJECT ALLOCATION; PERSON "belongs-in" GRADE-SAL]
Avoiding Anomalies using 2NF
MODULE-RESULT
Student-id  Module-id  Module-title               Module-level  Grade
S001        CSC100     Introduction to Computing  Certificate   P
S002        CSC100     Introduction to Computing  Certificate   D
S001        CSC200     Web Development            Intermediate  P
S003        CSC200     Web Development            Intermediate  F
S001        ACC200     Accounting Part 1          Certificate   P
S004        ACC201     Accounting Part 2          Advanced      D
S005        HIS200     History                    Advanced      P
• Assume that a student cannot take a module more than once.
• What normal form is MODULE-RESULT currently in?
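One way to explore the question is to test candidate dependencies against the sample rows. The sketch below assumes the key is (Student-id, Module-id), which follows from the rule that a student cannot take a module more than once; `fd_holds` is a hypothetical helper. A sample can only disprove a dependency, never prove it, so a positive result is a hint to be confirmed against the business rules.

```python
def fd_holds(rows, lhs, rhs):
    """True if rows with equal lhs values always have equal rhs values."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs seen for each lhs; any later mismatch breaks the FD.
        if seen.setdefault(left, right) != right:
            return False
    return True


COLUMNS = ("student_id", "module_id", "module_title", "module_level", "grade")
rows = [dict(zip(COLUMNS, values)) for values in [
    ("S001", "CSC100", "Introduction to Computing", "Certificate", "P"),
    ("S002", "CSC100", "Introduction to Computing", "Certificate", "D"),
    ("S001", "CSC200", "Web Development", "Intermediate", "P"),
    ("S003", "CSC200", "Web Development", "Intermediate", "F"),
    ("S001", "ACC200", "Accounting Part 1", "Certificate", "P"),
    ("S004", "ACC201", "Accounting Part 2", "Advanced", "D"),
    ("S005", "HIS200", "History", "Advanced", "P"),
]]

# Part of the key determines title and level: a partial dependency.
partial = fd_holds(rows, ["module_id"], ["module_title", "module_level"])
# Grade is not determined by the module alone; it needs the whole key.
grade_by_module = fd_holds(rows, ["module_id"], ["grade"])
grade_by_key = fd_holds(rows, ["student_id", "module_id"], ["grade"])
```

In this sample `partial` and `grade_by_key` are true and `grade_by_module` is false, which is consistent with a relation that has no repeating groups but still contains a partial dependency.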
Avoiding Anomalies in 2NF
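• An INSERTION anomaly can occur when attempting to enter details of a new module, DB100 Databases: no student has taken it yet, so there is no Student-id to complete the key.
• A DELETION anomaly can occur when student S004 decides to drop the ACC201 module: S004's row is the only one holding the title and level of ACC201, so deleting it loses the module's details.
• An UPDATE anomaly can occur if it is necessary to change the module title ‘Web Development’ to ‘Website Development’: the title is stored in two rows, and both must be changed.
• Applying the 2NF rule allows us to avoid all of the above anomalies.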
Avoiding Anomalies using 2NF
MODULE-RESULT
Student-id  Module-id  Grade
S001        CSC100     P
S002        CSC100     D
S001        CSC200     P
S003        CSC200     F
S001        ACC200     P
S004        ACC201     D
S005        HIS200     P
MODULE
Module-id  Module-title               Module-level
CSC100     Introduction to Computing  Certificate
CSC200     Web Development            Intermediate
ACC200     Accounting Part 1          Certificate
ACC201     Accounting Part 2          Advanced
HIS200     History                    Advanced
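The effect of the split on the three anomalies can be tried out with Python's standard `sqlite3` module. This is a sketch with assumptions: table and column names use underscores, only a few of the sample rows are loaded, and the level of the new DB100 module is left as NULL because the lecture does not state it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE module (
    module_id TEXT PRIMARY KEY,
    module_title TEXT,
    module_level TEXT
);
CREATE TABLE module_result (
    student_id TEXT,
    module_id TEXT REFERENCES module(module_id),
    grade TEXT,
    PRIMARY KEY (student_id, module_id)
);
""")
conn.executemany("INSERT INTO module VALUES (?, ?, ?)", [
    ("CSC200", "Web Development", "Intermediate"),
    ("ACC201", "Accounting Part 2", "Advanced"),
])
conn.executemany("INSERT INTO module_result VALUES (?, ?, ?)", [
    ("S001", "CSC200", "P"),
    ("S003", "CSC200", "F"),
    ("S004", "ACC201", "D"),
])

# Insertion: a module can be recorded before any student has taken it.
conn.execute("INSERT INTO module VALUES ('DB100', 'Databases', NULL)")

# Deletion: S004 drops ACC201, but the module's own details are kept.
conn.execute(
    "DELETE FROM module_result WHERE student_id = 'S004' AND module_id = 'ACC201'"
)

# Update: the title is stored once, so exactly one row changes.
cursor = conn.execute(
    "UPDATE module SET module_title = 'Website Development' WHERE module_id = 'CSC200'"
)
rows_changed = cursor.rowcount

module_ids = [r[0] for r in conn.execute("SELECT module_id FROM module ORDER BY module_id")]
```

After these statements `module` still lists ACC201 and now also DB100, and the title change touched a single row instead of one row per result.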
Avoiding Anomalies using 3NF
LECTURER
Lecturer-id  Lecturer-name  Department  Salary  Location
W01          Emma Greg      Computing   35,000  Moorgate
S01          Wendy Holder   Computing   42,000  Moorgate
D01          Amy King       Accounting  27,000  Staples
J02          Bob Jones      Computing   29,000  Moorgate
N01          Bob Whales     Accounting  36,500  Staples
J01          Jack Nelson    History     41,500  Lewiston
• Assume that all lecturers within a department are located at the same place.
• What normal form is LECTURER currently in?
Avoiding Anomalies in 3NF
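• An INSERTION anomaly can occur when attempting to create a new English department: its location cannot be recorded until a lecturer is assigned to it.
• A DELETION anomaly can occur when Professor Nelson retires: he is the only History lecturer, so deleting his row loses the location of the History department.
• An UPDATE anomaly can occur if it is necessary to change the location of a department, e.g. Computing moves from Moorgate to Billingsgate: the row of every Computing lecturer must be changed.
• Applying the 3NF rule allows us to avoid all of the above anomalies.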
Avoiding Anomalies using 3NF
LECTURER
Lecturer-id  Lecturer-name  Department  Salary
W01          Emma Greg      Computing   35,000
S01          Wendy Holder   Computing   42,000
D01          Amy King       Accounting  27,000
J02          Bob Jones      Computing   29,000
N01          Bob Whales     Accounting  36,500
J01          Jack Nelson    History     41,500
LECTURER-LOCATION
Department  Location
Computing   Moorgate
Accounting  Staples
History     Lewiston
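The same three changes can be replayed against the two 3NF relations using plain Python structures. This is an illustrative sketch: the list and dictionary stand in for the LECTURER and LECTURER-LOCATION relations, `location_of` is a hypothetical helper, and the location of the new English department is left as `None` because the lecture does not give one.

```python
# LECTURER (Lecturer-id, Lecturer-name, Department, Salary)
lecturers = [
    ("W01", "Emma Greg", "Computing", 35000),
    ("S01", "Wendy Holder", "Computing", 42000),
    ("D01", "Amy King", "Accounting", 27000),
    ("J02", "Bob Jones", "Computing", 29000),
    ("N01", "Bob Whales", "Accounting", 36500),
    ("J01", "Jack Nelson", "History", 41500),
]
# LECTURER-LOCATION (Department, Location)
department_location = {
    "Computing": "Moorgate",
    "Accounting": "Staples",
    "History": "Lewiston",
}


def location_of(lecturer_id):
    """Look up a lecturer's location through their department."""
    for lid, _name, department, _salary in lecturers:
        if lid == lecturer_id:
            return department_location[department]
    return None  # no such lecturer


# Update: Computing moves, and only one entry changes.
department_location["Computing"] = "Billingsgate"
# Deletion: J01 retires, and the History location is still recorded.
lecturers = [row for row in lecturers if row[0] != "J01"]
# Insertion: a department can exist before it has any lecturers.
department_location["English"] = None
```

After the update, `location_of("W01")`, `location_of("S01")` and `location_of("J02")` all report the new location even though no lecturer row was touched.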
Normalisation Advantages
• Data interdependencies are identified
• Data can be grouped into related sets
• Data is easier to maintain
• Anomalies and the resulting redundancy are eliminated
Normalisation Disadvantages
• Facilitates update but not retrieval
• Requires real understanding of business rules
• Takes multiple joins to retrieve required information
• Normalised entities can sometimes be unnatural
• Full normalisation is not always possible
Summary of Dependencies
• Types of dependency between attributes
Determinant  Dependent  Dependency
Key          Non-Key    Functional
Part of Key  Non-Key    Partial Functional
Non-Key      Non-Key    Transitive
• Our aim is for the key attribute to determine all other (non-key) attributes.
• Therefore, only the first type of dependency is desirable.
• Normalisation ensures that entities are decomposed so that only Functional Dependency on the key remains.
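The table above can be read as a small decision rule on the determinant of a dependency. The sketch below is a hypothetical helper, `classify_dependency`, which assumes the dependent attributes are non-key and looks only at how the determinant relates to the primary key.

```python
def classify_dependency(determinant, primary_key):
    """Name the dependency type for determinant -> (non-key attributes)."""
    det, key = set(determinant), set(primary_key)
    if not det:
        raise ValueError("determinant must contain at least one attribute")
    if det == key:
        return "functional"          # Key -> Non-Key
    if det < key:
        return "partial functional"  # Part of Key -> Non-Key
    if not det & key:
        return "transitive"          # Non-Key -> Non-Key
    return "other"                   # mixes key and non-key attributes


allocation_key = ["project-code", "person-number"]
full = classify_dependency(["project-code", "person-number"], allocation_key)
partial = classify_dependency(["person-number"], allocation_key)
transitive = classify_dependency(["grade"], ["person-number"])
```

With the dependencies from the project example, the three calls return "functional", "partial functional" and "transitive" respectively; only the first is the kind that normalisation aims to keep.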
ER Design and Normalisation
• When an E-R diagram is carefully designed, with all entities identified correctly, the tables generated from the E-R diagram should not need further normalisation.
• However, in a real (imperfect) design there can be FDs from non-key attributes of an entity to other attributes of the entity.
• E.g. an Emp entity with attributes dept-num and dept-add, and an FD dept-num → dept-add
• A good ER design would have made department an entity.
ER Design and Normalisation
ER MODELING           vs  NORMALISATION
Top-down approach         Bottom-up approach
Analysis of entities      Analysis of attributes
Intuitive technique       Formal technique
• Suggested method: do the conceptual design using ER modeling; then, when converting to a logical model, use normalisation as a validation technique to ensure that all entities are in 3NF.
Any Questions?