Databases Normalization - Lecture 8

The document discusses database normalization and provides examples to illustrate the concepts of first, second, and third normal form. It shows how to normalize a sample student database by removing repeating groups and dependencies on partial keys and non-key attributes. The normalized database structure eliminates data duplication and inconsistencies.

Uploaded by

thelangastamper

Databases

MORE NORMALIZATION
Quick Recap of Normalisation

NORMAL FORM / TEST / REMEDY (NORMALIZATION)

1NF
Test: Relation should have no non-atomic attributes or nested relations.
Remedy: Form a new relation for each non-atomic attribute or nested relation.

2NF
Test: For relations where the primary key contains multiple attributes, no non-key attribute should be functionally dependent on a part of the primary key.
Remedy: Decompose and set up a new relation for each partial key with its dependent attributes. Make sure to keep a relation with the original primary key and any attributes that are fully functionally dependent on it.

3NF
Test: Relation should not have a non-key attribute functionally determined by another non-key attribute (or by a set of non-key attributes), i.e., there should be no transitive dependency of a non-key attribute on the primary key.
Remedy: Decompose and set up a relation that includes the non-key attribute(s) that functionally determine(s) other non-key attribute(s).

BCNF
Test: Relation should not have any functional dependency whose left side is not a candidate key; in particular, a non-prime attribute (an attribute that doesn’t occur in any candidate key) should not determine another attribute.
Remedy: Make sure that the left side of every functional dependency is a candidate key.

4NF
Test: The relation should not have a multi-valued dependency, which occurs when two attributes of a table are independent of each other but both depend on a third attribute.
Remedy: Decompose the table into two sub-tables.

5NF
Test: The relation should not have a join dependency: if a table can be recreated by joining multiple tables, each of which has a subset of the attributes of the table, then the table has a join dependency.
Remedy: Decompose the table into as many smaller tables as possible in order to remove the join dependency.
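As an illustration of the 4NF row in the recap, here is a minimal Python sketch. The table, its columns (student, hobby, language) and its values are all hypothetical and not taken from the lecture. It assumes hobbies and languages are independent of each other, so each can be stored in its own table and the original rows recovered by a join.

```python
# Hypothetical table: each student has a set of hobbies and a set of languages,
# and the two sets are independent of each other (a multi-valued dependency).
student_hobby_language = {
    ("s1", "chess", "English"),
    ("s1", "chess", "Zulu"),
    ("s1", "hiking", "English"),
    ("s1", "hiking", "Zulu"),
}

# Decompose into two sub-tables, one per independent attribute.
student_hobby = {(s, h) for s, h, _ in student_hobby_language}
student_language = {(s, l) for s, _, l in student_hobby_language}

# Joining the two sub-tables on student gives back exactly the original rows.
rejoined = {
    (s, h, l)
    for s, h in student_hobby
    for s2, l in student_language
    if s == s2
}
```

The combined table needs four rows to say what the two sub-tables say in two rows each. Adding one hobby would add one row to `student_hobby` but two rows to the combined table.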
More Recap on Definitions

 Terminology:
 Entity – usually a table
 Attribute – usually a field
 Relationship – connection between entities (often loosely called a relation)
 In the real world we talk about entities, relations and attributes when designing a DB
 We map these onto DB concepts like tables, primary keys and relations
Normalisation of a Design

 We start from the real-world view of the data – Entities, attributes, relations
 Normalise so the resulting DB will be easy to create and maintain
 Then create the DB
 Normalisation after creating the DB can be done for small examples
 If we redesign the DB, we need to check it is still normalised
 Quick normalisation recap (again):
 First normal form (1NF) – no repeating groups of attributes
 Second normal form (2NF) – in 1NF and every attribute that is not part of the key depends on the whole key
 Third normal form (3NF) – in 2NF and every attribute (or set of attributes) that is not part of a key depends on nothing but the key
Example : Highly simplified Student DB

 Doesn’t allow for repeating subjects
 No detail of how long registered and for what degree
 Single-digit course code for compact tables
 Address not broken down into logical entities like postcode
 Sufficient to illustrate 1NF – 3NF
1NF

 No repeating groups
 Is there a problem here?
Normalise to 1NF (1)

 Move the repeating groups to another table
 Better? Not quite there
Normalise to 1NF (2)

 Each subject can be taken by multiple students
 What is the primary key here?
 Is this 1NF?
Revised original table

 Course details now in their own table, reached through student number
 Is this 1NF?
1NF summary

 We eliminated repeating groups by moving part of the original table to a new table
 The resulting modified and additional tables are both 1NF
 Because of the repeating groups, the new table needs a composite key: each row is identified by two things, student number and subject code
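The 1NF step summarised above can be sketched in Python. The lecture's tables were images and are not reproduced here, so the column names (`student_no`, `name`, `subject_code`, `subject_name`) and all values below are assumptions made for illustration only.

```python
# Hypothetical un-normalised records: "subjects" is a repeating group.
unnormalised = [
    {"student_no": 1, "name": "Student One",
     "subjects": [("1", "Databases"), ("2", "Networks")]},
    {"student_no": 2, "name": "Student Two",
     "subjects": [("1", "Databases")]},
]

def to_1nf(rows):
    """Move the repeating group out of the student rows into a second table."""
    students = []         # one row per student, key: student_no
    student_subject = []  # one row per (student_no, subject_code) pair
    for row in rows:
        students.append({"student_no": row["student_no"], "name": row["name"]})
        for code, subject_name in row["subjects"]:
            student_subject.append({
                "student_no": row["student_no"],
                "subject_code": code,
                "subject_name": subject_name,
            })
    return students, student_subject

students, student_subject = to_1nf(unnormalised)

# The composite key (student_no, subject_code) identifies each row of the new table.
composite_keys = [(r["student_no"], r["subject_code"]) for r in student_subject]
```

Neither `student_no` nor `subject_code` alone is unique in `student_subject`, which is why the key has to be the pair.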
2NF

 Entities must be in 1NF and also every attribute that is not part of the key depends on the whole key
 When ensuring we are in 2NF, we should not break the 1NF property:
 Do not introduce any repeating groups
 Focus on the newly-created Student-Subject table, with its composite key, as an example
 Why is this not 2NF?
 Because subject name depends only on subject code
Solution?

 Another table with subject code and subject name
 Why is the partial dependency a problem?
 If you changed the name of a subject, you would have to find every place it occurred instead of changing it in one place
 Reminder:
 1NF: no repeating groups
 2NF: 1NF and non-key attributes do not depend on a partial key
 If you have a single (not composite) key, the table is 2NF
 Why?
 Are both 1NF and 2NF?
2NF summary

 By eliminating a dependency on part of a key, we eliminated duplication


 In this case, it was the subject name
 Eliminating that duplication makes it easier to maintain the DB
 In this case, if the subject name changes, we change it in one place
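A minimal Python sketch of the 2NF step follows. It starts from a hypothetical Student-Subject table (column names and values are assumed, since the slide tables are not available) and moves subject name into its own lookup keyed by subject code.

```python
# Hypothetical 1NF table: subject_name depends only on subject_code,
# which is just one part of the composite key (student_no, subject_code).
student_subject = [
    {"student_no": 1, "subject_code": "1", "subject_name": "Databases"},
    {"student_no": 1, "subject_code": "2", "subject_name": "Networks"},
    {"student_no": 2, "subject_code": "1", "subject_name": "Databases"},
]

def to_2nf(rows):
    """Split off the attribute that depends on only part of the key."""
    subject = {}    # subject_code -> subject_name, stored once per subject
    enrolment = []  # only the composite key remains in this table
    for row in rows:
        subject[row["subject_code"]] = row["subject_name"]
        enrolment.append((row["student_no"], row["subject_code"]))
    return enrolment, subject

enrolment, subject = to_2nf(student_subject)

# Renaming a subject is now a change in exactly one place.
subject["1"] = "Database Systems"
names_for_student_2 = [subject[code] for no, code in enrolment if no == 2]
```

Before the split, the same rename would have meant editing every row in which subject code 1 appears.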
3NF

 Entities must be in 2NF and every attribute (or set of attributes) that is not part of a key depends on nothing but the key
 Implies also in 1NF since that is included in the definition of 2NF
 Since each row of the table (record) depends on the key, depending on something else is a transitive dependence
 Transitive dependence: a dependence that arises from a chain of dependences, not directly – it depends on the key via some other attribute on which it depends directly
 Example: res details – focus on 2 columns of student details table
 Res name directly depends on res code, not on the key – here, student number (look at the whole table)
 Why is this bad?
Why is it transitive in this case?

 If you know the student ID, you can determine the res name
 The res name is not actually dependent on the student but on the res code:
 Student ID only works to find a res name because a student can only be in one res:
 it is not a direct functional dependence
 A functional dependence is where knowing one attribute uniquely determines another
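The definition of a functional dependence can be checked against sample rows with a small helper. This is only a sketch: `determines` is a hypothetical function, the rows are invented, and sample data can only show that a dependence is not contradicted. Whether it really holds comes from the meaning of the data.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on the `lhs` columns always agree on the `rhs` columns."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical student details rows (res columns only).
student_details = [
    {"student_no": 1, "res_code": "A", "res_name": "Alpha House"},
    {"student_no": 2, "res_code": "A", "res_name": "Alpha House"},
    {"student_no": 3, "res_code": "B", "res_name": "Beta House"},
]

key_to_code = determines(student_details, ["student_no"], ["res_code"])   # True
code_to_name = determines(student_details, ["res_code"], ["res_name"])    # True
key_to_name = determines(student_details, ["student_no"], ["res_name"])   # True, via res_code
code_to_key = determines(student_details, ["res_code"], ["student_no"])   # False
```

Student number leads to res name only through the chain student number, res code, res name, which is the transitive pattern. The repeated "Alpha House" in the sample rows is the duplication that the chain causes.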
Why not being 3NF is bad (1)

 In this example: if we change the name of a res, we need to change it for every student with that res in their record
 The intuition:
 since res name only depends on res code not the key in the record, it isn’t really defined on this table and there is no way to prevent repetitions – or inconsistent changes
 res name is not really an attribute of a student but of a res, so it logically does not belong in a student record
 Remember referential integrity?
 Storing res name like this cannot guarantee that if the name changes it changes consistently throughout
 The solution (fix):
 Separate out res name into another table
 While we are about it we can add in other res details
 Is this 1NF, 2NF and 3NF?
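The fix can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumptions adapted from the lecture's STUDENT-DETAILS and RESIDENCE_DETAILS, and the rows are invented. The foreign key is one possible way to get the referential integrity mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
conn.executescript("""
    CREATE TABLE residence_details (
        res_code TEXT PRIMARY KEY,
        res_name TEXT NOT NULL
    );
    CREATE TABLE student_details (
        student_no   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        res_code     TEXT REFERENCES residence_details(res_code)
    );
    INSERT INTO residence_details VALUES ('A', 'Alpha House'), ('B', 'Beta House');
    INSERT INTO student_details VALUES (1, 'Student One', 'A'), (2, 'Student Two', 'A');
""")

# The res name is stored once, so a rename is a single-row update.
conn.execute("UPDATE residence_details SET res_name = 'Alpha Hall' WHERE res_code = 'A'")
names = conn.execute("""
    SELECT s.student_no, r.res_name
    FROM student_details AS s
    JOIN residence_details AS r ON r.res_code = s.res_code
    ORDER BY s.student_no
""").fetchall()

# A student row that points at a non-existent res is rejected.
try:
    conn.execute("INSERT INTO student_details VALUES (3, 'Student Three', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Both students in res A see the new name after one update, and the database itself refuses a res code that has no matching residence row.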
Final student details table

Is this 1NF, 2NF and 3NF?
Final tables
All good?

 Before:
 To find out which res a student is in, look up 1 table
 Now:
 Look up 2 tables: STUDENT-DETAILS then RESIDENCE_DETAILS
 Before:
 To get a report like Student Number, Student Name, Courses, Results: read off from 1 table
 Now:
 Need a join of 3 tables: STUDENT-DETAILS, STUDENT-SUBJECT, SUBJECT
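A sketch of the three-table report, again using Python's `sqlite3`. The schema is an assumption: the table names are lower-case, underscore versions of the lecture's STUDENT-DETAILS, STUDENT-SUBJECT and SUBJECT, and the `result` column and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_details (student_no INTEGER PRIMARY KEY, student_name TEXT);
    CREATE TABLE subject (subject_code TEXT PRIMARY KEY, subject_name TEXT);
    CREATE TABLE student_subject (
        student_no   INTEGER REFERENCES student_details(student_no),
        subject_code TEXT REFERENCES subject(subject_code),
        result       INTEGER,
        PRIMARY KEY (student_no, subject_code)   -- composite key
    );
    INSERT INTO student_details VALUES (1, 'Student One'), (2, 'Student Two');
    INSERT INTO subject VALUES ('1', 'Databases'), ('2', 'Networks');
    INSERT INTO student_subject VALUES (1, '1', 70), (1, '2', 65), (2, '1', 80);
""")

# Student Number, Student Name, Courses, Results now needs two joins.
report = conn.execute("""
    SELECT sd.student_no, sd.student_name, su.subject_name, ss.result
    FROM student_details AS sd
    JOIN student_subject AS ss ON ss.student_no = sd.student_no
    JOIN subject AS su ON su.subject_code = ss.subject_code
    ORDER BY sd.student_no, su.subject_code
""").fetchall()
```

The query is longer than reading one wide table, but each name is stored once and the joins reassemble the report on demand.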
Summary

 1NF eliminates repeating groups


 2NF eliminates partial dependencies
 3NF eliminates non-key dependencies (also called transitive dependencies)
 And supports referential integrity
 If you think of each record as an entity, it should only contain things it defines
 Anything else: point to the place where it is defined
 Overall positives:
 No unnecessary repetition of data
 No blank entries in tables with repeating groups
 Negatives:
 Takes time to get right
 Need more joins than with fewer tables and joins can be slow
 Normalisation is a core concept of relational DBs – more in advanced courses
