05 Normalisation
Database Applications
Normalization
• Let’s represent this relation in a table:
READS
Name   PaperList
Smith  Record, Mail
Lee    Herald

• This is not ideal. Each person is associated with an unspecified number of papers. The items in the PaperList column do not have a consistent form.
• Generally, RDBMS can’t cope with relations like this. Each entry in a table needs to have a single data item in it.
• This is an unnormalised relation.
• All RDBMS require relations not to be like this - not to have multiple values in any column (i.e. no repeating groups)

READS2
Name   Paper
Smith  Record
Smith  Mail
Lee    Herald

• This clearly contains the same information.
• And it has the property that we sought. It is in First Normal Form (1NF).
– A relation is in 1NF if no entry consists of more than one value (i.e. does not have repeating groups)
• So this will be the first requirement in designing our databases:
– our relations must be in 1NF.
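The step from READS to READS2 can be sketched in Python. This is a minimal illustration rather than anything from the slides: the function name to_1nf and the representation of a relation as a list of tuples are assumptions made for the example.

```python
# Unnormalised relation READS: the PaperList entry may hold several values.
reads = [
    ("Smith", "Record, Mail"),
    ("Lee", "Herald"),
]


def to_1nf(rows):
    """Split each multi-valued PaperList entry into one row per paper."""
    result = []
    for name, paper_list in rows:
        for paper in paper_list.split(","):
            result.append((name, paper.strip()))
    return result


# READS2: every entry now holds exactly one value.
reads2 = to_1nf(reads)
for row in reads2:
    print(row)
```

Each row of reads2 holds a single data item per column, which is the 1NF property described above.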
Problems with a 1NF Relation: Duplication
• Now suppose that Smith borrows another book.
• To ensure 1NF, we shall have to have a complete new row:

StaffBorrower
Sno  Sname  Sdept      Grade  Salary  Bno  Date_out
1    Smith  Computing  2.7    26813   1    30/06/2002
2    Black  Marketing  1.5    17278   8    08/07/2002
1    Smith  Computing  2.7    26813   53   12/07/2002

• We have stored all the other details about this member of staff again, in the new row.
• Not only is information about Smith duplicated, but
– the fact that staff on grade 2.7 earn £26,813 is duplicated

Problems with a 1NF Relation: Update Anomalies
• Such repetition means that updates can be difficult.
• Suppose that Smith goes on to a new grade.
– Changes would be required to all records for Smith.
– (And there is a danger that we may miss some.)
• Suppose that the salary for grade 2.7 is changed.
– All records for all staff members on grade 2.7 would have to be changed.
• A fact should be stored only once. Updates are then problem-free.
• This example relation is poorly structured, being subject to update anomalies.
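The duplication and the resulting update anomaly can be illustrated with a short Python sketch. It assumes the StaffBorrower rows are held as dictionaries. The helper names salaries_by_grade and inconsistent_grades, and the new salary figure 27500, are hypothetical and chosen only for illustration.

```python
# StaffBorrower rows as given in the table, one dict per row.
staff_borrower = [
    {"Sno": 1, "Sname": "Smith", "Sdept": "Computing", "Grade": "2.7",
     "Salary": 26813, "Bno": 1, "Date_out": "30/06/2002"},
    {"Sno": 2, "Sname": "Black", "Sdept": "Marketing", "Grade": "1.5",
     "Salary": 17278, "Bno": 8, "Date_out": "08/07/2002"},
    {"Sno": 1, "Sname": "Smith", "Sdept": "Computing", "Grade": "2.7",
     "Salary": 26813, "Bno": 53, "Date_out": "12/07/2002"},
]


def salaries_by_grade(rows):
    """Collect the set of salaries recorded against each grade."""
    seen = {}
    for row in rows:
        seen.setdefault(row["Grade"], set()).add(row["Salary"])
    return seen


def inconsistent_grades(rows):
    """Return grades recorded with more than one salary."""
    return sorted(g for g, s in salaries_by_grade(rows).items() if len(s) > 1)


# Before any update the data is consistent, although the fact
# "grade 2.7 earns 26813" is stored twice.
print(inconsistent_grades(staff_borrower))  # []

# A careless update: the grade 2.7 salary changes (27500 is an
# illustrative figure), but only the first matching row is changed.
for row in staff_borrower:
    if row["Grade"] == "2.7":
        row["Salary"] = 27500
        break  # the second Smith row is missed

print(inconsistent_grades(staff_borrower))  # ['2.7']
```

Because the same fact is stored in two rows, a single missed row is enough to leave the relation contradicting itself.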
A formal apparatus
• We need a method of analysing relations to detect and prevent these problems
• We need a set of definitions and procedures to
– diagnose whether relations have a ‘silly’ design
– turn them into other, better designed, relations in a systematic way
• We begin by reminding ourselves what a relation “means”
– a relation’s interpretation is not always obvious from the names of its columns, e.g.

SUML
Stu   UCode  Lect      Mark
Gary  3131   Hamilton  64

The predicate of a relation
• Any relation has a predicate - a definition of what any row means
• This will usually just be a statement in natural language, e.g.
– SUML1 : “the student Stu took unit UCode and obtained a mark of Mark. Unit UCode is coordinated by lecturer Lect.”
• Or perhaps
– SUML2 : “the student Stu took unit UCode and obtained a mark of Mark. In that unit, their tutorial was taken by lecturer Lect.”
• Or even
– SUML3 : “the student Stu took unit UCode and obtained a mark of Mark. In that unit, lecturer Lect was one of the lecturers”
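One way to make the idea of a predicate concrete is to treat it as a sentence template that each row fills in. The sketch below is an illustration only: the labels SUML1 to SUML3 follow the slides, while the function instantiate and the use of Python format strings are assumptions.

```python
# Candidate predicates for SUML, written as templates over the column names.
PREDICATES = {
    "SUML1": ("the student {Stu} took unit {UCode} and obtained a mark of "
              "{Mark}. Unit {UCode} is coordinated by lecturer {Lect}."),
    "SUML2": ("the student {Stu} took unit {UCode} and obtained a mark of "
              "{Mark}. In that unit, their tutorial was taken by lecturer "
              "{Lect}."),
    "SUML3": ("the student {Stu} took unit {UCode} and obtained a mark of "
              "{Mark}. In that unit, lecturer {Lect} was one of the "
              "lecturers."),
}

# The single SUML row shown on the slide.
row = {"Stu": "Gary", "UCode": 3131, "Lect": "Hamilton", "Mark": 64}


def instantiate(predicate_name, row):
    """Fill a predicate template with the values of one row."""
    return PREDICATES[predicate_name].format(**row)


for name in PREDICATES:
    print(name, ":", instantiate(name, row))
```

The same row produces three different statements, which is why the column names alone do not settle what the relation means.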
– if any relation isn’t in 3NF, the E/R analysis was wrong, and we should repeat the process
END of topic!