Cape Notes Unit 2 Module 1 Content 09
What is Normalization?
Normalization is a database design technique that reduces data redundancy and eliminates
undesirable characteristics such as insertion, update and deletion anomalies. Normalization
rules divide larger tables into smaller tables and link them using relationships. The purpose
of normalization in SQL is to eliminate redundant (repetitive) data and ensure that data is
stored logically.
The inventor of the relational model, Edgar Codd, proposed the theory of normalization of
data with the introduction of the First Normal Form, and he continued to extend the theory
with the Second and Third Normal Forms. He later joined Raymond F. Boyce to develop the
theory of Boyce-Codd Normal Form.
The goals of normalization are:
1. Arranging data into logical groups such that each group describes a small part of the
whole
2. Minimizing the amount of duplicated data stored in a database
3. Building a database in which you can access and manipulate the data quickly and
efficiently without compromising the integrity of the data storage
4. Organizing the data such that, when you modify it, you make the changes in only one
place
NOTE: Sometimes database designers refer to these goals in terms such as data integrity,
referential integrity, or keyed data access.
Normalization is a complex process with many specific rules and different levels of intensity.
In its full definition, normalization is the process of discarding repeating groups, minimizing
redundancy, eliminating partial dependencies on composite keys and separating out non-key
attributes.
In simple terms, the rules for normalization can be summed up in a single phrase: "Each
attribute (column) must be a fact about the key, the whole key, and nothing but the key." Said
another way, each table should describe only one type of entity (information).
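As a quick illustration of "nothing but the key" (the table and column names below are invented for this sketch), a customer's city is a fact about the customer, not about an order, so it does not belong in an Orders table:

-- Un-normalized version: customer_city describes the customer,
-- not the order identified by order_id.
CREATE TABLE OrdersUnnormalized (
    order_id      INT PRIMARY KEY,
    customer_id   INT,
    customer_city VARCHAR(50),
    order_date    DATE
);

-- Normalized version: the city is now a fact about the key of
-- Customers, and Orders keeps only the customer_id link.
CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    city        VARCHAR(50)
);

CREATE TABLE Orders (
    order_id    INT PRIMARY KEY,
    customer_id INT,
    order_date  DATE
);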
A properly normalized design allows you to:
• Use storage space efficiently
• Eliminate redundant data
• Reduce or eliminate inconsistent data
• Ease the database maintenance burden
A bad database design usually includes:
• Repetition of information
• Inability to represent certain information
• Loss of information
• Difficulty maintaining information
When you normalize a database, you start from the general and work towards the specific,
applying certain tests (checks) along the way. Some users call this process decomposition. It
means decomposing (dividing/breaking down) a ‘big’ un-normalized table (file) into several
smaller tables (see the sketch after this list) by:
• Eliminating insertion, update and delete anomalies
• Establishing functional dependencies
• Removing transitive dependencies
• Reducing non-key data redundancy
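A hedged sketch of such a decomposition, using invented student/course tables: in the ‘big’ table, the student's name repeats for every course taken, and course details depend on only part of the key, so a single change must be made in many rows. Decomposing removes these anomalies:

-- 'Big' un-normalized table: its key is the combination
-- (student_id, course_code), but student_name, course_title and
-- lecturer each depend on only part of that key (the partial
-- dependencies the list above mentions), so they repeat.
CREATE TABLE EnrolmentUnnormalized (
    student_id   INT,
    student_name VARCHAR(50),
    course_code  CHAR(8),
    course_title VARCHAR(60),
    lecturer     VARCHAR(50)
);

-- Decomposed tables: each fact is stored exactly once, so it can
-- be modified in only one place.
CREATE TABLE Student (
    student_id   INT PRIMARY KEY,
    student_name VARCHAR(50)
);

CREATE TABLE Course (
    course_code  CHAR(8) PRIMARY KEY,
    course_title VARCHAR(60),
    lecturer     VARCHAR(50)
);

-- Links students to courses; nothing here repeats.
CREATE TABLE Enrolment (
    student_id  INT,
    course_code CHAR(8),
    PRIMARY KEY (student_id, course_code)
);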
Primary Key: A primary key is a special relational database table column (or combination of
columns) designated to uniquely identify each table record. A primary key serves as a unique
identifier that allows data within the table to be located quickly. A table cannot have more
than one primary key.
A primary key’s main features are:
• It must contain a unique value for each row of data.
• It cannot contain null values.
• Every row must have a primary key value.
A primary key might use one or more fields already present in the underlying data model, or
a specific extra field can be created to be the primary key.
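For instance (the table names here are invented for illustration), a primary key can be declared on a single existing column or on a combination of columns:

-- Single-column primary key on an existing field.
CREATE TABLE Employee (
    employee_id INT PRIMARY KEY,  -- unique and never NULL
    first_name  VARCHAR(40),
    last_name   VARCHAR(40)
);

-- Composite primary key: the combination (order_id, line_no)
-- uniquely identifies each record.
CREATE TABLE OrderLine (
    order_id INT,
    line_no  INT,
    quantity INT,
    PRIMARY KEY (order_id, line_no)
);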
Foreign Key: A foreign key is a column or group of columns in a relational database table
that provides a link between data in two tables. It acts as a cross-reference between tables
because it references the primary key of another table, thereby establishing a link between
them.
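A minimal sketch (with invented Department and Staff tables): the dept_id column in Staff references the primary key of Department, linking the two tables:

-- The table being referenced: dept_id is its primary key.
CREATE TABLE Department (
    dept_id   INT PRIMARY KEY,
    dept_name VARCHAR(50)
);

-- Staff.dept_id is a foreign key: every value it holds must match
-- an existing Department.dept_id.
CREATE TABLE Staff (
    staff_id   INT PRIMARY KEY,
    staff_name VARCHAR(50),
    dept_id    INT,
    FOREIGN KEY (dept_id) REFERENCES Department (dept_id)
);

With this constraint in place, the database rejects a Staff row whose dept_id does not exist in Department, which is how the cross-reference keeps the link between the tables consistent.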