Cape Notes Unit 2 Module 1 Content 09

Unit 2 – Module 1 – Information Management

Objective 9– Explain the concept of normalization


Content: Definition of normalization; attribute redundancy and anomalies; normal forms:
including 1NF, 2NF, 3NF; keys: primary key, foreign key and composite key (or compound
or concatenated); partial keys and non-key dependencies; relationships, use of ERDs.

What is Normalization?
Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update and deletion anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships. The purpose of normalization in SQL is to eliminate redundant (repetitive) data and ensure that data is stored logically.
Edgar Codd, the inventor of the relational model, proposed the theory of normalization of data with the introduction of the First Normal Form, and he continued to extend the theory with the Second and Third Normal Forms. He later joined Raymond F. Boyce to develop the theory of Boyce-Codd Normal Form.

What does normalization solve?


Illogically or inconsistently stored data can cause a number of problems. In a relational database, a logical and efficient design is critical. A poorly designed database may provide erroneous information, may be difficult to use, or may even fail to work properly. Most of these problems are the result of two bad design features: redundant data and anomalies. Redundant data is unnecessary, reoccurring data (repeating groups of data). Anomalies are occurrences that weaken the integrity of your data through irregular or inconsistent storage (delete, insert and update irregularities that generate inconsistent data).
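The update anomaly described above can be demonstrated concretely. The sketch below uses Python's built-in sqlite3 module; the table and column names (orders, customer, city, product) are invented for illustration:

```python
import sqlite3

# A single un-normalised table: the customer's city is repeated
# on every order row (redundant data).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer TEXT, city TEXT, product TEXT)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    ("Ann", "Kingston", "Pen"),
    ("Ann", "Kingston", "Book"),
])

# Update anomaly: changing the city on only ONE of Ann's rows leaves
# the two stored copies of her city contradicting each other.
con.execute("UPDATE orders SET city = 'Bridgetown' "
            "WHERE customer = 'Ann' AND product = 'Pen'")
cities = {row[0] for row in
          con.execute("SELECT city FROM orders WHERE customer = 'Ann'")}
print(cities)  # two different cities for the same customer
```

Because the same fact (Ann's city) is stored twice, a partial update silently produces inconsistent data. Normalization removes the duplication so this cannot happen.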
Basically, normalization is the process of efficiently organising data in a database. There are
two main objectives of the normalization process: eliminate redundant data (storing the same
data in more than one table) and ensure data dependencies make sense (only storing related
data in a table). Both of these are valuable goals as they reduce the amount of space a
database consumes and ensure that data is logically stored.
The process of designing a relational database includes making sure that a table contains only
data directly related to the primary key, that each data field contains only one item of data,
and that redundant (duplicated and unnecessary) data is eliminated. The task of a database
designer is to structure the data in a way that eliminates unnecessary duplication(s) and
provides a rapid search path to all necessary information. This process of specifying and
defining tables, keys, columns, and relationships in order to create an efficient database is
called normalization.
Normalization is part of successful database design. Without normalization, database systems
can be inaccurate, slow, and inefficient and they might not produce the data you expect.
We use the normalization process to design efficient and functional databases. By
normalizing, we store data where it logically and uniquely belongs. The normalization
process involves a few steps and each step is called a form. Forms range from the first normal
form (1NF) to fifth normal form (5NF).

When normalising a database, you should achieve four goals:

1. Arranging data into logical groups such that each group describes a small part of the
whole
2. Minimizing the amount of duplicated data stored in a database
3. Building a database in which you can access and manipulate the data quickly and
efficiently without compromising the integrity of the data storage
4. Organising the data such that, when you modify it, you make the changes in only one
place
NOTE: Sometimes database designers refer to these goals in terms such as data integrity,
referential integrity, or keyed data access.

Normalization is a complex process with many specific rules and different intensity levels. In
its full definition, normalization is the process of discarding repeating groups, minimizing
redundancy, eliminating composite keys for partial dependency and separating non-key
attributes.
In simple terms, the rules for normalization can be summed up in a single phrase: "Each
attribute (column) must be a fact about the key, the whole key, and nothing but the key." Said
another way, each table should describe only one type of entity (information).
A properly normalised design allows you to:
• Use storage space efficiently
• Eliminate redundant data
• Reduce or eliminate inconsistent data
• Ease the database maintenance burden
A bad database design usually includes:
• Repetition of information
• Inability to represent certain information
• Loss of information
• Difficulty to maintain information
When you normalise a database, you start from the general and work towards the specific, applying certain tests (checks) along the way. Some users call this process decomposition. It means decomposing (dividing/breaking down) a ‘big’ un-normalised table (file) into several smaller tables by:
• Eliminating insertion, update and delete anomalies
• Establishing functional dependencies
• Removing transitive dependencies
• Reducing non-key data redundancy
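The decomposition described above can be sketched as follows. Continuing the hypothetical customer/order example (the table and column names are invented for illustration), the one ‘big’ table is split into a customers table and an orders table linked by a foreign key, so each fact is stored in only one place:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT            -- each customer's city is stored exactly once
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    product TEXT
);
""")
con.execute("INSERT INTO customers VALUES (1, 'Ann', 'Kingston')")
con.executemany("INSERT INTO orders VALUES (?, 1, ?)",
                [(10, "Pen"), (11, "Book")])

# After decomposition, a change is made in only one place ...
con.execute("UPDATE customers SET city = 'Bridgetown' WHERE customer_id = 1")

# ... and every order consistently reflects it through the join.
rows = con.execute("""
    SELECT o.product, c.city
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
""").fetchall()
print(rows)
```

This is the payoff of goal 4 above: modifying the data means changing it in only one place.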

Key Terms used in Normalization

Primary Key: A primary key is a special relational database table column (or combination of
columns) designated to uniquely identify each table record. A primary key is used as a unique
identifier to quickly parse data within the table. A table cannot have more than one primary
key.
A primary key’s main features are:
• It must contain a unique value for each row of data.
• It cannot contain null values.
• Every row must have a primary key value.
A primary key might use one or more fields already present in the underlying data model, or
a specific extra field can be created to be the primary key.

Foreign Key: A foreign key is a column or group of columns in a relational database table
that provides a link between data in two tables. It acts as a cross-reference between tables
because it references the primary key of another table, thereby establishing a link between
them.

Composite Key: A composite key, in the context of relational databases, is a combination of two or more columns in a table that can be used to uniquely identify each row in the table. Uniqueness is only guaranteed when the columns are combined; taken individually, the columns do not guarantee uniqueness.
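The three kinds of key defined above can be seen in one small schema. This sketch uses Python's sqlite3 module; the students/courses/enrolments schema is invented for illustration (note that SQLite enforces foreign keys only when the PRAGMA is switched on):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite checks FKs only when asked
con.executescript("""
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY       -- primary key: unique, never null
);
CREATE TABLE courses (
    course_id INTEGER PRIMARY KEY
);
CREATE TABLE enrolments (
    student_id INTEGER REFERENCES students(student_id),  -- foreign key
    course_id  INTEGER REFERENCES courses(course_id),    -- foreign key
    PRIMARY KEY (student_id, course_id)  -- composite key: unique as a pair
);
""")
con.execute("INSERT INTO students VALUES (1)")
con.execute("INSERT INTO courses VALUES (101)")
con.execute("INSERT INTO enrolments VALUES (1, 101)")

# The composite key rejects a duplicate (student, course) pair ...
dup_rejected = False
try:
    con.execute("INSERT INTO enrolments VALUES (1, 101)")
except sqlite3.IntegrityError:
    dup_rejected = True

# ... and the foreign key rejects an enrolment for a student who
# does not exist in the students table.
fk_rejected = False
try:
    con.execute("INSERT INTO enrolments VALUES (99, 101)")
except sqlite3.IntegrityError:
    fk_rejected = True

print(dup_rejected, fk_rejected)
```

Neither student_id nor course_id is unique on its own in enrolments (a student takes many courses, a course has many students); only the combination identifies a row, which is exactly the composite-key situation described above.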
What is an ER diagram (ERD)?
An Entity Relationship Diagram, also known as an ERD, ER Diagram or ER model, is a type of structural diagram used in database design. An ERD contains different symbols and connectors that visualize two important pieces of information: the major entities within the system scope, and the inter-relationships among these entities.
While ER models are mostly developed for designing relational databases, in terms of both concept visualization and physical database design, there are other situations where ER diagrams can help. Here are some typical use cases:
• Database design - Depending on the scale of change, it can be risky to alter a database
structure directly in a DBMS. To avoid ruining the data in a production database, it is
important to plan out the changes carefully. ERD is a tool that helps. By drawing ER
diagrams to visualize database design ideas, you have a chance to identify the
mistakes and design flaws, and to make corrections before executing the changes in
the database.
• Database debugging - Debugging database issues can be challenging, especially when the database contains many tables, which requires writing complex SQL to get the information you need. By visualizing a database schema with an ERD, you have a full picture of the entire schema. You can easily locate entities, view their attributes and identify the relationships they have with others. All of this allows you to analyze an existing database and reveal problems more easily.
• Database creation and patching - Some ERD tools (Visual Paradigm, for example) include a database generation feature that can automate the database creation and patching process by means of ER diagrams. With such a tool, an ER design is no longer just a static diagram but a mirror that truly reflects the physical database structure.
• Aid in requirements gathering - Determine the requirements of an information system by drawing a conceptual ERD that depicts the high-level business objects of the system. Such an initial model can then be evolved into a physical database model that aids the creation of a relational database, or aids in the creation of process maps and data flow models.
