Normalization

The document discusses database normalization. It defines normalization as a technique that organizes data in a database by eliminating redundant data and undesirable characteristics like anomalies during data modification. The purposes of normalization are to remove redundant data and ensure data dependencies are logically organized. The document then illustrates issues like data inconsistencies and anomalies when data is not normalized. It proceeds to explain the first, second and third normal forms which are rules to organize data in tables without these issues.

Uploaded by

anthony muthui
Running Head: NORMALIZATION 1

Normalization

Student’s name

Institution

Normalization is a technique for organizing the data in a relational database. It decomposes tables in order to eliminate data redundancy and other undesirable characteristics, chiefly update, deletion and insertion anomalies. In simple terms, normalization is a step-by-step process of organizing data in tabular form by eliminating duplicated data from the relation tables (Chapple, M., 2011).

Normalization has two main purposes:

• Removing redundant or useless data.

• Ensuring that data is stored logically, i.e. that all data dependencies make sense.

ILLUSTRATION WITHOUT NORMALIZATION.

Without normalization, it is difficult to manipulate and update a database without losing data. Insertion and deletion anomalies tend to be more frequent if the database is not normalized (Codd, E. F., 1982). To better understand the concept of normalization, let us consider the following teacher table:

Table 1: Table without normalization.

REG. NO.   NAME      ADDRESS     SUBJECT
19         CASSIDY   TENNESSEE   CHEMISTRY
20         KEVIN     MICHIGAN    PHYSICS
21         JORNARA   EXETER      PHYSICS
22         CASSIDY   TENNESSEE   MATHS



Update anomaly: To update the address of a teacher who occurs more than once in the table, we must update the address column in every one of that teacher's rows; otherwise the data becomes inconsistent (Van Gucht, D., 1985).

Insertion anomaly: Suppose a new teacher has to be admitted. We have the teacher's REG. NO., name and address, but if the teacher is not yet teaching any subject, we have to insert a NULL in the subject column. This is an insertion anomaly (Lee, H., 1995).

Deletion anomaly: Suppose the teacher with REG. NO. 19 has been given only one subject and drops it temporarily. Deleting that row deletes the entire teacher record along with it.
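The three anomalies can be reproduced on the data of Table 1 with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database. The table name teacher, the column names, the new address OHIO and the new teacher MORGAN are illustrative assumptions, not part of the original example. The deletion step uses REG. NO. 20 because KEVIN has only one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE teacher ("
    "reg_no INTEGER PRIMARY KEY, name TEXT, address TEXT, subject TEXT)"
)
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?, ?, ?)",
    [
        (19, "CASSIDY", "TENNESSEE", "CHEMISTRY"),
        (20, "KEVIN", "MICHIGAN", "PHYSICS"),
        (21, "JORNARA", "EXETER", "PHYSICS"),
        (22, "CASSIDY", "TENNESSEE", "MATHS"),
    ],
)

# Update anomaly: changing the address in only one of CASSIDY's rows
# leaves two different addresses for the same teacher.
# (OHIO is a hypothetical new address.)
conn.execute("UPDATE teacher SET address = 'OHIO' WHERE reg_no = 19")
addresses = {
    row[0]
    for row in conn.execute("SELECT address FROM teacher WHERE name = 'CASSIDY'")
}

# Insertion anomaly: a teacher without a subject can only be stored with NULL.
# (MORGAN / NEVADA is a hypothetical new teacher.)
conn.execute("INSERT INTO teacher VALUES (23, 'MORGAN', 'NEVADA', NULL)")
null_subjects = conn.execute(
    "SELECT COUNT(*) FROM teacher WHERE subject IS NULL"
).fetchone()[0]

# Deletion anomaly: KEVIN has a single row, so dropping his subject by
# deleting that row removes every trace of him.
conn.execute("DELETE FROM teacher WHERE reg_no = 20")
kevin_rows = conn.execute(
    "SELECT COUNT(*) FROM teacher WHERE name = 'KEVIN'"
).fetchone()[0]

print(addresses, null_subjects, kevin_rows)
```

After the script runs, CASSIDY has two conflicting addresses, one row carries a NULL subject, and no row for KEVIN remains.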

NORMALIZATION FORMS.

The rules of normalization are subdivided into:

I. First Normal Form

II. Second Normal Form

III. Third Normal Form

First Normal Form/1NF

According to the first normal form, no row may contain a repeating group of data; that is, every column must hold a single value in each row, and the same kind of information must not be repeated across several columns. This means that each table should be organized into distinct rows, and every row must have a primary key that identifies it uniquely (Diederich, J., & Milton, J., 1988).



Usually, the primary key is a single column. However, more than one column can be combined to form a single primary key. As an example, let us consider a table which is not in 1NF:

Table 2: Teacher table

TEACHER   AGE   SUBJECT
CASSIDY   35    CHEMISTRY, PHYSICS
KEVIN     34    PHYSICS
JORNARA   37    PHYSICS

As the first normal form states, a row must not contain a column that stores more than one item, for example several items separated by commas. Instead, such data must be separated into multiple independent rows.

Table 3: Teacher table following First Normal Form

TEACHER   AGE   SUBJECT
CASSIDY   35    CHEMISTRY
CASSIDY   35    PHYSICS
KEVIN     34    PHYSICS
JORNARA   37    PHYSICS

Applying 1NF increases data redundancy, since the same column values now appear in multiple rows; however, each row is unique as a whole.
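The step from the comma-separated teacher table to its 1NF version can be sketched in a few lines of Python. This is only an illustration under the assumption that subjects are separated by commas; the function name to_first_normal_form and the tuple layout are hypothetical choices.

```python
def to_first_normal_form(rows):
    """Split a comma-separated SUBJECT value into one row per subject."""
    normalized = []
    for teacher, age, subjects in rows:
        for subject in subjects.split(","):
            # strip() removes the space that follows each comma
            normalized.append((teacher, age, subject.strip()))
    return normalized


# Rows of the teacher table that is not in 1NF
table_2 = [
    ("CASSIDY", 35, "CHEMISTRY, PHYSICS"),
    ("KEVIN", 34, "PHYSICS"),
    ("JORNARA", 37, "PHYSICS"),
]

table_3 = to_first_normal_form(table_2)
for row in table_3:
    print(row)
```

Each resulting row holds exactly one subject, and CASSIDY's name and age are repeated once per subject, which is the redundancy described above.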



Second Normal Form/2NF

According to the second normal form, no column may be partially dependent on the primary key. This means that in a table with a concatenated primary key, every column that is not part of the primary key must depend on the whole concatenated key.

If a column depends on only one part of the concatenated key, the table fails the Second Normal Form (Kolahi, S., 2007).

In the First Normal Form table above, there are two rows for CASSIDY, one for each of the two subjects he has opted for. This is searchable and follows 1NF, though it is an inefficient use of space. In addition, the candidate key of that table is (TEACHER, SUBJECT), but the age of a teacher depends solely on the TEACHER column, which violates 2NF. Second Normal Form is achieved by splitting the subjects off into an independent table and using the teacher names as foreign keys (Kung, H. J., & Hui-Lien, T., 2006).
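A partial dependency of this kind can be spotted mechanically. The sketch below is an illustration only: the helper determines is a hypothetical name, and it merely tests whether the sample rows are consistent with a functional dependency, which is weaker than proving that the dependency holds for all possible data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in `rows`, equal values in `lhs` always imply equal values in `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault keeps the first value seen for this key; a later,
        # different value means the dependency does not hold.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Rows of the 1NF teacher table
rows = [
    {"TEACHER": "CASSIDY", "AGE": 35, "SUBJECT": "CHEMISTRY"},
    {"TEACHER": "CASSIDY", "AGE": 35, "SUBJECT": "PHYSICS"},
    {"TEACHER": "KEVIN", "AGE": 34, "SUBJECT": "PHYSICS"},
    {"TEACHER": "JORNARA", "AGE": 37, "SUBJECT": "PHYSICS"},
]

candidate_key = ("TEACHER", "SUBJECT")
non_key_columns = ("AGE",)

# A non-key column determined by a single part of the key is a partial dependency.
partial = [
    (part, col)
    for part in candidate_key
    for col in non_key_columns
    if determines(rows, (part,), (col,))
]
print(partial)
```

On these rows, AGE is determined by TEACHER alone but not by SUBJECT alone, so the only partial dependency reported is TEACHER to AGE.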

Table 4: New teacher table following 2NF

TEACHER   AGE
CASSIDY   35
KEVIN     34
JORNARA   37

In the teacher table above, the candidate key is the TEACHER column, since the only other column, AGE, depends on it.



Table 5: New subject table following 2NF

TEACHER   SUBJECT
CASSIDY   CHEMISTRY
CASSIDY   PHYSICS
KEVIN     PHYSICS
JORNARA   PHYSICS

(TEACHER, SUBJECT) is the candidate key in the new subject table. Both of the tables above are in Second Normal Form, so they no longer suffer from the update anomaly that the partial dependency caused.
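The two-table design can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the table names teacher and teacher_subject, the lowercase column names, the new age 36 and the teacher name UNKNOWN are all illustrative and do not come from the original tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE teacher (
        teacher TEXT PRIMARY KEY,
        age     INTEGER
    );
    CREATE TABLE teacher_subject (
        teacher TEXT REFERENCES teacher(teacher),
        subject TEXT,
        PRIMARY KEY (teacher, subject)
    );
    """
)
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?)",
    [("CASSIDY", 35), ("KEVIN", 34), ("JORNARA", 37)],
)
conn.executemany(
    "INSERT INTO teacher_subject VALUES (?, ?)",
    [
        ("CASSIDY", "CHEMISTRY"),
        ("CASSIDY", "PHYSICS"),
        ("KEVIN", "PHYSICS"),
        ("JORNARA", "PHYSICS"),
    ],
)

# The age is stored once, so a single-row update is enough (36 is hypothetical).
conn.execute("UPDATE teacher SET age = 36 WHERE teacher = 'CASSIDY'")

# Joining the two tables rebuilds the single-table view without inconsistency.
joined = conn.execute(
    "SELECT t.teacher, t.age, s.subject "
    "FROM teacher AS t JOIN teacher_subject AS s ON s.teacher = t.teacher "
    "ORDER BY t.teacher, s.subject"
).fetchall()

# The foreign key rejects a subject row for a teacher that does not exist.
try:
    conn.execute("INSERT INTO teacher_subject VALUES ('UNKNOWN', 'MATHS')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(joined, fk_enforced)
```

Because AGE lives in exactly one row per teacher, both of CASSIDY's joined rows show the new age after a single update.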

Third Normal Form/3NF

3NF states that every non-prime attribute of a table must depend directly on the primary key, and not on another non-prime attribute. In simple terms, the table must be in Second Normal Form and its transitive functional dependencies must be removed (Codd, E. F., 1982).

For instance, let’s consider the table below:

Table 6: Teacher Detail table.

TEACHER REG. NO.   TEACHER NAME   DOB   STREET   CITY   STATE   ZIP

In the table above, the primary key is the TEACHER REG. NO. Street, city and state all depend on the ZIP, which in turn depends on the primary key. This indirect dependency of those fields on the key through the ZIP is called a transitive dependency. To bring the table into 3NF, we have to move the other fields, i.e. street, city and state, to a new table in which ZIP is the primary key.

Table 7: New Teacher detail table.

TEACHER REG. NO.   TEACHER NAME   DOB   ZIP

Table 8: Address table.

ZIP   STREET   CITY   STATE

Removing transitive dependencies has the advantage of reducing duplicated data and improving data integrity.
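The split into a teacher detail table and an address table can be sketched as follows. The original tables list column names only, so every data row here is hypothetical, as is the function name to_third_normal_form. The sketch also assumes, as stated above, that ZIP determines street, city and state.

```python
def to_third_normal_form(rows):
    """Split teacher-detail rows into a teacher table and a ZIP-keyed address table."""
    teachers = []
    addresses = {}
    for reg_no, name, dob, street, city, state, zip_code in rows:
        teachers.append((reg_no, name, dob, zip_code))
        address = (street, city, state)
        # Each ZIP is stored once; a conflicting address for the same ZIP
        # would contradict the assumed dependency ZIP -> street, city, state.
        if addresses.setdefault(zip_code, address) != address:
            raise ValueError(f"conflicting address for ZIP {zip_code}")
    return teachers, addresses


# Hypothetical sample rows (not from the original document):
# (reg_no, name, dob, street, city, state, zip)
rows = [
    (1, "TEACHER A", "1980-01-01", "1 MAIN ST", "SPRINGFIELD", "STATE X", "00001"),
    (2, "TEACHER B", "1985-05-05", "1 MAIN ST", "SPRINGFIELD", "STATE X", "00001"),
    (3, "TEACHER C", "1990-09-09", "2 HIGH ST", "RIVERTON", "STATE Y", "00002"),
]

teachers, addresses = to_third_normal_form(rows)
print(teachers)
print(addresses)
```

Three teacher rows remain, but the shared address is stored only once under its ZIP, which is where the reduction in duplicated data comes from.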

References.

1. Bahmani, A. H., Naghibzadeh, M., & Bahmani, B. (2008, May). Automatic database normalization and primary key generation. In Electrical and Computer Engineering, 2008. CCECE 2008. Canadian Conference on (pp. 000011-000016). IEEE.

2. Biskup, J., Dayal, U., & Bernstein, P. A. (1979, May). Synthesizing independent database schemas. In Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data (pp. 143-151). ACM.

3. Chapple, M. (2011). Database normalization basics.

4. Codd, E. F. (1982). Relational database: a practical foundation for productivity. Communications of the ACM, 25(2), 109-117.

5. Diederich, J., & Milton, J. (1988). New methods and fast algorithms for database normalization. ACM Transactions on Database Systems (TODS), 13(3), 339-365.

6. Hillyer, M. (2003). An introduction to database normalization. MySQL AB.

7. Kolahi, S. (2007). Dependency-preserving normalization of relational and XML data. Journal of Computer and System Sciences, 73(4), 636-647.

8. Kung, H. J., & Hui-Lien, T. (2006). An alternative approach to teaching database normalization: A simple algorithm and an interactive e-Learning tool. Journal of Information Systems Education, 17(3), 315.

9. Lee, H. (1995). Justifying database normalization: a cost/benefit model. Information Processing & Management, 31(1), 59-67.

10. Van Gucht, D. (1985). Theory of unnormalized relational structures (database, normalization).