Database Normalization

The document discusses database normalization through three forms: First normal form requires eliminating multi-valued attributes and repeating groups from tables. Second normal form requires that all attributes depend on the full primary key. Third normal form prohibits transitive dependencies where a non-key attribute depends on another non-key attribute rather than the primary key. Proper normalization avoids data issues like redundancy, inconsistencies, and difficulties in queries.

Uploaded by

api-3841500
Copyright
© Attribution Non-Commercial (BY-NC)

Database Normalization

A poor database design can cripple an application, producing problems with
redundancy, inaccuracy, inconsistency, and concurrency in your data.
Normalization is a process that serves to reduce, if not eliminate, these problems.
Since most businesses use third normal form in the logical model, I'll
take you through first, second, and third normal form (1NF, 2NF, and 3NF).

First Normal Form requires that there be no multi-valued attributes and no
repeating groups. A multi-valued attribute contains more than one value for
that field in a single row.

Consider the following StudentCourses table.

StudentID  Course
12345      3100,3600,3900
54321      1300,2300,1200

In this table, the Course field is a multi-valued attribute. Each row packs
several course numbers into one field instead of holding a single value.

Now consider this StudentCourses table.

StudentID  Course1  Course2  Course3
12345      3100     3600     3900
54321      1300     2300     1200

The Course1, Course2, Course3 fields represent repeating groups.

The proper way to store this data follows. This design satisfies First Normal Form.

StudentID Course
12345 3100
12345 3600
12345 3900
54321 1300
54321 2300
54321 1200

In the first two designs, selecting the students that are enrolled in a certain
course is difficult. Say I want all of the students enrolled in course 3100. In
the first design, you'll have to pull all of the course data and parse it somehow.
In the second design, you'll have to check 3 different fields for course 3100. In
the final design, a simple query is enough:

Select StudentID from StudentCourses where Course=3100
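
The difference between the first and the final design can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database, which are my choice for illustration and not something the original text prescribes. The table name StudentCoursesMulti is hypothetical; it only exists so both designs can live side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# First design: multi-valued Course field (StudentCoursesMulti is a made-up name)
conn.execute("CREATE TABLE StudentCoursesMulti (StudentID INTEGER, Course TEXT)")
conn.executemany(
    "INSERT INTO StudentCoursesMulti VALUES (?, ?)",
    [(12345, "3100,3600,3900"), (54321, "1300,2300,1200")],
)

# Final design: one course per row (First Normal Form)
conn.execute("CREATE TABLE StudentCourses (StudentID INTEGER, Course INTEGER)")
conn.executemany(
    "INSERT INTO StudentCourses VALUES (?, ?)",
    [(12345, 3100), (12345, 3600), (12345, 3900),
     (54321, 1300), (54321, 2300), (54321, 1200)],
)

# First design: the application has to fetch every row and parse the list itself.
parsed = [
    student_id
    for student_id, courses in conn.execute(
        "SELECT StudentID, Course FROM StudentCoursesMulti"
    )
    if "3100" in courses.split(",")
]

# Final design: the database answers the question directly.
normalized = [
    row[0]
    for row in conn.execute(
        "SELECT StudentID FROM StudentCourses WHERE Course = 3100"
    )
]

print(parsed, normalized)
```

Both approaches find the same student, but only the normalized table lets the database do the filtering.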

Second Normal Form requires that any non-key field be dependent upon the
entire key. For example, consider the StudentCourses table below, where
StudentID and CourseID form a compound primary key.

StudentID  CourseID  StudentName  CourseLocation    Grade
12345      3100      April        Math Building     A
12345      1300      April        Science Building  B

The StudentName field does not depend at all on CourseID, but only on
StudentID. CourseLocation has no dependency on StudentID, but only on CourseID.

This data should be split into three tables as follows.

Students Table

StudentID Name
12345 April

Courses Table

CourseID CourseLocation
3100 Math Building
1300 Science Building

StudentCourses Table

StudentID  CourseID  Grade
12345      3100      A
12345      1300      B

In this example, grade was the only field dependent on the combination of
StudentID and CourseID.
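
As a rough sketch, the three-table split could be declared as below. The column types, the NOT NULL constraints, and the foreign keys are assumptions added for illustration; the text above only names the tables and columns. Python's sqlite3 module is used so the example runs without a database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked

conn.executescript("""
    CREATE TABLE Students (
        StudentID INTEGER PRIMARY KEY,
        Name      TEXT NOT NULL
    );
    CREATE TABLE Courses (
        CourseID       INTEGER PRIMARY KEY,
        CourseLocation TEXT NOT NULL
    );
    CREATE TABLE StudentCourses (
        StudentID INTEGER NOT NULL REFERENCES Students(StudentID),
        CourseID  INTEGER NOT NULL REFERENCES Courses(CourseID),
        Grade     TEXT,
        PRIMARY KEY (StudentID, CourseID)  -- compound key; Grade depends on all of it
    );
    INSERT INTO Students VALUES (12345, 'April');
    INSERT INTO Courses VALUES (3100, 'Math Building'), (1300, 'Science Building');
    INSERT INTO StudentCourses VALUES (12345, 3100, 'A'), (12345, 1300, 'B');
""")

# A join rebuilds the wide view whenever it is needed.
report = conn.execute("""
    SELECT s.Name, sc.CourseID, c.CourseLocation, sc.Grade
    FROM StudentCourses AS sc
    JOIN Students AS s ON s.StudentID = sc.StudentID
    JOIN Courses  AS c ON c.CourseID  = sc.CourseID
    ORDER BY sc.CourseID
""").fetchall()

for row in report:
    print(row)
```

Nothing is lost by the split: the join returns the same columns as the original single table, while the name and the location are each stored only once.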

Let's suppose that in the first table design, the first row of data was entered with
a StudentName of Aprok, a simple typo. Now, suppose the following SQL is run.

Delete from StudentCourses where StudentName='April'

The erroneous "Aprok" row will not be deleted. In the final design, however, the
name is stored only once, in the Students table, and enrollments are identified by
StudentID. The following SQL

Delete From StudentCourses where StudentID=12345

deletes every course that April was in, regardless of how the name was typed.
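
The two deletes can be compared side by side in a short sketch. WideStudentCourses is a hypothetical name for the original single-table design, used here only so that both versions can exist in the same in-memory SQLite database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Original wide design, with the typo in the first row
    CREATE TABLE WideStudentCourses (
        StudentID INTEGER, CourseID INTEGER, StudentName TEXT,
        CourseLocation TEXT, Grade TEXT,
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO WideStudentCourses VALUES
        (12345, 3100, 'Aprok', 'Math Building', 'A'),
        (12345, 1300, 'April', 'Science Building', 'B');

    -- Final design: enrollment rows carry only the key and the grade
    CREATE TABLE StudentCourses (
        StudentID INTEGER, CourseID INTEGER, Grade TEXT,
        PRIMARY KEY (StudentID, CourseID)
    );
    INSERT INTO StudentCourses VALUES (12345, 3100, 'A'), (12345, 1300, 'B');
""")

# Deleting by name misses the misspelled row.
conn.execute("DELETE FROM WideStudentCourses WHERE StudentName = 'April'")
left_in_wide = conn.execute(
    "SELECT COUNT(*) FROM WideStudentCourses"
).fetchone()[0]

# Deleting by ID removes every enrollment for the student.
conn.execute("DELETE FROM StudentCourses WHERE StudentID = 12345")
left_in_final = conn.execute(
    "SELECT COUNT(*) FROM StudentCourses"
).fetchone()[0]

print(left_in_wide, left_in_final)
```

One row survives the delete by name, while the delete by ID leaves none behind.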

Third Normal Form prohibits transitive dependencies. A transitive dependency
exists when a non-key attribute in a table depends on another non-key
attribute in that table, rather than directly on the primary key.

Consider the following example CourseSections table.

CourseID  Section  ProfessorID  ProfessorName
3100      1        6789         David
1300      1        6789         David

The professor is uniquely identified by the CourseID and Section of the course.
However, ProfessorName depends on ProfessorID and has no relation to
CourseID or Section.

This data is properly stored as follows.

Professors Table

ProfessorID ProfessorName
6789 David

CourseSections Table

CourseID  Section  ProfessorID
3100      1        6789
1300      1        6789

By splitting the data into two tables, the transitive dependency is removed.
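
A minimal sketch of the two-table design follows, again using Python's sqlite3 module. The column types, constraints, and the replacement name 'Dave' are assumptions made up for the example. It shows the practical benefit: changing a professor's name touches exactly one row, however many sections that professor teaches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Professors (
        ProfessorID   INTEGER PRIMARY KEY,
        ProfessorName TEXT NOT NULL
    );
    CREATE TABLE CourseSections (
        CourseID    INTEGER NOT NULL,
        Section     INTEGER NOT NULL,
        ProfessorID INTEGER NOT NULL REFERENCES Professors(ProfessorID),
        PRIMARY KEY (CourseID, Section)
    );
    INSERT INTO Professors VALUES (6789, 'David');
    INSERT INTO CourseSections VALUES (3100, 1, 6789), (1300, 1, 6789);
""")

# The name lives in one place, so a rename is a single-row update.
# ('Dave' is an invented value for this illustration.)
cursor = conn.execute(
    "UPDATE Professors SET ProfessorName = 'Dave' WHERE ProfessorID = 6789"
)
rows_changed = cursor.rowcount

# Every section sees the new name through the join.
names = conn.execute("""
    SELECT cs.CourseID, p.ProfessorName
    FROM CourseSections AS cs
    JOIN Professors AS p ON p.ProfessorID = cs.ProfessorID
    ORDER BY cs.CourseID
""").fetchall()

print(rows_changed, names)
```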

The original design of the CourseSections table introduces the chance
that ProfessorName may be corrupted. Perhaps on the second row, the
ProfessorName is entered as Davif, a simple typo. Since there is no
professor named Davif, the two rows now disagree about the name of professor 6789.
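
The kind of corruption described above can be reproduced and detected with a short sketch. The table follows the original, un-normalized CourseSections layout; the column types and the detection query are my own illustration, not part of the original text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CourseSections (
        CourseID INTEGER, Section INTEGER,
        ProfessorID INTEGER, ProfessorName TEXT,
        PRIMARY KEY (CourseID, Section)
    );
    INSERT INTO CourseSections VALUES
        (3100, 1, 6789, 'David'),
        (1300, 1, 6789, 'Davif');  -- the typo on the second row
""")

# Find professor IDs that appear with more than one spelling of the name.
conflicts = conn.execute("""
    SELECT ProfessorID, COUNT(DISTINCT ProfessorName)
    FROM CourseSections
    GROUP BY ProfessorID
    HAVING COUNT(DISTINCT ProfessorName) > 1
""").fetchall()

print(conflicts)
```

The query reports professor 6789 with two different names. With a separate Professors table this situation cannot arise, because each ProfessorID has only one row to hold its name.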
