Assignment 4
Tyler Reymer
April 3, 2022
returning or bringing data into a normal state or condition, hence normalization. In a relational
database, normalization is used to reduce redundancy of data. In practice, data is organized into three
main normal forms: first (1NF), second (2NF), and third normal form (3NF). Beyond these, fourth and
fifth normal forms exist, but they are not typically used because they strive for perfect database design
with diminishing returns when it comes to performance. Organizing the data further splits it across more
tables, which can add time to operations against the database that may be unnecessary.
design. It is difficult to keep data consistent when the same information is stored in different places in
the relational database, so a database with minimum redundancy is ideal. In addition, by creating more
relationships, a database inherently better protects data and eliminates inconsistent dependencies
(Eessaar, 2016).
The normalization process can greatly improve performance, reduce required disk space, and
simplify maintenance. On the other hand, if the database is too normalized, then it can increase
case basis. Too much normalization can create many tables, and the more tables that exist, the higher
the cost of joins and the longer the processing time. Normalization can simplify updates, but if a table
is rarely used for write operations, then normalizing it brings little benefit. Normalization should be
done in the right amount for the project requirements. For example, databases that do not change much
do not need to be fully normalized, as doing so will only create the need for complicated queries.
As we can see, there are no relationships and the class data is repeating. This can be fixed using 1NF.
1NF:
Now we have organized the classes into their own column to eliminate the repeating group.
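The move to 1NF can be sketched in a few lines of Python. The wide layout below, with one column per class (Class1, Class2, Class3), is only an assumption about what the unnormalized table looked like. The values come from the tables in this assignment, and the function name to_1nf is hypothetical.

```python
# Assumed unnormalized layout: one row per student, one column per class.
unnormalized = [
    {"Student#": 1025, "Advisor": "Reymer", "Adv-Room": 513,
     "Class1": "102-09", "Class2": "148-01", "Class3": "152-03"},
    {"Student#": 4129, "Advisor": "Parks", "Adv-Room": 123,
     "Class1": "102-09", "Class2": "148-01", "Class3": "176-08"},
]


def to_1nf(rows, group_columns=("Class1", "Class2", "Class3")):
    """Turn the repeating class columns into one Class# value per row."""
    flat = []
    for row in rows:
        # Columns outside the repeating group are copied unchanged.
        fixed = {k: v for k, v in row.items() if k not in group_columns}
        for column in group_columns:
            if row.get(column) is not None:
                flat.append({**fixed, "Class#": row[column]})
    return flat


first_normal_form = to_1nf(unnormalized)
for row in first_normal_form:
    print(row)
```

Each student now appears once per class, which removes the repeating group but repeats the advisor data on every row.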
Let’s take this one step further to 2NF:
Students:
Student#   Advisor   Adv-Room
1025       Reymer    513
4129       Parks     123
Registration:
Student#   Class#
1025       102-09
1025       148-01
1025       152-03
4129       102-09
4129       148-01
4129       176-08
Here we have created two tables, Students and Registration. The tables are related through Student#,
which is the primary key of Students and is referenced by each row of Registration. By doing this, we
removed the partial dependency in which Advisor and Adv-Room depended only on Student# rather than
on the combination of Student# and Class#. As a result, we eliminated redundant data.
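As a rough illustration, the two tables could be declared with Python's built-in sqlite3 module. The column names StudentNum, ClassNum, and AdvRoom are assumed stand-ins for Student#, Class#, and Adv-Room, since the # and - characters are awkward in SQL identifiers. This is a sketch, not the schema used for the assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Students (
        StudentNum INTEGER PRIMARY KEY,
        Advisor    TEXT NOT NULL,
        AdvRoom    INTEGER NOT NULL
    );
    CREATE TABLE Registration (
        StudentNum INTEGER NOT NULL REFERENCES Students(StudentNum),
        ClassNum   TEXT NOT NULL,
        PRIMARY KEY (StudentNum, ClassNum)
    );
""")

conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [(1025, "Reymer", 513), (4129, "Parks", 123)],
)
conn.executemany(
    "INSERT INTO Registration VALUES (?, ?)",
    [(1025, "102-09"), (1025, "148-01"), (1025, "152-03"),
     (4129, "102-09"), (4129, "148-01"), (4129, "176-08")],
)

# Advisor data is now stored once per student instead of once per class.
student_rows = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
registration_rows = conn.execute("SELECT COUNT(*) FROM Registration").fetchone()[0]

# A registration for a student that does not exist is rejected.
try:
    conn.execute("INSERT INTO Registration VALUES (9999, '102-09')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(student_rows, registration_rows, orphan_rejected)
```

The foreign key is what makes the relationship more than a naming convention: the database itself refuses a registration that points at a missing student.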
3NF:
Students:
Student#   Advisor
1025       Reymer
4129       Parks
Faculty:
Name     Room   Dept
Reymer   513    505
Parks    123    156
Now, Adv-Room is functionally dependent on Advisor rather than on the student. In other words, an
Adv-Room cannot exist without an Advisor, so we move that data to its own table. This also allows us to
add more attributes that depend on the advisor, like Department (Dept).
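The dependency of Adv-Room on Advisor can be checked against the sample rows with a small helper. The function name holds is hypothetical. A check like this only shows that the sample data does not contradict the dependency; it does not prove the rule for all future data.

```python
def holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value."""
    seen = {}
    for row in rows:
        key, value = row[determinant], row[dependent]
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


# The Students table as it stood after 2NF.
students_2nf = [
    {"Student#": 1025, "Advisor": "Reymer", "Adv-Room": 513},
    {"Student#": 4129, "Advisor": "Parks", "Adv-Room": 123},
]

advisor_determines_room = holds(students_2nf, "Advisor", "Adv-Room")

# Splitting on that dependency: students keep only the advisor name,
# and the room is stored once per advisor.
students_3nf = [
    {"Student#": r["Student#"], "Advisor": r["Advisor"]} for r in students_2nf
]
faculty_rooms = {r["Advisor"]: r["Adv-Room"] for r in students_2nf}

print(advisor_determines_room, students_3nf, faculty_rooms)
```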
Example query:
Scenario: You want to find out which students are assigned to a specific faculty member and which
department they belong to.
Select *
From Faculty, Students
Where Faculty.Name = Students.Advisor
This is inefficient because Select * returns every column from both tables rather than only the
specific fields that are important to the consumer, which uses more memory. This matters more as the
tables get larger. Instead, we can retrieve only the fields we care about to figure out which students
are assigned to which advisor per department, thus improving performance on the query.
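A narrower version of the query might look like the sketch below, run here through Python's sqlite3 module so it can be tried end to end. The choice of columns, the explicit JOIN, and the filter on one advisor are assumptions about what the scenario needs, and StudentNum is an assumed stand-in for the Student# column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Faculty (Name TEXT PRIMARY KEY, Room INTEGER, Dept INTEGER);
    CREATE TABLE Students (
        StudentNum INTEGER PRIMARY KEY,
        Advisor    TEXT REFERENCES Faculty(Name)
    );
    INSERT INTO Faculty VALUES ('Reymer', 513, 505), ('Parks', 123, 156);
    INSERT INTO Students VALUES (1025, 'Reymer'), (4129, 'Parks');
""")

# Name only the needed columns and state the join condition explicitly.
query = """
    SELECT Students.StudentNum, Faculty.Name, Faculty.Dept
    FROM Students
    JOIN Faculty ON Faculty.Name = Students.Advisor
    WHERE Faculty.Name = ?
"""
rows = conn.execute(query, ("Reymer",)).fetchall()
print(rows)
```

Passing the advisor name as a parameter keeps the query reusable for any faculty member, and the Room column is never read because the scenario does not ask for it.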
References
Eessaar, E. (2016). The Database Normalization Theory and the Theory of Normalized Systems:
Finding a Common Ground. Baltic Journal of Modern Computing, 4(1), 5–33.