BCDE103 Keys Lab Answer

The document describes normalizing a set of tables related to albums, tracks, genres and artists into third normal form. It begins with some sample data in an unnormalized table and describes steps to normalize it. First, the data is split into multiple tables to reach first normal form. Primary keys are then identified to properly link the tables: the album name and artist form the primary key of the album table, while foreign keys link related data in other tables back to the album table. Finally, the document verifies the tables are in third normal form.

Uploaded by

elliottjs1091

Unnormalised table (UNF)

Enter just enough sample data to match the description given. Don't add any ID attributes at this stage. Note that we
will have to simplify how we handle names because an artist can be a person (e.g., Talvin Singh) or a group (e.g., Pink
Floyd).

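album_name  release_date  album_length  artist        producer      track_names        track_order  record_label  genre
Meddle      10/30/1971    46:08         Pink Floyd    Roger Waters  One of These Days  1            Harvest       Rock
                                                                    A Pillow of Winds  2                          Progressive Rock
OK          11/9/1998     60:48         Talvin Singh  Talvin Singh  Traveller          1            Island        Electronica
                                                                    Butterfly          2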

Step 1
Remove multiple values from fields to get the database tables into 1NF. Remember that we aren't separating first
and last names, as commented on above. NOTE: we aren't concerned with keys at this stage.

Album Track Table

track_name track_order
One of These Days 1
A Pillow of Winds 2
Traveller 1
Butterfly 2

Album Genre Table

genre
Rock
Progressive Rock
Electronica

Album Table

album_name  release_date  length  artist        producer      record_label
Meddle      10/30/1971    46:08   Pink Floyd    Roger Waters  Harvest
OK          11/9/1998     60:48   Talvin Singh  Talvin Singh  Island
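
The split performed in Step 1 can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `split_to_1nf` are assumptions made for this sketch, not part of the lab, and no keys are added yet, just as in the step above.

```python
# UNF records: the track and genre fields each hold several values.
unf_rows = [
    {
        "album_name": "Meddle", "release_date": "10/30/1971", "length": "46:08",
        "artist": "Pink Floyd", "producer": "Roger Waters", "record_label": "Harvest",
        "track_names": ["One of These Days", "A Pillow of Winds"],
        "genre": ["Rock", "Progressive Rock"],
    },
    {
        "album_name": "OK", "release_date": "11/9/1998", "length": "60:48",
        "artist": "Talvin Singh", "producer": "Talvin Singh", "record_label": "Island",
        "track_names": ["Traveller", "Butterfly"],
        "genre": ["Electronica"],
    },
]


def split_to_1nf(rows):
    """Move each multi-valued field into its own table, one value per row (hypothetical helper)."""
    albums, tracks, genres = [], [], []
    for row in rows:
        # The album row keeps only the single-valued fields.
        albums.append({k: v for k, v in row.items() if k not in ("track_names", "genre")})
        # track_order is the position of the track within its album.
        for order, name in enumerate(row["track_names"], start=1):
            tracks.append({"track_name": name, "track_order": order})
        for genre in row["genre"]:
            genres.append({"genre": genre})
    return albums, tracks, genres


albums, tracks, genres = split_to_1nf(unf_rows)
```

The resulting track and genre rows carry no reference to their album, which is exactly the problem Step 2 runs into.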

Step 2
Determine whether the new tables are in 2NF.

In the Album Track Table it appears that the track_name determines the uniqueness of a row and that the
track_order depends on the track_name, and that the table is therefore in 2NF. However, if we had more data we
would find that track_names might be repeated on different albums (Google "songs with the same name").
Currently, there is no other column to determine the uniqueness of a record in the Album Track Table, but we can
also see that the table has no link (key) back to the Album Table, so it looks like we can't solve this issue yet. Before
you add an album_id column, read the rest of this solution, because that might not necessarily be the answer...

The Album Genre Table has only one column, so is that table in 2NF by default? We don't have a key linking a record in that
table with the Album Table either, so we will have to come back to this table as well.

Why haven't we been able to determine whether these tables are in 2NF? The answer is we haven't worked out the
primary keys for the tables - what column or combination of columns makes each row unique? Let's switch to that
task...

Step 3
Determine each table's primary key.

This time we will start with the database's main table: Album Table. In this table it appears that the album_name
column determines the uniqueness of records in the table. That would make it a candidate to be the primary key for
the table. However, I just Googled "albums with the same name" and, yes, there are albums with the same name *by
different artists*. That suggests that the primary key is then composed of two columns: album_name and artist. If
you think about it, that makes intuitive sense: artists can call their albums whatever they like, even if they have the
same titles as albums by other artists. So, we have a natural (or "business") primary key composed from those
columns. There's one more thing: artists can be people, and people don't have unique names; actually, there are
even bands with the same names. We won't be able to find a natural primary key involving artist name, so we need
to create a surrogate key (for Artist): artist_id. Let's just add it to our Album Table to see what happens:

Album Table

artist_id (PK)  artist_name   album_name (PK)  release_date  length  producer      record_label
10001           Pink Floyd    Meddle           10/30/1971    46:08   Roger Waters  Harvest
10546           Talvin Singh  OK               11/9/1998     60:48   Talvin Singh  Island

(Step 3 continued)
Now we have a primary key, which is composed of the artist_id and the album_name. The combination of artist_id
and album_name cannot be repeated on any records, otherwise we have duplicate records in our table, which is
forbidden by the relational data model.
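
As a rough illustration of how a composite primary key behaves, here is a minimal sketch using Python's built-in sqlite3 module. The lab itself works in spreadsheets, so the SQL table name `album`, the reduced set of columns and the artist_id 99999 are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE album (
        artist_id    INTEGER NOT NULL,
        album_name   TEXT    NOT NULL,
        release_date TEXT,
        PRIMARY KEY (artist_id, album_name)  -- composite primary key
    )
""")
conn.execute("INSERT INTO album VALUES (10001, 'Meddle', '10/30/1971')")
conn.execute("INSERT INTO album VALUES (10546, 'OK', '11/9/1998')")

# A hypothetical second artist (id 99999) may reuse an existing album name,
# because only the combination of the two columns has to be unique.
conn.execute("INSERT INTO album VALUES (99999, 'OK', NULL)")

# Repeating an existing (artist_id, album_name) combination is refused.
try:
    conn.execute("INSERT INTO album VALUES (10001, 'Meddle', '10/30/1971')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

album_count = conn.execute("SELECT COUNT(*) FROM album").fetchone()[0]
```

After this runs, `duplicate_rejected` is True and the table holds three rows: neither column is unique on its own, but the pair is.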

We said above that the other tables we have created don't have primary keys or links (foreign keys) back to the
Album Table, so let's look at those tables in turn and determine their keys.

Here is the Album Track Table again:

Album Track Table


track_name track_order
One of These Days 1
A Pillow of Winds 2
Traveller 1
Butterfly 2

We need to relate that to the Album Table. We do that by adding the primary key of the Album Table (artist_id and
album_name) to the Album Track Table, which creates the foreign key (artist_id and album_name):

Album Track Table

artist_id (FK)  album_name (FK)  track_name         track_order
10001           Meddle           One of These Days  1
10001           Meddle           A Pillow of Winds  2
10546           OK               Traveller          1
10546           OK               Butterfly          2

Note: there is nothing preventing a foreign key from being composed of more than one column in the relational
model of data. (Reminder: I said above that we don't simply create an album_id at the start.) So, now do we have a
primary key for the Album Track Table? Yes: the combination of artist_id, album_name and track_name will be
unique: each track on an album by an artist will be uniquely named. If we remove any one of those columns then we
can get duplicates: different artists could record albums with the same name that also have a track with the same
name on them; an artist could record the same track on more than one of their albums. So, the primary key for
Album Track Table is artist_id, album_name, track_name, marked (PK) in the column headers here:

Album Track Table

artist_id (PK) (FK)  album_name (PK) (FK)  track_name (PK)    track_order
10001                Meddle                One of These Days  1
10001                Meddle                A Pillow of Winds  2
10546                OK                    Traveller          1
10546                OK                    Butterfly          2

We can do the same for the Album Genre Table, adding the artist_id and album_name as a foreign key. Does the
table now have a primary key? Yes: it is composed of all three columns, which will have unique combinations of
values - compare the three records in this table.

Album Genre Table

artist_id (PK) (FK)  album_name (PK) (FK)  genre (PK)
10001                Meddle                Rock
10001                Meddle                Progressive Rock
10546                OK                    Electronica
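
The composite foreign key described above can also be tried out in SQL. The following is a sketch under assumptions, not the lab's own method: it uses Python's sqlite3 module, the table names `album` and `album_track` are invented for the example, and the Album Table is cut down to its key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE album (
        artist_id  INTEGER NOT NULL,
        album_name TEXT    NOT NULL,
        PRIMARY KEY (artist_id, album_name)
    );
    CREATE TABLE album_track (
        artist_id   INTEGER NOT NULL,
        album_name  TEXT    NOT NULL,
        track_name  TEXT    NOT NULL,
        track_order INTEGER,
        PRIMARY KEY (artist_id, album_name, track_name),
        -- The foreign key copies every column of the parent's primary key.
        FOREIGN KEY (artist_id, album_name) REFERENCES album (artist_id, album_name)
    );
""")
conn.executemany("INSERT INTO album VALUES (?, ?)", [(10001, "Meddle"), (10546, "OK")])
conn.executemany(
    "INSERT INTO album_track VALUES (?, ?, ?, ?)",
    [
        (10001, "Meddle", "One of These Days", 1),
        (10001, "Meddle", "A Pillow of Winds", 2),
        (10546, "OK", "Traveller", 1),
        (10546, "OK", "Butterfly", 2),
    ],
)

# Artist 10001 has no album called 'OK', so this track has no parent album row.
try:
    conn.execute("INSERT INTO album_track VALUES (10001, 'OK', 'Traveller', 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

track_count = conn.execute("SELECT COUNT(*) FROM album_track").fetchone()[0]
```

Both values in the rejected row exist somewhere in the Album Table, but not together in one row, which is why the foreign key has to be checked as a pair.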

This spreadsheet is getting long - let's continue on the next spreadsheet…


Here are the tables we currently have. They all have primary keys and two of them have foreign keys to link them with
the Album Table. REMEMBER: when creating a foreign key in a table, you have to copy all the attributes of the primary
key in the source table, so if a primary key is a composite key (more than one attribute), then a related foreign key will
also be a composite key.

Album Table

artist_id (PK)  artist_name   album_name (PK)  release_date  length  producer      record_label
10001           Pink Floyd    Meddle           10/30/1971    46:08   Roger Waters  Harvest
10546           Talvin Singh  OK               11/9/1998     60:48   Talvin Singh  Island

Album Track Table

artist_id (PK) (FK)  album_name (PK) (FK)  track_name (PK)    track_order
10001                Meddle                One of These Days  1
10001                Meddle                A Pillow of Winds  2
10546                OK                    Traveller          1
10546                OK                    Butterfly          2

Album Genre Table

artist_id (PK) (FK)  album_name (PK) (FK)  genre (PK)
10001                Meddle                Rock
10001                Meddle                Progressive Rock
10546                OK                    Electronica

Now that they all have primary keys (unique identifiers) we can determine if the tables are in 2NF or not. A table is in 2NF
if there are no partial dependencies: there are no non-key columns that depend on only some, not all, of the attributes
that make up a composite primary key. (Actually, it is more complicated than that, but that is sufficient for this course
and all our examples.) Note that a table that has a primary key that is made from only one attribute is automatically in
2NF. Review the Album Table: all non-key columns except for the artist_name depend on the whole primary key;
artist_name depends on just the artist_id (which is a partial dependency). So, we have to remove artist_name from this
table and put it in a separate Artist Table; we also take the artist_id because that is what the artist_name depends on.
The artist_id is therefore the primary key in the Artist Table.

Artist Table

artist_id (PK) artist_name


10001 Pink Floyd
10546 Talvin Singh
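
One way to see the partial dependency is to test, on sample rows, whether part of the key already fixes a column's value. The sketch below is an illustration only: the helper `determines` is a hypothetical function written for this example, and the third row is invented so that one artist has two albums. Sample data can only disprove a dependency, never prove it, so treat a True result as "not contradicted by these rows".

```python
def determines(rows, lhs, rhs):
    """Return True if rows that agree on the lhs columns always agree on the rhs column."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # same lhs values, different rhs value: dependency refuted
        seen[key] = row[rhs]
    return True


album_rows = [
    {"artist_id": 10001, "artist_name": "Pink Floyd", "album_name": "Meddle",
     "release_date": "10/30/1971"},
    {"artist_id": 10546, "artist_name": "Talvin Singh", "album_name": "OK",
     "release_date": "11/9/1998"},
    # Hypothetical row, not from the lab data: a second album for artist 10001.
    {"artist_id": 10001, "artist_name": "Pink Floyd", "album_name": "Example Album",
     "release_date": "1/1/2000"},
]

# artist_name is fixed by artist_id alone, which is only part of the key.
name_depends_on_part = determines(album_rows, ["artist_id"], "artist_name")

# release_date is not fixed by artist_id alone; it needs the whole key.
date_depends_on_part = determines(album_rows, ["artist_id"], "release_date")
date_depends_on_key = determines(album_rows, ["artist_id", "album_name"], "release_date")
```

Here `name_depends_on_part` is True while `date_depends_on_part` is False, which matches the reasoning above: artist_name moves out to the Artist Table and release_date stays in the Album Table.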

There is one column we haven't commented on that might now stand out: the producer column. That will be a person or
sometimes a band and we will have trouble using names as unique identifiers, so we need to deal with that in the same
way we dealt with artists. Let's do this in one step by creating a Producer Table that has a producer_id as the primary
key. Note: it is possible for a person/band to be both an artist and a producer, but this is beyond the scope of what we
are doing here (if you have thought of this, you might have a Role table).

Producer Table

producer_id (PK) name


50487 Roger Waters
50989 Talvin Singh
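
A surrogate key like producer_id is usually generated by the database rather than typed in by hand. The following is a minimal sketch using Python's sqlite3 module, with an assumed table name `producer`; the generated ids will be small integers rather than the 50487 and 50989 shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE producer (
        producer_id INTEGER PRIMARY KEY,  -- surrogate key, assigned by SQLite when omitted
        name        TEXT NOT NULL         -- deliberately not UNIQUE: names can repeat
    )
""")

# Two different producers who happen to share a name still get distinct ids.
first_id = conn.execute("INSERT INTO producer (name) VALUES ('Talvin Singh')").lastrowid
second_id = conn.execute("INSERT INTO producer (name) VALUES ('Talvin Singh')").lastrowid

producer_count = conn.execute("SELECT COUNT(*) FROM producer").fetchone()[0]
```

Because the key carries no meaning of its own, two rows with identical names remain distinguishable, which is the reason the text avoids using names as identifiers.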

An important update to our Album Table is to identify which columns are now foreign keys pointing to the two tables we
just created. These are artist_id, which points to the primary key of the Artist Table, and producer_id, which points to the
primary key of the Producer Table. The Album Table now looks like this (note that an attribute can be in both a primary
key and a foreign key):

Album Table

artist_id (PK) (FK)  album_name (PK)  release_date  length  producer_id (FK)  record_label
10001                Meddle           10/30/1971    46:08   50487             Harvest
10546                OK               11/9/1998     60:48   50989             Island

Finally, we need to determine if the tables are in 3NF. Answer: none of the tables have any columns that are dependent
on non-key columns, so they are in 3NF. All columns in all tables are either primary key columns or are dependent on the
whole of the primary keys.

Let's show the answer on the next spreadsheet…


Here are the final tables, with primary key (PK) and foreign key (FK) columns indicated. The kind of primary key is noted
under each table name.

Album Table
Kind of primary key: Composite key which is actually a combined surrogate/natural key

artist_id (PK) (FK)  album_name (PK)  release_date  length  producer_id (FK)  record_label
10001                Meddle           10/30/1971    46:08   50487             Harvest
10546                OK               11/9/1998     60:48   50989             Island

Artist Table
Kind of primary key: Surrogate key

artist_id (PK)  artist_name
10001           Pink Floyd
10546           Talvin Singh

Producer Table
Kind of primary key: Surrogate key

producer_id (PK)  name
50487             Roger Waters
50989             Talvin Singh

Album Track Table
Kind of primary key: Composite key which is actually a combined surrogate/natural key

artist_id (PK) (FK)  album_name (PK) (FK)  track_name (PK)    track_order
10001                Meddle                One of These Days  1
10001                Meddle                A Pillow of Winds  2
10546                OK                    Traveller          1
10546                OK                    Butterfly          2

Album Genre Table
Kind of primary key: Composite key which is actually a combined surrogate/natural key

artist_id (PK) (FK)  album_name (PK) (FK)  genre (PK)
10001                Meddle                Rock
10001                Meddle                Progressive Rock
10546                OK                    Electronica

We might also want to have a Genre Table in which we store the genre names. If we did that we would have to modify the
Album Genre Table: the genre attribute would need to be a foreign key pointing to the Genre Table. These two tables would
look like this (note that genre in the Album Genre Table is still part of the primary key).

Genre Table
Kind of primary key: Natural/Business key

name (PK)
Rock
Progressive Rock
Electronica

Album Genre Table
Kind of primary key: Composite key which is actually a combined surrogate/natural key

artist_id (PK) (FK)  album_name (PK) (FK)  genre (PK) (FK)
10001                Meddle                Rock
10001                Meddle                Progressive Rock
10546                OK                    Electronica
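
To check that the final design hangs together, the tables can be written out as a schema and queried. What follows is a sketch under assumptions rather than part of the lab answer: it uses Python's sqlite3 module, the SQL table names are invented for the example, dates and lengths are stored as plain text exactly as they appear above, and the optional Genre Table is included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE artist   (artist_id   INTEGER PRIMARY KEY, artist_name TEXT NOT NULL);
    CREATE TABLE producer (producer_id INTEGER PRIMARY KEY, name        TEXT NOT NULL);
    CREATE TABLE genre    (name TEXT PRIMARY KEY);

    CREATE TABLE album (
        artist_id    INTEGER NOT NULL REFERENCES artist (artist_id),
        album_name   TEXT    NOT NULL,
        release_date TEXT,
        length       TEXT,
        producer_id  INTEGER REFERENCES producer (producer_id),
        record_label TEXT,
        PRIMARY KEY (artist_id, album_name)
    );
    CREATE TABLE album_track (
        artist_id   INTEGER NOT NULL,
        album_name  TEXT    NOT NULL,
        track_name  TEXT    NOT NULL,
        track_order INTEGER,
        PRIMARY KEY (artist_id, album_name, track_name),
        FOREIGN KEY (artist_id, album_name) REFERENCES album (artist_id, album_name)
    );
    CREATE TABLE album_genre (
        artist_id  INTEGER NOT NULL,
        album_name TEXT    NOT NULL,
        genre      TEXT    NOT NULL REFERENCES genre (name),
        PRIMARY KEY (artist_id, album_name, genre),
        FOREIGN KEY (artist_id, album_name) REFERENCES album (artist_id, album_name)
    );

    INSERT INTO artist   VALUES (10001, 'Pink Floyd'), (10546, 'Talvin Singh');
    INSERT INTO producer VALUES (50487, 'Roger Waters'), (50989, 'Talvin Singh');
    INSERT INTO genre    VALUES ('Rock'), ('Progressive Rock'), ('Electronica');
    INSERT INTO album VALUES
        (10001, 'Meddle', '10/30/1971', '46:08', 50487, 'Harvest'),
        (10546, 'OK',     '11/9/1998',  '60:48', 50989, 'Island');
    INSERT INTO album_track VALUES
        (10001, 'Meddle', 'One of These Days', 1),
        (10001, 'Meddle', 'A Pillow of Winds', 2),
        (10546, 'OK',     'Traveller',         1),
        (10546, 'OK',     'Butterfly',         2);
    INSERT INTO album_genre VALUES
        (10001, 'Meddle', 'Rock'),
        (10001, 'Meddle', 'Progressive Rock'),
        (10546, 'OK',     'Electronica');
""")

# Join back through the keys to list each track with its artist and album.
rows = conn.execute("""
    SELECT ar.artist_name, al.album_name, t.track_order, t.track_name
    FROM album_track AS t
    JOIN album  AS al ON al.artist_id = t.artist_id AND al.album_name = t.album_name
    JOIN artist AS ar ON ar.artist_id = al.artist_id
    ORDER BY ar.artist_name, t.track_order
""").fetchall()
```

The join returns the four tracks with their artist and album names, so the information in the original unnormalised table can be recovered from the normalised tables.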
