Database
A database is a collection of data or information which is held together in an organised or logical way.
These can be as simple as a notebook which contains addresses sorted by surname or a birthday book with
birthday reminders by month.
Other paper based databases can be much larger, for example, the Yellow Pages directory. The directory is
organised by business type e.g.
architects
builders
florists
plumbers
taxis
Under each category, all of the local businesses of that type (for example, builders) are listed, again sorted
alphabetically.
Computerised databases
You will come across computerised databases in every aspect of your life. Here are some examples with which
you will be familiar:
When you first set up your database, you can choose to make a 'flat-file' database or a 'relational' database.
With a flat-file database, all of your data is stored in one large table.
Take a database that a vet might use. In our example on the right, there is data about the owner of the pet
(name, address, phone number), data about the pet (name, type of animal, date of birth) and data about any
appointments the pet has.
This might seem pretty logical at first. But think about it: is it really as good as it seems?
Every single time the pet has an appointment, the customer's title, surname, street, town, county and phone
number have to be entered, along with the pet's name, type and date of birth. Entering so much data each
time would be tedious, and there would be a great risk of making a mistake.
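To make the problem concrete, here is a minimal sketch of a flat-file table held as a Python list. The field names, the owner, the pet and the dates are all invented for illustration and are not taken from the vet example itself.

```python
# A flat-file "table": one row per appointment, with every owner and pet detail repeated.
# All names and values here are invented for illustration.
flat_file = [
    {"surname": "Smith", "town": "Leeds", "phone": "0113 000 0001",
     "pet_name": "Rex", "pet_type": "Dog", "appointment": "2024-03-01"},
    {"surname": "Smith", "town": "Leeds", "phone": "0113 000 0001",
     "pet_name": "Rex", "pet_type": "Dog", "appointment": "2024-04-12"},
    {"surname": "Smyth", "town": "Leeds", "phone": "0113 000 0001",  # typo made when re-entering
     "pet_name": "Rex", "pet_type": "Dog", "appointment": "2024-05-20"},
]

# Count how many times the same phone number had to be typed in.
phone_entries = sum(1 for row in flat_file if row["phone"] == "0113 000 0001")

# Re-typing lets inconsistent spellings creep in for the same owner.
surnames_for_phone = {row["surname"] for row in flat_file}

print(phone_entries)               # 3
print(sorted(surnames_for_phone))  # ['Smith', 'Smyth']
```

Three appointments mean three copies of the owner's details, and one slip of the keyboard leaves the same owner recorded under two different surnames.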
We saw above that a flat-file database isn't always the best choice, as it causes a lot of data duplication.
In the database below, the data is split up into sensible groups i.e. customer data, pets data and appointments
data. Then a separate table is made for each group.
Customer Table    Pet Table    Appointment Table
Once the tables have been set up, a relationship can be created to link them together - as shown by the lines
linking the tables below.
The main benefit of a relational database is that data doesn't have to be duplicated. When a customer books
an appointment for their pet, a new record is created in the 'appointments' table and the relevant Customer
and Pet IDs are chosen.
Reducing data duplication reduces the amount of data which needs to be stored, thus making the database
smaller. It also reduces the risk of mistakes, because every time you have to type the same data in, there is a
risk you could misspell it.
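The three linked tables can be sketched with Python's built-in sqlite3 module. The table and column names (customer_id, pet_id, appt_date and so on) and the sample data are assumptions made for this example; the notes only say that the tables are linked by Customer and Pet IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    surname     TEXT,
    phone       TEXT
);
CREATE TABLE pet (
    pet_id      INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    name        TEXT,
    type        TEXT
);
CREATE TABLE appointment (
    appointment_id INTEGER PRIMARY KEY,
    customer_id    INTEGER REFERENCES customer(customer_id),
    pet_id         INTEGER REFERENCES pet(pet_id),
    appt_date      TEXT
);
""")

# The owner and the pet are each entered once.
conn.execute("INSERT INTO customer VALUES (1, 'Smith', '0113 000 0001')")
conn.execute("INSERT INTO pet VALUES (1, 1, 'Rex', 'Dog')")

# Booking an appointment only needs the two IDs and the date.
conn.executemany(
    "INSERT INTO appointment (customer_id, pet_id, appt_date) VALUES (?, ?, ?)",
    [(1, 1, "2024-03-01"), (1, 1, "2024-04-12")],
)

# The relationships let us rebuild the full picture when we need it.
rows = conn.execute("""
    SELECT c.surname, p.name, a.appt_date
    FROM appointment a
    JOIN pet p      ON p.pet_id = a.pet_id
    JOIN customer c ON c.customer_id = a.customer_id
    ORDER BY a.appt_date
""").fetchall()
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]

print(rows)            # [('Smith', 'Rex', '2024-03-01'), ('Smith', 'Rex', '2024-04-12')]
print(customer_count)  # 1
```

Two appointments are stored, but the surname and phone number exist in only one place.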
Entities
Each table holds all of the information about one object, person or thing. This is known as an entity.
Some examples of tables that each describe one entity:
a customer table
an appointments table
an exam sessions table
a teachers' names table
a concert venue table
Attributes
Remember that an entity is a person, place, thing or concept about which data can be collected.
Let us explain this a little bit more clearly by using a couple of examples.
Primary Key
A primary key is a special relational database table column (or combination of columns) designated to
uniquely identify each record in the table.
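A short sketch of what 'uniquely identify' means in practice, using Python's built-in sqlite3 module. The customer table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, surname TEXT)")

conn.execute("INSERT INTO customer VALUES (1, 'Smith')")
conn.execute("INSERT INTO customer VALUES (2, 'Smith')")  # same surname, different key: allowed

try:
    conn.execute("INSERT INTO customer VALUES (1, 'Jones')")  # key 1 is already in use
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```

Two customers may share a surname, but the database refuses a second record with the same primary key.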
Composite Key
A composite key, in the context of relational databases, is a combination of two or more columns in a table
that can be used to uniquely identify each row in the table. Uniqueness is only guaranteed when the columns
are combined; when taken individually the columns do not guarantee uniqueness.
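Here is a minimal sketch of a composite key, again using sqlite3. The purchase table and its two columns are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE purchase (
        customer_id INTEGER,
        store_id    INTEGER,
        PRIMARY KEY (customer_id, store_id)  -- the key is the pair of columns
    )
""")

conn.execute("INSERT INTO purchase VALUES (1, 1)")
conn.execute("INSERT INTO purchase VALUES (1, 2)")  # customer_id repeats: allowed
conn.execute("INSERT INTO purchase VALUES (2, 1)")  # store_id repeats: allowed

try:
    conn.execute("INSERT INTO purchase VALUES (1, 1)")  # the same pair again
    pair_rejected = False
except sqlite3.IntegrityError:
    pair_rejected = True

print(pair_rejected)  # True
```

Neither column is unique on its own, yet the combination is.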
Foreign Key
A foreign key is used to link tables together and create a relationship. It is a field in one table that is linked to
the primary key in another table.
(Figure: the Artists, Recordings and Genre tables.)
These primary keys link to identically named fields in the Recordings table. Each of those identically named
fields is known as a 'foreign key'.
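The Artists, Recordings and Genre arrangement might be sketched like this with sqlite3. The column names (artist_id, genre_id, title) and the sample rows are assumptions, since the notes do not list the actual fields. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE genre   (genre_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE recordings (
    recording_id INTEGER PRIMARY KEY,
    title        TEXT,
    artist_id    INTEGER REFERENCES artists(artist_id),  -- foreign key
    genre_id     INTEGER REFERENCES genre(genre_id)      -- foreign key
);
""")

conn.execute("INSERT INTO artists VALUES (1, 'Example Artist')")
conn.execute("INSERT INTO genre VALUES (1, 'Jazz')")
conn.execute("INSERT INTO recordings VALUES (1, 'Example Title', 1, 1)")  # both links exist

try:
    # Artist 99 does not exist, so this recording cannot be linked to anything.
    conn.execute("INSERT INTO recordings VALUES (2, 'Orphan Title', 99, 1)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

recording_count = conn.execute("SELECT COUNT(*) FROM recordings").fetchone()[0]
print(fk_rejected, recording_count)  # True 1
```

The foreign key stops a recording from pointing at an artist who is not in the Artists table.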
Normalization
Normalization is the process of organizing data in a database. This includes creating tables and establishing
relationships between those tables according to rules designed both to protect the data and to make
the database more flexible by eliminating redundancy and inconsistent dependency.
1st Normal Form Definition
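A table is in first normal form (1NF) if every column holds only atomic values and the table contains no
repeating groups.
An atomic value is a value that cannot be divided. For example, in the table shown below, the values in the
[Color] column in the first row can be divided into "red" and "green", hence [TABLE_PRODUCT] is not in 1NF.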
A repeating group means that a table contains two or more columns that are closely related. For example, a
table that records data on a book and its author(s) with the following columns: [Book ID], [Author 1], [Author
2], [Author 3] is not in 1NF because [Author 1], [Author 2], and [Author 3] are all repeating the same attribute.
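One way to remove a repeating group like this is to turn the numbered columns into separate rows. The sketch below assumes placeholder book and author values; only the column names come from the example above.

```python
# Each row has a repeating group: three columns that all hold "an author".
# The book IDs and author names are placeholders, not real data.
books = [
    {"book_id": 1, "author_1": "Author A", "author_2": "Author B", "author_3": None},
    {"book_id": 2, "author_1": "Author C", "author_2": None, "author_3": None},
]

# One row per (book, author) pair replaces the three author columns.
book_author = [
    (book["book_id"], book[column])
    for book in books
    for column in ("author_1", "author_2", "author_3")
    if book[column] is not None
]

print(book_author)  # [(1, 'Author A'), (1, 'Author B'), (2, 'Author C')]
```

A book can now have any number of authors, and no cells are left empty for books with fewer than three.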
How do we bring an unnormalized table into first normal form? Consider the following example:
This table is not in first normal form because the [Color] column can contain multiple values. For example, the
first row includes values "red" and "green."
To bring this table to first normal form, we split the table into two tables and now we have the resulting
tables:
Now first normal form is satisfied, as the columns in each table all hold just one value.
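The split can be sketched in a few lines of Python. The product IDs, prices and the names of the two resulting tables are assumptions for illustration; only the [Color] column with "red" and "green" in the first row comes from the example.

```python
# Unnormalized rows: the color field holds several values in one cell.
# Product IDs and prices are made up for illustration.
table_product = [
    {"product_id": 1, "color": "red, green", "price": 15.99},
    {"product_id": 2, "color": "yellow", "price": 23.99},
]

# Table 1: one row per product, without the multi-valued column.
product_price = [
    {"product_id": row["product_id"], "price": row["price"]}
    for row in table_product
]

# Table 2: one row per (product, color) pair, so every cell is atomic.
product_color = [
    {"product_id": row["product_id"], "color": color.strip()}
    for row in table_product
    for color in row["color"].split(",")
]

for row in product_color:
    print(row["product_id"], row["color"])
# 1 red
# 1 green
# 2 yellow
```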
2nd Normal Form Definition
A table is in second normal form if it is in first normal form and any partial dependencies have been
removed. That is, every non-key attribute must depend on the whole of the primary key, not just part of it.
This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is [Purchase Location].
In this case, [Purchase Location] only depends on [Store ID], which is only part of the primary key. Therefore,
this table does not satisfy second normal form.
To bring this table to second normal form, we break the table into two tables, and now we have the following:
What we have done is to remove the partial functional dependency that we initially had. Now, in the table
[TABLE_STORE], the column [Purchase Location] is fully dependent on the primary key of that table, which is
[Store ID].
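A rough sketch of this split in Python follows. The IDs and city names are invented, and the variable names are assumptions; only the three columns come from the example.

```python
# Original table: composite key (customer_id, store_id) plus purchase_location.
# The IDs and city names are invented for illustration.
table_purchase_detail = [
    (1, 1, "Los Angeles"),
    (1, 3, "San Francisco"),
    (2, 1, "Los Angeles"),
    (3, 2, "New York"),
    (4, 3, "San Francisco"),
]

# Table 1: just the composite key, recording which customer used which store.
table_purchase = [(customer, store) for customer, store, _ in table_purchase_detail]

# Table 2: each store's location stored once, keyed by store_id alone.
table_store = {store: location for _, store, location in table_purchase_detail}

# Joining the two tables on store_id rebuilds the original rows, so nothing is lost.
rejoined = [(customer, store, table_store[store]) for customer, store in table_purchase]

print(len(table_store))                   # 3
print(rejoined == table_purchase_detail)  # True
```

Each location is now written down once per store rather than once per purchase.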
3rd Normal Form Definition
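A table is in third normal form if it is in second normal form and contains no transitive functional
dependencies.
By transitive functional dependency, we mean we have the following relationships in the table: B is
functionally dependent on A, and C is functionally dependent on B. In this case, C is transitively dependent on
A via B.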
Third normal form (like second normal form) is concerned with the non-key attributes. To be in 3NF, there
must be no dependencies between any of the non-key attributes. A table in 2NF with no non-key attributes, or
only one, is therefore automatically in 3NF.
In the table above, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type]. Therefore, [Book
ID] determines [Genre Type] via [Genre ID]. This is a transitive functional dependency, so this structure
does not satisfy third normal form.
To bring this table to third normal form, we split the table into two as follows:
Now all non-key attributes are fully functionally dependent only on the primary key. In [TABLE_BOOK], both
[Genre ID] and [Price] are only dependent on [Book ID]. In [TABLE_GENRE], [Genre Type] is only dependent on
[Genre ID].
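The same split can be sketched in Python. The sample rows and the determines helper are hypothetical, and the helper only checks the rows it is given; it cannot prove that a dependency holds as a rule.

```python
# Original table: book_id is the key, but genre_type really depends on genre_id.
# All values are invented for illustration.
table_book_detail = [
    {"book_id": 1, "genre_id": 1, "genre_type": "Gardening", "price": 25.99},
    {"book_id": 2, "genre_id": 2, "genre_type": "Sports", "price": 14.99},
    {"book_id": 3, "genre_id": 1, "genre_type": "Gardening", "price": 10.00},
    {"book_id": 4, "genre_id": 3, "genre_type": "Travel", "price": 12.99},
]

def determines(rows, x, y):
    """Return True if each value of column x appears with only one value of column y."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[x], row[y]) != row[y]:
            return False
    return True

# The transitive chain in the sample data: book_id -> genre_id -> genre_type.
chain_holds = (determines(table_book_detail, "book_id", "genre_id")
               and determines(table_book_detail, "genre_id", "genre_type"))

# Split: the book table keeps genre_id as a link, the genre table holds the type once.
table_book = [
    {"book_id": r["book_id"], "genre_id": r["genre_id"], "price": r["price"]}
    for r in table_book_detail
]
table_genre = {r["genre_id"]: r["genre_type"] for r in table_book_detail}

print(chain_holds)       # True
print(len(table_genre))  # 3
```

After the split, "Gardening" is stored once in the genre table instead of once for every gardening book.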
Question:
1NF
2NF
3NF
Summary