CT216 Software Engineering Tutorial: 1 Introduction To Relational Databases
Eoin O Fiachain
A relation or table is a named two-dimensional table of data. Each relation
consists of a set of named columns and an arbitrary number of unnamed rows.
A relational database is a database that employs the relational model.
1.2 SQL
1.3 Database Schema
1.4 MySQL
1.6 Running MySQL
After MySQL is installed, the MySQL server application can be started using
the winmysqladmin.exe program in the bin directory of the MySQL
distribution.
This can be started through Windows Explorer or through the Command
Prompt.
The server application must be running in order for a client to connect.
cd c:\mysql\bin
mysql -u username -p databasename -h hostname
2 Designing An Example Database
In order to demonstrate MySQL we will focus on a particular example of a
small database about films. We will use this example database as a means
of introducing various features and aspects of MySQL.
In the Software Engineering lectures, Entity-Relationship diagrams are
introduced as a data model compiled during the analysis phase of a project.
Database modeling is outside the scope of this tutorial, but a comprehensive
tutorial is provided at:
http://www.utexas.edu/its/windows/database/datamodeling/
Considering our example scenario, we arbitrarily identify three entities that
we wish to model: film, actor and genre. We then choose various attributes
(properties associated with each entity) that we wish to consider.
Entity   Attributes
Film     Name, Year of Release
Actor    Last Name, First Name, Birth Date
Genre    Name, Description
Each entity should have a primary key. This is an attribute or set of
attributes that uniquely identifies each instance of the entity.
For the genre entity we select the name attribute as a primary key as it
uniquely identifies each genre e.g. thriller, crime, comedy etc.
Although we could use the name attribute as a primary key for the film entity,
instead we create a new artificial identifier id as two films may possibly have
the same name.
Similarly we add an id attribute for the actor entity as a combination of first
and last names may not necessarily be unique.
Using artificial id values also has the benefit of providing for easier references
to particular films. Instead of using the entire title to refer to a film we can
simply use a much more compact id number. On a physical level this provides
for less complex database references and decreased storage space.
Each film can be associated with many actors. Each actor can be associated
with many films. Therefore we have a many-to-many relationship between
the two entities.
Similarly, each film can be associated with many genres. Each genre is
associated with many films. Therefore we also have a many-to-many
relationship between the film and genre entities.
We cannot represent many-to-many relationships directly in a relational
database. We resolve this problem by introducing an extra association entity
into the diagram, with one-to-many relationships that replace the existing
many-to-many relationship.
We also add extra attributes to these association entities to allow us to
reference the primary keys of the original entities involved in the
many-to-many relationship. These are called foreign keys.
2.3 Normalization
2.4 Converting ERD to Relations
Each column must have an associated data type (integer, characters etc.).
All data items in a column must be of this data type.
We assign an appropriate data type to each column in our tables. There are
a variety of data types available in MySQL that we will examine in due course.
The above ERD is similar to, but slightly different from, those we created
during analysis. It provides a “relational database” view of tables/columns
rather than a “modeling view” of entities/attributes.
Note the crow's-foot style used for describing relationship multiplicities.
This corresponds directly to our 1:M, M:N, etc. style used in previous ERDs.
The above ERD represents our final database schema for our films database.
3 Structured Query Language for MySQL
SQL (Structured Query Language) is composed of declarative statements.
These indicate either what data is required from the server or what action
the server should take.
The semi-colon ; is used to separate multiple statements.
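SQL keywords are not case-sensitive (e.g. SELECT is the same as select).
Whether table names are case-sensitive depends on the operating system that
hosts the server, so it is safest to use one spelling consistently. Character
literals keep their case ('DATA' and 'data' are different values), although
whether a comparison treats them as equal depends on the collation in use.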
Each MySQL server can host multiple databases. Each database contains its
own tables. Each table can contain its own columns and rows.
Tables that have relationships with each other should be grouped together
within a single database. MySQL can query a table in another database if
its name is qualified as database_name.table_name, but keeping related
tables together is simpler.
Most applications will store all their data in a single database.
The CREATE DATABASE and DROP DATABASE statements can be used
to create and delete a database respectively.
N.B. Comments in SQL begin with two hyphens followed by a space and continue
to the end of the line.
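As a brief sketch of these two statements, assuming a database called films
(the name is chosen here for illustration and is not fixed by the tutorial):

```sql
-- Create a new, empty database called films (name is an assumption)
CREATE DATABASE films;

-- Delete the films database together with every table it contains
DROP DATABASE films;
```

DROP DATABASE removes the data permanently, so it is worth double-checking
the name before running it.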
Any connection to the MySQL server can only be associated with one database
at a time. The USE statement is employed to activate a particular database.
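For instance, assuming the example database was created under the
hypothetical name films, it could be activated like this:

```sql
-- Make films the active database for this connection;
-- later statements refer to its tables unless told otherwise
USE films;
```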
3.2 Creating and Dropping Tables
SHOW TABLES;
DESCRIBE table_name;
For example, a statement to create the film and filmcharacter tables from
our example database would be as follows:
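A minimal sketch of such statements is given here. The film columns follow
the description in this section; the filmcharacter column names are taken
from the query output shown later in the tutorial, while the VARCHAR length
of characterName and the composite primary key are assumptions.

```sql
-- Creates the film table: one row per film, identified by an artificial id
CREATE TABLE film (
    id   INT,
    name VARCHAR(100),
    year INT,
    PRIMARY KEY (id)
);

-- Creates the association table that links films to actors.
-- The VARCHAR length and the composite primary key are assumptions.
CREATE TABLE filmcharacter (
    filmID        INT,
    actorID       INT,
    characterName VARCHAR(100),
    PRIMARY KEY (filmID, actorID)
);
```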
This statement would create a table film with 3 columns id of type INT,
name of type VARCHAR(100) and year of type INT.
The filmcharacter table is created similarly.
The PRIMARY KEY declaration can be used to define which column or columns
form the primary key. Although it is technically optional, all tables should
have an associated primary key in a well-designed database.
Note that character and date/time values are always expressed within single
quotes, e.g. 'Galway City' or '6666'. A series of escape sequences can be used
to represent special characters in a similar manner to C and PHP, including
single quote (\'), newline (\n), percentage sign (\%) etc.
Numerical values are expressed without quotes.
Optional attributes follow the column type. These further define or restrict
the definition of the column.
Some common attributes include:
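As an illustration only, the hypothetical table below (example_actor is not
part of the films schema) uses three attributes that MySQL supports:
NOT NULL, DEFAULT and AUTO_INCREMENT.

```sql
-- Hypothetical table showing column attributes after each column type
CREATE TABLE example_actor (
    -- NOT NULL forbids missing values; AUTO_INCREMENT makes the server
    -- supply the next id when none is given on insert
    id        INT NOT NULL AUTO_INCREMENT,
    lastName  VARCHAR(50) NOT NULL,
    -- DEFAULT supplies a value when the column is left out of an INSERT
    firstName VARCHAR(50) DEFAULT 'Unknown',
    PRIMARY KEY (id)
);
```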
The INSERT statement allows the user to add new rows to a table:
An example statement to insert a new ‘horror’ genre into our genre table
would be:
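One possible form, assuming the genre table has the columns name and
description from the design above (the description text is invented for
illustration):

```sql
-- Add one new row to genre, naming the columns being filled
INSERT INTO genre (name, description)
VALUES ('horror', 'Films intended to frighten the audience');
```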
If you are inserting a value for every column in the table then the column
list can be omitted.
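A sketch of that shorter form, again assuming genre has exactly the two
columns name and description, in that order:

```sql
-- No column list: values are matched to the columns in table order
INSERT INTO genre
VALUES ('horror', 'Films intended to frighten the audience');
```

The values must then appear in the same order as the columns were declared
when the table was created.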
The DELETE statement allows the user to delete a row or group of rows
from a table.
An example statement follows that deletes all rows from the genre table
where the name value is equal to ‘horror’:
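A sketch of that statement, assuming the column is called name as in the
design above:

```sql
-- Remove every genre row whose name is 'horror'
DELETE FROM genre WHERE name = 'horror';
```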
This statement deletes all films that were released before 2000:
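Using the film table and its year column as described earlier, this could
be written as:

```sql
-- Remove every film released before the year 2000
DELETE FROM film WHERE year < 2000;
```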
If the WHERE keyword and condition are omitted then all rows in a table are
deleted:
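For example, applied to the genre table:

```sql
-- No WHERE clause: every row in genre is removed, the table itself remains
DELETE FROM genre;
```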
The UPDATE statement allows the user to update values in an existing row
or group of rows.
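-- Sets a new name for any rows in the 'actor' table that
-- have a name equal to 'Arnold Schwarzenegger'
-- (the actor table stores the name in firstName and lastName)
UPDATE actor SET firstName='Governor', lastName='Arnie'
WHERE firstName='Arnold' AND lastName='Schwarzenegger';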
3.3.4 WHERE Conditions
By using all of the primary key columns in a condition, a single row can be
uniquely specified.
Some common comparison operators are:
Operator      Description
=             Equal
<> or !=      Not Equal
<=            Less Than or Equal
<             Less Than
>=            Greater Than or Equal
>             Greater Than
IS NULL       Whether a value is NULL
IS NOT NULL   Whether a value is not NULL
The LIKE operator can be used to see if a string value matches a particular
string expression.
The % character is used as a wildcard to match 0 or more characters. The
_ character is used as a wildcard to match exactly 1 character.
The following example deletes any rows in the film table whose name value
begins with the string 'Star':
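A sketch of such a statement, using the % wildcard after the fixed prefix:

```sql
-- 'Star%' matches 'Star' followed by zero or more further characters
DELETE FROM film WHERE name LIKE 'Star%';
```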
3.4 Querying Tables
Querying tables is the most common task performed using SQL. The
declarative nature of SQL provides for a powerful and versatile means of
selecting precisely what data we want to retrieve from a table or group of
tables.
For the purposes of demonstration, let us assume that our example database
has been populated with data as follows:
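One way to load this data is sketched here. The rows are read back from the
query outputs in the sections that follow, and the actor table is assumed to
have the columns id, lastName, firstName and birthDate. The genre table and
the film/genre association are left out because their full contents are not
shown in this tutorial.

```sql
-- Films, as listed in the SELECT output
INSERT INTO film (id, name, year) VALUES
    (1, 'Big Lebowski, The',   1998),
    (2, 'Resevoir Dogs',       1992),
    (3, 'Pulp Fiction',        1994),
    (4, 'Kill Bill, Volume 1', 2003);

-- Actors, as listed in the ORDER BY output
INSERT INTO actor (id, lastName, firstName, birthDate) VALUES
    (1, 'Bridges', 'Jeff',    '1949-12-04'),
    (2, 'Goodman', 'John',    '1952-06-20'),
    (3, 'Buscemi', 'Steve',   '1957-12-13'),
    (4, 'Keitel',  'Harvey',  '1939-05-13'),
    (5, 'Roth',    'Tim',     '1961-05-14'),
    (6, 'Thurman', 'Una',     '1970-04-29'),
    (7, 'Madsen',  'Michael', '1958-09-25');

-- Which actor played which character in which film (from the join output)
INSERT INTO filmcharacter (filmID, actorID, characterName) VALUES
    (1, 1, 'The Dude'),   (1, 2, 'Walter'),     (1, 3, 'Donny'),
    (2, 3, 'Mr. Pink'),   (2, 4, 'Mr. White'),  (3, 4, 'The Wolf'),
    (2, 5, 'Mr. Orange'), (3, 5, 'Pumpkin'),    (3, 6, 'Mia Wallace'),
    (2, 7, 'Mr. Blonde'), (4, 6, 'The Bride'),  (4, 7, 'Budd');
```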
3.4.1 SELECT Command
The SELECT command is used to retrieve rows from one or more tables.
The following example retrieves the name column and the year column for
all rows in the film table with a year of 1995 or above:
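A query of the following form would produce the result shown:

```sql
-- Column list, then the table, then the condition rows must satisfy
SELECT name, year
FROM film
WHERE year >= 1995;
```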
+---------------------+------+
| name                | year |
+---------------------+------+
| Big Lebowski, The   | 1998 |
| Kill Bill, Volume 1 | 2003 |
+---------------------+------+
If you wish to view all columns in a table in a query, you can replace the
column list with the * character:
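Keeping the same condition on year, a sketch of this form is:

```sql
-- * stands for every column of the table, in table order
SELECT *
FROM film
WHERE year >= 1995;
```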
+----+---------------------+------+
| id | name                | year |
+----+---------------------+------+
|  1 | Big Lebowski, The   | 1998 |
|  4 | Kill Bill, Volume 1 | 2003 |
+----+---------------------+------+
If no WHERE clause is specified then all rows are retrieved from a table:
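In its simplest form the query is then:

```sql
-- No WHERE clause: every row of film is returned
SELECT * FROM film;
```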
+----+---------------------+------+
| id | name                | year |
+----+---------------------+------+
|  1 | Big Lebowski, The   | 1998 |
|  2 | Resevoir Dogs       | 1992 |
|  3 | Pulp Fiction        | 1994 |
|  4 | Kill Bill, Volume 1 | 2003 |
+----+---------------------+------+
3.4.2 ORDER BY
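The ORDER BY clause sorts the rows returned by a query. As a sketch, a
query of the following form sorts the actors alphabetically by last name,
which is consistent with the first listing in this section:

```sql
-- Ascending order is the default; ASC could be written explicitly
SELECT *
FROM actor
ORDER BY lastName;
```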
+----+----------+-----------+---------------------+
| id | lastName | firstName | birthDate           |
+----+----------+-----------+---------------------+
|  1 | Bridges  | Jeff      | 1949-12-04 00:00:00 |
|  3 | Buscemi  | Steve     | 1957-12-13 00:00:00 |
|  2 | Goodman  | John      | 1952-06-20 00:00:00 |
|  4 | Keitel   | Harvey    | 1939-05-13 00:00:00 |
|  7 | Madsen   | Michael   | 1958-09-25 00:00:00 |
|  5 | Roth     | Tim       | 1961-05-14 00:00:00 |
|  6 | Thurman  | Una       | 1970-04-29 00:00:00 |
+----+----------+-----------+---------------------+
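Adding the DESC keyword reverses the sort order. A query of this form,
sorting by birth date with the youngest actor first, is consistent with the
listing that follows:

```sql
-- DESC sorts from the largest (latest) value down to the smallest
SELECT *
FROM actor
ORDER BY birthDate DESC;
```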
+----+----------+-----------+---------------------+
| id | lastName | firstName | birthDate           |
+----+----------+-----------+---------------------+
|  6 | Thurman  | Una       | 1970-04-29 00:00:00 |
|  5 | Roth     | Tim       | 1961-05-14 00:00:00 |
|  7 | Madsen   | Michael   | 1958-09-25 00:00:00 |
|  3 | Buscemi  | Steve     | 1957-12-13 00:00:00 |
|  2 | Goodman  | John      | 1952-06-20 00:00:00 |
|  1 | Bridges  | Jeff      | 1949-12-04 00:00:00 |
|  4 | Keitel   | Harvey    | 1939-05-13 00:00:00 |
+----+----------+-----------+---------------------+
3.4.3 Joins
SELECT
filmcharacter.filmID,
actor.lastName,
actor.firstName,
filmcharacter.characterName
FROM
actor,
filmcharacter
WHERE
actor.ID = filmcharacter.actorID;
+--------+----------+-----------+---------------+
| filmID | lastName | firstName | characterName |
+--------+----------+-----------+---------------+
|      1 | Bridges  | Jeff      | The Dude      |
|      1 | Goodman  | John      | Walter        |
|      1 | Buscemi  | Steve     | Donny         |
|      2 | Buscemi  | Steve     | Mr. Pink      |
|      2 | Keitel   | Harvey    | Mr. White     |
|      3 | Keitel   | Harvey    | The Wolf      |
|      2 | Roth     | Tim       | Mr. Orange    |
|      3 | Roth     | Tim       | Pumpkin       |
|      3 | Thurman  | Una       | Mia Wallace   |
|      2 | Madsen   | Michael   | Mr. Blonde    |
|      4 | Thurman  | Una       | The Bride     |
|      4 | Madsen   | Michael   | Budd          |
+--------+----------+-----------+---------------+
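The query above joins the two tables by listing both in the FROM clause and
matching rows in the WHERE clause. The same join can also be written with
the explicit INNER JOIN ... ON syntax, sketched here as an alternative
rather than as the form this tutorial uses:

```sql
-- Each actor row is paired with the filmcharacter rows that reference it
SELECT
    filmcharacter.filmID,
    actor.lastName,
    actor.firstName,
    filmcharacter.characterName
FROM actor
INNER JOIN filmcharacter
    ON actor.id = filmcharacter.actorID;
```

Keeping the join condition in ON leaves the WHERE clause free for
conditions that filter the result.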
3.4.4 Functions
MySQL contains many functions which we can use to manipulate our query
output.
For example, the CONCAT(str1, str2, ...) function concatenates (adds
together) a number of strings.
In this example, we will use this function to combine the firstName and
lastName of all the available actors. We put a single space character between
the two parts of the name.
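A query of the following form produces the column heading and rows shown:

```sql
-- The middle argument is a string holding a single space
SELECT CONCAT(firstName, ' ', lastName)
FROM actor;
```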
+----------------------------------+
| CONCAT(firstName, ' ', lastName) |
+----------------------------------+
| Jeff Bridges                     |
| John Goodman                     |
| Steve Buscemi                    |
| Harvey Keitel                    |
| Tim Roth                         |
| Una Thurman                      |
| Michael Madsen                   |
+----------------------------------+
Aggregate functions are special types of functions which operate on the entire
set of rows in a query. Normal functions act separately on each individual
row.
Some common aggregate functions are SUM, AVG, COUNT, MAX and MIN.
For example:
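The result below is consistent with a query of this form, run against the
four rows of the film table:

```sql
-- Each function collapses the year values of all rows into one value
SELECT SUM(year), AVG(year), COUNT(year), MAX(year), MIN(year)
FROM film;
```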
+-----------+-----------+-------------+-----------+-----------+
| SUM(year) | AVG(year) | COUNT(year) | MAX(year) | MIN(year) |
+-----------+-----------+-------------+-----------+-----------+
|      7987 | 1996.7500 |           4 |      2003 |      1992 |
+-----------+-----------+-------------+-----------+-----------+
3.4.6 DISTINCT
The DISTINCT clause ensures that no identical rows exist in the query
output.
If we execute the following query we get some genres repeated. This is
because some genres are associated with more than one film.
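A sketch of such a query is given here. The genreName column appears in the
output, but the name of the association table that links films to genres is
not given in this tutorial, so filmgenre is an assumption.

```sql
-- filmgenre is a hypothetical name for the film/genre association table
SELECT genreName
FROM filmgenre;
```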
+-----------+
| genreName |
+-----------+
| comedy    |
| mystery   |
| thriller  |
| action    |
| crime     |
| mystery   |
| thriller  |
| crime     |
| drama     |
| action    |
| crime     |
| thriller  |
+-----------+
However, if we add a DISTINCT clause no repetition takes place:
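Assuming the film/genre association table is called filmgenre (a
hypothetical name, as the tutorial does not state it), the query becomes:

```sql
-- DISTINCT removes duplicate rows from the result before it is returned
SELECT DISTINCT genreName
FROM filmgenre;
```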
+-----------+
| genreName |
+-----------+
| comedy    |
| mystery   |
| thriller  |
| action    |
| crime     |
| drama     |
+-----------+
3.5 Security
Each time the user attempts to execute a statement, the MySQL server
determines whether or not to allow the statement to execute based upon
a permissions system. The permissions are based upon three criteria:
• username
• password
• hostname (the host from which the client connects)
3.6 MySQL Manual