Computer Science Coursebook-116-131

This document outlines learning objectives for understanding databases and the relational database model. The objectives cover limitations of file-based data storage, features of relational databases and DBMS software, database design using entity-relationship diagrams, and use of the SQL language for defining and querying databases. Specific skills listed include explaining normalization, writing SQL statements, and developing a relational design from requirements.


Learning objectives

By the end of this chapter you should be able to:


• show understanding of the limitations of using a file-based approach for the storage and retrieval of data
• describe the features of a relational database which address the limitations of a file-based approach
• show understanding of the features provided by a DBMS to address the issues of: data management, data modelling, logical schema, data integrity, data security
• show understanding of how software tools found within a DBMS are used in practice
• show awareness that high-level languages provide accessing facilities for data stored in a database
• show understanding of, and use, the terminology associated with a relational database model
• produce a relational design from a given description of a system
• use an entity-relationship diagram to document a database design
• show understanding of the normalisation process
• explain why a given set of database tables are, or are not, in 3NF and make the changes to a given set of tables to produce a solution in 3NF
• show understanding that DBMS software carries out:
  • all creation/modification of the database structure using its DDL
  • query and maintenance of data using its DML
• show understanding that the industry standard for both DDL and DML is Structured Query Language (SQL)
• show understanding of a given SQL script
• write simple SQL (DDL) commands for: creating a database, creating or changing a table definition, adding a primary or foreign key to a table
• write a SQL script for querying or modifying data (DML) which are stored in (at most two) database tables

Cambridge International AS and A level Computer Science

10.01 Limitations of a file-based approach


Data integrity and data privacy concerns
Let's consider a simple scenario. A theatrical agency makes bookings for bands and is setting
up a computerised system. Text files are to be used. One of these text files is to store data
about individual band members. Each line of the file is to contain the following data for one
band member:

Name, contact details, banking details, band name, band agent name,
band agent contact details

The intention is that this file could be used if the agency needed to contact the band member
directly or through the band's agent. It could also be used after a gig when the band member
has to be paid. Ignoring what would constitute contact details or banking details, we can
look at a snapshot of some of the data that might be stored for the member's given name,
the member's family name and the band name. The file might have a thousand or more lines
of text. The following is a selection of some of the data that might be contained in various
lines in the file:

Xiangfei   Jha        ComputerKidz
Mahesh     Ravuru     ITWizz
Dylan      Stoddart
Graham     Vandana    ITWizz
Vandana    Graham     ITWizz
Mahesh     Ravuru     ITWizz
Precious   Olsen      ComputerKidz
Precious   Olsen      ITWizz

It is clear that there are problems with this data. It would appear that when the data for
Vandana Graham was first entered her names were inserted in the wrong order. A later
correct entry was made without deletion of the original incorrect data. This type of problem
is not unique to a file-based system. There is no validation technique that could detect the
original error. By contrast, validation should have led to the correction of the missing band
name for Dylan Stoddart. The Precious Olsen data are examples of duplication of data and
inconsistent data.
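
The checks described above can be sketched in a few lines of Python. This is only an illustration: the comma-separated layout and the function names are assumptions, not part of the agency's design. It shows that a presence check finds the missing band name and that duplicates and inconsistencies can be found by comparing lines, while nothing flags the swapped names for Vandana Graham.

```python
from collections import Counter

# Hypothetical snapshot of the file: given name, family name, band name.
LINES = [
    "Xiangfei,Jha,ComputerKidz",
    "Mahesh,Ravuru,ITWizz",
    "Dylan,Stoddart,",
    "Graham,Vandana,ITWizz",
    "Vandana,Graham,ITWizz",
    "Mahesh,Ravuru,ITWizz",
    "Precious,Olsen,ComputerKidz",
    "Precious,Olsen,ITWizz",
]


def parse(line):
    """Split one line into (given name, family name, band name)."""
    given, family, band = line.split(",")
    return given, family, band


def missing_band(lines):
    """Presence check: members whose band name field is empty."""
    return [(g, f) for g, f, b in map(parse, lines) if not b]


def exact_duplicates(lines):
    """Lines that appear more than once, character for character."""
    counts = Counter(lines)
    return [line for line, n in counts.items() if n > 1]


def inconsistent_members(lines):
    """Members recorded against more than one band."""
    bands = {}
    for g, f, b in map(parse, lines):
        if b:
            bands.setdefault((g, f), set()).add(b)
    return {name: sorted(bs) for name, bs in bands.items() if len(bs) > 1}


print(missing_band(LINES))
print(exact_duplicates(LINES))
print(inconsistent_members(LINES))
```

None of these checks reports the line "Graham,Vandana,ITWizz", because both names are valid text values.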

There is also possibly an error that is not evident from looking at the file contents. A band
name could be entered here when that band doesn't exist. This shows how a file-based
approach can lead to data integrity problems in an individual file. The reason is the lack of
in-built control when data is entered. The database approach can prevent such problems or,
at least, minimise the chances of them happening.

A different problem is a lack of data privacy. The file above was designed so that the finance
section could find the banking details and the recruitment section could find contact details.
The problem is that there cannot be any control of access to part of a file so staff in the
recruitment section would be able to access the banking details of band members. Data
privacy would be properly handled by a database system.

Mindful of this privacy problem the agency decides to store data in different files for different
departments of the organisation. Table 10.01 summarises the main data to be stored in each
department's file.

Chapter 10: Database and Data Modelling

Department    Data items in the section's file

Contract      Member names    Band name       Gig details
Finance       Member names    Bank details    Gig details
Publicity     Band name       Gig details
Recruitment   Member names    Band name       Agent details

Table 10.01 Data to be held in the department files

There is now data duplication across the files. This is commonly referred to as data
redundancy, which doesn't mean that the data is no longer of use but rather that once data
has been stored there is no need for it to be stored again. This can lead to data inconsistency
because of errors in the original entry or errors in subsequent editing. This is a different cause
of data lacking integrity. One of the primary aims of the database approach is the elimination
of data redundancy.

Data redundancy: the same data stored more than once

Data dependency concerns


The above account has focussed on the problems associated with data storage in files. We
now need to consider the problems that might occur when programs access the files.

Traditionally a programmer wrote a program and at the same time defined the data files that
the program would need. For the agency each department would have its own programs
which would access the department's data files. When a programmer creates a program for a
department the programmer has to know how the data is organised in these files, for example,
that the fourth item on a line in the file is a band name. This is an example of 'data dependency'.
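
A minimal sketch of this kind of data dependency, assuming a comma-separated line with the six items listed earlier (the sample values and the extra "instrument" field are invented for illustration):

```python
# Assumed layout: name, contact details, banking details, band name,
# band agent name, band agent contact details.
LINE = "Xiangfei Jha,x.jha@example.org,ACC-001,ComputerKidz,A. Agent,agent@example.org"


def band_name(line):
    # The program "knows" that the band name is the fourth item (index 3).
    return line.split(",")[3]


# If the file layout changes, for example a hypothetical instrument field is
# inserted before the band name, the unchanged program returns the wrong item.
NEW_LINE = "Xiangfei Jha,x.jha@example.org,ACC-001,Guitar,ComputerKidz,A. Agent,agent@example.org"

print(band_name(LINE))
print(band_name(NEW_LINE))
```

The program has to be rewritten whenever the file layout changes, which is the dependency the paragraph above describes.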

It is very likely that the files used by one department might have some data which is the same
as the data in the files of other departments. However, in the scenario presented above there
is no plan for file sharing. A further issue is that the agency might decide that there is a need
for a change in the data stored. For instance, they might see an increasing trend for bands
to perform with additional session musicians. Their data will need to be entered into some
files. This will require the existing files to be re-written. In turn, this will require the programs
to be re-written so that the new files are read correctly. In a database scenario the existing
programs could still be run even though additional data was added. The only programming
change needed would be the writing of additional programs which used this additional data.

The other aspect of data dependency is that when file structures have been defined to suit
specific programs they will not be suited to supporting new applications. The agency might feel
the need for an information system to analyse the success or otherwise of the gigs they have
organised over a number of years. Extracting the data for this from the sort of file-based system
described here would be a complex task which would take considerable time to complete.

10.02 The database approach


It is vital to understand that a database is not just a collection of data. A database is an
implementation according to the rules of a theoretical model. The basic concept was
proposed some 40 years ago by ANSI (American National Standards Institute) in its three-
level model. The three levels are:

• the external level
• the conceptual level
• the internal level.

The architecture is illustrated in Figure 10.01 in the context of a database to be set up for our
theatrical agency.

External level

Conceptual level

Internal level

Physical storage

Figure 10.01 The ANSI three-level architecture for the theatrical agency database

The physical storage of the data is represented here as being on disk. The details of the
storage (the internal schema) are known only at the internal level, the lowest level in the ANSI
architecture. This is controlled by the database management system (DBMS) software.
The programmers who wrote this software are the only ones who know the structure for
the storage of the data on disk. The software will accommodate any changes that might be
needed in the storage medium.

At the next level, the conceptual level, there is a single universal view of the database. This is
controlled by the database administrator (DBA) who has access to the DBMS. In the ANSI
architecture the conceptual level has a conceptual schema describing the organisation of the
data as perceived by a user or programmer. However, this is often described as a logical schema.

At the external level there are individual user and programmer views. Each view has an
external schema describing which parts of the database are accessible. A view can support a
number of user programs. The DBA is responsible for setting up these views and for defining
the appropriate, specific access rights. The DBMS provides facilities for a programmer to
develop a user interface for a program. It also provides a query processor. The query is the
mechanism for extracting and manipulating data from the database. A programmer will
incorporate access to queries in a user interface. The other feature provided by the DBMS is
the capability for creating a report to present formatted output.
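
As a rough illustration of an external view, the sketch below uses the SQLite engine that ships with Python. The column names ContactDetails and BankDetails and the sample row are assumptions made for the example. SQLite has no user accounts, so the access rights that a DBA would define are not shown; only the idea that a view exposes part of the database is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (
    MemberID         TEXT PRIMARY KEY,
    MemberGivenName  TEXT,
    MemberFamilyName TEXT,
    ContactDetails   TEXT,   -- hypothetical column
    BankDetails      TEXT    -- hypothetical column
);
INSERT INTO Member VALUES ('0005', 'Xiangfei', 'Jha', 'x.jha@example.org', 'ACC-001');

-- A view for the recruitment section that leaves out the banking details.
CREATE VIEW RecruitmentView AS
    SELECT MemberID, MemberGivenName, MemberFamilyName, ContactDetails
    FROM Member;
""")

cur = conn.execute("SELECT * FROM RecruitmentView")
columns = [d[0] for d in cur.description]
rows = cur.fetchall()
print(columns)
print(rows)
```

A program written against RecruitmentView never sees the BankDetails column, even though it is stored in the same table.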

Database management system (DBMS): software that controls access to data in a database
Database administrator (DBA): a person who uses the DBMS to customise the database to suit user
and programmer requirements

Discussion Point:
How many of the above concepts are recognisable in your experience of using a database?

10.03 The relational database


In the relational database model each item of data is stored in a relation which is a special
type of table. The strange choice of name has its origin in a mathematical theory. A relational
database is a collection of relational tables.

When a table is created in a relational database it is first given a name and then the attributes
are named. In a database design, a table would be given a name with the attribute names
listed in brackets after the table name. For example, a database for the theatrical agency may
contain the following tables:

Member(MemberID, MemberGivenName, MemberFamilyName, BandName, ...)

Band(BandName, AgentID, ...)

The logical view of the data in these tables is given in Table 10.02 and Table 10.03. Each
attribute is associated with one column in the table and is in effect a column header. The
column itself contains attribute values.

MemberID   MemberGivenName   MemberFamilyName   BandName       ...

0005       Xiangfei          Jha                ComputerKidz   ...
0009       Mahesh            Ravuru             ITWizz         ...
0001       Dylan             Stoddart           ComputerKidz   ...
0025       Vandana           Graham             ITWizz         ...

Table 10.02 Logical view of Member table in a relational database

BandName       AgentID   ...

ComputerKidz   01        ...
ITWizz         07        ...

Table 10.03 Logical view of Band table in a relational database

Although some database products do allow a direct view of a table this is not the norm, hence
the use of the term 'logical view' here. If a user wishes to inspect all of the data in a table a
query should be used.

Relation: the special type of table which is used in a relational database


Attribute: a column in a relation that contains values

A row in a relation should be referred to as a tuple but this strict nomenclature is not
always used. Often a row is called a 'record' and the attribute values 'fields'. The tuple is the
collection of data stored for one 'instance' of the relation. In Table 10.02, each tuple relates to
one individual band member. A fundamental principle of a relational database is that a tuple
is a set of atomic values; each attribute has one value or no value.

The most important feature of the relational database concept is the primary key. A primary
key may be a single attribute or a combination of attributes. Every table must have a primary
key and each tuple in the table must have a value for the primary key and that value must
be unique. Once a table and its attributes have been defined the next task is to choose the
primary key. In some cases there may be more than one attribute for which unique values are
guaranteed. In this case, each one is a candidate key and one will be selected as the primary
key. More often there is no candidate key and so a primary key has to be created. Table 10.02
shows an example of this with the introduction of the attribute MemberID as the primary key
(the primary key is underlined in the logical view).

The primary key ensures 'entity integrity'. The DBMS will not allow an attempt to insert a
value for a primary key when that value already exists. Therefore each tuple must be unique.
This is one of the features of the relational model that helps to ensure data integrity. The
primary key also provides a unique reference to any attribute value that a query is selecting.
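
Entity integrity can be seen in action with a small sketch. It uses SQLite through Python's sqlite3 module rather than any particular DBMS named in the chapter, and the helper name try_insert is hypothetical. The second insert reuses an existing MemberID and is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Member ("
    "MemberID TEXT PRIMARY KEY, MemberGivenName TEXT, MemberFamilyName TEXT)"
)
conn.execute("INSERT INTO Member VALUES ('0025', 'Vandana', 'Graham')")


def try_insert(member_id, given, family):
    """Return True if the row was stored, False if the DBMS rejected it."""
    try:
        conn.execute(
            "INSERT INTO Member VALUES (?, ?, ?)", (member_id, given, family)
        )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert("0025", "Graham", "Vandana")  # key already used
new_accepted = try_insert("0026", "Graham", "Vandana")        # unused key

row_count = conn.execute("SELECT COUNT(*) FROM Member").fetchone()[0]
print(duplicate_accepted, new_accepted, row_count)
```

Note that the primary key only guarantees a unique identifier. The row with key 0026 is accepted even though its names are in the wrong order.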

Although it is possible for a database to contain stand-alone tables it is usually true that each
table will have some relationship with another table. This relationship is implemented by
using a foreign key.

Primary key: an attribute or a combination of attributes for which there is a value in each tuple and
that value is unique
Foreign key: an attribute in one table that refers to the primary key in another table

The use of a foreign key can be discussed on the basis of the two database tables
represented in Table 10.02 and Table 10.03. When the database is being created, the Band
table is created first. BandName is chosen as the primary key because unique names for
bands can be guaranteed. Then the Member table is created. MemberID is defined as the
primary key and the attribute BandName is identified as a foreign key referencing the primary
key in the Band table. Once this relationship between primary and foreign keys has been
established, the DBMS will prevent any entry for BandName in the Member table being made
if the corresponding value does not exist in the Band table. This provides referential integrity
which is another reason why the relational database model helps to ensure data integrity.
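
The following sketch shows referential integrity being enforced, again using SQLite from Python as a stand-in for a DBMS. Two points are specific to this sketch and not to the chapter: SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, and the band name 'NoSuchBand' and the helper add_member are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
CREATE TABLE Band (
    BandName TEXT PRIMARY KEY,
    AgentID  TEXT
);
CREATE TABLE Member (
    MemberID         TEXT PRIMARY KEY,
    MemberGivenName  TEXT,
    MemberFamilyName TEXT,
    BandName         TEXT REFERENCES Band (BandName)
);
INSERT INTO Band VALUES ('ComputerKidz', '01');
INSERT INTO Band VALUES ('ITWizz', '07');
""")


def add_member(member_id, given, family, band):
    """Return True if the member was stored, False if the DBMS rejected it."""
    try:
        conn.execute(
            "INSERT INTO Member VALUES (?, ?, ?, ?)",
            (member_id, given, family, band),
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = add_member("0005", "Xiangfei", "Jha", "ComputerKidz")      # band exists
rejected = add_member("0030", "Precious", "Olsen", "NoSuchBand")  # band does not exist
print(ok, rejected)
```

The Band table has to be created and populated first, which matches the order of creation described above.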

Question 10.01
BandName is a primary key for the Band table. Does this mean that as a foreign key in the
Member table it must have unique values? Explain your reasoning.

10.04 Entity-relationship modelling


The top-down, stepwise refinement (see Chapter 12, Section 12.01) approach to database
design uses an entity-relationship (ER) diagram. This might be initially created and used by
a systems analyst before being passed on to the database designer. Otherwise the designer
has to create it. The term 'relationship' (not to be confused with a relation!) was introduced
earlier in connection with the use of a foreign key. An entity (strictly speaking an entity
type) could be a thing, a type of person, an event, a transaction or an organisation. Most
importantly, there must be a number of 'instances' of the entity. An entity is something that
will become a table in a relational database.

WORKED EXAMPLE 10.01

Creating an entity-relationship diagram for the theatrical agency


Let's consider a scenario for the theatrical agency which will be sufficient to model a
part of the final database they would need. The starting point for a top-down design is a
statement of the requirement:

The agency needs a database to handle bookings for bands. Each band has a number of
members. Each booking is for a venue. Each booking might be for one or more bands.

Step 1: Choose the entities

You look for the nouns. You ignore 'agency' because there is only the one. You choose
Booking, Band, Member and Venue. For each of these there will be more than one
instance. You are aware that each booking is for a gig at a venue but you ignore this
because you think that the Booking entity will be sufficient to hold the required data
about a gig.

Step 2: Identify the relationships

This requires experience but the aim is not to define too many. You choose the following
three:

Booking with Venue

Booking with Band

Band with Member.

You ignore the fact that there will be, for example, a relationship between Member and
Venue because you think that this will be handled through the other relationships that
indirectly link them. You can now draw a preliminary ER diagram as shown in Figure 10.02.

[Diagram: Member, Band, Booking and Venue boxes; single lines join Member to Band, Band to Booking, and Booking to Venue]

Figure 10.02 A preliminary entity-relationship diagram

Step 3: Decide the cardinalities of the relationships

Now comes the crucial stage of deciding on what are known as the 'cardinalities' of the
relationships. At present we have a single line connecting each pair of entities. This line
actually defines two relationships which might be described as the 'forward' one and
the 'backward' one on the diagram as drawn. However, this only becomes apparent at
the final stage of drawing the relationship. First we have to choose one of the following
descriptions for the cardinality of each relation:

• one-to-one or 1:1
• one-to-many or 1:M
• many-to-one or M:1
• many-to-many or M:M.

This can be illustrated by considering the relationship between Member and Band. We
argue that one Member is a member of only one Band. (This needs to be confirmed as a
fact by the agency.) We then argue that one Band has more than one Member so it has
many. Therefore the relationship between Member and Band is M:1. In its simplest form,
this relationship can be drawn as shown in Figure 10.03.

[Diagram: Member and Band boxes joined by a line, with the 'many' symbol at the Member end]

Figure 10.03 The M:1 relationship between Member and Band

This can be given more detail by including the fact that a Member must belong to a Band
and a Band must have more than one Member. To reflect this, the relationship can be
drawn as shown in Figure 10.04.

[Diagram: Member and Band joined by a line; minimum and maximum are both 'many' at the Member end and both 'one' at the Band end]

Figure 10.04 The M:1 relationship with more detail

At each end of the relationship there are two symbols. One of the symbols shows the
minimum cardinality and the other the maximum cardinality. In this particular case the
minimum and maximum values just happen to be the same. However, using the diagram
to document that a Member must belong to a Band is important. It indicates that when
the database is created it must not be possible to create a new entry in the Member table
unless there is a valid entry for BandName in that table.

For the relationship between Booking and Venue we argue that one Booking is for one
Venue (there must be a venue and there cannot be more than one) and that one Venue
can be used for many Bookings so the relationship between Booking and Venue is M:1.
However, a Venue might exist that has so far never had a booking so the relationship can
be drawn as shown in Figure 10.05.

[Diagram: Booking and Venue joined by a line; 'many' with a minimum of zero at the Booking end, exactly one at the Venue end]

Figure 10.05 The M:1 relationship between Booking and Venue

Finally for the relationship between Band and Booking we argue that one Booking can
be for many Bands and that one Band has many Bookings (hopefully!) so the relationship
is M:M. However, a new band might not yet have a booking. Also there might be only one
Band for a booking so the relationship can be drawn as shown in Figure 10.06.

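[Diagram: Band and Booking joined by a line with a 'many' symbol at each end; the minimum is zero Bookings for a Band and one Band for a Booking]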

Figure 10.06 The M:M relationship between Band and Booking

Step 4: Create the full ER diagram

At this stage we should name each relationship. The full ER diagram for the limited
scenario that has been considered is as shown in Figure 10.07.

[Diagram: Member 'belongs to' Band and Band 'has' Member; Band 'is booked for' Booking and Booking 'is for' Band; Booking 'is made at' Venue and Venue 'is booked for' Booking]

Figure 10.07 The ER diagram for the theatrical agency's booking database

To illustrate how the information should be read from such a diagram we can look at the
part shown in Figure 10.08. Despite the fact that there is a many-to-many relationship,
a reading of a relationship always considers just one entity to begin the sentence. So,
reading forwards and then backwards, we say that:

One Band is booked for zero or many Bookings

One Booking is for one or many Bands


[Diagram: Band 'is booked for' Booking; Booking 'is for' Band]

Figure 10.08 Part of the annotated ER diagram

10.05 A logical entity-relationship model


A fully annotated ER diagram of the type developed in Section 10.04 holds all of the
information about the relationships that exist for the data that is to be stored in a system.
It can be defined as a conceptual model because it does not relate to any specific way of
implementing a system. If the system is to be implemented as a relational database the ER
diagram has to be converted to a logical model. To do this we can start with a simplified ER
diagram that just identifies cardinalities.

If a relationship is 1:M, no further refinement is needed. The relationship shows that the entity at
the many end needs to have a foreign key referencing the primary key of the entity at the one end.

If there were a 1:1 relationship there are options for implementation. However, such
relationships are extremely rare and will not be considered further.

The problem relationship is the M:M, where a foreign key cannot be used. A foreign key
attribute can only have a single value so it cannot handle the many references required. The
solution for the M:M relationship is to create a link entity. For Band and Booking, the logical
entity model will contain the link entity shown in Figure 10.09.

[Diagram: Band, Band-Booking and Booking boxes in a row, with the link entity Band-Booking at the 'many' end of both relationships]

Figure 10.09 A link entity inserted to resolve a M:M relationship

Extension Question 10.01

Is it possible to annotate these relationships?

With the link entity in the model it is now possible to have two foreign keys in the link entity;
one referencing the primary key of Band and one referencing the primary key of Booking.

Each entity in the logical ER diagram will become a table in the relational database. It is
therefore possible to choose primary keys and foreign keys for the tables. These can be
summarised in a key table. Table 10.04 shows sensible choices for the theatrical agency's
booking database.

Table name     Primary key             Foreign key

Member         MemberID                BandName
Band           BandName
Band-Booking   BandName & BookingID    BandName, BookingID
Booking        BookingID               VenueName
Venue          VenueName

Table 10.04 A key table for the agency booking database


The decisions about the primary keys are determined by the uniqueness requirement. The
link entity cannot use either BandName or BookingID alone but the combination of the two
in a compound primary key will work.
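
A compound primary key for the link entity might look like the following sketch, which uses SQLite from Python. The table is named BandBooking here (an assumption, chosen to avoid the hyphen), the second booking identifier '2016/024' is invented, and link is a hypothetical helper. The same band can appear in several bookings and the same booking can have several bands, but a repeated pair is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.executescript("""
CREATE TABLE Band    (BandName  TEXT PRIMARY KEY);
CREATE TABLE Booking (BookingID TEXT PRIMARY KEY);

-- Link table: two foreign keys that together form the primary key.
CREATE TABLE BandBooking (
    BandName  TEXT REFERENCES Band (BandName),
    BookingID TEXT REFERENCES Booking (BookingID),
    PRIMARY KEY (BandName, BookingID)
);

INSERT INTO Band VALUES ('ComputerKidz'), ('ITWizz');
INSERT INTO Booking VALUES ('2016/023'), ('2016/024');
""")


def link(band, booking):
    """Return True if the pair was stored, False if the DBMS rejected it."""
    try:
        conn.execute("INSERT INTO BandBooking VALUES (?, ?)", (band, booking))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    link("ComputerKidz", "2016/023"),  # new pair
    link("ITWizz", "2016/023"),        # same booking, different band
    link("ComputerKidz", "2016/024"),  # same band, different booking
    link("ComputerKidz", "2016/023"),  # repeated pair
]
print(results)
```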

TASK 10.01
Consider the following scenario. An organisation books cruises for passengers. Each cruise
visits a number of ports. Create a conceptual ER diagram and convert it to a logical ER
diagram. Create a key table for the database that could be implemented from the design.

10.06 Normalisation
Normalisation is a design technique for constructing a set of table designs from a list of data
items. It can also be used to improve on existing table designs.

WORKED EXAMPLE 10.02

Normalising data for the theatrical agency


To illustrate the technique let's consider the document shown in Figure 10.10. This is a
booking data sheet that the theatrical company might use.

Booking data sheet: 2016/023

Venue:
Cambridge International Theatre
Camside
CA1
Booking date: 23.06.2016

Bands booked    Number of band members    Headlining
ComputerKidz    5                         Y
ITWizz          3                         N

Figure 10.10 Example booking data sheet

The data items on this sheet (ignoring headings) can be listed as a set of attributes:
(BookingID, VenueName, VenueAddress1, VenueAddress2, Date,
(BandName, NumberOfMembers, Headlining))
The list is put inside brackets because we are starting a process of table design. The
extra set of brackets around BandName, NumberOfMembers, Headlining is because they
represent a repeating group. If there is a repeating group, the attributes cannot sensibly
be put into one relational table. A table must have single rows and atomic attribute
values so the only possibility would be to include tuples such as those shown in Table
10.05. There is now data redundancy here with the duplication of the BookingID, venue
data and the date.

BookingID  VenueName                        VenueAddress1  VenueAddress2  Date        BandName      NumberOfMembers  Headlining

2016/023   Cambridge International Theatre  Camside        CA1            23.06.2016  ComputerKidz  5                Y
2016/023   Cambridge International Theatre  Camside        CA1            23.06.2016  ITWizz        3                N

Table 10.05 Data stored in an unnormalised table



Step 1: Conversion to first normal form (1NF)

The conversion to first normal form (1NF) requires splitting the data into two groups. At this
stage we represent the data as table definitions. Therefore we have to choose table names
and identify a primary key for each table. One table contains the non-repeating group
attributes, the other the repeating group attributes. For the first table a sensible design is:

Booking(BookingID, VenueName, VenueAddress1, VenueAddress2, Date)

The table with the repeating group is not so straightforward. It needs a compound
primary key and a foreign key to give a reference to the first table. The sensible design is:

Band-Booking(BandName, BookingID(fk), NumberOfMembers, Headlining)

Again the primary key is underlined but also the foreign key has been identified, with
(fk). Because the repeating groups have been moved to a second table, these two tables
could be implemented with no data redundancy in either. This is one aspect of 1NF. Also
it can be said that for each table the attributes are dependent on the primary key.

Step 2: Conversion to second normal form (2NF)

The Booking table is automatically in 2NF; only tables with repeating group attributes
have to be converted. For conversion to second normal form (2NF), the process is
to examine each non-key attribute and ask if it is dependent on both parts of the
compound key. Any attributes that are dependent on only one of the attributes in the
compound key must be moved out into a new table. In this case, NumberOfMembers is
only dependent on BandName. In 2NF there are now three table definitions:

Booking(BookingID, VenueName, VenueAddress1, VenueAddress2, Date)

Band-Booking(BandName(fk), BookingID(fk), Headlining)

Band(BandName, NumberOfMembers)

Note that the Booking table is unchanged from 1NF. The Band-Booking table now has
two foreign keys to provide reference to data in the other two tables. The characteristic
of a table in 2NF is that it either has a single-attribute primary key or it has a compound
primary key with every non-key attribute dependent on both components.

Step 3: Conversion to third normal form (3NF)

For conversion to third normal form (3NF) each table has to be examined to see if there
are any non-key dependencies; that means we must look for any non-key attribute that is
dependent on another non-key attribute. If there is, a new table must be defined.

In our example, VenueAddress1 and VenueAddress2 are dependent on VenueName. With
the addition of the fourth table we have the following 3NF definitions:

Band(BandName, NumberOfMembers)

Band-Booking(BandName(fk), BookingID(fk), Headlining)

Booking(BookingID, Date, VenueName(fk))

Venue(VenueName, VenueAddress1, VenueAddress2)

Note that once again a new foreign key has been identified to keep a reference to data in the
newly created table. These four table definitions match four of the entities in the logical ER
model for which the keys were identified in Table 10.04. This will not always happen. A logical
ER diagram will describe a 2NF set of entities but not necessarily a 3NF set.


Repeating group: a set of attributes that have more than one set of values when the other attributes
each have a single value

To summarise, if a set of tables are in 3NF it can be said that each non-key attribute is
dependent on the key, the whole key and nothing but the key.
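
The effect of the three steps can be checked with a short sketch that splits the unnormalised rows of Table 10.05 into the four 3NF tables. Python dictionaries keyed on each primary key stand in for the tables; the function name to_3nf is hypothetical and this is an illustration of the result, not a general normalisation algorithm.

```python
# The two rows of Table 10.05: BookingID, VenueName, VenueAddress1,
# VenueAddress2, Date, BandName, NumberOfMembers, Headlining.
UNNORMALISED = [
    ("2016/023", "Cambridge International Theatre", "Camside", "CA1",
     "23.06.2016", "ComputerKidz", 5, "Y"),
    ("2016/023", "Cambridge International Theatre", "Camside", "CA1",
     "23.06.2016", "ITWizz", 3, "N"),
]


def to_3nf(rows):
    """Split unnormalised rows into Band, Band-Booking, Booking and Venue."""
    band, band_booking, booking, venue = {}, {}, {}, {}
    for bid, vname, addr1, addr2, date, bname, members, headlining in rows:
        venue[vname] = (addr1, addr2)             # Venue(VenueName, ...)
        booking[bid] = (date, vname)              # Booking(BookingID, Date, VenueName(fk))
        band[bname] = members                     # Band(BandName, NumberOfMembers)
        band_booking[(bname, bid)] = headlining   # compound key
    return band, band_booking, booking, venue


band, band_booking, booking, venue = to_3nf(UNNORMALISED)
print(band)
print(band_booking)
print(booking)
print(venue)
```

The venue name, address and date, which were stored twice in Table 10.05, now appear once each.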

Question 10.02
In Step 2 of Worked Example 10.02, why is the Headlining attribute not placed in the Band table?

TASK 10.02
Normalise the data shown in Figure 10.11.

Order no: 07845        Date: 25-06-2016
Customer no: 056       Customer name: CUP
                       Address: Cambridge square Cambridge
Sales rep no: 2        Sales Rep name: Dylan Stoddart

Product no   Description         Quantity   Price / unit   Total
327          Inkjet cartridges   24         $30            $720
563          Laser toner         5          $25            $125
                                            Total Price    $845

Figure 10.11 An order form

10.07 Structured Query Language (SQL)


SQL is the programming language provided by a DBMS to support all of the operations
associated with a relational database. Even when a database package offers high-level
facilities for user interaction, they use SQL.

Data Definition Language (DDL)


Data Definition Language (DDL) is the part of SQL provided for creating or altering tables.
These commands only create the structure. They do not put any data into the database.

The following are some examples of DDL that could be used in creating the database for the
theatrical agency:

CREATE DATABASE BandBooking;

CREATE TABLE Band (
  BandName varchar2(25),
  NumberOfMembers number(1));
ALTER TABLE Band ADD PRIMARY KEY (BandName);
-- The hyphen is not allowed in an ordinary identifier, so the name is quoted.
ALTER TABLE "Band-Booking" ADD FOREIGN KEY (BandName) REFERENCES
  Band(BandName);

These examples show that once the database has been created the tables can be created
and the attributes defined. It is possible to define a primary key and a foreign key within the
CREATE TABLE command but the ALTER TABLE command can be used as shown (it can
also be used to add extra attributes).
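
As an illustration of defining the keys within the CREATE TABLE command itself, the following
sketch can be run with Python's built-in sqlite3 module. It rests on several assumptions that are
not part of the agency's design: SQLite stands in for the DBMS, so TEXT and INTEGER replace
varchar2 and number; the table is named BandBooking because an unquoted hyphen is not accepted
in an SQLite table name; and no CREATE DATABASE step is needed because connecting creates the
database. SQLite cannot add keys with ALTER TABLE, so here that command is only used to add an
extra attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # connecting creates an empty database
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce foreign keys

# Primary key declared inside CREATE TABLE
conn.execute("""CREATE TABLE Band (
    BandName        TEXT PRIMARY KEY,
    NumberOfMembers INTEGER)""")

# Foreign key declared inside CREATE TABLE
conn.execute("""CREATE TABLE BandBooking (
    BandName  TEXT,
    BookingID TEXT,
    FOREIGN KEY (BandName) REFERENCES Band(BandName))""")

# ALTER TABLE used afterwards to add an extra attribute
conn.execute("ALTER TABLE BandBooking ADD COLUMN Headlining TEXT")

# Inspect the structure that now exists; no data has been stored yet
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(BandBooking)")]
row_count = conn.execute("SELECT COUNT(*) FROM Band").fetchone()[0]
print(table_names, columns, row_count)
```

The final line confirms the point made above: the DDL has created structure only, and both
tables are still empty.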

Chapter 10: Database and Data Modelling

Task 10.03

For the database defined in Worked Example 10.02, complete the DDL for creating the four
tables. Use varchar2(5) for BookingID, number(1) for NumberOfMembers, date for Date,
varchar2(1) for Headlining and varchar2(25) for all other data.

Data Manipulation Language (DML)


Data Manipulation Language (DML) is used when a database is first created, to populate the
tables with data. It can then be used for ongoing maintenance. The following code shows a
selection of the commands in use:

INSERT INTO Band VALUES ('ComputerKidz', 5);

INSERT INTO Band-Booking (BandName, BookingID)
VALUES ('ComputerKidz', '2016/023');
UPDATE Band
SET NumberOfMembers = 6
WHERE BandName = 'ComputerKidz';
DELETE FROM Band
WHERE BandName = 'ITWizz';

The above code shows the two methods of inserting data. The first, simpler version can be
used if the order of the attributes is known. The second is the safer method: the attributes
are defined then the values are listed. The next two statements show a change of data and
the removal of data.
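
The four kinds of statement can be tried out with the sketch below, which again assumes
Python's built-in sqlite3 module in place of the chapter's DBMS. The number of members given
for ITWizz is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Band (BandName TEXT PRIMARY KEY, NumberOfMembers INTEGER)")

# Simpler insert: the values must be in the order the attributes were defined
conn.execute("INSERT INTO Band VALUES ('ComputerKidz', 5)")

# Safer insert: the attributes are named, so their order no longer matters
conn.execute("INSERT INTO Band (NumberOfMembers, BandName) VALUES (4, 'ITWizz')")

# Change of data: the WHERE clause limits the change to one band
cursor = conn.execute(
    "UPDATE Band SET NumberOfMembers = 6 WHERE BandName = 'ComputerKidz'")
rows_updated = cursor.rowcount

# Removal of data
conn.execute("DELETE FROM Band WHERE BandName = 'ITWizz'")

remaining = conn.execute("SELECT BandName, NumberOfMembers FROM Band").fetchall()
print(rows_updated, remaining)
```

An UPDATE or DELETE with no WHERE clause applies to every row in the table, which is rarely
what is intended.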

The main use of DML is to obtain data from a database using a query. A query always starts
with the SELECT command. Some examples are:

SELECT BandName
FROM Band
ORDER BY BandName;

SELECT BandName
FROM Band-Booking
WHERE Headlining = 'Y'
GROUP BY BandName;

Both of these examples select data from a single table. The first produces an ordered list of
all the bands. The second produces a list of bands that have headlined a gig. The GROUP BY
restriction ensures that the band names are not repeated.
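
The effect of GROUP BY can be seen by running a query of this kind against a few rows. This is
a sketch under stated assumptions: Python's built-in sqlite3 module is used, the table is named
BandBooking (without the hyphen, which SQLite would not accept unquoted) and the bookings are
invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BandBooking (BandName TEXT, BookingID TEXT, Headlining TEXT)")
conn.executemany("INSERT INTO BandBooking VALUES (?, ?, ?)", [
    ("ITWizz", "B001", "Y"),
    ("ComputerKidz", "B001", "N"),
    ("ITWizz", "B002", "Y"),        # ITWizz headlines a second gig
    ("ComputerKidz", "B003", "Y"),
])

# Without GROUP BY a band appears once for every gig it has headlined
ungrouped = [row[0] for row in conn.execute(
    "SELECT BandName FROM BandBooking WHERE Headlining = 'Y' ORDER BY BandName")]

# With GROUP BY each band appears once only
headliners = [row[0] for row in conn.execute(
    "SELECT BandName FROM BandBooking WHERE Headlining = 'Y' "
    "GROUP BY BandName ORDER BY BandName")]

# GROUP BY also allows a summary to be calculated for each group
gig_counts = conn.execute(
    "SELECT BandName, COUNT(*) FROM BandBooking WHERE Headlining = 'Y' "
    "GROUP BY BandName ORDER BY BandName").fetchall()

print(ungrouped, headliners, gig_counts)
```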

A query can be based on a 'join condition' between data in two tables. The most frequently
used is an inner join which is illustrated by:

SELECT VenueName, Date
FROM Booking, Band-Booking
WHERE Band-Booking.BookingID = Booking.BookingID
AND Band-Booking.BandName = 'ComputerKidz';

Note the use of the full names of attributes, which include the table name. This query will find
the venue and date of bookings for the band ComputerKidz.
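
An inner join of this kind can be written in two equivalent ways, which the sketch below
compares. The assumptions are as follows: Python's built-in sqlite3 module is used, the second
table is named BandBooking without the hyphen, and the bookings and venue names are invented.
Both tables taking part in the join are named in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Booking (BookingID TEXT PRIMARY KEY, Date TEXT, VenueName TEXT)")
conn.execute("CREATE TABLE BandBooking (BandName TEXT, BookingID TEXT, Headlining TEXT)")
conn.executemany("INSERT INTO Booking VALUES (?, ?, ?)", [
    ("B001", "2016-06-25", "Town Hall"),
    ("B002", "2016-07-02", "Corn Exchange"),
])
conn.executemany("INSERT INTO BandBooking VALUES (?, ?, ?)", [
    ("ComputerKidz", "B001", "Y"),
    ("ITWizz", "B002", "Y"),
    ("ComputerKidz", "B002", "N"),
])

# Join condition written in the WHERE clause
where_join = conn.execute("""
    SELECT Booking.VenueName, Booking.Date
    FROM Booking, BandBooking
    WHERE BandBooking.BookingID = Booking.BookingID
    AND BandBooking.BandName = 'ComputerKidz'
    ORDER BY Booking.Date""").fetchall()

# The same join written with the INNER JOIN ... ON keywords
explicit_join = conn.execute("""
    SELECT Booking.VenueName, Booking.Date
    FROM Booking INNER JOIN BandBooking
    ON BandBooking.BookingID = Booking.BookingID
    WHERE BandBooking.BandName = 'ComputerKidz'
    ORDER BY Booking.Date""").fetchall()

print(where_join)
print(explicit_join)
```

Only rows whose BookingID values match in both tables appear in the result, which is what makes
this an inner join.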

Accessing SQL commands using a different language


Although a database can be accessed directly using SQL there is often a need to control
access to a database using a different language. This makes sense because a program can
access data in a file so why not in a database? Programming languages therefore have a
mechanism for embedding an SQL command into a program.
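
As one illustration of such a mechanism, the sketch below embeds an SQL query in a Python
function using the built-in sqlite3 module. The function name bands_with_at_least and the
sample data are hypothetical, chosen only for this example. The ? placeholder lets the program
supply a value at run time without building the SQL text by hand.

```python
import sqlite3

def bands_with_at_least(conn, minimum):
    """Return the names of bands with at least `minimum` members."""
    rows = conn.execute(
        "SELECT BandName FROM Band WHERE NumberOfMembers >= ? ORDER BY BandName",
        (minimum,))  # the value is passed separately from the SQL text
    return [name for (name,) in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Band (BandName TEXT PRIMARY KEY, NumberOfMembers INTEGER)")
conn.executemany("INSERT INTO Band VALUES (?, ?)",
                 [("ComputerKidz", 5), ("ITWizz", 3)])

large_bands = bands_with_at_least(conn, 4)
print(large_bands)
```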

A special case arises in a client-server web application as mentioned in Chapter 2 (Section
2.09). Server-side scripting using PHP can access a database associated with the server. The
following is an example of some code that could be included in an HTML file:

<?php
// Connect to localhost using root as the username and no password
// (the mysqli functions are used; the older mysql_* functions are not available from PHP 7)
$conn = mysqli_connect("localhost", "root", "");
// Select the database
mysqli_select_db($conn, "BandBooking");
// Run a query
$result = mysqli_query($conn, "SELECT * FROM Band");
?>

This code assumes that you have created a MySQL database on a server located on your own
computer.

10.08 DBMS features


There are a few important features of a DBMS which have not been mentioned. The first and
most important is the data dictionary which is part of the database that is hidden from view
from everyone except the DBA. It contains metadata about the data. This includes details
of all the definitions of tables, attributes and so on but also of how the physical storage is
organised.

There are a number of features to improve performance. Of special note is the capability
to create an index for a table. This is needed if the table contains a lot of data. An index is a
secondary table which is associated with an attribute that has unique values. The index table
contains the attribute values and pointers to the corresponding tuple in the original table.
The index can be on the primary key or on a secondary key which was a candidate key when
the choice of primary key was made. Searching an index table is much quicker than searching
the full table.
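
The difference an index makes can be observed by asking a DBMS how it intends to carry out a
query. The following sketch assumes SQLite, accessed through Python's built-in sqlite3 module;
EXPLAIN QUERY PLAN is an SQLite feature and other DBMSs provide their own equivalents. The
index name idx_band_name and the 1000 generated band names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key is declared, so the table starts with no index at all
conn.execute("CREATE TABLE Band (BandName TEXT, NumberOfMembers INTEGER)")
conn.executemany("INSERT INTO Band VALUES (?, ?)",
                 [(f"Band{i:04d}", i % 9 + 1) for i in range(1000)])

query = "SELECT NumberOfMembers FROM Band WHERE BandName = 'Band0500'"

def describe_plan(sql):
    """Return SQLite's description of how it would run the query."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = describe_plan(query)   # every row of the table has to be examined

# Create an index on an attribute that has unique values
conn.execute("CREATE UNIQUE INDEX idx_band_name ON Band(BandName)")

plan_after = describe_plan(query)    # the plan now names the index
members = conn.execute(query).fetchone()

print(plan_before)
print(plan_after)
print(members)
```

The query returns the same answer in both cases; only the way the DBMS finds the row changes.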

Finally, the DBMS controls security issues which include:

• setting access rights for users
• implementing backup procedures
• ensuring that an interrupted database transaction cannot leave the database in an
undefined state.
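
The last of these points can be demonstrated with a transaction that fails part-way through.
The sketch below assumes SQLite through Python's built-in sqlite3 module, where using the
connection in a with statement commits the transaction on success and rolls it back if an
error occurs. The data is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Band (BandName TEXT PRIMARY KEY, NumberOfMembers INTEGER)")
conn.execute("INSERT INTO Band VALUES ('ComputerKidz', 5)")
conn.commit()

try:
    with conn:  # both statements form one transaction
        conn.execute(
            "UPDATE Band SET NumberOfMembers = 6 WHERE BandName = 'ComputerKidz'")
        # This insert repeats an existing primary key value, so it is rejected
        conn.execute("INSERT INTO Band VALUES ('ComputerKidz', 4)")
except sqlite3.IntegrityError:
    transaction_failed = True
else:
    transaction_failed = False

# The update made before the failure has been undone as well
members = conn.execute(
    "SELECT NumberOfMembers FROM Band WHERE BandName = 'ComputerKidz'").fetchone()[0]
print(transaction_failed, members)
```

Either every change in the transaction is stored or none of them is, so the database is never
left half-updated.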

• A database offers improved methods for ensuring data integrity compared to a file-based
approach.

• A database architecture provides, for the user, a conceptual level interface to the stored data.

• A relational database comprises tables of a special type; each table has a primary key and may
contain foreign keys.

• Entity-relationship modelling is a top-down approach to database design.

• Normalisation is a database design method which starts with a collection of attributes and
converts them into first normal form then into second normal form and, finally, into third normal
form.

• Structured Query Language (SQL) includes data definition language (DDL) commands for
establishing a database and data manipulation language (DML) commands for creating queries.

• Features provided by a database management system (DBMS) include: a data dictionary, indexing
capability, control of user access rights and backup procedures.

Exam-style Questions
1 a A relational database has been created to store data about subjects that students are studying. The following is a
selection of some data stored in one of the tables. The data represents the student's name, the personal tutor group,
the personal tutor, the subject studied, the level of study and the subject teacher but there is some data missing:

Xiangfei   3   MUB   Computing   A    DER
Xiangfei   3   MUB   Maths       A    BNN
Xiangfei   3   MUB   Physics     AS   DAB
Mahesh     2   BAR   History     AS   IJM
Mahesh     2   BAR   Geography   AS   CAB

i Define the terms used to describe the components in a relational database table using examples from
this table. [2]

ii If this represented all of the data, it would have been impossible to create this table.
What is it that has not been shown here and must have been defined to allow the creation as a relational
database table? Explain your answer and suggest examples of the missing data. [4]

iii Is this table in first normal form (1NF)? Explain your reason. [2]

b It has been suggested that the database design could be improved. The design suggested contains the following
two tables:

Student(StudentName, TutorGroup, Tutor)

StudentSubject(StudentName, Subject, Level, SubjectTeacher)

i Identify features of this design which are characteristic of a relational database. [3]

ii Explain why the use of StudentName here is a potential problem. [2]

iii Explain why the Student table is not in third normal form (3NF). [2]

2 Consider the following scenario:

A company provides catering services for clients who need special-occasion, celebratory dinners. For
each dinner, a number of dishes are to be offered. The dinner will be held at a venue. The company will
provide staff to serve the meals at the venue.

The company needs a database to store data related to this business activity.

a An entity-relationship model is to be created as the first step in a database design. Identify a list of entities. [4]

b Identify pairs of entities where there is a direct relationship between them. [4]

c For each pair of entities, draw the relationship and justify the choice of cardinality illustrated by the
representation. [6]

3 Consider the following booking form used by a travel agency.

Booking Number 00453

Hotel: Esplanade          Rating: ***
Colwyn Bay
North Wales

Date         Room type            Number of rooms   Room rate
23/06/2016   Front-facing double  2                 $80
23/06/2016   Rear-facing double   1                 $65
24/06/2016   Front-facing double  2                 $80

a Create an unnormalised list of attributes using the data shown in this form. Make sure that you distinguish
between the repeating and non-repeating attributes. [5]

b Convert the data to first normal form (1NF). Present this as designs for two tables with
keys identified. [3]

c Choose the appropriate table and convert it to two tables in second normal form (2NF). Explain your choice
of table to modify. Explain your identification of the keys for these two new tables. [5]

d Identify which part of your design is not in Third Normal Form (3NF). [2]
