DB & DBMS
e-mail: [email protected]
Contents
1. Learning outcomes check list for the session
2. Introduction
7. Additional Notes
8. References
‘Ways of thinking about the clinical information you collect (i.e. Data)’
should be read before this section.
Introduction to Health Informatics
Database concepts (1)
Robin Beaumont 28/03/00 Tel:0191 2731150 e-mail: [email protected] Source: Laptop; C:\HIcourseweb new\chap7\s2\dbcon1.doc Page 2
2. Introduction
The term 'database' is used by everyone nowadays, yet what exactly does the word mean? Most people,
not being computer boffins, would find it difficult to provide an explanation but would, if pressed, suggest
one or more examples: the method a bank uses to store details of its accounts, the medical records
department of a large hospital, or a card index of their own home video collection. The question is, what
exactly do all these examples have in common?
"A database is one or more large structured tables of persistent data, usually associated
with software to update and query the data. A simple database might be a single table
containing many records, each of which contains the same set of fields where each field
is a certain fixed width."
This definition clearly needs a lot of explaining; for one thing, what is meant by a table? The following
sections will take each of the terms in the above definition and provide examples, exercises and
explanations.
A database consists of
The diagram below provides an example of a typical table consisting of patient details:
[Diagram: table of patient details, with one row labelled 'A record']
In the above diagram:
A row = a record
A column = a field
A table in a database may be empty (that is, contain no records) or may contain many millions of them.
Similarly, a single record may consist of one or thousands of fields. Each field in turn contains a single
data item. These data items can be categorised in a number of ways, which we discussed when talking
about data types.
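The row/record and column/field vocabulary can be made concrete with a small sketch. It uses Python's built-in sqlite3 module rather than Access97, and the table name, field names and patient values are all invented for illustration.

```python
import sqlite3

# Hypothetical patient table; names and values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER, first_name TEXT, last_name TEXT, dob TEXT)"
)

# A table may be empty: it exists but holds no records yet.
empty_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]

# Each inserted row is one record; each value in it is one data item.
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?, ?)",
    [(1, "Ann", "Smith", "1950-02-14"), (2, "Bob", "Jones", "1972-11-30")],
)

cursor = conn.execute("SELECT * FROM patient ORDER BY id")
field_names = [column[0] for column in cursor.description]  # columns = fields
records = cursor.fetchall()                                  # rows = records

print(field_names)
for record in records:
    print(record)
```

Each tuple printed is one record, and each position within the tuple corresponds to one field.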
The actual object that the table as a whole represents (i.e. what its name stands for, such as 'doctor') plays
some part in determining the fields. However, these fields must be subjected to a formal process known as
normalisation to check whether they should actually be included in the table or would be better placed in a
different one. If the tables were developed from an object model (a technique covered elsewhere) they are
frequently already normalised.
Normalisation is an extremely important process, ensuring that a database can actually work. Anyone can
come up with all sorts of tables and wonderful sets of fields within each of them, but this is tantamount to
disaster if the data is not structured, which means NORMALISED. A relational DBMS, such as Access97,
is designed to work with normalised data and, if presented with non-normalised (i.e. unstructured) data, will
fail to work correctly, requiring a lot of code, written by expensive programmers, and sticky tape to keep
going. Access97 does provide a facility that takes non-normalised data and supposedly normalises it for
you; unfortunately it does not work very well.
2. You should always seek advice from a qualified database person when devising databases
which have more than a few dozen fields
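One common kind of problem that normalisation removes is repeated data. The sketch below is only an illustration of that single idea, not the full normalisation procedure. It uses Python's built-in sqlite3 module, and the tables, names and phone numbers are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Non-normalised: the doctor's details are repeated on every patient row.
conn.execute(
    "CREATE TABLE patient_flat "
    "(patient_id INTEGER, patient_name TEXT, doctor_name TEXT, doctor_phone TEXT)"
)
conn.executemany(
    "INSERT INTO patient_flat VALUES (?, ?, ?, ?)",
    [(1, "Ann Smith", "Dr Patel", "555-0101"),
     (2, "Bob Jones", "Dr Patel", "555-0101")],
)
# Updating the phone number on only one row leaves the data inconsistent.
conn.execute("UPDATE patient_flat SET doctor_phone = '555-0199' WHERE patient_id = 1")
flat_phones = {row[0] for row in conn.execute(
    "SELECT doctor_phone FROM patient_flat WHERE doctor_name = 'Dr Patel'")}

# Normalised: doctor details live in their own table; patients refer to them by ID.
conn.execute("CREATE TABLE doctor (doctor_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute(
    "CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT, "
    "doctor_id INTEGER REFERENCES doctor(doctor_id))"
)
conn.execute("INSERT INTO doctor VALUES (1, 'Dr Patel', '555-0101')")
conn.executemany("INSERT INTO patient VALUES (?, ?, ?)",
                 [(1, "Ann Smith", 1), (2, "Bob Jones", 1)])
# The phone number is stored once, so a single update reaches every patient.
conn.execute("UPDATE doctor SET phone = '555-0199' WHERE doctor_id = 1")
normalised_phones = {row[0] for row in conn.execute(
    "SELECT doctor.phone FROM patient JOIN doctor ON patient.doctor_id = doctor.doctor_id")}

print(flat_phones)        # two different numbers for the same doctor
print(normalised_phones)  # one number
```

In the flat table the same doctor ends up with two phone numbers; in the normalised pair of tables that cannot happen, because the number is held in one place only.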
Access97 allows you to do most things in several different ways. For example, you can create the database
by using the table definition window or you can define it using a standard language called SQL. Similarly,
you can enter data in one of two ways. The first, datasheet view, is pretty crude but does the job if you're the
only one using the database. Alternatively, you can create pretty screens called forms, which make the job
of entering data easier. You can manipulate or query your data in Access97 by using a graphical method
(called Query by Example, QBE) or again by using SQL. You can also create more user-friendly versions of
these results using reports, which can provide high quality printed documents.
There is a great deal of information in the above paragraph; do not worry if you do not understand it, as
each of these aspects will be considered in depth later in the course. The important thing to realise is that a
DBMS is much more complex than a spreadsheet or a traditional card indexing system.
Obviously each supplier constantly tries to outdo the others in offering more features. However we will
concentrate on the basic ones listed above.
4.1 Meta-Data
Besides storing the actual data you put in it (e.g. names and dates of birth in the previous patient
table example in section 3), the DBMS also stores details about this data and the structure in which it is kept.
That is, the DBMS 'knows' how many tables and records there are in the database as well as what data type each
field is. All this additional information is called meta data and is very useful when you want to find out
certain details. The data-dictionary you created earlier is an example of meta-data.
In Access97 you can inspect the meta data that is stored about the current database you have open by
clicking on the Tools menu -> Options ->View tab. Then under the Show section, select or clear the Hidden
Objects check box and the System Objects check box. Access97 displays hidden objects with dimmed
icons to distinguish them from objects that aren't defined as hidden.
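Other DBMSs expose their meta-data in their own ways. As a hedged illustration outside Access97, the sketch below uses Python's built-in sqlite3 module: SQLite keeps a system table called sqlite_master and offers PRAGMA table_info to list a table's fields and declared types. The two tables created here are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, last_name TEXT, dob TEXT)")
conn.execute("CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT)")

# sqlite_master is SQLite's own system table describing the database objects.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA table_info returns one row per field: (cid, name, type, notnull, default, pk).
field_types = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(patient)")}

print(table_names)
print(field_types)
```

Nothing printed here is patient data; it is all data about the data, which is what the term meta-data refers to.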
4.2 Actions/Privileges
In some respects computer databases (that is, DBMSs) work in a very similar way to the old-fashioned card
index boxes loved by most researchers.
Consider a box collection of files and list the actions you can carry out on them.
A DBMS allows the user to carry out the following activities which are often called user 'rights':
Action: Level:
That is, the user can possibly be allowed to Create, Read, Update or Delete tables, records or fields. These
actions / user rights are often referred to as 'CRUD'. The DBMS may also allow the user to carry out ad
hoc:
The above 'CRUD' is very useful when considering what actions / access rights various people should have
to a set of data. For example, a nurse may be able to Read the details of a computerised prescription but
not be allowed to carry out any of the other actions on the record. Similarly, a doctor who is a colleague of
the one who Created the prescription may not be able to Delete it. Different professional groups, notably
the administrative managers and the clinicians, will have very different rights to the same sets of
data.
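The idea of granting each group its own set of CRUD rights can be sketched as a simple lookup. This is a minimal illustration in Python: the role names, the RIGHTS table and the is_allowed function are hypothetical, and the rights given to the colleague doctor are an assumption chosen only to match the example above.

```python
# Hypothetical rights table: role -> data item -> permitted CRUD letters.
RIGHTS = {
    "nurse": {"prescription": "R"},
    "prescribing_doctor": {"prescription": "CRUD"},
    "colleague_doctor": {"prescription": "RU"},  # assumed: may not Delete
}


def is_allowed(role: str, item: str, action: str) -> bool:
    """Return True if the role holds the single-letter CRUD action on the item."""
    granted = RIGHTS.get(role, {}).get(item, "")
    return len(action) == 1 and action in granted


print(is_allowed("nurse", "prescription", "R"))             # nurse may read
print(is_allowed("nurse", "prescription", "U"))             # but not update
print(is_allowed("colleague_doctor", "prescription", "D"))  # colleague may not delete
```

Unknown roles or data items fall through to an empty set of rights, so the default is to refuse.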
Many health systems do not allow anyone to delete anything, only to 'delete flag' the data item. This tags
the record as being deleted but does not remove it from the database. This is one way of keeping an audit
trail, enabling computer users' actions to be recorded. There was recently a case of a
GP changing prescriptions, only to be found out by the computer's audit trail!
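A delete flag combined with an audit record might look something like the following sketch. It uses Python's built-in sqlite3 module; the table layout, the flag_deleted function and the user name are all invented for illustration and do not describe any particular health system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prescription "
    "(id INTEGER PRIMARY KEY, drug TEXT, deleted INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("CREATE TABLE audit_log (username TEXT, event TEXT, prescription_id INTEGER)")
conn.execute("INSERT INTO prescription (id, drug) VALUES (1, 'amoxicillin')")


def flag_deleted(username: str, prescription_id: int) -> None:
    """Tag a prescription as deleted and record who did it; no row is removed."""
    conn.execute("UPDATE prescription SET deleted = 1 WHERE id = ?", (prescription_id,))
    conn.execute("INSERT INTO audit_log VALUES (?, 'delete', ?)",
                 (username, prescription_id))


flag_deleted("dr_a", 1)

# Everyday queries skip flagged records, yet the record is still stored.
visible = conn.execute("SELECT COUNT(*) FROM prescription WHERE deleted = 0").fetchone()[0]
stored = conn.execute("SELECT COUNT(*) FROM prescription").fetchone()[0]
audit_rows = conn.execute(
    "SELECT username, event, prescription_id FROM audit_log").fetchall()

print(visible, stored, audit_rows)
```

The record disappears from normal use but remains in the database, and the log shows which user flagged it.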
Often database analysts draw what is known as a 'CRUD' matrix. The purpose of a CRUD matrix is to
classify who should do, or does, what to various data items (attributes, fields or whatever), for example:
Data items:      Patient ID | Patient blood results | Patient NHS ID | NOK | etc.
Users:
Ward clerk       R  R  CRU
Student Nurse    R  R  R  CRU
etc.
CRUD Exercise -
Ward Nurse
Houseman
Unit Manager
Create a 'CRUD' matrix to indicate their type of access ('CRUD'), for the following data items:
We will be looking in more detail at queries, and less so at reports, in the practical sessions of this course.
4.4 Views
From the above it is clear that a database can offer a large number of things to a particular user. In large
databases a particular group of users has a particular view of the data. This might well be the result of
having few privileges to a very small subset of the database, whereas other users will have far greater
privileges applied to much more data. The important thing to realise is that what a particular user sees is
not necessarily the same as the underlying data. The excellent recent novel 'Quite Ugly One Morning' by
Christopher Brookmyre describes this situation in an NHS context.
[Diagram: a database, with User 'A's view of data and User 'B's view of data]
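Many relational systems let you define such a restricted window onto a table with the SQL statement CREATE VIEW. The sketch below uses Python's built-in sqlite3 module; the table, the view name clerk_view and the values are invented. SQLite has no user privileges of its own, so this shows only the narrowed view of the data, not the enforcement of who may use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, blood_result TEXT)")
conn.execute("INSERT INTO patient VALUES (1, 'Ann Smith', 'Hb 13.2')")

# A hypothetical view for clerical staff that exposes only the ID and name.
conn.execute("CREATE VIEW clerk_view AS SELECT id, name FROM patient")

cursor = conn.execute("SELECT * FROM clerk_view")
clerk_fields = [column[0] for column in cursor.description]
clerk_rows = cursor.fetchall()

print(clerk_fields)  # the blood_result field is not part of this view
print(clerk_rows)
```

The underlying table still holds the blood result; the view simply does not show it.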
4.5 Forms
These provide a method of allowing users to enter data in a friendly manner. For example, the screen may
mimic a paper data entry form and provide a method of entering data into several tables simultaneously. A
practical session will teach you how to develop them.
Unfortunately the point and click interface is limited, and developing any database that is usable to any
extent must include some programming. Traditionally different aspects of the database had different
languages: there was a Data Definition Language (DDL) for creating the database and a Data Manipulation
Language (DML) to allow the user to interrogate the database. There were also special reporting languages
that provided formatting facilities. Most of these sublanguages have now been seamlessly incorporated into
most major programming languages such as Microsoft Visual Basic (the language for Access97) and
Delphi.
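The split between defining and manipulating a database survives in SQL, and a host programming language can issue both kinds of statement. The sketch below uses Python and its built-in sqlite3 module purely as an illustration (the course itself uses Access97 and Visual Basic); the doctor table and its contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define the structure of the database.
conn.execute("CREATE TABLE doctor (id INTEGER PRIMARY KEY, name TEXT, specialty TEXT)")

# DML: statements that add, change and interrogate the data held in that structure.
conn.execute("INSERT INTO doctor (name, specialty) VALUES ('Dr Patel', 'Cardiology')")
conn.execute("INSERT INTO doctor (name, specialty) VALUES ('Dr Lee', 'Renal')")
conn.execute("UPDATE doctor SET specialty = 'Nephrology' WHERE name = 'Dr Lee'")
names = [row[0] for row in conn.execute(
    "SELECT name FROM doctor WHERE specialty = 'Nephrology'")]

print(names)
```

Here the host language supplies the control flow and output, while the embedded SQL strings do the defining and interrogating.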
Primary keys - The primary key is a field or combination of fields that uniquely identifies each record in a
table (e.g. patient ID).
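A DBMS enforces this uniqueness for you once a field is declared as the primary key. The following minimal sketch uses Python's built-in sqlite3 module, with an invented patient table, to show a second record with the same key value being refused.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO patient VALUES (1, 'Ann Smith')")

try:
    # A second record with the same primary key value is refused.
    conn.execute("INSERT INTO patient VALUES (1, 'Bob Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM patient").fetchone()[0]
print(duplicate_rejected, row_count)
```

Because the duplicate is rejected, a key value always leads to exactly one record.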
Is the 'ID' field in the table below a suitable primary key? Give reasons for your answer.
A composite (compound) primary key is one where the key consists of more than one field. For example,
imagine we did not have a unique patient ID field in the patient table. We could possibly say that first name
+ last name + DOB + post code would give us a unique combination. However, this is really a botch for
several reasons. Firstly, as I stated in the previous section, a primary key must possess a valid value. It
MUST NOT be empty. Considering our composite key, it is highly probable that some of the fields will be
empty (null). This means that we can't save any of the other details for the record, such as referral doctor
and date placed on list, until we have all the details that make up the primary key. Some patients may
wait a very long time for referrals if this primary key were accepted.
It is always better to have a primary key that is an abstract number rather than one which is made up of
several fields which have meaning (see p126 Reingruber & Gregory). In fact the Object Oriented school of
modelling considers this so important that they have given primary keys a special name, 'oid', which stands
for object ID and which is a characteristic of all objects before any other is defined.
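The difference between the two kinds of key can be sketched as follows, using Python's built-in sqlite3 module. The table and field names are invented, and NOT NULL is written out explicitly on the composite key fields because SQLite does not otherwise insist on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Composite key: every part must be present before the record can be saved.
conn.execute("""CREATE TABLE patient_composite (
    first_name TEXT NOT NULL, last_name TEXT NOT NULL,
    dob TEXT NOT NULL, post_code TEXT NOT NULL,
    referral_doctor TEXT,
    PRIMARY KEY (first_name, last_name, dob, post_code))""")

try:
    # The post code is not yet known, so the whole record is refused.
    conn.execute(
        "INSERT INTO patient_composite VALUES "
        "('Ann', 'Smith', '1950-02-14', NULL, 'Dr Patel')")
    composite_saved = True
except sqlite3.IntegrityError:
    composite_saved = False

# Abstract numeric key: the DBMS supplies the value; other fields may wait.
conn.execute("""CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    first_name TEXT, last_name TEXT, dob TEXT, post_code TEXT,
    referral_doctor TEXT)""")
cursor = conn.execute(
    "INSERT INTO patient (first_name, last_name, dob, post_code, referral_doctor) "
    "VALUES ('Ann', 'Smith', '1950-02-14', NULL, 'Dr Patel')")
new_id = cursor.lastrowid

print(composite_saved, new_id)
```

With the abstract key the referral can be recorded straight away and the missing post code filled in later.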
7. Additional Notes
7.1 Primary Keys
Reingruber & Gregory (1994) give the following useful information concerning primary key characteristics
besides the non-null property (see p126). In the extract below, where Reingruber & Gregory mention entities,
treat these as equivalent to objects or tables; the subtleties do not matter at this stage.
1. Stable. The value of a primary key must not change or become null throughout the life of an entity
(Brooks 1992). A stable primary key helps keep the model stable (Whitener 1989). For example if we
consider a patient record the value for the primary key must not change over time as would happen with
the age field.
2. Minimal. The primary key should be composed of the minimum number of fields that ensures the
occurrences are unique.
3. Factless. It should not contain intelligence - groupings of digits or characteristics within a value of the
key that hold additional meta-information [Note: e.g. say patient ID consisted of GP ID + their own
unique ID]. This violates atomicity requirements for attributes [note RB: = fields] and increases the potential
that the primary key value would change.
4. Definitive. A value must exist for every record at creation time. The primary key acts as a constraining
mechanism for the entity, because an entity occurrence cannot be substantiated unless the primary key
value also exists.
5. Accessible. Anyone who wants to create, read or delete a record must be able to see the primary key
value (Whitener 1989).
8. References
Reingruber, Michael C. & Gregory, William W. 1994 The Data Modeling Handbook. John Wiley & Sons.
Chichester.
Martin, James 1981 An End-User's Guide to Data Base. Prentice Hall [ISBN 0 13 277129 2]