CIE A2 Computing
INTERNATIONAL EDITION
MODULE 3
SYSTEMS SOFTWARE MECHANISMS, MACHINE ARCHITECTURE,
DATABASE THEORY, PROGRAMMING PARADIGMS AND
INTEGRATED INFORMATION SYSTEMS.
MODULE 4
COMPUTING PROJECT
INTERNATIONAL PRACTICE PAPERS
The purpose of this paper is to provide practice for the student. It is not intended to be taken as a model for
the examination papers that the student will meet at the end of the course. Although it follows all the rules
stated in the syllabus and has been produced by an experienced examiner, it has not been through the very
rigorous procedures to which a genuine CIE paper is subjected.
Also, the mark scheme is somewhat tight. Normally, the mark scheme would be studied by a team of
examiners, all of whom would suggest changes and additions. The usual result is a mark scheme with far
more marks available.
Despite these shortcomings this paper is recommended to the student as a very good way to prepare for the
examination.
Answer all questions.
Do not use proprietary names for items of hardware or software in your answers.
Q 1. a) Explain the difference between memory and storage in a computer system. [2]
b) Describe the use of a file allocation table in the management of storage. [3]
c) Describe how paging and virtual memory can be used to allow a number of jobs to be stored
in main memory, seemingly simultaneously, in a computer system. [3]
Q 2. a) Explain the purpose of linkers and loaders when preparing a program for running. [4]
b) Describe how errors are found by a compiler during the translation process. [4]
Q 3. Describe the contents of the special registers of a processor during the execution of an arithmetic
instruction. [7]
Q 5. A college stores data about the students in a year group in a linked list in alphabetic order of name.
a) Using the students
KAZ DOPAL NG ANTOINE JUNG
as an example set of students, draw a diagram showing the important features of the list. [5]
b) Describe how the student JUNG can be removed from the list when she leaves the college.
[4]
c) Describe how to create a new list of students who take computing, from the list of students in
the year group. [4]
Q 6. Explain the use of a stack when procedures are called from within a program. [6]
Q 8. b) The decision has been taken to connect up these resources to allow sharing of data. Describe
the hardware necessary for this system giving reasons for your answers. [4]
Q 10. State three reasons why simulation may be important to an application, giving an example of such
an application in each case. [6]
TOTAL (90)
Mark Scheme.
Q 2. a) Linker:
-Links modules once loaded into memory
-Finds routines in library and links with others
-Manages use of parameters to link modules.
Loader:
-Loads individually compiled modules into memory.
-Adjusts memory addresses
(1 per -, max 4) [4]
b) -Each instruction has a reserved word which is checked against a table of reserved words.
-Grammatical rules for reserved word checked against syntax definition.
-Variable names compared with those in table of variables
-Labels are checked for existence
-Positioning of control constructs is checked
(1 per -, max 4) [4]
Q 5. a) [Diagram: the students ANTOINE, DOPAL, JUNG, KAZ, NG stored in a linked list in alphabetic order, with a head-of-list pointer, a pointer in each node to the next, a null value at the end, and the free space shown]
Mark points:
-Head of list
-Alphabetic order
-Pointers
-Null value
-Free space [5]
b) (Can be shown on a diagram)
-Jung is found
-Previous pointer (from Jung to Dopal) is changed to point at Kaz
-Space used by Jung is added to free list...
-by changing head of list pointer to point at Jung and...
-Jung points to previous head of free list.
(1 per -, max 4) [4]
c) -Set up new value in head of lists table
-Read each person in turn from year group list
-If computing student...
-set pointer to that student...
-put null pointer for new list for that student.
Note use of two sets of pointers, may be more.
(1 per -, max 4) [4]
[Diagram answer: syntax diagram defining PASSWORD in terms of LETTER and DIGIT] [2]
c) (i) <PASSWORD>::= <LETTERS><NUMBER>|<LETTERS> [2]
(ii) <NUMBER>::= <NZD>|<NUMBER><DIGIT> [2]
Q 8. a) -Communication only possible if communication format understood by both.
-Means of communication must follow rules...
-which must be the same in order to be able to be understood
-Mark for example (ASCII, MPEG, ISO OSI...) [4]
b) -Repeater can be used because of distances
-Bridge will be able to control access to the admin system to those who are entitled
-Router to send data to correct branch of network
-Hubs/switches to control data flow on each branch of network
-Simple baseband coaxial (or explain) for each network
-Optic fibre (or other sensible) to connect networks.
(1 per - max 4) [4]
Q 9. (i) -Old and new systems run side by side until sure new system works
-where the integrity of the data is crucial to the application.
(ii) -Some of system altered while remainder stays on old system
-Application with a large amount of training to be carried out.
(iii) -Old system dispensed with at same time as new system is started
-Where cost/time is essential. [6]
Q 10. -Time, there is not enough time to do the experiment for real
-changes occurring to a population over many generations
-The cost would be too great otherwise...
-testing a design of an aircraft wing to see if it gives enough lift.
-Dangerous in real life...
-what will happen if a nuclear reactor explodes.
-Impossible otherwise...
-what is it like to fly through the rings of Saturn.
(These are example applications. 1 per -, max 6) [6]
TOTAL (90)
MODULE 3
Systems software mechanisms, Machine architecture, Database theory,
Programming paradigms and Integrated information systems
Originally, if a program needed input, the program would have to contain the code to do this, similarly if
output were required. This led to duplication of code so the idea of an OS was born. The OS contained the
necessary input and output functions that could be called by an application. Similarly, disk input and output
routines were incorporated into the OS. This led to the creation of subroutines to do these simple tasks such
as read a character from a keyboard or send a character to a printer. The joining together of all these basic
input and output routines led to the input-output control system (IOCS). Originally, the IOCS could only
read a punched card or send data to a card punch. However, as new input and output media, such as
magnetic tape and disk, were developed the IOCS became more complex.
Another complication was added when assembly and high-level languages were developed, as the machine
did not use these languages. Machines use binary codes for very simple instructions. With the development
of these new types of programming language a computer would have to translate the program into its own
machine code before the program could be run.
For a user to organise all this had now become too complex. Also, as the processor could work much faster
than the manual operator and the input and output devices, much time was wasted.
Further, to make full use of the processor, more than one program should be stored in memory and the
processor should give time to each of the programs. Suppose two programs are stored in memory and, if
one is using an input or output device (both very slow compared to the processor), it makes sense for the
other program to use the processor. In fact this can be extended to more than two programs as shown in Fig.
3.1.a.1.
The OS must now manage the memory so that all three programs shown in Fig. 3.1.a.1 are kept separate as
well as any data that they use. It must also schedule the jobs in the sequence that makes best use of the
processor.
[Figure: timeline showing programs A, B and C each alternating between using the processor and waiting for I/O, with the periods during which the processor is idle]
Fig. 3.1.a.1
The I/O phase should not hold up the processor too much which can easily happen if the I/O devices are
very slow, like a keyboard or printer. This can be overcome by using Simultaneous Peripheral Operations
On-Line (spooling). The idea is to store all input and output on a high-speed device such as a disk. Fig.
3.1.a.2 shows how this may be achieved,
[Figure: spooling — a read process supplies input to the application program from disk and a write process sends its output to disk]
Fig. 3.1.a.2
Another problem is that programs may not be loaded into the same memory locations each time they are
loaded. For example, suppose that three programs are loaded in the order A, B, C on one occasion and in
the order C, A, B on another occasion. The results are shown in Fig. 3.1.a.3.
[Figure: two memory maps — on one occasion the OS is followed by programs A, B, C and free space; on the other occasion by programs C, A, B and free space]
Fig. 3.1.a.3
A further problem occurs if two or more users wish to use the same program at the same time. For example,
suppose user X and user Y both wish to use a compiler for C++ at the same time. Clearly it is a waste of
memory if two copies of the compiler have to be loaded into main memory at the same time. It would make
much more sense if user X's program and user Y's program are stored in main memory together with a single
copy of the compiler as shown in Fig. 3.1.a.4.
[Figure: memory map containing the OS, user X's program and data, user Y's program and data, a single copy of the compiler, and free space]
Fig. 3.1.a.4
Now the two users can use the compiler in turns and will want to use different parts of the compiler. Also
note that there are two different sets of data for the compiler, user X's program and user Y's program. These
two sets of data and the outputs from the compiler for the two programs must be kept separate. Programs
such as this compiler, working in the way described, are called re-entrant.
Memory management, scheduling and spooling are described in more detail in the following Sections.
Distributed systems have operating systems that arrange for the sharing of the resources of the system by the
users of that system. No action is required from the user. An example would be when a user asks for a
particular resource in a network system, the resource would be allocated by the O.S. in a way that is
transparent to the user.
3.1.b Interrupts
The simplest way of obeying instructions is shown in Fig. 3.1.b.1.
[Figure: flowchart — Start, fetch instruction, execute instruction, repeated until the program ends]
Fig. 3.1.b.1
This is satisfactory so long as nothing goes wrong. Unfortunately things do go wrong and sometimes the
normal order of operation needs to be changed. For example, a user has used up all the time allocated to his
use of the processor. This change in order is instigated by interrupts. The nature of each of these interrupts
is as follows.
I/O interrupt
o Generated by an I/O device to signal a job is complete or an error has occurred. E.g. printer
is out of paper or is not connected.
Timer interrupt
o Generated by an internal clock indicating that the processor must attend to time critical
activities (see scheduling later).
Hardware error
o For example, power failure which indicates that the OS must close down as safely as
possible.
Program interrupt
o Generated due to an error in a program such as violation of memory use (trying to use part of
the memory reserved by the OS for other use) or an attempt to execute an invalid instruction
(such as division by zero).
If the OS is to manage interrupts, the sequence in Fig. 3.1.b.1 needs to be modified as shown in Fig. 3.1.b.2.
[Figure: the flowchart of Fig. 3.1.b.1 with an extra decision after each instruction is executed — 'Is there an interrupt?'; if yes the interrupt is dealt with, if no the next instruction is fetched]
Fig. 3.1.b.2
This diagram shows that, after the execution of an instruction, the OS must see if an interrupt has occurred.
If one has occurred, the OS must service the interrupt if it is more important than the task already being
carried out (see priorities later). This involves obeying a new set of instructions. The real problem is 'how
can the OS arrange for the interrupted program to resume from exactly where it left off?'. In order to do this
the contents of all the registers in the processor must be saved so that the OS can use them to service the
interrupt. Chapter 3.3 explains registers that have to have their contents stored as well as explaining the
fetch/execute cycle in more detail.
Another problem the OS has to deal with happens if an interrupt occurs while another interrupt is being
serviced. There are several ways of dealing with this but the simplest is to place the interrupts in a queue
and only allow return to the originally interrupted program when the queue is empty. Alternative systems
are explained in Section 3.1.c. Taking the simplest case, the order of processing is shown in Fig.3.1.b.3.
[Figure: flowchart in which, each cycle, the question 'Is there an interrupt in the interrupt queue?' is asked; interrupts are serviced until the queue is empty before the next instruction of the original program is fetched]
Fig. 3.1.b.3
The queue of interrupts is the normal first in first out (FIFO) queue and holds indicators to the next interrupt
that needs to be serviced.
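The cycle of Fig. 3.1.b.3 can be sketched in a few lines of Python. This is only an illustration, not part of the syllabus text: the class and method names (Processor, raise_interrupt and so on) are invented for the example, the 'instructions' are just strings, and servicing an interrupt is reduced to printing a message. It shows the order of events described above: execute an instruction, check the FIFO queue, save the register contents, service every waiting interrupt, then restore the registers and resume the original job.

from collections import deque

class Processor:
    """Sketch of the fetch-execute cycle with a FIFO interrupt queue."""

    def __init__(self, program):
        self.program = program          # list of 'instructions' (here, just strings)
        self.pc = 0                     # program counter
        self.registers = {"acc": 0}     # stand-in for the special registers
        self.interrupt_queue = deque()  # FIFO queue of pending interrupts

    def raise_interrupt(self, source):
        self.interrupt_queue.append(source)

    def run(self):
        while self.pc < len(self.program):
            instruction = self.program[self.pc]            # fetch
            self.pc += 1
            print("executing", instruction)                # execute (simplified)
            if self.interrupt_queue:                       # any interrupts outstanding?
                saved = (self.pc, dict(self.registers))    # save the register contents
                while self.interrupt_queue:                # service until the queue is empty
                    source = self.interrupt_queue.popleft()
                    print("servicing interrupt from", source)
                self.pc, self.registers = saved            # restore and resume the original job

cpu = Processor(["LOAD A", "ADD B", "STORE C"])
cpu.raise_interrupt("printer")
cpu.run()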
3.1.c Scheduling
One of the tasks of the OS is to arrange the jobs that need to be done into an appropriate order. The order
may be chosen to ensure that maximum use is made of the processor; another order may make one job more
important than another. In the latter case the OS makes use of priorities.
Suppose the processor is required by program A, which is printing wage slips for the employees of a large
company, and by program B, which is analysing the annual, world-wide sales of the company which has a
turnover of many millions of pounds.
Program A makes little use of the processor and is said to be I/O bound. Program B makes a great deal of
use of the processor and is said to be processor bound.
If program B has priority over program A for use of the processor, it could be a long time before program A
can print any wage slips. This is shown in Fig. 3.1.c.1.
[Figure: timeline showing program B occupying the processor for long periods while program A waits to use it between its I/O transfers]
Fig. 3.1.c.1
Fig. 3.1.c.2 shows what happens if A is given priority over B for use of the processor. This shows that the
I/O bound program can still run in a reasonable time and much better throughput is achieved.
Fig. 3.1.c.2
To achieve these objectives some criteria are needed in order to determine the order in which jobs are
executed. The following is a list of criteria, which may be used to determine a schedule, which will achieve
the above objectives.
Priority. Give some jobs a greater priority than others when deciding which job should be given
access to the processor.
I/O or processor bound. If a processor bound job is given the main access to the processor it could
prevent the I/O devices being serviced efficiently.
Type of job. Batch processing, on-line and real-time jobs all require different response times.
Resource requirements. The amount of time needed to complete the job, the memory required, I/O
and processor time.
Resources used so far. The amount of processor time used so far, how much I/O used so far.
Waiting time. The time the job has been waiting to use the system.
In order to understand how scheduling is accomplished it is important to realise that any job may be in one,
and only one, of three states. A job may be ready to start, running on the system or blocked because it is
waiting for a peripheral, for example. Fig. 3.1.c.3 shows how jobs may be moved from one state to another.
Note that a job can only enter the running state from the ready state. The ready and blocked states are
queues that may hold several jobs. On a standard single processor computer only one job can be in the
running state. Also, all jobs entering the system normally enter via the ready state and (normally) only leave
the system from the running state.
[Figure: job states — jobs enter the system into the READY queue, move from READY to RUNNING, may move from RUNNING to BLOCKED and back to READY, and leave the system from the RUNNING state]
Fig. 3.1.c.3
When entering the system a job is placed in the ready queue by the High Level Scheduler (HLS). The HLS
makes sure that the system is not over loaded.
Sometimes it is necessary to swap jobs between the main memory and backing store (see Memory
Management in Section 3.1.d). This is done by the Medium Level Scheduler (MLS).
Moving jobs in and out of the ready state is done by the Low Level Scheduler (LLS). The LLS decides the
order in which jobs are to be placed in the running state. There are many policies that may be used to do
scheduling, but they can all be placed in one of two classes. These are pre-emptive and non-pre-emptive
policies.
A pre-emptive scheme allows the LLS to remove a job from the running state so that another job can be
placed in the running state. In a non-pre-emptive scheme each job runs until it no longer requires the
processor. This may be because it has finished or because it needs an I/O device.
FCFS (first come first served)
o simply means that the first job to enter the ready queue is the first to enter the running state.
This favours long jobs.
SJF (shortest job first)
o simply means sort jobs in the ready queue in ascending order of times expected to be needed
by each job. New jobs are added to the queue in such a way as to preserve this order.
RR (round robin)
o this gives each job a maximum length of processor time (called a time slice) after which the
job is put at the back of the ready queue and the job at the front of the queue is given use of
the processor. If a job is completed before the maximum time is up it leaves the system.
SRT (shortest remaining time)
o the ready queue is sorted on the amount of expected time still required by a job. This scheme
favours short jobs even more than SJF. Also there is a danger of long jobs being prevented
from running.
MFQ (multi-level feedback queues)
o involves several queues of different priorities with jobs migrating downwards.
There are other ways of allocating priorities. Safety critical jobs will be given very high priority, on-line and
real time applications will also have to have high priorities. For example, a computer monitoring the
temperature and pressure in a chemical process whilst analysing results of readings taken over a period of
time must give high priority to the control program. If the temperature or pressure goes out of a pre-
defined range, the control program must take over immediately. Similarly, if a bank's computer is printing
bank statements over night and someone wishes to use a cash point, the cash point job must take priority.
This scheme is shown in Fig. 3.1.c.4; this shows that queues are needed for jobs with the same priority.
Fig. 3.1.c.4
In this scheme, any job can only be given use of the processor if all the jobs at higher levels have been
completed. Also, if a job enters a queue that has a higher priority than the queue from which the running
program has come, the running program is placed back in the queue from which it came and the job that has
entered the higher priority queue is placed in the running state.
Multi-level feedback queues work in a similar way except that each job is given a maximum length of
processor time. When this time is up, and the job is not completely finished, the job is placed in the queue
which has the next lower priority level. At the lowest level, instead of a first in first out queue a round robin
system is used. This is shown in Fig. 3.1.c.5.
Fig. 3.1.c.5
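As a rough illustration of the round robin policy described above, the sketch below (in Python; the function name, the jobs and the time slice values are all invented for the example, not taken from the text) runs each job for at most one time slice, returns unfinished jobs to the back of the ready queue and lets completed jobs leave the system.

from collections import deque

def round_robin(jobs, time_slice):
    """Round robin sketch: jobs maps a job name to its remaining processing time."""
    ready = deque(jobs.items())
    order = []                                   # record of (job, time used) for illustration
    while ready:
        name, remaining = ready.popleft()        # job at the front enters the running state
        run_for = min(time_slice, remaining)
        order.append((name, run_for))
        remaining -= run_for
        if remaining > 0:                        # time slice used up but job not finished:
            ready.append((name, remaining))      # back to the end of the ready queue
        # otherwise the job is complete and leaves the system
    return order

print(round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2))

A multi-level feedback queue could be built from several such queues of different priorities, demoting a job to the next queue down each time its time slice expires.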
3.1.d Memory Management
In order for a job to be able to use the processor the job must be stored in the computer's main memory. If
there are several jobs to be stored, they, and their data, must be protected from the actions of other jobs.
Suppose jobs A, B, C and D require 50k, 20k, 10k and 30k of memory respectively and the computer has a
total of 130k available for jobs. (Remember the OS will require some memory.) Fig. 3.1.d.1 shows one
possible arrangement of the jobs.
[Figure: memory map — the OS at the bottom, then Job A (50k), Job B (20k), Job C (10k), Job D (30k) and 20k free, making up the 130k available for jobs]
Fig 3.1.d.1
Now suppose job C terminates and job E, requiring 25k of memory, is next in the ready queue. Clearly job
E cannot be loaded into the space that job C has relinquished. However, there is 20k + 10 k = 30k of
memory free in total. So the OS must find some way of using it. One solution to the problem would be to
move job D up to job B. This would make heavy use of the processor as not only must all the instructions
be moved but all addresses used in the instructions would have to be recalculated.
When jobs are loaded into memory, they may not always occupy the same locations. Supposing, instead of
jobs A, B, C and D being needed and loaded in that order, it is required to load jobs A, B, D and E in that
order. Now job D occupies different locations in memory to those shown above. So again there is a
problem of using different addresses.
The OS has the task of both loading the jobs and adjusting the addresses. The loader does both these tasks.
The calculation of addresses can be done by recalculating each address used in the instructions once the
address of the first instruction is known. Alternatively, relative addressing can be used. That is, addresses
are specified relative to the first instruction.
This system is known as variable partitioning with compaction. This is because the size of the segments
varies according to the sizes of the jobs and the 'holes' are removed by compacting the jobs.
An alternative method is to divide both the memory and the jobs into fixed size units called pages. As an
example, suppose jobs A, B, C, D and E consist of 6, 4, 1, 3 and 2 pages respectively. Also suppose that the
available memory for jobs consists of 12 pages and jobs A, B and C have been loaded into memory as
shown in Fig. 3.1.d.2.
Fig. 3.1.d.2
Now suppose job B terminates, releasing four pages, and jobs D and E are ready to be loaded. Clearly we
have a similar problem to that caused by variable partitioning. The 'hole' consists of four pages into which
job D (three pages) will fit, leaving one page plus the original one page of free memory. E consists of two
pages, so there is enough memory for E but the pages are not contiguous and we have the situation shown in
Fig. 3.1.d.3.
[Figure: Job E's two pages (E1 and E2) waiting to be loaded; memory, from the bottom, holds A1 to A6, D1 to D3, a free page, C1 and another free page, so the two free pages are not contiguous]
Fig. 3.1.d.3
The big difference between partitioning and paging is that jobs do not have to occupy contiguous pages.
Thus the solution is shown in Fig. 3.1.d.4.
[Figure: memory after loading Job E — from the bottom: A1 to A6, D1 to D3, E1, C1, E2; job E has been split across the two non-contiguous free pages]
Fig. 3.1.d.4
The problem with paging is again address allocation. This can be overcome by keeping a table that shows
which memory pages are used for the job pages. Then, if each address used in a job consists of a page
number and the distance the required location is from the start of the page, a suitable conversion is possible.
Suppose, in job A, an instruction refers to a location that is on page 5 and is 46 locations from the start of
page 5. This may be represented by the pair (5, 46). If the page table shows that page 5 of job A is held in
memory page 8, the address (5, 46) becomes (8, 46).
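A minimal sketch of this conversion, assuming the page table is simply a mapping from job page numbers to memory page numbers, is given below. Only the entry 5 -> 8 comes from the example above; the other entries, and the names used, are hypothetical.

def map_address(page_table, job_address):
    """Translate (job page, displacement) into (memory page, displacement)."""
    page, displacement = job_address
    return (page_table[page], displacement)

# hypothetical page table for job A: job page -> memory page
page_table_A = {1: 3, 2: 4, 3: 6, 4: 7, 5: 8, 6: 9}
print(map_address(page_table_A, (5, 46)))   # (8, 46)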
Paging uses fixed length blocks of memory. An alternative is to use variable length blocks. This method is
called segmentation. In segmentation, programmers divide jobs into segments, possibly of different sizes.
Usually, the segments would consist of data, or sub-routines or groups of related sub-routines.
Since segments may be of different lengths, address calculation has to be carefully checked. The segment
table must not only contain the start position of each segment but also the size of each segment. This is
needed to ensure that an address does not go out of range. Fig. 3.1.d.5 shows how two jobs may be stored in
memory. In this case the programmer split Job A into 4 segments and Job B into 3 segments. These two
jobs, when loaded into memory, took up the positions shown in the Figure.
[Figure: Job A divided into segments A1 to A4 and Job B into segments B1 to B3; in memory the seven segments are stored interleaved rather than in job order, with free space at the top]
Fig. 3.1.d.5
Now suppose that an instruction specifies an address as segment 3, displacement (from start of segment)
132. The OS will look up, in the process segment table, the base address (in memory) of segment 3. The
OS checks that the displacement is not greater than the segment size. If it is, an error is reported. Otherwise
the displacement is added to the base address to produce the actual address in memory to be used. The
algorithm for this process is
Segment Table
+
Seg. No. Seg. Size Base Address
1
2
3 1500 3500
4
Fig. 3.1.d.6
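The algorithm itself is not reproduced in the figure, but a minimal sketch of the check-and-add process described above might look as follows. The segment table here holds only the entry shown in Fig. 3.1.d.6 (segment 3, size 1500, base address 3500); everything else, including the function name, is invented for the illustration.

# hypothetical segment table: segment number -> (size, base address)
segment_table = {3: (1500, 3500)}

def physical_address(segment, displacement):
    """Look up the segment, check the displacement is in range, add it to the base."""
    size, base = segment_table[segment]
    if displacement > size:                 # error if the displacement is greater than the segment size
        raise ValueError("displacement out of range - addressing error")
    return base + displacement

print(physical_address(3, 132))   # 3500 + 132 = 3632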
Paging and segmentation lead to another important technique called virtual memory. We have seen that jobs
can be loaded into memory when they are needed using a paging technique. When a program is running,
only those pages that contain code that is needed need be loaded. For example, suppose a word processor
has been written with page 1 containing the instructions to allow users to enter text and to alter the text.
Also suppose that page 2 contains the code for formatting characters, page 3 the code for formatting
paragraphs and page 4 contains the code for cutting, copying and pasting. To run this word processor only
page 1 needs to be loaded initially. If the user then wants to format some characters so that they are in bold,
then page 2 will have to be loaded. Similarly, if the user wishes to copy and paste a piece of text, page 4
will have to be loaded. When other facilities are needed, the appropriate page can be loaded. If the user
now wishes to format some more characters, the OS does not need to load page 2 again as it is already
loaded.
Now, what happens if there is insufficient space for the new page to be loaded? As only the page containing
active instructions need to be loaded, the new page can overwrite a page that is not currently being used.
For example, suppose the user wishes to use paragraph formatting; then the OS can load page 3 into the
memory currently occupied by page 2. Clearly, this means that programs can be written and used that are
larger than the available memory.
There must be some system that decides which pages to overwrite. There are many systems such as
overwrite the page that has not been used for the longest period of time, replace the page that has not
recently been used or the first in first out method. All of these create overheads for the OS.
To further complicate matters not every page can be overwritten. Some pages contain a job's data that will
change during the running of a program. To keep track of this the OS keeps a flag for each page that can be
initially set to zero. If the content of the page changes, the flag can be set to 1. Now, before overwriting a
page, the OS can see if that page has been changed. If it has, then the OS will save the page before loading a
new page in its place. The OS now has to both load and save pages. If the memory is very full, this loading
and saving can use up a great deal of time and can mean that most of the processor's time is involved in
swapping pages. This situation is called disk thrashing.
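The ideas of overwriting the page that has not been used for the longest time and of saving a changed page before overwriting it can be combined in a short sketch. The class below is illustrative only; the frame count, page names and the use of Python's OrderedDict are choices made for the example, not anything specified in the text.

from collections import OrderedDict

class PageFrameTable:
    """Least-recently-used page replacement with a 'changed' (dirty) flag."""

    def __init__(self, frames):
        self.frames = frames
        self.pages = OrderedDict()   # page -> changed flag, least recently used first

    def access(self, page, writes=False):
        if page in self.pages:
            self.pages.move_to_end(page)                     # mark as recently used
        else:
            if len(self.pages) >= self.frames:               # memory full: choose a victim
                victim, changed = self.pages.popitem(last=False)   # least recently used page
                if changed:
                    print("saving changed page", victim, "before overwriting")
                else:
                    print("overwriting unchanged page", victim)
            self.pages[page] = False                         # load the new page, flag cleared
        if writes:
            self.pages[page] = True                          # the page contents have changed

memory = PageFrameTable(frames=2)
memory.access("page 1")
memory.access("page 2", writes=True)
memory.access("page 3")    # page 1 is least recently used and unchanged, so it is overwritten
memory.access("page 4")    # page 2 is now least recently used but has changed, so it is saved first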
Systems can use both multi-programming and virtual memory. Also, virtual memory can use segmentation
as well as paging although this can become very complex.
3.1.e Spooling
Spooling was mentioned in Section 3.1.a and is used to place input and output on a fast access device, such
as disk, so that slow peripheral devices do not hold up the processor. In a multi-programming, multi-access
or network system, several jobs may wish to use the peripheral devices at the same time. It is essential that
the input and output for different jobs do not become mixed up. This can be achieved by using
Simultaneous Peripheral Operations On-Line (spooling).
Suppose two jobs, in a network system, are producing output that is to go to a single printer. The output is
being produced in sections and must be kept separate for each job. Opening two files on a disk, one for each
job, can do this. Suppose we call these files File1 and File2. As the files are on disk, job 1 can write to
File1 whenever it wishes and job 2 can write to File2. When the output from a job is finished, the name (and
other details) of the file can be placed in a queue. This means that the OS now can send the output to the
printer in the order in which the file details enter the queue. As the name of a file does not enter the queue
until all output from the job to the corresponding file is complete, the output from different jobs is kept
separate.
Spooling can be used for any number of jobs. It is important to realise that the output itself is not placed in
the queue. The queue simply contains the details of the files that need to be printed so that the OS sends the
contents of the files to the printer only when the file is complete. The part of the OS that handles this task is
called the spooler or print spooler.
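A very small sketch of this queue is given below. It is illustrative only: the class name and file names are invented, and 'printing' is reduced to a print statement. The point it makes is that only the details of a completed file enter the FIFO queue, so the output of different jobs stays separate.

from collections import deque

class PrintSpooler:
    """Queue of completed output files awaiting printing."""

    def __init__(self):
        self.queue = deque()         # FIFO queue holding file details, not the output itself

    def output_complete(self, filename):
        self.queue.append(filename)  # details enter the queue only when the job's output is finished

    def print_next(self):
        if self.queue:
            filename = self.queue.popleft()
            print("sending contents of", filename, "to the printer")

spooler = PrintSpooler()
spooler.output_complete("File1")     # job 1 has finished writing File1
spooler.output_complete("File2")     # job 2 has finished writing File2
spooler.print_next()                 # File1 is printed first; the outputs never mix
spooler.print_next()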
It should be noted that spooling not only keeps output from different jobs separate, it also saves the user
having to wait for the processor until the output is actually printed by a printer (a relatively slow device).
Spooling is used on personal computers as well as large computer systems capable of multi-programming.
All OSs for PCs allow the user to copy, delete and move files as well as letting the user create a
hierarchical structure for storing files. They also allow the user to check the disk and tidy up the files on the
disk.
However, Windows allows the user to use much more memory than MS-DOS and it allows multi-tasking.
This is when the user opens more than one program at a time and can move from one to another. Try
opening a word processor and the clipboard in Windows at the same time. Adjust the sizes of the windows
so that you can see both at the same time. Now mark a piece of text and copy it to the clipboard. You will
see the text appear in the clipboard window although it is not the active window. This is because the OS can
handle both tasks apparently at the same time. In fact the OS is swapping between the tasks so fast that the
user is not aware of the swapping.
Another good example of multi-tasking is to run the clock program while using another program. You will
see that the clock is keeping time although you are using another piece of software. Try playing a CD while
writing a report!
The OS not only offers the user certain facilities, it also provides application software with I/O facilities. In
this Section you will see how an OS is loaded and how it controls the PC.
This section, printed with a shaded background, is not required by the CIE Computing Specification, but
may be interesting and useful for understanding how the system works.
When a PC is switched on, it contains only a very few instructions. The first step the computer does is to
run the power-on-self-test (POST) routine that resides in permanent memory. The POST routine clears the
registers in the CPU and loads the address of the first instruction in the boot program into the program
counter. This boot program is stored in read-only memory (ROM) and contains the basic input/output
system (BIOS).
Control is now passed to the boot program which first checks itself and the POST program. The CPU then
sends signals to check that all the hardware is working properly. This includes checking the buses, systems
clock, RAM, disk drives and keyboard. If any of these devices, such as the hard disk, contain their own
BIOS, this is incorporated with the system's BIOS. Often the BIOS is copied from a slow CMOS BIOS chip
to the faster RAM chips.
The PC is now ready to load the OS. The boot program first checks drive A to see if a disk is present. If
one is present, it looks for an OS on the disk. If no OS is found, an error message is produced. If there is no
disk in drive A, the boot program looks for an OS on disk C. Once found, the boot program looks, in the
case of Windows systems, for the files IO.SYS and MSDOS.SYS. Once the files are found, the boot
program loads the boot record, about 512 bytes, which then loads IO.SYS. IO.SYS holds extensions to the
ROM BIOS and contains a routine called SYSINIT. SYSINIT controls the rest of the boot procedure.
SYSINIT now takes control and loads MSDOS.SYS which works with the BIOS to manage files and
execute programs.
The OS searches the root directory for a boot file such as CONFIG.SYS which tells the OS how many files
may be opened at the same time. It may also contain instructions to load various device drivers. The OS
tells MSDOS.SYS to load a file called COMMAND.COM. This OS file is in three parts. The first part is a
further extension to the I/O functions and it joins the BIOS to become part of the OS. The second part
contains resident OS commands, such as DIR and COPY.
The files CONFIG.SYS and AUTOEXEC.BAT are created by the user so that the PC starts up in the same
configuration each time it is switched on.
The OS supplies the user, and applications programs, with facilities to handle input and output, copy and
move files, handle memory allocation and any other basic tasks.
In the case of Windows, the operating system loads into different parts of memory. The OS then guarantees
the use of a block of memory to an application program and protects this memory from being accessed by
another application program. If an application program needs to use a particular piece of hardware,
Windows will load the appropriate device driver. Windows also uses virtual memory if an application has
not been allocated sufficient main memory.
As mentioned above, Windows allows multi-tasking; that is, the running of several applications at the same
time. To do this, Windows uses the memory management techniques described in Section 3.1.d. In order to
multi-task, Windows gives each application a very short period of time, called a time-slice. When a time-
slice is up, an interrupt occurs and Windows passes control to the next application. In order to do this, the
OS has to save the contents of the CPU registers at the end of a time-slice and load the registers with the
values needed by the next application. Control is then passed to the next application. This is continued so
that all the applications have use of the processor in turn. If an application needs to use a hardware device,
Windows checks to see if that device is available. If it is, the application is given the use of that device. If
not, the request is placed in a queue. In the case of a slow peripheral such as a printer, Windows saves the
output to the hard disk first and then does the printing in the background so that the user can continue to use
the application. If further printing is needed before other printing is completed, then spooling is used as
described in Section 3.1.e.
Any OS has to be able to find files on a disk and to store users' files. To do this, the OS uses the
File Allocation Table (FAT). This table uses a linked list to point to the blocks on the disk that contain files.
In order to do this the OS has a routine that will format a disk. This simply means dividing the disk radially
into sectors and into concentric circles called tracks. Two or more sectors on a single track make up a
cluster. This is shown in Fig. 3.1.f.1.
[Figure: a disk surface divided radially into sectors and into concentric circles called tracks, with one cluster shown occupying 3 sectors]
Fig 3.1.f.1
A typical FAT table is shown in Fig 3.1.f.2. The first column gives the cluster number and the second
column is a pointer to the next cluster used to store a file. The last cluster used has a null pointer (usually
FFFFH) to indicate the end of the linking. The directory entry for a file has a pointer to the first cluster in the
FAT table. The diagram shows details of two files stored on a disk.
Cluster   Pointer
   0      FFFD
   1      FFFF
   2      3        <- pointed to by the directory entry for File 1
   3      5
   4      6        <- pointed to by the directory entry for File 2
   5      8
   6      7
   7      10
   8      9
   9      FFFF     (the end of File 1 is in cluster 9)
  10      11
  11      FFFF     (the end of File 2 is in cluster 11)
Fig. 3.1.f.2
In order to find a file, the OS looks in the directory for the filename and, if it finds it, the OS gets the cluster
number for the start of the file. The OS can then follow the pointers in the FAT to find the rest of the file.
In this table any unused clusters have a zero entry. Thus, when a file is deleted, the clusters that were used
to save the file can be set to zero. In order to store a new file, all the OS has to do is to find the first cluster
with a zero entry and to enter the cluster number in the directory. Now the OS only has to linearly search
for clusters with zero entries to set up the linked list.
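Following the pointers in Fig. 3.1.f.2 can be sketched as below. The FAT contents are taken from the figure; the start clusters 2 and 4 are inferred from the chains that end in clusters 9 and 11, and the function names and the value used for a free cluster are choices made for the illustration.

END_OF_FILE = 0xFFFF    # null pointer marking the last cluster of a file
FREE = 0                # unused clusters have a zero entry

# FAT contents from Fig. 3.1.f.2 (index = cluster number, value = pointer)
fat = [0xFFFD, 0xFFFF, 3, 5, 6, 8, 7, 10, 9, 0xFFFF, 11, 0xFFFF]

def clusters_of(first_cluster):
    """Follow the linked list of pointers in the FAT to find every cluster of a file."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_FILE:
        chain.append(cluster)
        cluster = fat[cluster]
    return chain

def first_free():
    """Linear search for the first cluster with a zero entry, as described above."""
    return fat.index(FREE) if FREE in fat else None

print(clusters_of(2))   # File 1: [2, 3, 5, 8, 9]
print(clusters_of(4))   # File 2: [4, 6, 7, 10, 11]
print(first_free())     # None - there are no free clusters in this example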
It may appear that using linear searches will take a long time. However, the FAT table is normally loaded
into RAM so that continual disk accesses can be avoided. This will speed up the search of the FAT.
Note that Windows 95/98 uses virtual FAT (VFAT) which allows files to be saved 32 bits at a time (FAT
uses 16 bits). It also allows file names of up to 255 characters. Windows 98 uses FAT 32 which allows
hard drives greater than 2 Gbytes to be formatted.
The facilities provided by a NOS depend on the size and type of network. For example, in a peer-to-peer
network all the stations on the network have equal status. In this system one station may act as a file server
and another as a print server. At the same time, all the stations are clients. A client is a computer that can
be used by users of the network. A peer-to-peer network has little security so the NOS only has to handle
communications,
file sharing,
printing.
If a network contains one or more servers, the NOS has to manage
file sharing,
file security,
accounting,
software sharing,
hardware sharing (including print spooling),
communications,
the user interface.
File sharing allows many users to use the same file at the same time. In order to avoid corruption and
inconsistency, the NOS must only allow one user write access to the file; other users must only be allowed
read access. Also, the NOS must only give users permission to use files for which they have access rights; that is, it must
prevent unauthorised access to data. It is important that users do not change system files (files that are
needed by the NOS). It is common practice for the NOS to not only make these files read only, but to hide
them from the users. If a user looks at the disk to see what files are present, these hidden files will not
appear. To prevent users changing read only files to read write files, and to prevent users showing hidden
files, the NOS does not allow ordinary users to change these attributes.
To ensure the security of data, the network manager gives users access rights. When users log onto a
network they must enter their user identity and password. The NOS then looks up, in a table, the users'
access rights and only allows them access to those files for which access is permitted. The NOS also keeps
a note of how the users want their desktops to look. This means that when users log on they are always
presented with the same screen. Users are allowed to change how their desktops look and these are stored
by the NOS for future reference.
As many users may use the network and its resources, it may be necessary for the NOS to keep details of
who has used the network, when and for how long and for what purpose. It may also record which files the
user has accessed. This is so that the user can be charged for such things as printing, the amount of time that
the network has been used and storage of files. This part of the NOS may also restrict the users' amount of
storage available, the amount of paper used for printing and so on. From time to time the network manager
can print out details of usage so that charges may be made.
The NOS must share the use of applications such as word processors, spreadsheets and so on. Thus when a
user requests an application, the NOS must send a copy of that application to the user's station.
Several users may well wish to use the same hardware at the same time. This is particularly true of printers.
When a user sends a file for printing, the file is split into packets. As many users may wish to use a printer,
the packets from different users will arrive at the print server and will have to be sorted so that the data from
different users are kept separate. The NOS receives these packets and stores the data in different files for
different users. When a particular file is complete, it can be added to the print queue as described in Section
3.1.e.
The NOS must also ensure that users' files are saved on the server and that they cannot be accessed by other
users. To do this the network manager will allocate each user a fixed amount of disk space and the NOS
will prevent a user exceeding the amount of storage allocated. If a user tries to save work when there is
insufficient space left, the NOS will ask the user to delete some files before the user can save any more. In
order to do this, the server's hard drive may be partitioned into many logical drives. This means that,
although there may be only one hard drive, different parts of it can be treated as though they are different
drives. For example, one part may be called the H drive which is where users are allowed to save their
work. This drive will be divided up into folders, each of which is allocated to a different user. Users only
have access to their own folders but can create sub-folders in their own folders. The NOS must provide this
service as well as preventing users accessing other users' folders. Another part of the drive may be called
(say) the U drive where some users can store files for other users who will be allowed to retrieve, but not
alter, them unless they are saved in the user's own area. The NOS will also only allow access to certain
logical drives by a restricted set of users.
For all the above to work, the NOS will have to handle communications between stations and servers. Thus,
the NOS is in two parts. One part is in each station and the other is in the server(s). These two parts must
be able to communicate with one another so that messages and data can be sent around the network. The
need for rules to ensure that communication is successful was explained in Chapter 1.6 in the AS text.
Finally, the NOS must provide a user interface between the hardware, software and user. This has been
discussed in Section 3.1.f, but a NOS has to offer different users different interfaces. When a user logs onto
a network, the NOS looks up the needs of the user and displays the appropriate icons and menus for that
user, no matter which station the user uses. The NOS must also allow users to define their own interfaces
within the restrictions laid down by the network manager.
It must be remembered that users must not need an understanding of all the tasks undertaken by the NOS.
As far as users are concerned they are using a PC as if it were solely for their own use. The whole system is
said to be transparent to the user. This simply means that users are unaware of the hardware and software
actions. A good user interface has a high level of transparency and this should be true of all operating
systems.
3.1 Example Questions
Q 1. Explain how a processor, which is working on another task, handles an interrupt. [4]
A. - At some point in the cycle/at the end of an instruction
- the processor will check to see if there are any outstanding interrupts.
- If there are, the current job is suspended and…
- the contents of the special registers are stored so that it can be restarted later
- interrupts are then serviced until all have been dealt with…
- control is returned to the original job.
Notes: The handling of interrupts is rather more complex than described here and the manipulation
of the special registers has to be explained in more detail, but the above answer covers all the points
that arise in this section.
Q 2. State three different types of interrupt that may occur and say in what order the three interrupts
would be handled if they all arrived at the processor together, giving a reason for your answer. [5]
A. - I/O interrupt like the printer running out of data to print and wanting the buffer refilling.
- Timer interrupt where the processor is being forced to move onto some other task.
- Program interrupt where the processor is being stopped from carrying out an illegal operation
that has been specified in the program code.
- Hardware interrupt the most serious of which is power failure.
- The order is from the bottom up. The most serious is power failure because no other tasks can
be carried out if there is no power, so the safe powering down of the system must be
paramount.
- Contrast that with the printer wanting more data before it can print any more out. Does it
really matter if the printer has to wait a few more seconds?
Notes: There are far more points to take into account about interrupts. How does the processor decide
which of a number of interrupts is the most important? Is the interrupt more important than the
work it was doing anyway? An interrupt is simply a signal; it is not a piece of program code, so how
does the processor know what to do?
Q 4. Describe two types of scheduling that are used by computer processing systems. [4]
A. The answers are to be found in the notes on scheduling in Section 3.1.c of this chapter.
Q 5. Explain how interrupts are used in a round robin scheduling operating system. [3]
A. - Each job is given a set amount of processor time.
- At the end of the time available for a job, it is interrupted.
- The operating system inspects the queue of jobs still to be processed and
- if it is not empty allocates the next amount of processor time to the job first in the queue.
- The previous job goes to the end of the queue.
- Use of priorities to determine place in the queue.
- Need for priorities to change according to amount of recent processing time they have had.
Q 6. a) Explain the difference between paging and segmenting when referring to memory
management techniques. [2]
b) Describe how virtual memory can allow a word processor and a spreadsheet to run
simultaneously in the memory of a computer even though both pieces of software are too
large to fit into the computer’s memory. [3]
A. a) - They are both methods of dividing up the available memory space into more
manageable pieces.
- Paging involves memory being divided into equal size areas.
- Segmenting involves areas of different size dependent upon the contents needing to be
stored. [2]
Notes: The first mark point does not really answer the question because it does not provide a
difference; however, it does give a context to the rest of the answer and so is worth credit.
b) - The software code is divided up into parts that have some level of equivalence.
- Those most commonly used routines are kept together
- These are loaded into memory and other routines are
- only loaded when the user calls upon them.
- Thus they give the impression of being permanently available. [3]
Notes: A nice question. It is not only a standard bookwork question but it also relates specifically to
an occurrence seen, if not understood, by computer users every day. A question with a little more
detail about the software being used might expect a more detailed answer about the sort of things that
would be in each of the pages, but it would also have more marks available.
Q 7. Describe how a PC operating system uses a file allocation table to find files when necessary.
A. - The disk surface must be divided up into small areas
- Files are stored in these small areas
- Each file will normally use more than one area
- The table of files has an entry pointing to the first area on the disk surface used by that file
and
- a pointer to the next area.
- Each subsequent area has a pointer to the next area, as in a linked list with…
- a null value to signify the end of the file.
Chapter 3.2
The Functions and Purposes of Translators
3.2.a Interpreters and Compilers
When electronic computers were first used, the programs had to be written in machine code. This code
comprised simple instructions, each of which was represented by a binary pattern in the computer.
produce these programs, programmers had to write the instructions in binary. This not only took a long
time, it was also prone to errors. To improve program writing assembly languages were developed.
Assembly languages allowed the use of mnemonics and names for locations in memory. Each assembly
instruction mapped to a single machine instruction which meant that it was fairly easy to translate a program
written in assembly language to machine code. To speed up this translation process, assemblers were
written which could be loaded into the computer and then the computer could translate the assembly
language to machine code. Writing programs in assembly language, although easier than using machine
code, was still tedious and took a long time.
After assembly languages came high-level languages which used the type of language used by the person
writing the program. Thus FORTRAN (FORmula TRANslation) was developed for science and engineering
programs and it used formulae in the same way as would scientists and engineers. COBOL (Common
Business Oriented Language) was developed for business applications. Programs written in these languages
needed to be translated into machine code. This led to the birth of compilers.
A compiler takes a program written in a high-level language and translates it into an equivalent program in
machine code. Once this is done, the machine code version can be loaded into the machine and run without
any further help as it is complete in itself. The high-level language version of the program is usually called
the source code and the resulting machine code program is called the object code. The relationship between
them is shown in Fig. 3.2.a.1.
Fig. 3.2.a.1
The problem with using a compiler is that it uses a lot of computer resources. It has to be loaded in the
computer's memory at the same time as the source code and there has to be sufficient memory to hold the
object code. Further, there has to be sufficient memory for working storage while the translation is taking
place. Another disadvantage is that when an error in a program occurs it is difficult to pin-point its source in
the original program.
An alternative system is to use interpretation. In this system each instruction is taken in turn and translated
to machine code. The instruction is then executed before the next instruction is translated. This system was
developed because early personal computers lacked the power and memory needed for compilation. This
method also has the advantage of producing error messages as soon as an error is encountered. This means
that the instruction causing the problem can be easily identified. Against interpretation is the fact that
execution of a program is slow compared to that of a compiled program. This is because the original
program has to be translated every time it is executed. Also, instructions inside a loop will have to be
translated each time the loop is entered.
However, interpretation is very useful during program development as errors can be found and corrected as
soon as they are encountered. In fact many languages, such as Visual Basic, use both an interpreter and a
compiler. This enables the programmer to use the interpreter during program development and, when the
program is fully working, it can be translated by the compiler into machine code. This machine code
version can then be distributed to users who do not have access to the original code.
Whether a compiler or interpreter is used, the translation from a high-level language to machine code has to
go through various stages and these are shown in Fig. 3.2.a.2.
[Fig. 3.2.a.2: Source Program → Lexical Analysis → Syntax Analysis → Semantic Analysis → Intermediate Language → Code Generation → Code Optimisation → Object Program]
At various stages during compilation it will be necessary to look up details about the names in the symbol
table. This must be done efficiently so a linear search is not sufficiently fast. In fact, it is usual to use a hash
table and to calculate the position of a name by hashing the name itself. When two names are hashed to the
same address, a linked list can be used to avoid the symbol table filling up.
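A sketch of such a symbol table is given below. The hash function, the table size and the use of a Python list as the 'linked list' at each slot are all simplifications invented for the illustration; a real compiler would use something more carefully chosen.

class SymbolTable:
    """Hash table for the compiler's symbol table, with a chain at each slot."""

    def __init__(self, size=11):
        self.size = size
        self.slots = [[] for _ in range(size)]            # one chain per slot

    def _hash(self, name):
        return sum(ord(ch) for ch in name) % self.size    # simple illustrative hash of the name

    def insert(self, name, details):
        self.slots[self._hash(name)].append((name, details))

    def lookup(self, name):
        for entry, details in self.slots[self._hash(name)]:   # search only the one chain
            if entry == name:
                return details
        return None

table = SymbolTable()
table.insert("X", {"type": "integer"})
table.insert("total", {"type": "real"})
print(table.lookup("X"))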
The lexical analyser also removes redundant characters such as white space (spaces, tabs, etc.) and
comments. Often the lexical analysis takes longer than the other stages of compilation. This is because it
has to handle the original source code, which can have many formats. For example, the following two
pieces of code are equivalent although their format is considerably different.
IF X = Y THEN    'square X
    Z := X * X
ELSE             'square Y
    Z := Y * Y
ENDIF
PRINT Z

and, written on fewer lines,

IF X = Y THEN Z := X * X ELSE Z := Y * Y
PRINT Z
When the lexical analyser has completed its task, the code will be in a standard format which means that the
syntax analyser can always expect the format of its input to be the same. Consider the instructions
X := 54
RETURN X
the lexical analyser will turn this into paired tokens. The first part describes the token, if necessary, and the
second part defines what the token represents. Thus the above input is turned into
Variable X
Assignment symbol
Integer 54
Return symbol
Variable X
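A toy lexical analyser that produces pairs like those above might look as follows. The reserved word table, the token names and the regular expression are invented for this sketch and cover only the tiny fragment of language used in the example.

import re

RESERVED = {"RETURN", "IF", "THEN", "ELSE", "ENDIF", "PRINT"}   # hypothetical reserved word table
TOKEN_PATTERN = re.compile(r"\s*(?::=|\d+|\w+)")

def lexical_analysis(source):
    """Turn source text into (token type, value) pairs; white space is discarded."""
    tokens = []
    for lexeme in TOKEN_PATTERN.findall(source):
        lexeme = lexeme.strip()
        if lexeme == ":=":
            tokens.append(("assignment symbol", lexeme))
        elif lexeme.isdigit():
            tokens.append(("integer", int(lexeme)))
        elif lexeme.upper() in RESERVED:
            tokens.append(("reserved word", lexeme.upper()))
        else:
            tokens.append(("variable", lexeme))
    return tokens

print(lexical_analysis("X := 54\nRETURN X"))
# [('variable', 'X'), ('assignment symbol', ':='), ('integer', 54),
#  ('reserved word', 'RETURN'), ('variable', 'X')]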
During this stage of compilation the code generated by the lexical analyser is parsed (broken into small
units) to check that it is grammatically correct. All languages have rules of grammar and computer
languages are no exception. The grammar of programming languages is defined by means of BNF notation
or syntax diagrams. It is against these rules that the code has to be checked.
For example, taking a very elementary language, an assignment statement may be defined to be of the form
<assignment statement> ::= <variable> <assignment_operator> <expression>
where <expression> is defined by a further rule of the grammar, and the parser must take the output from the
lexical analyser and check that it is of this form.
If the statement is one such as
X := 54
its token string becomes
<variable> <assignment_operator> <expression>
and then
<assignment statement>
which is valid. If the tokens cannot be reduced to <assignment statement> in this way, the input does not
represent a valid statement, hence an error message will be returned.
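Continuing the sketch begun under lexical analysis, a parser for this one rule could be as simple as the following. The definition of <expression> used here (a single variable or integer) is an assumption made to keep the example short; it is not the grammar of any real language.

def parse_assignment(tokens):
    """Check that a token string has the form <variable> <assignment_operator> <expression>."""
    if (len(tokens) == 3
            and tokens[0][0] == "variable"
            and tokens[1][0] == "assignment symbol"
            and tokens[2][0] in ("variable", "integer")):   # tiny <expression> rule, assumed
        return "valid <assignment statement>"
    return "error: does not represent a valid statement"

print(parse_assignment([("variable", "X"), ("assignment symbol", ":="), ("integer", 54)]))
print(parse_assignment([("integer", 54), ("assignment symbol", ":="), ("variable", "X")]))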
It is at this stage that invalid names can be found such as PRIT instead of PRINT as PRIT will be read as a
variable name instead of a reserved word. This will mean that the statement containing PRIT will not parse
to a valid statement. Note that in languages that require variables to be declared before being used, the
lexical analyser may pick up this error because PRIT has not been declared as a variable and so is not in the
symbol table.
Most compilers will report errors found during syntax analysis as soon as they are found and will attempt to
show where the error has occurred. However, they may not be very accurate in their conclusions nor may
the error message be very clear.
During syntax analysis certain semantic checks are carried out. These include label checks, flow of control
checks and declaration checks.
Some languages allow GOTO statements (not recommended by the authors) which allow control to be
passed, unconditionally, to another statement which has a label. The GOTO statement specifies the label to
which the control must pass. The compiler must check that such a label exists.
Certain control constructs can only be placed in certain parts of a program. For example in C (and C++) the
CONTINUE statement can only be placed inside a loop and the BREAK statement can only be placed inside
a loop or SWITCH statement. The compiler must ensure that statements like these are used in the correct
place.
Many languages insist on the programmer declaring variables and their types. It is at this stage that the
compiler verifies that all variables have been properly declared and that they are used correctly.
During lexical and syntax analysis a table of variables has been built up which includes details of the
variable name, its type and the block in which it is valid. The address of the variable is now calculated and
stored in the symbol table. This is done as soon as the variable is encountered during code generation.
Before the final code can be generated, an intermediate code is produced. This intermediate code can then
be interpreted or translated into machine code. In the latter case, the code can be saved and distributed as an
executable program. Two methods can be used to represent the high-level language in machine code. One
uses a tree structure and the other a three-address code (TAC). TAC allows no more than three operands
and instructions take the form
Operand1 := Operand2 Operator Operand3
For example, the statement
A := (B + C) * (D – E) / F
is translated into the sequence
R1 := B + C
R2 := D – E
R3 := R1 * R2
A := R3 / F
[Figure: the same statement represented as a tree — ':=' at the root with children A and '/'; '/' has children '*' and F; '*' has children '+' and '-'; '+' has children B and C, and '-' has children D and E]
Fig. 3.2.d.1
Other statements can be represented in similar ways and the final stage of compilation can then take place.
The compiler has to consider, at this point, the type of code that is required. Code can be optimised for
speed of execution or for size of program. Often compilers try to compromise between the two.
An example of code optimisation is shown in Fig. 3.2.d.2 where the code on the left has been changed to
that on the right so that r1 * b is only evaluated once.
Original code          Optimised code
a := 5 + 3             a := 5 + 3
b := 5 * 3             b := 5 * 3
r1 := a + b            r1 := a + b
r2 := r1 * b           r2 := r1 * b
r3 := r2 / a           r3 := r2 / a
r4 := r1 * b           r4 := r2
r5 := r4 + 6           r5 := r4 + 6
c := r3 – r5           c := r3 – r5
Fig. 3.2.d.2
There are many other ways of optimising code but they are beyond what is expected at this level.
3.2.e Linkers and Loaders
Programs are usually built up in modules. These modules are then compiled into machine code that has to
be loaded into the computer's memory. This process is done by the loader. The loader decides where in
memory to place the code and then adjusts memory addresses as described in Chapter 3.1. As the whole
program may consist of many modules, all of which have been separately compiled, the modules will have
to be correctly linked once they have been loaded. This is the job of the linker. The linker calculates the
addresses of the separate pieces that make up the whole program and then links these pieces so that all the
modules can interact with one another.
The idea of using modules that can be used in many programs was explained in Section 1.3 in the AS text.
This method of creating programs is important as it reduces the need to keep rewriting code and will be
further discussed under object oriented programming in Section 3.5.f.
3.2 Example Questions.
Q 1. Explain why the size of the memory available is particularly relevant to the process of compilation.
[4]
A. The computer must be able to simultaneously hold in memory:
- The compiler software/without which the process cannot be carried out.
- The source code/the code that needs to be compiled
- The object code/because the compilation process produces the program in machine code
form.
- Working space/processing area to be used by the compiler during the process.
Notes: This question is trying to make the candidate think about the implications of the creation of a
machine code version of the high-level language program. The size of the memory is not as significant as it
used to be, because the memory in a modern micro is normally large enough for the problem not to arise; in
the past, however, the compilation of programs could cause trouble on a micro, which led to interpretation
becoming the standard translation technique there.
Q 2. a) Explain the difference between the two translation techniques of interpretation and
compilation. [2]
b) Give one advantage of the use of each of the two translation techniques. [2]
b) - Interpretation provides the programmer with better error diagnostics because the
source code is always present and hence can be used to provide a reference whenever
the error occurs.
- When a program is compiled no further translation is necessary no matter how many
times the program is run, consequently there is nothing to slow down the execution of
the program.
Notes: The question is probably not in its best form as there are many responses that could
justifiably be given to the difference between the two processes. A perfectly acceptable response
here would be that interpretation does not create an object code while compilation does.
Q 3. State the three stages of compilation and describe, briefly, the purpose of each. [6]
A. - Lexical analysis
- puts each statement into the form best suited to the syntax analyser.
- Syntax analysis
- Language statements are checked against the rules of the language.
- Code generation
- The machine code (object code) is produced.
Notes: The number of marks for the question plays a big part in this answer. There are only six
marks, three of which must be for stating the three stages. This means that there is only one mark
each for saying what the purpose of each is. Do not launch into a long essay, you don’t have time in
the constraints of the examination room, the examiner is simply looking for an outline description of
what the stage does. Be careful about writing down all you know about each stage. There is a
danger that the first thing you write down may be wrong. There is only one mark available and, if
the answer is very long, the mark will be lost immediately. Also, don’t think that the marks can be
carried across from another part. You may not know anything about code generation, but you do
know a lot about lexical analysis – sorry, the marks cannot be transferred over in a question like this.
Q 4. Explain, in detail, the stage of compilation known as lexical analysis. [6]
A. - Source program is used as the input
- Tokens are created from the individual characters and from…
- the special reserved words in the program.
- A token is a string of binary digits.
- Variable names are loaded into a look up table known as the symbol table
- Redundant characters (e.g. spaces) are removed
- Comments are removed
- Error diagnostics are issued.
Notes: Compare this question with the last one. This one is asking for the details, so it becomes
important to say as much as possible. The examiner may be a little more lenient about something in
a student’s list that is wrong, but only a little! After all, the main thing about the stages of
compilation is to know when each of them is appropriate, so don’t make too many errors.
Chapter 3.3
Computer Architecture and the Fetch-Execute Cycle
3.3.a Von Neumann Architecture
John Von Neumann introduced the idea of the stored program. Previously data and programs were stored in
separate memories. Von Neumann realised that data and programs are indistinguishable and can, therefore,
use the same memory. This led to the introduction of compilers which accepted text as input and produced
binary code as output.
The Von Neumann architecture uses a single processor which follows a linear sequence of fetch-decode-
execute. In order to do this, the processor has to use some special registers. These are
Register Meaning
PC Program Counter
CIR Current Instruction Register
MAR Memory Address Register
MDR Memory Data Register
Accumulator Holds results
The program counter keeps track of where to find the next instruction so that a copy of the instruction can be
placed in the current instruction register. Sometimes the program counter is called the Sequence Control
Register (SCR) as it controls the sequence in which instructions are executed.
The memory address register is used to hold the memory address that contains either the next piece of data
or an instruction that is to be used.
The memory data register acts like a buffer and holds anything that is copied from the memory ready for the
processor to use it.
The central processor contains the arithmetic-logic unit (also known as the arithmetic unit) and the control
unit. The arithmetic-logic unit (ALU) is where data is processed. This involves arithmetic and logical
operations. Arithmetic operations are those that add and subtract numbers, and so on. Logical operations
involve comparing binary patterns and making decisions.
The control unit fetches instructions from memory, decodes them and synchronises the operations before
sending signals to other parts of the computer.
The accumulator is in the arithmetic unit, the program counter and the instruction registers are in the control
unit and the memory data register and memory address register are in the processor.
A typical layout is shown in Fig. 3.3.a.1 which also shows the data paths.
Fig 3.3.a.1: main memory connected to the central processing unit (CPU), which contains the control unit,
the ALU and the special registers (PC, CIR, MAR, MDR and the accumulator), together with the data paths
between them.
The fetch-execute cycle can then be described by the following steps.
1. Load the address that is in the program counter (PC) into the memory address register (MAR).
2. Increment the PC by 1.
3. Load the instruction that is in the memory address given by the MAR into the memory data register
(MDR).
4. Load the instruction that is now in the MDR into the current instruction register (CIR).
5. Decode the instruction that is in the CIR.
6. If the instruction is a jump instruction then
a. Load the address part of the instruction into the PC
b. Reset by going to step 1.
7. Execute the instruction.
8. Reset by going to step 1.
Steps 1 to 4 are the fetch part of the cycle. Steps 5, 6a and 7 are the execute part of the cycle and steps 6b
and 8 are the reset part.
Step 1 simply places the address of the next instruction into the memory address register so that the control
unit can fetch the instruction from the right part of the memory. The program counter is then incremented
by 1 so that it contains the address of the next instruction, assuming that the instructions are in consecutive
locations.
The memory data register is used whenever anything is to go from the central processing unit to main
memory, or vice versa. Thus the next instruction is copied from memory into the MDR and is then copied
into the current instruction register.
Now that the instruction has been fetched the control unit can decode it and decide what has to be done.
This is the execute part of the cycle. If it is an arithmetic instruction, this can be executed and the cycle
restarted as the PC contains the address of the next instruction in order. However, if the instruction involves
jumping to an instruction that is not the next one in order, the PC has to be loaded with the address of the
instruction that is to be executed next. This address is in the address part of the current instruction, hence
the address part is loaded into the PC before the cycle is reset and starts all over again.
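The steps above can be turned into a very small simulation. The sketch below uses an invented three-instruction machine (LOAD, ADD, JMP and HALT are assumptions of this example, not a real instruction set) purely to show how the registers are used at each step of the cycle.

#include <iostream>
#include <array>

// A toy instruction set invented for this sketch; it is not a real processor.
enum Op { LOAD, ADD, JMP, HALT };
struct Instruction { Op op; int operand; };

int main() {
    // A tiny program: load 5, add 7, then halt. The jump is never taken here.
    std::array<Instruction, 4> memory = {{
        {LOAD, 5}, {ADD, 7}, {HALT, 0}, {JMP, 0}
    }};

    int pc = 0;          // program counter
    int acc = 0;         // accumulator
    bool running = true;

    while (running) {
        int mar = pc;                       // 1. copy the PC into the MAR
        pc = pc + 1;                        // 2. increment the PC
        Instruction mdr = memory[mar];      // 3. fetch the instruction into the MDR
        Instruction cir = mdr;              // 4. copy the MDR into the CIR

        switch (cir.op) {                   // 5. decode the instruction in the CIR
            case JMP:  pc = cir.operand; break;   // 6. jump: load the address part into the PC
            case LOAD: acc = cir.operand; break;  // 7. execute
            case ADD:  acc += cir.operand; break;
            case HALT: running = false;    break;
        }
    }                                       // 8. reset: go round the loop again

    std::cout << "Accumulator holds " << acc << std::endl;   // prints 12
    return 0;
}

Each pass round the while loop is one fetch-execute cycle; the JMP case shows the PC being overwritten with the address part of the instruction.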
With pipelining, while one instruction is being executed the next can already be fetched and decoded, so
several instructions are in the processor, at different stages, at the same time. Fig. 3.3.c.1 illustrates this:
Instruction 1 enters the pipe and, as it moves on to the next stage, Instruction 2 is fetched behind it.
This helps with the speed of throughput unless the next instruction in the pipe is not the next one that is
needed. Suppose Instruction 2 is a jump to Instruction 10. Then Instructions 3, 4 and 5 need to be removed
from the pipe and Instruction 10 needs to be loaded into the fetch part of the pipe. Thus, the pipe will have
to be cleared and the cycle restarted in this case. The result is shown in Fig. 3.3.c.2
Fig. 3.3.c.2: the pipe is cleared after the jump and then refilled, with Instruction 10 entering first and
Instruction 11 fetched behind it.
Another type of computer architecture is to use many processors, each carrying out an individual instruction
at the same time as its partners. This type of processing uses an architecture known as parallel processors,
many independent processors working in parallel on the same program. One of the difficulties with this is
that the programs running on these systems need to have been written specially for them. If the programs
have been written for standard architectures, then some instructions cannot be completed until others have
been completed. Thus, checks have to be made to ensure that all prerequisites have been completed.
However, these systems are in use particularly when systems are receiving many inputs from sensors and the
data need to be processed in parallel. A simple example that shows how the use of parallel processors can
speed up a solution is the summing of a series of numbers. Consider finding the sum of n numbers such as
2 + 4 + 23 + 21 + …. + 75 + 54 + 3
Using a single processor would involve (n – 1) additions, one after the other. Using n/2 processors we could
simultaneously add n/2 pairs of numbers in the same time it would take a single processor to add one pair of
numbers. This would leave only n/2 numbers to be added and this could be done using n/4 processors.
Continuing in this way the time to add the series would be considerably reduced.
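The pairwise addition can be sketched as follows. The loop is executed sequentially here, but every addition within one pass is independent of the others, so with n/2 processors each pass could be completed in the time of a single addition, giving roughly log2 n passes in all.

#include <iostream>
#include <vector>

int main() {
    std::vector<int> values = {2, 4, 23, 21, 9, 75, 54, 3};   // n = 8 numbers

    // Each pass adds the values in pairs; the additions within one pass are
    // independent, so they could all be carried out at the same time in parallel.
    while (values.size() > 1) {
        std::vector<int> next;
        for (std::size_t i = 0; i + 1 < values.size(); i += 2)
            next.push_back(values[i] + values[i + 1]);   // one pair per (virtual) processor
        if (values.size() % 2 == 1)
            next.push_back(values.back());               // an odd value is carried forward
        values = next;                                   // n, then n/2, then n/4, ...
    }

    std::cout << "Sum = " << values[0] << std::endl;     // 191 for the data above
    return 0;
}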
3.3 Example Questions
The questions in this section are meant to mirror the type and form of questions that a candidate
would expect to see in an exam paper. As before, the individual questions are each followed up with
comments from an examiner.
Q 1. The Program Counter (Sequence Control Register) is a special register in the processor of a
computer.
a) Describe the function of the program counter. [2]
b) Describe two ways in which the program counter can change during the normal execution of
a program, explaining, in each case, how this change is initiated. [4]
c) Describe the initial state of the program counter before the running of the program. [2]
b) - P.C. is incremented…
- as part of the fetch execute cycle.
- P.C. is altered to the value being held in the address part of the instruction…
- When the instruction is one that alters the normal sequence of instructions in the
program.
- This second type of command involves the P.C. being reset twice in the same cycle.
[4]
c) - The P.C. will contain the address of the first instruction in the sequence to be run…
- this must have been placed in the register by some external agent, the program loader.
[2]
Notes: Part (a) is often poorly understood by students, the majority believing that the program
counter is used to keep track of the number of programs running, or the order in which programs
have been called. There is obviously a confusion here with the idea of a stack storing return addresses of
modules when they have been called.
Part (b) illustrates a characteristic of true examination questions. Most genuine questions will have
more mark points available than there are marks for the question. This is not true of these sample
questions. It should also be remembered that these sample questions have not been through the
rigorous testing process that a genuine paper would have undergone, so any problems with the
content should not be repeated in the examination. Candidates find difficulty in making the
distinction between different types of instruction, it may be of value to spend some time talking
about arithmetic/logic/jump/ command type instructions as they all affect the cycle in different ways.
Part (c) refers back to the AS work in the need to understand that the loader will initially set the
value of the P.C. so that the program can begin.
Q 4. a) Describe how pipelining normally speeds up the processing done by a computer. [2]
b) State one type of instruction that would cause the pipeline system to be reset, explaining why
such a reset is necessary. [3]
b) - Jump instruction
- The instructions in the pipeline are no longer the ones to be dealt with next…
- so the pipeline has to be reset. [3]
Chapter 3.4
Data Representation, Data Structures and Data Manipulation
Electricity in a wire is used to stand for the digit 1 and no electricity for the digit 0. Counting by repeatedly
adding 1 then looks like this:
ADD 1: one wire carries electricity, giving 1 (denary 1)
ADD 1: that wire goes to 0 and a carry of 1 appears on a second wire, giving 1 0 (denary 2)
ADD 1: both wires carry electricity, giving 1 1 (denary 3)
ADD 1: both wires go to 0 and a carry appears on a third wire, giving 1 0 0 (denary 4)
The computer can continue like this for ever, just adding more wires when it gets bigger numbers.
This system, where there are only two digits, 0 and 1, is known as the binary system. Each wire, or digit, is
known as a binary digit. This name is normally shortened to BIT. So each digit, 0 or 1, is one bit. A single
bit has very few uses so they are grouped together. A group of bits is called a BYTE. Usually a byte has 8
bits in it.
The first thing we must be able to do with the binary system is to change numbers from our own system of 10
digits (the denary system) into binary, and back again. There are a number of methods for doing this, the
simplest being to use the sort of column diagrams, which were used in primary school to do simple
arithmetic
Thousands Hundreds Tens Units
except, this time we are using binary, so the column headings go up in twos instead of tens
32s 16s 8s 4s 2s units
To turn a denary number into a binary number simply put the column headings, start at the left hand side and
follow the steps:
If the column heading is less than or equal to the number, put a 1 in the column and then subtract the
column heading from the number. Then start again with the next column on the right.
If the column heading is greater than the number, put a 0 in the column and start again with the next
column on the right.
Note: You will be expected to be able to do this with numbers up to 255, because that is the biggest number
that can be stored in one byte of eight bits.
e.g. Change 117 (in denary) into a binary number.
128 64 32 16 8 4 2 1
0 1 1 1 0 1 0 1
so 117 (in denary) = 01110101 in binary.
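The column method translates directly into code. The following sketch (the function name is this example's own) prints the 8-bit binary form of any denary number from 0 to 255.

#include <iostream>
#include <string>

// Convert a denary value (0..255) to an 8-bit binary string using the
// column headings 128, 64, 32, 16, 8, 4, 2, 1.
std::string toBinary(int value) {
    std::string bits;
    for (int heading = 128; heading >= 1; heading /= 2) {
        if (heading <= value) {       // the column heading fits: write 1 and subtract it
            bits += '1';
            value -= heading;
        } else {                      // the column heading is too big: write 0
            bits += '0';
        }
    }
    return bits;
}

int main() {
    std::cout << toBinary(117) << std::endl;   // prints 01110101
    return 0;
}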
This principle can be used for any number system, even the Babylonians’ sixties if you can learn the
symbols.
e.g. If we count in eights (called the OCTAL system) the column headings go up in 8’s.
512 64 8 1
Another system is called HEXADECIMAL (counting in 16’s). This sounds awful, but just use the same
principles.
256 16 1
So 117 (in denary) is 7 lots of 16 (112) plus an extra 5. Fitting this in the columns gives
256 16 1
0 7 5
Notice that 7 in binary is 0111 and that 5 is 0101, put them together and we get 01110101 which is the
binary value of 117 again. So binary, octal and hexadecimal are all related in some way.
There is a problem with counting in 16’s instead of the other systems. We need symbols going further than 0
to 9 (only 10 symbols and we need 16!).
We could invent 6 more symbols but we would have to learn them, so we use 6 that we already know, the
letters A to F. In hexadecimal A stands for 10, B stands for 11 and so on to F stands for 15.
So a hexadecimal number BD stands for 11 lots of 16 and 13 units
= 176 + 13
= 189 ( in denary)
Note: B = 11, which in binary = 1011
D = 13, which in binary = 1101
Put them together to get 10111101 = the binary value of 189.
Binary Coded Decimal
Some numbers are not proper numbers because they don’t behave like numbers. A barcode for chocolate
looks like a number, and a barcode for sponge cake looks like a number, but if the barcodes are added
together the result is not the barcode for chocolate cake. The arithmetic does not give a sensible answer.
Values like this that look like numbers but do not behave like them are often stored in binary coded decimal
(BCD). Each digit is simply changed into a four bit binary number which are then placed after one another
in order.
e.g. 398602 in BCD
Answer: 3 = 0011 9 = 1001
8 = 1000 6 = 0110
0 = 0000 2 = 0010
So 398602 = 001110011000011000000010 (in BCD)
Note: All the zeros are essential otherwise you can’t read it back.
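The same encoding can be sketched in code: each denary digit becomes its own four-bit group and the groups are written one after another (the function name is an assumption of this example).

#include <iostream>
#include <string>

// Encode a string of denary digits in binary coded decimal (BCD):
// each digit becomes its own four-bit group.
std::string toBCD(const std::string& digits) {
    std::string bcd;
    for (char d : digits) {
        int value = d - '0';                     // denary value of this digit
        for (int bit = 8; bit >= 1; bit /= 2)    // four bits, most significant first
            bcd += (value & bit) ? '1' : '0';
    }
    return bcd;
}

int main() {
    std::cout << toBCD("398602") << std::endl;   // prints 001110011000011000000010
    return 0;
}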
The example we used for binary storage was 117 which becomes 01110101 in binary. If we want to store
+117 or –117, these numbers need a second piece of data to be stored, namely the sign. There are two simple
ways to store negative numbers.
Two’s Complement
The MSB stays as a number, but is made negative. This means that the column headings are
-128 64 32 16 8 4 2 1
Addition.
There are four simple rules 0+0=0
0+1=1
1+0=1
and the difficult one 1 + 1 = 0 (Carry 1)
e.g. Add together the binary equivalents of 91 and 18
Answer: 91 = 01011011
18 = 00010010 +
01101101 = 109
1 1
Subtraction.
This is where two’s complement is useful. To take one number away from another, simply write the number
to be subtracted as a two’s complement negative number and then add them up.
e.g. Work out 91 – 18 using their binary equivalents.
Answer: 91 = 01011011
-18 as a two’s complement number is –128 + 110
= -128 +(+64 +32 +8 +4 +2)
= 11101110
Now add them 01011011
11101110 +
1 01001001
1 111111
But the answer can only be 8 bits, so cross out the 9th bit giving
01001001 = 64 + 8 + 1 = 73.
Notes: Lots of carrying here makes the sum more difficult, but the same rules are used.
One rule is extended slightly because of the carries, 1+1+1 = 1 (carry 1)
Things can get harder but this is as far as the syllabus goes.
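The working above can be checked with a short program. Using an unsigned 8-bit type, the two's complement of 18 is formed by inverting the bits and adding 1; adding it to 91 in an 8-bit register automatically discards the ninth bit, leaving 73 as in the example.

#include <iostream>
#include <cstdint>
#include <bitset>

int main() {
    std::uint8_t a = 91;                                        // 01011011
    std::uint8_t b = 18;                                        // 00010010

    // Two's complement of b: invert every bit and add 1.
    std::uint8_t minusB = static_cast<std::uint8_t>(~b + 1);    // 11101110

    // Adding in an 8-bit register automatically discards the ninth (carry) bit.
    std::uint8_t result = static_cast<std::uint8_t>(a + minusB);

    std::cout << std::bitset<8>(minusB) << "  (-18)" << std::endl;
    std::cout << std::bitset<8>(result) << "  = "
              << static_cast<int>(result) << std::endl;         // 01001001 = 73
    return 0;
}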
In decimal notation the number 23.456 can be written as 0.23456 x 10^2. This means that we need only store,
in decimal notation, the numbers 0.23456 and 2. The number 0.23456 is called the mantissa and the number
2 is called the exponent. This is what happens in binary.
For example, consider the binary number 10111. This could be represented by 0.10111 x 2^5 or, keeping the
exponent in binary, 0.10111 x 2^101. Here 0.10111 is the mantissa and 101 is the exponent.
Similarly, in decimal, 0.0000246 can be written 0.246 x 10^-4. Now the mantissa is 0.246 and the exponent is
–4.
Thus, in binary, 0.00010101 can be written as 0.10101 x 2^-11, where 0.10101 is the mantissa and –11 is
the exponent.
It is now clear that we need to be able to store two numbers, the mantissa and the exponent. This form of
representation is called floating point form. Numbers that involve a fractional part, like 2.467 (in denary) and
101.0101 (in binary), are called real numbers.
3.4.e Normalising a Real Number
In the above examples, the point in the mantissa was always placed immediately before the first non-zero
digit. This is always done like this with positive numbers because it allows us to use the maximum number
of digits.
Suppose we use 8 bits to hold the mantissa and 8 bits to hold the exponent. The binary number 10.11011
becomes 0.1011011 x 2^10 (the exponent here being written in binary) and can be held as
0 1 0 1 1 0 1 1 0 0 0 0 0 0 1 0
Mantissa Exponent
Notice that the first digit of the mantissa is zero and the second is one. The mantissa is said to be
normalised if the first two digits are different. Thus, for a positive number, the first digit is always zero
and the second is always one. The exponent is always an integer and is held in two's complement form.
Now consider the binary number 0.00000101011, which is 0.101011 x 2^-101. Thus the mantissa is 0.101011
and the exponent is –101. Again, using 8 bits for the mantissa and 8 bits for the exponent, we have
0 1 0 1 0 1 1 0 1 1 1 1 1 0 1 1
Mantissa Exponent
The reason for normalising the mantissa is in order to hold numbers to as high an accuracy as possible.
Care needs to be taken when normalising negative numbers. The easiest way to normalise negative numbers
is to first normalise the positive version of the number. Consider the binary number –1011. The positive
version is 1011 = 0.1011 x 2^100 and can be represented by
0 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0
Mantissa Exponent
Now find the two's complement of the mantissa and the result is
1 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0
Mantissa Exponent
As another example, change the decimal fraction –11/32 into a normalised floating point binary number.
The positive version is 11/32 = 0.01011 in binary
= 0.1011 x 2^-1
and, after taking the two's complement of the mantissa, we have
1 0 1 0 1 0 0 0 1 1 1 1 1 1 1 1
Mantissa Exponent
The fact that the first two digits are always different can be used to check for invalid answers when doing
calculations.
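For positive numbers, this normalisation is exactly what the standard library function frexp performs: it splits a value into a mantissa m with 0.5 <= m < 1 (so its binary form begins 0.1...) and an integer exponent. A brief sketch:

#include <iostream>
#include <cmath>

int main() {
    double values[] = {2.75, 0.1640625, 10111.0};   // any positive real numbers
    for (double x : values) {
        int exponent = 0;
        // frexp returns a mantissa in the range [0.5, 1), so for a positive
        // number its binary form always starts 0.1..., i.e. it is normalised.
        double mantissa = std::frexp(x, &exponent);
        std::cout << x << " = " << mantissa << " x 2^" << exponent << std::endl;
    }
    return 0;
}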
3.4.f Accuracy and Range
There is always a finite number of bits that can be used to represent numbers in a computer. This means that
if we use more bits for the mantissa we will have to use fewer bits for the exponent.
Let us start off by using 8 bits for the mantissa and 8 bits for the exponent. The largest positive value we
can have for the mantissa is 0.1111111 and the largest positive number we can have for the exponent is
01111111. This means that the largest positive number we can have is 0.1111111 x 2^127, which is just less
than 2^127.
The largest negative number (i.e. the negative number closest to zero) is 1.0111111 x 2^-128. Note that we
cannot use 1.1111111 for the mantissa because it is not normalised. The first two digits must be different.
The smallest positive number is 0.1000000 x 2^-128, which is 2^-129.
The smallest negative number (i.e. the negative number furthest from zero) is 1.0000000 x 2^127, which is
–2^127.
Have you noticed that zero cannot be represented in normalised form? This is because
0.0000000 is not normalised because the first two digits are the same. Usually, the computer
uses the smallest positive number to represent zero.
Thus, we have, for a positive number n, 0.1000000 x 2^-128 <= n <= 0.1111111 x 2^127.
Increasing the number of bits used for the mantissa improves the accuracy of the numbers that can be stored
but, because fewer bits are then left for the exponent, it reduces the range available. Similarly, reducing the
size of the mantissa reduces the accuracy, but we have a much greater range of values as the exponent can
now take larger values.
3.4.g Static and Dynamic Data Structures
Static data structures are those structures that do not change in size while the program is running. A typical
static data structure is an array because once you declare its size, it cannot be changed. (In fact, there are
some languages that do allow the size of arrays to be changed in which case they become dynamic data
structures.)
Dynamic data structures can increase and decrease in size while a program is running. A typical example is
a linked list.
The following table gives advantages and disadvantages of the two types of data structure.
Static structures
Advantages: the compiler can allocate space during compilation; easy to program; easy to check for
overflow.
Disadvantages: the programmer has to estimate the maximum amount of space that is going to be needed;
can waste a lot of space.
3.4.h Algorithms
Linked Lists - Insertion
Consider Fig. 3.4.h.1 which shows a linked list and a free list. The linked list is created by removing cells
from the front of the free list and inserting them in the correct position in the linked list.
Fig. 3.4.h.1
Now suppose we wish to insert an element between the second and third cells in the linked list. The pointers
have to be changed to those in Fig. 3.4.h.2.
Fig. 3.4.h.2
The algorithm must check for an empty free list as there is then no way of adding new data. It must also
check to see if the new data is to be inserted at the front of the list. If neither of these are needed, the
algorithm must search the list to find the position for the new data. The algorithm is given below.
Fig. 3.4.h.3
Linked Lists – Deletion
In this case, the algorithm must make sure that there is something in the list to delete.
The deletion algorithm, like the insertion algorithm above, is sketched below.
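The pseudocode referred to above (Fig. 3.4.h.3 and the deletion algorithm) is not reproduced in this text, so the following is only an illustrative sketch of the two operations in C++. It keeps the list in ascending order and, for simplicity, uses new and delete in place of the free list described in the text; returning a node to a free list would simply mean linking the deleted cell onto the head of that list instead of calling delete.

#include <iostream>
#include <string>

struct Node {
    std::string data;
    Node* next;
};

// Insert a value into an ordered linked list, returning the (possibly new) head.
Node* insertNode(Node* head, const std::string& value) {
    Node* fresh = new Node{value, nullptr};
    if (head == nullptr || value < head->data) {   // empty list, or insert at the front
        fresh->next = head;
        return fresh;
    }
    Node* current = head;                          // search for the insertion point
    while (current->next != nullptr && current->next->data < value)
        current = current->next;
    fresh->next = current->next;                   // splice the new node in
    current->next = fresh;
    return head;
}

// Delete the first node holding value, returning the (possibly new) head.
Node* deleteNode(Node* head, const std::string& value) {
    if (head == nullptr) return nullptr;           // nothing in the list to delete
    if (head->data == value) {                     // deleting the head node
        Node* rest = head->next;
        delete head;                               // in the text this cell would rejoin the free list
        return rest;
    }
    Node* current = head;
    while (current->next != nullptr && current->next->data != value)
        current = current->next;
    if (current->next != nullptr) {                // found: bypass and release the node
        Node* victim = current->next;
        current->next = victim->next;
        delete victim;
    }
    return head;
}

int main() {
    Node* list = nullptr;
    for (std::string name : {"Diane", "Anne", "Frank", "Chu"})
        list = insertNode(list, name);
    list = deleteNode(list, "Frank");
    for (Node* p = list; p != nullptr; p = p->next)
        std::cout << p->data << " ";               // Anne Chu Diane
    std::cout << std::endl;
    return 0;
}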
Stacks – Insertion
Fig. 3.4.h.4 shows a stack and its head pointer. Remember, a stack is a last-in-first-out (LIFO) data
structure. If we are to insert an item into a stack we must first check that the stack is not full. Having done
this we shall increment the pointer and then insert the new data item into the cell pointed to by the stack
pointer. This method assumes that the cells are numbered from 1 upwards and that, when the stack is empty,
the pointer is zero.
Fig. 3.4.h.4 (a stack of data items, with the stack pointer marking the top item).
The algorithms for insertion and for deletion are sketched at the end of this subsection.
Stacks – Deletion
When an item is deleted from a stack, the item's value is copied and the stack pointer is moved down one
cell. The data itself is not deleted. This time, we must check that the stack is not empty before trying to
delete an item.
These are the only two operations you can perform on a stack.
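The push and pop operations just described might be coded like this. It is only a sketch: the cells are numbered from 1 upwards and a pointer value of 0 means the stack is empty, as in the text, while the capacity of 10 is arbitrary.

#include <iostream>

const int MAX = 10;        // arbitrary capacity for this sketch
int stack[MAX + 1];        // cells numbered 1..MAX, cell 0 unused
int stackPointer = 0;      // 0 means the stack is empty

bool push(int value) {
    if (stackPointer == MAX) return false;    // stack full: report the error
    stackPointer = stackPointer + 1;          // move the pointer up...
    stack[stackPointer] = value;              // ...then store the new item
    return true;
}

bool pop(int& value) {
    if (stackPointer == 0) return false;      // stack empty: nothing to delete
    value = stack[stackPointer];              // copy the top item
    stackPointer = stackPointer - 1;          // move the pointer down (the data is not erased)
    return true;
}

int main() {
    push(7); push(12); push(5);
    int top;
    while (pop(top)) std::cout << top << " "; // 5 12 7 (last in, first out)
    std::cout << std::endl;
    return 0;
}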
Queues - Insertion
Fig. 3.4.h.5 shows a queue and its head and tail pointers. Remember, a queue is a first-in-first-out (FIFO)
data structure. If we are to insert an item into a queue we must first check that the queue is not full. Having
done this we shall increment the pointer and then insert the new data item into the cell pointed to by the head
pointer. This method assumes that the cells are numbered from 1 upwards and that, when the queue is
empty, the two pointers point to the same cell.
Fig. 3.4.h.5 (a queue of data items, with the head pointer marking one end of the queue and the tail pointer
the other).
These are the only two operations that can be performed on a queue.
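A sketch of the two queue operations is given below. Naming conventions for the two pointers vary; in this illustration items join at the tail and leave at the head, with the cells numbered from 1 upwards, and no attempt is made to reuse cells (a fuller implementation would normally treat the array as circular).

#include <iostream>

const int MAX = 10;          // arbitrary capacity for this sketch
int queue[MAX + 1];          // cells numbered 1..MAX
int head = 1;                // the next item to leave the queue
int tail = 0;                // the last item to have joined the queue

bool enqueue(int value) {
    if (tail == MAX) return false;     // queue full (no wrap-around in this simple version)
    tail = tail + 1;                   // move the tail pointer on...
    queue[tail] = value;               // ...and store the new item
    return true;
}

bool dequeue(int& value) {
    if (head > tail) return false;     // queue empty: nothing to remove
    value = queue[head];               // copy the front item
    head = head + 1;                   // move the head pointer on (the data is not erased)
    return true;
}

int main() {
    enqueue(3); enqueue(8); enqueue(1);
    int front;
    while (dequeue(front)) std::cout << front << " ";   // 3 8 1 (first in, first out)
    std::cout << std::endl;
    return 0;
}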
Trees - Insertion
When inserting an item into a binary tree, it is usual to preserve some sort of order. Consider the tree shown
in Fig. 3.4.i.1 showing a tree containing data that is stored according to its alphabetic order.
To add a new value, we look at each node starting at the root. If the new value is less than the value at the
node move left, otherwise move right. Repeat this for each node arrived at until there is no node. Insert a
new node at this point and enter the data.
Now let's try putting "Jack Spratt could eat no fat" into a tree. Jack must be the root of the tree. Spratt
comes after Jack so go right and enter Spratt. could comes before Jack, so go left and enter could. eat is
before Jack so go left, it's after could so go right. This is continued to produce the tree in Fig. 3.4.i.1.
Fig. 3.4.i.1: Jack is the root; could is Jack's left child and Spratt is Jack's right child; eat is the right child of
could, no is the left child of Spratt, and fat is the right child of eat.
The algorithm for this is sketched below.
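The original pseudocode is not reproduced here; the C++ sketch below simply follows the rule just described (move left if the new word comes before the word at the node, otherwise move right, and insert where there is no node). The words are written in lower case so that the simple < comparison gives ordinary alphabetic order.

#include <iostream>
#include <string>

struct TreeNode {
    std::string word;
    TreeNode* left;
    TreeNode* right;
};

// Insert a word, keeping alphabetic order: smaller values go left, others right.
TreeNode* insertWord(TreeNode* node, const std::string& word) {
    if (node == nullptr)                        // no node here: this is the insertion point
        return new TreeNode{word, nullptr, nullptr};
    if (word < node->word)
        node->left = insertWord(node->left, word);
    else
        node->right = insertWord(node->right, word);
    return node;
}

// An in-order traversal prints the words in alphabetic order.
void printInOrder(const TreeNode* node) {
    if (node == nullptr) return;
    printInOrder(node->left);
    std::cout << node->word << " ";
    printInOrder(node->right);
}

int main() {
    TreeNode* root = nullptr;
    for (std::string w : {"jack", "spratt", "could", "eat", "no", "fat"})
        root = insertWord(root, w);
    printInOrder(root);      // could eat fat jack no spratt
    std::cout << std::endl;
    return 0;
}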
Fig. 3.4.i.2 shows the tree after the word and has also been added: Jack is still the root, with Spratt (whose
left child is no) on its right; on its left is could, which now has and as its left child and eat as its right child,
eat in turn having fat as its right child.
Trees - Deletion
Deleting a node from a tree is a fairly complex task. Suppose that, in Fig. 3.4.i.2 we wish to delete the node
containing the word could. What do we do with the words and, eat and fat. The simplest way is to traverse
the tree with root could and store all the values in a different data structure. (We could use an array, a stack,
a queue or a linked list.) We must then insert these values back in the tree. Suppose we stored the values in
such a way that they will be re-entered into the tree in the order and, eat, fat.
Fig. 3.4.i.3 shows the result: Jack is the root; and is now Jack's left child, with eat as its right child and fat as
the right child of eat; Spratt, with its left child no, remains Jack's right child.
Amendments are not normally carried out on a tree as they would change the order of the nodes.
The serial search expects the data to be in consecutive locations such as in an array. It does not expect the
data to be in any particular order. To find the position of a particular value involves looking at each value in
turn and comparing it with the value you are looking for. When the value is found you need to note its
position. You must also be able to report that a value has not been found in some circumstances.
This method can be very slow, particularly if there are a large number of values. The least number of
comparisons is one; this occurs if the first item in the list is the one you want. However, if there are n values
in the list, you will need to make n comparisons if the value you want is the last value. This means that, on
average, you will look at n/2 values each search. Clearly, if n is large, this can be a very large number of
comparisons.
Now suppose the list is sorted into ascending order as shown below.
Anne
Bhari
Chu
Diane
Ejo
Frank
Gloria
Hazel
Suppose we wish to find the position of Chu. Now compare Chu with the value that is in the middle of the
table. In this case Diane and Ejo are both near the middle; in this case we usually take the smaller value. As
Chu is before Diane, we now look only at this list.
Anne
Bhari
Chu
Diane
Now we only have a list of four entries. That is we have halved the length of the list. We now compare Chu
with Bhari and find that Chu is greater than Bhari so we use the list
Chu
Diane
which is half the length of the previous list. Comparing Chu with Chu we have found the position we want.
This has only taken three comparisons.
Clearly, this is more efficient than the serial search method. However, the data must first be sorted and this
can take some time. We need to compare these two methods using lists of different lengths. To do this we
need to know the number of comparisons that need to be made for lists of different lengths. As we halve the
list each time we do the comparison, the number of comparisons is given by m, where 2^m is the smallest
value greater than or equal to n, the number of values in the list.
Suppose there are 16 values in the list to start with. The successive lists will contain 16, 8, 4 and 2 values.
This involves four comparisons. Now 2^4 is 16. Similarly, we only need 10 comparisons if the list contains
1000 values because 2^10 is equal to 1024, which is greater than 1000.
The formal algorithms for these searching techniques are not necessary as part of the syllabus, but are
included here because the authors are aware that some students are interested in the more formal solutions to
problems.
Serial Search
Let the data consist of n values held in an array called DataArray which has subscripts numbered from 1
upwards. Let X be the value we are trying to find. We must check that the array is not empty before
starting the search.
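The pseudocode itself is not shown in this text; as an illustration, a serial search in C++ might look like the sketch below. C++ arrays are indexed from 0, so the position returned is adjusted to match the 1-based numbering used in the description above.

#include <iostream>

// Serial search: look at each value in turn until X is found or the end is reached.
// Returns the 1-based position of X, or 0 if X is not present or the array is empty.
int serialSearch(const int dataArray[], int n, int x) {
    for (int i = 0; i < n; ++i) {      // examine every cell in turn
        if (dataArray[i] == x)
            return i + 1;              // report the position (numbered from 1)
    }
    return 0;                          // value not found
}

int main() {
    int data[] = {20, 47, 12, 53, 32, 84};
    std::cout << serialSearch(data, 6, 53) << std::endl;   // prints 4
    std::cout << serialSearch(data, 6, 99) << std::endl;   // prints 0 (not in the list)
    return 0;
}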
Note that, when such an algorithm is written out in pseudocode, the for loop only acts on the indented line. If
there is more than one operation to perform inside a loop, make sure that all the lines are indented.
Binary Search
Assume that the data is held in an array as described above but that the data is in ascending order in the
array. We must split the lists in two. If there is an even number of values, dividing by two will give a whole
number and this will tell us where to split the list. However, if the list consists of an odd number of values
we will need to find the integer part of it, as an array subscript must be an integer.
We must also make sure that, when we split a list in two, we use the correct one for the next search.
Suppose we have a list of eight values. Splitting this gives a list of the four values in cells 1 to 4 and four
values in cells 5 to 8. When we started we needed to consider the list in cells 1 to 8. That is the first cell
was 1 and the last cell was 8. Now, if we move into the first list (cells 1 to 4), the first cell stays at 1 but the
last cell becomes 4. Similarly, if we use the second list (cells 5 to 8), the first cell becomes 5 and the last is
still 8. This means that if we use the first list, the first cell in the new list is unchanged but the last is
changed. However, if we use the second list, the first cell is changed but the last is not changed. This gives
us the clue of how to do the search.
A more detailed algorithm is given below that is useful if you wish to program the binary search. You would
not be expected to be able to reproduce this during an examination.
Binary Search Algorithm
In this algorithm, Vector is a one-dimensional array that holds the data in ascending order. X is the value we
are trying to find and n is the number of values in the array.
{Initialisation}
First = 1
Last = n
Found = FALSE
{Perform the search}
WHILE First <= Last AND NOT Found DO
{Obtain index of mid-point of interval}
Mid = INT((First + Last) / 2)
{Compare the values}
IF X < Vector[Mid] THEN
Last = Mid – 1
ELSE
IF X > Vector[Mid] THEN
First = Mid + 1
ELSE
Output 'Value is at ', Mid
Found = TRUE
ENDIF
ENDIF
ENDWHILE
IF NOT Found THEN
{Unsuccessful search}
Output 'Value not in the list'
ENDIF
END
Merging is taking two lists which have been sorted into the same order and putting them together to form a
single sorted list, for example producing a single list such as
3 5 6 8 12 16 25
from two shorter sorted lists.
There are many methods that can be used to sort lists. You only need to understand two of them. These are
the insertion sort and the quick sort. This Section describes the two sorts and a merge in general terms; the
next Section gives the algorithms.
Insertion Sort
In this method we compare each number in turn with the numbers before it in the list. We then insert the
number into its correct position.
20 47 12 53 32 84 85 96 45 18
We start with the second number, 47, and compare it with the numbers preceding it. There is only one and it
is less than 47, so no change in the order is made. We now compare the third number, 12, with its
predecessors. 12 is less than 20 so 12 is inserted before 20 in the list to give the list
12 20 47 53 32 84 85 96 45 18
This is continued until the last number is inserted in its correct position. In Fig. 3.4.k.1 the blue numbers are
the ones before the one we are trying to insert in the correct position. The red number is the one we are
trying to insert.
Fig. 3.4.k.1
Quick Sort
At first this method appears to be very slow. This is only because of the length of time it takes to explain the
method. In fact, for long lists, it is a very efficient method of sorting.
20 47 12 53 32 84 85 96 45 18
We use two pointers. One pointer points left and the other points right. Let the number we are trying to
place in its correct position be marked in red in the following description.
Start with the left pointer in the first position and the right pointer in the last position. Let the number in the
first position be the one we are trying to place in its correct position. We have
20 47 12 53 32 84 85 96 45 18
Compare the two numbers that are being pointed to. If they are in the wrong order, swap them. We have
18 47 12 53 32 84 85 96 45 20
Now keep moving the pointer that is not attached to the red number towards the red number until either the
two pointers meet or the two numbers being pointed to are in the wrong order. In this case the left pointer
moves to 47 and 47 and 20 are in the wrong order, so swap them.
18 47 12 53 32 84 85 96 45 20
18 20 12 53 32 84 85 96 45 47
Now again move the pointer not pointing to the red number towards the red number until either the two
pointers meet or two numbers need to be swapped.
18 20 12 53 32 84 85 96 45 47
(the pointer not attached to 20 moves left, cell by cell, past 45, 96, 85, 84, 32 and 53 until it reaches 12,
which is in the wrong order with 20)
Swap
18 12 20 53 32 84 85 96 45 47
(the other pointer now moves towards 20)
Pointers coincide. Now you will find that the number 20 is in its correct position. That is, all the numbers
to the left of 20 are less than 20 and all the numbers to the right of 20 are greater than 20.
We now split the list into two; the one to the left of the 20 and the one to the right of the 20. We then quick
sort each of these lists. The following shows the steps for the right hand list.
53 32 84 85 96 45 47
Swap (53 and 47 are in the wrong order)
47 32 84 85 96 45 53
(the left pointer moves past 32 until it reaches 84, which is in the wrong order with 53)
Swap
47 32 53 85 96 45 84
(the right pointer moves left until it reaches 45, which is in the wrong order with 53)
Swap
47 32 45 85 96 53 84
(the left pointer moves right until it reaches 85, which is in the wrong order with 53)
Swap
47 32 45 53 96 85 84
(the right pointer moves left past 96 until the two pointers coincide)
The number 53 is now in the correct position for this list. Split this list in two, as before, to give the two
lists
47 32 45 and 96 85 84
and sort these two lists using the quick sort method. When all the sublists have a single number, they can
be put back together to form a sorted list.
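A compact recursive sketch of the method just described is given below. It uses the first value of each sublist as the value to be placed (the "red" number) together with the two moving pointers; this is only one way of coding a quick sort, not the only one.

#include <iostream>
#include <vector>
#include <algorithm>

// Place the value at position first into its correct position using the
// two-pointer method described above, and return that position.
int placeFirst(std::vector<int>& a, int first, int last) {
    int left = first, right = last;
    bool valueOnLeft = true;                     // the value being placed starts at 'left'
    while (left < right) {
        if (a[left] > a[right]) {                // the two pointed-to values are in the wrong order
            std::swap(a[left], a[right]);
            valueOnLeft = !valueOnLeft;          // the swap moves the value to the other pointer
        }
        if (valueOnLeft) --right;                // move the pointer not holding the value
        else             ++left;
    }
    return left;                                 // the value is now in its correct position
}

void quickSort(std::vector<int>& a, int first, int last) {
    if (first >= last) return;                   // a list of 0 or 1 values is already sorted
    int p = placeFirst(a, first, last);
    quickSort(a, first, p - 1);                  // sort the values to its left...
    quickSort(a, p + 1, last);                   // ...and the values to its right
}

int main() {
    std::vector<int> data = {20, 47, 12, 53, 32, 84, 85, 96, 45, 18};
    quickSort(data, 0, static_cast<int>(data.size()) - 1);
    for (int v : data) std::cout << v << " ";    // 12 18 20 32 45 47 53 84 85 96
    std::cout << std::endl;
    return 0;
}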
Merging
Consider the two sorted lists
2 4 7 10 15 and 3 5 12 14 18 26
In order to merge these two lists, we first compare the first values in each list, that is 2 and 3. 2 is less than 3
so we put it in the new list.
New = 2
Since 2 came from the first list we now use the next value in the first list and compare it with the number
from the second list (as we have not yet used it). 3 is less than 4 so 3 is placed in the new list.
New = 2 3
As 3 came from the second list we use the next number in the second list and compare it with 4. This is
continued until one of the lists is exhausted. We then copy the rest of the other list into the new list. The
full merge is shown in Fig. 3.4.k.2.
Fig. 3.4.k.2
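The merge itself can be sketched as follows: compare the front items of the two lists, copy the smaller across, and when one list is exhausted copy the rest of the other.

#include <iostream>
#include <vector>

// Merge two lists that are already in ascending order into one sorted list.
std::vector<int> mergeLists(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> result;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] <= b[j]) result.push_back(a[i++]);   // take the smaller front value
        else              result.push_back(b[j++]);
    }
    while (i < a.size()) result.push_back(a[i++]);    // one list is exhausted:
    while (j < b.size()) result.push_back(b[j++]);    // copy the rest of the other
    return result;
}

int main() {
    std::vector<int> first  = {2, 4, 7, 10, 15};
    std::vector<int> second = {3, 5, 12, 14, 18, 26};
    for (int v : mergeLists(first, second)) std::cout << v << " ";
    // 2 3 4 5 7 10 12 14 15 18 26
    std::cout << std::endl;
    return 0;
}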
3.4.l Algorithms
Insertion Sort
The following is an algorithm for the insertion sort.
Suppose the data are held in a one-dimensional array Vector with a lower bound LB and an upper bound UB
for the array subscripts. For example, if the array is from Vector[1] to Vector[10], LB is 1 and UB is 10.
Similarly, if the data to be sorted are in cells Vector[4] to Vector[7], LB is 4 and UB is 7
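The algorithm itself is not reproduced in this text. The sketch below follows the description given earlier in Section 3.4.k, using the LB and UB convention just introduced (in C++ the vector is indexed from 0, so LB is 0 in the example call).

#include <iostream>
#include <vector>

// Insertion sort on Vector[LB..UB] inclusive: each value in turn is compared
// with the values before it and inserted into its correct position.
void insertionSort(std::vector<int>& vec, int lb, int ub) {
    for (int i = lb + 1; i <= ub; ++i) {
        int value = vec[i];                        // the value to be inserted
        int j = i - 1;
        while (j >= lb && vec[j] > value) {        // shift larger values one place right
            vec[j + 1] = vec[j];
            --j;
        }
        vec[j + 1] = value;                        // drop the value into the gap
    }
}

int main() {
    std::vector<int> data = {20, 47, 12, 53, 32, 84, 85, 96, 45, 18};
    insertionSort(data, 0, static_cast<int>(data.size()) - 1);
    for (int v : data) std::cout << v << " ";      // 12 18 20 32 45 47 53 84 85 96
    std::cout << std::endl;
    return 0;
}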
A. a) (i) 128 64 32 16 8 4 2 1
0 1 1 1 0 0 0 1 =01110001.
(ii) 1 = 0001
1 = 0001
3 = 0011
Therefore 113 = 0000000100010011
b) (i) 113 = 001 110 001 in binary
= 1 6 1 in octal.
(ii) 113 = 0111 0001 in binary
= 7 1 in hexadecimal.
Notice: (i) and (ii) It was necessary to show a method of working out. Many students have
calculators that will automatically change from one number system to another, so it is necessary to
show that you know what you are doing, even if you use a calculator to check the results. Also, the
question stated that the appropriate number of bytes be used, in part (i) this is obviously 8 bits and is
an easy mark, but in part (ii) it is necessary to add a set of zeros to the front of the answer to make it
a whole number of bytes.
(iii) and (iv) The question stated that the first answer had to be used, so one of the two marks is going
to be given for showing the relationship between binary and each of these two representations.
Notice that for the octal answer it was necessary to add a 0 to the front of the binary number to give 9
bits (3 lots of 3).
Notice that the question asked “Explain…”, so just writing the answer down is not acceptable. There
will be a mark for showing the column headings, particularly the value of the MSB in each case.
Q 3. Add together the binary equivalents of 34 and 83, using single byte arithmetic, showing your
working. [3]
A. 34 = 0 0 1 0 0 0 1 0
83 = 0 1 0 1 0 0 1 1 +
0 1 1 1 0 1 0 1 = 117
1
Note that the question asked for the working. The part that shows that you are capable of doing the
arithmetic is the carry, don’t miss it out. In a question like this try to ask yourself what evidence the
examiner is expecting for the mark. The other marks are for using 8 bits for each value, and for the
answer.
Q 4. Describe a floating point representation for real numbers using two bytes. [4]
Q 5. a) Explain how the fraction part of a real number can be normalised. [2]
b) State the benefit obtained by storing real numbers using normalised form. [1]
A. a) - Digits are moved so that the first two digits are different.
- A positive number starts 01… and a negative number starts 10…
b) - The maximum possible accuracy is obtained from the number of bits available for the
mantissa.
Q 6. a) A floating point number is represented in a certain computer system in a single 8 bit byte. 5
bits are used for the mantissa and 3 bits for the exponent. Both are stored in two’s
complement form and the mantissa is normalised.
b) Explain the relationship between accuracy and range when storing floating point
representations of real numbers. [4]
b) - The larger the number of bits used for the mantissa, the better the accuracy
- The larger the number of bits used for the exponent, the greater the range
- There is always a finite number of bits.
- Therefore the more bits used for one part of the representation, the fewer bits can be
used for the other. [4]
Notes: This is a difficult question. The answers to part (a) have been written with / between the two
parts of the representation for clarity, this would not be expected in a student answer. The idea of
using a single byte is ridiculous, but it keeps the arithmetic possible in an examination.
In part (b) the idea is not difficult, but the explanation is very difficult to put into words. Students
who are happy with the arithmetic may find it simpler to use the examples in part (a) and move the
partition between the two parts to show the effect.
Q 7. State the difference between dynamic and static data structures giving an example of each. [3]
A. - A dynamic data structure can change in size during a program run, while a static data
structure maintains a fixed size.
- Dynamic List/Tree.
- Static Array.
Notes: An array may be able to change size in certain circumstances, but it is the only example we’ve
got, so don’t labour the point!
Q 8. a) Show how a binary tree can be used to store the data items Feddi, Eda, Joh, Sean, Dav, Gali
in alphabetic order. [4]
b) Explain why problems may arise if Joh is deleted from the tree and how such problems may
be overcome. [4]
A. a) Feddi is the root; Eda is Feddi's left child, with Dav as Eda's left child; Joh is Feddi's right
child, with Gali as Joh's left child and Sean as Joh's right child.
Marks:
- Root node
- Eda and Joh correctly positioned in relation to Feddi
- Eda’s subtree
- Joh’s subtree
b) - Data in a tree serves two purposes (one is to be the data itself) the other is to act as a
reference for the creation of the subtree below it
- If Joh is deleted there is no way of knowing which direction to take at that node to
find the details of the data beneath it.
- Solution 1 is to store Joh’s subtree in temporary storage and then rewrite it to the tree
after Joh is deleted. (The effect is that one member of the subtree will take over from
Joh as the root of that subtree).
- The data, Joh, can be marked as deleted so that the data no longer exists but it can
maintain its action as an index for that part of the tree so that the subtree can be
correctly negotiated.
Notes: The tree can be reflected in a vertical mirror quite justifiably. For this reason it is probably
worthwhile making a note on the diagram of what rule has been used, just in case it does not match
the examiner’s version.
Q 9. Describe two types of search routine, giving an indication of when it would be advisable to use each.
[6]
A. - Serial search carried out on data list by…
- comparing each key with the desired one in turn…
- until found or error message issued.
- Useful if data keys are not in order or if the number of items of data is small.
- Binary search carried out by…
- continually comparing the middle key in the list with the required key…
- and hence repeatedly splitting the list into two equal halves until…
- middle value = required key or error if new list is empty.
- Fast method of searching if the data list is large and the data is sorted on the key. [4]
Q 10. Describe the steps in sorting a list of numbers into order using an insertion sort. [4]
A. - Each value in turn is compared with the values already in the list and is…
- inserted in the correct location.
- Once a value has been inserted the algorithm is repeated by comparing all the values in the
list, in turn…
- with the next value in the list.
- This continues until all the values have been inserted.
- Example.
Notes: A difficult topic to ask a question about because the examination is relatively short.
Consequently there is little time for each of the questions. If the question does not look like this it
will almost certainly consist of a number of data items, numerical or alphabetic, and the candidate
will be asked to explain a sort technique by showing how the data items are affected by a particular
sorting method. There is nothing in the syllabus to say that sort algorithms will not be asked for,
however, with the time pressure in the examination such a question would be most unlikely.
Q 11. Given two sorted lists describe an algorithm for merging them into one sorted list. [6]
The “algorithm” is not in any formal style. The syllabus states that the candidate should be able to
describe an algorithm, not reproduce one in a particular form. These are mark points in the solution
and not a rigid form of solution, a prose solution would be equally acceptable.
Chapter 3.5
Programming Paradigms
3.5 Introduction
This Chapter uses a number of different programming languages to illustrate the points being explained.
You will not be asked to write program code in the examination. However, you may well find it helpful, in
answering examination questions, to include code in your answers. This is perfectly satisfactory; indeed it
will often help you to clarify what you are saying. If you do write code, the accuracy of the syntax will not
be marked. Examiners will use your code to see if you understand the question and can explain your
answer.
Thus, treat the code in this Chapter simply as an explanation of the different facilities available in different
programming paradigms. Do not think that you have to be able to program in all the languages used in the
following Sections.
To make programming easier, assembly languages were developed. These replaced machine code functions
with mnemonics and addresses with labels. Assembly language programming is also a low-level paradigm
although it is a second generation paradigm. Figure 3.5.a.1 shows an assembly language program that adds
together two numbers and stores the result.
Figure 3.5.a.1
Although this assembly language is an improvement over machine code, it is still prone to errors and code is
difficult to debug, correct and maintain.
The next advance was the development of procedural languages. These are third generation languages and
are also known as high-level languages. These languages are problem oriented as they use terms appropriate
to the type of problem being solved. For example, COBOL (Common Business Oriented Language) uses
the language of business. It uses terms like file, move and copy.
FORTRAN (FORmula TRANslation) and ALGOL (ALGOrithmic Language) were developed mainly for
scientific and engineering problems, although one of the ideas behind the development of ALGOL was that
it should be an appropriate language in which to define algorithms. BASIC (Beginners' All-purpose Symbolic
Instruction Code) was developed to enable more people to write programs. All these languages follow the procedural
paradigm. That is, they describe, step by step, exactly the procedure that should be followed to solve a
problem.
The problem with procedural languages is that it can be difficult to reuse code and to modify solutions when
better methods of solution are developed. In order to address these problems, object-oriented languages
(like Eiffel, Smalltalk and Java) were developed. In these languages data, and methods of manipulating the
data, are kept as a single unit called an object. The only way that a user can access the data is via the
object's methods. This means that, once an object is fully working, it cannot be corrupted by the user. It
also means that the internal workings of an object may be changed without affecting any code that uses the
object.
A further advance was made when declarative programming paradigms were developed. In these languages
the computer is told what the problem is, not how to solve the problem. Given a database the computer
searches for a solution. The computer is not given a procedure to follow as in the languages discussed so
far.
Another programming paradigm is functional programming. Programs written using this paradigm use
functions, which may call other functions (including themselves). These functions have inputs and outputs.
Variables, as used in procedural languages, are not used in functional languages. Functional languages
make a great deal of use of recursion which is the ability for a procedure to call itself.
The simplest construct is sequence: each line of code is executed one after the other, in the order in which it
is written.
Most procedural languages have two methods of selection. These are the IF … THEN … ELSE statement
and the SWITCH or CASE statement. For example, in C++, we have
if (Number > 0)
    cout << "The number is positive.";
else
{
    if (Number == 0)
        cout << "The number is zero.";
    else
        cout << "The number is negative.";
}
In C++ multiple selections can be programmed using the SWITCH statement. For example, suppose a user
enters a single letter and the output depends on that letter, a typical piece of code could be
switch (UserChoice)
{
case 'A':
cout << "A is for Apple.";
break;
case 'B':
cout << "B is for Banana.";
break;
case 'C':
cout << "C is for Cat.";
break;
default:
cout << "I don't recognise that letter.";
}
Repetition (or iteration) is another standard construct. Most procedural languages have many forms of this
construct such as
FOR … NEXT
REPEAT … UNTIL …
WHILE … DO …
A typical use of a loop is to add a series of numbers. The following pieces of C++ code add the first ten
positive integers.
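The listings themselves are not reproduced here, but C++ code along the following lines would do the job; all three loops add the first ten positive integers.

#include <iostream>

int main() {
    // for loop
    int sum1 = 0;
    for (int i = 1; i <= 10; ++i)
        sum1 += i;

    // while loop (the condition is tested before each pass)
    int sum2 = 0, j = 1;
    while (j <= 10) {
        sum2 += j;
        ++j;
    }

    // do ... while loop (the condition is tested after each pass, like REPEAT ... UNTIL)
    int sum3 = 0, k = 1;
    do {
        sum3 += k;
        ++k;
    } while (k <= 10);

    std::cout << sum1 << " " << sum2 << " " << sum3 << std::endl;   // 55 55 55
    return 0;
}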
The point to note with these procedural languages is that the programmer has to specify exactly what the
computer is to do.
Procedural languages are used to solve a wide variety of problems. Some of these languages are more
robust than others. This means that the compiler will not let the programmer write statements that may lead
to problems in certain circumstances. As stated earlier, there are procedural languages designed to solve
scientific and engineering problems while others are more suitable for solving business problems. There are
some that are particularly designed for solving problems of control that need real time solutions.
Procedural languages may use functions and procedures but they always specify the order in which
instructions must be used to solve a problem. The use of functions and procedures helps programmers to
reuse code, but there is always the danger of variables being altered inadvertently.
In the 1970s it was realised that code was not easily reused and there was little security of data in a program.
Also, the real world consists of objects not individual values. My car, registration number W123ARB, is an
object. Kay's car, registration number S123KAY, is another object. Both of these objects are cars and cars
have similar attributes such as registration number, engine capacity, colour, and so on. That is, my car and
Kay's car are instances of a class called cars. In order to model the real world, the Object-oriented
Programming (OOP) paradigm was developed. Unfortunately, OOP requires a large amount of memory
and, in the 1970s, memory was expensive and CPUs still lacked power. This slowed the development of
OOP. However, as memory became cheaper and CPUs more powerful, OOP became more popular. By the
1980s Smalltalk, and later Eiffel, had become well established. These were true object-oriented languages.
C++ also includes classes although the programmer does not have to use them. This means that C++ can be
used as a standard procedural language or an object-oriented language or a mixture of both! Java, with a
syntax similar to C++, is a fully object-oriented language. Although OOP languages are procedural in
nature, OOP is considered to be a new programming paradigm.
The following is an example, using Java, of a class that specifies a rectangle and the methods that can be
used to access and manipulate the data.
class Shapes {
    // ... the constructor Shapes( ), which declares and creates the Rectangle objects,
    // goes here (it is described below)
}

class Rectangle {
    // Declare the variables related to a rectangle
    int length;
    int width;
    int area;
    // ... the constructor and the write( ) method, described below, follow here
}
This example contains two classes. The first is called Shapes and is the main part of the program. It is from
here that the program will run. The second class is called Rectangle and it is a template for the description
of a rectangle.
The class Shapes has a constructor called Shapes, which declares three objects of type Rectangle. This is a
declaration and does not assign any values to these objects. In fact, Java simply says that, at this stage, they
have null values. Later, the new statement creates actual rectangles. Here small is given a width of 2 and a
length of 5, medium is given a width of 10 and a length of 25 and large is given a width of 50 and a length
of 100. When a new object is created from a class, the class constructor, which has the same name as the
class, is called.
The class Rectangle has a constructor that assigns values to width and length and then calculates the area of
the rectangle.
The class Rectangle also has a method called write( ). This method has to be used to output the details of
the rectangles.
In the class Shapes, its constructor then prints a heading and the details of the rectangles. The latter is
achieved by calling the write method. Remember, small, medium and large are objects of the Rectangle
class. This means that, for example, small.write( ) will cause Java to look in the class called Rectangle for a
write method and will then use it.
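The full Java listing is not reproduced in this text. As an illustration of the same structure, a rough equivalent can be sketched in C++ (which, as noted below, also supports classes); the class and method names follow the description of the Java program rather than its actual code.

#include <iostream>

class Rectangle {
public:
    // The constructor assigns values to width and length and then calculates the area.
    Rectangle(int widthIn, int lengthIn)
        : length(lengthIn), width(widthIn), area(widthIn * lengthIn) {}

    // write() outputs the details of the rectangle.
    void write() const {
        std::cout << "Width " << width << ", length " << length
                  << ", area " << area << std::endl;
    }

private:
    int length;
    int width;
    int area;
};

class Shapes {
public:
    // The constructor creates the rectangles and then prints their details
    // by calling the write method of each object.
    Shapes() : small(2, 5), medium(10, 25), large(50, 100) {
        std::cout << "Details of the rectangles" << std::endl;
        small.write();
        medium.write();
        large.write();
    }
private:
    Rectangle small;
    Rectangle medium;
    Rectangle large;
};

int main() {
    Shapes program;     // constructing Shapes runs the whole example
    return 0;
}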
The functional programming paradigm provides a very high level view of programming. All programs
consist of a series of functions that use parameters for input and pass values to other functions. There are no
variables like the ones in procedural languages. However, like procedural languages, the programmer has to
tell the computer the precise steps to be taken to solve a problem. For example, in the language "Haskell",
the following returns the square of a number.
square :: Int -> Int
square n = n * n
The first line, square :: Int -> Int, says that we have a function called square that takes an integer as input
and outputs an integer. The second line,
square n = n * n
says that the function requires the value of n as input and outputs n * n.
Another example is a function, different say, that takes three integers a, b and c as input and outputs a
Boolean value True or False. The output is True if a, b and c are not all the same. Try tracing such a
function with
(i) a = 2, b = 2 and c = 3;
(ii) a = 2, b = 3 and c = 3;
(iii) a = 5, b = 5 and c = 5.
You should find that (i) and (ii) give an output of True and (iii) gives an output of False.
Most functions use guards to determine the output. Consider, as an example, a function that returns the
smaller of two integers x and y. Here | x <= y is a guard. The function first checks to see if x <= y is True. If
it is, the function outputs x and ends. If x <= y is False, Haskell moves to the next line and checks the guard.
In this case there is no guard, so the function outputs y.
A similar function can expect three integers x, y and z as input and output an integer. It in fact outputs the minimum
of three integers. If x <= y and x <= z then x must be the minimum and the function outputs x and stops.
However, if this is not true, x is not the minimum therefore y or z must be the minimum. The second guard
checks this. If this returns a True value, y is output. If the guard returns a False value, the function
continues and outputs z.
Functional programming can be very powerful. The trick is to keep breaking a problem down into sub-
problems until the sub-problems can be solved by using simple functions. Suppose we wish to find the
minimum of four integers. There are many ways of doing this but they all consist of sub-problems. One
algorithm is to find the minimum of three of the integers and then to find the minimum of that result and the
fourth integer. This uses two functions, namely the minimum of three integers and the minimum of two
integers. But we already have solutions to these problems, so why not use them.
This is simply the use of step-wise refinement/top-down design as explained in Section 1.3 in the AS text.
Another facility of functional programming that makes it a powerful programming paradigm is the use of
recursion. Consider the problem of finding the sum of the first n integers. Traditionally this could be
written (in Visual Basic) as
sum = 0
For count = 1 to n
sum = sum + count
Next count
In a functional language the same result can be produced by a recursive function sum that expects an integer
as input and outputs an integer. The first
guard says that if the input integer is 1, output 1. If this guard is False, then output n plus the value of sum
(n – 1). Then sum (n – 1) calls the same function but with the input being 1 less than the last call. This is
repeated until the input is reduced to 1 when a value of 1 is output.
For example, the evaluation of sum 3 proceeds like this:
sum 3: the guard 3 = 1 is False, so sum 3 = 3 + sum 2
sum 2: the guard 2 = 1 is False, so sum 2 = 2 + sum 1
sum 1: the guard 1 = 1 is True, so sum 1 = 1
therefore sum 2 = 2 + 1 = 3
and sum 3 = 3 + 3 = 6
The combination of step-wise refinement and recursion means that functional programming is a very
powerful tool. More powerful features of functional programming will be addressed in Section 3.5.h.
Another programming paradigm is the declarative one. Declarative languages tell the computer what is
wanted but do not provide the details of how to do it. These languages are particularly useful when solving
problems in artificial intelligence such as medical diagnosis, fault finding in equipment and oil exploration.
The method is also used in robot control. An example of a declarative language is Prolog. The idea behind
declarative languages is shown in Fig. 3.5.b.1.
Fig. 3.5.b.1
Here the user inputs a query to the search engine, which then searches the database for the answers and
returns them to the user. For example, using Prolog, suppose the database is
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
Note that in Prolog values start with a lowercase letter and variables start with an uppercase letter. A user
may want to know the names of all the males. The query
male(X).
will return
X = charnjit
X = jaz
X = tom
Notice that the user does not have to tell Prolog how to search for the values of X that satisfy the query. In a
procedural language the database may be held in a two-dimensional array Gender as shown below.
Array Gender
1 2
1 female Jane
2 female Anne
3 female Sandip
4 male Charnjit
5 male Jaz
6 male Tom
In Visual Basic we could write
For count = 1 To 6
    If Gender(count, 1) = "male" Then
        picResults.Print Gender(count, 2)
    End If
Next count
This is fairly straightforward. However, suppose we now add to the Prolog database the following data.
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
and suppose we wish to know the name of the mother of Atif. In Prolog we use the query
parent(X, atif), female(X).
and the result is
X = sandip
Now suppose we want a list of fathers together with their children. The query
parent(X, Y), male(X).
gives the result
X = charnjit Y = mary
X = charnjit Y = rajinder
X = jaz Y = atif
If we only want a list of fathers we use the underscore and create the query
parent(X, _ ), male(X).
and the result is
X = charnjit
X = charnjit
X = jaz
Further examples are given in Section 3.5.g. At this stage the important point is that the programmer does
not have to tell the computer how to answer the query. There are no FOR … NEXT, WHILE … DO … or
REPEAT … UNTIL … loops as such. There is no IF … THEN … statement. The system simply consists
of a search engine and a database of facts and rules. Examples of facts are given above. Examples of rules
will be given in Section 3.5.g.
3.5.c Structured Design
A complex problem needs to be broken down into smaller and smaller sub-problems until all the sub-
problems can be solved easily. This process is called step-wise refinement or top-down design.
Consider the problem of calculating the wages for an hourly paid worker. The worker is paid £6.50 per hour
for up to 40 hours and time-and-a-half for all hours over 40. Tax and National Insurance contributions have
to be deducted. This can be represented by Fig. 3.5.c.1.
Fig. 3.5.c.1 (structure diagram for the Wages problem)
An alternative way of writing this is to use numbered statements. This can be easier if there are many sub-
problems to be solved.
1. Wages
1.1 Get number of hours
1.2 Calculate gross pay
1.2.1 Calculate normal wages
1.2.2 Calculate overtime
1.3 Calculate deductions
1.3.1 Calculate tax
1.3.2 Calculate National Insurance
1.4 Calculate net pay
1.5 Output wages slip
Either of these designs can be turned into a series of functions and procedures. The program could be called
Wages and consist of the following functions and procedure.
Wages
GetHours( ) returns an integer in range 0 to 60
CalculateWages(Hours) returns gross wage
CalculateNormalWages(Hours) returns wage for up to 40 hours
CalculateOvertime(Hours) returns pay for any hours over 40
CalculateDeductions(GrossWage) returns total deductions
CalculateTax(GrossWage) returns tax due
CalculateNI(GrossWage) returns N.I. due
CalculateNetWage(GrossWage, Deductions) returns net wage after deductions
Procedure OutputResults(Hours, GrossWage, Tax, NI, Deductions, NetWage)
Procedure to print the wage slip
Here we can see that if a single value is to be returned, the simplest way to do this is to use a function. If
there are no values to be returned, then a procedure should be used. If more than one value is to be returned,
a procedure should be used.
In Visual Basic, for example, the declaration
Function GetHours( ) As Integer
states that the function will return an integer value and that it does not expect any values to be fed into it.
Similarly,
Function CalculateWages(Hours As Integer) As Double
expects to be given an integer value as input and returns a value of type Double.
Note: If you are programming in C, C++ or Java, there are no procedures. These languages only use
functions. A function has to be typed; that is, the programmer must specify the type of value to be returned.
This is true of functions in all of these languages. In C, C++ and Java, if a function is not going to return a
value, its return type is void. That is, no value is actually returned.
Another type of diagram is used with the Jackson Structured Programming (JSP) technique. In these diagrams
a sequence of steps, such as Calculate tax followed by Calculate National Insurance, is drawn as a row of
boxes beneath their parent box.
The diagram does not illustrate selection (IF … THEN … ELSE …) nor does it show repetition (FOR …
DO … , WHILE … DO … , etc.). Fig. 3.5.c.2 shows selection in JSP design. Note the use of a circle inside
the boxes to indicate that the operations B and C are conditional.
Fig. 3.5.c.2 (selection: box B° is labelled with the condition and box C° with Else)
Repetition (also known as iteration) is shown in Fig. 3.5.c.3. The asterisk is used to indicate that B is an
iterative process; that is, B has to be repeated.
Fig. 3.5.c.3 (iteration: box B* is repeated)
Initially we shall consider diagrams that represent data. Consider a pack of ordinary playing cards which
has been divided into red and black suits. Fig. 3.5.c.4 shows this. The top level shows we are using a pack;
the second layer shows that the pack is divided into red and black components. The third level shows that
the red component consists of many (which may be zero) cards. Similarly, black consists of many cards.
Fig. 3.5.c.4 (a Pack consists of a Red part and a Black part; each part consists of many cards)
Fig. 3.5.c.5
Now suppose we deal a hand of cards from a shuffled pack until a spade is dealt. The data structure is
shown in Fig. 3.5.c.6.
Fig. 3.5.c.6 (a Hand consists of the cards before a spade, which are many non-spade cards, followed by a spade card)
Fig. 3.5.c.7 (a Sales file consists of many months' data, each of which consists of many days' totals; a month's data or a day's total may be present or absent)
The JSP diagrams we have seen so far have shown physical data structures. Logical data structures describe
the data with respect to a given application.
Suppose we wish to extract the takings for all Mondays from the file shown in Fig. 3.5.c.7. The logical
diagram is shown in Fig. 3.5.c.8. Notice that the diagram does not violate the data structure shown in the
previous diagram. Also, notice the use of null ( ─ ) in the decision at the bottom of the diagram; this shows
that the ELSE part of the decision does nothing. Further there are no decisions below the totals in this
diagram because they are not being used.
Sales
*
Month's data
*
Day's total
O O
Monday ─
Fig. 3.5.c.8
Now suppose the application is to extract February's data. In this case we need to read past January totals so
these must be shown in the logical data structure. However, once we have dealt with February, we do not
wish to read the remaining months' data. Fig. 3.5.c.9 shows the logical structure for this problem.
Fig. 3.5.c.9 (Sales consists of January's data followed by February's data; each month's data consists of many days' totals; the months after February do not appear because they are not read)
The next step is to enter, on the diagram, the constraints. This is done by numbering the constraints and then
listing their meanings. Placing the constraints on Fig. 3.5.c.8 produces Fig. 3.5.c.10, where C1 and C2 state
how many times each iteration is repeated (for every month in the file and for every day in the month) and
C3 is the condition (is this total for a Monday?) that controls the selection.
Sales
C1
*
Month's data
C2
*
Day's total
ELSE
C3
O O
Monday ─
Fig. 3.5.c.10
In this example, separate procedures can be written for each box in the logical structure. That is, each
section in the diagram can be a separate procedure or function. Each procedure and function can use
parameters to pass data to and from the calling procedure or function. This is further explained in the next
Section.
Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function
In this function X and Y are integers whose values must be passed to the function before it can find
the perimeter of the rectangle. These variables are called formal parameters. To use this function, another
program will have to call it and provide the values for X and Y. This can be done by means of a statement
of the form
Perimeter = PerimeterOfRectangle(4, 6)
or we can use
A=3
B=4
Perimeter = PerimeterOfRectangle(A, B)
In both of these statements the values inside the parentheses ( 4 and 6 in the first example and A and B in
the second) are called actual parameters. How the values are passed to the function or procedure depends
on the programming language. In the first example the values 4 and 6 are stored in the variables X and Y.
In the second example, in Visual Basic, the addresses of A and B are passed to the function so that X and Y
have the same address as A and B respectively. In C++, in both cases the actual values are passed to the
function which stores them in its own variable space. Thus we have two different ways of passing
parameters. Fig. 3.5.d.1 shows how Visual Basic normally passes parameters.
Fig. 3.5.d.1
Fig 3.5.d.2 shows what normally happens when C++ passes parameters. Notice that two extra memory
locations are used and that C++ makes a copy of the values of A and B and stores them in separate locations
X and Y.
Fig 3.5.d.2
Visual Basic is said to pass parameters by reference (or address) and C++ passes them by value. It is
interesting to see the effect of passing values by address. Here is the function described above and a copy of
the calling function in Visual Basic.
Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function

Private Sub cmdShow_Click()
    Dim A As Integer
    Dim B As Integer
    Dim Perimeter As Integer
    A = 3
    B = 4
    picResults.Print "Before call A = "; A; " and B = "; B
    Perimeter = PerimeterOfRectangle(A, B)
    picResults.Print "Perimeter = "; Perimeter
    picResults.Print "After call A = "; A; " and B = "; B
End Sub
Fig.3.5.d.3
Notice that after the function has been run the values of A and B have changed. This is because the
addresses of A and B were passed not their actual values.
Visual Basic can pass parameters by value and C++ can pass parameters by reference. In Visual Basic we
have to use the ByVal key word if we want values to be passed by value. Here is a modified form of the
Visual Basic function together with the output from running the modified program.
Function PerimeterOfRectangle(ByVal X As Integer, ByVal Y As Integer) As Integer
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.4
In order to pass parameters by reference in C++, the formal parameter must be preceded by an ampersand
(&).
Variables can have different values in different parts of the program. Look at the following Visual Basic
code and its output, shown in Fig. 3.5.d.5.
Private Sub cmdShow_Click()
    Dim A As Integer
    Dim B As Integer
    Dim C As Integer
    Dim Perimeter As Integer
    A = 3
    B = 4
    C = 5
    picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
    Perimeter = PerimeterOfRectangle(A, B)
    picResults.Print "Perimeter = "; Perimeter
    picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub

Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
    Dim C As Integer
    C = 10
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.5
This shows that C has a different value in the function PerimeterOfRectangle from its value in the calling
function cmdShow_Click. C is said to be a local variable and the C in PerimeterOfRectangle is stored at a
different address from the C in cmdShow_Click. Local variables only exist in the block in which they are declared.
This is very helpful as it means that different programmers, writing different routines, do not have to worry
about the names of variables used by other programmers. However, it is sometimes useful to be able to use
the same variable in many parts of a program. To do this, the variable has to be declared as global. In
Visual Basic this is done by means of the statement
Public C As Integer
which is placed in a module. If we do this with the previous example the code becomes
Private Sub cmdShow_Click()
    Dim A As Integer
    Dim B As Integer
    Dim Perimeter As Integer
    A = 3
    B = 4
    C = 5
    picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
    Perimeter = PerimeterOfRectangle(A, B)
    picResults.Print "Perimeter = "; Perimeter
    picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub

Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
    C = 10
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.6 shows that the value of C, when changed in the function PerimeterOfRectangle, is
changed in the calling routine also. In fact it is changed throughout the program.
Fig. 3.5.d.6
In C++ variables can be declared at any point in the program. This means that a variable can be local to a
small block of code. In the following example i is only available in the for loop; the output statement after
the loop is illegal as i no longer exists.
for (int i = 0; i < 10; i++) {
    cout << i << " ";
}
cout << i;    // illegal: i is no longer in scope here
We now have the ability to allow variables to be used only in certain parts of a program or in any part.
Global variables should be used as sparingly as possible as they can cause a program to be very difficult to
debug. This is because it is not always clear when global variables are being changed.
What happens if a variable is declared as both global and local? The following code declares C as global
and C as local in the function PerimeterOfRectangle and the result of running it is shown in Fig. 3.5.d.7.
Notice that the value of C in the function cmdShow_Click is not changed although its value is changed in
PerimeterOfRectangle. This is because C is declared as a local variable in this function and this means that
the global C is not used.
Public C As Integer    'global declaration

Private Sub cmdShow_Click()
    Dim A As Integer
    Dim B As Integer
    Dim Perimeter As Integer
    A = 3
    B = 4
    C = 5
    picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
    Perimeter = PerimeterOfRectangle(A, B)
    picResults.Print "Perimeter = "; Perimeter
    picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub

Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
    Dim C As Integer    'local declaration
    C = 10
    X = 2 * X
    Y = 2 * Y
    PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.7
3.5.e Stacks and Procedures
When a procedure or function is called, the computer needs to know where to return to when the function or
procedure is completed. That is, the return address must be known. Further, functions and procedures may
call other functions and procedures which means that not only must several return addresses be stored but
they must be retrieved in the right order. This can be achieved by using a stack. Fig. 3.5.e.1 shows what
happens when three functions are called after one another. The numbers represent the addresses of the
instructions following the calls to functions.
Main program
.
.
.
Call Function A Function A
100 … .
. .
. .
. Call Function B Function B
. 150 … .
. . .
. . .
End Return Call Function C Function C
250 … .
. .
. .
Return Return
Fig. 3.5.e.1
Notice that the addresses will be stored in the order 100, 150 then 250. When the returns take place the
addresses will be needed in the order 250, 150 then 100. That is, the last address stored is the first address
needed on returning from a function. This means that we need a data structure that provides a last in first
out facility. A stack does precisely this, so we store the return addresses in a stack. In the above example,
the addresses will be stored in the stack each time a function is called and will be removed from the stack
each time a return instruction is executed. This is shown in Fig. 3.5.e.2.
After Call Function A, Call Function B and Call Function C
(a return address is pushed onto the stack by each call):
        250   <- stack pointer
        150
        100
Return from C (pop return address 250 off the stack):
        250
        150   <- stack pointer
        100
Return from B (pop return address 150 off the stack):
        250
        150
        100   <- stack pointer
Return from A (pop return address 100 off the stack):
        250
        150
        100
        stack pointer = NULL (the stack is empty)
Fig.3.5.e.2
Now suppose that values need to be passed to, or from, a function or procedure. Again a stack can be used.
Suppose we have a main program and two procedures, Proc A(A1, A2) and Proc B(B1, B2, B3). That is, A1
and A2 are the formal parameters for Proc A and B1, B2 and B3 are the formal parameters for Proc B. Now
look at Fig. 3.5.e.3 which shows the procedures being called and the return addresses that must be placed on
the stack.
Main program
.
.
.
Call Proc A(X1,X2) Proc A(A1,A2)
200 … .
.
.
Call Proc B(Y1,Y2,Y3) Proc B(B1,B2,B3)
400 … .
. .
. .
Return Return
Fig. 3.5.e.3
Now let us suppose that all the parameters are being passed by value. Then, when the procedures are called
the actual parameters must be placed on the stack and the procedures must pop the values off the stack and
store the values in the formal parameters. This is shown in Fig. 3.5.e.4; note how the stack pointer is moved
each time an address or actual parameter is pushed onto or popped off the stack.
Call Proc A(X1, X2):
    PUSH 200 (the return address)
    PUSH X1
    PUSH X2
                    X2    <- stack pointer
                    X1
                    200
On entry to Proc A:
    A2 = X2 (POP)
    A1 = X1 (POP)
                    200   <- stack pointer
Call Proc B(Y1, Y2, Y3):
    PUSH 400 (the return address)
    PUSH Y1
    PUSH Y2
    PUSH Y3
On entry to Proc B:
    B3 = Y3 (POP)
    B2 = Y2 (POP)
    B1 = Y1 (POP)
                    400   <- stack pointer
                    200
Fig. 3.5.e.4
Next we must consider what happens if the values are passed by reference. This works in exactly the same
way, except that the addresses of the variables are pushed onto the stack instead of their values, so there is
no need to return the values via parameters. The procedures, or functions, will access the actual addresses
where the variables are stored. Finally, how do
functions return values? Simply push them on the stack immediately before returning. The calling program
can then pop the value off the stack. Note that the return address has to be popped off the stack before
pushing the return value onto the stack.
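The behaviour just described can be modelled with a very small amount of code. The following Haskell sketch
(the names push and pop are assumptions, not part of the text) represents the stack of return addresses as a
list whose head is the top of the stack.

type Address = Int
type Stack   = [Address]

-- Push a return address onto the top of the stack.
push :: Address -> Stack -> Stack
push addr stack = addr : stack

-- Pop the most recently pushed return address off the stack.
pop :: Stack -> (Address, Stack)
pop (addr : rest) = (addr, rest)
pop []            = error "no return address on the stack"

For example, push 250 (push 150 (push 100 [])) gives [250, 150, 100]; three successive pops then return 250,
150 and 100 in that order, the last in first out behaviour needed for returning from nested calls.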
3.5.f Object-Oriented Programming (OOP)
Section 3.5.a showed some simple examples of programs written in an object-oriented language. There are
many such languages, some of which were designed as such (Eiffel, Smalltalk) and others which have
evolved (C++, Visual Basic). Java is an OOP language whose syntax is based on C++. All these languages
have classes and derived classes and use the concepts encapsulation, inheritance and polymorphism. In this
Section we consider these concepts in a general way, using diagrams rather than any particular language.
Data encapsulation (or data hiding) has been explained in Section 3.5.a. It is the concept that data can only
be accessed via the methods provided by the class. This is shown in Fig. 3.5.f.1 where the objects, that is,
instantiations of a class, are prevented from directly accessing the data by the methods.
Fig. 3.5.f.1 (objects surrounding the data; each object can reach the data only through the methods provided by the class)
Objects can only access the data by using the methods provided. An object cannot manipulate the data
directly. In the case of the rectangle class, an object of this class cannot directly calculate its area. That is,
we cannot write something like
myArea := myRectangle.width * myRectangle.length;
To find the area of myRectangle, class Rectangle must provide a suitable method. The example in Section
3.5.a does not do this. The class Rectangle calculates the area when an instance of a class is instantiated.
The only way to find the area is to use the write( ) method which outputs the area. If a user wishes to access
the width and length of a rectangle, the class must provide methods to do this. Methods to return the width
and length are given below.
integer getWidth( ) {
getWidth := width;
}//end of getWidth method.
integer getLength( ) {
getLength := length;
}//end of getLength method.
myRectangle can now use these methods to get at the width and length. However, it cannot change their
values. To find the perimeter we can write
myWidth := myRectangle.getWidth( );
myLength := myRectangle.getLength( );
myPerimeter := 2 * (myWidth + myLength);
Thus, an object consists of the data and the methods provided by the class. The concept of data being only
accessible by means of the methods provided is very important as it ensures data integrity. Once a class has
been written and fully tested, neither its methods nor the data can be tampered with. Also, if the original
design of a method is found to be inefficient, the design can be changed without the user knowing and without
the user's program being affected.
Another powerful concept is that of inheritance. Inheritance allows the re-use of code and the facility to
extend the data and methods without affecting the original code. In the following diagrams, we shall use a
rounded rectangle to represent a class. The name of the class will appear at the top of the rectangle,
followed by the data followed by the methods.
Consider the class Person that has data about a person's name and address and methods called outputData( ),
which outputs the name and address, and getName( ) and getAddress( ), which return the name and address
respectively. This is shown in Fig. 3.5.f.2.
Person
    Data:      name
               address
    Methods:   outputData( )
               getName( )
               getAddress( )
Fig. 3.5.f.2
Now suppose we want a class Employee that requires the same data and methods as Person and also needs
to store and output an employee's National Insurance number. Clearly, we do not wish to rewrite the
contents of the class person. We can do this by creating a class called Employee that inherits all the details
of the class Person and adds on the extra data and methods needed. This is shown in Fig. 3.5.f.3 where the
arrow signifies that Employee inherits the data and methods provided by the class Person. Person is called
the super-class of Employee and Employee is the derived class from Person. An object of type Employee
can use the methods provided by Employee and those provided by Person.
Person
name
address
outputData( )
getName( )
getAddress( )
Employee
NINumber
outputData( )
getNINumber( )
Fig. 3.5.f.3
Notice that we now have two methods with the same name. How does the program determine which one to
use? If myPerson is an instantiation of the Person class, then
myPerson.outputData( );
will use the outputData( ) method from the Person class. The statement
myEmp.outputData( );
will use the method outputData( ) from the Employee class if myEmp is an instantiation of the Employee
class.
Now suppose we have two types of employee; one is hourly paid and the other is paid a salary. Both of
these require the data and methods of the classes Person and Employee but they also need different data to
one another. This is shown in Fig. 3.5.f.4.
Person
name
address
outputData( )
getName( )
getAddress( )
Employee
NINumber
outputData( )
getNINumber( )
HourlyPaidEmp SalariedEmp
hourlyRate salary
outputData( ) outputData( )
getHourlyRate( ) getSalary( )
Fig. 3.5.f.4
How can an object of type Employee output the name and address as well as the N.I. number? The
outputData( ) method in class Employee can refer to the outputData( ) method of its superclass. This is done
by writing a method, in class Employee, of the form
void outputData( ) {
super.outputData( );
System.out.println("The N.I. number is " + NINumber);
}//end of outputData method.
Here super.outputData( ) calls the outputData( ) method of the super-class and then outputs the N.I. number.
Similarly, the other derived classes can call the methods of their super classes.
In the above, we have explained the meanings of terms such as data encapsulation, class and inheritance.
However, sometimes the examiner may ask you to simply state the meanings of these terms. In this case a
simple definition is all that is required. Note also that there will only be one (or possibly two) marks for this
type of question. The following definitions would be satisfactory answers to questions that say 'State the
meaning of the term … '.
Definitions
Data encapsulation is the combining together of the variables and the methods that can operate on the
variables so that the methods are the only ways of using the variables.
A class describes the variables and methods appropriate to some real-world entity.
Inheritance is the ability of a class to use the variables and methods of a class from which the new class is
derived.
3.5.g Declarative Programming
In Section 3.5.b we used the following database of facts.
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
Remember that variables must start with an uppercase letter; constants start with a lowercase letter.
Suppose we ask
male(X).
Prolog starts searching the database and finds male(charnjit) matches male(X) if X is given the value
charnjit. We say that X is instantiated to charnjit. Prolog now outputs
X = charnjit
Prolog then goes back to the database and continues its search. It finds male(jaz) so outputs
X = jaz
and again continues its search. It continues in this way until the whole database has been searched. The
complete output is
X = charnjit
X = jaz
X = tom
No
Rules can also be stored in the database. For example, the rule
father(X, Y) :- parent(X, Y), male(X).
states that X is the father of Y if (the :- symbol) X is a parent of Y AND (the comma) X is male. Adding this
rule gives the database
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
father(X, Y) :- parent(X, Y), male(X).
Suppose our goal is to find the father of rajinder. That is, our goal is to find all X that satisfy
father(X, rajinder).
In the database and the rule the components female, male, parent and father are called predicates and the
values inside the parentheses are called arguments. Prolog now looks for the predicate father and finds the
rule
father(X, Y) :- parent(X, Y), male(X).
In this rule Y is instantiated to rajinder and Prolog starts to search the database for
parent(X, rajinder)
It finds a match with
parent(jane, rajinder)
if X is instantiated to jane. Prolog now uses the second part of the rule
male(X)
with X = jane. That is, Prolog's new goal is male(jane) which fails. Prolog does not give up at this stage but
backtracks to the match
parent(jane, rajinder)
and starts again, from this point in the database, to try to match the goal
parent(X, rajinder)
This time it finds a match with
parent(charnjit, rajinder)
with X instantiated to charnjit. The next step is to try to satisfy the goal
male(charnjit)
This is successful so
X = charnjit
Prolog continues to see if there are any more matches. There are no more matches so Prolog outputs
No
A powerful tool in Prolog is recursion. This can be used to create alternative versions for a rule. The Fig.
3.5.g.1 shows how ancestor is related to parent.
Fig. 3.5.g.1 (a is a parent of b, b is a parent of c and c is a parent of d; therefore a is an ancestor of b, c and d, b is an ancestor of c and d, and c is an ancestor of d)
This shows that X is an ancestor of Y if X is a parent of Y. But it also shows that X is an ancestor of Y if X
is a parent of Z and Z is a parent of Y. It also shows that X is an ancestor of Y if X is a parent of Z , and Z is
a parent of W and W is a parent of Y. This can continue forever. Thus the rule is recursive. In Prolog we
require two rules that are written as
ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
The first rule states that X is an ancestor of Y if X is a parent of Y. In Fig. 3.5.g.1, this is saying that a is an
ancestor of b, b is an ancestor of c and c is an ancestor of d.
The second rule is in two parts. Let us see how it works using Fig.3.5.g.1 which represents the database
parent(a, b).
parent(b, c).
parent(c, d).
Suppose our goal is ancestor(a, c). Prolog finds the first rule and tries to match parent(a, c) with each
predicate in the database. Prolog fails but
does not give up. It backtracks and looks for another rule for ancestor. Prolog finds the second rule and
tries to match
parent(a, Z).
It finds
parent(a, b)
so instantiates Z to b.
This is now put into the second part of the rule to produce
ancestor(b, c).
This means that Prolog has to look for a rule for ancestor. It finds the first rule
parent(b, c)
and succeeds.
This means that with X = a, Y = c we have Z = b and the second rule succeeds. Therefore Prolog returns
Yes.
Now trace, in the same way, the goals
ancestor(a, d)
and
ancestor(c, b).
You should find that the first goal succeeds and the second fails.
This form of programming is based on the mathematics of predicate calculus. Predicate calculus is a branch
of mathematics that deals with logic. All we have done in this Section is based on the rules of predicate
calculus. Prolog stands for Programming in LOGic and its notation is based on that of predicate calculus.
In predicate calculus the simplest structures are atomic sentences such as
Mary loves Harry
Philip likes Zak
Conditions and conclusions can be built from atomic sentences, for example
Frank likes x IF x likes computing
In the last atomic conclusion, x is a variable. The meaning of the conclusion is that Frank likes anything (or
anybody) that likes computing.
Joint conditions are formed using the logical operators OR and AND.
The atomic formulae that serve as conditions and conclusions may be written in a simplified form. In this
form the name of the relation is written in front of the atomic formula. The names of the relations are called
predicate symbols. Examples are
loves(Mary, Harry)
likes(Philip, Zak)
The AND is represented by a comma in the condition part of the atomic conclusion, as in the rule
father(X, Y) :- parent(X, Y), male(X). used earlier.
These examples show the connection between Prolog and predicate calculus. You do not need to understand
how to manipulate the examples you have seen any further than has been shown in this Section.
As with the previous Section we include here the definitions of terms used in this Section. Remember, they
can be used when a question says 'State the meaning of the term … '.
Definitions
Backtracking is going back to a previously found successful match in order to continue a search.
Predicate logic is a branch of mathematics that manipulates logical statements that can be either True or
False.
A goal is a statement that we are trying to prove True or False.
3.5.h List Processing and Functional Languages
In Section 3.5.a we introduced functional languages and saw how functions are defined in the Haskell
programming language. The important concept is pattern matching. Suppose we have two numbers x and y
and wish to output y if x is zero, otherwise output x. We can write this as
output x, y
    | x == 0 = y
    | otherwise = x
Here, the function called output uses guards to decide what to do. (Remember, guards are like IF … THEN
… ELSE … or CASE statements in other languages.) An alternative is to write the function output in two
parts:
output 0, y = y
output x, y = x
Consider the call
output 0, 7
This matches with the first part
output 0, y
with y = 7, so 7 is output. As this match is successful, Haskell will not look at the next definition of output.
Now try
output 4, 7
This does not match
output 0, y
but does match
output x, y
with x = 4 and y = 7, so 4 is output.
In fact, in the second definition, y is never used so the definition can be written
output x, _ = x
The underscore is called a wild card and any value will match it.
Lists are written inside square brackets. The empty list is written [ ]. The list [2, 4, 6] consists of three
elements. In this case 2 is the head of the list and the list [4, 6] is the tail. We can write
[2, 4, 6] = 2 : [4, 6]
head tail
2 : 4 : 6 : [ ] = 2 : 4 : [6]
                = 2 : [4, 6]
                = [2, 4, 6]
In general a non-empty list can be written as x : xs, where x is the head and xs is the (possibly empty) tail.
The : operator groups to the right, so
x : y : zs = x : (y : zs)
Let us now see how we can define a function head that returns the head of a non-empty list.
head :: [a] -> a
head (x : _) = x
Note the use of the underscore to represent the tail. This is because we are not interested in the contents of
the tail. Similarly we can define the tail, which is itself a list.
tail :: [a] -> [a]
tail (_ : xs) = xs
The letter a is used to represent a generic data type. That is, the list could be a list of Int, or of Char, or of
any other data type.
Now let us see how to find the sum of a list of integers. We have
sum of a list = head of the list + sum of the tail of the list
But the sum of the empty list is zero. Thus we clearly have recursion as the sum of the list is the head plus
the sum of the tail of the list. The tail is steadily decreasing as each head is removed so that the tail will
eventually become the empty list. This gives us our stopping value since the sum of an empty list is zero.
Our function is
sum :: [Int] -> Int
sum [ ] = 0
sum (x : xs) = x + sum xs
Remember x is an Int and xs is a list, that is why there are no square brackets.
sum [2, 3, 5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 2 + sum [3, 5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 3 + sum [5]
match with sum [ ] and fail
match with sum ( x : xs) and succeed
sum = 5 + sum [ ]
match with sum [ ] and succeed
sum = 0
sum = 5 + 0
=5
sum = 3 + 5
=8
sum = 2 + 8
= 10
Recursion is a very powerful and common feature of functional programming. Hence, here is another
similar example to clarify recursion.
Suppose we wish to find the product of a list of integers. That is
product [2, 3, 7] = 2 x 3 x 7 = 42
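The definition itself is not reproduced in the text, but it follows the same pattern as sum. A minimal Haskell
sketch (the Prelude's own product is hidden so that the name can be reused) is:

import Prelude hiding (product)

product :: [Int] -> Int
product []       = 1               -- the product of the empty list is 1
product (x : xs) = x * product xs  -- multiply the head by the product of the tail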
Let us now look at a different problem. Suppose we wish to find the squares of each element of a list of
integers. That is, given the list
[2, 5, 3]
the output should be the list
[4, 25, 9]
Thus our input is a list of integers and our output is a list of integers. As with sum, the definition needs a
base case for the empty list and a recursive case that squares the head of the list and then deals with the
tail; a sketch is given below.
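A minimal sketch (the name squares is an assumption) is:

squares :: [Int] -> [Int]
squares []       = []                  -- squaring nothing gives the empty list
squares (x : xs) = x * x : squares xs  -- square the head, then deal with the tail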
The problem with recursion is that it makes very heavy use of memory. This is because it has to store the
values of the variables and the return address each time a call is made. (See Section 3.5.e.) One way of
overcoming this is to use tail recursion. That is, make the recursive call the very last statement. This does
not mean that it can be part of the last statement, the call itself must be the last statement. Consider the
following function.
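The listing does not appear in the text; a sketch of the kind of function meant (assuming it is called fact
and computes the factorial of n) is:

fact :: Int -> Int
fact 0 = 1
fact n = n * fact (n - 1)   -- the recursive call is part of a larger expression, not the last statement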
When fact (n – 1) is called, the value of n must be stored so that when the value of fact (n – 1) is known the
multiplication n * fact (n – 1) can be carried out. We want to avoid storing n because the result
n x (n – 1) x (n – 2) x … x 2 x 1
cannot be evaluated until fact 0 is found. This can involve a lot of storage if n is large. Let us define two
new functions called factorial and newFact.
factorial :: Int -> Int
factorial n = newFact n, 1
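The definition of newFact is not given in the text. A minimal sketch is shown below in standard Haskell,
where a function's arguments are separated by spaces rather than by the commas used above; the second
argument acts as a running result so that the recursive call is the very last thing the function does.

factorial :: Int -> Int
factorial n = newFact n 1

newFact :: Int -> Int -> Int
newFact 0 result = result                        -- nothing left to multiply: return the result built up so far
newFact n result = newFact (n - 1) (n * result)  -- the recursive call is the whole of the last statement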
The sum of a list can also be defined using tail recursion; a sketch is given below. Trace the solution of
sumList [2, 5, 6, 3].
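A minimal sketch (the helper name sumAcc and its accumulator argument are assumptions) is:

sumList :: [Int] -> Int
sumList xs = sumAcc xs 0
  where
    sumAcc []       total = total                  -- nothing left to add: return the running total
    sumAcc (y : ys) total = sumAcc ys (total + y)  -- the recursive call is the last thing done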
Head recursion is when the recursive call is made at the start of the function, immediately after the decision
that causes termination. This is very inefficient on storage as all variables used in the function must be
stored every time a call is made.
As in the two previous Sections we include here two definitions that can be used to answer questions of the
form 'State the meaning of the term …'.
Definitions
Tail recursion is when the last statement in a function calls the function itself and is the only content of the
last statement.
Head recursion is when the call, to the function containing the call, is at the start of the function and all
other statements that manipulate the data are after the call.
3.5.i Use of Special Registers/Memory Addressing Techniques
Fig. 3.5.i.1 shows the minimum number of registers needed to execute instructions. Remember that these
are used to execute machine code instructions not high-level language instructions.
Fig. 3.5.i.1
The program counter (PC) is used to keep track of the location of the next instruction to be executed. This
register is also known as the Sequence Control Register (SCR).
The memory address register (MAR) holds the address of the instruction or data that is to be fetched from
memory.
The current instruction register (CIR) holds the instruction that is to be executed, ready for decoding.
The memory data register (MDR) holds data to be transferred to memory and data that is being transferred
from memory, including instructions on their way to the CIR. Remember that the computer cannot
distinguish between data and instructions. Both are held as binary numbers. How these binary numbers are
interpreted depends on the registers in which they end up. The MDR is the only route between the other
registers and the main memory of the computer.
The accumulator is where results are temporarily held and is used in conjunction with a working register to
do calculations.
The index register is a special register used to adjust the address part of an instruction. This will be
explained in more detail later.
Note that the diagram does not show the control bus and the signals needed for instructions to be correctly
executed. These are not required for this examination.
We shall now see how these registers are used to execute instructions. In order to do this we shall assume
that a memory location can hold both the instruction code and the address part of the instruction. For
example, a 32-bit memory location may use 12 bits for the instruction code and 20 bits for the address part.
This will allow us to use up to 2^12 (= 4096) instruction codes and 2^20 (= 1 048 576) memory addresses.
Suppose four instructions are stored in locations 300, 301, 302 and 303 as shown in the following table and
that the PC contains the number 300.
The instruction is now decoded (not shown in the table) and is interpreted as 'load the contents of the
location whose address is given into the accumulator'.
We now start the execution phase. As the contents of an address are needed, the address part of the
instruction is copied into the MAR, in this case 400.
Now use the same steps to fetch and execute the next instruction. Note that the PC already contains the
address of the next instruction.
Note that all data moves between memory and the MDR via the data bus. All addresses use the address bus.
A summary of the steps needed to fetch and execute the LDA instruction is shown in Fig. 3.5.i.2.
Fetch phase:
    Copy PC to MAR
    Increment PC
    Copy the contents of the location pointed to by MAR to MDR
    Copy MDR to CIR
    Decode the instruction
Execute phase for the LDA instruction:
    Copy the address part of CIR to MAR
    Copy the contents of the location pointed to by MAR to MDR
    Copy MDR to the accumulator
Fig. 3.5.i.2
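The register transfers above can be modelled directly. The following Haskell sketch is only an illustration of
the steps, not a real processor: the record field names, the use of a Map for memory and the way the address
part is supplied to executeLDA are all assumptions.

import Data.Map (Map)
import qualified Data.Map as Map

data Registers = Registers { pc, mar, mdr, cir, acc :: Int } deriving Show

type Memory = Map Int Int

-- Fetch phase: copy PC to MAR, increment PC, copy the addressed word to the MDR and then to the CIR.
fetch :: Memory -> Registers -> Registers
fetch mem r = r { mar = addr, pc = addr + 1, mdr = word, cir = word }
  where
    addr = pc r
    word = Map.findWithDefault 0 addr mem

-- Execute phase for LDA n: copy the address part to the MAR, copy the contents of that
-- location to the MDR and then copy the MDR to the accumulator.
executeLDA :: Int -> Memory -> Registers -> Registers
executeLDA n mem r = r { mar = n, mdr = value, acc = value }
  where
    value = Map.findWithDefault 0 n mem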
What happens during the execute phase depends on the instruction. For example, STA n (store the contents of
the accumulator in the location with address n) has the execute steps shown in Fig. 3.5.i.3:
    Copy the address part of CIR to MAR
    Copy the contents of the accumulator to MDR
    Copy MDR to the location pointed to by MAR
    Fetch the next instruction (see Fig. 3.5.i.2)
Fig. 3.5.i.3
This process works fine but only allows for the sequential execution of instructions. This is because the PC
is only changed by successively adding 1 to it. How can we arrange to change the order in which
instructions are fetched? Consider these instructions.
Suppose the PC contains the number 300, after the instruction ADD 500 has been fetched and executed the
PC will hold the number 301. Now the instruction JLZ 300 will be fetched in the usual way and the PC will
be incremented to 302. The next step is to execute this instruction. The steps are shown in Fig. 3.5.i.4.
Is the accumulator less than 0?  If yes, copy the address part of CIR (here 300) into the PC; if no, leave the
PC unchanged. Then fetch the next instruction.
Fig. 3.5.i.4
So far we have used two copy instructions (LDA and STA), one arithmetic instruction (ADD) and one jump
instruction (JLZ). In the case of the copy and arithmetic instructions, the address part has specified where to
find or put the data. This is known as direct addressing.
An alternative method of using the address part of an instruction is called indirect addressing. Here the
address part of the instruction specifies where to find the address to be used. Fig. 3.5.i.5 shows how this
type of instruction is executed if the instruction is LDI (load the accumulator indirectly).
Fig. 3.5.i.5
The advantage of this mode of addressing is that the actual address used in our example can be the full 32
bits giving 2^32 addresses.
Often it is necessary to access successive memory locations for data. Suppose we wish to add a series of
numbers stored in locations with addresses 700 to 709. We do not want to write a load instruction followed
by nine add instructions. After all, what would happen if we wished to add 100 numbers? We can solve this
problem by using indexed addressing. We could have an instruction, say ADDX, which uses index
addressing. Index addressing uses an index register that the programmer initially sets to zero. Index
addressing adds the contents of the index register to the address part of the instruction before using the
address. After each add instruction is executed, the programmer increments the index register (IR).
Suppose, for example, that the index register contains 5 when the instruction
ADDX 700
is executed. The address used is 700 + 5 = 705. Thus the contents of address 705 are added to the accumulator.
The programmer then increments the IR to make it 6 so that the next time the ADDX 700 instruction is executed
the address used will be 706.
Third generation languages need the user to specify clearly all the steps that need to be taken to solve a
problem. Fourth generation languages do not do this. Languages that accompany modern database, word
processing and spreadsheet packages do not need the user to do this. The users of these packages tell the
application what they want to do, not how to do it. An example is mail merge. Here all the user has to do is
tell the software what table or database to use and the mail merge will take place. Databases often use query
by example (QBE). Here the user simply states what is required and the software will do the task. For
example, Microsoft Access lets a user specify conditions such as DOB < 01/01/90 and the necessary coding
will be done. In fact Access uses the Structured Query Language (SQL) to create the queries. Consider a
table called Students that holds each student's name, height and weight. Using QBE we could select the
height column and enter
> 150
for the criteria. We could also specify, by means of a check box, that only the name should be printed. The
result would be
Dalvinder
Frank
Georgina
The SQL generated for this query would be
SELECT name
FROM Students
WHERE height > 150;
Notice that we do not have to give the steps needed to check each entry in the table Students. A more
complicated query is
SELECT name
FROM Students
WHERE height > 145
AND
weight > 32;
Again, we do not tell the computer exactly how to find the answer required as we would with a third
generation language.
The development of fourth generation languages has meant that people who are not programmers can
produce useful results.
3.5.k Backus Naur Form (BNF) and Syntax Diagrams
Different programming languages express the same construct in different ways. A count-controlled loop starts
For count = 1 To 10
in Visual Basic but
for (count = 1; count <= 10; count++)
in C++. A Visual Basic compiler would not understand the C++ syntax and vice versa. We therefore need, for each
language, a set of rules that specify precisely every part of the language. These rules are specified using
Backus Naur Form (BNF) or syntax diagrams.
All languages use integers, so we shall start with the definition of an integer. An integer is a sequence of the
digits 0, 1, 2, … , 9. Now the number of digits in an integer is arbitrary. That is, it can be any number. A
particular compiler will restrict the number of digits only because of the storage space set aside for an
integer. But a computer language does not restrict the number of digits. Thus the following are all valid
integers.
0
2
415
3040513002976
0000000123
A single digit can be defined, in BNF, as
<digit> ::= 0|1|2|3|4|5|6|7|8|9
where the vertical line is read as OR. Notice that all the digits have to be specified and that they are not
inside angle brackets (< and >) like <integer> and <digit>. This is because integer and digit have definitions
elsewhere; the digits 0, 1, 2, … , 9 do not.
Our full definition of a single digit integer is
<integer> ::= <digit>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
Now consider an integer with more than one digit, such as
147
This is a single digit integer ( 1 ) followed by the integer 47. But 47 is a single digit integer ( 4 ) followed
by a single digit integer ( 7 ). Thus, all integers of more than one digit start with a single digit and are
followed by an integer. Eventually the final integer is a single digit integer. Thus, an indefinitely long
integer is defined as
<integer> ::= <digit><integer>
This is a recursive definition as integer is defined in terms of itself. Applying this definition several times
produces the sequence
<integer> = <digit><integer>
          = <digit><digit><integer>
          = <digit><digit><digit><integer>
          = …
To stop this we use the fact that, eventually, <integer> is a single digit and write
<integer> ::= <digit> | <digit><integer>
That is, <integer> is a <digit> OR a <digit> followed by an <integer>. This means that at any time <integer>
can be replaced by <digit> and the recursion stops. Strictly speaking we have defined an unsigned integer as
we have not allowed a leading plus sign ( + ) or minus sign ( - ). This will be dealt with later. We now have
the full definition of an unsigned integer which, in BNF, is
<unsigned integer> ::= <digit> | <digit><unsigned integer>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
This definition of an unsigned integer can also be described by means of syntax diagrams as shown in Fig. 3.5.k.1.
Fig. 3.5.k.1 (syntax diagrams: an integer is one or more digits in succession; a digit is any one of 0, 1, 2, … , 9)
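The recursive BNF definition can be turned almost directly into a recursive function. The following Haskell
sketch (the function name isUnsignedInteger is an assumption) checks whether a string matches
<unsigned integer> ::= <digit> | <digit><unsigned integer>.

import Data.Char (isDigit)

isUnsignedInteger :: String -> Bool
isUnsignedInteger [c]      = isDigit c                          -- a single digit
isUnsignedInteger (c : cs) = isDigit c && isUnsignedInteger cs  -- a digit followed by an unsigned integer
isUnsignedInteger []       = False                              -- the empty string is not an integer

For example, isUnsignedInteger "3040513002976" is True and isUnsignedInteger "14.34" is False.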
Now we shall define a signed integer such as
+27
-3415
and we can use the earlier definition of an <unsigned integer>. It is usual to say that an integer is an
unsigned integer or a signed integer. If we do this we get the following definition, in BNF, of an integer.
<integer> ::= <unsigned integer> | <signed integer>
<signed integer> ::= + <unsigned integer> | - <unsigned integer>
There are other valid ways of writing these definitions. However, it is better to use several definitions than
try to put all the possibilities into a single definition. In other words, try to start at the top with a general
definition and then try to break the definitions down into simpler and simpler ones. That is, we have used
top-down design when creating these definitions. We have broken the definitions down until we have terms
whose values can be easily determined.
Fig. 3.5.k.2 (syntax diagram for a signed integer: an optional + or - sign followed by one or more digits)
Care must be taken when positioning the recursion in the definitions using BNF. Suppose we define a
variable as a sequence of one or more characters starting with a letter. The characters can be any letter, digit
or the underscore. Valid examples are
A
x
sum
total24
mass_of_product
MyAge
Let us see what happens if we use a similar definition to that for an unsigned integer, such as
<variable> ::= <letter> | <character><variable>
<character> ::= <letter> | <digit> | _
Now consider the invalid name 2Sum. It matches
<character><variable>
with <character> = 2 and <variable> = Sum. Continuing in this way we use 2, S and u for <character> and
then m for <letter>, so 2Sum is accepted. This means that our definition simply says that we must end with a
letter, not start with one. We must rewrite our definition in such a way as to ensure that the first character
is a letter. Moving the recursive call to the front of <character> can do this. This means that the last time
it is called it will be a letter and this will be at the head of the variable. The correct definition is
<variable> ::= <letter> | <variable><character>
<character> ::= <letter> | <digit> | _
<letter> ::= a|b| … |z|A|B| … |Z
<digit> ::= 0|1|2| … |9
A syntax diagram can also represent this. This is left as an exercise. You should also note that, in the
definition of integer, we used tail recursion, but here we have used head recursion.
Let us now use our definition of an integer to define a real number such as
0.347
-2.862
+14.34
00235.006
A real number is an integer, followed by a decimal point, followed by an unsigned integer. That is
<real number> ::= <integer> . <unsigned integer>
Finally, suppose we do not want to allow leading zeros in our integers. That is, an integer is either
zero or digits that do not start with a zero
<digits> must be a single non-zero digit or a non-zero digit followed by any digits. This gives us
<integer> ::= <zero> | <digits>
<digits> ::= <non-zero digit> | <non-zero digit><unsigned integer>
where
<zero> ::= 0
<non-zero digit> ::= 1|2|3|4|5|6|7|8|9
<digit> ::= <zero>|<non-zero digit>
Fig. 3.5.k.4 (syntax diagrams: an integer is either 0 or digits; digits start with one of the digits 1 to 9 followed by any number of the digits 0 to 9)
3.5 Example Questions
Having worked through the 66 pages of section 3.5, many students will be worried about the detail offered
and the need to answer examination questions on what is very difficult and complex work. However, take
heart! The whole examination only lasts 2 hours, and in that time the examiner not only has to examine this
section but also the other 9 sections in module 3. This works out at only 12 minutes for each section, so the
idea of long, complex, algorithm-type questions is not feasible. You are going to be asked questions that
will be taken from the syllabus but which will be fairly short and knowledge-based. The exception may be
with the types of language which may produce a question on the lines of the object oriented question in the
sample material.
Q 4. Explain the difference between direct and indirect addressing and explain why indirect addressing
allows access to more memory locations than direct addressing. [6]
A. - Direct addressing means that the value in the address part of a machine code instruction is...
- the address of the data
- Indirect addressing means that the value in the address part of a machine code instruction is
the address of...
- the address of the data
- In a standard 32 bit word, 24 bits may be used for the address of the data
- this allows 2^24 locations in memory to be addressed.
- If this value points to a location which holds nothing but an address then 2^32 locations in
memory can be addressed.
Notes: A very simple example of this type of question, but bearing in mind the short amount of time
in the examination, this is realistic. The exam question can be made more difficult by having a
recursive definition in the Backus Naur question and by having a return loop in the syntax diagram.
Chapter 3.6
Databases
3.6.a Files and Databases
Originally all data were held in files. A typical file would consist of a large number of records each of
which would consist of a number of fields. Each field would have its own data type and hold a single item
of data. Typically a stock file would contain records describing stock. Each record may consist of the
following fields: a stock code, a description, the number in stock, the re-order level, the cost price, the
sale price, and the supplier's name and address. A program can read this file and, whenever the number in
stock falls below the re-order level, place an order with the supplier.
The problem is when we check the stock the next day, we will create a new order because the stock that has
been ordered has not been delivered. To overcome this we could introduce a new field called On Order of
type Boolean. This can be set to True when an order has been placed and reset to False when an order has
been delivered. Unfortunately it is not that easy.
The original software is expecting the original seven fields not eight fields. This means that the software
designed to manipulate the original file must be modified to read the new file layout.
Further, ad hoc enquiries are virtually impossible. What happens if management ask for a list of best selling
products? The file has not been set up for this and to change it so that such a request can be satisfied in the
future involves modifying all existing software. Further, suppose we want to know which products are
supplied by Food & Drink Ltd.. In some cases the company's name has been entered as Food & Drink Ltd.,
sometimes as Food and Drink Ltd. and sometimes the full stop after Ltd has been omitted. This means that a
match is very difficult because the data is inconsistent. Another problem is that each time a new product is
added to the database both the name and address of the supplier must be entered. This leads to redundant
data or data duplication.
The following example, shown in Fig. 3.6.a.1, shows how data can be proliferated when each department
keeps its own files.
Purchasing Department -> programs to place orders when stocks are low -> file containing Stock Code,
Description, Re-order Level, Cost Price, Sale Price, Supplier Name and Address, etc.
(Other departments keep similar files of their own.)
Fig. 3.6.a.1
Suppose we wish to know which customers have bought parts produced by a particular supplier. We
first need to find the parts supplied by a particular supplier from one file and then use a second file to
find which customers have bought those parts. This difficulty can be compounded if data is needed
from more than two files.
Duplication of data
Details of suppliers have to be duplicated if a supplier supplies more than one part. Details of
customers are held in two different files.
Duplication is wasteful as it costs time and money. Data has to be entered more than once,
therefore it takes up time and more space.
Duplication leads to loss of data integrity. What happens if a customer changes his address? The
Sales Department may update their files but the Accounts Department may not do this at the
same time. Worse still, suppose the Order Department order some parts and there is an increase
in price. The Order Department increases the Cost and Sale prices but the Accounts Department
do not, so there is now a discrepancy.
Data dependence
Data formats are defined in the application programs. If there is a need to change any of these
formats, whole programs have to be changed. Different applications may hold the data in
different forms, again causing a problem. Suppose an extra field is needed in a file, again all
applications using that file have to be changed, even if they do not use that new item of data.
Incompatibility of files
Suppose one department writes its applications in COBOL and another in C. Then COBOL files
have a completely different structure to C files. C programs cannot read files created by a
COBOL program.
File processing was a huge advance on manual processing of queries. This led to end-users
wanting more and more information. This meant that each time a new query was asked for, a
new program had to be written. Often, the data needed to answer the query were in more than
one file, some of which were incompatible.
To try to overcome the search problems of sequential files, two types of database were introduced. These
were hierarchical and network databases. Examples of these are shown in Fig. 3.6.a.2 and Fig. 3.6.a.3
respectively.
Fig. 3.6.a.2 (hierarchical example: an Employee record type with Part-Time and Full-Time record types beneath it)
The hierarchical model can still lead to inconsistent and redundant data. A network database is similar to an
hierarchical one, except that it has more complex pointers. An hierarchical database allows movement up
and down the tree like structure. A network database allows movement up, down and across the tree like
structure. The diagram in Fig. 3.6.a.3 shows how complex the pointers can become. This makes it very
difficult to maintain a network database.
3.6.b The Relational Database Model and Normalisation
Consider a delivery note sent to a customer. It shows the customer's details and the products delivered, for
example 1 Table, 2 Desk and 3 Chair.
Fig. 3.6.b.1
In this example, the delivery note has more than one part on it. This is called a repeating group. In the
relational database model, each record must be of a fixed length and each field must contain only one item
of data. Also, each record must be of a fixed length so a variable number of fields is not allowed. In this
example, we cannot say 'let there be three fields for the products' as some customers may order more
products than this and others fewer. So, repeating groups are not allowed.
At this stage we should start to use the correct vocabulary for relational databases. Instead of fields we call
the columns attributes and the rows are called tuples. The files are called relations (or tables).
We can write the delivery note as
DELNOTE(Num, CustName, City, Country, (ProdID, Description))
where DELNOTE is the name of the relation (or table) and Num, CustName, City, Country, ProdID and
Description are the attributes. ProdID and Description are put inside parentheses because they form a
repeating group. In tabular form the data may be represented by Fig. 3.6.b.2.
Fig. 3.6.b.2
This again shows the repeating group. We say that this is in un-normalised form (UNF). To put it into 1st
normal form (1NF) we complete the table and identify a key that will make each tuple unique. This is
shown in Fig. 3.6.b.3.
To make each row unique we need to choose Num together with ProdID as the key. Remember, another
delivery note may have the same products on it, so we need to use the combination of Num and ProdID to
form the key. We can write this as
DELNOTE(Num, CustName, City, Country, ProdID, Description)
To indicate the key, we simply underline the attributes that make up the key (here Num and ProdID).
Because we have identified a key that uniquely identifies each tuple, we have removed the repeating group.
Definition of 1NF
A relation with repeating groups removed is said to be in First Normal Form (1NF). That is, a relation
in which the intersection of each tuple and attribute (row and column) contains one and only one
value.
However, the relation DELNOTE still contains redundancy. Do we really need to record the details of the
customer for each item on the delivery note? Clearly, the answer is no. Normalisation theory recognises
this and allows relations to be converted to Third Normal Form (3NF). This form solves most problems.
(Note: Occasionally we need to use Boyce-Codd Normal Form, 4NF and 5NF. This is rare and beyond the
scope of this specification.)
Let us now see how to move from 1NF to 2NF and on to 3NF.
Definition of 2NF
A relation that is in 1NF and every non-primary key attribute is fully dependent on the primary key is
in Second Normal Form (2NF). That is, all the incomplete dependencies have been removed.
In our example, using the data supplied, CustName, City and Country depend only on Num and not on
ProdID. Description depends only on ProdID; it does not depend on Num. We say that
Num → CustName, City, Country
ProdID → Description
and write
DELNOTE(Num, CustName, City, Country)
PRODUCT(ProdID, Description)
If we do this, we lose the connection that tells us which parts have been delivered to which customer. To
maintain this connection we add the relation
DEL_PROD(Num, ProdID)
Note the keys (underlined) for each relation. DEL_PROD needs a compound key because a delivery note
may contain several parts and similar parts may be on several delivery notes. We now have the relations in
2NF.
Can you see any more data repetitions? Look at the City and Country attributes in the sample data below:
whenever London appears the country is England, and whenever Paris appears the country is France. That is,
City → Country
so Country depends on City and is only transitively dependent on the key Num. A relation that is in 2NF and
in which no non-key attribute is transitively dependent on the primary key is in Third Normal Form (3NF).
Removing this dependency gives
DELNOTE(Num, CustName, City)
CITY_COUNTRY(City, Country)
Let us now take some sample data and see what happens to it as the relations are normalised.
1NF
DELNOTE
Num CustName City Country ProdID Description
005 Bill Jones London England 1 Table
005 Bill Jones London England 2 Desk
005 Bill Jones London England 3 Chair
008 Mary Hill Paris France 2 Desk
008 Mary Hill Paris France 7 Cupboard
014 Anne Smith New York USA 5 Cabinet
002 Tom Allen London England 7 Cupboard
002 Tom Allen London England 1 Table
002 Tom Allen London England 2 Desk
Convert to
2NF
DELNOTE
Num   CustName     City       Country
005   Bill Jones   London     England
008   Mary Hill    Paris      France
014   Anne Smith   New York   USA
002   Tom Allen    London     England

PRODUCT
ProdID   Description
1        Table
2        Desk
3        Chair
7        Cupboard
5        Cabinet
DEL_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
Convert to
3NF
DELNOTE
Num   CustName     City
005   Bill Jones   London
008   Mary Hill    Paris
014   Anne Smith   New York
002   Tom Allen    London

DEL_PROD
Num   ProdID
005   1
005   2
005   3
008   2
008   7
014   5
002   7
002   1
002   2

PRODUCT
ProdID   Description
1        Table
2        Desk
3        Chair
7        Cupboard
5        Cabinet

CITY_COUNTRY
City       Country
London     England
Paris      France
New York   USA
UNF
DELNOTE(Num, CustName, City, Country,
(ProdID, Description))
1NF
DELNOTE(Num, CustName, City, Country,
ProdID, Description)
2NF
DELNOTE(Num, CustName, City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
3NF
DELNOTE(Num, CustName, City)
CITY_COUNTRY(City, Country)
PRODUCT(ProdID, Description)
DEL_PROD(Num, ProdID)
In this Section we have seen the data presented as tables. These tables give us a view of the data. The tables
do NOT tell us how the data is stored in the computer, whether it be in memory or on backing store. Tables
are used simply because this is how users view the data. We can create new tables from the ones that hold
the data in 3NF. Remember, these tables simply define relations.
Users often require different views of data. For example, a user may wish to find out the countries to which
they have sent desks. This is a simple view consisting of one column. We can create this view by combining
the PRODUCT, DEL_PROD, DELNOTE and CITY_COUNTRY relations (tables).
As a further example of normalisation, consider the following situation. Films are shown at many cinemas,
each of which has a manager. A manager may manage more than one
cinema. The takings for each film are recorded for each cinema at which the film was shown.
Working through normalisation as before, the relations in 2NF are
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID, MName)
TAKINGS(FID, CID, Takings)
In Cinema, the non-key attribute MName is dependent on MID. This means that it is transitively dependent
on the primary key. So we must move this out to get the 3NF relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
3.6.c Entity-Relationship (E-R) Diagrams
In an E-R diagram DELNOTE, CITY_COUNTRY, PRODUCT and DEL_PROD are called entities.
Entities have the same names as relations but we do not usually show the attributes in E-R diagrams.
Relationships between entities, such as 'a delivery note lists many products' and 'each product may appear
on many delivery notes', can be one of four types. These are one-to-one, one-to-many, many-to-one and
many-to-many, each of which is drawn with its own symbol on an E-R diagram.
Fig. 3.6.c.1 is the E-R diagram showing the relationships between DELNOTE, CITY_COUNTRY,
PRODUCT and DEL_PROD.
Fig. 3.6.c.1 (E-R diagram linking CITY_COUNTRY, DELNOTE, DEL_PROD and PRODUCT)
If the relations are in 3NF, the E-R diagram will not contain any many-to-many relationships. If there are
any one-to-one relationships, one of the entities can be removed and its attributes added to the entity that is
left.
Let us now look at our solution to the cinema problem which contained the relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
in 3NF.
FILM takes / is for TAKINGS: a one-to-many relationship, connected by FID.
CINEMA takes / is for TAKINGS: a one-to-many relationship, connected by CID.
MANAGER manages / is managed by CINEMA: a one-to-many relationship, connected by MID.
[E-R diagram: MANAGER linked to CINEMA, and FILM and CINEMA each linked to the link entity TAKINGS.]
Fig. 3.6.c.2
If you now look at Fig. 3.6.c.2, you will see that the link entity is TAKINGS.
3.6.d Form Design
Section 2.1.c discussed the design of screens and forms. All that was said in that section applies to
designing forms for data entry, data amendments and for queries. The main thing to remember when
designing screen layouts is not to fill the screen too full. You should also make sure that the sequence of
entering data is logical and that, if there is more than one screen, it is easy to move between them.
Let us consider a form that will allow us to create a new entry in DELNOTE which has the attributes Num,
CustName, City. Num is the key and, therefore, it should be created by the database system. Fig. 3.6.d.1
shows a suitable form.
[Data entry form for a new delivery note: Num is entered by the system; Country appears automatically when City is completed, if the city is already in the database; drop-down lists can be used to complete the boxes.]
Fig. 3.6.d.1
With this form, if a new City is input the user can input the Country and the City_Country table will be
updated. If the City exists in the database, then Country will appear automatically.
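A minimal sketch of the lookup that lies behind this behaviour, assuming the CITY_COUNTRY table from the earlier sqlite3 example and an open connection conn, might be:

    def country_for_city(conn, city):
        """Return the stored country for a city, or None if the city is not yet known."""
        row = conn.execute(
            "SELECT Country FROM CITY_COUNTRY WHERE City = ?", (city,)
        ).fetchone()
        return row[0] if row else None

    def record_new_city(conn, city, country):
        """Called when the user has had to type the country because the city was new."""
        conn.execute(
            "INSERT OR IGNORE INTO CITY_COUNTRY (City, Country) VALUES (?, ?)",
            (city, country),
        )
        conn.commit()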
Now let us design a form to allow a user to input a customer's order. In this case we shall need to identify
the customer before entering the order details. This is best done by entering the customer's ID. However,
this is not always known. An alternative, in this case, is to enter the customer's name. The data entry form
should allow us to enter either of these pieces of data and the rest of the details should appear automatically
as a verification that the correct customer has been chosen. Fig. 3.6.d.2 shows a form that is in two parts.
The upper part is used to identify the customer and the lower part allows us to enter each item of data that is
on the customer's order.
[Order entry form: in the upper part (customer details) the user enters either the customer's number or the customer's name, and the other three boxes are then completed automatically by the system; the lower part holds the order details.]
Fig. 3.6.d.2
Notice how certain boxes are automatically completed. Also, because the form requires a customer ID
(Number), orders can only be taken for customers whose details are on the database. This ensures the entry
of customer details before an order can be entered. In order to be consistent, the positions of the boxes for
customer details are the same on the Order Entry form as on the Add New Delivery Note form. It is usual for
both these forms to be password protected. This ensures that only authorised personnel can enter data into
the database.
This is a very simple example. Suppose the customer's ID is not known. We have seen one way of
overcoming this which satisfies the needs of the problem given. Some systems allow the post code to be
entered in order to identify the address. In this case, the street, town and county details are displayed and the
user is asked for the house number. Other systems allow the user to enter a dummy ID such as 0000 and
then a list of customers appears from which the user can choose a name. Alternatively, part of the name can
be entered and then a short list of possible names is displayed. Again the user can choose from this list.
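A sketch of the "short list from part of a name" idea, assuming a hypothetical CUSTOMER(CustomerID, Name, …) table held in an SQLite database, might look like this:

    def find_customers(conn, partial_name):
        """Return a short list of (CustomerID, Name) pairs for the user to choose from."""
        return conn.execute(
            "SELECT CustomerID, Name FROM CUSTOMER "
            "WHERE Name LIKE ? ORDER BY Name LIMIT 10",
            (f"%{partial_name}%",),
        ).fetchall()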
Deletion and modification screens are similar, but must be password protected as before so that only
authorised personnel can change the database.
A query screen should not allow the user to change the database. Also, users should only be allowed to see
what they are entitled to see. To see how this may work, let us consider a query requesting the details of all
the cinemas in our second example. The view presented to the users will give details of cinema names,
locations, manager names and film names as shown in Fig. 3.6.d.3.
Cinema Location Manager Film
Odeon Croyden Smith Jaws
Embassy Osney Smith Jaws
Palace Lye Jones Jaws
Odeon Croyden Smith Tomb Raider
Embassy Osney Smith Tomb Raider
Palace Lye Jones Tomb Raider
Classic Sutton Allen Tomb Raider
Roxy Longden Allen Tomb Raider
Odeon Croyden Smith Cats & Dogs
Odeon Sutton Allen Cats & Dogs
Odeon Croyden Smith Colditz
Roxy Longden Allen Colditz
Fig. 3.6.d.3
However, another user may be given the view shown in Fig. 3.6.d.4.
Fig. 3.6.d.4
Another user may be given all the details, including the cinema and manager IDs. Notice that the columns
do not have to have the same names as the attributes in the database. This means that these names can be
made more user friendly.
In order to create the query a user will normally be presented with a data entry form. This form may contain
default values, as shown in Fig. 3.6.d.5, which allows a user to list cinemas that have takings for films
between set limits. This form allows users to choose all the films, all the cinemas and all the locations, or to be
more selective by choosing from drop down lists. When the user clicks the OK button a table, such as those
given above, will appear.
Fig. 3.6.d.5
In this Figure, the boxes are initially completed with default values. In this case, if the OK button is clicked,
all cinemas and films would be listed. However, suppose we want to know which films at the Odeon,
Croyden took less than £400. The user could modify the boxes, using drop down lists, as shown in Fig.
3.6.d.6.
Fig. 3.6.d.6
When the OK button is clicked, a report like that shown in Fig. 3.6.d.7 would appear together with a button
allowing the user to print the results or return to the query form.
Fig. 3.6.d.7
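One possible way of turning such a form into a query is sketched below, assuming the FILM, CINEMA and TAKINGS tables from the cinema example are held in an SQLite database; the value "All" in a drop-down list simply means that no condition is applied to that field.

    def cinema_takings(conn, film="All", cinema="All", location="All",
                       min_takings=0, max_takings=10**9):
        """Query behind the form: 'All' means do not filter on that field."""
        sql = """SELECT c.Cname, c.Loc, f.Title, t.Takings
                   FROM TAKINGS t
                   JOIN FILM f   ON f.FID = t.FID
                   JOIN CINEMA c ON c.CID = t.CID
                  WHERE t.Takings BETWEEN ? AND ?"""
        params = [min_takings, max_takings]
        if film != "All":
            sql += " AND f.Title = ?"
            params.append(film)
        if cinema != "All":
            sql += " AND c.Cname = ?"
            params.append(cinema)
        if location != "All":
            sql += " AND c.Loc = ?"
            params.append(location)
        return conn.execute(sql, params).fetchall()

    # e.g. films at the Odeon, Croyden that took less than £400:
    # cinema_takings(conn, cinema="Odeon", location="Croyden", max_takings=399)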
3.6.e Advantages of Using a Relational Database (RDB)
Advantage – Notes
Control of data redundancy – Flat files have a great deal of data redundancy; this is removed by using an RDB.
Consistency of data – There is only one copy of the data, so there is less chance of data inconsistencies occurring.
Data sharing – The data belongs to the whole organisation, not to individual departments.
More information – Data sharing by departments means that departments can use other departments' data to find information.
Improved data integrity – Data is consistent and valid.
Improved security – The database administrator (DBA) can define data security – who has access to what. This is enforced by the Database Management System (DBMS).
Enforcement of standards – The DBA can set and enforce standards. These may be departmental, organisational, national or international.
Economy of scale – Centralisation means that it is possible to economise on size. One very large computer can be used with dumb terminals, or a network of computers can be used.
Improved data accessibility – This is because data is shared.
Increased productivity – The DBMS provides file handling processes instead of each application having to have its own procedures.
Improved maintenance – Changes to the database do not cause applications to be re-written.
Improved back-up and recovery – DBMSs automatically handle back-up and recovery. There is no need for somebody to remember to back up the database each day, week or month.
The key used to uniquely identify a tuple is called the primary key.
In some cases more than one attribute, or group of attributes, could act as the primary key. Suppose we have
the relation
Clearly, EmpID could act as the primary key. However, NINumber could also act as the primary key as it is
unique for each employee. In this case we say that EmpID and NINumber are candidate keys. If we choose
EmpID as the primary key, then NINumber is called a secondary key.
We see that MID occurs in CINEMA and is the primary key in MANAGER. In CINEMA we say that MID
is the foreign key.
An attribute is a foreign key in a relation if it is the primary key in another relation. Foreign keys are used to
link relations.
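The sketch below brings the employee and cinema examples together as SQL table definitions (written here with Python's sqlite3 module); the data types are assumptions, but it shows a primary key, a secondary key declared as UNIQUE, and a foreign key.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE EMPLOYEE (
        EmpID    TEXT PRIMARY KEY,       -- the candidate key chosen as the primary key
        NINumber TEXT NOT NULL UNIQUE,   -- the other candidate key becomes a secondary key
        Name     TEXT
    );
    CREATE TABLE MANAGER (
        MID   TEXT PRIMARY KEY,
        MName TEXT NOT NULL
    );
    CREATE TABLE CINEMA (
        CID   TEXT PRIMARY KEY,
        Cname TEXT,
        Loc   TEXT,
        MID   TEXT REFERENCES MANAGER(MID)   -- foreign key: MID is the primary key of MANAGER
    );
    """)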
Similarly, while a database system is checking stock for re-ordering purposes, the POS terminals will not be
able to use the database as each sale would change the stock levels. Incidentally, there are ways in which
the POS terminals could still operate. One is to only use the database for querying prices and to create a
transaction file of sales which can be used later to update the database.
It is often important that users have restricted views of the database. Consider a large hospital that has a
large network of computers. There are terminals in reception, on the wards and in consulting rooms. All the
terminals have access to the patient database which contains details of the patients' names and addresses,
drugs to be administered and details of patients' illnesses.
It is important that when a patient registers at reception the receptionist can check the patient's name and
address. However, the receptionist should not have access to the drugs to be administered nor to the
patient's medical history. This can be done by means of passwords. That is, the receptionists' passwords
will only allow access to the information to which receptionists are entitled. When a receptionist logs onto
the network the DBMS will check the password and will ensure that the receptionist can only access the
appropriate data.
Now the terminals on the wards will be used by nurses who will need to see what drugs are to be
administered. Therefore nurses should have access to the same data as the receptionists and to the
information about the drugs to be given. However, they may not have access to the patients' medical
histories. This can be achieved by giving nurses a different password to the receptionists. In this case the
DBMS will recognise the different password and give a higher level of access to the nurses than to the
receptionists.
Finally, the consultants will want to access all the data. This can be done by giving them another password.
All three categories of user of the database, receptionist, nurse and consultant, must only be allowed to see
the data that is needed by them to do their job.
So far we have only mentioned the use of passwords to give levels of security. However, suppose two
consultants are discussing a case as they walk through reception. Now suppose they want to see a patient's
record. Both consultants have the right to see all the data that is in the database but the terminal is in a
public place and patients and receptionists can see the screen. This means that, even if the consultants enter
the correct password, the system should not allow them to access all the data.
This can be achieved by the DBMS noting the address of the terminal and, because the terminal is not in the
right place, refusing to supply the data requested. This is a hardware method of preventing access. All
terminals have a unique address on their network cards. This means that the DBMS can decide which data
can be supplied to a terminal.
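A very simple sketch of how a DBMS might combine the password (role) check with the terminal address check is given below; the role names, field names and terminal identifiers are purely illustrative.

    # Sketch: the user's role comes from the password/login, the terminal address from the
    # network card. Both are checked before any fields are released.
    ROLE_FIELDS = {
        "receptionist": {"name", "address"},
        "nurse":        {"name", "address", "drugs"},
        "consultant":   {"name", "address", "drugs", "history"},
    }
    PUBLIC_TERMINALS = {"RECEPTION-01", "RECEPTION-02"}   # terminals in public areas

    def allowed_fields(role, terminal_id):
        fields = ROLE_FIELDS.get(role, set())
        if terminal_id in PUBLIC_TERMINALS:
            # even a consultant at a reception terminal only sees reception-level data
            fields = fields & ROLE_FIELDS["receptionist"]
        return fields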
3.6.h Database Management System (DBMS)
Let us first look at the architecture of a DBMS as shown in Fig. 3.6.h.1.
EXTERNAL LEVEL – User 1, User 2, User 3 (the individual users' views)
CONCEPTUAL LEVEL – Company level (the integration of all the user views)
INTERNAL LEVEL – Disk/file organisation (the storage view)
Fig. 3.6.h.1
At the external level there are many different views of the data. Each view consists of an abstract
representation of part of the total database. Application programs will use a data manipulation language
(DML) to create these views.
At the conceptual level there is one view of the data. This view is an abstract representation of the whole
database.
The internal view of the data occurs at the internal level. This view represents the total database as actually
stored. It is at this level that the data is organised into random access, indexed and fully indexed files. This
is hidden from the user by the DBMS.
The DBMS contains a data definition language (DDL). The DDL is used, by the database designer, to
define the tables of the database. It allows the designer to specify the data types and structures and any
constraints on the data. The Structured Query Language (SQL) contains facilities to do this. A DBMS such
as Microsoft Access allows the user to avoid direct use of a DDL by presenting the user with a design view
in which the tables are defined.
The DDL cannot be used to manipulate the data. When a set of DDL instructions is compiled, tables
are created that hold data about the data in the database. That is, they hold information about the data types of
attributes, the attributes in a relation and any validation checks that may be required. Data about data is
called meta-data. These tables are stored in the data dictionary that can be accessed by the DBMS to
validate data when input. The DBMS normally accesses the data dictionary when trying to retrieve data so
that it knows what to retrieve. The data dictionary contains tables that are in the same format as a relational
database. This means that the data can be queried and manipulated in the same way as any other data in a
database.
The other language used is the data manipulation language (DML). This language allows the user to insert,
update, delete, modify and retrieve data. SQL includes this language. Again, Access allows a user to avoid
directly using the DML by providing query by example (QBE) as was mentioned in Section 3.5.j.
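As a small illustration, the sketch below uses SQL statements (issued here from Python's sqlite3 module) to insert, update, retrieve and delete rows; the PRODUCT table and its contents are only an example.

    import sqlite3

    conn = sqlite3.connect(":memory:")   # stands in for the real database
    conn.execute("CREATE TABLE PRODUCT (ProdID INTEGER PRIMARY KEY, Description TEXT)")

    # insert a new row
    conn.execute("INSERT INTO PRODUCT (ProdID, Description) VALUES (?, ?)", (9, "Bookcase"))
    # update (modify) an existing row
    conn.execute("UPDATE PRODUCT SET Description = ? WHERE ProdID = ?", ("Tall bookcase", 9))
    # retrieve rows
    for prod_id, desc in conn.execute("SELECT ProdID, Description FROM PRODUCT"):
        print(prod_id, desc)
    # delete a row
    conn.execute("DELETE FROM PRODUCT WHERE ProdID = ?", (9,))
    conn.commit()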
Appendix: Designing Databases
Although not stated as part of the syllabus, students may find the following to be of value, particularly when
normalising a database.
The task is to create a database for this problem. In order to do this, you must first analyse the problem to
see what entities are involved. The easiest way to do this is to read the scenario again and to highlight the
nouns involved. These are usually the entities involved. This is done here.
A company employs engineers who service many products. A customer may own many products but a
customer's products are all serviced by the same engineer. When engineers service products they complete a
repair form, one form for each product repaired. The form contains details of the customer and the product
that has been repaired as well as the engineer's ID. Each form has a unique reference number. When an
engineer has repaired a product, the customer is given a copy of the repair form.
engineer
product
customer
repair form
Now we look for the relationships between the entities. These can usually be established by highlighting the
verbs as done here.
A company employs engineers who service many products. A customer may own many products but a
customer's products are all serviced by the same engineer. When engineers service products they complete a
repair form, one form for each product repaired. The form contains details of the customer and the product
that has been repaired as well as the Engineer's ID. Each form has a unique reference number. When a
repair is complete, the customer is given a copy of the repair form.
services completes
ENGINEER
serviced by
completed by
PRODUCT FORM
owned by given to
CUSTOMER
owns is given
There are two many-to-many relationships that must be changed to one-to-many relationships by using link
entities. This is shown below.
services completes
ENGINEER
ENG_PROD
serviced by completed by
PRODUCT FORM
owned by given to
CUST_PROD
CUSTOMER
owns is given
This suggests the following relations (not all the attributes are given).
ENGINEER(EngineerID, Name, … )
ENG_PROD(EngineerID, ProductID)
PRODUCT(ProductID, Description, … )
CUST_PROD(CustomerID, ProductID)
CUSTOMER(CustomerID, Name, … )
FORM(FormID, CustomerID, EngineerID, … )
These are in 3NF, but you should always check that they are.
Another useful diagram shows the life history of an entity. This simply shows what happens to an entity
during its life. An entity life history diagram is similar to a JSP diagram (see Section 3.5.c). The next
diagram shows the life history of an engineer in our previous problem.
[Entity life history diagram for Engineer: joins the Company, detail changes (*, repeated as often as necessary), leaves, record archived.]
This tells us
1. A new record for an engineer is created when an engineer joins the Company.
2. While the engineer is working for the Company, he/she may change their name, address or telephone
number as many times as is necessary (hence the use of the *).
3. When the engineer leaves the Company, the main life-cycle ends and the engineer's record is updated
to indicate they no longer work for the Company.
4. 12 months after the engineer has left the Company his/her record is archived and removed from the
database.
Q 1. A large organisation stores a significant amount of data. Describe the advantages of storing this data
in a database rather than in a series of flat files. [4]
A. - All the data is stored on the same structure and can therefore be accessed through it.
- The data is not duplicated across different files, consequently…
- there is less danger of data integrity being compromised.
- Data manipulation/input can be achieved more quickly as there is only one copy of each
piece of data.
- With separate flat files, the files need to be compatible with one another; this problem does not arise with a database.
Q 3. Every student in a school belongs to a form. Every form has a form tutor and all the form tutors are
members of the teaching body.
Draw an entity relationship diagram to show the relationships between the four entities STUDENT,
FORM, TUTOR, TEACHERS [6]
A.
[E-R diagram linking STUDENT to FORM, FORM to TUTOR and TUTOR to TEACHERS.]
This automatic stock reordering has two cost effects. First it means that the organisation should rarely run
out of stock which causes a loss of sales and, hence, loss of income. It also means that the organisation
should not need to store large quantities of stock which leads to high inventory costs.
If the organisation also keeps data showing the rates of sales of products, the system can recognise changes
in these rates and so change its ordering patterns.
Thus, data about products in stock and rates of sales are valuable as they improve the profitability of the
organisation.
In order for data to be of value they must be accurate and up-to-date. Often data are inaccurate due to them
not being frequently updated. If the sales figures are only used once a week to update the stock database, the
stock levels are soon out of date and the data have little value.
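A sketch of the kind of rule such a system might apply is given below; the idea of a reorder point based on the rate of sales and the delivery lead time is a common one, but the exact rule used by any particular system is an assumption here.

    def needs_reorder(stock_level, daily_sales_rate, lead_time_days, safety_stock=0):
        """Reorder when current stock would run out before a new delivery could arrive."""
        reorder_point = daily_sales_rate * lead_time_days + safety_stock
        return stock_level <= reorder_point

    # e.g. 120 units in stock, selling 30 a day, 5-day delivery time:
    # needs_reorder(120, 30, 5)  -> True, so an order is generated automatically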
These days banks offer services other than banking. They offer mortgages, insurance and business support.
If a bank is considering a loan, it is important that the bank is aware of the risks involved. Keeping data
about previous borrowers, such as age, income and social background, and comparing the data for a
potential new borrower with the historical data can help to determine whether or not to make the loan. This
is often done using artificial intelligence (AI) techniques and leads to fewer people reneging on their loan.
Thus, the data used is very valuable to the bank.
Another example is of an international company that has run two advertising campaigns in two different
countries. The one was much more successful than the other. It is important that the company keeps data
about the two campaigns in order to determine why the one campaign was more successful than the other.
This will lead to better sales campaigns in the future, improving the profitability of the company.
However, how does the senior executive in one country know what is happening in other countries? Modern
companies keep databases that can be accessed on a world-wide basis. In order to do this, value added
network services (VANS) are used. These simplify the exchange of data between users of the service by
using computer networks.
In these systems, users plug into the interface provided by the VANS operating company and the software
does everything else. A VANS may operate in a single company or may be of use to several companies.
For example, estate agents may share a VANS in order to match potential buyers with sellers over a much
wider area than is possible if each estate agent only has access to their own data. This system is also used by
solicitors having access to local authority databases for conveyancing purposes. Eventually VANS will
operate on a world-wide basis. Thus data that was only of value to a small number of users is now of value
to many more. This means that the data have increased in value.
One of the problems with so much data being available is trying to sift the data for useful information. This
is often achieved using data mining techniques. A lot of work is going on to develop sophisticated
data mining software which looks for patterns in vast quantities of data.
The ability to sift through data to find such patterns can lead to much better targeting of customers, with the
result that there are better returns on investments.
A great deal of work is being done on data mining as many companies can make use of the results. Indeed,
some companies sell lists of people who may be valuable customers to other companies.
3.7.b Standardisation
In order to be able to share data successfully some form of standardisation is needed so that users can send,
receive and interpret the data correctly. Some mention of this was made in Chapter 1.6 in the AS text.
Some typical standards used for files are given below.
Text files These are used to hold characters represented by the ASCII code. Text files are used to transfer
data between application packages. The data consists of individual characters and there is no formatting
applied to the characters.
Comma Separated Variable files are used to transfer tabular data between applications. Each field is
separated by a comma.
Tab Separated Variable files are used to transfer tabular data between applications. Each field is
separated by a tab character.
Standard Interchangeable Data files are used to transfer tabular data between applications. They are not
common outside the UK education market.
Rich Text Format files are a complex format used to store data from a word processor. They include
information about fonts, sizes, colour and styles.
Picture files These are used to represent pictures in digital format. There are many different formats
such as BMP (bit mapped), JPEG (Joint Photographic Experts Group), GIF (Graphics Interchange Format) and
MPEG (Moving Picture Experts Group). JPEG and MPEG involve compression techniques. It is these
techniques that allow pictures to be quickly transferred over the Internet. MPEG has also allowed the
introduction of many more television channels.
Sound files As with picture files, there are many different formats that store sound in digital form. WAV
files are common on PCs. Storing sound requires a great deal of memory. CDs sample at the rate of 44,100
samples/sec and DVD (Digital Versatile Disk) at 96,000 samples/sec. Thus 3 minutes of music requires 3 x
60 x 96,000 = 17,280,000 samples which, at one byte per sample, is roughly 16 Mbytes. (Current DVDs can
hold 4.3 Gbytes, or about 13 hours of music at this rate.)
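The arithmetic can be checked with a few lines of Python; the one-byte-per-sample figure is the assumption that makes the estimate above come out.

    samples = 3 * 60 * 96_000            # 17,280,000 samples in 3 minutes
    bytes_per_sample = 1                 # assumption behind the figure in the text
    size_mbytes = samples * bytes_per_sample / (1024 * 1024)
    print(round(size_mbytes, 1), "Mbytes")   # about 16.5, i.e. roughly 16 Mbytes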
Without standards there would be a proliferation of formats and it would not be possible to move data
electronically. Not only must file formats be standardised but also communication methods. For example,
if two computers need to communicate, it is essential that both are sending and receiving data in the same
format. It is useless if one computer sends in one format and the other is expecting the data in a different
format. As communications are world-wide and there are a multitude of computer manufacturers, it is
essential that standards are set for consistency.
The method of transferring data over a wide area is usually by means of ISDN (integrated services digital
network) connections. ISDN is used by telephone companies to connect digital exchanges. Most homes use
analogue connections to the local exchange but after that ISDN is used as shown in Fig. 3.7.b.1.
Digital links
Digital
Exchange
Digital
Exchange
Digital
Exchange
User
Fig. 3.7.b.1
ISDN has a standard format that is used world-wide. There are two standard ISDN services known as
primary rate access (ISDN 30) and basic rate access (ISDN 2). The difference is the number of channels and
the methods used to deliver the services to the user.
ISDN 2 will probably be used by most small business and individual users. It provides two channels at
64kbps (B-channels) and a signalling channel of 16kbps (D-channel). The three channels are multiplexed
onto a single communications medium. Data is packaged into frames, one type for transmission from the
network to the terminal and the other for transmission from the terminal to the network. Each type of frame
consists of 48 bits that have to be in a prescribed order. This ensures that the data can be reassembled
properly when received. This system can use the same wires as in current telephone networks.
ISDN 30 provides 30 B-channels and one D-channel. It is used by large customers and is usually delivered
by fibre optics. Its operation is basically the same as ISDN 2 but 30 channels are multiplexed instead of
two.
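The channel capacities quoted above can be summarised with a little arithmetic (the D-channel rate for ISDN 30 is not given in the text, so only the B-channels are counted for it):

    B_CHANNEL_KBPS = 64
    D_CHANNEL_KBPS = 16                        # signalling channel on basic rate access

    isdn2_user_data  = 2 * B_CHANNEL_KBPS      # 128 kbps of user data on ISDN 2
    isdn2_total      = isdn2_user_data + D_CHANNEL_KBPS   # 144 kbps multiplexed onto one line
    isdn30_user_data = 30 * B_CHANNEL_KBPS     # 1920 kbps of user data on ISDN 30
    print(isdn2_user_data, isdn2_total, isdn30_user_data)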
In order that data are understood when received, it is not sufficient to package data into a format that can be
sent along ISDN connections. The data may represent sound, pictures, text or many other things. It is
necessary to package this data into some standard format first. The standard used is Open Systems
Interconnection Reference Model usually simply called the OSI model.
Voice mail digitises spoken messages and stores them on disk. When recipients access the messages they
are converted back into sound.
Digital telephone systems provide many facilities. Because computers can maintain very large databases, it
is possible for users to have itemised bills, recall stored numbers and to have accurate timing of calls.
Although itemised bills can be sent out on a regular basis, users can, using the Internet, access their own
accounts at any time and see what calls they have made and the costs of these calls. These systems also
allow the use of voicemail. Mobile phones rely heavily on computers to route calls.
Electronic commerce (e-commerce) is becoming more popular. It is quite common to order goods over the
Internet. Many companies use computers to maintain large databases that can be queried by customers
online who may then place orders. An extension of this is EDI (electronic data interchange). EDI allows
users to send and receive order details and invoices electronically. It differs from email in that the data is
highly structured into fields such as sender's name, recipient's name, order number, quantity, product code,
whereas email is completely unstructured in that it is simply text. Fig. 3.7.c.1 shows how this works. Many
companies insist on using this method of ordering and invoicing.
[Order and payment details flowing electronically between retailer and customer.]
Fig. 3.7.c.1
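The contrast between structured EDI data and unstructured e-mail text can be sketched as follows; the field names are illustrative, not those of any real EDI standard.

    # Sketch contrasting an unstructured e-mail body with a structured EDI-style record.
    from dataclasses import dataclass

    email_body = "Please send us 40 of product TB-100, order 7741. Thanks, J. Smith"

    @dataclass
    class OrderMessage:
        sender: str
        recipient: str
        order_number: str
        product_code: str
        quantity: int

    order = OrderMessage(sender="J. Smith Ltd", recipient="Retailer plc",
                         order_number="7741", product_code="TB-100", quantity=40)
    # Because every field has a fixed meaning, the receiving system can process
    # `order` automatically, whereas `email_body` would need a person to read it.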
Teleconferencing allows a group of people to communicate, throughout the world, simultaneously using
telephones and group email. Video conferencing is similar to teleconferencing plus the ability of users to
see one another. These methods of communication have reduced travel costs as meetings can be held
without people leaving their desks. Originally, special rooms were required for videoconferencing. This is
no longer necessary as videoconferencing can now be done using standard PCs and a video camera. In this
system whiteboards can be used to produce drawings that can be transmitted electronically.
Repeaters can be used to connect two segments of a network. It repeats data from one segment to another,
enhancing the signal, as shown in Fig. 3.7.d.1. Repeaters do not segment a network and do not partition a
network into sub-networks. They simply extend a network.
Repeater
Fig. 3.7.d.1
Hubs are used to connect computers together. Fig. 3.7.d.2 shows how hubs may be used to connect
computers that are in different rooms in a building.
[A main hub linked to further hubs, which connect to the individual computers in each room.]
Hubs also act as repeaters by regenerating signals from one computer to all other computers (and possibly
other devices) connected to the hub. This allows longer lengths of cable to be used.
Newer technology replaces hubs with switches. This allows greater speed because each station is switched
in and thus has full network speed. Switches 'learn' which connections are required and join the
corresponding ends. Fig. 3.7.d.3 shows the use of a switch. If, at the same time, Station 1 wishes to
communicate with Server 1, Station 2 with Server 2 and Station 3 with Server 3, this is possible as the
switch will set up three independent paths. This means that data can flow at maximum speed along each as
the system will be treated as three independent circuits.
Switch
Fig. 3.7.d.3
We shall now look at how the facilities just described can be used to create practical networks.
Example 1
A team of programmers all have their own PCs which they use to create programs. The team work on the
same projects and share code. A suitable network is shown in Fig. 3.7.d.4. There is no need to use a switch
as files tend to be small (usually text) and traffic is relatively light. Also, the data will need to travel
between all stations.
Hub
Station 4 Station 5
Server
Fig.3.7.d.4
Example 2
A group of people work together to produce a catalogue and price list for a large company. The catalogue
consists of about 500 pages, each of which may have up to 15 colour pictures. Each picture has a
description of the item and its price. A separate booklet contains just the item codes, brief descriptions and
the prices. The pictures are held on a large server and all text descriptions to be used in the catalogue and in
the price list are on a smaller server.
This is a case where a switch is more appropriate than a hub as picture files can be very large. Also, the
people creating pictures will usually be connected to the large server while those creating the price list will
mainly need access to the smaller server. A possible solution is shown in Fig. 3.7.d.5.
Switch
An alternative solution could use both a switch and a hub. This would allow us to create two linked
networks, one that mainly carries pictures and one that mainly carries text.
Example 3
A primary school has two small computer rooms, near to one another, and a single server. A suitable
network is shown in Fig. 3.7.d.6.
Repeater
Fig. 3.7.d.6
If the computers are to be spread around the school, instead of all in two rooms, and if access to the Internet
is required from all stations, it would be better to use either a switch or a hub.
Another area of expansion is in providing information. For example, medical advances can be posted on the
WWW that can then be accessed world-wide. Indeed, doctors can request advice using the WWW.
The use of the Internet by media reporters can mean that news can be quickly updated and that information
is in electronic form. This means that it can be manipulated for use on other media.
Estate agents can set up sites that enable them to sell property throughout the world. The applications are
endless and you should keep abreast of modern developments as they are published in the media. It is also
worth reading Business @ the Speed of Thought by Bill Gates published by Penguin Books (ISBN
0140283110).
3.7.f Training
Training in the use of IT is essential if users are to make the best use of it. Young people are growing up in
an IT environment and receive basic training in its use. However, older generations find using IT daunting
and need careful and appropriate training. This may be as simple as switching on a PC and loading software
or may involve the use of particular packages. In the latter case, the packages taught need to be pertinent to
the jobs carried out by the learner. It is very easy to alienate learners by teaching them how to use software
facilities that they will never use.
It is also important that courses provide sufficient time for the learners to practise new skills and to be
provided with sufficient notes to enable them to redo tasks, set during the course, at a later date. Online help
is not enough; most people prefer to have their notes in printed form. This is because they need to look at
their work and their notes at the same time. Adjusting the size of windows so that the work and the notes
are both on the screen at the same time is often unsatisfactory. Also, learners like to flick back and forth
through their notes and this is much easier when the notes are on paper.
IT is an ever-changing subject, which means that users continually need retraining. Application packages
are continually being upgraded and new applications are being created daily.
IT is changing the way things are done all the time. Robots weld cars; what is to be done with the people
who used to do the welding? They will have to be retrained to do a different job. Bank clerks used to add
up columns of figures, now they press keys on a keyboard. However, they are now expected to provide new
services to the customer other than handling cash and cheques. They have had to be retrained as sales
persons as banks now sell mortgages, insurance and other services.
Organisations are setting up help desks for customers to contact when they have a query. At present, most
of these help desks involve large numbers of people. In future a lot of this help will be provided
electronically by means of databases that hold data about frequently asked questions (FAQs). This means
that the operators of the help desks will have to be retrained to create these databases.
Training in the use of IT is not sufficient in itself. Employees can be trained to use email but also need
training in how email can be used to enhance their work. Instead of groups of workers meeting, say, once a
week, the workers can keep one another informed of progress when it happens. This means that all workers
on a project know the current stage of development of that project. This speeds up the work. However,
training is needed in these new working methods, particularly to prevent an overloading of email
communications.
A similar example is that of selling double-glazing. At one time someone went to the customer's house and
measured all the windows. The next step was to go back to the office and prepare a quotation which was
then sent to the customer. Now, the sales person can use a laptop, with suitable software, to prepare a
quotation on the spot.
It is quite common for people to work on a project in the office, email it home and continue working on it
later at home.
Like banks, factories have seen major changes in working patterns. Fewer people are needed in the
assembly process but more technicians are needed to maintain the automated plant.
Office personnel use computers to produce invoices using databases of orders, delivery notes and customers.
The company payroll is fully computerised with money being transferred electronically from the employer's
bank to employees' banks. No longer do wages clerks have to calculate wages and count money into pay
packets.
Hotel receptionists have access to a database for all the hotels in a group. This means that they can now
book hotels for customers other than the one in which they work.
Staff in stores only take stock a few times a year instead of weekly. Stock levels are kept on computer
databases and need to be checked occasionally in case stock is removed without passing through point-of-
sale terminals. (This may be due to products being damaged or stolen.)
Teachers and lecturers often set assignments using computer networks. Students then post their work to
their tutors electronically. Tutors view the work on screen and return the marked work, with comments,
electronically.
People expect much higher quality in documents, whether it be posters or letters. Students expect teaching
materials to be of a higher standard. This book has been produced in electronic format so that you can read
it on a screen and print it off for later use. This means that your school or college only has to have one copy
of the book and it can be shared using a computer network.
Products can be manufactured to a much higher standard because of the use of computerised machines and
robots. This increase in accuracy has led to an increase in quality. Self-assembly furniture is easier to put
together because the parts are made more accurately. Children's building toys look much better because the
components are more accurately made and are of better, more consistent, quality.
This increase in quality has led to fewer faults in end products such as motor cars. This means that, in the
case of motor cars, mechanics spend more time servicing vehicles and less time correcting errors in
manufacture. However, the increase in quality has also led to a reduction in the need to service motor
vehicles.
3.7 Example Questions
Example questions are not offered here as all the work is either a repeat of previously visited work or is
based on definitions that can be taken straight from the text.
As all the examiner comments have already been made, any content here would simply be a repeat of what
has gone before.
Chapter 3.8
Systems Development, Implementation, Management and Applications
Another useful technique is the Structured Systems Analysis and Design Method (SSADM). Fig. 3.8.a.1
shows the stages involved when using SSADM.
Feasibility Study
Requirements Analysis
Requirements Specification
Physical Design
Fig. 3.8.a.1
These steps have been explained, informally, in Chapter 1.7. In this Section we shall show how diagrams
can help the development of the stages shown in Fig. 3.8.a.1.
You do not need to know all the techniques of SSADM but Data Flow Diagrams (DFDs) are important.
DFDs provide a graphic representation of information flowing through a system. The system may be
manual, computerised or a mixture of both. A DFD:
is a simple technique;
is easy to understand by users, analysts and programmers;
gives an overview of the system;
is a good design aid;
can act as a checking device;
clearly specifies communications in a system;
ensures quality.
DFDs use only four symbols. These are shown in Fig. 3.8.a.2.
Fig. 3.8.a.2
All names used should be meaningful to the users, whether they are computer literate or not.
The steps to be taken when developing DFDs are given in Table 3.8.a.1.
Step Notes
1. Identify dataflows. e.g. documents, VDU screens, phone messages.
2. Identify the external entities. e.g. Customer, Supplier
3. Identify functional areas. e.g. Departments, individuals.
4. Identify data paths. Identify the paths taken by the dataflows identified in step 1.
5. Agree the system boundary. What is inside the system and what is not.
6. Identify the processes. e.g. Production of invoices, delivery notes, payroll production.
7. Identify data stores. Determine which data are to be stored and where.
8. Identify interactions. Identify the interaction between the data stores and the processes.
9. Validate the DFD. Check that meaningful names have been used.
Check that all processes have data flows entering and leaving.
Check with the user that the diagram represents what is
happening now or what is required.
10. Fill in the details.
Table 3.8.a.1
Fig. 3.8.a.3 shows the different levels that can be used in DFDs.
0 Payroll System
Level 1 (top level): 1 Get hours worked   2 Calculate wages   3 Produce wage slips
Level 2 (lower level): 2.1 Validate data   2.2 Calculate gross wage   2.3 Calculate deductions   2.4 Calculate net wage
Level 3 (not always needed)
Fig. 3.8.a.3
A hotel reception receives a large number of enquiries each day about the availability of accommodation.
Most of these are by telephone. It also receives confirmation of bookings. These are entered onto a
computer database.
While a guest is resident in the hotel, any expenses incurred by the guest are entered into the database by the
appropriate personnel. If guests purchase items from the bar or restaurant, they have to sign a bill which is
passed to a receptionist who enters the details into the database.
When guests leave the hotel they are given an invoice detailing all expenditure. When they pay, the
database is updated and a receipt is issued.
The flow of data in this system is shown in Fig. 3.8.a.4.
[DFD showing the Customer external entity, enquiry and bill data flows, and processes such as 1 Restaurant passing food and drinks bills into the system's data stores D1 and D2.]
Fig. 3.8.a.4
The symbol for Customer, an external entity, has a diagonal line to indicate that it occurs more than once.
This does not mean that these symbols represent different customers. It is simply used to make the diagram
clearer. Without this, there would be too many flow lines between this symbol and the internal processes.
Data stores may also be duplicated. This is done by having a double vertical line on the left hand side as
shown in Fig. 3.8.a.5.
M1 Customer data
Fig. 3.8.a.5
Notice this data store is numbered M1 whereas those in Fig. 3.8.a.4 were numbered D1 and D2. In data
stores, M indicates a manual store and D indicates a computer based data store. Also, there can be no data
flows between an external entity and a data store. The flow of data from, or to, an external entity must be
between the external entity and a process.
The example of the hotel system only shows one external entity, the customer. Usually there is more than
one external entity. Suppose we are dealing with a mail order company. Clearly, one external entity is the
customer. However, another is the supplier. Note that although there is more than one customer and more
than one supplier, in the diagram they are written in the singular.
3.8.b The Purpose of Documentation
It is important that the design of a solution is accurate BEFORE it is implemented. For the design to be
accurate, the original specification must be accurate. This means that, when we are asked to produce a
solution, we must make sure that we thoroughly understand the problem. This can only be achieved by
checking our analysis with those who are going to use our system. Users are not usually technical people
and so we require simple ways of showing our understanding of the problem. One of the best ways of doing
this is to use diagrams.
An E-R diagram shows the relationships between entities so that we can check these relationships with the
end user. These diagrams help us to ask questions like
Thus, we can ensure that the relationships are correct. We can also ensure that we have included all the
entities.
Similarly, our Data Flow Diagrams (DFDs) show how data is moving through the organization and we can
ask questions like
Once we are sure that our analysis is complete and accurate, by continually checking it with the end user, we
can move on to the design stage. At the design stage the diagrams, so far produced, will be modified to
show exactly what we are going to produce. This may be different from the analysis due to unforeseen
constraints. However, as they should be very similar to those produced at the analysis stage, we can check
that we have included everything required by the user. Indeed we should go back to the user to validate our
design.
We can now design the user interfaces and check that they will allow the user to input the data required and
output the results expected. Again these can be checked with the user. A good way of doing this is to
produce a prototype of your solution. This will appear to work but, in fact, it only shows how the interfaces
work. There is no code behind the interfaces to manipulate the data. This can cause problems, as often
users think that they are seeing the real thing when there are still many months of work needed to produce the
final solution. When the interfaces have been designed they can also be checked against the E-R and data
flow diagrams.
This continual cross-checking with previous steps is very important. Fig. 3.8.b.1 shows that, as each stage is
developed, it is checked against all previous stages. This continual validation process is essential if we are
to reduce the cost of maintenance due to errors and omissions. The careful documentation also helps to
maintain a piece of software when it is to be upgraded by adding extra facilities.
User Request
Initial Study
Feasibility
Study
Systems
Analysis
Systems
Design
Implementation
Change Over
Evaluation
and
Maintenance
Fig. 3.8.b.1
To answer the second question first, “None.” This answer is a little bald, but strictly true. An
examination paper cannot ask for the specification for a PC because everyone will have their own idea what
specification will be appropriate and, anyway, the answer that the examiner has put on the mark scheme will
be out of date by the time the examination is marked. Strictly, there is no right answer to such a question.
So the requirements of the syllabus come down to the first question – What specifications are important to
allow the system to perform the operations expected of it?
The speed of the processor is simply a measurement of the number of operations that are possible every
second. This is being typed, using Word word processing software, on a computer with a 1.33 Megahertz
processor. If I had a 1.33 Gigahertz processor (1000 times faster) it would make no difference, I still can’t
type any faster. So the speed of the processor is largely irrelevant to this particular task. My neighbour uses
her computer in order to edit video material as part of a service that she offers to local industries. My
computer would simply not be able to process the data quickly enough to produce satisfactory images
without considerable jerking of the picture, in her case she needs the faster processor. A student uses their
computer to produce essays for their English course, whilst another produces high quality colour pictures for
an Art course. The first student is storing relatively small text files while the other student needs to
store large quantities of data, my neighbour’s digitised video is going to need all of the 70 Gigabyte hard
drive that she has bought.
The English student will need to have a CDROM drive in order to load software to use. The Art student
decides that the images are too valuable to lose and opts for a DVD drive to act as a back up storage to the
hard drive. My neighbour has clients who want large numbers of copies of the videos that she produces on
CD’s to send to clients so she has invested in a CD writer, a fast one so that it takes less time to produce
each copy.
The English student has invested in an ink jet printer with a separate black cartridge because most of the
work will be in black and white. The artist has a simple one cartridge printer because black will not be
needed very often.
From these examples, hopefully, the student can understand that the important thing is the need to satisfy the
needs of the application rather than the need for the student to memorise large quantities of data about
system specifications which would be out of date very quickly anyway.
Students will be expected to be able to determine sensible types of hardware and software for particular
applications dependent upon their characteristics, just as they were expected to do in the AS part of the
syllabus.
The attention of students is especially directed to Chapter 1.7 the systems development life cycle, from the
AS work, and also to the notes on module 4, the project, for references to how to implement a system.
With a number of tasks, some may be possible at the same time, while with others it becomes important to
do them in the correct order.
It may also be important to work out how long the project should take to complete.
Major projects like this can be represented graphically to show the different tasks and how they join
together, or relate to each other.
Take, as an example, the major project of building a bungalow. It can be divided into a number of tasks.
A Concreting the foundations takes 4 days
B Building the walls takes 4 days
C Making the doors and windows takes 7 days
D Tiling the roof takes 2 days
E Installing the plumbing takes 3 days
F Doing the interior carpentry takes 4 days
G Installing the electrics takes 6 days
H Decorating takes 5 days
One way of deciding how long the bungalow takes to build is to add up all the separate times, 35 days. On
the other hand they are separate jobs so as long as enough people are working, it will only take as long as the
longest task, 7 days. This is silly, the decorating can’t be done before the roof is on!
The real time for the project is somewhere between 7 and 35 days.
[Gantt-style chart: tasks A to H plotted against a time scale of 0 to 24 days, showing when each task can start, when it must end, and which tasks overlap.]
Fig. 3.8.f.1
Fig. 3.8.f.1 shows when all the different tasks can start and when they must end.
It shows which can overlap, and by how much.
Finally, it shows how long the project will take. (Your ideas of how a bungalow is built may be different,
but this is the method used by B and L Construction.)
Another type of graph might be similar to a flow diagram. The circles represent stages and the arrows the
tasks needed to reach that stage and the time, in days, needed to carry them out.
[Activity network: nodes 1 to 7 joined by arrows for tasks A to H, each arrow labelled with its duration in days.]
1 Start
2 Foundations finished
3 Start making the windows and doors
4 Walls finished
5 Roof finished
6 Interior finished
7 Bungalow finished
There are many routes through the diagram. The one that gives the time taken to complete the bungalow is
1 2 4 5 6(G) 7
A total of 21 days. This is the critical path. The bungalow cannot be built in a shorter time.
The arrows show the order in which the tasks must be carried out, so E (the plumbing) cannot be done
before B (building the walls), but can be done at the same time as tiling the roof (D). Each node can have an
optimum time worked out. Node 6 has an optimum time of 4+4+2+6 = 16 days. This is the time to be sure
of getting to node 6 whichever route you choose. Each node can also have a latest starting time before it
holds up another node. Node 1 must start immediately otherwise the walls (4) won’t be finished by day 8.
However, node 3 could start immediately or wait a day without affecting the rest.
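For those who like to check such diagrams by program, the sketch below finds the longest (critical) path through the network; the durations come from the text, but the exact arrows are an assumption reconstructed from the node descriptions.

    # Sketch: computing the critical path for the bungalow project.
    edges = [  # (from_node, to_node, task, days)
        (1, 2, "A foundations", 4),
        (1, 3, "start windows/doors", 0),
        (2, 4, "B walls", 4),
        (3, 4, "C windows and doors", 7),
        (4, 5, "D roof tiling", 2),
        (4, 6, "E plumbing", 3),
        (5, 6, "F carpentry", 4),
        (5, 6, "G electrics", 6),
        (6, 7, "H decorating", 5),
    ]

    def longest_path(edges, start, end):
        best = {start: (0, [start])}   # node -> (earliest finish time, path taken)
        # nodes are numbered so that every arrow goes from a lower to a higher number
        for node in sorted({n for e in edges for n in e[:2]}):
            if node not in best:
                continue
            days, path = best[node]
            for frm, to, task, d in edges:
                if frm == node and (to not in best or best[to][0] < days + d):
                    best[to] = (days + d, path + [to])
        return best[end]

    print(longest_path(edges, 1, 7))   # -> (21, [1, 2, 4, 5, 6, 7]): the critical path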
As an exercise try to draw a chart to help in the planning of the work for an A2 project. One possible
answer is shown below.
[Gantt-style chart plotting the stages Nature of problem, Analysis, Design, Software development, Testing, Implementation, Technical documentation, User documentation and Evaluation against time.]
3.8.h Managing, Monitoring and Maintenance of Systems
As with so much of the work in this module, a lot of the work in this section refers to the work already
covered in other sections. The need for managing, monitoring and maintenance of systems goes back to
section 1.7 in the AS text which was to do with systems analysis and the system life cycle. Not only should
one be aware of the need for documentation, but it should be up to date. In a subject like Computing, where
things are changing so very quickly, it is not reasonable to suggest that a thing has been done once and that it
therefore does not have to be considered again. Software and hardware change on such a regular basis that,
even if we are happy with our system, the outside world is going to affect us and force change. There are
very few systems that are totally self-contained and when another system is updated our original system may
no longer be compatible. This process of change should not be a static one. A firm should not sit back and
wait for a change to be forced upon it; rather, managing a system should involve a continual process of
measuring the system that is being used against what is currently available – a software (and hardware)
audit. The outputs from systems should be studied to ensure that they are acceptable (this is quality control),
and the documentation must keep up with the rest of the system in order to provide the necessary
information about the system to the various users.
3.8 Example Questions
Q 1. Explain what is meant by the term entity model. [4]
A. - A diagrammatic means of showing a
- database.
- The things that are represented in the database are shown
- along with the relationships between them.
- Included in the representation are the attributes of each entity, or the information stored about
each entity.
Notes: This is often referred to as a data model rather than an entity model.
Q 2. A computer system has been designed and produced for a person who works from home for a
publisher of fiction books as a proof reader of manuscripts. Another system is designed for a graphic
designer who sends work to clients electronically. Describe how the hardware specifications for the
two systems would differ. [10]
A. - The proofreader would require a keyboard, possibly a mouse, to input text and to navigate
around the text, while the designer will require various graphical input devices. Argument
can be made for graphics tablet, video digitizer, digital camera, light pen…Don’t forget that a
keyboard and mouse are just as good here given the right reasons, but they would not provide
evidence of a difference.
- The processor used by the graphics designer would need to be faster than that used for
manipulating text because there is far more data to be handled.
- The output devices might be a monitor and a printer in both cases, however the devices used
by the designer would need to be high resolution because of the graphical output being based
around pixels, while the proof reader is dealing with text characters that can be handled quite
adequately in low resolution.
- The graphics will consist of far larger files than equivalent text files meaning that the hard
drive will need to be of far greater capacity. It is not possible to give precise values because
they change so quickly, and the precise nature of the files in each case is not known.
- The proof reader will need to download files from the head office of the publisher, and will
also need to return them. The designer will need to perform similar functions with the
graphics files. The difference in size of the respective files means that the designer will need
a faster communication link than the proof reader. Perhaps an ISDN line rather than a modem
and the standard telephone line.
Notes. Note how this question is answered in the same way as the hardware questions in the AS
syllabus.
Q 4. Discuss the need for project management when a major project is being implemented.
A. - Need to time the project,
- both maximum and minimum amounts of time that could be expected for the project.
- The need to plan when specific workers will be needed
- The need for materials to be available when required and not
- to be there before they are needed
- The need to ensure that specific parts of the project are being carried out and not forgotten
about.
- To ensure that one part of the project is finished before reliant parts are begun.
Chapter 3.9
Simulation and Real-time Processing.
When asked to describe a real-time application the first thing that needs to be described is the world of the
application. Everything else falls into place. Students should then describe the hardware necessary to allow
input and output from that world and the decisions that the software must take.
A nuclear reactor may start to react too violently and sensors inform the computer controlling the reaction
that this is happening. The computer takes the decision to insert the control rods to slow the reaction down.
This is a real-time application. The world of the application has been identified, the input devices are the
sensors that inform the computer of the state of the reaction, the computer makes an immediate decision and
the control rods are now moved into place. Notice that the rods moving is not immediate but will take place
over a period of time, however the decision was taken immediately. Note also that the sensors simply report
on the state of the world, there is no hint at decision making on the part of the sensors. Many students would
phrase their answer in the form of “The sensors spot that the reaction is too violent and the processor makes
…”. Here, the sensors are being credited with having processing power in that they can interpret the
readings that are produced.
Students should also be able to identify when a real-time system is appropriate as opposed to a system where
the decision making is in some way delayed.
Most sensors are surprisingly simple, relying on one of two methods to gain information.
They either use some type of spring mechanism to physically change the position of something in the sensor,
or a means of turning some reading into a variable voltage. The spring ones are like the pressure pad, held
open by a spring which is overcome by the weight of the burglar. Similarly, a bumper around a robot vehicle
can be kept away from the body of the vehicle by springs which are overcome if the robot moves too close
to something blocking its path. A thermistor, used to measure the room temperature for a central heating
system converts the ambient temperature to a voltage so that a decision can be made by the processor. A
light meter at a cricket match converts the available light to a voltage so that the processor can decide
whether there is enough light for play.
Notice that digital sensors are really switches. If the bumper around the robot is depressed, is it necessary
for a processor to make a decision? Probably not; the switch would simply switch off the motor. The
question arises as to whether this is a sensor in the true computing sense. The answer is that the
action of turning the motor off required no decision and hence no processing, but the need to do so will still
be reported to the processor, because the processor now has to decide what to do next and the input from
that sensor is going to be an important part of that decision.
One last point about sensors. There are as many different sensors as there are physical quantities that need
measuring, but their reports need to be kept as simple as possible to allow the processor to make decisions
quickly. The idea of a sensor being a TV camera, because it can show the processor what is going on in a large
area, is unrealistic because it would provide too much information. What would be possible would be a TV
picture which could be scanned by a processor for any kind of movement in order to indicate the presence of
a burglar. Never lose sight of the idea that the processor is limited in the amount of data that can be
interpreted, as is the software that the processor is running.
When the processor makes its decisions it must be able to take some action. In the type of scenario we are
talking about it will probably be necessary to alter something in the physical world. This may involve
making the robot move in a different direction, it may be a matter of switching on some lights or telephoning
the police station to report an intruder on the premises. Some of these are simple electric circuits that the
computer can trip by itself, however, making the robot change direction is rather more complex. Such
movement involves the use of an actuator. An actuator is the device that can accept a signal from the
computer and turn it into a physical movement.
People can get tired and when they do get tired the work that they produce can be of a poorer quality than it
otherwise would have been. The robot will not necessarily do a better job than the human (indeed, it is not as
adaptable and therefore may produce worse results) however, it is consistent. It never gets tired.
There are some places that are particularly hazardous for human beings, for instance the inside of a nuclear
reactor, but where maintenance must be carried out. These are natural environments for a robot to work in
because a human could not safely do the job.
A robot does not need the peripheral things that a human being needs. There is no need for light or air in a
factory. There is no need for a canteen or for restrooms or for a transport infrastructure to get the workers to
work and home again. There is no need for a car park or for an accounts department to work out the pay for
all the workers. There is a need, however, to employ technicians to service the machines, programmers to
program every task that they need to undertake. Notice that the need for a human workforce has not disappeared,
but the workforce is made up of different types of worker, generally more skilled than was previously the
case.
The rate of growth of a sunflower is known from observations taken over many years. The effects of
different chemicals on the growth of sunflowers are known from simple experiments using one chemical at a
time. The effects of different combinations of the chemicals are not known. If the computer is programmed
with all the relevant formulae dictating how a sunflower grows in certain circumstances, the computer can play
the part of the sunflower and show how a real one can be expected to react. In this way the effects of
different growing conditions can be shown in seconds rather than waiting 6 months for the sunflower to
grow. One can imagine that in the course of a day a programmer can come up with a suitable cocktail of
additives to allow sunflowers to grow on the fringes of a desert and consequently create a cash crop for
farmers, in such conditions, where they had no cash crop before. This is an example of the use of a computer
simulation to speed up a process in order to give results in a more reasonable time scale.
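A sketch of the idea in Python is given below. The growth rate and the effect of each chemical are invented numbers used only to show the shape of such a simulation; a real model would use coefficients taken from the observations and experiments described above.

def simulate_growth(days, base_rate_cm_per_day=1.2, additives=()):
    # Each additive is assumed simply to multiply the daily growth rate.
    rate = base_rate_cm_per_day
    for factor in additives:
        rate *= factor
    height_cm = 0.0
    for _ in range(days):
        height_cm += rate
    return height_cm

# Two growing conditions compared in a fraction of a second, rather than
# waiting six months for real sunflowers to grow.
print(simulate_growth(180))
print(simulate_growth(180, additives=(1.1, 0.95)))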
Simulation can be used to predict the results of actions, or model situations, that would be otherwise too
dangerous. What will happen when a certain critical condition is exceeded in a nuclear reactor? I don’t want
them to try it on the one down the road from where I live and I’m sure no one else does either. Program a
computer to pretend to be a nuclear reactor and it is no longer necessary to do it for real.
Some things are impossible. It is not possible to fly through the rings of Saturn. Program a computer to
pretend and a virtual reality world can be created to make it seem possible.
A car company is planning a new suspension system for a range of cars. One way of testing different designs
is to build prototypes and take them out in different conditions to test how they work. This is very
expensive, as well as time consuming. A computer can be programmed to take the characteristics of each
possible system and report how well each will work, at a fraction of the cost. The same simulation can be
made to vary the conditions under which it operates. A fairly simple change to the parameters of one of the
formulae being used can simulate driving on a motorway or on a country lane.
People can have ideas which need testing to see if they are valid. An engineer may design a new leaf spring
for a suspension system. The hypothesis is that the spring will give more steering control when travelling on
rough surfaces. The computer can be set up to simulate the conditions and give evidence to either support or
contradict the hypothesis.
A financial package stores data concerning the economy. It can be used to provide information about past
performance of the various criteria being measured or it can be set up to predict what will happen in the
future. If the graph of a particular measure is linear then extrapolation of what will happen in a year’s time is
not difficult; in fact you certainly don't need a computer to provide the prediction. However, if the graph is
non-linear the mathematics becomes more difficult. More importantly, economic indicators do not exist in
isolation. If the unemployment figures go up then there is less money in the economy, so people can buy
less, so firms sell less, so more people are laid off. Unless the Bank of England brings down interest rates
which will encourage people to borrow more and hence buy more, so firms need to employ more people in
order to put more goods in shops… When the relationships become intertwined like this the calculations of
predictions become very complex and computers are needed.
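The flavour of such a model can be sketched in a few lines of Python. The two indicators and the coefficients below are invented purely to show how intertwined relationships force the calculation to be repeated step by step, which is why a computer becomes necessary.

def step(unemployment, spending, interest_rate):
    # Higher unemployment depresses spending; falling spending, in turn,
    # pushes unemployment up. A lower interest rate encourages spending.
    new_spending = spending * (1.0 - 0.5 * unemployment) * (1.0 + 0.2 * (0.05 - interest_rate))
    new_unemployment = unemployment + 0.1 * (1.0 - new_spending / spending)
    return new_unemployment, new_spending

u, s = 0.06, 100.0
for year in range(1, 6):
    u, s = step(u, s, interest_rate=0.04)
    print(year, round(u, 4), round(s, 2))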
3.9.f Describe a Simulation
This is one of those sections which is impossible to completely describe because there are an infinite number
of possible scenarios that could be used.
Generally, a student would be expected to understand that in a given situation there are a number of
variables that control the outcome and the results that may be predicted. There should be an awareness that
the values of these variables do not just appear by magic but must be collected and that sensible limits
should be set within which the variable values must lie. There should be an awareness that the results are
going to be based on the use of these variables in specific formulae that relate the variables to one another.
Finally, there should be an awareness that the results produced are subject to a degree of error, the size of
which depends not just on the validity of the variable values and the relationships, but also on the validity
of the model that is used.
This combination of the vast quantities of data, the interrelationships and the consequent volume of
calculations means that computer power becomes essential to giving a sensible result. Indeed, with
something like the weather forecast, ordinary computers are too slow. There is a need for high speed
calculation, a need that is satisfied by the use of parallel processing.
If 100 digits need adding together, a standard single-processor computer will need 99 additions, one after
another, to complete the calculation. A computer with 50 processors can add the 50 pairs of values together in one cycle, the 50
answers can be added in pairs in a second cycle to give 25 answers and so on. A total of 7 cycles are needed
to add the set of 100 digits. This simple example gives an illustration of how simple arithmetic can be
significantly speeded up using parallel processing, and hence how parallel processing can be so important to
simulations.
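The pairwise addition can be sketched in Python. The list comprehension inside the loop represents the work that, on a parallel machine, would be shared out so that every pair is added in the same cycle; the loop simply counts how many cycles are needed.

import math

def parallel_sum(values):
    cycles = 0
    while len(values) > 1:
        # On a parallel machine every pair below is added at the same time,
        # so the whole of this loop body counts as one cycle.
        sums = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:              # an unpaired value is carried forward
            sums.append(values[-1])
        values = sums
        cycles += 1
    return values[0], cycles

total, cycles = parallel_sum(list(range(1, 101)))
print(total, cycles)                     # 5050 in 7 cycles
print(math.ceil(math.log2(100)))         # the same answer: the ceiling of log2(100) is 7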
If it were possible to predict the outcome of the lottery draw then there would be some very rich computer
programmers. Mathematically, the outcome is not random and should be predictable, perhaps by modelling
the behaviour of the individual items inside the machine that chooses the balls. However, this is impossible,
certainly with present technology.
If it were possible to predict accurately that human beings would all buy a particular song in preference to
another, then the record industry would not have to produce such a volume of material in order to have a
single hit. Human behaviour is very difficult to predict.
3.9 Example Questions
Q 1. Describe the real time application of a computer used to control a burglar alarm system. [4]
A. - The alarm system is working in a closed loop. Once activated, the system is self contained.
- Sensors are used as input to the system.
- Sensible sensors are pressure pads/movement sensors/infra red/sound sensors
- The sensors report to the processor which makes a decision based on activation, more than
one indicator, indicator of false alarm…
- Output in many possible forms including direct communication of system to the police.
- Sensible for system to be a polling system rather than an interrupt system because a polling
system will recognise when communication between the sensors and the processor is damaged,
because the processor is no longer able to make contact.
Q 2. Explain why some applications require parallel architecture to carry out their processing and describe
what is meant by parallel processing.
A. - Applications that have a large amount of processing that needs to be carried out…
- in a fixed time period.
- Too much processing to be carried out by a single processor in that given time.
- The computer has more than one processor so that…
- when it is possible to divide a problem into a number of processes, and more than one can be
done at a time, there are enough processors to manage them.
Notes: Generally, the content of this section has already been covered in other modules,
consequently there is little point in repeating examples that have been included elsewhere.
Chapter 3.10
Common Network Environments, Connectivity and Security Issues
In this Chapter you will learn how to connect LANs and WANs. Section 3.7.b showed how analogue
signals are used from your home PC (or network) to the local telephone exchange. This connection is
analogue if a modem is used. From then on digital signals are used until the final local exchange. From this
exchange analogue signals must be used if the ordinary home telephone and modem are used by the
receiver.
LANs use digital signals to transfer data between nodes. The rate of transmission of the data depends on the
topology of the network and the transmission medium used to join nodes in the network. Fig. 3.10.a.1
shows a ring network. The most common medium used in this type of network is unshielded twisted pair
(UTP) as described in Section 3.10.b. This makes ring networks easy to install but limits bandwidth and,
therefore, the maximum speed of the network.
Fig. 3.10.a.1 (a ring network of stations connected by repeaters)
Media, other than UTP, are used in ring networks, details of which are given in Table 3.10.a.1. You are not
expected to remember the exact transmission rates and other details. However, you do need to remember
how the media compare with one another. Details of these media are given in Section 3.10.b.
Medium             Data rate (Mbps)    Max. repeater spacing (km)    Max. number of repeaters
UTP                4 or 16             0.1                           72
Shielded TP        4 or 16             0.3                           260
Baseband coaxial   16                  1.0                           250
Optical fibre      100                 2.0                           240
Table 3.10.a.1
In bus networks the communication network is simply the transmission medium. Bus networks can use any
medium and details are given in Table 3.10.a.2.
Table 3.10.a.2
The limits on transfer rates given in the two tables are typical but they are being extended all the time as
technology advances.
Fig. 3.10.b.1
The other main type of cable used in LANs is coaxial cable. This has a central conductor enclosed in a
plastic sheath which is surrounded by a copper screen. This copper screen is surrounded by a plastic coating
as shown in Fig. 3.10.b.2.
Fig. 3.10.b.2 (cross-section of coaxial cable: central conductor, plastic insulators and copper screen)
The transfer rates for these media are given in the Tables in Section 3.10.a.
Sometimes it is very difficult to lay cables so low-power radio may be used. This uses radio signals between
networks and nodes, with other forms of media used to link other parts of a network together. This is now
being used in schools that have mobile classrooms, sometimes known as demountables.
3.10.c Network Components
Switches use the same type of wiring as hubs (see Section 3.7.d). However, each port on a switch has the full
network speed available to it, rather than sharing it as stations on a hub do. A typical layout is shown in
Fig. 3.10.c.1. Here, each station has full speed access to the
server. However, if any of these stations wish to access the main network, they would have to share the
connection to the main network.
Fig. 3.10.c.1 (stations and a server connected to a switch, with a single shared link from the switch to the main network)
If the number of stations is increased and they all want to access the main network, the increased local speed
would be less useful because of sharing access to the main network. In a case like this, it may be necessary
to upgrade the link to the main network.
A router is used to connect different types of network together. A router can alter packets of data so that two
connected networks (LANs or WANs) need not be the same. Routers use network addresses and addresses
of other routers to create a route between two networks. This means that routers must keep tables of
addresses. These tables are often copied between routers using the Routing Information Protocol (RIP).
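A much-simplified sketch of a routing table is shown below in Python. The network addresses and next hops are invented; a real router holds many more entries and keeps them up to date with protocols such as RIP.

routing_table = {
    "192.168.1.0": "local",            # a directly attached LAN
    "192.168.2.0": "10.0.0.2",         # reachable via the router at 10.0.0.2
    "0.0.0.0":     "10.0.0.1",         # default route for every other network
}

def next_hop(destination_network):
    # Look the destination up; anything unknown goes to the default route.
    return routing_table.get(destination_network, routing_table["0.0.0.0"])

print(next_hop("192.168.2.0"))          # 10.0.0.2
print(next_hop("172.16.5.0"))           # falls back to the default route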
Routers enable public networks to act as connections between private networks as shown in Fig. 3.10.c.2.
Fig. 3.10.c.2 (two LANs, each connected through a router to a public network)
In order to route data round a network, a router reads the destination address from each packet, looks that
address up in its routing table and forwards the packet towards the destination network.
Note that, in the case of the Internet, the destination address is the IP address.
Usually a router is slower than a bridge. A bridge links two LANs which may, or may not, be similar. It
uses packets and the address information in each packet. To route data efficiently, a bridge learns the
layouts of the networks.
Suppose a bridge is used to link two segments together that are not far apart, say in the same building. The
two segments can work independently but, if data needs to go from one segment to another, the bridge will
allow this. Fig. 3.10.c.3 shows this situation.
Fig. 3.10.c.3 (two segments joined by a bridge)
The bridge has to learn where each node is situated. The bridge will receive data that does not have to be
passed from one segment to another. Initially, any data the bridge receives is buffered and passed to both
segments. The bridge stores a table containing the addresses of sending nodes and the segment from which
the data was sent. Eventually, when all nodes have sent data, the bridge will know on which segment each
node is.
Now, when the bridge receives data being sent from one node to another, it can make a decision whether, or
not, the receiving node is on the same segment as the sending node. This leads to the following algorithm.
1. Sending node sends data onto its segment.
2. Data arrives at the bridge and is buffered.
3. Bridge checks destination address.
4. If destination is on same segment as sender then
a. discard the data
5. Else
a. pass data to other segment.
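A sketch of this behaviour in Python is given below. It is illustrative only: the bridge learns which segment each sending node is on and then discards any data whose destination is known to be on the same segment as the sender.

class Bridge:
    def __init__(self):
        self.table = {}                          # node address -> segment number

    def receive(self, source, destination, segment):
        self.table[source] = segment             # learn where the sender is
        if self.table.get(destination) == segment:
            return "discard"                     # sender and receiver share a segment
        return "forward to the other segment"    # unknown, or on the other segment

bridge = Bridge()
print(bridge.receive("A", "B", segment=1))       # B not yet known, so forwarded
print(bridge.receive("C", "A", segment=2))       # A is known to be on segment 1, so forwarded
print(bridge.receive("B", "A", segment=1))       # A is on the sender's own segment: discarded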
This algorithm will work on the configuration shown in Fig. 3.10.c.3 but now look at Fig. 3.10.c.4 which
shows multiple bridge connections.
Fig. 3.10.c.4 (segments linked by bridges A, B and C, creating more than one route between them)
Now data can reach bridge B by two different routes from different directions. This means that the bridge
cannot build up its tables properly. The above algorithm needs modifying and this is done by the spanning
tree algorithm (STA) which builds up a picture of multiple connections and blocks one of them. In Fig.
3.10.c.4, bridge C could be blocked. In this particular case, any one of the bridges may be blocked. The
actual details of how the STA works are not required at this Level.
Fig. 3.10.c.5 (two segments joined by two bridges with a link between them)
In this case, the link will affect the overall performance of the LAN. If there is a great deal of traffic
between the segments, and the link is slow, the overall performance will be slow. In this type of
configuration, where a server is placed can be critical.
However, bridges introduce delays and can be overloaded.
Modems are needed to convert digital data into analogue signals and vice versa. A modem combines the data
with a carrier to provide an analogue signal. This means that ordinary telephone lines can be used to carry
data from one computer to another. This was explained in Section 3.7.b.
Messages are passed from the source computer, through other computers, to the destination computer.
The Internet provides the worldwide system of interconnected networks over which these messages travel.
In order for this system to work, there are Internet Service Providers (ISPs) who connect a subscriber to the
backbone of the Internet. These providers then pass data between themselves and on to their respective clients.
Fig. 3.10.d.1 (on the next page) shows how data, including electronic mail (see Section 3.10.g), are passed
from one computer to another.
An intranet is a network offering the same facilities as the Internet but solely within a particular company or
organisation.
An intranet has to have very good security for confidential information. Sometimes the organisation allows
the public to access certain parts of its intranet, allowing it to advertise. This Internet access to an intranet is
called an extranet.
Suitable software is required to make these systems work. Browsers allow a user to locate information using
a uniform resource locator (URL). This is the address for data on the Internet. The URL includes the
transfer protocol to be used, for example http, the domain name where the data is stored, and other
information such as an individual filename.
e.g. http://www.bcs.org.uk/ will load the British Computer Society's home page.
Domain names are held in a hierarchical structure. Each name is for a location on the Internet. Each
location has a unique name. The names in the various levels of the hierarchy are assigned by the bodies that
have control over that area.
PC195-staff.acadnet.wlv.ac.uk
The domain is uk and the ac would be assigned to a particular authority (in this case UKERNA). This
authority would then assign the next part, i.e. wlv. As this is Wolverhampton University, it is responsible
for all the parts to the left of wlv. Those in charge of acadnet are responsible for PC195-staff.
Each computer linked to the Internet has a physical address, a number called its IP (Internet protocol)
address. This numeric address uniquely identifies the physical computer linked to the Internet. The domain
name server converts the domain name into its corresponding IP address.
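Most programming languages let a program ask the domain name server to do this conversion. The Python line below, for example, uses the standard socket module; it needs a live Internet connection, and the address it prints will depend on where and when it is run.

import socket

print(socket.gethostbyname("www.bcs.org.uk"))    # the current IP address for this domain name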
Fig. 3.10.d.1 (A is the source computer and H the destination computer; C is the Internet provider for A; A and G are linked via their servers and the nodes C, D, E and F)
3.10.e Hypertext Links
The World Wide Web stores vast amounts of data on machines that are connected to the Internet. This data
may be in the form of text, databases, programs, video, films, audio and so on. In order to view this data
you must use a browser such as Internet Explorer or Netscape. However, the browser will need to know
how to retrieve and display this data.
All the data is situated on computers all over the world. These computers have unique addresses and the
data is held in folders on these computers. However, not all computers use the same hardware and
software. This means that there must be some protocol that allows all the computers to communicate and be
able to pass the data from one computer to another. One of the protocols to do this is the hypertext transfer
protocol (http) that is used by the browsers to receive and transmit data. A typical URL is
http://www.bcs.org.uk/
which was explained in the previous Section. Here, the URL starts http:// where http tells the browser which
protocol to use. The portion :// is a separator marking off the transmission protocol from the rest. This URL
connects the user to the home page of the British Computer Society. If a particular piece of data is required,
such as a weather forecast, you can specify a folder to move to directly. This one
http://bbc.co.uk/weather/
loads a page from the directory weather at www.bbc.co.uk. In turn, this page will have links to other
directories and pages.
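The separate parts of a URL can also be picked out programmatically. The short Python example below uses the standard urllib.parse module to split the weather URL into the protocol, the domain and the directory path.

from urllib.parse import urlparse

url = urlparse("http://bbc.co.uk/weather/")
print(url.scheme)     # http        - the transfer protocol the browser should use
print(url.netloc)     # bbc.co.uk   - the domain where the data is stored
print(url.path)       # /weather/   - the directory holding the page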
This means that the browser now knows where to look for the data. Links may be placed so that a user can
quickly move around a document or to another document, which may be at a completely different site. Fig.
3.10.e.1 shows links to documents that are at the same site as the document containing the links.
Fig. 3.10.e.1 (a Smart Cards contents page with links to Definitions, Applications, The Electronic Purse and the Home page)
The links are usually displayed in a different colour to the rest of the text and are underlined. When you
place your pointer on a link, the pointer becomes a pointing finger. If you now click the mouse button you
will be connected to the appropriate site and the data will be downloaded. (Try it.). In this document, when
you leave the pointer on a link, the URL will be displayed. Fig. 3.10.e.2 shows part of the page that is
displayed when Applications is clicked on.
On this page you will see further links, most of which link you to different sites around the world. For
example, Mondex links you to mondex.com, the home page for Mondex who specialise in applications of
smart cards.
The page shown in Fig. 3.10.e.2 carries links to Electronic Purse, Access Control and Security, Travelling, The Future, Smart Card Contents and the Home Page, together with the following text.
Electronic Purse
This acts like cash. The card can be charged up at modified automatic teller machines
(ATMs), modified BT payphones and at new points installed by the provider. Mondex is one
of the largest suppliers of these smart cards and trials are taking place at Aston, Exeter and
York Universities as well as at Swindon. The card is loaded with electronic cash and it can
then be used to pay for goods and services in a similar way to using a charge card. The
difference being that 'cash' is being transferred from the card to the retailer. The cards can
transfer 'cash' from one card to another. Thus, if two people, such as a parent and a child, each
have a card, the parent can transfer the child's pocket money from one card to the other.
Another large provider of smart cards is Visa. They produce both disposable and reloadable
cards. Visa Cash, as it is called, can provide secure trading on the Internet as well as facilities
similar to those of Mondex. Click here for more details.
An HTML document is in two parts called the HEAD and the BODY. What is in the HEAD is not normally
displayed, although some browsers will display a title if it is included in the HEAD. Level 2 HTML
requires users to include a title of up to 64 characters. This is because some search programs enter it in a
database so that the search engine can find it if it contains what the searcher wants. Thus it is a good idea to
include some keywords in the title. The heading tags <H1>…</H1> to <H6>…</H6> are used to create
headings. The layout is decided by the browser, so blank lines, tabs and extra spaces are ignored. If you
want these, you must use tags to do it. This is because the browser has to fit the output to the display screen
attached to the receiver. These may be set up in many different ways. Fig. 3.10.f.1 shows a simple example
of HTML. In this piece of HTML the <HR> tags are used to insert horizontal rules, and the spacing around
them, because the Web browser ignores the carriage return and new line characters.
<HTML>
<HEAD>
<TITLE> An Example of HTML </TITLE>
</HEAD>
<BODY>
<HR>
<H1>An Example of HTML </H1>
<HR>
This piece of text has been produced using HTML. The text may be
<B>bold</B> or <I>italic</I>.
Although this piece of text is on a new line here, it may not be when displayed by the browser. Remember,
the Web browser decides the layout unless tags are used.
</BODY>
</HTML>
Fig. 3.10.f.1
The result of a browser running this HTML will vary, but will be something like that shown in Fig. 3.10.f.2.
An Example of HTML
This piece of text has been produced using HTML. The text may be bold or italic. Although this
piece of text is on a new line here, it may not be when displayed by the browser. Remember, the
Web browser decides the layout unless tags are used.
Fig. 3.10.f.2
A line space and a thick line precede headings. A line space and a thick line also follow them.
Exactly how the information is displayed will depend on the browser. Also, some browsers do not recognise
all tags. If a browser encounters an unknown tag, it should ignore it. However, there is no guarantee of this.
The result is that a page that looks outstanding when you design it, may not look very good on a different
browser.
To use links as shown in the previous Section, you need to use the anchor tag <A>. For example, to create the
hypertext Smart Cards you would place <A>Smart Cards</A> in the HTML document. However, this will not create
the link; it only creates the hypertext. This hypertext must now be linked to the site. You do this by giving
the anchor attributes, using a hypertext reference (HREF). This points to where the document to be displayed
is kept. A typical example is shown in Fig. 3.10.f.3. Note this only shows the HTML necessary to create the link.
Fig. 3.10.f.3
A shortened version can be used if the link is to a document in the same directory as the one being viewed.
In this case we need only write
If the document is in a subdirectory of the directory containing the page being viewed, we can write
Links can also be created to points in the same document by using the NAME attribute.
Inserting an image for interest is done by means of the <IMG> tag which has no end tag. You must specify
where the image is stored known as the source (SRC). For example
If you want the image to be a hypertext link, then use, for example,
Electronic mail systems allow the user to compose mail and to attach documents, in many formats, to the
message. Suppose several people are working on different chapters of a book. It is easy for them to pass
their work to one another as an attachment so that others can make comments and revisions before returning
them. This book was created in this way. The ability to attach all kinds of documents can prove very useful.
The author of this Chapter uses email to collect homework. Students can word process their work and send
it as an attachment. I can then mark it and return my comments. Even better, students attach programs they
have been asked to write and I can run them to see if they work!
Often emails are sent to people who need to pass the message on to someone else. This is easy as there is a
forward facility with all email services. All the user has to do when an email is to be passed on to someone
else is to click a button, enter the email address and press the Send button.
It is easy to reply to an email as you only have to click a Reply button and the original sender's address
automatically becomes the address to which the reply is to be sent.
Another useful facility that can be used is the facility to send the same email (and attachments) to a group of
people. For example, if I wish to send a message to the whole of one of my classes I can do this. All that is
necessary is for me to create a group by inserting in it the email addresses of all the students in the class. I
can then type the message once and send it to the whole group by means of a single click on Send.
Users of email can also set message priorities and request confirmation of receipt.
It is also possible to use voice mail in a similar way to email. In this case the spoken message is digitised
and stored electronically on a disk. When the recipient checks for mail, the digitised form is turned back
into sound and the receiver can hear the message. These messages can also be forwarded, stored and replied
to.
This is common in schools and colleges. Usually teaching staff will have a different interface to that used
by the students. Also, network administrators will need a different interface.
The differences are not just what is available to the user. Some users will only want to use a simple
graphical user interface (GUI), others will prefer to use many menus. Technical staff may want to be able to
access the computer at a very low level and would then need to use a command based interface.
You should read Section 1.2.c which describes the different types of interface.
The problem lies in the fact that all these different interfaces must be made available to different users on the
same network. This is done by the system recognising which user has logged on and then presenting the
user with the appropriate interface. Which interface is provided is laid down by the network manager who
sets up rights for each user.
A first step is to encrypt the confidential data and this is addressed in the next Section.
Another solution is to install firewalls. These sit between WANs and LANs. The firewall uses names,
Internet Protocol addresses, applications, and so on that are in the incoming message to authenticate the
attempt to connect to the LAN. There are two methods of doing this. These are proxies and stateful
inspection. Proxies stop the packets of data at the firewall and inspect them before they pass to the other
side. Once the packets have been checked and found to be satisfactory, they are passed to the other side.
The message does not pass through the firewall but is passed to the proxy. This method tends to degrade
network performance but offers better security than stateful inspection.
Stateful inspection tracks each packet and identifies it. To do this, the method uses tables to identify all
packets that should not pass through the firewall. This is not as secure as the proxy method because some
data do pass through the firewall. However, the method uses fewer network resources.
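The rule-based part of this checking can be sketched in Python. The blocked address and permitted ports below are invented for illustration; a real firewall holds far larger tables and, in stateful inspection, also records the state of each connection.

BLOCKED_ADDRESSES = {"203.0.113.9"}              # an invented "known bad" address
ALLOWED_PORTS = {25, 80, 443}                    # mail and web traffic only

def allow(packet):
    if packet["source"] in BLOCKED_ADDRESSES:
        return False                             # refused: source address is blocked
    if packet["port"] not in ALLOWED_PORTS:
        return False                             # refused: application not permitted
    return True

print(allow({"source": "198.51.100.7", "port": 80}))    # True
print(allow({"source": "203.0.113.9", "port": 80}))     # False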
Another way of ensuring privacy of data is to use authorisation and authentication techniques. These are
explained in the next Section.
3.10.j Encryption, Authorisation and Authentication
Encryption is applying a mathematical function, using a key value, to a message so that it is scrambled in
some way. There are many techniques for this. The problem is to make it virtually impossible for someone
to unscramble the message. Clearly, whatever function is applied to the original message must be
reversible. The problem is to make it very difficult for anyone to find the inverse of the original function.
There is also the problem that many people may need to decrypt a message. All these people need the
key to unlock the message. This makes it highly likely that an unauthorised person will get hold of this
key. One method of overcoming this is to use Public Private Key technology. This involves the sender
having a public key to encrypt the message and only the receiver having the private key to decrypt the
message.
Authentication is used so that both parties to the message can be certain that the other party is who they say
they are. This can be done by using digital signatures and digital certificates. Digital signatures require
encryption. Basically, a digital signature is code that is attached to a message.
In order to understand how public key cryptography works, suppose Alice and Bob wish to send secure mail
to each other:
- First, both Bob and Alice need to create their public/private key pairs. This is usually done with the
help of a Certification Authority (CA).
- Alice and Bob then exchange their public keys. This is done by exchanging certificates.
- Bob can then use his private key to digitally sign messages, and Alice can check his signature using
his public key.
- Bob can use Alice's public key to encrypt messages, so that only she can decrypt them.
A primary advantage of public-key cryptography is the application of digital signatures, which help combat
repudiation, i.e. denial of involvement in a transaction. Since the owner keeps their private key secret,
anything signed using that key can only have been signed by the owner.
The predominant public-key algorithm is RSA, which was developed in 1977 by, and named after, Ron
Rivest, Adi Shamir, and Leonard Adleman. The RSA algorithm is included as part of Web browsers from
Netscape and Microsoft and also forms the basis for many other products.
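The principle of RSA can be demonstrated with deliberately tiny numbers, as in the Python sketch below. It is not secure and real keys are hundreds of digits long, but it shows the essential point: the public key (e, n) encrypts, the private key (d, n) decrypts, and signing is the same calculation carried out with the private key. (The pow(e, -1, phi) form needs Python 3.8 or later.)

p, q = 61, 53                  # two small primes, kept secret
n = p * q                      # 3233: part of both the public and private keys
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, chosen to have no factor in common with phi
d = pow(e, -1, phi)            # private exponent: the inverse of e modulo phi

def encrypt(m):                # anyone holding the public key (e, n) can do this
    return pow(m, e, n)

def decrypt(c):                # only the holder of the private key d can do this
    return pow(c, d, n)

message = 65
print(decrypt(encrypt(message)))           # 65: the round trip recovers the message

signature = pow(message, d, n)             # "signing": use the private key
print(pow(signature, e, n) == message)     # anyone can check it with the public key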
Fig. 3.10.k.1 (a host computer connected to a number of remote computers)
An alternative configuration is shown in Fig. 3.10.k.2, where the central database is duplicated. This
method also means that the central database must be updated during off-peak hours.
Fig. 3.10.k.2 (a central database at the host computer, which serves the remote computers)