
COMPUTER SCIENCE NOTES

This pdf contains notes I, Norbert Ochieng, have taken while studying. These notes are shared
here in the hope that they may be useful to others, but I do not guarantee their accuracy,
completeness, or whether they are fully up-to-date.

SECOND YEAR
Part A

CHAPTER ONE:
DESIGN OF INTERACTIVE SYSTEMS

Interactive Systems
At its broadest, an interactive system can be any technology intended to help people complete
a task and achieve their goals. However, this could include things like pens, which could help
achieve the goal of leaving a message for a friend, so we tighten this definition to any
technology incorporating some form of electronic logic designed to help people complete
tasks and achieve their goals.

Alternatively, we could define this as any complex system with a UI. That is, something we
need to give instructions to and know the status of in carrying out those instructions. These
things are called systems (or artefacts), which we have dialogues with. Having dialogues with
non-humans (rather than with other people or animals) is a relatively new concept, emerging over the past 50 years.

Usability
For many years, the key has been thought to be making interactive systems usable, i.e., giving
them usability.

To be usable a system needs to be effective. The system supports the tasks the user wants to
do (and the subcomponents of such tasks).

We could also consider other components that make things usable. Efficiency - the system
allows users to do tasks very quickly without making (many) errors; Learnability - the system is
easy enough to learn to use (commensurate with the complexity of the tasks the users want to
undertake); Memorability - the system is easy enough to remember how to use (once learnt)
when users return to tasks after periods of non-use.

We are now moving beyond that to consider satisfaction - the system should make users feel
satisfied with their experience of using it.
Positive user experience, rather than mere usability, has now become the focus of the design of
interactive systems, particularly as we have so many systems that are for leisure rather than
for work. This expands usability to cover issues such as being:

• Enjoyable
• Fun
• Entertaining
• Aesthetically pleasing
• Supportive of creativity

Another way to think about usability is for the user interface to be transparent/translucent to
the user - the user should be concentrating on their task, and not on how to get the system to
do the task. This might not be the case at first, though. For example, with the pen, you had to
think about how to use it when you were younger, and now you don't.

Difficulty in Designing Interactive Systems

Interactive computer systems for non-experts have only been designed for around 25 years,
whereas something like the teapot has had 4,000 years to be perfected (and still has a
dripping spout!). Books have had 500 years to develop the best design features (page
numbers, tables of contents, indexes, chapter numbers, headings, etc.), and books can be used
as interactive systems.

Affordance

Some things, e.g., doors, have a method of use which is ambiguous. Doors should afford
opening in the appropriate way; their physical appearance should immediately tell you what
to do.

Affordance could be formally defined as: "The perceived and actual properties of an object,
primarily those properties that could determine how the object could possibly be
used" (Norman, 1998).

Affordances could be innate, or perhaps culturally learnt, but a lifetime of experience with
doors, etc., means the interface to these systems is (or should be) transparent. Well-designed
objects, both traditional and interactive, have the right affordances.

Are Current Systems Sufficiently Usable?

It's been estimated that 95% of functions of current systems are not used, either because the
user does not know how, or doesn't want to. One of the causes of this is that anyone can use a
computer these days.

Key Principles of Interactive Design


• Learnability - how easy the system is to learn to use
• Memorability - how easy it is to remember how to use the system
• Consistency - to what extent are similar tasks conducted in similar ways within the system
(this will contribute to learnability)
• Visibility - to what extent does the system make it clear what you can/should do next
• Constraints - design the system so that users won't make mistakes and won't be led into
frustrating/irritating dead ends
• Feedback - provide good feedback on the status of the system, what has happened, what is
happening, what will happen

Interaction Styles
We need to get computers and complex systems to do things - somehow we have to "tell"
them what to do. In turn, they have to give us information - the status of the system, what to
do next, what's wrong, how to fix it.

One metaphor for this is to consider interacting with these systems as a dialogue - you tell/ask
the system to do something, it tells you things back; not too dissimilar to having a
conversation with a human being. Another metaphor is to consider these systems as objects
that you do things with and interact with (for example, putting waste in a waste paper bin), or
as navigating through a space and going places (the web).

Command Line Dialogue

This style is no longer widely used, but it is useful to understand when considering more
recent and longer-term developments. It was the first interaction style that appeared on PCs,
taking over from mainframe systems.

The conversational metaphor applies here, where you're talking to the computer through your
keyboard and it reacts. However, the language you speak in must be correct to the last dot
and in the correct order, much like speaking a foreign language. The system doesn't give you
any clues about what to do, so you must remember (or use a crib sheet for) the syntax of the
language. These commands can get quite long and complex, especially when passing lots of
options and (in UNIX) piping one command into another.

You also get limited feedback about what is happening; a command such as rm may return
you directly to the command line after deleting 0 or 100 files. Later versions of DOS
took feedback a step too far, however, asking for confirmation of every file by default.
This feedback is programmed by the interaction designer, who has to remember to do it and to
get the level of feedback right.
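
To make these points concrete, here is a minimal sketch in Python (the "tidy" tool and its flags are hypothetical, not from the notes) of a command-line style dialogue: the user must recall the exact syntax, and by default the system says almost nothing back:

    import argparse

    # A hypothetical "tidy" command that deletes files matching a pattern.
    # The user must remember the flag names and the argument order exactly;
    # unless -v is given, the tool reports nothing about what it deleted.
    parser = argparse.ArgumentParser(prog="tidy")
    parser.add_argument("pattern", help="glob pattern of files to delete")
    parser.add_argument("-r", "--recursive", action="store_true",
                        help="descend into subdirectories")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="report each deletion (extra feedback)")

    args = parser.parse_args(["*.tmp", "-r"])   # what a user might type

    deleted = []        # ... real deletion logic would populate this ...
    if args.verbose:
        for name in deleted:
            print("deleted", name)
    # With no -v flag the command finishes silently, whether it removed 0 or 100 files.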

If you do get the command correct, however, this can be a very efficient way of operating a
complex system. A short but powerful language allows you to achieve a great deal, and
people are willing to invest the time to learn this language to get the most efficient use of the
system.

However, although computers are good at dealing with cryptic strings, complex syntax and
an exact reproduction of the syntax every time, humans aren't. This interaction system
stemmed from the fact that processing power was expensive, so humans had to adapt to the
way computers needed to interact, not vice versa. This is no longer the case.
Menu Interaction Style

Although the command line style was good for experts, it wasn't for novice or infrequent
users, so an interaction style was developed which is almost the complete opposite of
command line dialogue in terms of strengths and weaknesses - menus.

Menus are simple: not much needs to be remembered, as the options are there on screen
and the physical interface corresponds directly to the options available. Feedback is
immediate - selecting an option will either take you to another screen of menus or perform
the selected task.

Feedback is natural and built in - you can see whether you are going the right way because
something relevant either occurs or doesn't, much like handling physical objects gives you
instant, built-in feedback.

Selections from objects (such as radio buttons) can be thought of as a menu, even though the
selection method is different.

Menu systems should be usable without any prior knowledge or memory of previous use, as
the system leads you through the interaction. This is excellent for novice and infrequent users
(hence their use in public terminals of many varieties). However, a complex menu structure
can make it hard to work out where particular features will be found (e.g., IVR systems).

Expert and frequent users get irritated by having to move through the same menu structure
every time they do a particular task (hence shortcut keys in applications such as Word), and
menu systems also present a lack of flexibility - you can only do what menu options are
present for you, so menu driven systems only work where there is a fairly limited number of
options at any point. Items also need to be logically grouped.

Form Fill-In Interaction Style

A form is provided for the user to fill in. Perhaps not the most revolutionary or interesting of
interaction styles, but it is based on another real world metaphor - filling out a paper form.
This also illustrates another problem with metaphors, not everything transfers well from one
medium to another.

Forms present lots of little problems:

• How many characters can be typed in the field - this is often not clear, and there is no
inherent indication of it (it needs to be explicitly built in by the interaction designer) - this
violates the feedback principle.
• What format is supposed to be used - particularly for things like dates.
• Lack of information - what if all the information required by the form isn't available immediately?
The design may not let you proceed beyond the dialogue box (a small validation sketch follows this list).
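
A minimal sketch in Python of how a designer might build these constraints in explicitly, feeding clear error messages back to the user (the field names, the 30-character limit and the date format are assumptions for illustration, not from the notes):

    from datetime import datetime

    MAX_NAME_LENGTH = 30   # assumed limit; it should be shown to the user, not hidden

    def validate_form(name: str, dob: str) -> list[str]:
        """Return feedback messages; an empty list means the form can be submitted."""
        errors = []
        if len(name) > MAX_NAME_LENGTH:
            errors.append(f"Name must be {MAX_NAME_LENGTH} characters or fewer "
                          f"(you typed {len(name)}).")
        try:
            datetime.strptime(dob, "%d/%m/%Y")   # make the expected format explicit
        except ValueError:
            errors.append("Date of birth must be in the form DD/MM/YYYY, e.g. 03/02/1990.")
        return errors

    print(validate_form("Norbert Ochieng", "1990-02-03"))
    # -> ['Date of birth must be in the form DD/MM/YYYY, e.g. 03/02/1990.']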

However, this style does have strengths. If the system is well-designed, any novice can use
it, it has a strong analogy to paper forms, and the user can be led easily through the
process. However, forms are easy to design badly, and hence confuse and irritate the user.
Direct Manipulation

The heart of modern graphical user interfaces (GUIs, sometimes referred to as WIMP -
Windows, Icons, Mouse/Menus, Pointers) or "point and click" interfaces is direct
manipulation (note, GUI and WIMP themselves are not interaction styles).

The idea of direct manipulation is that you have objects (often called widgets) in the interface
(icons, buttons, windows, scrollbars, etc) and you manipulate those like real objects.

One of the main advantages of direct manipulation is that you get (almost) immediate natural
feedback on the consequences of your action by the change in the object and its context in the
interface. Also, with the widgets being onscreen, you are provided with clues as to what you
can do (doesn't help much with items buried in menu hierarchies). The nature of widgets
should tell you something about what can be done with them (affordance), and with widgets,
many more objects can be presented simultaneously (more than menus).

However, interface actions don't always have real world metaphors you can build on (e.g.,
executing a file), information-rich interactions don't always work well with the object
metaphor, and direct manipulation becomes tedious for repetitive tasks, such as dragging,
copying, etc.

Natural Language Interfaces

The ultimate goal of the conversational metaphor is for us to have a dialogue with the
computer in our own natural language. This might be written, but spoken seems easier. Voice
in/voice out systems have been developed, where the computer recognises what you say
(using speech recognition technology) and produces a response, spoken via text-to-speech
(TTS).

This kind of system should be "walk up and use", just speak your own language and this
should suit both novices and expert users. This isn't yet technologically feasible, as the AI is
largely fake and can only recognise key words. If speech recognition fails, it is easy to get
into nasty loops, where it is not clear as to how you get out again (this is the feedback
principle). There is also the problem of different accents and recognising older voices, and
privacy and security issues are introduced.

Design Lifecycles
How do we get from interaction ideas and information about the users and how the users
would like the system to work to an elegant, interactive system? Methods are needed to
structure and organise the work, and one of the key points is to involve users in all phases of
design. It's often argued that it is too slow and difficult to involve users in the design cycle, so
how can this be tackled?

A number of different theories and methods have been developed to describe this process. In HCI,
these are called design lifecycles, and all lifecycle theories and models include four basic
components, although they emphasise different aspects of them. These components are:

• Identifying user needs/requirements


• Developing (alternative) designs
• Building (prototype) versions of the designs
• Evaluating designs

Before we can continue, we need to consider what we mean by the term users. We can
consider three levels of users:

• Primary users - regular/frequent users of the system


• Secondary users - occasional users of the system, or those who use it only through an
intermediary
• Tertiary users - those affected by the system introduction, or who will influence purchase of
the system

You can also consider stakeholders "people or organisations who will be affected by the
system, and who have a direct or indirect influence on the system requirements".

Waterfall Model

See MSD.

The advantages of the waterfall model are that each stage produces a set of specifications that
can be handed on to the next stage, and the work is compartmentalised into clear chunks.
There is some feedback from each stage back to the previous one, but this slows down the
whole process.

With the waterfall model, it is very problematic to change the requirements as the project
develops, and it is difficult to consider alternative designs. Any change in design ideas is
difficult to implement, as the system must be recoded.

Spiral Model

See MSD.

This model never really gained acceptance in the long-term in software engineering, or in
human-computer interaction.

In the 1980s, two models emerged from work in HCI rather than software engineering (where
the waterfall and spiral models originated). These models focus on the need to build interfaces and
interactions to meet the needs of the users.
The Star

Here, evaluation is at the centre of the design process; we must be doing evaluation at all
stages of the design process, and we move around the star from analysis clockwise.

We can consider two modes of design activity:

• Analytic mode - top-down, organising and formal; working from the systems view towards
the user's view
• Synthetic mode - bottom-up, free-thinking, creative and ad-hoc; working from the user's
view to the systems view

Interface designers need to flip between these two modes.

The star lifecycle captures something important about the real design process - that we need
to work from both ends of the problem, from the user's perspective and from the perspective
of the technology, but it doesn't tell us very much about the ordering and nature of the
different processes involved in design and how to move a design forward.
Iterative User-Centred Design

This method was developed by Jack Carroll and colleagues and has the advantage of there
being numerous opportunities to include user requirements and alter specifications as new
knowledge is developed from studying prototypes. You are not committed to genuine coding
until a satisfactory design has been prototyped.

The problem with this method is that you do not always know when to break out of the cycle.
Evaluating a prototype may not be accurate and may not reveal all of the problems in the
prototype.

Usability Engineering Lifecycle

The UE lifecycle takes a combination of HCI and software engineering methods, and is more
useful for big systems. It is not so much a new concept (despite being relatively recent - Mayhew,
1999) as a bringing-together of lots of previous ideas, taking the best bits from previous
methods.

Here, the lifecycle is divided up into three phases.

Requirements Analysis

This consists of user profiling (understanding the users), task analysis, consideration of
technical and platform-related capabilities and constraints and the establishment of general
design principles - a definition of your usability goals, sometimes called a style guide or
initial design specification.

Design/Testing/Development

This can be broken down into three levels. At level 1, this is the conceptual model or design,
a mock-up (lo-fidelity prototype), which is evaluated to eliminate major design flaws. Level 2
is the screen design standards (SDS), designing the basic interaction styles, screens and then
evaluating these. This sub-phase should allow testing against usability goals. Level 3 is the
detailed interface design, a fine-grained design of the interface and interaction, which allows
further testing against usability goals.

Installation

This is the coding of the real system, where feedback from users working with the real
system (beta testers) is used.

This method has the advantage of stressing lots of user testing and involvement and divides
the development into clear phases of an increasingly realistic design, but it is difficult to
change requirements as you learn about the users' understanding of, and interaction with, the
system.

Scenario-based System Design

This was a further development by Carroll and Rosson in 2002 and evolved from the
difficulties of using multi-disciplinary teams for system development. How can all these
people work together and understand each other's visions and problems? The premise of the
method is that everything up until the implementation phase can be done in this way.
It is based around scenarios, or stories, which elaborate the design problem, and
proposed design solutions are then written for those problems. These start out looking quite simple, but
increase in complexity as the design is articulated.

All scenarios have characteristic elements:

• Setting - situation elements that explain or motivate goals, actions and relations to the
actors
• Actors - the humans interacting with the technology or other setting elements; personal
characteristics relevant to the scenario
• Task goals - effects on the situation that motivate actions carried out by the actors
• Plans - mental activity directed at converting a goal into a behaviour
• (User) Evaluation - mental activity directed at interpreting features of the situation
• Actions - observable user behaviour
• Events - external actions or reactions produced by the computer, or other features of the
settings; some of these may be hidden to the actors, but important to the scenario

The first step is to write a problem scenario, which is representative of the current situation.
There may be many of these, to cover a variety of actors, situations, etc. They are not called
problem scenarios because they emphasise problematic aspects of current practices, but
because they describe activities in the problem domain.

The next step is claims analysis, where each feature is identified and then the positive and
negative effects listed. Claims analysis may lead to elaboration of scenarios. To move
forward from the claims analysis, you should only have positive features.
The next step is to write activity scenarios - what and how an activity is going to address a
problem is the focus of activity scenarios. Only high levels of abstraction need to be
considered, no specific details. The design team first introduces concrete ideas of new
functionality and new ways of thinking about users' needs. As in other steps in the process, a
claims analysis is generated to help identify tradeoffs as you move forward with prototypes.

Further levels involve rewriting the scenarios at higher levels of detail (or lower levels of
abstraction, depending on how you want to look at it), and dealing with new issues which are
caused. Information and interaction design scenarios specify representations of a task's
objects and actions that will help make users perceive, interpret and make sense of what is
happening.

The goal of interaction design is to specify the mechanisms for accessing and manipulating
the task information and activities.

See MSD.

After all of these stages, we can move onto UML, etc and treat it like a classic design
problem.

This does have the advantage of allowing quite detailed development of the design without
committing to coding or prototyping, and it makes it easier to change requirements and consider
alternative designs. However, we must consider the disadvantage of not giving the users
hands-on experience with even lo-fidelity prototypes, which might otherwise inform the views of the
designers. All prototypes are created after the design has been set, and it is difficult to measure
against usability goals until late in the process.

Gathering Requirements for Designing Interactive Systems
How do we find out what users really want in an interactive system?

Unfortunately, it's still very typical for developers to say "we could be users of this system
and we like/understand/can use it, so it must be okay". Even if you are part of the target user
group, if you have helped develop a system, it's inevitable that you will like and understand
it, so trying to establish requirements and then evaluate it on yourself is invalid. You must
have independent people outside the development group, whom you involve in establishing
requirements and doing evaluations. Ideally, these should cover the full range of users, not
just the core group (in which case, the development team probably isn't diverse enough
anyway). So, you need to think about the range of users for the system - men/women,
children/young/middle-aged/old, expert/novice with technology - and you need to make sure
the full spectrum is represented in the people you involve in requirements gathering and evaluation.

In software engineering, you typically have different types of requirements - functional and
non-functional. Functional requirements are the physical attributes/functions of the
system/product - basically, what it does. Non-functional requirements describe how well it does this.
For a mobile phone, a functional requirement may be that the phone allows users to store common
numbers in an address book, and a non-functional requirement may be that it can receive calls
in 95% of urban areas.

Because the design of interactive systems includes such a variety of systems, tasks, users and
contexts, the types of requirements have been expanded. We still have functional
requirements, but non-functional requirements are broken up into more specific requirements:

• Data requirements - Lots of interactive systems deal with data, and users have requirements
about how it is handled
• Environmental requirements, or the contexts of use, which themselves can be split up into:
o Physical requirements - e.g., the system must be usable in a wide range of lighting
conditions, in noisy conditions and in extremes of heat (consider things like using
gloves with the system)
o Social requirements - Will the user be solo, or working in a group? For example, laptop
computers are designed for solo use and privacy, but sometimes the workspace
needs to be shared. Will collaborators be in the same physical space or remote from
each other?
o Organisational requirements - Will the user have technical and training support? Is
sharing and collaboration encouraged, or are authority and hierarchy important?
What is the communications infrastructure like - broadband, omnipresent, stable?
• User requirements - The specific requirements of the intended user groups. You can try to
define a "typical user" or a range of "user profiles"
• Usability requirements - These try to capture the usability goals and measurable criteria.
These could be very general, such as "easy to use", or very specific, such as "95% of users between
the ages of 18 and 60 should be able to withdraw £50 within 60 seconds with a satisfaction
rating of 4/5" (a small sketch of checking such a goal against data follows this list).
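
A measurable goal like the cash-machine example can be checked directly against evaluation data. A minimal sketch in Python (the sample results are invented for illustration):

    # Each tuple: (seconds taken to withdraw £50, satisfaction rating out of 5)
    results = [(42, 5), (55, 4), (38, 4), (71, 3), (49, 5), (58, 4)]

    goal_seconds, goal_proportion, goal_satisfaction = 60, 0.95, 4.0

    within_time = sum(1 for secs, _ in results if secs <= goal_seconds) / len(results)
    mean_satisfaction = sum(rating for _, rating in results) / len(results)

    print(f"{within_time:.0%} finished within {goal_seconds}s (goal {goal_proportion:.0%}); "
          f"mean satisfaction {mean_satisfaction:.1f}/5 (goal {goal_satisfaction})")
    # -> 83% finished within 60s (goal 95%); mean satisfaction 4.2/5 (goal 4.0)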

There are problems that you can come across when eliciting requirements from users. We
need to find out what people want before we start developing new interactive systems. If
there is an existing technology in use, or an existing system, even if it's rather low-tech, we
can start by studying how users interact with that system, what problems they have and ask
them how they would like it improved.

Often, however, this situation isn't as simple - new interactive systems are highly innovative
and allow us to do completely new things (e.g., MP3 players). Before we have them, if you
ask users how they would like them, they won't have any idea as they can't imagine how they
would use them. Users often find new ways of using systems once they have them - the first
killer application for PCs was word processing - not spreadsheets and databases as had been
predicted; no-one predicted the rise in SMS on mobile phones.

Nonetheless, we still need to involve users from the very beginning in the design process. We
can get them to participate in the creative process of developing a completely new interactive
system or improving on an existing one. There are two basic ways of eliciting requirements:

• Ask (potential) users questions


• Observe them doing things (with non-interactive systems or with older interactive systems)

Examples of question asking techniques include:

• Questionnaires
• Interviews
• Focus groups and workshops (perhaps better to call these discussion techniques as well as
question asking techniques)

And examples of observational techniques include:

• Naturalistic observations
• Think aloud protocols
• Indirect observation (unobtrusive methods)

Questionnaires

Questionnaires are produced in a written format that requires very careful work (piloting and
refining a questionnaire) to ensure that the questions are absolutely clear to all respondents
and that you are collecting all the information you need.

Questionnaires are good for when the issues you want to address are well-defined (e.g.,
finding out problems users are having with a current system to be improved) and you need to
get information from a lot of people in a way that can be relatively quickly analysed.
Conversely, they are less good for situations where the questions are not very well defined
(e.g., when developing a very novel system) and the response rates for questionnaires are
very low - 40% is considered good.

The better a questionnaire is, the more likely it is to have a higher return rate, and your data
will also be of better quality. Following up questionnaires with reminder letters, phone calls
and e-mails, as well as offering incentives such as raffle prizes, all improve response rates too.
The length of the questionnaire should be commensurate with the importance of the
topic, but 20 minutes is too long for any questionnaire.

There are three types of questions on questionnaires:

• Open-ended - a completely free response. This is good when you want to elicit all kinds of
information, want respondents to be creative (many enjoy this) and you don't know what
they might say. It is more difficult to analyse, however, particularly if you have lots of
respondents - you need to develop your own categories to group the answers, and the
problem is then deciding whether two differently worded answers belong in the same category.
This judgement can be biased, so two people often need to look at the answers.
• Closed questions - These are yes/no-style questions with a limited set of options, often with
"Don't Know" or "Other" as an option (these answers are desired in less than 10% of cases,
however). The options for closed questions can often be decided by piloting an open-ended
question and then using the answer groupings from that to decide the closed question
answers. Closed questions are easy for respondents to answer and good for getting lots of
data that's easy to process and understand. They are fundamentally categorical in nature,
however - answering "what" type questions rather than "how much" questions about how people
found something to use. These questions are very useful in designing systems - you may want to
compare different designs, or compare one design as it evolves over time. Eliciting extreme
information from people ("What did you like most/least about X?") is also good.
• Likert scales - These are a very simple and neat way of measuring "how much" type
questions. The Likert (or rating) scale gives a scale from 1-5, where 1 is agree and 5 is
disagree (or you could use 5, 7 or 9 gradations, depending on how much you want the
respondents to discriminate). Likert scales are very useful and should be used for everything
(according to Helen). They can be combined with open-ended comments where users can
justify their ratings. The disadvantage of Likert scales is that people can express no strong
preference either way. When you're comparing multiple examples and want to know the
best one, asking people to rank the options in order can often be more useful (a small sketch
of summarising Likert ratings follows this list).
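
A minimal sketch in Python of summarising Likert ratings when comparing two designs (the ratings themselves are invented; on the scale described above a lower score means stronger agreement):

    from statistics import mean, median

    # 1 = agree ... 5 = disagree, as in the scale described above
    ratings = {
        "Design A": [1, 2, 2, 3, 1, 2, 4],
        "Design B": [3, 4, 2, 5, 4, 3, 4],
    }

    for design, scores in ratings.items():
        # Likert data are ordinal, so the median is often reported alongside the mean
        print(f"{design}: median {median(scores)}, mean {mean(scores):.2f}, n = {len(scores)}")
    # -> Design A: median 2, mean 2.14, n = 7
    # -> Design B: median 4, mean 3.57, n = 7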

There is a checklist to creating a good questionnaire:

• Make questions clear - avoid double-barreled questions (e.g., "Do you think it'll be useful to
make the icons or buttons larger?" - does the answer refer to the icons or the buttons?).
• Make questions specific - the generalising should be done by the analyst, not the
respondent (e.g., "How many hours do you use the Internet a week?" - the respondent here
is trying to average in their head. It's better to ask just about yesterday and then multiply by
7, taking into account factors like weekends - this is the same as what the respondent is doing, but
you're likely to do it better with a better process).
• Think about the ordering of the questions - Ask easy questions first to get the respondent
relaxed and lulled into a false sense of security. They are more likely to complete a
questionnaire once they have started it. Personal questions should be asked last (people
don't like answering these, but if they've completed the rest of the questionnaire, they
probably will). Does answering one question lead you into answering another one in a
certain way?
• Make scales intuitive, clear and consistent - if you use numbers, 1 = low agreement and 5 =
high. This is intuitive. Also, when asking closed questions, questions should be either positive
or negative - mixing them up occasionally does keep respondents on their toes, however!
• Avoid technical or HCI jargon (e.g., when asked about navigating around a website, a lot of people
associate navigation with ships and compasses).
• Provide clear instructions on how to complete the questionnaire and specific questions
(people will ignore you, but at least you tried). Examples often help.
• Clear layout is important and helpful - enough room is required for open questions, but a
balance is needed between too much whitespace (which can make the questionnaire look
intimidatingly long) and enough room to answer.

Interviews

Interviews can range from the very formal, rather like a questionnaire given face-to-face, to
very informal, where the interviewer has a set of topics, but has a very free ranging
conversation with the interviewee. A lot of the principles from questionnaire design still
apply to interviews - clear question design, ordering of questions, etc, etc.

Interviews allow you to develop a relationship with the interviewee, and because you are
talking, you can elicit much more information, as well as explaining things the interviewee
might not understand and being able to tailor the questions to the interviewee much more
easily.

Interviews are much more time-consuming than questionnaires, and the interviewee might feel
a bit intimidated and may not reveal personal information. This method is also more prone to
"researcher bias" - with the interviewee telling the interviewer what they think the interviewer
wants to hear, rather than what they actually think.
Focus Groups

Normally, 5-7 people are brought together, usually from a particular user group to discuss a
particular system/problem. They are facilitated by a researcher who has a list of
questions/topics to be covered and who needs to discreetly guide the discussion. Focus
groups normally last for about 2 hours and are a fairly time-efficient way of eliciting
information and bouncing ideas around - people can spark ideas off each other. It is less good
at eliciting personally sensitive data, however.

Observation

There are three basic types of observational studies:

• "Quick and dirty" observation (good for initial user requirements elicitation) - right at the
beginning of the design process, you might want to get a general idea of how people are
using a current technology (and the problems they are having), or how they might use a new
technology - "quick and dirty" observation is a good way to accomplish this.
• Observation in a (usability) lab - if you want to study in detail how users interact with
technology, you want to have them in a controlled laboratory setting. This allows us to
observe how users undertake a range of tasks which are thought to be vital, to gauge
performance and "affective" reactions (how they feel).
• Observation in field studies - for a variety of reasons you might need to take a really detailed
look at the current use of technology (a "quick and dirty" observation may not reveal the
problems). Also, when an early prototype is available, this method can be used for evaluation,
with the prototype used in a real-world context.

There are also different levels of observer involvement; "insider-outsiderness". The most
extreme level of this is that of total outsider, where the outsider does not participate in the
situation at all (doesn't help people who get stuck or ask them about their experience), and the
other extreme of this is marginal/complete participation, for example where an observer
becomes part of a company and uses their system for a while, participating in the processes of
the system. In the middle there are observers who also participate, by creating certain
configurations to see how participants react.

Marginal/complete participation in observation is referred to as ethnography - a term from
anthropology, where anthropologists go and live in a different culture and participate in their
lives in order to understand their customs. In both anthropology and HCI, this is a big
undertaking, with researchers sometimes participating in the use context of a situation for
months, if not years. It can be done on a small scale, however, by observing and
interviewing users and the researchers trying out the technologies for themselves.

Observation can be considered as varying on a scale from hypothesis-driven (where you have
a clear idea what you are looking for, what the problem is and what your theory is) to holistic
(where you are not sure what's going on, what the problem is and what you are going to find),
and this will influence exactly how your observations are done. With hypothesis-driven observation,
it is clear what specific behaviours you should be observing (in a classroom study, for example), whereas with
holistic observation, you will need to observe everything and make sense of it later - there is a serious danger of
drowning in the information you collect, however.

There are different methods of recording observations:


• Notes plus still camera - Taking handwritten notes seems the most basic way of noting down
observations, but it is very difficult to write and observe at the same time, and very difficult
to write quickly enough to give much useful detail later. A dictaphone could be used in a
public situation to make up for this, and combining it with something like a hands-free kit
can make it look like you're on a mobile phone, making you less conspicuous. Taking still
pictures of situations can be very useful, but it may upset observees in public situations (and
has ethical implications). This method is okay for use in labs, and is possible, if difficult, in
public.
• Checklists of behaviours - A list is made up of the behaviours of interest, and all you need to do
is simply check off whether and how frequently they occur. This is good if the observation is
hypothesis-driven, but you may need a pilot study to work out the checklist. Once the
checklist is working, it saves an enormous amount of effort collecting and later analysing the
data (a small tally sketch follows this list).
• Audio recording plus still camera - Small, inconspicuous digital tape recorders are useful, but
this only works if the participants are talking, and even then it is difficult to work out what is
happening without a visual record; still shots can help. If you choose to transcribe the whole
conversation, this can be very time consuming, but it may not be necessary - relevant points
and problems can be picked out and a summary generated.
• Video - This is good for capturing all visual and aural information, but it can be intrusive and
make people feel self-conscious (although they usually forget about the camera when they
are focussing on the technology). Video is very time-consuming to analyse (1 hour video =
100 hours analysis), but as with audio recording, it may be summarised with only important
segments analysed in detail. Additionally, videoing public places may not capture all
available information.
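
For the checklist approach above, the tallying itself is trivial to automate. A minimal sketch in Python (the behaviours and observations are hypothetical):

    from collections import Counter

    # Behaviours of interest, decided in advance (e.g., from a pilot study)
    checklist = ["asks for help", "uses undo", "reads on-screen help", "abandons task"]

    # What the observer ticked off during one session
    observed = ["uses undo", "asks for help", "uses undo", "abandons task", "uses undo"]

    tally = Counter(observed)
    for behaviour in checklist:
        print(f"{behaviour}: {tally[behaviour]}")
    # -> asks for help: 1, uses undo: 3, reads on-screen help: 0, abandons task: 1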

Frameworks have been developed for doing observation, as when performing observation,
especially holistic studies, there is a lot of information to record and the appropriate
information needs to be captured.

A very simple framework for field observation is the who/where/what framework. Who is the
person using the technology (age, gender, type, etc.) and are others involved? Where are they
using it (the place) and what are the important characteristics of that environment? What is
the thing/task they are doing?

A more complex framework has been developed based on the simple one above which
considers more factors:

• Space: What is the physical space and how is it laid out?


• Actors: What are the roles and relevant details of those involved?
• Activities: What are the actors doing and why?
• Objects: What physical objects are present, such as furniture?
• Acts: What are specific individuals doing?
• Events: Is what you are observing part of a special event?
• Goals: What are the actors trying to accomplish?
• Feelings: What is the mood of the group and individuals?
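
One way to keep field notes consistent across observers is to turn this framework into a fixed record, as in this minimal sketch in Python (the field names mirror the framework above; the example values are invented):

    from dataclasses import dataclass

    @dataclass
    class FieldNote:
        space: str        # the physical space and how it is laid out
        actors: str       # roles and relevant details of those involved
        activities: str   # what the actors are doing and why
        objects: str      # physical objects present, such as furniture
        acts: str         # what specific individuals are doing
        events: str       # whether this is part of a special event
        goals: str        # what the actors are trying to accomplish
        feelings: str     # mood of the group and of individuals

    note = FieldNote(
        space="Small open-plan office, shared printer in the corner",
        actors="Two clerks and one supervisor",
        activities="Entering customer orders under time pressure",
        objects="Paper order forms, telephones, a shared printer",
        acts="Clerk A re-types forms rejected by the system",
        events="End-of-month accounts deadline",
        goals="Clear the order backlog before 5pm",
        feelings="Hurried, mildly frustrated",
    )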

In laboratory observation, one of the frustrating things is that you can get a lot of detail
(particularly from video analysis) and still not know what is going on. In particular, we have
no insight into the user's mental model of what is happening. A variation of pure
observational technique has been designed to deal with this situation - this is called the "think
aloud" or "concurrent verbal" protocol, where the person is asked to say out loud everything
that they are thinking and trying to do, so that their processes (and mental model) are
externalised. It is usually very straightforward to get the person to do this; they can be gently
prompted if they go silent.

Alternatives to this include getting them to work in pairs and describe what they are doing to
each other. In some situations, you can get one person to teach another how to do something.

For some tasks, it can be disruptive to get a person to talk whilst doing the task (e.g.,
something that requires a lot of concentration, or when talking is part of the task), so a
variation on the above called retrospective verbal protocol can be used. Here, you video the
person and immediately (while it's still fresh in their minds) get them to watch the video and
talk through what they are thinking at the time. However, you often lose a lot of the finer
detail of the thoughts going through your head, and it's often embarrassing to watch a video of
yourself.

In some instances, actually observing people may be too intrusive. We can ask people to keep
diaries of their behaviour, but this often falls off as people lose interest, and structure and
incentives are needed. Another method is interaction logging: taking some kind of automatic
log of the key presses, mouse movements and clicks, which at a somewhat more meaningful
level gives us the menu items chosen, web pages and links visited, etc. This information
by itself is not that useful (did the user find that page helpful/interesting/etc.?), so it needs to
be combined with video or audio recording from the user, preferably with verbal protocol data.
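
A minimal sketch in Python of the kind of time-stamped interaction log described here (the event names and fields are assumptions for illustration, not a real toolkit's API):

    import json
    import time

    LOG_FILE = "interaction_log.jsonl"   # one JSON record per line

    def log_event(event_type, **details):
        """Append a time-stamped interaction event, e.g. a click or menu choice."""
        record = {"t": time.time(), "event": event_type, **details}
        with open(LOG_FILE, "a") as f:
            f.write(json.dumps(record) + "\n")

    # Example events a UI toolkit might feed to the logger:
    log_event("menu_select", menu="File", item="Save As")
    log_event("key_press", key="Ctrl+Z")
    log_event("page_visit", url="https://example.com/help")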

Logging is a completely unobtrusive measure - the user does not need to know it is
happening. Psychology has long liked the idea of unobtrusive measures like this, measures
which do not interfere with the behaviour at all. HCI needs more useful unobtrusive methods.

Ethics

If we involve people in research, we have an ethical obligation to inform them that we are
collecting data from them and using it. This is often a problem with observational data,
particularly if it is collected in a public place and this ethical obligation is ignored.
Unobtrusive methods are attractive specifically because they do not upset the naturalness of
behaviour - informing people would definitely do so. There is no easy way out of this, but
the best rule of thumb is to get people's consent whenever you can. Collected research data should
be anonymous.

Task Analysis and Modelling


A user has goals and needs to know methods of achieving these goals. The user needs
feedback when the goals are achieved. A good interface can make it more or less easy to
achieve these goals. We need to understand the users' needs and their goals to design good
interfaces.

A task is the set of activities (physical and/or cognitive) in which a user engages to achieve a
goal. Therefore, we can distinguish between a goal, the desired state of a system, and a task,
the sequence of actions performed to achieve a goal (i.e., it is a structured set of activities).
Goals, tasks and actions will be different for different people.
Task analysis and modelling are the techniques for investigating and representing the way
people perform activities: what people do, why they do it, what they know, etc. They are
primarily about understanding, clarifying and organising knowledge about existing systems
and work. There is a lot in common with systems analysis techniques, except that the focus
here is solely on the user and includes tasks other than those performed with an interactive
system. The techniques are applied in the design and evaluation of training, jobs and work,
equipment and systems, to inform interactive system design.

In this module, we'll focus on task decomposition (splitting the task into ordered subtasks),
but there are more advanced techniques covered in the textbooks - knowledge based
techniques (what the user knows about the task and how it is organised) and entity/object
based analysis (relationships between objects, actions and the people who perform them).

Task analysis involves the analysis of work and jobs and involves collecting data (using
techniques such as interviews and observations) and then analysing it. The level of granularity
of task analysis depends on various factors, notably the purpose of the analysis. Stopping
rules are used to define the depth of a task analysis - the point at which it is appropriate to
cease decomposing. User manuals often stop too early.

Some general rules of thumb are to stop when the action is a complex motor action
and no problem solving is involved, or when the user doesn't articulate any lower-level
activities. Other places to stop are when the likelihood and cost of error in the task are below a
certain threshold, or when the sub-tasks are outside the scope of the current project.

Task modelling represents the results of task analyses as task models. There is no specific,
correct model. Specific models describe one instance of a task as performed by one person. A
generic task model generalises across many instances to represent the variations in the task.

Hierarchical Task Analysis

Hierarchical task analysis (HTA) is concerned with observable behaviour and the reasons for
this behaviour. It is less detailed than some techniques, but is a keystone for understanding
what users do.

HTA represents tasks as a hierarchical decomposition of subtasks and operations, with
associated plans to describe sequencing:

• tasks/subtasks - activities to achieve particular goals/subgoals


• operations - lowest level of decomposition, this is the level defined by the stopping rule
• plans - these specify the sequencing of activities associated with a task and the conditions
under which the activities are carried out

An HTA can be represented by structured, indented text, or using a structured chart notation.
A text version of such a chart might be:

• 0. To photocopy a sheet of A4 paper:
   1. Enter PIN number on the photocopier
   2. Place document face down on glass
   3. Select copy details
      3.1. Select A4 paper
      3.2. Select 1 copy
   4. Press copy button
   5. Collect output

• Plan 0: Do 1-2-4-5 in that order; when the defaults are incorrect, do 3
• Plan 3: Do any of 3.1 or 3.2 in any order, depending on the default settings
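
As a minimal sketch (the nested-dictionary representation is my own choice, not a standard HTA notation), the photocopying HTA above could be captured in Python as tasks with their plans attached:

    photocopy_hta = {
        "goal": "0. Photocopy a sheet of A4 paper",
        "plan": "Do 1-2-4-5 in that order; when the defaults are incorrect, do 3",
        "subtasks": [
            {"goal": "1. Enter PIN number on the photocopier"},
            {"goal": "2. Place document face down on glass"},
            {"goal": "3. Select copy details",
             "plan": "Do any of 3.1 or 3.2 in any order, depending on the default settings",
             "subtasks": [
                 {"goal": "3.1 Select A4 paper"},
                 {"goal": "3.2 Select 1 copy"},
             ]},
            {"goal": "4. Press copy button"},
            {"goal": "5. Collect output"},
        ],
    }

    def print_hta(task, depth=0):
        """Print the hierarchy as indented text, with each plan next to its task."""
        plan = f"   [plan: {task['plan']}]" if "plan" in task else ""
        print("  " * depth + task["goal"] + plan)
        for sub in task.get("subtasks", []):
            print_hta(sub, depth + 1)

    print_hta(photocopy_hta)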

We can consider different types of plan:

• fixed sequence (e.g., 1.1 then 1.2 then 1.3)


• optional tasks (e.g., if the pot is full then 2)
• wait for events (e.g., when the kettle boils, 1.4)
• cycles (e.g., do 5.1-5.2 while there are still empty cups)
• time-sharing (e.g., do 1 and at the same time, do 2)
• discretionary (e.g., do any of 3.1, 3.2 or 3.3 in any order)
• mixtures - most plans involve several of the above
Is waiting part of a plan, or a task? Generally, it is a task if it is a 'busy' wait (you are actively waiting),
or part of a plan if the end of the delay is the triggering event, e.g., "when the alarm rings", etc.

To do HTA, you first need to identify the user groups and select representatives and identify
the main tasks of concern. The next step is to design and conduct data collection to elicit
information about these tasks:

• The goals that the users are trying to achieve


• The activities they engage in to achieve these goals
• The reasons underlying these activities
• The information resources they use

These steps can be done with documentation, interviews, questionnaires, focus groups,
observation, ethnography, experiments, etc...

The data collected then needs to be analysed to create specific task models initially.
Decomposition of tasks, the balance of models and the stopping rules need to be considered.
The specific task models should then be generalised to create a generic task model - from
each task model for the same goal, produce a generic model that includes all the different
ways of achieving the goal. Models should then be checked with all users, other stakeholders,
analysts and the process should then iterate.

To generate a hierarchy, you need to get a list of goals and then group the goals into a part-
whole structure and decompose further where necessary, applying stopping rules when
appropriate. Finally, plans should be added to capture the ordering of goal achievement.
Some general things to remember about modelling are:

• Model specific tasks first, then generalise


• Base models on real data to capture all the wonderful variations in how people do tasks
• Model why people do things, as well as how
• Remember, there is no single, correct model
• Insight and experience are required to analyse and model tasks effectively and to then use
the models to inform design

When you are given an initial HTA, you need to consider how to check/improve it. There are
some heuristics that can be used:

• paired actions (for example, place kettle on stove can be broken down to include turning on
the stove as well)
• restructure
• balance
• generalise

There are limitations to HTA however. It focuses on a single user, but many tasks involve the
interaction of groups of people (there is a current shift towards emphasising the social and
distributed nature of much cognitive activity). It is also poor at capturing contextual
information and sometimes the 'why' information and can encourage a focus on getting the
model and notation 'right', which detracts from the content. There is a danger of designing
systems which place too much emphasis on current tasks or which are too rigid in the ways
they support the task.
There are many sources of information that can be used in HTA. One such source is documentation -
although manuals only say what is supposed to happen, they are good for key words and for
prompting interviews. Others are observation (as discussed above) and interviews (who is the
expert: the manager or the worker? - interview both).

We could also consider contextual analysis, where physical, social and organisational settings
of design need to be investigated and taken into account in design. For example:

• physical (dusty, noisy, light, heat, etc...)


• social (sharing of information/displays, communication, privacy, etc...)
• cultural (etiquette, tone, reading/scanning style, terminology, data formats, etc...)
• organisational (hierarchy, management style, user support, availability of training, etc)

GOMS

GOMS is an alternative to HTA that is very different. It is a lot more detailed and in depth
and is applied to systems where timing and the number of keystrokes are vital. It is more
complex to use than HTA, and it is not common. GOMS is an acronym for:

• Goals - the end state the user is trying to reach; involves hierarchical decomposition into tasks
• Operators - basic actions, such as moving a mouse
• Methods - sequences of operators or procedures for achieving a goal or subgoal
• Selection rules - invoked when there is a choice of methods
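
Since GOMS is used where timing and keystrokes matter, task times are often estimated by summing operator times. Below is a minimal sketch in Python in the spirit of the keystroke-level variant of GOMS; the operator times are approximate published estimates, and the delete-a-word task, its two methods and the implied selection rule are illustrative assumptions, not from the notes:

    # Approximate operator times in seconds (keystroke-level-model style estimates)
    OPERATOR_TIME = {
        "K": 0.28,   # press a key or button
        "P": 1.10,   # point at a target with the mouse
        "H": 0.40,   # move hand between keyboard and mouse
        "M": 1.35,   # mental preparation
    }

    def estimate_time(operators):
        """Sum the operator times for one method of achieving a goal."""
        return sum(OPERATOR_TIME[op] for op in operators)

    # Goal: delete a word. Two methods; a selection rule could prefer the faster one.
    method_menu = list("HPKPK")   # hand to mouse, point and click twice (menu route)
    method_keys = list("MKK")     # think, then a two-key shortcut

    print("menu method    :", round(estimate_time(method_menu), 2), "s")   # 3.16 s
    print("keyboard method:", round(estimate_time(method_keys), 2), "s")   # 1.91 s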

Prototyping
Users often can't say what they want, but as soon as you give them something and they get to
use it, they know what they don't want. A bridge is needed between talking to users in the
abstract about what they might want and building a full-blown system (with all the expense
and effort that involves). The prototype is that bridge: it might be a paper-based outline of
screens, a video simulation of interaction or a 3D cardboard mock-up of a device.

Prototypes are very useful for discussing ideas with stakeholders; they encourage reflection
on the design, allow you to explore alternative designs without committing to one idea, and
make ideas from scenarios more concrete.

Low fidelity (lo-fi) prototypes are not very like the end product and are very obviously rough
and ready. They are cheap and simple to make and modify, and this makes it clear to
stakeholders that they can be criticised. They are fairly low risk; designers do not have much
invested in them, but they do not allow realistic use.

One form of lo-fi prototyping is storyboarding, a technique used in the film industry. A series
of sketches, usually accompanied with some text (e.g., a scenario), show the flow of
interaction. This is useful if you're good at sketching, otherwise it can be daunting.

Another method is using paper screenshots, where a sheet of paper or an index card is used
for each page, and overview diagrams can be used to show links between screenshots. Some
tools (such as Denim and Silk) exist to support this process while keeping the "sketchy"
look of the prototype.
Another method is the Wizard of Oz system. In the 1980s, a prototype system was created for
speech recognition. At the time, speech recognition was not yet available, so a fake
system was used, where a human actually did the recognition. The term is now used for any
system where the processing power is not implemented and a human "fakes" it. There is no
need to deceive the user for a Wizard of Oz system as it can work perfectly well if the user
knows it is a Wizard of Oz system.

High fidelity prototyping uses materials that you would expect to find in the end product or
system and looks much more like the final system. For software, people can use software
such as Macromedia Dreamweaver or Visual Basic, or in the case of web pages, writing
prototype HTML code. With high-fidelity prototyping, you get the real look and feel of some
of the functionality, it serves as a living specification, it is good for exploration of design
features and evaluation and it is a good marketing and sales tool.

It can be quite expensive and time-consuming however, especially if radical changes are
possible; developers are often unwilling to scrap high-fidelity prototypes. Evaluators tend to
comment on superficial design features rather than real design and user expectations can be
set too high.

When creating prototypes, you have to consider depth vs. breadth - do you prototype all of
the functionality but not go into any specific detail (horizontal prototyping), or do you
prototype one or more functions in a lot of depth (vertical prototyping)?

Evolutionary prototyping also exists, which is where you build the real system bit by bit and
evaluate as you go along. The opposite is throw-away prototyping, where a prototype is built
in one system (perhaps repeatedly, with changes), and then that prototype is thrown away
entirely and the real system is built from scratch (sometimes for efficiency, security, etc.).

A final type of prototyping is experience prototyping, where using prototypes allows the
designers to really understand the users' experience, e.g., using a wheelchair for a day for
different technologies like ATMs, etc...

Design
Conceptual Design

Conceptual design is taking the requirements and turning them into a description of the
proposed system in terms of a set of integrated ideas and concepts about what it should do,
how it should behave or look in a way that will be understandable by users and other
stakeholders. It is not yet a real detailed, fully specified design, but is the "general idea".

At this early stage of design, we should be considering alternate design ideas, so:

• Keep an open mind but never forget the users, their tasks and environments
• Discuss ideas with all the stakeholders as much as possible
• Use lo-fi prototyping to get rapid, but well-informed, feedback from stakeholders
• Iterate, iterate, iterate
The big decisions that need to be made at the conceptual design stage are what interaction
mode/style to use (which will be the most suitable for the types of users, tasks, environment,
etc.), what interface metaphor to use, and what interaction paradigm to adopt (desktop, wearable,
mobile). We also need to consider what functions the system will perform - task allocation:
what will the system do and what will the user do - and how these functions will relate to each
other (temporal ordering) and make themselves available to the user.

After working with the conceptual design - hopefully evaluating it, considering alternatives
and refining it - eventually you are happy (or you run out of time), and the next stage of design
is required: physical design.

Physical Design

Physical design deals with the detailed issues of both the physical interface and the
conceptual interface: what will the detailed screen layout, etc., look like? The number of
screens and icons and their general functionality will have been decided at the conceptual
design stage. In the physical design stage, you also need to consider the temporal
arrangement of actions and their reactions. High fidelity prototypes can be used at this stage
to see how different users react to different presentations and different detailed dialogue
arrangements.

Evaluation
There are numerous different techniques for evaluation, and some of the techniques we
covered earlier can be reused. The techniques range from very informal (observing users
interacting with technologies, a "quick and dirty" approach) through to the highly structured
approaches.

DECIDE Framework

Determine the goals - what do you really want to find out, which determines the method you
use, the people you involve, etc...

Explore the questions - often a high level question (e.g., "why don't people like this system?")
needs to be broken down to find concrete things to ask or measure - this is called
operationalising the problem.

Choose the evaluation paradigm and techniques - a number are available, and we will look at
this in more detail later.

Identify practical issues - e.g., appropriate users need to participate in any user-based
evaluation (selecting the right level of expertise, age range, gender mix, etc...). You need to
consider the tasks the user will be doing with the system, the facilities and equipment
available, any schedule or budget constraints to be considered, and the expertise of the
evaluation team.

Decide how to deal with ethical issues. It is very important to consider this and to deal with
participants involved in evaluations ethically. The British Psychological Society's "Ethical Principles for Conducting Research with Human Participants" is a good guide to consult
before conducting experiments with humans.

• You must inform participants approximately what they will be asked to do before getting
their consent (usually by signing a consent form).
• All information is confidential and participants should be told this - you can report results,
but not in a way that identifies individuals.
• Participants should not be subjected to undue stress (tasks that are too difficult to do),
boredom (evaluations should be interesting and informative) or fatigue (sessions should not
be too long).
• Participants need to know that they can leave an evaluation at any time if they are unhappy.
• Participants should be reimbursed appropriately for their time, but they shouldn't be bribed
to act against their better judgements.

Evaluate, interpret and present the data - is your data reliable, i.e., would you get the same
results if you collected them on another day, with another group of users (probably not the
exact same results, but the conclusions should be the same). The validity of the data also
needs to be checked; is it really measuring what you really want to measure?

User-based Controlled Evaluations

These are the "gold standard" of evaluations - the key is the control you hold over the
situation. Usually conducted in a usability lab, but this is not essential. Any controlled
situation (where the information available to the users, the tasks and the data collected are
controlled) will do, so long as the situation and results are shown to be replicable.

This is artificial, but it is so for a reason. By keeping everything constant apart from the one
thing you are interested in, you get rid of as much extraneous variation as possible, and
concentrate on what you are interested in. This form of evaluation does need to be
complemented by evaluation in realistic situations.

Control is needed due to confounding hidden variables - a problem when using correlational, natural data, as you do not know what other relationships exist that are not explicit.

Basic Setup

A number of typical tasks with a system are undertaken where the users are given appropriate
training, information, etc... The measures used in the evaluation are:

• Effectiveness - proportion of tasks completed successfully
• Efficiency - time to complete each task
• Errors made - types and number
• Learnability - time to obtain 95% error-free performance
• Memorability - proportion of tasks successfully completed after a certain time from learning
to use the system
• Perceptions of the system - normally measured on Likert scales
Experimental Setup

If only one issue is to be concentrated on, a classic experiment can be run, where only one
thing is varied, to test a particular hypothesis. The independent variable is the one that is
varied or changed, whilst everything else is kept the same. The dependent variables are the things that you measure to see what effect the manipulation of the independent variable has.
Dependent variables could include things like:

• the number of times a user succeeds in doing the tasks
• time taken to do the tasks
• false leads followed
• users' ratings of the usefulness and ease of use of the system (e.g., using 7-point Likert scales)

Floor and ceiling effects can be observed with dependent variables, however. Independent
variables may not alter the dependent variables, so there is no point in having that factor as a
variable. Pilot studies should be used to eliminate variables that have these effects.

It is often best to have different participant groups and generate means. This avoids contamination between the groups. A lot of people are needed in each group to overcome the problem of variation between participants in each group, such as intra- and inter-personal differences. This can be time-consuming and expensive. Statistics and pilot studies can give
you a power estimate of how many people are needed in each group to give a good value.
This is usually about 30 people in a group.

Sometimes you can use the same participants in different conditions you want to evaluate,
and this is called a within-participants design. You only need half the users and the amount of
variability can be decreased as individual particularities have an effect on both conditions.
You do need different tasks for the people to do, however.

One way to accomplish this is to have a pool of tasks which are similar. You need to counter-
balance the order of presentation of the different systems, in case the users get
tired/bored/more experienced as the study goes on.

If the above method does not work, we can use matched participants. Here, we have two
different groups of participants, but match the participants on relevant variables (such as age,
sex, experience, etc...). You then might need to match on other variables, which do not
necessarily have to match the same individual participant pairs in each group, as long as you
end up with the same composition in each group.

These types of controlled studies are perceived to be difficult and time-consuming, but this is
not necessarily the case. Alternatives have been developed using experts, instead of real
users. These methods are called expert, or inspection, methods.

One of these alternatives is heuristic evaluation, which is useful for a first evaluation before it
is given to users, and for eliminating initial flaws. The heuristics here refer to the set of
usability heuristics developed by Nielsen. User testing is still important, however, as "users
always do surprising things".

Between 10 and 12 heuristics have been distilled from the hundreds previously used, and these include:
• Visibility of system status - are users informed about what is going on and is appropriate feedback provided within a reasonable time about a user's actions.
• Match between the system and the real world - is the language used in the system simple and are the words, phrases and concepts used familiar to the user.
• User control and freedom - are there ways of allowing users to easily escape from places they unexpectedly find themselves in.
• Consistency and standards - are the ways of performing similar actions consistent.
• Help users recognise, diagnose and recover from errors - are error messages helpful and do they use plain language to describe the nature of the problem and suggest a way of solving it.
• Error prevention - is it easy to make errors, and if so, where and why.
• Flexibility and efficiency of use - have shortcuts been provided to allow more experienced users to carry out tasks more quickly.
• Aesthetic and minimalist design - is any unnecessary and irrelevant information provided.
• Help and documentation - is help information provided that can be easily searched and followed.

To conduct a heuristic evaluation, 3-5 HCI experts are used to compare the system to the
heuristics. The experts spend 1-2 hours going through the system. It is recommended that
they go through it twice, once to get used to the flow, and the second time as a user. Any
problems found are related to the heuristics and reported back. All the experts then come
together into a discussion group to collectively give an agreed rating to a set of problems. The
agreed ratings are divided into 4 groups: usability catastrophe, major problems, minor
problems and cosmetic problems.

Speech Based Interaction


These are sometimes called speech-in, speech-out systems (an avatar, conversational embodied agent or virtual human), and they can make interfaces easier and more natural to use. Speech-based output is built from designed utterances and the generation of new information.

Speech synthesis can occur by two main processes: text-to-speech synthesis, programs that
can take any text string and convert it to speech-like sounds; and copy synthesis, which consists of digital recordings of the human voice. Copy synthesis can be accomplished by
two methods, one is whole utterance copy synthesis and the other is splicing copy synthesis,
where digital recordings are spliced together at the phrase, word or syllable level.

With TTS synthesis, it can cope with any text string that you give it, no matter how weird the
English (or computer code, etc). Previously, extra hardware was needed, but it is now largely
software driven using a standard sound card. It can, however, sound mechanical; although this is less the case nowadays, the full range of human phonetics and phonology has yet to be implemented.

In copy synthesis, whole utterance sounds much more natural, mainly because it is, but the
vocabulary can be limited, so a fallback to TTS is sometimes required. Spliced copy synthesis
can be a good compromise, but can draw attention to the synthetic nature of the speech
because of odd transitions.

When we are designing utterances that a speech-based system may produce, this is equivalent to designing menus, icons, etc in a GUI, but the principles are going to be very different.
Here, we can draw strongly on research about how human-to-human dialogues are
undertaken. One of the key problems with speech-based systems however is the so-called
"bandwidth" problem, where there aren't lots of items on screen simultaneously, but a
sequence of items in time. These differences between speech and visual systems can be
expressed below:

Visual                                   Speech
persistent                               transient
need to focus on display                 can be looking anywhere or nowhere
easy to get overview                     difficult to get overview
can focus on particular component        must take in order presented
can turn away                            much more difficult to ignore

Grice's Maxims

Grice's conversational maxims (not exactly rules, as they can be broken very easily, but
intuitively we know it's odd when someone does so) are principles that govern conversations.

• Maxim of Quantity - Make your contribution to the conversation as informative as necessary, but do not make your contribution to the conversation any more informative than necessary.
• Maxim of Quality - Do not say that which you believe to be false or that for which you lack adequate evidence.
• Maxim of Relevance - Be relevant (i.e., say things related to the current topic of
conversation).
• Maxim of Manner - Avoid obscurity of expression or ambiguity. Be brief, orderly and
appropriately polite.

Pitt and Edwards (2003) Guidelines

Pitt and Edwards have developed many speech-based interfaces and propose the additional
guidelines:

• Limit the number of choices available at each stage of the interaction to an absolute
minimum (like menu design)
• Keep each utterance as short as possible (Grice - be brief)
• Try to anticipate what the user is most likely to need at each stage of the interaction and
present that information first (often called front-loading information)

When considering Grice's Maxim of Manner, politeness needs extra words, so it conflicts
with "Be Brief", however, this may make the system seem less "computer like", and make the
system appear to be more intelligent. Politeness is liked by novices, but not disliked by
experts.

Avatars should have intonation, decent pausing, gestures, facial expressions, eye movements, moods/emotions and lip synching. Some of this is fake, as the co-ordination of gesture with speech is hand crafted, but intonation and pausing are real and can be generated in real time.
All this needs to be generated, and an appropriate markup language can be used.

You also need to consider the return path back to the system. With systems such as telephone
speech interfaces, a numeric input is typically used, which seems very limiting and primitive
compared to the output. Speech input could be used, but often recognition is not very
accurate, so the dialogue could be stifled. Users can be guided as to what they should say to
aid recognition (people are good at parroting what computers do say).

Complex systems such as PCs that have the capability of producing speech can also produce
a wide variety of other sounds. In spite of the fact that Apples have always had sound
capability, and PCs have had the ability for quite a while, interface designers still don't make
much use of sound, other than basic "system beeps".

Sound in the interface can be used in a variety of ways:

• To cue speech (orient the user)
• To mark the end of an utterance (marked by intonation in human speech, often not so good
in TTS)
• As an alert or alarm (can't close the ears)
• To give rapid, summary information (quicker than speech)
• To give routine messages (if the user understands the meaning of sounds)

Auditory icons were proposed by Gaver, and these are sounds that we associate with the real-world object or function that the icon represents, and that we will immediately recognise (so no learning of their meaning is required). This builds very much on the direct manipulation idea -
if things both look and sound like real world objects, we will be able to understand them and
manipulate them like real world objects.

There are problems with this, however. Like visual direct manipulation, many
objects/activities do not have real world equivalents, and distinctions made in sound are not
necessarily the ones that you want to make in an interface - a window opening sounds the
same as a window closing, but in an interface, these actions are very different.

Earcons were proposed by Blattner and her colleagues, and are a pun on icon (eye-con). They
are abstract, musical sounds that can be used to create motifs and patterns of considerable
complexity using pitch, rhythm and timbre (quality of sound). Learning is required, but does
not need to be explicit - people learn and remember musical patterns very well (even if they
are not musical, e.g., disk whirring). An infinite number of patterns can be produced, and
they can be parameterised (with sound, this is called sonification) with a number of
parameters, pitch, loudness, speed of tones, etc...

Earcons could be pleasant to use, like background, ambient music, and they can be combined
well with speech, possibly even have overlap (have speech and sound simultaneously).

Steve Brewster carried out some investigations into whether or not the use of sounds make an
interface more effective, efficient, easier to learn or more pleasant to use. Brewster, Wright and Edwards (1993) compared the use of complex musical sounds, simple musical sounds and simple non-musical sounds in combination with icons and menus. His measure of effectiveness was the recall of the earcons (perhaps not a good dependent variable, however), and he found that complex musical sounds were recalled better than simple musical sounds, which in turn were recalled better than simple non-musical sounds.

Brewster (1997) looked at the use of earcons combined with speech in telephone interfaces.
Over the phone, reproduction of sound is not very good (particularly in the higher
frequencies) and sound is mono only. In this study, earcons were used to represent icons in a
hierarchy, and recall of earcons was used as the measure of usability. Recall was good, even
for the poor sound quality, and a set of earcons designed for recognition over the phone were
recalled 67% of the time. Recall lasted a week after the initial experience, so it is suitable for
infrequent usage. Brewster concluded that "earcons can provide excellent navigation cues for
telephone-based interfaces".

Speech interfaces could become a lot more dominant in the future, and non-speech sound can
be used in more sophisticated ways.

Gesture Based Interaction


Gesture based interaction is possibly the only revolutionary innovation in interaction style on
the horizon in the HCI industry. Gesture based interaction extends the metaphor of both the
conversation and the virtual object. It builds on natural human behaviour - we all make
gestures as we communicate and think - and it can be used both at the desktop and in
mobile/wearable situations. However, there are downsides, such as you might have to wear a
glove, or learn a new language of gestures and use them with reasonable precision, and there
may be limitations on the kind of information that can be reasonably conveyed by a gesture.

Gestures are the movements of hands, arms (plus the upper body and head) used in
communication. We can consider different kind of gestures:

• Symbolic gestures (e.g., the okay sign, a shrug of "I don't know", goodbye, a head shake,
etc...)
• Deictic gestures (pointing, e.g., over there)
• Iconic gestures (convey information about size, shape, orientation, etc...)
• Pantomimic gestures (mimicking actions)

Gestures also include beats and cohesives. These have no specific meaning, but seem to mark the rhythm of speech, indicate boundaries and draw elements together. We have a very rich and complex "language" of gestures - we learnt them as part of our native language without having to think about them - which we use constantly, often subconsciously, such as when we are talking on the phone, even though we know the other person cannot see them.

When we use a mouse to interact with menus, we make movements (although linguists would
not call them gestures, as they are not part of communication - they are an epi-phenomena to
the action of selection), and as we get used to the position of items in menus, the series of distinct actions becomes a fluid movement, or a "gesture".

Kurtenbach and Buxton (1993) built on this idea and developed a pen-based system that
recognised the shape of the "gesture", so the user did not actually have to open the menu items.

True gesture-based interaction would occur using video recognition or glove-based gesture
recognition. In video gesture recognition, one or more cameras are required, and sometimes
you need to put markers on the hands - gloves or thimbles on the finger tips, usually. This is
less of a hassle for the user than "full glove" systems, but more processing power is required
to recognise the position and shape of hands and fingers.

In glove-based gesture recognition systems, a glove with sensors is used to track finger and
hand positions, so less processing needs to be done to get the position of the hand and fingers
- this comes from the sensors, but the gesture still needs to be recognised. Another similar
concept is the pinch glove, which allows users to grab virtual objects. This does not recognise
the position of the hand or fingers, but instead works by touching two or more of the sensors
together, which is much simpler computationally.

One of the key problems with gesture recognition is how the recognition system knows when
one gesture finishes and another starts. Krueger created a solution involving time cues - if the
hand is held still for a certain time, a beep is heard indicating that it has been fixed, which
creates an analogy to holding something in place whilst glue is drying, but this could be tiring
and irritating.

Often, to ensure accurate gesture recognition and an intuitive interface, constraints are placed
on the system. Common constraints include a region defined as the "active zone", and
gestures are ignored if they are performed outside of the zone, and gestures are defined as
starting from a set start position and a (different) end position, and the gesture is identified as
the movement between these two positions. By having this start and end position, gestures
can be strung together without confusion.

Like Grice's maxims for speech based interactions, Baudel and Beaudouin-Lafon came up
with a set of guidelines on how to develop gesture based interaction systems:

1. Use hand tension, e.g., tensing the hand into static postures - this makes the user's intent to
issue a command implicit. Tension also emphasises the structures of human-computer
dialogue. Conversely, end positions should not be tensed.
2. Provide fast, incremental and reversible action - this is one of the most basic principles for
direct manipulation adapted for gesture-based input. Speed is essential to ensure that the
user does not get tired forming gestures and reversibility is important to enable the user to
undo any action, and incremental actions are vital for the system to provide continuous
feedback to improve the user's confidence in the interface.
3. Favour ease of learning - in symbolic gestural interfaces, a compromise must be made
between natural gestures that are immediately learnt by the user and complex gestures that
might give greater control and complexity of functionality. For example, you could map the
most common interface actions to the most natural gestures to ensure ease of learning.
4. Hand gesture should only be used for appropriate tasks - it is important to choose carefully
the tasks that gesture input is going to be used for. While gesture input is natural for some
navigation and direct manipulation tasks, it is inappropriate for tasks that require precise interaction or manipulation.

Gesture-based interaction seems very interesting for some applications, where actions are
dominant and one may be talking (not to the computer, but to colleagues, audience, etc). In
the real world, gestures and speech are combined to create a communication, which is surely
what we want to aim for with human-computer interaction. Solutions do exist, but they are
rather clunky - for example, in a speech-based word processor, how do you distinguish content from editing commands?

Lots of research is being done into this form of multimodal interface, so these may be the real
future for HCI.
CHAPTER 2:
OPERATING SYSTEMS
System Components
A computer system can be abstracted into various components. Users (people, or other
machines and systems) interact with system and application programs (defines the way in
which system resources are used to solve the computing problems of the users), which
interact with the OS (controls and co-ordinates the use of hardware among the various
application programs for the user), which interacts with the hardware (the most basic
computing resources - CPU, memory, I/O, etc). When developing an OS, all levels must be
considered, not just the OS level.

For users, they interact with the hardware (monitor, keyboard, mouse, etc) to control the
system and application programs. The program needs an efficient interface to interact with
the OS to utilise the resources it needs to accomplish the tasks the user wants to do and to
also communicate with the user. The OS interface describes the calls that an application
program can make into the OS.
The OS is designed for ease of use, which is in terms of not having to deal with low level
hardware issues. This is more important than CPU usage - a user doesn't care if the CPU is
idle 99% of the time. The goal of OS design is to provide a useful, simple and efficient
system:

• ease of use
• minimise resource usage
• maximise resource utilisation

However, abstraction away from the physical machine to provide ease of use can lead to
inefficiencies. There's a trade off required between the extra functionality of the virtual
machine and the overheads required to implement this.

History of Operating Systems

See CAR for Computer History.

1st Generation (1945-1955)

These are systems which didn't have operating systems. Users had complete access to the
machine and the interface was basically raw hardware. Programming was accomplished by
wiring up plugboards to control basic behaviour of the computer - the problems to be solved
were normally numerical calculations.

Operating systems and programming languages were virtually unheard of.

2nd Generation (1955-1965)

These systems had minimal operating systems and were typically mainframe computers used
to support commercial and scientific applications. Jobs and programs were written using
punch cards and the computers were normally application specific.

The minimal OS was required to interact with I/O devices and a subroutine library existed to
manage I/O interaction. The library is loaded into the top of memory and stays there.
Application programs existed (compiler/card reader), but were limited.

Optimisations could be made in reducing the setup time by batching similar jobs. A small
computer reads a number of jobs onto magnetic tape and the magtape is run on the
mainframe. A number of jobs are then done in succession without wasting CPU time. These systems had a small permanently resident OS which had initial control and controlled automatic job sequencing.

These systems had a new memory layout and the processor had two modes - Normal, where
some instructions (e.g., HALT) are forbidden, which was used by the batch programs, and
Privileged, where all instructions are available - this was the level the OS ran in. Some CPUs
also implemented protection in Normal mode to prevent the user program overriding the first
section of memory in which the OS existed. Normal mode also has an instruction to call the
OS (hand back control).

Batch systems were good for CPU bound scientific computing with minimal I/O, but bad for
data processing with high I/O as the CPU is often idle. The speeds of the mechanical I/O devices are considerably slower than that of the electronic computer and jobs often have to
pause when performing I/O. In some cases, the CPU was often idle for >80% of the time.

However, with a permanently resident OS in the memory, the OS now becomes an overhead,
with less physical memory available for the application running at the time. Programming
discipline in the application is still required as the OS can not protect against the application
crashing the machine via bad I/O, infinite loops, etc...

3rd Generation (1965-1980)

The introduction of ICs led to minicomputers, which tried to merge the scientific computers
with I/O computers and this introduced multiprogramming. Disk drives allowed jobs and
programs to be stored on disk, allowing the computer to have access to many jobs. The CPU
can now load many jobs into memory by giving each one a memory partition and then
execute them by multiplexing between them.

In these OSs, the OS picks a job and then passes control to it. Eventually, the job either
finishes or waits for some task (e.g., I/O) to complete. In a non-multiprogrammed system, the
CPU would now be idle. In a multiprogrammed system, the OS switches the CPU to another
job.

Eventually, the operation the original job was waiting for completes, making that job eligible for CPU time again. As long as there is at least one job waiting to execute, the CPU is not idle.

When a job completes, the OS can load another job into that memory slot. Using the disk or tape to buffer jobs and their input/output in this way is called spooling - Simultaneous Peripheral Operation On-Line - and it is still used today for operations such as print queues.

Multiprogramming OSs are quite complex - they now handle quite complex I/O devices
(disks, etc) and swapping between applications when suspended on I/O.

Multiprogramming works best when there are a small number of large jobs - this reduces the overhead of the OS due to managing memory between fewer jobs. This method also
encourages a fairer overall use of the computer.

Multiprogramming systems led on to time-sharing/interactive systems. The CPU here is multiplexed between several jobs that are kept in memory and on disk. The CPU is allocated
to a job only if the job is in memory. When a job waits on I/O, the OS switches the CPU to
another job. Jobs can also be switched in between memory and the disk. On-line
communication between the user and the system is provided - when the OS finishes the
execution of one command, it seeks the next "control statement" from the user's keyboard.

In time-share systems, the user has the illusion of using the computer continuously and the
user can now interact directly with the computer whilst it is running. This encourages smaller
jobs and more efficient use of the computer. Additionally, many users can share the computer
simultaneously, submitting jobs/commands directly via keyboards/terminals, etc.

The first time-sharing system was CTSS, developed by MIT. This formed the basis for
MULTICS, the first commercial OS that had limited success. UNIX was then developed by
Bell Labs in the high-level language C, based on B.

4th Generation (1980-present)

The fourth generation of operating systems brought about the first mass user operating
systems, as LSI and VLSI enabled low-cost computing, with mini and desktop computers
becoming common.

The age of mass computing is based around desktop computers. The desktop PC is dedicated
to a single user and the distinction between desktop and minicomputers becomes increasingly blurred. Many minicomputers are now merely high-end desktop PCs, with advanced I/O
capabilities and multiprocessors.

The multiprogrammed OS (UNIX, VMS, etc) was developed further for mass user
computers, adding functionality like: massive file storage, fast networking, complex
application programs and the potential for multiprocessor. Modern desktop systems have
dedicated single user OSs (e.g., CP/M, DOS, MS-DOS, Mac OS, Windows) with complex
I/O devices: keyboards, mice, display screens, small printers, graphics tablets, etc, and
complex graphic applications, with user convenience and responsiveness considered of high
importance.

An increased blurring between desktop computers and minicomputers has led to minicomputer OSs becoming available for desktop, e.g., Linux - based on the BSD and SysV variants of UNIX.

Much of the current OS trends are based in the relative cost of computers and people. In the
beginning, computers were few with very expensive hardware, so the people cost was cheap
compared to the computer cost, therefore the goal was the maximise hardware utilisation.
Now we have an amazing number of computers with cheap hardware and the human cost is
expensive compared to the hardware cost. The goal now is ease of use.

Current OS trends are towards specialist requirements - moving away from the general
purpose OS for specialist needs, or tuning existing general purpose OSs. Real-time OSs have been introduced, which require low overheads and high efficiency of resource usage. Multi-processor machines are becoming more common and OSs need to be updated to handle the extra hardware. On the other hand, embedded devices are now becoming very common, and these have limited power, space and weight requirements.

Role of the Operating System


Resource Sharing

The OS contains a set of algorithms that allocates resources to the programs executed on
behalf of the user. These resources include time, power, hardware, etc...

Control Program

The control program controls the operation of the application programs to prevent errors
affecting other programs.

Provision of a Virtual Machine

This hides interfaces to I/O devices, filing systems, etc, and provides a programming
interface for applications.

Kernel

The kernel is the only program resident all the time (all other applications are application
programs).

The modern general purpose OS contains various components to accomplish these roles, such
as:

• Process management
• Main memory management
• File management
• I/O system management
• Secondary storage management
• Protection system
• Command interpreter system

All but the last two in the above list can be considered to be resource management.

Processes

A program is an algorithm expressed in some notation. A process is the activity of executing the program.

Early systems only allowed one program to be executed at a time. Later systems allowed
multiprogramming, where many programs are loaded into memory at once and only one can
be executed at any time. Programs are termed jobs or processes when executing. A process is
sequential and can only do one thing at a time.

Multiple user processes run in separate (protected) address spaces, and can't directly access other processes' memory. Some user programs can share address space (e.g., an interpreter), but
only the code (which is read only) is shared. The kernel can access all, however.

As the CPU only has one program counter, the OS must use a mechanism known as process switching.

Process Creation

Processes are created by one of 4 principle events:

1. System initialisation (typically, this is the OS and system programs)
2. By a running process (a process-creation system call issued by the kernel or a user program)
3. By user execution (normally by interacting with some kind of shell)
4. Initiation of a batch job (typically only in batch systems, however)

When a new process is created, the OS must assign memory and create a task/process control
block (TCB/PCB) which holds various data about the currently running process, such as:

• Process state
• Unique program ID
• Program counter
• CPU registers
• Memory information (this, along with the program counter and CPU registers forms the
process context)
• I/O status information (e.g., open files)
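
The PCB can be pictured as a structure holding the fields listed above. The following C sketch is purely illustrative - the field names and sizes are invented, not those of any real kernel:

/* A minimal, hypothetical process control block (PCB) sketch.
   Real kernels (e.g., Linux's task_struct) hold far more state than this. */
typedef enum { NEW, READY, RUNNING, WAITING, TERMINATED } proc_state;

struct pcb {
    int            pid;             /* unique process ID                      */
    proc_state     state;           /* current process state                  */
    unsigned long  program_counter; /* saved PC, restored on a context switch */
    unsigned long  registers[16];   /* saved CPU registers (process context)  */
    void          *memory_info;     /* memory-management information          */
    int            open_files[32];  /* I/O status: open file descriptors      */
    struct pcb    *next;            /* link for the ready/waiting queues      */
};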

Process Hierarchies

When a parent creates a child process, the child process can create its own child processes,
and we get a hierarchy (or tree) of processes. UNIX terms this as a process group, whereas
Windows has no such concept, and all processes are created equal.

Process State

In single processor systems, only one process can be in the running state at any time. If all
processes are waiting (or blocked), then the processor is idle, until one of the processes
become unblocked, or ready.

The states possible are:

• new: The process is being created
• running: Instructions are being executed
• waiting: The process is waiting for some event to occur
• ready: The process is waiting to be run
• terminated: The process has finished execution
The switch between processes is called context switching - where the context of the currently
running program is saved in the process context of the PCB and then the process context from
the new PCB is loaded into the CPU. The context switch time is overhead as the CPU does
nothing useful whilst switching. The time it takes for a switch is dependent on the hardware
support available.

Process Termination

There are various conditions that can terminate processes:

1. Normal exit (voluntary)
2. Exit on an error condition (voluntary)
3. Fatal error (involuntary)
4. Killed by another process (involuntary)

When a process is terminated, all resources used by the process are reclaimed by the OS, with the PCB last (as the PCB contains information on where the resources are to be reclaimed).

System Calls

System calls provide an interface between the running process and the OS. They are available
as an assembly level instruction, normally some kind of software interrupt (INT 0x80 on
Linux) which corresponds to a system call to which the OS reacts appropriately.

Parameters are used in system calls, just as procedure calls. Normally registers, tables or
stacks are used, but in modern processors, tables are the most popular. Another method is
using instructions that force an address space change without using an interrupt, but the effect
is the same.

System calls normally map onto the services provided by the OS.

POSIX is an IEEE standard based on the UNIX interface. It is not an OS, merely a standard
API which is very vague and covers almost all systems calls you would ever have to make.
Some aspects of most major OSs are POSIX compliant, but there exists no complete
implementation of the POSIX standard. Linux is primarily POSIX compliant for the core
calls.
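
As a small, hedged illustration of how an application uses system calls, the C sketch below calls the POSIX functions getpid() and write(); the C library wraps each of these in the platform's trap mechanism (such as INT 0x80 or its modern equivalents on Linux), so the program itself never issues the software interrupt directly:

/* Sketch: making POSIX system calls from C via the C library wrappers. */
#include <unistd.h>   /* write(), getpid() */
#include <stdio.h>    /* snprintf() */

int main(void)
{
    char buf[64];
    int len = snprintf(buf, sizeof buf, "my pid is %d\n", (int)getpid());
    write(STDOUT_FILENO, buf, (size_t)len);  /* file I/O via a system call      */
    return 0;                                /* process exit is also a syscall  */
}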

Threads
Traditional (heavyweight) processes have a single control path through their code. Many
applications are concurrent, but do not want the overhead of multiple processes and inter-
process communication (e.g., a word processor has UI, reading input, a background spell
checker, etc).

Many OS's provide threads (sometimes called lightweight processes), a basic unit of CPU
use. Each is a separate control path through the code, and threads within a process are essentially independent. Threads can access all of the address space owned by the process, and threads have no protection against each other (therefore, you should program threads safely).

Single and Multi-Threaded Processes

Each thread comprises of a thread ID, program counter, register set and a stack. The thread
shares with other threads belonging to a process, the code section, the data section and OS
resources such as open files. Each thread needs context, but not as much as the process needs,
as it inherits some of the context of the process.

Threading has various benefits, such as responsiveness - other threads can continue if one
blocks (e.g., on I/O), resource benefits - there's only one copy of the code and TCB, etc,
economy - thread creation and management is less costly than that of processes, and better
utilisations of multi-processor architectures (threads can run in parallel).
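
A minimal POSIX threads (pthreads) sketch of a multi-threaded process is shown below; the worker function, thread count and shared counter are illustrative only:

/* Sketch: two threads sharing the address space of one process. */
#include <pthread.h>
#include <stdio.h>

static int shared_counter = 0;         /* visible to all threads - no protection */

static void *worker(void *arg)
{
    int id = *(int *)arg;
    shared_counter++;                  /* unsynchronised: threads must cooperate */
    printf("thread %d running\n", id);
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    int ids[2] = {0, 1};
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);      /* wait for both threads to finish */
    printf("counter = %d\n", shared_counter);
    return 0;
}
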
User Threads

Implemented by user-level libraries, these provide threads to an application without the support of the underlying OS. Here, the application handles its own threading, so the OS only sees one process. The application has a thread table, used in the application for keeping track of the thread state and context, in a similar way to how the kernel handles TCBs for processes. The state diagram for a thread looks similar to the one for a process.

User threads are usually fast and have low management overheads (as a library, the threads
library is within the address space of the process, so calls to the threads library functions are
fast). The application can also define its own management (i.e., scheduling) policy based on better information about the threads than a more general policy could use.

This method does have disadvantages, however. When a thread calls I/O, the whole process
is blocked, as the OS sees the whole process as waiting for I/O. When an application has
threads that are likely to block often, then user threads may not be the best option.

Kernel Threads

Kernel threads are implemented at the OS level, and therefore have higher overheads (a thread control space/table is now needed in the TCB) and are slower, due to syscalls. With kernel threads, the OS schedules all threads, so it can schedule appropriately (e.g., when one thread blocks, only move that thread into waiting instead of the whole process). Threads can also be scheduled over multiple processors, so we have parallelism instead of concurrency (which can introduce its own set of problems).

Hybrid Threading

Both user and kernel threading have their advantages and disadvantages, so using a hybrid method gives you the best of both worlds. Most real implementations of threading are hybrid.
A common approach is to multiplex a number of user threads onto a number of kernel
threads; each kernel thread has some set of user threads that take turns using the kernel
thread. There is usually a maximum number of kernel threads per user process.

Many-to-One

This is the same as normal user-level threading (only with management in the OS); many
user level threads are mapped to a single kernel thread. All thread management happens in
user-space, so this is efficient, but it does suffer from the one thread blocking the whole
process problem.

You could also implement a kernel thread as a process, which is used on systems that don't
support kernel threads.
One-to-One

All threads here are kernel threads - each user thread maps directly on to one kernel thread.
This has more concurrency than the many-to-one model, as when one thread blocks another
can run. This does have the overhead of having to create a kernel thread for each user thread,
however.

Many-to-Many

This is the normal method of implementing threading. Here, the user library controls all -
which threads are in the kernel, which are not and also controls swapping, etc. The library
tries to control the threads so that not all of the kernel threads end up blocked at the same
time, by swapping in and out waiting user threads to kernel threads.
This has the advantages of there being control over concurrency in the application, and the
application can allocate user threads to kernel threads in a way to optimise the concurrency
and ensure that at least one user thread is runnable at any time.

This method is quite complex, however and there are overheads in both the OS and the
application. The user thread scheduling policy may also conflict with the kernel thread
scheduling policy, introducing inefficiencies.

Threading Issues

fork() creates a separate duplicate process and exec() replaces the entire process with the program specified by its parameter. However, we need to consider what happens in a multi-threaded process. exec() works in the same manner - replacing the entire process including any threads (kernel threads are reclaimed by the kernel) - but if a thread calls fork(), should all threads be duplicated, or is the new process single-threaded?

Some UNIX systems work around this by having two versions of fork(), a fork() that
duplicates all threads and a fork that duplicates only the calling thread, and usage of these
two versions depends entirely on the application. If exec() is called immediately after
a fork(), then duplicating all threads is not necessary, but if exec() is not called, all
threads are copied (an OS overhead is implied to copy all the kernel threads as well).
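
A common single-threaded pattern of fork() followed by exec() is sketched below (the program run, /bin/ls, is just an example); in a multi-threaded program, which fork() variant is appropriate depends on whether exec() follows immediately, as described above:

/* Sketch: fork() a child, exec() a new program in it, wait() in the parent. */
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
    pid_t pid = fork();                              /* duplicate the process   */
    if (pid == 0) {
        execl("/bin/ls", "ls", "-l", (char *)NULL);  /* replace child's image   */
        perror("execl");                             /* only reached on failure */
        return 1;
    } else if (pid > 0) {
        waitpid(pid, NULL, 0);                       /* parent waits for child  */
    } else {
        perror("fork");
    }
    return 0;
}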

Thread Cancellation

Cancellation is termination of the thread before it is complete (e.g., if multiple threads are
searching a database and one finds an answer, the others can stop).

There are two types of cancellation: asynchronous and deferred (or synchronous). With
asynchronous cancellation, the calling thread can immediately terminate the target thread, but in deferred cancellation, the target thread is informed that it is to be terminated by setting some status in the target thread. When the target thread consults its status, it finds out it is to terminate, performs some clean-up and terminates itself. This requires some co-ordination between threads.

Asynchronous cancellation is useful if the target thread does not respond; otherwise it should not be used, as it can lead to inconsistencies.
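
Deferred cancellation can be sketched with POSIX threads: pthread_cancel() records the cancellation request, and the target thread only acts on it at a cancellation point such as pthread_testcancel(). The loop body below is purely illustrative:

/* Sketch: deferred (synchronous) thread cancellation with pthreads. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *search(void *arg)
{
    (void)arg;
    for (;;) {
        /* ... do a chunk of work, e.g., search part of a database ... */
        pthread_testcancel();  /* cancellation point: clean up and exit here if asked */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, search, NULL);
    sleep(1);                  /* let the worker run for a while             */
    pthread_cancel(t);         /* request cancellation (deferred by default) */
    pthread_join(t, NULL);     /* reclaim the thread once it has terminated  */
    puts("worker cancelled");
    return 0;
}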

Signal Handling

This is used in UNIX to notify a process that an event (often an error) has occurred. A signal is generated by an event and is delivered to the process. Once delivered, the signal must be handled.

Synchronous signals are delivered to the same process that performs the operation that caused
the signal, e.g., divide by 0 errors, etc...

Asynchronous signals are delivered to running processes and are generated by something
external to the process, e.g., a key press.
The signal can be handled by a default signal handler, or a user-defined signal handler. User-
defined overrides the default handler (in most cases).

In multi-threaded systems, we have to consider which thread the signal should be delivered to:

• To the thread where the signal applies (how do you know this?)
• To every thread (has a lot of overhead)
• To certain threads (which ones?)
• To a designated thread, which handles all signals (has the problem of there being the
overhead of another thread, and the thread may not be scheduled in the best way to handle
the signal)
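
A small, hedged example of installing a user-defined handler for an asynchronous signal (SIGINT, generated by a key press such as Ctrl-C) is shown below, using the POSIX sigaction() call:

/* Sketch: overriding the default handler for SIGINT with sigaction(). */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;              /* just record the event - keep handlers tiny */
}

int main(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigint;   /* user-defined handler overrides the default */
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);

    while (!got_sigint)
        pause();                 /* sleep until a signal is delivered          */
    puts("caught SIGINT");
    return 0;
}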

CPU Scheduling
CPU scheduling refers to the assignment of the CPU amongst processes that are ready. It
selects from among the processes in memory that are ready to execute and allocates the CPU
to one of them. In terms of process state, the scheduler decides which of the processes that
are in the ready state should be allocated to the CPU and enter the running state.

The CPU scheduling decisions may take place when a process:

1. switches from the running to the waiting state
2. switches from the running to the ready state
3. switches from the waiting to the ready state
4. terminates

A scheduling decision may also take place as the result of a hardware interrupt.

We call scheduling under conditions 1 and 4 nonpreemptive - once the program is running, it
only gives up the CPU voluntarily. Scheduling under conditions 2 and 3 is preemptive - that
is, the running process can be preempted by an interrupt. Problems may arise when a program
is preempted whilst accessing shared data, which gives us a need for mutual exclusion.

Dispatcher

The dispatcher module gives control of the CPU to the process selected by the scheduler.
This involves:

• switching context
• switching to user mode
• jumping to the proper location in the user program to restart that program

With the dispatcher, we have a new metric, dispatch latency, which is the time it takes for the
dispatcher to stop one process and start another running. Dispatch latency is optimal when it
is minimised.

The dispatcher is an example of a mechanism - how scheduling is implemented. Policies decide what will be done, and the separation of policies and mechanisms is an important OS principle, allowing flexibility if policy decisions are to be changed later.

Process Scheduling Queues

This is based upon the states in the process state transition graph. The OS needs to find the
TCBs of all processes efficiently, and needs to find a process waiting on a particular event
quickly when that event occurs. Specifically, the OS needs to find the next process to
dispatch (i.e., from the ready queue) quickly on a context switch. You also need to be able to
move processes between queues efficiently.

A process is only in one queue at a time. For example, you may have a ready queue, which is
the set of all processes residing in main memory, ready to execute, and also a number of
waiting queues, one for each different I/O device, which is the set of processes waiting on I/O
from that device. Processes move between queues as they change state. Pictorially, this is often drawn as a queueing diagram, with the ready queue feeding the CPU and processes joining a device queue whenever they wait on I/O.

With this model, we have two types of schedulers, the short-term scheduler (or CPU
scheduler), which selects which process should be executed next from the ready queue and
allocates CPU. It is invoked very frequently (every 100 or so milliseconds), and must be very
fast. Additionally, we have the long-term scheduler (or job scheduler). This selects which
new processes should be brought into the ready queue, and is invoked very infrequently (in
terms of seconds and perhaps even minutes). The long-term scheduler controls the degree of
multi-programming (i.e., the number of processes in memory). However, how do we decide which jobs should be brought into main memory?

Processes can be described as being either I/O-bound (spends more time doing I/O than
computations, and has many short CPU bursts) or CPU-bound (spends more time doing
computation, has a few very long CPU bursts). The long term scheduler tries to maintain a
balance between these two forms.
Some OSs also have a medium term scheduler, which swaps in/out partially executed
processes.

Scheduling Policy

When the OS needs to select another process to execute, the scheduling policy decides which
of the processes to assign to the CPU. Scheduling policy effectively determines the order of
the ready queue so that the next process to run is at the head of the queue.

Scheduling policies are often defined on the CPU-I/O burst cycle. Most process executions
consist of a CPU execution and an I/O wait. The characteristics of this cycle help determine
policies. The relationship between relative sizes of CPU and I/O bursts are very important.
I/O bursts are very long compared to CPU bursts in most applications. A histogram of CPU-
burst times gives us a typical CPU-burst graph, showing many short CPU bursts and few long
CPU bursts, which are the CPU bound processes.

Some of the goals of scheduling policies are:

• CPU utilisation high (keeping the CPU as busy as possible)
• Throughput high (the number of processes that complete their execution per time unit)
• Turnaround time low (the amount of time it takes to execute a particular process)
• Waiting time low (amount of time a process has been in the waiting queue)
• Response time low (amount of time it takes from when a request was submitted until the
first response is produced, not output - for time-sharing environments)

Turnaround time alone is a bad goal to optimise towards, as overall processes will take
longer.

First-Come, First-Served Scheduling

In First-Come, First-Served (FCFS) scheduling, processes are scheduled in the order that they
arrive in the ready queue.

The average waiting time for FCFS is not optimal, however, as it suffers from the convoy
effect, where short processes are delayed behind long processes.
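
As a small worked example (burst times invented): suppose three processes arrive in the order P1 (24 ms), P2 (3 ms), P3 (3 ms). Under FCFS the waiting times are 0, 24 and 27 ms, an average of 17 ms; had they arrived in the order P2, P3, P1, the waiting times would be 0, 3 and 6 ms, an average of 3 ms. The long job at the head of the queue creates the convoy.
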
Shortest-Job-First Scheduling

Shortest-Job-First (SJF) scheduling works by associating each process with the length of its
next CPU burst, and then use these lengths to schedule the process with the shortest time
next.

This has two schemes: non-preemptive (once the CPU is given to the process, it cannot be
preempted until it completes its CPU burst); and preemptive (if a new process arrives with
CPU burst length less than the remaining time of the current executing process, preempt),
which is sometimes known as Shortest-Remaining-Time-First (SRTF) .

SJF is optimal in that it gives the minimum average waiting time for a given set of processes.

Determining the length of the next CPU burst can only be done by estimating the length. One way of accomplishing this is by using the length of previous CPU bursts and exponential averaging with the formula τ_(n+1) = α·t_n + (1 − α)·τ_n, where t_n is the actual length of the nth CPU burst, τ_(n+1) is the predicted value for the next CPU burst, and 0 ≤ α ≤ 1.
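
A small sketch of this prediction in C (names are illustrative):

/* Exponential averaging of CPU burst lengths: tau_next = a*t + (1 - a)*tau.
   With a = 0 recent history is ignored; with a = 1 only the last burst counts;
   a = 0.5 is a common compromise. */
double predict_next_burst(double last_burst, double last_prediction, double a)
{
    return a * last_burst + (1.0 - a) * last_prediction;
}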

Priority Scheduling

A priority number (integer) is associated with each process and the CPU is allocated to the
process with the highest priority (the smallest integer). This can be either preemptive or non-
preemptive. SJF is a priority scheduling policy if the priority is predicted next CPU burst
time.

This does have the problem of starvation, where low priority processes may never execute,
and the solution to this is aging, where as time progresses, the priority of the process is
increased.
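
A hedged sketch of aging (field names and thresholds are invented): periodically, the scheduler can boost the priority of processes that have been waiting for too long.

/* Sketch of aging: smaller number = higher priority, so decrement while waiting. */
struct proc { int priority; long waiting_ms; };

void age_ready_queue(struct proc *ready[], int n, long period_ms)
{
    for (int i = 0; i < n; i++) {
        ready[i]->waiting_ms += period_ms;
        if (ready[i]->waiting_ms >= 1000 && ready[i]->priority > 0) {
            ready[i]->priority--;        /* boost priority after each second waited */
            ready[i]->waiting_ms = 0;
        }
    }
}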

Round Robin

Each process gets a small unit of CPU time (a time quanta), which is usually about 10-100
ms. After this time has elapsed, the process is preempted and added to the end of the ready
queue. A smaller time quanta means more context switches and therefore more overheads, so
the time quanta needs to be set to some appropriate value that is large with respect to the size
of the context switch, but not too high as to basically become a FIFO policy. If there
are n processes in the ready queue and the time quantum is q, then the process gets 1/n of the
CPU time in chunks of at most q time units at once. No process waits more than (n - 1)q time
units.
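
For example (numbers illustrative): with n = 5 ready processes and a quantum q of 20 ms, each process receives roughly 1/5 of the CPU and waits no more than (5 - 1) × 20 = 80 ms before its next turn.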

Average turnaround time does not necessarily improve as the quantum increases, so quantum
size is a trade-off between good turnaround and low overheads.

Multi-level Scheduling

Here, the ready queue is partitioned into separate queues, e.g., a foreground queue (for
interactive processes) and a background queue (for batch processes). Each queue has its own
scheduling algorithm, for example the foreground one is a RR queue and the background one
may be FCFS.
Each queue must also be scheduled for CPU time, for example we could have a fixed priority
scheduling algorithm (serve all from foreground, then from background), but this does have the possibility of starvation, or we could have a time-slice, where each queue gets a certain
amount of CPU time which it can schedule among its processes, i.e., 80% to foreground and
20% to background.

Multi-level Feedback Queue

A process can move between the various queues in a multi-level scheduling policy, and this is
a way of implementing aging. The multilevel-feedback-queue scheduler is defined by the
following parameters:

• number of queues
• scheduling algorithms for each queue
• method used to determine when to upgrade a process
• method used to determine when to downgrade a process
• method used to determine which queue a process will enter when that process needs
service

Thread Scheduling

For user-level threads, any general purpose scheduling policy can be used according to the
best policy for that application. General user-level thread implementations should allow the
application sufficient freedom to define an application-specific scheduling regime, i.e., allow
the application to reorder the thread scheduling queues at any time, with any policy.

Memory Management
Processes (including threads) need physical memory in order to execute. The OS manages
memory and allocates memory to processes from physical memory, and virtual memory.
Physical memory management enables the OS to safely share physical memory among
processes. This allows us to multiprogram the processor, to time-share the system (overlap computation and I/O) and to prevent one process from using another process's memory.

Memory management techniques must meet the following requirements:

• Processes may require more memory than available physical memory
• You may need to load processes to different locations each time they execute
• You may need to swap all or part of a process to secondary storage during execution (and
then reload it at a different memory location, as above)
• Memory protection between processes is required

Some simple memory management techniques include loading processes into memory and switching between them with no protection between the processes (DOS), or copying the
entire process memory to secondary storage when it performs I/O and then copy it back when
it is ready to restart (early UNIX). This is clearly inefficient.

Most modern OSs work by giving each process a part of memory that it is allowed to access and checking each memory access to ensure that the process stays within its own memory.

Physical Memory Allocation

Main memory must accommodate both the OS and user processes. Usually, the main memory
is divided into a number of distinct partitions, such as the resident OS, which is normally a
single partition held in low memory, as this is where CPUs place interrupt vectors. User
processes normally consist of one or more partitions, within memory above the OS. However,
we need to consider how we're going to divide up these user partitions.

The simplest scheme is to partition into regions with fixed boundaries - this is called fixed
partitioning and has two types, where the partitions are of equal size, or where they are of
unequal size. Any process whose size is less than or equal to the partition size can be loaded
into an available partition. If all the partitions are full, the operating system can swap a
process out of a partition. However, if a process doesn't fit into a partition, it is the programmer's responsibility to make sure that the program can be split over multiple partitions. With this scheme, main memory use is inefficient. Any program, no matter how small, occupies an entire partition; the wasted space inside a partition is called internal fragmentation.

With unequal partitions, the problem is reduced, but not eliminated. Instead of
creating x partitions of equal size, partitions of varying sizes are created, so a program can be
loaded into the one most appropriate to its size.

Another method is dynamic partitioning, where the partitions are of variable length and
number and are created when the process asks for them - so, the process is allocated exactly
as much memory as it requires, eliminating internal fragmentation. It does however leave
holes in memory, as processes are swapped in and out in different places, leaving small gaps
between processes. This is called external fragmentation and could lead to a situation where
there is enough free memory for a process to run, but it is fragmented across main memory.
The way to solve this is using compaction - rearranging the programs in memory so they are
contiguous and all free memory is in one block. This is very, very slow.

There are different ways of implementing allocation of partitions from a list of free holes,
such as:

• first-fit: allocate the first hole that is big enough


• best-fit: allocate the smallest hole that is big enough; the entire list of holes must be
searched, unless it is ordered by size
• next-fit: scan holes from the location of the last allocation and choose the next available
block that is large enough (can be implemented using a circular linked list)

First-fit is usually the best and fastest, but analysis has shown that under first-fit,
fragmentation can cause up to half of the available memory to be lost in holes (external
fragmentation) or unused memory (internal fragmentation). Best-fit has the smallest amount
of external fragmentation. Next-fit can produce external fragmentation similar to best or first,
depending upon the next block found. It does use holes at the end of memory more frequently
than others perhaps leading to the largest blocks of contiguous memory.
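
As an illustration, a minimal first-fit allocator over a linked list of free holes might look like the
following C sketch. The Hole structure and the list handling are assumptions made purely for
illustration; they are not taken from any particular OS.

/* A minimal sketch of first-fit allocation over a singly linked list of
 * free holes. The Hole structure is illustrative only. */
#include <stddef.h>

typedef struct Hole {
    size_t start;               /* start address of the free region */
    size_t size;                /* size of the free region in bytes */
    struct Hole *next;
} Hole;

/* Returns the start address of an allocated block, or (size_t)-1 on failure. */
size_t first_fit_alloc(Hole **free_list, size_t request)
{
    for (Hole **h = free_list; *h != NULL; h = &(*h)->next) {
        if ((*h)->size >= request) {        /* first hole that is big enough */
            size_t addr = (*h)->start;
            (*h)->start += request;         /* shrink the hole ...           */
            (*h)->size  -= request;
            if ((*h)->size == 0)            /* ... or unlink it if used up   */
                *h = (*h)->next;
            return addr;
        }
    }
    return (size_t)-1;                      /* no hole fits: external fragmentation */
}

Best-fit and next-fit differ only in how the loop chooses among the holes that are big enough.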

A common approach, especially in embedded systems, is the buddy system. Under the buddy
system, the entire space is treated as a single block of size 2^U. If a process requests memory of
size s, such that 2^(U-1) < s ≤ 2^U, the entire block is allocated; otherwise the block is split into
two equal buddies. This process continues until the smallest block greater than or equal to s is
generated.

At any time, the buddy system maintains a list of holes of each size 2^i. A hole on the (i + 1) list
can be split into two equal holes of size 2^i on the i list.
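
For example, the block size that the buddy system would hand out for a request of s bytes can be
computed by repeatedly halving the full block of size 2^U, as in this small sketch (illustrative
only; the recursive splitting and the free lists themselves are omitted):

/* Sketch: the buddy-system block size (a power of two) allocated for a
 * request of s bytes, starting from a whole space of 2^U bytes. */
unsigned long buddy_block_size(unsigned long s, unsigned U)
{
    unsigned long block = 1UL << U;        /* whole space: 2^U                 */
    while (block / 2 >= s && block > 1)    /* keep splitting while the         */
        block /= 2;                        /* half-sized buddy still fits s    */
    return block;                          /* smallest 2^i with 2^i >= s       */
}

So a request of, say, 3 KiB from a 1 MiB space would be satisfied with a 4 KiB block, with the
difference lost to internal fragmentation.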

Implementing Physical Memory Management

There are two main problems in implementing physical memory management - how can
processes be run in different areas of memory, given that they contain lots of addresses, and
how can the OS provide memory protection.

Memory addresses in a program must be translated into physical memory addresses, so
programs are created to be relocatable, such as the ELF binaries for Linux. There are three
points where binding of program addresses to memory can occur:

• compile time - if the memory location is known at compile time, static absolute addresses
can be compiled in; any change to the starting address requires recompilation
• load time - here, the code generated is relocatable and the code can be loaded at any
address and the final binding occurs when the program is loaded. If the memory locations
change, the code just needs to be reloaded.
• execution time - the binding is delayed until runtime if the process can be moved from one
memory segment to another during execution. This method does require hardware support.

To work around this, we have the concept of a logical address space that is bound to a separate
physical address space, which is the key to memory management. Logical addresses are the
ones generated by the CPU, and physical addresses are the ones seen by the memory unit.
When using compile-time or load-time address-binding schemes, the logical and physical
addresses are the same, but with the execution-time address-binding scheme, the logical
(virtual) and physical addresses differ. This is accomplished using a memory management
unit.

The memory management unit (MMU) is a hardware device that maps a logical address to a
physical one, so the user program only deals with logical addresses, and never sees the
physical one.

The most simple form of MMU is the relocation register. The relocation register holds the
value of an offset and then adds the offset to all logical addresses given to it to give the
physical address. In this system, logical addresses are in the range 0..max for all processes
and physical addresses are in the range R..R+max, where R is the value in the relocation
register. The process only generates logical addresses, and sees itself running in addresses
0..max. This principle of mapping is central to memory management.

The relocation register can provide memory protection with the addition of a limit register,
which contains the allowed range of logical addresses - each logical address must be less than
the limit register. Hardware support is obviously required for this.
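
A sketch of what this translation looks like on every memory reference, with the relocation and
limit registers modelled as plain C variables and trap() standing in for the MMU's addressing-
error trap (all names here are illustrative):

/* Sketch of relocation-register translation with a limit check. */
#include <stdio.h>
#include <stdlib.h>

static unsigned long relocation = 0x40000; /* R: base of the process's partition */
static unsigned long limit      = 0x10000; /* allowed range of logical addresses */

static void trap(const char *msg) { fprintf(stderr, "trap: %s\n", msg); exit(1); }

unsigned long translate(unsigned long logical)
{
    if (logical >= limit)                  /* protection check first            */
        trap("addressing error: logical address outside limit");
    return logical + relocation;           /* physical = logical + R            */
}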

Another way of thinking of this is not thinking of a program as one large chunk of contiguous
memory - a process may be broken up into pieces that do not need to be loaded into main
memory all the time. This increases opportunities for multiprogramming as with many
processes in main memory, it is more likely a process will be in the ready state at any
particular point. This is achieved by programming techniques with minimal OS support.

Static vs. Dynamic Loading

So far, we have assumed that the entire program code and data must be loaded into memory
for a process to execute - this is called static loading. An alternative approach is called dynamic
loading, which works by not loading a routine until it is called. This gives better memory-
space utilisation and unused routines are never loaded. This is useful when large amounts of
code are required to handle infrequently occurring cases. If dynamic loading is implemented
through program design, no special support from the OS is required.

Dynamic loading enables a small part of the process to be loaded initially - usually around the
program entry point with part of the data - and allows for a greater degree of multiprogramming,
as only parts of the program are needed at any time in physical memory. It is usually the
responsibility of the application to ensure that the appropriate parts of the application are
loaded at any time.

Some OSs do provide libraries to aid loading parts of a program.

Overlays

These are slightly more structured: code is put into phases that are loaded over one
another. Again, the motivation is to keep only code and data in memory that is actually
needed. When one phase completes, the application arranges to overlay its memory with the
next phase when required. The total memory required is the same as the memory required by
the largest phase. Some OS support may be given to implement this.

This is useful when there is a limited amount of physical RAM, and some secondary storage
is available to hold code/data, e.g., in embedded systems.

Shared Libraries

Shared libraries can be implemented in two ways - using static or dynamic linking. In static
linking, all libraries required by the program are linked into the image (i.e., form part of the
data/code of the program), so the memory allocated must be big enough to hold the
application and the libraries.

With dynamic linking, the libraries required by the program are dynamically linked at
execution time (just-in-time), so the memory allocated must be big enough to hold the
application. Shared libraries are allocated from a different area of memory.

Swapping

A process (or part of a process) can be swapped temporarily out of memory to a backing store
(e.g., disk) and then brought back into memory for continued execution. This is how medium-
level scheduling is used. If you want to reload at a different address, execution time binding
is required.
The issue with swapping is the backing store - a fast disk large enough to accommodate copies
of all memory images for all users that must provide direct access to these memory images is
required. A major part of the swap time overhead is the transfer time. The total transfer time
is directly proportional to the amount of memory swapped.

One of the more common swapping variants is called roll-out, roll-in and it is used with
priority-based scheduling algorithms. Lower-priority processes are swapped out so higher-
priority processes can be loaded and executed.

Many different varieties of swapping can be found on many OSs, such as UNIX, Linux,
Windows, etc...

Paging

The problems with using physical memory allocation with unequal, fixed or variable sized
partitions are fragmentation and inefficiencies (algorithms to track holes, and compaction).
Paging and segmentation are memory allocation techniques that largely eliminate these
problems.

Many of these problems arise because memory is allocated to processes as contiguous blocks.
Paging removes this restriction, as the logical address space of a process can be
noncontiguous. The physical memory space is divided into fixed-size blocks called frames,
and logical memory is divided into blocks of the same size called pages. Pages are then
allocated and fit into frames.

To implement paging, we need to keep track of all free frames. To run a program of
size n pages, we need to find n free frames and load the program into them. There is a
problem with internal fragmentation if a process size is not an exact number of frames. Only
part of one frame is lost to this fragmentation, however, and if we make the frame size quite
small, this isn't a significant problem.

With paging, the logical address generated by the CPU is divided into a page number p and a
page offset d. The page number is used as an index into a page table which contains the base
address of each page in physical memory and the page offset is combined with the base
address to define the physical memory address that is sent to the memory unit.
Usually, page tables are a fixed length, often the same size as a frame. If a process does not
require all entries in the page table, the redundant entries are set to be invalid. If a reference is
made to an invalid page table entry, the OS must handle a page fault and there is the potential
for an erroneous logical address to have been generated by the application process.
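
As a concrete sketch, with a 4 KiB page size the split of a logical address into page number and
offset, and the lookup through a single-level page table with a valid bit, might look as follows
(the table layout and sizes are illustrative assumptions):

/* Sketch of single-level page translation with 4 KiB pages.
 * PAGE_SHIFT = 12, so the offset is the low 12 bits and the page number
 * is the remaining high bits. */
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  1024                     /* size of this toy page table      */

typedef struct {
    uint32_t frame;                         /* frame number in physical memory  */
    int      valid;                         /* valid-invalid bit                */
} PageTableEntry;

static PageTableEntry page_table[NUM_PAGES];

/* Returns the physical address, or 0xFFFFFFFF as a stand-in for a page fault. */
uint32_t translate(uint32_t logical)
{
    uint32_t p = logical >> PAGE_SHIFT;     /* page number p                    */
    uint32_t d = logical & (PAGE_SIZE - 1); /* page offset d                    */
    if (p >= NUM_PAGES || !page_table[p].valid)
        return 0xFFFFFFFF;                  /* invalid entry: page fault        */
    return (page_table[p].frame << PAGE_SHIFT) | d;
}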

The page table is normally kept in main memory, so in this scheme every data/instruction
access requires two memory lookups, one for the page table and then one for the actual
instruction or data item. This two-memory-access problem can be solved using a special piece of
hardware called a translation look-aside buffer (TLB).

TLBs are implemented using associative memory, which allows for a parallel search of page
numbers in order to get the frame number. The TLB is not a part of normal physical memory.

Memory protection in paging is implemented by associating a protection bit with each frame.
The valid-invalid bit is attached to each entry in the page table, where "valid" indicates that
the associated page is in a process' legal address space and is thus a legal page, whereas
"invalid" indicates that the page is not in the legal address space of the process.

Modern systems allow processes to have large logical address spaces, so the page table itself
can become very large. Hierarchical page tables can be used to break up the logical address
space into multiple levels of page tables.

In address spaces larger than 32 bits, hashed page tables have become common (see hash tables
in ADS). Here, the virtual page number is hashed into a page table.

Often, having one page table per process can be expensive in terms of physical memory. The
alternative is the inverted page table, where there is one entry for each real page of memory
and the entry consists of the virtual address of the page stored in that real memory location,
with information about the process that owns that page. This decreases memory usage, but
increases the time needed to search the table when a page reference occurs. Hash tables can
be used to limit the search to one, or at most a few, page-table entries.

Segmentation

Paging separates the user's view of memory from actual physical memory. A user thinks of
the program as a collection of segments. A segment is a logical unit, such as: the main
program, procedures, functions, methods, objects, local variables, global variables, common
blocks, the stack, the symbol table and arrays. Segmentation is a memory management
scheme that supports a user view of memory.

In segmentation, you have a logical address consisting of a segment number and an offset and
a segment table that maps two-dimensional physical addresses. Each table entry has a base
that contains the starting physical address where segments reside in memory and the limit that
specifies the length of the segment.
With each entry in the segment table, a validation bit (indicating whether the segment is legal)
and read/write/execute privileges are associated. Since
segments can vary in length, memory allocation is a dynamic storage-allocation problem,
which can lead to external fragmentation.

Segmented Paging

Both paging and segmentation have advantages and disadvantages, and most successful
processor families have moved towards a mixture of segmentation and paging. Segmented
paging usually uses a segment table, a multi-level page table and an offset.

Virtual Memory

Virtual memory is the separation of user logical memory from physical memory. Only part of
the program needs to be in memory for execution (i.e., only the part needed for current
execution - this follows from the principle of locality, where program and data references inside
a process tend to cluster, so only a few pieces of a process will be needed in any period of
time). The major advantages of virtual memory are that the logical address space can be much
larger than the physical address space and that more processes can be partly in physical memory
at any time.

Virtual memory can be much larger than physical memory, with n pages of virtual memory
mapped to m pages of physical memory and the remaining pages held on secondary storage.
Virtual memory can be implemented assuming a virtual address space of 0..max, where max
is usually a power of 2, e.g., 2^32 bytes (4 GiB) on 32-bit systems. The process can
be loaded anywhere in physical memory, and often the physical memory size is significantly
smaller than the virtual address space of an individual process.

Physical memory nowadays is very large, so virtual memory is only useful for large data sets,
and for compilation models.

Demand Paging

Demand paging brings a page into memory only when it is needed. This means there is less
I/O, less memory used, a faster response time and a greater degree of multiprogramming.

A page is needed when there is a reference to it (i.e., an instruction/data address). An invalid
address will lead to an abort, while a reference to a page that is not in memory will lead to a
fetch of that page into memory - this is called a page fault.

When the first reference to a page occurs, the page will not be in memory under this system,
so a page fault occurs. This is a non-maskable interrupt (it can't be covered up by the user or
OS) and it is synchronous: the OS needs to know which running process caused the interrupt.
This allows for the OS to bring the appropriate page into main memory and for it to detect
invalid memory references - the OS contains a range of valid addresses for a process. The
process often does not need the full range of virtual memory, hence code size, data size and
stack size can be used to check if a memory reference is valid. This information is stored on a
per process basis in the TCB.

When a page fault occurs, the OS looks at an internal table to decide if the process is
permitted to make the memory reference and, if it is, it finds an empty frame, swaps the
page into that frame, updates the page table and the internal OS tables, and restarts the
instruction that caused the page fault - as with other standard interrupts.

Pure demand paging is where you start executing a program with no pages loaded into
memory. This will trigger a page fault immediately, however.

Demand paging is more complex when a TLB is involved. When a virtual address is given,
the processor consults the TLB, and if a TLB hit occurs, the real address is returned and main
memory is consulted in the usual manner. If the page table entry is not found in the TLB (a
miss), the page number is used to index the process page table, and then a check occurs to see
if the page is in main memory. If it's not, a page fault is issued and the non-TLB process as
above is followed, with the TLB being updated as well as the page table.

Page Replacement

Page replacement is needed when there are no free physical memory frames.

When we want to load a page into memory, we try to find a free frame to put the page into. If
there is no free frame, then we use a page replacement algorithm to select a victim frame, and
then the victim is copied back onto disk and the new frame is loaded into its place and the
page and frame tables are updated.
The aim when designing a page replacement algorithm is to give the lowest page-fault rate. An
algorithm is evaluated by running it on a particular string of memory references and computing
the number of page faults on that string. Ideally, the number of page faults falls steadily as the
number of frames allocated increases.

Some algorithms, however, give more page faults if the number of frames allocated is increased -
this is called Belady's Anomaly.

The simplest page replacement algorithm is the first-in-first-out method, which can be
implemented using a circular linked list.

Another algorithm is the least recently used (LRU) algorithm, which replaces the page that
has not been referenced for the longest time. By the principle of locality, this should be the
page that is least likely to be referenced in the future. Each page could be tagged with the
time of last reference, but this would require a great deal of overhead. LRU can be easily
implemented using a stack of page numbers; when a page is referenced, it is moved or placed
on the top of the stack, and when looking for something for a replacement, the bottom of the
stack is all that you need.
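
A minimal sketch of the stack idea, using an array as the stack of resident page numbers
(illustrative only; real systems approximate LRU rather than maintaining an exact stack):

/* Sketch: LRU bookkeeping with an array used as a stack of page numbers.
 * lru_stack[0] is the bottom (least recently used), lru_stack[n-1] the top. */
#define FRAMES 4

static int lru_stack[FRAMES];
static int n = 0;                       /* pages currently resident            */

/* Record a reference to page p; returns the victim page evicted, or -1. */
int reference(int p)
{
    for (int i = 0; i < n; i++) {       /* already resident: move to the top   */
        if (lru_stack[i] == p) {
            for (int j = i; j < n - 1; j++) lru_stack[j] = lru_stack[j + 1];
            lru_stack[n - 1] = p;
            return -1;
        }
    }
    if (n < FRAMES) {                   /* a free frame is available           */
        lru_stack[n++] = p;
        return -1;
    }
    int victim = lru_stack[0];          /* bottom of the stack = LRU page      */
    for (int j = 0; j < FRAMES - 1; j++) lru_stack[j] = lru_stack[j + 1];
    lru_stack[FRAMES - 1] = p;
    return victim;
}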

The second chance algorithm uses a reference bit, which is initially set to 0 and then is
updated to 1 when the page is accessed. The second chance algorithm uses a circular list of pages.
When searching for a page to replace, if a page has a reference bit of 1, the bit is set to 0 and the
page is left in memory, and the next page in circular order is considered subject to the same rules;
a page with a reference bit of 0 is replaced.

We can also consider counting algorithms, where a counter is kept of the number of
references that have been made to each page. The least frequently used (LFU) algorithm
replaces the page with the smallest count. The most frequently used (MFU) algorithm is
based on the argument that the page with the smallest count was probably just brought in and
has yet to be used.

The optimal algorithm replaces the page that will not be used for the longest period of time. It
is impossible to have perfect knowledge of future events, but if you already know what's
happening (as with a reference string), you can use this algorithm to measure how well your
algorithm performs.
The OS may wish some frames to remain in physical memory frames, such as parts of the OS
itself, control structures, I/O buffers, etc..., which is why we can use page and frame locking.
When a frame is locked, it may not be replaced. This usually requires a lock bit to be added
with each frame, and is usually supported in the OS, not hardware.

Each process requires a minimum number of pages. If indirect referencing is used, each
reference requires a page, so each level of references increases the amount of pages required.

Frame Allocation

Again, there are different methods of allocating frames. If there are 100 frames and 5
processes, each process can be given 20 frames. This is called equal allocation. A different
way is called proportional allocation, where allocation is given according to the size of the
process: a_i, the allocation for process p_i, is a_i = (s_i / S) × m, where s_i is the size of
process p_i, S = Σ s_i, and m is the total number of frames.
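
A small worked example, with the numbers invented for illustration: given m = 62 free frames
and two processes of sizes 10 and 127 pages, S = 137, so the first process receives roughly
62 × 10/137 ≈ 4 frames and the second roughly 62 × 127/137 ≈ 57 frames. In C this is a one-line
calculation:

/* Sketch: proportional frame allocation a_i = (s_i / S) * m. */
int proportional_allocation(int s_i, int S, int m)
{
    return (int)((long)s_i * m / S);   /* integer truncation; any leftover
                                          frames are handed out separately */
}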

A variation of the proportional allocation scheme is the priority allocation scheme, where the
proportions are computed based on priorities, rather than size (a higher priority process will
execute quicker if it has more frames).

If a process pi develops a page fault you can either select a replacement from one of its
frames or a replacement from a process with a lower priority number. This brings us to the
question of global or local replacement. Global replacement will select a replacement frame
from the set of all frames, so one process can take a frame from another. In priority based
allocation schemes, this may allow higher priority processes to increase their allocation of
frames by taking a frame from a low priority process. Problems arise as the page fault
behaviour of one process will also depend on the behaviour of other processes.

In local replacement, each process selects from only its own set of allocated frames, so the
process' page fault behaviour is entirely dependent upon itself, and not upon other processes.
The process execution time becomes more consistent across multiple executions.

Thrashing

If a process does not have enough frames allocated, the page fault rate can get very high, so a
lot of swapping back and forth between main memory and secondary storage is done, and the
result is thrashing, which is where more work is done by the OS swapping and paging than is
done by the application in doing something useful. This has severe performance implications.

A common form of thrashing arises from the OS monitoring utilisation: as many page faults
occur, utilisation decreases, so the OS increases the degree of multiprogramming to try to
raise it, which in turn produces more page faults and decreases utilisation further, and so
on.

Thrashing occurs under global frame replacement more as frames can be stolen from other
(active) processes. Local frame replacement limits thrashing to within individual processes. If
a process starts thrashing, it cannot steal frames from another process and cause it to start
thrashing.
To prevent thrashing, a process must be allocated a sufficient number of frames. We can predict
a sufficient number of frames using the locality model: the memory accesses of a program
cluster into localities, which change as execution moves through the phases of the program.

Thrashing will occur when the size of the locality > memory (pages) allocated to the process.

The working set model assumes locality - it defines a per-process working set window which
contains the most recent page references for a process. If a page is in active use, it will have
been accessed in the working set window, and will be in the working set. If a page is not
being used, it will drop from the working set sometime after its last reference. The working
set approximates the locality of a process.

Formally, we have Δ, the working-set window width (in time or memory references), and WSS_i,
the working-set size of process P_i, which is the number of distinct pages accessed in the most
recent Δ. The working set will vary with time; if Δ is too small, it will not cover the entire
locality, and if Δ is too large, it will cover several localities. If Δ = ∞, it will cover the entire
program. The most important property is the total size of the working sets in the system,
D = Σ WSS_i. If D > m (the total number of frames), thrashing will occur, and a process must be
suspended.
The working set can be approximated with an interval timer and reference bits. For example,
if Δ = 10,000, each page has bits to indicate whether it has been accessed in the
last 5000, 10,000 or 15,000 cycles. On a page access, the 5000 bit is set, and a timer interrupt
every 5000 cycles shifts all the bits along one, with the 5000 reference bit reset to 0. If any of
the bits is 1, the page is in the working set.
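
A sketch of this reference-bit approximation, with each page carrying a small history of bits that
is shifted on every interval-timer interrupt (the structure and field names are illustrative):

/* Sketch: approximating the working set with shifted reference bits.
 * A page is treated as in the working set if any of its history bits,
 * or its current reference bit, is set. */
#include <stdint.h>

typedef struct {
    uint8_t history;        /* e.g. 3 bits of history: the last 3 intervals  */
    uint8_t referenced;     /* hardware reference bit, set on each access    */
} Page;

/* Called from the interval-timer interrupt handler. */
void timer_tick(Page *pages, int npages)
{
    for (int i = 0; i < npages; i++) {
        pages[i].history =
            (uint8_t)(((pages[i].history << 1) | pages[i].referenced) & 0x7);
        pages[i].referenced = 0;   /* clear the bit for the next interval    */
    }
}

int in_working_set(const Page *p)
{
    return p->history != 0 || p->referenced != 0;
}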

This is not entirely accurate, as you cannot tell within an interval of 5000 exactly when a page
reference actually occurred. You can improve this by interrupting more frequently and
increasing the number of reference bits, but this has a greater overhead and reduces the time
available to do useful work.

The working set model is quite clumsy and expensive to implement; monitoring the page-fault
frequency is a better approach. An "acceptable" page-fault rate is defined with upper and lower
bounds: if a process's page-fault rate falls below the lower bound, the process loses a frame, and
if the page-fault rate rises above the upper bound, the process gains a frame.

Inter Process Communication


We need processes to co-operate to carry out particular jobs, e.g., one process requests a
service from another (such as requesting an I/O operation from the kernel, as the kernel is
still a process), pipelined processes working together (one process can't proceed until the
previous one has finished) and where multiple processes work on the same task, for example
one might read from disk to memory, and another reads from memory to the audio device.
This can be accomplished using inter process communication (IPC).

IPC has numerous advantages, such as:

• Information sharing (e.g., of shared files)


• Computation speed-up (e.g., a program can be split across multiple computers)
• Modularity (for software engineering reasons)
• Convenience (e.g., to edit, compile and print in parallel)

Independent processes cannot affect or be affected by the execution of another process (all
processes are in separate, protected memory address spaces). Co-operating processes can
affect or be affected by the execution of another process, via some form of IPC mechanism.

The OS must solve some of the problems that IPC causes, however. It must provide
mechanisms to control interaction between the processes and must order accesses to shared
resources by processes. The OS must also provide mechanisms by which processes can wait
until a resource is available.

A paradigm for co-operating processes is the producer-consumer problem. A producer
process produces information that is consumed by a consumer process. There are two models
which can be used here: one is the unbounded-buffer, which places no practical limit on the
size of the buffer, and the buffer is just expanded as the producer fills it and the other is the
bounded-buffer, which assumes there is a fixed buffer size and the producer must wait for the
consumer to remove data if the buffer is full.

See the lecture notes for full details.

One way to implement the bounded-buffer solution is to use a circular array/list, but this only
works if the processes can share memory. Another implementation that maximises the use of
the buffer is to use an integer counter, which is incremented when the producer
writes to the buffer and decremented when the consumer reads from it. These operations must
be performed atomically, though, which is not guaranteed: a context switch may occur part-way
through an increment or decrement.
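
The race is easiest to see by expanding counter++ and counter-- into the separate load, modify
and store steps they compile to; a context switch between the load and the store loses an update
(a sketch, with the rest of the bounded buffer omitted):

/* Sketch of the lost-update race on the shared counter. Neither of these
 * three-step sequences is atomic. */
int counter = 0;            /* number of full slots in the bounded buffer     */

void producer_step(void)
{
    int r = counter;        /* load                                           */
    r = r + 1;              /* modify  <-- a context switch here ...          */
    counter = r;            /* store   ... can overwrite the consumer's store */
}

void consumer_step(void)
{
    int r = counter;        /* load                                           */
    r = r - 1;              /* modify                                         */
    counter = r;            /* store                                          */
}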

We now have a race condition, which is the situation where several processes access and
manipulate shared data concurrently and the final value of the shared data depends on which
process finishes last, so the data may become inconsistent. To prevent race conditions and to
ensure consistent data, concurrent processes must be synchronised. This requires mechanisms
to ensure the orderly execution of co-operating processes.

Critical Sections

Here, we can use critical sections. If we have n processes competing to use some shared data,
each process has a code segment called the critical section, in which the shared data is
accessed. The problem is to ensure that when one process is executing in its critical section,
no other process is allowed to execute in its critical section. A solution must satisfy three requirements:

• Mutual exclusion (mutexs) - if a process is executing in its critical section, then no other
processes can execute in their critical sections
• Progress - if no process is executing in its critical section and there exist some processes that
wish to enter their critical sections, then the selection of the process that will enter its
critical section next cannot be postponed indefinitely
• Bounded waiting - a bound must exist on the number of times that other processes are
allowed to enter their critical sections after a process has made a request to enter its critical
section and before that request is granted
One way to accomplish this is to use commands such
as EnterCriticalSection() and LeaveCriticalSection(). Processes may
share some common variables to synchronise their actions. For example, we could have an
integer variable called turn: a process may enter its critical section only when turn holds the
value associated with that process, and on leaving it updates turn to the next process. This
satisfies the mutual exclusion requirement, but not progress.

A second algorithm is to use guard flags, which consist of an array of flags. When a process
is ready to enter its critical section, it sets its flag to true, waits until no other process has its
flag set, executes its critical section, and then sets its flag back to false. This satisfies mutual
exclusion again, but not progress.

The final algorithm (Peterson's algorithm) is a mixture of turn and flags: a process sets its flag,
gives the turn to the other process, and waits until either the other process's flag is clear or the
turn has come back to it, before entering its own critical section. This meets all three
requirements, and solves the critical section problem for two processes.

However, we need to solve the problem for n processes, and one way of implementing this is
to use the Bakery (or Lamport) algorithm. Before entering its critical section, a process
receives a number, and the holder of the smallest number enters its critical section. If
processes i and j receive the same number, then if i < j, then i is served first, else j is served
first. The number scheme used always generates numbers in an increasing order of
enumeration.

As in many aspects of software, hardware can make programming easier. Hardware exists
that can solve the critical section problem. If an operation TestAndSet existed, which
could test and modify the contents of a word atomically, mutual exclusion could be
implemented. This does not satisfy the bounded wait requirement, however.
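
A sketch of mutual exclusion built on such an instruction; the body of test_and_set is written
in C only to show what it does - in reality it must be a single atomic hardware instruction for
the scheme to work:

/* Sketch of a spinlock built on an atomic TestAndSet instruction. */
static volatile int lock = 0;            /* 0 = free, 1 = held                */

int test_and_set(volatile int *target)   /* must be atomic in hardware        */
{
    int old = *target;                   /* read the old value ...            */
    *target = 1;                         /* ... and set the word to 1         */
    return old;
}

void enter_critical(void)
{
    while (test_and_set(&lock))
        ;                                /* busy wait until the lock was free */
}

void leave_critical(void)
{
    lock = 0;
}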

The software algorithms discussed above for exclusion of processes from critical sections are
problematic as they rely on the continuous testing of variables (busy waiting), which is
inefficient and wasteful of CPU cycles. The protocols are complex and unenforceable, relying
on programmer discipline and co-operation. The OS implements efficient IPC mechanisms
instead.

Signals

We've already discussed signal handling in threads in the threads section.

POSIX/Linux signals are a dataless communication either between processes, or between the
OS and a process. Signals can be thought of as software interrupts and require a handler
within a process. When the event that causes the signal occurs, the signal is said to be
generated and when the appropriate action for the process in response to the signal is taken,
the signal is said to be delivered. In the interim, the signal is said to be pending. Signals
consist of a number of types, such as system-defined (normally up to 31), corresponding to a
particular event, or user-defined (normally integers > 31). Some of these signals include:

• SIGINT (2) - interrupt generated from a terminal


• SIGILL (4) - interrupt generated by an illegal instruction
• SIGFPE (8) - floating point exception
• SIGKILL (9) - kill a process
• SIGSEGV (11) - segmentation violation
• SIGALRM (14) - alarm clock timeout
• SIGCHLD (17) - sent to parent when child exits

Synchronous signals occur at the same place, every time in the program (e.g., SIGFPE,
SIGILL, SIGSEGV) and are caused by the program itself. Asynchronous signals may occur
at any time from a source other than the process (another process e.g., SIGKILL, a terminal
driver on a key press e.g., SIGINT on Ctrl+C, on an alarm clock, e.g., SIGALRM) and a
process can choose how to handle the specific signal.

A signal can be treated in the following ways: it can be ignored (except for SIGKILL); the
default action set by the OS can be executed (the default action depends on the signal, and may
be terminating the process, suspending the process, terminating the process and saving a core
dump of its memory, continuing a suspended process, or simply ignoring the signal); or a
specific signal-catching function can be invoked.

There are essential system calls for signalling (a short usage sketch follows this list):

• kill(pid, signal) - send a signal of value signal to the process with ID pid.
• sigaction(handler, signal) - installs a handler for a specific signal (handlers
are specific to each process)
• sigprocmask(..., mask, ...) - signals can be blocked or masked in much the
same manner as hardware interrupts can be masked by the OS
• sigpending(mask) - signals that are raised whilst they are masked out can be examined
• sigsuspend(mask) - a process can wait until a signal arrives
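
A minimal sketch of installing a handler for SIGINT using the real POSIX interface (note that
the actual sigaction() call takes a struct describing the handler, rather than the simplified
two-argument form listed above):

/* Sketch: install a handler for SIGINT (Ctrl+C) using POSIX sigaction(). */
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int signo)
{
    (void)signo;
    got_sigint = 1;             /* handlers should do as little as possible  */
}

int main(void)
{
    struct sigaction sa = { 0 };
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGINT, &sa, NULL);   /* install the handler                   */

    while (!got_sigint)
        pause();                    /* sleep until any signal is delivered   */
    printf("caught SIGINT, exiting\n");
    return 0;
}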

Semaphores

Semaphores are "a non-negative integer, which apart from initialisation can only be acted on
by two standard, atomic, uninterruptible operations, WAIT (which decrements the value of
the semaphore) and SIGNAL (which increments the value)" - E. Dijkstra

You must guarantee that a maximum of one process can execute a WAIT or SIGNAL
operation on the same semaphore at the same time (atomicity).

There can be two types of semaphore: a counting semaphore (the integer value can range over
an unrestricted domain) and a binary semaphore, whose value can only be 0 or 1. The latter
can be simpler to implement.

Semaphores can be used to implement mutual exclusion for the critical section
of n processes. The shared data is a binary-semaphore mutex which is WAITed on when a
program wants to enter the critical section and then SIGNALed when the critical section
completes.

Semaphores can also be used to implement synchronisation. For example, statement B in
process j may only be executed once statement A has executed in process i. When a semaphore
is used to indicate a condition, it is called a condition variable, and synchronisation using a
condition variable is called condition synchronisation.

This can introduce two problems, however: starvation, which is indefinite blocking when a
process may never be removed from the queue in which it is suspended; and deadlock, where
two or more processes are waiting indefinitely for an event that can only be caused by one
of the waiting processes.

Counting semaphores can be used for resource management. If you consider a resource
with m allocatable elements, the counting semaphore is set to m and then users can wait for a
resource by waiting on the semaphore and then free up a resource by signalling a semaphore.

The classic problems of synchronisation, such as the bounded-buffer problem, the readers and
writers problem and the dining-philosophers problem, can be solved using semaphores and
mutexes.

OS-level semaphores ensure mutual exclusion over a shared resource only if the
application is disciplined. For example, the program should access shared data only
when it is holding the appropriate semaphores, and it should issue a signal to release the
semaphores when it finishes the critical section.

Semaphore operations still require atomicity. If an operation is in the OS, it can be accessed
by a process using a system call. When it is in the OS, the call will execute atomically, or at
least seem atomic from the user's point of view, if interrupts are enabled. True atomicity can
only be achieved with hardware support, such as the TestAndSet instruction discussed
earlier.

POSIX semaphores support two interfaces (Linux does not support these fully). Unnamed
semaphores are used within a process (or between related processes) and are created by a call
to sem_init(...) by a process. Operations include sem_wait(...) for a wait
and sem_post(...) for a signal. sem_trywait(...) also exists, which works like
wait if the semaphore has a greater count than 0, but doesn't wait if the semaphore is 0,
returning immediately to the calling process. These semaphores are destroyed by a call
to sem_destroy(...), or if the creating process dies.
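
A sketch of an unnamed POSIX semaphore used as a binary semaphore (mutex) around a critical
section, with error checking omitted for brevity:

/* Sketch: unnamed POSIX semaphore used to protect a critical section. */
#include <semaphore.h>

static sem_t mutex;
static int shared_data;

void setup(void)
{
    sem_init(&mutex, 0, 1);    /* pshared = 0: shared between threads of this
                                  process; initial value 1 (unlocked)        */
}

void update(int v)
{
    sem_wait(&mutex);          /* WAIT: decrement, blocking while it is 0    */
    shared_data = v;           /* critical section                           */
    sem_post(&mutex);          /* SIGNAL: increment, waking one waiter       */
}

void teardown(void)
{
    sem_destroy(&mutex);
}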

Named semaphores can be used between any processes. The namespace is the same as the
filesystem (so the name of a semaphore is effectively a filename). These are created
using sem_open(...) and then can be manipulated as
above. sem_close(...) removes the link between a process and a semaphore, and named
semaphores can live on past the termination of the creating process, so must be explicitly
removed using sem_unlink(...). Here, the OS merely stores a list of processes waiting on a
particular semaphore (i.e., processes that are blocked on a semaphore), but hardware support
is still required for atomicity.

Linux implements SysV semaphores (i.e., as it existed in System V UNIX), which are not
POSIX compliant. They are system wide resources, managed by the OS, and any process can
access a semaphore, which is stored in the semaphore table of an OS. Allocation occurs
via semget(...), which creates a set (possibly of just one) of semaphores or associates the
process with an existing set of semaphores. All operations on semaphores occur
using semop(semaphore, op, ...) and semctl(...) allows you to examine or
change the value of a semaphore, e.g., to initialise it, as well as telling you how many
processes are waiting on the semaphore and allowing you to destroy it.

Semaphores can also be used to provide mutual exclusion between threads, such as the mutexes
provided by the Linux pthreads library. pthread_mutex_init() initialises a
mutex, pthread_mutex_lock() obtains a lock for a mutex
and pthread_mutex_unlock() will give up a lock. The semaphores between pthreads
are supported by the standard Linux kernel semaphores.
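
A sketch of those calls protecting a shared counter between threads:

/* Sketch: protecting a shared counter with a pthreads mutex. */
#include <pthread.h>

static pthread_mutex_t lock;
static long counter = 0;

void setup(void)
{
    pthread_mutex_init(&lock, NULL);   /* default mutex attributes           */
}

void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     /* enter the critical section         */
        counter++;
        pthread_mutex_unlock(&lock);   /* leave the critical section         */
    }
    return NULL;
}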

Conditional synchronisation can also be used, where a condition variable enables threads to
atomically block and test a condition under the protection of mutual exclusion, until the
condition is true. If the condition is false, the thread blocks on the variable and automatically
releases the mutex. If another thread changes the condition, it may wake up waiting threads
by signalling the associated condition variable. Waiting threads will wake up, reacquire the
mutex and re-evaluate the condition. pthread_cond_init() initialises a
condition variable, which must be used together with a mutex initialised
by pthread_mutex_init(). pthread_cond_wait() takes a condition variable and a
mutex as parameters; pthread_cond_signal() takes just the condition variable.
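
A sketch of condition synchronisation with pthreads: a consumer waits, re-testing the condition
in a loop, until a producer signals that an item is available:

/* Sketch: condition synchronisation with a pthreads condition variable. */
#include <pthread.h>

static pthread_mutex_t m    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int items = 0;

void produce(void)
{
    pthread_mutex_lock(&m);
    items++;                           /* change the condition ...           */
    pthread_cond_signal(&cond);        /* ... and wake one waiting thread    */
    pthread_mutex_unlock(&m);
}

void consume(void)
{
    pthread_mutex_lock(&m);
    while (items == 0)                 /* re-test: wake-ups may be spurious  */
        pthread_cond_wait(&cond, &m);  /* atomically releases m and blocks   */
    items--;
    pthread_mutex_unlock(&m);
}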

Using the low-level IPC mechanisms mentioned above relies on programmer discipline for
the correct use of semaphores to wait and signal in the correct order at the correct point in the
algorithm. Additionally, it requires an agreement between communicators (i.e., between the
reader and the writer) on the precise protocol to be used, and problems can be encountered if
the writer assumes writers have priority and readers assume that the readers have priority
over the shared data. Sometimes, hardware support is required to provide atomicity
(TestAndSet, interrupt disabling, etc).

High-level IPC provides a structured IPC less susceptible to programmer error. It
incorporates both the means of controlling the shared data and the shared data itself, as opposed
to low-level IPC, such as semaphores, where the shared data control is distinct from the shared
data (note that the POSIX implementation assumes that shared data in a file could be
accessed by processes without waiting for the appropriate semaphore).

Message Passing

Message passing is a mechanism for processes to communicate and synchronise their actions.
In a message system, processes communicate with each other without resorting to shared
variables. If P and Q wish to communicate, they need to establish a communication link
between them and exchange messages via send/receive.

This does require OS support and a number of implementation issues must be resolved -
where are messages stored whilst they are in transit, how many processes can join in with a
communication, and is the message size fixed or variable, is the link uni or bi-directional,
synchronous or asynchronous, etc...
With message passing, processes must name each other explicitly, such as: send(P,
message), to send a message to process P and receive(Q, message) to receive a
message from process Q.

With this method, links are established automatically and a link is associated with exactly one
pair of communicating processes; between each pair there exists exactly one link. The
link may be unidirectional, but it is usually bidirectional.

A method of indirect communication involves messages being sent to and received
from mailboxes. Each mailbox has a unique ID and processes can communicate only if
they share a mailbox. This introduces the problem of deciding who gets a message if a
mailbox is shared, however. The primitives in this method are: send(A,
message) and receive(A, message), which send and receive messages via mailbox
A.

In this case, a link is only established if processes share a common mailbox, and a link may
be associated with many processes. Each pair of processes may share several communication
links and the links may be uni- or bi-directional.

Mailboxes can be owned by either the OS or a process. If a mailbox is OS-owned (in the OS
address space), the mailbox is independent and not attached to any process, so operations and
system calls are required for a process to create a new mailbox, to send and receive messages
through the mailbox and to delete a mailbox. If a mailbox is process-owned (in the address space
of a process), the process with the mailbox receives messages and other processes send messages
to the mailbox. An operation or system call is still required to write to the mailbox.

Message passing can be considered to be either blocking or non-blocking. Blocking is
considered synchronous; in a blocking send, the sending process is blocked until the message
is received by the receiver or a mailbox, and in a blocking receive, the receiver is blocked
until a message is available. Non-blocking is considered asynchronous; in a non-blocking
send, the sending process sends the message and then resumes execution, and for a non-
blocking receive, the receive call will either get a message or a NULL. The send and receive
primitives may each be either blocking or non-blocking, depending on the implementation.

The link (direct communication) or mailbox (indirect communication) will contain messages
that have been sent but not yet received. This is called a buffer, and buffers can be implemented
in more than one way:

• Zero capacity - 0 messages are held, and the sender must wait for the receiver
• Bounded capacity - here there is a buffer with a finite length of n messages and the sender
must wait if the link is full
• Unbounded capacity - infinite length, and the sender never waits

Client-Server Communication

When the processes are on different machines, network communication is required. Here,
three forms of IPC are available: sockets, remote procedure calls (RPCs) and remote method
invocation (Java). These will be discussed in the NDS course.
In POSIX, message queues between processes are used. Messages are stored in FIFO order,
and the queue can be set up so that a process sends or receives blocking (on a full or
empty queue respectively), or so that it sends or receives non-blocking. By default, a process
reading from an empty queue will block.

In POSIX, a process creates a message queue or connects to an existing one using mq_open(name,
...), where name is a filename in the same namespace as POSIX semaphores. Sending and
receiving messages occur using mq_send(...) and mq_receive(...), and a process
can close its connection to a message queue with mq_close(...); the message
queue will still exist after this call. To destroy a message queue, mq_unlink(...) is used,
and the queue will be destroyed once all processes attached to the queue have closed their
connections.
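
A sketch of a sender and a receiver using these calls; the queue name /demo_queue and the
message size are illustrative choices, and on Linux this needs linking against the real-time
library (-lrt):

/* Sketch: POSIX message queue sender and receiver. */
#include <mqueue.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>

#define QNAME "/demo_queue"             /* illustrative queue name            */

void sender(void)
{
    struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 128 };
    mqd_t q = mq_open(QNAME, O_CREAT | O_WRONLY, 0600, &attr);
    const char *msg = "hello";
    mq_send(q, msg, strlen(msg) + 1, 0);         /* blocks if the queue is full */
    mq_close(q);
}

void receiver(void)
{
    mqd_t q = mq_open(QNAME, O_RDONLY);
    char buf[128];                               /* must be >= mq_msgsize       */
    if (mq_receive(q, buf, sizeof buf, NULL) >= 0)  /* blocks if queue is empty */
        printf("received: %s\n", buf);
    mq_close(q);
    mq_unlink(QNAME);               /* queue destroyed once all users close it */
}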

High-level Synchronisation Mechanisms

Semaphores provide a convenient and effective mechanism for process synchronisation, but
their incorrect use can result in problems difficult to detect. To combat these problems, a
number of high-level synchronisation constructs have been proposed: critical regions and
monitors.

These constructs are not implemented at the OS level, but usually as part of the programming
language; there is no inherent atomicity at the application level.

Critical regions are a high-level synchronisation construct. A shared variable v of type T is
declared as v : shared T. The variable v is only accessible inside the statement region
v when B do S, where B is a boolean expression; this means that while statement S is
being executed, no other process can access variable v.

Regions referring to the same shared variable exclude each other in time (i.e., mutual
exclusion is enforced). When a process tries to execute the region statement, the boolean
expression B is evaluated. If B is true, S is executed, but if B is false, the process is delayed
until B becomes true and no other process is in the region associated with v.

Critical regions are implemented using semaphores. With the shared variable v, the following
are associated: semaphore mutex, first-delay, second-delay; int
first-count, second-count;. Mutually exclusive access to the critical section is
provided by mutex. If a process cannot enter the critical section because B is false, it
initially waits on first-delay and is moved to second-delay before it is allowed to
re-evaluate B. Track is kept of the number of processes waiting on first-delay and second-
delay by using first-count and second-count respectively.

This algorithm assumes a FIFO ordering in the queueing of processes for a semaphore. For an
arbitrary queueing discipline, a more complicated implementation is required.

Monitors are high-level synchronisation constructs that allows the safe sharing of an abstract
data type among concurrent processes. The monitor encapsulates shared data together with
procedures that act upon that data. Only the procedures can be accessed from outside the
monitor; the data inside the monitor can not be directly accessed from outside the monitor.
Only one process can be active inside a monitor at a time.

To allow a process to wait within a monitor, a condition variable must be declared,
e.g. as condition x, y;. A condition variable can only be used with the operations wait
and signal. x.wait() means that the process invoking the operation is suspended until
another process invokes x.signal(), which resumes exactly one suspended process. If no
process is suspended, then the signal operation has no effect.

A conditional-wait construct also exists: x.wait(c), where c is an integer expression
evaluated when the wait operation is executed. The value of c is a priority number stored along
with the name of the suspended process. When x.signal() is executed, the suspended process
with the smallest associated priority number is resumed next.

Two conditions need to be checked to establish the correctness of a system:

• User processes must always make their calls on the monitor in a correct sequence.
• You must ensure that an uncooperative process does not ignore the mutual-exclusion
gateway provided by the monitor and try to access the shared resource directly, without
using the access protocols.
Deadlocks
Deadlocks occur when a set of blocked processes are each holding a resource and waiting to
acquire a resource held by another process in the set. Deadlocks can be characterised by the
bridge-crossing problem:

There is only one lane across the bridge. We can view each section of the bridge as a
resource, and if a deadlock occurs, it can be resolved if one car backs up (pre-empt resources
and rollback), but sometimes that will require the cars following it to be backed up also.
Starvation can also occur, if only cars from one direction are allowed to pass.

Deadlock can arise if four conditions hold simultaneously in a situation:

1. Mutual exclusion: only one process at a time can use the resource
2. Hold and wait: a process holding at least one resource is waiting to acquire additional
resources held by other processes
3. No preemption: a resource can only be released voluntarily by the process holding it, after
the process has completed its task
4. Circular wait: there exists a set {P0, P1, ..., Pn} of waiting processes such that P0 is waiting for a
resource that is held by P1, which is waiting for a resource that is held by P2, ..., Pn - 1 is
waiting for a resource that is held by Pn, that is waiting for a resource that is held by P0.

We could plot a resource allocation graph which consists of a set of vertices V and a set of
edges E, such that V is partitioned into two types, P = {P1, P2, ..., Pn}, the set of all processes
in the system and R = {R1, R2, ..., Rm}, the set of all resource types in the system. Each
process utilises a resource as follows: a request is a directed edge Pi → Rj, and if the resource
cannot be assigned immediately, the requesting process is blocked; an assignment (resource use)
is indicated by a directed edge Rj → Pi, meaning the process is using the resource. When a
resource is released, the corresponding edge is removed.
Consider, as an example, a resource allocation graph with no circularities; no deadlock occurs.
Suppose P3 has all the resources it needs; when it frees R3 then P2 can proceed, and when P2
releases R1, then P1 can proceed.
By contrast, consider a graph in which mutual exclusion holds over all of the resources, a
number of processes hold resources while waiting for others, and no process will release a
resource: P3 holds R3 and wants R2; P2 holds R1 and R2 and wants R3; and P1 holds R2 and
wants R1. This is a circular wait, and we have all the conditions for deadlock, so this graph
does contain a deadlock.

The following basic rules can be applied: if the graph contains no
cycles, no deadlock occurs; if the graph contains a cycle and there is only one instance per
resource type, then deadlock has occurred; otherwise, if there are several instances per resource
type, there is the possibility of deadlock.

Handling Deadlocks

The three basic methods for handling deadlocks are prevention/avoidance (ensuring that the
system will never enter a deadlock state), detection and recovery (allows the system to enter a
deadlock state and then recover from it) and the head-in-the-sand approach (ignore the
problem and pretend that deadlocks never occur - this is the approach used by most OS,
including UNIX, Linux and Windows).
To implement prevention, we need to ensure that one or more of the deadlock conditions do
not hold:

• Mutual exclusion is not required for sharable resources, but it must hold for non-sharable
resources.
• Hold and wait - to break this condition, guarantee that whenever a process requests a resource,
it does not hold any other resources. The process must request and be allocated all of its
resources before it begins execution, or the process can only request resources when it currently
holds none. This has the possibility of low resource utilisation and starvation.
• No preemption - If a process that is holding some resources requests another resource that
can not be immediately allocated to it, then all resources currently being held are released.
Preempted resources are added to the list of resources for which the process is waiting and
the process will only be restarted when it can regain its old resources, as well as the new
ones it is requesting.
• Circular wait - impose a total ordering of all resource types and require that each process
requests resources in an increasing order of enumeration.

Deadlock prevention algorithms ensure deadlock will never occur provided (at least one of)
the restrictions above is enforced. But there is a cost to this approach: restraining how resource
requests are made can lead to low device utilisation and reduced system throughput.

Alternatively, with some additional a priori information, deadlock can be avoided on a per-
request basis. The simplest method is that each process declares the maximum number of
resources of each type that it may need; a dynamic alternative is to dynamically examine
the resource-allocation state to ensure that there can never be a circular-wait condition. The
resource allocation state is defined by the number of available and allocated resources, and
the maximum demands of the process.

System States

When a process requests an available resource, the system must decide if immediate
allocation would leave the system in a safe state - a state where there exists a sequence of all
processes so that all processes can get their required resources and complete. If the system is
not in a safe state, it is in an unsafe state, which means the possibility of deadlock exists (but
is not necessarily going to happen). The deadlock state is a subset of the unsafe states.

Avoidance requires that the system will never enter an unsafe state, so an algorithm must pick
a safe sequence of events and processes for execution. One such algorithm is the resource-
allocation graph algorithm. A claim edge Pi → Rj indicates that process Pi may request
resource Rj which is represented by a dashed line. A claim edge is converted to a request
edge when a process requests the resource, and to an assignment edge when the resource is
allocated. When the resource is released by the process, the assignment edge reconverts to a
claim edge. Resources must be claimed a priori in the system.

File Systems
Files are an integral part of most computer systems. Input to applications can be by means of
a file, and output can be saved in a file for long-term storage. File systems can store large
amounts of data, and the storage is persistent. The file management system is considered part
of the OS.
The file management system is the means by which a process may access files. The programmer
does not need to develop file management software; this is provided by the OS. The objectives of a file
management system are to meet the data management needs of the user, to guarantee the data
in the file is valid, to minimise the potential for lost or destroyed data and to optimise
performance. We can also break down what the user requirements for data management are
into:

• being able to create, delete, read and change files


• controlled access to other users' files
• control accesses allowed to that users' files
• restructure the users' files in a form appropriate to the problem
• move data between files
• backup and recover the users' files in case of damage
• access and share the users' files using symbolic names

To enable this, the file management system typically provides functions to identify and
locate a selected file, to organise files in a structure (often accomplished using directories)
which describes the locations of all files and their attributes, and to provide mechanisms for
protection. The file management systems provides secondary storage management, that is, it
allocates files to free disk blocks and manages free storage for available disk blocks.

Files

A file is a contiguous, logical address space (apart from sparse files); however, it may not
be stored contiguously on the device. Files may be of different types, but from the perspective of
the OS, files have no structure - a file is just a sequence of machine words and bytes. However,
from the user or application perspective, files may have a simple record structure made up of
lines (of fixed or variable length) or complex structures, which have special control
characters in the structure to make up files of various types.

Files also have attributes assigned to them:

• Name - in a human readable form


• Type - for systems that support different types
• Location - pointer to a file location on the device
• Size - current file size
• Protection - e.g., read, write, execute bits
• Time, date and user identification - used for protection, security and usage monitoring

These file attributes are kept in the directory structure.

The OS only provides basic file operations on files, such as: create, write, read, seek, delete
and truncate, so more complex operations must be made up of combinations of these
operations, e.g., copy is a create, followed by a read from a source file and a write to the new
file.

File types, traditionally implemented using file extensions (in Microsoft Windows, this is a
legacy from DOS), enables the OS to invoke an appropriate application for each file type.
UNIX and its variants use a "magic number" at the start of the file to declare its type.
File systems have criteria for file organisation, such as rapid access (especially needed when
accessing a single data item), ease of update (not always a concern, e.g., CD-
ROM), economy of storage (we want to reduce wasted space, although this is less of a concern
in modern computing), simple maintenance and reliability.

Sequential Access Files

In these types of files, access starts at the beginning of the file, and to read to the middle of
the file, you must start at the beginning and read the file until you reach the position you
want, and then rewind to get back to the beginning. This method of file access is slow.

A fixed format is used for structuring data into records. Records are normally the same
length, and the fields in a record are in fixed order. One field (usually the first) is a key,
which uniquely identifies the record within that file. Usually, records are stored in key order
and batch updates are used to reduce the update time (records are cached and then ordered
before an update is done). This method is designed for sequential access devices (tapes), but
sequential files can be implemented on random access devices, such as disks.

Additions to and deletions from the file can cause problems. The file is stored in a sequential
ordering of record keys, so insertion requires "moving" some of the records on the storage
medium. This is very expensive in time. Usually, new records are placed in a new file, called
the overflow, log or transaction file and a periodic batch update occurs that merges the
overflow file with the original file. Alternatively, records are organised as a linked list. More
overhead in terms of storage is required, and additional processing is required also. This only
really works (on sequential storage devices) if the records remain in key order. Deletion
requests can also be stored in the log file.

To reduce slow, sequential access files, and to support random access storage devices better,
indexed sequential files can be used. The index provides the lookup capability to get in the
vicinity of a desired record faster. The index is searched for a key field that is equal or less
than the desired key value which contains a pointer to the main file. The search then
continues in the main file by the location indicated by the pointer. The records are still stored
in order, and new records are added to an overflow file (a record in the main file preceding
the new record is updated to contain a pointer to the new record in the overflow file). The
overflow is then periodically merged with the main file during a batch update.

Comparison: If a file contains 1,000,000 records, in a standard sequential file, the average
lookup case is 500,000 accesses to find a record in the sequential file. If an index contains
1000 entries, it will take on average 500 accesses to find the key, followed by 500 accesses in
the main file. On average, it is now 1000 accesses.

Direct Access Files

Direct access files exploit the capability of random access devices (such as disks) to access
directly any address on the device. No sequential ordering of data in the file is required, and
this method can be used where rapid access is required, such as in swapping or conventional
user files. It is not that useful for backups or archives where only occasional access is
required (these can therefore be implemented on cheaper media, such as tapes).

Directories

Directories are used to organise files, and they themselves are files owned by the OS.
Directories contain information about files, such as name, attributes, location, ownership,
size, access/update times and protection information. They provide a mapping between file
names and the files themselves.

From a user perspective, they need to search for a file (by name or attribute), create a file,
delete a file, list a directory and rename a file, etc...

The directory must be organised to obtain efficiency (e.g., to locate a file quickly), naming
(so it is convenient for users, two users can have the same name for different files and the
same file can have several different names) and grouping (a logical grouping of files by
properties).

There are multiple ways of implementing directories. One such way is the single-level
directory, where a single directory exists for all users. However, if all users share the same
directory, they must ensure that all file names are unique, and there is the grouping problem
of little structure or organisation.

In a two-level directory setup, each user has their own directory, which allows a user to have
the same filename as another user, this allows for efficient searching, but has the problem of
there being no grouping capability between users.

Another is the tree-structured directory, which allows for efficient searching and grouping
capability, without any sharing of files. In this model, we have the concept of a current, or
working, directory. There are issues with this model, however. There are absolute and
relative path names, and creating a new file or directory is done in the current directory. Also,
how do you delete a directory? Most OSs require you either to state explicitly that you want to
remove all the files in the directory, or to delete them yourself beforehand.

Tree structure prohibits the sharing of files and directories, however. Acyclic graphs remove
this restriction. They are more complex than tree, and two different names can refer to the
same file (aliasing), so there can be multiple path names for the same file. This does
introduce a deletion problem, however - you cannot delete a file until all references to it have
been removed. This is difficult, so we can solve it using backpointers (so we can delete all the
pointers to a file), by organising the backpointers in a daisy-chain, or by using an entry-hold-
count solution.

Acyclic graphs are complex to manage, however. You need to ensure there are no cycles,
which links can introduce. Merely having subdirectories and files can not introduce cycles.
Another way of guaranteeing no cycles is to only allow links to files, not subdirectories, or by
using a cycle detection algorithm every time a new link is added to determine whether or not
it is okay (this is expensive).

A file has to be opened before it is of any use, and similarly, a file system must be mounted
before any files or directories are available.

Sharing of files on multi-user systems is desirable. Sharing can be done through a protection
scheme. A simple method is to ensure that only one user has write access to a file at a time. A
complex method is using a variation of semaphores implemented by some file systems
(including in POSIX) called file locks. On distributed systems, files may be shared across a
network.

Protection

Protection exists to make sure that files are not read or modified by unauthorised persons.
Protection mechanisms are provided by the OS to safeguard information inside the computer.
Objects and the rights that they have in particular contexts are identified in the aim to prohibit
processes from accessing objects they are not authorised to access.

A domain is associated with a set of pairs <objects, rights>, where rights will be a subset of
operations that are allowed on the object, and a domain could be a user or a process.
It is possible for an object to be in two domains at the same time with or without the same
rights. Different mechanisms are used for identifying domains, in UNIX a combination of
user ID and group ID is used.

Another method of providing protection is through access control lists and capabilities. This is
an abstract model, where matrix rows correspond to domains, and columns correspond to
objects. An ACL corresponds to a column - the access rights for one object in the different domains - and a
capability list corresponds to a row: it is associated with a domain (e.g., a process) and lists the operations
that are permitted on each object.

An access matrix might look like this:

↓ Capability → Access List File 1 File 2 File 3 Card Reader Printer

neil read read

andy read print

DomainX read execute

DomainY read/write read/write

The capability list is a row of the matrix, and a capability then is like an abstract data type,
that is, it points to an object and specifies which operations are allowed on it (it may also
have a field 'type' indicating what type of object it is). The list then specifies what capabilities
are accessible in this domain.
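
A rough sketch of the two views follows (the specific cells are illustrative only, not a faithful copy of the matrix above): storing the matrix by column gives an ACL per object, while storing it by row gives a capability list per domain. In Python both views are simple dictionary projections of the same sparse matrix:

    # Sparse access matrix: (domain, object) -> set of rights (illustrative values)
    matrix = {
        ("neil", "File 1"): {"read"},
        ("andy", "Printer"): {"print"},
        ("DomainY", "File 2"): {"read", "write"},
    }

    def acl(obj):
        """ACL view: the column for one object - rights held by each domain."""
        return {d: rights for (d, o), rights in matrix.items() if o == obj}

    def capability_list(domain):
        """Capability view: the row for one domain - its rights on each object."""
        return {o: rights for (d, o), rights in matrix.items() if d == domain}

    print(acl("File 2"))              # DomainY's rights on File 2
    print(capability_list("neil"))    # neil's rights, object by object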

In UNIX, the file owner (by default, the creator) is able to control what should be done, and
by whom. Types of accesses include read, write and execute. In UNIX, this is implemented
as a bitmask with three domains, owner, group and other. Groups in this case are created by a
sysadmin, not the user.
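
As a concrete illustration (assuming a POSIX system; the file name is hypothetical and created just for the demo), Python's os and stat modules expose the same owner/group/other bitmask:

    import os, stat

    path = "perm_demo.txt"              # hypothetical file used only for this demo
    open(path, "w").close()             # create it so the calls below have something to act on

    os.chmod(path, 0o640)               # set the bitmask: owner rw-, group r--, other ---
    mode = os.stat(path).st_mode        # st_mode holds the type and permission bits

    print(bool(mode & stat.S_IRUSR))    # owner read    -> True
    print(bool(mode & stat.S_IWGRP))    # group write   -> False
    print(bool(mode & stat.S_IXOTH))    # other execute -> False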

A file system resides on secondary storage and contains both programs and data. It also
contains swapped out pages as part of the virtual memory management system. The main
issues to be addressed here are how to store files (a contiguous logical structure) on
secondary storage (disk) and mapping high level file operations on to low-level disk
operations, whilst requiring efficient use of the disk and fast read/write times.

Disks provide a series of blocks, usually of a fixed size, which are sometimes related to the
memory frame/page size, often as a multiple, so that a whole number of file blocks can be loaded into
one physical memory frame.

Disks are random access, that is, you can directly address any given block on the disk, and it
is simple to access any file sequentially or randomly. For I/O efficiency, transfers between
memory and disk are often performed in units of blocks.
For convenient access to disks, the OS imposes a logical file system, which defines how files
look to the user (i.e., operations, file attributes, directories, etc...). Algorithms and data
structures are needed to map the logical file structure onto the physical disk. This file
organisation maps logical file structure onto physical block addresses. A basic file system
issues generic commands to a device driver to read and write blocks. Finally, the I/O control
contains device drivers to translate basic file system commands into commands for a specific
disk.

Several structures are used to implement file systems, such as on disk structures including the
boot control block, the partition control block, the directory structure to organise the files and
a per file file control block (FCB) containing permissions, dates (access, create, modify),
owner, group, size and blocks. In-memory structures are also used, such as the partition table,
the directory structure (with recently accessed directories) and an open file table (both system
wide, and per process).

(A diagram here shows the file system structures provided by the OS for opening a file (a) and for reading a file (b).)

Virtual File Systems

Virtual file systems exist so that a single interface can be provided by the OS to many
different types of file systems. Different file systems can be mounted on different mount
points within the same directory structure, and operations have the same effect on all the
different file systems.

This logical file system is called the Virtual File System, or VFS and it allows the same
system call interface (the API) to be used for different types of file system. The API is to the
VFS interface, rather than any specific type of file system, and the VFS can distinguish
between local and remote files.

Directory Implementation

The simplest implementation of directories is a linear list of file names with pointers to the
data blocks. This is time-consuming (e.g., a search is required when a file is created to ensure the
name is unique), as linear searches are expensive, but the cost can be offset by
keeping frequently used directory information in a cache. Keeping the names in order
could reduce search time, but it would make inserting and deleting names more costly.

Another implementation is to use a hash table, which decreases directory search time. There
is a problem with collisions (situations where two file names hash to the same location) in
this approach, however. A method is needed for knowing that a hash location has a number of
corresponding file names, and for determining which name to use. This type of hash table is
called a chained-overflow hash table.
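
A minimal sketch of the two implementations (the names and i-node numbers are invented): a linear list is a list of (name, pointer) pairs searched front to back, while Python's dict is itself a hash table with internal collision handling, so it gives the hashed directory almost for free:

    # Linear-list directory: (name, i-node number) pairs
    linear_dir = [("notes.txt", 101), ("exam.pdf", 102), ("ops", 103)]

    def lookup_linear(name):
        # O(n) scan - the cost described for the linear list
        for entry_name, inode in linear_dir:
            if entry_name == name:
                return inode
        return None

    # Hashed directory: name -> i-node number, O(1) expected lookup;
    # collisions are handled inside the hash table implementation
    hashed_dir = {"notes.txt": 101, "exam.pdf": 102, "ops": 103}

    print(lookup_linear("exam.pdf"))   # 102
    print(hashed_dir.get("exam.pdf"))  # 102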

File Allocation Methods

In contiguous allocation, each file occupies a set of contiguous blocks on the disk. This is
relatively simple to implement, as only a starting location (block #) and length (number of
blocks) are required. Random access is also relatively easy. This is wasteful of space
(dynamic storage-allocation problem) due to fragmentation (see memory management
earlier) and files can not easily grow.

A slightly more improved method is linked allocation, where each file is a linked list of disk
blocks, which may be scattered anywhere on the disk, and where each block contains a
pointer to the next block. Again, this is simple as only the starting address is required and
there is no wasted space in this free space management system. However, there is no random
access, as you have to go down the linked list to find the appropriate block.

The file allocation table (FAT) system is a variant of linked allocation used in MS-DOS and
OS/2. A section at the beginning of the disk is dedicated to a FAT, containing an entry for
each disk block. The directory entry then only needs the block number for the start of the file;
indexing the FAT by a block number then gives the next block of the file, and so on.
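
A minimal sketch of walking such a chain (the table contents are invented; -1 marks end-of-file): the directory entry stores only the first block number, and each FAT entry gives the next block:

    # FAT indexed by block number; the value is the next block, -1 = end of file
    FAT = {4: 7, 7: 2, 2: 10, 10: -1}

    def file_blocks(start_block):
        """Follow the FAT chain from the directory entry's start block."""
        blocks = []
        block = start_block
        while block != -1:
            blocks.append(block)
            block = FAT[block]
        return blocks

    print(file_blocks(4))   # [4, 7, 2, 10]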

Linked allocation solves the problems of contiguous allocation, but it does not support
random access without a FAT. This can be achieved with an index table - all the
pointers are brought together into an index block, which exists per file, where the ith entry
points to the ith block of file, and this gives us random access. With large files, however,
large index tables are needed, which is a disadvantage; multi-level indexing can be
introduced to combat this, however. Additionally, the index table is an overhead.

Free Space Management

Since disk space is limited, space from deleted files needs to be reused, and free space needs
to be monitored. One way of implementing this is using a bit vector, which shows whether each
block is free or occupied. An alternative to the bit vector is a linked list: a pointer is kept
to the first free block, which contains a pointer to the next free block, and so on. This is inefficient,
as traversing the list requires multiple disk block reads; however, traversal is an uncommon
operation, as most operations require just one block. The linked list does not require
any extra space, unlike the bit vector, although there is no easy way to obtain contiguous
space.
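
A minimal sketch of the bit-vector approach (the block counts and reserved blocks are arbitrary): one flag per disk block, with a simple scan-and-flip allocator:

    free_map = [True] * 16                    # True = block free, False = occupied
    free_map[0] = free_map[1] = False         # e.g., blocks already in use

    def allocate_block():
        """Return the first free block and mark it occupied, or None if the disk is full."""
        for block, free in enumerate(free_map):
            if free:
                free_map[block] = False
                return block
        return None

    def free_block(block):
        """Mark a block as free again when its file is deleted."""
        free_map[block] = True

    b = allocate_block()
    print(b, free_map[b])   # 2 False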

These methods can be modified using grouping, storing the addresses of the next n free
blocks in the first free block; next n addresses of free blocks in the next free block, etc... - this
is more efficient in the number of blocks that need to be read; and counting, which takes
advantage of the fact that free blocks usually occur together. The first free block contains the
number of free blocks immediately following it (i.e., forming a contiguous free store), together
with the address of the next free block at the start of some other contiguous free store.

POSIX and Linux File Systems

In these implementations, the logical file system provides simple file operations, including:
create, open, read, write, close, mount (mounting a file system at a mount point), stat (returns
information about a file), lseek (repositions the current position in the file) and truncate
(causes file to be truncated to a specific size).
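
Python's os module wraps these calls almost one-for-one, which makes a convenient illustration (the file name is arbitrary; mount is omitted as it normally requires privileges):

    import os

    fd = os.open("demo.dat", os.O_CREAT | os.O_RDWR, 0o644)   # create/open the file
    os.write(fd, b"hello, file system")                        # write
    os.lseek(fd, 7, os.SEEK_SET)        # reposition the current position in the file
    print(os.read(fd, 4))               # read 4 bytes from offset 7 -> b'file'
    print(os.stat("demo.dat").st_size)  # stat: file attributes, here the size in bytes
    os.truncate("demo.dat", 5)          # truncate the file to 5 bytes
    os.close(fd)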

This assumes a structure where all files have distinct absolute paths, using a tree hierarchy. It
is possible, however, to set up cycles via symbolic links.

The VFS provides uniform access to all file systems, including special files (e.g., /proc, /dev).

The file system organisation is a multi-level indexing scheme, with an i-node (index-node)
associated with each file, which contains attributes and disk addresses of the files' blocks.
There is no distinction between files and directories other than having a different mode type
field.

In an i-node, the first few disk addresses are stored (which for small files, will be all the disk
addresses) and there are links to further i-nodes which contain further disk addresses. This
may go up to further indirection, which gives us a very flexible system at the expense of
complexity and storage space.

For directories, each entry contains a file name and an i-node number. Here, the i-node
contains information about the time, size, type, ownership and disk blocks contained in the i-
node. If you imagine that the file system knows the root directory and wants to look up
/usr/neil/ops/exam, then from the root i-node, the system locates the i-node number of the file
/usr, and then looks for the next component neil, and so on. Relative paths work in the same
way, where there is a known current directory i-node.
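
A rough sketch of the lookup just described, using a toy in-memory i-node table (the numbers and directory entries are invented):

    # Toy i-node table: directories map names to i-node numbers, files hold data
    inodes = {
        2:  {"type": "dir",  "entries": {"usr": 5}},          # the root directory "/"
        5:  {"type": "dir",  "entries": {"neil": 11}},
        11: {"type": "dir",  "entries": {"ops": 14}},
        14: {"type": "dir",  "entries": {"exam": 23}},
        23: {"type": "file", "data": "exam questions..."},
    }

    def resolve(path, root_inode=2):
        """Resolve an absolute path component by component, as described above."""
        current = root_inode
        for component in path.strip("/").split("/"):
            entries = inodes[current]["entries"]   # current i-node must be a directory
            current = entries[component]           # i-node number of the next component
        return current

    print(resolve("/usr/neil/ops/exam"))   # 23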

The Ext2 Linux general purpose filesystem implements this, with a mostly static allocation of
i-nodes. The disk is split up into block groups of one or more disk cylinders (i.e., those
physically together) and blocks are used for i-nodes and file data blocks. Bitmaps are used for
keeping track of the free data blocks and i-nodes in each group. When allocating disk blocks, the file system
attempts to keep related information together (i.e., the i-nodes of all files in a directory in the same
block group), and new directories are created in the block groups with the smallest number of
directories.

This scheme works well as long as there is > 10% free space.

With journalling file systems, file data is held both on disk and in main memory (a memory cache
of disk blocks is used for efficiency), so we have to consider what
happens when the system fails. On the disk, the file system data structures may be corrupted, but
we must ensure there is no loss of data (i.e., consistency is maintained). Journalling (or log-
based transaction) file systems attempt to maintain consistency.

For a file operation, updates to the disk structure are written to a log, and when complete, the
changes are considered committed. The updates for a file operation form a transaction, and
actual updates to disk structures are performed so that if there is a crash, it is known how
much of the transaction has been performed (and which other transactions remain in the log).
ReiserFS under Linux is an example of a journalling file system.

I/O

Handling I/O efficiently is important to the user, and therefore to the OS. All user interaction
with the machine is done using I/O and most user invoked operations (i.e., applications) will
use I/O at some point.

Operating systems provide I/O device handling based on utilising low-level hardware
interfaces to devices (i.e., access device via special instructions). Device drivers are
provisioned here to handle the idiosyncrasies of individual devices, programmed in terms of a
low-level hardware interface and provide a standard high-level application interface to
devices, to enable their easy use by application software.

See ICS for information on computer organisation.

Devices are addressed either using direct mapping, where devices exist in a special I/O range
and special I/O instructions exist in the CPU to implement this, or by memory mapping,
where devices appear as addresses in the normal memory range and normal processor
instructions are used to access these addresses.

Devices tend to be accessible in the same place for a particular architecture, and each address
range maps to the I/O control registers for the device, accessed by special I/O instructions.
Some devices have both direct and memory-mapped address space (e.g., graphics cards).

I/O ports typically consist of a status register, which contains bits read by the host that
indicate device state (e.g., data ready for a read), a control register which enables the host to
change the mode of a device (e.g., from full to half-duplex for serial ports), a data-in register
which is read by the host to get input and a data-out register, which is written to by the host
for data to be sent to the device. This introduces the issue of how the host knows when a
particular condition has arisen in the status register.

Polling is one way to solve this problem, and is a host initiated approach. The device status
register is polled until the wanted condition occurs, but this way is inefficient as there is a
busy-wait cycle to wait for I/O from a device. The alternative solution is using interrupts,
which is initiated by both the host and the device. A CPU interrupt request line is triggered by
the device and the interrupt handler receives interrupts and uses an interrupt vector to
dispatch an interrupt to the correct handler based on priority. Most interrupts can be maskable
to ignore or delay some interrupts, and the interrupt mechanism can be used for exceptions.
There is no need for busy-waiting in this model.
In the IA32 architecture, interrupts 0-31 are unmaskable, as some are exceptions, and 32-255
are maskable and are used for devices.

For information on DMA, see ICS.

DMA is used to avoid programmed I/O for large data movement and is implemented using a
DMA controller to bypass the CPU and transfer data directly between the I/O device and
memory.
Application Interfaces

There are many different hardware devices, and application processes want a common
interface to them, but devices can vary in many dimensions. The transfer style may be character
(the device deals with bytes or words of data, e.g., keyboards, mice, serial ports; commands include get and
put, and libraries are often implemented on top to allow in-line editing), stream (a stream of bits) or
block (the device deals with blocks of data, such as disks; commands include read, write and seek,
and most raw I/O filesystem access uses this). Devices also differ in being sequential or random access, sharable
or dedicated, in device speed, and in being read-write, read-only or write-only. Timer devices also
exist, which provide the current time, elapsed time, or just a standard timer.

I/O can be either blocking or non-blocking. Under a blocking scheme, a process is suspended until the I/O
has completed (this is easy to use and understand, but is insufficient for
some needs, although blocking can be programmed around using threads); under a non-
blocking scheme, an I/O call returns as much data as is available, returning quickly with a count of
bytes read or written. A final scheme is asynchronous, which is more difficult to use: the
program runs on while the I/O executes, and the I/O subsystem signals the
process when the I/O is completed.

Usually, hardware devices are controlled in the same manner as files (create, write, read,
delete, etc...), so the filesystem namespace can be used for devices (/dev in UNIX). The
kernel implements I/O through a series of layers, from the system call interface down through the
kernel I/O subsystem to the device drivers and hardware.

The I/O subsystem of the kernel implements I/O scheduling, with some I/O request ordering
via a per-device queue, and some OSs also try to implement fairness. Additionally,
buffering is implemented, where data is stored in memory while it is transferred between
devices (this copes with device speed mismatches, transfer size mismatches and "copy
semantics").

Buffering can be implemented using many different methods, such as caching, where fast
memory holds a copy of the data (always just a copy, and this is the key to performance), or
spooling, where output for a device is held (this is useful if the device can only serve one
request at a time, e.g., printing). Device reservation can also be used, where exclusive access
is provided to a device and system calls for allocation and deallocation are used. However,
deadlock (discussed above) can be encountered here.

The kernel has data structures for things like keeping state info for I/O components including
open file tables, network connections and the character device state, as well as many, many
other data structures to track buffers, memory allocation and "dirty" blocks. Some OSs use
object-oriented methods and message passing to implement I/O.

Some OSs use a file based approach to kernel I/O structures. Devices are named as files and
accesses to devices are by file operations (with some exceptions, e.g., seek in a character
device may have no effect). In Linux, an additional command ioctl() provides direct access to
the device driver, perhaps to set the device into a particular mode (e.g., changing speed or
other parameters of a serial port).
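
As an illustration (assuming a Linux system and a serial device such as /dev/ttyS0, which may not exist or be accessible on a given machine), Python's termios module wraps the underlying ioctl()s used to change a serial port's speed:

    import os, termios

    fd = os.open("/dev/ttyS0", os.O_RDWR | os.O_NOCTTY)   # device opened like a file

    attrs = termios.tcgetattr(fd)   # [iflag, oflag, cflag, lflag, ispeed, ospeed, cc]
    attrs[4] = termios.B9600        # input speed
    attrs[5] = termios.B9600        # output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)   # issues the ioctl on the driver

    os.close(fd)
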
I/O Performance

I/O is a major factor in system performance, as it makes demands on the CPU to execute the
device driver or kernel I/O code and it forces context switches due to interrupts. Data copying
and network traffic are especially stressful.

Ways of improving performance include reducing the number of context switches, data
copying and interrupts by using large transfers, smart controllers and polling. DMA can also
be used and the CPU, memory, bus and I/O performance need to be balanced for highest
throughput.

This brings us to the question of where functionality should be implemented to get the best
performance - at the application, kernel or hardware level?
CHAPTER 3:
THEORY OF COMPUTATION
TOC is an extension of MCS, and as such builds on the topics originally covered in MCS,
such as:

• Inductive proofs and definitions (also covered in ICM)


• Finite automata
• Kleene's Theorem
• Non-deterministic finite automata
• Grammars
• Regular grammars
• Context-free grammars
• Pushdown automata
• Chomsky hierarchy

Level 0 - Unrestricted/recursively enumerable: productions α → β [α ∈ (V ∪ Σ)+, β ∈ (V ∪ Σ)*]; machine: Turing machine
Level 1 - Context-sensitive: productions α → β [α, β ∈ (V ∪ Σ)+, |α| ≤ |β|]; machine: linear-bounded automaton
Level 2 - Context-free: productions A → β [A ∈ V, β ∈ (V ∪ Σ)*]; machine: pushdown automaton
Level 3 - Regular: productions A → a, A → aB, A → Λ [A, B ∈ V, a ∈ Σ]; machine: finite automaton

We could also consider a level 2.5, which is a deterministic context free grammars.

Context-Sensitive Languages
A grammar is context sensitive if α → β satisfies |α| ≤ |β|, with the possible exception of S →
Λ. If S → Λ is present, then S must not occur on any right-hand side. The length restriction
means that derivations can never shrink, so all strings of the language up to a given length can
be generated systematically (by bounding the derivation trees).

Context-sensitive grammars generate context-sensitive languages, and every context-free
language is context sensitive.
Linear Bounded Automata

A linear bounded automaton (LBA) is defined like a nondeterministic Turing machine (see
the next section), M, except that the initial configuration for input w has the form (q0, <w>) and the tape
head can only move between < and >.

An equivalent definition allows initial configurations of the form (q0, <wΔ^n>), where n =
cM|w|, for some constant cM ≥ 0. This allows us to have multi-track machines, where
cM represents the number of tracks.

Theorem: A language L is context sensitive if and only if L is accepted by some LBA.

Open research problem: Are deterministic linear bounded automata as powerful as non-
deterministic LBAs?

Turing Machines
Turing machines were introduced by Alan Turing in "On Computable Numbers, with an
Application to the Entscheidungsproblem" (the Entscheidungsproblem asks whether a
procedure or algorithm exists for solving a general class of problems).

A Turing Machine is an automaton which consists of a 5-tuple (Q, Σ, Γ, q0, δ):

• Q is a finite set of states including ha and hr (the halting accept and reject states)
• Σ - the alphabet of input symbols
• Γ - the alphabet of tape symbols, including the blank symbol Δ, such that Σ ⊆ Γ - {Δ}
• q0 - the initial state, such that q0 ∈ Q
• δ - the transition function such that δ : (Q - {ha, hr}) × Γ → Q × Γ × {L, R, S}

Intuitively, a Turing machine has a potentially infinite tape of squares and a head moving along the
tape. When reading character a whilst in state q, the transition δ(q, a) = (r, b, D) denotes that the machine enters a new state
r, a is overwritten with b, and the head either moves to the left (D = L), to the right (D = R), or
remains stationary (D = S).
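
A small simulator helps make the conventions above concrete. This is only a sketch (missing transitions go to hr, moving left off the tape is treated as a crash to hr), and the example machine - which accepts strings over {a, b} containing at least one a - is made up purely for illustration:

    def run_tm(delta, w, max_steps=10_000):
        """Simulate a TM where delta[(state, symbol)] = (new_state, write, move)."""
        tape = ["Δ"] + list(w)              # initial configuration (q0, Δw)
        pos, state = 0, "q0"
        for _ in range(max_steps):
            if state in ("ha", "hr"):
                return state == "ha"
            if pos == len(tape):
                tape.append("Δ")            # extend the tape with a blank on demand
            symbol = tape[pos]
            if (state, symbol) not in delta:
                return False                # missing transition: implicit move to hr
            state, write, move = delta[(state, symbol)]
            tape[pos] = write
            if move == "R":
                pos += 1
            elif move == "L":
                if pos == 0:
                    return False            # crash at the left end of the tape: reject
                pos -= 1
            # move == "S": the head stays where it is
        raise RuntimeError("no halting configuration reached within max_steps")

    # Hypothetical example machine: scan right over b's, accept on the first a
    delta = {
        ("q0", "Δ"): ("q1", "Δ", "R"),
        ("q1", "b"): ("q1", "b", "R"),
        ("q1", "a"): ("ha", "a", "S"),
        # no entry for ("q1", "Δ"): running off the input rejects implicitly
    }
    print(run_tm(delta, "bba"))   # True
    print(run_tm(delta, "bbb"))   # False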

Configurations and Transitions

A configuration is a pair (q, uav) where q ∈ Q, a ∈ Γ and u, v ∈ Γ*. This configuration is
considered to be the same as (q, uavΔ) - the rightmost blanks can be ignored.

A transition (q, uav) ├ (r, xby) describes a single move of a Turing machine, and we can
write (q, uav) ├* (r, xby) for a sequence of transitions.
Convention also dictates that if there is no transition for what you want to do, a transition to
hr is implied.

There are also special case transitions we can use:

• The initial configuration for input w ∈ Σ* is (q0, Δw)


• Configuration (q, xay) is a halting configuration if q = ha or q = hr, an accepting configuration
if q = ha and a rejecting configuration if q = hr.
• Tape extension at the right end: (q, va) ├ (r, vbΔ) if δ(q, a) = (r, b, R) for some r ∈ Q and b ∈
Γ.
• A crash occurs at the left end of the tape: (q, av) ├ (hr, bv) if δ(q, a) = (r, b, L) for some r ∈ Q
and b ∈ Γ.

The language accepted by the Turing machine M is L(M) = {w ∈ Σ* | (q0, Δw) ├* (ha, xay)
for some x, y ∈ Γ* and a ∈ Γ}

Non-deterministic Turing Machines

A non-deterministic Turing machine (NTM) differs from the Turing machine in the transition
function: δ : (Q - {ha, hr}) × Γ → 2^(Q × Γ × {L, R, S})

Theorem: For every NTM N, there exists a TM M such that L(M) = L(N)

Multi-tape Turing Machines

Intuitively, this is a machine with n tapes on which the machine works simultaneously. It is
defined like the Turing machine, but with a revised definition of the transition function:
δ : (Q - {ha, hr}) × Γ^n → Q × Γ^n × {L, R, S}^n, where n ≥ 2.

Configurations on a multi-tape machine have the form (q, u1a1v1, u2a2v2, ..., unanvn), and the
initial configuration for an input w is (q0, Δw, Δ, ..., Δ), by convention.

Theorem: For every multi-tape machine N, there exists a TM M such that L(N) = L(M).

Recursively Enumerable

A language L is recursively enumerable if there exists a Turing machine that accepts L.

Theorem: A language L is recursively enumerable if and only if L is generated by some
(unrestricted) grammar.

Enumerating Languages

A multitape TM enumerates L if:

1. the computation begins with all tapes blank
2. the tape head on tape 1 never moves left
3. at each point in the computation, the contents of tape 1 has the form Δ#w1#w2#...#wn v,
where n ≥ 0, each wi ∈ L, # ∈ (Γ - Σ) and v ∈ Σ*
4. every w ∈ L will eventually appear as one of the strings wi on tape 1.
Theorem: A language L is recursively enumerable if and only if L is enumerated by some
multitape Turing machine.

Computability
Decidability

A language L is decidable (or recursive) if it is accepted by some Turing machine M that, on
every input, eventually reaches a halting state (ha or hr), i.e., it does not get stuck in a loop.
We say that M decides L.

To decide whether w ∈ Σ* is in L, we run M on w until M reaches a halting configuration;
then w ∈ L if and only if M is in state ha.

Theorem:

1. Every context-sensitive language is decidable - however, as the converse is not true, we can
consider the decidable languages as level 0.5 on the Chomsky hierarchy.
2. If a language L is decidable, then the complement L̄ = Σ* - L is also decidable
3. If both L and L̄ are recursively enumerable, then L is decidable

Proof of Theorem 2

Let M be a Turing machine that accepts L and reaches a halting configuration on every input (i.e., L is
decidable). Modify M to a machine M̄ as follows.

1. Swap ha and hr, i.e., the accept state of M̄ is the reject state of M and vice versa.
2. Put a new symbol at the start of the string (to detect the crash condition):
i. Shift the existing string one position to the right
ii. Put a special symbol (e.g., #) in the first square
iii. Move the tape head to the second square (the old first square)
iv. Continue as for a normal TM
v. δ̄ is extended so that δ̄(q, #) = (h̄a, #, S) ∀ q ∈ Q (a crash of M means M rejects, so M̄ accepts).

Therefore, L(M̄) = L̄, and L̄ is decidable, as neither of the modifications described above
can introduce looping.

Proof of Theorem 3

Intuitively, L(M) = L and L(M̄) = L̄. For a string w, w ∈ L ∨ w ∈ L̄ - so at least one of those
machines will accept the string. If the two machines are run in parallel and we terminate as soon
as one reaches an accept state, then it is impossible to get into a loop. This can be
accomplished on a two-tape machine. The string is copied to tape 2 in the first instance, and
then M is run on tape 1 and M̄ is run on tape 2.

We can express this formally as: Let M = (Q, Σ, Γ, q0, δ) and M̄ = (Q̄, Σ, Γ̄, q̄0, δ̄) be Turing
machines that accept L and L̄ respectively. Without loss of generality, we can assume that
M and M̄ cannot crash at the left end of the tape.

We can construct a two-tape Turing machine M′ that decides L by copying its input to tape 2
and simulating M and M̄ in parallel.

• Let Q′ ⊇ Q × Q̄ (we need additional states for the initial copying, which are Q′ - (Q × Q̄)).
• Let δ′((q, q̄), (x, x̄)) = ((r, r̄), (y, ȳ), (D, D̄)), where δ(q, x) = (r, y, D) and δ̄(q̄, x̄) = (r̄, ȳ, D̄). Also,
we need to define:
1. δ′((ha, q̄), (x, x̄)) = (ha′, (x, x̄), (S, S)) ∀ q̄ ∈ Q̄, x ∈ Γ, x̄ ∈ Γ̄
2. δ′((hr, q̄), (x, x̄)) = (hr′, (x, x̄), (S, S)) ∀ q̄ ∈ Q̄, x ∈ Γ, x̄ ∈ Γ̄
3. δ′((q, h̄a), (x, x̄)) = (hr′, (x, x̄), (S, S)) ∀ q ∈ Q, x ∈ Γ, x̄ ∈ Γ̄
4. δ′((q, h̄r), (x, x̄)) = (ha′, (x, x̄), (S, S)) ∀ q ∈ Q, x ∈ Γ, x̄ ∈ Γ̄

To show that M′ decides L, we consider two cases:

1. w ∈ L. Then, on input w, M will reach ha. M̄ will either reach state h̄r, or it will loop. In both
cases, M′ reaches state ha′.
2. w ∉ L, i.e., w ∈ L̄, and M̄ on input w will reach its accept state h̄a. M will either terminate in state
hr, or will loop. In both cases, M′ reaches state hr′.

Thus, L(M′) = L and M′ reaches a halting configuration on every input. We can transform M′
into a one-tape Turing machine M″ that accepts L and halts on the same inputs as M′. Thus, L
is decidable.

Encoding Turing Machines

For whatever Turing machine we encounter, we want to encode the Turing machine as a
string that represents it. Fix a universal list of states q1, q2, ... and a universal list of tape symbols
a1, a2, ... (call the set of these symbols S), so that every Turing machine (Q, Σ, Γ, q*, δ) draws its
states and tape symbols from these lists; then we can define:

• s(Δ) = 0
• s(ai) = 0^(i+1), for i ≥ 1
• s(ha) = 0
• s(hr) = 00
• s(qi) = 0^(i+2)
• s(S) = 0
• s(L) = 00
• s(R) = 000

If we let M = (Q, Σ, Γ, q*, δ), then a transition t: (q, a) → (q′, b, D) is encoded as
e(t) = s(q)1s(a)1s(q′)1s(b)1s(D)1, and e(M) is therefore
s(ai1)1s(ai2)1...s(ain)1 1s(q*)1 e(t1)1e(t2)1...e(tk)1, where Σ = {ai1, ..., ain} and δ = {t1, ..., tk}. Also,
for a string x = x1x2...xk ∈ S^k, e(x) = 1s(x1)1s(x2)1...s(xk)1.

For example, consider a Turing machine with Σ = {a, b} and Γ = {a, b, Δ} whose first transition is
(q, Δ) → (q, Δ, R). Suppose q = q1, a = a1 and b = a2; then the encoding begins e(M) =
00100011000100010100010100011..., which can be broken down as:

001    00011    0001    0001  01    0001  01    00011
s(a)   s(b)     s(q)    s(q)  s(Δ)  s(q)  s(Δ)  s(R)
input alphabet  initial state  ------ first transition ------

(each code is followed by a separating 1, with an extra 1 after the input alphabet and after each transition).

Tape symbols and states are implicit in the transitions, and therefore do not need to be
explicitly defined. With this, the encoding of one machine can be fed into another machine (this is
a similar concept to how compilers work).
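
The encoding scheme is mechanical enough to write down as a short sketch. The helper below follows the definitions of s and e(t) above; the "a1"/"q1" naming convention and the example call (the first transition of the worked example) are simply the ones used in this section:

    def s(x):
        """Encode a state, tape symbol or direction as a run of 0s (see the table above)."""
        fixed = {"Δ": "0", "ha": "0", "hr": "00", "S": "0", "L": "00", "R": "000"}
        if x in fixed:
            return fixed[x]
        kind, i = x[0], int(x[1:])          # "a2" -> tape symbol a2, "q1" -> state q1
        return "0" * (i + 1) if kind == "a" else "0" * (i + 2)

    def e_t(q, a, q2, b, d):
        """Encode one transition t: (q, a) -> (q2, b, d) as s(q)1s(a)1s(q2)1s(b)1s(d)1."""
        return "".join(s(x) + "1" for x in (q, a, q2, b, d))

    # The first transition of the worked example: (q1, Δ) -> (q1, Δ, R)
    print(e_t("q1", "Δ", "q1", "Δ", "R"))   # 0001010001010001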

The Self-Accepting Language

You can define the self-accepting language SA ⊆ {0, 1}* by SA = {w | w = e(M) for some
Turing machine M and w ∈ L(M)}. This is basically saying that w is the code of a TM that
accepts its own encoding. As such, we can define the complement of SA, S̄A = NSA = {w ∈
{0, 1}* | w ≠ e(M) ∀ Turing machines M} ∪ {w ∈ {0, 1}* | w = e(M) for some Turing
machine M and w ∉ L(M)} - that is, either a bit string that does not represent a TM, or the code of a
TM that does not accept its own encoding.

We have two theorems about the self-accepting language:

1. SA is recursively enumerable, but not decidable


2. NSA is not recursively enumerable

Proof of Theorem 2

This is done by contradiction (indirect proof), or reductio ad absurdum, and relies on the
principle of tertium non datur. If we suppose that NSA is recursively enumerable, then there
is a Turing machine that accepts NSA (i.e., for some Turing machine M, L(M) = NSA). If we consider
the code e(M), then we have two cases:

1. e(M) ∈ L(M). Then, by assumption, e(M) ∈ NSA. But then, by definition of NSA, e(M) ∉ L(M).
This is obviously a contradiction.
2. e(M) ∉ L(M). Then, by definition of NSA, e(M) ∈ NSA, hence by assumption e(M) ∈ L(M). This
is again a contradiction.
It therefore follows that our assumption must be wrong. That is, NSA is not recursively
enumerable.

The corollary of this is that SA is not decidable. The proof uses theorem 2 for decidable
languages ("if a language L is decidable, L̄ is also decidable"): if SA were decidable, then NSA = S̄A
would be decidable and hence recursively enumerable; as NSA is not r.e., SA is not decidable.

Proof of Theorem 1

The corollary above already gives that SA is not decidable; it remains to show that SA is
recursively enumerable. To prove this, we construct a Turing machine TSA that accepts SA,
working as follows:

TSA first checks that its input w is the code of some Turing machine M (i.e., w = e(M)) and
that Σ(M) includes {0, 1}. If this is not the case, TSA rejects w. Otherwise, TSA computes
e(w) = e(e(M)) and runs U (the universal Turing machine) on e(M)e(w). It follows that
TSA accepts w if and only if M accepts w = e(M). Hence L(TSA) = SA.

The Universal Turing Machine

The universal Turing machine is basically a Turing machine interpreter. It takes inputs of the
form e(M)e(w), where M is a Turing machine and w is a string over Σ(M)*. The universal
Turing machine will run machine M on input w: it accepts e(M)e(w) if w ∈ L(M) (M
accepts w), rejects e(M)e(w) if M rejects w, and loops on e(M)e(w) when M loops
on w.

Most Languages are not Recursively Enumerable

A set S is countably infinite if there exists a bijective function N → S, and is countable if it is
countably infinite or finite. e.g., countable sets: N, Z, N × N; uncountable sets: R, 2^N.

Cantor showed that if you arrange all pairs in an infinite matrix and enumerate them by moving
along successive diagonals, you obtain a bijection with N, so the set of pairs over a set is countably
infinite as long as the base set is.

Lemmas:
1. If S0, S1, ... are countable, then S = ∪i≥0 Si is countable
2. If S is an infinite set, then its power set 2^S is uncountable
3. If S is uncountable, and T is a countable subset of S, then S - T is uncountable

Proof of lemma 2: Let S be countably infinite, as otherwise 2^S is obviously uncountable
(there are uncountably many singleton subsets). We proceed by contradiction. Suppose 2^S is
countably infinite; then there is an infinite listing S0, S1, ... of the elements of 2^S (i.e., of all
the subsets of S). Since S is countably infinite, there is a bijection ƒ: N → S. We can define
a subset S′ of S by S′ = {ƒ(i) | i ≥ 0 and ƒ(i) ∉ Si}. Since S′ ⊆ S, there is some n such that
S′ = Sn. We can consider two cases:

1. ƒ(n) ∈ Sn; then ƒ(n) ∉ S′ = Sn, which is a contradiction
2. ƒ(n) ∉ Sn; then ƒ(n) ∈ S′ = Sn, which is again a contradiction.

Theorem: For every nonempty alphabet Σ, there are uncountably many languages over Σ that
are not recursively enumerable.

Proof: Every recursively enumerable language is accepted by some Turing machine. Since
Turing machines can be encoded as strings over {0, 1} and {0, 1}* is countably infinite, the
set of all Turing machines over Σ is countable as well (every subset of a countable set is
countable). Hence, the set of all recursively enumerable languages over Σ is countable too.
But 2^Σ*, the set of all languages over Σ, is uncountable by lemma 2, thus by lemma 3, 2^Σ* - {L
⊆ Σ* | L is r.e.} is uncountable as well.

Turing-computable Functions

A Turing machine M = (Q, Σ, Γ, q0, δ) computes the partial function ƒM(x) = y if (q0, Δx) ├*
(ha, Δy), with ƒM(x) undefined otherwise - that is, ƒM(x) is the resulting string on the tape if M
halts in ha, otherwise it is undefined. We can generalise this unary function to a k-ary
function: for every k ≥ 1, the k-ary partial function ƒM : (Σ*)^k →
Σ* is defined by ƒM(x1, ..., xk) = y if (q0, Δx1Δx2...Δxk) ├* (ha, Δy), and is undefined otherwise.

A partial function ƒ : (Σ*)^k → Σ* is Turing-computable if there exists a Turing machine M
such that ƒM = ƒ.

Computing Numeric Functions

A partial function ƒ : N^k → N on natural numbers is computable if the corresponding function
({1}*)^k → {1}* is computable, where each n ∈ N is represented by 1^n.

e.g., a Turing machine could compute ƒ : N → N with ƒ(x) = 1 if x is
even and 0 otherwise.
Characteristic Function of a Language

For every alphabet Σ, fix two distinct strings w1, w2 ∈ Σ*. The characteristic function of a
language L ⊆ Σ* is the total function XL : Σ* → Σ*, defined by XL(x) = w1 if x ∈ L, and
w2 otherwise.

Theorem: A language is decidable if and only if its characteristic function X L is computable.

Graph of a Partial Function

The graph of a partial function ƒ : Σ* → Σ* is the set Graph(ƒ) = {(v, w) ∈ Σ* × Σ* | ƒ(v) = w}.
This can be turned into a language over Σ ∪ {#}: Graph#(ƒ) = {v#w | (v, w) ∈ Graph(ƒ)}

We can have two theorems from this:

1. A partial function ƒ : Σ* → Σ* is computable if and only if Graph#(ƒ) is recursively enumerable.
2. A total function ƒ : Σ* → Σ* is computable if and only if Graph#(ƒ) is decidable.

Decision Problems

A decision problem is a set of questions, each of which has the answer yes or no.

We can say that these questions are instances of the decision problem; depending on their
answers, they are either 'yes'-instances or 'no'-instances. To solve decision problems by
Turing machines, instances are encoded as strings.

A decision problem P is decidable, or solvable, if its language of encoded 'yes'-instances is
decidable. Otherwise, P is undecidable or unsolvable.

A decision problem P is semi-decidable if its language of encoded 'yes'-instances is
recursively enumerable.
The Membership Problem for Regular Languages

Input: A regular expression r and w ∈ Σ*
Question: Is w ∈ L(r)?
Problem: {(r, w) | r is a regular expression, w ∈ Σ*}
'Yes'-instances: {(r, w) | r is a regular expression and w ∈ L(r)}
Encoded 'Yes'-instances: {encode(r)encode(w) | r is a regular expression and w ∈ L(r)}

This problem is decidable, as r can be converted into a finite automaton and w run on it.

The Self-Accepting Problem

Input: A Turing machine M
Question: Does M accept e(M)?
Problem: {M | M is a Turing machine}
'Yes'-instances: {M | M is a Turing machine and e(M) ∈ L(M)}
Encoded 'Yes'-instances: {e(M) | M is a Turing machine and e(M) ∈ L(M)}

The language of encoded 'Yes'-instances is SA, which is recursively enumerable (it can
be enumerated) but not decidable; therefore this problem is semi-decidable.

The Membership Problem for Recursively Enumerable Languages

Input: A Turing machine M and w ∈ Σ*
Question: Is w ∈ L(M)?
Problem: {(M, w) | M is a Turing machine and w ∈ Σ*}
'Yes'-instances: {(M, w) | M is a Turing machine and w ∈ L(M)}
Encoded 'Yes'-instances: {e(M)e(w) | M is a Turing machine and w ∈ L(M)}

The language of encoded 'Yes'-instances is MP, the membership problem.

Theorem: The membership problem for recursively enumerable languages is undecidable, but
semi-decidable.

Proof: We reduce the self accepting problem (SA) to the membership problem for recursively
enumerable languages (MP). That is, we show that if MP is decidable, then SA is also
decidable. Since SA was proved to be undecidable, MP must be undecidable too.

Suppose that MP is decidable. Then there is a Turing machine T that
terminates for all inputs and accepts MP. We can construct a Turing machine T′ that decides
SA as follows: T′ transforms its input e(M) into e(M)e(e(M)) (i.e., taking w = e(M), it builds
e(M)e(w)), and then starts T on this string. This can be represented diagrammatically:
T′ terminates for all inputs because T always terminates. Moreover, T′ accepts e(M) if and
only if T accepts e(M)e(e(M)), that is, T′ accepts e(M) if and only if e(M) ∈ L(M).

Hence, T′ decides SA, therefore, if we can solve MP, we can solve SA. But, as SA was
proved to be undecidable, this is a contradiction. Hence, our assumption that MP is decidable
is false, i.e., MP is undecidable.

We can summarise this as: reduction of SA to MP - "If we can solve MP, we can also solve
SA" ⇔ "SA is not more difficult than MP" ⇔ "MP is at least as difficult as SA"

Reducing Languages and Decision Problems

Let L1, L2 ⊆ Σ* be languages. L1 is reducible to L2 if there is a computable total function (i.e.,
one computed by some Turing machine) ƒ : Σ* → Σ* such that for all x ∈ Σ*, x ∈ L1 if and only if ƒ(x)
∈ L2.

Let P1, P2 be decision problems and e(P1), e(P2) be the associated languages of the encoded
'Yes'-instances. We say that P1 is reducible to P2 if e(P1) is reducible to e(P2).

Theorem 1: Let L1, L2 be languages such that L1 is reducible to L2. If L2 is decidable, so is L1.

Theorem 2: Let P1, P2 be decision problems such that P1 is reducible to P2. If P2 is decidable,
so is P1.
This diagram shows that strings in L1 are mapped to strings in L2, and strings
outside L1 (in L̄1) are mapped to strings outside L2 (in L̄2).

L1 = SA = {e(M) | M is a Turing machine and e(M) ∈ L(M)} and L2 = MP = {e(M)e(w) | M is
a Turing machine and w ∈ L(M)}. ƒ : {0, 1}* → {0, 1}* with ƒ(x) = xe(x) if x = e(M)
for some Turing machine M, and ƒ(x) = Λ otherwise.

Halting Problem

Input: A Turing machine M and a string w
Question: Does M reach a halting configuration on input w?

Theorem: The halting problem is undecidable, but semi-decidable.

We reduce the membership problem for recursively enumerable languages (MP) to the
halting problem (HP). Suppose that HP is decidable; then there is a TM Thalt that terminates
on all inputs and accepts an input e(M)e(w) if and only if M reaches a halting configuration
(terminates) on input w.

We can construct a Turing machine T′ diagrammatically as follows:

Note that if Thalt outputs 'yes', then U run on e(M)e(w) will terminate with either 'yes' or 'no'.
Hence, T′ solves MP. Since MP is undecidable, our assumption that HP is decidable must be
false.

Accept#

Theorem: The following problem (Accept#) is undecidable.

Input: A Turing machine M such that # ∈ Σ
Question: Is # ∈ L(M)?

Proof: We reduce MP to Accept#. Suppose Accept# is decidable; then there is a Turing
machine T# that halts on all inputs, accepting input e(M) if and only if # ∈ L(M). We can
construct a Turing machine T′ as follows:

Where C is a Turing machine that transforms e(M)e(w) into the output e(Mw), where Mw is a
Turing machine that ignores its input and runs M on input w. Now, T′ solves MP. We can
consider two cases:

Case 1: w ∈ L(M). Then, by construction of Mw, Mw accepts every input and hence # ∈
L(Mw), thus T# accepts e(Mw) and T′ accepts e(M)e(w).

Case 2: w ∉ L(M). Then, by construction of Mw, Mw accepts no input and hence # ∉ L(Mw),
thus T# rejects e(Mw) and T′ rejects e(M)e(w).

Since MP is undecidable, Accept# must be undecidable as well.

More Undecidable Problems

These can all be proved by reduction.

• Input: A Turing machine M.


Question: Does M accept no input ("Is L(M) = ∅")?
• Input: Two Turing machines M1 and M2.
Question: Do M1 and M2 accept the same inputs ("Is L(M1) = L(M2)")?
• Input: A Turing machine M
Question: Is L(M) finite?
• Input: A Turing machine M
Question: Is L(M) regular?
• Input: A Turing machine M
Question: Is L(M) decidable?

Rice's Theorem (1953)

Let C be a proper, non-empty subset of all recursively enumerable languages. Then, the
following problem is undecidable:

Input: A Turing machine M


Question: Is L(M) ∈ C?

In other words, every nontrivial property of recursively enumerable languages is undecidable.


A property is nontrivial if it is satisfied by a proper, nonempty subset C of the class of all
recursively enumerable languages. e.g., for the problem "Is L(M) finite?", C = {L ⊆ Σ* | L is
finite}.

Beware, Rice's theorem is about the languages accepted by Turing machines, not about the
machines themselves.

Undecidable Problems About Context Free Grammars

Input: A context-free grammar G
Question: Is L(G) = Σ*?

Input: Two context-free grammars G1 and G2
Question: Is L(G1) = L(G2)?

Input: Two context-free grammars G1 and G2
Question: Is L(G1) ∩ L(G2) = ∅?

Input: A context-free grammar G
Question: Is G ambiguous?

Input: A context-free grammar G
Question: Is L(G) inherently ambiguous?

In all the cases above, apart from the question 'Is G ambiguous?', the CFGs can be replaced
by pushdown automata. However, 'Is G ambiguous?' refers to a property of
a grammar, not a language, so a pushdown automaton cannot be substituted in.

The Church-Turing Thesis

The Church-Turing thesis was proposed by Alonzo Church and Alan Turing in 1936 and is
the statement that "Every effective procedure can be carried out by a Turing machine".
A functional version of this statement is that every partial function computable by an
effective procedure is Turing-computable, and for decision problems, every decision problem
solvable by an effective procedure is decidable.

This is only a thesis as an effective procedure (algorithm) can not be defined.

Complexity
(By convention, when dealing with complexity we only consider Turing machines, both
deterministic and nondeterministic, all of whose computations eventually halt.)

The time complexity of a Turing machine M is the function τM : N → N, where τM(n) is the
maximum number of transitions M can make on any input of length n. For a nondeterministic
Turing machine N, τN(n) is the maximum number of transitions N can make on any input of
length n by employing any choice of transitions. The time complexity of (possibly
nondeterministic) multitape machines is defined analogously; in all cases the time
complexity defined here is a worst-case measure.

The space complexity of a Turing machine M is the function sM : N → N, where sM(n) is the
maximum number of tape squares M visits on any input of length n. For a nondeterministic
Turing machine N, sN(n) is the maximum number of tape squares N visits on any input of
length n by employing any choice of transitions. For multitape Turing machines, the
maximum number of tape squares refers to the maximum of the numbers for the individual
tapes (which differs only by a constant factor from the maximal number of squares visited on
all tapes altogether).

For every Turing machine M, sM(n) ≤ τM(n) + 1, as it is impossible for more squares to be
visited than transitions made.

Growth Rate of Functions

Given functions f, g : N → N, f is of order g, written as f = Ο(g) or f ∈ Ο(g), if there
are constants c and n0 such that ƒ(n) ≤ c · g(n) ∀ n ≥ n0. For example, 3n^2 + n = Ο(n^2),
witnessed by c = 4 and n0 = 1.

Theorem, for a, b, r ∈ N with a, b > 1:

1. loga(n) = Ο(n) but n ≠ Ο(loga(n))
2. n^r = Ο(b^n) but b^n ≠ Ο(n^r)
3. b^n = Ο(n!) but n! ≠ Ο(b^n)

Using this 'Big-O' notation, we can create a hierarchy of complexity:

Name          Ο
Constant      1
Logarithmic   log n
Linear        n
Log-linear    n log n
Quadratic     n^2
Cubic         n^3
Polynomial    n^p
Exponential   b^n
Factorial     n!

The table below shows examples of different growth rates for different values of n.

log2(n)   n     n^2      n^3        2^n            n!
2         5     25       125        32             120
3         10    100      1000       1024           3628800
4         20    400      8000       1048576        2.4 × 10^18
5         50    2500     125000     1.1 × 10^15    3.0 × 10^64
6         100   10000    1000000    1.2 × 10^30    > 10^157

Properties of Time Complexity

Theorem - multiple tapes vs. one tape: For every k-tape Turing machine M there is a Turing
machine M′ such that L(M′) = L(M) and τM′ = Ο(τM^2).

Theorem - there are arbitrarily complex languages. Let ƒ be a total computable function.
Then there is a decidable language L such that for every Turing machine M accepting L, τ M is
not bounded by ƒ.

Theorem - there need not exist a best Turing machine. There is a decidable language L such
that for every Turing machine M accepting L, there is a Turing machine M′ such that L(M′) =
L(M) and τM′ = Ο(log(τM)).

The Classes P and PSpace

A language L is recognisable in polynomial time if there is a deterministic Turing machine M
accepting L such that τM = Ο(n^r) for some r ∈ N. The class of all languages recognisable in
polynomial time is denoted by P.

A language L is recognisable in polynomial space if there is a deterministic Turing machine
M accepting L such that sM = Ο(n^r) for some r ∈ N. The class of all languages recognisable in
polynomial space is denoted by PSpace.

A decision problem is said to be in P (resp. PSpace) if its language of encoded 'yes'-instances is
in P (resp. PSpace).

The Classes NP and NPSpace

A language L is recognisable in nondeterministic polynomial time if there is a
nondeterministic Turing machine N accepting L such that τN = Ο(n^r) for some r ∈ N. The
class of all languages recognisable in nondeterministic polynomial time is denoted by NP.

A language L is recognisable in nondeterministic polynomial space if there is a
nondeterministic Turing machine N accepting L such that sN = Ο(n^r) for some r ∈ N. The
class of all languages recognisable in nondeterministic polynomial space is denoted by
NPSpace.

A decision problem is said to be in NP (resp. NPSpace) if its language of encoded 'yes'-
instances is in NP (resp. NPSpace).

Example: A problem in P

A graph G = (V, E) consists of a finite set, V, of vertices (or nodes) and a finite set, E, of
edges such that each edge is an unordered pair of nodes. A path is a sequence of nodes
connected by edges. A complete graph (or clique) is a graph in which every two nodes are
connected by an edge.

The path problem has as input a graph G and nodes s and t in G, and the question is: does G
have a path from s to t?

The theorem is that the path problem is in P, and the proof for this is the following algorithm
which solves the path problem:

On input (G, s, t) where G = (V, E) is a graph with nodes s and t:

1. Mark node s
2. Repeat until no additional nodes are marked:
i. scan all edges of G
ii. for each edge {v, v′} such that v is marked and v′ is unmarked, mark v′.
3. Check if t is marked - if so, accept, otherwise reject

The running time of this algorithm can be analysed by considering each stage. Stages 1 and
3 are executed only once. The body of the loop in stage 2 is executed at most |V| times,
because each time except for the last, an unmarked node is marked. The body of the loop can
be executed in time linear in |E|, and stages 1 and 3 need only constant time. Hence, the
overall running time is Ο(|V| × |E|). In the worst case, |E| = |V|^2/2, and hence this is Ο(|V|^3).
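
A direct transcription of the marking algorithm is given below as a sketch (the graph representation - a set of vertices plus a set of unordered edge pairs - is invented for the example):

    def has_path(V, E, s, t):
        """Marking algorithm for the path problem: is there a path from s to t in G = (V, E)?"""
        marked = {s}
        changed = True
        while changed:                          # repeat until no additional nodes are marked
            changed = False
            for v, w in E:                      # scan all edges of G
                for a, b in ((v, w), (w, v)):   # edges are unordered pairs
                    if a in marked and b not in marked:
                        marked.add(b)
                        changed = True
        return t in marked                      # accept iff t is marked

    V = {1, 2, 3, 4, 5}
    E = {(1, 2), (2, 3), (4, 5)}
    print(has_path(V, E, 1, 3))   # True
    print(has_path(V, E, 1, 5))   # False
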
Example: Problems in NP

The complete subgraph (clique) problem has the input of a graph G and k ≥ 1. The question is
does G have a complete subgraph with k vertices?

The Hamiltonian circuit problem takes as input a graph G, and the question is: does G have
a Hamiltonian circuit (a path v0, ..., vn through all nodes of G such that v0 = vn and v1, ...,
vn are pairwise distinct)?

Theorem: The complete subgraph and the Hamiltonian circuit problems are in NP.

The clique problem can be solved by a nondeterministic Turing machine N in polynomial time.
N nondeterministically selects ("guesses") vertices v1, ..., vk in G and checks whether each
pair {vi, vj}, 1 ≤ i < j ≤ k, is an edge in G. This checking can be done in time polynomial in |V|
+ k.

G can be represented as a string of vertices followed by a string of edges in the form of vertex
pairs. Then at most k^2/2 vertex pairs have to be checked, which can be done in time |V|^2/2
× k^2/2.
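
The "guess and check" structure is easy to see in code: the checking phase below runs in polynomial time, and the guessing is exactly what the nondeterministic machine provides (the graph and the candidate vertex set are invented for the sketch):

    from itertools import combinations

    def is_clique(E, vertices):
        """Check the certificate: every pair of chosen vertices must be an edge of the graph."""
        edges = {frozenset(e) for e in E}
        return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

    E = [(1, 2), (1, 3), (2, 3), (3, 4)]
    print(is_clique(E, [1, 2, 3]))   # True  - a clique of size k = 3
    print(is_clique(E, [2, 3, 4]))   # False - the edge {2, 4} is missing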

Tractable Languages

A language L is tractable if L ∈ P, otherwise L is intractable.

A language L is recognisable in exponential time if there is a Turing machine M accepting L
such that τM = Ο(2^(n^r)) for some r ∈ N. The class of all languages recognisable in exponential
time is denoted by ExpTime.

Theorem:

1. P ⊆ NP ⊆ PSpace = NPSpace ⊆ ExpTime
2. P ≠ ExpTime

It is an open research problem whether P = NP.


Conjectured Hierarchy of Complexity Classes

(Only P ≠ ExpTime is known, but it is not known whether the other subsets above are proper
or not)

Inclusions P ⊆ NP and PSpace ⊆ NPSpace hold, as every deterministic Turing machine can be
represented by a nondeterministic Turing machine.

Inclusions P ⊆ PSpace and NP ⊆ NPSpace hold, as every Turing machine (deterministic or
nondeterministic) satisfies sM(n) ≤ τM(n) + 1 - we cannot visit more squares than we make
transitions. Hence τM = Ο(n^r) implies sM = Ο(n^r).

Inclusion NPSpace ⊆ PSpace: this can be proved using Savitch's Theorem (which is
nontrivial), which states that for every nondeterministic Turing machine N there is a Turing
machine M such that L(M) = L(N) and sM = Ο(sN^2), and a polynomial squared is still a
polynomial.

Inclusion PSpace ⊆ ExpTime. For an input of length n, each configuration of the Turing
machine M can be written as (q, a1...ai...a_sM(n)). There are |Q| possible
states, sM(n) possible tape head positions and |Γ|^sM(n) possible tape contents.
Hence, the maximum number of configurations is |Q| × sM(n) × |Γ|^sM(n), which is Ο(k^sM(n)) for some constant k.

A Turing machine computation that halts does not repeat any configuration. Hence τM =
Ο(k^sM(n)) for some k, and the inclusion PSpace ⊆ ExpTime follows (for k ≤ 2^m we
have k^(n^r) ≤ (2^m)^(n^r) = 2^(m·n^r) ≤ 2^(n^(m + r))).
Polynomial-time Reductions

A function ƒ is polynomial-time computable if there is a Turing machine M with τ M = Ο(nr)


that computes ƒ.

Let L1, L2 ⊆ Σ* be languages. L1 is polynomial-time reducible to L2 if there is a polynomial-


time computable total function ƒ : Σ* → Σ* such that for all x ∈ Σ*, x ∈ L1 if and only if ƒ(x)
∈ L2.

Theorem: Let L1 be polynomial-time reducible to L2. Then: L2 ∈ P implies L1 ∈ P; L2 ∈ NP
implies L1 ∈ NP.

Proof

Since L2 ∈ P there is a Turing machine M2 with polynomial time complexity that decides L2.
Moreover, since L1 is polynomial-time reducible to L2, there is a Turing machine M with
polynomial-time complexity such that ƒM : Σ* → Σ* satisfies for all x ∈ Σ*, x ∈ L1 if and
only if ƒM(x) ∈ L2.

Let M1 be the composite Turing machine MM2 which first runs M and then runs M2 on the
output of M, then L(M1) = L1.

Since |ƒM(x)| can not exceed max(τM(|x|), |x|), the number of transitions of M1 is bounded by
the sum of the following estimates of the separate computations:

(a) τM1(n) ≤ τM(n) + τM2(max(τM(n), n))

Let τM = Ο(n^r), then there are constants c and n0 with τM(n) ≤ c · n^r ∀ n ≥ n0.

Let τM2 = Ο(n^t), then there are constants c2 and n2 with τM2(n) ≤ c2 · n^t ∀ n ≥ n2.

Hence τM2(n) ≤ c2 · n^t + d2, ∀ n ≥ 0, where d2 = max{τM2(k) | k < n2}.

It follows τM2(τM(n)) ≤ c2 · (τM(n))^t + d2, for all n ≥ 0, and hence τM2(τM(n)) ≤ c2 · (c · n^r)^t + d2,
for all n ≥ 0.

So, τM2(τM(n)) ≤ c2 · c^t · n^(rt) + d2 ∀ n ≥ 0.

Thus, by formula (a) above, τM1 = Ο(n^(rt)), hence L1 ∈ P (r and t are constants, therefore n^(rt)
is a polynomial).

Hardness

If L1 is polynomial-time reducible to L2, then L1 is no harder than L2. Equivalently, L2 is at
least as hard as L1.

A language L is NP-hard if every language in NP is polynomial-time reducible to L; an NP-hard
language is at least as hard to decide as every language in NP. Equivalently, no language
in NP is harder to decide than any NP-hard language.
A language L is NP-complete if it is NP-hard and belongs to NP; an NP-complete language is
a "hardest" language in NP. Corollary: for every NP-complete language L, L ∈ P if and only
if P = NP.

(Diagram: the conjectured relationship between P, the NP-complete languages and NP, assuming P ≠ NP.)

The Satisfiability Problem

Consider boolean expressions built from variables xi and connectives ^, ∨ and ¬. A literal is
either a variable xi, or a negated variable ¬xi. An expression C1 ^ C2 ^ ... ^ Cn is in
conjunctive normal form (CNF) if C1, ..., Cn are disjunctions of literals.

An expression C is satisfiable if there is an assignment of truth values to variables that makes
C true.

e.g., C = (x1 ∨ x3) ^ (¬x1 ∨ x2 ∨ x4) ^ ¬x2 ^ ¬x4. In this case, C is satisfiable by the assignment
α = {x1, x2, x4 → false, x3 → true}
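
Checking a candidate assignment like α against C is the easy, polynomial-time part of the problem; the following is a small illustrative Java sketch (not from the notes), assuming clauses are encoded as arrays of non-zero integers where +i stands for xi and -i for ¬xi.

/** Checks a truth assignment against a CNF expression in polynomial time. */
public class CnfChecker {

    /**
     * clauses - each clause is an array of literals: +i stands for xi and
     *           -i stands for ¬xi (variables are numbered from 1).
     * value   - value[i] is the truth value assigned to xi (index 0 unused).
     */
    public static boolean satisfies(int[][] clauses, boolean[] value) {
        for (int[] clause : clauses) {
            boolean clauseTrue = false;
            for (int literal : clause) {
                boolean v = value[Math.abs(literal)];
                if (literal < 0) v = !v;              // negated literal
                if (v) { clauseTrue = true; break; }
            }
            if (!clauseTrue) return false;            // one false conjunct falsifies C
        }
        return true;
    }

    public static void main(String[] args) {
        // C = (x1 v x3) ^ (¬x1 v x2 v x4) ^ ¬x2 ^ ¬x4, as in the example above.
        int[][] c = { {1, 3}, {-1, 2, 4}, {-2}, {-4} };
        // α = {x1, x2, x4 -> false, x3 -> true}
        boolean[] alpha = { false, false, false, true, false };
        System.out.println(satisfies(c, alpha));      // true
    }
}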

The satisfiability problem takes input of a boolean expression C in CNF, and the question is
"Is C satisfiable?".

Theorem: The satisfiability problem is NP-complete. This was proved by S. A. Cook in 1971
and was the first problem to be shown to be NP-complete.

To prove a problem is NP-complete, any NP-complete problem should be reducible to it. e.g.,
a polynomial-time reduction of the satisfiability problem to the clique problem. What we
need is a function ƒ : x → (Gx, kx), where x is a CNF expression and (Gx, kx) is a pair
depending on x, such that x is satisfiable if and only if Gx has a clique with kx vertices.
Let x = C1 ^ C2 ^ ... ^ Cc, where each conjunct Ci = ai,1 ∨ ai,2 ∨ ... ∨ ai,di and each ai,j is a literal.

Observation: Pick one literal from each conjunct and connect each pair of picked literals that
is not complementary (e.g., x and ¬x are complementary). If this yields a complete graph,
then x is satisfiable. Vice versa, if x is satisfiable, then a complete subgraph can be
constructed in this way.

Idea: Choose Gx as the graph consisting of all occurrences of literals and all connections
between literals that are in different conjuncts and not complementary.

Define Gx = (Vx, Ex) by Vx = {(i, j) | 1 ≤ i ≤ c ^ 1 ≤ j ≤ di} and Ex = {((i, j), (l, m)) | i ≠ l ^
ai,j ≢ ¬al,m}, and let kx = c.
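
As an illustration of this construction (and only as a sketch - the vertex and edge representation is an arbitrary choice, and Java 16+ records are used for brevity), the mapping ƒ could be coded roughly as follows, using the same literal encoding as the earlier CNF sketch.

import java.util.ArrayList;
import java.util.List;

/** Builds the graph Gx used in the reduction from satisfiability to clique. */
public class SatToClique {

    /** A vertex (i, j): the j-th literal of the i-th conjunct. */
    record Vertex(int conjunct, int literal) {}

    record Graph(List<Vertex> vertices, List<Vertex[]> edges, int k) {}

    /** x is given as conjuncts of literals: +v stands for xv, -v for ¬xv. */
    public static Graph reduce(int[][] x) {
        List<Vertex> vertices = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < x[i].length; j++) {
                vertices.add(new Vertex(i, j));
            }
        }
        List<Vertex[]> edges = new ArrayList<>();
        for (Vertex a : vertices) {
            for (Vertex b : vertices) {
                boolean differentConjuncts = a.conjunct() < b.conjunct();
                int la = x[a.conjunct()][a.literal()];
                int lb = x[b.conjunct()][b.literal()];
                boolean complementary = la == -lb;    // e.g. x and ¬x
                if (differentConjuncts && !complementary) {
                    edges.add(new Vertex[] { a, b }); // link non-complementary literals
                }
            }
        }
        return new Graph(vertices, edges, x.length);  // kx = number of conjuncts
    }

    public static void main(String[] args) {
        int[][] x = { {1, 3}, {-1, 2, 4}, {-2}, {-4} };
        Graph g = reduce(x);
        System.out.println(g.vertices().size() + " vertices, "
                + g.edges().size() + " edges, k = " + g.k());
    }
}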

By construction of Gx: Two vertices are linked if and only if their literals are in different
conjuncts and there is a truth assignment making both literals true. Next, we show that ƒ is a
reduction from the satisfiability problem to the clique problem.

"Only if": Let x be satisfiable. Then there is a truth assignment α such that for each i, there is
some literal ai, ji with α(ai, ji) = true. Hence, by construction of Ex, the vertices (1, j1), (2, j2),
..., (kx, jkx) forms a clique in Gx. Note that ai, ji ¬≡ ¬ai, ji, because α(ai, ji) = true = α(ai, ji).

"If": Let Gx have a clique with kx many vertices. Then, by definition of Ex, the literals of these
vertices must be pairwise non-complimentary. Hence, there is a truth assignemtn Θ making
all of these literals true. Moreover, by the definition of E x, these kx literals are in kx different
conjuncts. So α makes each of the kx = c conjuncts true. Thus, x is satisfiable.
CHAPTER 4:
MODELLING AND SYSTEM DESIGN

The Software Engineering Problem


How to bridge the gap between customer requirements and code. In some domains, this gap
is huge, because requirements are stated in terms of waffly things like goals and needs - not
functionality. The traditional answer is that "a miracle occurs", or that it's ad-hoc, and "gurus"
are needed. As a result, the process is unpredictable and costly. Can we provide a systematic
approach to bridge the gap?

Planning is essential. It happens in other engineering disciplines, and plans are required,
because massive changes are costly. Although it is vital to plan, even the best plans can go
wrong, so plans can be written to be changed, and anticipating change is a modern software
engineering reality. Planning is especially critical in large systems, however large is not
necessarily complex.

Adopting a development methodology is one approach to solving this problem. These
methodologies provide a process which consists of steps and phases which, if followed, may
make it easier to bridge the gap. There are lots of methodologies, but this course focuses on
elements of object orientation (specifically, UML and RUP).

Object Orientation

Object orientation is often what is applied in industry (the safety-critical industry is more
conservative, but even they are interested in things such as Safe-C++). OO can enable
engineering of more maintainable and extensible systems and helps in building reusable
components and therefore large-scale systems.

However, it must be borne in mind that it is not always the best approach to take (following
any methodology blindly can cause problems), but it is helpful in the right setting.

OO development is the use of classes and objects in developing software and systems.
Classes represent recurring, reusable, extensible concepts that provide services - similar to
types. Objects are instances of classes and carry out computation in response to messages.
OO development implies using classes and objects throughout the construction process, for
all parts, including requirements analysis, design, coding and testing.

Typical Development Processes

The waterfall, spiral and V-models all have the following phases, in some kind of order:

1. Requirements analysis
2. Specification and design
3. Implementation
4. Testing
5. Delivery
6. Maintenance

Waterfall Process

This method was first proposed by Royce in 1970. Each stage must be signed off and can be
repeated or feed back to an earlier stage. It is no longer an accurate model - it was in the
1970s, when it was developed for small, simple programs. Modern methods have become
more refined and improved.
Spiral Process

Created by Barry Boehm, the spiral process encourages incremental development and makes
the risks and costs of both resources and failure explicit in the process.

Rational Unified Process

Also known as just RUP, this is a "use case and architecture-driven process" from IBM
Rational. Based on a "reasonable" set of user stories, developers go through a suite of
iterations, refining the stories to OO design. In each iteration, additional information is added
to the designs. Again, this is an incremental approach to development.

RUP also specifies a management process that can help to monitor and measure progress.

User Stories

Written from a user's perspective, using their vocabulary - often there is no mention of
computer systems, software or user interfaces. Some things can be imprecise or unstated,
stories are often broken up into normal cases and exceptional cases. Use cases are a way to
write stories.

Model Driven Development

RUP is a special case of a more general category - model driven development, or MDD. The
aim with MDD is to build software using blueprints - things that are at a higher level of
abstraction than code (e.g., Ada packages, Java classes, etc). From this we get a model of the
system, however a model is not necessarily a diagram.

In this module, we'll look at one type of blueprinting language - UML, although many others
exist, in addition to other MDD processes.

MDD is essentially expanding the waterfall methods and the development quadrant of the
spiral method. MDD has a basis in formal methods, where each level of abstraction has to be
verifiably consistent with each other.

The Object Management Group (a consortium of interested parties, originally interested in
OO programming, which now produces standards) has created the model driven architecture,
a standardised approach to MDD. In the MDA, you start with a PIM (platform
independent model) which must then be converted into a PSM (platform specific model).
With the PSM, automated code generation tools can be used to generate the running code.
This code is often inelegant and hard to maintain by hand.

We can look at the MDA in more detail on the diagram below:

This diagram adds the PDM, a model of the platform which is used for the code generation.
In theory, this can be used to generate a PSM from the PIM.
Extreme Programming

Extreme Programming (or XP) is a methodology of interest in industry and research. Here, code is the
most important deliverable and everything else is much less important and may be ignored.

Development starts with coding, and changes to your design or requirements are reflected
immediately in code. There are questions whether or not this is sensible, or even dangerous
and if it is compatible with modelling or not.

XP takes the best bits of hacking and adds things such as peer review to generate an efficient
method of development.

Agile Methods

XP is an instance of the so-called Agile methodologies. These methodologies emphasise
incremental development and deprecate the production of non-code output, as the customer
presumably wants code. Agile methods do not say "Don't write documentation" - they say
write only what will help.

The principles of agile development are:

• Early and continuous development of valuable software
• Welcome changing requirements, and harness them for competitive advantage
• Deliver working software frequently
• Business people and developers working together on a day-to-day basis
• A continuous attention to technical detail
• Simplicity is essential
• Self-organising teams that reorganise and assess progress regularly

Software Quality

The principal goal of software engineering is to produce quality software. There are both
internal and external factors that influence software quality.

Internal factors are perceptible to computer professionals (e.g., efficiency of algorithms and
data structures). The customer doesn't care about these directly, but they are how we satisfy
the requirements.

External factors are perceived by the client, and are clearly the most important. Some
examples of external quality factors are:

• Correctness - the ability of the software to perform the tasks defined by its specification
• Robustness - ability of the software to react appropriately to abnormal conditions
• Extendibility - the ease of adapting software to changes in the specifications
• Reusability - the ability of different software to serve for the construction of many different
systems
• Efficiency - places as few demands as possible on hardware resources
• Portability - ease of transferring software to different environments
• Ease of use
• Timeliness - ability to get the software out of the door when the client wants it
High quality OO systems will be non-monolithic, i.e., will be divided up into classes, each of
which provides a distinct set of services. These classes will be:

• understandable - to determine what a class does, you should need to understand as little of
the class's context as possible - this is subjective, however
• continuous - if you modify a class, a minimum number of other classes will have to change
• protected - classes avoid propagating error conditions to neighbouring classes

There are five rules to help build high quality OO systems, and our programming and
modelling languages must provide support for these rules.

Direct Mapping Rule

The class structure devised in the process of building a software system should remain
compatible with any class structure devised in the process of modelling the problem domain.

To address the needs of the clients, their problem must be understood, and then a solution
formulated. Developers may model the customer's needs and the solution in a language (e.g.,
UML). A seamless mapping is needed from the solution in UML through to a software
implementation. Seamlessness involves using a common set of concepts throughout
development (classes and objects).
In practice, this isn't quite enough, however. It's difficult to describe customer requirements
in terms such as objects and classes. Requirements are often goal/event/need-based, and are
more easily expressed in non-class forms, so the seamless process is extended with
techniques such as use cases and interactions.

(Specification, Design, Implementation, Validation and verification, Generalisation)

Reversibility (as demonstrated in the above diagram) is like the backwards arrows in the
waterfall model, and is often required - for example to allow automatic construction of UML
models from code and the automatic generation of meaningful documentation.

Few Interfaces Rule

Every class should communicate with as few others as possible.

Thus, restrict the number of communication channels between classes (e.g., the number of
services that can be invoked). This can improve understanding and protection.

Small Interfaces Rule

If two classes communicate, they should exchange as little information as possible.

This is also known as weak coupling, and relates to the size of connections, as opposed to
their number.
Explicit Interfaces Rule

Whenever two classes, A and B communicate, this must be obvious from the text of A or B,
or both.

Previous rules have established that communication should be limited to a few participants,
and then only a few words. This rule requires that communication be loud and public. This is
very important with respect to understanding.

Communication here refers to both messages between objects and the sharing of data.

Information Hiding Rule

The designer of every class must select a subset of services as the official information about
the class, to be made available to the authors of the client classes. (David Lorge Parnas)

Only some, but probably not all, of the class's services are public. The rest are secret. The
public part is the class's interface.

Software Construction Principles

The following principles apply to any kind of component, not just classes.

• Linguistic units - components must correspond to syntactic language units
• Self-documentation - component designers should make all of a component's information part of the
component itself
• Uniform access - all component services must be available through a uniform notation
• Single choice - whenever a component must support a set of alternatives, one and only one
component in the system should know the exhaustive list
• Open-closed - services can be added (open), but the interface is stable for the clients to use
(closed). Both of these are needed in real projects. The classic approach is to close when we
reach stability, reopening when needed. But for this, we will need to reopen all the clients
too. This is clearly undesirable.

Modelling
At each phase in the MDD process, we create models - an abstract representation of reality.
For MSD, these are diagrams, but this is not necessarily the case. Models capture the essence
of things of interest - during requirements analysis we model the essence of customer goals
and needs, and during design we model the essence of the systems that will satisfy the
requirements. We use models as they are usually easier to change than the system itself.

OO Modelling vs. OO Programming

OO modelling involves abstraction and planning. No code is written in the modelling stage;
the essentials of a system are planned out and written down, often in a graphical notation
(e.g., UML).

OO programming involves expressing an OO model in an executable language.


The Unified Modelling Language (UML)

The UML is a visual language for describing systems, and is a successor to the wealth of OO
analysis and design methods of the 80s and 90s. It's mainly based on the work of Booch,
Rumbaugh (OMT) and Jacobson. The OMG standard for UML is currently 2.0, but many
organisations are still using 1.x, and not all tools fully support 2.0 yet.

An important fact to remember is that the UML is only a language - it doesn't specify a
development or management process; methods need both a language and a process. UML is
typically used with RUP, MDA or some other agile method. OMT, Objectory and Fusion are
examples of other methods which include UML.

UML consists of two main parts:

• the graphical notation, used to draw models
• a metamodel which provides rules clarifying which models are valid and which are invalid

UML consists of a series of different diagrams, such as the:

• class diagrams, the essentials and advanced concepts
• interaction diagrams, for specifying object interactions and run-time behaviour via message
passing (communication diagrams in UML 2.0)
• state charts, for specifying object behaviour and state changes (reactions to messages)
• use case diagrams, for capturing scenarios of system use
• physical diagrams, for mapping software elements to physical components
• extension mechanisms
• activity diagrams

Essentials of OO and UML 2.0


OO has static concepts:

• classes
• features of classes (attributes and operations)
• visibility
• relationships (client-supplier and inheritance)
• packages

And dynamic concepts:

• objects
• dynamic dispatch
• substitution
• messages
• events

We can also consider design-by-contract.


OO also considers use cases and actors. These ideas themselves aren't actually OO, but are
used in concert with OO ideas. They can be helpful in finding classes and objects, but they
can be dangerous and misleading.

There are also modelling extensions (e.g., stereotypes) and UML state machines (a simple
extension of concepts seen in the first year).

Classes

The most important concept in OO is the class. A class is a module (consisting of a data part
and an operation part) and a type (which can declare instances; objects), although there are
many different definitions of this.

Each class has an interface and the interface declares and defines the services/features that the
class provides. It also describes how to access these services; the interface could be
considered to be the public face of these services. With interfaces, you don't need to consider
what's happening internally.

All of OO analysis, design and programming revolves around the classes, which suggests the
question - how do we find the classes?

A class is represented in UML as shown below (e.g., a stack)

Sections on the class box can be hidden/elided as seen fit. There are also variants on this
diagram (stereotypes). Different UML tools also variously implement operation
signatures (the way of defining operations).

Classes have features of two types, attributes (or fields) that are used to hold data associated
with objects, and operations, used to implement behaviour and computations. Attributes can be
as specific as you like (e.g., defining them as an integer type), and neither operations nor data
are compulsory in a class. Some tools do restrict the types you are allowed (e.g., Visio), but
pure UML has no restrictions. Operations can exist in two forms, functions and procedures.
Functions are operations that calculate and return a value (they do not change the state of an
object, i.e., make persistent, visible changes to values stored in fields). Procedures are
operations that change the state of an object (but do not return any values - however, they do
support in/out values). Separating functions from procedures in this way is useful in
simplifying systems, for verification and testing.
An operation in UML specifies the services that a class can provide to its clients, and is
implemented using a method. The full UML syntax for an operation is: visibility
name(param_list) :type {constraint} . The constraint may include pre- and
post- conditions (boolean expressions that determines whether or not an expression can be, or
has been, evaluated). An operation should be considered as an interface implemented by one
or more methods.
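
As a rough illustration, a UML class box for a bounded stack might be rendered in Java along the following lines; the bounded-array design and the member names are assumptions, not part of the notes. Note how the functions only read the state while the procedures only change it.

/** A bounded stack of ints - one possible Java rendering of a UML class box. */
public class Stack {
    // attributes (fields) hold the data associated with each object
    private int[] items;
    private int count;

    public Stack(int capacity) {
        items = new int[capacity];
        count = 0;
    }

    // functions: calculate and return a value without changing the state
    public int top()         { return items[count - 1]; }
    public boolean isEmpty() { return count == 0; }
    public boolean isFull()  { return count == items.length; }

    // procedures: change the state but return nothing
    public void push(int x) { items[count++] = x; }
    public void pop()       { count--; }
}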

Visibility

Each feature of classes may be visible or invisible to other classes (clients). It is
recommended (but not required) to not allow clients to write to/change fields directly. Each
feature in a class can be tagged with visibility permissions (Java/UML).

• private (-) - the feature is inaccessible to clients and children
• public (+) - the feature is completely accessible to clients
• protected (#) - the feature is inaccessible to clients (but accessible to children)

UML also allows you to define language-specific visibility tags.

OO programs are basically a series of feature calls, i.e., accesses to attributes and operation
calls (recall the uniform access principle, which implies that all feature calls are written using a
single notation).

Objects

An OO program declares a number of variables (e.g., attributes, locals, and parameters) and
each variable can refer to (or be attached to) an object. An object is a chunk of memory that
may be attached to a variable which has a class for its type.

Using an object has two steps - declaration of a variable and allocation.

Each object optionally has a name (e.g., an object of type Person has the name Homer), and
multiple instances of a class can be represented as on the right (useful for containers and in
representing collaboration of objects). This however represents declaration and allocation, but
not allocation by itself.
Constructors

A class should contain special features called constructors - these are invoked when an object is
allocated and attached to a variable. Constructors in Java have the same name as the class,
and there may be many of them (overloaded - for example, so an object can be constructed using
different sets of startup data).

Constructors are invoked on allocation. UML has the <<constructor>> stereotype.
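
A small, assumed Java example of overloaded constructors, and of the difference between declaring a variable and allocating an object, might look like this (the Person fields are illustrative):

/** Illustrates overloaded constructors, and declaration vs. allocation. */
public class Person {
    private String name;
    private int age;

    // constructors share the class name and may be overloaded
    public Person(String name, int age) { this.name = name; this.age = age; }
    public Person(String name)          { this(name, 0); }  // different startup data

    public static void main(String[] args) {
        Person p;                        // declaration only - no object yet
        p = new Person("Homer", 39);     // allocation: a constructor is invoked here
        Person q = new Person("Bart");   // declaration and allocation in one step
    }
}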

Class Relationships

Individual classes are boring. Connecting them is what generates interesting emergent
behaviour. Connecting them is also what could cause problems and generate errors, so we
need to be careful and systematic when deciding how to connect classes.

In general, there are only two basic ways to connect a class:

• by client-supplier relationships
• by inheritance relationships

Both of these have specialisations in UML, though.

Client-Supplier Relationships

One class may need to use another class (e.g., to create an object, call a feature, attach to,
include, etc...). We say that the client uses services provided by a supplier. There are many
different client-supplier relationships. There are specific kinds of relationships:

• Has-A - e.g., Board has-a collection of Squares
• Creates - e.g., Factory is responsible for creating instances of motors (example of a type of
design pattern)
• Part-Of - a Cylinder is a part-of a Motor (no sharing of the constituent parts). Some
programming languages support part-of relationships via composite, or expanded, types.

Associations represent relationships between instances of classes and are the basic way to
represent client-supplier in UML. Each association has two association ends, and each
association end can be named with a role. If there is no role, you name an end after a target
class. Association ends can also have multiplicities.

Multiplicities are constraints that indicate the number of instances that participate in a
relationship. The default multiplicity on a role is 0..*. Other multiplicities include:

• 1 - exactly one instance
• * - same as 0..*
• 1..* - at least one
• 0..1 - optional participation (zero or one instance)

You can, in general, specify any number, contiguous sequence n..m, or set of numbers {0, 2,
4} on a role. Note: multiplicities constrain instances.
Responsibilities

An association implies some responsibility for updating and maintaining the relationship. This
is not shown on the diagram, so it can be implemented in several ways, such as a method.
Responsibilities do not imply data structure. If you want to indicate which participant is
responsible for maintaining the relationship, add navigability.

A navigable association indicates which participant is responsible for the relationship, e.g.,

Here, game is responsible for the relationship. In a model of a Java program, this may
indicate that a Game object refers to (points to, or "has-a") a Room object.

The default, "undirected" association really means bidirectional! Directed associations are
typically ready as "has-a" relationships.

Attributes and Associations

An attribute of a class represents state - information that must be recorded for each instance,
but associations imply that one class can access all of another class, even if that is not what is
wanted. At the implementation level, the difference between associations and attributes is
normally very small, so attributes can be considered to be small, simple objects that do not
have their own identity.
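
For illustration only, a navigable Game-to-Room association and a plain attribute might both end up as fields in Java, roughly as below; the collection type and the enter method are assumptions.

import java.util.ArrayList;
import java.util.List;

class Room {
    private String name;            // attribute: a small, simple value with no identity of its own
    Room(String name) { this.name = name; }
}

class Game {
    private Room current;                            // navigable association, multiplicity 1
    private List<Room> visited = new ArrayList<>();  // association with multiplicity 0..*

    void enter(Room r) {            // Game is responsible for maintaining the links
        current = r;
        visited.add(r);
    }
}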

Inheritance

If we recall the open-closed principle, we need a mechanism for re-opening a class to change,
add and remove features. Inheritance allows us to re-open a class once it is closed.

Inheritance is a relationship between a child class and one or more parent classes. The
inheriting class (the child/subclass) inherits features from its parents, and may add its own.

In UML, we call inheritance generalisation. For example, if we have personal and corporate
customers that have similarities and differences, we can place the similarities in a customer
class and the differences in specialisations of customer, as shown below:

All the features of customer are features of its specialisations too. This implies
substitutability.
There are many different types of inheritance. Subtyping inheritance is used when there is a
strong degree of commonality between two or more classes, e.g., between Person and
Employee. An Employee is-a Person, as Employees behave like Persons but also have
specialised behaviour. When such a degree of common behaviour occurs, Employee is said to
be a subtype of Person. Subtyping inheritance captures the is-a relationship between classes.

The inheritance modelling rule is such that, given classes A and B, if you can argue that B is
also an A, then you can make B inherit from A. This is a general rule-of-thumb, but there will
be cases where it does not apply. If you can argue that B is-a A, it's easy to change your
argument so that B has-a A.

There are many different types of inheritance as shown in the tree below.

• Subtype - modelling some subset relation
• Restriction - instances of the child are instances of the parent that satisfy a specific
constraint (e.g., square inherits rectangle)
• Extension - adding new features
• Variation - child redefines features of the parent
• Uneffecting - child abstracts out features of parents from at least one client view
• Reification - partial or complete choice for data structures (e.g., parent is a TABLE, reifying
child classes are HASH_TABLE)
• Structure - parent is a structural property (COMPARABLE), child represents objects with
property

is-a does introduce problems, however. Some phrases that contain the words "is a" are
instantiation phrases, and some are generalisations. Generalisations are transitive,
instantiation is not. So, when using is-a, you should take special care that you do not mean an
instantiation instead of a generalisation relationship.
Class Diagrams

A UML class diagram describes the classes in a system and the static relationships between
them. These static relationships are variants of:

• associations
• generalisations

Class diagrams also depict attributes and operations of a class, and constraints on
connections.

Class diagrams can be used in three different ways:

1. Conceptual modelling, drawn to represent concepts in the problem domain, and often done
during analysis and preliminary design
2. Specification modelling, drawn to represent the interfaces, but not implementations of the
software
3. Implementation modelling, drawn to represent code

The division between these perspectives is "fuzzy".

Conceptual Modelling

• Concept - a real world entity of interest
• Structure - relationships between concepts
• Conceptual modelling - describing (using UML) concepts and structures of the real world

Class diagrams are commonly used for conceptual modelling. They are a way to model the
business domain and capture business rules. They do not depict the structure of software,
however!

An example conceptual model/class diagram might look like:


The conceptual model is used as the basis for generating a specification model. Classes
appearing in the conceptual model may occur in the specification model, they may also be
represented using a set of classes, or they may vanish altogether!

All OO methods hinge on being able to find the concepts and classes - this is much more
critical than discovering actors or scenarios. It is the process of finding concepts that really
drives OO development - use cases are not OO! It is important to note that we are not finding
objects, as there are too many of these to worry about - rather, it is the recurring data
abstractions that need to be found. There are no rules, but we do have some good ideas,
precedents and known pitfalls.

One way of introducing a concept is to use noun and noun phrases from the system
requirements. This isn't very good, but does work as a first pass approximation. Many books
take this superficial approach: "Take your requirements document and highlight all the
verbs and all the nouns. Your nouns will correspond to classes and your verbs to methods of
classes." This method is too simple-minded, however. It suffers from the vagaries of natural-
language description, finds obvious classes but misses hidden ones and often finds totally
useless classes.

When evaluating the usefulness of a class from this method, you must ask yourself the
question: Is this class a separate data abstraction, or are all operations on this class already
covered by operations belonging to other classes, or "is the proposed class relevant to the
system?" - this is answered based on what you want to do to the class.

The noun/verb approach often misses important classes depending on how the requirements are
phrased. Rewriting a requirement can help, but you need to know what you are looking for to
make sure it is in the rewritten requirement.

We need to be able to find interesting classes. An interesting class:

• will have several attributes
• will have several operations and methods (and at least one state-change method, usually)
• will represent a recurring concept, or a recurring interface in the system you are building
• will have a clearly associated data abstraction

Association vs. Generalisation

When planning associations between classes, how does a designer know whether to use
inheritance or client-supplier? The decision can be based on whether or not one is a more
sensible modelling decision than the other, or on performance grounds. Generally, inheritance
incurs a bit more of a performance hit than client-supplier (though this also depends on how
good an optimising compiler you have). To answer this, we need to consider substitution and
dynamic dispatch.

Generalisation becomes particularly useful when used with substitution and polymorphism.

Substitution is assigning one variable to an object of another type, allowing you to replace a
general type with a more specific type. For example, if you had a variable p referring to an
object of type Person and a variable e referring to an object of type Employee, and then did
an assignment of p = e, this is perfectly legitimate, as variables are only references to
objects, not the objects themselves. When a client calls p.f, however, it can only call the features
of Person, as the compiler would treat an attempt to call a method that belongs only to Employee
as an invalid feature call and report an error at compile-time.

Inheritance also introduces problems of method overriding. If you consider a class Person
with a method display(), the display methods for Persons are probably not appropriate
for Employees, since the latter have more and different attributes, so we can override the
version of display() inherited from Person and replace it with a new version for
Employee.

When overriding occurs, the child class merely provides a new implementation of the
method. If the signature matches that of an inherited method, the new version is used
whenever it is called. In UML, there is no special syntax for overridden operations and some
developers omit the operation in the child class if it is overridden, which can lead to confusion.
The {leaf} constraint next to the name of an operation prevents overriding in subclasses.

There are constraints on overriding, however. Rules have to be obeyed in order to maintain
substitutability and static typing in a language. The typical rules on overriding are that
attributes and methods can be overridden by default and the overriding method must type
conform to the inherited method. Additionally, the overriding method must correctness
conform to the original (this is covered later in contracts).

Type conformance is language dependent, and Java implements the exact match rule,
sometimes called "no-variance". Other methods of type conformance are contravariance,
where a type is replaced by a more general type and covariance, where a type is replaced by a
more specific type. Contravariance has not been demonstrated to be very useful, but it is
relatively easy to implement. No variance is also trivial to implement, but it is often too
restrictive for use (e.g., in Java, many casts to Object are made). Covariance is probably the
most useful, but it can require system-level validity checks, which are very expensive. In
UML, type conformance is a point of semantic variation.

Dynamic dispatch is very useful and is used to invoke methods applicable to the dynamic
type (the type of object attached to a variable) of an object. During execution, a variable can
be used to refer to objects of its declared (static) type or of any compatible type, e.g., children
and their descendants.
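
A hedged Java sketch of the Person/Employee discussion - substitution, an overridden display(), and dynamic dispatch picking the version for the dynamic type - might look like this (the field values are invented for the example):

class Person {
    protected String name = "Homer";
    void display() { System.out.println("Person " + name); }
}

class Employee extends Person {
    private String employer = "SNPP";
    @Override
    void display() { System.out.println("Employee " + name + " of " + employer); }
    void pay() { /* a feature that only Employee has */ }
}

public class DispatchDemo {
    public static void main(String[] args) {
        Employee e = new Employee();
        Person p = e;        // substitution: legal, p is only a reference
        p.display();         // dynamic dispatch: the Employee version runs
        // p.pay();          // compile-time error: the static type Person has no pay()
    }
}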

The basic rule of association vs. generalisation is simple: a directed relationship represents
"has-a" and a generalisation represents "is-a", however it is still sometimes difficult to decide
which one to use. We can take the point of view that when the "is-a" view is legitimate, we
can take the "has-a" view as well, however the inverse of this is not normally true.

Rule of Change

Associations permit change, whilst generalisations do not.

Do not use generalisation for a perceived is-a relationship if the corresponding object
components may have to be changed at run-time.
Polymorphism Rule

Very simple: if we want to use polymorphism and dynamic binding, then we use
generalisation.

Generalisation is appropriate for representing is-a relationships if references of a more
general type may need to become attached to objects (data structure components) of a more
specialised type.

Most people use UML class diagrams as described above, and they are the most useful type.
They can be applied with a number of different programming languages. The more advanced
concepts covered below are more challenging to use with specific programming languages,
but the concepts are still useful.

Packages

Packages can be used to group any collection of modelling elements (classes, objects, other
packages, etc...). Relationships between packages can be expressed in terms of dependencies;
a dependency exists between two elements if changes to one element (the supplier) may
cause changes to the second element (the client). There are many types of dependencies in
UML (association and generalisation are forms of dependencies).

Stereotypes

Stereotypes are the standard lightweight mechanism for extending UML. If you need a
modelling construct that isn't in UML, but which is similar to something that is, then a
stereotype should be used. A stereotype is a textual annotation of diagramming elements.
Many standard stereotypes exist, and you can define your own.

e.g., a UML interface:

Some built in stereotypes include:

• <<access>>: the public contents of a target package are accessible to the source package
namespace
• <<create>>: a feature which creates an instance of the attached classifier
• <<friend>>: the source has access to the target of a dependency
• <<instantiate>>: the source classifier creates instances of the target classifier
• <<invariant>>: a constraint that must hold for the attached classifiers/relationships

Translating a stereotyped class into a programming language is a challenge. Not all
stereotypes will be immediately translatable to all languages, e.g., <<friend>> in Java is
difficult, but is easy in C++. A library of "patterns" is often needed. Tool support is also
required for bespoke stereotypes, e.g., how will you make Visio support an <<RFID>>
stereotype, not only syntactically, but also for code generation.

Aggregation and Composition

Associations represent general relationships between classes and objects. At the
implementation level, they can be defined in terms of reference types. Further relationships
are also provided with UML:

• aggregation: "part-of"
• composition: like aggregation, but without sharing

This aggregation relationship shows that an instance of Style is part-of zero or more instances
of Circle. Style instances may also be shared by many circles. This is semantically fuzzy, as
there is no implicit difference between this and association, and it can't be implemented in
Java. It is advisable to avoid this unless you formally specify its meaning.

This composition relationship shows that a motor is composed of one or more Cylinders, and
Cylinders are "integral parts" of Motors. The part objects (here, Cylinders) belong to only one
whole and the parts live and die with the whole. These are sometimes called "value types" or
"expanded types", such as in Eiffel.

Association should be used whenever you are in doubt over which relationship to use, and
association can always be refined to more specific forms of relationships between modelling
elements. Associations with 1..1 multiplicity can be considered equivalent to compositions
(since they support cascading deletes too).

Interfaces and Abstract Classes

A pure interface provides no implementation; only operations are declared. An abstract class
may have some implemented methods and fields, but not everything needs to be
implemented.
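
For example (an illustrative sketch, not taken from the notes), the distinction might look like this in Java:

// A pure interface: only operation declarations, no implementation.
interface Shape {
    double area();
}

// An abstract class: some features are implemented, others left abstract.
abstract class NamedShape implements Shape {
    private final String name;                        // implemented field
    protected NamedShape(String name) { this.name = name; }
    public String describe() { return name + ", area " + area(); }  // implemented method
    // area() remains abstract - concrete subclasses must supply it
}

class Circle extends NamedShape {
    private final double radius;
    Circle(double radius) { super("circle"); this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}
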
Association Classes

Association classes can be used to add attributes, operations and constraints to associations.

In the case above, an attribute could be added to Person showing the start date with a
company, but this is really an attribute of the relationship between Person and Company.

The association class implicitly includes the constraint that there is only one instance of the
association class between any two participating Person and Company objects. Otherwise, this
must be stated explicitly.

Parameterised Classes

The notion of parameterised classes is present in Java, C++ and Eiffel. They let you define
collections of an arbitrary type, e.g., Set(T), Sequence(T), Binary_Tree(T),
where T is a type parameter that must be filled in order to produce a type that can be used to
instantiate objects.
<<bind>> is a stereotype on the dependency and it indicates that Employee_Set will conform
to the interface of Set. You can not add features to the bound element.
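
In Java, parameterised classes correspond to generics, and binding the type parameter plays the role of the <<bind>> dependency. A minimal sketch, with an assumed Employee stand-in class (Java 16+ records for brevity):

import java.util.HashSet;
import java.util.Set;

public class BindExample {
    record Employee(String name) {}       // stand-in class for the example

    public static void main(String[] args) {
        // Set<T> is the parameterised class; binding T to Employee gives a usable
        // type, roughly what the <<bind>> dependency to Employee_Set expresses.
        Set<Employee> employees = new HashSet<>();
        employees.add(new Employee("Homer"));
        System.out.println(employees.size());   // 1
    }
}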

So far we have only considered how to model static system aspects. UML provides facilities
for modelling behaviour, which in general is by considering objects and events, messages and
reactions to events.

System Events and Operations

Systems are used by actors (e.g., users and other systems) and are built to respond to events -
external stimuli generated by actors. When delineating the system borderline, it is often
useful to find out which events are of interest to a system.

Operations are executed in response to a system event. The system has control over how it
responds to events, and operation execution in UML is usually represented via messages.

UML provides diagramming notations for depicting events and responses - communication
diagrams.

Communication Diagrams

Communication diagrams describe how groups of objects collaborate in some behaviour.
Typically, a communication diagram captures the behaviour of a single scenario of use. It
shows the number of objects and the messages that are passed among these objects within the
scenario.

There are two main types of communication diagrams: sequence diagrams and collaboration
diagrams.
Sequence Diagrams

In sequence diagrams, objects are shown as boxes at the top of dashed vertical lifelines
(actors can also be shown). Messages between objects are arrows, and self-calls are
permitted. Conditions (guards) are used, as are iteration markers.

To show when an object is active, an activation box is drawn and the sender is blocked.
Return from the call can be shown as well, but it usually clutters the diagram and confuses
things if you do.

An external view of a system can also be represented using a sequence diagram.


Sequence diagrams are traditionally constructed after system services and scenarios of use
have been determined. They are good at showing collaborations among objects, not a precise
definition of the behaviour. Statecharts are better suited to the behaviour of a single object.

Sequence diagrams are built by identifying events - are they generated by an actor, or the
system itself? The focus is on capturing the intent rather than the physical effect (i.e., don't
use them to flowchart).

Collaboration Diagrams

Semantically, they are equivalent to sequence diagrams. Objects here are shown as icons, and
they can be placed anywhere on the page/screen. The sequence of message firing is shown by
the number of the messages. It is easier to depict object links and layout with collaboration
diagrams and they are also a lot more compact. Sequence diagrams show sequences better,
however.

In collaboration diagrams, different numbering schemes for sequences are permitted:

• Whole sequence numbers (as shown above) are the simplest
• Decimal number sequences (e.g., 1.1, 1.2, 1.2.1, 1.2.2) can be used to show which operation
calls another operation

Control information (guards and assignments) can also be shown in collaboration diagrams as
with sequence diagrams.

Object Diagrams

A collaboration diagram without messages is also known as an object diagram, and the
relationships between objects are called links. An object diagram must be a valid instantiation
of a static class diagram. Objects must have classes and links between objects must be
instances of associations between classes. This can be used as a quick consistency check.

State Charts

Class diagrams and packages describe static structure of a system. Interaction diagrams
describe the behaviour of a collaboration. However, we have not described the behaviour of a
single object when it reacts to messages - this can be done using OCL (covered later) or
statecharts. State charts describe all possible states that an object can get into, and how the
object responds to events.
The syntax for a transition in a state chart is: Event [Guard] / Action. Actions are
associated with transitions and are short, uninterruptible processes. Activities are associated
with states, and may be interrupted. A guarded transition occurs only if the condition
evaluates to true; only one transition can be taken. When in a state with an event, a wait takes
place until the event occurs.
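
One simple (and assumed, not prescribed) way to render such a statechart in code is an enum of states with a switch on incoming events, where Event [Guard] / Action becomes "if the guard holds, perform the action and move to the target state":

/** A toy statechart for a door, transitions written as Event [Guard] / Action. */
public class Door {
    enum State { CLOSED, OPEN }
    enum Event { OPEN_DOOR, CLOSE_DOOR }

    private State state = State.CLOSED;
    private boolean locked = false;                   // used as a guard

    public void handle(Event e) {
        switch (state) {
            case CLOSED:
                if (e == Event.OPEN_DOOR && !locked) {    // openDoor [not locked]
                    System.out.println("creak");          // / action
                    state = State.OPEN;                   // move to the OPEN state
                }
                break;
            case OPEN:
                if (e == Event.CLOSE_DOOR) {
                    state = State.CLOSED;
                }
                break;
        }
    }
}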

Statecharts are good at describing the behaviour of an object across several scenarios of use.
They are not good at describing behaviour that involves a number of collaborating objects
(interaction diagrams should be used instead for this).

It is not usually worthwhile to draw a statechart for every class in the system, so they should
only be used for classes that exhibit interesting behaviour, e.g., UI and control objects.

Common Criticisms of UML

• Not writing code immediately often draws criticism of wasting time, however a short
amount of time spent on modelling may save a great deal of time coding (See agile methods
and extreme programming earlier on).
• UML is very complex (the syntax guide is 200 pages long). This is true, so most users only
focus on a selection of the most useful diagrams.
• UML can easily produce inconsistent models (i.e., multiple views of the same system).
Research and tool support is hoping to improve this.
Standards

UML is a de facto standard, that is, one produced by industry that aims to be practical and
usable, but technically inferior to official standards. Standards are documented agreements
containing technical specifications or other precise criteria. They are to be used consistently
as rules, guidelines or definitions of characteristics, to ensure that products are fit for their
purpose.

Other types of standards include de jure, which are official standards from a governing body
(e.g., ISO, IET, ANSI, BSI) and are developed under consultation with stakeholders. They
are usually slow to appear (often past the point where they would be useful) and because they
represent compromises among different views, they are often hard to apply. A final type of
standard is an in-house one, which only applies to a particular company.

Use Cases

Use cases are commonly described as telling a story of how a user carries out a task to
achieve a desired goal (functionality). A use case is a document that describes a sequence of
steps of an actor using a system to complete a scenario. An actor is external to the system
(i.e., it is a human operator or another system) and a scenario describes a complete sequence
of events, actions and transactions required to produce or complete something of value.

Use cases are all about what an actor does, not how they or the system accomplishes it.

Actors can be broken down into primary actors, with goals, and secondary actors, with no
goals, but interests, such as managers in a retail environment.

Use case diagrams are used to illustrate the relationship between a set of use cases and their
actors. Their purpose is to allow rapid understanding of how external actors interact with the
system, however, the diagrams are not the important thing.

If one use case initiates or duplicates the behaviour of another use case, it is said to 'use' the
second use case, and this is shown as:
If one use case alters the behaviour of another use case, it is said to extend the second use
case, and this is shown as:

If one case is similar to another, but does a little bit more, then generalisation can be applied:

The difference between <<extends>> and generalisation is that with <<extends>>, the base
use case must specify extension points.

Some guidelines to think about use case relationships are that it's usually easier to think about
normal cases first and worry about variations afterwards. <<uses>> (sometimes called
<<include>>) should be used when you are repeating yourself in two separate use cases.
Generalisation should be used when you are describing a variation on a behaviour that you
want to capture informally. <<extends>> should be used when you more accurately want to
capture a variation on behaviour.

Use cases can interact with any number of actors, and use cases are discovered by first
identifying the actors. For each actor, you need to consider the scenarios that they might
initiate (trigger). A common error that is made is representing individual system operations with a
use case (e.g., create transaction, destroy record, etc...). Use cases represent services that may
be implemented by multiple operations (i.e., a transaction) in a system. They are usually
relatively large processes.

Use cases should also be created to deal with exceptions (e.g., in a retail environment and the
card authorisation fails), which should be encapsulated in so called secondary scenarios.
Alternatives also need to be considered in these secondary scenarios.

An Example Use Case Template

• Name: << An active verb phrase, describing the goal >>
• Context: << A longer description of the goal >>
• Actors:
o Primary Actors: << The Actor with the goal >>
o Supporting Actors: << Those with interests to be protected >>
• Preconditions: << Required state of world prior to Use Case >>
• Trigger: << What starts the Use Case, often Primary Actor >>
• Main Success Scenarios: << Steps >>
• Success Postconditions: << Holds for successful exit and achieves goal >>
• Secondary Scenarios: << Steps and postconditions >>

Use cases are good for modelling current work practices, describing transactional style
systems (but are probably not the first choice) and uncovering business rules and error cases.
They should be used with care, and scenarios should be kept free from design assumptions.

The Development Process


The UML provides a standard language for describing OO systems; it does not describe a
standard development process. Reasons for this include a need to separate concerns and
recognising that different development processes will be used in different problem domains
by different people.

It is possible to identify a fairly typical view of best practices, basic activities and models,
based on an iterative, use case driven development process. RUP (discussed earlier) is an
example of this.

Iterative, Use Case Driven Development

An iterative life-cycle is based on successive enlargement and refinement of a system through
multiple development cycles of analysis, design, implementation and testing. Each cycle
tackles a relatively small set of requirements, proceeding through the cycle. The system
tackles a relatively small set of requirements, proceeding through the cycle. The system
grows incrementally as each cycle is completed, in contrast to the classic waterfall model. In
iterative development, we have the advantages of a reduced complexity and the generation of
early feedback.

The basic (rational) process has 6 stages:

1. Use case modelling
2. Each use case (scenario) is refined and explained using interaction diagrams
3. In parallel, class diagrams are produced
4. Complex interactions from step 2 are refined using collaboration diagrams and statecharts
5. Specification class diagram is produced
6. Code is generated

Design by Contract
RUP, or use case driven development, helps us identify features that stakeholders want,
identify objects and collaborations that implement these features, identify classes that provide
the structure that enables collaboration and identify operations (the building blocks of
collaborations). However, it does not help us describe the behaviour of operations.

Assertions

An assertion is a boolean expression or predicate that evaluates to true or false in every state.
In the context of a program, assertions express constraints on the program state that must be
true at a specified point during execution. In a model/diagram, they document what must be
true of an implementation of a modelling element. Assertions are typically associated with
methods, classes and even individual program statements and are useful for helping to write
correct software, since they specify what is expected behaviour, as well as for documentation
of interfaces and debugging and to improve fault tolerance.

Pre- and postconditions are assertions associated with methods of a class. Preconditions are
properties that must be true when the method is called and postconditions are properties that
must be true when the method returns safely. They can both be optional (in which case the
assertion is just True), and in this case, an optional precondition implies that "there are no
constraints on calling this method", and an optional postcondition implies that "the method
body can do anything, as long as it terminates".

Pre- and postconditions can be viewed as a contract that binds a method and its
callers/clients: "If you (the client) promise to call me (the method) with the precondition
satisfied, then I guarantee to deliver a final state in which the postcondition holds." The caller
thus knows nothing about how the final state is produced (abstraction), only that they can
depend on delivery. The method and its implementer need to only worry about cases where
the precondition is true (and none other).

However, if a client calls a method when the precondition is false, then this precondition
violation is the client's fault. This is extremely helpful in tracking down errors, as if a
precondition is failed, the supplier code does not need to be analysed. Formally, the contract
says nothing about what should be done if a precondition fails, so any behaviour (infinite
loop, exception handler, return error message) is acceptable.

Contracts are specifications of what a method should do under particular conditions. So,
given a contract, you can write an implementation that satisfies it. This form of
documentation is more precise than just class interfaces. Run-time checking for correctness
can be implemented, and they also provide a basis for testing and formal proofs.
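
As a small illustration, a contract can be approximated in plain Java with assert statements (enabled at run time with java -ea); the Account example and its conditions are invented for the sketch, and the @pre/@post comments are informal documentation rather than any tool's syntax:

/** Design-by-contract style checks written with Java's assert statement. */
public class Account {
    private int balance = 0;

    /**
     * Withdraw money.
     * @pre  amount > 0 and amount <= balance
     * @post balance = old balance - amount
     */
    public void withdraw(int amount) {
        assert amount > 0 && amount <= balance : "precondition violated (client's fault)";
        int oldBalance = balance;          // remember the state for the postcondition
        balance = balance - amount;
        assert balance == oldBalance - amount : "postcondition violated (supplier's fault)";
    }
}
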
In addition to pre- and postconditions, we also have class invariants, which are global
properties, preserved by "all" methods and instances of the class. Despite its name, however,
the invariant does not always need to be true. The constructors must leave the object in a
legal state satisfying the invariant; they can not assume the invariant is automatically
established. Any legal call made by a client must start from a state satisfying the invariant and
must end up in such a state that also satisfies the invariant. Private methods can do what they
like, but if they terminate in a state where the invariant is false, then clients can not use the
object. This is sometimes too inflexible, however.

The effect of an assertion at run time should be under the control of the developer. Checking
assertions takes time (especially invariants and postconditions), but in testing and debugging,
these assertions are very important and helpful. For production releases, assertion checking
may want to be turned off to improve performance.

We need to consider what happens to assertions with inheritance, however. In general,
assertions are inherited with the features or classes to which they apply, and the invariants are
and-ed together. However, if two contradictory assertions are inherited, this causes a
contradiction and the object can never be in a state that satisfies its invariant.

Child classes can also override method implementations from a parent, but what if the parent
method has a contract that is inherited? There are two alternatives, which is to replace the
inherited implementation, but keep the contract (e.g., to provide a more efficient
implementation, or an implementation that does more than the original), or to modify the
contract and the implementation (because the contract may not say exactly what you want,
but complications may arise with things such as substitution).

Contracts can not be invalidated or broken by overriding, otherwise clients can not rely on the
methods' results. You can do at least what the original contract could do, but you can also do
more. This leads us to the rules for assertion and overriding:

1. The inherited contract can not be broken/contradicted
2. The inherited contract may be kept unchanged; this guarantees that it is not broken (used
for changing the method body/implementation details)
3. The @pre may be replaced by a weaker one
4. The @post condition may be replaced by a stronger one

Rules 3/4 imply that if you want to change the contract under inheritance, you can replace it
with a subcontract, where every behaviour of the new contract satisfies the original.

To use subcontracting effectively, a compiler has to check that preconditions are weakened
and that postconditions are strengthened. This is inefficient for real programs. An efficient
implementation is to use a low-tech language convention based on the observations that for
assertions α, β, γ: α implies (α ∨ γ) and (β ∧ γ) implies β, i.e., accept weaker preconditions
only in the form (α ∨ γ) and accept stronger postconditions only in the form (β ∧ γ).

In languages such as Java with iContract or Eiffel, only new clauses are specified in the
overridden method, and new @pre clauses are automatically or-ed with any inherited
precondition clauses and any new @post clauses are automatically and-ed with any inherited
postcondition clauses. The overridden contract is automatically a subcontract in this case, and
no theorem proving needs to be done.
UML is primarily a graphical notation, and uses text for labels and names. Text is also used
for writing constraints, a restriction on state values. Types are a form of constraints, and
constraints are how design by contract is implemented in UML. Constraints are also useful
for resolving limitations in UML's expressiveness.

UML supports an informal notion of constraint called a note, which can hold any piece of
information and is expressed as a string attached to a model element. There are no restrictions
as to what can go in a note, but to write formal (machine-checkable) constraints, we use the
standard constraint language OCL.

OCL

OCL, or the Object Constraint Language, is a constraint/assertion language for UML. It is
used for writing general constraints on models and also for design-by-contract. It can be
applied to any modelling element, not just classes.

OCL has some issues, however. It has a programming-language-like syntax (similar to C++)
intended to be easy to use by non-formalists, but there are no formal semantics, it is
ambiguous and in places it can be cumbersome. OCL is used in the UML metamodel.

The essential capabilities of OCL are the context, specifying which model element is to be
constrained; navigation expressions, navigating through models to identify objects that are
relevant to a constraint, which can be used to constrain values of attributes and related
objects; and expressions, which are asserting properties about relationships between objects.

For examples of navigation expressions, see the lecture slides.

Navigation expressions provide the means for referring to objects that are linked to a
specified context object. Linked objects are obtained starting from the context object. Links
are followed to gain access to other objects of interest (similar to references in
Java/C++/Eiffel), but complexity arises when collections (e.g., sets, sequences, bags, etc) are
navigated.

OCL supports a number of basic types and typical operations upon them, e.g., Boolean,
Integer, Real and String. Collection, Set, Bag, Sequence and Tuple are basic types as well and
iterative operations and operators exist for making use of these. Every class appearing in a
UML model can also be used as an OCL type. Type conformance rules are as in UML, i.e., types
conform to supertypes, e.g., Set, Bag and Sequence conform to Collection.

Predefined operations available on all objects include:

• oclIsTypeOf(t : OclType) - true if self and t are of the same type.


• oclInState(s : OclState) - true if self is in the state specified by s. s is the name of some state
in the statechart for the class.
• oclIsNew() - true if used in a postcondition and the object is created by the operation.
The logic for OCL is actually three-valued, as an expression can evaluate to true, false or
undefined, e.g., illegal type conversions return undefined, as does taking the first() element of
an empty sequence.

For truth tables, we consider that an expression is undefined if one of its arguments is
undefined, except:

• true OR anything is true


• false AND anything is false
• false IMPLIES anything is true

If a navigation expression denotes objects retrieved by following links, then depending on
the multiplicities of the association, the number of objects retrieved may vary. When a
navigation expression can return more than one object, it returns a collection (a set or a bag) - iterated
traversals provide bags, single traversals provide sets. As of UML 2.0, collections can be
nested.

As navigations can be composed, more complicated paths through a diagram can be denoted.
These are evaluated in a step-by-step manner. For example, for self.department.staff,
self.department gives a set of departments, and then .staff is then applied to each element of
the set, producing a bag of people.

Operations and attributes defined for classes in a model can be used in OCL expressions, and
collections come with some built in operations, accessible using →, typically producing bags.
e.g., sum(), which is applicable to numerical collections, and asSet(), which converts bags to
sets, removing duplicates.

select() is also a built-in operation for picking out specific objects from larger collections
(a quantifier), e.g., to pick out all employees with a salary greater than £50,000:
context Company inv: employees→select(p:Person|p.contract.grade.salary>50000)
Further navigations can be defined on the result of the select.

Similarly, the collect() operation takes an expression and returns a bag containing all
values of that expression, e.g., to return the age of all employees in the department:
context Department inv: staff→collect(p:Person|p.age())
You need to avoid dropping the name and type of bound variables, as this can easily lead to
ambiguity - the OCL guide is quite careless on this.

Other iterative constraints defined on a collection include forAll() (returns true if every
member of the collection satisfies the boolean expression), exists() (true if there is at least one
member of the collection satisfying the boolean expression) and allInstances() (returns all
instances of a type), e.g.:
Grade.allInstances→forAll(g:Grade|g.salary>20000)
allInstances() should be used very carefully, as it is not always necessary to use it, and it is
often difficult to implement.

OCL can also be used to write arbitrarily complex constraints, which can correspond to class
invariants.
The usual laws of design by contract should be followed with UML and OCL (preconditions
weakened - or; postconditions strengthened - and; new clauses and'ed to the invariant). OCL
does not require that you follow these rules, but it is strongly recommended. Modified
contracts must maintain consistency.

Generalisation relationships in OCL are not navigable, and they do not usually feature in
writing constraints, but generalisations may be constrained.

Testing
Typical development projects include a substantial amount of time and money spent on
testing. On average, this is 40-60% of the overall project cost. It is unrealistic to expect to get
everything right from the start, as there is not enough time to get the requirements perfect,
clients can change their minds and there might be extremely strict dependability
requirements, so a testing phase will always be required to show that the project meets its
requirements.

Testing has come a long way since the birth of software engineering. The first view
of testing was a "novice" view from the 1970s, where testing is basically debugging. This is a
"happy fantasy land with elves, pixies and gremlins". This developed into an increasingly
mature view in the 1980s, where you try to show that software does not work (a shift
from testing whether software does work to proving that it does not - try to break the
software and check it handles errors correctly), and in the 1990s, where testing reduces the
risk of not meeting requirements or quality targets, including non-functional requirements. This
evolved into the mature view taken today, where testing and quality is a state of mind. Code
tests are not the only testing done to check correctness - questions such as "did you build the
right system" and "did you build the system right" are the ones that are asked. Testing and
correction is done at all stages at the lifecycle.

Verification is checking that the system has been built correctly, such as verifying against the
metamodel for UML or the language specification for code. Validation is checking that the
right system has been built and it is what you want - assertions and pre- and postconditions
are used to do this. Verification can be done by the development team, but validation should
be done by people outside the team, as it is hard to be objective when you were involved in
development.

Verification of the model should be done at each level of abstraction and validation should be
done between each level - this checks that the system meets current requirements.

Certification is checking against a "benchmark", and it is done in situ over time, e.g., moving
from beta to release candidate to final, etc... Certification shows that the system performs
adequately when put into action. Validation, verification and certification can be carried out
by systematic, planned testing and argument.

Validation, verification, testing and certification are all part of the quality assurance phase of
a project. QA consists of:

• a definition of what is considered "acceptable quality" for the product being considered
• a definition of the QA process or processes that will be followed to ensure quality (e.g.,
reviews)
• application evidence (e.g., test data, the results of reviews, expert assessments of quality)
• analysis of the evidence (i.e., how well did we meet our goals)

The principle of ALARP - as low as reasonably practicable - is used in QA. This is making sure
that everything is as safe as reasonably possible. For consumer software, low might not be
that low, but for safety-critical systems, it probably will be very low.

Testing is one of the key techniques in QA, and by ensuring that you are aware of the current
version of requirements, VVC becomes a lot easier.

Preparing for testing all starts with the requirements, which are used to start deriving tests.
These tests need to be relatively stable. Certification is done with acceptance tests, to
demonstrate that the "adequacy" requirements are met. Validation is performed by testing the
system before installation, to show that the design has been met. Verification is performed by
unit and module testing to demonstrate that code meets the design.

V Model

The V model is a development model where a testing mentality is demonstrated at all stages
of the design lifecycle.
A unit is a component of interest in your system, e.g., a class, a package, function, etc... Unit
tests usually consist of a few hundred lines of code aimed towards breaking the design, and
these are devised as part of the design process. Development of unit tests is an iterative
process: as new elements of the design are added and improved, the unit tests must be re-run
and errors corrected.

Integration testing is testing that when we "glue" together these units, then they behave
sensibly. It makes the assumption that we have successfully unit tested the whole system.
Integration testing tests large components against their specifications. It is designed to check
that all units are used and all input errors are properly handled, and that all specified
functionality is provided and it all works properly together.

Unit testing can typically be done within a team, but integration testing may involve code or
designs from another team. The precise interfaces of units need to be clearly specified. Often,
specialised testing teams are employed to deal with integration testing so as to avoid team
conflicts (and also to speed up the process).
For acceptance tests, the entire system (including hardware) is tested against the
requirements. Acceptance tests consider the system as a whole, as well as performance,
security, configurability, start-up and recovery issues (i.e., the non-functional requirements).
Tests derived from the specification are executed against the system in the client's
environment. The certification process often requires "correct" operation over a predefined
period.

Test-Driven Development

In its purest form, this is where you write test drivers and test cases (i.e., test data and
expected results) before you write the code, although obviously some modelling and the
interface specification will need to be done.

In general, whenever you make a change to your system, rerun your tests before you check
the changed code into the repository. If the system fails to build at the end of the day (the
system should be built once a day), it will be easy to identify the culprits.

Testing on a class can be done using pre- and postconditions and invariants. Preconditions on
the constructor should be implemented as invariants (as it is impossible to check an attribute
on an object before it exists and invariants are checked after the object is created). It is
important to check when you are writing assertions that you are writing the correct thing, and
testing for tautologies and fallacies is a good way of checking this.

When writing test cases, exception handlers are used to handle errors generated by calling a
method, so the test case can return true if the test case passes. When dealing with contracts,
there are general rules to follow when exceptions are raised:

• Run-time assertion violations are manifestations of bugs in the program (even if the
violation is due to environmental issues outside of your control, you still have to anticipate
and deal with them).
• A precondition failure means that the client is at fault, and that is where the bug lies
• A postcondition failure is the suppliers' fault, and that is where the bug lies

However, the mere fact that a test passes does not mean that the software is correct, as an
analysis error when designing the test case could have occurred, so an incorrect
implementation and the wrong test can collaborate (especially if the wrong assumption or
interpretation is made of the requirements) to make the test result "correct".
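
A minimal sketch of such a test case (hand-rolled rather than written with a framework such as
JUnit, and exercising the hypothetical BoundedCounter class sketched earlier) uses the exception
handler to turn a contract violation into a boolean result:

public class IncrementTest {
    public static void main(String[] args) {
        // Run with the -ea flag so that the assertions are actually checked.
        System.out.println(testIncrement() ? "PASS" : "FAIL");
    }

    static boolean testIncrement() {
        try {
            BoundedCounter c = new BoundedCounter(3);  // satisfies the constructor precondition
            c.increment();
            c.increment();
            return true;                               // no assertion fired: the test passes
        } catch (AssertionError e) {
            return false;                              // a pre/postcondition or invariant was violated
        }
    }
}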

One of the key goals of software engineering is to make bug free code, however sometimes
intentional bugs are introduced, e.g., Excel replicated a bug in leap year handling from Lotus
1-2-3, so it could use the same date serial number scheme as Lotus 1-2-3 and increase
compatibility.

Testing aims to guarantee that a system is fit for a specific purpose. It must be effective and
also cost-effective and thus testing requires proper planning. We must anticipate a certain
amount of testing, but it is often desirable to minimise costs - it is relatively easy to fix
mistakes early on in development, but less so as development continues. Organisations must
decide how much it is worth to fix faults.
Research has shown that it is impossible to fully test a system, or fully prove that it's correct
("Testing can never prove the absence of errors, only their presence" - Dijkstra), so research
now concentrates on testing/proof strategies that give the greatest confidence for the least
effort. The testing process must be systematic and documented which will help in future
planning and estimation.

A common cycle of testing looks like:

There are two main types of testing that can be performed: static testing (models and code are
analysed for correct construction - no code is actually executed, e.g., for rules such as no null-
pointer dereferencing); and, dynamic testing (executes the code against sample data, the
results of which are compared to predictions from models or code). Code is typically
analysed in advance to acquire suitable test inputs and expected outputs.

Path analysis is a commonly used technique applied late in design or during implementation.
A program defines a set of paths through the code based on its control structure or in terms of
how variables are used, e.g., every variable in a program must be introduced, initialised,
destroyed and used. The points in a program where these events occur are marked and then a
tool is used to check that each event occurs the right number of times for each variable. This
is usually implemented using some sort of graph structure.

Paths can also be used to analyse control structure, as paths represent loops, sequencing,
choice, etc... Some things that can be checked for are dead code (i.e., all segments of a
program can be reached) and that loop conditions allow termination.

"Rational" testing is dynamic testing based on test cases, but this clearly can not test all
possible paths. Domain testing aims at using models to reduce the number of tests needed
(e.g., look at the attribute types in UML classes and extract extreme values, or look at guards
in sequence diagrams or state charts to extract values for exercising all branches).

Statistical testing, which is based on usage models (i.e., what does a typical user/archetype do
with the system), also has uses. Test cases are generated from usage models focussing on
critical or problematic scenarios. Random tests are also included to attempt to exercise code
that is most often accessed.

Testing must also be able to deal with changes to requirements, which spawn changes to
models and code. Regression testing controls the amount of re-testing carried out in response
to change. The original system tests are saved and documented. Changes are traced through
to the tests and changed results predicted. Tests should then be re-run, unchanged parts
should give the original results, but changed parts should give the predicted results.

Testing and recording of results needs to be systematic, and this is necessary for estimation
for future projects and maintenance. This is made easier with proper planning and recording
of testing results. For validation, it is vital to number requirements and cross-reference design
decisions to those requirements. Verification, for comparison, is exercised against programs.
It must be defined early on what is an acceptable level of testing, and how much detail to
record.

Acceptability may not require 100% success, but there needs to be a record of which tests
passed and which tests failed. Justifications must be made for failures that were condoned,
and any modifications to the system must be fully documented.

By defining early on the goals for success, this helps you know when you have succeeded,
even though sometimes the goal may seem like a moving target. Not all projects will succeed
by building a complete system, and not all projects will succeed by writing a formal
specification and refining it into C.
PART B.
CHAPTER 5:

Networks and Distributed Systems


The Rise of Networks
Until the 1980s, computer systems were large and expensive, and there was no meaningful
way to link them together. The 1980s, however, saw two important developments - powerful
microprocessors and higher-speed networks (LANs/WANs now up and over a gigabit in
speed). More recent developments, including current research, are scalable cluster-based
computers (e.g., Beowulf) and peer-to-peer networks (e.g., BitTorrent, SETI@Home, etc). The result is
networked computers.

It is now feasible to design and implement computing systems with large numbers of
networked computers. Advantages to this include shared resources, such as hardware (file
stores, printers, CPU, etc) and software (files, databases, etc), to speed up computation
(partitioned computations can have parts executing concurrently within a system), reliability
(if one site fails, others may be able to continue) and easier communication.

Network Protocols

Separate machines are connected via a physical network, and computers pass messages to
each other using an appropriate protocol. Networking operating system services interface to
the network via the kernel and contain complex algorithms to deal with transparency,
scheduling, security, fault tolerance, etc. They provide peer-to-peer communication services,
and user applications access the services directly, or via the middleware.
Middleware is a higher level of OS services. A protocol is a "well-known set of rules and formats
to be used for communications between processes in order to perform a given task". Protocols
specify the sequence and format of messages to be exchanged.

We can consider a protocol that allows the receiver to request a re-transmission across a data-
link (on message error):

This protocol is inefficient due to a cross over of data and control messages. Steps 6 and 7 are
not needed.

Protocol Layers

Protocol layers exist to reduce design complexity and improve portability and support for
change. Networks are organised as series of layers or levels each built on the one below. The
purpose of each layer is to offer services required by higher levels and to shield higher layers
from the implementation details of lower layers.

Each layer has an associated protocol, which has two interfaces: a service interface -
operations on this protocol, callable by the level above and a peer-to-peer interface, messages
exchanged with the peer at the same level.

No data is directly transferred from layer n on one machine to layer n on another machine.
Data and control pass from higher to lower layers, across the physical medium and then back
up the network stack on the other machine.

As a message is passed down the stack, each layer adds a header (and possibly a tail such as a
checksum, although this is most common at the bottom of the stack) and passes it to the next
layer. As a message is received up the stack, the headers are removed and the message routed
accordingly.
As you move down the stack, layers may have maximum packet sizes, so some layers may
have to split up the packet and add a header on each packet before passing each individual
packet to the lower layers.

Some protocol design issues include identification (multiple applications are generating many
messages and each layer needs to be able to uniquely identify the sender and intended
recipient of each message), data transfer alternatives (simplex, half and full duplex), error
control (EDC - error detection and correction - which depends on the environment, e.g., SNR, and
application requirements), message order preservation and swamping of a slow receiver by a
fast sender (babbling idiot problem, when a failed node swamps a databus with nonsense).

We can also consider two types of protocols, connection vs. connectionless. Connection-
orientated services are where a connection is created and then messages are received in the
order they are sent; a real-world analogy is the POTS. Connectionless services are where data
is sent independently. It is dispatched before the route is known and it may not arrive in the
order sent; a real-world analogy here is the postal system.

OSI Reference Model

The ISO Open Systems Interconnection (OSI) reference model deals with connecting open
networked systems. The key principles of the OSI model are:

• layers are created when different levels of abstraction are needed


• the layer should perform a well defined function
• the layer function should be chosen with international standards in mind
• layer boundaries should be chosen to minimise information flow between layers
• the number of layers should be: large enough to prevent distinct functions being thrown
together, but small enough to prevent the whole architecture becoming unwieldy

The layers, with example protocols at each level, are:

• Application (APP) - protocols that are designed to meet the communication requirements of
specific applications, often defining the interface to a service. Examples: HTTP, FTP, SMTP,
CORBA IIOP.
• Presentation (SECURITY) - protocols at this level transmit data in a network representation
that is independent of the representations used in individual computers, which may differ.
Examples: SSL, CORBA Data Representation.
• Session (ERRORS) - at this level, reliability and adaptation are performed, such as detection
of failures and automatic recovery.
• Transport (TRANS) - the lowest level at which messages (rather than packets) are handled.
Messages are addressed to communication ports attached to processes. Protocols in this layer
may be connection-orientated or connectionless. Examples: TCP, UDP.
• Network (ROUTING) - transfers data packets between computers in a specific network. In a
WAN or an inter-network this involves generation of a route passing through routers. In a
single LAN, no routing is required. Examples: IP, ATM virtual circuits.
• Data link (DATA) - responsible for transmission of packets between nodes that are directly
connected by a physical link. In a WAN, transmission is between pairs of routers or between
routers and hosts. In a LAN it is between a pair of hosts. Examples: Ethernet MAC, ATM cell
transfer, PPP.
• Physical (WIRES) - the circuits and hardware that drive a network. It transmits sequences of
binary data by analogue signalling, using AM or FM of electrical signals (on cable circuits),
light signals (on fibre optic circuits) or other EM signals (on radio or microwave circuits).
Examples: Ethernet base-band signalling, ISDN.

Messages and headers combine to form packets - the bits that actually appear on the network.

TCP/IP Reference Model

The TCP/IP reference model originated from research in 1974 using the ARPANET
(predecessor to the modern Internet). TCP/IP does not have session or presentation layers,
which are only really needed in applications with specific requirements, and the data link and
physical layers are combined to form one layer.

Protocols within the TCP/IP model include Telnet, FTP, SMTP and DNS at the application
layers, TCP and UDP in the transport layer, IP in the network/Internet layer and ARPANET,
SATNET, AX.25 and Ethernet in the physical/data-link layer.

The OSI and the TCP/IP models both have advantages and disadvantages. For example, the
OSI model is well defined with the model before the protocols, whereas the TCP/IP model
defined protocols first and then retrofitted a model on top. The number of layers is also
different, although both have the network, transport and application layers the same. There
are also different levels of support for higher-level services - these are the key differences.
OSI supports connectionless and connection-orientated communication in the network layer
and connection-orientated only in the transport layer, whereas TCP/IP supports only
connectionless communication in the network layer and both in the transport layer, so users
are given an important choice.

OSI suffers from the standard criticism that can be applied to all ISO standards, which is that
it was delivered too late, it is over complicated and developed from the viewpoint of
telecoms. Additionally, the implementations of it are poor when compared to TCP/IP as
implemented on UNIX. Additionally, politics were poor, as it was forced on the world, and
was resisted in favour of UNIX and TCP/IP.

The TCP/IP model has the problem of only being able to support TCP/IP protocols, whilst the
OSI model can include new protocols. There is very little to distinguish between the data link
and physical layers, which do very different jobs. TCP/IP is the result of hacking by graduate
students, so there is little design behind it.

Middleware

Middleware alters the OSI model by replacing session and presentation layers with
middleware protocols. In terms of the TCP/IP model, middleware protocols are placed
between the application and transport layers. It is a good place to augment TCP/IP with
dependability features, e.g., security and fault tolerance.

Middleware protocols abstract away from the difficulties of using raw UDP and TCP, and
allow high-level IPC to occur. TCP is connection-based, so a connection needs to be
established before messages can be sent, and closed down afterwards. Messages are sent using
the send and receive primitives, which can be used synchronously or asynchronously.
Destinations of messages are specified as a tuple of IP address and port as dictated by TCP
and UDP.

Sockets

A common implementation of this kind of middleware is the "socket" abstraction. On
UNIX-based systems, two domains of sockets exist: ones in the UNIX domain, for local sockets
tied to the VFS, and ones in the INET domain, for processes on different machines. Different
types of sockets also exist: SOCK_STREAM creates a TCP-based socket, SOCK_DGRAM a
UDP-based socket and SOCK_RAW handles other socket types (bypassing the
TCP/UDP stage and using the IP level directly).

Common socket operations include:

• socket(domain, type, protocol) - create a socket


• bind(s, port) - associate the socket s with a specified port
• listen(port, backlog) - server listens for connection requests on a port
• connect(s, serverport) - client connects to a server using socket s
• accept(s, clientport) - server accepts connection
• sendto(s, data, clientport) - client/server sends data to a port
• receive(ip, port) - client/server receives data from a port

The sockets abstraction is extensively used and realises many application level services, such
as FTP, Telnet, HTTP, etc...
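
As a minimal sketch of the stream (TCP) case in Java (the port number and message are
arbitrary), a one-shot echo server and a matching client might look like:

import java.io.*;
import java.net.*;

// Minimal TCP echo server: bind/listen on a port, accept one connection, echo one line back.
class EchoServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(5000);
             Socket client = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()));
             PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
            out.println(in.readLine());    // echo the received line
        }
    }
}

// Matching client: connect to the server, send a line, print the reply.
class EchoClient {
    public static void main(String[] args) throws IOException {
        try (Socket s = new Socket("localhost", 5000);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println("hello");
            System.out.println(in.readLine());
        }
    }
}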
Most client-server communication is synchronous (the client blocks until the server replies),
and acknowledgements are redundant, however asynchronous communication allows clients
to continue without waiting (e.g., when no reply is expected from the server, or the client
accepts the reply when it arrives).

Typically, request-reply type communication is mapped to TCP, as UDP may get omissions,
lost replies, duplicates, unordered messages, etc... Using UDP/TCP in this way means
distribution is apparent to the application, but one of the goals of distributed systems is
transparency. Middleware provides a transparent layer between the application and transport
protocols.

Remote Procedure Calls

Remote procedure calls (RPC) provides a higher level of abstraction than sockets. A client
program calls a procedure in a server program (synchronous if a return value is required,
otherwise asynchronous), and parameters have to be passed by value, as references are
useless on another machine.

The dispatcher is required to map incoming calls onto the relevant procedure, and in the
client to map an incoming reply message to the relevant stub procedure. The interface
compiler generates a number (or name) for each procedure in the interface, which is inserted
into the call message by the client stub procedure, and the server uses it to identify which
procedure to call. In single-threaded clients, this is very slow as RPC must be done serially,
but multi-threaded clients have huge increases in throughput, as the program will not block.
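
A minimal sketch of the dispatcher idea (the procedure numbers and operations are invented for
illustration) is a table mapping the procedure identifier taken from the call message onto the
implementation to invoke:

import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical server-side dispatcher: looks up the procedure named in the call message
// and returns the (already marshalled) reply for the server to send back.
class Dispatcher {
    private final Map<Integer, Function<String, String>> procedures = new HashMap<>();

    Dispatcher() {
        procedures.put(1, arg -> "time is " + System.currentTimeMillis());  // procedure 1
        procedures.put(2, arg -> arg.toUpperCase());                        // procedure 2
    }

    String dispatch(int procedureId, String marshalledArgs) {
        Function<String, String> proc = procedures.get(procedureId);
        if (proc == null) {
            return "ERROR: unknown procedure " + procedureId;  // mapped back to the client stub
        }
        return proc.apply(marshalledArgs);
    }
}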

In order to implement failure recovery, the server generates a unique ID for each remote
procedure, and all RPCs must include the appropriate ID. The ID can then be used to detect
failed server processes (by the dispatcher on the server) using a variety of techniques, and on
the restart of the server process, a new ID is created, and the call with the stale ID is aborted.
In the case of client failure, the server must be able to "roll back" (or perhaps do nothing if
the semantics require it) to a previous state.

The OSI model is implemented in the network stack as demonstrated below:

In many implementations, however, the network and data link layers are often directly
supported by hardware, leaving the transport layer in the network OS services.

External Data Representation

We have the issue where hosts involved in communication may have different architectures
and therefore representing data differently, so data needs to be converted between the sender
and receiver representations. This is accomplished by the presentation layer in the OSI
model.

The process of encoding and decoding application data and messages is called marshalling
and unmarshalling.

We only consider lower level representations (integer lengths and big-endian vs. little-endian,
IEEE754 floats vs. non-standard ones, string encoding, arrays, structures, etc) and not the
higher level representations (e.g., images, video, multimedia documents, etc) which is dealt
with at the application level.

Negotiation is required to change base types into a common format (e.g., 32 bits for integers),
structures and arrays must be packed for transport, and more complex types (such as
pointers), must be linearised. The most common conversion strategies are to use a
common/canonical intermediate form, or to use a receiver-makes-right system. Tagging data
helps this, as a common interface is used across a range of implementations.
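
A minimal sketch of marshalling into a canonical form (here just a 32-bit integer and a length-
prefixed UTF-8 string, written big-endian regardless of the sender's native representation):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Marshalling {
    static byte[] marshal(int value, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + bytes.length);  // ByteBuffer is big-endian by default
        buf.putInt(value);
        buf.putInt(bytes.length);   // length prefix so the receiver knows where the string ends
        buf.put(bytes);
        return buf.array();
    }

    static void unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int value = buf.getInt();
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        System.out.println(value + " / " + new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        unmarshal(marshal(42, "hello"));   // prints "42 / hello"
    }
}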
Failure Semantics


Different types of failure semantics include: Maybe (Best Efforts) Call Semantics, where the
request is sent only once, it is unknown what happens if the request times out, unreliable
networks may duplicate requests, and the state of the server is not reliable. Another
type is At-Least-Once Call Semantics, where the client retries up to n times, so if the call
succeeds the procedure has been executed at least once; however, depending on the failure
modes, it could have been executed more than once.

At-Most-Once call semantics guarantees that the remote procedure is executed no more than
once, but may only be partially complete. Here, the server tracks request identifiers and
discards duplicates. The server must buffer replies and retransmit until it is acknowledged by
the client. Transactional call semantics guarantees the procedure is executed completely once
or not at all (i.e., no partial completion). The server must offer atomic transactions for each
RPC, to take it from one consistent state to another consistent state, otherwise other RPCs
using the data may get an inconsistent state.
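
A minimal sketch of the at-most-once idea (request identifiers and the reply cache are greatly
simplified) is for the server to remember which requests it has already executed and to replay
the saved reply for any duplicate:

import java.util.HashMap;
import java.util.Map;

class AtMostOnceServer {
    private final Map<Long, String> replyCache = new HashMap<>();

    String handle(long requestId, String args) {
        String cached = replyCache.get(requestId);
        if (cached != null) {
            return cached;                  // duplicate request: retransmit the earlier reply, do not re-execute
        }
        String reply = execute(args);       // first time this ID is seen: execute exactly once
        replyCache.put(requestId, reply);   // buffer the reply until the client acknowledges it
        return reply;
    }

    private String execute(String args) {
        return "done: " + args;
    }
}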

Physical Networks
Physical Layer

The basis of data communication is that of some variation of a physical property of the
communication medium, e.g., voltage on coaxial cable, pulsing light down fibre optics or RF
propagation through the atmosphere.

At the physical level, we deal with communications at a bit level, and this is probably the
most difficult layer to deal with, as physical hardware sends or receives raw bit streams and
bit streams tend not to be error free.

Encodings such as non return-to-zero (NRZ) are used as electrical failure is easier to detect
and signal-to-noise ratio is improved; here, 0 is -5V and 1 is +5V. The Manchester encoding
improves the SNR further as an alternating signal is used - a '1' is represented by a full cycle
of the inverted signal from the master clock.

The introduction of fault tolerance at this layer makes transparency difficult - the session
layer is supposed to be responsible for fault tolerance, but this can only really be done with a
fault model (i.e., a knowledge of errors due to transmission at the physical layer).

A limit on the data rate is placed by the sampling frequency. The Nyquist Sampling Theorem
(developed by Harry Nyquist working at Bell Labs) states that in order to be "perfectly"
represented by its samples, a signal must be sampled at a sampling rate (or frequency) equal
to at least twice its highest frequency. Other factors that affect data rate include the SNR and
attenuation.
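
As a worked example of the sampling limit: a telephone-quality voice channel limited to roughly
4 kHz must be sampled at at least 8 kHz (2 × 4 kHz), which is why 8000 samples per second is
the standard rate for digitised telephony.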

Other issues at the physical layer include latency (delay between sending and data becoming
available at destination), message transmission time and throughput (the total volume of data
across a network in any given time).

Data Link Layer

The data link layer builds the bit stream into frames of data.

The start and end of frames can be denoted by time gaps, however we then have the problem
of missing bits - how do we signify the difference between the end of a frame and an error?
We could denote a time gap with a certain sequence (e.g., CAN uses 6 consecutive equal
bits), but this sequence could also indicate failure, so "real" data has to be stuffed (insertion
of extra bits, following a pattern known to both sender and receiver, which the receiver
discards) to prevent n equal bits in the data being misconstrued as an error or time gap.
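
A minimal sketch of the stuffing step (a CAN-style rule of stuffing after five identical bits is
assumed here; the exact count depends on the protocol):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BitStuffer {
    // Insert a complementary "stuff" bit after every run of five identical bits, so that a longer
    // run on the wire can only mean an error flag or an end-of-frame/time-gap marker.
    static List<Integer> stuff(List<Integer> bits) {
        List<Integer> out = new ArrayList<>();
        int run = 0;
        int last = -1;
        for (int b : bits) {
            out.add(b);
            run = (b == last) ? run + 1 : 1;
            last = b;
            if (run == 5) {
                out.add(1 - b);   // five identical bits seen: insert the opposite bit
                last = 1 - b;
                run = 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(stuff(Arrays.asList(1, 1, 1, 1, 1, 1, 0)));  // [1, 1, 1, 1, 1, 0, 1, 0]
    }
}

The receiver applies the reverse rule, discarding the bit that follows every run of five identical
bits before handing the frame up the stack.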

Other ways of implementing framing are to use a character count or start/end characters, and
this can be combined with parity checking or CRC checks.

RS232

Standards such as RS232 are point-to-point standards, which specify mechanical, electrical,
functional and procedural interfaces. RS232 utilises a number of copper wires to connect two
nodes, and what signals on these wires mean. Unidirectional hardware control lines are used
to control flow.

On RS232, communications normally occur using either a 7 or 8 bit byte, followed by an
optional parity bit and stop bit(s). Both ends have to be configured in the same manner for
communication to occur, however. RS232 has no real error control, so the data-link layer
usually includes some kind of basic protocol.
For RS232, transfer of data from source to destination occurs by an unacknowledged
connectionless service (the source sends independent data frames to the destination), an
acknowledged connectionless service (the source sends independent data frames to the
destination with acknowledgement returned) or an acknowledged connection-orientated
service (the source and destination establish connection and send numbered/sequential
frames). The latter is more reliable than the former.

Other checks (such as sliding window protocols) can be used, but are normally implemented
at higher levels of the network stack.

Other issues we should be considering when dealing with physical networks include:

• Scalability - can it cope with a growth in traffic, clients, physical size, etc...
• Reliability - minimising the impact of errors introduced by the network, in order to avoid
wasted traffic
• Security - ability to secure communication channels, e.g., firewalls or frequency hopping
• Mobility - access to network, wireless network, etc...
• Quality of Service (QoS) - meeting of deadlines, especially for multimedia applications

There are also different types of network to consider, such as LANs (local area networks),
WANs (wide area networks - typically used to link different LANs together) and MANs
(metropolitan area networks, normally implemented like LANs, but over a larger area). As
networks get bigger, data transfer rate tends to decrease.

We can also consider wireless networks, such as WLANs (IEEE802.11a/b/g, now often used
in place of LANs), PANs (wireless personal area networks, such as Bluetooth connections to a
phone, PDA, etc) and mobile phone networks (e.g., GSM).

In large networks, packets may visit intermediary nodes before they reach their intended
destination, so the topology of a network is critical and factors such as installation cost,
communication cost and availability should be considered. Common topologies for point-to-
point networks are demonstrated below:
Broadcast Networks

In addition to point-to-point networks, we can also consider broadcast networks, which share
a single communication channel. Here, messages are broken into packets which are broadcast
to all on that channel, and when a packet is received, a computer examines the data (which
contains a destination address) and discards it if it is not relevant.

Multicasting is a special case of this, and can be implemented in point-to-point networks also.
Multicasting is one-to-some (where some is a subset of all destinations), and this requires
support in the network hardware.

However, in broadcast networks, we have the problem of deciding who uses the channel if
there is competition for it. We use medium access control (MAC) to determine who goes
next. There are two main philosophies in this: static allocation (typically time slots)
and dynamic allocation, which can either have central arbitration or decentralised
allocation (a free-for-all).
One method of solving this problem is ALOHA, which is a contention system. In ALOHA,
users transmit when data becomes available, however collisions are inevitable, and the
colliding frames are destroyed. A sender finds out if a frame is destroyed by also listening. A
variation of ALOHA called slotted ALOHA imposes agreed timeslots (this requires global
time, however - see later lectures), and you only send at the start of a timeslot.

ALOHA developed into carrier sense multiple access (CSMA), where a sender listens to a
channel first and only transmits if it is silent. If a collision occurs, the station waits a random
amount of time and then retries. There are two different varieties of CSMA: non-persistent
CSMA, which senses the channel and sends if it is free, but waits before re-sensing if it is busy;
and p-persistent CSMA, which senses the channel and, if it is silent, transmits with probability
p (deferring with probability 1 - p). If the channel is not free, it continually senses until it is free.

However, we now have the problem of when a collision is detected. An improvement to
CSMA can be made by aborting transmission as soon as a collision is detected (however,
senders may not immediately hear the collision). The contention period here is normally quite
small, as collisions can be detected in just a few microseconds. CSMA-CD also works better
with large packets, as once connection is established, the throughput is greater (although this
does have the disadvantage of reducing other packets' responsiveness).
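
A minimal sketch of the sender's behaviour in CSMA-CD (the Channel interface is hypothetical):
sense the medium, transmit, abort on collision, then back off for a random time and retry:

import java.util.Random;

class CsmaCdSender {
    interface Channel {
        boolean isIdle();
        boolean transmit(byte[] frame);   // returns false if a collision was detected mid-send
    }

    private static final Random random = new Random();

    static void send(Channel channel, byte[] frame) throws InterruptedException {
        while (true) {
            while (!channel.isIdle()) {
                Thread.sleep(1);                  // carrier sense: wait for the channel to go quiet
            }
            if (channel.transmit(frame)) {
                return;                           // sent without a collision
            }
            Thread.sleep(random.nextInt(16));     // collision: random back-off, then retry
        }
    }
}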

Another way of solving the problem is to use collision avoidance, where it is agreed in
advance when the sender has sole access to the network. Some ways of implementing this
include:

• Time Division Multiplexing - have an agreed time slot for a message (this does have the
requirement of global time, however)
• Frequency Division Multiplexing - frequency range of wireless communications split into
bands, each user having exclusive access
• Wavelength Division Multiple Access - like FDM, but for fibre optics and at the visible area of
the spectrum
• Spread Spectrum (frequency hopping) - this is used to prevent jamming, where the
frequency changes according to a system only the sender and receiver know
Ethernet (IEEE802.3)

Ethernet is used as a broadcast bus, where CSMA-CD provides MAC.

The preamble is used for hardware timing purposes (e.g., to achieve a hardware sync), and
the data payload is variable (max 1500 bytes). Collisions are detected by listening on the input
port whilst sending on the output port. Transmissions are aborted when a collision is detected,
and retries occur after a random back-off time. Usual efficiency is about 80-90%.

The packet must be padded where the data length is less than minimum (64 bytes).

Controller Area Network

The Controller Area Network (CAN) is an advanced serial bus system that efficiently
supports distributed, real-time control. It was originally developed for use in automobiles by
Bosch in the late 1980s. CAN has been standardised by ISO as ISO 11898 and is now used in
more than just automobiles.

It has been widely adopted as it reduces the amount of wiring required, as CAN controllers
are cheap and can even be incorporated into the client itself.

The CAN protocol is only defined at the physical and datalink layers, however the exact
physical states are not declared in the specification.

Data messages on the CAN bus do not contain the address of either the sender or receiver, but
the content of each message is labelled by an identifier that is unique throughout the network.
Messages are broadcast atomically, where they are either simultaneously accepted by all
nodes, or no nodes. This identifier describes the meaning of the data, and for certain
applications, the assignment of message identifiers to functions is standardised. Messages are
then filtered according to their relevance; if a message is relevant, it is processed by the
receiver.

The unique identifier also determines the priority of the message - the lower the value of the
identifier, the higher the priority. When the bus is idle, all nodes will attempt to
send their current highest priority sendable message. The highest priority message is
guaranteed to gain access to the bus and lower priority messages are automatically
retransmitted in the next bus cycle, or in a subsequent bus cycle if there are still other, higher
priority messages waiting to be sent.

CAN often uses a two-wire bus, such as twisted pair; flat pair cable also performs well, but it
generates more noise and is more susceptible to noise (EMI). NRZ encoding with bit
stuffing is used for data communication on a differential two-wire bus (i.e., each wire carries
the opposite signal, e.g., one +5V and the other -5V). NRZ encoding ensures compact messages
with a minimum number of transitions and a high resilience to external disturbance. This
allows CAN to operate in harsh and noisy environments. The ISO standard recommends that
bus interface chips be designed so communication can continue even if either of the two
wires in the bus is broken, or either wire is shorted to power or ground.

CAN uses bit arbitration, which requires that all nodes hear each bit in the same bit time, so
the maximum data rate is limited by the speed of light and the cable length. CAN uses
CSMA-CD, however unlike Ethernet, when frames are transmitted at the same time, non-
destructive bitwise arbitration allows the highest priority message to gain bus access.

Bit arbitration works by that when the bus has been silent for long enough, each node begins
to output the identifier for its highest priority message. Bus conflicts are therefore resolved by
non-destructive bitwise arbitration using a wired-AND (open collector) where dominant bits
(0) overwrite recessive bits (1). Nodes read as well as write to the bus, so nodes back off
when it reads a value different to what it output.
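
A minimal sketch of the wired-AND arbitration (identifiers are shortened to three bits for
illustration): the bus level is the AND of everything transmitted, and a node backs off as soon
as it reads a dominant bit where it sent a recessive one:

import java.util.Arrays;

class CanArbitration {
    static int arbitrate(int[] identifiers, int bits) {
        boolean[] stillSending = new boolean[identifiers.length];
        Arrays.fill(stillSending, true);
        for (int bit = bits - 1; bit >= 0; bit--) {            // most significant identifier bit first
            int busLevel = 1;                                   // recessive unless someone drives dominant
            for (int i = 0; i < identifiers.length; i++) {
                if (stillSending[i] && ((identifiers[i] >> bit) & 1) == 0) {
                    busLevel = 0;                               // wired-AND: any dominant (0) bit wins
                }
            }
            for (int i = 0; i < identifiers.length; i++) {
                if (stillSending[i] && ((identifiers[i] >> bit) & 1) == 1 && busLevel == 0) {
                    stillSending[i] = false;                    // sent recessive, read dominant: back off
                }
            }
        }
        for (int i = 0; i < identifiers.length; i++) {
            if (stillSending[i]) return identifiers[i];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(arbitrate(new int[]{0b101, 0b011, 0b110}, 3));  // prints 3, the lowest identifier
    }
}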

CAN transmits and receives data using message frames, which carry up to 8 bytes of data
from a transmitting node to the relevant receiving nodes. The end of a message is indicated
by a run of consecutive identical bits longer than bit stuffing would ever allow in real data
(i.e., the bus goes silent), so bit stuffing is used to avoid this combination cropping up within
the data itself.

Switched Networks

Broadcast and point-to-point networks are good for message passing as they are
connectionless, but they are less good for streaming applications (e.g., transmission of
video/sound data streams). Switched networks are better for connection-orientated
requirements, as well as many WAN applications and streaming applications.

In switched networks, all computers are directly linked to the switch (i.e., computers are not
linked directly to one another), and switching is used to transmit information between two
nodes which do not share a direct link.

Packet switching works by placing computers at each switching node. Basic processing
and storage is made available, and "store and forward" networks (where packets are
forwarded from source to destination when a link to the destination becomes available) are
now possible. A permanent connection is established between each node and the switch, and
to send a packet from A to B, the switch determines when the packet is forwarded to B.
Switches can also switch between different types of physical networks.

One way of implementing switch networks is using source routing, where each packet has
enough info to enable any switch to decide how to get the packet to its destination -
overheads are expensive as the full route must be known beforehand and must be embedded
in the packet being sent.

Another method is virtual circuit switching. VC switching is similar to a phone call, where
there is an explicit setup and tear-down stage, so all packets for that connection follow the
same circuit. This is sometimes called a connection-orientated model. Each packet contains a
VCI (virtual circuit ID) which is then reassigned by each switch for the next VC switch based
on a VCI table. Switches here are complicated, and timing guarantees are difficult. Only the
connection request needs to contain the full destination address, and as the VCI tends to be
smaller than a full destination address, the per-packet overhead is small.

Due to the setup time, you generally have to wait a full round trip time (RTT) for connection
setup before sending the first data packet, so this is inefficient for short or bursty packets.
Additionally, if a switch or link in the connection fails, the connection is broken and a new
one needs to be set up.

The connection set-up does provide an opportunity to reserve resources, however (this is
important if Quality of Service is a prime requirement for applications).

Another method is datagram switching. Here, there is no connection set up phase and each
packet is forwarded independently, so the model is completely connectionless, and is
analogous to the postal system. Each switch maintains a forwarding (routing) table which
maps host to port, therefore a switch needs to know which port to send a packet on to reach
its desired host.

In this model, there is no RTT delay waiting for the connection setup; a host can send data as
soon as it is ready. However, the source host has no way of knowing if the network is capable
of delivering such a packet, or if the destination host is even up or available. However, as
packets are treated independently, it is possible to route around link and node failures. Every
packet must contain the full address of the destination, so the overhead per packet is higher
than for the connection-orientated model.

LANs have physical limitations, specifically in the length, so we need to introduce ways to
get round this.

Hubs

Hubs act at a physical layer and are basically multi-port repeaters. They perform no routing
(so are only useful in broadcast networks) and simply receive and send onwards. By repeating
the signal they can help counter effects such as attenuation and poor SNR, but they can also
increase collisions. Hubs act at the physical layer and join network segments.

Bridges

Bridges act at the data-link layer and connect two or more network segments or LANs
(however, they must be of the same type). They have an accept-and-forward strategy, and
perform a basic level of routing. They usually do not alter a packet header.

A key issue is that we must ensure that we forward a packet to the correct LAN. The simplest
way to implement this is to use fixed routing - this is where a route is assigned for all
source/destination pairs, so is only usable in small networks. The selected route is usually the
one with the fewest hops, but problems can be encountered if the network changes or new
machines are added.

Learning bridges work by having a routing table and entries in this are learnt from the source
address, and is effective if we have a tree network topology. In learning bridges, we only
forward if necessary, so unnecessary traffic is not created on segments or LANs. The
algorithm for learning bridges is such that when a frame arrives on a port P at the bridge, it
must have come from the direction of a sending LAN, so we examine this source address S.
We then update the forwarding table to reflect that to send messages to destination S, we
forward it to the LAN connected to port P. Each entry in the forwarding table is then
timestamped to enable entries to be deleted after a fixed time - this enables the table to reflect
changes in network topology (e.g., S could move from port 1 to port 2 and the bridge would
learn this next time S sends a message).
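
A minimal sketch of the learning step (MAC addresses are represented as strings and the ageing
time is arbitrary):

import java.util.HashMap;
import java.util.Map;

class LearningBridge {
    private static final long AGEING_MS = 300_000;              // forget entries after five minutes
    private final Map<String, Integer> table = new HashMap<>(); // source MAC -> port it was last seen on
    private final Map<String, Long> lastSeen = new HashMap<>();

    void frameArrived(String sourceMac, String destMac, int inPort) {
        table.put(sourceMac, inPort);                           // learn/refresh: sourceMac is reachable via inPort
        lastSeen.put(sourceMac, System.currentTimeMillis());

        Integer outPort = table.get(destMac);
        if (outPort == null || System.currentTimeMillis() - lastSeen.get(destMac) > AGEING_MS) {
            floodToAllPortsExcept(inPort);                      // unknown or stale destination: forward everywhere else
        } else if (outPort != inPort) {
            forwardToPort(outPort);                             // forward only if the destination is on another segment
        }                                                       // otherwise drop: destination is on the arrival LAN
    }

    private void floodToAllPortsExcept(int port) { /* omitted */ }
    private void forwardToPort(int port) { /* omitted */ }
}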

In learning bridges however, we have the problem of network loops causing messages to
arrive at a bridge from multiple directions, which causes the route table to oscillate. To solve
this, we can use a simple result from graph theory: "for any connected graph, consisting of
nodes and edges connecting pairs of nodes, there is a spanning tree of edges that maintains
the connectivity of the graph but contains no closed loops".

To implement this, we have bridges running a distributed algorithm to find a spanning tree.
An attempt is made to find the best tree (with respect to number of hops), and the topology is
not known before running the algorithm. The algorithm is that each bridge has a unique ID,
and each port on a bridge is assigned a cost (sometimes the same cost for each port, and is
usually factors such as bitrate supported by a port). The bridge with the smallest ID is the tree
root. The number of hops is initialised to 0.

The bridges then need to find the root bridge, so all bridges circulate messages containing
their ID and the number of hops. When a bridge receives one of these messages, if the received
ID is lower than its own, or the ID is the same but the cost is lower, it adds one to the hop
count and forwards the message on, and stops sending its own message, as it has identified
another bridge that could be the root, or a faster route to the root; otherwise the message is
discarded. Eventually, only messages from the root bridge will be circulating, and all other
bridges should have backed off as their ID/cost is higher. Loops are therefore avoided,
assuming that all identifiers are unique.

Each bridge will then determine its root port (i.e., port with least cost to root) and a dedicated
bridge is chosen for each LAN. Where a LAN has more than one bridge, the bridge sends
messages to other (external) bridges on the same segment (this is achieved by no bridge
forwarding a message to another LAN), and the message contains the cost to root for each
bridge. The bridge with the smallest cost to root is then chosen as the designated bridge (i.e.,
fastest route from bridge up the tree chosen).

This algorithm results in that one bridge on each LAN is selected as the designated bridge,
each bridge forwards frames over each LAN for which it is the designated bridge. This is still
a learning-style algorithm, but there are no loops.

Bridges do have the problem of not scaling, as the spanning tree algorithm does not scale, and
having a single designated bridge can be a bottleneck. Additionally, bridging does not accommodate
heterogeneity (bridges make use of frame headers, so they can only support networks with
the same format for addresses). Additionally, different MTUs can cause problems.

Switches

We can think of switches as multiport bridges.


Switches solve the problem of static network topologies dictated by attempts to minimise
wiring, so switched networks provide flexible architecture for networks. Switches allow
packets to traverse point-to-point WANs from source to destination, and are fundamental for
WANs.

Recently, switches have also become cost effective for LANs, as they allow the ability of a
local LAN to have effective total connectivity between nodes, and provide fast
communication and less contention. The advantages are that whilst wiring is still based on
physical demands (i.e., computer is connected to the closest switch), LANs can be configured
logically (usually in software).

LAN switches are strictly switching hubs, and switch between single LAN types. They are
termed level 2 switches (from the OSI model), as they inspect packets at the frame level by
looking for the MAC address. They can be implemented as store-and-forward switches,
where the incoming frame is buffered briefly before being routed, which allows the number
of switches to be less than the number of ports (although this has a performance hit), or cut-
through switches, where an incoming frame is routed as soon as the MAC address is received
(usually at the front of the frame, e.g., in Ethernet). This can lead to propagation of faulty
frames, as the frame is forwarded before the CRC is checked.

Routers

Routers are capable of providing interconnects between different sorts of LANs and WANs.

ATM

ATM stands for asynchronous transfer mode and is a connection-oriented packet-switched
network. It was developed by and for the telecoms industry and was originally designed for
WANs, although it is now used in LAN settings. It is an ITU-T standard.

ATM requires the setting up of a virtual circuit before transmission can start, and connections
are identified by a two-level hierarchy consisting of a VCI (virtual circuit identifier) giving the
route across switches from source to destination and a virtual path identifier which identifies
a set of inter-switch routes. A combination of VCI and VPI can be thought of as a virtual
circuit in packet switched network terminology.

The ATM model is a three-level model consisting of an ATM Adaptation Layer (AAL) which
adapts information to be sent to the cell structure (i.e., creates packets or cells for, e.g.,
specially dealing with voice), an ATM layer which is responsible for multiplexing and
switching packets (or cells) and a physical or adaptation layer which is responsible for
adapting to the underlying transmission media.

Packets in ATM are called cells and are made up of a 5-byte header and a 48-byte fixed
length payload. The header contains the following fields:

• GFC - generic flow control
• VPI - virtual path identifier
• VCI - virtual circuit identifier
• Type - management, congestion control, AAL, etc. (AAL 1 and 2 for applications that need a
guaranteed data rate, 3 and 4 for packet data and 5 as an alternate packet data standard)
• CLP - cell loss priority (when the network is overloaded, packets of a lower priority are lost first)
• HEC - header error check

Fixed length packets are easier to switch in hardware and enable hardware switch
parallelism, as variable length packets inevitably require software to process them. When
deciding what length to use, there is no optimal size: small packets have a high header-to-data
overhead and large packets have an underutilisation problem. Small packets do improve
latency (especially important for voice), e.g., if sound is recorded at 64 kbps and a cell were
1000 bytes long, you would need to wait 125 ms before a cell was full, which is too long for
voice. The 48-byte payload was a compromise between Europe's desired 32 bytes and
America's desired 64 bytes.
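
As a quick check of the latency figures quoted above, a few lines of Python (illustrative only)
reproduce the 125 ms figure and show the equivalent for the actual 48-byte payload:

# Quick check of the figures quoted above (values taken from the text: 64 kbps
# voice; a hypothetical 1000-byte cell versus the real 48-byte payload).
def fill_time_ms(payload_bytes, rate_bps):
    return payload_bytes * 8 / rate_bps * 1000

print(fill_time_ms(1000, 64_000))   # 125.0 ms - too long for voice
print(fill_time_ms(48, 64_000))     # 6.0 ms   - the actual ATM payload size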

With ATM there is a problem of congestion if the rate at which things are being put into the
queue for the output ports is close to or greater than the maximum data rate of the output port.
Recognising congestion can take time, so it is often better to try to prevent congestion rather
than react to it. Ways of avoiding congestion include:

• Admission Control - this is used to decide whether or not a request for a new connection is
allowed, and once this is set up all parties have some knowledge of likely data rates along
the connection
• Rate Policing - this controls flow into a network across a channel to be regular, using a
"leaky-bucket" algorithm (i.e., buffer packets and release them at a regular rate). This transforms a
bursty data arrival into a regular drip of packets into the network, spreading the load. It is
usually matched to an agreed data rate set up by admission control (a sketch of the leaky
bucket follows this list)
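
A minimal sketch of the leaky-bucket idea mentioned in the rate policing bullet is shown
below (Python, purely illustrative; the capacity and drain rate are invented rather than taken
from any real policer):

# Minimal leaky-bucket sketch. The capacity and drain rate are invented; a real
# policer would match the drain rate to the rate agreed at admission control.
from collections import deque

class LeakyBucket:
    def __init__(self, capacity, drain_per_tick):
        self.capacity = capacity
        self.drain_per_tick = drain_per_tick
        self.queue = deque()

    def arrive(self, packet):
        if len(self.queue) < self.capacity:
            self.queue.append(packet)   # buffer the bursty arrival
            return True
        return False                    # bucket full: packet dropped

    def tick(self):
        # Release at most drain_per_tick packets - the regular "drip".
        return [self.queue.popleft()
                for _ in range(min(self.drain_per_tick, len(self.queue)))]

bucket = LeakyBucket(capacity=10, drain_per_tick=2)
for i in range(5):                      # a burst of five packets arrives at once
    bucket.arrive("pkt%d" % i)
print(bucket.tick())                    # ['pkt0', 'pkt1']
print(bucket.tick())                    # ['pkt2', 'pkt3']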

However, if congestion still arises in physical networks, we can use the CLP bit to shed low
priority traffic.

The ATM model does not map easily onto the OSI and TCP/IP models. ATM has
characteristics of end-to-end VCs, switching and routing, which occur at OSI levels 3 and 4,
but the TCP/IP view is that ATM sits at levels 1 and 2, which enables IP to be placed over it.
However, ATM does not have the characteristics of a single-hop data-link layer, and it needs
messages to be split and reformed, which IP does not itself support.

We can, however, try to represent how ATM fits into the OSI model as follows:

OSI Layer   ATM Layer   ATM Sublayer   Functionality

3/4         AAL         CS             Providing the standard interface (convergence)
                        SAR            Segmentation and reassembly
            (TCP, UDP and IP would go between these layers)

2/3         ATM                        Flow control; cell header generation/extraction;
                                       virtual circuit/path management;
                                       cell multiplexing/demultiplexing

2           Physical    TC             Cell rate decoupling; header checksum generation and
                                       verification; cell generation; packing/unpacking cells
                                       from the enclosing envelope; frame generation

1           Physical    PMD            Bit timing; physical network access

Network Layer
We have a requirement to cope with internetworking, that is, when communication occurs
across many (potentially) different networks. The physical and data-link layers alone can join
similar networks, however we must be able to join different networks. We have to consider
issues such as heterogeneity (communication may need to traverse many different networks
to get to the destination) and scale, which itself has issues such as routing (how to find
efficient paths through networks consisting of millions of nodes) and addressing (how to
assign unique identifiers to each node).

An internetwork is an arbitrary collection of networks interconnected to provide an end-to-
end service, and works by a concatenation of networks, such as Ethernet, PPP, FDDI (Fibre
Distributed Data Interface), etc.

Both hosts and routers need the network layer of the stack, however routers do not consider
any layers higher, which are only considered by the hosts.

A common network layer implementation is IP (Internet Protocol), which sits at level 3 in the
OSI stack and at the Internet layer in the TCP/IP stack. We need to consider which host-to-
host protocols IP needs to support, as well as the fact it needs to be supportable by the
underlying physical network (i.e., layers 1 and 2). IP is connectionless (datagram-based) and
has a best-effort delivery (unreliable service) model - packets can be lost, delivered out of
order, duplicates can arrive and packets can be delayed for a very long time.

The format of a datagram packet in IPv4 is shown in a header diagram not reproduced in
these notes; its key fields are described below.

IPv4 supports a maximum data length of 65535 bytes, and allows a TOS (type of service)
marker that allows different treatment of packets depending upon application needs. TTL
(Time To Live) is used as a hop count; it is set by the sender and is decremented by routers
and when it reaches 0, the packet is killed. In this case, the addresses are IP addresses, not
raw network addresses.

Each network has some MTU (maximum transmission unit), which is the largest IP datagram
that can be carried in a frame, e.g., for Ethernet this is 1500 bytes, and for ATM (CS-PDU) it
is 64 KB (we are only interested in the size of the ATM packet, not an ATM cell), so the
IP datagram must fit into the payload of a frame.

A strategy to implement this is to fragment when necessary (i.e., when MTU < datagram),
but we should try to avoid fragmentation at the source host (i.e., the source should not create
datagrams bigger than the MTU to the first router). Refragmentation is possible en route, but
as fragments are self-contained datagrams, we can delay reassembly until we reach the
destination host, which could save time if the packets would otherwise need disassembling
again at a later point in the route. However, we cannot recover from lost fragments.

When fragmenting and reassembling, we can use the offset field to tell us how far (in units of
8 bytes) from the start of the datagram the current fragment's data begins. The "more
fragments" flag is set to 0 in the last fragment, so the receiver knows when the datagram has
been completely received. The ident field is used to uniquely identify a datagram, so you
know which datagram a fragment belongs to when reassembling.
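
The following Python sketch (illustrative, not from the notes) shows how fragment offsets and
the "more fragments" flag could be computed for a datagram larger than the MTU; the sizes
are example values, and the 8-byte offset unit is the standard IPv4 convention:

# Fragmentation sketch: offsets are carried in units of 8 bytes and every
# fragment except the last has the "more fragments" flag set. The payload
# length, MTU and header size are example values only.
def fragment(payload_len, mtu, header_len=20):
    max_data = (mtu - header_len) // 8 * 8   # fragment data must be a multiple of 8
    fragments, offset = [], 0
    while offset < payload_len:
        data = min(max_data, payload_len - offset)
        more = (offset + data) < payload_len
        fragments.append({"offset": offset // 8, "len": data, "more_fragments": more})
        offset += data
    return fragments

for frag in fragment(payload_len=4000, mtu=1500):
    print(frag)
# offsets 0, 185 and 370 (in 8-byte units); only the last has more_fragments=False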

We also need to consider the problem of global addressing. All addresses must be unique
within their scope (e.g., for the Internet this must be global, but for an intranet it only needs
to be within the current LAN). Addresses are hierarchical and consist of a network part and
a host part - this is essential for scalability, as we only need to consider the network when
we are sending, which can then forward to the host. Routers do not need to know about all
hosts, only networks.

IPv4 addresses are 32 bits long and are represented by a dot notation which groups them into
4 groups of 8 bits, e.g., 10.23.4.55, 123.66.43.4, 144.32.40.240, etc.

There are two ways of identifying the network part of an address: originally classful address
space was used, but this has now moved to classless addressing/subnetting in the modern
Internet.

Classful Routing

In classful routing, there are 5 classes of addresses, named A-E, however D addresses are
reserved for multicast, which is rarely used on the modern Internet and class E addresses are
reserved. A class can be identified by its leading value, and classes differ by how many bits
the network number and host numbers use.

Class Leading Value Bit length of network number Bit length of host number

A 0 7 24

B 10 14 16

C 110 21 8

D 1110 N/A N/A

E 1111 N/A N/A

Choices over which class an organisation is supplied are assigned based on how many hosts
are anticipated to be used in the network, and within classes, ranges were assigned
geographically (e.g., 144.x for the UK, etc).

Classless Inter-Domain Routing and Subnets

The original intent of classful routing was to have the network part of an IP address would
identify exactly one network. However, for every network under 255 hosts, a class C was
needed, and for over 255 hosts, a class B. This can lead to a large wastage, a class C with 2
hosts is only 0.78% efficient, and a class B with 256 hosts is only 0.39% efficient.
Additionally, there were still a lot of networks, and route propagation protocols do not scale
well.

Subnets add another level to the address/routing hierarchy, as subnet masks are used to define
a variable partition of the host part. The mask reveals the subnet number, which is the part of
the IP address to be compared with the routing table entry. This is more flexible than classful
address space, as it allows a greater break-up of addresses and more efficiency.

IP uses the datagram forwarding strategy, where every datagram contains the destination
address. If the node is directly connected to the destination network, we can forward directly
to the host, but if it is not directly connected, then we forward to some router (the
gateway/default router), which has a forwarding table that maps the network number into the
next hop.

Once a message has reached the appropriate network, address translation must be used to
map IP addresses into physical network addresses. ARP (Address Resolution Protocol) is
used to do this. A table is maintained of IP to physical address bindings, and a request is
broadcast to all nodes within the local network if an IP is not in the table. The target machine
responds with its physical address, and table entries are discarded if they are not refreshed
(e.g., every 10-15 minutes).

When a host or router can not process an IP datagram, an error message is returned using
ICMP (Internet Control Message Protocol). Typical errors/ICMP messages include:

• Echo (ping)
• Redirect (from router to source host)
• Destination unreachable (protocol, port or host)
• TTL exceeded (so datagrams don't cycle/hop forever)
• Checksum failed
• Reassembly failed
• Can not fragment

Routing

Forwarding is selecting an output port based on the destination address and the routing table,
and routing is the process by which the routing table is built.

The easiest way to represent a network for routing is as a weighted graph, and we then have
the problem of finding the lowest cost path between two nodes. Factors to consider include
static factors such as topology and dynamic issues such as load. A simple solution (flooding)
is to send the packet out on all interfaces except the one it came in on; this is guaranteed to
reach the destination in the shortest time, but causes congestion, duplication, etc.

We can use metrics to measure network performance. The original ARPANET metric
measured the number of packets queued on each link, and took neither latency nor bandwidth
into consideration. A new ARPANET metric was developed to combat this, where each
incoming packet is stamped with an arrival time (AT), and the departure time (DT) is
recorded. When a link-level ACK arrives, the delay time is then computed as Delay = (DT -
AT) + Transmit + Latency. If a timeout occurs, DT is reset to the time the packet was
retransmitted, and the link cost is the average delay over some time period.

RIP (Routing Information Protocol) can be used to compute a distance vector. Each node
maintains a set of triples (Destination, Cost, NextHop), and exchanges updates with its directly
connected neighbours, either periodically (typically in the order of several seconds) or
whenever its table changes (called a triggered update). The updates take the form of a list of
pairs (Destination, Cost), and if a route appears to be better (i.e., have a smaller cost), the
local table is updated with the new NextHop. Existing routes should be refreshed
periodically, and deleted if they time out.
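
A minimal sketch of the distance-vector update step described above might look like this in
Python (the node names, link costs and simplifications are mine - real RIP also handles route
timeouts, triggered updates and the infinity value):

# Distance-vector update sketch. The node names and costs are invented, and
# real RIP also handles route timeouts, triggered updates and infinity = 16.
table = {}   # destination -> (cost, next_hop)

def process_update(neighbour, link_cost, advertised):
    changed = False
    for dest, cost in advertised:                 # advert is a list of (dest, cost) pairs
        new_cost = cost + link_cost
        if dest not in table or new_cost < table[dest][0]:
            table[dest] = (new_cost, neighbour)   # better route: record new NextHop
            changed = True
    return changed                                # True would trigger an update

process_update("B", link_cost=1, advertised=[("C", 1), ("D", 2)])
process_update("E", link_cost=2, advertised=[("C", 3), ("D", 1)])
print(table)   # {'C': (2, 'B'), 'D': (3, 'B')}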

When subnetting is used, the routing algorithm changes, however. The entries are now of the
form (SubnetNumber, SubnetMask, Cost, NextHop). A binary AND is performed on the IP
address and the subnet mask, and the result is compared with the subnet number; if it matches,
the packet is forwarded on to NextHop.
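
The mask-and-compare step can be illustrated with a short Python sketch using the standard
ipaddress module (the routing entries, addresses and next hops below are invented):

# Subnet matching sketch using the standard ipaddress module. The routing
# entries, addresses and next hops are all invented.
import ipaddress

table = [
    {"subnet": "144.32.40.0", "mask": "255.255.255.0", "cost": 1, "next_hop": "R1"},
    {"subnet": "144.32.0.0",  "mask": "255.255.0.0",   "cost": 3, "next_hop": "R2"},
]

def next_hop(dest):
    dest_int = int(ipaddress.IPv4Address(dest))
    best = None
    for entry in table:
        mask = int(ipaddress.IPv4Address(entry["mask"]))
        subnet = int(ipaddress.IPv4Address(entry["subnet"]))
        if dest_int & mask == subnet:             # binary AND, then compare
            if best is None or entry["cost"] < best["cost"]:
                best = entry
    return best["next_hop"] if best else "default router"

print(next_hop("144.32.40.17"))   # R1 - matches both entries, lower cost wins
print(next_hop("144.32.99.5"))    # R2 - only the /16 entry matches
print(next_hop("10.0.0.1"))       # default router - no entry matches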

Problems can occur with loops and failures, however, where an out of date node publishes
information about a link that goes down to NextHop, and the node decides it can reach the
failed node through the out-of-date node, causing an infinite loop. To break these loops, we
can do things such as set "infinity" to an arbitrary number, such as 16, or use a split horizon,
where routes learnt from a neighbour are not sent back to that neighbour.

To solve the problem of extremely large routing tables, interdomain routing is used. Here,
networks are parts of autonomous systems (AS), which correspond to an administrative
domain. They are autonomous in that what happens inside an AS doesn't affect the outside.
Route propagation now uses a two-level hierarchy, with an interior gateway protocol to speak
to other nodes inside the AS, and an exterior gateway protocol (e.g., BGP is an Internet-wide
standard) to talk to neighbours (or peers) of the AS.

EGP (the exterior gateway protocol) was previously used on the Internet before being
replaced by BGP (border gateway protocol). It was designed for a tree-structured Internet, and
is concerned with reachability, not optimal routes. The protocol consists of messages for
neighbour acquisition (one router requests that another be its peer, and peers then exchange
reachability information), neighbour reachability (one router periodically tests whether another
is still reachable by exchanging HELLO/ACK messages, with a k-out-of-n rule used) and
routing updates (peers periodically exchange their distance-vector routing tables).

Constraints in routing are varied, and normally have to be programmed into routers manually.

BGP4 is the current version of the border gateway protocol used on the Internet. BGP4
distinguishes the following AS types:

• stub AS - has a single connection to one other AS, and carries local traffic only
• multihomed AS - has connections to more than one AS but only carries local traffic; it does
not allow transit between peers across its network
• transit AS - has connections to more than one AS, and carries both local and transit traffic

Each AS has one or more border routers, and one BGP speaker that advertises its local
networks, other reachable networks (in the case of transit AS) and gives path information.

IPv6

IPv6 features completely classless 128-bit addresses. The address contains host "subnet"
details and a specific address. It works around the fragmentation problem by using
"end-to-end" fragmentation: once sent, no further fragmentation can occur, as the minimum
MTU on the route is determined prior to the send (either by using path MTU discovery, or by
defaulting to a minimum MTU of 1280 bytes). It provides many services such as real-time
(using the priority field in the header), multicast and anycast (ensuring delivery to at least one
of a set of nodes). Security can also occur at the IP layer via authentication and an encrypted
payload.

The header is 40 bytes long, with extensions (fixed order, with a mostly fixed length). The
type of extensions are indicated by a "next header" field, and contain information such as
routing or fragmentation information.

Currently, IPv6 is implemented on top of the current IPv4 network using a method called
tunneling. Tunneling is a transmission of packets through an alien network (in this case
IPv4), but where both the sender and recipient speak a common protocol. The sender
encapsulates an IPv6 packet within an IPv4 packet and sends it, however fragmentation may
be necessary. It is effectively a software layer.

Transport Layer
The IP/network layers provide the upper layers with data independence from the underlying
switching/transmission technologies, which are used to establish, maintain and terminate
connections.

The transport (or host-to-host) layer (layer 4 in the OSI model) gives a reliable, transparent
transfer of data between end points and provides end-to-end error recovery and flow control,
this requires:

• Guaranteed message delivery


• Messages are delivered to applications in the same order in which they are sent
• At most one copy of a message is delivered
• Arbitrarily large messages are supported
• The receiver can flow control the sender

The transport layer relies on an underlying best-effort network (e.g., IP) that limits messages
to some finite size and is not perfect. The requirements of the transport layer exist to mask the
unreliability of the underlying network; that is, the transport layer essentially manages the
issues caused by lower-level routing.

Endpoints are identified by ports, and host:port pairs provide naming in the transport layer.
Servers have well known ports (see /etc/services on UNIX-based systems), which are
only interpreted on a host. This allows multiplexing amongst applications - the transport layer
routes a message to the correct service based on the destination port.

User Datagram Protocol

UDP, the user datagram protocol, is an unreliable and unordered protocol based on the
datagram idea. It extends the unreliable host-to-host service to a process-to-process level; it
has no flow control and a simple header format, with an optional checksum.

Both UDP and IP provide a connectionless unreliable interface, and the only thing UDP
offers IP is the idea of a port interface. However, UDP is service-oriented and removes the
need for the sender to know much about the receiver/network topology, which hosts should
not know for transparency reasons.
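
As a concrete illustration of ports, the following short example uses Python's standard socket
API to send a datagram to a bound port on the local machine; the port number 9999 is an
arbitrary choice for the example:

# A UDP datagram sent and received on the local machine using Python's
# standard socket API. Port 9999 is an arbitrary choice for the example.
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 9999))            # host:port names the endpoint

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", ("127.0.0.1", 9999))  # no connection set-up, no ACK

data, addr = receiver.recvfrom(1024)          # delivery is unreliable and unordered
print(data, addr)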

Transmission Control Protocol

TCP, the Transmission Control Protocol, provides a connection-oriented transport. It is full-
duplex and point-to-point and has flow control, to keep the sender from overrunning the
receiver, and congestion control, to keep the sender from overrunning the network.

TCP uses the concept of a reliable byte-stream. Applications read and write bytes, but TCP
sends and receives segments and the translation between the two is done using buffers. It has
the same port concept idea as UDP and is similar in concept to ATM.

TCP needs to address end-to-end issues, as it can potentially connect many different hosts, so
explicit connection establishment and termination is required. Additionally, it must deal with
problems that can be introduced at the network layer, such as potentially different round trip
times (RTT), so an adaptive timeout mechanism is needed; or long delays in the network, so
we need to be able to handle the arrival of very old packets; or different capacity at the
destination compared to the source or on the network, so we need to be able to deal with
congestion and capacity issues.

Connections in TCP are established by a three-way handshake. Initially, a server is listening
for client connection requests (CR); when a client issues a CR, the server checks whether a
service is listening on the selected port and sends an ACK if it is. The ACK indicates the next
data expected, so the client acknowledges the connection and synchronises sequence
numbers in the first data segment that is sent out.

The TCP model has a state diagram which indicates which particular state the communication
is in at any one time. In addition to opening the connection, connections are ended by either
end sending a segment with the FIN bit set in the header.

TCP considers flow control, which is important as both senders and receivers have finite
buffer space, and they process data at different rates. A sliding window protocol is used to
provide flow control. Here, a window of segments can be outstanding (sent but not yet
acknowledged) at any one time. It is possible to deal with out-of-order arrival of segments
and to have multiple outstanding (un-ACKed) segments within the window. The upper bound
on un-ACKed segments sets the window size, and this limits the amount of buffer space
required at the sender and receiver. Having a bigger window gives us more flexibility, but
increases the risk of more failures remaining undetected.

Flow control is implemented on the sender to bound failure detection, i.e., the case where a
message is not acknowledged sufficiently quickly. A sequence number is assigned to each
frame, and three state variables are maintained: SWS (send window size), the maximum
number of unacknowledged frames the sender can transmit; LAR, the sequence number of
the last acknowledgement received; and LFS, the sequence number of the last frame sent. We
must then maintain the invariant LFS - LAR ≤ SWS.

When an ACK arrives, we advance the LAR pointer, but we must retransmit the frame if an
ACK is not received within a given time. As we need to buffer up to SWS frames, this means
the transmission failure detection time is bounded.
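
A minimal sketch of the sender-side state and invariant might look like this in Python (the
window size and sequence numbers are invented; retransmission timers are omitted):

# Sender-side sketch of the invariant LFS - LAR <= SWS. The window size and
# sequence numbers are invented; retransmission timers are omitted.
SWS = 4
LAR = 0        # last acknowledgement received
LFS = 0        # last frame sent

def send_frame():
    global LFS
    if LFS - LAR < SWS:       # sending one more frame keeps LFS - LAR <= SWS
        LFS += 1
        return LFS            # sequence number of the frame just sent
    return None               # window full: must wait for an ACK

def ack_received(seq):
    global LAR
    LAR = max(LAR, seq)       # advance the LAR pointer

print([send_frame() for _ in range(6)])   # [1, 2, 3, 4, None, None]
ack_received(2)
print(send_frame(), send_frame())         # 5 6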

To support out-of-order receives and reject out-of-date messages, the receiver must also
implement flow control. Here, three state variables are maintained: the receive window size
(RWS); LFA, the sequence number of the largest frame acceptable; and LFR, the sequence
number of the last frame received in order. The invariant LFA - LFR ≤ RWS must be
maintained.

When a frame with sequence number S arrives, if LFR < S ≤ LFA then we accept it, else if
S ≤ LFR or S > LFA we discard it. Cumulative ACKs are sent, e.g., if we receive segments in
the order 7, 8, 6, then we wait until 6 arrives before acknowledging 7 and 8. The LFA does
define a limit, however, e.g., if 11 is received before any other packets, it may be rejected.
The buffer supports up to RWS frames.
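
The receiver-side rules, including the cumulative ACK behaviour in the 7, 8, 6 example
above, can be sketched as follows (again illustrative Python with invented numbers):

# Receiver-side sketch of the invariant LFA - LFR <= RWS with cumulative ACKs.
# The window size and sequence numbers are invented.
RWS = 4
LFR = 5            # last frame received in order
buffered = set()   # out-of-order frames held back

def receive(seq):
    global LFR
    LFA = LFR + RWS                  # largest frame acceptable
    if not (LFR < seq <= LFA):
        return None                  # outside the window: discard
    buffered.add(seq)
    while LFR + 1 in buffered:       # deliver any now-contiguous frames
        buffered.remove(LFR + 1)
        LFR += 1
    return LFR                       # cumulative ACK: everything up to LFR

print(receive(7))    # 5  - ACK stays at 5 because frame 6 is still missing
print(receive(6))    # 7  - frames 6 and 7 are acknowledged together
print(receive(20))   # None - beyond LFA, so discarded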

Flow control allows us to deal with errors. For example, if RWS = 1, then after an error
subsequent frames are discarded, so the sender must go back N frames and repeat them
(go-back-N); N is decided mainly by network delay. If RWS > 1, then after an error subsequent
frames are buffered, and the sender selectively repeats the missed frame (selective repeat).
This is a trade-off between the bandwidth required and the buffer space required; typically,
the larger the transmission delay and the more likely errors are, the larger the buffer that is
needed.

Sliding window and flow control is implemented in TCP using the Acknowledgement,
SequenceNum, AdvertisedWindow and Flags fields. The AdvertisedWindow field in the
header defines the window to be used for transmission.

ACK messages consist of the acknowledged sequence number in the Acknowledgement field,
and in the same message, a new AdvertisedWindow can be sent. The AdvertisedWindow tells
us how many more bytes can be sent than those already received/acknowledged.

A problem with TCP is its 32-bit sequence number, which is limited given current network
speeds - in gigabit networks it is only a matter of seconds until sequence numbers start to
repeat. We also need to ensure that the network is well utilised, which has implications on
window size: a 16-bit AdvertisedWindow limits the window to 64 kbytes, which may be much
smaller than the bandwidth-delay product of the link. To work around this, TCP has the
option of a scaling factor to cope.

A final problem to consider in TCP is that of retransmission (from the sender) of
unacknowledged segments after some timeout has expired. This requires a timer, but there is
the problem of knowing which value to use. A fixed value needs knowledge of network
behaviour and does not respond to changing network conditions - too small and the network
is flooded with retransmissions, too large and the connection stalls waiting to detect losses.
Another method is using an adaptive timer; however, it is still difficult to monitor RTT with
cumulative acknowledgements, and network conditions may change faster than we can adapt.

Adaptive retransmission does still tend to be used, however. One method involves using a
weighted average, where a SampleRTT is measured for each segment/ACK pair, and a
weighted average (basically a balance between actual and expected RTT) is computed:
EstRTTt + 1 = α × EstRTTt + β × SampleRTT, where α + β = 1. Typically, 0.8 < α < 0.9 and
0.1 < β < 0.2. Timeout is then set based on this RTT, and is typically twice the value.

This weighted average gives quite a large timeout when the network is relatively stable,
however. Another solution is the Jacobson/Karels algorithm, where the mean deviation of
actual RTTs from the estimated RTTs is also taken into account:
Diff = SampleRTT - EstRTT
EstRTT = EstRTT + (δ × Diff)
Dev = Dev + δ (|Diff| - Dev)

Where δ is a factor between 0 and 1.

Variance is then considered when setting a timeout value: Timeout = μ × EstRTT + φ × Dev,
where μ = 1 and φ = 4.

This algorithm is only as good as the granularity of the clock (500 ms on UNIX), and μ, φ and
δ are used to control the speed of response and the ability to adapt.
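
A small Python sketch of the Jacobson/Karels calculation is given below; the constants
δ = 1/8, μ = 1 and φ = 4 are typical values (the notes only constrain δ to lie between 0 and 1),
and the RTT samples are invented:

# Jacobson/Karels timeout sketch with typical constants delta = 1/8, mu = 1
# and phi = 4. The initial estimate and the RTT samples are invented.
est_rtt, dev = 100.0, 0.0          # milliseconds
DELTA, MU, PHI = 0.125, 1, 4

def update(sample_rtt):
    global est_rtt, dev
    diff = sample_rtt - est_rtt
    est_rtt += DELTA * diff
    dev += DELTA * (abs(diff) - dev)
    return MU * est_rtt + PHI * dev          # the retransmission timeout

for sample in (110, 90, 300, 120):           # the 300 ms spike inflates the deviation
    print(round(update(sample), 1))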

The Karn/Partridge algorithm tells us not to sample the RTT when we retransmit, as the
subsequent ACK may be from the original transmission, just late, so the timeout value
becomes distorted. In this algorithm, the timeout is multiplied by a constant after each
retransmission (often 2), which progressively gives the network longer to get a message
through. When an ACK is received for a segment that has not been retransmitted, Jacobson's
algorithm is used to set a timeout.

Name Services
Distributed systems often use names to refer to different resources (computers, services,
remote objects, files, devices, etc). A consistent naming of resources is necessary for
distributed systems - you need to be able to refer to the same resource consistently from
different places in the system over a period of time. Additionally, name conflicts need to be
avoided and attributes may be used to infer a name, so attributes of a required service may be
enough to name a server that needs to be contacted.

Naming is an issue at each level of the OSI and TCP/IP protocol stacks. It provides an
abstraction from the underlying technology, e.g., physical address at data link layer, IP
address at network layer, host:port tuple at transport layer.

A name is resolved when it is mapped to the named entity - this mapping is termed a binding,
and it identifies an IP address, a type of resource and other entity specific attributes (e.g., the
length of time a mapping will remain valid).

Names require nested resolution - a web URL may be mapped to an IP:port tuple, which is
then mapped to a physical address when it reaches the correct physical network, and then the
file name part of the URL is mapped to file blocks by the file system. Each layer must be able
to cope with mappings that change, however.

Name Spaces

(Note: this bit is wrong, but it is what the lecture slides say, so use it for the exam.)

URLs (uniform resource locators) are direct links to hosts and IPs (e.g., pisa.cs.york.ac.uk)
and can scale to a limitless set of web resources. They do have the problem of dangling
references, however - if a resource moves, the URL is invalid.

URNs (uniform resource names) solve the dangling link problem, as URNs persist, even if
the resource moves. The resource owner registers it with a URN lookup service, which
provides the current URL when asked - if a URL changes, the owner must inform the URN
lookup service. URNs identify a namespace and a resource, e.g., urn:cs.york.ac.uk/RT2001-2
identifies the resource RT2001-2 in the namespace cs.york.ac.uk. URNs also identify a
scheme, e.g., http, ftp, etc...

A name space is a collection of all valid names in a given context, e.g., for a file system, the
context might be a directory and the valid names are strings that follow a certain pattern (e.g.,
printable characters no longer than 255 characters). A valid name does not necessarily map to
an existing entity (e.g., a file with a given name may not necessarily exist in that directory).

Flat name spaces (i.e., where everything is in a single context) are usually finite, but
hierarchical namespaces can be infinite by allowing recursive contexts.

Aliases also allow greater transparency - www.cs.york.ac.uk maps to pisa.cs.york.ac.uk, but
the web server can be moved to turin.cs.york.ac.uk easily if need be, as only the mapping
inside the cs.york.ac.uk network needs to be changed.

Name Resolutions

Name resolution is an iterative process used to map names to primitive attributes, or to derive
a name to be passed on to another name service (e.g., an aliased name may first be translated
to another name, then to a reference for the actual resource). Aliases can lead to circularities,
so it is often necessary to implement around cyclic lookups.

The name service for a large number of names should not be stored on a single computer, as
this is a bottleneck as well as a single point of failure.

Partitioning of a name service requires a structure so that names can be resolved efficiently.
The process of finding naming data within the split name service is called navigation. In
iterative navigation, a client presents a name to NS1, which resolves all or part of the name
and suggests a further nameserver that can help, if required.

To resolve a name in the DNS, e.g., milan.cs.york.ac.uk, we first contact the root
nameservers which gives the address of the nameserver for .uk, which in turn gives us the
address for the nameserver for .ac.uk, which gives us the main York nameserver for
york.ac.uk, which we can ask for the cs.york.ac.uk nameserver, which will finally give us the
address of milan.
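
The chain of referrals can be mimicked with a small Python sketch in which the client follows
each referral itself (iterative navigation); the zone data and the final address are invented
purely for illustration - a real resolver would send queries to the servers named in NS records:

# Iterative resolution sketch: the client follows each referral itself. The
# zone data and the final address are invented; a real resolver would query
# the servers named in NS records over the network.
zones = {
    "root":              {"uk": "ns.uk"},
    "ns.uk":             {"ac.uk": "ns.ac.uk"},
    "ns.ac.uk":          {"york.ac.uk": "ns.york.ac.uk"},
    "ns.york.ac.uk":     {"cs.york.ac.uk": "ns.cs.york.ac.uk"},
    "ns.cs.york.ac.uk":  {"milan.cs.york.ac.uk": "144.32.1.1"},   # invented address
}

def resolve(name, server="root"):
    for suffix, answer in zones[server].items():
        if name == suffix:
            return answer                    # final (authoritative) answer
        if name.endswith("." + suffix):
            return resolve(name, answer)     # referral to a closer nameserver
    raise LookupError(name)

print(resolve("milan.cs.york.ac.uk"))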

This co-ordinated resolution can involve recursive or non-recursive navigation. Unlike
iterative navigation, which is client controlled, this is server controlled. In non-recursive
navigation, the root nameserver does all of the lookups itself, whereas in a recursive setup
each server contacts the upstream servers and passes the results back.

The DNS name space is partitioned organisationally and geographically, e.g.,
milan.cs.york.ac.uk identifies the server milan in the Computer Science department of the
University of York, which is an academic institution in the UK. Components are separated by
dots and names are case-insensitive.

DNS supports host name resolution (A records for IPv4 and AAAA records for IPv6), mail
host locations (MX records), aliases (CNAME records), reverse lookups (IPs to hostnames
with PTR records), etc, and typically uses the iterative lookup mechanism.

Most DNS servers use caching to increase performance, which also helps the availability of
name service data if a nameserver crashes. DNS records are accompanied by a TTL (time to
live), which tells the cache how long to hold the record before refreshing it. Responses served
from caches are called "non-authoritative", as they may be out of date. An authoritative server
is the name server that actually manages the zone records for that domain name, and this
server needs to manage the TTL intelligently to balance load against accuracy.

DNS clients are called resolvers and are often implemented in the OS as a library function.

Service Discovery

Software such as Jini provides facilities for service discovery, and are called lookup services.
Lookup services allow servers to register the services they provide, and then clients query the
lookup service for the required service. Infrastructure then provides an appropriate interface
to a server via remote method invocation (RMI).

Lookup services should also be able to control the services that different users (e.g., visitors)
can access.

As an example, suppose a client requires a lookup service in the finance group: it sends out a
broadcast, to which the lookup service for that group responds. The client then communicates
with the lookup service to request printing. As only one printing service is registered for the
finance group, its address is given to the client for direct communication with the printing
service.

Distributed Systems
No agreement has been reached on the exact definition of a distributed system, although
some definitions include:

• Tanenbaum - "A distributed system is a collection of independent computers that appears to
its users as a single coherent system"
• Coulouris et. al - A distributed system is "one in which hardware or software components
located at networked computers communicate and co-ordinate their actions by passing
messages"
• Lamport: "A distributed system is one that stops you getting any work done when a machine
you've never even heard of crashes"

Enslow has produced a means of categorising distributed systems:

• Concurrency (or parallelism) - concurrent program execution is normal, and control and co-
ordination of concurrent access to resources is important
• No global time - co-operation between programs on different computers is via message
passing. Whilst close co-ordination demands a shared idea of time between computers,
there are limits to accuracy of clock synchronisation achievable
• Independent failures - computers, networks and resources in distributed systems can fail
independently leading to network partitioning

Challenges of Distributed Systems

Heterogeneity - Networks require standard communications protocols; hardware may vary
(data types, e.g., the length of an int may be implemented differently, or big endian vs. little
endian); OSs may provide varying implementations of network protocols; and programming
languages may not be implemented on some architectures (common languages, e.g., Java and
C, tend to be used).

Openness - This determines whether or not the system can be extended or partially re-
implemented. Open distributed systems enable new services (by adding new hardware and
software) to be added for use by clients. Documentation of the interfaces is required for this.
It is hard to design and support a system if you have little knowledge or control over how it is
used or modified.

Security - Many services have high value to users. Often, secure data is passed across public
networks. There is a need to ensure the concealment of the content of messages and that the
identity of the client can be determined by the server (and vice versa). Common security
problems include denial-of-service attacks and the security of mobile code.

Scalability - Requires that the system is still usable if many more
computers/resources/services are added. Scalability requires the system to control the cost of
physical resources, control performance loss, prevent software resources running out and
avoid performance bottlenecks.
Failure handling - Network partitions and computer systems fail producing incorrect, or no
results. You need to detect, mask, tolerate and recover from failures. Requirements can
include Mean Time Between Failures (MTBF), availability and amount of unscheduled
maintenance.

Concurrency - Services and applications provide resources that are shared in distributed
systems, hence shared resources can be accessed simultaneously. Operations must be
synchronised to maintain data consistency.

Transparency - Concealment from the users and applications of the separation of components
of a distributed system. We can consider different forms of transparency.

Transparency   Description

Access         Hide differences in data representation and how a resource is accessed
               (network transparency)
Location       Hide where a resource is located (network transparency)
Migration      Hide that a resource may move to another location (distribution transparency)
Relocation     Hide that a resource may be moved to another location whilst in use
               (distribution transparency)
Replication    Hide that a service may be replicated for fault tolerance and performance
               (distribution transparency)
Concurrency    Hide that a resource may be shared by several competitive users
               (distribution transparency)
Failure        Hide the failure and recovery of a resource (distribution transparency)
Persistence    Hide whether a software resource is in memory or on disk
               (distribution transparency)

Distributed File Systems


One of the key motivations for distributed systems is the sharing of information. Information
on the Internet can be shared by the web using HTTP and browsers, however information
sharing in LANs tends to have more specific requirements, such as persistent storage,
distributed access to consistent data, and this isn't met by the Internet (no guaranteed
consistency).

Distributed file systems provide the functionality of a non-distributed file system for clients
on a distributed platform (see File Systems in OPS).

Like standard file systems, the user requirements for distributed file systems are:

• be able to create, delete, read and modify files


• controlled access to other users' files
• move data between files
• back up and recover the user's files in case of damage
• file access via symbolic names

The objectives of a distributed file system are:

• meet the data management needs of the user


• guarantee that the data in the file is valid
• minimise potential for lost or destroyed data
• provide performance similar to local file systems (often requires caching)
• provide reliability similar to local file systems (replication may be required)
• concurrent file updates (similar concerns to a local file system)

The distributed file system must provide a number of functions, such as the ability to locate
and identify a selected file, to organise files into a structure (directories, etc) and to provide
mechanisms for protection.

To achieve these objectives, we require transparency:

• access transparency (uniform API, ideally identical to the local file system)
• location transparency (uniform namespace from all clients, remote files should have a
similar naming scheme to local ones)
• migration transparency (no indication of location in the file name)
• scalability (no bound on the size of the file system)

Distributed file systems typically have a directory service and a flat file service that both
export RPC interfaces to the client providing access to files.

The flat file service performs operations on files on the server side. All files are identified by
a UFID (unique file ID) - a flat, non-hierarchical naming scheme (i.e., an exposed interface).
The service maintains a single-level, flat directory to map the UFID to a file location (which
also maintains a file map of logical block number to physical block address), and provides
read, write, create and get/set attribute functions. Compared with the UNIX file service,
the flat file service has no open/close operations (you just quote the UFID), and to read/write,
an explicit location in the file is all that is needed (no seek is required). Operations (except
create) leave the UFID unchanged.

The directory service provides a mapping between paths/names and UFIDs. It provides
some directory functionality (e.g., create/delete directories, get file handles for sub-
directories) and allows server-side files to appear and be manipulated in a hierarchical fashion
by the client. It also defines lookup operations on the elements of the path or the filename.
The lookup operation takes a UFID for a directory and a text string and returns the UFID of
the file corresponding to the string. The client module recursively calls lookup on the
pathname elements (similar to an iterative nameserver). Sequential accesses can be sped up
using previously resolved information.
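
A sketch of the client module resolving a path by repeated lookup() calls might look like this
(the UFIDs and directory contents are invented; each lookup() stands in for one
directory-service RPC):

# Path resolution by repeated lookup() calls on the directory service. The
# UFIDs and directory contents are invented; each lookup() stands in for one
# directory-service RPC.
directories = {
    "ufid-root":    {"home": "ufid-home"},
    "ufid-home":    {"norbert": "ufid-norbert"},
    "ufid-norbert": {"notes.txt": "ufid-1234"},
}

def lookup(dir_ufid, name):
    return directories[dir_ufid][name]

def resolve_path(path, root_ufid="ufid-root"):
    ufid = root_ufid
    for element in path.strip("/").split("/"):
        ufid = lookup(ufid, element)     # one element at a time, like an iterative nameserver
    return ufid                          # UFID to hand to the flat file service

print(resolve_path("/home/norbert/notes.txt"))   # ufid-1234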

The client module integrates and extends the flat file service and directory service under a
single API. The API is usable by local applications, and can represent most forms of file
systems, including UNIX hierarchial trees. It also manages information about the network
location of flat file and directory service processes.

Caching

Performing all file operations on a remote server increases network traffic and decreases
performance because of network delays, so recently used data or directories are cached locally
for repeated access. The size of a cache unit is normally the same as the block unit for
transfers, as larger block sizes mean lower network overheads.

The cache location is either on the client or the server. If we cache on the client side, the local
disc tends to be non-volatile and large, and main memory is faster but smaller and volatile;
however, both make it difficult to maintain consistency for writes whilst keeping network
load low. If we cache on the server side in memory, there is faster I/O, no consistency
problem and typically more memory available than in a client, but it is volatile and network
load increases.

It is difficult to maintain file consistency when caching, and there are three cache flush
policies: write-through to server (information is written to the cache and the server in parallel,
which is reliable but has poor write performance), delayed write (periodically, or when a
cached block is discarded by the cache replacement algorithm, multiple blocks are bulk
written, but there is ambiguity for other file readers) and write-on-close (but there is a
problem if files are open for a long time). The choice depends on the importance of
consistency and on usage patterns.

Sun NFS

The Sun NFS is the most famous implementation of a network file system and consists of a
collection of protocols that provide clients with a model of a distributed file system and is
based upon RPC. It provides one-copy semantics for consistency, where updates are written
to the single copy and are available immediately, and is similar to a local file system, based
on the concept of delegation. It does have to cope with network, server and client failures.

Access to the server's files is via the NFS server and the flat file server. The virtual file system
helps map to the actual UNIX file system. NFS allows transparent access to the remote
filesystem, where the server exports all or part of a filesystem, and the client mounts all or
part of it into the client's filesystem.

Name resolution becomes iterative, which increases network traffic. In some ways, it would
be better to have recursive name resolution, but this has a higher load on the server.

NFS implements server-side caching with read-ahead and write-behind caching in the main
memory. Client-side caching works by each client having a memory cache containing data
previously read from a file, and the server may delegate some rights to a client. This
delegation may be recalled if another machine contacts the server to access the file. The
server must keep track of which file it has delegated, and to whom.

Cache validation checks whether the client copy is still valid (e.g., whether or not the TTL
has expired), or whether the server copy has been updated. This is quicker than updating the
file. Modification and validation timestamps are used, along with a freshness interval, which
is set adaptively depending on the frequency of file updates. This could be 3-30 seconds for
files, or 30-60 seconds for directories.

Synchronisation of files is difficult in distributed file systems if caching is used. To solve
this, we have multiple options, such as propagating all changes to the server immediately
(which is inefficient), or using session semantics (changes to open files are visible to the
modifying process immediately, and only when the file is closed are the changes made
available to others). NFS adopts session semantics, so consistency can be a problem.
Concurrent access to files does remain a problem, and issues include preventing more than
one simultaneous attempt to write to a file, so file locks can be used.

NFS was developed on SunRPC, which uses at-least-once semantics, so NFS implements a
duplicate-request cache, so each RPC contains a unique identifier which is cached. If a call is
still in progress when a duplicate request arrives, it is ignored, which gives a combination
similar to at-most-once semantics. The difference is the acknowledgement the client has
when the result has been received.

File locks can be used, which allow one writer and multiple concurrent readers. To lock a
file, special lock commands can be used. However, we need to consider what happens if a
client crashes whilst holding a lock, so the server issues a lease on each lock, and when it
expires the lock is removed automatically by the server. A renew() command exists which
requests that the lock is renewed. However, if the server crashes, the server will probably lose
information on locks granted to files, but when it reboots, the server will enter a grace period
where clients can reclaim locks previously granted. During the grace period, only claims for
previous locks will be accepted - i.e., no new locks. The leasing approach does require the
client and the server to have reasonably synchronised clocks.

Delegation also causes problems when a failure occurs and a cache isn't written back - full
recovery is impossible unless the cache is persistent. NFS places responsibility on the
client for providing file recovery (which is very difficult). If the server crashes, delegations
are allowed to be reclaimed during the grace period. In general, the mechanisms used for
caching are highly dependent on the needs of the system and future characteristics.

We can compare the criteria for distributed file systems to Sun NFS:

• Access Transparency - The UNIX API provides the same interface as the VFS
• Location Transparency - NFS does not enforce a network-wide namespace. Therefore,
different machines may mount different combinations of remote filesystems.
• Migration transparency - Not fully achieved in NFS as mount tables must change if file
changes filesystem
• Scalability - Yes
• File Replication - Not supported
• Fault Tolerance - Similar to local files, but some differences due to distribution
• Consistency - One-copy semantics

Mobile Systems
Mobility is becoming increasingly important in the world of networking and distributed
systems for both interaction and connection. It is also important to stay connected whilst
moving (e.g., a laptop moving between WLAN segments).

We can consider mobile systems to be stand alone devices that are able to move at the
physical level, and thus require wireless communications. At the network level we want to
move and maintain a network connection (mobility transparency). At the application level,
code and data can move, so we need to have dynamic download of code as the application
needs it, and we also need to consider mobile agents which "travel" around the network
which introduces security concerns.

Mobility has an impact on the OSI and TCP/IP stacks. Previously, we have assumed that
hosts connect and remain at fixed points in the network, so now we have to consider
fragmentation and reassembly between layers, and naming and routing at each layer. Naming
involves using both network-independent IP addresses and physical, network-dependent
addresses, and routing happens by considering IP datagrams that are routed across networks
by routers and then using bridges and switches to route within the physical network.

Wireless communication can range from very simple ad-hoc networks, where lots of low-
resource devices work together to a more complex structured network, which have been
standardised by some body (normally the IEEE). Some wireless communication standards
include:

IrDA

IrDA is an infrared based standard started in 1993 based upon that used by TV remote
controls. It is of limited range and of line of sight only.

The standard provides a data link layer, which is essentially a wireless serial link ranging in
speed from 9600 bps to 4 Mbps. The key motivation behind IrDA was to "provide a low-cost,
low-power, half-duplex, serial data interconnection standard that supports a walk-up, point-
to-point user model that is adaptable to a wide range of applications and devices" - that is, it
provides a convenient wireless point-to-point link. Many protocols are available to run on
IrDA.

Bluetooth

Bluetooth is meant to unite the worlds of computers and telecoms. It is based in the ISM 2.4
GHz RF band with ranges from 10-100 m. It defines its own protocol stack which doesn't
map on to either the OSI nor TCP/IP model. It can cope with devices moving in and out of
range by broadcasting periodic "I am here" messages to establish neighbours.

Non-standard protocols need to be used to access a LAN, Bluetooth implementations are
often not standardised across manufacturers, and problems can occur implementing it.

Wireless Ethernet

The wireless Ethernet standard is defined in IEEE 802.11a, b and g, with a and b being
incompatible, but g is backwards compatible with b. In IEEE 802.11b, devices can
communicate at up to 11 Mbps over distances of up to 150 m and can communicate both in
an ad-hoc/peer-to-peer system directly with other devices or in a more structured way with an
access point.

Wireless Ethernet differs from standard IEEE 802.3 Ethernet in the implementation of MAC
and the CSMA/CD mechanism. There are still equal opportunities for all to transmit, but the
MAC controls use of the transmission channel, with both added security (WEP/WPA) and
slot conditions. Additionally, a fairly consistent signal strength is required through the
network to detect interference and collisions (i.e., another sender), and there are more failure
conditions than in wired LANs.

Hidden stations occur when there is a failure to detect that another station is transmitting.
Additional problems include fading, where a signal may not be heard by all nodes in the
network but is by the base station. Additionally, collision detection is not effective in
wireless, as multiple transmitters may produce garbled messages. As the frequency band
used by wireless Ethernet is ISM, it is shared with devices such as Bluetooth and microwave
ovens, so interference can occur.

MAC is augmented by slot reservation. Before sending, the receiver agrees a transmission
slot and when you are ready to transmit, you sense the medium, and if there is no signal, then
the medium is either available, or there are out of range stations requesting a slot, or using
previously reserved slots. To request a slot, the sender sends a request to send (RTS) packet
with the duration, and the receiver acknowledges this with a clear to send (CTS) packet with
the duration. All stations in the range of the sender and receiver therefore know about the
slot, and the sender can send with a minimal risk of collision.

This gives us a mechanism called CSMA/CA (collision avoidance), but there are still
problems with nodes moving.

Mobile IP

Many devices connect at different locations so do not want a static IP address allocated.
DHCP (Dynamic Host Configuration Protocol) enables a temporary IP address to be
acquired, but there are still difficulties if you want to stay connected whilst moving between
networks.

IP routing is based on subnets at fixed locations; however, mobile IP is based on the idea of
allocating an address in the "home" domain. When hosts move, routing becomes more
complex. A home agent (HA) in the home domain and a foreign agent (FA) in the current
domain perform the routing: the home domain continues to route for the mobile host, while
the foreign domain takes over hosting it.

When a mobile host arrives at a new site, it informs the foreign agent and a temporary "care-
of" address is allocated. The FA then contacts the HA of the host with the new "care-of"
address. The sender communicates with the HA initially and then it is told the "care-of"
address. "Care-of" addresses do timeout.
In the router of the HA, a mobility binding table is maintained on the home agent of the
mobile node, and the mobile nodes home address is mapped to its current care-of address
along with a lifetime. The FAs maintain a visitor list which maps the home address and HA
address to the MAC address, along with the lifetime.

HAs then use IP tunneling and encapsulate all packets addressed to the mobile node and
forward them to the FA. The FAs then decapsulate all packets addressed to them and
forward them to the correct node using the hardware lookup table. Mobile nodes can often
perform FA functions themselves if they receive an IP address (e.g., via DHCP). Bidirectional
communication will require tunneling in each direction - this is analogous to IPv6 over IPv4
networks.
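
A minimal sketch of the home agent's side of this - a binding table plus encapsulation towards
the care-of address - is given below; the addresses, lifetime and packet representation are
invented, and no real encapsulation format is implied:

# Home agent sketch: a mobility binding table plus forwarding towards the
# care-of address. Addresses, the lifetime and the packet representation are
# invented; no real encapsulation format is implied.
import time

binding_table = {}   # home address -> (care-of address, expiry time)

def register(home_addr, care_of, lifetime):
    binding_table[home_addr] = (care_of, time.time() + lifetime)

def forward(packet):
    entry = binding_table.get(packet["dst"])
    if entry and time.time() < entry[1]:
        care_of, _ = entry
        # Tunnel: the original packet becomes the payload of a new packet
        # addressed to the foreign agent's care-of address.
        return {"dst": care_of, "payload": packet}
    return packet    # not a mobile node we know about: deliver normally

register("144.32.40.10", care_of="10.0.0.7", lifetime=300)
print(forward({"dst": "144.32.40.10", "data": "hello"}))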

Mobile IP implements security using authentication. All parties can authenticate, but the bare
minimum is authentication between the mobile node and the HA. Authentication uses MD5 (a
128-bit digest of the message) for added security. To protect against replay attacks,
timestamps are mandatory, and sequence numbers must be unique. Unique random numbers
are also generated in requests to join the network, which must be returned in the reply. The
HA and FA do not need to share any security information.

Mobile IP has the problem of using suboptimal triangle routing, where if the mobile node is
in the same sub-network as the node in which it is communicating, it could be very slow to
route via the HA. It would be an improvement if packets can be directly routed. To do this,
foreign agents let the corresponding node know the care-of address of the mobile node, and
the corresponding node can create its own tunnel to the mobile node. This does require
special software, however. The HA must initiate it with the corresponding node via a
"binding update", and binding tables can become stale.

Other problems with mobile IP include redundancy problems (single HAs are fragile - so we
have multiple), and frequent reports to the HA if the mobile node is moving, of which a
possible solution is to cluster FAs. Security is also always a concern when data travels over
untrusted networks.

IPv6 implements mobile IP in the standard, so FAs are not needed, whereas mobile IPv4 is a
set of extensions that must be supported by all nodes to work. With mobile IPv6, mobile
nodes can function in any location without the services of a special router, as IPv6 supports
stateless address auto-configuration (so a DHCP server is not required). Home agents do still
need to be told by the mobile node where it is. Additionally, with security, nodes are expected
to employ strong authentication and encryption.

Voice over IP

Originally, data and phone networks were separate, each having different requirements. As
networks grew, a greater need to carry data over telecom networks developed, and the issue
was how to carry data over telecom networks. Currently, data makes up the majority of
telecom traffic, so new telecom networks being installed are data networks, and the issue
shifts to how to carry voice over data networks - the solution being to encode voice as data.

VoIP has many issues to be considered, such as user location, setting up a connection and
real-time and QoS (quality of service) issues. Provision of QoS is difficult, as it is difficult to
map the users' view of quality into the network stack - arguably no layers implement QoS,
although some may provide a higher quality service. QoS is difficult to achieve with a limited
network capacity, and all parts of the network must cooperate to provide QoS.

An implementation of VoIP is SIP - the session initiation protocol - which provides a general
means of negotiating a connection. SIP URLs act as locators, and users register with a local
server, which permeates outwards to location servers and proxy servers so that callers can be
routed to the correct hosts. Proxies then communicate together and route data onwards;
however, they must keep an up-to-date cache of these locations to do this.

The actual transmission protocol used for the message data (i.e., the voice) is RTP, which
provides some QoS. RTP is a transport-layer protocol running on top of UDP. RTP headers
include timing information which allows applications to specify acceptable loss rates and
react accordingly, e.g., change the encoding rate to reduce traffic.

RTP does rely on a network that is capable of delivering packets on time, and to maintain
timeliness, we degrade the quality delivered. Networks are generally poor at delivering
packets on time, so prioritisation ("traffic shaping") and reservation of bandwidth are usual
methods for dealing with the problem (e.g., ATM). Variability in communication times can
be significant and hence cause problems.

Both SIP and mobile IP have some similar functions: SIP provides personal mobility (SIP
URLs are personal, e.g., sip:[email protected]), and mobile IP provides terminal
mobility (the ability to move devices between networks). Both SIP and mobile IP are
important, and it remains to be seen how they evolve together.

Time and Clock Synchronisation


One of the characteristics for distributed systems includes a lack of global time, that is, it is
difficult for two machines to agree what the current time is.

Time is often required to measure delays between distributed components, synchronise
streams (e.g., sound and vision), detect event ordering for causal analysis, and many utilities
and applications use timestamps (e.g., in SSL).

Most machines have an idea of local time, where inaccuracies don't matter if the time is only
used locally. Most local time services are implemented using a quartz clock that oscillates
and decrements a counter. When the counter reaches zero, an interrupt is triggered and the
counter is reset, so the reset value determines the interrupt rate. An interrupt handler then
updates a software clock, which the OS provides access to.

At any instance in a distributed system, however, independent clocks from a number of
machines will be different (even if they were synchronised initially), due to crystal
frequencies having a natural drift with temperature. This difference in readings is called clock
skew, and a typical drift rate is 1 in 10^6 (i.e., 10^-6), or about 1 second in 11.6 days.

Universal Coordinated Time

The key time source is universal co-ordinated time (UTC), which is based on atomic clocks,
but leap seconds are inserted to keep in phase with astronomic time. Radio stations (MSF in
the UK) broadcast UTC as an encoded time signal. Atmospheric disturbances mean that land-
based UTC sources have an accuracy of ± 10 msec. GEOS (Geostationary Environment
Operation Satellite) or GPS signals provide UTC to ± 0.5 msec. To support these types of
service, radio or GPS receivers are required on the machine to receive the signals.

Clocks can either be synchronised internally or externally. External synchronisation is when
clocks are synchronised to a time source outside the distributed system, to within some bound
D; internal synchronisation is when clocks are synchronised relative to each other. However,
this does not mean they are necessarily correctly synchronised to a universal time source, just
to each other.

Synchronous Systems

Synchronous systems have known bounds on the clock drift rate, the maximum message
transmission delay, and the time to execute each step of a process. In theory, a process sends
a message at local time t, and the receiver gets the message at time t + T, where T is the time
to send a message; the receiver sets its clock from this and the two clocks are now
synchronised.

However, T is very difficult to know, but we can assume a maximum and a minimum. If we
set the clock to t + min or t + max, then skew can be up to u, where u = max - min. If we set
the clock to t + (max + min)/2, then skew is at most u/2. The best achievable skew is
u(1 - 1/N) when synchronising N clocks.

Asynchronous Systems

In asynchronous systems, there is no maximum message delay, but round-trip times between
pairs of processes are often short. Cristian's algorithm estimates the message propagation
time as p = (T1 - T0)/2, and the clock is then set to the received UTC time + p. To increase
accuracy, we can measure T1 - T0 over a number of transactions and discard any that are
over a threshold as being subject to excessive delay, or take the minimum values as the most
accurate.

Here, we have the problems of the time server being a single point of failure and a bottleneck,
and of spurious time server values. The algorithm also assumes that the server replies
instantly, and that transmission in both directions takes the same time.
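
A minimal sketch of this estimate in Python (the request_server_time() helper is an assumption standing in for a real request/reply exchange, not part of any standard library):

import time

def cristian_offset(request_server_time, samples=5):
    """Estimate the correction to apply to the local clock using Cristian's
    idea: assume the reply took half of the measured round trip, and prefer
    the sample with the smallest round trip as the most accurate."""
    best = None
    for _ in range(samples):
        t0 = time.time()                     # local time when the request is sent
        server_time = request_server_time()  # server's reading of UTC (assumed helper)
        t1 = time.time()                     # local time when the reply arrives
        round_trip = t1 - t0
        if best is None or round_trip < best[0]:
            best = (round_trip, server_time, t1)
    round_trip, server_time, t1 = best
    estimated_now = server_time + round_trip / 2   # p = (T1 - T0) / 2
    return estimated_now - t1                      # amount to add to the local clock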

The Berkeley Algorithm was developed for Berkeley UNIX (now BSD). A co-ordinator is
chosen as the master, and slaves are periodically polled to query their clocks. The master then
estimates the slaves' local times, compensating for propagation delay. An average time is then
calculated, ignoring obviously invalid times, and a message is sent to each slave indicating its
clock adjustment. Here, synchronisation is feasible to within 20-25 msec for 15 computers
with a drift rate of 2 × 10^-5 and a round trip delay of 10 msec.
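
A sketch of the master's averaging step in the Berkeley algorithm (the outlier threshold and the dictionary layout are illustrative assumptions):

def berkeley_adjustments(master_time, slave_times, max_skew=10.0):
    """master_time is the master's clock reading; slave_times maps each slave
    to its reported time, already compensated for propagation delay. Returns
    the adjustment each machine (including the master) should apply."""
    readings = {'master': master_time}
    readings.update(slave_times)
    # ignore obviously invalid times: readings too far from the master's
    valid = [t for t in readings.values() if abs(t - master_time) <= max_skew]
    average = sum(valid) / len(valid)
    # each machine is told how much to add to (or subtract from) its clock
    return {machine: average - t for machine, t in readings.items()}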

Network Time Protocol

Cristian's and the Berkeley algorithms were designed for use in LANs. Synchronisation
across the Internet is more problematic. The Network Time Protocol (NTP) was created with
the following design aims:

• To provide a service enabling clients across the Internet to be synchronised accurately to
UTC
• Provide a reliable service that can survive lengthy losses of connectivity
• Enable clients to resynchronise sufficiently frequently to offset the drift rates found in most
computers
• Provide protection against interference with the time service, whether malicious or
accidental

NTP is implemented with multiple time servers across the Internet. Primary servers (stratum
1) are connected directly to UTC receivers, and secondary servers (stratum 2) synchronise
with the primaries. Lower stratum servers synchronise with the higher strata, where each
level is less accurate than the last.

NTP scales easily to large numbers of servers and clients, and can easily cope with failures of
servers (if a primary UTC server fails, it becomes a secondary).

There are different synchronisation modes for NTP:

• Multicast - one or more servers periodically multicast to other servers on a high-speed
LAN; this sets clocks assuming some small delay and isn't very accurate
• Procedure call mode - similar to Cristian's algorithm, a client requests the time from a few
other servers; this is used where there is no multicast, or where higher accuracy is needed
• Symmetric protocol - used by master servers on LANs and the layers closest to the
primaries; it is based on pairwise synchronisation and gives very high accuracy

If we do discover that our clock is fast, we can't run it backwards without weird things
happening, however. The normal solution is to run it slow (e.g., interrupts are generated
every 10ms, so only 9ms is added to the clock) for some period to allow normal time to catch
up.

Coordination
Concurrent processes in a distributed system need to coordinate actions to maintain the
illusion of the distributed system as one machine.

We can make the assumption here that processes are connected via reliable channels
(eventually, all sent messages are received). If failures do occur, then these are masked by the
underlying reliable communication protocols. We can also assume that alternative routes for
messages in a network exist, so there is no single point of failure within the network.

Clocks that are internally synchronised are termed logical clocks. Internal synchronisation is
often useful as it is more realistic to implement, and the order in which events occur is more
important than the actual time at which they occurred.

Happens-before

This can be characterised with Lamport's happens-before relation: a → b, where event a
happens before event b. If a and b are from the same process, and a occurs before b, then a →
b is true. If a is the event of a message being sent by one process and b is the event of the
same message being received by another process, then a → b is also true.

Happens-before is a transitive relationship, and concurrent events have no relation (if a and b
occur at the same time in processes that do not exchange messages (even indirectly), then
neither a → b nor b → a is true).

To assert a happens-before relationship, timestamps are needed on events. All processes in
the distributed system must agree on the time for the event, so a method is needed that copes
with (logical) clocks running at different speeds on different machines.

One common implementation is for Li (the logical clock on process Pi) to be incremented
before each event issued by Pi. When Pi sends a message m, it attaches Li, forming a tuple
(m, Li). When receiving a message (m, t), Pj computes Lj = max(t, Lj) and then sends a
receive message to the originating process with the new timestamp.

Given that different events could be given identical timestamps, a total ordering can be
achieved by assigning each process a unique ID.
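
A sketch of a Lamport logical clock in Python, with the process ID used as the tie-break that gives the total ordering just described (the class and method names are my own):

class LamportClock:
    def __init__(self, process_id):
        self.pid = process_id
        self.time = 0

    def local_event(self):
        self.time += 1                      # increment before each event
        return (self.time, self.pid)        # (timestamp, pid) pairs are totally ordered

    def send(self, message):
        self.time += 1
        return (message, self.time)         # piggyback the timestamp on the message

    def receive(self, message, sender_time):
        # take the maximum of the local and piggybacked clocks, then tick,
        # since the receive is itself an event
        self.time = max(self.time, sender_time) + 1
        return (self.time, self.pid)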

Often to implement consistency in a replicated distributed system, we can use totally ordered
multicasts. Totally ordered multicasts are implemented by acknowledging messages (other
methods exist, but this is the simplest way). Everyone in the system sees every message and
every acknowledgement - this is obviously a large overhead, but is less than recreating an
out-of-sync replicated database (for example). We can implement this along with the unique
IDs above to create the total ordering.

Mutual Exclusion

See OPS.

Like in independent systems, distributed systems also have to share resources, which are
often encapsulated in critical sections. There are two essential requirements for mutual
exclusion:

1. Safety - At most one process can be in a critical section at a time


2. Liveness - Requests to enter/exit the critical section will eventually succeed

The second requirement implies freedom from deadlock (discussed in OPS) and starvation
(freedom from which requires fairness when selecting who should receive the critical section
next amongst waiting processes).
One way of implementing this is using a central server which controls access to resources and
grants permissions to enter and exit critical sections. This is a simple method, but has a single
point of failure.

Another way is using a token ring to form a logical token ring amongst processes. A token is
given to process 0, which then circulates. When a process has a token, it can enter a critical
section, perform the requested operations and then exit the critical section then handing on
the token. Problems can occur if the token is lost, though - there's no way to tell if it was lost,
or if someone is using it.

Ricart and Agrawala use multicast to provide mutual exclusion - this requires total event
ordering. The distributed algorithm is as follows (a sketch of the reply decision appears after
the list):

1. When a process wants to enter a critical section, it sends all the other processes a message
containing the name of the critical section and the current time. It waits until an OK reply is
received from all the other processes.
2. When a process receives a request message:
a. If the receiver is not in the named critical section, and doesn't want to enter it - it
sends back an OK
b. If the receiver is in the named critical section, it does not reply, but queues the
request
c. If the receiver wants to enter the critical section (but is not actually in it yet), it
compares the timestamp in the message with one it has sent to everyone, and the
lowest timestamp wins. If the received timestamp is lower, we proceed as (a),
otherwise we proceed as (b).
3. When a process exits a critical section, the process sends OK to all processes on its queue
(and then deletes them from the queue)
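
The core of the algorithm is the decision made in step 2; a sketch of that decision in Python (the state names and the send_ok callback are assumptions for illustration):

RELEASED, WANTED, HELD = 'released', 'wanted', 'held'

def on_request(state, own_timestamp, own_pid, req_timestamp, req_pid, queue, send_ok):
    """Decide how to answer another process's request for the critical section.
    (own_timestamp, own_pid) is the Lamport timestamp of our own outstanding
    request, if any; queue collects deferred requests to be answered on exit."""
    if state == HELD:
        queue.append(req_pid)                                   # case (b): defer the reply
    elif state == WANTED and (own_timestamp, own_pid) < (req_timestamp, req_pid):
        queue.append(req_pid)                                   # case (c): our request wins, defer
    else:
        send_ok(req_pid)                                        # case (a): reply OK immediately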

The three approaches compare as follows (delays are measured in message times):

• Centralised - 3 messages per entry/exit; delay before entry of 2; problem: coordinator crash
• Distributed - 2(n - 1) messages per entry/exit; delay before entry of 2(n - 1); problem: crash
of any process
• Token ring - 1 to ∞ messages per entry/exit; delay before entry of 0 to n - 1; problems: lost
token, process crash

Centralised algorithm is the most efficient, as the only messages required are to request, grant
to enter and exit, and the delay between request and grant is short. In the distributed
algorithm, messages required include many requests and grants per critical section access,
and there is a long delay to enter a critical section. In a token ring implementation, messages
are variable as a token can circulate without a critical section being entered, and the delay can
be anything between 0 (the token has arrived as it is needed) and n - 1 (the token has just
departed).
Elections

Some distributed algorithms require some process to act as a coordinator. The selection of a
coordinator occurs via an election. The goal is that once an election has started, it concludes
with all processes knowing the new coordinator.

An election can be called by any process, but a process can not call more than one election at
a time. Many elections can occur concurrently, called by different processes, and election
algorithms must be resilient to this.

In the ring algorithm, processes are arranged into a logical ring. When a process notices the
coordinator is not functioning, an ELECTION message containing its process number is sent
to the successor. If the successor is also down, it is sent to the next process. The message is
then forwarded to all in the ring, each adding its own process number. The message then
returns to the instigator (who sees that its own process number is in the message), which
sends out a COORDINATOR message to all with the highest process number in it.

Another method is the bully algorithm. When a process P notices the coordinator is not
responding, it initiates an election by sending an ELECTION message to all processes with
higher numbers. If no-one responds, P wins the election and becomes the coordinator. If one
of the higher processes responds, P's job is done and the higher process now holds an
election. Eventually, all processes give up apart from one, which is the new coordinator. This
process sends all others a COORDINATOR message to indicate its identity.
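
A sketch of the bully algorithm from the point of view of one process; the alive() check and send() call stand in for real message passing and timeouts:

def start_election(my_id, all_ids, alive, send):
    """Called when my_id notices the coordinator has failed. alive(p) reports
    whether process p responds; send(p, msg) delivers a message to p."""
    higher = [p for p in all_ids if p > my_id]
    responders = [p for p in higher if alive(p)]
    if not responders:
        # no higher process answered: this process wins and announces itself
        for p in all_ids:
            if p != my_id:
                send(p, ('COORDINATOR', my_id))
        return my_id
    # otherwise hand over: each responding higher process holds its own election
    for p in responders:
        send(p, ('ELECTION', my_id))
    return None   # a higher process will eventually send the COORDINATOR message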

Transactions
The server manages a set of objects, which local and remote clients interact with
concurrently. The server must ensure objects remain in a consistent state when multiple
accesses or failures occur.

Transactions define a set of atomic operations (either the entire transaction occurs, or the
effects of a partial execution of a transaction are erased).

The properties of a transaction should be ACID:

• Atomicity - a transaction should be all or nothing


• Consistency - a transaction takes the system from one consistent state to another
• Isolation - each transaction must be performed without interference from another
• Durability - after a transaction has completed, its effects are permanently stored (i.e., not
lost)

Concurrent transactions can cause problems, such as the lost update problem (when an update
occurs based on the results of an out-of-date query) or inconsistent retrievals (where a query
occurs in the middle of a transaction).

Serial Equivalence

If transactions perform correctly by themselves, then if a series of transactions is performed
serially, the combined effect will be okay. An interleaving of the operations of the
transactions which has the same effect is a serially equivalent interleaving. Serial equivalence
prevents lost updates and inconsistent retrievals.

For two transactions to be serially equivalent, all pairs of conflicting operations of the two
transactions must be executed in the same order at all of the objects they both access.

Whether operations of different transactions conflict is summarised below:

• read/read - no conflict, because the effect of a pair of read operations does not depend on
the order in which they are executed
• read/write - conflict, because the effect of a read and a write operation depends on the order
of their execution
• write/write - conflict, because the effect of a pair of write operations depends on the order
of their execution

Aborts

Servers must record effects of committed transactions, and record none of the effects of
aborted transactions.

Problems occur if we don't do this, such as the dirty read problem, which arises if a
transaction reads the uncommitted value written by another transaction, which is then
aborted. The premature write problem arises if a transaction overwrites a value written by
another uncommitted transaction which is then aborted, with the wrong value potentially
restored. To solve this, we require that all reads/writes take place after all previous
transactions using the objects have been committed or aborted.

Locks

Transactions must be scheduled so that the effect on shared data is serially equivalent, but
this doesn't necessarily prevent concurrency.

With exclusive locks (which are used when exclusive access is needed), the server attempts to
lock any object about to be used by operations of a client transaction. If a client requires
access to an object already locked, the transaction is suspended and the client waits until the
object is unlocked. However, we must consider the problems of deadlock from OPS.

Two-phase locks can be used to ensure conflicting operations are executed in the same order.
The transaction is not allowed to get more locks after it has released a lock. The first phase is
to acquire locks, and the second phase is to release locks.
In strict two-phase locking, locks acquired during a transaction are held until the transaction
has committed or aborted. This solves the problems of dirty reads and premature writes.
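
A sketch of strict two-phase locking with exclusive locks only (the lock-manager interface is invented for illustration; real servers must also detect or avoid deadlock):

class StrictTwoPhaseLocking:
    def __init__(self):
        self.locks = {}                     # object -> transaction holding its exclusive lock

    def acquire(self, transaction, obj):
        """Growing phase: take an exclusive lock, or report that the caller
        must wait because another transaction holds it."""
        holder = self.locks.get(obj)
        if holder is None or holder == transaction:
            self.locks[obj] = transaction
            return True
        return False                        # caller suspends until the object is unlocked

    def release_all(self, transaction):
        """Shrinking phase: under strict 2PL this is called exactly once,
        when the transaction commits or aborts."""
        for obj in [o for o, t in self.locks.items() if t == transaction]:
            del self.locks[obj]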

Timestamp Ordering

A unique timestamp is assigned to each transaction when it starts. From this, we can get a
total ordering of transaction starts, and requests from transactions can be ordered.

Transaction requests are checked immediately against the following rules, where Tc is the
requesting transaction and Ti is any other transaction (a sketch of these checks appears after
the list):

• Rule 1 (Tc write vs. Ti read) - Tc must not write an object that has been read by any Ti
where Ti > Tc. This requires that Tc ≥ the maximum read timestamp of the object.
• Rule 2 (Tc write vs. Ti write) - Tc must not write an object that has been written by any Ti
where Ti > Tc. This requires that Tc > the write timestamp of the committed object.
• Rule 3 (Tc read vs. Ti write) - Tc must not read an object that has been written by any Ti
where Ti > Tc. This requires that Tc > the write timestamp of the committed object.
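
A sketch of these checks in Python, keeping a maximum read timestamp and a committed write timestamp per object (the attribute names are my own):

class ObjectVersion:
    def __init__(self):
        self.max_read_timestamp = 0         # largest timestamp that has read the object
        self.committed_write_timestamp = 0  # timestamp of the committed write

def may_write(tc, obj):
    # rules 1 and 2: the write is rejected (and Tc aborts) if a later
    # transaction has already read or written the object
    return tc >= obj.max_read_timestamp and tc > obj.committed_write_timestamp

def may_read(tc, obj):
    # rule 3: the read is rejected if a later transaction has committed a write
    return tc > obj.committed_write_timestamp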

Replication
Replication of data allows maintenance of copies of data in multiple computers. It provides
enhanced performance, high availability and fault-tolerance.

To correctly implement replication, we require replication transparency - the client shouldn't
be aware of multiple physical copies of the data, and data should be organised as individual
logical objects (i.e., operations return one set of values).

Operations on replicated data should be consistent; replicas should report the same data, and
some replicas may need to catch up and process updates.

System Model

Each logical object is implemented by a collection of physical copies termed replicas. For the
purpose of the model, we assume an asynchronous system, where the processes fail by
crashing, and there are no network partitions.

We can break down a request into five phases:

• Request - a request is made to the front end, which then issues the request directly to the
replica manager
• Coordination - the replica managers coordinate to execute the request consistently
(decisions need to be made about the relative ordering of requests).
• Execution - replica managers execute requests in such a way that it can be undone
• Agreement - replica managers reach a consensus on the effect of the request that will be
committed, and an abort or commit is decided at this stage
• Response - the replica managers respond to the request, via the front end

Group Messages

Management of replicas require multicast communication (i.e., a message sent to just the
replica managers, at the same time). Group membership services control multicast by
managing the membership of a group (which can be dynamic by servers failing, new ones
coming online and other ones going offline). The group membership service provides an
interface for membership changes (create/destroy process groups, add/remove processes),
implements a failure detector (monitors group members for crashes and may suspend
potentially failing processes) and monitors reachability of members in the event of a
communications crash. The service also notifies group members of membership changes
and performs group address expansion.

An application running in a process group must be able to cope with changing membership.
The group management service provides each process with a series of views of the
membership. Each process in a group holds the same view, and a new view is delivered when
a change in membership occurs (to ensure consistency, the group membership protocol holds
new views in a queue until all members agree to delivery). New views are delivered at the
same time to all processes, and view changes occur in the same order at each process, so
processes should always act on the same view in order to maintain consistency. This method
is called view delivery.

Another method called view synchronous group communication extends the group delivery
to guarantee that a message multicast to members of a view reaches all non-faulty members
of the group. This relies on the principles of agreement (all processes deliver the same set of
messages in any given view), integrity (a process will deliver a message at most once) and
validity (processes always deliver the messages that they send).

In the view-synchronous system, delivery of a new view draws a conceptual line, and every
message delivered is either one side of the line or another. You can infer a set of messages
delivered to other processes when a new view is delivered.

ISIS implements view-synchronous communication using TCP/IP; however, the main
problem is that of ensuring all messages are delivered before the next group membership
change occurs. A solution
to this problem is that messages received by everyone in the group are stable, but when a
view change needs to occur, all unstable messages are sent to all, followed by a flush
message. When a process has received a flush from all others, a stable state is regained and it
can continue.

Fault Tolerance

Fault tolerance attempts to ensure that a correct service is provided, despite failures (so data
and functionality is replicated at the replica managers).

The naïve approach is that if one replica is not responding, we try another. Correctness for
replicated objects is achieved by linearisability: an interleaved sequence of operations meets
the specification of a single correct copy of the object.

The order of operations in the interleaving is consistent with the real times which the
operations were requested by the client. The idea is to ensure that replicas are the same as a
virtual replica that receives operations in the correct order and doesn't crash. It requires
synchronised clocks (which is not always possible). This only applies to operations, however,
not transactions; concurrency control is still needed to achieve transaction consistency.

One way of implementing fault tolerance is using a passive (primary backup) setup. A single
primary replica manager and a number of secondary (backup) replica managers exist, and the
frontends communicate with the primary replica manager only. Going back to the five phases
of the system model:

• Request - made to the frontend which passes it with a unique identifier to the primary
replica manager
• Coordination - FIFO ordering is used, along with identifier checks to make sure transactions
aren't done twice
• Execution - the primary executes the request and stores the response
• Agreement - if the request is an update, the primary pushes the updated state to the
backups (which acknowledge)
• Response - primary responds to the frontend which passes back to the client

This is an implementation of linearisability if the primary is correct. It requires a single
backup to take over the primary role if the primary fails, and agreement on what operations
have taken place at the replicas when the primary fails. View-synchronous communication
can be used, as the new view can be delivered without the primary, although an ordering over
the replica managers is required to decide who is to become the new primary. All surviving
replicas will also have received the same operations.

In active replication, the replica managers play equivalent roles and are organised as a group.
Again, we can compare this to the five phases of the system model:

• Request - made to the frontends, including a unique identifier which is multicasted to all the
replica managers
• Coordination - group communications system delivers the message to all replica managers in
the same order
• Execution - each replica manager executes a request (identically)
• Agreement - is not needed
• Response - each replica manager sends a response to the frontend, which will collect a
number of replies and check for consistency

This does not achieve linearisability (the total order in which replica managers process
requests may differ from the real-time order in which the clients made their requests).
However, the weaker condition of sequential consistency is achieved: replica managers
process the same sequence of requests in the same order - happens-before ordering can be
used. There is no total ordering over requests (as required by linearisability), just an ordering
over requests coming from a specific client.
CHAPTER 6:
Logic Programming and Artificial Intelligence

What is AI?
There are many different competing definitions for AI, each with strengths and weaknesses.
Some of them follow:

• AI is concerned with creating intelligent computer systems - however, we don't have a good
definition of intelligent
• "AI is the study of systems that would act in a way that to any observer would appear
intelligent" (Coppin) - this is dependent on the observer, and some observers that work in
marketing have a very loose definition of intelligence.

We can define two types of AI from the above definition:

• Weak AI - Machines can possibly act as if they were intelligent


• Strong AI - Machines can actually think intelligently

Alan Turing developed the Turing Test to check intelligence using the weak AI definition. In
the Turing Test, an interrogator communicates with a group of people and a computer
without knowing which is which. If the computer is able to fool the interrogator into thinking
it is a real person 30% of the time (an arbitrary figure), then the system has passed the Turing
test.

The Chinese room argument (due to Searle) challenges strong AI. A person P1 communicates
in Chinese with another person, P2, by passing written notes back and forth. P2 does not
know Chinese, however; P2 has a room full of books that tell him what to do for each
character in the note (similar to a Turing machine). Assuming all the rules are correct, does
P2 speak Chinese?

Other questions can be asked of intelligence. Goedel/Turing said that there are problems that
computers are not able to compute (e.g., the halting problem), however the question must be
asked - can humans? Therefore, are Turing machines the most powerful computational
model?

Further definitions of AI include:


• "AI involves using methods based on intelligent behaviour of humans (or animals) to solve
complex problems" (Coppin). This is nature orriented AI.
• AI is concerned with making computers more useful and useable - this definition is very
vague, so it can be applied to all AI projects, but also to more general CS problems, and
indeed could be the basis for all CS.
• AI is what AI researchers do - this is probably the best definition to date

Agents
"An agent is anything that can be viewed as perceiving its environment through sensors and
acting upon that environment through actuators." (Russel and Norvig)

However, thermostats (which respond to changes in temperature by turning heating on or off)
could also be included in that definition, but a thermostat is not an AI.

Agents used to act only in closed systems, where the environment was very constrained and
controlled, but now they are behaving in increasingly open environments.

We should consider intelligent agents, which can be defined by three characteristics
according to Wooldridge (1999):

• Reactivity - Agents are able to perceive the environment and respond in a timely fashion to
changes
• Proactiveness - Agents are able to exhibit goal-directed behaviour by taking initiative
• Social Ability - Agents are capable of interacting with other agents (and possibly humans)

This definition still includes thermostats, so it still has weaknesses.

Intelligent agents consist of various components which map sensors to actuators. The most
important component is that of reasoning, which consists of representations of the
environment (more than just pixels from a camera, but a semantic model which is a
representation of the images), formulating a course of action (planning), path finding and
steering, learning, social reasoning (competing and cooperating) and dealing with uncertainty
(where the effects of an action are not known, so a decision must be made based on this).
Learning is very important, and there is a school of thought that all intelligent agents must
learn. Learning can be broken up into supervised learning, where a teacher gives the agent
examples of things (sometimes called inductive learning); reinforcement learning, where the
agent receives feedback on whether its behaviour is good or bad and learns from that (the
carrot and stick approach); and unsupervised learning, which is the most difficult to
implement, where the agent is given some basic concepts and then discovers things based on
them (e.g., given a number system, it might discover prime numbers).

Another component of intelligent agents is perception, which is vision and language
understanding, and the final component is acting, which is motion control and balancing.

Reasoning is done using search - looking at all possible ways and choosing the best one (e.g.,
path finding, learning, looking for the best action policy). Therefore, a possible definition of
intelligent is whether or not the agent has done the right thing - a rational agent. The right
thing here is defined by a performance measure - a means of calculating how well the agent
has performed based on the sequence of percepts that it has received.

We can therefore define a rational agent such that for each possible percept sequence
(complete history of anything the agent has ever perceived), a rational agent should select an
action that is expected to maximise its performance measure, given the evidence provided by
the percept sequence, and whatever built-in prior knowledge of the environment (knowledge
that the agent designer has bestowed upon the agent before its introduction to the
environment) the agent has.

Environments

There are different factors to consider when considering environments, such as:

• Fully vs. Partially Observable - the completeness of the information about the state of the
environment
• Deterministic vs. Non-deterministic - Do actions have a single guaranteed effect (dependent
on the state?)
• Episodic vs. Sequential - Is there a link between agent performance in different episodes?
• Static vs. Dynamic - Does the environment remain unchanged except by agent actions?
• Discrete vs. Continuous - is there a fixed finite number of actions and percepts?
• Single agent vs. Multi-agent

Programming AI
Two good languages for AI programming are Lisp, as programs can be treated as data, and
Prolog, which uses declarative programming.

Imperative/procedural programs say what should be done and how, so computation consists
of doing what it says. Declarative/non-procedural programs say what needs to be done, but
not how. Computation is performed by an interpreter that must decide how to perform the
computation.

Logic programming is a form of declarative programming. Logic programs consist of
statements of logic that say what is true of some domain. Computation is deduction; answers
to queries are obtained by deducing the logical consequences of the logic program, thus the
interpreter is an inference engine. Programs can also contain information that helps guide the
interpreter.

Problem Representation
A problem representation is defined by five items:

• set of states
• initial state
• finite set of operators - each is a function from states to states. The function may be partial
as some operators may not be applicable to some states.
• goal test - can be explicit or implicit
• path cost - additive, so can be derived from the action costs

A solution is a sequence of operators that leads from the initial state to a state that satisfies
the goal test.

Instead of operators, we could have successor functions. If s is a state, then Successors(s) is
defined to be { x | x = op(s), for some operator op }.

A problem representation can also be depicted as a state space - a graph in which each state is
a node. If operator o maps state s1 to state s2, then the graph has a directed arc
from s1 to s2 and the arc is labelled o. Labels could also be added to indicate the initial state
and the goal states. Arcs can also be labelled to indicate operator costs.

Often a problem itself has a natural notion of configurations and actions. In such cases we
can build a direct representation in which the state and operators correspond directly to the
configurations and actions.

However, operators in the representation might not correspond directly to actions in the
world. Some problems can be solved both ways, for example both towards and from the goal.
If the real world has a natural direction, then we can refer to the search process as either
forwards (if it matches the direction in the natural world) or backwards search.

An alternative view is that we can start with an empty sequence and fill in a middle state first,
before filling in the first or last actions. In this view, states do not correspond directly to
configurations, but to the sequence of actions built up so far - this is called an indirect
representation.

In direct representations, there is a close correspondence between problems and their
representations; however, it is important not to confuse them, especially because some
representations are not direct. It is important to recognise that states are data structures (and
they only reside inside a computer), and operators map data structures to data structures.

Search Trees

A problem representation implicitly defines a search tree. The root is a node containing the
initial state. If a node contains s, then for each state s′ in Successors(s), the node has one child
node containing the state s′. Each node in the search tree is called a search node. In practice,
it can contain more than just a state, e.g., the path cost from the initial node. Search trees are
not just state spaces, as two nodes in a tree can contain the same state.

Search trees could contain infinite paths, which we need to eliminate. This could be due to an
infinite number of states, or where the state space contains a cycle. Search trees are always
finitary (each node has a finite number of children), since a problem representation has a
finite number of operators.

Search
As we saw earlier, given a problem representation, search is the process that builds some or all
of the search tree for the representation in order to either identify one or more of the goals
and the sequence of operators that produce each, or identify the least costly goal and the
sequence of operators that produces it. Additionally, if the search tree is finite and does not
contain a goal, then search should recognise this.

The basic idea of tree search is that it is an offline, simulated exploration of state space by
generating successors of already-explored states (a.k.a., expanding states).

The general search algorithm follows the general idea that you maintain a list L containing
the fringe nodes (a sketch of this algorithm appears after the steps):

1. set L to the list containing only the initial node of the problem representation
2. while L is not empty do
i. pick a node n from L
ii. if n is a goal node, stop and return it along with the path from the initial node to n
iii. otherwise remove n from L, expand n to generate its children and then add the
children to L
3. return Fail
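
A sketch of this general algorithm in Python; how nodes are picked from and added to L is what distinguishes the concrete algorithms below (the successors and is_goal callbacks are assumptions, not given in the notes):

from collections import deque

def general_search(initial_state, successors, is_goal, use_stack=False):
    """Generic tree search. L holds (state, path) fringe nodes; taking from the
    front of the queue gives breadth-first behaviour, while taking from the
    back (use_stack=True) gives depth-first behaviour."""
    L = deque([(initial_state, [])])                            # step 1
    while L:                                                    # step 2
        state, path = L.pop() if use_stack else L.popleft()     # step 2(i): pick a node
        if is_goal(state):                                      # step 2(ii)
            return state, path
        for op_name, child in successors(state):                # step 2(iii): expand
            L.append((child, path + [op_name]))
    return None                                                 # step 3: Fail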

Search is implemented where a state is a representation of a physical configuration, and a
node is a data structure constituting part of a search tree that includes the state, parent node,
action, path cost g(x), and depth.

Search algorithms differ by how they pick which node to expand. Uninformed search looks at
only structure of the search tree, not at the states inside the nodes - this is often called blind
search algorithms. Informed search algorithms make their decision by looking at the states
inside the nodes - these are often called heuristic search algorithms.

Search algorithms can be evaluated with a series of criteria:

• Completeness - always finds a solution if one exists


• Time complexity - number of nodes generated (not expanded)
• Space complexity - maximum number of nodes in memory
• Optimality - always finds the least-cost solution

Time and space complexity are measured in terms of b - maximum branching factor in the
search tree, d - depth of the least cost solution (root is depth 0) and m, the maximum depth of
the state space (may be ∞)
Breadth First Search

See ADS.

BFS works by expanding the shallowest unexpanded node and is implemented using a FIFO
queue.

It is complete, has a time complexity of Ο(b^(d+1)), a space complexity of Ο(b^(d+1)), and is
optimal if all operators have the same cost.

The space complexity is a problem, however, and additionally we can not find the optimal
solution if operators have different costs.

Uniform Cost Search

This works by expanding the least-cost unexpanded node and is implemented using a queue
ordered by increasing path cost. If all the step costs are equal, this is equivalent to BFS.

It is complete if the step cost ≥ ε. Its time complexity is governed by the number of nodes
with g ≤ C*, the cost of the optimal solution: Ο(b^⌈C*/ε⌉), which can be much greater than
Ο(b^d). The space complexity is likewise the number of nodes with g ≤ C*, i.e., Ο(b^⌈C*/ε⌉).
It is optimal, since nodes are expanded in increasing order of g(n).

This still uses a lot of memory, however.

Depth First Search

This works by expanding the deepest unexpanded node, and is implemented using a
stack/LIFO queue.

It is not complete, as it could run forever without ever finding a solution if the tree has an
infinite depth. It does complete in finite spaces, however. The time complexity is Ο(b^m),
which is terrible if m >> d, but if solutions are dense, this may be faster than BFS. The space
complexity is Ο(bm), i.e., linear space. It is not optimal.

Depth Limited Search

Depth limited search is DFS with depth limit l. That is, nodes deeper than l will not be
expanded.

This solution is not complete, as a solution will only be found if l ≥ d. It has a time
complexity of Ο(b^l) and a space complexity of Ο(bl). Also, the solution is not optimal.

Iterative Deepening Search

Depth limited search can be made complete using iterative deepening. The implementation of
this is:
set l = 0
do
depth limited search with limit l
l++
until solution found or no nodes at depth l have children

This does have a lot of repetition of the nodes at the top of the search tree, so it's a classic
memory vs. CPU trade-off - this is CPU intensive, but only linear in memory. However, with
a large b, the overhead is not too great; for example, when b = 10 and d = 5 the overhead is
11%.

IDS is complete, has a time complexity of (d + 1)b^0 + d·b^1 + (d - 1)b^2 + ... + b^d, i.e.,
Ο(b^d), and a space complexity of Ο(bd). Like BFS, it is optimal if the step cost = 1.

Handling Repeated States

Failure to detect repeated states can turn a linear space into an exponential one. Any cycles in
a state space will always produce an infinite search tree.

Sometimes, repeated states can be reduced or eliminated by changing the problem
representation, e.g., adding a precondition saying that the inverse of the previous operation is
not allowed will eliminate cycles of length 2.

Another way of handing repeated states is by checking whether or not a node is equal to one
of its ancestors, and don't add it if it is. For DFS-type algorithms, this is relatively easy to
implement as all ancestors are already on the node list; however, for other algorithms, this
takes Ο(d) time to traverse up the ancestor chain, or requires keeping a record of all the
ancestors of a node, which consumes a lot of space.

A better way to solve this is to use graph search, which keeps track of all generated states.
When a node is removed from the node list (the open list), it is stored on another list (the
closed list). The basic version is the same as tree search, but you don't expand a node if it has
the same state as one on the closed list. A more sophisticated version is required to retain
optimality: If the new node is "cheaper" than its duplicate on the closed list, then the one on
the closed list and its descendants must be deleted.

Duplicate siblings can still be created, however, so another way would be to compare the
node to all generated nodes - this is very expensive, so the normal solution to repeated states
is to rewrite the problem representation.

We can consider the search algorithms above to be uninformed; that is, they look only at the
structure of the search tree and not at the states inside the nodes - these are blind search
algorithms. We can improve these search algorithms by inspecting the contents of a node to
see whether or not it is a good one to expand; we call these types of algorithms informed
search. The way of choosing whether or not a node is good is done using a heuristic, which
makes a guess, using rules of thumb, as to whether that state is a good one or not. As these
problems are NP-complete, heuristics are not exact, but the guesses they make can be good.

The general search algorithm for informed search is called best-first search, although this is a
slight misnomer, as if it were truly best-first, we would be able to go straight to the goal state.
The general idea here is to use an evaluation function ƒ(n) for each node n, which gives us an
estimate of desirability - only part of the evaluation functions needs to be a heuristic for the
search to become informed. We then expand the most desirable unexpanded node. This is
done by ordering the nodes in L in decreasing order of desirability, and nodes are always
chosen from the front of L.

Greedy Best-First Search

In Greedy Best-First Search, the evaluation function ƒ(n) = h(n), our heuristic (an estimate of
cost from n to the nearest goal). In our distance-finding example, this could be the straight
line distance. Greedy Best-First Search therefore works by expanding the node that appears to
be closest to the goal.

Greedy Best-First Search is not complete, as there are a variety of cases where completeness
is lost, e.g., it can get stuck in loops. The time complexity is Ο(b^m), but a good heuristic can
give a dramatic improvement. The space complexity is also Ο(b^m), as all nodes are kept in
memory. Greedy best-first search is also not optimal.

A* Search

Greedy Best-First Search ignores the cost to get to the goal, which loses optimality. We
therefore need to consider the cost to get to the node we are at. The basic idea of this is to
avoid expanding paths that are already expensive. For this, we update our evaluation function
to be ƒ(n) = g(n) + h(n), where g(n) is the cost so far to reach the node n - this is the same as
uniform-cost search, therefore we can consider A* search to be a mixture of uniform-cost and
greedy best-first search.

Even though the goal may be expanded by a previous expansion, it is never actually
considered until it reaches the front of the list, as a more optimal path to the goal may be
expanded from a node with a lower current cost.
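
A sketch of A* tree search using a priority queue ordered on ƒ(n) = g(n) + h(n); the successors and h callbacks are assumptions, and repeated-state checking is omitted:

import heapq, itertools

def astar_search(initial_state, successors, is_goal, h):
    """successors(s) yields (op, child, step_cost); h(s) estimates the cost from
    s to the nearest goal. Nodes are expanded in increasing order of f = g + h,
    and a goal is only accepted once it reaches the front of the queue."""
    tie = itertools.count()                 # tie-breaker so states are never compared
    frontier = [(h(initial_state), next(tie), 0, initial_state, [])]
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path, g
        for op, child, cost in successors(state):
            g2 = g + cost
            heapq.heappush(frontier, (g2 + h(child), next(tie), g2, child, path + [op]))
    return None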

A* search is sometimes referred to as A-search, however A* does have extra conditions for
optimality compared to A. In A*, h(n) must be admissible, which guarantees optimality.

Using A* search, nodes are expanded in order of f value, which gradually adds so-called f-
contours of nodes. A contour i has all nodes where f = fi, where fi < fi + 1.
A* is complete (unless there are infinitely many nodes with f ≤ ƒ(G) - as in uniform cost
search). It also has a time complexity that is exponential in relative error in h and the length
of the solution, and the space complexity is that of keeping all nodes in memory. It is also
optimal.

Admissible Heuristics

A heuristic h(n) is admissible if ∀n h(n) ≤ h*(n), where h*(n) is the true cost to reach the
nearest goal state from n, i.e., an admissible heuristic never overestimates the cost to reach
the goal (i.e., it is optimistic).

For proof of optimality, see the lecture slides.

Theorem: If h(n) is admissible, A* using tree-search is optimal. That is, there is no goal node
that has less cost than the first goal node that is found.

A heuristic is consistent if, for every node n and successor n′ of n generated by any
action a, h(n) ≤ c(n, a, n′) + h(n′), where c(n, a, n′) is the cost of getting from node n to n′ via
action a. Using the triangle inequality (the combined length of any two sides of a triangle is
never shorter than the remaining side), we have ƒ(n′) ≥ ƒ(n), i.e., ƒ(n) is non-decreasing along
any path.

Theorem: If a heuristic is consistent, then it is admissible. The reverse is not strictly true, but
it is very hard to come up with an admissible heuristic that is not consistent.

Theorem: If h(n) is consistent, A* using graph-search is optimal. The proof for this is similar
to that for uniform cost.

Admissible heuristics can be discovered by relaxing the problem to a simpler form (i.e.,
removing some constraints so it becomes easier to solve). If h1 is derived from a more relaxed
problem than h2 and both are admissible, then ∀n h2(n) ≥ h1(n), and we say that h2
dominates h1.

Theorem: If h2 dominates h1, then A* with h1 expands at least every node expanded by A*
with h2.

Logic
Logic is the study of what follows from what, e.g., if two statements are true, then we can
infer a third statement from them. It does not matter whether the statements are actually true,
just that if x and y are true, then z must also be true for the argument to be valid.

Logic is useful for:

• Reasoning about combinatorial circuits


• Reasoning about databases
• Giving semantics to programming languages
• Formally specifying software and hardware systems - model-checking and theorem-proving
can be used to show that the system meets the specification or that the system has certain
properties.
• Capturing complexity classes (e.g., there is a correspondence between problems in NP and
the formulas of existential second-order logic - more precisely, a formula and a problem
correspond if there is a one-to-one mapping between the models satisfying the formula and
the solutions to the problem)

• SAT (the satisfiability problem) solvers can solve practical problems (see AIMA 7.6). Cook
showed all problems in NP can be polynomially reduced to SAT, but until recently no SAT
solver could practically handle the very large number of clauses that mapping an NP problem
to SAT generates. Modern SAT solvers can now support up to 1,000,000 clauses, so many
practical problems can be solved by mapping them to SAT.

Logic is very important for AI, as flexible, intelligent agents need to know facts about the
world in which they operate (declarative knowledge) as well as how to accomplish tasks
(procedural knowledge). For example: procedural - don't drink poison; declarative - drinking
poison will kill you, and you want to stay alive.

Every way we can represent the world is a model; thus, in any model where the statements x
and y are true, we can infer a conclusion z based on those two statements. However, if we
can find a counter-model where the premises are true but the conclusion isn't, the argument is
invalid.

Some words in an argument have set logical meanings, such as Some or Every, but others
have variable meanings, such as foo or bar. Words that don't vary in meaning are called
logical symbols.
The semantics of a statement concerns whether or not the statement is true in a world;
semantics gives meaning to a statement.

Rule based systems use proof theory to take our arguments and check their validity, e.g.,
some X is a Y, every Y is a Z → some X is a Z is valid. We can express this as X/Y, Y/Z ∴
X/Z, which is valid and we don't need to consider the rules.

We need to consider a grammar, where we can express this formally, and the semantics that
determine the validity of the arguments. Ways of doing this include propositional and first-
order logic. The requirements for a declarative representation language are:

• The language should have syntax (a grammar defining the legal sentences)
• The language must have semantics (what must the world be like for a sentence to be true -
truth-conditional semantics; there are other types of semantics dealing with other kinds of
meaning, but we will not be dealing with those).

Intelligent agents need to reason with their knowledge, and many kinds of reasoning can be
done with declarative statements:

• educated guesses
• reasoning by analogy
• jump to conclusions
• make generalisations

However, for LPA we will focus on sound, logical reasoning - i.e., deducing logical
consequences of what is known, as determined by truth-conditional semantics.

Propositional Logic

As discussed above, we can discuss a declarative representation language using a syntax and
a semantics. For propositional logic, our syntax consists of a lexicon and a grammar.

The lexicon (non-logical symbols) of propositional logic consists of a set of proposition
symbols. Throughout the examples, we can assume that P, Q, R, S and T are among the
propositional symbols. The logical symbols of propositional logic are ¬, ∨, ^, →, ↔. We can
then define the sentences of propositional logic as:

• If α is a proposition letter, then α is an atomic sentence


• If α and β are sentences, then (¬α), (α ∨ β), (α ^ β), (α → β) and (α ↔ β) are molecular
sentences.

The above is the official definition, but we can take certain shortcuts to reduce clutter and to
make writing logical statements easier. We can assign precedence to the connectives, from
high to low: ¬, ∨, ^, →, ↔. We can then take advantage of this to omit parentheses when no
ambiguity results. Additionally, we can take the connectives ^ and ∨ to have an arbitrary arity
of two or more.

We call a sentence of the form α ∨ β a disjunction, where α and β are its disjuncts. A sentence
of the form α ^ β is a conjunction, where α and β are its conjuncts.
We also need to consider the semantics of propositional logic. In propositional logic, a model
is a function that maps every proposition letter to a truth value, either TRUE or FALSE. If α
is a sentence, and I is a model, then [[α]]I is the semantic value of α relative to model I.
[[α]]I is then defined as follows:

• If α is an atomic sentence, then [[α]]I = I(α)


• If α is (¬β) then [[α]]I = { TRUE if [[β]]I = FALSE, FALSE otherwise
• If α is (β ∨ γ) then [[α]]I = { TRUE if [[β]]I = TRUE or [[γ]]I = TRUE, FALSE otherwise
• If α is (β ^ γ) then [[α]]I = { TRUE if [[β]]I = TRUE and [[γ]]I = TRUE, FALSE otherwise
• If α is (β → γ) then [[α]]I = { TRUE if [[β]]I = FALSE or [[γ]]I = TRUE, FALSE otherwise
• If α is (β ↔ γ) then [[α]]I = { TRUE if [[β]]I = [[γ]]I, FALSE otherwise

An alternate way of thinking is if one thinks of the logical connectives as truth functions,
which maps a pair of arguments to true or false in every model (which is why it is considered
a logical symbol). Thus, we could define: [[(α ^ β)]]I = [[^]]([[α]]I, [[β]]I).
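
A sketch of this semantics as a recursive evaluator in Python, representing sentences as nested tuples such as ('and', 'P', 'Q') (this encoding is my own, not from the notes):

def evaluate(sentence, model):
    """model maps proposition letters (strings) to True/False."""
    if isinstance(sentence, str):                     # atomic sentence: look it up
        return model[sentence]
    op, *args = sentence
    if op == 'not':
        return not evaluate(args[0], model)
    if op == 'or':
        return evaluate(args[0], model) or evaluate(args[1], model)
    if op == 'and':
        return evaluate(args[0], model) and evaluate(args[1], model)
    if op == 'implies':
        return (not evaluate(args[0], model)) or evaluate(args[1], model)
    if op == 'iff':
        return evaluate(args[0], model) == evaluate(args[1], model)
    raise ValueError('unknown connective: ' + str(op))

# e.g. evaluate(('implies', 'P', ('or', 'P', 'Q')), {'P': True, 'Q': False}) is True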

Theorem 1: Let A and A′ be two equivalent formulas. If B is a formula that contains A as a
subformula, then B is logically equivalent to the formula that results from substituting A′ for
A in B.

Theorem 2: Two sentences, A and B, are logically equivalent if and only if the sentence A
↔ B is valid.

Theorem 3: Let A, A1, ..., An be sentences. Then {A1, ..., An} entails A if and only if the
sentence A1 ^ ... ^ An → A is valid.

An argument consists of a set of premises and another sentence called the conclusion. We
allow the premises to be an infinite set. However, in many cases, we will be concerned with
arguments that have a finite set of premises, and we can call these finite arguments. If an
argument with premises P and conclusion C is valid, then we say that P entails C (or, C
follows from P, or C is a consequence of P).

Let α be a sentence, A be a set of sentences and I be a model, then:

• I satisfies α if and only if [[α]]I = TRUE
• I satisfies A if and only if it satisfies every α ∈ A
• I falsifies α if and only if [[α]]I = FALSE
• A (or α) is satisfiable if and only if it is satisfied by some model
• α is valid (unfalsifiable) if and only if it is satisfied by every model

Note that although each individual sentence in A may be satisfiable, A itself may be
unsatisfiable. For example, {P, ¬P} is unsatisfiable, yet P and ¬P are each satisfiable. Thus,
not every unsatisfiable set of sentences is a set of unsatisfiable sentences. As we can see, the
notion of a valid sentence is therefore different from the notion of a valid argument, but it is
related.

There are some special definitions worth noting, however.

• The empty set of sentences is satisfied by every model, hence ∅ entails C if and only if C is
a valid sentence
• An unsatisfiable sentence is entailed only by an unsatisfiable set of sentences
• A valid sentence is entailed by any set of sentences

Two sets of sentences are logically equivalent if they are satisfied by precisely the same
models (they have the same truth value in all models). Two sentences are logically equivalent
if and only if they entail each other.

First Order Predicate Calculus

Again, we start by considering the syntax of FOPC. The lexicon (non-logical symbols) of
FOPC consists of the disjoint sets of: a set of predicate symbols, each having a specified arity
≥ 0 (we often refer to symbols with arity 0 as proposition symbols - propositional logic is
where all predicates have no arguments); a set of function symbols, each having a specified
arity ≥ 0 (function symbols of arity 0 are often called individual symbols or, somewhat as a
misnomer, constants); and finally a set of variables.

The logical symbols of FOPC consist of the quantifiers ∀ and ∃, and the logical connectives
¬, ^, ∨, → and ↔.

We can define the terms of FOPC as individual symbols, variables, and function applications:
if ƒ is a function symbol of arity n and t1, ..., tn are terms, then ƒ(t1, ..., tn) is a term.

The formulas of FOPC are defined as atomic formulas, where if P is a predicate symbol of
arity n and t1, ..., tn are terms, then P(t1, ..., tn) is an atomic formula, and as molecular
formulas, where if α and β are formulas, then ¬α, (α ∨ β), (α ^ β), (α → β) and (α ↔ β) are
molecular formulas. Additionally, if x is a variable and α is a formula, then (∀xα) and (∃xα)
are molecular formulas.

We can consider variables to be either bound or free. We say that a variable x occurring in a
formula α is bound in α if it occurs within a subformula of α of the form ∀xβ or ∃xβ.
Otherwise, we call the variable free. We call a formula α a sentence if there is no free
occurrence of a variable.

Sentences have truth values, as FOPC has compositional semantics: the truth value of a
whole sentence is determined by the values of its parts.

Like with propositional logic, we can drop parentheses if no ambiguity results, and we give
the connectives and quantifiers precedence from high to low (¬, ∨, ^, →, ↔, ∀, ∃). We often
write ∀x1, x2, ..., xn as a shorthand for ∀x1, ∀x2, ..., ∀xn, and similarly for ∃.

If a term or formula contains no variables, we refer to it as ground. α ∨ β is a disjunction,
where α and β are its disjuncts, and α ^ β is a conjunction, where α and β are its conjuncts.

We can also consider first-order predicate calculus with equality. The equality symbol "=" is
both a logical symbol and a binary predicate symbol. Atomic formulas using this predicate
symbol are usually written as (t1 = t2), instead of =(t1, t2). Also, we usually write (t1 ≠ t2)
instead of ¬(t1 = t2).

To consider the semantics of FOPC, we define a model as a pair (D, A), where D is a
potentially infinite non-empty set of individuals called the domain, and A is a function that
maps each n-ary function symbol to a function from D^n to D and each n-ary predicate
symbol to a function from D^n to {TRUE, FALSE}. Individual symbols (function symbols of
arity 0) always map to an element of D, and a predicate symbol of arity 0 (i.e., a proposition
symbol) is mapped to TRUE or FALSE.

When we are considering formulas with free variables, the truth value depends not only on
what model we are considering, but what values we give to these variables. For this purpose,
we will consider the value of an expression relative to a model and a function from variables
to D. Such a function is called a value assignment.

For a full semantic definition of all symbols, see the lecture notes.

If α is an expression or a symbol of the language, then [[α]]I,g is the semantic value of α
relative to model I and value assignment g. If g is a value assignment, then g[x → i] is the
same value assignment, with the possible exception that g[x → i](x) = i.
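As an illustration of how such a semantics can be computed, the following Python sketch evaluates terms and atomic formulas relative to a model (D, A) and a value assignment g. The representation (variables as strings, compound terms as tuples) and all of the symbol names are assumptions made purely for this example, not part of the formal definition.

def eval_term(term, interp, g, variables):
    """Return the element of D denoted by term, relative to the
    interpretation function A (here: interp) and value assignment g."""
    if isinstance(term, str):
        if term in variables:
            return g[term]           # a variable: look up its value in g
        return interp[term]()        # an individual symbol (0-ary function symbol)
    f, *args = term                  # a compound term f(t1, ..., tn)
    return interp[f](*[eval_term(t, interp, g, variables) for t in args])

def eval_atom(pred, args, interp, g, variables):
    """Truth value of the atomic formula pred(t1, ..., tn) relative to (D, A) and g."""
    values = [eval_term(t, interp, g, variables) for t in args]
    return interp[pred](*values)

# Example model: D is the natural numbers, s is the successor function and
# even is a unary predicate.
interp = {
    "zero": lambda: 0,
    "s": lambda n: n + 1,
    "even": lambda n: n % 2 == 0,
}
g = {"x": 3}
print(eval_atom("even", [("s", "x")], interp, g, variables={"x"}))   # True

The connectives and quantifiers would be handled by further recursive cases; in particular, a quantifier ranges over D by evaluating its body under the modified assignments g[x → i].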

Proof Systems
Proof systems work by a syntactic characterisation of valid arguments. Here, we will consider
the Hilbert, or axiomatic, system. These systems work by starting with the set of premises of
the argument, and repeatedly apply inference rules to members of the set to derive new
sentences that are added to the set, the goal being to derive the conclusion of the argument.
Some proof systems start with a possibly infinite set of logical axioms (e.g., every sentence of
the form α ∨ ¬α).

Propositional Logic

Many proof systems for propositional logic employ a wide variety of rules. Commonly used
rules include and-introduction - from any two sentences α and β, derive α ^ β - and modus
ponens - from any two sentences of the form α and α → β, derive β.

If we consider the following argument, we can show that the conclusion can be derived from
the premises:

Premises: A, A → B, A → C
Conclusion: B ^ C

Then we can derive the result as follows:

1. A - premise
2. A → B - premise
3. A → C - premise
4. B - modus ponens from 1 and 2
5. C - modus ponens from 1 and 3
6. B ^ C - and-introduction from 4 and 5
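The same derivation can be found mechanically by repeatedly applying the two rules until the conclusion appears. The following Python sketch does this for the argument above; the representation of formulas as strings and tuples is simply an assumption made for the example.

def derive(premises, goal):
    """Saturate the premise set under modus ponens, and apply and-introduction
    to the conjuncts of the goal, until the goal is derived (or nothing changes)."""
    derived = set(premises)
    changed = True
    while changed and goal not in derived:
        changed = False
        for p in list(derived):
            # modus ponens: from a and ('->', a, b), derive b
            if isinstance(p, tuple) and p[0] == '->' and p[1] in derived and p[2] not in derived:
                derived.add(p[2])
                changed = True
        # and-introduction, restricted here to the conjunction we are after
        if isinstance(goal, tuple) and goal[0] == '^' and goal[1] in derived and goal[2] in derived:
            derived.add(goal)
            changed = True
    return goal in derived

premises = ['A', ('->', 'A', 'B'), ('->', 'A', 'C')]
print(derive(premises, ('^', 'B', 'C')))   # True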

Derivations can also be represented as trees and as directed, acyclic graphs.

We can say a proof system is sound if every sentence that can be derived from a set of
premises is entailed by the premises, and that a proof system is complete if every sentence
that is entailed by a set of premises can be derived from the set of premises. Therefore, we
can consider soundness and completeness when the set of sentences derivable from a set of
premises is identical to the set of sentences entailed by the premises. This is obviously a
desirable property.

An inference rule of the form: From α1, ..., αn derive β, is sound if {α1, ..., αn} entails β.
Modus ponens and and-introduction are sound.

Some proofs are linear, that is, their derivation tree forms a single chain.
We can also reason backwards, where a proof can be turned upside down and we start from
the conclusion and work towards [], the empty conjunction.

Horn Clauses

Each premise in a propositional Horn clause argument is a definite clause (α1 ^ ... ^ αn) → β,
where n ≥ 0 and each αi and β is atomic. If n = 0, then we have that → β is a definite clause,
and we usually write this as simply β.

The conclusion in a propositional Horn clause argument is a conjunction of atoms α1 ^ ... ^ αn.
As a special case, we can consider [] when n = 0. We call this the empty conjunction and
it is true in all models.

SLD Resolution for Propositional Logic

The SLD resolution inference rule is used to derive a goal clause from a goal clause and
definite clause.

From the goal clause G1 ^ ... ^ Gi-1 ^ Gi ^ Gi+1 ^ ... ^ Gn (n ≥ 1) and the definite clause B1 ^
... ^ Bm → H (m ≥ 0) we can derive the goal clause G1 ^ ... ^ Gi-1 ^ B1 ^ ... ^ Bm ^ Gi+1 ^ ... ^
Gn, provided that H and Gi are identical.

Because of their special form, SLD proofs are often shown simply as:
A^C
B ^ D ^ C (4)
E ^ D ^ C (3)
E ^ C (2)
C (1)
[] (5)

Where the number following is the number of the premise that SLD resolution was applied
to.

The element of the goal clause that is chosen to have SLD resolution applied to it is chosen
using a selection function, and the most common selection functions are the leftmost and
rightmost selection functions.

We have the following properties about SLD inference:

1. The SLD inference rule is sound and complete for propositional Horn-clause arguments
2. Selection function independence - Let S1 and S2 be selection functions and let A be a Horn-
clause argument. Then, there is an SLD resolution proof for A that uses S1 if and only if there
is one that uses S2.
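A depth-first proof search using SLD resolution with the leftmost selection function can be sketched in Python as follows. The clause representation (a definite clause (B1 ^ ... ^ Bm) → H as the pair (body, head)) and the depth bound are assumptions made for this sketch.

def sld_prove(goal, clauses, depth=25):
    """Return True if the empty goal clause [] can be derived from the goal."""
    if not goal:                            # [] - the empty conjunction - reached
        return True
    if depth == 0:                          # crude loop guard for the sketch
        return False
    selected, rest = goal[0], goal[1:]      # leftmost selection function
    for body, head in clauses:
        if head == selected:                # SLD resolution: replace the selected
            if sld_prove(body + rest, clauses, depth - 1):   # atom by the clause body
                return True
    return False

# Premises: A, C, A -> B, (B ^ C) -> D; goal clause: D.
clauses = [([], 'A'), ([], 'C'), (['A'], 'B'), (['B', 'C'], 'D')]
print(sld_prove(['D'], clauses))   # True: D, then B ^ C, then A ^ C, then C, then []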

SLD Resolution for First Order Predicate Calculus

The goal here is to generalise SLD resolution for propositional Horn-clause arguments to
handle first order Horn-clause arguments (i.e., a subset of predicate calculus).

If we let Φ be a formula whose free (unbound) variables are x1, ..., xn, then ∀Φ is the
universal closure of Φ, which abbreviates ∀x1, ..., xn Φ. Furthermore, ∃Φ is the existential
closure of Φ and abbreviates ∃x1, ..., xn Φ.

Each premise in a first order Horn-clause argument is a definite clause ∀((α1 ^ ... ^ αn) → β),
where n ≥ 0 and where each αi and β are atomic formulas of first order predicate calculus.
If n = 0, then we have ∀(→ β), which is a definite clause that we simply write as ∀β.

The conclusion in a first order Horn-clause argument is a goal clause ∃(α1 ^ ... ^ αn),
where each αi is an atomic formula of first-order predicate calculus. As a special case
when n = 0, we can write [] - this is the empty conjunction, which is true in all models.

Since all variables in the definite clause are universally quantified, and all those in the goal
clause are existentially quantified, the closures are often not written, so in such cases, it is
important to bear in mind that they are there implicitly.

Substitutions

A substitution is a function that maps every expression to an expression by replacing
variables with expressions whilst leaving everything else alone. Application of a substitution
θ to an expression e is written as eθ rather than θ(e). We only need to consider applying
substitutions to expressions containing no quantifiers, to simplify the presentation.

A more formal definition of a substitution is that: a substitution is a function θ from
expressions to expressions such that for every expression e: if e is a constant, then eθ = e;
if e is composed of e1, e2, ..., en, then eθ is composed of e1θ, e2θ, ..., enθ in the same manner;
and if e is a variable, then eθ is an expression.

We can consider the identity function as a substitution, usually called the identity
substitution. We can observe that substitutions are uniquely determined by their treatment of
the variables; that is θ1 = θ2 if and only if vθ1 = vθ2 for every variable v.

The domain of a substitution is the set containing every variable that the substitution does not
map to itself. The domain of the identity substitution is the empty set.

An algorithm needs a systematic way of naming each substitution it uses with a finite
expression. Since we are only concerned with substitutions with finite domains, that can be
accomplished straightforwardly.

Suppose that the domain of a substitution θ is {x1, ..., xn} and that each xi is mapped to some
expression ti; then θ is named by {x1 → t1, ..., xn → tn}. The empty set names the identity
substitution and {x → y, y → x} names the substitution that swaps x and y throughout an
expression.
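The following Python sketch illustrates applying a substitution. The conventions used (variables as '?'-prefixed strings, constants as other strings, composed expressions as tuples, and substitutions as dictionaries) are assumptions made for the example.

def is_var(e):
    return isinstance(e, str) and e.startswith('?')

def apply_subst(e, theta):
    """Replace every variable in the domain of theta, leaving everything else alone."""
    if is_var(e):
        return theta.get(e, e)                         # variables map to expressions
    if isinstance(e, tuple):                           # composed expressions: component-wise
        return tuple(apply_subst(part, theta) for part in e)
    return e                                           # constants are left unchanged

theta = {'?x': ('f', 'a'), '?y': '?x'}
print(apply_subst(('g', '?x', '?y', 'b'), theta))      # ('g', ('f', 'a'), '?x', 'b')

Note that the replacement is simultaneous: although ?y is mapped to ?x, the result is not rewritten further.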

Instances

An expression e′ is said to be an instance of expression e if and only if e′ = eθ for some
substitution θ (i.e., eθ is an instance of e).

As the identity function is a substitution, we could say that every expression is an instance of
itself.

An expression is ground if and only if it contains no variables. We shall be especially
concerned with the ground instances of an expression, which obviously are the instances of
the expression that contain no variables.

We can write egr to denote the set of all ground instances of the expression e. By extension, if
E is a set of expressions, then Egr denotes the set of all ground instances of all expressions in
E, that is Egr = ∪e ∈ E egr.

Herbrand's Theorem

Herbrand's Theorem is that by considering ground instances, we can relate the validity of first
order Horn clause arguments to those of propositional logic. This relationship offers us
guidance on how to generalise the SLD resolution inference rule from propositional logic to
first order predicate calculus.

The theorem is: "Let P be a set of definite clauses and G be a goal clause. Then, P entails G if
and only if Pgr entails some ground instance of G."

The Herbrand theorem gives us an idea of how SLD resolution should work for first order
Horn clause arguments. The best way to demonstrate this is with some example arguments
and their SLD search spaces.
Premises:
(1) P(a)
(2) P(c) → P(d)
(3) R(d) → P(e)
(4) R(d)

Goal: P(x)

From the goal P(x), the search space has three branches:
• via (1), [] - with x replaced by a
• via (2), the goal P(c) - with x replaced by d (this branch is a dead end, as no premise has head P(c))
• via (3), the goal R(d) - with x replaced by e, which then resolves via (4) to []

Where variables occur in the premises, they can be replaced by any constant in a similar way
to variables in the goal, as long as the variable replacements are consistent.

Unifiers

A substitution θ is said to unify expressions e1 and e2 if e1θ = e2θ. We call these two
expressions unifiable, and θ is the unifier. The set of all unifiers of two expressions
can be characterised by a greatest element, the most general unifier.

If we let e1 and e2 be two expressions that have no variables in common, then e1 and e2 are
unifiable if and only if the intersection of (e1)gr and (e2)gr is non-empty. Furthermore, if they
are unifiable and θ is their most general unifier, then (e1θ)gr = (e2θ)gr = (e1)gr ∩ (e2)gr.

We can say that a substitution θ solves an equation e1 = e2 if e1θ = e2θ. We say that a
substitution solves a set of equations if it solves every equation in the set. (Note that ∅ is
solved by every substitution). Thus, we can view the problem of finding a most general
unifier of e1 and e2 as that of finding the most general solution to e1 = e2. The unification
algorithm finds a most general solution to any given finite set of equations by rewriting the
set until it is in such a form that a most general solution to it is readily seen.

An occurrence of an equation in an equation set is in solved form if the equation is of the
form x = t and x does not occur in t or in any other equation in the equation set. A set of
equations is in solved form if every equation in the set is in solved form. {x1 → t1,
..., xn → tn} is a most general solution to the solved form set of equations {x1 = t1, ..., xn = tn}.

The unification algorithm takes two expressions, s1 and s2, and outputs either a substitution or
failure. We let X be the singleton set of equations {s1 = s2} and repeatedly perform any of
the four unification transformations until no transformation will apply. The output is of the
form {x1 → t1, ..., xn → tn}, where x1 = t1, ..., xn = tn are the equations that remain in X.

The four unification transformations are:

1. Select any equation of the form t = x, where t is not a variable and x is a variable, and rewrite
this as x = t
2. Select any equation of the form x = x, where x is a variable and erase it
3. Select any equation of the form t = t′ where neither t nor t′ is a variable. If t and t′ are atomic
and identical, then erase the equation, else if t is of the form ƒ(e1, ..., en) and t′ is of the form
ƒ(e′1, ..., e′n), then replace t = t′ by e1 = e′1, ..., en = e′n, else fail.
4. Select any equation of the form x = t where x is a variable that occurs somewhere else in the
set of equations and where t ≠ x. If x occurs in t, then fail. Else, apply the substitution {x → t}
to all other equations (without erasing x = t).

The unification algorithm is correct by the theorem that: the unification algorithm performs
only a finite number of transformations on any finite input; if the unification algorithm
returns fail, then the initial set of equations has no solutions; if the unification algorithm
returns a set of equations then that set is in solved form and has the same solutions as the
input set of equations.
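A Python sketch of the algorithm, implementing the four transformations directly, is given below. It reuses the expression conventions (and the is_var and apply_subst helpers) from the substitution sketch above, repeated here so the sketch is self-contained; it is an illustration, not a production implementation.

def is_var(e):
    return isinstance(e, str) and e.startswith('?')

def apply_subst(e, theta):
    if is_var(e):
        return theta.get(e, e)
    if isinstance(e, tuple):
        return tuple(apply_subst(part, theta) for part in e)
    return e

def occurs(v, e):
    return e == v or (isinstance(e, tuple) and any(occurs(v, part) for part in e[1:]))

def unify(s1, s2):
    """Return a most general unifier of s1 and s2 as a dictionary, or None on failure."""
    eqs = [(s1, s2)]
    changed = True
    while changed:
        changed = False
        for i, (lhs, rhs) in enumerate(eqs):
            if not is_var(lhs) and is_var(rhs):                     # 1: rewrite t = x as x = t
                eqs[i] = (rhs, lhs)
            elif is_var(lhs) and lhs == rhs:                        # 2: erase x = x
                del eqs[i]
            elif not is_var(lhs) and not is_var(rhs):               # 3: decompose, erase or fail
                if (isinstance(lhs, tuple) and isinstance(rhs, tuple)
                        and lhs[0] == rhs[0] and len(lhs) == len(rhs)):
                    del eqs[i]
                    eqs.extend(zip(lhs[1:], rhs[1:]))
                elif lhs == rhs:                                    # identical atomic terms
                    del eqs[i]
                else:
                    return None
            elif is_var(lhs) and any(occurs(lhs, a) or occurs(lhs, b)
                                     for j, (a, b) in enumerate(eqs) if j != i):
                if occurs(lhs, rhs):                                # 4: occurs check fails
                    return None
                eqs = [(lhs, rhs) if j == i else                    # 4: substitute elsewhere
                       (apply_subst(a, {lhs: rhs}), apply_subst(b, {lhs: rhs}))
                       for j, (a, b) in enumerate(eqs)]
            else:
                continue                                            # no transformation applies
            changed = True
            break
    return dict(eqs)                                                # solved form {x1: t1, ...}

print(unify(('P', '?x', ('f', '?y')), ('P', 'a', ('f', 'b'))))      # {'?x': 'a', '?y': 'b'}

The substitution returned is exactly the most general unifier θ that the first order SLD resolution rule below applies to the derived goal clause.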

With this, we can apply SLD resolution for first order Horn clause arguments. From the goal
clause G1 ^ ... ^ Gi-1 ^ Gi ^ Gi+1 ^ ... ^ Gn (n ≥ 1) and the definite clause B1 ^ ... ^ Bm → H
(m ≥ 0) we can derive the goal clause (G1 ^ ... ^ Gi-1 ^ B1 ^ ... ^ Bm ^ Gi+1 ^ ... ^ Gn)θ,
provided that H and Gi are unifiable and θ is the most general unifier of the two.

The SLD-resolution inference rule is sound and complete for first-order Horn clause
arguments, and selection function independence also applies in the first-order case.

Answer Extraction

Definite clauses are so-named because they do not allow the expression of indefinite
(disjunctive) information. e.g., given the premise thief(smith) ∨ thief(jones) and the
conclusion ∃x thief(x), the conclusion follows, but we cannot name the thief. This situation
would never arise if the premises were definite clauses.

If we let P be a set of definite clauses and let C be a formula without any quantifiers, then P
entails ∃C if and only if P entails some ground instance of C. If Cθ is a ground instance that is
entailed, then θ is said to be an answer.

We know that if a first-order Horn clause argument is valid then it has an SLD proof, and it is
easy to extract an answer from the proof. Furthermore, if we consider all SLD proofs for an
argument, then we can extract all answers.

Answers are extracted by attaching to each node in the search space an answer box that
records the substitutions made for each variable in the goal clause. The answers are then
located in the answer box attached to each empty clause in the search space.

Constraint Satisfaction
An instance of the finite-domain constraint satisfaction problem consists of a finite set of
variables X1, ..., Xn, and for each variable Xi a finite set of values, Di, called the domain
(although Russell and Norvig require this to be non-empty, common practice is to allow
empty domains, as they are useful), and a finite set of constraints, each restricting the values
that the variables can simultaneously take.

A total assignment of a CSP maps each variable to an element in its domain (it is a total
function), and a solution to the CSP is a total assignment that satisfies all the constraints.
Once you have an instance of a CSP, the goal is usually to find whether the instance has any
solutions (satisfiable), to find any or all solutions, or to find a solution that is optimal for
some given objective function (combinatorial optimisation).

Constraints in a CSP have an arity, the number of variables it constrains - constraints of arity
2 are referred to as binary constraints.

The satisfiability problem for a CSP is NP-complete; NP-hardness can be shown by reducing SAT
to the CSP, where the variables are proposition letters, the domains are {True, False} and the
constraint is that the entire formula must evaluate to True. Therefore, no polynomial-time
algorithm is known, and we need to search in some form.

Here, we are only looking at constraint satisfaction for discrete, finite domains. Other
constraint satisfaction problems (over the domain of reals) include linear programming
(solvable in polynomial time), non-linear programming and integer linear programming (NP-complete).
Other extensions to the basic problem include soft constraints, dynamic constraints and
probabilistic constraints.

Constraints are expressed extensionally, where a constraint among n variables is considered
as a set of n-tuples. If X1, ..., Xn are variables with domains D1, ..., Dn, then a constraint among
these variables is a subset of D1 × ... × Dn. The traditional way, especially for theoretical
studies, abstracts away from any language for expressing constraints, although it can be
thought of as a disjunctive normal form.
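As a small illustration of the extensional view (the domains and the constraint are chosen arbitrarily for the example), with Dx = Dy = {1, 2, 3} the constraint x < y is just the subset of Dx × Dy containing the tuples that satisfy it:

from itertools import product

Dx = {1, 2, 3}
Dy = {1, 2, 3}

# Extensional representation: the set of 2-tuples that satisfy the constraint.
less_than = {(vx, vy) for vx, vy in product(Dx, Dy) if vx < vy}
print(less_than)                                          # contains (1, 2), (1, 3) and (2, 3)

# A total assignment satisfies the constraint iff its projection onto (x, y)
# is one of these tuples.
assignment = {'x': 1, 'y': 3}
print((assignment['x'], assignment['y']) in less_than)    # True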

In a constraint language, we can use operators such as = (equality), ≠ (disequality), inequality
on domains with an ordering (< or >), operators on values or variables, such as + or ×, logical
operators such as conjunction, negation and disjunction, and constraints and operators
specific to particular domains, such as alldifferent(x1, ..., xn) or operations on lists, intervals
and sets.

CSP solving is often effective in many cases, and can be solved by first modelling, where a
problem is formulated as a finite CSP (similar to reducing the problem), solving the CSP
(there is powerful technology available to solve even large instances of a CSP that arise from
modelling important real-world problems) and then mapping the solution to the CSP to a
solution to the original problem.

CSP solving methods include constraint synthesis, problem reduction or simplification, total
assignment search (also called local search or repair methods, e.g., hill-climbing, simulated
annealing, genetic algorithms, tabu search or stochastic local search), and partial assignment
search (also called constraint satisfaction search, partial-instantiate-and-prune and constraint
programming). Searches can be classified by two criteria: stochastic or non-stochastic and
systematic or non-systematic search.

Partial Assignment Search

To find a solution by partial assignment search, a space of partial assignments is searched,


looking for a goal state which is a total assignment that satisfies all the constraints.

In partial assignment search, you start at the initial state with an assignment to no variables.
Operators are then applied, each of which extends the assignment to cover an additional
variable. Some reasoning is then performed at each node to simplify the problem and to prune
nodes that can not extend to reach a solution.

Partial assignment search uses a method called backtracking: if all the variables of a constraint
are assigned and the constraint is not satisfied, then the node containing that constraint is pruned.

Partial assignment search for an instance I of the CSP can be shown as a problem
representation:

• States: A state consists of a CSP instance and a partial assignment
• Initial State: The CSP instance I and the empty assignment. Problem reduction is often
performed to simplify the initial state
• Goal States: Any state that has not been pruned and the assignment is total
• Operators: A variable selection function (heuristic) is used to choose an unassigned variable,
Xi. Then, for every value v in the domain of Xi, there is an operator that produces a new state
by:
o Extending the assignment by assigning v to Xi
o Setting Di to {v}
o Performing problem reduction to reduce the domains
o Pruning the new node if any domain is the empty set
• Path Cost: This is constant for each operator application
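A stripped-down Python sketch of this search is given below. It assumes constraints are given extensionally as (scope, allowed-tuples) pairs, selects variables in a fixed order, and uses only the simple backtracking pruning rule - no problem reduction is performed at each node.

def consistent(assignment, constraints):
    """Pruning test: a constraint is only checked once all of its variables are assigned."""
    for scope, allowed in constraints:
        if all(v in assignment for v in scope):
            if tuple(assignment[v] for v in scope) not in allowed:
                return False
    return True

def search(assignment, variables, domains, constraints):
    if len(assignment) == len(variables):          # a total assignment that passed all checks
        return assignment
    var = next(v for v in variables if v not in assignment)    # variable selection
    for value in domains[var]:                     # one operator per value in the domain
        extended = {**assignment, var: value}
        if consistent(extended, constraints):      # prune nodes that violate a constraint
            result = search(extended, variables, domains, constraints)
            if result is not None:
                return result
    return None                                    # no solution below this node

# Tiny example: x, y, z in {1, 2, 3} with x < y and y != z.
variables = ['x', 'y', 'z']
domains = {v: [1, 2, 3] for v in variables}
constraints = [(('x', 'y'), {(1, 2), (1, 3), (2, 3)}),
               (('y', 'z'), {(a, b) for a in (1, 2, 3) for b in (1, 2, 3) if a != b})]
print(search({}, variables, domains, constraints))   # {'x': 1, 'y': 2, 'z': 1}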

There are issues in partial assignment search, however: what problem reductions should be
performed, how variables are selected, how the search space is searched, and how the values
are ordered.

Given a problem state (a set of variables with domains, a partial assignment to those variables
and a set of constraints), then we should be able to transform it to a simpler, but equivalent
problem state.

Sometimes the simplified problem state will be so simple that it is obvious what the
solutions are, or that there are no solutions. In the non-obvious cases, search continues.

Other techniques for reduction include domain reduction.

Consistency Techniques

Consistency techniques are a class of reduction techniques that vary in how much reduction
they achieve and how much time they take to compute. In the extreme, there are some
consistency techniques that are so powerful that they solve a CSP by themselves, but these
are too slow in practice. At the other extreme, one could also solve by search with little or no
simplification, which is also ineffective.

For many applications, generalised arc consistency (GAC) provides an effective trade-off for
a wide class of problems. Node consistency is a special case of GAC for unary constraints
and arc consistency is a special case for binary constraints. Other consistency techniques
could also be effective for specific problems.
Node Consistency

A unary constraint C on a variable x is node consistent if C(v) is true for every value v in the
domain of x.

A CSP is node consistent if all of its unary constraints are node consistent.

To achieve node consistency on a constraint C(x), remove from the domain of x every
value vx, such that C(vx) is false. Node consistency on a problem is achieved by achieving it
once on each unary constraint.

If a CSP instance is node consistent, then all of its unary constraints can be removed without
changing the solutions to the problem. Whether or not a CSP instance is satisfiable is
independent of whether or not it is node consistent, as shown below:

For the constraint odd(x):

• node consistent and satisfiable: Dx = {1}
• node consistent but unsatisfiable: Dx = {}
• not node consistent but satisfiable: Dx = {1,2}
• not node consistent and unsatisfiable: Dx = {2}

Arc Consistency

This is the binary case of generalised arc consistency. A constraint C on variables x and y is
arc consistent from x to y if and only if for every value vx in the domain of x there is a
value vy in the domain of y such that {x → vx, y → vy} satisfies C. Alternatively we could say
that every value in x takes part in some solution of C.

A constraint C between x and y is arc consistent (written AC(C)) if it is arc consistent
from x to y and from y to x.

An instance of a CSP is arc consistent if all of its constraints are arc consistent. Whether or
not a CSP instance is satisfiable is independent of whether or not it is arc consistent:

For the constraint x ≠ y:

• arc consistent and satisfiable: Dx = {1}, Dy = {2}
• arc consistent but unsatisfiable: Dx = {}, Dy = {}
• not arc consistent but satisfiable: Dx = {1,2}, Dy = {2}
• not arc consistent and unsatisfiable: Dx = {1}, Dy = {1}

Arc consistency is directional and is local to a single constraint.

Applying arc consistency to a constraint C results in x having an empty domain if and only
if C is unsatisfiable, and x ends up with an empty domain if and only if y does.
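Achieving arc consistency from x to y can be sketched as a single 'revise' step; representing the constraint as a Python predicate is an assumption made for the sketch.

def revise(dom_x, dom_y, c):
    """Return the domain of x reduced so that the constraint c is arc
    consistent from x to y: keep only values of x with some support in y."""
    return {vx for vx in dom_x if any(c(vx, vy) for vy in dom_y)}

neq = lambda vx, vy: vx != vy           # the constraint x != y from the examples above
print(revise({1, 2}, {2}, neq))         # {1}: the value 2 has no support in Dy = {2}
print(revise({1}, {1}, neq))            # set(): the constraint is unsatisfiable

Achieving AC(C) means revising in both directions; generalised arc consistency, below, extends the same idea to constraints of any arity.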
Generalised Arc Consistency

A constraint C on a set {x0, x1, ..., xn} is generalised arc consistent from x0 if and only if for
every value v0 in the domain of x0 there are values v1, ..., vn in the domains of x1, ..., xn such
that {x0 → v0, ..., xn → vn} satisfies C. Alternatively, we could say that every value in the
domain of x0 takes part in some solution of C.

A constraint C on a set X of variables is generalised arc consistent (written GAC(C)) if and
only if it is generalised arc consistent from every x ∈ X. An instance of the CSP is
generalised arc consistent if all of its constraints are generalised arc consistent.

This is a generalisation of both node and arc consistency. A binary constraint is arc consistent
if and only if it is generalised arc consistent. A unary constraint is node consistent if and only
if it is generalised arc consistent.

Domain Reduction during Search

There are three main techniques for applying domain reduction during search:

• Backtracking: When all variables of a constraint are assigned, then check if the constraint is
satisfied - if not, then prune. This is almost always ineffective as it does not perform enough
pruning.
• Forward Checking: When all but one of the variables of a constraint C are assigned, achieve
generalised arc consistency from the unassigned variable to the other variables and prune if
a domain becomes empty.
• Maintaining Arc Consistency: Achieve GAC at each node, and prune if a domain becomes
empty.

Variable Ordering

Variable ordering can affect the size of the search space dramatically. There are two main
approaches to variable ordering: static, where the ordering is fixed before the search begins
and is the same down every path; and dynamic, where the choice depends on the current
state of the search, and hence can vary from path to path.

The most common type of variable ordering is based on the fail-first principle. A variable
with the smallest domain is chosen, and ties can be broken by choosing a variable that
participates in most constraints. This is static if no simplification is done and is dynamic if
simplification reduces domain sizes. This method often works well and is built into many
toolkits.
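A minimal sketch of the fail-first choice follows; the constraint_count mapping (how many constraints each variable participates in, used to break ties) is assumed to be supplied by the caller.

def select_variable(unassigned, domains, constraint_count):
    # Smallest current domain first; break ties by most constraints.
    return min(unassigned, key=lambda v: (len(domains[v]), -constraint_count[v]))

domains = {'x': {1, 2, 3}, 'y': {1, 2}, 'z': {1, 2}}
constraint_count = {'x': 1, 'y': 3, 'z': 2}
print(select_variable({'x', 'y', 'z'}, domains, constraint_count))   # 'y'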

Other methods sometimes work better for particular problems, such as in laying out a PCB to
place the largest components first, and in class timetabling to timetable the longest
continuous sessions.

Once the variables have been ordered and a pruning method selected, we have a search space,
however we need to consider how to search this space. Standard depth first search is used,
however a variety of advanced methods for dependency-directed backtracking exist. Instead
of backing up one level on failure, it can sometimes be inferred that there can be no solution
unless search backs up several levels.
Once a variable has been chosen, we also have to consider how to order its values. This has
no effect if no solution exists or if all solutions are to be found. The first strategy is simply to
choose a value arbitrarily - this is a popular strategy. Another strategy is, for each value, to
forward check and choose the value that maximises the sum of the domain sizes of the
remaining variables; a final strategy is, for each value, to forward check and choose the value
that maximises the product of the domain sizes of the remaining variables.

Particular problems may also have simple strategies that enable the choice of a value most
likely to succeed.

Planning
A planner takes an initial state and a goal, and then uses a process known as planning to put
the world into a state that satisfies the goal. In principle, there is no difference between this
and search (as for most AI techniques). However, in a simple search (where all applicable
actions are taken in a state, and then the resulting states are checked to see if they are a goal
state), this becomes infeasible even in moderately complex domains. A solution to this
problem is to exploit information about actions, so irrelevant actions are never chosen (e.g.,
for an AI driver, if your goal is to get from A to B, then you don't need to alter the air
conditioning).

Situation Calculus

Situation calculus is one method of representing change in the world. Situation calculus
works by maintaining an up-to-date version of the state of the world, by tracking changes
within it. The language of situation calculus is first order logic.

Situation calculus consists of:

• domain predicates, which represent information about the state of the world
• actions
• atemporal axioms, which consist of general facts about the world
• intra-situational axioms, which are facts about situations
• causal axioms, which describe the effects of actions

Predicates that can change depending on the world state need an extra argument to say in
what state they are true, e.g., in the blocks world, held(x) becomes held(x, s), etc...
Predicates that do not change do not need the extra argument, e.g., block(x).

We also have a function called result(a, s), which is used to describe the situation that
results from performing an action a in a situation s.

Causal axioms model what changes in a situation that results from executing some action, not
what stays the same. Therefore, we need to define frame axioms which define what stays the
same in a situation. This necessity causes the frame problem, as in a complex domain, a lot of
things stay the same after an action has been executed.

Finding a sequence of actions that leads to a goal state can be formulated as an entailment
problem in first order predicate calculus. For situation calculus planning, the steps to take are
to specify an initial state (as state s0), specify a goal (a formula that contains one free variable
S) and then find a situation term such that the goal is a logical consequence of the axioms
and the initial state. A situation term is a ground term of nested result expressions that denotes
a state.

We can formally state this as: Given a set of axioms D, an initial state description I and a goal
condition G, find a situation term t such that G {S → t} is a logical consequence of D ∪ I.

Situation calculus does have limitations, however - it has a high time complexity, does not
support simultaneous actions, does not handle external actions, or other changes than those
brought about as a direct result of the actions, it assumes that all actions are instantaneous,
and there is no way to represent time.

As we can see, there are multiple problems with planning, such as the frame problem (both
representational and inferential), the qualification problem (it is difficult to define all
circumstances under which an action is guaranteed to work) and the ramification problem
(there is a proliferation of implicit consequences of actions).

STRIPS

STRIPS is the Stanford Research Institute Problem Solver, and also the formal language that
the problem solver uses. It provides a representation formalism for actions and has the idea
that everything stays the same unless explicitly mentioned - this avoids the frame problem.

Again, STRIPS uses propositional or first order logic, and the state of the world is
represented by a conjunction of literals. There is one restriction, and that is that first-order
literals must be ground and function-free. STRIPS uses the closed world assumption, that is,
everything that is not mentioned in a state is assumed to be false. In STRIPS, the goal is
represented as a conjunction of positive (that is, non negated) literals. In the propositional
case, the goal is satisfied if the state of the world contains all the propositional atoms. In the
first-order case, the goal is satisfied if the state contains (all) positive literals in the goal.

The STRIPS representation consists of a state (a conjunction of function-free literals), a goal
(a conjunction of positive function-free literals - they can however contain variables, which
are assumed to be existentially quantified) and an action (represented by a set of pre-conditions
that must hold before it can be executed and a set of post-conditions, i.e., effects that happen
when it is executed). It is common practice to separate the effects (post-conditions) of an
action in STRIPS into an add list, which contains the positive literals, and a delete list, which
contains the negative literals, for the sake of readability.

An action is said to be applicable in some state if the state satisfies the action's preconditions
(i.e., all literals in the preconditions are part of the state). To establish applicability for first-
order actions (action schemas), we need to find a substitution for the variables that makes the
action applicable.

The result of applying an action is that the positive literals in the effect are added to the state,
and any negative literals in the effect that match existing positive literals in the state cause
those positive literals to be removed. The exceptions are that positive literals already in the
state are not added again, and negative literals that match nothing in the state are ignored.
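These rules amount to simple set operations on states. The Python sketch below illustrates them with a simplified unstack action from the blocks world; the literals and the action's precondition, add and delete lists are illustrative and omit some of the conditions a full blocks-world axiomatisation would include.

def applicable(state, preconditions):
    """An action is applicable if the state satisfies all of its preconditions."""
    return preconditions <= state

def apply_action(state, add_list, delete_list):
    """Delete-list literals are removed and add-list literals are added; literals
    already present are not added twice, and deletions that match nothing are
    ignored (set difference and union handle both cases)."""
    return (state - delete_list) | add_list

state = {'Ontable(A)', 'On(C, A)', 'Ontable(B)', 'Handempty'}
unstack_C_A = {
    'pre': {'On(C, A)', 'Handempty'},
    'add': {'Holding(C)'},
    'del': {'On(C, A)', 'Handempty'},
}
if applicable(state, unstack_C_A['pre']):
    state = apply_action(state, unstack_C_A['add'], unstack_C_A['del'])
print(state)   # {'Ontable(A)', 'Ontable(B)', 'Holding(C)'} (in some order)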
Regression Planning

Planning in STRIPS can be seen as a search problem. We can move from one state of a
problem to another in both a forward and backward direction because the actions are defined
in terms of both preconditions and effects. We can use either forward search, called
progression planning, or backward search, known as regression planning.

Forward planning is equivalent to forward search, and as such is very inefficient and suffers
from all of the same problems as the underlying search mechanism. A better way to solve the
planning problem is through backwards state-space search, i.e., starting at the goal and
working our way back to the initial state. With backwards search, we need to only consider
moves that achieve part of the goal, and with STRIPS, there is no problem in finding the
predecessors of a state.

To do regression planning, all literals from the goal description are put into a current state S
(the state to be achieved). A literal is then selected from S, and an action selected that
contains the selected literal in its add list. The preconditions of the selected action are added
to the current state S and literals in the add list are removed from S. When the start state is a
subset of the current list of literals of S, we can return success.

Whilst doing regression planning, we must insist on consistency (i.e., that the actions we
select do not undo any of the desired goal literals).

This state space search method is still inefficient however, so we should consider whether or
not we can perform A* search with an admissible heuristic.

Subgoal Independence

It is often beneficial to assume that the goals that we have to solve do not interact. Then, the
cost of solving a conjunction of goals is approximated by the sum of the costs of solving each
individual goal separately. This heuristic is:

• optimistic (and therefore admissible) when the goals do interact, i.e., an action in a sub plan
deletes a goal achieved by another subplan
• pessimistic (inadmissible) when subplans contain redundant actions

Empty Delete List

This heuristic assumes that all actions have only positive effects. For example, if an action
has the effects A and ¬B, the empty delete list heuristic considers the action as if it only had
the effect A. In this way, we can assume that no action can delete the literals achieved by
another action.

Sussman Anomaly

The Sussman Anomaly is a situation in the blocks world that cannot be solved effectively using
the simple planning methods discussed above, because the subgoals interact.
The initial state is On(C, A), Ontable(A), Ontable(B), Handempty and the goal state
is On(A, B), On(B, C).

One way of solving this anomaly is to add the condition Ontable(C), however another
way is to use another method of planning known as partial order planning.

Partial Order Planning

Up until now, plans have been totally ordered, i.e., the exact temporal relationships between
the actions are known. In partially ordered plans, we don't have to specify the temporal
relationships between all the actions, and in practice this means that we can identify actions
that happen in any order.

Above, we have searched from the initial state to the goal or vice versa. A complete plan is
one which provides a path from the initial state to the goal, and a partial plan is one which is
not complete, but could be extended to be complete by adding actions at the end (forwards)
or at the beginning (backwards) of it. We then need to consider whether or not we can search
this space of partial plans.

Doing planning as search happens in the space of partially ordered partial plans, where the
nodes are the partially ordered partial plans. The edges denote refinements of the plans, i.e.,
adding a new action and constraining the temporal relationships between two actions. The
root (initial) node is the empty plan, and a goal is a plan that, when executed, solves the
original problem.

We represent a plan as a tuple (A, O, L) where:

• A is a set of actions
• O is a set of orderings on actions, e.g., A < B means that action A must take place before
action B
• L is a set of causal links which represent dependencies between actions, e.g., (Aadd, Q, Aneed)
means that Q is a precondition of Aneed and also an effect of Aadd. Obviously we still have to
maintain that Aadd < Aneed.
We also need to consider threats - we say that an action threatens a causal link when it might
delete the goal that the link satisfies. The consequences of adding an action that breaks a
causal link into the plan are serious. We have to make sure to remove the threat by demotion
(move earlier) or promotion (move later).

The partial order planning (POP) algorithm works as follows:

• Create two fake actions, start and finish
• start has no preconditions and its effects are the literals of the initial state
• finish has no effects and its preconditions are the literals of the goal state
• an agenda is created which contains pairs (precondition, action) and initially
contains only the preconditions of the finish action
• while the agenda is not empty:
o select an item from the agenda (i.e., a pair (Q, Aneed))
o select an action Aadd whose effects include the selected precondition Q
o If Aadd is new, add it to the plan (i.e., to the set of actions A)
o Add the ordering Aadd < Aneed
o Add the causal link (Aadd, Q, Aneed)
o Update the agenda by adding all the pre-conditions of the new action (i.e., all the
(Qi, Aadd)) and remove the previously selected agenda item
o Protect causal links by checking whether Aadd threatens any existing links, and check
existing actions for threats towards the new causal links. Demotion and promotion
can be used as necessary to remove threats
o Finish when there are no open preconditions

Other Approaches to AI
The agents we have looked at above are known as logic-based agents, where an agent
performs logic (symbolic) reasoning to choose the next action, based on percepts and internal
axioms. Logic based agents have the advantages of the ability to use domain knowledge, and
the behaviour of agents has clean logical semantics. However, logical reasoning takes time
(sometimes a considerable amount of time), so the computed action may no longer be optimal
at execution time (i.e., logic causes calculative rationality). Additionally, representing and
reasoning about dynamic, complex environments in logic is difficult, and as logic-based
agents operate on a set of symbols that represent entities in the world, mapping the percepts
to these symbols is often difficult and problematic.

Reactive Agents

Reactive agents are based on the physical grounding hypothesis and this is sometimes referred
to as Nouvelle AI or situated activity (Brooks, 1990).

Here, intelligent systems necessarily have their interpretations grounded in the physical
world, and agents map sensory input directly to output (no intermediate symbolic
representation is built). Nouvelle AI takes the view that it is better to start with the
development of simple agents in the real world than complex agents in simplified environments.

Reactive agents are built with a subsumption architecture, where organisation is by
behavioural units (or layers), and decomposition is by activity rather than by function. Each
layer connects sensory input to effectors, and each layer is a reactive rule (originally
implemented as an augmented finite state machine). Intelligent behaviour then emerges from
the interaction of many simple behaviours.

We can therefore say that the decision making of a reactive agent is realised through a set of
task accomplishing behaviours. These behaviours are implemented as simple rules that map
perceptual input directly to actions (state → action). As many rules can fire simultaneously,
we need to consider a priority mechanism to choose between different actions.

We can formally specify a behaviour rule as a pair (c, a), where c ⊆ P is a condition and a is
an action. A behaviour (c, a) will then fire when the environment is in state s ∈ S and c ⊆
See(s), that is, the agent's percepts in state s. We also let R be the set of an agent's behaviour
rules, and we let '<' be a preference relation over R (a total ordering). We say that b1 < b2 means
that b1 inhibits b2, i.e., that b1 will get priority over b2.
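Action selection for such an agent can be sketched very simply; the percepts, conditions and action names below are purely illustrative.

def select_action(percepts, rules):
    """Fire the highest-priority behaviour whose condition is contained in the
    current percepts; the rules are listed from highest to lowest priority."""
    for condition, action in rules:         # ordered by the inhibition relation '<'
        if condition <= percepts:
            return action
    return 'do_nothing'

rules = [
    ({'obstacle_ahead'}, 'turn_left'),      # highest priority: avoid obstacles
    ({'sample_detected'}, 'pick_up_sample'),
    (set(), 'move_randomly'),               # empty condition: always applicable
]
print(select_action({'sample_detected'}, rules))   # 'pick_up_sample'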

Reactive agents are therefore very efficient - there is a very small time lag between
perception and action - and they are simple and robust in noisy domains. We also say that they have
cognitive adequacy. Humans have no full monolithic internal models, that is, no internal
model of the entire visible scene, and they even have multiple internal representations that are
not necessarily consistent. Additionally, humans have no monolithic control, and humans are
not general purpose - they perform poorly in different contexts (especially abstract ones).

On the other hand, reactive agents require that the local environment has sufficient
information for an agent to determine an acceptable action. They are hard to engineer, as
there is no principled methodology for building reactive agents, and they are especially
difficult to develop for complex environments.

We can therefore conclude that logic-based (symbolic) AI demonstrated sophisticated
reasoning in simple domains, in the hope that these ideas could be generalised to more
complex domains, and nouvelle AI solved less sophisticated tasks in noisy complex domains,
in the hope that this can be generalised to more sophisticated tasks. We should consider
whether it is possible to get the best of both worlds.

Layered Architectures

Layered architectures work by creating a subsystem (layer) for each type of behaviour (i.e.,
logic-based, reactive, etc) and these subsystems are arranged into a hierarchy of interacting
layers. Horizontal layering is where each layer is directly connected to sensors and effectors,
and vertical layering is where sensory input and output is dealt with by at most one layer
each.
Vertical layering also comes in two varieties, one-pass control and two-pass control.

Multi-Agent Systems

Multi-agent systems come with the idea that "there is no single agent system". This is
different to traditional distributed systems in that agents may be designed by different
individuals with different goals, and agents must be able to dynamically and autonomously
co-ordinate their activities and co-operate (or compete) with each other.

One methodology for multi-agent systems is co-operative distributed problem solving
(CDPS). This is where a network of problem solvers work together to solve problems that are
beyond their individual capabilities; no individual node has sufficient expertise, resources and
beyond their individual capabilities; no individual node has sufficient expertise, resources and
information to solve the problem by themselves. CDPS works on the benevolence assumption
that all agents share a common goal. Self-interested agents (agents designed by different
individuals or organisations to represent their interests) can exist, however.

Multi-agent systems can be evaluated using two main criteria: coherence - that is, the solution
quality and resource efficiency; and the co-ordination, the degree to which the agents can
avoid interfering with each other, and having to explicitly synchronise and align their
activities.

Issues in CDPS include dividing a problem into smaller tasks for distribution among agents,
synthesising a problem solution from sub-problem results, optimising problem solving
activities to achieve maximum coherence and avoiding unhelpful interactions and
maximising the effectiveness of existing interactions.

Task and result sharing comes in three main phases: problem decomposition, where sub-
problems (possibly in a hierarchy) are produced to an appropriate level of granularity; sub-
problem solution, which includes the distribution of tasks to agents; and solution synthesis.

One way of accomplishing this is using a method known as contract net. This is a task
sharing mechanism based on the idea of contracting in the business world. Currently, contract
net is the most implemented and best studied framework for distributed problem solving.

Contract net starts by task announcement - an agent called a task manager advertises the
existence of a task (either to a specific audience, or to all), and agents listen to announcements,
evaluate them and decide whether or not to submit a bid. A bid indicates the capabilities of
the bidder with regard to the task. The announcer receives the bids and chooses the most
appropriate agent (the contractor) to execute the task. Once a task is completed, a report is
sent to the task manager.

The agents co-operatively exchange information as a solution is developed, and result sharing
improves group performance in the following ways:

• confidence: independently derived solutions can be cross-checked
• completeness: agents share their local views to achieve a better global view
• precision
• timeliness: a quicker problem solution is achieved

A co-ordination mechanism is necessary to manage inter-dependencies between the
activities of agents, and we can consider a hierarchy of co-ordination relationships. Among
these, there are multiple types of positive, non-requested relationships:

• action equality relationship: two agents plan to perform an identical action, so only one
needs to perform it
• consequence relationship: a side-effect of an agent's action achieves an explicit goal of
another
• favour relationship: a side-effect of an agent's plan makes it easier for another agent to
achieve his goals

Partial Global Planning

In partial global planning, the idea is that agents exchange information in order to reach
common conclusions about the problem-solving process. It is partial as the system does not
generate a plan for the entire problem, and is global as agents form non-local plans by
exchanging local plans to achieve a non-local view of problem solving.

There are three stages in partial global planning. In the first, each agent decides what its own
goals are and generates plans to achieve them. Agents then exchange information to
determine where plans and goals interact. Agents then alter their own local plans in order to
better co-ordinate their own activities.

A meta-level structure guides the co-operation process within the system. It dictates which
agents an agent should exchange information with, and under what conditions information
exchanges should take place.

A partial global plan contains: an objective which the multi-agent system is working towards;
activity maps, which are a representation of what agents are actually doing, and what results
will be generated by the activities; and a solution construction graph, which represents how
agents ought to interact, what information ought to be exchanged, and when.

Finally we need to consider multi-agent planning. In a multi-agent system, we can either have
centralised planning for distributed plans, distributed planning for centralised plans, or
distributed planning for distributed plans. When planning is distributed, we need to consider
plan merging.

Plan merging works by taking a set of plans generated by single agents, and from them
generating a conflict-free multi-agent plan. This multi-agent plan is not necessarily optimal,
however. We can use planning approaches discussed above, however we often also require a
During List, which contains conditions that must hold whilst the action is being carried out.

Plan merging has three stages:

• interaction analysis, where interactions between the agents' plans are discovered. This is
based on the principles of: satisfiability, where two actions are said to be satisfiable if there
is some sequence in which they may be executed without invalidating a pre-condition;
commutativity, where two actions may be executed in parallel; and precedence, which is the
sequence in which actions may be executed
• safety analysis, where it is determined which interactions are unsafe. Actions where there
is no interaction, or where the actions commute, are removed from the plan, and the set
of all harmful interactions is generated using search
• interaction resolution, which is the mutual exclusion of critical sections (i.e., unsafe plan
interactions)
CHAPTER 7:
Relational Databases and Query Languages
Information Systems
Different types of system exist based on their main focus:

• Processing systems - these were mainly covered in MSD last term (e.g., the mobile phone
case study)
• Information systems - data is a central artefact and is the most important aspect of the
system (e.g., payroll systems)
• Control systems - exist to monitor/control a piece of equipment, e.g., a lift. They have very
different types of concerns to other types of systems and are covered in modules such as
RTS.

RDQ is primarily concerned with information systems. We can consider two components of
information systems: the database - supported by a DBMS, the database is always in a valid
state and does more than just holding data (e.g., it performs validation and verification); and
transactions, which are atomic operations (which is challenging, as you have to deal with
crashing and errors, as well as handling concurrency), defined using basic operations and are
based on data constraints and business rules.

Data needs to exist by itself, not just as part of a program.

Development of Large Databases

The development of a large database proceeds in six phases. Each phase has a data content
and structure strand and, where relevant, a database applications strand:

• Phase 1 - Requirements collection and analysis. Data content and structure: data requirements. Database applications: processing requirements.
• Phase 2 - Conceptual database design. Data content and structure: conceptual schema design (DBMS-independent). Database applications: transaction and application design (DBMS-independent).
• Phase 3 - Choice of DBMS.
• Phase 4 - Data model mapping. Data content and structure: logical schema and view design (DBMS-dependent).
• Phase 5 - Physical design. Data content and structure: internal schema design (DBMS-dependent).
• Phase 6 - System implementation and tuning. Data content and structure: DDL (data definition language) and SDL statements. Database applications: transaction and application implementation.
Databases were traditionally specified using an entity-relationship diagram, but modern
specifications use conceptual class models with restrictions, although we refer to classes in a
database context as entities, and associations become relationships. Transactions are
specified using collaboration diagrams, which are rarely complex.

All entities must have the basic operations of create, remove, modify attribute and link, and
inheritance can only be of attributes and associations, not methods. Inheritance is either total,
where the parent class is only used for inheritance, or partial, which is normal inheritance.
Multiplicities are traditionally restricted to 0, 1 and * to give us one-to-one, one-to-many and
many-to-many relationships. To allow other UML multiplicities, we use participation limits.

When considering attributes, we can consider composite attributes, ones that can be broken
down further (e.g., an address) and also multi-valued attributes, where several pieces of data
can exist for one attribute (e.g., telephone number, could have daytime, evening, mobile,
etc...).

Data Integrity
Data integrity exists to define what it means for data to be valid. Data integrity rules can be
expressed in OCL, B, Z or using structured English. We can consider three types of integrity
concerns: structural integrity, such as multiplicity and participation limits; business rules -
constraints that arise from the way a business works; and inherent integrity, which is related
to the database paradigm and is only relevant during design.

Different types of constraints for business rules exist:

• Domain - Although it is too early to give specific type constraints, we can give high level
abstractions of the types allowed, e.g., say that age has a maximum value of 26, rather than
age is an int.
• Inter-attribute - A restriction on the values of an attribute in terms of the values of one or
more attributes of the same class. The inclusion of derived attributes generally leads to this
kind of constraint.
• Class - This is limitations on a class, e.g., on how many instances of the class may be allowed.
• Inter-class - This is similar to inter-attribute constraints, but involves attributes of different
entities.

Verification
Verification and validation are important components of any high quality engineered
solution. The kinds of quality control issues that can arise in information systems
specification are summarised in the V-model, as studied in MSD. Here, we only deal with
validation and verification of specification models.

As we use direct translation to approach the construction of the database, it is essential that
the models produced are verified internally and checked against each other, as well as against
the requirements. The whole topic of validation can be discussed under requirements
engineering, but here we only look at it from the point of view of data modelling.
Data models represent the requirements for a system's data structure, and should represent all
of the data required by the system. Data models should aim to reduce redundancy - ideal data
models should not have any redundant structures or data. Class diagram checks are based on
traditional data model checks, and use a technique called resolution and the following check
list:

• Component checks:
o Is every class linked to at least one other?
o Are all multiplicities defined?
o Are all names unique?
• Relationship checks:
o Are only simple associations and simple subtyping used?
o Apply resolution checks to all many-to-many associations to pick up missing or
misplaced attributes
• Redundancy checks:
o Check that all classes and associations are needed
o Ensure there is a good reason that classes linked by associations with multiplicity 1
or 0..1 at both ends cannot be merged
o Where two classes are linked by more than one sequence of associations, check
whether some of the associations are redundant
o Subclasses of a single class are not disjoint

Some checks can be carried out automatically with a tool, but others require insight into the
application.

Resolution Checks

Resolution is used to check the validity of binary associations that have many-to-many
relationships. Resolution looks for missed or misplaced attributes, and resolution replaces a
many-to-many association with two associations and a new class:

This is more general than a standard UML association class, as it solves some of the
weaknesses in the association class, such as only having one association class per
relationship.

Transaction model checking simply ensures that:

• all necessary components are represented
• all class operations are called by the manager
• indirect messages relate to the creation and deletion of links
• messages that call set-based operations have sets as targets
• consistency of guards.
Finally, model cross checking of the class and collaboration models leads to elaboration of, at
least, the class model. It checks that there are enough transactions to handle the data and there
is enough data to carry out the transactions. The basic checking processes are that all called
operations are available in the classes and all data in guards and parameters must be
available. Extra checks such as general checks, integrity checks and navigation checks could
also be used.

To check there are enough transactions, you need to ensure that every class has transactions
to create, delete and change the attributes of an instance, and every association has
transactions to create and delete links. Therefore, for every role whose multiplicity does not
include 0, any transaction that creates an instance of the opposite class must also create a link
to that instance and any transaction that deletes the last or only link to the opposite class must
also delete the instance of the opposite class.

Integrity checks ensure that business rules are properly expressed, and the simplest of them
are the structural integrity rules imposed by the diagram, so you have to ensure that the
structural integrity is correctly captured in the class model and the collaboration diagrams
enforce structural integrity.

Navigation checks ensure that all possible questions can be answered by the data model.
Connection traps occur when the structure looks fine, but does not yield answers to the
needed navigations. Navigation checks consist of considering each of the required
transactions of the system, and making sure that they can all be implemented in the system.
You do need to make sure that by solving connection traps, unjustifiable closed loops are not
introduced.

Quality of Information Systems

We can consider some of the quality rules for object-oriented software engineering when
developing quality software for databases. We've already considered the direct mapping rule
with relational databases, where just the specification language is adapted. We can also
consider the small interfaces rule (data components are minimally linked), the explicit
interfaces rule (integrity rules specify the dependencies between data) and the information
hiding rule (physical data is hidden).

Databases
There are many different definitions of a database. Most focus on the features of the universe
of discourse (what the database is holding), purpose, related sets of data, organised persistent
data, shared access and central control and transactions.

Databases need not necessarily be computerised (the first ones for example were paper
based), but for the context of this module, we will only consider computerised databases.
Early non-computerised databases used manual catalogues and procedures, and in addition to
the obvious inconveniences, there was the major problem of duplication, and there was no
established database theory.

With the rise of computing in the 1970s, limited computing resources became available and
database theory began to be developed. Databases were typically implemented in files using
file handling programs, but duplications and manual checks were still needed. Maintenance
was also difficult, and there was very little support for security.

Moving into the 1980s, some theory became established and relational databases were
introduced. Simple networks of computers became available and DBMSs became a hot topic
in research.

ANSI/SPARC Architecture

The American National Standards Institute Standards Planning And Requirements
Committee (ANSI/SPARC) proposed a three-level architecture for the design and
implementation of databases, described below.

This is a theoretical model for database implementation that doesn't deal with operating
systems, or the specification or design levels. Most good DBMSs, especially large ones,
implement the SPARC architecture.

The internal level is the hidden computer storage, organised in file structures. The
conceptual level is a model in the relevant paradigm (for this course, this is a relational
model). The external level is composed of the views of the conceptual level needed by each
group of users. At all levels, we call the description of the database a schema.
The ANSI/SPARC architecture promotes insulation of data from programs, and ensures each
level is independent and transparent. A DBMS that supports ANSI/SPARC should provide:

• a structure for storing data
• mapped structures for envisaging and using data
• access controls to ensure that data can only be entered and used via these structures

In ANSI/SPARC, we can consider internal mappings - the links between the architecture
levels which are used to provide physical data through external views, and external mappings
- which situate the DBMS in the system context (users and storage media) and provide links
to the users and the computer systems.

We can consider the components of a DBMS as demonstrated in the diagram below:

DDL (the data definition language) can be considered as an unofficial entry point to the
DBMS as it provides us with direct access to the data dictionary. Most applications
communicate using DML and DQL (data manipulation and data query languages). A second
raw DQL approach is often used by database administrators where they need quick access to
data that a transaction does not necessarily exist for. The only "official" access to the data
should be through application programs using transactions.

Ultimately, the goals of a DBMS are:

• persistent storage of programs and data structures in a common format
• controlled, minimal redundancy
• representation of complex data structures (with integrity constraints)
• controlled multi-user access
• prevention of unauthorised access
• management of security features
• provision of backup and recovery facilities

Users of a DBMS

We can consider a DBMS to be used by four groups of people (roles).

1. Database developers - design new databases or undertake complex modifications to existing
ones. They typically have full access, but only on development systems.
2. Database administrators - involved with low-level access, responsible for granting access
permissions, efficiency and modifying the structure of a live database. They have constrained
access to structure and wide access to data.
3. Application developers - programmers of the applications that implement transactions. They
are typically involved in the development of the initial database, or can be brought in later to
extend or update the system.
4. End users - these use built application programs or queries to access the database to read
and update data as appropriate. Typically they have tightly-controlled access rights based on
their role within the organisation.

Relational Databases

Relational databases are the most widely used database paradigm, and are the basis for
commercial developments in object-relational paradigms.

Relations are subsets of a cross product of domains (sets of values), and relational models are
defined as collections of relations. With a relational model, very abstract algebraic
(relational algebra) and logic-based (relational calculus) languages can be used to specify
transactions. With a relational DBMS, the internal and conceptual schemas, as well as the
views, are defined by relations.

A relation is a set, and a table is a concrete representation of a relation - there are no duplicate
rows or columns and the order of the rows and columns is irrelevant. We can consider
relation intensions (the type of relation - sometimes called the schema) and relation
extensions (the data that exists at a particular time).

Primary keys are used to uniquely identify tuples; they are unique, unchanging, never-null
attributes, or combinations of attributes. The entity integrity constraint on the primary key is
that no attribute of it takes null values.

We decide which key is the primary key by analysing single attributes for candidate keys. If
there are none, we analyse combinations of attributes for more candidate keys, and then pick
the minimal candidate keys (the ones that involve the least number of attributes) and choose a
key using an arbitrary (but informed) decision. If there are no suitable attributes or
combinations of attributes, then we can define a surrogate key (one which is created with the
sole purpose of being a primary key).

Relations are linked by using the primary key of one relation as a foreign key in the intension
of the linked relation (this is indicated with a * and a dotted arrow to the relevant primary
key, and the primary and foreign key must use the same domain). This gives us a new
referential integrity constraint - the values of the foreign key in any existing extension are
legitimate values of the corresponding primary key in its existing extension, or null for
optional relationships.

Null values are a possible value of every domain; they are used as a placeholder and can be
interpreted as being irrelevant, unknown, or unknown whether a value even exists. However,
the last interpretation does not work in the closed world model, where we assume that the
database is a complete representation of the real world - we can infer things from data that is
not present in the database.

There is also the problem of how to perform arithmetical operations on null values. In early
systems, inconsistent "funny" values were used (e.g., 0, 9999, etc.), which in itself introduces
more problems. We should try to avoid null values where possible.

To design relation intensions from the class diagrams, we create a relation intension for each
class and identify primary keys. To deal with relationships, we take different approaches
depending on the multiplicity:

• Many-to-many or 0..1 to 0..1 - we create a third relation comprising the primary keys
• Many-to-one - we post the primary key of the one-end class to the many-end class
• One-to-one - we add the primary key of one to the other
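As a small illustration of the many-to-many case, using hypothetical CUSTOMER and
HOLPACKAGE classes with primary keys custno and hpackid, the resulting intensions would be
along the lines of:

CUSTOMER(custno, name, ...)
HOLPACKAGE(hpackid, description, ...)
BOOKING(custno*, hpackid*)

Here BOOKING is the new third relation, the two posted keys are foreign keys (marked *), and
together they form its primary key.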

However, we do have a problem when it comes to handling subclasses. Subclasses do not
exist in relational theory, so relations need to be used. Important concerns include
participation (is every instance a member of a subclass?), disjointness (can an instance belong
to more than one subclass?), transactions, and the associations in which the classes are
involved.

Three ways to represent this include collapsing the hierarchy, equating inheritance to
one-to-one relationships, or removing the supertype; the approach we choose depends on the
concerns listed above.

Relational Algebra
Relational algebra is used in the design of transactions, and forms the conceptual basis for
SQL. It consists of operators and manipulators, which apply to relations and give relations as
a result, but do not change the actual relation in the database.

The operators come from the mathematical operators for dealing with sets:

• union(R1, R2) → R3: R3 has all the tuples from R1 and R2
• difference(R1, R2) → R3: R3 has all the tuples of R1 which are not tuples of R2
• intersection(R1, R2) → R3: R3 has all the tuples that are in both R1 and R2
• product(R1, R2) → R3: the intension of R3 is the concatenation of those of R1 and R2, and
the tuples of R3 are those of the cross product of R1 and R2

We can also consider the manipulators of relational algebra:

• restrict(R1; predicate) → R2: R2 is a relation comprising only the tuples of R1 for which the
predicate is true
• project(R1; predicate) → R2: R2 is a relation comprising only the attributes of R1 specified
in the predicate
• join(R1, R2; predicate) → R3: R3 is a relation comprising tuples from R1 and R2 joined
according to a condition on common attributes specified in the predicate
• divide(R1, R2; predicate) → R3: R3 is a relation formed by identifying tuples in R1 which
match the values of attributes in R2, according to the predicate, and returning the values of
all other attributes of the identified tuples; R3 is the relational image of R2 on R1

These operations can be joined to be able to give more interesting queries, e.g., "What are the
numbers of the flights that go to Bangkok?" could be represented
by: project(restrict(FLIGHT; Destination = Bangkok); FlightNo).

It is often simpler to represent complex queries diagrammatically, such as for "What are the
identifiers of the holidays that Ana has booked?":

Another type of query to consider is the existential query, where we look for the existence of
a connection between two relations. The solution for this is to restrict, then join and project.
This is not very efficient, however, as joins and products create huge relations. A good
approach is to get the queries correct and then think about adding projects and restricts
before joining to optimise the operation. For existence across multiple relations, existential
queries can be nested within one another.

In the case of universal queries, we are not interested in specific customers or packages (for
example), and we can use the divide manipulator to implement these. For divide(R,
S) to be well-defined, the intension of S must be a proper subset of the intension of R. The
resulting relation has an intension of the attributes a of R that are not in S, and an extension
of the tuples t of the projection of R onto a such that, for every tuple u of S, tu is in R.

Some queries can be handled as either an existential or universal query, however.

Relational algebra is not computationally complete, however (you cannot write general
queries), and other problems can occur - such as finding out the number of tuples in a
relation.

Relational Calculus

See LPA for predicate calculus.

Relational calculus is an alternative to relational algebra and forms another conceptual basis
to SQL. We can think of relational algebra as being prescriptive (how - a procedural style
similar to programming languages), whereas relational calculus can be thought of as
descriptive (what - a nonprocedural style based on predicate calculus). Which form you use is
a matter of style.

The relational algebraic expression project(restrict(FLIGHT; Destination =
Bangkok); FlightNo) can be expressed in relational calculus using a set
comprehension: f : FLIGHT f.FlightNo where f.Destination = Bangkok, or perhaps the more
familiar notation { f.FlightNo | f : FLIGHT ∧ f.Destination = Bangkok }.

The basic syntax of relational calculus is tuple-specification where predicate. The tuple
specification and the predicate use range variables, and the tuple specification is a list of
range variables and attribute selections and the predicate is over the range variables in the
tuple specification. The result of this is a relation composed of the tuples of the form tuple-
specification such that predicate is true.

Existential queries in relational calculus are implemented using a range variable for the target
relation, an existential quantification, and one quantified variable per relation from the anchor
up to, but not including the target relation and equalities to establish the starting points and
links.
To do universal queries, we use a universal quantifier with a range variable for the target
relation, a quantified range variable for the anchor relation and a quantified predicate, which
is an existential query to establish a connection between the anchor and the target.

We do, however, have the same problems as we had with relational algebra, where general
search algorithms cannot be implemented - this is due to a restriction of predicate calculus.

A variant of relational calculus is domain calculus, where predicate logic is defined over
domains, rather than relations.

In conclusion, relational calculus is useful for defining relations and for optimisation (a huge
number of algebraic laws exist). As a yardstick for evaluating languages, relational calculus is
relationally complete; relational algebra is a convenient target for implementing the calculus,
and an algorithm exists which can transform any calculus expression into an algebra
expression.

Normalisation
Normalisation is a technique to eliminate duplicated data and unnecessary attributes. It is
applied during the design stage and ensures that all relations comply with restrictions
imposed by normal forms - a succession of rules. Normalisation helps reveal relations and
avoid inconsistency during transactions.

The result is not necessarily efficient (e.g., duplicated data might be used to speed up a
search), and any changes in the physical design for efficiency reasons should be recorded.

Dependence among attributes needs to be controlled, and anomalous results when inserting,
updating or deleting data need to be reduced, as well as eliminating spurious tuples when
joining relations. Dependencies such as one value determining another (functional
dependence) are not recorded in the design, which is a problem.

The need for normalisation was identified by Codd very early on, after he recognised the need
for a technique to remove dependencies. The first three levels of normalisation were suggested
in 1972 and subsequently Boyce and Codd added a further level (BCNF). Other research has
produced further forms, although these are concerned with dependencies other than those
addressed by the first three and BCNF (such as between tuples, relations and their
projections), which deal with the dependencies between attributes of a tuple.

An attribute B is functionally dependent on another attribute, A, if, at any time, knowing the
value of A uniquely determines the value of B. We write this as A → B, where A and B can
be lists of attributes.
Functional dependencies exist between the primary key and other attributes, and
normalisation removes any other functional dependency. To identify the functional
dependencies, we use the domains and integrity constraints, rather than a particular extension
- which can only show the absence of a functional dependency. In relational databases, we
can consider a universal relation, which is a join of all relations in a database, and attribute
names are prefixed by the name of their original relation.

Given a set of functional dependencies, we can infer others, known as Armstrong's axioms:

• Reflexivity - if Y ⊆ X, then X → Y
• Augmentation - if X → Y, then XZ → YZ
• Transitivity - if X → Y and Y → Z, then X → Z
• Composition - if X → Y and X → Z, then X → YZ
• Decomposition - if X → YZ, then X → Y and X → Z
• Pseudo-transitivity - if X → Y and WY → Z, then WX → Z

This set of rules is sound and complete, and reflexivity, augmentation and transitivity alone
are enough. Any functional dependencies satisfied by the axiom of reflexivity are considered
trivial, and do not say anything special about the application.

Derived attributes represent required calculations (e.g., age from birth date) and are recorded
in the data dictionary. They characterise a functional dependency, but are not part of the
relational model and prevent full normalisation.

To start normalisation, you need to determine non-trivial functional dependencies and at the
end of each level of normalisation, relations should be presented complete with their primary
and foreign keys.

First Normal Form

The first normal form (1NF) is concerned with the internal structure of attributes and domains
need to be clearly defined.

A relation in 1NF has no set-valued attributes; all its attributes contain only atomic values
from their domains.

An attribute can be a set if sets are values of the domain; otherwise a new relation is needed.

Second Normal Form

The second normal form (2NF) requires that attributes are properly dependent on the primary
key. It is a concern for any relation with a multi-attribute primary key.

A relation is in 2NF if, and only if, it is in 1NF, and every non-prime attribute is fully
functionally dependent on the primary key.

A non-prime attribute is one which is not part of any candidate key. We say that A is fully
functionally dependent on B if B → A, but not C → A for any C ⊂ B.

Foreign keys then allow the original data structure to be recreated from the normalised
relations if necessary.

Third Normal Form

The third normal form (3NF) is concerned with mutual dependencies of attributes that are not
in the primary key.

A relation is in 3NF if, and only if, it is in 2NF, and no non-prime attributes are transitively
dependent on the primary key.

A is transitively dependent on B if there is a C such that B → C and C → A, and C is not a
candidate key.

Any internally dependent attributes should be extracted to a separate relation for this stage of
normalisation to be applied.
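As a sketch with hypothetical attributes, consider EMPLOYEE(empNo, deptNo, deptName) with
empNo → deptNo and deptNo → deptName, so deptName is transitively dependent on the primary
key empNo. Extracting the offending dependency gives two relations in 3NF:

EMPLOYEE(empNo, deptNo*)
DEPARTMENT(deptNo, deptName)

The foreign key deptNo allows the original relation to be recreated by a join.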

Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF) applies to relations that have two or more candidate keys
that overlap and is stricter than 3NF; every relation in BCNF is in 3NF, but a relation in 3NF
may not be in BCNF. In other cases, 3NF will be enough.

A relation is in BCNF if, and only if, whenever an attribute A is functionally dependent on B,
then either B contains all the attributes of a candidate key of R, or A is in B.

One way of normalising to BCNF is to extract the attributes in the offending dependency to a
new relation.

Algorithms for normalisation work by decomposing the non-normalised relations into two or
more normalised relations using the methods explained above. Some authors propose the use
of normalisation through decomposition as the central technique to design a relational model,
starting from the universal relation and the functional dependencies. This preserves
dependencies, as all dependencies of the universal relation can be inferred from the
dependencies that only involve attributes of a single relation, and this is reasonably easy to
enforce. This approach is not often taken in practice, though.
We can also consider the property of lossless-join, where decomposed relations can always
be joined back using the join operator of relational algebra on the primary and foreign keys,
to exactly the contents of the original relation, and no spurious tuples should be introduced by
the join.

It is always possible to normalise a relation to 3NF to obtain a collection of relations that
preserve dependency and the lossless-join property, and it is always possible to normalise a
relation to BCNF and maintain lossless-join; however, it is not always possible to obtain
relations that satisfy all the properties. When this occurs, the problem should be recorded and
a compromise made.

SQL
SQL (Structured Query Language) is the most popular relational database language. It
implements both a DDL and a DML and concerns itself with the creation, update and deletion
of relations, with properties based on relational algebra and calculus. It is typically included
in host applications, or conventional computer programs.

SQL is based on SEQUEL (Structured English QUEry Language) which was developed by
IBM in the 1970s for System R. In 1986, ANSI developed a standardised version of SQL
based on IBM's SEQUEL, and in 1987, ISO also released a version, which was updated with
integrity enhancements in 1989, and then again in 1992 (SQL-92) and 1999 (SQL:1999,
sometimes referred to as SQL-3). The current version of SQL is SQL:2003, which adds
object-relational features, XML, etc.

There are many variations and specialised versions of SQL, e.g., ORDBMS SQL for object-
relational products marketed by DB2, Oracle and Informix, Geo-SQL used in various GIS
products, and TSQL2, a temporal query language much cited in research. They all have a
common core, but other variants include direct SQL and embedded SQL, with differences in
access privileges and concurrency, and stored SQL, which includes the usual control
commands.

Tables

In SQL, tables look like relations, but are not actually sets as duplication can occur.
Duplications can be removed, but the column order will always matter. There are three types
of tables, base tables, views and result tables. Base tables are persistent, normalised relations
that are often defined by metatables; views are derived from the base table, or other views
(note that these are different from the views of the ANSI/SPARC architecture) and there is a
good deal of controversy when discussing updates through them. Finally, result tables are
transient and hold the
results of queries - these can be saved by making them a view.

To implement a relational structure, the main activity is creating the base tables. We need to
discover the data domains that are available in the DBMS and then map the logical domains
to the physical domains and record the decisions made to do this. We need to perform a final
check that integrity and normalisation still holds and then use the SQL CREATE command to
implement the relations.
The first step is to use CREATE SCHEMA schema_name to create the database in which all
our relations exist. The second step is to create the tables themselves using CREATE
TABLE table_name(attr1 type1 options1, ..., attrn typen optionsn, further_options).
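As a minimal sketch (the schema, table and column names are illustrative, not from the
original notes):

CREATE SCHEMA holidays;

CREATE TABLE flight (
    flightno    CHAR(6)     NOT NULL,
    destination VARCHAR(30) NOT NULL,
    departure   DATE,
    PRIMARY KEY (flightno)
);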

Integrity Constraints

Integrity constraints can be implemented using the PRIMARY KEY key_name statement in
the further_options part of the CREATE TABLE command, additionally, foreign keys
can be implemented using CONSTRAINT constraint_name FOREIGN KEY
(key_name) REFERENCES table_name in the further_options part.

Data domains can also be defined using the CREATE command - CREATE DOMAIN
domain_name type DEFAULT d CHECK expression. Additionally, checks can be
declared on the whole table with the CHECK command. A UNIQUE statement can be used for
alternative simple candidate keys. Checks can be deferred.

When integrity constraints involve more than one table, we can use the CREATE
ASSERTION command - CREATE ASSERTION assertion_name expression.
Assertions are checked at the end of every transaction, so they are more costly in
performance terms than CHECK.

Additionally, we can use triggers to implement integrity constraints. Triggers are associated
with one table and one update operation. A trigger does not prevent the update, but it can
undo one that has been done. They take action when a specified condition is satisfied, e.g.,
CREATE TRIGGER trigger_name BEFORE UPDATE ON table_name REFERENCING
OLD ROW AS old_tuple NEW ROW AS new_tuple FOR EACH ROW WHEN
expression SET expression.

Using ON DELETE and ON UPDATE, we can have flexibility in handling referential
integrity, but we need to consider whether or not we really need that. We can use the
CASCADE option to handle mandatory associations, so that referential integrity is enforced.
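Pulling these facilities together, a hedged sketch of a booking table (names, types and checks
are illustrative) with a composite primary key, foreign keys, a CHECK and a cascading delete
might look like:

CREATE TABLE booking (
    custno   INTEGER      NOT NULL,
    hpackid  VARCHAR(20)  NOT NULL,
    price    DECIMAL(8,2) DEFAULT 0.00,
    PRIMARY KEY (custno, hpackid),
    CONSTRAINT booking_customer FOREIGN KEY (custno)
        REFERENCES customer(custno)
        ON DELETE CASCADE ON UPDATE CASCADE,
    CONSTRAINT booking_package FOREIGN KEY (hpackid)
        REFERENCES holpackage(hpackid),
    CHECK (price >= 0.00)
);

Deleting a customer then removes their bookings automatically, rather than violating
referential integrity.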

Views

As queries and result tables are lost, we can use CREATE VIEW to associate a name with a
particular query, e.g., CREATE VIEW view_name AS expression.
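For instance, a small sketch (the view name and predicate are illustrative), based on the
flight table assumed above:

CREATE VIEW bangkok_flight AS
    SELECT flightno, departure
    FROM flight
    WHERE destination = 'Bangkok';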

SQL Statements

Now we can start considering the basic SQL commands:

• INSERT INTO table_name(attr1, ..., attrn) VALUES(value1, ..., valuen)
• DELETE FROM table_name WHERE predicate
• UPDATE table_name SET attr1=value1, ..., attrn=valuen, (attr1, ..., attrn) = subquery
WHERE predicate
• SELECT DISTINCT attr1,...,attrn FROM table1,...,tablen WHERE predicate GROUP BY
attr1,...,attrn HAVING func1,...,funcn ORDER BY attr1 DESC/ASC, ..., attrn DESC/ASC

SELECT is the most used command, and it can be used to write statements based on
relational algebra and calculus. Using a simple SELECT command you can project, product
and restrict, e.g., project(restrict(product(table1, table2); predicate); attr1, attr2) can be
translated into SQL as SELECT attr1,attr2 FROM table1,table2 WHERE
predicate. The predicate in SQL is used to select attributes or tuples for which the
predicate is true. The predicates are built from comparators and logical connectives,
e.g., Attr1 [NOT] BETWEEN Attr2 AND Attr3, Attr1 IS [NOT]
NULL, Attr [NOT] LIKE pattern, etc...
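For instance, a concrete (illustrative) query over the flight table sketched earlier, combining
comparators and logical connectives:

SELECT flightno, destination
FROM flight
WHERE destination LIKE 'Bang%'
   OR destination IN ('Singapore', 'Hanoi');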

Using IN conditions in the predicate allows nesting of queries, e.g., SELECT * FROM
flight WHERE flightno IN (SELECT flight FROM holpackage). When the
nested query refers to an attribute of a relation in the outer query it is called a correlated
query; these are computationally expensive, as the nested query is evaluated once for each
candidate tuple of the outer query.

The EXISTS operator can be used in a predicate to check the existence (a non-empty result)
of another query, e.g., SELECT DISTINCT custno FROM booking first WHERE
hpackid='Cultural' AND EXISTS (SELECT custno FROM booking
second WHERE second.custno = first.custno AND
second.hpackid='Football').

We can also use aggregate functions in our select query, e.g., SELECT hpackid,
COUNT(DISTINCT custno) FROM booking GROUP BY hpackid HAVING
COUNT(DISTINCT custno) > 2.

In addition to our SELECT - FROM - WHERE project - product - restrict form of an SQL
query, we can also use the explicit operators UNION, INTERSECT and EXCEPT for union,
intersection and difference respectively. Additionally, there is also an explicit JOIN operator
with several extensions, and division has been replaced by the facility to consider predicates
involving relations. To force our queries to return sets, we must qualify
the SELECT command with the DISTINCT option.

Joins exist in many forms, with the most basic being the relational algebra, or inner, join.
This can be expressed as either SELECT * FROM table1,table2 WHERE
table1.attr1 = table2.attr3, or in the form of SELECT * FROM table1
INNER JOIN table2 USING (attr1).

If an attribute of one or both tables are required, whether they are linked to a tuple of the
other table or not, outer joins can be used. The syntax is the same as for the JOIN command
above, except with INNER being replaced by the type of join being used. LEFT OUTER
JOIN is when all the tuples of the table to the left of the JOIN are required, and similarly
for RIGHT OUTER JOIN, all the tuples to the right of the JOIN are included. For
completeness, there is also FULL OUTER JOIN, which includes all the tuples from both
tables. Any values that are blank in the other table after the join are left as null.
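As a hedged sketch using the same hypothetical tables (the name column is illustrative), the
following lists every customer together with any bookings they have made; customers with no
bookings appear with null in the hpackid column:

SELECT c.custno, c.name, b.hpackid
FROM customer c LEFT OUTER JOIN booking b
     ON c.custno = b.custno;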
SQL only supports existential quantification using the EXISTS qualifier, and not universal
quantification, however we can use the following rule to map a universal query to an
existential one: (∀x : T ∧ p) ≡ ¬(∃x : T ∧ ¬p).
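For example, "Which customers have booked every holiday package?" becomes a double
NOT EXISTS, sketched here with the hypothetical customer, holpackage and booking tables:

SELECT c.custno
FROM customer c
WHERE NOT EXISTS (
    SELECT *
    FROM holpackage h
    WHERE NOT EXISTS (
        SELECT *
        FROM booking b
        WHERE b.custno = c.custno
          AND b.hpackid = h.hpackid));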

SQL is used with other programming languages, so we need to consider how to distinguish
SQL commands from our standard program and how to map data from the database to host-
program data structures and how to determine the start and end of a transaction.

One approach in C is embedded SQL, where most of the database queries are handled by a
preprocessor. Other languages, such as Java, PHP and modern implementations of C, use a
call-level interface: a core library and drivers to access it (e.g., ODBC or JDBC).

Embedded SQL statements in C look like EXEC SQL query. Shared variables are declared
between EXEC SQL BEGIN DECLARE SECTION and EXEC SQL END DECLARE
SECTION, and are then referenced in the query by prefixing them with a colon, e.g., EXEC SQL
SELECT count(custno) INTO :numcust FROM booking. We need to make sure
that we have correct data types (or weird things will happen when casting occurs), and these
need to be checked.

When we are dealing with multiple tuples, it is often easier to use a CURSOR, which is
declared as EXEC SQL DECLARE ptr CURSOR FOR query, and then EXEC SQL
OPEN ptr, EXEC SQL FETCH ptr INTO :var (repeated as many times as necessary) and
finally EXEC SQL CLOSE ptr.

Transactions in C are started by an EXEC SQL CONNECT host username
IDENTIFIED BY password command and finished by either an EXEC SQL
COMMIT command or an EXEC SQL ROLLBACK. Either way, the connection should then be
released.

In other languages, which are based on an API (such as JDBC in Java or mysqlclient in C),
distinguishing SQL commands is not an issue as the SQL is not compiled in advance. However,
a more flexible and complex approach to processing results is needed. In JDBC, we can use
commands such as DriverManager.getConnection(...); creating a statement is
done with stmt = conn.createStatement(), and then the statement is executed
using rs = stmt.executeQuery(...). The result set rs is then processed with
methods such as rs.getInt(i), rs.getString(i), etc. Finally, the statement is
closed with stmt.close() and the connection closed using conn.close().

Security
When designing security measures for databases, we should start from the assumption that
agents (people) intend harm, this contrasts with safety, which exists to prevent accidents
(although the two can complement each other in areas).

In databases, security stems from the need for military and commercial secrecy, and was
typically implemented using encryption and physical isolation, however, with the rise of
multi-user databases, this is not feasible. With the advent of Internet connected databases,
there is even less control over who might try to access the database, the number and type of
these concurrent accesses and what parts of the database can be accessed.

Threats are potential openings that could be used to cause harm, whereas an attack is the
exploitation of a threat. Possible threats include authorised users abusing their privileges and
the possibility of unauthorised access. The potential consequences of a successful attack
include breaches of secrecy, integrity and availability. One of the key tasks in database
security is to identify threats. Countermeasures to attacks include access control, inference
control, flow control and encryption.

Good database systems should have a security policy and mechanisms in place to implement
the security policy. The security policy typically consists of a set of high-level guidelines and
requirements based on user needs, the installation environment and any institutional and legal
constraints. The mechanisms are a set of functions that apply the security policy and can be
implemented in hardware, software and human procedures. Mechanisms can be either
mandatory controls, which cannot be overridden, or discretionary controls, which allow
privileged users to bypass restrictions.

To capture mechanisms that address policies, we have objects (entities that need to be
protected) classified on a scale of security. Subjects request access to objects, and they need
to be controlled. Subjects are classified on a scale of trust.

Military Model

In the military model of security, we have all controls as mandatory, and confidentiality is the
number one concern. In this model, the objects are the values of attributes and subjects are
the database users. The classic secrecy classification of objects is top secret, secret and
unclassified, and subjects are similarly classified. This determines which categories of subject
can access which categories of object.

Security should be considered early on in the database development process, before the
specification stage. General things that should be considered include questions such as:

• How private is the data?
• How robust and rigorous does the subject need to be?
• Are the requirements the same for all data?
• What is the security policy?
• What is the appropriate model to be applied?
• Does the development need tailoring?

If we know the DBMS at this stage, then the facilities of that should be considered as part of
the analysis.

During the specification stage, we should consider access control. Groups who have common
access rights need to be identified and assigned to roles, and then transactions should be
annotated with access requirements. For each class, roles should be assigned responsibilities
for creation, deletion and manipulation of data. When designing verification plans, testing
should include cases to check that security is enforced and for each transaction that an
intersection of roles is responsible for the constituent operations.
If any inconsistency does come up in the consultation (e.g., someone does not have the roles
required to run all of the parts of a transaction - i.e., they do not have access to a class that
needs data deleting when a transaction requires that data to be deleted), then further
consultation with the client is needed, and it can often be worked around with, for example,
views.

Relational theory contains nothing about security in databases, however SQL has
implemented some access control facilities using the GRANT statement.

SQL

In SQL, users are subjects and there is no classification of objects, which can be tables,
attributes or domains. Privileges are expressed using data control commands such
as GRANT and REVOKE, which grant permission for an object and a privilege to a subject.
SQL only supports discretionary control.

The GRANT command has the syntax GRANT privilege ON object TO subject, where
privilege is a command such as SELECT, UPDATE, etc. More control can be exercised by
specifying columns in the privilege, e.g., UPDATE(column) would only allow updates on that
column of the table. The use of views allows an even finer control over access, limiting
access to certain fields, or to certain times using the CURRENT_TIME() function in
the WHERE clause of the view.
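A brief sketch (the view, roles and times are illustrative, using the tables assumed earlier):

CREATE VIEW office_hours_booking AS
    SELECT custno, hpackid
    FROM booking
    WHERE CURRENT_TIME BETWEEN TIME '09:00:00' AND TIME '17:00:00';

GRANT SELECT ON office_hours_booking TO booking_clerk;
GRANT SELECT, UPDATE(hpackid) ON booking TO travel_agent;

The clerk can then only read bookings during office hours, and the agent can change only the
hpackid column.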

To implement this, support is needed from the host programming language, the DBMS and
the OS. Good practice is to record the roles in a metatable.

Concurrency

For more information, see Transactions in NDS.

When dealing with transactions, the introduction of concurrency into the system can bring
efficiency gains, however the DBMS must ensure that integrity is not compromised by this.
The safest route is sequential execution, but this is highly inefficient, so we need to introduce
some form of transaction management.

We can consider transactions as a series of database operations that together move the
database from one consistent state to another, although the database may be inconsistent at
points during the transaction execution. It is desirable for transactions to exhibit the ACID
properties to enforce this integrity when concurrency is involved. The ACID properties are:

• Atomicity - all steps must be completed before changes become permanent and visible, and
if completion is not possible, then we must fail and updates are undone
• Consistency - transactions must maintain validity
• Isolation - transactions must not interfere with each other
• Durability - it must be possible to recover from system crashes
Lost Update Problem

The lost update problem is an interference problem, caused by a lack of atomicity and
resulting in an inconsistent database. For example, if we have two concurrent executions of a
Transfer transaction moving money out of a bank account A1, and both transactions read the
current value before the other has updated it, then the money deducted from A1 will only be
that of the last transaction to commit, rather than the complete amount.

Uncommitted Dependency Problem

This is another interference problem caused by a lack of atomicity and resulting in an
inconsistent database. If a transaction T2 starts and completes whilst a transaction T1 is
running, and then T1 fails and rolls back, the result of the transaction T2 will be lost.

Inconsistent Analysis

If a transaction reads an uncommitted update from another transactions at a point when the
second transaction causes the database to be in an inconsistent state, then the first transaction
is said to have performed an inconsistent analysis, even if both transactions successfully
commit.

Schedulers

To work around this problem, the DBMS implements a scheduler, which orders the basic
operations for efficient use of the CPU and to enforce isolation - scheduling should be
serialisable.

Two different scheduling approaches include timestamping, where the start time of a
transaction is recorded and the timestamps are used to resolve conflicts - policy may dictate
whether priority is given to the older or younger transaction - and optimistic scheduling,
which assumes that conflicts are rare, so all transactions are carried out in memory and the
possibility of conflict is analysed before committal.

Locking

Locking works by placing locks on the data accessed by transactions - complete isolation is
not needed. The granularity of the lock depends on the transaction, e.g., the whole database
for batch processes, a table, a page (this is what is most frequently used by multi-user
DBMSs), a row (this has a high overhead) or an individual field (not often implemented).

We can consider binary locking, where an object is in two states, either locked or unlocked,
however this is too restrictive, or we can consider shared (S for read-only operations) and
exclusive (X for write operations) locks. This does have a slightly increased overhead,
however.

                                    Lock held by transaction 1
                                        X          S
Lock requested by transaction 2    X    denied     denied
                                   S    denied     granted

The two-phase locking protocol guarantees serialisability. All locks that are needed are
obtained before any locks are released, which means we can categorise a transaction into two
phases: a growing phase (acquiring locks) and a shrinking phase (releasing locks).

To avoid deadlock, we can use various methods, such as prevention (abort when a lock is not
available), detection (use timeouts and kill one of the transactions in deadlock), and
avoidance (acquire all needed locks at once).

SQL

When starting a transaction, we can set the transaction type and the isolation level in
SQL: SET TRANSACTION type ISOLATION LEVEL level, where the transaction
types are READ ONLY, so no updates can be performed by the transaction, and READ
WRITE, where updates and reads are permitted. This definition of transaction types prevents
updates without proper locks, and the isolation level defines the policy for the use of locks.
All locks are released at COMMIT or ROLLBACK. The behaviours of the isolation levels are
shown below:

• READ UNCOMMITTED - X locks on tuples: N/A (read-only); S locks on tuples: no; X and S
locks on predicates: no. No locks are taken, and this level can only be used on READ ONLY
transactions. Lost update is not an issue, but uncommitted dependency and inconsistent
analysis are still problems.
• READ COMMITTED - X locks on tuples: yes; S locks on tuples: no; X and S locks on
predicates: no. Uses X locks, and can read data that is not locked using a short S lock.
Uncommitted dependency is solved, but Scholar's lost update (a variant of lost update) and
inconsistent analysis are still possible.
• REPEATABLE READ - X locks on tuples: yes; S locks on tuples: yes; X and S locks on
predicates: no. This uses X and S locks applied to tuples, but has the problem of phantom
records.
• SERIALIZABLE - X locks on tuples: yes; S locks on tuples: yes; X and S locks on predicates:
yes. This uses predicate locking based on the WHERE clauses of SELECT, UPDATE and
DELETE, and has no concurrency problems.
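As a hedged example, following the syntax above (the exact punctuation between the type and
the isolation level varies between DBMSs), a read-only report might run as:

SET TRANSACTION READ ONLY ISOLATION LEVEL REPEATABLE READ;
SELECT hpackid, COUNT(DISTINCT custno)
FROM booking
GROUP BY hpackid;
COMMIT;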
Recovery

We need to be able to restore the database to a valid state in case of failure - that is, enforce
the durability aspect of the ACID properties. Most DBMSs are expected to provide some
form of recovery mechanism, and most common mechanisms involve some level of
redundancy, involving recovery logs and backups. Recovery issues we need to consider
include system recovery, media recovery and system committal and rollback.

Transactions typically operate on main memory buffers and are intermittently written to
disk - this process is controlled jointly by the DBMS and the operating system. Updates are
stored, and a log is kept. The log records the start of each transaction, then an entry for each
update containing enough information to be able to undo or redo the update, and finally a
commit record. The log is essentially a database itself, and care should be taken with it as it
is invaluable in case a recovery needs to be made.

When a write to disk occurs, the OS proceeds as usual, and at commit time, the commit is
recorded in the log which is then updated first (the write-ahead log rule) and then a forced
write of the log to disk occurs. This means that the log and the updates on disk may be out of
phase. Checkpoints are used to force the write of data to disk.

At a checkpoint, all executing transactions are suspended, and all dirty buffer pages are
written to disk. A recording of the checkpoint is made in the log, with a listing of the
transactions that were active at the time of the checkpoint. The log is then written to disk and
the log buffer reinitialised, and transactions unsuspended.

The recovery procedure itself is automatically initiated by the system restart. The basic
procedure is to create two transaction lists, an UNDO list which contains all active
transactions at the last checkpoint and then a REDO list, which is initially empty. The section
of the logfile after the last checkpoint is then analysed to update the UNDO and REDO lists.
Log files are parsed from top to bottom, and when a transaction is started, it is added to
UNDO and when a commit is found, the transaction is moved to REDO. The logfile is then
run from bottom to top to undo - this is backwards recovery - and then it is run from top to
bottom to redo - forward recovery.

Recovery is implemented in SQL using intermediate savepoints, which are equivalent to
intermediate COMMIT points. Updates are not visible until after a savepoint, which makes
partial rollbacks possible. The syntax for savepoints is simple: SAVEPOINT
name, ROLLBACK TO name and RELEASE name.
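A small sketch of a partial rollback (values are illustrative), using the syntax above:

INSERT INTO booking(custno, hpackid) VALUES (1001, 'Cultural');
SAVEPOINT after_cultural;
INSERT INTO booking(custno, hpackid) VALUES (1001, 'Football');
ROLLBACK TO after_cultural;
RELEASE after_cultural;
COMMIT;

Only the second insert is undone; the first survives the final COMMIT.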

In the event of a disk failure, the database and the log file are unreliable, so restoration has to
be done from a backup. There is no need to undo anything as backups are made when the
database is in a consistent state. Backup procedures are supported by DBMSs and work by
generating consistent images of the database - a full backup. Log files (incremental backups)
are stored between backups to ease the recovery process. There will always be losses in this
case, however - and handling these inevitable losses are an application dependent issue that is
solved by management processes.

For system commital and rollback (especially in distributed systems), all managers need to be
coordinated, and one manager will have a system coordinator function. A two-phase commit
protocol is used to implement system commital and rollback. The first phase is to prepare and
send your commits to the system coordinator, all managers are instructed to commit and logs
are written and responses collected. The second phase is to commit, where the system
coordinator evaluates and records the results of the other managers and then makes the
decision for each manager to make the final commit or rollback. This method has to be used
when transactions involve multiple databases.

Distribution
Distributed databases typically consist of logically related data, but they are no longer
centralised - the database is broken into fragments. Management is still central, which leads
to a degree of transparency in the database system. The distributed database shares a common
universe of discourse.

Distributed databases can afford us efficiency and reliability, and allow us to decentralise a
business. Server farms consisting of many generic servers are now common, as opposed to
purchasing a large mainframe. With the rise of fast networks and the acceptance of the
Internet, distributed databases are now a common reality.

With distributed databases, we can have data where it is needed, with a more flexible growth
route and a reduced risk of single point-of-failure, however management and control (the
duplication ghost) adds additional complexity, and considerations need to be made for
increased security threats and storage requirements. There are still no standards for
distributed DBMSs, and in reality there is a long way to go to perfect this area of database
theory.

A distributed DBMS keeps track of local and distributed data and performs query
optimisation based on the best access strategy. Most DDBMSs provide an enhanced security,
concurrency and recovery facility, as well as enhanced transaction management to
decompose access requests.

The components of a DDBMS include computer workstations for the data and the clients,
network hardware and software and then in the clients, a transaction processor and data
manager are required. Both data and processing can happen either at a single or multiple sites
with a distributed system, and a certain amount of heterogeneity is supported by most
DDBMSs.

DDBMS Architecture

The distributed DBMS architecture has five levels:

1. Local schema
2. Component schema, in a common model
3. Export schema, data that is shared
4. Federated schema, global view
5. External schema, this is the external level of ANSI/SPARC

Transparency is required so that the end user feels that they are the only user of local data.
Distribution transparency works on the principles of fragmentation, location and local
mapping. SQL supports a special NODE clause to specify the location of a fragment and to
implement local mapping. Updates on duplicates also become the responsibility of the
programmer. For transaction transparency, a two-phase commit protocol is used, and we also
have to consider failure transparency, performance transparency (query optimisation) and
heterogeneity transparency.

Design Considerations

When designing a distributed database, the same issues apply, but we have to consider the
extra issues of data fragmentation, replication of fragments and the location of fragments
and schemas. Data fragmentation is designed using a fragmentation schema which divides
relations into logical fragments, and allows whole relations to be recreated using unions and
joins. A replication schema can be used to define the replication of fragments and an
allocation schema is used to define the locations. A global directory is required to hold all of
these schemas.

Different strategies can be used for data fragmentation: we can either fragment horizontally,
where subsets of tuples are distributed and we have a set of disjoint subsets, or we can
fragment vertically, where projections are distributed and the primary key is replicated.
Vertical fragments therefore all have the same number of rows. It is also possible to mix
these strategies.

Data replication works by replicating the individual fragments of the database. With
replication, we can enhance data availability and response time, as well as reducing
communication and query costs. Replication works off the mutual consistency rule, where all
copies are identical - updates are performed in all sites. With replication, there is an overhead
in transaction management, as we have a choice of copy to access, and synchronised update.
Replication can greatly ease recovery.

Query optimisation can also be used to minimise data transfer. The choice of processing site
is based on estimates, and most DDBMSs provide a semijoin operator, which reduces the
number of tuples before transferring by combining a project() with a join().

For election algorithms, see NDS.

Distributed concurrency is based on an extension of the locking mechanism. A coordinator
site exists which associates locks with distinguished copies, either at a primary site (with or
without a backup site) or using a primary copy. Coordinators are chosen using an election
process, or a voting method.

Codd specified a set of twelve commandments for distributed databases - although none are
implemented in modern DDBMSs, like the similar commandments for relational systems,
they provide a framework for explanation and evaluation.

1. Local site independence
2. Central site independence
3. Failure independence
4. Location transparency
5. Fragmentation transparency
6. Replication transparency
7. Distributed query processing
8. Distributed transaction processing
9. Hardware independence
10. Operating system independence
11. Network independence
12. Database independence
CHAPTER 8:
Computer Graphics and Visualisation

What is Computer Graphics?

Angel c.1

Computer graphics is concerned with all aspects of producing pictures or images using a
computer; the ultimate aim is to represent an imagined world as a computer program.

A Brief History

• pre-1950s - Crude hardcopy plots
• 1950s - CRT displays (e.g., as seen in early RADAR implementations)
• 1963 - Sutherland's Sketchpad
• 1960s - The rise of CAD
• 1970s - Introduction of APIs and standards
• 1980s - Rise of PCs led to GUIs and multimedia
• 1990s - Visualisation, rise of the Internet, WWW, Java, VRML

Computer graphics have many applications, such as displaying information as in meteorology,
medicine and GIS; design, as with CAD/CAM and VLSI; and simulation, such as virtual
reality, pilot training and games.

Graphics systems are raster based - the image is represented as an array of picture elements
known as pixels stored in a frame buffer. Vector based systems, on the other hand, work by
following a set of instructions and outputting the results. Raster based systems are faster, but
require more memory.
Graphics are ultimately about controlling pixels, but we use primitives to abstract away from
the pixels.

Mathematical Basics

Angel c.4

We can consider some mathematical basics when dealing with computer graphics. We can
consider points - locations in space specified by coordinates, lines - which join two points, a
convex hull - the minimum set which includes all points and the line segments connecting
any two of them, and a convex object - where any point lying on the line segment connecting
any two points in the object is also in the object.

In computer graphics, we also use vectors for relative positions in space, the orientation of
surfaces and the behaviour of light. We can also use them for trajectories in dynamics and
operations. A vector is an n-tuple of real numbers, and addition and scalar multiplication are
done quite straightforwardly, by either adding each individual term to the corresponding one
in another vector, or by multiplying each individual element by a scalar.

Trigonometry (sine, cosine, tangent, etc.) is also used to calculate directions of vectors and
orientations of planes.

We can also use the dot (inner) product of vectors, where a · b = |a||b|cos θ. The product can
be calculated as a · b = x1y1 + ... + xnyn. We can use this to work out the Euclidean distance
of a point (x, y) from the origin, d = √(x² + y²) (and similarly for n-dimensional space).

The dot product can also be used for normalising a vector: v′ = v/||v|| is a unit vector; it has a
length of 1. The angle between the vectors v and w can be expressed as cos⁻¹((v · w)/(||v||
||w||)). If the dot product of two vectors v and w is 0, it means that they are perpendicular.

Matrices

Matrices are arrays of numbers indexed by row and column, starting from 1. We can consider
vectors as n × 1 matrices. If A is an n × m matrix and B is an m × p matrix, then the product
AB is defined as a matrix of dimension n × p. An entry of AB is cij = Σ(s=1..m) ais bsj.

The identity matrix is the matrix where Ixy = 1 if x = y, and 0 otherwise, and transposing
an n × m matrix gives an m × n matrix: Aij = A′ji.

A determinant (det A) is used in finding the inverse of a matrix. If det A = 0, it means
that A is singular and has no inverse.
Given two vectors v and w we can construct a third vector u = w × v which is perpendicular
to both v and w; we call this the cross product.
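For three-dimensional vectors, the cross product can be written componentwise (a standard
result, included here for reference) as u = w × v = (wyvz - wzvy, wzvx - wxvz, wxvy - wyvx),
and its length is ||w × v|| = ||w|| ||v|| |sin θ|, where θ is the angle between w and v.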

Planes

Three points not in a line determine a unique plane. Three edges forming a triangle determine
a plane. A plane can be determined by point P0 and two unparalleled vectors u and v.

The equation of a plane can be expressed as T(α, β) = P0 + αu + βv. An equation of a plane
given the plane normal n is n · (P - P0) = 0.

Planes are used for modelling faces of objects by joining edges and vertices. In this case,
their surface normals are used for shading and other more complex illumination phenomena
such as ray tracing, and correspondingly for visualising the texture that is applied onto
objects. You can also model the visualisation system with planes, and they constitute the
basis for more complex surfaces (e.g., curved surfaces come from patches that are planar in
nature).

Transformations

There are three fundamental transformations in computer graphics, which can be expressed
using vectors and matrices.
However, we now have the problem that translation cannot be expressed using matrix
multiplication, so we move from cartesian to homogeneous (projective) coordinate space.
Each affine transformation is represented by a 4 × 4 matrix within a reference system
(frame). The point/vector (x, y) becomes (x, y, w), so (x1, y1, w1) = (x2, y2, w2)
if x1/w1 = x2/w2 and y1/w1 = y2/w2. This division by w is called homogenising the point.
When w = 0, we can consider the points to be at infinity.

Now we are working in homogeneous space, we can apply uniform treatment:
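In 2D, for example, the standard homogeneous matrices for translation by (dx, dy), scaling by
(sx, sy) and rotation by θ about the origin are:

[ 1 0 dx ]     [ sx 0  0 ]     [ cos θ  -sin θ  0 ]
[ 0 1 dy ]     [ 0  sy 0 ]     [ sin θ   cos θ  0 ]
[ 0 0 1  ]     [ 0  0  1 ]     [ 0       0      1 ]

and a point (x, y, 1) is transformed by multiplying it by the appropriate matrix; the 3D
versions are the analogous 4 × 4 matrices.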

Shearing can be used to distort the shape of an object - we shear along the x axis when we
pull the top to the right and the bottom to the left. We can also shear along the y axis.
Transformations of a given kind can be composed in sequence. Translations can be used to
apply transformations at arbitrary points. When composing different transformations, the
transformation applied to the point first is the last one multiplied onto the composite matrix,
working backwards from there. Composition of transformations is associative, but not
commutative.

Special orthogonal matrices can be used to simplify calculations for rigid body
transformations:

3D Transformations

For 3D transformations, the same principles apply, but now in a 4D homogeneous space.
However, rotation now has to be defined with respect to a specific axis.
If we want to rotate a cube with its centre at P0 around an axis v by the angle β, the unit
vector v has the property cos²Φx + cos²Φy + cos²Φz = 1. We can denote the coordinates of v
by αx = cosΦx, αy = cosΦy and αz = cosΦz. We should decompose all the necessary rotations
into rotations about the coordinate axes.

Graphics often utilise multiple coordinate spaces. We can express this as P(i), the
representation of P in coordinate system i. To translate between two systems, we apply
the matrix Mi←j, the transformation from system j to system i.

Primitives

Angel c.2

In OpenGL, we specify primitives by declaring glBegin(mode);, followed by a
specification of the vertices and other properties, and then we end the primitive definition
with glEnd();. The mode in this case is a specification of which primitive we want to
define, such as:

Examples are in lecture slides.
And vertices can be specified using glVertexNT(TYPE c1, TYPE c2, ..., TYPE
cn);, where N is the number of dimensions of the vertex (2, 3 or 4) and T is a character
indicating the data type, s for short, i for integer, f for float and d for double. Alternatively,
the vertex arguments could be passed as an array using glVertexNTv(TYPE
coords[]);.

We can also consider other primitives, such as curves, circles and ellipses, which are
typically specified by a bounding box (an extent), a start angle and an end angle, although
variations exist which use the centre and a radius, different implementations use radians vs.
degrees, and some allow the orientation of the bounding box to be changed. NURBS are also
common and are what OpenGL uses. Simple curves are non-existent in OpenGL and must be
defined using quadrics.

When talking about points, we can consider the convex hull - the minimum set which
includes all the points and the line segments connecting any two of them. The convex object
consists of any point lying on the line segment connecting any two points in the object.

We can also consider text as a primitive. Text comes in two types: stroke (vector) fonts,
where each character is defined in terms of lines, curves, etc. - these are fairly easy to
manipulate (scaling and rotation, etc.), but this takes CPU time and memory - and bitmap
(raster) fonts, where each character is represented by a bitmap. Scaling bitmap fonts is
implemented by replicating pixels as blocks of pixels, which gives a blocky appearance. Text
primitives give you control of attributes such as type, colour and geometrical parameters and
typically require a position and a string of characters at the very least.

OpenGL support for text is minimal, so GLUT services are used, such
as glutBitmapCharacter(); and glutStrokeCharacter();.

Some primitives are used to define areas, so a decision needs to be made on whether these
area-defining primitives need to be filled or outlined. In OpenGL, polygonal primitives are
shaded, and explicit control is given over edge presence. More complicated effects can be
accomplished by texturing. The fill attribute may allow solid fills, opaque and transparent
bitmaps and pixmaps.

A final primitive to consider is that of canvases. It is often useful to have multiple canvases,
which can be on- or off-screen; there is a default canvas, and a concept of a current canvas.
Canvases allow us to save or restore image data, support buffering and have a cache for
bitmap fonts. The OpenGL approach to canvases uses special-purpose buffers, double
buffering and display lists.

Curves and Surfaces


Up until now, curves and surfaces have been rendered as polylines and polygonal meshes,
however we can use piecewise polynomial functions and patches to represent more accurate
curves and curved surfaces.

Parametric equations (one for each axis) are used to represent these, where each section of a curve is a parametric polynomial. Cubic polynomials are widely used, as they allow end-point and gradient continuity and are the lowest-degree polynomials that are non-planar. Curves are classified by the constraints used to define them (points, gradients, etc).

Parametric polynomials of a low degree are preferred as this gives us local control of shape, they have smoothness and continuity, their derivatives can be evaluated, and they are fairly stable and can be rendered easily - small changes in the input should only produce small changes in the output.

The points are called curve and segment points.

We can consider two types of geometric continuity of 2 segments:

1. segments join (G0)


2. tangent directions are equal at the join point, but magnitudes do not have to be (G1)

We can consider three properties of parametric continuity:

1. all three parametric components (x, y and z) are equal at the join point (C0)

2. G1 (only required if curves have to be smooth) and the magnitudes of the tangent vectors are equal (C1)

3. the direction and magnitude of diQ(t)/dti, for all i ∈ ℕ with 1 ≤ i ≤ n, are equal at the join point (Cn - however n > 1 is only relevant for certain modelling contexts, e.g., fluid dynamics).

Hermite Curves

Hermite curves are specified using two end-points and the gradients at those end points; these characteristics specify the Hermite geometry vector. The end points and tangents allow easy interactive editing of the curve.
By changing the magnitude of a tangent, we can change the curve whilst maintaining G1 continuity.

Bézier Curves

Parametric gradients are not intuitive, so an alternative is to use 4 points, 2 of which are interpolated and 2 of which are approximated. We can then derive G1 = dQ(0)/dt = 3(P2 - P1) and G2 = dQ(1)/dt = 3(P4 - P3).
Bézier curves and surfaces are implemented in OpenGL. To use them, we need a one-
dimensional evaluator to compute Bernstein polynomials. One dimensional evaluators are
defined using glMap1f(type, u_min, u_max, stride, order,
point_array), where type is the type of objects to evaluate (points, colours, normals,
etc), u_min and u_max are the range of t, stride is the number of floating point values to
advance between control points, order specifies the number of control points
and point_array is a pointer to the first co-ordinate of the first control point. An example
implementation of this may be:

GLfloat control_points[4][3] = {{-4.0, -4.0, 0.0}, {-2.0, 4.0, 0.0},
                                {2.0, -4.0, 0.0}, {4.0, 4.0, 0.0}};
int i;

/* define the evaluator over t in [0, 1], with 4 three-component control points */
glMap1f(GL_MAP1_VERTEX_3, 0.0, 1.0, 3, 4, &control_points[0][0]);
glEnable(GL_MAP1_VERTEX_3);

/* approximate the curve with a 30-segment polyline */
glBegin(GL_LINE_STRIP);
for (i = 0; i <= 30; i++) glEvalCoord1f((GLfloat) i / 30.0);
glEnd();

To implement Bézier surfaces, the glMap2f(type, u_min1, u_max1, stride1,


order1, u_min2, u_max2, stride2, order2, point_array) function can
be used, along with glEvalCoord2f(), or you can
use glMapGrid2f() and glEvalMesh2().
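A minimal sketch of a Bézier patch along these lines (assuming a 4×4 grid of control points already stored in ctrl, evaluated over a 20×20 mesh):

GLfloat ctrl[4][4][3];   /* 4x4 grid of 3D control points, filled in elsewhere */

/* Define a two-dimensional evaluator over (u, v) in [0,1] x [0,1]. */
glMap2f(GL_MAP2_VERTEX_3,
        0.0, 1.0, 3, 4,      /* u range, stride between points in a row, order */
        0.0, 1.0, 12, 4,     /* v range, stride between rows (4 points x 3 floats), order */
        &ctrl[0][0][0]);
glEnable(GL_MAP2_VERTEX_3);

/* Evaluate the patch on a regular 20x20 grid and render it as filled quads. */
glMapGrid2f(20, 0.0, 1.0, 20, 0.0, 1.0);
glEvalMesh2(GL_FILL, 0, 20, 0, 20);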

B-splines

Bézier curves and patches are widely used, however they have the problem of only giving C0 continuity at the join points. B-spline curves can be used to obtain C2 continuity at the join points with a cubic.
A B-spline consists of control points (Pi) and cubic polynomial segments (Q3, Q4, ..., Qm), where Qi is defined on ti ≤ t ≤ ti + 1, 3 ≤ i ≤ m. Each control point affects four segments.

Knots lie between Qi and Qi + 1 at ti (the knot value). t3 and tm + 1 are end knots, and the knots are equally spaced (uniform), so ti + 1 = ti + 1 (therefore the blending function is the same for all segments).

We can call these curves non-rational to distinguish it from the rational cubic curves (a ratio
of polynomials).

Another variety of B-splines also exist - non-uniform non-rational B-splines. Here, the
spacing of the knots (in parameter space) is specified by a knot vector (the value of the
parameter at the segment joins).

The table below compares Bézier curves with uniform B-splines:

Property | Bézier curves | Uniform B-splines
Compute a new spline | every 3 points | every new point
Smoothness | good | very good
Each control point affects | the entire curve | a maximum of four curve segments
Continuity | C0, and C1 if additional constraints are imposed | C0, C1 and C2
Complexity of a curve segment | high | low
Number of basis functions | four | four
Interpolation properties | interpolates the end control points | may not interpolate any point
Convex hull of the control points | contains the curve | contains the curve
Sum of the basis functions | equal to one | equal to one
Affine transformation of the control point representation | transforms the curve accordingly | transforms the curve accordingly

Processing Pipeline

Angel c.2
We can consider the processing pipeline as: Modelling → Geometric Processing → Rasterisation → Display, and we can further break down Geometric Processing into Transformation (4×4 matrices which transform between two co-ordinate systems) → Clipping (needed because of the limited angle of view) → Projection (4×4 matrices describing the view angle). This can be implemented using custom VLSI circuits, or with further refinement using things such as shaders, etc...

Pipelines introduce latency, delays for a single datum to pass through the system. The
interface between an application program and the graphics systems can be specified using a
set of functions that reside in a graphics library.

These APIs are device independent, and provide high-level facilities such as modelling
primitives, scene storage, control over the rendering pipeline and event management.

We can consider the following general principles and structures:

• Primitive functions - define low level objects and atomic entities


• Attributes - these govern the way objects are displayed
• Viewing functions - specifying the synthetic camera
• Transformations - geometrical transformations (translation, rotation, scaling, etc)
• Input functions - functions used to input data from the mouse, keyboard, etc
• Control functions - responsible for communications with the windowing system

OpenGL Interface

In an X Windows system, the OpenGL system could be represented as follows:

GL is our basic graphics library and consists of all the primitive functions; GLU is the graphics utility library and uses GL; GLUT is the graphics utility library toolkit and interfaces with the windowing system, and is constantly expanding. The final component, GLX on X Windows (or wgl on Microsoft Windows), communicates with the X windowing system or the Windows window subsystem.

Modelling

Angel c.9

There are two different kinds of modelling - a domain model and a graphical model. The
domain model is a high-level abstraction of the objects and normally refer to objects, whereas
the graphical model is a low-level implementation and refers to points, vertices, polygons,
etc...

When we model, we can do it in two modes - in an immediate mode, where no record is kept
of primitives or the attributes, and the image is regenerated by respecifying the primitives -
and a retained mode, where a graphical model is stored in a data structure (called a display
list in OpenGL) and a scene database is stored.

In OpenGL, we use display lists, which are a cache of commands (not an interactive database
- once it has been created, it can not be altered). Display lists can be created in two modes
- GL_COMPILE and GL_COMPILE_AND_EXECUTE, where the display list is executed as it
is declared.

A modelling coordinate system is used to define individual objects (located at the origin, axis
aligned), and then objects are combined within a world coordinate system.

To simplify complex problems, we can encapsulate scenes into 'self-contained' packages, which localise changes and allow reuse. This is typically done with either display lists or, for more complex models (as you can not create display lists within a display list), with procedures that call multiple display lists. This in itself is not fundamental to modelling, but brings large efficiency gains.

In OpenGL, you create a display list using glGenLists(1), which will return a GLuint (which will be 0 if a failure has occurred), and the actual contents of the list are generated in between glNewList(GLuint list, mode) and glEndList(). Lists are then called using glCallList(list) and freed with glDeleteLists(list, 1). glIsList(list) is used to check for the existence of a list.
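A minimal sketch of this pattern (the quad inside the list is just placeholder geometry):

GLuint box = glGenLists(1);            /* returns 0 on failure */
if (box != 0) {
    glNewList(box, GL_COMPILE);        /* record commands without executing them */
        glBegin(GL_QUADS);
            glVertex2f(-1.0f, -1.0f);
            glVertex2f( 1.0f, -1.0f);
            glVertex2f( 1.0f,  1.0f);
            glVertex2f(-1.0f,  1.0f);
        glEnd();
    glEndList();
}

glCallList(box);        /* later, in the display callback */
glDeleteLists(box, 1);  /* when the list is no longer needed */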

Most things can be cached into a display list (with the exception of commands that create other display lists), however things that change a client's state (such as GLUT windowing commands) are executed immediately.

Matrices

Matrices are used to isolate different transformations (for example, those for each individual model) from one another. In OpenGL, matrices are stored on a stack, and they can be manipulated with:
• glPushMatrix() - this is a 'double up' push, where the current matrix is pushed on to
the top of the stack, but is still available as the current matrix
• glPopMatrix() - this replaces the current matrix with the one on the top of the stack
• glLoadMatrix(GLfloat *M) - replaces the current matrix with the matrix M, a special
version of this, glLoadIdentity() loads the identity matrix
• glTranslate(), glRotate(), glScale(), glMultMatrix(GLfloat *M) - in
each case the new matrix is premultiplied by the current matrix and the result replaces the
current matrix.

Stack based architectures are used by most graphics systems, as they are ideally suited to the recursive data structures used in modelling.

Due to the way matrix multiplication works, actions on the matrix start from the one most
recently applied and move backwards in the reverse order to how they were applied.

An OpenGL programmer must manage the matrix and attribute stacks explicitly!
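For instance, a hedged sketch of the push/pop pattern for two independently placed objects (the transforms and the draw_cube() helper are hypothetical):

glMatrixMode(GL_MODELVIEW);

glPushMatrix();                        /* save the current matrix */
    glTranslatef(-2.0f, 0.0f, 0.0f);
    glRotatef(45.0f, 0.0f, 1.0f, 0.0f);
    draw_cube();                       /* hypothetical routine issuing the cube's geometry */
glPopMatrix();                         /* restore - this cube's transforms are discarded */

glPushMatrix();
    glTranslatef(2.0f, 0.0f, 0.0f);
    draw_cube();                       /* unaffected by the first cube's transforms */
glPopMatrix();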

Attributes

Similar to a matrix stack, the attribute stack can be manipulated to hold attributes that can be
expensive to set up and reset when rendering a model.

Attributes tend to be organised into attribute groups, and settings of selected groups
(specified by bitmasks) can be pushed/popped. glPushAttrib(GLbitfield
bitmask) and glPopAttrib() work in the same way as the matrix stack. One of the
most common bitmasks is GL_ALL_ATTRIB_BITS, but more can be found in the manuals.
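A brief sketch of the same idea for attributes (the choice of bits is illustrative):

/* Save the current lighting and colour state before changing it temporarily. */
glPushAttrib(GL_LIGHTING_BIT | GL_CURRENT_BIT);
    glDisable(GL_LIGHTING);
    glColor3f(1.0f, 1.0f, 0.0f);
    /* ... draw some unlit, yellow annotation geometry ... */
glPopAttrib();                         /* lighting state and current colour are restored */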

A version also exists for client-side state (relevant when OpenGL is used in its client-server, e.g. networked, form): glPushClientAttrib(GLbitfield bitmask) and glPopClientAttrib().

Materials

Three dimensional structures are perceived by how different surfaces reflect light. Surfaces may be modelled using glMaterialfv(surfaces, what, *values), where surfaces specifies which faces are to be manipulated (i.e., just the front of a polygon with GL_FRONT, or both sides with GL_FRONT_AND_BACK) and what is the material attribute to be manipulated (e.g., GL_AMBIENT_AND_DIFFUSE for lighting and GL_SHININESS for shininess). For single values, glMaterialf(surfaces, what, v) can be used.
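A minimal sketch of setting up a material (the colour and shininess values are arbitrary):

GLfloat mat_diffuse[]  = { 0.8f, 0.2f, 0.2f, 1.0f };   /* reddish ambient/diffuse colour */
GLfloat mat_specular[] = { 1.0f, 1.0f, 1.0f, 1.0f };   /* white highlights */

glMaterialfv(GL_FRONT, GL_AMBIENT_AND_DIFFUSE, mat_diffuse);
glMaterialfv(GL_FRONT, GL_SPECULAR, mat_specular);
glMaterialf(GL_FRONT, GL_SHININESS, 50.0f);            /* single value, so glMaterialf */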

An alternative is to simply colour the materials using glColorMaterial(GL_FRONT_AND_BACK, GL_AMBIENT_AND_DIFFUSE), wrapping the objects to be drawn with glEnable(GL_COLOR_MATERIAL) and glDisable(GL_COLOR_MATERIAL), and then using standard glColor3f() commands.

Lighting

Lighting effects require at least one light source, and OpenGL allows at least 8 specified
by GL_LIGHTn.

Global lighting is enabled using glEnable(GL_LIGHTING) and then specific lights can
be turned on or off using glEnable(GL_LIGHTn) and glDisable() respectively.

The attributes of lights are specified using glLightfv(GL_LIGHTn, attribute, *value), where attribute is the attribute of the light to be set up (e.g., GL_AMBIENT, GL_DIFFUSE, GL_POSITION, GL_SPECULAR).
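A short sketch of enabling and configuring one light (the position and colours are arbitrary):

GLfloat light_pos[]     = { 0.0f, 5.0f, 5.0f, 1.0f };  /* w = 1: a positional light */
GLfloat light_diffuse[] = { 1.0f, 1.0f, 1.0f, 1.0f };
GLfloat light_ambient[] = { 0.2f, 0.2f, 0.2f, 1.0f };

glEnable(GL_LIGHTING);                 /* global lighting on */
glEnable(GL_LIGHT0);                   /* turn on one specific light */
glLightfv(GL_LIGHT0, GL_POSITION, light_pos);
glLightfv(GL_LIGHT0, GL_DIFFUSE,  light_diffuse);
glLightfv(GL_LIGHT0, GL_AMBIENT,  light_ambient);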

Ambient Sources

Ambient sources have a diffused, non-directional source of light. The intensity of the ambient light is the same at every point on the surface - it is the intensity of the source multiplied by some ambient reflection coefficient.

This is defined in OpenGL with glMaterial{if}v(GLenum face, GLenum pname, TYPE *params);. face is either GL_FRONT, GL_BACK or GL_FRONT_AND_BACK, pname is a parameter name (in this case GL_AMBIENT), and params specifies a pointer to the value or values that pname is set to; this is applied to each generated vertex.

Objects and surfaces are characterised by a coefficient for each colour. This is an empirical model that is not found in reality.

Spotlights

Spotlights are based on Warn's model, and simulate studio lighting effects. The intensity at a point depends on the direction, and flaps ("barn doors") confine effects sharply. Spotlights are often modelled as a light source in a cone.

Spotlights in OpenGL are implemented using the GL_SPOT_CUTOFF and GL_SPOT_EXPONENT parameters to glLight.
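For example, a hedged sketch (the cut-off angle, exponent and direction are arbitrary; GL_SPOT_DIRECTION is the related direction parameter):

GLfloat spot_dir[] = { 0.0f, -1.0f, 0.0f };     /* point the cone straight down */

glLightf(GL_LIGHT0, GL_SPOT_CUTOFF, 30.0f);     /* half-angle of the cone, in degrees */
glLightf(GL_LIGHT0, GL_SPOT_EXPONENT, 2.0f);    /* how quickly intensity falls off inside the cone */
glLightfv(GL_LIGHT0, GL_SPOT_DIRECTION, spot_dir);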

Graphics File Formats

Most graphics programming is done in modelling applications which output the model in a
file format. Common file formats include:

• VRML (Virtual Reality Modelling Language) - an open standard written in a text form
• RasMol - a domain modelling language used in molecular chemistry, in which the file contains a text description of the coordinates of atoms and bonds in a model
• 3D Studio - a proprietary file format developed by AutoDesk that is very common in industry

Solid Object Modelling


Angel c.9.10

When we model with solid objects, rather than surfaces only, we can do boolean set operations on the objects - union, intersection and subtraction. When we apply these to volumes, the operations are not closed, as they can yield points, lines and planes, so we only consider regularised boolean set operations, which always yield volumes (or nothing at all) - for example, the regularised intersection of two cubes sharing only an edge is null.

It is difficult to define boolean set operations with primitive instancing, which is where each object is defined as a single primitive geometric object and is scaled, translated and rotated into the world (as we have been doing with display lists in OpenGL). Primitive instancing is simple to implement, however.

Sweep Representations

This is where an area is swept along a specified trajectory in space. We can consider this as either extrusion, where a 2D area follows a path normal to the area, or rotational sweeps, where an area is rotated about an axis (however, these do not always produce volumes).

If we sweep a 3D object, this will always guarantee a volume, and in the process of sweeping we may change the shape, size and orientation of the object. Trajectories can be complex (e.g., following a NURBS path), and it is not clear what the representational power of a sweep is (what volume operations can we perform?). We typically use sweeps as input to other representations.

Constructive Solid Geometry

Constructive solid geometry is based on the construction of simple primitives. We combine these primitives using boolean set operations and explicitly store this in the constructive solid geometry representation, where tree nodes are operators (and, or, diff) and leaves are primitives. The primitives themselves can be simple solids (guaranteeing closed volumes or an empty result), or non-bounded solids - half-spaces - which provide efficiency. To compute the volume, we can traverse the tree.

Boundary Representations

Boundary representations (B-reps) are one of the simplest representations of solids. They require planar polygonal faces, and have historically been implemented by traversing faces and connected edges, a legacy of the vector graphics displays for which they were designed to be efficient.

B-reps require represented objects to be two-manifold. An object is two-manifold if it is guaranteed that only two faces join at an edge, i.e., a surface must be either of the forms shown below, not both at the same time, which would cause us to move into hyperspace.
B-reps consist of a vertex table with x, y and z co-ordinates, an edge table consisting of a pair
of the two vertices connected by an edge and a face table consisting of a list of which edges
denote a face.

Winged-edge representation is a common representation of B-reps, where each edge has pointers to its vertices, to the two faces sharing the edge, and to four of the edges emanating from it. Using winged-edge representation, we can compute adjacency relationships between faces, edges and vertices in constant time.

Spatial partitioning representations are a class of representations that work by decomposing space into regions that are labelled as being inside or outside the solid being modelled. There are three types to consider.

Spatial Occupancy Enumerations

These are voxel based systems, where the material is represented by a succession of small cubes, like pixels in a display. The precision of the scene is the size of the voxel; the more voxels you have, the higher the precision, but at the expense of increased memory and computation (Ο(n3)).

This approach is good for fast clearance checking in CAD, although many optimisations can be made. However, achieving a good level of precision tends to cause problems, and no analytic form exists for faces or surfaces. It is fairly easy to compute the volume, however, as volume = voxel volume × number of voxels.
Quadtrees and Octrees

Quadtrees are spatial occupancy enumeration with an adaptive resolution - regions are not divided further when they are either empty or fully occupied. Net compression occurs, and it seems likely that intersection, union and difference operations are easy to perform.

Octrees are the quadtree principle extended to three dimensions, so a division of a voxel now produces 8 equal voxels. They can easily be generated from other representations and are rendered from front to back. They are logically equivalent to quadtrees, so boolean operations are easily possible.

Representing octrees in a tree form has a high memory cost, so we can use various linear
encodings. Octrees are commonly used to partition worldspace for rendering optimisation and
for occlusion culling.

Binary Space Partitioning Trees

Binary space partitioning (BSP) trees are an elegant and simple representation that deals only
with surfaces, not volumes. It does have a potentially high memory overhead, but boolean
operations and point classification are easy to compute.

Physics
Angel c.11

Fractals

The basic characteristics of a fractal are that they have infinite detail at any point, exhibit self
similarity and are defined procedurally. They can be applied to generate terrains, clouds,
plants and flowers. Fractals can be classified into two basic types, self-similar (has statistical
self-similarity) and self-affine (exact self-similarity).

If we are dealing with objects defined in three dimensional space, then the Euclidean dimension (De) of an object is the number of variables required to specify the points in the object. In 3D space, this is always 3. The topological dimension (Dt) of an object is the minimum number of intersections of spheres needed to ensure coverage of the object. For discrete points, Dt is 0; for lines, Dt is 1; for planes, Dt is 2; and for volumes, Dt is 3.

The similarity dimension Ds is a measure of the way an object scales with respect to a measuring length ε. In general, N ε^Ds = 1, hence Ds = log(N)/log(1/ε).

Cantor Set

One famous fractal is the Cantor Set (sometimes called Cantor Dust) derived by Georg Cantor. It consists of discrete points on an interval of the real line, but has as many points as there are in the interval. In general, the interval is divided into 2^k segments, which in the limit k = ∞ gives 2^ℵ0 discrete points. Since the points are discrete, the topological dimension Dt = 0, but we can consider the similarity dimension to be Ds = log(2)/log(3) = 0.6309.

Koch Curve

Like the Cantor Set, the Koch curve is formed from an interval of the real line, but it is infinitely long, continuous and not differentiable.

As we have a curve, the topological dimension Dt = 1, but the similarity dimension Ds = log(4)/log(3) = 1.2618.
A fractal is defined as any curve for which the similarity dimension is greater than the topological dimension. Other definitions do exist, e.g., when the Hausdorff dimension is greater than the topological dimension (Mandelbrot, 1982), but this fails for curves which contain loops. Another definition is any self-similar curve (Feder, 1988), but this includes straight lines, which are exactly self-similar.

Fractals are problematic, as they take a large amount of computational time to render and are quite difficult to control. An alternative method that can be used is random midpoint displacement, which is faster but less realistic. In random midpoint displacement, the midpoint of a straight-line segment a-b is displaced by r, a value chosen from a Gaussian distribution with a given mean and variance.
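A minimal one-dimensional sketch of the idea (assuming an array of height samples whose end values are already set; the Gaussian helper is an approximation):

#include <stdlib.h>

/* Roughly Gaussian noise via the central limit of uniform samples. */
static double gauss(double mean, double stddev)
{
    double s = 0.0;
    for (int i = 0; i < 12; i++) s += rand() / (double) RAND_MAX;
    return mean + stddev * (s - 6.0);
}

/* Recursively displace the midpoints of the segment between heights[lo] and heights[hi]. */
void midpoint_displace(double heights[], int lo, int hi, double stddev, double roughness)
{
    int mid = (lo + hi) / 2;
    if (mid == lo) return;                                /* adjacent samples: nothing left to split */
    heights[mid] = 0.5 * (heights[lo] + heights[hi]) + gauss(0.0, stddev);
    midpoint_displace(heights, lo, mid, stddev * roughness, roughness);
    midpoint_displace(heights, mid, hi, stddev * roughness, roughness);
}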

Another way to represent fractals is as self-squaring fractals. If we take a point x + iy in complex space and repeatedly apply the transformation zn + 1 = zn2 + c, it will either diverge to infinity, converge on a fixed limit (an attractor) or remain on the boundary of an object - the Julia set. The Julia set may be disconnected, depending on the control value c. A special set known as the Mandelbrot set is the set of all values of c that give a connected Julia set. Sets are found by probing of sample points and inverse transformation.

Shape Grammars

Shape grammars are production rules for generating and transforming shapes. They are
especially useful for plants and patterns. Strings can be interpreted as a sequence of drawing
commands, e.g., F - forward, L - left, R - right some angle.

Particle Systems

Particle systems can be used to model objects with fluid-like properties, such as smoke, fish
clouds, waterfalls, groups of animals, etc...

In particle systems, a large number of particles are generated randomly, and objects are
rendered as very small shapes. Parameters (e.g., trajectory, color, etc) vary and evolution is
controlled by simple laws. Particles eventually spawn or are deleted. One way of dealing with
particle systems is to start with organised particles and then disintegrate the object.

Constraints allow aspects of real world behaviour to be added to an evolving particle system, e.g., particles may bounce off a surface or be deflected by some field. Two forms of constraints exist: hard constraints, which must be enforced (e.g., a surface off which particles must bounce) - these are difficult to implement, as new behaviours and detectors must be set up; and soft constraints, which may be breached - these are usually implemented using penalty functions that depend on the magnitude of the breach of the constraint. They are often simpler to implement and can be described very efficiently in terms of energy functions similar to potential energy.

Physically Based Models

Physically based models can be used to model the behaviour of non-rigid objects (e.g.,
clothing, deforming materials, muscles, fat, etc...), and the model represents interplay
between external and internal forces.
Key Framing

Key framing is the process of animating objects based on position and orientation at key frames. The "in betweening" generates auxiliary frames using motion paths - these can be specified explicitly (e.g., using lines and splines) or generated procedurally (such as using a physics based model). Complexities can be caused by objects changing shape; objects may gain and lose vertices, edges and polygons, so we can consider a general form of key-framing as morphing (metamorphosis).

Animation paths between key frames can be given by curve-fitting techniques, which allows us to support acceleration. We can use equal time spacing of in-betweens for constant speed, but for acceleration we want the time spacing between in-betweens to increase, so an acceleration function (typically trigonometric) is used to map in-between numbers to times.

Cameras

Angel c.1

Images are formed by light reflecting off objects and being observed by a camera (which
could include the human eye).

In simple cameras, the image consists of projections of points, and with the pinhole model,
we can assume an infinite depth of field. We can also consider the field of view, which maps
points to points and the angle of view, which depends on variables d and h as in the diagram
below.
ys′ = -ys/(zs/d) and yc′ = -yc/(zc/d).

We can also consider synthetic cameras.

Viewing Transformations

For a more in-depth treatment, see lecture 6.

Angel c.5

Viewing determines how a 3D model is mapped to a 2D image.

In the real world, we pick up the object, position it and then look at it, however in computer
graphics, objects are positioned in a fixed frame and the viewer then moves to an appropriate
position in order to achieve the desired view - this is achieved using a synthetic camera model
and viewing parameters.

The important concepts in viewing transformations are the projection plane (the viewplane)
and projectors - straight projection rays from the centre of projection. The two different forms
of projection are perspective (the distance from the centre of projection to the view plane is
finite - as occurs in the natural world), and parallel, where the distance from the centre of
projection to the view plane is infinite, so the result is determined by the direction of
projection.

With perspective projection, lines not parallel to the viewplane converge to a vanishing point.
The principal vanishing point is for lines parallel to a principal axis, and an important
characteristic is the number of vanishing points.

With parallel projection, the centre of projection is at infinity, and the alignment of the
viewplane is either at the axes, or in the direction of projection.

We can define a view reference coordinate system, where the world coordinates are relative
to the camera, rather than the origin. In this system, we have many parameters, such as:

• location of the viewplane


• window within the viewplane
• projection type
• projection reference point (prp)

The viewplane also has parameters such as view reference point (vrp), the viewplane normal
(vpn) and the view up vector (vuv), and the window has width and height defined.

In the perspective space, the prp becomes the centre of projection and in the parallel case we
have the concept of a window centre and the direction of projection is the vector from prp to
the window centre. If the direction of projection and vpn are parallel, this gives us an
orthographic projection.

Clipping Planes

The viewplane window clips the sides of images, so we can also limit the depth of view by
using front and back clipping planes. We can use this to implement depth cueing, focusing on
objects of interest and culling artefacts (both near and far). To create a clipping plane, we
need a distance along the n axis - all planes are created parallel to the viewplane. In
perspective projection, if we have negative distances (behind the centre of projection), we see
objects upside down. Nothing in the plane of the centre of projection can be seen.

With clipping planes, we can now consider view volumes. For perspective views, we have a truncated pyramid (a frustum), but in the parallel view we get a rectangular solid.

In OpenGL, we have a current transformation matrix (CTM) that operates on all geometry.
There are two components to this: a model view matrix and a projection matrix. The model
view matrix addresses modelling transformations and camera position and the projection
matrix deals with 3D to 2D projection. OpenGL transformations can be applied to either
matrix, but we need to explicitly select the mode
using glMatrixMode(GL_MODELVIEW) or glMatrixMode(GL_PROJECTION).
Normally, we leave it in model-view mode, however.

We can also explicitly manipulate the matrix using glLoadIdentity(), glLoadMatrix{fd}(const TYPE *m) and glMultMatrix{fd}(const TYPE *m). We can also implicitly construct a matrix using gluLookAt(GLdouble ex, GLdouble ey, GLdouble ez, GLdouble cx, GLdouble cy, GLdouble cz, GLdouble ux, GLdouble uy, GLdouble uz). e refers to the position of the eye, c to the point being looked at and u tells you which way is up.

For perspective projection, the projection matrix can be defined using transformations, but it is much simpler to use 'pre-packaged' operators: glFrustum(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far) and gluPerspective(GLdouble fovy, GLdouble aspect, GLdouble near, GLdouble far).

For orthographic projection, the parameters define a rectangular view volume: glOrtho(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top, GLdouble near, GLdouble far). A simplified form for 2D graphics also exists: gluOrtho2D(GLdouble left, GLdouble right, GLdouble bottom, GLdouble top).

Projection transformations map a scene into a canonical view volume (CVV). OpenGL maps
into the full CVV given by the function glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0,
1.0).

Viewport transformations locate the viewport in the output window. Distortion may occur if the aspect ratios of the view volume and the viewport differ. The viewport and projection matrices may need to be reset after the window is resized.
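A hedged sketch of a GLUT reshape callback that keeps the two in step (the 60° field of view and clip distances are arbitrary):

#include <GL/glut.h>

/* Called by GLUT whenever the window is resized; registered with glutReshapeFunc(reshape). */
void reshape(int width, int height)
{
    if (height == 0) height = 1;                     /* avoid a divide by zero */

    glViewport(0, 0, width, height);                 /* viewport fills the window */

    glMatrixMode(GL_PROJECTION);                     /* rebuild the projection matrix */
    glLoadIdentity();
    gluPerspective(60.0, (GLdouble) width / height, 1.0, 100.0);

    glMatrixMode(GL_MODELVIEW);                      /* leave the mode as model-view */
}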

Visible Surface Determination

Angel c.5.6, 8.8

Visible surface determination is all about figuring out what is visible, and is based on vector
mathematics.

Coherence is the degree to which parts of an environment exhibit self similarity, and it can be
used to accelerate graphics by allowing simpler calculations or allowing them to be reused.
Types of coherence are:
• Object coherence - if objects are totally separate, they may be handled at the object level,
rather than the polygon level
• Face coherence - properties change smoothly across faces, so solutions can be modified
across a face
• Edge coherence - an edge visibly changes when it penetrates or passes behind a surface
• Implied edge coherence - when one polygon penetrates another, the line of intersection can
be determined from the edge intersections
• Scan-line coherence - typically, there is little difference between scan lines
• Area coherence - adjacent pixels are often covered by the same polygon face
• Depth coherence - adjacent parts of a surface have similar depth values
• Frame coherence - in animation, there is often little difference from frame to frame

Two ways of solving this are image precision and object precision. In image precision, an object is found for each pixel, and this has complexity Ο(np), where n is the number of objects and p is the number of pixels. For object precision, for each object the pixels that represent it are found and it is then determined whether each pixel is obscured; this has complexity Ο(n2), so this method is good when p > n, or in other words, when there are not many objects relative to the number of pixels.

One simple way of implementing object precision is using extents and bounding boxes. With bounding boxes, a box is drawn around each polygon/object, and only if any of the boxes intersect do we need to check for occlusion. This has linear performance. To achieve sublinear performance with extents and bounding boxes, we can precompute (possible) front-facing sets at different orientations and use frame coherence in identifying which polygons may become visible (this only works for convex polyhedra, however).

Another way of implementing object precision is to consider back face culling, which is the process of removing from the rendering pipeline the polygons that form the back faces of closed non-transparent polyhedra. This works by eliminating polygons that will never be seen. We know that the dot product of two unit vectors is the cosine of the angle between them, so when surface normal ∧ direction of projection < 0, we know the polygon is front facing, but when it is ≥ 0, we know it is rear facing and we do not need to render it. However, we still need special cases to deal with free polygons, transparencies, non-closed polyhedra and polyhedra that have undergone front plane clipping.

Back face culling in OpenGL is implemented using glCullFace(GL_FRONT), glCullFace(GL_BACK) or glCullFace(GL_FRONT_AND_BACK) and glEnable(GL_CULL_FACE). We can implement two-sided lighting with glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE) (or GL_FALSE), and material properties can be set using glMaterialfv(GL_FRONT, ...), etc... We also need to tell OpenGL which way round our front faces are when we're defining objects, which we do with glFrontFace(GL_CCW) and glFrontFace(GL_CW).
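A minimal sketch of enabling culling for counter-clockwise front faces:

glFrontFace(GL_CCW);      /* front faces have their vertices given counter-clockwise */
glCullFace(GL_BACK);      /* discard back faces in the pipeline */
glEnable(GL_CULL_FACE);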

Other optimisations that can be done include spatial partitioning, where the world is divided into separate regions so only one or two need to be evaluated at any time (e.g., in a maze with areas separated by doors). Additionally, we could use an object hierarchy, where the models are designed so the extents at each level act as a limiting case for their children.
Z-Buffers

The Z (or depth) buffer approach is an image precision approach. We can consider it similar
to the frame buffer, which stores colours, as the depth buffer stores Z values. We can
manipulate the Z buffer in a number of ways, e.g., we can read and write to it, or make it read
only, or link it to the stencil buffer.

In OpenGL, we implement the depth buffer using glutInitDisplayMode(GLUT_DEPTH | ...), then glEnable(GL_DEPTH_TEST), and then modify glClear() to use the GL_DEPTH_BUFFER_BIT too.
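Putting those calls together, a short sketch of the usual initialisation and per-frame clear:

/* At start-up: */
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);   /* ask for a depth buffer */
/* ... create the window ... */
glEnable(GL_DEPTH_TEST);

/* At the start of each frame: */
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);         /* clear colour and depth together */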

Z-buffering is good as it is not limited to polygons, but any geometric form with depth and
colour. Additionally, it has clear memory requirements and it is fairly easy to implement in
hardware. The buffer can be saved and reused for further optimisations, and masking can be
used to overlay objects.

It does, however, have finite accuracy, and problems can be caused by the non-linear Z transform in perspective projections. Additionally, collinear edges with different starting points can cause aliasing problems to occur, which is why it is important to avoid T-junctions in models.

An improvement that we can use is an A-buffer, which stores a surface list along with RGB
intensity, opacity, depth, area, etc... OpenGL only implements a Z-buffer, however, which
causes problems as the results of blending can be dependent on the order in which the model
objects are rendered. Typically the best way to work round this is to do translucent things
last.
Ray Casting

Ray casting works by firing rays from the centre of projection into the scene via the
viewplane. The grid squares on the viewplane correspond to pixels, and this way finds the
surface at the closest point of intersection, and the pixel is then coloured according to the
surface value.

To do this, we need a fast method for computing the intersections of lines with arbitrary shapes of polygons. One way of doing this is to represent lines in their parametric form; approximate values can then be found by subtracting the location of the centre of projection from the pixel's position.

This is quite easy for spheres, as using the parametric form of the line in the equation for a sphere gives us a quadratic equation that has two real roots if the line intersects the sphere, a single repeated real root if the line is tangential to the sphere and two complex roots if the line does not intersect.
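A small sketch of that test in plain C (the vector handling is written out inline; this is an illustration, not a tuned implementation):

#include <math.h>

/* Intersect the ray p + t*d with a sphere of radius r centred at c.
   Returns the number of real roots (0, 1 or 2) and writes the nearest t if any. */
int ray_sphere(const double p[3], const double d[3],
               const double c[3], double r, double *t_near)
{
    double oc[3] = { p[0] - c[0], p[1] - c[1], p[2] - c[2] };
    double A = d[0]*d[0] + d[1]*d[1] + d[2]*d[2];
    double B = 2.0 * (oc[0]*d[0] + oc[1]*d[1] + oc[2]*d[2]);
    double C = oc[0]*oc[0] + oc[1]*oc[1] + oc[2]*oc[2] - r*r;
    double disc = B*B - 4.0*A*C;              /* the sign decides the number of real roots */

    if (disc < 0.0) return 0;                 /* no intersection */
    *t_near = (-B - sqrt(disc)) / (2.0 * A);  /* the closer of the two intersection points */
    return (disc == 0.0) ? 1 : 2;             /* tangential if the discriminant is exactly zero */
}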

For polygons, the intersection of a line and the plane of a polygon must be found, and then
that point must be tested to see if it is inside the polygon.

List Priority

This is a hybrid object/image precision algorithm that operates by depth sorting polygons.
The principle is that closer objects obscure more distant objects, so we order the objects to try
and ensure a correct image.

The trivial implementation of this is the painter's algorithm, which makes the assumption that objects do not overlap in z. In the painter's algorithm, you start by rendering the furthest object away and then draw the closer objects on top - sort, then scan convert. However, if you have overlapping z-coordinates, interpenetrating objects or cyclic overlaps, then problems can occur.

Scan Line Algorithms

Scan line algorithms operate by building a data structure consisting of polygon edges cross-referenced to a table of polygons. When a scan commences, all the edges the scanline crosses are placed in an active set. As the scan proceeds, the polygons that it enters are marked as active. If only one polygon is active, its colour is rendered, else a depth test is carried out. This is again an image precision algorithm.
Area Subdivision Algorithms

This is a recursive divide and conquer algorithm. It operates by examining a region of the image plane and applying a simple rule for determining which polygons should be rendered in that region. If the rule works, that area is rendered; otherwise, the region is subdivided and the algorithm is recursively called on each subregion. This continues down to a subpixel level, which allows us to sort out aliasing problems. These are mixed object/image precision algorithms.

Clipping

Angel c.8.4-8.7 The purpose of clipping is to remove objects or parts of objects that lie
outside of the view volume. Operationally, the acceleration that occurs in the scan conversion
process is sufficient to make this process worthwhile.

If we have a rectangular clip region, then we can consider the basic case of making
sure x and y are between the two known bounds of the clip region. Clipping is usually applied
to points, lines, polygons, curves and text.
An alternative method with the same effect is scissoring, which is carried out on the pixels in memory after rasterisation, and is usually used to mask part of an image. Clipping, on the other hand, operates on primitives, which it discards, creates or modifies. It is more expensive than scissoring, but does greatly speed up rasterisation.

Cohen-Sutherland Algorithm

The Cohen-Sutherland algorithm uses the limits of the view volume to divide the world up into regions and assigns codes called "outcodes" to these regions. Line segments can then be classified by the outcodes of their end points. These outcodes are assigned to line midpoints and the outcodes are tested, with the result being either accept, reject or split. This is done until all line segments are accepted or rejected.

An outcode consists of 4 bits: (y > ymax)(y < ymin)(x > xmax)(x < xmin).

The algorithm works by testing the outcodes at each end of a line segment together with the AND of the outcodes. From this, we can infer 4 conditions:

Outcode 1 | Outcode 2 | Outcode 1 & Outcode 2 | Result
0 | 0 | Don't care | Both ends in view volume, so render
0 | >0 | Don't care | One end in and one end out of view volume, clip line to view volume
>0 | 0 | Don't care | One end in and one end out of view volume, clip line to view volume
>0 | >0 | 0 | Part of line may be in view, so do more tests
>0 | >0 | >0 | Line can not enter view volume

The Cohen-Sutherland algorithm works best when the clipping region is large compared to
the world, so there are lots of trivial accepts, or when the clipping region is small compared
to the world, so there are lots of trivial rejects.
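A minimal sketch of computing an outcode for a 2D point against a rectangular clip region (the bit assignments are one common convention, not the only one):

enum { INSIDE = 0, LEFT = 1, RIGHT = 2, BOTTOM = 4, TOP = 8 };

/* Classify a point against the clip rectangle [xmin, xmax] x [ymin, ymax]. */
int outcode(double x, double y,
            double xmin, double xmax, double ymin, double ymax)
{
    int code = INSIDE;
    if (x < xmin)      code |= LEFT;
    else if (x > xmax) code |= RIGHT;
    if (y < ymin)      code |= BOTTOM;
    else if (y > ymax) code |= TOP;
    return code;
}

/* A segment is trivially accepted when outcode(p1) | outcode(p2) == 0,
   and trivially rejected when outcode(p1) & outcode(p2) != 0. */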

However, as we find the intercept using the slope-intercept form, these calculations involve
floating point divisions to find x and the value of m, and redundant clipping can often occur.

Parametric Clipping

For equations, see the lecture notes.

This algorithm works by deriving the line equation in a parametric form so that the properties of the intersections with the sides of the view volume can be qualitatively analysed. Only if absolutely necessary are the intersections computed.

When we are dealing with 3D, lines have to be clipped against planes. The Cohen-Sutherland algorithm can be applied in a modified form, but only the Liang-Barsky algorithm can be used if parametric clipping is required.

Sutherland-Hodgman

When dealing with polygons, it is not good enough to do line clipping on the polygon edges,
as new points must be added to close up polygons. The Sutherland-Hodgman algorithm clips
all the lines forming the polygon against each edge of the view volume in turn.

This is implemented conceptually by using in and out vertex lists. The input list is traversed
with vertices stored in the output list. Each edge is then clipped against in turn, using the
output from the previous clip as input. We can consider this as a pipeline of clippers, and
optimise by passing each vertex to the next clipper immediately (i.e., pipelining).
When dealing with curves and circles, we have non-linear equations for intersection tests, so we utilise bounding information for trivial accepts/rejects and then repeat on quadrants, octants, etc..., and either test each section analytically or scissor on a pixel-by-pixel basis.

Similarly with text, we can consider it as a collection of lines and curves (GLUT_STROKE)
and clip it as a polygon, or as a bitmap (GLUT_BITMAP), which is scissored in the colour
buffer. Since characters usually come as a string, we can accelerate clipping by using
individual character bounds to identify where to clip the string.

Scan Conversion (Rasterisation)

Angel c.8.9-8.12 Given a line defined on the plane of real numbers (x, y), and a discrete model of the plane consisting of a rectangular array of rectangles called pixels which can be coloured, we want to map the line on to the pixel array while satisfying and optimising the constraints of maintaining a constant brightness, having a differing pen style or thickness, the shape of the endpoints if the line is thicker than one pixel, and minimising jagged edges.

The pixel space is a rectangle on the x, y plane bounded by xmax and ymax. Each axis is divided
into an integer number of pixels Nx, Ny, therefore the width and height of each pixel is
(xmax/Nx, ymax/Ny). Pixels are referred to using integer coordinates that either refer to the
location of their lower left hand corners or their centres. Knowing the width and height values
allow pixels to be defined, and often we assume that width and height are equal, making the
pixels square.

Digital Differential Analyser Algorithm

This algorithm assumes that the gradient satisfies 0 ≤ m ≤ 1 and all other cases are handled by
symmetry. If a line segment is to be drawn from (x1, y1) to (x2, y2), we can compute the
gradient as m = Δy/Δx. Using a round function to cast floats to integers, we can move along
the line and colour pixels in. This looks like:

/* assumes 0 <= m <= 1; colour_pixel() sets a single pixel */
int line_start = round(x1);
int line_end = round(x2);
float y = y1;

colour_pixel(line_start, round(y));
for (int i = line_start + 1; i <= line_end; i++) {
    y = y + m;                      /* step y by the gradient for each unit step in x */
    colour_pixel(i, round(y));
}

Bresenham's Algorithm

The DDA algorithm needs to carry out floating point additions and rounding for every
iteration, which is inefficient. Bresenham's algorithm operates only on integers, requiring that
only the start and end points of a line segment are rounded.

If we assume that 0 ≤ m ≤ 1 and (i, j) is the current pixel, then the value of y on the line x = i + 1 must lie between j and j + 1. As the gradient must be in the range of 0 to 45°, the next pixel must either be the pixel to the east (i + 1, j) or north east (i + 1, j + 1) (assuming that pixels are identified about their centres).

This gives us a decision problem, so we must find a decision variable to determine which pixel to colour. We can rewrite the straight line equation as a function ƒ(x, y) = xΔy - yΔx + cΔx. This function has the properties that ƒ(x, y) < 0 when the point is above the line and ƒ(x, y) > 0 when the point is below the line. We then evaluate ƒ at the midpoint between pixels E and NE, that is the point (xP + 1, yP + 0.5), where P is the current pixel.

Following steps are not independent of previous decisions, as the line must still be tested against the midpoint of the E and NE pixels of whichever pixel is currently being considered.

To eliminate floats from this, we can simply multiply everything by 2 to eliminate our 0.5
factor.
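A minimal integer-only sketch under those assumptions (0 ≤ m ≤ 1, x1 < x2, and the same colour_pixel() helper as in the DDA example):

/* Midpoint (Bresenham-style) line drawing using only integer arithmetic. */
void midpoint_line(int x1, int y1, int x2, int y2)
{
    int dx = x2 - x1;
    int dy = y2 - y1;
    int d  = 2 * dy - dx;          /* decision variable, scaled by 2 to avoid the 0.5 */
    int incE  = 2 * dy;            /* change in d after choosing E */
    int incNE = 2 * (dy - dx);     /* change in d after choosing NE */
    int x = x1, y = y1;

    colour_pixel(x, y);
    while (x < x2) {
        if (d <= 0) {              /* midpoint above the line: pick E */
            d += incE;
        } else {                   /* midpoint below the line: pick NE */
            d += incNE;
            y++;
        }
        x++;
        colour_pixel(x, y);
    }
}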

We have other considerations, such as line intensity. Intensity is a function of slope, and lines
with different slopes have different numbers of pixels per unit length. To draw two such lines
with the same intensity, we must make the pixel intensity a function of the gradient (or
simply use antialiasing).

We can also consider clip rectangles. If a line segment has been clipped at a clipping
boundary opposite an integer x value and a real y value, then the pixel at the edge is the same
as the one produced by the clipping algorithm. Subsequent pixels may not be the same,
however, as the clipped line would have a different slope. The solution to this is to always
use the original (not clipped) gradient.

For filled polygons, we calculate a list of edge intersections for each scan line and sort the list in increasing x. For each pair in the list, the span between the two points is filled. When given an intersection at some fractional x value, we determine the side that is the interior by rounding down when inside and up when outside. When we are dealing with integer values, we say that the leftmost pixels of a span are inside and the rightmost are outside. When considering shared vertices, we only count the minimal y value of an edge, and horizontal edges are not counted at all.

We can also use the same scan line algorithm that was discussed in visible surface
determination, where a global list of edges is maintained, each with whatever interpolation
information is required. This mechanism is called edge tables.

Colours
Colours are used for printing, art and computer graphics. A colour has a hue, which is used to distinguish between different colours and is decided by the dominant wavelength; a saturation (the proportions of dominant wavelength and white light needed to define the colour), defined by the excitation purity; and a lightness (for reflecting objects) or brightness (for self-luminous objects), called the luminance, which gives us the amount of light.
A colour can be characterised by a function C(λ) that occupies wavelengths from about 350 to 780 nm, where the value for a given wavelength λ shows the intensity of that colour at that wavelength. However, as the human visual system works with three different types of cones, our brains do not receive the entire distribution C(λ), but rather three values - the tristimulus values.

Tristimulus theory is the basic tenet of three-colour theory: "If two colours produce the same
tristimulus values, then they are visually indistinguishable". This means that we only need
three colours (primaries) to reproduce the tristimulus values needed for a human observer.
With CRTs, we vary the intensity of each primary to produce a colour, and this is called
additive colour (where the primary colours add together to give the perceived colour). In such
a system, the primaries red, green and blue are usually used.

The opposite of additive colour (where primaries add light to an initially black display) is
subtractive colour, where we start with a white surface (such as a sheet of paper) and add
coloured pigments to remove colour components from a light that is striking the surface.
Here, the primaries are usually the complementary colours - cyan, magenta and yellow.

If we normalise RGB and CYM to 1, we can express this as a colour solid, and any colour
can therefore be represented as a point inside this cube.

CIE on Wikipedia

One colour model is the CIE model (Commission Internationale de l'Eclairage, 1931). It consists of three primaries, X, Y and Z, and experimentally determined colour-matching functions that represent all colours perceived by humans. The XYZ model is an additive model.

The CIE model breaks colour down into two parts, brightness and chromaticity. In the XYZ system, Y is a measure of brightness (or luminance), and the colour is then specified by two derived parameters x and y, which are functions of all three tristimulus values, X, Y and Z, such that: x = X/(X + Y + Z) and y = Y/(X + Y + Z).

A colour gamut is a line or polygon on the chromaticity diagram.

YIQ is a composite signal for TV defined as part of the NTSC standard. It is based on CIE in that the Y value carries the luminance as defined in CIE, and I (in-phase) and Q (quadrature) contribute to hue and purity.

Two other systems are the hue-saturation-value (HSV) system, which is a hexcone formed by the RGB colour space as viewed from (1, 1, 1); colours are represented by an angle around the cone (hue), the saturation (distance from the central axis) as a ratio of 0 to 1, and the value (intensity), which is the distance up the cone.

A similar system is the hue-lightness-saturation (HLS) system. The hue is the name of the colour, lightness the luminance, and saturation the colour attribute that distinguishes a pure shade of a colour from a shade of the same hue mixed with white to form a pastel colour. HLS is also typically defined as a cone, so it represents an easy way to convert the RGB colour space into polar coordinates.

Illumination

Angel c.6

The simplest lighting model is that where each object is self-luminous and has its own
constant intensity. This is obviously not very accurate, however, so we consider light to
originate from a source, which can either be ambient, a spotlight, distributed or a distant light.

Surfaces, depending on their properties (e.g., matte, shiny, transparent, etc), will then show up in different ways depending on how the light falls on them. We can characterise reflections as either specular, where the reflected light is scattered in a narrow range of angles around the angle of reflection, or diffuse, where reflected light is scattered in all directions. We can also consider translucent surfaces, where some light penetrates the surface by refraction and some is reflected.

Diffuse Reflection

We can use the Lambertian model to implement diffuse reflection, which occurs on dull or
flat surfaces.

With the Lambertian model, the light is reflected with an equal intensity in all directions (this is not a real world representation, however, as imperfections in a surface will reflect light more in some directions than others), and we only consider the vertical component of the light (the angle between the surface normal N and the light source direction L), which can be represented as (N ∧ L). We also need to consider a diffuse reflection coefficient k, so we can compute the intensity of a surface as I = Isource k (N ∧ L).

We can also consider a distance term to account for any attenuation of the light as it travels
from the source to the surface. In this case, we use the quadratic attenuation
term k/(a + bd + cd2), where d is the distance from the light source.

However, this causes problems as the value of (N ∧ L) will be negative if the light is below the horizon, so we give this value a floor of 0.
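A small sketch of this calculation in plain C (N and L are assumed to be unit vectors; the dot product is written out inline):

/* Diffuse intensity at a surface point: I = I_source * k * max(0, N . L). */
double lambert(double I_source, double k, const double N[3], const double L[3])
{
    double n_dot_l = N[0]*L[0] + N[1]*L[1] + N[2]*L[2];
    if (n_dot_l < 0.0) n_dot_l = 0.0;      /* floor of 0 for light below the horizon */
    return I_source * k * n_dot_l;
}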

For long distance light sources, we can consider L to be constant, and harshness can be
alleviated by including an ambient term.

Diffuse reflection is set up using GL_DIFFUSE or GL_AMBIENT_AND_DIFFUSE in glMaterial.

Specular Reflection

Specular reflection is a property of shiny (smooth) surfaces. If we assume point sources, we can model this in terms of L, the light source vector, N, the surface normal, and V, a vector towards the centre of projection. In a perfect reflector, reflections occur only along the reflection direction R.

Phong Illumination
The Phong illumination model is a simple model for non-perfect reflectors. Maximum
reflection occurs when α is 0, and then a rapid fall-off occurs. It can be approximated by
cosn(α), where n is a shininess coefficient.

The Phong illumination model including distance attenuation is therefore: I = Iambientkambient + ((kcosθ + kspecularcosnα)Isource)/(a + bd + cd2), where k is the diffuse reflection coefficient from before and kspecular is a new specular-reflection coefficient.

For multiple light sources, lighting equations are evaluated individually for each source and
the results are summed (it is important to be aware of overflow).

Objects that are further from the light appear darker. A realistic but unhelpful way of modelling this is as 1/(d2), so we use the more useful model min(1/(a + bd + cd2), 1), where a, b and c are constants chosen to soften the lighting.

In OpenGL, we can use the parameters GL_CONSTANT_ATTENUATION, GL_LINEAR_ATTENUATION and GL_QUADRATIC_ATTENUATION to allow each factor to be set.
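For instance, a hedged sketch (the coefficient values are arbitrary):

/* Set a, b and c in the 1/(a + b*d + c*d*d) attenuation term for GL_LIGHT0. */
glLightf(GL_LIGHT0, GL_CONSTANT_ATTENUATION,  1.0f);    /* a */
glLightf(GL_LIGHT0, GL_LINEAR_ATTENUATION,    0.05f);   /* b */
glLightf(GL_LIGHT0, GL_QUADRATIC_ATTENUATION, 0.01f);   /* c */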

Fog

Atmospheric attenuation is used to simulate the effect of the atmosphere on colour, and is a refinement of depth cueing. It is most commonly implemented using fog.

OpenGL supports three fog models using glFog{if}v(GLenum pname, TYPE *params);, using the GL_FOG_MODE parameter with the arguments GL_LINEAR, GL_EXP or GL_EXP2. Other parameters for glFog include GL_FOG_START, GL_FOG_END and GL_FOG_COLOR.

Additionally, fog must be specifically enabled using glEnable(GL_FOG);.
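A short sketch of linear fog (the colour and start/end distances are arbitrary):

GLfloat fog_colour[] = { 0.6f, 0.6f, 0.6f, 1.0f };   /* light grey */

glEnable(GL_FOG);
glFogi(GL_FOG_MODE, GL_LINEAR);      /* fade linearly between the start and end distances */
glFogf(GL_FOG_START, 10.0f);
glFogf(GL_FOG_END,   50.0f);
glFogfv(GL_FOG_COLOR, fog_colour);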

Global Illumination
Global illumination takes into account all the illumination which arrives at a point, including light which has been reflected or refracted by other objects. It is an important factor to consider when pursuing photo realism.

The two main global illumination models are ray tracing, which considers specular interaction, and radiosity, which considers diffuse interaction. When considering these with regard to global illumination, we have global interaction, where we start with the light source and follow every light path (ray of light) as it travels through the environment. Stopping conditions that we can consider are when the light hits the eye point, when it travels out of the environment, or when the light has had its energy reduced to an admissible limit due to absorption in objects.

With diffuse and specular interactions, we can derive four types of interaction to consider,
diffuse-diffuse (the radiosity model), specular-diffuse, diffuse-specular and specular-specular
(ray tracing).

Ray Tracing

Whitted ray tracing is a ray tracing algorithm that traces light rays in reverse direction of
propagation from the eye back into the scene towards the light source. Reflection rays (if
reflective) and refraction rays (if translucent/transparent) are spawned for every hit point
(recursive calculation), and these rays themselves may spawn new rays if there are objects in
the way.

Ray tracing uses the specular global illumination model and a local model, so it considers
diffuse-reflection interactions, but not diffuse-diffuse.

The easiest way to represent ray tracing is using a ray tree, where each hit point generates a
secondary ray towards the point source, as well as the reflection and refraction rays. The tree
gets terminated when secondary rays miss all objects, a preset depth limit is reached, or
time/space/patience is exhausted.

Problems with ray tracing are that shadow rays are usually not refracted, and problems with
numerical precision may give false shadows. Additionally, the ambient light model it uses is
simplistic.

Radiosity

Wikipedia article

Classic radiosity implements a diffuse-diffuse interaction. The solution is view-independent, as lighting is calculated for every point in the scene, rather than only those that can be seen from the eye.

With radiosity, we consider the light source as an array of emitting patches. The scene is then split into patches; however this splitting depends on the solution, as we do not know how to divide the scene into patches until we have a partial result, which makes the radiosity algorithm iterative. We then shoot light into the scene from a source and consider diffuse-diffuse interactions between a light patch and all the receiving patches that are visible from the light patch. The process then continues iteratively by considering as the next shooting patch the one which has received the highest amount of energy. The process stops when a high percentage of the initial light has been distributed about the scene.

Problems can occur if the discretisation of the patches is too coarse, and radiosity does not consider any specular components.

Shading

Angel c.6.5 There are two principal ways of shading polygons: flat shading, where one colour is decided for the whole polygon and all pixels on the polygon are set to that colour, and interpolation shading, where the colour at each pixel is decided by interpolating properties of the polygon's vertices. Two types of interpolation shading are used, Gouraud shading and Phong shading. In both Gouraud and Phong shading, each vertex can be given different colour properties and different surface normals.

Flat Shading

In this model, the illumination model is applied once per polygon, and the surface
normal N is constant for the entire polygon - so the entire polygon is shaded with a single
value.

This can yield nasty effects (even at high levels of detail), such as Mach banding, which is
caused by "lateral inhibition", a perceptual property of the human eye: the more light a
receptor receives, the more it inhibits its neighbours. This acts as a biological edge detector,
and the human eye prefers smoother shading techniques.

In OpenGL, this model is selected using glShadeModel(GL_FLAT).

Gouraud Shading

In Gouraud shading, each vertex has a normal vector and colour data associated with it. The
colouring of each vertex is determined and then interpolated along each edge. When the
polygon is scan converted, the colour is interpolated along each scan line using the edge
values.
However, the polygons are still flat, so the silhouette of curved objects can visibly consist of
straight lines. Additionally, in the OpenGL pipeline, shading is carried out after the projection
transformation is applied, so if a perspective transformation is used, the non-linear
transformation of the Z-axis may produce odd effects. These effects can be worked around
using smaller polygons, however.

Additionally, discontinuities can occur where two polygons with different normals share the
same vertex, odd effects can occur if the normals are set up incorrectly, and some Mach
banding effects can still be produced.

In OpenGL, this mode is selected using glShadeModel(GL_SMOOTH).
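For example, a minimal fixed-function fragment (assuming a GL context and lighting have been set
up elsewhere) that gives each vertex of a triangle its own colour and normal for interpolation:

/* Illustrative fixed-function OpenGL fragment: per-vertex normals/colours
   are interpolated across the triangle when GL_SMOOTH is selected. */
glShadeModel(GL_SMOOTH);

glBegin(GL_TRIANGLES);
    glNormal3f(0.0f, 0.0f, 1.0f);  glColor3f(1.0f, 0.0f, 0.0f);
    glVertex3f(-1.0f, -1.0f, 0.0f);

    glNormal3f(0.0f, 1.0f, 0.0f);  glColor3f(0.0f, 1.0f, 0.0f);
    glVertex3f( 1.0f, -1.0f, 0.0f);

    glNormal3f(1.0f, 0.0f, 0.0f);  glColor3f(0.0f, 0.0f, 1.0f);
    glVertex3f( 0.0f,  1.0f, 0.0f);
glEnd();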

Phong Shading

In Phong shading, the normals are interpolated rather than the colour intensities. Normals are
interpolated along edges, and then interpolation down the scan line occurs and the normal is
calculated for each pixel. Finally, the colour intensity for each pixel is calculated.

Phong shading does give a correct specular highlight and reduces the problems of Mach banding,
but it has a very large computational cost, so Phong shading is almost always done
offline, and it is not available in the standard fixed-function OpenGL pipeline.

Blending

We can use a process called blending to allow translucent primitives to be drawn. We
consider a fourth colour value - the alpha value (α ∈ [0,1]) - to give us an RGBA colour
model. Alpha is then used to combine colours of the source and destination components.
Blending factors (four-dimensional vectors) are used to figure out how to mix these RGBA
values when rendering an image. A factor is multiplied against the appropriate colour.

In OpenGL, blending must be enabled using glEnable(GL_BLEND) and the blending
functions for the source and destination must be specified using glBlendFunc(GLenum
source, GLenum destination). Common blending factors include:

• GL_ZERO - (0, 0, 0, 0)
• GL_ONE - (1, 1, 1, 1)
• GL_SRC_COLOR - (Rsource, Gsource, Bsource, Asource)
• GL_ONE_MINUS_SRC_COLOR - (1 - Rsource, 1 - Gsource, 1 - Bsource, 1 - Asource)
• GL_SRC_ALPHA - (Asource, Asource, Asource, Asource)
• GL_ONE_MINUS_SRC_ALPHA - (1 - Asource, 1 - Asource, 1 - Asource, 1 - Asource)

The order in which polygons are rendered affects the blending.
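For example, a typical translucency setup (assuming opaque geometry has already been drawn) uses
the source alpha factors:

/* Typical translucency setup: draw opaque geometry first, then blend
   translucent primitives over it using the source alpha value. */
glEnable(GL_BLEND);
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);

glColor4f(0.0f, 0.0f, 1.0f, 0.5f);   /* 50% translucent blue */
glBegin(GL_QUADS);
    glVertex3f(-1.0f, -1.0f, 0.0f);
    glVertex3f( 1.0f, -1.0f, 0.0f);
    glVertex3f( 1.0f,  1.0f, 0.0f);
    glVertex3f(-1.0f,  1.0f, 0.0f);
glEnd();

glDisable(GL_BLEND);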

Shadows

There are two methods to implement shadows:

1. Use solid modelling techniques to determine shadow volumes, then change the scan
conversion routine to check if a pixel is in a shadow volume, and slightly reduce the colour
intensity for every volume it is in.
2. A cheaper way is to consider only shadows on one plane, and then define a perspective
projection onto that plane with the centre of projection being the light source. To do this,
we first draw the plane, then the other objects; we then multiply the view matrix by the
perspective projection matrix, switch off the depth testing and re-render the objects. This is also
called 2-pass shading, and uses a simple projection matrix (given below) that projects all 3D
points onto a 2D plane a distance d from the centre of projection.
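A common form of such a matrix, assuming the centre of projection at the origin and the projection
plane z = d, is:

M = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1/d & 0 \end{pmatrix}

which maps a homogeneous point (x, y, z, 1) to (x, y, z, z/d), i.e., (xd/z, yd/z, d) after the
perspective division.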

Texturing

Angel c.7.6 To provide surface detail, geometric modelling has its limits - a large number
of polygons is inefficient to render, and modelling fine detail geometrically is difficult. It
is often easier to introduce detail in the colour/frame buffer as part of the shading process.
Tables and buffers can be used to alter the appearance of the object, and several different
mapping approaches are in widespread use, e.g., texture mapping is used to determine colour,
environment mapping to reflect the environment and bump mapping to distort the apparent
shape by varying the normals.

A texture is a digital image that has to be mapped onto the surface of the polygon as it is
rendered into the colour buffer. Many issues can arise during this process, such as differences
in format between the colour buffer and the image, the number of dimensions in the texture,
the way the image is held (should be as a bitmap consisting of texels), and how the image is
mapped to the polygon.

Texture Mapping

Translating textures in multiple dimensions to a 2-dimensional polygon can be done in a
number of ways:

• 1-dimensional, e.g., a rainbow - a 1-dimensional texture from violet to red is mapped onto
an arc
• 2-dimensional - this is straightforward as a direct mapping can be done
• 3-dimensional, e.g., a sculpture - a 3-dimensional texture contains a model of
the rock's colouring and a one-to-one mapping of the space containing the statue onto the
texture space occurs. When the polygons of the statue are rendered, each pixel is coloured
using the texture colour at the equivalent texture space point
• 4-dimensional - a 3-dimensional mapping is done, and the fourth dimension consists
of the initial and last states; the intermediate steps are alpha blends of the two
limits. Consecutive textures are then applied during the animation.

When we have a two-dimensional texture consisting of a 2-dimensional array of texels, a
2-dimensional polygon defined in 3-dimensional space, and a mapping of the texture on
to the polygon, we can project the polygon onto a viewplane and then quantise the viewplane
into discrete pixels. We then need to use the correct texel for any pixel on the polygon.
OpenGL explicitly supports 1-dimensional and 2-dimensional textures.

One way of doing this is to normalise the texture space (s, t) to [0,1] and then consider the
mapping of a rectangle of texture into the model (u, v) space. This mapping is defined
parametrically in terms of maximum and minimum coordinates:
u = umin + ((s - smin) / (smax - smin)) × (umax - umin).

An alternative is to texture a whole object at once using a projection from a simpler
intermediate object - this process is called two-part mapping. Textures are mapped on to
regular surfaces (e.g., spheres, cylinders, boxes) and projected on to the textured object using
three different methods - the normal from the intermediate, the normal to the intermediate
and the projector from the centre of the object. All that then has to be done is to define the
mapping on to the intermediate object.
If there is more polygon than there is texture, we can either repeat the whole texture like
wallpaper, or clamp, so that only the final edge texels are repeated (effectively as a
1-dimensional texture). Additionally, sampling problems between the texture and the polygon
can occur, which is further heightened by the non-linear transformation of the Z-axis in the
case of perspective. Mipmapping can be used to control the level of detail by generating new
textures that have half the resolution of the original and using them where appropriate.

In OpenGL, textures are defined as a combination of an image bitmap and a mapping
function on to objects. The image bitmap is set using glTexImage2D(GLenum
target, GLint level, GLint components, GLsizei width, GLsizei
height, GLint border, GLenum format, GLenum type, GLvoid
*pixels), where target is normally
either GL_TEXTURE_2D or GL_TEXTURE_1D (a glTexImage1D() variant also
exists), level is normally 0 and allows multi-resolution textures
(mipmaps), components is the number of colour components (1-4, i.e.,
RGBA), width and height are the horizontal and vertical dimensions of the texture map
and need to be powers of 2, border is either 0 or 1 and defines whether or not
a border should be used, format is the colour format
(e.g., GL_RGB or GL_COLOR_INDEX), type is the data type
(e.g., GL_FLOAT, GL_UNSIGNED_BYTE) and pixels is a pointer to the texture data
array.
glTexParameter{if}v(GLenum target, GLenum parameter, TYPE
*value), is used to determine how the texture is applied. target is
typically GL_TEXTURE_2D and parameter is one
of GL_TEXTURE_WRAP_S, GL_TEXTURE_WRAP_T, GL_TEXTURE_MAG_FILTER, GL_
TEXTURE_MIN_FILTER and GL_TEXTURE_BORDER_COLOR. The value then depends
on the parameter being set. The wrap parameters determine how texture coordinates outside
the range 0 to 1 are handled: GL_CLAMP clamps the coordinates
to the range 0..1 so the final edge texels are repeated, and GL_REPEAT repeats the whole
texture along the fractional parts of the surface. MAG and MIN specify how texels are
selected when pixels do not map on to texel centres: GL_NEAREST selects the
nearest texel and GL_LINEAR uses a weighted average of the 4 nearest
texels. GL_TEXTURE_BORDER_COLOR expects a colour vector to be specified as a
parameter.

The vertices of the polygon must be assigned coordinates in the texture space and this is done
in the same way as assigning normals, with a glTexCoord2f(x, y) command, which
assigns the vertex to position (x, y) in the 2-dimensional texture.
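Putting these calls together, a minimal illustrative setup and textured quad might look like this
(texWidth, texHeight and texels are assumed to describe an RGB image with power-of-2
dimensions):

/* Illustrative texture setup and use; texWidth/texHeight/texels are assumed
   to describe an RGB image with power-of-2 dimensions. */
glEnable(GL_TEXTURE_2D);

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, texWidth, texHeight, 0,
             GL_RGB, GL_UNSIGNED_BYTE, texels);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

glBegin(GL_QUADS);
    glTexCoord2f(0.0f, 0.0f);  glVertex3f(-1.0f, -1.0f, 0.0f);
    glTexCoord2f(1.0f, 0.0f);  glVertex3f( 1.0f, -1.0f, 0.0f);
    glTexCoord2f(1.0f, 1.0f);  glVertex3f( 1.0f,  1.0f, 0.0f);
    glTexCoord2f(0.0f, 1.0f);  glVertex3f(-1.0f,  1.0f, 0.0f);
glEnd();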

glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, value) is used to
determine the interaction between the texture and the polygon's material, where value is
either GL_REPLACE, where the pixel is always given the texture's colour; GL_DECAL,
where in RGB mode the pixel is given the texture's colour, but if RGBA is used, alpha blending
occurs; GL_MODULATE, where the polygon's colour and the texture are multiplied;
or GL_BLEND, where a texture environment colour is specified
using glTexEnvfv(GL_TEXTURE_ENV, GL_TEXTURE_ENV_COLOR, GLfloat
*colourarray) and this is multiplied by the texture's colour and added to a one-minus-
texture blend.

As with most other things in OpenGL, we also need to
use glEnable(GL_TEXTURE_2D) or glEnable(GL_TEXTURE_1D) to enable
texturing.

Mipmapping

Mipmapping is implemented in OpenGL where the resolution of a texture is successively
quartered (e.g., 1024 × 512 → 512 × 256 → 256 × 128 → ... → 1 × 1).

OpenGL allows different texture images to be provided for multiple levels, but all sizes of the
texture, from 1 × 1 to the max size, in increasing powers of 2 must be provided. Each texture
is identified by the level parameter in glTexImage?D, and level 0 is always the original
texture. Mipmapping is activated by calling glTexParameter() with
the GL_TEXTURE_MIN_FILTER parameter and a value such as GL_NEAREST_MIPMAP_NEAREST.

OpenGL (via the GLU library) does provide a function to automatically construct
mipmaps: gluBuild2DMipmaps(GLenum target, GLint components, GLint
width, GLint height, GLenum format, GLenum type, GLvoid *data).
The parameters are the same as in glTexImage2D(), except that there are no level or
border parameters.
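An illustrative mipmapped setup (assuming the GLU library is available and that texels holds the
full-resolution RGB image) might be:

/* Illustrative mipmapped texture setup using GLU; requires <GL/glu.h> and
   assumes texels holds the level-0 RGB image. */
gluBuild2DMipmaps(GL_TEXTURE_2D, GL_RGB, texWidth, texHeight,
                  GL_RGB, GL_UNSIGNED_BYTE, texels);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST_MIPMAP_NEAREST);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
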
Texture Blending and Lighting

Additionally, we need to consider blending and lighting the texture. When a texel is to be
placed in to the colour buffer during scan conversion, we still need to make two choices -
whether or not the illumination model applies to the texture and whether the texture replaces
the properties of the underlying material of the polygon, or is blended with it (i.e., do we
see only the texture, or a mixture of texture and material). If we do not apply the illumination
model to the texture, the texture will appear the same from all angles under all lights, and if
we do then the texture will be shaded using the same calculations as would be applied to
material properties of the polygon.

Environment and reflection maps can be used to simulate a shiny object without ray tracing. A
view point is positioned within the object, looking out and then the resultant image is used as
a texture to apply to the object. The reflective object S is replaced by a projection
surface P and then the image of the environment is computed on P. An image is then
projected from P onto S.

Many other pixel-level operations are also possible:

• Fog - the pixels are blended with the fog colour with the blending governed by the Z-
coordinate
• Antialiasing - pixels are replaced by the average of their own and their nearest neighbours'
colours
• Colour balancing - colours are modified as they are written in to the colour buffer
• Direct manipulation - colours are copied or replaced

As well as the colour and depth buffers, OpenGL provides a stencil buffer used for masking
areas of other buffers and an accumulation buffer which can be used to composite multiple
rendered images (e.g., for motion blur or antialiasing); a bitwise XOR operator is also usually
provided in hardware on the graphics chip.

Halftones

Halftones can be used to increase apparent available intensities at the expense of resolution
and are based on the principle of halftoning as used in newspaper and other print
publications.

Halftones are implemented using rectangular pixel regions called halftone patterns.
An n × n grid gives n² + 1 intensities, so a 4 by 4 block has 17 shades from white to black. To
reach a level k intensity, you need to turn on the pixels numbered ≤ k. This can be generalised
to colour.

Pattern generation is not trivial, as sub-grid patterns become evident. Visual effects, e.g.,
contouring, need to be avoided, and isolated pixels are not always effective on some devices.

Dithering

Dithering uses the same principle as halftoning in printing, but the output medium has fixed
pixels and cannot represent the dots of a halftone picture. The solution is to use a block of
pixels for each point in the image.

For an n × n dither matrix Dn, the entries are 0..n² - 1. The input intensity I required at (x, y) is
scaled into 0 ≤ k ≤ n², and the pixel at (x, y) is turned on if and only if k > Dn(x mod n, y mod n).
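A sketch of ordered dithering with the standard 4 × 4 Bayer matrix (the image access helpers
get_intensity() and set_pixel() are hypothetical):

/* Sketch of ordered dithering with a 4x4 Bayer matrix.  get_intensity() and
   set_pixel() are assumed/hypothetical image-access helpers. */
static const int D4[4][4] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 }
};

extern double get_intensity(int x, int y);        /* input intensity in [0,1] */
extern void   set_pixel(int x, int y, int on);    /* 1-bit output device      */

void dither(int width, int height)
{
    for (int y = 0; y < height; y++)
        for (int x = 0; x < width; x++) {
            int k = (int)(get_intensity(x, y) * 16.0);   /* scale to 0..16 */
            set_pixel(x, y, k > D4[y % 4][x % 4]);       /* on iff I > D   */
        }
}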

If the number of pixels in the colour buffer is the same as in the output medium, dithering is
applied by incrementing the intensity of each pixel in the output by components dependent on
the sum of the values from several superimposed dither matrices representing the pixels of
the input.

Dithering in OpenGL can be controlled
using glEnable(GL_DITHER) and glDisable(GL_DITHER); it is
enabled by default.

Bump Maps

Bump maps are used to capture surface roughness. A perturbation function is applied to the
surface normal and this perturbed normal is used in the lighting calculations. For a position on the
surface P(u, v), a bump map b(u, v) and surface normal n, the perturbed surface is P′(u, v) = P(u, v) + b(u, v)n.

Bump maps are represented using lookup tables and can be random patterns, regular patterns
or some other mapping, such as characters, etc...

OpenGL provides facilities for all or part of a texture image to be replaced by a new image
with glTexSubImage2D(), as it is much faster to replace than to construct a new image -
this is useful in video processing.
Additionally, glCopyTexImage?D() and glCopyTexSubImage?D() allow you to
take texture information directly from the frame buffer.

OpenGL implementations may support working sets of textures called texture objects, which
are similar in principle to display lists. With these, you can enquire the state of textures and
available resources, prioritise textures and release texture resources
with glDeleteTextures(). glTexGen() can be used for the automatic generation of
texture coordinates allowing sphere and cube mappings. Additionally, a texture matrix exists
which can transform the texture coordinates. This is used with animation to produce moving
(4-dimensional) textures.

Visualisation
Visualisation is the process of generating graphical models from data. Many computer based
technologies generate massive amounts of data, such as satellite and spacecraft scanners,
medical instruments (CT, MRI, ultrasound, etc), simulations of fluid dynamics and weather
forecasting, financial data and information spaces. Automated analysis is of limited use when
trying to make sense of very large datasets, so it makes sense to utilise human visual
capabilities - IA (intelligence amplification), rather than AI. The challenge is therefore to extract a graphical model from
the data.

Visualisation is a somewhat biased term, as we can "visualise" things using our auditory
senses (sonification) and our sense of touch (haptics), so we can say that our ultimate aim is
conceptualisation. With visualisation, we can also have collaborative visualisation and
computational steering, where the computer attempts to apply some analysis to the data to
help the user.

We can break down the visualisation process into two areas, the system and the user. The
system takes data, applies it to a model and then renders the result. The user takes this result
and perceives it, applying structure and eventually understanding it.

Two main types of visualisation are scientific visualisation and information visualisation. If
the data has a natural mapping to R3, then we can use scientific visualisation, otherwise
information visualisation has to be used.

Visualisation Pipeline

Most visualisation systems consist of a pipeline of processes, and some powerful commercial
tools implement these in a visual programming environment, where visualisation modules are
plugged together by data flows. The system can then automate updates of results when the
parameters or models change.

Scientific Data Sets

We can classify scientific data sets on four orthogonal criteria:

1. Dimensionality
2. Attribute data type (scalar, vector, tensor, multivariable, etc)
3. Cell type (vertex, triangle, etc)
4. Organisation of cells (topological and geometrical)

Some examples of visualisation include contour data, which can be shown as a 3-dimensional
mesh, or a 2-dimensional grid with colour representing the third dimension. Voxels can also
be used when representing 3-dimensional data, such as that from an MRI scanner. Here, the
source data consists of parallel planes of intensity information collected from consecutive
cross sections - in CT, for example, a pixel of such a plane represents the X-ray absorption of
the part of the body that the pixel physically corresponds to - so corresponding pixels in two
consecutive planes form the top and bottom of a voxel.

Vector fields and flows can also be represented using a directed line segment, where length is
proportional to magnitude. Care is required for scaling and foreshortening, and the data also
needs to be filtered to reduce clutter. Variations of this exist, such as using glyphs.

Flow visualisation (such as in wind tunnels and ultrasound) can be represented using a
particle track, which traces the path of a particle in the field. Streamlining allows you to
integrate the flow over time.

Optical flow and motion can be represented using pin diagrams (arrows) showing the
direction of the displacement.

On the other hand, data sets for information visualisation are more disparate, and are either of
implicit functions (i.e., ƒ(x, y) = c) or explicit functions (i.e., z = g(x, y)).
CHAPTER 9:
Lexical and Syntax Analysis of Programming
Languages
Languages

For more detail, see TOC.

A language is a potentially infinite set of strings (sometimes called sentences), each of which is a
sequence of symbols from a given alphabet. The tokens of each sentence are ordered
according to some structure.

A grammar is a set of rules that describe a language. Grammars assign structure to a sentence.

An automaton is an algorithm that can recognise (accept) all sentences of a language and
reject those which do not belong to it.

Lexical and syntactic analysis can be simplified to a machine that takes in some program
code, and then returns syntax errors, parse trees and data structures. We can think of the
process as description transformation, where we take some source description, apply a
transformation technique and end up with a target description - this is a mapping
between two equivalent languages, where the destination is machine executable.

Compilers are an important part of computer science, even if you never write a compiler, as
the concepts are used regularly, in interpreters, intelligent editors, source code debugging,
natural language processing (e.g., for AI) and for things such as XML.

Throughout the years, progress has been made in the field of programming to bring higher
and higher levels of abstraction, moving from machine code, to assembler, to high-level
languages and now to object orientation, reusable languages and virtual machines.

Translation Process
LSA only deals with the front-end of the compiler, next year's module CGO deals with the
back-end. The front-end of a compiler only analyses the program, it does not produce code.

From source code, lexical analysis produces tokens, the words in a language, which are then
parsed to produce a syntax tree, which checks that tokens conform with the rules of a
language. Semantic analysis is then performed on the syntax tree to produce an annotated
tree. In addition to this, a literal table, which contains information on the strings and constants
used in the program, and a symbol table, which stores information on the identifiers occurring
in the program (e.g., variable names, constant names, procedure names, etc), are produced by
the various stages of the process. An error handler also exists to catch any errors generated by
any stages of the program (e.g., a syntax error by a poorly formed line). The syntax tree
forms an intermediate representation of the code structure, and has links to the symbol table.

From the annotated tree, intermediate code generation produces intermediate code (e.g., that
suitable for a virtual machine, or pseudo-assembler), and then the final code generation stage
produces target code whilst also referring to the literal and symbol table and having another
error handler. Optimisation is then applied to the target code. The target code could be
assembly and then passed to an assembler, or it could be direct to machine code.

We can consider the front-end as a two stage process, lexical analysis and syntactic analysis.

Lexical Analysis

Lexical analysis is the extraction of individual words or lexemes from an input stream of
symbols and passing corresponding tokens back to the parser.

If we consider a statement in a programming language, we need to be able to recognise the
small syntactic units (tokens) and pass this information to the parser. We need to also store
the various attributes in the symbol or literal tables for later use, e.g., if we have a variable,
the tokeniser would generate the token var and then associate the name of the variable with it
in the symbol table - in this case, the variable name is the lexeme.

Other roles of the lexical analyser include the removal of whitespace and comments and
handling compiler directives (i.e., as a preprocessor).

The tokenisation process takes input and then passes this input through a keyword recogniser,
an identifier recogniser, a numeric constant recogniser and a string constant recogniser, each
being put into their own output based on disambiguating rules.

These rules may include "reserved words", which can not be used as identifiers (common
examples include begin, end, void, etc), thus if a string can be either a keyword or an
identifier, it is taken to be a keyword. Another common rule is that of maximal munch, where
if a string can be interpreted as a single token or a sequence of tokens, the former
interpretation is generally assumed.
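As a small illustration of these rules, a hand-written scanner fragment might look like the
following sketch (the token codes and keyword table are invented for the example; only identifiers,
numbers and keywords are handled, with maximal munch and keywords preferred over identifiers):

/* Sketch of a hand-written scanner: skips whitespace, applies maximal munch
   to identifiers/numbers, and prefers keywords over identifiers.  The token
   codes and keyword table are invented for illustration. */
#include <ctype.h>
#include <string.h>

enum Token { TOK_KEYWORD, TOK_IDENT, TOK_NUMBER, TOK_OTHER, TOK_EOF };

static const char *keywords[] = { "begin", "end", "if", "then", "while", 0 };

enum Token next_token(const char **src, char *lexeme)
{
    const char *p = *src;
    int n = 0;

    while (isspace((unsigned char)*p)) p++;            /* strip whitespace        */
    if (*p == '\0') { *src = p; return TOK_EOF; }

    if (isalpha((unsigned char)*p)) {                  /* maximal munch: take as  */
        while (isalnum((unsigned char)*p))             /* many letters/digits as  */
            lexeme[n++] = *p++;                        /* possible                */
        lexeme[n] = '\0';
        *src = p;
        for (int i = 0; keywords[i]; i++)              /* reserved words win over */
            if (strcmp(lexeme, keywords[i]) == 0)      /* identifiers             */
                return TOK_KEYWORD;
        return TOK_IDENT;
    }

    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) lexeme[n++] = *p++;
        lexeme[n] = '\0';
        *src = p;
        return TOK_NUMBER;
    }

    lexeme[0] = *p; lexeme[1] = '\0';                  /* operators, punctuation  */
    *src = p + 1;
    return TOK_OTHER;
}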
For revision of automata, see MCS or lecture notes.

The lexical analysis process starts with a definition of what it means to be a token in the
language with regular expressions or grammars, then this is translated to an abstract
computational model for recognising tokens (a non-deterministic finite state automaton),
which is then translated to an implementable model for recognising the defined tokens (a
deterministic finite state automaton) to which optimisations can be made (a minimised DFA).

Syntactic Analysis

Syntactic analysis, or parsing, is needed to determine if the series of tokens given is
appropriate in a language - that is, whether or not the sentence has the right shape/form.
However, not all syntactically valid sentences are meaningful; further semantic analysis has
to be applied for this. For syntactic analysis, context-free grammars and the associated
parsing techniques are powerful enough to be used.

In syntactic analysis, parse trees are used to show the structure of the sentence, but they often
contain redundant information due to implicit definitions (e.g., an assignment always has an
assignment operator in it, so we can imply that), so syntax trees, which are more compact
representations, are used instead. Trees are recursive structures, which complement CFGs
nicely, as these are also recursive (unlike regular expressions).

There are many techniques for parsing algorithms (vs FSA-centred lexical analysis), and the
two main classes of algorithm are top-down and bottom-up parsing.

For information about context-free grammars, see MCS.

Context-free grammars can be represented using Backus-Naur Form (BNF). BNF uses three
classes of symbols: non-terminal symbols (phrases) enclosed by angle brackets <>, terminal
symbols (tokens) that stand for themselves, and the metasymbol ::=, read as "is defined to be".
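For example, a simple expression grammar (used purely for illustration, and referred to again in
the parsing sections below) might be written in BNF as:

<exp>    ::= <exp> + <term> | <term>
<term>   ::= <term> * <factor> | <factor>
<factor> ::= ( <exp> ) | number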

As a single sentence may have many derivations that differ only in the order in which rules are
applied, a more abstract structure is needed. Parse trees generalise
derivations and provide structural information needed by the later stages of compilation.

Parse Trees

A parse tree over a grammar G is a labelled tree with a root node labelled with the start symbol
(S), and internal nodes labelled with non-terminals. Leaf nodes are labelled with
terminals or ε. If an internal node is labelled with a non-terminal A, and has n children with
labels X1, ..., Xn (terminals or non-terminals), then we can say that there is a grammar rule of
the form A → X1...Xn. Parse trees can also carry optional node numbers showing the order in which
the nodes were added; a numbering that always expands the leftmost non-terminal first corresponds
to a leftmost derivation.

Traversing the tree can be done by three different forms of traversal. In preorder traversal,
you visit the root and then do a preorder traversal of each of the children, in in-order
traversal, an in-order traversal is done of the left sub-tree, the root is visited, and then an in-
order traversal is done of the remaining subtrees. Finally, with postorder traversal, a postorder
traversal is done of each of the children and the root visited.

Syntax Trees

Parse trees are often converted into a simplified form known as a syntax tree that eliminates
wasteful information from the parse tree.

At this stage, treatment of errors is more difficult than in the scanner (tokeniser), as the
scanner may pass problems to the parser (an error token). Error recovery typically isolates the
error and continues parsing, and repair can be possible in simple cases. Generating
meaningful error messages is important, however this can be difficult as the actual error may
be far behind the current input token.
Grammars are ambiguous if there exists a sentence with two different parse trees - this is
particularly common in arithmetic expressions - does 3 + 4 × 5 = 23 (the natural
interpretation, that is 3 + (4 × 5)) or 35 ((3 + 4) × 5). Ambiguity can be inessential, where the
parse trees both give the same syntax tree (typical for associative operations), but still a
disambiguation rule is required. However, there is no computable function to remove
ambiguity from a grammar - it has to be done by hand, and the ambiguity problem is
undecidable. Solutions to this include changing the grammar of the language to remove the
ambiguity, or introducing disambiguation rules.

A common disambiguation rule is precedence, where a precedence cascade is used to group
operators into different precedence levels.

BNF does have limitations, as there is no notation for repetition and optionality, nor is there
any nesting of alternatives - auxiliary non-terminals are used instead, or a rule is expanded.
Each rule also has no explicit terminator.

Extended BNF (EBNF) was developed to work around these restrictions. In EBNF, terminals
are in double quotes "", and non-terminals are written without <>. Rules have the
form non-terminal = sequence or non-terminal = sequence | ... |
sequence, and a period . is used as the terminator for each rule. {p} is used for 0 or more
occurrences of p, [p] stands for 0 or 1 occurrences of p and (p|q|r) stands for exactly one
of p, q, or r. However, the notation for repetition does not fully specify the parse tree (left or
right associativity?).

Syntax diagrams can be used for graphical representations of EBNF rules. Non-terminals are
represented in a square/rectangular box, and terminals in round/oval boxes. Sequences and
choices are represented by arrows, and the whole diagram is labelled with the left-hand side
non-terminal. e.g., for factor → (exp) | "number" .

Semantic Analysis

Semantic analysis is needed to check aspects that are not related to the syntactic form, or that
are not easily determined during parsing, e.g., type correctness of expressions and declaration
prior to use.

Code Generation and Optimisation

Code generation and optimisation exists to generally transform the annotated syntax tree into
some form of intermediate code, typically three-address code or code for a virtual machine
(e.g., p-code). Some level of optimisation can be applied here. This intermediate code can
then be transformed into instructions for the target machine and optimised further.
Optimisation is really code improvement, e.g., constant folding (e.g., replacing x =
4*4 with x = 16), and also at the target level (e.g., multiplications by powers of 2 replaced
by shift instructions).

T-Diagrams

T-diagrams can be used to represent compilers, e.g.,

This shows an Ada compiler, written in C that outputs code for the Intel architecture. Cross
compilers can also be represented, e.g.,

This shows an Ada compiler for PowerPC systems that is written for the Intel architecture.

T-diagrams can be combined in multiple ways, such as to change the target language:

Or to change the host language, e.g., to compile a compiler written in C or to generate a
cross-compiler:
Bootstrapping

The idea behind bootstrapping is to write a compiler for a language in a possibly older
version or a restricted subset of that same language. A "quick and dirty" compiler for that
restricted subset or older version, running on some machine, is then written.

This new version of the compiler incorporates the latest features of the language and
generates highly efficient code, and all changes to the language are represented in the new
compiler, so this compiler is used to generate the "real" compiler for target machines.

To bootstrap, we use the quick and dirty compiler to compile the new compiler, and then
recompile the new compiler with itself to generate an efficient version.

With bootstrapping, improvements to the source code can be bootstrapped by applying the 2-
step process. Porting the compiler to a new host computer only requires changes to the
backend of the compiler, so the new compiler becomes the "quick and dirty" compiler for
new versions of the language.

If we use the main compiler for porting, this provides a mechanism to quickly generate new
compilers for all target machines, and minimises the need to maintain/debug multiple
compilers.

Top-Down Parsing
Top down parsing can be broken down into two classes: backtracking parsers, which try to
apply a grammar rule and backtrack if it fails; and predictive parsers, which try to predict the
next nonterminal in the input using one or more tokens lookahead. Backtracking parsers can
handle a larger class of grammars, but predictive parsers are faster. We look at predictive
parsers in this module.

In this module, we look at two classes of predictive parsers: recursive-descent parsers, which
are quite versatile and appropriate for a hand-written parser, and were the first type of parser
to be developed; and, LL(1) parsing - left-to-right, leftmost derivation, 1 symbol lookahead, a
type of parser which is no longer in use. It does demonstrate parsing using a stack, and helps
to formalise the problems with recursive descent. There is a more general class of LL(k)
parsers.

Recursive Descent

The basic idea here is that each nonterminal is recognised by a procedure. Each choice within a
grammar rule corresponds to a case or if statement. Strings of terminals and nonterminals within a choice
match the input and calls to other procedures. Recursive descent has a one-token lookahead,
from which the choice of appropriate matching procedure is made.

The code for recursive descent follows the EBNF form of the grammar rule, however the
procedures must not use left recursion, which will cause the choice to call a matching
procedure in an infinite loop.
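As a sketch, a recursive-descent parser for the EBNF rules exp → term { "+" term },
term → factor { "*" factor } and factor → "(" exp ")" | number (the expression grammar above,
with left recursion already converted to repetition) might look like this; the token variable and
the getToken(), error() and numberValue() helpers are assumed, and a number token is represented
here by the character 'n' purely for illustration:

/* Sketch of a recursive-descent parser/evaluator for
   exp -> term { "+" term },  term -> factor { "*" factor },
   factor -> "(" exp ")" | number.
   token, getToken(), error() and numberValue() are assumed helpers. */
extern int  token;                 /* current lookahead token                 */
extern void getToken(void);        /* advances to the next token              */
extern void error(const char *);   /* reports a syntax error                  */
extern int  numberValue(void);     /* value of the current number token       */

static int exp_(void);             /* forward declaration (mutual recursion)  */

static void match(int expected)
{
    if (token == expected) getToken(); else error("unexpected token");
}

static int factor(void)
{
    int value;
    if (token == '(')      { match('('); value = exp_(); match(')'); }
    else if (token == 'n') { value = numberValue(); match('n'); }   /* number */
    else                   { error("expected ( or number"); value = 0; }
    return value;
}

static int term(void)
{
    int value = factor();
    while (token == '*') { match('*'); value *= factor(); }   /* { "*" factor } */
    return value;
}

static int exp_(void)
{
    int value = term();
    while (token == '+') { match('+'); value += term(); }     /* { "+" term } */
    return value;
}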

There are problems with recursive descent, such as converting BNF rules to EBNF,
ambiguity in choice, and empty productions (we must look further ahead at the tokens that can appear
after the current non-terminal). We could also consider adding some form of early error
detection, so time is not wasted deriving non-terminals if the lookahead token is not a legal
symbol.

Therefore, we can improve recursive descent by defining a set First(α) as the set of tokens
that can legally begin α, and a set Follow(A) as the set of tokens that can legally come after
the nonterminal A. We can then use First(α) and First(β) to decide between A → α and A →
β, and then use Follow(A) to decide whether or not A → ε can be applied.

LL(1) Parsing

LL(1) parsing is top-down parsing using a stack as the memory. At the beginning, the start
symbol is put onto the stack, and then two basic actions are available: Generate, which
replaces a nonterminal A at the top of the stack by string α using the grammar rule A → α;
and Match, which matches a token on the top of the stack with the next input token (and, in
case of success, pop both). The appropriate action is selected using a parsing table.

With the LL(1) parsing table, when a token is at the top of the stack, there is either a
successful match or an error (no ambiguity). When there is a nonterminal A at the top, a
lookahead is used to choose a production to replace A.
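As a small worked example (using the illustrative grammar S → ( S ) S | ε, whose table has
M[S,(] = S → ( S ) S and M[S,)] = M[S,$] = S → ε), parsing the input ( ) proceeds as follows,
with the top of the stack written on the right:

Stack (top at right)   Input    Action
$ S                    ( ) $    generate S → ( S ) S
$ S ) S (              ( ) $    match (
$ S ) S                  ) $    generate S → ε
$ S )                    ) $    match )
$ S                        $    generate S → ε
$                          $    accept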

The LL(1) parsing table is created by starting with an empty table, and then for all table
entries, the production choice A → α is added to the table entry M[A,a] if there is a
derivation α ├* aβ where a is a token, or there are derivations α ├* ε and S$ ├* βAaγ, where
S is the start symbol and a is a token (including $). Any entries that are now empty represent
errors.

A grammar is an LL(1) grammar if the associated LL(1) parsing table has at most one
production in each table entry. An LL(1) grammar is never ambiguous, so if a grammar is
ambiguous, disambiguating rules can be used in simple cases. A consequence of this is that,
for rules of the form A → α | β, α and β cannot derive strings beginning with the same
token a, and at most one of α or β can derive ε.
With LL(1) parsing, the list of generated actions matches the steps in a leftmost derivation.
Tree nodes can be constructed as terminals and nonterminals are pushed onto the stack; if the
parser is to be used as more than just a recogniser, the stack should contain pointers to tree
nodes so that LL(1) parsing can generate a syntax tree.

In LL(1) parsing, left recursion and choice are still problematic in similar ways to that of
recursive descent parsing. EBNF will not help, so it's easier to keep the grammar as BNF.
Instead, two techniques can be used: left recursion removal and left factoring. This is not a
foolproof solution, but it usually helps and it can be automated.

In the case of a simple immediate left recursion of the form A → Aα | β, where α and β are
strings and β does not begin with A, then we can rewrite this rule into two grammar rules: A
→ βA′, A′ → αA′ | ε. The first rule generates β and then generates {α} using right recursion.
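For example, the left-recursive rule exp → exp + term | term from the illustrative expression
grammar is rewritten as exp → term exp′ and exp′ → + term exp′ | ε, which generates the same
strings but produces the repetition by right recursion.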

For indirect left recursion of the form A → Bβ1 | ..., B → Aα1, then a cycle (a derivation of
the form A ├ α ├* A) can occur. The algorithm can only work if a grammar has no cycles or
ε-productions.

Left recursion removal does not change the language, but it does change the grammar and
parse trees, so we now have the new problem of keeping the correct associativity in the
syntax tree. If you use the parse tree directly to get the syntax tree, then values need to be
passed from parent to child.

If two or more grammar choices share a common prefix string (e.g., A → αβ | αγ), then left
factoring can be used. We can factor out the longest possible α and split it into two rules: A
→ αA′, A′ → β | γ.

To construct a syntax tree in LL(1) parsing, it takes an extra stack to manipulate the syntax
tree nodes. Additional symbols are needed in the grammar rules to provide synchronisation
between the two stacks, and the usual way of doing this is using the hash (#). In the example
of a grammar representing arithmetic operations, the # tells the parser that the second
argument has just been seen, so it should be added to the first one.

First and Follow Sets

To construct a First set, there are rules we can follow:

• If x is a terminal, then First(x) = {x} (and First(ε) = {ε})


• For a nonterminal A with the production rules A → β1 | ... | βn, First(A) = First(β1) ∪ ... ∪
First(βn)
• For a rule of the form A → X1X2...Xn, First(A) = (First(X1) ∪ ... ∪ First(Xk)) - {ε}, where ε ∈
First(Xj) for j = 1..(k - 1) and ε ∉ First(Xk), i.e., where Xk is the first non-nullable symbol.

Therefore, the First(w) set consists of the set of terminals that begin strings derived
from w (and also contains ε if w can generate ε).

To construct Follow sets, we can say that if A is a non-terminal, then Follow(A) is the set of
all terminals that can follow A in any sentential form (derived from S).

• If A is the start symbol, then the end marker $ ∈ Follow(A)


• For a production rule of the form B → αAβ, everything in First(β) is placed in Follow(A)
except ε.
• If there is a production B → αA, or a production B → αAβ where First(β) contains ε, then
Follow(A) contains Follow(B).

A final issue to consider is how to deal with errors in the parsing stage. In recursive descent
parsers, a panic mode exists where each procedure declares a set of synchronising tokens, and
when confused, input tokens are skipped (scan ahead) until one of the synchronising sets of
tokens are seen and you can continue parsing. In LL(1) parsers, sets of synchronising tokens
are kept in an additional stack or built directly into the parsing table.

Bottom-Up Parsing
Top-down parsing works by tracing out the leftmost derivations, whereas bottom-up parsing
works by doing a reverse rightmost derivation.

A right sentential form is a sentential form that occurs in a rightmost derivation of a string, and
a parse of a string eats tokens left to right to reverse the rightmost derivation. A handle is a
string defined by the expansion of a non-terminal in a right sentential form - the substring that
should be reduced next.

Bottom-up parsing works by starting with an empty stack and having two operations: shift,
putting the next input token onto the stack; and reduce, replacing the right-hand side of a
grammar rule with its left-hand side. We augment the start symbol S with a rule Z → S (some
books refer to this as S′ → S), making Z our new start symbol, so when only Z is on the stack
and the input is empty, we accept. The parse stack is maintained where tokens are shifted
onto it until we have a handle on top of the stack, whereupon we shall reduce it by reversing
the expansion.

In this process, we can observe that when a rightmost non-terminal is expanded, the terminal
to the right of it is in its Follow set - we need this information later on, and is also important
to note that the Follow set can include $.

At any moment, a right sentential form is split between the stack and the input; each reduce
action produces the next right sentential form. Given A → α, α is reduced to A if the stack
content is a viable prefix of the right sentential form. Appropriate handles appear at the base of
the expansion triangles when the right sentential form is drawn together with its derivation.

LR(0)

An LR(0) item is a way of monitoring progress towards a handle. It is represented by a
production rule together with a dot. The right-hand side has part behind the dot and part in
front. This says that the parse has matched a substring derivable by the component to the left
of the dot and is now prepared to match something on the input stream defined by the
component to the right of the dot.
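For example, for an illustrative rule such as A → (A), the possible items are A → .(A),
A → (.A), A → (A.) and A → (A). (with the dot at the far right), the last of these being the
completed item.
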
A production rule can therefore give rise to several configuration items, with the dots in
varying places. When the dot is to the extreme left, we call this an initial item, and when the
dot is to the extreme right, we call this a completed item, and these are typically underlined.

To represent the state and the progress of the parse, we can use a finite state automaton.
Typically, we start by constructing an NDFA where each state of the NDFA contains an LR(0)
item and transitions occur based on terminals and non-terminals. We then apply subset
construction to obtain a DFA.

To traverse our newly constructed DFA, the stack now contains both symbols and DFA states
in pairs. We can then use the algorithm described on slide 15 of lecture set 6. As LR(0) has
no look-ahead, we can encounter problems with ambiguous grammars. If a state contains a
complete item A → α. and another item B → β.Xγ (an item with a terminal X after the dot),
then there is a shift-reduce conflict. If a state contains a complete item A → α. and another
complete item B → β., then there is a reduce-reduce conflict.

We say that a grammar is LR(0) if none of the above conflicts arise, that is, if a state contains
a complete item, then it can contain no other item.

Another way of saying that a grammar is LR(0) is if its LR(0) parser is unambiguous. The
LR(0) parsing table is the DFA and the LR(0) parsing actions combined.

LR(0) parsing belongs to a more general class of parsers, called LR parsers. NDFAs are built
to describe parsing in progressively increasing levels of detail (similar to recursive descent
parsing). The DFA, built from subset construction, builds the parse tree in a bottom-up
fashion and waits for a complete right-hand side with several right-hand sides sharing a prefix
considered in parallel. A stack is used to store partial right-hand sides (the handles) and
remember visited states. The right-hand side is traced following arrows (going from left-hand
side to right-hand side), and then removed from the stack (going against the arrows). The
uncovered state and left-hand side non-terminal then define the new state.
LR parsers can also take advantage of look-ahead symbols, similarly to top-down parsing,
where the left-hand side is expanded into the right-hand side based on a single look-ahead
symbol (if any). The main LR parsing methods are:

• LR(0) - discussed above, uses no look-ahead


• SLR(1) - the left-hand side is only replaced with the right-hand side if the look-ahead symbol
is in the Follow set of the left-hand side
• LR(1) - this uses a subset of the Follow set of the left-hand side that takes into account
context (the tree to the above and left of the left-hand side)
• LALR(1) - this reduces the number of states compared to LR(1), and (if lucky), uses a proper
subset of the Follow set of the left-hand side

SLR(1)

SLR(1) parsing works on the principle of look-ahead before shift and see what follows before
reduce. The algorithm is described on slide 18 of lecture set 6.

We say that a grammar is SLR(1) if and only if:

• for any item A → α.Xβ where X is a terminal, there is no complete item B → γ. in the same
state for which X ∈ Follow(B) - this avoids shift-reduce conflicts
• for any two complete items A → α. and B → β. in the same state, Follow(A) ∩ Follow(B) = ∅ -
this avoids reduce-reduce conflicts

We can accomplish this by using some disambiguating rules: preferring shift over reduce
solves the shift-reduce conflict (and, as a direct implication, this implements the most closely
nested rule). For reduce-reduce conflicts, the rule with the longer right-hand side is preferred.

LR(1)

Canonical LR(1) parsing is done using a DFA based on LR(1) items. This is a powerful
general parser, but is very complex, with up to 10 times more states than the LR(0) parser.

We can construct an LR(1) parser in the same way as an LR(0) parser, using the LR(1) items as
nodes. However, the transitions are slightly different. For any X (terminal or non-terminal),
the lookahead does not change.
We also have ε-transitions from an item [A → α.Bγ, a] to [B → .β, bi] for every production B → β
and for every token bi ∈ First(γa), where B is a nonterminal; this changes the look-ahead
symbol, unless γ ├* ε or γ = ε.

The LR(1) parsing table is the same format as for SLR(1) parsing, and the parsing algorithm
is as described on slide 27 of lecture set 6.

We say that a grammar is LR(1) if and only if:

• for any item [A → α.Xβ, a], where X is a terminal, there is no complete item [B → γ., X] in the
same state - this removes shift-reduce conflicts
• there are no two complete items [A → α., a] and [B → β., a] in the same state - this removes
reduce-reduce conflicts

LALR(1)

The class of languages that LALR(1) can parse is between that of LR(1) and SLR(1), and in
reality is enough to implement most of the languages available today. It uses a DFA based on
LR(0) items and sets of look-aheads which are often proper subsets of the Follow sets
allowed by the SLR(1) parser. The LALR(1) parsing tables are often more compact than the
LR(1) ones.

We define the core of a state in an LR(1) DFA as the set of all LR(1) items reduced to LR(0)
by dropping the lookahead.

As the LR(1) DFA is too complex, we can reduce the size by merging states with the same
core and then extend each LR(0) item with a list of lookahead tokens coming from the LR(1)
items of each of the merged states. We can observe that if s1 and s2 have the same core, and
there is a transition s1 → t1 on symbol X, then there is a transition s2 → t2 on X,
and t1 and t2 have the same core.

This gives us a LALR(1) parser, which is still better than SLR(1).


The LALR(1) algorithm is the same as the LR(1) algorithm, and an LALR(1) grammar leads
to unambiguous LALR(1) parsing. If the grammar is LR(1), then the LALR(1) parser
cannot contain any shift-reduce conflicts. The LALR(1) parser could be less efficient than
the LR(1) parser, however, as it may make additional reductions before detecting an error.

Another technique to construct the LALR(1) DFA using the LR(0) DFA instead of the LR(1)
DFA is known as propagating lookaheads.

Error Recovery

Similarly to LL(1) parsers, there are three possible actions for error recovery:

• pop a state from the stack


• pop tokens from the input until a good one is seen (for which the parse can continue)
• push a new state onto the stack

A good method is to pop states from the stack until a nonempty Goto entry is found; then, if
there is a Goto state that matches the input, go - if there is a choice, prefer shift to reduce, and
out of many reductions, choose the most specific one. Otherwise, scan the input until at least
one of the Goto states accepts the token, or the input is empty.

This could lead to infinite loops, however. To avoid loops, we can only allow shift actions
from a Goto state, or, if a reduce action Rx is applied, set a flag and store the following
sequence of states produced purely by reduction. If a loop occurs (that is, Rx is produced
again), states are popped from the stack until the first occurrence of Rx is removed, and a shift
action immediately resets the flag.

The method Yacc uses for error recovery is NTi → error. The Goto entries of NTi will then be
used for error recovery. In case of an error, states are popped from the stack until one of
NTi is seen. error is then added to the input as its first token, and you continue going as if
nothing has happened (you can alternatively choose to discard the original lookahead
symbol). All erroneous input tokens are discarded without producing error messages until a
sequence of 3 tokens is shifted legally onto the stack.
