THE SCIENCE
OF PROGRAMMING
David Gries
Springer-Verlag
New York Heidelberg Berlin
David Gries
Department of Computer Science
Cornell University
Upson Hall
Ithaca, NY 14853
U.S.A.
9 8 7 6 5 4 3 2 1
ISBN 0-387-90641-X Springer-Verlag New York Heidelberg Berlin
ISBN 3-540-90641-X Springer-Verlag Berlin Heidelberg New York
Foreword
not only helpful but even indispensable. Choice and order of examples
are as important as the good taste with which the formalism is applied.
To get the message across requires a scientist that combines his scientific
involvement in the subject with the precious gifts of a devoted teacher.
We should consider ourselves fortunate that Professor David Gries has
met the challenge.
Edsger W. Dijkstra
Preface
It is in this context that the title of this book was chosen. Programming
began as an art, and even today most people learn only by watching oth-
ers perform (e.g. a lecturer, a friend) and through habit, with little direc-
tion as to the principles involved. In the past 10 years, however, research
has uncovered some useful theory and principles, and we are reaching the
point where we can begin to teach the principles so that they can be cons-
ciously applied. This text is an attempt to convey my understanding of
and excitement for this just-emerging science of programming.
The approach does require some mathematical maturity and the will to
try something new. A programmer with two years' experience, or a junior
or senior computer science major in college, can master the material - at
least, this is the level I have aimed at.
A common criticism of the approach used in this book is that it has
been used only for small (one or two pages of program text), albeit complex,
problems. While this may be true so far, it is not an argument for
ignoring the approach. In my opinion it is the best approach to reasoning
about programs, and I believe the next ten years will see it extended to
and practiced on large programs. Moreover, since every large program
consists of many small programs, it is safe to say the following:
VIII Preface
Part III is the heart of the book. Within it, in order to get the reader
more actively involved, I have tried the following technique. At a point, a
question will be raised, which the reader is expected to answer. The ques-
tion is followed by white space, a horizontal line, and more white space.
After answering the question, the reader can then continue and discover
my answer. Such active involvement will be more difficult than simply
reading the text, but it will be far more beneficial.
Chapter 21 is fun. It concerns inverting programs, something that Eds-
ger W. Dijkstra and his colleague Wim Feijen dreamed up. Whether it is
really useful has not been decided, but it is fun. Chapter 22 presents a
few simple rules on documenting programs; the material can be read be-
fore the rest of the book. Chapter 23 contains a brief, personal history of
this science of programming and an anecdotal history of the programming
problems in the book.
Answers to some exercises are included - not all answers are given, so
the exercises can be used as homework. A complete set of answers can be
obtained at nominal cost by requesting it, on appropriate letterhead.
Notation. The notation iff is used for "if and only if". A few years ago,
while lecturing in Denmark, I used fif instead, reasoning that since "if and
only if" was a symmetric concept its notation should be symmetric also.
Without knowing it, I had punned in Danish and the audience laughed,
for fif in Danish means "a little trick". I resolved thereafter to use fif so I
could tell my joke, but my colleagues talked me out of it.
The symbol □ is used to mark the end of theorems, definitions,
examples, and so forth. When beginning to produce this book on the
phototypesetter, it was discovered that the mathematical quantifiers
"forall" and "exists" could not be built easily, so A and E have been used
for them.
Throughout the book, in the few places they occur, the words he, him
and his denote a person of either sex.
Acknowledgements
Those familiar with Edsger W. Dijkstra's monograph A Discipline of
Programming will find his influence throughout this book. The calculus
for the derivation of programs, the style of developing programs, and
many of the examples are his. In addition, his criticisms of drafts of this
book have been invaluable.
Just as important to me has been the work of Tony Hoare. His paper
on an axiomatic basis for programming was the start of a new era, not
only in its technical contribution but in its taste and style, and his work
since then has continued to influence me. Tony's excellent, detailed criti-
cisms of a draft of Part I caused me to reorganize and rewrite major parts
of it.
I am grateful to Fred Schneider, who read the first drafts of all chap-
ters and gave technical and stylistic suggestions on almost every para-
graph.
A number of people have given me substantial constructive criticisms
on all or parts of the manuscript. For their help I would like to thank
Greg Andrews, Michael Gordon, Eric Hehner, Gary Levin, Doug McIlroy,
Bob Melville, Jay Misra, Hal Perkins, John Williams, Michael
Woodger and David Wright.
My appreciation goes also to the Cornell Computer Science Commun-
ity. The students of course CS600 have been my guinea pigs for the past
five years, and the faculty and students have tolerated my preachings
about programming in a very amiable way. Cornell has been an excellent
place to perform my research.
This book was typed and edited by myself, using the departmental
PDP11/60-VAX system running under UNIX* and a screen editor written
for the Terak. (The files for the book contain 844,592 characters.) The
final copy was produced using troff and a Comp Edit phototypesetter at
the Graphics Lab at Cornell. Doug McIlroy introduced me to many of
the intricacies of troff. Alan Demers, Dean Krafft and Mike Hammond
provided much help with the PDP11/60-VAX system; and Alan Demers,
Barbara Gingras and Sandor Halasz spent many hours helping me con-
nect the output of troff to the phototypesetter. To them I am grateful.
The National Science Foundation has given me continual support for
my research, which led to this book.
Finally, I thank my wife, Elaine, and children, Paul and Susan, for
their love and patience over the past one and one half years.
*UNIX is a trademark of Bell Laboratories.
Table of Contents
A story
We have just finished writing a large program (3000 lines). Among
other things, the program computes as intermediate results the quotient q
and remainder r arising from dividing a non-negative integer x by a posi-
tive integer y. For example, with x = 7 and y = 2, the program calculates
q = 3 (since 7÷2 = 3) and r = 1 (since the remainder when 7 is divided by
2 is 1).
Our program appears below, with dots " ... " representing the parts of
the program that precede and follow the remainder-quotient calculation.
The calculation is performed as given because the program will sometimes
be executed on a micro-computer that has no integer division, and porta-
bility must be maintained at all costs! The remainder-quotient calculation
actually seems quite simple; since + cannot be used, we have elected to
repeatedly subtract divisor y from a copy of x, keeping track of how
many subtractions are made, until another subtraction would yield a nega-
tive integer.
r := x; q := 0;
while r > y do
begin r := r-y; q := q+1 end;
x = y*q + r,
{y > 0}
r := x; q := 0;
(1) while r > y do
begin r := r-y; q := q+1 end;
{x = y*q + r}
Testing now results in far less output, and we make progress. Assertion
checking detects an error during a test run because y is 0 just before a
remainder-quotient calculation, and it takes only four hours to find the
error in the calculation of y and fix it.
Part 0. Why Use Logic? Why Prove Programs Correct? 3
But then we spend a day tracking down an error for which we received
no nice false-assertion message. We finally determine that the remainder-
quotient calculation resulted in
x = 6, y = 3, q = 1, r = 3.
Sure enough, both assertions in (1) are true with these values; the problem
is that the remainder should be less than the divisor, and it isn't. We
determine that the loop condition should be r ≥ y instead of r > y. If
only the result assertion were strong enough - if only we had used the
assertion x = y*q + r and r < y - we would have saved a day of work!
Why didn't we think of it?
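The failing scenario just described is easy to reproduce. The following is a sketch in Python (a hypothetical transcription of the program segment, not the book's notation), run on the troublesome state x = 6, y = 3: the weak assertion passes even though the remainder is wrong, while the stronger one would have caught the error.

```python
# Transcription of the buggy segment (loop condition r > y instead of
# r >= y), executed on the state x = 6, y = 3 from the story.
x, y = 6, 3
r, q = x, 0
while r > y:                  # the erroneous condition
    r, q = r - y, q + 1

assert x == y * q + r         # the weak result assertion: it passes!
print(q, r)                   # 1 3 - yet the remainder is not < divisor
assert not (x == y * q + r and r < y)   # the stronger assertion would fail
```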
We fix the error and insert the stronger assertion:
{y > 0}
r := x; q := 0;
while r ≥ y do
begin r := r-y; q := q+1 end;
{x = y*q + r and r < y}
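The corrected segment, with its assertions turned into runtime checks, can be sketched as follows (Python; the function name is mine, for illustration only):

```python
def quotient_remainder(x, y):
    # Repeated subtraction, mirroring the corrected segment above.
    assert y > 0                       # initial assertion {y > 0}
    r, q = x, 0
    while r >= y:                      # corrected loop condition
        r, q = r - y, q + 1
    assert x == y * q + r and r < y    # strengthened final assertion
    return q, r

print(quotient_remainder(7, 2))        # (3, 1)
print(quotient_remainder(6, 3))        # (2, 0) - the case that exposed the bug
```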
Things go fine for a while, but one day we get incomprehensible output.
It turns out that the quotient-remainder algorithm resulted in a negative
remainder r = -2. But the remainder shouldn't be negative! And we find
out that r was negative because initially x was -2. Ahhh, another error
in calculating the input to the quotient-remainder algorithm -x isn't sup-
posed to be negative! But we could have caught the error earlier and
saved two days of searching; in fact, we should have caught it earlier; all we
had to do was make the initial and final assertions for the program seg-
ment strong enough. Once more we fix an error and strengthen an asser-
tion:
the initial assertion (0 ≤ x and 0 < y) and the final assertion (x = y*q + r
and 0 ≤ r < y) before writing the program segment, for they form the
definition of quotient and remainder.
But what about the error we made in the condition of the while loop?
Could we have prevented that from the beginning? Is there a way to
prove, just from the program and assertions, that the assertions are true
when flow of control reaches them? Let's see what we can do.
Just before the loop it seems that part of our result,
(2) x = y*q +r
holds, since x = r and q = 0. And from the assignments in the loop body
we conclude that if (2) is true before execution of the loop body then it is
true after its execution, so it will be true just before and after every itera-
tion of the loop. Let's insert it as an assertion in the obvious places, and
let's also make all assertions as strong as possible:
Now, how can we easily determine a correct loop condition, or, given the
condition, how can we prove it is correct? When the loop terminates the
condition is false. Upon termination we want r < y, so that the comple-
ment, r ≥ y, must be the correct loop condition. How easy that was!
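That reasoning can be spot-checked mechanically: relation (2) holds after initialization, is preserved by every execution of the loop body, and together with the negated loop condition gives the desired result. A brute-force check over small states (a sketch, not a proof):

```python
# Check that x = y*q + r holds after initialization, is preserved by the
# loop body r, q := r-y, q+1, and that on termination r < y also holds.
for x in range(0, 30):
    for y in range(1, 6):
        q, r = 0, x
        assert x == y * q + r             # holds initially: x = y*0 + x
        while r >= y:                     # the condition derived above
            r, q = r - y, q + 1
            assert x == y * q + r         # preserved by each iteration
        assert x == y * q + r and r < y   # the desired result assertion
print("relation (2) maintained in all tested states")
```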
It seems that if we knew how to make all assertions as strong as possi-
ble and if we learned how to reason carefully about assertions and pro-
grams, then we wouldn't make so many mistakes, we would know our
program was correct, and we wouldn't need to debug programs at all!
Hence, the days spent running test cases, looking through output and
searching for errors could be spent in other ways.
Discussion
The story suggests that assertions, or simply Boolean expressions, are
really needed in programming. But it is not enough to know how to write
Boolean expressions; one needs to know how to reason with them: to sim-
plify them, to prove that one follows from another, to prove that one is
not true in some state, and so forth. And, later on, we will see that it is
necessary to use a kind of assertion that is not part of the usual Boolean
expression language of Pascal, PL/I or FORTRAN, the "quantified"
assertion.
Knowing how to reason about assertions is one thing; knowing how to
reason about programs is another. In the past 10 years, computer science
has come a long way in the study of proving programs correct. We are
reaching the point where the subject can be taught to undergraduates, or
to anyone with some training in programming and the will to become
more proficient. More importantly, the study of program correctness
proofs has led to the discovery and elucidation of methods for developing
programs. Basically, one attempts to develop a program and its proof
hand-in-hand, with the proof ideas leading the way! If the methods are
practiced with care, they can lead to programs that are free of errors, that
take much less time to develop and debug, and that are much more easily
understood (by those who have studied the subject).
Above, I mentioned that programs could be free of errors and, in a
way, I implied that debugging would be unnecessary. This point needs
some clarification. Even though we can become more proficient in pro-
gramming, we will still make errors, even if only of a syntactic nature
(typos). We are only human. Hence, some testing will always be neces-
sary. But it should not be called debugging, for the word debugging
implies the existence of bugs, which are terribly difficult to eliminate. No
matter how many flies we swat, there will always be more. A disciplined
method of programming should give more confidence than that! We
should run test cases not to look for bugs, but to increase our confidence
in a program we are quite sure is correct; finding an error should be the
exception rather than the rule.
With this motivation, let us turn to our first subject, the study of logic.
Part I
Propositions
and Predicates
3. If b is a proposition, then so is (¬b).
4. If b and c are propositions, then so are (b ∧ c), (b ∨ c),
(b ⇒ c), and (b = c).
As seen in the above syntax, five operators are defined over values of
type Boolean:
negation:     (not b), or (¬b)
conjunction:  (b and c), or (b ∧ c)
disjunction:  (b or c), or (b ∨ c)
implication:  (b imp c), or (b ⇒ c)
equality:     (b equals c), or (b = c)
Two different notations have been given for each operator, a name and a
mathematical symbol. The name indicates how to pronounce it, and its
use also makes typing easier when a typewriter does not have the
corresponding mathematical symbol.
The following terminology is used. (b ∧ c) is called a conjunction; its
operands b and c are called conjuncts. (b ∨ c) is called a disjunction; its
operands b and c are called disjuncts. (b ⇒ c) is called an implication;
its antecedent is b and its consequent is c.
(1.2.4) Case 3. The value of a constant proposition with more than one
operator is found by repeatedly applying (1.2.2) to a subproposition
of the constant proposition and replacing the subproposition
by its value, until the proposition is reduced to T or F.
We give an example of evaluation of a proposition:
((T ∧ T) ⇒ F)
=  (T ⇒ F)
=  F
Remark: The description of the operations in terms of a truth table, which
lists all possible operand combinations and their values, can be given only
because the set of possible values is finite. For example, no such table
could be given for operations on integers. □
false" "true". But note that operation or denotes "inclusive or" and not
"exclusive or". That is, (T ∨ T) is T, while the "exclusive or" of T and T
is false.
Also, there is no causality implied by operation imp. The sentence "If
it rains, the picnic is cancelled" can be written in propositional form as
(rain ⇒ no picnic). From the English sentence we infer that the lack of
rain means there will be a picnic, but no such inference can be made from
the proposition (rain ⇒ no picnic).
Example. Let state s be the function defined by the set {(a, T), (bc, F),
(yl, T)}. Then s(a) denotes the value determined by applying state (function)
s to identifier a: s(a) = T. Similarly, s(bc) = F and s(yl) = T. □
(1.3.2) Definition. Proposition e is well-defined in state s if each identifier
in e is associated with either T or F in state s. □
s(((¬b) ∨ c))
=  ((¬T) ∨ F)    (b has been replaced by T, c by F)
=  (F ∨ F)
=  F    □
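The evaluation above can be imitated in code. Representing a state as a dictionary of Boolean values and a proposition as a function of the state is an assumption of this sketch, not the book's notation:

```python
# State s from the example: b is associated with T, c with F.
s = {"b": True, "c": False}

# The proposition ((not b) or c) as a function of the state.
prop = lambda st: (not st["b"]) or st["c"]

print(prop(s))   # False - the value F computed above
```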
1.5 Tautologies
A tautology is a proposition that is true in every state in which it is
well-defined. For example, proposition T is a tautology and F is not.
The proposition b ∨ ¬b is a tautology, as can be seen by evaluating it with
b = T and with b = F:

    T ∨ ¬T  =  T ∨ F  =  T
    F ∨ ¬F  =  F ∨ T  =  T

or, equivalently, by writing its truth table:

    b   ¬b   b ∨ ¬b
    T   F    T
    F   T    T
The basic way to show that a proposition is a tautology is to show that its
evaluation yields T in every possible state. Unfortunately, each extra
identifier in a proposition doubles the number of combinations of values
for identifiers - for a proposition with i distinct identifiers there are 2^i
cases! Hence, the work involved can become tedious and time consuming.
To illustrate this, (1.5.1) contains the truth table for proposition
(b ∧ c ∧ d) ⇒ (d ⇒ b), which has three distinct identifiers. By taking some
shortcuts, the work can be reduced. For example, a glance at truth table
(1.2.3) indicates that operation imp is true whenever its antecedent is false,
so that its consequent need only be evaluated if its antecedent is true. In
example (1.5.1) there is only one state in which the antecedent b ∧ c ∧ d is
true - the state in which b, c and d are true - and hence we need only
the top line of truth table (1.5.1).
Section 1.6 Propositions as Sets of States 15
            b  c  d  |  b ∧ c ∧ d  |  d ⇒ b  |  (b ∧ c ∧ d) ⇒ (d ⇒ b)
            T  T  T  |      T      |    T    |    T
            T  T  F  |      F      |    T    |    T
            T  F  T  |      F      |    T    |    T
(1.5.1)     T  F  F  |      F      |    T    |    T
            F  T  T  |      F      |    F    |    T
            F  T  F  |      F      |    T    |    T
            F  F  T  |      F      |    F    |    T
            F  F  F  |      F      |    T    |    T
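The brute-force method, evaluating a proposition in all 2^i possible states, is easy to mechanize. The following sketch checks the proposition of truth table (1.5.1); the helper names are mine, not the book's:

```python
from itertools import product

def implies(a, b):
    # imp is false only when the antecedent is true and the consequent false
    return (not a) or b

def is_tautology(prop, identifiers):
    # Evaluate prop in every one of the 2**i possible states.
    for values in product([True, False], repeat=len(identifiers)):
        state = dict(zip(identifiers, values))
        if not prop(state):
            return False
    return True

# (b and c and d) imp (d imp b), the proposition of truth table (1.5.1)
p = lambda s: implies(s["b"] and s["c"] and s["d"], implies(s["d"], s["b"]))
print(is_tautology(p, ["b", "c", "d"]))   # True - all eight lines yield T
```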
Disproving a conjecture
Sometimes we conjecture that a proposition e is a tautology, but are
unable to develop a proof of it, so we decide to try to disprove it. What
does it take to disprove such a conjecture?
It may be possible to prove the converse - i.e. that ¬e is a tautology -
but the chances are slim. If we had reason to believe a conjecture, it is
unlikely that its converse is true. Much more likely is that it is true in
most states but false in one or two, and to disprove it we need only find
one such state:
Example. The set of two states {(b, T), (c, T), (d, T)} and {(b, F),
(c, T), (d, F)} is represented by the proposition
(b ∧ c ∧ d) ∨ (¬b ∧ c ∧ ¬d)    □
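Searching for such a falsifying state is the same enumeration, run only until the first F appears. A sketch (the conjecture checked here is my own choice, for illustration):

```python
from itertools import product

def find_counterexample(prop, identifiers):
    # Return a state in which prop is false, or None if it is a tautology.
    for values in product([True, False], repeat=len(identifiers)):
        state = dict(zip(identifiers, values))
        if not prop(state):
            return state
    return None

implies = lambda a, b: (not a) or b

# Conjecture: (b imp c) imp (c imp b). One falsifying state disproves it.
conjecture = lambda s: implies(implies(s["b"], s["c"]), implies(s["c"], s["b"]))
print(find_counterexample(conjecture, ["b", "c"]))   # {'b': False, 'c': True}
```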
it rains: r
picnic is cancelled: pc
be wet: wet
stay at home: s
2. Write truth tables to show the values of the following propositions in all states:
4. Below are some English sentences. Introduce identifiers to represent the simple
ones (e.g. "it's raining cats and dogs.") and then translate the sentences into pro-
positions.
(a) Whether or not it's raining, I'm going swimming.
(b) If it's raining I'm not going swimming.
(c) It's raining cats and dogs.
(d) It's raining cats or dogs.
(e) If it rains cats and dogs I'll eat my hat, but I won't go swimming.
(f) If it rains cats and dogs while I am swimming I'll eat my hat.
Chapter 2
Reasoning using Equivalence Transformations
7. Law of Contradiction: E1 ∧ ¬E1 = F
8. Law of Implication: E1 ⇒ E2 = ¬E1 ∨ E2
9. Law of Equality: (E1 = E2) = (E1 ⇒ E2) ∧ (E2 ⇒ E1)
Section 2.1 The Laws of Equivalence 21
10. Laws of or-simplification:
E1 ∨ E1 = E1
E1 ∨ T = T
E1 ∨ F = E1
E1 ∨ (E1 ∧ E2) = E1
Don't be alarmed at the number of laws. Most of them you have used
many times, perhaps unknowingly, and this list will only serve to make
you more aware of them. Study the laws carefully, for they are used over
and over again in manipulating propositions. Do some of the exercises at
the end of this section until the use of these laws becomes second nature.
Knowing the laws by name makes discussions of their use easier.
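Each law asserts that its two sides have the same value in every state, so each can be confirmed by exhaustive evaluation. A small check of a few of the laws above (a sketch; the dictionary layout and helper are mine):

```python
from itertools import product

implies = lambda a, b: (not a) or b

# Each entry states that the two sides of a law agree in a given state.
laws = {
    "Contradiction (7)":      lambda b, c: (b and not b) == False,
    "Implication (8)":        lambda b, c: implies(b, c) == ((not b) or c),
    "Equality (9)":           lambda b, c: (b == c) == (implies(b, c) and implies(c, b)),
    "or-simplification (10)": lambda b, c: (b or (b and c)) == b,
}

for name, law in laws.items():
    # b and c range over all four states of two identifiers.
    assert all(law(b, c) for b, c in product([True, False], repeat=2)), name
print("all checked laws hold in every state")
```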
The law of the Excluded Middle deserves some comment. It means
that either b or ¬b must be true in any state; there can be no middle
ground. Some don't believe this law, at least in all its generality. In fact,
here is a counterexample to it, in English. Consider the sentence
Clearly, the law is true in all states (in which it is well-defined), so that it
is a tautology.
Exercise 1 concerns proving all the laws to be equivalences.
e1: b ⇒ c and
e2: ¬b ∨ c
we have
E(e1) = d ∨ (b ⇒ c)
E(e2) = d ∨ (¬b ∨ c)
Section 2.2 The Rules of Substitution and Transitivity 23
b ⇒ c
=  ¬b ∨ c       (Implication)
=  c ∨ ¬b       (Commutativity)
=  ¬¬c ∨ ¬b     (Negation)
=  ¬c ⇒ ¬b      (Implication)
Example. We show that the law of Contradiction can be proved from the
others. The portion of each proposition to be replaced in each step is
underlined in order to make it easier to identify the substitution.
(b ∧ (b ⇒ c)) ⇒ c
=  ¬(b ∧ (¬b ∨ c)) ∨ c     (Implication, 2 times)
=  ¬b ∨ ¬(¬b ∨ c) ∨ c      (De Morgan)
=  T                       (Excluded Middle)
Transforming an implication
Suppose we want to prove that
(2.2.3)  E1 ∧ E2 ∧ E3 ⇒ E
(E1 ∧ E2 ∧ E3) ⇒ E
=  ¬(E1 ∧ E2 ∧ E3) ∨ E     (Implication)
=  ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E     (De Morgan)
The final proposition is true in any state in which at least one of ¬E1,
¬E2, ¬E3 and E is true. Hence, to prove that (2.2.3) is a tautology we
need only prove that in any state in which three of them are false the
fourth is true. And we can choose which three to assume false, based on
their form, in order to develop the simplest proof.
With an argument similar to the one just given, we can see that the
five statements
E1 ∧ E2 ∧ E3  ⇒  E
E1 ∧ E2 ∧ ¬E  ⇒  ¬E3
E1 ∧ ¬E ∧ E3  ⇒  ¬E2
¬E ∧ E2 ∧ E3  ⇒  ¬E1
(2.2.4)  ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E
are equivalent and we can choose which to work with. When given a pro-
position like (2.2.3), eliminating implication completely in favor of dis-
junctions like (2.2.4) can be helpful. Likewise, when formulating a prob-
lem, put it in the form of a disjunction right from the beginning.
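The equivalence of (2.2.3) and its disjunctive form (2.2.4) can itself be confirmed by enumeration over the four identifiers (a sketch; the function names are mine):

```python
from itertools import product

implies = lambda a, b: (not a) or b

def form_223(e1, e2, e3, e):
    # E1 and E2 and E3  imp  E
    return implies(e1 and e2 and e3, e)

def form_224(e1, e2, e3, e):
    # not E1 or not E2 or not E3 or E
    return (not e1) or (not e2) or (not e3) or e

assert all(form_223(*state) == form_224(*state)
           for state in product([True, False], repeat=4))
print("(2.2.3) and (2.2.4) agree in all 16 states")
```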
b ∨ ¬b ∨ c ∨ d
Next, define the propositions that arise by using the rules of Substitution
and Transitivity and an already-derived theorem to be a theorem. In this
context, the rules are often called inference rules, for they can be used to
infer that a proposition is a theorem. An inference rule is often written in
the form
where the Ei and E stand for arbitrary propositions. The inference rule
has the following meaning. If propositions E1, ..., En are theorems,
then so is proposition E (and E0 in the second case). Written in this
form, the rules of Substitution and Transitivity are
e1 = e2
(2.3.2) Rule of Substitution:
E(e1) = E(e2), E(e2) = E(e1)

e1 = e2, e2 = e3
(2.3.3) Rule of Transitivity:
e1 = e3
(3.1.1) premise: p ∧ q
conclusion: p ∧ (r ∨ q)
From p ∧ q infer p ∧ (r ∨ q)
        1  p ∧ q          premise
(3.1.3) 2  p              property of and, 1
        3  q              property of and, 1
        4  r ∨ q          property of or, 3
        5  p ∧ (r ∨ q)    property of and, 2, 4
Infer e.
               E1, ..., En
(3.2.1) ∧-I:  ---------------
              E1 ∧ ... ∧ En
Section 3.2 Inference Rules 31
              E1 ∧ ... ∧ En
(3.2.2) ∧-E:  ---------------
                    Ei

                    Ei
(3.2.3) ∨-I:  ---------------
              E1 ∨ ... ∨ En
Remark: There are places where it frequently rains while the sun is shin-
ing. Ithaca, the home of Cornell University, is one of them. In fact, it
sometimes rains when perfectly blue sky seems to be overhead. The
weather can also change from a furious blizzard to bright, calm sunshine
and then back again, within minutes. When the weather acts so strangely,
as it often does, one says that it is Ithacating. □
Let us redo proof (3.1.3) in (3.2.4) below and indicate the exact infer-
ence rule used at each step. The top line states what is to be proved. The
line numbered 1 contains the first (and only) premise (pr 1). Each other
line has the following property. Let the line have the form
Then one can form an instance of the named inference rule by writing the
propositions on lines line#1, ..., line#n above a line and proposition E
below. That is, the truth of E is inferred by one inference rule from the
truth of previous propositions. For example, from line 4 of the proof we
see that q / (r ∨ q) is an instance of rule ∨-I: (r ∨ q) is being inferred from q.
Note how rule ∧-E is used to break a proposition into its constituent
parts, while ∧-I and ∨-I are used to build new ones. This is typical of the
use of introduction and elimination rules.
Proofs (3.2.5) and (3.2.6) below illustrate that and is a commutative
operation; if p ∧ q is true then so is q ∧ p, and vice versa. This is obvious
after our previous study of propositions, but it must be proved in this for-
mal system before it can be used. Note that both proofs are necessary;
one cannot derive the second as an instance of the first by replacing p
and q in the first by q and p, respectively. In this formal system, a proof
holds only for the particular propositions involved. It is not a schema,
the way an inference rule is.
From p ∧ q infer q ∧ p
        1  p ∧ q    pr 1
(3.2.5) 2  p        ∧-E, 1
        3  q        ∧-E, 1
        4  q ∧ p    ∧-I, 3, 2
To illustrate the relation between the proof system and English, we give
an argument in English for lemma (3.2.5): Suppose p ∧ q is true [line 1].
Then so is p, and so is q [lines 2 and 3]. Therefore, by the definition of
and, q ∧ p is true [line 4].
From q ∧ p infer p ∧ q
        1  q ∧ p    pr 1
(3.2.6) 2  q        ∧-E, 1
        3  p        ∧-E, 1
        4  p ∧ q    ∧-I, 3, 2
using "pr i" to refer to the ith premise later on, as shown in (3.2.7). This
abbreviation will occur often. But note that this is only an abbreviation,
and we will continue to use the phrase "occurs on a previous line" to
include the premises, even though the abbreviation is used.
From q ∧ p infer p ∧ q
        1  q        ∧-E, pr 1
(3.2.7) 2  p        ∧-E, pr 1
        3  p ∧ q    ∧-I, 2, 1
we conclude no sun.
Here is a simple example.
From p ∨ (q ∧ r), p ⇒ s, (q ∧ r) ⇒ s infer s ∨ p
1  p ∨ (q ∧ r)    pr 1
2  p ⇒ s          pr 2
3  (q ∧ r) ⇒ s    pr 3
4  s              ∨-E, 1, 2, 3
5  s ∨ p          ∨-I (rule (3.2.3)), 4
              E1 ⇒ E2, E1
(3.2.9) ⇒-E:  -------------
                   E2
From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)
         1  p ∧ q          pr 1
         2  p ⇒ r          pr 2
(3.2.10) 3  p              ∧-E (rule (3.2.2)), 1
         4  r              ⇒-E, 2, 3
         5  r ∨ (q ⇒ r)    ∨-I (rule (3.2.3)), 4
From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)
         1  p              ∧-E, pr 1
(3.2.11) 2  r              ⇒-E, pr 2, 1
         3  r ∨ (q ⇒ r)    ∨-I, 2
               E1 ⇒ E2, E2 ⇒ E1
(3.2.12) =-I:  -----------------
                    E1 = E2

                    E1 = E2
(3.2.13) =-E:  -----------------
               E1 ⇒ E2, E2 ⇒ E1
From
1            =-E, pr 2
2  q ⇒ r     ⇒-E, 1, pr 1
3  r = q     =-I, pr 3, 2
2. Here is one proof that p follows from p. Write another proof that uses only
one reference to the premise.
From p infer p
1  p ∧ p    ∧-I, pr 1, pr 1
2  p        ∧-E, 1
4. For each of your proofs of exercise 3, give an English version. (The English
versions need not mimic the formal proofs exactly.)
              From E1, ..., En infer E
(3.3.1) ⇒-I:  ------------------------
              (E1 ∧ ... ∧ En) ⇒ E
Proof (3.3.2) uses ⇒-I twice in order to prove that p ∧ q and q ∧ p are
equivalent, using lemmas proved in the previous section.
Infer (p ∧ q) = (q ∧ p)
        1  (p ∧ q) ⇒ (q ∧ p)    ⇒-I, (3.2.5)
(3.3.2) 2  (q ∧ p) ⇒ (p ∧ q)    ⇒-I, (3.2.6)
        3  (p ∧ q) = (q ∧ p)    =-I, 1, 2
Subproofs
A proof can be included within a proof, much the way a procedure can
be included within a program. This allows the premise of ⇒-I to appear
as a line of a proof. To illustrate this, (3.3.2) is rewritten in (3.3.4) to
include proof (3.2.5) as a subproof. The subproof happens to be on line 1
here, but it could be on any line. If the subtheorem appears on line j
Section 3.3 Proofs and Subproofs 37
(say) of the main proof, then its proof appears indented underneath, with
its lines numbered j.1, j.2, etc. We could have replaced the reference to
(3.2.6) by a subproof in a similar manner.
Infer (p ∧ q) = (q ∧ p)
        1  From p ∧ q infer q ∧ p
           1.1  p        ∧-E, pr 1
           1.2  q        ∧-E, pr 1
(3.3.4)    1.3  q ∧ p    ∧-I, 1.2, 1.1
        2  (p ∧ q) ⇒ (q ∧ p)    ⇒-I, 1
        3  (q ∧ p) ⇒ (p ∧ q)    ⇒-I, (3.2.6)
        4  (p ∧ q) = (q ∧ p)    =-I, 2, 3
Scope rules
A subproof can contain references not only to previous lines in its
proof, but also to previous lines that occur in surrounding proofs. We
call these global line references. However, "recursion" is not allowed; a
line j (say) may not contain a reference to a theorem whose proof is not
finished by line j.
The reader skilled in the use of block structure in languages like PL/I,
ALGOL 60 and Pascal will have no difficulty in understanding this scope
rule, for essentially the same scope mechanism is employed here (except
for the restriction against recursion). Let us state the rule more precisely.
Example (3.3. 7) illustrates the use of this scope rule; line 2.2 refers to
line 1, which is outside the proof of line 2.
We illustrate another common mistake below: the use of a line that is not
in a surrounding proof. Below, on line 6.1 an attempt is made to reference
s on line 4.1. Since line 4.1 is not in a surrounding proof, this is not
allowed.
A subproof using global references is being proved in a particular con-
text. Taken out of context, the subproof may not be true because it relies
From (p ∧ q) ⇒ r infer p ⇒ (q ⇒ r)
        1  (p ∧ q) ⇒ r    pr 1
        2  From p infer q ⇒ r
           2.1  p                    pr 1
(3.3.8)    2.2  From q infer r
                2.2.1  p ∧ q         ∧-I, 2.1, pr 1
                2.2.2  r             ⇒-E, 1, 2.2.1
           2.3  q ⇒ r                ⇒-I, 2.2
        3  p ⇒ (q ⇒ r)               ⇒-I, 2
Proof by contradiction
A proof by contradiction typically proceeds as follows. One makes an
assumption. From this assumption one proceeds to prove a contradiction,
say, by showing that something is both true and false. Since such a
contradiction cannot possibly happen, and since the proof from assump-
tion to contradiction is valid, the assumption must be false.
Proof by contradiction is embodied in the proof rules ¬-I and ¬-E:

               From E infer E1 ∧ ¬E1
(3.3.9) ¬-I:   ----------------------
                        ¬E

               From ¬E infer E1 ∧ ¬E1
(3.3.10) ¬-E:  ----------------------
                        E

Rule ¬-I indicates that if "From E infer E1 ∧ ¬E1" has been proved for
some proposition E1, then one can write ¬E on a line of the proof.
Rule ¬-E similarly allows us to conclude that E holds if a proof of
"From ¬E infer E1 ∧ ¬E1" exists, for some proposition E1.
We show in (3.3.11) an example of the use of rule ¬-I, that from p we
can conclude ¬¬p.
From p infer ¬¬p
         1  p                       pr 1
(3.3.11) 2  From ¬p infer p ∧ ¬p
            2.1  p ∧ ¬p             ∧-I, 1, pr 1
         3  ¬¬p                     ¬-I, 2
From ¬¬p infer p
         1  ¬¬p                        pr 1
(3.3.12) 2  From ¬p infer ¬p ∧ ¬¬p
            2.1  ¬p ∧ ¬¬p              ∧-I, pr 1, 1
         3  p                          ¬-E, 2
Theorems (3.3.11) and (3.3.12) look quite similar, and yet both proofs are
needed; one cannot simply get one from the other more easily than they
are proven here. More importantly, both of the rules ¬-I and ¬-E are
needed; if one is omitted from the proof system, we will be unable to
deduce some propositions that are tautologies in the sense described in
section 1.5. This may seem strange, since the rules look so similar.
Let us give two more proofs. The first one indicates that from p and
¬p one can prove any proposition q, even one that is equivalent to false.
This is because p and ¬p cannot both be true at the same time, and
hence the premises form an absurdity.
From p, ¬p infer q
         1  p                      pr 1
         2  ¬p                     pr 2
(3.3.13) 3  From ¬q infer p ∧ ¬p
            3.1  p ∧ ¬p            ∧-I, 1, 2
         4  q                      ¬-E, 3
From p ∧ q infer ¬(p ⇒ ¬q)
         1  p ∧ q                     pr 1
         2  From p ⇒ ¬q infer q ∧ ¬q
            2.1  p                    ∧-E, 1
(3.3.14)    2.2  q                    ∧-E, 1
            2.3  ¬q                   ⇒-E, pr 1, 2.1
            2.4  q ∧ ¬q               ∧-I, 2.2, 2.3
         3  ¬(p ⇒ ¬q)                 ¬-I, 2
Summary
The reader may have noticed a difference between the natural deduc-
tion system and the previous systems of evaluation and equivalence
transformation: the natural deduction system does not allow the use of
constants T and F! The connection between the systems can be stated as
follows. If "Infer e" is a theorem of the natural deduction system, then e
is a tautology and e = T is an equivalence. On the other hand, if e = T is
a tautology and e does not contain T and F, then "Infer e" is a theorem
of the natural deduction system. The omission of T and F is no problem
because, by the rule of Substitution, in any proposition T can be replaced
by a tautology (e.g. b ∨ ¬b) and F by the complement of a tautology (e.g.
b ∧ ¬b) to yield an equivalent proposition.
We summarize what a proof is as follows. A proof of a theorem
"From e1, ..., en infer e" or of a theorem "Infer e" consists of a
sequence of lines. The first line contains the theorem. If the first line is
unnumbered, the rest are indented and numbered 1, 2, etc. If the first line
has the number i, the rest are indented and numbered i.1, i.2, etc. The
last line must contain proposition e. Each line i must have one of the
following four forms:
Form 1: (i) ej    pr j
where 1 ≤ j ≤ n. The line contains premise j.
Historical Notes
The style of the logical system defined in this chapter was conceived
principally to capture our "natural" patterns of reasoning. Gerhard
Gentzen, a German mathematician who died in an Allied prisoner of war
camp just after World War II, developed such a system for mathematical
arguments in his 1935 paper Untersuchungen über das logische Schliessen
[20], which is included in [43].
Several textbooks on logic are based on natural deduction, for example
W.V.O. Quine's book Methods of Logic [41].
The particular block-structured system given here was developed using
two sources: WFF'N PROOF: The Game of Modern Logic, by Layman
E. Allen [1] and the monograph A Programming Logic, by Robert
Constable and Michael O'Donnell [7]. The former introduces the deduction
system through a series of games; it uses prefix notation, partly to avoid
problems with parentheses, which we have sidestepped through informal-
ity. A Programming Logic describes a mechanical program verifier for
Exercises for Section 3.3
From p ∧ q infer q ∧ p
From q ∧ p infer p ∧ q
Even though it looks like the second should follow directly from the
first, in the formal system both must be proved.
But we can prove something about the formal system: systematic sub-
stitution of propositions for identifiers in a theorem and its proof yields
another theorem and proof. So we can consider any theorem to be a
schema also. For example, from proof (3.2.5) of "From p ∧ q infer q ∧ p"
we can generate a proof of "From (a ∨ b) ∧ c infer c ∧ (a ∨ b)" simply by
substituting a ∨ b for p and c for q everywhere in proof (3.2.5):

From (a ∨ b) ∧ c infer c ∧ (a ∨ b)
  1  (a ∨ b) ∧ c     pr 1
  2  a ∨ b           ∧-E, 1
  3  c               ∧-E, 1
  4  c ∧ (a ∨ b)     ∧-I, 3, 2
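This substitution property can be checked mechanically. The following Python sketch (my own illustration, not the book's notation) treats an inference "From e1 infer e2" as valid iff e2 is true in every state in which e1 is true, and confirms that validity survives substituting a ∨ b for p and c for q:

```python
from itertools import product

def valid_inference(premise, conclusion, ids):
    # "From premise infer conclusion" is valid iff the conclusion is
    # true in every state in which the premise is true.
    for values in product([True, False], repeat=len(ids)):
        state = dict(zip(ids, values))
        if premise(state) and not conclusion(state):
            return False
    return True

# "From p ∧ q infer q ∧ p"
assert valid_inference(lambda s: s['p'] and s['q'],
                       lambda s: s['q'] and s['p'], ['p', 'q'])

# Substituting a ∨ b for p and c for q yields another valid inference:
# "From (a ∨ b) ∧ c infer c ∧ (a ∨ b)"
assert valid_inference(lambda s: (s['a'] or s['b']) and s['c'],
                       lambda s: s['c'] and (s['a'] or s['b']),
                       ['a', 'b', 'c'])
```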
Let us state more precisely this idea of textual substitution in theorem and
proof.
Infer (p ∧ q) = (q ∧ p)
  1  (p ∧ q) ⇒ (q ∧ p)    (3.2.5)
  2  (q ∧ p) ⇒ (p ∧ q)    (3.2.5) (with p for q, q for p)
  3  (p ∧ q) = (q ∧ p)    =-I, 1, 2
For example, given that c ⇒ a ∨ b is true, to show that c ⇒ b ∨ a is true
we take E(p) to be c ⇒ p, e1 = e2 to be a ∨ b = b ∨ a (the law of Commutativity,
which will be proved later) and apply the theorem.
The rule of Substitution was an inference rule in the equivalence system
of chapter 2. However, it is a meta-theorem of the natural deduction
system and must be proved. Its proof, which would be performed by
induction on the structure of proposition E(p), is left to the interested
reader in exercise 10, so let us suppose it has been done. We put the rule
of Substitution in the form of a derived inference rule:

              e1 = e2, E(e1)
(3.4.4) subs: --------------    (E(p) is a function of p)
                  E(e2)
To show the use of (3.4.4), we give a schematic proof to show that the
rule of substitution as given in section 2.2 holds here also.
With this derived rule of inference, we have the flexibility of both the
equivalence and the natural deduction systems. But we must make sure
that the laws of section 2.1 actually hold! We do this next.
We now turn to the laws of section 2.1. Some of their proofs are given
here; the others are left as exercises to the reader.
3. Distributive laws. Here is a proof of the first; the second is left to the
reader. The proof is broken into three parts. The first part proves an
implication ⇒ and the second part proves it in the other direction, so
that the third can prove the equivalence. The second part uses a case
analysis (rule ∨-E) on b ∨ ¬b -the law of the Excluded Middle- which is
not proved until later. The use of b ∨ ¬b in this fashion occurs often.
From b ∨ (c ∧ d) infer (b ∨ c) ∧ (b ∨ d)
          1  From b infer (b ∨ c) ∧ (b ∨ d)
             1.1  b ∨ c                     ∨-I, pr 1
             1.2  b ∨ d                     ∨-I, pr 1
(3.4.7)      1.3  (b ∨ c) ∧ (b ∨ d)         ∧-I, 1.1, 1.2
          2  b ⇒ (b ∨ c) ∧ (b ∨ d)          ⇒-I, 1
          3  From c ∧ d infer (b ∨ c) ∧ (b ∨ d)
             3.1  c                         ∧-E, pr 1
             3.2  d                         ∧-E, pr 1
             3.3  b ∨ c                     ∨-I, 3.1
             3.4  b ∨ d                     ∨-I, 3.2
             3.5  (b ∨ c) ∧ (b ∨ d)         ∧-I, 3.3, 3.4
          4  (c ∧ d) ⇒ (b ∨ c) ∧ (b ∨ d)    ⇒-I, 3
          5  (b ∨ c) ∧ (b ∨ d)              ∨-E, pr 1, 2, 4
Section 3.4 Adding Flexibility to the Natural Deduction System
From (b ∨ c) ∧ (b ∨ d) infer b ∨ (c ∧ d)
          1  b ∨ c                     ∧-E, pr 1
          2  b ∨ d                     ∧-E, pr 1
          3  b ∨ ¬b                    (3.4.14)
          4  From b infer b ∨ (c ∧ d)
             4.1  b ∨ (c ∧ d)          ∨-I, pr 1
          5  b ⇒ b ∨ (c ∧ d)           ⇒-I, 4
(3.4.8)   6  From ¬b infer b ∨ (c ∧ d)
             6.1  c                    (3.4.6), 1, pr 1
             6.2  d                    (3.4.6), 2, pr 1
             6.3  c ∧ d                ∧-I, 6.1, 6.2
             6.4  b ∨ (c ∧ d)          ∨-I, 6.3
          7  ¬b ⇒ b ∨ (c ∧ d)          ⇒-I, 6
          8  b ∨ (c ∧ d)               ∨-E, 3, 5, 7
(3.4.13) Infer ¬¬b = b
  1  b ⇒ ¬¬b     ⇒-I, (3.3.11)
  2  ¬¬b ⇒ b     ⇒-I, (3.3.12)
  3  ¬¬b = b     =-I, 1, 2
(3.4.14) Infer b ∨ ¬b
  1  From ¬(b ∨ ¬b) infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
     1.1  ¬(b ∨ ¬b)                        pr 1
     1.2  From ¬b infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
          1.2.1  b ∨ ¬b                    ∨-I, pr 1
          1.2.2  (b ∨ ¬b) ∧ ¬(b ∨ ¬b)      ∧-I, 1.2.1, 1.1
     1.3  b                                ¬-E, 1.2
     1.4  b ∨ ¬b                           ∨-I, 1.3
     1.5  (b ∨ ¬b) ∧ ¬(b ∨ ¬b)             ∧-I, 1.4, pr 1
  2  b ∨ ¬b                                ¬-E, 1
10-11. Laws of or- and and-Simplification. These laws use the constants
T and F, which don't appear in the inference system.
From e1, e2 infer e3
  1  e1    pr 1
  2  e2    pr 2
  3  e3    Why?
and we need only substantiate line 3 -i.e. give a reason why e3 can be
written on it. We can look to three things for insight. First, we may be
able to combine the premises or derive sub-propositions from them in
some fashion, if not to produce e3 at least to get something that looks
similar to it.
Secondly, we can investigate e3 itself. Since an inference rule must be
used to substantiate line 3, the form of e3 should help us decide which
inference rule to use. And this leads us to the third piece of information
we can use, the inference rules themselves. There are ten inference rules,
which yields a lot of possibilities. Fortunately, few of them will apply to
any particular proposition e3, because e3 must have the form of the con-
clusion of the inference rule used to substantiate it. And, with the addi-
tional information of the premises, the number of actual possibilities can
be reduced even more.
Section 3.5 Developing Natural Deduction System Proofs
For example, if e3 has the form e4 ⇒ e5, the two most likely inference
rules to use are =-E and ⇒-I, and if a suitable equivalence does not seem
possible to derive from the premises, then =-E can be eliminated from
consideration.
Let us suppose we try to substantiate line 3 using rule ⇒-I, because it
has the form e4 ⇒ e5. Then we would expand the proof as follows.

From e1, e2 infer e4 ⇒ e5
  1  e1                  pr 1
  2  e2                  pr 2
  3  From e4 infer e5
     3.1  e4             pr 1
     3.2  e5             Why?
  4  e4 ⇒ e5             ⇒-I, 3
parts of it, can be built from shorter propositions that occur on previous
lines. Note that, except for =-I, the forms of the conclusions of the rules
of introduction are all different, so that at most one of these rules can be
used to substantiate a proposition.
The rules of elimination are generally used to "break apart" a proposi-
tion so that one of its sub-propositions can be derived. All the rules of
elimination (except for =-E) have a general proposition as their conclu-
sion. This means that they may possibly be used to substantiate any pro-
position. Whether an elimination rule can be used depends on whether its
premises have appeared on previous lines, so to decide whether these rules
should be used requires a look at previous lines.
  2  (p ∧ ¬q) ⇒ ¬p    Why?
Little can be derived from p ⇒ q, except the disjunction ¬p ∨ q (using the
rule of Substitution). We will keep this proposition in mind. Which rules
of inference could be used to substantiate line 2? That is, which rules of
inference could have (p ∧ ¬q) ⇒ ¬p as their conclusion?
Possible inference rules are: ⇒-I, ∧-E, ∨-E, ¬-E, =-E and ⇒-E. Which
seems most applicable, and why? Expand the proof accordingly.
There is little to suppose that the elimination rules could be useful, for
their premises are different from the propositions on previous lines. This
leaves only ⇒-I.

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p
  1  p ⇒ q                 pr 1
  2  From p ∧ ¬q infer ¬p
     2.1  p ∧ ¬q           pr 1
     2.2  ¬p               Why?
  3  (p ∧ ¬q) ⇒ ¬p         ⇒-I, 2
Possible inference rules are ¬-I, ∧-E, ∨-E, ¬-E and ⇒-E. Choose the rule
that is most applicable and expand the proof accordingly.
The elimination rules don't seem useful here; elimination of the implication on line 1
results in q, and we already know that ∧-E can be used to derive p and
¬q from p ∧ ¬q. Only ¬-I seems helpful:
From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p
  1  p ⇒ q                          pr 1
  2  From p ∧ ¬q infer ¬p
     2.1  p ∧ ¬q                    pr 1
     2.2  From p infer e ∧ ¬e       (which e?)
          2.2.1  p                  pr 1
          2.2.2  e ∧ ¬e             Why?
     2.3  ¬p                        ¬-I, 2.2
  3  (p ∧ ¬q) ⇒ ¬p                  ⇒-I, 2
What proposition e should be used on lines 2.2 and 2.2.2? To make the
choice, look at the propositions that occur on lines previous to 2.2 and
the propositions we know we can derive from them. Expand the proof
accordingly.
From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p
  1  p ⇒ q                    pr 1
  2  From p ∧ ¬q infer ¬p
     2.1  p ∧ ¬q              pr 1
     2.2  p                   ∧-E, 2.1
     2.3  ¬q                  ∧-E, 2.1
     2.4  From p infer q ∧ ¬q
          2.4.1  q            ⇒-E, 1, 2.2
          2.4.2  q ∧ ¬q       ∧-I, 2.4.1, 2.3
     2.5  ¬p                  ¬-I, 2.4
  3  (p ∧ ¬q) ⇒ ¬p            ⇒-I, 2
Rule =-E can be used to derive two implications. This seems useful here,
since implications will be needed to derive the goal, and we derive both.
From
  1  ¬p ⇒ q    =-E, pr 1
  2  q ⇒ ¬p    =-E, pr 1
The following rules could be used to substantiate line 3: ¬-I, ∧-E, ∨-E,
¬-E and ⇒-E. Choose the most likely one and expand the proof accordingly.
The elimination rules don't seem helpful at all, because the premises that
would be needed in order to use them are not available and don't seem
easy to derive. The only rule to try at this point is ¬-I -we have little
choice!
What proposition e should be used on lines 3 and 3.2, and how should it
be proved? Expand the proof accordingly.
So we are left with concluding the two propositions p and ¬p. These are
quite simple, using the above reasoning, so let us just show the final
proof.
At each step of the development of the proof there was little choice. The
crucial -and most difficult- point of the development was the choice of
inference rule ¬-I to substantiate the last line of the proof, but careful
study of the inference rules led to it as the only likely candidate. Thus,
directed study of the available information can lead quite simply to the
proof.
1. If Bill takes the bus, then Bill misses his appointment, if the
bus is late.
2. Bill shouldn't go home, if (a) Bill misses his appointment, and
(b) Bill feels downcast.
3. If Bill doesn't get the job, then (a) Bill feels downcast, and (b)
Bill shouldn't go home.
Which of the following conjectures are true? That is, which can be validly
proved from the premises? Give proofs of the true conjectures and coun-
terexamples for the others.
1. If Bill takes the bus, then Bill does get the job, if the bus is
late.
2. Bill gets the job, if (a) Bill misses his appointment, and (b) Bill
should go home.
3. If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job.
4. Bill doesn't take the bus if, (a) the bus is late, and (b) Bill
doesn't get the job.
5. If Bill doesn't miss his appointment, then (a) Bill shouldn't go
home, and (b) Bill doesn't get the job.
6. Bill feels downcast, if (a) the bus is late, or (b) Bill misses his
appointment.
7. If Bill does get the job, then (a) Bill doesn't feel downcast, or
(b) Bill shouldn't go home.
8. If (a) Bill should go home, and Bill takes the bus, then (b) Bill
doesn't feel downcast, if the bus is late.
This problem is typical of the puzzles one comes across from time to time.
Most people are confused by them -they just don't know how to deal
with them effectively and are amazed at those who do. It turns out, however,
that knowledge of propositional calculus makes the problem fairly
easy.
The first step in solving the problem is to translate the premises into
propositional form. Let the identifiers and their interpretations be:

  tb:  Bill takes the bus
  bl:  the bus is late
  ma:  Bill misses his appointment
  gh:  Bill should go home
  fd:  Bill feels downcast
  gj:  Bill gets the job
The premises are given below. Each has been put in the form of an impli-
cation and in the form of a disjunction, knowing that the disjunctive form
is often helpful.
Now let's solve the first few problems. In order to save space, Premises 1,
2 and 3 are not written in every proof, but are simply referred to as Premises
1, 2 and 3. Included, however, are propositions derived from them in
order to get more true propositions from which to conclude the result.
Conjecture 1: If Bill takes the bus, then Bill does get the job, if the bus is
late. Translate the conjecture into propositional form.
From tb infer bl ⇒ gj
  1  tb         pr 1
  2  bl ⇒ gj    Why?

From tb infer bl ⇒ gj
  1  tb          pr 1
  2  bl ⇒ ma     ⇒-E, Premise 1, 1
  3  bl ⇒ gj     Why?

The necessary propositions for the use of the elimination rules are not
available, so try ⇒-I:

From tb infer bl ⇒ gj
  1  tb               pr 1
  2  bl ⇒ ma          ⇒-E, Premise 1, 1
  3  From bl infer gj
     3.1  bl          pr 1
     3.2  gj          Why?
  4  bl ⇒ gj          ⇒-I, 3
Can any propositions be inferred at line 3.2 from the propositions on previous
lines and Premises 1, 2 and 3? Expand the proof accordingly.

From tb infer bl ⇒ gj
  1  tb               pr 1
  2  bl ⇒ ma          ⇒-E, Premise 1, 1
  3  From bl infer gj
     3.1  bl          pr 1
     3.2  ma          ⇒-E, 2, 3.1
     3.3  gj          Why?
  4  bl ⇒ gj          ⇒-I, 3
None of the rules seem helpful. The only proposition available that
contains gj is Premise 3, and its disjunctive form indicates that gj must
necessarily be true only in states in which fd ∧ ¬gh is false (according to
theorem (3.4.6)). But there is nothing in Premise 2, the only other place
fd and gh appear, to make us believe that fd ∧ ¬gh must be false.
Perhaps the conjecture is false. What counterexample -i.e. state in
which the conjecture is false- does the structure of the proof and this
argument lead to?
Conjecture 2: Bill gets the job, if (a) Bill misses his appointment and (b)
Bill should go home. Translate the conjecture into propositional form.

From ma ∧ gh infer gj
  1  ma ∧ gh    pr 1
  2  gj         Why?

What can we derive from line 1 and Premises 1, 2 and 3? Expand the
proof accordingly.
Both line 1 and Premise 2 contain ma and gh. Premise 2 can be put in
the form ¬(ma ∧ gh) ∨ ¬fd. Since ma ∧ gh is on line 1, theorem (3.4.6)
together with the law of Negation allows us to conclude that ¬fd is true,
or that fd is false. Putting this argument into the proof yields

From ma ∧ gh infer gj
  1  ma ∧ gh               pr 1
  2  ¬(ma ∧ gh) ∨ ¬fd      subs, De Morgan, Premise 2
  3  ¬¬(ma ∧ gh)           subs, Negation, 1
  4  ¬fd                   (3.4.6), 2, 3
  5  gj                    Why?
The applicable rules are ∧-E, ∨-E, ¬-E and ⇒-E. This means that an earlier
proposition must be broken apart to derive gj. The one that contains
gj is Premise 3, and in its disjunctive form it looks promising. To show
that gj is true, we need only show that fd ∧ ¬gh is false. But we already
know that fd is false, so that we can complete the proof as follows.

From ma ∧ gh infer gj
  1  ma ∧ gh               pr 1
  2  ¬(ma ∧ gh) ∨ ¬fd      subs, De Morgan, Premise 2
  3  ¬¬(ma ∧ gh)           subs, Negation, 1
  4  ¬fd                   (3.4.6), 2, 3
  5  ¬fd ∨ ¬¬gh            ∨-I, 4
  6  ¬(fd ∧ ¬gh)           subs, De Morgan, 5
  7  gj                    (3.4.6), Premise 3, 6
Conjecture 3: If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job. Translate the
conjecture into propositional form.

  2      Why?

Just before line 2.2, what propositions can be inferred from earlier propositions
and Premises 1, 2 and 3? Expand the proof accordingly.
What inference rule should be used to substantiate line 2.5? Expand the
proof accordingly.
The proposition on line 2.5 could have the form of the conclusion of rules
∨-I, ∧-E, ∨-E, ¬-E and ⇒-E. The first rule to try is ∨-I. Its use would
require proving that one of ¬tb and ¬ma is true. But, looking at the
Premises, this seems difficult. For from Premise 1 we see that both tb
and ma could be true, while the other premises are true also because both
their conclusions are true. Perhaps there is a contradiction. What is it?
((x < y) ∧ c) ∨ d.
The new assertions like P are called atomic expressions, while an expres-
sion that results from replacing an identifier by an atomic expression is
called a predicate. We will not go into detail about the syntax of atomic
expressions; instead we will use conventional mathematical notation and
rely on the reader's knowledge of mathematics and programming. For
example, any expression of a programming language that yields a Boolean
result is an acceptable atomic expression. Thus, the following are valid
predicates:
The second example illustrates that parentheses are not always needed to
isolate the atomic expressions from the rest of a predicate. The
precedences of operators in a predicate follow conventional mathematics.
For example, the Boolean operators A, v, and ~ have lower precedence
than the arithmetic and relational operators. We will use parentheses to
make the precedence of operations explicit where necessary.
Evaluating predicates
Evaluating a predicate in a state is similar to evaluating a proposition.
All identifiers are replaced by their values in the state, the atomic expres-
sions are evaluated and replaced by their values ( T or F), and the result-
ing constant proposition is evaluated. For example, the predicate
x < y ∨ b in the state {(x, 2), (y, 3), (b, F)} has the value of 2 < 3 ∨ F,
which is equivalent to T ∨ F, which is T.
Using our earlier notation s (e) to represent the value of expression e
in state s, and writing a state as the set of pairs it contains, we show the
evaluation of three predicates:
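In a modern programming language this evaluation can be sketched directly; here a state is a Python dictionary mapping identifiers to values (an illustration of mine, not part of the text):

```python
# A state maps identifiers to values; to evaluate a predicate in a
# state, replace each identifier by its value and evaluate the result.
state = {'x': 2, 'y': 3, 'b': False}

def predicate(s):                      # the predicate  x < y ∨ b
    return s['x'] < s['y'] or s['b']   # 2 < 3 ∨ F  =  T ∨ F  =  T

assert predicate(state) is True
```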
y = 0 ∨ (x/y = 5).
Rather than change the definition of and and or, which would require
us to change our formal logic completely, we introduce two new opera-
tors: cand (for conditional and) and cor (for conditional or). The
operands of these new operators can be any of three values: F, T and U
(for Undefined). The new operators are defined by the following truth
table.
This definition says nothing about the order in which the operands should
be evaluated. But the intelligent way to evaluate these operations, at least
on current computers, is in terms of the following equivalent conditional
expressions:
Operators cand and cor are not commutative. For example, b cand c is
not equivalent to c cand b. Hence, care must be exercised in manipulat-
ing expressions containing them. The following laws of equivalence do
hold for cand and cor (see exercise 5). These laws are numbered to
correspond to the numbering of the laws in chapter 2.
3. Distributivity:
    E1 cand (E2 cor E3) = (E1 cand E2) cor (E1 cand E3)
    E1 cor (E2 cand E3) = (E1 cor E2) cand (E1 cor E3)
10. cor-simplification
    E1 cor E1 = E1
    E1 cor T = T    (provided E1 is well-defined)
    E1 cor F = E1
    E1 cor (E1 cand E2) = E1
11. cand-simplification
    E1 cand E1 = E1
    E1 cand T = E1
    E1 cand F = F    (provided E1 is well-defined)
    E1 cand (E1 cor E2) = E1
In addition, one can derive various laws that combine cand and cor with
the other operations, for example,
3. Evaluate the following predicates in the state given in exercise 1. Use U for
the value of an undefined expression.
4.2 Quantification
Existential quantification
Let m and n be two integer expressions satisfying m ≤ n. Consider
the predicate

The conventional notations

  Σ (i: m ≤ i < n) s_i = s_m + s_{m+1} + ... + s_{n-1}
  Π (i: m ≤ i < n) s_i = s_m * s_{m+1} * ... * s_{n-1}

stand for the sum and product of the values s_m, s_{m+1}, ..., s_{n-1}, respectively.
These can be written in a more linear fashion, similar to (4.2.1), as
follows, and we shall continue to use this new form:
(4.2.3) Definition of E:
    (E i: m ≤ i < m: E_i) = F, and, for k ≥ m,
    (E i: m ≤ i < k+1: E_i) = (E i: m ≤ i < k: E_i) ∨ E_k    □

  (E i: 0 ≤ i < 0: i = i)
  (E i: -3 ≤ i < -3: T)
The value 0 is called the identity element of addition, because any number
added to 0 yields that number. Similarly, 1, F and T are the identity
elements of the operators *, or and and, respectively. End of remark
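Definition (4.2.3) translates directly into a recursive function; the following Python sketch (an illustration only) also confirms the two empty-range examples above:

```python
# Definition (4.2.3): existential quantification over the empty range
# m <= i < m is F, and extending the range by one value k adds the
# disjunct E_k.
def exists(m, n, E):
    if n <= m:                       # empty range: identity element F
        return False
    return exists(m, n - 1, E) or E(n - 1)

assert exists(0, 0, lambda i: i == i) is False    # empty range
assert exists(-3, -3, lambda i: True) is False    # empty range
assert exists(2, 6, lambda i: i == 4) is True
assert exists(2, 6, lambda i: i == 9) is False
```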
Section 4.2 Quantification
Universal quantification
The universal quantifier, A, is read as "for all". The predicate

(4.2.4)  (A i: m ≤ i < n: E_i)

is true in a state iff, for all values i in the range m ≤ i < n, E_i is true in
that state.
We now define A in terms of E, so that, formally, we need deal only
with one of them as a new concept. Predicate (4.2.4) is true iff all the E_i
are true, so we see that it is equivalent to

    E_m ∧ E_{m+1} ∧ ... ∧ E_{n-1}
  = ¬¬(E_m ∧ E_{m+1} ∧ ... ∧ E_{n-1})      (Negation)
  = ¬(¬E_m ∨ ¬E_{m+1} ∨ ... ∨ ¬E_{n-1})    (De Morgan)
  = ¬(E i: m ≤ i < n: ¬E_i)
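This definition of A in terms of E can be sketched as follows (Python, my illustration); note that a universal quantification over an empty range comes out true:

```python
# (A i: m <= i < n: E_i)  =  not (E i: m <= i < n: not E_i)
def exists(m, n, E):
    return any(E(i) for i in range(m, n))

def forall(m, n, E):
    return not exists(m, n, lambda i: not E(i))

# Over the empty range, E is F and hence A is T:
assert forall(3, 3, lambda i: False) is True
assert forall(0, 5, lambda i: i < 5) is True
assert forall(0, 5, lambda i: i < 4) is False
```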
Numerical quantification
Consider predicates E_0, E_1, .... It is quite easy to assert formally that
k is the smallest integer such that E_k holds. We need only indicate that
E_0 through E_{k-1} are false and that E_k is true:
It is more difficult to assert that k is the second smallest integer such that
E_k holds, because we also have to describe the first such predicate:
Obviously, describing the third smallest value k such that E_k holds will
be clumsier, and to write a function that yields the number of true E_i will
be even harder. Let us introduce some notation:
A Note on ranges
Thus far, the ranges of quantifiers have been given in the form m ≤ i
< n, for integer expressions m and n. The lower bound m is included in
the range, the upper bound n is not. Later, the form of ranges will be
generalized, but this is a useful convention, and we will use it where it is
suitable.
Note that the number of values in the range is n - m. Note also that
quantifications with adjacent ranges can be combined as follows:
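Both observations are easy to check with half-open ranges in code (a Python sketch; the bounds and the predicate E are my own examples):

```python
m, n, p = 2, 6, 9        # hypothetical bounds with m <= n <= p
# The half-open range m <= i < n contains exactly n - m values:
assert len(range(m, n)) == n - m

# Quantifications over adjacent ranges combine into one range:
E = lambda i: i % 4 == 0
assert (any(E(i) for i in range(m, n)) or any(E(i) for i in range(n, p))) \
       == any(E(i) for i in range(m, p))
```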
While it is possible to allow predicates like (4.3.2), and most logical sys-
tems do, it is advisable to enforce the use of each identifier in only one
way:
does not comply with the restriction. An equivalent predicate that does is
At times, for convenience a predicate will be written that does not follow
restriction (4.3.3). In this case, be sure to view each quantified identifier
as being used nowhere else in the world. Think of the two different uses
of the same identifier as different identifiers.
Let us now formally define the terms free and bound, based on the
structure of expressions.
Note that both x and y are free in the predicate x ≤ y, while x
remains free and y becomes bound when the predicate is embedded in the
expression (N y: 0 ≤ y < 10: x ≤ y) = 4.
(In the original, arrows mark the quantifier to which each bound occurrence
of an identifier refers.)

  2 ≤ m < n ∧ (A i: 2 ≤ i < m: m mod i ≠ 0)
  2 ≤ m < n ∧ (A n: 2 ≤ n < m: m mod n ≠ 0)                            INVALID (why?)
  (E i: 1 ≤ i < 25: 25 mod i = 0) ∧ (E i: 1 ≤ i < 25: 26 mod i = 0)    INVALID
  (E t: 1 ≤ t < 25: 25 mod t = 0) ∧ (E i: 1 ≤ i < 25: 26 mod i = 0)
  (A m: n < m < n+6: (E i: 2 ≤ i < m: m mod i = 0))
  (A m: n < m < n+6: (E n: 2 ≤ n < m: m mod n = 0))                    INVALID
  (A m: n < m < n+6: (E k: 2 ≤ k < m: m mod k = 0))    □
We have

(4.4.5)  (E^y_{w*z})^z_{a+u} = (x < w*z ∧ (A i: 0 ≤ i < n: b[i] < w*z))^z_{a+u}
                             = x < w*(a+u) ∧ (A i: 0 ≤ i < n: b[i] < w*(a+u))
Example (4.4.2) shows the replacement of free identifier x by identifier z;
(4.4.3) the replacement of a free identifier by an expression. Example
(4.4.4) illustrates that only free occurrences of an identifier are replaced.
Example (4.4.5) shows two successive substitutions and the introduction of
But this is not the desired predicate, because the i in y-i has become
bound to the quantifier A, since it now occurs within the scope of A.
Care must be taken to avoid such "capturing" of an identifier in the
expression being substituted. To avoid this conflict we can call for first
(automatically) replacing identifier i of E by a fresh identifier k (say), so
that we arrive at
The following two lemmas are stated without proof, for they are fairly
obvious:
Exercises for Section 4.4
(4.4.7) □
Simultaneous substitution
Let x denote a list (vector) of distinct identifiers:
(4.4.9) Ef, or
The second example illustrates the fact that the substitutions must be
simultaneous; if one first replaces all occurrences of x and then replaces
all occurrences of y, the result is x+z + x+z + z, which is not the same.
In general, E^{x,y}_{u,v} can be different from (E^x_u)^y_v.
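The difference between simultaneous and sequential substitution shows up already on a small token-based sketch (Python; the expression x + y is my own example, not the book's):

```python
# Simultaneous substitution: every identifier is replaced in one pass.
def subst_sim(tokens, mapping):
    return [mapping.get(t, t) for t in tokens]

# Single substitution of one identifier, applied repeatedly.
def subst(tokens, old, new):
    return [new if t == old else t for t in tokens]

expr = ['x', '+', 'y']
# Simultaneously substituting (y, x) for (x, y) swaps the identifiers...
assert subst_sim(expr, {'x': 'y', 'y': 'x'}) == ['y', '+', 'x']
# ...while substituting one at a time does not:
assert subst(subst(expr, 'x', 'y'), 'y', 'x') == ['x', '+', 'x']
```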
(4.5.1) (Ei: R: E) or
(4.5.2) (Ai:R:E),
Example 1. Let Person (p) represent the sentence "p is a person". Let
Mortal(x) represent the sentence "x is mortal". Then the sentence "All
men are mortal", or, less poetically but more in keeping with the times,
"All persons are mortal'', can be expressed by (A p: Person (p ):
Mortal(p )). D
Example 2. It has been proved that arbitrarily large primes exist. This
theorem can be stated as follows:
Section 4.5 Quantification Over Other Ranges
since the context indicated that only integers were under consideration.
or, as an abbreviation,

              R ⇒ E
(4.5.4) A-I:  -----------    where i is a fresh identifier.
              (A i: R: E)

              (A i: R: E)
(4.5.5) A-E:  -------------    for any expression e
              R^i_e ⇒ E^i_e

Let us now turn to the inference rules for E. Using the techniques of
earlier sections, E can be defined in terms of A:

              (A i: R: E)
(4.5.6) E-I:  --------------
              ¬(E i: R: ¬E)

              (E i: R: E)
(4.5.7) E-E:  --------------
              ¬(A i: R: ¬E)

                                        (E i: R: E)
(4.5.8) bound-variable substitution:    -------------------
                                        (E k: R^i_k: E^i_k)
        (provided k does not appear free in R and E)

Section 4.6 Some Theorems About Textual Substitution and States
(s; x:v)
s = (s; x: s(x))
We now give three simple lemmas dealing with textual substitution. For-
mal proofs would rely heavily on the caveats given on textual substitution
in definition (4.4.6), and would be based on the structure of the expres-
sions involved. We give informal proofs.
s'(E) = s(E^x_v).    (Lemma 4.6.1) □

  (b; i:e)[j] = ( i = j → e
                | i ≠ j → b[j] )    □
Notice the similarity between the notation (s; x:v) used in section 4.7
to denote a modified state s and the notation (b; i:e) to denote a modified
array b.
Example 2 illustrates nested use of the notation. Since (b; 0:8) is the
array (function) (8, 4, 6), it can be used in the first position of the notation.
Nested parentheses do become burdensome, so we drop them and
rely instead on the convention that rightmost pairs "i:e" are dominant
and have precedence. Thus the last line of example 2 is equivalent to
(b; 0:8; 2:9; 0:7).
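Viewing arrays as functions, (b; i:e) is simply function override, and the rightmost-dominant convention corresponds to the outermost override winning. A Python sketch (the value b = (4, 4, 6) is my assumption; the text fixes only b[1] = 4 and b[2] = 6):

```python
# (b; i:e) is the array (function) equal to b except that it maps i to e.
def override(b, i, e):
    return lambda j: e if i == j else b(j)

b = {0: 4, 1: 4, 2: 6}.__getitem__     # assumed array b = (4, 4, 6)
b1 = override(b, 0, 8)                  # (b; 0:8) = (8, 4, 6)
assert [b1(j) for j in range(3)] == [8, 4, 6]

# Rightmost pairs dominate: (b; 0:8; 2:9; 0:7) maps 0 to 7, not 8.
b2 = override(override(override(b, 0, 8), 2, 9), 0, 7)
assert [b2(j) for j in range(3)] == [7, 4, 9]
```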
perm((c; 0:x), C)
Simplifying expressions
It is sometimes necessary to simplify expressions (including predicates)
containing the new notation. This can often be done using a two-case
analysis as shown below, which is motivated by definition (5.1.2). The
first step is the hardest, so let us briefly explain it. First, note that either
i = j or i ≠ j. In the former case (b; i:5)[j] = 5 reduces to 5 = 5; in the
second case it reduces to b[j] = 5.

    (b; i:5)[j] = 5
  = (i = j ∧ 5 = 5) ∨ (i ≠ j ∧ b[j] = 5)    (Def. of (b; i:5))
  = (i = j) ∨ (i ≠ j ∧ b[j] = 5)            ((5 = 5) = T, and-simpl.)
  = (i = j ∨ i ≠ j) ∧ (i = j ∨ b[j] = 5)    (Distributivity)
  = T ∧ (i = j ∨ b[j] = 5)                  (Excluded middle)
  = i = j ∨ b[j] = 5                        (and-simpl.)
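The equivalence just derived can also be confirmed by exhaustive checking over small arrays and all index pairs (a Python sketch, not part of the text):

```python
# Check  (b; i:5)[j] = 5  ≡  i = j ∨ b[j] = 5  by enumeration.
def override(b, i, e):
    return lambda j: e if i == j else b[j]

for b in ([5, 0, 5], [1, 2, 3]):
    for i in range(3):
        for j in range(3):
            lhs = override(b, i, 5)(j) == 5
            rhs = (i == j) or (b[j] == 5)
            assert lhs == rhs
```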
define a type t and two variables p and q with type t. Each variable contains
two fields; the first is named n and can contain a string of 0 to 10 characters
-e.g. a person's name- and the second is named age and can contain an
integer. The following assignments indicate how the components of p and q can
be assigned and referenced. After their execution, both p and q contain 'Hehner'
in the first component and 32 in the second. Note how q.age refers to field age
of record variable q.
An array consists of a set of individual values, all of the same type (the old
view). A record consists of a set of individual values, which can be of different
types. In order to allow components to have different types we have sacrificed
some flexibility: components must be referenced using their name (instead of an
expression). Nevertheless, arrays and records are similar.
Develop a functional view for records, similar to the functional view for arrays
just presented.
Section 5.2 Array Sections and Pictures
b[0:n-1] denotes the whole array, while if 0 ≤ i ≤ j < n, b[i:j] refers to
the array section composed of b[i], b[i+1], ..., b[j]. If i = j+1, b[i:j]
refers to an empty section of b.
Quite often, we have to assert something like "all elements of array b
are less than x ", or "array b contains only zeroes". These might be writ-
ten as follows.
Be very careful with = and ≠, for the last example shows that b = y can
be different from ¬(b ≠ y)! Similarly, b ≤ y can be different from
¬(b > y).
We also use the notation x ∈ b to assert that the value of x is equal to
(at least) one of the values b[i]. Thus, using domain(b) to represent the
set of subscript values for b, x ∈ b is equivalent to

  (E i: i ∈ domain(b): x = b[i])
Array pictures
Let us now turn to a slightly different subject, using pictures for some
predicates that describe arrays. Suppose we are writing a program to sort
an array b[0:n-1], with initial values B[0:n-1] -i.e. initially, b = B.
We want to describe the following conditions:
where

  ordered(b[0:k-1]) = (A i: 0 ≤ i < k-1: b[i] ≤ b[i+1])

(Picture in the original: array b with section b[k+1:n-1] marked "≤ x".)

while if k = n the section b[k+1:n] is empty. One disadvantage of such
pictures is that they often cause us to forget about singular cases. We
unconsciously think that, since section b[0:k-1] is in the picture, it must
contain something. So use such pictures with care.
An essential property of such pictures is that the formal definition of
assignment (given later in Part II) is useable on pictures when they appear
in predicates. This will be discussed in detail in Part II.
(a) 0 ≤ k ≤ h ≤ n, with sections b[0:k-1] ≤ x, b[k:h-1] = x, and
    b[h:n-1] ≥ x (shown as a picture in the original)
(b) 0 ≤ i < n, with section b[0:i-1] ordered (shown as a picture in the
    original)
96 Part I. Propositions and Predicates
defines an array of arrays. That is, b[0] (and similarly b[1]) is an array
consisting of three elements named b[0][1], b[0][2] and b[0][3]. One can
also have an "array of arrays of arrays", in which case three subscripts
could be used -e.g. d[i][j][k]- and so forth.
Arrays of arrays take the place of two-dimensional arrays in FORTRAN
and PL/I. For example, (5.3.1) could be thought of as equivalent
to the PL/I declaration
We want to define the notation (b; s:e) for any selector s. We do this
recursively on the length of s. The first step is to determine the base case,
(b; ε:e).
Let x be a simple variable (which contains a scalar or function). Since
x and x∘ε are equivalent, the assignments x := e and x∘ε := e are also
equivalent. But, by our earlier notation the latter should be equivalent to
x := (x; ε:e). Therefore, the two expressions e and (x; ε:e) must yield
the same value, and we have

  e = (x; ε:e)
  (b; ε:g) = g

  (b; [i]∘s:e)[j] = ( i = j → (b[j]; s:e)
                    | i ≠ j → b[j] )    □
Example 2. In this and the following examples, let c[1:3] = (6, 7, 8) and
b[0:1][1:3] = ((0, 1, 2), (3, 4, 5)). Then
Again, all but the outer parentheses can be omitted. For example, the
following two expressions are equivalent. They define an array (function)
that is the same as b except in three positions -[i][j], [j] and [k][i].
Modify the notation of this section to allow references to subrecords of arrays and
subarrays of records, etc.
Chapter 6
Using Assertions To Document Programs
(6.1.1) Store in z the product a*b, assuming a and b are initially ≥ 0.
does not indicate where the result of the multiplication should be stored,
and hence it cannot be understood in isolation, as it should be.
English can be ambiguous, so we often rely on more formal specifica-
tion techniques. The notation
The precondition of the program is given, the fixed variables, which must
not be changed, are listed and the postcondition is to be established.
Here are some more examples of specifications (all variables are
integer valued).
Example 1 (array summation). Given are fixed n ≥ 0 and fixed array
b[0:n-1]. Establish

  R: s = (Σ i: 0 ≤ i < n: b[i]).    □
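A program establishing R might look as follows (a sketch of the obvious loop, not the book's solution; the loop invariant is given as a comment):

```python
# Establish R: s = (Σ i: 0 ≤ i < n: b[i]) for fixed n ≥ 0 and
# fixed array b[0:n-1].
def summation(b, n):
    s, i = 0, 0
    # invariant: 0 ≤ i ≤ n  ∧  s = (Σ k: 0 ≤ k < i: b[k])
    while i != n:
        s, i = s + b[i], i + 1
    return s    # R holds: s is the sum of b[0:n-1]

b = [3, 1, 4, 1, 5]
assert summation(b, len(b)) == sum(b) == 14
assert summation([], 0) == 0    # empty array: identity element 0
```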
Again, there is a problem with this specification; the result can be established
simply by setting all elements of b to zeroes. This problem can be
overcome by including a comment to the effect that the only way to alter
b is to swap two of its elements.
Naturally, with large, complex problems there may be difficulty in
specifying programs in this simple manner, and new notation may have to
be introduced to cope with the complexity. But for the most part, the
simple specification forms given above will suffice. Even a compiler can
be specified in such a notation, by judicious use of abstraction:

  {Pascal program (p)}
  compiler
  {IBM 370 program (q) ∧ equivalent(p, q)}
swap: t := x; x := y; y := t
and actually to
(6.2.2)  (A X, Y, x, y: {x = X ∧ y = Y} swap {x = Y ∧ y = X})

(6.2.2) can be read in English as follows: for all (integer) values of X and
Y, if initially x = X and y = Y, then execution of swap establishes x = Y
and y = X.
X and Y denote the initial values of variables x and y, but they also
denote the final values of y and x. An identifier can denote either an ini-
tial or a final value, or even a value upon which the initial or final value
depends. For example, the following is also a specification of swap,
although it is not as easy to understand:
Generally, we will use capital letters in identifiers that represent initial and
final values of program variables, and small letters for identifiers that
name variables in a program.
As a final example, we specify a sort program again, this time using an
extra identifier to alleviate the problem mentioned in example 3 of section
6.1. The predicate perm (c, C) has the meaning "array c is a permutation
of array C, i.e. a rearrangement of C ". See exercise 5 of section 4.2.
Section 6.3 Proof Outlines 103
{x = X ∧ y = Y}
t := x;
{t = X ∧ x = X ∧ y = Y}
x := y;
{t = X ∧ x = Y ∧ y = Y}
y := t
{y = X ∧ x = Y}
The reader can informally verify that, for each statement of the program,
if its precondition -the predicate in braces preceding it- is true, then
execution of the statement terminates with its postcondition -the predi-
cate in braces following it- true.
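The triples of this outline can also be checked mechanically. The following Python sketch (the state representation and names are mine, not the book's) runs each statement from every small state satisfying its precondition and asserts its postcondition:

```python
from itertools import product

def check_triple(pre, stmt, post, names):
    """For every small integer state satisfying pre, run stmt and
    assert post -- a brute-force check of {pre} stmt {post}."""
    for vals in product(range(-2, 3), repeat=len(names)):
        s = dict(zip(names, vals))
        if not pre(s):
            continue
        stmt(s)                      # execute the statement
        assert post(s), s            # postcondition must hold
    return True

X, Y = 1, 2   # arbitrary fixed initial values

# The three triples of the swap proof outline:
triples = [
    (lambda s: s['x'] == X and s['y'] == Y,
     lambda s: s.update(t=s['x']),                      # t := x
     lambda s: s['t'] == X and s['x'] == X and s['y'] == Y),
    (lambda s: s['t'] == X and s['x'] == X and s['y'] == Y,
     lambda s: s.update(x=s['y']),                      # x := y
     lambda s: s['t'] == X and s['x'] == Y and s['y'] == Y),
    (lambda s: s['t'] == X and s['x'] == Y and s['y'] == Y,
     lambda s: s.update(y=s['t']),                      # y := t
     lambda s: s['y'] == X and s['x'] == Y),
]

for pre, stmt, post in triples:
    check_triple(pre, stmt, post, ('t', 'x', 'y'))
```

Such a check is no substitute for the formal proof method of Part II, but it catches mistakes in an outline quickly.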
A predicate placed in a program is called an assertion; we assert it is
true at that point of execution. A program together with an assertion
between each pair of statements is called a proof outline, because it is just
that; it is an outline of a formal proof, and one can understand that the
program satisfies its specification simply by showing that each triple
(precondition, statement, postcondition) satisfies {precondition} statement
{postcondition}. The formal proof method is described in Part II.
Placing assertions in a program for purposes of documentation is often
called annotating the program, and the final program is also called an
annotated program.
Below is a proof outline for the program i:= i+1; s:= s+i.
The proof outline illustrates two new conventions. First, an assertion can
be named so that it can be discussed more easily, by placing the name at
its beginning followed by a colon. Secondly, adjacent assertions -e.g.
{P} {PJ}- mean that the first implies the second -e.g. P ~Pl. The
lines have been numbered solely for reference in a later discussion.
1. P ⇒ P1 (lines 1, 2)
2. {P1} i:= i+1 {P2} (lines 2, 3, 4)
3. P2 ⇒ P3 (lines 4, 5)
4. {P3} s:= s+i {R} (lines 5, 6, 7)
Together, these give the desired result: execution of i:= i+1; s:= s+i
begun in a state satisfying P terminates in a state satisfying R.
The next example illustrates the use of a conditional statement. Note
how the assertion following then is the conjunction of the precondition of
the conditional statement and the test, since this is what is true at that
point of execution. Since both the then-part and the else-part end with
the assertion x =abs (X ), this is what we may conclude about execution
of the conditional statement.
{x = X}
if x < 0 then {x = X ∧ x < 0}
    x:= -x
    {x = -X ∧ x > 0} {x = abs(X)}
else {x = X ∧ x ≥ 0}
    skip
    {x = X ∧ x ≥ 0} {x = abs(X)}
{x = abs(X)}
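A quick way to gain confidence in such an outline is to execute it with the assertions in place. Below is a Python sketch of the annotated conditional (abs_program is my name for it, not the book's):

```python
def abs_program(x):
    """The annotated conditional: assertions mirror the proof outline."""
    X = x                               # X names the initial value of x
    if x < 0:
        assert x == X and x < 0         # assertion after 'then'
        x = -x
        assert x == -X and x > 0        # hence x = abs(X)
    else:
        assert x == X and x >= 0        # assertion after 'else'
        pass                            # skip
    assert x == abs(X)                  # final assertion
    return x

for x0 in range(-5, 6):
    abs_program(x0)
```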
(7.1) wp(S, R): the set of all states such that execution of S begun in any one of
them is guaranteed to terminate in a finite amount of time in a
state satisfying R. □
Let's give some examples for some ALGOL-like commands, based on our
knowledge of how these commands are executed.
Note carefully that {Q} S {R} is really a statement in the predicate cal-
culus, since it is equivalent to Q ~wp(S, R ). Thus, it is either true or
false in any state. When we write it, we usually mean it to be a tautology
-we expect it to be universally true.
A command S is usually designed for a specific purpose -to establish
the truth of one particular postcondition R. So we are not always
interested in the general properties of S, but only in those pertaining to
R. Moreover, even for this R we may not be interested in the weakest
precondition wp(S, R), but usually in some stronger precondition Q (say)
that represents a subset of the set represented by wp(S, R). Thus, if we
can show that Q ~ wp(S, R) without actually forming wp (S, R ), then we
are content to use Q as a precondition.
The ability to work with a precondition that is not the weakest is use-
ful, because the derivation of wp (S, R) itself can be impractical, as we
shall see when we consider loops.
Note that wp is a function of two arguments: a command S and a
predicate R. Consider for the moment an arbitrary but fixed command
S. We can then write wp(S, R) as a function of one argument: wp_S(R).
The function wp_S transforms any predicate R into another predicate
wp_S(R). This is the origin of the term "predicate transformer" for wp_S.
Remark: The notation Q {S} R was first used in 1969 (see chapter 23) to
denote partial correctness. It has the interpretation: if execution of S
begins in a state satisfying Q, and if execution terminates, then the final
state satisfies R.
110 Part II. The Semantics of a Small Language
T {while T do skip} T
{T} while T do skip {T}
The first of these holds, vacuously, because the loop never terminates;
the second, which requires termination, does not.
Some properties of wp
If we are to define a programming notation using the concept of wp,
then we had better be sure that wp is well-behaved. By this we mean that
we should be able to define reasonable, implementable commands using
wp. Furthermore, it would be nice if unimplementable commands would
be rejected from consideration. Let us therefore analyze our interpreta-
tion (7.1) of wp(S, R ), and see whether any properties can be derived
from it.
First, consider the predicate wp (S, F) (for any command S ). This
describes the set of states such that execution of S begun in any one of
them is guaranteed to terminate in a state satisfying F. But no state ever
satisfies F, because F represents the empty set. Hence there could not
possibly be a state in wp(S, F), and we have our first property, the Law
of the Excluded Miracle:

(7.3) wp(S, F) = F, for all commands S.
(7.4) wp(S, Q) ∧ wp(S, R) = wp(S, Q ∧ R)

Let us see why (7.4) is a tautology. First, consider any state s that satisfies
the lefthand side (LHS) of (7.4). Execution of S begun in s will terminate
with both Q and R true. Hence Q ∧ R will also be true, and s is
in wp(S, Q ∧ R). This shows that LHS ⇒ RHS. Next, suppose s is in
wp(S, Q ∧ R). Then execution of S begun in s is guaranteed to terminate
in some state s' of Q ∧ R. Any such s' must be in Q and in R, so s is
also in wp(S, Q) and in wp(S, R); this shows RHS ⇒ LHS.
Chapter 7 The Predicate Transformer wp 111
wp(flip, head) ∨ wp(flip, tail) = F.
But the coin is guaranteed to land with either a head or a tail up, so that
wp(flip, head ∨ tail) = T.
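Over a finite state space, interpretation (7.1) can be computed directly: wp(S, R) is the set of states from which every possible outcome of S satisfies R. The sketch below (modeling a command as a function from a state to its set of possible outcomes, a representation of my choosing) confirms the coin-flip anomaly:

```python
def wp(command, R, states):
    """States from which EVERY outcome of `command` satisfies R."""
    return {s for s in states if all(R(t) for t in command(s))}

states = {'head', 'tail'}
flip = lambda s: {'head', 'tail'}   # nondeterministic coin flip

head = lambda s: s == 'head'
tail = lambda s: s == 'tail'

# wp(flip, head) and wp(flip, tail) are both empty ...
assert wp(flip, head, states) == set()
assert wp(flip, tail, states) == set()
# ... so their disjunction is F, yet wp(flip, head or tail) = T:
assert wp(flip, lambda s: head(s) or tail(s), states) == states
```

This is exactly why (7.7) holds with equality only for deterministic commands.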
1. Determine wp(S, R) for the pairs (S, R) given below.

    S                        R
(a) i:= i+1                  i > 0
(b) i:= i+2; j:= j-2         i+j = 0
(c) i:= i+1; j:= j-1         i*j = 0
(d) z:= z*j; i:= i-1         z*j^i = c
(e) a[i]:= 1                 a[i] = a[j]
(f) a[a[i]]:= i              a[i] = i
2. Examples 1-5 of this section each gave a predicate in the form wp (S, R) = Q.
Rewrite each of these in the form { Q} S {R }, just to get used to the two dif-
ferent notations. For example, example 2 would be written as
3. Prove (7.5) and (7.6). Don't rely on the notion of execution and interpretation
(7.1); prove them only from (7.4) and the laws of predicate calculus.
4. Prove using (7.4) that (wp(S, R) ∧ wp(S, ¬R)) = F.
5. Give an example to show that the following is not true for all states:
(wp(S, R) ∨ wp(S, ¬R)) = T.
6. Show that (7.7) holds for deterministic S. (It cannot be proved from axioms
(7.3)-(7.4); it must be argued based on the definitions of determinism and wp, as
was done for (7.3) and (7.4).)
7. Suppose Q ⇒ wp(S, R) has been proven for particular Q, R and S.
Analyze fully the statement

(7.8) {(A x: Q)} S {(A x: R)}
Exercises for Chapter 7 113
(Is it true in general; if not, what restrictions must be made so that it holds for
"reasonable" classes of predicates Q, R and commands S, etc.) Hint: be careful
to consider the case where x appears in S. You may want to answer the question
under the ground rule that the appearance of x in S means that (7.8) is invalid,
and that the quantified identifier x should be changed before proceeding. It is
also instructive, however, to answer this question without using this ground rule.
See section 4.3.
8. Suppose Q ~ wp (S, R) has been proven for particular Q, R and S.
Analyze fully the statement
{(E x: Q)} S {(E x: R)}

(Is it true in general; if not, what restrictions must be made so that it holds for
"reasonable" classes of predicates Q, R and commands S, etc.) See the hint on
exercise 7.
Chapter 8
The Commands skip, abort and Composition
That is, it doesn't matter whether one thinks of S1; S2; S3 as S1 composed
with S2; S3 or as S1; S2 composed with S3, and it is all right to
leave the parentheses out. (Similarly, because addition is associative,
a+b+c is well-defined because a+(b+c) yields the same result as
(a+b)+c.)
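The composition rule wp("S1; S2", R) = wp(S1, wp(S2, R)) makes this precise: composing commands composes their predicate transformers, and function composition is associative. A small Python sketch for deterministic commands (states as dictionaries, my representation, not the book's):

```python
# A deterministic command is a state -> state function;
# wp("S", R) is then the predicate  lambda s: R(S(s)).
def wp(command, R):
    return lambda s: R(command(s))

def seq(S1, S2):                 # the command "S1; S2"
    return lambda s: S2(S1(s))

S1 = lambda s: {**s, 'i': s['i'] + 1}        # i := i+1
S2 = lambda s: {**s, 's': s['s'] + s['i']}   # s := s+i
S3 = lambda s: {**s, 'i': 0}                 # i := 0

R = lambda st: st['i'] == 0
state = {'i': 3, 's': 0}

# Either grouping of S1; S2; S3 gives the same precondition value:
left  = wp(seq(seq(S1, S2), S3), R)
right = wp(seq(S1, seq(S2, S3)), R)
assert left(state) == right(state)
```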
Be aware of the role of the semicolon; it is used to combine adjacent,
independent commands into a single command, much the way it is used in
English to combine independent clauses. (For an example of its use in
English, see the previous sentence.) It can be thought of as an operator
that combines, just as catenation is used in Pascal and PL/I to combine
two strings of characters. Once this is understood, there should be no
confusion about where to put a semicolon.
Our use of the semicolon conforms not only to English usage, but also
to its original use in the first programming notation that contained it,
ALGOL 60. It is a pity that the designers of PL/I and Ada saw fit to go
against convention and use the semicolon as a statement terminator, for it
has caused great confusion.
Thus far, we don't have much of a programming notation -about all
we can write is a sequence of skips and aborts. In the next chapter we
define the assignment command. Before reading ahead, though, perform
some of the exercises in order to get a firm grasp of this (still simple)
material.
where
Predicate domain (e) will not be formally defined, since expressions e are
not. However, it must exclude all states in which evaluation of e would
be undefined -e.g. because of division by zero or subscript out of range.
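The role of domain(e) can be made concrete. In the sketch below (a hand-rolled model, with an expression represented by a value function plus a definedness predicate -- the names are mine), wp("x:= 1/y", x > 0) comes out as y ≠ 0 cand 1/y > 0:

```python
# wp("x := e", R) = domain(e) cand (R with x replaced by the value of e).
def wp_assign(var, evaluate, defined, R):
    def pre(s):
        if not defined(s):           # outside domain(e): no guarantee
            return False
        t = dict(s)
        t[var] = evaluate(s)         # textual substitution, semantically
        return R(t)
    return pre

# Example: wp("x := 1/y", x > 0) should be  y != 0 cand 1/y > 0.
pre = wp_assign('x',
                evaluate=lambda s: 1 / s['y'],
                defined=lambda s: s['y'] != 0,
                R=lambda s: s['x'] > 0)

assert pre({'x': 0, 'y': 2}) is True    # y > 0: precondition holds
assert pre({'x': 0, 'y': -2}) is False  # 1/y < 0
assert pre({'x': 0, 'y': 0}) is False   # division undefined
```

The "cand" (conditional and) of the definition shows up as the early return: the substitution is never evaluated in a state outside domain(e).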
This example required explicit use of the term domain (e) of definition
(9.1.1). D
Thus, x will contain the value b[i] upon termination iff i is a valid
subscript for array b. □
Section 9.1 Assignment to Simple Variables 119
wp("x:= e", y = c) = (y = c). □
1. Determine and simplify wp(S, R) for the pairs (S, R) given below.

    S                              R
(a) x:= 2*y+3                      x = 13
(b) x:= x+y                        x < 2*y
(c) j:= j+1                        0 < j ∧ (A i: 0 ≤ i ≤ j: b[i] = 5)
(d) all5:= (b[j] = 5)              all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(e) all5:= all5 ∧ (b[j] = 5)       all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(f) x:= x*y                        x*y = c
(g) x:= (x-y)*(x+y)                x + y² ≤ 0
2. Prove that definition (9.1.3) satisfies laws (7.3), (7.4) and (7.7). The latter
shows that assignment is deterministic.
3. Review section 4.6 (Some theorems about textual substitution). Let s be the
machine state before execution of x := e and let s' be the final state. Describe s
and s' in terms of how x := e is executed. (What, for example, should be the
value in x upon termination?) Then show that for any predicate R, s'(R) is true
iff s(R^x_e) is true. Finally, argue that this last fact shows that the definition of
assignment is consistent with our operational view of assignment.
4. One can write a "forward rule" for assignment, which from a precondition
derives the strongest postcondition sp(Q, "x:= e") such that execution of x:= e
with Q true leaves sp(Q, "x:= e") true (in the definition below, v represents the
initial value of x ):
Show that this definition is also consistent with our model of execution. One way
to do this is to show that execution of x := e with Q true is guaranteed to ter-
minate with sp ( Q, "x: = e ") true:
where the x; are distinct simple variables and the e; are expressions. For
purposes of explanation the assignment is abbreviated as x:= e. That is,
any identifier with a bar over it represents a vector (of appropriate
length).
The multiple assignment command can be executed as follows. First
evaluate the expressions, in any order, to yield values v₁, ···, vₙ. Then
assign v₁ to x₁, v₂ to x₂, ···, vₙ to xₙ, in that order. (Because the xᵢ are
distinct, the order of assignment doesn't matter. However, a later generalization
will require left-to-right assignment.)
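Python's tuple assignment happens to follow the same rule: all right-hand expressions are evaluated first, then assigned to the targets. For example:

```python
# i, s := i+1, s+b[i] : both right-hand sides use the OLD value of i.
i, s = 3, 10
b = [1, 2, 3, 4, 5]

i, s = i + 1, s + b[i]       # b[i] is b[3] = 4, read before i changes
assert (i, s) == (4, 14)

# A swap therefore needs no temporary variable:
x, y = 1, 2
x, y = y, x
assert (x, y) == (2, 1)
```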
The multiple assignment is useful because it easily describes a state
change involving more than one variable. Its formal definition is a simple
extension of assignment to one variable:
where domain(e) describes the set of states in which all the expressions in
the vector e can be evaluated:
We have:
Exercises for Section 9.2 123
wp("a:= a+1; b:= x", a = b)
= wp("a:= a+1", a = x)
= (a+1 = x)
3. Determine and simplify wp(S, R) for the pairs (S, R) given below.

    S                         R
(a) z, x, y:= 1, c, d         z*x^y = c^d
(b) i, s:= 1, b[0]            1 ≤ i < n ∧ s = b[0]+···+b[i-1]
(c) a, n:= 0, 1               a² < n ∧ (a+1)² ≥ n
(d) i, s:= i+1, s+b[i]        0 < i < n ∧ s = b[0]+···+b[i-1]
(e) i:= i+1; j:= j+i          i = j
(f) j:= j+i; i:= i+1          i = j
(g) i, j:= i+1, j+i           i = j
since both change b to represent the function (b; i:e). But (9.3.1) is an
assignment to a simple variable. Since assignment to a simple variable is
already defined in (9.1.1), so is assignment to a subscripted variable! We
have, using definition (9.1.1),
Section 9.3 Assignment to an Array Element 125
Remark: The notation (b; i:e) is used in defining assignment to array elements
and in reasoning about programs, but not in programs. For traditional
reasons, the assignment command is still written as b[i]:= e. □
performed here was explained at the end of section 5.1, so reread that
part if you are having trouble with it. 0
(9.4.2) x̄ ∘ s̄ := ē.

Note that a simple assignment x:= e has form (9.4.1) -with n = 1 and
s₁ = ε- since it is the same as x ∘ ε := e. Also, the assignment b[i]:= e
has this form, with n = 1, x₁ = b, s₁ = [i] and e₁ = e.
The multiple assignment can be executed in a manner consistent with
the formal definition given below as follows:
To get some idea for the predicate transformer, let's look at the definition
of multiple assignment to simple variables:
The difficulty with (9.4.4) is that textual substitution is defined only for
identifiers, and so R^{x̄∘s̄}_{ē} is as yet undefined. We now generalize the
notion of textual substitution to include the new case by describing how
to massage R^{x̄∘s̄}_{ē} into the form of a conventional textual substitution.
The generalization will be done so that the manner of execution given in
(9.4.3), including the left-to-right order of assignment, will be consistent
with definition (9.4.4).
To motivate the generalization, consider the assignment
Why? Suppose two of the selectors sᵢ and sⱼ (say), where i < j, are the
same. Then, after execution of (9.4.5), the value of eⱼ (and not of eᵢ) will
be in b ∘ sⱼ, and thereafter a reference b ∘ sⱼ should yield eⱼ. But this is
exactly the case with execution of (9.4.6); the left-to-right order of assignment
during execution of (9.4.5) is reflected in the right-to-left precedence
rule for applying function (b; s₁:e₁; ···; sₘ:eₘ) to an argument.
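The function (b; s₁:e₁; ···; sₘ:eₘ) can be modeled as a lookup that tries the override pairs from right to left before falling back to b. A Python sketch (the representation is mine, not the book's):

```python
def override(b, *pairs):
    """The function (b; s1:e1; ...; sm:em): pairs are consulted with
    right-to-left precedence, so a later pair wins for equal selectors."""
    def apply(k):
        for s, e in reversed(pairs):    # rightmost pair first
            if s == k:
                return e
        return b[k]                     # fall back to b itself
    return apply

b = [10, 20, 30]
f = override(b, (1, 'first'), (1, 'second'))  # duplicate selector 1

assert f(1) == 'second'      # the later (rightmost) assignment wins
assert f(0) == 10            # other selectors fall through to b
assert f(2) == 30
```

The right-to-left search is exactly the precedence rule described above, mirroring left-to-right execution of the multiple assignment.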
Secondly, note that for distinct identifiers b and c and selectors s and
t (which need not be distinct) the assignments b ∘ s, c ∘ t := e, g and
c ∘ t, b ∘ s := g, e should have the same effect. This is because b ∘ s
and c ∘ t refer to different parts of computer memory, and what is
assigned to one cannot affect what is assigned to the other. (Remember,
expressions e and g are evaluated before any assignments are made.)
This leads us to the following
provided that identifier b does not begin any of the xᵢ. This rule
indicates how multiple assignments to subparts of an object b can
be viewed as a single assignment to b. □
Note that the swap performs correctly when i = j, since this case is
automatically included in the above derivation. If this derivation seems
too fast for you, reread section 5.1. D
The last line follows because if k ≠ i and k ≠ j then (b; i:b[j]; j:b[i])[k]
= b[k]. The only array values changed by the swap are b[i] and
b[j]. □
(a) R^{b[j], x}_{e, g}

(b) R^{x, b[j]}_{e, g}
4. Derive a definition for a general multiple assignment command that can include
assignments to simple variables, array elements and Pascal record fields. (See
exercise 1 of section 5.3.)
5. Prove that lemma 4.6.3 holds for the extended definition of textual substitution:
Lemma. Suppose each xᵢ of list x̄ has the form identifier ∘ selector and
suppose ū is a list of fresh, distinct identifiers. Then
Chapter 10
The Alternative Command
(10.1) if x ≥ 0 → z:= x
       □ x ≤ 0 → z:= -x
       fi

or, written on one line,

if x ≥ 0 → z:= x □ x ≤ 0 → z:= -x fi
Command (10.1) contains two entities of the form B → S (separated
by the symbol □) where B is a Boolean expression and S a command.
B → S is called a guarded command, for B acts as a guard at the gate
→, making sure S is executed only under the right conditions. To execute
(10.1), find one true guard and execute its corresponding command.
Thus, with x > 0 execute z:= x, with x < 0 execute z:= -x, and with
x = 0 execute either (but not both) of the assignments.
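This execution rule can be sketched directly in Python: collect the commands whose guards are true, abort if there are none, and otherwise execute an arbitrary one (random.choice models the nondeterministic choice; the representation is mine):

```python
import random

class Abort(Exception):
    """No guard true: execution of the alternative command aborts."""

def IF(guarded_commands, state):
    """Execute  if B1 -> S1 [] ... [] Bn -> Sn fi  on `state`."""
    candidates = [S for B, S in guarded_commands if B(state)]
    if not candidates:
        raise Abort(state)
    random.choice(candidates)(state)     # any one true guard may be chosen

# Command (10.1): if x >= 0 -> z := x  []  x <= 0 -> z := -x  fi
abs_cmd = [
    (lambda s: s['x'] >= 0, lambda s: s.update(z=s['x'])),
    (lambda s: s['x'] <= 0, lambda s: s.update(z=-s['x'])),
]

for x in (-3, 0, 5):       # at x = 0 either command may run; both give z = 0
    s = {'x': x, 'z': None}
    IF(abs_cmd, s)
    assert s['z'] == abs(x)
```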
This brief introduction has glossed over a number of important points.
Let us now be more precise in describing the syntax and execution of the
alternative command.
(10.2) if B₁ → S₁
       □ B₂ → S₂
Typically, we assume that the guards are total functions -i.e. are well-
defined in all states. This allows us to simplify the definition by deleting
the first conjunct. Thus, with the aid of quantifiers we rewrite the defini-
tion in ( 10.3b) below. From now on, we will use ( 10.3b) as the definition,
but be sure the guards are well-defined in the states in which the alterna-
tive command will be executed!
wp((10.1), z = abs(x))
= (x ≥ 0 ∨ x ≤ 0)                                (this is BB)
  ∧ (x ≥ 0 ⇒ wp("z:= x", z = abs(x)))            (B₁ ⇒ wp(S₁, R))
  ∧ (x ≤ 0 ⇒ wp("z:= -x", z = abs(x)))           (B₂ ⇒ wp(S₂, R))
= T ∧ (x ≥ 0 ⇒ x = abs(x)) ∧
      (x ≤ 0 ⇒ -x = abs(x))
= T ∧ T ∧ T
= T  □
we calculate:
Hence we see that array b should not contain the value 0, and that the
definition of p as the number of values greater than zero in b [O:i - I] will
be true after execution of the alternative command if it is true before. D
The reader may feel that there was too much work in proving what we
did in example 2. After all, the result can be obtained in an intuitive
manner, and perhaps fairly easily (although one is likely to overlook the
problem with zero elements in array b ). At this point, it is important to
practice such formal manipulations. It results in better understanding of
the theory and better understanding of the alternative command itself.
if x ≥ 0 → skip □ x ≤ 0 → x:= -x fi
Its counterpart in ALGOL, if x < 0 then x:= -x, has the default that if
x ≥ 0 execution is equivalent to execution of skip. Although a program
may be a bit longer because of the lack of a default, there are advantages.
The explicit appearance of each guard does aid the reader; each alterna-
tive is given in full detail, leaving less chance of overlooking something.
More importantly, the lack of a default helps during program develop-
ment. Upon deriving a possible alternative command, the programmer is
forced to derive the conditions under which its execution will perform
satisfactorily and, moreover, is forced to continue deriving alternatives
until at least one is true in each possible initial state. This point will
become clearer in Part III.
The absence of defaults introduces, in a reasonable manner, the possi-
bility of nondeterminism. Suppose x = 0 when execution of command
(10.1) begins. Then, since both guards x ≥ 0 and x ≤ 0 are true, either
command may be executed (but only one of them). The choice is entirely
up to the executor -for example it could be a random choice, or on days
with odd dates it could be the first and on days with even dates it could
be the second.
(1) Q ⇒ BB
(2) Q ∧ Bᵢ ⇒ wp(Sᵢ, R), for all i, 1 ≤ i ≤ n.
Proof We first show how to take Q outside the scope of the quantifica-
tion in assumption 2 of the theorem:
Hence, we have
That is, the search has been narrowed down to array section b[i:j], and k
is an index into this section. We want to prove that
holds. The first assumption Q ⇒ BB of theorem (10.5) holds, because the
disjunction of the guards in (10.6) is equivalent to T. The second
assumption holds, because

Q ∧ b[k] ≤ x ⇒ x ∈ b[k:j]
            = wp("i:= k", x ∈ b[i:j]), and
Q ∧ b[k] ≥ x ⇒ x ∈ b[i:k]
            = wp("j:= k", x ∈ b[i:j]).

The two implications follow from the fact that Q indicates that the array
is ordered and that x is in b[i:j] and from the second conjunct of the
antecedents. Hence the theorem allows us to conclude that (10.6) is
true. □
6. Arrays f [O:n] and g [O:m] are alphabetically ordered lists of names of people.
It is known that at least one name is on both lists. Let X represent the first (in
alphabetic order) such name. Calculate and simplify the weakest precondition of
the following alternative command with respect to predicate R given after it.
Assume i and j are within the array bounds.
8. The command in the following proof outline could be used in an algorithm that
determines the maximum value m of an array b[0:n-1]. Using theorem 10.5,
prove that it is true.
The iterative command, pictured as a flowchart with test B (exit when B
is false) and body S, is written

do B → S od
(11.1) is equivalent to

do BB → if B₁ → S₁
        □ ···
        □ Bₙ → Sₙ
        fi
od

or do BB → IF od
That is, if all the guards are false, which means that BB is false, execution
terminates; otherwise, the corresponding alternative command IF is exe-
cuted and the process is repeated. One iteration of a loop, therefore, is
equivalent to finding BB true and executing IF.
Thus, we can get by with only the simple while-loop. Nevertheless, we
will continue to use the more general form because it is extremely useful
in developing programs, as we will see in Part III.
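Mirroring this description, a Python sketch of DO repeatedly performs one IF-step while some guard is true, stopping when BB is false (random.choice again models the nondeterministic choice among true guards; the representation is mine):

```python
import random

def DO(guarded_commands, state):
    """Execute  do B1 -> S1 [] ... [] Bn -> Sn od:  while some guard is
    true (BB), execute the corresponding IF; stop when BB is false."""
    while True:
        candidates = [S for B, S in guarded_commands if B(state)]
        if not candidates:                 # BB false: terminate
            return state
        random.choice(candidates)(state)   # one iteration = one IF

# do i < n -> i, s := i+1, s+b[i] od  (the summation loop of this chapter)
b = [2, 4, 6]
s = DO([(lambda st: st['i'] < len(b),
         lambda st: st.update(i=st['i'] + 1, s=st['s'] + b[st['i']]))],
       {'i': 1, 's': b[0]})
assert s['s'] == sum(b)
```

Note that the keyword arguments to `update` are evaluated before the update happens, so the body behaves like the multiple assignment i,s:= i+1, s+b[i].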
H₀(R) = ¬BB ∧ R
Let us also write a predicate Hₖ(R), for k > 0, to represent the set of all
states in which execution of DO terminates in k or fewer iterations, with
R true. The definition will be recursive -i.e. in terms of Hₖ₋₁(R). One
case is that DO terminates in 0 iterations, in which case H₀(R) is true.
The other case is that at least one iteration is performed. Thus, BB must
initially be true and the iteration consists of executing a corresponding IF.
This execution of IF must terminate in a state in which the loop will
iterate k-1 or fewer times. This leads to

Hₖ(R) = H₀(R) ∨ wp(IF, Hₖ₋₁(R)), for k > 0
i, s := 1, b[0];
do i < n → i,s:= i+1, s+b[i] od
{R: s = (Σ k: 0 ≤ k < n: b[k])}
How can we argue that it works? Let's begin by giving a predicate P that
shows the logical relationship between variables i, s and b -in effect, it
serves as a definition of i and s:
P: 1 ≤ i ≤ n ∧ s = (Σ k: 0 ≤ k < i: b[k])
We will show that P is true just before and after each iteration of the
loop, so that it is also true upon termination. If P is true in all these
places, then, with the additional help of the falsity of the guards, we can
see that R is also true upon termination (since P ∧ i ≥ n ⇒ R). We
summarize what we need to show by annotating the algorithm:
Chapter 11 The Iterative Command 141
{T}
i, s := 1, b[0];
{P}
(11.3) do i < n → {i < n ∧ P} i,s:= i+1, s+b[i] {P} od
{i ≥ n ∧ P}
{R}.
Now let's show that an iteration of the loop terminates with P true -i.e.
an execution of command i,s:= i+1, s+b[i] beginning with P and
i < n true terminates with P still true. Again, we can see this informally
or we can formally prove it:
t: n - i.
{b ≥ 0}
x, y, z := a, b, 0;
(11.4) do y > 0 ∧ even(y) → y,x:= y÷2, x+x
       □ odd(y) → y,z:= y-1, z+x
       od
{R: z = a*b}
z + x*y = (z+x) + x*(y-1). For the first guarded command, note that
execution of y,x:= y÷2, x+x with y even leaves the value of z + x*y
unchanged, because x*y = (x+x)*(y÷2) when y is even. We leave the
more formal verification to the reader (exercise 7).
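Assuming the invariant for this loop is P: z + x*y = a*b ∧ y ≥ 0 (the relation the text is preserving), the program transliterates into Python with P asserted around every iteration:

```python
def multiply(a, b):
    """Compute a*b by loop (11.4), checking the invariant
    P:  z + x*y = a*b  and  y >= 0  around every iteration."""
    assert b >= 0
    x, y, z = a, b, 0
    assert z + x * y == a * b and y >= 0          # P holds initially
    while y > 0:                                  # BB, given y >= 0
        if y % 2 == 0:
            y, x = y // 2, x + x                  # leaves z + x*y unchanged
        else:
            y, z = y - 1, z + x                   # likewise
        assert z + x * y == a * b and y >= 0      # P restored
    return z                                      # P and y = 0 imply z = a*b

assert multiply(7, 13) == 91
assert multiply(-5, 6) == -30
assert multiply(9, 0) == 0
```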
Since each iteration of the loop leaves P true, P must be true upon
termination. We show that P together with the falsity of the guards
implies the result R as follows:
The work done thus far is conveyed by the following annotated program.
{b ≥ 0}
x, y, z:= a, b, 0;
{P}
do y > 0 ∧ even(y) → {P ∧ y > 0 ∧ even(y)} y,x:= y÷2, x+x {P}
(11.5) □ odd(y) → {P ∧ odd(y)} y,z:= y-1, z+x {P}
od
{P ∧ y ≤ 0 ∧ ¬odd(y)}
{P ∧ y = 0}
{R: z = a*b}
1'. P ∧ BB ⇒ wp(IF, P)
Finally, we leave to the reader (exercise 4) the proof that, for all k ≥ 0,

(11.7) P ∧ t ≤ k ⇒ Hₖ(P ∧ ¬BB).
Discussion
A loop has many invariants. For example, the predicate x*0 = 0 is an
invariant of every loop since it is always true. But an invariant that satisfies
the assumptions of theorem (11.6) is important because it provides
understanding of the loop. Indeed, every loop, except the most trivial,
should be annotated with an invariant that satisfies the theorem.
As we shall see in Part III, the invariant is not only useful to the
reader, it is almost necessary for the programmer. We shall give heuristics
tics for developing the invariant and bound function before developing the
loop and argue that this is the more effective way to program. This
makes sense if we view the invariant as simply the definition of the vari-
ables and remember the adage about precisely defining variables before
{Q}
{inv P: the invariant}
{bound t: the bound function}
(11.8) do B₁ → S₁
       □ ···
       □ Bₙ → Sₙ
od
{R}
When faced with a loop of form (11.8), according to theorem (11.6)
the reader need only check the points given in (11.9) to understand that
the loop is correct. The existence of such a checklist is indeed an advantage,
for it allows one to be sure that nothing has been forgotten. In fact,
the checklist is of use to the programmer himself, although after a while
(pun) its use becomes second-nature.
{T}
i, s:= 10, 0;
{inv P: 0 ≤ i ≤ 10 ∧ s = (Σ k: i+1 ≤ k ≤ 10: b[k])}
{bound t: i}
do i ≠ 0 → i,s:= i-1, s+b[i] od
{R: s = (Σ k: 1 ≤ k ≤ 10: b[k])}
Exercises for Chapter 11 147
9. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds the position i of x in array b[0:n-1] if x ∈ b[0:n-1] and sets
i to n if it is not.

{0 ≤ n}
i:= 0;
{inv P: 0 ≤ i ≤ n ∧ x ∉ b[0:i-1]}
{bound t: n - i}
do i < n cand x ≠ b[i] → i:= i+1 od
{R: (0 ≤ i < n ∧ x = b[i]) ∨ (i = n ∧ x ∉ b[0:n-1])}
10. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm sets i to the highest power of 2 that is at most n.

{0 < n}
i:= 1;
{inv P: 0 < i ≤ n ∧ (E p: i = 2^p)}
{bound t: n - i}
do 2*i ≤ n → i:= 2*i od
{R: 0 < i ≤ n < 2*i ∧ (E p: i = 2^p)}
11. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the nth Fibonacci number fₙ for n > 0, which is defined by
f₀ = 0, f₁ = 1, and fₙ = fₙ₋₁ + fₙ₋₂ for n > 1.

{n > 0}
i, a, b:= 1, 1, 0;
{inv P: 1 ≤ i ≤ n ∧ a = fᵢ ∧ b = fᵢ₋₁}
{bound t: n - i}
do i < n → i,a,b:= i+1, a+b, a od
{R: a = fₙ}
12. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the quotient q and remainder r when x is divided by y.

{x ≥ 0 ∧ 0 < y}
q, r:= 0, x;
{inv P: 0 ≤ r ∧ 0 < y ∧ q*y + r = x}
{bound t: r}
do r ≥ y → r, q:= r-y, q+1 od
{R: 0 ≤ r < y ∧ q*y + r = x}
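The algorithm of exercise 12 transliterates directly into Python, with the invariant and result asserted (the function name is mine):

```python
def divmod_slow(x, y):
    """Quotient and remainder by repeated subtraction (exercise 12),
    with the invariant  P: 0 <= r and 0 < y and q*y + r = x  checked."""
    assert x >= 0 and 0 < y
    q, r = 0, x
    while r >= y:
        assert 0 <= r and 0 < y and q * y + r == x   # P before the body
        r, q = r - y, q + 1
    assert 0 <= r < y and q * y + r == x             # R: P and not (r >= y)
    return q, r

assert divmod_slow(17, 5) == (3, 2)
assert divmod_slow(4, 9) == (0, 4)
```

The bound function t: r decreases by y > 0 each iteration, which is why the loop terminates.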
13. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds an integer k such that b[k] is the maximum value of array b[0:n-1]
-note that if the maximum value occurs more than once the algorithm is nondeterministic.
{0 < n}
i, k := 1, 0;
{inv P: 0 < i ≤ n ∧ b[k] ≥ b[0:i-1]}
{bound t: n - i}
do i < n → if b[i] ≤ b[k] → skip
           □ b[i] ≥ b[k] → k:= i
           fi;
           i:= i+1
od
{R: b[k] ≥ b[0:n-1]}
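A Python transliteration of this algorithm (the function name is mine), using random.choice for the nondeterministic alternative when b[i] = b[k]:

```python
import random

def max_position(b):
    """Exercise 13: find k with b[k] the maximum of b[0:n-1].  When the
    maximum occurs more than once, the choice of k is nondeterministic."""
    n = len(b)
    assert 0 < n
    i, k = 1, 0
    while i < n:
        # if b[i] <= b[k] -> skip  []  b[i] >= b[k] -> k := i  fi
        choices = []
        if b[i] <= b[k]:
            choices.append(lambda: None)        # skip
        if b[i] >= b[k]:
            choices.append(lambda j=i: j)       # k := i
        picked = random.choice(choices)()
        if picked is not None:
            k = picked
        i += 1
    assert all(b[k] >= v for v in b)            # R: b[k] >= b[0:n-1]
    return k

assert max_position([3, 9, 2, 9]) in (1, 3)     # either maximal position
assert max_position([5]) == 0
```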
Chapter 12
Procedure Call
In one sense, using a procedure is exactly like using any other opera-
tion (e.g. +) of the programming notation, and constructing a procedure is
extending the language to include another operation. For example, when
we use + in an expression, we never question how it is performed; we just
assume that it works. Similarly, when writing a procedure call we rely
only on what the procedure does, and not on how it does it. In another
sense, a procedure (and its proof) is a lemma. A program can be considered
a constructive proof that its specification is consistent and computable;
a procedure is a lemma used in the constructive proof.
In the following sections, Pascal-like notations are used for procedure
declaration and call, although the (possible) execution of a procedure call
may not be exactly as in Pascal. The reason is that the main influence in
developing the procedure call here was the need for a simple, understand-
able theorem about its use, and such an influence was beyond the state of
the art when Pascal was developed.
Procedure declaration
A procedure declaration has the form
{Pre: P}
{Post: Q}
proc <identifier>( <par.spec.>; · · · ; <par. spec.>);
<body>
The following restrictions are made on the use of identifiers in a pro-
cedure declaration. The only identifiers that can be used in the body are
the parameters and the identifiers declared in the body itself -i.e. no
"global variables" are allowed. The parameters must be distinct identif-
iers. Precondition P of the body may contain as free only the parameters
with attribute value (and value result); postcondition Q only parameters
with attribute result (and value result). This restriction is essential for a
simple definition of procedure call, but it does not limit procedures or
calls of them in any essential way. P and Q may, of course, contain as
free other identifiers that are not used within the program (to denote ini-
tial values of variables, etc.). See section 12.4 for a way to eliminate this
restriction.
Example. Given fixed x, fixed n > 0 and fixed array b[0:n-1], where
x ∈ b, the following procedure determines the position of x in b, thus
establishing x = b[i].
Note that identifiers have been used to denote the initial values of the
parameters that do not have attribute result, even though the parameters
are not altered during execution of the procedure body. D
Thus, the xᵢ are the value parameters of procedure p, the yᵢ the value-result
parameters and the zᵢ the result parameters. We have left out the
types of the parameters because they don't concern us at this point. (This
is an example of the use of abstraction!)
The name of the procedure is p. The aᵢ, bᵢ and cᵢ are the arguments of
the procedure. The aᵢ are expressions; the bᵢ and cᵢ have the form
identifier ∘ selector -in common parlance, they are "variables". The aᵢ
are the value arguments corresponding to the xᵢ of (12.1.1), the bᵢ the
value-result arguments and the cᵢ the result arguments. Each argument
must have the same type as its corresponding parameter.
The identifiers accessible at the point of call must be different from the
procedure parameters x, y and z. This restriction avoids extra notation
needed to deal with the conflict of the same identifier being used for two
different purposes and is not essential.
To illustrate, here is a call of procedure search of the previous example:
search(50, t, c, position[j]). Its execution stores in position[j] the
position of the value of t in array c[0:49].
A call p(ā, b̄, c̄) can be executed as follows:
(12.2.1) {PR: P^{x̄,ȳ}_{ā,b̄} ∧ (A ū,v̄: Q^{ȳ,z̄}_{ū,v̄} ⇒ R^{b̄,c̄}_{ū,v̄})} p(ā, b̄, c̄) {R}
Proof. Suppose for the moment that we know the values ū, v̄ that will
be assigned to parameters with attribute result. Then execution of the
procedure body B, by itself, can be viewed as a multiple assignment
ȳ,z̄:= ū,v̄. From (12.1.2), we see that the procedure call can be viewed as
the following sequence (12.2.2). In (12.2.2), postcondition R has been
placed at the end and assertions P and Q have been placed suitably
because we expect to use that information subsequently.

(12.2.2) x̄,ȳ:= ā,b̄ {P}; ȳ,z̄:= ū,v̄ {Q}; b̄,c̄:= ȳ,z̄ {R}
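Viewed operationally, (12.2.2) says that a call is three multiple assignments wrapped around the body. A Python sketch of that execution model (the env/local representation and all names are mine, not the book's):

```python
def call(body, value_args, value_result_vars, result_vars, env):
    """Execute p(a, b, c) as:  x,y := a,b ;  body ;  b,c := y,z
    where x are value, y value-result, z result parameters."""
    # x, y := a, b : copy arguments into fresh parameter locals
    local = {'x': list(value_args),
             'y': [env[v] for v in value_result_vars]}
    body(local)                       # body may change local['y'], local['z']
    # b, c := y, z : copy value-result and result parameters back
    for var, val in zip(value_result_vars, local['y']):
        env[var] = val
    for var, val in zip(result_vars, local.get('z', [])):
        env[var] = val

# A body that swaps its two value-result parameters:
def swap_body(local):
    y1, y2 = local['y']
    local['y'] = [y2, y1]

env = {'a': 1, 'b': 2}
call(swap_body, [], ['a', 'b'], [], env)
assert env == {'a': 2, 'b': 1}
```

Note that the argument variables are touched only in the final multiple assignment, which is what lets the theorem treat the body in isolation.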
= Q^{ȳ,z̄}_{ū,v̄} (since it contains no xᵢ or yᵢ!)
In order to be able to use the fact that {P} B {Q} has been proved about
the procedure body, we require that (12.2.3) be true before the call; this is
the first conjunct in the precondition PR of the theorem. Therefore, no
matter what values ū, v̄ execution assigns to the result parameters, Q will
be true in the indicated place in (12.2.2).
Now, we want to determine initial conditions that guarantee the truth
of R upon termination, no matter what values ū, v̄ are assigned to the
result parameters and arguments. R holds after the call if, for all values
ū, v̄, the truth of Q in (12.2.2) implies the truth of R after the call. This
can be written in terms of the initial conditions as

(A ū,v̄: (12.2.4) ⇒ (12.2.5))
holds, where a and b are integer variables and identifiers Y and X de-
note their final values, respectively. We apply theorem (12.2.1) to find a
satisfactory precondition PR:
PR = (a = X ∧ b = Y) ∧
     (A u1,u2: (y1 = Y ∧ y2 = X)[y1,y2 := u1,u2] ⇒ (a = Y ∧ b = X)[a,b := u1,u2])
   = (a = X ∧ b = Y) ∧
     (A u1,u2: (u1 = Y ∧ u2 = X) ⇒ (u1 = Y ∧ u2 = X))
   = (a = X ∧ b = Y) ∧ T
Section 12.2 Two Theorems Concerning Procedure Call 155
Therefore, it is equivalent to

Thus, this last line is also true about the procedure body B. Now apply the theorem as in example 1 to yield the desired result. Hence, (12.2.7) holds.
This illustrates how initial and final values of parameters can be han-
dled. The identifiers that denote initial and final values of parameters can
be replaced by fresh identifiers -or any expressions- to yield another
proof about the procedure body, which can then be used in theorem
12.2.1.  □
Example 3. We now prove correct a call that has array elements as arguments. Consider the procedure of example 1. We want to prove that swap(i, b[i]) interchanges i and b[i] but leaves the rest of array b unchanged. It is assumed that the value of i is a valid subscript. Thus, we want to prove
which assigns the value parameter to both result parameters. Note that postcondition Q does not contain the value parameter. We want to execute the call p(b[i], i, b[i+1]), which assigns b[i] to i and b[i+1]. Thus, it makes sense to try to prove

First, replace the free variable X in the proof of the procedure body by C:

{P: x = C} z1, z2 := x, x {Q: z1 = z2 = C}
b[i] = C ∧
(A v1,v2: v1 = v2 = C ⇒
    v1 = (b; i+1:v2)[i] = (b; i+1:v2)[i+1] = C)
= b[i] = C ∧ (b; i+1:C)[i] = (b; i+1:C)[i+1] = C
(A u,v: Q[y,z := u,v] ⇒ R[b,c := u,v])

(12.2.10) R[b,c := u,v] = Q[y,z := u,v] ∧ I
where the free variables of I are disjoint from b and c. For then the complicated conjunct may be simplified as follows:
= (Q[y,z := u,v] ∧ I)[u,v := b,c]        ((12.2.10))
= Q[y,z := b,c] ∧ I                      (Lemma 3, def. of I)
(12.2.11) R = Q[y,z := b,c] ∧ I
But this is not enough. From (12.2.11) we want to conclude that (12.2.10)
holds, but this is not always the case, because
(Q[y,z := b,c] ∧ I)[b,c := u,v]   need not equal   Q[y,z := u,v] ∧ I
(12.3.2) p(a, b, c, d)
How do we extend theorems 12.2.1 and 12.2.12 to allow for call by reference? Call by reference can be viewed as an efficient form of call by value-result; execution is the same, except that the initial assignments to the reference parameters and the final assignments to their arguments are not needed. But the proof of the procedure body, {P} B {Q}, is consistent with our notion of execution for value-result parameters only if value-result parameters occupy separate locations -assignment to one parameter must not affect the value of any other parameter. When using call by reference, then, we must be sure that this condition is still upheld.
Let us introduce the notation disj(d) to mean that no sharing of memory occurs among the di. For example, disj(d1, d2) holds for different identifiers d1 and d2. Also, disj(b[i], b[i+1]) holds, while disj(b[i], b[j]) is equivalent to i ≠ j.
Further, we say that two vectors x and y are pairwise disjoint, written pdisj(x, y), if each xi is disjoint from each yj -i.e. disj(xi, yj) holds.
Theorems 12.2.1 and 12.2.12 can then be modified to the following:
{P[x,y,d := a,b,d] ∧
 (A u,v,w: Q[y,z,d := u,v,w] ⇒ R[b,c,d := u,v,w])}
    p(a, b, c, d)
{R}    □
holds. Then

{P[x,y,d := a,b,d] ∧ I} p(a, b, c, d) {Q[y,z,d := b,c,d]}    □
and p(a, d)

{PR: P[x,y := a,b] ∧ (A u,v: Q[y,z := u,v] ⇒ R[b,c := u,v])}
    p(a, b, c)
{R}
{PR: P[x,y := a,b] ∧ (A u,v: Q[x,y,z := a,u,v] ⇒ R[b,c := u,v])}
    p(a, b, c)
{R}
holds. In other words, PR ⇒ wp(p(a, b, c), R).  □
{Q(u)} S {R}
{(A u: Q(u))} S {R}
{(E u: Q(u))} S {R}
{Pre: 0 ≤ k ∧ x = X ∧ b = B}
{Post: 0 ≤ p ≤ k ∧ b[p] = X}
proc s(value x: integer;
       var b: array of integer;
       var k, p: integer);
    p, b[k] := 0, x;
    {inv: 0 ≤ p ≤ k ∧ x ∉ b[0:p-1]}
    {bound: k - p}
    do x ≠ b[p] → p := p+1 od
Is the procedure fully specified -i.e. has anything been omitted from the specification that can be proved of the procedure body? Which of the following calls can be proved correct using theorem 12.2.1? Prove them correct.
(a) {T} s(5, c, 0, j) {c[j] = 5}
(b) {0 ≤ m} s(f, c, m, j) {c[j] = f}
(c) {0 < m} s(b[0], c, m, j) {c[j] = c[0]}
(d) {0 < m} s(5, c, m, m) {c[m] = 5}
6. Which of the calls given in exercise 5 can be proved correct using theorem
12.2.12? Prove them correct.
7. Suppose parameters k and p of exercise 5 have attribute value result instead of
var. Can call (d) of exercise 5 be proved correct using theorem 12.3.3? If so, do so. Can it be proved correct using theorem 12.3.4? 12.3.5? If so, do so.
8. Consider the following variation of procedure s of exercise 5.
{Pre: 0 ≤ k ∧ b = B}
{Post: 0 ≤ p ≤ k ∧ b[p] = x}
proc ss(value x: integer;
        value b: array of integer;
        value result k, p: integer);
    p, b[k] := 0, x;
    {inv: 0 ≤ p ≤ k ∧ x ∉ b[0:p-1]}
    {bound: k - p}
    do x ≠ b[p] → p := p+1 od
What is a proof?
The word radical, used above, is appropriate, for the methodology pro-
posed strikes at the root of the current problems in programming and pro-
vides basic principles to overcome them. One problem is that program-
mers have had little knowledge of what it means for a program to be
correct and of how to prove a program correct. The word proof has un-
pleasant connotations for many, and it will be helpful to explain what it
means.
A proof, according to Webster's Third New International Dictionary, is
"the cogency of evidence that compels belief by the mind of a truth or
fact". It is an argument that convinces the reader of the truth of some-
thing.
The definition of proof does not imply the need for formalism or
mathematics. Indeed, programmers try to prove their programs correct in
this sense of proof, for they certainly try to present evidence that compels
their own belief. Unfortunately, most programmers are not adept at this,
as can be seen by looking at how much time is spent debugging. The pro-
grammer must indeed feel frustrated at the lack of mastery of the subject!
Part of the problem has been that only inadequate tools for under-
standing have been available. Reasoning has been based solely on how
164 Part 111. The Development of Programs
programs are executed, and arguments about correctness have been based
on a number of test cases that have been run or hand-simulated. The
intuition and mental tools have simply been inadequate.
Also, it has not always been clear what it means for a program to be
"correct", partly because specifications of programs have been so impre-
cise. Part II has clarified this for us; we call a program S correct -with
respect to a given precondition Q and postcondition R - if {Q} S {R}
holds. And we have formal means for proving correctness.
Thus, our development method will center around the concept of a for-
mal proof, involving weakest preconditions and the theorems for the alter-
native, iterative and procedure call constructs discussed in Part II. In this
connection, the following principle is important:
the reader. In addition, some programs are so large that they cannot be
comprehended fully by one person at one time. Thus, there is a continual
need to strive for balance, conciseness, and even elegance.
The approach we take, then, can be summarized in the following
The Coffee Can Problem. A coffee can contains some black beans and
white beans. The following process is to be repeated as long as possible.
Randomly select two beans from the can. If they are the
same color, throw them out, but put another black bean
in. (Enough extra black beans are available to do this.)
If they are different colors, place the white one back into
the can and throw the black one away.
Execution of this process reduces the number of beans in the can by one.
Repetition of the process must terminate with exactly one bean in the can,
for then two beans cannot be selected. The question is: what, if anything,
can be said about the color of the final bean based on the number of
white beans and the number of black beans initially in the can? Spend
ten minutes on the problem, which is more than it should require, before
reading further.
It doesn't help much to try test cases! It doesn't help to see what happens
when there are initially 1 black bean and 1 white bean, and then to see
what happens when there are initially 2 black beans and one white bean,
etc. I have seen people waste 30 minutes with this approach.
Instead, proceed as follows. Perhaps there is a simple property of the
beans in the can that remains true as beans are removed and that,
together with the fact that only one bean remains, can give the answer.
Since the property will always be true, we will call it an invariant. Well,
suppose upon termination there is one black bean and no white beans.
What property is true upon termination, which could generalize, perhaps,
to be our invariant? One is an odd number, so perhaps the oddness of the
number of black beans remains true. No, this is not the case, in fact the
number of black beans changes from even to odd or odd to even with
each move. But, there are also zero white beans upon termination
-perhaps the evenness of the number of white beans remains true. And,
indeed, yes, each possible move either takes out two white beans or leaves
the number of white beans the same. Thus, the last bean is black if ini-
tially there is an even number of white beans; otherwise it is white.
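The invariant argument can also be checked mechanically. Below is a small Python simulation of the process -a sketch, not part of the book; the function name and structure are mine. Whatever random choices are made, the color of the final bean depends only on the parity of the initial number of white beans.

```python
import random

def final_bean(blacks, whites):
    # Simulate the coffee can process until one bean remains.
    # Invariant: the parity of `whites` never changes, because each
    # move removes either two white beans or none.
    while blacks + whites > 1:
        total = blacks + whites
        # Draw two beans at random, without replacement.
        first_white = random.random() < whites / total
        if first_white:
            second_white = random.random() < (whites - 1) / (total - 1)
        else:
            second_white = random.random() < whites / (total - 1)
        if first_white and second_white:
            whites, blacks = whites - 2, blacks + 1  # two whites out, black in
        elif not first_white and not second_white:
            blacks -= 1                              # two blacks out, black in
        else:
            blacks -= 1                              # white back, black out
    return 'black' if blacks == 1 else 'white'
```

An even initial number of white beans always yields a black final bean; an odd number always yields white.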
Closing the curve. This second problem is solved in essentially the same
manner. Consider a grid of dots, of any size:
Two players, A and B, play the following game. The players alternate
moves, with A moving first. A moves by drawing a solid line between two adjacent dots (vertically or horizontally); B moves by drawing a dotted line between two adjacent dots. For example, after three full moves the grid might be as to the left below. A player may not write over another player's move.
Chapter 13 Introduction 167
A wins the game if he can get a completely closed curve, as shown to the
right above. B, because he goes second, has an easier task: he wins if he
can stop A from getting a closed curve. Here is the question: is there a
strategy that guarantees a win for either A or B, no matter how big the
board is? If so, what is it? Spend some time thinking about the problem
before reading further.
Looking at one trivial case, a grid with one dot, indicates that A cannot
win all the time -four dots are needed for a closed curve. Hence, we
look for a strategy for B to win. Playing the game and looking at test
cases will not find the answer! Instead, investigate properties of closed
curves, for if one of these properties can be barred from the board, A
cannot win. The corresponding invariant is that the board is never in a
configuration in which A can establish that property.
What properties does a closed curve have? It has parallel lines, but B cannot prevent parallel lines. It has an even number of parallel lines, but B cannot prevent this. It has four kinds of angles, ⌞, ⌟, ⌜ and ⌝, but B cannot prevent A from drawing angles. It always has at least one angle ⌞, which opens northeast -and B can prevent A from drawing such an angle! If A draws a horizontal or vertical line, as shown to the left below, then B
A draws a horizontal or vertical line, as shown to the left below, then B
simply fills in the corresponding vertical or horizontal line, if it is not yet
filled in, as shown to the right below. A simpler strategy couldn't exist!
[Figure: A's move, and B's corresponding move]
These two problems have extremely simple solutions, but the solutions
are extremely difficult to find by simply trying test cases. The problems
are easier if one looks for properties that remain true. And, once found,
these properties allow one to see in a trivial fashion that a solution has
been found.
Besides illustrating the inadequacy of solving by test cases, these prob-
lems illustrate the following principle:
In fact, we shall see by examples that the more properties you know
about the objects, the more chance you have of creating an efficient algo-
rithm. But let us leave further examples of the use of this principle to
later chapters.
Programming-in-the-small
For the past ten years, there has been much research in "programming-in-the-small", partially because it seemed to be an area in
which scientific headway could be made. More importantly, however, it
was felt that the ability to develop small programs is a necessary condition
for developing large ones -although it may not be sufficient.
This fact is brought home most clearly with the following argument.
Suppose a program consists of n small components -i.e. procedures,
modules- each with probability p of being correct. Then the probability
P that the whole program is correct certainly satisfies P ≤ p^n. Since n is
large in any good-sized program, to have any hope that the program is
correct requires p to be very, very close to 1. For example, a program
with 10 components, each of which has a 95% chance of being correct, has less than a 60% chance of being correct, while a program with 100 such components has less than a 0.6% chance of being correct!
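The arithmetic is easy to check directly. A two-line Python sketch (the function name is mine; the bound treats component failures as independent at best):

```python
def correctness_bound(p, n):
    # Upper bound on the probability that a program built from n
    # components, each correct with probability p, is itself correct.
    return p ** n

print(correctness_bound(0.95, 10))    # about 0.599, i.e. under 60%
print(correctness_bound(0.95, 100))   # about 0.0059, i.e. under 0.6%
```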
Part III concentrates on the place where many programming errors are
made: the development of small program segments. All the program segments in Part III are between 1 and 25 lines long, with the majority being between 1 and 10. It is true, however, that some of the programs are
short because of the method of development. Concentrating on princi-
ples, with an emphasis on precision, clarity and elegance, can actually
result in shorter programs. The most striking example of this is the pro-
gram The Welfare Crook -see section 16.4.1.
A disclaimer
The methods described in Part III can certainly benefit almost any pro-
grammer. At the same time, it should be made clear that there are other
ways to develop programs. A difficult task like programming requires
many different tools and techniques. Many algorithms require the use of
an idea that simply does not arise from the principles given in this Part,
so this method alone cannot be used to solve them effectively. Some
important ideas, like program transformation and "abstract data types"
are not discussed at all, while others are just touched upon. And, of
course, experience and knowledge can make all the difference in the
world.
Secondly, even though the emphasis is on proofs of correctness, errors
will occur. The wise programmer develops a program with the attitude
that a correct program can and will be developed, provided enough care
and concentration is used, and then tests it thoroughly with the attitude
that it must have a mistake in it. The frequency of errors in mathematical
theorems, proofs, and applications of theorems is well-recognized and
documented, and the area of program-proving will not be an exception.
We must simply learn to live with human fallibility and simplify to reduce
it to a minimum.
Nevertheless, the study of Part III will provide an education in
rigorous thinking, which is essential for good programming. Conscious
application of the principles and strategies discussed will certainly be of
benefit.
first before proceeding! Finally, the reader should do several of the exer-
cises at the end of the section.
Simply reading and listening to lectures on program development can
only teach about the method; in order to learn how to use it, direct
involvement is necessary. In this connection, the following meta-principle
is of extreme importance:
Ideas may be simple and easy to understand, but their application may
require effort. Recognizing a principle and applying it are two different
things.
notation in which the final program is expressed. For example, one can
use the principles and strategies espoused in this book even if the final
program has to be in FORTRAN: one programs into a language, not in
it. To be sure, considerably more than one month of education and train-
ing will be necessary to wean yourself away from QWERTY program-
ming, for old habits are changed very slowly. Nevertheless, I think it is
worthwhile.
Let us now turn to the elucidation of principles and strategies that may
help give the QWERTY programmer a new keyboard.
Chapter 14
Programming as a Goal-Oriented Activity
(14.2) R: z ≥ x ∧ z ≥ y ∧ (z = x ∨ z = y)

if x ≥ y → z := x fi
This program performs the desired task provided it doesn't abort. Recall from theorem 10.5 for the Alternative Construct that, to prevent abortion, precondition Q of the construct must imply the disjunction of the guards, i.e. at least one guard must be true in any initial state defined by Q. But Q, which is T, does not imply x ≥ y. Hence, at least one more guarded command is needed.
Another possible way to establish R is to execute z := y. From the above discussion it should be obvious that y ≥ x is the desired guard. Adding this guarded command yields

(14.3) if x ≥ y → z := x □ y ≥ x → z := y fi
Now, at least one guard is always true, so that this is the desired program. Formally, we know that (14.3) is the desired program by theorem 10.5. To apply the theorem, take
S1: z := x          S2: z := y
B1: x ≥ y           B2: y ≥ x
P: T                R: z ≥ x ∧ z ≥ y ∧ (z = x ∨ z = y)
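Conventional languages have no alternative construct, but its behavior is easy to mimic. Here is a Python sketch -the names are mine- that evaluates both guards, aborts if neither is true, and otherwise executes an arbitrarily chosen command whose guard is true:

```python
import random

def maximum(x, y):
    # Program (14.3): if x >= y -> z := x  []  y >= x -> z := y  fi
    enabled = []
    if x >= y:
        enabled.append(x)          # command z := x is enabled
    if y >= x:
        enabled.append(y)          # command z := y is enabled
    assert enabled, "abort: no guard true"   # cannot happen here
    z = random.choice(enabled)     # nondeterministic choice when x == y
    # Postcondition R: z >= x and z >= y and (z = x or z = y)
    assert z >= x and z >= y and (z == x or z == y)
    return z
```

When x = y, either command may run, but both produce the same final state.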
Discussion
The above development illustrates the following
By this we mean that the desired result, or goal, R, plays a more impor-
tant role in the development of a program than the precondition Q. Of
course, Q also plays a role, as will be seen later. But, in general, more
insight is gained from the postcondition. The goal-oriented nature of pro-
gramming is one reason why the programming notation has been defined
in terms of weakest preconditions (rather than strongest postconditions
-see exercise 4 of section 9.1).
To substantiate this hypothesis of the goal-oriented nature of program-
ming, consider the following. Above, the precondition was momentarily
put aside and a program was developed that satisfied
{T}S{?}
In the example just developed, the postcondition was refined while the
precondition, which was simply T, needed no refining.
A problem is sometimes specified in a manner that lends itself to
several interpretations. Hence, it is reasonable to spend some time mak-
ing the specification as clear and unambiguous as possible. Moreover, the
form of the specification can influence algorithmic development, so that
striving for simplicity and elegance should be helpful. With some prob-
lems, the major difficulty is making the specification simple and precise,
and subsequent development of the program is fairly straightforward.
Often, a specification may be in English or in some conventional nota-
tion -like max(x, y )- that is at too "high a level" for program develop-
ment, and it may contain abbreviations dealing with the applications area
with which the programmer is unfamiliar. The specification is written to
convey what the program is to do, and abstraction is often used to sim-
plify it. More detail may be required to determine how to do it. The
example of setting z to the maximum of x and y illustrates this nicely. It
is impossible to write the program without knowing what max means,
while writing a definition provides the insight needed for further develop-
ment.
The development of ( 14.3) illustrates one basic technique for develop-
ing an alternative construct, which was motivated by theorem 10.5 for the
Alternative Construct.
This technique, and a similar one for the iterative construct, is used often.
Let us return to program ( 14.3) for a moment. It has a pleasing sym-
metry, which is possible because of the nondeterminism. If there is no
reason to choose between z := x and z := y when x =y, one should not be
forced to choose. Programming requires deep thinking, and we should be
spared any unnecessary irritation. Conventional, deterministic notations
force the choice, and this is one reason for preferring the guarded com-
mand notation.
Nondeterminism is an important feature even if the final program turns
out to be deterministic, for it allows us to devise a good programming
methodology. One is free to develop many different guarded commands
completely independently of each other. Any form of determinism, such
as evaluating the guards in order of occurrence (e.g. the PL/ I Select state-
ment), drastically affects the way one thinks about developing alternative
constructs.
A second example
Write a program that permutes (interchanges) the values of integer var-
iables x and y so that x ≤ y. Use the method of development discussed
above.
As a first step, before reading further, write a suitable precondition Q
and postcondition R.
The problem is slightly harder than the first one, for it requires the intro-
duction of notation to denote the initial and final values of variables.
Precondition Q is x = X ∧ y = Y, where identifiers X and Y denote the initial values of variables x and y, respectively. Postcondition R is

R: x ≤ y ∧ ((x = X ∧ y = Y) ∨ (x = Y ∧ y = X))

Remark: One could also use the concept of a permutation and write R as x ≤ y ∧ perm((x, y), (X, Y)).  □
if x ≤ y → skip
 □ y ≤ x → x, y := y, x
fi
Since the disjunction of the guards, x ≤ y ∨ y ≤ x, is always true, the program is correct (with respect to the given Q and R).
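The same guard-evaluation sketch works here; the names are again mine. When x = y both guards are true, and either command establishes R:

```python
import random

def sort2(x, y):
    # if x <= y -> skip  []  y <= x -> x, y := y, x  fi
    enabled = []
    if x <= y:
        enabled.append((x, y))     # skip
    if y <= x:
        enabled.append((y, x))     # x, y := y, x
    x, y = random.choice(enabled)  # nondeterministic when x == y
    assert x <= y                  # postcondition
    return x, y
```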
Note carefully how the theorem for the Alternative Construct is used
to help determine the guards. This should not be too surprising -after
all, the theorem simply formalizes the principles used by programmers to
understand alternative commands.
j = k mod 10
(Note how strategy (14.7) was used, in an informal but careful manner.) The question is: which is to be preferred, (14.8) or segment (14.9) below, which is the same as (14.8) except that its second guard, j ≥ 9, is weaker.
At first thought, (14.9) might be preferred because it executes without abortion in more cases. If initially j = 10 (say), it nicely sets j to 0. But this is precisely why (14.9) is not to be preferred. Clearly, j = 10 is an error caused by a hardware malfunction, a software error, or an inadvertent modification of some kind - j is always supposed to satisfy 0 ≤ j < 10. Execution of (14.9) proceeds as if nothing were wrong and the error goes undetected. Execution of (14.8), on the other hand, aborts if j = 10, and the error is detected.
(14.10) Principle: All other things being equal, make the guards of an alternative command as strong as possible, so that some errors will cause abortion.
The phrase "all other things being equal" is present to make sure that the
principle is reasonably applied. For example, at this point I am not even
prepared to advocate strengthening the first guard, as follows:
This chapter discusses two methods for developing a loop when the
precondition Q, the postcondition R, the invariant P and the bound
function t are given. The first method leads naturally to a loop with a
single guarded command, do B → S od. The second takes advantage of
the flexibility of the iterative construct and generally results in loops with
more than one guarded command.
Checklist 11.9 will be heavily used, and it may be wise to review it
before proceeding. As is our practice throughout, the parts of the
development that illustrate the principles to be covered are discussed in a
formal and detailed manner, while other parts are treated more infor-
mally.
R: s = (Σ j: 0 ≤ j < n: b[j])
Thus, variable i has been introduced. The invariant states that at any
point in the computation s contains the sum of the first i values of b.
The assignment i, s := 0, 0 obviously establishes P, so it will suffice as the initialization. (Note that i, s := 1, b[0] does not suffice because, if n = 0, it cannot be executed. If n = 0, execution of the program must set s to the identity of addition, 0.)
The next step is to determine the guard B for the loop do B → S od. Checklist 11.9 requires P ∧ ¬B ⇒ R, so ¬B is chosen to satisfy it. Comparing P and R, we conclude that i = n will do. The desired guard B of the loop is therefore its complement, i ≠ n. The program looks like

i, s := 0, 0; do i ≠ n → ? od
Now for the command. The purpose of the command is to make progress towards termination -i.e. to decrease the bound function t - and an obvious first choice for it is i := i+1. But this would destroy the invariant, and to reestablish it b[i] must simultaneously be added to s. Thus, the program is

i, s := 0, 0; do i ≠ n → i, s := i+1, s+b[i] od
Remark: For those uneasy with the multiple assignment, the formal proof
that P is maintained is as follows. We have
Discussion
First of all, let us discuss the balance between formality and intuition
observed here. The pre- and postconditions, the invariant and the bound
function were given formally and precisely. The development of the parts
of the program was given less formally, but checklist 11.9, which is based
on the formal theorem for the Iterative Construct, provided most of the
motivation and insight. In order to check the informal development, we
relied on the theory (in checking that the loop body maintained the invariant). This is illustrative of the general approach (13.1) mentioned in
chapter 13.
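The summing loop can also be checked by running it. Here is a Python rendering -the function name is mine- with the invariant asserted on every iteration and the guard written i != n, exactly as developed above:

```python
def array_sum(b):
    # Establishes R: s = (sum j: 0 <= j < n: b[j]), developed with
    # invariant P: 0 <= i <= n  and  s = sum of b[0:i].
    n = len(b)
    i, s = 0, 0                    # establishes P, even when n = 0
    while i != n:                  # do i != n ->
        assert 0 <= i <= n and s == sum(b[:i])   # invariant P holds
        i, s = i + 1, s + b[i]     #   i, s := i+1, s+b[i]
    return s                       # P and i = n imply R
```

Note that Python's tuple assignment evaluates the whole right-hand side first, so it behaves like the multiple assignment in the text.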
An important strategy in the development was finding the guard before the command. And the prime consideration in finding the guard B was that it had to satisfy P ∧ ¬B ⇒ R. So, ¬B was developed and then complemented to yield B.
Section 15.1 Developing the Guard First 181
Some object at first to finding the guard this way, because tradition would use the guard i < n instead of i ≠ n. However, i ≠ n is better, because a software or hardware error that made i > n would result in a nonterminating execution. It is better to waste computer time than suffer the consequences of having an error go undetected, which would happen if the guard i < n were used. This analysis leads to the following
The method used for developing the guard of a loop is extremely sim-
ple and reliable, for it is based on manipulation of static, mathematical
expressions. In this connection, I remember my old days of FORTRAN programming -the early 1960's- when it sometimes took three debugging runs to achieve proper loop termination. The first time the loop
iterated once too few, the second time once too many and the third time
just right. It was a frustrating, trial-and-error process. No longer is this
necessary; just develop ¬B to satisfy P ∧ ¬B ⇒ R and complement it.
Another important point about the development was the stress on ter-
mination. The need to progress towards termination motivated the
development of the loop body; reestablishing the invariant was the second
consideration. Actually, every loop with one guarded command has the
high-level interpretation
(15.1.3) {invariant: P}
         {bound: t}
         do B → Decrease t, keeping P true od
         {P ∧ ¬B}
The invariant P, given below using a diagram, states that x is not in the
already-searched rows b[O:i-1] and not in the already-searched columns
b[i,O:j-1] of the current row i.
(15.1.6) P: 0 ≤ i ≤ m ∧ 0 ≤ j < n ∧ [diagram: "x not here" covering rows 0:i-1 and b[i, 0:j-1]]
The obvious choice is i, j:= 0, 0, for then the section in which "x is not
here" is empty. Next, what should be the guard B of the loop?
B: i ≠ m ∧ (i = m cor x ≠ b[i,j])
and finally to
B: i ≠ m cand x ≠ b[i,j].
The final line is therefore the guard of the loop. The next step is to deter-
mine the loop body. Do it, before reading further.
The purpose of the loop body is to decrease the bound function t, which is the number of elements in the untested section: (m-i)*n - j. P ∧ B, the condition under which the body is executed, implies that i < m, j < n and x ≠ b[i,j], so that element b[i,j], which is in the untested section, can be moved into the tested section. A possible command to do this is j := j+1, but it maintains the invariant P only if j < n-1. So we have the guarded command

j < n-1 → j := j+1
(15.1.7) i, j := 0, 0;
         do i ≠ m cand x ≠ b[i,j] →
             if j < n-1 → j := j+1 □ j = n-1 → i, j := i+1, 0 fi
         od
i, j := 0, 0;
do i ≠ m cand x ≠ b[i,j] →
    j := j+1;
    if j < n → skip □ j = n → i, j := i+1, 0 fi
od
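Program (15.1.7) translates directly into any language with short-circuit evaluation; Python's `and` plays the role of cand. A sketch under my own names, assuming n > 0 as in the development above:

```python
def search2d(b, x):
    # Program (15.1.7): search the m-by-n array b row by row for x.
    # On termination, either b[i][j] == x, or i == m and x is absent.
    m, n = len(b), len(b[0])
    i, j = 0, 0
    while i != m and x != b[i][j]:    # i != m  cand  x != b[i,j]
        if j < n - 1:
            j = j + 1                 # stay in row i
        else:                         # j = n-1: row i is exhausted
            i, j = i + 1, 0
    return i, j
```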
Discussion
Note that operation cand (instead of ∧) is really necessary.
Note that the method for developing an alternative command was used when developing the body of the loop, albeit informally. First, the command j := j+1 was chosen, and it was seen that it performed as desired only if j < n-1. Formally, one must prove
2. The invariant of the loop of the second example was given in terms of a
diagram (see (15.1.6)). Replace the diagram by an equivalent statement in the
predicate calculus.
3. Write a program that, given a fixed integer array b[0:n-1], where n > 0, sets x to the smallest value of b. The program should be nondeterministic if the smallest value occurs more than once in b. The precondition Q, postcondition R, loop invariant P and bound function t are

Q: 0 < n
R: x ≤ b[0:n-1] ∧ (E j: 0 ≤ j < n: x = b[j])
P: 1 ≤ i ≤ n ∧ x ≤ b[0:i-1] ∧ (E j: 0 ≤ j < i: x = b[j])
t: n - i
Section 15.2 Making Progress Towards Termination 185
4. Write a program for the problem of exercise 3, but use the invariant and bound function

5. Write a program that, given a fixed integer n > 0, sets variable i to the highest power of 2 that is at most n. The precondition Q, postcondition R, loop invariant P and bound function t are

Q: 0 < n
R: 0 < i ≤ n < 2*i ∧ (E p: i = 2^p)
P: 0 < i ≤ n ∧ (E p: i = 2^p)
t: n - i
6. Translate program (15.1.7) into the language of your choice -PL/I, Pascal, FORTRAN, etc.- remembering the need for the operation cand. Compare your answer with (15.1.7).
Four-tuple Sort
Consider the following problem. Write a program that sorts the four integer variables q0, q1, q2, q3. That is, upon termination the following should be true: q0 ≤ q1 ≤ q2 ≤ q3.
Implicit is the fact that the values of the variables should be permuted -for example, the assignment q0, q1, q2, q3 := 0, 0, 0, 0 is not a solution, even though it establishes q0 ≤ q1 ≤ q2 ≤ q3. To convey this information explicitly, we use Qi to denote the initial value of qi, and write the formal specification
Note that this includes all pairs, and not just adjacent ones. For example, the number of inversions in (1, 3, 2, 0) is 4. So the bound function is

t: the number of inversions in (q0, q1, q2, q3)
The invariant indicates that the four variables must always contain a
permutation of their initial values. This is obviously true initially, so no
initialization is needed.
In the last section, at this point of the development the guard of the loop was determined. Instead, here we will look for a number of guarded commands, each of which makes progress towards termination. The invariant indicates that the only possible commands are those that swap (permute) the values of two or more of the variables. To keep things simple, consider only swaps of two variables. There are six possibilities: q0, q1 := q1, q0 and q1, q2 := q2, q1, etc.
Now, execution of a command must make progress towards termination. Consider one possible command, q0, q1 := q1, q0. It decreases the number of inversions in (q0, q1, q2, q3) iff q0 > q1. Hence, the guarded command q0 > q1 → q0, q1 := q1, q0 will do. Each of the other 5 possibilities is similar, and together they yield the program
Together with invariant P, this implies the desired result. But note that
only the first three guards were needed to establish the desired result.
Therefore, the last three guarded commands can be deleted, yielding the
program
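The resulting three-command loop can be simulated in Python, again choosing randomly among the true guards (a sketch; the names are mine). Each iteration swaps an out-of-order adjacent pair, so the number of inversions -the bound function- decreases by exactly one:

```python
import random

def sort4(q0, q1, q2, q3):
    # do q0 > q1 -> q0,q1 := q1,q0
    #  [] q1 > q2 -> q1,q2 := q2,q1
    #  [] q2 > q3 -> q2,q3 := q3,q2
    # od
    while True:
        enabled = []
        if q0 > q1: enabled.append(0)
        if q1 > q2: enabled.append(1)
        if q2 > q3: enabled.append(2)
        if not enabled:
            break                    # all guards false: sorted
        k = random.choice(enabled)   # nondeterministic choice
        if k == 0:
            q0, q1 = q1, q0
        elif k == 1:
            q1, q2 = q2, q1
        else:
            q2, q3 = q3, q2
    assert q0 <= q1 <= q2 <= q3      # postcondition
    return q0, q1, q2, q3
```

Whatever order the swaps happen in, the final state is the same, which illustrates the remark below that the program is deterministic in its result.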
Discussion
The approach used here can be summarized as follows.
there is exactly one final state, so that in terms of the result the program
is deterministic.
The number of iterations of the loop is equal to the number of inver-
sions, which is at most 6.
(15.2.3) P: 0 ≤ i ≤ m ∧ 0 ≤ j ≤ n ∧ [diagram: "x not here" covering rows 0:i-1 and b[i, 0:j-1]]
The bound function is the sum of the number of values in the untested section and the number of rows in the untested section: t = (m-i)*n - j + m-i.
The additional value m -i is needed because possibly j = n. As a first
step in the development, determine the initialization for the loop.
The obvious choice is i, j := 0, 0, for then the section in which "x is not here" is empty. Note carefully how the invariant includes j ≤ n, instead of j < n. This is necessary because the number of columns, n, could be 0.
Next, guarded commands for the loop must be developed. What is the
simplest command possible, and what is a suitable guard for it?
Note that this guard has been made as weak as possible. Now, does a
loop with this single guarded command solve the problem? Why or why
not? If not, what other guarded command can be used?
A loop with only this guarded command could terminate with i < m ∧
j = n, and this, together with the invariant, is not enough to prove R.
Indeed, if the first row of b does not contain x, the loop will terminate
after searching through only the first row! Some guarded command must
deal with increasing i.
The command i := i+1 may be executed only if i < m. Moreover, it
has a chance of keeping P true only if row i does not contain x, so con-
sider executing it only under the additional condition j = n. But this
means that j should be set to 0 also, so that the condition on the current
row i is maintained. This leads to the program
(15.2.4) i, j := 0, 0;
         do i ≠ m ∧ j ≠ n cand x ≠ b[i,j] → j := j+1
          □ i ≠ m ∧ j = n                 → i, j := i+1, 0
         od
and this together with P implies the result R. Hence, the program is
correct. Note that in the case i = m the invariant implies that x is not in
rows 0 through m-1 of b, which means that x ∉ b.
190 Part III. The Development of Programs
Discussion
This loop was developed by continuing to develop simple guarded
commands that made progress towards termination until P ∧ ¬BB ⇒ R.
This led to a loop with a form radically different from what most pro-
grammers are used to developing (partly because they don't usually know
about guarded commands). It does take time to get used to (15.2.4) as a
loop for searching a two-dimensional array.
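As a concrete aid to reading it, here is one possible transcription of (15.2.4) into Python with a single loop; the function name and the tuple result are my own choices, not the book's:

```python
def search2d(b, m, n, x):
    # Transcription of program (15.2.4): b is an m-by-n array (list of
    # lists).  On return, either x == b[i][j], or i == m and x is not in b.
    i, j = 0, 0
    while i != m:
        if j != n and x != b[i][j]:
            j += 1                  # first guarded command
        elif j == n:
            i, j = i + 1, 0         # second guarded command
        else:
            break                   # both guards false: x == b[i][j]
    return i, j
```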
This problem is often used to argue for the inclusion of gotos or loop
"exits" in a conventional language, because, unless one uses an extra vari-
able commonly called a "flag", the conventional solution to the problem
needs two nested loops and an "exit" from the inner one:
(15.2.5) i, j := 0, 0;
         while i < m do
         begin while j < n do
                  if x = b[i,j] then goto loopexit
                  else j := j+1;
               i := i+1; j := 0
         end;
         loopexit:
We see, then, that the guarded command notation and the method of
development together lead to a simpler, easier-to-understand solution to
the problem - provided one understands the methodology.
How could program (15.2.4) be executed effectively? An optimizing
compiler could analyze the guards and commands and determine the
paths of execution given in diagram (15.2.6) - in the diagram, an arrow
with F (T) on it represents the path to be taken when the term from
which it emanates is false (true). But (15.2.6) is essentially a flowchart for
program (15.2.5)! At least in this case, therefore, the "high level" pro-
gram (15.2.4) can be simulated using the "lower-level" constructs of Pas-
cal, FORTRAN and PL/I.
Program (15.2.4) is developed from sound principles. Program (15.2.5)
is typically developed in an ad hoc fashion, using development by test
cases, the result being that doubt is raised whether all cases have been
covered.
Exercises for Section 15.2 191
(15.2.6) [Flowchart: after the initialization i, j := 0, 0, the guard terms are
tested in turn, with T/F arrows leading to j := j+1, to i, j := i+1, 0, and
to the exit.]
The first two lines hold because any divisor of x and y is also a divisor of x+y
and x-y - since x/d ± y/d = (x±y)/d for any divisor d of x and y.
Your program has the result assertion
R: x = y = gcd(X, Y)
The program should not use multiplication or division. It should be a loop (with
initialization) with invariant
R: x = 0 ∧ y = gcd(X, Y)
P: 0 ≤ x ∧ 0 ≤ y ∧ (0,0) ≠ (x,y) ∧ gcd(x,y) = gcd(X, Y)
t: 2*x + y
5. This problem concerns that part of a scanner of a compiler -or any program
that processes text- that builds the next word or sequence of nonblank symbols.
Characters b[j:79] of character array b[0:79] are used to hold the part of the
input read in but "not yet processed", and another line of input can be read into
b by executing read(b). Input lines are 80 characters long.
It is known that b[j:79] catenated with the remaining input lines is a
sequence
W | '-' | REST

where "|" denotes catenation, "-" denotes a blank space, W is a nonempty
sequence of nonblank characters, and REST is a string of characters. The purpose
of the program to be written is to "process" the input word W, deleting it
from the input and putting it in a character array s. W is guaranteed to be short
enough to fit in s. For example, the top part of the diagram below shows sample
initial conditions with 10-character lines. The bottom diagram gives correspond-
ing final conditions.
[Diagram: sample initial conditions with 10-character lines, where W = 'WORD',
REST = 'NEXT-ONE-IS-IT--', and b[j:79] = 'WO'; below it, the corresponding
final conditions.]
Figure 16.1.1 (a), (b): Blowing up the balloon
194 Part III. The Development of Programs
Hence, the set of states represented by P must contain both the set of
possible initial states represented by IS and the set of final states
represented by R, as shown in Fig. 16.1.1(b).
Consider R to be the deflated state of a balloon, which is blown up to
its complete inflated state, P, just before execution of the loop. Each
iteration of the loop will then let some air out of the balloon, until the
last iteration reduces the balloon back to its deflated state R. This is
illustrated in Fig. 16.2.1, where P0 = P is the balloon before the first itera-
tion, P1 the balloon after the first iteration and P2 the balloon after the
second iteration.
[Figure 16.2.1: the balloon P0 = P and its successive deflations P1 and P2.]
Remark: The balloon and its various states of deflation are defined more
precisely as follows. P is the completely inflated balloon. Consider the
bound function t. Let t0 be the initial value of t, which is determined by
the initialization, t1 the value of t after the first iteration, t2 the value of t
after the second iteration, etc. Then the predicate

    P ∧ 0 ≤ t ≤ ti

denotes the set of states in the balloon after the i-th iteration. Thus, ini-
tialization deflates the balloon to include only states in P ∧ 0 ≤ t ≤ t0, the
first iteration deflates it more to P ∧ 0 ≤ t ≤ t1, etc. □
Weakening a predicate
Here are four ways of weakening a predicate R:
The first three methods are quite useful. In each, insight for weaken-
ing R comes directly from the form and content of R itself, and the
number of possibilities to try is generally small. The methods may there-
fore provide the kind of directed, disciplined development we are looking
for.
The fourth method of weakening a predicate is rarely useful in pro-
gramming, in all its generality. There is no reason to try to add one dis-
junct rather than another, and hence adding a disjunct would be a random
task with an infinite number of possibilities. We shall not analyze this
method further.
Discussion
Here, strategy 15.1.4 was used to develop the loop -first the guard
was created and then the loop body. The guard was created in such a
simple and useful manner that it deserves being called a strategy itself.
Linear search
As a second example of deleting a conjunct, consider the following
problem. Given is a fixed array b[0:m-1] where 0 < m. It is known that
a fixed value x is in b[0:m-1]. Write a program to determine the first
occurrence of x in b - i.e. to store in a variable i the least integer such
that x = b[i].
The first task is to specify the program more formally. This is easy to
do; we have the following precondition Q and postcondition R:
Q: 0 < m ∧ x ∈ b[0:m-1]
R: 0 ≤ i < m ∧ x ∉ b[0:i-1] ∧ x = b[i]
A good invariant should be easy to establish. The first two conjuncts are
established by the assignment i:= 0, while most of the difficulty of the
program lies in establishing the third. Hence, it makes sense to delete the
third conjunct, yielding the following invariant:
Use the complement of the deleted conjunct. Thus far, the program is
i := 0; do x ≠ b[i] → ? od
Choose the command for the loop, explaining how it was found.
Discussion
The program is certainly correct, but let us try formally to prove it
using checklist 11.9. First, show that invariant (16.2.4) is initially true:
Is this true? Certainly not - the antecedent is not enough to prove that
i+1 < m! The problem is that we have neglected to include in the invari-
ant the fact that x ∈ b[0:m-1]. Formally, the invariant should be
With this slight change, one can formally prove that the program is
correct (see exercise 5).
In omitting the conjunct x ∈ b[0:m-1] we were simply using our
mathematician's license to omit the obvious. Note that all the free identif-
iers of x ∈ b[0:m-1] are fixed throughout Linear Search: x, b and m are
not changed. Hence, facts concerning only these identifiers do not
change. It can be assumed that the reader of the algorithm and its sur-
rounding text will remember these facts, so that they don't have to be
repeated over and over again.
Later on, such obvious detail will be omitted from the picture when it
doesn't hamper understanding. For now, however, your task is to gain
experience with the formalism and its use in programming, and for this
purpose it is better to be as precise and careful as possible. It is also to
be remembered that text surrounding a program in a book such as this
one rarely surrounds that same program when it appears in a program
listing, as it should. Be extremely careful in your program listings to
present the program as clearly and fully as possible.
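For reference, the completed Linear Search transcribes into Python as follows. This is a sketch; the precondition Q, that x occurs in b, is assumed, exactly as in the formal specification:

```python
def linear_search(b, x):
    # Precondition Q: 0 < len(b) and x occurs in b, so the loop terminates.
    i = 0
    while x != b[i]:     # invariant: x is not in b[0:i]
        i += 1
    return i             # least i with b[i] == x
```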
(16.3.1) R: s = (Σ j: 0 ≤ j < n: b[j])
The fact that each array element is involved in the sum suggests that a
loop of some form should be developed, so R should be weakened to
yield a suitable invariant P. R contains the constant n (i.e. n may not
be changed). R can therefore be weakened by replacing n by a fresh
variable i, yielding

P: 0 ≤ i ≤ n ∧ s = (Σ j: 0 ≤ j < i: b[j])
Program (15.1.1) for this problem was developed using this loop invariant
and the bound function t = n - i:
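In Python, a loop with this invariant and bound function reads as follows (a sketch; note that Python's multiple assignment evaluates its entire right-hand side first, matching the semantics of i, s := i+1, s+b[i]):

```python
def array_sum(b):
    # invariant P: 0 <= i <= n  and  s = b[0] + ... + b[i-1]
    # bound t: n - i
    n = len(b)
    i, s = 0, 0
    while i != n:
        i, s = i + 1, s + b[i]   # b[i] here uses the old value of i
    return s
```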
Discussion
Two other constants of R could be replaced to yield an invariant.
Replacing the constant 0 yields the invariant

0 ≤ i ≤ n ∧ s = (Σ j: i ≤ j < n: b[j])

Using this as an invariant, one can develop a loop that adds the elements
b[j] to s in decreasing order of subscript value j (see exercise 1 of section
15.1).
If result assertion R is written as

s = (Σ j: 0 ≤ j ≤ n-1: b[j])

then the constant n-1 can be replaced by a fresh variable i, yielding the
invariant -1 ≤ i ≤ n-1 ∧ s = (Σ j: 0 ≤ j ≤ i: b[j]).
Note carefully the lower bound on i this time. Because n can be zero,
the array can be empty. Therefore the assignment i, s := 0, b[0], a favor-
ite of many for initializing such a loop, cannot be used here. The initiali-
zation must be i, s := -1, 0. (See exercise 1).
This example illustrates that there may be several constants to choose
from when replacing a constant by a variable. In general, the constant is
chosen so that the resulting invariant can be easily established, so that the
guard(s) of the loop are simple and, of course, so that the command(s) of
the loop can be easily written. This is a trial-and-error process, but one
gets better at it with practice.
Too often, variables are introduced into a program without the pro-
grammer really knowing why, or whether they are even needed. In gen-
eral, the following is a good principle to follow.
Section 16.3 Replacing a Constant By a Variable 201
We now have at least one good reason for introducing a variable: the
need to weaken a result assertion to produce an invariant. It goes without
saying that each variable introduced will be defined in some manner.
Part of this definition, which is often forgotten, is the range of the vari-
able. We emphasize the need for this range with the following
(16.3.4) R: a² ≤ n < (a+1)²
A program for this problem was developed in section 16.2 by deleting the
conjunct n < (a+1)²; the program took time proportional to √n. Here
we use the method of replacing a constant by a variable.
First try replacing the expression a+1 by a fresh variable b to yield

a, b := 0, n+1;
do a+1 ≠ b → ? od
The precondition of (16.3.5) will be the invariant together with the guard
of the loop:

P ∧ a+1 ≠ b.
Discussion
It may seem that the technique of halving the interval was pulled out
of a hat. It is simply one of the useful techniques that programmers must
know about, for its use often speeds up programs considerably. The exe-
cution time of this program is proportional to log n, while the execution
time of the program developed in section 16.2 is proportional to √n.
Program (16.3.7) illustrates another reason to introduce a variable: d
has been introduced to make a local optimization. The introduction of d
not only reduces the number of times the expression (a+b) ÷ 2 is
evaluated, it also makes the program more readable.
Note that no definition is given for d. Variable d is essentially a con-
stant of the loop body. It is assigned a value upon entrance to the loop
body, and this value is used throughout the body. It carries no value
from iteration to iteration. Moreover, d is used only in two adjacent
lines, and its use is obvious from these two lines. A definition of d would
belabor the obvious and is therefore omitted.
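A Python sketch of program (16.3.7), including the local variable d, might read as follows (integer division // plays the role of ÷ 2; the function name is mine):

```python
def isqrt(n):
    # Establish a*a <= n < (a+1)*(a+1) by halving the interval [a, b).
    # invariant: a < b <= n+1  and  a*a <= n < b*b
    assert n >= 0
    a, b = 0, n + 1
    while a + 1 != b:
        d = (a + b) // 2      # d is a constant of the loop body
        if d * d <= n:
            a = d
        else:
            b = d
    return a
```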
A similar program can be developed by replacing the second occur-
rence of a in (16.3.4) by a variable -see exercise 3.
Section 16.3 Replacing a Constant By a Variable 203
The only difficulty in writing (16.3.9) might have been in getting k's
bounds correct. Subsequently, we will work with R as written in (16.3.8),
but we will turn to the more formal definition (16.3.9) when insight is
needed.
Clearly, iteration is needed for this program. Remembering the point
of this section, what loop invariant would you choose?
What should be the bound function, the initialization and the guard of the
loop?
The first conjunct is implied by the guard of the loop. What extra condi-
tion is needed to imply the second conjunct?
(16.3.11) i, p := 1, 1;
          {invariant P: 1 ≤ i ≤ n ∧
              p is the length of the longest plateau of b[0:i-1]}
          {bound t: n-i}
          do i ≠ n → if b[i] ≠ b[i-p] → i := i+1
                      □ b[i] = b[i-p] → i, p := i+1, p+1
                     fi
          od
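Transcribed into Python (the function name is my own), program (16.3.11) becomes:

```python
def longest_plateau(b):
    # b is an ordered, non-empty list; return the length of its longest
    # plateau (longest run of equal values).
    n = len(b)
    i, p = 1, 1
    while i != n:               # invariant: p = longest plateau of b[0:i]
        if b[i] == b[i - p]:    # then b[i-p:i+1] is a plateau of length p+1
            i, p = i + 1, p + 1
        else:
            i += 1
    return p
```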
Discussion
A common mistake in developing this program is to introduce, too
early in the game, an extra variable v that contains the current value of
the latest, longest plateau, so that the test would be on b[i] = v instead of
b[i] = b[i-p]. It is a mistake that I made the first time I developed this
program. But it only serves to complicate the program. Principle (16.3.2)
Exercises for Section 16.3 205
R: s = (Σ j: 0 ≤ j ≤ n-1: b[j])
The invariant is to be found from the result assertion by replacing the constant
n -I by a variable.
2. Prove formally that the body of the loop of program (16.3.7) actually decreases
the bound function (point 5 of Checklist 11.12). The important point here is that,
when the body of the loop is executed, a+1 < b.
3. Develop a program for approximating the square root of n by replacing the
second occurrence of a in (16.3.4) by b, yielding the invariant
Don't forget to choose suitable bounds for b. Compare the resulting program,
and the effort needed to derive it, with the development presented earlier.
4. (Binary Search). Write a program that, given fixed x and fixed, ordered (by
≤) array b[1:n] satisfying b[1] ≤ x < b[n], finds where x belongs in the array.
That is, for a fresh variable i the program establishes

(i = 0 ∧ x < b[1]) ∨
(1 ≤ i < n ∧ b[i] ≤ x < b[i+1]) ∨
(i = n ∧ b[n] ≤ x)
5. Write a program that, given fixed, ordered array b[0:n-1] where n > 0, finds
the number of plateaus in b[0:n-1].
6. Write a program that, given fixed array b[0:n-1] where n > 0, finds the posi-
tion of a maximum value in b - i.e. establish
206 Part II I. The Development of Programs
The program should be nondeterministic if the maximum value occurs more than
once in b.
7. Write a program that, given fixed array b[0:n-1] where n ≥ 0, stores in d
the number of odd values in b[0:n-1].
8. Given are two fixed, ordered arrays f[0:m-1] and g[0:n-1], where m, n
≥ 0. It is known that no two elements of f are equal and that no two elements
of g are equal. Write a program to determine the number of values that occur
both in f and g. That is, establish
You may use the fact that the length of the longest plateau of an empty array is
zero. This exercise is illustrative of the fact that not all loop invariants will arise
directly from considering the strategies for developing invariants discussed in this
chapter. Here, we actually added a disjunct, thus weakening the invariant, to
produce another program.
R: i = iv
Section 16.4 Enlarging the Range of a Variable 207
The Linear Search Principle indicates that a search for a value i satisfying
R should be in order of increasing value, beginning with the lowest.
Thus, the invariant for the loop will be
P: 0 ≤ i ≤ iv
Discussion
The method used to develop the invariant was to enlarge the range of a
variable. In R, variable i could have only one value: iv. This range of
values is enlarged to the set {0, 1, ..., iv}. In this case, the enlarging
came from weakening the relation i = iv to i ≤ iv and then putting a
lower bound on i. This method is similar to the last one, introducing a
variable and supplying its range -it just happens that the variable is
already present in R.
The example illustrates another important principle:
value that is on all three of them; this least value is known to exist.
This program is often written in 10 to 30 lines of code in FORTRAN,
PL/I or ALGOL 68 by those unexposed to the methods given in this
book. The reader might wish to develop the program completely before
studying the subsequent development.
What is the first step in writing the program? Do it.
The first step is to write pre- and postconditions Q and R. Since the lists
f, g and h are fixed, we will use the fact that they are alphabetically
ordered without mentioning it in Q or R. So Q is simply T. Using iv,
jv and kv to denote the least values satisfying f[iv] = g[jv] = h[kv], and
using three simple variables i, j and k, the postcondition R can be writ-
ten as

R: i = iv ∧ j = jv ∧ k = kv
Notice how the problem of defining the values iv, jv and kv in detail has
been finessed. We know what least means, and hope to proceed without a
formal definition. Now, why should a loop be used? Develop the invari-
ant and bound function for the loop.
We have:
The last two conjuncts, and also 0 ≤ i+1, are implied by the invariant, so
only i+1 ≤ iv must be implied by the guard. The guard cannot be i+1 ≤
iv, because the program may not use iv. But, the relation i+1 ≤ iv,
together with P, means that f[i] is not the crook, and this is true if
f[i] < g[j]. Thus, the guard can be f[i] < g[j]. In words, since the
crook does not come alphabetically before g[j], if f[i] comes alphabeti-
cally before g[j], then f[i] cannot be the crook.
But the guard could also be f[i] < h[k] and, for the moment, we
choose the disjunction of the two for the guard:
The other guards are written in a similar fashion to yield the program
(16.4.4) P ∧ ¬BB ⇒ R
Note that only the first disjunct of each guard is needed to prove (16.4.4).
Hence, the second disjuncts can be eliminated to yield the program
(16.4.5) i, j, k := 0, 0, 0;
         do f[i] < g[j] → i := i+1
          □ g[j] < h[k] → j := j+1
          □ h[k] < f[i] → k := k+1
         od
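Program (16.4.5) transcribes into Python as follows. This is a sketch; it presupposes, as the text does, that the three ordered lists share at least one value, so the loop terminates:

```python
def welfare_crook(f, g, h):
    # Return indices (i, j, k) of the least value common to f, g and h.
    i = j = k = 0
    while True:
        if f[i] < g[j]:       # f[i] cannot be the crook
            i += 1
        elif g[j] < h[k]:     # g[j] cannot be the crook
            j += 1
        elif h[k] < f[i]:     # h[k] cannot be the crook
            k += 1
        else:                 # all guards false: f[i] == g[j] == h[k]
            return i, j, k
```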
Discussion
In developing this program, for the first guard, at first f[i] < g[j] is
developed, and then weakened to f[i] < g[j] ∨ f[i] < h[k]. Why is it
weakened?
Well, the first concern is to obtain a correct program; the second con-
cern is to obtain an efficient one. In proving correctness, one task is to
prove that, upon termination, (16.4.4) holds. The stronger ¬BB is, the
more chance we have of proving (16.4.4). Since BB is the complement of
¬BB, this means that the weaker BB is, the more chance we have of prov-
ing (16.4.4). Thus, we have the following principle:
Inserting Blanks
Consider the following problem. Write a program that, given fixed
n ≥ 0, fixed p ≥ 0, and array b[0:n-1], adds p*i to each element b[i] of
b. Formally, using Bi to represent the initial value of b[i], we have
!" states that the first j elements of b have their final values. But the
fact that the other n - j elements have their initial values should also be
included, and the full invariant is
P:O~j~n A(Ai:O~i<J:b[i]=B;+p*i)A
(Ai:j~i<n: b[i]=B;)
(16.5.1) j := 0;
         do j ≠ n → j, b[j] := j+1, b[j] + p*j od
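Program (16.5.1) can be sketched in Python as follows; the in-place update mirrors the multiple assignment j, b[j] := j+1, b[j]+p*j (the function name is mine):

```python
def add_progression(b, p):
    # Establish b[i] = B[i] + p*i for every i, one element per iteration.
    n = len(b)
    j = 0
    while j != n:     # invariant: b[0:j] has final values, b[j:n] initial
        b[j] += p * j
        j += 1
    return b
```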
        i            i+n-1       j            j+n-1
Q:   b | X[0:n-1]        |   ∧ b | Y[0:n-1]       |

        i            i+n-1       j            j+n-1
R:   b | Y[0:n-1]        |   ∧ b | X[0:n-1]       |
For the rest of the development, a less formal approach will be used,
which uses the insight gained thus far without requiring all the formal
details. We take for granted that only the sections mentioned should be
changed and that they do not overlap, and use the following diagrams for
the pre- and postconditions -"unswapped" ("swapped") means that the
values in the indicated section have their initial (final) values:
        i            i+n-1       j            j+n-1
Q:   b | unswapped       |   ∧ b | unswapped      |

        i            i+n-1       j            j+n-1
R:   b | swapped         |   ∧ b | swapped        |
Since each element of the two sections must be swapped, a loop is sug-
gested that will swap one element of each at a time. The first step in find-
ing the invariant is to replace the constant n of R by a variable k:
Section 16.5 Combining Pre- and Postconditions 213
                 i          i+k-1       j          j+k-1
P: 0 ≤ k ≤ n ∧ b | swapped      |  ∧  b | swapped      |

But P does not indicate the state of array elements with indices in i+k:
i+n-1 and j+k:j+n-1. Adjusting P suitably yields invariant P as the
predicate 0 ≤ k ≤ n together with
k := 0;
do k ≠ n → k, b[i+k], b[j+k] := k+1, b[j+k], b[i+k] od
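In Python the loop reads as follows (a sketch; the two sections b[i:i+n] and b[j:j+n] are assumed not to overlap, as in the problem statement):

```python
def swap_sections(b, i, j, n):
    # Swap sections b[i:i+n] and b[j:j+n], one pair of elements at a time.
    k = 0
    while k != n:     # invariant: the first k elements of each are swapped
        b[i + k], b[j + k] = b[j + k], b[i + k]
        k += 1
    return b
```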
Discussion
Again, the invariant was developed by replacing a constant of R by a
variable and then adding a disjunct in order to reflect the initial condi-
tions. We used diagrams in order to avoid some formalism and messy
detail. For some, pictures are easier to understand. But be especially
careful when using them, for they can lead to trouble. It is too easy to
forget about special cases, for example that an array section may be
empty, and this can lead to either an incorrect or less efficient program.
To avoid such cases, always define the ranges of new variables carefully
and be sure each picture is drawn in such a way that you know it can be
translated easily into a statement of the predicate calculus.
The development of the invariant was a two-step process. The invari-
ant can also be developed as follows. Both Q (or a slightly perturbed ver-
sion of it due to initialization) and R must imply the invariant. That is,
Q and R must be instances of the more general predicate P. Q states
that the sections are unswapped; hence, the invariant must include, for
each section, an unswapped subsection, which could be the complete sec-
tion. On the other hand, R states that the sections are swapped; hence,
the invariant must include, for each section, a swapped subsection, which
could be the complete section. One is led to draw diagram ( 16.5.2), using
a variable k to indicate the boundary between the unswapped and
swapped subsections.
                   m        p-1  p       n-1
R: m ≤ p ≤ n ∧ b | ≤x         | >x        |

More formally, if initially b[m:n-1] = B[m:n-1], then the program estab-
lishes

                                 m         p        n-1
R: m ≤ p < n ∧ perm(b, B) ∧ b | ≤B[m] | B[m] | >B[m] |
The color of an element may be tested with Boolean expressions red(b [i]),
white(b[i]) and blue(b[i]), which return the obvious values. The number of
such tests should be kept to a minimum. The only way to permute array elements
Exercises for Section 16.5 215
[Diagram: array v holding the list values V0, V1, ..., and array s holding
the links (arrows) from each list element to the next.]
That is,
No ordering of values in array elements is implied. For example, the fact that V0
is followed by V1 in the linked list does not mean that v[p+1] contains V1.
Write a program that reverses the links - the arrows implemented by array s.
Array v should not be altered, and upon termination the linked list should be
reversed.
{n ≥ 0}
a, b := 0, n+1;
{inv: a < b ≤ n+1 ∧ a² ≤ n < b²}
do a+1 ≠ b → d := (a+b) ÷ 2;
             if d*d ≤ n → a := d  □  d*d > n → b := d  fi
od {a² ≤ n < (a+1)²}
The bound function b - a + 1 was used to prove termination. But the
smaller bound function ceil(log(b-a)) shows that this program is indeed
much faster than program (16.2.3), which performs approximately b - a
iterations:
P: 0 ≤ i ≤ m ∧ 0 ≤ j ≤ n ∧ x is not in rows 0:i-1 of b nor in b[i, 0:j-1]

[Diagram: the array b[0:m-1, 0:n-1] with the already-searched part marked "x not here".]
i < h  or  i = h ∧ j < k

For example, (-1, 5) < (5, 1) < (5, 2). This is called the lexicographic ord-
ering of integer pairs. It is extended in the natural way to the operators
≤, > and ≥. It is also extended to triples, 4-tuples, etc. For example,
t: i*m + j.
(17.3) Theorem. Consider a pair (i, j), where i and j are expressions
       containing variables used in a loop. Suppose each iteration of the
       loop decreases (i, j) (lexicographically speaking). Suppose further
       that i satisfies mini ≤ i ≤ maxi and j satisfies minj ≤ j ≤ maxj,
       for constants mini, maxi, minj and maxj. Then execution of the
       loop must terminate, and a suitable bound function is
If one can exhibit a pair (triple, etc.) that satisfies theorem 17.3, there
is no need to actually produce the bound function, unless it makes things
clearer or is needed for other reasons. We give three examples.
In section 15.2 the following program (15.2.4) was written for searching
a (possibly empty) two-dimensional array.
Chapter 17 Notes on Bound Functions 219
{0 ≤ m ∧ 0 ≤ n}
i, j := 0, 0;
do i ≠ m ∧ j ≠ n cand x ≠ b[i,j] → j := j+1
 □ i ≠ m ∧ j = n                 → i, j := i+1, 0
od
{(0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i,j]) ∨ (i = m ∧ x ∉ b)}
The pair (i, j) is initially (0, 0) and each iteration increases it. Therefore,
the pair (m-i, n-j) is decreased at each iteration. Further, we have
0 ≤ m-i ≤ m and 0 ≤ n-j ≤ n. Hence, theorem 17.3 can be applied and
the loop terminates. The bound function that arises from the use of the
theorem is (m-i)*(n+1) + n-j.
The tuple (q0, q1, q2, q3) is decreased (lexicographically speaking) by each
iteration. It is bounded below by the 4-tuple whose values are min(q0, q1,
q2, q3) and is bounded above by the 4-tuple whose values are max(q0,
q1, q2, q3). Hence, the loop terminates.
Removing a train from the yard reduces the number of trains and reduces
the total number of cars in the yard. On the other hand, splitting a train
leaves the total number of cars the same but increases the number of
trains by 1. So we choose the pair
0! = 1
n! = n * (n-1)!   for n > 0.
for writing programs iteratively that could have been written recursively.
One trick in doing so will be to think iteratively right from the beginning.
That is, if the program will be written using iteration, then the invariant
for the loop will have to be developed before writing the loop (as much as
possible).
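As a small illustration of thinking iteratively from the start, the recursive definition of the factorial above can be computed with a loop whose invariant is f = k! (a sketch in Python; the names are mine):

```python
def factorial(n):
    # 0! = 1;  n! = n * (n-1)!  for n > 0, computed iteratively.
    k, f = 0, 1
    while k != n:                   # invariant: f = k!
        k, f = k + 1, f * (k + 1)   # right-hand side uses the old k
    return f
```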
The topic will allow us to bring up two important strategies and dis-
cuss the relation between them, for recursive procedures often evolve from
their use. These strategies are: -solving problems in terms of simpler
ones, and divide and conquer. While not on the same level of detail and
precision as some of the strategies presented earlier, these two old
methods can still be useful when practised consciously.
At the end of section 18.3, some comments are made concerning the
choice of data structures in programming and the use of program
transformations.
       m            n             p-1
Q:  b | B[m:n-1]   | B[n:p-1]       |
where B denotes the initial value of array b. The program should swap
the two array sections, using only a constant amount of extra space
(independent of m, n and p ), thus establishing the predicate
Section 18.1 Solving Simpler Problems First 223
              m            p-1
(18.1.2) R: b | B[n:p-1] | B[m:n-1] |
     m          h            n            k          p-1
b  | already   | swap with  | swap with  | already     |
   | swapped   | b[n:k-1]   | b[h:n-1]   | swapped     |
Discussion
This program could also have been written in recursive fashion as
{Swap sections b[m:n-1] and b[n:p-1], where m < n < p}
proc swap_sections(var b: array[*] of integer;
                   value m, n, p: integer);
    if n-m = p-n → swapequals(b, m, n, p-n)
     □ n-m > p-n → swapequals(b, m, n, p-n);
                   swap_sections(b, m+p-n, n, p)
     □ n-m < p-n → swapequals(b, m, n, n-m);
                   swap_sections(b, m, n, m+p-n)
    fi
In this case, I like the iterative version better. It was not difficult to
discover the invariant, and it is, to me, easier to understand (this is not
always the case). The iterative version does require two extra variables i
and j, which are not needed in the recursive version.
The iterative version has the neat property that deleting all the calls of
swapequals results in program (18.1.3) to compute the greatest common
divisor, gcd(n-m, p-n), of the initial array-section sizes. To see this
old, elegant program emerge from a useful, practical programming prob-
lem was a delightful experience!
The program could have been developed by first replacing n and p by
variables h and k, and then determining how to reduce the size of the
unswapped portion. There are often many ways to arrive at the same pro-
gram, and one cannot really say that one is better than the other. Redo-
ing a problem once done, using the principles and asking why they weren't
used the first time, can increase programming skill and lead to better pro-
grams. The following confession concerns this point.
Exercises for Section 18.1 225
(18.1.3) {m < n < p}
         i, j := n-m, p-n;
         {inv: 0 < i ∧ 0 < j ∧ gcd(n-m, p-n) = gcd(i, j)}
         do i ≠ j → if i > j → i := i-j
                     □ j > i → j := j-i
                    fi
         od {i = j = gcd(n-m, p-n)}
f₀ = 0
f₁ = 1
fₙ = fₙ₋₁ + fₙ₋₂   for n > 1

⎛ fₙ   ⎞   ⎛ 1  1 ⎞ ⎛ fₙ₋₁ ⎞
⎝ fₙ₋₁ ⎠ = ⎝ 1  0 ⎠ ⎝ fₙ₋₂ ⎠
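The matrix identity above suggests computing a pair of adjacent Fibonacci numbers per step; a Python sketch (the names are mine) is:

```python
def fib_pair(n):
    # Return (f(n), f(n-1)) for n >= 1, using one multiplication by the
    # matrix ((1,1),(1,0)) per iteration: (a, b) becomes (a+b, a).
    a, b = 1, 0               # (f(1), f(0))
    for _ in range(n - 1):
        a, b = a + b, a
    return a, b
```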
However, if n >2 then a more general method must be used. The divide
and conquer strategy invites us to perform the sort by sorting two (or
more) sections of the array separately. Suppose the array is partitioned as
Section 18.2 Divide and Conquer 227
follows.
    0          k          n-1
b  |     ?     |     ?      |
What condition must be placed on the two sections so that sorting them
separately yields an ordered array?
Every value in the first section should be ≤ every value in the second sec-
tion:
0 k n-l
(18.2.2) b l::::;;b[k:n-I]l;:;::b[O:k-lll
This means that if the values of b can be permuted to establish the above
predicate, then to sort the array it remains only to sort the partitions
b[0:k-1] and b[k:n-1].
Actually, a procedure similar to one that establishes (18.2.2) has
already been written -see exercise 4 of section 16.5- so we will make
use of it. Procedure Partition splits a non-empty array section into three
partitions, where the value x in the middle one is the initial value in b[m]:
                          m        p        n-1
(18.2.3) R: m ≤ p < n ∧ b | ≤x     | x | >x   |
After partitioning the array as above, it remains to sort the two parti-
tions b[m:p-1] and b[p+1:n-1]. If they are small enough, they can be
sorted directly; otherwise, they can be sorted by partitioning again and
sorting the smaller sub-partitions. While one sub-partition is being sorted,
the bounds of the other must be stored somewhere. But sorting one will
generate two more smaller partitions to sort, and their bounds must be
stored somewhere also. And so forth.
To keep track of the partitions still to be sorted, use a set variable s to
contain their boundaries. That is, s is a set of pairs of integers and, if
(i, j) is in s, then b[i:j] remains to be sorted. We write the invariant
Note how English is used to eliminate the need for formally introducing
an identifier to denote the initial value of array b.
Thus, we arrive at the following program:
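The program text of (18.2.5) is not reproduced here; the same idea - a set s of pairs (i, j) for sections of b still to be sorted - can be sketched in Python as follows. The partitioning code below is one conventional choice of mine, not the book's Partition procedure:

```python
def quicksort(b):
    # s holds pairs (i, j): section b[i..j] (inclusive) remains to be sorted.
    s = {(0, len(b) - 1)}
    while s:                          # invariant: sorting every section in
        i, j = s.pop()                # s completes the sort of b
        if j - i < 1:
            continue                  # sections of length <= 1 are sorted
        x, p = b[i], i                # partition b[i..j] around x = b[i]
        for k in range(i + 1, j + 1):
            if b[k] <= x:
                p += 1
                b[k], b[p] = b[p], b[k]
        b[i], b[p] = b[p], b[i]       # now b[i:p] <= x = b[p] < b[p+1:j+1]
        s.add((i, p - 1))
        s.add((p + 1, j))
    return b
```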
Discussion
Program (18.2.5) describes the basic idea behind Quicksort. Proof of
termination is left to exercise 1. The execution time of Quicksort is
O(n log n) on the average and O(n²) in the worst case. The space needed
in the worst case is O(n), which is more than it need be; exercise 2 shows
how to reduce the space.
In the development of this program, the guiding motivation was the
desire to divide and conquer. The simpler problem needed to effect the
divide and conquer was procedure Partition. Had we first noticed that
procedure Partition was available and asked how it could have been used,
we would have been using strategy 18.1.1, solve the problem in terms of
simpler ones.
[Figure 18.3.1 Example of a Binary Tree: nodes A, B, C, E, F, G, I, J, K
and L. A is the root, with left child B and right child C; B has a single
(right) child E, whose children are I and J; C's children are F and G, and
F's children are K and L.]
Above, the term tree is defined in the easiest possible manner: recur-
sively. For that reason, many algorithms that manipulate trees are given
recursively also. Here, we wish to describe a few basic algorithms dealing
with trees, but using iteration. With a firm grasp of this material, it
should not be difficult to develop other algorithms that deal with trees,
graphs and other structures.
Implementing a tree
We describe one typical implementation of a tree, which is motivated
by the need in many algorithms to insert nodes into and delete nodes
from a tree. The implementation uses a simple variable p and three
arrays: root[0:?], left[0:?] and right[0:?].
Variable p contains an integer satisfying -1 ≤ p. It describes, or
represents, the tree.
If integer k describes a tree or subtree, then the following holds:
For example, the tree of Fig. 18.3.1 could appear as given in (18.3.1).

(18.3.1)  p = 1
                     0   1   2   3   4   5   6   7   8   9  10
              root   B   A       C   E   F   I   J   K   L   G
              left  -1   0       5   6   8  -1  -1  -1  -1  -1
              right  4   3      10   7   9  -1  -1  -1  -1  -1
Some comments are in order. First, p need not equal 0; the root node
need not be described by the first elements of the arrays, root[0], left[0]
and right[0]. In fact, several trees could be maintained in the same three
arrays, using p1, p2 and p3 (say) to "point to their roots". This, of
course, implies that the nodes of the trees in the arrays need not be in any
particular order. In (18.3.1), the elements with index 2 of the three arrays
are not used in the representation of tree p at all. Moreover, the root of
the left subtree of A precedes A in the array, while the root of its right
subtree follows it. This means that one cannot process the tree by pro-
cessing the elements of root (and left and right) in sequential order.
In the rest of this section, we will deal with a tree p using the original
notations empty(p), root[p], left[p] and right[p]. Note, however, that
this notation is quite close to what one would use in a program dealing
with a tree implemented as just shown.
Section 18.3 Traversing binary trees

(18.3.2)  R: #p = c
The first step, of course, is to give a definition of #p, in the hope that
it will yield insight into the program. Write a definition of #p -it may
help to use recursion since tree is defined recursively.
(18.3.3)  #p =  0                           if empty(p)
                1 + #left[p] + #right[p]    if ¬empty(p)
(18.3.5)  c, s := 0, {p};
          {inv: (18.3.4)}
          {bound: 2*(#p - c) + |s|}
          do s ≠ {} → Choose(q, s); s := s - {q};
                      if empty(q)  → skip
                      □ ¬empty(q) → c, s := c+1, s ∪ {right[q]} ∪ {left[q]}
                      fi
          od {c = #p}
The bound function was discovered by noting that the pair (#p - c, |s|) is
decreased (lexicographically speaking) by each iteration (see Chapter 17).
Note that it does not matter in which order the subtrees in set s are
processed. This is because the number of nodes in each subtree will be
added to c and addition is a commutative operation. In this case, the use
of the nondeterministic operation Choose(q, s), which stores an arbitrary
value of s into q, nicely frees us from having to make an unnecessary
choice.
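Program (18.3.5) can be sketched in Python, which is not the notation of this book; trees are represented here as nested tuples (value, left, right), with None for the empty tree, and a list plays the role of the set s. The random choice imitates the nondeterministic Choose(q, s); the count is the same whatever order is used, since addition is commutative.

```python
import random

def node_count(tree):
    """Count the nodes of a binary tree given as nested tuples:
    None for the empty tree, (value, left, right) otherwise."""
    c = 0
    s = [tree]                          # the set s of subtrees still to count
    while s:
        # Choose(q, s): any element will do; order does not affect the sum
        q = s.pop(random.randrange(len(s)))
        if q is not None:               # ¬empty(q)
            c += 1
            s.append(q[1])              # left subtree
            s.append(q[2])              # right subtree
    return c

# The tree of Fig. 18.3.1
t = ('A',
     ('B', None, ('E', ('I', None, None), ('J', None, None))),
     ('C', ('F', ('K', None, None), ('L', None, None)),
           ('G', None, None)))
```

Running node_count(t) on the example tree yields 10, regardless of the order in which the worklist is drained.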
Preorder traversal
The preorder list of the nodes of a tree p, written preorder(p), is
defined as follows. If the tree is empty, it is the empty sequence (); other-
wise, it is the sequence consisting of
For example, for the subtree e of Fig. 18.3.1 with root E we have

    preorder(e) = (E, I, J)
(18.3.6)  preorder(p) =  ()                                 if empty(p)
                         (root[p]) | preorder(left[p]) |
                             preorder(right[p])             if ¬empty(p)
Note that preorder(p) is defined recursively. This notation and the de-
finition of preorder in terms of catenation have been designed to allow us
to state and analyze various properties and algorithms in a simple, crisp
manner.
Note the similarity between definitions (18.3.3) and (18.3.6). They have
the same form, but the first uses the commutative operator + while the
second uses the non-commutative operator |. Perhaps the program to
calculate the preorder list can be developed by transforming program
Node Count so that it processes the trees of set s in a definite order.
First, let's rewrite Node Count as (18.3.8) to store the node values into
array b, instead of simply counting nodes. The invariant is

    0 ≤ c ≤ #p  ∧
    set of nodes of p = b[0:c-1] ∪ {nodes of trees in s}
(18.3.8)  c, s := 0, {p};
          {bound: 2*(#p - c) + |s|}
          do s ≠ {} → Choose(q, s); s := s - {q};
                      if empty(q)  → skip
                      □ ¬empty(q) → c, b[c] := c+1, root[q];
                                    s := s ∪ {right[q]} ∪ {left[q]}
                      fi
          od {c = #p ∧ b[0:c-1] contains the nodes of p}
Discussion
In (18.3.5), the order in which the left and right subtrees are stored in
set s is immaterial, because addition, which is being performed on the
number of nodes in each, is commutative. In (18.3.10), however, the
order in which nodes are stored in sequence r is important because opera-
tion | is not commutative.
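The idea of replacing the set by a sequence processed in a definite order can be sketched in Python; this is not program (18.3.10) itself (whose exact text is not shown here), but the same technique: a stack replaces the set, and the right subtree is pushed before the left so that the left is processed first, yielding the preorder list.

```python
def preorder(tree):
    """Iterative preorder list of a tree given as nested tuples
    (value, left, right), None for the empty tree. A stack replaces
    the set s of the node-count program, because catenation is not
    commutative, so order now matters."""
    r = []
    s = [tree]              # sequence of subtrees; top of stack is next
    while s:
        q = s.pop()
        if q is not None:
            r.append(q[0])  # root first ...
            s.append(q[2])  # ... right pushed first,
            s.append(q[1])  # ... so left is popped and processed first
    return r

t = ('A',
     ('B', None, ('E', ('I', None, None), ('J', None, None))),
     ('C', ('F', ('K', None, None), ('L', None, None)),
           ('G', None, None)))
```

On the tree of Fig. 18.3.1 this produces the preorder list A, B, E, I, J, C, F, K, L, G, agreeing with preorder(e) = (E, I, J) for the subtree rooted at E.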
My first development of this program, done over 5 years ago, was not
performed like this. It was an ad hoc process, with little direction,
because I was new at the game and had to struggle to learn and perfect
techniques.
Without the sequence notation (see Appendix 2), including the nota-
tion for catenation, one tries to work with English phrases, for example,
writing the invariant as
In general, this principle deals with data and its representation, as well as
with commands. We should use data structures that suit the problem,
and, once a correct program has been developed, deal with the problem of
changing the data structures to make their use more efficient and imple-
menting them in the programming language. This latter task, often called
"data refinement", has not received the attention that "program refine-
ment" has.
In a "modern" programming notation allowing "data encapsulation",
data refinement may just mean appending a program segment that des-
cribes how the objects are to be represented and the operations are to be
implemented. In other programming notations, it may mean transforming
the program so that it operates on allowable objects of the language.
3. Write a program to store in array b the postorder list of nodes of tree p. The
postorder list is defined as follows. If p is empty the postorder list is the empty
sequence (). If p is not empty, the postorder list is
4. The root of a tree is defined to have depth 0, the roots of its subtrees have
depth 1, the roots of their subtrees have depth 2, and so on. The depth of the tree
itself is the maximum depth of its nodes. The depth of an empty tree is -1. For
example, in the tree of (18.3.1), A has depth 0, F has depth 2, and the tree itself has
depth 3. Write a program to calculate the depth of a tree.
Chapter 19
Efficiency Considerations
(19.1.1) Theorem. Suppose a loop has (at least) two guarded commands,
         with guards B1 and B2. Then strengthening B2 to B2 ∧ ¬B1
         leaves BB, and hence P ∧ ¬BB ⇒ R, unchanged.

    R: i = iv ∧ j = jv ∧ k = kv
Section 19.1 Restricting Nondeterminism 239
The invariant for a loop is found by using the Linear Search Principle,
(16.2.7), and enlarging the range of variables in R:
(19.1.2)  i, j, k := 0, 0, 0;
          do f[i] < g[j] ∨ f[i] < h[k] → i := i+1
          □  g[j] < h[k] ∨ g[j] < f[i] → j := j+1
          □  h[k] < f[i] ∨ h[k] < g[j] → k := k+1
          od

          i, j, k := 0, 0, 0;
          do f[i] < g[j] → i := i+1
          □  g[j] < h[k] → j := j+1
          □  h[k] < f[i] → k := k+1
          od
Note that theorem 19.1.1 could now be used to strengthen two of the
guards, but it is better not to. There is no reason for preferring one of the
commands over the others, and strengthening the guards using the
theorem will only complicate them and make the program less efficient.
In this case, the nondeterminism aids in producing the simplest solution.
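The deterministic loop above can be sketched in Python (not the book's notation). This sketch assumes the problem is to find the first value common to three ascending arrays f, g, h, and that such a value exists; the three guards become an if-elif-else chain, and the loop halts exactly when f[i] = g[j] = h[k].

```python
def common(f, g, h):
    """Find the first value common to three ascending lists f, g, h,
    assuming one exists. Each branch mirrors one guarded command:
    advance whichever index points at a provably-too-small value."""
    i = j = k = 0
    while not (f[i] == g[j] == h[k]):
        if f[i] < g[j]:
            i += 1
        elif g[j] < h[k]:
            j += 1
        else:           # the remaining case: h[k] < f[i]
            k += 1
    return f[i]
```

For example, common([1, 3, 5, 7, 9], [2, 3, 6, 7], [3, 4, 7, 8]) returns 3.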
Exponentiation
Consider writing a program that, given two fixed integers X and Y,
X ≥ 0 and Y ≥ 0, establishes
240 Part III. The Development of Programs
{0 ≤ X ∧ 0 ≤ Y}
x, y, z := X, Y, 1;
do 0 < y ∧ even(y) → y, x := y÷2, x*x
□  0 < y           → y, z := y-1, z*x
od {z = X^Y}

{0 ≤ X ∧ 0 ≤ Y}
x, y, z := X, Y, 1;
do 0 < y ∧ even(y) → y, x := y÷2, x*x
□  0 < y ∧ odd(y)  → y, z := y-1, z*x
od {z = X^Y}
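The second (deterministic) loop translates directly into Python; the invariant z * x^y = X^Y is kept through both branches, so z = X^Y on termination.

```python
def power(X, Y):
    """Fast exponentiation, following the guarded-command loop:
    invariant z * x**y == X**Y, bound y."""
    assert X >= 0 and Y >= 0
    x, y, z = X, Y, 1
    while y > 0:
        if y % 2 == 0:          # even(y): square x, halve y
            y, x = y // 2, x * x
        else:                   # odd(y): peel one factor into z
            y, z = y - 1, z * x
    return z
```

The loop performs O(log Y) iterations rather than the Y iterations of the naive product.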
do i < n → ···; k := 5*i;
           ···; i := i+2
od
Next, make z = 5*i part of the invariant of the loop. This means that the
multiplication 5*i within the loop becomes unnecessary, but whenever i
is increased z must be altered accordingly:
z := 5*i;
{Part of invariant: z = 5*i}
do i < n → ···; k := z;
           ···; i, z := i+2, z+10
od
address(b[0, 0]) + i*51 + j
Then, within a loop that increments i with each iteration, all calculations
of the address of b[i, j] can be transformed as above to make them more
efficient. This optimization is also effective because it allows the detec-
tion and elimination of certain kinds of common arithmetic expressions.
In general, this transformation is called taking an assertion out of a
loop (and making it part of the loop invariant). In this case, the assertion
z = 5*i was taken out of the loop to become part of the invariant. The
technique can be used wherever the value of some variable like z can be
calculated by adjusting its current value, instead of calculating it afresh
each time.
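The transformation can be illustrated in Python (a sketch, not the book's notation); the two functions below compute the same list, but the second keeps z = 5*i in the invariant so that the multiplication inside the loop is replaced by an addition.

```python
def multiples_naive(n):
    """Recompute 5*i afresh on each iteration (i steps by 2)."""
    return [5 * i for i in range(0, n, 2)]

def multiples_reduced(n):
    """Same result, with the assertion z == 5*i taken out of the loop
    and maintained incrementally: when i grows by 2, z grows by 10."""
    out = []
    i, z = 0, 0                # invariant: z == 5*i
    while i < n:
        out.append(z)          # k := z   (was k := 5*i)
        i, z = i + 2, z + 10
    return out
```

Compilers call this strength reduction; here it saves only a constant factor, as the text notes, but the same idea can reduce the order of an algorithm's running time.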
In the above example, taking the relation out of the loop can reduce
execution time by only a constant factor, but examples exist that show
that the technique can actually reduce the order of execution time of an
algorithm.
Horner's rule

Consider evaluating a polynomial a0 + a1*x^1 + ··· + a(n-1)*x^(n-1) for
n ≥ 1 and for a value x and given constants ai. The result assertion is

    R: y = a0*x^0 + ··· + a(n-1)*x^(n-1)

    i, y := 1, a0;
    {invariant: 1 ≤ i ≤ n ∧ y = a0*x^0 + ··· + a(i-1)*x^(i-1)}
    {bound: n - i}
    do i ≠ n → i, y := i+1, y + ai*x^i od
Making the relation z = x^i part of the invariant of the loop allows us to
transform the program into

    i, y, z := 1, a0, x;
    {invariant: 1 ≤ i ≤ n ∧ z = x^i ∧ y = a0*x^0 + ··· + a(i-1)*x^(i-1)}
    {bound: n - i}
    do i ≠ n → i, y, z := i+1, y + ai*z, z*x od
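The transformed loop can be sketched in Python; the coefficients are taken as a list a with a[0] the constant term, and z = x^i is maintained incrementally instead of recomputing the power each iteration.

```python
def poly_eval(a, x):
    """Evaluate a[0] + a[1]*x + ... + a[n-1]*x**(n-1), following the
    loop above: invariant  z == x**i  and
    y == a[0]*x**0 + ... + a[i-1]*x**(i-1)."""
    n = len(a)
    assert n >= 1
    i, y, z = 1, a[0], x
    while i != n:
        i, y, z = i + 1, y + a[i] * z, z * x
    return y
```

For example, poly_eval([1, 2, 3], 2) computes 1 + 2*2 + 3*4 = 17 with one multiplication per coefficient for z, plus one for the term itself.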
Axiom 1. 1 is in Seq.
Axiom 2. If x is in Seq, so are 2*x, 3*x and 5*x.
Axiom 3. The only values in Seq are given by Axioms 1 and 2.

The problem is to write a program that stores the first 1000 values of
Seq, in order, in an array q[0:999], i.e. that establishes
Since Axiom 2 specifies that a value is in Seq if a smaller one is, it may
make sense to generate the values in order. A possibility, then, is to
replace the constant 1000 of R by a variable i, yielding the invariant
i, q[0] := 1, 1; {P}
{invariant: P; bound: 1000 - i}
do i ≠ 1000 → Calculate xnext, the i-th value in Seq;
              i, q[i] := i+1, xnext
od
Value xnext is the minimum of x2, x3 and x5. We see, then, that vari-
able xnext is not really needed, and we modify the program structure to

i, q[0] := 1, 1; {P}
{invariant: P; bound: 1000 - i}
do i ≠ 1000 → Calculate x2, x3, x5 to satisfy P1;
              i, q[i] := i+1, min(x2, x3, x5)
od
i, q[0] := 1, 1; {P}
Establish P1 for i = 1;
{invariant: P ∧ P1; bound: 1000 - i}
do i ≠ 1000 → i, q[i] := i+1, min(x2, x3, x5);
              Reestablish P1
od
i, q[0] := 1, 1; {P}
Establish P1: x2, x3, x5, j2, j3, j5 := 2, 3, 5, 0, 0, 0;
{invariant: P ∧ P1; bound: 1000 - i}
do i ≠ 1000 → i, q[i] := i+1, min(x2, x3, x5);
              Reestablish P1:
                 do x2 ≤ q[i-1] → j2 := j2+1; x2 := 2*q[j2] od;
                 do x3 ≤ q[i-1] → j3 := j3+1; x3 := 3*q[j3] od;
                 do x5 ≤ q[i-1] → j5 := j5+1; x5 := 5*q[j5] od
od
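The final program carries over to Python almost line for line; n is taken as a parameter in place of the constant 1000. The back pointers j2, j3, j5 index the smallest element of q whose double, triple or quintuple has not yet been placed.

```python
def seq_values(n):
    """First n values of Seq (1 and the closure under 2*, 3*, 5*),
    in increasing order, following the program above."""
    assert n >= 1
    q = [0] * n
    q[0] = 1
    x2, x3, x5 = 2, 3, 5          # Establish P1
    j2 = j3 = j5 = 0
    i = 1
    while i != n:
        q[i] = min(x2, x3, x5)
        i += 1
        # Reestablish P1: advance each pointer past values already placed
        while x2 <= q[i - 1]:
            j2 += 1; x2 = 2 * q[j2]
        while x3 <= q[i - 1]:
            j3 += 1; x3 = 3 * q[j3]
        while x5 <= q[i - 1]:
            j5 += 1; x5 = 5 * q[j5]
    return q
```

The first ten values are 1, 2, 3, 4, 5, 6, 8, 9, 10, 12; each iteration does constant work beyond the three (amortized constant) inner loops.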
(19.2.1)  x² + y² = r  ∧  0 ≤ y ≤ x
To help in writing it (and to arrange to use the strategy of taking a relation out of
a loop), assume the following. Two arrays xv and yv will hold the values of the
pairs (x, y) satisfying (19.2.1). Furthermore, the pairs are to be generated in
increasing order of their x-values, and a variable x is used to indicate that all
pairs with x-value less than x have been generated. Thus, the first approxima-
tion to the invariant of the main loop of the program will be

    P1: 0 ≤ i ∧ ordered(xv[0:i-1]) ∧
        the pairs (xv[j], yv[j]), 0 ≤ j < i, are all the pairs
        with x-value < x that satisfy (19.2.1).
Other reasons will probably suggest themselves once familiarity with the
technique is acquired. We illustrate with three examples.
d := a + (b-a)/2
b = a + c
d = a + c/2
(E p: 1 ≤ p: c = 2^p)   (therefore c is even)
Printing can be done in linear time and searching can be done in time
proportional to the logarithm of the current size of v, using Binary
Search.
But what about inserting a new value x? Inserting will require finding
the position j where x belongs (i.e. finding the value j such that v[j-1]
≤ x < v[j]), then shifting v[j:i-1] up one position to v[j+1:i], and
finally placing x in v[j]. Shifting v[j:i-1] may take time proportional
to i, which means that each insertion may take time proportional to i,
and therefore, in the worst case the total time spent inserting n items may
be on the order of n². This is expensive, and a modification is in order.
Shifting is the expensive operation, so we try to change the data repre-
sentation to make it less expensive. How can this be done, perhaps to
eliminate the need for shifting altogether?
A simple way to make shifting less expensive is to spread the values out,
so that an empty array element, or "gap", appears between each pair of
values. Thus, an array v[0:2n-1] of twice the size is defined by

Remark: If all values are known to be positive, then the sign bit of v[j]
can be used to distinguish values from gaps. □
Now, inserting takes no time at all, because the new value can be
placed in a gap. But inserting destroys the fact that a gap separates each
pair of values, and after inserting it is necessary to reconfigure the array
to reestablish (19.3.4). Reconfiguring can be costly, so we must find a
way to avoid it as much as possible.
We can defer reconfiguring the array simply by weakening the invari-
ant to allow several values to be adjacent to each other. However, there
are never adjacent gaps; the odd positions of v always contain values.
We introduce a fresh variable k to indicate the number of array elements
being used, and use the invariant
Section 19.3 Changing a Representation 249
(19.3.5) P: 0 ≤ i ∧
            ordered(v[0:k-1]) ∧ {V0, ···, V(i-1)} ∈ v[0:k-1] ∧
            (A j: 0 ≤ j < k ∧ odd(j): v[j] is not a gap) ∧
            v[0:k-1] contains k-i gaps ∧
            (A j: 0 ≤ j < k: v[j] a gap ⇒ v[j] = v[j+1])
Note, now, that when inserting the first value no shifting is required, since
it can fill a gap. The second value is likely to fill a gap also, but it may
cause a shift. The third value inserted may fill a gap also, but the proba-
bility is greater that it will cause some shifting because there are fewer
gaps. At some time, so many values will have been inserted that shifting
again becomes too expensive. At this point, it is wise to reconfigure the
array so that there is again one gap between each pair of values.
To summarize, the table is defined by (19.3.5), with (19.3.4) also being
true initially. That is, values are separated by gaps. The table is initially
set to empty using
    i, k := 0, 0
{(19.3.4) and (19.3.5) are true}

(19.3.6) {(19.3.5)}
         if shifting too expensive        → Reconfigure to reestablish (19.3.4)
         □  shifting is not too expensive → skip
         fi;
         Find the position j where Vi belongs;
         Shift v[j: ...] up one position to make room for Vi;
         i, v[j] := i+1, Vi
Discussion
The first idea in developing this algorithm was to find a way to make
shifting less expensive; the method used was to put a gap between each
pair of values. The second idea was to defer reconfiguration, because it
was too expensive. The first idea made shifting cheap, but introduced the
expensive reconfiguration operation; the second idea deferred reconfigura-
tion often enough so that the total costs of shifting and reconfiguration
were roughly the same.
The algorithm is a competitor to balanced tree schemes in situations
where a table of values is to be maintained in memory.
The first four functions are executed in constant time. Function append,
however, takes time proportional to the length of the list v to which w is
being appended.
This is all we will need to know about LISP.
Consider implementing a queue using LISP lists and the five functions
just given. A queue is a list v on which three operations may be per-
formed: the first is to reference the first element on the list, the second is
to delete the first element and the third is to insert a value w at the end
of the queue. Thus, the three operations on queue v can be written as
the queue can be examined. In the worst case, the time needed to per-
form the insertions is on the order of n². Why?
Insertion can be done easily if the queue is kept in reverse order. But
this would make deletion expensive. Thus, we compromise: implement
queue v = (v0, ···, v(n-1)) using two lists vh and vt, where the second is
reversed:

vh := tail(vh);
if vh = () ∧ vt ≠ () →
      {inv: queue is (reverse(vt) | vh)}
      {bound: |vt|}
      do vt ≠ () → vh, vt := construct(head(vt), vh), tail(vt) od
      {(19.3.7) ∧ vt = ()}
□  vh ≠ () ∨ vt = () → skip
fi
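The compromise can be sketched in Python, with Python lists standing in for the LISP lists of the text; this is an illustration of the two-list idea, not the book's code. The front list keeps its oldest element at the end so that popping it is cheap, and the reversal of the rear list plays the role of the transfer loop above.

```python
class Queue:
    """Queue as two lists: _front holds the front of the queue with the
    oldest element at the END (so pop() is O(1)); _rear holds newly
    inserted elements in arrival order. Insert is O(1); delete is
    amortized O(1), paying for one reversal per element."""
    def __init__(self):
        self._front = []
        self._rear = []

    def insert(self, w):
        self._rear.append(w)          # cheap: rear kept "reversed"

    def _shift(self):
        # the transfer loop: reverse(vt) moved onto vh when vh is empty
        if not self._front and self._rear:
            self._front = self._rear[::-1]
            self._rear = []

    def first(self):
        self._shift()
        return self._front[-1]        # reference the first element

    def delete(self):
        self._shift()
        return self._front.pop()      # delete the first element
```

Each element is moved at most once from rear to front, so n insertions and n deletions take O(n) total, rather than the O(n²) of repeated appends to a singly linked list.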
that, given fixed i > 0, fixed integer k, array v and array isgap satisfying
(19.3.5), spreads the array values out to establish (19.3.4).
2. Change the representation of variables in program (19.3.2) so that no squaring
operations are used.
3. Develop a program that, given fixed integers X, Y > 0, establishes z = X^Y.
Develop it using the idea that z should be calculated through a series of multipli-
cations, so that it may make sense to initialize z to the identity of *, 1, trying to
create the invariant of a loop first, and changing a representation to make it all
possible.
Chapter 20
Two Larger Examples of Program Development
(20.1.1) justifying#lines#by########
inserting#extra#blanks#is##
one#task#of#a#text#editor.#
(20.1.2) justifying#####lines#####by
inserting#extra##blanks##is
one##task#of#a#text#editor.
The first step is to write pre- and postconditions for the procedure body.
We begin with the precondition. The words themselves are not part of
the specification, since only column numbers are given. So the precondi-
tion won't be written in terms of words. But it may help to give an
interpretation of the precondition in terms of words. Initially, the input
line has the form
where W1 is the first word, W2 the second, ..., Wn the last, s is the
number of extra blanks, and the number of blanks at each place has been
shown within brackets. The precondition Q itself must give restrictions
on the input -e.g. that there cannot be a negative number of words or of
extra blanks. In addition, because array b will be modified, it is
Section 20.1 Right-Justifying Lines of Text 255
(20.1.7) {Q}
         Calculate p, q and t to establish Q1;
         {Q1 ∧ Q}
         Calculate new b[1:n] to establish R
         {Q1 ∧ R}

Calculating p, q and t

The two English commands of (20.1.7) have to be refined. We begin
with the first. At this point, refine "Calculate p, q and t to establish
Q1". Be absolutely sure the refinement is correct.
which simplifies to
which cannot be true. What does n = 0 mean? That there are no words
on a line. But of course, a line with 0 words cannot be justified!
Assume the specification is changed so that, if a line has zero or one
words on it, then no justification should occur.
The case even(z) is solved in a similar fashion, leaving us with the fol-
lowing algorithm to establish Q1 if n > 1:

Determine p, q and t:
    if even(z) → q := s÷(n-1); t := 1 + (s mod (n-1)); p := q+1
    □  odd(z)  → p := s÷(n-1); t := n - (s mod (n-1)); q := p+1
    fi
or simply
k, e := n, s;
do k ≠ t → b[k] := b[k]+e; k, e := k-1, e-q od;
do e ≠ 0 → b[k] := b[k]+e; k, e := k-1, e-p od
Each loop was developed by first writing the invariant, then writing the
command of the loop, and finally determining a suitable guard. The
guard e ≠ 0 for the second loop was discovered by noting that the invari-
ant states that e = p*(k-1) and that e = 0 implies either p = 0 or k = 1,
each of which implies that all values b[i] have their final value.
Discussion
The development of this program brings up several interesting points.
First of all, consider the development of the postcondition (20.1.6). A
common mistake in writing this specification is to describe the right-
justified line as two cases:
While it can lead to a correct program, the program will be less efficient
than the one developed, even if in a relatively minor way. Generally
speaking, one should try to follow the principle:
But this would have eliminated the possibility of noticing that the loop
could be written without loss of efficiency to halt immediately if p = 0.
Further, one familiar with using loop invariants will generate the invariant
and loop given in (20.1.11) as quickly as the PL/I loop.
Section 20.2 The Longest Upsequence 259
Thus, using a variable k to contain the answer, the program has the pre-
and postconditions:

    Q: n > 0
    R: k = lup(b[0:n-1])
Note that a change in any one value of a sequence could change its long-
est upsequence, and this means that possibly every value of a sequence s
must be interrogated to determine lup(s). This suggests a loop. Begin by
writing a possible invariant and an outline of the loop.
The loop will interrogate the values of b[0:n-1] in some order. Since
lup(b[0:0]) is 1, a possible invariant can be derived by replacing the con-
stant n of R by a variable:

    P: 1 ≤ i ≤ n ∧ k = lup(b[0:i-1])

    i, k := 1, 1;
    do i ≠ n → increase i, maintaining P od
P: 1 ≤ i ≤ n ∧ k = lup(b[0:i-1]) ∧
   m is the smallest value in b[0:i-1] that ends an
   upsequence of length k

In the case b[i] ≥ m, k can be increased and m set to b[i], so that the
program thus far looks like

i, k, m := 1, 1, b[0]; {P}
do i ≠ n → if b[i] ≥ m → k, m := k+1, b[i]
           □  b[i] < m → ?
           fi;
           i := i+1
od
The case b[i] < m[1] is the easiest to handle. Since m[1] is the smallest
value that ends an upsequence of length 1 of b[0:i-1], if b[i] < m[1],
then b[i] is the smallest value in b[0:i] and it should become the new
m[1]. No other value of m need be changed, since all upsequences of
b[0:i-1] end in a value larger than b[i].
Finally, consider the case m[1] ≤ b[i] < m[k]. Which values of m
should be changed? Clearly, only those greater than b[i] can be changed,
since they represent minimum values. So suppose we find the j satisfying
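The completed development, with the array m of minimum ending values and Binary Search to locate the j to change, can be sketched in Python. This is an illustration of the technique, not the book's final program; Python's bisect module plays the role of the Binary Search, and m is 0-indexed so m[j-1] ends an upsequence of length j.

```python
import bisect

def lup(b):
    """Length of the longest (non-decreasing) upsequence of b.
    m holds, for each length, the smallest value ending an
    upsequence of that length, as in the invariant above."""
    assert len(b) > 0
    m = [b[0]]
    for x in b[1:]:
        if x >= m[-1]:
            m.append(x)        # case b[i] >= m[k]: lengthen by one
        else:
            # Binary Search for the j with m[j] <= x < m[j+1];
            # x becomes the new, smaller ending value there
            m[bisect.bisect_right(m, x)] = x
    return len(m)
```

Since m stays ordered, each of the n iterations costs O(log n), for O(n log n) in all.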
2. (The Next Higher Permutation). Suppose array b[0:n-1] contains a sequence
of (not necessarily different) digits, e.g. n = 6 and b[0:5] = (2, 4, 3, 6, 2, 1). Con-
sider this sequence as the integer 243621. For any such sequence (except for the
one whose digits are in decreasing order) there exists a permutation of the digits
that yields the next higher integer (using the same digits). For the example, it is
(2, 4, 6, 1, 2, 3), which represents the integer 246123.
Write a program that, given an array b[0:n-1] that has a next higher permu-
tation, changes b into that next higher permutation.
3. (Different Adjacent Subsequences). Consider sequences of 1's, 2's and 3's. Call
a sequence good if no two adjacent non-empty subsequences of it are the same.
For example, the following sequences are good:
2
32
32123
1232123
Exercises for Chapter 20

and the following sequences are not good:

33
32121323
123123213
It is known that a good sequence exists of any length. Consider the "alphabetical
ordering" of sequences, where sequence s1 .<. sequence s2 if, when considered as
decimal fractions, s1 is less than s2. For example, 123 .<. 1231 because
.123 < .1231, and 12 .<. 13. Note that if we allow 0's in a sequence, then
s1 | 0 .=. s1. For example, 110 .=. 11, because .110 = .11.
Write a program that, given a fixed integer n ≥ 0, stores in array b[0:n-1]
the smallest good sequence of length n.
4. (The Line Generator). Given is some text stored one character to an array ele-
ment in array b[0:n-1]. The possible characters are the letters A, ..., Z, a blank
and a new-line character (NL). The text is considered to be a sequence of words
separated by blanks and new-line characters. Desired is a program that breaks
the text into lines in a two-dimensional array line[0:nolines-1, 0:maxpos-1],
with line[0, 0:maxpos-1] being the first line, line[1, 0:maxpos-1] being the
second line, etc. The lines must satisfy the following properties:
For X, we can define a second array X'[0:N-1] as follows. For each i, element
X'[i] is the number of values in X[0:i-1] that are less than X[i]. For exam-
ple, we show one possible array X and the corresponding array X', for N = 6.

    X  = (2, 0, 3, 1, 5, 4)
    X' = (0, 0, 2, 1, 4, 4)
7. (The Non-Crooks). Array f[0:F-1] contains the names of people who work
at Cornell, in alphabetical order. Array g[0:G-1] contains the names of people
on welfare in Ithaca, in alphabetical order. Thus, neither array contains dupli-
cates and both arrays are monotonically increasing:
10. (Due to W.H.J. Feijen) Given is an array g[0:N-1], N ≥ 2, satisfying
0 ≤ g[0] ≤ ··· ≤ g[N-1]. Define

    h1 = g[0] + g[1]
    hk = h(k-1) + g[k]   for 1 < k ≤ N-1
11. (Exponentiation). Write a program that, given two integers x ≥ 0 and y > 0,
calculates the value z = x^y. The binary representation b(k-1) ··· b1 b0 of y is
also given, and the program can refer to bit i using the notation bi. Further, the
value k is given. The program is to begin with z = 1 and reference each bit of
the binary representation once, in the order b(k-1), b(k-2), ···
Chapter 21
Inverting Programs
{x = 3} x := 1

is

{x = 1} x := 3
Thus, execution of the first begins with x = 3 and ends with x = 1, while
execution of the second does the opposite. (Note carefully how one gets
an inverse by reading backwards -except that the assertion becomes the
command and the command becomes the assertion. This itself is a sort of
inversion.) This example shows that we may have to compute inverses of
programs together with their pre- and/or postconditions.
The command x := x*x has no inverse, because two different initial
values x = 2 and x = -2 yield the same result x = 4. To have an inverse,
a program must yield a different result for each different input.
The inverse of x := x-y is x := x+y, and vice versa. Let's calculate the
inverse of y := x-y. This is equivalent to y := -(y-x), which is
equivalent to y := y-x; y := -y. The inverse of this sequence is y := -y;
y := y+x, which is equivalent to y := -y+x, which is equivalent to
y := x-y. Hence, y := x-y is its own inverse, and (21.2) is equivalent to

But then (21.1) is its own inverse! We leave to exercise 1 the proof that
(21.1) swaps the values of the integer variables x and y.
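A self-inverse swap of this kind can be sketched in Python. The exact text of (21.1) is not shown in this excerpt, so the three-assignment sequence below is an assumed reconstruction built from the inverses just computed; reading it backwards and inverting each assignment reproduces the same sequence, so it is its own inverse.

```python
def swap_no_temp(x, y):
    """Swap two integers without a temporary variable.
    Inverting: reverse the order and invert each assignment.
    (x := x+y)⁻¹ is x := x-y, and y := x-y is its own inverse,
    so the inverted sequence is textually the same sequence."""
    x = x + y      # inverse: x = x - y
    y = x - y      # its own inverse; y now holds the old x
    x = x - y      # inverse: x = x + y; x now holds the old y
    return x, y
```

Applying the function twice to any pair returns the original pair, which is exactly the self-inverse property.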
The inverse of skip. The inverse of skip would be piks, so we will have
to introduce piks as a synonym for skip.
executed, and upon termination x has a final value c2. The inverse
assigns c2 to x, executes the inverse of S, and terminates with x = c1:
Execution must begin with at least one guard true, so the disjunction of
the guards has been placed before the command. Execution terminates
with either RI or R2 true, depending on which command is executed, so
RI v R2 is the postcondition.
To perform the inverse of (21.3), we must know whether to perform
the inverse of S2 or the inverse of S1, since only one of them
is executed when (21.3) is executed. To determine this requires knowing
which of R2 and R1 is true, which means they cannot both be true at the
same time. We therefore require that R1 ∧ R2 = F. For symmetry, we
also require B1 ∧ B2 = F.
Now let's develop the inverse of (21.3). Begin at the end of (21.3) and
read backwards. The last line of (21.3) gives us the first line of the
inverse: {R2 ∨ R1} if. This makes sense; since (21.3) must end in a state
satisfying R1 ∨ R2, its inverse must begin in a state satisfying R2 ∨ R1.
Reading the fourth line backwards gives us the first guarded command:

    R2 → S2⁻¹ {B2}
This is understood as follows. Execution of (21.3) beginning with B2 true
executes S2 and establishes R2. Execution of its inverse beginning with
R2 true undoes what S2 has done, thus establishing B2.
Note carefully how, when inverting a guarded command with a post-
condition, the guard and postcondition switch places.
Continuing to read backwards yields the following inverse of (21.3)
(provided R1 ∧ R2 = F):
(21.5)  do B1 → S1 od {¬B1}

Loop (21.5) contains the barest information; it is annotated only with
the fact that B1 is false upon termination. It turns out that a loop invari-
ant is not needed to invert a loop.
From previous experience in inverting an alternative command, we
know that a guarded command to be inverted requires a postcondition.
Further, we can expect ¬B1 to become the precondition of the loop
(because we read backwards) and therefore the loop must have a precon-
dition that will become the postcondition. The two occurrences of B1 in
(21.5) lead us to insert another predicate C1 as follows:
Now it's easy to invert: simply read backwards, inverting the delimiters
do and od and inverting a guarded command as done earlier in the case
of the alternative command. The inverse of (21.6) is
Inverting swap_equals
In section 16.5 a program was developed to swap two non-overlapping
sections b[i:i+n-1] and b[j:j+n-1] of equal size n, where n ≥ 0. The
invariant for the loop of the program is 0 ≤ k ≤ n together with

k := 0;
do k ≠ n → b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1 od
postcondition that describes the value of k. This postcondition is k = n,
the complement of the guard of the loop. Also, to invert the loop we will
need a precondition for it and a postcondition for its body; these can be
k = 0 and k ≠ 0, respectively. Thus, we rewrite the program as
(21.8) k := 0;
       loop: {k = 0}
             do k ≠ n →
                b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1 {k ≠ 0}
             od
             {k = n}
       {k = n}
where loop labels the five indented lines: the loop and its pre- and post-
conditions. Using the rule for inverting a block, we find the inverse of
this program to be
pool: {k = n}
      do k ≠ 0 →
         (b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1)⁻¹ {k ≠ n} od
      {k = 0}
Further, the body of the loop -the inverse of the multiple assignment in
the original loop- is
k := n;
pool: {k = n}
      do k ≠ 0 →
         k := k-1; b[i+k], b[j+k] := b[j+k], b[i+k] {k ≠ n} od
      {k = 0}
{k = 0}
Note how the original program swaps values beginning with the first ele-
ments of the sections, while its inverse begins with the last elements and
works its way backward. Note also that (21.8) is its own inverse, so (21.8)
has at least two inverses.
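Both directions can be sketched in Python (lists in place of the array b). The first function follows (21.8); the second is its inverse read backwards, starting with the last elements; since swapping twice restores the array, either one undoes the other.

```python
def swap_sections(b, i, j, n):
    """Swap b[i:i+n] and b[j:j+n] (assumed non-overlapping),
    working forward from the first elements, as in (21.8)."""
    k = 0
    while k != n:
        b[i + k], b[j + k] = b[j + k], b[i + k]
        k += 1

def swap_sections_inverse(b, i, j, n):
    """The inverse: the same swaps, performed from the last
    elements back, as obtained by reading (21.8) backwards."""
    k = n
    while k != 0:
        k -= 1
        b[i + k], b[j + k] = b[j + k], b[i + k]
```

Running swap_sections and then swap_sections_inverse (or either one twice) leaves b unchanged, illustrating that (21.8) has at least these two inverses.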
Inverting Perm_to_Code

Exercise 5 of chapter 20 was to write a program for the following
problem. Let N be an integer, N > 0, and let X[0:N-1] be an array that
contains a permutation of the integers 0, 1, ···, N-1. Formally,

(21.10)  X  = (2, 0, 3, 1, 5, 4)
         X' = (0, 0, 2, 1, 4, 4)
We try to write a loop that changes one value of array x from its initial
to its final value at each iteration. The usual strategy in such cases is to
replace a constant of the result assertion by a variable. Here, we can
replace 0 or N, which leads to calculating the array values in descending
or ascending order of subscript value, respectively. Which should we do?
In example (21.10), the values X[N-1] and X'[N-1] are the same. If
the last values of X and X' were always the same, working in descending
order of subscript values might make more sense. So let's try to prove
that they are always the same.
X[N-1] is the last value of X. Since the array values are 0, ..., N-1,
there are exactly X[N-1] values less than X[N-1] in X[0:N-2]. But
X'[N-1] is defined to be the number of values in X[0:N-2] less than
X[N-1]. Hence, X[N-1] and X'[N-1] are the same.
Replacing the constant 0 of the postcondition by a variable k yields
the first attempt at an invariant:
But the invariant must also indicate that the lower part of x still contains
its initial value, so we rewrite the invariant as
The obvious bound function is k, and the loop invariant can be esta-
blished using k : = N.
There is still a big problem with using this as the loop invariant. We
began developing the invariant by noticing that X[N-1] = X'[N-1], so
that the final value of x[N-1] was the same as its initial value. To gen-
eralize this situation, at each iteration we would like x[k-1] to contain
its final value, but the invariant developed thus far doesn't indicate this.
The generalization would work if at each iteration x[O:k-1] contained
a permutation of the integers {O, · · · , k-1} and if the code for this per-
mutation was equal to X'[O:k -I]. But this is not the case: the invariant
does not even indicate that x[O:k-1] is a permutation of the integers
{O, · · · ,k-1}.
Perhaps x can be modified during each iteration so that this is the
case. Let us rewrite the invariant as
k := N;
do k ≠ 0 → k := k-1;
           Reestablish P
od

x = (2, 5, 4, 1, 0, 3) and k = 6
x = (2, 5, 4, 1, 0, 3) and k = 5
x = (2, 4, 3, 1, 0, 3) and k = 5
(21.12) k := N;
        do k ≠ 0 →
           k := k-1;
           Subtract 1 from every member of x[0:k-1] that is > x[k]:
              j := 0;
              do j ≠ k → {x[j] ≠ x[k]}
                 if x[j] > x[k] → x[j] := x[j]-1
                 □  x[j] < x[k] → skip
                 fi;
                 j := j+1
              od
        od
k:= N;
loopa: {k = N}
do k ≠ 0 →
    k:= k-1;
    j:= 0;
    loopb: {j = 0}
    do j ≠ k →
        if x[j] > x[k] → x[j]:= x[j]-1 {x[j] ≥ x[k]}
        □ x[j] < x[k] → skip {x[j] < x[k]}
        fi;
        j:= j+1
        {j ≠ 0}
    od
    {j = k}
    {k ≠ N}
od
{k = 0}
Now invert the program, step by step, applying the inversion rules given
earlier. First, invert the block k:= N; loopa {k = 0} to yield k:= 0;
loopa⁻¹ {k = N}. Next, loopa⁻¹ is

apool: {k = 0}
do k ≠ N → (k:= k-1; j:= 0; loopb {j = k})⁻¹ {k ≠ 0} od
{k = N}
k:= 0;
apool: {k = 0}
do k ≠ N →
    j:= k;
    bpool: {j = k}
    do j ≠ 0 →
        j:= j-1;
        if x[k] > x[j] → piks {x[k] > x[j]}
        □ x[k] ≤ x[j] → x[j]:= x[j]+1 {x[k] < x[j]}
        fi
        {j ≠ k}
    od;
    {j = 0}
    k:= k+1
    {k ≠ 0}
od
{k = N}
k:= 0;
do k ≠ N →
    j:= k;
    do j ≠ 0 →
        j:= j-1;
        if x[k] > x[j] → piks
        □ x[k] ≤ x[j] → x[j]:= x[j]+1
        fi
    od;
    k:= k+1
od
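As a check on the inversion, both programs can be transcribed into Python and run against the trace value X = (2, 5, 4, 1, 0, 3). This is a sketch of mine, not from the book; the function names are invented labels for program (21.12) and its inverse.

```python
def perm_to_code(x):
    """Program (21.12): turn a permutation x of 0..N-1 into its code,
    where code[k] counts the values in x[0:k] less than the original x[k]."""
    x = list(x)
    k = len(x)
    while k != 0:
        k -= 1
        # Subtract 1 from every member of x[0:k] that is > x[k]
        for j in range(k):
            if x[j] > x[k]:
                x[j] -= 1
    return x

def code_to_perm(x):
    """The inverted program: each loop runs in the opposite direction
    and each assignment is undone (skip inverts to skip, i.e. "piks")."""
    x = list(x)
    for k in range(len(x)):
        for j in reversed(range(k)):
            if x[k] > x[j]:
                pass                  # piks
            else:
                x[j] += 1
    return x
```

Running code_to_perm(perm_to_code(x)) yields x again for any permutation x, which is exactly the claim that the second program inverts the first.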
Almost all programs in this book have been written in the guarded
command notation, with the addition of multiple assignment, procedure
call and procedure declaration. To execute the programs on a computer
usually requires translation into Pascal, PL/I, FORTRAN or another
implemented language. Nevertheless, it still makes sense to use the
guarded command notation because the method of program development
is so intertwined with it. Remember Principle 18.3.11: program into a
programming language, not in it.
In this chapter, we discuss the problems of writing programs in other
languages as well as in the guarded command notation. We give general
rules for indenting and formatting, describe problems with definitions and
declarations of variables, and show by example how the guarded com-
mand notation might be translated into other languages.
22.1 Indentation
In the early days, programs were written in FORTRAN and assembly
languages with no indentation whatsoever, and they were hard to under-
stand because of it. The crutch that provided some measure of relief was
the flow chart, since it gave a two-dimensional representation that exhi-
bited the program structure or "flow of control" more clearly.
Maintaining two different forms of the program -the text itself and
the flow chart- has always been prone to error because of the difficulty
in keeping them consistent. Further, most programmers have never liked
drawing flow charts, and have often produced them only after programs
were finished, and only because they were told to provide them as docu-
mentation. Therefore the relief expected from the use of flow charts was
missing when most needed -during program development.
276 Part III. The Development of Programs
Sequential composition
Many programming conventions force the programmer to write each
command on a separate line. This tends to spread a program out, making
it difficult to keep the program on one page. Then, indentation becomes
hard to follow. The rule to use is the following:
i= 1; k= 1; m(1)= b(0); /* P */
Together, the three assignments perform the single function of establishing
P. There is no reason to force the programmer to write them as
i= 1;
k= 1;
m(1)= b(0);
(As an aside, note how the PL/I assignment is written with no blank to
the left of = and one blank to the right. Since FORTRAN and PL/I use
the same symbol for equality and assignment, it behooves the programmer
to find a way to make them appear different.)
Don't use rule 22.1.1 as a license to cram programs into as little space
as possible; use the rule with care and reason.
The rule concerning indentation of sequences of commands is obvious:
i= 1;
k= 1;
m(1)= b(0);
Indenting subcommands
The rule concerning subcommands of a command is:
or, in PL/I,
Note that the body of the loop is indented. Further, the body is a se-
quence of two commands, which, following rule 22.1.2, begin in the same
column. Also, the subcommands of the PL/I conditional statement are
indented with respect to its beginning.
The PL/I conditional statement could also have been written as
IF d*d <= n THEN a= d;
ELSE b= d;
Assertions
As mentioned as early as chapter 6, it helps to put assertions in pro-
grams. Include enough so that the programmer can understand the pro-
gram, but not so many that he is overwhelmed with detail. The most
important assertion, of course, is the invariant of a loop. Actually, if the
program is annotated with the precondition, the postcondition, an invari-
ant for each loop, and a bound function for each loop, then the rest of the
pre- and postconditions can, in principle, be generated automatically.
Assertions, of course, must appear as comments in languages that don't
allow them as a construct (Ada does). Two rules govern the indentation
of assertions:
We have used these rules throughout the book, so they should appear
natural by now (naturalness must be learned). For two examples of the
use of rule 22.1.5, see program 20.2.2.
Indentation of delimiters
There are three conventions for indenting a final delimiter (e.g. od, fi
and the END; of PL/I). The first convention puts the delimiter on a
separate line, beginning in the same column as the beginning of the com-
mand. This convention has been used throughout this book.
The second convention is to indent the delimiter the same distance as
the subcommands of the command -as in the PL/I loop
DO WHILE (expression);
END;
This convention has the advantage that it is easy to determine which com-
mand sequentially follows this one: simply search down in the column in
which the DO WHILE begins until a non-blank is found.
The third convention is to hide the delimiter completely on the last line
of the command. For example,
DO WHILE ( expression );
· · · END;
or
do guard →
    · · · od
This convention recognizes that the indenting rules make the end delim-
iters redundant. That is, if a compiler used the indentation to determine
the program structure, the end delimiters wouldn't be necessary. The del-
imiters are still written, because they provide a useful redundancy that can
be checked by the compiler, but they are hidden from view.
Which of the three conventions you use is not important; the impor-
tant point is to be consistent, so that the reader is not surprised:
The command-comment
Some of the programs presented in this book, like program 20.2.2,
have used an English sentence as a label (followed by a colon) or a com-
ment. The English sentence was really a command to do something, and
the program text that performed the command was indented underneath
it. Here is an example.
is not precise enough, for it forces the reader to read the refinement in
order to determine where the sum of the array elements is placed. Far
better is the command-comment
As you can see from the last example, the command-comment can be in
the form we have been using throughout the book for specifying a pro-
gram (segment).
Here is the indentation rule for command-comments.
The reason for not using this convention should be clear from the exam-
ple: one cannot tell where the refinement ends. Much better is to use rule
22.1.7:
Judicious use of spacing (skipping lines) may help, but no simple rule for
spacing after refinements can cover all cases if refinements are not
indented. So follow rule 22.1.7.
One more point concerning indentation of comments. Don't insert
them in such a manner that the structure of the program becomes hidden.
For example, if a sequence of program commands begins in column 10, no
comment between them should begin in columns to the left of column 10.
Procedure headings
As mentioned in chapter 12, the purpose of a procedure is to provide a
level of abstraction: the user of a procedure need only know what the pro-
cedure does and how to call it, and not how the procedure works. To
emphasize this, the procedure declaration should be indented as follows.
It may be reasonable to have a blank line before and after the procedure
declaration in order to set it off from the surrounding text.
As an example, here is a Pascal-like procedure declaration:
This strategy lies behind much of what has been presented in this book.
A definition of a set of variables is simply an assertion about their logical
relationship, which must be true at key places of the program. In the
These declarations suffer for several reasons. First, the variables have not
been grouped by their logical relationship. From the name staffsize, one
might deduce that this variable is logically related to array staff, but it
need not be so. Also, there is no way to understand the purpose or need
for divsize. Further, the definitions of globally important variables are
mixed up with the definitions of local variables, which are used in only a
few, adjacent places (i and j, for example).
Then there is no definition of the variables. For example, how do we
know just where in array staff the employees can be found? Are they
inserted at the beginning of the array, or the end, or in the middle? It has
also not been indicated that the lists are sorted.
Here is a better version of these declarations.
i, j: integer;
q: Phonerec;
Now the variables are grouped according to their logical relationship, and
definitions are given that describe the relationship. These definitions are
actually invariants (but not loop invariants), which hold at (almost) all
places of the program.
Variables i, j and q are presumably used only in a few, localized
places, and hence need no definition at this point.
Note carefully the format of the declarations. The variables themselves
begin in the same column, which makes it easy to find a particular vari-
able when necessary. Further, the comments describing each group
appear to the right of the variables, again all beginning in the same
column. Spending a few minutes arranging the declarations in this format
is worthwhile, for it aids the programmer as well as the reader.
One more point. Nothing is worse than a comment like "i is an index
into array b". When defining variables, refrain from buzzwords like "poin-
ter", "counter" and "index", for they serve only to point out the laziness
and lack of precision of your thought.
Section 22.3 Writing Programs in Other Languages 287
i, j, k := 0, 0, 0;
{inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv}
{bound: iv-i + jv-j + kv-k}
do f[i] < g[j] → i:= i+1
□ g[j] < h[k] → j:= j+1
□ h[k] < f[i] → k:= k+1
od
{i = iv ∧ j = jv ∧ k = kv}
i= 0; j= 0; k= 0;
/* Simulate 3-guarded-command loop: */
/* inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv */
/* bound: iv-i + jv-j + kv-k */
LOOP:
IF f(i) < g(j) THEN DO; i= i+1; GOTO LOOP; END;
IF g(j) < h(k) THEN DO; j= j+1; GOTO LOOP; END;
IF h(k) < f(i) THEN DO; k= k+1; GOTO LOOP; END;
/* i = iv ∧ j = jv ∧ k = kv */
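In Python, the loop reads almost exactly like the guarded-command original. The sketch below is mine, not the book's; like the invariant, it assumes the three ascending sequences share a common value, so termination is guaranteed.

```python
def common_index(f, g, h):
    """Find i, j, k with f[i] == g[j] == h[k] in three ascending
    sequences that are assumed to share at least one common value."""
    i = j = k = 0
    while True:
        if f[i] < g[j]:        # do f[i] < g[j] -> i := i+1
            i += 1
        elif g[j] < h[k]:      # [] g[j] < h[k] -> j := j+1
            j += 1
        elif h[k] < f[i]:      # [] h[k] < f[i] -> k := k+1
            k += 1
        else:
            # No guard holds, so f[i] = g[j] = h[k]: the loop is done.
            return i, j, k
```

When none of the three guards is true we have f[i] ≥ g[j] ≥ h[k] ≥ f[i], forcing equality -exactly the postcondition of the guarded-command loop.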
Program in Pascal
Program in PL/I
/* The n words, n ≥ 0, on line number z begin in columns b(1), ..., b(n).
Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
is the total number of blanks to insert between words to right-justify the
line. Determine new column numbers b(1:n) to represent the justified
line. Result assertion R, below, specifies that the numbers of blanks
inserted between different pairs of words differ by no more than one,
and that extra blanks are inserted to the left or right, depending on
the line number. Unless 0 ≤ n ≤ 1, the justified line has the format

W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn

where p, q, t satisfy

Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
    (odd(z) ∧ q = p+1  ∨  even(z) ∧ p = q+1)

Using B to represent the initial value of array b, result assertion R is

R: (0 ≤ n ≤ 1 ∧ b = B) ∨ ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
   (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t))) */
justify: PROC(n, z, s, b);
DECLARE (n, z, s, b(*)) FIXED;
DECLARE (q, p, t, e, k) FIXED;
IF n > 1 THEN
DO; /* Determine p, q and t: */
    IF MOD(z, 2) = 0
        THEN DO; q= s/(n-1);
                 t= 1+MOD(s, (n-1)); p= q+1; END;
        ELSE DO; p= s/(n-1);
                 t= n-MOD(s, (n-1)); q= p+1; END;
    /* Calculate new column numbers b(1:n): */
    k= n; e= s;
    /* inv: t ≤ k ≤ n ∧ e = p*(t-1)+q*(k-t) ∧
       b(1:k) = B(1:k) ∧ b(k+1:n) has its final values */
    DO WHILE(k ¬= t); b(k)= b(k)+e;
        k= k-1; e= e-q; END;
    /* inv: 1 ≤ k ≤ t ∧ e = p*(k-1) ∧
       b(1:k) = B(1:k) ∧ b(k+1:n) has its final values */
    DO WHILE(e ¬= 0); b(k)= b(k)+e;
        k= k-1; e= e-p; END;
END; END justify;
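For comparison, here is the same procedure as a Python sketch of mine (not from the book). The 1-based columns are kept by leaving b[0] unused, so the subscripts match the PL/I version.

```python
def justify(n, z, s, b):
    """Distribute s extra blanks among the gaps between the n words of
    line z, whose starting columns are b[1..n]; b is updated in place."""
    if n > 1:
        if z % 2 == 0:                  # even line: p = q+1 (extra on left)
            q = s // (n - 1)
            t = 1 + s % (n - 1)
            p = q + 1
        else:                           # odd line: q = p+1 (extra on right)
            p = s // (n - 1)
            t = n - s % (n - 1)
            q = p + 1
        k, e = n, s
        while k != t:                   # inv: e = p*(t-1) + q*(k-t)
            b[k] += e
            k -= 1
            e -= q
        while e != 0:                   # inv: e = p*(k-1)
            b[k] += e
            k -= 1
            e -= p
    return b
```

For instance, with three words starting in columns 1, 5, 10 and s = 4 blanks on an even line, the new columns are 1, 7, 14: each of the two gaps receives two extra blanks, satisfying Q1 with q = 2, p = 3, t = 1.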
Program in FORTRAN
In the FORTRAN example given below, note how each guarded com-
mand loop is implemented using an IF-statement that jumps to a labeled
CONTINUE statement. These CONTINUE statements are included only
to keep each loop as a separate entity, independent of the preceding and
following statements.
Pre-1960
FORTRAN and FAP, the IBM 7090 assembly language, were my first
programming languages, and I loved them. I could code with the best of
them, and my flow charts were always neat and clean. In 1962, as a
research assistant on a project to write the ALCOR-ILLINOIS 7090 Algol
60 Compiler, I first came in contact with Algol 60 [39]. Like many, I was
confused on this first encounter. The syntax description using BNF (see
Appendix I) seemed foreign and difficult. Dynamic arrays, which were
allocated on entrance to and deallocated on exit from a block, seemed
wasteful. The use of ":=" as the assignment symbol seemed unnecessary.
The need to declare all variables seemed stupid. Many other things dis-
turbed me.
I'm glad that I stuck with the project, for after becoming familiar with
Algol 60 I began to see its attractions. BNF became a useful tool. I
Section 23.1 A Brief History of Programming Methodology 295
began to appreciate the taste and style of Algol 60 and of the Algol 60
report itself. And I now agree with Tony Hoare that
The 1960s
The 1960s was the decade of syntax and compiling. One sees this in
the wealth of papers on context-free languages, parsing, compilers, com-
piler-compilers and so on. The linguists also got into the parsing game,
and people received Ph.D.s for writing compilers.
Algol was a focal point of much of the research, perhaps because of
the strong influence of IFIP Working Group 2.1 on Algol, which met
once or twice a year (mostly in Europe). (IFIP stands for International
Federation for Information Processing.) Among other tasks, WG2.1 pub-
lished the Algol Bulletin in the 1960s, an informal publication with fairly
wide distribution, which kept people up to date on the work being done in
Algol and Algol-like languages.
Few people were involved deeply in understanding programming per se
at that time (although one does find a few early papers on the subject)
and, at least in the early 1960s, people seemed to be satisfied with pro-
gramming as it was being performed. If efforts were made to develop for-
mal definitions of programming languages, they were made largely to
understand languages and compilers, rather than programming. Concepts
from automata theory and formal languages played a large role in these
developments, as is evidenced by the proceedings [42] of one important
conference that was held under IFIP's auspices.
A few isolated papers and discussions did give some early indications
that much remained to be done in the field of programming. One of the
first references to the idea of proving programs correct was in a stimulat-
ing paper [35] presented in 1961 and again at the 1962 IFIP Congress by
John McCarthy (then at M.I.T., now at Stanford University). In that
paper, McCarthy stated that "instead of trying out computer programs on
test cases until they are debugged, one should prove that they have the
desired properties." And, at the same Congress, Edsger W. Dijkstra
(Technological University Eindhoven, the Netherlands, and later also with
Burroughs) gave a talk titled Some meditations on advanced program-
ming [11]. At the 1965 IFIP Congress, Stanley Gill, of England, re-
marked that "another practical problem, which is now beginning to loom
very large indeed and offers little prospect of a satisfactory solution, is
that of checking the correctness of a large program."
But, in the main, the correctness problem was attacked by the more
theoretically inclined researchers only in terms of the problem of formally
proving the equivalence of two different programs; this approach has not
yet been that useful from a practical standpoint.
As the 1960s progressed, it was slowly realized that there really were
immense problems in the software field. The complexity and size of pro-
jects increased tremendously in the 1960s, without commensurate increases
in the tools and abilities of the programmers; the result was missed dead-
lines, cost overruns and unreliable software. In 1968, a NATO Confer-
ence on Software Engineering was held in Garmisch, Germany, [6] in
order to discuss the critical situation. Having received my degree (Dr. rer.
nat) two years earlier in Munich under F.L. Bauer, one of the major
organizers of the conference, I was invited to attend and help organize.
Thus, I was able to listen to the leading figures from academia and indus-
try discuss together the problems of programming from their two, quite
different, viewpoints. People spoke openly about their failures in soft-
ware, and not only about their successes, in order to get to the root of the
problem. For the first time, a consensus emerged that there really was a
software crisis, that programming was not very well understood.
In response to the growing awareness, in 1969 IFIP approved the for-
mation of Working Group 2.3 on programming methodology, with Mich-
ael Woodger (National Physics Laboratory, England) as chairman. Some
of its members -including Dijkstra, Brian Randell (University of Newcas-
tle upon Tyne), Doug Ross (Softech), Gerhard Seegmueller (Technical
University Munich), Wlad M. Turski (University of Warsaw) and Niklaus
Wirth (Eidgenossische Technische Hochschule, Zurich)- had resigned
from WG2.1 earlier when Algol 68 was adopted by WG2.1 as the "next
Algol". Their growing awareness of the problems of programming had
convinced them that Algol 68 was a step in the wrong direction, that a
smaller, simpler programming language and description was necessary.
Thus, just around 1970, programming had become a recognized, res-
pectable -in fact, critical- area of research. Dijkstra's article on the
harmfulness of the goto in 1968 [12] had stirred up a hornets' nest. And
his monograph On Structured Programming [14] (in which the term was
introduced in the title but never used in the text), together with Wirth's
article [44] on stepwise refinement, set the tone for many years to come.
P_e^x {x:= e} P

P ∧ B {S} P
------------------------
P {while B do S} P ∧ ¬B
And yet, we didn't really know how to do this. For example, we knew
that the loop invariant should come before the loop, but we had no good
methods for doing so and certainly could not teach others to do it. The
arguments went back and forth for some time, with those in favor of loop
invariants becoming more adept at producing them and coming up with
more and more examples to back up their case.
The issue was blurred by the varying notions of the word proof. Some
felt that the only way to prove a program correct formally was to use a
theorem prover or verifier. Some argued that mechanical proofs were and
would continue to be useless, because of the complexity and detail that
arose. Others argued that mechanical proofs were useless because no one
could read them. Article [10] contains a synthesis of arguments made
against proofs of correctness of programs, and it is suggested reading. In
this book, a middle view has been used: one should develop a proof and
program hand-in-hand, but the proof should be a mixture of formality
and common sense.
Several forums existed throughout the 1970s for discussing technical
work on programming. Besides the usual conferences and exchanges, two
other means deserve mention. First, IFIP Working Group 2.3 on pro-
gramming methodology, and later WG2.1, WG2.2 and WG2.4, were used
quite heavily to present and discuss problems related to programming.
Since its formation, WG2.3 has met once or twice a year for five days to
discuss various aspects of programming. No formal proceedings have ever
emerged from the group; rather the plan has been to provide a forum for
discussion and cross-fertilization of ideas, with the results of the interac-
tion appearing in the normal scientific publications of its members. The
group has produced an anthology of already-published articles by its
members [22], which illustrates well the influence of WG2.3 on the field of
programming during the 1970s. It is recommended reading for those
interested in programming methodology.
Secondly, several two-week courses were organized throughout the
1970s by the Technical University Munich. These courses were taught by
the leaders in the field and attended by advanced graduate students,
young Ph.D.s, scientists new to the field and people from industry from
Europe, the U.S. and Canada; they were not just organized to teach a
subject but to establish a forum for discussion of ongoing research in a
very well-organized fashion. Many of the ones dealing with programming
itself (some were on compiling, operating systems, etc.) were sponsored by
NATO. These schools are unusual in that 50 to 100 researchers were
together for two weeks to discuss one topic. The lectures of many of the
schools have been published -see for example [2], [4] and [3].
Back to the development of programs. In 1975, Edsger W. Dijkstra
published a paper [15], which was a forerunner to his book [16]. The
Section 23.2 The Problems Used in the Book 301
The Coffee Can Problem (Chapter 1). Dijkstra mentioned the problem in
a letter in Fall 1979; he learned of it from his colleague, Carel Scholten.
It took five minutes to solve.
Closing the Curve (Chapter 1). John Williams (then at Cornell, now at
IBM, San Jose) asked me to solve this problem in 1973. I was not able
to do so, and Williams had to give me the answer.
The Maximum Problem (Chapter 14). [16], pp. 52-53.
The Next Higher Permutation Problem (exercise 2 of chapter 14 and
exercise 2 of chapter 20). The problem has been around for a long
time; the development is from [16], pp. 107-110.
Searching a Two-dimensional Array (sections 15.1, 15.2). My solution.
Four-tuple Sort (section 15.2). [16], p. 61.
gcd(x, y) (exercise 2 of section 15.2). This, of course, goes back to Euclid.
The versions presented here are largely from [16].
Approximating the Square Root (sections 16.2, 16.3 and 19.3). [16], pp.
61-65.
Linear Search and the Linear Search Principle (section 16.2). The devel-
opment is from [16], pp. 105-106.
The Plateau Problem (section 16.3). I used this problem to illustrate loop
invariants at a conference in Munich, Germany, in 1974. Because of
lack of experience, my program used too many variables (see the discus-
sion at the end of section 16.3). Michael Griffiths (University of
Nancy) wrote a recursive definition of the plateau of an array and then
changed the definition into an iterative program; the result was a pro-
gram similar to (16.3.11). The idealized development given in section
opment as done in section 19.3 appears in [16], pp. 65-67. The program
of exercise 11, which processes the binary representation in a different
order, was shown to me by John Williams. I once listened to two com-
puter scientists discussing exponentiation talk right past each other; each
thought he was talking about the exponentiation routine, not knowing
that the other existed.
Controlled Density Sorting (section 19.3). Robert Melville derived this
algorithm as part of his Ph.D. thesis at Cornell [36]; it appeared in [37].
Efficient Queues in LISP (section 19.3). Robert Melville derived this
algorithm as part of his Ph.D. thesis at Cornell [36].
Right-justifying Lines of Text (section 20.1). The derivation first
appeared in [21].
The Longest Upsequence (section 20.2). Dijkstra gave this as an exercise
a day before he derived it at the 1978 Marktoberdorf Course on Pro-
gram Construction [4]. Four or five people present, who were experi-
enced in the method of programming, had no difficulty with it; the rest
of the audience did. Jay Misra (University of Texas, Austin) had pre-
sented a similar solution earlier in a paper on program development
[38], and a generalization of it is used in the UNIX program DIFF [31].
Unique 5-bit Subsequences (exercise 1, chapter 20). In [13].
Different Adjacent Subsequences (exercise 2, chapter 20). In [13].
Perm-to-Code (exercise 5 of chapter 20). This problem was solved by
Dijkstra and his colleague, Willem H.J. Feijen, in connection with in-
verting programs (see chapter 21) in [17]. The concept of inverting pro-
grams and most of the inversions presented in chapter 21 are due to
them.
Code-to-Perm (exercise 6 of chapter 20). See Perm-to-Code.
Appendix 1
Backus-Naur Form
(A1.1) <digit> ::= 1
These two rules, which express different forms for the same nonterminal,
can be abbreviated using the symbol |, read as "or", as
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
xUy => xuy
The symbol => denotes a single derivation -one rewriting action. The
symbol =>* denotes a sequence of zero or more single derivations. Thus,
<constant> 1 =>* 3251
[Figure: two syntax trees for the sentence 1+3*4, built from <expr>,
<constant> and <digit>. In the left tree the subexpression 3*4 is grouped
under one <expr>, so * is performed first; in the right tree 1+3 is
grouped, so + is performed first.]
A grammar that allows more than one syntax tree for some sentence is
called ambiguous. This is because the existence of two syntax trees allows
us to "parse" the sentence in two different ways, and hence to perhaps
give two meanings to it. In this case, the ambiguity shows that the gram-
mar does not indicate whether + should be performed before or after *.
The syntax tree to the left (above) indicates that * should be performed
first, because the <expr> from which it is derived is in a sense an
operand of the addition operator+. On the other hand, the syntax tree to
the right indicates that + should be performed first.
One can write an unambiguous grammar that indicates that multiplica-
tion has precedence over plus (except when parentheses are used to over-
ride the precedence). To do this requires introducing new nonterminal
symbols, <term> and <factor>:
<expr> ::= <term> | <expr> + <term>
         | <expr> - <term>
<term> ::= <factor> | <term> * <factor>
<factor> ::= <constant> | ( <expr> )
<constant> ::= <digit>
<constant> ::= <constant> <digit>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In this grammar, each sentence has one syntax tree, so there is no ambi-
guity. For example, the sentence 1+3*4 has one syntax tree:
[Figure: the single syntax tree for 1+3*4 under the unambiguous
grammar. The product 3*4 is derived from a <term>, which is an
operand of the addition.]
Extensions to BNF
A few extensions to BNF are used to make it easier to read and under-
stand. One of the most important is the use of braces to indicate repeti-
tion: {x} denotes zero or more occurrences of the sequence of symbols x.
Using this extension, we can describe <constant> using one rule as

<constant> ::= <digit> { <digit> }
References
The theory of syntax has been studied extensively. An excellent text
on the material is Introduction to Automata Theory, Languages and
Computation (Hopcroft, J.E. and J.D. Ullman; Addison-Wesley, 1979).
The practical use of the theory in compiler construction is discussed in the
texts Compiler Construction for Digital Computers (Gries, D.; John
Wiley, 1971) and Principles of Compiler Design (Aho, A.V., and J. Ull-
man; Addison-Wesley, 1977).
Appendix 2
Sets, Sequences, Integers, and Real Numbers
These examples illustrate one way of describing a set: write its elements as
a list within braces { and }, with commas joining adjacent elements. The
first two examples illustrate that the order of the elements in the list does
not matter. The third example illustrates that an element listed more than
once is considered to be in the set only once; elements of a set must be
distinct. The final example illustrates that a set may contain zero ele-
ments, in which case it is called the empty set.
It is not possible to list all elements of an infinite set (a set with an
infinite number of elements). In this case, one often uses dots to indicate
that the reader should use his imagination, but in a conservative fashion,
{k I even(k)}
{(i, j) | i = j+1}
Assuming i and j are integer-valued, this describes the set of pairs
{ ..., (-1, -2), (0, -1), (1, 0), (2, 1), (3, 2), ... }
The cardinality or size of a set is the number of elements in it. The
notations |a| and card(a) are often used to denote the cardinality of set
a. Thus, |{}| = 0, |{1, 5}| = 2, and card({3, 3, 3}) = 1.
The following three operations are used to build new sets: set union ∪, set
intersection ∩ and set difference -.
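These three operations behave exactly like Python's built-in set operators, which make a convenient way to experiment with them (an illustration of mine, not from the book):

```python
a = {1, 3, 5}
b = {3, 4}
union        = a | b        # set union: elements in a or b
intersection = a & b        # set intersection: elements in both
difference   = a - b        # set difference: elements of a not in b
```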
Choose(a, x)
Sequences
A sequence is a list of elements (joined by commas and delimited by
parentheses). For example, the sequence (1, 3, 5, 3) consists of the four
elements 1, 3, 5, 3, in that order, and () denotes the empty sequence. As
opposed to sets, the ordering of the elements in a sequence is important.
The length of a sequence s, written |s|, is the number of elements in
it.
Catenation of sequences with sequences and/or values is denoted by |.
Thus,
s = (s[0], s[1], s[2], ..., s[n-1])

That is, s[0] refers to the first element, s[1] to the second, and so forth.
Further, the notation s[k..], where 0 ≤ k ≤ n, denotes the sequence

(s[k], s[k+1], ..., s[n-1])

That is, s[k..] denotes a new sequence that is the same as s but with the
first k elements removed. For example, if s is not empty, the assignment

s:= s[1..]
Using the sequence notation, rather than the usual pop and push of stacks
and insert into and delete from queues, may lead to more understandable
programs. The notion of assignment is already well understood -see
chapter 9- and is easy to use in this context.
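As a sketch of that claim (mine, not the book's), the sequence notation translates one-for-one into Python list operations, so a queue needs no special pop-and-insert vocabulary:

```python
s = [1, 3, 5, 3]          # the sequence (1, 3, 5, 3)
assert len(s) == 4        # |s| = 4

# A queue: examine and remove at the front, append at the back.
first = s[0]              # the first element of s
s = s[1:]                 # s := s[1..]  -- drop the first element
s = s + [7]               # catenation: s | 7
```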
We also use the set of real numbers, although on any machine this set and
operations on it are approximated by some form of floating point
numbers and operations. Nevertheless, we assume that real arithmetic is
performed, so that problems with floating point are eliminated.
The following operations take as operands either integers or real num-
bers:
Relations
Let A and B be two sets. The Cartesian product of A and B, written
A × B, is the set of ordered pairs (a, b) where a is in A and b is in B:

A × B = {(a, b) | a ∈ A ∧ b ∈ B}

Let N be the set of integers. One relation over N × N is the successor
relation:

succ = {(i, i+1) | i ∈ N}
The following relation associates with each person the year in which he
died:
I = {(a, a) | a ∈ A}
When dealing with binary relations, we often use the name of a rela-
tion as a binary operator and use infix notation to indicate that a pair
belongs in the relation. For example, we have
From the three relations given thus far, we can conclude several things.
For any value a there may be different pairs (a, b) in a relation. Such a
relation is called a one-to-many relation. Relation parent is one-to-many,
because most people have more than one parent.
For any value b there may be different pairs (a, b) in a relation.
Such a relation is called a many-to-one relation. Many people may have
died in any year, so that for each integer i there may be many pairs (p, i)
in relation died..in. But for any person p there is at most one pair (p, i)
in died_in. Relation died_in is an example of a many-to-one relation.
In relation succ, no two pairs have the same first value and no two
pairs have the same second value. Relation succ is an example of a one-
to-one relation.
A relation on A × B may contain no pair (a, b) for some a in A.
Such a relation is called a partial relation. On the other hand, a relation
on A × B is total if for each a ∈ A there exists a pair (a, b) in the rela-
tion. Relation died_in is partial, since not all people have died yet.
Relation succ is total (on N × N).
If relation R on A × B contains a pair (a, b) for each b in B, we say
that R is onto B. Relation parent is onto, since each child has a parent
(assuming there was no beginning).
The composition R ∘ S of two relations R and S is defined by
a (R ∘ S) c iff (E b: a R b ∧ b S c)
Writing Rⁿ for the composition of R with itself n times (with R⁰ = I),
we have, for example,
parent⁰ = I
parent¹ = parent
parent² = grandparent
parent³ = great-grandparent
and
i succᵏ j iff i + k = j
Looking upon relations as sets and using the superscript notation, we can
define the transitive closure R⁺ and the reflexive transitive closure R* of
a relation R as follows:
R⁺ = R¹ ∪ R² ∪ R³ ∪ ⋯
R* = R⁰ ∪ R¹ ∪ R² ∪ ⋯
The inverse R⁻¹ of a relation R is defined by
b R⁻¹ a iff a R b
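The relation operations just defined can be sketched in Python, treating a relation as a set of pairs. This is not from the book; the finite carrier for succ is an assumption, since Python sets must be finite.

```python
# Relations as Python sets of pairs: composition, inverse, transitive closure.

def compose(R, S):
    """a (R o S) c iff there exists b with a R b and b S c."""
    return {(a, c) for (a, b1) in R for (b2, c) in S if b1 == b2}

def inverse(R):
    """b R^-1 a iff a R b."""
    return {(b, a) for (a, b) in R}

def transitive_closure(R):
    """R+ = R^1 u R^2 u R^3 u ... (computed by iterating to a fixed point)."""
    closure = set(R)
    while True:
        extra = compose(closure, R) - closure
        if not extra:
            return closure
        closure |= extra

# succ restricted to the finite carrier {0, ..., 5}
succ = {(i, i + 1) for i in range(5)}

print(sorted(compose(succ, succ)))       # succ^2: pairs (i, i+2)
print(sorted(inverse(succ)))             # the predecessor relation
print(sorted(transitive_closure(succ)))  # all pairs (i, j) with i < j <= 5
```

Composition here is associative, as the text notes, so succᵏ can be built by repeated calls to `compose`.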
Functions
Let A and B be sets. A function f from A to B, denoted by
f: A → B
is a relation on A × B in which no two pairs have the same first member.
Note carefully the three ways in which a function name f is used. First,
f denotes a set of pairs such that for any value a there is at most one
pair (a, b). Second, a f b holds if (a, b) is in f. Third, f(a) is the
value associated with a, that is, (a, f(a)) is in the function (relation) f.
The beauty of defining a function as a restricted form of relation is
that the terminology and theory for relations carries over to functions.
Thus, we know what a one-to-one function is. We know that composition
of (binary) functions is associative. We know, for any function, what f⁰,
f¹, f², f⁺ and f* mean. We know what the inverse f⁻¹ of f is. We
know that f⁻¹ is a function iff f is not many-to-one.
f(y) = x*y
f(2) = x*2
f(x+2) = x*(x+2)
f(x*2) = x*x*2
The terminology used for binary relations and functions extends easily to
n -ary relations and functions.
Appendix 4
Asymptotic Execution Time Properties
Consider the three programs:
i:= n; do i > 1 → i:= i-1 od
i:= n; j:= 0; do i > 1 → i:= i-1; j:= 0 od
i:= n; do i > 1 → i:= i÷2 od
The first requires n units of time; the second 2n. The units of time
required by the third program are more difficult to determine. Suppose n
is a power of 2, so that

n:     1  2  64   128  32768
2n:    2  4  128  256  65536
log n: 0  1  6    7    15
We need a measure that allows us to say that the third program is by far
the fastest and that the other two are essentially the same. To do this, we
define the order of execution time.
(A4.1) Definition. Let f(n) and g(n) be two functions. We say that
f(n) is (no more than) order g(n), written f(n) is O(g(n)), if a constant
c > 0 exists such that, for all (except a possibly finite number of)
positive values of n,
f(n) ≤ c*g(n).
Since the first and second programs given above are executed in n and
2n units, respectively, their execution times are of the same order.
Secondly, one can prove that log n is O(n), but not vice versa. Hence
the order of execution time of the third program is less than that of the
first two.
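The iteration counts of the loops above can be checked empirically; this sketch (not from the book) counts the steps of the first and third programs. The second differs from the first only by a constant factor, so it has the same order.

```python
# Count loop iterations of the decrementing and halving programs.

def steps_decrement(n):      # i := n; do i > 1 -> i := i-1 od
    i, steps = n, 0
    while i > 1:
        i -= 1
        steps += 1
    return steps             # n - 1 iterations: order n

def steps_halve(n):          # i := n; do i > 1 -> i := i/2 od
    i, steps = n, 0
    while i > 1:
        i //= 2
        steps += 1
    return steps             # about log2(n) iterations: order log n

for n in (2, 64, 128, 32768):
    print(n, steps_decrement(n), steps_halve(n))
```

For n = 32768 the decrementing loop runs 32767 times while the halving loop runs only 15 times, matching the table above.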
We give below a table of typical execution time orders that arise fre-
quently in programming, from smallest to largest, along with frequent
terms used for them. They are given in terms of a single input parameter
n. In addition, the (rounded) values of the orders are given for n = 100
and n = 1000, so that the difference between them can be seen.
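The table the text refers to did not survive reproduction here; the following sketch (an assumption about its typical contents, not the book's exact table) evaluates the usual orders at n = 100 and n = 1000.

```python
# Typical execution-time orders evaluated at n = 100 and n = 1000.
import math

orders = [
    ("1 (constant)",         lambda n: 1),
    ("log n (logarithmic)",  lambda n: math.log2(n)),
    ("n (linear)",           lambda n: n),
    ("n log n",              lambda n: n * math.log2(n)),
    ("n**2 (quadratic)",     lambda n: n ** 2),
    ("n**3 (cubic)",         lambda n: n ** 3),
    ("2**n (exponential)",   lambda n: 2 ** n),
]

for name, f in orders:
    print(f"{name:22} n=100: {round(f(100))}   n=1000: {round(f(1000))}")
```

Even this small experiment shows why the exponential row dominates everything: 2¹⁰⁰⁰ has over 300 decimal digits.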
For algorithms that have several input values the calculation of the order
of execution time becomes more difficult, but the technique remains the
same. When comparing two algorithms, one should first compare their
execution time orders and, if they are the same, then proceed to look for
finer detail such as the number of time units required, the number of array
comparisons made, etc.
An algorithm may require different times depending on the configura-
tion of the input values. For example, one array b[1:n] may be sorted in
n steps, another array b'[1:n] in n² steps by the same algorithm. In this
case there are two methods of comparing the algorithms: average- or
expected-case time analysis and worst-case time analysis. The former is
quite difficult to do; the latter usually much simpler.
As an example, Linear Search, (16.2.5), requires n time units in the
worst case and n/2 time units in the average case, if one assumes the
value being looked for can be in any position with equal probability.
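This claim about Linear Search can be checked by exhaustive counting; the sketch below (not from the book) searches for each of the n keys of an array once and reports the worst and average number of comparisons.

```python
# Empirical worst-case and average-case cost of linear search.

def linear_search(b, x):
    """Return (index of x, number of comparisons made)."""
    comparisons = 0
    for i, v in enumerate(b):
        comparisons += 1
        if v == x:
            return i, comparisons
    return len(b), comparisons

n = 100
b = list(range(n))
costs = [linear_search(b, x)[1] for x in b]   # each position equally likely
print(max(costs), sum(costs) / n)             # worst case n, average (n+1)/2
```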
Answers to Exercises
2. (a), (b)
b c d | b∨c | b∨c∨d | b∧c | b∧c∧d
T T T |  T  |   T   |  T  |   T
T T F |  T  |   T   |  T  |   F
T F T |  T  |   T   |  F  |   F
T F F |  T  |   T   |  F  |   F
F T T |  T  |   T   |  F  |   F
F T F |  T  |   T   |  F  |   F
F F T |  F  |   T   |  F  |   F
F F F |  F  |   F   |  F  |   F
Truth table for the first Distributive law (only) (since the last two columns
are the same, the two expressions heading the columns are equivalent and
the law holds):
3. Use De Morgan's laws and the law of Negation to "move not in" so
that it is applied only to identifiers and constants. For example,
transform ¬(a ∨ (F ∧ ¬c)) as follows:
¬(a ∨ (F ∧ ¬c))
= ¬a ∧ ¬(F ∧ ¬c)
= ¬a ∧ (¬F ∨ ¬¬c)
= ¬a ∧ (¬F ∨ c)
4. Use ¬F = T and ¬T = F to eliminate all occurrences of ¬F and ¬T
(see exercises 3 and 4).
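Such equivalence steps can be checked mechanically by enumerating all states, i.e. by truth table. This sketch (not from the book) verifies the transformation above.

```python
# Brute-force equivalence check for propositional expressions.
from itertools import product

def equivalent(e1, e2, num_vars):
    """True iff e1 and e2 have the same value in every state."""
    return all(e1(*state) == e2(*state)
               for state in product([True, False], repeat=num_vars))

# not(a or (F and not c))  transformed to  (not a) and ((not F) or c)
lhs = lambda a, c: not (a or (False and not c))
rhs = lambda a, c: (not a) and ((not False) or c)

print(equivalent(lhs, rhs, 2))   # True
```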
5. The proposition now has the form e₀ ∨ ⋯ ∨ eₙ for some n ≥ 0, where
each of the eᵢ has the form (g₀ ∧ ⋯ ∧ gₘ). Perform the following until
all gⱼ in all eᵢ have one of the forms identifier, ¬identifier, T and F:
Consider some eᵢ with a gⱼ that is not in the desired form. Use
the law of Commutativity to place it as far right as possible, so
that it becomes gₘ. Now, gₘ must have the form (h₀ ∨ ⋯ ∨
hₖ), so that the complete eᵢ is
Case 4: E(p) has the form E1(p) ∧ E2(p). By induction, we have that
E1(e1) = E1(e2) and E2(e1) = E2(e2). Hence, E1(e1) and E1(e2) have the
same value in every state, and E2(e1) and E2(e2) have the same value in
every state. The following truth table then establishes the desired result:
The rest of the cases, E(p) having the forms E1(p) ∨ E2(p), E1(p) ⇒
E2(p) and E1(p) = E2(p), are similar and are not shown here.
Answers for Section 3.2
We now show that use of the rule of Transitivity generates only tauto-
logies. Since e1 = e2 and e2 = e3 are tautologies, we know that e1 and e2
have the same value in every state and that e2 and e3 have the same value
in every state. The following truth table establishes the desired result:
e1  e2  e3  e1 = e3
T   T   T      T
F   F   F      T
4. Infer p = p ∨ p
1    From p infer p ∨ p
1.1    p ∨ p            ∨-I, pr1
2    p ⇒ p ∨ p          ⇒-I, 1
3    From p ∨ p infer p
3.1    p                ∨-E, pr1, (3.3.3), (3.3.3)
4    p ∨ p ⇒ p          ⇒-I, 3
5    p = p ∨ p          =-I, 2, 4
Case 4: E(p) has the form G(p) ∧ H(p) for some expressions G and H.
In this case, by induction we may assume that the following proofs exist.
From e1 = e2, G(e1) infer G(e2)
From e1 = e2, H(e1) infer H(e2)
We can then give the following proof.
The rest of the cases, where E(p) has one of the forms G(p) ∨ H(p),
G(p) ⇒ H(p) and G(p) = H(p), are left to the reader.
2. For the proofs of the valid conjectures using the equivalence transfor-
mation system of chapter 2, we first write here the disjunctive normal
form of the premises:
Premise 1: ¬tb ∨ ¬bl ∨ ma
Premise 2: ¬ma ∨ ¬fd ∨ ¬gh
Premise 3: gj ∨ (fd ∧ ¬gh)
Conjecture 2, which can be written in the form (ma ∧ gh) ⇒ gj, is
proved as follows. First, use the laws of Implication and De Morgan to
put it in disjunctive normal form:
(E.1) ¬ma ∨ ¬gh ∨ gj.
1. (a) (E k: 0 ≤ k < n: P ∧ Hₖ(T)) ∧ k > 0 (invalid)
(b) (A j: 0 ≤ j < n: Bⱼ ⇒ wp(SLⱼ, R))
Eₙ₊₁ = (A i: 0 ≤ i < n+1: b[i] < b[i+1])
2. (a) 0 ≤ p ≤ q+1 ≤ n ∧

        0          p         q         n-1
   b |   ≤ x    |     ?     |    > x     |
Answers for Chapter 7
R: x = max({y | y ∈ b}).
For program development it may be useful to replace max by its mean-
ing. The result assertion R would then be
(d) First specification:
{n > 0}
S
{0 ≤ i < n ∧ (A j: 0 ≤ j < n: b[i] ≥ b[j]) ∧ b[i] > b[0:i-1]}.
Second specification: Given fixed n > 0 and fixed array b[0:n-1], set i to
establish
wp(S, R).
7. This exercise is intended to make the reader more aware of how quan-
tification works in connection with wp, and of the need for the rule that
each identifier be used in only one way in a predicate. Suppose that
Q ⇒ wp(S, R) is true in every state. This assumption is equivalent to
We are asked to analyze predicate (7.8): {(A x: Q)} S {(A x: R)}, which
is equivalent to
Let us analyze this first of all under the rule that no identifier be used in
more than one way in a predicate. Hence, rewrite (E7.2) as
and assume that x does not appear in S and that z is a fresh identifier.
We argue operationally that (E7.3) is true. Suppose the antecedent of
(E7.3) is true in some state s, and that execution of S begun in s ter-
minates in state s'. Because S does not contain identifier x, we have
s(x) = s'(x).
Because the antecedent of (E7.3) is true in s, we conclude from (E7.1)
that (A x: wp(S, R)) is also true in state s. Hence, no matter what the
value of x in s, s'(R) is true. But s(x) = s'(x). Thus, no matter what
the value of x in s', s'(R) is true. Hence, so is s'((A x: R)), and so is
s'((A z: R')), where R' is R with x replaced by z. Thus, the consequent
of (E7.3) is true in s, and (E7.3) holds.
We now give a counterexample to show that (E7.2) need not hold if x
is assigned in command S and if x appears in R. Take command
S: x:= 1. Take R: x = 1. Take Q: T. Then (E7.1) is
which is true. But (E7.2) is false in this case: its antecedent (A x: T) is
true but its consequent wp("x:= 1", (A x: x = 1)) is false because predicate
(A x: x = 1) is F.
We conclude that if x occurs both in S and R, then (E7.2) does not in
general follow from (E7.1).
Answers for Section 9.2
The last line follows because neither Q(v) nor e(v) contains a reference
to x. Now suppose Q is true in some state s. Let v = s(x), the value of
x in state s. For this v, (Q(v) ∧ e(x) = e(v)) is true in state s, so that
(E4.1) is also true in s. Hence Q ⇒ (E4.1), which is what we needed to
show.
(E5.1) (b; s: b∘s) = b
(b; s₁: b∘s₁; ⋯; sₙ: b∘sₙ)   (substitute xᵢ for each uᵢ)
= b   (n applications of (E5.1))
wp(S3, R) = (w ≤ r ∨ w > r) ∧
   (w ≤ r ⇒ wp("r, q:= r-w, q+1", R)) ∧
   (w > r ⇒ wp(skip, R))
= (w ≤ r ⇒ ((q+1)*w + r-w = x ∧ r-w ≥ 0)) ∧ (w > r ⇒ R)
= (w ≤ r ⇒ q*w + r = x ∧ r-w ≥ 0) ∧ (w > r ⇒ R)
This is implied by R.
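The command S3 analyzed above is the body of a division-by-repeated-subtraction loop. The surrounding loop is not given here, so the following Python sketch of it is an assumption; it checks the invariant R: q*w + r = x ∧ r ≥ 0 at every step.

```python
# Quotient and remainder by repeated subtraction.

def divide(x, w):
    """Assumes x >= 0 and w > 0. Returns (q, r) with q*w + r = x, 0 <= r < w."""
    q, r = 0, x
    while w <= r:                 # guard of S3's first alternative
        r, q = r - w, q + 1       # r, q := r-w, q+1 preserves R
        assert q * w + r == x and r >= 0
    return q, r

print(divide(17, 5))   # (3, 2)
```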
6. wp(S6, R) = (f[i] < g[j] ∨ f[i] = g[j] ∨ f[i] > g[j]) ∧
   (f[i] < g[j] ⇒ R with i+1 for i) ∧
   (f[i] = g[j] ⇒ R) ∧
   (f[i] > g[j] ⇒ R with j+1 for j)
= R ∧ (f[i] < g[j] ⇒ f[i+1] ≤ X) ∧ (f[i] > g[j] ⇒ g[j+1] ≤ X)
= R (since R implies that g[j] ≤ X and f[i] ≤ X)
P ∧ BB = P ∧ BB ∧ T
= P ∧ BB ∧ (A i: P ∧ Bᵢ ⇒ wp(Sᵢ, P))   (since 1. is true)
= P ∧ BB ∧ P ∧ (A i: Bᵢ ⇒ wp(Sᵢ, P))
⇒ BB ∧ (A i: Bᵢ ⇒ wp(Sᵢ, P))
= wp(IF, P)
Thus, we need only show that (E3.1) implies 3' of theorem 11.6. Note
that P, IF, and t do not contain t1 or t0. Since IF does not refer to t1
and t0, we know that wp(IF, t1 ≤ t0+1) = BB ∧ t1 ≤ t0+1. We then have
the following (a superscript-subscript pair such as ᵗ¹ₜ denotes textual
substitution of t for t1):

(E3.1) = P ∧ BB ⇒ wp(IF, t < t1)ᵗ¹ₜ
      (by definition of :=)
⇒ P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1)ᵗ¹ₜ ∧ t ≤ t0+1
      (Insert t ≤ t0+1 on both sides of ⇒)
= P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1)ᵗ¹ₜ ∧ (t1 ≤ t0+1)ᵗ¹ₜ
= P ∧ BB ∧ t ≤ t0+1 ⇒ (wp(IF, t ≤ t1-1) ∧ t1 ≤ t0+1)ᵗ¹ₜ
      (Distributivity of textual substitution)
= P ∧ BB ∧ t ≤ t0+1 ⇒ (wp(IF, t ≤ t1-1) ∧ wp(IF, t1 ≤ t0+1))ᵗ¹ₜ
      (IF contains neither t1 nor t0)
= P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1 ∧ t1 ≤ t0+1)ᵗ¹ₜ
      (Distributivity of Conjunction)
⇒ P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t0)ᵗ¹ₜ
= P ∧ BB ∧ t ≤ t0+1 ⇒ wp("t1:= t; IF", t ≤ t0)

Since the derivation holds irrespective of the value t0, it holds for all t0,
and 3' is true.
4. We first show that (11.7) holds for k = 0 by showing that it is equiv-
alent to assumption 2:
Assume (11.7) true for k = K and prove it true for k = K+1. We have:
and P ∧ ¬BB ∧ t ≤ K+1 ⇒ P ∧ ¬BB
= H₀(P ∧ ¬BB)
which shows that (11.7) holds for k = K+1. By induction, (11.7) holds
for all k.
6. H'₀(R) = ¬BB ∧ R. For k > 0, H'ₖ(R) = wp(IF, H'ₖ₋₁(R)). H'ₖ(R)
represents the set of states in which DO will terminate
with R true in exactly k iterations. On the other hand, Hₖ(R)
represents the set of states in which DO will terminate with R true in k
or fewer iterations.
10. (1) wp("i:= 1", P) = 0 < 1 ≤ n ∧ (E p: 1 = 2ᵖ)
= T (above, take p = 0).
(2) wp(S1, P) = wp("i:= 2*i", 0 < i ≤ n ∧ (E p: i = 2ᵖ))
= 0 < 2*i ≤ n ∧ (E p: 2*i = 2ᵖ),
d = (1, 2, 3, 5, 4, 2)
d' = (1, 2, 4, 2, 3, 5)
in mind. There is a least integer i, 0 ≤ i < n, such that d[0:i-1] =
d'[0:i-1] and d[i] < d'[i]. One can show that i is well-defined by the
fact that d[i+1:n-1] is a non-increasing sequence and that d[i] <
d[i+1].
In order for d' to be the next highest permutation, d'[i] must contain
the smallest value of d[i+1:n-1] that is greater than d[i]. Let the right-
most element of d[i+1:n-1] with this value be d[j]. Consider d" =
(d; i:d[j]; j:d[i]). d" represents the array d but with the values at posi-
tions i and j interchanged. In the example above, d" = (1, 2, 4, 5, 3, 2).
Obviously, d" is a higher permutation than d, but perhaps not the next
highest. Moreover, d"[0:i] = d'[0:i].
It can be proved that d"[i+1:n-1] is a non-increasing sequence.
Hence, reversing d"[i+1:n-1] makes it a non-decreasing sequence and,
therefore, as small as possible. This yields the desired next highest
permutation d'.
Answers for Section 15.2
The algorithm is then: calculate i; calculate j; swap b[i] and b[j]; reverse
b[i+1:n-1]. Here, formalizing the idea of a next highest permutation
leads directly to an algorithm to calculate it!
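The four steps just listed can be transcribed into Python; this sketch is not the book's notation and uses 0-origin indexing throughout.

```python
# Next highest permutation: find i, find j, swap, reverse the tail.

def next_permutation(d):
    d = list(d)
    n = len(d)
    i = n - 2
    while i >= 0 and d[i] >= d[i + 1]:   # d[i+1:n-1] is non-increasing
        i -= 1
    if i < 0:
        return None                      # d is already the highest permutation
    j = n - 1
    while d[j] <= d[i]:                  # rightmost element greater than d[i]
        j -= 1
    d[i], d[j] = d[j], d[i]              # swap b[i] and b[j]
    d[i + 1:] = reversed(d[i + 1:])      # reverse the non-increasing tail
    return d

print(next_permutation([1, 2, 3, 5, 4, 2]))   # [1, 2, 4, 2, 3, 5]
```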
x, y:= X, Y;
do x > y → x:= x-y
 □ y > x → y:= y-x
od
{0 < x = y ∧ gcd(x, y) = gcd(X, Y)}
{x = gcd(X, Y)}
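The guarded-command program above translates directly; this Python sketch (not the book's notation) relies on the same invariant, gcd(x, y) = gcd(X, Y), and terminates when x = y.

```python
# gcd by repeated subtraction, mirroring the two-guard loop above.

def gcd(X, Y):
    """Assumes X > 0 and Y > 0."""
    x, y = X, Y
    while x != y:
        if x > y:
            x = x - y     # gcd(x-y, y) = gcd(x, y)
        else:
            y = y - x     # gcd(x, y-x) = gcd(x, y)
    return x              # x = y = gcd(X, Y)

print(gcd(12, 18))   # 6
```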
5. t:= 0;
do j ≠ 80 cand b[j] ≠ ' ' → t, s[t+1], j:= t+1, b[j], j+1
 □ j = 80 → read(b); j:= 0
od
if x < b[1] → i:= 0
 □ b[1] ≤ x < b[n] → The program (a)
 □ b[n] ≤ x → i:= n
fi

i, j:= 0, n+1;
{inv: 0 ≤ i < j ≤ n+1 ∧ b[i] ≤ x < b[j]}
{bound: log(j-i)}
do i+1 ≠ j → e:= (i+j) ÷ 2;
   {1 ≤ e ≤ n}
   if b[e] ≤ x → i:= e □ b[e] > x → j:= e fi
od
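The binary search above carries its invariant b[i] ≤ x < b[j] using virtual boundary values. This Python sketch (not the book's notation, 0-origin) plays the same trick with indices −1 and len(b) standing in for the virtual sentinels.

```python
# Binary search maintaining b[lo] <= x < b[hi], with virtual
# b[-1] = -infinity and b[len(b)] = +infinity.

def search(b, x):
    """b ascending. Return largest index i with b[i] <= x, or -1."""
    lo, hi = -1, len(b)
    while lo + 1 != hi:
        e = (lo + hi) // 2          # -1 < lo < e < hi, so e is a real index
        if b[e] <= x:
            lo = e
        else:
            hi = e
    return lo

b = [2, 4, 4, 8, 16]
print(search(b, 5), search(b, 1), search(b, 16))   # 2 -1 4
```

Because the sentinels are never actually inspected, no boundary tests clutter the loop, exactly as the text's invariant intends.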
10. i, p:= 0, 0;
{inv: see exercise 10; bound: n-i}
do i ≠ n → Increase i, keeping invariant true:
   j:= i+1;
   {inv: b[i:j-1] are all equal; bound: n-j}
   do j ≠ n cand b[j] = b[i] → j:= j+1 od;
   p:= max(p, j-i);
   i:= j
od
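The nested loop above computes, in p, the length of the longest run of equal adjacent values. A Python transcription (not the book's notation) follows; the `and` short-circuit plays the role of cand.

```python
# Longest plateau: length of the longest run of equal adjacent values.

def longest_plateau(b):
    n = len(b)
    i, p = 0, 0
    while i != n:
        j = i + 1
        while j != n and b[j] == b[i]:   # 'and' short-circuits, like cand
            j += 1
        p = max(p, j - i)                # b[i:j-1] are all equal
        i = j
    return p

print(longest_plateau([1, 2, 2, 2, 3, 3]))   # 3
```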
        m   m+1        q         p        n-1
P: m < q ≤ p+1 ≤ n ∧ x = B[1] ∧ b |  ≤ x  |    ?    |  > x  |
6. The precondition states that the linked list is in order; the postcondition
that it is reversed. This suggests an algorithm that at each step reverses
one link: part of the list is reversed and part of it is in order. Thus, using
another variable t to point to the part of the list that is in order, the
invariant is
[diagram: the reversed part of the list, reachable from p, and the
still-in-order part, reachable from t, linked through successor array s]
Initially, the reversed part of the list is empty and the unreversed part is
the whole list. This leads to the algorithm
p, t:= 0, p;
do t ≠ 0 → p, t, s[t]:= t, s[t], p od
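The one-line loop body reverses a single link by simultaneous assignment. Python's tuple assignment evaluates the right-hand side first, but assigns targets left to right, so the array target must come first; this sketch (not the book's notation, with 0 as the null "pointer") accounts for that.

```python
# In-place reversal of a linked list held in a successor array s.

def reverse_list(s, p):
    """s[i] is the successor of node i (0 means end). Returns the new head."""
    t = p            # t points at the still-in-order part
    p = 0            # p points at the reversed part, initially empty
    while t != 0:
        # Simultaneous  p, t, s[t] := t, s[t], p.  The target s[t] is listed
        # first because Python assigns targets left to right.
        s[t], p, t = p, t, s[t]
    return p

s = [0, 2, 3, 0]               # list 1 -> 2 -> 3; node 0 unused, acts as null
head = reverse_list(s, 1)
print(head, s)                 # 3 [0, 0, 1, 2], i.e. 3 -> 2 -> 1
```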
Q: x ∈ b[0:m-1, 0:n-1]
R: 0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i, j]
Actually, Q and R are quite similar, in that both state that x is in a rec-
tangular section of b; in R, the rectangular section just happens to have
only one row and column. So perhaps an invariant can be used that indi-
cates that x is in a rectangular section of b:
What could serve as guards? Consider i:= i+1. Its execution will main-
tain the invariant if x is not in row i of b. Since the row is ordered, this
can be tested with b[i, j] < x, for if b[i, j] < x, so are all values in row
i. In a similar fashion, we determine the other guards:
Answers for Section 18.3
In order to prove that the result is true upon termination, only the first
and last guards are needed. So the middle guarded commands can be
deleted to yield the program
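The two-guard program itself did not survive reproduction here; this Python sketch (an assumption following the derivation, not the book's text) starts at the top-right corner of the remaining rectangle and shrinks it one row or one column at a time.

```python
# Saddleback search: b is an m x n matrix with both rows and columns
# in ascending order, and x is known to occur in b (precondition Q).

def saddleback(b, x):
    i, j = 0, len(b[0]) - 1        # invariant: x is in b[i:m-1, 0:j]
    while b[i][j] != x:
        if b[i][j] < x:
            i += 1                 # all of row i is < x
        else:
            j -= 1                 # all of column j is > x
    return i, j

b = [[1, 3, 5],
     [2, 4, 8],
     [6, 7, 9]]
print(saddleback(b, 4))   # (1, 1)
```

Each iteration discards a whole row or column, so at most m + n - 1 probes are made.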
postorder(p) = ( empty(p)  → ()
               □ ¬empty(p) → postorder(left[p]) | postorder(right[p]) | root(p)
               )

post(q) = ( q < 0 → root(abs(q))
          □ q = 0 → ()
          □ q > 0 → postorder(right[q]) | root(q)
          )
Using a sequence variable s, the invariant is:
Now note that execution of the body of the main loop does not destroy
P2, and therefore P2 can be taken out of the loop. Rearrangement then
leads to the program
i, x:= 0, 0;
do r > 2*x² → x:= x+1 od;
y:= x;
{inv: P1 ∧ P2}
do x² ≤ r → Increase x, keeping invariant true:
   Determine y to satisfy (E1.1):
   do x² + y² > r → y:= y-1 od;
   if x² + y² = r → x[i], y[i], i, x:= x, y, i+1, x+1
    □ x² + y² < r → x:= x+1
   fi
od
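The program above finds every way of writing r as a sum of two squares x² + y² with y ≤ x: x only increases and y only decreases, so each candidate pair is examined once. A Python transcription (not the book's notation, collecting the pairs in a list rather than two arrays) follows.

```python
# All representations r = x**2 + y**2 with 0 <= y <= x.

def two_squares(r):
    pairs = []
    x = 0
    while r > 2 * x * x:          # smallest x with 2*x**2 >= r
        x += 1
    y = x
    while x * x <= r:
        while x * x + y * y > r:  # determine y: decrease until sum <= r
            y -= 1
        if x * x + y * y == r:
            pairs.append((x, y))
        x += 1                    # in both cases, move on to the next x
    return pairs

print(two_squares(25))   # [(4, 3), (5, 0)]
```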
{n ≥ 0}
a, c:= 0, 1; do c² ≤ n → c:= 2*c od;
{inv: a² ≤ n < (a+c)² ∧ (E p: 0 ≤ p: c = 2ᵖ)}
{bound: log c}
do c ≠ 1 → c:= c/2;
   if (a+c)² ≤ n → a:= a+c
    □ (a+c)² > n → skip
   fi
od
{a² ≤ n < (a+1)²}
p = c²
(E3.2) q = a*c
is promising, because it lets us replace almost all the operations involving
a. Thus, before the main loop, q will be 0 since a is 0 there. Secondly,
to maintain (E3.2) across c:= c/2 we can insert q:= q/2. Thirdly, (E3.2)
is maintained across execution of the command a:= a+c by assigning a
new value to q; what is the value?
Now try a third variable r to contain the value n - a², which will always
be ≥ 0. (E3.3) becomes
2*q + p - r ≤ 0
p = c², q = a*c, r = n - a²
{n ≥ 0}
p, q, r:= 1, 0, n; do p ≤ n → p:= 4*p od;
do p ≠ 1 → p:= p/4; q:= q/2;
   if 2*q + p ≤ r → q, r:= q+p, r-2*q-p
    □ 2*q + p > r → skip
   fi
od
{q² ≤ n < (q+1)²}
Upon termination we have p = 1, c = 1, and q = a*c = a, so that the
desired result is in q. Not only have we eliminated squaring, but all mul-
tiplications and divisions are by 2 and 4; hence, they could be imple-
mented with shifting on a binary machine. Thus, the approximation to
the square root can be performed using only adding, subtracting and shift-
ing.
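The final program can be run as written; this Python sketch (not the book's notation) keeps the simultaneous assignment to q and r, whose right-hand side uses the old value of q, and checks the result against the specification q² ≤ n < (q+1)².

```python
# Integer square root using only add, subtract, and multiply/divide by 2 and 4.
# Invariants: p = c**2, q = a*c, r = n - a**2 for the hidden a and c.

def isqrt(n):
    """Assumes n >= 0. Returns q with q*q <= n < (q+1)*(q+1)."""
    p, q, r = 1, 0, n
    while p <= n:
        p = 4 * p                          # c := 2*c squared
    while p != 1:
        p = p // 4                         # c := c/2 squared
        q = q // 2                         # maintain q = a*c
        if 2 * q + p <= r:                 # i.e. (a+c)**2 <= n
            q, r = q + p, r - 2 * q - p    # simultaneous: old q on the right
        # else: skip
    return q

print([isqrt(n) for n in range(10)])   # [0, 1, 1, 1, 2, 2, 2, 2, 2, 3]
```

On a binary machine the `* 4`, `// 4`, `* 2`, and `// 2` operations become shifts, which is the whole point of the refinement.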
P2: 5 ≤ n = |s| ≤ 36 ∧
c[i] = s[i-4]*2⁴ + s[i-3]*2³ + s[i-2]*2² + s[i-1]*2 + s[i]
(for 4 ≤ i < n)
n, c[4], in[0]:= 5, 0, T;
in[1:31]:= F; {s = (0,0,0,0,0)}
{inv: P1 ∧ P2 ∧ P3 ∧ ¬good(s | 0)}
do c[4] ≠ 1 →
   if n = 36 → Print sequence s
    □ n ≠ 36 → skip
   fi;
   Change s to next higher good sequence:
   do in[(c[n-1]*2 + 1) mod 32] →   {i.e. ¬good(s | 1)}
      Delete ending 1's from s:
      do odd(c[n-1]) → n:= n-1; in[c[n]]:= F od;
      Delete ending 0:
      n:= n-1; in[c[n]]:= F
   od;
   Append 1 to s:
   c[n]:= (c[n-1]*2 + 1) mod 32; in[c[n]]:= T; n:= n+1
od
so that f[h] does not appear in g, and increasing h will maintain the
invariant. Similarly the guard for k:= k+1 will be g[k] < f[h].
This gives us our program, written below. We assume the existence of
virtual values f[-1] = g[-1] = -∞ and f[F] = g[G] = +∞; this allows
us to dispense with worries about boundary conditions in the invariant.
h, k, c:= 0, 0, 0;
{inv: P; bound: F-h + G-k}
do h ≠ F ∧ k ≠ G →
   if f[h] < g[k] → h, c:= h+1, c+1
    □ f[h] = g[k] → h, k:= h+1, k+1
    □ f[h] > g[k] → k, c:= k+1, c+1
   fi
od;
Add to c the number of unprocessed elements of f and g:
c:= c + F-h + G-k
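The merge-style program above counts, in c, the values that occur in exactly one of the two ascending sequences. A Python transcription (not the book's notation) follows.

```python
# Count values occurring in exactly one of two ascending sequences f and g.

def count_unshared(f, g):
    F, G = len(f), len(g)
    h, k, c = 0, 0, 0
    while h != F and k != G:
        if f[h] < g[k]:
            h, c = h + 1, c + 1      # f[h] is not in g
        elif f[h] == g[k]:
            h, k = h + 1, k + 1      # shared value: advance both
        else:
            k, c = k + 1, c + 1      # g[k] is not in f
    return c + (F - h) + (G - k)     # unprocessed elements are all unshared

print(count_unshared([1, 3, 5, 7], [3, 4, 7, 9]))   # 4
```

The tail correction after the loop mirrors the program's final assignment to c.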
References
1968), 147-148.
[13] - · A short introduction to the art of programming. EWD316,
Technological University Eindhoven, August 1971.
[14] - · Notes on Structured Programming. In Dahl, 0.-J., C.A.R.
Hoare and E.W. Dijkstra, Structured Programming, Academic Press,
New York 1972. (Also appeared a few years earlier in the form of a
technical report).
[ 15] - · Guarded commands, nondeterminacy and the formal derivation
of programs. Comm. of the ACM 18 (August 1975), 453-457.
[16] - · A Discipline of Programming. Prentice Hall, Englewood
Cliffs, 1976.
[17] - · Program inversion. EWD671, Technological University Eind-
hoven, 1978.
[18] Feijen, W.H.J. A set of programming exercises. WF25, Technologi-
cal University Eindhoven, July 1979.
[19] Floyd, R. Assigning meaning to programs. In Mathematical As-
pects of Computer Science, XIX American Mathematical Society
(1967), 19-32.
[20] Gentzen, G. Untersuchungen über das logische Schliessen. Math.
Zeitschrift 39 (1935), 176-210, 405-431.
[21] Gries, D. An illustration of current ideas on the derivation of cor-
rectness proofs and correct programs. IEEE Trans. Software Eng. 2
(December 1976), 238-244.
[22] _ (ed.). Programming Methodology, a Collection of Articles by
Members of WG2.3. Springer Verlag, New York, 1978.
[23] _ and G. Levin. Assignment and procedure call proof rules.
TOPLAS 2 (October 1980), 564-579.
[24] _and Mills, H. Swapping sections. TR 81-452, Computer Science
Dept., Cornell University, January 1981.
[25] Guttag, J.V. and J.J. Horning. The algebraic specification of data
types. Acta Informatica 10 (1978), 27-52.
[26] Hoare, C.A.R. Quicksort. Computer Journal 5 (1962), 10-15.
[27] - · An axiomatic approach to computer programming. Comm.
of the ACM 12 (October 1969), 576-580, 583.
[28] - · Procedures and parameters: an axiomatic approach. In Sym-
posium on Semantics of Programming Languages. Springer Verlag,
New York, 1971, 102-116.
[29] - · Proof of correctness of data representations. Acta Informatica
1 (1972), 271-281.
[30] _ and N. Wirth. An axiomatic definition of the programming
language Pascal. Acta Informatica 2 (1973), 335-355.
[31] Hunt, J.W. and M.D. Mcilroy. An algorithm for differential file
comparison. Computer Science Technical Report 41, Bell Labs,
Murray Hill, New Jersey, June 1976.
References 357
[32] Igarashi, S., R.L. London and D.C. Luckham. Automatic program
verification: a logical basis and its implementation. Acta Informatica
4 (1975), 145-182.
[33] Liskov, B. and S. Zilles. Programming with abstract data types.
Proc. ACM SIGPLAN Conf. on Very High Level Languages, SIG-
PLAN Notices 9 (April 1974), 50-60.
[34] London, R.L., J.V. Guttag, J.J. Horning, B.W. Mitchell and G.J.
Popek. Proof rules for the programming language Euclid. Acta
Informatica 1 (October 1979), 1-79.
[35] McCarthy, J. A basis for a mathematical theory of computation.
Proc. Western Joint Comp. Conf., Los Angeles, May 1961, 225-238,
and Proc. IFIP Congress 1962, North Holland Publ. Co., Amster-
dam, 1963.
[36] Melville, R. Asymptotic Complexity of Iterative Computations.
Ph.D. thesis, Computer Science Department, Cornell University,
January 1981.
[37] _and D. Gries. Controlled density sorting. IPL 10 (July 1980),
169-172.
[38] Misra, J. A technique of algorithm construction on sequences.
IEEE Trans. Software Eng. 4 (January 1978), 65-69.
[39] Naur, P. et al. Report on ALGOL 60. Comm. of the ACM 3 (May
1960), 299-314.
[40] Naur, P. Proofs of algorithms by general snapshots. BIT 6 ( 1969),
310-316.
[41] Quine, W.V.O. Methods of Logic. Holt, Rinehart and Winston,
New York, 1961.
[42] Steel, T.B. (ed.). Formal Language Description Languages for Com-
puter Programming. Proc. IFIP Working Conference on Formal
Language Description Languages, Vienna 1964, North-Holland,
Amsterdam, 1971.
[43] Szabo, M.E. The Collected Works of Gerhard Gentzen. North Hol-
land, Amsterdam, 1969.
[44] Wirth, N. Program development by stepwise refinement. Comm.
of the ACM 14 (April 1971), 221-227.
Index
85
Bounded nondeterminism, 312

Calculus, 25
  propositional calculus, 25
  predicate calculus, 66
Call, of a procedure, 152
  by reference, 158
  by result, 151
  by value, 151
  by value result, 151
cand, 68-70
cand-simplification, 80
Cardinality, of a set, 311
Cartesian product, 315
Case statement, 134
Catenation, 75
  identity of, 75, 333
  of sequences, 312
ceil, 314
Changing a representation, 246
Chebyshev, 83
Checklist for understanding a loop, 145
Chomsky, Noam, 304
Choose, 312
Closing the Curve, 166, 301
Closure, of a relation, 317
  transitive, 317
Code, for a permutation, 270
Code to Perm, 264, 272-273, 303
Coffee Can Problem, 165, 301
Combining pre- and postconditions, 211
Command, 108
  abort, 114
  alternative command, 132
  assignment, multiple, 121, 127
  assignment, simple, 128
  assignment to an array element, 124
  Choose, 312
  deterministic, 111
  guarded command, 131
  iterative command, 139
  nondeterministic, 111
  procedure call, 164
  sequential composition, 114-115
  skip, 114
Command-comment, 99, 279
  indentation of, 279
Common sense and formality, 164
Commutative laws, 20
  proof of, 48
Composition, associativity of, 316
Composition, of relations, 316
Composition, sequential, 114-115
Concatenation, see Catenation
Conclusion, 29
Conjecture, disproving, 15
Conjunct, 9
Conjunction, 9-10
  distributivity of, 110
  identity of, 72
Conjunctive normal form, 27
Consequent, 9
Constable, Robert, 42
Constant proposition, 10
Constant-time algorithm, 321
Contradiction, law of, 20, 70
Contradiction, proof by, 39-41
Controlled Density Sort, 247, 303
cor, 68-70
cor-simplification, 79
Correctness
  partial, 109-110
  total, 110
Counting nodes of a tree, 231
Cubic algorithm, 321
Cut point, 297

Data encapsulation, 235
Data refinement, 235
De Morgan, Augustus, 20
De Morgan's laws, 20, 70
  proof of, 49
Debugging, 5
Decimal to Base B, 215, 302
Decimal to Binary, 215, 302
U, 69
Ullman, J.D., 309
Unambiguous grammar, 308
Unbounded nondeterminism, 312
Undefined value, 69
Union, of two sets, 311
Unique 5-bit Sequences, 262,
303, 352