Science of Programming
Springer
New York Berlin Heidelberg Barcelona Hong Kong
London Milan Paris Singapore Tokyo

Editor
David Gries

Advisory Board
F. L. Bauer
K. S. Fu
J. J. Horning
R. Reddy
D. C. Tsichritzis
W. M. Waite
Texts and Monographs in Computer Science
Suad Alagic, Object-Oriented Database Programming
Michael A. Arbib, A.J. Kfoury, and Robert N. Moll, A Basis for Theoretical
Computer Science
W.H.J. Feijen, A.J.M. van Gasteren, D. Gries, and J. Misra, Eds., Beauty Is Our
Business: A Birthday Salute to Edsger W. Dijkstra
Springer
David Gries
Department of Computer Science
Cornell University
Upson Hall
Ithaca, NY 14853
U.S.A.
not only helpful but even indispensable. Choice and order of examples
are as important as the good taste with which the formalism is applied.
To get the message across requires a scientist that combines his scientific
involvement in the subject with the precious gifts of a devoted teacher.
We should consider ourselves fortunate that Professor David Gries has
met the challenge.
Edsger W. Dijkstra
Preface
It is in this context that the title of this book was chosen. Programming
began as an art, and even today most people learn only by watching others
perform (e.g. a lecturer, a friend) and through habit, with little direction
as to the principles involved. In the past 10 years, however, research
has uncovered some useful theory and principles, and we are reaching the
point where we can begin to teach the principles so that they can be
consciously applied. This text is an attempt to convey my understanding of
and excitement for this just-emerging science of programming.
The approach does require some mathematical maturity and the will to
try something new. A programmer with two years' experience, or a junior
or senior computer science major in college, can master the material - at
least, this is the level I have aimed at.
A common criticism of the approach used in this book is that it has
been used only for small (one or two pages of program text), albeit com-
plex, problems. While this may be true so far, it is not an argument for
ignoring the approach. In my opinion it is the best approach to reasoning
about programs, and I believe the next ten years will see it extended to
and practiced on large programs. Moreover, since every large program
consists of many small programs, it is safe to say the following:
viii Preface
Part III is the heart of the book. Within it, in order to get the reader
more actively involved, I have tried the following technique. At a point, a
question will be raised, which the reader is expected to answer. The
question is followed by white space, a horizontal line, and more white space.
After answering the question, the reader can then continue and discover
my answer. Such active involvement will be more difficult than simply
reading the text, but it will be far more beneficial.
Chapter 21 is fun. It concerns inverting programs, something that Eds-
ger W. Dijkstra and his colleague Wim Feijen dreamed up. Whether it is
really useful has not been decided, but it is fun. Chapter 22 presents a
few simple rules on documenting programs; the material can be read
before the rest of the book. Chapter 23 contains a brief, personal history of
this science of programming and an anecdotal history of the programming
problems in the book.
Answers to some exercises are included; not all answers are given, so
that the exercises can be used as homework. A complete set of answers can
be obtained at nominal cost by requesting it, on appropriate letterhead.
Notation. The notation iff is used for "if and only if". A few years ago,
while lecturing in Denmark, I used fif instead, reasoning that since "if and
only if" was a symmetric concept its notation should be symmetric also.
Without knowing it, I had punned in Danish and the audience laughed,
for fif in Danish means "a little trick". I resolved thereafter to use fif so I
could tell my joke, but my colleagues talked me out of it.
The symbol □ is used to mark the end of theorems, definitions,
examples, and so forth. When beginning to produce this book on the
phototypesetter, it was discovered that the mathematical quantifiers
"forall" and "exists" could not be built easily, so A and E have been used
for them.
Throughout the book, in the few places they occur, the words he, him
and his denote a person of either sex.
Acknowledgements
Those familiar with Edsger W. Dijkstra's monograph A Discipline of
Programming will find his influence throughout this book. The calculus
for the derivation of programs, the style of developing programs, and
many of the examples are his. In addition, his criticisms of drafts of this
book have been invaluable.
Just as important to me has been the work of Tony Hoare. His paper
on an axiomatic basis for programming was the start of a new era, not
only in its technical contribution but in its taste and style, and his work
since then has continued to influence me. Tony's excellent, detailed
criticisms of a draft of Part I caused me to reorganize and rewrite major parts
of it.
I am grateful to Fred Schneider, who read the first drafts of all
chapters and gave technical and stylistic suggestions on almost every
paragraph.
A number of people have given me substantial constructive criticisms
on all or parts of the manuscript. For their help I would like to thank
Greg Andrews, Michael Gordon, Eric Hehner, Gary Levin, Doug McIlroy,
Bob Melville, Jay Misra, Hal Perkins, John Williams, Michael
Woodger and David Wright.
My appreciation goes also to the Cornell Computer Science
Community. The students of course CS600 have been my guinea pigs for the past
five years, and the faculty and students have tolerated my preachings
about programming in a very amiable way. Cornell has been an excellent
place to perform my research.
This book was typed and edited by myself, using the departmental
PDP11/60-VAX system running under UNIX+ and a screen editor written
for the Terak. (The files for the book contain 844,592 characters.) The
final copy was produced using troff and a Comp Edit phototypesetter at
the Graphics Lab at Cornell. Doug McIlroy introduced me to many of
the intricacies of troff; Alan Demers, Dean Krafft and Mike Hammond
provided much help with the PDP11/60-VAX system; and Alan Demers,
Barbara Gingras and Sandor Halasz spent many hours helping me
connect the output of troff to the phototypesetter. To them I am grateful.
The National Science Foundation has given me continual support for
my research, which led to this book.
Meetings of the IFIP Working Group on programming methodology,
WG2.3, have had a strong influence on my work in programming
methodology over the past 8 years.
+UNIX is a trademark of Bell Laboratories.
Finally, I thank my wife, Elaine, and children, Paul and Susan, for
their love and patience while I was writing this book.
In preparing the second printing of this book, over 150 changes were
made without significantly changing the page numbering. Thanks go to
the following people for notifying me of errors: Roland Backhouse, Alfs
T. Berztiss, Ed Cohen, Cui Jing, Cui Yan-Nong, Pavel Curtis, Alan
Demers, David Gries, Robert Harper, Cliff Jones, Donald E. Knuth, Liu
Shau-Chung, Michael Marcotty, Alain Martin, James Mildrew, Ken
Perry, Hal Perkins, Paul Pritchard, Willem de Roever, J.L.A. van de
Snepscheut, R.C. Shaw, Jorgen Steensgaard-Madsen, Rodney Topor,
Solveig Torgerson, Wlad Turski, V. Vitek, David Wright, Zhou Bing-Sheng.
Table of Contents
Part II. The Semantics of a Small Language ......................... 107
Chapter 7. The Predicate Transformer wp ........................... 108
Chapter 8. The Commands skip, abort and Composition .............. 114
Chapter 9. The Assignment Command ................................. 117
   9.1. Assignment to Simple Variables ............................ 117
   9.2. Multiple Assignment to Simple Variables ................... 121
   9.3. Assignment to an Array Element ............................ 124
   9.4. The General Multiple Assignment Command ................... 127
Chapter 10. The Alternative Command ............................... 131
Chapter 11. The Iterative Command ................................. 138
Chapter 12. Procedure Call ........................................ 149
   12.1. Calls with Value and Result Parameters ................... 150
   12.2. Two Theorems Concerning Procedure Call ................... 153
   12.3. Using Var Parameters ..................................... 158
   12.4. Allowing Value Parameters in the Postcondition ........... 160
A story
We have just finished writing a large program (3000 lines). Among
other things, the program computes as intermediate results the quotient q
and remainder r arising from dividing a non-negative integer x by a
positive integer y. For example, with x = 7 and y = 2, the program calculates
q = 3 (since 7÷2 = 3) and r = 1 (since the remainder when 7 is divided by
2 is 1).
Our program appears below, with dots "..." representing the parts of
the program that precede and follow the remainder-quotient calculation.
The calculation is performed as given because the program will sometimes
be executed on a micro-computer that has no integer division, and
portability must be maintained at all costs! The remainder-quotient calculation
actually seems quite simple; since ÷ cannot be used, we have elected to
subtract divisor y from a copy of x repeatedly, keeping track of how
many subtractions are made, until another subtraction would yield a
negative integer.
r := x; q := 0;
while r > y do
    begin r := r-y; q := q+1 end;

The calculation is to establish the result

x = y*q + r,

and we insert this result assertion, together with the assertion y > 0 that
we expect to hold beforehand:

{y > 0}
r := x; q := 0;
(1) while r > y do
        begin r := r-y; q := q+1 end;
{x = y*q + r}
Testing now results in far less output, and we make progress. Assertion
checking detects an error during a test run because y is 0 just before a
remainder-quotient calculation, and it takes only four hours to find the
error in the calculation of y and fix it.
Part 0. Why Use Logic? Why Prove Programs Correct? 3
But then we spend a day tracking down an error for which we received
no nice false-assertion message. We finally determine that the
remainder-quotient calculation resulted in a state satisfying r = y (for example,
x = 6, y = 3, q = 1 and r = 3).
Sure enough, both assertions in (1) are true with these values; the problem
is that the remainder should be less than the divisor, and it isn't. We
determine that the loop condition should be r ≥ y instead of r > y. If
only the result assertion were strong enough - if only we had used the
assertion x = y*q + r and r < y - we would have saved a day of work!
Why didn't we think of it?
We fix the error and insert the stronger assertion:
{y>O}
r:=x; q:=O;
while r;;:::y do
begin r: r -y; q: q + I end;
= =
{x =y*q +r and r <y}
Things go fine for a while, but one day we get incomprehensible output.
It turns out that the quotient-remainder algorithm resulted in a negative
remainder r = -2. But the remainder shouldn't be negative! And we find
out that r was negative because initially x was -2. Ahhh, another error
in calculating the input to the quotient-remainder algorithm - x isn't
supposed to be negative! But we could have caught the error earlier and
saved two days of searching; in fact, we should have caught it earlier; all we
had to do was make the initial and final assertions for the program
segment strong enough. Once more we fix an error and strengthen an
assertion:

{0 ≤ x and 0 < y}
r := x; q := 0;
while r ≥ y do
    begin r := r-y; q := q+1 end;
{x = y*q + r and 0 ≤ r < y}

Indeed, we could have written
the initial assertion (0 ≤ x and 0 < y) and the final assertion (x = y*q + r
and 0 ≤ r < y) before writing the program segment, for they form the
definition of quotient and remainder.
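In a modern language the corrected segment, together with its initial and final assertions, can be written with executable assert statements. This is only an illustrative sketch in Python (the function name is mine, not the book's):

```python
def quotient_remainder(x, y):
    # Initial assertion: 0 <= x and 0 < y.
    assert 0 <= x and 0 < y
    r, q = x, 0
    while r >= y:                 # the corrected condition: r >= y, not r > y
        r, q = r - y, q + 1
    # Final assertion: x = y*q + r and 0 <= r < y,
    # which is exactly the definition of quotient and remainder.
    assert x == y * q + r and 0 <= r < y
    return q, r

print(quotient_remainder(7, 2))   # (3, 1)
```

Had the segment of the story carried these two assertions from the start, both input errors would have been reported at the point of the calculation rather than days later.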
But what about the error we made in the condition of the while loop?
Could we have prevented that from the beginning? Is there a way to
prove, just from the program and assertions, that the assertions are true
when flow of control reaches them? Let's see what we can do.
Just before the loop it seems that part of our result,

(2) x = y*q + r

holds, since r = x and q = 0. And from the assignments in the loop body
we conclude that if (2) is true before execution of the loop body then it is
true after its execution, so it will be true just before and after every
iteration of the loop. Let's insert it as an assertion in the obvious places, and
let's also make all assertions as strong as possible:
Now, how can we easily determine a correct loop condition, or, given the
condition, how can we prove it is correct? When the loop terminates the
condition is false. Upon termination we want r < y, so that the
complement, r ≥ y, must be the correct loop condition. How easy that was!
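The reasoning just given - (2) holds on entry to the loop, and each execution of the body preserves it - can also be checked dynamically, by evaluating the candidate invariant at every iteration. A Python sketch (the function name is mine):

```python
def quotient_remainder_checked(x, y):
    assert 0 <= x and 0 < y
    r, q = x, 0
    assert x == y * q + r          # (2) holds initially, since r = x and q = 0
    while r >= y:
        r, q = r - y, q + 1
        assert x == y * q + r      # the loop body preserves (2)
    # On termination the loop condition is false, so r < y;
    # together with (2) this is the full result assertion.
    assert x == y * q + r and r < y
    return q, r

print(quotient_remainder_checked(17, 5))   # (3, 2)
```

Such run-time checks are no substitute for the proof, but they illustrate the roles the invariant and the negated loop condition play in it.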
It seems that if we knew how to make all assertions as strong as possible
and if we learned how to reason carefully about assertions and programs,
then we wouldn't make so many mistakes, we would know our
program was correct, and we wouldn't need to debug programs at all!
Hence, the days spent running test cases, looking through output and
searching for errors could be spent in other ways.
Discussion
The story suggests that assertions, or simply Boolean expressions, are
really needed in programming. But it is not enough to know how to write
Boolean expressions; one needs to know how to reason with them: to
simplify them, to prove that one follows from another, to prove that one is
not true in some state, and so forth. And, later on, we will see that it is
necessary to use a kind of assertion that is not part of the usual Boolean
expression language of Pascal, PL/I or FORTRAN, the "quantified"
assertion.
Knowing how to reason about assertions is one thing; knowing how to
reason about programs is another. In the past 10 years, computer science
has come a long way in the study of proving programs correct. We are
reaching the point where the subject can be taught to undergraduates, or
to anyone with some training in programming and the will to become
more proficient. More importantly, the study of program correctness
proofs has led to the discovery and elucidation of methods for developing
programs. Basically, one attempts to develop a program and its proof
hand-in-hand, with the proof ideas leading the way! If the methods are
practiced with care, they can lead to programs that are free of errors, that
take much less time to develop and debug, and that are much more easily
understood (by those who have studied the subject).
Above, I mentioned that programs could be free of errors and, in a
way, I implied that debugging would be unnecessary. This point needs
some clarification. Even though we can become more proficient in
programming, we will still make errors, even if only of a syntactic nature
(typos). We are only human. Hence, some testing will always be
necessary. But it should not be called debugging, for the word debugging
implies the existence of bugs, which are terribly difficult to eliminate. No
matter how many flies we swat, there will always be more. A disciplined
method of programming should give more confidence than that! We
should run test cases not to look for bugs, but to increase our confidence
in a program we are quite sure is correct; finding an error should be the
exception rather than the rule.
With this motivation, let us turn to our first subject, the study of logic.
Part I
Propositions
and Predicates
As seen in the above syntax, five operators are defined over values of
type Boolean:

(1.2.4) Case 3. The value of a constant proposition with more than one
operator is found by repeatedly applying (1.2.2) to a subproposition
of the constant proposition and replacing the subproposition
by its value, until the proposition is reduced to T or F.
We give an example of evaluation of a proposition:

((T ∧ T) ⇒ F)
= (T ⇒ F)
= F
false" and "true". But note that operation or denotes "inclusive or" and not
"exclusive or". That is, (T ∨ T) is T, while the "exclusive or" of T and T
is false.
Also, there is no causality implied by operation imp. The sentence "If
it rains, the picnic is cancelled" can be written in propositional form as
(rain ⇒ no picnic). From the English sentence we infer that the lack of
rain means there will be a picnic, but no such inference can be made from
the proposition (rain ⇒ no picnic).
Example. Let state s be the function defined by the set {(a, T), (bc, F),
(y1, T)}. Then s(a) denotes the value determined by applying state
(function) s to identifier a: s(a) = T. Similarly, s(bc) = F and s(y1) =
T. □
s(((¬b) ∨ c))
= ((¬T) ∨ F)    (b has been replaced by T, c by F)
= (F ∨ F)
= F    □
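The evaluation just performed - replace each identifier by its value in state s, then reduce - is easy to mimic directly. A Python sketch, with the state represented as a mapping from identifiers to truth values:

```python
# State s as a mapping from identifiers to truth values.
s = {'b': True, 'c': False}

# Evaluate (not b) or c in state s: replace b by T and c by F, then reduce,
# exactly as in the hand evaluation above.
value = (not s['b']) or s['c']
print(value)   # False
```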
1.  s(<proposition>)                 s(<imp-expr>)
2.  s(<proposition> = <imp-expr>)    s(<proposition>) = s(<imp-expr>)
3.  s(<imp-expr>)                    s(<expr>)
4.  s(<imp-expr> ⇒ <expr>)           s(<imp-expr>) ⇒ s(<expr>)
5.  s(<expr>)                        s(<term>)
6.  s(<expr> ∨ <term>)               s(<expr>) ∨ s(<term>)
7.  s(<term>)                        s(<factor>)
8.  s(<term> ∧ <factor>)             s(<term>) ∧ s(<factor>)
9.  s(¬<factor>)                     ¬s(<factor>)
10. s((<proposition>))               s(<proposition>)
11. s(T)                             T
12. s(F)                             F
13. s(<identifier>)                  (the value of
                                     <identifier> in s)
b    c    ¬b    ¬b ∨ c    b ⇒ c    (b ⇒ c) = (¬b ∨ c)
F    F    T     T         T        T
F    T    T     T         T        T
T    F    F     F         F        T
T    T    F     T         T        T
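The table can be regenerated by enumerating the four states; the final column is T in every state, so the two propositions are equivalent. A Python sketch, with imp standing for ⇒ (defined by cases, so the comparison is not circular):

```python
from itertools import product

def imp(b, c):
    """b => c: true unless b is true and c is false."""
    return c if b else True

for b, c in product([False, True], repeat=2):
    lhs = imp(b, c)
    rhs = (not b) or c
    print(b, c, lhs, lhs == rhs)   # last column: True in every state
```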
1.5 Tautologies

A tautology is a proposition that is true in every state in which it is
well-defined. For example, proposition T is a tautology and F is not.
The proposition b ∨ ¬b is a tautology, as can be seen by evaluating it with
b = T and b = F:

T ∨ ¬T  =  T ∨ F  =  T
F ∨ ¬F  =  F ∨ T  =  T

b    ¬b    b ∨ ¬b
T    F     T
F    T     T
The basic way to show that a proposition is a tautology is to show that its
evaluation yields T in every possible state. Unfortunately, each extra
identifier in a proposition doubles the number of combinations of values
for identifiers - for a proposition with i distinct identifiers there are 2^i
cases! Hence, the work involved can become tedious and time consuming.
To illustrate this, (1.5.1) contains the truth table for proposition
(b ∧ c ∧ d) ⇒ (d ⇒ b), which has three distinct identifiers. By taking some
shortcuts, the work can be reduced. For example, a glance at truth table
(1.2.3) indicates that operation imp is true whenever its antecedent is false,
so that its consequent need only be evaluated if its antecedent is true. In
example (1.5.1) there is only one state in which the antecedent b ∧ c ∧ d is
true - the state in which b, c and d are true - and hence we need only
the top line of truth table (1.5.1).
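The brute-force method is mechanical enough to automate: enumerate all 2^i states and evaluate the proposition in each. A Python sketch (the helper names are mine):

```python
from itertools import product

def is_tautology(prop, names):
    """Evaluate prop in all 2**len(names) states (brute force)."""
    return all(prop(dict(zip(names, vals)))
               for vals in product([False, True], repeat=len(names)))

def imp(b, c):
    return c if b else True

# The proposition of truth table (1.5.1): (b and c and d) => (d => b).
prop = lambda st: imp(st['b'] and st['c'] and st['d'], imp(st['d'], st['b']))
print(is_tautology(prop, ['b', 'c', 'd']))   # True, after checking 2**3 = 8 states
```

The shortcut described in the text corresponds to skipping every state in which the antecedent evaluates to False.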
Section 1.6 Propositions as Sets of States 15
Disproving a conjecture
Sometimes we conjecture that a proposltlOn e is a tautology, but are
unable to develop a proof of it, so we decide to try to disprove it. What
does it take to disprove such a conjecture?
It may be possible to prove the converse -i.e. that , e is a tautology-
but the chances are slim. If we had reason to believe a conjecture, it is
unlikely that its converse is true. Much more likely is that it is true in
most states but false in one or two, and to disprove it we need only find
one such state:
Example. The set of two states {(b, T), (c, T), (d, T)} and {(b, F),
(c, T), (d, F)} is represented by the proposition

(b ∧ c ∧ d) ∨ (¬b ∧ c ∧ ¬d)    □
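Enumerating all eight states confirms that exactly the two states of the example satisfy the proposition. A Python check:

```python
from itertools import product

def prop(st):
    # (b and c and d) or ((not b) and c and (not d))
    return (st['b'] and st['c'] and st['d']) or \
           ((not st['b']) and st['c'] and (not st['d']))

states = [dict(zip('bcd', vals)) for vals in product([False, True], repeat=3)]
satisfying = [st for st in states if prop(st)]
print(len(satisfying))   # 2
```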
it rains: r
picnic is cancelled: pc
be wet: wet
stay at home: s
2. Write truth tables to show the values of the following propositions in all states:
4. Below are some English sentences. Introduce identifiers to represent the simple
ones (e.g. "it's raining cats and dogs") and then translate the sentences into
propositions.
(a) Whether or not it's raining, I'm going swimming.
(b) If it's raining I'm not going swimming.
(c) It's raining cats and dogs.
(d) It's raining cats or dogs.
(e) If it rains cats and dogs I'll eat my hat, but I won't go swimming.
(f) If it rains cats and dogs while I am swimming I'll eat my hat.
Chapter 2
Reasoning using Equivalence Transformations
7. Law of Contradiction: E1 ∧ ¬E1 = F
8. Law of Implication: E1 ⇒ E2 = ¬E1 ∨ E2
9. Law of Equality: (E1 = E2) = (E1 ⇒ E2) ∧ (E2 ⇒ E1)
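Each law asserts that two propositions have the same value in every state, so each can be verified exhaustively. A Python check of laws 7-9 (imp and eqv are my encodings of ⇒ and =, with imp defined by cases so that law 8 is not checked circularly):

```python
from itertools import product

def imp(p, q): return q if p else True   # p => q
def eqv(p, q): return p == q             # p = q

for e1, e2 in product([False, True], repeat=2):
    # 7. Contradiction: E1 and (not E1) is F
    assert (e1 and not e1) == False
    # 8. Implication: (E1 => E2) = ((not E1) or E2)
    assert imp(e1, e2) == ((not e1) or e2)
    # 9. Equality: (E1 = E2) = ((E1 => E2) and (E2 => E1))
    assert eqv(e1, e2) == (imp(e1, e2) and imp(e2, e1))
print('laws 7, 8 and 9 hold in every state')
```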
Section 2.1 The Laws of Equivalence 21
Don't be alarmed at the number of laws. Most of them you have used
many times, perhaps unknowingly, and this list will only serve to make
you more aware of them. Study the laws carefully, for they are used over
and over again in manipulating propositions. Do some of the exercises at
the end of this section until the use of these laws becomes second nature.
Knowing the laws by name makes discussions of their use easier.
The law of the Excluded Middle deserves some comment. It means
that at least one of b and ¬b must be true in any state; there can be no
middle ground. Some don't believe this law, at least in all its generality.
In fact, here is a counterexample to it, in English. Consider the sentence
Clearly, the law is true in all states (in which it is well-defined), so that it
is a tautology.
Exercise 1 concerns proving all the laws to be equivalences.
e1: b ⇒ c   and
e2: ¬b ∨ c

we have

E(e1) = d ∨ (b ⇒ c)
E(e2) = d ∨ (¬b ∨ c)
Section 2.2 The Rules of Substitution and Transitivity 23
b ⇒ c
= ¬b ∨ c      (Implication)
= c ∨ ¬b      (Commutativity)
= ¬¬c ∨ ¬b    (Negation)
= ¬c ⇒ ¬b     (Implication)
Example. We show that the law of Contradiction can be proved from the
others. The portion of each proposition to be replaced in each step is
underlined in order to make it easier to identify the substitution.

(b ∧ (b ⇒ c)) ⇒ c
= ¬(b ∧ (¬b ∨ c)) ∨ c    (Implication, 2 times)
= ¬b ∨ ¬(¬b ∨ c) ∨ c     (De Morgan)
= T                      (Excluded Middle)
Transforming an implication
Suppose we want to prove that

(2.2.3) E1 ∧ E2 ∧ E3 ⇒ E

is a tautology. Using the laws of Implication and De Morgan, it can be
transformed into the equivalent disjunction ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E.
The final proposition is true in any state in which at least one of ¬E1,
¬E2, ¬E3 and E is true. Hence, to prove that (2.2.3) is a tautology we
need only prove that in any state in which three of them are false the
fourth is true. And we can choose which three to assume false, based on
their form, in order to develop the simplest proof.
With an argument similar to the one just given, we can see that the
five statements

E1 ∧ E2 ∧ E3 ⇒ E
E1 ∧ E2 ∧ ¬E ⇒ ¬E3
E1 ∧ ¬E ∧ E3 ⇒ ¬E2
¬E ∧ E2 ∧ E3 ⇒ ¬E1
(2.2.4) ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E

are equivalent and we can choose which to work with. When given a
proposition like (2.2.3), eliminating implication completely in favor of
disjunctions like (2.2.4) can be helpful. Likewise, when formulating a
problem, put it in the form of a disjunction right from the beginning.
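That the five statements are indeed interchangeable can be confirmed by brute force over all sixteen states. A Python sketch (imp encodes ⇒, defined by cases):

```python
from itertools import product

def imp(p, q):
    return q if p else True

for e1, e2, e3, e in product([False, True], repeat=4):
    forms = [
        imp(e1 and e2 and e3, e),                  # (2.2.3)
        imp(e1 and e2 and (not e), not e3),
        imp(e1 and (not e) and e3, not e2),
        imp((not e) and e2 and e3, not e1),
        (not e1) or (not e2) or (not e3) or e,     # (2.2.4)
    ]
    assert len(set(forms)) == 1    # all five agree in this state
print('the five statements are equivalent')
```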
Next, define the propositions that arise by using the rules of Substitution
and Transitivity and already-derived theorems to be theorems. In this
context, the rules are often called inference rules, for they can be used to
infer that a proposition is a theorem. An inference rule is often written in
the form

E1, ..., En          E1, ..., En
-----------    and   -----------
     E                  E, E0
26 Part I. Propositions and Predicates
where the Ej and E stand for arbitrary propositions. The inference rule
has the following meaning. If propositions E1, ..., En are theorems,
then so is proposition E (and E0 in the second case). Written in this
form, the rules of Substitution and Transitivity are
(2.3.2) Rule of Substitution:
        e1 = e2
        -----------------------------
        E(e1) = E(e2), E(e2) = E(e1)

(2.3.3) Rule of Transitivity:
        e1 = e2, e2 = e3
        ----------------
        e1 = e3
(3.1.1) premise: p ∧ q
        conclusion: p ∧ (r ∨ q)

From p ∧ q infer p ∧ (r ∨ q)

(3.1.3)
1  p ∧ q          premise
2  p              property of and, 1
3  q              property of and, 1
4  r ∨ q          property of or, 3
5  p ∧ (r ∨ q)    property of and, 2, 4
Infer e.
                E1, ..., En
(3.2.1) ∧-I:    -------------
                E1 ∧ ... ∧ En

Section 3.2 Inference Rules 31

                E1 ∧ ... ∧ En
(3.2.2) ∧-E:    -------------
                Ei

                Ei
(3.2.3) ∨-I:    -------------
                E1 ∨ ... ∨ En
Remark: There are places where it frequently rains while the sun is shin-
ing. Ithaca, the home of Cornell University, is one of them. In fact, it
sometimes rains when perfectly blue sky seems to be overhead. The
weather can also change from a furious blizzard to bright, calm sunshine
and then back again, within minutes. When the weather acts so strangely,
as it often does, one says that it is Ithacating. □
p ∨ q ∧ ¬r
(p ∨ q) ∧ ¬r
Let us redo proof (3.1.3) in (3.2.4) below and indicate the exact
inference rule used at each step. The top line states what is to be proved. The
line numbered 1 contains the first (and only) premise (pr 1). Each other
line has the following property. Let the line have the form

(line no.)  E    name, line #, ..., line #

Then one can form an instance of the named inference rule by writing the
propositions on lines line #, ..., line # above a line and proposition E
below. That is, the truth of E is inferred by one inference rule from the
truth of previous propositions. For example, from line 4 of the proof we
see that q / (r ∨ q) is an instance of rule ∨-I: (r ∨ q) is being inferred from q.
From p ∧ q infer p ∧ (r ∨ q)

(3.2.4)
1  p ∧ q          pr 1
2  p              ∧-E, 1
3  q              ∧-E, 1
4  r ∨ q          ∨-I, 3
5  p ∧ (r ∨ q)    ∧-I, 2, 4
Note how rule ∧-E is used to break a proposition into its constituent
parts, while ∧-I and ∨-I are used to build new ones. This is typical of the
use of introduction and elimination rules.
Proofs (3.2.5) and (3.2.6) below illustrate that and is a commutative
operation; if p ∧ q is true then so is q ∧ p, and vice versa. This is obvious
after our previous study of propositions, but it must be proved in this
formal system before it can be used. Note that both proofs are necessary;
one cannot derive the second as an instance of the first by replacing p
and q in the first by q and p, respectively. In this formal system, a proof
holds only for the particular propositions involved. It is not a schema,
the way an inference rule is.
From p ∧ q infer q ∧ p

(3.2.5)
1  p ∧ q    pr 1
2  p        ∧-E, 1
3  q        ∧-E, 1
4  q ∧ p    ∧-I, 3, 2
To illustrate the relation between the proof system and English, we give
an argument in English for lemma (3.2.5): Suppose p ∧ q is true [line 1].
Then so is p, and so is q [lines 2 and 3]. Therefore, by the definition of
and, q ∧ p is true [line 4].
From q ∧ p infer p ∧ q

(3.2.6)
1  q ∧ p    pr 1
2  q        ∧-E, 1
3  p        ∧-E, 1
4  p ∧ q    ∧-I, 3, 2
using "pr i" to refer to the i-th premise later on, as shown in (3.2.7). This
abbreviation will occur often. But note that this is only an abbreviation,
and we will continue to use the phrase "occurs on a previous line" to
include the premises, even though the abbreviation is used.
we conclude no sun.
Here is a simple example.
                E1 ⇒ E2, E1
(3.2.9) ⇒-E:    -----------
                E2
From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)

(3.2.10)
1  p ∧ q          pr 1
2  p ⇒ r          pr 2
3  p              ∧-E (rule (3.2.2)), 1
4  r              ⇒-E, 2, 3
5  r ∨ (q ⇒ r)    ∨-I (rule (3.2.3)), 4

From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)

(3.2.11)
1  p              ∧-E, pr 1
2  r              ⇒-E, pr 2, 1
3  r ∨ (q ⇒ r)    ∨-I, 2
                 E1 ⇒ E2, E2 ⇒ E1
(3.2.12) =-I:    ----------------
                 E1 = E2

                 E1 = E2
(3.2.13) =-E:    ----------------
                 E1 ⇒ E2, E2 ⇒ E1
From
1            =-E, pr 2
2  q ⇒ r     ⇒-E, 1, pr 1
3  r = q     =-I, pr 3, 2
2. Here is one proof that p follows from p. Write another proof that uses only
one reference to the premise.

From p infer p
1  p    pr 1
2  p    pr 1
4. For each of your proofs of exercise 3, give an English version. (The English
versions need not mimic the formal proofs exactly.)
Proof (3.3.2) uses ⇒-I twice in order to prove that p ∧ q and q ∧ p are
equivalent, using lemmas proved in the previous section.
Subproofs
A proof can be included within a proof, much the way a procedure can
be included within a program. This allows the premise of ⇒-I to appear
as a line of a proof. To illustrate this, (3.3.2) is rewritten in (3.3.4) to
include proof (3.2.5) as a subproof. The subproof happens to be on line 1
here, but it could be on any line. If the subtheorem appears on line j
Section 3.3 Proofs and Subproofs 37
(say) of the main proof, then its proof appears indented underneath, with
its lines numbered j.1, j.2, etc. We could have replaced the reference to
(3.2.6) by a subproof in a similar manner.
Infer (p ∧ q) = (q ∧ p)

(3.3.4)
1  From p ∧ q infer q ∧ p
   1.1  p        ∧-E, pr 1
   1.2  q        ∧-E, pr 1
   1.3  q ∧ p    ∧-I, 1.2, 1.1
2  (p ∧ q) ⇒ (q ∧ p)    ⇒-I, 1
3  (q ∧ p) ⇒ (p ∧ q)    ⇒-I, (3.2.6)
4  (p ∧ q) = (q ∧ p)    =-I, 2, 3
From (q ∨ s) ⇒ (p ∧ q) infer (q ∨ s) = (p ∧ q)

(3.3.5)
1  (q ∨ s) ⇒ (p ∧ q)    pr 1
2  From p ∧ q infer q ∨ s
   2.1  q        ∧-E, pr 1
   2.2  q ∨ s    ∨-I, 2.1
3  (p ∧ q) ⇒ (q ∨ s)    ⇒-I, 2
4  (q ∨ s) = (p ∧ q)    =-I, 1, 3
Scope rules
A subproof can contain references not only to previous lines in its
proof, but also to previous lines that occur in surrounding proofs. We
call these global line references. However, "recursion" is not allowed; a
line j (say) may not contain a reference to a theorem whose proof is not
finished by line j.
The reader skilled in the use of block structure in languages like PL/I,
ALGOL 60 and Pascal will have no difficulty in understanding this scope
rule, for essentially the same scope mechanism is employed here (except
for the restriction against recursion). Let us state the rule more precisely.
Example (3.3.7) illustrates the use of this scope rule; line 2.2 refers to
line 1, which is outside the proof of line 2.
We illustrate another common mistake below: the use of a line that is not
in a surrounding proof. Below, on line 6.1 an attempt is made to
reference s on line 4.1. Since line 4.1 is not in a surrounding proof, this is not
allowed.
A subproof using global references is being proved in a particular
context. Taken out of context, the subproof may not be true because it relies
Section 3.3 Proofs and Subproofs 39
From (p ∧ q) ⇒ r infer p ⇒ (q ⇒ r)

(3.3.8)
1  (p ∧ q) ⇒ r    pr 1
2  From p infer q ⇒ r
   2.1  p    pr 1
   2.2  From q infer r
        2.2.1  p ∧ q    ∧-I, 2.1, pr 1
        2.2.2  r        ⇒-E, 1, 2.2.1
   2.3  q ⇒ r    ⇒-I, 2.2
3  p ⇒ (q ⇒ r)    ⇒-I, 2
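Theorem (3.3.8) has a semantic counterpart that can be checked by truth table: (p ∧ q) ⇒ r and p ⇒ (q ⇒ r) have the same value in every state. A Python check (imp encodes ⇒, defined by cases):

```python
from itertools import product

def imp(p, q):
    return q if p else True

for p, q, r in product([False, True], repeat=3):
    assert imp(p and q, r) == imp(p, imp(q, r))
print('(p and q) => r  equals  p => (q => r) in all 8 states')
```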
Proof by contradiction
A proof by contradiction typically proceeds as follows. One makes an
assumption. From this assumption one proceeds to prove a contradiction,
say, by showing that something is both true and false. Since such a
contradiction cannot possibly happen, and since the proof from
assumption to contradiction is valid, the assumption must be false.
Proof by contradiction is embodied in the proof rules ¬-I and ¬-E:

         From E infer E1 ∧ ¬E1            From ¬E infer E1 ∧ ¬E1
¬-I:     ---------------------    ¬-E:    ----------------------
         ¬E                               E

Rule ¬-I indicates that if "From E infer E1 ∧ ¬E1" has been proved for
some proposition E1, then one can write ¬E on a line of the proof.
Rule ¬-E similarly allows us to conclude that E holds if a proof of
"From ¬E infer E1 ∧ ¬E1" exists, for some proposition E1.
We show in (3.3.11) an example of the use of rule ¬-I, that from p we
can conclude ¬¬p.

From p infer ¬¬p

(3.3.11)
1  p    pr 1
2  From ¬p infer p ∧ ¬p
   2.1  p ∧ ¬p    ∧-I, 1, pr 1
3  ¬¬p    ¬-I, 2
From ¬¬p infer p

(3.3.12)
1  ¬¬p    pr 1
2  From ¬p infer ¬p ∧ ¬¬p
   2.1  ¬p ∧ ¬¬p    ∧-I, pr 1, 1
3  p    ¬-E, 2
Theorems (3.3.11) and (3.3.12) look quite similar, and yet both proofs are
needed; one cannot simply get one from the other more easily than they
are proved here. More importantly, both of the rules ¬-I and ¬-E are
needed; if one is omitted from the proof system, we will be unable to
deduce some propositions that are tautologies in the sense described in
section 1.5. This may seem strange, since the rules look so similar.
Let us give two more proofs. The first one indicates that from p and
¬p one can prove any proposition q, even one that is equivalent to false.
This is because p and ¬p cannot both be true at the same time, and
hence the premises form an absurdity.
Section 3.3 Proofs and Subproofs 41
From p, ¬p infer q

(3.3.13)
1  p     pr 1
2  ¬p    pr 2
3  From ¬q infer p ∧ ¬p
   3.1  p ∧ ¬p    ∧-I, 1, 2
4  q    ¬-E, 3
From p ∧ q infer ¬(p ⇒ ¬q)

(3.3.14)
1  p ∧ q    pr 1
2  From p ⇒ ¬q infer q ∧ ¬q
   2.1  p         ∧-E, 1
   2.2  q         ∧-E, 1
   2.3  ¬q        ⇒-E, pr 1, 2.1
   2.4  q ∧ ¬q    ∧-I, 2.2, 2.3
3  ¬(p ⇒ ¬q)    ¬-I, 2
Summary
The reader may have noticed a difference between the natural deduction
system and the previous systems of evaluation and equivalence
transformation: the natural deduction system does not allow the use of
constants T and F! The connection between the systems can be stated as
follows. If "Infer e" is a theorem of the natural deduction system, then e
is a tautology and e = T is an equivalence. On the other hand, if e = T is
a tautology and e does not contain T and F, then "Infer e" is a theorem
of the natural deduction system. The omission of T and F is no problem
because, by the rule of Substitution, in any proposition T can be replaced
by a tautology (e.g. b ∨ ¬b) and F by the complement of a tautology (e.g.
b ∧ ¬b) to yield an equivalent proposition.
We summarize what a proof is as follows. A proof of a theorem
"From e1, ..., en infer e" or of a theorem "Infer e" consists of a
sequence of lines. The first line contains the theorem. If the first line is
unnumbered, the rest are indented and numbered 1, 2, etc. If the first line
has the number i, the rest are indented and numbered i.1, i.2, etc. The
last line must contain proposition e. Each line i must have one of the
following four forms:
Form 1: (i)  ej   pr j
where 1 ≤ j ≤ n. The line contains premise j.
Form 3: (i)  p   Theorem name, ref1, ..., refq
Theorem name is the name of a previously proved theorem; refk
is as in Form 2. Let rk denote the proposition referred to by
refk. Then "From r1, ..., rq infer p" must be the named
theorem.
Historical Notes
The style of the logical system defined in this chapter was conceived
principally to capture our "natural" patterns of reasoning. Gerhard
Gentzen, a German mathematician who died in an Allied prisoner of war
camp just after World War II, developed such a system for mathematical
arguments in his 1935 paper Untersuchungen ueber das logische Schliessen
[20], which is included in [43].
Several textbooks on logic are based on natural deduction, for example
W.V.O. Quine's book Methods of Logic [41].
The particular block-structured system given here was developed using
two sources: WFF'N PROOF: The Game of Modern Logic, by Layman
E. Allen [1], and the monograph A Programming Logic, by Robert Con-
stable and Michael O'Donnell [7]. The former introduces the deduction
system through a series of games; it uses prefix notation, partly to avoid
problems with parentheses, which we have sidestepped through informal-
ity. A Programming Logic describes a mechanical program verifier for
Exercises for Section 3.3
           Ei                         E1 ∨ ... ∨ En,  E1 ⇒ E, ..., En ⇒ E
∨-I:  ---------------        ∨-E:  ----------------------------------------
      E1 ∨ ... ∨ En                                  E
3. Prove that q ⇒ (q ∧ q). Prove that (q ∧ q) ⇒ q. Use the first two results to
prove that q = (q ∧ q). Then rewrite the last proof so that it does not refer to
outside proofs.
4. Prove that p = (p ∨ p).
5. Prove that p ⇒ ((r ∨ s) ⇒ p).
6. Prove that q ⇒ (r ⇒ (q ∧ r)).
7. Prove that from p ⇒ (r ⇒ s) follows r ⇒ (p ⇒ s).
18. Prove ((p ∧ ¬q) ⇒ q) ⇒ (p ⇒ q). [This, together with exercise 17, allows us
to prove (p ⇒ q) = ((p ∧ ¬q) ⇒ q).]
19. Prove (p ⇒ q) ⇒ ((p ∧ ¬q) ⇒ ¬p).
20. Prove ((p ∧ ¬q) ⇒ ¬p) ⇒ (p ⇒ q). [This, together with exercise 19, allows
us to prove (p ⇒ q) = ((p ∧ ¬q) ⇒ ¬p).]
21. Prove that (p = q) ⇒ (¬p = ¬q).
22. Prove that (¬p = ¬q) ⇒ (p = q). [This, together with exercise 21, allows us
to prove (p = q) = (¬p = ¬q).]
23. Prove ¬(p = q) ⇒ (¬p = q).
24. Prove (¬p = q) ⇒ ¬(p = q). [This, together with exercise 23, allows us to
prove the law of Inequality, ¬(p = q) = (¬p = q).]
25. Prove (p = q) ⇒ (q = p).
26. Use the rule of Contradiction to prove From p infer p.
27. For each of the proofs of exercises 1-7 and 9-25, give a version in English. (It need
not follow the formal proof exactly.)
Section 3.4 Adding Flexibility to the Natural Deduction System
From p ∧ q infer q ∧ p
From q ∧ p infer p ∧ q
Even though it looks like the second should follow directly from the
first, in the formal system both must be proved.
But we can prove something about the formal system: systematic sub-
stitution of propositions for identifiers in a theorem and its proof yields
another theorem and proof. So we can consider any theorem to be a
schema also. For example, from proof (3.2.5) of "From p ∧ q infer q ∧ p"
we can generate a proof of "From (a ∨ b) ∧ c infer c ∧ (a ∨ b)" simply by
substituting a ∨ b for p and c for q everywhere in proof (3.2.5):
Let us state more precisely this idea of textual substitution in theorem and
proof.
Infer (p ∧ q) = (q ∧ p)
1  (p ∧ q) ⇒ (q ∧ p)   ⇒-I, (3.2.5)
2  (q ∧ p) ⇒ (p ∧ q)   ⇒-I, (3.2.5) (with p for q, q for p)
3  (p ∧ q) = (q ∧ p)   =-I, 1, 2
For example, given that c ⇒ a ∨ b is true, to show that c ⇒ b ∨ a is true
we take E(p) to be c ⇒ p, e1 = e2 to be a ∨ b = b ∨ a (the law of Commu-
tativity, which will be proved later) and apply the theorem.
The rule of Substitution was an inference rule in the equivalence sys-
tem of chapter 2. However, it is a meta-theorem of the natural deduction
system and must be proved. Its proof, which would be performed by
induction on the structure of proposition E(p), is left to the interested
reader in exercise 10, so let us suppose it has been done. We put the rule
of Substitution in the form of a derived inference rule:

               e1 = e2, E(e1)
(3.4.4) subs:  ---------------    (E(p) is a function of p)
                   E(e2)
To show the use of (3.4.4), we give a schematic proof that the
rule of substitution as given in section 2.2 holds here also.
With this derived rule of inference, we have the flexibility of both the
equivalence and the natural deduction systems. But we must make sure
that the laws of section 2.1 actually hold! We do this next.
We now turn to the laws of section 2.1. Some of their proofs are given
here; the others are left as exercises to the reader.
3. Distributive laws. Here is a proof of the first; the second is left to the
reader. The proof is broken into three parts. The first part proves an
implication ⇒ and the second part proves it in the other direction, so
that the third can prove the equivalence. The second part uses a case
analysis (rule ∨-E) on b ∨ ¬b (the law of the Excluded Middle), which is
not proved until later. The use of b ∨ ¬b in this fashion occurs often
From ¬(b ∧ c) infer ¬b ∨ ¬c
         1  ¬(b ∧ c)   pr 1
         2  From ¬(¬b ∨ ¬c) infer (b ∧ c) ∧ ¬(b ∧ c)
         2.1  ¬(¬b ∨ ¬c)   pr 1
         2.2  From ¬b infer (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)
(3.4.10) 2.2.1  ¬b ∨ ¬c   ∨-I, pr 1
         2.2.2  (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)   ∧-I, 2.2.1, 2.1
         2.3  b   ¬-E, 2.2
         2.4  From ¬c infer (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)
         2.4.1  ¬b ∨ ¬c   ∨-I, pr 1
         2.4.2  (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)   ∧-I, 2.4.1, 2.1
         2.5  c   ¬-E, 2.4
         2.6  b ∧ c   ∧-I, 2.3, 2.5
         2.7  (b ∧ c) ∧ ¬(b ∧ c)   ∧-I, 2.6, 1
         3  ¬b ∨ ¬c   ¬-E, 2
(3.4.13) Infer ¬¬b = b
         1  b ⇒ ¬¬b   ⇒-I, (3.3.11)
         2  ¬¬b ⇒ b   ⇒-I, (3.3.12)
         3  ¬¬b = b   =-I, 1, 2
Infer b ∨ ¬b
         1  From ¬(b ∨ ¬b) infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
         1.1  ¬(b ∨ ¬b)   pr 1
         1.2  From ¬b infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
         1.2.1  b ∨ ¬b   ∨-I, pr 1
(3.4.14) 1.2.2  (b ∨ ¬b) ∧ ¬(b ∨ ¬b)   ∧-I, 1.2.1, 1.1
         1.3  b   ¬-E, 1.2
         1.4  b ∨ ¬b   ∨-I, 1.3
         1.5  (b ∨ ¬b) ∧ ¬(b ∨ ¬b)   ∧-I, 1.4, pr 1
         2  b ∨ ¬b   ¬-E, 1
10-11. Laws of or- and and-Simplification. These laws use the constants
T and F, which don't appear in the inference system.
4. Prove the second and third Commutative laws, (b ∨ c) = (c ∨ b) and (b = c)
= (c = b).
5. Prove the second Distributive law, b ∧ (c ∨ d) = (b ∧ c) ∨ (b ∧ d).
6. Prove the second of De Morgan's laws, ¬(b ∨ c) = ¬b ∧ ¬c.
7. Prove the law of Contradiction, ¬(b ∧ ¬b).
8. Prove the law of Implication, b ∨ c = (¬b ⇒ c).
9. Prove the law of Equality, (b = c) = (b ⇒ c) ∧ (c ⇒ b).
10. Prove theorem (3.4.3).
11. Prove the rule of Transitivity: from a = b and b = c follows a = c.
12. Prove that from p ∨ q and ¬q follows p (see (3.4.6)).
3  e3   Why?
and we need only substantiate line 3, i.e. give a reason why e3 can be
written on it. We can look to three things for insight. First, we may be
able to combine the premises or derive sub-propositions from them in
some fashion, if not to produce e3 then at least to get something that looks
similar to it.
Secondly, we can investigate e3 itself. Since an inference rule must be
used to substantiate line 3, the form of e3 should help us decide which
inference rule to use. And this leads us to the third piece of information
we can use, the inference rules themselves. There are ten inference rules,
which yields a lot of possibilities. Fortunately, few of them will apply to
any particular proposition e3, because e3 must have the form of the con-
clusion of the inference rule used to substantiate it. And, with the addi-
tional information of the premises, the number of actual possibilities can
be reduced even more.
For example, if e3 has the form e4 ⇒ e5, the two most likely inference
rules to use are =-E and ⇒-I, and if a suitable equivalence does not seem
possible to derive from the premises, then =-E can be eliminated from
consideration.
Let us suppose we try to substantiate line 3 using rule ⇒-I, because it
has the form e4 ⇒ e5. Then we would expand the proof as follows.

3  From e4 infer e5
   3.1  e4   pr 1
   3.2  e5   Why?
4  e4 ⇒ e5   ⇒-I, 3
parts of it, can be built from shorter propositions that occur on previous
lines. Note that, except for =-I, the forms of the conclusions of the rules
of introduction are all different, so that at most one of these rules can be
used to substantiate a proposition.
The rules of elimination are generally used to "break apart" a proposi-
tion so that one of its sub-propositions can be derived. All the rules of
elimination (except for =-E) have a general proposition as their conclu-
sion. This means that they may possibly be used to substantiate any pro-
position. Whether an elimination rule can be used depends on whether its
premises have appeared on previous lines, so deciding whether these rules
should be used requires a look at previous lines.
1  p ⇒ q   pr 1
2  (p ∧ ¬q) ⇒ ¬p   Why?

Little can be derived from p ⇒ q, except the disjunction ¬p ∨ q (using the
rule of Substitution). We will keep this proposition in mind. Which rules
of inference could be used to substantiate line 2? That is, which rules of
inference could have (p ∧ ¬q) ⇒ ¬p as their conclusion?
Possible inference rules are: ⇒-I, ∧-E, ∨-E, ¬-E, =-E and ⇒-E. Which
seems most applicable, and why? Expand the proof accordingly.
There is little to suggest that the elimination rules could be useful, for
their premises are different from the propositions on previous lines. This
leaves only ⇒-I.

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p
1  p ⇒ q   pr 1
2  From p ∧ ¬q infer ¬p
   2.1  p ∧ ¬q   pr 1
   2.2  ¬p   Why?
3  (p ∧ ¬q) ⇒ ¬p   ⇒-I, 2
Possible inference rules are ¬-I, ∧-E, ∨-E, ¬-E and ⇒-E. Choose the rule
that is most applicable and expand the proof accordingly.
The elimination rules don't seem useful here; use of ⇒-E on line 1
results only in q, and we already know that ∧-E can be used to derive only p
and ¬q from p ∧ ¬q. Only ¬-I seems helpful:
1  p ⇒ q   pr 1
2  From p ∧ ¬q infer ¬p
   2.1  p ∧ ¬q   pr 1
   2.2  From p infer e ∧ ¬e   (which e?)
        2.2.1  p   pr 1
        2.2.2  e ∧ ¬e   Why?
   2.3  ¬p   ¬-I, 2.2
3  (p ∧ ¬q) ⇒ ¬p   ⇒-I, 2
What proposition e should be used on lines 2.2 and 2.2.2? To make the
choice, look at the propositions that occur on lines previous to 2.2 and
the propositions we know we can derive from them. Expand the proof
accordingly.
Rule =-E can be used to derive two implications. This seems useful here,
since implications will be needed to derive the goal, and we derive both.
The following rules could be used to substantiate line 3: ¬-I, ∧-E, ∨-E, ¬-
E and ⇒-E. Choose the most likely one and expand the proof accord-
ingly.
The elimination rules don't seem helpful at all, because the premises that
would be needed in order to use them are not available and don't seem
easy to derive. The only rule to try at this point is ¬-I; we have little
choice!
What proposition e should be used on lines 3 and 3.2, and how should it
be proved? Expand the proof accordingly.
From ¬p = q infer ¬(p = q)
1  ¬p ⇒ q   =-E, pr 1
2  q ⇒ ¬p   =-E, pr 1
3  From p = q infer p ∧ ¬p
   3.1  p ⇒ q   =-E, pr 1
   3.2  q ⇒ p   =-E, pr 1
   3.3  p   Why?
   3.4  ¬p   Why?
   3.5  p ∧ ¬p   ∧-I, 3.3, 3.4
4  ¬(p = q)   ¬-I, 3
So we are left with concluding the two propositions p and ¬p. These are
quite simple, using the above reasoning, so let us just show the final
proof.
From ¬p = q infer ¬(p = q)
1  ¬p ⇒ q   =-E, pr 1
2  q ⇒ ¬p   =-E, pr 1
3  From p = q infer p ∧ ¬p
   3.1  p ⇒ q   =-E, pr 1
   3.2  q ⇒ p   =-E, pr 1
   3.3  From ¬p infer p ∧ ¬p
        3.3.1  q   ⇒-E, 1, pr 1
        3.3.2  p   ⇒-E, 3.2, 3.3.1
        3.3.3  p ∧ ¬p   ∧-I, 3.3.2, pr 1
   3.4  p   ¬-E, 3.3
   3.5  From p infer p ∧ ¬p
        3.5.1  q   ⇒-E, 3.1, pr 1
        3.5.2  ¬p   ⇒-E, 2, 3.5.1
        3.5.3  p ∧ ¬p   ∧-I, pr 1, 3.5.2
   3.6  ¬p   ¬-I, 3.5
   3.7  p ∧ ¬p   ∧-I, 3.4, 3.6
4  ¬(p = q)   ¬-I, 3
At each step of the development of the proof there was little choice. The
crucial (and most difficult) point of the development was the choice of
inference rule ¬-I to substantiate the last line of the proof, but careful
study of the inference rules led to it as the only likely candidate. Thus,
directed study of the available information can lead quite simply to the
proof.
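Although constructing the deduction was the point, the theorem itself can also be confirmed by truth table; a quick semantic check (ours):

```python
from itertools import product

# (~p = q) => ~(p = q), checked over all four states
for p, q in product([True, False], repeat=2):
    if (not p) == q:          # premise ~p = q holds in this state
        assert not (p == q)   # so the conclusion ~(p = q) holds too
```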
1. If Bill takes the bus, then Bill misses his appointment, if the
bus is late.
2. Bill shouldn't go home, if (a) Bill misses his appointment, and
(b) Bill feels downcast.
3. If Bill doesn't get the job, then (a) Bill feels downcast, and (b)
Bill shouldn't go home.
Which of the following conjectures are true? That is, which can be validly
proved from the premises? Give proofs of the true conjectures and coun-
terexamples for the others.
1. If Bill takes the bus, then Bill does get the job, if the bus is
late.
2. Bill gets the job, if (a) Bill misses his appointment, and (b) Bill
should go home.
3. If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job.
4. Bill doesn't take the bus if (a) the bus is late, and (b) Bill
doesn't get the job.
5. If Bill doesn't miss his appointment, then (a) Bill shouldn't go
home, and (b) Bill doesn't get the job.
6. Bill feels downcast, if (a) the bus is late, or (b) Bill misses his
appointment.
7. If Bill does get the job, then (a) Bill doesn't feel downcast, or
(b) Bill shouldn't go home.
8. If (a) Bill should go home, and Bill takes the bus, then (b) Bill
doesn't feel downcast, if the bus is late.
This problem is typical of the puzzles one comes across from time to time.
Most people are confused by them; they just don't know how to deal
with them effectively and are amazed at those who do. It turns out, how-
ever, that knowledge of propositional calculus makes the problem fairly
easy.
The first step in solving the problem is to translate the premises into
propositional form. Let the identifiers and their interpretations be:

tb: Bill takes the bus
bl: the bus is late
ma: Bill misses his appointment
gh: Bill should go home
fd: Bill feels downcast
gj: Bill gets the job
The premises are given below. Each has been put in the form of an impli-
cation and in the form of a disjunction, knowing that the disjunctive form
is often helpful.
Now let's solve the first few problems. In order to save space, Premises 1,
2 and 3 are not written in every proof, but are simply referred to as Prem-
ises 1, 2 and 3. Included, however, are propositions derived from them in
order to get more true propositions from which to conclude the result.
Conjecture 1: If Bill takes the bus, then Bill does get the job, if the bus is
late. Translate the conjecture into propositional form.

From tb infer bl ⇒ gj
1  tb   pr 1
2  bl ⇒ gj   Why?
From tb infer bl ⇒ gj
1  tb   pr 1
2  bl ⇒ ma   ⇒-E, Premise 1, 1
3  bl ⇒ gj   Why?
The necessary propositions for the use of the elimination rules are not
available, so try ⇒-I:
From tb infer bl ⇒ gj
1  tb   pr 1
2  bl ⇒ ma   ⇒-E, Premise 1, 1
3  From bl infer gj
   3.1  bl   pr 1
   3.2  gj   Why?
4  bl ⇒ gj   ⇒-I, 3
Can any propositions be inferred at line 3.2 from the propositions on pre-
vious lines and Premises 1, 2 and 3? Expand the proof accordingly.
From tb infer bl ⇒ gj
1  tb   pr 1
2  bl ⇒ ma   ⇒-E, Premise 1, 1
3  From bl infer gj
   3.1  bl   pr 1
   3.2  ma   ⇒-E, 2, 3.1
   3.3  gj   Why?
4  bl ⇒ gj   ⇒-I, 3
None of the rules seem helpful. The only proposition available that
contains gj is Premise 3, and its disjunctive form indicates that gj must
necessarily be true only in states in which fd ∧ ¬gh is false (according to
theorem (3.4.6)). But there is nothing in Premise 2, the only other place
fd and gh appear, to make us believe that fd ∧ ¬gh must be false.
Perhaps the conjecture is false. What counterexample, i.e. a state in
which the conjecture is false, does the structure of the proof and this
argument lead to?
Conjecture 2: Bill gets the job, if (a) Bill misses his appointment and (b)
Bill should go home. Translate the conjecture into propositional form.

From ma ∧ gh infer gj
1  ma ∧ gh   pr 1
2  gj   Why?

What can we derive from line 1 and Premises 1, 2 and 3? Expand the
proof accordingly.
Both line 1 and Premise 2 contain ma and gh. Premise 2 can be put in
the form ¬(ma ∧ gh) ∨ ¬fd. Since ma ∧ gh is on line 1, theorem (3.4.6)
together with the law of Negation allows us to conclude that ¬fd is true,
or that fd is false. Putting this argument into the proof yields
5  gj   Why?

The applicable rules are ∧-E, ∨-E, ¬-E and ⇒-E. This means that an ear-
lier proposition must be broken apart to derive gj. The one that contains
gj is Premise 3, and in its disjunctive form it looks promising. To show
that gj is true, we need only show that fd ∧ ¬gh is false. But we already
know that fd is false, so we can complete the proof as follows.
From ma ∧ gh infer gj
1  ma ∧ gh            pr 1
2  ¬(ma ∧ gh) ∨ ¬fd   subs, De Morgan, Premise 2
3  ¬¬(ma ∧ gh)        subs, Negation, 1
4  ¬fd                (3.4.6), 2, 3
5  ¬fd ∨ gh           ∨-I, 4
6  ¬(fd ∧ ¬gh)        subs, De Morgan, 5
7  gj                 (3.4.6), Premise 3, 6
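The same conclusion can be reached by brute force over all 64 states: whenever the three premises hold, so does the conjecture. A sketch (identifier names as in the puzzle; the propositional encoding of the premises is ours):

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# tb: takes bus, bl: bus late, ma: misses appointment,
# gh: should go home, fd: feels downcast, gj: gets job
for tb, bl, ma, gh, fd, gj in product([True, False], repeat=6):
    p1 = implies(tb, implies(bl, ma))
    p2 = implies(ma and fd, not gh)
    p3 = implies(not gj, fd and not gh)
    if p1 and p2 and p3:
        # Conjecture 2: (ma /\ gh) => gj holds in every such state
        assert implies(ma and gh, gj)
```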
Conjecture 3: If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job. Translate the
conjecture into propositional form.
Just before line 2.2, what propositions can be inferred from earlier propo-
sitions and Premises 1, 2 and 3? Expand the proof accordingly.
What inference rule should be used to substantiate line 2.5? Expand the
proof accordingly.
The proposition on line 2.5 could have the form of the conclusion of rules
∨-I, ∧-E, ∨-E, ¬-E and ⇒-E. The first rule to try is ∨-I. Its use would
require proving that one of ¬tb and ¬ma is true. But, looking at the
Premises, this seems difficult. For from Premise 1 we see that both tb
and ma could be true, while the other premises are true also because
their conclusions are true. Perhaps there is a contradiction. What is it?
((x < y) ∧ c) ∨ d.
The new assertions, like x < y, are called atomic expressions, while an expres-
sion that results from replacing an identifier by an atomic expression is
called a predicate. We will not go into detail about the syntax of atomic
expressions; instead we will use conventional mathematical notation and
rely on the reader's knowledge of mathematics and programming. For
example, any expression of a programming language that yields a Boolean
result is an acceptable atomic expression. Thus, the following are valid
predicates:
The second example illustrates that parentheses are not always needed to
isolate the atomic expressions from the rest of a predicate. The pre-
cedences of operators in a predicate follow conventional mathematics.
For example, the Boolean operators ∧, ∨ and ⇒ have lower precedence
than the arithmetic and relational operators. We will use parentheses to
make the precedence of operations explicit where necessary.
Evaluating predicates
Evaluating a predicate in a state is similar to evaluating a proposition.
All identifiers are replaced by their values in the state, the atomic expres-
sions are evaluated and replaced by their values (T or F), and the result-
ing constant proposition is evaluated. For example, the predicate
x < y ∨ b in the state {(x,2),(y,3),(b,F)} has the value of 2 < 3 ∨ F,
which is equivalent to T ∨ F, which is T.
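This evaluation procedure can be mimicked directly; a sketch in which a state is a mapping from identifiers to values (the representation is ours):

```python
# A state maps identifiers to values; a predicate, written as a
# function of those identifiers, is evaluated by looking them up.
def value(pred, state):
    return pred(**state)

s = {'x': 2, 'y': 3, 'b': False}
# x < y \/ b in state s: 2 < 3 \/ F, which is T
assert value(lambda x, y, b: x < y or b, s) == True
```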
Using our earlier notation s(e) to represent the value of expression e
in state s, and writing a state as the set of pairs it contains, we show the
evaluation of three predicates:
y = 0 ∨ (x/y = 5).
Rather than change the definition of and and or, which would require
us to change our formal logic completely, we introduce two new opera-
tors: cand (for conditional and) and cor (for conditional or). The
operands of these new operators can be any of three values: F, T and U
(for Undefined). The new operators are defined by the following truth
table.
This definition says nothing about the order in which the operands should
be evaluated. But the intelligent way to evaluate these operations, at least
on current computers, is in terms of the following equivalent conditional
expressions:
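The conditional expressions in question can be sketched as follows; U is modeled as None, and the function names are our own:

```python
U = None  # the undefined value

def cand(b, c):
    # b cand c = if b then c else F; an undefined b makes the result undefined
    if b is U:
        return U
    return c if b else False

def cor(b, c):
    # b cor c = if b then T else c
    if b is U:
        return U
    return True if b else c

# With y = 0 the right operand of the cor is undefined, yet the cor is T:
x, y = 7, 0
assert cor(y == 0, U if y == 0 else x // y == 5) == True
```

Evaluating the left operand first is what makes predicates such as y = 0 cor x/y = 5 safe on current machines.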
Operators cand and cor are not commutative. For example, b cand c is
not equivalent to c cand b. Hence, care must be exercised in manipulat-
ing expressions containing them. The following laws of equivalence do
hold for cand and cor (see exercise 5). These laws are numbered to
correspond to the numbering of the laws in chapter 2.
3. Distributivity:
E1 cand (E2 cor E3) = (E1 cand E2) cor (E1 cand E3)
E1 cor (E2 cand E3) = (E1 cor E2) cand (E1 cor E3)
10. cor-simplification:
E1 cor E1 = E1
E1 cor T = T (provided E1 is well-defined)
E1 cor F = E1
E1 cor (E1 cand E2) = E1
11. cand-simplification:
E1 cand E1 = E1
E1 cand T = E1
E1 cand F = F (provided E1 is well-defined)
E1 cand (E1 cor E2) = E1
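These laws, provisos included, can be checked by exhaustive enumeration over the three values; a sketch (U as None, with the operator definitions repeated so the fragment is self-contained; all names ours):

```python
U = None
VALS = [True, False, U]

def cand(b, c):
    return U if b is U else (c if b else False)

def cor(b, c):
    return U if b is U else (True if b else c)

for e1 in VALS:
    assert cor(e1, e1) == e1
    assert cor(e1, False) == e1
    assert cand(e1, e1) == e1
    assert cand(e1, True) == e1
    if e1 is not U:                    # proviso: E1 well-defined
        assert cor(e1, True) == True
        assert cand(e1, False) == False
    for e2 in VALS:
        assert cor(e1, cand(e1, e2)) == e1
        assert cand(e1, cor(e1, e2)) == e1
```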
In addition, one can derive various laws that combine cand and cor with
the other operations, for example,
3. Evaluate the following predicates in the state given in exercise 1. Use U for
the value of an undefined expression.
4.2 Quantification
Existential quantification
Let m and n be two integer expressions satisfying m ≤ n. Consider
the predicate
The set of values that satisfy m ≤ i < n is called the range of the quanti-
fied identifier i. Predicate (4.2.2) is read in English as follows.
The conventional notations

Σ(i=m..n-1) Si = Sm + Sm+1 + ... + Sn-1
Π(i=m..n-1) Si = Sm * Sm+1 * ... * Sn-1

stand for the sum and product of the values Sm, Sm+1, ..., Sn-1, respec-
tively. These can be written in a more linear fashion, similar to (4.2.1), as
follows, and we shall continue to use this new form:
(4.2.3) Definition of E:
(E i: m ≤ i < m: Ei) = F, and, for k ≥ m,
(E i: m ≤ i < k+1: Ei) = (E i: m ≤ i < k: Ei) ∨ Ek   □
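Definition (4.2.3) transcribes directly into a recursive function; a sketch (the name exists and the encoding of Ei as a predicate on i are ours):

```python
def exists(m, n, E):
    # (E i: m <= i < n: E(i)): F for an empty range; otherwise
    # peel the last value off the range, as in definition (4.2.3)
    if n <= m:
        return False
    return exists(m, n - 1, E) or E(n - 1)

assert exists(0, 10, lambda i: i == 7) == True
assert exists(0, 0, lambda i: True) == False   # empty range
```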
(E i: 0 ≤ i < 0: i = i)
(E i: -3 ≤ i < -3: T)
The value 0 is called the identity element of addition, because any number
added to 0 yields that number. Similarly, 1, F and T are the identity ele-
ments of the operators *, or and and, respectively. □
(1) (E i: 0 ≤ i < 100: (E j: 0 ≤ j < 100: prime(i) ∧ i*j = 1079))
(2) (E i: 0 ≤ i < 100: prime(i) ∧ (E j: 0 ≤ j < 100: i*j = 1079))
(3) (E i,j: 0 ≤ i,j < 100: prime(i) ∧ i*j = 1079)
Universal quantification
The universal quantifier, A, is read as "for all". The predicate
(4.2.4) (A i: m ≤ i < n: Ei)
is true in a state iff, for all values i in the range m ≤ i < n, Ei is true in
that state.
We now define A in terms of E, so that, formally, we need deal only
with one of them as a new concept. Predicate (4.2.4) is true iff all the Ei
are true, so we see that it is equivalent to
Em ∧ Em+1 ∧ ... ∧ En-1
   (A i: m ≤ i < m: Ei)
=  ¬(E i: m ≤ i < m: ¬Ei)
=  ¬F    (because the range of E is empty)
=  T
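The same definition gives A as the negation of an E; a sketch continuing the functional encoding (names ours):

```python
def exists(m, n, E):
    return any(E(i) for i in range(m, n))

def forall(m, n, E):
    # (A i: m <= i < n: E(i)) = ~(E i: m <= i < n: ~E(i))
    return not exists(m, n, lambda i: not E(i))

assert forall(0, 5, lambda i: i < 5) == True
assert forall(0, 5, lambda i: i < 4) == False
assert forall(3, 3, lambda i: False) == True   # empty range: ~F, i.e. T
```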
Numerical quantification
Consider predicates E0, E1, .... It is quite easy to assert formally that
k is the smallest integer such that Ek holds. We need only indicate that
E0 through Ek-1 are false and that Ek is true:
0 ≤ k ∧ (A i: 0 ≤ i < k: ¬Ei) ∧ Ek
It is more difficult to assert that k is the second smallest integer such that
Ek holds, because we also have to describe the first such predicate Ej:
Obviously, describing the third smallest value k such that Ek holds will
be clumsier, and to write a function that yields the number of true Ei will
be even harder. Let us introduce some notation:
(E i: m ≤ i < n: Ei) = ((N i: m ≤ i < n: Ei) ≥ 1)
(A i: m ≤ i < n: Ei) = ((N i: m ≤ i < n: Ei) = n - m)
Now it is easy to assert that k is the third smallest integer such that Ek
holds:
((N i: 0 ≤ i < k: Ei) = 2) ∧ Ek
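The counting quantifier N makes such assertions mechanical; a sketch (function names ours):

```python
def count(m, n, E):
    # (N i: m <= i < n: E(i)): the number of values in the range
    # for which E(i) is true
    return sum(1 for i in range(m, n) if E(i))

def is_third_smallest(k, E):
    # ((N i: 0 <= i < k: E(i)) = 2) /\ E(k)
    return count(0, k, E) == 2 and E(k)

even = lambda i: i % 2 == 0
assert is_third_smallest(4, even) == True    # 0, 2, 4: three even numbers
assert is_third_smallest(6, even) == False
```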
A Note on ranges
Thus far, the ranges of quantifiers have been given in the form m ≤ i
< n, for integer expressions m and n. The lower bound m is included in
the range; the upper bound n is not. Later, the form of ranges will be
generalized, but this is a useful convention, and we will use it where it is
suitable.
Note that the number of values in the range is n - m. Note also that
quantifications with adjacent ranges can be combined as follows:
While it is possible to allow predicates like (4.3.2), and most logical sys-
tems do, it is advisable to enforce the use of each identifier in only one
way:
Note that both x and y are free in the predicate x ≤ y, while x
remains free and y becomes bound when the predicate is embedded in the
expression (N y: 0 ≤ y < 10: x ≤ y) = 4.
2 ≤ m < n ∧ (A i: 2 ≤ i < m: m mod i ≠ 0)

2 ≤ m < n ∧ (A n: 2 ≤ n < m: m mod n ≠ 0)                             INVALID (why?)

(E i: 1 ≤ i < 25: 25 mod i = 0) ∧ (E i: 1 ≤ i < 25: 26 mod i = 0)     INVALID

(E t: 1 ≤ t < 25: 25 mod t = 0) ∧ (E i: 1 ≤ i < 25: 26 mod i = 0)

(E i: 1 ≤ i < 25: 25 mod i = 0 ∧ 26 mod i = 0)

(A m: n < m < n+6: (E i: 2 ≤ i < m: m mod i = 0))

(A m: n < m < n+6: (E n: 2 ≤ n < m: m mod n = 0))                     INVALID

(A m: n < m < n+6: (E k: 2 ≤ k < m: m mod k = 0))   □

[In the original, each formula is drawn with arrows linking every bound identifier to the quantifier that binds it.]
The scope mechanism being employed here is similar to the ALGOL
60 scope mechanism (which is also used in Pascal and PL/I). Actually,
its use in the predicate calculus came first. A phrase (A i: R: E) intro-
duces a new level of nomenclature, much like a procedure declaration
"proc p(i); begin ... end" does. Inside the phrase, one can refer to all
variables used outside, except for i; these are global identifiers of the
phrase. The part "A i" is a "declaration" of a new local identifier i.
As in ALGOL 60, the name of a local identifier has no significance
and can be changed systematically without destroying the meaning. But
care must be taken to "declare" bound identifiers in the right place to get
the intended meaning.
Section 4.4 Textual Substitution
We have

E:          x < y ∧ (A i: 0 ≤ i < n: b[i] < y)
E^y_{y-i}:  x < y-i ∧ (A i: 0 ≤ i < n: b[i] < y-i).

But this is not the desired predicate, because the i in y-i has become
bound to the quantifier A, since it now occurs within the scope of A.
Care must be taken to avoid such "capturing" of an identifier in the
expression being substituted. To avoid this conflict we can call for first
(automatically) replacing identifier i of E by a fresh identifier k (say), so
that we arrive at
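The renaming scheme just described can be sketched concretely. Below, expressions are strings (identifiers) or nested tuples, with ('A', i, body) standing for a quantification; this representation and all names are ours, purely for illustration:

```python
import itertools

def free_ids(e):
    # e is an identifier (str), a quantification ('A', i, body),
    # or a tuple of sub-parts
    if isinstance(e, str):
        return {e} if e.isidentifier() else set()
    if e[0] == 'A':
        return free_ids(e[2]) - {e[1]}
    return set().union(*(free_ids(p) for p in e))

def subst(e, x, v):
    # e with x replaced by v, renaming a bound identifier first
    # whenever it would capture an identifier free in v
    if isinstance(e, str):
        return v if e == x else e
    if e[0] == 'A':
        i, body = e[1], e[2]
        if i == x:                 # x is bound here: nothing to replace
            return e
        if i in free_ids(v):       # rename i to a fresh identifier
            fresh = next(k for k in ('k%d' % n for n in itertools.count())
                         if k not in free_ids(body) | free_ids(v))
            body, i = subst(body, i, fresh), fresh
        return ('A', i, subst(body, x, v))
    return tuple(subst(p, x, v) for p in e)

# E: x < y /\ (A i: b[i] < y); substitute y-i for y
E = (('x', '<', 'y'), 'and', ('A', 'i', (('b[', 'i', ']'), '<', 'y')))
r = subst(E, 'y', ('y', '-', 'i'))
assert r[2][1] != 'i'          # the bound i was renamed: no capture
assert 'i' in free_ids(r)      # the i of y-i stays free
```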
The following two lemmas are stated without proof, for they are fairly
obvious:
(4.4.7)   □
Simultaneous substitution
Let x̄ denote a list (vector) of distinct identifiers:
(4.4.9)  E^x̄_ē, or

a+b+a+b+c
x+y+x+y+z

The second example illustrates the fact that the substitutions must be
simultaneous; if one first replaces all occurrences of x and then replaces
all occurrences of y, the result is x+z+x+z+z, which is not the same.
In general, E^{x,y}_{u,v} can be different from (E^x_u)^y_v.
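The difference between simultaneous and sequential replacement can be made concrete on token lists; the example expression below is our own:

```python
def simultaneous(tokens, sub):
    # replace every identifier in one pass, so replacements
    # never feed into each other
    return [sub.get(t, t) for t in tokens]

def sequential(tokens, pairs):
    # replace one identifier at a time; a later rule can rewrite
    # the results of an earlier one
    for old, new in pairs:
        tokens = [new if t == old else t for t in tokens]
    return tokens

E = ['x', '+', 'y', '+', 'z']
assert simultaneous(E, {'x': 'y', 'y': 'z'}) == ['y', '+', 'z', '+', 'z']
assert sequential(E, [('x', 'y'), ('y', 'z')]) == ['z', '+', 'z', '+', 'z']
```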
3. Consider the predicate E = (A i: 1 ≤ i < n: (E j: b[j] = i)). Indicate which of
the following textual substitutions are invalid and perform the valid ones.
(4.5.1) (E i: R: E) or
(4.5.2) (A i: R: E),
Example 1. Let Person(p) represent the sentence "p is a person". Let
Mortal(x) represent the sentence "x is mortal". Then the sentence "All
men are mortal", or, less poetically but more in keeping with the times,
"All persons are mortal", can be expressed by (A p: Person(p):
Mortal(p)). □
Example 2. It has been proved that arbitrarily large primes exist. This
theorem can be stated as follows:
(A n: max(n, -n) = abs(n))
since the context indicated that only integers were under consideration.
or, as an abbreviation,
                   R ⇒ E
(4.5.4) A-I:  ---------------    where i is a fresh identifier.
                (A i: R: E)

                (A i: R: E)
(4.5.5) A-E:  ---------------    for any expression e.
              R^i_e ⇒ E^i_e
Let us now turn to the inference rules for E. Using the techniques of
earlier sections, E can be defined in terms of A:

                (A i: R: E)
(4.5.6) E-I:  ----------------
              ¬(E i: R: ¬E)
                (E i: R: E)
(4.5.7) E-E:  ----------------
              ¬(A i: R: ¬E)

                                           (E i: R: E)
(4.5.8) bound-variable substitution:  ----------------------
                                       (E k: R^i_k: E^i_k)
(provided k does not appear free in R and E)
(s; x:v)
s = (s; x: s(x))

We now give three simple lemmas dealing with textual substitution. For-
mal proofs would rely heavily on the caveats given on textual substitution
in definition (4.4.6), and would be based on the structure of the expres-
sions involved. We give informal proofs.
(Lemma 4.6.1) □

(b; i:e)[j] =   i = j → e
                i ≠ j → b[j]   □
Notice the similarity between the notation (s; x:v) used in section 4.6
to denote a modified state s and the notation (b; i:e) to denote a modi-
fied array b.
Example 2 illustrates nested use of the notation. Since (b; 0:8) is the
array (function) (8,4,6), it can be used in the first position of the nota-
tion. Nested parentheses do become burdensome, so we drop them and
rely instead on the convention that rightmost pairs "i:e" are dominant
and have precedence. Thus the last line of example 2 is equivalent to
(b; 0:8; 2:9; 0:7).
b := (b; i:e)
Simplifying expressions
It is sometimes necessary to simplify expressions (including predicates)
containing the new notation. This can often be done using a two-case
analysis as shown below, which is motivated by definition (5.1.2). The
first step is the hardest, so let us briefly explain it. First, note that either
i = j or i ≠ j. In the former case (b; i:5)[j] = 5 reduces to 5 = 5; in the
second case it reduces to b[j] = 5.

   (b; i:5)[j] = 5
=  (i = j ∧ 5 = 5) ∨ (i ≠ j ∧ b[j] = 5)     (Def. of (b; i:5))
=  (i = j) ∨ (i ≠ j ∧ b[j] = 5)             ((5 = 5) = T, and-simpl.)
=  (i = j ∨ i ≠ j) ∧ (i = j ∨ b[j] = 5)     (Distributivity)
=  T ∧ (i = j ∨ b[j] = 5)                   (Excluded middle)
=  i = j ∨ b[j] = 5                         (and-simpl.)
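The equivalence just derived can be tested against a functional view of (b; i:e); a sketch with dict-based arrays (ours):

```python
def override(b, i, e):
    # (b; i:e): the same function as b except that it maps i to e
    c = dict(b)
    c[i] = e
    return c

b = {0: 3, 1: 5, 2: 7}
for i in b:
    for j in b:
        # (b; i:5)[j] = 5   is equivalent to   i = j \/ b[j] = 5
        assert (override(b, i, 5)[j] == 5) == (i == j or b[j] == 5)
```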
define a type t and two variables p and q with type t. Each variable contains
two fields; the first is named n and can contain a string of 0 to 10 characters
(e.g. a person's name) and the second is named age and can contain an
integer. The following assignments indicate how the components of p and q can
be assigned and referenced. After their execution, both p and q contain 'Hehner'
in the first component and 32 in the second. Note how q.age refers to field age
of record variable q.
An array consists of a set of individual values, all of the same type (the old
view). A record consists of a set of individual values, which can be of different
types. In order to allow components to have different types we have sacrificed
some flexibility: components must be referenced using their names (instead of an
expression). Nevertheless, arrays and records are similar.
Develop a functional view for records, similar to the functional view for arrays
just presented.
Section 5.2 Array Sections and Pictures
b[0:n-1] denotes the whole array, while if 0 ≤ i ≤ j < n, b[i:j] refers to
the array section composed of b[i], b[i+1], ..., b[j]. If i = j+1, b[i:j]
refers to an empty section of b.
Quite often, we have to assert something like "all elements of array b
are less than x", or "array b contains only zeroes". These might be writ-
ten as follows.

(A i: 0 ≤ i < n: b[i] < x)
(A i: 0 ≤ i < n: b[i] = 0)

Because such assertions occur so frequently, we abbreviate them; these
two assertions would be written as b < x and b = 0, respectively. That is,
the relational operators denote element-wise comparison when applied to
arrays. Here are some more examples, using arrays b[0:n-1] and
c[0:n-1] and simple variable x.
Be very careful with = and ≠, for the last example shows that b = y can
be different from ¬(b ≠ y)! Similarly, b ≤ y can be different from
¬(b > y).
We also use the notation x ∈ b to assert that the value of x is equal to
(at least) one of the values b[i]. Thus, using domain(b) to represent the
set of subscript values for b, x ∈ b is equivalent to

(E i: i ∈ domain(b): x = b[i])
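The warning about = and ≠ on arrays is easy to demonstrate; a sketch with the element-wise operators written out as functions (ours):

```python
def arr_eq(b, y):
    # b = y: every element of b equals y
    return all(v == y for v in b)

def arr_ne(b, y):
    # b # y: every element of b differs from y
    return all(v != y for v in b)

b = [1, 2, 1]
# b = 2 is false, but so is b # 2: the two are not complementary
assert arr_eq(b, 2) == False
assert arr_ne(b, 2) == False
assert arr_eq(b, 2) != (not arr_ne(b, 2))
```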
Such abbreviations can make program specification -and understand-
ing the specification later- easier. However, when developing a program
Array pictures
Let us now turn to a slightly different subject, using pictures for some
predicates that describe arrays. Suppose we are writing a program to sort
an array b[0:n-1], with initial values B[0:n-1] (i.e. initially b = B).
We want to describe the following conditions:
(1) b[0:k-1] is sorted and all its elements are at most x,
(2) the value that belongs in b[k] is in simple variable x,
(3) every value in b[k+1:n-1] is at least x.
where
ordered(b[0:k-1]) = (A i: 0 ≤ i < k-1: b[i] ≤ b[i+1])
[Array pictures, garbled in this copy. Rendered linearly they assert:

conditions (1)-(3): b[0:k-1] ordered and ≤ x; b[k] to receive x; b[k+1:n-1] ≥ x

(a) 0 ≤ k ≤ h ≤ n ∧ b[0:k-1] ≤ x ∧ b[k:h-1] = x ∧ b[h:n-1] ≥ x
(b) 0 ≤ i < n ∧ b[0:i-1] ordered]
defines an array of arrays. That is, b[0] (and similarly b[1]) is an array
consisting of three elements named b[0][1], b[0][2] and b[0][3]. One can
also have an "array of arrays of arrays", in which case three subscripts
could be used -e.g. d[i][j][k]- and so forth.
Arrays of arrays take the place of two-dimensional arrays in FOR-
TRAN and PL/I. For example, (5.3.1) could be thought of as equivalent
to the PL/I declaration
We want to define the notation (b; s:e) for any selector s. We do this
recursively on the length of s. The first step is to determine the base case,
(b; ε:e), where ε is the null selector.
Let x be a simple variable (which contains a scalar or function). Since
x and x ∘ ε are equivalent, the assignments x:= e and x ∘ ε:= e are also
equivalent, which leads to the definitions

    (x; ε:e) = e
    (b; ε:g) = g

    (b; [i]∘s : e)[j] =  b[j]          if i ≠ j
                         (b[j]; s:e)   if i = j
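The recursive definition just given can be sketched in Python (an illustration only: selectors are modeled here as tuples of subscripts, with the empty tuple playing the role of the null selector ε, and 0-origin subscripts are used for simplicity):

```python
# A sketch of the function view (b; s:e): the object equal to b except
# that the element selected by selector s contains e.

def update(b, s, e):
    """Return (b; s:e) for a nested-list b and selector s (a tuple)."""
    if s == ():                       # base case: (b; eps:e) = e
        return e
    i, rest = s[0], s[1:]
    # (b; [i] o s : e)[j] = b[j] if i != j, else (b[j]; s:e)
    return [update(b[j], rest, e) if i == j else b[j]
            for j in range(len(b))]

b = [[0, 1, 2], [3, 4, 5]]
c = update(b, (1, 2), 9)              # (b; [1][2]:9)
print(c)                              # [[0, 1, 2], [3, 4, 9]]
print(b)                              # [[0, 1, 2], [3, 4, 5]]: b unchanged
```

Note that, true to the functional view, update builds a new object and leaves b itself untouched.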
Example 2. In this and the following examples, let c[1:3] = (6,7,8) and
b[0:1][1:3] = ((0,1,2), (3,4,5)). Then
Again, all but the outer parentheses can be omitted. For example, the
following two expressions are equivalent. They define an array (function)
that is the same as b except in three positions -[i][j], [j] and [k][i].
Modify the notation of this section to allow references to subrecords of arrays and
subarrays of records, etc.
Chapter 6
Using Assertions To Document Programs
(6.1.1) Store in z the product a*b, assuming a and b are initially ≥ 0.
does not indicate where the result of the multiplication should be stored,
and hence it cannot be understood in isolation, as it should be.
English can be ambiguous, so we often rely on more formal specifica-
tion techniques. The notation
(6.1.2) {Q}S{R}
The precondition of the program is given, the fixed variables, which must
not be changed, are listed and the postcondition is to be established.
Here are some more examples of specifications (all variables are
integer valued).
Example 1 (array summation). Given are fixed n ≥ 0 and fixed array
b[0:n-1]. Establish
R: s = (Σ i: 0 ≤ i < n: b[i]). □
Example 3 (sorting). Given fixed n ≥ 0 and array b[0:n-1], sort b, i.e.
establish
Again, there is a problem with this specification; the result can be esta-
blished simply by setting all elements of b to zeroes. This problem can be
overcome by including a comment to the effect that the only way to alter
b is to swap two of its elements.
Naturally, with large, complex problems there may be difficulty in
specifying programs in this simple manner, and new notation may have to
be introduced to cope with the complexity. But for the most part, the
simple specification forms given above will suffice. Even a compiler can
be specified in such a notation, by judicious use of abstraction:
{Pascal program(p)}
compiler
{IBM 370 program(q) ∧ equivalent(p, q)}
where the predicates Pascal program, IBM 370 program and equivalent
must be defined elsewhere.
(6.2.2)  (A X, Y: {x = X ∧ y = Y} swap {x = Y ∧ y = X})
(6.2.2) can be read in English as follows: for all (integer) values of X and
Y, if initially x = X and y = Y, then execution of swap establishes x = Y
and y =X.
X and Y denote the initial values of variables x and y, but they also
denote the final values of y and x. An identifier can denote either an ini-
tial or a final value, or even a value upon which the initial or final value
depends. For example, the following is also a specification of swap,
although it is not as easy to understand:
Generally, we will use capital letters in identifiers that represent initial and
final values of program variables, and small letters for identifiers that
name variables in a program.
As a final example, we specify a sort program again, this time using an
extra identifier to alleviate the problem mentioned in example 3 of section
6.1. The predicate perm (c, C) has the meaning "array c is a permutation
of array C, i.e. a rearrangement of C". See exercise 5 of section 4.2.
Section 6.3 Proof Outlines 103
{x = X ∧ y = Y}
t:= x;
{t = X ∧ x = X ∧ y = Y}
x:= y;
{t = X ∧ x = Y ∧ y = Y}
y:= t
{y = X ∧ x = Y}
The reader can informally verify that, for each statement of the program:
if its precondition -the predicate in braces preceding it- is true, then
execution of the statement terminates with its postcondition -the predi-
cate in braces following it- true.
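Such a proof outline can itself be run: each assertion in braces becomes an executable assert. The Python sketch below (an illustration only; X and Y name the arbitrary initial values) checks the swap outline above over a grid of initial states:

```python
# Checking the proof outline for swap mechanically: each assertion in
# braces becomes an assert, executed between the statements.

def swap_outline(x, y):
    X, Y = x, y                  # name the initial values
    assert x == X and y == Y     # {x = X and y = Y}
    t = x
    assert t == X and x == X and y == Y
    x = y
    assert t == X and x == Y and y == Y
    y = t
    assert y == X and x == Y     # {y = X and x = Y}
    return x, y

for X in range(-3, 4):
    for Y in range(-3, 4):
        assert swap_outline(X, Y) == (Y, X)
print("outline verified for all tested initial states")
```

Running the assertions does not replace a proof, but it catches any outline whose intermediate assertions are wrong.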
A predicate placed in a program is called an assertion; we assert it is
true at that point of execution. A program together with an assertion
between each pair of statements is called a proof outline, because it is just
that; it is an outline of a formal proof, and one can understand that the
program satisfies its specification simply by showing that each triple
(precondition, statement, postcondition) satisfies {precondition} statement
{postcondition}. The formal proof method is described in Part II.
Placing assertions in a program for purposes of documentation is often
called annotating the program, and the final program is also called an
annotated program.
Below is a proof outline for
The proof outline illustrates two new conventions. First, an assertion can
be named so that it can be discussed more easily, by placing the name at
its beginning followed by a colon. Secondly, adjacent assertions -e.g.
{P} {P1}- mean that the first implies the second -e.g. P ⇒ P1. The
lines have been numbered solely for reference in a later discussion.
1. P ⇒ P1   (lines 1, 2)
2. {P1} i:= i+1 {P2}   (lines 2, 3, 4)
3. P2 ⇒ P3   (lines 4, 5)
4. {P3} s:= s+i {R}   (lines 5, 6, 7)
Together, these give the desired result: execution of i:= i+1; s:= s+i
begun in a state satisfying P terminates in a state satisfying R.
The next example illustrates the use of a conditional statement. Note
how the assertion following then is the conjunction of the precondition of
the conditional statement and the test, since this is what is true at that
point of execution. Since both the then-part and the else-part end with
the assertion x =abs(X), this is what we may conclude about execution
of the conditional statement.
{x = X}
if x < 0 then {x = X ∧ x < 0}
              x:= -x
              {x = -X ∧ x > 0} {x = abs(X)}
         else {x = X ∧ x ≥ 0}
              skip
              {x = X ∧ x ≥ 0} {x = abs(X)}
{x = abs(X)}
(7.1) the set of all states such that execution of S begun in any one of
them is guaranteed to terminate in a finite amount of time in a
state satisfying R. □
Let's give some examples for some ALGOL-like commands, based on our
knowledge of how these commands are executed.
In section 6.1, we used the notation {Q} S {R} to mean that execution
of S begun in any state satisfying predicate Q would terminate in a state
satisfying predicate R. In this context, Q is called the precondition and
R the postcondition of S. Similarly, we call wp(S, R) the weakest
precondition of S with respect to R, since it represents the set of all
states such that execution begun in any one of them will terminate with R
true. (See section 1.6 for a definition of weaker and weakest in this con-
text.) We see, then, that the notation {Q} S {R} is simply another nota-
tion for

    Q ⇒ wp(S, R).
Remark: The notation Q {S} R was first used in 1969 (see chapter 23) to
denote partial correctness. It has the interpretation: if execution of S
begins in a state satisfying Q, and if execution terminates, then the final
state satisfies R. Under this interpretation, for example, the following
holds:
110 Part II. The Semantics of a Small Language
T {while T do skip} T
Some properties of wp
If we are to define a programming notation using the concept of wp,
then we had better be sure that wp is well-behaved. By this we mean that
we should be able to define reasonable, implementable commands using
wp. Furthermore, it would be nice if unimplementable commands would
be rejected from consideration. Let us therefore analyze our interpreta-
tion (7.1) of wp(S,R), and see whether any properties can be derived
from it.
First, consider the predicate wp (S , F) (for any command S). This
describes the set of states such that execution of S begun in any one of
them is guaranteed to terminate in a state satisfying F. But no state ever
satisfies F, because F represents the empty set. Hence there could not
possibly be a state in wp(S, F), and we have our first property, the Law
of the Excluded Miracle:

    (7.3) wp(S, F) = F
Let us see why (7.4) is a tautology. First, consider any state s that satis-
fies the left hand side (LHS) of (7.4). Execution of S begun in s will ter-
minate with both Q and R true. Hence Q ∧ R will also be true, and s is
in wp(S, Q ∧ R). This shows that LHS ⇒ RHS. Next, suppose s is in
wp(S, Q ∧ R). Then execution of S begun in s is guaranteed to ter-
minate in some state s' of Q ∧ R. Any such s' must be in Q and in R, so
Chapter 7 The Predicate Transformer wp 111
that s is in wp(S, Q) and in wp(S, R). This shows that RHS ⇒ LHS.
Together with LHS ⇒ RHS, this yields RHS = LHS.
We have thus shown that (7.3) and (7.4) hold. The arguments were
based solely on the informal interpretation (7.1) that we wanted to give to
the notation wp (S, R). We now take them as basic axioms, and use
them as we do other axioms and laws of the predicate calculus. Using
them, we can prove two other useful laws; their proofs are left as exer-
cises.
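For commands over a small finite state space, wp can be computed by plain enumeration and the laws checked directly. In the Python sketch below (our own modeling, not the book's formalism: a command maps a state to its set of possible final states, or None when termination is not guaranteed), both (7.3) and (7.4) hold:

```python
# wp over a tiny finite state space, by enumeration.  A predicate is a
# set of states; a command maps a state to the set of possible final
# states, or None if termination is not guaranteed from that state.

STATES = range(-5, 6)            # states are just values of one variable i

def wp(S, R):
    """States from which every execution of S terminates in R."""
    return {s for s in STATES
            if S(s) is not None and all(t in R for t in S(s))}

# i := i+1, treated as not terminating normally at the upper boundary
inc = lambda s: {s + 1} if s + 1 in STATES else None

F = set()                                     # the predicate F: no states
Q = {s for s in STATES if s > 0}              # i > 0
R = {s for s in STATES if s % 2 == 0}         # even(i)

assert wp(inc, F) == F                        # (7.3): wp(S, F) = F
assert wp(inc, Q) & wp(inc, R) == wp(inc, Q & R)   # (7.4): distributivity
print("laws (7.3) and (7.4) hold on this state space")
```

The enumeration makes the informal argument concrete: no state can guarantee landing in the empty predicate, and a state guarantees Q ∧ R exactly when it guarantees each conjunct.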
But the coin is guaranteed to land with either a head or a tail up, so that
     S                       R
(a)  i:= i+1                 i > 0
(b)  i:= i+2; j:= j-2        i+j = 0
(c)  i:= i+1; j:= j-1        i*j = 0
(d)  z:= z*j; i:= i-1        z*j^i = c
(e)  a[i]:= 1                a[i] = a[j]
(f)  a[a[i]]:= i             a[i] = i
3. Prove (7.5) and (7.6). Don't rely on the notion of execution and interpretation
(7.1); prove them only from (7.4) and the laws of predicate calculus.
4. Prove using (7.4) that (wp(S, R) ∧ wp(S, ¬R)) = F.
5. Give an example to show that the following is not true for all states:
(wp(S, R) ∨ wp(S, ¬R)) = T.
6. Show that (7.7) holds for deterministic S. (It cannot be proved from axioms
(7.3)-(7.4); it must be argued based on the definitions of determinism and wp, as
was done for (7.3) and (7.4).)
7. Suppose Q ⇒ wp(S, R) has been proven for particular Q, R and S.
Analyze fully the statement
Exercises for Chapter 7 113
(Is it true in general; if not, what restrictions must be made so that it holds for
"reasonable" classes of predicates Q, R and commands S, etc.) Hint: be careful
to consider the case where x appears in S. You may want to answer the question
under the ground rule that the appearance of x in S means that (7.8) is invalid,
and that the quantified identifier x should be changed before proceeding. It is
also instructive, however, to answer this question without using this ground rule.
See section 4.3.
8. Suppose Q ⇒ wp(S, R) has been proven for particular Q, R and S.
Analyze fully the statement
That is, it doesn't matter whether one thinks of S1; S2; S3 as S1 com-
posed with S2; S3 or as S1; S2 composed with S3, and it is all right to
leave the parentheses out. (Similarly, because addition is associative,
a +b +c is well-defined because a +(b +c) yields the same result as
(a+b)+c.)
Be aware of the role of the semicolon; it is used to combine adjacent,
independent commands into a single command, much the way it is used in
English to combine independent clauses. (For an example of its use in
English, see the previous sentence.) It can be thought of as an operator
that combines, just as catenation is used in Pascal and PL/I to combine
two strings of characters. Once this is understood, there should be no
confusion about where to put a semicolon.
Our use of the semicolon conforms not only to English usage, but also
to its original use in the first programming notation that contained it,
ALGOL 60. It is a pity that the designers of PL/I and Ada saw fit to go
against convention and use the semicolon as a statement terminator, for it
has caused great confusion.
Thus far, we don't have much of a programming notation -about all
we can write is a sequence of skips and aborts. In the next chapter we
define the assignment command. Before reading ahead, though, perform
some of the exercises in order to get a firm grasp of this (still simple)
material.
where
Predicate domain (e) will not be formally defined, since expressions e are
not. However, it must exclude all states in which evaluation of e would
be undefined -e.g. because of division by zero or subscript out of range.
This example required explicit use of the term domain (e) of definition
(9.1.1). 0
Thus, x will contain the value b[i] upon termination iff i is a valid sub-
script for array b. □
Section 9.1 Assignment to Simple Variables 119
(9.1.4)  {y = X ∧ x = Y}          {y = X ∧ x = Y}
         t:= x;                   t:= x;
         {y = X ∧ t = Y}          x:= y;
         x:= y;                   y:= t
         {x = X ∧ t = Y}          {x = X ∧ y = Y}
         y:= t
         {x = X ∧ y = Y}
     S                          R
(a)  x:= 2*y+3                  x = 13
(b)  x:= x+y                    x < 2*y
(c)  j:= j+1                    0 < j ∧ (A i: 0 ≤ i ≤ j: b[i] = 5)
(d)  all5:= (b[j] = 5)          all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(e)  all5:= all5 ∧ (b[j] = 5)   all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(f)  x:= x*y                    x*y = c
(g)  x:= (x-y)*(x+y)            x + y^2 ≤ 0
2. Prove that definition (9.1.3) satisfies laws (7.3), (7.4) and (7.7). The latter
shows that assignment is deterministic.
3. Review section 4.6 (Some theorems about textual substitution). Let s be the
machine state before execution of x:= e and let s' be the final state. Describe s
and s' in terms of how x:= e is executed. (What, for example, should be the
value in x upon termination?) Then show that for any predicate R, s'(R) is true
iff s(R^x_e) is true. Finally, argue that this last fact shows that the definition of
assignment is consistent with our operational view of assignment.
4. One can write a "forward rule" for assignment, which from a precondition
derives the strongest postcondition sp(Q, "x:= e") such that execution of x:= e
with Q true leaves sp (Q, "x:= e") true (in the definition below, v represents the
initial value of x):
Show that this definition is also consistent with our model of execution. One way
to do this is to show that execution of x:= e with Q true is guaranteed to ter-
minate with sp (Q, "x:= e") true:
where the xi are distinct simple variables and the ei are expressions. For
purposes of explanation the assignment is abbreviated as x̄:= ē. That is,
any identifier with a bar over it represents a vector (of appropriate
length).
The multiple assignment command can be executed as follows. First
evaluate the expressions, in any order, to yield values v1, ..., vn. Then
assign v1 to x1, v2 to x2, ..., vn to xn, in that order. (Because the xi are
distinct, the order of assignment doesn't matter. However, a later general-
ization will require left-to-right assignment.)
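This evaluate-everything-first-then-assign behavior is exactly what Python's tuple assignment provides, which makes a convenient executable illustration:

```python
# Python's tuple assignment mirrors the multiple assignment command:
# all right-hand expressions are evaluated first, then assigned.

i, s = 1, 10
i, s = i + 1, s + i        # uses the OLD i on the right
print(i, s)                # 2 11

# Sequential assignment is different:
i, s = 1, 10
i = i + 1
s = s + i                  # uses the NEW i
print(i, s)                # 2 12
```

The two outcomes differ because the multiple assignment reads all its right-hand sides in the initial state, while the sequential version reads s + i after i has already changed.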
The multiple assignment is useful because it easily describes a state
change involving more than one variable. Its formal definition is a simple
extension of assignment to one variable:
where domain(e) describes the set of states in which all the expressions in
the vector e can be evaluated:
[Two array pictures, garbled in this copy: array b drawn with section
boundaries m and i+p, one under the condition i < m < i+p and one
under the condition i = m+1 ≤ i+p.]
We have:
Exercises for Section 9.2 123
m+1+x = i+p
3. Determine and simplify wp(S, R) for the pairs (S, R) given below.
     S                          R
(a)  z, x, y:= 1, c, d          z*x^y = c^d
(b)  i, s:= 1, b[0]             1 ≤ i < n ∧ s = b[0] + ... + b[i-1]
(c)  a, n:= 0, 1                a^2 < n ∧ (a+1)^2 ≥ n
(d)  i, s:= i+1, s+b[i]         0 < i < n ∧ s = b[0] + ... + b[i-1]
(e)  i:= i+1; j:= j+i           i = j
(f)  j:= j+i; i:= i+1           i = j
(g)  i, j:= i+1, j+i            i = j
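Parts (e), (f) and (g) of the exercise reward a brute-force check: running each command from every state in a small grid and collecting the states where the postcondition i = j holds afterward approximates the three weakest preconditions. A Python sketch (an illustration; commands modeled as functions on (i, j) pairs):

```python
# Brute-force comparison of the weakest preconditions in (e), (f), (g).

GRID = [(i, j) for i in range(-4, 5) for j in range(-4, 5)]

def wp_set(cmd):
    """States in GRID from which cmd establishes i = j."""
    result = set()
    for s in GRID:
        i2, j2 = cmd(*s)
        if i2 == j2:
            result.add(s)
    return result

e = lambda i, j: (i + 1, j + (i + 1))     # i:= i+1; j:= j+i
f = lambda i, j: (i + 1, j + i)           # j:= j+i; i:= i+1
g = lambda i, j: (i + 1, j + i)           # i, j:= i+1, j+i (simultaneous)

we, wf, wg = wp_set(e), wp_set(f), wp_set(g)
print(we == wf)        # False: the order of the two assignments matters
print(wf == wg)        # True: (f) agrees with the simultaneous (g) here,
                       # since j is updated before i changes
```

Working the substitutions by hand confirms this: (e) yields the precondition j = 0, while (f) and (g) both yield j = 1.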
since both change b to represent the function (b; i:e). But (9.3.1) is an
assignment to a simple variable. Since assignment to a simple variable is
already defined in (9.1.1), so is assignment to a subscripted variable! We
have, using definition (9.1.1),
Section 9.3 Assignment to an Array Element 125
(9.3.3)   wp("b[i]:= e", R)  =  R^b_(b; i:e)
Remark: The notation (b; i:e) is used in defining assignment to array ele-
ments and in reasoning about programs, but not in programs. For tradi-
tional reasons, the assignment command is still written as b[i]:= e. □
performed here was explained at the end of section 5.1, so reread that
part if you are having trouble with it. 0
Example 4. Assume n > 1. Let ordered(b[1:n]) mean that the elements
of b are in ascending order. Then

   wp("b[n]:= x", ordered(b[1:n]))
=  (ordered(b[1:n]))^b_(b; n:x)          (Definition)
=  ordered((b; n:x)[1:n])                (Textual substitution)
=  ordered(b[1:n-1]) ∧ b[n-1] ≤ x        (Definition of ordered)
b[i], b[j]:= b[j], b[i].
(9.4.2) x 0 S:= e.
Note that a simple assignment x:= e has form (9.4.1) -with n = 1 and
s1 = ε- since it is the same as x ∘ ε:= e. Also, the assignment b[i]:= e
has this form, with n = 1, x1 = b, s1 = [i] and e1 = e.
The multiple assignment can be executed in a manner consistent with
the formal definition given below as follows:
To get some idea for the predicate transformer, let's look at the definition
of multiple assignment to simple variables:
The difficulty with (9.4.4) is that textual substitution is defined only for
identifiers, and so R^(x̄∘s̄)_ē is as yet undefined. We now generalize the
notion of textual substitution to include the new case by describing how
to massage R^(x̄∘s̄)_ē into the form of a conventional textual substitution.
The generalization will be done so that the manner of execution given in
(9.4.3), including the left-to-right order of assignment, will be consistent
with definition (9.4.4).
To motivate the generalization, consider the assignment
Why? Suppose two of the selectors si and sj (say), where i < j, are the
same. Then, after execution of (9.4.5), the value of ej (and not of ei) will
be in b ∘ sj, and thereafter a reference b ∘ si should yield ej. But this is
exactly the case with execution of (9.4.6); the left-to-right order of assign-
ment during execution of (9.4.5) is reflected in the right-to-left precedence
rule for applying function (b; s1:e1; ...; sm:em) to an argument.
Secondly, note that for distinct identifiers b and c and selectors s and
t (which need not be distinct) the assignments b ∘ s, c ∘ t := e, g and
c ∘ t, b ∘ s := g, e should have the same effect. This is because b ∘ s
and c ∘ t refer to different parts of computer memory, and what is
assigned to one cannot affect what is assigned to the other. (Remember,
expressions e and g are evaluated before any assignments are made.)
This leads us to the following
provided that identifier b does not begin any of the xi. This rule
indicates how multiple assignments to subparts of an object b can
be viewed as a single assignment to b. □
Note that the swap performs correctly when i = j, since this case is
automatically included in the above derivation. If this derivation seems
too fast for you, reread section 5.1. 0
The last line follows because if k ≠ i and k ≠ j then (b; i:b[j]; j:b[i])[k]
= b[k]. The only array values changed by the swap are b[i] and
b[j]. □
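The view of the swap as the single assignment b := (b; i:b[j]; j:b[i]) can be sketched in Python (an illustration only; the later pairs overwriting earlier ones models the right-to-left precedence rule described above):

```python
# The swap b[i], b[j] := b[j], b[i] viewed as one assignment
# b := (b; i:b[j]; j:b[i]) on a flat array.

def upd(b, *pairs):
    """(b; i1:e1; ...; im:em): later pairs take precedence."""
    c = list(b)
    for i, e in pairs:        # applied left to right, so later wins
        c[i] = e
    return c

def swap(b, i, j):
    # both b[j] and b[i] are read from the ORIGINAL b
    return upd(b, (i, b[j]), (j, b[i]))

b = [10, 20, 30]
print(swap(b, 0, 2))          # [30, 20, 10]
print(swap(b, 1, 1))          # [10, 20, 30]: the case i = j works too
```

As the derivation above showed, no special case is needed for i = j: both updates then write the same value to the same position.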
(a) R^(b[i], b[j], x)_(e, f, g)
(b) R^(b[i], x, b[j])_(e, f, g)
4. Derive a definition for a general multiple assignment command that can include
assignments to simple variables, array elements and Pascal record fields. (See
exercise 1 of section 5.3.)
5. Prove that lemma 4.6.3 holds for the extended definition of textual substitution:
Lemma. Suppose each xi of list x̄ has the form identifier ∘ selector and
suppose ū is a list of fresh, distinct identifiers. Then
Chapter 10
The Alternative Command
(10.1)  if x ≥ 0 → z:= x
        □ x ≤ 0 → z:= -x
        fi

or, written on one line, if x ≥ 0 → z:= x □ x ≤ 0 → z:= -x fi.
(10.2)  if B1 → S1
        □ B2 → S2
        □ ...
        □ Bn → Sn
        fi
Typically, we assume that the guards are total functions -i.e. are well-
defined in all states. This allows us to simplify the definition by deleting
the first conjunct. Thus, with the aid of quantifiers we rewrite the defini-
tion in (10.3b) below. From now on, we will use (10.3b) as the definition,
but be sure the guards are well-defined in the states in which the alterna-
tive command will be executed!
   wp((10.1), z = abs(x))
=  (x ≥ 0 ∨ x ≤ 0) ∧                           (this is BB ∧
   (x ≥ 0 ⇒ wp("z:= x", z = abs(x))) ∧          B1 ⇒ wp(S1, R) ∧
   (x ≤ 0 ⇒ wp("z:= -x", z = abs(x)))           B2 ⇒ wp(S2, R))
=  T ∧ (x ≥ 0 ⇒ x = abs(x)) ∧
       (x ≤ 0 ⇒ -x = abs(x))
=  T ∧ T ∧ T
=  T □
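Execution of the alternative command is easy to simulate: collect the commands whose guards are true, abort if there are none, and otherwise execute any one of them. A Python sketch (our own modeling; states are dictionaries, and random choice stands in for nondeterminism):

```python
import random

# A sketch of alternative-command execution.

def IF(state, guarded_commands):
    enabled = [cmd for guard, cmd in guarded_commands if guard(state)]
    if not enabled:
        raise RuntimeError("abort: no guard true")
    return random.choice(enabled)(state)   # nondeterministic choice

# (10.1): if x >= 0 -> z:= x  []  x <= 0 -> z:= -x  fi
prog = [(lambda s: s["x"] >= 0, lambda s: {**s, "z": s["x"]}),
        (lambda s: s["x"] <= 0, lambda s: {**s, "z": -s["x"]})]

for x in range(-3, 4):
    final = IF({"x": x}, prog)
    assert final["z"] == abs(x)       # R: z = abs(x) in every final state
print("z = abs(x) established for all tested x")
```

When x = 0 both guards are true and either command may be chosen; as the calculation above showed, both choices establish R.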
we calculate:
Hence we see that array b should not contain the value 0, and that the
definition of p as the number of values greater than zero in b[0:i-1] will
be true after execution of the alternative command if it is true before. □
The reader may feel that there was too much work in proving what we
did in example 2. After all, the result can be obtained in an intuitive
manner, and perhaps fairly easily (although one is likely to overlook the
problem with zero elements in array b). At this point, it is important to
practice such formal manipulations; doing so yields better understanding
of the theory and of the alternative command itself.
Its counterpart in ALGOL, if x < 0 then z:= -x, has the default that if
x ≥ 0 execution is equivalent to execution of skip. Although a program
may be a bit longer because of the lack of a default, there are advantages.
The explicit appearance of each guard does aid the reader; each alterna-
tive is given in full detail, leaving less chance of overlooking something.
More importantly, the lack of a default helps during program develop-
ment. Upon deriving a possible alternative command, the programmer is
forced to derive the conditions under which its execution will perform
satisfactorily and, moreover, is forced to continue deriving alternatives
until at least one is true in each possible initial state. This point will
become clearer in Part III.
The absence of defaults introduces, in a reasonable manner, the possi-
bility of nondeterminism. Suppose x = 0 when execution of command
(10.1) begins. Then, since both guards x ≥ 0 and x ≤ 0 are true, either
command may be executed (but only one of them). The choice is entirely
up to the executor -for example it could be a random choice, or on days
with odd dates it could be the first and on days with even dates it could
be the second, or it could be chosen to minimize execution time. The
point is that, since execution of either one leads to a correct result, the
programmer should not have to worry about which one is executed. He is
free to derive as many alternative commands and corresponding guards
as possible, without regard to overlap.
Of course, for purposes of efficiency the programmer could strengthen
the guards to excise the nondeterminism. For example, changing the
second guard in (10.1) from x ≤ 0 to x < 0 would help if evaluation of
unary minus is expensive, because in the case x =0 only the first com-
mand z:= x could then be executed.
Finally, the lack of default allows the possibility of symmetry (see
(10.1)), which is pleasing -if not necessary- to one with a mathematical
eye.
(1) Q ⇒ BB
(2) Q ∧ Bi ⇒ wp(Si, R), for all i, 1 ≤ i ≤ n.
Proof We first show how to take Q outside the scope of the quantifica-
tion in assumption 2 of the theorem:
   (A i: Q ∧ Bi ⇒ wp(Si, R))
=  (A i: ¬(Q ∧ Bi) ∨ wp(Si, R))        (Implication)
=  (A i: ¬Q ∨ ¬Bi ∨ wp(Si, R))         (De Morgan)
=  ¬Q ∨ (A i: ¬Bi ∨ wp(Si, R))         (Q doesn't depend on i)
=  Q ⇒ (A i: Bi ⇒ wp(Si, R))           (Implication, twice)
Hence, we have
That is, the search has been narrowed down to array section b[i:j], and k
is an index into this section. We want to prove that
Q ∧ b[k] ≤ x  ⇒  x ∈ b[k:j]  =  wp("i:= k", x ∈ b[i:j]),  and
Q ∧ b[k] ≥ x  ⇒  x ∈ b[i:k]  =  wp("j:= k", x ∈ b[i:j]).
The two implications follow from the fact that Q indicates that the array
is ordered and that x is in b[i:j] and from the second conjunct of the
antecedents. Hence the theorem allows us to conclude that (10.6) is
true. 0
6. Arrays f[O:n] and g[O:m] are alphabetically ordered lists of names of people.
It is known that at least one name is on both lists. Let X represent the first (in
alphabetic order) such name. Calculate and simplify the weakest precondition of
the following alternative command with respect to predicate R given after it.
Assume i and j are within the array bounds.
[Flowchart of the loop, garbled in this copy: a test of B, with the T
branch leading through S and back to the test, and the F branch leading
out of the loop.]
do B → S od
(11.1)  do B1 → S1
        □ B2 → S2
        □ ...
        □ Bn → Sn
        od

        do BB → if B1 → S1
                □ ...
                □ Bn → Sn
                fi
        od

or    do BB → IF od
That is, if all the guards are false, which means that BB is false, execution
terminates; otherwise, the corresponding alternative command IF is exe-
cuted and the process is repeated. One iteration of a loop, therefore, is
equivalent to finding BB true and executing IF.
Thus, we can get by with only the simple while-loop. Nevertheless, we
will continue to use the more general form because it is extremely useful
in developing programs, as we will see in Part III.
H0(R) = ¬BB ∧ R
Let us also write a predicate Hk(R), for k > 0, to represent the set of all
states in which execution of DO terminates in k or fewer iterations, with
R true. The definition will be recursive -i.e. in terms of H(k-1)(R). One
case is that DO terminates in 0 iterations, in which case H0(R) is true.
The other case is that at least one iteration is performed. Thus, BB must
initially be true and the iteration consists of executing a corresponding IF.
This execution of IF must terminate in a state in which the loop will
iterate k-1 or fewer times. This leads to

    Hk(R) = H0(R) ∨ wp(IF, H(k-1)(R))
i, s:= 1, b[0];
do i < 11 → i, s:= i+1, s+b[i] od
{R: s = (Σ k: 0 ≤ k < 11: b[k])}
How can we argue that it works? Let's begin by giving a predicate P that
shows the logical relationship between variables i, s and b -in effect, it
serves as a definition of i and s:

    P: 1 ≤ i ≤ 11 ∧ s = (Σ k: 0 ≤ k < i: b[k])
{T}
i, s:= 1, b[0];
{P}
(11.3) do i < 11 → {i < 11 ∧ P} i, s:= i+1, s+b[i] {P} od
{i ≥ 11 ∧ P}
{R}
Now let's show that an iteration of the loop terminates with P true -i.e.
an execution of command i, s:= i+1, s+b[i] beginning with P and
i < 11 true terminates with P still true. Again, we can see this informally
or we can formally prove it:
t: 11 - i.
{b ≥ 0}
x, y, z:= a, b, 0;
(11.4) do y > 0 ∧ even(y) → y, x:= y÷2, x+x
       □ odd(y)           → y, z:= y-1, z+x
       od
{R: z = a*b}

P: y ≥ 0 ∧ z + x*y = a*b.
z + x*y = (z+x) + x*(y-1). For the first guarded command, note that
execution of y, x:= y÷2, x+x with y even leaves the value of z + x*y
unchanged, because x*y = (x+x) * (y÷2) when y is even. We leave the
more formal verification to the reader (exercise 7).
Since each iteration of the loop leaves P true, P must be true upon
termination. We show that P together with the falsity of the guards
implies the result R as follows:
The work done thus far is conveyed by the following annotated program.
{b ≥ 0}
x, y, z:= a, b, 0;
{P}
do y > 0 ∧ even(y) → {P ∧ y > 0 ∧ even(y)} y, x:= y÷2, x+x {P}
(11.5) □ odd(y)    → {P ∧ odd(y)} y, z:= y-1, z+x {P}
od
{P ∧ y ≤ 0 ∧ ¬odd(y)}
{P ∧ y = 0}
{R: z = a*b}
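Program (11.4) is directly runnable, with the invariant P: y ≥ 0 ∧ z + x*y = a*b asserted at every iteration. A Python sketch (an illustration; y÷2 becomes integer division y // 2):

```python
# Program (11.4), multiplication by halving and doubling, with the
# invariant checked before the loop and after every iteration.

def multiply(a, b):
    assert b >= 0
    x, y, z = a, b, 0
    assert y >= 0 and z + x * y == a * b          # P holds initially
    while y > 0:
        if y % 2 == 0:                            # y > 0 and even(y)
            y, x = y // 2, x + x
        else:                                     # odd(y)
            y, z = y - 1, z + x
        assert y >= 0 and z + x * y == a * b      # P preserved
    return z                                      # P and y = 0 give R

print(multiply(7, 13))      # 91
print(multiply(-4, 6))      # -24
```

Since y is halved or decremented on each iteration, y itself serves as a bound function, and on termination P ∧ y = 0 yields z = a*b, just as in the annotated program above.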
1'. P ∧ BB ⇒ wp(IF, P)
Discussion
A loop has many invariants. For example, the predicate x*O = 0 is an
invariant of every loop since it is always true. But an invariant that satis-
fies the assumptions of theorem (11.6) is important because it provides
understanding of the loop. Indeed, every loop, except the most trivial,
should be annotated with an invariant that satisfies the theorem.
As we shall see in Part III, the invariant is not only useful to the
reader, it is almost necessary for the programmer. We shall give heuris-
tics for developing the invariant and bound function before developing the
loop and argue that this is the more effective way to program. This
makes sense if we view the invariant as simply the definition of the vari-
ables and remember the adage about precisely defining variables before
Chapter 11 The Iterative Command 145
{Q}
{inv P: the invariant}
{bound t: the bound function}
(11.8) do B1 → S1
       □ ...
       □ Bn → Sn
       od
{R}
When faced with a loop with form (11.8), according to theorem (11.6)
the reader need only check the points given in (11.9) to understand that
the loop is correct. The existence of such a checklist is indeed an advan-
tage, for it allows one to be sure that nothing has been forgotten. In fact,
the checklist is of use to the programmer himself, although after a while
(pun) its use becomes second-nature.
{T}
i, s:= 10, 0;
{inv P: 0 ≤ i ≤ 10 ∧ s = (Σ k: i+1 ≤ k ≤ 10: b[k])}
{bound t: i}
do i ≠ 0 → i, s:= i-1, s+b[i] od
{R: s = (Σ k: 1 ≤ k ≤ 10: b[k])}
Exercises for Chapter 11 147
9. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds the position i of x in array b[0:n-1] if x ∈ b[0:n-1] and sets
i to n if it is not.
{0 ≤ n}
i:= 0;
{inv P: 0 ≤ i ≤ n ∧ x ∉ b[0:i-1]}
{bound t: n-i}
do i < n cand x ≠ b[i] → i:= i+1 od
{R: (0 ≤ i < n ∧ x = b[i]) ∨ (i = n ∧ x ∉ b[0:n-1])}
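As a companion to working the exercise formally, the linear search runs directly with its invariant and bound checked each iteration. A Python sketch (an illustration; the book's conditional "cand" is Python's short-circuit "and", and b[0:i-1] in the book's inclusive notation is the Python slice b[0:i]):

```python
# Exercise 9's linear search, with invariant P and bound t checked.

def find(b, x):
    n = len(b)
    i = 0
    while i < n and x != b[i]:          # i < n cand x != b[i]
        assert 0 <= i <= n and x not in b[0:i]     # P
        t = n - i                                  # bound t before step
        i = i + 1
        assert n - i < t                           # bound decreased
    assert (0 <= i < n and x == b[i]) or (i == n and x not in b)   # R
    return i

print(find([4, 7, 2, 9], 2))      # 2
print(find([4, 7, 2, 9], 5))      # 4 (= n, since x is not present)
```

The short-circuit "and" matters for the same reason cand does: when i = n, the test x != b[i] is never evaluated.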
10. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm sets i to the highest power of 2 that is at most n.
{0 < n}
i:= 1;
{inv P: 0 < i ≤ n ∧ (E p: i = 2^p)}
{bound t: n-i}
do 2*i ≤ n → i:= 2*i od
{R: 0 < i ≤ n < 2*i ∧ (E p: i = 2^p)}
11. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the nth Fibonacci number fn for n > 0, defined by
f0 = 0, f1 = 1, and fn = f(n-1) + f(n-2) for n > 1.
{n > 0}
i, a, b:= 1, 1, 0;
{inv P: 1 ≤ i ≤ n ∧ a = fi ∧ b = f(i-1)}
{bound t: n-i}
do i < n → i, a, b:= i+1, a+b, a od
{R: a = fn}
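The Fibonacci loop of exercise 11 runs as written, with the invariant asserted against an independent reference definition of fib. A Python sketch (an illustration, not the exercise's requested formal proof):

```python
# Exercise 11's algorithm, with
# P: 1 <= i <= n and a = fib(i) and b = fib(i-1) checked each iteration.

def fib(k):                       # reference definition, used only to check P
    x, y = 0, 1
    for _ in range(k):
        x, y = y, x + y
    return x

def fibonacci(n):
    assert n > 0
    i, a, b = 1, 1, 0
    assert 1 <= i <= n and a == fib(i) and b == fib(i - 1)   # P initially
    while i < n:
        i, a, b = i + 1, a + b, a            # simultaneous assignment
        assert 1 <= i <= n and a == fib(i) and b == fib(i - 1)
    return a                       # P and i = n give R: a = fib(n)

print([fibonacci(n) for n in range(1, 8)])    # [1, 1, 2, 3, 5, 8, 13]
```

The simultaneous assignment is essential here: a + b and a on the right both use the old values, exactly as the multiple assignment command prescribes.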
12. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the quotient q and remainder r when x is divided by y .
{x ≥ 0 ∧ 0 < y}
q, r:= 0, x;
{inv P: 0 ≤ r ∧ 0 < y ∧ q*y+r = x}
{bound t: r}
do r ≥ y → r, q:= r-y, q+1 od
{R: 0 ≤ r < y ∧ q*y+r = x}
13. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds an integer k such that b[k] is the maximum value of array b[0:n-1]
-note that if the maximum value occurs more than once the algorithm is non-
deterministic.
{0 < n}
i, k:= 1, 0;
{inv P: 0 < i ≤ n ∧ b[k] ≥ b[0:i-1]}
{bound t: n-i}
do i < n → if b[i] ≤ b[k] → skip
           □ b[i] ≥ b[k] → k:= i
           fi;
           i:= i+1
od
{R: b[k] ≥ b[0:n-1]}
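The nondeterminism in exercise 13 arises when b[i] = b[k]: both guards of the inner alternative command are true. The Python sketch below (an illustration; random choice models the nondeterministic selection) checks that R holds whichever branch is taken:

```python
import random

# Exercise 13's algorithm.  When b[i] = b[k] either branch may be
# taken; we pick randomly, and the postcondition holds either way.

def max_index(b):
    n = len(b)
    assert 0 < n
    i, k = 1, 0
    while i < n:
        assert 0 < i <= n and all(b[k] >= b[j] for j in range(i))  # P
        if b[i] < b[k]:
            pass                      # skip
        elif b[i] > b[k]:
            k = i
        else:                         # b[i] = b[k]: both guards true
            k = random.choice([k, i])
        i = i + 1
    assert all(b[k] >= b[j] for j in range(n))    # R
    return k

b = [3, 9, 2, 9, 5]
k = max_index(b)
print(b[k])            # 9; k itself may be 1 or 3 on different runs
```

Different runs may return different k, but b[k] is the maximum in every case: the specification pins down b[k], not k.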
Chapter 12
Procedure Call
In one sense, using a procedure is exactly like using any other opera-
tion (e.g. +) of the programming notation, and constructing a procedure is
extending the language to include another operation. For example, when
we use + in an expression, we never question how it is performed; we just
assume that it works. Similarly, when writing a procedure call we rely
only on what the procedure does, and not on how it does it. In another
sense, a procedure (and its proof) is a lemma. A program can be con-
sidered a constructive proof that its specification is consistent and com-
putable; a procedure is a lemma used in the constructive proof.
In the following sections, Pascal-like notations are used for procedure
declaration and call, although the (possible) execution of a procedure call
may not be exactly as in Pascal. The reason is that the main influence in
developing the procedure call here was the need for a simple, understand-
able theorem about its use, and such an influence was beyond the state of
the art when Pascal was developed.
Procedure declaration
A procedure declaration has the form
{Pre: P}
{Post: Q}
proc <identifier>(<par. spec.>; ...; <par. spec.>);
<body>
The following restrictions are made on the use of identifiers in a pro-
cedure declaration. The only identifiers that can be used in the body are
the parameters and the identifiers declared in the body itself -i.e. no
"global variables" are allowed. The parameters must be distinct identif-
iers. Precondition P of the body may contain as free only the parameters
with attribute value (and value result); postcondition Q only parameters
with attribute result (and value result). This restriction is essential for a
simple definition of procedure call, but it does not limit procedures or
calls of them in any essential way. P and Q may, of course, contain as
free other identifiers that are not used within the program (to denote ini-
tial values of variables, etc.). See section 12.4 for a way to eliminate this
restriction.
Example. Given fixed x, fixed n > 0 and fixed array b[0:n-1], where
x ∈ b, the following procedure determines the position of x in b, thus
establishing x = b[i].
Note that identifiers have been used to denote the initial values of the
parameters that do not have attribute result, even though the parameters
are not altered during execution of the procedure body. □
Thus, the x_i are the value parameters of procedure p, the y_i the
value-result parameters and the z_i the result parameters. We have left
out the types of the parameters because they don't concern us at this
point. (This is an example of the use of abstraction!)
The name of the procedure is p. The a_i, b_i and c_i are the arguments of
the procedure. The a_i are expressions; the b_i and c_i have the form of an
identifier followed by a selector -in common parlance, they are "variables".
The a_i are the value arguments corresponding to the x_i of (12.1.1), the b_i
the value-result arguments and the c_i the result arguments. Each argument
must have the same type as its corresponding parameter.
The identifiers accessible at the point of call must be different from the
procedure parameters x, y and z. This restriction avoids extra notation
needed to deal with the conflict of the same identifier being used for two
different purposes and is not essential.
To illustrate, here is a call of procedure search of the previous example:
search(50, t, c, position[j]). Its execution stores in position[j] the
position of the value of t in array c[0:49].
A call p(a, b, c) can be executed as follows:

    x, y := a, b;  B;  b, c := y, z
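This copy-in/copy-out execution can be sketched in Python (the helper names are mine, not the book's; the body here is a two-parameter swap, proved correct later in the chapter):

```python
# Sketch of value-result call semantics:  x,y := a,b;  B;  b,c := y,z
def swap_body(y1, y2):
    # body B, satisfying {y1 = X and y2 = Y} B {y1 = Y and y2 = X}
    return y2, y1

def call_swap(b1, b2):
    y1, y2 = b1, b2             # copy the arguments into the parameters
    y1, y2 = swap_body(y1, y2)  # execute the body B
    return y1, y2               # copy the parameters back to the arguments

print(call_swap(3, 7))          # the argument values come back interchanged
```

Because the body works only on its own parameter copies, assignment to one parameter cannot affect an argument until the final copy-out, which is exactly what the theorem below relies on.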
{PR: P[x,y := a,b] ∧ (A u,v: Q[y,z := u,v] ⇒ R[b,c := u,v])}
p(a, b, c)
{R}

(Here P[x,y := a,b] denotes textual substitution: P with the parameters
x, y replaced by the arguments a, b.)

(12.2.2) x,y := a,b {P};  y,z := u,v {Q};  b,c := y,z {R}

(Q[y,z := u,v])[x,y := a,b]
= Q[y,z := u,v]    (since it contains no x_i or y_i!)

(R[b,c := u,v])[x,y := a,b]
= R[b,c := u,v]    (since it contains no x_i or y_i!)
In order to be able to use the fact that {P} B {Q} has been proved about
the procedure body, we require that (12.2.3) be true before the call; this is
the first conjunct in the precondition PR of the theorem. Therefore, no
matter what values u, v execution assigns to the result parameters, Q will
be true in the indicated place in (12.2.2).
Now, we want to determine initial conditions that guarantee the truth
of R upon termination, no matter what values u, v are assigned to the
result parameters and arguments. R holds after the call if, for all values
u, v, the truth of Q in (12.2.2) implies the truth of R after the call. This
can be written in terms of the initial conditions as
holds, where a and b are integer variables and identifiers Y and X denote
their final values, respectively. We apply theorem (12.2.1) to find a
satisfactory precondition PR:
PR = (a = X ∧ b = Y) ∧
     (A u1, u2: (y1 = Y ∧ y2 = X)[y1,y2 := u1,u2]
                ⇒ (a = Y ∧ b = X)[a,b := u1,u2])
   = (a = X ∧ b = Y) ∧
     (A u1, u2: (u1 = Y ∧ u2 = X) ⇒ (u1 = Y ∧ u2 = X))
   = (a = X ∧ b = Y) ∧ T
{P: y1 = X ∧ y2 = Y} B {Q: y1 = Y ∧ y2 = X}

Therefore, it is equivalent to

(A X, Y: {y1 = X ∧ y2 = Y} B {y1 = Y ∧ y2 = X})

{y1 = A ∧ y2 = Y} B {y1 = Y ∧ y2 = A}

Thus, this last line is also true about the procedure body B. Now apply
the theorem as in example 1 to yield the desired result. Hence, (12.2.7)
holds.
This illustrates how initial and final values of parameters can be
handled. The identifiers that denote initial and final values of parameters
can be replaced by fresh identifiers -or any expressions- to yield another
proof about the procedure body, which can then be used in theorem
12.2.1. □
Example 3. We now prove correct a call that has array elements as
arguments. Consider the procedure of example 1. We want to prove that
swap(i, b[i]) interchanges i and b[i] but leaves the rest of array b
unchanged. It is assumed that the value of i is a valid subscript. Thus,
we want to prove

(12.2.8) {i = I ∧ (A j: b[j] = B[j])}
         swap(i, b[i])
         {R: i = B[I] ∧ b[I] = I ∧ (A j: I ≠ j: b[j] = B[j])}
PR = i = I ∧ b[i] = B[I] ∧
     (A u1, u2: u1 = B[I] ∧ u2 = I ⇒ u1 = B[I] ∧
        (b; i:u2)[I] = I ∧ (A j: I ≠ j: (b; i:u2)[j] = B[j]))
   = i = I ∧ b[i] = B[I] ∧ B[I] = B[I] ∧
     (b; i:I)[I] = I ∧ (A j: I ≠ j: (b; i:I)[j] = B[j])
   = i = I ∧ b[i] = B[I] ∧ T ∧ I = I ∧ (A j: I ≠ j: b[j] = B[j])
which assigns the value parameter to both result parameters. Note that
postcondition Q does not contain the value parameter. We want to
execute the call p(b[i], i, b[i+1]), which assigns b[i] to i and b[i+1].
Thus, it makes sense to try to prove

First, replace the free variable X in the proof of the procedure body by
C:

b[i] = C ∧
(A v1, v2: v1 = v2 = C ⇒
    v1 = (b; i+1:v2)[I] = (b; i+1:v2)[I+1] = C)
= b[i] = C ∧ (b; i+1:C)[I] = (b; i+1:C)[I+1] = C
(A u, v: Q[y,z := u,v] ⇒ R[b,c := u,v])

(12.2.10) R[b,c := u,v] = Q[y,z := u,v] ∧ I

where the free variables of I are disjoint from u and v. For then the
complicated conjunct may be simplified as follows:

R = (Q[y,z := u,v] ∧ I)[u,v := b,c]    ((12.2.10))
  = Q[y,z := b,c] ∧ I                  (Lemma 4.6.3, def. of I)

(12.2.11) R = Q[y,z := b,c] ∧ I

But this is not enough. From (12.2.11) we want to conclude that (12.2.10)
holds, but this is not always the case, because

(Q[y,z := b,c])[b,c := u,v]   need not equal   Q[y,z := u,v]
(12.3.2) p(a, b, c, d)

How do we extend theorems 12.2.1 and 12.2.12 to allow for call by
reference? Call by reference can be viewed as an efficient form of call by
value-result; execution is the same, except that the initial assignments to r
and the final assignments to d are not needed. But the proof of the
procedure body, {P} B {Q}, is consistent with our notion of execution for
value-result parameters only if value-result parameters occupy separate
locations -assignment to one parameter must not affect the value of any
other parameter. When using call by reference, then, we must be sure
that this condition is still upheld.
Let us introduce the notation disj(d) to mean that no sharing of
memory occurs among the d_i. For example, disj(d1, d2) holds for
different identifiers d1 and d2. Also, disj(b[i], b[i+1]) holds, while
disj(b[i], b[j]) is equivalent to i ≠ j.
Further, we say that two vectors x and y are pairwise disjoint, written
pdisj(x; y), if each x_i is disjoint from each y_j -i.e. disj(x_i, y_j) holds.
Theorems 12.2.1 and 12.2.12 can then be modified to the following:
{P[x,y,r := a,b,d]
 ∧ (A u,v,w: Q[y,z,r := u,v,w] ⇒ R[b,c,d := u,v,w])}
p(a, b, c, d)
{R}  □
holds. Then

{P[x,y,r := a,b,d] ∧ I}
p(a, b, c, d)
{Q[y,z,r := b,c,d] ∧ I}  □
p(a, b, c)
{R}

{PR: P[x,y := a,b] ∧ (A u,v: Q[x,y,z := a,u,v] ⇒ R[b,c := u,v])}
p(a, b, c)
{R}

holds. In other words, PR = wp(p(a, b, c), R). □
Examples of the use of these theorems are left to the exercises.
{Q(u)} S {R}
{(A u: Q(u))} S {R}
{(E u: Q(u))} S {R}
{Pre: 0 ≤ k ∧ x = X ∧ b = B}
{Post: 0 ≤ p ≤ k ∧ b[p] = X}
proc s (value x: integer;
        value result b: array of integer;
        value result k, p: integer);
    p, b[k] := 0, x;
    {inv: 0 ≤ p ≤ k ∧ x ∉ b[0:p-1]}
    {bound: k - p}
    do x ≠ b[p] → p := p+1 od
Is the procedure fully specified -i.e. has anything been omitted from the
specification that can be proved about the procedure body? Which of the
following calls can be proved correct using theorem 12.2.1? Prove them correct.
(a) {d = 0} s(5, c, d, j) {c[j] = 5}
(b) {0 ≤ m} s(j, c, m, j) {c[j] = J}
(c) {0 < m} s(b[0], c, m, j) {c[j] = c[0]}
(d) {0 < m} s(5, c, m, m) {c[m] = 5}
6. Which of the calls given in exercise 5 can be proved correct using theorem
12.2.12? Prove them correct.
7. Suppose parameters k and p of exercise 5 have attribute var instead of value
result. Can call (d) of exercise 5 be proved correct using theorem 12.3.3? If so,
do so. Can it be proved correct using theorem 12.3.4? 12.3.5? If so, do so.
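The procedure s of exercise 5 is the classic sentinel linear search: planting x at b[k] lets the loop omit a bounds test. A Python sketch of the same idea (the function name is mine, not the book's):

```python
def linear_search_sentinel(b, x, k):
    """Sketch of procedure s:  p, b[k] := 0, x;  do x != b[p] -> p := p+1 od.

    Storing x at b[k] guarantees termination with 0 <= p <= k and b[p] == x.
    """
    b = list(b)          # b is a value-result parameter: work on a copy
    assert 0 <= k < len(b)
    p, b[k] = 0, x       # plant the sentinel
    while x != b[p]:     # no bounds test needed: b[k] == x stops the scan
        p = p + 1
    return p, b

p, b = linear_search_sentinel([3, 1, 4, 9], 4, 3)   # finds 4 at position 2
```

Note that the sketch exposes the point of the exercise: when x does not occur in b[0:k-1], the call terminates with p = k and b[k] overwritten, which matters for calls whose arguments alias.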
Part III
The Development of Programs
Chapter 13 Introduction
What is a proof?
The word radical, used above, is appropriate, for the methodology pro-
posed strikes at the root of the current problems in programming and pro-
vides basic principles to overcome them. One problem is that program-
mers have had little knowledge of what it means for a program to be
correct and of how to prove a program correct. The word proof has un-
pleasant connotations for many, and it will be helpful to explain what it
means.
A proof, according to Webster's Third New International Dictionary, is
"the cogency of evidence that compels belief by the mind of a truth or
fact". It is an argument that convinces the reader of the truth of some-
thing.
The definition of proof does not imply the need for formalism or
mathematics. Indeed, programmers try to prove their programs correct in
this sense of proof, for they certainly try to present evidence that compels
their own belief. Unfortunately, most programmers are not adept at this,
as can be seen by looking at how much time is spent debugging. The pro-
grammer must indeed feel frustrated at the lack of mastery of the subject!
Part of the problem has been that only inadequate tools for under-
standing have been available. Reasoning has been based solely on how
programs are executed, and arguments about correctness have been based
on a number of test cases that have been run or hand-simulated. The
intuition and mental tools have simply been inadequate.
Also, it has not always been clear what it means for a program to be
"correct", partly because specifications of programs have been so impre-
cise. Part II has clarified this for us; we call a program S correct -with
respect to a given precondition Q and postcondition R - if {Q} S {R}
holds. And we have formal means for proving correctness.
Thus, our development method will center around the concept of a for-
mal proof, involving weakest preconditions and the theorems for the alter-
native, iterative and procedure call constructs discussed in Part II. In this
connection, the following principle is important:
the reader. In addition, some programs are so large that they cannot be
comprehended fully by one person at one time. Thus, there is a continual
need to strive for balance, conciseness, and even elegance.
The approach we take, then, can be summarized in the following
The Coffee Can Problem. A coffee can contains some black beans and
white beans. The following process is to be repeated as long as possible.
Randomly select two beans from the can. If they have the
same color, throw them out, but put another black bean
in. (Enough extra black beans are available to do this.)
If they are different colors, place the white one back into
the can and throw the black one away.
Execution of this process reduces the number of beans in the can by one.
Repetition of the process must terminate with exactly one bean in the can,
for then two beans cannot be selected. The question is: what, if anything,
can be said about the color of the final bean based on the number of
white beans and the number of black beans initially in the can? Spend
ten minutes on the problem, which is more than it should require, before
reading further.
It doesn't help much to try test cases! It doesn't help to see what happens
when there are initially 1 black bean and 1 white bean, and then to see
what happens when there are initially 2 black beans and one white bean,
etc. I have seen people waste 30 minutes with this approach.
Instead, proceed as follows. Perhaps there is a simple property of the
beans in the can that remains true as beans are removed and that,
together with the fact that only one bean remains, can give the answer.
Since the property will always be true, we will call it an invariant. Well,
suppose upon termination there is one black bean and no white beans.
What property is true upon termination, which could generalize, perhaps,
to be our invariant? One is an odd number, so perhaps the oddness of the
number of black beans remains true. No, this is not the case, in fact the
number of black beans changes from even to odd or odd to even with
each move. But, there are also zero white beans upon termination
-perhaps the evenness of the number of white beans remains true. And,
indeed, yes, each possible move either takes out two white beans or leaves
the number of white beans the same. Thus, the last bean is black if ini-
tially there is an even number of white beans; otherwise it is white.
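The parity argument is easy to check by brute force. A small Python simulation of the process as stated (the names are mine):

```python
import random

def final_bean(black, white):
    """Simulate the coffee can; invariant: the parity of `white` never changes."""
    while black + white > 1:
        picks = random.sample(['b'] * black + ['w'] * white, 2)
        black -= picks.count('b')
        white -= picks.count('w')
        if picks[0] == picks[1]:
            black += 1          # same color: both out, one black back in
        else:
            white += 1          # different colors: white back, black out
    return 'b' if black == 1 else 'w'

# Last bean is black exactly when the initial number of white beans is even.
for b0, w0 in [(1, 1), (2, 1), (3, 4), (5, 0)]:
    assert final_bean(b0, w0) == ('w' if w0 % 2 == 1 else 'b')
```

Even though each run makes random choices, the invariant forces the same answer every time, which is precisely why test-case thinking is the wrong tool here.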
Closing the curve. This second problem is solved in essentially the same
manner. Consider a grid of dots, of any size:
Two players, A and B, play the following game. The players alternate
moves, with A moving first. A moves by drawing | or _ between two
adjacent dots; B moves by drawing a dotted line between two adjacent
dots. For example, after three full moves the grid might be as to the left
below. A player may not write over the other player's move.
A wins the game if he can get a completely closed curve, as shown to the
right above. B, because he goes second, has an easier task: he wins if he
can stop A from getting a closed curve. Here is the question: is there a
strategy that guarantees a win for either A or B, no matter how big the
board is? If so, what is it? Spend some time thinking about the problem
before reading further.
Looking at one trivial case, a grid with one dot, indicates that A cannot
win all the time -four dots are needed for a closed curve. Hence, we
look for a strategy for B to win. Playing the game and looking at test
cases will not find the answer! Instead, investigate properties of closed
curves, for if one of these properties can be barred from the board, A
cannot win. The corresponding invariant is that the board is never in a
configuration in which A can establish that property.
What properties does a closed curve have? It has parallel lines, but B
cannot prevent parallel lines. It has an even number of parallel lines, but
B cannot prevent this. It has four kinds of angles, └, ┌, ┐ and ┘, but B cannot
prevent A from drawing angles. It always has at least one angle └, which
opens northeast -and B can prevent A from drawing such an angle! If
A draws a horizontal or vertical line, as shown to the left below, then B
simply fills in the corresponding vertical or horizontal line, if it is not yet
filled in, as shown to the right below. A simpler strategy couldn't exist!
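B's strategy amounts to pairing each segment with the one that would complete a northeast-opening corner at the same dot. A sketch of the response function (the representation and names are my own; y is taken to increase northward):

```python
# A segment is ('H', x, y): horizontal from dot (x, y) to (x+1, y),
# or          ('V', x, y): vertical   from dot (x, y) to (x, y+1).
# The pair ('H', x, y) and ('V', x, y) together form the corner that opens
# northeast at dot (x, y); B answers A's move with the move's partner
# (drawing it dotted, if it is not already filled in).
def b_response(a_move):
    kind, x, y = a_move
    return ('V', x, y) if kind == 'H' else ('H', x, y)
```

Since every northeast-opening corner needs both segments of some pair, and B always spoils the partner of whatever A draws, A can never complete such a corner, hence never a closed curve.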
These two problems have extremely simple solutions, but the solutions
are extremely difficult to find by simply trying test cases. The problems
are easier if one looks for properties that remain true. And, once found,
these properties allow one to see in a trivial fashion that a solution has
been found.
Besides illustrating the inadequacy of solving by test cases, these prob-
lems illustrate the following principle:
In fact, we shall see by examples that the more properties you know
about the objects, the more chance you have of creating an efficient algo-
rithm. But let us leave further examples of the use of this principle to
later chapters.
Programming-in-the-small
For the past ten years, there has been much research in "pro-
gramming-in-the-small", partially because it seemed to be an area in
which scientific headway could be made. More importantly, however, it
was felt that the ability to develop small programs is a necessary condition
for developing large ones -although it may not be sufficient.
This fact is brought home most clearly with the following argument.
Suppose a program consists of n small components -i.e. procedures,
modules- each with probability p of being correct. Then the probability
P that the whole program is correct certainly satisfies P ≤ p^n. Since n is
large in any good-sized program, to have any hope that the program is
correct requires p to be very, very close to 1. For example, a program
with 10 components, each of which has a 95% chance of being correct, has
less than a 60% chance of being correct, while a program with 100 such
components has less than a .6% chance of being correct!
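The arithmetic behind these figures:

```python
# P <= p**n: the whole program is correct only if every component is.
p = 0.95
print(round(p ** 10, 3))    # 10 components: about 0.599, i.e. under 60%
print(round(p ** 100, 4))   # 100 components: about 0.0059, i.e. under 0.6%
```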
Part III concentrates on the place where many programming errors are
made: the development of small program segments. All the program seg-
ments in Part III are between 1 and 25 lines long, with the majority being
between 1 and 10. It is true, however, that some of the programs are
short because of the method of development. Concentrating on princi-
ples, with an emphasis on precision, clarity and elegance, can actually
result in shorter programs. The most striking example of this is the pro-
gram The Welfare Crook -see section 16.4.
A disclaimer
The methods described in Part III can certainly benefit almost any pro-
grammer. At the same time, it should be made clear that there are other
ways to develop programs. A difficult task like programming requires
many different tools and techniques. Many algorithms require the use of
an idea that simply does not arise from the principles given in this Part,
so this method alone cannot be used to solve them effectively. Some
important ideas, like program transformation and "abstract data types",
are not discussed at all, while others are just touched upon. And, of
course, experience and knowledge can make all the difference in the
world.
Secondly, even though the emphasis is on proofs of correctness, errors
will occur. The wise programmer develops a program with the attitude
that a correct program can and will be developed, provided enough care
and concentration is used, and then tests it thoroughly with the attitude
that it must have a mistake in it. The frequency of errors in mathematical
theorems, proofs, and applications of theorems is well-recognized and
documented, and the area of program-proving will not be an exception.
We must simply learn to live with human fallibility and simplify to reduce
it to a minimum.
Nevertheless, the study of Part III will provide an education in
rigorous thinking, which is essential for good programming. Conscious
application of the principles and strategies discussed will certainly be of
benefit.
first before proceeding! Finally, the reader should do several of the exer-
cises at the end of the section.
Simply reading and listening to lectures on program development can
only teach about the method; in order to learn how to use it, direct
involvement is necessary. In this connection, the following meta-principle
is of extreme importance:
Ideas may be simple and easy to understand, but their application may
require effort. Recognizing a principle and applying it are two different
things.
notation in which the final program is expressed. For example, one can
use the principles and strategies espoused in this book even if the final
program has to be in FORTRAN: one programs into a language, not in
it. To be sure, considerably more than one month of education and train-
ing will be necessary to wean yourself away from QWERTY program-
ming, for old habits are changed very slowly. Nevertheless, I think it is
worthwhile.
Let us now turn to the elucidation of principles and strategies that may
help give the QWERTY programmer a new keyboard.
Chapter 14
Programming as a Goal-Oriented Activity
This gives us the conditions under which execution of z:= x will establish
R, and our first attempt at a program can be
if x ≥ y → z := x fi
This program performs the desired task provided it doesn't abort. Recall
from theorem 10.5 for the alternative construct that, to prevent abortion,
precondition Q of the construct must imply the disjunction of the guards,
i.e. at least one guard must be true in any initial state defined by Q. But
Q, which is T, does not imply x ≥ y. Hence, at least one more guarded
command is needed.
Another possible way to establish R is to execute z := y. From the
above discussion it should be obvious that y ≥ x is the desired guard.
Adding this guarded command yields
Now, at least one guard is always true, so that this is the desired program.
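As a sketch (not the book's notation), the two guarded commands can be mimicked in Python by collecting the commands whose guards are true and choosing among them arbitrarily; an alternative construct with no true guard aborts:

```python
import random

def guarded_max(x, y):
    """Sketch of:  if x >= y -> z := x  []  y >= x -> z := y  fi."""
    candidates = []
    if x >= y:
        candidates.append(x)        # first guarded command
    if y >= x:
        candidates.append(y)        # second guarded command
    if not candidates:
        raise RuntimeError("abort: no guard true")   # cannot happen here
    return random.choice(candidates)  # nondeterministic choice when x == y

assert guarded_max(5, 5) == 5       # either command establishes z = max(x, y)
```

When x = y both guards hold and either command may run, which is exactly the symmetry discussed below.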
Formally, we know that (14.3) is the desired program by theorem 10.5.
To apply the theorem, take
Discussion
The above development illustrates the following
By this we mean that the desired result, or goal, R, plays a more impor-
tant role in the development of a program than the precondition Q. Of
course, Q also plays a role, as will be seen later. But, in general, more
insight is gained from the postcondition. The goal-oriented nature of pro-
gramming is one reason why the programming notation has been defined
in terms of weakest preconditions (rather than strongest postconditions
-see exercise 4 of section 9.1).
To substantiate this hypothesis of the goal-oriented nature of program-
ming, consider the following. Above, the precondition was momentarily
put aside and a program was developed that satisfied
{T} S {?}

Whenever S is thought to be complete, check whether T ⇒ wp(S,
z = max(x, y)), or T ⇒ wp(S, (14.2)). How many programs S will you
write before a correct one is found?
Another principle used in the above development is:
In the example just developed, the postcondition was refined while the
precondition, which was simply T, needed no refining.
A problem is sometimes specified in a manner that lends itself to
several interpretations. Hence, it is reasonable to spend some time mak-
ing the specification as clear and unambiguous as possible. Moreover, the
form of the specification can influence algorithmic development, so that
striving for simplicity and elegance should be helpful. With some prob-
lems, the major difficulty is making the specification simple and precise,
and subsequent development of the program is fairly straightforward.
Often, a specification may be in English or in some conventional nota-
tion -like max(x, y)- that is at too "high a level" for program develop-
ment, and it may contain abbreviations dealing with the applications area
with which the programmer is unfamiliar. The specification is written to
convey what the program is to do, and abstraction is often used to sim-
plify it. More detail may be required to determine how to do it. The
example of setting z to the maximum of x and y illustrates this nicely. It
is impossible to write the program without knowing what max means,
while writing a definition provides the insight needed for further develop-
ment.
The development of (14.3) illustrates one basic technique for develop-
ing an alternative construct, which was motivated by theorem 10.5 for the
Alternative Construct.
This technique, and a similar one for the iterative construct, is used often.
Let us return to program (14.3) for a moment. It has a pleasing sym-
metry, which is possible because of the nondeterminism. If there is no
reason to choose between z:= x and z := y when x = y, one should not be
forced to choose. Programming requires deep thinking, and we should be
spared any unnecessary irritation. Conventional, deterministic notations
force the choice, and this is one reason for preferring the guarded com-
mand notation.
Nondeterminism is an important feature even if the final program turns
out to be deterministic, for it allows us to devise a good programming
methodology. One is free to develop many different guarded commands
completely independently of each other. Any form of determinism, such
as evaluating the guards in order of occurrence (e.g. the PL/I Select
statement), drastically affects the way one thinks about developing alternative
constructs.
A second example
Write a program that permutes (interchanges) the values of integer
variables x and y so that x ≤ y. Use the method of development discussed
above.
As a first step, before reading further, write a suitable precondition Q
and postcondition R.
The problem is slightly harder than the first one, for it requires the intro-
duction of notation to denote the initial and final values of variables.
Precondition Q is x = X ∧ y = Y, where identifiers X and Y denote the
initial values of variables x and y, respectively. Postcondition R is
Remark: One could also use the concept of a permutation and write R as
x ≤ y ∧ perm((x, y), (X, Y)). □
if x ≤ y → skip
[] y ≤ x → x, y := y, x
fi
j = k mod 10

Thus, j will only take on the values 0, 1, ..., 9. Let us determine a
command to "increase k under the invariance of j = k mod 10", assuming
that function mod is not available.
(Note how strategy (14.7) was used, in an informal but careful manner.)
The question is: which is to be preferred, (14.8) or segment (14.9) below,
which is the same as (14.8) except that its second guard, j ≥ 9, is weaker.
At first thought, (14.9) might be preferred because it executes without
abortion in more cases. If initially j = 10 (say), it nicely sets j to 0. But
this is precisely why (14.9) is not to be preferred. Clearly, j = 10 is an
error caused by a hardware malfunction, a software error, or an
inadvertent modification of some kind; j is always supposed to satisfy
0 ≤ j < 10. Execution of (14.9) proceeds as if nothing were wrong and the
error goes undetected. Execution of (14.8), on the other hand, aborts if
j = 10, and the error is detected.
(14.10) • Principle: All other things being equal, make the guards
of an alternative command as strong as possible, so that
some errors will cause abortion.
The phrase "all other things being equal" is present to make sure that the
principle is reasonably applied. For example, at this point I am not even
prepared to advocate strengthening the first guard, as follows:
if 0 ~j /\ j <9 - k ,j:= k + I, j + I
Uj =9 - k, j:= k+l, 0
fi
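Principle (14.10) is easy to demonstrate concretely. In this sketch (an exception stands in for abortion), the strong guards turn an out-of-range j into a detected failure instead of a silent "repair":

```python
def step(k, j):
    """Sketch of:  if 0 <= j < 9 -> k,j := k+1,j+1  []  j = 9 -> k,j := k+1,0  fi."""
    if 0 <= j < 9:
        return k + 1, j + 1
    elif j == 9:
        return k + 1, 0
    else:
        # no guard true: the alternative construct aborts
        raise RuntimeError("abort: j = k mod 10 violated")

assert step(18, 8) == (19, 9)
assert step(19, 9) == (20, 0)
```

A version whose second branch tested `j >= 9` would happily "correct" j = 10 and let the underlying error go unnoticed.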
This chapter discusses two methods for developing a loop when the
precondition Q, the postcondition R, the invariant P and the bound
function t are given. The first method leads naturally to a loop with a
single guarded command, do B → S od. The second takes advantage of
the flexibility of the iterative construct and generally results in loops with
more than one guarded command.
Checklist 11.9 will be heavily used, and it may be wise to review it
before proceeding. As is our practice throughout, the parts of the
development that illustrate the principles to be covered are discussed in a
formal and detailed manner, while other parts are treated more infor-
mally.
R: s = (Σ j: 0 ≤ j < n: b[j])
P: 0 ≤ i ≤ n ∧ s = (Σ j: 0 ≤ j < i: b[j])
t: n - i
Thus, variable i has been introduced. The invariant states that at any
point in the computation s contains the sum of the first i values of b.
The assignment i, s := 0, 0 obviously establishes P, so it will suffice as
the initialization. (Note that i, s := 1, b[0] does not suffice because, if
n = 0, it cannot be executed. If n = 0, execution of the program must set
s to the identity of addition, 0.)
The next step is to determine the guard B for the loop do B → S od.
Checklist 11.9 requires P ∧ ¬B ⇒ R, so ¬B is chosen to satisfy it.
Comparing P and R, we conclude that i = n will do. The desired guard B
of the loop is therefore its complement, i ≠ n. The program looks like
Now for the command. The purpose of the command is to make progress
towards termination -i.e. to decrease the bound function t- and an
obvious first choice for it is i := i+1. But this would destroy the
invariant, and to reestablish it b[i] must simultaneously be added to s.
Thus, the program is
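A Python transcription of the loop just developed (initialization i, s := 0, 0; guard i ≠ n; body i, s := i+1, s+b[i]); note that Python's tuple assignment, like the book's multiple assignment, evaluates the right-hand side first:

```python
def array_sum(b):
    """Invariant P: 0 <= i <= n and s = b[0] + ... + b[i-1]; bound t: n - i."""
    n = len(b)
    i, s = 0, 0                   # initialization establishes P
    while i != n:                 # guard B: P and not-B together imply R
        i, s = i + 1, s + b[i]    # decrease t by 1 while maintaining P
    return s

assert array_sum([]) == 0         # n = 0: s is 0, the identity of addition
```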
Remark: For those uneasy with the multiple assignment, the formal proof
that P is maintained is as follows. We have
Discussion
First of all, let us discuss the balance between formality and intuition
observed here. The pre- and postconditions, the invariant and the bound
function were given formally and precisely. The development of the parts
of the program was given less formally, but checklist 11.9, which is based
on the formal theorem for the Iterative Construct, provided most of the
motivation and insight. In order to check the informal development, we
relied on the theory (in checking that the loop body maintained the invari-
ant). This is illustrative of the general approach (13.1) mentioned in
chapter 13.
An important strategy in the development was finding the guard before
the command. And the prime consideration in finding the guard B was
that it had to satisfy P ∧ ¬B ⇒ R. So ¬B was developed and then
complemented to yield B.
Some object at first to finding the guard this way, because tradition
would use the guard i < n instead of i ≠ n. However, i ≠ n is better,
because a software or hardware error that made i > n would result in a
nonterminating execution. It is better to waste computer time than suffer
the consequences of having an error go undetected, which would happen
if the guard i < n were used. This analysis leads to the following
The method used for developing the guard of a loop is extremely simple
and reliable, for it is based on manipulation of static, mathematical
expressions. In this connection, I remember my old days of FORTRAN
programming -the early 1960's- when it sometimes took three debugging
runs to achieve proper loop termination. The first time the loop
iterated once too few, the second time once too many and the third time
just right. It was a frustrating, trial-and-error process. No longer is this
necessary; just develop ¬B to satisfy P ∧ ¬B ⇒ R and complement it.
Another important point about the development was the stress on ter-
mination. The need to progress towards termination motivated the
development of the loop body; reestablishing the invariant was the second
consideration. Actually, every loop with one guarded command has the
high-level interpretation
(15.1.3) {invariant: P}
         {bound: t}
         do B → Decrease t, keeping P true od
         {P ∧ ¬B}
The invariant P, given below using a diagram, states that x is not in the
already-searched rows b[0:i-1] and not in the already-searched columns
b[i, 0:j-1] of the current row i.
(15.1.6) P: 0 ≤ i ≤ m ∧ 0 ≤ j < n ∧
         [diagram: "x not here" covering rows b[0:i-1, 0:n-1]
          and the prefix b[i, 0:j-1] of row i]
The obvious choice is i, j := 0, 0, for then the section in which "x is not
here" is empty. Next, what should be the guard B of the loop?

B: i ≠ m ∧ (i ≥ m cor x ≠ b[i,j])
B: i ≠ m ∧ (i = m cor x ≠ b[i,j])

and finally to
The final line is therefore the guard of the loop. The next step is to deter-
mine the loop body. Do it, before reading further.
The purpose of the loop body is to decrease the bound function t, which
is the number of elements in the untested section: (m - i)*n - j. P ∧ B,
the condition under which the body is executed, implies that i < m, j < n
and x ≠ b[i,j], so that element b[i,j], which is in the untested section,
can be moved into the tested section. A possible command to do this is
j := j+1, but it maintains the invariant P only if j < n-1. So we have
the guarded command

j < n-1 → j := j+1

if j < n-1 → j := j+1  []  j = n-1 → i, j := i+1, 0 fi
The program is therefore
(15.1.7) j, j:= 0, 0;
do j #-m cand x #-b[i,j) ~
if j < n -I ~ j: = j +) 0 j = n -) ~ i,j: = i + ), 0 fi
od
i, j := 0, 0;
do i ≠ m cand x ≠ b[i,j] →
    j := j+1;
    if j < n → skip  []  j = n → i, j := i+1, 0 fi
od
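Program (15.1.7) carries over to Python almost verbatim; Python's `and` already evaluates conditionally, so it plays the role of cand (the function name is mine):

```python
def search2d(x, b, m, n):
    """Sketch of (15.1.7): return (i, j) with b[i][j] == x, or i == m if absent."""
    i, j = 0, 0
    while i != m and x != b[i][j]:   # 'and' is conditional, like cand
        if j < n - 1:
            j = j + 1                # move along the current row i
        else:                        # j = n-1: row i is exhausted
            i, j = i + 1, 0

    return i, j

grid = [[1, 2, 3],
        [4, 5, 6]]
assert search2d(5, grid, 2, 3) == (1, 1)
```

With `and` replaced by an unconditional conjunction, the guard would index b[m] when x is absent, which is exactly the error cand is there to avoid.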
Discussion
Note that operation cand (instead of ∧) is really necessary.
Note that the method for developing an alternative command was used
when developing the body of the loop, albeit informally. First, the
command j := j+1 was chosen, and it was seen that it performed as desired
only if j < n-1. Formally, one must prove

(P ∧ B ∧ j < n-1) ⇒ wp("j := j+1", P)

but this case is simple enough to handle informally -if care is used.
Second, the command i, j := i+1, 0 was chosen to handle the remaining
case, j = n-1.
Note that the alternative command has the guards j < n-1 and j = n-1,
and not j < n-1 and j ≥ n-1. The guards of the alternative command
have been made as strong as possible, in keeping with principle
14.10, in order to catch errors.
We will develop another solution to this problem in section 15.2.
2. The invariant of the loop of the second example was given in terms of a
diagram (see (15.1.6)). Replace the diagram by an equivalent statement in the
predicate calculus.
3. Write a program that, given a fixed integer array b[0:n-1], where n > 0, sets
x to the smallest value of b. The program should be nondeterministic if the
smallest value occurs more than once in b. The precondition Q, postcondition
R, loop invariant P and bound function t are

Q: 0 < n
R: x ≤ b[0:n-1] ∧ (E j: 0 ≤ j < n: x = b[j])
P: 1 ≤ i ≤ n ∧ x ≤ b[0:i-1] ∧ (E j: 0 ≤ j < i: x = b[j])
t: n - i
Section 15.2 Making Progress Towards Termination 185
4. Write a program for the problem of exercise 3, but use the invariant and bound
function
5. Write a program that, given a fixed integer n > 0, sets variable i to the highest
power of 2 that is at most n. The precondition Q, postcondition R, loop invariant
P and bound function t are

    Q: 0 < n
    R: 0 < i ≤ n < 2*i ∧ (E p: i = 2^p)
    P: 0 < i ≤ n ∧ (E p: i = 2^p)
    t: n − i
6. Translate program (15.1.7) into the language of your choice -PL/I, Pascal,
FORTRAN, etc.- remembering the need for the operation cand. Compare your
answer with (15.1.7).
Four-tuple Sort
Consider the following problem. Write a program that sorts the four
integer variables q0, q1, q2, q3. That is, upon termination the following
should be true: q0 ≤ q1 ≤ q2 ≤ q3.

Implicit is the fact that the values of the variables should be permuted
-for example, the assignment q0, q1, q2, q3 := 0, 0, 0, 0 is not a solution,
even though it establishes q0 ≤ q1 ≤ q2 ≤ q3. To convey this information
explicitly, we use Qi to denote the initial value of qi, and write the formal
specification

    R: q0 ≤ q1 ≤ q2 ≤ q3 ∧ perm((q0, q1, q2, q3), (Q0, Q1, Q2, Q3))

where the second conjunct perm(..., ...) of R means that the four
variables q0, q1, q2, q3 contain a permutation of their original values.
A loop will be written. Its invariant expresses the fact that the four
variables must always contain a permutation of their initial values:

    P: perm((q0, q1, q2, q3), (Q0, Q1, Q2, Q3))

An inversion in (q0, q1, q2, q3) is a pair of values that is out of order,
qi > qj for i < j. Note that this includes all pairs, and not just adjacent
ones. For example, the number of inversions in (1,3,2,0) is 4. So the
bound function is

    t: the number of inversions in (q0, q1, q2, q3)

The invariant is obviously true initially, so no initialization is needed.
In the last section, at this point of the development the guard of the
loop was determined. Instead, here we will look for a number of guarded
commands, each of which makes progress towards termination. The
invariant indicates that the only possible commands are those that swap
(permute) the values of two or more of the variables. To keep things
simple, consider only swaps of two variables. There are six possibilities:
q0, q1 := q1, q0 and q1, q2 := q2, q1, etc.

Now, execution of a command must make progress towards termination.
Consider one possible command, q0, q1 := q1, q0. It decreases the
number of inversions in (q0, q1, q2, q3) iff q0 > q1. Hence, the guarded
command q0 > q1 → q0, q1 := q1, q0 will do. Each of the other 5 possibilities
is similar, and together they yield the program

    do q0 > q1 → q0, q1 := q1, q0
    □ q1 > q2 → q1, q2 := q2, q1
    □ q2 > q3 → q2, q3 := q3, q2
    □ q0 > q2 → q0, q2 := q2, q0
    □ q1 > q3 → q1, q3 := q3, q1
    □ q0 > q3 → q0, q3 := q3, q0
    od

Upon termination all the guards are false, so that

    q0 ≤ q1 ≤ q2 ≤ q3.

Together with invariant P, this implies the desired result. But note that
only the first three guards were needed to establish the desired result.
Therefore, the last three guarded commands can be deleted, yielding the
program

    do q0 > q1 → q0, q1 := q1, q0
    □ q1 > q2 → q1, q2 := q2, q1
    □ q2 > q3 → q2, q3 := q3, q2
    od
Discussion
The approach used here can be summarized as follows: keep developing
guarded commands, each of which decreases the bound function while
maintaining the invariant, until P ∧ ¬BB implies the desired result R.
Although execution is nondeterministic, there is exactly one final state,
so that in terms of the result the program is deterministic.
The number of iterations of the loop is equal to the number of
inversions, which is at most 6.
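The development above can be checked mechanically. The sketch below is not from the text: it simulates the three guarded commands in Python, with `random.choice` standing in for nondeterministic guard selection and `inversions` playing the bound function t (the function names are mine).

```python
import random

def inversions(q):
    # Bound function t: the number of pairs of values out of order,
    # counting all pairs, not just adjacent ones.
    return sum(1 for a in range(4) for b in range(a + 1, 4) if q[a] > q[b])

def four_tuple_sort(q0, q1, q2, q3):
    q = [q0, q1, q2, q3]
    while True:
        # The three guards q0 > q1, q1 > q2, q2 > q3.
        enabled = [i for i in range(3) if q[i] > q[i + 1]]
        if not enabled:              # all guards false: q0 <= q1 <= q2 <= q3
            return tuple(q)
        i = random.choice(enabled)   # nondeterministic choice
        t = inversions(q)
        q[i], q[i + 1] = q[i + 1], q[i]
        assert inversions(q) == t - 1   # each swap removes exactly one inversion
```

Whatever choices `random.choice` makes, the inversion count falls by one per iteration, so execution ends, sorted, after at most 6 swaps.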
[Diagram: array b[0:m−1, 0:n−1], with "x not here" marking rows 0:i−1
and elements b[i, 0:j−1].]
The bound function is the sum of the number of values in the untested section
and the number of rows in the untested section: t = (m−i)*n − j + (m−i).
The additional value m−i is needed because possibly j = n. As a first
step in the development, determine the initialization for the loop.

The obvious choice is i, j := 0, 0, for then the section in which "x is not
here" is empty. Note carefully how the invariant includes j ≤ n, instead
of j < n. This is necessary because the number of columns, n, could be 0.
Next, guarded commands for the loop must be developed. What is the
simplest command possible, and what is a suitable guard for it?

The simplest command is j := j+1, and a suitable guard for it is

    i ≠ m ∧ j ≠ n cand x ≠ b[i,j]
Note that this guard has been made as weak as possible. Now, does a
loop with this single guarded command solve the problem? Why or why
not? If not, what other guarded command can be used?
A loop with only this guarded command could terminate with i < m ∧
j = n, and this, together with the invariant, is not enough to prove R.
Indeed, if the first row of b does not contain x, the loop will terminate
after searching through only the first row! Some guarded command must
deal with increasing i.
The command i:= i+1 may only be executed if i <m. Moreover, it
has a chance of keeping P true only if row i does not contain x, so
consider executing it only under the additional condition j = n. But this
means that j should be set to 0 also, so that the condition on the current
row i is maintained. This leads to the program
(15.2.4) i, j := 0, 0;
         do i ≠ m ∧ j ≠ n cand x ≠ b[i,j] → j := j+1
         □ i ≠ m ∧ j = n → i, j := i+1, 0
         od
Upon termination ¬BB holds, and this together with P implies the result
R. Hence, the program is correct. Note that in the case i = m the
invariant implies that x is not in rows 0 through m−1 of b, which means
that x ∉ b.
190 Part III. The Development of Programs
Discussion
This loop was developed by continuing to develop simple guarded
commands that made progress towards termination until P ∧ ¬BB ⇒ R.
This led to a loop with a form radically different from what most
programmers are used to developing (partly because they don't usually know
about guarded commands). It does take time to get used to (15.2.4) as a
loop for searching a two-dimensional array.
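To see (15.2.4) in a conventional language, here is a direct Python transcription (not from the text; the function name and the list-of-lists representation of b are mine). Python's short-circuit `and` provides the effect of cand, so b[i][j] is never evaluated when j = n.

```python
def search(b, m, n, x):
    # Program (15.2.4), transcribed guard by guard.
    i, j = 0, 0
    while True:
        if i != m and j != n and x != b[i][j]:
            j = j + 1
        elif i != m and j == n:
            i, j = i + 1, 0
        else:
            break
    # Termination: either x = b[i][j], or i = m and x is not in b.
    return i, j
```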
This problem is often used to argue for the inclusion of gotos or loop
"exits" in a conventional language, because, unless one uses an extra
variable commonly called a "flag", the conventional solution to the problem
needs two nested loops and an "exit" from the inner one:
(15.2.5) i, j := 0, 0;
         while i ≠ m do
             begin while j ≠ n do
                       if x = b[i,j] then goto loopexit
                       else j := j+1;
                   i, j := i+1, 0
             end;
         loopexit:
We see, then, that the guarded command notation and the method of
development together lead to a simpler, easier-to-understand, solution to
the problem -provided one understands the methodology.
How could program (15.2.4) be executed effectively? An optimizing
compiler could analyze the guards and commands and determine the
paths of execution given in diagram (15.2.6) -in the diagram, an arrow
with F (T) on it represents the path to be taken when the term from
which it emanates is false (true). But (15.2.6) is essentially a flowchart for
program (15.2.5)! At least in this case, therefore, the "high level" program
(15.2.4) can be simulated using the "lower-level" constructs of Pascal,
FORTRAN and PL/I.
Program (15.2.4) is developed from sound principles. Program (15.2.5)
is typically developed in an ad hoc fashion, using development by test
cases, the result being that doubt is raised whether all cases have been
covered.
Exercises for Section 15.2 191
(15.2.6) [Flowchart of (15.2.4): the tests i ≠ m, j ≠ n and x ≠ b[i,j] are
evaluated in turn, with T and F arrows leading to j := j+1, to
i, j := i+1, 0, or out of the loop.]
The first two lines hold because any divisor of x and y is also a divisor of x+y
and x−y -since x/d ± y/d = (x ± y)/d for any divisor d of x and y.
Your program has the result assertion

    R: x = y = gcd(X, Y)

The program should not use multiplication or division. It should be a loop (with
initialization) with invariant

    P: 0 < x ∧ 0 < y ∧ gcd(x, y) = gcd(X, Y)

and bound function t: x + y. Use the properties given above to determine
possible guarded commands for the loop.
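One possible answer can be sketched in Python (the sketch, including the name `gcd_by_subtraction`, is mine, not the book's):

```python
def gcd_by_subtraction(X, Y):
    # Requires X > 0 and Y > 0.
    # Invariant P: 0 < x and 0 < y and gcd(x, y) = gcd(X, Y).
    # Bound function t: x + y, decreased by every subtraction.
    x, y = X, Y
    while x != y:
        if x > y:
            x = x - y   # a divisor of x and y also divides x - y
        else:
            y = y - x
    return x            # x = y = gcd(X, Y)
```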
3. Redo the program of exercise 2 to determine the greatest common divisor of
three numbers X, Y and Z that are > 0.
4. Write an algorithm to determine gcd(X, Y) for X, Y ≥ 0 using multiplication
and division (see exercise 2). For example, it is possible to subtract a multiple
of x from y. The result assertion, invariant and bound function are
    R: x = 0 ∧ y = gcd(X, Y)
    P: 0 ≤ x ∧ 0 ≤ y ∧ (0, 0) ≠ (x, y) ∧ gcd(x, y) = gcd(X, Y)
    t: 2*x + y
5. This problem concerns that part of a scanner of a compiler -or any program
that processes text- that builds the next word or sequence of nonblank symbols.
Characters b[j:79] of character array b[0:79] are used to hold the part of the
input read in but "not yet processed", and another line of input can be read into
b by executing read(b). Input lines are 80 characters long.
It is known that b[j:79] catenated with the remaining input lines is a
sequence

    W | '-' | REST

where "|" denotes catenation, "-" denotes a blank space, W is a nonempty
sequence of nonblank characters, and REST is a string of characters. The
purpose of the program to be written is to "process" the input word W, deleting
it from the input and putting it in a character array s. W is guaranteed to be
short enough to fit in s. For example, the top part of the diagram below shows
sample initial conditions with 10-character lines. The bottom diagram gives
corresponding final conditions.
[Diagram: initial conditions with W: 'WORD' and REST: 'NEXT-ONE-IS-IT--';
b[j:79] holds 'WO' and the remaining 10-character lines are still in the
input. The final conditions show W stored in s and the input advanced past
W and the following blank.]
(a) (b)
Figure 16.1.1 Blowing up the balloon
Hence, the set of states represented by P must contain both the set of
possible initial states and the set of final states represented by R, as
shown in Fig. 16.1.1(b).
Consider R to be the deflated state of a balloon, which is blown up to
its complete inflated state, P, just before execution of the loop. Each
iteration of the loop will then let some air out of the balloon, until the
last iteration reduces the balloon back to its deflated state R. This is
illustrated in Fig. 16.1.2, where P0 = P is the balloon before the first
iteration, P1 the balloon after the first iteration and P2 the balloon after the
second iteration.
[Figure 16.1.2: the balloon P0 and its successive deflations P1 and P2.]
Remark: The balloon and its various states of deflation are defined more
precisely as follows. P is the completely inflated balloon. Consider the
bound function t. Let t0 be the initial value of t, which is determined by
the initialization, t1 the value of t after the first iteration, t2 the value of t
after the second iteration, etc. Then the predicate

    P ∧ 0 ≤ t ≤ ti

denotes the set of states in the balloon after the ith iteration. Thus,
initialization deflates the balloon to include only states in P ∧ 0 ≤ t ≤ t0, the
first iteration deflates it more to P ∧ 0 ≤ t ≤ t1, etc. □
Weakening a predicate
Here are four ways of weakening a predicate R:

    1. Delete a conjunct of R.
    2. Replace a constant by a variable.
    3. Enlarge the range of a variable.
    4. Add a disjunct.
The first three methods are quite useful. In each, insight for weakening
R comes directly from the form and content of R itself, and the
number of possibilities to try is generally small. The methods may
therefore provide the kind of directed, disciplined development we are
looking for.
The fourth method of weakening a predicate is rarely useful in
programming, in all its generality. There is no reason to try to add one
disjunct rather than another, and hence adding a disjunct would be a random
task with an infinite number of possibilities. We shall not analyze this
method further.
    P: 0 ≤ a ∧ a² ≤ n.

    a := 0; do (a+1)² ≤ n → ? od
Discussion
Here, strategy 15.1.4 was used to develop the loop -first the guard
was created and then the loop body. The guard was created in such a
simple and useful manner that it deserves being called a strategy itself.
Linear search
As a second example of deleting a conjunct, consider the following
problem. Given is a fixed array b[0:m−1] where 0 < m. It is known that
a fixed value x is in b[0:m−1]. Write a program to determine the first
occurrence of x in b -i.e. to store in a variable i the least integer such
that x = b[i].
The first task is to specify the program more formally. This is easy to
do; we have the following precondition Q and postcondition R:
    Q: 0 < m ∧ x ∈ b[0:m−1]
    R: 0 ≤ i < m ∧ x ∉ b[0:i−1] ∧ x = b[i]

or, written out in full,

    R: 0 ≤ i < m ∧ (A j: 0 ≤ j < i: x ≠ b[j]) ∧ x = b[i]
A good invariant should be easy to establish. The first two conjuncts are
established by the assignment i := 0, while most of the difficulty of the
program lies in establishing the third. Hence, it makes sense to delete the
third conjunct, yielding the following invariant:

    (16.2.4) P: 0 ≤ i < m ∧ x ∉ b[0:i−1]

For the guard, use the complement of the deleted conjunct. Thus far, the
program is

    i := 0; do x ≠ b[i] → ? od
Choose the command for the loop, explaining how it was found.

The only simple command that decreases the bound function t: m−i
while keeping P true is i := i+1, so the program is

    (16.2.5) i := 0; do x ≠ b[i] → i := i+1 od
Discussion
The program is certainly correct, but let us try formally to prove it
using checklist 11.9. Invariant (16.2.4) is initially true, since it follows
from precondition Q and the assignment i := 0. Next, show that execution
of the body keeps the invariant true:

    (P ∧ x ≠ b[i]) ⇒ wp("i := i+1", P)

Is this true? Certainly not -the antecedent is not enough to prove that
i+1 < m! The problem is that we have neglected to include in the invariant
the fact that x ∈ b[0:m−1]. Formally, the invariant should be

    (16.2.6) P: 0 ≤ i < m ∧ x ∉ b[0:i−1] ∧ x ∈ b[0:m−1]
With this slight change, one can formally prove that the program is
correct (see exercise 5).
In omitting the conjunct x ∈ b[0:m−1] we were simply using our
mathematician's license to omit the obvious. Note that all the free
identifiers of x ∈ b[0:m−1] are fixed throughout Linear Search: x, b and m are
not changed. Hence, facts concerning only these identifiers do not
change. It can be assumed that the reader of the algorithm and its
surrounding text will remember these facts, so that they don't have to be
repeated over and over again.
Later on, such obvious detail will be omitted from the picture when it
doesn't hamper understanding. For now, however, your task is to gain
experience with the formalism and its use in programming, and for this
purpose it is better to be as precise and careful as possible. It is also to
be remembered that text surrounding a program in a book such as this
one rarely surrounds that same program when it appears in a program
listing, as it should. Be extremely careful in your program listings to
present the program as clearly and fully as possible.
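As a sketch of the finished Linear Search in Python (the function name is mine): the precondition x ∈ b is exactly what keeps the subscript in range, the informal counterpart of being unable to prove i+1 < m without it.

```python
def linear_search(b, x):
    # Requires x in b.  Invariant (16.2.6): 0 <= i < m,
    # x not in b[0:i-1], and x in b[0:m-1].  Bound t: m - i.
    i = 0
    while x != b[i]:   # guard: the complement of the deleted conjunct
        i = i + 1
    return i           # least i such that x = b[i]
```

If the precondition is violated, the loop runs off the end of b and raises an IndexError, which is Python's way of reporting the hole in invariant (16.2.4).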
5. Prove with the help of checklist 11.9 that program (16.2.5) is correct, using loop
invariant (16.2.6) and bound function t: m−i.
(16.3.1) R: s = (Σ j: 0 ≤ j < n: b[j])
The fact that each array element is involved in the sum suggests that a
loop of some form should be developed, so R should be weakened to
yield a suitable invariant P. R contains the constant n (i.e. n may not
be changed). R can therefore be weakened by replacing n by a fresh
variable i, yielding
    s = (Σ j: 0 ≤ j < i: b[j])
Discussion
Two other constants of R could be replaced to yield an invariant.
Replacing the constant 0 yields the invariant

    s = (Σ j: i ≤ j < n: b[j])

Using this as an invariant, one can develop a loop that adds the elements
b[j] to s in decreasing order of subscript value j (see exercise 1 of section
15.1).
If result assertion R is written as

    s = (Σ j: 0 ≤ j ≤ n−1: b[j])

the constant expression n−1 can be replaced to yield the invariant

    −1 ≤ i ≤ n−1 ∧ s = (Σ j: 0 ≤ j ≤ i: b[j])

Note carefully the lower bound on i this time. Because n can be zero,
the array can be empty. Therefore the assignment i, s := 0, b[0], a
favorite of many for initializing such a loop, cannot be used here. The
initialization must be i, s := −1, 0. (See exercise 1).
This example illustrates that there may be several constants to choose
from when replacing a constant by a variable. In general, the constant is
chosen so that the resulting invariant can be easily established, so that the
guard(s) of the loop are simple and, of course, so that the command(s) of
the loop can be easily written. This is a trial-and-error process, but one
gets better at it with practice.
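A Python rendering of the loop obtained from the first invariant, with the invariant checked at each iteration (the assertion and names are mine):

```python
def array_sum(b):
    # Invariant: 0 <= i <= n and s = (sum j: 0 <= j < i: b[j]),
    # obtained from R by replacing the constant n by the variable i.
    n = len(b)
    i, s = 0, 0
    while i != n:
        assert s == sum(b[:i])      # the invariant holds here
        i, s = i + 1, s + b[i]      # both right-hand sides use the old i
    return s
```

Python's simultaneous assignment, like the book's, evaluates all right-hand sides before assigning, so s + b[i] uses the old value of i.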
Too often, variables are introduced into a program without the
programmer really knowing why, or whether they are even needed. In general,
the following is a good principle to follow.

    (16.3.2) Principle: Introduce a variable only when there is a good
             reason for doing so.
Section 16.3 Replacing a Constant By a Variable 201
We now have at least one good reason for introducing a variable: the
need to weaken a result assertion to produce an invariant. It goes without
saying that each variable introduced will be defined in some manner.
Part of this definition, which is often forgotten, is the range of the
variable. We emphasize the need for this range with the following principle.

    Principle: When introducing a variable, give its range as part of its
    definition.
A program for this problem was developed in section 16.2 by deleting the
conjunct n < (a+1)²; the program took time proportional to √n. Here
we use the method of replacing a constant by a variable.

First try replacing the expression a+1 by a fresh variable b to yield

    a² ≤ n < b²
The precondition of (16.3.5) will be the invariant together with the guard
of the loop:

    P ∧ a+1 ≠ b.
Discussion
It may seem that the technique of halving the interval was pulled out
of a hat. It is simply one of the useful techniques that programmers must
know about, for its use often speeds up programs considerably. The
execution time of this program is proportional to log n, while the execution
time of the program developed in section 16.2 is proportional to √n.
Program (16.3.7) illustrates another reason to introduce a variable: d
has been introduced to make a local optimization. The introduction of d
not only reduces the number of times the expression (a+b) ÷ 2 is
evaluated, it also makes the program more readable.
Note that no definition is given for d. Variable d is essentially a con-
stant of the loop body. It is assigned a value upon entrance to the loop
body, and this value is used throughout the body. It carries no value
from iteration to iteration. Moreover, d is used only in two adjacent
lines, and its use is obvious from these two lines. A definition of d would
belabor the obvious and is therefore omitted.
A similar program can be developed by replacing the second occur-
rence of a in (16.3.4) by a variable -see exercise 3.
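Program (16.3.7) runs essentially unchanged in Python; ÷ becomes integer division //. The transcription below (function name mine) keeps d as the local constant of the loop body:

```python
def isqrt(n):
    # Program (16.3.7): maintain a < b <= n+1 and a*a <= n < b*b,
    # halving the interval (a, b) at each iteration.
    assert n >= 0
    a, b = 0, n + 1
    while a + 1 != b:
        d = (a + b) // 2    # a < d < b, so the interval shrinks
        if d * d <= n:
            a = d
        else:
            b = d
    return a                # a*a <= n < (a+1)*(a+1)
```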
The only difficulty in writing (16.3.9) might have been in getting k's
bounds correct. Subsequently, we will work with R as written in (16.3.8),
but we will turn to the more formal definition (16.3.9) when insight is
needed.
Clearly, iteration is needed for this program. Remembering the point
of this section, what loop invariant would you choose?
What should be the bound function, the initialization and the guard of the
loop?
The first conjunct is implied by the guard of the loop. What extra condi-
tion is needed to imply the second conjunct?
(16.3.11) i, p := 1, 1;
          {invariant P: 1 ≤ i ≤ n ∧
              p is the length of the longest plateau of b[0:i−1]}
          {bound t: n−i}
          do i ≠ n → if b[i] ≠ b[i−p] → i := i+1
                     □ b[i] = b[i−p] → i, p := i+1, p+1
                     fi
          od
Discussion
A common mistake in developing this program is to introduce, too
early in the game, a variable v that contains the value of the latest,
longest plateau, so that the test would be b[i] = v instead of b[i] = b[i−p]. I
made this mistake the first time I developed the program. But it only
complicates the program. Principle (16.3.2) -introduce a variable only
when there is a good reason for doing so- applies here.
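A Python transcription of (16.3.11) (the function name is mine; like the original it assumes n ≥ 1 and b ordered):

```python
def longest_plateau(b):
    # Invariant: 1 <= i <= n and p is the length of the
    # longest plateau of b[0:i-1].  Bound t: n - i.
    n = len(b)
    i, p = 1, 1
    while i != n:
        if b[i] == b[i - p]:       # a plateau of length p+1 ends at i
            i, p = i + 1, p + 1
        else:
            i = i + 1
    return p
```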
Exercises for Section 16.3 205
    R: s = (Σ j: 0 ≤ j ≤ n−1: b[j])
The invariant is to be found from the result assertion by replacing the constant
n−1 by a variable.
2. Prove formally that the body of the loop of program (16.3.7) actually decreases
the bound function (point 5 of Checklist 11.9). The important point here is that,
when the body of the loop is executed, a+1 < b.
3. Develop a program for approximating the square root of n by replacing the
second occurrence of a in (16.3.4) by b, yielding the invariant

    a² ≤ n < (b+1)²
Don't forget to choose suitable bounds for b. Compare the resulting program,
and the effort needed to derive it, with the development presented earlier.
4. (Binary Search). Write a program that, given fixed x and fixed, ordered (by
≤) array b[1:n] satisfying b[1] ≤ x < b[n], finds where x belongs in the array.
That is, for a fresh variable i the program establishes

    R: 1 ≤ i < n ∧ b[i] ≤ x < b[i+1]

The execution time of the program should be proportional to log n.

After writing the program, incorporate it in a program for a more general
search problem: with no restriction on the value x, determine i to satisfy

    (i = 0 ∧ x < b[1]) ∨
    (1 ≤ i < n ∧ b[i] ≤ x < b[i+1]) ∨
    (i = n ∧ b[n] ≤ x)
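One possible Python sketch for the first part of the exercise (the sketch is mine; b is an ordinary 0-indexed list, so the book's b[1:n] with b[1] ≤ x < b[n] becomes b[0] ≤ x < b[n−1]):

```python
def position(b, x):
    # Requires b ordered with b[0] <= x < b[-1].
    # Invariant: 0 <= i < k <= n-1 and b[i] <= x < b[k];
    # the interval (i, k) is halved each iteration, so time is O(log n).
    i, k = 0, len(b) - 1
    while i + 1 != k:
        d = (i + k) // 2
        if b[d] <= x:
            i = d
        else:
            k = d
    return i        # b[i] <= x < b[i+1]
```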
5. Write a program that, given fixed, ordered array b[0:n−1] where n > 0, finds
the number of plateaus in b[0:n−1].
6. Write a program that, given fixed array b[0:n−1] where n > 0, finds the
position of a maximum value in b -i.e. establish

    R: 0 ≤ i < n ∧ (A j: 0 ≤ j < n: b[j] ≤ b[i])
The program should be nondeterministic if the maximum value occurs more than
once in b.
7. Write a program that, given fixed array b[0:n−1] where n ≥ 0, stores in d
the number of odd values in b[0:n−1].
8. Given are two fixed, ordered arrays f[0:m−1] and g[0:n−1], where m, n
≥ 0. It is known that no two elements of f are equal and that no two elements
of g are equal. Write a program to determine the number of values that occur
both in f and g. That is, establish

    k = (N i, j: 0 ≤ i < m ∧ 0 ≤ j < n: f[i] = g[j])
9. Write a program that, given fixed array b[0:n−1], where n ≥ 0, determines
whether b is zero: using a fresh Boolean variable s, the program establishes

    R: s = (A j: 0 ≤ j < n: b[j] = 0)
10. Write another program to find the length of the longest plateau of b[0:n−1].
This algorithm uses the idea that the loop body should investigate one plateau at
each iteration. The loop invariant is therefore
You may use the fact that the length of the longest plateau of an empty array is
zero. This exercise is illustrative of the fact that not all loop invariants will arise
directly from considering the strategies for developing invariants discussed in this
chapter. Here, we actually added a conjunct, thus strengthening the invariant, to
produce another program.
R: i =iv
The Linear Search Principle indicates that a search for a value i satisfying
R should be in order of increasing value, beginning with the lowest.
Section 16.4 Enlarging the Range of a Variable 207
Thus, the invariant for the loop will be

    P: 0 ≤ i ≤ iv
Discussion
The method used to develop the invariant was to enlarge the range of a
variable. In R, variable i could have only one value: iv. This range of
values is enlarged to the set {0, 1, ..., iv}. In this case, the enlarging
came from weakening the relation i = iv to i ≤ iv and then putting a
lower bound on i. This method is similar to the last one, introducing a
variable and supplying its range -it just happens that the variable is
already present in R.
The example illustrates another important principle:
Given are three alphabetically ordered lists of names, f, g and h. Write a
program to determine the (alphabetically) least value that is on all three
of them; this least value is known to exist.
This program is often written in 10 to 30 lines of code in FORTRAN,
PL/I or ALGOL 68 by those unexposed to the methods given in this
book. The reader might wish to develop the program completely before
studying the subsequent development.
What is the first step in writing the program? Do it.
The first step is to write pre- and postconditions Q and R. Since the lists
f, g and h are fixed, we will use the fact that they are alphabetically
ordered without mentioning it in Q or R. So Q is simply T. Using iv,
jv and kv to denote the least values satisfying f[iv] = g[jv] = h[kv], and
using three simple variables i, j and k, the postcondition R can be written
as
    R: i = iv ∧ j = jv ∧ k = kv
Notice how the problem of defining the values iv, jv and kv in detail has
been finessed. We know what least means, and hope to proceed without a
formal definition. Now, why should a loop be used? Develop the invari-
ant and bound function for the loop.
(16.4.2) i, j, k := 0, 0, 0;
         do ? → i := i+1
         □ ? → j := j+1
         □ ? → k := k+1
         od
We have:

    wp("i := i+1", P) = 0 ≤ i+1 ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv

The last two conjuncts, and also 0 ≤ i+1, are implied by the invariant, so
only i+1 ≤ iv must be implied by the guard. The guard cannot be i+1 ≤
iv, because the program may not use iv. But, the relation i+1 ≤ iv,
together with P, means that f[i] is not the crook, and this is true if
f[i] < g[j]. Thus, the guard can be f[i] < g[j]. In words, since the
crook does not come alphabetically before g[j], if f[i] comes alphabetically
before g[j], then f[i] cannot be the crook.
But the guard could also be f[i] < h[k] and, for the moment, we
choose the disjunction of the two for the guard:

    f[i] < g[j] ∨ f[i] < h[k]
The other guards are written in a similar fashion to yield the program
(16.4.3) i, j, k := 0, 0, 0;
         do f[i] < g[j] ∨ f[i] < h[k] → i := i+1
         □ g[j] < h[k] ∨ g[j] < f[i] → j := j+1
         □ h[k] < f[i] ∨ h[k] < g[j] → k := k+1
         od
(16.4.4) P ∧ ¬BB ⇒ R
Note that only the first disjunct of each guard is needed to prove (16.4.4).
Hence, the second disjuncts can be eliminated to yield the program
(16.4.5) i, j, k := 0, 0, 0;
         do f[i] < g[j] → i := i+1
         □ g[j] < h[k] → j := j+1
         □ h[k] < f[i] → k := k+1
         od
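Program (16.4.5) in Python (the transcription and the name are mine; it assumes, as the text does, that a common value exists in the three ascending sequences):

```python
def least_common(f, g, h):
    # Program (16.4.5): each guard certifies that one current candidate
    # cannot be the sought value, so its index may be advanced.
    i, j, k = 0, 0, 0
    while True:
        if f[i] < g[j]:
            i = i + 1
        elif g[j] < h[k]:
            j = j + 1
        elif h[k] < f[i]:
            k = k + 1
        else:
            # All guards false: f[i] >= g[j] >= h[k] >= f[i],
            # hence f[i] = g[j] = h[k], establishing R.
            return i, j, k
```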
Discussion
In developing this program, for the first guard, at first f[i] < g[j] is
developed, and then weakened to f[i] < g[j] ∨ f[i] < h[k]. Why is it
weakened?

Well, the first concern is to obtain a correct program; the second concern
is to obtain an efficient one. In proving correctness, one task is to
prove that, upon termination, (16.4.4) holds. The stronger ¬BB is, the
more chance we have of proving (16.4.4). Since BB is the complement of
¬BB, this means that the weaker BB is, the more chance we have of
proving (16.4.4). Thus, we have the following principle:

    Principle: First make the guards of a loop as weak as possible;
    strengthen them later, if desired, for the sake of efficiency, as long
    as P ∧ ¬BB ⇒ R continues to hold.
Inserting Blanks
Consider the following problem. Write a program that, given fixed
n ≥ 0, fixed p ≥ 0, and array b[0:n−1], adds p*i to each element b[i] of
b. Formally, using Bi to represent the initial value of b[i], we have

    R: (A i: 0 ≤ i < n: b[i] = Bi + p*i)
P' states that the first j elements of b have their final values. But the
fact that the other n−j elements have their initial values should also be
included, and the full invariant is

    P: 0 ≤ j ≤ n ∧ (A i: 0 ≤ i < j: b[i] = Bi + p*i) ∧
       (A i: j ≤ i < n: b[i] = Bi)
(16.5.1) j := 0;
         do j ≠ n → j, b[j] := j+1, b[j] + p*j od
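Program (16.5.1) in Python (names mine); the multiple assignment j, b[j] := j+1, b[j]+p*j becomes two steps, updating the element before moving the boundary:

```python
def add_offsets(b, p):
    # Invariant: elements b[0:j-1] hold their final values Bi + p*i,
    # elements b[j:n-1] still hold their initial values.
    j = 0
    while j != len(b):
        b[j] = b[j] + p * j   # give b[j] its final value
        j = j + 1             # extend the finished section
    return b
```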
    Q: b[i:i+n−1] = X[0:n−1] ∧ b[j:j+n−1] = Y[0:n−1]
    R: b[i:i+n−1] = Y[0:n−1] ∧ b[j:j+n−1] = X[0:n−1]
For the rest of the development, a less formal approach will be used,
which uses the insight gained thus far without requiring all the formal
details. We take for granted that only the sections mentioned should be
changed and that they do not overlap, and use the following diagrams for
the pre- and postconditions -"unswapped" ("swapped") means that the
values in the indicated section have their initial (final) values:
    Q: b[i:i+n−1] unswapped ∧ b[j:j+n−1] unswapped
    R: b[i:i+n−1] swapped ∧ b[j:j+n−1] swapped
Since each element of the two sections must be swapped, a loop is sug-
gested that will swap one element of each at a time. The first step in find-
ing the invariant is to replace the constant n of R by a variable k:
Section 16.5 Combining Pre- and Postconditions 213
    P': 0 ≤ k ≤ n ∧ b[i:i+k−1] swapped ∧ b[j:j+k−1] swapped

But P' does not indicate the state of array elements with indices in i+k:
i+n−1 and j+k:j+n−1. Adjusting P' suitably yields invariant P as the
predicate 0 ≤ k ≤ n together with

    (16.5.2) b[i:i+k−1] swapped ∧ b[i+k:i+n−1] unswapped ∧
             b[j:j+k−1] swapped ∧ b[j+k:j+n−1] unswapped
    k := 0;
    do k ≠ n → k, b[i+k], b[j+k] := k+1, b[j+k], b[i+k] od
Discussion
Again, the invariant was developed by replacing a constant of R by a
variable and then adding a conjunct in order to reflect the initial condi-
tions. We used diagrams in order to avoid some formalism and messy
detail. For some, pictures are easier to understand. But be especially
careful when using them, for they can lead to trouble. It is too easy to
forget about special cases, for example that an array section may be
empty, and this can lead to either an incorrect or less efficient program.
To avoid such cases, always define the ranges of new variables carefully
and be sure each picture is drawn in such a way that you know it can be
translated easily into a statement of the predicate calculus.
The development of the invariant was a two-step process. The invari-
ant can also be developed as follows. Both Q (or a slightly perturbed ver-
sion of it due to initialization) and R must imply the invariant. That is,
Q and R must be instances of the more general predicate P. Q states
that the sections are unswapped; hence, the invariant must include, for
each section, an unswapped subsection, which could be the complete sec-
tion. On the other hand, R states that the sections are swapped; hence,
the invariant must include, for each section, a swapped subsection, which
could be the complete section. One is led to draw diagram (16.5.2), using
a variable k to indicate the boundary between the unswapped and
swapped subsections.
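The one-line loop body translates directly to Python, whose simultaneous assignment does the element swap (the function name is mine; the two sections are assumed disjoint, as in the text):

```python
def swap_sections(b, i, j, n):
    # Invariant (16.5.2): the first k elements of each section are
    # swapped, the remaining n-k elements of each are untouched.
    k = 0
    while k != n:
        b[i + k], b[j + k] = b[j + k], b[i + k]
        k = k + 1
    return b
```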
    R: m ≤ p ≤ n ∧ b[m:p−1] ≤ x ∧ b[p:n−1] > x

More formally, if initially b[m:n−1] = B[m:n−1], then the program
establishes

    R: m ≤ p < n ∧ perm(b, B) ∧
       b[m:p−1] ≤ B[m] ∧ b[p] = B[m] ∧ b[p+1:n−1] > B[m]
Procedure Partition is a slight modification of the answer to exercise 3.
5. (The Dutch National Flag). Given is an array b[0:n−1] for fixed n ≥ 0, each
element of which is colored either red, white or blue. Write a program to permute
the elements so that all the red elements are first and all the blue ones last. That
is, the program is to establish that b consists of a (possibly empty) red section,
followed by a white section, followed by a blue section.
[Diagram: a linked list of values V0, V1, ..., Vn−1, represented by arrays
v and s; v[p] holds a value and s[p] the index of the node that follows
node p in the list.]
No ordering of values in array elements is implied. For example, the fact that V0
is followed by V1 in the linked list does not mean that v[p+1] contains V1.
Write a program that reverses the links -the arrows implemented by array s.
Array v should not be altered, and upon termination the linked list should be
reversed.
(16.3.7) {n ≥ 0}
         a, b := 0, n+1;
         {inv: a < b ≤ n+1 ∧ a² ≤ n < b²}
         do a+1 ≠ b → d := (a+b) ÷ 2;
                      if d*d ≤ n → a := d □ d*d > n → b := d fi
         od {a² ≤ n < (a+1)²}
[Diagram: the invariant of the two-dimensional search,
P: 0 ≤ i ≤ m ∧ 0 ≤ j ≤ n ∧ x is not in rows 0:i−1 of b nor in b[i, 0:j−1].]

A pair (i, j) precedes a pair (h, k) when

    i < h,  or  i = h ∧ j < k
For example, (−1, 5) < (5, 1) < (5, 2). This is called the lexicographic
ordering of integer pairs. It is extended in the natural way to the operators
≤, > and ≥. It is also extended to triples, 4-tuples, etc. For example,
(3,5,5) < (4,5,5) < (4,6,0) < (4,6,1).
(17.3) Theorem. Consider a pair (i, j), where i and j are expressions
containing variables used in a loop. Suppose each iteration of the
loop decreases (i, j) (lexicographically speaking). Suppose further
that i satisfies mini ≤ i ≤ maxi and j satisfies minj ≤ j ≤ maxj,
for constants mini, maxi, minj and maxj. Then execution of the
loop must terminate, and a suitable bound function is

    t = (i − mini)*(maxj − minj + 1) + (j − minj)

If one can exhibit a pair (triple, etc.) that satisfies theorem 17.3, there
is no need to actually produce the bound function, unless it makes things
clearer or is needed for other reasons. We give three examples.
In section 15.2 the following program (15.2.4) was written for searching
a (possibly empty) two-dimensional array.
Chapter 17 Notes on Bound Functions 219
    {0 ≤ m ∧ 0 ≤ n}
    i, j := 0, 0;
    do i ≠ m ∧ j ≠ n cand x ≠ b[i,j] → j := j+1
    □ i ≠ m ∧ j = n → i, j := i+1, 0
    od
    {(0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i,j]) ∨ (i = m ∧ x ∉ b)}
The pair (i, j) is initially (0, 0) and each iteration increases it. Therefore,
the pair (m−i, n−j) is decreased at each iteration. Further, we have
0 ≤ m−i ≤ m and 0 ≤ n−j ≤ n. Hence, theorem 17.3 can be applied and
the loop terminates. The bound function that arises from the use of the
theorem is (m−i)*(n+1) + n−j.
The tuple (q0, q1, q2, q3) is decreased (lexicographically speaking) by each
iteration. It is bounded below by the 4-tuple whose values are all min(q0, q1,
q2, q3) and is bounded above by the 4-tuple whose values are all max(q0,
q1, q2, q3). Hence, the loop terminates.
Removing a train from the yard reduces the number of trains and reduces
the total number of cars in the yard. On the other hand, splitting a train
leaves the total number of cars the same but increases the number of
trains by 1. So we choose the pair
    0! = 1
    n! = n*(n−1)!   for n > 0.
for writing programs iteratively that could have been written recursively.
One trick in doing so will be to think iteratively right from the beginning.
That is, if the program will be written using iteration, then the invariant
for the loop will have to be developed before writing the loop (as much as
possible).
The topic will allow us to bring up two important strategies and
discuss the relation between them, for recursive procedures often evolve from
their use. These strategies are: solving problems in terms of simpler
ones, and divide and conquer. While not on the same level of detail and
precision as some of the strategies presented earlier, these two old
methods can still be useful when practised consciously.
At the end of section 18.3, some comments are made concerning the
choice of data structures in programming and the use of program
transformations.
            m            n            p-1
    Q:  b  |  B[m:n-1]  |  B[n:p-1]  |

where B denotes the initial value of array b. The program should swap
the two array sections, using only a constant amount of extra space
(independent of m, n and p), thus establishing the predicate
Section 18.1 Solving Simpler Problems First 223
                     m            n'           p-1
    (18.1.2)  R:  b  |  B[n:p-1]  |  B[m:n-1]  |
                 n-j           n
        b  |  swap with   |  swap with   |
           |  b[n:n+j-1]  |  b[n-i:n-1]  |
i,j:=n-m,p-n; {PI
do i> j - swapequals(b, n-i, n, j); i:= i-j
Ui <j - swapequals(b, n-i, n+j-i,i); j:= j-i
od;
{PAi=j}
swapequals(b, n -i, n, i)
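The program can be run as written once swapequals is spelled out. A Python sketch (the function names are ours, not the book's; it assumes m < n < p, i.e. non-empty sections):

```python
def swap_equals(b, s, t, k):
    # swap the equal-length sections b[s:s+k] and b[t:t+k], elementwise
    for h in range(k):
        b[s + h], b[t + h] = b[t + h], b[s + h]

def swap_sections(b, m, n, p):
    """Swap sections b[m:n] and b[n:p] in place, using only a constant
    amount of extra space; assumes m < n < p."""
    i, j = n - m, p - n
    while i != j:
        if i > j:
            swap_equals(b, n - i, n, j)
            i -= j
        else:
            swap_equals(b, n - i, n + j - i, i)
            j -= i
    swap_equals(b, n - i, n, i)
```

For b = [0, 1, 2, 3, 4] with (m, n, p) = (0, 2, 5) the call leaves b = [2, 3, 4, 0, 1].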
Discussion
This program could also have been written in recursive fashion as
    (18.1.3) {m < n < p}
    i, j:= n-m, p-n;
    {inv: 0 < i ∧ 0 < j ∧ gcd(n-m, p-n) = gcd(i, j)}
    do i > j → i:= i-j
     □ i < j → j:= j-i
    od
    {i = j = gcd(n-m, p-n)}
Exercises for Section 18.1 225
    f0 = 0
    f1 = 1
    fn = fn-1 + fn-2   for n > 1

The first eight Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13.
The definition of fn for n > 1 can be written in matrix notation as follows:

    ( fn   )     ( 1  1 ) ( fn-1 )
    (      )  =  (      ) (      )
    ( fn-1 )     ( 1  0 ) ( fn-2 )
It is fairly easy to write a program that takes time proportional to n to calculate
fn. However, in a subsequent section, 19.1, a program is given to perform
exponentiation x^n for positive integers n in time proportional to log n, where x
could be a matrix. Write a program to calculate fn in logarithmic time using the
simpler(?) problem of exponentiation.
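One solution along these lines, sketched in Python (the helper names are ours): raise the 2-by-2 matrix to the n-th power by repeated squaring, which uses O(log n) matrix multiplications.

```python
def mat_mult(a, b):
    # 2x2 integer matrix product
    return [[a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
            [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]]]

def mat_pow(m, n):
    # m**n by halving n: O(log n) multiplications
    r = [[1, 0], [0, 1]]            # identity matrix
    while n > 0:
        if n % 2 == 1:
            r = mat_mult(r, m)
            n -= 1
        else:
            m = mat_mult(m, m)
            n //= 2
    return r

def fib(n):
    # [[1,1],[1,0]]**n = [[f(n+1), fn], [fn, f(n-1)]], so entry [0][1] is fn
    if n == 0:
        return 0
    return mat_pow([[1, 1], [1, 0]], n)[0][1]
```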
However, if n >2 then a more general method must be used. The divide
and conquer strategy invites us to perform the sort by sorting two (or
more) sections of the array separately. Suppose the array is partitioned as
Section 18.2 Divide and Conquer 227
follows.
        0          k          n-1
    b  |     ?     |     ?     |
What condition must be placed on the two sections so that sorting them
separately yields an ordered array?
Every value in the first section should be ≤ every value in the second
section:

                  0               k               n-1
    (18.2.2)  b  |  ≤ b[k:n-1]   |  ≥ b[0:k-1]   |
This means that if the values of b can be permuted to establish the above
predicate, then to sort the array it remains only to sort the partitions
b[0:k-1] and b[k:n-1].
Actually, a procedure similar to one that establishes (18.2.2) has
already been written -see exercise 4 of section 16.5- so we will make
use of it. Procedure Partition splits a non-empty array section b[m:n-1]
into three partitions, where the value x in the middle one is the initial
value in b[m]:

                               m          p          n-1
    (18.2.3)  R: m ≤ p < n ∧ b |   ≤ x   | x |  > x  |
After partitioning the array as above, it remains to sort the two parti-
tions b[m:p-1] and b[p+1:n-1]. If they are small enough, they can be
sorted directly; otherwise, they can be sorted by partitioning again and
sorting the smaller sub-partitions. While one sub-partition is being sorted,
the bounds of the other must be stored somewhere. But sorting one will
generate two more smaller partitions to sort, and their bounds must be
stored somewhere also. And so forth.
To keep track of the partitions still to be sorted, use a set variable s to
contain their boundaries. That is, s is a set of pairs of integers and, if
(i,j) is in s, then b[i:j] remains to be sorted. We write the invariant
Note how English is used to eliminate the need for formally introducing
an identifier to denote the initial value of array b.
Thus, we arrive at the following program:
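A Python sketch of the idea — Partition, plus a set s of pairs recording the sections still to be sorted (the code and its names are ours, not the book's text of (18.2.5)):

```python
def partition(b, m, n):
    """Rearrange b[m:n] around x = b[m]; return p with
    b[m:p] <= x, b[p] == x, and b[p+1:n] > x (cf. (18.2.3))."""
    x = b[m]
    p = m
    for k in range(m + 1, n):
        if b[k] <= x:
            p += 1
            b[p], b[k] = b[k], b[p]
    b[m], b[p] = b[p], b[m]
    return p

def quicksort(b):
    s = {(0, len(b) - 1)}           # (i, j) in s: b[i:j+1] remains to be sorted
    while s:
        i, j = s.pop()              # Choose: any pair will do
        if j - i < 1:
            continue                # sections of length <= 1 are already sorted
        p = partition(b, i, j + 1)
        s.add((i, p - 1))
        s.add((p + 1, j))
```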
Discussion
Program (18.2.5) describes the basic idea behind Quicksort. Proof of
termination is left to exercise 1. The execution time of Quicksort is
O(n log n) on the average and O(n²) in the worst case. The space needed
in the worst case is O(n), which is more than it need be; exercise 2 shows
how to reduce the space.
In the development of this program, the guiding motivation was the
desire to divide and conquer. The simpler problem needed to effect the
divide and conquer was procedure Partition. Had we first noticed that
procedure Partition was available and asked how it could have been used,
we would have been using strategy 18.1.1, solve the problem in terms of
simpler ones.
                        A
                     /     \
                   B         C
                    \       / \
                     E     F   G
                    / \   / \
                   I   J K   L

    Figure 18.3.1  Example of a Binary Tree
Above, the term tree is defined in the easiest possible manner: recur-
sively. For that reason, many algorithms that manipulate trees are given
recursively also. Here, we wish to describe a few basic algorithms dealing
with trees, but using iteration. With a firm grasp of this material, it
should not be difficult to develop other algorithms that deal with trees,
graphs and other structures.
Implementing a tree
We describe one typical implementation of a tree, which is motivated
by the need in many algorithms to insert nodes into and delete nodes
from a tree. The implementation uses a simple variable p and three
arrays: root[0:?], left[0:?] and right[0:?].
Variable p contains an integer satisfying -1 ≤ p. It describes, or
represents, the tree.
If integer k describes a tree or subtree, then the following holds: if
k = -1 the tree is empty; otherwise root[k] contains the value at the root,
while left[k] and right[k] describe the left and right subtrees.
For example, the tree of Fig. 18.3.1 could appear as given in (18.3.1).
                       0   1   2   3   4   5   6   7   8   9  10
               root    B   A       C   E   F   I   J   K   L   G
    (18.3.1)   p = 1
               left   -1   0       5   6   8  -1  -1  -1  -1  -1
               right   4   3      10   7   9  -1  -1  -1  -1  -1
Some comments are in order. First, p need not equal 0; the root node
need not be described by the first elements of the arrays, root[0], left[0]
and right[0]. In fact, several trees could be maintained in the same three
arrays, using p1, p2 and p3 (say) to "point to their roots". This, of
course, implies that the nodes of the trees in the arrays need not be in any
particular order. In (18.3.1), the elements with index 2 of the three arrays
are not used in the representation of tree p at all. Moreover, the root of
the left subtree of A precedes A in the array, while the root of its right
subtree follows it. This means that one cannot process the tree by pro-
cessing the elements of root (and left and right) in sequential order.
In the rest of this section, we will deal with a tree p using the original
notations empty(p), root[p], left[p] and right[p]. Note, however, that
this notation is quite close to what one would use in a program dealing
with a tree implemented as just shown.
Section 18.3 Traversing binary trees 231
    (18.3.2)  R: #p = c
The first step, of course, is to give a definition of #p, in the hope that
it will yield insight into the program. Write a definition of #p -it may
help to use recursion, since tree is defined recursively.
    (18.3.3)  #p =  { empty(p)  → 0
                    { ¬empty(p) → 1 + #left[p] + #right[p]
The bound function was discovered by noting that the pair (#p - c, |s|) is
decreased (lexicographically speaking) by each iteration -see Chapter 17.
Note that it does not matter in which order the subtrees in set s are
processed. This is because the number of nodes in each subtree will be
added to c and addition is a commutative operation. In this case, the use
of the nondeterministic operation Choose(q, s), which stores an arbitrary
value in s into q, nicely frees us from having to make an unnecessary
choice.
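The node-counting loop can be transcribed into Python (a sketch; by our choice, not the book's, a tree is None when empty, else a tuple (value, left, right), and a list stands in for the multiset s, since a Python set would merge equal subtrees):

```python
def node_count(p):
    """Iterative #p: c counts nodes already accounted for, s holds
    subtrees still to be processed."""
    c, s = 0, [p]
    while s:
        q = s.pop()               # Choose: the order is immaterial for a count
        if q is not None:
            c += 1
            s.append(q[1])        # left subtree
            s.append(q[2])        # right subtree
    return c
```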
Preorder traversal
The preorder list of the nodes of a tree p, written preorder(p), is
defined as follows. If the tree is empty, it is the empty sequence ();
otherwise, it is the sequence consisting of root[p], followed by
preorder(left[p]), followed by preorder(right[p]).
For example, for the subtree e of Fig. 18.3.1 with root E we have

    preorder(e) = (E, I, J)
    preorder(a) = (A, B, E, I, J, C, F, K, L, G)
Using | to denote catenation of sequences, preorder(p) can be written as

    (18.3.6)  preorder(p) =  { empty(p)  → ()
                             { ¬empty(p) → (root[p]) | preorder(left[p]) |
                                           preorder(right[p])
Note that preorder(p) is defined recursively. This notation and the de-
finition of preorder in terms of catenation have been designed to allow us
to state and analyze various properties and algorithms in a simple, crisp
manner.
Note the similarity between definitions (18.3.3) and (18.3.6). They have
the same form, but the first uses the commutative operator + while the
second uses the non-commutative operator |. Perhaps the program to
calculate the preorder list may be developed by transforming program
Node Count so that it processes the trees of set s in a definite order.
First, let's rewrite Node Count in (18.3.8) to store the node values into
array b, instead of simply counting nodes. The invariant is
    (18.3.8) c, s:= 0, {p};
    {bound: 2*(#p - c) + |s|}
    do s ≠ {} → Choose(q, s); s:= s - {q};
                if empty(q)  → skip
                 □ ¬empty(q) → c, b[c]:= c+1, root[q];
                               s:= s ∪ {right[q]} ∪ {left[q]}
                fi
    od  {c = #p ∧ b[0:c-1] contains the nodes of p}
Discussion
In (18.3.5), the order in which the left and right subtrees are stored in
set s is immaterial, because addition, which is being performed on the
number of nodes in each, is commutative. In (18.3.10), however, the
order in which nodes are stored in sequence r is important because opera-
tion | is not commutative.
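In Python the definite order falls out of using a stack for s: pushing the right subtree before the left makes the left subtree processed first (a sketch; trees are represented, by our choice, as None or (value, left, right)):

```python
def preorder(p):
    """Preorder list of the node values of tree p."""
    r, s = [], [p]
    while s:
        q = s.pop()
        if q is not None:
            r.append(q[0])        # root first ...
            s.append(q[2])        # ... then push right,
            s.append(q[1])        # so left is popped (processed) next
    return r
```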
My first development of this program, done over 5 years ago, was not
performed like this. It was an ad hoc process, with little direction,
because I was new at the game and had to struggle to learn and perfect
techniques.
Without the sequence notation (see Appendix 2), including the nota-
tion for catenation, one tries to work with English phrases, for example,
writing the invariant as
In general, this principle deals with data and its representation, as well as
with commands. We should use data structures that suit the problem,
and, once a correct program has been developed, deal with the problem of
changing the data structures to make their use more efficient and imple-
menting them in the programming language. This latter task, often called
"data refinement", has not received the attention that "program refine-
ment" has.
In a "modern" programming notation allowing "data encapsulation",
data refinement may just mean appending a program segment that des-
cribes how the objects are to be represented and the operations are to be
implemented. In other programming notations, it may mean transforming
the program so that it operates on allowable objects of the language.
3. Write a program to store in array b the postorder list of nodes of tree p. The
postorder list is defined as follows. If p is empty the postorder list is the empty
sequence (). If p is not empty, the postorder list is the postorder list of left[p],
followed by the postorder list of right[p], followed by root[p].
4. The root of a tree is defined to have depth 0, the roots of its subtrees have
depth 1, the roots of their subtrees have depth 2, and so on. The depth of the tree
itself is the maximum depth of its nodes. The depth of an empty tree is -1. For
example, in tree (18.3.1), A has depth 0, F has depth 2, and the tree itself has
depth 3. Write a program to calculate the depth of a tree.
Chapter 19
Efficiency Considerations
(19.1.1) Theorem. Suppose a loop has (at least) two guarded commands,
with guards B1 and B2. Then strengthening B2 to B2 ∧ ¬B1
leaves BB, and hence P ∧ ¬BB ⇒ R, unchanged.
    R: i = iv ∧ j = jv ∧ k = kv
Section 19.1 Restricting Nondeterminism 239
The invariant for a loop is found by using the Linear Search Principle,
(16.2.7), and enlarging the range of variables in R:

    (19.1.2) i, j, k:= 0, 0, 0;
    do f[i] < g[j] ∨ f[i] < h[k] → i:= i+1
     □ g[j] < h[k] ∨ g[j] < f[i] → j:= j+1
     □ h[k] < f[i] ∨ h[k] < g[j] → k:= k+1
    od
    i, j, k:= 0, 0, 0;
    do f[i] < g[j] → i:= i+1
     □ g[j] < h[k] → j:= j+1
     □ h[k] < f[i] → k:= k+1
    od
Note that theorem 19.1.1 could now be used to strengthen two of the
guards, but it is better not to. There is no reason for preferring one of the
commands over the others, and strengthening the guards using the
theorem will only complicate them and make the program less efficient.
In this case, the nondeterminism aids in producing the simplest solution.
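A deterministic Python rendering of the second program (if/elif forces an order among the guards that the original deliberately leaves open):

```python
def first_common(f, g, h):
    """Indices (i, j, k) of the first value common to ascending
    sequences f, g, h. Assumes such a value exists."""
    i = j = k = 0
    while not (f[i] == g[j] == h[k]):
        if f[i] < g[j]:
            i += 1
        elif g[j] < h[k]:
            j += 1
        else:                     # here h[k] < f[i] must hold
            k += 1
    return i, j, k
```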
Exponentiation
Consider writing a program that, given two fixed integers X and Y,
X ≥ 0 and Y ≥ 0, establishes R: z = X^Y. The invariant P and bound
function t for a loop are

    P: 0 ≤ y ∧ z*x^y = X^Y
    t: y
    {0 ≤ X ∧ 0 ≤ Y}
    x, y, z:= X, Y, 1;
    do 0 < y ∧ even(y) → y, x:= y÷2, x*x
     □ 0 < y           → y, z:= y-1, z*x
    od  {z = X^Y}
    {0 ≤ X ∧ 0 ≤ Y}
    x, y, z:= X, Y, 1;
    do 0 < y ∧ even(y) → y, x:= y÷2, x*x
     □ 0 < y ∧ odd(y)  → y, z:= y-1, z*x
    od  {z = X^Y}
    x, y, z:= X, Y, 1;
    do 0 < y → do even(y) → y, x:= y÷2, x*x od;
               y, z:= y-1, z*x
    od
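The last version translates directly into Python (a sketch; // is integer division):

```python
def power(X, Y):
    """X**Y for integers X >= 0, Y >= 0, in O(log Y) multiplications.
    Invariant: z * x**y == X**Y."""
    x, y, z = X, Y, 1
    while y > 0:
        while y % 2 == 0:
            y, x = y // 2, x * x
        y, z = y - 1, z * x
    return z
```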
Section 19.2 Taking an Assertion Out of a Loop 241
Next, make z = 5*i part of the invariant of the loop. This means that the
assignment z:= 5 *i within the loop becomes unnecessary, but whenever i
is increased z must be altered accordingly:
    z:= i*5;
    {Part of invariant: z = 5*i}
    do i < n → ...; k:= z;
               ...; i, z:= i+2, z+10;
    od
Then, within a loop that increments i with each iteration, all calculations
of the address of b [i, j] can be transformed as above to make them more
efficient. This optimization is also effective because it allows the detec-
tion and elimination of certain kinds of common arithmetic expressions.
In general, this transformation is called taking an assertion out of a
loop (and making it part of the loop invariant). In this case, the assertion
z = 5 *i was taken out of the loop to become part of the invariant. The
technique can be used wherever the value of some variable like z can be
calculated by adjusting its current value, instead of calculating it afresh
each time.
In the above example, taking the relation out of the loop can reduce
execution time by only a constant factor, but examples exist that show
that the technique can actually reduce the order of execution time of an
algorithm.
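The transformation is easy to see side by side in Python (an illustration; the appended values stand in for the elided "..." computations that use z):

```python
def multiples_slow(n):
    # recompute 5*i afresh on every iteration
    out, i = [], 0
    while i < n:
        out.append(5 * i)
        i += 2
    return out

def multiples_fast(n):
    # z = 5*i is part of the invariant: adjust z whenever i changes
    out, i, z = [], 0, 0
    while i < n:
        out.append(z)
        i, z = i + 2, z + 10
    return out
```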
Horner's rule
Consider evaluating a polynomial a0 + a1*x + ... + a(n-1)*x^(n-1) for
n ≥ 1 and for a value x and given constants ai. The result assertion is
y = a0*x^0 + ... + a(n-1)*x^(n-1), established by:

    i, y:= 1, a0;
    {invariant: 1 ≤ i ≤ n ∧ y = a0*x^0 + ... + a(i-1)*x^(i-1)}
    {bound: n - i}
    do i ≠ n → i, y:= i+1, y + ai*x^i od
Making the relation z = x^i part of the invariant of the loop allows us
to transform the program into

    i, y, z:= 1, a0, x;
    {invariant: 1 ≤ i ≤ n ∧ z = x^i ∧ y = a0*x^0 + ... + a(i-1)*x^(i-1)}
    {bound: n - i}
    do i ≠ n → i, y, z:= i+1, y + ai*z, z*x od
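In Python (a sketch; a is the coefficient list a0 .. a(n-1), with n ≥ 1):

```python
def poly_eval(a, x):
    """a[0] + a[1]*x + ... + a[n-1]*x**(n-1), with z = x**i maintained
    as part of the invariant instead of recomputing x**i each time."""
    n = len(a)
    i, y, z = 1, a[0], x
    while i != n:
        i, y, z = i + 1, y + a[i] * z, z * x
    return y
```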
    Axiom 1. 1 is in Seq.
    Axiom 2. If x is in Seq, so are 2*x, 3*x and 5*x.
    Axiom 3. The only values in Seq are given by Axioms 1 and 2.

The problem is to write a program that stores the first 1000 values of
Seq, in order, in an array q[0:999], i.e. that establishes
Since Axiom 2 specifies that a value is in Seq if a smaller one is, it may
make sense to generate the values in order. A possibility, then, is to
replace the constant 1000 of R by a variable i, yielding the invariant
    i, q[0]:= 1, 1;  {P}
    {invariant: P; bound: 1000 - i}
    do i ≠ 1000 → Calculate xnext, the ith value in Seq;
                  i, q[i]:= i+1, xnext
    od
Value xnext is the minimum of x2, x3 and x5. We see, then, that vari-
able xnext is not really needed, and we modify the program structure to
    i, q[0]:= 1, 1;  {P}
    {invariant: P; bound: 1000 - i}
    do i ≠ 1000 → Calculate x2, x3, x5 to satisfy P1;
                  i, q[i]:= i+1, min(x2, x3, x5)
    od
    i, q[0]:= 1, 1;  {P}
    Establish P1 for i = 1;
    {invariant: P ∧ P1; bound: 1000 - i}
    do i ≠ 1000 → i, q[i]:= i+1, min(x2, x3, x5);
                  Reestablish P1
    od
    i, q[0]:= 1, 1;  {P}
    Establish P1: x2, x3, x5, j2, j3, j5:= 2, 3, 5, 0, 0, 0;
    {invariant: P ∧ P1; bound: 1000 - i}
    do i ≠ 1000 → i, q[i]:= i+1, min(x2, x3, x5);
                  Reestablish P1:
                  do x2 ≤ q[i-1] → j2:= j2+1; x2:= 2*q[j2] od;
                  do x3 ≤ q[i-1] → j3:= j3+1; x3:= 3*q[j3] od;
                  do x5 ≤ q[i-1] → j5:= j5+1; x5:= 5*q[j5] od
    od
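The finished program runs essentially unchanged in Python; generalizing the constant 1000 to a parameter n (our choice) gives:

```python
def seq_values(n):
    """First n values of Seq, in ascending order (n >= 1)."""
    q = [0] * n
    q[0] = 1
    x2, x3, x5 = 2, 3, 5          # smallest unused candidates 2*q[j2], 3*q[j3], 5*q[j5]
    j2 = j3 = j5 = 0
    for i in range(1, n):
        q[i] = min(x2, x3, x5)
        while x2 <= q[i]:
            j2 += 1; x2 = 2 * q[j2]
        while x3 <= q[i]:
            j3 += 1; x3 = 3 * q[j3]
        while x5 <= q[i]:
            j5 += 1; x5 = 5 * q[j5]
    return q
```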
To help in writing it (and to arrange to use the strategy of taking a relation out of
a loop), assume the following. Two arrays xv and yv will hold the values of the
pairs (x, y) satisfying (19.2.1). Furthermore, the pairs are to be generated in
increasing order of their x-values, and a variable x is used to indicate that all
pairs with x-value less than x have been generated. Thus, the first approxima-
tion to the invariant of the main loop of the program will be

    P1: 0 ≤ i ∧ ordered(xv[0:i-1]) ∧
        the pairs (xv[j], yv[j]), 0 ≤ j < i, are all the pairs
        with x-value < x that satisfy (19.2.1).
Other reasons will probably suggest themselves once familiarity with the
technique is acquired. We illustrate with three examples.
    d:= a + (b-a)/2
    b = a + c
    d = a + c/2
    (E p: 0 ≤ p: c = 2^p)   (therefore c is even)
Printing can be done in linear time and searching can be done in time
proportional to the logarithm of the current size of v, using Binary
Search.
But what about inserting a new value x? Inserting will require finding
the position j where x belongs -i.e. finding the value j such that v[j-1]
≤ x < v[j]- then shifting v[j:i-1] up one position to v[j+1:i], and
finally placing x in v[j]. Shifting v[j:i-1] may take time proportional
to i, which means that each insertion may take time proportional to i,
and therefore, in the worst case the total time spent inserting n items may
be on the order of n². This is expensive, and a modification is in order.
Shifting is the expensive operation, so we try to change the data repre-
sentation to make it less expensive. How can this be done, perhaps to
eliminate the need for shifting altogether?
A simple way to make shifting less expensive is to spread the values out,
so that an empty array element, or "gap", appears between each pair of
values. Thus, an array v[0:2n-1] of twice the size is defined by

Remark: If all values are known to be positive, then the sign bit of v[j]
can be used to distinguish values from gaps. □
Now, inserting takes no time at all, because the new value can be
placed in a gap. But inserting destroys the fact that a gap separates each
pair of values, and after inserting it is necessary to reconfigure the array
to reestablish (19.3.4). Reconfiguring can be costly, so we must find a
way to avoid it as much as possible.
We can defer reconfiguring the array simply by weakening the invari-
ant to allow several values to be adjacent to each other. However, there
are never adjacent gaps; the odd positions of v always contain values.
We introduce a fresh variable k to indicate the number of array elements
being used, and use the invariant
Section 19.3 Changing a Representation 249
Note, now, that when inserting the first value no shifting is required, since
it can fill a gap. The second value is likely to fill a gap also, but it may
cause a shift. The third value inserted may fill a gap also, but the proba-
bility is greater that it will cause some shifting because there are fewer
gaps. At some time, so many values will have been inserted that shifting
again becomes too expensive. At this point, it is wise to reconfigure the
array so that there is again one gap between each pair of values.
To summarize, the table is defined by (19.3.5), with (19.3.4) also being
true initially. That is, values are separated by gaps. The table is initially
set to empty using
i, k:= 0, 0
{(19.3.4) and (19.3.5) are true}
    (19.3.6) {(19.3.5)}
    if shifting too expensive        → Reconfigure to reestablish (19.3.4)
     □ shifting is not too expensive → skip
    fi;
    Find the position j where Vi belongs;
    Shift v[j:...] up one position to make room for Vi;
    i, v[j]:= i+1, Vi
Discussion
The first idea in developing this algorithm was to find a way to make
shifting less expensive; the method used was to put a gap between each
pair of values. The second idea was to defer reconfiguration, because it
was too expensive. The first idea made shifting cheap, but introduced the
expensive reconfiguration operation; the second idea deferred reconfigura-
tion often enough so that the total costs of shifting and reconfiguration
were roughly the same.
The algorithm is a competitor to balanced tree schemes in situations
where a table of values is to be maintained in memory.
The first four functions are executed in constant time. Function append,
however, takes time proportional to the length of the list v to which w is
being appended.
This is all we will need to know about LISP.
Consider implementing a queue using LISP lists and the five functions
just given. A queue is a list v on which three operations may be per-
formed: the first is to reference the first element on the list, the second is
to delete the first element and the third is to insert a value w at the end
of the queue. Thus, the operations on queue v can be implemented as
Now, suppose n values v0, ..., vn-1 are to be inserted in a queue and,
between insertions, values may be taken off the queue or the first value on
the queue can be examined. In the worst case, the time needed to per-
form the insertions is on the order of n². Why?
Insertion can be done easily if the queue is kept in reverse order. But
this would make deletion expensive. Thus, we compromise: implement
queue v = (v0, ..., vi-1) using two lists vh and vt, where the second is
reversed:
    vh:= tail(vh);
    if vh = () ∧ vt ≠ () →
        {inv: queue is (reverse(vt) | vh)}
        {bound: |vt|}
        do vt ≠ () → vh, vt:= construct(head(vt), vh), tail(vt) od
        {(19.3.7) ∧ vt = ()}
     □ vh ≠ () ∨ vt = () → skip
    fi
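In Python, with built-in lists playing the role of LISP lists (append and pop at the right end are the cheap operations, so the front list is kept reversed; the class and its names are our sketch):

```python
class Queue:
    """Queue as two lists. Over any n operations the total cost is O(n),
    since each element is moved from back to front at most once."""
    def __init__(self):
        self._front = []   # front of the queue, reversed: last item = first out
        self._back = []    # back of the queue, in arrival order
    def insert(self, w):
        self._back.append(w)
    def _shift(self):
        # reverse the back into the front, but only when the front is empty
        if not self._front:
            while self._back:
                self._front.append(self._back.pop())
    def first(self):
        self._shift()
        return self._front[-1]
    def delete(self):
        self._shift()
        return self._front.pop()
```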
(20.1.1) justifying#lines#by########
inserting#extra#blanks#is##
one#task#of#a#text#editor.#
(20.1.2) justifying#####lines#####by
inserting#extra##blanks##is
one##task#of#a#text#editor.
The first step is to write pre- and postconditions for the procedure body.
We begin with the precondition. The words themselves are not part of
the specification, since only column numbers are given. So the precondi-
tion won't be written in terms of words. But it may help to give an
interpretation of the precondition in terms of words. Initially, the input
line has the form

where w1 is the first word, w2 the second, ..., wn the last, s is the
number of extra blanks, and the number of blanks at each place has been
shown within brackets. The precondition Q itself must give restrictions
on the input -e.g. that there cannot be a negative number of words or of
extra blanks. In addition, because array b will be modified, it is
Section 20.1 Justifying Lines of Text 255
Now, the pre- and postconditions are Q (20.1.3) and R (20.1.6), with
variables p, q and t of R satisfying Q1 (20.1.5). What is the general
structure of the algorithm?

    (20.1.7) {Q}
    Calculate p, q and t to establish Q1;
    {Q1 ∧ Q}
    Calculate new b[1:n] to establish R
    {Q1 ∧ R}
Calculating p, q and t
The two English commands of (20.1.7) have to be refined. We begin
with the first. At this point, refine "Calculate p, q and t to establish
Q1". Be absolutely sure the refinement is correct.
which simplifies to
which cannot be true. What does n =0 mean? That there are no words
on a line. But of course, a line with 0 words cannot be justified!
Assume the specification is changed so that, if a line has zero or one
words on it, then no justification should occur.
The case even(z) is solved in a similar fashion, leaving us with the fol-
lowing algorithm to establish Q1 if n > 1:

    Determine p, q and t:
        if even(z) → q:= s÷(n-1); t:= 1 + (s mod (n-1)); p:= q+1
         □ odd(z)  → p:= s÷(n-1); t:= n - (s mod (n-1)); q:= p+1
        fi
or simply
    k, e:= n, s;
    do k ≠ t → b[k]:= b[k]+e; k, e:= k-1, e-q od;
    do e ≠ 0 → b[k]:= b[k]+e; k, e:= k-1, e-p od

Each loop was developed by first writing the invariant, then writing the
command of the loop, and finally determining a suitable guard. The
guard e ≠ 0 for the second loop was discovered by noting that the invari-
ant states that e = p*(k-1) and that e = 0 implies either p = 0 or k = 1,
each of which implies that all values b[i] have their final value.
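One blank distribution consistent with the scheme above can be sketched in Python (gap_widths and its exact tie-breaking are our illustration, not the book's code): q = s div (n-1) extra blanks go in the smaller gaps, p = q+1 in the remaining s mod (n-1) gaps, and which side holds the wider gaps alternates with the line number z, so the extra blanks even out over successive lines.

```python
def gap_widths(n, s, z):
    """Extra blanks for the n-1 gaps of a line with n words (n > 1),
    s extra blanks to distribute, and line number z."""
    q, r = divmod(s, n - 1)      # r gaps get p = q+1 blanks, the rest get q
    p = q + 1
    if z % 2 == 0:
        return [q] * (n - 1 - r) + [p] * r    # wider gaps on the right
    else:
        return [p] * r + [q] * (n - 1 - r)    # wider gaps on the left
```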
Discussion
The development of this program brings up several interesting points.
First of all, consider the development of the postcondition (20.1.6). A
common mistake in writing this specification is to describe the line as two
cases:
While it can lead to a correct program, the program will be less efficient
than the one developed, even if in a relatively minor way. Generally
speaking, one should try to follow the principle:
But this would have eliminated the possibility of noticing that the loop
could be written without loss of efficiency to halt immediately if p = 0.
Further, one familiar with using loop invariants will generate the invariant
and loop given in (20.1.11) as quickly as the PL/I loop.
Section 20.2 The Longest Upsequence 259
Thus, using a variable k to contain the answer, the program has the pre-
and postconditions:

    Q: n > 0
    R: k = lup(b[0:n-1])
Note that a change in any one value of a sequence could change its long-
est upsequence, and this means that possibly every value of a sequence s
must be interrogated to determine lup(s). This suggests a loop. Begin by
writing a possible invariant and an outline of the loop.
The loop will interrogate the values of b[0:n-1] in some order. Since
lup(b[0:0]) is 1, a possible invariant can be derived by replacing the con-
stant n of R by a variable:

    P: 1 ≤ i ≤ n ∧ k = lup(b[0:i-1])

    i, k:= 1, 1;
    do i ≠ n → increase i, maintaining P od
Increasing i extends the sequence b[0:i-1] for which k is the length of a
longest upsequence, and hence may call for an increase in k. Whether k
is to be increased depends on whether b[i] is at least as large as a value
that ends a longest upsequence of b[0:i-1] (there may be more than one
longest upsequence). It makes sense to maintain information in other
variables so that such a test can be efficiently made. What is the min-
imum information needed to ascertain whether k should be increased?
    P: 1 ≤ i ≤ n ∧ k = lup(b[0:i-1]) ∧
       m is the smallest value in b[0:i-1] that ends an
       upsequence of length k
In the case b[i] ≥ m, k can be increased and m set to b[i], so that the
program thus far looks like

    i, k, m[1]:= 1, 1, b[0];  {P}
    do i ≠ n → if b[i] ≥ m[k] → k:= k+1; m[k]:= b[i]
                □ b[i] < m[k] → ?
               fi;
               i:= i+1
    od
The case b[i] < m[1] is the easiest to handle. Since m[1] is the smallest
value that ends an upsequence of length 1 of b[0:i-1], if b[i] < m[1],
then b[i] is the smallest value in b[0:i] and it should become the new
m[1]. No other value of m need be changed, since all upsequences of
b[0:i-1] end in a value larger than b[i].
Finally, consider the case m[1] ≤ b[i] < m[k]. Which values of m
should be changed? Clearly, only those greater than b[i] can be changed,
since they represent minimum values. So suppose we find the j satisfying
m[j-1] ≤ b[i] < m[j].
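Maintaining the whole array m, with Binary Search used to find that j, gives an O(n log n) program. A Python sketch (m is a plain list rather than m[1:k], and the standard bisect module supplies the search):

```python
import bisect

def lup(b):
    """Length of a longest upsequence (non-decreasing subsequence) of b.
    m[j] is the smallest value ending an upsequence of length j+1."""
    m = []
    for v in b:
        j = bisect.bisect_right(m, v)   # first position with m[j] > v
        if j == len(m):
            m.append(v)                 # v extends a longest upsequence
        else:
            m[j] = v                    # v is the new minimum ender
    return len(m)
```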
2. (The Next Higher Permutation). Suppose array b[0:n-1] contains a sequence
of (not necessarily different) digits, e.g. n = 6 and b[0:5] = (2,4,3,6,2,1). Con-
sider this sequence as the integer 243621. For any such sequence (except for the
one whose digits are in decreasing order) there exists a permutation of the digits
that yields the next higher integer (using the same digits). For the example, it is
(2,4,6,1,2,3), which represents the integer 246123.
Write a program that, given an array b[0:n-1] that has a next higher permu-
tation, changes b into that next higher permutation.
3. (Different Adjacent Subsequences). Consider sequences of l's, 2's and 3's. Call
a sequence good if no two adjacent non-empty subsequences of it are the same.
For example, the following sequences are good:
    2
    32
    32123
    1232123
while the following sequences are not good:

    33
    32121323
    123123213
It is known that a good sequence exists, of any length. Consider the "alphabetical
ordering" of sequences, where sequence s1 .<. sequence s2 if, when considered as
decimal fractions, s1 is less than s2. For example, 123 .<. 1231 because
.123 < .1231, and 12 .<. 13. Note that if we allow 0's in a sequence, then
s1 0 .=. s1. For example, 110 .=. 11, because .110 = .11.
Write a program that, given a fixed integer n ≥ 0, stores in array b[0:n-1]
the smallest good sequence of length n.
4. (The Line Generator). Given is some text stored one character to an array ele-
ment in array b[0:n-1]. The possible characters are the letters A, ..., Z, a blank
and a new line character (NL). The text is considered to be a sequence of words
separated by blanks and new line characters. Desired is a program that breaks
the text into lines in a two-dimensional array line[0:nolines-1, 0:maxpos-1],
with line[0, 0:maxpos-1] being the first line, line[1, 0:maxpos-1] being the
second line, etc. The lines must satisfy the following properties:
For X, we can define a second array X'[0:N-1] as follows. For each i, element
X'[i] is the number of values in X[0:i-1] that are less than X[i]. For exam-
ple, we show one possible array X and the corresponding array X', for N = 6.

    X  = (2, 0, 3, 1, 5, 4)
    X' = (0, 0, 2, 1, 4, 4)
7. (The Non-Crooks). Array f[0:F-1] contains the names of people who work
at Cornell, in alphabetical order. Array g[0:G-1] contains the names of people
on welfare in Ithaca, in alphabetical order. Thus, neither array contains dupli-
cates and both arrays are monotonically increasing:

Count the number of people who are presumably not crooks: those that appear in
at least one array but not in both.
8. Read exercise 7. Suppose the arrays may contain duplicates, but the arrays are
still ordered. Write a program that counts the number of distinct names that are
not on both lists -i.e. don't count duplicates.
9. (Period of a Decimal Expansion). For n > 1, the decimal expansion of 1/n is
periodic. That is, it consists of an initial sequence of digits d1 ... di followed by
a sequence di+1 ... di+j that is repeated over and over. For example,
1/4 = .2500000..., so the sequence 0 is repeated over and over (i = 2 and j = 1),
while 1/7 = .142857142857142857..., so the sequence 142857 is repeated over
and over (i = 0 and j = 6, although one can take i to be any positive integer
also). Write a program to find the length j of the repeating part. Use only sim-
ple variables -no arrays.
    h1 = g[0] + g[1]
    hk = hk-1 + g[k]   for 1 < k ≤ N-1
11. (Exponentiation). Write a program that, given two integers x ≥ 0 and y > 0,
calculates the value z = x^y. The binary representation bk-1 ... b1 b0 of y is
also given, and the program can refer to bit i using the notation bi. Further, the
value k is given. The program is to begin with z = 1 and reference each bit of
the binary representation once, in the order bk-1, bk-2, ..., b0.
Chapter 21
Inverting Programs
The inverse of

    {x = 3} x:= 1

is

    {x = 1} x:= 3

Thus, execution of the first begins with x = 3 and ends with x = 1, while
execution of the second does the opposite. (Note carefully how one gets
an inverse by reading backwards -except that the assertion becomes the
command and the command becomes the assertion. This itself is a sort of
inversion.) This example shows that we may have to compute inverses of
programs together with their pre- and/or postconditions.
The command x:= x*x has no inverse, because two different initial
values x = 2 and x = -2 yield the same result x = 4. To have an inverse,
a program must yield a different result for each different input.
The inverse of x:= x-y is x:= x+y, and vice versa. Let's calculate the
inverse of y:= x-y. This is equivalent to y:= -(y-x), which is
equivalent to y:= y-x; y:= -y. The inverse of this sequence is y:= -y;
y:= y+x, which is equivalent to y:= -y+x, which is equivalent to
y:= x-y. Hence, y:= x-y is its own inverse, and (21.2) is equivalent to
(21.1). But then (21.1) is its own inverse! We leave to exercise 1 the
proof that (21.1) swaps the values of the integer variables x and y.
The inverse of skip. The inverse of skip would be piks, so we will have
to introduce piks as a synonym for skip.
The inverse of S1; S2; ... ; Sn. According to what we did previously,
the inverse of a sequence of commands is the reverse of the sequence of
inverses of the individual commands.
The inverse of x:= c1; S {x = c2}, where c1 and c2 are constants. This is
a kind of a "block". A new variable x is initialized to a value c1, S is
executed, and upon termination x has a final value c2. The inverse
assigns c2 to x, executes the inverse of S, and terminates with x = c1:

    x:= c2; S⁻¹ {x = c1}
Execution must begin with at least one guard true, so the disjunction of
the guards has been placed before the command. Execution terminates
with either R1 or R2 true, depending on which command is executed, so
R1 ∨ R2 is the postcondition.
To perform the inverse of (21.3), we must know whether to perform
the inverse of S2 or to perform the inverse of S1, since only one of them
is executed when (21.3) is executed. To determine this requires knowing
which of R2 and R1 is true, which means they cannot both be true at the
same time. We therefore require that R1 ∧ R2 = F. For symmetry, we
also require B1 ∧ B2 = F.
Now let's develop the inverse of (21.3). Begin at the end of (21.3) and
read backwards. The last line of (21.3) gives us the first line of the
inverse: {R2 ∨ R1} if. This makes sense; since (21.3) must end in a state
satisfying R1 ∨ R2, its inverse must begin in a state satisfying R2 ∨ R1.
Reading the fourth line backwards gives us the first guarded command:
R2 → S2⁻¹ {B2}
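The inversion of an alternative command can be illustrated on a small instance with disjoint guards B1, B2 and disjoint postconditions R1, R2. The example is ours, sketched in Python; the inverse uses the postconditions as guards to decide which branch to undo:

```python
def forward(x):
    # if x >= 0 -> x:= x+1 {R1: x >= 1}  []  x < 0 -> x:= x-1 {R2: x <= -1}  fi
    if x >= 0:
        x += 1
        assert x >= 1            # R1
    else:
        x -= 1
        assert x <= -1           # R2
    return x

def inverse(x):
    # {R2 or R1} -- since R1 and R2 are disjoint, they select the branch
    if x >= 1:
        x -= 1
        assert x >= 0            # guard B1 re-established
    else:
        x += 1
        assert x < 0             # guard B2 re-established
    return x

assert all(inverse(forward(x)) == x for x in range(-10, 11))
```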
(21.5) do B1 → S1 od {¬B1}
Loop (21.5) contains the barest information -it is annotated only with
the fact that B1 is false upon termination. It turns out that a loop invariant
is not needed to invert a loop.
From previous experience in inverting an alternative command, we
know that a guarded command to be inverted requires a postcondition.
Further, we can expect ¬B1 to become the precondition of the loop
(because we read backwards) and therefore the loop must have a precondition
that will become the postcondition. The two occurrences of B1 in
(21.5) lead us to insert another predicate C1 as follows:
Inverting swap_equals
In section 16.5 a program was developed to swap two non-overlapping
sections b[i:i+n-1] and b[j:j+n-1] of equal size n, where n ≥ 0. The
invariant for the loop of the program is 0 ≤ k ≤ n together with
k:= 0;
do k ≠ n → b[i+k],b[j+k]:= b[j+k],b[i+k]; k:= k+1 od
(21.8) k:= 0;
       loop: {k = 0}
             do k ≠ n →
                b[i+k],b[j+k]:= b[j+k],b[i+k]; k:= k+1 {k ≠ 0}
             od
             {k = n}
       {k = n}
where loop labels the five indented lines: the loop and its pre- and post-
conditions. Using the rule for inverting a block, we find the inverse of
this program to be
pool: {k = n}
      do k ≠ 0 →
         (b[i+k],b[j+k]:= b[j+k],b[i+k]; k:= k+1)⁻¹ {k ≠ n} od
      {k = 0}
Further, the body of the loop -the inverse of the multiple assignment in
the original loop- is
k:= n;
pool: {k = n}
      do k ≠ 0 →
         k:= k-1; b[i+k],b[j+k]:= b[j+k],b[i+k] {k ≠ n} od
      {k = 0}
{k = 0}
Note how the original program swaps values beginning with the first ele-
ments of the sections, while its inverse begins with the last elements and
works its way backward. Note also that (21.8) is its own inverse, so (21.8)
has at least two inverses.
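Both programs are easily transliterated and tested. A Python sketch (the function names are ours):

```python
def swap_equals(b, i, j, n):
    # forward program: swap b[i:i+n-1] and b[j:j+n-1], first elements first
    k = 0
    while k != n:
        b[i+k], b[j+k] = b[j+k], b[i+k]
        k += 1

def swap_equals_inv(b, i, j, n):
    # the inverse runs k from n down to 0, swapping last elements first
    k = n
    while k != 0:
        k -= 1
        b[i+k], b[j+k] = b[j+k], b[i+k]

b = list(range(10))
swap_equals(b, 0, 5, 3)          # exchange b[0:2] with b[5:7]
assert b == [5, 6, 7, 3, 4, 0, 1, 2, 8, 9]
swap_equals_inv(b, 0, 5, 3)      # the inverse restores the array
assert b == list(range(10))
swap_equals(b, 0, 5, 3)
swap_equals(b, 0, 5, 3)          # (21.8) is also its own inverse
assert b == list(range(10))
```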
Inverting Perm...1o_Code
Exercise 5 of chapter 20 was to write a program for the following
problem. Let N be an integer, N > 0, and let X[0:N-1] be an array that
contains a permutation of the integers 0, 1, ..., N-1. Formally,
(21.10) X  = (2, 0, 3, 1, 5, 4)
        X' = (0, 0, 2, 1, 4, 4)
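The rule behind this example, made explicit below for index N-1, is that X'[i] counts the values in X[0:i-1] that are less than X[i]. A quick Python check of example (21.10), under that reading:

```python
def code_of(X):
    # X'[i] = number of values in X[0:i-1] less than X[i]
    return [sum(1 for j in range(i) if X[j] < X[i]) for i in range(len(X))]

assert code_of([2, 0, 3, 1, 5, 4]) == [0, 0, 2, 1, 4, 4]
```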
We try to write a loop that changes one value of array x from its initial
to its final value at each iteration. The usual strategy in such cases is to
replace a constant of the result assertion by a variable. Here, we can
replace 0 or N, which leads to calculating the array values in descending
or ascending order of subscript value, respectively. Which should we do?
In example (21.10), the values X[N-1] and X'[N-1] are the same. If
the last values of X and X' were always the same, working in descending
order of subscript values might make more sense. So let's try to prove
that they are always the same.
X[N-1] is the last value of X. Since the array values are 0, ..., N-1,
there are exactly X[N-1] values less than X[N-1] in X[0:N-2]. But
X'[N-1] is defined to be the number of values in X[0:N-2] less than
X[N-1]. Hence, X[N-1] and X'[N-1] are the same.
Replacing the constant 0 of the postcondition by a variable k yields
the first attempt at an invariant:
But the invariant must also indicate that the lower part of x still contains
its initial value, so we rewrite the invariant as
The obvious bound function is k, and the loop invariant can be
established using k:= N.
There is still a big problem with using this as the loop invariant. We
began developing the invariant by noticing that X[N-I] = X'[N-I], so
that the final value of x[N-I] was the same as its initial value. To gen-
eralize this situation, at each iteration we would like x[k -I] to contain
its final value, but the invariant developed thus far doesn't indicate this.
The generalization would work if at each iteration x[0:k-1] contained
a permutation of the integers {0, ..., k-1} and if the code for this per-
mutation was equal to X'[0:k-1]. But this is not the case: the invariant
does not even indicate that x[0:k-1] is a permutation of the integers
{0, ..., k-1}.
Perhaps x can be modified during each iteration so that this is the
case. Let us rewrite the invariant as
k:= N;
do k ≠ 0 → k:= k-1;
           Reestablish P
od
x = (2, 5, 4, 1, 0, 3) and k = 5
(21.12) k:= N;
        do k ≠ 0 →
           k:= k-1;
           Subtract 1 from every member of x[0:k-1] that is > x[k]:
              j:= 0;
              do j ≠ k → {x[j] ≠ x[k]}
                 if x[j] > x[k] → x[j]:= x[j]-1
                 [] x[j] < x[k] → skip
                 fi;
                 j:= j+1
              od
        od
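A direct Python transliteration of (21.12) can be run against example (21.10):

```python
def perm_to_code(x):
    # (21.12): for k = N-1 down to 0, subtract 1 from every member
    # of x[0:k-1] that exceeds x[k]
    k = len(x)
    while k != 0:
        k -= 1
        for j in range(k):
            if x[j] > x[k]:
                x[j] -= 1

x = [2, 0, 3, 1, 5, 4]
perm_to_code(x)
assert x == [0, 0, 2, 1, 4, 4]
```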
k:= N;
loopa: {k = N}
       do k ≠ 0 →
          k:= k-1;
          j:= 0;
          loopb: {j = 0}
                 do j ≠ k →
                    if x[j] > x[k] → x[j]:= x[j]-1 {x[j] ≥ x[k]}
                    [] x[j] < x[k] → skip {x[j] < x[k]}
                    fi;
                    j:= j+1
                    {j ≠ 0}
                 od
                 {j = k}
          {j = k}
          {k ≠ N}
       od
       {k = 0}
{k = 0}
Now invert the program, step by step, applying the inversion rules given
earlier. First, invert the block k:= N; loopa {k = 0} to yield k:= 0;
loopa⁻¹ {k = N}. Next, loopa⁻¹ is
apool: {k = 0}
       do k ≠ N → (k:= k-1; j:= 0; loopb {j = k})⁻¹ {k ≠ 0} od
       {k = N}
k:= 0;
apool: {k = 0}
       do k ≠ N →
          j:= k;
          bpool: {j = k}
                 do j ≠ 0 →
                    j:= j-1;
                    if x[k] > x[j] → piks {x[k] > x[j]}
                    [] x[k] ≤ x[j] → x[j]:= x[j]+1 {x[k] < x[j]}
                    fi
                    {j ≠ k}
                 od
                 {j = 0}
          {j = 0}
          k:= k+1
          {k ≠ 0}
       od
       {k = N}
{k = N}
k:= 0;
do k ≠ N →
   j:= k;
   do j ≠ 0 →
      j:= j-1;
      if x[k] > x[j] → piks
      [] x[k] ≤ x[j] → x[j]:= x[j]+1
      fi
   od;
   k:= k+1
od
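Transliterating this inverse into Python (with piks as a no-op) and running it on the code from example (21.10) recovers the original permutation:

```python
def code_to_perm(x):
    # the inverted program: piks is a no-op, and x[j] is incremented
    # exactly when x[k] <= x[j]
    N = len(x)
    k = 0
    while k != N:
        j = k
        while j != 0:
            j -= 1
            if x[k] > x[j]:
                pass             # piks
            else:
                x[j] += 1
        k += 1

x = [0, 0, 2, 1, 4, 4]
code_to_perm(x)
assert x == [2, 0, 3, 1, 5, 4]   # the permutation of example (21.10)
```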
Almost all programs in this book have been written in the guarded
command notation, with the addition of multiple assignment, procedure
call and procedure declaration. To execute the programs on a computer
usually requires translation into Pascal, PL/I, FORTRAN or another
implemented language. Nevertheless, it still makes sense to use the
guarded command notation because the method of program development
is so intertwined with it. Remember Principle 18.3.11: program into a
programming language, not in it.
In this chapter, we discuss the problems of writing programs in other
languages as well as in the guarded command notation. We give general
rules for indenting and formatting, describe problems with definitions and
declarations of variables, and show by example how the guarded com-
mand notation might be translated into other languages.
22.1 Indentation
In the early days, programs were written in FORTRAN and assembly
languages with no indentation whatsoever, and they were hard to
understand because of it. The crutch that provided some measure of relief was
the flow chart, since it gave a two-dimensional representation that exhibited
the program structure or "flow of control" more clearly.
Maintaining two different forms of the program -the text itself and
the flow chart- has always been prone to error because of the difficulty
in keeping them consistent. Further, most programmers have never liked
drawing flow charts, and have often produced them only after programs
were finished, and only because they were told to provide them as docu-
mentation. Therefore the relief expected from the use of flow charts was
missing when most needed -during program development.
Sequential composition
Many programming conventions force the programmer to write each
command on a separate line. This tends to spread a program out, making
it difficult to keep the program on one page. Then, indentation becomes
hard to follow. The rule to use is the following:
i= 1; k= 1; m(1)= b(0); /* P */
Together, the three assignments perform the single function of establishing
P. There is no reason to force the programmer to write them as
i= 1;
k= 1;
m(1)= b(0);
(As an aside, note how the PL/I assignment is written with no blank to
the left of = and one blank to the right. Since PL/I uses the same sym-
bol for equality and assignment, it behooves the programmer to find a
way to make them appear different.)
Don't use rule 22.1.1 as a license to cram programs into as little space
as possible; use the rule with care and reason.
The rule concerning indentation of sequences of commands is obvious:
Section 22.1 Indentation 277
i= 1;
k= 1;
m(1)= b(0);
Indenting subcommands
The rule concerning subcommands of a command is:
or, in PL/I,
Note that the body of the loop is indented. Further, the body is a
sequence of two commands, which, following rule 22.1.2, begin in the same
column. Also, the subcommands of the PL/I conditional statement are
indented with respect to its beginning.
The PL/I conditional statement could also have been written as
IF d*d ≤ n THEN a= d;
ELSE b= d;
Assertions
As mentioned as early as chapter 6, it helps to put assertions in pro-
grams. Include enough so that the programmer can understand the pro-
gram, but not so many that he is overwhelmed with detail. The most
important assertion, of course, is the invariant of a loop. Actually, if the
program is annotated with the precondition, the postcondition, an invari-
ant for each loop, and a bound function for each loop, then the rest of the
pre- and postconditions can, in principle, be generated automatically.
Assertions, of course, must appear as comments in languages that don't
allow them as a construct. (Early versions of Ada included an "assert"
statement; mature Ada does not.) Two rules govern the indentation of
assertions:
We have used these rules throughout the book, so they should appear
natural by now (naturalness must be learned). For two examples of the
use of rule 22.1.5, see program 20.2.2.
Indentation of delimiters
There are three conventions for indenting a final delimiter (e.g. od, fi
and the END; of PL/I). The first convention puts the delimiter on a
separate line, beginning in the same column as the beginning of the com-
mand.
The second convention is to indent the delimiter the same distance as
the subcommands of the command -as in the PL/I loop
DO WHILE ( expression );
END;
This convention has the advantage that it is easy to determine which com-
mand sequentially follows this one: simply search down in the column in
which the DO WHILE begins until a non-blank is found.
The third convention is to hide the delimiter completely on the last line
of the command. For example,
DO WHILE ( expression );
... END;
or
do guard →
od
This convention recognizes that the indenting rules make the end delim-
iters redundant. That is, if a compiler used the indentation to determine
the program structure, the end delimiters wouldn't be necessary. The del-
imiters are still written, because they provide a useful redundancy that can
be checked by the compiler, but they are hidden from view.
Which of the three conventions you use is not important; the impor-
tant point is to be consistent, so that the reader is not surprised:
The command-comment
Some of the programs presented in this book, like program 20.2.2,
have used an English sentence as a label (followed by a colon) or a com-
ment. The English sentence was really a command to do something, and
the program text that performed the command was indented underneath
it. Here is an example.
is not precise enough, for it forces the reader to read the refinement in
order to determine where the sum of the array elements is placed. Far
better is the command-comment
As you can see from the last example, the command-comment can be in
the form we have been using throughout the book for specifying a pro-
gram (segment).
Here is the indentation rule for command-comments.
The reason for not using this convention should be clear from the exam-
ple: one cannot tell where the refinement ends. Much better is to use rule
22.1.7:
Judicious use of spacing (skipping lines) may help, but no simple rule for
spacing after refinements can cover all cases if refinements are not
indented. So follow rule 22.1.7.
One more point concerning indentation of comments. Don't insert
them in such a manner that the structure of the program becomes hidden.
For example, if a sequence of program commands begin in column 10, no
comment between them should begin in columns to the left of column 10.
Procedure headings
As mentioned in chapter 12, the purpose of a procedure is to provide a
level of abstraction: the user of a procedure need only know what the pro-
cedure does and how to call it, and not how the procedure works. To
emphasize this, the procedure declaration should be indented as follows.
It may be reasonable to have a blank line before and after the procedure
declaration in order to set it off from the surrounding text.
As an example, here is a Pascal-like procedure declaration:
(*Pre: n = N ∧ x = X ∧ b = B ∧ X ∈ B[0:N-1]*)
(*Post: 0 ≤ i < N ∧ B[i] = X*)
proc search(value n, x: integer;
            value b: array of integer;
            result i: integer);
body of procedure
This strategy lies behind much of what has been presented in this book.
A definition of a set of variables is simply an assertion about their logical
relationship, which must be true at key places of the program. In the
These declarations suffer for several reasons. First, the variables have not
been grouped by their logical relationship. From the name staffsize, one
might deduce that this variable is logically related to array staff, but it
need not be so. Also, there is no way to understand the purpose or need
for divsize. Further, the definitions of globally important variables are
mixed up with the definitions of local variables, which are used in only a
few, adjacent places (i and j, for example).
Then there is no definition of the variables. For example, how do we
know just where in array staff the employees can be found. Are they
inserted at the beginning of the array, or the end, or in the middle? It has
also not been indicated that the lists are sorted.
Here is a better version of these declarations.
i, j: integer; q: Phonerec;
Now the variables are grouped according to their logical relationship, and
definitions are given that describe the relationship. These definitions are
actually invariants (but not loop invariants), which hold at (almost) all
places of the program.
Variables i, j and q are presumably used only in a few, localized
places, and hence need no definition at this point.
Note carefully the format of the declarations. The variables themselves
begin in the same column, which makes it easy to find a particular vari-
able when necessary. Further, the comments describing each group
appear to the right of the variables, again all beginning in the same
column. Spending a few minutes arranging the declarations in this format
is worthwhile, for it aids the programmer as well as the reader.
One more point. Nothing is worse than a comment like "j is an index
into array b". When defining variables, refrain from buzzwords like
"pointer", "counter" and "index", for they serve only to point out the lazi-
ness and lack of precision of your thought. Of course, at times such com-
ments may be worthwhile, but in general try to be more precise.
Section 22.3 Writing Programs in Other Languages 287
i, j, k:= 0, 0, 0;
{inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv}
{bound: (iv-i) + (jv-j) + (kv-k)}
do f[i] < g[j] → i:= i+1
[] g[j] < h[k] → j:= j+1
[] h[k] < f[i] → k:= k+1
od
{i = iv ∧ j = jv ∧ k = kv}
i = 0; j = 0; k = 0;
/*Simulate 3-guarded-command loop:*/
/*inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv*/
/*bound: (iv-i) + (jv-j) + (kv-k)*/
LOOP:
IF f(i)<g(j) THEN DO; i= i+1; GOTO LOOP; END;
IF g(j)<h(k) THEN DO; j= j+1; GOTO LOOP; END;
IF h(k)<f(i) THEN DO; k= k+1; GOTO LOOP; END;
/*i = iv ∧ j = jv ∧ k = kv*/
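The loop is easy to simulate directly. A Python sketch, assuming (as the assertions indicate) that f, g and h are ascending arrays with a common value at f[iv], g[jv], h[kv]:

```python
def first_common(f, g, h):
    # each guard advances the index whose current value cannot be the
    # common value; when no guard holds, f[i] = g[j] = h[k]
    i = j = k = 0
    while f[i] < g[j] or g[j] < h[k] or h[k] < f[i]:
        if f[i] < g[j]:
            i += 1
        elif g[j] < h[k]:
            j += 1
        else:                    # h[k] < f[i]
            k += 1
    return i, j, k

assert first_common([1, 3, 5, 7], [2, 3, 6, 7], [3, 7, 9]) == (1, 1, 0)
```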
{The n words, n ≥ 0, on line number z begin in columns b[1], ..., b[n].
Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
is the total number of blanks to insert between words to justify the
line. Determine new column numbers b[1:n] to represent the justified
line. Result assertion R, below, specifies that the numbers of blanks
inserted between different pairs of words differ by no more than one,
and that extra blanks are inserted to the left or right, depending on
the line number. Unless 0 ≤ n ≤ 1, the justified line has the
following format, where Wi is word i:

W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn

where p, q, t satisfy

Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
    (odd(z) ∧ q = p+1  ∨  even(z) ∧ p = q+1)

Using B to represent the initial value of array b, result assertion R is

R: (0 ≤ n ≤ 1 ∧ b = B) ∨ ((A i: 1 ≤ i ≤ t: b[i] = B[i] + p*(i-1)) ∧
   (A i: t < i ≤ n: b[i] = B[i] + p*(t-1) + q*(i-t)))}
Program in Pascal
(*The n words, n ≥ 0, on line number z begin in columns b(1), ..., b(n).
Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
is the total number of blanks to insert between words to justify the
line. Determine new column numbers b(1:n) to represent the justified
line. Result assertion R, below, specifies that the numbers of blanks
inserted between different pairs of words differ by no more than one,
and that extra blanks are inserted to the left or right, depending on
the line number. Unless 0 ≤ n ≤ 1, the justified line has the
following format, where Wi represents word i:

W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn

where p, q, t satisfy

Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
    (odd(z) ∧ q = p+1  ∨  even(z) ∧ p = q+1)

Using B to represent the initial value of array b, result assertion R is

R: (0 ≤ n ≤ 1 ∧ b = B) ∨ ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
   (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t)))*)
Program in PL/I
/*The n words, n ≥ 0, on line number z begin in columns b(1), ..., b(n).
Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
is the total number of blanks to insert between words to justify the
line. Determine new column numbers b(1:n) to represent the justified
line. Result assertion R, below, specifies that the numbers of blanks
inserted between different pairs of words differ by no more than one,
and that extra blanks are inserted to the left or right, depending on
the line number. Unless 0 ≤ n ≤ 1, the justified line has the format

W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn

where p, q, t satisfy

Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
    (odd(z) ∧ q = p+1 ∨ even(z) ∧ p = q+1)

Using B to represent the initial value of array b, result assertion R is

R: (0 ≤ n ≤ 1 ∧ b = B) ∨ ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
   (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t))) */
Program in FORTRAN
In the FORTRAN example given below, note how each guarded com-
mand loop is implemented using an IF-statement that jumps to a labeled
CONTINUE statement. These CONTINUE statements are included only
to keep each loop as a separate entity, independent of the preceding and
following statements.
C The n words, n ≥ 0, on line number z begin in cols b(1), ..., b(n).
C Exactly one blank separates each adjacent pair of words. s, s ≥ 0, is
C the number of blanks to insert between words to right-justify the
C line. Determine new col numbers b(1:n) to represent the justified
C line. Result assertion R, below, specifies that the numbers of blanks
C inserted between different pairs of words differ by no more than one.
C Also, extra blanks are inserted to the left or right, depending on
C the line number. Unless 0 ≤ n ≤ 1, the justified line has the format
C
C     W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn
C
C where p, q, t satisfy
C
C     Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
C         (odd(z) ∧ q = p+1 ∨ even(z) ∧ p = q+1)
C
C Using B to represent the initial value of array b, result assertion R is
C
C     R: (0 ≤ n ≤ 1 ∧ b = B) ∨ ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
C        (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t)))
C
SUBROUTINE justify (n, z, s, b)
INTEGER n, z, s, b(n)
C
INTEGER q,p,t,e,k
IF (n .LE. 1) GOTO 100
C Determine p, q and t:
e= z/2
IF (z .NE. 2*e) GOTO 20
q= s/(n-1)
t= 1 +s -q*(n-1)
p= q+1
GOTO 30
20 p= s/(n-1)
t= n -s +p*(n-1)
q= p+1
30 CONTINUE
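The arithmetic for p, q and t can be checked against Q1. A Python sketch of the computation above (assuming n > 1, s ≥ 0):

```python
def pqt(n, s, z):
    # mirrors the FORTRAN above; assumes n > 1 and s >= 0
    if z % 2 == 0:               # even line: wider gaps on the left (p = q+1)
        q = s // (n - 1)
        t = 1 + s - q * (n - 1)
        p = q + 1
    else:                        # odd line: wider gaps on the right (q = p+1)
        p = s // (n - 1)
        t = n - s + p * (n - 1)
        q = p + 1
    return p, q, t

for n in range(2, 8):
    for s in range(0, 20):
        for z in (3, 4):
            p, q, t = pqt(n, s, z)
            assert 1 <= t <= n and p >= 0 and q >= 0
            assert p * (t - 1) + q * (n - t) == s       # Q1, arithmetic part
            assert (q == p + 1) if z % 2 else (p == q + 1)
```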
Pre-1960
FORTRAN and FAP, the IBM 7090 assembly language, were my first
programming languages, and I loved them. I could code with the best of
them, and my flow charts were always neat and clean. In 1962, as a
research assistant on a project to write the ALCOR-ILLINOIS 7090 Algol
60 Compiler, I first came in contact with Algol 60 [39]. Like many, I was
confused on this first encounter. The syntax description using BNF (see
Appendix I) seemed foreign and difficult. Dynamic arrays, which were
allocated on entrance to and deallocated on exit from a block, seemed
wasteful. The use of ":=" as the assignment symbol seemed unnecessary.
The need to declare all variables seemed stupid. Many other things dis-
turbed me.
I'm glad that I stuck with the project, for after becoming familiar with
Algol 60 I began to see its attractions. BNF became a useful tool. I
Section 23.1 A Brief History of Programming Methodology 295
began to appreciate the taste and style of Algol 60 and of the Algol 60
report itself. And I now agree with Tony Hoare that
The 1960s
The 1960s was the decade of syntax and compiling. One sees this in
the wealth of papers on context-free languages, parsing, compilers, com-
piler-compilers and so on. The linguists also got into the parsing game,
and people received Ph.D.s for writing compilers.
Algol was a focal point of much of the research, perhaps because of
the strong influence of IFIP Working Group 2.1 on Algol, which met
once or twice a year (mostly in Europe). (IFIP stands for International
Federation for Information Processing). Among other tasks, WG2.1 pub-
lished the Algol Bulletin in the 1960s, an informal publication with fairly
wide distribution, which kept people up to date on the work being done in
Algol and Algol-like languages.
Few people were involved deeply in understanding programming per se
at that time (although one does find a few early papers on the subject)
and, at least in the early 1960s, people seemed to be satisfied with pro-
gramming as it was being performed. If efforts were made to develop for-
mal definitions of programming languages, they were made largely to
understand languages and compilers, rather than programming. Concepts
from automata theory and formal languages played a large role in these
developments, as is evidenced by the proceedings [42] of one important
conference that was held under IFIP's auspices.
A few isolated papers and discussions did give some early indications
that much remained to be done in the field of programming. One of the
first references to the idea of proving programs correct was in a stimulat-
ing paper [35] presented in 1961 and again at the 1962 IFIP Congress by
John McCarthy (then at M.I.T., now at Stanford University). In that
paper, McCarthy stated that "instead of trying out computer programs on
test cases until they are debugged, one should prove that they have the
desired properties." And, at the same Congress, Edsger W. Dijkstra
(Technological University Eindhoven, the Netherlands, and later also with
Burroughs) gave a talk titled Some meditations on advanced programming
[11]. At the 1965 IFIP Congress, Stanley Gill, of England,
remarked that "another practical problem, which is now beginning to loom
very large indeed and offers little prospect of a satisfactory solution, is
that of checking the correctness of a large program."
But, in the main, the correctness problem was attacked by the more
theoretically inclined researchers only in terms of the problem of formally
proving the equivalence of two different programs; this approach has not
yet been that useful from a practical standpoint.
As the 1960s progressed, it was slowly realized that there really were
immense problems in the software field. The complexity and size of pro-
jects increased tremendously in the 1960s, without commensurate increases
in the tools and abilities of the programmers; the result was missed dead-
lines, cost overruns and unreliable software. In 1968, a NATO Confer-
ence on Software Engineering was held in Garmisch, Germany, [6] in
order to discuss the critical situation. Having received my degree (Dr. rer.
nat) two years earlier in Munich under F.L. Bauer, one of the major
organizers of the conference, I was invited to attend and help organize.
Thus, I was able to listen to the leading figures from academia and indus-
try discuss together the problems of programming from their two, quite
different, viewpoints. People spoke openly about their failures in soft-
ware, and not only about their successes, in order to get to the root of the
problem. For the first time, a consensus emerged that there really was a
software crisis, that programming was not very well understood.
In response to the growing awareness, in 1969 IFIP approved the for-
mation of Working Group 2.3 on programming methodology, with
Michael Woodger (National Physics Laboratory, England) as chairman. Some
of its members -including Dijkstra, Brian Randell (University of Newcas-
tle upon Tyne), Doug Ross (Softech), Gerhard Seegmueller (Technical
University Munich), Wlad M. Turski (University of Warsaw) and Niklaus
Wirth (Eidgenossische Technische Hochschule, Zurich)- had resigned
from WG2.1 earlier when Algol 68 was adopted by WG2.1 as the "next
Algol". Their growing awareness of the problems of programming had
convinced them that Algol 68 was a step in the wrong direction, that a
smaller, simpler programming language and description was necessary.
Thus, just around 1970, programming had become a recognized, res-
pectable -in fact, critical- area of research. Dijkstra's article on the
harmfulness of the goto in 1968 [12] had stirred up a hornets' nest. And
his monograph On Structured Programming [14] (in which the term was
introduced in the title but never used in the text), together with Wirth's
article [44] on stepwise refinement, set the tone for many years to come.
P ∧ B {S} P
-----------------------
P {while B do S} P ∧ ¬B
And yet, we didn't really know how to do this. For example, we knew
that the loop invariant should come before the loop, but we had no good
methods for doing so and certainly could not teach others to do it. The
arguments went back and forth for some time, with those in favor of loop
invariants becoming more adept at producing them and coming up with
more and more examples to back up their case.
The issue was blurred by the varying notions of the word proof. Some
felt that the only way to prove a program correct formally was to use a
theorem prover or verifier. Some argued that mechanical proofs were and
would continue to be useless, because of the complexity and detail that
arose. Others argued that mechanical proofs were useless because no one
could read them. Article [10] contains a synthesis of arguments made
against proofs of correctness of programs, and it is suggested reading. In
this book, a middle view has been used: one should develop a proof and
program hand-in-hand, but the proof should be a mixture of formality
and common sense.
Several forums existed throughout the 1970's for discussing technical
work on programming. Besides the usual conferences and exchanges, two
other forums deserve mention. First, IFIP Working group 2.3 on pro-
gramming methodology, and later WG2.l, WG2.2 and WG2.4, were used
quite heavily to present and discuss problems related to programming.
Since its formation, WG2.3 has met once or twice a year for five days to
discuss various aspects of programming. No formal proceedings have ever
emerged from the group; rather the plan has been to provide a forum for
discussion and cross-fertilization of ideas, with the results of the interac-
tion appearing in the normal scientific publications of its members. The
group has produced an anthology of already-published articles by its
members [22], which illustrates well the influence of WG2.3 on the field of
programming during the 1970s. It is recommended reading for those
interested in programming methodology.
Secondly, several two-week courses were organized throughout the
1970's by the Technical University Munich. These courses were taught by
the leaders in the field and attended by advanced graduate students,
young Ph.D.s, scientists new to the field and people from industry from
Europe, the U.S. and Canada; they were not just organized to teach a
subject but to establish a forum for discussion of ongoing research in a
very well-organized fashion. Many of the ones dealing with programming
itself (some were on compiling, operating systems, etc.) were sponsored by
NATO. These schools are unusual in that 50 to 100 researchers were
together for two weeks to discuss one topic. The lectures of many of the
schools have been published -see for example [2], [4] and [3].
Back to the development of programs. In 1975, Edsger W. Dijkstra
published a paper [15], which was a forerunner to his book [16]. The
Section 23.2 The Problems Used in the Book 301
The Coffee Can Problem (Chapter 13). Dijkstra mentioned the problem
in a letter in Fall 1979; he learned of it from his colleague, Carel
Scholten. It took five minutes to solve.
Closing the Curve (Chapter 13). John Williams (then at Cornell, now at
IBM, San Jose) asked me to solve this problem in 1973. I was not able
to do so, and Williams had to give me the answer.
The Maximum Problem (Chapter 14). [16], pp. 52-53.
The Next Higher Permutation Problem (exercise 2 of chapter 14 and
exercise 2 of Chapter 20). The problem has been around for a long
time; the development is from [16], pp. 107-110.
Searching a Two-dimensional Array (sections 15.1, 15.2). My solution.
Four-tuple Sort (section 15.2). [16], p. 61.
gcd(x, y) (exercise 2 of section 15.2). This, of course, goes back to Euclid.
The versions presented here are largely from [16].
Approximating the Square Root (sections 16.2, 16.3 and 19.3). [16], pp.
61-65.
Linear Search and the Linear Search Principle (section 16.2). The devel-
opment is from [16], pp. 105-106.
The Plateau Problem (section 16.3). I used this problem to illustrate loop
invariants at a conference in Munich, Germany, in 1974. Because of
lack of experience, my program used too many variables (see the discus-
sion at the end of section 16.3). Michael Griffiths (University of
Nancy) wrote a recursive definition of the plateau of an array and then
changed the definition into an iterative program; the result was a pro-
gram similar to (16.3.11). The idealized development given in section
These two rules, which express different forms for the same nonterminal,
can be abbreviated using the symbol |, read as "or", as
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
x U y => x u y
The symbol => denotes a single derivation -one rewriting action. The
symbol =>* denotes a sequence of zero or more single derivations. Thus,
<constant> 1 =>* 3 2 5 1
[Syntax tree diagram: the root <expr> derives <expr> * <expr>, and each
further node -down through <expr>, <constant> and the parentheses-
derives the symbols beneath it, ending in the symbols of the sentence.]
In the syntax tree, a single derivation using the rule U ::=u is expressed
by a node U with lines emanating down to the symbols of the sequence
u. Thus, for every single derivation in the sequence of derivations there is
a nonterminal in the tree, with the symbols that replace it underneath.
For example, the first derivation is <expr> => <expr> * <expr>, so
at the top of the diagram above is the node <expr> and this node has
lines emanating downward from it to <expr>, * and <expr>. Also,
there is a derivation using the rule <digit> ::= 1, so there is a
corresponding branch from <digit> to 1 in the tree.
The main difference between a derivation and its syntax tree is that the
syntax tree does not specify the order in which some of the derivations
were made. For example, in the tree given above it cannot be determined
whether the rule <digit> ::= 1 was used before or after the rule <digit>
::= 3. To every derivation there corresponds a syntax tree, but more than
one derivation can correspond to the same tree. These derivations are
considered to be equivalent.
[Two syntax trees for the same sentence: in the left tree the root derivation
is <expr> + <expr>, with <expr> * <expr> derived below it; in the right tree
the root derivation is <expr> * <expr>, with <expr> + <expr> derived below it.]
A grammar that allows more than one syntax tree for some sentence is
called ambiguous. This is because the existence of two syntax trees allows
us to "parse" the sentence in two different ways, and hence to perhaps
give two meanings to it. In this case, the ambiguity shows that the gram-
mar does not indicate whether + should be performed before or after *.
The syntax tree to the left (above) indicates that * should be performed
first, because the <expr> from which it is derived is in a sense an
operand of the addition operator +. On the other hand, the syntax tree to
the right indicates that + should be performed first.
One can write an unambiguous grammar that indicates that multiplica-
tion has precedence over plus (except when parentheses are used to over-
ride the precedence). To do this requires introducing new nonterminal
symbols, <term> and <factor>:
<expr>     ::= <term> | <expr> + <term> | <expr> - <term>
<term>     ::= <factor> | <term> * <factor>
<factor>   ::= <constant> | ( <expr> )
<constant> ::= <digit>
<constant> ::= <constant> <digit>
<digit>    ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In this grammar, each sentence has one syntax tree, so there is no ambi-
guity. For example, the sentence 1+3*4 has one syntax tree:
                      <expr>
                 /      |       \
            <expr>      +        <term>
               |              /    |     \
            <term>       <term>    *    <factor>
               |            |              |
          <factor>     <factor>       <constant>
               |            |              |
         <constant>   <constant>       <digit>
               |            |              |
           <digit>      <digit>            4
               |            |
               1            3
Extensions to BNF
A few extensions to BNF are used to make it easier to read and under-
stand. One of the most important is the use of braces to indicate repeti-
tion: {x} denotes zero or more occurrences of the sequence of symbols x.
Using this extension, we can describe <constant> using one rule as

    <constant> ::= <digit> {<digit>}
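The behavior of this grammar can be checked by writing a recognizer for it. Below is a minimal recursive-descent recognizer in Python (a sketch; the function names and the use of exceptions are this sketch's own assumptions, not the book's). The left-recursive rules are rewritten as loops, in the spirit of the braces extension, which accepts the same sentences:

```python
def parse_expr(s, i=0):
    """<expr> ::= <term> { (+|-) <term> }.  Returns the index just past the parse."""
    i = parse_term(s, i)
    while i < len(s) and s[i] in '+-':
        i = parse_term(s, i + 1)
    return i

def parse_term(s, i):
    """<term> ::= <factor> { * <factor> }"""
    i = parse_factor(s, i)
    while i < len(s) and s[i] == '*':
        i = parse_factor(s, i + 1)
    return i

def parse_factor(s, i):
    """<factor> ::= <constant> | ( <expr> )"""
    if i < len(s) and s[i] == '(':
        i = parse_expr(s, i + 1)
        assert i < len(s) and s[i] == ')', "missing )"
        return i + 1
    # <constant> ::= <digit> { <digit> }
    assert i < len(s) and s[i].isdigit(), "digit expected"
    while i < len(s) and s[i].isdigit():
        i += 1
    return i

def is_sentence(s):
    """True iff s is derivable from <expr>."""
    try:
        return parse_expr(s) == len(s)
    except AssertionError:
        return False
```

Because each nonterminal has its own procedure, the call structure of a run of `is_sentence("1+3*4")` mirrors the unique syntax tree: the multiplication is recognized inside `parse_term`, below the addition recognized in `parse_expr`.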
References
The theory of syntax has been studied extensively. An excellent text
on the material is Introduction to Automata Theory, Languages and
Computation (Hopcroft, J.E. and J.D. Ullman; Addison-Wesley, 1979).
The practical use of the theory in compiler construction is discussed in the
texts Compiler Construction for Digital Computers (Gries, D.; John
Wiley, 1971) and Principles of Compiler Design (Aho, A.V., and J.D. Ullman;
Addison-Wesley, 1977).
Appendix 2
Sets, Sequences, Integers, and Real Numbers
These examples illustrate one way of describing a set: write its elements as
a list within braces { and}, with commas joining adjacent elements. The
first two examples illustrate that the order of the elements in the list does
not matter. The third example illustrates that an element listed more than
once is considered to be in the set only once; elements of a set must be
distinct. The final example illustrates that a set may contain zero ele-
ments, in which case it is called the empty set.
It is not possible to list all elements of an infinite set (a set with an
infinite number of elements). In this case, one often uses dots to indicate
that the reader should use his imagination, but in a conservative fashion,
{k | even(k)}
{(i, j) | i = j+1}
{..., (-1,-2), (0,-1), (1,0), (2,1), (3,2), ...}
Choose(a, x)
Sequences
A sequence is a list of elements (joined by commas and delimited by
parentheses). For example, the sequence (1, 3, 5, 3) consists of the four
elements 1, 3, 5, 3, in that order, and () denotes the empty sequence. As
opposed to sets, the ordering of the elements in a sequence is important.
The length of a sequence s, written |s|, is the number of elements in
it.
Catenation of sequences with sequences and/or values is denoted by |.
Thus,
That is, s[0] refers to the first element, s[1] to the second, and so forth.
Further, the notation s[k..], where 0 ≤ k ≤ n, denotes the sequence

    (s[k], s[k+1], ..., s[n-1])

That is, s[k..] denotes a new sequence that is the same as s but with the
first k elements removed. For example, if s is not empty, the assignment
s:= s[1..]
Using the sequence notation, rather than the usual pop and push of stacks
and insert into and delete from queues, may lead to more understandable
programs. The notion of assignment is already well understood -see
chapter 9- and is easy to use in this context.
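As an illustration, the sequence operations just described map directly onto Python lists (an assumption of this sketch; the book itself is language-independent):

```python
# The sequence (1, 3, 5, 3) of the text, as a Python list.
s = [1, 3, 5, 3]
assert len(s) == 4               # |s|, the length of the sequence
assert s[0] == 1 and s[1] == 3   # s[0] is the first element, s[1] the second
assert s[1:] == [3, 5, 3]        # s[1..]: s with its first element removed

# A stack "pop" and "push" written as sequence assignments,
# in the style the text recommends:
stack = [1, 2, 3]
x, stack = stack[0], stack[1:]   # pop:  x, stack := stack[0], stack[1..]
stack = [x] + stack              # push: stack := x | stack
assert stack == [1, 2, 3]
```

Written this way, the pop is just the assignment s := s[1..] of the text, and its effect on the program state can be reasoned about with the ordinary assignment rule.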
We also use the set of real numbers, although on any machine this set and
operations on it are approximated by some form of floating point
numbers and operations. Nevertheless, we assume that real arithmetic is
performed, so that problems with floating point are eliminated.
The following operations take as operands either integers or real num-
bers:
Relations
Let A and B be two sets. The Cartesian product of A and B, written
A × B, is the set of ordered pairs (a, b) where a is in A and b is in B:
Let N be the set of integers. One relation over N × N is the successor
relation:
The following relation associates with each person the year in which he
left his body:
I = {(a, a) | a ∈ A}
When dealing with binary relations, we often use the name of a rela-
tion as a binary operator and use infix notation to indicate that a pair
belongs in the relation. For example, we have
From the three relations given thus far, we can conclude several things.
For any value a there may be different pairs (a, b) in a relation. Such a
relation is called a one-to-many relation. Relation parent is one-to-many,
because most people have more than one parent.
For any value b there may be different pairs (a, b) in a relation.
Such a relation is called a many-to-one relation. Many people may have
died in any year, so that for each integer i there may be many pairs (p, i)
in relation died_in. But for any person p there is at most one pair (p, i)
in died_in. Relation died_in is an example of a many-to-one relation.
In relation succ, no two pairs have the same first value and no two
pairs have the same second value. Relation succ is an example of a one-
to-one relation.
A relation on A × B may contain no pair (a, b) for some a in A.
Such a relation is called a partial relation. On the other hand, a relation
on A × B is total if for each a ∈ A there exists a pair (a, b) in the rela-
tion. Relation died_in is partial, since not all people have died yet.
Relation succ is total (on N × N).
If relation R on A × B contains a pair (a, b) for each b in B, we say
that R is onto B. Relation parent is onto, since each child has a parent
(assuming there was no beginning).
For example,
    parent^0 = I
    parent^1 = parent
    parent^2 = grandparent
    parent^3 = great-grandparent
and
    (i succ^k j) iff i+k = j
Looking upon relations as sets and using the superscript notation, we can
define the transitive closure R+ and the reflexive transitive closure R* of
a relation R as follows.
    R+ = R^1 ∪ R^2 ∪ R^3 ∪ ...
    R* = R^0 ∪ R^1 ∪ R^2 ∪ ...
    b R^-1 a iff a R b
Functions
Let A and B be sets. A function f from A to B, denoted by
    f: A → B
Note carefully the three ways in which a function name f is used. First,
f denotes a set of pairs such that for any value a there is at most one
pair (a, b). Second, a f b holds if (a, b) is in f. Third, f(a) is the
value associated with a, that is, (a, f(a)) is in the function (relation) f.
The beauty of defining a function as a restricted form of relation is
that the terminology and theory for relations carries over to functions.
Thus, we know what a one-to-one function is. We know that composition
of (binary) functions is associative. We know, for any function, what f^0,
f^1, f^2, f^+ and f^* mean. We know what the inverse f^-1 of f is. We
know that f^-1 is a function iff f is not many-to-one.
fey) = x*y
f(2) = x* 2
f(x+2) = x * (x+2)
f(x*2) = x * x*2
Appendix 3 Relations and Functions 319
(a0, a1, ..., a(n-1), g(a0, ..., a(n-1)))   and
g(a0, ..., a(n-1)) = an
The terminology used for binary relations and functions extends easily to
n -ary relations and functions.
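The definitions of this appendix can be animated by representing a relation as a set of pairs. A minimal sketch in Python (the helper names are this sketch's own, not the book's):

```python
def compose(r, s):
    """r o s = {(a, c) | (a, b) in r and (b, c) in s for some b}."""
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

def inverse(r):
    """(b, a) is in inverse(r) iff (a, b) is in r."""
    return {(b, a) for (a, b) in r}

def is_function(r):
    """A relation is a function iff each first value occurs in at most one pair."""
    firsts = [a for (a, _) in r]
    return len(firsts) == len(set(firsts))

# A finite fragment of the successor relation succ over the integers:
succ = {(i, i + 1) for i in range(5)}

# succ^2 relates i to i+2, as (i succ^k j) iff i+k = j predicts:
assert compose(succ, succ) == {(i, i + 2) for i in range(4)}

# succ is one-to-one: both it and its inverse are functions.
assert is_function(succ) and is_function(inverse(succ))
```

Because functions are just restricted relations here, `compose` and `inverse` work unchanged on them, which is exactly the "beauty" the text points out.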
Appendix 4
Asymptotic Execution Time Properties
The first requires n units of time; the second 2n. The number of units of
time required by the third program is more difficult to determine. Suppose n
is a power of 2, so that
    n = 2^k.
n:       1    2     64    128    32768
2n:      2    4    128    256    65536
log n:   0    1      6      7       15
We need a measure that allows us to say that the third program is by far
the fastest and that the other two are essentially the same. To do this, we
define the order of execution time.
(A4.1) Definition. Let f(n) and g(n) be two functions. We say that
       f(n) is (no more than) order g(n), written O(g(n)), if a constant
       c > 0 exists such that, for all (except a possibly finite number of)
       positive values of n,
           f(n) ≤ c*g(n).
Since the first and second programs given above are executed in nand
2n units, respectively, their execution times are of the same order.
Secondly, one can prove that log n is O(n), but not vice versa. Hence
the order of execution time of the third is less than that of the first two
programs.
We give below a table of typical execution time orders that arise fre-
quently in programming, from smallest to largest, along with frequent
terms used for them. They are given in terms of a single input parameter
n. In addition, the (rounded) values of the orders are given for n = 100
and n = 1000, so that the difference between them can be seen.
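The rounded values in such a table can be computed directly. A few lines of Python (an illustration in the spirit of the table, not part of the text) print the common orders for n = 100 and n = 1000:

```python
from math import log2

# The usual orders, smallest to largest, with their frequent names.
orders = [
    ("log n   (logarithmic)", lambda n: log2(n)),
    ("n       (linear)",      lambda n: n),
    ("n log n",               lambda n: n * log2(n)),
    ("n**2    (quadratic)",   lambda n: n ** 2),
    ("n**3    (cubic)",       lambda n: n ** 3),
]

for name, f in orders:
    # Rounded values for n = 100 and n = 1000, side by side.
    print(f"{name:24} {round(f(100)):>12} {round(f(1000)):>14}")
```

Running it makes the point of the table vivid: between n = 100 and n = 1000 the logarithm grows by a few units while the cubic grows by a factor of a thousand.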
For algorithms that have several input values the calculation of the order
of execution time becomes more difficult, but the technique remains the
same. When comparing two algorithms, one should first compare their
execution time orders, and, if they are the same, then proceed to look for
finer detail such as the number of time units required, the number of array
comparisons made, etc.
An algorithm may require different times depending on the configura-
tion of the input values. For example, one array b[1:n] may be sorted in
n steps, another array b'[1:n] in n² steps by the same algorithm. In this
case there are two methods of comparing the algorithms: average- or
expected-case time analysis and worst-case time analysis. The former is
quite difficult to do; the latter usually much simpler.
As an example, Linear Search, (16.2.5), requires n time units in the
worst case and n /2 time units in the average case, if one assumes the
value being looked for can be in any position with equal probability.
Answers to Exercises
TTT T T T T
TTF T T T F
TFT T T F F
TFF T T F F
FTT T T F F
FTF T T F F
FFT F T F F
FFF F F F F
Truth table for the first Distributive law (only) (since the last two columns
are the same, the two expressions heading the columns are equivalent and
the law holds):
Case 4: E(p) has the form E1(p) ∧ E2(p). By induction, we have that
E1(e1) = E1(e2) and E2(e1) = E2(e2). Hence, E1(e1) and E1(e2) have the
same value in every state, and E2(e1) and E2(e2) have the same value in
every state. The following truth table then establishes the desired result:
The rest of the cases, E(p) having the forms E1(p) ∨ E2(p), E1(p) ⇒
E2(p) and E1(p) = E2(p), are similar and are not shown here.
Answers for Section 3.2
We now show that use of the rule of Transitivity generates only tauto-
logies. Since e1 = e2 and e2 = e3 are tautologies, we know that e1 and e2
have the same value in every state and that e2 and e3 have the same value
in every state. The following truth table establishes the desired result:

    e1  e2  e3  e1 = e3
    T   T   T      T
    F   F   F      T
3. (a) From p ∧ q, p ⇒ r infer r
       1  p ∧ q      pr 1
       2  p ⇒ r      pr 2
       3  p          ∧-E, 1
       4  r          ⇒-E, 2, 3
   (b) 1  p          ∧-E, pr 1
       2  r          ⇒-E, pr 2, 1
2. Infer (p ∧ q) ⇒ (p ∨ q)
   2    From p ∧ q infer p ∨ q
   2.1  p                   ∧-E, pr 1
   2.2  p ∨ q               ∨-I, 1
   3    (p ∧ q) ⇒ (p ∨ q)   ⇒-I, 2

4. Infer p = p ∨ p
   1    From p infer p ∨ p
   1.1  p ∨ p               ∨-I, pr 1
   2    p ⇒ p ∨ p           ⇒-I, 1
   3    From p ∨ p infer p
   3.1  p                   ∨-E, pr 1, (3.3.3), (3.3.3)
   4    p ∨ p ⇒ p           ⇒-I, 3
   5    p = p ∨ p           =-I, 2, 4
Case 4: E(p) has the form G(p) ∧ H(p) for some expressions G and H.
In this case, by induction we may assume that the following proofs exist.
From e1 = e2, G(e1) infer G(e2)
From e1 = e2, H(e1) infer H(e2)
We can then give the following proof.
The rest of the cases, where E(p) has one of the forms G(p) ∨ H(p),
G(p) ⇒ H(p) and G(p) = H(p), are left to the reader.
2. For the proofs of the valid conjectures using the equivalence transfor-
mation system of chapter 2, we first write here the disjunctive normal
form of the premises:
    Premise 1: ¬tb ∨ ¬bl ∨ ma
    Premise 2: ¬ma ∨ ¬fd ∨ ¬gh
    Premise 3: gj ∨ (fd ∧ ¬gh)
Conjecture 2, which can be written in the form (ma ∧ gh) ⇒ gj, is
proved as follows. First, use the laws of Implication and De Morgan to
put it in disjunctive normal form:
    (E.1) ¬ma ∨ ¬gh ∨ gj.
(c) Some means at least one: (E i: 0 ≤ i < k+1: b[i] = 0), or, better yet,
    (N i: 0 ≤ i < k+1: b[i] = 0) > 0
(d) (0 ≤ i < n cand b[i] = 0) ⇒ j ≤ i ≤ k, or
    (A i: 0 ≤ i < n: b[i] = 0 ⇒ j ≤ i ≤ k)
(b) (A j: 0 ≤ j < n: B_j ⇒ wp(SL_j, R))
Answers for Section 4.4
1. E_j^i = E    (i is not free in E)
2. (a) 0 ≤ p ≤ q+1 ≤ n ∧ (array picture with marks at 0, p, q, n-1:
       b[0:p-1] ≤ x, b[q+1:n-1] > x, the rest of b unexamined)
Answers for Chapter 7
R: x = max({y | y ∈ b}).
{n > 0}
S
{0 ≤ i < n ∧ (A j: 0 ≤ j < n: b[i] ≥ b[j]) ∧ b[i] > b[0:i-1]}.
Second specification: Given fixed n > 0 and fixed array b[0:n-1], set i to
establish
(k) Define average(i) = (Σ j: 0 ≤ j < 4: grade[i, j]).
First specification:
wp(S,R).
7. This exercise is intended to make the reader more aware of how quan-
tification works in connection with wp, and the need for the rule that
each identifier be used in only one way in a predicate. Suppose that Q
⇒ wp(S, R) is true in every state. This assumption is equivalent to
We are asked to analyze predicate (7.8): {(A x: Q)} S {(A x: R)}, which
is equivalent to
Let us analyze this first of all under the rule that no identifier be used in
more than one way in a predicate. Hence, rewrite (E7.2) as
and assume that x does not appear in S and that z is a fresh identifier.
We argue operationally that (E7.3) is true. Suppose the antecedent of
(E7.3) is true in some state s, and that execution of S begun in s ter-
minates in state s'. Because S does not contain identifier x, we have
s(x) = s'(x).
Because the antecedent of (E7.3) is true in s, we conclude from (E7.1)
that (A x: wp(S, R)) is also true in state s. Hence, no matter what the
value of x in s, s'(R) is true. But s(x) = s'(x). Thus, no matter what
the value of x in s', s'(R) is true. Hence, so is s'((A x: R)), and so is
s'((A z: R_z^x)). Thus, the consequent of (E7.3) is true in s, and (E7.3)
holds.
We now give a counterexample to show that (E7.2) need not hold if x
is assigned in command S and if x appears in R. Take command
S: x:= 1. Take R: x = 1. Take Q: T. Then (E7.1) is
    (A x: T ⇒ wp("x:= 1", x = 1))
which is true. But (E7.2) is false in this case: its antecedent (A x: T) is
true but its consequent wp("x:= 1", (A x: x = 1)) is false because predicate
(A x: x = 1) is F.
We conclude that if x occurs both in S and R, then (E7.2) does not in
general follow from (E7.1).
Answers for Section 9.2
The last line follows because neither Q(v) nor e(v) contains a reference
to x. Now suppose Q is true in some state s. Let v = s(x), the value of
x in state s. For this v, (Q(v) ∧ e(x) = e(v)) is true in state s, so that
(E4.1) is also true in s. Hence Q ⇒ (E4.1), which is what we needed to
show.
E_g    (n applications of (E5.1))
E
wp(S3, R) = (w ≤ r ∨ w > r) ∧
            (w ≤ r ⇒ wp("r, q:= r-w, q+1", R)) ∧
            (w > r ⇒ wp(skip, R))
          = (w ≤ r ⇒ ((q+1)*w + r-w = x ∧ r-w ≥ 0)) ∧ (w > r ⇒ R)
          = (w ≤ r ⇒ q*w + r = x ∧ r-w ≥ 0) ∧ (w > r ⇒ R)
This is implied by R.
6. wp(S6, R) = (f[i] < g[j] ∨ f[i] = g[j] ∨ f[i] > g[j]) ∧
               (f[i] < g[j] ⇒ R_{i+1}^i) ∧
               (f[i] = g[j] ⇒ R) ∧
               (f[i] > g[j] ⇒ R_{j+1}^j)
             = R ∧ (f[i] < g[j] ⇒ f[i+1] ≤ X) ∧ (f[i] > g[j] ⇒ g[j+1] ≤ X)
             = R    (since R implies that g[j] ≤ X and f[i] ≤ X)
P ∧ BB = P ∧ BB ∧ T
       = P ∧ BB ∧ (A i: P ∧ B_i ⇒ wp(S_i, P))    (since 1 is true)
       = P ∧ BB ∧ P ∧ (A i: B_i ⇒ wp(S_i, P))
       ⇒ BB ∧ (A i: B_i ⇒ wp(S_i, P))
       = wp(IF, P)
Thus, we need only show that (E3.1) implies 3' of theorem 11.6. Note
that P, IF, and t do not contain t1 or t0. Since IF does not refer to t1
and t0, we know that wp(IF, t1 ≤ t0+1) = BB ∧ t1 ≤ t0+1. We then have
the following:
Since the derivation holds irrespective of the value to, it holds for all to,
and 3' is true.
4. We first show that (11.7) holds for k =0 by showing that it is equiv-
alent to assumption 2:
Assume (11.7) true for k = K and prove it true for k = K+1. We have:
and  P ∧ ¬BB ∧ t ≤ K+1  ⇒  P ∧ ¬BB
                          =  H_0(P ∧ ¬BB)
which shows that (11.7) holds for k = K+1. By induction, (11.7) holds
for all k.
6. H'_0(R) = ¬BB ∧ R. For k > 0, H'_k(R) = wp(IF, H'_{k-1}(R)). (E k:
0 ≤ k: H'_k(R)) represents the set of states in which DO will terminate
with R true in exactly k iterations. On the other hand, wp(DO, R)
represents the set of states in which DO will terminate with R true in k
or fewer iterations.
10. (1) wp("i:= 1", P) = 0 < 1 ≤ n ∧ (E p: 1 = 2^p)
                       = T    (above, take p = 0).
    (2) wp(S1, P) = wp("i:= 2*i", 0 < i ≤ n ∧ (E p: i = 2^p))
                  = 0 < 2*i ≤ n ∧ (E p: 2*i = 2^p),
= (A u: ¬T(u)) ∨ wp(S, R)
      (since neither S nor R contains u)
= ¬(E u: T(u)) ∨ wp(S, R)
= (E u: T(u)) ⇒ wp(S, R)    (Implication)
which is (E1.1): {(E u: T(u))} S {R}.
The quantifier A in the precondition of (12.7) of theorem 12.6 is neces-
sary; without it, predicate (12.7) has a different meaning. With the quan-
tifier, the predicate can be interpreted as follows: The procedure call can
be executed in a state s to produce the desired result R if all possible
assignments ū, v̄ to the result parameters and arguments establish the
truth of R. Without the quantifier, the above equivalence indicates that
an existential quantifier is implicitly present. With this implicit existential
quantifier, the predicate can be interpreted as follows: The procedure call
can be executed in a state s to produce the desired result R if there exists
at least one possible assignment of values ū, v̄ that establishes the truth
of R. But, since there is no guarantee that this one possible set of values
ū, v̄ will actually be assigned to the parameters and arguments, this state-
ment is generally false.
d  = (1, 2, 3, 5, 4, 2)
d' = (1, 2, 4, 2, 3, 5)
The algorithm is then: calculate i; calculate j; swap b[i] and b[j]; reverse
b[i+1:n-1]! Here, formalizing the idea of a next highest permutation
leads directly to an algorithm to calculate it!
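The four steps just listed translate directly into a program. A Python sketch (0-based lists; the function name is this sketch's own, and it assumes a next higher permutation exists):

```python
def next_permutation(b):
    """Change list b in place to the next higher permutation of its values."""
    n = len(b)
    # calculate i: the largest i with b[i] < b[i+1]
    i = n - 2
    while b[i] >= b[i + 1]:
        i -= 1
    # calculate j: the largest j with b[j] > b[i]
    j = n - 1
    while b[j] <= b[i]:
        j -= 1
    # swap b[i] and b[j], then reverse b[i+1:n-1]
    b[i], b[j] = b[j], b[i]
    b[i + 1:] = reversed(b[i + 1:])

d = [1, 2, 3, 5, 4, 2]
next_permutation(d)   # d becomes the next higher permutation
```

Each comment names the corresponding step of the formalization; nothing beyond the four steps is needed.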
x, y := X, Y;
do x > y → x:= x-y
 □ y > x → y:= y-x
od
{0 < x = y ∧ gcd(x, y) = gcd(X, Y)}
{x = gcd(X, Y)}
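The same program in Python, with the two-guard do-od rendered as a while loop that executes whichever guard is true (the function name is this sketch's own):

```python
from math import gcd   # used only to check the result below

def gcd_by_subtraction(X, Y):
    """Requires X > 0 and Y > 0; returns gcd(X, Y) by repeated subtraction."""
    x, y = X, Y
    # loop invariant: gcd(x, y) = gcd(X, Y), with x > 0 and y > 0
    while x != y:
        if x > y:
            x = x - y
        else:          # y > x
            y = y - x
    return x           # on termination x = y = gcd(X, Y)

assert gcd_by_subtraction(91, 35) == gcd(91, 35) == 7
```

The loop terminates because x + y, which stays positive, strictly decreases on every iteration; the invariant then gives the postcondition.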
5. t:= 0;
   do j ≠ 80 cand b[j] ≠ ' ' → t, s[t+1], j := t+1, b[j], j+1
    □ j = 80 → read(b); j:= 0
   od
if x < b[1]        → i:= 0
 □ b[1] ≤ x < b[n] → The program (a)
 □ b[n] ≤ x        → i:= n
fi
i, j := 0, n+1;
{inv: 0 ≤ i < j ≤ n+1 ∧ b[i] ≤ x < b[j]}
{bound: log(j-i)}
do i+1 ≠ j → e:= (i+j) ÷ 2;
             {1 ≤ e ≤ n}
             if b[e] ≤ x → i:= e  □ b[e] > x → j:= e fi
od
10. i, p := 0, 0;
    {inv: see exercise 10; bound: n-i}
    do i ≠ n → Increase i, keeping invariant true:
                 j:= i+1;
                 {inv: b[i:j-1] are all equal; bound: n-j}
                 do j ≠ n cand b[j] = b[i] → j:= j+1 od;
                 p:= max(p, j-i);
                 i:= j
    od
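In Python the same nested loops look as follows (0-based indexing; the guard's cand becomes a short-circuit and, and the function name is this sketch's own):

```python
def plateau(b):
    """Return the length of the longest run of equal adjacent values in b."""
    n = len(b)
    i, p = 0, 0
    while i != n:
        # inner loop invariant: b[i:j-1] are all equal
        j = i + 1
        while j != n and b[j] == b[i]:   # cand: j != n guards the subscript
            j += 1
        p = max(p, j - i)                # the run b[i:j-1] has length j-i
        i = j
    return p

assert plateau([1, 2, 2, 3, 3, 3, 4]) == 3
assert plateau([]) == 0
```

Since i jumps to j after each run, every element is examined exactly once and the program is linear.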
P: m < q ≤ p+1 ≤ n ∧ x = B[1] ∧ (array picture with marks at m, m+1, q,
   p, n-1: one section of b known to be > x, one known to be ≤ x, and the
   rest unexamined)
6. The precondition states that the linked list is in order; the postcondition
that it is reversed. This suggests an algorithm that at each step reverses
one link: part of the list is reversed and part of it is in order. Thus, using
another variable t to point to the part of the list that is in order, the
invariant is
[Picture: p points to the reversed part of the list, v_i, ..., v_1; t points
to the part still in order, v_{i+1}, ..., v_n, which ends with link -1.]
Initially, the reversed part of the list is empty and the unreversed part is
the whole list. This leads to the algorithm
p, t := -1, p;
do t ≠ -1 → p, t, s[t] := t, s[t], p od
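The two-line reversal can be run as Python, holding the list the way the answer does: an array s of successor indices with -1 marking the end. One caution (a fact about Python, not the book): Python assigns multiple targets left to right, so the indexed target s[t] must come first if its subscript is to use the old value of t:

```python
def reverse_list(s, p):
    """Reverse the linked list threaded through s starting at p; return new head."""
    p, t = -1, p
    while t != -1:
        # One simultaneous assignment reverses one link.
        # s[t] is assigned first, while t still holds its old value:
        s[t], p, t = p, t, s[t]
    return p

s = [1, 2, 3, -1, 0]        # a list threaded 4 -> 0 -> 1 -> 2 -> 3
head = reverse_list(s, 4)
assert head == 3

order, t = [], head          # follow the reversed links
while t != -1:
    order.append(t)
    t = s[t]
assert order == [3, 2, 1, 0, 4]
```

The guarded-command multiple assignment has no such ordering subtlety, which is one reason the book's notation makes the invariant easy to check.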
Q: x ∈ b[0:m-1, 0:n-1]
R: 0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i, j]
Actually, Q and R are quite similar, in that both state that x is in a rec-
tangular section of b -in R, the rectangular section just happens to have
only one row and column. So perhaps an invariant can be used that indi-
cates that x is in a rectangular section of b:
What could serve as guards? Consider i:= i+1. Its execution will main-
tain the invariant if x is not in row i of b. Since the row is ordered, this
can be tested with b[i, j] < x, for if b[i, j] < x, so are all values in row
i. In a similar fashion, we determine the other guards:
i, p, q, j := 0, m-1, 0, n-1;
do b[i, j] < x → i:= i+1
 □ b[p, q] > x → p:= p-1
 □ b[p, q] < x → q:= q+1
 □ b[i, j] > x → j:= j-1
od
In order to prove that the result is true upon termination, only the first
and last guards are needed. So the middle guarded commands can be
deleted to yield the program
i, j := 0, n-1;
do b[i, j] < x → i:= i+1
 □ b[i, j] > x → j:= j-1
od {x = b[i, j]}
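A Python transcription of the two-guard program, often called saddleback search. It assumes, as the answer does, that x actually occurs in b and that every row and every column is in ascending order:

```python
def saddleback(b, x):
    """Return (i, j) with b[i][j] == x; b has ordered rows and columns."""
    i, j = 0, len(b[0]) - 1      # start at the top-right corner
    while b[i][j] != x:
        if b[i][j] < x:
            i += 1               # b[i][j] < x: x is not in row i
        else:                    # b[i][j] > x: x is not in column j
            j -= 1
    return i, j

b = [[1, 3, 5],
     [2, 4, 8],
     [6, 7, 9]]
assert saddleback(b, 7) == (2, 1)
```

Each iteration discards one whole row or one whole column, so at most m+n probes are made, just as the bound function i + (n-1-j) suggests.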
postorder(p) = { empty(p)  → ()
               { ¬empty(p) → postorder(left[p]) | postorder(right[p]) | (root(p))
Hence, all solutions (xv[i], yv[i]) to the problem satisfy r ≤ 2*xv[i]².
Using the Linear Search Principle, we write an initial loop to determine
the smallest x satisfying r ≤ 2*x², and the first approximation to the pro-
gram is
Now note that execution of the body of the main loop does not destroy
P2, and therefore P2 can be taken out of the loop. Rearrangement then
leads to the more efficient program
i, x := 0, 0;
do r > 2*x² → x:= x+1 od;
y:= x;
{inv: P1 ∧ P2}
do x² ≤ r →
    Increase x, keeping invariant true:
        Determine y to satisfy (E1.1):
            do x² + y² > r → y:= y-1 od;
        if x² + y² = r → xv[i], yv[i], i, x := x, y, i+1, x+1
         □ x² + y² < r → x:= x+1
        fi
od
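The finished program runs essentially unchanged in Python. This sketch keeps the names xv, yv of the answer; collecting the results into a list of pairs is this sketch's own choice. It lists every pair x ≥ y ≥ 0 with x² + y² = r:

```python
def two_squares(r):
    """All pairs (x, y) with x >= y >= 0 and x*x + y*y == r, x ascending."""
    xv, yv, i = [], [], 0
    x = 0
    while r > 2 * x * x:          # initial Linear Search: least x with
        x += 1                    # r <= 2*x*x
    y = x
    while x * x <= r:
        while x * x + y * y > r:  # determine y for this x
            y -= 1
        if x * x + y * y == r:
            xv.append(x); yv.append(y); i += 1
        x += 1                    # in either case, increase x
    return list(zip(xv, yv))

assert two_squares(25) == [(4, 3), (5, 0)]
```

Note that y is never reset upward: as x increases, the matching y can only decrease, which is exactly the observation that lets P2 be "taken out of the loop" in the answer.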
{n ≥ 0}
a, c := 0, 1;  do c² ≤ n → c:= 2*c od;
{inv: a² ≤ n < (a+c)² ∧ (E p: 0 ≤ p: c = 2^p)}
{bound: √n − a}
do c ≠ 1 → c:= c/2;
           if (a+c)² ≤ n → a:= a+c
            □ (a+c)² > n → skip
           fi
od
{a² ≤ n < (a+1)²}
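The same square-root program in Python (the function name is this sketch's own). The first loop pushes c up through powers of 2; the second halves c while a accumulates the answer bit by bit:

```python
def isqrt(n):
    """Requires n >= 0; returns a with a*a <= n < (a+1)*(a+1)."""
    a, c = 0, 1
    while c * c <= n:            # establish n < c*c with c a power of 2
        c = 2 * c
    # invariant: a*a <= n < (a+c)*(a+c), c a power of 2
    while c != 1:
        c = c // 2
        if (a + c) * (a + c) <= n:
            a = a + c
        # else: skip
    return a

assert [isqrt(n) for n in (0, 1, 3, 4, 15, 16, 100)] == [0, 1, 1, 2, 3, 4, 10]
```

The bound function √n − a of the annotation corresponds to the fact that c halves each iteration, so the loop runs in logarithmic time.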
(E3.1) a² + 2*a*c + p − n ≤ 0
(E3.2) q = a*c
(E3.3) a² + 2*q + p − n ≤ 0
Now try a third variable r to contain the value n − a², which will always
be ≥ 0. (E3.3) becomes
    2*q + p − r ≤ 0
{n ≥ 0}
p, q, r := 1, 0, n;  do p ≤ n → p:= 4*p od;
do p ≠ 1 → p:= p/4;  q:= q/2;
           if 2*q + p ≤ r → q, r := q+p, r − 2*q − p
            □ 2*q + p > r → skip
           fi
od
{q² ≤ n < (q+1)²}
n, c[4], in[0] := 5, 0, T;
in[1:31] := F;  {s = (0,0,0,0,0)}
{inv: P1 ∧ P2 ∧ P3 ∧ ¬good(s | 0)}
do c[4] ≠ 1 →
    if n = 36 → Print sequence s
     □ n ≠ 36 → skip
    fi;
    Change s to next higher good sequence:
        do in[(c[n-1]*2+1) mod 32]  {i.e. ¬good(s | 1)}
           → Delete ending 1's from s:
                 do odd(c[n-1]) → n:= n-1; in[c[n]]:= F od;
             Delete ending 0:
                 n:= n-1; in[c[n]]:= F
        od;
        Append 1 to s:
            c[n]:= (c[n-1]*2+1) mod 32; in[c[n]]:= T; n:= n+1
od
0 ≤ h ≤ F ∧ 0 ≤ k ≤ G ∧
c = (N i: 0 ≤ i < h: f[i] ∉ g[0:G-1]) +
    (N j: 0 ≤ j < k: g[j] ∉ f[0:F-1])
Now, consider execution of h:= h+1. Under what conditions does its
execution leave P true? The guard for this command must obviously
imply f[h] ∉ g[0:G-1], but we want the guard to be simple. As it
happens, f[h] < g[k] will do: together with the ordering of g and the
invariant, it ensures that f[h] does not appear in g, and increasing h will
maintain the invariant. Similarly, the guard for k:= k+1 will be g[k] < f[h].
This gives us our program, written below. We assume the existence of
virtual values f[-1] = g[-1] = -∞ and f[F] = g[G] = +∞; this allows
us to dispense with worries about boundary conditions in the invariant.
h, k, c := 0, 0, 0;
{inv: P; bound: F-h + G-k}
do h ≠ F ∧ k ≠ G →
    if f[h] < g[k] → h, c := h+1, c+1
     □ f[h] = g[k] → h, k := h+1, k+1
     □ f[h] > g[k] → k, c := k+1, c+1
    fi
od;
Add to c the number of unprocessed elements of f and g:
    c:= c + F-h + G-k
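The merge-like loop in Python (the function name is this sketch's own). It counts the values occurring in exactly one of the two ascending arrays, with duplicates within an array counted separately, as the invariant specifies:

```python
def count_unshared(f, g):
    """f and g ascending; count elements of f not in g plus elements of g not in f."""
    F, G = len(f), len(g)
    h, k, c = 0, 0, 0
    while h != F and k != G:
        if f[h] < g[k]:
            h, c = h + 1, c + 1      # f[h] cannot appear in g
        elif f[h] == g[k]:
            h, k = h + 1, k + 1      # a shared value: count neither
        else:                        # f[h] > g[k]
            k, c = k + 1, c + 1      # g[k] cannot appear in f
    # add the number of unprocessed elements of f and g
    return c + F - h + G - k

assert count_unshared([1, 3, 5], [3, 4]) == 3   # 1, 5, and 4
```

Because each iteration advances h or k (or both), the running time is O(F+G), matching the bound function F-h + G-k.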
Index
85 nondeterministic, 111
Bounded nondeterminism, 312 procedure call, 164
sequential composition, 114-115
Calculus, 25 skip, 114
propositional calculus, 25 Command-comment, 99, 279
predicate calculus, 66 indentation of, 279
Call, of a procedure, 152 Common sense and formality, 164
by reference, 158 Commutative laws, 20
by result, 151 proof of, 48
by value, 151 Composition, associativity of, 316
by value result, 151 Composition, of relations, 316
cand, 68-70 Composition, sequential, 114-115
cand-simplification, 80 Concatenation, see Catenation
Cardinality, of a set, 311 Conclusion, 29
Cartesian product, 315 Conjecture, disproving, 15
Case statement, 134 Conjunct, 9
Catenation, 75 Conjunction, 9-10
identity of, 75, 333 distributivity of, 110
of sequences, 312 identity of, 72
ceil, 314 Conjunctive normal form, 27
Changing a representation, 246 Consequent, 9
Chebyshev, 83 Constable, Robert, 42
Checklist for understanding a loop, Constant proposition, 10
145 Constant-time algorithm, 321
Chomsky, Noam, 304 Contradiction, law of, 20, 70
Choose, 312 Contradiction, proof by, 39-41
Closing the Curve, 166, 301 Controlled Density Sort, 247, 303
Closure, of a relation, 317 cor, 68-70
transitive, 317 cor-simplification, 79
Code, for a permutation, 270 Correctness
Code to Perm, 264, 272-273, 303 partial, 109-110
Coffee Can Problem, 165,301 total, 110
Combining pre- and postconditions, Counting nodes of a tree, 231
211 Cubic algorithm, 321
Command, 108 Cut point, 297
abort, 114
alternative command, 132 Data encapsulation, 235
assignment, multiple, 121, 127 Data refinement, 235
assignment, simple, 128 De Morgan, Augustus, 20
assignment to an array element, 124 De Morgan's laws, 20, 70
Choose, 312 proof of, 49
deterministic, 111 Debugging, 5
guarded command, 131 Decimal to Base B, 215, 302
iterative command, 139 Decimal to Binary, 215,302
U, 69
Ullman, J.D., 309
Unambiguous grammar, 308
Unbounded nondeterminism, 312
Undefined value, 69
Union, of two sets, 311
Unique 5-bit Sequences, 262,
303, 352