The Science of Programming


Texts and Monographs in Computer Science

Editor
David Gries

Advisory Board
F. L. Bauer
K. S. Fu
J. J. Horning
R. Reddy
D. C. Tsichritzis
W. M. Waite

Springer
New York Berlin Heidelberg Barcelona Hong Kong London Milan Paris Singapore Tokyo
Texts and Monographs in Computer Science
Suad Alagic, Object-Oriented Database Programming

Suad Alagic, Relational Database Technology

Suad Alagic and Michael A. Arbib, The Design of Well-Structured and Correct Programs

S. Thomas Alexander, Adaptive Signal Processing: Theory and Applications

Krzysztof R. Apt and Ernst-Rudiger Olderog, Verification of Sequential and Concurrent Programs

Michael A. Arbib, A.J. Kfoury, and Robert N. Moll, A Basis for Theoretical
Computer Science

Friedrich L. Bauer and Hans Wossner, Algorithmic Language and Program Development

W. Bischofberger and G. Pomberger, Prototyping-Oriented Software Development: Concepts and Tools

Ronald V. Book and Friedrich Otto, String-Rewriting Systems

Kaare Christian, A Guide to Modula-2

Edsger W. Dijkstra, Selected Writings on Computing: A Personal Perspective

Edsger W. Dijkstra and Carel S. Scholten, Predicate Calculus and Program Semantics

W.H.J. Feijen, A.J.M. van Gasteren, D. Gries, and J. Misra, Eds., Beauty Is Our
Business: A Birthday Salute to Edsger W. Dijkstra

P.A. Fejer and D.A. Simovici, Mathematical Foundations of Computer Science, Volume I: Sets, Relations, and Induction

Melvin Fitting, First-Order Logic and Automated Theorem Proving

Nissim Francez, Fairness

R.T. Gregory and E.V. Krishnamurthy, Methods and Applications of Error-Free Computation

David Gries, Ed., Programming Methodology: A Collection of Articles by Members of IFIP WG2.3

David Gries, The Science of Programming

David Gries and Fred B. Schneider, A Logical Approach to Discrete Math

(continued after index)


The Science
of Programming
David Gries

Springer
David Gries
Department of Computer Science
Cornell University
Upson Hall
Ithaca, NY 14853
U.S.A.

Library of Congress Cataloging in Publication Data


Gries, David.
The science of programming.
(Texts and monographs in computer science)
Bibliography: p.
Includes index.
1. Electronic digital computers-Programming.
I. Title. II. Series.
QA76.6.G747 001.64'2 81-14554
AACR2

Printed on acid-free paper.

© 1981 by Springer-Verlag New York Inc.

All rights reserved. No part of this book may be translated or
reproduced in any form without written permission from Springer-Verlag,
175 Fifth Avenue, New York, New York 10010, USA.
The use of general descriptive names, trade names, trademarks, etc. in
this publication, even if the former are not especially identified, is
not to be taken as a sign that such names, as understood by the Trade Marks
and Merchandise Marks Act, may accordingly be used freely by anyone.
ISBN-13: 978-0-387-96480-5  e-ISBN-13: 978-1-4612-5983-1
DOI: 10.1007/978-1-4612-5983-1


Springer-Verlag New York Berlin Heidelberg


A member of BertelsmannSpringer Science+Business Media GmbH
Foreword

This is the textbook I hoped someone like Professor David Gries
would write -and, since the latter has no rivals, that means I just hoped
he would write it. The topic deserves no lesser author.
During the last decade, the potential meaning of the word "program"
has changed profoundly. While the "program" we wrote ten years ago
and the "program" we can write today can both be executed by a com-
puter, that is about all they have in common. Apart from that superficial
similarity, they are so fundamentally different that it is confusing to
denote both with the same term. The difference between the "old pro-
gram" and the "new program" is as profound as the difference between a
conjecture and a proven theorem, between pre-scientific knowledge of
mathematical facts and consequences rigorously deduced from a body of
postulates.
Remembering how many centuries it has taken Mankind to appreciate
fully the profundity of this latter distinction, we get a glimpse of the edu-
cational challenge we are facing: besides teaching technicalities, we have
to overcome the mental resistance always evoked when it is shown how
the techniques of scientific thought can be fruitfully applied to a next area
of human endeavour. (We have already heard all the objections, which
are so traditional they could have been predicted: "old programs" are
good enough, "new programs" are no better and are too difficult to design
in realistic situations, correctness of programs is much less important than
correctness of specifications, the "real world" does not care about proofs,
etc. Typically, these objections come from people that don't master the
techniques they object to.)
It does not suffice just to explain the formal machinery that enables us
to design "new programs". New formalisms are always frightening, and it
takes much careful teaching to convince the novice that the formalism is

not only helpful but even indispensable. Choice and order of examples
are as important as the good taste with which the formalism is applied.
To get the message across requires a scientist that combines his scientific
involvement in the subject with the precious gifts of a devoted teacher.
We should consider ourselves fortunate that Professor David Gries has
met the challenge.

Edsger W. Dijkstra
Preface

The Oxford English Dictionary contains the following sentence concerning the term science:

Sometimes, however, the term science is extended to denote a department of practical work which depends on the knowledge and conscious application of principles; an art, on the other hand, being understood to require merely knowledge of traditional rules and skill acquired by habit.

It is in this context that the title of this book was chosen. Programming
began as an art, and even today most people learn only by watching oth-
ers perform (e.g. a lecturer, a friend) and through habit, with little direc-
tion as to the principles involved. In the past 10 years, however, research
has uncovered some useful theory and principles, and we are reaching the
point where we can begin to teach the principles so that they can be cons-
ciously applied. This text is an attempt to convey my understanding of
and excitement for this just-emerging science of programming.
The approach does require some mathematical maturity and the will to
try something new. A programmer with two years' experience, or a junior
or senior computer science major in college, can master the material -at
least, this is the level I have aimed at.
A common criticism of the approach used in this book is that it has
been used only for small (one or two pages of program text), albeit com-
plex, problems. While this may be true so far, it is not an argument for
ignoring the approach. In my opinion it is the best approach to reasoning
about programs, and I believe the next ten years will see it extended to
and practiced on large programs. Moreover, since every large program
consists of many small programs, it is safe to say the following:

One cannot learn to write large programs effectively until one has learned to write small ones effectively.

While success cannot be guaranteed, my experience is that the approach
often leads to shorter, clearer, correct programs in the same amount of
time. It also leads to a different frame of mind, in that one becomes
more careful about definitions of variables, about style, about clarity.
Since most programmers currently have difficulty developing even small
programs, and the small programs they develop are not very readable,
studying the approach should prove useful.
The book contains little or no discussion of checking for errors, of
making programs robust, of testing programs and the like. This is not
because these aspects are unimportant or because the approach does not
allow for them. It is simply that, in order to convey the material as sim-
ply as possible, it is necessary to concentrate on the one aspect of develop-
ing correct programs. The teacher using this book may want to discuss
these other issues as well.

The Organization of the Book


Part I is an introduction to the propositional and predicate calculi.
Mastery of this material is important, for the predicate calculus should be
used as a tool for doing practical reasoning about programs. Any discip-
line in which severe complexity arises usually turns to mathematics to
help control that complexity. Programming is no different.
Rest assured that I have attempted to convey this material from the
programmer's viewpoint. Completeness, soundness, etc., are not men-
tioned, because the programmer has no need to study these issues. He
needs to be able to manipulate and simplify propositions and predicates
when developing programs.
Chapter 3, which is quite long, discusses reasoning using a "natural
deduction system". I wrote this chapter to learn about such systems and
to see how effective they were for reasoning about programs, because a
number of mechanical verifier systems are based on them. My conclusion
is that the more traditional approach of chapter 2 is far more useful, but I
have left chapter 3 in for those whose tastes run to the natural deduction
systems. Chapter 3 may be skipped entirely, although it may prove useful
in a course that covers some formal logic and theory.
If one is familiar with a few concepts of logic, it is certainly possible to
begin reading this book with Part II and to refer to Part I only for con-
ventions and notation. The teacher using this text in a course may also
want to present the material in a different order, presenting, for example,
the material on quantification later in the course when it is first needed.

Part II defines a small language in terms of weakest preconditions. The
important parts -the ones needed for later understanding of the develop-
ment of programs- are chapters 7 and 8, sections 9.1 and 9.2, and
chapters 10 and 11. Further, it is possible to skip some of the material,
for example the formal definition of the iterative construct and the proof
of theorem 11.6 concerning the use of a loop invariant, although I believe
that mastering this material will be beneficial.

Part III is the heart of the book. Within it, in order to get the reader
more actively involved, I have tried the following technique. At a point, a
question will be raised, which the reader is expected to answer. The ques-
tion is followed by white space, a horizontal line, and more white space.
After answering the question, the reader can then continue and discover
my answer. Such active involvement will be more difficult than simply
reading the text, but it will be far more beneficial.
Chapter 21 is fun. It concerns inverting programs, something that Eds-
ger W. Dijkstra and his colleague Wim Feijen dreamed up. Whether it is
really useful has not been decided, but it is fun. Chapter 22 presents a
few simple rules on documenting programs; the material can be read be-
fore the rest of the book. Chapter 23 contains a brief, personal history of
this science of programming and an anecdotal history of the programming
problems in the book.
Answers to some exercises are included -all answers are not given so
the exercises can be used as homework. A complete set of answers can be
obtained at nominal cost by requesting it, on appropriate letterhead.

Notation. The notation iff is used for "if and only if". A few years ago,
while lecturing in Denmark, I used fif instead, reasoning that since "if and
only if" was a symmetric concept its notation should be symmetric also.
Without knowing it, I had punned in Danish and the audience laughed,
for fif in Danish means "a little trick". I resolved thereafter to use fif so I
could tell my joke, but my colleagues talked me out of it.
The symbol □ is used to mark the end of theorems, definitions,
examples, and so forth. When beginning to produce this book on the
phototypesetter, it was discovered that the mathematical quantifiers
"forall" and "exists" could not be built easily, so A and E have been used
for them.
Throughout the book, in the few places they occur, the words he, him
and his denote a person of either sex.

Acknowledgements
Those familiar with Edsger W. Dijkstra's monograph A Discipline of
Programming will find his influence throughout this book. The calculus
for the derivation of programs, the style of developing programs, and
many of the examples are his. In addition, his criticisms of drafts of this
book have been invaluable.
Just as important to me has been the work of Tony Hoare. His paper
on an axiomatic basis for programming was the start of a new era, not
only in its technical contribution but in its taste and style, and his work
since then has continued to influence me. Tony's excellent, detailed criti-
cisms of a draft of Part I caused me to reorganize and rewrite major parts
of it.
I am grateful to Fred Schneider, who read the first drafts of all chap-
ters and gave technical and stylistic suggestions on almost every para-
graph.
A number of people have given me substantial constructive criticisms
on all or parts of the manuscript. For their help I would like to thank
Greg Andrews, Michael Gordon, Eric Hehner, Gary Levin, Doug McIl-
roy, Bob Melville, Jay Misra, Hal Perkins, John Williams, Michael
Woodger and David Wright.
My appreciation goes also to the Cornell Computer Science Commun-
ity. The students of course CS600 have been my guinea pigs for the past
five years, and the faculty and students have tolerated my preachings
about programming in a very amiable way. Cornell has been an excellent
place to perform my research.
This book was typed and edited by myself, using the departmental
PDP11/60-VAX system running under UNIX† and a screen editor written
for the Terak. (The files for the book contain 844,592 characters.) The
final copy was produced using troff and a Comp Edit phototypesetter at
the Graphics Lab at Cornell. Doug McIlroy introduced me to many of
the intricacies of troff; Alan Demers, Dean Krafft and Mike Hammond
provided much help with the PDP11/60-VAX system; and Alan Demers,
Barbara Gingras and Sandor Halasz spent many hours helping me con-
nect the output of troff to the phototypesetter. To them I am grateful.
The National Science Foundation has given me continual support for
my research, which led to this book.
Meetings of the IFIP Working Group on programming methodology,
WG2.3, have had a strong influence on my work in programming metho-
dology over the past 8 years.
†UNIX is a trademark of Bell Laboratories.

Finally, I thank my wife, Elaine, and children, Paul and Susan, for
their love and patience while I was writing this book.
In preparing the second printing of this book, over 150 changes were
made without significantly changing the page numbering. Thanks go to
the following people for notifying me of errors: Roland Backhouse, Alfs
T. Berztiss, Ed Cohen, Cui Jing, Cui Yan-Nong, Pavel Curtis, Alan
Demers, David Gries, Robert Harper, Cliff Jones, Donald E. Knuth, Liu
Shau-Chung, Michael Marcotty, Alain Martin, James Mildrew, Ken
Perry, Hal Perkins, Paul Pritchard, Willem de Roever, J.L.A. van de
Snepscheut, R.C. Shaw, Jorgen Steensgaard-Madsen, Rodney Topor, Sol-
veig Torgerson, Wlad Turski, V. Vitek, David Wright, Zhou Bing-Sheng.
Table of Contents

Part 0. Why Use Logic? Why Prove Programs Correct?

Part I. Propositions and Predicates
Chapter 1. Propositions
1.1. Fully Parenthesized Propositions
1.2. Evaluation of Constant Propositions
1.3. Evaluation of Propositions in a State
1.4. Precedence Rules for Operators
1.5. Tautologies
1.6. Propositions as Sets of States
1.7. Transforming English to Propositional Form
Chapter 2. Reasoning using Equivalence Transformations
2.1. The Laws of Equivalence
2.2. The Rules of Substitution and Transitivity
2.3. A Formal System of Axioms and Inference Rules
Chapter 3. A Natural Deduction System
3.1. Introduction to Deductive Proofs
3.2. Inference Rules
3.3. Proofs and Subproofs
3.4. Adding Flexibility to the Natural Deduction System
3.5. Developing Natural Deduction System Proofs
Chapter 4. Predicates
4.1. Extending the Range of a State
4.2. Quantification
4.3. Free and Bound Identifiers
4.4. Textual Substitution
4.5. Quantification Over Other Ranges
4.6. Some Theorems About Textual Substitution and States
Chapter 5. Notations and Conventions for Arrays
5.1. One-dimensional Arrays as Functions
5.2. Array Sections and Pictures
5.3. Handling Arrays of Arrays of ...
Chapter 6. Using Assertions to Document Programs
6.1. Program Specifications
6.2. Representing Initial and Final Values of Variables
6.3. Proof Outlines

Part II. The Semantics of a Small Language
Chapter 7. The Predicate Transformer wp
Chapter 8. The Commands skip, abort and Composition
Chapter 9. The Assignment Command
9.1. Assignment to Simple Variables
9.2. Multiple Assignment to Simple Variables
9.3. Assignment to an Array Element
9.4. The General Multiple Assignment Command
Chapter 10. The Alternative Command
Chapter 11. The Iterative Command
Chapter 12. Procedure Call
12.1. Calls with Value and Result Parameters
12.2. Two Theorems Concerning Procedure Call
12.3. Using Var Parameters
12.4. Allowing Value Parameters in the Postcondition

Part III. The Development of Programs
Chapter 13. Introduction
Chapter 14. Programming as a Goal-Oriented Activity
Chapter 15. Developing Loops from Invariants and Bounds
15.1. Developing the Guard First
15.2. Making Progress Towards Termination
Chapter 16. Developing Invariants
16.1. The Balloon Theory
16.2. Deleting a Conjunct
16.3. Replacing a Constant By a Variable
16.4. Enlarging the Range of a Variable
16.5. Combining Pre- and Postconditions
Chapter 17. Notes on Bound Functions
Chapter 18. Using Iteration Instead of Recursion
18.1. Solving Simpler Problems First
18.2. Divide and Conquer
18.3. Traversing Binary Trees
Chapter 19. Efficiency Considerations
19.1. Restricting Nondeterminism
19.2. Taking an Assertion out of a Loop
19.3. Changing a Representation
Chapter 20. Two Larger Examples of Program Development
20.1. Justifying Lines of Text
20.2. The Longest Upsequence
Chapter 21. Inverting Programs
Chapter 22. Notes on Documentation
22.1. Indentation
22.2. Definitions and Declarations of Variables
22.3. Writing Programs in Other Languages
Chapter 23. Historical Notes
23.1. A Brief History of Programming Methodology
23.2. The Problems Used in the Book

Appendix 1. Backus-Naur Form
Appendix 2. Sets, Sequences, Integers and Real Numbers
Appendix 3. Relations and Functions
Appendix 4. Asymptotic Execution Time Properties

Answers to Exercises
References
Index
Part 0
Why Use Logic?
Why Prove Programs Correct?

A story
We have just finished writing a large program (3000 lines). Among
other things, the program computes as intermediate results the quotient q
and remainder r arising from dividing a non-negative integer x by a posi-
tive integer y. For example, with x = 7 and y = 2, the program calculates
q = 3 (since 7÷2 = 3) and r = 1 (since the remainder when 7 is divided by
2 is 1).
Our program appears below, with dots " ... " representing the parts of
the program that precede and follow the remainder-quotient calculation.
The calculation is performed as given because the program will sometimes
be executed on a micro-computer that has no integer division, and porta-
bility must be maintained at all costs! The remainder-quotient calculation
actually seems quite simple; since / cannot be used, we have elected to
subtract divisor y from a copy of x repeatedly, keeping track of how
many subtractions are made, until another subtraction would yield a nega-
tive integer.

r := x; q := 0;
while r > y do
    begin r := r-y; q := q+1 end;

We're ready to debug the program. With respect to the remainder-
quotient calculation, we're smart enough to realize that the divisor should
initially be greater than 0 and that upon its termination the variables

should satisfy the formula

x = y *q +r,

so we add some output statements to check the calculations:

write ('dividend x =', x, 'divisor y =', y);
r := x; q := 0;
while r > y do
    begin r := r-y; q := q+1 end;
write ('y*q + r =', y*q + r);

Unfortunately, we get voluminous output because the program segment
occurs in a loop, so our first test run is wasted. We try to be more selec-
tive about what we print. Actually, we need to know values only when an
error is detected. Having heard of a new feature just inserted into the
compiler, we decide to try it. If a Boolean expression appears within
braces { and} at a point in the program, then, whenever "flow of control"
reaches that point during execution, it is checked: if false, a message and
a dump of the program variables are printed; if true, execution continues
normally. These Boolean expressions are called assertions, since in effect
we are asserting that they should be true when flow of control reaches
them. The systems people encourage leaving assertions in the program,
because they help document it.
Protests about inefficiency during production runs are swept aside by
the statement that there is a switch in the compiler to turn off assertion
checking. Also, after some thought, we decide it may be better always to
check assertions -detection of an error during production would be well
worth the extra cost.
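In a modern language the compiler feature just described can be imitated directly. The sketch below is a minimal Python illustration (the `check` helper and its message format are our own invention, not the book's compiler feature): when the asserted condition is false, it prints a message and a dump of the named variables, and otherwise execution continues normally.

```python
def check(condition, label, **variables):
    """Mimic the assertion feature: if condition is false, print a
    message and a dump of the supplied variables, then stop."""
    if not condition:
        print(f"assertion failed: {label}")
        for name, value in variables.items():
            print(f"  {name} = {value}")
        raise AssertionError(label)

# Example: checking y > 0 before a remainder-quotient calculation.
x, y = 7, 2
check(y > 0, "y > 0", x=x, y=y)
```

Python's built-in assert statement plays a similar role, and, like the compiler switch mentioned above, the interpreter's -O option turns such checks off; the helper here checks unconditionally so that the dump is always available.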
So we add assertions to the program:

{y > 0}
r := x; q := 0;
(1) while r > y do
    begin r := r-y; q := q+1 end;
{x = y*q + r}

Testing now results in far less output, and we make progress. Assertion
checking detects an error during a test run because y is 0 just before a
remainder-quotient calculation, and it takes only four hours to find the
error in the calculation of y and fix it.

But then we spend a day tracking down an error for which we received
no nice false-assertion message. We finally determine that the remainder-
quotient calculation resulted in

x = 6, y = 3, q = 1, r = 3.

Sure enough, both assertions in (1) are true with these values; the problem
is that the remainder should be less than the divisor, and it isn't. We
determine that the loop condition should be r ≥ y instead of r > y. If
only the result assertion were strong enough -if only we had used the
assertion x = y*q + r and r < y - we would have saved a day of work!
Why didn't we think of it?
We fix the error and insert the stronger assertion:

{y > 0}
r := x; q := 0;
while r ≥ y do
    begin r := r-y; q := q+1 end;
{x = y*q + r and r < y}

Things go fine for a while, but one day we get incomprehensible output.
It turns out that the quotient-remainder algorithm resulted in a negative
remainder r = -2. But the remainder shouldn't be negative! And we find
out that r was negative because initially x was -2. Ahhh, another error
in calculating the input to the quotient-remainder algorithm -x isn't sup-
posed to be negative! But we could have caught the error earlier and
saved two days searching, in fact we should have caught it earlier; all we
had to do was make the initial and final assertions for the program seg-
ment strong enough. Once more we fix an error and strengthen an asser-
tion:

{0 ≤ x and 0 < y}
r := x; q := 0;
while r ≥ y do
    begin r := r-y; q := q+1 end;
{x = y*q + r and 0 ≤ r < y}

It sure would be nice to be able to invent the right assertions to use in a
less ad hoc fashion. Why can't we think of them? Does it have to be a
trial-and-error process? Part of our problem here was carelessness in
specifying what the program segment was to do -we should have written

the initial assertion (0 ≤ x and 0 < y) and the final assertion (x = y*q + r
and 0 ≤ r < y) before writing the program segment, for they form the
definition of quotient and remainder.
But what about the error we made in the condition of the while loop?
Could we have prevented that from the beginning? Is there a way to
prove, just from the program and assertions, that the assertions are true
when flow of control reaches them? Let's see what we can do.
Just before the loop it seems that part of our result,

(2) x = y*q + r

holds, since x = r and q = 0. And from the assignments in the loop body
we conclude that if (2) is true before execution of the loop body then it is
true after its execution, so it will be true just before and after every itera-
tion of the loop. Let's insert it as an assertion in the obvious places, and
let's also make all assertions as strong as possible:

{0 ≤ x and 0 < y}
r := x; q := 0;
{0 ≤ r and 0 < y and x = y*q + r}
while r ≥ y do
    begin {0 ≤ r and 0 < y ≤ r and x = y*q + r}
        r := r-y; q := q+1
        {0 ≤ r and 0 < y and x = y*q + r}
    end;
{0 ≤ r < y and x = y*q + r}
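For readers who want to execute the annotated program, here is a direct transcription into Python (an illustrative rendering; the book's programs are not written in Python), with each assertion from the annotated program checked at runtime:

```python
def quotient_remainder(x, y):
    """Compute q, r with x = y*q + r and 0 <= r < y by repeated
    subtraction, checking every assertion of the annotated program."""
    assert 0 <= x and 0 < y                        # initial assertion
    r, q = x, 0
    assert 0 <= r and 0 < y and x == y*q + r       # invariant established
    while r >= y:
        assert 0 <= r and 0 < y <= r and x == y*q + r
        r, q = r - y, q + 1                        # the loop body
        assert 0 <= r and 0 < y and x == y*q + r   # invariant preserved
    assert 0 <= r < y and x == y*q + r             # final assertion
    return q, r

print(quotient_remainder(7, 2))   # (3, 1)
```

Running it with the story's example, x = 7 and y = 2, yields q = 3 and r = 1, and every assertion holds on every iteration.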

Now, how can we easily determine a correct loop condition, or, given the
condition, how can we prove it is correct? When the loop terminates the
condition is false. Upon termination we want r < y, so that the comple-
ment, r ≥ y, must be the correct loop condition. How easy that was!
It seems that if we knew how to make all assertions as strong as possi-
ble and if we learned how to reason carefully about assertions and pro-
grams, then we wouldn't make so many mistakes, we would know our
program was correct, and we wouldn't need to debug programs at all!
Hence, the days spent running test cases, looking through output and
searching for errors could be spent in other ways.
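The hand argument that the loop body preserves x = y*q + r can also be checked mechanically over a range of small states. The Python sketch below (our own sanity check, not a substitute for the proof) enumerates states satisfying the invariant and the guard, executes the loop body once, and confirms the invariant still holds:

```python
# For every small state (x, y, q, r) satisfying the invariant
# x = y*q + r together with the guard r >= y, execute the loop
# body once and confirm the invariant still holds afterwards.
for x in range(20):
    for y in range(1, 10):
        for q in range(20):
            r = x - y*q              # choose r so that x = y*q + r
            if r >= y:               # the loop guard
                r2, q2 = r - y, q + 1    # the loop body
                assert x == y*q2 + r2    # invariant preserved
                assert 0 <= r2           # r stays non-negative
print("invariant preserved in all tested states")
```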

Discussion
The story suggests that assertions, or simply Boolean expressions, are
really needed in programming. But it is not enough to know how to write
Boolean expressions; one needs to know how to reason with them: to sim-
plify them, to prove that one follows from another, to prove that one is
not true in some state, and so forth. And, later on, we will see that it is
necessary to use a kind of assertion that is not part of the usual Boolean
expression language of Pascal, PL/I or FORTRAN, the "quantified"
assertion.
Knowing how to reason about assertions is one thing; knowing how to
reason about programs is another. In the past 10 years, computer science
has come a long way in the study of proving programs correct. We are
reaching the point where the subject can be taught to undergraduates, or
to anyone with some training in programming and the will to become
more proficient. More importantly, the study of program correctness
proofs has led to the discovery and elucidation of methods for developing
programs. Basically, one attempts to develop a program and its proof
hand-in-hand, with the proof ideas leading the way! If the methods are
practiced with care, they can lead to programs that are free of errors, that
take much less time to develop and debug, and that are much more easily
understood (by those who have studied the subject).
Above, I mentioned that programs could be free of errors and, in a
way, I implied that debugging would be unnecessary. This point needs
some clarification. Even though we can become more proficient in pro-
gramming, we will still make errors, even if only of a syntactic nature
(typos). We are only human. Hence, some testing will always be neces-
sary. But it should not be called debugging, for the word debugging
implies the existence of bugs, which are terribly difficult to eliminate. No
matter how many flies we swat, there will always be more. A disciplined
method of programming should give more confidence than that! We
should run test cases not to look for bugs, but to increase our confidence
in a program we are quite sure is correct; finding an error should be the
exception rather than the rule.
With this motivation, let us turn to our first subject, the study of logic.
Part I
Propositions and Predicates

Chapter 1 defines the syntax of propositions -Boolean expressions using only Boolean variables- and shows how to evaluate them. Chapter 2 gives rules for manipulating propositions, which is often done in order
to find simpler but equivalent ones. This chapter is important for further
work on programming, and should be studied carefully.
Chapter 3 introduces a natural deduction system for proving theorems
about propositions, which is supposed to mimic in some sense the way we
"naturally" argue. Such systems are used in research on mechanical verifi-
cation of proofs of program correctness, and one should become familiar
with them. But the material is not needed to understand the rest of the
book and may be skipped entirely.
Chapter 4 extends propositions to include variables of types besides
Boolean and introduces quantification. A predicate calculus is given, in
which one can express and manipulate the assertions we make about pro-
gram variables. "Bound" and "free" variables are introduced and the
notion of textual substitution is studied. This material is necessary for
further reading.
Chapter 5 concerns arrays. Thinking of an array as a function from
subscript values to array element values, instead of as a collection of
independent variables, leads to some neat notation and rules for dealing
with arrays. The first two sections of this chapter should be read, but the
third may be skipped on first reading.
Finally, chapter 6 discusses briefly the use of assertions in programs,
thus motivating the next two parts of the book.
Chapter 1
Propositions

We want to be able to describe sets of states of program variables and to write and manipulate clear, unambiguous assertions about program
variables. We begin by considering only variables (and expressions) of
type Boolean: from the operational point of view, each variable contains
one of the values T and F, which represent our notions of "truth" and
"falsity", respectively. The word Boolean comes from the name of a 19th
century English mathematician, George Boole, who initiated the algebraic
study of truth values.
Like many logicians, we will use the word proposition for the kind of
Boolean or logical expression to be defined and discussed in this chapter.
Propositions are similar to arithmetic expressions. There are operands,
which represent the values T or F (instead of integers), and operators
(e.g. and, or instead of *, +), and parentheses are used to aid in determin-
ing order of evaluation. The problem will not be in defining and evaluat-
ing propositions, but in learning how to express assertions written in
English as propositions and to reason with those propositions.

1.1 Fully Parenthesized Propositions


Propositions are formed according to the following rules (the operators
will be defined subsequently). As can be seen, parentheses are required
around each proposition that includes an operation. This restriction,
which will be weakened later on, allows us to dispense momentarily with
problems of precedence of operators.

1. T and F are propositions.

2. An identifier is a proposition. (An identifier is a sequence of one or more digits and letters, the first of which is a letter.)

3. If b is a proposition, then so is (¬b).

4. If b and c are propositions, then so are (b ∧ c), (b ∨ c), (b ⇒ c), and (b = c).

This syntax may be easier to understand in the form of a BNF grammar (Appendix 1 gives a short introduction to BNF):

(1.1.1) <proposition> ::= T | F | <identifier>
                        | (¬ <proposition>)
                        | (<proposition> ∧ <proposition>)
                        | (<proposition> ∨ <proposition>)
                        | (<proposition> ⇒ <proposition>)
                        | (<proposition> = <proposition>)

Example. The following are propositions (separated by commas):

    F,  (¬T),  (b ∨ xyz),  ((¬b) ∧ (c ⇒ d)),  ((abc1 = id) ∧ (¬d))   □

Example. The following are not propositions:

    FF,  (b ∨ (c),  (b) ∧),  a + b   □

As seen in the above syntax, five operators are defined over values of
type Boolean:

negation:     (not b), or (¬b)
conjunction:  (b and c), or (b ∧ c)
disjunction:  (b or c), or (b ∨ c)
implication:  (b imp c), or (b ⇒ c)
equality:     (b equals c), or (b = c)
Two different notations have been given for each operator, a name and a
mathematical symbol. The name indicates how to pronounce it, and its
use also makes typing easier when a typewriter does not have the
corresponding mathematical symbol.
The following terminology is used. (b ∧ c) is called a conjunction; its operands b and c are called conjuncts. (b ∨ c) is called a disjunction; its operands b and c are called disjuncts. (b ⇒ c) is called an implication; its antecedent is b and its consequent is c.

1.2 Evaluation of Constant Propositions


Thus far we have given a syntax for propositions; we have defined the
set of well-formed propositions. We now give a semantics (meaning) by
showing how to evaluate them.
We begin by defining evaluation of constant propositions -propositions that contain only constants as operands- and we do this in three cases based on the structure of a proposition e: for e with no operators, for e with one operator, and for e with more than one operator.

(1.2.1) Case 1. The value of proposition T is T; the value of F is F.


(1.2.2) Case 2. The values of (¬b), (b ∧ c), (b ∨ c), (b ⇒ c) and (b = c), where b and c are each one of the constants T and F, are given by the following table (called a truth table). Each row of the table contains possible values for the operands b and c and, for these values, shows the value of each of the five operations. For example, from the last row we see that the value of (¬T) is F and that the values of (T ∧ T), (T ∨ T), (T ⇒ T) and (T = T) are all T.

        b  c   (¬b)  (b ∧ c)  (b ∨ c)  (b ⇒ c)  (b = c)

        F  F   T     F        F        T        T
(1.2.3) F  T   T     F        T        T        F
        T  F   F     F        T        F        F
        T  T   F     T        T        T        T

(1.2.4) Case 3. The value of a constant proposition with more than one operator is found by repeatedly applying (1.2.2) to a subproposition of the constant proposition and replacing the subproposition by its value, until the proposition is reduced to T or F.

We give an example of evaluation of a proposition:

      ((T ∧ T) ⇒ F)
    = (T ⇒ F)
    = F

Remark: The description of the operations in terms of a truth table, which lists all possible operand combinations and their values, can be given only because the set of possible values is finite. For example, no such table could be given for operations on integers. □

The names of the operations correspond fairly closely to their meanings in English. For example, "not true" usually means "false", and "not false" "true". But note that operation or denotes "inclusive or" and not "exclusive or". That is, (T ∨ T) is T, while the "exclusive or" of T and T is false.
Also, there is no causality implied by operation imp. The sentence "If it rains, the picnic is cancelled" can be written in propositional form as (rain ⇒ no picnic). From the English sentence we infer that the lack of rain means there will be a picnic, but no such inference can be made from the proposition (rain ⇒ no picnic).

1.3 Evaluation of Propositions in a State


A proposition like ((¬c) ∨ d) can appear in a program in several places, for example in an assignment statement b := ((¬c) ∨ d) and in an if-statement if ((¬c) ∨ d) then ... . When the statement in which the proposition appears is to be executed, the proposition is evaluated in the current machine "state" to produce either T or F. To define this evaluation requires a careful explanation of the notion of "state".
A state associates identifiers with values. For example, in state s (say),
identifier c could be associated with value F and identifier d with T. In
terms of a computer memory, when the computer is in state s, locations
named c and d contain the values F and T, respectively. In another
state, the associations could be (c, T) and (d, F). The crucial point here
is that a state consists of a set of pairs (identifier, value) in which all the
identifiers are distinct, i.e. the state is a function:

(1.3.1) Definition. A state s is a function from a set of identifiers to the set of values T and F. □

Example. Let state s be the function defined by the set {(a, T), (bc, F), (y1, T)}. Then s(a) denotes the value determined by applying state (function) s to identifier a: s(a) = T. Similarly, s(bc) = F and s(y1) = T. □

(1.3.2) Definition. Proposition e is well-defined in state s if each identifier in e is associated with either T or F in state s. □

In state s = {(b, T), (c, F)}, proposition (b ∨ c) is well-defined, while proposition (b ∨ d) is not.
Let us now extend the notation s(identifier) to define the value of a proposition in a state. For any state s and proposition e, s(e) will denote the value resulting from evaluating e in state s. Since an identifier b is also a proposition, we will be careful to make sure that s(b) will still denote the value of b in state s.

(1.3.3) Definition. Let proposition e be well-defined in state s. Then s(e), the value of e in state s, is the value obtained by replacing all occurrences of identifiers b in e by their values s(b) and evaluating the resulting constant proposition according to the rules given in the previous section 1.2. □

Example. s(((¬b) ∨ c)) is evaluated in state s = {(b, T), (c, F)}:

      s(((¬b) ∨ c))
    = ((¬T) ∨ F)    (b has been replaced by T, c by F)
    = (F ∨ F)
    = F    □
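The evaluation model of definition 1.3.3 is easy to mimic in a program. The following sketch is mine, not the book's: it represents a proposition as a nested tuple, a state as a Python dictionary, and uses the operator names introduced in section 1.1.

```python
# A sketch of definition 1.3.3 (not from the book): propositions are
# nested tuples such as ("or", ("not", "b"), "c"); a state s is a dict
# mapping identifiers to True (T) or False (F).

def evaluate(e, s):
    """The value s(e) of proposition e in state s."""
    if e is True or e is False:        # the constants T and F
        return e
    if isinstance(e, str):             # an identifier: its value in s
        return s[e]
    op = e[0]
    if op == "not":
        return not evaluate(e[1], s)
    _, b, c = e
    if op == "and":
        return evaluate(b, s) and evaluate(c, s)
    if op == "or":
        return evaluate(b, s) or evaluate(c, s)
    if op == "imp":                    # b imp c is (not b) or c
        return (not evaluate(b, s)) or evaluate(c, s)
    if op == "equals":
        return evaluate(b, s) == evaluate(c, s)
    raise ValueError("unknown operator: %r" % op)

# The example above: s(((not b) or c)) in state {(b, T), (c, F)} is F.
print(evaluate(("or", ("not", "b"), "c"), {"b": True, "c": False}))  # False
```

An identifier missing from the state raises a KeyError, which matches definition 1.3.2's notion of a proposition that is not well-defined in that state.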

1.4 Precedence Rules for Operators


The previous sections dealt with a restricted form of propositions, so
that evaluation of propositions could be explained without having to deal
with the precedence of operators. We now relax this restriction.
Parentheses can be omitted or included at will around any proposition. For example, the proposition ((b ∨ c) ⇒ d) can be written as b ∨ c ⇒ d. In this case, additional rules define the order of evaluation of subpropositions. These rules, which are similar to those for arithmetic expressions, are:

1. Sequences of the same operator are evaluated from left to right, e.g. b ∧ c ∧ d is equivalent to ((b ∧ c) ∧ d).

2. The order of evaluation of different, adjacent operators is given by the list: not (has highest precedence and binds tightest), and, or, imp, equals.

It is usually better to make liberal use of parentheses in order to make


the order of evaluation clear, and we will usually do so.

Examples. ¬b = b ∧ c      is equivalent to   (¬b) = (b ∧ c)
          b ∨ ¬c ⇒ d      is equivalent to   (b ∨ (¬c)) ⇒ d
          b ⇒ c ⇒ d ∧ e   is equivalent to   (b ⇒ c) ⇒ (d ∧ e)    □

The following BNF grammar defines the syntax of propositions, giving enough structure so that precedences can be deduced from it. (The nonterminal <identifier> has been left undefined and has its usual meaning.)

 1. <proposition> ::= <imp-expr>
 2.                 | <proposition> = <imp-expr>
 3. <imp-expr>    ::= <expr>
 4.                 | <imp-expr> ⇒ <expr>
 5. <expr>        ::= <term>
 6.                 | <expr> ∨ <term>
 7. <term>        ::= <factor>
 8.                 | <term> ∧ <factor>
 9. <factor>      ::= ¬ <factor>
10.                 | ( <proposition> )
11.                 | T
12.                 | F
13.                 | <identifier>

We now define s(e), the value of proposition e in state s, recursively, based on the structure of e given by the grammar. That is, for each rule of the grammar, we show how to evaluate e if it has the form given by that rule. For example, rule 6 indicates that for an <expr> of the form <expr> ∨ <term>, its value is the value found by applying operation or to the values s(<expr>) and s(<term>) of its operands <expr> and <term>. The values of the five operations =, ⇒, ∨, ∧ and ¬ used in rules 2, 4, 6, 8 and 9 are given by truth table (1.2.3).

 1. s(<proposition>) = s(<imp-expr>)
 2. s(<proposition>) = s(<proposition>) = s(<imp-expr>)
 3. s(<imp-expr>)    = s(<expr>)
 4. s(<imp-expr>)    = s(<imp-expr>) ⇒ s(<expr>)
 5. s(<expr>)        = s(<term>)
 6. s(<expr>)        = s(<expr>) ∨ s(<term>)
 7. s(<term>)        = s(<factor>)
 8. s(<term>)        = s(<term>) ∧ s(<factor>)
 9. s(<factor>)      = ¬ s(<factor>)
10. s(<factor>)      = s(<proposition>)
11. s(<factor>)      = T
12. s(<factor>)      = F
13. s(<factor>)      = s(<identifier>)    (the value of <identifier> in s)
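The grammar-directed definition of s(e) above translates directly into a recursive-descent evaluator, one function per nonterminal, with a loop for each left-recursive rule. The sketch below is mine, not the book's; for typability it assumes ASCII stand-ins for the operators (! for not, & for and, | for or, > for imp, = for equals) and requires tokens other than parentheses to be blank-separated.

```python
# A sketch (not from the book) of the grammar rules 1-13: one function
# per nonterminal; each while-loop realizes a left-associative rule.

def evaluate(text, s):
    """Parse proposition text and return its value in state s (a dict)."""
    toks = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else None

    def take():
        nonlocal pos
        pos += 1
        return toks[pos - 1]

    def proposition():                 # rules 1-2: = is left-associative
        v = imp_expr()
        while peek() == "=":
            take()
            v = (v == imp_expr())
        return v

    def imp_expr():                    # rules 3-4
        v = expr()
        while peek() == ">":
            take()
            rhs = expr()               # parse first: no short-circuiting
            v = (not v) or rhs
        return v

    def expr():                        # rules 5-6
        v = term()
        while peek() == "|":
            take()
            rhs = term()
            v = v or rhs
        return v

    def term():                        # rules 7-8
        v = factor()
        while peek() == "&":
            take()
            rhs = factor()
            v = v and rhs
        return v

    def factor():                      # rules 9-13
        t = take()
        if t == "!":
            return not factor()
        if t == "(":
            v = proposition()
            take()                     # consume the closing parenthesis
            return v
        if t == "T":
            return True
        if t == "F":
            return False
        return s[t]                    # rule 13: an identifier's value in s

    return proposition()

print(evaluate("( T & T ) > F", {}))   # False, as in section 1.2's example
```

Note the deliberate `rhs = ...` lines: the right operand must be parsed even when the left operand already determines the value, otherwise unread tokens would be silently skipped.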

An example of evaluation using a truth table


Let us compute values of the proposition (b ⇒ c) = (¬b ∨ c) for all possible operand values using a truth table. In the table below, each row gives possible values for b and c and the corresponding values of ¬b, ¬b ∨ c, b ⇒ c and the final proposition. This truth table shows how one builds a truth table for a proposition, by beginning with the values of the

identifiers, then showing the values of the smallest subpropositions, then the next smallest, and building up to the complete proposition.
As can be seen, the values of ¬b ∨ c and b ⇒ c are the same in each state, and hence the propositions are equivalent and can be used interchangeably. In fact, one often finds b ⇒ c defined as ¬b ∨ c. Similarly, b = c is often defined as an abbreviation for (b ⇒ c) ∧ (c ⇒ b) (see exercise 2i).

    b  c   ¬b   ¬b ∨ c   b ⇒ c   (b ⇒ c) = (¬b ∨ c)

    F  F   T    T        T       T
    F  T   T    T        T       T
    T  F   F    F        F       T
    T  T   F    T        T       T

1.5 Tautologies

A tautology is a proposition that is true in every state in which it is well-defined. For example, proposition T is a tautology and F is not. The proposition b ∨ ¬b is a tautology, as can be seen by evaluating it with b = T and b = F:

    T ∨ ¬T  =  T ∨ F  =  T
    F ∨ ¬F  =  F ∨ T  =  T

or, in truth-table form:

    b   ¬b   b ∨ ¬b
    T   F    T
    F   T    T

The basic way to show that a proposition is a tautology is to show that its evaluation yields T in every possible state. Unfortunately, each extra identifier in a proposition doubles the number of combinations of values for identifiers -for a proposition with i distinct identifiers there are 2^i cases! Hence, the work involved can become tedious and time consuming. To illustrate this, (1.5.1) contains the truth table for proposition (b ∧ c ∧ d) ⇒ (d ⇒ b), which has three distinct identifiers. By taking some shortcuts, the work can be reduced. For example, a glance at truth table (1.2.3) indicates that operation imp is true whenever its antecedent is false, so that its consequent need only be evaluated if its antecedent is true. In example (1.5.1) there is only one state in which the antecedent b ∧ c ∧ d is true -the state in which b, c and d are true- and hence we need only the top line of truth table (1.5.1).

        b c d   b ∧ c ∧ d   d ⇒ b   (b ∧ c ∧ d) ⇒ (d ⇒ b)

        T T T   T           T       T
        T T F   F           T       T
        T F T   F           T       T
(1.5.1) T F F   F           T       T
        F T T   F           F       T
        F T F   F           T       T
        F F T   F           F       T
        F F F   F           T       T

Using such informal reasoning helps reduce the number of states in which the proposition must be evaluated. Nevertheless, the more distinct
identifiers a proposition has the more states to inspect, and evaluation
soon becomes infeasible. Later chapters investigate other methods for
proving that a proposition is a tautology.
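The brute-force method just described -evaluate the proposition in all 2^i states- can be sketched as follows (my code, not the book's), with a proposition given as a Python function of its identifiers.

```python
# A sketch (not from the book) of the brute-force tautology check:
# evaluate the proposition in every one of the 2**i states.
from itertools import product

def imp(a, b):
    """The imp operation: a imp b is (not a) or b."""
    return (not a) or b

def is_tautology(f, names):
    """True iff f yields True for every assignment of values to names."""
    return all(f(**dict(zip(names, vals)))
               for vals in product([False, True], repeat=len(names)))

# (b and c and d) imp (d imp b), truth table (1.5.1): a tautology.
print(is_tautology(lambda b, c, d: imp(b and c and d, imp(d, b)),
                   ["b", "c", "d"]))                      # True
```

The cost is exponential in the number of distinct identifiers, which is exactly the infeasibility the text notes; the later chapters' proof methods avoid this enumeration.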

Disproving a conjecture
Sometimes we conjecture that a proposition e is a tautology, but are unable to develop a proof of it, so we decide to try to disprove it. What does it take to disprove such a conjecture?
It may be possible to prove its negation -i.e. that ¬e is a tautology- but the chances are slim. If we had reason to believe a conjecture, it is unlikely that its negation is true. Much more likely is that the conjecture is true in most states but false in one or two, and to disprove it we need only find one such state:

To prove a conjecture, it is necessary to prove that it is true in all cases; to disprove a conjecture, it is sufficient to find a single case where it is false.

1.6 Propositions as Sets of States


A proposition represents, or describes, the set of states in which it is
true. Conversely, for any set of states containing only identifiers associ-
ated with T or F we can derive a proposition that represents that state
set. Thus, the empty set, the set containing no states, is represented by
proposition F because F is true in no state. The set of all states is
represented by proposition T because T is true in all states. The follow-
ing example illustrates how one can derive a proposition that represents a
given set of states. The resulting proposition contains only the operators
and, or and not.

Example. The set of two states {(b, T), (c, T), (d, T)} and {(b, F), (c, T), (d, F)} is represented by the proposition

    (b ∧ c ∧ d) ∨ (¬b ∧ c ∧ ¬d)   □

The connection between a proposition and the set of states it represents is so strong that we often identify the two concepts. Thus, instead of writing "the set of states in which b ∨ ¬c is true" we may write "the states in b ∨ ¬c". Though it is a sloppy use of English, it is at times convenient.
In connection with this discussion, the following terminology is introduced. Proposition b is weaker than c if c ⇒ b. Correspondingly, c is said to be stronger than b. A stronger proposition makes more restrictions on the combinations of values its identifiers can be associated with, a weaker proposition makes fewer. In terms of sets of states, b is weaker than c if it is "less restrictive": if b's set of states includes at least c's states, and possibly more. The weakest proposition is T (or any tautology), because it represents the set of all states; the strongest is F, because it represents the set of no states.
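The correspondence between "weaker" and set inclusion can be checked computationally: b is weaker than c exactly when the set of states in which c is true is included in the set of states in which b is true. A sketch (mine, not the book's):

```python
# A sketch (not from the book): b is weaker than c iff c imp b holds in
# every state, i.e. iff c's set of states is a subset of b's.
from itertools import product

def states(f, n):
    """The set of states (tuples of n values) in which f is true."""
    return {vals for vals in product([False, True], repeat=n) if f(*vals)}

b = lambda x, y: x or y      # the weaker proposition: true in 3 states
c = lambda x, y: x and y     # the stronger proposition: true in 1 state
print(states(c, 2) <= states(b, 2))   # True: c's states are among b's
```

The reverse inclusion fails, so x ∧ y is strictly stronger than x ∨ y.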

1.7 Transforming English to Propositional Form


At this point, we translate a few sentences into propositional form. Consider the sentence "If it rains, the picnic is cancelled." Let identifier r stand for the proposition "it rains" and let identifier pc represent "the picnic is cancelled". Then the sentence can be written as r ⇒ pc.
As shown by this example, the technique is to represent "atomic parts"
of a sentence -how these are chosen is up to the translator- by identif-
iers and to describe their relationship using Boolean operators. Here are
some more examples, using identifiers r, pc, wet, and s defined as fol-
lows:

it rains: r
picnic is cancelled: pc
be wet: wet
stay at home: s

1. If it rains but I stay at home, I won't be wet: (r ∧ s) ⇒ ¬wet

2. I'll be wet if it rains: r ⇒ wet

3. If it rains and the picnic is not cancelled or I don't stay home, I'll be wet: Either ((r ∧ ¬pc) ∨ ¬s) ⇒ wet or (r ∧ (¬pc ∨ ¬s)) ⇒ wet. The English is ambiguous; the latter proposition is probably the desired one.

4. Whether or not the picnic is cancelled, I'm staying home if it rains: (pc ∨ ¬pc) ∧ (r ⇒ s). This reduces to r ⇒ s.

5. Either it doesn't rain or I'm staying home: ¬r ∨ s.
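As a quick check of translation 4, the following sketch (mine, not the book's) compares (pc ∨ ¬pc) ∧ (r ⇒ s) with r ⇒ s in all eight states.

```python
# A sketch (not from the book): translation 4 has the same value as
# r imp s in every state, so it indeed "reduces to" r imp s.
from itertools import product

def imp(a, b):
    return (not a) or b

same = all(((pc or not pc) and imp(r, s)) == imp(r, s)
           for pc, r, s in product([False, True], repeat=3))
print(same)   # True
```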

Exercises for Chapter 1


1. Each line contains a proposition and two states s1 and s2. Evaluate the proposition in both states.

    proposition                 state s1    state s2

                                m n p q     m n p q
(a) ¬(m ∨ n)                    T F T T     F T T T
(b) ¬m ∨ n                      T F T T     F T T T
(c) ¬(m ∧ n)                    T F T T     F T T T
(d) ¬m ∧ n                      T F T T     F T T T
(e) (m ∨ n) ⇒ p                 T F T T     T T F T
(f) m ∨ (n ⇒ p)                 T F T T     T T F T
(g) (m = n) ∧ (p = q)           F F T F     T F T F
(h) m = (n ∧ (p = q))           F F T F     T F T F
(i) m = (n ∧ p = q)             F F T F     T F T F
(j) (m = n) ∧ (p ⇒ q)           F T F T     T T F F
(k) (m = n ∧ p) ⇒ q             F T F T     T T F F
(l) (m ⇒ n) ⇒ (p ⇒ q)           F F F F     T T T T
(m) (m ⇒ (n ⇒ p)) ⇒ q           F F F F     T T T T

2. Write truth tables to show the values of the following propositions in all states:

(a) b ∨ c ∨ d          (e) ¬b ⇒ (b ∨ c)
(b) b ∧ c ∧ d          (f) ¬b = (b ∨ c)
(c) b ∧ (c ∨ d)        (g) (¬b = c) ∨ b
(d) b ∨ (c ∧ d)        (h) (b ∨ c) ∧ (b ⇒ c) ∧ (c ⇒ b)
                       (i) (b = c) = (b ⇒ c) ∧ (c ⇒ b)

3. Translate the following sentences into propositional form.

(a) x < y or x = y.
(b) Either x < y, x = y, or x > y.
(c) If x > y and y > z, then v = w.
(d) The following are all true: x < y, y < z and v = w.
(e) At most one of the following is true: x < y, y < z and v = w.
(f) None of the following are true: x < y, y < z and v = w.
(g) The following are not all true at the same time: x < y, y < z and v = w.
(h) When x < y, then y < z; when x ≥ y, then v = w.
(i) When x < y then y < z means that v = w, but if x ≥ y then y < z doesn't hold; however, if v = w then x < y.
(j) If execution of program P is begun with x < y, then execution terminates with y = 2^x.
(k) Execution of program P begun with x < 0 will not terminate.

4. Below are some English sentences. Introduce identifiers to represent the simple ones (e.g. "it's raining cats and dogs") and then translate the sentences into propositions.

(a) Whether or not it's raining, I'm going swimming.
(b) If it's raining I'm not going swimming.
(c) It's raining cats and dogs.
(d) It's raining cats or dogs.
(e) If it rains cats and dogs I'll eat my hat, but I won't go swimming.
(f) If it rains cats and dogs while I am swimming I'll eat my hat.
Chapter 2
Reasoning using Equivalence Transformations

Evaluating propositions is rarely our main task. More often we wish to manipulate them in some manner in order to derive "equivalent" but simpler ones (easier to read and understand). Two propositions (or, in general, expressions) are equivalent if they have the same value in every state. For example, since a + (c - a) = c is always true for integer variables a and c, the two integer expressions a + (c - a) and c are equivalent, and a + (c - a) = c is called an equivalence.
This chapter defines equivalence of propositions in terms of the evalua-
tion model of chapter 1. A list of useful equivalences is given, together
with two rules for generating others. The idea of a "calculus" is discussed,
and the rules are put in the form of a formal calculus for "reasoning"
about propositions.
These rules form the basis for much of the manipulations we do with
propositions and are very important for later work on developing pro-
grams. The chapter should be studied carefully.

2.1 The Laws of Equivalence


For propositions, we define equivalence in terms of operation equals and the notion of a tautology as follows:

(2.1.1) Definition. Propositions E1 and E2 are equivalent iff E1 = E2 is a tautology. In this case, E1 = E2 is an equivalence. □

Thus, an equivalence is an equality that is a tautology.


Below, we give a list of equivalences; these are the basic equivalences from which all others will be derived, so we call them the laws of equivalence. Actually, they are "schemas": the identifiers E1, E2 and E3 within them are parameters, and one arrives at a particular equivalence by substituting particular propositions for them. For example, substituting x ∨ y for E1 and z for E2 in the first law of Commutativity, (E1 ∧ E2) = (E2 ∧ E1), yields the equivalence

    ((x ∨ y) ∧ z) = (z ∧ (x ∨ y))

Remark: Parentheses are inserted where necessary when performing a substitution so that the order of evaluation remains consistent with the original proposition. For example, the result of substituting x ∨ y for b in b ∧ z is (x ∨ y) ∧ z, and not x ∨ y ∧ z, which is equivalent to x ∨ (y ∧ z). □

1. Commutative Laws (These allow us to reorder the operands of and, or and equality):

    (E1 ∧ E2) = (E2 ∧ E1)
    (E1 ∨ E2) = (E2 ∨ E1)
    (E1 = E2) = (E2 = E1)

2. Associative Laws (These allow us to dispense with parentheses when dealing with sequences of and and sequences of or):

    E1 ∧ (E2 ∧ E3) = (E1 ∧ E2) ∧ E3    (so write both as E1 ∧ E2 ∧ E3)
    E1 ∨ (E2 ∨ E3) = (E1 ∨ E2) ∨ E3

3. Distributive Laws (These are useful in factoring a proposition, in the same way that we rewrite 2*(3+4) as (2*3)+(2*4)):

    E1 ∨ (E2 ∧ E3) = (E1 ∨ E2) ∧ (E1 ∨ E3)
    E1 ∧ (E2 ∨ E3) = (E1 ∧ E2) ∨ (E1 ∧ E3)

4. De Morgan's Laws (After Augustus De Morgan, a 19th century English mathematician who, along with Boole, laid much of the foundations for mathematical logic):

    ¬(E1 ∧ E2) = ¬E1 ∨ ¬E2
    ¬(E1 ∨ E2) = ¬E1 ∧ ¬E2

5. Law of Negation: ¬(¬E1) = E1

6. Law of the Excluded Middle: E1 ∨ ¬E1 = T

7. Law of Contradiction: E1 ∧ ¬E1 = F

8. Law of Implication: E1 ⇒ E2 = ¬E1 ∨ E2

9. Law of Equality: (E1 = E2) = (E1 ⇒ E2) ∧ (E2 ⇒ E1)

10. Laws of or-simplification:

    E1 ∨ E1 = E1
    E1 ∨ T = T
    E1 ∨ F = E1
    E1 ∨ (E1 ∧ E2) = E1

11. Laws of and-simplification:

    E1 ∧ E1 = E1
    E1 ∧ T = E1
    E1 ∧ F = F
    E1 ∧ (E1 ∨ E2) = E1

12. Law of Identity: E1 = E1

Don't be alarmed at the number of laws. Most of them you have used
many times, perhaps unknowingly, and this list will only serve to make
you more aware of them. Study the laws carefully, for they are used over
and over again in manipulating propositions. Do some of the exercises at
the end of this section until the use of these laws becomes second nature.
Knowing the laws by name makes discussions of their use easier.
The law of the Excluded Middle deserves some comment. It means that at least one of b and ¬b must be true in any state; there can be no middle ground. Some don't believe this law, at least in all its generality.
In fact, here is a counterexample to it, in English. Consider the sentence

This sentence is false.

which we might consider as the meaning of an identifier b. Is it true or


false? It can't be true, because it says it is false; it can't be false, because
then it would be true! The sentence is neither true nor false, and hence
violates the law of the Excluded Middle. The paradox arises because of
the self-referential aspect of the sentence -it indicates something about
itself, as do all paradoxes. [Here is another paradox to ponder: a barber
in a small town cuts the hair of every person in town except for those who
cut their own. Who cuts the barber's hair?] In our formal system, there
will be no way to introduce such self-referential treatment, and the law of
the Excluded Middle holds. But this means we cannot express all our
thoughts and arguments in the formal system.
Finally, the laws of Equality and Implication deserve special mention. Together, they define equality and imp in terms of the other operators: b = c can always be replaced by (b ⇒ c) ∧ (c ⇒ b), and b ⇒ c by ¬b ∨ c. This reinforces what we said about the two operations in chapter 1.

Proving that the logical laws are equivalences


We have stated, without proof, that laws 1-12 are equivalences. One way to prove this is to build truth tables and note that the laws are true in all states. For example, the first of De Morgan's laws, ¬(E1 ∧ E2) = ¬E1 ∨ ¬E2, has the following truth table:

E1 E2   E1 ∧ E2   ¬(E1 ∧ E2)   ¬E1   ¬E2   ¬E1 ∨ ¬E2   ¬(E1 ∧ E2) = ¬E1 ∨ ¬E2

F  F    F         T            T     T     T           T
F  T    F         T            T     F     T           T
T  F    F         T            F     T     T           T
T  T    T         F            F     F     F           T

Clearly, the law is true in all states (in which it is well-defined), so that it
is a tautology.
Exercise 1 concerns proving all the laws to be equivalences.
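Each law can be checked mechanically the same way the truth table above checks De Morgan's first law. A sketch (mine, not the book's) for laws 4, 8 and 9:

```python
# A sketch (not from the book): verify that a law, evaluated with its
# parameters ranging over all values, is true in every state.
from itertools import product

def tautology(f, arity):
    return all(f(*vals) for vals in product([False, True], repeat=arity))

imp = lambda a, b: (not a) or b
checks = [
    # De Morgan:    not(E1 and E2)  =  (not E1) or (not E2)
    tautology(lambda a, b: (not (a and b)) == ((not a) or (not b)), 2),
    # Implication:  E1 imp E2  =  (not E1) or E2
    tautology(lambda a, b: imp(a, b) == ((not a) or b), 2),
    # Equality:     (E1 = E2)  =  (E1 imp E2) and (E2 imp E1)
    tautology(lambda a, b: (a == b) == (imp(a, b) and imp(b, a)), 2),
]
print(all(checks))   # True
```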

2.2 The Rules of Substitution and Transitivity


Thus far, we have just discussed some basic equivalences. We now turn to ways of generating other equivalences, without having to check their truth tables. One rule we all use in transforming expressions, usually without explicit mention, is the rule of "substitution of equals for equals". Here is an example of the use of this rule. Since a + (c - a) = c, we can substitute for expression a + (c - a) in (a + (c - a))*d to conclude that (a + (c - a))*d = c*d; we simply replace a + (c - a) in (a + (c - a))*d by the simpler, but equivalent, expression c.
The rule of substitution is:

(2.2.1) Rule of Substitution. Let e1 = e2 be an equivalence and E(p) be a proposition, written as a function of one of its identifiers p. Then E(e1) = E(e2) and E(e2) = E(e1) are also equivalences. □

Here is an example of the use of the rule of Substitution. The law of Implication indicates that (b ⇒ c) = (¬b ∨ c) is an equivalence. Consider the proposition E(p) = d ∨ p. With

    e1:  b ⇒ c    and
    e2:  ¬b ∨ c

we have

    E(e1) = d ∨ (b ⇒ c)
    E(e2) = d ∨ (¬b ∨ c)

so that d ∨ (b ⇒ c) = d ∨ (¬b ∨ c) is an equivalence.


In using the rule of Substitution, we often use the following form. The proposition that we conclude is an equivalence is written on one line. The initial proposition appears to the left of the equality sign and the one that results from the substitution appears to the right, followed by the name of the law e1 = e2 used in the application:

    d ∨ (b ⇒ c) = d ∨ (¬b ∨ c)    (Implication)

We need one more rule for generating equivalences:

(2.2.2) Rule of Transitivity. If e1 = e2 and e2 = e3 are equivalences, then so is e1 = e3 (and hence e1 is equivalent to e3). □

Example. We show that (b ⇒ c) = (¬c ⇒ ¬b) is an equivalence (an explanation of the format follows):

      b ⇒ c
    = ¬b ∨ c      (Implication)
    = c ∨ ¬b      (Commutativity)
    = ¬¬c ∨ ¬b    (Negation)
    = ¬c ⇒ ¬b     (Implication)

This is read as follows. First, lines 1 and 2 indicate that b ⇒ c is equivalent to ¬b ∨ c, by virtue of the rule of Substitution and the law of Implication. Secondly, lines 2 and 3 indicate that (¬b ∨ c) is equivalent to c ∨ ¬b, by virtue of the rule of Substitution and the law of Commutativity. We also conclude, using the rule of Transitivity, that the first proposition, b ⇒ c, is equivalent to the third, c ∨ ¬b. Continuing in this fashion, each pair of lines gives an equivalence and the reasons why the equivalence holds. We finally conclude that the first proposition, b ⇒ c, is equivalent to the last, ¬c ⇒ ¬b. □
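A chain of this kind can itself be checked by brute force: each pair of adjacent lines must agree in every state, and Transitivity then gives the end-to-end equivalence. A sketch (mine, not the book's):

```python
# A sketch (not from the book): every adjacent pair of lines in the
# chain agrees in every state, so the first line equals the last.
from itertools import product

imp = lambda a, b: (not a) or b
chain = [
    lambda b, c: imp(b, c),                 # b imp c
    lambda b, c: (not b) or c,              # Implication
    lambda b, c: c or (not b),              # Commutativity
    lambda b, c: (not (not c)) or (not b),  # Negation
    lambda b, c: imp(not c, not b),         # Implication
]
ok = all(f(b, c) == g(b, c)
         for f, g in zip(chain, chain[1:])
         for b, c in product([False, True], repeat=2))
print(ok)   # True
```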

Example. We show that the law of Contradiction can be proved from the others:

      ¬(b ∧ ¬b)
    = ¬b ∨ ¬¬b    (De Morgan's Law)
    = ¬b ∨ b      (Negation)
    = b ∨ ¬b      (Commutativity)
    = T           (Excluded Middle)    □

Generally speaking, such fine detail is unnecessary. The laws of Commutativity and Associativity are often used without explanation, and the application of several steps can appear on one line. For example:

      (b ∧ (b ⇒ c)) ⇒ c
    = ¬(b ∧ (¬b ∨ c)) ∨ c     (Implication, 2 times)
    = ¬b ∨ ¬(¬b ∨ c) ∨ c      (De Morgan)
    = T                       (Excluded Middle)

Transforming an implication
Suppose we want to prove that

(2.2.3)    E1 ∧ E2 ∧ E3 ⇒ E

is a tautology. The proposition is transformed as follows:

      (E1 ∧ E2 ∧ E3) ⇒ E
    = ¬(E1 ∧ E2 ∧ E3) ∨ E       (Implication)
    = ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E       (De Morgan)

The final proposition is true in any state in which at least one of ¬E1, ¬E2, ¬E3 and E is true. Hence, to prove that (2.2.3) is a tautology we
need only prove that in any state in which three of them are false the
fourth is true. And we can choose which three to assume false, based on
their form, in order to develop the simplest proof.
With an argument similar to the one just given, we can see that the
five statements

           E1 ∧ E2 ∧ E3 ⇒ E
           E1 ∧ E2 ∧ ¬E ⇒ ¬E3
           E1 ∧ ¬E ∧ E3 ⇒ ¬E2
           ¬E ∧ E2 ∧ E3 ⇒ ¬E1
(2.2.4)    ¬E1 ∨ ¬E2 ∨ ¬E3 ∨ E

are equivalent and we can choose which to work with. When given a pro-
position like (2.2.3), eliminating implication completely in favor of dis-
junctions like (2.2.4) can be helpful. Likewise, when formulating a prob-
lem, put it in the form of a disjunction right from the beginning.
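That the five statements really are interchangeable can be confirmed by evaluating all of them in each of the sixteen states. A sketch (mine, not the book's):

```python
# A sketch (not from the book): the five statements of (2.2.4) take the
# same value in each of the sixteen states, so any one may be proved in
# place of the others.
from itertools import product

imp = lambda a, b: (not a) or b
forms = [
    lambda e1, e2, e3, e: imp(e1 and e2 and e3, e),
    lambda e1, e2, e3, e: imp(e1 and e2 and (not e), not e3),
    lambda e1, e2, e3, e: imp(e1 and (not e) and e3, not e2),
    lambda e1, e2, e3, e: imp((not e) and e2 and e3, not e1),
    lambda e1, e2, e3, e: (not e1) or (not e2) or (not e3) or e,
]
ok = all(len({f(*vals) for f in forms}) == 1
         for vals in product([False, True], repeat=4))
print(ok)   # True
```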

Example. Prove that

    (¬(b ⇒ c) ∧ ¬(b ∨ c ∨ d)) ⇒ (¬c ⇒ d)

is a tautology. Eliminate the main implication and use De Morgan's law:

    ¬¬(b ⇒ c) ∨ ¬¬(b ∨ c ∨ d) ∨ (¬c ⇒ d)

Now simplify using Negation and eliminate the other implications:

    (¬b ∨ c) ∨ (b ∨ c ∨ d) ∨ (c ∨ d)

Use the laws of Associativity, Commutativity and or-simplification to arrive at

    b ∨ ¬b ∨ c ∨ d

which is true because of the laws of the Excluded Middle, b ∨ ¬b = T, and or-simplification. This problem, which at first looked quite difficult, became simple when the implications were eliminated.

2.3 A Formal System of Axioms and Inference Rules


A calculus, according to Webster's Third International Dictionary, is a
method or process of reasoning by computation of symbols. In section
2.2 we presented a calculus, for by performing some symbol manipulation
according to rules of Substitution and Transitivity we can reason with
propositions. For obvious reasons, the system presented here is called a
propositional calculus.
We are careful to say a propositional calculus, and not the proposi-
tional calculus. With slight changes in the rules we can have a different
calculus. Or we can invent a completely different set of rules and a com-
pletely different calculus, which is better suited for other purposes.
We want to emphasize the nature of this calculus as a formal system
for manipulating propositions. To do this, let us put aside momentarily
the notions of state and evaluation and see whether equivalences, which
we will call theorems, can be discussed without them. First, define the
propositions that arise directly from laws 1-12 to be theorems. They are
also called axioms (and the laws 1-12 are axiom schemas), because their
theoremhood is taken at face value, without proof.

(2.3.1) Axioms. Any proposition that arises by substituting propositions
        for E1, E2 and E3 in one of the laws 1-12 is called a
        theorem.  □

Next, define the propositions that arise by using the rules of Substitution
and Transitivity and an already-derived theorem to be theorems. In this
context, the rules are often called inference rules, for they can be used to
infer that a proposition is a theorem. An inference rule is often written in
the form

    E1, ..., En               E1, ..., En
    -----------      and      -----------
         E                      E, E0
26 Part I. Propositions and Predicates

where the Ej and E stand for arbitrary propositions. The inference rule
has the following meaning. If propositions E1, ..., En are theorems,
then so is proposition E (and E0 in the second case). Written in this
form, the rules of Substitution and Transitivity are

                              e1 = e2
(2.3.2) Rule of Substitution: ---------------------------
                              E(e1) = E(e2), E(e2) = E(e1)

                              e1 = e2, e2 = e3
(2.3.3) Rule of Transitivity: ----------------
                              e1 = e3

A theorem of the formal system, then, is either an axiom (according to
(2.3.1)) or a proposition that is derived from one of the inference rules
(2.3.2) and (2.3.3).
Note carefully that this is a totally different system for dealing with
propositions, which has been defined without regard to the notions of
states and evaluation. The syntax of propositions is the same, but what
we do with propositions is entirely different. Of course, there is a relation
between the formal system and the system of evaluation given in the pre-
vious chapter. Exercises 9 and 10 call for proof of the following relation-
ship: for any tautology e in the sense of chapter 1, e = T is a theorem,
and vice versa.
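The relationship between the two systems can be made concrete with a small truth-table check, sketched here in Python (an illustration outside both formal systems; the function name `is_tautology` is my own): a proposition e is a tautology exactly when e = T holds in every state.

```python
from itertools import product

def is_tautology(f, nvars):
    """f is a tautology iff it is True in every state of its nvars variables."""
    return all(f(*state) for state in product([False, True], repeat=nvars))

# The law of the Excluded Middle, b ∨ ¬b, is a tautology ...
assert is_tautology(lambda b: b or (not b), 1)
# ... and therefore (b ∨ ¬b) = T also holds in every state.
assert is_tautology(lambda b: (b or (not b)) == True, 1)
# A non-tautology fails the test.
assert not is_tautology(lambda b, c: b or c, 2)
```

Exercises 9 and 10 establish the corresponding syntactic fact, that such checks and formal theoremhood coincide.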

Exercises for Chapter 2


1. Verify that laws 1-12 are equivalences by building truth tables for them.
2. Prove the law of Identity, e = e, using the rules of Substitution and Transi-
tivity and the laws 1-11.
3. Prove that ¬T = F is an equivalence, using the rules of Substitution and Tran-
sitivity and the laws 1-12.
4. Prove that ¬F = T is an equivalence, using the rules of Substitution and Tran-
sitivity and the laws 1-12.
5. Each column below consists of a sequence of propositions, each of which
(except the first) is equivalent to its predecessor. The equivalence can be shown
by one application of the rule of Substitution and one of the laws 1-12 or the
results of exercises 3-4. Identify the law (as is done for the first two cases).

(a) (x ∧ y) ∨ (z ∧ ¬z)                      (a) ¬(¬b ∧ (¬b ⇒ z)) ∨ z
(b) (x ∧ y) ∨ F        Contradiction        (b) (¬b ∧ (¬b ⇒ z)) ⇒ z
(c) x ∧ y              or-simplification    (c) (¬b ∧ (¬¬b ∨ z)) ⇒ z
(d) (x ∧ y) ∨ F                             (d) (¬b ∧ (¬¬b ∨ ¬¬z)) ⇒ z
(e) (x ∧ y) ∨ (F ∧ z)                       (e) (¬b ∧ ¬(¬b ∧ ¬z)) ⇒ z
(f) (x ∧ y) ∨ (F ∧ z)                       (f) (¬b ∧ ¬(¬b ∧ ¬z)) ⇒ z
(g) (x ∧ y) ∨ ((x ∧ ¬x) ∧ z)                (g) ¬(b ∨ (¬b ∧ ¬z)) ⇒ z
(h) (x ∧ y) ∨ (x ∧ (¬x ∧ z))                (h) ¬((b ∨ ¬b) ∧ (b ∨ ¬z)) ⇒ z
(i) x ∧ (y ∨ (¬x ∧ z))                      (i) ¬(T ∧ (b ∨ ¬z)) ⇒ z
(j) x ∧ (y ∨ ¬x) ∧ (y ∨ z)                  (j) ¬(b ∨ ¬z) ⇒ z
(k) x ∧ (¬x ∨ y) ∧ (z ∨ y)                  (k) ¬¬(b ∨ ¬z) ∨ z
(l) x ∧ (¬x ∨ ¬¬y) ∧ (z ∨ y)                (l) (b ∨ ¬z) ∨ z
(m) x ∧ ¬(x ∧ ¬y) ∧ (z ∨ y)                 (m) b ∨ (¬z ∨ z)

6. Each proposition below can be simplified to one of the six propositions F, T,
x, y, x ∧ y, and x ∨ y. Simplify them, using the rules of Substitution and Tran-
sitivity and the laws 1-12.

(a) x ∨ (y ∨ x) ∨ ¬y                              (g) ¬x ⇒ (x ∧ y)
(b) (x ∨ y) ∧ (x ∨ ¬y)                            (h) T ⇒ (¬x ⇒ x)
(c) x ∨ y ∨ ¬x                                    (i) x ⇒ (y ⇒ (x ∧ y))
(d) (x ∨ y) ∧ (x ∨ ¬y) ∧ (¬x ∨ y) ∧ (¬x ∨ ¬y)     (j) ¬x ⇒ (¬x ⇒ (¬x ∧ y))
(e) (x ∧ y) ∨ (x ∧ ¬y) ∨ (¬x ∧ y) ∨ (¬x ∧ ¬y)     (k) ¬y ⇒ y
(f) (¬x ∧ y) ∨ x                                  (l) ¬y ⇒ ¬y

7. Show that any proposition e can be transformed into an equivalent proposition
in disjunctive normal form, i.e. one that has the form

    e0 ∨ ··· ∨ en   where each ei has the form g0 ∧ ··· ∧ gm

Each gj is an identifier id, a negation ¬id, T or F. Furthermore, the
identifiers in each ei are distinct.
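One constructive route to such a normal form, sketched below in Python (the function name and string formatting are my own, not the book's), reads the disjuncts directly off the truth table: one conjunct of literals for each state in which the proposition is true.

```python
from itertools import product

def dnf(f, names):
    """Build a disjunctive normal form for f from its truth table:
    one conjunct of literals per state in which f is true."""
    terms = []
    for state in product([False, True], repeat=len(names)):
        if f(*state):
            lits = [n if v else "¬" + n for n, v in zip(names, state)]
            terms.append("(" + " ∧ ".join(lits) + ")")
    return " ∨ ".join(terms) if terms else "F"

print(dnf(lambda x, y: x != y, ["x", "y"]))  # exclusive or: (¬x ∧ y) ∨ (x ∧ ¬y)
```

The exercise asks for a transformation by the rules of the calculus rather than by truth tables, but this construction shows why a disjunctive normal form always exists.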

8. Show that any proposition e can be transformed into an equivalent proposition
in conjunctive normal form, i.e. one that has the form

    e0 ∧ ··· ∧ en   where each ei has the form g0 ∨ ··· ∨ gm

Each gj is an identifier id, a negation ¬id, T or F. Furthermore, the
identifiers in each ei are distinct.
9. Prove that any theorem generated using laws 1-12 and the rules of Substitution
and Transitivity is a tautology, by proving that laws 1-12 are tautologies (see exer-
cise 1) and showing that the two rules can generate only tautologies.
10. Prove that if e is a tautology, then e = T can be proved to be an equivalence
using only the laws 1-12 and the rules of Substitution and Transitivity. Hint: use
exercise 8.
Chapter 3
A Natural Deduction System

This chapter introduces another formal system of axioms and inference


rules for deducing proofs that propositions are tautologies. It is called a
"natural-deduction system" because it is meant to mimic the patterns of
reasoning that we "naturally" use in making arguments in English.
This material is not used later and can be skipped. The equivalence-
transformation system discussed in chapter 2 serves more than adequately
in developing correct programs later on. One could go further and say
that the equivalence-transformation system is more suited to our needs.
The fact that the natural-deduction system was developed in order to
mimic our natural patterns of reasoning may be the best reason for not
using it, for our "natural" patterns of reasoning are far from adequate.
Nevertheless, study of this chapter is worthwhile for several reasons.
The formal system presented here is minimal: there are no axioms and a
minimal number of inference rules. Thus, one can see what it takes to
start with a bare-bones system and build up enough theorems to the point
where further theorems are not cumbersome to prove. The equivalence-
transformation system, on the other hand, provided as axioms all the use-
ful basic equivalences. Secondly, such systems are being used more and
more in mechanical verification systems, and the computer science student
should be familiar with them. (A natural-deduction system is also used in
the popular game WFF'N PROOF.) Finally, it is useful to see and com-
pare two totally different formal systems for dealing with propositions.

3.1 Introduction to Deductive Proofs


Consider the problem of proving that a conclusion follows from certain
premises. For example, we might want to prove that p ∧ (r ∨ q) follows
from p ∧ q, i.e. p ∧ (r ∨ q) is true in every state in which p ∧ q is. This
problem can be written in the following form:

(3.1.1) premise: p ∧ q
        conclusion: p ∧ (r ∨ q)

In English, we might argue as follows.

(3.1.2) Proof of (3.1.1): Since p ∧ q is true (in state s), so is p, and so is
        q. One property of or is that, for any r, r ∨ q is true if q is, so
        r ∨ q is true. Finally, since p and r ∨ q are both true, the proper-
        ties of and allow us to conclude that p ∧ (r ∨ q) is true in s also.

In order to get at the essence of such proofs, in order to determine just


what is involved in such arguments, we are going to strip away the verbi-
age from the proof and present simply the bare details. Admittedly, the
proofs will look (at first) complicated and detailed. But once we have
worked with the proof method for a while, we will be able to return to
informal proofs in English with much better facility. We will also be able
to give some guidelines for developing proofs (section 3.5).
The bare details of proof (3.1.2) are, in order: a statement of the
theorem, the sequence of propositions initially assumed to be true, and the
sequence of propositions that are true based on previous propositions and
various rules of inference.
These bare details are presented in (3.1.3). The first line states the
theorem to be proved: "From p ∧ q infer p ∧ (r ∨ q)". The second line
gives the premise (if there were more premises, they would be given on
successive lines). Each of the succeeding lines gives a proposition that one
can infer, based on the truth of the propositions in the previous lines and
an inference rule. The last line contains the conclusion.

        From p ∧ q infer p ∧ (r ∨ q)
        1  p ∧ q          premise
(3.1.3) 2  p              property of and, 1
        3  q              property of and, 1
        4  r ∨ q          property of or, 3
        5  p ∧ (r ∨ q)    property of and, 2, 4

To the right of each proposition appears an explanation of how the


proposition's "truth" is derived. For example, line 4 of the proof indicates

that r ∨ q is true because of a property of or (that r ∨ q is true if q is)
and because q appears on the preceding line 3. Note that parentheses are
introduced freely in order to maintain priority of operators. We shall
continue to do this without formal description.
In this formal system, a theorem to be proved has the form

From e1, ..., en infer e.

In terms of evaluation of propositions, such a theorem is interpreted as: if
e1, ..., en are true in a state, then e is true in that state also. If n is 0,
meaning that there are no premises, then it can be interpreted as: e is true
in all states, i.e. e is a tautology. In this case we write it as

Infer e.

Finally, a proposition on a line of a proof can be interpreted to mean that


it is true in any state in which the propositions on previous lines are true.
As mentioned earlier, our natural deduction system has no axioms.
The properties of operators used above are captured in the inference rules,
which we begin to introduce and explain in the next section. (Inference
rules were first introduced in section 2.3; review that material if neces-
sary.) The inference rules for the natural deduction system are collected
in Figure 3.3.1 at the end of section 3.3.

3.2 Inference Rules


There are ten inference rules in the natural deduction system. Ten is a
rather large number, and we can work with that many only if they are
organized so that they are easy to remember. In this system, there are
two inference rules for each of the five operators not, and, or, ⇒ and
=. One of the rules allows the introduction of the operator in a new
proposition; the other allows its elimination. Hence there are five rules of
introduction and five rules of elimination. The rules for introducing and
eliminating and are called ∧-I and ∧-E, respectively, and similarly for the
other operators.

Inference rules ∧-I, ∧-E and ∨-I

Let us begin by giving three rules: ∧-I, ∧-E and ∨-I.

               E1, ..., En
(3.2.1) ∧-I:   -------------
               E1 ∧ ... ∧ En

               E1 ∧ ... ∧ En
(3.2.2) ∧-E:   -------------
               Ei

               Ei
(3.2.3) ∨-I:   -------------
               E1 ∨ ... ∨ En

Rule ∧-I indicates that if E1 and E2 occur on previous lines of a proof
(i.e. are assumed to be true or have been proved to be true), then their
conjunction may be written on a line. If we assert "it is raining", and we
assert "the sun is shining", then we can conclude "it is raining and the sun
is shining". The rule is called "∧-Introduction", or "∧-I" for short, because
it shows how a conjunction can be introduced.
Rule ∧-E shows how and can be eliminated to yield one of its con-
juncts. If E1 ∧ E2 appears on a previous line of a proof (i.e. is assumed to
be true or has been proved to be true), then either E1 or E2 may be writ-
ten on the next line. Based on the assumption "it is raining and the sun is
shining", we can conclude "it is raining", and we can conclude "the sun is
shining".

Remark: There are places where it frequently rains while the sun is shin-
ing. Ithaca, the home of Cornell University, is one of them. In fact, it
sometimes rains when perfectly blue sky seems to be overhead. The
weather can also change from a furious blizzard to bright, calm sunshine
and then back again, within minutes. When the weather acts so strangely,
as it often does, one says that it is Ithacating.  □

Rule ∨-I indicates that if E1 is on a previous line, then we may write
E1 ∨ E2 on a line. If we assert "it is raining", then we can conclude "it is
raining or the sun is shining".
Remember, these rules hold for all propositions E1 and E2. They are
really "schemas", and we get an instance of the rule by replacing E1 and
E2 by particular propositions. For example, since p ∨ q and ¬r are propo-
sitions, the following is an instance of ∧-I.

    p ∨ q, ¬r
    ------------
    (p ∨ q) ∧ ¬r

Let us redo proof (3.1.3) in (3.2.4) below and indicate the exact infer-
ence rule used at each step. The top line states what is to be proved. The
line numbered 1 contains the first (and only) premise (pr 1). Each other
line has the following property. Let the line have the form

    line#   E   "name of rule", line#, ..., line#



Then one can form an instance of the named inference rule by writing the
propositions on lines line#, ..., line# above a line and proposition E
below. That is, the truth of E is inferred by one inference rule from the
truth of previous propositions. For example, from line 4 of the proof we
see that q above the line and r ∨ q below it form an instance of rule ∨-I:
r ∨ q is being inferred from q.

        From p ∧ q infer p ∧ (r ∨ q)
        1  p ∧ q          pr 1
(3.2.4) 2  p              ∧-E, 1
        3  q              ∧-E, 1
        4  r ∨ q          ∨-I, 3
        5  p ∧ (r ∨ q)    ∧-I, 2, 4

Note how rule ∧-E is used to break a proposition into its constituent
parts, while ∧-I and ∨-I are used to build new ones. This is typical of the
use of introduction and elimination rules.
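Because each rule is sound, every line of such a proof holds in any state satisfying the premises. This can be spot-checked mechanically; the sketch below (Python, with my own encoding of proof (3.2.4), not part of the book's system) tests all states:

```python
from itertools import product

# Proof (3.2.4), encoded semantically: each entry is the proposition on
# one line of the proof, as a function of the state (p, q, r).
lines = [
    lambda p, q, r: p and q,          # 1  p ∧ q        pr 1
    lambda p, q, r: p,                # 2  p            ∧-E, 1
    lambda p, q, r: q,                # 3  q            ∧-E, 1
    lambda p, q, r: r or q,           # 4  r ∨ q        ∨-I, 3
    lambda p, q, r: p and (r or q),   # 5  p ∧ (r ∨ q)  ∧-I, 2, 4
]

# In every state satisfying the premise p ∧ q, every line must hold.
for p, q, r in product([False, True], repeat=3):
    if p and q:
        assert all(line(p, q, r) for line in lines)
```

Such a semantic check corroborates a proof but does not replace it: the formal system derives the conclusion by rule applications alone, without evaluating states.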
Proofs (3.2.5) and (3.2.6) below illustrate that and is a commutative
operation; if p ∧ q is true then so is q ∧ p, and vice versa. This is obvious
after our previous study of propositions, but it must be proved in this for-
mal system before it can be used. Note that both proofs are necessary;
one cannot derive the second as an instance of the first by replacing p
and q in the first by q and p, respectively. In this formal system, a proof
holds only for the particular propositions involved. It is not a schema,
the way an inference rule is.

        From p ∧ q infer q ∧ p
        1  p ∧ q    pr 1
(3.2.5) 2  p        ∧-E, 1
        3  q        ∧-E, 1
        4  q ∧ p    ∧-I, 3, 2

To illustrate the relation between the proof system and English, we give
an argument in English for lemma (3.2.5): Suppose p ∧ q is true [line 1].
Then so is p, and so is q [lines 2 and 3]. Therefore, by the definition of
and, q ∧ p is true [line 4].

        From q ∧ p infer p ∧ q
        1  q ∧ p    pr 1
(3.2.6) 2  q        ∧-E, 1
        3  p        ∧-E, 1
        4  p ∧ q    ∧-I, 3, 2

Proof (3.2.6) can be abbreviated by omitting lines containing premises and
using "pr i" to refer to the ith premise later on, as shown in (3.2.7). This
abbreviation will occur often. But note that this is only an abbreviation,
and we will continue to use the phrase "occurs on a previous line" to
include the premises, even though the abbreviation is used.

        From q ∧ p infer p ∧ q
        1  q        ∧-E, pr 1
(3.2.7) 2  p        ∧-E, pr 1
        3  p ∧ q    ∧-I, 2, 1

Inference rule ∨-E

The inference rule for elimination of or is

               E1 ∨ ... ∨ En, E1 ⇒ E, ..., En ⇒ E
(3.2.8) ∨-E:   ----------------------------------
               E

Rule ∨-E indicates that if a disjunction appears on a previous line, and if
Ei ⇒ E appears on a previous line for each disjunct Ei, then E may be
written on a line of the proof. If we assert "it will rain tomorrow or it
will snow tomorrow", and if we assert "rain implies no sun", and if we
also assert "snow implies no sun", then we can conclude "there will be no
sun tomorrow". From

    (rain ∨ snow), (rain ⇒ no sun), (snow ⇒ no sun)

we conclude no sun.
Here is a simple example.

From p ∨ (q ∧ r), p ⇒ s, (q ∧ r) ⇒ s infer s ∨ p
1  p ∨ (q ∧ r)    pr 1
2  p ⇒ s          pr 2
3  (q ∧ r) ⇒ s    pr 3
4  s              ∨-E, 1, 2, 3
5  s ∨ p          ∨-I (rule (3.2.3)), 4

Inference rule ⇒-E

               E1 ⇒ E2, E1
(3.2.9) ⇒-E:   -----------
               E2

Rule ⇒-E is called modus ponens. It allows us to write the consequent
of an implication on a line of the proof if its antecedent appears on a pre-
vious line. If we assert that x > 0 implies that y is even, and if we deter-
mine that x > 0, then we can conclude that y is even.
We show an example of its use in proof (3.2.10). To show the relation
between the formal proof and an English one, we give the proof in
English: Suppose p ∧ q and p ⇒ r are both true. From p ∧ q we conclude
that p is true. Because p ⇒ r, the truth of p implies the truth of r, and r
is true. But if r is true, so is r "ored" with anything; hence r ∨ (q ⇒ r) is
true.

         From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)
         1  p ∧ q          pr 1
         2  p ⇒ r          pr 2
(3.2.10) 3  p              ∧-E (rule (3.2.2)), 1
         4  r              ⇒-E, 2, 3
         5  r ∨ (q ⇒ r)    ∨-I (rule (3.2.3)), 4

To emphasize the use of the abbreviation to refer to premises, we show
(3.2.10) in its abbreviated form in (3.2.11).

         From p ∧ q, p ⇒ r infer r ∨ (q ⇒ r)
         1  p              ∧-E, pr 1
(3.2.11) 2  r              ⇒-E, pr 2, 1
         3  r ∨ (q ⇒ r)    ∨-I, 2

Inference rules =-I and =-E

                E1 ⇒ E2, E2 ⇒ E1
(3.2.12) =-I:   ----------------
                E1 = E2

                E1 = E2
(3.2.13) =-E:   ----------------
                E1 ⇒ E2, E2 ⇒ E1

Rules =-I and =-E together define equality in terms of implication.
The premises of one rule are the conclusions of the other, and vice versa.
This is quite similar to how equality is defined in the system of chapter 2.
Rule =-I is used, then, to introduce an equality e1 = e2 based on the pre-
vious proof of e1 ⇒ e2 and e2 ⇒ e1.
Here is an example of the use of these rules.
Here is an example of the use of these rules.

From p, p = (q ⇒ r), r ⇒ q infer r = q
1  p ⇒ (q ⇒ r)    =-E, pr 2
2  q ⇒ r          ⇒-E, 1, pr 1
3  r = q          =-I, pr 3, 2

Exercises for Section 3.2


1. Each of the following theorems can be proven using exactly one basic inference
rule (using the abbreviation that premises need not be written on lines; see the text
preceding (3.2.7)). Name that inference rule.
(a) From a, b infer a ∧ b
(b) From a ∧ b ∧ (q ∨ r), a infer q ∨ r
(c) From ¬a infer ¬a ∨ a
(d) From c = d, d ∨ c infer d ⇒ c
(e) From b ⇒ c, b infer b ∨ ¬b
(f) From ¬a, ¬b, c infer ¬a ∨ c
(g) From (a ⇒ b) ∧ b, a infer a ⇒ b
(h) From (a ∨ b) ⇒ c, c ⇒ (a ∨ b) infer (a ∨ b) = c
(i) From a ∧ b, q ∨ r infer (a ∧ b) ∧ (q ∨ r)
(j) From p ⇒ (q ⇒ r), p, q ∨ r infer q ⇒ r
(k) From c ⇒ d, d ⇒ c, d ⇒ c infer c = d
(l) From a ∨ b, a ∨ c, (a ∨ b) ⇒ c infer c
(m) From a ⇒ (d ∨ c), (d ∨ c) ⇒ a infer a = (d ∨ c)
(n) From (a ∨ b) ⇒ c, (a ∨ d) ⇒ c, (a ∨ b) ∨ (a ∨ d) infer c
(o) From a ⇒ (b ∨ c), b ⇒ (b ∨ c), a ∨ b infer b ∨ c

2. Here is one proof that p follows from p. Write another proof that uses only
one reference to the premise.

    From p infer p
    1  p ∧ p    ∧-I, pr 1, pr 1
    2  p        ∧-E, 1

3. Prove the following theorems using the inference rules.

(a) From p ∧ q, p ⇒ r infer r            (f) From b ⇒ c ∧ d, b infer d
(b) From p = q, q infer p                (g) From p ∧ q, p ⇒ r infer r
(c) From p, q ⇒ r, p ⇒ r infer p ∧ r     (h) From p, q ∧ (p ⇒ s) infer q ∧ s
(d) From b ∧ ¬c infer ¬c                 (i) From p = q infer q = p
(e) From b infer b ∨ ¬c                  (j) From b ⇒ (c ∧ d), b infer d

4. For each of your proofs of exercise 3, give an English version. (The English
versions need not mimic the formal proofs exactly.)

3.3 Proofs and Subproofs


Inference rule ⇒-I
A theorem of the form "From e1, ..., en infer e" is interpreted as: if
e1, ..., en are true in a state, then so is e. If e1, ..., en appear on lines of
a proof, which is interpreted to mean that they are assumed or proven
true, then we should be able to write e on a line also. Rule ⇒-I, (3.3.1),
gives us permission to do so. Its premise need not appear on a previous
line of the proof; it can appear elsewhere as a separate proof, which we
refer to in substantiating the use of the rule. Unique names should be
given to proofs to avoid ambiguous references.

               From E1, ..., En infer E
(3.3.1) ⇒-I:   ------------------------
               (E1 ∧ ... ∧ En) ⇒ E

Proof (3.3.2) uses ⇒-I twice in order to prove that p ∧ q and q ∧ p are
equivalent, using lemmas proved in the previous section.

        Infer (p ∧ q) = (q ∧ p)
        1  (p ∧ q) ⇒ (q ∧ p)    ⇒-I, (3.2.5)
(3.3.2) 2  (q ∧ p) ⇒ (p ∧ q)    ⇒-I, (3.2.6)
        3  (p ∧ q) = (q ∧ p)    =-I, 1, 2

Rule ⇒-I allows us to conclude p ⇒ q if we have a proof of q given
premise p. On the other hand, if we take p ⇒ q as a premise, then rule
⇒-E allows us to conclude that q holds when p is given. We see that the
following relationship holds:

Deduction Theorem. "Infer p ⇒ q" is a theorem of the natural
deduction system (which can be interpreted to mean that p ⇒ q is
a tautology) iff "From p infer q" is a theorem.  □
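The two directions of this relationship can be spot-checked semantically; the Python sketch below (an illustration outside the formal system, using a sample premise and conclusion of my own choosing) does so for p ∧ q and q ∨ s:

```python
from itertools import product

def implies(a, b):
    return (not a) or b

states = list(product([False, True], repeat=3))

# "From p ∧ q infer q ∨ s": q ∨ s holds in every state where p ∧ q does.
from_infer = all(q or s for p, q, s in states if p and q)

# "Infer (p ∧ q) ⇒ (q ∨ s)": the implication is a tautology.
tautology = all(implies(p and q, q or s) for p, q, s in states)

# The deduction theorem says these two judgments agree.
assert from_infer == tautology
```

The theorem itself is a statement about formal provability, not about states; the check above only confirms the semantic reading given in the text.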

Another example of the use of ⇒-I shows that p implies itself:

(3.3.3) Infer p ⇒ p
        1  p ⇒ p    ⇒-I, exercise 2 of section 3.2

Subproofs
A proof can be included within a proof, much the way a procedure can
be included within a program. This allows the premise of ⇒-I to appear
as a line of a proof. To illustrate this, (3.3.2) is rewritten in (3.3.4) to
include proof (3.2.5) as a subproof. The subproof happens to be on line 1
here, but it could be on any line. If the subtheorem appears on line j
(say) of the main proof, then its proof appears indented underneath, with
its lines numbered j.1, j.2, etc. We could have replaced the reference to
(3.2.6) by a subproof in a similar manner.

        Infer (p ∧ q) = (q ∧ p)
        1  From p ∧ q infer q ∧ p
        1.1  p        ∧-E, pr 1
        1.2  q        ∧-E, pr 1
(3.3.4) 1.3  q ∧ p    ∧-I, 1.2, 1.1
        2  (p ∧ q) ⇒ (q ∧ p)    ⇒-I, 1
        3  (q ∧ p) ⇒ (p ∧ q)    ⇒-I, (3.2.6)
        4  (p ∧ q) = (q ∧ p)    =-I, 2, 3

Another example of a proof with a subproof is given in (3.3.5). Again, it
may be instructive to compare the proof to an English version:

Suppose (q ∨ s) ⇒ (p ∧ q). To prove equivalence, we must show
also that (p ∧ q) ⇒ (q ∨ s). [Note how this uses rule =-I, that
a ⇒ b and b ⇒ a means a = b. These sentences correspond to
lines 1, 3 and 4 of the formal proof.] To prove (p ∧ q) ⇒ (q ∨ s),
argue as follows. Assume p ∧ q is true. Then so is q. By the
definition of or, so is q ∨ s. [Note the correspondence to lines
2.1-2.2.]  □

        From (q ∨ s) ⇒ (p ∧ q) infer (q ∨ s) = (p ∧ q)
        1  (q ∨ s) ⇒ (p ∧ q)    pr 1
        2  From p ∧ q infer q ∨ s
(3.3.5) 2.1  q        ∧-E, pr 1
        2.2  q ∨ s    ∨-I, 2.1
        3  (p ∧ q) ⇒ (q ∨ s)    ⇒-I, 2
        4  (q ∨ s) = (p ∧ q)    =-I, 1, 3

As mentioned earlier, the relationship between proofs and sub-proofs


in logic is similar to the relationship between procedures and sub-
procedures (modules and sub-modules) in programs. A theorem and its
proof can be used in two ways: first, use the theorem to prove something
else; secondly, study the proof of the theorem. A procedure and its
description can be used in two ways: first, understand the description so
that calls of the procedure can be written; secondly, study the procedure
body to understand how the procedure works. This similarity should
make the idea of subproofs easy to understand.

Scope rules
A subproof can contain references not only to previous lines in its
proof, but also to previous lines that occur in surrounding proofs. We
call these global line references. However, "recursion" is not allowed; a
line j (say) may not contain a reference to a theorem whose proof is not
finished by line j.
The reader skilled in the use of block structure in languages like PL/I,
ALGOL 60 and Pascal will have no difficulty in understanding this scope
rule, for essentially the same scope mechanism is employed here (except
for the restriction against recursion). Let us state the rule more precisely.

(3.3.6) Scope rule. Line i of a proof, where i is an integer, may contain
        references to lines 1, ..., i-1. Line j.i, where i is an integer, may
        contain references to lines j.1, ..., j.(i-1) and to any lines refer-
        enceable from line j (this excludes references to line j itself).  □

Example (3.3.7) illustrates the use of this scope rule; line 2.2 refers to
line 1, which is outside the proof of line 2.

        From p ⇒ (q ⇒ r) infer (p ∧ q) ⇒ r
        1  p ⇒ (q ⇒ r)    pr 1
        2  From p ∧ q infer r
(3.3.7) 2.1  p        ∧-E, pr 1
        2.2  q ⇒ r    ⇒-E, 1, 2.1
        2.3  q        ∧-E, pr 1
        2.4  r        ⇒-E, 2.2, 2.3
        3  (p ∧ q) ⇒ r    ⇒-I, 2

Below we illustrate an invalid use of the scope rule.

From p infer p ⇒ ¬p (proof INVALID)

1  p    pr 1
2  From p infer ¬p
2.1  p         pr 1
2.2  p ⇒ ¬p    ⇒-I, 2 (invalid reference to line 2)
3  p ⇒ ¬p      ⇒-I, 2 (valid reference to line 2)

We illustrate another common mistake below: the use of a line that is not
in a surrounding proof. Below, on line 6.1 an attempt is made to refer-
ence s on line 4.1. Since line 4.1 is not in a surrounding proof, this is not
allowed.
A subproof using global references is being proved in a particular con-
text. Taken out of context, the subproof may not be true because it relies

From p ∨ q, p ⇒ s, s ⇒ r infer r (proof INVALID)

1  p ∨ q    pr 1
2  p ⇒ s    pr 2
3  s ⇒ r    pr 3
4  From p infer r
4.1  s    ⇒-E, 2, pr 1 (valid reference to 2)
4.2  r    ⇒-E, 3, 4.1 (valid reference to 3)
5  p ⇒ r    ⇒-I, 4
6  From q infer r
6.1  r    ⇒-E, 3, 4.1 (invalid reference to 4.1)
7  q ⇒ r    ⇒-I, 6
8  r    ∨-E, 1, 5, 7

on assumptions about the context. This again points up the similarity
between ALGOL-like procedures and subproofs. Facts assumed outside a
subproof can be used within the proof, just as variables declared outside a
procedure can be used within a procedure, using the same scope mechan-
ism.
To end this discussion of scope, we give a proof with two levels of sub-
proof. It can be understood most easily as follows. First read lines 1, 2
and 3 (don't read the proof of the lemma on line 2) and satisfy your-
self that if the proof of the lemma on line 2 is correct, then the whole
proof is correct. Next, study the proof of the lemma on line 2 (only lines
2.1, 2.2 and 2.3). Finally, study the proof of the lemma on line 2.2, which
refers to a line two levels out in the proof.

        From (p ∧ q) ⇒ r infer p ⇒ (q ⇒ r)
        1  (p ∧ q) ⇒ r    pr 1
        2  From p infer q ⇒ r
(3.3.8) 2.1  p    pr 1
        2.2  From q infer r
        2.2.1  p ∧ q    ∧-I, 2.1, pr 1
        2.2.2  r        ⇒-E, 1, 2.2.1
        2.3  q ⇒ r    ⇒-I, 2.2
        3  p ⇒ (q ⇒ r)    ⇒-I, 2

Proof by contradiction
A proof by contradiction typically proceeds as follows. One makes an
assumption. From this assumption one proceeds to prove a contradiction,
say, by showing that something is both true and false. Since such a
contradiction cannot possibly happen, and since the proof from assump-
tion to contradiction is valid, the assumption must be false.
Proof by contradiction is embodied in the proof rules ¬-I and ¬-E:

               From E infer E1 ∧ ¬E1
(3.3.9) ¬-I:   ---------------------
               ¬E

                From ¬E infer E1 ∧ ¬E1
(3.3.10) ¬-E:   ----------------------
                E

Rule ¬-I indicates that if "From E infer E1 ∧ ¬E1" has been proved for
some proposition E1, then one can write ¬E on a line of the proof.
Rule ¬-E similarly allows us to conclude that E holds if a proof of
"From ¬E infer E1 ∧ ¬E1" exists, for some proposition E1.
We show in (3.3.11) an example of the use of rule ¬-I, that from p we
can conclude ¬¬p.

         From p infer ¬¬p
         1  p    pr 1
(3.3.11) 2  From ¬p infer p ∧ ¬p
         2.1  p ∧ ¬p    ∧-I, 1, pr 1
         3  ¬¬p    ¬-I, 2

Rule ¬-I is used to prove that ¬¬p follows from p; similarly, rule ¬-E is
used in (3.3.12) to prove that p follows from ¬¬p.

         From ¬¬p infer p
         1  ¬¬p    pr 1
(3.3.12) 2  From ¬p infer ¬p ∧ ¬¬p
         2.1  ¬p ∧ ¬¬p    ∧-I, pr 1, 1
         3  p    ¬-E, 2

Theorems (3.3.11) and (3.3.12) look quite similar, and yet both proofs are
needed; one cannot simply get one from the other more easily than they
are proven here. More importantly, both of the rules ¬-I and ¬-E are
needed; if one is omitted from the proof system, we will be unable to
deduce some propositions that are tautologies in the sense described in
section 1.5. This may seem strange, since the rules look so similar.
Let us give two more proofs. The first one indicates that from p and
¬p one can prove any proposition q, even one that is equivalent to false.
This is because p and ¬p cannot both be true at the same time, and
hence the premises form an absurdity.

         From p, ¬p infer q
         1  p     pr 1
         2  ¬p    pr 2
(3.3.13) 3  From ¬q infer p ∧ ¬p
         3.1  p ∧ ¬p    ∧-I, 1, 2
         4  q    ¬-E, 3

         From p ∧ q infer ¬(p ⇒ ¬q)
         1  p ∧ q    pr 1
         2  From p ⇒ ¬q infer q ∧ ¬q
(3.3.14) 2.1  p         ∧-E, 1
         2.2  q         ∧-E, 1
         2.3  ¬q        ⇒-E, pr 1, 2.1
         2.4  q ∧ ¬q    ∧-I, 2.2, 2.3
         3  ¬(p ⇒ ¬q)    ¬-I, 2

For comparison, we give an English version of proof (3.3.14). Let p ∧ q
be true. Then both p and q are true. Assume that p ⇒ ¬q is true.
Because p is true, this implication allows us to conclude that ¬q is true,
but this is absurd because q is true. Hence the assumption that p ⇒ ¬q
is true is wrong, and ¬(p ⇒ ¬q) holds.
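Theorem (3.3.14) can also be corroborated by evaluation: in every state, p ∧ q and ¬(p ⇒ ¬q) have the same value, so the conclusion holds wherever the premise does. A quick Python sketch (outside the deduction system itself; `implies` is my own helper):

```python
from itertools import product

def implies(a, b):
    return (not a) or b

# (3.3.14), read semantically: p ∧ q and ¬(p ⇒ ¬q) agree in all four states.
for p, q in product([False, True], repeat=2):
    assert (p and q) == (not implies(p, not q))
```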

Summary
The reader may have noticed a difference between the natural deduc-
tion system and the previous systems of evaluation and equivalence
transformation: the natural deduction system does not allow the use of
the constants T and F! The connection between the systems can be stated
as follows. If "Infer e" is a theorem of the natural deduction system, then e
is a tautology and e = T is an equivalence. On the other hand, if e is
a tautology and e does not contain T and F, then "Infer e" is a theorem
of the natural deduction system. The omission of T and F is no problem
because, by the rule of Substitution, in any proposition T can be replaced
by a tautology (e.g. b ∨ ¬b) and F by the complement of a tautology (e.g.
b ∧ ¬b) to yield an equivalent proposition.
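That T and F can be eliminated in this way is easy to check by evaluation; the Python sketch below (my illustration, not part of either formal system) confirms that b ∨ ¬b always evaluates as T does and b ∧ ¬b as F does, so the substitution cannot change the value of any surrounding proposition:

```python
from itertools import product

# In every state, b ∨ ¬b evaluates to True (like T) and b ∧ ¬b to False
# (like F), so substituting them for the constants preserves the value of
# any proposition in which they occur.
for b, x in product([False, True], repeat=2):
    t = b or (not b)
    f = b and (not b)
    assert t == True and f == False
    # Sample contexts: T ∨ x = T and F ∨ x = x survive the substitution.
    assert ((t or x) == t) and ((f or x) == x)
```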
We summarize what a proof is as follows. A proof of a theorem
"From e1, ..., en infer e" or of a theorem "Infer e" consists of a
sequence of lines. The first line contains the theorem. If the first line is
unnumbered, the rest are indented and numbered 1, 2, etc. If the first line
has the number i, the rest are indented and numbered i.1, i.2, etc. The
last line must contain proposition e. Each line i must have one of the
following four forms:

Form 1: (i)  ej   pr j
    where 1 ≤ j ≤ n. The line contains premise j.

Form 2: (i)  p   Name, ref1, ..., refq
    Each refk either (1) is a line number (which is valid according to
    scope rule (3.3.6)), or (2) has the form "pr j", in which case it
    refers to premise ej of the theorem, or (3) is the name of a previ-
    ously proven theorem. Let rk denote the proposition or theorem
    referred to by refk. Then the following must be an instance of
    inference rule Name:

        r1, ..., rq
        -----------
             p

Form 3: (i)  p   Theorem name, ref1, ..., refq
    Theorem name is the name of a previously proved theorem; refk
    is as in Form 2. Let rk denote the proposition referred to by
    refk. Then "From r1, ..., rq infer p" must be the named
    theorem.

Form 4: (i)  [Proof of another theorem]
    That is, the line contains a complete subproof, whose format
    follows these rules.

Figure 3.3.1 contains a list of the inference rules.

Historical Notes
The style of the logical system defined in this chapter was conceived
principally to capture our "natural" patterns of reasoning. Gerhard
Gentzen, a German mathematician who died in an Allied prisoner of war
camp just after World War II, developed such a system for mathematical
arguments in his 1935 paper Untersuchungen ueber das logische Schliessen
[20], which is included in [43].
Several textbooks on logic are based on natural deduction, for example
W.V.O. Quine's book Methods of Logic [41].
The particular block-structured system given here was developed using
two sources: WFF'N PROOF: The Game of Modern Logic, by Layman
E. Allen [1], and the monograph A Programming Logic, by Robert
Constable and Michael O'Donnell [7]. The former introduces the deduction
system through a series of games; it uses prefix notation, partly to avoid
problems with parentheses, which we have sidestepped through informal-
ity. A Programming Logic describes a mechanical program verifier for

PL/CS (a subset of PL/C, which is a subset of PL/I), developed at Cor-
nell University. Its inference rules were developed with ease of presenta-
tion and mechanical verification in mind. Actually, the verifier can be
used to verify proofs of programs, and includes not only the propositional
calculus but also a predicate calculus, including a theory of integers and a
theory of strings.

∧-I:  E1, ..., En
      --------------
      E1 ∧ ... ∧ En

∧-E:  E1 ∧ ... ∧ En
      --------------
      Ei

∨-I:  Ei
      --------------
      E1 ∨ ... ∨ En

∨-E:  E1 ∨ ... ∨ En, E1 ⇒ E, ..., En ⇒ E
      -----------------------------------
      E

¬-I:  From E infer E1 ∧ ¬E1
      ----------------------
      ¬E

¬-E:  From ¬E infer E1 ∧ ¬E1
      -----------------------
      E

=-I:  E1 ⇒ E2, E2 ⇒ E1
      ------------------
      E1 = E2

=-E:  E1 = E2
      ------------------
      E1 ⇒ E2, E2 ⇒ E1

⇒-I:  From E1, ..., En infer E
      -------------------------
      (E1 ∧ ... ∧ En) ⇒ E

⇒-E:  E1 ⇒ E2, E1
      ------------
      E2

Figure 3.3.1  The Set of Basic Inference Rules
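The rules in Figure 3.3.1 are sound with respect to the truth-table meanings of the operators: in any state in which all the premises of a rule hold, its conclusion holds. For the rules whose premises are plain propositions (not subproofs) this can be checked mechanically by enumerating states. A small sketch in Python - our own encoding, not part of the book's formal system:

```python
from itertools import product

def implies(a, b):
    # Material implication: a ⇒ b is false only when a is true and b is false.
    return (not a) or b

def imp_e_sound():
    # ⇒-E: in every state where E1 ⇒ E2 and E1 hold, E2 holds.
    return all(e2
               for e1, e2 in product([True, False], repeat=2)
               if implies(e1, e2) and e1)

def and_i_sound():
    # ∧-I (two operands): in every state where E1 and E2 hold, E1 ∧ E2 holds.
    return all(e1 and e2
               for e1, e2 in product([True, False], repeat=2)
               if e1 and e2)
```

Rules such as ¬-I and ⇒-I, whose premises are subproofs, require an argument about proofs rather than a finite state check.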

Exercises for Section 3.3


1. Use lemma (3.2.11) and inference rule ⇒-I to give a 1-line proof that
(p ∧ q ∧ (p ⇒ r)) ⇒ (r ∨ (q ⇒ r)).

2. Prove that (p ∧ q) ⇒ (p ∨ q), using rule ⇒-I.

3. Prove that q ⇒ (q ∧ q). Prove that (q ∧ q) ⇒ q. Use the first two results to
prove that q = (q ∧ q). Then rewrite the last proof so that it does not refer to
outside proofs.

4. Prove that p = (p ∨ p).

5. Prove that p ⇒ ((r ∨ s) ⇒ p).

6. Prove that q ⇒ (r ⇒ (q ∧ r)).

7. Prove that from p ⇒ (r ⇒ s) follows r ⇒ (p ⇒ s).
44 Part I. Propositions and Predicates

8. What is wrong with the following proof?

Infer a ⇒ b    (Proof INVALID)

1     a                     pr 1
2     From ¬b infer b ∧ ¬b
2.1     ¬b                  pr 1
2.2     ¬b ⇒ b ∧ ¬b        ⇒-I, 2
2.3     b ∧ ¬b              ⇒-E, 2.2, 2.1
3     b                     ¬-E, 2

9. Prove that from ¬p and (¬p ⇒ q) ∨ (p ∧ (r ⇒ q)) follows r ⇒ q.

10. Prove that q ⇒ (p ∧ r) follows from q ⇒ p and q ⇒ r.

11. Prove that from ¬q follows q ⇒ p.

12. Prove that from ¬q follows q ⇒ ¬p.

13. Prove that from ¬q follows q ⇒ (p ∧ ¬p).

14. Prove that from p ∨ q, ¬q follows p.

15. Prove p ∧ (p ⇒ q) ⇒ q.

16. Prove ((p ⇒ q) ∧ (q ⇒ r)) ⇒ (p ⇒ r).

17. Prove (p ⇒ q) ⇒ ((p ∧ ¬q) ⇒ q).

18. Prove ((p ∧ ¬q) ⇒ q) ⇒ (p ⇒ q). [This, together with exercise 17, allows us
to prove (p ⇒ q) = ((p ∧ ¬q) ⇒ q).]

19. Prove (p ⇒ q) ⇒ ((p ∧ ¬q) ⇒ ¬p).

20. Prove ((p ∧ ¬q) ⇒ ¬p) ⇒ (p ⇒ q). [This, together with exercise 19, allows
us to prove (p ⇒ q) = ((p ∧ ¬q) ⇒ ¬p).]

21. Prove that (p = q) ⇒ (¬p = ¬q).

22. Prove that (¬p = ¬q) ⇒ (p = q). [This, together with exercise 21, allows us
to prove (p = q) = (¬p = ¬q).]

23. Prove ¬(p = q) ⇒ (¬p = q).

24. Prove (¬p = q) ⇒ ¬(p = q). [This, together with exercise 23, allows us to
prove the law of Inequality, ¬(p = q) = (¬p = q).]

25. Prove (p = q) ⇒ (q = p).

26. Use a rule of Contradiction to prove "From p infer p".

27. For each of the proofs of exercises 1-7 and 9-25, give a version in English. (It need
not follow the formal proof exactly.)

3.4 Adding Flexibility to the Natural Deduction System


We first introduce some flexibility by showing how theorems can be
viewed as schemas -i.e. how identifiers in a theorem can be viewed as
standing for any arbitrary proposition. Next, we introduce a rule of sub-
stitution of equals for equals, incorporating into the natural deduction sys-
tem the method of proving equivalences of chapter 2. We prove a
number of theorems, including the laws of equivalence of chapter 2.

Using theorems as schemas


The inference rules given in Figure 3.3.1 hold for any propositions E,
E1, ..., En. They are really "schemas", and one gets a particular inference
rule by substituting particular propositions for the "placeholders" E, E1,
..., En. On the other hand, theorems of the form "From premises infer
conclusion" are proved only for particular propositions. For example,
proof (3.3.2) used the following two theorems, (3.2.5) and (3.2.6):

From p ∧ q infer q ∧ p

From q ∧ p infer p ∧ q

Even though it looks like the second should follow directly from the
first, in the formal system both must be proved.
But we can prove something about the formal system: systematic sub-
stitution of propositions for identifiers in a theorem and its proof yields
another theorem and proof. So we can consider any theorem to be a
schema also. For example, from proof (3.2.5) of "From p ∧ q infer q ∧ p"
we can generate a proof of "From (a ∨ b) ∧ c infer c ∧ (a ∨ b)" simply by
substituting a ∨ b for p and c for q everywhere in proof (3.2.5):

From (a ∨ b) ∧ c infer c ∧ (a ∨ b)

1   (a ∨ b) ∧ c         pr 1
2   a ∨ b               ∧-E, 1
3   c                   ∧-E, 1
4   c ∧ (a ∨ b)         ∧-I, 3, 2

Let us state more precisely this idea of textual substitution in theorem and
proof.

(3.4.1) Theorem. Write a theorem as a function of one of its identifiers,
p: "From E1(p), ..., En(p) infer E(p)". Let G be any proposition.
Then "From E1(G), ..., En(G) infer E(G)" can also be
proved.

Informal proof. Without loss of generality, assume the proof of the
theorem contains no references to other theorems outside the proof. (If it
does, first change the proof to include them as subproofs, as was done in
generating proof (3.3.4) from proof (3.3.2), repeating the process until no
references to outside theorems exist.) Then we can obtain a proof of the
new theorem simply by substituting G for p everywhere in the proof of
the original theorem. □
Theorems like (3.4.1) are often called meta-theorems, because they are
not theorems in the proof system, like "From ... infer ... ", but are proofs
about the proof system. The use of meta-theorems takes us outside the
formal system just a bit, but it is worthwhile to relax formality in this
way.
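Meta-theorem (3.4.1) can also be checked semantically for particular instances. In the sketch below - a Python encoding of our own, not the book's notation - propositions are represented as functions from a state to a truth value, and the instance of "From p ∧ q infer q ∧ p" obtained by substituting a ∨ b for p and c for q is confirmed to be a valid inference by enumerating all states:

```python
from itertools import product

def valid_inference(premise, conclusion, idents):
    """True if the conclusion holds in every state in which the premise holds."""
    for values in product([True, False], repeat=len(idents)):
        state = dict(zip(idents, values))
        if premise(state) and not conclusion(state):
            return False
    return True

# Substitute G = a ∨ b for p and c for q in "From p ∧ q infer q ∧ p".
G = lambda s: s['a'] or s['b']
premise = lambda s: G(s) and s['c']      # (a ∨ b) ∧ c
conclusion = lambda s: s['c'] and G(s)   # c ∧ (a ∨ b)
```

This checks only semantic validity; the meta-theorem itself is the stronger, syntactic fact that a formal proof of the instance can be generated from the original proof.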
We can put meta-theorem (3.4.1) in the form of a derived rule of inference
as follows:

From E1(p), ..., En(p) infer E(p)
(3.4.2)  ---------------------------------    (p an identifier)
From E1(G), ..., En(G) infer E(G)

We use this derived rule of inference to rewrite theorem (3.3.2) using
only theorem (3.2.5) (and not (3.2.6)). Note how line 2 refers to theorem
(3.2.5) and indicates what propositions are being replaced. We often leave
out this indication if it is obvious enough.

Infer (p ∧ q) = (q ∧ p)
1   (p ∧ q) ⇒ (q ∧ p)    (3.2.5)
2   (q ∧ p) ⇒ (p ∧ q)    (3.2.5) (with p for q, q for p)
3   (p ∧ q) = (q ∧ p)    =-I, 1, 2

Earlier, we discussed the relation between procedures of a program and


subproofs of a proof. We can now extend this relation to procedures with
parameters and subproofs with parameters. Consider rule (3.4.2). The
proof of the premise corresponds to the definition of a procedure with a
parameter p. The use of the conclusion in another proof corresponds to a
call of the procedure with an argument G.

The Rule of Substitution of equals for equals


The rule of Substitution, introduced in section 2.2, will be used in this
section in the following form.

(3.4.3) Theorem. Let proposition E be thought of as a function of one
of its identifiers, p, so that we write it as E(p). Then if e1 = e2
and E(e1) appear on previous lines, we may write E(e2) on a
line. □

For example, given that c ⇒ a ∨ b is true, to show that c ⇒ b ∨ a is true
we take E(p) to be c ⇒ p, e1 = e2 to be a ∨ b = b ∨ a (the law of Commutativity,
which will be proved later) and apply the theorem.
The rule of Substitution was an inference rule in the equivalence sys-
tem of chapter 2. However, it is a meta-theorem of the natural deduction
system and must be proved. Its proof, which would be performed by
induction on the structure of proposition E(p), is left to the interested
reader in exercise 10, so let us suppose it has been done. We put the rule
of Substitution in the form of a derived inference rule:

              e1 = e2, E(e1)
(3.4.4) subs: --------------    (E(p) is a function of p)
                  E(e2)

To show the use of (3.4.4), we give a schematic proof to show that the
rule of substitution as given in section 2.2 holds here also.

From e1 = e2 infer E(e1) = E(e2)

1   e1 = e2                     pr 1
2   From E(e1) infer E(e2)
(3.4.5) 2.1   E(e2)             subs, 1, pr 1
3   E(e1) ⇒ E(e2)              ⇒-I, 2
4   From E(e2) infer E(e1)
4.1   e2 = e1                   =-I, (3.3.3) (p ⇒ p)
4.2   E(e1)                     subs, 4.1, pr 1
5   E(e2) ⇒ E(e1)              ⇒-I, 4
6   E(e1) = E(e2)               =-I, 3, 5

With this derived rule of inference, we have the flexibility of both the
equivalence and the natural deduction systems. But we must make sure
that the laws of section 2.1 actually hold! We do this next.

Some basic theorems


A number of theorems are used often, including the laws of section
2.1. We want to state them here and prove some of them; the rest of the
proofs are left as exercises. The first to be proved is quite useful. It
states that if at least one of two propositions is true, and if the first is
false, then the second is true.

From p ∨ q, ¬p infer q

1   ¬p                          pr 2
2   From p infer q
2.1   p                         pr 1
(3.4.6) 2.2   From ¬q infer p ∧ ¬p
2.2.1   p ∧ ¬p                  ∧-I, 2.1, 1
2.3   q                         ¬-E, 2.2
3   p ⇒ q                      ⇒-I, 2
4   q                           ∨-E, pr 1, 3, (3.3.3)
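Theorem (3.4.6) can also be double-checked semantically with a truth table. The following Python sketch (ours, not the book's) enumerates the four states and confirms that q holds whenever p ∨ q and ¬p both do:

```python
from itertools import product

def check_3_4_6():
    # In every state where p ∨ q and ¬p both hold, q must hold.
    return all(q
               for p, q in product([True, False], repeat=2)
               if (p or q) and not p)
```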

We now turn to the laws of section 2.1. Some of their proofs are given
here; the others are left as exercises to the reader.

1. Commutative laws. (p ∧ q) = (q ∧ p) was proven in theorem (3.3.4); the


other two commutative laws are left to the reader to prove.

2. Associative laws. These we don't need to prove since the inference


rules for and and or were written using any number of operands and no
parentheses.

3. Distributive laws. Here is a proof of the first; the second is left to the
reader. The proof is broken into three parts. The first part proves an
implication ⇒ in one direction and the second part proves it in the other direction, so
that the third can prove the equivalence. The second part uses a case
analysis (rule ∨-E) on b ∨ ¬b - the law of the Excluded Middle - which is
not proved until later. The use of b ∨ ¬b in this fashion occurs often.

From b ∨ (c ∧ d) infer (b ∨ c) ∧ (b ∨ d)

1   From b infer (b ∨ c) ∧ (b ∨ d)
1.1   b ∨ c                       ∨-I, pr 1
1.2   b ∨ d                       ∨-I, pr 1
(3.4.7) 1.3   (b ∨ c) ∧ (b ∨ d)   ∧-I, 1.1, 1.2
2   b ⇒ (b ∨ c) ∧ (b ∨ d)        ⇒-I, 1
3   From c ∧ d infer (b ∨ c) ∧ (b ∨ d)
3.1   c                           ∧-E, pr 1
3.2   d                           ∧-E, pr 1
3.3   b ∨ c                       ∨-I, 3.1
3.4   b ∨ d                       ∨-I, 3.2
3.5   (b ∨ c) ∧ (b ∨ d)          ∧-I, 3.3, 3.4
4   (c ∧ d) ⇒ (b ∨ c) ∧ (b ∨ d)  ⇒-I, 3
5   (b ∨ c) ∧ (b ∨ d)            ∨-E, pr 1, 2, 4

From (b ∨ c) ∧ (b ∨ d) infer b ∨ (c ∧ d)

1   b ∨ c                     ∧-E, pr 1
2   b ∨ d                     ∧-E, pr 1
3   b ∨ ¬b                    (3.4.14)
4   From b infer b ∨ (c ∧ d)
4.1   b ∨ (c ∧ d)             ∨-I, pr 1
5   b ⇒ b ∨ (c ∧ d)          ⇒-I, 4
(3.4.8) 6   From ¬b infer b ∨ (c ∧ d)
6.1   c                       (3.4.6), 1, pr 1
6.2   d                       (3.4.6), 2, pr 1
6.3   c ∧ d                   ∧-I, 6.1, 6.2
6.4   b ∨ (c ∧ d)             ∨-I, 6.3
7   ¬b ⇒ b ∨ (c ∧ d)         ⇒-I, 6
8   b ∨ (c ∧ d)               ∨-E, 3, 5, 7

(3.4.9) Infer b ∨ (c ∧ d) = (b ∨ c) ∧ (b ∨ d)

1   b ∨ (c ∧ d) ⇒ (b ∨ c) ∧ (b ∨ d)    ⇒-I, (3.4.7)
2   (b ∨ c) ∧ (b ∨ d) ⇒ b ∨ (c ∧ d)    ⇒-I, (3.4.8)
3   b ∨ (c ∧ d) = (b ∨ c) ∧ (b ∨ d)    =-I, 1, 2
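As a semantic cross-check of (3.4.9), a short Python sketch (ours, outside the formal system) confirms by truth table that both sides agree in all eight states:

```python
from itertools import product

def distributive_law_holds():
    # b ∨ (c ∧ d) and (b ∨ c) ∧ (b ∨ d) agree in every state.
    return all((b or (c and d)) == ((b or c) and (b or d))
               for b, c, d in product([True, False], repeat=3))
```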

4. De Morgan's laws. We prove only the first one here.

From ¬(b ∧ c) infer ¬b ∨ ¬c
1   ¬(b ∧ c)                                pr 1
2   From ¬(¬b ∨ ¬c) infer (b ∧ c) ∧ ¬(b ∧ c)
2.1   ¬(¬b ∨ ¬c)                            pr 1
2.2   From ¬b infer (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)
(3.4.10) 2.2.1   ¬b ∨ ¬c                    ∨-I, pr 1
2.2.2   (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)             ∧-I, 2.2.1, 2.1
2.3   b                                     ¬-E, 2.2
2.4   From ¬c infer (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)
2.4.1   ¬b ∨ ¬c                             ∨-I, pr 1
2.4.2   (¬b ∨ ¬c) ∧ ¬(¬b ∨ ¬c)             ∧-I, 2.4.1, 2.1
2.5   c                                     ¬-E, 2.4
2.6   b ∧ c                                 ∧-I, 2.3, 2.5
2.7   (b ∧ c) ∧ ¬(b ∧ c)                   ∧-I, 2.6, 1
3   ¬b ∨ ¬c                                 ¬-E, 2

From ¬b ∨ ¬c infer ¬(b ∧ c)

1   From ¬b infer ¬(b ∧ c)
1.1   ¬b                      pr 1
1.2   From b ∧ c infer b ∧ ¬b
1.2.1   b                     ∧-E, pr 1
(3.4.11) 1.2.2   b ∧ ¬b       ∧-I, 1.2.1, 1.1
1.3   ¬(b ∧ c)                ¬-I, 1.2
2   ¬b ⇒ ¬(b ∧ c)            ⇒-I, 1
3   From ¬c infer ¬(b ∧ c)
3.1   ¬c                      pr 1
3.2   From b ∧ c infer c ∧ ¬c
3.2.1   c                     ∧-E, pr 1
3.2.2   c ∧ ¬c                ∧-I, 3.2.1, 3.1
3.3   ¬(b ∧ c)                ¬-I, 3.2
4   ¬c ⇒ ¬(b ∧ c)            ⇒-I, 3
5   ¬(b ∧ c)                  ∨-E, pr 1, 2, 4

(3.4.12) Infer ¬(b ∧ c) = ¬b ∨ ¬c

1   ¬(b ∧ c) ⇒ ¬b ∨ ¬c    ⇒-I, (3.4.10)
2   ¬b ∨ ¬c ⇒ ¬(b ∧ c)    ⇒-I, (3.4.11)
3   ¬(b ∧ c) = ¬b ∨ ¬c    =-I, 1, 2
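Both De Morgan laws (the second is exercise 6 below) can likewise be confirmed by truth table; a brief Python sketch of ours:

```python
from itertools import product

def de_morgan_laws_hold():
    # ¬(b ∧ c) = ¬b ∨ ¬c  and  ¬(b ∨ c) = ¬b ∧ ¬c, in all four states.
    return all((not (b and c)) == ((not b) or (not c)) and
               (not (b or c)) == ((not b) and (not c))
               for b, c in product([True, False], repeat=2))
```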

5. Law of Negation. This one is extremely simple because we have
already done the necessary groundwork in previous theorems:

(3.4.13) Infer ¬¬b = b
1   b ⇒ ¬¬b      ⇒-I, (3.3.11)
2   ¬¬b ⇒ b      ⇒-I, (3.3.12)
3   ¬¬b = b      =-I, 1, 2

6. Law of the Excluded Middle. This proof proceeds by assuming the
negation of the law and deriving a contradiction in a straightforward manner.

Infer b ∨ ¬b
1   From ¬(b ∨ ¬b) infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
1.1   ¬(b ∨ ¬b)                          pr 1
1.2   From ¬b infer (b ∨ ¬b) ∧ ¬(b ∨ ¬b)
1.2.1   b ∨ ¬b                           ∨-I, pr 1
(3.4.14) 1.2.2   (b ∨ ¬b) ∧ ¬(b ∨ ¬b)   ∧-I, 1.2.1, 1.1
1.3   b                                  ¬-E, 1.2
1.4   b ∨ ¬b                             ∨-I, 1.3
1.5   (b ∨ ¬b) ∧ ¬(b ∨ ¬b)              ∧-I, 1.4, pr 1
2   b ∨ ¬b                               ¬-E, 1
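The laws of Negation and of the Excluded Middle admit the same kind of semantic check; in Python (a sketch of ours):

```python
def negation_laws_hold():
    # ¬¬b = b (law of Negation) and b ∨ ¬b (law of the Excluded Middle),
    # checked in both states.
    return all((not (not b)) == b and (b or not b) for b in (True, False))
```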

7. Law of Contradiction. Left to the reader.

8. Law of Implication. Left to the reader.

9. Law of Equality. Left to the reader.

10-11. Laws of or- and and-Simplification. These laws use the constants
T and F, which don't appear in the inference system.

Exercises for Section 3.4


1. Use the idea in theorem (3.4.1) to derive from (3.3.7) a proof that
(p ∧ q) ⇒ (p ∨ q) follows from p ⇒ (q ⇒ p ∨ q).

2. Use the idea in theorem (3.4.1) to derive from (3.3.8) a proof that from
(q ∧ r ∧ q) ⇒ r follows (q ∧ r) ⇒ (q ⇒ r).

3. Use the idea in theorem (3.4.1) to derive from (3.3.4) a proof that (a ∧ b ∧ c)
= (c ∧ a ∧ b).

4. Prove the second and third Commutative laws, (b ∨ c) = (c ∨ b) and (b = c)
= (c = b).

5. Prove the second Distributive law, b ∧ (c ∨ d) = (b ∧ c) ∨ (b ∧ d).

6. Prove the second of De Morgan's laws, ¬(b ∨ c) = ¬b ∧ ¬c.

7. Prove the law of Contradiction, ¬(b ∧ ¬b).

8. Prove the law of Implication, b ∨ c = (¬b ⇒ c).

9. Prove the law of Equality, (b = c) = (b ⇒ c) ∧ (c ⇒ b).

10. Prove theorem (3.4.3).

11. Prove the rule of Transitivity: from a = b and b = c follows a = c.

12. Prove that from p ∨ q and ¬q follows p (see (3.4.6)).

3.5 Developing Natural Deduction System Proofs


The reader has no doubt struggled to prove some theorems in the
natural deduction system, and has wondered whether such proofs could be
developed in a systematic manner. This section should provide some help.
We will begin to be less formal, stating facts without formal proof and
taking larger steps in a proof when doing so does not hamper understand-
ing. This is not only convenient; it is necessary. While the formal
methods are indispensable for learning about propositions, one must begin
to use the insight they supply instead of the complete formality they
require in order to keep from being buried under mounds of detail.
To help the reader take a more active role in the development of the
proofs, they will be presented as follows. At each step, a question will be
posed, which must be answered in order to invent the next step in the
proof. The answer will be separated from the question by white space
and an underline, so that the reader can try to answer the question before
proceeding. In this way, the reader can actually develop each step of the
proof and check it with the one presented.

Some general hints on developing proofs


Suppose a theorem of the form "From el, e2 infer e3" is to be proved.
The proof must have the form

From e1, e2 infer e3

1   e1    pr 1
2   e2    pr 2

3   e3    Why?

and we need only substantiate line 3 - i.e. give a reason why e3 can be
written on it. We can look to three things for insight. First, we may be
able to combine the premises or derive sub-propositions from them in
some fashion, if not to produce e3 at least to get something that looks
similar to it.
Secondly, we can investigate e3 itself. Since an inference rule must be
used to substantiate line 3, the form of e3 should help us decide which
inference rule to use. And this leads us to the third piece of information
we can use, the inference rules themselves. There are ten inference rules,
which yields a lot of possibilities. Fortunately, few of them will apply to
any particular proposition e3, because e3 must have the form of the con-
clusion of the inference rule used to substantiate it. And, with the addi-
tional information of the premises, the number of actual possibilities can
be reduced even more.

For example, if e3 has the form e4 ⇒ e5, the two most likely inference
rules to use are =-E and ⇒-I, and if a suitable equivalence does not seem
possible to derive from the premises, then =-E can be eliminated from
consideration.
Let us suppose we try to substantiate line 3 using rule ⇒-I, because it
has the form e4 ⇒ e5. Then we would expand the proof as follows.

From e1, e2 infer e4 ⇒ e5

1   e1                  pr 1
2   e2                  pr 2
3   From e4 infer e5
3.1   e4                pr 1

3.2   e5                Why?
4   e4 ⇒ e5            ⇒-I, 3

Thus, we have reduced the problem of proving e4 ⇒ e5 from e1 and e2 to
the problem of proving e5 from e4, and the new problem promises to be
simpler because propositions e4 and e5 each contain fewer operations than
e3 did - they are in some sense smaller and simpler.
The above discussion shows basically how to go about developing a
proof. At each step, investigate the inference rules to determine which are
most likely to be applicable, based mainly on the proposition to be proved
and secondly on previous assumptions and already-proved theorems, and
attempt to apply one of them in order to reduce the problem to a simpler
one.
As the proof expands and more assumptions are made, try to invent
and substantiate new propositions (from the already proved ones) that
may be helpful for proving the desired result. But remember that, while
the premises are certainly useful, proof development is a goal-oriented
activity, and it is mainly the goal, the proposition that must be substan-
tiated; we should look to the goal and possible inference rules for the
most insight.
Successful proof development requires some experience with the infer-
ence rules, so the reader should spend some time studying them and
deciding when they might be employed. We can give some hints here.
Rules =-1 and =-E together define operation equals. They are used
only to derive an equivalence or to turn one into implications. If
equivalence is not a part of the premises or goal, they can be eliminated
from consideration.
The other rules of introduction are used to introduce longer proposi-
tions from shorter ones. Hence, they are useful when the desired goal, or

parts of it, can be built from shorter propositions that occur on previous
lines. Note that, except for =-1, the forms of the conclusions of the rules
of introduction are all different, so that at most one of these rules can be
used to substantiate a proposition.
The rules of elimination are generally used to "break apart" a proposition
so that one of its sub-propositions can be derived. All the rules of
elimination (except for =-E) have a general proposition as their conclu-
sion. This means that they may possibly be used to substantiate any pro-
position. Whether an elimination rule can be used depends on whether its
premises have appeared on previous lines, so to decide whether these rules
should be used requires a look at previous lines.

The Development of a proof


Problem. Prove that if p ⇒ q is true then so is (p ∧ ¬q) ⇒ ¬p. The first
step in developing a proof is to draw the outline for the proof and fill in
the first line with the theorem, the next lines with the premises and the
last line with the goal - i.e. the proposition to be inferred. Perform this
step.

The problem description yields the following start of a proof:

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p

1   p ⇒ q              pr 1

2   (p ∧ ¬q) ⇒ ¬p     Why?

At this point, it is wise to study the premises to see whether propositions
can be derived from them. Do this.

Little can be derived from p ⇒ q, except the disjunction ¬p ∨ q (using the
rule of Substitution). We will keep this proposition in mind. Which rules
of inference could be used to substantiate line 2? That is, which rules of
inference could have (p ∧ ¬q) ⇒ ¬p as their conclusion?

Possible inference rules are: ⇒-I, ∧-E, ∨-E, ¬-E, =-E and ⇒-E. Which
seems most applicable, and why? Expand the proof accordingly.

There is little to suggest that the elimination rules could be useful, for
their premises are different from the propositions on previous lines. This
leaves only ⇒-I.

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p

1   p ⇒ q                   pr 1
2   From p ∧ ¬q infer ¬p
2.1   p ∧ ¬q                pr 1

2.2   ¬p                    Why?
3   (p ∧ ¬q) ⇒ ¬p          ⇒-I, 2

What can be derived from the propositions appearing on lines previous to
2.2?

Using ∧-E, we can derive p and ¬q from premise p ∧ ¬q. We then see
that q can be derived from p ⇒ q and p. (Is it strange that both q and
¬q can be derived?) Keeping these in mind, list the inference rules that
could be used to substantiate line 2.2.

Possible inference rules are ¬-I, ∧-E, ∨-E, ¬-E and ⇒-E. Choose the rule
that is most applicable and expand the proof accordingly.

The elimination rules don't seem useful here; elimination of ⇒ on line 1
results in q, and we already know that ∧-E can be used to derive only p
and ¬q from p ∧ ¬q. Only ¬-I seems helpful:

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p

1   p ⇒ q                       pr 1
2   From p ∧ ¬q infer ¬p
2.1   p ∧ ¬q                    pr 1
2.2   From p infer e ∧ ¬e       (which e?)
2.2.1   p                       pr 1

2.2.2   e ∧ ¬e                  Why?
2.3   ¬p                        ¬-I, 2.2
3   (p ∧ ¬q) ⇒ ¬p              ⇒-I, 2

What proposition e should be used on lines 2.2 and 2.2.2? To make the
choice, look at the propositions that occur on lines previous to 2.2 and

the propositions we know we can derive from them. Expand the proof
accordingly.

We reasoned above that we could derive both q and ¬q, so the obvious
choice is e = q. We complete the proof as follows:

From p ⇒ q infer (p ∧ ¬q) ⇒ ¬p

1   p ⇒ q                    pr 1
2   From p ∧ ¬q infer ¬p
2.1   p ∧ ¬q                 pr 1
2.2   p                      ∧-E, 2.1
2.3   ¬q                     ∧-E, 2.1
2.4   From p infer q ∧ ¬q
2.4.1   q                    ⇒-E, 1, 2.2
2.4.2   q ∧ ¬q               ∧-I, 2.4.1, 2.3
2.5   ¬p                     ¬-I, 2.4
3   (p ∧ ¬q) ⇒ ¬p           ⇒-I, 2

The Development of a second proof


Problem. Prove that from ¬p = q follows ¬(p = q). Draw the outline of
the proof and fill in the obvious details.

From ¬p = q infer ¬(p = q)

1   ¬p = q          pr 1

2   ¬(p = q)        Why?

What information can be gleaned from the premises?

Rule =-E can be used to derive two implications. This seems useful here,
since implications will be needed to derive the goal, and we derive both.

From ¬p = q infer ¬(p = q)

1   ¬p ⇒ q         =-E, pr 1
2   q ⇒ ¬p         =-E, pr 1

3   ¬(p = q)        Why?

The following rules could be used to substantiate line 3: ¬-I, ∧-E, ∨-E,
¬-E and ⇒-E. Choose the most likely one and expand the proof accordingly.

The elimination rules don't seem helpful at all, because the premises that
would be needed in order to use them are not available and don't seem
easy to derive. The only rule to try at this point is ¬-I - we have little
choice!

From ¬p = q infer ¬(p = q)

1   ¬p ⇒ q                      =-E, pr 1
2   q ⇒ ¬p                      =-E, pr 1
3   From p = q infer e ∧ ¬e     (which e?)
3.1   p = q                     pr 1

3.2   e ∧ ¬e                    Why?

4   ¬(p = q)                    ¬-I, 3

What proposition e should be used on lines 3 and 3.2, and how should it
be proved? Expand the proof accordingly.

The propositions ¬p ⇒ q and q ⇒ ¬p are available. In addition, from
line 3.1 p ⇒ q and q ⇒ p can be derived. Let's rearrange these as follows:

p ⇒ q, q ⇒ ¬p, q ⇒ p, and
¬p ⇒ q, q ⇒ ¬p, q ⇒ p.

If we assume p we can prove both p and ¬p; if we assume ¬p we can
also prove p and ¬p. Hence we should be able to prove the contradiction
p ∧ ¬p. So try e = p and write the following proof.

From ¬p = q infer ¬(p = q)
1   ¬p ⇒ q                    =-E, pr 1
2   q ⇒ ¬p                    =-E, pr 1
3   From p = q infer p ∧ ¬p
3.1   p ⇒ q                   =-E, pr 1
3.2   q ⇒ p                   =-E, pr 1
3.3   p                       Why?
3.4   ¬p                      Why?
3.5   p ∧ ¬p                  ∧-I, 3.3, 3.4
4   ¬(p = q)                  ¬-I, 3

So we are left with concluding the two propositions p and ¬p. These are
quite simple, using the above reasoning, so let us just show the final
proof.

From ¬p = q infer ¬(p = q)
1   ¬p ⇒ q                    =-E, pr 1
2   q ⇒ ¬p                    =-E, pr 1
3   From p = q infer p ∧ ¬p
3.1   p ⇒ q                   =-E, pr 1
3.2   q ⇒ p                   =-E, pr 1
3.3   From ¬p infer p ∧ ¬p
3.3.1   q                     ⇒-E, 1, pr 1
3.3.2   p                     ⇒-E, 3.2, 3.3.1
3.3.3   p ∧ ¬p                ∧-I, 3.3.2, pr 1
3.4   p                       ¬-E, 3.3
3.5   From p infer p ∧ ¬p
3.5.1   q                     ⇒-E, 3.1, pr 1
3.5.2   ¬p                    ⇒-E, 2, 3.5.1
3.5.3   p ∧ ¬p                ∧-I, pr 1, 3.5.2
3.6   ¬p                      ¬-I, 3.5
3.7   p ∧ ¬p                  ∧-I, 3.4, 3.6
4   ¬(p = q)                  ¬-I, 3

At each step of the development of the proof there was little choice. The
crucial - and most difficult - point of the development was the choice of
inference rule ¬-I to substantiate the last line of the proof, but careful
study of the inference rules led to it as the only likely candidate. Thus,
directed study of the available information can lead quite simply to the
proof.

The Tardy Bus Problem


The Tardy Bus Problem is taken from WFF'N PROOF: The Game of
Modern Logic [1].

THE TARDY BUS PROBLEM. Given are the following premises:

1. If Bill takes the bus, then Bill misses his appointment, if the
bus is late.
2. Bill shouldn't go home, if (a) Bill misses his appointment, and
(b) Bill feels downcast.
3. If Bill doesn't get the job, then (a) Bill feels downcast, and (b)
Bill shouldn't go home.

Which of the following conjectures are true? That is, which can be validly
proved from the premises? Give proofs of the true conjectures and counterexamples
for the others.
1. If Bill takes the bus, then Bill does get the job, if the bus is
late.
2. Bill gets the job, if (a) Bill misses his appointment, and (b) Bill
should go home.
3. If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job.
4. Bill doesn't take the bus if, (a) the bus is late, and (b) Bill
doesn't get the job.
5. If Bill doesn't miss his appointment, then (a) Bill shouldn't go
home, and (b) Bill doesn't get the job.
6. Bill feels downcast, if (a) the bus is late, or (b) Bill misses his
appointment.
7. If Bill does get the job, then (a) Bill doesn't feel downcast, or
(b) Bill shouldn't go home.
8. If (a) Bill should go home, and Bill takes the bus, then (b) Bill
doesn't feel downcast, if the bus is late.

This problem is typical of the puzzles one comes across from time to time.
Most people are confused by them - they just don't know how to deal
with them effectively and are amazed at those who do. It turns out, however,
that knowledge of the propositional calculus makes the problem fairly
easy.
The first step in solving the problem is to translate the premises into
propositional form. Let the identifiers and their interpretations be:

tb: Bill takes the bus
ma: Bill misses his appointment
bl: The bus is late
gh: Bill should go home
fd: Bill feels downcast
gj: Bill gets the job.

The premises are given below. Each has been put in the form of an implication
and in the form of a disjunction, knowing that the disjunctive form
is often helpful.

Premise 1. tb ⇒ (bl ⇒ ma)     or   ¬tb ∨ ¬bl ∨ ma
Premise 2. (ma ∧ fd) ⇒ ¬gh    or   ¬ma ∨ ¬fd ∨ ¬gh
Premise 3. ¬gj ⇒ (fd ∧ ¬gh)   or   gj ∨ (fd ∧ ¬gh)

Now let's solve the first few problems. In order to save space, Premises 1,
2 and 3 are not written in every proof, but are simply referred to as Premises
1, 2 and 3. Included, however, are propositions derived from them in
order to get more true propositions from which to conclude the result.

Conjecture 1: If Bill takes the bus, then Bill does get the job, if the bus is
late. Translate the conjecture into propositional form.

In propositional form, the conjecture is tb ⇒ (bl ⇒ gj). We try to prove
"From tb infer bl ⇒ gj", which would prove that the conjecture is true.
Write the outline for the proof and fill in the obvious details.

From tb infer bl ⇒ gj

1   tb           pr 1

2   bl ⇒ gj     Why?

What propositions can be derived from line I and Premises I, 2 and 3?


Expand the proof accordingly.

Proposition bl ⇒ ma can be derived from Premise 1 and line 1:

From tb infer bl ⇒ gj

1   tb           pr 1
2   bl ⇒ ma     ⇒-E, Premise 1, 1

3   bl ⇒ gj     Why?

Which rules could be used to substantiate line 3?

Proposition bl ⇒ gj could be an instance of the conclusion of rules ⇒-I,
∧-E, ∨-E, ¬-E, =-E and ⇒-E. Which seems most useful here? Expand the
proof accordingly.

The necessary propositions for the use of the elimination rules are not
available, so try ⇒-I:

From tb infer bl ⇒ gj
1   tb                pr 1
2   bl ⇒ ma          ⇒-E, Premise 1, 1
3   From bl infer gj
3.1   bl              pr 1

3.2   gj              Why?
4   bl ⇒ gj          ⇒-I, 3

Can any propositions be inferred at line 3.2 from the propositions on previous
lines and Premises 1, 2 and 3? Expand the proof accordingly.

Proposition ma can be derived from lines 2 and 3.1:

From tb infer bl ⇒ gj
1   tb                pr 1
2   bl ⇒ ma          ⇒-E, Premise 1, 1
3   From bl infer gj
3.1   bl              pr 1
3.2   ma              ⇒-E, 2, 3.1

3.3   gj              Why?
4   bl ⇒ gj          ⇒-I, 3

What rules could be used to substantiate line 3.3?

Proposition gj could be an instance of the conclusion of rules ∧-E, ∨-E,
¬-E and ⇒-E. Which ones seem helpful here?

None of the rules seem helpful. The only proposition available that
contains gj is Premise 3, and its disjunctive form indicates that gj must
necessarily be true only in states in which (fd ∧ ¬gh) is false (according to
theorem (3.4.6)). But there is nothing in Premise 2, the only other place
fd and gh appear, to make us believe that fd ∧ ¬gh must be false.
Perhaps the conjecture is false. What counterexample - i.e. state in
which the conjecture is false - does the structure of the proof and this
argument lead to?

Up to line 3.2 of the proof we have assumed or proved tb = T, bl = T
and ma = T. To contradict the conjecture, we need gj = F. Finally, the
above argument indicates we should try to let fd ∧ ¬gh be true, so we try
fd = T and gh = F. Indeed, in this state Premises 1, 2 and 3 are true and
the conjecture is false.
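With only six identifiers there are 2^6 = 64 states, so the counterexample can also be found and checked mechanically. The Python sketch below is ours (the function names `premises`, `conjecture1` and `counterexamples` are not from the book):

```python
from itertools import product

IDENTS = ('tb', 'ma', 'bl', 'gh', 'fd', 'gj')

def premises(s):
    p1 = (not s['tb']) or (not s['bl']) or s['ma']        # tb ⇒ (bl ⇒ ma)
    p2 = (not s['ma']) or (not s['fd']) or (not s['gh'])  # (ma ∧ fd) ⇒ ¬gh
    p3 = s['gj'] or (s['fd'] and not s['gh'])             # ¬gj ⇒ (fd ∧ ¬gh)
    return p1 and p2 and p3

def conjecture1(s):
    # tb ⇒ (bl ⇒ gj)
    return (not s['tb']) or (not s['bl']) or s['gj']

def counterexamples(conjecture):
    """All states satisfying the premises but falsifying the conjecture."""
    result = []
    for values in product([True, False], repeat=len(IDENTS)):
        s = dict(zip(IDENTS, values))
        if premises(s) and not conjecture(s):
            result.append(s)
    return result
```

The state found in the text (tb = T, ma = T, bl = T, gh = F, fd = T, gj = F) is among the counterexamples the search returns.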

Conjecture 2: Bill gets the job, if (a) Bill misses his appointment and (b)
Bill should go home. Translate the conjecture into propositional form.

This conjecture can be translated as (ma ∧ gh) ⇒ gj. To prove it we need
to prove "From ma ∧ gh infer gj". Draw the outline of a proof and fill in
the obvious details.

From ma ∧ gh infer gj

1   ma ∧ gh     pr 1

2   gj           Why?

What can we derive from line 1 and Premises 1, 2 and 3? Expand the
proof accordingly.

Both line 1 and Premise 2 contain ma and gh. Premise 2 can be put in
the form ¬(ma ∧ gh) ∨ ¬fd. Since ma ∧ gh is on line 1, theorem (3.4.6)
together with the law of Negation allows us to conclude that ¬fd is true,
or that fd is false. Putting this argument into the proof yields

From ma ∧ gh infer gj

1   ma ∧ gh                  pr 1
2   ¬(ma ∧ gh) ∨ ¬fd        subs, De Morgan, Premise 2
3   ¬¬(ma ∧ gh)             subs, Negation, 1
4   ¬fd                      (3.4.6), 2, 3

5   gj                       Why?

What inference rule should be used to substantiate line 5? Expand the
proof accordingly.

The applicable rules are ∧-E, ∨-E, ¬-E and ⇒-E. This means that an earlier
proposition must be broken apart to derive gj. The one that contains
gj is Premise 3, and in its disjunctive form it looks promising. To show
that gj is true, we need only show that fd ∧ ¬gh is false. But we already
know that fd is false, so we can complete the proof as follows.

From ma ∧ gh infer gj

1   ma ∧ gh                  pr 1
2   ¬(ma ∧ gh) ∨ ¬fd        subs, De Morgan, Premise 2
3   ¬¬(ma ∧ gh)             subs, Negation, 1
4   ¬fd                      (3.4.6), 2, 3
5   ¬fd ∨ ¬¬gh              ∨-I, 4
6   ¬(fd ∧ ¬gh)             subs, De Morgan, 5
7   gj                       (3.4.6), Premise 3, 6
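Since conjecture 2 is claimed valid, an exhaustive search over the 64 states should find no counterexample. A Python sketch of ours confirms this semantically:

```python
from itertools import product

def conjecture2_valid():
    # (ma ∧ gh) ⇒ gj must hold in every state satisfying Premises 1-3.
    for tb, ma, bl, gh, fd, gj in product([True, False], repeat=6):
        p1 = (not tb) or (not bl) or ma          # Premise 1
        p2 = (not ma) or (not fd) or (not gh)    # Premise 2
        p3 = gj or (fd and not gh)               # Premise 3
        if p1 and p2 and p3 and (ma and gh) and not gj:
            return False                         # a counterexample exists
    return True
```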

Conjecture 3: If the bus is late, then (a) Bill doesn't take the bus, or Bill
doesn't miss his appointment, if (b) Bill doesn't get the job. Translate the
conjecture into propositional form.

Is this conjecture ambiguous? Two possible translations are

bl ⇒ (¬gj ⇒ (¬tb ∨ ¬ma)), and

bl ⇒ (¬tb ∨ (¬gj ⇒ ¬ma))

Let us assume the first proposition is intended. It is true if we can prove
"From bl infer ¬gj ⇒ (¬tb ∨ ¬ma)". Draw the outline of the proof and

fill in the obvious details.

From bl infer ¬gj ⇒ (¬tb ∨ ¬ma)

1   bl                         pr 1

2   ¬gj ⇒ (¬tb ∨ ¬ma)        Why?

What propositions can be derived from line 1 and the Premises?

No propositions can be derived, at least easily, so let's proceed to the next
step. What rule should be used to substantiate line 2? Expand the proof
accordingly.

Quite obviously, rule ⇒-I should be tried:

From bl infer ¬gj ⇒ (¬tb ∨ ¬ma)

1   bl                          pr 1
2   From ¬gj infer ¬tb ∨ ¬ma
2.1   ¬gj                       pr 1

2.2   ¬tb ∨ ¬ma                Why?

3   ¬gj ⇒ (¬tb ∨ ¬ma)         ⇒-I, 2

Just before line 2.2, what propositions can be inferred from earlier propositions
and Premises 1, 2 and 3? Expand the proof accordingly.

The antecedent of Premise 3 is true, so we can conclude that the consequent
is also true:

From bl infer ¬gj ⇒ (¬tb ∨ ¬ma)

1   bl                          pr 1
2   From ¬gj infer ¬tb ∨ ¬ma
2.1   ¬gj                       pr 1
2.2   fd ∧ ¬gh                  ⇒-E, Premise 3, 2.1
2.3   fd                        ∧-E, 2.2
2.4   ¬gh                       ∧-E, 2.2

2.5   ¬tb ∨ ¬ma                Why?

3   ¬gj ⇒ (¬tb ∨ ¬ma)         ⇒-I, 2

What inference rule should be used to substantiate line 2.5? Expand the
proof accordingly.

The proposition on line 2.5 could have the form of the conclusion of rules
∨-I, ∧-E, ∨-E, ¬-E and ⇒-E. The first rule to try is ∨-I. Its use would
require proving that one of ¬tb and ¬ma is true. But, looking at the
Premises, this seems difficult. For from Premise 1 we see that both tb
and ma could be true, while the other premises are true also because both
their conclusions are true. Perhaps there is a counterexample. What is it?

In a state with tb = T, ma = T, bl = T, gh = F, fd = T and gj = F,
Premises 1, 2 and 3 are true, but the conjecture is false.
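This refutation can be checked mechanically. Below is a minimal Python sketch; the propositional encodings of the three premises (tb ⇒ (bl ⇒ ma), (ma ∧ fd) ⇒ ¬gh, and ¬gj ⇒ (fd ∧ ¬gh)) follow the usual statement of the Tardy Bus problem and are an assumption here, since the premises themselves are stated earlier in the chapter.

```python
def implies(p, q):
    # material implication: p => q is (not p) or q
    return (not p) or q

# The state claimed to refute Conjecture 3
tb, ma, bl, gh, fd, gj = True, True, True, False, True, False

# Assumed propositional forms of the three Tardy Bus premises
premise1 = implies(tb, implies(bl, ma))
premise2 = implies(ma and fd, not gh)
premise3 = implies(not gj, fd and (not gh))

# Conjecture 3, first translation: bl => (not gj => (not tb or not ma))
conjecture = implies(bl, implies(not gj, (not tb) or (not ma)))

print(premise1, premise2, premise3, conjecture)  # True True True False
```

All three premises evaluate to T in this state while the conjecture evaluates to F, which is exactly what a counterexample requires.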

Exercises for Section 3.5


1. Prove or disprove conjectures 4-8 of the Tardy Bus problem.
2. For comparison, prove the valid conjectures of the Tardy Bus problem using a
mixture of the equivalence-transformation system of chapter 2 and English.
Chapter 4
Predicates

In section 1.3, a state was defined as a function from identifiers to the


set of values {T, F}. The notion of a state is now extended to allow iden-
tifiers to be associated with other values, e.g. integers, sequences of char-
acters, and sets. The notion of a proposition will then be generalized in
two ways:

1. In a proposition, an identifier may be replaced by any expres-
sion (e.g. x ≤ y) that has the value T or F.
2. The quantifiers E, meaning "there exists"; A, meaning "for all";
and N, meaning "number of", are introduced. This requires an
explanation of the notions of free identifier and bound identifier
and a careful discussion of scope of identifiers in expressions.

Expressions resulting from these generalizations are called predicates, and


the addition to a formal system (like the system of chapter 2 or 3) of
inference rules to deal with them yields a predicate calculus.

4.1 Extending the Range of a State


We now consider a state to be a function from identifiers to values,
where these values may be other than T and F. In any given context, an
identifier has a type, such as Boolean, which defines the set of values with
which it may be associated. The notations used to indicate the standard
types required later are:

Boolean (i): identifier i can be associated (only) with T or F.



naturalnumber(i): i can be associated with a member of {0, 1,
2, ...}.
integer(i): i can be associated with an integer, a member of
{..., -2, -1, 0, 1, 2, ...}.
integerset(i): i can be associated with a set of integers.

Other types will be introduced where necessary.

Let P be the expression x <y, where x and y have type integer.


When evaluated, P yields either T or F, so it may replace any identifier
in a proposition. For example, replacing b in (b ∧ c) ∨ d by P yields

((x < y) ∧ c) ∨ d.

The new assertions like P are called atomic expressions, while an expres-
sion that results from replacing an identifier by an atomic expression is
called a predicate. We will not go into detail about the syntax of atomic
expressions; instead we will use conventional mathematical notation and
rely on the reader's knowledge of mathematics and programming. For
example, any expression of a programming language that yields a Boolean
result is an acceptable atomic expression. Thus, the following are valid
predicates:

((x ≤ y) ∧ (y < z)) ∨ (x+y < z)

(x ≤ y ∧ y < z) ∨ x+y < z

The second example illustrates that parentheses are not always needed to
isolate the atomic expressions from the rest of a predicate. The pre-
cedences of operators in a predicate follow conventional mathematics.
For example, the Boolean operators A, v, and ~ have lower precedence
than the arithmetic and relational operators. We will use parentheses to
make the precedence of operations explicit where necessary.

Evaluating predicates
Evaluating a predicate in a state is similar to evaluating a proposition.
All identifiers are replaced by their values in the state, the atomic expres-
sions are evaluated and replaced by their values (T or F), and the result-
ing constant proposition is evaluated. For example, the predicate
x < y ∨ b in the state {(x, 2), (y, 3), (b, F)} has the value of 2 < 3 ∨ F,
which is equivalent to T ∨ F, which is T.
Using our earlier notation s(e) to represent the value of expression e
in state s, and writing a state as the set of pairs it contains, we show the
evaluation of three predicates:

s((x ≤ y ∧ y < z) ∨ (x+y < z))   where s = {(x, 1), (y, 3), (z, 5)}
= (1 ≤ 3 ∧ 3 < 5) ∨ (1+3 < 5)
= (T ∧ T) ∨ T
= T.
s((x ≤ y ∧ y < z) ∨ (x+y < z))   where s = {(x, 3), (y, 1), (z, 5)}
= (3 ≤ 1 ∧ 1 < 5) ∨ (3+1 < 5)
= (F ∧ T) ∨ T
= T.
s((x ≤ y ∧ y < z) ∨ (x+y < z))   where s = {(x, 5), (y, 1), (z, 3)}
= (5 ≤ 1 ∧ 1 < 3) ∨ (5+1 < 3)
= (F ∧ T) ∨ F
= F.
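The same three evaluations can be mirrored in Python, representing a state as a dictionary and the predicate as a function of the state (the encoding is a sketch of ours, not the book's notation):

```python
# The predicate (x <= y and y < z) or (x + y < z), as a function of a state
def P(s):
    return (s["x"] <= s["y"] and s["y"] < s["z"]) or (s["x"] + s["y"] < s["z"])

# The three states used in the text
s1 = {"x": 1, "y": 3, "z": 5}
s2 = {"x": 3, "y": 1, "z": 5}
s3 = {"x": 5, "y": 1, "z": 3}

print(P(s1), P(s2), P(s3))  # True True False
```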

Reasoning about atomic expressions


Just as inference rules were developed for reasoning with propositions,
so they should be developed to deal with atomic expressions. For exam-
ple, we should be able to prove formally that i < k follows from (i < j
∧ j < k). We shall not do this here; as they say, "it is beyond the scope
of this book." Instead, we rely on the reader's knowledge of mathematics
and programming to reason, as he always has done, about the atomic
expressions within predicates.
As mentioned earlier, we will be using expressions dealing with integer
arithmetic, real arithmetic (though rarely) and sets. The operators we will
be using in these expressions are described in Appendix 2.

The operators cand and cor


Every proposition is well-defined in any state in which all its identifiers
have one of the values T and F. When we introduce other types of
values and expressions, however, the possibility of undefined expressions
(in some states) arises. For example, the expression x /y is undefined if y
is O. We should, of course, be sure that an expression in a program is
well-defined in each state in which it will be evaluated, but at times it is
useful to allow part of an expression to be undefined.
Consider, for example, the expression

y = 0 ∨ (x/y = 5).

Formally, this expression is undefined if y = 0, because x /y is undefined


if y = 0 and or is itself defined only when its operands are T or F. And
yet some would argue that the expression should have a meaning in any
state where y = 0. Since in such states the first operand of or is true, and
since or is defined to be true if either of its operands is true, the

expression should be true. Furthermore, such an interpretation would be


quite useful in programming, for it would allow us to say many things
more clearly and compactly. For example, consider being able to write

if y = 0 ∨ (x/y = 5) then s1 else s2

as opposed to

if y = 0 then s1
else if x/y = 5 then s1
else s2

Rather than change the definition of and and or, which would require
us to change our formal logic completely, we introduce two new opera-
tors: cand (for conditional and) and cor (for conditional or). The
operands of these new operators can be any of three values: F, T and U
(for Undefined). The new operators are defined by the following truth
table.

b   c   b cand c   b cor c        b   c   b cand c   b cor c

T   T       T         T           F   U       F         U
T   F       F         T           U   T       U         U
T   U       U         T           U   F       U         U
F   T       F         T           U   U       U         U
F   F       F         F

This definition says nothing about the order in which the operands should
be evaluated. But the intelligent way to evaluate these operations, at least
on current computers, is in terms of the following equivalent conditional
expressions:

b cand c:   if b then c else F

b cor c:    if b then T else c
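The truth table and its conditional-expression reading can be captured in a small Python sketch; representing the undefined value U by the string "U" is our choice here, not the book's.

```python
U = "U"  # stands for an undefined operand

def cand(b, c):
    # b cand c: if b then c else F
    if b is U:
        return U
    return c if b else False

def cor(b, c):
    # b cor c: if b then T else c
    if b is U:
        return U
    return True if b else c

# A few rows of the truth table
print(cand(False, U))  # False: the second operand is never examined
print(cor(True, U))    # True
print(cor(False, U))   # U
print(cand(U, True))   # U: cand is not commutative, unlike and
```

Note that cand(False, U) is F while cand(U, False) is U, which is exactly the failure of commutativity discussed next.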

Operators cand and cor are not commutative. For example, b cand c is
not equivalent to c cand b. Hence, care must be exercised in manipulat-
ing expressions containing them. The following laws of equivalence do
hold for cand and cor (see exercise 5). These laws are numbered to
correspond to the numbering of the laws in chapter 2.

2. Associativity:   E1 cand (E2 cand E3) = (E1 cand E2) cand E3
                    E1 cor (E2 cor E3) = (E1 cor E2) cor E3

3. Distributivity:
   E1 cand (E2 cor E3) = (E1 cand E2) cor (E1 cand E3)
   E1 cor (E2 cand E3) = (E1 cor E2) cand (E1 cor E3)

4. De Morgan:   ¬(E1 cand E2) = ¬E1 cor ¬E2
                ¬(E1 cor E2) = ¬E1 cand ¬E2

6. Excluded Middle:   E1 cor ¬E1 = T (provided E1 is well-defined)

7. Contradiction:   E1 cand ¬E1 = F (provided E1 is well-defined)

10. cor-simplification:
    E1 cor E1 = E1
    E1 cor T = T (provided E1 is well-defined)
    E1 cor F = E1
    E1 cor (E1 cand E2) = E1

11. cand-simplification:
    E1 cand E1 = E1
    E1 cand T = E1
    E1 cand F = F (provided E1 is well-defined)
    E1 cand (E1 cor E2) = E1

In addition, one can derive various laws that combine cand and cor with
the other operations, for example,

E1 cand (E2 ∨ E3) = (E1 cand E2) ∨ (E1 cand E3)

Further development of such laws is left to the reader.

Exercises for Section 4.1


1. The first two exercises consist of evaluating predicates and other expressions
involving integers and sets. Appendix 2 gives more information on the operations
used. The state s in which the expressions should be evaluated consists of two
integer identifiers x, y, a Boolean identifier b, two set identifiers m, n and an
integer array c[1:3]. Their values are:

x = 7, y = 2, b = T, m = {1, 2, 3, 4}, n = {2, 4, 6}, c = (2, 4, 6)

(a) x ÷ y = 3                      (h) -ceil(-x/y) = x ÷ y
(b) (x-1) ÷ y = 3                  (i) 7 mod 2
(c) (x+1) ÷ y = 3                  (j) floor(x/y) = x ÷ y
(d) ceil(x/y) = x ÷ y + 1          (k) min(floor(x/2), ceil(x/2)) < ceil(x/2)
(e) floor((x+1)/y) = (x+1) ÷ y     (l) (abs(-x) = -abs(x)) = b
(f) floor(-x/y) = -3               (m) b ∨ x < y
(g) ceil(x/y) = x ÷ y              (n) 19 mod 3
2. Evaluate the following expressions in the state given in exercise 1.

(a) m ∪ n                          (g) |m| ∈ m
(b) m ∩ n                          (h) |n| ∈ n
(c) x ∈ m ∧ b                      (i) ({|m|} ∪ {6, 7}) ⊆ n
(d) m ⊆ n ∧ b                      (j) |m| + |n| = |m ∪ n|
(e) ∅ ⊆ m                          (k) min(m)
(f) {i | i ∈ m ∧ even(i)} ⊆ n      (l) {i | i ∈ m ∧ i ∈ n}

3. Evaluate the following predicates in the state given in exercise 1. Use U for
the value of an undefined expression.

(a) b ∨ x/(y-2) = 0                (f) x = 0 cand x/(y-2) = 0
(b) b cor x/(y-2) = 0              (g) 1 ≤ y ≤ 3 cand c[y] ∈ m
(c) b ∧ x/(y-2) = 0                (h) 1 ≤ y ≤ 3 cor c[x] ∈ m
(d) b cand x/(y-2) = 0             (i) 1 ≤ y ≤ 3 cand c[y+1] ∈ m
(e) x = 0 ∧ x/(y-2) = 0            (j) 1 ≤ x ≤ 3 cor c[y] ∈ m

4. Consider propositions a, b and c as having the values F, T or U (for unde-
fined). Describe all states where the commutative laws a cor b = b cor a and
a cand b = b cand a do not hold.
5. Prove that the laws of Associativity, Distributivity, De Morgan, Excluded Mid-
dle, Contradiction, cor-simplification and cand-simplification, given just before
these exercises, hold. Do this by building a truth table for each one.

4.2 Quantification

Existential quantification
Let m and n be two integer expressions satisfying m ≤ n. Consider
the predicate

(4.2.1)  Em ∨ Em+1 ∨ ... ∨ En-1,

where each Ei is a predicate. (4.2.1) is true in any state in which at least
one of the Ei is true. It can be expressed using the existential quantifier
E (read "there exists") as

(4.2.2)  (E i: m ≤ i < n: Ei).

The set of values that satisfy m ≤ i < n is called the range of the quanti-
fied identifier i. Predicate (4.2.2) is read in English as follows.

(E i:          there exists at least one (integer) i, such that
m ≤ i < n:     i is between m and n-1 (inclusive),
Ei )           for which the following holds: Ei.

The reader is no doubt already familiar with some forms of quantifica-
tion in mathematics. For example,

 n-1
  Σ  Si = Sm + Sm+1 + ... + Sn-1
 i=m

 n-1
  Π  Si = Sm * Sm+1 * ... * Sn-1
 i=m

stand for the sum and product of the values Sm, Sm+1, ..., Sn-1, respec-
tively. These can be written in a more linear fashion, similar to (4.2.1), as
follows, and we shall continue to use this new form:

(Σ i: m ≤ i < n: Si)

(Π i: m ≤ i < n: Si)

At this point, (4.2.2) is simply an abbreviation for (4.2.1). It can be
recursively defined as follows:

(4.2.3) Definition of E:
   (E i: m ≤ i < m: Ei) = F, and, for k ≥ m,
   (E i: m ≤ i < k+1: Ei) = (E i: m ≤ i < k: Ei) ∨ Ek.  □

Remark: The base case of this recursive definition, which concerns an
empty range m ≤ i < m for i, brings out an interesting point. The dis-
junction of zero predicates, (E i: m ≤ i < m: Ei), has the value F: "oring"
0 predicates together yields a predicate that is always false. For example,
the following predicates are equivalent to F:

(E i: 0 ≤ i < 0: i = i)
(E i: -3 ≤ i < -3: T)

The disjunction of zero disjuncts is F. The conjunction of zero con-
juncts turns out to be T. Similarly, the sum of zero values is 0 and the
product of zero values is 1. These four facts are expressed as

(Σ i: 0 ≤ i < 0: xi) = 0,
(Π i: 0 ≤ i < 0: xi) = 1,
(E i: 0 ≤ i < 0: Ei) = F,
(A i: 0 ≤ i < 0: Ei) = T.   (Notation explained subsequently)

The value 0 is called the identity element of addition, because any number
added to 0 yields that number. Similarly, 1, F and T are the identity ele-
ments of the operators *, or and and, respectively.  □
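Python's built-in reductions follow exactly these conventions on an empty range, so the four facts can be spot-checked directly (math.prod is available from Python 3.8 onward):

```python
import math

empty = range(0, 0)  # the empty range 0 <= i < 0

print(sum(x for x in empty))        # 0:     identity element of +
print(math.prod(x for x in empty))  # 1:     identity element of *
print(any(x == x for x in empty))   # False: identity element of or
print(all(x == x for x in empty))   # True:  identity element of and
```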

The following examples use quantification over two identifiers. They
are equivalent; they assert the existence of i and j between 1 and 99 such
that i is prime and their product is 1079 (is this true?). The third one uses
the convention that successive quantifications with the same range,
(E i: m ≤ i < n: (E j: m ≤ j < n: (E k: m ≤ k < n: ...))), can be written
as (E i, j, k: m ≤ i, j, k < n: ...).

(1) (E i: 0 ≤ i < 100: (E j: 0 ≤ j < 100: prime(i) ∧ i*j = 1079))
(2) (E i: 0 ≤ i < 100: prime(i) ∧ (E j: 0 ≤ j < 100: i*j = 1079))
(3) (E i, j: 0 ≤ i, j < 100: prime(i) ∧ i*j = 1079)
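The question "is this true?" can be settled by brute force. A sketch, borrowing the definition of prime that appears later in section 4.5:

```python
def prime(i):
    # prime(i) = 1 < i and (A j: 1 < j < i: i mod j != 0)
    return 1 < i and all(i % j != 0 for j in range(2, i))

# (E i, j: 0 <= i, j < 100: prime(i) and i*j = 1079)
witnesses = [(i, j) for i in range(100) for j in range(100)
             if prime(i) and i * j == 1079]
print(witnesses)  # [(13, 83), (83, 13)], since 1079 = 13 * 83 and both factors are prime
```

So the predicate is indeed true, with two witnessing pairs because both factors of 1079 happen to be prime and below 100.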

Universal quantification
The universal quantifier, A, is read as "for all". The predicate

(4.2.4)  (A i: m ≤ i < n: Ei)

is true in a state iff, for all values i in the range m ≤ i < n, Ei is true in
that state.
We now define A in terms of E, so that, formally, we need deal only
with one of them as a new concept. Predicate (4.2.4) is true iff all the Ei
are true, so we see that it is equivalent to

Em ∧ Em+1 ∧ ... ∧ En-1
= ¬¬(Em ∧ Em+1 ∧ ... ∧ En-1)        (Negation)
= ¬(¬Em ∨ ¬Em+1 ∨ ... ∨ ¬En-1)      (De Morgan)
= ¬(E i: m ≤ i < n: ¬Ei)

This leads us to define (4.2.4) as

(4.2.5) Definition. (A i: m ≤ i < n: Ei) = ¬(E i: m ≤ i < n: ¬Ei).  □

Now we can prove that (4.2.4) is true if its range is empty:

(A i: m ≤ i < m: Ei)
= ¬(E i: m ≤ i < m: ¬Ei)
= ¬F        (because the range of E is empty)
= T

Numerical quantification
Consider predicates E0, E1, .... It is quite easy to assert formally that
k is the smallest integer such that Ek holds. We need only indicate that
E0 through Ek-1 are false and that Ek is true:

0 ≤ k ∧ (A i: 0 ≤ i < k: ¬Ei) ∧ Ek

It is more difficult to assert that k is the second smallest integer such that
Ek holds, because we also have to describe the first such predicate Ej:

0 ≤ j < k ∧ (A i: 0 ≤ i < j: ¬Ei) ∧ Ej ∧
(A i: j+1 ≤ i < k: ¬Ei) ∧ Ek

Obviously, describing the third smallest value k such that Ek holds will
be clumsier, and to write a function that yields the number of true Ei will
be even harder. Let us introduce some notation:

(4.2.6) Definition. (N i: m ≤ i < n: Ei) denotes the number of different
values i in range m ≤ i < n for which Ei is true. N is called the
counting quantifier.  □

This means that

(E i: m ≤ i < n: Ei) = ((N i: m ≤ i < n: Ei) ≥ 1)
(A i: m ≤ i < n: Ei) = ((N i: m ≤ i < n: Ei) = n - m)

Now it is easy to assert that k is the third smallest integer such that Ek
holds:

((N i: 0 ≤ i < k: Ei) = 2) ∧ Ek
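A sketch of the three quantifiers in Python (the function names N, exists and forall are ours), together with the third-smallest assertion just given:

```python
def N(m, n, E):
    # (N i: m <= i < n: E(i)), the counting quantifier
    return sum(1 for i in range(m, n) if E(i))

def exists(m, n, E):
    return N(m, n, E) >= 1        # (E i: ...) = ((N i: ...) >= 1)

def forall(m, n, E):
    return N(m, n, E) == n - m    # (A i: ...) = ((N i: ...) = n - m)

even = lambda i: i % 2 == 0
print(N(0, 10, even))                            # 5
print(exists(0, 10, even), forall(0, 10, even))  # True False

# k is the third smallest value with E(k): (N i: 0 <= i < k: E(i)) = 2 and E(k)
k = 4
print(N(0, k, even) == 2 and even(k))  # True: 4 is the third even natural number
```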

A Note on ranges
Thus far, the ranges of quantifiers have been given in the form m ≤ i
< n, for integer expressions m and n. The lower bound m is included in
the range; the upper bound n is not. Later, the form of ranges will be
generalized, but this is a useful convention, and we will use it where it is
suitable.
Note that the number of values in the range is n-m. Note also that
quantifications with adjacent ranges can be combined as follows:

(E i: m ≤ i < n: Ei) ∨ (E i: n ≤ i < p: Ei) = (E i: m ≤ i < p: Ei)
(A i: m ≤ i < n: Ei) ∧ (A i: n ≤ i < p: Ei) = (A i: m ≤ i < p: Ei)
(N i: m ≤ i < n: Ei) + (N i: n ≤ i < p: Ei) = (N i: m ≤ i < p: Ei)
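The adjacent-range laws can be spot-checked with the counting reading of the quantifiers; a quick sketch for one choice of E, m, n and p:

```python
def count(m, n, E):
    # (N i: m <= i < n: E(i))
    return sum(1 for i in range(m, n) if E(i))

E = lambda i: i % 3 == 0
m, n, p = 2, 7, 12

# (N i: m..n) + (N i: n..p) = (N i: m..p)
print(count(m, n, E) + count(n, p, E) == count(m, p, E))  # True

# the same splitting works for E (any) and A (all)
left = any(E(i) for i in range(m, n)) or any(E(i) for i in range(n, p))
print(left == any(E(i) for i in range(m, p)))             # True
```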

Exercises for Section 4.2


1. Consider character strings in PL/I or Pascal. Let | denote catenation of
strings. For example, the value of the expression 'ab:' | 'xl' is the character string
'ab:xl'. What is the identity element of operation catenation?
2. Define the notation (A i: m ≤ i < n: Ei) recursively.
3. Define the notation (N i: m ≤ i < n: Ei) recursively.
4. Write a predicate that asserts that the value x occurs the same number of times
in arrays b[0:n-1] and c[0:m-1].
5. Write a predicate perm(b, c) that asserts that array b[0:n-1] is a permuta-
tion of array c[0:n-1]. Array b is a permutation of c if it is just a rearrange-
ment of it: each value occurs the same number of times in b and c. (See exercise
4.)
6. Consider array b[0:n-1], where n > 0. Let j, k be two integers satisfying
0 ≤ j ≤ k+1 ≤ n. By b[j:k] we mean the set of array elements b[j], ..., b[k],
where the list is empty if j = k+1.
Translate the following sentences into predicates. For example, the first one
can be written as (A i: j ≤ i < k+1: b[i] = 0). Some of the statements may be
ambiguous, in which case you should try to translate both possibilities.
(a) All elements of b[j:k] are zero.
(b) No values of b[j:k] are zero.
(c) Some values of b[j:k] are zero. (What does "some" mean?)
(d) All zeroes of b[0:n-1] are in b[j:k].
(e) Some zeroes of b[0:n-1] are in b[j:k].
(f) Those values in b[0:n-1] that are not in b[j:k] are in b[j:k].
(g) It is not the case that all zeroes of b[0:n-1] are in b[j:k].
(h) If b[0:n-1] contains a zero, then so does b[j:k].
(i) If b[j:k] contains two zeroes, then j = 1.
(j) Either b[1:j] or b[j:k] contains a zero (or both).
(k) The values of b[j:k] are in ascending order.
(l) If x is in b[j:k], then x+1 is in b[k+1:n-1].
(m) b[j:k] contains at least two zeroes.
(n) Every value in b[j:k] is also in b[k+1:n-1].
(o) j is a power of 2 if j is in b[j:k].
(p) Every element of b[0:j] is less than x, and every element of
b[j+1:n-1] exceeds x.
(q) If b[1] is 3 or b[2] is 4 and b[3] is 5 then j = 3.

4.3 Free and Bound Identifiers


The predicate

(4.3.1)  (A i: m ≤ i < n: x*i > 0)

asserts that x multiplied by any integer between m and n-1 (inclusive)
exceeds 0. This is true if both x and m exceed 0 or if x is less than 0
and n is at most 0. Hence, (4.3.1) is equivalent to the predicate

(x > 0 ∧ m > 0) ∨ (x < 0 ∧ n ≤ 0)

Thus, the truth of (4.3.1) in a state s depends on the values of m, n and
x in s, but not on the value of i; in fact, i need not even occur in state
s. And it should also be clear that the meaning of the predicate does not
change if all occurrences of i are replaced by j:

(A j: m ≤ j < n: x*j > 0)

Obviously, identifier i in (4.3.1) plays a different role than identifiers m,
n and x. So we introduce terminology to help make the different roles
clear. Identifiers m, n and x are free identifiers of the predicate. Identi-
fier i is bound in (4.3.1), and it is bound to the quantifier A in that predi-
cate.
Now consider the predicate

(4.3.2)  i > 0 ∧ (A i: m ≤ i < n: x*i > 0).

This is confusing, because the leftmost occurrence of i is free (and during
an evaluation will be replaced by the value of i in the state), while the
other occurrences of i are bound to the quantifier A. Clearly, it would be
better to use a different identifier j (say) for the bound i and to rewrite
(4.3.2) as

i > 0 ∧ (A j: m ≤ j < n: x*j > 0).

While it is possible to allow predicates like (4.3.2), and most logical sys-
tems do, it is advisable to enforce the use of each identifier in only one
way:

(4.3.3) Restriction on identifiers: In an expression, an identifier may not
be both bound and free, and an identifier may not be bound to
two different quantifiers.  □

Note that the predicate

(A i: m ≤ i < n: x*i > 0) ∧ (A i: m ≤ i < n: y*i < 0)

does not comply with the restriction. An equivalent predicate that does is

(A i: m ≤ i < n: x*i > 0) ∧ (A k: m ≤ k < n: y*k < 0)

At times, for convenience a predicate will be written that does not follow
restriction (4.3.3). In this case, be sure to view each quantified identifier
as being used nowhere else in the world. Think of the two different uses
of the same identifier as different identifiers.
Let us now formally define the terms free and bound, based on the
structure of expressions.

(4.3.4) Definition (of a free identifier i in an expression).

1. i is free in the expression consisting simply of i.
2. i is free in expression (E) if it is free in E.
3. i is free in expression op E, where op is a unary operator (e.g.
¬, -), if it is free in E.
4. i is free in expression E1 op E2, where op is a binary operator
(e.g. ∨, +), if it is free in E1 or E2 (or both).
5. i is free in expressions (A j: m ≤ j < n: E), (E j: m ≤ j < n: E)
and (N j: m ≤ j < n: E) if it is not the same identifier as j and if
it is free in m, n or E.  □

(4.3.5) Definition (of a bound identifier i in an expression).

1. i is bound in expression (E) if it is bound in E.
2. i is bound in expression op E, where op is a unary operator,
if it is bound in E.
3. i is bound in expression E1 op E2, where op is a binary opera-
tor, if it is bound in E1 or E2.
4. i is bound to the quantifier in expression (E i: m ≤ i < n: E)
(and similarly for A and N). The scope of the bound identifier i
is the complete predicate (E i: m ≤ i < n: E).
5. i is bound (but not to the shown quantifier) in expression
(E j: m ≤ j < n: E) if it is bound in m, n or E. Similar state-
ments hold for quantifiers A and N.  □

Note that both x and y are free in the predicate x ≤ y, while x
remains free and y becomes bound when the predicate is embedded in the
expression (N y: 0 ≤ y < 10: x ≤ y) = 4.

Examples. In the predicates given below, each bound occurrence of an
identifier is bound to the quantifier that introduces it, while all other
occurrences are free. Invalid predicates are marked as invalid.

2 ≤ m < n ∧ (A i: 2 ≤ i < m: m ÷ i ≠ 0)

2 ≤ m < n ∧ (A n: 2 ≤ n < m: m ÷ n ≠ 0)    INVALID (why?)

(E i: 1 ≤ i < 25: 25 ÷ i = 0) ∧ (E i: 1 ≤ i < 25: 26 ÷ i = 0)    INVALID

(E t: 1 ≤ t < 25: 25 ÷ t = 0) ∧ (E i: 1 ≤ i < 25: 26 ÷ i = 0)

(E i: 1 ≤ i < 25: 25 ÷ i = 0 ∧ 26 ÷ i = 0)

(A m: n < m < n+6: (E i: 2 ≤ i < m: m ÷ i = 0))

(A m: n < m < n+6: (E n: 2 ≤ n < m: m ÷ n = 0))    INVALID

(A m: n < m < n+6: (E k: 2 ≤ k < m: m ÷ k = 0))    □
The scope mechanism being employed here is similar to the ALGOL
60 scope mechanism (which is also used in Pascal and PL/I). Actually,
its use in the predicate calculus came first. A phrase (A i: R: E) intro-
duces a new level of nomenclature, much like a procedure declaration
"proc p(i); begin ... end" does. Inside the phrase, one can refer to all
variables used outside, except for i; these are global identifiers of the
phrase. The part A i is a "declaration" of a new local identifier i.
As in ALGOL 60, the name of a local identifier has no significance
and can be changed systematically without destroying the meaning. But
care must be taken to "declare" bound identifiers in the right place to get
the intended meaning.
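The same scope discipline appears in modern languages too. In Python 3, for instance, the identifier introduced by a quantifier-like generator expression is local to it, exactly like a bound identifier; this sketch of the analogy is ours, not the book's:

```python
i = 5  # a free (global) occurrence of i

# The i below is bound: the generator expression "declares" a fresh
# local identifier i, and it does not disturb the outer i.
result = all(i * i >= 0 for i in range(-3, 3))

print(result)  # True
print(i)       # 5: the outer i is untouched by the bound i
```

Renaming the bound identifier (say to j) leaves both printed values unchanged, mirroring the remark that the name of a local identifier has no significance.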

Exercises for Section 4.3


1. In the following predicates, draw an arrow from each bound identifier to the
quantifier to which it is bound. Indicate the invalid predicates.
(a) (E k: 0 ≤ k < n: P ∧ Hk(T)) ∧ k > 0
(b) (A j: 0 ≤ j < n: Bj ⇒ wp(SLj, R))
(c) (E j: 0 ≤ j < n: (A i: 0 ≤ i < j+1: f(i) < f(j+1)))
(d) (A j: 0 ≤ j < n: Bj ∨ Cj) ∧ (A k: 0 ≤ k < n: Bk ⇒ (E s: 0 ≤ s < n: Cs))
(e) (A j: 0 ≤ j < n: (E t: j+1 ≤ t < m: (A k: 0 ≤ k < n: F(k, t))))

4.4 Textual Substitution


Textual substitution will be used in Part II to provide an elegant and
useful definition of assignment to variables.
Let E and e be expressions and x an identifier. The notation

E^x_e

denotes the expression obtained by simultaneously substituting e for all
free occurrences of x in E (with suitable use of parentheses around e to
maintain precedence of operators).
A simple example is: (x+y)^x_z = (z+y).
Some more examples of textual substitution are given using the follow-
ing predicate E, which asserts that x and all elements of array b[0:n-1]
are less than y:

(4.4.1)  E = x < y ∧ (A i: 0 ≤ i < n: b[i] < y).

We have

(4.4.2)  E^x_z = z < y ∧ (A i: 0 ≤ i < n: b[i] < y).

(4.4.3)  E^y_{x+y} = x < x+y ∧ (A i: 0 ≤ i < n: b[i] < x+y).

(4.4.4)  E^i_z = E   (only free occurrences of i are replaced,
and i is not free in E)

(4.4.5)  (E^y_{w*z})^z_{a+u} = (x < w*z ∧ (A i: 0 ≤ i < n: b[i] < w*z))^z_{a+u}
                            = x < w*(a+u) ∧ (A i: 0 ≤ i < n: b[i] < w*(a+u))

Example (4.4.2) shows the replacement of free identifier x by identifier z;
(4.4.3) the replacement of a free identifier by an expression. Example
(4.4.4) illustrates that only free occurrences of an identifier are replaced.
Example (4.4.5) shows two successive substitutions and the introduction of
parentheses around the expression being inserted. In the second substitu-
tion of (4.4.5), z is being replaced by a+u, so that w*z should be
changed to w*(a+u) and not w*a+u, which, because of our precedence
conventions, would be viewed as (w*a)+u. (If we always fully
parenthesized expressions or used prefix notation, the need for this extra
parenthesization would not arise.)
Substitution has already been used, but with a different notation. If
we consider E of (4.4.1) to be a function of identifier x, E(x), then E^x_z is
equivalent to E(z). The new notation describes both the identifier being
replaced and its replacement. Therefore, an English description is not
needed to indicate the identifier being replaced.
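The remark that substitution is just function application can be made concrete: if a predicate is represented as a Python function of its free identifiers, then substituting z for x is simply applying the function to z. This is a sketch of ours; n and b are fixed values chosen for illustration.

```python
n = 3
b = [1, 2, 4]

# E, viewed as a function of its free identifiers x and y:
# E(x, y) = x < y and (A i: 0 <= i < n: b[i] < y)
def E(x, y):
    return x < y and all(b[i] < y for i in range(n))

z, y = 0, 5
# E with z substituted for x, written E(z) in the functional notation,
# is just the application E(z, y)
print(E(z, y))  # True:  0 < 5, and 1, 2, 4 are all < 5
print(E(7, y))  # False: 7 < 5 fails
```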
There are some problems with textual substitution as just defined,
which we illustrate with some examples. First, E^b_{c+1} would not make
sense because it would result in ... c+1[i] ..., which is syntactically
incorrect. Textual replacement must result in a well-formed expression.
Secondly, suppose we want to indicate that identifier x and the array
elements of b are all less than y-i, where i is a program variable. Not-
ing the similarity between this assertion and E, (4.4.1), we try to write this
by replacing y in E by y-i:

E         =  x < y ∧ (A i: 0 ≤ i < n: b[i] < y)
E^y_{y-i} =  x < y-i ∧ (A i: 0 ≤ i < n: b[i] < y-i).

But this is not the desired predicate, because the i in y-i has become
bound to the quantifier A, since it now occurs within the scope of A.
Care must be taken to avoid such "capturing" of an identifier in the
expression being substituted. To avoid this conflict we can call for first
(automatically) replacing identifier i of E by a fresh identifier k (say), so
that we arrive at

E^y_{y-i} = x < y-i ∧ (A k: 0 ≤ k < n: b[k] < y-i).

Let us now define textual substitution more carefully:

(4.4.6) Definition. The notation E^x_e, where x is an identifier and E and
e expressions, denotes the predicate created by simultaneously
replacing every free occurrence of x in E by e. To be valid, the
substitution must yield a syntactically correct predicate. If the
substitution would cause an identifier in e to become bound, then
a suitable replacement of bound identifiers in E must take place
before the substitution in order to avoid the conflict.  □

The following two lemmas are stated without proof, for they are fairly
obvious:

(4.4.7) Lemma. E^x_x = E.  □

(4.4.8) Lemma. If y is not free in E, then (E^x_y)^y_e = E^x_e.  □

Simultaneous substitution
Let x̄ denote a list (vector) of distinct identifiers:

x̄ = x1, x2, ..., xn

Let ē be a list (of the same length as x̄) of expressions. Then simultane-
ous substitution of all occurrences of the xi by the corresponding ei in an
expression E is denoted by

(4.4.9)  E^x̄_ē, or E^{x1, ..., xn}_{e1, ..., en}

The caveats placed on simple substitution in definition 4.4.6 apply here
also. Here are some examples.

(a + a + b)^{a,b}_{a+b, c} = a+b + a+b + c

(x + x + y)^{x,y}_{x+y, z} = x+y + x+y + z

(A i: 0 ≤ i < n: b(i) ∨ c(i+1))^{n,b}_{n+i, d}
    = (A k: 0 ≤ k < n+i: d(k) ∨ c(k+1))

The second example illustrates the fact that the substitutions must be
simultaneous; if one first replaces all occurrences of x and then replaces
all occurrences of y, the result is x+z + x+z + z, which is not the same.
In general, E^{x,y}_{e1,e2} can be different from (E^x_{e1})^y_{e2}.
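The difference between simultaneous and sequential replacement shows up even with plain string rewriting; a small Python sketch of the second example above:

```python
import re

E = "x + x + y"
subst = {"x": "x+y", "y": "z"}

# Simultaneous: every identifier is looked up once, in the original E
simultaneous = re.sub(r"[a-z]+", lambda m: subst.get(m.group(), m.group()), E)

# Sequential: replace all x first, then all y; the y's introduced by the
# first step are wrongly rewritten by the second
sequential = E.replace("x", "x+y").replace("y", "z")

print(simultaneous)  # x+y + x+y + z
print(sequential)    # x+z + x+z + z
```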

Exercises for Section 4.4


1. Consider the predicate E: (A i: 0 ≤ i < n: b[i] < b[i+1]). Indicate which of
the following textual substitutions are invalid and perform the valid ones.

E^i_j,  E^n_{n+1},  E^b_c,  E^n_{n+i},  E^b_{b+1},  E^{n,b}_{m,k}

2. Consider the predicate E: n > i ∧ (N j: 1 ≤ j < n: n ÷ j = 0) ≥ 1. Indicate
which of the following textual substitutions are invalid and perform the valid
ones.

E^i_j,  E^n_{m+i},  E^j_{j+1},  E^i_k,  (E^n_{n+i})^i_t,  E^{n,i}_{n+i,1}

3. Consider the predicate E = (A i: 1 ≤ i < n: (E j: b[j] = i)). Indicate which of
the following textual substitutions are invalid and perform the valid ones.

EJ, E/:, E~'i


4. Consider the assignment statement x:= x + 1. Suppose that after its execution
we want R: x >0 to be true. What condition, or "precondition", must be true
before execution in order to have R true after? Can you put your answer in
terms of a textual substitution in R?
5. Consider the assignment statement a:= a*b. Suppose that after its execution
we want R: a*b = c to be true. What condition, or "precondition", must be true
before execution in order to have R true after? Can you put your answer in
terms of a textual substitution in R?
6. Define textual substitution recursively, based on the structure of an expression.

4.5 Quantification Over Other Ranges


Until now, we have viewed the predicate (E i: m ≤ i < n: Ei) as an
abbreviation for Em ∨ ... ∨ En-1. The notion of quantification is now
generalized to allow quantification over other ranges, including infinite
ones. This results in a system with more "power"; we will be able to
make assertions that were previously not possible. But predicates with
infinite ranges cannot always be computed by a general method in a finite
amount of time. Hence, although such predicates may be used heavily in
discussing programs, they won't appear in programs.
A predicate can have the form

(4.5.1)  (E i: R: E)  or

(4.5.2)  (A i: R: E),

where i is an identifier and R and E are predicates (usually, but not
necessarily, containing i). The first has the interpretation "there exists a
value of i in range R (for which R is true) for which E is true". The
second has the interpretation "for all values of i in range R, E is true".
The notions of free and bound identifiers and the restrictions on their
occurrence in predicates, as given in section 4.3, hold here in the same
manner and will not be discussed further.

Example 1. Let Person(p) represent the sentence "p is a person". Let
Mortal(x) represent the sentence "x is mortal". Then the sentence "All
men are mortal", or, less poetically but more in keeping with the times,
"All persons are mortal", can be expressed by (A p: Person(p):
Mortal(p)).  □

Example 2. It has been proved that arbitrarily large primes exist. This
theorem can be stated as follows:

(A n: 0 < n: (E i: n < i: prime(i))), where

prime(i) = (1 < i ∧ (A j: 1 < j < i: i mod j ≠ 0))

In fact, Chebyshev proved in 1850 that there is a prime between every
integer and its double, which we state as

(A n: 1 < n: (E i: n ≤ i < 2n: prime(i)))  □
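Chebyshev's theorem can of course only be spot-checked, not proved, by computation; here is a brute-force sketch for small n:

```python
def prime(i):
    # prime(i) = 1 < i and (A j: 1 < j < i: i mod j != 0)
    return 1 < i and all(i % j != 0 for j in range(2, i))

# (A n: 1 < n <= 500: (E i: n <= i < 2n: prime(i)))
ok = all(any(prime(i) for i in range(n, 2 * n)) for n in range(2, 501))
print(ok)  # True
```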

Example 3. The predicate below asserts that the maximum of an integer
and its negation is the absolute value of that integer:

(A n: integer(n): max(n, -n) = abs(n))  □

The type of a quantified identifier


Implicit in our use of (E i: n < i: prime(i)) above is that i has type
integer. However, when dealing with more general ranges this is not
always the case. Consider, for example, the predicate (A p: Person(p):
Mortal(p)), where the range of p is the set of all objects (see example 1
above). Hence, the type of a quantified identifier must be made clear in
some fashion so that the set of values that the identifier ranges over is
unambiguously identified. The formal way to do this is to include the
type as part of the range predicate. This has been done in example 3
above, where the range is integer(n).
Often, however, the text surrounding a predicate and the form of the
predicate itself will identify the type of the quantified identifier, making it
unnecessary to give it explicitly in the predicate.
The range can even be omitted completely when it can be determined
from the context; this is just the usual attempt to suppress unnecessary
details. For example, the predicate in example 3 could have been written
as

(A n: max(n, -n) = abs(n))

since the context indicated that only integers were under consideration.

Tautologies and implicit quantification


Suppose a predicate like max(n, -n) = abs(n), where n has type
integer, has been proved to hold in all states: it is a tautology. Then it is
true for all (integer) values of n, so that the following is also true:

(A n: integer(n): max(n, -n) = abs(n))



or, as an abbreviation,

(A n: max(n, -n) = abs(n))

Thus we see that


(4.5.3) any tautology E is equivalent to the same predicate E but with
        all its identifiers i1, ..., im universally quantified, i.e. it is equiv-
        alent to (A i1, ..., im: E).

This simple fact will be useful in chapter 6 in determining how to describe


initial and final values of variables of a program.

Inference rules for A and E


The rest of this section 4.5, which requires knowledge of chapter 3,
need not be read to understand later material. It gives introduction and
elimination rules for A and E, thus extending the natural deduction sys-
tem given in chapter 3. The purpose is to show as briefly as possible how
this can be done.
First, consider a rule for introducing A. For it, we need conditions
under which (A i: R: E) holds in a state s. It will be true in s if R ⇒ E is
true in s, and if the proof of R ⇒ E does not depend on i, so that it is
true for all i. The simplest way to require this condition is to require that
i not even be mentioned in anything that the proof of R ⇒ E depends
upon. Thus, we require that i be a fresh identifier, which occurs nowhere
in proofs that the proof of R ⇒ E depends upon. Thus, the inference rule
is

             R ⇒ E
(4.5.4) A-I: -----------   where i is a fresh identifier.
             (A i: R: E)

Now assume that (A i: R: E) is true in state s. Then it is true for any
value of i, so that R_e^i ⇒ E_e^i holds in state s for any expression e. Thus we
have the elimination rule

             (A i: R: E)
(4.5.5) A-E: --------------   for any expression e
             R_e^i ⇒ E_e^i

Let us now turn to the inference rules for E. Using the techniques of
earlier sections, E can be defined in terms of A:

             ¬(A i: R: ¬E)
(4.5.6) E-I: --------------
             (E i: R: E)

             (E i: R: E)
(4.5.7) E-E: --------------
             ¬(A i: R: ¬E)

A final inference rule allows substitution of one bound variable for


another without changing the value of the predicate:

                                     (E i: R: E)
(4.5.8) bound-variable substitution: ----------------
                                     (E k: R_k^i: E_k^i)
        (provided k does not appear free in R and E)

Exercises for Section 4.5


1. Let fool(p, t) stand for "you can fool person p at time t". Translate the fol-
lowing sentences into the predicate calculus.
(a) You can fool some of the people some of the time.
(b) You can fool all the people some of the time.
(c) You can't fool all the people all the time.
2. Write the following statements as predicates.
(a) The square of an integer is nonnegative.
(b) Three integers are the lengths of the sides of a triangle if and only if the sum
of any two is at least the third (use sides(a, b, c) to mean that a, b and c are
the lengths of the sides of a triangle).
(c) For any positive integer n a solution to the equation w^n + x^n + y^n = z^n
exists, where w, x, y and z are positive integers.
(d) The sum of the divisors of integer n, but not including n itself, is n. (An
integer with this property is called a perfect number. The smallest perfect number
is 6, since 1+2+3=6.)

4.6 Some Theorems About Textual Substitution and States


In general, the two expressions E and E_e^x are not the same; evaluated
in the same state they can yield different results. But they are related.
We now investigate this relation.
Let us first review terminology. If e is an expression and s a state,
then s(e) denotes the value of expression e in state s, found by substitut-
ing the values in s for the identifiers in e and then evaluating the result-
ing constant expression. If an identifier is undefined in s, the symbol V
is used for its value.
We need to be able to talk about a state s' that is the same as state s
except for the value of identifier x (say), which is v in s'. We describe
state s' by the notation

(s; x:v)

For example, execution of the assignment x:= 2 in state s terminates in


the state s' = (s; x:2). In general, execution of the assignment x:= e
beginning in state s terminates in the state (s; x:s(e)), since the value of
expression e in state s is being assigned to x. Note that

s = (s; x:s(x))

holds because the value of x in state s is s(x).

We now give three simple lemmas dealing with textual substitution. For-
mal proofs would rely heavily on the caveats given on textual substitution
in definition (4.4.6), and would be based on the structure of the expres-
sions involved. We give informal proofs.

(4.6.1) Lemma. s(E_e^x) = s(E_{s(e)}^x).


That is, substituting an expression e for x in E and then evaluat-
ing in s yields the same result as substituting the value of e in s
for x and then evaluating.

Proof. Consider evaluating the lefthand side (LHS). Wherever x occurs
in the original expression E, instead of replacing it by its value in s we
must evaluate e in s and use this value, since x has been replaced by e.
This value is s(e). Hence, to evaluate the LHS we can evaluate E in s,
but wherever x occurs use the value s(e). But this is the way the RHS is
evaluated, so the two are the same.   □

The following lemma will be extremely helpful in understanding the


definition of the assignment statement in Part II.

(4.6.2) Lemma. Consider a state s. Let s' = (s; x:s(e)). Then

        s'(E) = s(E_e^x).

In other words, evaluating E_e^x in state s yields the same value as
evaluating E in (s; x:s(e)).

Proof. s'(E) = (s; x:s(e))(E)            (Definition of state s')

             = (s; x:s(e))(E_{s(e)}^x)   (In evaluating E in (s; x:s(e)),
                                          the value s(e) is used for x)

             = (s; x:s(x))(E_{s(e)}^x)   (x does not occur in E_{s(e)}^x, so the
                                          value of E_{s(e)}^x is independent
                                          of the value of x)

             = s(E_{s(e)}^x)             (Since state s = (s; x:s(x)))

             = s(E_e^x)                  (Lemma 4.6.1)   □
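Lemma (4.6.2) can be illustrated concretely. In the Python sketch below a state is a dictionary, (s; x:v) is a dictionary update, and the particular expressions E and e (and the pre-substituted string E_sub) are examples of my own choosing, not from the text.

```python
# A concrete check of Lemma 4.6.2 for E = x + 2*y and x := e with e = y + 1.

def val(state, expr):
    # s(e): evaluate expression string expr in state s.
    return eval(expr, {}, dict(state))

s = {'x': 5, 'y': 6}
e = 'y + 1'                      # the expression assigned to x
E = 'x + 2*y'                    # the expression to evaluate
E_sub = '(y + 1) + 2*y'          # E with e textually substituted for x

s_prime = dict(s, x=val(s, e))   # s' = (s; x:s(e))

# Lemma 4.6.2: s'(E) = s(E with x replaced by e)
assert val(s_prime, E) == val(s, E_sub) == 19
```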

The above lemmas generalize easily to the case of simultaneous substi-


tution, and we will not discuss the matter further. The final lemma, a
trivial but important fact, is stated without proof.

(4.6.3) Lemma. For a list of distinct identifiers x, expression E and a
        list u (of the same length as x) of fresh, distinct identifiers, we
        have
                (E_u^x)_x^u = E   □

Exercises for Section 4.6


1. Let state s contain: x = 5, y = 6, b = T. What are the contents of the follow-
ing states? (s; x:6), (s; y:s(x)), (s; y:s(x+y)), (s; b:F), (s; b:T),
((s; x:6); y:4), ((s; x:y); y:x).
Chapter 5
Notations and Conventions for Arrays

The array is a major feature of our programming languages. It is


important to have the right viewpoint and notation for dealing with
arrays, so that making assertions and reasoning about programs using
them can be done effectively. Traditionally, an array has been considered
to be a collection of subscripted independent variables, which share a
common name. This chapter presents a different view, introduces suitable
notation, and gives examples of its use.
This material is presented here because it discusses notations and con-
cepts needed for reasoning about arrays, rather than the notations used
in the programming language itself. The first two sections will be
needed for defining assignment to array elements in Part II.

5.1 One-dimensional Arrays as Functions


Consider an array defined in Pascal-like notation by

var a: array [1:3] of integer

In PL/I and FORTRAN, this would be written as

DECLARE a(1:3) FIXED;   and

INTEGER a(3)

respectively. Except in older versions of FORTRAN, the lower bound
need not be one; it can be any integer (negative, zero or positive). Zero
is often a more suitable lower bound than one, especially if the range of a
quantified identifier i (say) is written in the form m ≤ i < n. For exam-
ple, suppose an array b is to have n values in it, each being ≥ 2. Giving
b the lower bound 0 and putting these values in b[0], b[1], ..., b[n-1]

allows us to express this as

(A i: 0 ≤ i < n: b[i] ≥ 2)


Throughout this section we will use as an example an array b declared
as

(5.1.1) var b: array [0:2] of integer

Let us introduce some notation. First, sequence notation (see Appen-


dix 2) is used to describe the value of an array. For example, b =
(4, -2, 7) means that b[0] = 4, b[1] = -2 and b[2] = 7. Secondly, for any
array b, b.lower denotes its lower subscript bound and b.upper its upper
bound. For example, for b declared in (5.1.1), b.lower = 0 and
b.upper = 2. Then we define domain(b), the subscript range of an array,
as

domain(b) = {i | b.lower ≤ i ≤ b.upper}


As mentioned earlier, the conventional view is that b declared in
(5.1.1) is a collection of three independent subscripted variables, b[0],
b[1] and b[2], each of type integer. One can refer to a subscripted vari-
able using b[i], where the value of integer expression i is in domain(b).
One can assign value e to subscripted variable b[2] (say) by using an
assignment b[i]:= e where expression i currently has the value 2.
It is advantageous to introduce a second view. Array b is considered
to be a (partial) function: b is a simple variable that contains a function
from subscript values to integers. With this view, b[i] denotes function
application: the function currently in simple variable b is applied to argu-
ment i to yield an integer value, in the same way that abs(i) does.

Remark: On my first encounter with it, this functional view of arrays


bewildered me. It seemed useless and difficult to work with. Only after
gaining experience with it did I come to appreciate its simplicity, elegance
and usefulness. I hope the reader ends up with the same apprecia-
tion.   □

When considering an array as a function, what does the assignment


b[i]:= e mean? Well, it assigns a new function to b, a function that is
the same as the old one except that at argument i its value is e. For
example, execution of b[1]:= 8 beginning with

b[0] = 2, b[1] = 4, b[2] = 6

terminates with b the same as before, except at position 1:

b[0] = 2, b[1] = 8, b[2] = 6.

It is convenient to introduce a notation to describe such altered arrays.

(5.1.2) Definition. Let b be an array (function), i an expression and e
        an expression of the type of the array elements. Then (b; i:e)
        denotes the array (function) that is the same as b except that
        when applied to the value of i it yields e:

        (b; i:e)[j] = e,     if i = j
                    = b[j],  if i ≠ j   □

Notice the similarity between the notation (s; x:v) used in section 4.6
to denote a modified state s and the notation (b; i:e) to denote a modi-
fied array b.

Example 1. Let b[0:2] = (2, 4, 6). Then

(b; 0:8)[0] = 8             (i.e. function (b; 0:8) applied to 0 yields 8)
(b; 0:8)[1] = b[1] = 4      (i.e. (b; 0:8) applied to 1 yields b[1])
(b; 0:8)[2] = b[2] = 6      (i.e. (b; 0:8) applied to 2 yields b[2])

so that (b; 0:8) = (8, 4, 6).   □

Example 2. Let b[0:2] = (2, 4, 6). Then

(b; 1:8) = (2, 8, 6)
(b; 2:8) = (2, 4, 8)
((b; 0:8); 2:9) = (8, 4, 9)
(((b; 0:8); 2:9); 0:7) = (7, 4, 9)   □

Example 2 illustrates nested use of the notation. Since (b; 0:8) is the
array (function) (8, 4, 6), it can be used in the first position of the nota-
tion. Nested parentheses do become burdensome, so we drop them and
rely instead on the convention that rightmost pairs "i:e" are dominant
and have precedence. Thus the last line of example 2 is equivalent to
(b; 0:8; 2:9; 0:7).

Example 3. Let b[0:2] = (2, 4, 6). Then

(b; 0:8; 2:9; 0:7)[0] = 7
(b; 0:8; 2:9; 0:7)[1] = (b; 0:8; 2:9)[1] = (b; 0:8)[1] = b[1] = 4
(b; 0:8; 2:9; 0:7)[2] = (b; 0:8; 2:9)[2] = 9   □
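The altered-array notation can be modelled directly as a pure function update. The Python sketch below (the helper name updated is mine) reproduces Examples 1-3, modelling an array as a dictionary from subscripts to values; note in particular that b itself is unchanged.

```python
def updated(b, i, e):
    # (b; i:e): the same function as b except at i, where it yields e.
    new = dict(b)
    new[i] = e
    return new

b = {0: 2, 1: 4, 2: 6}                           # b[0:2] = (2, 4, 6)
assert updated(b, 1, 8) == {0: 2, 1: 8, 2: 6}    # (b; 1:8) = (2, 8, 6)

# Nested use, rightmost pair dominant: (b; 0:8; 2:9; 0:7) = (7, 4, 9)
c = updated(updated(updated(b, 0, 8), 2, 9), 0, 7)
assert c == {0: 7, 1: 4, 2: 9}
assert b == {0: 2, 1: 4, 2: 6}                   # b itself is unchanged
```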

The assignment statement b[i]:= e can now be explained in terms of
the functional view of arrays; it is simply an abbreviation for the follow-
ing assignment to simple variable b!

b:= (b; i:e)

We now have two conflicting views of arrays: an array is a collection


of independent variables and an array is a partial function. Each view has
its advantages, and, as with the particle and wave theories of light, we
switch back and forth between them, always using the most convenient
one for the problem at hand.
One advantage of the functional view is that it simplifies the program-
ming language, because there is now only one kind of variable, the simple
variable. It may contain a function, which can be applied to arguments,
but function application already exists in most programming languages.
On the other hand, with the collection-of-independent-variables view the
notion of state becomes confused, because a state must map not only
identifiers but also entities like b[1] into values.
In describing b[i]:= e as an abbreviation of b:= (b; i:e) the functional
view is being used to describe the effect of execution, but not how the
assignment is to be implemented. Execution can still be performed using
the collection-of-independent-variables view, by evaluating i and e,
selecting the subscripted variable to assign to, and assigning e to it. It is
not necessary to create a whole new array (b; i:e) and then assign it.
The functional view has other uses besides describing assignment. For
example, for an array c[0:n-1] the assertion

perm((c; 0:x), C)

asserts that c, but with the value x in position 0, is a permutation of
array C. It is clumsy to formally assert this in another fashion.

Simplifying expressions
It is sometimes necessary to simplify expressions (including predicates)
containing the new notation. This can often be done using a two-case
analysis as shown below, which is motivated by definition (5.1.2). The
first step is the hardest, so let us briefly explain it. First, note that either
i = j or i ≠ j. In the former case (b; i:5)[j] = 5 reduces to 5 = 5; in the
second case it reduces to b[j] = 5.

   (b; i:5)[j] = 5
=  (i = j ∧ 5 = 5) ∨ (i ≠ j ∧ b[j] = 5)    (Def. of (b; i:5))
=  (i = j) ∨ (i ≠ j ∧ b[j] = 5)            ((5 = 5) = T, and-simpl.)
=  (i = j ∨ i ≠ j) ∧ (i = j ∨ b[j] = 5)    (Distributivity)
=  T ∧ (i = j ∨ b[j] = 5)                  (Excluded middle)
=  i = j ∨ b[j] = 5                        (and-simpl.)
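The equivalence derived by this two-case analysis can also be confirmed by exhaustive testing over a small space of arrays and subscripts; the ranges in the Python sketch below are arbitrary choices of mine.

```python
from itertools import product

def updated(b, i, e):
    # (b; i:e) for a list-valued array b
    new = list(b)
    new[i] = e
    return new

# All arrays b[0:2] over {4, 5}, and all subscript pairs i, j.
for b in product([4, 5], repeat=3):
    for i, j in product(range(3), repeat=2):
        lhs = (updated(b, i, 5)[j] == 5)     # (b; i:5)[j] = 5
        rhs = (i == j) or (b[j] == 5)        # i = j  or  b[j] = 5
        assert lhs == rhs
```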

Exercises for Section 5.1


1. Let b[1:4] = (2, 4, 6, 8). What are the contents of the following arrays?
(a) (b; 1:3)        (d) (b; 1:b[4]; 2:b[3]; 3:b[2]; 4:b[1])
(b) (b; 1:b[1])     (e) (b; 4:b[4]; 3:b[3]; 2:b[2]; 1:b[1])
(c) (b; 1:b[4])     (f) (b; 1:b[1]; 1:b[2]; 1:b[3]; 1:b[4])
2. Let a state contain i = 2, j = 3 and b[0:5] = (-3, -2, -1, 0, 1, 2). Evaluate
the following:
(a) (b; i:2)[j]       (e) (b; i+j:6)[4]
(b) (b; i+1:2)[j]     (f) (b; i:2; j:3)[j+i-2]
(c) (b; i+2:2)[j]     (g) (b; i:2; j:3)[j+i-1]
(d) (b; i+j:6)[5]     (h) (b; i:2; j-1:3)[i]
3. Simplify the following predicates by eliminating the notation (b; ...).
(a) (b; i:5)[i] = (b; i:5)[j]
(b) (b; i:b[i])[i] = i
(c) (b; i:b[i]; j:b[j])[i] = (b; j:b[j]; i:b[i])[j]
(d) (b; i:b[j]; j:b[i])[i] = (b; j:b[i]; i:b[j])[j]
(e) (b; i:b[j]; j:b[j])[i] = (b; i:b[i]; j:b[j])[j]
(f) (b; i:b[i])[j] = (b; j:b[j])[j]
(f) (b; i:b[i])(j]=(b;j:b(j])(J]
4. The programming language Pascal contains the type record, which allows one
to build a new type consisting of a fixed number of components (fields) with other
types. For example, the Pascal-like declarations

type t: record n: array [0:10] of char; age: integer end;

var p, q: t

define a type t and two variables p and q with type t. Each variable contains
two fields; the first is named n and can contain a string of 0 to 10 characters
(e.g. a person's name) and the second is named age and can contain an
integer. The following assignments indicate how the components of p and q can
be assigned and referenced. After their execution, both p and q contain 'Hehner'
in the first component and 32 in the second. Note how q.age refers to field age
of record variable q.

p.n:= 'Hehner'; p.age:= 32; q.n:= p.n; q.age:= q.age + 1 - 1

An array consists of a set of individual values, all of the same type (the old
view). A record consists of a set of individual values, which can be of different
types. In order to allow components to have different types we have sacrificed
some flexibility: components must be referenced using their name (instead of an
expression). Nevertheless, arrays and records are similar.
Develop a functional view for records, similar to the functional view for arrays
just presented.

5.2 Array Sections and Pictures


Given integer expressions e1 and e2 satisfying e1 ≤ e2+1, the notation
b[e1:e2] denotes array b restricted to the range e1:e2. Thus, for an array
declared as

var b: array [0:n-1] of integer

b[0:n-1] denotes the whole array, while if 0 ≤ i ≤ j < n, b[i:j] refers to
the array section composed of b[i], b[i+1], ..., b[j]. If i = j+1, b[i:j]
refers to an empty section of b.
Quite often, we have to assert something like "all elements of array b
are less than x", or "array b contains only zeroes". These might be writ-
ten as follows.

(A i: 0 ≤ i < n: b[i] < x)
(A i: 0 ≤ i < n: b[i] = 0)
Because such assertions occur so frequently, we abbreviate them; these
two assertions would be written as b < x and b = 0, respectively. That is,
the relational operators denote element-wise comparison when applied to
arrays. Here are some more examples, using arrays b[0:n-1] and
c[0:n-1] and simple variable x.

Abbreviation                  Equivalent predicate

b[1:5] = x                    (A i: 1 ≤ i ≤ 5: b[i] = x)
b[6:10] ≠ x                   (A j: 6 ≤ j ≤ 10: b[j] ≠ x)
b[0:k-1] < x < b[k:n-1]       (A i: 0 ≤ i < k: b[i] < x) ∧
                              (A i: k ≤ i < n: x < b[i])
b[i:j] ≤ b[j:k]               (A p, q: i ≤ p ≤ j ≤ q ≤ k: b[p] ≤ b[q])
¬(b[6:10] ≠ x)                ¬(A j: 6 ≤ j ≤ 10: b[j] ≠ x)
                              = (E j: 6 ≤ j ≤ 10: ¬(b[j] ≠ x))
                              = (E j: 6 ≤ j ≤ 10: b[j] = x)

Be very careful with = and ≠, for the last example shows that b = y can
be different from ¬(b ≠ y)! Similarly, b ≤ y can be different from
¬(b > y).
We also use the notation x ∈ b to assert that the value of x is equal to
(at least) one of the values b[i]. Thus, using domain(b) to represent the
set of subscript values for b, x ∈ b is equivalent to

(E i: i ∈ domain(b): x = b[i])
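For readers who wish to experiment, these element-wise abbreviations map directly onto Python's all and any; the sample array and values below are my own. The last lines illustrate the warning about = and ≠.

```python
b = [3, 1, 4, 1, 5]     # stands for b[0:4]
x = 6

assert all(v < x for v in b)          # b < x
assert not all(v == 0 for v in b)     # not (b = 0)
assert any(v == 4 for v in b)         # 4 in b, i.e. (E i: ...: 4 = b[i])

# Caution from the text: b = y need not equal not (b != y).
y = 1
b_eq_y = all(v == y for v in b)           # b = y : every element equals y
not_b_ne_y = not all(v != y for v in b)   # not (b != y): some element equals y
assert b_eq_y != not_b_ne_y               # here the two differ
```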
Such abbreviations can make program specification (and understand-
ing the specification later) easier. However, when developing a program
to meet the specification it is often advantageous to expand the abbrevia-
tions into their full form, because the full form can give more insight into
program development. In a sense, the abbreviations are a form of
abstraction; they let us concentrate on what is meant, while how that
meaning is formally expressed is put aside for the moment. This is similar
meaning is formally expressed is put aside for the moment. This is similar
to procedural abstraction; when writing a call of a procedure we concen-
trate on what the procedure does, and how it is implemented does not
concern us at the moment.

Array pictures
Let us now turn to a slightly different subject, using pictures for some
predicates that describe arrays. Suppose we are writing a program to sort
an array b[0:n-1], with initial values B[0:n-1], i.e. initially b = B.
We want to describe the following conditions:

(1) b[0:k-1] is sorted and all its elements are at most x,
(2) the value that belongs in b[k] is in simple variable x,
(3) every value in b[k+1:n-1] is at least x.

To express this formally, we write

(5.2.1) 0 ≤ k < n ∧ ordered(b[0:k-1]) ∧ perm((b; k:x), B) ∧
        b[0:k-1] ≤ x ≤ b[k+1:n-1]

where
        ordered(b[0:k-1]) = (A i: 0 ≤ i < k-1: b[i] ≤ b[i+1])

and the notation perm(X, Y) means "array X is a permutation of array
Y".
Looks complicated, doesn't it? Because such assertions occur fre-
quently when dealing with arrays, we introduce a "picture" notation to
present them in a manner that allows easier understanding. We replace
assertion (5.2.1) by

                0           k-1  k  k+1        n-1
0 ≤ k < n  ∧  b | ordered, ≤x  |   |    ≥x       |  ∧  perm(B, (b; k:x))

The second term describes the current partitioning of b in a straightfor-
ward manner. The array name b appears to the left of the picture. The
properties of each partition of the array are written inside the box for that
partition. The lower and upper bounds are given at the top for a parti-
tion whose size is ≥ 0, while just the subscript value is given for a parti-
tion known to be of size 1 (like b[k:k]). Bounds may be omitted if the

picture is unambiguous; the above picture can be written in at least two
other ways:

    0           k       n-1              0          k-1 k+1      n-1
b | ordered, ≤x |   ≥x    |    and    b | ordered, ≤x |  |   ≥x    |

Note that some of the partitions may be empty. For example, if k = 0,
b[0:k-1] is empty and the picture reduces to

    k  k+1      n-1
b |   |    ≥x     |

while if k = n-1 the section b[k+1:n-1] is empty. One disadvantage of
such pictures is that they often cause us to forget about singular cases.
We unconsciously think that, since section b[0:k-1] is in the picture, it
must contain something. So use such pictures with care.
An essential property of such pictures is that the formal definition of
assignment (given later in Part II) is useable on pictures when they appear
in predicates. This will be discussed in detail in Part II.

Exercises for Section 5.2


1. Redo exercise 6 of section 4.2, using the abbreviations introduced in this sec-
tion.
2. Translate the following predicates into the picture notation.
(a) 0 ≤ p ≤ q+1 ≤ n ∧ b[0:p-1] ≤ x < b[q+1:n]
(b) 0 ≤ k-1 ≤ f ≤ h-1 < n ∧
    1 = b[1:k-1] ∧ 2 = b[k:f-1] ∧ 3 = b[h+1:n]
3. Change the following predicates into equivalent ones that don't use pictures.

                        0      k      h       n
(a) 0 ≤ k ≤ h ≤ n  ∧  b | ≤x  |  =x  |  ≥x   |

                        0         i      n
(b) 0 ≤ i < n  ∧  b | ordered |        |

5.3 Handling Arrays of Arrays of ...


This section may be skipped on first reading.
The Pascal declaration

(5.3.1) var b: array [0:1] of array [1:3] of integer

defines an array of arrays. That is, b[0] (and similarly b[1]) is an array
consisting of three elements named b[0][1], b[0][2] and b[0][3]. One can
also have an "array of arrays of arrays", in which case three subscripts
could be used (e.g. d[i][j][k]) and so forth.
Arrays of arrays take the place of two-dimensional arrays in FOR-
TRAN and PL/I. For example, (5.3.1) could be thought of as equivalent
to the PL/I declaration

DECLARE b(0:1, 1:3) FIXED;

because both declarations define an array that can be thought of as two-
dimensional:

b[0][1]  b[0][2]  b[0][3]        b[0,1]  b[0,2]  b[0,3]
                            or
b[1][1]  b[1][2]  b[1][3]        b[1,1]  b[1,2]  b[1,3]

We now extend the notation (b; i:e) to allow a sequence of subscripts
in the position where i appears, for the following reason. If the assign-
ment c[i]:= 2 is equivalent to c:= (c; i:2), then the assignment b[i][j]:= 3
should be equivalent to b:= (b; [i][j]:3), where brackets are placed
around each of the subscripts i and j in order to have an easy-to-read
notation.
We need to be able to refer to sequences of subscript expressions
(enclosed in brackets), like [i], [i+1][j] and [i][j][k]. We introduce some
terminology to make it easier. The term selector denotes a finite sequence
of subscript expressions, each enclosed in brackets. The null selector
(the sequence containing 0 subscripts) is written as ε. The null selector
enjoys a nice property; it is the identity element of the operation catena-
tion on sequences (see the remark following definition (4.2.3)). That is,
using ∘ to denote catenation, for any identifier or selector s we have s ∘ ε
= s. Any reference to a variable, simple or subscripted, now consists
of an identifier catenated with a selector; the reference x to a simple vari-
able is really x ∘ ε.

Example 1. b ∘ ε is identifier b followed by the null selector. It refers to
the complete array b.
b[0] consists of identifier b catenated with the selector [0]. For b
declared in (5.3.1), it refers to the array (b[0][1], b[0][2], b[0][3]).
b[i][j] consists of identifier b followed by the selector [i][j]. For b
declared in (5.3.1), it refers to a single integer.   □

We want to define the notation (b; s:e) for any selector s. We do this
recursively on the length of s. The first step is to determine the base case,
(b; ε:e).
Let x be a simple variable (which contains a scalar or function). Since
x and x ∘ ε are equivalent, the assignments x:= e and x ∘ ε:= e are also
equivalent. But, by our earlier notation the latter should be equivalent to
x:= (x; ε:e). Therefore, the two expressions e and (x; ε:e) must yield
the same value, and we have

e = (x; ε:e)

With this insight, we define the notation (b; s:e).

(5.3.2) Definition. Let b and g be functions or variables of the same
        type. Let s be a suitable selector for b. The notation (b; s:e)
        for a suitable expression e is defined by

        (b; ε:g) = g

        (b; [i]∘s:e)[j] = b[j],          if i ≠ j
                        = (b[j]; s:e),   if i = j   □

Example 2. In this and the following examples, let c[1:3] = (6, 7, 8) and
b[0:1][1:3] = ((0, 1, 2), (3, 4, 5)). Then

(c; ε:b[1]) = b[1], so that

(c; ε:b[1])[2] = b[1][2] = 4.   □

Example 3. (c; 1:3)[1] = (c; [1]∘ε:3)[1] = (c[1]; ε:3) = 3.
(c; 1:3)[2] = (c; [1]∘ε:3)[2] = c[2] = 7.
(c; 1:3)[3] = (c; [1]∘ε:3)[3] = c[3] = 8.   □

Example 4. (b; [1][3]:9)[0] = b[0] = (0, 1, 2).
(b; [1][3]:9)[1] = (b[1]; [3]:9) = (3, 4, 9).   □
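Definition (5.3.2) translates almost literally into a recursive function. In the Python sketch below a selector is a list of subscripts, ε is the empty list, and arrays of arrays are nested dictionaries; the helper name is my own choice.

```python
def updated(b, selector, e):
    # (b; s:e) by recursion on the length of selector s.
    if not selector:                   # (b; eps:e) = e
        return e
    i, rest = selector[0], selector[1:]
    new = dict(b)                      # i != j positions keep b[j]
    new[i] = updated(b[i], rest, e)    # i = j position recurses into b[i]
    return new

# b[0:1][1:3] = ((0,1,2), (3,4,5)) from Example 2
b = {0: {1: 0, 2: 1, 3: 2}, 1: {1: 3, 2: 4, 3: 5}}

# (b; [1][3]:9) from Example 4
b2 = updated(b, [1, 3], 9)
assert b2[0] == {1: 0, 2: 1, 3: 2}     # b[0] unchanged: (0, 1, 2)
assert b2[1] == {1: 3, 2: 4, 3: 9}     # (3, 4, 9)
```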

Again, all but the outer parentheses can be omitted. For example, the
following two expressions are equivalent. They define an array (function)
that is the same as b except in three positions: [i][j], [j] and [k][i].

(((b; [i][j]:e); [j]:f); [k][i]:g)   and

(b; [i][j]:e; [j]:f; [k][i]:g).

Exercises for Section 5.3


1. Exercise 4 of section 5.1 was to develop a functional view of records. One can
also have arrays of records and records of arrays. For example, the following
Pascal-like declarations are valid.

type t: record x: integer; y: array [0:10] of integer end;

var b: array [0:n-1] of t

Modify the notation of this section to allow references to subrecords of arrays and
subarrays of records, etc.
Chapter 6
Using Assertions To Document Programs

This chapter introduces the use of predicates as assertions for docu-


menting programs in an informal manner, thus paving the way for the
more formal treatment given in Parts II and III.

6.1 Program Specifications


A program specification must describe exactly what execution of a
program is to accomplish. Another part of a specification might also deal
with speed, size, and so forth, but for now we will concentrate on the part
that describes only the what.
One way to specify a program is to give a high-level, English command
for it. For example, the following specifies a program to multiply two
non-negative integer variables.

(6.1.1) Store in z the product a*b, assuming a and b are initially ≥ 0.

When written as a comment for a program segment within a program,


(6.1.1) is called a command-comment; it is a comment that is a statement
or command to perform some action.
A naive programmer might think that a program that sets a, b and z
to 0 satisfies (6.1.1), because (6.1.1) does not indicate that a and b should
not be changed. The wise programmer, however, remembers that a pro-
gram should do nothing more than is explicitly required, and also nothing
less. Since the specification does not mention changing a and b, they
should not be changed. A command-comment like (6.1.1) should define
precisely all input and output variables; nothing is more useless than a
partial specification that cannot be understood without reading the pro-
gram itself. For example, the specification

Multiply a and b together

does not indicate where the result of the multiplication should be stored,
and hence it cannot be understood in isolation, as it should be.
English can be ambiguous, so we often rely on more formal specifica-
tion techniques. The notation

(6.1.2) {Q} S {R}

where Q and R are predicates and S is a program (sequence of com-


mands), has the following interpretation:

(6.1.3) If execution of S is begun in a state satisfying Q, then it is
        guaranteed to terminate in a finite amount of time in a state
        satisfying R.   □

Q is called the precondition or input assertion of S; R the postcondition,


output assertion or result assertion. The braces { and } around the asser-
tions are used to separate the assertions from the program itself.
Note that nothing is said about execution beginning in a state that
does not satisfy Q; the specification deals only with some initial states. If
the program is to deal with all possible initial states, for example by print-
ing error messages for erroneous input, then these cases form part of the
specification and must be covered by the predicates Q and R .
Note also that termination is guaranteed to happen in a finite amount
of time (provided, of course, that execution continues).
Finally, we stress the fact that (6.1.2) is itself a predicate (a statement
that is either true or false), which we usually want to be true. When
writing a program S to satisfy (6.1.2), it is our business to prove in some
fashion that {Q} S {R} does indeed hold. Part II will describe how to
write such a predicate in the predicate calculus introduced in earlier sec-
tions and to formally prove that it is a tautology.
As an example of the use of notation (6.1.2), we write specification
(6.1.1) in it. Note the use of the label R in the postcondition to give the
postcondition a name:

(6.1.4) {0 ≤ a ∧ 0 ≤ b} S {R: z = a*b}

Unfortunately, (6.1.4) does not indicate which variables should be
changed, and in fact the program segment z:= 0; a:= 0; b:= 0 satisfies it.
Typically, we use common sense and English to rectify the problem (but
see also section 6.2). And we often use a mixture of the command-
comment and the formal notation (6.1.2) in the following standardized
form:

(6.1.5) Given fixed a, b ≥ 0, establish (the truth of) R: z = a*b.

The precondition of the program is given, the fixed variables, which must
not be changed, are listed and the postcondition is to be established.
Here are some more examples of specifications (all variables are
integer valued).

Example 1 (array summation). Given are fixed n ≥ 0 and fixed array
b[0:n-1]. Establish

R: s = (Σ i: 0 ≤ i < n: b[i]).   □
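The interpretation (6.1.3) suggests a test-based, though necessarily incomplete, reading of such a specification: run a candidate program S from several states satisfying Q and check that R holds afterward. The candidate program in the Python sketch below is my own; testing can refute {Q} S {R} but never prove it, which is why Part II develops a proof method instead.

```python
def S(b):
    # candidate program: establish  s = (sum i: 0 <= i < n: b[i])
    s = 0
    for v in b:
        s = s + v
    return s

# Several states satisfying Q (arrays with n >= 0); check R after each run.
for b in ([], [7], [1, -2, 3], list(range(10))):
    assert S(b) == sum(b)          # postcondition R
```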

Example 2 (square root approximation). Given fixed integer n ≥ 0, store
in s an approximation to the square root of n; i.e. establish

R: 0 ≤ s ∧ s² ≤ n < (s+1)².   □
Example 3 (sorting). Given fixed n ≥ 0 and array b[0:n-1], sort b, i.e.
establish

R: (A i: 0 ≤ i < n-1: b[i] ≤ b[i+1]).   □

Again, there is a problem with this specification; the result can be esta-
blished simply by setting all elements of b to zeroes. This problem can be
overcome by including a comment to the effect that the only way to alter
b is to swap two of its elements.
Naturally, with large, complex problems there may be difficulty in
specifying programs in this simple manner, and new notation may have to
be introduced to cope with the complexity. But for the most part, the
simple specification forms given above will suffice. Even a compiler can
be specified in such a notation, by judicious use of abstraction:

{Pascal program(p)}
    compiler
{IBM 370 program(q) ∧ equivalent(p, q)}

where the predicates Pascal program, IBM 370 program and equivalent
must be defined elsewhere.

6.2 Representing Initial and Final Values of Variables


The program

swap: t:= x; x:= y; y:= t

swaps or exchanges the values of integer variables x and y, using a


"local" variable t. In order to state formally what swap does, we need a
way to describe the initial and final values of x and y. To do this, we use
identifiers X and Y:

(6.2.1) {x = X ∧ y = Y} swap {x = Y ∧ y = X}

Now, we are asserting that (6.2.1) is always true; it is a tautology. Recall


from section 4.5 that a tautology with free identifiers is equivalent to the
same predicate but with all previously free identifiers universally bound.
That is, (6.2.1) is equivalent to

(6.2.2) (A X, Y: {x = X ∧ y = Y} swap {x = Y ∧ y = X})


and actually to

(A X, Y, x, y: {x = X ∧ y = Y} swap {x = Y ∧ y = X})
(6.2.2) can be read in English as follows: for all (integer) values of X and
Y, if initially x = X and y = Y, then execution of swap establishes x = Y
and y =X.
X and Y denote the initial values of variables x and y, but they also denote the final values of y and x. An identifier can denote either an initial or a final value, or even a value upon which the initial or final value depends. For example, the following is also a specification of swap, although it is not as easy to understand:

{x = X+1 ∧ y = Y-1} swap {x = Y-1 ∧ y = X+1}.

Generally, we will use capital letters in identifiers that represent initial and
final values of program variables, and small letters for identifiers that
name variables in a program.
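Tautology (6.2.2) quantifies over all values of X and Y. As an informal spot check (a sketch in Python, not part of the text's formal apparatus), one can run swap from many initial states and test the postcondition:

```python
# Sketch: spot-checking tautology (6.2.2) by running swap from many initial
# values X and Y and testing the postcondition.
def swap_holds(X, Y):
    x, y = X, Y        # establish the precondition x = X and y = Y
    t = x              # swap: t:= x; x:= y; y:= t
    x = y
    y = t
    return x == Y and y == X   # the postcondition x = Y and y = X

# (6.2.2) quantifies over all X and Y; we can only sample finitely many.
assert all(swap_holds(X, Y) for X in range(-5, 6) for Y in range(-5, 6))
```

Passing such a check does not prove (6.2.2); the formal proof method comes in Part II.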
As a final example, we specify a sort program again, this time using an
extra identifier to alleviate the problem mentioned in example 3 of section
6.1. The predicate perm (c, C) has the meaning "array c is a permutation
of array C, i.e. a rearrangement of C". See exercise 5 of section 4.2.

Example 1 (sorting). Given fixed n ≥ 0 and array c[0:n-1] with c = C, establish

R: perm(c, C) ∧ (A i: 0 ≤ i < n-1: c[i] ≤ c[i+1]).
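This strengthened specification can be checked mechanically. The sketch below (hypothetical helper names perm and R, with Python lists standing in for arrays) shows that the all-zero result that satisfied the weaker specification of section 6.1 is now rejected:

```python
# Sketch (hypothetical helpers perm and R; Python lists stand for arrays):
# the permutation conjunct rules out the all-zero "solution".
from collections import Counter

def perm(c, C):
    # c is a rearrangement of C: same values with the same multiplicities.
    return Counter(c) == Counter(C)

def R(c, C):
    ascending = all(c[i] <= c[i + 1] for i in range(len(c) - 1))
    return perm(c, C) and ascending

C = [3, 1, 2]
assert R(sorted(C), C)        # a genuine sort establishes R
assert not R([0, 0, 0], C)    # zeroing the array no longer does
```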

Exercises for Section 6.2


1. Write specifications for the following problems. Put them in the form used in
example 1 above (sorting) and also in the form {Q} S {R}. The problems may
be vaguely stated, so you may have to use common sense and your experience to
derive a precise specification.
(a) Set x to the maximum value in array b[0:n-1].
(b) Set x to the absolute value of x.
(c) Find the position of a maximum value in array b[0:n-1].
(d) Find the position of the first maximum value in b[0:n-1].
(e) Tell whether a given integer that is greater than 1 is prime. (An integer > 1 is
prime if it is divisible only by 1 and itself.)
(f) Find the nth Fibonacci number fₙ. The Fibonacci numbers are defined by
f₀ = 0, f₁ = 1 and, for n > 1, fₙ = fₙ₋₁ + fₙ₋₂. Thus, the Fibonacci number
sequence begins with (0, 1, 1, 2, 3, 5, 8).
(g) Tell whether integer array b[0:n-1] is sorted (is in ascending order).
(h) Set each value of array b[0:n-1] to the sum of the values in b.
(i) Let c[0:n-1] be the list of people teaching at Cornell and w[0:m-1] be the
list of people on welfare in Ithaca. Both lists are alphabetically ordered. It is
known that at least one person is on both lists. Find the first such person!
(j) The same problem as (i), except that there are three lists: c, the Cornellians;
w, those on welfare; and m, those making money consulting for the federal
government.
(k) Consider a two-dimensional array g[0:n-1, 0:3]. g[i,0], g[i,1], g[i,2]
and g[i,3] are the grades for student i in his courses this semester, with
A = 4.0, B = 3.0, C = 2.0, etc. Let name[0:n-1] contain the names of the
students. Find the student with the highest average. You may use "real
variables", which can contain floating point numbers.

6.3 Proof Outlines


We have shown how to write a predicate (within braces) before and after a program in order to assert what is to be true before and after execution. In the same manner, a predicate may appear between two statements in order to show what must be true at that point of execution. For example, here is a complete formulation of program swap, which swaps (exchanges) the values of two variables x and y, using a local variable t.

{x = X ∧ y = Y}
t:= x;
{t = X ∧ x = X ∧ y = Y}
x:= y;
{t = X ∧ x = Y ∧ y = Y}
y:= t
{y = X ∧ x = Y}

The reader can informally verify that, for each statement of the program, if its precondition -the predicate in braces preceding it- is true, then execution of the statement terminates with its postcondition -the predicate in braces following it- true.
A predicate placed in a program is called an assertion; we assert it is
true at that point of execution. A program together with an assertion
between each pair of statements is called a proof outline, because it is just
that; it is an outline of a formal proof, and one can understand that the
program satisfies its specification simply by showing that each triple
(precondition, statement, postcondition) satisfies {precondition} statement
{postcondition}. The formal proof method is described in Part II.
Placing assertions in a program for purposes of documentation is often
called annotating the program, and the final program is also called an
annotated program.
Below is a proof outline for

{i ≥ 0 ∧ s = 1+2+⋯+i}
i:= i+1; s:= s+i
{i > 0 ∧ s = 1+2+⋯+i}

The proof outline illustrates two new conventions. First, an assertion can be named so that it can be discussed more easily, by placing the name at its beginning followed by a colon. Secondly, adjacent assertions -e.g. {P} {P1}- mean that the first implies the second -e.g. P ⇒ P1. The lines have been numbered solely for reference in a later discussion.

(1) {P: i ≥ 0 ∧ s = 1+2+⋯+i}
(2) {P1: i+1 > 0 ∧ s = 1+2+⋯+(i+1-1)}
(3) i:= i+1;
(4) {P2: i > 0 ∧ s = 1+2+⋯+(i-1)}
(5) {P3: i > 0 ∧ s+i = 1+2+⋯+i}
(6) s:= s+i
(7) {R: i > 0 ∧ s = 1+2+⋯+i}

The above proof outline indicates the following facts, in order:



1. P ⇒ P1 (lines 1, 2)
2. {P1} i:= i+1 {P2} (lines 2, 3, 4)
3. P2 ⇒ P3 (lines 4, 5)
4. {P3} s:= s+i {R} (lines 5, 6, 7)

Together, these give the desired result: execution of i:= i+1; s:= s+i begun in a state satisfying P terminates in a state satisfying R.
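The whole outline can be mirrored in executable form: run the two assignments from states satisfying P and assert each named assertion in turn (a sketch; sum(range(1, i+1)) stands for 1+2+⋯+i):

```python
# Sketch: the proof outline made executable -- run i:= i+1; s:= s+i from
# states satisfying P and assert each named assertion in turn.
# sum(range(1, i+1)) stands for 1+2+...+i.
def check(i):
    s = sum(range(1, i + 1))                                   # establish P
    assert i >= 0 and s == sum(range(1, i + 1))                # P
    assert i + 1 > 0 and s == sum(range(1, (i + 1 - 1) + 1))   # P1 (P => P1)
    i = i + 1
    assert i > 0 and s == sum(range(1, (i - 1) + 1))           # P2
    assert i > 0 and s + i == sum(range(1, i + 1))             # P3 (P2 => P3)
    s = s + i
    assert i > 0 and s == sum(range(1, i + 1))                 # R

for i0 in range(0, 10):
    check(i0)
```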
The next example illustrates the use of a conditional statement. Note
how the assertion following then is the conjunction of the precondition of
the conditional statement and the test, since this is what is true at that
point of execution. Since both the then-part and the else-part end with
the assertion x =abs(X), this is what we may conclude about execution
of the conditional statement.

{x = X}
if x < 0 then {x = X ∧ x < 0}
    x:= -x
    {x = -X ∧ x > 0} {x = abs(X)}
else {x = X ∧ x ≥ 0}
    skip
    {x = X ∧ x ≥ 0} {x = abs(X)}
{x = abs(X)}
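The conditional's outline can be checked the same way (a sketch; Python's pass plays the role of skip, and abs is the built-in absolute value):

```python
# Sketch: the conditional's proof outline with Python asserts; pass plays
# the role of skip, and abs is Python's built-in absolute value.
def abs_outline(X):
    x = X                                    # {x = X}
    if x < 0:
        assert x == X and x < 0
        x = -x
        assert x == -X and x > 0 and x == abs(X)
    else:
        assert x == X and x >= 0
        pass                                 # skip
        assert x == X and x >= 0 and x == abs(X)
    assert x == abs(X)                       # {x = abs(X)}
    return x

for X in range(-4, 5):
    assert abs_outline(X) == abs(X)
```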

More details on annotating a program will be forthcoming when we


study loops in Part II. However, one point should be made here. It is
not always necessary to give a complete proof outline. Enough assertions
should be inserted to make the program understandable, but not so many
that the program is hidden from view. In general, a good practice is to
insert those assertions that are not so easily determined by the reader, and
to omit those that are.
Part II
The Semantics of
a Small Language

This Part introduces a programming notation and defines it in terms of the notion of a "weakest precondition". The main concern is the statements, or commands, of the notation and how they can be understood. The syntax of declarations and expressions is a secondary concern, and instead of formally defining them we appeal to the reader's knowledge of mathematics and programming. In general, a Pascal-like notation for declarations is used, which the reader should have no trouble understanding. It is understood that each simple variable and expression has a type, usually integer or Boolean, and that variables are considered to be of type integer unless otherwise specified or obvious from the context.
Chapter 7
The Predicate Transformer wp

Our task is to define the commands (statements) of a small language.
This will be done as follows. For any command S and predicate R, which describes the desired result of executing S, we will define another predicate, denoted by wp(S, R), that represents

(7.1) the set of all states such that execution of S begun in any one of
them is guaranteed to terminate in a finite amount of time in a
state satisfying R. □

Let's give some examples for some ALGOL-like commands, based on our
knowledge of how these commands are executed.

Example 1. Let S be the assignment command i:= i+1 and let R be i ≥ 1. Then

wp("i:= i+1", i ≥ 1) = (i ≥ 0)

for if i ≥ 0, then execution of i:= i+1 terminates with i ≥ 1, while if i < 0, execution cannot make i ≥ 1. □

Example 2. Let S be if x ≥ y then z:= x else z:= y and R be z = max(x, y). Execution of S always sets z to max(x, y), so that wp(S, R) = T. □

Example 3. Let S be as in Example 2 and let R be z = y. Then

wp(S, R) = (y ≥ x), for execution of S beginning with y ≥ x sets z to y and execution of S beginning with y < x sets z to x, which is ≠ y. □

Example 4. Let S be as in Example 2 and let R be z = y-1. Then

wp(S, R) = F (the set of no states), for execution of S can never set z less than y. □

Example 5. Let S be as in Example 2 and R be z = y+1. Then

wp(S, R) = (x = y+1), for only then will execution of S set z to y+1. □

Example 6. For a command S, wp(S, T) represents the set of all states such that execution of S begun in any one of them is guaranteed to terminate. □
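For guaranteed-terminating deterministic commands, interpretation (7.1) admits a very direct encoding: a state is in wp(S, R) exactly when R holds of the state S produces. The sketch below (an illustrative encoding, not the book's formalism) reproduces Example 1 by brute force:

```python
# Sketch (illustrative encoding, not the book's formalism): for a
# deterministic, always-terminating command S modelled as a function from
# states to states, a state s lies in wp(S, R) exactly when R(S(s)) holds.
def wp(S, R):
    return lambda s: R(S(s))

# Example 1: S is "i := i+1", R is i >= 1; a state is just the value of i.
S = lambda i: i + 1
R = lambda i: i >= 1
# Over a sample of states, wp(S, R) coincides with the predicate i >= 0.
assert all(wp(S, R)(i) == (i >= 0) for i in range(-10, 11))
```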

In section 6.1, we used the notation {Q} S {R} to mean that execution of S begun in any state satisfying predicate Q would terminate in a state satisfying predicate R. In this context, Q is called the precondition and R the postcondition of S. Similarly, we call wp(S, R) the weakest precondition of S with respect to R, since it represents the set of all states such that execution begun in any one of them will terminate with R true. (See section 1.6 for a definition of weaker and weakest in this context.) We see, then, that the notation {Q} S {R} is simply another notation for

(7.2) Q ⇒ wp(S, R).


Note carefully that {Q} S {R} is really a statement in the predicate calculus, since it is equivalent to Q ⇒ wp(S, R). Thus, it is either true or false in any state. When we write it, we usually mean it to be a tautology -we expect it to be universally true.
A command S is usually designed for a specific purpose -to establish the truth of one particular postcondition R. So we are not always interested in the general properties of S, but only in those pertaining to R. Moreover, even for this R we may not be interested in the weakest precondition wp(S, R), but usually in some stronger precondition Q (say) that represents a subset of the set represented by wp(S, R). Thus, if we can show that Q ⇒ wp(S, R) without actually forming wp(S, R), then we are content to use Q as a precondition.
The ability to work with a precondition that is not the weakest is use-
ful, because the derivation of wp(S, R) itself can be impractical, as we
shall see when we consider loops.
Note that wp is a function of two arguments: a command S and a predicate R. Consider for the moment an arbitrary but fixed command S. We can then write wp(S, R) as a function of one argument: wp_S(R). The function wp_S transforms any predicate R into another predicate wp_S(R). This is the origin of the term "predicate transformer" for wp_S.

Remark: The notation Q {S} R was first used in 1969 (see chapter 23) to
denote partial correctness. It has the interpretation: if execution of S
begins in a state satisfying Q, and if execution terminates, then the final

state will satisfy R. We use braces around the predicates (instead of


around the command) to denote total correctness: execution is guaranteed
to terminate. As an example, note that

T {while T do skip} T

where skip is a null command, is a tautology, because execution of the


loop never halts. But

{T} while T do skip {T}

which is equivalent to T ⇒ wp("while T do skip", T), is everywhere

false. □

Some properties of wp
If we are to define a programming notation using the concept of wp,
then we had better be sure that wp is well-behaved. By this we mean that
we should be able to define reasonable, implementable commands using
wp. Furthermore, it would be nice if unimplementable commands would
be rejected from consideration. Let us therefore analyze our interpretation (7.1) of wp(S, R), and see whether any properties can be derived from it.
First, consider the predicate wp(S, F) (for any command S). This describes the set of states such that execution of S begun in any one of them is guaranteed to terminate in a state satisfying F. But no state ever satisfies F, because F represents the empty set. Hence there could not possibly be a state in wp(S, F), and we have our first property:

(7.3) Law of the Excluded Miracle: wp(S, F) = F.


The name of this property is appropriate, for it would indeed be a miracle
if execution could terminate in no state.
The second law is as follows. For any command S and predicates Q
and R the following holds:

(7.4) Distributivity of Conjunction:

wp(S, Q) ∧ wp(S, R) = wp(S, Q ∧ R)

Let us see why (7.4) is a tautology. First, consider any state s that satisfies the left hand side (LHS) of (7.4). Execution of S begun in s will terminate with both Q and R true. Hence Q ∧ R will also be true, and s is in wp(S, Q ∧ R). This shows that LHS ⇒ RHS. Next, suppose s is in wp(S, Q ∧ R). Then execution of S begun in s is guaranteed to terminate in some state s' of Q ∧ R. Any such s' must be in Q and in R, so

that s is in wp(S, Q) and in wp(S, R). This shows that RHS ⇒ LHS. Together with LHS ⇒ RHS, this yields RHS = LHS.
We have thus shown that (7.3) and (7.4) hold. The arguments were based solely on the informal interpretation (7.1) that we wanted to give to the notation wp(S, R). We now take them as basic axioms, and use them as we do other axioms and laws of the predicate calculus. Using them, we can prove two other useful laws; their proofs are left as exercises.
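Both axioms can be spot-checked under an executable encoding. The sketch below (assumed encoding: a deterministic command as a state function, predicates as Boolean functions) tests (7.3) and (7.4) for a sample command over a sample state space:

```python
# Sketch: spot-checking (7.3) and (7.4) for a sample deterministic command
# S (here i := 2*i), with predicates encoded as Boolean functions of the
# state and wp(S, R) as "R holds after running S".
S = lambda i: 2 * i
wp = lambda R: (lambda i: R(S(i)))

F = lambda i: False           # the predicate satisfied by no state
Q = lambda i: i > 0
R = lambda i: i % 4 == 0

states = range(-20, 21)
# (7.3) Law of the Excluded Miracle: wp(S, F) = F
assert all(wp(F)(i) == False for i in states)
# (7.4) Distributivity of Conjunction
assert all((wp(Q)(i) and wp(R)(i)) == wp(lambda j: Q(j) and R(j))(i)
           for i in states)
```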

(7.5) Law of Monotonicity: if Q ⇒ R then wp(S, Q) ⇒ wp(S, R)

(7.6) Distributivity of Disjunction:

wp(S, Q) ∨ wp(S, R) ⇒ wp(S, Q ∨ R)

It is interesting to compare (7.4) and (7.6). One is an equivalence, the other an implication. Why? The reason is that execution of commands may be nondeterministic. Execution of a command is nondeterministic if it need not always be exactly the same each time it is begun in the same state. It may produce different answers, or it may simply take different "paths" en route to the same answer. Most sequential programming notations, like Algol and FORTRAN, are implemented in a deterministic fashion -execution begun in the same state is always the same- so this idea of nondeterminism may be new to you.
As an example of a nondeterministic action for which the LHS and RHS of (7.6) are not equivalent, consider the act of flipping a coin that is, theoretically, so thin that it cannot land on its side. There is no guarantee that flipping the coin will yield a head, so that wp(flip, head) = F. Similarly, wp(flip, tail) = F. Hence,

wp(flip, head) ∨ wp(flip, tail) = F

But the coin is guaranteed to land with either a head or a tail up, so that

wp(flip, head ∨ tail) = T
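The coin flip can be modelled by letting a nondeterministic command map a state to the set of states execution may end in; s lies in wp(S, R) when every possible outcome satisfies R. Under that (assumed) encoding:

```python
# Sketch (assumed encoding): a nondeterministic command maps a state to the
# SET of states its execution may end in; s lies in wp(S, R) only when every
# possible outcome satisfies R.
def wp(S, R):
    return lambda s: all(R(t) for t in S(s))

flip = lambda s: {'head', 'tail'}          # either outcome is possible
head = lambda t: t == 'head'
tail = lambda t: t == 'tail'
head_or_tail = lambda t: head(t) or tail(t)

s = 'initial'
assert wp(flip, head)(s) == False          # wp(flip, head) = F
assert wp(flip, tail)(s) == False          # wp(flip, tail) = F
assert wp(flip, head_or_tail)(s) == True   # wp(flip, head or tail) = T
```

This exhibits a state where the RHS of (7.6) holds but the LHS does not, so the implication cannot be strengthened to an equivalence.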

If we know that a command is deterministic, we can show (see exercise 6) that

(7.7) wp(S, Q) ∨ wp(S, R) = wp(S, Q ∨ R)   (for deterministic S)

Note carefully that nondeterminism is a property of the implementation of a command, and not a property of the command itself. If a command satisfies (7.7), then it should be possible to implement it in a deterministic fashion without restricting its generality. If a command does not satisfy (7.7), then if it is implemented in a deterministic fashion, the

implementation is likely to restrict the command somewhat -for example, by requiring so much skill in flipping a coin that landing head up is guaranteed.
In the next chapter we will begin defining a programming notation in terms of wp. In doing so, we must be extremely careful. For any command S, the function wp_S(R) yields a predicate, and at first glance it might seem that any function with domain and range the set of predicates will do. But remember that such functions must represent implementable commands. At the least, it is our duty to certify that such functions satisfy (7.3) and (7.4), because these properties were developed based on our notion of command execution. We shall not always perform this duty in later chapters (because it has been done before); rather, we will leave this task to the reader as exercises.

Exercises for Chapter 7


1. Determine wp(S, R) for the following S and R, based on your own
knowledge of how S is executed. Assume that all variables are of type integer
and that all subscripts are in range.

        S                          R
(a)  i:= i+1                    i > 0
(b)  i:= i+2; j:= j-2           i+j = 0
(c)  i:= i+1; j:= j-1           i*j = 0
(d)  z:= z*j; i:= i-1           z*j^i = c
(e)  a[i]:= 1                   a[i] = a[j]
(f)  a[a[i]]:= i                a[i] = i

2. Examples 1-5 of this section each gave a predicate in the form wp(S, R) = Q.
Rewrite each of these in the form {Q} S {R}, just to get used to the two
different notations. For example, example 2 would be written as

{T} if x ≥ y then z:= x else z:= y {z = max(x, y)}.

3. Prove (7.5) and (7.6). Don't rely on the notion of execution and interpretation
(7.1); prove them only from (7.4) and the laws of predicate calculus.
4. Prove using (7.4) that (wp(S, R) ∧ wp(S, ¬R)) = F.
5. Give an example to show that the following is not true for all states:
(wp(S, R) ∨ wp(S, ¬R)) = T.
6. Show that (7.7) holds for deterministic S. (It cannot be proved from axioms
(7.3)-(7.4); it must be argued based on the definitions of determinism and wp, as
was done for (7.3) and (7.4).)
7. Suppose Q ⇒ wp(S, R) has been proven for particular Q, R and S.
Analyze fully the statement

(7.8) {(A x: Q)} S {(A x: R)}

(Is it true in general; if not, what restrictions must be made so that it holds for
"reasonable" classes of predicates Q, R and commands S, etc.) Hint: be careful
to consider the case where x appears in S. You may want to answer the question
under the ground rule that the appearance of x in S means that (7.8) is invalid,
and that the quantified identifier x should be changed before proceeding. It is
also instructive, however, to answer this question without using this ground rule.
See section 4.3.
8. Suppose Q ⇒ wp(S, R) has been proven for particular Q, R and S.
Analyze fully the statement

{(E x: Q)} S {(E x: R)}


(Is it true in general; if not, what restrictions must be made so that it holds for
"reasonable" classes of predicates Q, R and commands S, etc.) See the hint on
exercise 7.
Chapter 8
The Commands skip, abort and Composition

We now define a programming notation in terms of wp. We will also indicate how each command of the programming notation is to be executed, so that the reader can relate it to statements of other conventional languages. Also, by showing how the command can be executed we establish that it really is useful. But the definition in terms of wp should be viewed as the definition of the command.
We begin with the command skip. Execution of skip does nothing
(and, we assume, very quickly). It is equivalent to the "empty" command
of ALGOL 60 and Pascal and to the PL/I command consisting solely of a
semicolon ";". It is included in the notation for two reasons. First, it is
often useful to be able to explicitly say that "nothing" should be done.
But just as importantly its predicate transformer is mathematically very
simple -it is the identity transformation:

(8.1) Definition. wp(skip, R) = R. □

The second command is abort, which is introduced not because of its


usefulness in programming but because it, too, has a definition that is
mathematically simple. It is the only possible command whose predicate
transformer is a "constant" function (see exercise 3).

(8.2) Definition. wp(abort, R) = F. □

How is abort executed? Well, it should never be executed, because it


can only be executed in a state satisfying F and no state satisfies F! If
execution ever reaches a point at which abort is to be executed, then
obviously the program (and its proof) is in error, and abortion is called
for.
Sequential composition is one way of composing larger program segments from smaller segments. Let S1 and S2 be two commands. Then

S1; S2 is a new command. It is executed by first executing S1 and then executing S2. Its formal definition is:

(8.3) Definition. wp("S1; S2", R) = wp(S1, wp(S2, R)). □

As a (trivial) example, we have

wp("skip; skip", R) = wp(skip, wp(skip, R))
= wp(skip, R)    (since wp(skip, R) = R)
= R.
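Definitions (8.1)-(8.3) translate directly into code: a command becomes a predicate transformer, and sequential composition becomes function composition. A sketch, with predicates encoded as functions on states:

```python
# Sketch: commands encoded as predicate transformers -- functions taking a
# postcondition R (a predicate on states) to a precondition.
wp_skip = lambda R: R                      # (8.1): the identity transformer
wp_abort = lambda R: (lambda s: False)     # (8.2): the constant-F transformer

def wp_seq(wp_S1, wp_S2):
    # (8.3): wp("S1; S2", R) = wp(S1, wp(S2, R)) -- function composition.
    return lambda R: wp_S1(wp_S2(R))

R = lambda s: s['x'] > 0
s = {'x': 5}
assert wp_seq(wp_skip, wp_skip)(R)(s) == R(s)    # "skip; skip" acts as skip
assert wp_seq(wp_abort, wp_skip)(R)(s) == False  # a sequence with abort fails
```

The last assertion anticipates exercise 6: once abort appears in a sequence, the whole sequence has precondition F.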

Now consider a sequence of three commands: S1; S2; S3. Executing it should involve first executing S1, then S2, and finally S3, but we must make sure that the sequence also makes sense in terms of wp. Is it to be interpreted as (S1; S2); S3 or as S1; (S2; S3)? Fortunately, the operation of function composition, which is used in defining sequential composition, is associative (see Appendix 3). Therefore

wp("S1; (S2; S3)", R) = wp("(S1; S2); S3", R).

That is, it doesn't matter whether one thinks of S1; S2; S3 as S1 composed with S2; S3 or as S1; S2 composed with S3, and it is all right to leave the parentheses out. (Similarly, because addition is associative, a+b+c is well-defined because a+(b+c) yields the same result as (a+b)+c.)
Be aware of the role of the semicolon; it is used to combine adjacent, independent commands into a single command, much the way it is used in English to combine independent clauses. (For an example of its use in English, see the previous sentence.) It can be thought of as an operator that combines, just as catenation is used in Pascal and PL/I to combine two strings of characters. Once this is understood, there should be no confusion about where to put a semicolon.
Our use of the semicolon conforms not only to English usage, but also to its original use in the first programming notation that contained it, ALGOL 60. It is a pity that the designers of PL/I and Ada saw fit to go against convention and use the semicolon as a statement terminator, for it has caused great confusion.
Thus far, we don't have much of a programming notation -about all we can write is a sequence of skips and aborts. In the next chapter we define the assignment command. Before reading ahead, though, perform some of the exercises in order to get a firm grasp of this (still simple) material.

Exercises for Chapter 8


1. Prove that definition (8.1) satisfies laws (7.3), (7.4) and (7.7).
2. Prove that definition (8.2) satisfies laws (7.3), (7.4) and (7.7).
3. Consider introducing a command make-true with a constant predicate
transformer:

wp(make-true, R) = T for all predicates R.

Why isn't make-true a valid command?

4. Prove that definition (8.3) satisfies laws (7.3) and (7.4), provided S1 and S2 do.
5. Prove that definition (8.3) satisfies (7.7) provided S1 and S2 do. This shows
that sequential composition does not introduce nondeterminism.
6. Prove that wp("x:= e; abort", R) = F, for any predicate R, regardless of
the definition of wp("x:= e", R).
Chapter 9
The Assignment Command

9.1 Assignment to Simple Variables


For the moment, we consider assignment to a simple variable, where a
"simple" variable is a variable of type integer, Boolean and the like. We
treat assignment to array elements in section 9.3.
The assignment command has the form

x:= e

where x is a simple variable, e is an expression, and the types of x and e


are the same. This command is read as "x becomes e". As a convention, it is written with a blank separating the assignment symbol := from e but no blank separating x from :=.
The command x:= e can be executed properly only in a state in which
e can be evaluated (e.g. there is no division by zero). Execution consists
of evaluating e and storing the resulting value in the location named x.
In effect, (the value of) x is replaced by (the value of) e, and a similar,
but textual, replacement forms the heart of the definition:

(9.1.1) Definition. wp("x:= e", R) = domain(e) cand R[x:=e]

where

(9.1.2) domain(e) is a predicate that describes the set of all states in
which e may be evaluated -i.e. is well-defined. □

(Here R[x:=e] denotes R with every free occurrence of x textually replaced by e; textual substitution is the subject of section 4.6.)

Predicate domain(e) will not be formally defined, since expressions e are not. However, it must exclude all states in which evaluation of e would be undefined -e.g. because of division by zero or subscript out of range.

It can be defined recursively on the structure of expressions (see exercise 6).
Often, we tend to omit domain(e) entirely, writing

(9.1.3) wp("x:= e", R) = R[x:=e]

because assignments should always be written in contexts in which the expressions can be properly evaluated.
Definition (9.1.3) can be bewildering at first, for it seems to require "thinking backwards". Our operational habits make us feel that the precondition should be R and the postcondition R[x:=e]! Here is an informal explanation of (9.1.1): Since x will contain the value of e after execution, R will be true after execution iff R, with x replaced by e, is true before execution. A more formal explanation is left to exercise 3. The following examples should lend some confidence in the definition. In particular, examples 7 and 8 should convince the reader that this definition is consistent with our conventional model of execution.
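The substitution rule also has a semantic reading that can be tested directly: wp("x:= e", R) holds in state s exactly when R holds in s with x rebound to the value of e in s. A sketch (assumed encoding of states as dicts), checking wp("x:= x+1", x < 0) = (x < -1):

```python
# Sketch (states as dicts of variable values): wp("x:= e", R) holds in s
# exactly when R holds in s with x rebound to the value of e in s -- the
# semantic content of the textual substitution R[x:=e].
def wp_assign(x, e, R):
    return lambda s: R({**s, x: e(s)})

# Check wp("x:= x+1", x < 0) = (x < -1) over a sample of states.
pre = wp_assign('x', lambda s: s['x'] + 1, lambda s: s['x'] < 0)
assert all(pre({'x': v}) == (v < -1) for v in range(-10, 11))
```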

Example 1. wp("x:= 5", x = 5) = (5 = 5) = T. Hence execution of x:= 5 always establishes x = 5. □

Example 2. wp("x:= 5", x ≠ 5) = (5 ≠ 5) = F. Hence execution of x:= 5 never establishes x ≠ 5. □

Example 3. wp("x:= x+1", x < 0) = (x+1 < 0) = (x < -1). □

Example 4. wp("x:= x*x", x^4 = 10) = ((x*x)^4 = 10) = (x^8 = 10). □

Example 5. For any predicate p of one argument,

wp("x:= a÷b", p(x)) = (b ≠ 0 cand p(a÷b)).

This example required explicit use of the term domain(e) of definition (9.1.1). □

Example 6. Suppose array b is declared with subscript range 0:100. Then

wp("x:= b[i]", x = b[i]) = (0 ≤ i ≤ 100 cand b[i] = b[i])
= (0 ≤ i ≤ 100).

Thus, x will contain the value b[i] upon termination iff i is a valid subscript for array b. □

Example 7. Assume c is a constant. Then wp("x:= e", x = c) = (e = c).

This means that execution of x:= e is guaranteed to terminate with c in x iff the value of expression e before execution is c. □

Example 8. Assume c is a constant and x and y are distinct identifiers. Then

wp("x:= e", y = c) = (y = c). □

Example 8 is particularly illuminating. Since y must retain its original


value c, execution of the assignment x:= e cannot change y. Since the
above must hold for all variables y and values c, execution of x:= e may
change only x, and no other variable. Hence, no so-called "side effects"
are allowed. This restriction holds universally: execution of an assignment
may change only the variable indicated and evaluation of an expression
may change no variable. This prohibits functions with side effects.
The ban on side effects is extremely important, for it allows us to consider expressions as conventional mathematical entities. This means that we can use all the conventional properties with which we are used to working when dealing with them -such as associativity and commutativity of addition and the logical laws of chapter 2.

Swapping the values of two variables


The sequence t:= x; x:= y; y:= t can be used to "swap" or exchange
the values of variables x and y, as the following shows.

wp("t:= x; x:= y; y:= t", x = X ∧ y = Y)
= wp("t:= x; x:= y", wp("y:= t", x = X ∧ y = Y))
= wp("t:= x; x:= y", x = X ∧ t = Y)
= wp("t:= x", wp("x:= y", x = X ∧ t = Y))
= wp("t:= x", y = X ∧ t = Y)
= (y = X ∧ x = Y)

The above is comparatively difficult to read and write. Instead, we use a


proof outline, as illustrated to the left in (9.1.4).

(9.1.4)  {y = X ∧ x = Y}          {y = X ∧ x = Y}
         t:= x;                   t:= x;
         {y = X ∧ t = Y}          x:= y;
         x:= y;                   y:= t
         {x = X ∧ t = Y}          {x = X ∧ y = Y}
         y:= t
         {x = X ∧ y = Y}

Recall from section 6.3 that, in a proof outline, an assertion appears between each pair of commands. The assertion is a postcondition for the first command and a precondition for the second. The proof outline is often read backwards, since a precondition is determined from a postcondition and a command. We could also abbreviate this proof outline as shown to the right in (9.1.4), since determining the intermediate assertions is a simple, almost mechanical, chore.

Exercises for Section 9.1


1. Determine and simplify wp(S, R) for the pairs (S, R). Variable all5 has
type Boolean; all other variables have type integer.

        S                                  R
(a)  x:= 2*y+3                          x = 13
(b)  x:= x+y                            x < 2*y
(c)  j:= j+1                            0 < j ∧ (A i: 0 ≤ i ≤ j: b[i] = 5)
(d)  all5:= (b[j] = 5)                  all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(e)  all5:= all5 ∧ (b[j] = 5)           all5 = (A i: 0 ≤ i ≤ j: b[i] = 5)
(f)  x:= x*y                            x*y = c
(g)  x:= (x-y)*(x+y)                    x + y^2 ≤ 0

2. Prove that definition (9.1.3) satisfies laws (7.3), (7.4) and (7.7). The latter
shows that assignment is deterministic.
3. Review section 4.6 (Some theorems about textual substitution). Let s be the
machine state before execution of x:= e and let s' be the final state. Describe s
and s' in terms of how x:= e is executed. (What, for example, should be the
value in x upon termination?) Then show that for any predicate R, s'(R) is true
iff s(R[x:=e]) is true. Finally, argue that this last fact shows that the definition of
assignment is consistent with our operational view of assignment.
4. One can write a "forward rule" for assignment, which from a precondition
derives the strongest postcondition sp(Q, "x:= e") such that execution of x:= e
with Q true leaves sp(Q, "x:= e") true (in the definition below, v represents the
initial value of x):

sp(Q, "x:= e") = (E v: Q[x:=v] ∧ x = e[x:=v])

Show that this definition is also consistent with our model of execution. One way
to do this is to show that execution of x:= e with Q true is guaranteed to
terminate with sp(Q, "x:= e") true:

{Q} x:= e {sp(Q, "x:= e")}

5. See exercise 4. Give an example to show that Q is not equivalent to
wp("x:= e", sp(Q, "x:= e")).
6. Consider integer expressions defined using the syntax (see Appendix 1)

<expr>   ::= <term> | <expr> + <term>
<term>   ::= <factor> | <term> * <factor>
             | <term> ÷ <factor>
<factor> ::= <integer constant> | <identifier>
             | <array identifier> [ <expr> ]

Let domain(b) denote the set of subscript values for any array b. Define
domain(<expr>) for any expression <expr> recursively, on the structure of
expressions. Assume the errors that can occur are subscript out of range and
division by zero.

9.2 Multiple Assignment to Simple Variables


A multiple assignment to simple variables has the form

(9.2.1) x1, x2, ..., xn := e1, e2, ..., en

where the xi are distinct simple variables and the ei are expressions. For purposes of explanation the assignment is abbreviated as x̄:= ē. That is, any identifier with a bar over it represents a vector (of appropriate length).
The multiple assignment command can be executed as follows. First evaluate the expressions, in any order, to yield values v1, ..., vn. Then assign v1 to x1, v2 to x2, ..., vn to xn, in that order. (Because the xi are distinct, the order of assignment doesn't matter. However, a later generalization will require left-to-right assignment.)
The multiple assignment is useful because it easily describes a state change involving more than one variable. Its formal definition is a simple extension of assignment to one variable:

(9.2.2) Definition. wp("x̄:= ē", R) = domain(ē) cand R[x̄:=ē]. □

where domain(ē) describes the set of states in which all the expressions in the vector ē can be evaluated:

domain(ē) = (A i: domain(ei)).

Example 1. x, y:= y, x can be used to "swap" the values of x and y. □

Example 2. x, y, z:= y, z, x "rotates" the values of x, y and z. □

Example 3. wp("z, y:= z*x, y-1", y ≥ 0 ∧ z*x^y = c)
= (y-1 ≥ 0 ∧ (z*x)*x^(y-1) = c)
= (y ≥ 1 ∧ z*x^y = c). □

Example 4. wp("s,i:= s+b[i], i+1", i > 0 ∧ s = (Σ j: 0 ≤ j < i: b[j]))
   = (i+1 > 0 ∧ s+b[i] = (Σ j: 0 ≤ j < i+1: b[j]))
   = (i ≥ 0 ∧ s = (Σ j: 0 ≤ j < i: b[j]))

Note that execution leaves s = (Σ j: 0 ≤ j < i: b[j]) unchanged.  □

Example 5. wp("x,y:= x-y, y-x", x+y = c)
   = (x-y + y-x = c)
   = (0 = c).  □

Example 6. wp("x,y:= x-y, x+y", x+y = c)
   = (x-y + x+y = c)
   = (2*x = c).  □
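Since a multiple assignment is deterministic, a computed weakest precondition can be checked by brute force: the postcondition holds after execution exactly when the precondition held before. A small Python check of the last example (the helper name is ours):

```python
# Check: wp("x,y:= x-y, x+y", x+y = c) should equal (2*x = c).
def post_holds(x, y, c):
    x, y = x - y, x + y          # the multiple assignment
    return x + y == c            # the postcondition

# For every small state, the postcondition holds after execution
# iff the computed precondition 2*x = c held before.
ok = all((2 * x == c) == post_holds(x, y, c)
         for x in range(-5, 6) for y in range(-5, 6) for c in range(-5, 6))
```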

It is difficult at first to use the assignment command definition, for our
old habits of reasoning about assignments in terms of execution get in the
way. We have to consciously force ourselves to use it. Surprisingly
enough, with practice it does help. Here is an example to illustrate this.
Suppose we have an array b and variables i, m, and p, with
i ≤ m < i+p. Values i and i+p-1 define the boundaries of a partition
b[i:i+p-1] of b, while m is an index in that partition, as shown in the
first predicate below. It is desired to make the partition smaller
by setting i to m+1, but at the same time p should be changed so that
i+p-1 still describes the rightmost boundary of the partition, which does
not change, as shown by the second predicate below.

   b:  [ i ...... m ...... i+p-1 ]        ∧  i ≤ m < i+p

   b:           [ i = m+1 .. i+p-1 ]      ∧  i = m+1 ≤ i+p

Now, what value should be assigned to p? Instead of determining it


through ad hocery, let us use the definition of wp. Letting c be the ini-
tial value of i +p, we want to find the expression x that makes the fol-
lowing true:

{i+p = c} i,p:= m+1, x {i+p = c}

We have:

   wp("i,p:= m+1, x", i+p = c)
   = (i+p = c)[i,p := m+1, x]
   = (m+1 + x = c)

Since initially i+p = c, we substitute for c to get

   m+1 + x = i+p

Solving for variable x yields x = p+i-m-1, so the desired assignment is

   i,p:= m+1, p+i-m-1.
The definition of wp was used to derive the assignment, and not only
to show that the assignment was correct. This is a hint as to the useful-
ness of wp in deriving programs.
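The derived assignment can be checked by brute force over small states (a sketch; the function name is ours): whenever the precondition i ≤ m < i+p holds, executing it should leave i+p unchanged.

```python
# Check that i,p := m+1, p+i-m-1 preserves the value of i+p.
def shrink(i, p, m):
    assert i <= m < i + p                 # the stated precondition
    i, p = m + 1, p + i - m - 1           # the derived multiple assignment
    return i, p

preserved = all(sum(shrink(i, p, m)) == i + p
                for i in range(-3, 4) for p in range(1, 6)
                for m in range(i, i + p))
```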

Remark: Consider finding a solution for x in the assertion

(9.2.3) {T} a:= a+1; b:= x {a = b}

Blind analysis leads to

   wp("a:= a+1; b:= x", a = b)
   = wp("a:= a+1", a = x)
   = (a+1 = x)

which clearly is wrong; we cannot substitute a+1 for x in (9.2.3) and
achieve a tautology. The problem is that x must be considered a function
of a and b. If we write x as x(a, b), then we get

   wp("a:= a+1; b:= x(a, b)", a = b)
   = wp("a:= a+1", a = x(a, b))
   = (a+1 = x(a+1, b))

Thus, we see that x is not dependent on b, and we can take x to be the
expression a, which is the obvious answer.  □

Exercises for Section 9.2


1. Prove that x,y:= e1, e2 is semantically equivalent to x:= e1; y:= e2, and
also to y:= e2; x:= e1, provided that x does not occur in e2 and y does not
occur in e1.
2. Show by counterexample that x,y:= e1, e2 and x:= e1; y:= e2 and y:= e2;
x:= e1 are generally not equivalent if x occurs in e2 or y in e1.

3. Determine and simplify wp(S, R) for the pairs (S, R) given below.

        S                            R
   (a)  z,x,y:= 1, c, d              z*x^y = c^d
   (b)  i,s:= 1, b[0]                1 ≤ i < n ∧ s = b[0] + ... + b[i-1]
   (c)  a,n:= 0, 1                   a^2 < n ∧ (a+1)^2 ≥ n
   (d)  i,s:= i+1, s+b[i]            0 < i < n ∧ s = b[0] + ... + b[i-1]
   (e)  i:= i+1; j:= j+i             i = j
   (f)  j:= j+i; i:= i+1             i = j
   (g)  i,j:= i+1, j+i               i = j

4. In each of the following predicates, x represents an unknown expression that is
to be determined. That is, an expression for x involving the other variables is to
be determined so that the assertion is a tautology. Do so, as was done in the
example preceding the exercises. The first few exercises are simple, so that you
can easily become familiar with the technique.

(a) {T} a,b:= a+1, x {b = a+1}
(b) {T} a:= a+1; b:= x {b = a+1}
(c) {T} b:= x; a:= a+1 {b = a+1}
(d) {i = j} i,j:= i+1, x {i = j}
(e) {i = j} i:= i+1; j:= x {i = j}
(f) {i = j} j:= x; i:= i+1 {i = j}
(g) {z + a*b = c} z,a:= z+b, x {z + a*b = c}
(h) {even(a) ∧ z + a*b = c} a,b:= a/2, x {z + a*b = c}
(i) {even(a) ∧ z + a*b = c} a:= a/2; b:= x {z + a*b = c}
(j) {T} i,s:= 0, x {s = (Σ j: 0 ≤ j ≤ i: b[j])}
(k) {T} i,s:= 0, x {s = (Σ j: 0 ≤ j < i: b[j])}
(l) {i ≥ 0 ∧ s = (Σ j: 0 ≤ j < i: b[j])} i,s:= i+1, x {s = (Σ j: 0 ≤ j < i: b[j])}

9.3 Assignment to an Array Element


Recall (section 5.1) that in the functional view of arrays an array b is a
simple variable that contains a function, and that conventional "array
subscripting", b[i], is simply application of the function currently in b to
the argument i. Recall also that (b; i:e) denotes a function that is the same
as b, except that at the argument i it yields the value e.
We can therefore view a subscripted variable assignment b[i]:= e as
equivalent to the assignment

(9.3.1) b:= (b; i:e)

since both change b to represent the function (b; i:e). But (9.3.1) is an
assignment to a simple variable. Since assignment to a simple variable is
already defined in (9.1.1), so is assignment to a subscripted variable! We
have, using definition (9.1.1),

   wp("b[i]:= e", R) = wp("b:= (b; i:e)", R)
                     = domain((b; i:e)) cand R[b := (b; i:e)]

Using inrange(b, i) to mean that the value of i is a valid subscript, we
can rewrite the definition of b[i]:= e as

(9.3.2) Definition. wp("b[i]:= e", R)
        = inrange(b, i) cand domain(e) cand R[b := (b; i:e)]  □

Typically, we tend to leave off inrange and domain, writing simply

(9.3.3)   wp("b[i]:= e", R) = R[b := (b; i:e)]

Remark: The notation (b; i:e) is used in defining assignment to array ele-
ments and in reasoning about programs, but not in programs. For tradi-
tional reasons, the assignment command is still written as b[i]:= e.  □
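The function (b; i:e) can be modeled directly in Python (a sketch; the name `updated` is ours): the result agrees with b everywhere except at argument i, where it yields e.

```python
# Model of (b; i:e): a function equal to b except at argument i, where it yields e.
def updated(b, i, e):
    return lambda j: e if j == i else b(j)

# View an array as a function from subscripts to values.
b = [10, 20, 30].__getitem__
b2 = updated(b, 1, 99)          # the function (b; 1:99)
```

Note that b itself is unchanged; (b; i:e) is a new function, which is exactly why b[i]:= e can be read as the simple-variable assignment b:= (b; i:e).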

We have managed to make the definition of assignment to an array
element look quite simple and, in fact, determining wp("b[i]:= e", R) is a
mechanical chore. In making the definition simple, however, we have
pushed the complications into the predicate calculus; it is not always easy
to manipulate a precondition into an understandable form. The following
examples are designed to illustrate how this can be done. In the exam-
ples, assume that all subscripts are in range.

Example 1. wp("b[i]:= 5", b[i] = 5)
   = (b[i] = 5)[b := (b; i:5)]       (Definition)
   = (b; i:5)[i] = 5                 (Textual substitution)
   = 5 = 5
   = T

Hence, execution of b[i]:= 5 always sets b[i] to 5.  □

Example 2. wp("b[i]:= 5", b[i] = b[j])
   = (b[i] = b[j])[b := (b; i:5)]                  (Definition)
   = (b; i:5)[i] = (b; i:5)[j]                     (Textual subs.)
   = (i ≠ j ∧ 5 = b[j]) ∨ (i = j ∧ 5 = 5)          (Case analysis: i ≠ j ∨ i = j)
   = (i ≠ j ∧ 5 = b[j]) ∨ (i = j)
   = (i ≠ j ∨ i = j) ∧ (5 = b[j] ∨ i = j)          (Distributivity)
   = T ∧ (i = j ∨ b[j] = 5)
   = i = j ∨ b[j] = 5

Often, the case i = j is omitted carelessly when basing arguments on
intuition only; the formal definition really helps here. The case analysis
performed here was explained at the end of section 5.1, so reread that
part if you are having trouble with it.  □
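Example 2's result can also be confirmed by exhaustive testing over small arrays (a Python sketch with helper names of our choosing):

```python
# Check: after b[i]:= 5, b[i] = b[j] holds iff initially i = j or b[j] = 5.
from itertools import product

def post_holds(b, i, j):
    b = list(b)
    b[i] = 5
    return b[i] == b[j]

ok = all(((i == j) or (b[j] == 5)) == post_holds(b, i, j)
         for b in product([4, 5, 6], repeat=3)
         for i in range(3) for j in range(3))
```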

Example 3. wp("b[b[i]]:= i", b[i] = i)
   = (b[i] = i)[b := (b; b[i]:i)]                       (Definition)
   = (b; b[i]:i)[i] = i                                 (Textual subs.)
   = (b[i] ≠ i ∧ b[i] = i) ∨ (b[i] = i ∧ i = i)         (Case analysis)
   = F ∨ (b[i] = i ∧ T)
   = b[i] = i

Hence, execution of b[b[i]]:= i has no effect on the predicate b[i] = i.
This exercise is quite difficult to perform using only operational reason-
ing.  □
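Again the result can be confirmed by exhaustive testing (a Python sketch; it assumes b's values are themselves valid subscripts, as in the example):

```python
# Check: execution of b[b[i]]:= i leaves the predicate b[i] = i unchanged.
from itertools import product

def pred_after(b, i):
    b = list(b)
    b[b[i]] = i          # the assignment; b[i] is a valid subscript here
    return b[i] == i

ok = all((b[i] == i) == pred_after(b, i)
         for b in product(range(3), repeat=3) for i in range(3))
```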

Example 4. Assume n > 1. Let ordered(b[1:n]) mean that the elements
of b are in ascending order. Then

   wp("b[n]:= x", ordered(b[1:n]))
   = (ordered(b[1:n]))[b := (b; n:x)]       (Definition)
   = ordered((b; n:x)[1:n])                 (Textual substitution)
   = ordered(b[1:n-1]) ∧ b[n-1] ≤ x         (Definition of ordered)

By replacing ordered(b[1:n]) by its definition, we get a more formal
derivation:

   wp("b[n]:= x", (A i: 1 ≤ i < n: b[i] ≤ b[i+1]))
   = (A i: 1 ≤ i < n: (b; n:x)[i] ≤ (b; n:x)[i+1])
   = (b; n:x)[n-1] ≤ (b; n:x)[n] ∧
     (A i: 1 ≤ i < n-1: (b; n:x)[i] ≤ (b; n:x)[i+1])
   = b[n-1] ≤ x ∧ (A i: 1 ≤ i < n-1: b[i] ≤ b[i+1])  □

Exercises for Section 9.3


1. Determine and simplify the following weakest preconditions, where array b is
declared as b[0:n-1] and it is known that all subscripts are in range.

   (a) wp("b[i]:= i", b[b[i]] = i)
   (b) wp("b[i]:= 5", (E j: i ≤ j < n: b[i] ≤ b[j]))
   (c) wp("b[i]:= 5", (E j: i ≤ j < n: b[i] < b[j]))
   (d) wp("b[i]:= 5", b[0:n-1] = B[0:n-1])
   (e) wp("b[i]:= b[i-1]+b[i]", b[i] = (Σ j: 1 ≤ j < i: b[j]))
   (f) wp("t:= b[i]; b[i]:= b[j]; b[j]:= t", b[i] = x ∧ b[j] = y)
   (g) wp("t:= b[i]; b[i]:= b[j]; b[j]:= t", k ≠ i ∧ k ≠ j ∧ b[k] = C)

2. Derive a definition for an assignment r.s:= e for a Pascal-like record r with
field name s (see exercise 4 of section 5.1).

9.4 The General Multiple Assignment Command


This section may be skipped, since it is not needed to understand pro-
gram development as described in Part III. The material is used heavily in
defining procedure calls in chapter 12.
Thus far, we have defined the assignment x:= e for a simple variable
x, the assignment x:= e for distinct simple variables xi, and the assign-
ment b[i]:= e to an array element b[i]. We now want to define the gen-
eral multiple assignment command. For example, it allows us to swap the
values of two array elements:

   b[i], b[j]:= b[j], b[i].

In addition, it allows us to deal with assignments to elements of sub-
arrays. For example, for array c declared as

   var c: array [0:10] of array [0:10] of integer,

the assignment c[i][j]:= e has not yet been defined.


Recall (from section 5.3) that a selector is a sequence of bracketed
expressions (subscripts). The null selector is denoted by ε. For any iden-
tifier x (say), we have x = x ∘ ε, where ∘ denotes catenation of identif-
iers and selectors.
The multiple assignment command has the form

(9.4.1)   x1 ∘ s1, ..., xn ∘ sn := e1, ..., en

where each xi is an identifier, each si is a selector and each expression ei
has the same type as xi ∘ si. Using x ∘ s for x1 ∘ s1, ..., xn ∘ sn and e
for e1, ..., en, we abbreviate the multiple assignment by

(9.4.2)   x ∘ s := e.

Note that a simple assignment x:= e has form (9.4.1) -with n = 1 and
s1 = ε- since it is the same as x ∘ ε := e. Also, the assignment b[i]:= e
has this form, with n = 1, x1 = b, s1 = [i] and e1 = e.
The multiple assignment can be executed in a manner consistent with
the formal definition given below as follows:

(9.4.3) Execution of a multiple assignment. First, determine the variables
        specified by the xi ∘ si and evaluate the expressions ei to yield
        values vi. Then assign v1 to x1 ∘ s1, v2 to x2 ∘ s2, ..., and vn to
        xn ∘ sn. The order of assignment must be from left to right.  □

We define multiple assignment by giving a predicate transformer for it.
To get some idea for the predicate transformer, let's look at the definition
of multiple assignment to simple variables:

   wp("x:= e", R) = R[x := e].

We must be sure that the general multiple assignment definition includes
this as a subcase. We therefore generalize the simpler definition to allow
identifiers catenated with selectors instead of just identifiers x:

(9.4.4) Definition. wp("x ∘ s := e", R) = R[x ∘ s := e].  □

The difficulty with (9.4.4) is that textual substitution is defined only for
identifiers, and so R[x ∘ s := e] is as yet undefined. We now generalize the
notion of textual substitution to include the new case by describing how
to massage R[x ∘ s := e] into the form of a conventional textual substitution.
The generalization will be done so that the manner of execution given in
(9.4.3), including the left-to-right order of assignment, will be consistent
with definition (9.4.4).
To motivate the generalization, consider the assignment

(9.4.5)   b ∘ s1, ..., b ∘ sm := e1, ..., em

This assignment first assigns e1 to b ∘ s1, then e2 to b ∘ s2, and so on.
Thus, it should be equivalent to

(9.4.6)   b := (b; s1:e1; ...; sm:em)

Why? Suppose two of the selectors si and sj (say), where i < j, are the
same. Then, after execution of (9.4.5), the value of ej (and not of ei) will
be in b ∘ sj, and thereafter a reference b ∘ si should yield ej. But this is
exactly the case with execution of (9.4.6); the left-to-right order of assign-
ment during execution of (9.4.5) is reflected in the right-to-left precedence
rule for applying the function (b; s1:e1; ...; sm:em) to an argument.
Secondly, note that for distinct identifiers b and c and selectors s and
t (which need not be distinct) the assignments b ∘ s, c ∘ t := e, g and
c ∘ t, b ∘ s := g, e should have the same effect. This is because b ∘ s
and c ∘ t refer to different parts of computer memory, and what is
assigned to one cannot affect what is assigned to the other. (Remember,
expressions e and g are evaluated before any assignments are made.)
This leads us to the following

(9.4.7) Definition. P[x := e], where each element of the list x is an identifier
        catenated with a selector, is given by the following three rules.

(a) Provided x is a list of distinct identifiers (thus, each of the
selectors in x is the null selector ε), P[x := e] denotes conventional
textual substitution.

(b) R[x, b∘s, c∘t, y := e, f, h, g] = R[x, c∘t, b∘s, y := e, h, f, g],

provided that b and c are distinct identifiers. This rule indicates
that adjacent reference-expression pairs may be permuted as long
as they begin with different identifiers.

(c) R[x, b∘s1, ..., b∘sm := e, e1, ..., em] = R[x, b := e, (b; s1:e1; ...; sm:em)],

provided that identifier b does not begin any of the xi. This rule
indicates how multiple assignments to subparts of an object b can
be viewed as a single assignment to b.  □

Example 1. wp("x,x:= 1,2", R)
   = wp("x ∘ ε, x ∘ ε := 1,2", R)
   = R[x := (x; ε:1; ε:2)]
   = R[x := 2]          (see definition (5.3.2))

Execution of x,x:= 1,2 is equivalent to execution of x:= 2; there is really
no sense in using x,x:= 1,2.  □

Example 2. wp("b[i], b[j]:= b[j], b[i]", b[i] = X ∧ b[j] = Y)
   = (b; i:b[j]; j:b[i])[i] = X ∧ (b; i:b[j]; j:b[i])[j] = Y
   = (b; i:b[j]; j:b[i])[i] = X ∧ b[i] = Y
   = ((i = j ∧ b[i] = X) ∨ (i ≠ j ∧ b[j] = X)) ∧ b[i] = Y
   = b[j] = X ∧ b[i] = Y

Note that the swap performs correctly when i = j, since this case is
automatically included in the above derivation. If this derivation seems
too fast for you, reread section 5.1.  □
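Python's tuple assignment evaluates all right-hand sides before assigning, so it models this swap faithfully, including the i = j case (a sketch; the function name is ours):

```python
# The swap b[i],b[j] := b[j],b[i]: both right-hand sides are evaluated
# before either assignment, so it is correct even when i = j.
def swap(b, i, j):
    b = list(b)
    b[i], b[j] = b[j], b[i]
    return b
```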

Example 3. wp("b[i], b[j]:= b[j], b[i]", (A k: k ≠ i ∧ k ≠ j: b[k] = B[k]))
   = (A k: k ≠ i ∧ k ≠ j: (b; i:b[j]; j:b[i])[k] = B[k])
   = (A k: k ≠ i ∧ k ≠ j: b[k] = B[k])

The last line follows because if k ≠ i and k ≠ j then (b; i:b[j]; j:b[i])[k]
= b[k]. The only array values changed by the swap are b[i] and
b[j].  □

Exercises for Section 9.4


1. Transform the following using definition (9.4.7) so that they denote conven-
tional textual substitution -i.e. so that the references substituted for in each
form a list of distinct identifiers.

   (a) R[b[i], b[j], x := e, f, g]
   (b) R[b[i], x, b[j] := e, f, g]
   (c) R[b[i], c[i], b[j] := e, f, g]
   (d) R[b[i], c[i], b[j], c[j] := e, f, g, h]

2. Determine and simplify the following weakest preconditions, where b is an
array of integers and it is assumed that all subscripts are in range.

   (a) wp("b[i], b[2]:= 3, 4", b[i] = 3)
   (b) wp("b[i], b[2]:= 4, 4", b[i] = 3)
   (c) wp("p, b[p]:= b[p], p", p = b[p])
   (d) wp("i, b[i]:= i+1, 0", 0 < i ∧ (A j: 0 ≤ j < i: b[j] = 0))
   (e) wp("i, b[i]:= i+1, 0", 0 < i ∧ b[0:i-1] = 0)
   (f) wp("p, b[p], b[q]:= b[p], b[q], p", p = b[q])
   (g) wp("p, b[p], b[b[p]]:= b[p], b[b[p]], p", p = b[b[p]])
   (h) wp("p, b[p], b[b[p]]:= b[p], b[b[p]], p", p ≤ b[b[p]])
3. Prove the following implication:

   i = I ∧ b[i] = K  ⇒  wp("i, b[i]:= b[i], i", i = K ∧ b[I] = I)

4. Derive a definition for a general multiple assignment command that can include
assignments to simple variables, array elements and Pascal record fields (see
exercise 1 of section 5.3).
5. Prove that lemma 4.6.3 holds for the extended definition of textual substitution:

   Lemma. Suppose each xi of list x has the form identifier ∘ selector and
   suppose u is a list of fresh, distinct identifiers. Then
Chapter 10
The Alternative Command

Programming notations usually have a conditional command, or if-
statement, which allows execution of a subcommand to be dependent on
the current state of the program variables. An example of a conditional
command, taken from ALGOL 60 and Pascal, is

   if x ≥ 0 then z:= x else z:= -x

Execution of this command stores the absolute value of x in z: if x ≥ 0
then the first alternative z:= x is executed; otherwise the second alterna-
tive z:= -x is executed. In our programming notation, this command
can be written as

(10.1) if x ≥ 0 → z:= x
        □ x ≤ 0 → z:= -x
       fi

or, since it is short and simple enough, on one line as

   if x ≥ 0 → z:= x  □  x ≤ 0 → z:= -x  fi.

Command (10.1) contains two entities of the form B → S (separated
by the symbol □), where B is a Boolean expression and S a command.
B → S is called a guarded command, for B acts as a guard at the gate
→, making sure S is executed only under the right conditions. To exe-
cute (10.1), find one true guard and execute its corresponding command.
Thus, with x > 0 execute z:= x, with x < 0 execute z:= -x, and with
x = 0 execute either one (but not both) of the assignments.
This brief introduction has glossed over a number of important points.
Let us now be more precise in describing the syntax and execution of the
alternative command.

The general form of the alternative command is

(10.2) if B1 → S1
        □ B2 → S2
        ...
        □ Bn → Sn
       fi

where n ≥ 0 and each Bi → Si is a guarded command. Each Si can be
any command -skip, abort, sequential composition, assignment, another
alternative command, etc.
For purposes of abbreviation, we refer to the general command (10.2)
as IF, while BB denotes the disjunction

   BB = B1 ∨ B2 ∨ ... ∨ Bn

Command IF can be executed as follows. First, if any guard Bi is not
well-defined in the state in which execution begins, abortion may occur.
This is because nothing is assumed about the order of evaluation of the
guards. Secondly, at least one guard must be true; otherwise execution
aborts. Finally, if at least one guard is true, then one guarded command
Bi → Si with true guard Bi is chosen and Si is executed.
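This execution rule can be sketched as a tiny interpreter (Python; the representation of states and guarded commands is our own, and abortion is modeled by raising an exception):

```python
# Execute an alternative command IF: abort if no guard is true; otherwise
# run the body of an arbitrarily chosen true guard (nondeterminism).
import random

def run_if(state, guarded_commands):
    """guarded_commands: list of (guard, body) pairs of functions on the state."""
    enabled = [body for guard, body in guarded_commands if guard(state)]
    if not enabled:
        raise RuntimeError("abort: no guard is true")
    random.choice(enabled)(state)       # any true guard's command may run
    return state

# Command (10.1): z gets the absolute value of x; with x = 0 either branch may run.
absolute = [(lambda s: s["x"] >= 0, lambda s: s.__setitem__("z", s["x"])),
            (lambda s: s["x"] <= 0, lambda s: s.__setitem__("z", -s["x"]))]
```

With x = 0 both guards are true and `random.choice` may pick either branch, but both establish z = 0, which is exactly why the overlap is harmless.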
The definition of wp(IF, R) is now quite obvious. The first conjunct
indicates that the guards must be well-defined. The second conjunct indi-
cates that at least one guard is true. The rest of the conjuncts indicate
that execution of each command Si with a true guard Bi terminates with
R true:

(10.3a) Definition. wp(IF, R) = domain(BB) ∧ BB ∧
        (B1 ⇒ wp(S1, R)) ∧ ... ∧ (Bn ⇒ wp(Sn, R))  □

Typically, we assume that the guards are total functions -i.e. are well-
defined in all states. This allows us to simplify the definition by deleting
the first conjunct. Thus, with the aid of quantifiers we rewrite the defini-
tion in (10.3b) below. From now on, we will use (10.3b) as the definition,
but be sure the guards are well-defined in the states in which the alterna-
tive command will be executed!

(10.3b) Definition. wp(IF, R) = (E i: 1 ≤ i ≤ n: Bi) ∧
        (A i: 1 ≤ i ≤ n: Bi ⇒ wp(Si, R))  □

Example 1. Let us show that, under all initial conditions, execution of
(10.1) stores the absolute value of x in z. That is, we want to show that
wp((10.1), z = abs(x)) = T. We have:

   wp((10.1), z = abs(x))
   = (x ≥ 0 ∨ x ≤ 0) ∧                             (this is BB ∧
     (x ≥ 0 ⇒ wp("z:= x", z = abs(x))) ∧            B1 ⇒ wp(S1, R) ∧
     (x ≤ 0 ⇒ wp("z:= -x", z = abs(x)))             B2 ⇒ wp(S2, R))
   = T ∧ (x ≥ 0 ⇒ x = abs(x)) ∧
         (x ≤ 0 ⇒ -x = abs(x))
   = T ∧ T ∧ T
   = T  □

Example 2. The following command is supposed to be the body of a loop
that counts the number of positive values (p) in array b[0:m-1].

(10.4) if b[i] > 0 → p,i:= p+1, i+1
        □ b[i] < 0 → i:= i+1
       fi

After execution of this command we expect to have i ≤ m and p equal to
the number of values in b[0:i-1] that are greater than zero. Letting R
be the assertion

   R: i ≤ m ∧ p = (N j: 0 ≤ j < i: b[j] > 0)

we calculate:

   wp((10.4), R)
   = (b[i] > 0 ∨ b[i] < 0) ∧
     (b[i] > 0 ⇒ wp("p,i:= p+1, i+1", R)) ∧
     (b[i] < 0 ⇒ wp("i:= i+1", R))
   = b[i] ≠ 0 ∧
     (b[i] > 0 ⇒ i+1 ≤ m ∧ p+1 = (N j: 0 ≤ j < i+1: b[j] > 0)) ∧
     (b[i] < 0 ⇒ i+1 ≤ m ∧ p = (N j: 0 ≤ j < i+1: b[j] > 0))
   = b[i] ≠ 0 ∧ i < m ∧
     p = (N j: 0 ≤ j < i: b[j] > 0) ∧
     p = (N j: 0 ≤ j < i: b[j] > 0)
   = b[i] ≠ 0 ∧ i < m ∧ p = (N j: 0 ≤ j < i: b[j] > 0)

Hence we see that array b should not contain the value 0, and that the
definition of p as the number of values greater than zero in b[0:i-1] will
be true after execution of the alternative command if it is true before.  □
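Because the two guards of (10.4) are mutually exclusive, the command is deterministic, and the calculated precondition can be confirmed by executing it on all small states (a Python sketch; helper names are ours, and abortion on b[i] = 0 is modeled by returning False):

```python
# Check: wp((10.4), R) = (b[i] != 0 and i < m and p = number of positives
# in b[0:i-1]), verified by executing the command on small states.
from itertools import product

def runs_ok(b, i, p, m):
    """Execute (10.4); return whether it terminates with R true."""
    if b[i] > 0:
        p, i = p + 1, i + 1
    elif b[i] < 0:
        i = i + 1
    else:
        return False                      # no guard true: abortion
    return i <= m and p == sum(1 for j in range(i) if b[j] > 0)

ok = all((b[i] != 0 and i < m and p == sum(1 for j in range(i) if b[j] > 0))
         == runs_ok(b, i, p, m)
         for b in product([-1, 0, 1], repeat=3)
         for i in range(3) for p in range(4) for m in range(4))
```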

The reader may feel that there was too much work in proving what we
did in example 2. After all, the result can be obtained in an intuitive
manner, and perhaps fairly easily (although one is likely to overlook the
problem with zero elements in array b). At this point, it is important to
practice such formal manipulations. It results in better understanding of
the theory and better understanding of the alternative command itself.

Moreover, the kind of manipulations performed in example 2 will indeed


be necessary in developing some programs, and the facility needed for this
can only come through practice. Even the act of performing a few exer-
cises will begin to change the way you "naturally" think about programs
and thus what you call your intuition about programming.
Later on, when attacking a problem that is similar to one worked on
earlier, it may not be necessary to be so formal, but the formality will be
at your fingertips when you need it on the more difficult problems.

Some comments about the alternative command


The alternative command differs from the conventional if-statement in
several respects. We now discuss the reasons for these differences.
First, the alternative command allows any number of alternatives, not
just two. Thus, it serves also as a "case statement" (Pascal) or "SELECT
statement" (PL/I). There is no need to have two different notations, one
for two alternatives and one for more. One notation for one concept -in
this case alternation or choice- is a well-known, reasonable principle.
There are no defaults: each alternative command must be preceded by
a guard that describes the conditions under which it may be executed.
For example, the command to set x to the absolute value of x must be
written with two guarded commands:

   if x ≥ 0 → skip  □  x ≤ 0 → x:= -x  fi

Its counterpart in ALGOL, if x < 0 then x:= -x, has the default that if
x ≥ 0 execution is equivalent to execution of skip. Although a program
may be a bit longer because of the lack of a default, there are advantages.
The explicit appearance of each guard does aid the reader; each alterna-
tive is given in full detail, leaving less chance of overlooking something.
More importantly, the lack of a default helps during program develop-
ment. Upon deriving a possible alternative command, the programmer is
forced to derive the conditions under which its execution will perform
satisfactorily and, moreover, is forced to continue deriving alternatives
until at least one is true in each possible initial state. This point will
become clearer in Part III.
The absence of defaults introduces, in a reasonable manner, the possi-
bility of nondeterminism. Suppose x = 0 when execution of command
(10.1) begins. Then, since both guards x ≥ 0 and x ≤ 0 are true, either
command may be executed (but only one of them). The choice is entirely
up to the executor -for example it could be a random choice, or on days
with odd dates it could be the first and on days with even dates it could
be the second, or it could be chosen to minimize execution time. The
point is that, since execution of either one leads to a correct result, the
programmer should not have to worry about which one is executed. He is
free to derive as many alternative commands and corresponding guards
as possible, without regard to overlap.
Of course, for purposes of efficiency the programmer could strengthen
the guards to excise the nondeterminism. For example, changing the
second guard in (10.1) from x ≤ 0 to x < 0 would help if evaluation of
unary minus is expensive, because in the case x = 0 only the first com-
mand z:= x could then be executed.
Finally, the lack of default allows the possibility of symmetry (see
(10.1)), which is pleasing -if not necessary- to one with a mathematical
eye.

A theorem about the alternative command


Quite often, we are not interested in the weakest precondition of an
alternative command, but only in determining if a known precondition
implies it. For example, if the alternative command appears in a pro-
gram, we may already know that its precondition is the postcondition of
the previous command, and we really don't need to calculate the weakest
precondition. In such cases, the following theorem is useful.

(10.5) Theorem. Consider command IF. Suppose a predicate Q satisfies

   (1) Q ⇒ BB, and
   (2) Q ∧ Bi ⇒ wp(Si, R), for all i, 1 ≤ i ≤ n.

Then (and only then) Q ⇒ wp(IF, R).  □

Proof. We first show how to take Q outside the scope of the quantifica-
tion in assumption (2) of the theorem:

   (A i: Q ∧ Bi ⇒ wp(Si, R))
   = (A i: ¬(Q ∧ Bi) ∨ wp(Si, R))        (Implication)
   = (A i: ¬Q ∨ ¬Bi ∨ wp(Si, R))         (De Morgan)
   = ¬Q ∨ (A i: ¬Bi ∨ wp(Si, R))         (Q doesn't depend on i)
   = Q ⇒ (A i: Bi ⇒ wp(Si, R))           (Implication, twice)

Hence, we have

   (Q ⇒ BB) ∧ (A i: Q ∧ Bi ⇒ wp(Si, R))         (Assumptions (1), (2))
   = (Q ⇒ BB) ∧ (Q ⇒ (A i: Bi ⇒ wp(Si, R)))     (From above)
   = Q ⇒ (BB ∧ (A i: Bi ⇒ wp(Si, R)))
   = Q ⇒ wp(IF, R)                              (Definition (10.3b))

Hence, the conjunction of the assumptions is equivalent to the conclusion,


and the theorem is proved. 0

Example 3. Suppose a binary search is being performed for a value x
known to be in array b[0:n-1]. We are at the stage where the following
predicate Q is true:

   Q: ordered(b[0:n-1]) ∧ 0 ≤ i < k < j < n ∧ x ∈ b[i:j].

That is, the search has been narrowed down to array section b[i:j], and k
is an index into this section. We want to prove that

(10.6) {Q} if b[k] ≤ x → i:= k  □  b[k] ≥ x → j:= k  fi {x ∈ b[i:j]}

holds. The first assumption Q ⇒ BB of theorem (10.5) holds, because the
disjunction of the guards in (10.6) is equivalent to T. The second
assumption holds, because

   Q ∧ b[k] ≤ x  ⇒  x ∈ b[k:j]
                  =  wp("i:= k", x ∈ b[i:j]),  and
   Q ∧ b[k] ≥ x  ⇒  x ∈ b[i:k]
                  =  wp("j:= k", x ∈ b[i:j]).

The two implications follow from the fact that Q indicates that the array
is ordered and that x is in b[i:j], and from the second conjunct of the
antecedents. Hence the theorem allows us to conclude that (10.6) is
true.  □
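The two implications can be spot-checked by brute force on a small ordered array, using inclusive section bounds b[i:j] as in the text (a Python sketch; the function name is ours):

```python
# Whenever Q holds, executing either enabled branch of (10.6)
# reestablishes x in b[i:j] (inclusive bounds, as in the text).
def check(b, i, j, k, x):
    assert b == sorted(b) and 0 <= i < k < j < len(b) and x in b[i:j + 1]
    results = []
    if b[k] <= x:
        results.append(x in b[k:j + 1])     # the branch i:= k
    if b[k] >= x:
        results.append(x in b[i:k + 1])     # the branch j:= k
    return all(results)

b = [1, 3, 3, 5, 7, 9]
ok = all(check(b, i, j, k, x)
         for i in range(len(b)) for j in range(i + 2, len(b))
         for k in range(i + 1, j) for x in b[i:j + 1])
```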

Exercises for Chapter 10


1. Determine wp("if fi", R), for any predicate R. Have you seen a command
with this definition before?
2. Prove that command IF satisfies properties (7.3) and (7.4) of chapter 7, pro-
vided the subcommands of IF do.
3. The following command S3 is used in an algorithm that finds the quotient and
remainder when a value x is divided by a value y. Calculate and simplify
wp(S3, q*w + r = x ∧ r ≥ 0).

   S3: if w ≤ r → r,q:= r-w, q+1  □  w > r → skip  fi.

4. Calculate and simplify wp(S4, a > 0 ∧ b > 0) for the command

   S4: if a > b → a:= a-b  □  b > a → b:= b-a  fi.

5. Calculate and simplify wp(S5, x ≤ y) for the command

   S5: if x > y → x,y:= y,x  □  x ≤ y → skip  fi.

6. Arrays f[0:n] and g[0:m] are alphabetically ordered lists of names of people.
It is known that at least one name is on both lists. Let X represent the first (in
alphabetic order) such name. Calculate and simplify the weakest precondition of
the following alternative command with respect to the predicate R given after it.
Assume i and j are within the array bounds.

   S6: if f[i] < g[j] → i:= i+1
        □ f[i] = g[j] → skip
        □ f[i] > g[j] → j:= j+1
       fi

   {R: ordered(f[0:n]) ∧ ordered(g[0:m]) ∧ f[i] ≤ X ∧ g[j] ≤ X}

7. The command of the following proof outline could be used in an algorithm to
store a*b in variable z. Using theorem 10.5, prove that the proof outline is true.

   {x > 0 ∧ z + y*x = a*b}
   if odd(x) → z,x:= z+y, x-1  □  even(x) → skip  fi;
   y,x:= 2*y, x÷2
   {x ≥ 0 ∧ z + y*x = a*b}
8. The command in the following proof outline could be used in an algorithm that
determines the maximum value m of an array b[0:n-1]. Using theorem 10.5,
prove that it is true.

   {0 < i < n ∧ m = max(b[0:i-1])}
   if b[i] > m → m:= b[i]  □  b[i] ≤ m → skip  fi
   {0 < i < n ∧ m = max(b[0:i])}
Chapter 11
The Iterative Command

The conventional while-loop and the iterative command


The while-loop in Pascal has the form "while B do S" and in PLj I
the rather baroque form "DO WHILE ( B ) ; SEND ;" for a Boolean
expression B and command S. S is sometimes called the body of the
loop. Execution of the while-loop can be expressed using a goto state-
ment as

loop: if B then begin S; goto loop end

but it is often described by a flaw chart:

l B
-
T
S
J
in F

out

In our programming notation, the while-loop has the form

   do B → S od

where B → S is a guarded command. This form allows us to generalize
to the following, which we call the iterative command and refer to by the
name DO.

(11.1) do B1 → S1
        □ B2 → S2
        ...
        □ Bn → Sn
       od

where n ≥ 0 and each Bi → Si is a guarded command. Note the syntactic
similarity between DO and IF; one is a set of guarded commands enclosed
in do and od, the other a set enclosed in if and fi.
In one sentence, here is how (11.1) can be executed. Repeat (or
iterate) the following until no longer possible: choose a guard Bi that is
true and execute the corresponding command Si.
Upon termination all the guards are false. Choosing a true guard and
executing its command is called performing an iteration of the loop.
Note that nondeterminism is allowed: if two or more guards are true,
any one (but only one) is chosen and its corresponding command is exe-
cuted at each iteration. Using IF to denote the alternative command with
the same guarded commands and BB to denote the disjunction of the
guards (see chapter 10), we see that (11.1) is equivalent to

   do BB → if B1 → S1
            □ ...
            □ Bn → Sn
           fi
   od

or   do BB → IF od

That is, if all the guards are false, which means that BB is false, execution
terminates; otherwise, the corresponding alternative command IF is exe-
cuted and the process is repeated. One iteration of a loop, therefore, is
equivalent to finding BB true and executing IF.
Thus, we could get by with only the simple while-loop. Nevertheless, we
will continue to use the more general form because it is extremely useful
in developing programs, as we will see in Part III.

The formal definition of DO


The following predicate H0(R) represents the set of states in which
execution of DO terminates in 0 iterations with R true, because the
guards are initially false:

   H0(R) = ¬BB ∧ R

Let us also write a predicate Hk(R), for k > 0, to represent the set of all
states in which execution of DO terminates in k or fewer iterations, with
R true. The definition will be recursive -i.e. in terms of Hk-1(R). One
case is that DO terminates in 0 iterations, in which case H0(R) is true.
The other case is that at least one iteration is performed. Thus, BB must
initially be true and the iteration consists of executing a corresponding IF.
This execution of IF must terminate in a state in which the loop will
iterate k-1 or fewer times. This leads to

   Hk(R) = H0(R) ∨ wp(IF, Hk-1(R)),   for k > 0.

Now, wp(DO, R) is to represent the set of states in which execution of
DO terminates in a bounded number of iterations with R true. That is,
initially there must be some k such that at most k iterations will be per-
formed. We therefore define

(11.2) Definition. wp(DO, R) = (E k: 0 ≤ k: Hk(R))  □

Two examples of reasoning about loops


The formal definition of DO is not easy to use, and it gives no insight
into developing programs. Therefore, we want to develop a theorem that
allows us to work with a useful precondition of a loop (with respect to a
postcondition) that is not the weakest precondition. We first illustrate the
idea with two examples.
Execution of the following algorithm is supposed to store in variable s
the sum of the elements of array b[0:10].

   i,s:= 1, b[0];
   do i < 11 → i,s:= i+1, s+b[i] od
   {R: s = (Σ k: 0 ≤ k < 11: b[k])}

How can we argue that it works? Let's begin by giving a predicate P that
shows the logical relationship between variables i, s and b -in effect, it
serves as a definition of i and s:

   P: 1 ≤ i ≤ 11 ∧ s = (Σ k: 0 ≤ k < i: b[k])


We will show that P is true just before and after each iteration of the
loop, so that it is also true upon termination. If P is true in all these
places, then, with the additional help of the falsity of the guards, we can
see that R is also true upon termination (since P ∧ i ≥ 11 ⇒ R). We
summarize what we need to show by annotating the algorithm:
Chapter 11 The Iterative Command 141

{T}
i, s:= 1, b[0];
{P}
(11.3) do i < 11 → {i < 11 ∧ P} i, s:= i+1, s+b[i] {P} od
{i ≥ 11 ∧ P}
{R}

We repeat, because it is very important: if we can show that (1) P is true
before execution of the loop and that (2) each iteration of the loop leaves
P true, then P is true before and after each iteration and upon termina-
tion. Then, the truth of P and the falsity of the guard allow us to con-
clude that the desired result R has been established.
Now let's verify that P is true after the initialization i, s:= 1, b[0], no
matter what the initial state is. We can see this informally, or we can
prove it as follows:

wp("i, s:= 1, b[0]", P)
= 1 ≤ 1 ≤ 11 ∧ b[0] = (Σ k: 0 ≤ k < 1: b[k])
= T.

Now let's show that an iteration of the loop terminates with P true -i.e.
an execution of command i, s:= i+1, s+b[i] beginning with P and
i < 11 true terminates with P still true. Again, we can see this informally
or we can formally prove it:

wp("i, s:= i+1, s+b[i]", P)
= 1 ≤ i+1 ≤ 11 ∧ s+b[i] = (Σ k: 0 ≤ k < i+1: b[k])
= 0 ≤ i < 11 ∧ s = (Σ k: 0 ≤ k < i: b[k])

and (P ∧ i < 11) implies the last line.


Hence we know that if execution of the loop terminates, upon termina-
tion P and i ≥ 11, and hence R, are true.
A predicate P that is true before and after each iteration of a loop is
called an invariant relation, or simply an invariant, of the loop. (The
adjective invariant means constant, or unchanging. In mathematics the
term means unaffected by the group of mathematical operations under
consideration, the single operation here being an iteration of the loop
under the initial truth of P.)
To show that the loop terminates, we introduce an integer function, t,
of the program variables that is an upper bound on the number of itera-
tions still to be performed. Each iteration of the loop decreases t by at
least 1 and, as long as execution of the loop has not terminated, t is
bounded below by O. Hence, the loop must terminate. Let t be:

t: 11 - i.

Since each iteration increases i by 1, it obviously decreases t by 1. Also,
as long as there is an iteration to perform, i.e. as long as i < 11, we know
that t is greater than 0.
In this case, t indicates exactly how many iterations are still to be per-
formed, but, in general, it may only provide an upper bound on the
number of iterations still to be performed. Function t has been called a
variant function, as opposed to the invariant relation P -the function
changes at each iteration; the relation remains invariantly true. However,
in order to emphasize its purpose, we will call t the bound function.
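The reasoning above can be mirrored at runtime. The sketch below transcribes the summation loop into Python (the array contents are chosen arbitrarily) and checks the invariant P and the bound function t = 11 - i on every iteration; such checks test the annotation on one run, they do not replace the proof.

```python
b = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]       # arbitrary contents for b[0:10]

def P(i, s):                                # the invariant of the text
    return 1 <= i <= 11 and s == sum(b[k] for k in range(i))

i, s = 1, b[0]
assert P(i, s)                              # P holds after initialization
while i < 11:
    t = 11 - i                              # bound function before the step
    i, s = i + 1, s + b[i]                  # simultaneous, like i,s:= i+1,s+b[i]
    assert P(i, s)                          # each iteration leaves P true
    assert 11 - i < t                       # ... and decreases t
assert s == sum(b[k] for k in range(11))    # P and i >= 11 establish R
print(s)                                    # 44 for this choice of b
```

Python's tuple assignment evaluates the whole right-hand side first, so it matches the multiple assignment of the algorithm.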
The previous example may seem to require too much of an explanation
for such a simple algorithm. Let us now consider a second example,
whose correctness is not so obvious. Indeed, it is only with the aid of the
invariant that we will be able to understand it. Algorithm (11.4) is sup-
posed to store in variable z the value a*b for b ~ 0, but without the use
of multiplication.

{b ≥ 0}
x, y, z:= a, b, 0;
(11.4) do y > 0 ∧ even(y) → y, x:= y÷2, x+x
       □ odd(y) → y, z:= y-1, z+x
       od
{R: z = a*b}

One view of the loop is that it processes the binary representation of b,
which has been stored in y. Testing for oddness and evenness is done by
interrogating the rightmost bit, subtracting 1 when the rightmost bit is 1
means changing it to a zero, and dividing by 2 is done by shifting the
binary representation 1 bit to the right, thus deleting the rightmost bit.
But, how do we know the algorithm works? We introduce -out of the
old hat, so to speak- the invariant P (how to find invariants is a topic of
Part III):

P: y ≥ 0 ∧ z + x*y = a*b.

We determine that P is true just after the initialization:

wp("x, y, z:= a, b, 0", P) = b ≥ 0 ∧ 0 + a*b = a*b,

which is obviously implied by the precondition of algorithm (11.4). Next,
we show that any iteration of the loop beginning with P true terminates
with P true, so that P is an invariant of the loop. For the second
guarded command, this can be observed by noting that the value of
z + x*y remains the same if y is decreased by 1 and x is added to z:

z + x*y = (z+x) + x*(y-1). For the first guarded command, note that
execution of y, x:= y÷2, x+x with y even leaves the value of z + x*y
unchanged, because x*y = (x+x)*(y÷2) when y is even. We leave the
more formal verification to the reader (exercise 7).
Since each iteration of the loop leaves P true, P must be true upon
termination. We show that P together with the falsity of the guards
implies the result R as follows:

P ∧ ¬(y > 0 ∧ even(y)) ∧ ¬odd(y)
= y ≥ 0 ∧ z + x*y = a*b ∧ (y ≤ 0 ∧ even(y))
= y = 0 ∧ z + x*y = a*b
⇒ z = a*b

The work done thus far is conveyed by the following annotated program.

{b ≥ 0}
x, y, z:= a, b, 0;
{P}
do y > 0 ∧ even(y) → {P ∧ y > 0 ∧ even(y)} y, x:= y÷2, x+x {P}
(11.5) □ odd(y) → {P ∧ odd(y)} y, z:= y-1, z+x {P}
od
{P ∧ y ≤ 0 ∧ ¬odd(y)}
{P ∧ y = 0}
{R: z = a*b}

To show that the loop terminates, use the bound function t = y: it is
greater than 0 if there is another iteration to execute and is decreased by
at least 1 on each iteration.
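Algorithm (11.4) can be transcribed directly, with the invariant P checked on every iteration. Note that the assertion uses multiplication only to check P; the loop body itself gets by with halving, doubling by addition, and subtraction, as in the original.

```python
def times(a, b):
    """z = a*b for b >= 0, computed without multiplying in the loop."""
    assert b >= 0
    x, y, z = a, b, 0
    while y > 0:
        assert y >= 0 and z + x * y == a * b   # invariant P (check only)
        if y % 2 == 0:                         # guard: y > 0 and even(y)
            y, x = y // 2, x + x
        else:                                  # guard: odd(y)
            y, z = y - 1, z + x
    return z                                   # P and y = 0 give z = a*b

print(times(13, 11))   # 143
```

With y = 11 the loop runs through the binary digits 1011 of b, doubling x on the even steps and accumulating into z on the odd ones.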

A theorem concerning a loop, an invariant and a bound function


In the two examples just given, the same kind of reasoning was used to
argue that the loops performed as desired. This form of reasoning is
embodied in theorem (11.6). By now, the theorem should be quite clear.
Assumption 1 implies that P will be true upon termination of DO.
Assumption 2 indicates that function t is bounded below by 0 as long as
execution of DO has not terminated. Assumption 3 indicates that each
iteration decreases t by at least one, so that termination is guaranteed to
occur. An unbounded number of iterations would decrease t below any
limit, which would lead to a contradiction. Finally, upon termination all
the guards are false, so that ¬BB is true.

(11.6) Theorem. Consider loop DO. Suppose a predicate P satisfies

1. P ∧ Bi ⇒ wp(Si, P), for all i, 1 ≤ i ≤ n.

Suppose, further, that an integer function t satisfies the following,
where t1 is a fresh identifier:

2. P ∧ BB ⇒ (t > 0),
3. P ∧ Bi ⇒ wp("t1:= t; Si", t < t1), for 1 ≤ i ≤ n.

Then P ⇒ wp(DO, P ∧ ¬BB).  □

Proof. We leave to the reader (exercise 2) the proof that assumption 1
implies

1'. P ∧ BB ⇒ wp(IF, P)

We leave to the reader (exercise 3) the proof that assumption 3 implies

3'. P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t0), for all t0.

Finally, we leave to the reader (exercise 4) the proof that, for all k ≥ 0,

(11.7) P ∧ t ≤ k ⇒ Hk(P ∧ ¬BB)

Predicate (11.7) is interpreted to mean that in any state in which P is
true, if t ≤ k, then execution of the loop will terminate in k or fewer
iterations with P true and BB false. Since t is a finite function, (E k:
0 ≤ k: t ≤ k) is true in any state. Therefore,

P = P ∧ (E k: 0 ≤ k: t ≤ k)
  = (E k: 0 ≤ k: P ∧ t ≤ k)         (Since k is not free in P)
  ⇒ (E k: 0 ≤ k: Hk(P ∧ ¬BB))      ((11.7))
  = wp(DO, P ∧ ¬BB)                (Definition (11.2))  □

Discussion
A loop has many invariants. For example, the predicate x*0 = 0 is an
invariant of every loop since it is always true. But an invariant that satis-
fies the assumptions of theorem (11.6) is important because it provides
understanding of the loop. Indeed, every loop, except the most trivial,
should be annotated with an invariant that satisfies the theorem.
As we shall see in Part III, the invariant is not only useful to the
reader, it is almost necessary for the programmer. We shall give heuris-
tics for developing the invariant and bound function before developing the
loop and argue that this is the more effective way to program. This
makes sense if we view the invariant as simply the definition of the vari-
ables and remember the adage about precisely defining variables before
using them. At this point, of course, developing an invariant may seem
almost impossible, since even the idea of an invariant is new. Leave the
development process to Part III, and for now concentrate on understand-
ing loops for which invariants are already provided.

Annotating a loop and understanding the annotation


Algorithms (11.3) and (11.5) are annotated to show when and where
the invariants are true. Rather than write the invariant in so many places,
it is often easier to give the invariant and bound function in the text
accompanying an algorithm. When it is necessary to include them in the
algorithm itself, it is advantageous to use an abbreviation, such as shown
in (11.8).

{Q}
{inv P: the invariant}
{bound t: the bound function}
(11.8) do B1 → S1
       □ ...
       □ Bn → Sn
       od
{R}

When faced with a loop with form (11.8), according to theorem (11.6)
the reader need only check the points given in (11.9) to understand that
the loop is correct. The existence of such a checklist is indeed an advan-
tage, for it allows one to be sure that nothing has been forgotten. In fact,
the checklist is of use to the programmer himself, although after a while
(pun) its use becomes second-nature.

(11.9) Checklist for understanding a loop:

1. Show that P is true before execution of the loop begins.
2. Show that {P ∧ Bi} Si {P}, for 1 ≤ i ≤ n. That is, execution of
   each guarded command terminates with P true, so that P is
   indeed an invariant of the loop.
3. Show that P ∧ ¬BB ⇒ R, i.e. upon termination the desired
   result is true.
4. Show that P ∧ BB ⇒ (t > 0), so that t is bounded from below
   as long as the loop has not terminated.
5. Show that {P ∧ Bi} t1:= t; Si {t < t1}, for 1 ≤ i ≤ n, so that
   each loop iteration is guaranteed to decrease the bound func-
   tion.  □
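The checklist can also be read as a runtime harness: given guard/command pairs, an invariant, a bound function and a postcondition, check points 1-5 on a sample execution. The sketch below represents a state as a Python dictionary (the name check_loop and the state layout are invented); it tests single runs rather than proving the implications for all states. The division loop of exercise 12 serves as the instance.

```python
def check_loop(P, guards, R, t, state):
    assert P(state)                          # 1. P holds before the loop
    while any(B(state) for B, S in guards):
        assert t(state) > 0                  # 4. P and BB imply t > 0
        t1 = t(state)
        for B, S in guards:                  # execute one enabled command
            if B(state):
                S(state)
                break
        assert P(state)                      # 2. {P and Bi} Si {P}
        assert t(state) < t1                 # 5. t decreases each iteration
    assert R(state)                          # 3. P and not BB imply R

# Instance: the quotient/remainder loop of exercise 12, with x = 17, y = 5
st = {'q': 0, 'r': 17, 'y': 5}
check_loop(P=lambda s: 0 <= s['r'] and 0 < s['y']
                       and s['q'] * s['y'] + s['r'] == 17,
           guards=[(lambda s: s['r'] >= s['y'],
                    lambda s: s.update(r=s['r'] - s['y'], q=s['q'] + 1))],
           R=lambda s: 0 <= s['r'] < s['y'],
           t=lambda s: s['r'],
           state=st)
print(st['q'], st['r'])   # 3 2
```

When several guards are enabled the harness simply takes the first, which is one of the executions a nondeterministic loop permits.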

Often, only the invariant and bound function need be provided as
documentation for a loop, because the algorithm is then almost trivial to
verify. This is documentation at its best: just enough to provide the
necessary understanding and not so much that the reader is lost in super-
fluous, obvious details.
In the same vein, the parts of an invariant that refer only to unchanged
variables of a loop are often omitted, and the reader is expected to note
this. For example, in algorithm (11.3) we did not indicate explicitly that
array b remained unchanged (by including as a conjunct of the precondi-
tion, the invariant and the postcondition the predicate b = B where B
represents the initial value of b). Similarly, in algorithm (5.4) we did not
indicate explicitly that a and b remained unchanged.
It is important to perform several of the exercises 7-13. In Part III we
will be discussing the development of loops, but Part III will make sense
and seem easy only if you are completely familiar with theorem 11.6 and
the use of checklist 11.9 and if you have gained some facility in this way
of thinking.

Exercises for Chapter 11


1. Determine wp(do od, R), for any R. Have you seen a command with these
characteristics before?
2. Prove that 1' follows from assumption 1 (see the proof of theorem (11.6)).
3. Prove that 3' follows from assumption 3 (see the proof of theorem (11.6)).
4. Prove by induction on k that (11.7) follows from 1', 2 and 3' (see the proof of
theorem (11.6)).
5. Prove that properties (7.3) and (7.4) of chapter 7 hold for the definition of
wp(DO, R).
6. Hk(R) represents the states in which execution of DO will terminate in k or
fewer iterations with R true. Define H'k(R) to represent the set of states in
which execution of DO will terminate in exactly k iterations. What set of states
does the predicate (E k: 0 ≤ k: H'k(R)) represent? How does it differ from
wp(DO, R)?
7. Formally prove the points of checklist 11.9 for algorithm 11.4.
8. Formally prove the points of checklist 11.9 for the following algorithm, which
stores in s the sum of the elements of b[1:10].

{T}
i, s:= 10, 0;
{inv P: 0 ≤ i ≤ 10 ∧ s = (Σ k: i+1 ≤ k ≤ 10: b[k])}
{bound t: i}
do i ≠ 0 → i, s:= i-1, s+b[i] od
{R: s = (Σ k: 1 ≤ k ≤ 10: b[k])}

9. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds the position i of x in array b[0:n-1] if x ∈ b[0:n-1] and sets
i to n if it is not.

{0 ≤ n}
i:= 0;
{inv P: 0 ≤ i ≤ n ∧ x ∉ b[0:i-1]}
{bound t: n-i}
do i < n cand x ≠ b[i] → i:= i+1 od
{R: (0 ≤ i < n ∧ x = b[i]) ∨ (i = n ∧ x ∉ b[0:n-1])}

10. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm sets i to the highest power of 2 that is at most n.

{0 < n}
i:= 1;
{inv P: 0 < i ≤ n ∧ (E p: i = 2^p)}
{bound t: n-i}
do 2*i ≤ n → i:= 2*i od
{R: 0 < i ≤ n < 2*i ∧ (E p: i = 2^p)}

11. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the nth Fibonacci number fn for n > 0, which is defined by
f0 = 0, f1 = 1, and fn = fn-1 + fn-2 for n > 1.

{n > 0}
i, a, b:= 1, 1, 0;
{inv P: 1 ≤ i ≤ n ∧ a = fi ∧ b = fi-1}
{bound t: n-i}
do i < n → i, a, b:= i+1, a+b, a od
{R: a = fn}

12. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm computes the quotient q and remainder r when x is divided by y.

{x ≥ 0 ∧ 0 < y}
q, r:= 0, x;
{inv P: 0 ≤ r ∧ 0 < y ∧ q*y + r = x}
{bound t: r}
do r ≥ y → r, q:= r-y, q+1 od
{R: 0 ≤ r < y ∧ q*y + r = x}

13. Formally prove the points of checklist 11.9 for the following algorithm. The
algorithm finds an integer k such that b[k] is the maximum value of array
b[0:n-1] -note that if the maximum value occurs more than once the algorithm
is nondeterministic.

{0 < n}
i, k:= 1, 0;
{inv P: 0 < i ≤ n ∧ b[k] ≥ b[0:i-1]}
{bound t: n-i}
do i < n → if b[i] ≤ b[k] → skip
           □ b[i] ≥ b[k] → k:= i
           fi;
           i:= i+1
od
{R: b[k] ≥ b[0:n-1]}
Chapter 12
Procedure Call

This chapter develops the definition of the command procedure call,
which invokes a procedure declared with value and result parameters.
Two theorems are then proved in order to make proving procedure calls
correct easier. Finally, the theorems are extended to include reference, or
"var", parameters. This chapter relies heavily on the multiple assignment
command, which was defined in section 9.4.
This material need not be read to understand Part III (program
development), and this chapter may be skipped. The purpose of the
chapter is to illustrate how the correctness issues extend to more compli-
cated constructs, but the material will not be used formally. Because of
the attempt to deal with a number of different parameter-passing mechan-
isms, the chapter contains quite a few theorems concerning procedure
calls. When using anyone programming notation, of course, the number
of parameter-passing mechanisms and corresponding applicable theorems
used is fewer.
In order to introduce this detailed material in as simple a manner as
possible, it is assumed that procedures are not recursive.
The procedure, or subroutine as it is called in FORTRAN, is a basic
building-block in programming. (The word routine was used as early as
1949 on the EDSAC, generally accepted as the first practical stored pro-
gram computer to be completed. It was built at Cambridge University by
a team headed by Maurice Wilkes and executed its first program in May
1949.) The main use of the procedure is in abstraction. By abstraction
we mean the act of singling out a few properties of an object for further
use or study, omitting from consideration other properties that don't con-
cern us for the moment. The main property that we single out, once a
procedure is written, is what it does; the main property that we omit from
consideration is how it does it.

In one sense, using a procedure is exactly like using any other opera-
tion (e.g. +) of the programming notation, and constructing a procedure is
extending the language to include another operation. For example, when
we use + in an expression, we never question how it is performed; we just
assume that it works. Similarly, when writing a procedure call we rely
only on what the procedure does, and not on how it does it. In another
sense, a procedure (and its proof) is a lemma. A program can be con-
sidered a constructive proof that its specification is consistent and com-
putable; a procedure is a lemma used in the constructive proof.
In the following sections, Pascal-like notations are used for procedure
declaration and call, although the (possible) execution of a procedure call
may not be exactly as in Pascal. The reason is that the main influence in
developing the procedure call here was the need for a simple, understand-
able theorem about its use, and such an influence was beyond the state of
the art when Pascal was developed.

12.1 Calls with Value and Result Parameters

Procedure declaration
A procedure declaration has the form

proc <identifier> ( <parameter specification>; ...;
                    <parameter specification> );
{P} <body> {Q}

where each <parameter specification> has one of the three forms

value <identifier list> : <type>
value result <identifier list> : <type>
result <identifier list> : <type>

As usual, an <identifier list> is a sequence of one or more identifiers,
joined by commas. A parameter of a procedure has a type (e.g. Boolean,
array of integer), which defines the type that a corresponding argument
may have. We do not consider array bounds to be part of the type of an
array; an array is simply a function of one integer argument.
Execution of a procedure call causes the procedure <body> to be exe-
cuted. The <body> may be any command whatsoever -a sequence of
commands, an assignment command, a loop, etc. It may contain suitably
declared local variables. During execution of the <body>, the parame-
ters are considered to be local variables of the <body>. The initial values
of the parameters and the use of their final values are determined by the

attributes value or result given to the parameters in the procedure head-
ing. This will be explained later.
The precondition P and postcondition Q of the <body> are necessary
for understanding, but not executing, a procedure call. It is assumed that
{P} <body> {Q} has been proved, and that this information will be used
in writing calls. A trend in documentation is to require both the pre- and
postcondition to be written before the body, as shown below, because it is
easier to find the information needed to write and understand procedure
calls. To write a call, one need only understand the first three lines:

{Pre: P}
{Post: Q}
proc <identifier>( <par. spec.>; ...; <par. spec.> );
<body>
The following restrictions are made on the use of identifiers in a pro-
cedure declaration. The only identifiers that can be used in the body are
the parameters and the identifiers declared in the body itself -i.e. no
"global variables" are allowed. The parameters must be distinct identif-
iers. Precondition P of the body may contain as free only the parameters
with attribute value (and value result); postcondition Q only parameters
with attribute result (and value result). This restriction is essential for a
simple definition of procedure call, but it does not limit procedures or
calls of them in any essential way. P and Q may, of course, contain as
free other identifiers that are not used within the program (to denote ini-
tial values of variables, etc.). See section 12.4 for a way to eliminate this
restriction.

Example. Given fixed x, fixed n > 0 and fixed array b[0:n-1], where
x ∈ b, the following procedure determines the position of x in b, thus
establishing x = b[i].

{Pre: n = N ∧ x = X ∧ b = B ∧ X ∈ B[0:N-1]}
{Post: 0 ≤ i < N ∧ B[i] = X}
proc search(value n, x: integer;
            value b: array of integer;
            result i: integer);
i:= 0;
{invariant: 0 ≤ i < N ∧ X ∉ B[0:i-1]}
{bound: N-i}
do b[i] ≠ x → i:= i+1 od

Note that identifiers have been used to denote the initial values of the
parameters that do not have attribute result, even though the parameters
are not altered during execution of the procedure body.  □
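The body of search transcribes directly into Python. The precondition X ∈ B[0:N-1] is what guarantees that the subscript i stays in range; the transcription below (a sketch in Python, not part of the book's notation) checks it with an assertion.

```python
def search(n, x, b):
    assert x in b[0:n]          # precondition: x occurs in b[0:n-1]
    i = 0
    while b[i] != x:            # bound N - i; i stays in range because
        i = i + 1               # x is known to occur somewhere ahead
    return i                    # postcondition: 0 <= i < n and b[i] = x

print(search(5, 9, [4, 7, 9, 9, 1]))   # 2
```

Without the precondition the loop could run off the end of the array, which is exactly why the precondition appears in the declaration.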

In the sequel, we assume the procedure has the following form:

(12.1.1) proc p(value x̄; value result ȳ; result z̄);
         {P} B {Q}

Thus, the x̄ are the value parameters of procedure p, the ȳ the value-
result parameters and the z̄ the result parameters. We have left out the
types of the parameters because they don't concern us at this point. (This
is an example of the use of abstraction!)

The procedure call and its execution


We are interested in formally defining the command procedure call,
which has the form

(12.1.2) p(ā, b̄, c̄)

The name of the procedure is p. The ai, bi and ci are the arguments of
the procedure. The ai are expressions; the bi and ci have the form
identifier ∘ selector -in common parlance, they are "variables". The ai
are the value arguments corresponding to the x̄ of (12.1.1), the bi the
value-result arguments and the ci the result arguments. Each argument
must have the same type as its corresponding parameter.
The identifiers accessible at the point of call must be different from the
procedure parameters x̄, ȳ and z̄. This restriction avoids extra notation
needed to deal with the conflict of the same identifier being used for two
different purposes and is not essential.
To illustrate, here is a call of procedure search of the previous exam-
ple: search(50, t, C, position[j]). Its execution stores in position[j] the
position of the value of t in array C[0:49].
A call p(ā, b̄, c̄) can be executed as follows:

All parameters are considered to be local variables of the pro-
cedure. First, determine the values of the value arguments ā
and b̄ and store them in the corresponding parameters x̄ and ȳ.
Second, determine the variables described by the result argu-
ments b̄, c̄ -i.e. determine their addresses in memory. Note
that all parameters with attribute value are initialized, and the
others are not. Third, execute the procedure body. Fourth,
store the values of the result parameters ȳ, z̄ in the correspond-
ing result arguments b̄, c̄ (using their previously determined
addresses) in left-to-right order.

Formal definition of the procedure call

From the above description of execution, we see that execution of the
call p(ā, b̄, c̄) is equivalent to execution of the sequence

x̄, ȳ:= ā, b̄;  B;  b̄, c̄:= ȳ, z̄

(The addresses of b̄, c̄ can be evaluated before or after execution of the
procedure body B, since execution of B cannot change them.) We define

(12.1.3) wp(p(ā, b̄, c̄), R) = wp("x̄, ȳ:= ā, b̄; B; b̄, c̄:= ȳ, z̄", R)
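Definition (12.1.3) can be acted out operationally: copy the value and value-result arguments into the parameters, run the body on the locals only, then copy the value-result and result parameters back out. The sketch below is a hypothetical illustration with one parameter of each kind; the names call, body and env are invented, and env stands for the caller's store.

```python
def call(body, a, env, b_name, c_name):
    # x, y := a, b     (step 1: copy value and value-result arguments in)
    x, y = a, env[b_name]
    # (step 2: the result "addresses" here are just the names b_name, c_name)
    # B                (step 3: the body sees only the locals x, y, z)
    y, z = body(x, y)
    # b, c := y, z     (step 4: copy value-result and result parameters out)
    env[b_name], env[c_name] = y, z
    return env

# A made-up body: double the value-result parameter, report x + y as result
env = call(lambda x, y: (y + y, x + y), a=10,
           env={'b': 3, 'c': 0}, b_name='b', c_name='c')
print(env)   # {'b': 6, 'c': 13}
```

Because the body touches only its locals, the caller's store changes exactly at the copy-out step, which is what makes the multiple-assignment reading of the call sound.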

12.2 Two Theorems Concerning Procedure Call

We now develop theorems that allow the use of procedural abstraction
when writing procedure calls. First, we state a theorem and argue about
its validity based on our notion of procedure call execution.

(12.2.1) Theorem. Suppose procedure p is defined as in (12.1.1). Then

{PR: P^{x̄,ȳ}_{ā,b̄} ∧ (A ū, v̄: Q^{ȳ,z̄}_{ū,v̄} ⇒ R^{b̄,c̄}_{ū,v̄})} p(ā, b̄, c̄) {R}

holds. In other words, PR ⇒ wp(p(ā, b̄, c̄), R).


Proof Suppose for the moment that we know the values ii, v that will
be assigned to parameters with attribute result. Then execution of the
procedure body B, by itself, can be viewed as a multiple assignment
y,Z:= ii,v. From (12.1.3), we see that the procedure call can be viewed as
the following sequence (12.2.2). In (12.2.2), postcondition R has been
placed at the end and assertions P and Q have been placed suitably
because we expect to use that information subsequently.

(12.2.2) x,y:= a,b {P}; y,z:= ii,v {Q}; b,c:= y,z {R}

Since this is a sequence of assignments, we can easily determine the weak-
est precondition such that its execution will establish each of the three
predicates at the indicated places. Note that these are necessary and suffi-
cient conditions. For example, R holds on termination iff (12.2.5) holds
before execution:

(12.2.3) Weakest precondition to establish P:  P^{x̄,ȳ}_{ā,b̄}

(12.2.4) Weakest precondition to establish Q:  (Q^{ȳ,z̄}_{ū,v̄})^{x̄,ȳ}_{ā,b̄}
         = Q^{ȳ,z̄}_{ū,v̄}   (since it contains no xi or yi!)

(12.2.5) Weakest precondition to establish R:  ((R^{b̄,c̄}_{ȳ,z̄})^{ȳ,z̄}_{ū,v̄})^{x̄,ȳ}_{ā,b̄}
         = R^{b̄,c̄}_{ū,v̄}   (since it contains no xi or yi!)

In order to be able to use the fact that {P} B {Q} has been proved about
the procedure body, we require that (12.2.3) be true before the call; this is
the first conjunct in the precondition PR of the theorem. Therefore, no
matter what values ū, v̄ execution assigns to the result parameters, Q will
be true in the indicated place in (12.2.2).
Now, we want to determine initial conditions that guarantee the truth
of R upon termination, no matter what values ū, v̄ are assigned to the
result parameters and arguments. R holds after the call if, for all values
ū, v̄, the truth of Q in (12.2.2) implies the truth of R after the call. This
can be written in terms of the initial conditions as

(A ū, v̄: (12.2.4) ⇒ (12.2.5))

This is the second conjunct of the precondition of the theorem.  □

Examples of the use of theorem 12.2.1


Each example illustrates an important point about the use of the
theorem, so read the examples carefully. In each, the procedure body is
omitted, since it is not needed to ascertain correctness of a call.

Example 1. Consider the procedure

proc swap(value result y1, y2: integer);
{P: y1 = X ∧ y2 = Y} B {Q: y1 = Y ∧ y2 = X}

We want to prove that

(12.2.6) {a = X ∧ b = Y} swap(a, b) {R: a = Y ∧ b = X}

holds, where a and b are integer variables and identifiers Y and X de-
note their final values, respectively. We apply theorem (12.2.1) to find a
satisfactory precondition PR:

PR = (a = X ∧ b = Y) ∧
     (A u1, u2: (y1 = Y ∧ y2 = X)^{y1,y2}_{u1,u2} ⇒ (a = Y ∧ b = X)^{a,b}_{u1,u2})
   = (a = X ∧ b = Y) ∧
     (A u1, u2: (u1 = Y ∧ u2 = X) ⇒ (u1 = Y ∧ u2 = X))
   = (a = X ∧ b = Y) ∧ T

and this is implied by the precondition of (12.2.6). Hence, (12.2.6) is
correct.  □

Example 2. Consider the procedure of Example 1. Suppose we want to
prove that

(12.2.7) {a = A ∧ b = Y} swap(a, b) {a = Y ∧ b = A}

holds, where a and b are integer variables and identifiers A and Y
denote their initial values, respectively. The difficulty here is that dif-
ferent identifiers are used in the declaration and in the call to denote the
initial values. We surmount this difficulty as follows. The following
has been proved about the procedure body -i.e. it is a tautology:

{P: y1 = X ∧ y2 = Y} B {Q: y1 = Y ∧ y2 = X}

Therefore, it is equivalent to

(A X, Y: {y1 = X ∧ y2 = Y} B {y1 = Y ∧ y2 = X})

Now we can produce an instance of the above quantified predicate by
replacing X by A and Y by Y, respectively, to yield

{y1 = A ∧ y2 = Y} B {y1 = Y ∧ y2 = A}

Thus, this last line is also true about the procedure body B. Now apply
the theorem as in example 1 to yield the desired result. Hence, (12.2.7)
holds.
This illustrates how initial and final values of parameters can be han-
dled. The identifiers that denote initial and final values of parameters can
be replaced by fresh identifiers -or any expressions- to yield another
proof about the procedure body, which can then be used in theorem
12.2.1.  □

Example 3. We now prove correct a call that has array elements as argu-
ments. Consider the procedure of example 1. We want to prove that
swap(i, b[i]) interchanges i and b[i] but leaves the rest of array b
unchanged. It is assumed that the value of i is a valid subscript. Thus,
we want to prove

(12.2.8) {i = I ∧ (A j: b[j] = B[j])}
         swap(i, b[i])
         {R: i = B[I] ∧ b[I] = I ∧ (A j: I ≠ j: b[j] = B[j])}

Identifiers I and B denote the initial values of i and b, respectively. In
the proof of the body of the procedure declaration, we can replace the

expressions X and Y by I and B[I], respectively, to yield

{P: y1 = I ∧ y2 = B[I]} B {Q: y1 = B[I] ∧ y2 = I}

Now apply theorem 12.2.1 to R of (12.2.8) to get the precondition PR

PR = i = I ∧ b[i] = B[I] ∧
     (A u1, u2: u1 = B[I] ∧ u2 = I ⇒ u1 = B[I] ∧
       (b; i:u2)[I] = I ∧ (A j: I ≠ j: (b; i:u2)[j] = B[j]))
   = i = I ∧ b[i] = B[I] ∧ B[I] = B[I] ∧
     (b; i:I)[I] = I ∧ (A j: I ≠ j: (b; i:I)[j] = B[j])
   = i = I ∧ b[i] = B[I] ∧ T ∧ I = I ∧ (A j: I ≠ j: b[j] = B[j])

and this is implied by the precondition of (12.2.8).  □

Example 4. Consider the procedure

proc p(value x; result z1, z2);
{P: x = X} z1, z2:= x, x {Q: z1 = z2 = X}

which assigns the value parameter to both result parameters. Note that
postcondition Q does not contain the value parameter. We want to exe-
cute the call p(b[i], i, b[i+1]), which assigns b[i] to i and b[i+1].
Thus, it makes sense to try to prove

(12.2.9) {b[i] = C ∧ i = I} p(b[i], i, b[i+1]) {R: i = b[I] = b[I+1] = C}

First, replace the free variable X in the proof of the procedure body by
C:

{P: x = C} z1, z2:= x, x {Q: z1 = z2 = C}

Next, apply theorem 12.2.1 to yield the precondition

b[i] = C ∧
(A v1, v2: v1 = v2 = C ⇒
  v1 = (b; i+1:v2)[I] = (b; i+1:v2)[I+1] = C)
= b[i] = C ∧ (b; i+1:C)[I] = (b; i+1:C)[I+1] = C

Since the last line is implied by the precondition of (12.2.9), (12.2.9)
holds.  □
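The subtlety in example 4 is that the address of the result argument b[i+1] is fixed before the body runs (step two of the execution rule), so the later assignment to i does not redirect it. The sketch below acts out the call p(b[i], i, b[i+1]) step by step, with I = 0 and C = 7 chosen arbitrarily.

```python
b = [7, 0, 0]      # C = 7 is stored at b[0], and i = I = 0
i = 0
x = b[i]           # step 1, copy in: value parameter x := b[i]
addr = i + 1       # step 2: fix the result addresses NOW: i itself, b at i+1
z1, z2 = x, x      # step 3, the body: z1, z2 := x, x
i = z1             # step 4, copy out left to right: i := z1 ...
b[addr] = z2       # ... then b[addr] := z2 (addr still uses the old i)
print(i, b)        # 7 [7, 7, 0]: i = b[0] = b[0+1] = 7, i.e. R of (12.2.9)
```

Had the address of b[i+1] been recomputed after the copy-out to i, the second result would have landed at b[8], and R of (12.2.9) would not hold.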

A theorem that is easier to use

In the precondition of theorem 12.2.1, it would be nice not to have the
complicated conjunct

(A ū, v̄: Q^{ȳ,z̄}_{ū,v̄} ⇒ R^{b̄,c̄}_{ū,v̄})

If we restrict the postcondition R in some fashion, we may be able to
eliminate this complicated conjunct. This may be the case, for example, if
we allow only those R that satisfy

(12.2.10) R^{b̄,c̄}_{ū,v̄} = Q^{ȳ,z̄}_{ū,v̄} ∧ I

where the free variables of I are disjoint from b̄ and c̄. For then the
complicated conjunct may be simplified as follows:

(A ū, v̄: Q^{ȳ,z̄}_{ū,v̄} ⇒ R^{b̄,c̄}_{ū,v̄})
= (A ū, v̄: Q^{ȳ,z̄}_{ū,v̄} ⇒ Q^{ȳ,z̄}_{ū,v̄} ∧ I)
= I   (since ū, v̄ are not free in I)

Our task, then, is to determine predicates R that satisfy (12.2.10). To do
this, we can textually replace ū, v̄ by b̄, c̄ in (12.2.10) and use predicates
R that satisfy

R = (R^{b̄,c̄}_{ū,v̄})^{ū,v̄}_{b̄,c̄}   (Lemma 4.6.3 and ex. 5 of 9.4)
  = (Q^{ȳ,z̄}_{ū,v̄} ∧ I)^{ū,v̄}_{b̄,c̄}   ((12.2.10))
  = Q^{ȳ,z̄}_{b̄,c̄} ∧ I   (Lemma 4.6.3, def. of I)

Hence we restrict our attention to predicates R satisfying

(12.2.11) R = Q^{ȳ,z̄}_{b̄,c̄} ∧ I

But this is not enough. From (12.2.11) we want to conclude that (12.2.10)
holds, but this is not always the case, because

(Q^{ȳ,z̄}_{b̄,c̄})^{b̄,c̄}_{ū,v̄}

is not always equal to

Q^{ȳ,z̄}_{ū,v̄}

The two are equal, however, if (b̄, c̄) consists of distinct identifiers, as we
know from Lemma 4.6.3. Hence we have the following theorem, which is
more restrictive but easier to use.

(12.2.12) Theorem. Suppose procedure p is defined as in (12.1.1). Sup-
pose (b̄, c̄) is a list of distinct identifiers. Suppose none of the
free identifiers in predicate I appear in the argument lists b̄ and
c̄. Then

{P^{x̄,ȳ}_{ā,b̄} ∧ I} p(ā, b̄, c̄) {Q^{ȳ,z̄}_{b̄,c̄} ∧ I}  □

Predicate I of the theorem captures the notion of invariance: predicates
that do not refer to the result arguments remain unchanged throughout
the call of the procedure.
This theorem is simpler than theorem 12.2.1, and should be used when-
ever only identifiers are used as arguments. Examples of its use are left to
the exercises.

12.3 Using Var Parameters

A value-result parameter y with corresponding argument c is handled
during a call as follows. The value of c is stored in y; the procedure
body is executed; the value of y is stored in c. If y is an array, this
implementation can take much time and space.
Another method of argument-parameter correspondence is call by ref-
erence. Here, before execution of the body, the address of c is stored in
y. During execution of the body, every reference to y is then treated as
an indirect reference to c. For example, the assignment y:= e within the
body has the immediate effect of the assignment c:= e. In other words, y
and c are considered to be different names for the same location.
Call by value-result requires space equal to the size of the argument,
while call by reference requires constant space. Call by value-result
requires time at least proportional to the size of the argument to prepare
and conclude the call, while call by reference requires constant time for
this. But call by reference does require more time for each reference to
the parameter during execution of the procedure body.
Especially for arguments that are arrays, call by reference is preferred.
A call by reference parameter is denoted by the attribute var, which is
short for "variable". The procedure declaration given in (12.1.1) and
corresponding call of (12.1.2) are extended as follows:

(12.3.1) proc p (value x; value result y; result z; var r);


{P} B {Q}

(12.3.2) p(a, b, c, d)
How do we extend theorems 12.2.1 and 12.2.12 to allow for call by refer-
ence? Call by reference can be viewed as an efficient form of call by
value-result; execution is the same, except that the initial assignments to r
and the final assignments to d are not needed. But the proof of the pro-
cedure body, {P} B {Q}, is consistent with our notion of execution for
value-result parameters only if value-result parameters occupy separate
locations -assignment to one parameter must not affect the value of any
other parameter. When using call by reference, then, we must be sure
that this condition is still upheld.
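The hazard can be made concrete. Below is a small Python sketch (names and the two-command body are my own, and an explicit dictionary stands in for memory, since Python has neither mechanism built in) modeling one procedure body under both mechanisms. With disjoint arguments the two agree; when the call aliases both parameters to the same argument, they disagree, which is exactly the sharing the disjointness conditions rule out:

```python
def p_value_result(env, c1, c2):
    # Copy-in: parameters receive the values of the arguments.
    y1, y2 = env[c1], env[c2]
    y1 = y1 + 1          # body: y1:= y1+1
    y2 = 10 * y2         # body: y2:= 10*y2
    # Copy-out: parameter values are stored back into the arguments.
    env[c1] = y1
    env[c2] = y2

def p_reference(env, c1, c2):
    # Every reference to a parameter is an indirect reference to its argument.
    env[c1] = env[c1] + 1
    env[c2] = 10 * env[c2]

e1 = {"a": 3}; p_value_result(e1, "a", "a")   # aliased call p(a, a)
e2 = {"a": 3}; p_reference(e2, "a", "a")
print(e1["a"], e2["a"])  # prints: 30 40
```

Under value-result the copy-out of y2 overwrites that of y1, giving 30; under reference the two updates compound on the same cell, giving 40.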
Let us introduce the notation disj(d) to mean that no sharing of
memory occurs among the d_i. For example, disj(d1, d2) holds for dif-
ferent identifiers d1 and d2. Also, disj(b[i], b[i+1]) holds, while
disj(b[i], b[j]) is equivalent to i ≠ j.
Further, we say that two vectors x and y are pairwise disjoint, written
pdisj(x; y), if each x_i is disjoint from each y_j, i.e. disj(x_i, y_j) holds.
Theorems 12.2.1 and 12.2.12 can then be modified to the following:

(12.3.3) Theorem. Suppose disj(d) and pdisj(d; b, c) hold. Then we
have

    {P^{x,y,r}_{a,b,d} ∧ (A u,v,w: Q^{y,z,r}_{u,v,w} ⇒ R^{b,c,d}_{u,v,w})}
    p(a, b, c, d)
    {R}   □

(12.3.4) Theorem. Suppose (b, c, d) is a list of distinct identifiers. Let
ref(I) denote the list of free identifiers in predicate I. Finally,
suppose that

    pdisj(b, c, d; ref(I))

holds. Then

    {P^{x,y,r}_{a,b,d} ∧ I} p(a, b, c, d) {Q^{y,z,r}_{b,c,d} ∧ I}   □

As a simplification, if we restrict attention to call by value and call by


reference, theorem 12.3.4 simplifies to

(12.3.5) Theorem. Suppose procedure p is defined and called using

    proc p(value x; var r); {P} B {Q}

and

    p(a, d)

where d is a list of distinct identifiers. Suppose no free identifier
of I occurs in d. Then

    {P^{x,r}_{a,d} ∧ I} p(a, d) {Q^{r}_{d} ∧ I}   □

Examples of the use of these theorems are left to the exercises.

12.4 Allowing Value Parameters in the Postcondition


In procedure declaration (12.1.1), the postcondition Q of the body may
not contain the value parameters x. There is a good reason for this.
Value parameters are considered to be local variables of the procedure
body. Therefore, they have no meaning once execution of the procedure
body has terminated. In general, one cannot meaningfully use local vari-
ables of a command in the postcondition of a command.
But this restriction irritates, because it (almost) always requires the use
of an extra identifier to denote the initial value of a value parameter.
Perhaps there is a way of allowing the value parameters to occur in Q,
which would eliminate this problem.
Consider theorem 12.2.1:

    {PR: P^{x,y}_{a,b} ∧ (A u,v: Q^{y,z}_{u,v} ⇒ R^{b,c}_{u,v})}
    p(a, b, c)
    {R}

It would not make sense with respect to the model of execution if x
occurred in Q^{y,z}_{u,v}, because x cannot be referred to before the call (it is a
list of parameters of the procedure). What is meant by x in this context?
Well, it really refers to the value arguments a, so let us try to textually
replace x by a in Q. But this replacement makes sense with respect to
the model of execution only if, upon termination of B, the value parame-
ters still have the initial values of the value arguments. This we can
ensure by requiring that no assignments to value parameters occur within
B and the value arguments are not affected by assignments to the other
parameters. We then get the following counterpart of theorem 12.2.1, in
which x can be referred to in Q. The counterparts of theorems 12.2.12,
12.3.3 and 12.3.5 are similar.

(12.4.1) Theorem. Suppose procedure p is defined as in (12.1.1), but Q
may contain the value parameters x, no assignments occur to
value parameters, and the value arguments are not affected by
assignments to the other parameters during execution of the pro-
cedure call. Then

    {PR: P^{x,y}_{a,b} ∧ (A u,v: Q^{x,y,z}_{a,u,v} ⇒ R^{b,c}_{u,v})}
    p(a, b, c)
    {R}

holds. In other words, PR ⇒ wp(p(a, b, c), R). □
Examples of the use of these theorems are left to the exercises.

Exercises for Chapter 12


1. Consider the three predicates

    {Q(u)} S {R}
    {(A u: Q(u))} S {R}
    {(E u: Q(u))} S {R}

where identifier u is not free in command S or predicate R. Suppose the first
has been proven to be true -i.e. it is a tautology. Is it equivalent to the second
or the third? Hint: use the fact that {Q(u)} S {R} is equivalent to Q(u) ⇒
wp(S, R). Also, it is equivalent to itself but with identifier u universally quanti-
fied, the scope of u being the complete predicate.
2. Use the results of exercise 1 to reason why the quantifier A cannot be omitted
in theorem 12.2.1.
3. Find a counterexample to the conjecture that theorem 12.2.12 holds even if the
arguments are not identifiers. Hint: there must be arguments that are disjoint but
still interact in some fashion.
4. Section 12.2 contained four examples of the use of theorem 12.2.1. Which of
the procedure calls in the examples can be proved correct using theorem 12.2.12
instead? Prove them correct.
5. The following procedure inserts x in array b[0:k-1] if it is not present, thus
increasing k, and stores in p the position of x in b[0:k-1]. It is assumed that the
element b[k] can be used by the procedure for its own purposes.

    {Pre: 0 ≤ k ∧ x = X ∧ b = B}
    {Post: 0 ≤ p ≤ k ∧ b[p] = X}
    proc s (value x: integer;
            value result b: array of integer;
            value result k, p: integer);
        p, b[k]:= 0, x;
        {inv: 0 ≤ p ≤ k ∧ x ∉ b[0:p-1]}
        {bound: k - p}
        do x ≠ b[p] → p:= p+1 od
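For readers who want to execute the body, here is a direct Python transcription (a sketch: a Python list stands in for the array, and since Python has no value-result parameters the final value of p is returned instead). Note that it transcribes only the body as given:

```python
def s(x, b, k):
    """Body of procedure s: b[k] is the spare sentinel cell."""
    p = 0
    b[k] = x                 # p, b[k]:= 0, x
    # {inv: 0 <= p <= k and x not in b[0:p]}  {bound: k - p}
    while x != b[p]:         # do x != b[p] -> p:= p+1 od
        p += 1
    return p

b = [4, 7, 9, None]          # b[0:2] holds values; b[3] is the spare cell
print(s(7, b, 3))            # x present: prints 1
print(s(5, b, 3))            # x absent: it is left in b[3]; prints 3
```

The sentinel stored in b[k] guarantees that the loop terminates even when x is absent.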

Is the procedure fully specified -i.e. has anything been omitted from the specifica-
tion that can be proved of the procedure body? Which of the following calls can be
proved correct using theorem 12.2.1? Prove them correct.
(a) {d = 0} s(5, c, d, j) {c[j] = 5}
(b) {0 ≤ m} s(f, c, m, j) {c[j] = f}
(c) {0 < m} s(b[0], c, m, j) {c[j] = c[0]}
(d) {0 < m} s(5, c, m, m) {c[m] = 5}
6. Which of the calls given in exercise 5 can be proved correct using theorem
12.2.12? Prove them correct.
7. Suppose parameters k and p of exercise 5 have attribute var instead of value
result. Can call (d) of exercise 5 be proved correct using theorem 12.3.3? If so,
do so. Can it be proved correct using theorem 12.3.4? 12.3.5? If so, do so.
Part III
The Development of Programs
Chapter 13 Introduction

Part III discusses a radical methodology for the development of pro-
grams, which is based on the notion of weakest precondition and exploits
our definition of a programming notation in terms of it. To the reader,
the methodology will probably be different from anything seen before.
The purpose of this introduction is to prepare the reader for the approach
-to give reasons for it, to explain a few points, and to indicate what to
expect.

What is a proof?
The word radical, used above, is appropriate, for the methodology pro-
posed strikes at the root of the current problems in programming and pro-
vides basic principles to overcome them. One problem is that program-
mers have had little knowledge of what it means for a program to be
correct and of how to prove a program correct. The word proof has un-
pleasant connotations for many, and it will be helpful to explain what it
means.
A proof, according to Webster's Third New International Dictionary, is
"the cogency of evidence that compels belief by the mind of a truth or
fact". It is an argument that convinces the reader of the truth of some-
thing.
The definition of proof does not imply the need for formalism or
mathematics. Indeed, programmers try to prove their programs correct in
this sense of proof, for they certainly try to present evidence that compels
their own belief. Unfortunately, most programmers are not adept at this,
as can be seen by looking at how much time is spent debugging. The pro-
grammer must indeed feel frustrated at the lack of mastery of the subject!
Part of the problem has been that only inadequate tools for under-
standing have been available. Reasoning has been based solely on how

programs are executed, and arguments about correctness have been based
on a number of test cases that have been run or hand-simulated. The
intuition and mental tools have simply been inadequate.
Also, it has not always been clear what it means for a program to be
"correct", partly because specifications of programs have been so impre-
cise. Part II has clarified this for us; we call a program S correct -with
respect to a given precondition Q and postcondition R - if {Q} S {R}
holds. And we have formal means for proving correctness.
Thus, our development method will center around the concept of a for-
mal proof, involving weakest preconditions and the theorems for the alter-
native, iterative and procedure call constructs discussed in Part II. In this
connection, the following principle is important:

(13.1) Principle: A program and its proof should be developed
hand-in-hand, with the proof usually leading the way.

It is just too difficult to prove an already existing program correct, and it
is far better to use the proof-of-correctness ideas throughout the program-
ming process for insight.

The balance between formality and common sense


Our approach to programming is based on proofs of correctness of
programs. But be assured that complete attention to formalism is neither
necessary nor desirable. Formality alone is inadequate, because it leads to
incomprehensible detail; common sense and intuition alone -the
programmer's main tools till now- are inadequate, because they allow
too many errors and bad designs.
What is needed is a fine balance between the two. Obvious facts
should be left implicit, important points should be stressed, and detail
should be presented to allow the reader to understand a program as easily
as possible. A notation must be found that allows less formalism to be
used. Where suitable, definitions in English are okay, but when the going
gets rough, more formalism is required. This takes intelligence, taste,
knowledge and practice. It is not easy.
Actually, every mathematician strives for this fine balance. Large gaps
will be left in a proof if it is felt that an educated reader will understand
how to fill them. The most important and difficult points will receive the
most attention. A proof will be organized as a series of lemmas to ease
understanding.
This balance between formality and common sense is even more
important for the programmer. Programming requires so much more
detail, which must be absolutely correct without relying on the goodwill of

the reader. In addition, some programs are so large that they cannot be
comprehended fully by one person at one time. Thus, there is a continual
need to strive for balance, conciseness, and even elegance.
The approach we take, then, can be summarized in the following

(13.2) Principle: Use theory to provide insight; use common
sense and intuition where it is suitable, but fall back on
the formal theory for support when difficulties and com-
plexities arise.

However, a balance cannot be achieved unless one has both common
sense and a facility with theory. The first has been most used by pro-
grammers; to overcome the current imbalance it is necessary to lean to the
formal side for awhile. Thus, the subsequent discussions and the exercises
may be more formal than is required in practice.

Proof versus test-case analysis


It was mentioned above that part of the problem has been reliance on
test cases, during both program development and debugging. "Develop-
ment by test case" works as follows. Based on a few examples of what
the program is to do, a program is developed. More test cases are then
exhibited -and perhaps run- and the program is modified to take the
results into account. This process continues, with program modification
at each step, until it is believed that enough test cases have been checked.
The approach described in this Part is based instead on developing a
proof of correctness and a program hand-in-hand. It is different from the
usual operational approach. Experience with the new approach can actu-
ally change the way one deals with problems outside the domain of pro-
gramming, too. Two examples illustrate how effective the approach can
be.

The Coffee Can Problem. A coffee can contains some black beans and
white beans. The following process is to be repeated as long as possible.

Randomly select two beans from the can. If they have the
same color, throw them out, but put another black bean
in. (Enough extra black beans are available to do this.)
If they are different colors, place the white one back into
the can and throw the black one away.

Execution of this process reduces the number of beans in the can by one.
Repetition of the process must terminate with exactly one bean in the can,
for then two beans cannot be selected. The question is: what, if anything,
can be said about the color of the final bean based on the number of

white beans and the number of black beans initially in the can? Spend
ten minutes on the problem, which is more than it should require, before
reading further.

It doesn't help much to try test cases! It doesn't help to see what happens
when there are initially 1 black bean and 1 white bean, and then to see
what happens when there are initially 2 black beans and 1 white bean,
etc. I have seen people waste 30 minutes with this approach.
Instead, proceed as follows. Perhaps there is a simple property of the
beans in the can that remains true as beans are removed and that,
together with the fact that only one bean remains, can give the answer.
Since the property will always be true, we will call it an invariant. Well,
suppose upon termination there is one black bean and no white beans.
What property is true upon termination, which could generalize, perhaps,
to be our invariant? One is an odd number, so perhaps the oddness of the
number of black beans remains true. No, this is not the case, in fact the
number of black beans changes from even to odd or odd to even with
each move. But, there are also zero white beans upon termination
-perhaps the evenness of the number of white beans remains true. And,
indeed, yes, each possible move either takes out two white beans or leaves
the number of white beans the same. Thus, the last bean is black if ini-
tially there is an even number of white beans; otherwise it is white.
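The invariant argument can also be checked mechanically. The following Python sketch of the game (the function name is mine, not from the text) plays it to completion; the assertions confirm that the final bean is black exactly when the initial number of white beans is even:

```python
import random

def last_bean(whites, blacks):
    """Play the coffee-can game to the end; return the final bean's color."""
    while whites + blacks > 1:
        # Randomly select two beans from the can.
        drawn = random.sample(["w"] * whites + ["b"] * blacks, 2)
        if drawn == ["w", "w"]:
            whites -= 2; blacks += 1   # two whites out, one black in
        elif drawn == ["b", "b"]:
            blacks -= 1                # two blacks out, one black in
        else:
            blacks -= 1                # white goes back, black thrown away
        # Invariant: every branch leaves (whites mod 2) unchanged,
        # and the total drops by one, so the loop terminates.
    return "white" if whites == 1 else "black"

for w in range(5):
    for b in range(5):
        if w + b == 0:
            continue
        for _ in range(10):   # random draws; the answer never varies
            assert last_bean(w, b) == ("white" if w % 2 else "black")
```

The simulation cannot prove the claim, but the invariant does: whites only ever decreases by 0 or 2, so its parity is fixed from the start.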

Closing the curve. This second problem is solved in essentially the same
manner. Consider a grid of dots, of any size:

Two players, A and B, play the following game. The players alternate
moves, with A moving first. A moves by drawing | or _ between two
adjacent dots; B moves by drawing a dotted line between two adjacent
dots. For example, after three full moves the grid might be as to the left
below. A player may not write over the other player's move.

A wins the game if he can get a completely closed curve, as shown to the
right above. B, because he goes second, has an easier task: he wins if he
can stop A from getting a closed curve. Here is the question: is there a
strategy that guarantees a win for either A or B, no matter how big the
board is? If so, what is it? Spend some time thinking about the problem
before reading further.

Looking at one trivial case, a grid with one dot, indicates that A cannot
win all the time -four dots are needed for a closed curve. Hence, we
look for a strategy for B to win. Playing the game and looking at test
cases will not find the answer! Instead, investigate properties of closed
curves, for if one of these properties can be barred from the board, A
cannot win. The corresponding invariant is that the board is never in a
configuration in which A can establish that property.
What properties does a closed curve have? It has parallel lines, but B
cannot prevent parallel lines. It has an even number of parallel lines, but
B cannot prevent this. It has four angles └, ┘, ┌ and ┐, but B cannot
prevent A from drawing angles. It always has at least one angle └, which
opens northeast -and B can prevent A from drawing such an angle! If
A draws a horizontal or vertical line, as shown to the left below, then B
simply fills in the corresponding vertical or horizontal line, if it is not yet
filled in, as shown to the right below. A simpler strategy couldn't exist!

These two problems have extremely simple solutions, but the solutions

are extremely difficult to find by simply trying test cases. The problems
are easier if one looks for properties that remain true. And, once found,
these properties allow one to see in a trivial fashion that a solution has
been found.
Besides illustrating the inadequacy of solving by test cases, these prob-
lems illustrate the following principle:

(13.3) .Principle: Know the properties of the objects that are to


be manipulated by a program.

In fact, we shall see by examples that the more properties you know
about the objects, the more chance you have of creating an efficient algo-
rithm. But let us leave further examples of the use of this principle to
later chapters.

Programming-in-the-small
For the past ten years, there has been much research in "pro-
gramming-in-the-small", partially because it seemed to be an area in
which scientific headway could be made. More importantly, however, it
was felt that the ability to develop small programs is a necessary condition
for developing large ones -although it may not be sufficient.
This fact is brought home most clearly with the following argument.
Suppose a program consists of n small components -i.e. procedures,
modules- each with probability p of being correct. Then the probability
P that the whole program is correct certainly satisfies P ≤ pⁿ. Since n is
large in any good-sized program, to have any hope that the program is
correct requires p to be very, very close to 1. For example, a program
with 10 components, each of which has 95% chance of being correct, has
less than a 60% chance of being correct, while a program with 100 such
components has less than a .6% chance of being correct!
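The arithmetic behind these figures is just pⁿ, as a two-line Python check shows (0.95 and the component counts are the values from the text):

```python
p = 0.95
print(p ** 10)    # about 0.599: under a 60% chance with 10 components
print(p ** 100)   # about 0.0059: under a 0.6% chance with 100 components
```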

Remark: Doug McIlroy (Bell Laboratories) disagrees with this argument,
claiming that correct programs are made from incorrect parts. Telephone
control programs, for example, are more than half audit code, whose
business is to recover from unintended states, and the audit code has been
known to mask software as well as hardware errors. Also, an incorrect
procedure may be called within our own program in a (unknowingly) res-
tricted fashion so that the incorrectness never comes to light. Procedures
that blow up for some input work perfectly well in programs that insulate
them from these cases. Nevertheless, for most of the situations we face,
the argument holds. □

Part III concentrates on the place where many programming errors are
made: the development of small program segments. All the program seg-
ments in Part III are between 1 and 25 lines long, with the majority being
between 1 and 10. It is true, however, that some of the programs are
short because of the method of development. Concentrating on princi-
ples, with an emphasis on precision, clarity and elegance, can actually
result in shorter programs. The most striking example of this is the pro-
gram The Welfare Crook -see section 16.4.

A disclaimer
The methods described in Part III can certainly benefit almost any pro-
grammer. At the same time, it should be made clear that there are other
ways to develop programs. A difficult task like programming requires
many different tools and techniques. Many algorithms require the use of
an idea that simply does not arise from the principles given in this Part,
so this method alone cannot be used to solve them effectively. Some
important ideas, like program transformation and "abstract data types"
are not discussed at all, while others are just touched upon. And, of
course, experience and knowledge can make all the difference in the
world.
Secondly, even though the emphasis is on proofs of correctness, errors
will occur. The wise programmer develops a program with the attitude
that a correct program can and will be developed, provided enough care
and concentration is used, and then tests it thoroughly with the attitude
that it must have a mistake in it. The frequency of errors in mathematical
theorems, proofs, and applications of theorems is well-recognized and
documented, and the area of program-proving will not be an exception.
We must simply learn to live with human fallibility and simplify to reduce
it to a minimum.
Nevertheless, the study of Part III will provide an education in
rigorous thinking, which is essential for good programming. Conscious
application of the principles and strategies discussed will certainly be of
benefit.

The organization of Part III


In order to convey principles and strategies as clearly as possible, most
of the sections are organized as follows. A small example is used to illus-
trate one or two new points. The points are discussed. One or two exam-
ples are then developed in a manner calculated to involve the reader in
the use of the points. A question is asked about the development, and the
question is followed by blank space, a horizontal line and the answer (as
done above). The reader is encouraged to attempt to answer the question

first before proceeding! Finally, the reader should do several of the exer-
cises at the end of the section.
Simply reading and listening to lectures on program development can
only teach about the method; in order to learn how to use it, direct
involvement is necessary. In this connection, the following meta-principle
is of extreme importance:

(13.4) Principle: Never dismiss as obvious any fundamental
principle, for it is only through conscious application of
such principles that success will be achieved.

Ideas may be simple and easy to understand, but their application may
require effort. Recognizing a principle and applying it are two different
things.

Which typewriter do you choose?


Back in 1867, the typewriter was introduced into the United States. By
1873, the current arrangement of the keys on the typewriter, called the
QWERTY keyboard (after the first six letters of the upper key row), was
implemented, never to be changed again. At that time typing speed was
not important -most people used two fingers anyway. Moreover, the
typewriters often jammed, and the most-used letters were arbitrarily distri-
buted in order to reduce speed so jamming wouldn't occur so easily.
Today, millions of excellent, speedy touch-typists use the inefficient
QWERTY keyboard, because that is the only one made. Every so often,
a new arrangement is designed and tested. The tests show that a good
typist can learn the new arrangement in a month or so, and thereafter will
type much faster with much less energy and strain. Yet the new keyboard
never catches on. Why? Too much is invested in hardware and training.
Because of the high cost of changeover, because of inertia, QWERTY
remains supreme.
Let's face it: the average programmer is a QWERTY programmer. He
is stuck with old notations, like FORTRAN and COBOL. More impor-
tantly he has been thinking with two fingers, using the same mental tools
that were used at the beginnings of computer science, in the 1940s and
1950s. True, "structured programming" has helped, but even that, by
itself, is not enough. To put it simply, the mental tools available to pro-
grammers have been inadequate.
The work on developing proof and program hand-in-hand is beginning
to show fruit, and it may lead to a more efficient arrangement of the
programmer's keyboard. Luckily, the hardware need not change. Mental
tools and attitudes are far more important in programming than the

notation in which the final program is expressed. For example, one can
use the principles and strategies espoused in this book even if the final
program has to be in FORTRAN: one programs into a language, not in
it. To be sure, considerably more than one month of education and train-
ing will be necessary to wean yourself away from QWERTY program-
ming, for old habits are changed very slowly. Nevertheless, I think it is
worthwhile.
Let us now turn to the elucidation of principles and strategies that may
help give the QWERTY programmer a new keyboard.
Chapter 14
Programming as a Goal-Oriented Activity

A simple example of program development


Consider the following problem. Write a program that, given fixed
integers x and y, sets z to the maximum of x and y. (Throughout, we
use the convention that variables called "fixed" should not be changed by
execution of the program. See section 6.3.) Thus, a command S is
desired that satisfies

(14.1) {T} S {R: z = max(x, y)}.

Before the program can be developed, R must be refined by replacing
max by its definition -after all, without knowing what max means one
cannot write the program. Variable z contains the maximum of x and y
if it satisfies

(14.2) R: z ≥ x ∧ z ≥ y ∧ (z = x ∨ z = y)

Now, what command could possibly be executed in order to establish
(14.2)? Since (14.2) contains z = x, the assignment z:= x is a possibility.
The assignment z:= x+1 is also a possibility, but z:= x is favored for at
least two reasons. First, it is determined from R: to achieve z = x assign
x to z. Second, it is simpler.
To determine the conditions under which execution of z:= x will actu-
ally establish (14.2), simply calculate wp("z:= x", R):

    wp("z:= x", R) = x ≥ x ∧ x ≥ y ∧ (x = x ∨ x = y)
                   = T ∧ x ≥ y ∧ (T ∨ x = y)
                   = x ≥ y

This gives us the conditions under which execution of z:= x will establish
R, and our first attempt at a program can be

    if x ≥ y → z:= x fi

This program performs the desired task provided it doesn't abort. Recall
from theorem 10.5 for the alternative construct that, to prevent abortion,
precondition Q of the construct must imply the disjunction of the guards,
i.e. at least one guard must be true in any initial states defined by Q. But
Q, which is T, does not imply x ~ y. Hence, at least one more guarded
command is needed.
Another possible way to establish R is to execute z:= y. From the
above discussion it should be obvious that y ≥ x is the desired guard.
Adding this guarded command yields

(14.3) if x ≥ y → z:= x
       [] y ≥ x → z:= y
       fi

Now, at least one guard is always true, so that this is the desired program.
Formally, we know that (14.3) is the desired program by theorem 10.5.
To apply the theorem, take

    S1: z:= x        S2: z:= y
    B1: x ≥ y        B2: y ≥ x
    Q: T             R: z ≥ x ∧ z ≥ y ∧ (z = x ∨ z = y)
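Program (14.3) can be transcribed almost literally into Python. The sketch below (my own rendering) preserves the nondeterminism by collecting every guarded command whose guard is true and choosing one arbitrarily; a brute-force check over a small domain confirms that any choice establishes R:

```python
import random

def max_program(x, y):
    # (14.3): if x >= y -> z:= x  []  y >= x -> z:= y  fi
    candidates = []
    if x >= y:
        candidates.append(x)      # guarded command x >= y -> z:= x
    if y >= x:
        candidates.append(y)      # guarded command y >= x -> z:= y
    assert candidates             # at least one guard is always true
    return random.choice(candidates)   # arbitrary choice among open guards

# R: z >= x and z >= y and (z = x or z = y)
for x in range(-3, 4):
    for y in range(-3, 4):
        z = max_program(x, y)
        assert z >= x and z >= y and (z == x or z == y)
```

When x = y both guards are open and either choice yields the same z, which is why the nondeterminism is harmless.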

Discussion
The above development illustrates the following

(14.4) Principle: Programming is a goal-oriented activity.

By this we mean that the desired result, or goal, R, plays a more impor-
tant role in the development of a program than the precondition Q. Of
course, Q also plays a role, as will be seen later. But, in general, more
insight is gained from the postcondition. The goal-oriented nature of pro-
gramming is one reason why the programming notation has been defined
in terms of weakest preconditions (rather than strongest postconditions
-see exercise 4 of section 9.1).
To substantiate this hypothesis of the goal-oriented nature of program-
ming, consider the following. Above, the precondition was momentarily
put aside and a program was developed that satisfied

    {?} S {R: z = max(x, y)};

whenever S was considered complete, the requirement Q ⇒ wp(S, R) was
checked. Try doing the opposite: forget about postcondition R, and try to

develop a program S satisfying only

    {T} S {?}

Whenever S is thought to be complete, check whether T ⇒ wp(S,
z = max(x, y)), or T ⇒ wp(S, (14.2)). How many programs S will you
write before a correct one is found?
Another principle used in the above development is:

(14.5) Principle: Before attempting to solve a problem, make
absolutely sure you know what the problem is.

In programming, this general principle becomes:

(14.6) Principle: Before developing a program, make precise
and refine the pre- and postconditions.

In the example just developed, the postcondition was refined while the
precondition, which was simply T, needed no refining.
A problem is sometimes specified in a manner that lends itself to
several interpretations. Hence, it is reasonable to spend some time mak-
ing the specification as clear and unambiguous as possible. Moreover, the
form of the specification can influence algorithmic development, so that
striving for simplicity and elegance should be helpful. With some prob-
lems, the major difficulty is making the specification simple and precise,
and subsequent development of the program is fairly straightforward.
Often, a specification may be in English or in some conventional nota-
tion -like max(x, y)- that is at too "high a level" for program develop-
ment, and it may contain abbreviations dealing with the applications area
with which the programmer is unfamiliar. The specification is written to
convey what the program is to do, and abstraction is often used to sim-
plify it. More detail may be required to determine how to do it. The
example of setting z to the maximum of x and y illustrates this nicely. It
is impossible to write the program without knowing what max means,
while writing a definition provides the insight needed for further develop-
ment.
The development of (14.3) illustrates one basic technique for develop-
ing an alternative construct, which was motivated by theorem 10.5 for the
Alternative Construct.

(14.7) Strategy for developing an alternative command: To
invent a guarded command, find a command C whose
execution will establish postcondition R in at least some
cases; find a Boolean B satisfying B ⇒ wp(C, R); and
put them together to form B → C (see assumption 2 of
the theorem). Continue to invent guarded commands
until the precondition of the construct implies that at least
one guard is true (see assumption 1 of the theorem).

This technique, and a similar one for the iterative construct, is used often.
Let us return to program (14.3) for a moment. It has a pleasing sym-
metry, which is possible because of the nondeterminism. If there is no
reason to choose between z:= x and z := y when x = y, one should not be
forced to choose. Programming requires deep thinking, and we should be
spared any unnecessary irritation. Conventional, deterministic notations
force the choice, and this is one reason for preferring the guarded com-
mand notation.
Nondeterminism is an important feature even if the final program turns
out to be deterministic, for it allows us to devise a good programming
methodology. One is free to develop many different guarded commands
completely independently of each other. Any form of determinism, such
as evaluating the guards in order of occurrence (e.g. the PL/I Select state-
ment), drastically affects the way one thinks about developing alternative
constructs.

A second example
Write a program that permutes (interchanges) the values of integer var-
iables x and y so that x ≤ y. Use the method of development discussed
above.
As a first step, before reading further, write a suitable precondition Q
and postcondition R.

The problem is slightly harder than the first one, for it requires the intro-
duction of notation to denote the initial and final values of variables.
Precondition Q is x = X ∧ y = Y, where identifiers X and Y denote the
initial values of variables x and y, respectively. Postcondition R is

(14.8) R: x ≤ y ∧ (x = X ∧ y = Y ∨ x = Y ∧ y = X).

Remark: One could also use the concept of a permutation and write R as
x ≤ y ∧ perm((x, y), (X, Y)). □

Now, what simple commands could cause (14.8) to be established, at least
under some conditions?

Precondition Q, which is x = X ∧ y = Y, appears as part of R, so there
is a good chance that the operation skip could establish R under some
conditions. (This is a use of the precondition to provide additional infor-
mation.) Another possibility is the swap x, y := y, x, because it also
would establish the second conjunct of R. How does one determine the
guards for each of these commands, and what are the guards?

The guard Bi of a guarded command Bi → Si of an alternative construct
must satisfy Q ∧ Bi ⇒ wp(Si, R), according to the Theorem for the Alter-
native Construct. For the command skip, we have wp(skip, R) = R.
Hence, B of the guarded command B → skip must satisfy Q ∧ B ⇒ R.
Since Q implies the second conjunct of R, the first conjunct x ≤ y of R
can be the guard, so that the guarded command is x ≤ y → skip.
For the second command we have

    wp("x, y := y, x", R)
       = y ≤ x ∧ (y = X ∧ x = Y ∨ y = Y ∧ x = X).

Again, the second conjunct of this weakest precondition is implied by Q,
so that the first conjunct y ≤ x can be the guard. This yields the alterna-
tive construct

    if x ≤ y → skip
    □ y ≤ x → x, y := y, x
    fi

Since the disjunction of the guards, x ≤ y ∨ y ≤ x, is always true, the pro-
gram is correct (with respect to the given Q and R).
Note carefully how the theorem for the Alternative Construct is used
to help determine the guards. This should not be too surprising; after
all, the theorem simply formalizes the principles used by programmers to
understand alternative commands.
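As an aside, Python's simultaneous (tuple) assignment behaves like the multiple assignment x, y := y, x used above, so the construct just developed can be transcribed almost literally. The sketch below is an illustration, not part of the text; note that if/elif forces an evaluation order, so the symmetry of the nondeterministic version is lost in the transcription.

```python
def sort2(x, y):
    """Establish x <= y by permuting (x, y), mirroring
    if x <= y -> skip  []  y <= x -> x, y := y, x  fi."""
    if x <= y:
        pass               # skip
    elif y <= x:
        x, y = y, x        # simultaneous swap, as in the text
    return x, y

assert sort2(8, 3) == (3, 8)
assert sort2(3, 8) == (3, 8)
```

When x = y, either branch establishes R; Python simply takes the first one.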

Keeping guards of an alternative command strong


Suppose variable j contains the remainder when k is divided by 10
(for k > 0). That is, j and k should always satisfy

    j = k mod 10

Thus, j will only take on the values 0, 1, ..., 9. Let us determine a com-
mand to "increase k under the invariance of j = k mod 10", assuming that
function mod is not available.

One possible command is k, j := k+1, j+1. However, this does the
job only if before its execution j < 9, and so we have the guarded com-
mand j < 9 → k, j := k+1, j+1. However, initially we have 0 ≤ j < 10,
so that the case j = 9 must be considered also. The obvious command in
this case is k, j := k+1, 0, and we arrive at the program segment

(14.8) if j < 9 → k, j := k+1, j+1
       □ j = 9 → k, j := k+1, 0
       fi

(Note how strategy (14.7) was used, in an informal but careful manner.)
The question is: which is to be preferred, (14.8) or segment (14.9) below,
which is the same as (14.8) except that its second guard, j ≥ 9, is weaker.
At first thought, (14.9) might be preferred because it executes without
abortion in more cases. If initially j = 10 (say), it nicely sets j to 0. But
this is precisely why (14.9) is not to be preferred. Clearly, j = 10 is an
error caused by a hardware malfunction, a software error, or an inadver-
tent modification of some kind; j is always supposed to satisfy
0 ≤ j < 10. Execution of (14.9) proceeds as if nothing were wrong and the
error goes undetected. Execution of (14.8), on the other hand, aborts if
j = 10, and the error is detected.

(14.9) if j < 9 → k, j := k+1, j+1
       □ j ≥ 9 → k, j := k+1, 0
       fi

This analysis leads to the following

(14.10) • Principle: All other things being equal, make the guards
of an alternative command as strong as possible, so that
some errors will cause abortion.

The phrase "all other things being equal" is present to make sure that the
principle is reasonably applied. For example, at this point I am not even
prepared to advocate strengthening the first guard, as follows:

    if 0 ≤ j ∧ j < 9 → k, j := k+1, j+1
    □ j = 9 → k, j := k+1, 0
    fi

As a final note, program (14.8) can be rearranged to

    k := k+1;  if j < 9 → j := j+1 □ j = 9 → j := 0 fi
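The effect of the strong guards can be seen by transcribing (14.8) into an ordinary language, with an explicit abort branch for the case in which no guard is true. The function name below is hypothetical; the sketch mirrors the discussion: a corrupted j is detected rather than silently "repaired".

```python
def bump(k, j):
    # Transcription of (14.8):
    # if j < 9 -> k, j := k+1, j+1  []  j = 9 -> k, j := k+1, 0  fi
    if j < 9:
        k, j = k + 1, j + 1
    elif j == 9:
        k, j = k + 1, 0
    else:
        # no guard true: the alternative construct aborts
        raise RuntimeError("abort: j out of range")
    return k, j

assert bump(23, 3) == (24, 4)
assert bump(19, 9) == (20, 0)   # j = k mod 10 is maintained
# bump(25, 10) raises RuntimeError: the error is detected, not masked
```

With the weaker guard j >= 9 of (14.9), the final branch would never fire and j = 10 would be quietly set to 0, which is exactly the behavior the principle warns against.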



Exercises for Chapter 14


1. Develop programs for the following problems in a fashion similar to the
development of the above programs. Remember, satisfactory pre- and postcondi-
tions should be developed first.
(a) Set z to abs(x).
(b) Set x to abs(x).
(c) Suppose x contains the number of odd integers in array b[0:k-1], where
    k ≥ 0. Write a program to add 1 to k, keeping the property of x the same.
    That is, upon termination k should be one more than it was initially and x
    should still contain the number of odd integers in b[0:k-1].
(d) Suppose integer variables a and b satisfy 0 < a+1 < b, so that the set
    {a, a+1, ..., b} contains at least 3 values. Suppose also that the following
    predicate is true:

Is it possible to halve the interval a:b, by setting either a or b to (a+b) ÷ 2, at
the same time keeping P true? Answer the question by trying to develop a pro-
gram to do so.
2. (The Next Higher Permutation). Consider an integer of n decimal digits
(n > 0) contained in an array d[0:n-1], with d[0] being the high-order digit.
For example, with n = 6 the integer 123542 would be contained in d as
d = (1, 2, 3, 5, 4, 2). The next higher permutation of d[0:n-1] is an array d'
that represents the next higher integer composed of exactly the same digits. In the
example given, the next higher permutation would be d' = (1, 2, 4, 2, 3, 5).
The problem is to define precisely the next higher permutation d' for an
integer d[0:n-1]. Does your definition give any insight into developing a pro-
gram to find it?
Chapter 15
Developing Loops from Invariants and Bounds

This chapter discusses two methods for developing a loop when the
precondition Q, the postcondition R, the invariant P and the bound
function t are given. The first method leads naturally to a loop with a
single guarded command, do B → S od. The second takes advantage of
the flexibility of the iterative construct and generally results in loops with
more than one guarded command.
Checklist 11.9 will be heavily used, and it may be wise to review it
before proceeding. As is our practice throughout, the parts of the
development that illustrate the principles to be covered are discussed in a
formal and detailed manner, while other parts are treated more infor-
mally.

15.1 Developing the Guard First

Summing the elements of an array


Consider the following problem. Write a program that, given fixed
integer n ≥ 0 and fixed integer array b[0:n-1], stores in variable s the
sum of the elements of b. The precondition Q is simply n ≥ 0; the
postcondition R is

    R: s = (Σ j : 0 ≤ j < n : b[j])

A loop with the following invariant and bound function is desired.

    P: 0 ≤ i ≤ n ∧ s = (Σ j : 0 ≤ j < i : b[j])
    t: n - i

Thus, variable i has been introduced. The invariant states that at any
point in the computation s contains the sum of the first i values of b.
The assignment i, s := 0, 0 obviously establishes P, so it will suffice as
the initialization. (Note that i, s := 1, b[0] does not suffice because, if
n = 0, it cannot be executed. If n = 0, execution of the program must set
s to the identity of addition, 0.)
The next step is to determine the guard B for the loop do B → S od.
Checklist 11.9 requires P ∧ ¬B ⇒ R, so ¬B is chosen to satisfy it. Com-
paring P and R, we conclude that i = n will do. The desired guard B of
the loop is therefore its complement, i ≠ n. The program looks like

    i, s := 0, 0;  do i ≠ n → ? od

Now for the command. The purpose of the command is to make progress
towards termination, i.e. to decrease the bound function t, and an
obvious first choice for it is i := i+1. But this would destroy the invari-
ant, and to reestablish it b[i] must simultaneously be added to s. Thus,
the program is

(15.1.1) i, s := 0, 0;  do i ≠ n → i, s := i+1, s+b[i] od

Remark: For those uneasy with the multiple assignment, the formal proof
that P is maintained is as follows. We have

    wp("i, s := i+1, s+b[i]", P)
       = 0 ≤ i+1 ≤ n ∧ s+b[i] = (Σ j : 0 ≤ j < i+1 : b[j])

and this is implied by P ∧ i ≠ n. □
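Program (15.1.1) can be transcribed directly into an executable language, with invariant P checked as an assertion on every iteration. This is an illustrative sketch, not the text's notation:

```python
def array_sum(b):
    """Transcription of (15.1.1):
    i, s := 0, 0; do i != n -> i, s := i+1, s+b[i] od."""
    n = len(b)
    i, s = 0, 0
    while i != n:                               # guard B: i != n
        assert 0 <= i <= n and s == sum(b[:i])  # invariant P holds here
        i, s = i + 1, s + b[i]                  # decrease t = n-i, keep P
    return s                                    # P and i = n imply R

assert array_sum([3, 1, 4, 1, 5]) == 14
assert array_sum([]) == 0    # n = 0: initialization alone establishes R
```

The simultaneous assignment i, s := i+1, s+b[i] carries over as Python's tuple assignment, so the proof obligation above applies to the transcription essentially unchanged.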

Discussion
First of all, let us discuss the balance between formality and intuition
observed here. The pre- and postconditions, the invariant and the bound
function were given formally and precisely. The development of the parts
of the program was given less formally, but checklist 11.9, which is based
on the formal theorem for the Iterative Construct, provided most of the
motivation and insight. In order to check the informal development, we
relied on the theory (in checking that the loop body maintained the invari-
ant). This is illustrative of the general approach (13.1) mentioned in
chapter 13.
An important strategy in the development was finding the guard before
the command. And the prime consideration in finding the guard B was
that it had to satisfy P ∧ ¬B ⇒ R. So, ¬B was developed and then com-
plemented to yield B.

Some object at first to finding the guard this way, because tradition
would use the guard i < n instead of i ≠ n. However, i ≠ n is better,
because a software or hardware error that made i > n would result in a
nonterminating execution. It is better to waste computer time than suffer
the consequences of having an error go undetected, which would happen
if the guard i < n were used. This analysis leads to the following

(15.1.2) • Principle: All other things being equal, make the
         guards of a loop as weak as possible, so that an error
         may cause an infinite loop.

Principle 15.1.2 should be compared to principle 14.10, which concerns


the guards of an alternative command.

The method used for developing the guard of a loop is extremely sim-
ple and reliable, for it is based on manipulation of static, mathematical
expressions. In this connection, I remember my old days of FORTRAN
programming in the early 1960's, when it sometimes took three debug-
ging runs to achieve proper loop termination. The first time the loop
iterated once too few, the second time once too many and the third time
just right. It was a frustrating, trial-and-error process. No longer is this
necessary; just develop ¬B to satisfy P ∧ ¬B ⇒ R and complement it.

Another important point about the development was the stress on ter-
mination. The need to progress towards termination motivated the
development of the loop body; reestablishing the invariant was the second
consideration. Actually, every loop with one guarded command has the
high-level interpretation

(15.1.3) {invariant: P}
         {bound: t}
         do B → Decrease t, keeping P true od
         {P ∧ ¬B}

This approach to loop development is summarized as follows:

(15.1.4) • Strategy for developing a loop: First develop the
         guard B so that P ∧ ¬B ⇒ R; then develop the body
         so that it decreases the bound function while reestab-
         lishing the loop invariant.

Searching a two-dimensional array


Consider the following problem. Write an algorithm that, given a
fixed array of arrays b[0:m-1][0:n-1], where 0 < m and 0 < n, searches
b for a fixed value x. If x occurs in several places in b, it doesn't matter
which place is found. For this problem, we will use conventional two-
dimensional notation, writing b as b[0:m-1, 0:n-1]. Using variables i
and j, upon termination either x = b[i, j] or, if this is not possible,
i = m. To be more precise, execution of the program should establish

(15.1.5) R: (0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i, j]) ∨ (i = m ∧ x ∉ b).

The invariant P, given below using a diagram, states that x is not in the
already-searched rows b[0:i-1] and not in the already-searched columns
b[i, 0:j-1] of the current row i.

(15.1.6) P: 0 ≤ i ≤ m ∧ 0 ≤ j < n ∧ (x is not in the region marked below)

    [diagram: an m-by-n grid; rows 0..i-1 entirely, and row i in columns
     0..j-1, are marked "x not here"]

The bound function t is the number of values in the untested section:
(m-i)*n - j. As a first step in the development, before reading further,
determine the initialization for the loop.

The obvious choice is i, j:= 0, 0, for then the section in which "x is not
here" is empty. Next, what should be the guard B of the loop?

Expression ¬B must satisfy P ∧ ¬B ⇒ R. It must be strong enough so
that each of the two disjuncts of R can be established. To provide for the
first disjunct, choose i < m cand x = b[i, j]; to provide for the second,
choose i = m. The operator cand is needed to ensure that the expression
is well-defined, for b[i, j] may be undefined if i ≥ m. Therefore, choose
¬B to be

    ¬B: i = m ∨ (i < m cand x = b[i, j])

Using De Morgan's laws, we find its complement B:

    B: i ≠ m ∧ (i ≥ m cor x ≠ b[i, j])

Since the guard B is to be evaluated only when the invariant P is true,
which means that i ≤ m is true, it can be simplified to

    B: i ≠ m ∧ (i = m cor x ≠ b[i, j])

and finally to

    B: i ≠ m cand x ≠ b[i, j].

The final line is therefore the guard of the loop. The next step is to deter-
mine the loop body. Do it, before reading further.

The purpose of the loop body is to decrease the bound function t, which
is the number of elements in the untested section: (m-i)*n - j. P ∧ B,
the condition under which the body is executed, implies that i < m, j < n
and x ≠ b[i, j], so that element b[i, j], which is in the untested section,
can be moved into the tested section. A possible command to do this is
j := j+1, but it maintains the invariant P only if j < n-1. So we have
the guarded command

    j < n-1 → j := j+1

What do we do if j ≥ n-1? In this case, because invariant P is true, we
have j = n-1. Hence, we must determine what to do if j = n-1, i.e. if
b[i, j] is the rightmost element of its row. To move b[i, j] into the
tested section requires moving to the beginning of the next row, i.e. exe-
cuting i, j := i+1, 0.
The loop body is therefore

    if j < n-1 → j := j+1 □ j = n-1 → i, j := i+1, 0 fi

The program is therefore

(15.1.7) i, j := 0, 0;
         do i ≠ m cand x ≠ b[i, j] →
            if j < n-1 → j := j+1 □ j = n-1 → i, j := i+1, 0 fi
         od

If desired, the body of the loop can be rearranged to yield



    i, j := 0, 0;
    do i ≠ m cand x ≠ b[i, j] →
       j := j+1;
       if j < n → skip □ j = n → i, j := i+1, 0 fi
    od
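In a language with short-circuit evaluation, cand can be rendered by the ordinary conditional `and`: the right operand is evaluated only when the left one is true. A sketch of program (15.1.7), assuming b is a rectangular list of lists with m, n > 0 (the sketch is illustrative, not the text's notation):

```python
def search2d(b, x):
    """Transcription of (15.1.7) for an m-by-n array b, m > 0 and n > 0.
    Python's short-circuit `and` plays the role of cand."""
    m, n = len(b), len(b[0])
    i, j = 0, 0
    while i != m and x != b[i][j]:   # cand: b[i][j] untouched when i = m
        if j < n - 1:
            j = j + 1
        elif j == n - 1:
            i, j = i + 1, 0
    return i, j    # either b[i][j] = x, or i = m (x not in b)

assert search2d([[1, 2], [3, 4]], 3) == (1, 0)
assert search2d([[1, 2], [3, 4]], 9)[0] == 2    # i = m: x not present
```

Without the short-circuit, the guard would evaluate b[i][j] when i = m and fail with an index error, which is precisely why cand rather than ∧ appears in the guard.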

Discussion
Note that operation cand (instead of ∧) is really necessary.
Note that the method for developing an alternative command was used
when developing the body of the loop, albeit informally. First, the com-
mand j := j+1 was chosen, and it was seen that it performed as desired
only if j < n-1. Formally, one must prove

    (P ∧ B ∧ j < n-1) ⇒ wp("j := j+1", P)

but this case is simple enough to handle informally, if care is used.
Second, the command i, j := i+1, 0 was chosen to handle the remaining
case, j = n-1.
Note that the alternative command has the guards j < n-1 and j =
n-1, and not j < n-1 and j ≥ n-1. The guards of the alternative com-
mand have been made as strong as possible, in keeping with principle
14.10, in order to catch errors.
We will develop another solution to this problem in section 15.2.

Exercises for Section 15.1


1. Develop a second program for the first example of this section. This time use
the invariant and bound function

    P: 0 ≤ i ≤ n ∧ s = (Σ j : i ≤ j < n : b[j])
    t: i

2. The invariant of the loop of the second example was given in terms of a
diagram (see (15.1.6)). Replace the diagram by an equivalent statement in the
predicate calculus.
3. Write a program that, given a fixed integer array b[0:n-1], where n > 0, sets
x to the smallest value of b. The program should be nondeterministic if the
smallest value occurs more than once in b. The precondition Q, postcondition
R, loop invariant P and bound function t are

    Q: 0 < n
    R: x ≤ b[0:n-1] ∧ (E j : 0 ≤ j < n : x = b[j])
    P: 1 ≤ i ≤ n ∧ x ≤ b[0:i-1] ∧ (E j : 0 ≤ j < i : x = b[j])
    t: n - i

4. Write a program for the problem of exercise 3, but use the invariant and bound
function

    P: 0 ≤ i < n ∧ x ≤ b[i:n-1] ∧ (E j : i ≤ j < n : x = b[j])
    t: i

5. Write a program that, given a fixed integer n > 0, sets variable i to the highest
power of 2 that is at most n. The precondition Q, postcondition R, loop invari-
ant P and bound function t are

    Q: 0 < n
    R: 0 < i ≤ n < 2*i ∧ (E p : i = 2^p)
    P: 0 < i ≤ n ∧ (E p : i = 2^p)
    t: n - i

6. Translate program (15.1.7) into the language of your choice, PL/I, Pascal,
FORTRAN, etc., remembering the need for the operation cand. Compare your
answer with (15.1.7).

15.2 Making Progress Towards Termination

Four-tuple Sort
Consider the following problem. Write a program that sorts the four
integer variables q0, q1, q2, q3. That is, upon termination the following
should be true: q0 ≤ q1 ≤ q2 ≤ q3.
Implicit is the fact that the values of the variables should be permuted;
for example, the assignment q0, q1, q2, q3 := 0, 0, 0, 0 is not a solution,
even though it establishes q0 ≤ q1 ≤ q2 ≤ q3. To convey this information
explicitly, we use Qi to denote the initial value of qi, and write the for-
mal specification

    Q: q0 = Q0 ∧ q1 = Q1 ∧ q2 = Q2 ∧ q3 = Q3
    R: q0 ≤ q1 ≤ q2 ≤ q3 ∧ perm((q0, q1, q2, q3), (Q0, Q1, Q2, Q3))

where the second conjunct perm(..., ...) of R means that the four
variables q0, q1, q2, q3 contain a permutation of their original values.
A loop will be written. Its invariant expresses the fact that the four
variables must always contain a permutation of their initial values:

    P: perm((q0, q1, q2, q3), (Q0, Q1, Q2, Q3))

The bound function is the number of inversions in the sequence (q0,
q1, q2, q3). For a sequence (q0, ..., qn-1), the number of inversions is
the number of pairs (qi, qj), i < j, that are out of order, i.e. qi > qj.

Note that this includes all pairs, and not just adjacent ones. For example,
the number of inversions in (1, 3, 2, 0) is 4. So the bound function is

    t: (N i, j : 0 ≤ i < j < 4 : qi > qj).
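The bound function can be computed directly. The sketch below (an illustration, not part of the text) counts all out-of-order pairs, adjacent or not, matching the example just given:

```python
def inversions(q):
    """Bound function t: the number of pairs (q[i], q[j]), i < j,
    that are out of order, counting non-adjacent pairs as well."""
    n = len(q)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if q[i] > q[j])

assert inversions((1, 3, 2, 0)) == 4   # pairs (1,0), (3,2), (3,0), (2,0)
assert inversions((0, 1, 2, 3)) == 0   # sorted: the bound is zero
```

Since each iteration of the loop developed below removes at least one inversion, this count bounds the number of iterations remaining.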

The invariant indicates that the four variables must always contain a
permutation of their initial values. This is obviously true initially, so no
initialization is needed.
In the last section, at this point of the development the guard of the
loop was determined. Instead, here we will look for a number of guarded
commands, each of which makes progress towards termination. The
invariant indicates that the only possible commands are those that swap
(permute) the values of two or more of the variables. To keep things sim-
ple, consider only swaps of two variables. There are six possibilities:
q0, q1 := q1, q0 and q1, q2 := q2, q1, etc.
Now, execution of a command must make progress towards termina-
tion. Consider one possible command, q0, q1 := q1, q0. It decreases the
number of inversions in (q0, q1, q2, q3) iff q0 > q1. Hence, the guarded
command q0 > q1 → q0, q1 := q1, q0 will do. Each of the other 5 possibil-
ities is similar, and together they yield the program

    do q0 > q1 → q0, q1 := q1, q0
    □ q1 > q2 → q1, q2 := q2, q1
    □ q2 > q3 → q2, q3 := q3, q2
    □ q0 > q2 → q0, q2 := q2, q0
    □ q0 > q3 → q0, q3 := q3, q0
    □ q1 > q3 → q1, q3 := q3, q1
    od

It still remains to prove that upon termination the result R is established;
this is point 3 of checklist 11.9, P ∧ ¬BB ⇒ R. Suppose all the guards
are false. Then q0 ≤ q1 (because the first guard is false), q1 ≤ q2 (because
the second is false) and q2 ≤ q3 (because the third is false); therefore

    q0 ≤ q1 ≤ q2 ≤ q3.

Together with invariant P, this implies the desired result. But note that
only the first three guards were needed to establish the desired result.
Therefore, the last three guarded commands can be deleted, yielding the
program

    do q0 > q1 → q0, q1 := q1, q0
    □ q1 > q2 → q1, q2 := q2, q1
    □ q2 > q3 → q2, q3 := q3, q2
    od
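The three-command loop can be simulated by choosing an enabled guard at random on each iteration, mirroring the nondeterminism of do...od. Whatever order the swaps occur in, the inversion count falls and the same sorted final state is reached. An illustrative sketch:

```python
import random

def sort4(q0, q1, q2, q3):
    """Simulate the three-command loop: repeatedly execute any
    enabled adjacent swap; terminate when all guards are false."""
    while True:
        enabled = []
        if q0 > q1: enabled.append(0)
        if q1 > q2: enabled.append(1)
        if q2 > q3: enabled.append(2)
        if not enabled:              # all guards false: q0<=q1<=q2<=q3
            return q0, q1, q2, q3
        g = random.choice(enabled)   # nondeterministic choice of command
        if g == 0:
            q0, q1 = q1, q0
        elif g == 1:
            q1, q2 = q2, q1
        else:
            q2, q3 = q3, q2

assert sort4(3, 1, 4, 1) == (1, 1, 3, 4)
assert sort4(9, 7, 5, 3) == (3, 5, 7, 9)
```

Each executed swap removes at least one inversion, so the loop terminates after at most six iterations, in agreement with the bound function t.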

Discussion
The approach used here can be summarized as follows.

(15.2.1) • Strategy for developing a loop: Develop guarded
         commands, creating each command so that it makes
         progress towards termination and creating the corres-
         ponding guard to ensure that the invariant is main-
         tained. The process of developing guarded commands
         is finished when enough of them have been developed
         to prove P ∧ ¬BB ⇒ R.

Developing the commands as indicated ensures that points 2, 4 and 5 of
checklist 11.9 are true. The last sentence of the strategy indicates that the
loop is completed when point 3 of the checklist is true. Of course, initiali-
zation to make the invariant true initially (point 1 of the checklist) may
need to be written.
The emphasis in the strategy is on points 2 and 4 of checklist 11.9,
which concern progress towards termination and maintenance of invari-
ance. In the approach used in section 15.1, the emphasis was first on
proving point 3, that upon termination the result R is true.
Let us discuss the seemingly magical step of deleting three guarded
commands from the loop. Once a correct loop has been developed, a
shorter and perhaps more efficient one can sometimes be derived from it.
Each guarded command already satisfies points 2 and 4 of checklist 11.9.
Strengthening the guards cannot destroy the fact that points 2 and 4 are
satisfied, so that the guards can be changed at will, as long as they are
strengthened. The only problem is to ensure that upon termination the
result still holds, i.e. P ∧ ¬BB ⇒ R is still true.
If it is possible to strengthen a guard to F (false) without violating
P ∧ ¬BB ⇒ R, then the corresponding command can never be executed,
so that the guarded command can be deleted. This is what happened in
this example. Only the first three guards were needed to prove P ∧ ¬BB
⇒ R, so that the last three could be strengthened to F and then deleted.
We will return to this point in chapter 19 on efficiency.
This little program is nondeterministic in execution, because two, and
even three, guards can be true at the same time. But, for any initial state

there is exactly one final state, so that in terms of the result the program
is deterministic.
The number of iterations of the loop is equal to the number of inver-
sions, which is at most 6.

Searching a two-dimensional array


Consider again a problem discussed in section 15.1: writing a program
to search a two-dimensional array. The only difference in the problem is
that here the array may be empty (i.e. it may have 0 rows or 0 columns).
The fixed array is b[0:m-1, 0:n-1], where 0 ≤ m and 0 ≤ n, and it is to
be searched for a fixed integer x. Using variables i and j, upon termina-
tion either x = b[i, j] or, if this is not possible, i = m. To be more pre-
cise, R should be established:

(15.2.2) R: (0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i, j]) ∨ (i = m ∧ x ∉ b).

The invariant P, given below in a diagram, states that x is not in the
already-searched rows b[0:i-1] and not in the already-searched columns
b[i, 0:j-1] of the current row i.

(15.2.3) P: 0 ≤ i ≤ m ∧ 0 ≤ j ≤ n ∧ (x is not in the region marked below)

    [diagram: an m-by-n grid; rows 0..i-1 entirely, and row i in columns
     0..j-1, are marked "x not here"]

The bound function is the sum of the number of values in the untested sec-
tion and the number of rows in the untested section: t = (m-i)*n - j + (m-i).
The additional value m-i is needed because possibly j = n. As a first
step in the development, determine the initialization for the loop.

The obvious choice is i, j := 0, 0, for then the section in which "x is not
here" is empty. Note carefully how the invariant includes j ≤ n, instead
of j < n. This is necessary because the number of columns, n, could be
0.
Next, guarded commands for the loop must be developed. What is the
simplest command possible, and what is a suitable guard for it?

The obvious command to try is j := j+1, because it decreases t. (Another
possibility, to be investigated subsequently, is i := i+1.) A suitable guard
must ensure that P remains true. Formally or informally, we can see that
i ≠ m ∧ j ≠ n cand x ≠ b[i, j] can be used, so that the guarded com-
mand is

    i ≠ m ∧ j ≠ n cand x ≠ b[i, j] → j := j+1

Note that this guard has been made as weak as possible. Now, does a
loop with this single guarded command solve the problem? Why or why
not? If not, what other guarded command can be used?

A loop with only this guarded command could terminate with i < m ∧
j = n, and this, together with the invariant, is not enough to prove R.
Indeed, if the first row of b does not contain x, the loop will terminate
after searching through only the first row! Some guarded command must
deal with increasing i.
The command i := i+1 may only be executed if i < m. Moreover, it
has a chance of keeping P true only if row i does not contain x, so con-
sider executing it only under the additional condition j = n. But this
means that j should be set to 0 also, so that the condition on the current
row i is maintained. This leads to the program

(15.2.4) i, j := 0, 0;
         do i ≠ m ∧ j ≠ n cand x ≠ b[i, j] → j := j+1
         □ i ≠ m ∧ j = n → i, j := i+1, 0
         od

It still remains to show that upon termination R is true, i.e. P ∧ ¬BB
⇒ R. Suppose the guards are false. Two cases arise. First, i = m could
hold. Secondly, if i ≠ m, then the falsity of the second guard implies
j ≠ n; therefore the falsity of the first guard implies x = b[i, j]. Thus, if
the guards are false the following must be true:

    i = m cor (i ≠ m ∧ j ≠ n ∧ x = b[i, j]),

and this together with P implies the result R. Hence, the program is
correct. Note that in the case i = m the invariant implies that x is not in
rows 0 through m-1 of b, which means that x ∉ b.

Discussion
This loop was developed by continuing to develop simple guarded
commands that made progress towards termination until P ∧ ¬BB ⇒ R.
This led to a loop with a form radically different from what most pro-
grammers are used to developing (partly because they don't usually know
about guarded commands). It does take time to get used to (15.2.4) as a
loop for searching a two-dimensional array.
This problem is often used to argue for the inclusion of gotos or loop
"exits" in a conventional language, because, unless one uses an extra vari-
able commonly called a "flag", the conventional solution to the problem
needs two nested loops and an "exit" from the inner one:

(15.2.5) i, j := 0, 0;
         while i ≠ m do
            begin while j ≠ n do
                     if x = b[i, j] then goto loopexit
                     else j := j+1;
                  i, j := i+1, 0
            end;
         loopexit:

We see, then, that the guarded command notation and the method of
development together lead to a simpler, easier-to-understand, solution to
the problem -provided one understands the methodology.
How could program (15.2.4) be executed effectively? An optimizing
compiler could analyze the guards and commands and determine the
paths of execution given in diagram (15.2.6); in the diagram, an arrow
with F (T) on it represents the path to be taken when the term from
which it emanates is false (true). But (15.2.6) is essentially a flowchart for
program (15.2.5)! At least in this case, therefore, the "high level" pro-
gram (15.2.4) can be simulated using the "lower-level" constructs of Pas-
cal, FORTRAN and PL/I.
Program (15.2.4) is developed from sound principles. Program (15.2.5)
is typically developed in an ad hoc fashion, using development by test
cases, the result being that doubt is raised whether all cases have been
covered.

(15.2.6) [flowchart: program (15.2.4) drawn with the terms of its guards,
         i ≠ m, j ≠ n and x ≠ b[i, j], evaluated in turn; an F or T arrow
         from each term leads to j := j+1, to i, j := i+1, 0, or out of
         the loop]

Exercises for Section 15.2


1. Write a program for the following problem. Given is a fixed three-dimensional
array c[0:m-1, 0:n-1, 0:p-1], where m, n, p ≥ 0. Given is a fixed variable
x. Using three variables i, j and k, find a value c[i, j, k] with value x; if
¬(x ∈ c), set i to m.
2. Write a program that, given fixed integers X and Y, X > 0, Y > 0, finds the
greatest common divisor gcd(X, Y). The greatest common divisor of X and Y
that are not both 0 is the greatest integer that divides both of them. For example,
gcd(1, 1) = 1, gcd(2, 5) = 1 and gcd(10, 25) = 5. The following properties hold
for x ≠ 0, y ≠ 0:

    gcd(x, y) = gcd(x, y-x) = gcd(x-y, y)
    gcd(x, y) = gcd(x, x+y) = gcd(x+y, y)
    gcd(x, x) = x
    gcd(x, y) = gcd(y, x)
    gcd(x, 0) = gcd(0, x) = x

The first two lines hold because any divisor of x and y is also a divisor of x+y
and x-y, since x/d ± y/d = (x ± y)/d for any divisor d of x and y.
Your program has the result assertion

R: x = y =gcd(X, Y)

The program should not use multiplication or division. It should be a loop (with
initialization) with invariant

P: O<x 1\0<y I\gcd(x,y)=gcd(X, Y)

and bound function t: x +y. Use the properties given above to determine possi-
ble guarded commands for the loop.
3. Redo the program of exercise 2 to determine the greatest common divisor of
three numbers X, Y and Z that are > 0.
4. Write an algorithm to determine gcd(X, Y) for X, Y ≥ 0 using multiplica-
tion and division (see exercise 2). For example, it is possible to subtract a multi-
ple of x from y. The result assertion, invariant and bound function are

    R: x = 0 ∧ y = gcd(X, Y)
    P: 0 ≤ x ∧ 0 ≤ y ∧ (0, 0) ≠ (x, y) ∧ gcd(x, y) = gcd(X, Y)
    t: 2*x + y

5. This problem concerns that part of a scanner of a compiler, or any program
that processes text, that builds the next word or sequence of nonblank symbols.
Characters b[j:79] of character array b[0:79] are used to hold the part of the
input read in but "not yet processed", and another line of input can be read into
b by executing read(b). Input lines are 80 characters long.
It is known that b[j:79] catenated with the remaining input lines is a
sequence

    W | '-' | REST

where "|" denotes catenation, "-" denotes a blank space, W is a nonempty
sequence of nonblank characters, and REST is a string of characters. The pur-
pose of the program to be written is to "process" the input word W, deleting it
from the input and putting it in a character array s. W is guaranteed to be short
enough to fit in s. For example, the top part of the diagram below shows sample
initial conditions with 10-character lines. The bottom diagram gives correspond-
ing final conditions.

    W: 'WORD'
    REST: 'NEXT-ONE-IS-IT---'
    b[j:79]: 'WO'    input: 'RD-NEXT-ON' 'E-IS-IT---'

    Initial Conditions

    b[j:79]: '-NEXT-ON'    input: 'E-IS-IT---'    s[0:v-1]: 'WORD'

    Final Conditions

A loop with initialization is desired. The precondition Q, postcondition R,
invariant P and bound function t are:

    Q: 0 ≤ j < 80 ∧ (b[j:79] | "the input lines") = (W | '-' | REST)
    R: 0 ≤ j < 80 ∧ s[0:length(W)-1] = W ∧
       (b[j:79] | the input lines) = ('-' | REST)
    P: 0 ≤ j ≤ 80 ∧ 0 ≤ v ≤ length(W) ∧
       (s[0:v-1] | b[j:79] | the input lines) = (W | '-' | REST)
    t: 2*length(W) - 2*v + j
Chapter 16
Developing Invariants

Assume we want to develop a program S to satisfy {Q} S {R} for


given Q and R, and that we have decided to use a loop (possibly with
some initialization). How do we find a suitable invariant and bound func-
tion for the loop, before writing the loop? This chapter explores this
question.
Section 16.1 shows how a loop invariant can be seen as a weakening of
the result assertion R and outlines various ways of performing this weak-
ening. This illustrates again that programming is a goal-oriented activity.
Each of the sections 16.2-16.5 discusses in detail one way of weakening
the result assertion and illustrates the technique with several examples.

16.1 The Balloon Theory


This section provides some understanding of the nature of an invariant
P of a loop {Q} do B → S od {R}. Fig. 16.1.1(a) represents the set of
all states, with those represented by postcondition R encircled. Also
encircled is the set of possible initial states IS, which could be established
by some simple assignments. (Actually, IS and R could overlap, but this
is not shown in the Figure.) Now, an invariant, P, of the loop is a predi-
cate that is true before and after each iteration.

Figure 16.1.1 Blowing up the balloon
194 Part III. The Development of Programs

Hence, the set of states represented by P must contain both the set of
possible initial states represented by IS and the set of final states
represented by R, as shown in Fig. 16.1.1(b).

Consider R to be the deflated state of a balloon, which is blown up to
its complete inflated state, P, just before execution of the loop. Each
iteration of the loop will then let some air out of the balloon, until the
last iteration reduces the balloon back to its deflated state R. This is
illustrated in Fig. 16.1.2, where P₀ = P is the balloon before the first
iteration, P₁ the balloon after the first iteration and P₂ the balloon
after the second iteration.

Figure 16.1.2 Letting the air out

Remark: The balloon and its various states of deflation are defined more
precisely as follows. P is the completely inflated balloon. Consider the
bound function t. Let t₀ be the initial value of t, which is determined by
the initialization, t₁ the value of t after the first iteration, t₂ the value
of t after the second iteration, etc. Then the predicate

    P ∧ 0 ≤ t ≤ tᵢ

denotes the set of states in the balloon after the ith iteration. Thus,
initialization deflates the balloon to include only states in P ∧ 0 ≤ t ≤ t₀,
the first iteration deflates it more to P ∧ 0 ≤ t ≤ t₁, etc. □

The problem, of course, is to know how to blow up the balloon, so that
execution of the loop can deflate it. That is, how does one find P and
the bound function t? What information is available? Clearly, only the
result assertion R and the set of initial states IS. Since the balloon
begins as R and is blown up to encompass the initial states but is then
deflated, it seems that R would be the more important of the two. This
becomes more plausible when we consider that the initial conditions may
not even be known until P is known. Subsequently, methods will be
investigated to blow up a balloon (actually, to weaken a relation) until
it encompasses a set of states IS that can be easily established.

Weakening a predicate
Here are four ways of weakening a predicate R:

1. Delete a conjunct. For example, predicate A ∧ B ∧ C can
be weakened to A ∧ C.
2. Replace a constant by a variable. For example, predicate
x ≤ b[1:10], where x is a simple variable, can be weakened to
x ≤ b[1:i] ∧ 1 ≤ i ≤ 10, where i is a fresh variable. Since a new
variable has been introduced, its possible set of values must be
precisely stated. Of course, the range must include the value of
the replaced constant.
3. Enlarge the range of a variable. For example, predicate
5 ≤ i < 10 can be weakened to 0 ≤ i < 10.
4. Add a disjunct. For example, predicate A can be weakened
to A ∨ B, for some other predicate B.

The first three methods are quite useful. In each, insight for weakening
R comes directly from the form and content of R itself, and the
number of possibilities to try is generally small. The methods may
therefore provide the kind of directed, disciplined development we are
looking for.

The fourth method of weakening a predicate, in all its generality, is
rarely useful in programming. There is no reason to try to add one
disjunct rather than another, and hence adding a disjunct would be a
random task with an infinite number of possibilities. We shall not
analyze this method further.

16.2 Deleting a Conjunct


In this section, the development of an invariant of a loop by deleting a
conjunct of the desired result assertion is illustrated and discussed.

Approximating the square root of a number


Write a program that, given a fixed integer n ≥ 0, establishes the truth
of

(16.2.1) R: 0 ≤ a² ≤ n < (a+1)²

Taking the square root of all terms in R, we find that R is equivalent to
0 ≤ a ≤ √n < a+1. Hence, a is the largest integer that is at most √n.
The first step is to rewrite R as a set of conjuncts:
R: 0 ≤ a² ∧ a² ≤ n ∧ n < (a+1)²
Deleting the third conjunct of R yields a possible invariant:

P: 0 ≤ a² ≤ n

Because n ≥ 0, P can be established by the assignment a:= 0. For the
guard of the loop, use the complement of the deleted conjunct, so that
when the loop terminates because the guard is false, the deleted conjunct
is true. This yields the almost-completed program

a:= 0; do (a+1)² ≤ n → ? od

The purpose of the command of the loop is to progress towards
termination. Clearly, if the guard of the loop is true then a is too small,
so that progress can be made by increasing a. Since a is bounded above
by √n, a possible bound function is t = ceil(√n) - a. Using the easiest
way to increase a, incrementing by 1, yields the program

(16.2.2) a:= 0; do (a+1)² ≤ n → a:= a+1 od

We show that P is indeed an invariant of the loop:

P ∧ B = 0 ≤ a² ≤ n ∧ (a+1)² ≤ n
      ⇒ 0 ≤ (a+1)² ≤ n
      = wp("a:= a+1", P)
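Program (16.2.2) can be rendered directly in an executable language. Here is a Python sketch (the function name is ours, not the book's), with the invariant and bound function recorded as comments:

```python
def approx_sqrt(n):
    """Establish R: 0 <= a*a <= n < (a+1)*(a+1), for fixed integer n >= 0.

    Program (16.2.2): the guard (a+1)**2 <= n is the complement of the
    deleted conjunct, so when the loop stops, the deleted conjunct holds.
    """
    assert n >= 0
    a = 0                      # establishes P: 0 <= a*a <= n, since n >= 0
    while (a + 1) ** 2 <= n:   # guard B
        a = a + 1              # maintains P; decreases t = ceil(sqrt(n)) - a
    return a
```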

Discussion
Here, strategy 15.1.4 was used to develop the loop: first the guard
was created and then the loop body. The guard was created in such a
simple and useful manner that it deserves being called a strategy itself.

(16.2.3) Strategy: When deleting a conjunct from R to
produce an invariant P, try using the complement of the
deleted conjunct for the guard B of the loop.

Choosing B in this manner will ensure that the necessary condition
P ∧ ¬B ⇒ R (point 3 of checklist 11.9) is automatically satisfied.

Exercise 1 is to develop a program by deleting the second conjunct
instead of the third.

The execution time of the program is proportional to √n. A faster
program to approximate the square root of a number will be developed in
section 16.3.

Linear search
As a second example of deleting a conjunct, consider the following
problem. Given is a fixed array b[0:m-1] where 0 < m. It is known that
a fixed value x is in b[0:m-1]. Write a program to determine the first
occurrence of x in b, i.e. to store in a variable i the least integer such
that x = b[i].

The first task is to specify the program more formally. This is easy to
do; we have the following precondition Q and postcondition R:

Q: 0 < m ∧ x ∈ b[0:m-1]
R: 0 ≤ i < m ∧ x ∉ b[0:i-1] ∧ x = b[i]

R can be written in more detail as follows:

R: 0 ≤ i < m ∧ (A j: 0 ≤ j < i: x ≠ b[j]) ∧ x = b[i]

Now, R contains three conjuncts; which should be deleted to obtain
an invariant?

A good invariant should be easy to establish. The first two conjuncts are
established by the assignment i:= 0, while most of the difficulty of the
program lies in establishing the third. Hence, it makes sense to delete the
third conjunct, yielding the following invariant:

(16.2.4) P: 0 ≤ i < m ∧ (A j: 0 ≤ j < i: x ≠ b[j])

What should be the guard of the loop?

Use the complement of the deleted conjunct. Thus far, the program is

i:= 0; do x ≠ b[i] → ? od

Choose the command for the loop, explaining how it was found.

The task of the command is to make progress towards termination. A
possible bound function is t: m - i, which is always > 0 (see the
invariant), and the obvious way to decrease it is to increment i by 1. It is
fairly easy to see that execution of i:= i+1 under the condition x ≠ b[i]
leaves P true. This leaves us with the program well-known as Linear
Search:

(16.2.5) {Q} i:= 0; do x ≠ b[i] → i:= i+1 od {R}
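Program (16.2.5), Linear Search, in executable form (a Python sketch; the name is illustrative):

```python
def linear_search(b, x):
    """Establish R: x != b[j] for 0 <= j < i, and x == b[i].

    Precondition Q: 0 < len(b) and x occurs in b. As in the
    guarded-command version, the loop relies on Q: since x is in b,
    the subscript i never runs off the end of the array.
    """
    i = 0                # establishes P: x not among b[0:i] (vacuously)
    while x != b[i]:     # complement of the deleted conjunct x == b[i]
        i = i + 1        # decreases t = len(b) - i; maintains P
    return i
```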



Discussion
The program is certainly correct, but let us try formally to prove it
using checklist 11.9. First, show that invariant (16.2.4) is initially true:

wp("i:= 0", (16.2.4)) = 0 ≤ 0 < m ∧ x ∉ b[0:-1]

which is certainly implied by Q. Next, prove that (16.2.4) is indeed an
invariant of the loop. This requires showing that

(16.2.4) ∧ x ≠ b[i] ⇒ wp("i:= i+1", (16.2.4))

or 0 ≤ i < m ∧ x ∉ b[0:i] ⇒ 0 ≤ i+1 < m ∧ x ∉ b[0:i]

Is this true? Certainly not: the antecedent is not enough to prove that
i+1 < m! The problem is that we have neglected to include in the
invariant the fact that x ∈ b[0:m-1]. Formally, the invariant should be

(16.2.6) P: 0 ≤ i < m ∧ x ∉ b[0:i-1] ∧ x ∈ b[0:m-1]

With this slight change, one can formally prove that the program is
correct (see exercise 5).
correct (see exercise 5).
In omitting the conjunct x ∈ b[0:m-1] we were simply using our
mathematician's license to omit the obvious. Note that all the free
identifiers of x ∈ b[0:m-1] are fixed throughout Linear Search: x, b
and m are not changed. Hence, facts concerning only these identifiers do
not change. It can be assumed that the reader of the algorithm and its
surrounding text will remember these facts, so that they don't have to be
repeated over and over again.

Later on, such obvious detail will be omitted from the picture when it
doesn't hamper understanding. For now, however, your task is to gain
experience with the formalism and its use in programming, and for this
purpose it is better to be as precise and careful as possible. It is also to
be remembered that text surrounding a program in a book such as this
one rarely surrounds that same program when it appears in a program
listing, as it should. Be extremely careful in your program listings to
present the program as clearly and fully as possible.

The program illustrates an important, but often forgotten, principle:

(16.2.7) The Linear Search Principle: to find a minimum value
(at least equal to some lower bound) with a property,
investigate values starting at that lower bound in
increasing order. Similarly, when looking for a maximum
value, investigate values in decreasing order.
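The Linear Search Principle can be captured as a generic sketch (the names `lo` and `pred` are ours; as in the principle's premise, a least satisfying value is assumed to exist):

```python
def least_satisfying(lo, pred):
    """Least k >= lo with pred(k) true, assuming such a k exists."""
    k = lo
    while not pred(k):   # investigate candidates in increasing order
        k = k + 1
    return k
```

For instance, `least_satisfying(0, lambda k: k * k > 10)` investigates 0, 1, 2, ... and stops at the first value whose square exceeds 10.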

Exercises for Section 16.2


1. A program was developed to find an approximation to the square root of n by
deleting the conjunct n < (a+1)² of result assertion (16.2.1). Develop a different
program by deleting the conjunct a² ≤ n instead. Compare the running times of
the two programs (see Appendix 4).
2. Write a program that, given a fixed integer n > 0, finds the largest integer that
is (1) a power of 2, and (2) at most n. (First write down a formal specification
and then derive the invariant by deleting a conjunct.)
3. Write a program that, given two fixed integers x and y satisfying x ≥ 0 and
y > 0, finds the quotient q and remainder r when dividing x by y. That is, it
establishes 0 ≤ r ∧ r < y ∧ q*y + r = x. The program may not use
multiplication or division. Develop the invariant of the loop by deleting a
conjunct.
4. Write a program that, given a fixed array b[0:m-1, 0:n-1] and a fixed value
x in b, determines the "first" position of x in b. By "first" is meant that x is
not in a previous row or in a previous column of the current row. That is, using
two variables i and j, the program should establish the predicate

R: 0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i,j] ∧
   x ∉ b[0:i-1, 0:n-1] ∧ x ∉ b[i, 0:j-1]

5. Prove with the help of checklist 11.9 that program (16.2.5) is correct, using loop
invariant (16.2.6) and bound function t: m - i.

16.3 Replacing a Constant By a Variable

Summing the elements of an array

A second method for weakening a predicate, replacing a constant by a
variable, is illustrated with the following problem. Write a program that,
given a fixed integer n ≥ 0 and fixed integer array b[0:n-1], stores in
variable s the sum of the elements of b. The result assertion R can be
expressed as

(16.3.1) R: s = (Σ j: 0 ≤ j < n: b[j])

The fact that each array element is involved in the sum suggests that a
loop of some form should be developed, so R should be weakened to
yield a suitable invariant P. R contains the constant n (i.e. n may not
be changed). R can therefore be weakened by replacing n by a fresh
variable i, yielding

s = (Σ j: 0 ≤ j < i: b[j])

At the same time, however, reasonable bounds should be placed on i.
Motivating the choice of bounds is the need to establish the invariant
initially and the probable final value of i, which is n. The above predicate
can be established by i, s:= 0, 0, so a possible lower bound for i is 0.
Therefore, the range 0:n is chosen, yielding the invariant

P: 0 ≤ i ≤ n ∧ s = (Σ j: 0 ≤ j < i: b[j])

Program (15.1.1) for this problem was developed using this loop invariant
and the bound function t = n - i:

i, s:= 0, 0; do i ≠ n → i, s:= i+1, s+b[i] od
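In Python, this loop carries over directly, the multiple assignment becoming tuple assignment (a sketch; the function name is ours):

```python
def array_sum(b):
    """Establish R: s == sum of the elements of b, for a fixed list b."""
    n = len(b)
    i, s = 0, 0                    # establishes P: 0 <= i <= n and s == sum(b[0:i])
    while i != n:                  # guard
        i, s = i + 1, s + b[i]     # maintains P; decreases t = n - i
    return s
```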

Discussion
Two other constants of R could be replaced to yield an invariant.
Replacing the constant 0 yields the invariant

0 ≤ i ≤ n ∧ s = (Σ j: i ≤ j < n: b[j])

Using this as an invariant, one can develop a loop that adds the elements
b[j] to s in decreasing order of subscript value j (see exercise 1 of
section 15.1).

If result assertion R is written as

s = (Σ j: 0 ≤ j ≤ n-1: b[j])

the constant expression n-1 can be replaced to yield the invariant

-1 ≤ i ≤ n-1 ∧ s = (Σ j: 0 ≤ j ≤ i: b[j])

Note carefully the lower bound on i this time. Because n can be zero,
the array can be empty. Therefore the assignment i, s:= 0, b[0], a
favorite of many for initializing such a loop, cannot be used here. The
initialization must be i, s:= -1, 0. (See exercise 1.)

This example illustrates that there may be several constants to choose
from when replacing a constant by a variable. In general, the constant is
chosen so that the resulting invariant can be easily established, so that the
guard(s) of the loop are simple and, of course, so that the command(s) of
the loop can be easily written. This is a trial-and-error process, but one
gets better at it with practice.

Too often, variables are introduced into a program without the
programmer really knowing why, or whether they are even needed. In
general, the following is a good principle to follow.

(16.3.2) Principle: Introduce a variable only when there is a
good reason for doing so.

We now have at least one good reason for introducing a variable: the
need to weaken a result assertion to produce an invariant. It goes without
saying that each variable introduced will be defined in some manner.
Part of this definition, which is often forgotten, is the range of the
variable. We emphasize the need for this range with the following

(16.3.3) Principle: Put suitable bounds on each variable introduced.

Approximating the square root of a number

As a second example of replacing a constant by a variable, consider
the following problem. Write a program that, given a fixed integer n ≥ 0,
establishes the truth of

(16.3.4) R: a² ≤ n < (a+1)²

A program for this problem was developed in section 16.2 by deleting the
conjunct n < (a+1)²; the program took time proportional to √n. Here
we use the method of replacing a constant by a variable.

First try replacing the expression a+1 by a fresh variable b to yield

a² ≤ n < b²

Clearly, b must be greater than a if this predicate is to be true.
Moreover, the predicate can be established by executing a, b:= 0, n+1.
Hence, b is bounded by a+1 and n+1, and the invariant is

P: a < b ≤ n+1 ∧ a² ≤ n < b²

The guard B for the loop, obtained by investigating P ∧ ¬B ⇒ R, is
a+1 ≠ b. Thus far, the program is

a, b:= 0, n+1;
do a+1 ≠ b → ? od

Since P indicates that a+1 ≤ b and the loop should terminate with
a+1 = b, the task of each iteration is to bring a and b closer together,
i.e. to decrease the value of b-a. Execution should continue until
b-a = 1. Hence, a possible bound function t is b-a-1.

The size of the interval (a, b) could be decreased by one at each
iteration, but perhaps a faster technique exists. Perhaps the interval could
be halved, by setting either a or b to the midpoint (a+b)÷2. If so, the
command of the loop could have the form

(16.3.5) if ? → a:= (a+b)÷2  []  ? → b:= (a+b)÷2 fi

Each command must maintain the invariant P. To find a suitable guard
for the first command, first calculate

(16.3.6) wp("a:= (a+b)÷2", P)
         = (a+b)÷2 < b ≤ n+1 ∧ ((a+b)÷2)² ≤ n ∧ b² > n.

The precondition of (16.3.5) will be the invariant together with the guard
of the loop:

P ∧ a+1 ≠ b.

The extra condition needed to imply (16.3.6) is ((a+b)÷2)² ≤ n, so we
take it as the guard for the first command. In a similar fashion, the guard
for the second command is found to be ((a+b)÷2)² > n. Introducing a
fresh variable d to save local calculations, we arrive at the program

(16.3.7) a, b:= 0, n+1;
         {invariant P: a < b ≤ n+1 ∧ a² ≤ n < b²}
         {bound t: b-a+1}
         do a+1 ≠ b → d:= (a+b)÷2;
                      if d*d ≤ n → a:= d  []  d*d > n → b:= d fi
         od
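Program (16.3.7) in Python (a sketch; `//` plays the role of the integer division ÷):

```python
def sqrt_by_halving(n):
    """Establish R: a*a <= n < (a+1)*(a+1), in time proportional to log n.

    Invariant P: a < b <= n+1 and a*a <= n < b*b.
    """
    assert n >= 0
    a, b = 0, n + 1            # establishes P
    while a + 1 != b:
        d = (a + b) // 2       # midpoint; a < d < b holds because a+1 < b
        if d * d <= n:
            a = d              # maintains a*a <= n
        else:                  # d*d > n
            b = d              # maintains n < b*b
    return a
```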

Discussion
It may seem that the technique of halving the interval was pulled out
of a hat. It is simply one of the useful techniques that programmers must
know about, for its use often speeds up programs considerably. The
execution time of this program is proportional to log n, while the
execution time of the program developed in section 16.2 is proportional
to √n.

Program (16.3.7) illustrates another reason to introduce a variable: d
has been introduced to make a local optimization. The introduction of d
not only reduces the number of times the expression (a+b)÷2 is
evaluated, it also makes the program more readable.

Note that no definition is given for d. Variable d is essentially a
constant of the loop body. It is assigned a value upon entrance to the
loop body, and this value is used throughout the body. It carries no value
from iteration to iteration. Moreover, d is used only in two adjacent
lines, and its use is obvious from these two lines. A definition of d would
belabor the obvious and is therefore omitted.

A similar program can be developed by replacing the second
occurrence of a in (16.3.4) by a variable; see exercise 3.

The Plateau problem

Given is a fixed, ordered (by ≤) array b[0:n-1], where n > 0. A
plateau of the array is a sequence of equal values. Write a program to
store in variable p a value to establish

(16.3.8) R: p is the length of the longest plateau of b[0:n-1].

It may be possible to develop a satisfactory program without defining
the length of the longest plateau in more detail. Nevertheless, as a first
step, rewrite (16.3.8) in the predicate calculus.

The value p is the length of the longest plateau if there is a sequence of p
equal values and no sequence of p+1 equal values. That is,

b[0:n-1] contains a plateau of length p ∧
b[0:n-1] does not contain a plateau of length p+1

Because the array is sorted, a subsection b[k:j] is a plateau if and only if
its end elements b[k] and b[j] are equal. This allows us to write R in
the predicate calculus as follows:

(16.3.9) R: (E k: 0 ≤ k ≤ n-p: b[k] = b[k+p-1]) ∧
            (A k: 0 ≤ k ≤ n-p-1: b[k] ≠ b[k+p])

The only difficulty in writing (16.3.9) might have been in getting k's
bounds correct. Subsequently, we will work with R as written in (16.3.8),
but we will turn to the more formal definition (16.3.9) when insight is
needed.
Clearly, iteration is needed for this program. Remembering the point
of this section, what loop invariant would you choose?

The length of the longest plateau of an array of length 1 is obviously 1.
Therefore, the following invariant, found by replacing the constant n of
R by a fresh variable i, can be easily established:

(16.3.10) P: 1 ≤ i ≤ n ∧ p is the length of the longest plateau of b[0:i-1]

What should be the bound function, the initialization and the guard of the
loop?

The bound function is t = n - i. The loop initialization is i, p:= 1, 1. The
guard of the loop is i ≠ n. What should be the command of the loop?

Each iteration must increase i, and it seems reasonable to increase i by 1
(but see exercise 10). But this may call for a change in p in order to
reestablish the invariant. Thus, we consider the two commands i:= i+1
and i, p:= i+1, p+1. We determine the conditions under which execution
of the first maintains P:

wp("i:= i+1", P) = 1 ≤ i+1 ≤ n ∧ p is the length of the
                   longest plateau of b[0:i]

The first conjunct is implied by the guard of the loop. What extra
condition is needed to imply the second conjunct?

It is already known, from P, that p is the length of the longest plateau of
b[0:i-1]. Therefore, p is the length of the longest plateau of b[0:i] iff
b[i-p:i] is not a plateau. Looking carefully at definition (16.3.9) of the
longest plateau, we determine that this holds iff b[i-p] ≠ b[i]. This
leads directly to the guards for both of the commands i:= i+1 and
i, p:= i+1, p+1, and the loop body is

if b[i] ≠ b[i-p] → i:= i+1
[] b[i] = b[i-p] → i, p:= i+1, p+1
fi

The final program is given in (16.3.11).

(16.3.11) i, p:= 1, 1;
          {invariant P: 1 ≤ i ≤ n ∧
              p is the length of the longest plateau of b[0:i-1]}
          {bound t: n-i}
          do i ≠ n → if b[i] ≠ b[i-p] → i:= i+1
                     [] b[i] = b[i-p] → i, p:= i+1, p+1
                     fi
          od
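Program (16.3.11) as a Python sketch (requiring, as the text states, a nonempty array in which all equal values are adjacent):

```python
def longest_plateau(b):
    """Establish R: p is the length of the longest plateau of b.

    Invariant P: 1 <= i <= len(b) and p is the length of the
    longest plateau of the first i values, b[0:i].
    """
    n = len(b)
    assert n > 0
    i, p = 1, 1                  # a one-element array has plateau length 1
    while i != n:
        if b[i] != b[i - p]:     # b[i-p:i+1] is not a plateau
            i = i + 1
        else:                    # b[i] == b[i-p]: a plateau of length p+1
            i, p = i + 1, p + 1
    return p
```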

Discussion
A common mistake in developing this program is to introduce, too
early in the game, a variable v that contains the value of the latest,
longest plateau, so that the test would be b[i] = v instead of
b[i] = b[i-p]. I made this mistake the first time I developed the
program. But it only complicates the program. Principle (16.3.2),
introduce a variable only when there is good reason to do so, should be
followed.


Carefully writing definition (16.3.9) of the length of a longest plateau
did help subsequently in determining the body of the loop. Without
(16.3.9), it is too easy to overlook the simple test b [i] = b [i -p l This
once again illustrates the usefulness of writing simple, clear definitions.
This program finds the length of the longest plateau for any array,
even if not sorted, as long as all equal values are adjacent. It is possible
to speed up the program by increasing i by more than I -example, by
p - but the program becomes more complicated.

Exercises for Section 16.3


1. Write a program to sum array elements b[0:n-1]. The result assertion is

   R: s = (Σ j: 0 ≤ j ≤ n-1: b[j])

The invariant is to be found from the result assertion by replacing the constant
n-1 by a variable.
2. Prove formally that the body of the loop of program (16.3.7) actually decreases
the bound function (point 5 of checklist 11.9). The important point here is that,
when the body of the loop is executed, a+1 < b.
3. Develop a program for approximating the square root of n by replacing the
second occurrence of a in (16.3.4) by b, yielding the invariant

   a² ≤ n < (b+1)²

Don't forget to choose suitable bounds for b. Compare the resulting program,
and the effort needed to derive it, with the development presented earlier.
4. (Binary Search). Write a program that, given fixed x and fixed, ordered (by
≤) array b[1:n] satisfying b[1] ≤ x < b[n], finds where x belongs in the array.
That is, for a fresh variable i the program establishes

   R: 1 ≤ i < n ∧ b[i] ≤ x < b[i+1]

The execution time of the program should be proportional to log n.
   After writing the program, incorporate it in a program for a more general
search problem: with no restriction on the value x, determine i to satisfy

   (i = 0 ∧ x < b[1]) ∨
   (1 ≤ i < n ∧ b[i] ≤ x < b[i+1]) ∨
   (i = n ∧ b[n] ≤ x)

5. Write a program that, given fixed, ordered array b[0:n-1] where n > 0, finds
the number of plateaus in b[0:n-1].
6. Write a program that, given fixed array b[0:n-1] where n > 0, finds the
position of a maximum value in b, i.e. establish

   R: 0 ≤ k < n ∧ b[k] ≥ b[0:n-1].

The program should be nondeterministic if the maximum value occurs more than
once in b.
7. Write a program that, given fixed array b[0:n-1] where n ≥ 0, stores in d
the number of odd values in b[0:n-1].
8. Given are two fixed, ordered arrays f[0:m-1] and g[0:n-1], where m, n ≥ 0.
It is known that no two elements of f are equal and that no two elements of g
are equal. Write a program to determine the number of values that occur both
in f and g. That is, establish

   k = (N i,j: 0 ≤ i < m ∧ 0 ≤ j < n: f[i] = g[j])

9. Write a program that, given fixed array b[0:n-1], where n ≥ 0, determines
whether b is zero: using a fresh Boolean variable s, the program establishes

   R: s = (A j: 0 ≤ j < n: b[j] = 0)

10. Write another program to find the length of the longest plateau of b[0:n-1].
This algorithm uses the idea that the loop body should investigate one plateau at
each iteration. The loop invariant is therefore

   0 ≤ i ≤ n ∧ p = length of longest plateau of b[0:i-1] ∧
   (i = 0 cor i = n cor b[i-1] ≠ b[i])

You may use the fact that the length of the longest plateau of an empty array is
zero. This exercise is illustrative of the fact that not all loop invariants will arise
directly from considering the strategies for developing invariants discussed in this
chapter. Here, we actually added a conjunct, thus strengthening the invariant, to
produce another program.

16.4 Enlarging the Range of a Variable


Another look at Linear Search
The next method for weakening the result assertion is illustrated by an
example that was already discussed, Linear Search. Write a program
that, given a fixed integer n > 0 and an array b[0:n-1] that is known to
contain a value x, finds the first occurrence of x in b.

Denote by iv the least value i satisfying 0 ≤ i ∧ x = b[i]. iv is
guaranteed to exist, by the definition of the problem. Then, using a
variable i, the result assertion for this program can be written as

R: i = iv

The Linear Search Principle indicates that a search for a value i satisfying
R should be in order of increasing value, beginning with the lowest.
Thus, the invariant for the loop will be

P: 0 ≤ i ≤ iv

The loop is then written as

i:= 0; do x ≠ b[i] → i:= i+1 od {i = iv}

Discussion
The method used to develop the invariant was to enlarge the range of a
variable. In R, variable i could have only one value: iv. This range of
values is enlarged to the set {0, 1, ..., iv}. In this case, the enlarging
came from weakening the relation i = iv to i ≤ iv and then putting a
lower bound on i. This method is similar to the last one, introducing a
variable and supplying its range; it just happens that the variable is
already present in R.

The example illustrates another important principle:

(16.4.1) Principle: Introduce a name to denote a value that is
to be determined.

Sometimes, introduction of such a name allows us to be more informal,
but not less precise. It may be quite easy to describe a relation in
English but less easy to put it in the predicate calculus and, moreover, the
English description may be enough to give the desired insight. But don't
use this technique as a license to avoid the predicate calculus completely,
for the calculus enables us to reason more effectively about the programs
we are creating.

The Welfare Crook

We now proceed to a second example where enlarging the range of a
variable is useful. Suppose we have three long magnetic tapes, each
containing a list of names in alphabetical order. The first list contains the
names of people working at IBM Yorktown, the second the names of
students at Columbia University and the third the names of people on
welfare in New York City. Practically speaking, all three lists are endless,
so no upper bounds are given. It is known that at least one person is on
all three lists. Write a program to locate the first such person (the one
with the alphabetically smallest name).
To get at the essence of the problem, consider searching three ordered
arrays (with no upper bounds) f[0:?], g[0:?] and h[0:?] for the least
value that is on all three of them; this least value is known to exist.

This program is often written in 10 to 30 lines of code in FORTRAN,
PL/I or ALGOL 68 by those unexposed to the methods given in this
book. The reader might wish to develop the program completely before
studying the subsequent development.
What is the first step in writing the program? Do it.

The first step is to write pre- and postconditions Q and R. Since the lists
f, g and h are fixed, we will use the fact that they are alphabetically
ordered without mentioning it in Q or R. So Q is simply T. Using iv,
jv and kv to denote the least values satisfying f[iv] = g[jv] = h[kv], and
using three simple variables i, j and k, the postcondition R can be
written as

R: i = iv ∧ j = jv ∧ k = kv

Notice how the problem of defining the values iv, jv and kv in detail has
been finessed. We know what least means, and hope to proceed without a
formal definition. Now, why should a loop be used? Develop the
invariant and bound function for the loop.

The program must search through a variable number of entries in the
lists, and this suggests using iteration. The Linear Search Principle,
(16.2.7), suggests that one search from the beginning of the lists.
Enlarging the range of the three variables i, j and k yields the invariant

P: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv

The bound function is t = iv - i + jv - j + kv - k.

Now, what is the initialization, and what commands would one first
think of in order to make progress towards termination?


Now, what is the initialization, and what commands would one first
think of in order to make progress towards termination?

The initialization is i, j, k:= 0, 0, 0. The simplest ways to decrease the
bound function are: i:= i+1, j:= j+1 and k:= k+1. Generally speaking,
it will be necessary to increment all three variables, so a loop of the
following form is suggested.

(16.4.2) i, j, k:= 0, 0, 0;
         do ? → i:= i+1
         [] ? → j:= j+1
         [] ? → k:= k+1
         od

Now, develop a suitable guard for the command i:= i+1.

We have:

wp("i:= i+1", P) = 0 ≤ i+1 ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv

The last two conjuncts, and also 0 ≤ i+1, are implied by the invariant, so
only i+1 ≤ iv must be implied by the guard. The guard cannot be
i+1 ≤ iv, because the program may not use iv. But the relation
i+1 ≤ iv, together with P, means that f[i] is not the crook, and this is
true if f[i] < g[j]. Thus, the guard can be f[i] < g[j]. In words, since
the crook does not come alphabetically before g[j], if f[i] comes
alphabetically before g[j], then f[i] cannot be the crook.

But the guard could also be f[i] < h[k] and, for the moment, we
choose the disjunction of the two for the guard:

f[i] < g[j] ∨ f[i] < h[k]

The other guards are written in a similar fashion to yield the program

(16.4.3) i, j, k:= 0, 0, 0;
         do f[i] < g[j] ∨ f[i] < h[k] → i:= i+1
         [] g[j] < h[k] ∨ g[j] < f[i] → j:= j+1
         [] h[k] < f[i] ∨ h[k] < g[j] → k:= k+1
         od

This program terminates, and, upon termination, P is true. But we have
not yet proved that upon termination the desired result holds. Do so.

Point 3 of checklist 11.9 is proved by showing that

(16.4.4) P ∧ ¬BB ⇒ R

holds, where P is the invariant, BB is the disjunction of the guards and R
is the result assertion.

So suppose the guards are false. Looking at the first disjunct of each
guard, and assuming it is false, we have:

f[i] ≥ g[j] ≥ h[k] ≥ f[i]

Hence, upon termination we have f[i] = g[j] = h[k] and R holds.


Can any further simple change be made to make the program more
efficient?

Note that only the first disjunct of each guard is needed to prove (16.4.4).
Hence, the second disjuncts can be eliminated to yield the program

(16.4.5) i, j, k:= 0, 0, 0;
         do f[i] < g[j] → i:= i+1
         [] g[j] < h[k] → j:= j+1
         [] h[k] < f[i] → k:= k+1
         od
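Program (16.4.5) in Python (a sketch: the if/elif chain simply selects a true guard, standing in for the nondeterministic choice of the do ... od; as in the text, a common value is assumed to exist):

```python
def welfare_crook(f, g, h):
    """Establish R: f[i] == g[j] == h[k] at the least such triple.

    f, g, h are ascending sequences containing a common value; the loop
    never indexes past that value, so no bounds checks are needed.
    """
    i, j, k = 0, 0, 0          # establishes P: 0 <= i <= iv, etc.
    while not (f[i] == g[j] == h[k]):
        if f[i] < g[j]:        # f[i] cannot be the crook
            i = i + 1
        elif g[j] < h[k]:      # g[j] cannot be the crook
            j = j + 1
        else:                  # h[k] < f[i]: h[k] cannot be the crook
            k = k + 1
    return i, j, k
```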

Discussion
In developing this program, for the first guard, at first f[i] < g[j] is
developed, and then weakened to f[i] < g[j] ∨ f[i] < h[k]. Why is it
weakened?

Well, the first concern is to obtain a correct program; the second
concern is to obtain an efficient one. In proving correctness, one task is to
prove that, upon termination, (16.4.4) holds. The stronger ¬BB is, the
more chance we have of proving (16.4.4). Since BB is the complement of
¬BB, this means that the weaker BB is, the more chance we have of
proving (16.4.4). Thus, we have the following principle:

(16.4.6) Principle: The more guarded commands and the
weaker their guards, the easier it may be to develop a
correct program.

Of course, this principle does not provide a license to develop hundreds of
cases; simplicity and minimum case analysis must still be maintained.

The concern for efficiency caused us to simplify the guards to yield
program (16.4.5). This will be discussed in some detail in section 19.1.

16.5 Combining Pre- and Postconditions


Sometimes the use of just one of the three methods described thus far
for weakening an assertion will not yield a suitable loop invariant. This
may happen, for example, when the input variables are themselves to be
modified to form part of the result of execution. Thus, one may have to
use a combination of methods.
In many cases, it is useful to remember from our balloon theory (sec-
tion 16.1) that both the pre- and postcondition of a loop imply the invari-
ant and, therefore, to consider both of them when developing the invari-
ant. Can both the pre- and postcondition be put in the same form, so
that the invariant is seen as a simple generalization of both? Can the
invariant be considered as a sort of union of both?
In this section, we illustrate this approach with two problems.

Inserting Blanks
Consider the following problem. Write a program that, given fixed
n ≥ 0, fixed p ≥ 0, and array b[0:n-1], adds p*i to each element b[i] of
b. Formally, using Bi to represent the initial value of b[i], we have

Precondition Q:  (A i: 0 ≤ i < n: b[i] = Bi)

Postcondition R: (A i: 0 ≤ i < n: b[i] = Bi + p*i)

This problem arose when writing a program to insert blanks between
words of a line in order to right-justify the line. The Bi are the numbers
of the columns of the beginning of successive words on a line, and p is
the number of blanks to be inserted between each pair of words. After
insertion, the first word will begin in column B0, the second in column
B1 + p, the third in B2 + 2*p, and so forth.
The problem suggests a loop that changes one b[i] at each iteration.
To derive an invariant, first replace the constant n of R by a variable j:

P': 0 ≤ j ≤ n ∧ (A i: 0 ≤ i < j: b[i] = Bi + p*i)

P' states that the first j elements of b have their final values. But the
fact that the other n - j elements have their initial values should also be
included, and the full invariant is

P: 0 ≤ j ≤ n ∧ (A i: 0 ≤ i < j: b[i] = Bi + p*i) ∧
   (A i: j ≤ i < n: b[i] = Bi)

This leads to the program



(16.5.1) j := 0;
         do j ≠ n → j, b[j] := j+1, b[j] + p*j od

The development of the invariant was a two-step process; a constant was
replaced by a variable and the resulting predicate was modified to take
into account initial conditions. (See section 20.1 for further work with
this example.)
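Program (16.5.1) translates into a conventional language with one caveat: the multiple assignment updates j and b[j] simultaneously, so a sequential rendering must change b[j] before incrementing j. A Python sketch (the function name is mine, not from the text):

```python
def add_offsets(b, p):
    """Establish b[i] = B[i] + p*i for every i, as in program (16.5.1).
    Invariant: b[0:j] holds final values, b[j:n] still holds initial values."""
    j = 0
    while j != len(b):
        b[j] = b[j] + p * j   # must happen before j is incremented
        j += 1
    return b
```

With word-start columns b = [5, 5, 5] and p = 3 blanks per gap, the result is [5, 8, 11], matching the right-justification story in the text.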

Swapping Equal-Length Sections


The next problem is as follows. Write a program that, given an array
b[0:m-1] with two non-overlapping sections b[i:i+n-1] and b[j:j+n-1],
both of length n ≥ 0, swaps the two sections. For example, if
the two sections have the values given in Q below, then upon termination
they have the values displayed in R below. In the diagrams, X and Y
denote the initial values of b[i:i+n-1] and b[j:j+n-1], respectively.

        i           i+n-1         j           j+n-1
Q:    b | X[0:n-1]  |        ∧  b | Y[0:n-1]  |

        i           i+n-1         j           j+n-1
R:    b | Y[0:n-1]  |        ∧  b | X[0:n-1]  |
For the rest of the development, a less formal approach will be used,
which uses the insight gained thus far without requiring all the formal
details. We take for granted that only the sections mentioned should be
changed and that they do not overlap, and use the following diagrams for
the pre- and postconditions -"unswapped" ("swapped") means that the
values in the indicated section have their initial (final) values:

        i           i+n-1         j           j+n-1
Q:    b | unswapped |        ∧  b | unswapped |

        i           i+n-1         j           j+n-1
R:    b | swapped   |        ∧  b | swapped   |
Since each element of the two sections must be swapped, a loop is sug-
gested that will swap one element of each at a time. The first step in find-
ing the invariant is to replace the constant n of R by a variable k:

                     i         i+k-1        j         j+k-1
P': 0 ≤ k ≤ n  ∧   b | swapped |       ∧  b | swapped |

But P' does not indicate the state of array elements with indices in
i+k:i+n-1 and j+k:j+n-1. Adjusting P' suitably yields invariant P as the
predicate 0 ≤ k ≤ n together with

           i         i+k-1 i+k        i+n-1      j         j+k-1 j+k        j+n-1
(16.5.2) b | swapped | unswapped |          ∧  b | swapped | unswapped |

The obvious bound function is n -k, and the program is

k := 0;
do k ≠ n → k, b[i+k], b[j+k] := k+1, b[j+k], b[i+k] od

For later purposes (section 18.1), we write this as a procedure in (16.5.3).
Review chapter 12 for parameter-argument correspondence conventions, if
necessary.

(16.5.3) {Swap non-overlapping sections b[i:i+n-1] and b[j:j+n-1]}
         proc swapequals(var b: array of integer;
                         value i, j, n: integer);
            begin var k: integer;
               k := 0;
               {invariant: see above; bound: n-k}
               do k ≠ n → k, b[i+k], b[j+k] := k+1, b[j+k], b[i+k] od
            end
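A Python counterpart of procedure (16.5.3) is a handful of lines; this sketch is my transcription (here the sections are b[i:i+n] and b[j:j+n] in Python's half-open slice convention, and simultaneous tuple assignment plays the role of the multiple assignment):

```python
def swapequals(b, i, j, n):
    """Swap non-overlapping equal-length sections b[i:i+n] and b[j:j+n],
    as procedure (16.5.3).  Invariant: the first k elements of each
    section have been swapped; bound function: n - k."""
    k = 0
    while k != n:
        b[i + k], b[j + k] = b[j + k], b[i + k]
        k += 1
```

The empty case n = 0 is handled without any special test, which is exactly the point the discussion below makes about drawing diagrams carefully.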

Discussion
Again, the invariant was developed by replacing a constant of R by a
variable and then adding a conjunct in order to reflect the initial condi-
tions. We used diagrams in order to avoid some formalism and messy
detail. For some, pictures are easier to understand. But be especially
careful when using them, for they can lead to trouble. It is too easy to
forget about special cases, for example that an array section may be
empty, and this can lead to either an incorrect or less efficient program.
To avoid such cases, always define the ranges of new variables carefully
and be sure each picture is drawn in such a way that you know it can be
translated easily into a statement of the predicate calculus.
The development of the invariant was a two-step process. The invari-
ant can also be developed as follows. Both Q (or a slightly perturbed ver-
sion of it due to initialization) and R must imply the invariant. That is,
Q and R must be instances of the more general predicate P. Q states
that the sections are unswapped; hence, the invariant must include, for
each section, an unswapped subsection, which could be the complete sec-
tion. On the other hand, R states that the sections are swapped; hence,
the invariant must include, for each section, a swapped subsection, which
could be the complete section. One is led to draw diagram (16.5.2), using
a variable k to indicate the boundary between the unswapped and
swapped subsections.

Exercises for Section 16.5


1. Formally define the pre- and postconditions for the program to swap two non-
overlapping array sections of equal size (without using pictures or diagrams).
2. (Array Reversal). Write a program that reverses an array section b[i:j]. That
is, if initially b[i:j] = (Bi, Bi+1, ..., Bj), then upon termination b[i:j] =
(Bj, ..., Bi+1, Bi). Assume that i and j are within the array bounds and that
i ≤ j+1. (If i = j+1 the array section is empty; this is permitted.)
3. Write a program that, given fixed x, fixed m and n, m < n, and array section
b[m:n-1], permutes the values of b and sets an integer variable p to achieve

                   m        p-1 p        n-1
R: m ≤ p ≤ n  ∧  b | ≤ x    | > x        |

More formally, if initially b[m:n-1] = B[m:n-1], then the program estab-
lishes

R: m ≤ p ≤ n ∧ b[m:p-1] ≤ x < b[p:n-1] ∧ perm(b, B).

4. (Partition). Write a procedure Partition(b, m, n, p) that, given fixed m
and n, m < n, and array b[m:n-1] with initial value B[m:n-1], permutes
the values of b and sets p to achieve

                              m         p             n-1
R: m ≤ p < n ∧ perm(b, B) ∧ b | ≤ B[m]  | B[m] | > B[m] |
Procedure Partition is a slight modification of the answer to exercise 3.
5. (The Dutch National Flag). Given is an array b[0:n-1] for fixed n ≥ 0, each
element of which is colored either red, white or blue. Write a program to permute
the elements so that all the red elements are first and all the blue ones last. That
is, the program is to establish

b | red elements | white elements | blue elements |

The color of an element may be tested with Boolean expressions red(b[i]),
white(b[i]) and blue(b[i]), which return the obvious values. The number of
such tests should be kept to a minimum. The only way to permute array elements
is to swap two of them; the program should make at most n swaps.


6. (Link Reversal). A simple variable p and two arrays v[0:?] and s[0:?] are
used to contain a sequence of values V0, V1, ..., Vn-1 as a linked list:

         v    s          v    s               v      s
p ──→ | V0 | ●──┼──→ | V1 | ●──┼── ··· ──→ | Vn-1 | -1 |

That is,

(1) v[p] contains the first value V0;

(2) for 0 ≤ i < n-1, if v[k] contains the value Vi, then v[s[k]] con-
tains the value Vi+1;

(3) if v[k] contains the last value Vn-1, then s[k] = -1.

No ordering of values in array elements is implied. For example, the fact that V0
is followed by V1 in the linked list does not mean that v[p+1] contains V1.
Write a program that reverses the links -the arrows implemented by array s.
Array v should not be altered, and upon termination the linked list should
represent the reversed sequence Vn-1, ..., V0.
7. Write formal pre- and postconditions for problem 6.


8. (Saddleback Search). It is known that a fixed integer x occurs in fixed, two-
dimensional array b[0:m-1, 0:n-1]. Further, it is known that each row and
each column of b is ordered (by ≤). Write a program to find the position of x
in b -i.e. using variables i and j, the program should establish x = b[i,j]. If
x occurs in several places in b, it does not matter which place is found. Try to
minimize the number of comparisons in the worst case. This kind of problem
arises in multiplying sparse polynomials, each given by an ordered list of coef-
ficient-exponent pairs.
9. (Decimal to Binary). Given is an integer variable x = X, where X > 0.
Write a program to calculate an integer k and array v[0:k-1] that gives the
binary representation of X, where v[i] is the ith bit of the representation and the
high order bit, v[k-1], is nonzero. The value in x may be destroyed.
10. (Decimal to Base B). Given is an integer variable x = X, where X > 0, and
an integer B > 1. Write a program to calculate an integer k and array v[0:k-1]
that gives the base B representation of X, where v[k-1], the high order digit of
the representation, is nonzero. The value in x may be destroyed.
Chapter 17
Notes on Bound Functions

A bound function serves two purposes. First, it is used to show that a
loop terminates. Secondly, it gives an upper bound on how many itera-
tions can be executed before termination occurs, and thus can be used to
approximate the time required to execute the program. Different bound
functions may be used for the same program, depending on whether the
programmer is interested in just showing termination or in showing that a
program is almost optimal or faster than another one. For example, con-
sider program (16.3.7), which approximates the square root of a positive
integer:

{n ≥ 0}
a, b := 0, n+1;
{inv: a < b ≤ n+1 ∧ a² ≤ n < b²}
do a+1 ≠ b → d := (a+b) ÷ 2;
             if d*d ≤ n → a := d  □  d*d > n → b := d  fi
od {a² ≤ n < (a+1)²}

The bound function b-a+1 was used to prove termination. But the
smaller bound function ceil(log(b-a)) shows that this program is indeed
much faster than program (16.2.3), which performs approximately b-a
iterations:

a := 0;  do (a+1)² ≤ n → a := a+1 od
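The halving behaviour that the bound function ceil(log(b-a)) describes is easy to observe in a direct transcription of (16.3.7); this Python sketch is mine, not from the text:

```python
def isqrt_binary(n):
    """Integer square root by binary search, as program (16.3.7).
    Invariant: a < b <= n+1 and a*a <= n < b*b."""
    assert n >= 0
    a, b = 0, n + 1
    while a + 1 != b:
        d = (a + b) // 2
        if d * d <= n:
            a = d
        else:
            b = d
    return a          # a*a <= n < (a+1)*(a+1)
```

The interval [a, b) shrinks by half per iteration, versus one unit per iteration for the linear program above, which is the whole point of comparing the two bound functions.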

Comparison of speeds of execution is treated briefly in Appendix 4.

Usually, the invariant of a prospective loop will suggest a bound func-
tion. This was the case in most of the programs developed in earlier
chapters -e.g. summing the elements of an array (15.1.1), Linear Search
(16.2.2), the Plateau Problem (16.3.11) and the Welfare Crook (16.4.2).
However, we give two pointers here to help in finding bound functions.

Using the notation of the problem and its solution


Consider a problem from section 16.3, searching a non-empty, two-
dimensional array b[0:m-1, 0:n-1] for a value x. The invariant for this
algorithm was:

                               0          j        n-1
                            0  +---------------------+
                               |      not here       |
P: 0 ≤ i ≤ m ∧ 0 ≤ j < n ∧  i  +----------+----------+
                               | not here |     ?    |
                               +----------+          |
                          m-1  +---------------------+

Since x has to be in the untested section, a possible bound function is

    the number of elements in the untested section

which is (m-i)*n - j. It can be formally proven that this is indeed a
bound function for the loop.
The general idea is the following:
The general idea is the following:

(17.1) Strategy: Express the bound function, in words, as a sim-
       ple property of the invariant and the problem, and then
       formalize it (if necessary) as a mathematical expression.

A second example of the use of this strategy is the problem Four-tuple
Sort of section 15.2. Four variables q0, q1, q2, q3 were to be permuted
to achieve q0 ≤ q1 ≤ q2 ≤ q3. The bound function was chosen to be the
number of inversions in the sequence (q0, q1, q2, q3). (Of course, not
knowing what an inversion is might present some initial difficulties.)

Using lexicographic ordering


Consider pairs of integers (i, j). We say that one pair (i, j) is less than
another pair (h, k), written (i, j) < (h, k), if either

    i < h   or   i = h ∧ j < k

For example, (-1, 5) < (5, 1) < (5, 2). This is called the lexicographic ord-
ering of integer pairs. It is extended in the natural way to the operators
≤, > and ≥. It is also extended to triples, 4-tuples, etc. For example,

    (3,5,5) < (4,5,5) < (4,6,0) < (4,6,1).

Now consider program (17.2), whose only purpose is to illustrate using
lexicographically ordered tuples to prove termination.

(17.2) {0 < m ∧ 0 < n}
       i, j := m-1, n-1;
       do j ≠ 0 → j := j-1
        □ i ≠ 0 ∧ j = 0 → i, j := i-1, n-1
       od

Execution is guaranteed to terminate, because

(1) Variable i satisfies 0 ≤ i < m and j satisfies 0 ≤ j < n.

(2) Each iteration transforms the pair (i, j) into a smaller pair
(lexicographically speaking). By (1), this can only happen a finite
number of times.

But what bound function should be used to prove termination? Pre-
sumably, it should include a term i and a term j, since both variables are
decremented. However, in the second guarded command the decrease of
1 in i is accompanied by an increase of n-1 in j. In order to have an
effective decrease, the term i should be weighted: i*n. Therefore the
bound function is

    t = i*n + j

Each iteration decreases t by exactly 1, so that t indicates exactly how
many more iterations are to be performed.
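The claim that t drops by exactly 1 per iteration can be checked mechanically. This Python rendition of (17.2) is my transcription; it recomputes t = i*n + j before each step and asserts the unit decrease:

```python
def run_17_2(m, n):
    """Execute program (17.2), checking that the bound t = i*n + j
    decreases by exactly 1 on every iteration.  Returns the number
    of iterations performed."""
    assert m > 0 and n > 0
    i, j = m - 1, n - 1
    steps = 0
    while True:
        t = i * n + j
        if j != 0:
            j = j - 1
        elif i != 0:                 # i != 0 and j == 0
            i, j = i - 1, n - 1
        else:
            break                    # both guards false: the loop terminates
        assert i * n + j == t - 1    # t decreases by exactly 1
        steps += 1
    return steps
```

Since t starts at (m-1)*n + (n-1) = m*n - 1 and ends at 0, the loop performs exactly m*n - 1 iterations.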
We state the general idea in the following theorem, which is given
without formal proof since it is obvious from the previous discussion.

(17.3) Theorem. Consider a pair (i, j), where i and j are expressions
containing variables used in a loop. Suppose each iteration of the
loop decreases (i, j) (lexicographically speaking). Suppose further
that i satisfies mini ≤ i ≤ maxi and j satisfies minj ≤ j ≤ maxj,
for constants mini, maxi, minj and maxj. Then execution of the
loop must terminate, and a suitable bound function is

    (i - mini)*(1 + maxj - minj) + j - minj

A similar statement can be made concerning a triple (i, j, k), 4-
tuple (i, j, k, l), etc., instead of pair (i, j).  □

If one can exhibit a pair (triple, etc.) that satisfies theorem 17.3, there
is no need to actually produce the bound function, unless it makes things
clearer or is needed for other reasons. We give three examples.
In section 15.2 the following program (15.2.4) was written for searching
a (possibly empty) two-dimensional array.

{0 ≤ m ∧ 0 ≤ n}
i, j := 0, 0;
do i ≠ m ∧ j ≠ n cand x ≠ b[i,j] → j := j+1
 □ i ≠ m ∧ j = n → i, j := i+1, 0
od
{(0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i,j]) ∨ (i = m ∧ x ∉ b)}

The pair (i, j) is initially (0, 0) and each iteration increases it. Therefore,
the pair (m-i, n-j) is decreased at each iteration. Further, we have
0 ≤ m-i ≤ m and 0 ≤ n-j ≤ n. Hence, theorem 17.3 can be applied and
the loop terminates. The bound function that arises from the use of the
theorem is (m-i)*(n+1) + n - j.

As a second example, consider program Four-tuple Sort from section
15.2, which permutes variables q0, q1, q2 and q3 to achieve q0 ≤ q1 ≤
q2 ≤ q3:

do q0 > q1 → q0, q1 := q1, q0
 □ q1 > q2 → q1, q2 := q2, q1
 □ q2 > q3 → q2, q3 := q3, q2
od

The tuple (q0, q1, q2, q3) is decreased (lexicographically speaking) by each
iteration. It is bounded below by the 4-tuple whose values are min(q0, q1,
q2, q3) and is bounded above by the 4-tuple whose values are max(q0,
q1, q2, q3). Hence, the loop terminates.

As a final example, consider the Railroad Shunting Yard problem. A
shunting yard contains a number of trains, each with one or more cars.
An algorithm is to remove all cars from the yard, but under the condition
that only one car be removed at a time. This means that trains must be
split into smaller trains, and the following algorithm is proposed.

do shunting yard is not empty →
     Select a train train;
     if train has exactly one car → Remove train from yard
      □ train has more than one car → Split train into two trains
     fi
od

Removing train from the yard reduces the number of trains and reduces
the total number of cars in the yard. On the other hand, splitting a train
leaves the total number of cars the same but increases the number of
trains by 1. So we choose the pair

(number of cars in the yard, -(number of trains in the yard))

Each execution of the loop reduces (lexicographically speaking) the pair.


Further, we have

    0 ≤ number of cars ≤ initial number of cars

    -(initial number of cars) ≤ -(number of trains) ≤ 0

By theorem 17.3, the loop terminates.

Exercises for Chapter 17


1. Find the bound function of theorem 17.3 for the Four-tuple Sort program
whose termination is proved using the 4-tuple (q0, q1, q2, q3).
Chapter 18
Using Iteration Instead of Recursion

A procedure or function is recursive if during its execution it may be
called again. Recursive procedures often arise from recursive definitions
in mathematics. The usual example given is the factorial function, n!,
which for nonnegative integers is defined

    0!  =  1
    n!  =  n*(n-1)!    for n > 0.

Note how n! is defined in terms of (n-1)!; it is recursively defined.


This definition can be translated easily into a recursive procedure to
compute n!:

    {Given n ≥ 0, store n! in answer}
    proc fac(value n: integer; result answer: integer);
        if n = 0 → answer := 1
         □ n > 0 → fac(n-1, answer); answer := n*answer
        fi
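The recursive procedure and its iterative counterpart sit naturally side by side in Python (a sketch of mine, not from the text); the iterative version maintains the invariant answer = k! as k climbs to n, which is the pattern the rest of this chapter develops:

```python
def fac_recursive(n):
    """Recursive factorial, mirroring proc fac."""
    if n == 0:
        return 1
    return n * fac_recursive(n - 1)

def fac_iterative(n):
    """Iterative factorial.  Invariant: answer == k!  Bound: n - k."""
    answer, k = 1, 0
    while k != n:
        k += 1
        answer *= k
    return answer
```

The iterative form needs no call stack, which foreshadows the space argument made later for transforming recursion into iteration.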

Recursion is useful, and it definitely belongs in the programmer's tool kit.
For example, top-down parsing using recursive procedures (sometimes
called recursive descent) has been a favorite of mine in compiler construc-
tion courses for over ten years.
At the same time, in theory at least, any recursive program can be
written iteratively (and vice versa), and in practice it may make sense to
do so. Perhaps the available programming notations force the use of
iteration, perhaps problems of efficiency of space and time force the use
of iteration, or perhaps an algorithm just seems easier expressed itera-
tively.
Through a series of examples, we provide some tools and techniques
for writing programs iteratively that could have been written recursively.
One trick in doing so will be to think iteratively right from the beginning.
That is, if the program will be written using iteration, then the invariant
for the loop will have to be developed before writing the loop (as much as
possible).
The topic will allow us to bring up two important strategies and dis-
cuss the relation between them, for recursive procedures often evolve from
their use. These strategies are: solving problems in terms of simpler
ones, and divide and conquer. While not on the same level of detail and
precision as some of the strategies presented earlier, these two old
methods can still be useful when practised consciously.
At the end of section 18.3, some comments are made concerning the
choice of data structures in programming and the use of program
transformations.

18.1 Solving Simpler Problems First


Sometimes, we simply don't know how to begin solving a problem, and
the methods analyzed thus far don't seem to help. In such situations, the
following may help.

(18.1.1) Strategy: Try to solve a problem in terms of simpler ones.

"Simpler" may mean different things at different times. A problem may
be simpler because some restrictions have been omitted (this is generaliza-
tion). It may be simpler because restrictions have been added. Whatever
the change in the problem, if it leads to a solution of the simpler problem
it may be possible to solve the original problem in terms of it.
In order to illustrate the technique, let us develop a program for the
problem Swapping Sections. Given are fixed integer variables m, n and
p satisfying m < n < p. Given is (part of) an array, b[m:p-1], con-
sidered as two sections:

         m           n          p-1
Q:    b  | B[m:n-1]  | B[n:p-1] |

where B denotes the initial value of array b. The program should swap
the two array sections, using only a constant amount of extra space
(independent of m, n and p), thus establishing the predicate

             m                      p-1
(18.1.2) R: b | B[n:p-1] | B[m:n-1] |

How should one begin? Well, a procedure swapequals, (16.5.3), has
already been written to swap non-overlapping sections of equal size. Per-
haps the current problem, which involves sections of unequal size, can be
solved in terms of this simpler one.
So suppose for the moment that section b[m:n-1] is bigger than
b[n:p-1]. Consider b[m:n-1] to consist of two sections, the first of
which is the same size as b[n:p-1] (diagram (a) below). Then the equal-
sized sections containing x1 and y can be swapped to yield diagram (b)
below; further, the original problem can then be solved by swapping the
two sections containing x2 and x1. These two sections may be of unequal
sizes, but at least one of them is smaller than in the original problem, so
that progress has been made.

        m      m+p-n   n      p-1            m      n      m+p-n   p-1
(a)   b | x1   | x2    | y    |        (c) b | y    | x1    | x2    |

(b)   b | y    | x2    | x1   |        (d) b | x2   | x1    | y     |
Now suppose that the second section, b[n:p-1], is larger. Then the case
is as given in diagram (c), and procedure swapequals can be used to
transform it into diagram (d).
Now let's try to work this idea into a program. Diagrams (b) and (d)
indicate that, after execution of swapequals, n is always the left boundary
of the rightmost section to be swapped. But this is also true initially.
Therefore, an invariant can be obtained by replacing constants m and p
by variables h and k and taking into account initial conditions:

       m         h           n           k         p-1
P:   b | swapped | swap with | swap with | swapped |
                 | b[n:k-1]  | b[h:n-1]  |

However, note that the algorithm requires comparison of the lengths of
b[n:k-1] and b[h:n-1]. Also, procedure swapequals requires the
lengths of sections. Therefore it may be better to represent the lengths of
the sections rather than their endpoints. The invariant P becomes the
predicate 0 < i ≤ n-m ∧ 0 < j ≤ p-n together with the following:

       m         n-i          n            n+j       p-1
    b  | swapped | swap with  | swap with  | swapped |
                 | b[n:n+j-1] | b[n-i:n-1] |

Using the bound function t = max(i, j), the program is written as

i, j := n-m, p-n;  {P}
do i > j → swapequals(b, n-i, n, j); i := i-j
 □ i < j → swapequals(b, n-i, n+j-i, i); j := j-i
od;
{P ∧ i = j}
swapequals(b, n-i, n, i)
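A Python transcription of the iterative program (names are mine; swapequals is inlined as an element-by-element loop, and the sections are b[m:n] and b[n:p] in half-open convention) makes the mechanics concrete:

```python
def swap_equal(b, i, j, n):
    """Swap non-overlapping equal-length sections b[i:i+n] and b[j:j+n]."""
    for k in range(n):
        b[i + k], b[j + k] = b[j + k], b[i + k]

def swap_sections(b, m, n, p):
    """Swap adjacent sections b[m:n] and b[n:p] in place, m < n < p,
    using only a constant amount of extra space.
    i and j are the lengths of the still-unswapped left and right parts."""
    i, j = n - m, p - n
    while i != j:
        if i > j:
            swap_equal(b, n - i, n, j)
            i -= j
        else:
            swap_equal(b, n - i, n + j - i, i)
            j -= i
    swap_equal(b, n - i, n, i)   # i == j: one final equal-length swap
```

For example, swap_sections([1, 2, 3, 4, 5], 0, 2, 5) rearranges the list to [3, 4, 5, 1, 2].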

Discussion
This program could also have been written in recursive fashion as

{Swap sections b[m:n-1] and b[n:p-1], where m < n < p}
proc swap-sections(var b: array of integer;
                   value m, n, p: integer);
    if n-m = p-n → swapequals(b, m, n, p-n)
     □ n-m > p-n → swapequals(b, m, n, p-n);
                   swap-sections(b, m+p-n, n, p)
     □ n-m < p-n → swapequals(b, m, p+m-n, n-m);
                   swap-sections(b, m, n, m+p-n)
    fi
In this case, I like the iterative version better. It was not difficult to
discover the invariant, and it is, to me, easier to understand (this is not
always the case). The iterative version does require two extra variables i
and j, which are not needed in the recursive version.
The iterative version has the neat property that deleting all the calls of
swapequals results in program (18.1.3) to compute the greatest common
divisor, gcd(n -m, p -n), of the initial array-section sizes. To see this
old, elegant program emerge from a useful, practical programming prob-
lem was a delightful experience!

(18.1.3) {m < n < p}
         i, j := n-m, p-n;
         {inv: 0 < i ∧ 0 < j ∧ gcd(n-m, p-n) = gcd(i, j)}
         do i > j → i := i-j
          □ i < j → j := j-i
         od
         {i = j = gcd(n-m, p-n)}
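The embedded algorithm is the classical subtractive gcd; a Python sketch (my transcription of (18.1.3), with the function name and parameters my own) shows how little is left once the swapequals calls are deleted:

```python
def subtractive_gcd(a, b):
    """Greatest common divisor by repeated subtraction, as program (18.1.3).
    Invariant: 0 < i and 0 < j and gcd(a, b) == gcd(i, j)."""
    assert a > 0 and b > 0
    i, j = a, b
    while i != j:
        if i > j:
            i -= j
        else:
            j -= i
    return i          # i == j == gcd(a, b)
```

Calling it with the two section lengths n-m and p-n reproduces the loop structure of the swap program exactly.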

The program could have been developed by first replacing n and p by
variables h and k, and then determining how to reduce the size of the
unswapped portion. There are often many ways to arrive at the same pro-
gram, and one cannot really say that one is better than the other. Redo-
ing a problem once done, using the principles and asking why they weren't
used the first time, can increase programming skill and lead to better pro-
grams. The following confession concerns this point.

Confession: When I first developed this program, my old habits got in
the way and I failed to follow the principles of introducing variables only
when needed and finding the invariant by replacing constants by variables.
I immediately introduced four variables, which indicated the beginning of
the two sections to be swapped and their lengths. A student recognized
that one variable wasn't needed because the beginning of the rightmost
section was always n. This caused me to stop and redo the development
as shown above, this time adhering to principle 16.3.2 and introducing
variables only when there is a good reason to do so. This led to the
recognition that the gcd algorithm was embedded in the solution.

Exercises for Section 18.1


1. Consider a procedure reverse (b, i, j), which reverses the list of values in
b[i:j] -see exercise 2 of section 16.5. Use this procedure, which solves a simpler
problem, to write a program to swap adjacent sections.
This exercise illustrates that the simpler problems used to solve a given prob-
lem may be difficult to find. It is difficult to give a methodology to develop any
program. Some programs arise just out of new ideas, and without those ideas the
solutions won't be found.
2. The Fibonacci numbers fn are defined as follows:

    f0  =  0
    f1  =  1
    fn  =  fn-1 + fn-2    for n > 1

The first eight Fibonacci numbers are 0, 1, 1, 2, 3, 5, 8, 13.
The definition of fn for n > 1 can be written in matrix notation as follows:

    ( fn   )     ( 1  1 ) ( fn-1 )
    ( fn-1 )  =  ( 1  0 ) ( fn-2 )
It is fairly easy to write a program that takes time proportional to n to calculate
fn. However, in a subsequent section, 19.1, a program is given to perform
exponentiation i^n for positive integers n in time proportional to log n, where i
could be a matrix. Write a program to calculate fn in logarithmic time using the
simpler(?) problem of exponentiation.

18.2 Divide and Conquer


In the preceding section, we discussed solving a problem in terms of a
known, simpler problem. In this section, we discuss a related strategy,
which has been around for some time:

(18.2.1) Strategy: Divide and Conquer
In programming, this strategy is often used in the following sense. One
tries to divide a problem into two or more smaller, similar problems. If
the division can be done, then the same process can be performed on the
smaller problems. If the division can be done without too much effort
(during execution), then an effective, efficient algorithm may have been
developed.
This strategy usually leads to dividing something in half and then pro-
cessing each part in the same manner, until the parts are small enough to
process directly. This often leads to a logarithmic factor in the formula
describing the speed of execution.
The difference between strategy 18.1.1, solving a problem in terms of
simpler ones, and strategy 18.2.1, divide and conquer, may be slight. For
some problems, it may be more a matter of what question motivates the
development than anything else. In strategy 18.1.1, one first recognizes a
simpler problem and then asks how it can be used effectively. This was
the case in the development of program Swapping Sections. In strategy
18.2.1, on the other hand, one first asks what it would mean to divide the
problem into smaller pieces, and then looks for ways to solve the original
problem in terms of the pieces.
In strategy 18.1.1, the simpler problem motivates the development. In
strategy 18.2.1, the idea of division leads the way, although it may lead to
using a simpler problem.
We illustrate the approach by developing the program Quicksort, one
of the faster sorting algorithms. Given is a fixed integer n ≥ 0 and an
array b [O:n -1]. The array is to be sorted.
If the array is small enough, say n ≤ 2, then any simple algorithm may
be used to sort it -for example

    Sort b[0:n-1] directly, assuming n ≤ 2:
        if n ≠ 2 cor b[0] ≤ b[1] → skip
         □ n = 2 cand b[0] > b[1] → b[0], b[1] := b[1], b[0]
        fi

However, if n > 2 then a more general method must be used. The divide
and conquer strategy invites us to perform the sort by sorting two (or
more) sections of the array separately. Suppose the array is partitioned as

follows.

         0          k          n-1
      b  |    ?     |    ?     |

What condition must be placed on the two sections so that sorting them
separately yields an ordered array?
Every value in the first section should be ≤ every value in the second sec-
tion:

              0              k              n-1
(18.2.2)  b   | ≤ b[k:n-1]   | ≥ b[0:k-1]   |

This means that if the values of b can be permuted to establish the above
predicate, then to sort the array it remains only to sort the partitions
b[0:k-1] and b[k:n-1].
Actually, a procedure similar to one that establishes (18.2.2) has
already been written -see exercise 4 of section 16.5- so we will make
use of it. Procedure Partition splits a non-empty array section b[m:n-1]
into three partitions, where the value x in the middle one is the initial
value in b[m]:

                            m        p           n-1
(18.2.3)  R: m ≤ p < n ∧ b  | ≤ x    | x | > x   |

After partitioning the array as above, it remains to sort the two parti-
tions b[m:p-1] and b[p+1:n-1]. If they are small enough, they can be
sorted directly; otherwise, they can be sorted by partitioning again and
sorting the smaller sub-partitions. While one sub-partition is being sorted,
the bounds of the other must be stored somewhere. But sorting one will
generate two more smaller partitions to sort, and their bounds must be
stored somewhere also. And so forth.
To keep track of the partitions still to be sorted, use a set variable s to
contain their boundaries. That is, s is a set of pairs of integers and, if
(i, j) is in s, then b[i:j] remains to be sorted. We write the invariant

(18.2.4) P: s is a set of pairs (i, j) representing disjoint array
            sections b[i:j] of b. Further, b[0:n-1] is ordered
            iff all the disjoint partitions given by set s are.

Note how English is used to eliminate the need for formally introducing
an identifier to denote the initial value of array b.
Thus, we arrive at the following program:

(18.2.5) s := {(0, n-1)};
         {Invariant: (18.2.4)}
         do s ≠ {} → Choose((i, j), s); s := s - {(i, j)};
              if j-i < 2 → Sort b[i:j] directly
               □ j-i ≥ 2 → Partition(b, i, j, p);
                           s := s ∪ {(i, p-1)} ∪ {(p+1, j)}
              fi
         od

Operation Choose((i, j), s) stores in i and j the value of a pair (i, j)
that is in s, without changing s. This is a nondeterministic action, since
any member of s may be chosen. See Appendix 2.
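The nondeterministic set s can be played by any worklist. This Python sketch is my transcription of (18.2.5): a list stands in for s, so Choose becomes pop (one concrete resolution of the nondeterminism), and a standard Lomuto-style partition around x = b[i] plays the role of procedure Partition (the book's Partition, exercise 4 of section 16.5, has the same specification):

```python
def quicksort(b):
    """Sort b in place following program (18.2.5); a worklist of
    (i, j) pairs stands in for the set s."""
    s = [(0, len(b) - 1)]
    while s:
        i, j = s.pop()                 # Choose((i, j), s); s := s - {(i, j)}
        if j - i < 2:                  # section of length <= 2: sort directly
            if j - i == 1 and b[i] > b[j]:
                b[i], b[j] = b[j], b[i]
        else:
            x, p = b[i], i             # partition b[i:j+1] around x = b[i]
            for k in range(i + 1, j + 1):
                if b[k] < x:
                    p += 1
                    b[k], b[p] = b[p], b[k]
            b[i], b[p] = b[p], b[i]    # now b[i:p] < x == b[p] <= b[p+1:j+1]
            s.append((i, p - 1))
            s.append((p + 1, j))
    return b
```

Because pop takes the most recently added pair, this particular resolution behaves like the recursive version's call stack; any other choice discipline would also satisfy invariant (18.2.4).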

Discussion
Program (18.2.5) describes the basic idea behind Quicksort. Proof of
termination is left to exercise 1. The execution time of Quicksort is
O(n log n) on the average and O(n²) in the worst case. The space needed
in the worst case is O(n), which is more than it need be; exercise 2 shows
how to reduce the space.
In the development of this program, the guiding motivation was the
desire to divide and conquer. The simpler problem needed to effect the
divide and conquer was procedure Partition. Had we first noticed that
procedure Partition was available and asked how it could have been used,
we would have been using strategy 18.1.1, solve the problem in terms of
simpler ones.

Exercises for Section 18.2


1. Prove termination using the method developed in chapter 17 (theorem 17.3).
Be careful: a partition b[i:j] can be empty.
2. How big can set s of program (18.2.5) get? The maximum size of s can be
reduced tremendously by maintaining a sequence (see Appendix 2) instead of a set
and, after partitioning, putting the two pairs (i, p-1) and (p+1, j) in the
sequence in a certain order. Revise algorithm (18.2.5) to do this and recalculate
the maximum size of the sequence.

18.3 Traversing binary trees

Definitions and notations


An ordered binary tree is a finite set of nodes, or values, that either is
empty or consists of one node, called the root of the tree, and two disjoint
ordered binary trees, called the left subtree and right subtree, respectively.
An ordered binary tree is represented in Fig. 18.3.1. Its root is A; its
left and right subtrees consist of the nodes {B, E, I, J} and {C, F, G,
K, L}, respectively. The roots of A's left and right subtrees are B and
C.

                A
             /     \
            B       C
             \     / \
              E   F   G
             / \ / \
            I  J K  L

       Figure 18.3.1 Example of a Binary Tree

The adjective "ordered" is used to indicate that an ordering on subtrees


is involved: the left subtree is always listed first (or to the left). The
adjective "binary" is used to indicate that there are at most two subtrees.
From now on, we use the shorter term "tree" for "ordered binary tree".
The tree whose root is B in Fig. 18.3.1 has an empty left subtree.
Nodes with two empty subtrees are called leaves. In Fig. 18.3.1, G, I, J,
K and L are leaves.
If p is a tree, then empty(p) has the value of the sentence "tree p has
no nodes." Further, if ¬empty(p), then left[p] and right[p] are used to
denote the left and right subtrees, respectively. Finally, if tree p is not
empty, root[p] denotes the value of the root node of p.
Trees, graphs and related mathematical structures play an important
role in computer science. They make some ideas easier to understand,
their properties allow the understanding of efficiency of many algorithms,
and they are fundamental parts of many algorithms. The (ordered binary)
tree, for example, is an important concept in several sorting algorithms, in
some storage allocation algorithms, and in compilers. Thus, it is impor-
tant to understand the basic algorithms that manipulate these structures.

Above, the term tree is defined in the easiest possible manner: recur-
sively. For that reason, many algorithms that manipulate trees are given
recursively also. Here, we wish to describe a few basic algorithms dealing
with trees, but using iteration. With a firm grasp of this material, it
should not be difficult to develop other algorithms that deal with trees,
graphs and other structures.

Implementing a tree
We describe one typical implementation of a tree, which is motivated
by the need in many algorithms to insert nodes into and delete nodes
from a tree. The implementation uses a simple variable p and three
arrays: root[0:?], left[0:?] and right[0:?].
Variable p contains an integer satisfying -1 ≤ p. It describes, or
represents, the tree.
If integer k describes a tree or subtree, then the following holds:

1. empty(k) is equivalent to k = -1.

2. ¬empty(k) is equivalent to 0 ≤ k. If ¬empty(k) holds, the
   value of the root is in root[k], the left subtree of the root is
   given by left[k] and the right subtree by right[k].

For example, the tree of Fig. 18.3.1 could appear as given in (18.3.1).

                      0   1   2   3   4   5   6   7   8   9  10
              root    B   A       C   E   F   I   J   K   L   G
(18.3.1) p=1: left   -1   0       5   6   8  -1  -1  -1  -1  -1
              right   4   3      10   7   9  -1  -1  -1  -1  -1

Some comments are in order. First, p need not equal 0; the root node
need not be described by the first elements of the arrays, root[0], left[0]
and right[0]. In fact, several trees could be maintained in the same three
arrays, using p1, p2 and p3 (say) to "point to their roots". This, of
course, implies that the nodes of the trees in the arrays need not be in any
particular order. In (18.3.1), the elements with index 2 of the three arrays
are not used in the representation of tree p at all. Moreover, the root of
the left subtree of A precedes A in the array, while the root of its right
subtree follows it. This means that one cannot process the tree by pro-
cessing the elements of root (and left and right) in sequential order.
In the rest of this section, we will deal with a tree p using the original
notations empty(p), root[p], left[p] and right[p]. Note, however, that
this notation is quite close to what one would use in a program dealing
with a tree implemented as just shown.
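As an illustration, the tree of Fig. 18.3.1 can be set up in Python with three lists standing in for the arrays of (18.3.1); the particular placement of nodes is the one assumed in that table, with `None` marking the deliberately unused slot at index 2.

```python
# The three arrays of (18.3.1); -1 plays the role of the empty tree,
# and the slot with index 2 is deliberately left unused.
root  = ['B', 'A', None, 'C', 'E', 'F', 'I', 'J', 'K', 'L', 'G']
left  = [ -1,   0, None,   5,   6,   8,  -1,  -1,  -1,  -1,  -1]
right = [  4,   3, None,  10,   7,   9,  -1,  -1,  -1,  -1,  -1]
p = 1                       # p "points to" the root node A

def empty(k):
    # empty(k) is equivalent to k = -1
    return k == -1
```

Then root[p] is A, root[left[p]] is B and root[right[p]] is C, matching the figure; B's left subtree is empty, since left[0] = -1.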

Counting the nodes of a tree


As a first example, we write a program, Node Count, to calculate the
number of nodes in a tree p. As an abbreviation, let

#p

denote the number of nodes in tree p. Thus, using a simple variable c to


contain the number of nodes, the program should establish

(18.3.2) R: #p = c
The first step, of course, is to give a definition of #p, in the hope that
it will yield insight into the program. Write a definition of #p -it may
help to use recursion since tree is defined recursively.

(18.3.3)  #p =  ( empty(p)  → 0
                ( ¬empty(p) → 1 + #left[p] + #right[p]

This definition gives us the germ of an idea for the algorithm: if
empty(p), use the result 0; otherwise evaluate 1 + #left[p] + #right[p].
This evaluation requires calculating the number of nodes in left[p] and
calculating the number of nodes in right[p]. But #left[p] is defined
recursively by (18.3.3) also, and therefore calculating it can be expected to
force us to calculate the number of nodes in its subtrees in the same
manner. Thus, we see the need for counting the number of nodes in
several subtrees of p, and it seems wise to consider using a set variable s
(say) to maintain the set of trees whose nodes must be counted. And this
hints at iteration.
We develop a loop invariant by extending the range of variable c in
result assertion R and taking into account the fact that s contains trees
still to be counted:

(18.3.4) P: #p = c + (Σ r: r ∈ s: #r)

where each member r of s is some subtree of p. That is, the number of


nodes in tree p is c plus the number of nodes in the trees given in set s.
The invariant is easily established by c, s := 0, {p}. Each iteration of a
loop should lead closer to termination, and this means that it should pro-
cess a subtree of s in some manner. Using definition (18.3.3), the pro-
gram is easily written as

(18.3.5) c, s := 0, {p};
         {inv: (18.3.4)}
         {bound: 2*(#p - c) + |s|}
         do s ≠ {} → Choose(q, s); s:= s - {q};
              if empty(q) → skip
              □ ¬empty(q) → c, s := c+1, s ∪ {right[q]} ∪ {left[q]}
              fi
         od {c = #p}

The bound function was discovered by noting that the pair (#p - c, |s|) is
decreased (lexicographically speaking) by each iteration -see Chapter 17.
Note that it does not matter in which order the subtrees in set s are
processed. This is because the number of nodes in each subtree will be
added to c and addition is a commutative operation. In this case, the use
of the nondeterministic operation Choose(q, s), which stores an arbitrary
value in s into q, nicely frees us from having to make an unnecessary
choice.
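A sketch of (18.3.5) in Python, under the assumption that a tree is either `None` (empty) or a triple (value, left, right) rather than the array representation; popping from a list stands in for Choose.

```python
def node_count(p):
    # c counts nodes already processed; s holds subtrees still to count.
    c, s = 0, [p]
    while s:
        q = s.pop()                   # Choose(q, s); s := s - {q}
        if q is None:                 # empty(q) -> skip
            pass
        else:
            _, l, r = q
            c, s = c + 1, s + [r, l]  # c, s := c+1, s U {right} U {left}
    return c
```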

Preorder traversal
The preorder list of the nodes of a tree p, written preorder(p), is
defined as follows. If the tree is empty, it is the empty sequence (); other-
wise, it is the sequence consisting of

1. The root, followed by
2. The nodes of the left subtree, in preorder, followed by
3. The nodes of the right subtree, in preorder.

For example, for the subtree e of Fig. 18.3.1 with root E we have

preorder(e) = (E, I, J)

For the whole tree a of Fig. 18.3.1 we have

preorder(a) = (A, B, E, I, J, C, F, K, L, G)
Using | to denote catenation of sequences, preorder(p) can be written as

                        ( empty(p)  → ()
(18.3.6) preorder(p) =  ( ¬empty(p) → (root[p]) | preorder(left[p]) |
                                      preorder(right[p])

Note that preorder(p) is defined recursively. This notation and the de-
finition of preorder in terms of catenation have been designed to allow us
to state and analyze various properties and algorithms in a simple, crisp

manner; it is illustrative of the use of notation to help promote under-
standing.
A preorder traversal of a tree consists of "walking" through the tree in
the order given by its preorder list, "visiting" each node in turn in order to
perform some operation on it. We now consider developing a program
that performs a preorder traversal, storing the values of the nodes in an
array. More precisely, for a tree p, execution of the program should es-
tablish

(18.3.7) R: c = #p ∧ preorder(p) = b[0:c-1]

Note the similarity between definitions (18.3.3) and (18.3.6). They have
the same form, but the first uses the commutative operator + while the
second uses the non-commutative operator |. Perhaps the program to
calculate the preorder list may be developed by transforming program
Node Count so that it processes the trees of set s in a definite order.
First, let's rewrite Node Count in (18.3.8) to store the node values into
array b, instead of simply counting nodes. The invariant is

0 ≤ c ≤ #p ∧
set of nodes of p = b[0:c-1] ∪ {nodes of trees in s}

(18.3.8) c, s := 0, {p};
         {bound: 2*(#p - c) + |s|}
         do s ≠ {} → Choose(q, s); s:= s - {q};
              if empty(q) → skip
              □ ¬empty(q) → c, b[c] := c+1, root[q];
                            s:= s ∪ {right[q]} ∪ {left[q]}
              fi
         od {c = #p ∧ b[0:c-1] contains the nodes of p}

Now transform (18.3.8) as follows. Instead of a set s use a sequence r.


The key is to insert trees into r and take them out in a manner that
allows us to conclude that b contains the preorder list of p. This is easily
done by observing the definition of preorder, and we have the invariant

(18.3.9) P: 0 ≤ c ≤ #p ∧
            preorder(p) = b[0:c-1] | preorder(r[0]) | ... |
                          preorder(r[|r|-1])

Each iteration of a loop will then process tree r[0]. If it is empty, it is
deleted from the sequence of trees to be visited; if not, its preorder is
given by (18.3.6), and the preorder list b and the sequence of trees r are
changed accordingly:

(18.3.10) c, r := 0, (p);
          {bound: 2*(#p - c) + |r|}
          do r ≠ () → q, r := r[0], r[1..];
               if empty(q) → skip
               □ ¬empty(q) → c, b[c] := c+1, root[q];
                             r:= left[q] | right[q] | r
               fi
          od {c = #p}
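Program (18.3.10) can be transcribed into Python under the same assumed (value, left, right) representation of trees; the only change from the node-count sketch is that the worklist is now an ordered sequence, and subtrees are prepended.

```python
def preorder(p):
    # b receives the preorder list; sequence r holds subtrees still to visit.
    b, r = [], [p]
    while r:
        q, r = r[0], r[1:]            # q, r := r[0], r[1..]
        if q is not None:             # not empty(q)
            root, l, rt = q
            b.append(root)            # c, b[c] := c+1, root[q]
            r = [l, rt] + r           # r := left[q] | right[q] | r
    return b
```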

Discussion
In (18.3.5), the order in which the left and right subtrees are stored in
set s is immaterial, because addition, which is being performed on the
number of nodes in each, is commutative. In (18.3.10), however, the
order in which nodes are stored in sequence r is important because opera-
tion I is not commutative.
My first development of this program, done over 5 years ago, was not
performed like this. It was an ad hoc process, with little direction,
because I was new at the game and had to struggle to learn and perfect
techniq ues.
Without the sequence notation (see Appendix 2), including the nota-
tion for catenation, one tries to work with English phrases, for example,
writing the invariant as

t[1:i] is a sequence of pointers to subtrees that have not been
visited, and preorder(p) is equal to b[0:c-1], followed by the
preorder list of these trees.

The use of English is worthwhile in some contexts, for example it was


used heavily in section 18.1 to discuss swapping sections. In this section,
the concepts of "traversing a tree" and "visiting nodes" are useful in gen-
eral discussions. But their use in developing this algorithm can be confus-
ing, for it is not at all easy to see from the algorithm what has and has
not been visited. Far better is to make the result assertion and invariant
more formal, as has been done, because it leads to a crisp, precise, clear
explanation.
The ability to easily write such iterative traversal algorithms is neces-
sary in many applications. Study of these two algorithms and mastery of
the few exercises is essential.

A note on data refinement


In developing Quicksort in section 18.2 and Node Counting in this sec-
tion, we used objects that suited the problem -sets of pairs of integers
and sets of trees- and operations on them.
Only after this did we consider using other data structures to make the
program more efficient with respect to space or time (e.g. exercise 2 of
section 18.2, where sequences are used instead of sets). And we never did
really consider how to rewrite the program in terms of arrays, for that is
almost a trivial step compared to the development preceding it. Further,
it would have been slightly more difficult to work in terms of arrays
instead of sets right from the beginning. This illustrates the important
principle

(18.3.11) Principle: Program into a programming language, not in it.

In general, this principle deals with data and its representation, as well as
with commands. We should use data structures that suit the problem,
and, once a correct program has been developed, deal with the problem of
changing the data structures to make their use more efficient and imple-
menting them in the programming language. This latter task, often called
"data refinement", has not received the attention that "program refine-
ment" has.
In a "modern" programming notation allowing "data encapsulation",
data refinement may just mean appending a program segment that des-
cribes how the objects are to be represented and the operations are to be
implemented. In other programming notations, it may mean transforming
the program so that it operates on allowable objects of the language.

A note on program transformation


Program transformation is a hot topic these days; many advocate using
an interactive system to transform a problem description into an efficient
program through a series of such transformations. In this section, we
used program transformation to transform program Node Count into a
program to derive a preorder list of a tree.
This is not the place to give a detailed account of program transforma-
tion systems, but one comment should be made. When making a trans-
formation, as we did, always make sure the result can be understood by
itself, without having to study the transformation. Thus, the result should
have its own proof of correctness in terms of loop invariants and so forth.
It is extremely difficult to understand a program by studying a long
sequence of transformations; one quickly becomes lost in details or bored
with the process.

Exercises for Section 18.3


1. Write a program to count the number of leaves of tree p.
2. Write a program to store in array b the inorder list of nodes of tree p. The
inorder list is defined as follows. If p is empty the inorder list is the empty
sequence (). If p is not empty, the inorder list is:

1. The nodes of left[p], in inorder, followed by
2. The root, followed by
3. The nodes of right[p], in inorder.

3. Write a program to store in array b the postorder list of nodes of tree p. The
postorder list is defined as follows. If p is empty the postorder list is the empty
sequence (). If p is not empty, the postorder list is

1. The nodes of left[p], in postorder, followed by
2. The nodes of right[p], in postorder, followed by
3. The root.

4. The root of a tree is defined to have depth 0, the roots of its subtrees have
depth 1, the roots of their subtrees have depth 2, and so on. The depth of the tree
itself is the maximum depth of its nodes. The depth of an empty tree is -1. For
example, in tree (18.3.1), A has depth 0, F has depth 2, and the tree itself has
depth 3. Write a program to calculate the depth of a tree.
Chapter 19
Efficiency Considerations

The programmer has two main concerns: correctness and efficiency.


Thus far, this book has dealt mainly with the issue of correctness. This
does not mean that efficiency is unimportant. When faced with any large
task, it is usually best to put aside some of its aspects for a moment and
to concentrate on the others, and that is what we have been doing. This
important principle is called Separation of Concerns.
The two main concerns of the programmer can be handled by different
mechanisms. The correctness concern is handled using a theory of
correctness, such as the one developed in Part II. The formal definition
of correctness is given not in terms of how a program is executed, but,
instead, in terms of how theorems of the form {Q} S {R} are to be
proved. It is mathematical in nature, relying heavily on the predicate cal-
culus.
On the other hand, at this time, efficient use of time and space can
best be discussed in terms of some model of execution. Knowledge is
required of the space needed by integer variables, arrays, etc., and one
must understand how the commands of the programming notation are
executed on a computer.
Actually, we have been dealing with both concerns to some extent all
along. For example, in the Four-tuple Problem (section 15.2) three
guarded commands were deleted in order to make the program shorter
and perhaps more efficient. We also developed two programs for approx-
imating the square root of an integer (sections 16.2 and 16.3) and dis-
cussed their relative speeds. However, as it should be, our first concern
has been correctness. An efficient program is useless if it does not do
what it is supposed to do.
In this chapter, we turn our attention to a few general techniques for
improving the efficiency of programs.

19.1 Restricting Nondeterminism


Nondeterminism arises when two or more guards of an alternative con-
struct or a loop can be true at the same time. It also arises when com-
mands like Choose(q, s) are used (see Appendix 2). Sometimes, a pro-
gram can be made more efficient by restricting or deleting the nondeter-
minism.
Recall from section 15.2 that, in a correct loop, the guards can be
strengthened without disturbing correctness as long as point 3 of checklist
11.9, P ∧ ¬BB ⇒ R, remains true (P is the invariant, R the result asser-
tion and BB the disjunction of the guards). Thus, one may restrict the
nondeterminism by strengthening the guards. And, if a guard is streng-
thened to the everywhere-false predicate F, then the corresponding guard-
ed command may be deleted because the command will never be exe-
cuted.
Nondeterminism can be eliminated without fear of destroying point 3
of checklist 11.9 using the following simple theorem.

(19.1.1) Theorem. Suppose a loop has (at least) two guarded commands,
with guards B1 and B2. Then strengthening B2 to B2 ∧ ¬B1
leaves BB, and hence P ∧ ¬BB ⇒ R, unchanged.

Proof. BB contains the disjunct B1 ∨ B2. Strengthening B2 as indicated
changes this disjunct (and only this part of BB) to B1 ∨ (B2 ∧ ¬B1). Using
De Morgan's law and simplifying, we see this is equivalent to the original
disjunct B1 ∨ B2. Hence, BB remains unchanged. □

Use of theorem 19.1.1 eliminates nondeterminism because then the two


guards cannot both be true in the same state.
A few examples of strengthening guards and using theorem 19.1.1 may
provide a better understanding.

Revisiting the Welfare Crook


In section 16.4 and exercise 1 of section 16.4, a program for the Wel-
fare Crook was developed. We re-analyze it here. Given are three alpha-
betically ordered lists of names, stored in fixed, ordered, arrays f[0:?],
g[0:?] and h[0:?]. Some names appear on all three lists; the problem is
to find the first such name. Let iv, jv and kv be the smallest integers
satisfying f[iv] = g[jv] = h[kv]. Then, using variables i, j and k, the
following should be established:

R: i = iv ∧ j = jv ∧ k = kv

The invariant for a loop is found by using the Linear Search Principle,
(16.2.7), and enlarging the range of variables in R:

P: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv

The obvious bound function is t: iv - i + jv - j + kv - k and the first pro-
gram developed ((16.4.3)) is

(19.1.2) i, j, k := 0, 0, 0;
         do f[i] < g[j] ∨ f[i] < h[k] → i:= i+1
         □ g[j] < h[k] ∨ g[j] < f[i] → j:= j+1
         □ h[k] < f[i] ∨ h[k] < g[j] → k:= k+1
         od

Now comes the concern for efficiency. Point 3 of checklist 11.9,
P ∧ ¬BB ⇒ R (where P is the invariant, BB the disjunction of the guards
and R the result assertion) can be proved using only the first disjunct of
each guard. Therefore, the guards can be strengthened by deleting their
second disjuncts without violating point 3. This yields the shorter and
more efficient program

i, j, k := 0, 0, 0;
do f[i] < g[j] → i:= i+1
□ g[j] < h[k] → j:= j+1
□ h[k] < f[i] → k:= k+1
od
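A Python transcription of this simplified program; like the original, it assumes a common name exists, so no bounds checks are needed.

```python
def welfare_crook(f, g, h):
    # Advance whichever index points at a provably non-common name.
    i = j = k = 0
    while not (f[i] == g[j] == h[k]):
        if f[i] < g[j]:
            i += 1
        elif g[j] < h[k]:
            j += 1
        else:                  # here h[k] < f[i] must hold
            k += 1
    return f[i]
```

The `else` branch is justified exactly as in the text: if the three values are not all equal and neither of the first two guards holds, then f[i] ≥ g[j] ≥ h[k] with at least one inequality strict, so h[k] < f[i].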

Note that theorem 19.1.1 could now be used to strengthen two of the
guards, but it is better not to. There is no reason for preferring one of the
commands over the others, and strengthening the guards using the
theorem will only complicate them and make the program less efficient.
In this case, the nondeterminism aids in producing the simplest solution.

Revisiting Four-tuple Sort


In the Four-tuple Sort problem of section 15.2, three guards could be
strengthened to F, the everywhere-false predicate, and therefore the
corresponding guarded commands could be deleted. This eliminated some
of the nondeterminism, but not all of it.

Exponentiation
Consider writing a program that, given two fixed integers X and Y,
X ≥ 0 and Y ≥ 0, establishes

R: z = X^Y

(Define 0^0 = 1.) The program is to consist of a loop with the following


invariant and bound function:

P: 0 ≤ y ∧ z*x^y = X^Y
t: y

P is easily established using x, y, z := X, Y, 1, and (at least) two simple
commands can be used to reduce the bound function: y:= y-1 and
y:= y÷2. Finding the weakest preconditions of these commands with
respect to the invariant leads directly to the program

{0 ≤ X ∧ 0 ≤ Y}
x, y, z := X, Y, 1;
do 0 < y ∧ even(y) → y, x := y÷2, x*x
□ 0 < y            → y, z := y-1, z*x
od {z = X^Y}

Now consider the efficiency of the program. Dividing by 2 generally
reduces y more than subtracting 1; hence, division is preferred. However,
if y is >0 and even, then both guards are true, and an implementation is
free to choose to execute either command. Using theorem 19.1.1, replace
the guard 0 < y by

0 < y ∧ ¬(0 < y ∧ even(y))

and simplify it to yield

{0 ≤ X ∧ 0 ≤ Y}
x, y, z := X, Y, 1;
do 0 < y ∧ even(y) → y, x := y÷2, x*x
□ 0 < y ∧ odd(y)  → y, z := y-1, z*x
od {z = X^Y}

With the preliminary, nondeterministic version the loop could iterate
up to Y times; in the final, deterministic version, the number of iterations
is at most 1 + 2*ceil(log Y). The algorithm can be rewritten once more
as

x, y, z := X, Y, 1;
do 0 < y → do even(y) → y, x := y÷2, x*x od;
           y, z := y-1, z*x
od
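The deterministic two-guard version reads naturally in Python, with `//` standing in for the integer division ÷.

```python
def power(X, Y):
    # Invariant: z * x**y == X**Y; bound: y.
    x, y, z = X, Y, 1
    while 0 < y:
        if y % 2 == 0:         # 0 < y and even(y)
            y, x = y // 2, x * x
        else:                  # 0 < y and odd(y)
            y, z = y - 1, z * x
    return z
```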

19.2 Taking an Assertion Out of a Loop


Consider the following program segment, where dots ... represent
arbitrary pieces of code that do not assign to i:

do i < n → ... ; k:= 5*i;
           ... ; i:= i+2; ...
od

This program can be transformed to use the faster arithmetic operation


addition instead of multiplication as follows. First, introduce a fresh vari-
able z to contain the value 5*i and transform the program as follows:

do i < n → ... ; z:= 5*i; k:= z;
           ... ; i:= i+2; ...
od

Next, make z = 5*i part of the invariant of the loop. This means that the
assignment z:= 5*i within the loop becomes unnecessary, but whenever i
is increased z must be altered accordingly:

z:= 5*i;
{Part of invariant: z = 5*i}
do i < n → ... ; k:= z;
           ... ; i, z := i+2, z+10; ...
od
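The transformation can be checked by running the two versions side by side; collecting each k into a list is a stand-in for the elided pieces of code.

```python
def before(n):
    # Original form: k is recomputed by multiplication each iteration.
    out, i = [], 0
    while i < n:
        k = 5 * i
        out.append(k)
        i = i + 2
    return out

def after(n):
    # Strength-reduced form: z == 5*i is kept invariant by addition.
    out, i, z = [], 0, 0
    while i < n:
        k = z
        out.append(k)
        i, z = i + 2, z + 10
    return out
```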

Compiler writers call this transformation strength reduction. It has been


used as early as the late 1950's and early 1960's, in both FORTRAN and
ALGOL compilers, to make references to two-dimensional arrays more
efficient. For example, suppose an array b[0:99, 0:50] is stored in row-
major order. The calculation of the address of an element b[i, j] is per-
formed as

address(b[0,0]) + i*51 + j

Then, within a loop that increments i with each iteration, all calculations
of the address of b [i, j] can be transformed as above to make them more
efficient. This optimization is also effective because it allows the detec-
tion and elimination of certain kinds of common arithmetic expressions.
In general, this transformation is called taking an assertion out of a
loop (and making it part of the loop invariant). In this case, the assertion
z = 5*i was taken out of the loop to become part of the invariant. The
technique can be used wherever the value of some variable like z can be
calculated by adjusting its current value, instead of calculating it afresh
each time.

In the above example, taking the relation out of the loop can reduce
execution time by only a constant factor, but examples exist that show
that the technique can actually reduce the order of execution time of an
algorithm.

Horner's rule
Consider evaluating a polynomial a0 + a1*x + ... + a(n-1)*x^(n-1) for
n ≥ 1 and for a value x and given constants ai. The result assertion is

R: y = a0*x^0 + ... + a(n-1)*x^(n-1)

An invariant can be produced by replacing the constant n by a variable i,
and the following program can be developed:

i, y := 1, a0;
{invariant: 1 ≤ i ≤ n ∧ y = a0*x^0 + ... + a(i-1)*x^(i-1)}
{bound: n - i}
do i ≠ n → i, y := i+1, y + ai*x^i od

But note that calculating x^i each iteration is costly, requiring, in general,
time proportional to log i. Noting that x^i = x*x^(i-1), we see that introduc-
ing a fresh variable z and making

z = x^i

part of the invariant of the loop allows us to transform the program into

i, y, z := 1, a0, x;
{invariant: 1 ≤ i ≤ n ∧ z = x^i ∧ y = a0*x^0 + ... + a(i-1)*x^(i-1)}
{bound: n - i}
do i ≠ n → i, y, z := i+1, y + ai*z, z*x od

This transformation can also be called strength reduction; the operation


exponentiation has been replaced by the faster operation multiplication.
Its use here reduces the order of execution time of the program from
O(n log n) to O(n).

Remark: One can rewrite the polynomial as

(...(a(n-1)*x + a(n-2))*x + ...)*x + a0

This form leads directly to the slightly simpler program



y, i := a(n-1), n-1;
{invariant: 0 ≤ i < n ∧
 y = (...(a(n-1)*x + a(n-2))*x + ...)*x + ai}
{bound: i}
do i ≠ 0 → i:= i-1; y:= y*x + ai od
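Both polynomial programs transcribe directly into Python, with `a` as the coefficient list a[0..n-1].

```python
def poly_with_powers(a, x):
    # First version: a fresh variable z maintains z == x**i.
    n = len(a)
    i, y, z = 1, a[0], x
    while i != n:
        i, y, z = i + 1, y + a[i] * z, z * x
    return y

def horner(a, x):
    # Horner's rule: one multiplication and one addition per coefficient.
    n = len(a)
    y, i = a[n - 1], n - 1
    while i != 0:
        i = i - 1
        y = y * x + a[i]
    return y
```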

This method of computing a polynomial is named after W.G. Horner,


who gave it in connection with another famous problem in 1819, but it
was, in fact, proposed over 100 years earlier by Isaac Newton. This illus-
trates that first analyzing the specification of a program and transforming
it into a slightly different form can be of more help than looking for effi-
cient programs for the original specification. 0

An exercise attributed to Hamming


Consider the sequence q = 1, 2, 3, 4, 5, 6, 8, 9, 10, 12, ... of all numbers
divisible by no primes other than 2, 3 and 5. We shall call this sequence
Seq. Another way to describe Seq is to give axioms that indicate which
values are in it:

Axiom 1. 1 is in Seq.
Axiom 2. If x is in Seq, so are 2*x, 3*x and 5*x.
Axiom 3. The only values in Seq are given by Axioms 1 and 2.

The problem is to write a program that stores the first 1000 values of
Seq, in order, in an array q[0:999], i.e. that establishes

R: q [0:999] contains the first 1000 values of Seq, in order

A loop of some form is needed. What is a possible loop invariant?
Since Axiom 2 specifies that a value is in Seq if a smaller one is, it may
make sense to generate the values in order. A possibility, then, is to
replace the constant 1000 of R by a variable i, yielding the invariant

P: 1 ≤ i ≤ 1000 ∧ q[0:i-1] contains the first i values of Seq.

With this invariant, the obvious program structure is

i, q[0] := 1, 1; {P}
{invariant: P; bound: 1000 - i}
do i ≠ 1000 → Calculate xnext, the ith value in Seq;
              i, q[i] := i+1, xnext
od

It remains to determine how to calculate xnext, the next value of Seq to
be generated. Since the values of Seq are generated in order, xnext must
be >q[i-1]. Secondly, since 1 is already in q[0:i-1], xnext must satisfy
Axiom 2 above. This means that xnext must have the form 2*x, 3*x or
5*x for some value x already in q[0:i-1]. Therefore,

xnext is the minimum value >q[i-1] of the form 2*x, 3*x or
5*x for x in q[0:i-1].

So, we introduce three variables x2, x3 and x5 with meaning as expressed
in the following assertion:

P1: x2 is the minimum value >q[i-1] with form
        2*x for x in q[0:i-1],
    x3 is the minimum value >q[i-1] with form
        3*x for x in q[0:i-1],
    x5 is the minimum value >q[i-1] with form
        5*x for x in q[0:i-1].

Value xnext is the minimum of x2, x3 and x5. We see, then, that vari-
able xnext is not really needed, and we modify the program structure to

i, q[0] := 1, 1; {P}
{invariant: P; bound: 1000 - i}
do i ≠ 1000 → Calculate x2, x3, x5 to satisfy P1;
              i, q[i] := i+1, min(x2, x3, x5)
od

We now illustrate taking an assertion out of a loop. Calculating x2, x3
and x5 to establish P1 at each iteration can be time-consuming. How-
ever, they change quite slowly as i is increased (and P is kept invariant),
and it may be possible to speed up the algorithm by taking P1 out of the
loop and making it part of the loop invariant. The fact that q[0:i-1] is
ordered gives additional hope. Thus, we investigate the program structure

i, q[0] := 1, 1; {P}
Establish P1 for i = 1;
{invariant: P ∧ P1; bound: 1000 - i}
do i ≠ 1000 → i, q[i] := i+1, min(x2, x3, x5);
              Reestablish P1
od

Now, how is P1 to be reestablished? Consider x2. For some j, x2 =
2*q[j]. Further, x2 can only be increased, and not decreased, to
2*q[j+1] or 2*q[j+2], etc. This suggests maintaining the position j. A
similar statement holds for x3 and x5. We therefore introduce three
variables j2, j3 and j5 and modify P1 as follows:

P1: x2 = 2*q[j2] is the minimum value >q[i-1] with form
        2*x for x in q[0:i-1],
    x3 = 3*q[j3] is the minimum value >q[i-1] with form
        3*x for x in q[0:i-1] and
    x5 = 5*q[j5] is the minimum value >q[i-1] with form
        5*x for x in q[0:i-1]

We are now able to develop the final program:

i, q[0] := 1, 1; {P}
Establish P1: x2, x3, x5, j2, j3, j5 := 2, 3, 5, 0, 0, 0;
{invariant: P ∧ P1; bound: 1000 - i}
do i ≠ 1000 → i, q[i] := i+1, min(x2, x3, x5);
              Reestablish P1:
                 do x2 ≤ q[i-1] → j2:= j2+1; x2:= 2*q[j2] od;
                 do x3 ≤ q[i-1] → j3:= j3+1; x3:= 3*q[j3] od;
                 do x5 ≤ q[i-1] → j5:= j5+1; x5:= 5*q[j5] od
od
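A Python transcription of the final program, generalized from 1000 to a parameter n:

```python
def hamming(n):
    # q[0:i] holds the first i values of Seq, in order; x2, x3, x5 are the
    # smallest candidates of the forms 2*x, 3*x, 5*x exceeding q[-1].
    q = [1]
    x2, x3, x5 = 2, 3, 5
    j2 = j3 = j5 = 0
    while len(q) < n:
        q.append(min(x2, x3, x5))
        # Reestablish P1 by advancing each pointer past values <= q[-1].
        while x2 <= q[-1]:
            j2 += 1
            x2 = 2 * q[j2]
        while x3 <= q[-1]:
            j3 += 1
            x3 = 3 * q[j3]
        while x5 <= q[-1]:
            j5 += 1
            x5 = 5 * q[j5]
    return q
```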

Exercises for Section 19.2


1. Writing a Value as the Sum of Squares. Write a program that, given a fixed
integer r ≥ 0, generates all different ways in which r can be written as the sum of
two squares -i.e. that generates all pairs (x, y) satisfying

(19.2.1) x^2 + y^2 = r ∧ 0 ≤ x ≤ y

To help in writing it (and to arrange to use the strategy of taking a relation out of
a loop), assume the following. Two arrays xv and yv will hold the values of the
pairs (x, y) satisfying (19.2.1). Furthermore, the pairs are to be generated in
increasing order of their x-values, and a variable x is used to indicate that all
pairs with x-value less than x have been generated. Thus, the first approxima-
tion to the invariant of the main loop of the program will be

P1: 0 ≤ i ∧ ordered(xv[0:i-1]) ∧
    the pairs (xv[j], yv[j]), 0 ≤ j < i, are all the pairs
    with x-value <x that satisfy (19.2.1).

19.3 Changing a Representation


It is sometimes useful to transform a program into one that uses a dif-
ferent representation of the data. As simple examples of different
representations for some value, we use both rectangular coordinates and
polar coordinates for points on a plane. The day of the year, which may
be kept in the form (month, day) or in the form (day number within the
year), is another example.
The motivation for changing representation often comes from the de-
sire to apply one of the following two strategies, in the hope that they will
yield a simpler or more efficient program:

-Strategy: Replace an expensive operation by a cheaper one.

-Strategy: Defer an expensive operation, so that it won't be


executed as often.

Other reasons will probably suggest themselves once familiarity with the
technique is acquired. We illustrate with three examples.

Approximating the Square Root


In section 16.3 the following program was developed to approximate
the square root of a fixed integer n ≥ 0:

(19.3.1) a, b:= 0, n+1;

{invariant P: a < b ≤ n+1 ∧ a² ≤ n < b²}
{bound t: b−a+1}
do a+1 ≠ b → d:= (a+b) ÷ 2;
   if d*d ≤ n → a:= d □ d*d > n → b:= d fi
od {a² ≤ n < (a+1)²}
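Rendered in a modern notation such as Python, program (19.3.1) might read as follows; the name isqrt_approx is ours, not the text's, and // plays the role of the integer division ÷:

```python
def isqrt_approx(n):
    # Program (19.3.1): invariant  a < b <= n+1  and  a*a <= n < b*b.
    assert n >= 0
    a, b = 0, n + 1
    while a + 1 != b:
        d = (a + b) // 2          # the (a+b) ÷ 2 of the text
        if d * d <= n:
            a = d
        else:
            b = d
    return a                      # a*a <= n < (a+1)*(a+1)
```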

We present a minor transformation of this program to illustrate changing


a representation. A less trivial one is required in exercise 2.
Let us assume we want to replace operator ÷ in the program by division /.
This can be done easily if a+b is always even, since then the two
will yield the same result. Keeping a +b even may not be so easy, but if
the difference b -a is always even, then d can be calculated using

d:=a+(b-a)/2

Therefore, let us attempt to deal with the difference c (say) between b


and a and to keep this difference even. This will be easiest if c is always
a power of 2. Thus we have:

b = a+c
d = a + c/2
(E p: 1 ≤ p: c = 2ᵖ)    (therefore c is even)

Because band d are defined in terms of a and c, we may be able to


write the program using only a and c. Thus, we try the loop invariant
and bound function

P: a² ≤ n < (a+c)² ∧ (E p: 0 ≤ p: c = 2ᵖ)

t: c+1

The initialization will require a loop to establish P, since c must be a


power of 2. The rest of the program is derived from program (19.3.1)
essentially by deleting the assignments to band d and transforming the
other commands into commands involving c:

(19.3.2) a, c:= 0, 1; do c² ≤ n → c:= 2*c od; {P}

do c ≠ 1 → c:= c/2;
   if (a+c)² ≤ n → a:= a+c □ (a+c)² > n → skip fi
od {a² ≤ n < (a+1)²}
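Program (19.3.2) can be sketched the same way; isqrt_pow2 is an illustrative name, and the exactness of c // 2 rests on c always being a power of 2:

```python
def isqrt_pow2(n):
    # Program (19.3.2): b is represented as a+c, with c a power of 2,
    # so every halving c // 2 is exact and ÷ is no longer needed.
    assert n >= 0
    a, c = 0, 1
    while c * c <= n:             # establish P: a*a <= n < (a+c)**2
        c = 2 * c
    while c != 1:
        c = c // 2
        if (a + c) ** 2 <= n:
            a = a + c
    return a                      # a*a <= n < (a+1)**2
```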

Controlled Density Sorting


In solving this problem, we attempt to convey the idea of the develop-
ment without presenting all the formal details. The complete details are
left as an exercise.
A table of (not necessarily different) numbers, which is initially empty,
must be maintained. At any time, one of the following three operations
may be performed.

(1) Insert(Vᵢ): insert a new value Vᵢ into the table.

(2) Search(x, p): Return in p the position in the table of a
value x ("position" must be further specified later on).
(3) Print: Print the list of values in the table, in ascending order.

Operation (3) should be performed in time proportional to the number of


values in the table. Furthermore, the total time spent in inserting and
searching should be "small".
The requirement for operation (3) suggests that the table of values be
kept in (ascending) order. In fact, one is led to think of algorithm
Insertion Sort. Using an array v[0:n−1] to contain the values and a simple
variable i, the table of values V₀, ..., Vᵢ₋₁ that have already been
inserted will satisfy

(19.3.3) P: 0 ≤ i ∧ ordered(v[0:i−1]) ∧ perm(v[0:i−1], {V₀, ..., Vᵢ₋₁})

Printing can be done in linear time and searching can be done in time
proportional to the logarithm of the current size of v, using Binary
Search.
But what about inserting a new value x? Inserting will require finding
the position j where x belongs -i.e. finding the value j such that v[j−1]
≤ x < v[j]- then shifting v[j:i−1] up one position to v[j+1:i], and
finally placing x in v[j]. Shifting v[j:i−1] may take time proportional
to i, which means that each insertion may take time proportional to i,
and therefore, in the worst case the total time spent inserting n items may
be on the order of n². This is expensive, and a modification is in order.
Shifting is the expensive operation, so we try to change the data repre-
sentation to make it less expensive. How can this be done, perhaps to
eliminate the need for shifting altogether?

A simple way to make shifting less expensive is to spread the values out,
so that an empty array element, or "gap", appears between each pair of
values. Thus, an array v[0:2n−1] of twice the size is defined by

(19.3.4) P: 0 ≤ i ∧ ordered(v[0:2i−1]) ∧ {V₀, ..., Vᵢ₋₁} ∈ v[0:2i−1] ∧

v[0:2i−1] = (gap, value, gap, value, ..., gap, value)

Gaps can be implemented by using a second array gap[0:2n−1] of

bits, where gap[j] has the value "v[j] is a gap". It is advantageous to let
a gap v[j] contain the non-gap value occurring in v[j+1], so that Binary
Search can still be used for searching.

Remark: If all values are known to be positive, then the sign bit of v[j]
can be used to distinguish values from gaps. □

Now, inserting takes no time at all, because the new value
can be placed in a gap. But inserting destroys the fact that a
gap separates each pair of values, and after inserting it is necessary to recon-
figure the array to reestablish (19.3.4). Reconfiguring can be costly, so we
must find a way to avoid it as much as possible.
We can defer reconfiguring the array simply by weakening the invari-
ant to allow several values to be adjacent to each other. However, there
are never adjacent gaps; the odd positions of v always contain values.
We introduce a fresh variable k to indicate the number of array elements
being used, and use the invariant

(19.3.5) P: 0 ≤ i ∧ ordered(v[0:k−1]) ∧ {V₀, ..., Vᵢ₋₁} ∈ v[0:k−1] ∧

(A j: 0 ≤ j < k ∧ odd(j): v[j] is not a gap) ∧
v[0:k−1] contains k−i gaps ∧
(A j: 0 ≤ j < k: v[j] a gap ⇒ v[j] = v[j+1])

Note, now, that when inserting the first value no shifting is required, since
it can fill a gap. The second value is likely to fill a gap also, but it may
cause a shift. The third value inserted may fill a gap also, but the proba-
bility is greater that it will cause some shifting because there are fewer
gaps. At some time, so many values will have been inserted that shifting
again becomes too expensive. At this point, it is wise to reconfigure the
array so that there is again one gap between each pair of values.
To summarize, the table is defined by (19.3.5), with (19.3.4) also being
true initially. That is, values are separated by gaps. The table is initially
set to empty using

i, k:= 0, 0
{(19.3.4) and (19.3.5) are true}

Inserting a value Vᵢ is done by

(19.3.6) {(19.3.5)}
if shifting too expensive → Reconfigure to reestablish (19.3.4)
□ shifting is not too expensive → skip
fi;
Find the position j where Vᵢ belongs;
Shift v[j: ...] up one position to make room for Vᵢ;
i, v[j]:= i+1, Vᵢ
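Outline (19.3.6) can be sketched in Python under some simplifying assumptions of ours: None marks a gap (in place of the separate bit array gap), the position j is found by a linear scan rather than Binary Search, and the array is reconfigured only when no gap is free to absorb a shift, rather than by the cost test the text analyzes:

```python
def reconfigure(v):
    """Reestablish (19.3.4): one gap before each stored value."""
    out = []
    for x in v:
        if x is not None:
            out += [None, x]
    return out

def insert(v, x):
    """Insert x into the gapped table v, keeping the values ordered."""
    j = 0                          # j: index of the first stored value >= x
    while j < len(v) and (v[j] is None or v[j] < x):
        j += 1
    if j == len(v):
        v.append(x)                # x is the new largest value
    elif j > 0 and v[j - 1] is None:
        v[j - 1] = x               # fill the gap just before position j
    else:
        g = j                      # find the first gap at or after j
        while g < len(v) and v[g] is not None:
            g += 1
        if g == len(v):            # no gap to absorb the shift: reconfigure
            return insert(reconfigure(v), x)
        v[j + 1:g + 1] = v[j:g]    # shift v[j:g] up one position
        v[j] = x
    return v

def values(v):
    """The Print operation: the stored values, in ascending order."""
    return [x for x in v if x is not None]
```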

When does shifting become so expensive that reconfiguring should again


be considered? Analysis has shown that reconfiguring is best performed
either when a previous shift requires at least √i values to be moved or
when i/2 values have been inserted since the last reconfiguration. This
makes the total time spent shifting roughly equal to the total time spent
reconfiguring, so that neither one overshadows the other. Under these
circumstances, the worst-case total time spent shifting or reconfiguring is
proportional to n·√n, while the average-case total time is proportional to
n log n.
The development of a complete algorithm is left to the reader (exercise
1).

Discussion
The first idea in developing this algorithm was to find a way to make
shifting less expensive; the method used was to put a gap between each
pair of values. The second idea was to defer reconfiguration, because it
was too expensive. The first idea made shifting cheap, but introduced the
expensive reconfiguration operation; the second idea deferred reconfigura-
tion often enough so that the total costs of shifting and reconfiguration
were roughly the same.
The algorithm is a competitor to balanced tree schemes in situations
where a table of values is to be maintained in memory.

Efficient Queues in LISP


The programming notation LISP allows functions on lists v = (v₀,
..., vₙ₋₁) of n values, where n ≥ 0. The five functions with which we
will be concerned are

(1) v = () yields the value of the assertion "list v is empty";

(2) head(v) yields the value v₀ (undefined if v is empty);
(3) tail(v) yields the value (v₁, ..., vₙ₋₁) -the list without its
first element- (undefined if v is empty);
(4) construct(w, v), where w is a value and v = (v₀, ..., vₙ₋₁) a
list, yields the list (w, v₀, ..., vₙ₋₁);
(5) append(v, w), where w is a value and v = (v₀, ..., vₙ₋₁) a
list, yields the list (v₀, ..., vₙ₋₁, w).

The first four functions are executed in constant time. Function append,
however, takes time proportional to the length of the list v to which w is
being appended.
This is all we will need to know about LISP.
Consider implementing a queue using LISP lists and the five functions
just given. A queue is a list v on which three operations may be per-
formed: the first is to reference the first element on the list, the second is
to delete the first element and the third is to insert a value w at the end
of the queue. Thus, the operations on queue v can be implemented as

(1) Reference the first element: head(v),


(2) Delete the first element: v:= tail(v),
(3) Insert value w at the end: v:= append(v, w).

Now, suppose n values v₀, ..., vₙ₋₁ are to be inserted in a queue and,
between insertions, values may be taken off the queue or the first value on

the queue can be examined. In the worst case, the time needed to
perform the insertions is on the order of n². Why?

To insert a value takes time proportional to the length of the queue. To


insert n values into an empty queue can take time proportional to
0 + 1 + ... + (n−1) = O(n²). Clearly, insertion, performed in terms of
the LISP append, is the expensive operation, and a different data repre-
sentation must be used to make it less expensive. What different repre-
sentation would allow insertion to be done in constant time?

Insertion can be done easily if the queue is kept in reverse order. But
this would make deletion expensive. Thus, we compromise: implement
queue v = (v₀, ..., vᵢ₋₁) using two lists vh and vt, where the second is
reversed:

(19.3.7) vh = (v₀, ..., vₖ₋₁), for some k, and

vt = (vᵢ₋₁, vᵢ₋₂, ..., vₖ), where vh = () only if vt = ()
Now let us look at the implementations of the queue operations again.
Referencing the first element is still implemented as head(vh) -the
restriction that vh is empty only if vt is empty allows us to implement it so simply.
Next, operation Delete must delete the first element from vh, but if vh
becomes empty, keeping (19.3.7) true requires that vt be reversed and
moved to vh. Thus, Delete -i.e. v:= tail(v)- is implemented as

vh:= tail(vh);
if vh = () ∧ vt ≠ () →
   {inv: queue is (reverse(vt) | vh)}
   {bound: |vt|}
   do vt ≠ () → vh, vt:= construct(head(vt), vh), tail(vt) od
   {(19.3.7) ∧ vt = ()}
□ vh ≠ () ∨ vt = () → skip
fi

Finally, Insert can be implemented as vt:= construct(w, vt).


Now, Insert is performed in constant time, while the loop in Delete
will take time proportional to the length of list vt. But, the total time
spent executing this loop is proportional to the total number of values n
inserted in the queue.
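In Python, with tuples standing in for LISP lists, the two-list representation (19.3.7) might be sketched as follows. The names q_insert and q_delete are ours; and as an assumption of this sketch, q_insert performs the restoring move itself when the queue was empty, so that "vh = () only if vt = ()" holds at all times:

```python
def construct(w, v):               # cons: constant time in LISP
    return (w,) + v

def head(v):
    return v[0]

def tail(v):
    return v[1:]

def q_insert(vh, vt, w):
    vt = construct(w, vt)          # Insert: vt := construct(w, vt)
    if not vh:                     # restore (19.3.7) if vh was empty
        while vt:
            vh, vt = construct(head(vt), vh), tail(vt)
    return vh, vt

def q_delete(vh, vt):
    vh = tail(vh)                  # Delete: vh := tail(vh)
    if not vh:                     # reverse vt onto vh, as in the do-loop
        while vt:
            vh, vt = construct(head(vt), vh), tail(vt)
    return vh, vt
```

The loop that reverses vt onto vh runs once per inserted value over the whole history of the queue, which is the source of the amortized linear bound stated above.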

Exercises for Section 19.3


1. Write a reconfiguration procedure

procedure config(value i, k: integer;
                 var v: array of integer;
                 var isgap: array of Boolean);

that, given fixed i > 0, fixed integer k, array v and array isgap satisfying
(19.3.5), spreads the array values out to establish (19.3.4).
2. Change the representation of variables in program (19.3.2) so that no squaring
operations are used.
3. Develop a program that, given fixed integers X, Y > 0, establishes z = X^Y.
Develop it using the idea that z should be calculated through a series of
multiplications, so that it may make sense to initialize z to the identity of *, 1,
trying to create the invariant of a loop first, and changing a representation to
make it all possible.
Chapter 20
Two Larger Examples of Program Development

Two programs are developed to show the use of the methodology on


slightly larger and more complicated problems. Each has a few further
lessons to offer on the subject of program development. An attempt will
be made to guide the reader to produce the solutions, as has been done in
previous chapters. The developments will proceed at a faster rate than
before at the places that involve only principles and strategies illustrated
earlier.
The exercises contain a series of longer and more difficult problems on
which to try the techniques and strategies discussed in this book.

20.1 Justifying Lines of Text


Consider the following problem. Write a procedure that justifies lines
of text by inserting extra spaces between words so that the last word on
each line ends in the last column. For example, using "#" to denote a
blank, the three lines

(20.1.1) justifying#lines#by########
inserting#extra#blanks#is##
one#task#of#a#text#editor.#

might appear justified as

(20.1.2) justifying#####lines#####by
inserting#extra##blanks##is
one##task#of#a#text#editor.

Several restrictions are placed on how blanks are to be inserted between


words, in order to lessen the visual impact of justification. The number of

blanks between different pairs of adjacent words on a line should differ by


no more than one. Secondly, so that long gaping holes don't appear on
one side of the paper, an alternating technique should be used: for even
(odd) lines, more blanks should be inserted on the right (left) side of the
line if necessary. For example, line 2 of (20.1.1) is changed to line 2 of
(20.1.2) by inserting the 2 extra blanks just before the ultimate and the
penultimate word on the line; on line 3, the only extra blank is inserted
after the first word.
We will write a procedure to calculate the column numbers of the
beginning of the words in the justified line, given their column numbers in
the unjustified line. For line 1 above, the list (1, 12, 18) of column
numbers where the words begin will be changed to the list (1, 16, 26) of
column numbers where the words begin in the justified line 1 of (20.1.2).
For line 2, the input list (1, 11, 17, 24) will be changed to (1, 11, 18, 26).
The following procedure heading is suggested:

proc justify (value n, z, s: integer;


var b: array of integer );
Line z has n words on it. They begin in columns b[1], ...,
b[n]. Exactly one blank separates each adjacent pair of words.
Parameter s is the total number of extra blanks that must be
inserted between words in order to justify a line. The procedure
determines new column numbers b[1:n] so that the line is justified
in the manner described above.

Beginning the development


Given the task of writing procedure justify, how would you proceed?

The first step is to write pre- and postconditions for the procedure body.
We begin with the precondition. The words themselves are not part of
the specification, since only column numbers are given. So the precondi-
tion won't be written in terms of words. But it may help to give an
interpretation of the precondition in terms of words. Initially, the input
line has the form

W1 [1] W2 [1] ... [1] Wn [s]

where W1 is the first word, W2 the second, ..., Wn the last, s is the
number of extra blanks, and the number of blanks at each place has been
shown within brackets. The precondition Q itself must give restrictions
on the input -e.g. that there cannot be a negative number of words or of
extra blanks. In addition, because array b will be modified, it is

necessary to denote its initial value, say by B. The precondition is

(20.1.3) Q: 0 ≤ s ∧ 0 ≤ n ∧ b[1:n] = B[1:n]

Now, having seen the precondition, write the postcondition R.

The justified line has the format

(20.1.4) W1 [p+1] ... [p+1] Wt [q+1] ... [q+1] Wn

where restrictions must be put on variables p, q and t. Both p and q
must be ≥ 0; furthermore, they must differ by at most one. The total
number of blanks inserted must be equal to s. Wt is expected to be one
of the words W1 through Wn. Finally, there is a restriction on where
extra blanks go, depending on the line number. These restrictions are for-
malized as follows:

(20.1.5) Q1: 1 ≤ t ≤ n ∧                        (Wt is one of the words)
             0 ≤ p ∧ 0 ≤ q ∧                    (Don't delete blanks)
             p*(t−1) + q*(n−t) = s ∧            (Insert s blanks)
             (odd(z) ∧ q = p+1 ∨ even(z) ∧ p = q+1)
                                                (Restriction on inserting blanks)

The purpose of the procedure is to change array b, so let us specify that


change. The following is easily derived:

(20.1.6) R: (A i: 1 ≤ i ≤ t: b[i] = B[i] + p*(i−1)) ∧

(A i: t < i ≤ n: b[i] = B[i] + p*(t−1) + q*(i−t))

Now, the pre- and postconditions are Q (20.1.3) and R (20.1.6), with
variables p, q and t of R satisfying Q1 (20.1.5). What is the general
structure of the algorithm?

The specification leads us to consider the algorithm

(20.1.7) {Q}
Calculate p, q and t to establish Q1;
{Q1 ∧ Q}
Calculate new b[1:n] to establish R
{Q1 ∧ R}

Had the specification been different, quite likely a different structure


would have arisen.

Calculating p, q and t
The two English commands of (20.1.7) have to be refined. We begin
with the first. At this point, refine "Calculate p, q and t to establish
Q1". Be absolutely sure the refinement is correct.

Looking at Q1 for insight, at this point there seems no way to refrain
from distinguishing the two cases odd(z) and even(z). So let us for the
moment consider only the case odd(z). Then q = p+1. We want to
determine values p, q and t that satisfy Q1. We first simplify Q1 by
substituting for q:

1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ p+1 ∧ p*(t−1) + (p+1)*(n−t) = s

which simplifies to

(20.1.8) 1 ≤ t ≤ n ∧ 0 ≤ p ∧ p*(n−1) + n − t = s

An obvious solution to (20.1.8) is p = s ÷ (n−1) and n−t = s mod (n−1):

p is the quotient and n−t the remainder when s is divided by n−1. But
note immediately that dividing by n−1 is not possible if n = 1. A line
cannot be justified if it has one word on it! Thus the specification is
inconsistent. For the moment, let us continue, assuming that n ≠ 1.
In the case odd(z) and n ≠ 1, p, q and t can be calculated using

(20.1.9) p:= s ÷ (n−1);

t:= n − (s mod (n−1));
q:= p+1

It remains to show that Q ∧ odd(z) ⇒ wp((20.1.9), Q1). The predicate

wp((20.1.9), Q1) can be calculated and simplified to

(20.1.10) 1 ≤ n − (s mod (n−1)) ≤ n ∧ 0 ≤ s ÷ (n−1) ∧ odd(z)

so it remains to prove that Q ∧ odd(z) ⇒ (20.1.10). Here we run into

trouble only if n = 0. In this case, (20.1.10) reduces to

1 ≤ −(s mod −1) ≤ 0 ∧ 0 ≤ s ÷ −1 ∧ odd(z),

which cannot be true. What does n =0 mean? That there are no words
on a line. But of course, a line with 0 words cannot be justified!
Assume the specification is changed so that, if a line has zero or one
words on it, then no justification should occur.
The case even(z) is solved in a similar fashion, leaving us with the
following algorithm to establish Q1 if n > 1:

Determine p, q and t:
if even(z) → q:= s ÷ (n−1); t:= 1 + (s mod (n−1)); p:= q+1
□ odd(z) → p:= s ÷ (n−1); t:= n − (s mod (n−1)); q:= p+1
fi
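This guarded command transcribes directly into Python; determine_pqt is an illustrative name. Assuming the lines of (20.1.1) are numbered from 0, the second line (z = 1, n = 4, s = 2) yields p = 0, q = 1, t = 2, which reproduces the justified columns (1, 11, 18, 26) quoted earlier:

```python
def determine_pqt(z, n, s):
    # Direct transcription of the guarded if; n > 1 is required,
    # since a line with zero or one word cannot be justified.
    assert n > 1
    if z % 2 == 0:                    # even(z)
        q = s // (n - 1)
        t = 1 + s % (n - 1)
        p = q + 1
    else:                             # odd(z)
        p = s // (n - 1)
        t = n - s % (n - 1)
        q = p + 1
    return p, q, t
```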

Calculating the new column numbers


The format of result assertion R, (20.1.6), suggests using two loops,
one to calculate b[1:t] and the other to calculate b[t+1:n]. Develop
these two loops, using the following suggestions. First, it may help to cal-
culate the b[i] in decreasing instead of increasing order of i. Second, it
may be wise to use a variable e that contains the number of blanks to be
added to b[i], to eliminate the need for a multiplication in each iteration.
Third, there is no need to calculate new values for b[1:t] if p = 0.

Here is one solution, which eliminates assigning to b[1:t] if p = 0, with


absolutely no additional cost.

(20.1.11) Calculate b[t+1:n]:

k, e:= n, s;
{invariant: t ≤ k ≤ n ∧ e = p*(t−1) + q*(k−t) ∧
 b[1:k] contains its initial values ∧
 b[k+1:n] contains its final values}
{bound: k−t}
do k ≠ t → b[k]:= b[k]+e; k, e:= k−1, e−q od;
{k = t ∧ e = p*(k−1)}
Calculate b[1:t], assuming invariant holds initially:
{invariant: 1 ≤ k ≤ t ∧ e = p*(k−1) ∧
 b[1:k] contains its initial values ∧
 b[k+1:n] contains its final values}
{bound: k−1}
do e ≠ 0 → b[k]:= b[k]+e; k, e:= k−1, e−p od

or simply

k, e:= n, s;
do k ≠ t → b[k]:= b[k]+e; k, e:= k−1, e−q od;
do e ≠ 0 → b[k]:= b[k]+e; k, e:= k−1, e−p od

Each loop was developed by first writing the invariant, then writing the
command of the loop, and finally determining a suitable guard. The
guard e ≠ 0 for the second loop was discovered by noting that the invari-
ant states that e = p*(k−1) and that e = 0 implies either p = 0 or k = 1,
each of which implies that all values b[i] have their final value.
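Putting the case analysis and the two loops together gives a compact Python sketch of the whole procedure. As assumptions of this sketch, justify takes a 1-indexed b (b[0] is unused padding), takes 0-based line numbers z, and leaves lines with n ≤ 1 unchanged, per the revised specification:

```python
def justify(n, z, s, b):
    # b[1:n] holds the word start columns; modified in place and returned.
    if n <= 1:
        return b                           # nothing to justify
    if z % 2 == 0:                         # even(z)
        q = s // (n - 1); t = 1 + s % (n - 1); p = q + 1
    else:                                  # odd(z)
        p = s // (n - 1); t = n - s % (n - 1); q = p + 1
    k, e = n, s
    while k != t:                          # calculate b[t+1:n]
        b[k] += e
        k, e = k - 1, e - q
    while e != 0:                          # calculate b[1:t]
        b[k] += e
        k, e = k - 1, e - p
    return b
```

Note that the second loop halts immediately when p = 0, exactly the efficiency point made below about the PL/I version.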

Discussion
The development of this program brings up several interesting points.
First of all, consider the development of the postcondition (20.1.6). A
common mistake in writing this specification is to describe the line as two
cases:

W1 [p+1] ... [p+1] Wt [p+2] ... [p+2] Wn   if odd(z), or

W1 [p+2] ... [p+2] Wt [p+1] ... [p+1] Wn   if even(z).

While it can lead to a correct program, the program will be less efficient
than the one developed, even if in a relatively minor way. Generally
speaking, one should try to follow the principle:

(20.1.12) Principle: Keep the number of different cases to a minimum.


In this case, even two cases is too many! Of course, the two different
cases must rear their heads at some time but it is wise to postpone this as
late as possible. The need to keep the number of cases to a moderate
number is one reason for not using decision tables.
A second interesting point is the development of the algorithm to cal-
culate p, q and t. The problem began as one involving strings of charac-
ters, but it turned out to be a mathematical problem, which required find-
ing a solution to some formulas. If you did not notice that the cases
n = 0 and n = 1 could not be handled, you were not being careful enough.
Finding such errors used to be a random, haphazard process, but there is
no need for this anymore. Each operation introduced (like ÷ or mod) can
be executed only in certain circumstances. Furthermore, the program-
ming discipline requires that Q ⇒ wp(S, R) be proved whenever a com-
mand S to establish {Q} S {R} has been developed. Whether this is
done formally or informally doesn't matter, as long as enough care is used
to give a high degree of confidence in correctness.
A final point concerns the development of the loop to calculate b[1:t].
It would have been easy to write this in PL/I (say) as

e= 0; DO k= 2 TO t; e= e+p; b(k)= b(k)+e; END;

But this would have eliminated the possibility of noticing that the loop
could be written without loss of efficiency to halt immediately if p = 0.
Further, one familiar with using loop invariants will generate the invariant
and loop given in (20.1.11) as quickly as the PL/I loop.

20.2 The Longest Upsequence


Consider a sequence of values (v₀, ..., vₙ₋₁). If one deletes i (not
necessarily adjacent) values from the list, one has a subsequence of length
n−i. This subsequence is called an upsequence if its values are in non-
decreasing order. For example, the list (1, 3, 4, 6, 2, 4) has a subsequence
(1, 3, 2), which is not an upsequence, and another subsequence (1, 3, 6),
which is an upsequence.
We want to write a program that, given a sequence in b[0:n−1], where
n > 0, calculates the length of the longest upsequence of b[0:n−1]. As an
abbreviation, use the notation lup(s) to mean:

lup(s) = the length of the longest upsequence of sequence s

Thus, using a variable k to contain the answer, the program has the pre-
and postconditions:

Q: n > 0
R: k = lup(b[0:n−1])

Note that a change in any one value of a sequence could change its long-
est upsequence, and this means that possibly every value of a sequence s
must be interrogated to determine lup(s). This suggests a loop. Begin by
writing a possible invariant and an outline of the loop.

The loop will interrogate the values of b[0:n−1] in some order. Since
lup(b[0:0]) is 1, a possible invariant can be derived by replacing the con-
stant n of R by a variable:

P: 1 ≤ i ≤ n ∧ k = lup(b[0:i−1])

The loop itself will have the form

i, k:= 1, 1;
do i ≠ n → increase i, maintaining P od

Increasing i extends the sequence b[0:i−1] for which k is the length of a
longest upsequence, and hence may call for an increase in k. Whether k
is to be increased depends on whether b[i] is at least as large as a value
that ends a longest upsequence of b[0:i−1] (there may be more than one
longest upsequence). It makes sense to maintain information in other
variables so that such a test can be efficiently made. What is the min-
imum information needed to ascertain whether k should be increased?

The smallest value m (say) that ends an upsequence of length k of b[0:

i−1] must be known, for then b[0:i] has an upsequence of length k+1 iff
b[i] ≥ m. Therefore, we revise invariant P to include m:

P: 1 ≤ i ≤ n ∧ k = lup(b[0:i−1]) ∧
m is the smallest value in b[0:i−1] that ends an
upsequence of length k

In the case b[i] ≥ m, k can be increased and m set to b[i], so that the
program thus far looks like

i, k, m:= 1, 1, b[0]; {P}

do i ≠ n → if b[i] ≥ m → k, m:= k+1, b[i]
           □ b[i] < m → ?
           fi;
           i:= i+1
od

The question now becomes what to do if b[i] < m. Variable k should

not be changed, but what about m? Under what condition must m be
changed?

If b[0:i−1] contains an upsequence of length k−1 that ends in a value

≤ b[i], then b[i] ends an upsequence of length k of b[0:i]. If, in addi-
tion, b[i] < m, then m must be changed. In order to check this condi-
tion, consider maintaining the minimum value m1 that ends an upse-
quence of length k−1 of b[0:i−1].
This means that two values are needed: the minimum value m that
ends an upsequence of length k and the minimum value m1 that ends an
upsequence of length k−1. Judging by the development thus far, can you
generalize this?

Maintaining m caused us to introduce m1; maintaining m1 will cause us

to introduce m2 to contain the minimum value that ends an upsequence
of length k−2. And so on. Therefore, an array of values is needed. We
modify the invariant once more:

(20.2.1) P: 1 ≤ i ≤ n ∧ k = lup(b[0:i−1]) ∧

(A j: 1 ≤ j ≤ k: m[j] is the smallest value that ends
an upsequence of length j of b[0:i−1])

And the program is changed to



i, k, m[1]:= 1, 1, b[0]; {P}
do i ≠ n → if b[i] ≥ m[k] → k:= k+1; m[k]:= b[i]
           □ b[i] < m[k] → ?
           fi;
           i:= i+1
od

Before proceeding further, it makes sense to investigate array m; does it

have any properties that might be useful?

Array m is ordered, because the minimum value that ends an upsequence

of length j (say) must be at most the minimum value that ends an upse-
quence of length j+1.
We are now faced with determining which values of m[1:k] must be
changed in case b[i] < m[k]. Solve this problem.

The case b[i] < m[1] is the easiest to handle. Since m[1] is the smallest
value that ends an upsequence of length 1 of b[0:i−1], if b[i] < m[1],
then b[i] is the smallest value in b[0:i] and it should become the new
m[1]. No other value of m need be changed, since all upsequences of
b[0:i−1] end in a value larger than b[i].
Finally, consider the case m[1] ≤ b[i] < m[k]. Which values of m
should be changed? Clearly, only those greater than b[i] can be changed,
since they represent minimum values. So suppose we find the j satisfying

m[j−1] ≤ b[i] < m[j]

Then m[1:j−1] should not be changed. Next, since m[j−1] ends an

upsequence of length j−1 of b[0:i−1], b[i] ends an upsequence of length
j of b[0:i]. Hence, m[j] should be changed to b[i]. Finally, m[j+1:k]
should not be changed (why?).
Binary search (exercise 4 of section 16.3) can be used to locate j. The
final program is given in (20.2.2).
The execution time of program (20.2.2) is proportional to (n log n) in
the worst case and to n in the best. It requires space proportional to n in
the worst case, for array m. It uses a technique called "dynamic
programming", although it was developed without conscious knowledge of
that technique.

(20.2.2) i, k, m[1]:= 1, 1, b[0]; {P}

{inv: (20.2.1); bound: n−i}
do i ≠ n → if b[i] ≥ m[k] → k:= k+1; m[k]:= b[i]
           □ b[i] < m[1] → m[1]:= b[i]
           □ m[1] ≤ b[i] < m[k] →
               Establish m[j−1] ≤ b[i] < m[j]:
               h, j:= 1, k;
               {inv: 1 ≤ h < j ≤ k ∧ m[h] ≤ b[i] < m[j]}
               {bound: j−h−1}
               do h ≠ j−1 → e:= (h+j) ÷ 2;
                  if m[e] ≤ b[i] → h:= e
                  □ m[e] > b[i] → j:= e
                  fi
               od;
               m[j]:= b[i]
           fi;
           i:= i+1
od
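For comparison, the whole development collapses to a few lines of Python if the standard bisect module supplies the binary search. The list m below is 0-indexed, so it plays the role of the text's m[1:k]; bisect_right finds the text's j directly, and all three guarded cases fall out of one test:

```python
import bisect

def lup(b):
    # m[j-1] = smallest value ending an upsequence of length j
    # of the prefix of b scanned so far (invariant (20.2.1), 0-indexed).
    m = []
    for x in b:
        j = bisect.bisect_right(m, x)   # first index with m[j] > x
        if j == len(m):
            m.append(x)                 # case b[i] >= m[k]: lengthen
        else:
            m[j] = x                    # cases b[i] < m[k]: improve m[j]
    return len(m)
```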

Exercises for Chapter 20


1. (Unique 5-bit Sequences). Consider sequences of 36 bits. Each such sequence
has 32 5-bit sequences consisting of adjacent bits. For example, the sequence
1101011... contains the 5-bit sequences 11010, 10101, 01011, .... Write a program
that prints all 36-bit sequences with the two properties

(1) The first 5 bits of the sequence are 00000.

(2) No two 5-bit subsequences are the same.

2. (The Next Higher Permutation). Suppose array b[0:n−1] contains a sequence
of (not necessarily different) digits, e.g. n = 6 and b[0:5] = (2, 4, 3, 6, 2, 1). Con-
sider this sequence as the integer 243621. For any such sequence (except for the
one whose digits are in decreasing order) there exists a permutation of the digits
that yields the next higher integer (using the same digits). For the example, it is
(2, 4, 6, 1, 2, 3), which represents the integer 246123.
Write a program that, given an array b[0:n−1] that has a next higher permu-
tation, changes b into that next higher permutation.
3. (Different Adjacent Subsequences). Consider sequences of 1's, 2's and 3's. Call
a sequence good if no two adjacent non-empty subsequences of it are the same.
For example, the following sequences are good:

2
32
32123
1232123

The following sequences are bad (not good):

33
32121323
123123213
It is known that a good sequence exists, of any length. Consider the "alphabetical
ordering" of sequences, where sequence s1 .<. sequence s2 if, when considered as
decimal fractions, s1 is less than s2. For example, 123 .<. 1231 because
.123 < .1231, and 12 .<. 13. Note that if we allow 0's in a sequence, then
s1 0 .=. s1. For example, 110 .=. 11, because .110 = .11.
Write a program that, given a fixed integer n ≥ 0, stores in array b[0:n−1]
the smallest good sequence of length n.
4. (The Line Generator). Given is some text stored one character to an array ele-
ment in array b[0:n−1]. The possible characters are the letters A, ..., Z, a blank
and a new line character (NL). The text is considered to be a sequence of words
separated by blanks and new line characters. Desired is a program that breaks
the text into lines in a two-dimensional array line[0:nolines−1, 0:maxpos−1],
with line[0, 0:maxpos−1] being the first line, line[1, 0:maxpos−1] being the
second line, etc. The lines must satisfy the following properties:

1. No word is split onto two lines.

2. Each line contains no more than maxpos characters.
3. A line contains as many words as possible, with one blank between
each pair of words. Lines are padded on the end with blanks to max-
pos characters.
4. A new line character denotes the end of a line, but will not appear in
array line.
5. (Perm_to_Code). Let N be an integer, N > 0, and let X[0:N−1] be an array
that contains a permutation of the integers 0, 1, ..., N−1:

perm(X[0:N−1], (0, 1, ..., N−1))

For X, we can define a second array X'[0:N−1] as follows. For each i, element
X'[i] is the number of values in X[0:i−1] that are less than X[i]. For exam-
ple, we show one possible array X and the corresponding array X', for N = 6.

X  = (2, 0, 3, 1, 5, 4)
X' = (0, 0, 2, 1, 4, 4)

Formally, array X' satisfies

(A i: 0 ≤ i < N: X'[i] = (N j: 0 ≤ j < i: X[j] < X[i]))

Write a program that, given an array x that contains a permutation X of
{0, ..., N−1}, changes x so that it contains the corresponding values X'. The
program may use other simple variables, but no arrays besides x.

6. (Code_to_Perm). Read exercise 5. Write a program that, given an array

x = X', where X' is the code for permutation X, stores X in x. No other
arrays should be used.

7. (The Non-Crooks). Array f[0:F−1] contains the names of people who work
at Cornell, in alphabetical order. Array g[0:G−1] contains the names of people
on welfare in Ithaca, in alphabetical order. Thus, neither array contains dupli-
cates and both arrays are monotonically increasing:

f[0] < f[1] < f[2] < ... < f[F−1]

g[0] < g[1] < g[2] < ... < g[G−1]

Count the number of people who are presumably not crooks: those that appear in
at least one array but not in both.
8. Read exercise 7. Suppose the arrays may contain duplicates, but the arrays are
still ordered. Write a program that counts the number of distinct names that are
not on both lists -i.e. don't count duplicates.
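The natural solution to exercise 7 is a single merge-like pass over the two sorted arrays (a Python sketch, assuming the strictly increasing, duplicate-free arrays of exercise 7):

```python
def count_non_crooks(f, g):
    """Count names appearing in exactly one of two strictly
    increasing arrays f and g, using one linear merge pass."""
    i = j = count = 0
    while i < len(f) and j < len(g):
        if f[i] < g[j]:
            count += 1      # f[i] appears only in f
            i += 1
        elif g[j] < f[i]:
            count += 1      # g[j] appears only in g
            j += 1
        else:               # same name in both lists: not counted
            i += 1
            j += 1
    # whatever remains of either array appears in only one list
    return count + (len(f) - i) + (len(g) - j)
```

Exercise 8's variant with duplicates needs only an extra step that skips over equal neighbors before each comparison.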
9. (Period of a Decimal Expansion). For n > 1, the decimal expansion of 1/n is
periodic. That is, it consists of an initial sequence of digits d1 ... di followed by
a sequence d(i+1) ... d(i+j) that is repeated over and over. For example,
1/4 = .2500000..., so the sequence 0 is repeated over and over (i = 2 and j = 1),
while 1/7 = .142857142857142857..., so the sequence 142857 is repeated over
and over (i = 0 and j = 6, although one can take i to be any positive integer
also). Write a program to find the length j of the repeating part. Use only
simple variables - no arrays.
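One way to see why the program exists: carry out the long division of 1/n and watch the remainders; the digits start repeating exactly when a remainder recurs. A Python sketch (the dictionary here is bookkeeping that the exercise's "no arrays" restriction disallows; a cycle-finding technique such as Floyd's two-pointer method would remove it):

```python
def period_length(n):
    """Length j of the repeating part of the decimal expansion of
    1/n, n > 1, found by long division: remainder r at position p
    yields digit (r*10)//n and next remainder (r*10) % n."""
    seen = {}                  # remainder -> position where it occurred
    r, pos = 1 % n, 0
    while r not in seen:
        seen[r] = pos
        r = (r * 10) % n
        pos += 1
    return pos - seen[r]       # distance between recurrences
```

For 1/4 the remainders run 1, 2, 0, 0, ... so the period is 1; for 1/7 they run 1, 3, 2, 6, 4, 5, 1, ... so the period is 6.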

10. (Due to W.H.J. Feijen) Given is an array g[0:N-1], N ≥ 2, satisfying
0 ≤ g[0] ≤ ... ≤ g[N-1]. Define

    h_1 = g[0] + g[1]
    h_k = h_(k-1) + g[k]   for 1 < k ≤ N-1

Write a program to construct an array X[0:2*N-2] containing the values

    g[0], ..., g[N-1], h_1, ..., h_(N-1)

in increasing order. The execution speed of the program should be linear in N.
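Since g is sorted and its elements are nonnegative, the h_k come out in sorted order themselves, so a standard linear merge of the two streams meets the speed requirement. A Python sketch:

```python
def merge_g_and_h(g):
    """Build, in linear time, the sorted array of the g[i] together
    with the partial sums h_k = g[0]+...+g[k] for k >= 1."""
    N = len(g)
    h, s = [], g[0]
    for k in range(1, N):
        s += g[k]
        h.append(s)            # h is sorted because every g[k] >= 0
    # merge the two sorted sequences g and h
    X, i, j = [], 0, 0
    while i < N or j < len(h):
        if j == len(h) or (i < N and g[i] <= h[j]):
            X.append(g[i])
            i += 1
        else:
            X.append(h[j])
            j += 1
    return X
```

For g = [0, 1, 2, 3] the partial sums are 1, 3, 6 and the merged result is [0, 1, 1, 2, 3, 3, 6], of length 2*N-1 as required.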

11. (Exponentiation). Write a program that, given two integers x ≥ 0 and y > 0,
calculates the value z = x^y. The binary representation b(k-1) ... b1 b0 of y is
also given, and the program can refer to bit i using the notation b_i. Further, the
value k is given. The program is to begin with z = 1 and reference each bit of
the binary representation once, in the order b(k-1), b(k-2), ..., b0.
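The intended left-to-right scheme squares z for each bit and multiplies in x when the bit is 1. A Python sketch (using bin() to supply the given bit string):

```python
def power(x, y):
    """Compute z = x**y by scanning the binary representation of y
    once, most significant bit first, as exercise 11 asks.
    Invariant: z = x ** (value of the bits examined so far)."""
    bits = bin(y)[2:]          # b(k-1) ... b1 b0 of y, y > 0
    z = 1
    for b in bits:             # left to right: b(k-1) first
        z = z * z              # shift the exponent left one bit ...
        if b == '1':
            z = z * x          # ... and add in the new bit
    return z
```

Each bit is referenced exactly once, and the loop body performs at most two multiplications per bit.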
Chapter 21
Inverting Programs

Wouldn't it be nice to be able to run a program backwards or, better
yet, to derive from one program P a second program P⁻¹ that computes
the inverse of P? That means that running P followed by P⁻¹ would be
the same as not running any program at all! Also, if we had the result of
executing P, but had lost the input, we could execute P⁻¹ to determine
that input. This chapter is devoted to having fun inverting programs.

Some Simple Program Inversions


Some simple commands are easily inverted. The inverse of x := x+1,
written (x := x+1)⁻¹, is x := x-1. But some commands are not invertible.
For example, computing the inverse of x := 1 requires knowledge of the
value of x before the assignment. Such a command may be invertible
with respect to a precondition, though. For example, the inverse of

    {x = 3} x := 1

is

    {x = 1} x := 3

Thus, execution of the first begins with x = 3 and ends with x = 1, while
execution of the second does the opposite. (Note carefully how one gets
an inverse by reading backwards - except that the assertion becomes the
command and the command becomes the assertion. This itself is a sort of
inversion.) This example shows that we may have to compute inverses of
programs together with their pre- and/or postconditions.
The command x := x*x has no inverse, because two different initial
values x = 2 and x = -2 yield the same result x = 4. To have an inverse,
a program must yield a different result for each different input.

Swapping two variables


What is the inverse of x, y := y, x? By reading the symbols of the
command in reverse, or inverse, order, we get x, y := y, x. And sure
enough, the command x, y := y, x is its own inverse!
The idea of copying a program in reverse order to get its inverse is
appealing, so let us push it further. Let's compute the inverse of

(21.1)    x := x+y; y := x-y; x := x-y

Executing backwards to undo the effects of execution of this sequence
would mean first undoing - or executing the inverse of - the third
command x := x-y, then undoing the second one, y := x-y, and finally
undoing the first one, x := x+y. We write this as follows, where again the
superscript -1 denotes inversion:

(21.2)    (x := x-y)⁻¹; (y := x-y)⁻¹; (x := x+y)⁻¹

The inverse of x := x-y is x := x+y, and vice versa. Let's calculate the
inverse of y := x-y. This is equivalent to y := -(y-x), which is
equivalent to y := y-x; y := -y. The inverse of this sequence is y := -y;
y := y+x, which is equivalent to y := -y+x, which is equivalent to
y := x-y. Hence, y := x-y is its own inverse, and (21.2) is equivalent to

    x := x+y; y := x-y; x := x-y

But then (21.1) is its own inverse! We leave to exercise 1 the proof that
(21.1) swaps the values of the integer variables x and y.
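The claimed behavior of (21.1) is easy to check concretely. A Python transcription (applying the sequence twice should be the identity, since (21.1) is its own inverse):

```python
def seq_21_1(x, y):
    """The sequence (21.1): x := x+y; y := x-y; x := x-y."""
    x = x + y
    y = x - y      # = (old x + old y) - old y = old x
    x = x - y      # = (old x + old y) - old x = old y
    return x, y
```

Here `seq_21_1(3, 8)` returns `(8, 3)`, and applying it to that result gives `(3, 8)` back: the sequence swaps x and y without a temporary variable, and it undoes itself.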

Inversion of general commands


With this introduction to the idea of inversion, we now investigate
some inversions that will be needed later. In doing so, keep in mind the
general method of reading backwards.

The inverse of skip. The inverse of skip would be piks, so we will have
to introduce piks as a synonym for skip.

The inverse of S1; S2; ... ; Sn. According to what we did previously,
the inverse of a sequence of commands is the reverse of the sequence of
inverses of the individual commands.

The inverse of x := c1; S {x = c2}, where c1 and c2 are constants. This is
a kind of "block". A new variable x is initialized to a value c1, S is
executed, and upon termination x has a final value c2. The inverse
assigns c2 to x, executes the inverse of S, and terminates with x = c1:

    (x := c1; S {x = c2})⁻¹  =  x := c2; S⁻¹ {x = c1}

Note how, in performing the inversion, the assertion becomes an assign-
ment and the assignment becomes an assertion.

The inverse of an alternative command. Consider the command

(21.3)    {B1 ∨ B2}
          if B1 → S1 {R1}
          □ B2 → S2 {R2}
          fi {R1 ∨ R2}

Execution must begin with at least one guard true, so the disjunction of
the guards has been placed before the command. Execution terminates
with either R1 or R2 true, depending on which command is executed, so
R1 ∨ R2 is the postcondition.
To perform the inverse of (21.3), we must know whether to perform
the inverse of S2 or to perform the inverse of S1, since only one of them
is executed when (21.3) is executed. To determine this requires knowing
which of R2 and R1 is true, which means they cannot both be true at the
same time. We therefore require that R1 ∧ R2 = F. For symmetry, we
also require B1 ∧ B2 = F.
Now let's develop the inverse of (21.3). Begin at the end of (21.3) and
read backwards. The last line of (21.3) gives us the first line of the
inverse: {R2 ∨ R1} if. This makes sense; since (21.3) must end in a state
satisfying R1 ∨ R2, its inverse must begin in a state satisfying R2 ∨ R1.
Reading the fourth line backwards gives us the first guarded command:

    R2 → S2⁻¹ {B2}

This is understood as follows. Execution of (21.3) beginning with B2 true
executes S2 and establishes R2. Execution of its inverse beginning with
R2 true undoes what S2 has done, thus establishing B2.
Note carefully how, when inverting a guarded command with a post-
condition, the guard and postcondition switch places.
Continuing to read backwards yields the following inverse of (21.3)
(provided R1 ∧ R2 = F):

(21.4)    {R2 ∨ R1}
          if R2 → S2⁻¹ {B2}
          □ R1 → S1⁻¹ {B1}
          fi {B2 ∨ B1}

The inverse of an iterative command. Consider the command

(21.5)    do B1 → S1 od {¬B1}

Loop (21.5) contains the barest information - it is annotated only with
the fact that B1 is false upon termination. It turns out that a loop invari-
ant is not needed to invert a loop.
From previous experience in inverting an alternative command, we
know that a guarded command to be inverted requires a postcondition.
Further, we can expect ¬B1 to become the precondition of the loop
(because we read backwards) and therefore the loop must have a precon-
dition that will become the postcondition. The two occurrences of B1 in
(21.5) lead us to insert another predicate C1 as follows:

(21.6)    {¬C1} do B1 → S1 {C1} od {¬B1}

Now it's easy to invert: simply read backwards, inverting the delimiters
do and od and inverting a guarded command as done earlier in the case
of the alternative command. The inverse of (21.6) is

(21.7)    {¬B1} do C1 → S1⁻¹ {B1} od {¬C1}

Inverting swap_equals
In section 16.5 a program was developed to swap two non-overlapping
sections b[i:i+n-1] and b[j:j+n-1] of equal size n, where n ≥ 0. The
invariant for the loop of the program is 0 ≤ k ≤ n together with

      i          i+k-1  i+k       i+n-1        j          j+k-1  j+k       j+n-1
    b | swapped      | unswapped      |  ∧  b | swapped      | unswapped      |

The bound function is n-k and the program is

    k := 0;
    do k ≠ n → b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1 od
This program looks like a "block", in that a new variable k is initialized.
To use the inversion technique described earlier for a block it must have a
postcondition that describes the value of k. This postcondition is k = n,
the complement of the guard of the loop. Also, to invert the loop we will
need a precondition for it and a postcondition for its body; these can be
k = 0 and k ≠ 0, respectively. Thus, we rewrite the program as

(21.8)  k := 0;
        loop: {k = 0}
              do k ≠ n →
                 b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1 {k ≠ 0}
              od
              {k = n}
        {k = n}

where loop labels the five indented lines: the loop and its pre- and post-
conditions. Using the rule for inverting a block, we find the inverse of
this program to be

    k := n; loop⁻¹ {k = 0}

Using the rule for inverting the loop, we find loop⁻¹ to be

    pool: {k = n}
          do k ≠ 0 →
             (b[i+k], b[j+k] := b[j+k], b[i+k]; k := k+1)⁻¹ {k ≠ n}
          od
          {k = 0}

Further, the body of the loop - the inverse of the multiple assignment in
the original loop - is

    k := k-1; b[i+k], b[j+k] := b[j+k], b[i+k]

Putting this together yields the inverse of program (21.8):

    k := n;
    pool: {k = n}
          do k ≠ 0 →
             k := k-1; b[i+k], b[j+k] := b[j+k], b[i+k] {k ≠ n}
          od
          {k = 0}
    {k = 0}

Note how the original program swaps values beginning with the first ele-
ments of the sections, while its inverse begins with the last elements and
works its way backward. Note also that (21.8) is its own inverse, so (21.8)
has at least two inverses.
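Both directions can be transcribed into Python to watch the forward/backward behavior concretely (a sketch; Python's tuple assignment plays the role of the multiple assignment):

```python
def swap_equals(b, i, j, n):
    """Swap sections b[i:i+n-1] and b[j:j+n-1] in place,
    first elements first, as in the original program."""
    k = 0
    while k != n:
        b[i+k], b[j+k] = b[j+k], b[i+k]
        k = k + 1

def swap_equals_inverse(b, i, j, n):
    """The derived inverse: the same swaps, but starting with the
    last elements of the sections and working backward."""
    k = n
    while k != 0:
        k = k - 1
        b[i+k], b[j+k] = b[j+k], b[i+k]
```

Running swap_equals and then swap_equals_inverse on the same arguments leaves the array exactly as it started.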

Inverting Perm_to_Code
Exercise 5 of chapter 20 was to write a program for the following
problem. Let N be an integer, N > 0, and let X[0:N-1] be an array that
contains a permutation of the integers 0, 1, ..., N-1. Formally,

(21.9)    perm(X[0:N-1], {0, 1, ..., N-1})

For X, we define a second array X'[0:N-1] as follows. Element X'[i] is
the number of values in X[0:i-1] that are < X[i]. As an example, we
show one possible array X and the corresponding array X', for N = 6:

(21.10)    X  = (2, 0, 3, 1, 5, 4)
           X' = (0, 0, 2, 1, 4, 4)

X' is called the code for X. Formally, array X' satisfies

(21.11)    (A i: 0 ≤ i < N: X'[i] = (N j: 0 ≤ j < i: X[j] < X[i]))

Write a program that, given an array x containing a permutation X of
{0, ..., N-1}, changes x so that it contains the code X' for X instead.
The program may use other simple variables, but no arrays besides x.
We now develop the program and then invert it; this constructively
proves that for each permutation X there exists exactly one code X', and
vice versa.
The program is to convert an array x containing an initial value X
into its code X'. A possible specification is therefore

    {(A i: 0 ≤ i < N: x[i] = X[i])}
    S
    {(A i: 0 ≤ i < N: x[i] = X'[i])}

Each element of array x must be changed, so it probably requires itera-
tion. What is a possible loop invariant?

We try to write a loop that changes one value of array x from its initial
to its final value at each iteration. The usual strategy in such cases is to
replace a constant of the result assertion by a variable. Here, we can
replace 0 or N, which leads to calculating the array values in descending
or ascending order of subscript value, respectively. Which should we do?
In example (21.10), the values X[N-1] and X'[N-1] are the same. If
the last values of X and X' were always the same, working in descending
order of subscript values might make more sense. So let's try to prove
that they are always the same.

X[N-1] is the last value of X. Since the array values are 0, ..., N-1,
there are exactly X[N-1] values less than X[N-1] in X[0:N-2]. But
X'[N-1] is defined to be the number of values in X[0:N-2] less than
X[N-1]. Hence, X[N-1] and X'[N-1] are the same.
Replacing the constant 0 of the postcondition by a variable k yields
the first attempt at an invariant:

    0 ≤ k ≤ N ∧ (A i: k ≤ i < N: x[i] = X'[i])

But the invariant must also indicate that the lower part of x still contains
its initial value, so we rewrite the invariant as

    0 ≤ k ≤ N ∧ (A i: k ≤ i < N: x[i] = X'[i]) ∧
    (A i: 0 ≤ i < k: x[i] = X[i])

The obvious bound function is k, and the loop invariant can be estab-
lished using k := N.
There is still a big problem with using this as the loop invariant. We
began developing the invariant by noticing that X[N-1] = X'[N-1], so
that the final value of x[N-1] was the same as its initial value. To gen-
eralize this situation, at each iteration we would like x[k-1] to contain
its final value, but the invariant developed thus far doesn't indicate this.
The generalization would work if at each iteration x[0:k-1] contained
a permutation of the integers {0, ..., k-1} and if the code for this per-
mutation was equal to X'[0:k-1]. But this is not the case: the invariant
does not even indicate that x[0:k-1] is a permutation of the integers
{0, ..., k-1}.
Perhaps x can be modified during each iteration so that this is the
case. Let us rewrite the invariant as

    P:  0 ≤ k ≤ N ∧ (A i: k ≤ i < N: x[i] = X'[i]) ∧
        perm(x[0:k-1], {0, ..., k-1}) ∧ x[0:k-1]' = X'[0:k-1]

The program will therefore have the form

    k := N;
    do k ≠ 0 → k := k-1;
               Reestablish P
    od

The question is how to reestablish P. Note that, after executing k :=
k-1, x[0:k-1] contains the set {0, ..., k} minus the value x[k]. If we
subtract 1 from every value in x[0:k-1] that is > x[k], x[0:k-1] will
contain a permutation of {0, ..., k-1}. For example, if we begin with

    x = (2, 5, 4, 1, 0, 3)  and  k = 6

the first iteration will reduce k to yield

    x = (2, 5, 4, 1, 0, 3)  and  k = 5

and then change x to yield

    x = (2, 4, 3, 1, 0, 3)  and  k = 5

It is easily seen that the permutation (2, 4, 3, 1, 0) is consistent with the
original one, in the sense that the code for (2, 4, 3, 1, 0) is the same as the
first 5 integers of the code for the original array (2, 5, 4, 1, 0, 3). And, in
general, the code x[0:k-1]' for x[0:k-1] will be the same as X'[0:k-1],
because the values in x[0:k-1] will be in the same relative order as the
values in X[0:k-1].
These considerations lead directly to program (21.12); the invariant of
the inner loop is simple enough to leave to the reader.

(21.12)  k := N;
         do k ≠ 0 →
            k := k-1;
            Subtract 1 from every member of x[0:k-1] that is > x[k]:
               j := 0;
               do j ≠ k → {x[j] ≠ x[k]}
                  if x[j] > x[k] → x[j] := x[j]-1
                  □ x[j] < x[k] → skip
                  fi;
                  j := j+1
               od
         od
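Program (21.12) transcribes almost line for line into Python (a sketch; the array x is modified in place, and only simple variables are used besides it):

```python
def perm_to_code(x):
    """Program (21.12): replace a permutation x of {0, ..., N-1}
    by its code, in place."""
    k = len(x)
    while k != 0:
        k = k - 1
        # subtract 1 from every member of x[0:k-1] that is > x[k]
        j = 0
        while j != k:
            if x[j] > x[k]:
                x[j] = x[j] - 1
            j = j + 1
```

On the running example, starting from x = [2, 0, 3, 1, 5, 4] the call leaves x = [0, 0, 2, 1, 4, 4], the code computed earlier by hand.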

We want to invert program (21.12). The first step is to insert assertions
so that the inversions given earlier can be applied. We have left out the
pre- and postconditions of the alternative command, since they are just
the disjunctions of the guards and postconditions of the commands,
respectively.

    k := N;
    loopa: {k = N}
           do k ≠ 0 →
              k := k-1;
              j := 0;
              loopb: {j = 0}
                     do j ≠ k →
                        if x[j] > x[k] → x[j] := x[j]-1 {x[j] ≥ x[k]}
                        □ x[j] < x[k] → skip {x[j] < x[k]}
                        fi;
                        j := j+1
                        {j ≠ 0}
                     od
                     {j = k}
              {j = k}
              {k ≠ N}
           od
           {k = 0}
    {k = 0}

Now invert the program, step by step, applying the inversion rules given
earlier. First, invert the block k := N; loopa {k = 0} to yield k := 0;
loopa⁻¹ {k = N}. Next, loopa⁻¹ is

    apool: {k = 0}
           do k ≠ N → (k := k-1; j := 0; loopb {j = k})⁻¹ {k ≠ 0} od
           {k = N}

Continuing in this fashion yields the following inverse of (21.12).



    k := 0;
    apool: {k = 0}
           do k ≠ N →
              j := k;
              bpool: {j = k}
                     do j ≠ 0 →
                        j := j-1;
                        if x[k] > x[j] → piks {x[k] > x[j]}
                        □ x[k] ≤ x[j] → x[j] := x[j]+1 {x[k] < x[j]}
                        fi
                        {j ≠ k}
                     od;
                     {j = 0}
              {j = 0}
              k := k+1
              {k ≠ 0}
           od
           {k = N}
    {k = N}

or, without the assertions,

    k := 0;
    do k ≠ N →
       j := k;
       do j ≠ 0 →
          j := j-1;
          if x[k] > x[j] → piks
          □ x[k] ≤ x[j] → x[j] := x[j]+1
          fi
       od;
       k := k+1
    od
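Transcribed into Python (a sketch; since piks, like skip, does nothing, only the incrementing guard contributes code), this is the Code_to_Perm program of exercise 6:

```python
def code_to_perm(x):
    """The inverted program: turn a code x back into the unique
    permutation it encodes, in place."""
    k = 0
    N = len(x)
    while k != N:
        j = k
        while j != 0:
            j = j - 1
            if x[k] <= x[j]:       # the other guard is a piks (no-op)
                x[j] = x[j] + 1
        k = k + 1
```

Starting from the code [0, 0, 2, 1, 4, 4], the call restores the permutation [2, 0, 3, 1, 5, 4], confirming that the inverse undoes Perm_to_Code.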

Exercises for Chapter 21

1. Prove that wp((21.1), x = X ∧ y = Y) = (x = Y ∧ y = X).
2. Is x := x/2 invertible in theory? in practice? Is x := x ÷ 2 invertible?
3. Invert program Array Reversal (exercise 2 of 16.5).
4. Invert program Link Reversal (exercise 6 of 16.5). What is the problem with it?
5. Invert the program of exercise 1, section 18.1.
6. Discover and invert some interesting programs of your own.
Chapter 22
Notes on Documentation

Almost all programs in this book have been written in the guarded
command notation, with the addition of multiple assignment, procedure
call and procedure declaration. To execute the programs on a computer
usually requires translation into Pascal, PL/I, FORTRAN or another
implemented language. Nevertheless, it still makes sense to use the
guarded command notation because the method of program development
is so intertwined with it. Remember Principle 18.3.11: program into a
programming language, not in it.
In this chapter, we discuss the problems of writing programs in other
languages as well as in the guarded command notation. We give general
rules for indenting and formatting, describe problems with definitions and
declarations of variables, and show by example how the guarded com-
mand notation might be translated into other languages.

22.1 Indentation
In the early days, programs were written in FORTRAN and assembly
languages with no indentation whatsoever, and they were hard to under-
stand because of it. The crutch that provided some measure of relief was
the flaw chart, since it gave a two-dimensional representation that exhi-
bited the program structure or "flow of control" more clearly.
Maintaining two different forms of the program -the text itself and
the flaw chart- has always been prone to error because of the difficulty
in keeping them consistent. Further, most programmers have never liked
drawing flaw charts, and have often produced them only after programs
were finished, and only because they were told to provide them as docu-
mentation. Therefore the relief expected from the use of flaw charts was
missing when most needed -during program development.

Following some simple rules, indentation of a program provides a
two-dimensional representation that shows its structure in a simple
manner. Further, indentation is something that a programmer can do as
a matter of course, as a habit, during the programming process. Therefore,
only one document is needed, the indented program. All the problems of
consistency over two forms disappear.
Good indentation obviates the need for a flaw chart.
We shall give some simple rules for indentation, which can be used
with most available programming languages. There may be slight differ-
ences from language to language, but, in general, the rules are the same.

Sequential composition
Many programming conventions force the programmer to write each
command on a separate line. This tends to spread a program out, making
it difficult to keep the program on one page. Then, indentation becomes
hard to follow. The rule to use is the following:

(22.1.1) ● Rule: Successive commands can be written on the same
line provided that, logically, they belong together.

Here is an example. In program 20.2.2, the following command is used to
establish loop invariant P: i, k, m[1] := 1, 1, b[0]. In PL/I, which has no
multiple assignment command, this can be written on one line as

    i= 1; k= 1; m(1)= b(0); /* P */

Together, the three assignments perform the single function of establishing
P. There is no reason to force the programmer to write them as

    i= 1;
    k= 1;
    m(1)= b(0);

(As an aside, note how the PL/I assignment is written with no blank to
the left of = and one blank to the right. Since PL/I uses the same sym-
bol for equality and assignment, it behooves the programmer to find a
way to make them appear different.)
Don't use rule 22.1.1 as a license to cram programs into as little space
as possible; use the rule with care and reason.
The rule concerning indentation of sequences of commands is obvious:

(22.1.2) ● Rule: Commands of a sequence that appear on successive
lines should begin in the same column.

Thus, don't write

    i= 1;
       k= 1;
          m(1)= b(0);

Indenting subcommands
The rule concerning subcommands of a command is:

(22.1.3) ● Rule: Indent subcommands of a command 3 or 4 spaces
from the column where the command begins (or more, if it seems
appropriate).

For example, write

    do a+1 ≠ b → d := (a+b) ÷ 2;
                 if d*d ≤ n → a := d
                 □ d*d > n → b := d
                 fi
    od

or, in PL/I,

    DO WHILE (a+1 ¬= b);
       d= FLOOR((a+b)/2);
       IF d*d <= n
          THEN a= d;
          ELSE b= d;
    END;

Note that the body of the loop is indented. Further, the body is a se-
quence of two commands, which, following rule 22.1.2, begin in the same
column. Also, the subcommands of the PL/I conditional statement are
indented with respect to its beginning.
The PL/I conditional statement could also have been written as

    IF d*d <= n THEN a= d;
                ELSE b= d;

or even, since it is short and simple, on one line as

    IF d*d <= n THEN a= d; ELSE b= d;

With respect to exact placement of THEN and ELSE, it doesn't matter
what conventions you follow as long as (1) you are consistent and (2) rule
22.1.3 is followed, so that the structure of the program is easily seen.
Consistency is important, so the reader knows what to expect.
Indentation can, and should, be used with FORTRAN. The loop
given above could have been written in FORTRAN 77 as

    C     DO WHILE (a+1 .NE. b); ...
    05    IF (a+1 .EQ. b) GOTO 25
    10       d= FLOOR((a+b)/2)
             IF (d*d-n) 12, 12, 14
    12       a= d
             GOTO 20
    14       b= d
    20    GOTO 05
    25    CONTINUE

Assertions
As mentioned as early as chapter 6, it helps to put assertions in pro-
grams. Include enough so that the programmer can understand the pro-
gram, but not so many that he is overwhelmed with detail. The most
important assertion, of course, is the invariant of a loop. Actually, if the
program is annotated with the precondition, the postcondition, an invari-
ant for each loop, and a bound function for each loop, then the rest of the
pre- and postconditions can, in principle, be generated automatically.
Assertions, of course, must appear as comments in languages that don't
allow them as a construct. (Early versions of Ada included an "assert"
statement; mature Ada does not.) Two rules govern the indentation of
assertions:

(22.1.4) ● Rule: The pre- and postcondition of a command should
begin in the same column as the command.

(22.1.5) ● Rule: A loop should be preceded by an invariant and a
bound function; these should begin in the same column as the
beginning of the loop.

We have used these rules throughout the book, so they should appear
natural by now (naturalness must be learned). For two examples of the
use of rule 22.1.5, see program 20.2.2.

Indentation of delimiters
There are three conventions for indenting a final delimiter (e.g. od, fi
and the END; of PLfl). The first convention puts the delimiter on a
separate line, beginning in the same column as the beginning of the com-
mand.
The second convention is to indent the delimiter the same distance as
the subcommands of the command -as in the PL( I loop

DO WHILE ( expression );

END;

This convention has the advantage that it is easy to determine which com-
mand sequentially follows this one: simply search down in the column in
which the DO WHILE begins until a non-blank is found.
The third convention is to hide the delimiter completely on the last line
of the command. For example,

    DO WHILE ( expression );
       ... END;

or

    do guard →
       ... od

This convention recognizes that the indenting rules make the end delim-
iters redundant. That is, if a compiler used the indentation to determine
the program structure, the end delimiters wouldn't be necessary. The del-
imiters are still written, because they provide a useful redundancy that can
be checked by the compiler, but they are hidden from view.
Which of the three conventions you use is not important; the impor-
tant point is to be consistent, so that the reader is not surprised:

(22.1.6) ● Rule: Whatever convention you use for indenting end
delimiters, use it consistently.

The command-comment
Some of the programs presented in this book, like program 20.2.2,
have used an English sentence as a label (followed by a colon) or a com-
ment. The English sentence was really a command to do something, and

the program text that performed the command was indented underneath
it. Here is an example.

    Set z to the maximum of x and y:
       if x ≥ y → z := x
       □ y ≥ x → z := y
       fi

In Pascal, the English sentence would be a comment, and since comments
are delimited by (* and *), this would appear as

    (* Set z to the maximum of x and y *)
       if x >= y then z := x
                 else z := y

In reading a program containing such a command-comment, the com-
mand-comment is considered to be a command in the program, just as
any other. The program text indented underneath it is considered to be
its refinement - it is a program segment that shows how to perform the
command-comment.
When reading a program containing a command-comment, one need
read the refinement only to understand how its refinement works; other-
wise, one need only read the command-comment itself, which explains
what is to be done. Command-comments can be used to break a program
into pieces to reduce the amount of text the reader must look at in order
to find something. Just as binary search allows one to find a value in a
sorted list in logarithmic time, so judicious use of the command-comment
allows one to wend one's way through a program to find something in a
shorter time.
The use of command-comments during programming can be an invalu-
able aid, for it forces the programmer to be precise and also forces him to
be careful about structuring the program. To be most helpful, and to be
in keeping with the methodology presented in this book, the command-
comment should be written before its refinement.
The command-comment must be precise: it must state exactly what its
refinement does, in terms of its input and output variables. For example,
the command-comment

    Add elements of the array b together

is not precise enough, for it forces the reader to read the refinement in
order to determine where the sum of the array elements is placed. Far
better is the command-comment

    Store the sum of elements of b[0:n-1] into x

or

    Given fixed n ≥ 0 and fixed array b,
    establish x = (Σ j: 0 ≤ j < n: b[j])

As you can see from the last example, the command-comment can be in
the form we have been using throughout the book for specifying a pro-
gram (segment).
Here is the indentation rule for command-comments.

(22.1.7) ● Rule: The command-comment itself has the level of
indentation that any other command in its place would have. Its
refinement, which follows it, is indented 3 or 4 spaces.

Some people use the convention that a command-comment and its
refinement appear at the same level of indentation, e.g.

    (* Set z to the maximum of x and y *)
    if x >= y then z := x
              else z := y;
    k := 20

The reason for not using this convention should be clear from the exam-
ple: one cannot tell where the refinement ends. Much better is to use rule
22.1.7:

    (* Set z to the maximum of x and y *)
       if x >= y then z := x
                 else z := y;
    k := 20

Judicious use of spacing (skipping lines) may help, but no simple rule for
spacing after refinements can cover all cases if refinements are not
indented. So follow rule 22.1.7.
One more point concerning indentation of comments. Don't insert
them in such a manner that the structure of the program becomes hidden.
For example, if a sequence of program commands begin in column 10, no
comment between them should begin in columns to the left of column 10.

Keeping program segments small


One way to keep programs intellectually manageable is to keep pro-
gram segments to a reasonable size, for the amount of detail that can be
understood at any one time is limited. The rule usually used is to keep
the procedural part (not counting specification and declarations) of a pro-
gram segment to one page. This is not much of a restriction if procedures
and macros are used reasonably to present the right level of abstraction
and structure. In fact, it is often hard to make program segments that
long.
The restriction to one page also helps to keep the indentation reason-
able; without the restriction, the indentation can get ridiculously far to the
right.

Procedure headings
As mentioned in chapter 12, the purpose of a procedure is to provide a
level of abstraction: the user of a procedure need only know what the pro-
cedure does and how to call it, and not how the procedure works. To
emphasize this, the procedure declaration should be indented as follows.

(22.1.8) ● Rule: The procedure heading, which includes a list of
the parameters, a specification of the parameters and a description
of what the procedure does, appears at the same level of indenting
as any command would be indented in that context. The procedure
body is indented 3 or 4 columns with respect to the procedure
heading.

It may be reasonable to have a blank line before and after the procedure
declaration in order to set it off from the surrounding text.
As an example, here is a Pascal-like procedure declaration:

    (* Pre: n = N ∧ x = X ∧ b = B ∧ X ∈ B[0:N-1] *)
    (* Post: 0 ≤ i < N ∧ B[i] = X *)
    proc search(value n, x: integer;
                value b: array of integer;
                result i: integer);
       body of procedure

It may be worthwhile to give the pre- and post-conditions less formally
(but not less precisely), as shown below. This is often more understand-
able than the pure predicate calculus approach.

    (* Given fixed n, x and b[0:N-1] satisfying x ∈ b, *)
    (* store a value in i to establish x = b[i]        *)
    proc search(value n, x: integer;
                value b: array of integer;
                result i: integer);
       body of procedure

As an aside, let us illustrate a special problem with PL/I. In PL/I, par-
ameter specifications are treated in the same way as, and may appear
along with, declarations of local variables. For example, one can write

    /* Given fixed n, x and b(0:n-1) satisfying x ∈ b, */
    /* store a value in i to establish x = b(i)        */
    search: PROC (n, x, b, i);
       DCL (n, x, b(*), k, i) FIXED;
       body of procedure
    END;

Writing a call on search requires knowledge of the types of the parame-
ters, and in reading these types one is confronted with the declaration of
the local variable k. To avoid this problem and to give the reader only
what is necessary to write a call, the specification of the parameters
should be separated from the declaration of the local variables:

    /* Given fixed n, x and b(0:n-1) satisfying x ∈ b, */
    /* store a value in i to establish x = b(i)        */
    search: PROC (n, x, b, i);
       DCL (n, x, b(*), i) FIXED;
       DCL k FIXED;
       body of procedure
    END;

22.2 Definitions and Declarations of Variables

The Definition of variables

Here is one of the simplest and most important strategies:

(22.2.1) ● Strategy: Define your variables before you use them, and
then be sure to adhere to the definitions.

This strategy lies behind much of what has been presented in this book.
A definition of a set of variables is simply an assertion about their logical
relationship, which must be true at key places of the program. In the
284 Part III. The Development of Programs

same vein, a loop invariant is only a definition of a set of variables that


holds before and after each loop iteration. The balloon theory of section
16.1 simply gives heuristics for developing definitions of variables (in
some cases) from the specification of the program.
Strategy 22.2.1 seems so obvious; yet apparently it is difficult to learn and practice. Time and again I have found errors in a program -when its owner was unable to do so or thought it correct- by asking the critical question "what do these variables mean?" and, after spending ten minutes with the owner determining what the definitions should have been, pointing out places in the program that destroyed the definitions.
The critical point is to precisely define variables before writing code that uses them and to adhere rigorously to these definitions.
Just as definitions of variables are important during program develop-
ment, so they are important when reading a program. The reader should
first be presented with these definitions, along with text to help under-
stand them. Once the definitions are understood, the program itself is
often obvious. On the other hand, it is grossly unprofessional and unfair
to present a program without precise variable definitions.

Placement of definitions of variables


The proper place to put definitions of (most) variables is at the head of
the program, along with their declarations. This has certain advantages:

1. It forces the grouping of variables by logical relationship (instead of by type or in haphazard order). The declarations for each logically related group of variables, together with their definition, should be set off as a group, perhaps with blank lines before and after it.
2. If written early enough and precisely enough, the definitions
give the programmer an added checklist. Whenever he writes
program text to change one variable of a group, he can refer to
the definition of the group to see what others must be changed
to maintain the definition. Note that the programming method
defined in Part III is oriented towards defining variables before
using them; the program specification, the loop invariant, etc.,
all come before the corresponding program text.
3. The reader knows where to look to understand a use of a
variable: its declaration is accompanied by its definition and the
definition of logically related variables.

4. Comments within the program text, for example command-comments, can refer to the definitions, thus shortening the program. For example, instead of writing the command-comment

       Add 1 to n, and then set b[n] to ... and c to ... and d[n]
       to the maximum of ... and then, if e is ... add to j ...

   one can simply write

       Add 1 to n and reestablish its definition.

   It is then up to the reader to read the definition and see what assertion must be reestablished.
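The command-comment style just described can be made concrete. Below is a small sketch in C (ours, not the book's; the names b, n, total and append are illustrative): the definition of the variable group is stated once, with the declarations, and every command that changes one member of the group must reestablish it.

```c
enum { MAX = 100 };

int b[MAX];     /* b[0:n-1] holds the values appended so far,  */
int n = 0;      /* 0 <= n <= MAX,                              */
int total = 0;  /* and total = b[0] + ... + b[n-1]             */

/* Add 1 to n and reestablish the definition of the group. */
void append(int x) {
    b[n] = x;
    n = n + 1;
    total = total + x;  /* omitting this would destroy total's definition */
}
```

Forgetting the last assignment is exactly the kind of error that checking each command against the group's stated definition catches.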

Examples of declarations and definitions


We present a simple example in Pascal to illustrate proper placement
of declarations and definitions; the reader should be able to extend this to
more complicated and longer sequences of declarations.
The variables are used in a program that maintains a list of employees,
their phone numbers, and the division in the company for which they
work. Within the program, at times it will be necessary to build and process a list of people and their phone numbers in a particular division.
Both lists will be maintained in alphabetical order. The following type
declaration will be used

type String24 = packed array [1..24] of char;
     String8 = packed array [1..8] of char;
     Emprec = record         (* Employee record *)
         name: String24;     (* Employee name: last name, first name *)
         phone: integer;     (* phone number (7 digits) *)
         division: String8   (* Division *)
       end;
     Phonerec = record
         name: String24;     (* Employee name: last name, first name *)
         phone: integer      (* phone number (7 digits) *)
       end;

Note that the format of a name is given. Next is shown an example of how not to write the declarations; the reason will be explained below.

var staff: array [0..10000] of Emprec;
    phones: array [0..1000] of Phonerec;
    staffsize, divsize, i, j: integer;
    div: char;
    q: Phonerec;

These declarations suffer for several reasons. First, the variables have not
been grouped by their logical relationship. From the name staffsize, one
might deduce that this variable is logically related to array staff, but it
need not be so. Also, there is no way to understand the purpose or need
for divsize. Further, the definitions of globally important variables are
mixed up with the definitions of local variables, which are used in only a
few, adjacent places (i and j, for example).
Then there is no definition of the variables. For example, how do we know just where in array staff the employees can be found? Are they
inserted at the beginning of the array, or the end, or in the middle? It has
also not been indicated that the lists are sorted.
Here is a better version of these declarations.

var staff: array [0..10000]    (* staff[0:staffsize-1] are   *)
      of Emprec;               (* the employee records,      *)
    staffsize: integer;        (* in alphabetical order      *)

    phones: array [0..1000]    (* phones[0:divsize-1] are    *)
      of Phonerec;             (* the employees in division  *)
    divsize: integer;          (* whichdiv, in               *)
    whichdiv: String8;         (* alphabetical order         *)

    i, j: integer; q: Phonerec;

Now the variables are grouped according to their logical relationship, and
definitions are given that describe the relationship. These definitions are
actually invariants (but not loop invariants), which hold at (almost) all
places of the program.
Variables i, j and q are presumably used only in a few, localized places, and hence need no definition at this point.
Note carefully the format of the declarations. The variables themselves
begin in the same column, which makes it easy to find a particular vari-
able when necessary. Further, the comments describing each group
appear to the right of the variables, again all beginning in the same
column. Spending a few minutes arranging the declarations in this format
is worthwhile, for it aids the programmer as well as the reader.
One more point. Nothing is worse than a comment like "j is an index into array b". When defining variables, refrain from buzzwords like "pointer", "counter" and "index", for they serve only to point out the laziness and lack of precision of your thought. Of course, at times such comments may be worthwhile, but in general try to be more precise.

22.3 Writing Programs in Other Languages


Until the multiple assignment and guarded command notations find
their way into implemented programming notations, it will be necessary to
translate programs into Pascal, FORTRAN, PL/I or some other notation
in order to be able to execute them on a computer. The multiple assignment, alternative and iterative commands must be simulated using the
commands of the language into which a program is being translated.
Sometimes the translation is easy. For example, an iterative command
with one guarded command can be written using the Pascal or PL/I while
loop, and an alternative command can be written deterministically using
the case or SELECT statement. However, an iterative command with
more than one guarded command has no simple counterpart in these
other languages and must be simulated.
For example, consider program (16.4.5) for the Welfare Crook, which finds the first value f[iv] = g[jv] = h[kv] (which is guaranteed to exist) that occurs in three ordered arrays f[0:?], g[0:?] and h[0:?]:

i, j, k:= 0, 0, 0;
{inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv}
{bound: iv-i + jv-j + kv-k}
do f[i] < g[j] → i:= i+1
 □ g[j] < h[k] → j:= j+1
 □ h[k] < f[i] → k:= k+1
od
{i = iv ∧ j = jv ∧ k = kv}

This program can be written in PL/I as

i = 0; j = 0; k = 0;
/* Simulate 3-guarded-command loop: */
/* inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv */
/* bound: iv-i + jv-j + kv-k */
LOOP:
IF f(i) < g(j) THEN DO; i = i+1; GOTO LOOP; END;
IF g(j) < h(k) THEN DO; j = j+1; GOTO LOOP; END;
IF h(k) < f(i) THEN DO; k = k+1; GOTO LOOP; END;
/* i = iv ∧ j = jv ∧ k = kv */

The convention used here is the following. The simulation of a guarded command loop contains a comment indicating the simulation, a label (LOOP) to jump to for the next iteration, and an IF-statement for each of the guarded commands of the loop. Note that exactly one of the commands of the guarded commands will be executed at each iteration.

The same thing can be done in FORTRAN, although the language hampers succinctness of expression even more, as illustrated below. In the case of FORTRAN, a CONTINUE statement is used at the end of each simulated guarded command loop to indicate where to jump to upon termination of the loop. Don't label the following statement and jump to that labeled statement instead, for then the simulated loop is no longer independent of the rest of the program. It should be possible to take any command that performs a particular task and put it in another program, without modification, and this is not possible if the command contains a jump out of itself.

C Simulate 3-guarded-command loop (labels 20-26)
C inv: 0 ≤ i ≤ iv ∧ 0 ≤ j ≤ jv ∧ 0 ≤ k ≤ kv
C bound: iv-i + jv-j + kv-k
   20 IF (f(i) .GE. g(j)) GOTO 22
      i = i+1
      GOTO 20
   22 IF (g(j) .GE. h(k)) GOTO 24
      j = j+1
      GOTO 20
   24 IF (h(k) .GE. f(i)) GOTO 26
      k = k+1
      GOTO 20
   26 CONTINUE
C {i = iv ∧ j = jv ∧ k = kv}

These examples indicate how guarded commands can be simulated reasonably in other languages. Be sure to use the same conventions for simulating the iterative commands for every iterative command. Unless efficiency of the program is extremely important, don't try to use knowledge of the program to make the simulation more efficient or shorter. For the benefit of yourself and the reader, use the same convention for all similar constructs.
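As a further illustration (ours, not the book's), the same Welfare Crook loop can be simulated in C. The if/else-if chain plays the role of the PL/I LOOP label and GOTOs: each iteration executes the command of exactly one true guard, and the loop exits when all guards are false. The function name welfare_crook and its result parameters are our own choices.

```c
/* Find the first value common to the three ascending arrays f, g and h
   (assumed to exist); *ip, *jp, *kp receive its index in each array. */
int welfare_crook(const int f[], const int g[], const int h[],
                  int *ip, int *jp, int *kp)
{
    int i = 0, j = 0, k = 0;

    /* inv: 0 <= i <= iv, 0 <= j <= jv, 0 <= k <= kv
       bound: (iv-i) + (jv-j) + (kv-k) */
    for (;;) {
        if      (f[i] < g[j]) i = i + 1;   /* first guarded command  */
        else if (g[j] < h[k]) j = j + 1;   /* second guarded command */
        else if (h[k] < f[i]) k = k + 1;   /* third guarded command  */
        else break;   /* all guards false: f[i] = g[j] = h[k] */
    }
    *ip = i; *jp = j; *kp = k;
    return f[i];
}
```

For instance, with f = {1, 4, 7, 9}, g = {2, 4, 8, 9} and h = {3, 5, 9, 10}, the common value is 9, found at positions 3, 3 and 2.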

At this point, we give a program in four different notations: the notation of this book, Pascal, PL/I and FORTRAN. Each is fully documented, under the assumption that no other text will accompany it.

Program in the Notation of this Book

{The n words, n ≥ 0, on line number z begin in columns b[1], ..., b[n].
 Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
 is the total number of blanks to insert between words to justify the
 line. Determine new column numbers b[1:n] to represent the justified
 line. Result assertion R, below, specifies that the numbers of blanks
 inserted between different pairs of words differ by no more than one,
 and that extra blanks are inserted to the left or right, depending on
 the line number. Unless 0 ≤ n ≤ 1, the justified line has the
 following format, where Wi is word i:
     W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn
 where p, q, t satisfy
     Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
         (odd(z) ∧ q = p+1  v  even(z) ∧ p = q+1)
 Using B to represent the initial value of array b, result assertion R is
     R: (0 ≤ n ≤ 1 ∧ b = B) v
        ((A i: 1 ≤ i ≤ t: b[i] = B[i] + p*(i-1)) ∧
         (A i: t < i ≤ n: b[i] = B[i] + p*(t-1) + q*(i-t)))}

proc justify(value n, z, s: integer;
             var b: array of integer);
  var p, q, t, e, k: integer;
  if n ≤ 1 → skip
  □ 1 < n →
      Determine p, q and t:
        if even(z) → q:= s ÷ (n-1); t:= 1 + (s mod (n-1)); p:= q+1
        □ odd(z)  → p:= s ÷ (n-1); t:= n - (s mod (n-1)); q:= p+1
        fi;
      Calculate new column numbers b[1:n]:
        k, e:= n, s;
        {inv: t ≤ k ≤ n ∧ e = p*(t-1) + q*(k-t) ∧
              b[1:k] = B[1:k] ∧ b[k+1:n] has its final values}
        do k ≠ t → b[k]:= b[k]+e; k, e:= k-1, e-q od;
        {inv: 1 ≤ k ≤ t ∧ e = p*(t-1) ∧
              b[1:k] = B[1:k] ∧ b[k+1:n] has its final values}
        do e ≠ 0 → b[k]:= b[k]+e; k, e:= k-1, e-p od
  fi

Program in Pascal

(*The n words, n ≥ 0, on line number z begin in columns b[1], ..., b[n].
  Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
  is the total number of blanks to insert between words to justify the
  line. Determine new column numbers b[1:n] to represent the justified
  line. Result assertion R, below, specifies that the numbers of blanks
  inserted between different pairs of words differ by no more than one,
  and that extra blanks are inserted to the left or right, depending on
  the line number. Unless 0 ≤ n ≤ 1, the justified line has the
  following format, where Wi represents word i:
      W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn
  where p, q, t satisfy
      Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
          (odd(z) ∧ q = p+1  v  even(z) ∧ p = q+1)
  Using B to represent the initial value of array b, result assertion R is
      R: (0 ≤ n ≤ 1 ∧ b = B) v
         ((A i: 1 ≤ i ≤ t: b[i] = B[i] + p*(i-1)) ∧
          (A i: t < i ≤ n: b[i] = B[i] + p*(t-1) + q*(i-t)))*)

procedure justify(n, z, s: integer; var b: array of integer);
var p, q, t, e, k: integer;
begin if n > 1 then
  begin
    (*Determine p, q and t:*)
    if z mod 2 = 0
      then begin q:= s div (n-1); t:= 1 + (s mod (n-1)); p:= q+1 end
      else begin p:= s div (n-1); t:= n - (s mod (n-1)); q:= p+1 end;
    (*Calculate new column numbers b[1:n]:*)
    k:= n; e:= s;
    (*inv: t ≤ k ≤ n ∧ e = p*(t-1) + q*(k-t) ∧
           b[1:k] = B[1:k] ∧ b[k+1:n] has its final values*)
    while k <> t do begin b[k]:= b[k]+e; k:= k-1; e:= e-q end;
    (*inv: 1 ≤ k ≤ t ∧ e = p*(t-1) ∧
           b[1:k] = B[1:k] ∧ b[k+1:n] has its final values*)
    while e <> 0 do begin b[k]:= b[k]+e; k:= k-1; e:= e-p end
  end
end

Program in PL/I

/*The n words, n ≥ 0, on line number z begin in columns b(1), ..., b(n).
  Exactly one blank separates each adjacent pair of words. s, s ≥ 0,
  is the total number of blanks to insert between words to justify the
  line. Determine new column numbers b(1:n) to represent the justified
  line. Result assertion R, below, specifies that the numbers of blanks
  inserted between different pairs of words differ by no more than one,
  and that extra blanks are inserted to the left or right, depending on
  the line number. Unless 0 ≤ n ≤ 1, the justified line has the format
      W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn
  where p, q, t satisfy
      Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
          (odd(z) ∧ q = p+1  v  even(z) ∧ p = q+1)
  Using B to represent the initial value of array b, result assertion R is
      R: (0 ≤ n ≤ 1 ∧ b = B) v
         ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
          (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t)))*/

justify: PROC(n, z, s, b);
  DECLARE (n, z, s, b(*)) FIXED;
  DECLARE (q, p, t, e, k) FIXED;
  IF n > 1 THEN
    DO; /*Determine p, q and t:*/
      IF MOD(z, 2) = 0
        THEN DO; q = s/(n-1);
                 t = 1 + MOD(s, n-1); p = q+1; END;
        ELSE DO; p = s/(n-1);
                 t = n - MOD(s, n-1); q = p+1; END;
      /*Calculate new column numbers b(1:n):*/
      k = n; e = s;
      /*inv: t ≤ k ≤ n ∧ e = p*(t-1) + q*(k-t) ∧
             b(1:k) = B(1:k) ∧ b(k+1:n) has its final values*/
      DO WHILE(k ¬= t); b(k) = b(k)+e;
         k = k-1; e = e-q; END;
      /*inv: 1 ≤ k ≤ t ∧ e = p*(t-1) ∧
             b(1:k) = B(1:k) ∧ b(k+1:n) has its final values*/
      DO WHILE(e ¬= 0); b(k) = b(k)+e;
         k = k-1; e = e-p; END;
    END; END justify;

Program in FORTRAN

In the FORTRAN example given below, note how each guarded com-
mand loop is implemented using an IF-statement that jumps to a labeled
CONTINUE statement. These CONTINUE statements are included only
to keep each loop as a separate entity, independent of the preceding and
following statements.

C The n words, n ≥ 0, on line number z begin in cols b(1), ..., b(n).
C Exactly one blank separates each adjacent pair of words. s, s ≥ 0, is
C the number of blanks to insert between words to right-justify the
C line. Determine new col numbers b(1:n) to represent the justified
C line. Result assertion R, below, specifies that the numbers of blanks
C inserted between different pairs of words differ by no more than one.
C Also, extra blanks are inserted to the left or right, depending on
C the line number. Unless 0 ≤ n ≤ 1, the justified line has the format
C
C     W1 [p+1 blanks] ... [p+1] Wt [q+1] ... [q+1] Wn
C where p, q, t satisfy
C
C     Q1: 1 ≤ t ≤ n ∧ 0 ≤ p ∧ 0 ≤ q ∧ p*(t-1) + q*(n-t) = s ∧
C         (odd(z) ∧ q = p+1  v  even(z) ∧ p = q+1)
C
C Using B to represent the initial value of array b, result assertion R is
C
C     R: (0 ≤ n ≤ 1 ∧ b = B) v
C        ((A i: 1 ≤ i ≤ t: b(i) = B(i) + p*(i-1)) ∧
C         (A i: t < i ≤ n: b(i) = B(i) + p*(t-1) + q*(i-t)))
C
      SUBROUTINE justify(n, z, s, b)
      INTEGER n, z, s, b(n)
C
      INTEGER q, p, t, e, k
      IF (n .LE. 1) GOTO 100
C Determine p, q and t:
      e = z/2
      IF (z .NE. 2*e) GOTO 20
      q = s/(n-1)
      t = 1 + s - q*(n-1)
      p = q+1
      GOTO 30
   20 p = s/(n-1)
      t = n - s + p*(n-1)
      q = p+1
   30 CONTINUE

C Calculate new column numbers b(1:n):
      k = n
      e = s
C Guarded command loop.
C inv: t ≤ k ≤ n ∧ e = p*(t-1) + q*(k-t) ∧
C      b(1:k) = B(1:k) ∧ b(k+1:n) has its final values
   40 IF (k .EQ. t) GOTO 50
      b(k) = b(k)+e
      k = k-1
      e = e-q
      GOTO 40
   50 CONTINUE
C Guarded command loop.
C inv: 1 ≤ k ≤ t ∧ e = p*(t-1) ∧
C      b(1:k) = B(1:k) ∧ b(k+1:n) has its final values
   60 IF (e .EQ. 0) GOTO 70
      b(k) = b(k)+e
      k = k-1
      e = e-p
      GOTO 60
   70 CONTINUE
  100 CONTINUE
      END
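For comparison, here is a hedged fifth rendering in C (ours, not the book's). It follows the Pascal version line by line; like the book's programs it treats b as 1-based, so the caller passes an array whose element 0 is unused.

```c
/* Justify the line: n words on line z start in columns b[1..n]; insert
   s extra blanks by updating the start columns.  Per assertion R in the
   versions above, word i moves right by p*(i-1) for i <= t, and by
   p*(t-1) + q*(i-t) for i > t. */
void justify(int n, int z, int s, int b[])
{
    int p, q, t, e, k;

    if (n <= 1) return;                       /* nothing to justify */

    /* Determine p, q and t: */
    if (z % 2 == 0) { q = s / (n-1); t = 1 + s % (n-1); p = q + 1; }
    else            { p = s / (n-1); t = n - s % (n-1); q = p + 1; }

    /* Calculate new column numbers b[1:n]: */
    k = n; e = s;
    while (k != t) { b[k] += e; k -= 1; e -= q; }   /* words t+1..n */
    while (e != 0) { b[k] += e; k -= 1; e -= p; }   /* words 2..t   */
}
```

With b[1..4] = {1, 4, 7, 10}, s = 5 and z = 1 (odd), the new columns are 1, 5, 10, 15; with z = 2 (even) they are 1, 6, 11, 15, the extra blanks shifting to the left instead.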
Chapter 23
Historical Notes

This chapter contains a brief history of research on programming and a short account of the programming problems presented in this book. It
is a personal view of the field, in that only events that influenced my
research concerning the method of programming are described. For
example, it covers only programming methodology as it relates to sequen-
tial, rather than concurrent, programs. Furthermore, an enormous
amount of research on the theory of correctness of programs goes com-
pletely unmentioned, simply because it did not influence my own ideas
and opinions on the method of programming presented here.

23.1 A Brief History of Programming Methodology

Pre-1960
FORTRAN and FAP, the IBM 7090 assembly language, were my first programming languages, and I loved them. I could code with the best of them, and my flow charts were always neat and clean. In 1962, as a
research assistant on a project to write the ALCOR-ILLINOIS 7090 Algol
60 Compiler, I first came in contact with Algol 60 [39]. Like many, I was
confused on this first encounter. The syntax description using BNF (see
Appendix I) seemed foreign and difficult. Dynamic arrays, which were
allocated on entrance to and deallocated on exit from a block, seemed
wasteful. The use of ":=" as the assignment symbol seemed unnecessary.
The need to declare all variables seemed stupid. Many other things dis-
turbed me.
I'm glad that I stuck with the project, for after becoming familiar with
Algol 60 I began to see its attractions. BNF became a useful tool. I

began to appreciate the taste and style of Algol 60 and of the Algol 60
report itself. And I now agree with Tony Hoare that

Algol 60 was indeed a great achievement, in that it was a significant advance over most of its successors.

Algol 60 has outlived its usefulness as a language, for it is inadequate in many ways (as is FORTRAN). But the lessons learned on that project, in the need for simplicity, taste, precision, and mathematical integrity -in the description of the language as well as the language itself- have had a profound influence on the field.

The 1960s
The 1960s was the decade of syntax and compiling. One sees this in
the wealth of papers on context-free languages, parsing, compilers, com-
piler-compilers and so on. The linguists also got into the parsing game,
and people received Ph.D.s for writing compilers.
Algol was a focal point of much of the research, perhaps because of
the strong influence of IFIP Working Group 2.1 on Algol, which met
once or twice a year (mostly in Europe). (IFIP stands for International
Federation for Information Processing). Among other tasks, WG2.1 pub-
lished the Algol Bulletin in the 1960s, an informal publication with fairly
wide distribution, which kept people up to date on the work being done in
Algol and Algol-like languages.
Few people were involved deeply in understanding programming per se
at that time (although one does find a few early papers on the subject)
and, at least in the early 1960s, people seemed to be satisfied with pro-
gramming as it was being performed. If efforts were made to develop for-
mal definitions of programming languages, they were made largely to
understand languages and compilers, rather than programming. Concepts
from automata theory and formal languages played a large role in these
developments, as is evidenced by the proceedings [42] of one important
conference that was held under IFIP's auspices.
A few isolated papers and discussions did give some early indications
that much remained to be done in the field of programming. One of the
first references to the idea of proving programs correct was in a stimulat-
ing paper [35] presented in 1961 and again at the 1962 IFIP Congress by
John McCarthy (then at M.I.T., now at Stanford University). In that
paper, McCarthy stated that "instead of trying out computer programs on
test cases until they are debugged, one should prove that they have the
desired properties." And, at the same Congress, Edsger W. Dijkstra (Technological University Eindhoven, the Netherlands, and later also with Burroughs) gave a talk titled Some meditations on advanced programming [11]. At the 1965 IFIP Congress, Stanley Gill, of England, remarked that "another practical problem, which is now beginning to loom very large indeed and offers little prospect of a satisfactory solution, is that of checking the correctness of a large program."
But, in the main, the correctness problem was attacked by the more
theoretically inclined researchers only in terms of the problem of formally
proving the equivalence of two different programs; this approach has not
yet been that useful from a practical standpoint.
As the 1960s progressed, it was slowly realized that there really were
immense problems in the software field. The complexity and size of pro-
jects increased tremendously in the 1960s, without commensurate increases
in the tools and abilities of the programmers; the result was missed dead-
lines, cost overruns and unreliable software. In 1968, a NATO Conference on Software Engineering was held in Garmisch, Germany, [6] in order to discuss the critical situation. Having received my degree (Dr. rer. nat.) two years earlier in Munich under F.L. Bauer, one of the major
Thus, I was able to listen to the leading figures from academia and indus-
try discuss together the problems of programming from their two, quite
different, viewpoints. People spoke openly about their failures in soft-
ware, and not only about their successes, in order to get to the root of the
problem. For the first time, a consensus emerged that there really was a
software crisis, that programming was not very well understood.
In response to the growing awareness, in 1969 IFIP approved the formation of Working Group 2.3 on programming methodology, with Michael Woodger (National Physical Laboratory, England) as chairman. Some of its members -including Dijkstra, Brian Randell (University of Newcastle upon Tyne), Doug Ross (Softech), Gerhard Seegmueller (Technical University Munich), Wlad M. Turski (University of Warsaw) and Niklaus Wirth (Eidgenossische Technische Hochschule, Zurich)- had resigned from WG2.1 earlier when Algol 68 was adopted by WG2.1 as the "next Algol". Their growing awareness of the problems of programming had convinced them that Algol 68 was a step in the wrong direction, that a smaller, simpler programming language and description was necessary.
Thus, just around 1970, programming had become a recognized, respectable -in fact, critical- area of research. Dijkstra's article on the harmfulness of the goto in 1968 [12] had stirred up a hornets' nest. And his monograph Notes on Structured Programming [14] (in which the term was introduced in the title but never used in the text), together with Wirth's article [44] on stepwise refinement, set the tone for many years to come.

The early work on program correctness


Although proving programs correct had been mentioned many times,
few people worked in that area until the late 1960s. However, three
important articles written in the 1960s had a profound impact on the field.
The first article on proving programs correct was by Peter Naur (Univer-
sity of Copenhagen) in 1966 [40]. In the paper, Naur emphasized the
importance of program proofs and provided an informal technique for
specifying them.
A seminal piece of work was presented by Robert Floyd at a meeting
of the American Mathematical Society in 1967 [19]. In his talk, Floyd
discussed attaching assertions to the edges of a flow chart, with the meaning that each assertion would be true during execution of the corresponding program whenever execution reached that edge. For a loop -i.e. a cycle of the flow chart- Floyd placed an assertion P on an arbitrary (but
fixed, of course) edge of the cycle, called a cut point. He would then
prove that if execution of the cycle beginning at the cut point with P true
reached the cut point again, P would still be true at that point. Thus was
born the idea of a loop invariant. Floyd also suggested that a specifica-
tion of proof techniques could provide an adequate definition of a pro-
gramming language.
Tony Hoare (then at the University of Belfast; now at Oxford) took
Floyd's suggestion to heart in his article [27] and defined a small pro-
gramming language in terms of a logical system of axioms and inference
rules for proving partial correctness of programs -an extension to the
predicate calculus. For example, the assignment statement was defined by the axiom (schema)

    P_e^x {x:= e} P

(where P_e^x denotes P with every occurrence of x replaced by e), and the while loop was defined by an inference rule:

    P ∧ B {S} P
    ------------------------
    P {while B do S} P ∧ ¬B

This inference rule means: if P ∧ B {S} P has been proved, one may infer that P {while B do S} P ∧ ¬B holds also.
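As a worked instance (our illustration, not from Hoare's paper), take P: x ≥ 0, B: x > 0 and S: x:= x-1. The premise follows from the assignment axiom together with the rule of consequence, since x ≥ 0 ∧ x > 0 implies x-1 ≥ 0:

```latex
% Premise: from the assignment axiom, (x-1 >= 0) {x := x-1} (x >= 0);
% by consequence, since x >= 0 /\ x > 0 implies x-1 >= 0, we obtain
% the top line, and the while rule then yields the bottom line.
\[
\frac{x \ge 0 \wedge x > 0\ \{x := x - 1\}\ x \ge 0}
     {x \ge 0\ \{\mathbf{while}\ x > 0\ \mathbf{do}\ x := x - 1\}\ x \ge 0 \wedge \lnot(x > 0)}
\]
% On termination x >= 0 and x <= 0 both hold, hence x = 0.
```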
Hoare's article attempts to deal directly with the programming problem. It restricts the programming language to "manageable" control structures (instead of dealing with flow charts). It attempts to convey the need for such restrictions. It shows how defining the language in terms of how to prove a program correct, instead of how to execute it, might lead to a simpler design. The tone of the article, together with its comprehensive evaluation of the possible benefits to be gained by adopting the axiomatic approach to language definition, both for programming and for formal language definition, are what made the article so unique.
I must confess that I was not turned on by Hoare's paper in 1969. I
did not understand its implications -perhaps I was not ready to think
deeply about the problems of programming because I was too involved in
the field of compiler construction at that time. This involvement illus-
trates one reason why, today, teaching of programming has lagged so far
behind the research; by and large, people are too busy performing their
own research to spend time thinking about and learning about program-
ming. This is a pity, for teaching programming well is an important part
of our task as computer scientists and educators.
Two years later, I did become more interested and began to understand
the implications of Hoare's work. In fact, I was so impressed that in 1972
I made the subject of loop invariants part of the content of the second
programming course at Cornell University and included it in the text [9].

Research on axiomatic definitions in the 1970s


Tony Hoare's article [27] founded a whole school of research on the
axiomatic definition of programming languages. Today, there are literally
hundreds of papers dealing with the axiomatic treatment of various con-
structs, from the assignment command to various forms of loops to procedure calls to coroutines. Even the goto has been axiomatized, and, indeed, very simply.
The research was fraught with lack of understanding and frustration.
One reason for this was that computer scientists in the field, as a whole,
did not know enough formal logic. Some papers were written simply
because the authors didn't understand earlier work; others contained
errors that wouldn't have happened had the authors been educated in
logic.
It is difficult to do a good job developing an axiomatic system when
the only place you have seen such a system is in Hoare's 1969 paper [27],
and yet, I and others operated under just those circumstances. We spent
a good deal of time thrashing, just treading water, instead of swimming,
because of our ignorance. With hindsight, I can say that the best thing
for me to have done 10 years ago would have been to take a course in
logic. I persuaded many students to do so, but I never did so myself.
Let me list a number of achievements in the field of axiomatization during the 1970's. Some have been independently achieved and published by others also; these are the papers that influenced me. Also, the work was usually done a year or two before the date of publication.

In 1971, proof rules for restricted procedure calls were developed [28]. Several later papers, which built on the results of [28], contained mistakes that [28] didn't have.
In 1972, article [29] on the proof of correctness of data represen-
tations did much to spur on research in "abstract data types".
The algebraic specification of data types [25] came later, based
on initial work in 1974 in [33].
In 1973, proof rules were written for most of the programming
language Pascal [30]. This work included the first proof rule for
assignment to array elements, viewing an array as a function as
in chapter 9. The work was actually hampered by the language
in a number of places; it is easier to build a language with ax-
iomatization in mind than to axiomatize as an afterthought.
In 1975, an automatic verification system for (much of) Pascal,
based on axioms and inference rules, was developed [32].
In 1975, a model of execution of a program was used to prove
the "relative completeness" of a set of axioms and inference
rules for an Algol fragment [8]. Thus, it was shown that if a
program could not be proved correct within the axiomatic sys-
tem, then the fault could be attributed to something other than
the axiomatic definition of the language, for example, on the
fact that any axiomatization of the integers is incomplete.
In 1979, the programming language Euclid was defined with the idea of axiomatization imposed on the project from the beginning [34].
In 1980, a general multiple assignment statement, including
assignment to array elements, was defined and used to describe
axioms for procedure calls [23]. This paper clarified some of
the problems concerning initial and final values of variables.

Research in developing proof and program hand-in-hand


In the early 1970's one often heard the cry: the disadvantage of Hoare's
stuff is that it forces you to find an invariant of each loop! Others
shouted back, including myself: the advantage of Hoare's stuff is that it
forces you to find an invariant of each loop!
To some extent, the first cry was on the mark at that time. When the
theory was first presented, it seemed terribly difficult to prove an existing
program correct, and it was soon admitted that the only way to prove a
program correct was to develop the proof and program hand-in-hand
-with the former leading the way.

And yet, we didn't really know how to do this. For example, we knew
that the loop invariant should come before the loop, but we had no good
methods for doing so and certainly could not teach others to do it. The
arguments went back and forth for some time, with those in favor of loop
invariants becoming more adept at producing them and coming up with
more and more examples to back up their case.
The issue was blurred by the varying notions of the word proof. Some
felt that the only way to prove a program correct formally was to use a
theorem prover or verifier. Some argued that mechanical proofs were and
would continue to be useless, because of the complexity and detail that
arose. Others argued that mechanical proofs were useless because no one
could read them. Article [10] contains a synthesis of arguments made
against proofs of correctness of programs, and it is suggested reading. In
this book, a middle view has been used: one should develop a proof and
program hand-in-hand, but the proof should be a mixture of formality
and common sense.
Several forums existed throughout the 1970's for discussing technical
work on programming. Besides the usual conferences and exchanges, two
other forums deserve mention. First, IFIP Working Group 2.3 on pro-
gramming methodology, and later WG2.1, WG2.2 and WG2.4, were used
quite heavily to present and discuss problems related to programming.
Since its formation, WG2.3 has met once or twice a year for five days to
discuss various aspects of programming. No formal proceedings have ever
emerged from the group; rather the plan has been to provide a forum for
discussion and cross-fertilization of ideas, with the results of the interac-
tion appearing in the normal scientific publications of its members. The
group has produced an anthology of already-published articles by its
members [22], which illustrates well the influence of WG2.3 on the field of
programming during the 1970s. It is recommended reading for those
interested in programming methodology.
Secondly, several two-week courses were organized throughout the
1970's by the Technical University Munich. These courses were taught by
the leaders in the field and attended by advanced graduate students,
young Ph.D.s, scientists new to the field and people from industry from
Europe, the U.S. and Canada; they were not just organized to teach a
subject but to establish a forum for discussion of ongoing research in a
very well-organized fashion. Many of the ones dealing with programming
itself (some were on compiling, operating systems, etc.) were sponsored by
NATO. These schools are unusual in that 50 to 100 researchers were
together for two weeks to discuss one topic. The lectures of many of the
schools have been published -see for example [2], [4] and [3].
Back to the development of programs. In 1975, Edsger W. Dijkstra
published a paper [15], which was a forerunner to his book [16]. The

book introduced weakest preconditions for a small language and then


showed, through many examples, how they could be used as a "calculus
for the derivation of programs".
For the first time, it began to become clear how one could develop a
loop invariant (before the loop). It became clear that emphasis on theory
and formalism, but tempered with common sense, could actually lead to
the development of programs in a more reliable manner. The concepts
and principles on which a science of programming could be founded
began to emerge.
The text you are now reading is my attempt to convey these concepts
and principles, to show how programming can be practised as a science.

23.2 The Problems Used in the Book


The following list gives the history of problems, as far as I know them,
in the order in which they appear in this book.

The Coffee Can Problem (Chapter 13). Dijkstra mentioned the problem
in a letter in Fall 1979; he learned of it from his colleague, Carel Schol-
ten. It took five minutes to solve.
Closing the Curve (Chapter 13). John Williams (then at Cornell, now at
IBM, San Jose) asked me to solve this problem in 1973. I was not able
to do so, and Williams had to give me the answer.
The Maximum Problem (Chapter 14). [16], pp. 52-53.
The Next Higher Permutation Problem (exercise 2 of chapter 14 and
exercise 2 of Chapter 20). The problem has been around for a long
time; the development is from [16], pp. 107-110.
Searching a Two-dimensional Array (sections 15.1, 15.2). My solution.
Four-tuple Sort (section 15.2). [16], p. 61.
gcd(x, y) (exercise 2 of section 15.2). This, of course, goes back to Euclid.
The versions presented here are largely from [16].
Approximating the Square Root (sections 16.2, 16.3 and 19.3). [16], pp.
61-65.
Linear Search and the Linear Search Principle (section 16.2). The devel-
opment is from [16], pp. 105-106.
The Plateau Problem (section 16.3). I used this problem to illustrate loop
invariants at a conference in Munich, Germany, in 1974. Because of
lack of experience, my program used too many variables (see the discus-
sion at the end of section 16.3). Michael Griffiths (University of
Nancy) wrote a recursive definition of the plateau of an array and then
changed the definition into an iterative program; the result was a pro-
gram similar to (16.3.11). The idealized development given in section

16.3 came much later.


Binary Search (exercise 4 of section 16.3). The development given in the
answers to the exercise is due to Dijkstra, in 1978.
The Welfare Crook (section 16.4). Due to a colleague of Dijkstra's, Wim
Feijen, this problem was used as an exercise at the International Sum-
mer School in Marktoberdorf, Germany, in 1975 [2]. Blame the partic-
ular setting on me.
Swapping Equal-Length Sections (section 16.5). Part of the folklore.
Array Reversal (exercise 2 of section 16.5). Part of the folklore.
Partition (exercise 4 of section 16.5). This is used in a sorting algorithm
developed by Tony Hoare (Oxford) in 1962 [26]. The solution is mine.
Dutch National Flag (exercise 5 of section 16.5). [16], pp. 111-116.
Link Reversal (exercise 6 of section 16.5). This has been a favorite exer-
cise and test question in the second programming course at Cornell for
years.
Saddleback Search (exercise 8 of section 16.5). A prospective graduate
student from Berkeley gave me this problem in Spring 1980 during a
discussion on programming; Gary Levin (then a Cornell graduate stu-
dent, now at the University of Arizona) solved the problem essentially
as given in the answer to the exercise.
Decimal to Binary (exercise 9 of section 16.5). Part of the folklore.
Decimal to Base B (exercise 10 of section 16.5). A simple generalization
of Decimal to Binary.
Swapping Sections (section 18.1). The problem was given to Harlan Mills
(IBM, Maryland) and me at the 1980 IFIP Congress in Japan, by Ed
Nelson, who had difficulty solving it. This is one of two solutions we
came up with while traveling [24]. The history of a third solution in
terms of reversing array sections -the answer to exercise 1 of section
18.1 is Reverse(b, m, n-1); Reverse(b, n, p-1); Reverse(b, m,
p-1)- is lost in the sands of time. Shown to me by Alan Demers
(Cornell), the third solution is used in the UNIX editor and in the
Terak screen editor on which I have typed and edited most of this
book.
Quicksort (section 18.2). This is due to Hoare [26].
Counting the Nodes of a Tree (section 18.3). In 1972, this problem was
used to illustrate how difficult it was to find loop invariants [5]! Now,
the problem seems almost trivial.
Preorder, Inorder, Postorder Traversal (section 18.3). The names were
invented by Donald Knuth (Stanford University); the program deriva-
tions are mine.
An Exercise Attributed to Hamming (section 19.2). Dijkstra derives this
program in [16], pp. 129-134; he attributes the problem to R.W. Ham-
ming (Bell Laboratories).
Finding Sums of Squares (exercise 1 of section 19.2). [16], pp. 140-142.

Exponentiation (section 19.1 and exercise 15 of chapter 20). The devel-


opment as done in section 19.1 appears in [16], pp. 65-67. The program
of exercise 15, which processes the binary representation in a different
order, was shown to me by John Williams. I once listened to two com-
puter scientists discuss exponentiation talk right past each other; each
thought he was talking about the exponentiation routine, not knowing
that the other existed.
Controlled Density Sorting (section 19.3). Robert Melville derived this
algorithm as part of his Ph.D. thesis at Cornell [36]; it appeared in [37].
Efficient Queues in LISP (section 19.3). Robert Melville derived this
algorithm as part of his Ph.D. thesis at Cornell [36].
Justifying Lines of Text (section 20.1). The derivation first appeared in
[21].
The Longest Upsequence (section 20.2). Dijkstra gave this as an exercise
a day before he derived it at the 1978 Marktoberdorf Course on Pro-
gram Construction [4]. Four or five people present, who were experi-
enced in the method of programming, had no difficulty with it; the rest
of the audience did. Jay Misra (University of Texas, Austin) had pre-
sented a similar solution earlier in a paper on program development
[38], and a generalization of it is used in the UNIX program DIFF [31].
Unique 5-bit Subsequences (exercise 1, chapter 20). In [13].
Different Adjacent Subsequences (exercise 2, chapter 20). In [13].
Perm-to-Code (exercise 5 of chapter 20). This problem was solved by
Dijkstra and his colleague, Willem H.J. Feijen, in connection with in-
verting programs (see chapter 21) in [17]. The concept of inverting pro-
grams and most of the inversions presented in chapter 21 are due to
them.
Code-to-Perm (exercise 6 of chapter 20). See Perm-to-Code.
Appendix 1
Backus-Naur Form

BNF, or Backus-Naur form, is a notation for describing (part of) the


syntax of "sentences" of a language. It was proposed in about 1959 for
describing the syntax of Algol 60 by John Backus, one of the thirteen
people on the Algol 60 committee. (John Backus, of IBM, was also one
of the major figures responsible for FORTRAN.) Because of his modifi-
cations and extensive use of BNF as editor of the Algol 60 report, Peter
Naur (University of Copenhagen) is also associated with it. The ideas
were independently discovered earlier by Noam Chomsky, a linguist, in
1956. BNF and its extensions have become standard tools for describing
the syntax of programming notations, and in many cases parts of com-
pilers are generated automatically from a BNF description.
We will introduce BNF by using it to describe digits, integer constants,
and simplified arithmetic expressions.
In BNF, the fact that 1 is a digit is expressed by

(A1.1) <digit> ::= 1

The term <digit> is delimited by angular brackets to help indicate that it


cannot appear in "sentences" of the language being described, but is used
only to help describe sentences. It is a "syntactic entity", much like "verb"
or "noun phrase" in English. It is usually called a nonterminal, or nonter-
minal symbol. The symbol 1, on the other hand, can appear in sentences
of the language being described, and is called a terminal.
(A1.1) is called a production, or (rewriting) rule. Its left part (the sym-
bol to the left of ::=) is a nonterminal; its right part (to the right of ::=) is
a nonempty, finite sequence of nonterminals and terminals. The symbol
::= is to be read as "may be composed of", so that (A1.1) can be read as

A <digit> may be composed of the sequence of symbols: 1

Two rules can be used to indicate a <digit> may be a 0 or a 1:

<digit> ::= 0    (a <digit> may be composed of 0)
<digit> ::= 1    (a <digit> may be composed of 1)

These two rules, which express different forms for the same nonterminal,
can be abbreviated using the symbol |, read as "or", as

<digit> ::= 0 | 1    (A <digit> may be composed of 0 or 1)

This abbreviation can be used in specifying all the digits:

<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

An integer constant is a (finite) sequence of one or more digits. Using


the nonterminal <constant> to represent the class of integer constants,
integer constants are defined recursively as follows:

(A1.2) <constant> ::= <digit>
       <constant> ::= <constant> <digit>
       <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

The first rule of (A1.2) is read as follows: a <constant> may be composed


of a <digit>. The second rule is read as follows: a <constant> may be
composed of another <constant> followed by a <digit>.
The rules listed in (A1.2) form a grammar for the language of <con-
stant>s; the sentences of the language are the sequences of terminals that
can be derived from the nonterminal <constant>, where sequences are
derived as follows. Begin with the sequence of symbols consisting only of
the nonterminal <constant>, and successively rewrite one of the nonter-
minals in the sequence by a corresponding right part of a rule, until the
sequence contains only terminals.
Indicating the rewriting action by =>, the sentence 325 is derived:

(A1.3) <constant> => <constant> <digit>    (rewrite using second rule)
                  => <constant> 5          (rewrite <digit> as 5)
                  => <constant> <digit> 5  (rewrite using second rule)
                  => <constant> 2 5        (rewrite <digit> as 2)
                  => <digit> 2 5           (rewrite using first rule)
                  => 3 2 5                 (rewrite <digit> as 3)
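A derivation like (A1.3) is mechanical enough to replay in a program. The following Python fragment is an illustrative sketch, not from the book: the rule table and the function `rewrite` are my own encoding of grammar (A1.2) and of a single rewriting step.

```python
# Grammar (A1.2) as Python data; nonterminals are written '<constant>'
# and '<digit>', terminals are one-character strings.
rules = {
    '<constant>': [['<digit>'], ['<constant>', '<digit>']],
    '<digit>': [[d] for d in '0123456789'],
}

def rewrite(seq, pos, alternative):
    """One derivation step: replace the nonterminal at index pos
    by one of its right parts."""
    assert seq[pos] in rules and alternative in rules[seq[pos]]
    return seq[:pos] + alternative + seq[pos+1:]

# Replay derivation (A1.3): <constant> =>* 3 2 5
s = ['<constant>']
s = rewrite(s, 0, ['<constant>', '<digit>'])  # => <constant> <digit>
s = rewrite(s, 1, ['5'])                      # => <constant> 5
s = rewrite(s, 0, ['<constant>', '<digit>'])  # => <constant> <digit> 5
s = rewrite(s, 1, ['2'])                      # => <constant> 2 5
s = rewrite(s, 0, ['<digit>'])                # => <digit> 2 5
s = rewrite(s, 0, ['3'])                      # => 3 2 5
print(''.join(s))  # 325
```

Each call to `rewrite` corresponds to one application of => in (A1.3); the assertion checks that only rules of the grammar are used.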

The derivation of one sequence of symbols from another can be defined


schematically as follows. Suppose U ::= u is a rule of the grammar,
where U is a nonterminal and u a sequence of symbols. Then, for any

(possibly empty) sequences x and y define

x U y  =>  x u y

The symbol => denotes a single derivation -one rewriting action. The
symbol =>* denotes a sequence of zero or more single derivations. Thus,

<constant> 1 =>* <constant> 1

since <constant> 1 can be derived from itself in zero derivations. Also

<constant> 1 =>* <constant> <digit> 1

because of the second rule of grammar (A1.2). Finally,

<constant> 1 =>* 3 2 5 1

since (A1.3) showed that <constant> =>* 3 2 5.

A grammar for (simplified) arithmetic expressions


Now consider writing a grammar for arithmetic expressions that use
addition, binary subtraction, multiplication, parenthesized expressions,
and integer constants as operands. This is fairly easy to do:

<expr> ::= <expr> + <expr>


<expr> ::= <expr> - <expr>
<expr> ::= <expr> * <expr>
<expr> ::= ( <expr> )
<expr> ::= <constant>

where <constant> is as described above in (A1.2). Here is a derivation of


the expression (1+3)*4 according to this grammar:

(A1.4) <expr> => <expr> * <expr>
              => ( <expr> ) * <expr>
              => ( <expr> + <expr> ) * <expr>
              => ( <constant> + <expr> ) * <expr>
              => ( <constant> + <constant> ) * <expr>
              => ( <constant> + <constant> ) * <constant>
              => ( <digit> + <constant> ) * <constant>
              => ( <digit> + <digit> ) * <constant>
              => ( <digit> + <digit> ) * <digit>
              => ( 1 + <digit> ) * <digit>
              => ( 1 + 3 ) * <digit>
              => ( 1 + 3 ) * 4

Hence, <expr> =>* ( 1 + 3 ) * 4.

Syntax trees and ambiguity


A sequence of derivations can be described by a syntax tree. As an
example, the syntax tree for derivation (A1.4) is

                        <expr>
                   /      |      \
              <expr>      *      <expr>
            /   |   \              |
           ( <expr> )         <constant>
           /    |    \             |
      <expr>    +    <expr>     <digit>
         |              |          |
     <constant>    <constant>      4
         |              |
      <digit>        <digit>
         |              |
         1              3

In the syntax tree, a single derivation using the rule U ::=u is expressed
by a node U with lines emanating down to the symbols of the sequence
u. Thus, for every single derivation in the sequence of derivations there is
a nonterminal in the tree, with the symbols that replace it underneath.
For example, the first derivation is <expr> => <expr> * <expr>, so
at the top of the diagram above is the node <expr> and this node has
lines emanating downward from it to <expr>, * and <expr>. Also,
there is a derivation using the rule <digit> ::= 1, so there is a
corresponding branch from <digit> to 1 in the tree.
The main difference between a derivation and its syntax tree is that the
syntax tree does not specify the order in which some of the derivations
were made. For example, in the tree given above it cannot be determined
whether the rule <digit> ::= 1 was used before or after the rule <digit>
::= 3. To every derivation there corresponds a syntax tree, but more than
one derivation can correspond to the same tree. These derivations are
considered to be equivalent.

Now consider the set of derivations expressed by <expr> =>*


<expr> + <expr> * <expr>. There are actually two different deriva-
tion trees for the two derivations of <expr> + <expr> * <expr>:

         <expr>                          <expr>
       /   |   \                       /   |   \
  <expr>   +   <expr>             <expr>   *   <expr>
             /   |   \          /   |   \
        <expr>   *   <expr> <expr>  +   <expr>

A grammar that allows more than one syntax tree for some sentence is
called ambiguous. This is because the existence of two syntax trees allows
us to "parse" the sentence in two different ways, and hence to perhaps
give two meanings to it. In this case, the ambiguity shows that the gram-
mar does not indicate whether + should be performed before or after *.
The syntax tree to the left (above) indicates that * should be performed
first, because the <expr> from which it is derived is in a sense an
operand of the addition operator +. On the other hand, the syntax tree to
the right indicates that + should be performed first.
One can write an unambiguous grammar that indicates that multiplica-
tion has precedence over plus (except when parentheses are used to over-
ride the precedence). To do this requires introducing new nonterminal
symbols, <term> and <factor>:
<expr>     ::= <term> | <expr> + <term>
             | <expr> - <term>
<term>     ::= <factor> | <term> * <factor>
<factor>   ::= <constant> | ( <expr> )
<constant> ::= <digit>
<constant> ::= <constant> <digit>
<digit>    ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
In this grammar, each sentence has one syntax tree, so there is no
ambiguity. For example, the sentence 1+3*4 has one syntax tree:

              <expr>
            /   |   \
       <expr>   +   <term>
         |        /   |   \
      <term>  <term>  *  <factor>
         |       |          |
    <factor> <factor>  <constant>
         |       |          |
  <constant> <constant>  <digit>
         |       |          |
     <digit>  <digit>       4
         |       |
         1       3

This syntax tree indicates that multiplication should be performed first,


and, in general, in this grammar * has precedence over + except when the
precedence is overridden by the use of parentheses.
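The way the unambiguous grammar enforces precedence can be made concrete with a small recursive-descent evaluator, one function per nonterminal, so that * binding tighter than + and - falls out of the call structure. This Python sketch is mine, not from the book; the tokenizer and function names are assumptions for illustration.

```python
import re

def evaluate(text):
    # Split into integer constants and the terminals + - * ( )
    tokens = re.findall(r'\d+|[-+*()]', text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expr():            # <expr> ::= <term> { + <term> | - <term> }
        nonlocal pos
        v = term()
        while peek() in ('+', '-'):
            op = tokens[pos]; pos += 1
            v = v + term() if op == '+' else v - term()
        return v

    def term():            # <term> ::= <factor> { * <factor> }
        nonlocal pos
        v = factor()
        while peek() == '*':
            pos += 1
            v = v * factor()
        return v

    def factor():          # <factor> ::= <constant> | ( <expr> )
        nonlocal pos
        if peek() == '(':
            pos += 1       # skip '('
            v = expr()
            pos += 1       # skip ')'
            return v
        v = int(tokens[pos]); pos += 1
        return v

    return expr()

print(evaluate('1+3*4'))    # 13: * is performed first
print(evaluate('(1+3)*4'))  # 16: parentheses override the precedence
```

Because `expr` calls `term`, which calls `factor`, a `*` is always attached lower in the call tree than a `+`, mirroring the unique syntax tree above.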

Extensions to BNF
A few extensions to BNF are used to make it easier to read and under-
stand. One of the most important is the use of braces to indicate repeti-
tion: {x} denotes zero or more occurrences of the sequence of symbols x.
Using this extension, we can describe <constant> using one rule as

<constant> ::= <digit> {<digit>}

In fact, the grammar for arithmetic expressions can be rewritten as

<expr>     ::= <term> { + <term> | - <term> }
<term>     ::= <factor> { * <factor> }
<factor>   ::= <constant> | ( <expr> )
<constant> ::= <digit> { <digit> }
<digit>    ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

References
The theory of syntax has been studied extensively. An excellent text
on the material is Introduction to Automata Theory, Languages and
Computation (Hopcroft, J.E. and J.D. Ullman; Addison-Wesley, 1979).
The practical use of the theory in compiler construction is discussed in the
texts Compiler Construction for Digital Computers (Gries, D.; John
Wiley, 1971) and Principles of Compiler Design (Aho, A.V., and J. Ull-
man; Addison-Wesley, 1977).
Appendix 2
Sets, Sequences, Integers, and Real Numbers

This appendix briefly defines the important types of variables used


throughout the book. Sets will be described in more detail than the oth-
ers, so that the reader can learn important material he might have missed
earlier in his education.

Sets and operations on them


A set is a collection of distinct objects, or elements, as they are usually
called. Because the word collection is just as vague as set, we give some
examples to make the idea more concrete.

The set {3, 5} consists of the integers 3 and 5.


The set {5, 3} consists of the integers 3 and 5.
The set {3, 3, 5} consists of the integers 3 and 5.
The set {3} consists of the integer 3.
The set {} is called the empty set; it contains no elements. The
Greek character φ is sometimes used to denote the empty set.

These examples illustrate one way of describing a set: write its elements as
a list within braces { and}, with commas joining adjacent elements. The
first two examples illustrate that the order of the elements in the list does
not matter. The third example illustrates that an element listed more than
once is considered to be in the set only once; elements of a set must be
distinct. The final example illustrates that a set may contain zero ele-
ments, in which case it is called the empty set.
It is not possible to list all elements of an infinite set (a set with an
infinite number of elements). In this case, one often uses dots to indicate
that the reader should use his imagination, but in a conservative fashion,

in extending the list of elements actually given. For example,

{0, 1, 2, ...} is the set of natural numbers


{..., -2, -1, 0, 1, 2, ...} is the set of all integers
{1, 2, 4, 8, 16, 32, ...} is the set of powers of 2.

We can be more explicit using a different notation:

{i | there is a natural number j satisfying i = 2^j}

In this notation, between { and | is an identifier i; between | and } is a


true-false statement -a predicate. The set consists of all elements i that
satisfy the true-false statement. In this case, the set consists of the powers
of 2. The following describes the set of all even integers.

{k | even(k)}

The notation can be extended somewhat:

{(i, j) | i = j+1}

Assuming i and j are integer-valued, this describes the set of pairs

{..., (-1, -2), (0, -1), (1, 0), (2, 1), (3, 2), ...}

The cardinality or size of a set is the number of elements in it. The


notations |a| and card(a) are often used to denote the cardinality of set
a. Thus, |{}| = 0, |{1, 5}| = 2, and card({3, 3, 3}) = 1.
The following three operations are used to build new sets: set union ∪, set
intersection ∩ and set difference -.

a ∪ b  is the set consisting of elements that are in
       at least one of the sets a and b
a ∩ b  is the set consisting of elements that are in
       both a and b
a - b  is the set consisting of elements that are in
       a but not in b

For example, if a = {A, B, C} and b = {B, C, D}, then a ∪ b =


{A, B, C, D}, a ∩ b = {B, C} and a - b = {A}.
Besides tests for set equality or inequality (e.g. a = b, a ≠ {2, 3, 5}),
the following three operations yield a Boolean value T or F (true or
false):

x ∈ a  has the value of "x is a member of set a"


x ∉ a  is equivalent to ¬(x ∈ a)
a ⊆ b  has the value of "set a is a subset of set b"
       -i.e. each element of a is in b

Thus, 1 ∈ {2, 3, 5, 7} is false, 1 ∉ {2, 3, 5, 7} is true, {1, 3, 5} ⊆ {1, 3, 5} is


true, {3, 1, 5} ⊆ {1, 3, 5, 0} is true, and {3, 1, 5, 2} ⊆ {1, 3, 5, 0} is false.
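For comparison, Python's built-in sets provide these operations directly; the following sketch is mine, not from the book, and simply replays the examples above.

```python
# Python set operators: | is union, & is intersection, - is difference,
# in tests membership, <= tests the subset relation.
a = {'A', 'B', 'C'}
b = {'B', 'C', 'D'}

print(a | b == {'A', 'B', 'C', 'D'})   # True: union
print(a & b == {'B', 'C'})             # True: intersection
print(a - b == {'A'})                  # True: difference

print(1 in {2, 3, 5, 7})               # False: membership
print(1 not in {2, 3, 5, 7})           # True
print({1, 3, 5} <= {1, 3, 5})          # True: a set is a subset of itself
print({3, 1, 5, 2} <= {1, 3, 5, 0})    # False: 2 is not in the right set
print(len({3, 3, 3}))                  # 1: elements of a set are distinct
```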
It is advantageous from time to time to speak of the minimum and
maximum values in a set (if an ordering is defined on its elements):
min(a) denotes the minimum value in set a and max(a) the maximum
value.
Finally, we describe a command that is useful when programming
using sets. Let a be a nonempty set and x be a variable that can contain
values of the type of the set elements. Execution of the command

Choose(a, x)

stores in x one element of set a. Set a remains unchanged. This com-


mand is nondeterministic (see chapter 7), because it is not known before
its execution which element of a will be stored in x. The command is
assumed to be used only with finite sets. Its use with infinite sets causes
problems, which are beyond the scope of this book (see [16] for a discus-
sion of unbounded and bounded nondeterminism). Further, in our pro-
grams the sets we deal with are finite.
Choose(a, x) is defined in terms of weakest preconditions as follows
(see chapter 7):

wp(Choose(a, x), R) = a ≠ {} ∧ (A i: i ∈ a: R^x_i)
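An executable approximation of Choose can be sketched as follows. This is my illustration, not the book's: Python's random.choice simulates the nondeterminism (it requires an indexable sequence, so the set is converted to a list first).

```python
# Sketch of the nondeterministic command Choose(a, x): store some
# element of the nonempty, finite set a in x, leaving a unchanged.
import random

def choose(a):
    assert a != set(), "Choose is defined only for nonempty sets"
    return random.choice(list(a))  # which element comes back is not specified

a = {3, 5, 7}
x = choose(a)
print(x in a)          # True, whichever element was chosen
print(a == {3, 5, 7})  # True: the set is unchanged
```

Whatever element is delivered, the postcondition "x ∈ a and a unchanged" holds, which is exactly what the weakest-precondition definition above demands.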

Sequences
A sequence is a list of elements (joined by commas and delimited by
parentheses). For example, the sequence (1, 3, 5, 3) consists of the four
elements 1, 3, 5, 3, in that order, and () denotes the empty sequence. As
opposed to sets, the ordering of the elements in a sequence is important.
The length of a sequence s, written |s|, is the number of elements in
it.
Catenation of sequences with sequences and/or values is denoted by |.
Thus,

(1, 3, 5) | (2, 8)  =  (1, 3, 5, 2, 8)


(1, 3, 5) | 8 | 2   =  (1, 3, 5, 8, 2)
(1, 3, 5) | ()      =  (1, 3, 5)

In programming, the following notation is used to refer to elements of


a sequence. Let variable s be a sequence with n elements. Then

s = (s[0], s[1], s[2], ..., s[n-1])

That is, s[0] refers to the first element, s[1] to the second, and so forth.
Further, the notation s[k..], where 0 ≤ k ≤ n, denotes the sequence

s[k..] = (s[k], s[k+1], ..., s[n-1])

That is, s[k..] denotes a new sequence that is the same as s but with the
first k elements removed. For example, if s is not empty, the assignment

s:= s[1..]

deletes the first element of s. Executing the assignment when s = ()


causes abortion, because the expression s[1..] is not defined in that case.
One can implement a last-in-first-out stack using a sequence s by limit-
ing the operations on s to

s[0]                  reference the top element of the stack


s := ()               empty the stack
x, s := s[0], s[1..]  pop an element into variable x
s := v | s            push value v onto the stack

One can implement a (first-in-first-out) queue using a sequence s by limit-


ing the operations on s to

s[0]                  reference the front element of the queue


s := ()               empty the queue
x, s := s[0], s[1..]  delete the front element and store it in x
s := s | v            insert value v at the rear

Using the sequence notation, rather than the usual pop and push of stacks
and insert into and delete from queues, may lead to more understandable
programs. The notion of assignment is already well understood -see
chapter 9- and is easy to use in this context.
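As a sketch of how these disciplines look in an executable language (mine, not from the book), here are the stack and queue operations with a Python list playing the role of the sequence s:

```python
# Stack: push at the front, pop from the front.
s = []                # s := (): empty the stack
for v in (1, 3, 5):
    s = [v] + s       # s := v | s: push v onto the stack
print(s[0])           # 5: the top element of the stack
x, s = s[0], s[1:]    # x, s := s[0], s[1..]: pop an element into x
print(x, s)           # 5 [3, 1]

# Queue: insert at the rear, delete from the front.
q = []                # s := (): empty the queue
for v in (1, 3, 5):
    q = q + [v]       # s := s | v: insert v at the rear
x, q = q[0], q[1:]    # delete the front element and store it in x
print(x, q)           # 1 [3, 5]
```

The multiple assignment x, s := s[0], s[1..] of the book carries over directly as Python's tuple assignment.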

Operations on integers and real numbers


We typically use the following sets:

The set of integers: {..., -2, -1, 0, 1, 2, ...}


The set of natural numbers: {0, 1, 2, ...}

We also use the set of real numbers, although on any machine this set and
operations on it are approximated by some form of floating point
numbers and operations. Nevertheless, we assume that real arithmetic is
performed, so that problems with floating point are eliminated.
The following operations take as operands either integers or real num-
bers:

+, -, *           addition, subtraction, multiplication
/                 division x/y; yields a real number
<, ≤, =, ≥, >, ≠  the relational operators:
                  x < y is read "x is less than y"
                  x ≤ y is read "x is at most y"
                  x = y is read "x equals y"
                  x ≥ y is read "x is at least y"
                  x > y is read "x exceeds y"
                  x ≠ y is read "x differs from y"
abs(x) or |x|     absolute value of x: if x < 0 then -x else x
floor(x)          greatest integer not more than x
ceil(x)           smallest integer not less than x
min(x, y, ...)    the minimum of x, y, .... The minimum
                  of an empty set is ∞ (infinity)
max(x, y, ...)    the maximum of x, y, .... The maximum
                  of an empty set is -∞ (minus infinity)
log(x)            base 2 logarithm of x: y = log(x) iff x = 2^y

The following operations take only integers as operands.

x ÷ y    the greatest integer at most x/y


x mod y  the remainder when x is divided by y
         (for x ≥ 0, y > 0)
even(x)  "x is an even integer", or x mod 2 = 0
odd(x)   "x is an odd integer", or x mod 2 = 1
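For comparison (a sketch of mine, not from the book), Python's integer and math operations line up with the tables above: // plays the role of ÷ and % plays the role of mod.

```python
# Integer division, remainder, and the other operations of this section.
import math

x, y = 7, 2
print(x // y, x % y)                    # 3 1: 7 ÷ 2 and 7 mod 2
print(math.floor(2.5), math.ceil(2.5))  # 2 3
print(abs(-4))                          # 4
print(min(3, 5), max(3, 5))             # 3 5
print(math.log2(8))                     # 3.0: y = log(x) iff x = 2^y
print(x % 2 == 0, x % 2 == 1)           # False True: even(7), odd(7)
```

Note that Python's // also handles negative operands (rounding toward minus infinity), whereas the book restricts mod to x ≥ 0, y > 0.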
Appendix 3
Relations and Functions

Relations
Let A and B be two sets. The Cartesian product of A and B, written
A × B, is the set of ordered pairs (a, b) where a is in A and b is in B:

A × B = {(a, b) | a ∈ A ∧ b ∈ B}

The Cartesian product is named after the father of analytic geometry,


René Descartes, a 17th century mathematician and philosopher. The
number of elements in A × B is |A| * |B|; hence the name Cartesian pro-
duct.
A binary relation over the sets A and B is a subset of A × B. Since
we will be dealing mainly with binary relations, we drop the adjective
binary and call them simply relations. A few words will be said about
other relations at the end of this Appendix.
Let P be the set of people. One relation over P × P is the relation
parent:

parent = {(a, b) I b is a's parent}

Let N be the set of integers. One relation over N × N is the successor
relation:

succ = {(i, i+1) | i ∈ N}

The following relation associates with each person the year in which he
left his body:

died_in = {(p, i) | person p died in year i}



The identity relation over A × A, denoted by I, is the relation

I = {(a, a) | a ∈ A}
When dealing with binary relations, we often use the name of a rela-
tion as a binary operator and use infix notation to indicate that a pair
belongs in the relation. For example, we have

c parent d iff (c, d) ∈ {(a, b) | b is a's parent}


i succ j iff i+1 = j
q died_in j iff (q, j) ∈ {(p, i) | person p died in year i}

From the three relations given thus far, we can conclude several things.
For any value a there may be different pairs (a, b) in a relation. Such a
relation is called a one-to-many relation. Relation parent is one-to-many,
because most people have more than one parent.
For any value b there may be different pairs (a, b) in a relation.
Such a relation is called a many-to-one relation. Many people may have
died in any year, so that for each integer i there may be many pairs (p, i)
in relation died_in. But for any person p there is at most one pair (p, i)
in died_in. Relation died_in is an example of a many-to-one relation.
In relation succ, no two pairs have the same first value and no two
pairs have the same second value. Relation succ is an example of a
one-to-one relation.
A relation on A × B may contain no pair (a, b) for some a in A.
Such a relation is called a partial relation. On the other hand, a relation
on A × B is total if for each a ∈ A there exists a pair (a, b) in the rela-
tion. Relation died_in is partial, since not all people have died yet.
Relation succ is total (on N × N).
If relation R on A × B contains a pair (a, b) for each b in B, we say
that R is onto B. Relation parent is onto, since each child has a parent
(assuming there was no beginning).

Let R and S be two relations. Then the composition R ∘ S of R and


S is the relation defined by

a R∘S c iff (E b: a R b ∧ b S c)

For example, the relation parent ∘ parent is the relation grandparent.


The relation died_in ∘ succ associates with each person the year after the
one in which the person died.
Composition is associative. This means the following. Let R, S and
T be three relations. Then (R ∘ S) ∘ T = R ∘ (S ∘ T). This fact is easily
deduced from the definitions of relation and composition. Because com-
position is associative, we usually omit the parentheses and write simply
R ∘ S ∘ T.
The composition of a relation with itself is denoted with a superscript
2:

parent^2 is equivalent to parent ∘ parent


succ^2 is equivalent to succ ∘ succ

Similarly, for any relation R and natural number i one defines R^i as

R^0 = I, the identity relation


R^i = R ∘ R^(i-1), for i > 0

For example,

parent^0 = I
parent^1 = parent
parent^2 = grandparent
parent^3 = great-grandparent

and

(i succ^k j) iff i+k = j

Looking upon relations as sets and using the superscript notation, we can
define the transitive closure R+ and reflexive transitive closure R* of a
relation R as follows.

R+ = R^1 ∪ R^2 ∪ R^3 ∪ ...
R* = R^0 ∪ R^1 ∪ R^2 ∪ ...

In other words, a pair (a, b) is in R+ if and only if it is in R^i for some


i > 0. Here are some examples.

parent+ is the relation ancestor


i succ+ j iff i+k = j for some k > 0, i.e. i < j
i succ* j iff i ≤ j

Finally, we can define the inverse R⁻¹ of a relation R:

b R⁻¹ a iff a R b

That is, (b, a) is in the inverse R⁻¹ of R if and only if (a, b) is in R.


The inverse of parent is child, the inverse of < is >, the inverse of ≤ is
≥, and the inverse of the identity relation I is I itself.
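Since relations are just sets of pairs, the operations of this appendix can be sketched directly in Python. The function names below are mine, not the book's; the transitive closure is computed by composing with R until nothing new appears, which terminates for finite relations.

```python
# Relations as Python sets of pairs: composition, closure, inverse.
def compose(r, s):
    # a (r o s) c iff there is a b with a r b and b s c
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

def inverse(r):
    return {(b, a) for (a, b) in r}

def transitive_closure(r):
    # R+ = R^1 u R^2 u ... ; iterate until the set stops growing
    result = set(r)
    while True:
        bigger = result | compose(result, r)
        if bigger == result:
            return result
        result = bigger

# succ restricted to {0..4}: the pairs (i, i+1)
succ = {(i, i + 1) for i in range(4)}
print(compose(succ, succ) == {(i, i + 2) for i in range(3)})  # True: succ^2
print(transitive_closure(succ) ==
      {(i, j) for i in range(5) for j in range(5) if i < j})  # True: i succ+ j iff i < j
print(inverse(succ) == {(i + 1, i) for i in range(4)})        # True
```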

Functions
Let A and B be sets. A function f from A to B, denoted by

f: A → B

is a relation that for each element a of A contains at most one pair


(a, b) -i.e. is a relation that is not one-to-many. The relation parent is
not a function, because a child can have more than one parent -for each
person p there may be more than one pair (p, q) in the set parent. The
relations succ and died_in are functions.
We write each pair in a function f as (a, f(a)). The second value,
f(a), is called the value of function f at the argument a. For example,
for the function succ we have

succ (i) = i + I, for all natural numbers i,

because succ is the set of pairs

{( ... , (-2, -I), (-1,0), (0, I), (I,2), ... )}

Note carefully the three ways in which a function name f is used. First,
f denotes a set of pairs such that for any value a there is at most one
pair (a, b). Second, a f b holds if (a, b) is in f. Third, f(a) is the
value associated with a, that is, (a, f(a)) is in the function (relation) f.
The beauty of defining a function as a restricted form of relation is
that the terminology and theory for relations carries over to functions.
Thus, we know what a one-to-one function is. We know that composition
of (binary) functions is associative. We know, for any function, what f⁰,
f¹, f², f⁺ and f* mean. We know what the inverse f⁻¹ of f is. We
know that f⁻¹ is a function iff f is not many-to-one.
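The view of a function as a restricted relation can be checked mechanically: a set of pairs is a function iff no first component repeats, and its inverse is a function exactly when it is not many-to-one. A sketch with illustrative finite slices of succ and squaring:

```python
# A function viewed as a relation: a set of pairs in which no value a
# is the first element of two different pairs.

def is_function(rel):
    firsts = [a for (a, b) in rel]
    return len(firsts) == len(set(firsts))      # no a maps to two values

def inverse(rel):
    return {(b, a) for (a, b) in rel}

succ = {(i, i + 1) for i in range(-3, 3)}       # a finite slice of succ
assert is_function(succ)
assert is_function(inverse(succ))               # succ is one-to-one

square = {(i, i * i) for i in range(-3, 4)}
assert is_function(square)
assert not is_function(inverse(square))         # square is many-to-one
```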

Functions from expressions to expressions

Consider an expression, for example x*y. We can consider x*y as a
function of one (or more) of its identifiers, say y:

    f(y) = x*y

In this case, we can consider f to be a function from expressions to
expressions. For example,

    f(2) = x * 2
    f(x+2) = x * (x+2)
    f(x*2) = x * x*2

Thus, to apply the function to an argument means to replace textually the
identifier y everywhere within the expression by the argument. In making
the replacement, one should insert parentheses around the argument to
maintain the precedence of operators in the expression (as in the second
example), but we often leave them out where it doesn't matter. This tex-
tual substitution is discussed in more detail in section 4.4. The reason for
including it here is that the concept is used earlier in the book, and the
reader should be familiar with it.
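Textual substitution on expressions kept as strings can be sketched in a few lines; the parenthesization rule mentioned above is the interesting part. This is a simplified illustration (it treats expressions as flat strings, not parsed trees):

```python
# A sketch of textual substitution E with y replaced by an argument:
# every whole-word occurrence of the identifier is replaced, and a
# compound argument is parenthesized to preserve operator precedence.

import re

def substitute(expr, identifier, argument):
    # \b ensures we match the whole identifier, not part of another name.
    pattern = r'\b' + re.escape(identifier) + r'\b'
    # Parenthesize compound arguments so precedence is maintained.
    if not re.fullmatch(r'\w+', argument):
        argument = '(' + argument + ')'
    return re.sub(pattern, argument, expr)

assert substitute('x*y', 'y', '2') == 'x*2'
assert substitute('x*y', 'y', 'x+2') == 'x*(x+2)'
```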

n-ary relations and functions

Thus far, we have dealt only with binary relations. Suppose we have
sets A₀, ..., Aₙ for n > 0. Then one can define a relation on
A₀ × A₁ × ⋯ × Aₙ to be a set of ordered tuples

    (a₀, ..., aₙ)

where each aᵢ is a member of set Aᵢ.
Suppose g is such a relation. Suppose, further, that for each distinct
n-tuple (a₀, ..., aₙ₋₁) there is at most one tuple (a₀, ..., aₙ₋₁, aₙ)
in g. Then g is an n-ary function, a function of n arguments, where
the value aₙ in each tuple is the value of the function when applied to the
arguments consisting of the first n values:

    (a₀, ..., aₙ₋₁, g(a₀, ..., aₙ₋₁)) is in g and
    g(a₀, ..., aₙ₋₁) = aₙ

The terminology used for binary relations and functions extends easily to
n-ary relations and functions.
Appendix 4
Asymptotic Execution Time Properties

Execution time measures of algorithms that hold for any implementation
of the algorithms and for any computer, especially for large input
values (e.g. large arrays), are useful. The rest of this section is devoted to
describing a suitable method for measuring execution times in this sense.
The purpose is not to give an extensive discussion, but only to outline the
important ideas for those not already familiar with them.
First, assign units of execution time to each command of a program as
follows. The assignment command and skip are each counted as 1 unit of
time, since their executions take essentially the same time whenever they
are executed. An alternative command is counted as the maximum
number of units of its alternatives (this may be made finer in some
instances). An iterative command is counted as the sum of the units for
each of its iterations, or as the number of iterations times the maximum
number of units required by each iteration.
One can understand the importance of bound functions for loops in
estimating execution time as follows. If a program has no loops, then its
execution time is bounded no matter what the input data is. Only if it
has a loop can the time depend on the input values, and then the number
of loop iterations, for which the bound function gives an upper bound,
can be used to give an estimate on the time units used.
Consider the three following loops, for n ≥ 0:

    i:= n; do i > 1 → i:= i-1 od

    i:= n; j:= 0; do i > 1 → i:= i-1; j:= 0 od
    i:= n; do i > 1 → i:= i÷2 od
The first requires n units of time; the second 2n. The units of time
required by the third program are more difficult to determine. Suppose n
is a power of 2, so that

    i = 2^k

holds for some k. Then dividing i by 2 is equivalent to subtracting 1
from k: (2^k) ÷ 2 = 2^(k-1). Therefore, each iteration decreases k by 1. Upon
termination, i = 1 = 2^0, so that k = 0. Hence, a suitable bound function is
t = k, and, in fact, the loop iterates exactly k times. The third program
makes exactly ceil(log n) iterations, for any n > 0.
Whenever an algorithm iteratively divides a variable by 2, you can be
sure that a log factor is creeping into its execution time. We see this in
Binary Search (exercise 4 of section 16.3) and in Quicksort (18.2). This
log factor is important, because log n is ever so much smaller than n, as
n grows large.
The execution time of the first and second programs given above can
be considered roughly the same, while that of the third is much lower.
This is illustrated in the following table, which gives values of n, 2n and
log n for various values of n. Thus, for n = 32768 the first program
requires 32768 basic units of time, the second twice as many, and the third
only 15!

    n:      1    2    64    128    32768
    2n:     2    4    128   256    65536
    log n:  0    1    6     7      15
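The iteration counts in the table can be reproduced by simulating the halving loop directly. A sketch:

```python
# Counting iterations of the halving loop  i:= n; do i > 1 → i:= i÷2 od:
# for n a power of 2 it iterates exactly log2(n) times.

def halving_iterations(n):
    i, count = n, 0
    while i > 1:
        i = i // 2          # i := i ÷ 2
        count += 1
    return count

assert halving_iterations(32768) == 15      # 32768 = 2^15, as in the table
assert halving_iterations(128) == 7
assert halving_iterations(2) == 1
```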

We need a measure that allows us to say that the third program is by far
the fastest and that the other two are essentially the same. To do this, we
define the order of execution time.

(A4.1) Definition. Let f(n) and g(n) be two functions. We say that
       f(n) is (no more than) order g(n), written O(g(n)), if a constant
       c > 0 exists such that, for all (except a possibly finite number of)
       positive values of n,

           f(n) ≤ c*g(n).

       Further, f(n) is proportional to g(n) if f(n) is O(g(n)) and
       g(n) is O(f(n)). We also say that f(n) and g(n) are of the
       same order.  □

Example 1. Let f(n) = n+5 and g(n) = n. Choose c = 5. Then, for
1 < n, f(n) = n+5 ≤ 5n = 5*g(n). Hence, f(n) is O(g(n)). Choosing
c = 1 indicates that g(n) is O(f(n)). Hence n+5 is proportional to
n.  □

Example 2. Let f(n) = n and g(n) = 2n. With c = 1 we see that f(n) is
O(g(n)). With c = 2, we see that g(n) is O(f(n)). Hence n is proportional
to 2n. For any constants K1 ≠ 0 and K2, K1*n + K2 is proportional
to n.  □

Since the first and second programs given above are executed in n and
2n units, respectively, their execution times are of the same order.
Secondly, one can prove that log n is O(n), but not vice versa. Hence
the order of execution time of the third program is less than that of the
first two.
We give below a table of typical execution time orders that arise fre-
quently in programming, from smallest to largest, along with frequent
terms used for them. They are given in terms of a single input parameter
n. In addition, the (rounded) values of the orders are given for n = 100
and n = 1000, so that the difference between them can be seen.

    Order      n = 100      n = 1000    Term Used

    1          1            1           constant-time algorithm
    log n      7            10          logarithmic algorithm
    √n         10           32
    n          100          1000        linear algorithm
    n log n    700          10000
    n√n        1000         31623
    n²         10000        1000000     quadratic (or simply n²) algorithm
    n³         1000000      10⁹         cubic algorithm
    2ⁿ         1.26*10³⁰    ≈10³⁰¹      exponential algorithm

For algorithms that have several input values the calculation of the order
of execution time becomes more difficult, but the technique remains the
same. When comparing two algorithms, one should first compare their
execution time orders, and, if they are the same, then proceed to look for
finer detail such as the number of time units required, number of array
comparisons made, etc.
An algorithm may require different times depending on the configura-
tion of the input values. For example, one array b[1:n] may be sorted in
n steps, another array b'[1:n] in n² steps by the same algorithm. In this
case there are two methods of comparing the algorithms: average- (or
expected-) case time analysis and worst-case time analysis. The former is
quite difficult to do; the latter usually much simpler.
As an example, Linear Search (16.2.5) requires n time units in the
worst case and n/2 time units in the average case, if one assumes the
value being looked for can be in any position with equal probability.
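The n versus n/2 contrast for linear search can be measured directly by counting probes over all equally likely target positions. A sketch (the probe count here stands in for the book's time units):

```python
# Worst-case vs. average-case cost of linear search: n probes when the
# target is last, (n+1)/2 probes averaged over all positions.

def probes(b, x):
    """Number of elements examined before x is found."""
    for count, value in enumerate(b, start=1):
        if value == x:
            return count
    return len(b)

n = 1000
b = list(range(n))
assert probes(b, n - 1) == n                    # worst case: last position
average = sum(probes(b, x) for x in b) / n
assert average == (n + 1) / 2                   # about n/2
```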
Answers to Exercises

Answers for Chapter 1


1. We show the evaluation of the expressions for state s1.
(a) ¬(m ∨ n) = ¬(T ∨ F) = ¬T = F
(b) ¬m ∨ n = ¬T ∨ F = F ∨ F = F
(c) ¬(m ∧ n) = ¬(T ∧ F) = ¬F = T
(d) ¬m ∧ n = ¬T ∧ F = F ∧ F = F
(e) (m ∨ n) ⇒ p = (T ∨ F) ⇒ T = T ⇒ T = T
(f) m ∨ (n ⇒ p) = T ∨ (F ⇒ T) = T ∨ T = T
(g) (m = n) ∧ (p = q) = (F = F) ∧ (T = F) = T ∧ F = F
(h) m = (n ∧ (p = q)) = F = (F ∧ (T = F))
    = F = (F ∧ F) = F = F = T
(i) m = (n ∧ p = q) = F = (F ∧ T = F) = F = (F = F) = F = T = F
(j) (m = n) ∧ (p ⇒ q) = (F = T) ∧ (F ⇒ T) = F ∧ T = F
(k) (m = n ∧ p) ⇒ q = (F = T ∧ F) ⇒ T = (F = F) ⇒ T = T ⇒ T = T
(l) (m ⇒ n) ⇒ (p ⇒ q) = (F ⇒ F) ⇒ (F ⇒ F) = T ⇒ T = T
(m) (m ⇒ (n ⇒ p)) ⇒ q = (F ⇒ (F ⇒ F)) ⇒ F
    = (F ⇒ T) ⇒ F = T ⇒ F = F
2.  b c d    b∨c    b∨c∨d    b∧c    b∧c∧d

    T T T    T      T        T      T
    T T F    T      T        T      F
    T F T    T      T        F      F
    T F F    T      T        F      F
    F T T    T      T        F      F
    F T F    T      T        F      F
    F F T    F      T        F      F
    F F F    F      F        F      F
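A truth table like the one above can be generated mechanically by enumerating all states of the identifiers. A sketch:

```python
# Enumerating all 8 states of b, c, d and evaluating the four columns
# b∨c, b∨c∨d, b∧c, b∧c∧d of the table above.

from itertools import product

rows = []
for b, c, d in product([True, False], repeat=3):
    rows.append((b or c, b or c or d, b and c, b and c and d))

assert rows[0] == (True, True, True, True)        # state T T T
assert rows[-1] == (False, False, False, False)   # state F F F
assert len(rows) == 8
```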

3. Let the Boolean identifiers and their meanings be:

xlessy: x < y   xequaly: x = y   xgreatery: x > y   xatleasty: x ≥ y
ylessz: y < z   yequalz: y = z   ygreaterz: y > z
vequalw: v = w
beginxlessy: Execution of program P begins with x < y
beginxless0: Execution of program P begins with x < 0
endyequal2powerx: Execution of P terminates with y = 2^x
noend: Execution of program P does not terminate
We give one possible proposition; there are others.
(a) xlessy ∨ xequaly
(d) xlessy ∧ ylessz ∧ vequalw
(k) beginxless0 ⇒ noend

Answers for Chapter 2


1. Truth table for the first Commutative law (only) and the law of Nega-
tion:

    b c    b∧c    c∧b    (b∧c)=(c∧b)    ¬b    ¬¬b    ¬¬b=b

    T T    T      T      T              F     T      T
    T F    F      F      T              F     T      T
    F T    F      F      T              T     F      T
    F F    F      F      T              T     F      T

Truth table for the first Distributive law (only) (since the last two columns
are the same, the two expressions heading the columns are equivalent and
the law holds):

    b c d    c∧d    b∨c    b∨d    b∨(c∧d)    (b∨c)∧(b∨d)

    T T T    T      T      T      T          T
    T T F    F      T      T      T          T
    T F T    F      T      T      T          T
    T F F    F      T      T      T          T
    F T T    T      T      T      T          T
    F T F    F      T      F      F          F
    F F T    F      F      T      F          F
    F F F    F      F      F      F          F

3.  ¬T = ¬(T ∨ ¬T)    (Excluded Middle)
       = ¬T ∧ ¬¬T     (De Morgan)
       = ¬T ∧ T       (Negation)
       = T ∧ ¬T       (Commutativity)
       = F            (Contradiction)

5. Column 1: (b) Contradiction, (c) or-simpl., (d) or-simpl., (e) and-simpl.,
(f) Identity, (g) Contradiction, (h) Associativity, (i) Distributivity,
(j) Distributivity, (k) Commutativity (twice), (l) Negation, (m) De Morgan.

6. (a) x ∨ (y ∨ x) ∨ ¬y
     = x ∨ (x ∨ y) ∨ ¬y        (Commut.)
     = (x ∨ x) ∨ (y ∨ ¬y)      (Assoc.)
     = x ∨ T                   (or-simpl., Excl. Middle)
     = T                       (or-simpl.)

   (b) (x ∨ y) ∧ (x ∨ ¬y)
     = x ∨ (y ∧ ¬y)            (Dist.)
     = x ∨ F                   (Contradiction)
     = x                       (or-simpl.)

   (g) ¬x ⇒ (x ∧ y)
     = x ∨ (x ∧ y)             (Imp., Neg.)
     = x                       (or-simpl.)

   (h) T ⇒ (¬x ⇒ x)
     = ¬T ∨ (x ∨ x)            (Imp., Neg.)
     = F ∨ (x ∨ x)             (exercise 3)
     = x                       (or-simpl., twice)

7. Proposition e is transformed using the equivalence laws in 6 major
steps:
1. Use the law of Equality to eliminate all occurrences of =.
2. Use the law of Implication to eliminate all occurrences of ⇒.
3. Use De Morgan's laws and the law of Negation to "move not in" so
that it is applied only to identifiers and constants. For example,
transform ¬(a ∨ (F ∧ ¬c)) as follows:

    ¬(a ∨ (F ∧ ¬c))
    = ¬a ∧ ¬(F ∧ ¬c)
    = ¬a ∧ (¬F ∨ ¬¬c)
    = ¬a ∧ (¬F ∨ c)

4. Use ¬F = T and ¬T = F to eliminate all occurrences of ¬F and ¬T
(see exercises 3 and 4).
5. The proposition now has the form e₀ ∨ ⋯ ∨ eₙ for some n ≥ 0, where
each of the eᵢ has the form (g₀ ∧ ⋯ ∧ gₘ). Perform the following until
all gⱼ in all eᵢ have one of the forms id, ¬id, T and F:

    Consider some eᵢ with a gⱼ that is not in the desired form. Use
    the law of Commutativity to place it as far right as possible, so
    that it becomes gₘ. Now, gₘ must have the form (h₀ ∨ ⋯ ∨
    hₖ), so that the complete eᵢ is

        g₀ ∧ ⋯ ∧ gₘ₋₁ ∧ (h₀ ∨ ⋯ ∨ hₖ)

    Use Distributivity to replace this by

        (g₀ ∧ ⋯ ∧ gₘ₋₁ ∧ h₀) ∨ ⋯ ∨ (g₀ ∧ ⋯ ∧ gₘ₋₁ ∧ hₖ)

    This adds k propositions eᵢ at the main level, but reduces the
    level of nesting of operators in at least one place. Hence, after a
    number of iterations it must terminate.

6. The proposition now has the form e₀ ∨ ⋯ ∨ eₙ for some n ≥ 0, where
each of the eᵢ has the form (g₀ ∧ ⋯ ∧ gₘ) and the gⱼ are id, ¬id, T or
F. Use the laws of Commutativity, Contradiction, Excluded Middle and
or-simplification to get the proposition in final form. If any of the eᵢ
reduces to T then reduce the complete proposition to T; if any reduces to
F use the law of or-simplification to eliminate it (unless the whole propo-
sition is F).
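The rewriting procedure above is one way to decide whether a proposition is a tautology; a brute-force alternative, useful as a cross-check, evaluates the proposition in every state of its identifiers. A sketch:

```python
# A proposition is a tautology iff it is True in every state of its
# identifiers; enumerate the 2^n states with itertools.product.

from itertools import product

def is_tautology(prop, n_identifiers):
    """prop is a function taking one bool per identifier."""
    return all(prop(*state)
               for state in product([True, False], repeat=n_identifiers))

# x ∨ (y ∨ x) ∨ ¬y reduces to T in exercise 6(a); verify it directly.
assert is_tautology(lambda x, y: x or (y or x) or (not y), 2)
# ¬x ⇒ (x ∧ y) reduces to x in 6(g), which is not a tautology.
# (a ⇒ b is written a <= b on Python booleans.)
assert not is_tautology(lambda x, y: (not x) <= (x and y), 2)
```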
9. The laws have already been proved to be tautologies in exercise 1. We
now show that use of the rule of Substitution generates a tautology, by
induction on the form of proposition E(p), where p is an identifier.
Case 1: E(p) is either T, F or an identifier that is not p. In this case,
E(e1) and E(e2) are both E itself. By the law of Identity, E = E, so that
a tautology is generated.
Case 2: E(p) is p. In this case, E(e1) is e1 and E(e2) is e2, and by
hypothesis e1 = e2 is a tautology.
Case 3: E(p) has the form ¬E1(p). By induction, E1(e1) = E1(e2).
Hence, E1(e1) and E1(e2) have the same value in every state. The follow-
ing truth table then establishes the desired result:

    E1(e1)  E1(e2)  ¬E1(e1)  ¬E1(e2)  ¬E1(e1) = ¬E1(e2)

    T       T       F        F        T
    F       F       T        T        T

Case 4: E(p) has the form E1(p) ∧ E2(p). By induction, we have that
E1(e1) = E1(e2) and E2(e1) = E2(e2). Hence, E1(e1) and E1(e2) have the
same value in every state, and E2(e1) and E2(e2) have the same value in
every state. The following truth table then establishes the desired result:

    E1(e1)  E1(e2)  E2(e1)  E2(e2)  E1(e1) ∧ E2(e1)  E1(e2) ∧ E2(e2)

    T       T       T       T       T                T
    T       T       F       F       F                F
    F       F       T       T       F                F
    F       F       F       F       F                F

The rest of the cases, E(p) having the forms E1(p) ∨ E2(p), E1(p) ⇒
E2(p) and E1(p) = E2(p), are similar and are not shown here.

We now show that use of the rule of Transitivity generates only tauto-
logies. Since e1 = e2 and e2 = e3 are tautologies, we know that e1 and e2
have the same value in every state and that e2 and e3 have the same value
in every state. The following truth table establishes the desired result:

    e1  e2  e3  e1 = e3
    T   T   T   T
    F   F   F   T

10. Reduce e to a proposition e1 in conjunctive normal form (see exercise
8). By exercise 9, e = e1 is a tautology, and since e is assumed to be a
tautology, e1 must be a tautology. Hence it is true in all states. Proposi-
tion e1 is T, is F or has the form e₀ ∧ ⋯ ∧ eₙ, where each of the eᵢ has
the form g₀ ∨ ⋯ ∨ gₘ and each gᵢ is id or ¬id and the gᵢ are distinct.
Proposition e1 cannot have the latter form, because it is not a tautology.
It cannot be F, since F is not a tautology. Hence, it must be T, and
e = T has been proved.

Answers for Section 3.2

3. (a) From p ∧ q, p ⇒ r infer r

    1  p ∧ q     pr 1
    2  p ⇒ r     pr 2
    3  p         ∧-E, 1
    4  r         ⇒-E, 2, 3

or From p ∧ q, p ⇒ r infer r

    1  p         ∧-E, pr 1
    2  r         ⇒-E, pr 2, 1

3. (b) From p = q, q infer p

    1  p = q     pr 1
    2  q         pr 2
    3  q ⇒ p     =-E, 1
    4  p         ⇒-E, 3, 2

3. (c) From p, q ⇒ r, p ⇒ r infer p ∧ r

    1  r         ⇒-E, pr 3, pr 1
    2  p ∧ r     ∧-I, pr 1, 1

4. Proof of 3(a). Since p /\ q is true, p must be true. From p =';> r, we


then conclude that r must be true.
Proof of 3(b). Since p = q is true, q being true means that p must be
true.

Answers for Section 3.3

1. Infer (p ∧ q ∧ (p ⇒ r)) ⇒ (r ∨ (q ⇒ r))

    1  (p ∧ q ∧ (p ⇒ r)) ⇒ (r ∨ (q ⇒ r))    ⇒-I, (3.2.11)

2. Infer (p ∧ q) ⇒ (p ∨ q)

    2  From p ∧ q infer p ∨ q
    2.1    p                  ∧-E, pr 1
    2.2    p ∨ q              ∨-I, 2.1
    3  (p ∧ q) ⇒ (p ∨ q)      ⇒-I, 2

4. Infer p = p ∨ p

    1  From p infer p ∨ p
    1.1    p ∨ p              ∨-I, pr 1
    2  p ⇒ p ∨ p              ⇒-I, 1
    3  From p ∨ p infer p
    3.1    p                  ∨-E, pr 1, (3.3.3), (3.3.3)
    4  p ∨ p ⇒ p              ⇒-I, 3
    5  p = p ∨ p              =-I, 2, 4

8. The reference on line 2.2 to line 2 is invalid.

11. From ¬q infer q ⇒ p

    1  ¬q                     pr 1
    2  From q infer p
    2.1    q                  pr 1
    2.2    From ¬p infer q ∧ ¬q
    2.2.1      q ∧ ¬q         ∧-I, 2.1, 1
    2.3    p                  ¬-E, 2.2
    3  q ⇒ p                  ⇒-I, 2

26. From p infer p

    1  p                      pr 1
    2  From ¬p infer p ∧ ¬p
    2.1    p ∧ ¬p             ∧-I, 1, pr 1
    3  p                      ¬-E, 2

27. The following "English" proofs are not intended to be particularly
noteworthy; they only show how one might attempt to argue in English.
Building truth tables or using the equivalence transformation system of
chapter 2 is more reasonable.
Many of these proofs rely on the property that a proposition b ⇒ c is
true in any state in which the consequent c is true, and hence to prove
that b ⇒ c is a tautology one need only investigate states in which c is
false.
27. Proof of 1. The proposition (p ∧ q ∧ (p ⇒ r)) ⇒ (r ∨ (q ⇒ r)) is true
because it was already proven in (3.2.11) that (r ∨ (q ⇒ r)) followed from
(p ∧ q ∧ (p ⇒ r)).
27. Proof of 2. If p ∧ q is true, then p is true. Hence, anything "ored"
with p is true, so p ∨ q is true.
27. Proof of 3. If q is true, then so is q ∧ q.

27. Proof of 5. Suppose p is true. Then e ⇒ p is true no matter what e
is. Hence (r ∨ s) ⇒ p is true. Hence, p ⇒ ((r ∨ s) ⇒ p) is true.

Answers for Section 3.4

1. From p ⇒ (q ⇒ (p ∨ q)) infer (p ∧ q) ⇒ (p ∨ q)

    1  p ⇒ (q ⇒ (p ∨ q))      pr 1
    2  From p ∧ q infer p ∨ q
    2.1    p                  ∧-E, pr 1
    2.2    q ⇒ (p ∨ q)        ⇒-E, 1, 2.1
    2.3    q                  ∧-E, pr 1
    2.4    p ∨ q              ⇒-E, 2.2, 2.3
    3  (p ∧ q) ⇒ (p ∨ q)      ⇒-I, 2

8.(a) From b ∨ c infer ¬b ⇒ c

    1  b ∨ c             pr 1
    2  From ¬b infer c
    2.1    c             (3.4.6), 1, pr 1
    3  ¬b ⇒ c            ⇒-I, 2

8.(b) From ¬b ⇒ c infer b ∨ c

    1  ¬b ⇒ c            pr 1
    2  b ∨ ¬b            (3.4.14)
    3  From b infer b ∨ c
    3.1    b ∨ c         ∨-I, pr 1
    4  b ⇒ b ∨ c         ⇒-I, 3
    5  From ¬b infer b ∨ c
    5.1    c             ⇒-E, 1, pr 1
    5.2    b ∨ c         ∨-I, 5.1
    6  ¬b ⇒ b ∨ c        ⇒-I, 5
    7  b ∨ c             ∨-E, 2, 4, 6

8.(c) Infer b ∨ c = (¬b ⇒ c)

    1  (b ∨ c) ⇒ (¬b ⇒ c)    ⇒-I, 8(a)
    2  (¬b ⇒ c) ⇒ b ∨ c      ⇒-I, 8(b)
    3  b ∨ c = (¬b ⇒ c)      =-I, 1, 2
10. We prove the theorem by induction on the structure of expression
E(p).
Case 1: E(p) is the single identifier p. In this case, E(e1) is simply e1 and
E(e2) is e2, and we have the proof

    From e1 = e2, e1 infer e2

    1  e1 ⇒ e2    =-E, pr 1
    2  e2         ⇒-E, 1, pr 2

Case 2: E(p) is an identifier different from p, say v. In this case E(e1)
and E(e2) are both v and the theorem holds trivially.
Case 3: E(p) has the form ¬G(p), for some expression G. By induction,
we may assume that a proof of

    From e2 = e1, G(e2) infer G(e1)

exists, and we prove the desired result as follows.

    From e1 = e2, ¬G(e1) infer ¬G(e2)

    1  e1 = e2                          pr 1
    2  ¬G(e1)                           pr 2
    3  From G(e2) infer G(e1) ∧ ¬G(e1)
    3.1    e2 = e1                      ⇒-E, ex. 25 of 3.3, 1
    3.2    (e2 = e1) ∧ G(e2) ⇒ G(e1)    ⇒-I, assumed proof
    3.3    (e2 = e1) ∧ G(e2)            ∧-I, 3.1, pr 1
    3.4    G(e1)                        ⇒-E, 3.2, 3.3
    3.5    G(e1) ∧ ¬G(e1)               ∧-I, 3.4, 2
    4  ¬G(e2)                           ¬-I, 3

Case 4: E(p) has the form G(p) ∧ H(p) for some expressions G and H.
In this case, by induction we may assume that the following proofs exist.

    From e1 = e2, G(e1) infer G(e2)
    From e1 = e2, H(e1) infer H(e2)

We can then give the following proof.

    From e1 = e2, G(e1) ∧ H(e1) infer G(e2) ∧ H(e2)

    1  G(e1)             ∧-E, pr 2
    2  G(e2)             Assumed proof, pr 1, 1
    3  H(e1)             ∧-E, pr 2
    4  H(e2)             Assumed proof, pr 1, 3
    5  G(e2) ∧ H(e2)     ∧-I, 2, 4

The rest of the cases, where E(p) has one of the forms G(p) ∨ H(p),
G(p) ⇒ H(p) and G(p) = H(p), are left to the reader.

11. From a = b, b = c infer a = c

    1  a ⇒ b      =-E, pr 1
    2  b ⇒ c      =-E, pr 2
    3  From a infer c
    3.1    b      ⇒-E, 1, pr 1
    3.2    c      ⇒-E, 2, 3.1
    4  a ⇒ c      ⇒-I, 3
    5  c ⇒ a      proof omitted, similar to 1-4
    6  a = c      =-I, 4, 5

Answers for Section 3.5


1. Conjecture 4, which can be written as bl ∧ ¬gj ⇒ ¬tb, is not valid, as
can be seen by considering the state with tb = T, ma = T, bl = T,
gh = F, fd = T and gj = F.
Conjecture 8, which can be written as gh ∧ tb ⇒ (bl ⇒ ¬fd), is proved
as follows:

From gh, tb infer bl ⇒ ¬fd

    1  ¬¬tb              subs, Negation, pr 2
    2  ¬bl ∨ ma          (3.4.6), Premise 1, 1
    3  ¬¬gh              subs, Negation, pr 1
    4  ¬ma ∨ ¬fd         (3.4.6), Premise 2, 3
    5  From bl infer ¬fd
    5.1    ¬¬bl          subs, Negation, pr 1
    5.2    ma            (3.4.6), 2, 5.1
    5.3    ¬¬ma          subs, Negation, 5.2
    5.4    ¬fd           (3.4.6), 4, 5.3
    6  bl ⇒ ¬fd          ⇒-I, 5

2. For the proofs of the valid conjectures using the equivalence transfor-
mation system of chapter 2, we first write here the disjunctive normal
form of the Premises:

    Premise 1: ¬tb ∨ ¬bl ∨ ma
    Premise 2: ¬ma ∨ ¬fd ∨ ¬gh
    Premise 3: gj ∨ (fd ∧ ¬gh)

Conjecture 2, which can be written in the form (ma ∧ gh) ⇒ gj, is
proved as follows. First, use the laws of Implication and De Morgan to
put it in disjunctive normal form:

(E.1)    ¬ma ∨ ¬gh ∨ gj.

To show (E.1) to be true, it is necessary to show that at least one of the
disjuncts is true. Assume, then, that the first two are false: ma = T and
gh = T. In that case, Premise 2 reduces to ¬fd, so we conclude that
fd = F. But then Premise 3 reduces to gj, and since Premise 3 is true, gj
is true, so that (E.1) is true also.

Answers for Section 4.1.


1. (a) T. (b) T. (c) F. (d) T.
2. (a) {1, 2, 3, 4, 6}. (b) {2, 4}. (c) F. (d) F.

3. (a) U. (b) T. (c) U. (d) U.


5. We don't build all the truth tables explicitly, but instead analyze the
various cases.
Associativity. To prove a cor (b cor c) and (a cor b) cor c equivalent,
we investigate possible values of a. Suppose a = T. Then evaluation
using the truth table for cor shows that both expressions yield T. Sup-
pose a = F. Then evaluation shows that both yield b cor c, so that they
are the same. Suppose a = U. Then evaluation shows that both expres-
sions yield U. The proof of the other associative law is similar.

Answers for Section 4.2


1. The empty string ε (the string containing zero characters) is the
identity element, because for all strings x, x | ε = x.
4. (N i: 0 ≤ i < n: x = b[i]) = (N i: 0 ≤ i < m: x = c[i]).
5. (A k: 0 ≤ k < n:
   (N i: 0 ≤ i < n: b[k] = b[i]) = (N j: 0 ≤ j < n: b[k] = c[j])).
6. (a) (A i: j ≤ i < k+1: b[i] = 0)
(b) ¬(E i: j ≤ i < k+1: b[i] = 0), or (A i: j ≤ i < k+1: b[i] ≠ 0)

(c) Some means at least one: (E i: j ≤ i < k+1: b[i] = 0), or, better yet,
(N i: j ≤ i < k+1: b[i] = 0) > 0
(d) (0 ≤ i < n cand b[i] = 0) ⇒ j ≤ i ≤ k, or
(A i: 0 ≤ i < n: b[i] = 0 ⇒ j ≤ i ≤ k)

Answers for Section 4.3


1. (a) (E k: 0 ≤ k < n: P ∧ H_k(T)) ∧ k > 0   (invalid)

   (b) (A j: 0 ≤ j < n: B_j ⇒ wp(SL_j, R))
Answers for Section 4.4

1. E_i^j = E   (i is not free in E)

   E_n^{n+1} = (A i: 0 ≤ i < n+1: b[i] < b[i+1])

2. E_i^j is invalid, because it yields two interpretations of j:

   E_i^j = n > j ∧ (N j: 1 ≤ j < n: n ÷ j = 0) > 1

3. E_i^j = E   (since i is not free in E)

4. The precondition is x+1 > 0, which can be written as R_x^{x+1} (compare
with the assignment statement x:= x+1).
6. (a) For the expressions T, F and id where id is an identifier that is not
i, E_i^e = E.
(b) For the expression consisting of identifier i, E_i^e = e.
(c) (E)_i^e = (E_i^e)
(d) (¬E)_i^e = ¬(E_i^e)
(e) (E1 ∧ E2)_i^e = E1_i^e ∧ E2_i^e   (Similarly for ∨, = and ⇒.)

(f) (A i: m ≤ i ≤ n: E)_i^e = (A i: m ≤ i ≤ n: E)

For identifier j not identifier i,
(A j: m ≤ j ≤ n: E)_i^e = (A j: m_i^e ≤ j ≤ n_i^e: E_i^e)   (Similarly for E and N.)

Answers for Section 4.5

1. (a) E is commutative, as is A; this can be written as (E t: (E p:
fool(p, t))), or (E p: (E t: fool(p, t))), or (E p,t: fool(p, t)).
2. (b) (A a,b,c: integer(a,b,c):
sides(a, b, c) = a+b ≥ c ∧ a+c ≥ b ∧ b+c ≥ a)

Answers for Section 4.6

1. (a) x = 6, y = 6, b = T. (b) x = 5, y = 5, b = T.

Answers for Section 5.1


1. (a) (3,4,6,8). (d) (8,6,4,2).
2. (a) 0. (b) 2. (c) 0.
3. (a) (i = j ∧ 5 = 5) ∨ (i ≠ j ∧ 5 = b[j]) = (i = j) ∨ b[j] = 5.
(b) b[i] = i.

Answers for Section 5.2


1. (Those exercises without abbreviations are not answered.)
(a) b[j:k] = 0.
(b) b[j:k] ≠ 0.
(c) 0 ∈ b[j:k].
(d) 0 ∉ b[0:j-1] ∧ 0 ∉ b[k+1:n-1].

           0        p         q        n-1
2. (a) 0 ≤ p ≤ q+1 ≤ n  ∧  b | ≤ x |       | > x |

Answers for Section 6.2


1. (a) First specification: {n > 0} S {x = max({y | y ∈ b[0:n-1]})}.
Second specification: Given fixed n and fixed array b[0:n-1], establish

    R: x = max({y | y ∈ b}).

For program development it may be useful to replace max by its meaning.
The result assertion R would then be

    R: (E i: 0 ≤ i < n: x = b[i]) ∧ (A i: 0 ≤ i < n: b[i] ≤ x)

(d) First specification:

    {n > 0}
    S
    {0 ≤ i < n ∧ (A j: 0 ≤ j < n: b[i] ≥ b[j]) ∧ b[i] > b[0:i-1]}.

Second specification: Given fixed n > 0 and fixed array b[0:n-1], set i to
establish

    R: 0 ≤ i < n ∧ b[0:n-1] ≤ b[i] ∧ b[i] > b[0:i-1]

(k) Define average(i) = (Σ j: 0 ≤ j < 4: grade[i, j]).
First specification:

    {n > 0} S {0 ≤ i < n ∧ (A j: 0 ≤ j < n: average(i) ≥ average(j))}.

Answers for Chapter 7
1. (a) i+1 > 0, or i ≥ 0.
(b) i+2+j-2 = 0, or i+j = 0.
3. Suppose Q ⇒ R. Then Q ∧ R = Q. Therefore

    wp(S, Q) = wp(S, Q ∧ R)            (since Q ∧ R = Q)
             = wp(S, Q) ∧ wp(S, R)     (by (7.4))
             ⇒ wp(S, R)

This proves (7.5).

6. By (7.6), we see that LHS (of (7.7)) ⇒ RHS. Hence it remains to show
that RHS ⇒ LHS. To prove this, we must show that any state s in
wp(S, Q ∨ R) is either guaranteed to be in wp(S, Q) or guaranteed to be
in wp(S, R). Consider a state s in wp(S, Q ∨ R). Because S is deter-
ministic, execution of S beginning in s is guaranteed to terminate in a
single, unique state s', with s' in Q ∨ R. This unique state s' must be
either in Q, in which case s is in wp(S, Q), or in R, in which case s is in
wp(S, R).
7. This exercise is intended to make the reader more aware of how quan-
tification works in connection with wp, and the need for the rule that
each identifier be used in only one way in a predicate. Suppose that Q
⇒ wp(S, R) is true in every state. This assumption is equivalent to

(E7.1)    (A x: Q ⇒ wp(S, R)).

We are asked to analyze predicate (7.8): {(A x: Q)} S {(A x: R)}, which
is equivalent to

(E7.2)    (A x: Q) ⇒ wp(S, (A x: R)).

Let us analyze this first of all under the rule that no identifier be used in
more than one way in a predicate. Hence, rewrite (E7.2) as

(E7.3)    (A x: Q) ⇒ wp(S, (A z: R_x^z))

and assume that x does not appear in S and that z is a fresh identifier.
We argue operationally that (E7.3) is true. Suppose the antecedent of
(E7.3) is true in some state s, and that execution of S begun in s ter-
minates in state s'. Because S does not contain identifier x, we have
s(x) = s'(x).
Because the antecedent of (E7.3) is true in s, we conclude from (E7.1)
that (A x: wp(S, R)) is also true in state s. Hence, no matter what the
value of x in s, s'(R) is true. But s(x) = s'(x). Thus, no matter what
the value of x in s', s'(R) is true. Hence, so is s'((A x: R)), and so is
s'((A z: R_x^z)). Thus, the consequent of (E7.3) is true in s, and (E7.3)
holds.
We now give a counterexample to show that (E7.2) need not hold if x
is assigned in command S and if x appears in R. Take command
S: x:= 1. Take R: x = 1. Take Q: T. Then (E7.1) is

    (A x: T ⇒ wp("x:= 1", x = 1))

which is true. But (E7.2) is false in this case: its antecedent (A x: T) is
true but its consequent wp("x:= 1", (A x: x = 1)) is false because predicate
(A x: x = 1) is F.
We conclude that if x occurs both in S and R, then (E7.2) does not in
general follow from (E7.1).

Answers for Chapter 8


3. By definition, wp(make-true, F) = T, which violates the law of the
Excluded Miracle, (7.3).
5. wp("S1; S2", Q) ∨ wp("S1; S2", R)
   = wp(S1, wp(S2, Q)) ∨ wp(S1, wp(S2, R))    (by definition)
   = wp(S1, wp(S2, Q) ∨ wp(S2, R))            (since S1 satisfies (7.7))
   = wp(S1, wp(S2, Q ∨ R))                    (since S2 satisfies (7.7))
   = wp("S1; S2", Q ∨ R)                      (by definition)

Answers for Section 9.1


1. (a) (2*y+3) = 13, or y = 5.
(b) x+y < 2*y, or x < y.
(c) 0 < j+1 ∧ (A i: 0 ≤ i ≤ j+1: b[i] = 5).
(d) (b[j] = 5) = (A i: 0 ≤ i ≤ j: b[i] = 5)
3. Execution of x:= e in state s evaluates e to yield the value s(e) and
stores this value as the new value of x. This is our conventional model of
execution. Hence, for the final state s' we have s' = (s; x:s(e)).
We want to show that s'(R) = s(R_x^e). But this is simply lemma 4.6.2.
This means that if R is to be true (false) after the assignment (i.e. in state
s') then R_x^e must be true (false) before (i.e. in state s). This is exactly
what definition (9.1) indicates.
4. Writing both Q and e as functions of x, we have

    wp("x:= e(x)", sp(Q(x), "x:= e(x)"))
    = wp("x:= e(x)", (E v: Q(v) ∧ x = e(v)))
    = (E v: Q(v) ∧ x = e(v))_x^{e(x)}
(E4.1)    = (E v: Q(v) ∧ e(x) = e(v))

The last line follows because neither Q(v) nor e(v) contains a reference
to x. Now suppose Q is true in some state s. Let v = s(x), the value of
x in state s. For this v, (Q(v) ∧ e(x) = e(v)) is true in state s, so that
(E4.1) is also true in s. Hence Q ⇒ (E4.1), which is what we needed to
show.

Answers for Section 9.2


1. We prove only that x:= e1; y:= e2 is equivalent to x, y:= e1, e2.
Write any postcondition R as a function of x and y: R(x, y).

    wp("x:= e1; y:= e2", R(x, y))
    = wp("x:= e1", wp("y:= e2", R(x, y)))
    = wp("x:= e1", R(x, y)_y^{e2})
    = wp("x:= e1", R(x, e2))
    = R(x, e2)_x^{e1}
    = R(e1, e2)                        (since x is not free in e2)
    = wp("x, y:= e1, e2", R(x, y))     (by definition)

3. (a) 1*c^d = c^d, or T
(b) 1 ≤ 1 < n ∧ b[0] = (Σ j: 0 ≤ j ≤ 0: b[j]), or 1 < n
(c) 0² < 1 ∧ (0+1)² ≥ 1, or T

4. In these, it must be remembered that x is a function of the identifiers
involved. Hence, if x occurs in an expression in which a substitution is
being made, that substitution may change x also. In the places where this
happens, x is written as a function of the variables involved. See espe-
cially exercise (b).
(a) wp("a, b:= a+1, x", b = a+1) = (x = a+2). Hence, take x = a+2.

(b) wp("a:= a+1; b:= x(a)", b = a+1)
    = wp("a:= a+1", x(a) = a+1)
    = (x(a+1) = a+2)

This is satisfied by taking x(a) = a+1. Hence, take x = a+1.

(e) wp("i:= i+1; j:= x(i)", i = j)
    = wp("i:= i+1", i = x(i))
    = (i+1 = x(i+1))

Answers for Section 9.3


1. For each part, the weakest precondition, determined by textual substi-
tution, is given and then simplified.
(a) (b; i:i)[(b; i:i)[i]] = i
    = (b; i:i)[i] = i
    = i = i
    = T
(b) (E j: i ≤ j < n: (b; i:5)[i] ≤ (b; i:5)[j])
    = (E j: i ≤ j < n: 5 ≤ (b; i:5)[j])
    = (E j: i < j < n: 5 ≤ (b; i:5)[j]) ∨ 5 ≤ (b; i:5)[i]
    = (E j: i < j < n: 5 ≤ (b; i:5)[j]) ∨ 5 ≤ 5
    = T

Answers for Section 9.4


1. (a) Rtll;x;;e; i;f), g

2. For each part, the weakest precondition determined by textual substitu-
tion is given and then simplified.

(a) (b; i:3; 2:4)[i] = 3
    = (i = 2 ∧ 4 = 3) ∨ (i ≠ 2 ∧ 3 = 3)
    = i ≠ 2
(g) b[p] = (b; p:b[b[p]]; b[p]:p)[(b; p:b[b[p]]; b[p]:p)[b[p]]]
    = b[p] = (b; p:b[b[p]]; b[p]:p)[p]
    = (p = b[p] ∧ b[p] = p) ∨ (p ≠ b[p] ∧ b[p] = b[b[p]])
    = p = b[p] ∨ b[p] = b[b[p]]    (see (c))
5. The lemma has been proven for the case that x consists of distinct
identifiers, and we need only consider the case that x = b∘s1, ..., b∘sn.
To prove this case, we will need to use the obvious fact that

(E5.1)    (b; s:b∘s) = b

Remembering that xi equiv b∘si, we have

    E_x^u = E_b^{(b; s1:b∘s1; ...; sn:b∘sn)}    (substitute xi for each ui)
          = E_b^b                               (n applications of (E5.1))
          = E
Answers for Chapter 10


3. Letting R = (q*w + r = x ∧ r ≥ 0), we have

    wp(S3, R) = (w ≤ r ∨ w > r) ∧
                (w ≤ r ⇒ wp("r, q:= r-w, q+1", R)) ∧
                (w > r ⇒ wp(skip, R))
              = (w ≤ r ⇒ ((q+1)*w + r-w = x ∧ r-w ≥ 0)) ∧ (w > r ⇒ R)
              = (w ≤ r ⇒ (q*w + r = x ∧ r-w ≥ 0)) ∧ (w > r ⇒ R)

This is implied by R.
6. wp(S6, R) = (f[i] < g[j] ∨ f[i] = g[j] ∨ f[i] > g[j]) ∧
               (f[i] < g[j] ⇒ R_i^{i+1}) ∧
               (f[i] = g[j] ⇒ R) ∧
               (f[i] > g[j] ⇒ R_j^{j+1})
             = R ∧ (f[i] < g[j] ⇒ f[i+1] ≤ X) ∧ (f[i] > g[j] ⇒ g[j+1] ≤ X)
             = R    (since R implies that g[j] ≤ X and f[i] ≤ X)

Answers for Chapter 11


2. In the proof of theorem 10.5 it was proven that

Therefore, given assumption 1, we have

    P ∧ BB = P ∧ BB ∧ T
           = P ∧ BB ∧ (A i: P ∧ Bi ⇒ wp(Si, P))    (since 1 is true)
           = P ∧ BB ∧ P ∧ (A i: Bi ⇒ wp(Si, P))
           ⇒ BB ∧ (A i: Bi ⇒ wp(Si, P))
           = wp(IF, P)

3. By a technique similar to that used in exercise 2, we can show that
assumption 3 of theorem 11.6 implies

(E3.1)    P ∧ BB ⇒ wp("t1:= t; IF", t < t1)

Thus, we need only show that (E3.1) implies 3' of theorem 11.6. Note
that P, IF, and t do not contain t1 or t0. Since IF does not refer to t1
and t0, we know that wp(IF, t1 ≤ t0+1) = BB ∧ t1 ≤ t0+1. We then have
the following:

    (E3.1) = P ∧ BB ⇒ wp(IF, t < t1)_t1^t
                  (by definition of :=)
           ⇒ P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1)_t1^t ∧ t ≤ t0+1
                  (insert t ≤ t0+1 on both sides of ⇒)
           = P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1)_t1^t ∧ (t1 ≤ t0+1)_t1^t
           = P ∧ BB ∧ t ≤ t0+1 ⇒ (wp(IF, t ≤ t1-1) ∧ t1 ≤ t0+1)_t1^t
                  (Distributivity of textual substitution)
           = P ∧ BB ∧ t ≤ t0+1 ⇒ (wp(IF, t ≤ t1-1) ∧ wp(IF, t1 ≤ t0+1))_t1^t
                  (IF does not contain t1 nor t0)
           = P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t1-1 ∧ t1 ≤ t0+1)_t1^t
                  (Distributivity of Conjunction)
           ⇒ P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t0)_t1^t
           = P ∧ BB ∧ t ≤ t0+1 ⇒ wp("t1:= t; IF", t ≤ t0)
           = P ∧ BB ∧ t ≤ t0+1 ⇒ wp(IF, t ≤ t0)

Since the derivation holds irrespective of the value t0, it holds for all t0,
and 3' is true.
4. We first show that (11.7) holds for k = 0 by showing that it is equiv-
alent to assumption 2:

   P ∧ BB ⇒ t > 0                          (Assumption 2)
 = ¬P ∨ ¬BB ∨ t > 0                        (Implication, De Morgan)
 = ¬P ∨ ¬(t ≤ 0) ∨ ¬BB
 = P ∧ t ≤ 0 ⇒ ¬BB                         (De Morgan, Implication)
 = P ∧ t ≤ 0 ⇒ P ∧ ¬BB
 = P ∧ t ≤ 0 ⇒ H0(P ∧ ¬BB)                 (Definition of H0)

Assume (11.7) true for k = K and prove it true for k = K+1. We have:

   P ∧ BB ∧ t ≤ K+1 ⇒ wp(IF, P ∧ t ≤ K)    (this is 3')
                    ⇒ wp(IF, HK(P ∧ ¬BB))  (Induction hyp.)

and

   P ∧ ¬BB ∧ t ≤ K+1 ⇒ P ∧ ¬BB
                     = H0(P ∧ ¬BB)

These two facts yield

   P ∧ t ≤ K+1 ⇒ H0(P ∧ ¬BB) ∨ wp(IF, HK(P ∧ ¬BB))
               = HK+1(P ∧ ¬BB)

which shows that (11.7) holds for k = K+1. By induction, (11.7) holds
for all k.
6. H'0(R) = ¬BB ∧ R. For k > 0, H'k(R) = wp(IF, H'k−1(R)). H'k(R)
represents the set of states in which DO will terminate with R true in
exactly k iterations, and (E k: 0 ≤ k: H'k(R)) is the set of states in which
DO terminates with R true in some number of iterations. On the other
hand, Hk(R) represents the set of states in which DO will terminate with
R true in k or fewer iterations.
10. (1) wp("i:= 1", P) = 0 < 1 ≤ n ∧ (E p: 1 = 2^p)
        = T   (above, take p = 0).
    (2) wp(S1, P) = wp("i:= 2*i", 0 < i ≤ n ∧ (E p: i = 2^p))
        = 0 < 2*i ≤ n ∧ (E p: 2*i = 2^p),
    which is implied by P ∧ 2*i ≤ n.
    (3) P ∧ ¬BB = 0 < i ≤ n ∧ (E p: i = 2^p) ∧ 2*i > n,
    which is equivalent to R.
    (4) P ∧ BB ⇒ 0 < i ≤ n ∧ 2*i ≤ n
               ⇒ n − i > 0, which is t > 0.
    (5) wp("t1:= t; S1", t < t1)
        = wp("t1:= n−i; i:= 2*i", n−i < t1)
        = wp("t1:= n−i", n−2*i < t1)
        = n−2*i < n−i
        = −i < 0, which is implied by P.

Answers for Chapter 12


1. We have the following equivalence transformations:

   {T(u)} S {R} = (A u: {T(u)} S {R})
                = (A u: T(u) ⇒ wp(S, R))
                = (A u: ¬T(u) ∨ wp(S, R))       (Implication)
                = (A u: ¬T(u)) ∨ wp(S, R)
                      (since neither S nor R contains u)
                = ¬(E u: T(u)) ∨ wp(S, R)
                = (E u: T(u)) ⇒ wp(S, R)        (Implication)
(E1.1)          = {(E u: T(u))} S {R}
The quantifier A in the precondition of (12.7) of theorem 12.6 is neces-
sary; without it, predicate (12.7) has a different meaning. With the quan-
tifier, the predicate can be interpreted as follows: the procedure call can
be executed in a state s to produce the desired result R if all possible
assignments ū, v̄ to the result parameters and arguments establish the
truth of R. Without the quantifier, the above equivalence indicates that
an existential quantifier is implicitly present. With this implicit existential
quantifier, the predicate can be interpreted as follows: the procedure call
can be executed in a state s to produce the desired result R if there exists
at least one possible assignment of values ū, v̄ that establishes the truth
of R. But, since there is no guarantee that this one possible set of values
ū, v̄ will actually be assigned to the parameters and arguments, this state-
ment is generally false.

Answers for Chapter 14


1. (b) Q: x = X.  R: (X ≥ 0 ∧ x = X) ∨ (X ≤ 0 ∧ x = −X).
   if x ≥ 0 → skip  □  x ≤ 0 → x:= −x  fi.
2. Assume the next highest permutation exists (it doesn't, for example, for
d = 543221). In the following discussion, it may help to keep the example

   d  = (1,2,3,5,4,2)
   d' = (1,2,4,2,3,5)

in mind. There is a least integer i, 0 ≤ i < n, such that d[0:i−1] =
d'[0:i−1] and d[i] < d'[i]. One can show that i is well-defined by the
fact that d[i+1:n−1] is a non-increasing sequence and that d[i] <
d[i+1].
   In order for d' to be the next highest permutation, d'[i] must contain
the smallest value of d[i+1:n−1] that is greater than d[i]. Let the right-
most element of d[i+1:n−1] with this value be d[j]. Consider d″ =
(d; i:d[j]; j:d[i]). d″ represents the array d but with the values at posi-
tions i and j interchanged. In the example above, d″ = (1,2,4,5,3,2).
Obviously, d″ is a higher permutation than d, but perhaps not the next
highest. Moreover, d″[0:i] = d'[0:i].
   It can be proved that d″[i+1:n−1] is a non-increasing sequence.
Hence, reversing d″[i+1:n−1] makes it an increasing sequence and, there-
fore, as small as possible. This yields the desired next highest permuta-
tion d'.

To summarize, let i and j satisfy, respectively,

   0 ≤ i < n−1 ∧ d[i] < d[i+1] ∧ d[i+1:n−1] is non-increasing
   i < j < n ∧ d[j] > d[i] ∧ d[j+1:n−1] ≤ d[i]

Introducing the notation reverse(b, f, g) to denote the array b but with
b[f:g] reversed, we then have that the next highest permutation d' is

   d' = reverse((d; i:d[j]; j:d[i]), i+1, n−1).

The algorithm is then: calculate i; calculate j; swap d[i] and d[j]; reverse
d[i+1:n−1]. Here, formalizing the idea of a next highest permutation
leads directly to an algorithm to calculate it!
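The four steps of the summary translate almost line for line into Python; a sketch (function and variable names are mine, not the book's):

```python
def next_permutation(d):
    """Next-highest permutation of list d, or None if d is already the
    highest (i.e. non-increasing).  The plan of the answer: calculate i;
    calculate j; swap d[i] and d[j]; reverse the tail d[i+1:]."""
    d = list(d)
    n = len(d)
    i = n - 2
    while i >= 0 and d[i] >= d[i + 1]:   # d[i+1:] is non-increasing
        i -= 1
    if i < 0:
        return None                       # no next permutation exists
    j = n - 1
    while d[j] <= d[i]:                   # rightmost element > d[i]
        j -= 1
    d[i], d[j] = d[j], d[i]               # swap
    d[i + 1:] = d[i + 1:][::-1]           # tail was non-increasing; reverse it
    return d
```

On the answer's example, `next_permutation([1,2,3,5,4,2])` yields `[1,2,4,2,3,5]`.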

Answers for Section 15.1


3. i, x:= 1, b[0];
   do i ≠ n → if x ≥ b[i] → i, x:= i+1, b[i]
               □ x ≤ b[i] → i:= i+1
              fi
   od

Answers for Section 15.2


2. The initialization is x, y:= X, Y. Based on the properties given, the
obvious commands to try are x:= x+y, x:= x−y, x:= y−x, etc. Since
Y > 0, the first one never reduces the bound function t: x+y, so it need
not be used. The second one reduces it, but maintains the invariant only
if x > y. Thus we have the guarded command x > y → x:= x−y. Sym-
metry encourages also the use of y > x → y:= y−x, and the final program
is

   x, y:= X, Y;
   do x > y → x:= x−y
    □ y > x → y:= y−x
   od
   {0 < x = y ∧ gcd(x, y) = gcd(X, Y)}
   {x = gcd(X, Y)}
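The two-guard loop can be transcribed directly into Python; a minimal sketch:

```python
def gcd_by_subtraction(x, y):
    """Subtraction-only gcd, a direct transcription of
    do x > y -> x := x - y  []  y > x -> y := y - x  od.
    Requires x > 0 and y > 0 (the precondition X, Y > 0)."""
    assert x > 0 and y > 0
    while x != y:          # both guards false exactly when x = y
        if x > y:
            x -= y
        else:
            y -= x
    return x               # here x = y = gcd of the original inputs
```

The loop terminates because x + y (the bound function t) strictly decreases on every iteration while both variables stay positive.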
5. t:= 0;
   do j ≠ 80 cand b[j] ≠ ' ' → t, s[t+1], j:= t+1, b[j], j+1
    □ j = 80 → read(b); j:= 0
   od

Answers for Section 16.2


2. Delete the conjunct n < 2*i from R:

   {Q: 0 < n}
   i:= 1;
   {inv: 0 < i ≤ n ∧ (E p: 2^p = i)}
   {bound: n−i (actually, log(n−i) will do)}
   do 2*i ≤ n → i:= 2*i od
   {R: 0 < i ≤ n < 2*i ∧ (E p: 2^p = i)}

4. Delete the conjunct x = b[i,j] from R:

   i, j:= 0, 0;
   {inv: 0 ≤ i < m ∧ 0 ≤ j < n ∧ x ∉ b[0:i−1, 0:n−1] ∧
         x ∉ b[i, 0:j−1] ∧ x ∈ b}
   {bound: (m−i)*n − j}
   do x ≠ b[i,j] ∧ j ≠ n−1 → j:= j+1
    □ x ≠ b[i,j] ∧ j = n−1 → i, j:= i+1, 0
   od
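The row-major search of exercise 4 can be sketched in Python (the matrix is a list of equal-length rows; like the exercise, it assumes the precondition x ∈ b, so the loop need not test for running off the array):

```python
def find_in_matrix(b, x):
    """Row-major linear search mirroring the two-guard loop above.
    Precondition: x occurs somewhere in b.  Returns (i, j) with
    b[i][j] == x."""
    n = len(b[0])
    i, j = 0, 0
    while b[i][j] != x:
        if j != n - 1:        # keep scanning the current row
            j += 1
        else:                 # row exhausted: move to the next row
            i, j = i + 1, 0
    return i, j
```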

Answers for Section 16.3


4. (a) i, j:= 1, n;
   {inv: 1 ≤ i < j ≤ n ∧ b[i] ≤ x < b[j]}
   {bound: log(j−i)}
   do i+1 ≠ j → e:= (i+j) ÷ 2;
      if b[e] ≤ x → i:= e  □  b[e] > x → j:= e  fi
   od

The obvious choice for the second part of the problem is to embed the
program for the first part in an alternative command:

   if x < b[1] → i:= 0
    □ b[1] ≤ x < b[n] → the program of (a)
    □ b[n] ≤ x → i:= n
   fi

However, there is a simpler way. Assume the existence of b[0], which
contains the value −∞, and b[n+1], which contains the value +∞. As
long as the program never references these values, this assumption may be
made. Then, with a slight change in initialization, the program for the
first part can be used; it even works when the array is empty, setting j to 1 in
that case.

   i, j:= 0, n+1;
   {inv: 0 ≤ i < j ≤ n+1 ∧ b[i] ≤ x < b[j]}
   {bound: log(j−i)}
   do i+1 ≠ j → e:= (i+j) ÷ 2;
      {1 ≤ e ≤ n}
      if b[e] ≤ x → i:= e  □  b[e] > x → j:= e  fi
   od
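The sentinel trick carries over to Python unchanged: the virtual b[0] = −∞ and b[n+1] = +∞ are never referenced, because the midpoint e always satisfies 1 ≤ e ≤ n. A sketch (1-based shifting is mine, to match the answer's indexing):

```python
def rightmost_at_most(b, x):
    """Binary search over ascending b (viewed 1-based as b[1..n]).
    Returns the largest index i with b[i] <= x, or 0 if every element
    exceeds x.  Mirrors the invariant b[i] <= x < b[j] with virtual
    sentinels b[0] = -inf and b[n+1] = +inf."""
    n = len(b)
    a = [None] + list(b)       # shift to 1-based indexing
    i, j = 0, n + 1            # invariant holds via the sentinels
    while i + 1 != j:
        e = (i + j) // 2       # 1 <= e <= n, so a[e] is a real element
        if a[e] <= x:
            i = e
        else:
            j = e
    return i
```

As the answer notes, the empty array works too: with n = 0 the loop body never runs and the function returns 0.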

10. i, p:= 0, 0;
    {inv: see exercise 10; bound: n−i}
    do i ≠ n → Increase i, keeping invariant true:
       j:= i+1;
       {inv: b[i:j−1] are all equal; bound: n−j}
       do j ≠ n cand b[j] = b[i] → j:= j+1 od;
       p:= max(p, j−i);
       i:= j
    od
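A Python sketch of the nested loops (cand, the conditional conjunction, becomes Python's short-circuiting `and`):

```python
def longest_run(b):
    """Length p of the longest run of equal adjacent elements of b,
    using the nested-loop structure of the answer."""
    n = len(b)
    i, p = 0, 0
    while i != n:
        j = i + 1
        while j != n and b[j] == b[i]:   # b[i:j-1] are all equal
            j += 1
        p = max(p, j - i)
        i = j                            # skip past the whole run
    return p
```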

Answers for Section 16.5


4. The only differences between this problem and exercise 3 are that the value
used to separate the array into two sections is already in the array and
that that value must be placed in b[p]. The invariant of the loop (except
for the fact that b[m] = B[m] and that b is a permutation of B) given in
the procedure below is, written out from the array picture:

   P: m < q ≤ p+1 ≤ n ∧ x = B[m] ∧
      b[m] = x ∧ b[m+1:q−1] ≤ x ∧ b[q:p] unknown ∧ b[p+1:n−1] > x

   proc Partition(value result b: array [*] of integer;
                  value m, n: integer; result p: integer);
   var x, q: integer;
   begin
      x, q, p:= b[m], m+1, n−1;
      {inv: P; bound: p−q+1}
      do q ≤ p → if b[q] ≤ x → q:= q+1
                  □ b[p] > x → p:= p−1
                  □ b[q] > x ≥ b[p] → b[q], b[p]:= b[p], b[q];
                                      q, p:= q+1, p−1
                 fi
      od;
      {p = q−1 ∧ b[m+1:p] ≤ b[m] ∧
       b[p+1:n−1] > b[m] ∧ b[m] = B[m] = x}
      b[m], b[p]:= b[p], b[m]
   end
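The three-guard partition loop can be sketched in Python on a mutable list, with m and n delimiting the segment b[m:n−1] as in the procedure:

```python
def partition(b, m, n):
    """Partition b[m..n-1] around the pivot x = b[m], following the
    three-guard loop above.  On return, b[m..p-1] <= b[p] = x and
    b[p+1..n-1] > x; the final pivot index p is returned."""
    x = b[m]
    q, p = m + 1, n - 1
    while q <= p:
        if b[q] <= x:
            q += 1
        elif b[p] > x:
            p -= 1
        else:                        # b[q] > x >= b[p]: swap, shrink both ends
            b[q], b[p] = b[p], b[q]
            q, p = q + 1, p - 1
    # here p = q - 1; put the pivot into its final position
    b[m], b[p] = b[p], b[m]
    return p
```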

6. The precondition states that the linked list is in order; the postcondition
that it is reversed. This suggests an algorithm that at each step reverses
one link: part of the list is reversed and part of it is in order. Thus, using
another variable t to point to the part of the list that is in order, the
invariant is, in pictures: p points to the already-reversed nodes
(v_i, ..., v_1), ending in −1, while t points to the still-ordered nodes
(v_{i+1}, ..., v_n), ending in −1.

Initially, the reversed part of the list is empty and the unreversed part is
the whole list. This leads to the algorithm

   p, t:= −1, p;
   do t ≠ −1 → p, t, s[t]:= t, s[t], p od
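In Python the simultaneous assignment must be decomposed: in a Python tuple assignment the targets are assigned left to right, so the subscript in the target `s[t]` would use the already-updated t. A sketch with the list stored in an array `s` of successor indices:

```python
def reverse_links(s, p):
    """In-place reversal of a linked list stored in array s, where s[t]
    is the successor of node t and -1 marks the end; p is the head.
    Transcribes:  p, t := -1, p;  do t != -1 -> p, t, s[t] := t, s[t], p od.
    The body is decomposed because Python assigns targets left to right."""
    p, t = -1, p
    while t != -1:
        nxt = s[t]     # old s[t]
        s[t] = p       # reverse one link
        p, t = t, nxt
    return p           # new head of the reversed list
```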

8. The precondition and postcondition are

   Q: x ∈ b[0:m−1, 0:n−1]
   R: 0 ≤ i < m ∧ 0 ≤ j < n ∧ x = b[i,j]

Actually, Q and R are quite similar, in that both state that x is in a rec-
tangular section of b; in R, the rectangular section just happens to have
only one row and column. So perhaps an invariant can be used that indi-
cates that x is in a rectangular section of b:

   P: 0 ≤ i ≤ p < m ∧ 0 ≤ q ≤ j < n ∧ x ∈ b[i:p, q:j]

To make progress towards termination the rectangle must be made
smaller, and there are four simple ways to do this: i:= i+1, etc. Thus, we
try a loop of the form

   i, p, q, j:= 0, m−1, 0, n−1;
   do ? → i:= i+1
    □ ? → p:= p−1
    □ ? → q:= q+1
    □ ? → j:= j−1
   od

What could serve as guards? Consider i:= i+1. Its execution will main-
tain the invariant if x is not in row i of b. Since the row is ordered, this
can be tested with b[i,j] < x, for if b[i,j] < x, so are all values in row
i. In a similar fashion, we determine the other guards:

   i, p, q, j:= 0, m−1, 0, n−1;
   do b[i,j] < x → i:= i+1
    □ b[p,q] > x → p:= p−1
    □ b[p,q] < x → q:= q+1
    □ b[i,j] > x → j:= j−1
   od

In order to prove that the result is true upon termination, only the first
and last guards are needed. So the middle guarded commands can be
deleted to yield the program

   i, j:= 0, n−1;
   do b[i,j] < x → i:= i+1
    □ b[i,j] > x → j:= j−1
   od {x = b[i,j]}

This program requires at most n+m comparisons. One cannot do much
better than this, for the following reason. Assume the array is square:
m = n. The off-diagonal elements b[0,m−1], b[1,m−2], ..., b[m−1,0]
form an unordered list. Given the additional information that x is on the
off-diagonal, in the worst case a minimum of m comparisons is necessary.
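The final two-guard program is the classic saddleback search, which starts at the top-right corner. A Python sketch (like the answer, it assumes the precondition that x occurs in b, whose rows and columns are ascending):

```python
def saddleback(b, x):
    """Saddleback search of an m x n matrix with ascending rows and
    columns.  Precondition: x occurs in b.  At most m + n comparisons."""
    n = len(b[0])
    i, j = 0, n - 1        # top-right corner of the candidate rectangle
    while b[i][j] != x:
        if b[i][j] < x:
            i += 1         # all of row i is < x: discard the row
        else:
            j -= 1         # all of column j is > x: discard the column
    return i, j
```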

Answers for Section 18.2


1. Algorithm (18.2.5) was developed under the general hope that each
iteration of the loop would decrease the total number of elements S (say)
in the partitions still to be sorted, and so S is a first approximation to the
bound function. However, a partition described in set s may be empty,
and choosing an empty partition and deleting it from s does not decrease
S.
   Consider the pair (S, |s|), where |s| is the number of elements in s.
Execution of the body with the first alternative of the alternative com-
mand being executed decreases the pair (lexicographically speaking)
because it decreases |s|. Execution with the second alternative being
executed also decreases it: even though |s| is increased by 1, S is
decreased by at least one. Hence, a bound function is 2*S + |s|.

Answers for Section 18.3


3. Define

   postorder(p) = { empty(p)  → ()
                  { ¬empty(p) → postorder(left[p]) | postorder(right[p]) | (root(p))

An iterative formulation of postorder traversal is slightly more compli-
cated than the corresponding ones of preorder and inorder traversal. It
must be remembered whether the right subtree of a tree in s has been
visited. Since (pointers to) nodes of trees are represented by nonnegative
integers, we make the distinction using the sign bit.
   The postcondition of the program is

   (E3.1) R: c = #p ∧ postorder(p) = b[0:c−1]

Before stating the invariant, we indicate the postorder traversal of a
signed integer q that represents a tree:

   post(q) = { q < −1 → (root(abs(q)−2))
             { q = −1 → ()
             { q ≥ 0  → postorder(right[q]) | (root(q))

Using a sequence variable s, the invariant is:

   (E3.2) P: 0 ≤ c ∧ q ≥ −1 ∧
             postorder(p) = b[0:c−1] | postorder(q) |
                            post(s[0]) | ... | post(s[|s|−1])

The program is then

   c, q, s:= 0, p, ();
   {invariant: (E3.2)}
   do q ≠ −1 → q, s:= left[q], (q) | s
    □ q = −1 ∧ s ≠ () → q, s:= s[0], s[1..];
         if q < −1 → q, c, b[c]:= −1, c+1, root[−q−2]
          □ q = −1 → skip
          □ q > −1 → q, s:= right[q], (−q−2) | s
         fi
   od {R}
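In Python the sign-bit encoding is unnecessary: each stack entry can simply carry a flag recording whether the node's right subtree has been visited. A sketch of the same traversal on a tree stored in parallel arrays (the flagged-stack variant is mine, not the book's encoding):

```python
def postorder(left, right, root, p):
    """Iterative postorder traversal of a binary tree given as parallel
    arrays left/right/root, with -1 marking an empty subtree; p is the
    tree's root index.  Stack entries are (node, right_subtree_done)."""
    out, stack = [], []
    q = p
    while q != -1 or stack:
        if q != -1:                  # descend to the left, saving the node
            stack.append((q, False))
            q = left[q]
        else:
            q, done = stack.pop()
            if done:                 # right subtree already visited: emit
                out.append(root[q])
                q = -1
            else:                    # visit the right subtree, return later
                stack.append((q, True))
                q = right[q]
    return out
```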

Answers for Section 19.2


1. P1 is easily established using i, x:= 0, 0. If (xv[i], yv[i]) is a solution
to (19.2.1), then

   r = xv[i]^2 + yv[i]^2 ≤ 2*xv[i]^2

Hence, all solutions (xv[i], yv[i]) to the problem satisfy r ≤ 2*xv[i]^2.
Using the Linear Search Principle, we write an initial loop to determine
the smallest x satisfying r ≤ 2*x^2, and the first approximation to the pro-
gram is

   i, x:= 0, 0; do r > 2*x^2 → x:= x+1 od;
   {inv: P1 ∧ r ≤ 2*x^2}
   do x^2 ≤ r → Increase x, keeping invariant true od

In order to increase x and keep P1 true, it is necessary to determine if a
suitable y exists for x and to insert the pair (x, y) in the arrays if it does.
To do this requires first finding the value y satisfying (E1.1). Taking the
second conjunct, P2, as the invariant of an inner loop, rewrite the program as

   i, x:= 0, 0; do r > 2*x^2 → x:= x+1 od;
   {inv: P1 ∧ r ≤ 2*x^2}
   do x^2 ≤ r →
      Increase x, keeping invariant true:
         Determine y to satisfy (E1.1):
            y:= x;
            {inv: P2}
            do x^2 + y^2 > r → y:= y−1 od;
         if x^2 + y^2 = r → xv[i], yv[i], i, x:= x, y, i+1, x+1
          □ x^2 + y^2 < r → x:= x+1
         fi
   od

Now note that execution of the body of the main loop does not destroy
P2, and therefore P2 can be taken out of the loop. Rearrangement then
leads to the more efficient program

   i, x:= 0, 0;
   do r > 2*x^2 → x:= x+1 od;
   y:= x;
   {inv: P1 ∧ P2}
   do x^2 ≤ r →
      Increase x, keeping invariant true:
         Determine y to satisfy (E1.1):
            do x^2 + y^2 > r → y:= y−1 od;
         if x^2 + y^2 = r → xv[i], yv[i], i, x:= x, y, i+1, x+1
          □ x^2 + y^2 < r → x:= x+1
         fi
   od
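The efficient version is a two-pointer scan: x only increases and y only decreases. A Python sketch that collects the pairs instead of storing them in xv and yv:

```python
def sums_of_two_squares(r):
    """All pairs (x, y) with x >= y >= 0 and x*x + y*y == r, found by
    the two-pointer scheme of the final program above."""
    pairs = []
    x = 0
    while r > 2 * x * x:          # smallest x with r <= 2*x^2
        x += 1
    y = x
    while x * x <= r:
        while x * x + y * y > r:  # lower y until x^2 + y^2 <= r
            y -= 1
        if x * x + y * y == r:
            pairs.append((x, y))
        x += 1                    # in either case, move x up
    return pairs
```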

Answers for Section 19.3


2. Program 19.3.2, which determines an approximation to the square root
of a nonnegative integer n, is

   {n ≥ 0}
   a, c:= 0, 1; do c^2 ≤ n → c:= 2*c od;
   {inv: a^2 ≤ n < (a+c)^2 ∧ (E p: 0 ≤ p: c = 2^p)}
   {bound: sqrt(n) − a}
   do c ≠ 1 → c:= c/2;
      if (a+c)^2 ≤ n → a:= a+c
       □ (a+c)^2 > n → skip
      fi
   od
   {a^2 ≤ n < (a+1)^2}

We attempt to illustrate how a change of representation to eliminate
squaring operations could be discovered.
   Variables a and c are to be represented by other variables and elim-
inated, so that no squaring is necessary. As a first step, note that the
squaring operation c^2 must be performed in some other fashion. Perhaps
a fresh variable p can be used, which will always satisfy the relation

   p = c^2

Now, which operations involving c can be replaced easily by operations
involving p instead?
   Command c:= 1 can be replaced by p:= 1, expression c^2 by p, c:= 2*c by
p:= 4*p, c ≠ 1 by p ≠ 1, and c:= c/2 by p:= p/4.
The remaining operations to rewrite are a:= 0 (if necessary), a:= a+c,
(a+c)^2 ≤ n and (a+c)^2 > n. Consider the latter two expressions, which
involve squaring. Performing the expansion (a+c)^2 = a^2 + 2*a*c + c^2
isolates another instance of c^2 to be replaced by p, so we rewrite the first
of these as

   (E3.1) a^2 + 2*a*c + p − n ≤ 0

Expression (E3.1) must be rewritten, using new variables, in such a way
that the command a:= a+c can also be rewritten. What are possible new
variables and their meaning?
   There are a number of possibilities, for example q = a^2, q = a*c, q =
a^2 − n, and so forth. The definition

   (E3.2) q = a*c

is promising, because it lets us replace almost all the operations involving
a. Thus, before the main loop, q will be 0 since a is 0 there. Secondly,
to maintain (E3.2) across c:= c/2 we can insert q:= q/2. Thirdly, (E3.2)
is maintained across execution of the command a:= a+c by assigning a new
value to q. What is the value?
   To determine the value x to assign to q, calculate

   wp("a, q:= a+c, x", q = a*c)  =  (x = (a+c)*c)

The desired assignment is therefore

   q:= (a+c)*c,   which is equivalent to
   q:= a*c + c^2, which is equivalent to
   q:= q + p

With this representation, (E3.1) becomes

   (E3.3) a^2 + 2*q + p − n ≤ 0

Now try a third variable r to contain the value n − a^2, which will always
be ≥ 0. (E3.3) becomes

   2*q + p − r ≤ 0

And, indeed, the definition of r can also be maintained easily.
   To summarize, use three variables p, q and r, which satisfy

   p = c^2,  q = a*c,  r = n − a^2

and rewrite the program as



   {n ≥ 0}
   p, q, r:= 1, 0, n; do p ≤ n → p:= 4*p od;
   do p ≠ 1 → p:= p/4; q:= q/2;
      if 2*q + p ≤ r → q, r:= q+p, r−2*q−p
       □ 2*q + p > r → skip
      fi
   od
   {q^2 ≤ n < (q+1)^2}

Upon termination we have p = 1, c = 1, and q = a*c = a, so that the
desired result is in q. Not only have we eliminated squaring, but all mul-
tiplications and divisions are by 2 and 4; hence, they could be imple-
mented with shifting on a binary machine. Thus, the approximation to
the square root can be performed using only adding, subtracting and shift-
ing.
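The transformed program runs as written; a Python sketch using shifts for the multiplications and divisions by 2 and 4:

```python
def isqrt_shift(n):
    """Integer square root q with q*q <= n < (q+1)*(q+1), using the
    p, q, r representation derived above (p = c^2, q = a*c, r = n - a^2);
    only addition, subtraction and shifting are needed."""
    assert n >= 0
    p, q, r = 1, 0, n
    while p <= n:
        p <<= 2                   # p := 4*p  (c := 2*c)
    while p != 1:
        p >>= 2                   # p := p/4  (c := c/2)
        q >>= 1                   # q := q/2
        if 2 * q + p <= r:        # (a+c)^2 <= n in the new coordinates
            q, r = q + p, r - 2 * q - p
        # else: skip
    return q                      # q = a*c = a, since c = 1 on exit
```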

Answers for Chapter 20


1. Call a sequence (of zeroes and ones) that begins with 0000 (4 zeroes)
and satisfies property 2 a good sequence. Call a sequence with k bits a
k-sequence.
   Define an ordering among good sequences as follows. Sequence s1 is
less than sequence s2, written s1 .<. s2, if, when viewed as decimal
numbers with the decimal point to the extreme left, s1 is less than s2. For
example, 101 .<. 1011 because .101 < .1011. In a similar manner, we write
101 .=. 101000, because .101 = .101000. Appending a zero to a sequence
yields an equal sequence; appending a one yields a larger sequence.
   Any good sequence s to be printed satisfies 0 .<. s .<. 00001, and must
begin with 00000.
   The program below iteratively generates, in order, all good sequences
satisfying 0 .≤. s .≤. 00001, printing the 36-bit ones as they are generated.
The sequence currently under consideration will be called s. There will be
no variable s; it is just a name for the sequence currently under considera-
tion. s always contains at least 5 bits. Further, to eliminate problems
with equal sequences, we will always be sure that s is the longest good
sequence equal to itself.

   P1: good(s) ∧ ¬good(s | 0) ∧ 5 ≤ |s| ∧ 0 .≤. s .≤. 00001 ∧
       all good sequences .<. s are printed

Sequence s with n bits could be represented by a bit array. However,
it is better to represent s by an integer array c[4:n−1], where c[i] is the
decimal representation of the 5-bit subsequence of s ending in bit i.
Thus, we will maintain as part of the invariant of the main loop the asser-
tion

   P2: 5 ≤ n = |s| ≤ 36 ∧
       c[i] = s[i−4]*2^4 + s[i−3]*2^3 + s[i−2]*2^2 + s[i−1]*2 + s[i]
       (for 4 ≤ i < n)

Further, in order to keep track of which 5-bit subsequences s contains, we
use a Boolean array in[0:31]:

   P3: (A i: 0 ≤ i < 32: in[i] = (i ∈ c[4:n−1]))

With this introduction, the program should be easy to follow.

   n, c[4], in[0]:= 5, 0, T;
   in[1:31]:= F; {s = (0,0,0,0,0)}
   {inv: P1 ∧ P2 ∧ P3 ∧ ¬good(s | 0)}
   do c[4] ≠ 1 →
      if n = 36 → Print sequence s
       □ n ≠ 36 → skip
      fi;
      Change s to next higher good sequence:
      do in[(c[n−1]*2+1) mod 32]   {i.e. ¬good(s | 1)}
         → Delete ending 1's from s:
              do odd(c[n−1]) → n:= n−1; in[c[n]]:= F od;
           Delete ending 0:
              n:= n−1; in[c[n]]:= F
      od;
      Append 1 to s:
         c[n]:= (c[n−1]*2+1) mod 32; in[c[n]]:= T; n:= n+1
   od
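The same enumeration can be sketched recursively in Python. Instead of the book's iterative backtracking with the c and in arrays, this sketch (names mine) keeps a set of the 5-bit window values seen so far and tries 0 before 1, so the results come out in the same increasing order:

```python
def unique_5bit_sequences():
    """All 36-bit 0/1 sequences starting 00000 whose 32 overlapping
    5-bit windows are pairwise distinct, in increasing order."""
    results = []
    s = [0, 0, 0, 0, 0]
    seen = {0}                       # window values present in s
    def extend(last):
        if len(s) == 36:
            results.append(''.join(map(str, s)))
            return
        for bit in (0, 1):           # 0 first: lexicographic order
            w = ((last << 1) | bit) & 31
            if w not in seen:
                seen.add(w)
                s.append(bit)
                extend(w)
                s.pop()
                seen.discard(w)
    extend(0)
    return results
```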

7. The result assertion is

   R: c = (N i: 0 ≤ i < F: f[i] ∉ g[0:G−1]) +
          (N j: 0 ≤ j < G: g[j] ∉ f[0:F−1])

We would expect to write a program that sequences up the two arrays
together, in some synchronized fashion, performing a count as it goes.
Thus, it makes sense to develop an invariant by replacing the two con-
stants F and G of R as follows:

   0 ≤ h ≤ F ∧ 0 ≤ k ≤ G ∧
   c = (N i: 0 ≤ i < h: f[i] ∉ g[0:G−1]) +
       (N j: 0 ≤ j < k: g[j] ∉ f[0:F−1])

Now, consider execution of h:= h+1. Under what conditions does its
execution leave P true? The guard for this command must obviously
imply f[h] ∉ g[0:G−1], but we want the guard to be simple. As it

stands, this seems out of the question.
   Perhaps strengthening the invariant will allow us to find a simple guard.
One thing we haven't tried to exploit is moving through the arrays in a
synchronized fashion; the invariant does not imply this at all. Suppose
we add to the invariant the conditions f[h−1] < g[k] and g[k−1] < f[h];
this might provide the synchronized search that we desire. That is, we
use the invariant

   P: 0 ≤ h ≤ F ∧ 0 ≤ k ≤ G ∧ f[h−1] < g[k] ∧ g[k−1] < f[h] ∧
      c = (N i: 0 ≤ i < h: f[i] ∉ g[0:G−1]) +
          (N j: 0 ≤ j < k: g[j] ∉ f[0:F−1])

Then the additional condition f[h] < g[k] yields

   g[k−1] < f[h] < g[k]

so that f[h] does not appear in g, and increasing h will maintain the
invariant. Similarly, the guard for k:= k+1 will be g[k] < f[h].
   This gives us our program, written below. We assume the existence of
virtual values f[−1] = g[−1] = −∞ and f[F] = g[G] = +∞; this allows
us to dispense with worries about boundary conditions in the invariant.

   h, k, c:= 0, 0, 0;
   {inv: P; bound: F−h + G−k}
   do h ≠ F ∧ k ≠ G →
      if f[h] < g[k] → h, c:= h+1, c+1
       □ f[h] = g[k] → h, k:= h+1, k+1
       □ f[h] > g[k] → k, c:= k+1, c+1
      fi
   od;
   Add to c the number of unprocessed elements of f and g:
   c:= c + F−h + G−k
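The synchronized scan is a standard sorted-merge count; a Python sketch (assuming, as the exercise does, that each array is sorted and free of duplicates):

```python
def count_not_common(f, g):
    """Count elements of sorted f absent from g plus elements of sorted g
    absent from f, by the synchronized scan of the answer."""
    F, G = len(f), len(g)
    h, k, c = 0, 0, 0
    while h != F and k != G:
        if f[h] < g[k]:
            h, c = h + 1, c + 1   # f[h] cannot occur in g
        elif f[h] == g[k]:
            h, k = h + 1, k + 1   # common element: counted in neither sum
        else:
            k, c = k + 1, c + 1   # g[k] cannot occur in f
    # the unprocessed tail of either array is entirely non-common
    return c + (F - h) + (G - k)
```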
Index

abort, 114
abs, 314
Abstraction, 149
Addition, 314
  identity of, 72
Aho, A.V., 309
Allen, Layman E., 42
Alternative command, 132
  strategy for developing, 174
Ambiguity, 308
An Exercise Attributed to Hamming, 243, 302
and, see Conjunction
and-simplification, 21
Annotated program, 104
Annotation for a loop, 145
Antecedent, 9
Approximating the Square Root, 195, 201, 246, 350
Argument, 152
  final value of, 155
  initial value of, 155
Array, 88
  as a function, 89
  domain of, 89
  two-dimensional, 96
Array of arrays, 96
Array picture, 93
Array Reversal, 214, 302
Array section, 93
Assertion, 2, 100
  output, 100
  result, 100
  placement of, 278
Assignment, simple, 117
  forward rule for, 120
  multiple assignment, 121, 127
  to an array element, 124, 90
Associative laws, 20, 69
  proof of, 48
Associativity of composition, 316
Atomic expression, 67
Axiom, 25
Backus, John, 304
Backus-Naur Form, 304
Balloon theory, 193
Bauer, F.L., 296
BB, 132
Binary relation, 315
Binary Search, 205, 302, 344
Binary tree, 229
BNF, 304
  extensions to, 309
Boole, George, 8, 20
Boolean, 8, 66
Bound function, 142
Bound identifier, 76-77
Bound variable substitution, 80, 85

85 nondeterministic, I I I
Bounded nondeterminism, 312 procedure call, 164
sequential composition, 114-115
Calculus, 25 skip, 114
propositional calculus, 25 Command-comment, 99, 279
predicate calculus, 66 indentation of, 279
Call, of a procedure, 152 Common sense and formality, 164
by reference, 158 Commutative laws, 20
by result, 151 proof of, 48
by value, 151 Composition, associativity of, 3 I 6
by value result, 151 Composition, of relations, 3 16
cand,68-70 Composition, sequential, 114-115
cand-simplification, 80 Concatenation, see Catenation
Cardinality, of a set, 311 Conclusion, 29
Cartesian product, 315 Conjecture, disproving, 15
Case statement, 134 Conjunct, 9
Catenation, 75 Conjunction, 9-10
identity of, 75, 333 distributivity of, 110
of sequences, 312 identity of, 72
ceil, 314 Conjunctive normal form, 27
Changing a representation, 246 Consequent, 9
Chebyshev, 83 Constable, Robert, 42
Checklist for understanding a loop, Constant proposition, 10
145 Constant-time algorithm, 321
Chomsky, Noam, 304 Contradiction, law of, 20, 70
Choose, 312 Contradiction, proof by, 39-41
Closing the Curve, 166, 30 I Controlled Density Sort, 247, 303
Closure, of a relation, 3 17 cor, 68-70
transitive, 3 I 7 cor-simplification, 79
Code, for a permutation, 270 Correctness
Code to Perm, 264, 272-273, 303 partial, 109-110
Coffee Can Problem, 165,301 total, 110
Combining pre- and postconditions, Counting nodes of a tree, 23 I
21 I Cubic algorithm, 321
Command, 108 Cut point, 297
abort, 114
alternative command, 132 Data encapsulation, 235
assignment, multiple, 121, 127 Data refinement, 235
assignment, simple, 128 De Morgan, Augustus, 20
assignment to an array element, 124 De Morgan's laws, 20, 70
Choose, 312 proof of, 49
deterministic, I I I Debugging, 5
guarded command, 13 I Decimal to Base B, 215, 302
iterative command, 139 Decimal to Binary, 215,302

Declaration, of a procedure, 150
Deduction theorem, 36
Definition, of variables, 283
Deleting a conjunct, 195
Demers, Alan, 302
Depth, of a tree, 236
Derivation, 308
Derived inference rule, 46
  rule of Substitution, 46-47
Determinism, 111
Deterministic command, 111
Difference, of two sets, 311
Different Adjacent Subsequences, 262, 303
Dijkstra, E.W., 295-296, 300-303
disj, 159
Disjoint, pairwise, 159
Disjoint vectors, 159
Disjunct, 9
Disjunction, 9-10
  distributivity of, 111
  identity of, 72
Disjunctive normal form, 27
Distributive laws, 20, 69
  proof of, 48
Distributivity of Conjunction, 110
Distributivity of Disjunction, 111
Divide and Conquer, 226
DO, 138-139
domain, 89, 117
Domain, of an array, 89
Domain, of an expression, 117
Dutch National Flag, 214, 302
Dynamic Programming, 261
Efficient Queues in LISP, 250, 303
Eliminating an Implication, 24
Elimination, rule of, 30
Empty section, 93
Empty set, 310
Empty tree, 229
Enlarging the range of a variable, 206
Equality, 9-10
  law of, 20
equals, see Equality
Equivalence, 19
  laws of, 19-21
Equivalent propositions, 19
Euclid, 301
even, 314
Excluded Middle, law of, 20, 70
Excluded Miracle, law of, 110
Exclusive or, 11
Existential quantification, 71
Exponential algorithm, 321
Exponentiation, 239, 252, 302
Expression, atomic, 67
Expression, domain of, 117
F, 8
Factorial function, 221
Feijen, W.H.J., 264, 302
Fibonacci number, 225
Final value, of a variable, 102
  of an argument, 155
Finding Sums of Squares, 245, 302, 348
Flaw chart, 138, 190-191
  disadvantages of, 275
floor, 314
Floyd, Robert, 297
Formality and common sense, 164
Forward rule for assignment, 120
Four-tuple Sort, 185, 239, 301
Free identifier, 76-77
Function, 318
  bound function, 142
  n-ary function, 319
  of an Identifier, 318
  variant function, 142
gcd, 191, 224-225, 301, 343
Gentzen, Gerhard, 42
Gill, Stanley, 296

Global reference, 38
Grammar, 305
  ambiguous, 308
  sentence of, 305
  unambiguous, 308
Greatest common divisor, see gcd
Griffiths, Michael, 301
Guard, 131
Guarded command, 131
Halving an interval, 202
Hamming, R.W., 302
Heading, of a procedure, 282
Hoare, C.A.R., 295, 297-299, 302
Hopcroft, J.E., 309
Horner, W.G., 243
Horner's rule, 242
Identifier, 9
  bound, 76-77
  free, 76-77
  quantified, 71
  quantified, range of, 82
  quantified, type of, 83
  restriction on, 76
Identity element, 72
Identity, law of, 21
Identity relation, 316
Identity, of addition, 72
  of and, 72
  of catenation, 75, 333
  of multiplication, 72
  of or, 72
IF, 132
Iff, ix
IFIP, 295, 300
imp, see Implication
Implementation of a tree, 230
Implication, 9-10
  elimination of, 24
  law of, 20
Implicit quantification, 83-84
Inclusive or, 11
Indentation, 275
  of command-comments, 279
  of delimiters, 279
Inference rule, 25, 30
  bound variable substitution, 85
  derived, 46
  ∃-E, 85
  ∃-I, 84
  ∀-E, 84
  ∀-I, 84
  =-E, 34, 43
  =-I, 34, 43
  ⇒-E, 33, 43
  ⇒-I, 36, 43
  ∧-E, 31, 43
  ∧-I, 30-31, 43
  ∨-E, 33, 43
  ∨-I, 31, 43
  ¬-E, 40, 43
  ¬-I, 40, 43
Initial value, of a variable, 102
  of an argument, 155
Inorder traversal, 236
inrange, 125
Insertion Sort, 247
Integer, 67, 314
Integer set, 67
Intersection, of two sets, 311
Introduction, rule of, 30
Invariant, 141
Invariant relation, see Invariant
Inversion, 185-186
Inverting Programs, 267
Iteration, 139
Iterative command, 139
  strategy for developing, 181, 187
Ithacating, 31
Justifying Lines, 253, 289-293, 303
Knuth, Donald E., 302

Laws,
  and-simplification, 21
    proof of, 51
  cand-simplification, 70
  Associative, 20, 69
  Commutative, 20
  Contradiction, 20, 70
    proof of, 51
  cor-simplification, 70
  De Morgan's, 20, 70
  Distributive, 20, 69
  Distributivity of Conjunction, 110
  Distributivity of Disjunction, 111
  Equality, 20
    proof of, 51
  Equivalence, 19-21
  Excluded Middle, 20, 70
    proof of, 50
  Excluded Miracle, 110
  Identity, 21
  Implication, 20
    proof of, 51
  Monotonicity, 111
  Negation, 20
    proof of, 50
  or-simplification, 21
    proof of, 51
Leaf, of a tree, 229
Left subtree, 229
Length, of a sequence, 312
Levin, Gary, 302
Lexicographic ordering, 217
LHS, Left hand side (of an equation)
Linear algorithm, 321
Linear Search, 197, 206, 301
Linear Search Principle, 197, 301
Line Generator, 263
Link Reversal, 215, 302, 346
List of nodes,
  inorder, 236
  postorder, 236, 347
  preorder, 232
log, 314
Logarithm, 314, 321
Logarithmic algorithm, 321
Longest Upsequence, 259, 303
Loop, 139, see Iterative command
  annotation for, 145
  checklist for understanding, 145
  iteration of, 139
lower, 89
lup, 259
make-true, 116
Many-to-one relation, 316
max, 314
Maximum, 301
McCarthy, John, 295
McIlroy, Douglas, 168
Melville, Robert, 303
Meta-theorem, 46
Mills, Harlan, 302
min, 314
Misra, J., 303
mod, 314
Modus ponens, 34
Monotonicity, law of, 111
Multiple assignment, 121, 127
Multiplication, identity of, 72
n-ary function, 319
n-ary relation, 319
Natural deduction system, 29-43
Natural number, 67, 313
Naur, Peter, 297, 304
Negation, 9-10
Negation, law of, 20
Nelson, Edward, 302
Newton, Isaac, 243
Next Higher Permutation, 178, 262, 301, 342
Node Count, 231, 302
Node, of a tree, 229
Non-Crooks, 264, 353
Nondeterminism, 111

  bounded, 312
  restriction of, 238
  unbounded, 312
Nondeterministic command, 111
Nonterminal symbol, 304
Normal form, conjunctive, 27
Normal form, disjunctive, 27
not, see Negation
Null selector, 96
Numerical quantification, 73-74
odd, 314
O'Donnell, Michael, 42
One-to-many relation, 316
One-to-one relation, 316
Onto relation, 316
or, see Disjunction
Order f(n), 321
ordered, 94, 126
Ordered binary tree, 229
Order of execution time, 321
Or, exclusive, 11
  inclusive, 11
or-simplification, 21
Outline of a proof, 104
Output assertion, 100
Pairwise disjoint, 159
Paradox, 21
Parameter, 150
  var, 158
  final value of, 155
  initial value of, 155
  result, 151
  specification for, 150
  value, 151
  value result, 151
Partial correctness, 109-110
Partial relation, 316
Partition, 214, 302, 345
pdisj, 159
Perfect number, 85
Period, of a Decimal Expansion, 264
perm, 91
Perm, see Permutation
Perm to Code, 263, 270, 303
Permutation, 75
Placement of assertions, 278
Plateau, of an array, 203
Plateau Problem, 203, 206, 301
Postcondition, 100, 109
  strongest, 120
postorder, 347
Postorder list, 347
Postorder traversal, 236, 347
Precedence rules, 12, 67
Precondition, 100, 109
  weakest, 109
Predicate, 66-87
  evaluation of, 67-68
Predicate calculus, 66
Predicate transformer, 109
Predicate weakening, 195
Premise, 29
preorder, 233
Preorder list, 232
Preorder traversal, 232
Prime number, 83
Problems,
  An Exercise Attributed to Hamming, 243, 302
  Approximating the Square Root, 195, 201, 246, 350
  Array Reversal, 214, 302
  Binary Search, 205, 302, 344
  Closing the Curve, 166, 301
  Code to Perm, 264, 272-273, 303
  Coffee Can, 165, 301
  Controlled Density Sort, 247, 303
  Decimal to Base B, 215, 302
  Decimal to Binary, 215, 302
  Different Adjacent Subsequences, 262, 303
  Dutch National Flag, 214, 302
  Efficient Queues in LISP, 250, 303
  Exponentiation, 239, 252, 302

Problems (continued)
  Finding Sums of Squares, 245, 302, 348
  Four-tuple Sort, 185, 239, 301
  gcd, 191, 224-225, 301, 343
  Insertion Sort, 247
  Justifying Lines, 253, 289-293, 303
  Linear Search, 197, 206, 301
  Line Generator, 263
  Link Reversal, 215, 302, 346
  Longest Upsequence, 259, 303
  Maximum, 301
  Next Higher Permutation, 178, 262, 301, 342
  Node Count, 231, 302
  Non-Crooks, 264, 353
  Partition, 214, 302, 345
  Period of a Decimal Expansion, 264
  Perm to Code, 263, 270, 303
  Plateau Problem, 203, 206, 301
  Quicksort, 226, 302
  Railroad Shunting Yard, 219
  Saddleback Search, 215, 302, 346
  Searching a Two-Dimensional Array, 182, 188, 301
  Swap, 103, 119
  Swapping Equal-Length Sections, 212, 302
  Swapping Sections, 222, 302
  Tardy Bus, 59
  Unique 5-bit Sequences, 262, 303, 352
  Welfare Crook, 207, 238, 302
Procedure, 149-162
  heading of, 282
  argument of a call, 152
  body of, 150-151
  call of, 152
  declaration of, 150
  recursive, 217
Production, 304
Program, annotated, 104
Program inversion, 267
Program transformation, 235
Programming-in-the-small, 168
Proof, 163
  by contradiction, 39-41
  in natural deduction system, 31-32, 41-42
  versus test-case analysis, 165
Proof Outline, 104
Proportional to, 321
Proposition, 8-16
  as a set of states, 15
  constant, 10
  equivalent, 19
  evaluation of, 11-14
  stronger, 16
  syntax of, 8-9, 13
  weaker, 16
  well-defined, 11
Propositional calculus, 25
Quadratic algorithm, 321
Quantification, 71-74
  implicit, 83-84
  numerical, 73-74
  universal, 73
Quantified identifier, 71
  range of, 71, 74, 82
  type of, 83
Quantifier, existential, 71
Queue, 313
Quicksort, 226, 302
Quine, W.V.O., 42
QWERTY programmer, 170
Railroad Shunting Yard, 219
Randell, Brian, 296
Range of quantified identifier, 71, 74, 82
Record, 92, 98
  as a function, 92, 98
Recursive procedure, 221
ref, 159
Refinement, of data, 235

Relation, 315
  binary, 315
  closure of, 317
  composition of, 316
  identity relation, 316
  invariant, see Invariant
  many-to-one, 316
  n-ary, 319
  one-to-many, 316
  one-to-one, 316
  onto, 316
  partial, 316
  total, 316
Rene Descartes, 315
Replacing a constant, 199
Restricting nondeterminism, 238
Result assertion, 100
Result parameter, 151
Rewriting rule, 304
RHS, Right hand side (of an equation)
Right subtree, 229
Role of semicolon, 115
Root, of a tree, 229
Ross, Doug, 296
Routine, 149
Rule of elimination, 30
Rule of inference, 25
Rule of introduction, 30
Rule of Substitution, 22, 26, 46-47
Rule of Transitivity, 23, 26
Rule, rewriting, 304
Rule, see Inference rule

Saddleback Search, 215, 302, 346
Schema, 19, 31, 45
Scholten, Carel, 301
Scope rule, in Algol 60, 38, 78
  in natural deduction system, 38
  in predicates, 78
Searching a Two-Dimensional Array, 182, 188, 301
Section, of an array, 93
  empty section, 93
Seegmueller, Gerhard, 296
SELECT statement, 134
Selector, 96
Semicolon, role of, 115
Sentence, of a grammar, 305
Separation of Concerns, 237
Sequence, 312
  length of, 312
  catenation of, 312
Sequential composition, 114-115
Set, 310
  cardinality of, 311
  difference, 311
  empty, 310
  intersection, 311
  union, 311
Side effects, 119
Simple assignment, 117
Simple variable, 117
Simultaneous substitution, 81
skip, 114
sp, 120
Specification of a parameter, 150
Stack, 313
State, 11
Strategy, for developing a loop, 181, 187
  for developing an alternative command, 174
Strength reduction, 241
Stronger proposition, 16
Strongest postcondition, 120
Strongest proposition, 16
Subscripted variable, 89
Substitution, rule of, 22, 26, 46-47
  simultaneous, 81
  textual, 79-81
  textual, extension to, 128-129
Subtree, 229
Swap, 103, 119
Swapping Equal-Length Sections, 212, 302

Swapping Sections, 222, 302
Symbol of a grammar, 304
  nonterminal, 304
  terminal, 304
Syntax tree, 308

T, 8
Taking an assertion out of a loop, 241
Tardy Bus Problem, 59
Tautology, 14
  relation to theorem, 26
Terminal symbol, 304
Textual substitution, 79-81
  extension to, 128-129
Theorem, 25
  relation to tautology, 26
  as a schema, 45
Total correctness, 110
Total relation, 316
Transitive closure, 317
Transitivity, rule of, 23, 26
Traversal, inorder, 236
  postorder, 236, 347
  preorder, 232
Tree, 229
  depth of, 236
  empty, 229
  implementation of, 230
  leaf of, 229
  root of, 229
Truth table, 10, 15
Truth values, 8
Turski, Wlad M., 296
Two-dimensional array, 96

U, 69
Ullman, J.D., 309
Unambiguous grammar, 308
Unbounded nondeterminism, 312
Undefined value, 69
Union, of two sets, 311
Unique 5-bit Sequences, 262, 303, 352
Universal quantification, 73
upper, 89
Upsequence, 259

Value parameter, 151
Value result parameter, 151
Var parameter, 158
Variable, subscripted, 89
  final value of, 102
  initial value of, 102
  definition of, 283
  simple, 117
Variant function, 142

Weakening a predicate, 195
  combining pre- and post-conditions, 211
  deleting a conjunct, 195
  enlarging the range of a variable, 206
  replacing a constant, 199
Weaker proposition, 16
Weakest precondition, 109
Weakest proposition, 16
Welfare Crook, 207, 238, 302
Well-defined proposition, 11
WFF'N PROOF, 28, 42, 59
While-loop, 138
Wilkes, Maurice, 149
Williams, John, 301
Wirth, Niklaus, 296
Woodger, Michael, 296
wp, 108
