Tutorial Notes: Reasoning About Logic Programs
Document Version:
Peer reviewed version
Published In:
Logic Programming in Action
General rights
Copyright for the publications made accessible via the Edinburgh Research Explorer is retained by the author(s)
and / or other copyright owners and it is a condition of accessing these publications that users recognise and
abide by the legal requirements associated with these rights.
© Alan Bundy
Tutorial Notes: Reasoning about Logic Programs *
Alan Bundy
Abstract
These are tutorial notes for LPSS-92: the Logic Programming Summer School organised
by the CompuLog Esprit Network of Excellence in September 1992. They are an introduction
to the techniques of reasoning about logic programs, in particular for synthesizing, verifying,
transforming and proving termination of logic programs.
Key words and phrases. Logic programming, synthesis, transformation, verification, termination, abstraction.
1 Introduction
In this tutorial we will describe techniques for reasoning about logic programs.
We will see that reasoning about programs is an important software engineering tool. It can be
used to improve the efficiency and reliability of computer programs. Such reasoning is particularly
well suited to logic programs. In fact, the ease of reasoning with logic programs is one of their
main advantages over programs in other languages.
The kind of reasoning tasks we will consider are as follows.
Synthesis: to construct a program meeting a given specification.
Verification: to prove that a given program meets its specification.
Transformation: to transform a program into a more efficient program meeting the same specification.
Termination: to prove that the execution of a program terminates.
Abstraction: to abstract from the program information about the types of its input/output, its
modes of use, etc.
In conventional, imperative, programming languages these five tasks are very different, but in
logic programming some of them merge together. The reason is that specifications are usually
written as logical formulae. As such, they can be interpreted as logic programs. They may be
very inefficient and they may not always terminate, but they can often be used as a prototype of
the desired program. This means that synthesis and transformation are not different in kind, but
only in degree. Moreover, synthesis can be seen as verification of a partially specified program.
All this imparts a simplicity and unity to reasoning with logic programs.
* I would like to thank the members of the mathematical reasoning group at Edinburgh and members of the
CompuLog Network for helpful advice and feedback on these notes. In particular, Ina Kraan, Andrew Ireland,
Helen Lowe, Danny De Schreye and Michael Maher were especially helpful. Some of the research reported in this
paper was supported by Esprit BRA grant 3012 (CompuLog).
1.1 Semantics of Logic Programs
To reason formally about a computer program we must have a method of turning a question about
programs into a mathematical conjecture. The program reasoning task can then be converted
into a theorem proving task. The usual way to do this is to associate a semantics with the
programming language. By 'semantics' we mean an assignment of a mathematical expression to
each program statement. The mathematical expression is often thought of as the meaning of the
program statement.
To a first approximation it is unnecessary to give a semantics to a logic program; it is already
a mathematical expression, namely a clause in first-order logic. Unfortunately, this is only true
of pure logic programs. In practice, logic programs, e.g. Prolog programs, often contain impure
features, e.g. negation as failure, assert/retract, var, the cut operator, the search strategy,
etc. These cannot be directly interpreted in logical terms.
There are (at least) three solutions to this problem.
1. Provide a semantics for the particular logic programming language, e.g. Prolog, in which
both the pure and impure features of the language are assigned a mathematical interpreta-
tion.
2. Work only within a pure subset of the language. Implement a logic programming language
based on this pure subset.
3. Reason about specifications of logic programs. Introduce the impure features as necessary
during a final compilation of the specification into your logic programming language of
choice.
We will discuss the tradeoffs between these approaches in §2.
• body(X1, …, Xn) is an arbitrary first-order logic formula, with no defined functions, whose free
variables are in X1, …, Xn.
In the interests of readability we will sometimes reduce clutter by omitting the ∀Args ∈ Types
part.
The condition excluding defined functions from the body means that only undefined or constructor
functions are allowed. That is, we can use functions to form data-structures, e.g. s(0)
or [Hd|Tl], but not to define relations between objects, e.g. X + Y.
From which we can deduce that the parameters of perm and ordered are also all lists (or some
more general type).
The method of reasoning with abstract programs is the same as that used for concrete pro-
grams, e.g. partial evaluation, but the reasoning tends to be simpler because of the loss of detail
caused by abstraction.
2 What Shall we Reason About?
In this section we discuss the tradeoffs between reasoning with impure programs, pure programs
or specifications. There are advantages and disadvantages with each approach.
Notation Henceforth, we will use the term completion to mean extended Clark completion.
even1(0)
even1(s(s(N))) ← even1(N)
odd1(s(0))
odd1(s(s(N))) ← odd1(N)
even2(0)
even2(s(N)) ← odd2(N)        (2)
odd2(s(N)) ← even2(N)
At first sight the definitions of even1/odd1 and even2/odd2 look like alternative, but equivalent,
programs for testing for even and odd numbers. [The natural numbers are represented in unary
notation, where, for instance, 3 is represented by s(s(s(0))).]
3 We will use maths font for pure logic programs.
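Both pairs of clauses can be transcribed into Python over unary numerals to confirm that, operationally, the two programs compute the same results on ground inputs. The helper names and the tuple encoding of s(N) below are my own, not from these notes:

```python
def s(n):
    # s(N) is encoded as the tuple ("s", N); 0 is the integer 0.
    return ("s", n)

def num(k):
    """Build the unary numeral for k, e.g. num(3) = s(s(s(0)))."""
    t = 0
    for _ in range(k):
        t = s(t)
    return t

# First program: recursion descends two constructors at a time.
def even1(n):
    if n == 0:
        return True                       # even1(0)
    m = n[1]
    return m != 0 and even1(m[1])         # even1(s(s(N))) <- even1(N)

def odd1(n):
    if n == 0:
        return False
    m = n[1]
    if m == 0:
        return True                       # odd1(s(0))
    return odd1(m[1])                     # odd1(s(s(N))) <- odd1(N)

# Second program, clauses (2): mutual recursion, one constructor at a time.
def even2(n):
    return n == 0 or odd2(n[1])           # even2(0); even2(s(N)) <- odd2(N)

def odd2(n):
    return n != 0 and even2(n[1])         # odd2(s(N)) <- even2(N)
```

On every ground numeral the two programs agree, which is why they look equivalent at first sight; the non-equivalence discussed next only appears at the level of their logical readings.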
However, these programs are not logically equivalent, i.e. it is not the case that:
even1(N) ↔ even2(N)        (3)
odd1(N) ↔ odd2(N)          (4)
Both clauses (3) and (4) are logical consequences of this definition because odd1(0) and odd2(0)
are both false.
Thus, whether we regard these two programs as equivalent depends on whether we take logical
equivalence of programs, logical equivalence of completions or one of the other six rival notions,
as the definition of equivalence.
2.3 Summary
We can summarise these different tradeoffs as follows.
2.3.1 Impure Programs
Cons There are a wide variety of different transformation schemes depending on which properties
of the program it is desired to preserve. A different semantics is required to define each kind
of preservation and to justify the corresponding transformations. These transformations are very
complex and restricted. The semantics and the transformations are sensitive to small changes in
the definition of the programming language.
4 Our use of 'logically equivalent' is non-standard in that we use more than just rules of logic in proofs of
equivalence; we also use mathematical induction.
2.3.2 Pure Programs
Pros The kinds of transformation are general across a wide range of programming languages.
Cons Practically useful impure features are excluded from the language. There are still several
different notions of program equivalence and, hence, several different transformation schemes.
These transformations do not preserve the operational behaviour of impure programs, e.g. Prolog
programs. The logical and intended meaning of the program do not coincide. This means that
some information required to reason with the programs is not represented directly.
2.3.3 Specifications
Pros There is only one notion of equivalence. The logical and intended meaning of the specific-
ation coincide.
2.3.4 Conclusion
It seems to me that the benefits of logic programs can best be realised by a combination of
reasoning with specifications and with impure programs. The basic algorithm is best determined
by reasoning at the specification level, where the notion of equivalence is unambiguous and the
reasoning invariant under changes in programming language. However, various implementational
aspects can only be dealt with at the programming language level, so some tuning transformations
at this level must also be catered for. The rest of these tutorial notes will assume this viewpoint.
p(s1, …, sn) ← body
with the clause:
3. Existentially quantify each variable which occurs in the body but not in the head.
Replace each clause of the form:
where Xi ≠ Yj, for all i and j, and Yj occurs in body(Y1, …, Ym), with the clause:
where τj is the type of Yj. We postpone the problem of discovering these types to §9.
4. Combine the clauses for each predicate into a single equivalence. Suppose there are k
clauses defining p, each of the form:
If there are no clauses for a predicate, p/n, mentioned in the program (i.e. k = 0)
then its Clark Completion is:
In addition, we assume various axioms defining the predicate = used in Clark Completions as
syntactic identity. These include the usual equality axioms of reflexivity, symmetry, transitivity
and substitutivity. In addition, we assume axioms that ensure that no equations hold between
non-identical terms, i.e. f(X1, …, Xn) ≠ g(Y1, …, Ym) if f and g are not identical, X ≠ t if t
does not contain X, and the inverse of substitutivity:
This algorithm is adapted from [Lloyd, 1987], in which further details may be found.
subset([],J).
subset([Hd|Tl],J) :- member(Hd,J), subset(Tl,J).
A goal of the form subset(I,J) succeeds if I is a subset of J, where sets are represented as lists.
The first stage of the algorithm is to rewrite the program into logical notation.
subset([], J)
subset([Hd|Tl], J) ← member(Hd, J) & subset(Tl, J)
The second stage is to make the head arguments into distinct variables.
subset(I, J) ← I = []
subset(I, J) ← I = [Hd|Tl] & member(Hd, J) & subset(Tl, J)
Note that in both clauses we exercised our option not to replace J.
The third stage is to existentially quantify each variable which occurs in the body but not in
the head.
subset(I, J) ← I = []
subset(I, J) ← ∃Hd ∈ Type, Tl ∈ list(Type).
    I = [Hd|Tl] & member(Hd, J) & subset(Tl, J)
The fourth stage is to combine these two clauses into a single equivalence.
subset(I, J) ↔ I = [] ∨
    ∃Hd ∈ Type, Tl ∈ list(Type).
    I = [Hd|Tl] & member(Hd, J) & subset(Tl, J)
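As a sanity check on the construction, the original program and its completion can both be transcribed into Python and compared on ground lists. The function names are mine, and Prolog lists become Python lists:

```python
def member(el, l):
    # member(El,[El|Tl]).  member(El,[Hd|Tl]) :- member(El,Tl).
    return l != [] and (l[0] == el or member(el, l[1:]))

def subset_program(i, j):
    # The two original clauses, tried in order.
    return i == [] or (member(i[0], j) and subset_program(i[1:], j))

def subset_completion(i, j):
    # subset(I,J) <-> I = [] \/ exists Hd,Tl. I = [Hd|Tl] & member(Hd,J) & subset(Tl,J)
    # For a ground, non-empty I the only witnesses for Hd and Tl are I's head and tail.
    if i == []:
        return True
    hd, tl = i[0], i[1:]
    return member(hd, j) and subset_completion(tl, j)
```

On ground arguments the two definitions agree, as the completion construction intends; they differ only on the non-ground and negative queries discussed in §2.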
Definition 2 (The Lloyd-Topor Compilation Algorithm) The key idea of the algorithm
is to put the specification into clausal form. It consists of three stages.
1. Turn the main ↔ into a ←.
2. Put the specification into clausal form using the Lloyd-Topor rules. They are too
complicated to give in full here. We have illustrated the general idea by giving some
examples in figure 5.2. The complete set is given in [Lloyd, 1987][p113].
3. Turn the logical symbols into program symbols, i.e. invert stage 1 of the Clark
completion algorithm.
We can now see the need for the restriction, given in §1.2, to exclude defined functions from
specifications. There is no provision in the Lloyd-Topor algorithm to turn these defined functions
into predicate definitions. Consider, for instance:
plus(X, Y, Z) ↔ (X = 0 & Z = Y) ∨
    ∃X' ∈ nat. X = s(X') & Z = s(X' + Y)
Lloyd-Topor compiles this into the Prolog program:
Name  Input Clause                          Output Clause(s)
∃     head ← … & (∃El. body) & …            head ← … & body & …
∀     head ← … & (∀El. body) & …            head ← … & ¬(∃El. ¬body) & …
                                            q(Y1, …, Yk) ← body
In the last rule, q is a new predicate symbol and the Yi are the free variables in
∃El ∈ T. body.
These rules are applied as rewrite rules to the specification until no more apply.
The specification is then in clausal form.
The second stage is to use the Lloyd-Topor compilation to put this into clausal form. The rules
from figure 3.2 apply as follows:
subset(I, J) ← ¬∃El ∈ Type. ¬(member(El, I) → member(El, J))
subset(I, J) ← ¬not_subset(I, J)
not_subset(I, J) ← ¬(member(El, I) → member(El, J))
subset(I, J) ← ¬not_subset(I, J)
not_subset(I, J) ← member(El, I) & ¬member(El, J)
The third stage is to rewrite the clauses into program notation.
subset(I,J) :- not not_subset(I,J).
not_subset(I,J) :- member(El,I), not member(El,J).
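A minimal Python transcription of this negation-as-failure program (names mine) shows that it computes the intended subset relation on ground lists:

```python
def member(el, l):
    # member(El,[El|Tl]).  member(El,[Hd|Tl]) :- member(El,Tl).
    return l != [] and (l[0] == el or member(el, l[1:]))

def not_subset(i, j):
    # not_subset(I,J) :- member(El,I), not member(El,J).
    # member(El,I) enumerates candidate elements El from I, as Prolog would.
    return any(not member(el, j) for el in i)

def subset(i, j):
    # subset(I,J) :- not not_subset(I,J).
    return not not_subset(i, j)
```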
4 Equivalence of Specifications
The problem we will consider in this section is how to prove that two logic program specifications
are equivalent. That is, suppose Spec1 and Spec2 are two logic program specifications. We will
consider how to prove:
∀Args ∈ Types. Spec1 ↔ Spec2        (6)
In the interests of readability we will sometimes omit the ∀Args ∈ Types part.
We saw in §3 that each of these two specifications compiles directly into a logic program.
Some of these logic programs are more practical than others. Suppose Prog1 is the logic program
corresponding to Spec1 and Prog2 to Spec2. If Prog2 is an impractical program and Prog1 is a
practical program then we can view the proof of equivalence (6) as verification of Prog1 in terms
of Spec2. If both Prog1 and Prog2 are practical programs then we can view this proof as the
transformation of Prog2 into Prog1. Typically, Prog1 will be more efficient than Prog2.
Thus, when reasoning at the level of specifications, the processes of verification and transformation
coalesce; they differ in degree not kind. Of course, the real challenge in transformation
is to be given an inefficient program, Prog2, and to construct a more efficient program, Prog1.
This is essentially the same problem as synthesising a practical program, Prog1, which meets a
given specification, Spec2. We will tackle this joint problem in §5 below.
subset(I, J) ↔ I = [] ∨
    ∃Hd ∈ Type, Tl ∈ list(Type).
    I = [Hd|Tl] & member(Hd, J) & subset(Tl, J)
member(El, L) ↔ (∃Tl ∈ list(Type). L = [El|Tl]) ∨
    (∃Hd ∈ Type, Tl ∈ list(Type). L = [Hd|Tl] & member(El, Tl))
The quantification of the outer universal variables has been omitted to reduce clutter.
Using the Lloyd-Topor compilation, the left and right hand sides of equivalence (7) compile
into the two Prolog programs:
subset_1([],J).
subset_1([Hd|Tl],J) :- member(Hd,J), subset_1(Tl,J).
Applying the definitions of subset and member this reduces to the problem of proving:
The turnstile symbol ⊢ indicates that the induction hypothesis on the left can be assumed when
proving the induction conclusion on the right. Note that the universal variable J in the induction
hypothesis can be instantiated, if necessary, whereas it should not be instantiated in the induction
conclusion. We have ensured this by representing J by a free variable (upper case) on the left and
a constant j (lower case) on the right.
Using the definitions of subset and member the induction conclusion can be reduced to:
This completes the proof of the step case. The whole of equivalence (7) is now proved.
5 Synthesis of Specifications
The problem we will consider in this section is how, given an initial specification, we can synthesise
an equivalent new one. That is, suppose we are given a specification Spec2, how can we construct
a specification Spec1 such that:
As discussed in §4 above, this will enable us to synthesise a practical program, Prog1, from a
specification Spec2 and/or to transform an inefficient program, Prog2, into an efficient program,
Prog1.
The essential idea underlying our solution to this problem will be to proceed with the proof
of the equivalence theorem as if Spec1 were known, and to pick up clues as to its definition as we
proceed. This is best illustrated with an example.
Many of the steps in the previous proof of this equivalence can be repeated, but the proof cannot
be completed due to the absence of the definition of subset. The residue of subgoals is:
subset([], J) ↔ true
⊢ …
subset(I, J) ↔ I = [] ∨
    ∃Hd ∈ Type, Tl ∈ list(Type).
        I = [Hd|Tl] & member(Hd, J) & subset(Tl, J)
as required. The residue of subgoals can be readily proved from this definition, so the proof is
completed.
for all but trivial theorems, this process rapidly becomes bogged down in an explosion of partially
generated proofs. This phenomenon is called the combinatorial explosion.
To defeat the combinatorial explosion we need to use heuristic methods to guide the proof
building process along the most promising paths. To illustrate such heuristic methods, consider
the problem of rewriting the induction conclusion so that the induction hypothesis may be applied
to it. The form of the step case of an inductive proof is:
P(tl) ⊢ P([hd|tl]↑)
Note that the induction conclusion on the right differs from the induction hypothesis on the left
by inclusion of the induction term [hd| …]. We have emphasised this by drawing a box around
the induction term and underlining the induction variable, tl, inside it. We call this boxed sub-expression
a wave-front. The arrow, ↑, represents the direction of movement of the wave-front:
upwards (or outwards depending on your point of view) through the induction conclusion.
The presence of this wave-front prevents us from using the induction hypothesis to prove the
induction conclusion. To enable the induction hypothesis to be used we need to move the wave-front
to the outside of the induction conclusion, i.e. we need to rewrite the induction conclusion
into the form:
We call this rewriting process rippling. It consists of applying rewrite rules which move the
wave-fronts outwards but leave the rest of the induction conclusion unchanged. Rewrite rules of
this form are called wave-rules. They have the form:
Some examples are given in figure 6. For more information about rippling see [Bundy et al, 1991].
subset([Hd|Tl]↑, J) ⇔ (member(Hd, J) & subset(Tl, J))↑
member(El, [Hd|Tl]↑) ⇔ (El = Hd ∨ member(El, Tl))↑
(A ∨ B)↑ → C ⇔ ((A → C) & (B → C))↑

subset([hd|tl]↑, j) ↔
    (∀El ∈ Type. member(El, [hd|tl]↑) → member(El, j))
subset([hd|tl]↑, j) ↔
    (∀El ∈ Type. (El = hd ∨ member(El, tl))↑ → member(El, j))
subset([hd|tl]↑, j) ↔
    (∀El ∈ Type. (El = hd → member(El, j)) & (member(El, tl) → member(El, j)))↑
(member(hd, j) & subset(tl, j))↑ ↔
    (∀El ∈ Type. El = hd → member(El, j)) & (∀El ∈ Type. member(El, tl) → member(El, j))
Rippling is now complete and a copy of the induction hypothesis is embedded within the
induction conclusion. The induction hypothesis can now be used to replace this copy with true.
This leaves the subgoal:
If r(X) is solved before q(X) then the search space will be infinite, but if q(X) is solved first then
the search space is finite.
In the discussion below we will describe a simple termination technique that establishes universal
termination of definite programs⁵. Our technique is to associate a well-founded measure
with each procedure call and to show that this strictly decreases as computation proceeds. A
well-founded measure is one that cannot decrease for ever. This ensures that the computation
cannot proceed for ever. An example of a well-founded measure is the natural numbers ordered
by >. This is well-founded because there is no infinite sequence of the form: n1 > n2 > n3 > …
Eventually, one of the ni would be 0, and there is no natural number smaller than that. This
natural number well-founded measure will suffice for our purposes. To prove termination we will
associate a natural number with each procedure call and show that the number associated with
the current procedure call strictly decreases as the computation proceeds.
⁵ I.e. those without negation as failure.
len([]) = 0        len([Hd|Tl]) = len(Tl) + 1
Definition 4 (procedure call measure) The procedure call measure is a function, | … |,
from a literal, subset(I, J) or member(El, J), where I and J are ground, to a natural number
defined as:
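The body of Definition 4 is lost to the page break here, but one measure that makes the argument go through (my reconstruction, not necessarily the one in the notes) is |subset(I, J)| = len(I) + len(J) + 1 and |member(El, L)| = len(L). A quick check that it strictly decreases across every body call:

```python
def length(l):
    # len([]) = 0;  len([Hd|Tl]) = len(Tl) + 1
    return 0 if l == [] else 1 + length(l[1:])

def measure(call):
    # A plausible procedure-call measure (assumed, not from the notes).
    pred, args = call
    if pred == "subset":
        i, j = args
        return length(i) + length(j) + 1
    el, l = args                          # pred == "member"
    return length(l)

def body_calls(call):
    """The procedure calls generated by one resolution step on ground input."""
    pred, args = call
    if pred == "subset":
        i, j = args
        if i == []:
            return []                     # subset([],J) has an empty body
        return [("member", (i[0], j)), ("subset", (i[1:], j))]
    el, l = args
    return [] if l == [] else [("member", (el, l[1:]))]

def strictly_decreasing(call):
    # The measure of each body call must be below the measure of its head.
    return all(measure(b) < measure(call) and strictly_decreasing(b)
               for b in body_calls(call))
```

Since the measure is a natural number and strictly decreases at every step, no computation from a ground call can run for ever.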
Note that this argument breaks down if any of the lists in the procedure call arguments
are non-ground. Indeed, a procedure call member(a, L), for instance, will not terminate in the
universal sense. It will return an infinite number of results of the form:
Similar remarks hold for subset(I, J) where either I or J is non-ground. These are examples of
existential termination.
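The distinction can be illustrated with a Python generator (my own encoding: an anonymous cell is written "_" and the unbound tail "|_"). Each answer to member(a, L), with L unbound, arrives after finitely many steps, but the answer stream itself never ends:

```python
from itertools import count, islice

def member_open(el):
    """Enumerate the answer substitutions for member(el, L) with L unbound:
    L = [el|_], [_, el|_], [_, _, el|_], ...  -- an infinite stream."""
    for n in count():
        # Each answer is produced after finitely many steps (existential
        # termination), but the enumeration never halts (no universal
        # termination).
        yield ["_"] * n + [el, "|_"]
```

islice(member_open("a"), 3) yields the first three answers and returns; iterating the generator to exhaustion would never return.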
An introduction to more elaborate techniques for proving termination can be found in [Hogger, 1990][Theme
59] and [De Schreye & Verschaetse, 1992].
As we have seen, in §2.2.1, this clause can be partially evaluated to the logically and operationally
equivalent clause:
p2(a).
member(El,[El|Tl]).
member(El,[Hd|Tl]) :- member(El,Tl).
given that the type of the first argument of subset is known to be list(number), say.
To infer the remaining types we partially evaluate the procedure call:
[Figure: a lattice of abstract types, with any above number and list(number).]
?- subset(list(number),X).
If we resolve this goal against the first clause for subset then we must unify list(number) with
[] and X with J. We should define abstract unification so that these unifications both succeed, but this
resolution does not tell us more than we already knew.
More interesting is the resolution against the second clause for subset. We should define
abstract unification so that unifying list(number) and [Hd|Tl] succeeds, instantiating Hd to
number and Tl to list(number). The new procedure calls are:
?- member(number,X), subset(list(number),X).
We can now repeat this process for the member(number,X) procedure call. This gives us two
instantiations of X, both of which are list(number). The new procedure call generated by the
second clause is:
?- member(number,list(number)).
but this is subsumed by the earlier call, so can tell us nothing new. Our loop detection mechanism
should note this and terminate this call.
The remaining procedure call is:
?- subset(list(number),list(number)).
which is also subsumed by an earlier call and should also be terminated.
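The walkthrough above is, in essence, a fixpoint computation with subsumption-based loop detection. A toy Python version reaches the same two abstract calls and then stops; the call encoding, the treatment of the unknown X, and the unfolding rules are my simplifications, not the notes' algorithm:

```python
def unfold(call):
    """Abstractly resolve one call against the subset/member clauses."""
    pred, args = call
    if pred == "subset":
        i, j = args
        elem = i[1]                        # I = [Hd|Tl]: Hd :: elem, Tl :: I
        # Trying member's clauses instantiates an unknown J to list(elem).
        j = ("list", elem) if j == "X" else j
        return [("member", (elem, j)), ("subset", (i, j))]
    return [call]                          # member(El,[Hd|Tl]) :- member(El,Tl)

def infer(initial):
    """Worklist fixpoint: stop unfolding any call subsumed by one seen before."""
    seen, work = set(), [initial]
    while work:
        call = work.pop()
        if call in seen:                   # subsumed: terminate this call
            continue
        seen.add(call)
        work.extend(unfold(call))
    return seen
```

Starting from infer(("subset", (("list", "number"), "X"))), the returned set contains the abstract calls subset(list(number), list(number)) and member(number, list(number)), matching the inferred types.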
The abstract interpretation is now complete. We have inferred the types of our predicates to
be:
10 Conclusion
In these notes we have outlined various techniques for reasoning about logic programs. These
techniques can be combined in various ways to form a methodology for logic program develop-
ment. Ideally, for the reasons summarised in §2.3, this methodology should take logic program
specifications as its central representation. The original description of the desired program should
take the form of a specification, as defined in §1.2.
This specification can be used to synthesise a more efficient specification using the techniques
of §5. It may seem odd to discuss the 'efficiency' of a specification. One way to measure this is to
compile the specification into a logic program using the Lloyd-Topor algorithm (see §3.2) and then
to measure the efficiency of this program. However, there are some aspects of efficiency that are
independent of the particular target programming language, e.g. the complexity associated with
any forms of recursion used in the specification. These aspects can be measured more directly.
Having synthesised an acceptable specification, this can be compiled into a program using the
Lloyd-Topor algorithm. The termination, mode and similar properties of this program can be
analysed using the techniques of §7 and 9. If necessary, the program can then be tuned using the
techniques of §8.
If a program, rather than a specification, is available, then this can be lifted into a specification
using the Clark completion algorithm, §3.1. The above methodology is then applicable.
All these techniques can be automated to a greater or lesser extent. Automation makes possible
machine aids to program development that remove some of the tedium and error from the proof,
compilation and analysis steps.
Recommended Reading
In these notes it has only been possible to give a rough outline of the range and complexity of the
techniques available. If you want to find out more, here are some suggestions for further reading.
Elementary and very readable introductions to the ideas outlined in these notes can be found in
the two books by Chris Hogger, [Hogger, 1984, Hogger, 1990]. John Lloyd's book, [Lloyd, 1987],
is the standard reference for the theoretical background on logic programming. A more de-
tailed account of verification and synthesis of logic programs can be found in Yves Deville's book
[Deville, 1990]. Deville adopts the same position that we have on reasoning with specifications,
wherever possible, rather than with programs. A discussion of the different notions of logic program
equivalence can be found in the paper by Michael Maher, [Maher, 1987]. An introductory
survey of termination proving techniques can be found in the tutorial notes by Danny De Schreye
and Kristof Verschaetse, [De Schreye & Verschaetse, 1992], and a similar survey of work on ab-
straction can be found in the tutorial notes by Maurice Bruynooghe and Danny De Schreye,
[Bruynooghe & De Schreye, 1988].
References
[Bruynooghe & De Schreye, 1988] Bruynooghe, M. and De Schreye, D. (1988). Tutorial notes
for: abstract interpretation in logic programming. Technical
report, Department of Computer Science, Katholieke
Universiteit Leuven, Belgium. Tutorial given at ICLP-88,
Seattle. Notes available from the authors, Department of Computer
Science, Katholieke Universiteit Leuven, Celestijnenlaan
200A, 3001 Heverlee, Belgium.
[Bundy et al, 1990] Bundy, A., Smaill, A. and Hesketh, J. (1990). Turning eureka
steps into calculations in automatic program synthesis. In
Clarke, S.L.H., (ed.), Proceedings of UK IT 90, pages 221-6.
Also available from Edinburgh as DAI Research Paper 448.
[Bundy et al, 1991] Bundy, A., Stevens, A., van Harmelen, F., Ireland, A. and
Smaill, A. (1991). Rippling: A heuristic for guiding inductive
proofs. Research Paper 567, Dept. of Artificial Intelligence,
Edinburgh, To appear in Artificial Intelligence.
[De Schreye & Verschaetse, 1992] De Schreye, D. and Verschaetse, K. (1992). Termination of
logic programs: Tutorial notes. Technical Report CW-report
148, Department of Computer Science, Katholieke Universiteit
Leuven, Belgium, To appear in the proceedings of Meta-92.
[Deville, 1990] Deville, Y. (1990). Logic programming: systematic program
development. Addison-Wesley Pub. Co.
[Hogger, 1984] Hogger, C.J. (1984). Introduction to logic programming.
Academic Press.
[Hogger, 1990] Hogger, C.J. (1990). Essentials of Logic Programming. Ox-
ford University Press.
[Lloyd, 1987] Lloyd, J.W. (1987). Foundations of Logic Programming.
Symbolic Computation. Springer-Verlag, second, extended
edition.
[Maher, 1987] Maher, M.J. (1987). Equivalences of logic programs. In
Minker, J., (ed.), Foundations of Deductive Databases and
Logic Programming. Morgan Kaufmann.