Learning Programs As Logical Queries
Type 1 Paper
Charles Jordan (1) and Lukasz Kaiser (2)
(1) ERATO Minato Project, JST & Hokkaido University
(2) LIAFA, CNRS & Universite Paris Diderot
[email protected], [email protected]
Abstract. Program learning focuses on the automatic generation of
programs satisfying the goal of a teacher. One common approach is
counter-example guided inductive synthesis, where we generate a sequence
of candidate programs and the teacher responds with counter-examples
for which the candidate fails. In this paper we focus on a logical approach,
where programs are tuples of logical formulas, i.e. logical queries, and
inputs and outputs are relational structures. We introduce our model
of inductive synthesis and our implementation of it using SAT and
QBF solvers. We survey basic theoretical properties of our model and
show a few experimental results: learning complexity-theoretic reductions,
polynomial-time programs, and learning board games from examples.
1 Introduction
The dream of program learning, that instead of writing code we will teach
computers to derive it automatically, has a long history in computer science
and has appeared in many forms. In program synthesis we seek any program
that satisfies a specification, usually given as a logical formula. However, writing
this formal logical specification can be as hard as writing the program itself. To
avoid formal specifications, one can consider learning programs using only a set
of example inputs and outputs.
Program learning is difficult and there are many theoretical and practical
challenges. Specifying the exact requirements for the learner is already hard. One
must choose a representation of programs, what assumptions on efficiency to
make, whether the goal is any program or a short one, and the precise meaning of
"short". More practically, searching for programs in any standard programming
language is very inefficient. Most problems are undecidable, there are hardly any
useful normal forms to exploit, and modifying even a small part of a program
can dramatically change its behavior. Given these issues, it is easy to understand
why the dream of program learning has not yet been realized.
Choosing a representation for programs is critical and recent work [1,8,9] in
descriptive complexity suggests that logical queries are uniquely suited for this
task. Descriptive complexity studies the connection between logics and complexity
classes: logics equivalent to classes such as NL, P, NP, and PSPACE are known.
Searching for formulas in these logics corresponds to searching for programs in
these classes, and has other advantages that we discuss below.
This work was supported in part by JSPS KAKENHI Grant Number 25106501.
Here, we introduce a model of learning programs represented by logical queries.
Our approach tries to address the theoretical challenges with program learning
mentioned above, and we leverage recent advances in SAT and QBF solvers
to address more practical concerns. We have implemented our model (footnote 3)
and we describe three kinds of initial experiments. For program synthesis, we describe
learning complexity-theoretic reductions and also polynomial-time equivalents
for NP programs. Both of these tasks have complete specifications. To show that
this is not a limitation of our approach, we experiment with learning the rules of
board games from sets of example plays. While these experiments are preliminary,
we believe the results show that our approach to program learning is promising.
Related Work. Much of our motivation comes from recent papers using ideas from
descriptive complexity in inductive synthesis. For example, given a specification
in an expressive logic (second-order), [8] synthesized equivalent formulas in
less expressive logics which can be evaluated more efficiently. Automatically
finding complexity-theoretic reductions between computational problems was
first considered by [1]. They focused on quantifier-free reductions, a weak class
of reduction defined by tuples of quantifier-free formulas.
Both problems are essentially the same: finding formulas in a particular form
that satisfy desired properties. However, the implementations are separate and
not publicly available. We [9] have previously compared a number of different
approaches to reduction finding. In this paper, we introduce a more general
approach allowing the user to specify an outline of the desired formula and a
specification that it must satisfy. We provide a single model and freely available
implementation that can be used to experiment with various synthesis problems.
Inductive synthesis has a long history and there is a tremendous amount of
work that we do not cover here. See, e.g., [4,11] for a more general perspective.
2 Background in Logic and Descriptive Complexity
In this section we briefly review the necessary background from descriptive
complexity. For more details, see [7] or Chapter 3 of [3].
There are many possible representations of programs; in this paper, we focus
on logical representations. One benefit of the logical approach is that we are
able to treat structures such as graphs directly, instead of encoding them into
words or numbers. This allows us to express many interesting programs succinctly,
and formulas have natural normal forms. These provide guidance for hypothesis
spaces, and improve understandability of learned programs.
Fundamentally, programs transform given inputs into outputs; we represent
these inputs and outputs as relational structures (for example, graphs or binary
strings). A relational signature $\sigma$ is a tuple of predicate symbols $R_i$ with
arities $a_i$ and constant symbols $c_j$,
$\sigma := (R_1^{a_1}, \ldots, R_r^{a_r}, c_1, \ldots, c_s)$.
A finite $\sigma$-structure consists of a finite universe $U$, an $a_i$-ary relation for
each predicate symbol of $\sigma$, and an element of $U$ defining each constant symbol:
$A := (U, R_1 \subseteq U^{a_1}, \ldots, R_r \subseteq U^{a_r}, c_1 \in U, \ldots, c_s \in U)$.
For example, the signature for directed graphs contains a single, binary predicate
symbol $E$ and so a directed graph consists of a finite set of vertices and a
binary edge relation. We denote the set of all finite $\sigma$-structures by
$\mathrm{Struc}(\sigma)$. Our programs are built from formulas in various logics.
Formulas of first-order logic over a signature $\sigma$ have the form
$\varphi := R_i(x_1, \ldots, x_{a_i}) \mid x_i = x_j \mid x_i = c_j \mid \neg\varphi \mid \varphi \wedge \varphi \mid \varphi \vee \varphi \mid \exists x_i\, \varphi \mid \forall x_i\, \varphi$,
where $x_1, x_2, \ldots$ are first-order variables, and the semantics, given an
assignment of the variables $x_i$ to elements $e_i$ of the structure, is defined in
the natural way.
Footnote 3. All source code is freely available in the 0.9 release of Toss at http://toss.sf.net.
Queries. Single formulas can be used to define properties or decision problems,
but in general we represent programs as queries. Queries map $\sigma$-structures to
$\tau$-structures, defining the universe, relations, and constants using logical formulas.
A first-order query from $\sigma$-structures to $\tau$-structures is an $(r + s + 2)$-tuple,
$q := (k, \varphi_0, \varphi_1, \ldots, \varphi_r, \psi_1, \ldots, \psi_s)$.
The number $k \in \mathbb{N}$ is the dimension of the query. Each $\varphi_i$, $\psi_j$ is a
first-order formula over the signature $\sigma$. Let $A$ be a $\sigma$-structure with
universe $U^A$. The formula $\varphi_0$ has free variables $x_1, \ldots, x_k$ and defines
the universe $U$ of $q(A)$,
$U := \{ (u_1, \ldots, u_k) \mid u_i \in U^A,\ A \models \varphi_0(u_1, \ldots, u_k) \}$.
That is, the new universe consists of $k$-tuples of elements of the old universe,
where $\varphi_0$ determines which $k$-tuples are included.
Each remaining $\varphi_i$ has free variables $x_1^1, \ldots, x_1^k, x_2^1, \ldots, x_{a_i}^k$ and defines
$R_i := \{ ((u_1^1, \ldots, u_1^k), \ldots, (u_{a_i}^1, \ldots, u_{a_i}^k)) \mid A \models \varphi_i(u_1^1, \ldots, u_{a_i}^k) \} \cap U^{a_i}$.
That is, $\varphi_i$ determines which of the $a_i$-tuples of $U$ are included in $R_i$.
Finally, each $\psi_i$ has free variables $x_1, \ldots, x_k$ and defines $c_i$ as the
unique $(u_1, \ldots, u_k) \in U$ such that $A \models \psi_i(u_1, \ldots, u_k)$.
First-order queries therefore transform $\sigma$-structures into $\tau$-structures, and we
write $q(A)$ to represent the resulting $\tau$-structure. The restriction to first-order
logic here is not essential; given a logic L, we define L-queries in a similar way.
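To make the definition of a query concrete, the following is a minimal Python sketch of applying a dimension-k query to a directed graph. It is an illustration only, not the Toss implementation: the function name `apply_query` is hypothetical, the formulas are replaced by ordinary Python predicates, and the target signature is assumed to contain a single binary relation and no constants.

```python
from itertools import product

def apply_query(universe, edges, k, phi0, phi1):
    """Apply a dimension-k query to the directed graph (universe, edges).

    phi0 receives one k-tuple and decides membership in the new universe.
    phi1 receives two k-tuples and decides membership in the new edge relation.
    Both are Python predicates standing in for first-order formulas.
    """
    tuples = list(product(sorted(universe), repeat=k))
    new_universe = {u for u in tuples if phi0(edges, u)}
    # Intersect with the new universe, as in the definition of R_i.
    new_edges = {(u, v) for u in new_universe for v in new_universe
                 if phi1(edges, u, v)}
    return new_universe, new_edges

# Example: a dimension-1 query that keeps every vertex and reverses every edge.
universe = {0, 1, 2}
edges = {(0, 1), (1, 2)}
new_universe, new_edges = apply_query(
    universe, edges, k=1,
    phi0=lambda E, u: True,
    phi1=lambda E, u, v: (v[0], u[0]) in E,
)
```

With k = 1 the elements of the new universe are 1-tuples, which mirrors the fact that the output universe always consists of k-tuples of input elements.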
Extending first-order logic. So far, we have focused only on first-order logic.
However, first-order logic on finite structures is often too limited from the
computational perspective: it cannot express many interesting queries that are
easy to compute. Over finite structures with additional numeric predicates, the
first-order definable properties correspond exactly to uniform $\mathrm{AC}^0$ (cf. [7]).
To remove this limitation, one extends first-order logic with various operators.
For example, the transitive closure operator allows us to write formulas of the
form $\mathrm{TC}[x_1, x_2.\varphi(x_1, x_2)](y_1, y_2)$. This formula takes the transitive
and reflexive closure of the (implicit) relation defined by $\varphi(x_1, x_2)$ and
evaluates it on $(y_1, y_2)$.
The least fixed-point operator allows recursive definitions in formulas of the form
$\mathrm{LFP}[R(x_1, \ldots, x_k) = \varphi(R, x_1, \ldots, x_k)](y_1, \ldots, y_k)$, where $R$ is a
new relation symbol appearing only positively in the inner formula $\varphi$. The
result of this operator is defined as the least fixed-point of the operator
$R(\bar{x}) \mapsto \varphi(R, \bar{x})$.
Finally, second-order logic (SO) allows quantification over relations.
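On a finite structure, a least fixed-point can be computed by iterating the operator from the empty relation until nothing changes. The sketch below illustrates this and uses it to obtain the reflexive transitive closure of an edge relation. The helper names `lfp` and `reflexive_transitive_closure` are hypothetical, and the code is only an illustration of the semantics, not how a solver-based implementation evaluates these operators.

```python
def lfp(step):
    """Least fixed point of a monotone operator on finite sets.

    Iterates step from the empty set until the result stops changing.
    """
    current = frozenset()
    while True:
        nxt = frozenset(step(current))
        if nxt == current:
            return current
        current = nxt

def reflexive_transitive_closure(universe, edges):
    """Closure defined by R(x, y) = (x = y) or exists z (R(x, z) and E(z, y))."""
    def step(R):
        base = {(x, x) for x in universe}
        extended = {(x, y) for (x, z) in R for (z2, y) in edges if z == z2}
        return base | extended
    return lfp(step)

closure = reflexive_transitive_closure({0, 1, 2}, {(0, 1), (1, 2)})
```

Because R occurs only positively in the defining formula, the operator is monotone, so the iteration is increasing and terminates on a finite universe.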
There are many known correspondences between logics and complexity classes
(see [7]). The oldest result [2] shows that the class NP is captured by existential
second-order logic. This implies that coNP is captured by universal second-order
logic, and that full SO captures the polynomial-time hierarchy. More practically,
polynomial-time computations are captured by least fixed-point logic (LFP)
when a linear order relation is present [5,12]. Although LFP is presumably more
expressive than transitive closure logic (TC), TC captures all problems solvable
in non-deterministic logarithmic space (NL) on ordered structures [6].
Outlines. For a logic L, we refer to the set of L-formulas in which atoms $a$ may be
guarded by some Boolean guard $G_a$ (footnote 4) as L-formula outlines. The Boolean
guards are intended to mean "$a$ occurs here", and given an instantiation of the
guards $I$, we can instantiate an L-formula outline $\varphi$ to an L-formula $\varphi_I$
by replacing each guarded atom $G_a\,a$ by $a$ if $G_a$ is true in $I$, and leaving the
atom empty otherwise. Intuitively, an outline fixes the structure of the formula but
not the precise contents. We refer to queries containing L-formula outlines as
L-query outlines. We omit L when it is clear from context, and use "outline" to refer
to both query and formula outlines. Given an outline $o$, we write $\mathrm{inst}(o)$ for
the set of formulas or queries obtainable as instantiations of $o$.
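As an illustration of outlines, the sketch below represents a small DNF outline over a graph signature as nested lists of (guard, atom) pairs and enumerates its instantiations. The representation, the names `ATOMS`, `OUTLINE`, `instantiate`, `evaluate` and `inst`, and the choice to drop a conjunction whose atoms are all switched off are assumptions made for this example; they are not taken from the implementation.

```python
from itertools import product

# Atoms over two free variables x1, x2 of a directed graph with edge set E.
ATOMS = {
    "E(x1,x2)": lambda E, x1, x2: (x1, x2) in E,
    "E(x2,x1)": lambda E, x1, x2: (x2, x1) in E,
    "x1=x2": lambda E, x1, x2: x1 == x2,
}

# A DNF outline with two conjunctions; every atom carries its own guard.
OUTLINE = [
    [("g0", "E(x1,x2)"), ("g1", "E(x2,x1)")],
    [("g2", "x1=x2")],
]

def instantiate(outline, guards):
    """Keep the atoms whose guard is true; drop conjunctions left empty."""
    conjunctions = [[atom for g, atom in conj if guards[g]] for conj in outline]
    return [conj for conj in conjunctions if conj]

def evaluate(formula, E, x1, x2):
    """Evaluate an instantiated DNF formula on the assignment (x1, x2)."""
    return any(all(ATOMS[a](E, x1, x2) for a in conj) for conj in formula)

def inst(outline):
    """Enumerate all instantiations of the outline (one per guard assignment)."""
    names = sorted({g for conj in outline for g, _ in conj})
    for bits in product([False, True], repeat=len(names)):
        yield instantiate(outline, dict(zip(names, bits)))

formula = instantiate(OUTLINE, {"g0": True, "g1": False, "g2": True})
```

The hypothesis space grows as 2 to the number of guards, which is why a solver is used to search it rather than explicit enumeration.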
3 Learning Logical Queries
In this section, we introduce our model of learning logical queries. The model
consists of a learner giving candidate queries or hypotheses and a teacher (or
verifier), which gives counter-examples or accepts the query. A learning task is
characterized by a few parameters; the first is the target class $C$.
Let $C \subseteq \mathrm{Struc}(\sigma) \times \mathrm{Struc}(\tau)$ be a binary relation on relational
structures, and define the domain of $C$ as
$\mathrm{dom}(C) = \{A \mid (A, B) \in C \text{ for some } B\}$.
Definition 1. A C-teacher $t$ is a function
$t : (\sigma \to \tau)\text{-queries} \to (\mathrm{dom}(C) \times \mathrm{SO}(\tau)) \cup \{\top\}$
that satisfies the following condition.
$t(q) = \begin{cases} \top & \text{if } \{(A, q(A)) \mid A \in \mathrm{dom}(C)\} \subseteq C, \\ (A, \varphi) & \text{where } A \in \mathrm{dom}(C),\ (A, q(A)) \notin C,\ (B \models \varphi \iff (A, B) \in C). \end{cases}$
That is, a teacher accepts a query $q$ if for all $A \in \mathrm{dom}(C)$ we have
$(A, q(A)) \in C$, and otherwise replies with a counter-example $(A, \varphi)$ such that
$q(A) \not\models \varphi$. Here, the formula $\varphi$ defines the acceptable output on $A$.
Of course, in practice we generally restrict attention to computable teachers and
reasonable classes $C$.
Footnote 4. We do not require that identical atoms share guards, that distinct atoms
have different guards, or that all atoms are guarded.
Definition 2. Let $H$ be a class of logical queries. An H-learner $L$ is a function
that, given a sequence of examples $((A_1, \varphi_1), \ldots, (A_m, \varphi_m))$, satisfies
$L((A_1, \varphi_1), \ldots, (A_m, \varphi_m)) = \begin{cases} h, & h \in H,\ h(A_i) \models \varphi_i \text{ for } 1 \le i \le m; \\ \bot, & \text{if no such } h \in H \text{ exists}. \end{cases}$
Note that our learners must always be consistent, and they return $\bot$ iff there
is no consistent query in the hypothesis space. The logic used in the query is
determined by $H$. In practice, we always use outline learners, which are defined
as H-learners for $H = \mathrm{inst}(q)$ for some query outline $q$.
The outline gives a compact representation of a hypothesis space, and can also
enforce certain restrictions on the query. For example, outlines can require a query
to generate an extension (footnote 5) of the structure, which is useful when searching
for programs to give explicit isomorphisms or satisfying solutions to SAT instances.
Outline learners using QBF solvers. Essentially, an outline learner is a learner
with a finite, uniform hypothesis space. The main condition of an outline learner,
namely that $\bigwedge_i q_I(A_i) \models \varphi_i$, can be translated into a QBF formula.
This is done by introducing a Boolean variable for each tuple in each relation in
$q_I(A_i)$, setting it to true only if the respective formula from $q$ holds for the
specified tuple, and then translating $\varphi_i$ using these Boolean variables for
relations. We omit the complete algorithm due to space constraints, but it is crucial
that all logics we consider are expressible in SO and that each $\varphi_i \in \mathrm{SO}$,
by the definition of a teacher. This allows us to implement outline learners using QBF
solvers to search for a suitable instantiation of the guards. For a query outline $q$
and a QBF solver $S$, we write $L(S, q)$ for the $\mathrm{inst}(q)$-learner using $S$ as
explained above.
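The behaviour required of an outline learner can be illustrated without a solver by enumerating guard assignments directly. The sketch below does this for a tiny hypothetical outline, a disjunction of three guarded atoms defining the output edge relation of a dimension-1 graph query. Explicit enumeration stands in for the QBF search and is not the method described in this paper; the names `ATOMS`, `run_query` and `learn` are assumptions, and each example formula is replaced by a Python check on the output edge set.

```python
from itertools import product

# Guarded atoms of a hypothetical outline: phi_1 is the disjunction of the
# atoms whose guard is switched on.
ATOMS = [
    ("E(x1,x2)", lambda E, a, b: (a, b) in E),
    ("E(x2,x1)", lambda E, a, b: (b, a) in E),
    ("x1=x2", lambda E, a, b: a == b),
]

def run_query(chosen, universe, edges):
    """Output edge relation of the query built from the chosen atoms."""
    return {(a, b) for a in universe for b in universe
            if any(pred(edges, a, b) for _, pred in chosen)}

def learn(examples):
    """Return the atom names of a consistent instantiation, or None (bottom).

    Each example is ((universe, edges), check), where check plays the role
    of the formula that the output structure must satisfy.
    """
    for bits in product([False, True], repeat=len(ATOMS)):
        chosen = [atom for atom, bit in zip(ATOMS, bits) if bit]
        if all(check(run_query(chosen, U, E)) for (U, E), check in examples):
            return [name for name, _ in chosen]
    return None

# One example asking for the symmetric closure of a single edge.
graph = ({0, 1}, {(0, 1)})
hypothesis = learn([(graph, lambda out: out == {(0, 1), (1, 0)})])
```

Returning `None` when no guard assignment is consistent corresponds to the learner's answer when the hypothesis space contains no consistent query.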
Given an H-learner $L$ and a C-teacher $t$ we define the sequence $L^t_i$ of the
interactions between $L$ and $t$ inductively as follows. We set $L^t_0 := L()$, the
hypothesis that $L$ returns on an empty list of examples. If for some $i$ we get
$L^t_i = \bot$ then the sequence is finished: there is no $h \in H$ that satisfies the
teacher. Else, let $E_i := t(L^t_i)$ be the answer of the teacher to $L^t_i$. If
$E_i = \top$ the sequence $L^t_i$ is finished, the last hypothesis was accepted. In the
other case, set $L^t_{i+1} = L(E_0, \ldots, E_i)$. The following properties are immediate.
Theorem 3. Let $S$ be a correct QBF solver, $t$ a C-teacher, and $q$ a query outline.
1. $L(S, q)$ is a consistent and conservative $\mathrm{inst}(q)$-learner.
2. If $t(h) = \top$ for some $h \in \mathrm{inst}(q)$ then the sequence $L(S, q)^t_i$ is
finite and its last element $g$ satisfies $t(g) = \top$.
3. If there is no $h \in \mathrm{inst}(q)$ for which $t(h) = \top$ then the sequence
$L(S, q)^t_i$ is finite and its last element is $\bot$.
Footnote 5. An extension of a structure is formed by adding new predicates while
leaving existing predicates unchanged.
Of course, we often have to restrict the teacher in order to obtain decidability.
We usually restrict $C$ to be a finite set, and usually implement the teacher using
SAT or QBF solvers as well (footnote 6). Although one can easily construct instances
where arbitrarily large examples are required, in practice it seems that queries that
are correct on moderately sized examples are almost always correct in general (where
"moderate" depends on the complexity of the query).
A learning task is specified by the pair $(C, H)$. We write $L(t, q)$ for the last
element of the sequence $L(S, q)^t_i$, where $S$ is a chosen solver. The sequence is
always finite by Theorem 3, so $L(t, q)$ is well-defined. In the next section, we
examine various learning tasks and look for reasonable teachers and outlines.
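The interaction between a learner and a teacher can be written as a short loop. The sketch below is a toy instance under stated assumptions: hypotheses are functions on small integers rather than queries on relational structures, the sentinel `ACCEPT` stands for the teacher's accepting answer, `None` stands for the learner's failure answer, and all names (`HYPOTHESES`, `DOMAIN`, `target`, `learner`, `teacher`, `cegis`) are hypothetical.

```python
ACCEPT = object()  # stands for the teacher's accepting answer

# A finite hypothesis space and a finite domain, both chosen for illustration.
HYPOTHESES = {
    "identity": lambda a: a,
    "square": lambda a: a * a,
    "double": lambda a: 2 * a,
}
DOMAIN = range(4)

def target(a):
    """The intended input/output behaviour that the teacher checks."""
    return 2 * a

def learner(examples):
    """Return the first hypothesis consistent with all examples, else None."""
    for name, h in HYPOTHESES.items():
        if all(check(h(a)) for a, check in examples):
            return name
    return None

def teacher(name):
    """Accept, or return a counter-example (input, check on the output)."""
    h = HYPOTHESES[name]
    for a in DOMAIN:
        if h(a) != target(a):
            expected = target(a)
            return (a, lambda out, expected=expected: out == expected)
    return ACCEPT

def cegis(learner, teacher):
    """Run the interaction; return the final answer and all hypotheses tried."""
    examples, trace = [], []
    while True:
        h = learner(examples)
        trace.append(h)
        if h is None:
            return None, trace
        answer = teacher(h)
        if answer is ACCEPT:
            return h, trace
        examples.append(answer)

result, trace = cegis(learner, teacher)
```

Each counter-example removes at least the current hypothesis from consideration, so on a finite hypothesis space the loop terminates, in line with Theorem 3.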
4 Experimental Results
4.1 Learning Reductions
Learning reductions was first considered by [1], and we have also [9] implemented,
benchmarked and evaluated a number of different approaches to the problem.
Problem. In descriptive complexity, a reduction from the $\sigma$-property defined by
$\varphi$ to the $\tau$-property defined by $\psi$ is a $(\sigma \to \tau)$-query $q$ that satisfies
$A \models \varphi \iff q(A) \models \psi$    (1)
for all $\sigma$-structures $A$. Of course, reductions should have less computational
power than the complexity classes they are used in and descriptive complexity focuses
on first-order reductions (i.e., first-order queries as reductions). Here, we focus on
quantifier-free first-order reductions, a weaker class that still suffices to capture
important complexity classes (cf. [9] for more details).
In order to make finding quantifier-free reductions decidable, we restrict
attention to a fixed size $n$, i.e., we require Formula (1) to hold only for structures
of size at most $n$. Assume that we are searching for a dimension-$k$ reduction from
the $\sigma$-property defined by $\varphi$ to the $\tau$-property defined by $\psi$.
Let $P$ be the set of $\sigma$-structures of size at most $n$ that satisfy $\varphi$, $Q$ be the
set of $\tau$-structures of size at most $n^k$ that satisfy $\psi$, and $\bar{P}$ and $\bar{Q}$ be
their complements up to the size bounds. Our target class is
$C = (P \times Q) \cup (\bar{P} \times \bar{Q})$: we want a query that maps positive instances
to positive instances and negative instances to negative instances.
Outline. As an outline, we focus on reductions in which all formulas are in DNF
with $c$ conjunctions. We fix $\varphi_0$ to be always true and the dimension $k$ (so the
new universe is the set of $k$-tuples of elements of the old universe). Finally, we
have a number of parameters determining the atomic formulas that may occur,
for example, whether to allow certain numeric predicates such as successor.
Footnote 6. We recommend GlueMiniSat as a SAT and RAReQS as a QBF solver, cf. [9].
Teacher. When the teacher receives a candidate hypothesis $q$, it checks
Formula (1), i.e. $A \models \varphi \iff q(A) \models \psi$ for all structures $A$ of size at
most $n$. This is done by assigning a Boolean variable to each bit of $A$ and $q(A)$,
and then constructing a QBF (or SAT) instance similar to the learner $L(S, q)$
described above. If the instance is satisfied, the assignment returned by the solver
is used to construct a structure $A$ that is a counter-example to the hypothesis $q$.
The teacher returns $(A, \psi)$ if $A \models \varphi$ and $(A, \neg\psi)$ otherwise.
Results. We refer to [9] for more details on our reduction-finding experiments. Our
approach using GlueMiniSat as solver significantly improves upon the previous,
specialized program ReductionFinder [1].
Example. Consider the NL-complete problems of reachability (given a directed
graph with labeled vertices $s$ and $t$, determine whether $t$ is reachable from $s$) and
all-pairs reachability (determine if a directed graph is strongly connected).
$\mathrm{Reach} := \mathrm{TC}[x, y.E(x, y)](s, t)$
$\mathrm{AllReach} := \forall x_1, x_2\, (\mathrm{TC}[y, z.E(y, z)](x_1, x_2))$.
Our system finds the following correct reduction for $n = 3$ in less than a second:
$(k := 1,\ \varphi_0 := \mathrm{true},\ \varphi_1 := x_1 = s \vee x_2 = t \vee E(x_2, x_1))$.
This reverses all edges in the original graph, adds directed edges from $s$ to all
vertices and also adds directed edges to $t$ from all vertices. A similar reduction
exists without reversing the edges; however, the above is our actual output.
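The reduction above can be checked exhaustively on small graphs. The following sketch enumerates every directed graph with at most `max_n` vertices and every choice of s and t, and compares reachability in the input with strong connectivity of the output. Explicit enumeration is used here only for illustration and stands in for the solver-based teacher; the function names are hypothetical.

```python
from itertools import product

def reachable(edges, src):
    """Set of vertices reachable from src (including src itself)."""
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for (a, b) in edges:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def reach(edges, s, t):
    return t in reachable(edges, s)

def all_reach(n, edges):
    """True iff the graph on vertices 0..n-1 is strongly connected."""
    return all(len(reachable(edges, u)) == n for u in range(n))

def reduce_graph(n, edges, s, t):
    """phi_1 := x1 = s or x2 = t or E(x2, x1), with dimension k = 1."""
    return {(x1, x2) for x1 in range(n) for x2 in range(n)
            if x1 == s or x2 == t or (x2, x1) in edges}

def check_reduction(max_n):
    """Return a counter-example (n, edges, s, t), or None if none exists."""
    for n in range(1, max_n + 1):
        pairs = [(a, b) for a in range(n) for b in range(n)]
        for bits in product([False, True], repeat=len(pairs)):
            edges = {p for p, bit in zip(pairs, bits) if bit}
            for s in range(n):
                for t in range(n):
                    out = reduce_graph(n, edges, s, t)
                    if reach(edges, s, t) != all_reach(n, out):
                        return (n, edges, s, t)
    return None

counter_example = check_reduction(3)
```

The number of graphs grows as 2 to the power n squared, so this brute-force check is only practical for very small n.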
4.2 Learning Fast Programs
In this section, we consider synthesizing programs for a given logical specification.
This is similar to [8]; however, they focused on synthesizing formulas in more
specialized logics.
Problem. In program synthesis, we are given a specification and hope to find an
efficient program satisfying it. For us, a specification is a way to verify whether
the output $q(A)$ is accepted for $A$. There are two major variations: either the
output for each structure is unique (as in our example here), or there is a set of
acceptable outputs (e.g., when finding a satisfying solution for SAT instances).
In our example here, we have a query $s$ in an expressive logic (SO) and wish
to find an equivalent query in a less-expressive logic. In particular, we consider
the problem of identifying winning regions in finite games, i.e., directed graphs
with a predicate $V_0(x)$ meaning that vertex $x$ belongs to Player 0. Decidability
requires restricting the size $n$ of the games, and we set $C$ to be the set of pairs
$(A, B)$ such that $A$ is a finite game of size at most $n$ and $B$ is the extension of $A$
with the winning region identified in a new monadic predicate $W$.
Outline. Similar to the case for reductions, we use several parameters to fix an
outline. For reductions, we considered only quantifier-free formulas; however,
in this section we need more expressive power. We focus on least fixed-point
formulas, with a single fixed-point operator that is outermost (footnote 7). We fix
the arity $a$ of the fixed-point predicate, and assume that the inner formula is a
disjunction of $l$ quantified CNF formulas with $k$ variables each. We use such an
outline for exactly one selected relation $W$ in the query; all others are set to identity.
Teacher. Assume that we have a SO query $s$ that produces the desired extension
with the winning region identified. Given a hypothesis $q$, the teacher can again
use a SAT or QBF solver (as was the case for reductions) to guess a game $A$ of
size at most $n$ such that in $q(A)$ the new relation $W(x)$ is not equivalent to the
region in $s(A)$. Let $a_1, \ldots, a_k$ be the winning positions in $s(A)$. The teacher then
returns the pair
$(A,\ \forall x\, (W(x) \leftrightarrow \bigvee_i (x = a_i)))$.
Example. Consider the case of identifying winning regions in finite reachability
games. When the current vertex belongs to a player, that player chooses an
outgoing edge and moves to a connected vertex. Player 1 loses if the play reaches
a vertex that belongs to her and has no outgoing edge. Similarly, Player 0 loses if
the play reaches a position where he must but cannot move, but also if the play
becomes a cycle and goes on forever. The goal is to identify the vertices from
which Player 0 has a winning strategy. That is, we want a formula $\varphi(x)$ which
holds exactly on the vertices for which Player 0 has a winning strategy.
Reachability games are positional, i.e., it suffices to consider strategies that
depend only on the current position and not on the history of the game. Therefore,
a strategy of Player 1 can be defined as a binary relation $S_1$ that is a subset of
the edges and that, for vertices of Player 0, contains all successors of the vertex
as well. Then, $x$ is winning for Player 1 (by $S_1$) if all vertices reachable from $x$
by $S_1$ either belong to Player 0 or have an $S_1$-successor. This is easily expressible
using the TC operator and guessing the strategy leads to a second-order formula
$\varphi_0(x)$ which holds exactly if $x$ is winning for Player 0.
In reachability games, the following LFP formula defines the winning region
for Player 0:
$\mathrm{LFP}\big[\, W(x) = \big(\forall x_1 : (W(x_1) \vee \neg E(x, x_1)) \wedge \neg V_0(x)\big) \vee \exists x_1 : \big(W(x_1) \wedge E(x, x_1) \wedge V_0(x)\big) \big](x)$.
The LFP operator recursively defines $W(x)$, starting with the empty set and
adding tuples that satisfy the formula on the right until a fixed-point is reached.
Therefore, this formula says that a position $x$ is winning for Player 0 if
(a) it is the opponent's move and all outgoing edges go to positions we win; or
(b) it is Player 0's move and there is an edge to a winning position.
Recall that the fixed-point predicate is initially empty. Therefore, the winning
positions after one iteration are the positions belonging to the opponent with no
outgoing edges. Then, the winning region grows gradually until it is the attractor
of those positions, which is correct. An equivalent, slightly longer formula is
found by our program in less than a minute for $n = 3$.
Footnote 7. This is a normal form for fixed-point logics, although the arity of the
fixed point may increase if it is required to be outermost.
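The fixed-point iteration described by conditions (a) and (b) can be sketched directly. The function below recomputes W from the previous stage until it stabilises; it is an illustration of the semantics of the hand-written LFP formula, not the formula found by the system, and the function name and the example game are hypothetical.

```python
def winning_region(vertices, edges, player0):
    """Vertices from which Player 0 wins, by iterating the LFP formula.

    A vertex x enters W if
    (a) x does not belong to Player 0 and every successor is in W, or
    (b) x belongs to Player 0 and some successor is in W.
    """
    W = set()
    while True:
        new_W = set()
        for x in vertices:
            successors = [y for (a, y) in edges if a == x]
            if x not in player0 and all(y in W for y in successors):
                new_W.add(x)
            elif x in player0 and any(y in W for y in successors):
                new_W.add(x)
        if new_W == W:
            return W
        W = new_W

# Example game: vertices 0 and 2 belong to Player 0, vertices 1 and 3 to Player 1.
vertices = {0, 1, 2, 3}
edges = {(0, 1), (0, 2), (2, 2), (3, 1), (3, 2)}
region = winning_region(vertices, edges, player0={0, 2})
```

In this example vertex 1 is a dead end for Player 1, so it is winning for Player 0 after the first iteration, and vertex 0 is added in the second. From vertex 2 the play cycles forever, and from vertex 3 Player 1 can move to vertex 2, so neither is in the winning region.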
Further work. The LFP formula for reachability games can be written by hand,
but our motivation for presenting this example is the hope to compute
polynomial-time solvers for other games. In particular, weak parity games and parity
games are also positional, so it is trivial to write a SO formula defining the winning
region (as we did for reachability games). But the polynomial-time program for
solving weak parity games is complicated, and the existence of a polynomial-time
solver for full parity games (which is equivalent to the existence of an LFP
formula) is a long-standing open problem.
Our implementation can also search for other programs, for example, programs
for graph isomorphism, graph coloring or SAT. There are natural classes where
these problems are in polynomial time (footnote 8).
4.3 Learning Formulas from Examples
Recently, a system was implemented [10] that represents board games as
relational structures and learns their rules from observing example play videos.
Fundamentally, the system works by computing minimal distinguishing formulas
for sets of structures, e.g., formulas satisfied by structures representing winning
positions and by none of the losing ones. We implement the computation of
distinguishing formulas in our framework and compare the performance.
Problem. Let $P$ and $N$ be finite sets of $\sigma$-structures. We want to learn a formula
$\varphi$ without free variables such that $A \models \varphi$ for all $A \in P$ and for no $A \in N$.
Unlike previous tasks, we want a minimal such formula, not just an arbitrary one.
Outline and Teacher. We use the same outlines as for the program synthesis
task. However, we focus on first-order formulas, i.e., we set $a = 0$. Moreover, to
find minimal formulas we iterate through $k$, and for each $k$ we range $l$ from 1 to
$k + 1$. The teacher is simple: given a formula $\varphi$ it checks if $A \models \varphi$ for all
$A \in P$, and if not, it returns $(A, \mathrm{true})$ for some $A \not\models \varphi$. Then, it checks if
$A \not\models \varphi$ for all $A \in N$ and returns $(A, \mathrm{false})$ (footnote 9) if this is not
the case for some $A \models \varphi$.
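The teacher for distinguishing formulas is simple enough to sketch in a few lines. In the example below the candidate sentences are a short hand-written list ordered by size, standing in for the iteration over k and l, and each sentence is a Python predicate on a graph. The list `CANDIDATES` and the function names are hypothetical, and plain enumeration replaces the SAT-based search.

```python
# Candidate sentences over directed graphs (U, E), smallest first.
CANDIDATES = [
    ("exists x. E(x,x)",
     lambda U, E: any((x, x) in E for x in U)),
    ("exists x,y. E(x,y)",
     lambda U, E: any((x, y) in E for x in U for y in U)),
    ("forall x exists y. E(x,y)",
     lambda U, E: all(any((x, y) in E for y in U) for x in U)),
]

def teacher(formula, P, N):
    """Return a counter-example (A, expected truth value), or None to accept."""
    for A in P:
        if not formula(*A):
            return (A, True)   # A is positive but the formula fails on it
    for A in N:
        if formula(*A):
            return (A, False)  # A is negative but the formula holds on it
    return None

def minimal_distinguishing(P, N):
    """First candidate, in order of size, that the teacher accepts."""
    for name, formula in CANDIDATES:
        if teacher(formula, P, N) is None:
            return name
    return None

P = [({0, 1}, {(0, 1), (1, 0)})]   # positive example: a 2-cycle
N = [({0, 1}, {(0, 1)})]           # negative example: a single edge
found = minimal_distinguishing(P, N)
```

Trying candidates in order of size and stopping at the first accepted one is what makes the returned sentence minimal with respect to that ordering.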
Results. We substituted our SAT-based learner for the procedure for computing
distinguishing formulas used in [10]. Comparing the results, the SAT-based
approach appears to offer significantly better performance.

System            | Breakthrough | Connect4 | Gomoku | Pawn-Whopping
Original system   | 113s         | 33s      | 11s    | 936s
SAT-based system  | 5s           | 16s      | 5s     | 131s

We use the same example plays for both systems; these examples were chosen
by hand for the original system in [10]. But our SAT-based approach searches
(faster) for formulas in more expressive logics, beyond reach for the original
system. For this reason, the resulting formulas are not always correct: they are
for Breakthrough and Gomoku, but not for Connect4 and Pawn-Whopping. It
would be easy to overcome this by adding examples or changing the outline to
match the more restrictive logics used by the original system.
Footnote 8. E.g., isomorphism of planar graphs, bounded-degree graphs, and graphs
excluding a minor; k-SAT and k-coloring are NL-complete for $k = 2$ and NP-complete
for $k \ge 3$.
Footnote 9. Here, true and false are satisfied by encoded Boolean structures.
5 Conclusions and Future Work
Thanks to the efficiency of modern SAT and QBF solvers, our general procedure
outperforms specialized programs both in learning reductions [1] and in learning
from examples [10]. We consider these early results promising and encourage
further experimentation with our freely available implementation.
There are many questions we leave unanswered. For example, there is a large
variance in runtime depending on the precise series of counter-examples given
by the teacher. We would like to know how to choose good counter-examples
and whether randomness [13] can help. We ask what outlines are good, how to
choose them to find the desired programs quickly and to make them readable.
We are interested in re-using sub-formulas found for one problem to speed up
learning in another one, and in extending our approach to quantitative settings.
References
1. Crouch, M., Immerman, N., Moss, J.E.B.: Finding reductions automatically. In:
Fields of Logic and Computation. LNCS, vol. 6300, pp. 181-200 (2010)
2. Fagin, R.: Generalized first-order spectra and polynomial-time recognizable sets.
In: Complexity of Computation, SIAM-AMS Proceedings. vol. 7, pp. 43-73 (1974)
3. Grädel, E., Kolaitis, P.G., Libkin, L., Marx, M., Spencer, J., Vardi, M.Y., Venema,
Y., Weinstein, S.: Finite Model Theory and Its Applications. Springer (2007)
4. Gulwani, S.: Dimensions in program synthesis. In: Proc. PPDP 2010. pp. 13-24.
New York, NY, USA (2010)
5. Immerman, N.: Relational queries computable in polynomial time. In: Proc.
STOC'82. pp. 147-152 (1982)
6. Immerman, N.: Languages that capture complexity classes. SIAM J. Comput. 16(4),
760-778 (1987)
7. Immerman, N.: Descriptive Complexity. Springer-Verlag (1999)
8. Itzhaky, S., Gulwani, S., Immerman, N., Sagiv, M.: A simple inductive synthesis
methodology and its applications. In: Proc. OOPSLA'10. pp. 36-46 (2010)
9. Jordan, C., Kaiser, L.: Experiments with reduction finding. In: Proc. SAT 2013
10. Kaiser, L.: Learning games from videos guided by descriptive complexity. In: Proc.
AAAI 2012. pp. 963-970 (2012)
11. Kitzelmann, E.: Inductive programming: A survey of program synthesis techniques.
In: AAIP 2009, Revised Papers. LNCS, vol. 5812, pp. 50-73 (2010)
12. Vardi, M.Y.: The complexity of relational query languages. In: Proc. STOC'82. pp.
137-146 (1982)
13. Zeugmann, T.: From learning in the limit to stochastic finite learning. Theoret.
Comput. Sci. 364(1), 77-97 (2006)