A Monadic Multi-Stage Metalanguage: E.Moggi and S.Fagorzi DISI, Univ. of Genova, v. Dodecaneso 35, 16146 Genova, Italy
of finite sequences $(e_i \mid i \in m)$ of elements of $E$, and $|\overline{e}|$ denotes its length (i.e., $m$). $\overline{e}_1, \overline{e}_2$ denotes the concatenation of $\overline{e}_1$ and $\overline{e}_2$.
Term equivalence, written $\equiv$, is $\alpha$-conversion. $\mathrm{FV}(e)$ is the set of variables free in $e$. If $E$ is a set of terms, then $E_0$ is the set of $e \in E$ s.t. $\mathrm{FV}(e) = \emptyset$. $e[x_i := e_i \mid i \in m]$ (and $e[\overline{x} := \overline{e}]$) denotes parallel substitution (modulo $\equiv$).
$f : A \stackrel{\mathrm{fin}}{\rightarrow} B$ means that $f$ is a partial function from $A$ to $B$ with a finite domain, written $\mathrm{dom}(f)$. We write $\{a_i : b_i \mid i \in m\}$ for the partial function mapping $a_i$ to $b_i$ (where the $a_i$ must be different, i.e. $a_i = a_j$ implies $i = j$).
We use the following operations on partial functions: $\emptyset$ is the everywhere undefined partial function; $f_1, f_2$ denotes the union of two partial functions with disjoint domains; $f\{a : b\}$ denotes the extension of $f$ to $a \notin \mathrm{dom}(f)$; $f\{a = b\}$ denotes the update of $f$ in $a \in \mathrm{dom}(f)$.
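The operations on finite partial functions can be illustrated with a minimal Python sketch, assuming a partial function is modelled as a `dict`. The function names `union`, `extend` and `update` are hypothetical and chosen only for this illustration; the side conditions on domains are checked explicitly.

```python
def union(f1, f2):
    """f1, f2: union of two partial functions with disjoint domains."""
    overlap = f1.keys() & f2.keys()
    if overlap:
        raise ValueError(f"domains are not disjoint: {sorted(overlap)}")
    return {**f1, **f2}


def extend(f, a, b):
    """f{a: b}: extension of f to a point a outside dom(f)."""
    if a in f:
        raise KeyError(f"{a!r} is already in dom(f)")
    return {**f, a: b}


def update(f, a, b):
    """f{a = b}: update of f at a point a inside dom(f)."""
    if a not in f:
        raise KeyError(f"{a!r} is not in dom(f)")
    return {**f, a: b}


empty = {}                      # the everywhere undefined partial function
store = extend(empty, "l0", 0)  # allocate a new point
store = update(store, "l0", 1)  # overwrite an existing point
```

Each operation returns a new dictionary, so earlier partial functions stay unchanged, as in the mathematical reading.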
Given a BNF $e ::= P_1 \mid \ldots \mid P_m$, we write $e \mathrel{+}= P_{m+1} \mid \ldots \mid P_{m+n}$ as a shorthand for the extended BNF $e ::= P_1 \mid \ldots \mid P_{m+n}$.
We write $\longrightarrow^{*}$ for the reflexive and transitive closure of a relation $\longrightarrow$.
2 Monadic metalanguages, simplification and computation
We outline a general pattern for specifying the operational semantics of monadic metalanguages, which distinguishes between transparent simplification and programmable computation. This is possible because in a monadic metalanguage there is a clear distinction between term-constructors for building terms of computational types, and the other term-constructors that are computationally irrelevant. For computationally relevant term-constructors we give an operational semantics that ensures the correct sequencing of computational effects, e.g. by adopting some well-established technique for specifying the operational semantics of programming languages (see [WF94]), while for computationally irrelevant term-constructors it suffices to give local simplification rules, which can be applied non-deterministically (because they are semantics-preserving).
Remark 2.1 In [Wad99] Wadler adopts a similar style, which distinguishes pure from monadic reduction. However, his pure reduction is a deterministic strategy, while simplification is non-deterministic. In this respect, our approach is related to the Cham [BB90]: simplification corresponds to heating and computation to reaction.
Combinatory Reduction Systems. We work in the setting of Combinatory Reduction Systems (CRS) [Klo80], which extend Term Rewriting Systems (TRS) with binders. In Section 4 the uniformity of CRS descriptions is exploited for defining the extension with staging generically and concisely. In a CRS the syntax of terms is specified by a set $C$ of term-constructors with given arity $\# : C \to N^{*}$
$e \in E ::= x \mid c([\overline{x}_i]e_i \mid i \in m)$ with $\#c = (n_i \mid i \in m)$ and $\forall i \in m.\ |\overline{x}_i| = n_i$
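Variables $x$ belong to an infinite set $X$. More complex terms are built by applying a term-constructor $c$ to a sequence of abstractions $[\overline{x}_i]e_i$ binding the free occurrences of the $\overline{x}_i$ in $e_i$, thus the set of free variables in $c([\overline{x}_i]e_i \mid i \in m)$ is
$\mathrm{FV}(c([\overline{x}_i]e_i \mid i \in m)) = \bigcup \{\mathrm{FV}([\overline{x}_i]e_i) \mid i \in m\}$ where $\mathrm{FV}([\overline{x}]e) = \mathrm{FV}(e) \setminus \overline{x}$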
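The CRS term syntax and the definition of free variables can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class names `Var`, `Abs` and `Con`, and the constructor names `"lam"` and `"@"`, are hypothetical and not part of the formal development.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Abs:
    """An abstraction [x1, ..., xn]e binding the listed variables in e."""
    binders: Tuple[str, ...]
    body: "Term"


@dataclass(frozen=True)
class Con:
    """A term c([x_i]e_i | i in m): a constructor applied to abstractions."""
    name: str
    args: Tuple[Abs, ...]


Term = Union[Var, Con]


def arity(t: Con) -> Tuple[int, ...]:
    """The arity (n_i | i in m) read off from the binder lists."""
    return tuple(len(a.binders) for a in t.args)


def fv(e: Term) -> FrozenSet[str]:
    """Free variables: FV([x]e) = FV(e) minus x, unioned over the arguments."""
    if isinstance(e, Var):
        return frozenset({e.name})
    result: FrozenSet[str] = frozenset()
    for a in e.args:
        result |= fv(a.body) - frozenset(a.binders)
    return result


# lam([x] @(x, y)): x is bound, y is free
app = Con("@", (Abs((), Var("x")), Abs((), Var("y"))))
lam = Con("lam", (Abs(("x",), app),))
```

Arguments without binders are represented as abstractions with an empty binder list, which keeps the representation uniform across all term-constructors.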
In CRS rewrite rules e > e
), e) > e
[x: = e],
where $e$ and $e'$ are arbitrary terms. It is possible to give a more schematic syntax for rewrite rules, but it requires metavariables ranging over abstractions.
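Given a set $T$ of types $\tau$, a type system deriving judgments of the form $\Gamma \vdash e : \tau$, where $\Gamma : X \stackrel{\mathrm{fin}}{\rightarrow} T$ is a type assignment, is specified by assigning to each term-constructor $c$ of arity $\#c = (n_i \mid i \in m)$ a set of type schema $(\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$ consistent with $\#c$, i.e., $|\overline{\tau}_i| = n_i$ for $i \in m$. More precisely, the typing rules are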
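$x\ \dfrac{}{\Gamma \vdash x : \tau}\ \ \Gamma(x) = \tau$
$c\ \dfrac{\{\Gamma \vdash [\overline{x}_i]e_i : \overline{\tau}_i \Rightarrow \tau_i \mid i \in m\}}{\Gamma \vdash c([\overline{x}_i]e_i \mid i \in m) : \tau}\ \ c : (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$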
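where $\Gamma \vdash [\overline{x}]e : \overline{\tau} \Rightarrow \tau$ stands for $\Gamma, \{x_i : \tau_i \mid i \in m\} \vdash e : \tau$ with $\overline{x} = (x_i \mid i \in m)$ and $\overline{\tau} = (\tau_i \mid i \in m)$. Note that $\overline{\tau} \Rightarrow \tau$ is used in type schema, but it is not a type in $T$.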
Monadic Metalanguages. To specify a monadic metalanguage we define:
Types $\tau \in T$, including computational types $M\tau$.
Terms $e \in E$, including return $\mathrm{ret}(e)$ and monadic do $\mathrm{do}(e_1, [x]e_2)$, which corresponds to Haskell do-notation $x \leftarrow e_1; e_2$.
A type system, which amounts to giving for each term-constructor a set of type schema; in particular for ret and do the type schema are $\mathrm{ret} : \tau \Rightarrow M\tau$ and $\mathrm{do} : M\tau_1, (\tau_1 \Rightarrow M\tau_2) \Rightarrow M\tau_2$
A simplification relation $e \longrightarrow e'$
$\longrightarrow^{*} Id'_1$ and $Id_1$ can move $Id_1 \longmapsto Id_2$, then $Id'_1$ has a move $Id'_1 \longmapsto Id'_2$ s.t. $Id_2 \longrightarrow^{*} Id'_2$.
3 MML: a monadic metalanguage for imperative computations
We introduce a monadic metalanguage MML for imperative computations, which exemplifies the pattern outlined in Section 2 in a familiar case, namely a subset of Haskell with the IO-monad. Moreover, MML provides a starting point for the addition of staging.
Types $\tau \in T ::= \mathrm{nat} \mid M\tau \mid \tau_1 \to \tau_2 \mid \mathrm{ref}\ \tau$. The type nat of natural numbers avoids a degenerate BNF (we will ignore it most of the time).
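Term-constructors $c \in C ::= \mathrm{ret} \mid \mathrm{do} \mid \lambda \mid @ \mid \mathrm{new} \mid \mathrm{get} \mid \mathrm{set} \mid l$. Locations $l$ belong to an infinite set $L$ (they are not allowed in user-written programs, but are instrumental to the operational semantics). The type schema for term-constructors (from which one can also infer their arity) are
ret: $\tau \Rightarrow M\tau$ and do: $M\tau_1, (\tau_1 \Rightarrow M\tau_2) \Rightarrow M\tau_2$
@: $(\tau_1 \to \tau_2), \tau_1 \Rightarrow \tau_2$ and $\lambda$: $(\tau_1 \Rightarrow \tau_2) \Rightarrow (\tau_1 \to \tau_2)$
new: $\tau \Rightarrow M(\mathrm{ref}\ \tau)$, get: $\mathrm{ref}\ \tau \Rightarrow M\tau$ and set: $\mathrm{ref}\ \tau, \tau \Rightarrow M(\mathrm{ref}\ \tau)$
a signature $\Sigma : L \stackrel{\mathrm{fin}}{\rightarrow} T$ gives the type to locations, i.e., $l : \mathrm{ref}\ \tau$ when $\Sigma(l) = \tau$.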
The BNF for terms generated by the term-constructors above is
$e \in E ::= x \mid \mathrm{ret}(e) \mid \mathrm{do}(e_1, [x]e_2) \mid \lambda([x]e) \mid @(e_1, e_2) \mid \mathrm{new}(e) \mid \mathrm{get}(e) \mid \mathrm{set}(e_1, e_2) \mid l$
$\lambda([x]e)$ and $@(e_1, e_2)$ are $\lambda$-abstraction $\lambda x.e$ and application $e_1\, e_2$; new, get and set are the ML-like operations $\mathrm{ref}\ e$, $!e$ and $e_1 := e_2$ on references.
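The type system is parametric in $\Sigma$, and the rules for deriving judgments of the form $\Gamma \vdash_\Sigma e : \tau$ are
$x\ \dfrac{}{\Gamma \vdash_\Sigma x : \tau}\ \ \Gamma(x) = \tau$
$l\ \dfrac{}{\Gamma \vdash_\Sigma l : \mathrm{ref}\ \tau}\ \ \Sigma(l) = \tau$
$c\ \dfrac{\{\Gamma \vdash_\Sigma [\overline{x}_i]e_i : \overline{\tau}_i \Rightarrow \tau_i \mid i \in m\}}{\Gamma \vdash_\Sigma c([\overline{x}_i]e_i \mid i \in m) : \tau}\ \ c : (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$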
Simplification $\longrightarrow$ is the compatible closure of $@(\lambda([x]e_2), e_1) \longrightarrow e_2[x := e_1]$, i.e., $\beta$-reduction. We write $=$ for $\beta$-equivalence, i.e., the reflexive, symmetric and transitive closure of $\longrightarrow$. We recall the properties of simplification ($\beta$-reduction) relevant for our purposes.
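Proposition 3.1 (Congr) The equivalence $=$ induced by $\longrightarrow$ is a congruence.
Proposition 3.2 (CR) The simplification relation $\longrightarrow$ is confluent.
Proposition 3.3 (SR) If $\Gamma \vdash_\Sigma e : \tau$ and $e \longrightarrow e'$, then $\Gamma \vdash_\Sigma e' : \tau$.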
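To make the simplification relation concrete, here is a minimal Python sketch of $\beta$-reduction with capture-avoiding substitution. It is an illustration under stated assumptions: terms are encoded as tuples `("var", x)`, `("lam", x, body)` and `("app", e1, e2)`, the function names are hypothetical, and `simplify_once` fixes one particular redex (leftmost-outermost), whereas simplification itself may pick any redex.

```python
import itertools

_counter = itertools.count()  # source of suffixes for renamed bound variables


def free_vars(e):
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "lam":
        return free_vars(e[2]) - {e[1]}
    return free_vars(e[1]) | free_vars(e[2])


def subst(e, x, s):
    """e[x := s], renaming bound variables when they would capture FV(s)."""
    tag = e[0]
    if tag == "var":
        return s if e[1] == x else e
    if tag == "app":
        return ("app", subst(e[1], x, s), subst(e[2], x, s))
    y, body = e[1], e[2]
    if y == x:                      # x is shadowed: nothing to substitute
        return e
    if y in free_vars(s):           # rename y to a name not used nearby
        avoid = free_vars(s) | free_vars(body) | {x}
        z = f"{y}_{next(_counter)}"
        while z in avoid:
            z = f"{y}_{next(_counter)}"
        body = subst(body, y, ("var", z))
        y = z
    return ("lam", y, subst(body, x, s))


def simplify_once(e):
    """Contract one beta-redex inside e, or return None if there is none."""
    tag = e[0]
    if tag == "app":
        f, a = e[1], e[2]
        if f[0] == "lam":
            return subst(f[2], f[1], a)
        r = simplify_once(f)
        if r is not None:
            return ("app", r, a)
        r = simplify_once(a)
        return None if r is None else ("app", f, r)
    if tag == "lam":
        r = simplify_once(e[2])
        return None if r is None else ("lam", e[1], r)
    return None


identity = ("lam", "x", ("var", "x"))
const = ("lam", "x", ("lam", "y", ("var", "x")))
```

Applying `const` to the free variable `y` shows why renaming matters: the inner binder must be renamed so that the argument is not captured.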
Remark 3.4 Several extensions can be handled at the level of simplification.
The extension with a datatype, like nat or
1
2
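, amounts to adding term-constructors for introduction and elimination (zero: nat, succ: $\mathrm{nat} \Rightarrow \mathrm{nat}$ and case: $\mathrm{nat}, \tau, (\mathrm{nat} \Rightarrow \tau) \Rightarrow \tau$) and simplification rules describing how they interact ($\mathrm{case}(\mathrm{zero}, e_0, [x]e_1) \longrightarrow e_0$ and $\mathrm{case}(\mathrm{succ}(e), e_0, [x]e_1) \longrightarrow e_1[x := e]$).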
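Recursive definitions can be handled by a term-constructor fix: $(\tau \Rightarrow \tau) \Rightarrow \tau$ with simplification rule $\mathrm{fix}([x]e) \longrightarrow e[x := \mathrm{fix}([x]e)]$. However, if one wants simplification of well-typed terms to terminate, then the type schema for fix should be $(M\tau \Rightarrow M\tau) \Rightarrow M\tau$ and $\mathrm{fix}([x]e)$ becomes a computational redex.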
A test for equality of references ifeq: ref , ref ,
, e
, E
), then dom(
) = dom() and
FV(
) FV(), FV(e
) FV(E).
2. If Id > Id
is closed.
When the program fragment under consideration is a computational redex, it does not matter whether simplification is done before or after computation.
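Administrative steps, involving only the evaluation context
A.0 $(\mu, \mathrm{ret}(e), \Box) \longmapsto \mathrm{ok}$
A.1 $(\mu, \mathrm{ret}(e_1), E[\mathrm{do}(\Box, [x]e_2)]) \longmapsto (\mu, e_2[x := e_1], E)$
A.2 $(\mu, \mathrm{do}(e_1, [x]e_2), E) \longmapsto (\mu, e_1, E[\mathrm{do}(\Box, [x]e_2)])$
Imperative steps, involving only the store
I.1 $(\mu, \mathrm{new}(e), E) \longmapsto (\mu\{l : e\}, \mathrm{ret}(l), E)$, where $l \notin \mathrm{dom}(\mu)$
I.2 $(\mu, \mathrm{get}(l), E) \longmapsto (\mu, \mathrm{ret}(e), E)$, provided $e = \mu(l)$
I.3 $(\mu, \mathrm{set}(l, e), E) \longmapsto (\mu\{l = e\}, \mathrm{ret}(l), E)$, provided $l \in \mathrm{dom}(\mu)$
Table 1: Computation Relation for MML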
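The rules of Table 1 can be read as an abstract machine. The following Python sketch is one possible rendering under explicit assumptions: terms are tuples, the evaluation context is a stack of frames `(x, e2)` standing for $\mathrm{do}(\Box, [x]e_2)$ with the innermost frame last, the literal form `("nat", n)` is added for the example, programs are closed (so naive substitution suffices), and simplification is omitted, so the term must already be a computational redex at each step. All function names are hypothetical.

```python
def subst(e, x, s):
    """e[x := s] for a closed term s (no capture can occur)."""
    tag = e[0]
    if tag == "var":
        return s if e[1] == x else e
    if tag in ("loc", "nat"):
        return e
    if tag == "do":
        _, e1, y, e2 = e
        return ("do", subst(e1, x, s), y, e2 if y == x else subst(e2, x, s))
    # ret, new, get, set: substitute in every sub-term
    return (tag,) + tuple(subst(a, x, s) for a in e[1:])


def step(mu, e, E):
    """One computation step on (mu, e, E); returns 'ok' or the next configuration."""
    tag = e[0]
    if tag == "ret":
        if not E:
            return "ok"                                   # A.0
        x, e2 = E[-1]
        return mu, subst(e2, x, e[1]), E[:-1]             # A.1
    if tag == "do":
        _, e1, x, e2 = e
        return mu, e1, E + ((x, e2),)                     # A.2
    if tag == "new":
        n = len(mu)
        while ("loc", n) in mu:                           # pick l outside dom(mu)
            n += 1
        l = ("loc", n)
        return {**mu, l: e[1]}, ("ret", l), E             # I.1
    if tag == "get" and e[1] in mu:
        return mu, ("ret", mu[e[1]]), E                   # I.2
    if tag == "set" and e[1] in mu:
        return {**mu, e[1]: e[2]}, ("ret", e[1]), E       # I.3
    raise ValueError(f"stuck configuration: {e!r}")


def run(e, max_steps=1000):
    """Iterate step from the empty store and context; return the last configuration."""
    cfg = ({}, e, ())
    for _ in range(max_steps):
        nxt = step(*cfg)
        if nxt == "ok":
            return cfg
        cfg = nxt
    raise RuntimeError("step budget exhausted")


# l <- new(0); _ <- set(l, 1); get(l)
prog = ("do", ("new", ("nat", 0)), "l",
        ("do", ("set", ("var", "l"), ("nat", 1)), "_",
         ("get", ("var", "l"))))
final_store, final_term, final_ctx = run(prog)
```

Representing the context as a stack makes A.1 and A.2 a pop and a push, which mirrors the way $E[\mathrm{do}(\Box, [x]e_2)]$ adds the new frame around the hole.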
: M
: M
do
: M
2
E: M
[x]e:
1
M
2
: M
1
E[do(, [x]e)]: M
, then
1. Id > D implies D
s.t. Id
> D
and D
> D
2. Id
> D
where D and D
E: M
.
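Definition 3.7 We write $\vdash_\Sigma (\mu, e, E)$ if and only if
$\mathrm{dom}(\Sigma) = \mathrm{dom}(\mu)$
$\mu(l) = e_l$ and $\Sigma(l) = \tau_l$ imply $\vdash_\Sigma e_l : \tau_l$
there exists $\tau$ such that $\vdash_\Sigma e : M\tau$ is derivable
there exists $\tau'$ such that $\Box : M\tau \vdash_\Sigma E : M\tau'$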
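$Id_1$ and $Id_1 \longrightarrow Id_2$, then $\vdash_\Sigma Id_2$
2. If $\vdash_{\Sigma_1} Id_1$ and $Id_1 \longmapsto Id_2$, then there exists $\Sigma_2 \supseteq \Sigma_1$ s.t. $\vdash_{\Sigma_2} Id_2$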
Proof The first claim is an easy consequence of Proposition 3.3. The second is proved by case-analysis on the computation rules of Table 1.
Theorem 3.9 (Progress) If
e: M.
4 MMML: a multi-stage extension of MML
We describe a monadic metalanguage MMML obtained by adding staging to MML. At the level of syntax, type system and simplification the extension is generic, i.e., applicable to any monadic metalanguage (as defined in Section 2).
The BNF for types $\tau \in T \mathrel{+}= \langle\tau\rangle$ is extended with code types.
The BNF for term-constructors $c \in C \mathrel{+}= \mathrm{up} \mid \mathrm{dn} \mid c_V \mid c_M$ is extended with up, dn and two recursive productions $c_V$ and $c_M$, which capture the reflective nature of the extension (the set of term-constructors for MMML is infinite, although that for MML is finite). The type schema for the additional term-constructors are
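up: $\tau \Rightarrow \langle\tau\rangle$ is MetaML cross-stage persistence (aka binary inclusion).
dn: $\langle\tau\rangle \Rightarrow M\tau$ is compilation of (potentially open) code. An attempt to compile open code causes an unresolved link error (an effect not present in MML), thus dn has a computational result type.
if $c : (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$, then
$c_V : (\langle\overline{\tau}_i\rangle \Rightarrow \langle\tau_i\rangle \mid i \in m) \Rightarrow \langle\tau\rangle$ builds code representing a term $c(\ldots)$
$c_M : (\langle\overline{\tau}_i\rangle \Rightarrow M\langle\tau_i\rangle \mid i \in m) \Rightarrow M\langle\tau\rangle$ builds a computation that generates code representing $c(\ldots)$
where $\langle\tau_i \mid i \in m\rangle$ stands for the sequence $(\langle\tau_i\rangle \mid i \in m)$. For instance, $\lambda_V : (\langle\tau_1\rangle \Rightarrow \langle\tau_2\rangle) \Rightarrow \langle\tau_1 \to \tau_2\rangle$ and $\lambda_M : (\langle\tau_1\rangle \Rightarrow M\langle\tau_2\rangle) \Rightarrow M\langle\tau_1 \to \tau_2\rangle$.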
The key difference between $c_V$ and $c_M$ (reflected in their type schema) is that generating code with $c_M$ may have computational effects, while generating code with $c_V$ does not. For instance, the computation $\lambda_M([x]e)$ generates a fresh name (a new effect related to computation under a binder), performs the computation $e$ to generate the code $e'$ for the body of the $\lambda$-abstraction, and finally returns the code $\lambda_V([x]e')$ for the $\lambda$-abstraction.
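The passage from the type schema of $c$ to those of $c_V$ and $c_M$ is mechanical, as the following Python sketch suggests. The encoding is an assumption made for illustration: a schema is a pair `(args, result)`, each argument is a pair `(binder_types, body_type)`, and types are strings or tagged tuples such as `("code", t)`, `("M", t)` and `("fun", t1, t2)`. The function names are hypothetical.

```python
def code(t):
    """The code type <t>."""
    return ("code", t)


def comp(t):
    """The computational type M t."""
    return ("M", t)


def schema_V(schema):
    """Schema of c_V: wrap every binder, body and result type in a code type."""
    args, result = schema
    return ([(tuple(code(b) for b in bs), code(t)) for bs, t in args],
            code(result))


def schema_M(schema):
    """Schema of c_M: as for c_V, but bodies and result are computations of code."""
    args, result = schema
    return ([(tuple(code(b) for b in bs), comp(code(t))) for bs, t in args],
            comp(code(result)))


# lambda: (t1 => t2) => (t1 -> t2)
lam_schema = ([(("t1",), "t2")], ("fun", "t1", "t2"))
lam_V = schema_V(lam_schema)
lam_M = schema_M(lam_schema)
```

Since the results are again schemas of the same shape, the two functions can be iterated, which reflects the recursive productions $c_V$ and $c_M$ in the BNF for term-constructors.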
The BNF for terms $e \in E$ and the type system (for deriving judgments of the form $\Gamma \vdash_\Sigma e : \tau$) are extended in
the only possible way, given the type schema for the term-constructors. In MMML (unlike
and MetaML)
there is no need to include level information in typing judgments, since it is already explicit in types and terms. For instance, a MetaML type $\tau$ at level 1 becomes $\langle\tau\rangle$ in MMML, and a $\lambda$ at level 1 becomes a $\lambda_V$ or $\lambda_M$.
Simplification $\longrightarrow$ is unchanged, i.e., no new simplification rules are added. The properties of simplification established in Section 3 (i.e., Propositions 3.1, 3.2 and 3.3) continue to hold and their proofs are unchanged.
Remark 4.1 One may wonder whether there is a need to have both $c_M$ and $c_V$, or whether $c_M$ can be defined in terms of $c_V$ and term-constructors for computational types. Indeed, this is the case when $c$ is not a binder. For instance, when $c : \tau_1 \Rightarrow \tau_2$ and $e : M\langle\tau_1\rangle$, we could define $c_M(e)$ as $\mathrm{do}(e, [x]\mathrm{ret}(c_V(x)))$. However, when $c$ is a binder, like $\lambda$, one cannot move the computation of the body of $\lambda_M([x]e)$ outside the binder. One could adopt a more concrete representation of terms using first-order abstract syntax (FOAS), and introduce a monadic operation gensym: $M\langle\tau\rangle$ to generate a fresh name (see [Fil01]). But in this approach $\lambda_V$ is no longer a binder, and this would be a drastic loss of abstraction for a reference semantics.
In $\lambda$-calculus one can encode a term-constructor $c$ as a constant $c$
$@e_1@(\lambda x.e_2)$, where we adopt the standard infix notation for application @. Then we
can use $c_V$ to encode $c_M$ and $c_V$. For instance, $\mathrm{do}_V(e_1, [x]e_2)$ becomes $\mathrm{do}_V @_V e_1 @_V (\lambda_V x.e_2)$, and $\mathrm{do}_M(e_1, [x]e_2)$ becomes $\mathrm{do}@e_1@(\lambda c.\mathrm{do}@(\lambda_M x.e_2)@(\lambda f.\mathrm{do}_V @_V c @_V f))$. With this encoding (unlike FOAS) there is no loss of abstraction; moreover, it gives better control on code generation, e.g. $\mathrm{do}_M(e_1, [x]e_2)$ (and its encoding) computes $e_1$ first, while $\mathrm{do}@(\lambda_M x.e_2)@(\lambda f.\mathrm{do}@e_1@(\lambda c.\mathrm{do}_V @_V c @_V f))$ computes $e_2$ first.
4.1 Computation
We now define configurations and the computation relation $Id \longmapsto Id'$
(X[, ret(e), E) if e = vc
err if vc undened
Table 3: Computation Relation for MMML
Lemma 4.5
1. If (X[, e, E) > (X
, e
, E
), then X
= X, dom(
) = dom(), CV(E
) = CV(E), FV(
)
FV(), FV(e
) FV(E).
2. If (X[, e, E) > (X
, e
, E
,
dom() dom(
), FV(
, e
) CV(E
) X
and FV(E
) X
CV(E
).
The bisimulation result (Theorem 3.6) is basically unchanged, but the proof must cover additional cases
corresponding to the computation rules in Table 3.
4.2 Type Safety
In MMML the definitions of well-formed configuration
E: M
must take into account the set $X$. For this reason we need a type assignment $\Delta$ which maps names $x \in X$ to code types $\langle\tau\rangle$.
Denition 4.6 We write
(X[, e, E)
e
l
:
l
there exists such that
e: M is derivable
there exists
such that , : M
E: M
e: .
We can now formulate the SR and progress properties for MMML.
Theorem 4.9 (SR)
1. If $\Delta \vdash_\Sigma Id_1$ and $Id_1 \longrightarrow Id_2$, then $\Delta \vdash_\Sigma Id_2$
, : M
: M
do
, : M
2
E: M
[x]e:
1
M
2
, : M
1
E[do(, [x]e)]: M
c
M
, : M)
E: M
v
i
:
i
)M
i
) [ i m
f
i
:
m+1+i
)M
m+1+i
) [ i n
, x
k
:
k
)[k p, : M
m
)
E[c
M
(v, [x], f)]: M
v = (v
i
[i m) and f = (f
i
[i n)
c
M
: (
i
)M
i
)[i m + 1 + n)M)
m
= (
k
[k p) and x = (x
k
[k p)
in particular
M
, : M
1
2
)
E: M
, x:
1
), : M
2
)
E[
M
([x])]: M
1
$Id_1$ and $Id_1 \longmapsto Id_2$, then there exist $\Sigma_2 \supseteq \Sigma_1$ and $\Delta_2 \supseteq \Delta_1$ s.t. $\Delta_2 \vdash_{\Sigma_2} Id_2$
Proof The first claim is straightforward (see Theorem 3.8). The second is proved by case-analysis on the computation rules, so we must cover the additional cases for the computation rules in Table 3, e.g.
(G.1) if Id
1
is (X[,
M
([x]e), E), then Id
2
is (X, x[, e, E[
M
([x])]) and the typings
1
, x:
1
)
1
e: M
2
)
and
1
, : M
1
2
)
1
E:
1
vc: )
and
1
, : M
1
E:
1
e: is derivable, therefore we can take
2
1
and
2
1
.
Lemma 4.10 If
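$\lambda_M x.\ l \leftarrow \mathrm{new}(x);\ \mathrm{get}(l) : M\langle\tau \to \tau\rangle$
1. $(\emptyset \mid \emptyset,\ \lambda_M x.\ l \leftarrow \mathrm{new}(x);\ \mathrm{get}(l),\ \Box)$ generate $x$, then create $l$
2. $(x \mid l = x,\ \mathrm{get}(l),\ \lambda_M x.\Box)$ get content of $l$
3. $(x \mid l = x,\ \mathrm{ret}(x),\ \lambda_M x.\Box)$ complete code generation of $\lambda$-abstraction
4. $(x \mid l = x,\ \mathrm{ret}(\lambda_V x.x),\ \Box)$ $x$ is bound by $\lambda_V$.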
This form of recapturing is allowed by [TD99], but not by [CMSar].
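No recapturing of extruded variable by another binder using the same name.
$l \leftarrow \mathrm{new}(0_V);\ (\lambda_M x.\mathrm{set}(l, x);\ \mathrm{ret}(x));\ z \leftarrow \mathrm{get}(l);\ \mathrm{ret}(\lambda_V x.z) : M\langle\tau \to \mathrm{nat}\rangle$
1. $(\emptyset \mid l = 0_V,\ (\lambda_M x.\mathrm{set}(l, x);\ \mathrm{ret}(x)),\ \Box;\ z \leftarrow \mathrm{get}(l);\ \mathrm{ret}(\lambda_V x.z))$ generate $x$ and assign it to $l$
2. $(x \mid l = x,\ \mathrm{ret}(\lambda_V x.x),\ \Box;\ z \leftarrow \mathrm{get}(l);\ \mathrm{ret}(\lambda_V x.z))$ first code generation completed
3. $(x \mid l = x,\ \mathrm{get}(l),\ z \leftarrow \Box;\ \mathrm{ret}(\lambda_V x.z))$ get content of $l$
4. $(x \mid l = x,\ \mathrm{ret}(x),\ z \leftarrow \Box;\ \mathrm{ret}(\lambda_V x.z))$ complete code generation of $\lambda$-abstraction
5. $(x \mid l = x,\ \mathrm{ret}(\lambda_V x'.x),\ \Box)$ the bound variable $x$ is renamed by substitution $\mathrm{ret}(\lambda_V x.z)[z := x]$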
No recapturing of extruded variable by its binder after code generation.
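$l \leftarrow \mathrm{new}(0_V);\ z \leftarrow (\lambda_M x.\lambda_M y.\mathrm{set}(l, y);\ \mathrm{ret}(x));\ f \leftarrow \mathrm{dn}(z);\ u \leftarrow \mathrm{get}(l);\ \mathrm{ret}(f\,u) : M(\mathrm{nat} \to \langle\mathrm{nat}\rangle)$
1. $(x, y \mid l = y,\ \mathrm{ret}(\lambda_V x.\lambda_V y.x),\ z \leftarrow \Box;\ f \leftarrow \mathrm{dn}(z);\ u \leftarrow \mathrm{get}(l);\ \mathrm{ret}(f\,u))$ code generation completed, $y$ is bound by $\lambda_V$ and leaked in the store
2. $(x, y \mid l = y,\ \mathrm{dn}(\lambda_V x.\lambda_V y.x),\ f \leftarrow \Box;\ u \leftarrow \mathrm{get}(l);\ \mathrm{ret}(f\,u))$ compile code
3. $(x, y \mid l = y,\ \mathrm{ret}(\lambda x.\lambda y.x),\ f \leftarrow \Box;\ u \leftarrow \mathrm{get}(l);\ \mathrm{ret}(f\,u))$ get content of $l$ and apply $f$ to it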
4. (x, y [ l = y, ret((x.y.x) y), ) the result simplies to (y
and
M
. The motivation for looking at the interactions between computational effects and run-time code generation comes from MetaML [MHP00, Tah99, TS00, CMSar]. We borrow
code types from MetaML (and
For instance, $\lambda_B$ is $\lambda$ when $B$ is empty; if $c$ is $\lambda_B$, then $c_V$ and $c_M$ are given by $\lambda_{BV}$ and $\lambda_{BM}$ respectively.
However, MMMLs annotations are sequences B V, M
, while those of
M
are natural number n. A
sequence $B$ identifies a natural number $n$, namely the length of $B$; moreover, for each $i < n$ it says whether computation at that level has been completed, as expressed by the different typing for $c_V$ and $c_M$. The refined annotations of term-constructors (and computational types) make it possible to distinguish the following situations:
$(\lambda_M x.e,\ E)$ we start generating code for a $\lambda$-abstraction
$(e,\ E[\lambda_M x.\Box])$ we have generated a fresh name $x$, and start generating code for the body
$(e,\ E[\lambda_M x.E'])$ we are somewhere in the middle of the computation generating code for the body
$(\mathrm{ret}(e),\ E[\lambda_M x.\Box])$ we have the code for the body of the $\lambda$-abstraction
$(\mathrm{ret}(\lambda_V x.e),\ E)$ we have the code for the $\lambda$-abstraction
All operational semantics proposed for MetaML or
of [DP96] with names a la FreshML (and intensional analysis), has an operational semantics
with steps modeling fresh name generation and recapturing, but its relations with
). for dn is
insucient to validate the local simplication rules for compilation.
Uniform representation in Logical Frameworks. The code types of MMML provide a uniform representation of terms, similar to the (weak) Higher-Order Abstract Syntax (HOAS) encoding of object logics in a logical framework (LF). Of course, in an LF there are stronger requirements on HOAS encodings, but any advance in the area of LF is likely to advance the state-of-the-art in meta-programming. Recently [Nan02] has made significant advances in the area of intensional analysis, i.e., the ability to analyze code (see [She01]), by building on [PG00].
Monadic intermediate languages. [BK99] advocates the use of MIL for expressing optimizing transformations. MMML could also be used for this purpose, but to obtain non-trivial optimizations one has to introduce more aggressive simplifications (than those strictly needed for defining the operational semantics) and refine monadic types with effect information as done in [BK99]. In general, we expect $\beta$-conversion $@(\lambda([x]e_2), e_1) \approx e_2[x := e_1]$ and the following equivalences to be observationally sound
$\mathrm{do}(\mathrm{ret}(e_1), [x]e_2) \approx e_2[x := e_1]$
$c_M([\overline{x}_i]\mathrm{ret}(e_i) \mid i \in m) \approx \mathrm{ret}(c_V([\overline{x}_i]e_i \mid i \in m))$
while other equivalences, like $@_V(\lambda_V([x]e_2), e_1) \approx e_2[x := e_1]$, are more fragile (e.g. they fail when the language is extended with intensional analysis).
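As a small consistency check between the two equivalences, consider a non-binding constructor $c : \tau_1 \Rightarrow \tau_2$. Remark 4.1 suggests that $c_M(e)$ could be defined as $\mathrm{do}(e, [x]\mathrm{ret}(c_V(x)))$; this definition is an assumption here, since MMML takes $c_M$ as primitive. Writing $\approx$ for the intended equivalence, the instance of the second equivalence for such a $c$ would then follow from the first:

```latex
\begin{align*}
c_M(\mathrm{ret}(e))
  &= \mathrm{do}(\mathrm{ret}(e),\, [x]\,\mathrm{ret}(c_V(x)))
     && \text{definition suggested in Remark 4.1} \\
  &\approx \mathrm{ret}(c_V(x))[x := e]
     && \text{first equivalence} \\
  &= \mathrm{ret}(c_V(e))
     && \text{substitution}
\end{align*}
```

For a binder such as $\lambda$ no such derivation is available, because $\lambda_M$ cannot be defined from $\lambda_V$ in this way; there the second equivalence is a genuinely separate requirement.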
Acknowledgments.
We thank François Pottier, Amr Sabry, Walid Taha for useful discussions, and the anonymous referees for
their valuable comments.
References
[BB90] G. Berry and G. Boudol. The chemical abstract machine. In Conf. Record 17th ACM Symp. on Principles of Programming Languages, POPL'90, San Francisco, CA, USA, 17-19 Jan. 1990, pages 81-94. ACM Press, New York, 1990.
[BK99] N. Benton and A. Kennedy. Monads, effects and transformations. In Proceedings of the Third International Workshop on Higher Order Operational Techniques in Semantics (HOOTS-99), volume 26 of Electronic Notes in Theoretical Computer Science, Paris, September 1999. Elsevier.
[CMSar] C. Calcagno, E. Moggi, and T. Sheard. Closed types for a safe imperative MetaML. Journal of Functional Programming, to appear.
[Dav96] Rowan Davies. A temporal-logic approach to binding-time analysis. In the Symposium on Logic in Computer Science (LICS '96), pages 184-195, New Brunswick, 1996. IEEE Computer Society Press.
[DP96] Rowan Davies and Frank Pfenning. A modal analysis of staged computation. In the Symposium on Principles of Programming Languages (POPL '96), pages 258-270, St. Petersburg Beach, 1996.
[Fil01] A. Filinski. Normalization by evaluation for the computational lambda-calculus. In Samson Abramsky, editor, Proc. of 5th Int. Conf. on Typed Lambda Calculi and Applications, TLCA'01, Krakow, Poland, 2-5 May 2001, volume 2044 of Lecture Notes in Computer Science, pages 151-165. Springer-Verlag, Berlin, 2001.
[GJ95] Robert Glück and Jesper Jørgensen. Efficient multi-level generating extensions for program specialization. In S. D. Swierstra and M. Hermenegildo, editors, Programming Languages: Implementations, Logics and Programs (PLILP'95), volume 982 of Lecture Notes in Computer Science, pages 259-278. Springer-Verlag, 1995.
[Klo80] J. W. Klop. Combinatory Reduction Systems. PhD thesis, University of Utrecht, 1980. Published as Mathematical Center Tract 129.
[MHP00] The MetaML Home Page, 2000. Provides source code and documentation online at http://www.cse.ogi.edu/PacSoft/projects/metaml/index.html.
[Nan02] Aleksandar Nanevski. Meta-programming with names and necessity. In Proceedings of the Seventh ACM SIGPLAN International Conference on Functional Programming (ICFP-02), ACM SIGPLAN notices, New York, October 2002. ACM Press.
[NT03] Michael Florentin Nielsen and Walid Taha. Environment classifiers. In Proceedings of the ACM Symposium on Principles of Programming Languages (POPL), N.Y., January 15-17 2003. ACM Press.
[PG00] A. M. Pitts and M. J. Gabbay. A metalanguage for programming with bound names modulo renaming. In Mathematics of Programme Construction, volume 1837 of Lecture Notes in Computer Science, pages 230-255. Springer-Verlag, 2000.
[PHA+97] Simon Peyton Jones, John Hughes, Lennart Augustsson, Dave Barton, et al. Haskell 1.4: A non-strict, purely functional language. Technical Report YALEU/DCS/RR-1106, Department of Computer Science, Yale University, Mar 1997. World Wide Web version at http://haskell.cs.yale.edu/haskell-report.
[She01] T. Sheard. Accomplishments and research challenges in meta-programming. In W. Taha, editor, Proc. of the Int. Work. on Semantics, Applications, and Implementations of Program Generation (SAIG), volume 2196 of LNCS, pages 2-46. Springer-Verlag, 2001.
[Tah99] Walid Taha. Multi-Stage Programming: Its Theory and Applications. PhD thesis, Oregon Graduate Institute of Science and Technology, 1999. Available from ftp://cse.ogi.edu/pub/tech-reports/README.html.
[TD99] Peter Thiemann and Dirk Dussart. Partial evaluation for higher-order languages with state. Available from http://www.informatik.uni-freiburg.de/thiemann/papers/index.html, 1999.
[TS00] Walid Taha and Tim Sheard. MetaML: Multi-stage programming with explicit annotations. Theoretical Computer Science, 248(1-2), 2000.
[Wad99] Philip Wadler. The marriage of effects and monads. In the International Conference on Functional Programming (ICFP '98), volume 34(1) of ACM SIGPLAN Notices, pages 63-74. ACM, June 1999.
[WF94] Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38-94, 1994.