A Monadic Multi-stage Metalanguage

E.Moggi and S.Fagorzi

DISI, Univ. of Genova, v. Dodecaneso 35, 16146 Genova, Italy


Abstract
We describe a metalanguage MMML, which makes explicit the order of evaluation (in the spirit of monadic metalanguages) and the staging of computations (as in languages for multi-level binding-time analysis). The main contribution of the paper is an operational semantics which is sufficiently detailed for analyzing subtle aspects of multi-stage programming, but also intuitive enough to serve as a reference semantics. For instance, the separation of computational types from code types makes clear the distinction between a computation for generating code and the generated code, and provides a basis for multi-lingual extensions, where a variety of programming languages (aka monads) coexist. The operational semantics consists of two parts: local (semantics-preserving) simplification rules, and computation steps executed in a deterministic order (because they may have side-effects). We focus on the computational aspects, thus we adopt a simple type system that can detect the usual type errors, but not unresolved link errors. Because of its explicit annotations, MMML is suitable as an intermediate language.
1 Introduction
Staging a computation into multiple steps is a well-known optimization technique used in algorithms, which exploits information available in early stages for generating code that will be executed in later stages. Multi-stage programming languages, like MetaML (see [MHP00, Tah99, TS00, CMSar]), provide constructs for expressing staging in a natural and concise manner, and must allow arbitrary interleaving of code generation and computation. Multi-stage programming is particularly convenient for defining generative components, which take as input a specification of user requirements and generate on the fly a customized component, or mobile applications, which need to adapt after each move, e.g. by assembling components downloaded remotely to generate code tailored to the local environment.
So far most of the theoretical research on multi-stage programming languages has focused on type systems (for the most recent proposals see [CMSar, Nan02, NT03]). The resulting operational semantics are often instrumental to a particular type system (thus difficult to relate and compare), and often ignore the subtle interactions between code generation and computational effects. In this paper, we provide a deeper understanding of the computational aspects of multi-stage programming, in the framework of a metalanguage with computational types $M\tau$ and code types $\langle\tau\rangle$: computational types classify terms describing computations, while code types classify terms representing other terms. We believe that in this framework one can have a fresh look at typing issues, and above all a generic approach for adding staging to a programming language (described in a monadic style), including a multi-lingual metalanguage.
An important principle of Haskell [PHA+97] is that pure functional evaluation (and all the optimization techniques that come with it) should not be corrupted by the addition of computational effects. In Haskell this separation has been achieved through the use of monads (like monadic IO and monadic state). When describing MMML we adopt this principle not only at the level of types, but also at the level of the operational semantics. In fact, we distinguish between simplification (described by local rewrite rules) and computation (that may cause side-effects).
Summary. Section 2 describes a general pattern for specifying the operational semantics of monadic metalanguages, which distinguishes simplification from computation. Section 3 exemplifies the general pattern by considering a monadic metalanguage MML for imperative computations. Section 4 introduces an extension MMML with staging, and explains how definitions and results for MML have to be modified and extended. Section 5 gives simple examples of MMML programs, which illustrate the most subtle points of the operational semantics. Section 6 discusses related work and issues specific to MMML.

Supported by MIUR project NAPOLI and EU project DART IST-2001-33477.


Notation. In the paper we use the following notations and conventions.
• $m$, $n$ range over the set $N$ of natural numbers. Furthermore, $m \in N$ is identified with the set $\{i \in N \mid i < m\}$ of its predecessors.
• $\overline{e}$ ranges over the set $E^{*}$ of finite sequences $(e_i \mid i \in m)$ of elements of $E$, and $|\overline{e}|$ denotes its length (i.e., $m$). $\overline{e}_1, \overline{e}_2$ denotes the concatenation of $\overline{e}_1$ and $\overline{e}_2$.
• Term equivalence, written $\equiv$, is $\alpha$-conversion. $FV(e)$ is the set of variables free in $e$. If $E$ is a set of terms, then $E_0$ is the set of $e \in E$ s.t. $FV(e) = \emptyset$. $e[x_i := e_i \mid i \in m]$ (and $e[\overline{x} := \overline{e}]$) denotes parallel substitution (modulo $\equiv$).
• $f: A \xrightarrow{fin} B$ means that $f$ is a partial function from $A$ to $B$ with a finite domain, written $dom(f)$. We write $\{a_i : b_i \mid i \in m\}$ for the partial function mapping $a_i$ to $b_i$ (where the $a_i$ must be different, i.e., $a_i = a_j$ implies $i = j$).
• We use the following operations on partial functions: $\emptyset$ is the everywhere undefined partial function; $f_1, f_2$ denotes the union of two partial functions with disjoint domains; $f\{a : b\}$ denotes the extension of $f$ to $a \notin dom(f)$; $f\{a = b\}$ denotes the update of $f$ in $a \in dom(f)$.
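The operations on finite partial functions listed above can be illustrated with a small Python sketch, assuming that a partial function with finite domain is modelled as a `dict`. The helper names `union`, `extend` and `update` are hypothetical and chosen only for this illustration; the point is that extension and update differ solely in their side condition on $dom(f)$.

```python
def union(f1, f2):
    """f1, f2: union of two partial functions with disjoint domains."""
    if f1.keys() & f2.keys():
        raise ValueError("domains are not disjoint")
    return {**f1, **f2}


def extend(f, a, b):
    """f{a : b}: extension, defined only when a is NOT in dom(f)."""
    if a in f:
        raise KeyError(f"{a!r} is already in dom(f)")
    return {**f, a: b}


def update(f, a, b):
    """f{a = b}: update, defined only when a IS in dom(f)."""
    if a not in f:
        raise KeyError(f"{a!r} is not in dom(f)")
    return {**f, a: b}


empty = {}                      # the everywhere undefined partial function
store = extend(empty, "l", 0)   # {l : 0}
store = update(store, "l", 1)   # {l = 1}
```

Both operations return a new dictionary, so earlier stores stay available, which is convenient when tracing a computation step by step.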
• Given a BNF $e ::= P_1 \mid \ldots \mid P_m$, we write $e \mathrel{+}= P_{m+1} \mid \ldots \mid P_{m+n}$ as a shorthand for the extended BNF $e ::= P_1 \mid \ldots \mid P_{m+n}$.
• We write $\to^{*}$ for the reflexive and transitive closure of a relation $\to$.
2 Monadic metalanguages, simplification and computation
We outline a general pattern for specifying the operational semantics of monadic metalanguages, which distinguishes between transparent simplification and programmable computation. This is possible because in a monadic metalanguage there is a clear distinction between term-constructors for building terms of computational types, and the other term-constructors that are computationally irrelevant. For computationally relevant term-constructors we give an operational semantics that ensures the correct sequencing of computational effects, e.g. by adopting some well-established technique for specifying the operational semantics of programming languages (see [WF94]), while for computationally irrelevant term-constructors it suffices to give local simplification rules, which can be applied non-deterministically (because they are semantics-preserving).
Remark 2.1 In [Wad99] Wadler adopts a similar style, which distinguishes pure from monadic reduction. However, his pure reduction is a deterministic strategy, while simplification is non-deterministic. In this respect, our approach is related to the Cham [BB90]: simplification corresponds to heating and computation to reaction.
Combinatory Reduction Systems. We work in the setting of Combinatory Reduction Systems (CRS) [Klo80], which extend Term Rewriting Systems (TRS) with binders. In Section 4 the uniformity of CRS descriptions is exploited for defining the extension with staging generically and concisely. In a CRS the syntax of terms is specified by a set $C$ of term-constructors with given arity $\#: C \to N^{*}$

$$e \in E ::= x \mid c([\overline{x}_i]e_i \mid i \in m) \quad \text{with } \#c = (n_i \mid i \in m) \text{ and } \forall i \in m.\ |\overline{x}_i| = n_i$$

Variables $x$ belong to an infinite set $X$. More complex terms are built by applying a term-constructor $c$ to a sequence of abstractions $[\overline{x}_i]e_i$ binding the free occurrences of the $\overline{x}_i$ in $e_i$, thus the set of free variables in $c([\overline{x}_i]e_i \mid i \in m)$ is

$$FV(c([\overline{x}_i]e_i \mid i \in m)) \triangleq \bigcup \{FV([\overline{x}_i]e_i) \mid i \in m\} \quad \text{where } FV([\overline{x}]e) = FV(e) - \overline{x}$$

In CRS rewrite rules $e \longrightarrow e'$ can be specified as in TRS, for instance the $\beta$-rule is $@(\lambda([x]e'), e) \longrightarrow e'[x := e]$, where $e$ and $e'$ are arbitrary terms. It is possible to give a more schematic syntax for rewrite rules, but it requires metavariables ranging over abstractions.
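The CRS term syntax, the definition of $FV$ and the $\beta$-rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `Var`, `Abs`, `Con` and the functions `fv`, `subst`, `beta`, `plain` are hypothetical, a non-binding argument is modelled as an abstraction with an empty binder sequence, and substitution avoids capture by renaming every binder it traverses to a name containing `#`, which is assumed not to occur in user-written variables.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Abs:
    """An abstraction [x_1 .. x_n]e; n = 0 for a non-binding argument."""
    binders: tuple
    body: object


@dataclass(frozen=True)
class Con:
    """A term c(abs_1, .., abs_m) built from a term-constructor c."""
    name: str
    args: tuple


def fv(e):
    """Free variables: FV([xs]e) = FV(e) - xs, and a union over arguments."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, Abs):
        return fv(e.body) - set(e.binders)
    return set().union(*(fv(a) for a in e.args))


_ids = count()


def subst(e, sub):
    """Parallel substitution e[x := sub[x]]; every binder met on the way is
    renamed to a fresh name, so no free variable of sub can be captured."""
    if isinstance(e, Var):
        return sub.get(e.name, e)
    if isinstance(e, Con):
        return Con(e.name, tuple(subst(a, sub) for a in e.args))
    fresh = tuple(f"{x}#{next(_ids)}" for x in e.binders)
    renaming = {x: Var(y) for x, y in zip(e.binders, fresh)}
    return Abs(fresh, subst(e.body, {**sub, **renaming}))


def beta(e):
    """The beta-rule @(lam([x]e'), e) -> e'[x := e] at the root, else None."""
    if isinstance(e, Con) and e.name == "@":
        fun, arg = (a.body for a in e.args)
        if isinstance(fun, Con) and fun.name == "lam":
            (lam_abs,) = fun.args
            (x,) = lam_abs.binders
            return subst(lam_abs.body, {x: arg})
    return None


def plain(e):
    """Wrap a term as an argument that binds no variables."""
    return Abs((), e)


# (lam x. lam z. x) z : the inner binder z must not capture the argument z
k = Con("lam", (Abs(("x",), Con("lam", (Abs(("z",), Var("x")),))),))
redex = Con("@", (plain(k), plain(Var("z"))))
result = beta(redex)
```

After the step, `result` is a $\lambda$-abstraction whose binder has been renamed and whose body is the free variable `z`, so `fv(result)` still contains `z`. Renaming every binder is wasteful but keeps the sketch short; a production implementation would rename only on a clash.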
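Given a set $T$ of types $\tau$, a type system deriving judgments of the form $\Gamma \vdash e : \tau$, where $\Gamma: X \xrightarrow{fin} T$ is a type assignment, is specified by assigning to each term-constructor $c$ of arity $\#c = (n_i \mid i \in m)$ a set of type schema $(\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$ consistent with $\#c$, i.e., $|\overline{\tau}_i| = n_i$ for $i \in m$. More precisely, the typing rules are

$$x\ \frac{}{\Gamma \vdash x : \tau}\ \Gamma(x) = \tau \qquad\qquad c\ \frac{\{\Gamma \vdash [\overline{x}_i]e_i : \overline{\tau}_i \Rightarrow \tau_i \mid i \in m\}}{\Gamma \vdash c([\overline{x}_i]e_i \mid i \in m) : \tau}\ c : (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$$

where $\Gamma \vdash [\overline{x}]e : \overline{\tau} \Rightarrow \tau$ stands for $\Gamma, \{x_i : \tau_i \mid i \in m\} \vdash e : \tau$ with $\overline{x} = (x_i \mid i \in m)$ and $\overline{\tau} = (\tau_i \mid i \in m)$. Note that $\Rightarrow$ is used in type schema, but it does not form a type in $T$.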
Monadic Metalanguages. To specify a monadic metalanguage we define:
• Types $\tau \in T$, including computational types $M\tau$.
• Terms $e \in E$, including return $ret(e)$ and monadic do $do(e_1, [x]e_2)$, which corresponds to Haskell do-notation $x \leftarrow e_1; e_2$.
• A type system, which amounts to giving for each term-constructor a set of type schema; in particular for ret and do the type schema are $ret: \tau \Rightarrow M\tau$ and $do: M\tau_1, (\tau_1 \Rightarrow M\tau_2) \Rightarrow M\tau_2$.
• A simplification relation $e \longrightarrow e'$ on terms, namely the compatible closure of a set of rewrite rules. By definition of $\longrightarrow$, the induced equivalence is always a congruence. In addition, we require that $\longrightarrow$ satisfies the Church Rosser (CR) and Subject Reduction (SR) properties.
• A computation relation $Id \longmapsto Id'$ on configurations. A configuration $Id \in Conf$ describes the state of a closed system, while the relation $\longmapsto$ describes how a closed system may evolve. Usually there is an obvious way to extend $\longrightarrow$ to configurations (preserving the CR property). To formulate a type safety result (along the lines of [WF94]), we must define well-formed configurations $\vdash Id$, show that both $\longrightarrow$ and $\longmapsto$ preserve well-formedness (for $\longrightarrow$ it should be an easy consequence of SR), and finally establish a progress property for $\Longrightarrow\ \triangleq\ \longrightarrow \cup \longmapsto$.
Simplification should be orthogonal to computation, i.e., if $Id_1 \longrightarrow^{*} Id'_1$ and $Id_1$ can move $Id_1 \longmapsto Id_2$, then $Id'_1$ has a move $Id'_1 \longmapsto Id'_2$ s.t. $Id_2 \longrightarrow^{*} Id'_2$.
3 MML: a monadic metalanguage for imperative computations
We introduce a monadic metalanguage MML for imperative computations, which exemplifies the pattern outlined in Section 2 in a familiar case, namely a subset of Haskell with the IO-monad. Moreover, MML provides a starting point for the addition of staging.
• Types $\tau \in T ::= nat \mid M\tau \mid \tau_1 \to \tau_2 \mid ref\ \tau$. The type $nat$ of natural numbers avoids a degenerate BNF (we will ignore it most of the time).
• Term-constructors $c \in C ::= ret \mid do \mid \lambda \mid @ \mid new \mid get \mid set \mid l$. Locations $l$ belong to an infinite set $L$ (they are not allowed in user-written programs, but are instrumental to the operational semantics). The type schema for term-constructors (from which one can infer also their arity) are
  $ret: \tau \Rightarrow M\tau$ and $do: M\tau_1, (\tau_1 \Rightarrow M\tau_2) \Rightarrow M\tau_2$
  $@: (\tau_1 \to \tau_2), \tau_1 \Rightarrow \tau_2$ and $\lambda: (\tau_1 \Rightarrow \tau_2) \Rightarrow (\tau_1 \to \tau_2)$
  $new: \tau \Rightarrow M(ref\ \tau)$, $get: ref\ \tau \Rightarrow M\tau$ and $set: ref\ \tau, \tau \Rightarrow M(ref\ \tau)$
  a signature $\Sigma: L \xrightarrow{fin} T$ gives the type to locations, i.e., $l: ref\ \tau$ when $\Sigma(l) = \tau$.
The BNF for terms generated by the term-constructors above is

$$e \in E ::= x \mid ret(e) \mid do(e_1, [x]e_2) \mid \lambda([x]e) \mid @(e_1, e_2) \mid new(e) \mid get(e) \mid set(e_1, e_2) \mid l$$

$\lambda([x]e)$ and $@(e_1, e_2)$ are $\lambda$-abstraction $\lambda x.e$ and application $e_1\ e_2$; new, get and set are the ML-like operations $ref\ e$, $!e$ and $e_1 := e_2$ on references.
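The type system is parametric in $\Sigma$, and the rules for deriving judgments of the form $\Gamma \vdash_\Sigma e : \tau$ are

$$x\ \frac{}{\Gamma \vdash_\Sigma x : \tau}\ \Gamma(x) = \tau \qquad l\ \frac{}{\Gamma \vdash_\Sigma l : ref\ \tau}\ \Sigma(l) = \tau \qquad c\ \frac{\{\Gamma \vdash_\Sigma [\overline{x}_i]e_i : \overline{\tau}_i \Rightarrow \tau_i \mid i \in m\}}{\Gamma \vdash_\Sigma c([\overline{x}_i]e_i \mid i \in m) : \tau}\ c : (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$$

Simplification $\longrightarrow$ is the compatible closure of $@(\lambda([x]e_2), e_1) \longrightarrow e_2[x := e_1]$, i.e., $\beta$-reduction. We write $=$ for $\beta$-equivalence, i.e., the reflexive, symmetric and transitive closure of $\longrightarrow$. We recall the properties of simplification ($\beta$-reduction) relevant for our purposes.
Proposition 3.1 (Congr) The equivalence $=$ induced by $\longrightarrow$ is a congruence.
Proposition 3.2 (CR) The simplification relation $\longrightarrow$ is confluent.
Proposition 3.3 (SR) If $\Gamma \vdash_\Sigma e : \tau$ and $e \longrightarrow e'$, then $\Gamma \vdash_\Sigma e' : \tau$.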
Remark 3.4 Several extensions can be handled at the level of simplification.
• The extension with a datatype, like $nat$ or $\tau_1 \times \tau_2$, amounts to adding term-constructors for introduction and elimination ($zero: nat$, $succ: nat \Rightarrow nat$ and $case: nat, \tau, (nat \Rightarrow \tau) \Rightarrow \tau$) and simplification rules describing how they interact ($case(zero, e_0, [x]e_1) \longrightarrow e_0$ and $case(succ(e), e_0, [x]e_1) \longrightarrow e_1[x := e]$).
• Recursive definitions can be handled by a term-constructor $fix: (\tau \Rightarrow \tau) \Rightarrow \tau$ with simplification rule $fix([x]e) \longrightarrow e[x := fix([x]e)]$. However, if one wants simplification of well-typed terms to terminate, then the type schema for $fix$ should be $(M\tau \Rightarrow M\tau) \Rightarrow M\tau$ and $fix([x]e)$ becomes a computational redex.
• A test for equality of references $ifeq: ref\ \tau, ref\ \tau, \tau', \tau' \Rightarrow \tau'$ with simplification rules $ifeq(l, l, e_1, e_2) \longrightarrow e_1$ and $ifeq(l_1, l_2, e_1, e_2) \longrightarrow e_2$ when $l_1 \neq l_2$.
3.1 Computation
We now define configurations $Id \in Conf$ (and the auxiliary notions of store, evaluation context and computational redex) and the computation relation $Id \longmapsto Id' \mid ok$ (see Table 1).
• Stores $\mu \in S \triangleq L \xrightarrow{fin} E$ map locations to their contents.
• Evaluation contexts $E \in EC ::= \Box \mid E[do(\Box, [x]e)]$.
• Configurations $(\mu, e, E) \in Conf \triangleq S \times E \times EC$ consist of the current store $\mu$, the program fragment $e$ under consideration and its evaluation context $E$.
• Computational redexes $r \in R ::= do(e_1, [x]e_2) \mid ret(e) \mid new(e) \mid get(l) \mid set(l, e)$.
When the program fragment under consideration is a computational redex, it enables a computation step with no need for further simplification (see Theorem 3.6).
The confluent simplification relation $\longrightarrow$ on terms extends in the obvious way to a confluent relation (denoted $\longrightarrow$) on stores, evaluation contexts, computational redexes and configurations.
A complete program corresponds to a closed term $e \in E_0$ (with no occurrences of locations $l$), and its evaluation starts from the initial configuration $(\emptyset, e, \Box)$. The following properties ensure that only closed configurations are reachable (by $\longrightarrow$ and $\longmapsto$ steps) from initial ones.
Lemma 3.5
1. If $(\mu, e, E) \longrightarrow (\mu', e', E')$, then $dom(\mu') = dom(\mu)$ and $FV(\mu') \subseteq FV(\mu)$, $FV(e') \subseteq FV(e)$ and $FV(E') \subseteq FV(E)$.
2. If $Id \longmapsto Id'$ and $Id$ is closed, then $Id'$ is closed.
When the program fragment under consideration is a computational redex, it does not matter whether simplification is done before or after computation.
Administrative steps, involve only the evaluation context
A.0 $(\mu, ret(e), \Box) \longmapsto ok$
A.1 $(\mu, ret(e_1), E[do(\Box, [x]e_2)]) \longmapsto (\mu, e_2[x := e_1], E)$
A.2 $(\mu, do(e_1, [x]e_2), E) \longmapsto (\mu, e_1, E[do(\Box, [x]e_2)])$
Imperative steps, involve only the store
I.1 $(\mu, new(e), E) \longmapsto (\mu\{l : e\}, ret(l), E)$, where $l \notin dom(\mu)$
I.2 $(\mu, get(l), E) \longmapsto (\mu, ret(e), E)$, provided $e = \mu(l)$
I.3 $(\mu, set(l, e), E) \longmapsto (\mu\{l = e\}, ret(l), E)$, provided $l \in dom(\mu)$
Table 1: Computation Relation for MML
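The rules of Table 1 can be read as the transition function of a small abstract machine. The Python sketch below is one possible rendering, under several assumptions that are not part of the paper: terms are nested tuples such as `("do", e1, x, e2)`, the evaluation context is a list of `(x, e2)` frames with the innermost frame last, fresh locations are numbered by the current size of the store, and simplification ($\lambda$ and $@$) is left out, so arguments of `get` and `set` must already be locations. Since only closed configurations are reachable, the substitution in rule A.1 never needs to rename binders here.

```python
def subst(e, x, v):
    """e[x := v], assuming v is closed (so no capture can occur)."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag in ("nat", "loc"):
        return e
    if tag == "do":
        _, e1, y, e2 = e
        return ("do", subst(e1, x, v), y, e2 if y == x else subst(e2, x, v))
    return (tag,) + tuple(subst(a, x, v) for a in e[1:])


def step(store, e, ctx):
    """One computation step; returns the next configuration or "ok"."""
    tag = e[0]
    if tag == "ret":
        if not ctx:
            return "ok"                                    # A.0
        x, e2 = ctx[-1]
        return store, subst(e2, x, e[1]), ctx[:-1]         # A.1
    if tag == "do":
        _, e1, x, e2 = e
        return store, e1, ctx + [(x, e2)]                  # A.2
    if tag == "new":
        l = ("loc", len(store))                            # l not in dom(store)
        return {**store, l: e[1]}, ("ret", l), ctx         # I.1
    if tag == "get" and e[1] in store:
        return store, ("ret", store[e[1]]), ctx            # I.2
    if tag == "set" and e[1] in store:
        return {**store, e[1]: e[2]}, ("ret", e[1]), ctx   # I.3
    raise ValueError("not a computational redex")


def run(e):
    """Iterate step from the initial configuration; return the last one."""
    conf = ({}, e, [])
    while True:
        nxt = step(*conf)
        if nxt == "ok":
            return conf
        conf = nxt


# l <- new(0); set(l, 1); get(l)
prog = ("do", ("new", ("nat", 0)), "l",
        ("do", ("set", ("var", "l"), ("nat", 1)), "_",
         ("get", ("var", "l"))))
final_store, final_term, final_ctx = run(prog)
```

The run performs the steps A.2, I.1, A.1, A.2, I.3, A.1, I.2 and finally A.0. The last configuration before `ok` has the single location holding `("nat", 1)` and the term `("ret", ("nat", 1))` in the empty context.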

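$$\Box\ \frac{}{\Box: M\tau \vdash_\Sigma \Box : M\tau} \qquad\qquad do\ \frac{\Box: M\tau_2 \vdash_\Sigma E : M\tau' \qquad \vdash_\Sigma [x]e : \tau_1 \Rightarrow M\tau_2}{\Box: M\tau_1 \vdash_\Sigma E[do(\Box, [x]e)] : M\tau'}$$

Table 2: Well-formed Evaluation Contexts for MML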


Theorem 3.6 (Bisim) If $Id \equiv (\mu, e, E)$ with $e \in R$ and $Id \longrightarrow^{*} Id'$, then
1. $Id \longmapsto D$ implies $\exists D'$ s.t. $Id' \longmapsto D'$ and $D \longrightarrow^{*} D'$
2. $Id' \longmapsto D'$ implies $\exists D$ s.t. $Id \longmapsto D$ and $D \longrightarrow^{*} D'$
where $D$ and $D'$ range over $Conf \cup \{ok\}$.
Proof An equivalent statement, but easier to prove, is obtained by replacing $\longrightarrow^{*}$ with one-step parallel reduction. A key observation for proving the bisimulation result is that simplification applied to a computational redex $r$ and an evaluation context $E$ does not change the relevant structure (of $r$ and $E$) for determining the computation step among those in Table 1.
3.2 Type safety
We go through the proof of type safety. The result is standard and unsurprising, but we make some adjustments to the Subject Reduction (SR) and Progress properties, in order to stress the role of simplification $\longrightarrow$ and computation $\longmapsto$, when they are not bundled in one deterministic reduction strategy on configurations.
First of all, we define well-formedness for configurations $\vdash_\Sigma Id$ and evaluation contexts $\Box: M\tau \vdash_\Sigma E : M\tau'$.
Definition 3.7 We write $\vdash_\Sigma (\mu, e, E)$ if and only if
• $dom(\Sigma) = dom(\mu)$
• $\mu(l) = e_l$ and $\Sigma(l) = \tau_l$ imply $\vdash_\Sigma e_l : \tau_l$
• there exists $\tau$ such that $\vdash_\Sigma e : M\tau$ is derivable
• there exists $\tau'$ such that $\Box: M\tau \vdash_\Sigma E : M\tau'$ is derivable (see Table 2)
Theorem 3.8 (SR)
1. If $\vdash_\Sigma Id_1$ and $Id_1 \longrightarrow Id_2$, then $\vdash_\Sigma Id_2$
2. If $\vdash_{\Sigma_1} Id_1$ and $Id_1 \longmapsto Id_2$, then there exists $\Sigma_2 \supseteq \Sigma_1$ s.t. $\vdash_{\Sigma_2} Id_2$
Proof The first claim is an easy consequence of Proposition 3.3. The second is proved by case-analysis on the computation rules of Table 1.
Theorem 3.9 (Progress) If $\vdash_\Sigma (\mu, e, E)$, then one of the following holds
1. $e \notin R$ and $e \longrightarrow$, or
2. $e \in R$ and $(\mu, e, E) \longmapsto$
Proof When $e \in R$ we have $(\mu, e, E) \longmapsto$, e.g. when $e$ is $get(l)$ or $set(l, e')$, then $l \in dom(\mu)$ by well-formedness of the configuration. When $e \notin R$, then $e$ cannot be a $\longrightarrow$-normal form, otherwise we get a contradiction with $\vdash_\Sigma e : M\tau$.
4 MMML: a multi-stage extension of MML
We describe a monadic metalanguage MMML obtained by adding staging to MML. At the level of syntax, type system and simplification the extension is generic, i.e., applicable to any monadic metalanguage (as defined in Section 2).
• The BNF for types $\tau \in T \mathrel{+}= \langle\tau\rangle$ is extended with code types.
• The BNF for term-constructors $c \in C \mathrel{+}= up \mid dn \mid c_V \mid c_M$ is extended with up, dn and two recursive productions $c_V$ and $c_M$, which capture the reflective nature of the extension (the set of term-constructors for MMML is infinite, although that for MML is finite). The type schema for the additional term-constructors are
  – $up: \tau \Rightarrow \langle\tau\rangle$ is MetaML cross-stage persistence (aka binary inclusion).
  – $dn: \langle\tau\rangle \Rightarrow M\tau$ is compilation of (potentially open) code. An attempt to compile open code causes an unresolved link error (an effect not present in MML), thus dn has a computational result type.
  – if $c: (\overline{\tau}_i \Rightarrow \tau_i \mid i \in m) \Rightarrow \tau$, then
    $c_V: (\langle\overline{\tau}_i\rangle \Rightarrow \langle\tau_i\rangle \mid i \in m) \Rightarrow \langle\tau\rangle$ builds code representing a term $c(\ldots)$
    $c_M: (\langle\overline{\tau}_i\rangle \Rightarrow M\langle\tau_i\rangle \mid i \in m) \Rightarrow M\langle\tau\rangle$ builds a computation that generates code representing $c(\ldots)$
    where $\langle\tau_i \mid i \in m\rangle$ stands for the sequence $(\langle\tau_i\rangle \mid i \in m)$. For instance, $\lambda_V: (\langle\tau_1\rangle \Rightarrow \langle\tau_2\rangle) \Rightarrow \langle\tau_1 \to \tau_2\rangle$ and $\lambda_M: (\langle\tau_1\rangle \Rightarrow M\langle\tau_2\rangle) \Rightarrow M\langle\tau_1 \to \tau_2\rangle$.
The key difference between $c_V$ and $c_M$ (reflected in their type schema) is that generating code with $c_M$ may have computational effects, while with $c_V$ it does not. For instance, the computation $\lambda_M([x]e)$ generates a fresh name (a new effect related to computation under a binder), performs the computation $e$ to generate the code $e'$ for the body of the $\lambda$-abstraction, and finally returns the code $\lambda_V([x]e')$ for the $\lambda$-abstraction.
The BNF for terms $e \in E$ and the type system (for deriving judgments of the form $\Gamma \vdash_\Sigma e : \tau$) are extended in the only possible way, given the type schema for the term-constructors. In MMML (unlike $\lambda^{\bigcirc}$ and MetaML) there is no need to include level information in typing judgments, since it is already explicit in types and terms. For instance, a MetaML type $\tau$ at level 1 becomes $\langle\tau\rangle$ in MMML, and a $\lambda$ at level 1 becomes a $\lambda_V$ or $\lambda_M$.
Simplification $\longrightarrow$ is unchanged, i.e., no new simplification rules are added. The properties of simplification established in Section 3 (i.e., Propositions 3.1, 3.2 and 3.3) continue to hold and their proofs are unchanged.
Remark 4.1 One may wonder whether there is a need to have both $c_M$ and $c_V$, or whether $c_M$ can be defined in terms of $c_V$ and term-constructors for computational types. Indeed, this is the case when $c$ is not a binder. For instance, when $c: \tau_1 \Rightarrow \tau_2$ and $e: M\langle\tau_1\rangle$, we could define $c_M(e)$ as $do(e, [x]ret(c_V(x)))$.
However, when $c$ is a binder, like $\lambda$, one cannot move the computation of the body of $\lambda_M([x]e)$ outside the binder. One could adopt a more concrete representation of terms using first-order abstract syntax (FOAS), and introduce a monadic operation $gensym: M\langle\tau\rangle$ to generate a fresh name (see [Fil01]). But in this approach $\lambda_V$ is no longer a binder, and this would be a drastic loss of abstraction for a reference semantics.
In $\lambda$-calculus one can encode a term-constructor $c$ as a constant $c^{\star}$ of higher-order type. For instance, $do(e_1, [x]e_2)$ becomes $do^{\star} @ e_1 @ (\lambda x.e_2)$, where we adopt the standard infix notation for application $@$. Then we can use $c^{\star}_V$ to encode $c_M$ and $c_V$. For instance, $do_V(e_1, [x]e_2)$ becomes $do^{\star}_V\ @_V\ e_1\ @_V\ (\lambda_V x.e_2)$, and $do_M(e_1, [x]e_2)$ becomes $do^{\star} @ e_1 @ (\lambda c.\, do^{\star} @ (\lambda_M x.e_2) @ (\lambda f.\, do^{\star}_V\ @_V\ c\ @_V\ f))$. With this encoding (unlike FOAS) there is no loss of abstraction, moreover it gives better control on code generation, e.g. $do_M(e_1, [x]e_2)$ (and its encoding) computes $e_1$ first, while $do^{\star} @ (\lambda_M x.e_2) @ (\lambda f.\, do^{\star} @ e_1 @ (\lambda c.\, do^{\star}_V\ @_V\ c\ @_V\ f))$ computes $e_2$ first.
4.1 Computation
We now define configurations and the computation relation $Id \longmapsto Id' \mid ok \mid err$ for MMML (see Table 3), where $err$ indicates an unresolved link error at run-time. We must account for run-time errors, because we have adopted a permissive (and simple) type system. In the following we stress what auxiliary notions need to be changed when going from MML to MMML.
Remark 4.2 When adding staging, the modifications to the definition of $\longmapsto$ are fairly modular, but we cannot rely on a general theory (like rewrite rules for CRS) as in the case of simplification.
• Stores $\mu \in S \triangleq L \xrightarrow{fin} E$ are unchanged.
• Evaluation contexts $E \in EC \mathrel{+}= E[c_M(\overline{v}, [\overline{x}]\Box, \overline{f})]$ are extended with one production, where $c \in C$, $f ::= [\overline{x}]e$ is an abstraction, and $v ::= [\overline{x}]ret(e)$ is a value abstraction. Moreover, $\overline{v}$, $\overline{x}$ and $\overline{f}$ must be consistent with the arity of $c$, for instance $E[\lambda_M([x]\Box)]$, $E[do_M(\Box, [x]e)]$ and $E[do_M(ret(e), [x]\Box)]$. Intuitively $E[\lambda_M([x]\Box)]$ says that the program fragment under consideration is generating code for the body of a $\lambda$-abstraction.
• A configuration $(X \mid \mu, e, E) \in Conf \triangleq P_{fin}(X) \times S \times E \times EC$ has an additional component, i.e., the set $X$ of names generated so far. A name may leak outside the scope of its binder, thus $X$ grows as the computation progresses.
• Computational redexes $r \in R \mathrel{+}= c_M(\overline{f}) \mid dn(vc)$ are extended with two productions, where $c \in C$, $\overline{f}$ must be consistent with the arity of $c$, and $vc \in VC ::= x \mid up(e) \mid c_V([\overline{x}_i]vc_i \mid i \in m)$ is a code value. The redex $c_M(\overline{f})$ may generate fresh names, while $dn(vc)$ may cause an unresolved link error.
Compilation dn takes a code value $vc$ of type $\langle\tau\rangle$ and computes the term $e$ of type $\tau$ represented by $vc$ (or fails if $e$ does not exist). The represented term $e$ is given by an operation similar to MetaML's demotion.
Definition 4.3 (Demotion) The partial function $vc \mapsto vc{\downarrow}$ mapping $vc \in VC$ to the represented term is given by
• $x{\downarrow}$ is undefined; $up(e){\downarrow} = e$ (this is a base case, like $x$);
• $c_V([\overline{x}_i]vc_i \mid i \in m){\downarrow} = c([\overline{x}_i]e_i \mid i \in m)$ when $e_i = (vc_i[\overline{x}_i := up(\overline{x}_i)]){\downarrow}$ for $i \in m$,
where $up(\overline{x})$ is the sequence $(up(x_i) \mid i \in m)$ when $\overline{x} = (x_i \mid i \in m)$.
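Demotion can be sketched as a recursive Python function that follows Definition 4.3 clause by clause. The representation is an assumption made for illustration only: `Con(name, ann, args)` stands for a term-constructor with an annotation string (`""` for $c$, `"V"` for $c_V$), each argument is a pair `(binders, body)`, and `None` plays the role of "undefined". The names `Var`, `Con`, `up`, `subst` and `demote` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Con:
    name: str     # underlying constructor, e.g. "lam", "@", "up"
    ann: str      # annotation string: "" for c, "V" for c_V
    args: tuple   # pairs (binders, body), one per argument


def up(e):
    """The term up(e)."""
    return Con("up", "", (((), e),))


def subst(e, sub):
    """Parallel substitution that stops at binders shadowing a variable.
    It is only used with sub = {x: up(x)}, whose sole free variable is x,
    so an inner binder can never capture anything."""
    if isinstance(e, Var):
        return sub.get(e.name, e)
    args = tuple(
        (binders, subst(body, {x: t for x, t in sub.items() if x not in binders}))
        for binders, body in e.args)
    return Con(e.name, e.ann, args)


def demote(vc):
    """Return the term represented by vc, or None where demotion is undefined."""
    if isinstance(vc, Var):
        return None                          # a variable: undefined
    if vc.name == "up" and vc.ann == "":
        return vc.args[0][1]                 # up(e) demotes to e
    if not vc.ann.endswith("V"):
        return None                          # not a code value
    args = []
    for binders, body in vc.args:
        # e_i is the demotion of vc_i[xs := up(xs)]
        e = demote(subst(body, {x: up(Var(x)) for x in binders}))
        if e is None:
            return None
        args.append((binders, e))
    return Con(vc.name, vc.ann[:-1], tuple(args))


closed = Con("lam", "V", ((("x",), Var("x")),))     # code for lam x. x
opened = Con("lam", "V", ((("x",), Var("y")),))     # code for lam x. y
```

Here `demote(closed)` yields the plain $\lambda$-abstraction, while `demote(opened)` is `None` because the free name `y` is not replaced by any `up(...)`, which is the situation where compilation reports an unresolved link error.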
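In an evaluation context for MMML, e.g. $E[\lambda_M([x]\Box)]$, the hole $\Box$ can be within the scope of a binder, thus an evaluation context $E$ has not only a set of free variables, but also a sequence of captured variables.
Definition 4.4 The sequence $CV(E)$ of captured variables and the set $FV(E)$ of free variables are defined by induction on the structure of $E$
• $CV(\Box) \triangleq \emptyset$
• $CV(E[do(\Box, [x]e)]) \triangleq CV(E)$
• $CV(E[c_M(\overline{v}, [\overline{x}]\Box, \overline{f})]) \triangleq CV(E), \overline{x}$; in particular $CV(E[\lambda_M([x]\Box)]) \triangleq CV(E), x$
• $FV(\Box) \triangleq \emptyset$
• $FV(E[do(\Box, [x]e)]) \triangleq FV(E) \cup (FV([x]e) - CV(E))$
• $FV(E[c_M(\overline{v}, [\overline{x}]\Box, \overline{f})]) \triangleq FV(E) \cup (FV(\overline{v}, \overline{f}) - CV(E))$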
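Definition 4.4 amounts to a single pass over the frames of an evaluation context, from the outermost to the innermost. The Python sketch below assumes a deliberately abstract encoding: each frame is a triple `(kind, binders, free)`, where `binders` is the sequence captured at the hole (empty for a `do` frame) and `free` is the already computed set $FV([x]e)$ or $FV(\overline{v}, \overline{f})$ of the frame. The function name `cv_fv` and this encoding are hypothetical.

```python
def cv_fv(frames):
    """Return (CV(E), FV(E)) for a context given as frames, outermost first."""
    cv, fv = (), set()
    for _kind, binders, free in frames:
        # variables already captured by the outer context are not free in E
        fv |= set(free) - set(cv)
        # the binders of a c_M frame become captured from here inwards
        cv += tuple(binders)
    return cv, fv


# E = lam_M([x] do(hole, [y]e)) with FV([y]e) = {x, l}
frames = [("lamM", ("x",), set()), ("do", (), {"x", "l"})]
captured, free = cv_fv(frames)
```

For this context the result is `captured == ("x",)` and `free == {"l"}`: the occurrence of `x` in the `do` frame is captured by the enclosing $\lambda_M$ and therefore does not count as free in $E$.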
As in the case of MML, the confluent simplification relation on terms extends to a confluent relation on the other syntactic categories. Also for MMML we can prove that only closed configurations are reachable from an initial one $(\emptyset \mid \emptyset, e, \Box)$, where $e \in E_0$. However, the second clause of Lemma 4.5 is more subtle, in particular it ensures that $FV(E)$ and $CV(E)$ remain disjoint.
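Administrative and Imperative steps are as in Table 1, they do not modify the set $X$.
Code generation steps, involve only the set $X$ and the evaluation context
G.0 $(X \mid \mu, c_M, E) \longmapsto (X \mid \mu, ret(c_V), E)$ when the arity of $c$ is $()$
G.1 $(X \mid \mu, c_M([\overline{x}]e, \overline{f}), E) \longmapsto (X, \overline{x} \mid \mu, e, E[c_M([\overline{x}]\Box, \overline{f})])$ with $\overline{x}$ renamed to avoid clashes with $X$. In particular $(X \mid \mu, \lambda_M([x]e), E) \longmapsto (X, x \mid \mu, e, E[\lambda_M([x]\Box)])$
G.2 $(X \mid \mu, ret(e), E[c_M(\overline{v}, [\overline{x}]\Box)]) \longmapsto (X \mid \mu, ret(c_V(\overline{f}, [\overline{x}]e)), E)$ where $\overline{v} = ([\overline{x}_i]ret(e_i) \mid i \in m)$ and $\overline{f} = ([\overline{x}_i]e_i \mid i \in m)$. The free occurrences of $\overline{x}$ in $e$ get captured by $c_V$, e.g. $(X \mid \mu, ret(e), E[\lambda_M([x]\Box)]) \longmapsto (X \mid \mu, ret(\lambda_V([x]e)), E)$
G.3 $(X \mid \mu, ret(e_1), E[c_M(\overline{v}, [\overline{x}_1]\Box, [\overline{x}_2]e_2, \overline{f})]) \longmapsto (X, \overline{x}_2 \mid \mu, e_2, E[c_M(\overline{v}, [\overline{x}_1]ret(e_1), [\overline{x}_2]\Box, \overline{f})])$ with $\overline{x}_2$ renamed to avoid clashes with $X$, and the free occurrences of $\overline{x}_1$ in $e_1$ captured by $c_M$.
Compilation step, may cause a run-time error
C.1 $(X \mid \mu, dn(vc), E) \longmapsto (X \mid \mu, ret(e), E)$ if $e = vc{\downarrow}$, and $(X \mid \mu, dn(vc), E) \longmapsto err$ if $vc{\downarrow}$ is undefined
Table 3: Computation Relation for MMML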
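To make the interplay of name generation and recapturing concrete, the following Python sketch implements a fragment of Table 3: the administrative and imperative steps together with G.1 and G.2 for the single binder $\lambda_M$. Everything about the encoding is an assumption for illustration: terms are tuples such as `("lamM", x, e)`, generated names are `_n0`, `_n1`, ..., binders renamed during substitution get names `_r0`, `_r1`, ... (both assumed not to occur in user programs), and rules G.0, G.3 and C.1 as well as simplification are omitted.

```python
from itertools import count

BIND = {"do", "lamM", "lamV"}   # constructors laid out as (tag, *args, x, body)
_ids = count()


def fv(e):
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "loc":
        return set()
    if tag in BIND:
        return set().union(*map(fv, e[1:-2])) | (fv(e[-1]) - {e[-2]})
    return set().union(*map(fv, e[1:]))


def subst(e, x, v):
    """Capture-avoiding e[x := v]; a binder clashing with FV(v) is renamed."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "loc":
        return e
    if tag in BIND:
        *args, y, body = e[1:]
        args = [subst(a, x, v) for a in args]
        if y == x:
            return (tag, *args, y, body)
        if y in fv(v):
            z = f"_r{next(_ids)}"
            body, y = subst(body, y, ("var", z)), z
        return (tag, *args, y, subst(body, x, v))
    return (tag, *[subst(a, x, v) for a in e[1:]])


def step(names, store, e, ctx):
    tag = e[0]
    if tag == "ret" and not ctx:
        return "ok"                                              # A.0
    if tag == "ret":
        frame = ctx[-1]
        if frame[0] == "do":                                     # A.1
            return names, store, subst(frame[2], frame[1], e[1]), ctx[:-1]
        return names, store, ("ret", ("lamV", frame[1], e[1])), ctx[:-1]  # G.2
    if tag == "do":                                              # A.2
        return names, store, e[1], ctx + [("do", e[2], e[3])]
    if tag == "lamM":                                            # G.1
        x = f"_n{len(names)}"                                    # fresh w.r.t. names
        return names | {x}, store, subst(e[2], e[1], ("var", x)), ctx + [("lamM", x)]
    if tag == "new":                                             # I.1
        l = ("loc", len(store))
        return names, {**store, l: e[1]}, ("ret", l), ctx
    if tag == "get":                                             # I.2
        return names, store, ("ret", store[e[1]]), ctx
    if tag == "set":                                             # I.3
        return names, {**store, e[1]: e[2]}, ("ret", e[1]), ctx
    raise ValueError("not a computational redex")


def run(e):
    conf = (set(), {}, e, [])
    while True:
        nxt = step(*conf)
        if nxt == "ok":
            return conf
        conf = nxt


# lam_M x. l <- new(x); get(l)
prog = ("lamM", "x", ("do", ("new", ("var", "x")), "l", ("get", ("var", "l"))))
names, store, term, ctx = run(prog)
```

The run generates the name `_n0`, stores it in a new location, reads it back, and closes the binder with G.2, so `term` is `("ret", ("lamV", "_n0", ("var", "_n0")))` while the store still holds a copy of `("var", "_n0")`. Replacing the eager renaming in G.1 by a check against `names` would be closer to the rule as stated; the sketch always renames for simplicity.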
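Lemma 4.5
1. If $(X \mid \mu, e, E) \longrightarrow (X' \mid \mu', e', E')$, then $X' = X$, $dom(\mu') = dom(\mu)$, $CV(E') = CV(E)$, $FV(\mu') \subseteq FV(\mu)$, $FV(e') \subseteq FV(e)$ and $FV(E') \subseteq FV(E)$.
2. If $(X \mid \mu, e, E) \longmapsto (X' \mid \mu', e', E')$, $FV(\mu, e) \cup CV(E) \subseteq X$ and $FV(E) \subseteq X - CV(E)$, then $X \subseteq X'$, $dom(\mu) \subseteq dom(\mu')$, $FV(\mu', e') \cup CV(E') \subseteq X'$ and $FV(E') \subseteq X' - CV(E')$.
The bisimulation result (Theorem 3.6) is basically unchanged, but the proof must cover additional cases corresponding to the computation rules in Table 3.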
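4.2 Type Safety
In MMML the definitions of well-formed configuration $\Gamma \vdash_\Sigma Id$ and evaluation context $\Gamma, \Box: M\tau \vdash_\Sigma E : M\tau'$ must take into account the set $X$. For this reason we need a type assignment $\Gamma$ which maps names $x \in X$ to code types $\langle\tau\rangle$.
Definition 4.6 We write $\Gamma \vdash_\Sigma (X \mid \mu, e, E)$ if and only if
• $dom(\Sigma) = dom(\mu)$ and $dom(\Gamma) = X$
• $\mu(l) = e_l$ and $\Sigma(l) = \tau_l$ imply $\Gamma \vdash_\Sigma e_l : \tau_l$
• there exists $\tau$ such that $\Gamma \vdash_\Sigma e : M\tau$ is derivable
• there exists $\tau'$ such that $\Gamma, \Box: M\tau \vdash_\Sigma E : M\tau'$ is derivable (see Table 4).
Remark 4.7 The formation rule ($c_M$) for an evaluation context $E[c_M(\overline{v}, [\overline{x}]\Box, \overline{f})]$ says that the captured variables $\overline{x}$ must have a code type (this is consistent with the code generation rules (G.1) and (G.3) of Table 3) and that they should not occur free in $E$, $\overline{v}$ or $\overline{f}$ (this is consistent with the second property in Lemma 4.5).
Lemma 4.8 If $\Gamma \vdash_\Sigma vc : \langle\tau\rangle$ and $e = vc{\downarrow}$, then $\Gamma \vdash_\Sigma e : \tau$.
We can now formulate the SR and progress properties for MMML.
Theorem 4.9 (SR)
1. If $\Gamma \vdash_\Sigma Id_1$ and $Id_1 \longrightarrow Id_2$, then $\Gamma \vdash_\Sigma Id_2$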
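$$\Box\ \frac{}{\Gamma, \Box: M\tau \vdash_\Sigma \Box : M\tau} \qquad\qquad do\ \frac{\Gamma, \Box: M\tau_2 \vdash_\Sigma E : M\tau' \qquad \Gamma \vdash_\Sigma [x]e : \tau_1 \Rightarrow M\tau_2}{\Gamma, \Box: M\tau_1 \vdash_\Sigma E[do(\Box, [x]e)] : M\tau'}$$

$$c_M\ \frac{\Gamma, \Box: M\langle\tau\rangle \vdash_\Sigma E : M\tau' \qquad \{\Gamma \vdash_\Sigma v_i : \langle\overline{\tau}_i\rangle \Rightarrow M\langle\tau_i\rangle \mid i \in m\} \qquad \{\Gamma \vdash_\Sigma f_i : \langle\overline{\tau}_{m+1+i}\rangle \Rightarrow M\langle\tau_{m+1+i}\rangle \mid i \in n\}}{\Gamma, \{x_k : \langle\tau'_k\rangle \mid k \in p\}, \Box: M\langle\tau_m\rangle \vdash_\Sigma E[c_M(\overline{v}, [\overline{x}]\Box, \overline{f})] : M\tau'}$$

with side conditions $\overline{v} = (v_i \mid i \in m)$ and $\overline{f} = (f_i \mid i \in n)$, $c_M : (\langle\overline{\tau}_i\rangle \Rightarrow M\langle\tau_i\rangle \mid i \in m + 1 + n) \Rightarrow M\langle\tau\rangle$, $\overline{\tau}_m = (\tau'_k \mid k \in p)$ and $\overline{x} = (x_k \mid k \in p)$; in particular

$$\lambda_M\ \frac{\Gamma, \Box: M\langle\tau_1 \to \tau_2\rangle \vdash_\Sigma E : M\tau'}{\Gamma, x : \langle\tau_1\rangle, \Box: M\langle\tau_2\rangle \vdash_\Sigma E[\lambda_M([x]\Box)] : M\tau'}$$

Table 4: Well-formed Evaluation Contexts for MMML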


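2. If $\Gamma_1 \vdash_{\Sigma_1} Id_1$ and $Id_1 \longmapsto Id_2$, then there exist $\Sigma_2 \supseteq \Sigma_1$ and $\Gamma_2 \supseteq \Gamma_1$ s.t. $\Gamma_2 \vdash_{\Sigma_2} Id_2$
Proof The first claim is straightforward (see Theorem 3.8). The second is proved by case-analysis on the computation rules, so we must cover the additional cases for the computation rules in Table 3, e.g.
(G.1) if $Id_1$ is $(X \mid \mu, \lambda_M([x]e), E)$, then $Id_2$ is $(X, x \mid \mu, e, E[\lambda_M([x]\Box)])$ and the typings $\Gamma_1, x : \langle\tau_1\rangle \vdash_{\Sigma_1} e : M\langle\tau_2\rangle$ and $\Gamma_1, \Box: M\langle\tau_1 \to \tau_2\rangle \vdash_{\Sigma_1} E : M\tau'$ are derivable. Therefore we can take $\Sigma_2 \equiv \Sigma_1$ and $\Gamma_2 \equiv \Gamma_1, x : \langle\tau_1\rangle$.
(C.1) if $Id_1$ is $(X \mid \mu, dn(vc), E)$, then $Id_2$ is $(X \mid \mu, ret(e), E)$ with $e = vc{\downarrow}$ and the typings $\Gamma_1 \vdash_{\Sigma_1} vc : \langle\tau\rangle$ and $\Gamma_1, \Box: M\tau \vdash_{\Sigma_1} E : M\tau'$ are derivable. By Lemma 4.8 $\Gamma_1 \vdash_{\Sigma_1} e : \tau$ is derivable, therefore we can take $\Sigma_2 \equiv \Sigma_1$ and $\Gamma_2 \equiv \Gamma_1$.
Lemma 4.10 If $\Gamma \vdash_\Sigma e : \tau$ and $e$ is a $\longrightarrow$-normal form, then
• $\tau \equiv nat$ implies $e$ is a natural number
• $\tau \equiv M\tau'$ implies $e$ is a computational redex
• $\tau \equiv ref\ \tau'$ implies $e$ is a location
• $\tau \equiv \langle\tau'\rangle$ implies $e$ is a code value
• $\tau \equiv (\tau_1 \to \tau_2)$ implies $e$ is a $\lambda$-abstraction
Proof By induction on the derivation of $\Gamma \vdash_\Sigma e : \tau$. The base cases are: $x$, up, $l$, $\lambda$, ret, do, new and $c_M$. The inductive steps are: get, set, dn, $c_V$ and $@$ ($@$ is impossible because by the IH one would have a $\beta$-redex).
Theorem 4.11 (Progress) If $\Gamma \vdash_\Sigma (X \mid \mu, e, E)$, then one of the following holds
1. $e \notin R$ and $e \longrightarrow$, or
2. $e \in R$ and $(X \mid \mu, e, E) \longmapsto$
Proof When $e \in R$ we have $(X \mid \mu, e, E) \longmapsto$ (see Theorem 3.9). When $e \notin R$, then $e$ cannot be a $\longrightarrow$-normal form, otherwise we get a contradiction with $\Gamma \vdash_\Sigma e : M\tau$ and Lemma 4.10.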


5 Examples
We give simple examples of computations in MMML to illustrate subtle points of multi-stage programming. For readability, we use Haskell's do-notation $x \leftarrow e_1; e_2$ (or $e_1; e_2$ when $x \notin FV(e_2)$) for $do(e_1, [x]e_2)$ and write $\lambda_B x.e$ for $\lambda_B([x]e)$.
Scope extrusion: a bound variable $x$ leaks in the store.
$l \leftarrow new(0_V); (\lambda_M x.\, set(l, x); ret(x)) : M\langle nat \to nat\rangle$
1. $(\emptyset \mid \emptyset,\ new(0_V),\ l \leftarrow \Box; \lambda_M x.\, set(l, x); ret(x))$ create a location $l$
2. $(\emptyset \mid l = 0_V,\ \lambda_M x.\, set(l, x); ret(x),\ \Box)$ generate a fresh name $x$
3. $(x \mid l = 0_V,\ set(l, x),\ \lambda_M x.\, \Box; ret(x))$ assign $x$ to $l$
4. $(x \mid l = x,\ ret(x),\ \lambda_M x.\, \Box)$ complete code generation of $\lambda$-abstraction
5. $(x \mid l = x,\ ret(\lambda_V x.x),\ \Box)$ $x$ is bound by $\lambda_V$, but a copy is also left in the store.
The semantics in [CMSar] is more conservative: when a variable leaks in the store it is bound by a dead-code annotation, but on closed values the two semantics agree.
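Recapturing of extruded variable by its binder.
$\lambda_M x.\, l \leftarrow new(x); get(l) : M\langle\tau \to \tau\rangle$
1. $(\emptyset \mid \emptyset,\ \lambda_M x.\, l \leftarrow new(x); get(l),\ \Box)$ generate $x$, then create $l$
2. $(x \mid l = x,\ get(l),\ \lambda_M x.\, \Box)$ get content of $l$
3. $(x \mid l = x,\ ret(x),\ \lambda_M x.\, \Box)$ complete code generation of $\lambda$-abstraction
4. $(x \mid l = x,\ ret(\lambda_V x.x),\ \Box)$ $x$ is bound by $\lambda_V$.
This form of recapturing is allowed by [TD99], but not by [CMSar].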
No recapturing of extruded variable by another binder using the same name.
$l \leftarrow new(0_V); (\lambda_M x.\, set(l, x); ret(x)); z \leftarrow get(l); ret(\lambda_V x.z) : M\langle nat \to nat\rangle$
1. $(\emptyset \mid l = 0_V,\ (\lambda_M x.\, set(l, x); ret(x)),\ \Box; z \leftarrow get(l); ret(\lambda_V x.z))$ generate $x$ and assign it to $l$
2. $(x \mid l = x,\ ret(\lambda_V x.x),\ \Box; z \leftarrow get(l); ret(\lambda_V x.z))$ first code generation completed
3. $(x \mid l = x,\ get(l),\ z \leftarrow \Box; ret(\lambda_V x.z))$ get content of $l$
4. $(x \mid l = x,\ ret(x),\ z \leftarrow \Box; ret(\lambda_V x.z))$ complete code generation of $\lambda$-abstraction
5. $(x \mid l = x,\ ret(\lambda_V x'.x),\ \Box)$ the bound variable $x$ is renamed by substitution $ret(\lambda_V x.z)[z := x]$
No recapturing of extruded variable by its binder after code generation.
$l \leftarrow new(0_V); z \leftarrow (\lambda_M x.\, \lambda_M y.\, set(l, y); ret(x)); f \leftarrow dn(z); u \leftarrow get(l); ret(f\ u) : M(nat \to \langle nat\rangle)$
1. $(x, y \mid l = y,\ ret(\lambda_V x.\lambda_V y.x),\ z \leftarrow \Box; f \leftarrow dn(z); u \leftarrow get(l); ret(f\ u))$ code generation completed, $y$ is bound by $\lambda_V$ and leaked in the store
2. $(x, y \mid l = y,\ dn(\lambda_V x.\lambda_V y.x),\ f \leftarrow \Box; u \leftarrow get(l); ret(f\ u))$ compile code
3. $(x, y \mid l = y,\ ret(\lambda x.\lambda y.x),\ f \leftarrow \Box; u \leftarrow get(l); ret(f\ u))$ get content of $l$ and apply $f$ to it
4. $(x, y \mid l = y,\ ret((\lambda x.\lambda y.x)\ y),\ \Box)$ the result simplifies to $(\lambda y'.y)$, because the bound variable $y$ is renamed by $\beta$-reduction.
When $y$ is recaptured by $\lambda_V$, it becomes a bound variable and can be renamed. Therefore, the connection with the (free) occurrences of $y$ left in the store (or the program fragment under consideration) is lost.
6 Related work and discussion
We discuss related work and some issues specic to MMML. A more general discussion of open issues in
meta-programming can be found in [She01].
Comparison with MetaML, $\lambda^{\circ}$ and $\lambda^M$. The motivation for looking at the interactions between computational effects and run-time code generation comes from MetaML [MHP00, Tah99, TS00, CMSar]. We borrow code types from MetaML (and $\lambda^{\circ}$ of [Dav96]), but use annotated term-constructors as in $\lambda^M$ of [Dav96] (see also [GJ95]), so that simplification and computation rules are level insensitive. Indeed, the term-constructors of MMML can be given by an alternative BNF
$c \in C ::= ret^B \mid do^B \mid \lambda^B \mid @^B \mid new^B \mid get^B \mid set^B \mid l^B \mid up^B \mid dn^B$ with $B \in \{V, M\}^*$
For instance, $\lambda^B$ is $\lambda$ when B is empty; if c is $\lambda^B$, then $c^V$ and $c^M$ are given by $\lambda^{BV}$ and $\lambda^{BM}$ respectively.
However, MMML's annotations are sequences $B \in \{V, M\}^*$, while those of $\lambda^M$ are natural numbers n. A sequence B identifies a natural number n, namely the length of B; moreover, for each i < n it says whether computation at that level has been completed, as expressed by the different typing for $c^V$ and $c^M$. The refined annotations of term-constructors (and computational types) make it possible to distinguish the following situations:
$(\lambda^M x. e,\ E)$  we start generating code for a $\lambda$-abstraction
$(e,\ E[\lambda^M x. \Box])$  we have generated a fresh name x, and start generating code for the body
$(e,\ E[\lambda^M x. E'])$  we are somewhere in the middle of the computation generating code for the body
$(ret(e),\ E[\lambda^M x. \Box])$  we have the code for the body of the $\lambda$-abstraction
$(ret(\lambda^V x. e),\ E)$  we have the code for the $\lambda$-abstraction
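To make the role of the annotation sequences concrete, here is a small Python sketch of annotated term-constructors. The class `Con` and its method names are hypothetical, and the reading of `V` as "computation at that level completed" and `M` as "still to be performed" is an assumption drawn from the list above, not a definition from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Con:
    """A term-constructor c with annotation B, a string over {'V', 'M'}."""
    name: str      # base constructor, e.g. "lam", "ret", "dn"
    ann: str = ""  # the annotation sequence B (empty for the plain constructor)

    def __post_init__(self):
        if set(self.ann) - {"V", "M"}:
            raise ValueError("annotation must be a sequence over {V, M}")

    @property
    def level(self):
        """The natural number identified by B, namely its length."""
        return len(self.ann)

    def v(self):
        """c^V: extend the annotation with V."""
        return Con(self.name, self.ann + "V")

    def m(self):
        """c^M: extend the annotation with M."""
        return Con(self.name, self.ann + "M")

    def completed(self):
        """For each i < level, whether that position is marked V (assumed: completed)."""
        return [a == "V" for a in self.ann]

lam = Con("lam")          # plain lambda, B empty
lam_mv = lam.m().v()      # annotation MV
print(lam_mv.ann, lam_mv.level, lam_mv.completed())  # MV 2 [False, True]
```

A plain natural number would record only `level`; the sequence keeps the extra per-level information that distinguishes $c^V$ from $c^M$.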
All operational semantics proposed for MetaML or $\lambda^{\circ}$ do not make these fine-grain distinctions. Only [Nan02], which extends $\lambda^{\Box}$ of [DP96] with names à la FreshML (and intensional analysis), has an operational semantics with steps modeling fresh name generation and recapturing, but its relations with $\lambda^{\circ}$ and MetaML have not been investigated, yet.
The up and dn primitives of MMML are related to cross-stage persistence %e and code execution run e of
MetaML. In MMML demotion vc is partial, thus evaluation of dn(vc) may raise an unresolved link error,
while in MetaML demotion is total, and an unresolved link error is raised only when evaluating x (at level 0).
However, in [CMSar] demotion is applied only to closed values, during evaluation of well-typed programs.
Multi-lingual extensions. It is easy to extend a monadic metalanguage, like MMML, to cope with a variety of programming languages: each programming language $PL_i$ is modeled by a different monad $M_i$ with its own set of operations. However, one should continue to have one code type constructor $\langle\tau\rangle$, i.e., the representation of terms should be uniform. Therefore, there should be one $up : \tau \to \langle\tau\rangle$ and one $c^V$ (for each c), but several $dn_i : \langle\tau\rangle \to M_i\tau$ and $c^{M_i}$, one for each monad $M_i$. In this way, we could have terms of type $M_1\langle M_2\tau\rangle$, which correspond to a program written in $PL_1$ for generating programs written in $PL_2$.
Compilation strategies. The compilation step (C.1) in Table 3 uses the demotion operation of Definition 4.3, which returns the term of type $\tau$ represented by a code value $vc$ of type $\langle\tau\rangle$ (if such a term exists). One could adopt a lazier compilation strategy, which delays the compilation of parts of the code. A lazy strategy has the effect of delaying unresolved link errors, including the possibility of never raising them (when part of the code is dead). For instance, a possible clause for lazy demotion sends $ret^V(e)$ to $dn(e)$. A more aggressive approach is to replace the compilation step with simplification rules
$dn(up(e)) > e$
$dn(c^V([x_i]e_i \mid i \in m)) > c([x_i]\,dn(e_i[x_i := up(x_i)]) \mid i \in m)$
However, one must modify the type system to ensure the SR and progress properties, but changing the type schema for dn to $\langle\tau\rangle \to \tau$ is not enough!
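The difference between eager and lazy demotion can be illustrated with a toy model. The sketch below is not the demotion operation of Definition 4.3: the tuple representation of code values, the `UnresolvedLink` exception, and the rule that a name not bound inside the code value is an unresolved link are all assumptions made for illustration.

```python
class UnresolvedLink(Exception):
    """Raised when a code value mentions a name with no binder (assumed model)."""

def demote(vc, bound=frozenset(), lazy=False):
    """Turn a toy code value into the term it represents.

    Code values (assumed): ("up", t), ("var", x), ("lam", x, body),
    ("app", f, a), ("ret", e).
    """
    tag = vc[0]
    if tag == "up":
        return vc[1]                      # dn(up(e)) gives back e
    if tag == "var":
        if vc[1] not in bound:
            raise UnresolvedLink(vc[1])   # demotion is partial
        return vc
    if tag == "lam":
        _, x, body = vc
        return ("lam", x, demote(body, bound | {x}, lazy))
    if tag == "app":
        return ("app", demote(vc[1], bound, lazy), demote(vc[2], bound, lazy))
    if tag == "ret":
        if lazy:
            # Lazy clause: leave the argument as code, to be compiled by a later dn.
            # This sketch does not track `bound` across the delay.
            return ("dn", vc[1])
        return ("ret", demote(vc[1], bound, lazy))
    raise ValueError(f"unknown code value: {vc!r}")

leaky = ("ret", ("var", "y"))             # y has no binder in this code value
print(demote(leaky, lazy=True))           # ('dn', ('var', 'y'))
try:
    demote(leaky)
except UnresolvedLink as err:
    print("unresolved link:", err)        # unresolved link: y
```

With `lazy=True` the error is postponed to whenever the residual `dn` is executed, and never occurs if that part of the code turns out to be dead.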
Type systems. We have adopted a simple type system for MMML, which does not detect statically all run-time errors. In particular, we have not included the closed type constructor $[\tau]$ of MetaML for two reasons:
1. there are alternative approaches to prevent link errors incomparable with the closed type approach (e.g. the region-based approach of [TD99] and the environment classifier approach of [NT03])
2. it requires dead-code annotations (x)e that are instrumental to the proof of type safety.
Better type systems are desirable not only for detecting errors statically, but also to provide more accurate type schema for dn, e.g. $dn : [\langle\tau\rangle] \to \tau$, which could justify replacing the compilation step by local simplification rules (see above). [Nan02] is the best attempt to date at addressing typing issues, although it does not explicitly consider computational effects. The adaptation of Nanevski's type system to MMML, e.g. refining code types with a set C of names, is a subject for further research. Also the type system of [NT03] (where one has several code type constructors $\langle\tau\rangle^{\alpha}$, corresponding to different ways of representing terms) could be adapted to MMML, but at a preliminary check it seems that the more accurate type schema $(\forall\alpha.\langle\tau\rangle^{\alpha}) \to \forall\alpha.\tau$ for dn is insufficient to validate the local simplification rules for compilation.
Uniform representation in Logical Frameworks. The code types of MMML provide a uniform representation of terms, similar to the (weak) Higher-Order Abstract Syntax (HOAS) encoding of object logics in a logical framework (LF). Of course, in a LF there are stronger requirements on HOAS encodings, but any advance in the area of LF is likely to advance the state-of-the-art in meta-programming. Recently [Nan02] has made significant advances in the area of intensional analysis, i.e., the ability to analyze code (see [She01]), by building on [PG00].
Monadic intermediate languages. [BK99] advocates the use of MIL for expressing optimizing transformations. Also MMML could be used for this purpose, but for having non-trivial optimizations one has to introduce more aggressive simplifications (than those strictly needed for defining the operational semantics) and refine monadic types with effect information as done in [BK99]. In general, we expect $\beta$-conversion $@(\lambda([x]e_2), e_1) \approx e_2[x := e_1]$ and the following equivalences to be observationally sound
$do(ret(e_1), [x]e_2) \approx e_2[x := e_1]$
$c^M([x_i]ret(e_i) \mid i \in m) \approx ret(c^V([x_i]e_i \mid i \in m))$
while other equivalences, like $@^V(\lambda^V([x]e_2), e_1) \approx e_2[x := e_1]$, are more fragile (e.g. they fail when the language is extended with intensional analysis).
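The first of these equivalences is the left-unit law of the monad. As an illustration only, the Python sketch below models imperative computations as store-passing functions (an assumed state-monad encoding with hypothetical helper names `ret`, `do`, `get`, `set_`) and checks the law on a couple of sample stores; this is a sanity check, not a proof of observational soundness.

```python
def ret(v):
    """Computation that returns v and leaves the store unchanged."""
    return lambda store: (v, store)

def do(m, k):
    """Sequencing do(m, [x]k(x)): run m, pass its result to k, thread the store."""
    def run(store):
        v, store1 = m(store)
        return k(v)(store1)
    return run

def get(l):
    """Read location l."""
    return lambda store: (store[l], store)

def set_(l, v):
    """Write v to location l (the store is copied, not mutated)."""
    return lambda store: (None, {**store, l: v})

e1 = 5
e2 = lambda x: do(set_("l", x + 1), lambda _: get("l"))

lhs = do(ret(e1), e2)   # do(ret(e1), [x]e2)
rhs = e2(e1)            # e2[x := e1]

for store in ({"l": 0}, {"l": 7, "k": 1}):
    assert lhs(store) == rhs(store)

print(lhs({"l": 0}))  # (6, {'l': 6})
```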
Acknowledgments.
We thank François Pottier, Amr Sabry, Walid Taha for useful discussions, and the anonymous referees for
their valuable comments.
References
[BB90] G. Berry and G. Boudol. The chemical abstract machine. In Conf. Record 17th ACM Symp. on
Principles of Programming Languages, POPL'90, San Francisco, CA, USA, 17-19 Jan. 1990,
pages 81-94. ACM Press, New York, 1990.
[BK99] N. Benton and A. Kennedy. Monads, effects and transformations. In Proceedings of the Third
International Workshop on Higher Order Operational Techniques in Semantics (HOOTS-99), volume 26
of Electronic Notes in Theoretical Computer Science, Paris, September 1999. Elsevier.
[CMSar] C. Calcagno, E. Moggi, and T. Sheard. Closed types for a safe imperative MetaML. Journal of
Functional Programming, to appear.
[Dav96] Rowan Davies. A temporal-logic approach to binding-time analysis. In the Symposium on Logic
in Computer Science (LICS '96), pages 184-195, New Brunswick, 1996. IEEE Computer Society
Press.
[DP96] Rowan Davies and Frank Pfenning. A modal analysis of staged computation. In the Symposium
on Principles of Programming Languages (POPL '96), pages 258-270, St. Petersburg Beach, 1996.
[Fil01] A. Filinski. Normalization by evaluation for the computational lambda-calculus. In Samson
Abramsky, editor, Proc. of 5th Int. Conf. on Typed Lambda Calculi and Applications, TLCA'01, Krakow,
Poland, 2-5 May 2001, volume 2044 of Lecture Notes in Computer Science, pages 151-165.
Springer-Verlag, Berlin, 2001.
[GJ95] Robert Glück and Jesper Jørgensen. Efficient multi-level generating extensions for program
specialization. In S. D. Swierstra and M. Hermenegildo, editors, Programming Languages:
Implementations, Logics and Programs (PLILP'95), volume 982 of Lecture Notes in Computer Science, pages
259-278. Springer-Verlag, 1995.
[Klo80] J. W. Klop. Combinatory Reduction Systems. PhD thesis, University of Utrecht, 1980. Published
as Mathematical Center Tract 129.
[MHP00] The MetaML Home Page, 2000. Provides source code and documentation online at
http://www.cse.ogi.edu/PacSoft/projects/metaml/index.html.
[Nan02] Aleksandar Nanevski. Meta-programming with names and necessity. In Proceedings of the
Seventh ACM SIGPLAN International Conference on Functional Programming (ICFP-02), ACM
SIGPLAN notices, New York, October 2002. ACM Press.
[NT03] Michael Florentin Nielsen and Walid Taha. Environment classifiers. In Proceedings of the ACM
Symposium on Principles of Programming Languages (POPL), N.Y., January 15-17 2003. ACM
Press.
[PG00] A. M. Pitts and M. J. Gabbay. A metalanguage for programming with bound names modulo
renaming. In Mathematics of Program Construction, volume 1837 of Lecture Notes in Computer
Science, pages 230-255. Springer-Verlag, 2000.
[PHA+97] Simon Peyton Jones, John Hughes, Lennart Augustsson, Dave Barton, et al. Haskell
1.4: A non-strict, purely functional language. Technical Report YALEU/DCS/RR-1106,
Department of Computer Science, Yale University, Mar 1997. World Wide Web version at
http://haskell.cs.yale.edu/haskell-report.
[She01] T. Sheard. Accomplishments and research challenges in meta-programming. In W. Taha, editor,
Proc. of the Int. Work. on Semantics, Applications, and Implementations of Program Generation
(SAIG), volume 2196 of LNCS, pages 2-46. Springer-Verlag, 2001.
[Tah99] Walid Taha. Multi-Stage Programming: Its Theory and Applications. PhD thesis, Oregon
Graduate Institute of Science and Technology, 1999. Available from
ftp://cse.ogi.edu/pub/tech-reports/README.html.
[TD99] Peter Thiemann and Dirk Dussart. Partial evaluation for higher-order languages with state.
Available from http://www.informatik.uni-freiburg.de/thiemann/papers/index.html, 1999.
[TS00] Walid Taha and Tim Sheard. MetaML: Multi-stage programming with explicit annotations.
Theoretical Computer Science, 248(1-2), 2000.
[Wad99] Philip Wadler. The marriage of effects and monads. In the International Conference on Functional
Programming (ICFP '98), volume 34(1) of ACM SIGPLAN Notices, pages 63-74. ACM, June 1999.
[WF94] Andrew K. Wright and Matthias Felleisen. A syntactic approach to type soundness. Information
and Computation, 115(1):38-94, 1994.