Newsgroups: comp.lang.scheme
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newshost.marcam.com!zip.eecs.umich.edu!newsxfer.itd.umich.edu!gatech!psuvax1!news.ecn.bgu.edu!siemens!princeton!news.princeton.edu!blume
From: blume@dynamic.cs.princeton.edu (Matthias Blume)
Subject: A low-level macro system
In-Reply-To: bh@anarres.CS.Berkeley.EDU's message of 21 Nov 1994 15:20:12 GMT
Message-ID: <BLUME.94Nov21140201@dynamic.cs.princeton.edu>
Originator: news@hedgehog.Princeton.EDU
Sender: news@Princeton.EDU (USENET News System)
Nntp-Posting-Host: dynamic.cs.princeton.edu
Organization: Princeton University
References: <39l7er$q02@wsiserv.informatik.uni-tuebingen.de>
	<BLUME.94Nov19142656@dynamic.cs.princeton.edu>
	<3ant10$npt@agate.berkeley.edu>
	<BLUME.94Nov20152331@dynamic.cs.princeton.edu>
	<3aqdrc$oci@agate.berkeley.edu>
Date: Mon, 21 Nov 1994 19:02:01 GMT
Lines: 158

In article <3aqdrc$oci@agate.berkeley.edu> bh@anarres.CS.Berkeley.EDU (Brian Harvey) writes:

   blume@dynamic.cs.princeton.edu (Matthias Blume) writes:
   >There are low-level macro systems, which (as far as I can tell) are as
   >close to the eval-it-twice paradigm as possible without sacrificing
   >hygiene.

   I don't understand this.  I was under the impression that "low level"
   meant "not hygienic," for those situations in which the whole point of
   the macro you're trying to define is to transcend the scope rules.
   So in what sense are these systems low level?

You are right -- I didn't say what I meant to say.  A good low-level
mechanism should be able to do two things:

	- serve as the basis on top of which a hygienic high-level
	  system can be built
	- circumvent hygiene (but only if desired)

SYNTAX-RULES is not ``low-level''ish enough to permit the latter,
while EVAL-IT-TWICE is so broken that it doesn't allow the former.

   >  If anybody is interested in details -- ask!

   Hmm, all this yelling at me started because someone made a similar
   offer (to tell the details about how R4RS got that way) and I said
   "yes, please do."  So I'd better not take you up on this one, I guess.

I'll try it anyway:

According to the EVAL-IT-TWICE paradigm, macros are functions
(a.k.a. macro transformers) from s-expressions to s-expressions.

When the compiler encounters a macro invocation it applies the
corresponding function to the syntactic representation of the macro
invocation and compiles the result.

This is a useful way to view things.  However, the traditional
approach of representing syntax as s-expressions (in particular,
representing variables as symbols) leads to the well-known quirks
(such as accidental variable capture) that make this kind of macro
system appear undesirable to many of us.
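
To make the quirk concrete, here is a minimal sketch in portable
Scheme.  The MY-OR macro and the procedure name NAIVE-EXPAND-MY-OR are
made up for illustration; the transformer is just an ordinary procedure
from s-expression to s-expression, with variables represented as plain
symbols.

```scheme
;; Hypothetical EVAL-IT-TWICE style transformer for (my-or <a> <b>).
;; It introduces a temporary called TMP as an ordinary symbol.
(define (naive-expand-my-or exp)
  (let ((a (cadr exp))
        (b (caddr exp)))
    `(let ((tmp ,a))
       (if tmp tmp ,b))))

;; The user's own variable happens to be called TMP as well:
(define naive-output (naive-expand-my-or '(my-or #f tmp)))
;; naive-output is (let ((tmp #f)) (if tmp tmp tmp))
;; The user's TMP is now captured by the binding the macro introduced.
```

The same thing happens in the other direction if the user rebinds LET
or IF around the macro call.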

It turns out that you can get away with representing almost everything
as s-expressions -- as long as you do something more elaborate with
variable names.  To get some insight into what ``elaborate'' things
one should do here, read Jonathan Rees' paper!

The good news is that it isn't necessary to reveal the implementation
details of the variable names the compiler (or interpreter) uses.  It
suffices to give a macro transformer the chance to use the correct
representation for the symbols it wants to insert into the output of
the macro transformation.  This can be done with the help of the
compiler.  The macro transformer ``asks'' the compiler for a suitable
hygienic (or non-hygienic) representation of the name it wants to use
and the compiler provides such a beast.
(For brevity I skip the discussion of recognizing names in the *input*
of the transformer, which requires similar arrangements.)

The only slight nuisance with this is that ordinary symbols cannot be
inserted into the output of a macro directly (unless they ultimately
appear within a quotation).
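
The idea of ``asking the compiler'' can be sketched as follows.  This is
only a toy model under my own assumptions: the alias representation
(a tagged vector), MAKE-RENAMER and HYGIENIC-EXPAND-MY-OR are invented
for illustration and say nothing about how VSCM actually represents
names.

```scheme
;; Toy identifier representation (an assumption for illustration only):
;; an alias is a vector #(alias <symbol> <stamp>), never a bare symbol.
(define (make-alias name stamp) (vector 'alias name stamp))

;; A hypothetical "compiler" hands the transformer a RENAME procedure.
(define (make-renamer stamp)
  (lambda (name) (make-alias name stamp)))

;; The transformer never inserts bare symbols of its own; it asks
;; RENAME for a representation of every name it introduces.
(define (hygienic-expand-my-or exp rename)
  (let ((my-let (rename 'let))
        (my-if  (rename 'if))
        (my-tmp (rename 'tmp)))
    `(,my-let ((,my-tmp ,(cadr exp)))
       (,my-if ,my-tmp ,my-tmp ,(caddr exp)))))

(define hygienic-output
  (hygienic-expand-my-or '(my-or #f tmp) (make-renamer 1)))
;; The macro's temporary is #(alias tmp 1); the user's TMP stays a
;; plain symbol, so the two can no longer be confused.
```

A real compiler would of course have to resolve such aliases when it
processes the output; that part is omitted here.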

In VSCM I will provide a new special form to describe low-level macro
transformers.  The PRIMITIVE-TRANSFORMER special form is very similar
to a LAMBDA form, but it also carries some extra information about the
symbols the transformer wants to use.

The synopsis for PRIMITIVE-TRANSFORMER is the following:

	(primitive-transformer
		(<h-name> ...)                ; hygienic names
		(<nh-name> ...)               ; non-hygienic names
	  (<e-arg> <cmp-arg> <se-arg>         ; mandatory transformer args
	     <h-arg> ...                      ; hygienic representations
	     <nh-arg> ...)                    ; non-hygienic representations
	  <body>)                             ; body of the macro transformer

Basically, this describes a function that maps the original
expression (passed as <e-arg>) to a transformed expression.

<cmp-arg> is a procedure to compare names (this is what's usually
known as FREE-IDENTIFIER=?).  (In VSCM one can use EQUAL? for
BOUND-IDENTIFIER=?, but I acknowledge that it is probably cleaner to
pass both routines as arguments to the transformer.)

<se-arg> is a procedure used to relay ``syntax error'' complaints back
to the compiler.

<h-arg> ... (there must be exactly as many <h-arg>s as there are
<h-name>s!) are the formal parameters used by the compiler to pass the
correct hygienic representation of <h-name>s to the transformer.
<h-arg>s can appear in calls to <cmp-arg>, or they can be inserted
into the output of the transformer where (as long as they are not
being bound anew) they act as aliases for the <h-name>s as they appear
in the context of the PRIMITIVE-TRANSFORMER expression.

<nh-arg> ... and <nh-name>s are similarly related.  However, instead
of passing a hygienic representation of the <nh-name>s the compiler
will arrange for the <nh-name>s to have the same representation these
names would have had they been passed as part of the arguments to the
macro.  This can be used to circumvent hygiene, e.g. for introducing
implicit bindings.

It would probably also be useful to have a mechanism for creating
brand new temporary names.  I haven't added this yet, but I'm
contemplating doing so.

I am already bracing myself for the impact of Brian jumping right into
my face for presenting yet another overcomplicated low-level mechanism.
Maybe we can prevent this by looking at two examples, which hopefully
clarify the issues:

The PRIMITIVE-TRANSFORMER equivalent of

(define-syntax stream-cons
  (syntax-rules ()
    ((_ x y) (cons x (delay y)))))

is:

(define-syntax stream-cons
  (primitive-transformer
      (cons delay)
      ()
      (exp cmp se my-cons my-delay)
    (if (not (= (length exp) 3))
	(se "bad stream-cons expression"))
    `(,my-cons ,(cadr exp) (,my-delay ,(caddr exp)))))
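
For readers without VSCM at hand, the calling convention can be
imitated in portable Scheme.  The following is a rough sketch, not the
real mechanism: TOY-ALIAS, TOY-APPLY-TRANSFORMER and TOY-STREAM-CONS
are hypothetical names, EQUAL? merely stands in for the comparison
procedure, the syntax-error procedure simply returns a list, and
non-hygienic names are passed through as plain symbols.

```scheme
;; Toy stand-in for the compiler's hygienic representation of a name.
(define (toy-alias name) (vector 'alias name))

;; Imitates what a compiler might do with a primitive transformer:
;; pass the expression, a comparison procedure, an error procedure,
;; then one representation per hygienic and non-hygienic name.
(define (toy-apply-transformer transformer h-names nh-names exp)
  (let ((cmp equal?)
        (se (lambda (msg) (list 'syntax-error msg exp))))
    (apply transformer exp cmp se
           (append (map toy-alias h-names) nh-names))))

;; The STREAM-CONS transformer written as an ordinary LAMBDA.
(define toy-stream-cons
  (lambda (exp cmp se my-cons my-delay)
    (if (not (= (length exp) 3))
        (se "bad stream-cons expression")
        `(,my-cons ,(cadr exp) (,my-delay ,(caddr exp))))))

(define toy-expansion
  (toy-apply-transformer toy-stream-cons
                         '(cons delay) '()
                         '(stream-cons 1 (f 2))))
;; toy-expansion is (#(alias cons) 1 (#(alias delay) (f 2)))
```

The aliases for CONS and DELAY keep their meaning even if the user has
rebound those names at the point of the macro call.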

The other example defines a LOOP macro.  (LOOP <command> ...)
specifies an infinite loop.  However, it also implicitly binds the
symbol EXIT in a way such that (EXIT <exp>) leaves the loop with
result <exp>.

(define-syntax loop
  (primitive-transformer
      (call-with-current-continuation lambda letrec tmp)
      (exit)
      (exp cmp se
	   my-call/cc my-lambda my-letrec my-tmp
	   your-exit)
    `(,my-call/cc
      (,my-lambda (,your-exit)
	 (,my-letrec ((,my-tmp
		       (,my-lambda ()
			  ,@(cdr exp)
			  (,my-tmp))))
	    (,my-tmp))))))
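
To see what such an expansion amounts to, here is a hand-expanded
sketch in plain Scheme of what a use like
(loop (set! i (+ i 1)) (if (= i n) (exit i))) would conceptually turn
into, assuming none of the names involved have been rebound.  COUNT-TO
is a made-up wrapper for illustration.

```scheme
;; Hand-written equivalent of the LOOP expansion: CALL/CC provides
;; EXIT, and a LETREC-bound thunk calls itself forever.
(define (count-to n)
  (let ((i 0))
    (call-with-current-continuation
      (lambda (exit)
        (letrec ((tmp (lambda ()
                        (set! i (+ i 1))
                        (if (= i n) (exit i))
                        (tmp))))
          (tmp))))))

(define loop-result (count-to 5))
;; loop-result is 5: the loop runs until (exit i) escapes with i = 5.
```

In the macro version EXIT is visible to the loop body precisely because
it was listed as a non-hygienic name, while TMP stays out of the user's
way.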

Note that the current implementation of SYNTAX-RULES in VSCM is a
macro, which expands into a PRIMITIVE-TRANSFORMER expression.  (I'm
not going to present it in this forum, because it probably wouldn't
make a very good introductory example. :) BTW, the internal mechanisms
used to deal with macros in the compiler are taken almost verbatim
from Rees' paper.

--
-Matthias
