Subject: Re: Is there a trick so macrolet can't refer to itself?
From: rpw3@rpw3.org (Rob Warnock)
Date: Sat, 19 Aug 2006 07:39:23 -0500
Newsgroups: comp.lang.lisp
Message-ID: <PMedneTPEOHmm3rZnZ2dnUVZ_oKdnZ2d@speakeasy.net>
Pascal Costanza  <pc@p-cos.net> wrote:
+---------------
| I hesitate to judge Henry Baker's coding style because I haven't studied 
| the paper in detail and don't completely understand its goals. One of 
| the advantages he mentions is that his abstractions allow for several 
| different implementations, and maybe that's indeed where macros help in 
| his case - indeed, syntactic abstractions potentially gives you more 
| freedom in this respect than functional abstractions. (I am referring to 
| the third advantage he mentions in his conclusions.)
| 
| However, when skimming through his code, I see some macros that could 
| have been implemented as functions as well, so I am at least skeptical.
+---------------

Some background context might help...

META-II itself dates back to 1964 (see reference [Schorre64] in Baker's
paper). As Baker said (1991):

    The scheme we will describe has been used for at least 27 years,
    but is relatively unknown today, because computer science departments
    are rightly more interested in teaching theoretically pretty models
    like regular and context-free languages, rather than practical
    methods that may also be elegant and efficient.

META-II used a BNF-style input format, sort of like YACC but with *both*
the lexer and the syntax-parser/code-generator unified into a single
input specification, unlike the sometimes-awkward Lex/YACC split.
Baker's reader macros & MATCHIT macro allow one to write parsers
that look *very* much like the original META-II style, but entirely
within Lisp! [That is, without having an *external* META-II input file
format that's "compiled" into Lisp source and then loaded into Lisp
to get the lexer/parser/compiler to run.]

The "META" class of compiler-compilers handles most of the common
regular expression forms, is recursive [that is, supports writing
top-down/recursive-descent parsers], and the code is tiny. The parsers
one writes using it are often tiny, too. I used Schorre's "META-II"
back in 1970 on the PDP-10 to write a quick & dirty BLISS compiler
in a single weekend! [It took in BLISS source and output PDP-10
assembler (MACRO-10 source). It was totally unoptimized -- the
generated code was *horribly* inefficient -- but the generated
code ran and gave correct results.]

Unfortunately for those reading Baker's paper, unless you have some
previous exposure to META-II (or the META family) it's hard to
appreciate how closely he's managed to capture the original overall
style -- it's quite amazing!

Of course, if you're not familiar with META, then the heavy use of
reader macros to achieve the resulting terseness of specification
may seem a bit overdone, especially the very large number of reader
macros, which result in something resembling Perl's level of "line
noise"!! One might be tempted to throw all the reader macros away
and use a more "traditional" Lisp style, albeit still with a large
number of macros so as to continue to allow a somewhat "declarative"
coding style. One could certainly do that, although the resulting
grammar specifications might not be nearly as compact as Baker's.
Though the expansion needn't be all *that* bad -- maybe something
similar to the s-expr form of CL-PPCRE, but perhaps with shorter
markers:

    $foo       ==> (* foo)       ; Kleene star (zero or more)
    [foo bar]  ==> (& foo bar)   ; Sequence [a.k.a. PROGN]
    {foo bar}  ==> (/ foo bar)   ; Alternative or union ["pick 1"]
    @(digit d) ==> (? (digit d)) ; Satisfies-p [a.k.a. TYPEP]
    !(s-expr)  ==> (! (s-expr))  ; Emit normal Lisp code

So Baker's first version of PARSE-INTEGER [which is really just
"recognize-integer", he adds the saving of the value later]:

    (defun parse-integer (&aux d)
      (matchit [{#\+ #\- []} @(digit d) $@(digit d)]))

could be re-written [still with macros, but no *reader* macros!] as:

    (defun parse-integer (&aux d)
      (matchit (& (/ #\+ #\- (&))
		  (? (digit d))
		  (* (? (digit d))))))

or even expand the MATCHIT macro and the little marker symbols[1]
out completely, though that's starting to get a bit verbose:

    (defun parse-integer (&aux d)
      (and (or (match #\+) (match #\-) (and))
	   (match-type digit d)
	   (not (loop while (match-type digit d)))))

What especially interesting to *me*, at least, is that these sorts
of "lexers" can easily be intermingled with larger, "syntax parser"
elements using top-down/recursive-descent to form full parsers and
code generators. See section "C. CONTEXT-FREE GRAMMARS" in Baker's
paper for a taste of how that works.

Does that address your skepticism somewhat?


-Rob

[1] An earlier draft of this reply used actual named macros
    instead of one-character marker symbols, that is, "m&",
    "m/", "m?", "m*", etc., but I think MATCHIT can be made
    to work [somewhat like LOOP!] with just the bare symbols.
    [If not, the prefix "m" could always be put back.]

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607