Ben Pawson <dpawson@rnib.org.uk> wrote:
+---------------
| What efficiency lessons are there to be had...
+---------------
The main "efficiency" lesson I can offer is to *not* be too
concerned about efficiency until you've gotten your algorithms
correct & tested, for several reasons:
1. It's "the right thing to do". ;-}
2. Coming to Scheme from (say) a C bit-twiddling context (as I did),
one tends to over- and/or mis-estimate the costs of things. Get
your app working, then see if it really needs any performance
tuning [hey, maybe it won't!]. Otherwise you'll find yourself
spending lots of time mis-micro-optimizing. Despite its "minimalism",
Scheme actually offers *many* ways to compute the same result.
Initially, just pick one, and get on with it...
3. Since *everything* is a procedure call anyway, procedure calls
are "cheap" (relatively speaking), so don't shy from them. That
is, don't try to do too much in each routine, and use helper
procedures liberally [that is, *don't* code your parsers as one
big routine like the following example suggests! ;-} ]. That will
also make development/debugging easier.
4. Initially, don't shy away from or try to optimize away allocation
of space for intermediate results. Yes, there is a cost to GC, but
it's probably less than you think, and it's almost certainly not
*where* you initially think it is.
For example, I was initially horrified by an "obvious" style of
token gathering that goes something like this[*]:
(case (character-type c)
...
((may-start-symbol)
(do ((result (list c) (cons c2 result))
(c2 (my-next-char) (my-next-char)))
((not (allowable-in-symbol? c2))
(my-unget-char c2)
(string->symbol (list->string (reverse result))))))
...
because of all the cons cells (and the string) that are allocated and
immediately thrown away.
Well, yes. But it works. And it's usually "fast enough". If not, you can
certainly replace this with a call to a routine that accumulates characters
by mutation of a top-level allocated-once string [though if you're restricted
to R4RS primitives you still end up with one garbage string when you return
(string->symbol (substring buffer ...))], but then you also have to worry
about what to do if a symbol is too long [allocate a bigger buffer], etc.
-Rob
[*] p.s. For those who don't like "do" loops with no bodies, feel free
to write it as:
(let loop ((result (list c))
(c2 (my-next-char)))
(if (allowable-in-symbol? c2)
(loop (cons c2 result) (my-next-char))
(begin
(my-unget-char c2)
(string->symbol (list->string (reverse result))))))
-----
Rob Warnock, 7L-551 rpw3@sgi.com http://reality.sgi.com/rpw3/
Silicon Graphics, Inc. Phone: 650-933-1673 [New area code!]
2011 N. Shoreline Blvd. FAX: 650-933-4392
Mountain View, CA 94043 PP-ASEL-IA