Re: Basic file filter lessons appreciated. — Rob Warnock Lisp usenet archive


Ben Pawson  <dpawson@rnib.org.uk> wrote:
+---------------
| What efficiency lessons are there to be had...
+---------------

The main "efficiency" lesson I can offer is to *not* be too
concerned about efficiency until you've gotten your algorithms
correct & tested, for several reasons:

1. It's "the right thing to do".  ;-}

2. Coming to Scheme from (say) a C bit-twiddling context (as I did),
   one tends to over- and/or mis-estimate the costs of things. Get
   your app working, then see if it really needs any performance
   tuning [hey, maybe it won't!]. Otherwise you'll find yourself
   spending lots of time mis-micro-optimizing. Despite its "minimalism",
   Scheme actually offers *many* ways to compute the same result.
   Initially, just pick one, and get on with it...

3. Since *everything* is a procedure call anyway, procedure calls
   are "cheap" (relatively speaking), so don't shy from them. That
   is, don't try to do too much in each routine, and use helper
   procedures liberally [that is, *don't* code your parsers as one
   big routine like the following example suggests! ;-} ]. That will
   also make development/debugging easier.

4. Initially, don't shy away from or try to optimize away allocation
   of space for intermediate results. Yes, there is a cost to GC, but
   it's probably less than you think, and it's almost certainly not
   *where* you initially think it is.

For example, I was initially horrified by an "obvious" style of
token gathering that goes something like this[*]:

	(case (character-type c)
	  ...
	  ((may-start-symbol)
	   (do ((result (list c) (cons c2 result))
		(c2 (my-next-char) (my-next-char)))
	       ((not (allowable-in-symbol? c2))
		(my-unget-char c2)
		(string->symbol (list->string (reverse result))))))
	  ...

because of all the cons cells (and the string) that are allocated and
immediately thrown away.

Well, yes. But it works. And it's usually "fast enough". If not, you can
certainly replace this with a call to a routine that accumulates characters
by mutation of a top-level allocated-once string [though if you're restricted
to R4RS primitives you still end up with one garbage string when you return
(string->symbol (substring buffer ...))], but then you also have to worry
about what to do if a symbol is too long [allocate a bigger buffer], etc.


-Rob

[*] p.s. For those who don't like "do" loops with no bodies, feel free
to write it as:

	   (let loop ((result (list c))
		      (c2 (my-next-char)))
	     (if (allowable-in-symbol? c2)
	       (loop (cons c2 result) (my-next-char))
	       (begin
		 (my-unget-char c2)
		 (string->symbol (list->string (reverse result))))))

-----
Rob Warnock, 7L-551		rpw3@sgi.com   http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.		Phone: 650-933-1673 [New area code!]
2011 N. Shoreline Blvd.		FAX: 650-933-4392
Mountain View, CA  94043	PP-ASEL-IA