Subject: Re: lisp idiom for processing each line in a file?
From: rpw3@rpw3.org (Rob Warnock)
Date: Sun, 26 Feb 2006 07:13:23 -0600
Newsgroups: comp.lang.lisp
Message-ID: <Za6dnZUBC9buNJzZnZ2dnUVZ_v2dnZ2d@speakeasy.net>
Kirk Job Sluder  <kirk-nospam@jobsluder.net> wrote:
+---------------
| The macro I use is stolen from here:  
| http://kantz.com/jason/clim-primer/word.lisp
...
| (defmacro do-file ((path line-variable &key (key #'identity)) &body body)
+---------------

Yes, this is useful, but I think I'd probably want to change a
few things:

1. Reverse the order of the PATH and LINE-VARIABLE arguments,
   for two reasons: (a) So that it reads more in the style of
   other DO-XXX and WITH-XXX macros, which list the bound variable
   first; and (b) so one can naturally add additional OPEN keyword
   arguments after the filename [particularly :EXTERNAL-FORMAT,
   given where this thread started!].

2. [Minor] Add a :STREAM keyword arg for a variable name to bind
   to the open stream, in case the user might need access to it
   during the traversal [e.g., for error messages or something].

3. [Minor] Have the macro test for whether the :KEY parameter
   was provided, and if not, *don't* blindly call the KEY function.
   [Yes, a good compiler will optimize (FUNCALL #'IDENTITY FOO)
   to just FOO, but maybe not all compilers are that good.]

4. [Minor] Use LOOP instead of DO for the iteration, mainly because
   it's more concise [not that that matters much inside a macro].

5. [Minor] Change the name to DO-FILE-LINES, to better indicate
   what is being iterated over. [To me, DO-FILE sounds like it does
   something *to* the file.]
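
To make point 3 concrete, here is a minimal sketch of the supplied-p
trick in isolation. MAYBE-APPLY-KEY is a made-up name used only for
illustration; it is not part of the macro being discussed:

```lisp
;; Hypothetical helper macro, for illustration only: wrap FORM in a
;; call to KEY only when the caller actually supplied :KEY.
(defmacro maybe-apply-key (form &key (key nil key-p))
  (if key-p
      `(funcall ,key ,form)   ; :KEY given: call it on FORM
      form))                  ; no :KEY: expand to FORM untouched

(macroexpand-1 '(maybe-apply-key x))               ; => X
(macroexpand-1 '(maybe-apply-key x :key #'length)) ; => (FUNCALL #'LENGTH X)
```

The decision is made at macroexpansion time, so nothing is left for
the compiler to optimize away in the no-:KEY case.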

If it hadn't been for #2, #1(b) could have been accommodated by
simply adding ":ALLOW-OTHER-KEYS T" to the WITH-OPEN-FILE call,
but with #2 *and* #1(b), one needs to strip the :STREAM argument
out of the WITH-OPEN-FILE call: its value would otherwise be
evaluated as an ordinary argument to OPEN, and the variable named
there may well be unbound. Given that, I decided to strip out both
:STREAM & :KEY. [Though if there had been more than two arg pairs
to strip, I might have used something more efficient than multiple
REMFs.] Anyway, put it all together and you get the following:

    (defmacro do-file-lines ((line path &rest open-options
                                        &key (stream (gensym) stream-p)
                                             (key #'identity key-p)
                                        &allow-other-keys)
                            &body body)
      "For each line in the file named by PATH, the BODY is executed
      with the variable LINE bound to the value of the :KEY function
      (default #'IDENTITY) applied to that line. If any additional
      OPEN-OPTIONS are provided, they will be used when opening the
      file. If a variable name is provided to :STREAM, it will be bound
      to the opened stream."
      ;; Strip :STREAM and :KEY so that only genuine OPEN options reach
      ;; WITH-OPEN-FILE; copy first, since REMF is destructive.
      (let* ((options (if (or stream-p key-p)
                          (let ((options (copy-list open-options)))
                            (when stream-p
                              (remf options :stream))
                            (when key-p
                              (remf options :key))
                            options)
                          open-options))
             ;; With a :KEY, read into a gensym and bind LINE around BODY;
             ;; without one, read straight into LINE.
             (loop-var (if key-p (gensym) line))
             (let-args (if key-p `((,line (funcall ,key ,loop-var))) nil)))
        `(with-open-file (,stream ,path ,@options)
           (loop for ,loop-var = (read-line ,stream nil nil)
                 while ,loop-var do
             (let ,let-args
               ,@body)))))
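
As for the "something more efficient than multiple REMFs" aside, one
possibility is a single pass that copies the plist while dropping the
unwanted indicators. A rough sketch, where REMOVE-KEYS is a
hypothetical helper name [note that, unlike REMF, it drops every
occurrence of a key, not just the first]:

```lisp
;; REMOVE-KEYS is a hypothetical helper, not part of the macro above.
(defun remove-keys (plist keys)
  "Return a fresh copy of PLIST without any of the indicators in KEYS."
  (loop for (indicator value) on plist by #'cddr
        unless (member indicator keys)
          collect indicator and collect value))

(remove-keys '(:stream bar :external-format :default :key #'length)
             '(:stream :key))
;; => (:EXTERNAL-FORMAT :DEFAULT)
```

With that, OPTIONS could be computed as
(remove-keys open-options '(:stream :key)). For only two keys the REMF
version is fine, and for ordinary calls the expansion of DO-FILE-LINES
stays the same either way.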

It could be used like this:

    > (do-file-lines (foo "foo.tmp")
        (print foo))

    "The first line" 
    "The second line" 
    "And the longest and last third line" 
    NIL
    > (do-file-lines (foo "foo.tmp" :key #'length :stream bar)
        (print (list foo bar)))

    (14 #<Stream for file "foo.tmp">) 
    (15 #<Stream for file "foo.tmp">) 
    (35 #<Stream for file "foo.tmp">) 
    NIL
    > 
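
For anyone curious what the second call boils down to, here is a
hand-written approximation of its expansion, preceded by code that
creates the sample file. Assumptions: RAW-LINE stands in for the
uninterned symbol the macro gets from GENSYM, and it is OK to
overwrite foo.tmp in the current directory:

```lisp
;; Create the three-line sample file used in the transcript.
;; Assumption: clobbering "foo.tmp" in the current directory is fine.
(with-open-file (out "foo.tmp" :direction :output :if-exists :supersede)
  (format out "The first line~%The second line~%~
               And the longest and last third line~%"))

;; Hand-written equivalent of
;;   (do-file-lines (foo "foo.tmp" :key #'length :stream bar)
;;     (print (list foo bar)))
;; RAW-LINE replaces the gensym'd loop variable; BAR is the stream
;; variable named via :STREAM. Neither :KEY nor :STREAM reaches OPEN.
(with-open-file (bar "foo.tmp")
  (loop for raw-line = (read-line bar nil nil)
        while raw-line do
    (let ((foo (funcall #'length raw-line)))
      (print (list foo bar)))))
```

Without :KEY the macro reads straight into FOO and the inner LET has
no bindings at all, so the FUNCALL disappears entirely.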


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607