Subject: Re: I don't understand Lisp
From: Erik Naggum <clerik@naggum.no>
Date: 1998/09/23
Newsgroups: comp.lang.lisp
Message-ID: <3115539566299174@naggum.no>

* Donald Fisk <donald.fisk@bt-sys.spamblock.bt.co.uk>
| What you want to do with strings is often different from what you want to
| do with lists, or vectors of numbers.

  this is a very limiting view.

| Usually, some sort of lexical analysis is required.

  pattern matching should work for all types, not just characters, IMO.

| This is built into Snobol and Perl, but to do the same in Lisp would
| either require some extra programming, or worse, some degree of
| cleverness.

  I fail to see the problem.  this isn't exactly rocket science, so how
  many times would you need to do it?  on the other hand, I was afflicted
  with the "don't make huge libraries yourself" illness for a long time,
  since that's how one normally works under Unix.  however, dumping a new
  Lisp image with lots and lots of functions effectively makes them part of
  the language, not your application.

| To choose a specific example, here is the Snobol which can separate
| a URL into a protocol, machine and path:
| 
|         url arb . protocol "://" arb . machine "/" rem . path
| 
| Now, try doing that in one line of Lisp.

  I'd say just (delimited-substrings url "://" "/"), but I assume the above
  really sets a bunch of variables, as well, so let's make it

(multiple-value-bind (protocol host path) (delimited-substrings url "://" "/")
  ...)
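
  delimited-substrings is not part of Common Lisp, and the post does not
  define it.  a minimal sketch of what such a function might look like,
  assuming it splits at the first occurrence of each delimiter in turn and
  returns the pieces as multiple values, could be:

```lisp
;; Hypothetical sketch: the name comes from the text above, but this
;; definition and its behavior are assumptions, not a standard function.
(defun delimited-substrings (string &rest delimiters)
  "Split STRING at the first occurrence of each of DELIMITERS, searched
left to right, and return the pieces as multiple values.  If a delimiter
is not found, the remainder of STRING is the last piece returned."
  (let ((start 0)
        (pieces '()))
    (dolist (delimiter delimiters)
      (let ((pos (search delimiter string :start2 start)))
        ;; Stop at the first delimiter that does not occur.
        (when (null pos)
          (return))
        (push (subseq string start pos) pieces)
        (setf start (+ pos (length delimiter)))))
    ;; Whatever follows the last delimiter found is the final piece.
    (push (subseq string start) pieces)
    (values-list (nreverse pieces))))
```

  with this definition, "http://www.example.com/a/b" yields the three
  values "http", "www.example.com" and "a/b", the last one without its
  leading slash.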

  by the way, this doesn't actually work for _URLs_.  relative URLs fail.
  the path actually _includes_ the slash, which might not even be present
  to begin with.  the "host" part may contain a username, a port, or both.
  not all protocols have paths.  not all URLs separate the protocol from
  the first "argument" with double slashes -- the syntax simply says colon,
  and parsing of the rest depends on the protocol.

  _this_ is why such simple solutions are wrong, and the only reason people
  choose them is because they don't understand the complexity of the issues
  involved.  good programmers acknowledge the complexity of the task and
  handle the trivial cases as the special cases they are when worth it,
  while bad programmers assume the trivial and special-case the rest.

  regexps are usually valid only for the trivial cases, and they quickly
  get so complex that one wonders what people were thinking when they did
  _not_ return to the drawing board and get it right with a real parser.
  
| If this sort of thing comprises only a small part of your code, but is
| used in several programs (a likely scenario), you're still better off
| sticking to Lisp, after extending it with a macro which does Snobol
| pattern matching.

  in this case, you're much better off writing a real parser for URLs.
  not surprisingly, that is shorter in Lisp than in Perl when it follows
  the _entire_ specification, not just the easy parts.  and that's why I
  hate Perl and love Lisp -- it's incredibly hard to do _all_ the work
  correctly in Perl and know that you have done it, and it's no significant
  extra cost in Lisp.  this means that Perl attracts the quick and dirty
  hackers, and Lisp those who want correctness, but what else is new?

  on the other hand, I have made at least half a living from fixing other
  people's incomplete code for one and a half decades, so the more Perl is
  used, the more work I get to do over, after the poor client has learned
  how complex the task really is, and can appreciate the correctness.
  without Perl, I would have had to convince them myself...  thanks, Perl!

#:Erik
-- 
  ATTENTION, all abducting aliens!  you DON'T need to RETURN them!