Subject: Re: 8-bit input (or, "Perl attacks on non-English language communities!")
From: Erik Naggum <erik@naggum.no>
Date: 1999/02/11
Newsgroups: comp.lang.lisp
Message-ID: <3127701651609043@naggum.no>

* cbarry@2xtreme.net (Christopher R. Barry)
| Hmmm... MULL (MUlti Lingual Lisp)?

  give me a break.  Common Lisp has all it needs to move to a smart wide
  character set such as Unicode.  we even support external character set
  codings in the :EXTERNAL-FORMAT argument to stream functions.  it's all
  there.  all the support needed to handle input and output should also be
  provided by the environment -- if not, there's no use for such a feature,
  since you can neither enter, display, nor print Unicode text.
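
  as a rough sketch of how that argument is used (the set of
  :EXTERNAL-FORMAT values beyond :DEFAULT is implementation-dependent, so
  the :UTF-8 designator and the file name below are merely illustrative):

    ;; open a file with an explicit, implementation-dependent external format
    (with-open-file (in "data.txt"
                        :direction :input
                        :external-format :utf-8)
      ;; READ-LINE hands back characters decoded according to that format
      (read-line in))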

| This isn't going to make users of 8-bit character sets experience
| increased storage overhead for the exact same string objects and a
| performance hit in string bashing functions, now is it?

  on modern hardware there are already performance reasons to prefer 16 bits
  per character over 8, but if you need only 8 bits, use BASE-STRING instead
  of STRING.  a string is only a vector, anyway, and Common Lisp already
  handles specialized vectors with elements of various sizes.
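
  a minimal sketch of the distinction, assuming nothing beyond the standard:
  a BASE-STRING is simply a vector specialized to BASE-CHAR, while a string
  with element type CHARACTER may hold the full repertoire; how many bits
  each element occupies is up to the implementation.

    ;; a vector specialized to BASE-CHAR -- a BASE-STRING
    (let ((thin (make-array 5 :element-type 'base-char
                              :initial-element #\a))
          ;; a vector of the full CHARACTER type -- a general STRING
          (wide (make-array 5 :element-type 'character
                              :initial-element #\a)))
      (values (typep thin 'base-string)   ; => T
              (typep wide 'string)))      ; => T

  note that an implementation is free to make BASE-CHAR and CHARACTER the
  same type, in which case there is no storage difference at all.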

  if it is important to distinguish between STRING and BASE-STRING, I'm sure a
  smart implementation would do the same for strings as the standard does
  for floats: *READ-DEFAULT-FLOAT-FORMAT*.
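
  for reference, a sketch of how that float switch behaves: the reader
  consults *READ-DEFAULT-FLOAT-FORMAT* to pick the type of a float literal
  that carries no exponent marker, and a string-reading switch would
  presumably work the same way.

    ;; bound to DOUBLE-FLOAT, the literal 1.5 reads as a double
    (let ((*read-default-float-format* 'double-float))
      (typep (read-from-string "1.5") 'double-float))   ; => T

    ;; bound to SINGLE-FLOAT (the initial value), it reads as a single
    (let ((*read-default-float-format* 'single-float))
      (typep (read-from-string "1.5") 'single-float))   ; => T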

| On the upside, unicode support could give an additional excuse for Lisp's
| apparent "slowness" in certain situations.  In my Java class the
| instructor seems to always bring up unicode support as part of the excuse
| for Java's lousy performance (hmm... this isn't really comforting for
| some reason though...).

  criminy.  can teachers be sued for malpractice?  if so, go for it.

#:Erik
-- 
  Y2K conversion simplified: Januark, Februark, March, April, Mak, June,
  Julk, August, September, October, November, December.