Subject: Re: case sensetivity
From: Erik Naggum <cl@naggum.no>
Date: 1998/05/28
Newsgroups: comp.lang.lisp
Message-ID: <3105383767158339@naggum.no>


* David Bakhash
| is there an ANSI way to put package info at the top of a (package)
| Lisp file so that it treats all symbols as being case-sensitive, as if
| they were inside | |'s ?

  case-sensitivity is not a property of the package, but of the Lisp
  reader, controlled by the SETF'able function READTABLE-CASE.  :UPCASE and
  :DOWNCASE are obviously case-insensitive.  :PRESERVE and :INVERT are
  case-sensitive.  with :PRESERVE, you need to type all standard symbols in
  upper-case.  with :INVERT, you must remember the algorithm: if all the
  unescaped characters in a symbol name for which BOTH-CASE-P is true have
  the same case, the case of those characters is inverted; a name that
  mixes upper and lower case is left alone.
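
  to see the four modes at work, here is a minimal sketch.  the helper
  name NAME-AS-READ is made up for this illustration and is not part of
  any library; it reads an uninterned symbol so that nothing is added to
  any package.

```lisp
(defun name-as-read (text case)
  "Return the symbol name the reader produces for TEXT under READTABLE-CASE CASE."
  (let ((*readtable* (copy-readtable nil)))   ; fresh copy of the standard readtable
    (setf (readtable-case *readtable*) case)
    ;; the #: prefix makes the reader build an uninterned symbol
    (symbol-name (read-from-string (concatenate 'string "#:" text)))))

(name-as-read "FooBar" :upcase)    ; => "FOOBAR"
(name-as-read "FooBar" :downcase)  ; => "foobar"
(name-as-read "FooBar" :preserve)  ; => "FooBar"
(name-as-read "foobar" :invert)    ; => "FOOBAR"  (all lower, so inverted)
(name-as-read "FOOBAR" :invert)    ; => "foobar"  (all upper, so inverted)
(name-as-read "FooBar" :invert)    ; => "FooBar"  (mixed, so left alone)
```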

  I, too, thought case should have been a property of the package, but that
  creates some rather messy semantic relationships with the way packages
  are used by other packages, and access to symbols from several packages
  makes it complicated to decide which package should control the
  case-sensitivity of a symbol name when it is interned.

  instead of this very messy situation, I have written a new reader macro
  that handles the case of the symbol the way the reader does, yet does not
  cons a symbol.  this makes it possible to use either :INVERT or Allegro
  CL's non-standard "case-mode" stuff and still write in all lower-case.
  it also stands out as noticeably different, much unlike using lower-case
  symbol-name strings in Allegro's lower-case modes, which breaks stuff.

  the principle is that if a symbol is written as `foobar', then the name
  of that symbol should be written `#"foobar"', and this is semantically
  identical to #.(symbol-name 'foobar), except that it should never have to
  cons a symbol -- the reader already has to pass a fresh string from the
  input stream to INTERN, and the intent is to capture that string before
  it gets passed to INTERN.

  here's the implementation for Allegro CL 4.3 and 5.0.  caveat emptor.

;;; reader for symbol names that does case conversion according to the
;;; rest of the symbol reader.  thanks to Sean Foderaro for the pointer
;;; to EXCL::READ-EXTENDED-TOKEN, which luckily does all the dirty work.

(defun symbol-namestring-reader (stream character prefix)
  (declare (ignore prefix))
  (prog1 (excl::read-extended-token stream)
    (unless (char= (read-char stream) character)
      (excl::.reader-error stream "invalid symbol-namestring syntax"))))

;; set it in all readtables.  (yes, I know this is _really_ dirty.)
(eval-when (:compile-toplevel :load-toplevel)
  (loop with readtables = (excl::get-objects 11)
      for i from 1 to (aref readtables 0)
      for readtable = (aref readtables i)
      do (when (excl::readtable-dispatch-tables readtable)
	   (set-dispatch-macro-character #\# #\"
					 #'symbol-namestring-reader
					 readtable))))

  a portable implementation would UNREAD-CHAR the character it had just
  read (which is therefore bound to be #\"), call the reader to get the
  string, and frob the case according to the value of READTABLE-CASE (and
  string, and frob the case according to the value of READTABLE-CASE (and
  make sure it got escaping right, which is a _pain_), but I'll do that
  only when I actually need it.  it is sufficient for me that it can be
  done portably, too.
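
  for the curious, a rough sketch of that portable approach might look
  like the following.  the names FROB-NAME-CASE and
  PORTABLE-SYMBOL-NAMESTRING-READER are invented here, and the sketch
  deliberately skips the painful part: once the string reader has done
  its job there is no record of which characters were escaped, so every
  character is treated as unescaped.  it also assumes #\" still has its
  standard string syntax in the readtable in use.

```lisp
(defun frob-name-case (string &optional (case (readtable-case *readtable*)))
  "Convert STRING the way the reader converts a token with no escaped characters."
  (ecase case
    (:upcase   (string-upcase string))
    (:downcase (string-downcase string))
    (:preserve string)
    (:invert
     (let ((upper (some #'upper-case-p string))
           (lower (some #'lower-case-p string)))
       (cond ((and upper lower) string)        ; mixed case: leave alone
             (upper (string-downcase string))  ; all upper: invert to lower
             (lower (string-upcase string))    ; all lower: invert to upper
             (t string))))))                   ; no characters with case

(defun portable-symbol-namestring-reader (stream character prefix)
  "Dispatch macro function for #\"...\": read a string, then convert its case."
  (declare (ignore prefix))
  (unread-char character stream)        ; put the #\" back so READ sees a string
  (frob-name-case (read stream t nil t)))

;; try it in a private copy of the standard readtable
(let ((*readtable* (copy-readtable nil)))
  (set-dispatch-macro-character #\# #\" #'portable-symbol-namestring-reader)
  (setf (readtable-case *readtable*) :invert)
  (read-from-string "#\"foobar\""))     ; => "FOOBAR"
```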

  the big advantage of this technique is that you can always refer to a
  symbol name by a unique syntax that never gets confused with anything
  else, doesn't wantonly create uninterned symbols, or worse: redundant
  keywords, and _always_ gets the complexity of :INVERT right, so the
  symbol that is named "FOOBAR" internally still has the reader syntax
  `foobar' and the symbol-name syntax `#"foobar"'.  this is important (and
  convenient) when writing arguments to APROPOS and the package functions.
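
  a small sketch of why this matters, using only standard functions: under
  :INVERT, a literal lower-case string does not name the symbol you typed
  in lower-case.  the package NAME-DEMO and the function NAME-LOOKUP-DEMO
  are hypothetical and exist only for this illustration.

```lisp
(defun name-lookup-demo ()
  "Return three values: the name read for `foobar' under :INVERT, whether
FIND-SYMBOL finds that name, and whether it finds the literal \"foobar\"."
  (let* ((pkg (or (find-package "NAME-DEMO")
                  (make-package "NAME-DEMO" :use '())))
         (*package* pkg)
         (*readtable* (copy-readtable nil)))
    (intern "FOOBAR" pkg)                        ; the symbol as stored internally
    (setf (readtable-case *readtable*) :invert)
    ;; reading `foobar' under :INVERT yields the existing symbol named "FOOBAR"
    (let ((name (symbol-name (read-from-string "foobar"))))
      (values name
              (and (find-symbol name pkg) t)
              (and (find-symbol "foobar" pkg) t)))))

(name-lookup-demo)   ; => "FOOBAR", T, NIL
```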

  the above code modifies all existing readtables, which some might find
  yucky beyond belief, but Allegro CL also offers named readtables that
  might make this a little easier on the aesthesticles.  to use a named
  readtable for a given project:

(let ((readtable (copy-readtable nil)))	;copy from the standard
  (set-dispatch-macro-character #\# #\" #'symbol-namestring-reader readtable)
  (setf (readtable-case readtable) :invert)
  (setf (named-readtable :foo-project) readtable))

  now you can say

(eval-when (:compile-toplevel :load-toplevel :execute) ;:execute so LOADing source works, too
  (setf *readtable* (named-readtable :foo-project t)))

  or you can use the IN-SYNTAX proposal from Kent Pitman, which I have
  implemented as follows:

(defun in-syntax-ensure-readtable (evaled quoted)
  "Ensure that the argument to IN-SYNTAX is a readtable, or error."
  (if (readtablep evaled) evaled
      (error 'type-error
	:datum evaled
	:expected-type 'readtable
	:format-control
	"~@<IN-SYNTAX argument `~S' evaluates to a ~:@(~S~), ~
	 not a (named) READTABLE.~:@>"
	:format-arguments (list quoted (type-of evaled)))))

(defmacro in-syntax (readtable)
  "Set *READTABLE* to READTABLE (evaluated) for the remainder of the file.
If READTABLE is a keyword, uses NAMED-READTABLE to retrieve the readtable."
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (setq *readtable*
       ,(if (keywordp readtable)
	  `(excl:named-readtable ,readtable t)
	  `(in-syntax-ensure-readtable ,readtable ',readtable)))))

;; start off with one that we can always rely on.
(setf (excl:named-readtable :ansi-cl) (copy-readtable nil))

  this implementation of IN-SYNTAX is of course just as happy with a
  variable or any other form that yields a readtable object, which is the
  fully portable version.  (just remove the test for KEYWORDP.)
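
  spelled out, such a portable variant might look roughly like this.  the
  names PORTABLE-IN-SYNTAX and ENSURE-READTABLE are made up to avoid
  clashing with the definitions above, and the sketch signals the standard
  SIMPLE-TYPE-ERROR on the assumption that a plain TYPE-ERROR need not
  accept :FORMAT-CONTROL outside Allegro CL.

```lisp
(defun ensure-readtable (evaled quoted)
  "Return EVALED if it is a readtable, otherwise signal a type error."
  (if (readtablep evaled)
      evaled
      (error 'simple-type-error
             :datum evaled
             :expected-type 'readtable
             :format-control "IN-SYNTAX argument ~S evaluates to a ~S, not a readtable."
             :format-arguments (list quoted (type-of evaled)))))

(defmacro portable-in-syntax (readtable)
  "Set *READTABLE* to the value of the form READTABLE."
  ;; no KEYWORDP test here, so no dependency on named readtables
  `(eval-when (:compile-toplevel :load-toplevel :execute)
     (setq *readtable* (ensure-readtable ,readtable ',readtable))))
```

  since both LOAD and COMPILE-FILE bind *READTABLE*, the assignment lasts
  only until the end of the file being processed.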

  hope this helps and also helps people decide against using the invisibly
  non-standard "case-mode" :case-sensitive-lower stuff (which breaks code
  without letting you know it could do so) and encourages people to adopt
  the standard READTABLE-CASE value :INVERT to get case-sensitive
  lower-case symbols.

#:Erik
-- 
  "Where do you want to go to jail today?"
			-- U.S. Department of Justice Windows 98 slogan