Subject: Re: Name for the set of characters legal in identifiers
From: Erik Naggum <erik@naggum.no>
Date: 14 Jan 2004 08:22:42 +0000
Newsgroups: comp.lang.lisp
Message-ID: <3283057362064279KL2065E@naggum.no>

* Russell Wallace
| Thanks for the explanation - okay, so basically any character _can_
| be part of a symbol... fair enough... my question is really about
| the English terminology, though.

The terminology is really pretty simple, but you have to look at it
from the right angle.  In languages that require identifiers to be
made up of particular characters, there is obviously a name for the
character set, but in a language that goes out of its way to make it
possible to use absolutely any character you want, there are only
names for those characters that need special treatment to become part
of a symbol name because their "normal" function is not to.

| Whereas if you write...
|
| (defun )(')( ...)
|
| That won't work; (, ) and ' are "punctuation" (?) and normally
| recognized by the reader as special characters.

Well, they are known as "macro characters".  The important thing is
that the set of macro characters is not defined by the language, but
by the readtable in effect when the Common Lisp reader processes your
source.  There is a standard readtable, however, and one would have to
say "unescaped terminating macro characters in the standard readtable"
or another phrasing that tries to hide the obvious anal retentiveness
to really speak about the characters that will not be part of a symbol
name unless you have changed the rules.

There is nothing particularly special about any of these macro
characters.  There are some restrictions on what the readtable can do
and how the reader collects characters into symbol names.

If you really insist, calling them "constituent characters" will help,
but realize that this property is a result of falling through every
other test -- unless it is escaped, in which case it wins its
constituency right away.
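
The distinction between terminating and non-terminating macro
characters, and the effect of escaping, can be poked at from the REPL.
What follows is only a sketch under stated assumptions: it assumes a
conforming implementation, it binds the standard readtable through
WITH-STANDARD-IO-SYNTAX so that local readtable changes do not
interfere, and MACRO-CHARACTER-KIND is a made-up helper name, not
anything defined by the standard.

```lisp
;; Hypothetical helper (not part of the standard): report how the
;; standard readtable treats the character CH.
(defun macro-character-kind (ch)
  (with-standard-io-syntax
    ;; GET-MACRO-CHARACTER returns the reader macro function (or NIL)
    ;; and, as a second value, whether CH is non-terminating.
    (multiple-value-bind (fn non-terminating-p) (get-macro-character ch)
      (cond ((null fn)         :not-a-macro-character)
            (non-terminating-p :non-terminating-macro)
            (t                 :terminating-macro)))))

(macro-character-kind #\()   ; => :TERMINATING-MACRO
(macro-character-kind #\#)   ; => :NON-TERMINATING-MACRO
(macro-character-kind #\a)   ; => :NOT-A-MACRO-CHARACTER

;; Escaping with vertical bars lets macro characters into a symbol
;; name, and a # in the middle of a token is an ordinary constituent.
(with-standard-io-syntax
  (list (symbol-name (read-from-string "|)(')(|"))
        (read-from-string "#xF")
        (symbol-name (read-from-string "F#x"))))
;; => (")(')(" 15 "F#X")
```

The "F#X" comes out in upper case only because the standard readtable
has its case set to :UPCASE; the # itself is left alone.
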
(There's an awful pun waiting to happen here, about Iowa, but I'll
ignore the temptation.)

| (I'm talking about the normal case, not what you can persuade the
| reader, interner or whatever to do if you try hard enough :))

While this may seem reasonable from the angle you chose to look at
this problem, it is the a priori reasonability of the position that
has produced your problem.  It is in fact unreasonable to approach
Common Lisp from this angle.  The problem does not exist.  This

    (defun |)(')(| ...)

is in fact fully valid Common Lisp code.  You cannot define away the
solution to the problem and insist that you still have a problem in
need of an answer.

| So there's "whitespace", "punctuation" and... what's the third
| category called? Not "alphanumeric"... "constituent characters"?

I have to zoom out and ask you what you would do with the elusive name
for this category.  If I guess correctly at your intentions, I would
perhaps have said that "any character can be part of a symbol name,
but most macro characters need to be escaped to prevent them from
having their macro function".  (The important exception is #, the only
non-terminating macro character in the standard readtable, meaning
that #xF will be interpreted as a hexadecimal number, but F#x is a
three-character-long symbol name with a # in it.)

Unless you have a simple need that can be resolved by a nice, vague
explanation that only informs your reader that Common Lisp is a lot
different from languages that require particular characters in the
names of identifiers/symbols, I think Chapter 23 in the standard, on
the Common Lisp Reader, would be a really good suggestion right now.

Yeah, I'm back all right, with undesirably high levels of precision,
scaring away frail newbies from day one.  Maybe I'll go hibernate.

-- 
Erik Naggum | Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.