Subject: Re: invert-string revisited
From: Erik Naggum <erik@naggum.net>
Date: Thu, 06 Jun 2002 00:22:15 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3232311731697130@naggum.net>

* Kragen Sitaker
| So far, this is all theoretical.  I don't know of any Unicode language
| that has actually taken the plunge and declared itself committed to
| case-insensitivity.

  It seems more appropriate to require lowercase-only names in a
  Unicode-based language than to allow and preserve mixed case.  I think it
  is a serious design blunder to allow the programmer to decide on the
  uppercase version of an ordinarily lowercase letter just because of an
  arbitrary rule to avoid interword delimiters.  If what you say is true
  about locales, a Turkish, say, programmer would make a different uppercase
  choice than a French, say, but now without the benefit of preserved locale
  information.  So, if you wanted to be the most reasonable and
  "international" in a Unicode-based language, you should outlaw the use of
  uppercase letters from the language altogether and use an explicit
  interword delimiter.

  I have argued elsewhere that embedding case information in the encoding of
  letters, with a resulting near doubling of the code space requirement, was
  a huge mistake, like early Common Lisp had encoded the font in its
  character type.  In a better world, we would have developed writing systems
  with individual markers for sentence start, not just their end, and proper
  name start and end, too.  All our other punctuation marks and conventions
  developed haphazardly and each has an interesting story to tell, so it is
  only an historical accident that we fixed and encoded our character set(s)
  at the time we did and much would have been so very different if we had
  just waited a little longer to solidify it all, but they say that about
  NTSC and HDTV, too...

-- 
  In a fight against something, the fight has value, victory has none.
  In a fight for something, the fight is a loss, victory merely relief.
70 percent of American adults do not understand the scientific process.