Subject: Re: Pause for keystroke
From: Erik Naggum <erik@naggum.net>
Date: Sat, 06 Oct 2001 13:19:20 GMT
Newsgroups: comp.lang.lisp
Message-ID: <3211363159251860@naggum.net>

* Andras Simon
| And what if you want to read (as opposed to read-line) something, but
| newline is OK, too? I'm asking this because I'm slightly irritated by
| having to type something to CMUCL's toplevel in order to get back the
| prompt if the previous output wasn't terminated by a newline.

This is not a trivial problem, unfortunately. The most appropriate way to read Common Lisp input from an untrusted source is to obey the rules of the source, and if that is lines instead of expressions, you need to work with prompts and various forms of input editing. E.g., it would be nice if the prompt could contain an indication of unclosed delimiters, as well as some way to discard the unfinished form.

The only time it is appropriate to call the function read directly is when you "know" that the source will contain a Common Lisp expression, or, in other words, when you are prepared to deal with errors coming from violating that knowledge. (This is why it is a very good idea to use Emacs interfaces to Common Lisp environments.)

E.g., if you want to read interactive user input, I think the appropriate way to do this is to collect a syntactically valid form first, _then_ process it. Many input processors tend to be built around the assumption that it is easier to backtrack than to validate before processing. Much of the parsing literature that exists unquestioningly _assumes_ that the _only_ way to get any match at all is to confuse the validation and parsing processes. This is in part caused by the largely unfounded belief in context-free grammars, which has many strongly appealing theoretical aspects, but also a large number of negative human factors that detract from readability and processability.
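
The collect-first, process-later approach to interactive input described above might be sketched as follows. This is only an illustration under stated assumptions: DELIMITER-DEPTH and COLLECT-FORM are hypothetical names, the scanner is deliberately simplified (it knows strings, ; comments and backslash escapes, but not #|...|# or |...|), and the empty-line "discard" convention is invented for the example.

```lisp
;;; Sketch only: hypothetical names, simplified scanner.

(defun delimiter-depth (text)
  "Return the number of unclosed parentheses in TEXT, or NIL while a
string is still open."
  (let ((depth 0) (in-string nil) (i 0) (n (length text)))
    (loop while (< i n)
          do (let ((ch (char text i)))
               (cond ((char= ch #\\) (incf i))   ; skip the escaped character
                     (in-string (when (char= ch #\") (setf in-string nil)))
                     ((char= ch #\") (setf in-string t))
                     ((char= ch #\;)             ; comment runs to end of line
                      (setf i (or (position #\Newline text :start i) n)))
                     ((char= ch #\() (incf depth))
                     ((char= ch #\)) (decf depth))))
             (incf i))
    (and (not in-string) depth)))

(defun collect-form (&optional (in *standard-input*) (out *standard-output*))
  "Collect lines from IN until they hold a balanced form, then read it.
The prompt shows the number of open parentheses (or a double quote while
a string is open).  An empty line discards the unfinished form.
Returns the form, or :EOF when IN is exhausted."
  (let ((text "") (depth 0))
    (loop
      (format out "~&~A> " (or depth "\""))
      (finish-output out)
      (let ((line (read-line in nil nil)))
        (when (null line) (return :eof))
        (setf text (if (string= line "")
                       ""                        ; empty line: discard
                       (concatenate 'string text line (string #\Newline)))
              depth (delimiter-depth text))
        (cond ((null depth))                     ; open string: keep collecting
              ((minusp depth)                    ; stray ")": start over
               (setf text "" depth 0))
              ((zerop depth)
               ;; Only now is the reader called.  IN doubles as the EOF
               ;; marker, returned when TEXT holds only blanks or comments.
               (let ((form (read-from-string text nil in)))
                 (unless (eq form in) (return form)))))))))
```

Called as (collect-form) at a terminal, this prompts with "0> " and, for instance, "2> " while two parentheses remain open. The reader can still signal errors on input the simplified scanner misjudges, so a real toplevel would want a handler around the READ-FROM-STRING call.
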
(The influence of these bad theories on the retarded notion of "ambiguity" in SGML/XML/etc has caused a large increase in the cost of designing document types and applications, not the least that of educating/training/hurting developers and users alike so they stop wanting something completely reasonable.)

Part of the problem of this mode of thinking is that most streams are very naively implemented one-pass structures. E.g., if you want to collect a line of input, you copy characters from the stream (buffer) to some (other) buffer while looking for a line terminator character. If you could instead push a "mark" on the stream (buffer), tell the stream to skip characters until a line terminator was seen, and return with it, you could extract the portion of the buffer from the mark until the current position, if you needed to: it should also be possible to refer to the characters in the stream buffer via a displaced array.

Naturally, the simple-minded single-buffer approach to buffering input and output is also at fault. As long as you refill the same buffer with new data, you cannot work with buffer marks. (Neither can you ask the operating system to kindly pre-fill the next buffer while you are doing something else, so you end up with a _guessing_ operating system and completely unnecessary delays at the buffer edges.) Designing an SGML entity manager and parser around these ideas (in 1993-1994) caused a dramatic 5-fold speed increase over the naive C stream implementation.

So much interesting work remains in the Common Lisp reader if one wants to support a better interactive environment, and that includes _much_ better error recovery when reading Common Lisp files that have not been produced by a competent expression-oriented environment like Emacs with an intelligent user.

The common way of de-coupling the input processing from the "terminal" also leaves much to be desired.
Those who remember TOPS-20's command line processor will know what I mean and miss, but others may need to have several layers of blinders removed after only having been exposed to the ultra-primitive Microsoft command line and only somewhat better Unix command line, especially if they think that GNU readline is an improvement. A typewriter remains a typewriter no matter how much chrome you add to it, and Unix even has an error message to tell you that you have violated its assumptions: "ENOTTY -- Not a typewriter".

Unfortunately, nigh the whole world is now duped into thinking that silly fill-in forms on web pages are the way to do user interfaces. It is not unlikely that this is a sort of improvement over the typewriter, but that is about all there is to it.

///
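
The buffer-mark idea from the streams paragraph above (push a mark, skip to the line terminator, then refer to the marked span through a displaced array instead of copying it) could look roughly like this. It is a sketch under assumptions, not the design of the SGML entity manager mentioned in the post: MARKED-BUFFER, PUSH-MARK, SKIP-TO, POP-REGION and NEXT-LINE are hypothetical names, and the whole input is assumed to sit in a single buffer that is never refilled.

```lisp
;;; Sketch only: hypothetical names, one buffer, no refilling.

(defstruct (marked-buffer (:constructor make-marked-buffer (data)))
  (data "" :type string)    ; the stream's own buffer
  (pos 0 :type fixnum)      ; current read position
  (marks '() :type list))   ; stack of saved positions

(defun push-mark (buf)
  "Remember the current position."
  (push (marked-buffer-pos buf) (marked-buffer-marks buf)))

(defun skip-to (buf char)
  "Advance to the next CHAR, or to the end of the buffer if there is none."
  (let ((data (marked-buffer-data buf)))
    (setf (marked-buffer-pos buf)
          (or (position char data :start (marked-buffer-pos buf))
              (length data)))))

(defun pop-region (buf)
  "Pop the latest mark and return the characters between it and the
current position as an array displaced into the buffer: nothing is copied."
  (let ((start (pop (marked-buffer-marks buf)))
        (data (marked-buffer-data buf)))
    (make-array (- (marked-buffer-pos buf) start)
                :element-type (array-element-type data)
                :displaced-to data
                :displaced-index-offset start)))

(defun next-line (buf)
  "Return the next line as a view into the buffer, or NIL at the end."
  (when (< (marked-buffer-pos buf) (length (marked-buffer-data buf)))
    (push-mark buf)
    (skip-to buf #\Newline)
    (prog1 (pop-region buf)
      ;; Step over the line terminator, if one was found.
      (when (< (marked-buffer-pos buf) (length (marked-buffer-data buf)))
        (incf (marked-buffer-pos buf))))))
```

Each value returned by NEXT-LINE is a string that shares storage with the buffer, which is exactly why refilling the same buffer with new data would invalidate it.
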