Subject: Re: chain of transformations
From: Erik Naggum <erik@naggum.net>
Date: 25 Jul 2002 03:49:44 +0000
Newsgroups: comp.lang.lisp
Message-ID: <3236557784646741@naggum.net>

* Jeff Sandys
| I often implement data transformations that are modeled
| on sequenced steps of processes: a -> b -> c -> d -> e ...
|
| It is easy to create and debug each transformation step,
| as in (defun a2b (in) ..., but then I end up with the
| final combined transformation as:
| (e2f (d2e (c2d (b2c (a2b in)))))
| that looks kind of weird.
|
| Is there a better lisp idiom for a chain of transformations?

This may be a case for Common Lisp's unique features, for which it seems it gets harder to argue in the presence of those who appear to believe that syntax and convenience are entirely irrelevant, but more on that below.

The standard idiom, functional composition inside-out read from right to left, may be hard to read and follow for a series of transformations, so you may want to express it entirely differently, like you initially wrote it:

    [ in -> a2b -> b2c -> c2d -> d2e -> e2f ]

You could do this with an ordinary macro, too:

    (transform in -> a2b -> b2c -> c2d -> d2e -> e2f)

where the symbol -> is a mere noise word or marker. If you find yourself programming with such transformations, both order and notation may be shaped to your convenience with reader macros and supporting code, to make it easier for you to see effortlessly that you have written the code correctly. But if you do not have enough of these forms in your code, the cost of an unusual form will outweigh the convenience, and the burden of understanding the small changes you have made to the syntax will cause a maintainer to remove it. (Much like someone who uses the term "hapax legomenon" will probably use it only once -- and immediately regret having used up that joke for good.)

If you look for a "Lisp idiom", I think you have preordained your solution and constricted your options prematurely.
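
The (transform in -> ...) form above could be written as an ordinary macro roughly as follows. This is only a minimal sketch, assuming every step is a one-argument function named by a symbol; the macro body is one possible expansion strategy, and the toy definitions of a2b, b2c and c2d are hypothetical stand-ins for real transformation steps.

```lisp
;; Hypothetical sketch: TRANSFORM is an illustrative name, not a standard operator.
(defmacro transform (input &rest steps)
  "Expand (transform in -> f -> g) into (g (f in)).
The symbol -> is a noise word and is dropped.  It is compared by name,
so it is recognized regardless of the package it was read in."
  (let ((fns (remove-if (lambda (item)
                          (and (symbolp item)
                               (string= (symbol-name item) "->")))
                        steps)))
    ;; Fold left to right: each function wraps the form built so far.
    (reduce (lambda (form fn) (list fn form))
            fns
            :initial-value input)))

;; Toy steps standing in for the real transformations (assumptions).
(defun a2b (x) (+ x 1))
(defun b2c (x) (* x 2))
(defun c2d (x) (- x 3))

;; Expands to (c2d (b2c (a2b 5))).
(transform 5 -> a2b -> b2c -> c2d)
```

Because the chain is resolved at macroexpansion time, the compiled code is the same nested call one would have written by hand; only the reading order changes.
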
If you look for a way to use Lisp to do precisely what you have in mind, in the way you already think, and you are reasonably certain that you "think Lisp", I see no problems with creating a mini-language more suited for the job than more long-hand notations. However, the fear that you may not "think Lisp" well enough just because you want to use an (evil?) infix syntax for a clearly defined purpose may be psychologically stultifying.

The key is to maintain a high and consistent level of aesthetics and not engage in rabid excesses or purposeless changes. For instance, when I had to deal with C++ a long time ago, the desire to redesign the language was overpowering and actually got in the way of using the language. Similarly, extremely tasteless stunts in Common Lisp would cause other programmers to receive a constant stream of SIGWTFs while reading your code and that would get in the way of business. (Much like someone who decides to write his own translation of the Bible, say, and quotes verses in an unusual form that causes those who thought they knew them to get upset instead of nodding knowingly to reams of archaic words ending in "eth".)

Having read the meandering thread on Lisp's unique features, I am loth to conclude that some people consider syntax and convenience entirely irrelevant and therefore simply do not get the point: What we are more likely to do is closely related to the effort required to do it -- some things are simply not done because the complexity or effort exceeds a threshold (which should not be interpreted as laziness but as economy). For instance, if you had to go through a checklist of steps to be executed with precise timing to yield the desired results every time you needed something essential to your life, how many times would you repeat it before you went and invented something to help you achieve the results with less effort -- like, say, a microwave oven and TV dinners?
We know that some people actually repeat complex chains of steps tens, if not hundreds, of thousands of times and that this has been going on for hundreds, if not thousands, of generations with no change, so there is some element of the elusive "human nature" that clearly lets some people feel comfortable with repeating repetitive tasks endlessly and regard change as anathema to comfort. I would argue that Common Lisp is for people who are _not_ like that. Java man evidently enjoyed hard labor with the same primitive tools for periods of time indistinguishable from eons, but Homo sapiens invented Leatherman tools and Common Lisp.

There is, however, little point in a "programmable programming language" if you never program it (but even less if you feel you have to make some local changes just to be a member in good standing of the elite users thereof). The value of programmability becomes visible when your needs are in flux, such as when the demands are actually unknown at the outset. What mortals call "applications" are usually just one step away from the programming language -- the entire application is written in the same language, albeit with non-linguistic "abstractions". To a Common Lisp programmer inspired to build the language bottom-up while he solves the problem top-down, the application his users see is more like a meta-application -- an application of the application of the programmable programming language. Such things are sometimes called "domain-specific" or "fourth-generation" languages by people who fail to understand that you do not have to build an entire development environment just because you want a new way to write "struct".

Put another way, if you have only one statue in mind and you know exactly what it should look like, feel free to carve it out of granite right away (and to make that more efficient), but if you have to experiment until you get it right, you would probably find Play-Doh or clay or even a plate of mashed potatoes more convenient than throwing away 95%-finished granite statues every decade. As you find experimenting more forgiving of errors, you would also automate the granite carving process, just as we have done with compilers and development environments over the years. There is nonetheless something to be said for stability. (Much like someone who experiments with different keyboard layouts is expected to become more efficient rather than spend all his time changing it to become more efficient in some vaguely defined "future", such as after all the boring tasks have been completed by others.)

The key to successful use of Common Lisp's unique features is not to be led astray by the plethora of options, but to shape the language according to actual needs. There is probably no substitute for long experience and painfully acquired wisdom in this area, but one has to be aware of the inherent dangers in using a new and improved syntax -- if you change your mind, leaving behind relics of the past is unacceptable, and you therefore need a mechanism to update uses of such features -- just like those who think that SGML or XML are good ideas for long-term document storage will find that changing their minds becomes exponentially more expensive as the inherent deficiencies of those languages make it nigh intractable to produce chained transformations that take documents conforming to one DTD to documents conforming to each next revision. Compilers may well deal with source files that use different versions of the customized syntax, but humans are likely to want to rely on their memory, lest they never advance beyond the hunt-and-peck mode of typing on keyboards, either.
This, incidentally, is another useful feature of Common Lisp: internally, the source code ends up as manipulable data in the language itself. A program that reads and updates source files as you get one bright syntactic idea after another is eminently doable, lending itself to achieving stability through malleability.

But to get back to your question -- other than the possibly fancy option, I might write the steps out thus:

    (let* ((in (a2b in))
           (in (b2c in))
           (in (c2d in))
           (in (d2e in))
           (in (e2f in)))
      (whatever in))

Note that _apparently_ reusing the same variable in let* may be misleading: the variable is not actually reused, as each clause creates a new binding. This would be a good idea if the abstract "type" of the object handed from transformation to transformation does not change (which is not the same as the representational type in the language, naturally).

I tend to use this form when there are multiple arguments to each call (and the chained input is not spottable from afar) and it may be hard to understand precisely what the returned value is. Adding a name to it may help in reading and maintaining the code. In that case, you would not reuse the variable name any more than you would call your functions "x2y". I prefer to use a variable named after the type when there are separate lookup-functions, for instance. A fairly unique feature of Common Lisp lets me get away with a variable named the same as a class -- instead of having to use articles like "a" or "the" in front of them or some other shenanigans -- reusing a name comes very naturally in human languages and will, when properly used, work to reduce the cognitive load when reading.

Tangentially speaking of "foo2bar"-functions, I have come to hate that way to name functions. It is reminiscent of the stupid redundancy in Java where you have to write SomeComplexTypeName foo = new SomeComplexTypeName (bar); -- and thank St. GNUcius for dynamic abbreviation in Emacs -- you have to keep the function call in sync with types of _two_ variables and probably have to write the type names out in full several times. I prefer naming functions based on _one_ aspect and then to use generic functions or designators to ensure that it works reasonably for reasonable input types.

To sum up, I think the notation/idiom you end up choosing should be dictated by your expected need to type, verify, and read the code written using it. There is nothing wrong, in my view, with a thorny mess warning readers not to trespass if modifying it would actually endanger the code. (This is not to be mistaken for the "it was hard to write, it should be hard to read" school of writing, though.) I prefer a sort of Huffman code for syntactic features -- the most frequently used gets the shortest and most compact syntax. This is naturally an empirical issue, and random beliefs in frequency of use are usually off by several orders of magnitude because our cognitive apparatus appears to be wired to associate importance with cognitive strain and emotions like pain, even though they are usually inversely related.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.