Subject: Re: Could CDR-coding be on the way back? From: Erik Naggum <erik@naggum.net> Date: 14 Dec 2000 03:17:39 +0000 Newsgroups: comp.lang.lisp,comp.arch Message-ID: <3185752659290095@naggum.net> * Per Bothner <per@bothner.com> | Perhaps, but does it make sense? Excuse me? Yes, it makes sense to read Lisp data that the application does not know the structure of a priori. For one thing, source _code_ matches that description. Data whose structure is only evident after having been read in as a complete expression also matches very well. | Presumably you application has some idea of the structure of the data | it is expecting? (Even trees have structure.) Yes, it knows they will all be Lisp objects. That's it. This is part of the significant charm with Lisp's lists. | But that is just my argument - it is *not* much less efficient. If | you double the array as it grows, the "average" element is only copied | *a single time*. (Some elements may be copied n*log(n) times, but at | least half the elements have been copied at most once, at least | three-quarter of the elements have been copied at most twice, etc.) There's something wrong with your math. The average number of times an element is copied is 1.0 if you have (- (1+ (expt 2 n)) m) elements in the final array with m elements in the initial allocation and approaches 2.0 asymptotically for (1+ (expt 2 n)) elements in the final array, since adding the last element will copy everything and end up with half the allocated array empty (short that one element). | My argument is a little of all of: | | (1) I think high-level programming languages should ... | (2) I think programmers should ... | (3) As a consequence of (1) and (2), ... I'd be interested in how you _arrived_ at your preferences, instead of just having something follow from some random personal favorites. | (Of course CL is object-oriented. However, it differs from Java in | that certain pre-defined types are used very heavily so it makes sense | to optimize for those types in a way that makes less sense for Java.) This is where you're losing me. I don't see a huge need for making a basic type more efficient in languages that _don't_ show a predominant use of that basic type. And why are we discussing Java? I thought it was already as array-friendly as you wanted languages to be? And Lisp already has adjustable arrays, so what's the point in redesigning them? | The displacement offset is not needed for adjustable arrays. It is only | needed for displaced arrays. Sigh. Why am I wasting any time on this? The displacement is needed for the "rest" (cdr) function that people who work on lists, however they are actually implemented, need very often. If you want to offer Lisp people something in terms of efficiency, you can't take away their primitives without changing the entire language under them. That's what I was trying to get at in the last parapraph I wrote. | And you can place the allocation size at the start of the data array. Yes, but that is a truly boneheaded decision as it means that the data array can no longer be specialized to contain a uniform data type -- say double-floats instead of pointers to double-floats (and you really want that if you go the way of arrays in the first place -- people even ask for specialized list types, and they won't stop if they know the underlying implementation is array-based), but must somehow have a special initial element or consist of a single word that destroys alignment of the rest of the array. This is just stupid. It's just like the old string representation that used the first byte to hold the length and couldn't hold more than 255 characters per string. | (It would be needed there anyway if memory management code needs to | scan memory consequtively.) Really? I thought memory management code followed pointers and chains from a root set if it were any smart. Randomly walking over memory it has no idea what contains just seems like a recipe for major disaster. Some conservative garbage collectors that are bolted onto languages that were never properly designed to deal with abandoned pointers seem to make do with interpreting machine words as pointers on a hunch, but I have always been exceedingly skeptical of the entire approach. If you keep the size of a vector in a well-known structure with the pointer to the data array, there is only one way to get at that data array, anyway. | That gives only 2 words for the header meta-object. And much more overhead and increased complexity and general madness. You have obviously never actually experimented with this in _Lisp_- like languages. I don't care about optimizing other languages for adjustable arrays, as we already have efficient adjustable arrays in Common Lisp, so there's no point in reinventing the wheel, and using adjustable arrays to represent lists requires concern for the usage of lists so optimized. I'm sure you can optimize this stuff to death in Java, but Java doesn't have any need for cdr-coding or more efficient storage of lists in particular, either, so I must admit to completely failing to see the whole point of this exercise, except that you get to argue for your preference for array-based languages, which I do not share, and am unlikely to begin to share just because of this. | Of course if you want to preserve identity of lists for linked links, | you will also need an initial header cons. I'm sorry, but is this a real argument? You're talking about data arrays that move when they grow at the tail end, right? Lists can grow at the tail end without any "header cons" simply because we just overwrite the cdr that is nil in the final cons with a pointer to the cons cell that holds the new element (and a new nil). Lists thus grown retain their first cons cell, which does not move, and thus retains identity at the pointer level. You do need a meta-object for the header of arrays because the data array moves the address of the first element when it is moved. | As I said: Only twice total, on average. Well, you actually said *a single time*, but if you keep doubling it every time you say it, I'm happy. (Sorry, but I found that fuuny. :) | However, my preference is for languages where the programmer does not | usually explicitly "cdr" down a list - or even index them. Well, I think it is preferable to ask people who prefer the languages they are trying to optimize for their ideas on how to optimize them instead of optimizing the programmers to use some language _they_ do not prefer simply because the optimizers do. I'm confused about your purposes, I guess. Promote APL if you like, but then I don't see why you're even offering suggestions for how to implement lists efficiently in Lisp. This is mystifying to me. #:Erik -- The United States of America, soon a Bush league world power. Yeee-haw!