Subject: Re: Common Lisp wish list item From: Erik Naggum <erik@naggum.no> Date: 20 Aug 2002 22:24:55 +0000 Newsgroups: comp.lang.lisp Message-ID: <3238871095242925@naggum.no> * Robert Swindells | PDP-10 Maclisp and Franz Lisp used bibop rather than tag bits to type | pointers. They shifted down the address to get a page number which was used | as an index into an array of type values. When I implemented these things myself, I used fairly large pages (64K) and held the type information in the first few words of the page. To access the type, just zero the lowest 16 bits and read the type information. The cache line would be fairly frequently accessed and would probably remain in L1. When allocating values from such a type-homogenous page, the first few words of the page would also contain the next unused unit. Since this word is in the same cache line as the type information, it would also ensure that that cache line would remain in L1. Which page to allocate from would be found from a table of "open pages" for each type. Garbage collection knew the layout of the types on a page from other administrative information on the page and copying out those values that were still referenced was fairly simple. Mark bits was implemented efficiently with one bit per allocated type and stored in dedicated words. I used this scheme mainly to take care of C programs that allocated a huge number of small pieces of memory of a small set of sizes and noticed that the standard allocator used up a lot of memory for administrative purposes and got really wasteful. Since it was both expensive to allocate 64K-pages and wasteful to move things around too much, I could exclude individual pages from the garbage collection with a flag in the page that would cause it not to be copied. I implemented this on the first Unix machine I had access to, an Altos running on M68K that had 4M or memory. Because of this new allocation scheme, the program was able to solve problems three times as large as it could do with normal malloc and free. We actually found that some types were used only for slowly acreting long-term storage that had previously caused much fragmentation of memory and that there was more harm than good in freeing these objects and this caused the whole new allocator design. | On a modern CPU with huge penalties for cache misses, tags make more sense | as they will always be in the cache along with the pointer that they type. | Needing to load a cacheline for the pointer as well as one for part of the | type table would really reduce cache efficiency. Not necessarily. The above scheme, now at least 20 years old, would scale well in modern processors. I have not tried to reimplement it. Perhaps I should... I "invented" it on the spot, but I have no idea whether this scheme has been "independently invented" in the interim. I have seen various scheme to allocate memory in particular sizes, but a lot of them have just used the (IMHO stupid) word before the object to make a linked list of freed objects, wasting much memory with alignment requirements. | The 68000 was an ideal CPU for this as it had three "function code" pins | that would indicate whether a bus cycle was for instruction fetch or data | read. Never did hardware stuff on the M68K, but I really liked its instruction set. Such a pity that Intel "won" with its braindamaged design. | In a modern CPU there are usually several spare bits in a page table entry. | I would like to see some way of using them to store the type of objects in | that page. In my experience, you may need quite a bit of administrative overhead at the beginning of the page, and the usual page size of 4K is way small to pay for the overhead. Or so I thought when I used 64K pages. It was also sexy in that the &~0xffff operation turned into smarter instructions in the compiler back then. Ah, the calm breeze on these walks down memory lane (several puns intended). -- Erik Naggum, Oslo, Norway Act from reason, and failure makes you rethink and study harder. Act from faith, and failure makes you blame someone and push harder.