Subject: Re: How to do fast matrix-multiplication?
From: rpw3@rpw3.org (Rob Warnock)
Date: Fri, 10 Feb 2006 01:35:21 -0600
Newsgroups: comp.lang.lisp
Message-ID: <o-OdncbcttKk33HeRVn-rg@speakeasy.net>
bradb <brad.beveridge@gmail.com> wrote:
+---------------
| What is the advantage of an implementation using low bits as tags?  I
| get the feeling that it ought to be less efficient than using the
| higher bits.  But I'm also sure that smarter people than me have
| thought about it harder than I have, and that low bits must be no less
| efficient than high bits ... so what's the history?
+---------------

The best overview of the tradeoffs that I know of is here:

    ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/typeinfo.ps.gz
    "Representing Type Information in Dynamically Typed Languages",
    by David Gudeman (Oct. 1993).

But the essence of the lowtag choice is this: Since most heap objects
are as big as a cons or larger, and since on 32-bit byte-addressed
machines a cons is typically 8 bytes, you can choose to force 8-byte
alignment of heap objects without very much wastage. Given *that*, a
3-bit lowtag system is almost "free": heap objects can be accessed
directly using the descriptor as a native machine pointer without
having to mask off the lowtag field, provided only that a small
(type-dependent) constant can be subtracted from the descriptor by the
hardware during the process of doing address arithmetic [which most
instruction sets can do]. E.g., in 32-bit CMUCL the lowtag for a cons
is 3, so [after type checks] in C the operation (CDR foo) is simply
"((struct cons *)((u32)(foo) - 3))->cdr".


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607