Subject: Re: Walking a list
From: rpw3@rpw3.org (Rob Warnock)
Date: Sun, 01 Jul 2007 00:44:25 -0500
Newsgroups: comp.lang.lisp
Message-ID: <rvqdnafGEfikohrbnZ2dnUVZ_s6onZ2d@speakeasy.net>
Kent M Pitman  <pitman@nhplace.com> wrote:
+---------------
| rpw3@rpw3.org (Rob Warnock) writes:
| > But this is merely an efficiency hack, and is not even normally
| > *observable* by the Lisp programmer, with one exception -- two
| > freshly created instances of such an "immediate" type may compare
| > as EQ, whereas for non-immediates EQ would fail but EQUAL [or
| > maybe EQL] would succeed. E.g.:
| >     > (let ((a (eval (read-from-string "(+ 2 3)")))
| > 	    (b (parse-integer "000101" :radix 2)))
| > 	(eq a b))
| >     T
| >     > 
| > So in the implementation I used to run that example,
| > FIXNUMs are probably encoded as immediates.
| 
| But in others they might not be.  And without talking to the vendor or
| reading the source code, I don't think you could so easily infer the
| truth.  Your analysis might be right in some cases, but perhaps not in
| others.
+---------------

True. I was not trying to say that EQ is any kind of reliable test
for immediates. Rather, I was saying that the use by an implementation
of immediates for some data types *might* be observable by the Lisp
programmer, even without poking under the hood [reading the source
or "peek"ing at memory].

+---------------
| In Maclisp, almost all fixnums were heap-consed (except in specific
| immediate communication), BUT in order to save space [we only had 18
| bits of word-addressed data, or 256K words = 1.25 MB], "small" fixnums
| were interned.  Consequently, (eq (eval '(+ 1 1)) (eval '(- 3 1))) was
| likely to return T in Maclisp, for example, even though 
| (eq (eval '(+ 5000 1)) (eval '(- 5002 1))) would return NIL, and even
| though numbers up to well over 5000 were still fixnums [as a 36-bit
| machine, fixnums went to 2^35-1, I believe].
+---------------

Just so. Indeed, SIOD Scheme (which I mentioned as an example of a
Lisp in which *all* values are references to objects) also pre-interned
some number of small positive integers[1] at startup time. Early versions
defaulted to only 100, but later versions defaulted to 2048 (but all
allowed changing that on the comamnd line). All arithmetic operators
checked to see if the current value being returned was an integer[1]
within the (currently-)interned range, and if so, calculated a reference
to the pre-interned value, else consed up a new one from the heap.

[1] Well, SIOD actually had only double floats, but it would pre-intern
    objects for 0.0d0, 1.0d0, 2.0d0, etc., and then the arithmetic
    would check to see if for some double float result "x" whether
    (x - (n = (long)x)) == 0 were true and if 0 <= n < inums_dim
    were also true, and if so return inums[n].

+---------------
| Now, Maclisp was not CL.  But my point is that there are legitimate
| implementation strategies that could decide to "intern" or "coalesce"
| or "EQ-ify" fixnums.  So I don't agree that EQ exposes immediate
| data--it exposes some form of address gaming, but the nature of that
| address gaming is sometimes tricky to infer merely from its effects.
+---------------

No argument. So let me try to rephrase a bit. The person I was
replying to suggested that we tell Lisp newbies that *all* Lisp
values are references to objects. I was pointing out that while
that is a good thing in general [and is even literally true in
some Lisps], some values *might* be immediate data rather than
heap references per se, and that those *might* be detectable
without "going under the covers".

You have now pointed out that (pre-)interned data values (either
statically-allocated or heap objects) might be indistinguishable
from immediate data values, at least with respect to EQ, which I
certainly *should* have mentioned, given that I had already made
refernce to SIOD!!  D'oh!  ;-}  ;-}

In fact, I can't immediately [pardon the pun] think of any way
[without "going under the covers"] for the Lisp programmer to
*ever* distinguish values implemented as immediates from values
implemented as references to (continually re-)interned objects.[2]
Can you?

[2] I use the nonce term "continually re-interned objects" to
    include not only the possible re-interning of arithmetic
    results, as in the above, but also other possible object
    types which might behave similarly. E.g., I'm told that in
    Java the [immutable] String is such a type (i.e., that the
    results of, e.g., StringAppend are automatically interned),
    while mutable arrays of characters are not.

+---------------
| > +---------------
| > | If you want to pass a new copy of the object, you have to
| > | explicitly create the copy, and pass a reference to it.
| > +---------------
| > 
| > Again, except for immediates, where a copy of the "reference"
| > *is* a new copy of the "object".
| 
| I think the right way to understand integers and characters is that
| they are not immediates or non-immediates...
+---------------

Sorry, I didn't mean to fixate on small integers or characters.
It's just that those were examples that were close at hand on
my favorite CL implementation. Indeed, I already noted that in
Unicode-aware implementations characters are probably *not*
immediates [though they *might* be interned, and if so, indis-
tinguishable from immediates]. I was just trying make sure that
a crack was left open in the "all Lisp objects are references"
teaching, so that newbies wouldn't feel lied to if/when they
eventually learned that it wasn't *always* true. Maybe I went
a bit too far, I dunno.

+---------------
| But calling them immediates when talking about Lisp-level abstractions
| probably confuses matters, even though subprimitively indeed
| representing them as immediates is a possible implementation choice
| that would have various consequences on the Lisp-visible side.
| 
| I just think we should avoid that. It's ok to do by analogy, to help
| someone get a vague handhold on the fact that the world over here
| might be different than in other languges, but not to lead anyone to
| think that we just have a contorted way of saying immediate without
| owning up to it.  It isn't that.  We have a way of allowing immediates
| and pointers to be used freely and interchangeably in some contexts as
| implementation strategies, and a language that is defined such that
| these things don't matter.
+---------------

Yes, that's all I was trying to convey: "All Lisp objects are
references, all the time... [except that if you plan to become
an implementor (or an FFI designer) some day you should know
that under the covers they might not always all be references
to heap objects]".

I was also exploring the extent to which the "immediate"
optimization might be programmer-visible but I think you've
probably convinced me that it's indistiguishable from
continually re-interned heap object types.


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607