Subject: Re: Quick way to figure out if floats are boxed in arrays?
From: rpw3@rpw3.org (Rob Warnock)
Date: Fri, 13 Oct 2006 22:34:52 -0500
Newsgroups: comp.lang.lisp
Message-ID: <m-idncsOoftBx63YnZ2dnUVZ_tKdnZ2d@speakeasy.net>
Duane Rettig  <duane@franz.com> wrote:
+---------------
| Nicolas Neuss <firstname.lastname@iwr.uni-heidelberg.de> writes:
| > Hmm, maybe UPGRADED-ARRAY-ELEMENT-TYPE?
| > (upgraded-array-element-type 'double-float)  ; if T then boxed?
| 
| Since this answer was posed with a question mark, I'll solidify it to
| a fine mush.  If this expression returns T then the double-floats will
| definitely be boxed.  If it returns nil, then I am not sure if it a
| priori dictates that the float is housed unboxed into a specialized
| array made with that element-type.  This is definitely the case in
| Allegro CL (i.e. nil means unboxed), and I believe that most if not
| all other CLs do the same thing (not specifically to store
| double-floats unboxed into an array made  with element-type
| double-float, but to do so iff the above return from u-a-e-t is nil).
+---------------

But, but... I thought UPGRADED-ARRAY-ELEMENT-TYPE is supposed to do this:

    [CLHS]
    Function UPGRADED-ARRAY-ELEMENT-TYPE
    ...
    Description:
    Returns the element type of the most specialized array representation
    capable of holding items of the type denoted by typespec. 

So while it can certainly return T, or the same type that it was given
as an argument, or some type between that type & T, how can it possibly
return NIL?!? [1]

Anyway, I think the OP's question is unanswerable within ANSI CL.
Yes, for many CLs and typespecs it might be the case that if:

    (upgraded-array-element-type 'foo) ==> foo

then a FOO can be stored in a specialized array without "boxing"
[defined as being a pointer to separate heap-allocated memory
per element]. So one might be tempted to define something like this:

    (defun array-element-unboxed-type-p (type)
      (equalp type (upgraded-array-element-type type)))

But as Carl Shapiro pointed out to me off-line, the CLHS makes
absolutely no guarantee whatsoever that a specialized array actually
*does* store things unboxed!! All a "specialized array" guarantees
is that you can do type discrimination on the element type. [See CLHS
"15.1.2 Specialized Arrays" & "15.1.2.2 Required Kinds of Specialized
Arrays".] That is, it is perfectly legal [though most people would
consider it not very useful] for a CL implementation to behave as
follows on some types, call one of them WEIRD-TYPE:

    > (upgraded-array-element-type 'weird-type)

    WEIRD-TYPE
    > (make-array 10 :element-type 'weird-type i:initial-element (make-weird-type))

    #(#<weird> #<weird> #<weird> #<weird> #<weird>
      #<weird> #<weird> #<weird> #<weird> #<weird>)
    > (array-element-type *)
    WEIRD-TYPE
    > 

and yet have the array elements *actually* be pointers to heap-allocated
objects of type WEIRD-TYPE. the (confirming) implementation simply chose
to allow specialized arrays of WEIRD-TYPE to allow type discrimination
on that type. (Or something.)

[Likewise, ANSI CL does not *require* that STRINGs be stored as
dense arrays of unboxed elements. They might well be stored as
arrays of pointers to "something else".]

Conversely, if what the OP was *really* worried about is not "boxing"
per se but "wasted space", the above ARRAY-ELEMENT-UNBOXED-TYPE-P
would *also* give you the "wrong" answer when an implementation chooses
to use an upgraded type which -- while still *unboxed* -- is not
exactly the same size as the type you asked about. Consider these
examples from CMUCL [Allegro is probably similar]:

    > (mapcar 'upgraded-array-element-type
	      '(float single-float double-float))

    (T SINGLE-FLOAT DOUBLE-FLOAT)
    > (mapcar 'upgraded-array-element-type
	      '(nil character base-char extended-char))

    (BIT BASE-CHAR BASE-CHAR BIT)           ; See [1], [2] below
    > (mapcar 'upgraded-array-element-type
	      '(unsigned-byte (unsigned-byte 8) (unsigned-byte 16)
		(unsigned-byte 17) (unsigned-byte 24) (unsigned-byte 30)
		(unsigned-byte 32)))

    (T (UNSIGNED-BYTE 8) (UNSIGNED-BYTE 16) (UNSIGNED-BYTE 32)
     (UNSIGNED-BYTE 32) (UNSIGNED-BYTE 32) (UNSIGNED-BYTE 32))
    > (mapcar 'upgraded-array-element-type
	      '(signed-byte (signed-byte 8) (signed-byte 16)
		(signed-byte 17) (signed-byte 24) (signed-byte 30)
		(signed-byte 32)))

    (T (SIGNED-BYTE 8) (SIGNED-BYTE 16) FIXNUM FIXNUM FIXNUM (SIGNED-BYTE 32))
    > 

In particular, (UPGRADED-ARRAY-ELEMENT-TYPE '(SIGNED-BYTE 30)) ==> FIXNUM,
yet there is *no* wasted space or boxing involved in that "upgrade"!!
[But there might be a *performance* penalty (shifting).]

Similarly, (UPGRADED-ARRAY-ELEMENT-TYPE '(SIGNED-BYTE 24)) ==> FIXNUM
*does* represent a small percentage of wasted space, but still no boxing.

So, to repeat, I don't think there's any portable ANSI CL way to
know whether or not specializing an array for a given type will
result in boxing and/or "wasted space" in the array. One would
need to do some implementation-specific poking around at a very
low level to find out.


-Rob

[1] Well, except for an argument of NIL. Carl says Allegro does
    (UPGRADED-ARRAY-ELEMENT-TYPE NIL) ==> NIL, which is not altogether
    unreasonable. On the other hand, what CMUCL does is arguably
    not "wrong" either: (UPGRADED-ARRAY-ELEMENT-TYPE NIL) ==> BIT.

[2] In CMUCL the EXTENDED-CHAR class is empty, so it's not totally
    unreasonable for UPGRADED-ARRAY-ELEMENT-TYPE to return the same
    thing it does for other empty classes such as NIL.

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607