Subject: Re: Corman Lisp and binary files
From: rpw3@rpw3.org (Rob Warnock)
Date: Mon, 10 Apr 2006 01:47:32 -0500
Newsgroups: comp.lang.lisp
Message-ID: <yt-dnQMmHegZYqTZnZ2dnUVZ_u-dnZ2d@speakeasy.net>
Zach Beane  <xach@xach.com> wrote:
+---------------
| Shyamal Prasad <shyamalprasad@verizon.net> writes:
| > I was just referring to CL as a standard. Or rather, cltl and the
| > hyperspec since I've never read the actual standard.
+---------------

For all practical purposes, the HyperSpec (CLHS) *is* "the actual
standard" [well, to be precise, it was mechanically derived from the
same TeX sources as the printed ANSI Standard].
You can have confidence in the CLHS.

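CLtL & CLtL2, on the other hand, are another matter entirely: CLtL
describes the pre-ANSI language, and CLtL2 is a snapshot of the
committee's work in progress that differs from the final standard
in places.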

+---------------
| > The language specification seems extremely stand offish with regards
| > to processing binary streams: there is no "8 bit byte" type data, 
| 
| That's not true. (UNSIGNED-BYTE 8) is such a data type.
+---------------

So we don't confuse Shyamal *too* much, we should point out that
while most implementations do provide (UNSIGNED-BYTE 8) as a distinct
specialized element type, a random implementation need not implement
it except as a subset of a larger type. Similarly, most implementations
*don't* provide an exact (UNSIGNED-BYTE 5) type, but some random
implementation might.
Since he's interested in what specialized array types will
"do the right thing" with READ-SEQUENCE, we should mention
using UPGRADED-ARRAY-ELEMENT-TYPE to see how some specific
integer subtype behaves:

    cmucl> (upgraded-array-element-type '(unsigned-byte 5))

    (UNSIGNED-BYTE 8)
    cmucl> (make-array 10 :element-type '(unsigned-byte 5))

    #(0 0 0 0 0 0 0 0 0 0)
    cmucl> (type-of *)

    (SIMPLE-ARRAY (UNSIGNED-BYTE 8) (10))
    cmucl> 

And similarly:

    cmucl> (with-open-file (s "foo" :element-type '(unsigned-byte 5))
             (describe s))
    #<Stream for file "foo"> is a structure of type FD-STREAM.
    ...[trimmed]...
    ELEMENT-SIZE: 1.
    ELEMENT-TYPE: (UNSIGNED-BYTE 8).
    FD: 6.
    BUFFERING: :FULL.
    ...[trimmed]...
    cmucl> 

So if you tried to read 5-bit bytes with CMUCL, you would silently
get 8-bit bytes instead, because the stream's element type is
upgraded in the same way as the array's.
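
If you want to check for this kind of upgrading portably rather than
by eyeballing REPL output, something like the following might do. It
is only a sketch: ELEMENT-TYPE-UPGRADED-P is a made-up name, not
anything in the standard or in CMUCL, and it relies only on
UPGRADED-ARRAY-ELEMENT-TYPE and SUBTYPEP.

```lisp
;; Hypothetical helper, not part of CL or of any implementation.
(defun element-type-upgraded-p (type)
  "Return true if an array requested with element type TYPE would
actually be specialized on some strictly larger type."
  (let ((upgraded (upgraded-array-element-type type)))
    ;; TYPE is always a subtype of its upgraded type, so it was
    ;; upgraded only if the reverse relation does not also hold.
    (not (subtypep upgraded type))))
```

Going by the transcript above, (element-type-upgraded-p
'(unsigned-byte 5)) should be true on CMUCL. For streams, calling
STREAM-ELEMENT-TYPE on the opened stream is the analogous check.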

That said, (UNSIGNED-BYTE 8) *is* probably the type you want to use
for binary streams of 8-bit bytes on most CL implementations.
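
For concreteness, here is a minimal sketch of reading a whole file as
8-bit bytes with READ-SEQUENCE. READ-FILE-OCTETS is a made-up name,
and the sketch assumes a regular file for which FILE-LENGTH can
report a size and an implementation that provides (UNSIGNED-BYTE 8)
streams.

```lisp
;; Hypothetical helper; assumes PATH names an existing regular file.
(defun read-file-octets (path)
  "Return the contents of the file at PATH as a vector of octets."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* ((octets (make-array (file-length in)
                               :element-type '(unsigned-byte 8)))
           ;; READ-SEQUENCE returns the index of the first element
           ;; it did not fill in.
           (end (read-sequence octets in)))
      ;; Trim in case fewer bytes arrived than FILE-LENGTH reported.
      (subseq octets 0 end))))
```

The array and the stream are given the same element type on purpose,
so that READ-SEQUENCE stores exactly what the stream delivers.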


-Rob

-----
Rob Warnock			<rpw3@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607