Madhu <enometh@meer.net> wrote:
+---------------
| Rob Warnock <rpw3@rpw3.org> wrote:
| | Nope, tried all kinds of type declarations -- doesn't help. The
| | problem is, as I said, a typo in "cmucl/src/compiler/float-tran.lisp",
| | in the #+RANDOM-MT19937 (DEFTRANSFORM RANDOM). Two of the COND terms
| | need to be swapped, namely, this code, which does a premature GIVE-UP
| | [because the RANDOM-FIXNUM-MAX is smaller than (EXPT 2 32), so the
| | second branch can never be taken!]:
| [snip]
| | cmu> (defvar *foo* 429496729)
| |
| | *FOO*
| | cmu> (and (> *foo* kernel:random-fixnum-max)
| | (< *foo* (expt 2 32)))
| |
| | T
|
| Ah but youve chosen *foo* FIXNUM.
| (fixnump *foo*) => T
+---------------
True enough, though it *was* bigger than KERNEL:RANDOM-FIXNUM-MAX,
which was all I really needed to show that the patch was working. ;-}
+---------------
| I think The no-consing behaviour you observe [after patching
| compiler/float-tran.lisp] is not because of 32bit inlining but
| because your values lie in the fixnum range.
+---------------
Yes, but no, not entirely. Yes, if you pick a number larger than
a fixnum (but 2^32 or smaller) and type the intermediate variable
with (INTEGER 1 4294967296), you do still get lots of consing
[~168 bytes per iteration], presumably because it uses bignum
arithmetic to get it into a register. The same thing happens
if you type the intermediate variable with (UNSIGNED-BYTE 32),
but to a lesser degree [only ~86 bytes per iteration].
But interestingly [the "no, not entirely" case], if you pre-decrement
the argument, type the intermediate variable with (UNSIGNED-BYTE 32),
and then pass it incremented to RANDOM, the consing goes away again,
regardless of the value of *FOO* [provided, of course, that you
don't attempt values outside the range (INTEGER 1 4294967296)]:
cmu> (load "float-tran-patch")
; Loading #P"/u/rpw3/float-tran-patch.x86f".
T
cmu> (setf *foo* (expt 2 32)) ; [already DEFVAR'd earlier]
4294967296
cmu> (time (let ((bar (1- *foo*))) ; so as not to violate the U32
(declare (type (unsigned-byte 32) bar))
(dotimes (i 1000000)
(random (the (unsigned-byte 32) (1+ bar))))))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.02f0 seconds of real time
; 0.015088f0 seconds of user run time
; 0.0f0 seconds of system run time
; 34,611,940 CPU cycles
; 0 page faults and
; 0 bytes consed.
;
NIL
cmu>
Or for something midway between a fixnum and 2^32:
cmu> (setf *foo* (floor (- (expt 2 32) most-positive-fixnum) 2))
1879048192
cmu> (fixnump *foo*)
NIL
cmu> (time (let ((bar (1- *foo*)))
(declare (type (unsigned-byte 32) bar))
(dotimes (i 1000000)
(random (the (unsigned-byte 32) (1+ bar))))))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.02f0 seconds of real time
; 0.015289f0 seconds of user run time
; 0.001162f0 seconds of system run time
; 37,957,864 CPU cycles
; 0 page faults and
; 0 bytes consed.
;
NIL
cmu>
I think this works because the compiler is smart enough to know
that one plus a U32 is the same as (INTEGER 1 4294967296), which is
the preferred RANDOM arg type to do the special-casing on X86 platforms.
You can also avoid the consing the following way, though I'm not sure
if it's less or more ugly than the above!! ;-}
cmu> (time (if (= *foo* 4294967296)
(dotimes (i 1000000)
(random 4294967296)) ; special-cased in the DEFTRANSFORM
(let ((bar *foo*))
(declare (type (integer 1 4294967295) bar))
(dotimes (i 1000000)
(random bar)))))
; Compiling LAMBDA NIL:
; Compiling Top-Level Form:
; Evaluation took:
; 0.02f0 seconds of real time
; 0.015329f0 seconds of user run time
; 0.0f0 seconds of system run time
; 35,832,596 CPU cycles
; 0 page faults and
; 0 bytes consed.
;
NIL
cmu>
Again, the typing in the second branch lets the compiler know it
can keep BAR in a register and not need generic bignum arithmetic
on it.
+---------------
| | But the DEFTRANSFORM has special code on x86 platforms to inline
| | 32-bit random numbers [that is, an arg to RANDOM of less than or
| | equal to 2^32]. Are you suggesting that [even after the above bug-fix]
| | said special-case code is in fact wrong?!?
|
| The issue IIUC is in using (REM (random-chunk) n) to return a random
| number in the range [0, N-1], (as Raymond Toy pointed out:)
|
| (random-chunk) is uniformly distributed in the range
| [0, (expt 2 random-chunk-legth)]
|
| [The mersenne twister in cmucl uses a random-chunk-length = 32.]
+---------------
Actually, (random-chunk) is of type (UNSIGNED-BYTE 32), and so is is
uniformly distributed in the range [0, (expt 2 random-chunk-length)),
using half-open intervals, or [0, (1- (expt 2 random-chunk-length))]
using closed intervals.
Also, I suspect Ray may have forgotten that I was specifically talking
about the X86 platform. That (REM (RANDOM-CHUNK) N) code is under a #-X86
conditional, and won't be used. And in any event, that section of code
is only used for the case of a compile-time *constant*, and even then,
only when the constant is *not* exactly 2^32.
For the cases I was trying to fix -- the arg is a variable in [1, 2^32] --
a different section of code is used, which on the X86 platform does
(VALUES (BIGNUM::%MULTIPLY (RANDOM-CHUNK ...) ARG)) instead of the
bignum REM, where BIGNUM::%MULTIPLY is an inlined VOP that takes two
32-bit numbers and returns a 64-bit result in two 32-bit registers --
that is, doesn't cons.
Bottom line, if you stand on your head & spin around three times ;-} ;-}
you can get CMUCL on X86 to serve up pseudo-random numbers up to
and including 32 bits long without consing.
-Rob
p.s.
+---------------
| So there is a small bias in the generated numbers in the range
| [0, (mod (expt 2 random-chunk-length) n)]
|
| I'm not an expert, so please correct me if I'm wrong.
+---------------
Does this bias still exist if you use the correct type for
RANDOM-CHUNK, which on X86 is (UNSIGNED-BYTE 32)??
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607