[Cross-posting trimmed to groups I read...]
Larry D'Anna <larry@elder-gods.org> wrote:
+---------------
| Jon Harrop <jon@ffconsultancy.com> wrote:
| > Larry D'Anna wrote:
| >> Johannes Laire <johannes.laire@gmail.com> wrote:
| >>>> That is O(n) with GHC.
| >>>
| >>> It is O(1) for unboxed arrays.
| >>
| >> I thought it was O(1) for boxed arrays too, but I tested it and defiantly
| >> is not. Any idea why boxed vs. unboxed should make a difference?
| >
| > The write barrier incurred when mutating a boxed element in an array dirties
| > the whole array and that causes the GC to traverse the whole array at every
| > minor/gen0 GC, which is periodic (separated by constant amounts of time for
| > the purposes of this description). That dirtying is not necessary for
| > unboxed elements so you do not see the asymptotic performance degradation
| > with them, e.g. the STUArray of Int in your example.
|
| Why in the world does it have to dirty the whole array? Is there any sane
| reason for that or is this just an unfortunate deficiency of GHC? I just
| can't believe that an array update is O(n). But but my tests updating a
| STArray really is O(n) in GHC. I'm a bit shocked.
+---------------
If GHC were using a generational GC with a software card-marking
write barrier, then updating a single element in an array of boxed
elements should at most dirty a card's worth of elements. According
papers I've read on this, good "card" sizes these days are ~256 bytes
[or at least somewhere in the 128-1024 byte range], which on a 32-bit
machine would mean an extra 64 elements to scan during the first GC
following the update [but possibly no extra work thereafter, if the
updated value gets promoted into the same generation as the array].
This would therefore be an O(1) cost w.r.t. to the size of the array,
though O(c) for "c" the number of distinct cards updated. [That is,
doing a zillion updates to all of the elements of one card costs
no more GC overhead than doing a single update to that same card.]
So all you're saying is that GHC isn't using that kind of GC. ;-}
-Rob
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607