Joe Marshall <prunesquallor@comcast.net> wrote:
+---------------
| Luke Gorrie <luke@bluetail.com> writes:
| > It also appears that on some systems (i.e. my laptop) the garbage
| > collector causes a great deal of kernel work. Smarter use of system
| > calls may fix this, e.g. coalescing mprotect()s of adjacent pages.
|
| Some experiments have shown that checking the write-barrier in
| software by doing a conditional branch on every write can outperform
| hardware checking based on mprotect.
+---------------
I remember reading some papers about that a while back. One in particular,
<URL:http://citeseer.nj.nec.com/hosking92comparative.html>, examined
various kinds of write barriers and concluded that a software write
barrier using card marking (with fairly small cards, 256-512 bytes)
out-performed kernel/VM page-based hardware write barriers:
The page trapping scheme performed poorly in comparison to
card marking. Interestingly, this does not appear to be due to
the overhead of fielding page traps, since that is included in
running time, which was not significantly higher (and often lower)
than in the card marking schemes. Rather, it is because pages
are too large a granule so they miss the optimum card size.
That is, a single write means that the whole page gets "dirtied"; you have
to scan the entire page at the next GC. Whereas with a card-marking system
you only have to scan the (presumably much smaller) card that was dirtied.
We can summarize the conclusions as follows: a card size of 256
or 512 bytes appears optimal for card marking on this hardware;
page trapping was surprisingly effective, but is not the best
scheme because its granularity is too large; and remembered sets[1]
are best overall.
-Rob
[1] "Remembered sets" are yet another way of doing software write barriers,
providing an even more precise record of the significant stores:
...associates a remembered set with each generation, recording
the objects or locations in older generations that may contain
pointers into that generation. Any pointer store that creates
a reference from an older generation to a younger generation
is recorded in the remembered set for the younger generation.
At scavenge time the remembered sets for the generations being
scavenged include the heap root set for the scavenge.
...
Our remembered sets are implemented as circular hash tables
using linear hashing. ...
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607