Rewriting the GC
tullmann at cs.utah.edu
Mon Apr 1 10:10:18 PST 2002
> > The big problem with walking the stack isn't the Java stack as much as
> > the native stack. You could walk the Java parts precisely, and the
> > native bits conservatively, but I don't know that you'd win anything
> > by doing this.
> OK, I'm not so familiar with the way Java interacts with
> native code, but why do we need to walk the native bits at
> all? Surely C code doesn't need GC?
All the native methods in Kaffe, the threading system, the core class
loading --- that's all written in C. That code uses the same stack as
the Java code. That code uses Java object references left and right,
and stores them on the stack. I know the object allocation code in
the GC itself relies on the GC scanning the stack to avoid collecting
a brand new object (whose only reference is on the C stack).
If you grep through the core of the VM, it invokes gc_malloc() a LOT.
It definitely relies on the GC. Godmar and Tim cleaned things up a
lot to avoid creating excessive walkable objects (e.g., many of the
structures that hang off a class are explicitly allocated/deallocated)
but there are still a significant number of GC objects that are
created in C.
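
To make the conservative walk of the C stack concrete, here is a minimal sketch, not Kaffe's actual code. The toy heap, `toy_alloc()`, `cell_for_word()` and `scan_conservatively()` are all hypothetical names made up for illustration. The idea is only that every word in a stack range is treated as a possible pointer, and anything that lands inside an allocated cell keeps that cell alive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy heap: fixed-size cells in one contiguous arena. */
#define CELL_SIZE 32
#define NUM_CELLS 64

static unsigned char heap[CELL_SIZE * NUM_CELLS];
static unsigned char allocated[NUM_CELLS]; /* 1 if the cell is in use */
static unsigned char marked[NUM_CELLS];    /* 1 if the scan reached it */

/* Hypothetical allocator: hand out the first free cell, or NULL. */
static void *toy_alloc(void)
{
    for (size_t i = 0; i < NUM_CELLS; i++) {
        if (!allocated[i]) {
            allocated[i] = 1;
            return &heap[i * CELL_SIZE];
        }
    }
    return NULL;
}

/* Return the cell index if `word` looks like a pointer into an
 * allocated cell (interior pointers count), otherwise -1. */
static long cell_for_word(uintptr_t word)
{
    uintptr_t base = (uintptr_t)heap;
    if (word < base || word >= base + sizeof heap)
        return -1;
    size_t idx = (size_t)(word - base) / CELL_SIZE;
    return allocated[idx] ? (long)idx : -1;
}

/* Conservatively scan the words in [lo, hi): anything that might be a
 * heap pointer marks its cell. Returns the number of newly marked cells. */
static size_t scan_conservatively(const uintptr_t *lo, const uintptr_t *hi)
{
    size_t hits = 0;
    assert(lo <= hi);
    for (const uintptr_t *p = lo; p < hi; p++) {
        long idx = cell_for_word(*p);
        if (idx >= 0 && !marked[idx]) {
            marked[idx] = 1;
            hits++;
        }
    }
    return hits;
}
```

A freshly allocated object whose only reference sits in a C local survives under this scheme simply because that local is one of the scanned words.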
> > As I understand GC trade-offs, the big win for precise GC is the
> > ability to update pointers and thus implement a compacting collector.
> > Is there something else you're hoping to get out of precise stack
> > walking?
> Predictability and speed of GC.
You're talking about bogus pointer references in a conservative scan
being viewed as pointers, when in fact they're just integers that look
like pointers? Most of the literature about that says the overhead is
pretty small. I doubt you'd see any performance improvement (I guess
things would get worse from having to manage stack maps and the like).
As for predictability, I don't see that as a useful goal for Kaffe....
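
For comparison, a stack map can be as simple as a bitmap saying which frame slots hold references. The sketch below is an assumption about what such a map might look like, not something Kaffe has; `struct stack_map` and `scan_precisely()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack map for one frame at one program point:
 * bit i of ref_slots is set when slot i holds an object reference. */
struct stack_map {
    uint32_t ref_slots;
    unsigned nslots;
};

/* Copy the non-null references of a frame into out_refs, consulting the
 * map instead of guessing. Returns how many references were found.
 * out_refs must have room for map.nslots entries. */
static size_t scan_precisely(const uintptr_t *frame, struct stack_map map,
                             uintptr_t *out_refs)
{
    size_t n = 0;
    assert(map.nslots <= 32);
    for (unsigned i = 0; i < map.nslots; i++) {
        if (((map.ref_slots >> i) & 1u) && frame[i] != 0)
            out_refs[n++] = frame[i];
    }
    return n;
}
```

An integer slot that happens to look like a pointer is skipped because its bit is clear. The cost is that the compiler or JIT has to produce and store one such map for every point at which the GC may run.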
Remember, Kaffe already gets the benefits of precise GC for most
objects. It is just thread stacks and objects for which it has no
layout information or specific walk function that cause a conservative
scan (and only for those objects). (I wonder what the other
conservatively walked objects are; I think there are still some.)
Implementing a compacting collector, OTOH, would be really cool.
Kaffe could support generational collection (which is what all the
"real" JVMs support AFAIK), and its support for multi-process VMs (the
Flux work I do --- KaffeOS and JanosVM) would be much improved.
That's just a significant amount of additional work, I believe.
You could get an upper bound on the predictability gain by instrumenting
the VM to count the objects whose only reference comes from a
conservative scan. Many of those will be (I bet) legit references, but
a precise stack walk cannot get rid of more than that...
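
A rough sketch of that instrumentation, under heavy assumptions: objects are indices into a toy table with at most one outgoing reference each, and the conservative scan has already turned stack words into candidate object indices. None of these names (`objs`, `mark_from()`, `count_conservative_only()`) come from Kaffe.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_OBJS 8

/* Hypothetical object: at most one outgoing reference, -1 for none. */
struct obj { int child; };

enum { UNMARKED = 0, PRECISE = 1, CONSERVATIVE_ONLY = 2 };

/* Sample graph: 0->1, 2->3, 4->1, everything else has no child. */
static struct obj objs[NUM_OBJS] = {
    {1}, {-1}, {3}, {-1}, {1}, {-1}, {-1}, {-1}
};
static int mark[NUM_OBJS];

/* Mark everything reachable from root that is still unmarked,
 * tagging it with colour. Returns how many objects were newly marked. */
static size_t mark_from(int root, int colour)
{
    size_t newly = 0;
    for (int i = root;
         i >= 0 && i < NUM_OBJS && mark[i] == UNMARKED;
         i = objs[i].child) {
        mark[i] = colour;
        newly++;
    }
    return newly;
}

/* Mark from the precise roots first, then from the conservative stack
 * roots. Whatever the second pass newly marks is alive only because of
 * the conservative scan: the most a precise stack walk could free. */
static size_t count_conservative_only(const int *precise_roots, size_t np,
                                      const int *stack_roots, size_t ns)
{
    size_t only = 0;
    for (size_t i = 0; i < NUM_OBJS; i++)
        mark[i] = UNMARKED;
    for (size_t i = 0; i < np; i++)
        mark_from(precise_roots[i], PRECISE);
    for (size_t i = 0; i < ns; i++)
        only += mark_from(stack_roots[i], CONSERVATIVE_ONLY);
    return only;
}
```

Note that this counts objects kept alive transitively as well as directly, so it is a generous upper bound.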
> > Another approach to consider is to implement GC-safe points (e.g., on
> > method calls and backwards branches in Java code). Then you only have
> > to track and update the stack maps at each safe point,
> There's a lot to be said for this, but since you can allocate
> unlimited memory in an exception handler, every point that can
> throw an exception has to be a safe point, which reduces the
Most call points and backwards branches would have to be gc-safe,
anyway --- to avoid looong gaps without safe points --- so I don't
think exceptions pose any significant problems. While you don't get
major wins for the JIT'd code this way (though I think there are some
nice ones) a system that supports safe points is (and I'm somewhat
guessing here) much easier to write safe native code in --- you don't
have to worry about your C code getting interrupted by the GC at
arbitrary points, only at safe points you explicitly insert in your
code. There are downsides, of course --- like if you forget to put a
safe point in somewhere, the GC can be blocked for a long time.
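
Here is a minimal sketch of what explicit safe points in native code could look like, assuming a simple polling design. `gc_requested`, `request_gc()`, `gc_safe_point()` and `native_sum()` are hypothetical, and the "collection" is just a counter standing in for parking the thread and letting the GC run.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag the collector sets when it wants mutators to stop. */
static atomic_bool gc_requested;
/* Counts how often a thread actually yielded to the GC. */
static size_t gc_runs;

/* Called by the collector (or an allocating thread) to ask for a GC. */
static void request_gc(void)
{
    atomic_store(&gc_requested, true);
}

/* Explicit safe point: the only place native code can be "interrupted".
 * A real VM would park the thread here; this sketch just counts. */
static void gc_safe_point(void)
{
    if (atomic_exchange(&gc_requested, false))
        gc_runs++;
}

/* Hypothetical native method: a long loop that polls every 1024 steps. */
static long native_sum(const int *v, size_t n)
{
    long total = 0;
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        total += v[i];
        if ((i & 1023) == 1023)
            gc_safe_point();
    }
    return total;
}
```

A call that runs for fewer than 1024 iterations never polls, so a pending request stays pending until some later safe point. That is the "forgot to put a safe point in" downside in miniature.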
----- ----- ---- --- --- -- - - - - -
Pat Tullmann www.tullmann.org
"Forty-Two." -- Deep Thought