Copyable GC

Thu Apr 1 06:41:06 PST 1999

Hello.

I have an idea about memory fragmentation.
It's obvious, that to reduce fragmentation we
need to have ability to copy objects at GC time.

We've discussed this before and found next
hard points with it:

1. Objects referenced from jit-generated code
2. Objects referenced from stack
3. Object.hashCode() method.

For 1. We know, that only Class and String
objects may be referenced from code.
Class objects are not allocated and deallocated
very offten and they have their own GC walk function.
They may be allocated and deallocated by special
malloc-like algoritm and may be considered unmoveable.
Strings referencable from native code may only
be allocated through class' constant pool.
It's better to not have direct references to String
objects from jit-generated native code, but load
them indirectly (i.e. have a hard-coded pointer
to cont pool and load String indirectly via this
pointer and fixed offset). If jit for distinct
platform do not load String objects indirectly but
have hardcoded pointer to String objects - for
this platform String object's may be considered
unmoveable, like Class objects.

For 2. It's possible to have jit compiler that
knows for shure stack/register map for objects.
If we'll write new version of jit engine - it whould
be fine to remember to implement in it this
feature. But until that we may consider
object's referenced from native stacks to be
unmoveable.

For 3. Object.hashCode() method should return
the same value for the whole lifecicle of
object. If we will have ability to optimize memory
allocation (defragment memory) by object
copying - we need to find a way to return the same
value.

An approach to allocate additional 32 bits for
each object to store hash code is not the best.
It will add too much of memory overhead and
it's unclear - will we win anything having
defragmentation and this overhead.

Another approach is to return objet's address
as hash code if object was not moved and
have system-wide map for oldHashCode->newHashCode.
This will adds 64 bits for each moved object.
It's still unclear about how much memory will
be required by this global map. For some
applications this may be an effective startegi,
but it's hardly a good solution for all
applications.

The last approach (that's the idea I wish to propose)
is to implement a special bit for unmoveable
object's and set it each time Object.hashCode()
native method is called.

I.e. I have a proposition:

1) allocate two bits for object movement.
2) first bit for per-scan object fixing
(marking it unmoveable) and second bit for
always unmoveable objects.
3) have two-pass GC collector - first
pass will implement mark-and-sweep
GC collection and will set first bit
of object movement - if object is
unmoveable because of direct reference
from native stack or by conservative GC walk
method. The second bit is set for Class objects,
String objects and other memory structures
that cannot be moved _and_ by Object.hashCode()
method. If first pass of GC will determine that
memory fragmentation is more than a distinct
value (for example - 25 %), or GC was called
by System.gc() (wich instructs VM to made it's best
sffort to GC memory) - the second (copy) pass
will be executed.
4) The second pass of GC will decrease memory
fragmentation by object's copying.
First, each object with any of two umoveable bits
set is unmoveable.
Second, all heap blocks that have unmoveable objects
are unmoveable.
Third, all last allocated object's will be moved
from blocks that are moveable to unmoveable
blocks and then if there are heap blocks with
holes - object's from later allocated blocks
are moved to earlyer allocated blocks.

Some other possible optimizations related to
defragmentation of memory may be investigated - like
posibility to mark unmoveble objects ready
to be finalized, dynamic reconfiguring of
heap block types (table of their sizes) based
on memory fragmentation statistic and so on.

Regards
  Maxim Kizub