Memory profile

Maxim Kizub max at immsp.kiev.ua
Wed Jan 27 09:05:14 PST 1999


jason wrote:
> 
> Maxim Kizub writes:
> >
> >
> >
> > I think this will not help.
> >
> > I'm looking for memory leaks, and they unlikely
> > in _java_ objects. If they exists, they must be in
> > memory that is explicitly allocated and freed.
> > I need a pointer in raw memory blocks...
> > But I can't understant how gc_unit and gc_block
> > are allocated and used...
> 
> It looks like you already found the (or at least the most serious)
> problem,  but I can try to answer your question anyway.
> 
> gc_blocks describe pages of memory.  A page contains several small
> objects, or a run of pages contains one large object.  The an
> gc_blocks is allocated at startup, and normally it stays put.
> 
> The gc_blocks contain three pointers into the pages they manage: funcs
> state and data.  Funcs and state are byte arrays allocated at the
> start of the page, each object has a func and state.  data points to
> the first actual object on the page.
> 
> gc_unit is the object header used in the psuedo-treadmill collector.
> gcMalloc allocates the requested size + 8 bytes.  The gc_unit is at
> the base of the allocated memory, and the object reference just above
> the gc_unit.  This is why markObject does UTOMEM (which subtracts 8
> from a pointer) before checking if the pointer is a valid object.
> 
> I haven't looked at the post-libtool version of kaffe, so my
> information could be out of date.


Ok. I have another question about gc_unit and gc_blocks.
Looks like I do not understand one thing - why we need
gc_unit ?

I calculated, that for this scheme (without gc_unit)
each Object.class object must allocate 6 bytes of
memory (about 7 bytes if add gc_block header).

Look.

1 byte per object state (colour & state),
1 byte per GC walk func (really need 4 bits, IMHO).
4 bytes per dtable of object itself.

That's all!

Each gc_block may be calculated from
object's pointer, if we know, that pages
are allocated at specific align (to say,
4096 or 8192 align or like this).
For example, to get gc_block from
memory address, we just need

gc_block *info = (gc_block*)(mem & ~0x1FFF); // for 4096 page

The minimal object's size (without
func & state) is 4 bytes.

How GC should allocate & free objects?

Let's say we have:

gc_block {
	...
	void* fr;	// now points to 'free5'
	...
	obj1
	obj2
	obj3
	obj4
	free5	// now points to 'free6'
	free6	// now pointe to next free
	...
}

When object in gc_block is freed, it's
contents must be replaced with pointer
to next free object, for example,

(void*)&obj3 = fr;
fr = &obj3


if we free 'obj3', the gc_block will
looks like

gc_block {
	...
	void* fr;	// now points to 'free3'
	...
	obj1
	obj2
	free3	// now points to free5
	obj4
	free5
	free6
	...
}

When we allocate new object, we just

void *tmp = (void*)*fr;
memcpy(fr,&obj3,size);
fr = tmp;


Looks like this is sufficient for GC, allocation and
freeding. I see no need in gc_unit structure
allocation...

Next, I have some thoughts about reducing
memory fragmentation and speedup of
GC.

I'm not shure this is already implemented, but...

I think, there may be 2 ways to GC - quick
and full.

Quick will include collecting of normal
objects, without attempt to GC classes
and interned strings. Each object referred
from conservative walker must be marked
as non-moveable, plus classes and
interned strings are non-moveable at
this (quick) mode.

If GC still can't free enough of memory,
it will make full scan. In full scan
mode objects will be _moved_. I.e.
each object, not marked as unmoveable
at fast phase may be moved to
first gc_block that has free space.
Now, as we are shure, that this object
is not referenced from stack, we add
it to special hash table, that has
keys of old object's address, and
values of new object's address.

When we encounter this object's reference
(old reference) later (this may be
only method's fields and other places,
where we shure this is pointer to
object, not an integer), we'll take
it's new value from that special
hash table and replace the pointer.

Additionaly, we may move classes
and interned strings too. The only
problem is that they referred from code
directly. To avoid this, we need to
change bytecode generator to
load this pointers indirectly, using
class's constant pool. Then,
any generated code will have pointers
to it's own class' constant pool,
not pointers to actual classes
and strings.

Or, we may force allocation of
classes and interned strings at
specially assigned pages, not in
common gc_block. Then their allocation
will not lead to memory fragmentation
if we'll move other object's at full
GC phase.


More information about the kaffe mailing list