markObject given a primitive class object

Godmar Back gback at
Wed Mar 10 20:32:48 PST 1999

> With the snap shot downloaded on the 9th of March, I'm getting a non gc'ed
> address in the following spot.  In particular, it's a class object for short
> (not Short).  I don't see how one can assume that this memory is always gc'ed
> unless no one ever passes the short class around.

That shouldn't happen with a snapshot from March 9th.
Could you check that your itypes.c has a line

    Hjava_lang_Class* _Jv_shortClass;	/* correct */

as opposed to

    Hjava_lang_Class _Jv_shortClass[1];	/* wrong */

If so, you either didn't download a snapshot from March 9 or the 
snapshot system is broken.  Let us know if it's the latter!
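
To illustrate why the two declarations behave differently for the collector, here is a minimal sketch.  The type and function names are made up for the example, and malloc merely stands in for whatever allocator the gc provides.  With the array form the object itself sits in static storage; with the pointer form only the pointer does, and the object can come from the collected heap.

```c
#include <stdlib.h>

/* Hypothetical stand-in for Hjava_lang_Class; the real struct is larger. */
typedef struct Class { int dummy; } Class;

/* Array form: the object itself lives in static storage, so its
 * address was never handed out by the allocator. */
Class static_short_class[1];

/* Pointer form: only the pointer is static; the object it refers to
 * is obtained from the allocator at startup. */
Class *heap_short_class;

void init_classes(void)
{
    /* malloc is an assumption standing in for the gc's allocation call */
    heap_short_class = malloc(sizeof *heap_short_class);
}

/* Returns 1 if p is the statically allocated class object. */
int is_static_class(const Class *p)
{
    return p == static_short_class;
}
```

Both names can be passed around as a `Class *`, which is why a marker that assumes every such pointer came from the heap gets into trouble with the array form.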

Tim was working on adapting gcj for use with Kaffe.  He had a partial
implementation and decided to share it (it's in the CVS).  However, it was 
developed using Transvirtual's VM (where things are different) and so he 
accidentally ended up putting things in non-gc'ed space without noticing
it -- which is where gcj puts them and expects them for various reasons.

I noticed how things broke and we discussed things and decided to revert
to the old scheme for now.  The issue is still open.  For interested
people, I'll append an email in which I discuss the issue.

If anybody has useful feedback and/or ideas, this would be more than welcome.

	- Godmar

From kaffe-core at  Sun Feb 28 20:29:52 1999
Return-Path: <kaffe-core at>


 I am still really peeved by the gcj Object/Class static/dynamic 
allocation issue.

Here is the problem again: gcj allocates some classes in the data
segment, and expects some classes to be allocated in the data segment.
For example, there is

    Hjava_lang_Class _Jv_intClass[1];

in itypes.c.  This is actually java.lang.Integer.TYPE.
The problem now is that *any* static or non-static field of type Object
or Class can point to such a statically allocated object.  Therefore,
when walking such a field, care must be taken that the referenced object
is not blindly marked (with "markObjectDontCheck"), because it may not
have been allocated by the gc.

Right now, I simply replaced markObject with markAddress in the gc COM
table to prevent crashes.  This means that we treat *any* reference like
an integer value on the stack where we need to do the gc_heap_isobject check.
An expensive hack we need to get rid of.
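
The difference between the two marking paths can be sketched on a toy heap.  Everything below is hypothetical (the toy_ names, the block layout, the linear lookup); it is not Kaffe's gc code, only an illustration of an unchecked mark that trusts the pointer versus a checked mark that first asks whether the address is a heap object.

```c
#include <stddef.h>

#define TOY_BLOCKS 8

/* Toy heap block: a small header followed by the object payload. */
typedef struct ToyBlock {
    unsigned char in_use;   /* block currently holds an object */
    unsigned char marked;   /* mark bit set by the collector */
    long payload[4];        /* the object itself */
} ToyBlock;

static ToyBlock toy_heap[TOY_BLOCKS];

/* A statically allocated "object" that the toy heap knows nothing about. */
long toy_static_object[4];

void *toy_alloc(void)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        if (!toy_heap[i].in_use) {
            toy_heap[i].in_use = 1;
            toy_heap[i].marked = 0;
            return toy_heap[i].payload;
        }
    }
    return NULL;
}

/* The costly part: find the block that p belongs to, or NULL. */
static ToyBlock *toy_block_of(const void *p)
{
    for (int i = 0; i < TOY_BLOCKS; i++) {
        if (toy_heap[i].in_use && p == (const void *)toy_heap[i].payload)
            return &toy_heap[i];
    }
    return NULL;
}

/* Unchecked mark: assumes p came from toy_alloc and writes to the header
 * in front of it.  Calling this with toy_static_object would corrupt
 * unrelated memory. */
void toy_mark_dont_check(void *p)
{
    ToyBlock *b = (ToyBlock *)((char *)p - offsetof(ToyBlock, payload));
    b->marked = 1;
}

/* Checked mark: pays for the lookup on every reference.
 * Returns 1 if the object was marked, 0 if p is not a heap object. */
int toy_mark_address(const void *p)
{
    ToyBlock *b = toy_block_of(p);
    if (b == NULL)
        return 0;
    b->marked = 1;
    return 1;
}

int toy_is_marked(const void *p)
{
    ToyBlock *b = toy_block_of(p);
    return b != NULL && b->marked;
}
```

The checked variant is safe for any pointer, but it performs the lookup for every reference it walks, which is the cost objected to above.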

I see four solutions:

1.  Since we've settled on using our own version of egcs for now, how about
    we reserve space for a faked header before each static object.
    When we mark, we read that header, and if it's fake, we don't mark.
    This shouldn't hurt too much since if we mark, we'd read that header
    anyway, so it's in the cache (at least the first time we mark it, 
    i.e., if it isn't already marked.  If it's already marked, we'd see 
    its colour and proceed.)

    So, then the question becomes how to best express that in the code.
    I still have the ideal of separating gc and vm core to the greatest extent
    possible.  This ideal may not be realistic, though.
    This solution would a) assume the gc places a header before the object
    and b) that there exists a value that this header cannot have
    (like 0xffffffff).

    For now, we could just change

	Hjava_lang_Class _Jv_intClass[1];

    to

	int fake[2] = { 0xffffffff, 0xffffffff };
	Hjava_lang_Class _Jv_intClass[1];

    and hope the compiler/assembler/linker will not rearrange or pad
    things (is there a gcc directive maybe to guarantee that?)
    I don't like that because it seems fragile and/or gcc dependent 
    and/or architecture dependent.

    But I am *not* willing to tolerate that we'd have to 
    gc_heap_isobject-check every pointer before marking it.

2.  Try to do a quick check based on the address, i.e., only mark if it's
    between the smallest and biggest heap address.  This will
    require that the heap range is not interleaved with the address
    ranges of the data segments of the shared libs in the system.

    Again, it seems fragile to rely on that.
    Guaranteeing the property that no data segments interleave seems even 
    worse cause we'd have to have some hacks to find out where the dynamic 
    linker placed the data segments of shared libs.  Ouch.

    Secondly, if we allow a list of heap segments or data segments,
    we might as well use gc_heap_isobject.

3.  Special-case all references of type Class/Object. 
    That is, give objects containing non-static references to either type
    a special walk function (we don't want to give them special treatment
    in the walk function for *all* objects (i.e., walkObject) cause that 
    would slow it down.)
    Ditto for classes with static references to either type.
    Ditto for ref arrays of either type.

    The special walk function, when it sees a ref of type Class, would
    look at a flag in the class object and know where it's from.  When
    it sees a ref of type Object, it would check whether the actual type 
    is Class, and proceed as for Class refs.
    The problem with this approach is that it special-cases Class/Object
    to a certain extent.
    Who knows what egcs is going to put in static memory tomorrow?

4.  Would be to require a flag in all objects that would say whether it's
    from the heap or not.  This would again be something the gc wouldn't
    need to know about.

    This wastes space.
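
The quick address check from option 2, and the way it breaks when heap and data segments interleave, can be sketched as follows.  The HeapRange type, the range_ functions and the toy_arena buffer are invented for the illustration; the three slices of toy_arena play the roles of a first heap chunk, a data segment, and a second heap chunk that ends up behind it.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest and biggest heap address seen so far, as [lo, hi). */
typedef struct HeapRange {
    uintptr_t lo;
    uintptr_t hi;
} HeapRange;

/* One buffer cut into three slices so their relative order is known:
 * slice 0 = heap chunk, slice 1 = "data segment", slice 2 = heap chunk. */
char toy_arena[3][64];

/* Extend the range to cover a newly obtained heap chunk. */
void range_add(HeapRange *r, const void *start, size_t len)
{
    uintptr_t s = (uintptr_t)start;
    uintptr_t e = s + len;

    if (r->lo == r->hi) {       /* range still empty */
        r->lo = s;
        r->hi = e;
        return;
    }
    if (s < r->lo)
        r->lo = s;
    if (e > r->hi)
        r->hi = e;
}

/* Cheap test: 1 if p lies between the smallest and biggest heap address.
 * A result of 1 only means "possibly a heap object". */
int range_maybe_heap(const HeapRange *r, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= r->lo && a < r->hi;
}
```

With only slice 0 registered, an address in slice 1 is rejected.  Once slice 2 is registered as well, the same address falls inside the range and is accepted, which is the interleaving problem described in option 2.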

What do you think?  I'm leaning towards 1., but are there other
caveats maybe that I haven't thought about?
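
For option 1, one possible way to avoid relying on the linker keeping two separate definitions adjacent is to put the fake header and the object into a single struct, since C keeps struct members in declaration order and offsetof accounts for any padding.  This is only a sketch under assumptions: the header layout, the FAKE_HEADER value, the mark colour and all names are made up, calloc stands in for the gc allocator, and it presumes the compiler that emits references to these objects can be adapted to the wrapped layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed to be a value no real gc header ever holds. */
#define FAKE_HEADER 0xffffffffu

/* Hypothetical gc header and stand-in for Hjava_lang_Class. */
typedef struct GcHeader { uint32_t word[2]; } GcHeader;
typedef struct Class { int dummy; } Class;

/* Header and object in one struct: member order is fixed by the
 * language, so the header always precedes the object. */
typedef struct WithHeader {
    GcHeader hdr;
    Class obj;
} WithHeader;

/* Statically allocated class with a fake header in front of it. */
WithHeader int_class_storage = { { { FAKE_HEADER, FAKE_HEADER } }, { 0 } };

Class *toy_int_class(void)
{
    return &int_class_storage.obj;
}

/* Stand-in for a gc allocation: a zeroed header followed by the object. */
Class *toy_gc_alloc(void)
{
    WithHeader *w = calloc(1, sizeof *w);
    return w != NULL ? &w->obj : NULL;
}

static GcHeader *header_of(Class *c)
{
    return (GcHeader *)((char *)c - offsetof(WithHeader, obj));
}

/* Reads the header it would have to read anyway.
 * Returns 0 for a static object (left alone), 1 once the object is marked. */
int mark_class(Class *c)
{
    GcHeader *h = header_of(c);

    if (h->word[0] == FAKE_HEADER)
        return 0;           /* fake header: not from the gc heap */
    h->word[0] = 1;         /* toy colour meaning "marked" */
    return 1;
}
```

The marker reads the same header word for heap objects and static objects alike, so the only extra work per reference is one comparison against the fake value.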

	- Godmar

PS: On a related note, I noticed that the static fields are not resolved for
preloaded classes at this point.  Keep in mind that static fields must be
walked by the gc.  Plus, the objects referred to by static fields of 
preloaded classes must be kept alive somehow.  Using gc_add_ref on the 
actual reference won't work; a solution may be to register the class as a 
root and walk it, marking the references its static fields point to.
In this case, we'd have to tweak the gc to not mark roots blindly,
which should be easy and okay.
