[kaffe] Fixing Cygwin port using Classpath's pure Java java.util.zip

Thu Dec 12 08:11:49 PST 2002

Hi all,

I've investigated the problems kaffe has on Cygwin.
Here's what happens with the current sources from CVS:

./configure --enable-debug --with-awt=no

runs fine. Sometimes it forgets to run configure in
the libltdl directory, so we may need to run it a few
times in order to get a proper build setup. That's
quite cumbersome, and I'm not sure how to debug it.
Suggestions and patches are welcome.

make

also runs fine; even the class library builds using
kjc. It takes a while on my 28 M box, but it gets the
job done.

export -n CLASSPATH; make check

starts off well on a couple of tests, but then it's
straight into crash-land. Almost every test fails and
dumps core during compilation of the test case, i.e.
kaffe crashes compiling the test cases using kjc.

That's where my debugging skills come in. And I'd
really like to tell you how I squished the bug in no
time, but unfortunately, it wasn't so. Running kaffe
in gdb on Cygwin crashes even harder, sending the
debugger into core-dump-land. So it was quite a
pointless exercise, really, and I didn't want to
debug the debugger first.

Using kaffe's own logging facilities (-vmdebug is
great!), I figured out that the crash happens inside
kaffe's native methods for Zip file handling in
libzip. These methods are part of kaffe's class
library's implementation of java.util.zip.

Fortunately, GNU Classpath provides a pure Java
implementation of java.util.zip. I decided to merge it
in and see if that would work better. In short, it
does, and reduces the number of test case failures to
34. Nothing crashes during compilation. One failure is
probably a bug in Classpath's implementation of Zip
file date handling (DosTimeVerify).

It didn't work out of the box, of course. Enter
circular dependency hell, represented by
java.lang.Character$CharacterProperties. It is a class
used to load and access a compressed form of the
Unicode character properties database. String methods,
for example, use properties of characters in order to
be able to convert them to upper/lower case. Whenever
you use string comparisons with the default
Comparator, you end up using the CharacterProperties.

The compressed database is stored in two binary files,
kaffe/lang/unicode.idx, and kaffe/lang/unicode.tbl.
More information on them is available in
FAQ/FAQ.unicode. CharacterProperties uses
ClassLoader.getResource() in order to get them. That's
the start of the circle. The resource loading request
ends up in the hands of the java.util.zip classes.
They use String comparisons everywhere, and String
comparisons without an initialized CharacterProperties
database lead to interesting, obscure crashes. See
http://www.kaffe.org/pipermail/kaffe/2002-April/007892.html
for the last incarnation, which I fixed by inserting
some Zip file handling code straight into
CharacterProperties. Very ugly stuff.
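
To illustrate the kind of cycle I mean, here is a toy
sketch in plain Java. The class names Props and Loader
are made up for the example and this is not kaffe's
code; it only assumes that a class's static data is
filled in by code that itself needs that class.

```java
public class InitCycleDemo {
    static final StringBuilder log = new StringBuilder();

    // Hypothetical stand-in for CharacterProperties: its
    // table is filled by asking a loader for a resource.
    static class Props {
        static final int[] table = Loader.load("unicode.tbl");

        static boolean ready() {
            return table != null;
        }
    }

    // Hypothetical stand-in for the resource/zip code: while
    // loading, it needs the very class that asked for the data.
    static class Loader {
        static int[] load(String name) {
            // Props is still being initialized by this thread,
            // so its table has not been assigned yet.
            log.append("Props ready during load: ")
               .append(Props.ready());
            return new int[] {1, 2, 3};
        }
    }

    public static void main(String[] args) {
        boolean after = Props.ready(); // triggers the cycle
        System.out.println(log + "; after: " + after);
    }
}
```

In the toy version the loader merely sees an unset
table. Code that dereferences it at that point gets a
NullPointerException instead.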

The best way to solve this circular dependency
problem is to avoid using getResource altogether
inside CharacterProperties. I replaced the binary
files holding the character properties database by two
class files with byte arrays. That unfortunately
increased the size of the database files by around 40
KB, or a factor of 10. The problem with using byte
arrays is that they are initialized *manually* by the
VM at runtime, so the compiled class file contains a
long static initializer method. I'd like to hear
about better methods for storing binary data in class
files.
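
A minimal sketch of the byte-array approach, with a
made-up class name (Table) and made-up data rather
than the real Unicode tables:

```java
public class ByteTableDemo {
    // Hypothetical generated class; real tables are far larger.
    static final class Table {
        // javac compiles this initializer into one array store
        // per element inside the static initializer, so the
        // class file ends up much bigger than the raw data.
        static final byte[] PAIRS = {
            'a', 'A',
            'b', 'B',
            'c', 'C',
        };
    }

    // Look up the upper-case form without touching
    // getResource or any zip code.
    static char toUpper(char c) {
        byte[] p = Table.PAIRS;
        for (int i = 0; i < p.length; i += 2) {
            if (p[i] == c) {
                return (char) p[i + 1];
            }
        }
        return c; // not in the table: leave unchanged
    }

    public static void main(String[] args) {
        System.out.println(toUpper('b') + " " + toUpper('?'));
    }
}
```

The point is only that the data arrives through normal
class loading, so nothing on the lookup path depends
on resource loading.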

With that settled, the rest went well. I needed to
add a few more classes to the Klasses.jar.bootstrap
file, and kaffe on Cygwin now compiles and runs most
tests without trouble. That should give a much better start
for people trying to get kaffe to work on Cygwin.

The tests causing trouble seem to use Threads in one
way or another. I'll try to see if it is something
obvious, but I can't promise much.

That was the good news. The bad news is that I'm
worried about performance when using the pure Java
zip library from GNU Classpath. There is a small
difference when I compile kaffe's class library using
kjc. With Classpath's zip libraries, it's 148 seconds,
with kaffe's code it's 144 seconds on my P3 650 MHz
notebook. I don't think the difference is worth
delaying the patch. But on platforms without a jit,
the difference *might* be much higher.

Also, using the pure Java zip code increased the
amount of memory I need to compile the class library by
about 4 MB. This might be application-specific, though.

In my opinion, it's worth it, because it replaces code
from a native library by Java code, and thus makes
porting and debugging easier. But this is touching a
lot of code paths, so I'd like to hear your opinions
before I change things so radically.

So what do you think? Is fixing the Cygwin port worth
the potential performance loss for interpreter-only
platforms? Are there things that are better in kaffe's
implementation of java.util.zip when compared to
Classpath's?

curious,

dalibor topic

p.s. I'll try to get the patch with a changelog entry
posted tonight. 
