[kaffe] Re: [jcvm-general] Memory allocation benchmark
sergipop at mx3.redestb.es
Mon Jan 17 19:28:52 PST 2005
Sorry about the other forwarded message; I was talking about this one.
On Mon, 17 Jan 2005 12:53:36 -0800
Tzvetan Mikov <mikov at usa.net> wrote:
> I have written a very simple program that is supposed to test the performance
> of the heap manager by making many small allocations. (IMO, typical Java
> programming style encourages excessive creation of temporary objects, so it is
> very important to have fast allocations; garbage collection speed itself is
> not as important.)
> The program calls two versions of the same function in a loop and measures the
> time. The first version allocates a new object for its result, the second
> version stores the result in the same object which is passed as a parameter.
> The difference between the two is supposed to roughly show the speed of memory
> allocation and the performance penalty of standard Java style. Of course, I am
> not confident it shows any of that, but you can judge for yourself ...
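[A minimal sketch of the kind of program described above. This is not the original benchmark, whose source is not included in this message. The class name AllocBench, the Vec type and all method names are hypothetical. The iteration count of one million is inferred from the reported figures (221 ms total at 221 ns per iteration). System.nanoTime() postdates the 1.4.2 JVMs tested here; on those, System.currentTimeMillis() would have to be used instead.]

```java
// Hypothetical reconstruction: the names and the Vec type are assumptions,
// not taken from the original benchmark.
class AllocBench {
    static final int ITERATIONS = 1_000_000; // inferred: 221 ms / 221 ns

    static final class Vec {
        double x, y;
        Vec(double x, double y) { this.x = x; this.y = y; }
    }

    // Written to after each run so the loops have an observable result.
    static volatile double sink;

    // test1 style: allocates a new object for every result.
    static Vec addNew(Vec a, Vec b) {
        return new Vec(a.x + b.x, a.y + b.y);
    }

    // test2 style: stores the result in the object passed as a parameter.
    static void addInPlace(Vec acc, Vec b) {
        acc.x += b.x;
        acc.y += b.y;
    }

    // Returns elapsed nanoseconds for n allocating calls.
    static long timeTest1(int n) {
        Vec acc = new Vec(0, 0);
        Vec step = new Vec(1, 2);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            acc = addNew(acc, step);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc.x + acc.y;
        return elapsed;
    }

    // Returns elapsed nanoseconds for n non-allocating calls.
    static long timeTest2(int n) {
        Vec acc = new Vec(0, 0);
        Vec step = new Vec(1, 2);
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            addInPlace(acc, step);
        }
        long elapsed = System.nanoTime() - start;
        sink = acc.x + acc.y;
        return elapsed;
    }

    public static void main(String[] args) {
        long t1 = timeTest1(ITERATIONS);
        long t2 = timeTest2(ITERATIONS);
        System.out.println("test1 " + t1 / 1_000_000 + " ms. "
                + t1 / ITERATIONS + " ns per iteration");
        System.out.println("test2 " + t2 / 1_000_000 + " ms. "
                + t2 / ITERATIONS + " ns per iteration");
        // The per-allocation estimate is simply the difference of the two.
        System.out.println("Allocation time is " + (t1 - t2) / ITERATIONS + " ns");
    }
}
```

[The reported "Allocation time" is just test1 minus test2 per iteration. A JIT that proves the temporary object never escapes may remove the allocation entirely, so a sketch like this is subject to the same distrust of JIT results voiced later in the message.]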
> Below are the (unsorted) results from running under different JVMs under XP
> and Debian Sarge on the same hardware (Athlon XP 2000, 256 MB RAM).
> There is a caveat: The results are meaningful only if there was no garbage
> collection in the middle of the test. This can be verified by enabling the
> "verbose gc" setting of the JVMs that have it.
> - Sun, IBM and JC did not make a gc cycle, so their results are accurate.
> - I don't know how to enable verbose gc in GIJ or GCJ. Looking at the
> execution times I suspect that GCJ did not make a gc cycle, so its results can
> be trusted.
> - Kaffe and SableVM did a gc cycle and I couldn't prevent it by changing heap
> settings, so only results of test2 (without allocation) have some meaning.
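[One way to check the no-gc caveat from inside the program, as a hedged sketch: compare the JVM's reported collection counts before and after the timed run. This relies on java.lang.management, which was added in Java 5 and so was not available on the 1.4.2 JVMs in this test; other VMs may not implement it at all. GcCheck and its method names are hypothetical. On Sun's JVM the command-line equivalent is the -verbose:gc option.]

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: the class and method names are assumptions.
class GcCheck {
    // Sums the collection counts of all collectors the JVM reports.
    // A collector that does not track its count returns -1 and is skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Runs the test and reports whether the collection count stayed the same.
    static boolean ranWithoutGc(Runnable test) {
        long before = totalGcCount();
        test.run();
        return totalGcCount() == before;
    }

    public static void main(String[] args) {
        boolean clean = ranWithoutGc(() -> {
            Object[] keep = new Object[1];
            for (int i = 0; i < 1_000_000; i++) {
                keep[0] = new Object();
            }
        });
        System.out.println(clean
                ? "no gc cycle during the test"
                : "gc cycle detected; timings are suspect");
    }
}
```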
> Everything should be taken with at least a dozen grains of salt, since in this
> simplistic test it is quite possible that the JITs are doing lots of hidden
> optimizations under the covers, making the whole test meaningless (one of the
> reasons I prefer static compilation - a JIT can never be trusted).
> Windows XP
> Sun Microsystems Inc. Java HotSpot(TM) Client VM 1.4.2_06-b03
> test1 221 ms. 221 ns per iteration
> test2 160 ms. 160 ns per iteration
> Allocation time is 61 ns
> IBM Corporation Classic VM 1.4.2
> test1 190 ms. 190 ns per iteration
> test2 70 ms. 70 ns per iteration
> Allocation time is 120 ns
> Free Software Foundation, Inc. GNU libgcj 3.2 (mingw special 20020817-1)
> test1 521 ms. 521 ns per iteration
> test2 210 ms. 210 ns per iteration
> Allocation time is 311 ns
> Microsoft JVM
> test1 200 ms. 200 ns per iteration
> test2 160 ms. 160 ns per iteration
> Allocation time is 40 ns
> null JC 1.3.1
> test1 605 ms. 605 ns per iteration
> test2 375 ms. 375 ns per iteration
> Allocation time is 230 ns
> Free Software Foundation, Inc. GNU libgcj 3.3.5 (Debian 1:3.3.5-5)
> test1 555 ms. 555 ns per iteration
> test2 305 ms. 305 ns per iteration
> Allocation time is 250 ns
> GIJ Free Software Foundation, Inc. GNU libgcj 3.3.5 (Debian 1:3.3.5-5)
> test1 2127 ms. 2127 ns per iteration
> test2 1303 ms. 1303 ns per iteration
> Allocation time is 824 ns
> Kaffe.org project Kaffe 1.1.4+cvs
> test1 1125 ms. 1125 ns per iteration
> test2 264 ms. 264 ns per iteration
> Allocation time is 861 ns
> Etienne M. Gagnon and others SableVM 1.1.6
> test1 2158 ms. 2158 ns per iteration
> test2 1640 ms. 1640 ns per iteration
> Allocation time is 518 ns
> Here is how I interpret the results:
> - So far commercial JVMs are overwhelmingly faster in both memory allocation
> (test1) and pure execution speed (test2). I suspect Sun, IBM and Microsoft all
> use a copying GC - it should explain the very fast allocations. I hate to
> admit that Microsoft's retired JVM is the absolute winner with 40 ns per
> allocation - 6 times faster than GCJ.
> - Currently JITs are faster than statically compiled code. Kaffe executes
> test2 faster than anybody else under Linux. (This was quite a surprise for
> - GCJ and JC are even in allocation performance, with JC having the edge. This
> is a very pleasant surprise, since GCJ uses Boehm's GC, which has been
> improved and tuned over many years. I suspect we are close to the reasonable
> allocation performance limit that can be expected from a non-copying GC.
> I however noticed something else about JC, which is giving me some worries.
> Running a simple "Hello World" program (with a pre-generated elf), JC consumes
> 130 MB of physical RAM. By comparison the rest of the pack running the same
> program under Linux are within 2-7 MB of physical RAM. (This is not virtual
> space but actual memory - the RES column in the output of top). On my 256 MB
> test machine such excessive memory usage could be lowering the performance in
> the test.
> Is it possible that I have a bad configuration, or if not, can this memory
> consumption be improved? To be honest I was expecting JC to have the lowest
> memory requirements (of course when not generating C) - something within the
> vicinity of 1-2 MB - suitable for embedded use.