Benchmarking kaffe (Was: Re: SPECjvm98)

Mon Mar 25 14:20:21 PST 2002

On Monday 25 March 2002 10:32, Jukka Santala wrote:
> On Sat, 23 Mar 2002, Dalibor Topic wrote:
> > Do you have any specific benchmarks in mind?
>
> Not particularly, but despite the subject line on this thread, I agree
> with the view we should probably stay with open-source benchmarks. Ashes
> looks like a good alternative, and even has a Kaffe regression tests
> section, oddly enough.

I assume that's because the people who put together Ashes are also developing
a bytecode optimization tool called Soot. They use Ashes to evaluate its
performance. Wild speculation: they managed to crash different VMs with the
resulting bytecode, hence the regression tests.

I was also thinking about the Java Grande benchmarks, SciMark 2.0 and jMocha
(I have not checked their licensing status yet). While they are
application-specific, they are a smaller download than Ashes, allowing more
people to participate.

> I think a more interesting question is if we should try to agree on a
> standard runtime environment; the compiler and libraries can have much
> bigger effect on performance than the JVM/KaffeVM in question. Primarily,
> this doesn't matter as long as the environment stays the same from test to
> test, or changes are explicitly noted, but to improve comparability of
> optimizations between platforms etc. there might still be use for agreeing
> on such. It seems like GCC 2.96 would be the preference, as this is a common
> standard, although moving to the 3.x series should be considered. How about
> libraries, though? This is a tougher nut to crack, although admittedly a
> proper VM benchmark shouldn't depend so much on library performance.

I agree that environments don't matter as long as one is comparing the
results to one's own earlier runs. I doubt that comparing one's results with
those of others will lead to anything beyond wild speculation.

If I recall correctly, the original motivation was to notice and avoid
performance regressions on platforms unavailable to the patch developer.
Different Linux distributions, for example, ship different versions of gcc
with their latest offerings. Requiring that everyone compile kaffe with
version x just puts an additional roadblock in the way of people who want to
evaluate performance. I would rather see whether there are any regressions in
the environments that people actually use than in a synthetic one.
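
To make the "compare against your own earlier runs" idea concrete, here is a
minimal sketch in Java. Everything in it is an assumption made up for
illustration: the class name RegressionCheck, the 10% tolerance, the benchmark
labels and the timings are placeholders, not measurements, and none of it is
part of kaffe or of any of the benchmark suites mentioned above.

```java
// Hypothetical helper: flags a benchmark as regressed when the current run
// is slower than the contributor's own earlier run by more than a tolerance.
// Both timings are assumed to come from the same machine, compiler and
// libraries; timings from different environments are not comparable.
public class RegressionCheck {

    // Relative change against the baseline; a positive value means slower.
    public static double relativeChange(double baselineMs, double currentMs) {
        return (currentMs - baselineMs) / baselineMs;
    }

    // True when the current time exceeds the baseline by more than
    // the tolerance (0.10 stands for 10%).
    public static boolean isRegression(double baselineMs, double currentMs,
                                       double tolerance) {
        return currentMs > baselineMs * (1.0 + tolerance);
    }

    public static void main(String[] args) {
        // Placeholder labels and timings in milliseconds, not real results.
        String[] names = { "benchmark-a", "benchmark-b" };
        double[] baseline = { 1000.0, 2000.0 };
        double[] current = { 1200.0, 2050.0 };
        double tolerance = 0.10; // assumed threshold, tune to the noise level

        for (int i = 0; i < names.length; i++) {
            long percent = Math.round(
                relativeChange(baseline[i], current[i]) * 100.0);
            String verdict = isRegression(baseline[i], current[i], tolerance)
                ? "possible regression" : "within tolerance";
            System.out.println(names[i] + ": " + percent + "% (" + verdict + ")");
        }
    }
}
```

A contributor would keep the baseline numbers from a run before applying a
patch and report only the relative change, which stays meaningful even when
everyone uses a different gcc and different libraries.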

As you point out, there are a lot of external influences on VM performance. I
don't feel that we should specify a standard runtime environment, since the
standard environment is a moving target on any platform. In my opinion,
benchmark results tend to become irrelevant after some time anyway. Sticking
to a standard environment would just limit the number of people able to
contribute results.

What kind of contribution process would be suitable for such an effort? Emails 
to a specific mailing list? Web forms?

dalibor topic
