Benchmarking kaffe (Was: Re: SPECjvm98)

Tue Mar 26 13:32:54 PST 2002

I'll definitely add a URL to any benchmark results that are available.
If you need a place to put up such a webpage, that could be hosted on
kaffe.org as well...

A plot of benchmark performance vs. historical releases
would be really cool (especially if we do some work to improve the
performance).

There are so many different benchmarks out there, it would be hard to
pick a standard set that would remain consistent over time (not to
mention keeping the environment consistent over time).  I think the best
approach to constructing such a plot would be to choose a particular
benchmark which illustrates the performance of a particular part of the
VM, and build a bunch of older versions of Kaffe, and run them, one
after the other, through the benchmark.
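
As a rough sketch of that approach (not an existing Kaffe tool), a small
Python script could time one benchmark under each build in turn.  The
function names, the build list and the benchmark arguments are all
placeholders I made up for illustration; the output is plain
whitespace-separated columns with a "#" comment header, which gnuplot can
read.

```python
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the best wall-clock time in seconds."""
    best = None
    for _ in range(runs):
        start = time.perf_counter()
        # check=True makes a crashing VM raise instead of reporting a bogus time.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def benchmark_builds(builds, benchmark_args, runs=3):
    """Time the same benchmark under each build, in the order given.

    builds: list of (version_label, path_to_vm_executable) pairs
            (hypothetical; point these at wherever the old builds live).
    benchmark_args: arguments passed unchanged to every VM.
    Returns a list of (version_label, best_seconds) pairs.
    """
    results = []
    for label, vm in builds:
        results.append((label, time_command([vm] + list(benchmark_args), runs)))
    return results


def format_for_gnuplot(results):
    """Render results as 'version seconds' rows under a comment header."""
    lines = ["# version seconds"]
    for label, seconds in results:
        lines.append(f"{label} {seconds:.3f}")
    return "\n".join(lines) + "\n"
```

Taking the best of several runs is just one way to damp noise from the
machine; the same compiler, libraries and hardware would still have to be
used for every build for the numbers to be comparable.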

Benchmarks are quite a tricky thing - I haven't decided in my own mind
yet which are the most important ones.  It might make sense to build
our own benchmark suite using real applications that people use kaffe
for.

Cheers,

 - Jim

On Tue, 2002-03-26 at 05:07, Jukka Santala wrote:
> 
> On Mon, 25 Mar 2002, Dalibor Topic wrote:
> > I agree that environments don't matter as long as one is comparing the results to 
> > one's earlier runs. I doubt that comparing one's results with those of others 
> > will lead to anything beyond wild speculation.
> 
> If you think environments don't matter, then you're not agreeing with me
> ;) The bare minimum for the benchmarks to be useful is that we should know
> the environment used. Let it be stressed that I thought of this more in
> terms of "recommendation" than "requirement", ie. "The benchmarks would be
> most useful if produced using such and such compiler and library such and
> such, but if you prefer something else, state so with the benchmark
> results". To put it in crude terms, this could for example prevent people
> from re-implementing an optimization already provided in some compiler; or
> conversely, breaking some compiler-based optimization.
> 
> > Sticking to a standard environment would just limit the number of
> > people able to contribute results.
> 
> That's one of the things I'm afraid of. The last thing we want is people
> upgrading their compiler/libraries on the run, and forgetting to mention
> it in the benchmarks, leading everybody to think they've broken something
> terribly, or found a new optimization. While certainly we can't, and
> shouldn't, stop people from providing any benchmarks they feel they
> should, there's no particular advantage to having a high number of people
> contributing them. A couple of motivated people would be sufficient; and
> if they're not interested enough to set up a separate toolchain to ensure
> all the benchmarks are built with the same environment, I'd rather not
> rely on the data they provide for anything significant.
> 
> > What kind of contribution process would be suitable for such an
> > effort? Emails to a specific mailing list? Web forms?
> 
> Well, I was initially thinking of having both a gnuplot graph of the
> development of the benchmark performance over time, as well as a textual
> log of the specific results. In the simplest case, this would only
> require an e-mail notification of the location of the graphs to this list,
> and the URL could then be added to the official web-page if deemed
> useful/reliable enough. If enough data is provided, it might be worth it
> just to write a script on the web-site machine, that would gather the
> benchmark logs and collate combined graphs from them.
> 
> But, as implied, if we're aiming for just "any benchmark", for posterity
> and some pretend-comparisons between system performances, then all bets
> are off, and we should probably have some sort of web-form for users to
> input in that "Herez the rezultz I gotz from running my own
> number-calculation benchmark, calculating how many numbers there are from
> 1 to 1000 while playing Doom in another Window. This is OBIVIOUSLY what
> everybody else will be doing with the VM's, so I think this counts. I'm
> not sure I even have a compiler." ;)
> 
>  -Jukka Santala
>