pthread performance

Patrick Doyle doylep at eecg.toronto.edu
Mon Jul 17 15:38:16 PDT 2000



I was hoping to use Kaffe for research into JVM scalability.  However, it
seems that Kaffe w/pthreads has serious scalability problems.  Does anyone
have insight into why that might be?  Is pthreads support in its infancy?

I wrote a program which spawns threads that go into a tight loop creating
String objects.  I ran it with Kaffe's default user-level threads on a
dual-Celeron machine, and I got these results:

   Mult    Workers Objects Spins   Time(ms)
   500     1       2000    0       14485
   500     2       1000    0       14493
   500     4       500     0       14454

This is very good performance, IMHO.  These three runs all create the same
number of objects (i.e., Mult x Workers x Objects is constant), and they all
take the same amount of time to run.  This is what you'd expect from a
user-level threads implementation, in which all threads multiplex a
single CPU.
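
For reference, here is a minimal sketch of the kind of benchmark described
above.  It is not the original program: the class name AllocBench, the
method run, and the interpretation of "Mult" as an outer repeat count and
"Spins" as busy-loop iterations after each allocation are all assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AllocBench {
    // Total Strings created in the last run. Each worker updates it once at
    // the end, so it adds no contention inside the allocation loop.
    public static final AtomicLong created = new AtomicLong();

    // Runs the benchmark and returns elapsed wall-clock milliseconds.
    public static long run(final int mult, int workers, final int objects, final int spins) {
        created.set(0);
        final char[] template = {'k', 'a', 'f', 'f', 'e'};
        Thread[] threads = new Thread[workers];
        long start = System.currentTimeMillis();
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                long count = 0;
                long sink = 0; // consumes results so the loops are not obviously dead
                for (int m = 0; m < mult; m++) {          // assumed meaning of "Mult"
                    for (int i = 0; i < objects; i++) {
                        String s = new String(template);  // a fresh String every iteration
                        sink += s.length();
                        count++;
                        for (int k = 0; k < spins; k++) { // assumed meaning of "Spins"
                            sink += k;
                        }
                    }
                }
                if (sink < 0) System.out.println(sink);   // not expected to fire; keeps sink live
                created.addAndGet(count);
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Same three configurations as in the tables: Mult, Workers, Objects (Spins = 0).
        int[][] configs = {{500, 1, 2000}, {500, 2, 1000}, {500, 4, 500}};
        for (int[] c : configs) {
            System.out.println(c[1] + " workers: " + run(c[0], c[1], c[2], 0) + " ms");
        }
    }
}
```

Every configuration in main creates the same total number of Strings, so
any difference in the printed times comes from the threading
implementation rather than from the amount of work.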

Then I tried the pthreads implementation.  These are the best-case results
I expected from 2 CPUs (estimates, not measurements):

   Mult    Workers Objects Spins   Time(ms)
   500     1       2000    0       14000
   500     2       1000    0       7000
   500     4       500     0       7000

Because pthreads can be scheduled on different CPUs, I'd expect the
2-thread version to require half the time of the 1-thread version.  At
that point both CPUs are saturated, so the 4-thread version wouldn't
display any further improvement.  Of course, all of this is subject to
the OS's whims with regard to thread migration.  However, even with no
migration at all, pthreads should be no worse than user-level threads.
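
The reasoning above amounts to a simple model: with a fixed amount of
evenly divided work, the ideal time is the 1-thread time divided by
min(workers, CPUs).  Here is a small sketch of that arithmetic; the class
and method names are made up for illustration, and the model ignores
scheduling and synchronization overhead entirely.

```java
public class IdealScaling {
    // Ideal wall-clock time for a fixed amount of evenly divided work:
    // at most min(workers, cpus) threads can make progress at once.
    public static double idealMillis(double singleThreadMillis, int workers, int cpus) {
        return singleThreadMillis / Math.min(workers, cpus);
    }

    public static void main(String[] args) {
        int cpus = 2;                 // the dual-Celeron machine
        double oneThread = 14000;     // the estimated 1-thread time from the table
        for (int workers : new int[] {1, 2, 4}) {
            System.out.println(workers + " workers: "
                    + idealMillis(oneThread, workers, cpus) + " ms");
        }
    }
}
```

With cpus set to 1 the same function returns the 1-thread time for every
worker count, which matches the user-level threads case.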

Well, here's what I got:

   Mult    Workers Objects Spins   Time(ms)
   500     1       2000    0       14186
   500     2       1000    0       33340
   500     4       500     0       74959

Needless to say, this is truly awful.  I noted that CPU usage fell off
dramatically during the multi-threaded runs.  With 2 threads, each
thread used only about 20% CPU, and with 4, each used only about 2-7%.
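
To put a number on it, here is a small sketch that computes the slowdown
of the measured pthreads runs relative to the measured 1-thread pthreads
run.  The class and method names are made up for illustration; the
timings are the ones from the table above.

```java
public class Slowdown {
    // Ratio of an observed time to a baseline time; values above 1.0 mean slower.
    public static double slowdown(long baselineMillis, long observedMillis) {
        return (double) observedMillis / baselineMillis;
    }

    public static void main(String[] args) {
        long baseline = 14186;                // measured 1-worker pthreads run
        long[][] runs = {{2, 33340}, {4, 74959}}; // {workers, measured ms}
        for (long[] r : runs) {
            System.out.println(r[0] + " workers: "
                    + String.format("%.2f", slowdown(baseline, r[1])) + "x slower");
        }
    }
}
```

That works out to roughly 2.3x slower with 2 threads and roughly 5.3x
slower with 4, where the best case would have been 2x faster.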

Any insights would be helpful.

PS.  I had to manually add a function called "jthread_stackcheck" which
simply returns 1, just to make the code compile.

--
Patrick Doyle
doylep at eecg.utoronto.ca

