Problem with StringBuffer

Tatu Saloranta tatu at hypermall.net
Tue Mar 28 16:51:06 PST 2000


I finally found the reason for the 'memory leak' in my Java program,
and the culprit in this case was the StringBuffer implementation.

I have a lexer (made with JFlex) that uses a StringBuffer for
constructing the strings for certain tokens. The StringBuffer is
reused: each time a new such token is created,
StringBuffer.setLength(0) is called. However, in Kaffe's
implementation at least, this does not change the actual length of the
underlying buffer. That wouldn't be a big problem in itself, as there's
just one StringBuffer instance. But, alas, Strings and StringBuffers
are optimized so that when new String(StringBuffer) is called, the
character array is shared between the String and the StringBuffer
until the StringBuffer needs to change the data (or the length of the
buffer via setLength()). The problem here is that as soon as I
encounter a long token (~4000 chars in this case), _all_ tokens after
that will use up the same 4k of memory (actually, with Kaffe it's 8192
bytes, as the array keeps on doubling in size) regardless of their
actual length! Even when the sharing ends (as a result of setLength(),
for example), the String instance simply copies the huge array without
checking the length of its contents.
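
A minimal sketch of the reuse pattern described above, for anyone who
wants to watch the capacity. The class and method names
(BufferReuseDemo, appendToken, capacityAfterLongToken) are made up for
illustration and are not taken from the lexer. It only shows that
setLength(0) leaves the capacity alone; whether the resulting String
shares or copies the oversized array depends on the library version,
and current JDKs, as far as I know, copy only the characters in use,
so the leak itself may not reproduce there.

```java
final class BufferReuseDemo {

    /** Appends n filler characters, standing in for one token's text. */
    static void appendToken(StringBuffer buf, int n) {
        for (int i = 0; i < n; i++) {
            buf.append('x');
        }
    }

    /** Returns the buffer's capacity after one long token and a reset. */
    static int capacityAfterLongToken(int longTokenLength) {
        StringBuffer buf = new StringBuffer();  // starts with a small default capacity
        appendToken(buf, longTokenLength);      // array grows to hold the long token
        buf.setLength(0);                       // clears the content, not the array
        return buf.capacity();
    }

    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer();
        int[] tokenLengths = {5, 4000, 3, 7};   // one long token among short ones
        for (int len : tokenLengths) {
            buf.setLength(0);                   // reuse the same buffer per token
            appendToken(buf, len);
            String token = buf.toString();
            System.out.println("token length " + token.length()
                    + ", buffer capacity " + buf.capacity());
        }
    }
}
```

Once the 4000-char token has passed through, the reported capacity
should stay at or above 4000 for every short token that follows.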

The same problem occurs in Sun's JDK as well, although by printing
StringBuffer.capacity() regularly, I noticed that the behaviour is
not 100% identical. In both cases, though, I end up getting an
OutOfMemoryError... :-)

I'm not sure what should be done about this; I can 'fix' the problem
in my program by instantiating new StringBuffers (I do that
now if StringBuffer.capacity() exceeds 80 chars). Still, this
is a somewhat subtle but fatal problem, and other people have
probably run into it at some point. Or perhaps they just thought
it's because Java is such a memory hog... :-)

Perhaps the char array of StringBuffer could be deflated (not just
inflated), either on setLength() (say, if the new length < old
length / 2) or on destringize(), depending on how much wasted space
the new array would have?
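
The application-side workaround mentioned above could look roughly
like the sketch below. TokenBuffer, finishToken and
MAX_RETAINED_CAPACITY are hypothetical names; only the 80-char
threshold comes from the description above. It is a sketch of the
idea, not the code actually used in the lexer.

```java
final class TokenBuffer {

    /** Largest capacity kept for reuse; 80 is the threshold quoted above. */
    static final int MAX_RETAINED_CAPACITY = 80;

    private StringBuffer buf = new StringBuffer();

    void append(char c) {
        buf.append(c);
    }

    /** Returns the current token's text and prepares the buffer for the next one. */
    String finishToken() {
        String token = buf.toString();
        if (buf.capacity() > MAX_RETAINED_CAPACITY) {
            buf = new StringBuffer();   // drop the oversized array entirely
        } else {
            buf.setLength(0);           // small enough, keep reusing it
        }
        return token;
    }

    int capacity() {
        return buf.capacity();
    }

    public static void main(String[] args) {
        TokenBuffer tb = new TokenBuffer();
        for (int i = 0; i < 4000; i++) {
            tb.append('x');
        }
        String longToken = tb.finishToken();
        System.out.println("long token length " + longToken.length()
                + ", capacity afterwards " + tb.capacity());
    }
}
```

The cost is one extra small allocation after each oversized token,
which seems cheap next to every later token pinning a multi-kilobyte
array. Later Java versions (5 and up) also offer
StringBuffer.trimToSize(), which asks the buffer to release unused
capacity, but that was not available at the time of this post.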

Any ideas?

-+ Tatu +-

