[kaffe] CVS kaffe (guilhem): Implemented stack overflow

Guilhem Lavaux guilhem at kaffe.org
Fri Apr 23 23:31:02 PDT 2004


On Fri, 2004-04-23 at 21:50, Timothy Stack wrote:
> > 
> > On Thu, 2004-04-22 at 23:08, Timothy Stack wrote:
> > > > Implemented stack overflow detection.
> > > 
> > > Can you give us a more detailed explanation of what this all is?
> > > 
> > 
> > Sure. I am writing some comments in the code now. I wanted to be able
> > to detect stack overflows the same way as null pointer exceptions. There
> > were two problems with that: detecting the real boundaries of the stack
> > and being able to run the signal handlers on another stack.
> 
> So, I actually implemented most of this in the JanosVM (see the 
> kaffe/kaffevm/systems/unix-jthreads2 directory) and that implementation 
> touched quite a few things that didn't seem to show up in your checkin.
> Of course, you might still have other things to check in or I'm 
> misunderstanding the scope of your work.
> 
> > detectStackBoundaries() sets up a temporary signal handler for SIGSEGV,
> > places a jump point and tries to overflow the stack using a recursive
> > function (infiniteLoop). This is the slowest part of the code indeed
> > because if the stack is very large you may have to wait some time before
> > reaching the end of it...
> 
> Many operating systems only allocate stack pages when the program attempts 
> an access, so it's very possible that this will allocate a lot of memory 
> that doesn't get freed and is never used except for this detection.
> 

True. Maybe we should change this then. I have looked at how boehm-gc
detects it: they have an OS-dependent function to retrieve it. Apparently
each of the different OSes has a way to retrieve it cleanly.
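
To give an idea of what such an OS-dependent lookup can look like, here is
a minimal sketch for Linux only. It is not boehm-gc's code: find_main_stack
is a hypothetical helper that scans /proc/self/maps for the "[stack]" entry.
Note that the low address it reports is only what is mapped at that moment,
so the real limit would still have to come from getrlimit(RLIMIT_STACK).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (Linux only): find the main thread's stack mapping
 * by scanning /proc/self/maps for the "[stack]" entry.
 * On success stores the range [*low, *high) and returns 0; otherwise -1. */
int find_main_stack(unsigned long *low, unsigned long *high)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];
    int found = -1;

    if (maps == NULL)
        return -1;

    while (fgets(line, sizeof line, maps) != NULL) {
        /* Each entry starts with "start-end" in hexadecimal. */
        if (strstr(line, "[stack]") != NULL &&
            sscanf(line, "%lx-%lx", low, high) == 2) {
            found = 0;
            break;
        }
    }
    fclose(maps);
    return found;
}
```

No recursion is needed, so no stack pages get touched just for the
detection.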

> > Anyway, when the stack end is reached a
> > sigsegv is raised. This gives us the state of the stack pointer when the
> > segv occurred and so one of the boundaries of the stack. With getrlimit you
> > get the other boundary of the stack using its size.
> 
> It's usually safe to assume that the very top/bottom of memory is the other 
> end of the stack...
> 
> > Now to detect stack overflows while running programs you also only have
> > to check the stack pointer. If it is outside of the boundaries while a
> > segv is raised we have to throw a StackOverflowError, in the other case
> > it is NullPointerException.
> 
> But this will only work for the main thread...  You need to set up guard
> pages on all of the other threads to cause the SIGSEGV.  Also, SIGSEGV is
> synchronous and does not block the async signals (e.g. timer), so you need
> multiple signal stacks to handle the possibility of multiple threads doing
> a null deref.  In the JanosVM implementation, SIGBUS/SIGSEGV are captured
> on an alternate signal stack and then passed off to a separate per-thread
> stack where code can continue as normal for a little while (e.g. allocate
> and throw the exception).

Hmmm... we only have to know the boundaries for the main thread, as the
other stacks are allocated by kaffe itself. I still have to look at guard
pages for the case where we want to do some work after a stack overflow (I
think you are using mprotect protection/deprotection).

Concerning SIGBUS/SIGSEGV being passed to a per-thread stack, I don't think
we need it for a first approach (actually it was not implemented in kaffe
before either, and it ran "perfectly" ;) until now).

Would it be possible to merge jthreads2 one day? It seems a lot cleaner
than what we have at the moment.

> 
> There is also the matter of native code that cannot be rewound like java.  
> For example, in the JanosVM impl I added checks to all of the non-trivial 
> soft_ methods to make sure there was sufficient stack, otherwise an 
> overflow was thrown.  But, I think there were still a few cases that I 
> did not cover, especially for JNI...
> 
> > This code is entirely optional: if sigaltstack, SA_ONSTACK or
> > STACK_POINTER is not defined, the old buggy behaviour is used.
> 
> How is it buggy?
> 

Taking the address of a variable and assuming that the stack
bottom/top is somewhere before/after it is not really a clean way of
doing things, I guess...

The garbage collector may miss objects, and jthread_on_current_stack
may return false even though the pointer is in the stack.
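
With real boundaries recorded, the membership test becomes a plain range
check. A minimal sketch, where stack_bounds and ptr_on_stack are hypothetical
names and not the actual jthread_on_current_stack code:

```c
#include <stdint.h>

/* Hypothetical record of a thread's stack range [low, high). */
typedef struct {
    uintptr_t low;
    uintptr_t high;
} stack_bounds;

/* Returns 1 if p lies inside the recorded stack range, 0 otherwise. */
int ptr_on_stack(const stack_bounds *b, const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    return addr >= b->low && addr < b->high;
}
```

If low or high is only guessed from the address of some local variable,
frames beyond that variable fall outside the range, and that is where the
wrong answers come from.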

Cheers,

Guilhem.




