[kaffe] [PATCH] pthread spec violation (leads to deadlock on recent glibc)

Wed Sep 29 09:46:33 PDT 2004

Noa Resare wrote:
> On tis, 2004-09-28 at 22:28 +0200, Guilhem Lavaux wrote:
> 
>>Hi Noa,
>>
>>>- thereafter jthread_exit() gets called in exitThread() and is
>>>advertised as not returning. This is not true. As can be found in 
>>>systems/unix-pthreads/thread-impl.c if certain conditions are met it
>>>just returns. when exitThread() returns control is moved to tRun in
>>>thread-impl.c
>>
>>jthread_exit() should not return. I haven't noticed that. :( I would 
>>suggest a special longjmp to reset the stack and go back in the tRun 
>>loop directly (though I don't know if it is really legal).
> 
> 
> Please don't do that. IMHO the primary design goal of software in
> general and free software in particular should be readability and
> understandability. From that the other good stuff (stability,
> predictability, performance, changeability and ease of debugging)
> follows naturally. longjmp() magic should be reserved to those cases
> when it is absolutely necessary. 
> 
> I think that it is perfectly reasonable that thread execution returns to
> the threading implementation (in this case tRun()) by a simple
> (implicit) return for possible thread reuse. The commentaries should be
> updated though.
> 

Ok. I have reread the code it could make sense. But the name should be 
changed as it is not a true "exit thread". ;) I would suggest something 
which will go with "cleanup before exiting" or "cleanupToExit".

> 
>>>- before tRun() goes on to remove the current thread from the
>>>activeThreads list it calls TLOCK() calls pthread_cond_wait() (via
>>>lockStaticMutex() -> locks_internal_lockMutex() -> slowLockMutex() ->
>>>ksemGet() -> jcondvar_wait())
>>>
>>>The obvious way of fixing this IMHO is to delay the
>>>unlinkNativeAndJavaThread() invocation until the threads implementation
>>>has removed the current thread from the activeThreads list. However this
>>
>>Hmm... maybe we should change how the thread context is destroyed. 
>>unlinkNativeAndJavaThread may be a callback for the threading system so 
>>it is called whenever a thread exits and it has already been removed 
>>from the active list. It will keep the VM parts in the VM and will give 
>>us a way to introduce some sort of specific "finalization".
> 
> 
> Please have a look at the attached patch. I have tested it with jthreads
> and pthreads with no regressions.
> 

It looks Ok. But I will also rename the function as the inter-dependence 
of modules increases. The standard I have personally used atm is to name 
  the core VM functions with the prefix "KaffeVM_". So we may know in 
future that unlinkNativeAndJavaThread is a universal VM function.

> 
>>ATM, I am only aware of unix-jthreads and unix-pthreads users. The other 
>>threading system are well behind the current design I think (I've 
>>already evolved the interface but not all as I don't have access to 
>>those systems).
> 
> 
> Good to know. 
> 
> More philosophically I have been thinking about why a mutex would need
> thread specific state to work in the first place. My experience of
> threading implementations other than pthreads in Linux is nonexistent so
> I suppose I'm overlooking something but from where I'm standing
> pthread_mutex_* provides this functionality without any need for
> threadlocal state. And I hear they are fast also. If we need more
> functionality (such as timeouts) we could always store the condvar in
> the lock structure.
> 
> Of course moving away from the current implementation in kaffe is not a
> task for me but I'm curious anyway. Gurus out there, enlighten me :)
> 

To the end you always need the thread state anyway. Look at 
jthreads/jthread_condvar. The thread must be pushed in a wait queue and 
rescheduled. From the user point of view, it should be possible prevent 
the locking system to use the local thread data: the problem is that you 
will have to initialize a semaphore for each lock you want to setup (and 
there are many in the core VM). This will result in more memory 
consumption (maybe small), the initialization will slow down (I guess) 
and the complexity of the locking variable in locks.c will increase 
(COMPARE_AND_EXCHANGE will have to access to an internal member of a 
locking structure). ATM, I think the locking system is fast as in the 
better case it only "compare and swap" some data to (un)lock and in the 
worst case it enters a special loop but as it will have to wait anyway...

Cheers,

Guilhem Lavaux.