With this type of thread management they're required to trap into the kernel to acquire a mutex. There are obviously severe performance issues with this, which partly explains some of the blocking and performance issues in Android's garbage collector.
Android runs on Linux. Linux uses Futexes (Fast Userspace Mutexes) for locking abstractions like semaphores and mutexes. From Wikipedia -
"A futex consists of a kernelspace wait queue that is attached to an aligned integer in userspace. Multiple processes or threads operate on the integer entirely in user space (using atomic operations to avoid interfering with one another), and only resort to relatively expensive system calls to request operations on the wait queue (for example to wake up waiting processes, or to put the current process on the wait queue). A properly programmed futex-based lock will not use system calls except when the lock is contended; since most operations do not require arbitration between processes, this will not happen in most cases."
"... thread management they're required to trap into the kernel to acquire a mutex."
Of course GC is handled by the VM. That's not the point here, though. The point I thought you made was that the GC uses mutexes and that they're required to trap into the kernel (at least that's how it reads above). That's not the case on Linux: the GC uses mutexes that are implemented as futexes[1], which stay in user space most of the time.
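To make the "stays in user space" part concrete, here is a minimal sketch of a futex-based lock. It is not bionic's or glibc's actual code: the names sketch_lock, sketch_lock_acquire, sketch_lock_release and the futex_calls counter are made up for illustration, it is Linux-only, and it assumes atomic_int has the same layout as a plain int (true on the usual Linux toolchains, but not something the C standard promises).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock for illustration only.
   state: 0 = unlocked, 1 = locked, 2 = locked and there may be waiters. */
typedef struct {
    atomic_int state;
} sketch_lock;

/* Counts how many times we actually entered the kernel (demo only). */
static atomic_long futex_calls;

static long futex(atomic_int *addr, int op, int val) {
    atomic_fetch_add(&futex_calls, 1);
    /* Assumption: atomic_int is laid out like the 32-bit int futex expects. */
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void sketch_lock_acquire(sketch_lock *l) {
    int expected = 0;
    /* Fast path: uncontended, a single compare-and-swap, no system call. */
    if (atomic_compare_exchange_strong(&l->state, &expected, 1))
        return;
    /* Slow path: mark the lock as contended, then sleep in the kernel
       for as long as the state is still 2. Retry after every wake-up. */
    while (atomic_exchange(&l->state, 2) != 0)
        futex(&l->state, FUTEX_WAIT_PRIVATE, 2);
}

static void sketch_lock_release(sketch_lock *l) {
    int prev = atomic_exchange(&l->state, 0);
    assert(prev != 0 && "released a lock that was not held");
    /* Only if someone may be sleeping do we pay for a system call. */
    if (prev == 2)
        futex(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

If a single thread acquires and releases this lock with nobody else competing, futex_calls stays at zero. The kernel is only involved once a second thread finds the lock already held.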
All threads are native pthreads. All threads, except the JDWP debugger
thread, are visible to code running in the VM and to the debugger. (We
don't want the debugger to try to manipulate the thread that listens for
instructions from the debugger.) Internal VM threads are in the "system"
ThreadGroup, all others are in the "main" ThreadGroup, per convention.
The GC only runs when all threads have been suspended.
Regarding the pthread_mutex_lock code you posted: I'd say it shows that no kernel call is involved in the uncontended case, assuming the mutexes used are of kind PTHREAD_MUTEX_NORMAL. atomic_exchange most likely runs entirely in user space[1], so pthread_mutex_lock only reaches "wait" (wait_event?), which is probably the kernel call, if the mutex is already locked. In other words, those are futexes.
That said, I don't understand how the memory management makes use of mutexes (once a thread (any thread?) realizes that the GC needs to run, how does it wait until all threads have reached a safe point?), and I don't have time to check at the moment.
It is somewhat difficult to read this code (due to the formatting), but this seems like a futex to me: if the lock is not being contended, the atomic_exchange() will take the lock without entering kernel space. If you want to argue with blinkingled, you should look at whether this specific lock is under contention or not, not paste large blocks of code that just demonstrate and prove his argument. ;P
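For what it's worth, checking whether a specific lock is contended doesn't take much. A rough sketch, assuming you can wrap the lock in question: try pthread_mutex_trylock first and count how often it comes back with EBUSY. The counted_mutex type and its functions are hypothetical names for this example, not anything from Android's source, and the extra trylock slightly changes timing, so treat the numbers as an estimate.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical wrapper for illustration: a mutex plus two counters. */
typedef struct {
    pthread_mutex_t mutex;
    atomic_ulong acquisitions; /* total successful lock operations */
    atomic_ulong contended;    /* how often the lock was already held */
} counted_mutex;

static void counted_mutex_init(counted_mutex *m) {
    int rc = pthread_mutex_init(&m->mutex, NULL);
    assert(rc == 0);
    (void)rc;
    atomic_init(&m->acquisitions, 0);
    atomic_init(&m->contended, 0);
}

static void counted_mutex_lock(counted_mutex *m) {
    /* Non-blocking attempt first: EBUSY means another thread holds it. */
    int rc = pthread_mutex_trylock(&m->mutex);
    if (rc == EBUSY) {
        atomic_fetch_add(&m->contended, 1);
        /* Now block for real; this is where a futex wait can happen. */
        rc = pthread_mutex_lock(&m->mutex);
    }
    assert(rc == 0);
    (void)rc;
    atomic_fetch_add(&m->acquisitions, 1);
}

static void counted_mutex_unlock(counted_mutex *m) {
    int rc = pthread_mutex_unlock(&m->mutex);
    assert(rc == 0);
    (void)rc;
}
```

If contended stays near zero relative to acquisitions, the lock is almost always taken on the user-space fast path, and the kernel-trap argument doesn't apply to it.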
Yeah, pthreads is implemented using futexes. I don't care about the implementation details. It doesn't matter, since it's still lock-based (albeit more efficient due to fewer system calls). My original point is that threading is still lock-based. Replacing your lock-based code with queues eliminates many of the penalties associated with locks and also simplifies your remaining code. Instead of using a lock to protect a shared resource, you can create a queue to serialize the tasks that access that resource. Queues do not impose the same penalties as locks. For example, queueing a task does not require trapping into the kernel to acquire a mutex (or however the mutex is implemented).
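As a rough illustration of the queue idea (a sketch only, not how any particular framework implements its queues): producers push tasks onto a lock-free list with a compare-and-swap loop, and a single owner thread drains the list and runs the tasks in submission order against the resource. The names task, task_queue, queue_push, queue_drain and append_digit are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical task: a function applied to the shared resource. */
typedef struct task {
    void (*run)(void *resource, int arg);
    int arg;
    struct task *next;
} task;

typedef struct {
    _Atomic(task *) head; /* most recently pushed task */
} task_queue;

/* Any thread may call this. It takes no mutex; the only kernel entry
   it might cause is malloc occasionally asking the OS for memory. */
static void queue_push(task_queue *q, void (*run)(void *, int), int arg) {
    task *t = malloc(sizeof *t);
    assert(t != NULL);
    t->run = run;
    t->arg = arg;
    t->next = atomic_load(&q->head);
    /* On failure the CAS reloads the current head into t->next. */
    while (!atomic_compare_exchange_weak(&q->head, &t->next, t)) {
    }
}

/* Only the single owner thread calls this. Returns the number of tasks run. */
static int queue_drain(task_queue *q, void *resource) {
    /* Grab the whole pending list in one atomic step. */
    task *lifo = atomic_exchange(&q->head, NULL);
    /* The list is newest-first; reverse it to get submission order. */
    task *fifo = NULL;
    while (lifo != NULL) {
        task *next = lifo->next;
        lifo->next = fifo;
        fifo = lifo;
        lifo = next;
    }
    int count = 0;
    while (fifo != NULL) {
        task *next = fifo->next;
        fifo->run(resource, fifo->arg); /* the only code touching the resource */
        free(fifo);
        fifo = next;
        count++;
    }
    return count;
}

/* Example task: treats the resource as an int and appends a decimal digit. */
static void append_digit(void *resource, int arg) {
    int *value = resource;
    *value = *value * 10 + arg;
}
```

Because only the draining thread ever touches the resource, no lock around the resource is needed. One caveat: the sketch leaves out how an idle owner thread sleeps and gets woken up, and that part usually ends up on a futex or a similar kernel primitive again.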