
As far as I can establish with a cursory skim, the article describes priority inversion, and states:

"OS schedulers might implement different techniques to lessen the severity of this problem, but it's far from being solved once and for all."

The article fails to describe what "solved once and for all" might mean.

Given that Priority Ceiling Protocol and Priority Inheritance are two common solutions used in real-time schedulers, I would like to understand why priority inversion is not a solved problem.

On the other hand, I have yet to learn whether Mac OS has any synchronisation primitives that mitigate mutex priority inversion.


> Given that Priority Ceiling Protocol and Priority Inheritance are two common solutions used in real-time schedulers, I would like to understand why priority inversion is not a solved problem.

Because Windows and Linux are not designed for hard real time. There's an "rt_mutex" mechanism in Linux, which does priority inheritance, but it's not used much.
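
For reference, the userspace route to that machinery is the POSIX mutex protocol attribute; a minimal sketch, assuming Linux with glibc (where PTHREAD_PRIO_INHERIT maps onto PI futexes):

    #include <pthread.h>

    /* Sketch: create a mutex with priority inheritance. While a
       higher-priority thread waits on it, the kernel boosts the
       current owner to the waiter's priority. */
    int make_pi_mutex(pthread_mutex_t *m) {
        pthread_mutexattr_t attr;
        int err = pthread_mutexattr_init(&attr);
        if (err != 0) return err;
        err = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
        if (err == 0) err = pthread_mutex_init(m, &attr);
        pthread_mutexattr_destroy(&attr);
        return err;
    }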

QNX took this seriously; it has priority inheritance not just for mutexes, but for interprocess messaging and I/O too. But it's a real-time OS. It passes along not only the priority but also the CPU quantum for non-hard-real-time tasks. So a group of rapidly interacting processes is effectively scheduled as if it were one process that happens to cross address spaces.
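
A hypothetical sketch of what that looks like from the client side (the ask_server wrapper is made up; MsgSend/MsgReply are the real QNX calls):

    #include <string.h>
    #include <sys/neutrino.h>

    /* Hypothetical sketch: the client blocks in MsgSend(), and the
       server thread that receives the message runs at the client's
       priority until it calls MsgReply(), so the pair is scheduled
       as one logical task across address spaces. */
    long ask_server(int coid, const char *req, char *reply, size_t replylen) {
        return MsgSend(coid, req, strlen(req) + 1, reply, replylen);
    }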

On most non-real-time OSs, if you have a microservices-type program which is rapidly passing control back and forth between multiple processes, it will work fine as long as there's spare CPU time. If there's no spare CPU time, performance suddenly drops badly. That's because the processes being released by mutex unlocks go to the end of the line for CPU time.

This is something microkernel designers take seriously, because it has a big impact on microkernel performance. Unikernel designers, not so much.


I agree that there are operating systems which make few guarantees about protection from priority inversion, at least not without specific programmer intervention. As a programmer, you need to be aware that priority inversion can be a thing.

However, this has no bearing on whether or not priority inversion is a "solved problem". "Solved problem" generally means that there are well-known, widely studied technical solutions to a problem. My understanding right now is that priority inversion is a solved problem -- it is a standard topic in any introductory textbook on real-time systems, with the solutions I stated above. But maybe I'm wrong -- certainly the author seems to think so -- hence my question stands: I would like to understand why priority inversion is not a solved problem.


Sorry to double-reply, but the paper I was thinking of was probably [0]

It seems very informative. Doesn't appear to be peer-reviewed, mind.

[0] http://www.cse.uaa.alaska.edu/~ssiewert/a320_doc/Yodaiken-20...


> hence my question stands: I would like to understand why priority inversion is not a solved problem.

I believe it's because although complete solutions do exist, they have prohibitive overheads.

Annoyingly I can't find the relevant paper I'm thinking of.


> I would like to understand why priority inversion is not a solved problem.

Maybe I just was not careful enough in choosing this phrasing when writing the article.

Do you think it would make sense if I change "but it's far from being solved once and for all." to "each of them with different upsides and downsides."?


Trade-offs, yes. I think I understand now: you meant that application-level priority inversion is commonly not avoided automatically by the OS -- programmers still need to worry about it. Even if theoretical solutions exist, they may not be available on your chosen platform, or they may need to be manually configured (e.g. PTHREAD_PRIO_INHERIT).


Thanks. I committed a change (https://github.com/Dobiasd/articles/commit/8ac8db11eb95de688...). Feel free to open a PR if you have a better phrasing in mind. :)


> Because Windows and Linux are not designed for hard real time.

True, although RT kernels have been developed for both.

Windows actually features AutoBoost (priority inheritance). Since Windows also has I/O priority, the boost applies to I/O as well: when a higher-priority thread blocks on a mutex owned by a lower-priority thread, both the thread priority and the I/O request priority of the lock owner are boosted.


> On the other hand, I have yet to learn whether Mac OS has any synchronisation primitives that mitigate mutex priority inversion.

It does! xnu supports priority inheritance through "turnstiles", a kernel-internal mechanism which is used by default by a number of locking primitives (list at [1]), including normal pthread mutexes (though not read-write locks [2]), as well as the os_unfair_lock API (via the ulock syscalls). With pthread mutexes, you can actually explicitly request priority inheritance by calling pthread_mutexattr_setprotocol [3] with PTHREAD_PRIO_INHERIT; the Apple implementation supports it, but currently ignores the protocol setting and just gives all mutexes priority inheritance.

[1] https://opensource.apple.com/source/xnu/xnu-4903.241.1/osfmk...

[2] https://opensource.apple.com/source/libpthread/libpthread-33..., search for _kwq_use_turnstile

[3] https://linux.die.net/man/3/pthread_mutexattr_setprotocol
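
As a small illustration (a sketch, not canonical Apple usage): os_unfair_lock needs no opt-in at all, since the kernel learns the owner from the lock word via the ulock syscalls:

    #include <os/lock.h>

    /* Sketch: os_unfair_lock records the owner in the lock word,
       so when a higher-priority thread contends, the kernel can
       boost the owner (turnstile-based inheritance). */
    static os_unfair_lock lock = OS_UNFAIR_LOCK_INIT;

    void touch_shared_state(void) {
        os_unfair_lock_lock(&lock);
        /* protected work; owner runs boosted while others wait */
        os_unfair_lock_unlock(&lock);
    }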


> xnu supports priority inheritance through "turnstiles"

Do you have any information on how long this "turnstiles" feature has been available? [1] I was aware that the userspace pthread libraries accept but ignore PTHREAD_PRIO_INHERIT, but wasn't aware that the underlying implementation gave priority-inheritance semantics.

Update: Looks like the "turnstiles" feature may have its origins in Solaris. [2]

[1] This link would suggest as of Mac OS 10.14: http://newosxbook.com/articles/12-10.14.html

[2] http://sunsite.uakom.sk/sunworldonline/swol-08-1999/swol-08-...


Yeah, looks like 10.14 and iOS 12. I actually hadn't realized it was so new. Before that, it seems that priority inheritance ("promotion") had long existed for kernel-internal locks [1], but not for anything in userland. It also seems that the behavior of supporting but ignoring PTHREAD_PRIO_INHERIT long predates turnstiles, which is pretty annoying.

As for origins... xnu's implementation appears to be written from scratch, but in terms of inspiration for the design and name, I'm not sure. Aside from Solaris, another possibility is FreeBSD [2] (but note that xnu's turnstiles are in the Mach side of the codebase rather than the BSD side).

[1] https://opensource.apple.com/source/xnu/xnu-1504.15.3/osfmk/...

[2] https://github.com/freebsd/freebsd/commits/master/sys/sys/tu...


Priority inheritance isn't always trivial to implement. The protocols by which a high-priority thread passes its higher priority to a lower-priority thread need to be synchronous; otherwise both could end up executing concurrently at high priority. Alternatively, you could make the protocol asynchronous but the scheduler fair enough to divide the time between the two threads, or perhaps have the threads swap priorities while the originally-lower-priority thread works on behalf of the other.

With asynchronous designs you also have the problem of identifying when the lower-priority thread is done working on behalf of the higher-priority thread. You can work around this by only letting the worker inherit the higher priority for a small amount of time, but this is guesswork: after all, the worker itself could be using async I/O and processing many different clients' requests over any time span, so how can the scheduler know when it's working on behalf of one client or another?

If you trust the lower-priority threads then just let them manage their priorities according to those of the clients they manage. Now you need something like SCM_RIGHTS/SCM_CREDS to pass priority tokens/info around.
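
A hypothetical sketch of that trusted-worker approach (serve_request and client_prio are made up; assumes SCHED_FIFO and sufficient privileges):

    #include <pthread.h>
    #include <sched.h>

    /* Hypothetical sketch: a trusted worker adopts a client's
       priority for the duration of one request, then restores its
       own. client_prio would arrive with the request, e.g. via
       credentials passed over the channel (SCM_CREDS-style). */
    void serve_request(int client_prio, void (*handler)(void *), void *arg) {
        struct sched_param p;
        int policy, saved;

        pthread_getschedparam(pthread_self(), &policy, &p);
        saved = p.sched_priority;

        p.sched_priority = client_prio;   /* boost to the client's level */
        pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);

        handler(arg);                     /* work on the client's behalf */

        p.sched_priority = saved;         /* drop back */
        pthread_setschedparam(pthread_self(), policy, &p);
    }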

Asynchrony is essential, so I suspect that fully automatic priority inheritance can't quite be had.


Those solutions raise the low-priority task to the higher priority, which breaks the deadlock but doesn't fix the latency issue.

The low-priority task was low-priority for a reason: it wasn't supposed to soak up CPU cycles needed for higher-priority tasks. Those solutions do just that - making a long-running background task run ahead of low-latency, more-critical tasks increases their latency (delays their response to time-sensitive events).


Grand Central Dispatch can resolve certain kinds of priority inversions involving owned locks.
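
For instance (a sketch, assuming Apple's libdispatch with blocks; the queue name is made up), dispatch_sync gives GCD the ownership information it needs to boost the queue's work to the waiting caller's QoS:

    #include <dispatch/dispatch.h>

    /* Sketch: because dispatch_sync blocks the caller, GCD knows
       which thread is being waited on and can raise the queue's
       work to the caller's QoS, resolving the inversion. A plain
       dispatch_async leaves no such ownership information. */
    void update_shared_state(void) {
        static dispatch_queue_t q;
        static dispatch_once_t once;
        dispatch_once(&once, ^{
            q = dispatch_queue_create("com.example.state",
                                      DISPATCH_QUEUE_SERIAL);
        });
        dispatch_sync(q, ^{
            /* protected work runs at least at the caller's QoS */
        });
    }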

