Is the C runtime and library a legitimate part of the Unix API? (2017) (utcc.utoronto.ca)
65 points by ingve on Sept 27, 2020 | 51 comments



> The standard C library's runtime environment is designed for C, and it generally contains a tangled skein of assumptions about how things work. Forcing all other languages to fit themselves into these undocumented constraints is clearly confining, and the standard C library generally isn't designed to be a transparent API; in fact, at least GNU libc deliberately manipulates what it does under the hood to be more useful to C programs.

It's hard to reconcile this with the fact that every other language besides Go manages to interface with the standard C library just fine. It seems like if it were such a big problem, you'd see more languages opting out.


The reason Go does it is because the overhead of calling C code from Go is relatively high. Other languages typically don't suffer from this friction.

The closest I can think of is Java, but I have no idea how much is reimplemented in pure Java to avoid the overhead of JNI.


JNI is expensive, but the JDK implementers give themselves some much more efficient ways to call into native code that aren't available to us mere mortals. For example, "HotSpot intrinsics", which are mostly used to hook into JVM internals, but also for some performance critical stuff:

https://gist.github.com/apangin/7a9b7062a4bd0cd41fcc

Ordinary native methods implemented by the JVM itself don't need to use JNI either.


This is weird, because two of Go's three creators are the ultimate C and Unix hackers: Rob Pike and Ken Thompson.


Both were also key contributors to Plan 9. The golang ABI is influenced by it (though they aren't the same).


Including Inferno and Limbo, which everyone keeps forgetting about.


That may be so. But that just seems like another way of saying, "Go has an issue with this and other languages don't." Why is calling into C so slow in Go and not slow in other languages like Java or C#?


Because Go decided on a runtime design that doesn't play well with the OS. As far as I can gather:

* Go uses its own calling convention that's different from C.

* Go doesn't use system stacks (goroutines have their own growable stacks).

* Goroutines are implemented as green threads using cooperative scheduling. Any call to C has to play nice with the scheduler to avoid blocking other goroutines running on the same thread.

* Garbage collection often requires that data be copied in order to be passed to the C library, since the original data may be freed by GC during execution.
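A rough cgo sketch of that last point (mine, not from the thread): C.CString copies the Go string into C-allocated memory, so the C side gets its own copy and is unaffected by whatever the Go runtime does with the original, and every C.* call also has to hop off the goroutine's growable stack onto a system stack before entering C.

    package main

    /*
    #include <stdlib.h>
    #include <string.h>
    */
    import "C"

    import (
        "fmt"
        "unsafe"
    )

    func main() {
        msg := "hello from Go"

        // C.CString allocates C memory and copies the Go string into it,
        // so the C side never touches GC-managed data.
        cmsg := C.CString(msg)
        defer C.free(unsafe.Pointer(cmsg))

        // Each C.* call switches to a system stack before running C code,
        // which is a large part of the cgo overhead mentioned above.
        fmt.Println("strlen via libc:", C.strlen(cmsg))
    }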


AFAIK all managed environments, be they Java, .NET, or even Erlang/OTP, suffer from the last three points. And Cygwin definitely suffers from the first one, because it uses the SysV ABI instead of any of the customary Win32 calling conventions (on x86 there was no single "the" calling convention on Windows; everyone and their dog invented their own and then had to somehow interoperate with the others and with the Win API itself... it wasn't pretty, but it worked okay). Still, all those environments do pretty fine.

Honestly, as someone who started their software-development education with Turbo Pascal/Delphi on DOS/Windows, a system-level libc and a system-dictated ABI were very surprising concepts. Why do I have to follow some arbitrary register-usage rules inside my code? Why do I have to export functions from .so files using exactly this calling convention? Why does no one export "FreeXxx" from their libraries, and why do they think I can just call "free()" when I'm done with the object they returned to me, when I don't even have C's "free()" in my language?

When the whole custom-allocator rage hit the Linux scene, it was very amusing to watch people suddenly realizing the existence of the "wait, a program can have several malloc()/free() implementations inside it, how do you choose the right ones?" problem.


Same here.

As far as Java and .NET are concerned, there are intrinsics (with annotations) that bypass some of the safety restrictions for AOT code, and when running on a JIT it eventually short-circuits some calls after recognising them as safe.

In any case, both platforms are pursuing alternative solutions to bring back those Delphi/Turbo Pascal-like ways of calling into external code.

Android also has such annotations that allow ART to call directly into NDK code.


One of the goals of Panama is to replace JNI, actually.


In the microcomputer world, things are very different and libc is basically never the standard API. In DOS, for example, the API was not libc (and of course many applications would just go down to the BIOS or the hardware directly, for efficiency reasons), and C was only one of many languages used --- Pascal and Asm were also quite commonly encountered. On Windows, the stable API is a set of functions exported from DLLs like GDI, KERNEL, and USER, and although libc is available, it is a layer on top of those. Libc as the API seems to be a specific characteristic of POSIX/Unix systems.


From the discussion about the Rust stdlib I understood they deliberately build parts of it upon libc specifically because Windows has no stable API below their libc.


Rust doesn't make much use of libc on Windows; it mostly bypasses libc and makes direct calls to the Windows API. AFAIK it only uses a few functions like memcpy, memset, memcmp (LLVM itself emits calls to these) and the math library (sin, atan2 etc.) from the Microsoft libc.


Windows is not a UNIX; libc belongs to the C compiler vendor.

For example, the Windows way to clear memory is ZeroMemory() not memset().


The politics between Windows and DevDiv are just as interesting. For example, it was @malwareminigun (who used to work on the MSVC compiler team) on Twitter who complained about the CPU time that CompatTelRunner consumes (mostly from Appraiser gathering application and device inventory).


Having been targeting Windows since the 3.0 days, I think those politics are responsible for some monumental failures.

Like Longhorn, as everything .NET was hard for them to swallow. That was later confirmed, from my point of view, when Joe Duffy recounted how some devs reacted to the internal success of Midori.

So we got the Longhorn core stuff redone in COM instead for Vista, and by the time Windows 8 came out, WinRT, with a .NET runtime incompatible with the CLR and only able to understand the MSIL that the team considered relevant, firstly for MDIL (Windows 8/8.x) and then for .NET Native (Windows 10 onwards).

Now we have Reunion trying to sort out the mess, bringing everything back into a Windows 7-like development stack, where UWP is just the COM evolution and both worlds (Win32 / UWP) share the same sandboxing and container models.

And in the process .NET 5 is focusing on the desktop side, with .NET Native having an uncertain future and a big question mark over whether CoreRT will ever take its place.

Ah, and then there is the whole story of them killing C++/CX via C++/WinRT, in the name of ISO C++ compatibility, for a bunch of libraries that are Windows-specific anyway.

Now they have realised that telling everyone to just wait for ISO C++23 to gain the C++/CX features wasn't the most welcoming decision for "Developers, Developers, Developers".

C++/WinRT only feels better to those who eschewed C++/CX and used the "real men's" WRL (ATL-like) template library, which naturally was the Windows team.

Those politics really suck from the outside.


I'm pretty sure not everyone at MS agrees with the spyware they're adding to Windows either.


@MalwareMinigun is Billy O'Neal BTW. I have a draft Wikipedia article about this topic:

https://en.wikipedia.org/wiki/Draft:Upgrade_Readiness

Appraiser not only consumes CPU (and HDD/SSD!) time, but is also technically quite interesting. For example, it loads a simple kernel driver (nxquery.sys) into your Windows 7 machine to read an MSR to determine whether NX has been disabled in the BIOS.


Your comment seems a bit out-of-place, am I missing a connection between your link and this Hacker News post?


This causes a problem in OpenBSD where they want to verify that system calls are only made from libc (https://lwn.net/Articles/806776/).


I'm not 100% sure this is right, but if there's anything that can be called a Unix API it's POSIX, and libc is part of the specification. So I think the answer is yes.

OP references Illumos, but I also think the only way you get access to syscalls on macOS is through Apple's libc.


> OP references Illumos, but I also think the only way you get access to syscalls on macOS is through Apple's libc.

That's right. In fact this was a source of a time-related bug in Go on Mac OS. https://github.com/golang/go/issues/17490 https://github.com/golang/go/issues/16570


There are even some fun gotchas whereby, to correctly bind some of the symbols, you need to make sure the correct header-expanding C stub wrapper is used so that the symbol linker works correctly on OS X. I was helping a friend with some FFI bindings for Mac earlier this year and that wound up being key.


Well, apparently this idea didn't always work. Go uses libc on macOS now. https://groups.google.com/g/golang-nuts/c/uX8eUeyuuAY


Go does things very differently. They will probably roll their own drivers at some point and run directly on Xen or KVM. The designers of Go have a lot of problems with the evolution of Unix in general. So it is understandable that they don't want to get involved with the libc clusterfluck.


Yeah but that means if you want to fix things at the libc level, you necessarily skip Go. OpenBSD had/has this problem.

But more broadly, I think the policy is less "Unix took a wrong turn" and more that goroutines make C/Go interop painful.


But then, Go is the lurching zombie of Plan 9. :-)


Inferno.


Even if it were "unix took a wrong turn", there wouldn't be any project on earth more qualified to make that claim. I mean, it's Rob Pike & Co., and if you look at their Git repo the first several commits are by Brian Kernighan. They're former Bell Labs employees who worked on things like UNIX.


And even with all that experience they still came up with a terrible programming language.


If I were Google, Go would be shut down, and the Plan 9 crowd reassigned to Fuchsia and forced to write only in languages outside their puny comfort zone, like Rust.


The fact that everyone forgets about Limbo and Inferno, and keeps talking about Plan 9, proves how successful Go would have been with another employer.


Huh? I think Rob Pike et al. suck at language design, period, but had some decent ideas for systems. I think Go would suck with any employer, but they could do better work on non-programming language things.


Nope, as proven with Limbo.

Go 1.0 is basically Limbo with a bit of Oberon-2 syntax.

Outside the chocolate factory, it would have been just as successful, to the point that everyone forgets about Inferno and only talks about Plan 9.


> The designers of Go have a lot of problems with evolution of Unix in general.

Well, they invented Unix, and later its improved successor (Plan 9).


They invented UNIX, but then BSD and GNU took over and added all the cruft and ruined the experience. So they started Plan 9.


And then improved Plan 9, using Go's predecessor.


>"Ordinary non-C languages on Unixes generally implement a great many low level operations by calling into the standard C library. This starts with things like making system calls, but also includes operations such as getaddrinfo(3).

Go doesn't do this; it implements as much as possible itself, going straight down to direct system calls in assembly language.

Occasionally there are problems that ensue."

This is an interesting tidbit about Go from the perspective of an OS designer; basically, an aspiring OS designer would only need to implement the subset of Unix API syscalls that Go actually requires, and only to the degree that Go needs specific functionality from them. That is,

it would be an interesting idea to code a new OS and/or OS microkernel -- one which would only support Go's subset of Unix's syscalls/API...

Or, perhaps even more interesting -- code a 'bare-metal' version of Go which includes its own OS-like support code (mini internal OS functionality), implementing the Unix syscalls it makes, but on bare-metal hardware...
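For concreteness, a minimal sketch (my own, Linux-only) of that "straight down to direct system calls" style: a raw write(2) via the syscall package, with no libc in the path. Roughly this set of entry points is what such an OS or bare-metal runtime would have to provide.

    package main

    import (
        "fmt"
        "syscall"
        "unsafe"
    )

    func main() {
        msg := []byte("written via a raw write(2) syscall\n")

        // Trap straight into the kernel; glibc's write() wrapper is never involved.
        n, _, errno := syscall.Syscall(
            syscall.SYS_WRITE,
            uintptr(1), // fd 1 = stdout
            uintptr(unsafe.Pointer(&msg[0])),
            uintptr(len(msg)),
        )
        if errno != 0 {
            fmt.Println("write failed:", errno)
            return
        }
        fmt.Println("bytes written:", n)
    }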


Isn't Tamago (which hit the front page a few days ago) exactly that sort of bare-metal Go?

https://github.com/f-secure-foundry/tamago


While there is an overlap between libc and POSIX, whereby POSIX defines its standard in terms of the C language spec and its standard library, getaddrinfo(3)'s behavior is part of the POSIX standard only.

Obviously the Go devs are attempting to do the right thing here, trying to implement an async version of getaddrinfo(3) because POSIX's version is not async, and this route is extremely messy as it ends up requiring them to reimplement BIND's libraries and/or the whole DNS standard.

But this has nothing to do with C's standard library; the POSIX interface is defined in C, and that's all there is to it.

What's being attempted here is implementing a POSIX-defined function in another language and with different behavior (async).

If a getaddrinfo_a() existed in the POSIX standard, the Go devs wouldn't have had any issue; they would have gone the typical route and created a binding between Go's call and the POSIX call provided by the OS.


> If a getaddrinfo_a() existed in the POSIX standard,

In principle, if you can assume modern POSIX, then you can also spawn a POSIX thread to call getaddrinfo() and wait asynchronously for the result.

In practice getaddrinfo() isn't always thread-safe :/ But it's supposed to be.
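In Go terms, that "blocking call on another thread of execution, wait asynchronously" pattern is just a goroutine and a channel; when the cgo resolver is in use, net.LookupHost is ultimately a getaddrinfo(3) call blocking on some OS thread. A rough sketch (mine, standard library only):

    package main

    import (
        "fmt"
        "net"
        "time"
    )

    func main() {
        type result struct {
            addrs []string
            err   error
        }
        ch := make(chan result, 1)

        // Run the blocking lookup concurrently...
        go func() {
            addrs, err := net.LookupHost("example.com")
            ch <- result{addrs, err}
        }()

        // ...and wait for it asynchronously, with a timeout, in the caller.
        select {
        case r := <-ch:
            fmt.Println(r.addrs, r.err)
        case <-time.After(2 * time.Second):
            fmt.Println("lookup timed out")
        }
    }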


Back in the 1980s when I was learning programming on a Unix machine, it took me quite a while to understand the difference between sections 2 and 3 of the man pages.


You need it to know which syscall is which, so I suppose yes. There's /proc/kallsyms for Linux, but that's not portable.


It scares me that passing too little stack space to libc results in such subtle failures.


Google has a completely different culture from the open source community when it comes to stack. Many FOSS programmers assume an 8mb stack due to a belief that malloc() is slow. Google built all its production services under the assumption of a 64kb stack, because they found a way to make malloc() go fast and wanted to have lots of threads. So it shouldn't be scary or unexpected that there's a little bit of friction bringing those two worldviews in harmony. Google was nice enough to open source tcmalloc.


This was libc attempting to detect insufficient stack space, under the assumption of a guard page. Go chose to not create a guard page. Nobody can force you to fasten your seatbelt.


But nobody requires that you fasten your seatbelt, either. I believe a stack check is still opt-in with the default flags on most compilers, unless you alloca. Plus, stack probes are merely precautionary; they can still be broken by an inconvenient sequence of stack allocations and signal handlers being run.


Not libc. The stack probe was inserted by the compiler when compiling the vDSO (part of the kernel, but mapped into user processes) with certain ricer^Whardening flags enabled.


This is one of the “gotchas” in C: the standard doesn’t define anything about the stack at all, but it is really quite easy to run into issues if you’re not careful, even with “standards compliant” programs. What’s scarier is that this can happen at any function call, and there is no standard way to detect this issue. (Most compilers do support stack probes, though.)


If there is not enough heap, there is the OOM killer (which basically kills processes at random), which is also scary.



