Huh, I can understand labeling all of those as messes, but what's wrong with Berkeley sockets? The API seems fairly straightforward, and has managed to withstand the test of time pretty well. I'm curious what grievances you have with it?
The fact that strict aliasing has to be broken to use the API shows that, amongst other things, the API is from a different era of practicality over correctness.
In the past, people have popped up to say “but that’s only in the bind() implementation side” (when talking about `sockaddr`, for instance) but I find this argument amusing. At the end of the day, if something breaks because of this, that nuance is lost amidst the mass of frustration.
You can also say that strict aliasing is incompatible with the original design of C (i.e. it's a huge mess). We should have never had undefined behaviour in C to begin with, that should have been a separate language (possibly one that can still #include C headers and talk the same ABI).
Ah, I hadn't considered the whole sockaddr/sockaddr_storage/sockaddr_in/sockaddr_un... thing, yeah that's a mess. I guess a more modern API would either use opaque pointers or a union (though that has its own share of disadvantages). Thanks for bringing this up!
Note that it wouldn’t be a strict aliasing violation, for example, if the argument to bind() were typed const sa_family_t *. The annoying cast would still be required, but very much legal (as long as the other side hasn’t screwed up).
I cannot speak for M. DeVault, but one of the interesting things about the API is that accept() embodies two distinct things. A hypothetical alternative API would have enabled such things as filtering and rejecting undesirable incoming connections before a handshake completes with application-mode mechanisms instead of with kernel-mode mechanisms.
People often point to the function calling conventions of the API, such as the "sockaddr" structures and the "errno" mechanism (that has historically been tricky for non-Unix operating systems and for systems where there are multiple C compilers with multiple C runtime libraries). But there are actual architectural decisions that had fairly reasonable alternative paths. I wouldn't say that it's a "mess" because of these alternatives, but it is definitely not the sole and unequivocal way of approaching things; and there were and are tradeoffs to be had.
If you want to establish bi-directional communication with some process on the same host, that process should create two rendezvous points with mkfifo(2). Your process opens the read FIFO for reading, and the write FIFO for writing, and you're done. If you finish writing and want to just read, you close(2) the write file descriptor, and keep reading the read file descriptor until you hit EOF.
If you want to establish bi-directional communication with some process on a remote host over TCP/IP, that process needs to listen(2). Your process connects to it with connect(2), and gets back a single file descriptor that (unlike any other file descriptor) is both writable and readable. If you finish writing and want to just read, you have to use the special shutdown(2) function, a quirk that only exists to work around the quirk of TCP sockets being both readable and writable.
Some might argue that this is a pretty minor wart, all things considered, and sure, it's nowhere near as confusing as the mess of terminal job control. But it also seems like they could have implemented it in a non-quirky way if they'd just spent thirty seconds thinking about it beforehand.
It's very un-Unix like. In Unix, everything is a file, right? But sockets are entirely managed via syscalls. There are also some very nasty skeletons lying in wait in more obscure parts of the API, such as cmsg.
Consider an alternative design (Plan 9 is something like this, it's been a while so I'm short on details):
1. Open /net/dns and write the desired hostname, then read back the resolved IP address
2. Open /net/tcp/clone and write the address and port, then read back a connection number
3. Open /net/tcp/$conn_id and the file is now a full duplex TCP stream
Compare this to BSD sockets: use getaddrinfo for the DNS lookup (gross!), create a socket with a syscall, then connect the socket with another syscall using sockaddr (gross!). Much worse and much less Unixy.
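For concreteness, the BSD-sockets side of that comparison, sketched as a client. Using a numeric address so no real DNS is involved, and port 9 is an arbitrary choice that probably has no listener; the point is the three-step shape, not a working connection:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    /* Step 1: resolution via getaddrinfo. */
    int err = getaddrinfo("127.0.0.1", "9", &hints, &res);
    if (err) { fprintf(stderr, "%s\n", gai_strerror(err)); return 1; }
    puts("resolved");

    /* Step 2: a syscall to create the socket... */
    int fd = socket(res->ai_family, res->ai_socktype, res->ai_protocol);

    /* Step 3: ...and another to connect it, the sockaddr cast already
       hidden inside the addrinfo result. Likely fails here since
       nothing is listening on port 9. */
    if (connect(fd, res->ai_addr, res->ai_addrlen) < 0)
        puts("connect failed (expected: nothing listening)");
    else
        puts("connected");

    close(fd);
    freeaddrinfo(res);
    return 0;
}
```

Versus the Plan 9 version, where each step is just open/read/write on files under /net.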
Not really. Modern Unixes have evolved towards everything is a file descriptor, but the original design very much called for everything to be a file. Bell Labs started Plan 9 in part as a response to this very issue.
There was no original design for Unix. It never was one thing, but a bunch of loose ideas wired (and rewired) together pragmatically over time by different people, usually scratching an itch. I highly recommend Kernighan's UNIX: A History and a Memoir to get a glimpse of the dynamic.
The whole “it’s all a file” idea was adopted as an ideal along the way because the interface was so dang convenient. But I’ve never read it in any documentation.
Even pipes were added as a “that’s a cool idea.” There was a conversation, a late night coding session and pipes came into existence.
Plan 9 was a response to the evolved (vs designed) Unix. Took “the best” from it and made those design principles. It was an “if we could do it all over again, knowing what we know now” project.
What’s interesting to me is that evolved systems seem to dominate design-first systems in adoption. Maybe it’s that they’re pragmatic. I don’t know. Or maybe my observation is just wrong.