This would be solved if the "select" keyword also worked on I/O objects. It could gain special io.ReadReady and io.WriteReady channels that return a boolean.
This would also simplify most of the code that tries to pull events from different sources, including I/O. Right now each I/O source has to be drained in its own goroutine (adding error handling makes it worse):
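    fromIO := make(chan []byte)
    go func() {
        for {
            buf := myPool.Get()      // assumed to hand back a []byte
            n, _ := myIO.Read(buf)   // error ignored here for brevity
            fromIO <- buf[:n]        // send only the bytes actually read
        }
    }()
    for {
        select {
        case a := <-fromIO:
            // do one thing
        case b := <-fromOtherChan:
            // something else
        }
    }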
After the change:
    for {
        select {
        case <-myIO.ReadReady:
            buf := myPool.Get()
            n, err := myIO.Read(buf)
            // handle buf[:n] and err
        case b := <-fromOtherChan:
            // ...
        }
    }
I/O isn't an object, it's an interface (Go has no objects). There is nothing magical about I/O or any interface--they are just functions--so this change would require extra magic to be implemented which I strongly oppose.
One of the beautiful things about Go is that the designers carefully chose to confine language magic to only a few fundamental areas (channels, goroutines, arguably maps) and resisted going beyond that. Go approximates "C plus Concurrency and Better Data Structures" and nothing more.
On the contrary, the magic is required because things like select and channels and coroutines are implemented at the language level in Go. If they were a library with proper extension points and, sure, possibly language sugar on top, it would be doable.
The compiler actually just transforms everything channel related into function calls within the runtime package. The majority (probably > 99%) of the implementation of channels and select is Go itself.
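To make the "select is really a function call" point concrete without digging into the runtime package: the standard reflect package exposes select as an ordinary function, reflect.Select. The sketch below is only an illustration of that idea (the channel names and the helper selectViaReflect are made up for the example); it is not what the compiler emits.

```go
package main

import (
	"fmt"
	"reflect"
)

// selectViaReflect performs a two-way receive select through reflect.Select
// instead of the select keyword. Only channel b has a value ready.
func selectViaReflect() (int, string) {
	a := make(chan string, 1)
	b := make(chan string, 1)
	b <- "from b"

	cases := []reflect.SelectCase{
		{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(a)},
		{Dir: reflect.SelectRecv, Chan: reflect.ValueOf(b)},
	}
	// Blocks until one case can proceed, like a select without a default.
	chosen, value, _ := reflect.Select(cases)
	return chosen, value.String()
}

func main() {
	idx, msg := selectViaReflect()
	fmt.Println(idx, msg)
}
```

Note that this still only accepts channels, so it doesn't help with selecting on an io.Reader; it just shows that the keyword is not the only entry point.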
"There is nothing magical about I/O or any interface--they are just functions--so this change would require extra magic to be implemented which I strongly oppose."
"Extra", yes, "magic", debatable. (Which, to be clear, I mean straight, not as a rhetorically-weakened pure disagreement.) There is already precedent for accepting the io.Reader and io.Writer interfaces while letting the structs behind them implement additional interfaces to gain additional capabilities, especially in the io.Copy function: https://golang.org/pkg/io/#Copy (Click on the Copy name to pass through to the underlying implementation, which bounces through to another function which should still be on your screen, and the first few lines show the special casing.) I also find myself with some frequency having to accept a Reader or a Writer and examining it to see if it's also an io.Closer. If I want to wrap a new io.Reader around an underlying io.Reader for some reason, something I do quite a lot, it's important to propagate proper closing behavior, especially when wrapping with gzip or something else that really needs to be closed and not just sort of trail off at an arbitrary point.
Making io.Reader "just work" in a select statement would be magic, but using a similar technique to the above it might be possible to offer a new "io.ReadableChan(io.Reader) <-chan struct{}" (or the obvious WritableChan extension) function that files or sockets would implement using a polling mechanism, and everything else would get an automatic new goroutine created that used just io.Reader's existing interface. This could offer a zero-buffer-until-IO implementation without having to modify io.Reader. The resulting channel coming back from ReadableChan might have to be a new sort of magic channel internally, but that would be an internal detail that shouldn't ever rise up to your code.
Alternatively, it might be an "io.ReadableChan(func() []byte) <-chan []byte", waiting internally until it can be read then determining the []byte to use for the read by invoking the provided function. I'm just spitballing here, not preparing a change proposal. (For such a thing I'd also want to consider whether that should be "func() ([]byte, error)", for instance.)
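For anyone who hasn't run into the "is it also an io.Closer?" pattern, here is a minimal sketch of it. The types wrapped and trackingReader and the helper wrap are hypothetical names invented for this example; only io.Reader, io.Closer, io.ReadCloser, io.Copy and io.Discard come from the standard library.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// wrapped forwards reads to an underlying reader and remembers how to
// close it, if it can be closed at all.
type wrapped struct {
	io.Reader
	closer io.Closer // nil when the underlying reader has no Close method
}

func (w *wrapped) Close() error {
	if w.closer == nil {
		return nil
	}
	return w.closer.Close()
}

// wrap checks at run time whether r also implements io.Closer and, if so,
// propagates Close to it.
func wrap(r io.Reader) io.ReadCloser {
	w := &wrapped{Reader: r}
	if c, ok := r.(io.Closer); ok {
		w.closer = c
	}
	return w
}

// trackingReader is a test double that records whether Close was called.
type trackingReader struct {
	io.Reader
	closed bool
}

func (t *trackingReader) Close() error {
	t.closed = true
	return nil
}

// closePropagates reports whether closing the wrapper closed the source.
func closePropagates() bool {
	src := &trackingReader{Reader: strings.NewReader("data")}
	rc := wrap(src)
	_, _ = io.Copy(io.Discard, rc) // drain the wrapper
	_ = rc.Close()
	return src.closed
}

func main() {
	fmt.Println(closePropagates())
}
```

The same type-assertion trick is what a hypothetical "ready" interface would rely on: code accepts a plain io.Reader and upgrades when the concrete type offers more.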
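A rough sketch of the second shape, covering only the "automatic goroutine" fallback and not the polling path for files or sockets. Everything here is hypothetical: readChan and readResult are names made up for the example, nothing like this exists in the io package, and this fallback does hold a buffer while it waits, so it is not the zero-buffer variant.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readResult carries one Read's worth of data plus the error, if any.
type readResult struct {
	Data []byte
	Err  error
}

// readChan starts a goroutine that reads from r into buffers supplied by
// getBuf and delivers each result on the returned channel. The channel is
// closed after the first error (including io.EOF).
func readChan(r io.Reader, getBuf func() []byte) <-chan readResult {
	out := make(chan readResult)
	go func() {
		defer close(out)
		for {
			buf := getBuf()
			n, err := r.Read(buf)
			if n > 0 || err != nil {
				out <- readResult{Data: buf[:n], Err: err}
			}
			if err != nil {
				return
			}
		}
	}()
	return out
}

// collect drains a string through readChan in 4-byte chunks.
func collect(s string) string {
	var sb strings.Builder
	newBuf := func() []byte { return make([]byte, 4) }
	for res := range readChan(strings.NewReader(s), newBuf) {
		sb.Write(res.Data)
	}
	return sb.String()
}

func main() {
	fmt.Println(collect("hello, select"))
}
```

The returned channel can sit in a select next to any other channel. One known gap in this sketch: if the consumer stops receiving, the goroutine stays blocked on the send, so a real version would need some cancellation signal.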
This is one of those places where Go favors practicality over purity, and is very much not channeling Haskell. Checking whether an interface value implements another interface is probably something you shouldn't be doing all the time, but it's not inherently un-Go-like or anything. You're just taking more responsibility. Much like how you are supposed to share memory by communicating but Go will still let you communicate by sharing memory if you want.
The epoll/select syscalls and the file descriptors for TCPConns are exposed, so a lot can be done in third-party packages; I linked some relevant stuff in another comment. It's kinda hard for me to imagine explicit wait APIs being deeply integrated with std, though; maybe if things reached the point where sitting on buffers became the bottleneck for a lot of apps (in which case, dang).
The fsnotify lib at https://github.com/fsnotify/fsnotify has a similar challenge at a high level (ferry events from various syscalls into a Go chan) and may be interesting as a reference.
But: perspective. If KBs per idle conn are eating a well-provisioned modern server box whole, then back of envelope you might be at, like, hundreds of thousands to a million conns for a single box (few KB/conn * 1M = few GB), and, like, huge congrats for getting that far! Not the absolute limit of the hardware, maybe, but clearly not a toy!
Other folks also note that once you're juggling that many connections there are other things that could become big factors in scalability, like whether your kernel and network will remain happy, can you keep up if more conns than expected suddenly become active, and (I'd add) will the rest of your code (the stuff besides reads/writes) be able to keep up.
Still, if anyone wants to wrap the epoll-results-to-a-chan thing in a package, they can always post something + see if anyone uses it to do cool stuff. As you suggested, it's plausible there are places it can make code cleaner.
This would kill composition. If my reader were a tls.Conn wrapping a net.TCPConn, what would it mean to be ready? Does every io.Reader now need to be an io.ReadyReader? How can a framed Reader know if it's ready without holding a buffer?
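The buffering half of this objection is easy to show with bufio, which is a much simpler wrapper than tls.Conn. In the sketch below (the helper name readiness is made up for the example), the underlying source has nothing left after a single ReadByte, yet the wrapper still has data to hand out, so "is the underlying reader ready?" and "is this reader ready?" give different answers.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readiness reads one byte through a bufio.Reader and reports how many
// bytes remain in the underlying source and how many sit in the buffer.
func readiness() (underlying, buffered int) {
	src := strings.NewReader("hello world")
	br := bufio.NewReader(src)
	_, _ = br.ReadByte() // pulls the whole string into bufio's buffer
	return src.Len(), br.Buffered()
}

func main() {
	u, b := readiness()
	fmt.Println("underlying bytes left:", u)
	fmt.Println("bytes buffered in wrapper:", b)
}
```

A readiness signal taken from the underlying source would say "not ready" here even though a Read on the wrapper would return immediately.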
There is a race condition, or at least an avenue for a race condition, in this design. Even if myIO.ReadReady is unbuffered, there is nothing to stop another goroutine from calling myIO.ReadReady after the select case is entered. The channel receive would have to return an exclusive handle of some type that could be used to read from myIO. And then at that point, how do you signal to myIO that you are done reading? Should myIO "unlock" after a single Read call? After all pending data has been consumed? What if you're using a parsing package that calls Read multiple times? How do you keep it from blocking?
Usually I/O reads (or writes) should only be handled by a single goroutine at a time. I don't think the Read() function provides concurrent access guarantees either.
But you're right that myIO.ReadReady doesn't guarantee that there is enough data to fill len(buf) on the next Read call, which could then make that call block.
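On the len(buf) point, it may help to separate a single Read from "read until the buffer is full". The sketch below uses an in-memory strings.Reader, so nothing actually blocks; the helpers shortRead and fullRead are made-up names. It only shows that one Read call may return fewer bytes than len(buf), while io.ReadFull keeps going until the buffer is full or the data runs out, which on a live connection is the case that would wait.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// shortRead issues a single Read with a buffer larger than the data.
func shortRead() (int, error) {
	r := strings.NewReader("abc")
	buf := make([]byte, 8)
	return r.Read(buf)
}

// fullRead insists on filling the whole buffer via io.ReadFull.
func fullRead() (int, error) {
	r := strings.NewReader("abc")
	buf := make([]byte, 8)
	return io.ReadFull(r, buf)
}

func main() {
	n, err := shortRead()
	fmt.Println("Read:", n, err)

	n, err = fullRead()
	fmt.Println("ReadFull:", n, err)
}
```

So whether a readiness signal is enough depends on what the code after the select does with the reader, which is the parsing-package concern raised above.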
One of the features which I would most love to see in Go is a Selectable interface, which could make things like this doable (albeit as a bit of a hack).
Even if a socket's Selectable implementation just abstracted away the goroutine from your first example, it would make the developer experience a lot nicer for me.
Agreed, this would be a cool change. Part of the problem is that the only way to "kill" the goroutine doing the read is to close out the fd from under it. However, depending on what you actually have open (e.g. a unix device), it often gets stuck.
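For reference, the "close it out from under the reader" approach looks roughly like this. The sketch uses net.Pipe as a stand-in for a real socket, and unblockByClose is a made-up helper name; it says nothing about the stuck-device case, which depends on the OS and the device.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// unblockByClose starts a goroutine blocked in Read on one end of an
// in-memory pipe, then closes that end and returns the error Read reports.
func unblockByClose() error {
	a, b := net.Pipe()
	defer b.Close()

	errc := make(chan error, 1)
	go func() {
		buf := make([]byte, 16)
		_, err := a.Read(buf) // nothing is ever written, so this waits
		errc <- err
	}()

	// Give the reader a moment to block. This only makes the demo clearer;
	// Read returns an error whether Close happens before or after it starts.
	time.Sleep(10 * time.Millisecond)
	_ = a.Close()
	return <-errc
}

func main() {
	fmt.Println("read returned:", unblockByClose())
}
```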
Having select{} work on io.Reader and io.Writer would be great!
Forgive me if I'm wrong, but I believe this is because the select keyword is a language syntax convenience whereas actual I/O needs a full goroutine because that's the unit of parallelism.
No, depending on the compiler it is where the call that calls main() and calls global initialization lives, also where support functions for compiler intrinsics might live, also the implementation support for data types not directly supported by the architecture, e.g. floating point emulation.
Right... which means that "select" is like "class". It means some shit, but it isn't supposed to mean "KERNEL ASYNC IO INTERFACE" the same way that "class" doesn't mean struct.