Go Things I Love: Channels and Goroutines (justindfuller.com)
192 points by iamjfu on Jan 7, 2020 | hide | past | favorite | 69 comments



The example under "Communicating by sharing memory" isn't correct, despite the author claiming that "it works". It's a very common example in concurrency 101 (updating a value). The fact that the author claims that it's correct is pretty concerning to me.

Adding a print(len(ints)) at the bottom of the function:

  $ go run test.go
   5
  $ go run test.go
   8  

More on-topic, channels have their own tradeoffs. I often reach for WaitGroups and mutexes instead of channels, because things can get complicated fast when you're routing data around with channels...more complicated than sharing memory. I don't think it's good advice to broadly recommend one over the other--understand their tradeoffs and use the right tool for the job at hand.


> More on-topic, channels have their own tradeoffs. I often reach for WaitGroups and mutexes instead of channels, because things can get complicated fast when you're routing data around with channels...more complicated than sharing memory. I don't think it's good advice to broadly recommend one over the other--understand their tradeoffs and use the right tool for the job at hand.

Unfortunately, some Go people reach for the "this is not idiomatic"-cudgel far too quickly, instead of actually looking at the various trade-offs.


Indeed! Even the Go wiki advocates exactly what you're saying about looking at the tradeoffs: https://github.com/golang/go/wiki/MutexOrChannel


This share-by-communicating mantra needs to die. Channel-based code in Go has a tendency to require nontrivial cleanup. Each time you need to compose a channel you end up introducing another goroutine (for example, just to map a type or merge multiple channels). Now that goroutine needs to be closed, then you end up with an additional close channel, and sometimes you need to drain channels to prevent deadlocks.

They are not universally a bad thing, but total avoidance of sync package primitives is a bad idea (and much slower in many circumstances)

This gives a good analysis:

https://www.jtolio.com/2016/03/go-channels-are-bad-and-you-s...


For simple cases mutexes and such can be much faster and simpler. Channel based constructs are good when they model the problem well as streaming or queuing.


Most of these points are fair, but:

> Now that goroutine needs to be closed, then you end up with an additional close channel

Goroutines that just map values or similar will usually share a context with some other goroutine and can therefore reuse that context's close channel.


Yeah, but you often still have to write the code that reacts to the context. The way context gets talked about in the Go community, it gives many programmers the impression that a context closing will somehow forcibly shut down a goroutine or something. There's a lot of "when the context is cancelled, it will...", but it really ought to be phrased more like "when the context is cancelled, you can...", something to the effect of "have code that catches that close and handles it properly".

So, having a context is still "end[ing] up with an additional close channel" as silasdavis said. Even if you pass it to something that will be cancelled like a network operation, you still must correctly notice and handle the resulting timeout error.

The value for context is almost entirely that everybody in the community has come to agree on it rather than its functionality per se. Which is the same as io.Reader, in that the utility isn't its functionality, which any number of languages can replicate, but the way everybody agrees to implement and use it through the entire ecosystem, which is where other languages have a lot more trouble. Nothing technically stops that from happening, but you often end up with islands of agreement in different major frameworks or library ecosystems instead of ecosystem-wide agreement. Context is everywhere now; it's actually been a few months since I encountered a place I wish that took it that didn't.


Hi, author of the post here. Which example doesn't work? I just pasted the communicate by sharing memory example into the go playground: https://play.golang.org/p/bWtyGTC-EsC and it gives the same length every time. Am I missing something or are you referring to a different example?

> More on-topic, channels have their own tradeoffs. I often reach for WaitGroups and mutexes instead of channels, because things can get complicated fast when you're routing data around with channels

You're absolutely right. I certainly didn't intend to give a blanket recommendation. It's more of a, "If you're sharing memory, might it become clearer if you share memory by communicating?" I was worried that the simplistic examples would not properly represent the cases I was thinking of. I think that's a communication error on my part.


I think it's only consistent in play.golang.org because results are cached: https://blog.golang.org/playground.

If I run it locally it's not consistent:

    go run lock/main.go
    [3 0 1 2 5 4 6 9 7 8] 10

    go run lock/main.go
    [1 5 6 7 8 0] 6

    go run lock/main.go
    [0 3 1 2 4 5 6 7 8] 9
Here's a version that does work for me: https://play.golang.org/p/b6bRb9pgIGZ

The issue is that appending to slices concurrently is not safe, so you have to use a lock around the append or similar.
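A minimal sketch of that fix (the function name is illustrative, not the post's code): the mutex serializes the appends, so the slice header and backing array are never mutated by two goroutines at once.

```go
package main

import (
	"fmt"
	"sync"
)

// appendConcurrently launches n goroutines that each append one value.
// Without the mutex, concurrent appends race on the slice header and
// backing array, which is why lengths vary from run to run.
func appendConcurrently(n int) []int {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		ints []int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			ints = append(ints, v)
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return ints
}

func main() {
	fmt.Println(len(appendConcurrently(10))) // always 10
}
```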


In a way this supports the argument for the "Do not communicate by sharing memory; instead, share memory by communicating" mantra, as in a language with no compile-time checks for incorrect use of shared memory, it is very easy to get it wrong.


Yes, it looks like the author has updated their post to say as much following the comments here.


I have. I didn't have time to rewrite it (I'm working, and I didn't want to take it down) so I added a few caveats. I am going to research this more and follow up again. Thanks again to everyone for the great responses. I learned a lot from the comments here.


If you try to run a naive version of this in rust, the compiler won't even run it.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

I'm a beginner at rust, but this is the version I came up with that compiles and works:

https://play.rust-lang.org/?version=stable&mode=debug&editio...


TIL. That's a huge flaw. Thanks so much for your response!


Welcome! It's the same for other data structures, by the way (not just slices) — maps are not safe for concurrent writes either. (The rationale seems to be that users of the data types can choose whether to make them safe for concurrent use or not depending on the use case.)

I found this helpful: https://youtu.be/29LLRKIL_TI?t=1340


They added sync.Map

> Map is like a Go map[interface{}]interface{} but is safe for concurrent use by multiple goroutines without additional locking or coordination. Loads, stores, and deletes run in amortized constant time.

https://golang.org/pkg/sync/#Map


> The rationale seems to be that users of the data types can choose whether to make them safe for concurrent use or not depending on the use case.

Also that for most uses a concurrent map is way overkill, and a thread-safe one is both costly and basically useless (hence the Java folks not keeping the thread-safety when migrating from Hashtable to HashMap).

On the other hand they're kinda shit given how awful non-builtin data structures are in Go, and how easy it is to "leak" maps between goroutines.


Hashtable used a giant mutex for locking; it was just slow, especially when compared to ConcurrentHashMap.


Results are cached, but the main reason the example “works” is because the playground has GOMAXPROCS set to 1, meaning only a single goroutine will be running at any given point.


That makes sense, thanks for the correction.


Ran the example on my machine and can confirm it's broken.

Be careful relying on the results of the Go playground; it has a bunch of differences and probably has GOMAXPROCS set to 1, which most other systems will not.

You need a sync.Mutex or similar protecting your call to append :)


The Go playground is designed to be deterministic, right? Not sure that’s a useful test. On mobile so I can’t compile right now or else I’d take a look.


Correct. Results are cached for the same program “ID”.


As an aside, I recommend run.Group [1] as a replacement for WaitGroup. It implements a very common pattern where you have N goroutines that should execute as one unit (if one fails, everyone else should abort) and allows you to collect the final error. Similar to ErrGroup, but better.

[1] https://github.com/oklog/run


I wish WaitGroups weren't so awkward syntax-wise... "Here is an easy way to keep track of a bunch of async tasks you launch as a group! Just don't forget to increment this counter each time you launch one, or it won't work!"


I prefer to call Add once if I know how many goroutines I will create.

    var wg sync.WaitGroup
    n := 10
    wg.Add(n)
    for i := 0; i < n; i++ {
        go func() {
            defer wg.Done()
            doSomething()
        }()
    }
    wg.Wait()
You still have to call wg.Done(), unfortunately. People have written small wrappers around sync.WaitGroup that change the interface:

    var wg bettersync.WaitGroup
    n := 10
    for i := 0; i < n; i++ {
        wg.Go(func() {
            doSomething()
        })
    }
    wg.Wait()


What are the performance characteristics of channels? I do some work with computer graphics, and a lot of the concurrency concerns involve operating on large chunks of memory (textures, vertex buffers etc) and my mental model is that channels involve a lot of copying, so you would pay a heavy cost for sending these heavy objects back and forth. But I have no idea if my mental model is correct.


You'd pass pointers to the buffers over the channels, so there wouldn't be much copying.

You would need to be careful that once one goroutine has sent a pointer to a channel, it doesn't touch that buffer again (until the receiver has finished with it, at least), otherwise you can get data races. You're implementing ownership semantics here, and unlike in some languages i could name, you're doing it without help from the type system.
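A sketch of that handoff (the Buffer type is illustrative): sending a pointer copies one machine word, and the "don't touch it after sending" rule is pure convention.

```go
package main

import "fmt"

// Buffer stands in for a large texture or vertex buffer. Sending *Buffer
// over a channel copies only the pointer, not the megabyte of data.
type Buffer struct {
	data [1 << 20]byte
}

func main() {
	ch := make(chan *Buffer)
	go func() {
		b := &Buffer{}
		b.data[0] = 42
		ch <- b
		// Convention, not enforced by the compiler: after the send,
		// this goroutine treats b as handed off and must not touch it.
	}()
	b := <-ch // receiver now "owns" the buffer
	fmt.Println(b.data[0]) // 42
}
```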


At that point just use a mutex.


Edit: removed (can’t seem to delete?)


No, GP was talking about a clear race condition on growing the slice in the first example. It is not just a case of "idiomatic Go" vs. not, or, as the author suggested, a problem only when the code grows. It is a critical bug in the first example.

Edit: add what one gets when run with `go run -race`

    WARNING: DATA RACE
    Read at 0x00c0000a6000 by goroutine 8:
      runtime.growslice()


I guess I wish this was a bit more in-depth as to what "go" can do with channels or goroutines, but maybe I've just been using languages where all this is already possible. The article is a nice cursory glance, I just want to learn more :)

I mean the first example it looks like is using a Mutex, and then (b)locking on it, and the second just looks like having a queue of messages (mailbox) that it rips through (like the actor pattern).

Some questions I've got after reading this...

How does the Go runtime schedule these calls? Does it manage an internal thread pool? Is the scheduler asynchronous, parallel, or both? How do you manage contention or back pressure if I begin to flood messages to one channel (or many)? How many channels can I have open and what are the limits? Can I still lock the system stupidly using channels, and if so, how (or how not)?

Edit: Truly, I'm curious, because as I researched asynchronous programming and efforts to better future-proof my career (years ago) as we began really increasing core counts, Go never stood out. It's a fairly practical language, yes, but if I want a better paradigm for asynchronous programming in the future, it really isn't there (IMHO). BEAM stood out as something unique, the JVM stood out as something even more practical, and Rust stood out as something performant (with the serious caveat of not being C or C++), while Go has always seemed like an odd one to me... People talk about channels and goroutines like they're special, but they seem pretty damn run of the mill to me... WAT AM I MISSING?


Goroutines are scheduled onto system threads via M:N green threading. Each "system" thread can steal work from other threads, and there's some machinery so that blocking system calls don't use up your "goroutine system threads" (you can put a limit on the number of goroutines that run concurrently). Channels are fixed size on creation, either 0 (directly handing off items from one goroutine to another) or >0 (the channel has space allocated to hold that many items). If the channel is full (buffered, i.e. >0 size channel) or there's nothing waiting to receive the item (unbuffered channel), the sender sleeps until it can send the item. Channels are purely a data structure; have as many as you want. You can lock the system fairly trivially, but Go can detect it in many cases and give you stack traces of all the places stuff is waiting to read/write (only if all goroutines deadlock).
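The buffered/unbuffered distinction above in a few lines:

```go
package main

import "fmt"

func main() {
	// Buffered: sends succeed until the buffer is full; no receiver needed.
	buf := make(chan int, 2)
	buf <- 1
	buf <- 2 // a third send here would block until someone receives

	// Unbuffered: every send is a rendezvous with a receiver.
	unbuf := make(chan int)
	go func() { unbuf <- 3 }() // blocks until main receives below

	fmt.Println(<-buf, <-buf, <-unbuf) // 1 2 3
}
```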

edit: And no, there's nothing really special about channels, just that they play nice with the goroutine scheduler, so it's perfectly sensible to do lots of "wait on these channels" stuff inside your program without having to have lots of OS threads and such. (Goroutines are a bit more lightweight than OS threads.)


Go channels have a fixed capacity, by default zero, and their main special feature (compared to just having some concurrent_queue<T> type) is the select statement, which helps devs do things the right way. Nothing stops you from stupidly locking the system, e.g. by having a thread block itself by writing to the channel it’s responsible for consuming.


Is the select semantically different than something like this?

    bool TryWrite(T)
How does go handle cancellation with the select syntax? I guess you have a cancel channel and select across both the cancel and the read channels? Is Go smart enough to sleep and not busy wait in that case?


> Is the select semantically different than something like this?

Yes. select can do a lot of things. It's overloaded to be something like 5 different things. Here, let me show you:

    chanOne := make(chan int)
    chanTwo := make(chan int)
    ctx := context.Background()

    // Receive from whichever is ready first. Blocking.
    select {
    case v := <-chanOne:
        fmt.Println(v)
    case v := <-chanTwo:
        fmt.Println(v)
    }

    // Non-blocking read
    select {
    case v := <-chanOne:
        fmt.Println(v)
    default:
    }

    // Timeout and cancellation
    select {
    case v := <-chanOne:
        fmt.Println(v)
    case <-time.After(1 * time.Second):
        // timeout
    case <-ctx.Done():
        // cancelled
    }

    // Write or read, whichever channel is ready first
    select {
    case chanOne <- 1:
    case v := <-chanTwo:
        fmt.Println(v)
    }

Another form of cancellation is closing channels, which causes reads to return the zero value immediately, and causes any writes to that closed channel to immediately panic (effectively abort the program).
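To be precise about that zero-value behavior: the comma-ok form of a receive distinguishes "the channel was closed" from "someone actually sent the zero value", and range stops at close automatically.

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	close(ch) // further sends would panic; reads drain the buffer first

	for {
		v, ok := <-ch
		if !ok { // ok is false once the channel is closed and drained
			fmt.Println("channel closed")
			break
		}
		fmt.Println(v)
	}

	// range does the comma-ok check for you and exits at close:
	ch2 := make(chan int, 1)
	ch2 <- 7
	close(ch2)
	for v := range ch2 {
		fmt.Println(v) // 7
	}
}
```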


Neat. Most of the permutations are easily handled in other languages as well except for the full blocking (sleeping) case. Without a channel the scheduler can reason about, most languages would have to busy wait or possibly pull from multiple channels.


I think the biggest advantage is that it's a lot harder to screw up using a select statement than most equivalent API's you'd create in other languages that allow waiting on one of N channels/signals/channel/event operations, and branching based off that.

Simply because you're either constructing a bunch of callback objects or you get some number N saying the Nth parameter had an event, and your code has to match N to the specific channel correctly.


Go does put that goroutine to sleep, freeing the underlying OS thread to work on other goroutines, so it's very lightweight.


It's like what you'd do with actors on top of some kind of userspace scheduling (e.g. a thread pool that accepts tasks), yes. Writes to channels are implicitly yield points, so you write cooperative multitasking code (again, like you'd do in any language with userspace scheduling) with all the advantages that entails (no blocking), but without having to explicitly manage your sequencing/yielding. Since the whole language is built around the idea that this is going to be happening, a lot of the immediate problems with userspace scheduling go away: anything that's trying to do thread-local storage (e.g. web requests or database sessions getting bound to the current thread) will necessarily be goroutine-aware and do the right thing.

There's no free lunch, of course: it becomes much harder to write code that is pinned to a single system thread (e.g. good luck interoperating with a C library that expects you to pass callbacks), or to use higher-performance blocking I/O in the cases where that's warranted. But it makes the 80% case very straightforward.



I do like that select statement which hits the first case that has its channel ready with a message. That's very nice. And having channels in your standard library is brilliant and everyone should do it.

A shame about the shared memory thing though. I firmly believe that designing a language where memory is shared by default is a Bad Idea. You should probably provide a way to allow it when you really need it (for performance, usually, in very very very carefully-designed code), but having memory sharing by default is a source of soooooo many bugs.

I know, because I've caused most of them.


I've found channels create more complexity than they're usually worth, and it's often simpler, more readable, and more maintainable to just use a sync.Mutex or sync.RWMutex.


I'd be hard-pressed to disagree. The entire time reading this blog post, I was confused why a plain function couldn't have been used instead.


Channels are really nice; I love writing "workers" with them and sending errors and "status messages" with a simple `status <- "starting supernode"`/`errors <- err`; doing this with e.g. Node's async/await is just so much more complex.


Go channels are the nicest concurrency mechanism I know. They bring nice ergonomics to the conceptual simplicity of a select()/epoll() loop.


Apart from the main topic, I really liked the layout and theming of your blog. Curious to know if it's hosted somewhere or self-built.



For anyone interested in using channels and coroutines in PHP, swoole (https://github.com/swoole/swoole-src) provides a reasonable implementation!


off tangent, but I really like that syntax of defining types especially for channels:

    type Foo(chan<- int)
instead of what I usually see

    type Foo chan<- int
unfortunately it doesn't appear compatible with gofmt (entirely), which changes it to:

    type Foo (chan<- int)
I still think it's a good pattern for channels though. It makes it a lot clearer what the type is especially if you have a slice of channels:

    type Foo []chan<- int
vs

    type Foo [](chan<- int)


The only thing golang has going for it is "goroutines". Now that all other popular languages are getting some variant of async (e.g. C#) or green thread implementations (e.g. Java), it will be tough to advocate for golang for new projects given its severe shortcomings.


"Getting" and "built with" are entirely different beasts.

In Java, if it gets first class green threads...does all code immediately start using those instead of an actual thread pool? Probably not; your runtime behaviors would change. Will all libraries immediately change to using them? Assuredly not, again, runtime behaviors will change, and no matter how cleanly implemented, switching and testing will take effort, for both the library maintainer and the consumer.

As a consumer, then, you're left with a moving target as some of your underlying libraries make the switch, plus the related testing effort. And you also have to either make a clean switch over, or you have to mix threaded + greenthreaded libraries, which is likely non-trivial.

Languages that offer a single, sufficiently powerful concurrency construct avoid that; all golang libraries support goroutines (even if they are not themselves leveraging them).


I'm not sure what you're referring to. The JVM itself will support green threads, which means that any existing blocking libraries can be called from said green threads and they will become asynchronous. The runtime will take care of multiplexing the green threads on hardware threads when an IO operation takes place for instance. It's pretty much transparent to the user.


>> any existing blocking libraries can be called from said green threads and they will become asynchronous

Yes, but any frameworks that currently create and manage a threadpool will not automatically be changed, nor will any libraries/frameworks that rely on async code, nor any existing code you own.

For brand new projects, using only libraries that assumed blocking code, and had no threadpool expectations within them, sure, it's transparent.

For any old projects it's not. For any libraries that were written to be async already, you don't benefit. For libraries written to manage a threadpool, you have to wait for them to be updated. And still, as now, God help you if you want to mix and match, to use a library written to be async, and a library written to use a threadpool...but of course, now compounded because you want to write sane, performant, readable blocking code using green threads. You have to deal with all those incongruities and make them all work together.

Languages that have an M:N threading mechanism from the get go (hah!) get the benefit of everything building atop that from the beginning.


From my understanding, they're making the green thread API be very close to the current threading API, so that you need minimal changes to make them work. If I'm not mistaken, if you're already using an Executor, it's just a matter of passing one or two extra parameters when constructing it to make everything seamlessly use green threads.

Even with mixing and matching, current systems can already run thousands of hardware threads, so I'm not sure it will affect performance or debuggability in any major way.


Golang has simplicity going for it. It's an incredibly easy language to read and to keep in your head. Channels are nice too


Golang is very nice and simple, yet powerful and performant in many ways by default.

However, channels were way overhyped, and one really should think long and hard about whether to use them at all, i.e. find specific use cases. They're based on Mutex, but slower, and often end up becoming a more complex distributed pattern.

Why not use channels? While sharing state/memory by communicating sounds nice, in practice you end up with coding requirements for a distributed system. Do you need one? Fine. It could even work out nicely for making a skeleton distributed monolith, before splitting into microservices.

But for most common use cases, channels distribute behaviour across your system unnecessarily, and you end up needing to synchronize and clean things up between different moving parts. If you don't have clearly defined bounded contexts such a pattern can fit into, it just seems like increasing complexity for no good reason.

Of course, for very straightforward implementations that you know won't grow in complexity, channels are fine, especially for simply reusing a working pattern.


Other popular languages are not "getting" async. Asynchronous communication (just like synchronous) has been around since the dawn of computing.

The choice of synchronous or asynchronous communication comes with its own set of tradeoffs, but in most cases the additional complexity introduced by async is IMO not really worth it.


Async as a keyword (as in async/await, although really await is the keyword) is something Rust only got in November.


Other languages do it completely differently; there is no such thing as async in Go, since the paradigm is blocking from a programmer's perspective.

I don't think you understand how it works in Go.


What are you talking about? A goroutine can run asynchronously to another goroutine. It's async.

It's just a different perspective of concurrency than node.

I literally quote from go by example: https://gobyexample.com/goroutines

"Our two function calls are running asynchronously in separate goroutines now."


What they really mean by async is sync (await)! ;-)

In relation to Golang: https://yizhang82.dev/go-and-async-await


I'm aware of what they mean and how they are confused.

You should know that the author of that article failed to mention that node is free of race conditions caused by context switching, this is a huge deal and in many cases worth the trouble of async await syntax.


I know how it works in golang. It's more or less transparent to the user, just like how it's being implemented in Java, where the user doesn't designate functions as `async` and the runtime automatically takes care of it.


As an aside Java's concurrency is outright confusing and just plain hard to get right. Doing concurrency in golang was like a breath of fresh air.


What's confusing about it? It's built on threads and mutexes, not unlike golang with its goroutines and mutexes. The main difference is that Java will be getting green threads soon, which will allow the user to spawn hundreds of thousands of them if he/she wants to.


> What's confusing about it?

Do I use Thread, Runnable, Executor, ExecutorService, or CompletableFutures? How do they interact with synchronized and Locks and which Locks do I use? Where do Semaphores fit in this picture?

Sure, for someone who is in Java day in and day out and who deals with concurrent Java all the time, these might be very straightforward tools with clear separation of goals and intent. In my personal experience it is always easy to introduce bugs with Java's concurrency, and every time I have to do something concurrent in Java I reach for my notes first, trying to refresh my memory on what's what, which I don't need to do with golang.


> Do I use Thread, Runnable, Executor, ExecutorService, or CompletableFutures

It depends on what you need. Executor is an interface, so you can't use that directly, you have to use one of its implementations (like ExecutorService).

CompletableFuture is what you use when you want to handle the return type of a given function or method call. I've come across so much golang code where a function that returns an error is invoked with "go foo()", and the error is ignored.

The thing is, when you use golang, you have to re-write all these abstractions yourself anyway. Want to have a function execute in the background then you wait on the value? You have to pass in a channel and wait on it. What if it now returns an error? You have to pass in two channels now, or return a composite struct. In Java, you'd use a CompletableFuture and be done.
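The usual single-channel alternative to "two channels or a composite struct" is a small result type (names here are illustrative), which is roughly what a CompletableFuture hands you:

```go
package main

import (
	"errors"
	"fmt"
)

// result bundles a value and an error so one channel can carry both.
type result struct {
	val int
	err error
}

// fetch runs work in the background and delivers the outcome on a
// buffered channel, so the goroutine never leaks waiting for a reader.
func fetch(fail bool) <-chan result {
	ch := make(chan result, 1)
	go func() {
		if fail {
			ch <- result{err: errors.New("boom")}
			return
		}
		ch <- result{val: 42}
	}()
	return ch
}

func main() {
	r := <-fetch(false)
	fmt.Println(r.val, r.err) // 42 <nil>

	r = <-fetch(true)
	fmt.Println(r.val, r.err) // 0 boom
}
```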

Semaphores are similar to WaitGroup in golang (but with more use cases obviously).

You get much more control, as well as writing code with a clear intent when you use its concurrency abstractions. In golang, you're writing much lower level code that you have to dig in to understand what's going on.

Not to mention that in golang it's difficult to provide anything that comes close to what's in the java.util.concurrent package, mainly due to the lack of generics.


It's really not that different from C#'s async/await, except Go doesn't need the async markup to fix a syntax collision, and every method is awaited. Java plans to implement something similar.



