tl;dr We rewrote some internal Python software in Go; it is now faster.
Unfortunately, there is no technical content (which seems endemic to the recent storm of Go articles). Is it faster because it is a static language producing machine code? Was their choice of Python packages wrong? How much of the performance improvement comes from knowing better where the bottlenecks are? Were other languages with larger ecosystems considered?
For me, the really interesting point of the article was less "yay, it's faster than Python" (which is generally undisputed), but more that (to paraphrase):
"I didn't know anything about Go other than 'it sounds cool', so I decided to rewrite this very important bit of Disqus' infrastructure. In one week, from start to finish, I had it implemented, tested, and deployed and improved latencies by several orders of magnitude and reduced the capacity requirement from four fully loaded servers to 20% of one!"
To be able to say that is pretty impressive, worth writing about, and definitely interesting to me as someone who wants to see Go used more. Disqus is well known, has lots of traffic, and so a recommendation from them carries a lot more weight than some random benchmarks. (As cool as they are.)
This was partially intentional on my part. A lot of the problems that we had with our Python backends were a bit unknown. We had some guesses, but not the right instrumentation to dig into them. There are memory leaks in gevent, and the amount of CPU being consumed was ridiculous.
Internally, we had each played with Go for a little bit, and it felt very natural to us. There weren't any other considerations really.
For comparison, we're still dark writing to the old realtime just to make sure that nothing is broken, and old realtime is consuming easily 16x the resources while technically doing a bit less, since it's not really publishing anymore.
My first compsci teacher was Harm Bakker aka Harm the Almighty Recursion Master, and anyone who followed his classes has the following quote imprinted in their brains:
"... and then it's tempting to start kludging. I have a suggestion: get it right the first time."
Whenever I'm programming I can still feel his disappointed stare of disapproval when applying a quick and dirty hack.
> "... and then it's tempting to start kludging. I have a suggestion: get it right the first time."
Where this falls down in the real world is that oftentimes your understanding of the problem is quite limited at first. You'll ship a demo or an MVP, and that will teach you a little. As you continue to explore the problem through your work you'll often redefine it. Or it will redefine itself, as happens in so many war stories of service scaling, product maturation, and so on.
Fortunately, it's not as black and white in the real world. Quick and dirty hacks are a real thing that will never go away. If anything, having the ability to do those under stressful scenarios is a good quality to have.
Someone already responded with hacks being a fact of life. There's basically CS programming, and then 'real life'. If there wasn't a need to generally get things out the door I imagine there would be far fewer exploits of software out there.
Is it really a surprise that the Python VM is not performant? I thought everyone knew that by now. Any reasonably well engineered VM (e.g. V8, JVM, Go) will blow it out of the water. This should be well known to anyone who calls themselves a software engineer.
The Disqus system could have been written in a large number of other languages and have performed more or less the same as the Go system. The only advantage I can see of Go is that it's reasonably close to Python/Ruby in terms of semantics. If Python or Ruby is your only tool then Go might be a good choice. Not so if, for example, you're more into the functional style.
Edit: Ok, so Go doesn't use a VM. It still has a runtime, as all languages must. Substitute runtime for VM above as appropriate. Point still stands.
>>Any reasonably well engineered VM (e.g. V8, JVM, Go) will blow it out of the water.
FYI, Go does not use a VM like V8 or the JVM; Go code compiles to native binary.
However, like many VM-based languages and unlike (a lot of) C, Go should be write once, (compile) and run everywhere (Win, Lin, BSD, Mac). Also like many VM languages, Go is type/memory safe and is garbage collected.
Edit: It appears V8 is also not a VM; it is an interpreter that performs just-in-time (JIT) compilation of source with no intermediate representation such as bytecode (Java instead JITs bytecode). This makes sense since Java programs are distributed as bytecode and Javascript programs are distributed as source. @calinet6: Thanks for pointing this out.
It is going to take ages to convince young developers that safe, strongly typed languages don't require a VM, and we have Sun and Microsoft to thank for spreading that misconception.
Where does that come from? People who have only ever used Java or C#? I mean, there's nothing conceptual about VMs or strong typing that would lead to such a conclusion.
Also, Go (via its usual compiler) is even more self contained than most binary languages: it compiles to a statically linked binary - it doesn't even depend on /usr/lib. If the CPU architecture and OS match, you can just throw the binary around with scp and run it wherever, no setup.
Does V8 specify a virtual machine and instruction set which you could make a real processor for? I thought it went straight from Javascript to either x86 or ARM.
"Point still stands" is a pretty snarky response considering you got important details flat-out wrong while suggesting your "information" should be "well known to anyone who calls themselves a software engineer."
It's OK to be wrong, just don't be a jackass about it... people will take you a lot more seriously if you're gracious when they notice it.
The CPython implementation is very slow. There are many language implementations that are much faster (e.g. O'Caml, Haskell, Scala, Clojure, Java, Lua, C, probably Rust). Any one of these is likely to produce a system about as fast as the Go system with about the same amount of effort. Thus dwelling on Go being faster than Python is not interesting.
What is interesting is the properties of Go that make it better or worse suited to particular organisations and problems. This is what I tried to get at in the second paragraph.
The precise implementation strategy of Go/Python/V8/whatever is interesting in its own right but irrelevant to my points.
Wait, what? You mean that the claim that "they could write it in a number of other languages" was falsified? Or maybe you think that "Python VM is slow"? No? Then these points still stand. Geez.
There is nothing special about ditching Python and getting speedups. Just last week I did the same with rewriting a service in Erlang - and I'm almost certain that it will perform even better on 8 cpus than even Go would.
Or Jython. From what I've seen, the Jython guys have been at the forefront of squeezing more performance out of the JVM, especially using the new invokedynamic / Method handles stuff.
1. Yes, that is one of the reasons for Go's speed.
2. You don't have many options aside from Gevent.
3. I can't speak for them, but my Python bottlenecks were mainly memory related. I was simply using up too much of it, due to how it was handling each request. My pattern was a basic Data/Handler/JSON one, which is as simple as they come. Yet I was having issues (at a small scale).
4. Yes, of course. I considered and built parts using other languages. Did some testing, too. But Go stood out as the best choice given my needs.
Maybe an insider could shed some light on this campaign. Does Google marketing dedicate people to cater for social networking sites and news aggregators? How is the 'storm' organized? Obviously, many startups try to spread the word through 'cheap' channels. Only few are really successful.
I don't think this is organized by Google. I think it's just a natural consequence of more people checking out and using Go for production work.
For what it's worth, outside of the FAQ, I don't think golang.org even mentions Google. In general, Go is kept very distinct from Google, so it wouldn't make much sense for Google to embark on a Go marketing campaign.
You do realize that Go is an open-source project with contributors from both inside and outside of Google? I imagine more users outside of Google use Go than inside.
I have done work with Python + Gevent before, and used it to develop the second Nuuton prototype. Go blows it out of the water. It is way faster, and simpler to develop. I went from a bottleneck of 3K hits/second to around 5K without any optimization (and a lack of general knowledge about Go). These numbers were generated during testing, and not in production.
Thanks to Go, my website Niflet can handle 2K hits/second on a cheap machine, each one unique and usually served within 5 ms. I'm finding Go to be well worth the effort to learn it.
For someone who hasn't invested much time into looking beyond the basic syntax of Go, can you give us a little info on what your stack looks like?
I haven't gotten to the point where I've looked into things like web frameworks, templating engines, Postgres adapters, anything like that. But I'm interested in it.
Niflet is my first foray into modern web development (I'm a database developer by day), so I kept it simple, just Go and SQLite. I'm using Go's webserver, which is basic but wasn't hard to extend to do compression and other niceties. Templating is via Go's template package. SQLite gets embedded, making the entire app a single 5 MB file. A free Dropbox account is used for nightly backup, handled by a scheduled goroutine (Go's concurrency feature).
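For the curious, a stack that minimal can be sketched roughly like this: the net/http server, the standard html/template package, and SQLite through database/sql with a driver such as github.com/mattn/go-sqlite3. The driver choice, schema, and handler below are my own assumptions, not Niflet's actual code:

```go
package main

import (
	"database/sql"
	"html/template"
	"log"
	"net/http"

	_ "github.com/mattn/go-sqlite3" // hypothetical driver choice; registers the "sqlite3" driver
)

// Templates live on disk and are parsed once at startup.
var tmpl = template.Must(template.ParseFiles("index.html"))

type page struct {
	Title string
	Hits  int
}

func main() {
	db, err := sql.Open("sqlite3", "./app.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		var hits int
		// Illustrative query; the real schema is unknown.
		if err := db.QueryRow("SELECT count(*) FROM visits").Scan(&hits); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		tmpl.Execute(w, page{Title: "Hello", Hits: hits})
	})

	log.Fatal(http.ListenAndServe(":8080", nil))
}
```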
For a heavyweight web app that needs to scale to multiple machines I would've done things differently. It turns out that Go is so fast that needing to scale would be a luxury problem for me. In theory I can serve 70M users spread over a 10-hour day for < $100 a month.
The standard library templating in Go is one of my favorite features. Between html/template and text/template, I probably use it in over half of the programs I've written.
The templates are on disk and their contents passed to the template function on initialization. So yes, the app is not a single file. I forgot about these files because they rarely change, hence are rarely deployed.
I'd like a copy-paste of that email as well! I'm learning Go by following the Tour and making some dumb 'university' level examples to get a feel for the language and I'd like to actually build a website with it.
I don't know why the parent comment does it but with nginx you can leverage things like caching out of the box. (Would mostly be used on static files, either generated or pre-made).
Same here. The amount of effort that was required to achieve better results was very minimal. I'm excited to see what I can do when I fully understand Go inside and out to make things even better. :)
Seems to be an ever increasing trend to replace Python components experiencing high load with ones written in Go. Is this Go's niche? Definitely a language I will be learning in the near future, it doesn't seem like it will be going away any time soon.
I've settled on Go to replace a worn out Java mess (otherwise a Python shop). We need the computational performance, and I do like the general feel of the language. I think this is something you're going to see a lot of going forward. It's the same niche Scala has been filling to an extent, but I personally think Go is a much better option (unless you need the JVM of course).
You might have misunderstood the parent, I think. It seems like he was saying that they used Java (and then Go) in lieu of Python because of the performance.
But to answer your question, the Language Shootout seems to suggest Java and Go are on the same plane in terms of speed. Take that with however many grains of salt you like.
It's also interesting that according to the Language Shootout Go uses a fraction of the memory of Java - meaning you can save a lot of money by deploying it on cheaper machines or VMs and get similar performance to Java.
I just checked out the language shootout. Go has really pulled ahead from where it used to be. Faster than SBCL or OCaml or Free Pascal on quad-core 64-bit - that's impressive.
I can use Scala and not pay that penalty. I'm looking at Go to replace components of my system where I don't want a full bore JVM, but I have to be thoughtful about latency. I prefer C for this (C++ seems to be the standard there). Would love to move to something like Go once I can.
I'd say Go's niche is pretty much any sort of network server. Right combination of high performance, easy multithreading, and making it hard to shoot yourself in the foot (no buffer overflows, for instance)
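As a rough illustration of why it fits that niche, a connection-per-goroutine server is about as short as it gets (a generic sketch, nobody's production code):

```go
package main

import (
	"bufio"
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", ":4000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Print(err)
			continue
		}
		// One cheap goroutine per connection; no callback juggling,
		// and all string/[]byte handling is bounds-checked, so no buffer overflows.
		go func(c net.Conn) {
			defer c.Close()
			s := bufio.NewScanner(c)
			for s.Scan() {
				c.Write(append(s.Bytes(), '\n')) // echo each line back
			}
		}(conn)
	}
}
```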
I've heard this several times. But the success stories are mostly concerned with smaller parts under heavy load. Go still is a lot harder to work with than Python if you know Python and its tools well...
With Go you don't have anything comparable to Django, Numpy, Pandas...
> With Go you don't have anything comparable to Django, Numpy, Pandas...
Given the age of the language I would assume building out its associated toolchain is only a matter of time. Python didn't ship with Django, Numpy, Pandas...
Python is rapidly becoming the lingua franca of data science; but there the vast majority of your inner loop isn't Python (numpy, scipy, pandas, numba, theano; C, Fortran, assembler, Cython, LLVM, CUDA...).
Basically Python is the glue language data people wanted, it turns out.
As are most of us lol. I would say if you want to use golang for a serious project and need additional libs you have to be ready to write them yourself, which grants the opportunity to give back to the community and contribute to open source etc.
Will this change with time? I know numpy, pandas and matplotlib are likely relatively large projects, but my assumption is someone will likely put together data analysis / matrix math libraries to perform some of these functions.
I think Julia (http://julialang.org/) might be a better alternative to Scientific Python than Go. I'm not sure you can get the same flexibility/expressiveness you get in Numpy/R/Matlab in Go.
I find it rather annoying that everyone seems to be advocating Go, it may be a good language, but I can't help but feel that the only reason it's being used is because Google's behind it. You guys should look into alternatives like Nimrod (http://nimrod-code.org/), which is in fact a lot closer to Python than Go will ever be.
People are advocating Go because it delivers. It has great concurrency primitives (channels), it is a "boring" language syntax-wise, and it is fast.
Beyond that, it deals with a lot of the 'ugly' things around the edges of other languages. Dependency management, build management, deployments... all of these, IMHO, are much better thought out in Go.
It delivers? Lol. The GC used to suck for 32 bit systems and it still sucks for realtime. As opposed to Nimrod's which pretty much guarantees a maximum pause time of 2 milliseconds -- independent of the heap size. And that's only the start: Go also lacks generics and meta programming in general. And its memory/concurrency model is just as racy as Java's.
Go was never designed for "realtime". Also, 32 bits wasn't the main compiler focus, 64 bits was; that problem was mostly fixed with the 1.1 release, so it's a non-issue now. The memory model seems pretty well defined without being too restrictive, and with the recent addition of the race detector, Go looks well equipped for these kinds of problems, and some pretty interesting projects are there to prove it.
You are correct on some points. 32 bit was broken, realtime is a non-feature. It does (by design) lack generics and meta programming (Pike talked about these at length at one point).
I have to disagree on the concurrency model, I think message passing channels are a much more natural primitive to model concurrency in, and goroutines are exceptionally light.
EDIT: When you talk about Nimrod, you might go ahead and mention you are the designer of Nimrod... it might color your judgement.
"Nobody is smart enough to optimize for everything, nor to anticipate all the uses to which their software might be put." -- Eric Raymond, _The Art of Unix Programming_
Go's primary niche is server software, and in that niche, it is gaining in popularity and has the backing of a large company. For servers, neither support for a 32-bit address space nor real-time support is important.
Does support for generics really matter when the language has built-in support for the most common collection types?
The allowance of shared mutable memory between goroutines does worry me somewhat.
Personally, having it backed by Google makes Go better in my opinion. I feel a ton of smart, very insane individuals are working on it and it can only get better.
Nimrod? Never heard of it or the person backing it.
I'm _not_ a language snob, trust me! I'm just a regular, family guy, software developer and I try to put my proverbial eggs in reliable baskets.
Go doesn't seem to be going away any time soon and it's really really fast.
If you want to put your eggs in reliable baskets then why not use Java, or C#, or even C++. Those languages are most definitely not going to go anywhere anytime soon.
I now have to yet again close my apps and restart my computer for an IT-forced Java runtime update, for some app I rarely use. For that reason alone I'd not consider Java.
I forgot that C# added more asynchronous stuff since I last coded in it. I used to love C# but it's more verbose than Go and I'm weary of the obfuscated MS documentation that pushes me straight to blogs. Also I gave up on using Windows for web development.
If I'm not mistaken, that's a fallacious slippery-slope argument. I'm still undecided on how to weigh language popularity against other factors, but surely relative popularity is an important factor in things that have a big effect on real-world software development, such as availability of libraries and tools. For now, at least, Go is much more popular than Nimrod, and also has more people working on the language implementation and surrounding tools.
I believe Go's purpose is exactly that of replacing more familiar languages with a statically compiled version that is palatable. It's fulfilling a niche where, otherwise, one would resort to C/C++.
If you're CPU bound, I think it's a good option to at least check out. It can work with real threads, and doesn't compete with the GIL. In our case, our actual CPU usage was low, but due to the high concurrency, any little bit of CPU work requires a context switch and contends with everything else. We alleviated that by just using a bunch of processes.
Will it replace all of our Python? Highly doubt it. Some very specific pieces? Probably.
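To make the real-threads point above concrete, here is a toy sketch of CPU-bound fan-out that would contend with the GIL in CPython but maps onto OS threads in Go (the workload and numbers are invented):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

func main() {
	runtime.GOMAXPROCS(runtime.NumCPU()) // older Go releases default to running Go code on 1 OS thread
	workers := runtime.NumCPU()
	results := make([]int, workers)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = busyWork(50000000) // runs in parallel; no global interpreter lock
		}(i)
	}
	wg.Wait()
	fmt.Println(results)
}
```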
That's cool, but I am wondering where are the trials of say Cython or PyPy? Why is there a need to switch to a different language, where are the benchmarks or profiling to show where the performance issues lie? I like Go, and I plan to learn it myself, but I just don't get why people jump ship so quickly from Python. This post makes it seem like Python is slow. I say it's not the language but the interpreter. The GIL is nonexistent in Jython or IronPython. It can be avoided with the multiprocessing module. Gevent may be good for regular Python, but if you have the GIL getting in the way, best to switch to something else.
Our assumptions were that the GIL was getting in the way with our excessive concurrency. gevent + PyPy is not really a safe thing yet.
This entire project was also stood up in a week, so it's not like this service is some monster with a lot of work invested into it. It was small enough that instrumenting it, figuring out where the problems lay, and fixing them would probably have taken just as much time.
Perfectly true, gevent + pypy requires a separate build of gevent, but pypy comes with eventlet, so it might have warranted an investigation to see if it makes sense to switch.
However, I suppose without details of the service itself, it's hard to make a judgment call from the sidelines.
Still, glad you were able to improve performance. Why was 'Go' chosen?
Nevermind, I see that was answered in another thread.
I think part of the lesson here is that Disqus went very far with their realtime service using Python, a tool they were very familiar with and one that allowed them to iterate quickly. Migrating to Go took them very little time, once it became necessary.
I concur. I've been using Go for close to a year now; I discovered it inside Google and continued to play with it after leaving. It's really a fantastic language to work with. You can get away with using type inference most of the time, which keeps the language feeling very dynamic. The built-in library support is very mature and provides almost all of what you need.
How would you normally go about stress-testing a Django app like that to look into bottlenecks?
All the articles and talks I see on scalability and optimization use production servers as examples, but as you say, you want to be optimizing for traffic instead of responding to it.
There's no way to stress-test a complex web realtime system without actually running it on the real hardware.
You'd expect very nonlinear behavior between number of users, response time and number of machines used.
Not being a part of Disqus, the only way I see to "predict" performance beyond the load already seen is to try and use past data from the current real system.
Here are some potential reasons (I obviously don't know the actual reasons). You won't like any of them, but I hope you can take me at my word that I'm not a partisan (I have a passing familiarity with Erlang that comes of maintaining a Riak cluster, and I've built some fairly large projects in Golang, but my core languages are C and Ruby).
* Golang has a more conventional, familiar, boring, "safe" language design than Erlang does. If your current language is Python, you're apt to find Golang congenial; I describe Golang to my friends as "a modernized hybrid of Python and Java".
* Golang has a more flexible concurrency offering than Erlang. It has a very flexible (statically typed, sync-or-async) first-class channel type that you can deploy anywhere in a program whether you parallelize it or not. More importantly, it has first-class support for "conventional" shared-everything concurrency with mutexes and semaphores and whatnot (see the sketch after this list). If you're designing everything from scratch to fit your language's preferred idiom, this may not matter, but if you're porting an existing design it might matter a great deal.
* Golang has what I will perhaps hyperbolically refer to as first-class support for strings, with reasonably performant regular expressions, a clearly delineated "just a bag of bytes" type designed from scratch for high-performance buffering and packet framing/demux, and a sane approach to UTF-8 Unicode. It also turns out to be a very elegant framework (as imperative languages go) for building parsers, since you can put the concurrency stuff to work for it. These are common programming tasks, and not ones Erlang is well known for handling gracefully.
* As I understand it, both Erlang and Golang have comparable lightweight process/thread/coroutine/whatever facilities. Golang was designed from the start to handle huge numbers of demand-spawned threads with small, dynamically allocated stacks. I'm not saying Golang does this better than Erlang does, but it might be tricky to argue that Erlang's process model is better than Golang's.
* Erlang requires an installed runtime and executes in a VM. Golang produces native binaries. You can compile a Golang program on one machine, rsync that single binary to another machine that has never been blessed with a Golang installation of any sort, and run it without a problem.
* Golang has modern, extremely high-quality tooling. Entire huge Golang projects rebuild in seconds or less. The compiler is extremely helpful. Code formatting and documentation are first class problems for the toolchain. The Go package system is well thought out, and, again, is familiar to Python programmers.
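To illustrate the "sync-or-async channels plus conventional shared memory" point above, here is a minimal sketch (my own toy example, not anything from the thread):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	syncCh := make(chan int)      // unbuffered: a send blocks until someone receives
	asyncCh := make(chan int, 10) // buffered: sends don't block until 10 items are queued

	go func() { syncCh <- 1 }()
	fmt.Println(<-syncCh)

	asyncCh <- 2 // returns immediately; the buffer holds it
	fmt.Println(<-asyncCh)

	// The same program can also use plain shared memory behind a mutex.
	var mu sync.Mutex
	counts := map[string]int{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counts["hits"]++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counts["hits"])
}
```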
> Golang has a more conventional, familiar, boring, "safe" language design than Erlang does.
Do you mean "safe" as in "nobody got fired for buying IBM" or as in a language that makes it hard to start a thermonuclear war by accident? Because Erlang, with its immutability everywhere, is as safe as, or safer than, Go.
> Golang has a more flexible concurrency offering than Erlang.
Basically everything you wrote here Erlang has. And what they were porting was in Python + GEvent, which means no threads anyway.
> a clearly delineated "just a bag of bytes" type designed from scratch for high-performance buffering
Erlang's binaries and strings (two different types) are designed for the same thing.
> but it might be tricky to argue that Erlang's process model is better than Golang's
Well, Go did a nice job borrowing Erlang's solution here. As it originates from Erlang, it has to be good. ;)
> Erlang requires an installed runtime and executes in a VM. Golang produces native binaries.
Erlang is able to produce a single binary. That it contains a VM and libraries and user code is another matter, but rsyncing Erlang binaries is also possible.
> Golang has modern, extremely high-quality tooling.
Dialyzer, EDoc, various process viewers, debuggers and so on for Erlang are really mature.
I think that familiarity of syntax, familiar programming paradigm and raw (single threaded, number crunching) speed were the reasons. Go has many benefits over Erlang in the realm of marketing, but I don't believe Erlang is much worse a language because of it.
I meant "safe" in the "nobody got fired for buying IBM" sense. My sense of it is that from now until the heat death of the universe, Golang developers will say their language is safer because of its type system, and Erlang developers will say theirs is safer because of share-nothing and immutability. I think both languages are safer than Python.
But then, you cannot have it both ways. Either Erlang is "safer" because of share-nothing and immutability, or Golang is more flexible w/r/t concurrency. I can design a shared concurrent data structure in Golang that is protected by locks or atomic counters, and I can do that easily, using the native first-class facilities of the language. Can two Erlang processes cooperate using a single shared buffer with a custom-designed concurrency scheme? Is that a natural thing to express in Erlang? It's a natural thing to express in Golang.
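For instance, here's a toy sketch of what that looks like with the standard sync/atomic package (my own example, nothing from the thread):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

func main() {
	var hits int64 // a single value shared by every goroutine below
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&hits, 1) // lock-free shared mutation, no message passing required
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt64(&hits)) // prints 100
}
```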
I feel like you didn't really engage with the string processing point I made.
I feel like it doesn't help an Erlang advocate's point too much to observe that Golang stole the best parts of Erlang.
I feel like every time it is ever pointed out in any language comparison that Golang can produce runnable native binaries, someone always has some rube-goldbergian alternative that nobody in the real world ever uses to get their own language bootstrapped onto some routine from a single file.
I gave specific reasons why Golang's tooling is strong. I feel like I got a response that says "Erlang's tools are mature". Nobody is arguing that Erlang is immature. The argument is that it's geriatric. :)
Ultimately, I agree that most of what I think is good about Golang is probably icing on the cake after "the language is boring enough to be familiar to Python programmers".
> This is a taskbar button press for me.
I'd hate for anyone to lightly brush off this feature. The fact that you can just rsync a binary means you can just spin up a machine, not care about what libraries are installed, and just run the Go binary on it.
I know there are tools like Chef out there, but I can spin up production-ready Go application servers on any Linux box in a minute because of this feature.
Erlang shows, logs, send via mail and telepathically tells you about what died where with all the details, and then additionally restarts the failed process according to restarting policy and tries again. Or not.
Anyway, I think this thread was about a comparison of Erlang and Go - other threads are already full of people writing about how wonderful Go is. As a part-time Erlang developer, in this thread I'd like to read what Go does better than Erlang, not what is good about Go in general, because the latter just lowers the signal-to-noise ratio.
And it's possible to build single binary and deploy it with Erlang too.
Static typing with a very convenient expression in the language.
A concurrency model that is easier to adopt if your frame of reference is concurrent C++.
A simpler, friendlier syntax, which is probably not a win if you're a veteran Erlang programmer.
Probably better tooling: native binaries, a lightning fast compiler with great error messages and testing facilities, &c.
Perhaps a more modern standard library, which is made somewhat simpler and more concise by the pragmatic adoption of a very little bit of conventional OO, without going whole hog the way Java does.
They can be. By default a `make(chan int)` is blocking, but you can give it a buffer size with `make(chan int, 1000)` or something and it won't block until it's full.
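A tiny sketch of the difference (toy example):

```go
package main

import "fmt"

func main() {
	unbuffered := make(chan int) // a send blocks until another goroutine receives
	go func() { unbuffered <- 1 }()
	fmt.Println(<-unbuffered)

	buffered := make(chan int, 3) // up to 3 sends complete with no receiver at all
	buffered <- 1
	buffered <- 2
	buffered <- 3
	// A fourth send here would block until something reads from buffered.
	fmt.Println(<-buffered, <-buffered, <-buffered)
}
```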
When this happens, it seems like it tends to be exposing a fault in the design of the program. In a purely asynchronous system you can obviously avoid deadlocking in interprocess communication while still having a system that never correctly converges.
Sure, a program that uses a channel with a large buffer size as if it were an asynchronous channel contains a bad bug. The point is that if you need such an asynchronous channel, Go doesn't provide it. There are many possible examples of programs that need truly asynchronous channel functionality in which using a channel with a large buffer size would expose the program to subtle deadlocks that may only manifest in the wild on large data sets.
Wouldn't a correct design for programs that occasionally needed to handle huge data sets be to consciously and deliberately serialize (or at least bound the concurrency of) some parts of the code, rather than to pretend that the program was operating on an abstraction that could buffer arbitrary amounts of data in parallel and always converge properly?
Yes. You can always build the right thing in a system with synchronous channels—there are many ways to build asynchronous channels out of synchronous channels, after all, even if they aren't built into the language. My point is that asynchronous channels make it easier to avoid accidentally shooting yourself in the foot, that's all.
For what it's worth, I don't think that Go made a bad decision here (although I personally wouldn't have made the same decision, because of examples like those I gave in my other reply downthread). Certainly synchronous channels are faster. There are always tradeoffs.
If you want async, you could really just keep consuming from the channel with goroutines until your hardware catches on fire. There's nothing in Go that is limiting this behavior.
Async communication makes it harder, but doesn't avoid it by default. E.g. if in Erlang gen_server A calls gen_server B w/o timeout and B, to process this call, then calls A (ouch, bad design, but possible), you've got a wonderful deadlock.
I'm using both, Erlang/OTP and Golang. Both have their strengths and advantages. So simply use the right tool for the right work (and no hammer to turn a screw).
Right, you can deadlock in any actor-based system (well, not any actor-based system—I've seen research systems that provably do not, but they're research). Async communication just makes it harder to screw up, as you say.
My point is that there are actually very few concurrency problems where deadlocks are solved by increasing the buffer size by some fixed amount. If you want your code to be correct, in most cases a buffer size of 100 might as well be a buffer size of 1, except that an increased buffer size can improve performance for some scenarios.
When people talk about asynchronous channels, they usually mean that you can stream messages to another actor and know that you won't block. That is not true for Go channels. You can increase the buffer size, but that just reduces the chance that your program will deadlock: it doesn't make a Go channel work in situations where you need an asynchronous channel for your program to be correct.
then your normal program flow won't block. I guess that's what you mean by deadlock. If your program runs into an actual deadlock, the runtime will detect it, crash the program and show stacktraces.
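For what it's worth, the runtime check being referred to looks like this on a deliberately broken toy program:

```go
package main

func main() {
	ch := make(chan int)
	<-ch // nothing will ever send; every goroutine is now asleep
	// The runtime aborts with:
	//   fatal error: all goroutines are asleep - deadlock!
	// followed by a stack trace for each goroutine.
}
```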
That just gives you more complicated deadlocks. You still have to be aware of the potential for tasks to deadlock and consciously design them not to do that. It's a problem that does come up all the time in Go programs, but tends to come up quickly enough (due to the way concurrency is designed in Golang) that you fix it quickly, like an accidental nil reference in a Ruby program.
The technical barrier for Erlang is much much greater in my opinion than Go. Introducing something into our stack is a very tricky situation. We are mostly all Python engineers, so bringing in something new has to be intuitive for others to pick up on and get involved.
I like Erlang a lot, but learning Go is much easier, especially for people with a background in imperative languages. When you code in Go everything seems very familiar, even for someone who tries it for the first time; in Erlang not so much. Now, don't get me wrong, I think it is amazing to know different paradigms, but some people just can't afford that luxury (think lack of time).
There's probably no match for Erlang's error handling mechanisms out there, I actually dislike Go's panic mechanism, but if you have a small piece in your system which needs to be wickedly fast and highly concurrent I see Go as a perfectly good choice.
The type system in Go is as important as the concurrency system. Sure, Erlang can do concurrency as well as (or better than) Go, but it doesn't have the type safety that Go offers.
For one, a shared-nothing design makes migration to different machines easier and doesn't require a stop-the-world garbage collector.
Erlang is memory safe in the presence of many-core (or many-system) parallelism. Go is not (you can segfault, possibly in an exploitable way, if GOMAXPROCS > 1).
Erlang unbounded channels reduce deadlocks because your sender can continue execution without waiting for a receiver.
Have we heard a lot of stories about deployed Golang apps having problems because of garbage collection? It's true that this is a significant designed-in advantage for Erlang, which can run its collector on a process-by-process basis.
What's also true and potentially compensatory is the C-like degree of control Golang gives you over how you allocate memory and lay it out.
I'm not sure the unbounded channel thing is a real advantage for Erlang. I'm happy to be convinced I'm wrong. What's a real, correct design which would be hard to realize in Golang (without unbounded channels) that relies on unbounded channels?
> Have we heard a lot of stories about deployed Golang apps having problems because of garbage collection? It's true that this is a significant designed-in advantage for Erlang, which can run its collector on a process-by-process basis.
In general most Go apps that have been deployed are Web apps and server infrastructure, where concurrent garbage collection is not too much of a problem in practice. So Go's choice makes sense in Go's context. It does limit parallel scalability in some contexts—which of course are not the contexts that most people have been using Go for at this point.
> What's also true and potentially compensatory is the C-like degree of control Golang gives you over how you allocate memory and lay it out.
Go doesn't give you C-like control over allocation of memory. Language constructs will allocate memory in ways that are not immediately obvious, to quote Ian Lance Taylor [1]. It does give you control over layout of memory.
> I'm not sure the unbounded channel thing is a real advantage for Erlang. I'm happy to be convinced I'm wrong. What's a real, correct design which would be hard to realize in Golang (without unbounded channels) that relies on unbounded channels?
Suppose you're pulling down images from the network and printing out a sorted list of URLs of all the images you find. You might structure it as two goroutines A and B. Make two channels, "urls" and "done". Goroutine A is the network goroutine and simply crawls looking for images to stream to B over the channel "urls". When it's done it sends "true" on "done". Goroutine B is the sorting goroutine and first blocks on the channel "done" before it proceeds, after which it drains the "urls" channel and sorts the results.
This program contains a deadlock due to synchronous message sends. If there are more URLs to be downloaded than the buffer size of "urls", then the program will deadlock. If "urls" were an asynchronous channel, however, this would be fine.
Of course this can be structured to fix it, by doing the send in another goroutine for example (although that costs performance). But hopefully that's a good illustration of the subtleties of synchronous message sending.
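Here is a rough sketch of that shape with the crawling stubbed out (my own toy code, not from the article):

```go
package main

import (
	"fmt"
	"sort"
)

func main() {
	urls := make(chan string, 4) // finite buffer: more than 4 results and this design deadlocks
	done := make(chan bool)

	// Goroutine A: the "crawler" (stubbed out) streaming image URLs to B.
	go func() {
		for i := 0; i < 10; i++ {
			urls <- fmt.Sprintf("http://example.com/img/%d.png", i) // blocks once the buffer fills
		}
		done <- true // never reached: A is stuck on the send above
	}()

	// Goroutine B (main here): waits for "done" before draining and sorting.
	<-done // blocks forever; the runtime will report the deadlock
	var all []string
	for len(urls) > 0 {
		all = append(all, <-urls)
	}
	sort.Strings(all)
	fmt.Println(all)
}
```

With a truly asynchronous "urls" channel the naive version would work; in Go the usual fix is for A to close(urls) when it's finished and for B to simply range over the channel (dropping "done" entirely), or, as the parent says, to push each send into its own goroutine.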
> Go doesn't give you C-like control over allocation of memory. Language constructs will allocate memory in ways that are not immediately obvious, to quote Ian Lance Taylor [1]. It does give you control over layout of memory.
These are two sides of the same coin. Having control over memory layout allows you to implement what are in effect allocators.
Here, you will never actually block on your send since it runs on its own goroutine. I can't see an actual use case for this kind of thing, but since you are using this argument over and over then... :)
It's mostly FUD, but it's a rathole of an argument since nobody has defined what "exploitable" means. Rather than nail down the term so we can all have ourselves an even more pointed language war, we should probably just let this go.
In general I consider segfaults exploitable, because of heap spray and virtual method calls. Even if not "exploitable", I consider it "very very scary".
This is far fetched. The threat scenario that page contemplates is "what happens if you're trying to safely run untrusted Golang code, as if it was content-controlled Javascript and you were a browser". It's not the case that Golang in its natural environment gives attackers the ability to paint the heap with malicious addresses and then provide themselves with a statistically significant shot at corrupting memory to exploit those addresses.
Your point is a reason that Golang couldn't be dropped in as a browser Javascript replacement. But nobody has ever suggested that it could be; it can't, just like Java (which was designed for the purpose but failed at it) and Erlang (which wasn't) can't.
> It's not the case that Golang in its natural environment gives attackers the ability to paint the heap with malicious addresses and then provide themselves with a statistically significant shot at corrupting memory to exploit those addresses.
What if a Go program allowed Go objects to be scripted by untrusted user code written in JavaScript? In browsers it is very possible to corrupt the Frame (Gecko)/RenderObject (WebKit) tree, which is in a C++ heap that is separate from the JavaScript heap.
The memory safety issues in C++ are a problem not because they mean that untrusted code written in C++ can't safely be executed (although it does mean that). It also means that safe languages become easily weaponizable. JavaScript (or Lua, or whatever) embedded in a Go program could paint the heap with malicious addresses.
This argument reduces to, "what if a Golang process exposed enough of its runtime to Javascript so that Javascript would be able to simulate an attacker just having access to Golang in the first place". In reality, if you were wacky enough to bolt Javascript onto Golang, you probably wouldn't do it in a way that would enable heap spray exploits, even if you had no idea what a heap spray exploit was.
The Javascript/C marriage problem isn't "heap spraying", it's that the interpreters themselves are full of exploitable C bugs, which is made much worse by the fact that the Javascript object lifecycle is expressed in an inherently unsafe language and so every tuple of [reference, event] has to be diligently checked. The same simply wouldn't be the case for any realistic marriage of Golang/Javascript, if only because the number of exploitable code conditions in Golang is minuscule compared to that of C.
All heap spraying does is make bugs that are very plausible to exploit easy to exploit reliably. You still have to start with "plausible".
> All heap spraying does is make bugs that are very plausible to exploit easy to exploit reliably. You still have to start with "plausible".
I'm not confident that race conditions in a shared-everything language are not "plausible". My experience is that race conditions are subtle and hard to find, even with a race detector. All you have to do is race on a map or a slice. And virtual calls are everywhere in Go.
Sure, we don't know that it's a problem so far, as nobody has created such a scenario. We're in violent agreement there. I grant that for server-side use cases, it doesn't matter—people use enormous C++ server codebases in production all the time and memory safety issues rarely bite them to the same degree that we see in browsers.
All I'm saying is that I don't have the same level of confidence that Go is free from memory safety exploits that I have for, say, Erlang or Java.
The use-after-free bugs that people are heap-spraying to exploit are plausible because the people who find them can tell you a simple story about how a program writes attacker-controlled data to attacker-controlled addresses. Unlike the Golang hypothetical you offer, they aren't plausible just because someone says they are.
You're wildly off the mark when you say that C++ serverside code tends to survive against attackers looking for memory corruption bugs. They do not. They fail with memory corruption flaws routinely. That was a cheap shot (you tried to create an equivalence class of unsafety between two totally unrelated languages and two totally unrelated sets of bug classes) and it won't work. You're going to have to try harder to make a case, if it's worth it to you.
Nothing is as bad as browser Javascript (it would be hard to conceive of a harder software security problem to design against), but C++ server software is pretty far towards the "unsafe" side of the security spectrum, and Golang and Erlang probably occupy virtually the same spot on that spectrum.
Unfortunately, I think we're basically at an impasse here.
I've described a scenario whereby a Go program that embedded untrusted safe code could fall to memory safety vulnerabilities. To be exact, it creates a slice of interfaces and accesses the slice in a racy way, such that it calls virtual functions at the same time it inserts, causing the slice to be reallocated. Then an attacker sprays the heap with addresses of shellcode. This results in arbitrary code execution when calling a virtual method.
You're saying that this is so unlikely as to be implausible, as it's never been observed in practice and might not even work. That's fine, I respect that position. We'll leave it there, and agree to disagree about whether this is a concern relative to languages like Erlang that are designed to be 100% free of memory safety problems. :)
What you've done in this comment is recapitulate the idea of a web browser executing content-controlled code, which is a problem that neither Erlang nor Golang could safely solve, and which no reasonable designer would ever use Erlang or Golang to solve, but layered on just enough abstraction to make that observation sound symptomatic of a problem with Golang.
I'm not trying to be pissy about it; I make bogus arguments all the time too, often without realizing it. You obviously know what you're talking about. I just think in this one subthread, you're wrong.
I don't dispute that segfault almost always means exploitability (I've seen friends write amazing exploits using only the slightest restricted memory corruptions).
But the data races don't bother me at all, first because you won't encounter them if you embrace a program design facilitating goroutines and channels (the often quoted "don't communicate by sharing memory; share memory by communicating"), and second because we now have the tools to detect data races.
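For reference, the kind of race being discussed can be as small as this toy example, and `go run -race` (or `go test -race`) will flag it at runtime:

```go
package main

import "sync"

func main() {
	counts := map[string]int{}
	var wg sync.WaitGroup
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counts["hits"]++ // unsynchronized concurrent map write: a data race
		}()
	}
	wg.Wait()
}
```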
Segfaults do not almost always mean exploitability. In fact, the largest class of segfaults (un-offsetted NULL pointer dereferences) are rarely exploitable. The argument that says "look at that program, it segfaulted, it is probably exploitable" is not really valid.
honestly, I don't know. I know Go does concurrency well and I like it, the only reason I said it was equivalent or better was to avoid a flamewar. I've never even used Erlang.
I love Go and continue to play with it, glad to see another story.
I have to say I always thought Disqus were a cool team, I have been doing websites for over a decade and whenever I needed a comment system they were the first choice. It was a real shame when I came up against their recent decision to push out the 'Discover' feature. It resulted in my having to remove Disqus from most of my clients websites as I received several angry emails complaining about porn links (it wasn't porn, but sleazy celeb stuff like bikini photos etc). Sad to say I have written them off as a commenting platform; what they did, although probably a brilliant business decision and making them loads more ad money, felt like a complete betrayal of trust.
With changes like these I always wonder if the users actually see the difference in performance. Was there positive feedback after these changes were rolled out to production, or rather, was there a _decrease_ in people complaining about server performance issues?
This article uses the term "realtime" and then discusses latencies in seconds and tens of milliseconds. It further says that Go approaches C in terms of speed. Can Go actually achieve C latency which is an order of magnitude faster than this, or is that hyperbole?
How does GC impact that? What about C-to-Go interop? Is there a latency penalty?
Yes, the GC has a large impact on this, and yes, C-to-Go interop has a heavy penalty. Memcachier did a talk at a Go meetup where they discussed a custom slab allocator they built to work around some of these problems.
Like C, Go is compiled to a native binary. Certainly something like GCC will tend to produce faster code, but there's nothing fundamental here preventing you from writing code that can run fast in Go.
The real question isn't "can go achieve C latency" but rather "will every-day code written in Go achieve the same latency as every-day code written in C". And the answer is probably: it'll be slower, though usually in the same ballpark, but it will be easier to write, easier to maintain and have fewer bugs.
For me it is a very real question of whether "can Go achieve C latency" because if I'm going to deploy a Go application it is going to be to replace very performance critical components that are currently written in C. If they weren't latency sensitive neither C nor Go would be my choice (I'd go higher up the abstraction hierarchy to make it easier to write, easier to maintain etc).
If Go can't do that (or can't do that yet) it's fine by me, I'll just let it bake a few more years and reassess, but there have been tons of articles lately talking about how great Go's performance is. I'm just curious if that is an artifact of the original languages being compared to, or if it has broken through to C levels of latency. If it has, it becomes a lot more interesting to me.
In raw performance, Go is relatively close to C. But since you keep mentioning latency: the garbage collector makes Go unsuitable for hard realtime applications.
Well, according to the benchmark game Go tends to be about 3 times slower than C on tight, highly optimized benchmarks[1]. Python tends to be 40 times slower than C[2]. Those, of course, might not be representative of your tasks, or the optimization effort you're putting in.
These benchmarks seem to rely heavily on math operations, and the Go bench seems to be compiled with gc. I suspect that gccgo with the latest runtime (there has been no release of gcc with the 1.1 runtime yet) can show prettier results with the use of the more mature gcc optimisations.
If you need hard realtime, Go is not an appropriate solution. For a start you should expect occasional GC pauses on the order of tens of milliseconds, with no upper bound if you're using a lot of memory.
That said, Go actually can come close to C on sustained throughput. (On some benchmarks, naive Go programs beat naive C, but usually Go is somewhat slower.) You can interoperate with C using cgo, and there is no penalty. (It may not work with all Go compilers.) If you want to interoperate with C++ then see http://www.swig.org/Doc2.0/Go.html for the best solution. I believe that there is a performance penalty there.
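For the cgo point, a minimal sketch of what the interop looks like (illustrative only; how much the per-call overhead matters depends on how chatty the boundary is):

```go
package main

/*
#cgo LDFLAGS: -lm
#include <math.h>
*/
import "C"

import "fmt"

func main() {
	// Each call crosses the Go/C boundary; cheap in absolute terms,
	// but worth batching if it sits inside a hot loop.
	root := C.sqrt(C.double(2))
	fmt.Println(float64(root))
}
```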
This is true. :) Our definition of "realtime" is slightly different than a CS definition. Sub-1s end to end time usually falls under "realtime" when it comes to the web, but it depends on what the subject is of course. In our case, that's perfectly acceptable. We're obviously shooting for lower, but at this point, we can't do much to improve network latency and other factors involved. So for now... this is good. :)
Most of the latency involves other components in our system. This isn't just computational time. Most of the actual computation with our transformers and whatnot are done in less than 1ms.
I was very curious about the claim that Go approached C speeds. I've not seen those claims before and the latencies cited certainly don't back that up.
C inter-op speed would be a big part of any such claims as any systems language would need to support that.
Well, the latency referred to is not a factor of the Go code. It's more a factor of our nginx front ends, which are handling a few million concurrent connections. The time spent doing computation is being measured in nanoseconds.
This is not the first company switching from a dynamic language to Go. Python and Ruby are great for getting started quickly, but when you scale up you need a workhorse like Go. Nothing too fancy, but nice features that give the language a modern touch. I guess Github and more will pick it up soon.
I find blanket statements about performance like this to be pretty meaningless. I develop both Python and Go as well, and honestly performance is not the number one criterion I've used in deciding which to use for a project. I ask questions like "Are there existing Python packages that tip the scale?", "If I expect to need real concurrency, would it be IO bound or CPU bound?", "Do I want to make the tradeoff of the increased stability of static typing in exchange for more man-hours on this project?" Things like that.
It's perfectly fine to be using more than one tool. Starting with Python can be a good idea because of the rapid development and turnaround while experimenting. If Python ends up being fast enough, that's great. But if it doesn't, then it's nice to have this new language which is almost native speed and still fun to write, unlike Java or C++.
So in your opinion, languages are ranked in order of "performance"? See, in my experience, it just doesn't work that way. Sometimes it does, sure, but it's not a rule that I can use in my life.
Seems like it would be nice if I could just say "I need this done fast, I'll use Python" or "I need 'performance' so I'll use Golang"
And to be fair, many benchmarks in Go don't perform nearly as fast as C or even the JVM. Go has gotten faster with each point release but again, these sorts of raw performance benchmarks aren't really good for anything except providing an apples-to-apples comparison. Because it's not usually a question of building the same exact app in the same exact way in one language or another.
That's good to hear :-) As you see, there's quite a bit of interest in such articles. I presume you can give a lot of additional details without exposing any IP.
Now I can't log in with the social network buttons. Some stupid password is requested. Guys, the whole idea of those login buttons is to not ask the user for an email and password.
Jobs were originally pushed from our Django app into a Celery job, which consumed the message and published into Thoonk. Thoonk was then consumed by the old Python backend.
With this rewrite, we've also removed Thoonk and are using a fanout exchange in Rabbit directly. Go consumes the Celery job from the Rabbit queue.
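For anyone curious what the consuming side of a fanout setup can look like in Go, here's a rough sketch using the streadway/amqp client; the exchange and queue names are made up and this is not Disqus' actual code:

```go
package main

import (
	"log"

	"github.com/streadway/amqp"
)

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}

	// Fanout exchange: every bound queue gets a copy of each message.
	if err := ch.ExchangeDeclare("events", "fanout", true, false, false, false, nil); err != nil {
		log.Fatal(err)
	}
	q, err := ch.QueueDeclare("", false, true, true, false, nil) // anonymous, auto-deleted queue
	if err != nil {
		log.Fatal(err)
	}
	if err := ch.QueueBind(q.Name, "", "events", false, nil); err != nil {
		log.Fatal(err)
	}

	msgs, err := ch.Consume(q.Name, "", true, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}
	for m := range msgs {
		// A Celery message body is JSON (or pickle); decode and handle it here.
		log.Printf("got %d bytes", len(m.Body))
	}
}
```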
What does realtime refer to in this case? Live updating of Disqus threads? Does that mean there's an open connection for every post I visit that has Disqus comments?
Yes. Our realtime updates vote counts and new comments over websockets if your browser supports it. If not, we fall back to some more typical HTTP streaming.
Another HN thread, another half a dozen troll comments spouting inaccuracies about the language, then covering up for gross misunderstandings of the language and then saying "but I'm still right". Gotta love Go threads.
It's more like the anti-Scala, because you can read other people's code within a few hours instead of shrugging for 6 months because they just stumbled upon yet another language feature.
Perhaps I wasn't clear enough. Go is the new hot language on HN. It's what startups are switching to and blogging about en masse.
Before Go a bunch of shops switched from ruby and rails to scala for various reasons. Before scala companies switched from either php or something enterprisey like Java/C# to rails. Before that people switched from C++ to Java/C#. Before that C to C++. Before that ASM to C. Before that punchcards to ASM.
Thus, in terms of the hype cycle, Go is the new Scala.