Compiler Explorer (godbolt.org)
303 points by stmw on Aug 5, 2020 | hide | past | favorite | 48 comments




And a related article written by Matt Godbolt himself

"Optimizations in C++ Compilers (acm.org)" https://news.ycombinator.com/item?id=23822044 (102 points, 23 days ago, 20 comments)


Oh, wow, I'm sorry! It was new to me, and the submission form didn't flag it as a duplicate for me.

It is so neat to have this smorgasbord without needing to install emulators or containers with toolchains, or worse, acquire all of the relevant hardware as in the old old days.


It's not a duplicate! Reposts are ok after a year or so: https://news.ycombinator.com/newsfaq.html.

I post those links because many users like to read older threads. Still looking for brief unambiguous wording to make that clear... "if curious see also" is still leading to some misunderstandings.


Obligatory xkcd: https://xkcd.com/1053/

As a compiler writer, I find this useful when I have an "oh crap" moment and need to find a simple C program that generates specific IR, or need to see the IR of a simple C program - despite literally having a tip-of-trunk build of the compiler I'm working on. It's that convenient.


It is possible to hook up custom compilers if you run Compiler Explorer locally; at least, I vaguely remember using that in a previous role. Sounds like it could be useful for your use case.


Godbolt links are always great for resolving "well actually" arguments about how smart compilers are (which, as you can guess, are quite common on Hacker News). No more "the compiler will do this": if you're going to claim that, you'd better have a Godbolt link to back it up!


Something that surprised me: compilers happily optimize pointer-to-member-function calls out if they know definitely what the pointer points to, but not if the pointer is to a virtual function. I recently figured that out thanks to a slightly related debate about whether the first case would be optimized at all (I thought it would), but the second case surprised me, since it's not as if there's a virtual dispatch left to worry about once you've already taken a reference to it; you're already specifying exactly which implementation you want. Just one of many nuggets learned from Godbolt: apparently marking a function virtual can kill optimizations even in cases where it seems like it shouldn't.


Where's your point of confusion? There's still virtual dispatch going on with a pointer to virtual member function... they're typically implemented as fat pointers containing the vtable offset, and calling via the pointer still does virtual dispatch.

An alternative would be for the compiler to generate a stub function (which can be shared across all classes for cases where the same vtable offset is used, unless your architecture has some very exotic calling convention) that jumps to a fixed offset in the vtable, and use a skinny pointer to that stub as a virtual member function pointer, but that's not the usual implementation.

In any case, calling through a pointer to virtual member function has to perform virtual dispatch unless the compiler can statically constrain the runtime type of *this being used at the call site. Remember that C++ needs to allow separate compilation, so without a full-program optimizer that breaks dlopen(), C++ can't perform devirtualization on non-final classes.

Making the class final will allow the compiler to statically constrain the runtime type and devirtualize the pointer to member function call.

Edit: added paragraph break.
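
A rough sketch of the two cases, with made-up names (not taken from any of the threads above):

    struct Base {
        virtual int get() const { return 1; }
    };

    // final lets the compiler prove the dynamic type of any Derived object.
    struct Derived final : Base {
        int get() const override { return 2; }
    };

    int call(const Derived& d) {
        // Taking &Base::get yields a "fat" member pointer that still encodes
        // "virtual, dispatch through the vtable slot of get".
        int (Base::*pmf)() const = &Base::get;
        // Without final this call does virtual dispatch; with final the
        // compiler can, as described above, devirtualize it to a direct call.
        return (d.*pmf)();
    }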


Ahhh, nevermind. I was under the impression that when you grabbed a pointer to member function, it was grabbing a specific implementation. I actually did not know it supported virtual dispatch.

Now I think I finally understand why MSVC pointer-to-member-functions are 16 bytes wide; I knew about the multiple-inheritance this-pointer adjustment, but I actually had no idea about virtual dispatch through member pointers. Frankly, I never used them in a way where it would have mattered.


Try making your class final. Then it won't bother looking up the address of the virtual function in cases where it knows which class it is.


They can optimize a call through a pointer to a virtual function, but they need PGO data to do it.

The problem is that most devs don't bother with PGO.


In the absence of PGO data, some compilers nonetheless will speculate the target of a virtual function call and optimize as such.


It could be done as a link-time optimization: check whether only a single implementation is used and then devirtualize the calls.


True, I imagine that clang with ThinLTO might do it.


Unfortunately I am too late to edit, but actually I simply misunderstood that the member function pointer would do dynamic dispatch. Somehow, after all these years, I never used that feature. (Although it explains some of the extra bytes needed in a pointer to member function; prior to now I was only aware of the this-pointer offset part.)


I've seen devirtualization happen with link-time optimization as well.


Godbolt is great for comparing compiler versions (and compilers).

For example, you can see GCC's progression of efficiency for C atomics with https://godbolt.org/z/brsoEr. If you increment the GCC version number, you will see the (very slow) mfence disappear and xchg show up.

Then there is Clang at O3: If an int falls in the forest, and there is no one around, was it ever incremented? No. The function turns into a bare ret.
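
For reference, the function in question is tiny; the linked snippet uses C atomics, but the C++ rendering is roughly this (not the exact code behind the link):

    #include <atomic>

    void bump() {
        std::atomic<int> i{0};
        // i = i + 1 is a seq_cst load followed by a seq_cst store, not an
        // atomic RMW. Older gcc emits mov+mfence for the store, newer gcc
        // uses xchg, and clang at -O3 deletes the unused local entirely,
        // leaving a bare ret.
        i = i + 1;
    }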


Compilers, versions, and also switches and instruction sets. I think Compiler Explorer should be taught in one of the labs of any CS61C-type course, when you're first exposed to assembly language.


I'm actually quite surprised GCC's output isn't also a bare ret. It of course is if you replace the atomic_int with a regular int; I don't know why this wouldn't hit the same optimizations. Yes, it's atomic, but it's still an unused local that doesn't escape the function.


Clang/LLVM also does heap elision (removing unnecessary mallocs), while GCC didn't last time I checked. I think the LLVM devs allow themselves a more practical reading of the C++ standard.
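
A minimal example of the kind of elision meant here (a made-up snippet, not the one behind the links below):

    int no_heap() {
        int* p = new int(42);   // the allocation is unobservable from outside,
        int v = *p;             // so the compiler is allowed to drop the
        delete p;               // new/delete pair and simply return 42
        return v;
    }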


Well we could check with godbolt:

https://godbolt.org/z/1reM4s

GCC seems to eliminate them since at least 4.7.1.


That's fair. Here is another example, where it only eliminates them since 5.1.

https://godbolt.org/z/3oGc3P


Right, that's why atomic ints are not a replacement for volatile when doing inter-process communication.

If the memory access (load or store) itself is the desired behavior, you had better make sure you add volatile, even to atomic<int> types. The atomic<> simply provides certain guarantees about the atomicity of that access relative to other accesses within the same process, not about accesses from a different process. If the compiler's analysis determines that the atomic<> store/load isn't necessary within the currently compiled program, then it may completely elide it.
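
A sketch of what that means in practice (the shared-memory setup is assumed, e.g. the pointer coming from mmap):

    #include <atomic>

    // flag lives in memory shared with another process. atomic<> makes each
    // access indivisible and ordered; volatile tells the compiler the loads
    // themselves are observable behavior and must not be optimized away.
    void wait_for_other_process(volatile std::atomic<int>* flag) {
        while (flag->load(std::memory_order_acquire) == 0) {
            // spin until the other process stores a non-zero value
        }
    }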

I think GCC may be disabling certain optimizations on atomic<> access because many people mistakenly use atomic<> for interprocess communication and it would break that code?


Also languages and bugs.


Godbolt is such an amazing tool, and amazing that it's free.

For a random example from a few days ago, I wanted to understand how Rust compiles various approaches to doing pairwise addition between a f64 vector and a f32 vector: https://godbolt.org/z/9envsT. Profiling can tell me which is fastest, but godbolt is really helpful for understanding why.

(Fun fact I learned recently, after years of using it: Godbolt is named after its creator, Matt Godbolt [0]).

[0] https://xania.org/MattGodbolt


Indeed, with such a splendid last name, it must take considerable humility to avoid naming all of his projects and even variables with it. I doubt I would be able to resist.


"As you can see, step 3 of the Godbolt algorithm increments the fourth Godbolt variable by the gradient of the Godbolt network's Godbolt matrix..."


He has a patreon if you'd like to contribute: https://www.patreon.com/mattgodbolt


And you can get stickers!

No other sticker has quite the cachet of a Godbolt sticker.


It is not named after him, at least not anymore: the tool is called Compiler Explorer.

The thing is that he originally served it from his own domain, so people use the two names interchangeably.


> The thing is that he originally served it from his own domain, so people use the two names interchangeably.

And it still is, right? Or is there also a Compiler Explorer domain?


It's on https://compiler-explorer.com/ too. But it's too long to type, and so many weblinks point at godbolt.org that I've long since accepted that while its name is definitely "Compiler Explorer", folks will call it "godbolt" and type that into their browser. It's shorter and frankly when life gifts you a surname like mine, why not accept it? :)


for fun it's also on godbo.lt


Throwing a bunch of code at godbolt and seeing it spit out a single 'mov' is definitely on my list of Top 10 Most Satisfying Things: https://godbolt.org/z/bdn37q
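
A classic way to reproduce that feeling (not the snippet behind the link above):

    // gcc and clang both constant-fold the entire loop at -O2, leaving
    // essentially a single `mov eax, 5050` followed by ret.
    int sum_to_100() {
        int total = 0;
        for (int i = 1; i <= 100; ++i)
            total += i;
        return total;
    }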


If you are wondering how it works, here is Matt Godbolt himself explaining it. Surprisingly simple and cheap. https://www.youtube.com/watch?v=bSkpMdDe4g4&feature=youtu.be...


I'll watch the video later, and I'm probably missing something, but at the moment I wonder what more it is than just running "gcc -S", which spits out assembly code.

By the way, what I think could make this tool more useful for the average user is if the assembly were decompiled into another language (like C).


> By the way, what I think could make this tool more useful for the average user is if the assembly were decompiled into another language (like C).

It's somewhere in the ideas list... https://github.com/compiler-explorer/compiler-explorer/issue...

Something similar is Andreas Fertig's awesome https://cppinsights.io/


This cleans up the assembly and gives you access to many more compilers.


There is also a console client for it - cce[1]. It is written in Rust.

[1] https://github.com/ethanhs/cce


There is also one for .NET called SharpLab: https://sharplab.io/


How is this the first time I heard about this?


I recently discovered that it supports OCaml: https://ocaml.godbolt.org/


There's a language dropdown on any editor window at the top right. You'll see we support quite a number of languages!

We also have a REST API:

    $ curl -s https://godbolt.org/api/languages
    Id       | Name
    go       | Go
    c        | C
    fortran  | Fortran
    c++      | C++
    cppx     | Cppx
    assembly | Assembly
    cuda     | CUDA
    python   | Python
    llvm     | LLVM IR
    d        | D
    ispc     | ispc
    analysis | Analysis
    nim      | Nim
    rust     | Rust
    clean    | Clean
    pascal   | Pascal
    haskell  | Haskell
    ada      | Ada
    ocaml    | OCaml
    swift    | Swift
    zig      | Zig


Wow, that's awesome!

It's pretty cool being able to compare equivalent programs in different languages.


I made a submission two days ago about an impressively smart optimization in Clang (which GCC doesn't do) on a badly linear isEven function: https://news.ycombinator.com/item?id=24049872
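
For anyone who doesn't click through, the flavor of the function is roughly this (a made-up version, not necessarily the exact one from that submission):

    // Parity computed in O(n) by flipping a flag n times. Per that thread,
    // clang recognizes the alternating pattern and reduces it to a test of
    // the low bit of n, while gcc keeps the loop.
    bool isEven(unsigned n) {
        bool even = true;
        for (unsigned i = 0; i < n; ++i)
            even = !even;
        return even;
    }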


Enjoying the Python code deconstruction in Godbolt as well; it feels like it could help with understanding how things work in general (even though it's less of an issue than with actual compiled languages).


Surprised at the big differences between the various x86-64 compilers.



