Living with Microsoft C++ Compiler Bugs and Ambiguities

mappu · on Oct 14, 2015

After reading http://blogs.msdn.com/b/vcblog/archive/2015/09/25/rejuvenati... i'm not really surprised about these kind of issues. It's impressive that VC2013 was as compliant as it was, when all template parsing was based on string interpolation instead of an AST (and simultaneously distressing that a billion-dollar company would release such a product so recently).

Is the submitted article talking about VC2015? Although i suppose they will need to keep supporting the older toolchains.

stinos · on Oct 14, 2015

Is the submitted article talking about VC2015?

It's a pitty it is not stated which version - maybe it means 'all versions' - nor possible fixes a user can use. E.g. for the 'Binding Rvalues to Lvalue References' case: using /W4 flag will yield a C4239 warning, turning on 'Treat warnings as error' will prevent using the code all together. For things like the '__VA_ARGS__ chaos' and 'Elimiated Types' on the other hand there is no fix (that I know of), but I'd argue such constructs do not have a place in modern C++ (or maybe even any C++ for that matter) anyway and they should be avoided by the users and as such should never turn into problems.

asveikau · on Oct 14, 2015

> For things like the '__VA_ARGS__ chaos' and 'Elimiated Types' on the other hand there is no fix (that I know of), but I'd argue such constructs do not have a place in modern C++ (or maybe even any C++ for that matter) anyway and they should be avoided by the users and as such should never turn into problems.

The point is that massively popular libraries (Windows SDK, boost) depend on or work around the broken behavior. This attempt to ignore the problem by calling it insufficiently modern is impractical for folks who need to care in the real world.

userbinator · on Oct 14, 2015

I think the standards for large languages like C++ should really be developed in parallel and released together with a reference implementation - that way others would have a baseline to compare behaviour with, and it'd discourage adding features that turn out to be ridiculously difficult to implement correctly (e.g. export templates) or diverging dialects. The reference implementation doesn't have to be optimised or produce optimised code, but should be written to follow the standard closely, favouring clarity instead of efficiency.

jcranmer · on Oct 14, 2015

Requiring a reference implementation with a standard is a bad idea. First off, who's going to write it? The compiler vendors are the people who'd likely do the best job, and they already have to adapt their implementations for changes--now you're making them do it twice. Another, even bigger issue, is that it actually doesn't tell you what's hard to implement. MSVC, gcc, and clang all take different approaches to their compiler, and what can be easy to implement in one could turn out to be ridiculously difficult in another (e.g., exposing a full AST via reflective mechanisms). You need a diversity of implementations, not just one implementation, to know what's actually hard to implement and what's not.

It's interesting that you bring up export templates, because that fiasco feature actually already sort of had a reference implementation. The original Cfront compiler actually supported it (admittedly, it was very buggy). The problem with export was that, at the time of discussion (~1996), no one really understood just what the impact of all of the template complexity meant--they assumed that templates were basically just a typesafe variant of the C preprocessor (or at least that most common use could be boiled down to that description; its Turing-completeness was known by then).

So, in that vein, export template was seen as a "well, this is going to be tricky, but so is name resolution in general in C++" and the overarching use case was compelling (you can hide the implementation of a regular function, why not templates?). Compiler vendors objected to the feature, but not on the basis that it was essentially impossible to implement, but rather that major design points would need quibbles. It was eventually accepted into the standard largely with the understanding that everyone would implement something close to it, if not the exact current specification.

The only thing that would have prevented export template from being standardized would have been requiring a fairly robust implementation that could have discovered that export template really did infect every part of the code base in a horrible way, and it should be noted that reference implementations generally aren't that robust. Amaya, after all, didn't prevent HTML 4 or CSS 2 from having unimplementable sections.

pjmlp · on Oct 14, 2015

> The original Cfront compiler actually supported it (admittedly, it was very buggy). The problem with export was that, at the time of discussion (~1996), no one really understood just what the impact of all of the template complexity

Actually I think they kind of did, as Ada already had generics and modules in 1983. And STL was born on Ada.

However, Ada compilers were once upon a time believed to be complexer to implement than C++ ones and had beefy hardware requirements, so maybe they didn't want to try that route.

Now they are trying to finally add modules. :)

gsnedders · on Oct 14, 2015

> Amaya, after all, didn't prevent HTML 4 or CSS 2 from having unimplementable sections.

Amaya was always in an odd place — it was a "testbed" and not a reference implementation. (Also, if you actually look at the parts HTML 4 that required anyone to do anything, you end up with things like dbaron's table (as in, a desk) implementation of it: put quotations mark at either end of the table, and whatever HTML you put on the table, it has fulfilled all the requirements of the standard.)

But yes, the underlying problem of different approaches is a large part of the reason why you want multiple implementations — the other large part is you want evidence the standard is clear enough to lead to interoperable implementations. These are the two big reasons the W3C now require evidence of two, distinct, interoperable implementations of a specification to proceed down the Recommendation track beyond Candidate Recommendation (i.e., to Proposed Recommendation)… and this is the reason why so much around the W3C takes so long, because people by and large don't want to fix ancient, obscure bugs which are often time-consuming to fix for the sake of the standard progressing.

Kristine1975 · on Oct 14, 2015

At least parts of it are essentially developed this way: C++14's features were implemented in clang while the standard was still in development (if memory serves) and several of its main programmers are members of the standards committee. The module system that's been proposed (for C++1z?) has been implemented in clang as well. Since clang is open source, it could count as a reference implementation ;-)

arximboldi · on Oct 14, 2015

In fact, the problem with having one single "reference implementation", like in Python, is that other implementations never get to be "compliant enough".

The situation in C++ is saner, since both GCC and Clang are moving the standard forward, and together make sure that the standard can accept multiple valid implementations and code tends to be truly portable.

The problem is really simply that Microsoft was never interested, at least until very recently, to have a truly standards compliant compiler.

Kristine1975 · on Oct 14, 2015

>The problem is really simply that Microsoft was never interested, at least until very recently, to have a truly standards compliant compiler.

I think Microsoft neglected C++ over .Net before suddenly changing their minds and deciding to support C++ more.

VS2010 was one of the first compilers (even before gcc/clang) to support parts of the preliminary C++0x standard (e.g. lambda functions). But when the final standard was released they fell behind in implementing it.

bstamour · on Oct 14, 2015

> VS2010 was one of the first compilers (even before gcc/clang)

GCC supported C++0x features back when they thought the standard would be released before 2010. Hence the 0x, not 1x. I remember using the `-std=c++0x` flag back in 2008, when I first took a course on C++, which is a few years before VS2010.

jcranmer · on Oct 14, 2015

I don't think it's that they weren't interested as much as it was that the team they had was relatively small and their compiler needed a massive refactoring to get there. MSVC never even built an AST for processing!

spoiler · on Oct 14, 2015

I'm a bit confused. Shouldn't this:

    X<b> x;

be this:

    X<B> x;

Or am I confusing things? Also, I assume the `</b>` at the end is a typo?

Furthermore, I wasn't even aware you could access `a` the way it's being accessed in the same example; I always use `this->a`. Actually, I'd access `x` through `this->x` anyway, in case there's a lvar in the context. Although, I rarely write C++ anyway.

dnesteruk · on Oct 14, 2015

Yes indeed, thanks for pointing this out.

pdkl95 · on Oct 14, 2015

That #define comment behavior is interesting - I wonder if it was the cause of a nasty bug I had a looooong time ago when I was porting the driver for our Serial HIPPI NIC to the initial release of NT 4.0 (NDIS 4).

There was a crash bug that happened only every few million packets. Obviously, this was a race condition. After a lot of hair-puling and waiting for 5-30min every time I wanted to test something, I decided to simply replaced the actual locking code with a single global spinlock at every entry to see if a serialized driver worked[1].

After that version still crashed, and a lot of trial and error, I finally found out that the spinlock were never actually running. Microsoft's compiler was compiling loops like this as if it was a NOOP:

    #define SPINLOCK(lock)      \
        for (;;) {              \
            if (TESTLOCK(lock)) \
                break;          \
        }

But it worked after I made this change:

    -    for (;;) {
    +    while (1) {

Empty for loops were dropped... but only in macros. Now I'm wondering if it was getting commented out in some subtle way.

[1] There was some concern about hardware errors, as a similar PCI chip we were using failed to notice changes in the PCI GNT# if it change REQ# in the same clock cycle.

shadowmint · on Oct 14, 2015

    This may have been a bit of a discouraging post with respect to MSVC,
    but do not worry — in a subsequent post we’ll take a look at MSVC-specific 
    constructs which, while not necessarily standard-compliant, are nevertheless 
    interesting and often quite usable

bah. Everything about the MSVC compiler is pretty much terrible.

Having some extra non standards compliant features in no way mitigates its terribleness.

The only good thing we can take from these articles is that yes, the guys at Microsoft have acknowledge how bad the situation is, and are actually trying to do something about it.

honestly? I'll believe it when I see it.

Breaking backwards compatibility is a nightmare because lots of libraries depend on the very particular way the MSVC compiler works; but that means you can't actually fix bugs.

So long story short, there's probably never going to be a 'good' version of MSVC. It'll pretty much be stuck with its peculiarities forever.

Maybe one day we'll get a 'new' compiler that can sit along the legacy one and actually compile, you know, the C++ standard. I'm not holding my breath, but hey, we can daydream...

cremno · on Oct 14, 2015

Making their preprocessor compliant with (modern) C and C++ standards is on their roadmap though (and two-phase lookup is too):

http://blogs.msdn.com/b/somasegar/archive/2014/06/03/first-p...

Also (nowadays) MS admits that their implementation of __VA_ARGS__ isn't compliant:

https://connect.microsoft.com/VisualStudio/Feedback/Details/...

userbinator · on Oct 14, 2015

I'm not a big C++ user, nor terribly fond of the language in any case, but IMHO the "single-phase lookup" MSVC is doing for templates is conceptually much simpler to understand (and apparently implement) than the standard 2-phase: instantiating a template is literally a textual substitution, which is intuitively what I'd expect and understand templates to be.

Kristine1975 · on Oct 14, 2015

With two-phase lookup errors in code that does not depend on template arguments are caught earlier (when the template is defined) than with one-phase lookup (when the template is instanciated). Which is why template code written for Visual Studio might not compile with a standard conforming compiler.

>instantiating a template is literally a textual substitution

I'm being pedantic here, but that's not quite the case. Example:

  template <typename T>
  void foo()
  {
    T t;
    ...
  }

  foo<int[2]>();

With simple textual substitution this would result in a syntax error, as

  int[2] t;

isn't valid C++ code.

misnome · on Oct 14, 2015

Awesome; I didn't know you could use macros to define REM as a comment in Visual C++!

marvy · on Oct 14, 2015

Me neither! Here's another cool macro trick: http://journal.stuffwithstuff.com/2012/01/24/higher-order-ma...

voltagex_ · on Oct 14, 2015

That's worthy of its own article.

Kristine1975 · on Oct 14, 2015

You haven't lived until you've implemented a compile-time counter in C++: http://b.atch.se/posts/constexpr-counter/ (which doesn't rely on compiler bugs)

cbd1984 · on Oct 14, 2015

A Real Programmer can write BASICA in any language!

progmal1 · on Oct 14, 2015

I understand that we get in a mess with different compilers not supporting the standard with cross platform code so MSVC should be changed.

But with respect to binding RValues to LValue references, why is the MSVC way not the standard? At first glance the MSVC way appears way more intuitive.

For instance if I have code: update_X(X()); I would expect to be able to refactor it to this safely: { X x; update_X(x); }

toolslive · on Oct 14, 2015

join the club: g++ and clang++ also have different opinions about what's acceptable.