How we made compiler warnings fatal in Firefox (blog.mozilla.org)
235 points by nachtigall on July 5, 2017 | hide | past | favorite | 115 comments



I remember my first programming job straight out of school, where a full clean build of our project emitted 10K+ warnings, and I thought to myself, "Is this really how professional programming is?" Asked a co-worker why on Earth we don't fix them, as there may be serious bugs hiding in there, and the response was "We're too busy fixing bugs!"


Related:

"We're going to rush through implementation so that we have lots of time left at the end to fix all the bugs caused by rushing through implementation."


Also we will never actually do that last part.


But I get to look high-minded by participating in the hand-wringing about the problem without actually moving a muscle to improve the situation.


The last refuge of the scoundrel!


Related: "We're not sure if this product has a market, and the company that will acquire our tech will integrate it with their stack."


Honestly, if there's a warning you've accepted as a known issue or inapplicable to your code, just suppress it. Maybe document it as a low priority bug in the tracker so it has a place on the to-do list, but get it out of the compiler results.


> Congratulations! You now have fatal warnings on everywhere that is practical.

I think that last bit is the most important one. You need to be practical about stuff like this.

Putting a ban-hammer down like a CI-nazi is most likely not going to get you the results you were hoping for.

Edit: Looking at the linked issue [1], this has taken quite some time. First comment on that issue is 15 years old :D

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=187528


I think that linked bugzilla issue is amazing, slow patient work over years to fix issues. We should really celebrate this sort of effort more in our industry.


Move slow and fix things?


Pun intended? ;-)


To be fair, the process of making them fatal was a lot more recent. Probably only took 3 or 4 years!


It also means that this is mainly applicable on software that is expected to be in development for many years and then have many years of active maintenance.


Software that is not expected to be in development for years probably doesn't need this. Get it working, manually tested, ship it and move on. This is probably cheaper than the extra engineering effort to make everything perfect. Most warnings are for things that won't matter since you won't port to that system (it doesn't matter if you invoke undefined behavior if the only compiler you use at your optimization level does what you want).

All of the above will come back to bite you hard if you have to release again. Some will come back to bite you right away (the expense of manually testing everything), while others can lurk hidden for years (little vs big endian issues).


> (it doesn't matter if you invoke undefined behavior if the only compiler you use at your optimization level does what you want)

I mean, in one sense you're right, but UB can make even trivial maintenance absolute hell even if you stick to the exact same build toolchain.


That's great if you work at a startup that you don't expect to make it. Or you're in one of those groups at a big corp that's really only doing busy work to satisfy manager egos and you know the project will be canned before it's ever used. If people are actually going to use your software, then this applies.


The field I was thinking about was actually game development. Usually games are developed, shipped once, and then only have a short maintenance lifetime before the developers move on to other projects. This applies a lot with shrink wrapped games and also with game jam games, but far less so with the games in perpetual beta or early access, and also with long lived free-to-play or MMOs.


One big reason to make compiler warnings fatal (not mentioned here, as far as I can see) is the fact that otherwise, unless you do a full rebuild, you won't see warnings for any source files that don't need to be rebuilt.


The other reason is that people are lazy, and soon, you will have thousands of warnings, and as a logical result, people will simply ignore them all.


We use the Warnings plugin and set Jenkins to always do a full build, so we can have a good overview of potential warnings on the various platforms a project is built on.

Not surprisingly, 0 warnings on your dev machine doesn't mean you also have 0 warnings on a production build for another architecture..


That's always something where I think current build systems are failing.


But if you didn't change a file, do you really want to see warnings for it? Wouldn't you rather just see warnings for stuff you changed and see all of the warnings when a full build is done in CI?


I guess the ideal would be the possibility to see both.


Perhaps not build systems, but I think editors that help you keep track of warnings should probably have a list of them per-file and only clear that list when the file actually gets rebuilt


I don't know, I'd rather have that functionality in the build system. The editor should just parse the build system's output and keep the stuff it does to a minimum.


Why not just fix the warnings and be done with it? Seems like far less work to me.


How do you deal with third-party includes, particularly with big header-only libraries? You basically have to change compiler flags on the fly, and AFAIK that isn't well supported. My colleagues have dealt with this by defining some ugly macros which enable/disable hardcoded warnings, and there are lots of them.


I've had good success including not-my-code with the appropriate compiler flag. This tells the compiler that I'm not interested in warnings in those files since I'm not going to change them.

https://gcc.gnu.org/onlinedocs/cpp/System-Headers.html


How does this work with templates? If a type you pass into such a library causes a warning in that library, is that your fault or the library's? How could the compiler possibly tell? It seems like this can't even work in principle if there are templates involved.


The warnings I've had issues with were warnings regardless of whether the code was templates: ignored return values, unused parameters, etc.

There may be cases where the potential false negatives are unacceptable. Be judicious, but in my case the benefits far outweighed the risks.

Of course, it's a great idea to provide upstream patches and issues whenever possible.


Thanks, this looks like a better solution!


That's good to know. I usually just add a warning suppression or two.


Make a warnpush.h file where you use compiler pragmas to suppress warnings and then make a warnpop.h file to restore. Then you do

  #include "warnpush.h"
  #include <crappy.h>
  #include "warnpop.h"
I always do this for third party headers in my projects. Except for std headers.

Example:

https://github.com/ensisoft/newsflash-plus/blob/master/app/c...



Heh, an analogous file in my colleagues' project is 250 non-empty LOC. It seems to list every warning for GCC from 4.8 to 6.X. Sorry, not open source (yet?).


With GCC and Clang, you should use -isystem<thirdparty/header/dir> instead of -I<thirdparty/header/dir> for these include directories. This will suppress warnings from these headers.

CMake also supports this through include_directories(SYSTEM …).

With MSVC you will have to fall back on macros.


Can't you do it just with some #pragmas?


If the number of distinct warnings is small, I tend to fix them and submit a PR.


The hardest part about where you are and where you want to be is the getting there from here.

This is some good practical advice (and a welcome improvement) from Mozilla.

I'm also reminded of an email I sent out to a technical mailing list some time back, in response to a user's question about suppressing warning messages when running his program (this for a runtime-based interpreter).

Various suggestions came for how to limit, suppress, or remove warnings from the log.

My response, which apparently earned some favourable consideration at the vendor:

"Fix the bugs in your code."


I'm surprised it took them this long. I've used `-Wall -Werror` on all my code for decades. If you use it from the start it's easy to maintain and it catches a lot of potential bugs. There are occasional false positives and warnings coming from 3rd-party headers. For these I add warning-specific suppressions.


Did you build on many different platforms using multiple compilers with various degrees of standards support, just as Mozilla does? That makes it harder (for starters: how do you suppress warnings in a portable way?)


It would be nifty to suppress warnings on a per-include-directory level by passing flags directly to the compiler. This is how static analysis tools like PVS avoid flooding us with errors.


That’s what Mozilla does too, going by the article.

However, that’s not what the grandparent wrote: "For these I add warning specific suppressions”. I read that as disabling single warnings, not a compilation unit at a time.

For example, in Visual Studio, you can do (https://msdn.microsoft.com/en-us/library/2c8f766e.aspx):

   #pragma warning(push)
   #pragma warning(disable: 1234)
   ...
   #pragma warning(pop)
Intel’s compiler is compatible with this (https://software.intel.com/en-us/cpp-compiler-18.0-developer...), but it would surprise me if it used the same warning numbers.

GCC and clang, on the other hand, need (https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html#D..., https://clang.llvm.org/docs/UsersManual.html#controlling-dia...):

   #pragma GCC diagnostic push
   #pragma GCC diagnostic ignored "-W..."
   ...
   #pragma GCC diagnostic pop
Because you ideally only want to target the exact place that triggers a warning, either is fairly wordy, and it gets worse if you want to cater for all three compilers.

It also wouldn’t surprise me if Mozilla supported multiple versions of gcc (for instance because they want to build for an OS with long-term support, or for an obscure OS for which the latest gcc is not (yet) available).

(_If_ you use C99, at least with gcc and clang, you can simplify that by using _Pragma, which has the big advantage that you can use it in preprocessor macros, so that you may be able to simplify what things look like in the source)


>> ideally only want to target the exact place that triggers a warning

On the other hand, it's typically some library that is messing up the static analysis tool.

While the #pragma options are good for disabling warnings in typical projects, from personal experience they are broken in NVCC.


Usually what people do is they develop with -Wall -Werror but then turn off -Werror for the shipping version.


> There are the occasional false positives and warnings coming from 3rd party headers.

Can't you just -isystem those includes?


Also -Wextra, but I haven't tried the hyperpedantic options that need to be enabled individually.


We did the same for the linter in our JavaScript project. Of course, the linter config is focused mostly on pitfalls and "good parts", not on code style (except for the most annoying code style issues).


Can't most of these code-style issues be auto-formatted on save these days?

Years ago when I still programmed in Go I made use of that all the time: if it didn't format on save I knew I had a compiler error somewhere.


kozak is specifically saying their linter is mostly focused on any non-codestyle issue.

And compiler errors are not worth linting given you can just do a syntax check, linting is useful to avoid or require disambiguation of error-prone constructs.

Examples in JS would be `with`, `==`, assignments inside a conditional check, parseInt without a base (mostly in older browser), parseFloat, non-prefixed octal literals (073 rather than the explicit hex-style 0o73), string expression statements (outside of the "use strict" pseudo-pragma), and several scoping errors (which strict mode will mostly only check at runtime). All of these are syntactically perfectly valid, but more often than not they're also mistakes or sources of errors.


assignments inside a conditional check

I see this in C code often. Why do people seem to like doing this?


As demonstrated by other commenters, makes for a terse iterator-ish pattern in languages without iterators (like C, or until recently Javascript) e.g.

    while ((c = getchar()) != EOF) {
     //
    }
in a language with iterators would be

    for (c in chars()) {
     //
    }


And note that this C pattern only works when you can pass the "end of iteration" signal in-band, if you can't you will need to use more complex pseudo-iteration protocols e.g.

    while(generator(&c)) {
        //
    }


Because

    while ((c = getchar()) != EOF) {
     //
    }    
Is considered more idiomatic and readable than any alternative that doesn't assign to c within the conditional. You'll see similar idioms in many other languages, e.g.

     while((line = reader.readLine()) != null) {
     }
But if you're talking about e.g.

    if (foo = (bar & 0xf)) {
    }
It's just bad style ...


I think it's more idiomatic because otherwise you have to do:

  char c = EOF;
  while (c != EOF) {
    c = getchar();
    ...
  }
The idiomatic way reads better: "while next character is not EOF".

Your (excellent) example of bad style is bad because it reads poorly: "if the result of the assignment of bar & 0xf to foo is true". Not only that, but it is not obvious what the result of an assignment is, if anything. (Yes, I do know it is the value, but personally, I think <nothing> is a more logical result).


Don't you mean

    char c = getchar();
    while (c != EOF) {
        c = getchar();
        …
    }
?

Which not only reads badly but requires writing the "producer" twice. Though alternatively you could:

    while (1) {
        c = getchar();
        if (c == EOF) { break; }
        ...
    }


    char c;
    do {
        c = getchar();
    } while (c != EOF);


This is worse in most cases, as you normally want to do something with c after you have read it, forcing you to re-check the condition:

    int c; //a char can't represent EOF
    do {
        c = getchar();
        if (c != EOF) {
           //do stuff
        }
    } while (c != EOF);


if ((err = func())) goto fail_err;

Two of the most hated things in C, but so convenient...


Masochism?


I remember compiling and working with Firefox code many years ago; one thing that surprised me was the non-fatal assertions continuously printing errors to the console in debug builds. It looked like Netscape legacy. I wonder if they managed to clean this up.


The newer MOZ_ASSERT is fatal in debug builds. The non-fatal legacy NS_ASSERTION still exists and unit tests complain if the number of times it fires for a given test changes.

So it's under control, but the Netscape-era assertions haven't all been converted to reliable assertions.


Firefox ESR 52.2.0 on Debian still prints a whole bunch of assertion errors if you run it in a terminal:

    (firefox-esr:345): GLib-GObject-CRITICAL **: g_object_ref: assertion 'object->ref_count > 0' failed


I've never seen any Gtk application that does not spew messages like this at least occasionally.


I've noticed this as well in all the Gtk apps that I've run through the terminal.

Never having programmed in Gtk, what is the cause of this? Is Gtk broken or are all the programmers using it incorrectly? I've programmed a great deal in WinAPI and stuff like that never happens.


No one knows how to use GTK correctly, not even the GTK developers. At least that was the situation in 2014 when subsurface moved from GTK to Qt, https://www.youtube.com/watch?v=ON0A1dsQOV0.


I think this is very common.

There's a bunch of boilerplate code that just gets carried from project to project.

My favourite was when KDE decided to move to cmake. People realized that most m4/autoconf macros were there for no reason and no one actually knew what they checked for let alone how.



> Sam Leffler's graphics/libtiff is one of the 122 packages on the road to www/firefox, yet the resulting Firefox browser does not render TIFF images

Don't even get me started on the circular dependencies that need to be broken by installing packages with some options disabled. Kind of an awkward bootstrap.


I added CMake support to libtiff last year. Haven't checked if the ports tree uses it yet.


WinAPI does not print warnings to the console about an accidental attempt to release a ref-counted object that has already been released. So how can you know? Maybe it happens with WinAPI all the time, silently.


It's not just Gtk. At $WORK, we have a lot of user space daemons where everything boils down to GLib, and GIO for all IPC.

We do our own hardware and kernel development (upstream kernel + .ko's for our stuff) and that's good code. Then we have the daemons in userland providing interfaces to ioctl's and cdevs, with some house keeping and interfaces for DBus and HTTP APIs.

All the userland stuff just vomits GLib-CRITICAL messages. No one bothers to check for them because everything is running as daemons with stderr redirected to /dev/null...

And $WORK is one of the better places I've worked at. I am certain that if those assertions would have been fatal, they would have been fixed a long time ago.


Developers should be running with G_DEBUG=fatal-criticals so that those warnings convert into program crashes. But then again developers shouldn't have been writing anything in a language that requires manual resource management since C++.


I can confirm in my case, I coded a lot with Gtk a while ago and these warnings are close to impossible to avoid.


Developers should be running with G_DEBUG=fatal-criticals...


Given how GObject is implemented (void pointers everywhere), that's not surprising.


How are you supposed to make data structures which work with any type without void pointers in C?


Have done something similar, and it works, but ...

... the downside is that every time someone uses a different version of your normal compiler, or a different compiler, you have an unfortunately large chance of having the build fail because a new bit of code that previously was warning free is now determined to be warning-worthy.

Not a huge problem (and better than cultivating the habit of ignoring warnings), but it is a time cost that has to be paid now and again for committing to halting the build on a static analysis failure.


From the blog post:

"Set things up so that fatal warnings are off by default, but enabled on continuous integration (CI). This means the primary coverage is via CI. You don’t want fatal warnings on by default because it causes problems for developers who use non-standard compilers (e.g. pre-release versions with new warning classes). Developers using the same compilers as CI can turn it on locally if they want without problem."


I think that's why TFA suggested only doing this in CI (with a known compiler version) and not for local builds.


The problem with my work project is that most of the warnings are from third party libraries (Cocoapods since it's an iOS project). Most of them from Jainrain, which has publicly said they don't care.


If you're copy-and-pasting source code into your project, you can ignore all warnings on a per-file basis (See https://stackoverflow.com/a/6921972/1176156).

And if you're using a package manager like Cocoapods, it shouldn't be a problem.

Edit: Wait, I just re-read your post and you are using Cocoapods. Can you not just set "treat warnings as errors" to YES for your main project, but leave it off for the Pods project? Or just add `inhibit_all_warnings!` to the top of your Podfile, like in this example file: https://guides.cocoapods.org/syntax/podfile.html


Yea, I've had success with the following procedure:

1. Determine the set of warnings you're going to enable for your project, and apply them to your own code and all third-party libraries.

2. For each third party dependency, find the minimal set of warnings that, when suppressed, make them compile cleanly, and disable just those (only for that specific third party project).

3. If any of those suppressions makes you uncomfortable, submit a patch to the third party maintainer if possible :)

As for #3 with open source libraries, I've seen the spectrum of attitudes from "thank you very much for the patch!" to "we don't care about compiler warnings, just turn them off"

EDIT: One thing this doesn't address is third party dependencies where #including their headers from your project results in warnings. If you're a developer who writes header files that produce warnings, please take a moment to hang your head in shame.


Thanks.


> Most of them from Jainrain, which has publicly said they don't care.

A good reason to use a different lib, IMO.


Marketing is committed to it. But for new apps? We'll see.


I wonder how much harder this approach will make compiler updates for them.


Compilers in general have become much better about this. Today a potential compiler warning is carefully examined before it gets added. Historically (1995) compilers added warnings whenever someone wanted to, with no consideration of whether it was a good idea. As a result, upgrading compilers often resulted in a ton of new warnings to look at, most of which were not interesting.

Today clang will typically build all of Google's software with the new warning on and look at the results. Then they look at each one and decide if it is a real bug or a false positive. For the real bugs they look at how serious each bug is (a previously unknown security hole is worse than something that could fail in the real world only in unusual situations). For false positives they consider how silencing the warning changes the code (in some cases the code is very ugly and cleaning it up fixes the warning, while in others the code was good until the ugly stuff to silence the warning was added). Then they weigh the rates of all of the above to decide if the false positives are worth the potential gains. I believe other compilers do similar things, though I don't have insight into their processes (Chandler Carruth gives great talks at C++ conferences on clang, which is where I get my information).

Compiler writers are in competition in this area. As a result most things that should be warned about are already warned about, while things that are not worth warning about are not warned about. Thus updates to the compiler are quick because you keep having "how did we ever get by with this mistake" moments, which in turn keeps you motivated to fix the next warning.


I used to do very frequent compiler updates for... a rather large codebase. It's not too bad: even then it's only a part-time job for one person, and for a more reasonable codebase (like Firefox) and a more reasonable update schedule (official compiler releases) it's even easier. Basically you snap off a dev branch from some stable build, do the update, identify all the failures, and then go through each one. Take a little time to identify the problems and fix them as you go. If something looks beyond your expertise, or requires the special expertise of the "code owners" to fix properly, don't bother trying to wrap everything up in a nice tidy package all at once: simply use pragmas or what-have-you to disable the warnings for the smallest section of code possible, and file bugs (at a relevant priority) to fix the issue properly later. Every once in a while you'll run into an annoying hairball of a problem that requires a fair amount of work to fix up (say, warnings in generated code), but most of the time it's fairly trivial stuff.


They specifically chose warnings one by one (e.g. -Wunused-but-set-variable) instead of by groups (-Wall) to avoid that being an issue.


That helps, but is not bullet proof. Changes to optimization passes can cause warnings to be emitted in cases where they weren't previously. For example, detection of an unused result of an expression.


I can't really think of any case where this is possible. Warnings are produced (only?) before the serious optimisation passes. Have you got an example where unused result warning changes between optimisation levels?


These are, or used to be, quite common with gcc. Unused-result warnings require dataflow analysis to run to determine whether the values are used, and on no-optimisation compiles gcc didn't bother to do dataflow analysis.

Conversely, here's one where a warning is produced only without -O2:

  orth$ cat z.c
  int foo (int a) {
     int x;
     if (a == 5) { x = 3; return 42; }
     return x;
  }
  orth$ gcc -O2 -Wall -o z.o -c z.c
  orth$ gcc  -Wall -o z.o -c z.c
  z.c: In function ‘foo’:
  z.c:4:4: warning: ‘x’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      return x;
      ^
  orth$ gcc --version
  gcc (Debian 4.9.2-10) 4.9.2


I've not seen an unused result warning change, but I've come across a case where a dead code warning appeared at higher optimisation levels when some function inlining revealed to the compiler that a certain function would call exit().

Still, I wouldn't expect so many of these that updating a new compiler would be too onerous.


Since I've been programming in the Go language for a few months now I've really often compared the compiler's strictness with the C/C++ compiler's default laxity. The Go compiler won't even let you leave in an unused import, and there are a number of other things that in C would maybe generate a warning that won't compile in Go.


I absolutely hate that you can't have unused variables in Go, or that you can't have unreachable code in Go or Java. It makes the already tiresome work of debugging significantly more frustrating.


Why? What's useful about unused variables or unreachable code?


It's not useful in shipped code, and absolutely should be a warning, but it's incredibly useful while creating the program or debugging.


Go isn't even remotely close to strict. They optimised for speed over correctness checking.


Rust laughs and pats Go on the head.


You seem nice.


How is ALLOW_COMPILER_WARNINGS implemented?


Firefox has a custom build system, and it's implemented within that. So it's not something that will translate to another project, alas.


#pragma warning( disable : xxxxx )


> Also, before upgrading the compilers on CI you need to fix any new warnings. But this isn’t a bad thing.

It's, in fact, downright stupid and means you're writing in a different programming language with each compiler upgrade and at the mercy of some poorly-considered third-party opinions.

Many compiler warnings are just stylistic. They can even contradict each other. Compiler A says, "suggest parentheses here". Oops, compiler B doesn't like it: "superfluous parentheses in line 42".

As a rule of thumb, warnings that aren't finding bugs aren't worth a damn.

The proper engineering approach is to evaluate newly available warnings and try them. Keep any that are uncovering real problems, or could prevent future ones; enable those and not any others.

Warnings are just tools. You pick the tools and control them, not the other way around.

Every code change carries risk!


They do say this at the start:

> Choose with some care which warnings end up fatal. Don’t be afraid to modify your choices as time goes on.

So, my understanding is that they didn't make all warnings fatal. Only the ones they believe are important.


I don't know what languages you are thinking of, but Firefox is mostly C and C++.

There are no superfluous-parentheses warnings in GCC and MSVC.


Indeed. I just never really internalized all the operator precedence in C/C++, so I never write code like `a + b >> c`; I always add parentheses when mixing arithmetic and bit operations.


Uhh, perhaps I'm not parsing your claim correctly, but that warning most certainly does exist in gcc.

  $ gcc -Wall test.c
  test.c: In function ‘main’:
  test.c:6:5: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
      if (a = b) {
      ^
Parentheses here would be superfluous, but would perhaps confirm you really intended to assign (i.e. used = instead of ==).


You realize that you have an assignment instead of a comparison?

The warning here couldn't be more justified. In a decent language, this should be a hard error.


Putting parens around that example, while not strictly needed, also makes it harder to forget to add them if you later add an && or || to the conditional. So I am in favor.


Well, Firefox does have more and more Rust now...

  fn main() {
      if (true) { }
  }
results in

  warning: unnecessary parentheses around `if` condition
   --> <anon>:2:8
    |
  2 |     if (true) { }
    |        ^^^^^^
    |
    = note: #[warn(unused_parens)] on by default


Read the article. They choose which warnings are considered fatal.


and so fixing a new warning from an upgraded compiler might mean disabling it.


Relying on a third party isn't nearly as bad when the third party is also you!



