Hacker News
Results of the Grand C++ Error Explosion Competition (tgceec.tumblr.com)
184 points by ot on Jan 27, 2014 | 43 comments


I'd like to see a variation of this contest where a "typical", valid source file is given as part of the rules, and the objective is to change the code as little as possible to produce the largest amount of compiler errors. For example, the score could be the ratio between the size of the compiler output and the edit distance from the reference code.

It would be more representative of all those WTF moments when you have some code that compiles perfectly fine, change a line and suddenly GCC decides to throw up thousands of meaningless messages :)
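
A rough sketch of how such a score could be computed. It assumes plain character-level Levenshtein distance as the "edit distance" and the byte count of the compiler output as the numerator; the names `edit_distance` and `explosion_score` are made up for illustration and are not part of any contest rules.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance (insert, delete, substitute; each costs 1),
// computed with a single rolling row of the DP table.
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> row(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) row[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        std::size_t diag = row[0];  // d[i-1][j-1] for the first column
        row[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t up = row[j];  // d[i-1][j]
            const std::size_t subst = diag + (a[i - 1] == b[j - 1] ? 0 : 1);
            row[j] = std::min({up + 1, row[j - 1] + 1, subst});
            diag = up;
        }
    }
    return row[b.size()];
}

// Hypothetical score: bytes of compiler output per edited character.
// An entry identical to the reference has no defined score.
double explosion_score(const std::string& reference,
                       const std::string& entry,
                       std::size_t compiler_output_bytes) {
    const std::size_t d = edit_distance(reference, entry);
    assert(d > 0 && "entry must differ from the reference");
    return static_cast<double>(compiler_output_bytes) / static_cast<double>(d);
}
```

Dropping a single semicolon from the reference gives an edit distance of 1, so the score is simply the size of whatever the compiler prints.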


My favorite was the Best Cheat winner:

  /usr/include; perl -e "@c=\"x\"x(2**16); while(1) {print @c}" 1>&2
"Extra credit for using Perl, which is the only language less readable than C++ templates."


Aye, but I'd go for

  /usr/include; perl -e "@c=\"x\"x(2**16); while(1) {warn @c}"
instead. Fewer chars, and you're not relying on the shell to throw it into stderr for you.


Here's John Regehr on the construction of his entry, the "barest hands" winner: http://blog.regehr.org/archives/1088


That's one of the drawbacks of C++. I really love it but sometimes a tiny syntax error will produce so many errors that you can't even scroll to the top of the console to find the cause.


Clojure/ClojureScript has very similar issues. A small syntax error (e.g. forgetting the vector in a :keys destructuring and putting the bare field name instead) will throw up hundreds of lines of unreadable stack trace.


I hear this complaint a lot about Clojure, but I have to admit that I don't fully understand it.

My usual experience with Clojure has been that the stack traces are long... but the exception error messages are generally pretty good, and the stack frames that correspond to user code generally have good information on the position of the offending statement.

Maybe you could make an argument that the internal Clojure stack frames should be hidden from stack traces, but Java itself makes that difficult. The mechanisms that `Throwable` uses for capturing and printing the stack trace are both private methods of the class. Working around this would involve identifying every (most?) place in a codebase where a Clojure stack trace might be printed and then using a custom Clojure-specific stack trace printer. This might be doable for stack traces printed by the REPL or compiler, but very difficult for stack traces printed by an external linked-in logging framework.


Debuggers can be helpful by disguising/translating details, but I'd be very cautious about a feature that messed with the stack frame. Sometimes that's what's being debugged! It would be a disservice to muddy that information.

Most of us are pretty good at scanning down a stack to find our own stuff.


> My usual experience with Clojure has been that the stack traces are long... but the exception error messages are generally pretty good, and the stack frames that correspond to user code generally have good information on the position of the offending statement.

I do not agree, but please note that my comment isn't about "running" stack traces; it's about the compiler blowing up on what is essentially a syntax error:

    (let [{:keys foo} {:foo 1}]
        (print foo))
Note how `foo` is bare rather than in a vector. Put that in a script, `clj -m` it:

    Exception in thread "main" java.lang.IllegalArgumentException: Don't know how to create ISeq from: clojure.lang.Symbol, compiling:(hello.clj:5:3)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:6567)
        at clojure.lang.Compiler.analyze(Compiler.java:6361)
        at clojure.lang.Compiler.analyze(Compiler.java:6322)
        at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5708)
        at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5139)
        at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3751)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:6558)
        at clojure.lang.Compiler.analyze(Compiler.java:6361)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:6548)
        at clojure.lang.Compiler.analyze(Compiler.java:6361)
        at clojure.lang.Compiler.access$100(Compiler.java:37)
        at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:529)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:6560)
        at clojure.lang.Compiler.analyze(Compiler.java:6361)
        at clojure.lang.Compiler.analyze(Compiler.java:6322)
        at clojure.lang.Compiler.eval(Compiler.java:6623)
        at clojure.lang.Compiler.load(Compiler.java:7064)
        at clojure.lang.RT.loadResourceScript(RT.java:370)
        at clojure.lang.RT.loadResourceScript(RT.java:361)
        at clojure.lang.RT.load(RT.java:440)
        at clojure.lang.RT.load(RT.java:411)
        at clojure.core$load$fn__5018.invoke(core.clj:5530)
        at clojure.core$load.doInvoke(core.clj:5529)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.core$load_one.invoke(core.clj:5336)
        at clojure.core$load_lib$fn__4967.invoke(core.clj:5375)
        at clojure.core$load_lib.doInvoke(core.clj:5374)
        at clojure.lang.RestFn.applyTo(RestFn.java:142)
        at clojure.core$apply.invoke(core.clj:619)
        at clojure.core$load_libs.doInvoke(core.clj:5413)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at clojure.core$apply.invoke(core.clj:619)
        at clojure.core$require.doInvoke(core.clj:5496)
        at clojure.lang.RestFn.invoke(RestFn.java:408)
        at clojure.main$main_opt.invoke(main.clj:335)
        at clojure.main$main.doInvoke(main.clj:440)
        at clojure.lang.RestFn.invoke(RestFn.java:436)
        at clojure.lang.Var.invoke(Var.java:423)
        at clojure.lang.AFn.applyToHelper(AFn.java:167)
        at clojure.lang.Var.applyTo(Var.java:532)
        at clojure.main.main(main.java:37)
    Caused by: java.lang.IllegalArgumentException: Don't know how to create ISeq from: clojure.lang.Symbol
        at clojure.lang.RT.seqFrom(RT.java:505)
        at clojure.lang.RT.seq(RT.java:486)
        at clojure.core$seq.invoke(core.clj:133)
        at clojure.core$reduce1.invoke(core.clj:890)
        at clojure.core$destructure$pb__4541$pmap__4544$fn__4547.invoke(core.clj:4013)
        at clojure.core$reduce1.invoke(core.clj:896)
        at clojure.core$destructure$pb__4541$pmap__4544.invoke(core.clj:4014)
        at clojure.core$destructure$pb__4541.invoke(core.clj:4028)
        at clojure.core$destructure$process_entry__4557.invoke(core.clj:4030)
        at clojure.core$reduce1.invoke(core.clj:896)
        at clojure.core$destructure.invoke(core.clj:4033)
        at clojure.core$let.doInvoke(core.clj:4046)
        at clojure.lang.RestFn.invoke(RestFn.java:467)
        at clojure.lang.Var.invoke(Var.java:427)
        at clojure.lang.AFn.applyToHelper(AFn.java:172)
        at clojure.lang.Var.applyTo(Var.java:532)
        at clojure.lang.Compiler.macroexpand1(Compiler.java:6468)
        at clojure.lang.Compiler.analyzeSeq(Compiler.java:6546)
        ... 40 more
Yeah… the third time around you might instantly know what it is; the first time around, when it's a few levels down in a type method, not so much.


The first line of the error message tells you where and what the error is, almost exactly:

    ... Don't know how to create ISeq from: clojure.lang.Symbol, compiling:(hello.clj:5:3)
It's true that the full interpretation of the error message requires some experience with the language (ISeq? Symbol?), but when is that not the case? Consider this one character typo in a Java file:

    voidx displayMessage(String message);
javac produces an error message that's just about as opaque as what Clojure produces in your example.

    [ERROR] ... MessageSink.java:[5,4] error: cannot find symbol
Conversely, if I leave an open brace in Java source, I get the following kind of error from javac:

    [ERROR] ... ConsoleMessageSink.java:[12,1] error: reached end of file while parsing
If I leave an open form in Clojure, it at least tells me the location of the form I left open:

    Exception in thread "main" java.lang.RuntimeException: EOF while reading, starting at line 18, compiling:(toto/data.clj:255:1)
(Of course, thanks to paredit-mode, I had to play a minor trick on my editor to get it to let me even introduce that error in the first place.)


I like Clojure. I really do. But when it comes to day-to-day usage I cringe because of the downright terrible error messages.


This is so true. I hope they fix it, cause it's really annoying.


That's why I like Python so much.

Made a syntax error? Here's the line number, pointer to the column, and the kind words "SyntaxError" from the interpreter.


Then you simply never tried any other modern language than Python.

Python's error messages are pretty OK, but they're nothing special. I'd say that in general, evil yucky unstartuppy Enterprise languages like Java and C# do slightly better.


I think you did not understand moccajoghurt's point. Because the errors cascade - that is, the one syntax error causes many, many (many) more errors, there are so many errors that it's hard to scroll up to the first error.


It's worse. In C++, that first error often is a red herring. The classical example is a header file where one forgets to type a semicolon. In classical C++ compilers, that triggered errors pointing to the file that includes the erroneous header.

For examples, see #9 in http://web.mst.edu/~cpp/common/common_errors.html, http://stackoverflow.com/questions/11216569/missing-semicolo..., or http://stackoverflow.com/questions/14077486/c-learning-heade... (a nice example of someone who manages to solve a "missing semicolon before X in file F, line L" error and is smart enough to realize that this probably is not the right way to fix the problem).

Clang developers have worked hard to improve that; see http://clang.llvm.org/diagnostics.html, section "Quality of Implementation and Attention to Detail".


While I have trained myself to look for "clusters of errors" instead of individual errors, in my experience, the first error g++ reports tends to be the real cause. (I've also been programming in C++ long enough that I often don't even bother reading the error message - it's usually quicker if I just go to that line of code, and look for what's wrong.)


Wouldn't the -fmax-errors option help with that?

Or if all else fails, just pipe the errors to "head".


s/C++/g++/

clang++ isn't nearly as bad.


I tried some of the examples with g++/cc and with clang. Clang is a lot better but it still produces some pretty ridiculous output (but not gigs of ridiculous output fortunately).


Oh, I totally agree. That said, these examples are engineered to make your compiler shit a brick. Most of the time clang++ gives you much more readable output. g++, however, is a lot more careful about your digits. It'll take a little while before clang++ gains serious traction in the scientific community.


> g++, however, is a lot more careful about your digits. It'll take a little while before clang++ gains serious traction in the scientific community.

How do you mean? Does clang have some sort of problem with numeric precision?


g++ has improved their error messages in the past few years, especially when it comes to template error messages. But C++ is still really complicated and you can get pretty nasty errors.

And clang++ may be better in some aspects, but it is not perfect either: one of the entries in this contest caused clang++ to segfault.


That's why you don't make syntax errors! :-)


It's not unlike the busy beaver problem, where you have to produce the highest number with a Turing machine before halting:

https://en.wikipedia.org/wiki/Busy_beaver


"C++. The reason to increase your scrollback buffer."


make 2>&1 | head (or less) is a common occurrence in my .bash_history :(


I actually have this written on the corner of my whiteboard because I can never remember bash's syntax for piping out stderr!


If you don't mind being bash-specific, you can shorten 2>&1 | to just |& .


Good lord. I think you just changed my life.


Similarly &> works for redirecting both stderr and stdout to a file. >& is the alternate, less preferred syntax.


Perhaps it will help you to understand some background:

In a Unix process stdout typically is file descriptor 1 and stderr is fd 2. So you are saying: Redirect from fd 2 into fd 1.
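
For the curious, here is roughly what that looks like at the system-call level. This is only a sketch, assuming a POSIX system; `redirect_stderr_to_stdout` and `demo_capture` are names invented for this example, and real shells do more bookkeeping around it.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Make fd 2 (stderr) refer to whatever fd 1 (stdout) currently refers to.
// This is the effect the shell arranges for "2>&1". Returns true on success.
bool redirect_stderr_to_stdout() {
    return dup2(STDOUT_FILENO, STDERR_FILENO) == STDERR_FILENO;
}

// Hypothetical demo helper: point stdout at a pipe, apply the redirect,
// write msg to stderr, restore both descriptors, and return what arrived
// on the pipe. Only meant for short messages that fit in the pipe buffer.
std::string demo_capture(const std::string& msg) {
    const int saved_out = dup(STDOUT_FILENO);
    const int saved_err = dup(STDERR_FILENO);
    int fds[2];
    if (saved_out < 0 || saved_err < 0 || pipe(fds) != 0) return {};

    dup2(fds[1], STDOUT_FILENO);  // comparable to "> somewhere"
    close(fds[1]);
    const bool ok = redirect_stderr_to_stdout();  // comparable to "2>&1"
    assert(ok);
    (void)ok;  // keep NDEBUG builds free of unused-variable warnings

    const ssize_t written = write(STDERR_FILENO, msg.data(), msg.size());
    (void)written;

    // Restore the original descriptors; this also closes the last
    // references to the pipe's write end, so the read loop sees EOF.
    dup2(saved_out, STDOUT_FILENO);
    dup2(saved_err, STDERR_FILENO);
    close(saved_out);
    close(saved_err);

    std::string got;
    char buf[256];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        got.append(buf, static_cast<std::size_t>(n));
    }
    close(fds[0]);
    return got;
}
```

The helper redirects stdout first and stderr second, which mirrors why the order matters in the shell: `>file 2>&1` sends both streams to the file, while `2>&1 >file` leaves stderr on the original stdout.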


The biggest problem is mixing up >& and &> ...which ends up leaving a bunch of files called "1" around with a runaway background process :(


Yeah, it's tricky. I remember it by breaking it into two pieces. "2>" means redirect stderr (file descriptor 2) to the following token. "&1" is the token for stdout (kinda like the address-of operator). So taken together, "2>&1" means redirect stderr to stdout.

Bash-only shortcuts like &> don't improve the situation :)


The phrase "two greater amper one" seems to help me. My tendency otherwise is to type the ampersand before the greater-than.


We had a similar competition on the Programming Puzzles & Code Golf StackExchange (that's a mouthful; I think I'll just continue calling it »cgse«) three years ago: http://codegolf.stackexchange.com/q/1956/15

We should probably start summarizing some of the better tasks in blog posts.


Is there something about the C++ standard (templates?) that requires recursive include to be valid? Because that first one looks like it would be really easy to check for and stop.


Includes predate C++ by a bit. And there is nothing that specifically forbids such a thing; it's just the preprocessor literally inserting (and interpreting) the file specified.

Now, normally there are include guards (or #pragma once) to guard against including the same header file twice. But of course that's only necessary because the language specification doesn't say that including the same file twice is disallowed.

One can surely construct something where including the same file twice does make sense, e.g. in the light of new #defines in between the two inclusions:

    // foo.h
    #ifdef FOO
      ...
    #else
      ...
    #endif

    // foo.c
    #include "foo.h"
    #define FOO
    #include "foo.h"
Mind you, this is probably not a good idea, at all. But some people might build such things. A similar method that springs to mind is Android's native build system ndk-build, which is a thickly-veiled GNU make where you basically include targets that do something with a lot of variables you set beforehand. And you can do so multiple times in a row. Maybe the included file uses a defined symbol to emit some generated code based on that symbol and you might want to do that with several symbols. Macros are usually preferred for that, but nothing stops you from using an included file.
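
For comparison, the macro-based alternative usually looks something like the sketch below (often called an "X-macro"). The list is replayed several times with a different definition of the per-item macro each time, much like re-including a file after changing a #define. The colour list and all names here are invented for illustration.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical list of symbols. Each use of COLOR_LIST replays the list
// through whatever macro is passed in as X.
#define COLOR_LIST(X) X(Red) X(Green) X(Blue)

// First pass: expand every entry into an enumerator.
#define AS_ENUM(name) name,
enum Color { COLOR_LIST(AS_ENUM) ColorCount };
#undef AS_ENUM

// Second pass: expand every entry into its name as a string literal.
#define AS_STRING(name) #name,
static const char* const kColorNames[] = { COLOR_LIST(AS_STRING) };
#undef AS_STRING

// Name of a colour; the two tables stay in sync because both are
// generated from the same list.
const char* color_name(Color c) {
    assert(c < ColorCount);
    return kColorNames[c];
}

// Reverse lookup; returns ColorCount if the name is unknown.
Color color_from_name(const char* s) {
    for (int i = 0; i < ColorCount; ++i) {
        if (std::strcmp(kColorNames[i], s) == 0) return static_cast<Color>(i);
    }
    return ColorCount;
}
```

Moving the list into its own header and including it twice with X defined differently would give the same result, which is the repeated-include pattern described above.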


Also do not forget the more common case:

- foo.c includes bar.h and baz.h

- bar.h and baz.h both include qux.h

This is why most C header files have ifdef guards like this:

    #ifndef __BAR_H
    #define __BAR_H
    // contents go here
    #endif

This makes sure that the contents of the header are only included the first time the preprocessor sees it in a particular file.


Hence include guards, yes. Agreed, I could have made that point a little better. Although they are usually only necessary when the header file contains things that may not appear twice in the same compilation unit, such as ... uhm, pretty much everything.


Recursive includes can be useful. For example Boost Preprocessor [1] uses them to implement iteration.

[1] http://www.boostpro.com/mplbook/preprocessor.html


Ah, a contest in the style of the venerable Obfuscated C.


y u no english?



