How undefined signed overflow enables optimizations in GCC (kristerw.blogspot.com)
100 points by ot on Feb 21, 2016 | 90 comments



Wasn't there substantial controversy over gcc optimizing away certain code that tried to check for integer overflows?

Check out this infamous thread: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475 ...it gets pretty nasty!

The takeaway is that although signed overflow is technically undefined, a lot of security-related code naively checked for it the way it usually manifests (wraparound), and this sort of optimization began silently removing those checks. This made some pragmatically minded security folks quite angry: although such behavior is technically undefined by the spec, it was the most widely used method of defending against overflows, and since such code was already everywhere in the wild, they feared all the vulnerabilities that would be introduced by gcc dropping branches for i > i + proposed_increment.
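Concretely, the fragile idiom and a defined-behavior replacement might look like this (hypothetical function names; the safe variant pre-checks against `INT_MAX` rather than relying on wraparound):

```c
#include <limits.h>
#include <stdbool.h>

/* The fragile idiom: relies on signed wraparound, which is UB.
   A compiler may assume i + inc never wraps, fold the comparison
   to "false", and delete the whole check. */
bool will_overflow_naive(int i, int inc) {
    return i + inc < i;
}

/* A check with fully defined behavior: compare against INT_MAX
   *before* adding (sketch for the inc > 0 case only). */
bool will_overflow_safe(int i, int inc) {
    return inc > 0 && i > INT_MAX - inc;
}
```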


>>The takeaway is that although overflow is technically undefined

As they say: technically correct is the best kind of correct.

>>This made some pragmatically minded security folks quite angry

They sound very entitled to me. The C standard is clear on the issue; it's not some arcane hidden thing, it's just a fundamental behavior of basic types in the language. It's very annoying to read the entitled comments from a person who is clearly in the wrong, bashing the GCC crew.

>> it was the most widely used method of defending against overflows, and recognizing that such code was already everywhere in the wild they feared all the vulnerabilities that would be introduced by gcc dropping branches for i > i + proposed_increment.

As mentioned by maintainers in the thread there is already a flag to catch unsigned overflows. They even mentioned a way to catch those overflows in the code.

I want a compiler to be technically correct and do the optimizations which are possible, as the whole point of writing things in C is for them to be fast and efficient. It's nice that GCC has quite a big repertoire of flags, checks, sanity checks, warnings, sanitizers etc. They are very helpful. Demanding that the core compiler not use the language spec to produce better code is pathetic though.

It also didn't "break" any real-world things. You just need to use the proper compiler flags to compile your non-conforming code if you want a newer version of the compiler. Then thank the hard-working compiler maintainers for providing those to you, as they could've spent their time on something more fun than fixing shitty code for you.

/rant


The C standard is clear on the issue

The problem is succinctly summarised by the saying "In theory, there is no difference between theory and practice. In practice, there is." Unfortunately what most programmers think of as "C" is subtly different from how the standard defines it. In other words, perhaps we should fix compilers (and eventually the standard) to match reality instead of the other way around.

The standard even states in the section on undefined behaviour that "behaving during translation or program execution in a documented manner characteristic of the environment" --- exactly what C programmers are usually expecting from UB --- is a possible choice.


Yes. People are quick to point to the standard when it comes to compilers doing something surprising, and quick to use this as evidence that programmers are silly to expect the expected. But if compilers did exactly what these programmers expect, they could point to the standard just the same...

(At least this article provides a good explanation of advantages to be had from being aggressively pedantic about UB. Many by way of evidence just wave their hands and claim people are stupid.)


> (At least this article provides a good explanation of advantages to be had from being aggressively pedantic about UB. Many by way of evidence just wave their hands and claim people are stupid.)

This. While it doesn't show how much real programs benefit from these UB assumptions (or how many new security bugs are introduced...), at least it has some explanations. Still, you have to ask yourself whether the compiler is a tool to help you do your work, or a perverse, pedantic, and frustrating adversary. The ultimate "UB optimizer" would do its best to find tiny corners of undefined behavior in your program, then replace the whole thing with "exit(0)".


That would be an interesting exercise. Of course, the whole thing is only interesting if it comes with a proof justifying the replacement. Otherwise it's too easy for real-world C programs: they are all undefined.


If you don't go by the spec, what do you go by? Intuition? What if yours is different from someone else's?

In this case as well, acknowledging that signed integer overflow is undefined behavior allows for more consistent behavior in certain circumstances. For instance, the compiler knows that `x < x + c` whenever `c > 0`, and can optimize the comparison to `true`. Even if you don't care about the optimization, is it not equally surprising that such a statement could ever be `false`?

No matter which way you go, some behavior is going to be surprising. And in these scenarios, the only plausible thing to do is to follow the standard, so at least there's a consistent ideal that everyone's intuitions can converge to, which is crucially important. Each compiler implementing a different C-alike standard just leads to more confusion.
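A minimal sketch of that fold (hypothetical function; note that calling it with `x > INT_MAX - 10` would itself be UB):

```c
#include <stdbool.h>

/* Because signed overflow is undefined, the compiler may assume
   x + 10 never wraps, so this comparison is provably true for every
   valid execution; GCC can fold the body to "return true". */
bool always_true(int x) {
    return x < x + 10;
}
```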


If you don't go by the spec, what do you go by?

To quote the standard, "a documented manner characteristic of the environment."

If the machine is two's complement (which is the vast majority of them), then you'd expect that behaviour. If it's sign-magnitude or one's complement or saturating, then the behaviour would be different but still consistent.

Even if you don't care about the optimization, is it not equally surprising that such a statement could ever be `false`?

Overflows wrapping around is not surprising, it's what decades of computing hardware (or centuries if you include mechanical calculators like https://en.wikipedia.org/wiki/Pascal%27s_Calculator ) have always done.


> If the machine is two's complement (which is the vast majority of them), then you'd expect that behaviour. If it's sign-magnitude or one's complement or saturating, then the behaviour would be different but still consistent.

How is this different in practice from undefined behavior, other than the compiler can't take advantage of it for optimizations?


Maybe a better way to illustrate the surprising behavior is that

if (x < x + c) { // block }

if (0 < c) { // block }

are not equivalent statements if you add overflow wrapping into the spec. What if the processor your application is running on doesn't do two's complement arithmetic? Does gcc need to emulate that behavior to comply with the spec?


Except that each programmer's subtly different flavour of C is different from everyone else's. For a handful of features it may work, but in general I doubt it.


> In other words, perhaps we should fix compilers (and eventually the standard) to match reality instead of the other way around.

Without an analysis of the performance loss from removing this optimization, that would be irresponsible.

The authors of GCC were not stupid. They understood that these optimizations can be important for performance, and that's why they implemented them. When this came up on the mailing list several years ago, Ian Lance Taylor found several places this optimization was kicking in in GCC itself.

It is not OK to just randomly avoid doing optimizations allowed by the C standard because they might trigger bugs. C depends on undefined behavior for performance. You depend on these compiler optimizations to get good performance out of the programs you run every day.


It is not OK to just randomly avoid doing optimizations allowed by the C standard because they might trigger bugs. C depends on undefined behavior for performance. You depend on these compiler optimizations to get good performance out of the programs you run every day.

Geoff offers a simple succinct example of optimizing based on undefined behavior in another comment in this thread: https://news.ycombinator.com/item?id=11147068

I tested with GCC, Clang, MSVC, and ICC, and found that only GCC removed the checks and optimized the function down to a constant. Clang, MSVC, and ICC instead generated code that matched the programmer's clear intent.

Are you sure they would be better compilers if they matched GCC's behavior here? I agree the GCC authors are far from stupid, but I think there might be a difference in philosophy here rather than just a missed opportunity for optimization.


> Are you sure they would be better compilers if they matched GCC's behavior here?

I don't know. Measure it on SPECINT or something.


My point is that we are talking about basic types in C. Something you should learn in your first few hours with the language. It's not some arcane thing like what happens in the pre-processor if you nest things or what happens in some rarely used library function in some never occurring edge case.


In my experience most people don't get exposed to underflows and overflows when learning programming. It's not until they hit a bug that they actually find out overflows and underflows exist. Then, in my experience they don't go read the spec. They ask a co-worker or google it. They may or may not end up with the correct answer vs an answer that works at the moment.

I've used C since the mid 80s and C++ since the late 90s. Overflow for me always worked as normal two's complement overflow on every project from 1985 to 2011, so I had no reason to question that it might be undefined... until it was. Even more interesting is that when asking a mailing list of 200+ C++ experts, the answers for how to actually detect overflow were all over the place. Meaning no one really knew for sure the spec-correct way to check, even though they were all C++ guys with 15+ years of experience.

There's all kinds of details to languages that are buried in a spec but are not something that comes up "in your first few hours with the language".


The problem with the C standard is that most programmers treat C as portable assembler whereas from the beginning the group that wrote the standard tried to make it a proper, abstract high-level language.

So we got an enormous disconnect between what programmers expect and what the language really offers.

For a long time, compilers would side with the programmers making sure that optimisations would not break common idiom.

A number of years ago gcc left that path. So now, for system code, certainly if the code has to be secure, it is better to avoid gcc.

The C standard is in some areas extremely complex, and it doesn't make sense to expect all programmers to completely understand what is essentially a broken standard.


Perhaps we should even leave C as a language for system code. (System code has to be correct and secure first. Optimization for speed comes second, and is only needed for a few small, very hot code paths.)


Relevant article expressing the same sentiment: http://blog.metaobject.com/2014/04/cc-osmartass.html


I'm sorry if you feel gcc maintainers are unappreciated, that's not good. Clearly no compiler can rescue programmers from every security mistake they might make.

But raising security concerns is not "entitled," insecure-by-default is not okay, and "it's fast and the standard lets us get away with it" is not a sufficient defense.

Maybe non-conforming code should be penalized by making it slow, not by causing another large-scale public scourge like Heartbleed. Such risks are not just between gcc maintainers and inept programmers, there's a lot more at stake than that.


Doing something that you _know_ is wrong, then complaining when the compiler happens to break it, is entitled if I've ever seen it.

I don't think I would trust code that relies on undefined behavior to be secure. That doesn't really sound that safe.


As mentioned by maintainers in the thread there is already a flag to catch unsigned overflows. They even mentioned a way to catch those overflows in the code.

I can't seem to find that flag. Only -ftrapv for signed, but none for unsigned. Do you know which flag it is exactly?


As mentioned by maintainers in the thread there is already a flag to catch unsigned overflows

Too bad such a flag doesn't exist.


GCC has builtins for overflow-detecting arithmetic operations, and has had them for a while. These generally compile down to the arithmetic instruction followed by a jump (or set) on the overflow/carry flag on x86.

https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins...
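For anyone unfamiliar, usage looks like this (GCC 5+/Clang; the builtin returns true when the mathematical result doesn't fit):

```c
#include <limits.h>
#include <stdbool.h>

/* __builtin_add_overflow computes the sum as if with infinite
   precision, stores the wrapped result in *res, and returns
   true iff the result did not fit in *res. */
bool checked_add(int a, int b, int *res) {
    return __builtin_add_overflow(a, b, res);
}
```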


These are completely unknown to me. How portable are they? If only GCC has them, I am not going to use them at all.


Clang also has them: http://clang.llvm.org/docs/LanguageExtensions.html#builtin-f..., under the heading "Checked Arithmetic Builtins".


You could implement them as static/inline functions for compilers that lack the builtins (using a bunch of comparisons against SHRT_MAX/INT_MAX, ...), which would be slower, but for things explicitly meant as security checks, probably worth the cost.

I've used them to replace explicit checks for overflow in some DSP related code which tried to implement saturation. The gcc builtins also were much faster than tedious "manual" checks.
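A sketch of such a fallback for addition (pre-checking the ranges so the actual `+` provably cannot overflow):

```c
#include <limits.h>
#include <stdbool.h>

/* Portable overflow-checked addition: no UB is ever executed,
   because the ranges are verified before the addition happens. */
bool add_overflows(int a, int b, int *res) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return true;        /* would overflow; *res is left untouched */
    *res = a + b;
    return false;
}
```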


Here is how to do integer overflow checks from the same author as the OP in the bug report:

https://www.fefe.de/intof.html


Since this is likely to degenerate into a discussion on the merits and pitfalls of optimizing undefined behavior, how about a look at this thread on mechanicalsympathy where the topic was beat to death just a short while ago:

https://groups.google.com/forum/#!topic/mechanical-sympathy/...

One of the juicier replies discussing some of the undefined behavior in C (by Rajiv Kurian):

1. A program in a hosted environment does not define a function named main using one of the specified forms - not a compile time error, it's UB.

2. The arguments to certain operators are such that could produce a negative zero result, but the implementation does not support negative zeros.

3. Two declarations of the same object or function specify types that are not compatible - again not a compilation error, it's UB.

4. An unmatched ' or " character is encountered on a logical source line during tokenization. During tokenization!!

5. An exceptional condition occurs during the evaluation of an expression - how specific.

6. An exceptional condition occurs during the evaluation of an expression - hmmm

7. A structure or union is defined as containing no named members - ookay.

8. The value of an unnamed member of a structure or union is used - why is there a compiler at all!

9. The character sequence in an #include preprocessing directive does not start with a letter - lol

10. The program modifies the string pointed to by the value returned by the setlocale function - dear god.

11. The string set up by the getenv or strerror function is modified by the program .

12. The array being searched by the bsearch function does not have its elements in proper order - not a logical error just straight up UB.

This is the tip of the iceberg. How confident is anyone now that their program doesn't invoke UB? Things many programmers would assume are compiler errors are actually UB. What would happen if gcc/clang exploited each one of these?


10 and 11 are effectively just saying that the string returned by setlocale(), getenv() or strerror() might be a string constant, which are already unmodifiable.


I always write my C code with a C++ compiler, since it has some quality-of-life improvements. I think C++ is free to define undefined behavior without worrying about backwards compatibility, so I wonder how many of those still apply if you use a C++ compiler.


> This is the tip of the iceberg. How confident is any one now that their program doesn't invoke UB? Things many programmer would assume are compiler errors are actually UB. What would happen if gcc/clang exploited each one of these?

This is a pointless question. There is no reason for gcc and clang to exploit these, as there are no performance wins they could get by doing so. The analogy you're giving presupposes that the developers of those compilers like making code insecure for fun, because they're language lawyers. They aren't doing that. They're doing these optimizations because of documented, measurable performance gains, usually because someone filed a bug against the compiler asking why it didn't optimize their code in the way they thought it should.


12. Well, breaking the invariants of a function is never a good idea, but mandating defined behaviour in that case would put substantial restrictions on how the function could be implemented (and it would still almost certainly do something weird later anyway)


It could be implementation-defined without being UB.


Does anybody have an example of 7 and 8? I can't even think of how you'd accomplish them right now.


This is a list of optimization patterns; it's not an analysis of how much actual code gets sped up, and it doesn't consider the accommodations that would be made in the alternative universe: where defined signed overflow is the default, and there are other tools for when this impacts performance too much.

The question is if the alternative universe is better than this one, where you have to work harder to protect against unwanted overflow - and especially the compiler eliminating your efforts to test whether it occurred or not.


The command line switch '-fno-strict-overflow' will grant you access to this alternative universe.


The alternative universe includes a different C++ standard and potentially additional language constructs or types to opt out. Different defaults have cultural implications far beyond the mechanics of any single compile.


Any range information the compiler intuits from signed values not overflowing could, in principle, be given explicitly with the following code:

  if (foo > 1000) __builtin_unreachable();
Or macros that make that kind of thing look prettier. I prefer using unsigned values everywhere for both safety and semantics. (It's not possible to have a negative number of bananas, so why should I use a type that admits a value of -3 bananas?)
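One way such a macro might look (hypothetical `ASSUME` name; `__builtin_unreachable` is GCC/Clang-specific):

```c
#include <assert.h>

/* In release builds, tell the optimizer the condition holds without
   emitting a runtime check; in debug builds, actually verify it.
   If the condition is ever false with NDEBUG defined, behavior is
   undefined. */
#ifdef NDEBUG
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) assert(cond)
#endif

/* Example: the range hint lets the compiler drop the modulo entirely. */
int wrap_index(int i) {
    ASSUME(i >= 0 && i < 1000);
    return i % 1000;
}
```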


Clang also has a more explicit __builtin_assume(expression) builtin, although anecdotally GCC makes better use of these hints. Folly has a function to make it portable:

https://github.com/facebook/folly/blob/bb5ed8070d533c016e1e9...


I always use signed ints precisely because -3 bananas is more obviously incorrect than 2147483645 bananas. The opportunity of storing special 'null' values is also useful.


It doesn't say that much about the loop optimizations--

The cases where I've seen the largest gains come from loops where analysis shows that the loop runs, say, exactly 8 times or forever (due to overflow). Knowing it runs 8 times exactly lets the loop get unrolled, which then lets vectorization work... and then some inner loop runs more than 3x faster.
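A toy illustration of where the assumption kicks in (sketch; `sum_inclusive` is a made-up name). With a wrapping counter, `n == INT_MAX` would make this loop infinite, so the exact trip count couldn't be proven:

```c
/* With signed i and undefined overflow, the compiler may assume i++
   never wraps, so "i <= n" provably runs exactly n + 1 iterations.
   That known trip count is what enables unrolling and vectorization. */
long sum_inclusive(const int *a, int n) {
    long s = 0;
    for (int i = 0; i <= n; i++)
        s += a[i];
    return s;
}
```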


In languages with GCs or managed in some other way, knowing the upper bound of loop iterations is also important because it allows you to remove the check-if-we-need-to-stop-the-world operation from the loop.

This applies even if the loop isn't unrolled. For example if I see a loop that definitely doesn't iterate more than 2^32 times (because it doesn't overflow) and I can see it only has a few simple instructions in it, then I may decide to omit the check from the loop because I know it can't take that long.


In Java specifically, I've gotten good results by following this technique:

- start with an empty routine for the main loop to be optimized

- add the simplest loop you can that iterates over your source data, written such that it has a simple, obvious bounds check

- test the loop in a microbenchmark harness and look at the generated assembly if necessary to verify that bounds checks get hoisted out, the loop is being unrolled, etc.

- slowly add more code and continuously verify that assumptions are met

- if the loop body gets too complex and the JVM stops optimizing it properly, try and split the loop into multiple separate (possibly nested) loops that are in their own functions, and repeat the process

- when finished, experiment with inlining loops again

Coming at it from the other angle - a complex loop that you poke until the generated code is decent - is much more frustrating. E.g. manually unrolling loops is fairly pointless - you really really want the JVM to unroll the loops, because it will do so with fewer bounds checks that you can't eliminate yourself. Start with the JVM producing good fast code, and keep it that way as the complexity grows.


Yes, I understand that the java.util.concurrent code includes a lot of loops that look unusual; in fact they're carefully constructed in such a way as to just avoid the poll (the check-if-we-need-to-stop-the-world part) for inner-loop concurrency operations.


Just checked on my system (fairly recent macbook pro) and a loop that does nothing but increment a memory location and runs 2^32 times takes 8 seconds. Doesn't Java stop the world more often than that?


I think something that accesses main memory goes beyond 'simple' instructions for this definition.

This is an example of a Java compiler phase doing it, although here it looks like it's only doing it for empty loops (or loops that end up empty due to hoisting), so it's not quite what was being discussed: https://github.com/graalvm/graal-core/blob/0a68438187e7e3196...


However, loops can still be unrolled with unsigned integer loop indices, in many (most?) circumstances.


I never see anyone do this, but it doesn't seem like a bad idea to run a program through a C interpreter and see if it ever triggers undefined behavior. As I understand it, the C standard is written so that all undefined behavior is able to be caught in this manner. I could even see something like gcc coming with a 'create an interpreted executable for debugging purposes' flag.

It wouldn't constitute a formal proof or anything, but it would solve the problem of people leaving overflow and null pointer checks in plain sight expecting them to be hit.


Undefined behaviour isn't a set of checks the compiler goes through before deciding to call invoke_nasal_demons() which you can then just replace with yell_loudly_about_UB(). Undefined behaviour is a set of conditions that the standard says they can pretend do not occur, ever, so certain results are easier to obtain because the UB is never checked for and assumed to have not occurred.

UB is fundamentally not a well defined condition because it is the absence of knowledge about the state of a program, and many UB conditions are very very hard to detect if not actually impossible (cf Turing) which is why the standard says the compiler doesn't have to care to begin with.

Its very nature means detecting UB in anything approaching a reliable way is impossible; the C standard is not remotely compatible with such an idea.



But while running your C program through the C interpreter, what user input would you give your C program?


If it's easy, then you could just run the program as normal with a few test cases. If you're a web browser for example, use it to browse some web pages. If you're a BitTorrent client, use it to download some Linux ISOs.

If the interpreter limits the external resources available[1], then you'd have to rephrase the guts of your C program as a pure computation. Then if you had a program for rendering HTML files into BMP files, I would collect a couple examples and run it with them as input.

If necessary, you could embed the test data into a .c file that exercises the rendering code and remove any dependency on user input.

"Just run the program" doesn't constitute a correctness proof. And if you have access to kernel functions at all, I'm sure there's a way to rewrite your own code at runtime while not invoking undefined behavior. It looks like CompCert C's formally verified C Interpreter[1] only allows print, free, and malloc as external resources. [1] http://compcert.inria.fr/man/manual004.html


But if the test cases you run it on are not comprehensive enough, the behavior of your program might be undefined when run on user input, but not be undefined when run on your test cases.


I wonder how often actual code uses these patterns. Is there anyone who would write (-x)/(-y) instead of x/y?

Also, some of these patterns claim to eliminate something, but they actually just move it around. For example, -(x/y) -> (-x)/y doesn't eliminate negation, so it's not clear to me that the latter form is faster than the former.
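For what it's worth, the validity of that particular rewrite also hinges on the overflow rule; a sketch of the corner case (assuming 32-bit `int`):

```c
#include <limits.h>

/* -(x/y) => (-x)/y is only a legal rewrite because signed overflow is
   undefined: under wrapping semantics, x == INT_MIN would make (-x)
   overflow even though -(x/y) is representable for y > 1. */
int neg_div(int x, int y) {
    return -(x / y);
}
```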


It doesn't really matter if someone would write such code by hand. Code can be transformed into that pattern by compiler transformations such as inlining, CSE, constant folding etc..


Re moving the negation: it could be faster if x is available a few cycles before y in a superscalar architecture as you could save a cycle by negating x while in parallel computing y.


I believe these are mostly either a) the result of prior optimizations/inlining b) in generated code.

This is often true of optimizations in general.


This kind of makes a great case for why signed/unsigned ought to be a feature independent of overflow.


So, does that mean we should use unsigned integers whenever possible, such that we have defined overflows and still benefit from optimizations?


I don't think so. Most of those optimizations come from the "no overflow" part, not the "signed" part. If unsigned overflow were also undefined then these optimizations would work there as well. Since unsigned overflow is defined (all operations are modulo 2^bitsize), stuff like transforming `x + c > x` to `true` is not valid, since it can legally be false.
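A quick demonstration of why the unsigned version can't be folded (relying on the standard modulo-2^N unsigned semantics):

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic is defined to wrap, so this is legitimately
   false near the top of the range; the compiler cannot fold it. */
bool gt_after_add(unsigned x) {
    return x + 1u > x;
}
```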


Unsigned has its own can of worms. Unsigned has higher conversion rank than signed, so if you do a-b where either a or b is unsigned, you can end up with 4294967295 where you expected -1.
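A minimal reproduction of that surprise (hypothetical function name):

```c
#include <limits.h>

/* Usual arithmetic conversions: unsigned int outranks int, so b is
   converted to unsigned and the subtraction wraps modulo 2^N. */
unsigned zero_minus_one(void) {
    unsigned a = 0u;
    int b = 1;
    return a - b;   /* 0u - 1u == UINT_MAX, not -1 */
}
```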


I'm not a compiler expert, but since these optimizations are possible due to the fact that signed integers don't overflow, they don't apply to unsigned integers...


No. You should use a language better aligned to the processor - one in which you can express a "loop exactly n times" construct directly, without having to add a ceremonial iteration variable.


> one in which you can express a "loop exactly n times" construct directly

The problem with this advice is that there are so many things that the processor may be interested in knowing declaratively, rather than having to claw them back from the semantics of your application somehow, that the problem has become totally unmanageable. And even if we could manage it, the next generation of processors may be completely uninterested in what we've carefully encoded into our programs and actually want some other information instead.


I don't think the problems with that advice are singular. Perhaps the most ultimately-relevant is that looping exactly N times without a "ceremonial" counter is really only very rarely useful. Truly, how often do you see the canonical `for (i = 0; i < N; i++)` pattern in C with no references to `i` anywhere in the loop body? (Hint: if there are any, it's not "ceremonial".) Blindly repeating the exact same sequence of code just isn't something you want to do very often.

To address the issue of "okay, in what languages could we do this anyway?": taken in full mathematical generality, such a capability would necessarily require bignum support, which cuts down your options significantly. I'm not sure if that was actually the intent, but if we instead accept the limitation of "up to some largeish power of two", constructing a macro to do this "directly" in C would also be quite easy. So...(shrug)


I agree that it's rare, but aren't those exactly the cases where this kind of optimization is relevant? In the cases where you're using i and it could overflow, the optimization couldn't be applied anyway.


Premature optimization is the root of all evil - a lot of programs perform more than adequately when just written in a way that directly expresses programmer intent. In cases where you need to push the very limits of performance, you'll have to add extra processor-specific information (or at least information based on educated guesses about how processors in general behave) - I don't think there's any way to avoid that. But if you're including extra information in the source that clutters up developer intent and doesn't help the processor optimize, something has gone very wrong.


Thank you for telling me not to use C and C++. Unfortunately, I cannot follow your advice due to existing codebases.


A lot of languages support C linkage. It can be surprisingly easy to migrate a codebase a little at a time.


What language would you suggest?


The majority of programming languages don't have UB as part of their standard.

It only happened in C, and its derivatives, because by the time ANSI C came to be, no compiler vendor wanted to give up on their hardware-specific behaviors, so it all got tossed under the UB carpet.


EDIT: I see. A different reading of the post than my original take would suggest that the author is saying you, as a C/C++ programmer, are "not allowed" to write code that would cause under/overflow. The compiler assumes that you are following this rule, enabling these optimizations. The author's wording is unclear as to who or what is not being allowed to under/overflow, hence my confusion.

---

Right off the bat-

> Signed integers are not allowed to wrap in C and C++

What? This is...not true.

  $ cat > test.c
  #include <stdio.h>
  #include <limits.h>
  
  int main()
  {
      int i = INT_MAX;
      printf("%i\n%i\n", i, i+1);
  }
  
  $ cc test.c 
  $ ./a.out 
  2147483647
  -2147483648
I'm not an expert on these matters, but I suspect signed integers are not defined as wrapping, and in fact are not defined as doing anything in particular in under/overflow situations, though under the most common representation (two's complement arithmetic) they often do wrap. The optimizations rely on the fact that the programmer can't depend on any specific overflow or underflow behavior.


It is absolutely true. Your test program is not well formed. The result of overflowing or underflowing a signed integer is undefined.

What's tripping you up is that C can be very lax with stuff it doesn't allow. Some stuff is required to generate a compiler error, like if you try to start a variable name with a number. Some stuff is just "undefined behavior" which means the compiler can basically do whatever it wants.

Undefined behavior is often referred to as "nasal demons" because it would be perfectly legal for a conforming C compiler which encounters undefined behavior to make demons fly out of your nose. This is impractical, but real-world compilers can and do crash or silently emit unexpected values in this case.

The most confusing part of this is that a subset of undefined behavior is doing what you want it to do. That's what you're encountering here. The moment you do i+1 you've hit undefined behavior. In this particular case, the compiler is emitting code that causes the value to wrap, which is what you expected. This is still undefined behavior, which means your program is ill formed, and the compiler can change its mind and betray you at any moment!


You are incorrect, or perhaps correct in a nitpicky way[0]. It is undefined behavior, as any Google search can tell you. Your compiler is just being nice; it is allowed to print "screw you" as the result of that program.

For more about undefined behavior, try here: http://blog.llvm.org/2011/05/what-every-c-programmer-should-...

[0] Since it's undefined behavior, the compiler can choose to wrap it. If that's what you're saying, I think it's pretty confusing expressed the way you did.

In spite of that, a well formed C program is not allowed to overflow, and a compiler can compile a program that overflows in absolutely any way it chooses. It is in fact able to optimize the program based on the assumption that the overflow cannot happen, even if it is actually possible.


I think 'ssalazar is interpreting the phrase "not allowed to wrap" to mean "if the C compiler causes them to wrap, it is violating the spec". Undefined behavior means anything is allowed; this includes reasonable behavior.

(Remember that undefined behavior exists for the purpose of allowing optimizations. Adding code to print "screw you" is rarely an optimization. Doing just what the programmer expected is usually an optimization, which is why UB is so tricky; it's just not always an optimization, and doing something unexpected might be faster.)


You're right. I added a note when I thought of this, and if that's what the parent meant, I think it's an exceedingly unhelpful addition.


What you say is correct, but I think people are not helping the discussion by adding correct but misleading statements such as "the compiler is allowed to print 'screw you'". To me it sounds as if the compiler would be trying hard to find an opportunity to sneak in malicious code -- and then shout "told you so!" into my face with an ugly grin.

The core of undefined behavior, for me, is the following: The compiler will, during optimization, convert your program (as written) into a second program (in its intermediate language, data structures, assembler code, ...). This second program is identical to the program as written by you only regarding the defined behavior, and the transformation might not be obvious. For some optimizations (automatic vectorization, multithreading, ...) it will be pretty convoluted. For some simpler ones, it might just replace multiplications by shifts or remove an unnecessary check here and there.

If you rely on undefined behavior for calculations, branches, or loops (such as INT_MAX+1 < INT_MAX), the compiler may create a second program in which a branch may (or may not) be taken, a loop may (or may not) terminate, or a loop may run a few iterations more or less.

Unfortunately some of the statements that got replaced or mangled could be needed to guard certain invariants, avoid smashing the stack, or keep array accesses within bounds. And when you lose these guarantees, of course all bets are off, as you might start executing random code from the data your program is trying to process (unintentionally, or maliciously crafted).

A simple if(i+1<i){...} never was replaced by puts("screw you"), I'm almost... well... hardly... ...it might only happen rarely. :-)
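For comparison, a check that tests for overflow before performing the addition never invokes undefined behavior, so there is nothing for the optimizer to discard. A sketch (not from the thread):

```c
#include <limits.h>
#include <stdbool.h>

/* Test whether a + b would overflow *before* doing the addition,
 * so the overflowing operation is never actually performed. */
bool add_would_overflow(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return true;  /* would overflow */
    if (b < 0 && a < INT_MIN - b) return true;  /* would underflow */
    return false;
}
```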


It's a variant of the "nasal demons" comment, but you might be right. I don't know if that kind of phrase is helpful or not.


It's a little subtler than that. Wrapping two's complement arithmetic is a function of your CPU, not your compiler. Right now, those numbers are being handed directly to the CPU, because the compiler doesn't care what happens in the case of signed overflow. The CPU chooses to implement wrapping arithmetic.

However, the compiler might do something to your program's control flow long before it hits the CPU's arithmetic instructions. For instance, try compiling this function under optimization:

    titan:/tmp geofft$ cat test.c
    int test(int x) {
            if (x > 0) return 1;
            x -= 10;
            if (x > 0) {
                    return 2;
            }
            return 1;
    }
    titan:/tmp geofft$ cc -O3 -c test.c
    titan:/tmp geofft$ objdump -S test.o

    test.o:     file format elf64-x86-64


    Disassembly of section .text:

    0000000000000000 <test>:
       0:   b8 01 00 00 00          mov    $0x1,%eax
       5:   c3                      retq
The compiler makes no promises about what happens in the case of signed integer overflow or underflow. In this case, for numbers in [INT_MIN, INT_MIN + 10), it doesn't promise that wrapping happens -- even though, if it were to actually do the math on my CPU, it would end up with two's complement. So it assumes the second if can never be true if the first if is untrue, and optimizes the function to return a constant 1.

I think you're right that the phrasing is sloppy, and should be "Signed integers are not allowed to overflow or underflow in C and C++," making it an obligation on the programmer not to overflow. If you're interpreting it as an obligation on the compiler not to implement wrapping, then yeah, that's incorrect.
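If you actually wanted to preserve the wrap-dependent behavior of that function, you can express the underflow case explicitly, leaving the compiler nothing to discard. A sketch (test_checked is my name, not geofft's):

```c
#include <limits.h>

/* Matches the wrapping two's complement behavior of the original
 * test() without any signed underflow: the values of x for which
 * x - 10 would wrap around to a positive number are exactly those
 * below INT_MIN + 10. */
int test_checked(int x) {
    if (x > 0) return 1;
    if (x < INT_MIN + 10) return 2;  /* x - 10 would have wrapped positive */
    return 1;
}
```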


  titan:/tmp geofft$ cc -O3 -c test.c
Is your use of 'cc' instead of 'gcc' intentional here? It implies that all compilers make this "optimization", but I think it's currently only GCC. ICC, Clang, and MSVC continue to work the way the programmer intended: Godbolt for ICC and Clang http://goo.gl/jFfvyz, MSVC http://webcompiler.cloudapp.net/.

While it would be unwise to rely on this "polite" behavior from a compiler, I find it difficult to argue that GCC does a better job of compiling this particular code. But perhaps the speedup on similar code is enough to justify the occasional insidious bug?

It would be nice if there was a GCC option that produced a warning here. Is there one?


No, that's just habit and me being slightly sloppy because it's a forum comment (same with the code style, sorry). I did expect this to be common across multiple compilers, so I'm surprised it's only gcc. I'd be curious if there's a similar-in-spirit example for LLVM, since I do expect they do similar optimizations.

Looks like gcc -Wstrict-overflow=3 or higher produces a warning. Alternatively, -fno-strict-overflow disables the optimization, which is otherwise enabled as part of -O2 or higher.
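Beyond flags, GCC (5 and later) and Clang also provide checked-arithmetic builtins that report overflow without ever invoking undefined behavior. A sketch using __builtin_add_overflow:

```c
#include <limits.h>
#include <stdbool.h>

/* __builtin_add_overflow computes a + b with wraparound into *out
 * and returns true iff the mathematical result didn't fit. */
bool checked_add(int a, int b, int *out) {
    return __builtin_add_overflow(a, b, out);
}
```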


Nice example.

More people should know that, under the compiler's no-overflow assumption, the body of the second if statement is unreachable code.



Here's an example indicating how we could use overflow from INT_MAX to INT_MIN to terminate a loop... ...if signed overflow were properly defined.

    /* Write \"hi\" n times, counter will wrap to negative
     * to indicate we've written enough greetings.
     *
     * i=3  -> we want to write hi three times
     *
     * 1st INT_MAX-i+1 = INT_MAX-2 -> write hi
     * 2nd INT_MAX-1               -> write hi
     * 3rd INT_MAX                 -> write hi
     * 4th INT_MIN      (signed overflow, terminates loop)
     */

    int
    write_n_times_hi(int i)
    {
        int j;
        for (j=INT_MAX-i+1; j>0; j++)
                printf("hi\n");
        return i;
    }
This will be compiled into an endless loop by gcc 5.3.0/x86_64 for -O3, but will terminate properly with -O0.

If you change the (undefined for j=INT_MAX)

    j++
into the (defined)

    (*(unsigned*)&j)++,
the compiler may no longer assume that j stays >0 all of the time, and the loop has to terminate.
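A less contortionist rewrite keeps the counter unsigned, where wraparound is defined, and compares against INT_MAX directly. A sketch (it returns the count written rather than i, for illustration):

```c
#include <limits.h>
#include <stdio.h>

/* Same wrap-to-terminate idea with an unsigned counter: once j
 * passes INT_MAX, the (well-defined) unsigned value exceeds
 * INT_MAX and the loop exits. Valid for i >= 1. */
int write_n_times_hi(int i) {
    int written = 0;
    for (unsigned j = (unsigned)(INT_MAX - i + 1); j <= (unsigned)INT_MAX; j++) {
        printf("hi\n");
        written++;
    }
    return written;
}
```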


So compilers use a tricky and unintuitive corner of the C standard as a half-assed substitute for doing real program-wide bounds analysis.


It's not a tricky corner of C. It's the most basic C thing ever if you are not negligent and actually check the standard for what happens when you overflow a signed int.

I mean, the C standard isn't that long or complicated. There aren't many basic types in C; learning how they behave in basic situations like division by 0, overflow, etc. is an hour-long task.


May I refer you to the 13 pages of section J.2 Undefined behavior of the C standard[1]. I hope you don't have any other appointments after your "hour long" task...

[1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf


We are talking about the basic operations on basic types in C, not all possible undefined behaviors. You don't need to learn a list of possible undefined behaviors. You just need to check how your basic types behave when you try to store numbers too big for the number of bits they have. It's something you learn in, like, the first few hours of your introductory C course.


Assuming you aren't targeting multiple systems, each with its own C compiler.



