"You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this." [0]
Which is a completely birdbrained policy, given that signed integer underflow and overflow are completely undefined. If you want to catch implicit signed -> unsigned conversions, then enable that warning on your compiler... what they're advocating is just dangerous.
In a strict typing environment, the other major issue is that int is cross-platform and forward-compatible, whereas uint32_t, uint64_t, uint8_t, uint16_t, etc. will always be unsigned within a specified bound. So whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code that effectively "optimizes" 1 bit of information (never mind the fact that int is usually more optimized than uint these days).
Furthermore, casting uintN_t to int and back again while using shared libraries is a huge pain in the ass and can waste a lot of programmer time that would be better spent elsewhere, especially when working with ints and uints together (casting errors, usually in the form of a misplaced parenthesis, are pretty small and can take a very long time to find).
> int is cross-platform and forward-compatible, whereas uint32_t, uint64_t, uint8_t, uint16_t, etc. will always be unsigned within a specified bound. So whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code
uintN_t (and intN_t) are MORE portable and cross-platform than int, in the sense that you get much better guarantees about their size and layout.
Furthermore, int is NOT the size of the register (x64 commonly has a 32-bit int), so any updating you'd have to do to uintN_t, you'd have to do to int as well. Regardless, I can't imagine why you'd need to do any updating in the first place - it's perfectly valid to stick a uint32_t in a 64-bit register.
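To illustrate, here is a minimal sketch; the exact sizes are ABI-dependent, but LP64 (64-bit Linux or macOS) is typical for x64:

#include <cstdio>

int main() {
    // On a typical LP64 x86-64 system this prints "4 8 8": int stays
    // 32 bits even though the general-purpose registers are 64 bits wide.
    std::printf("%zu %zu %zu\n", sizeof(int), sizeof(long), sizeof(void*));
    return 0;
}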
> nevermind the fact that int is usually more optimized than uint these days
Where are ints more optimized than uints? Not in the processor, not in the compiler (modulo undefined behavior on overflow), and not in libraries.
> so whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code that effectively "optimizes" 1 bit of information
This is why we have uint_least8_t and friends. In fact, int is really just another int_least16_t.
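A small sketch of the distinction (widths beyond the guaranteed minimums are implementation-chosen):

#include <cstdint>
#include <cstdio>

int main() {
    uint32_t      exact = 0; // exactly 32 bits; only exists if the platform has them
    uint_least8_t least = 0; // at least 8 bits, possibly wider
    // int itself only promises the 16-bit minimum range, like int_least16_t.
    std::printf("%zu %zu %zu\n", sizeof(exact), sizeof(least), sizeof(int));
    return 0;
}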
> Furthermore, casting uintN_t to int and back again while using shared libraries is a huge pain in the ass and can waste a lot of programmer time that would be better spent elsewhere
Could you give an example? It sounds like you're just talking about performing the casts, which shouldn't take much effort at all given how indiscriminately C converts between integral values.
You don't have to update code. If 64 bits was enough on a 64-bit CPU, it'll be enough on a 128-bit CPU. The one exception is when dealing with quantities that actually depend on the bit width of the CPU, like dealing with array sizes. The language already has good types for this, like size_t, and using int won't save you. (Quite the contrary, int will sink you, because int is almost always 32 bits even on 64-bit systems.)
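For example (a sketch, with a hypothetical sum function):

#include <cstddef>
#include <vector>

// std::size_t tracks the platform's address width, so this loop needs no
// update when pointers grow; a 32-bit int used as the index could not
// reach every element of a sufficiently large array.
long long sum(const std::vector<long long>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}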
I had my first nasty production bug (back in the early 2000s) when I assumed an Integer was 32-bit in VBScript.
2 billion survey results was never going to happen. 32,767 would have been fine as well, except that, to compound the issue, ops had pointed the production site at the test database.
Are your choices between variable-width "int" and fixed-width "uint_x"? After all, in C you can just declare something "unsigned" and it's the width of int.
However, I think this is a problem. The expected value ranges of your variables don't change just because your memory bus got wider - maybe you can use more than 4GB of memory in a process now, but it's a mistake to plan for single array indexes being more than 32-bit.
If you do try to be more flexible, I'm sure this would introduce more bugs than the forward compatibility would be worth. Especially if 'int' is smaller than on the platform you tested on. That's why languages like Java and C# always have a 32-bit int on every platform.
> casting errors, usually in the form of a misplaced parenthesis, are pretty small and can take a very long time to find
Agreed, but writing casts also adds unwarranted explicitness. What if someone made a typo and put the wrong type in the cast? How do you tell what's right? What if you change the type of the lvalue or the cast value? Now you have to think about each related cast you added.
What's the alternative? Well, the compiler should just know what you mean…
Int is not cross-platform and forward-compatible. It's implementation-defined, so it's up to the compiler. Practically speaking, every modern compiler defines int as 4 bytes, and can be expected to never change that (because of the vast swaths of bad code out there that is written with the assumption that an int is 4 bytes). So it's not forward-compatible. And while on most platforms you can expect the compiler to have picked 4 bytes, it's certainly possible for compilers to pick other sizes for int (I would assume compilers for embedded architectures might do that), which means it's not cross-platform either.
The size of int is implementation-dependent, but its minimal range isn't. If I'm representing integer quantities between -32767 and +32767 with int, then it will work reliably across all platforms and compilers that are C99 compliant. I believe that's what GP is referring to.
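That minimum can even be spelled out in code (a sketch):

#include <climits>

// C99 and C++ only guarantee INT_MIN <= -32767 and INT_MAX >= 32767, so
// code that stays inside that range is portable whatever sizeof(int) is.
static_assert(INT_MAX >= 32767 && INT_MIN <= -32767,
              "guaranteed by the standard, so this never fires");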
"Completely undefined" is a good thing, because it's a strict line between good and bad (good and evil?). So, now that you know all integer overflows are bad, you can:
* dynamically test your program with ubsan to be sure they really don't happen, and then
* let your compiler optimize with the knowledge that integers won't overflow.
This last one eliminates maybe half the possible execution paths it can see, and loop structure optimizations practically don't work without it.
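For instance, UBSan flags the overflow at runtime (a minimal sketch; the diagnostic wording varies by compiler):

// Compile with: clang++ -fsanitize=undefined overflow.cc && ./a.out
#include <climits>
#include <cstdio>

int main() {
    int x = INT_MAX;
    ++x; // UBSan reports a signed integer overflow here
    std::printf("%d\n", x);
    return 0;
}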
On the other hand, unsigned overflows? Some of those are bad, but some are fine, right? How will an analyzer know which is which?
Some notable libraries, like the C++ STL, want you to write loops with unsigned math (size_t iterations), but those people invented C++, so why would you trust them with anything else?
Ubsan won't catch signed integer overflow unless you happen to hit the overflow case during your tests. Relying on dynamic analysis to catch errors you should have avoided statically is shoddy.
It's certainly less complete, but it's a little harder to decide what you want to prove statically.
If a function must-overflow, the optimizer (hopefully) replaces the entire thing with an abort under UBSan, so you could look for that. But that's probably not sensitive enough.
And if the function is just 'x + 1', that may-overflow, but it's not important.
To be fair, even though unsigned integer overflow is very well defined, it's most certainly NOT what you want in an index or counter of anything.
+1. Every time I see for(int i=0;...;i++) I wonder why we have developed this habit of defaulting to signed int and treating uint as taboo (most coding guidelines ask not to use them unless "you know what you are doing"). Most of the time we use integers for counting, so uint would have been the more natural choice. I did this in one of my libraries that I was writing from scratch, and I was happy for a while, but then I got into trouble: there is a lot of code out there with interfaces expecting signed ints even though they should be using uint. So ultimately the legacy code forced me back to defaulting to signed int.
> I wonder why we have developed this habit of defaulting to signed int and treating uint as taboo (most coding guidelines ask not to use them unless "you know what you are doing").
I'm pretty sure that it's just because "int" is one word and "unsigned int" is two, plus more than twice the characters. I suspect if "int" defaulted to "unsigned int" and you'd have to specify signed ints explicitly, the taboo would be reversed.
Never underestimate the power of trivial inconveniences.
Forget about for statements for a second and let's write both a counting up and a counting down loop using while statements.
// count up
std::size_t i = 0;
while (i != 10)
{
    std::cout << i << "\n";
    ++i;
}

// count down
std::size_t i = 10;
while (i != 0)
{
    --i;
    std::cout << i << "\n";
}
After initialization, a for statement repeats "test; body; advance". This is ideal for counting-up loops, but what we need for counting-down loops is "test; advance; body". Since C/C++ do not provide the latter as a primitive, you have to use a while loop as shown above. Using a signed integer to shoehorn a counting-down loop into a for statement at the cost of half your range is a hack IMO. Note that when working with iterators you have to resort to a while statement, as iterating past begin is UB.
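For completeness, the commonly cited way to keep a for statement with an unsigned counter is to move the decrement into the condition (a sketch, where n is some hypothetical count):

// The post-decrement in the condition yields "test; advance; body",
// so i takes the values n-1, n-2, ..., 0 and the loop exits cleanly.
for (std::size_t i = n; i-- > 0; )
    std::cout << i << "\n";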
> Still, the trick makes it look suspect and that's an argument against using it.
This is true. The code is confusing to people not used to it. A workaround could be to hide this code inside a macro, so people not interested in digging into the code would take the macro's word:
#define REVERSE_LOOP( x, i ) for( size_t i = x.size(); i-- > 0; )
But unfortunately, that doesn't help with the fear that people have of unsigned types.
In this case, it makes absolutely no difference at all. It could be argued that writing unsigned int would make the code slightly harder to read. That said, I like to use stdint.h, and uint32_t would, I think, not have any drawbacks.
> there is a lot of code out there with interfaces expecting signed ints even though they should be using uint
That's not a good reason not to use unsigned integers; it's a zero-overhead cast from unsigned to signed (at the risk of overflowing into the negative).
Yeah, some numbers can never be negative, but their difference can be, and that's when it usually comes back to bite me in the ass. I almost never use unsigned ints now.
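A minimal sketch of the pitfall (the variable names are made up):

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
    uint32_t items_sold = 3;
    uint32_t items_returned = 5;
    // Both counts are legitimately non-negative, but their difference
    // wraps modulo 2^32: this prints 4294967294 rather than -2.
    uint32_t net = items_sold - items_returned;
    std::printf("%" PRIu32 "\n", net);
    return 0;
}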
I disagree; signed integer arithmetic in C and C++ is just toxic. Sure, if you need to compute the difference between two integers which have both been pre-checked to lie between, say, -100 and +100, then fine, use signed ints... but for arbitrary input you need to do more work.
There's example code in the CERT secure coding guidelines (look under 'Subtraction'):
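Not the guideline's exact code, but the idea is a precondition check along these lines (a sketch):

#include <climits>

// Writes a - b to *res and returns true only when the subtraction
// cannot overflow; otherwise leaves *res untouched and returns false.
bool checked_sub(int a, int b, int* res) {
    if ((b > 0 && a < INT_MIN + b) || // a - b would fall below INT_MIN
        (b < 0 && a > INT_MAX + b)) { // a - b would exceed INT_MAX
        return false;
    }
    *res = a - b;
    return true;
}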
All arithmetic in C and C++ is toxic. That's the reality of using bounded-precision types. Honestly, I wish they'd had the foresight not to use the traditional infix operators for built-in types; they practically beg programmers to implicitly treat built-in types like the mathematical types they very vaguely resemble.
Really, working directly in fixed-precision arithmetic is absurd. In order to be able to rely on its correctness with any degree of certainty, you need to very carefully track each operation and its bounds, at which point you may as well have just used arbitrary-precision types, explicitly encoded your constraints, and had the compiler optimize things down to scalar types when possible, warning when not.
The funny thing is that fixed-precision arithmetic is used literally everywhere, and it just works. I'd say it's good enough for most practical purposes.
It is not used literally everywhere. It does not always "just work," as the original post demonstrates. It often happens to be good enough for most practical purposes, yes, but arbitrary-precision arithmetic is better for most practical purposes.
Fixed-precision arithmetic has one main advantage over arbitrary-precision arithmetic: it is more time- and space-efficient. This advantage only applies if the fixed-precision arithmetic is actually correct and the fixed-precision arithmetic meets some concrete time or space constraint which arbitrary-precision arithmetic fails to meet. It generally takes time and effort to demonstrate that these conditions hold; because one can rely on the correctness of arbitrary-precision arithmetic without doing so, arbitrary-precision arithmetic should then generally be the default choice.
This assumes that you care about making relatively strong guarantees about the correctness of your programs. If for some reason you don't, then sure, use ints and whatnot for everything. If you do, though, I suspect you'll find that it's easier to track down a performance bottleneck caused by using bignums than an obscure bug triggered by GCC applying an inappropriate optimization based on overflow analysis.
PostgreSQL doesn't give you an unsigned int option, but if it did, I wouldn't use it.
Having a negative pkey space is actually useful. In LSMB we reserve all negative ids for test cases, which are guaranteed to roll back. This has a number of advantages, including the ability to run a full test run on a production system without any possibility of leaving traces in the db.
Most DBs don't support unsigned int [0] as a type (though it's perfectly sensible to have a constraint that enforces >0).
[0] though several do support UUIDs, which are essentially unsigned 128-bit ints, and which (with a well-selected generation mechanism) are better as server-assigned surrogate keys than sequential integers, signed or unsigned, anyway.
That seems like bad advice to me. A possible infinite loop is given as justification, in the case of wrongly implemented reverse iteration (counting down an unsigned loop variable). Well, I claim that an infinite loop is a much more noticeable bug than undefined overflow behaviour, negative view counts, etc. Unsigned ints make certain bugs impossible that, with signed ints, will (hopefully, famous last words) at best trigger assertions, if those are enabled...
One problem with this is that the sizes of STL containers are returned unsigned, and with high warning levels, compilers will warn about comparing a signed int with one of these sizes.
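The classic instance (a sketch; GCC and Clang flag the comparison under -Wsign-compare):

#include <vector>

void visit_all(const std::vector<int>& v) {
    // warning: comparison of integer expressions of different signedness
    for (int i = 0; i < v.size(); ++i) {
        (void)v[i];
    }
}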
"You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this." [0]
[0]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.h...