"You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this." [0]
Which is a completely birdbrained policy, given that signed integer underflow and overflow are completely undefined. If you want to catch implicit signed -> unsigned conversions, then enable that warning on your compiler... what they're advocating is just dangerous.
In a strict typing environment, the other major issue is that int is cross-platform and forward-compatible, whereas uint32_t, uint64_t, uint8_t, uint16_t, etc. will always be unsigned within a specified bound. So whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code that effectively "optimizes" 1 bit of information (never mind the fact that int is usually more optimized than uint these days).
Furthermore, casting uintN_t to int and back again while using shared libraries is a huge pain in the ass and can waste a lot of programmer time that would be better spent elsewhere, especially when working with ints and uints together (casting errors, usually in the form of a misplaced parenthesis, are pretty small and can take a very long time to find).
> int is cross-platform and forward-compatible, whereas uint32_t, uint64_t, uint8_t, uint16_t, etc. will always be unsigned within a specified bound. So whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code
uintN_t (and intN_t) are MORE portable and cross-platform than int, in the sense that you get much better guarantees about their size and layout.
Furthermore, int is NOT the size of the register (x64 commonly has a 32-bit int), so any updating you'd have to do to uintN_t, you'd have to do to int as well. Regardless, I can't imagine why you'd need to do any updating in the first place - it's perfectly valid to stick a uint32_t in a 64-bit register.
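To illustrate, here is a minimal sketch; the exact sizes are ABI-dependent, but LP64 (64-bit Linux or macOS) is typical for x64:

#include <cstdio>

int main() {
    // On a typical LP64 x86-64 system this prints "4 8 8": int stays
    // 32 bits even though the general-purpose registers are 64 bits wide.
    std::printf("%zu %zu %zu\n", sizeof(int), sizeof(long), sizeof(void*));
    return 0;
}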
> nevermind the fact that int is usually more optimized than uint these days
Where are ints more optimized than uints? Not in the processor, not in the compiler (modulo undefined behavior on overflow), and not in libraries.
> so whenever we have 128-bit or 256-bit registers, we'll have to go back and update all this code that effectively "optimizes" 1 bit of information
This is why we have uint_least8_t and friends. In fact, int is really just another int_least16_t.
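A small sketch of the distinction (widths beyond the guaranteed minimums are implementation-chosen):

#include <cstdint>
#include <cstdio>

int main() {
    uint32_t      exact = 0; // exactly 32 bits; only exists if the platform has them
    uint_least8_t least = 0; // at least 8 bits, possibly wider
    // int itself only promises the 16-bit minimum range, like int_least16_t.
    std::printf("%zu %zu %zu\n", sizeof(exact), sizeof(least), sizeof(int));
    return 0;
}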
> Furthermore, casting uintN_t to int and back again while using shared libraries is a huge pain in the ass and can waste a lot of programmer time that would be better spent elsewhere
Could you give an example? It sounds like you're just talking about performing the casts, which shouldn't take much effort at all given how indiscriminately C converts between integral values.
You don't have to update code. If 64 bits was enough on a 64-bit CPU, it'll be enough on a 128-bit CPU. The one exception is when dealing with quantities that actually depend on the bit width of the CPU, like dealing with array sizes. The language already has good types for this, like size_t, and using int won't save you. (Quite the contrary, int will sink you, because int is almost always 32 bits even on 64-bit systems.)
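For example (a sketch, with a hypothetical sum function):

#include <cstddef>
#include <vector>

// std::size_t tracks the platform's address width, so this loop needs no
// update when pointers grow; a 32-bit int used as the index could not
// reach every element of a sufficiently large array.
long long sum(const std::vector<long long>& v) {
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}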
I had my first nasty production bug (back in the early 2000s) when I assumed an Integer was 32-bit in VBScript.
2 billion survey results was never going to happen. 32,767 would have been fine as well, except that, to compound the issue, ops had pointed the production site at the test database.
Are your choices between variable-width "int" and fixed-width "uint_x"? After all, in C you can just declare something "unsigned" and it's the width of int.
However, I think this is a problem. The expected value ranges of your variables don't change just because your memory bus got wider - maybe you can use more than 4GB of memory in a process now, but it's a mistake to plan for single array indexes being more than 32-bit.
If you do try to be more flexible, I'm sure this would introduce more bugs than the forward compatibility would be worth. Especially if 'int' is smaller than on the platform you tested on. That's why languages like Java and C# always have a 32-bit int on every platform.
> casting errors, usually in the form of a misplaced parenthesis, are pretty small and can take a very long time to find
Agreed, but writing casts also adds unwarranted explicitness. What if someone made a typo and put the wrong type in the cast? How do you tell what's right? What if you change the type of the lvalue or the cast value? Now you have to think about each related cast you added.
What's the alternative? Well, the compiler should just know what you mean…
Int is not cross-platform and forward-compatible. It's implementation-defined, so it's up to the compiler. Practically speaking, every modern compiler defines int as 4 bytes, and can be expected to never change that (because of the vast swaths of bad code out there that is written with the assumption that an int is 4 bytes). So it's not forward-compatible. And while on most platforms you can expect the compiler to have picked 4 bytes, it's certainly possible for compilers to pick other sizes for int (I would assume compilers for embedded architectures might do that), which means it's not cross-platform either.
The size of int is implementation-dependent, but its minimal range isn't. If I'm representing integer quantities between -32767 and +32767 with int, then it will work reliably across all platforms and compilers that are C99 compliant. I believe that's what GP is referring to.
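That minimum can even be spelled out in code (a sketch):

#include <climits>

// C99 and C++ only guarantee INT_MIN <= -32767 and INT_MAX >= 32767, so
// code that stays inside that range is portable whatever sizeof(int) is.
static_assert(INT_MAX >= 32767 && INT_MIN <= -32767,
              "guaranteed by the standard, so this never fires");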
"Completely undefined" is a good thing, because it's a strict line between good and bad (good and evil?). So, now that you know all integer overflows are bad, you can:
* dynamically test your program with ubsan to be sure they really don't happen, and then
* let your compiler optimize with the knowledge that integers won't overflow.
This last one eliminates maybe half the possible execution paths it can see, and loop structure optimizations practically don't work without it.
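For instance, UBSan flags the overflow at runtime (a minimal sketch; the diagnostic wording varies by compiler):

// Compile with: clang++ -fsanitize=undefined overflow.cc && ./a.out
#include <climits>
#include <cstdio>

int main() {
    int x = INT_MAX;
    ++x; // UBSan reports a signed integer overflow here
    std::printf("%d\n", x);
    return 0;
}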
On the other hand, unsigned overflows? Some of those are bad, but some are fine, right? How will an analyzer know which is which?
Some notable libraries, like the C++ STL, want you to write loops with unsigned math (size_t iterations), but those people invented C++, so why would you trust them with anything else?
Ubsan won't catch signed integer overflow unless you happen to hit the overflow case during your tests. Relying on dynamic analysis to catch errors you should have avoided statically is shoddy.
It's certainly less complete, but it's a little harder to decide what you want to prove statically.
If a function must-overflow, the optimizer (hopefully) replaces the entire thing with an abort under UBSan, so you could look for that. But that's probably not sensitive enough.
And if the function is just 'x + 1', that may-overflow, but it's not important.
To be fair, even though unsigned integer overflow is very well defined, it's most certainly NOT what you want in an index or counter of anything.
+1. Every time I see for(int i=0;...;i++) I wonder why we have developed this habit of defaulting to signed int and treating uint as taboo (most coding guidelines ask not to use them unless "you know what you are doing"). Most of the time we use integers for counting, so uint would have been the more natural choice. I did this in one of my libraries that I was writing from scratch, and I was happy for a while, but then I got into trouble: there is a lot of code out there with interfaces expecting signed ints even though they should be using uint. So ultimately the legacy code forced me back to defaulting to signed int.
> I wonder why we have developed this habit of defaulting to signed int and treating uint as taboo (most coding guidelines ask not to use them unless "you know what you are doing").
I'm pretty sure that it's just because "int" is one word and "unsigned int" is two, plus more than twice the characters. I suspect if "int" defaulted to "unsigned int" and you'd have to specify signed ints explicitly, the taboo would be reversed.
Never underestimate the power of trivial inconveniences.
Forget about for statements for a second and let's write both a counting up and a counting down loop using while statements.
// count up
std::size_t i = 0;
while (i != 10)
{
    std::cout << i << "\n";
    ++i;
}

// count down
std::size_t i = 10;
while (i != 0)
{
    --i;
    std::cout << i << "\n";
}
After initialization, a for statement repeats "test; body; advance". This is ideal for counting-up loops, but what we need for counting-down loops is "test; advance; body". Since C/C++ do not provide the latter as a primitive, you have to use a while loop as shown above. Using a signed integer to shoehorn a counting-down loop into a for statement at the cost of half your range is a hack IMO. Note that when working with iterators you have to resort to a while statement, as iterating past begin is UB.
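For completeness, the commonly cited way to keep a for statement with an unsigned counter is to move the decrement into the condition (a sketch, where n is some hypothetical count):

// The post-decrement in the condition yields "test; advance; body",
// so i takes the values n-1, n-2, ..., 0 and the loop exits cleanly.
for (std::size_t i = n; i-- > 0; )
    std::cout << i << "\n";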
> Still, the trick makes it look suspect and that's an argument against using it.
This is true. The code is confusing to people not used to it. A workaround could be to hide this code inside a macro, so people not interested in digging into the code would take the macro's word:
#define REVERSE_LOOP( x, i ) for( size_t i = x.size(); i-- > 0; )
But unfortunately, that doesn't help with the fear that people have of unsigned types.
In this case, it makes absolutely no difference at all. It could be argued that writing unsigned int would make the code slightly harder to read. That said, I like to use stdint.h, and uint32_t would, I think, not have any drawbacks.
> there is a lot of code out there with interfaces expecting signed ints even though they should be using uint
That's not a good reason not to use unsigned integers; it's a zero-overhead cast from unsigned to signed (at the risk of overflowing into the negative).
Yeah, some numbers can never be negative, but their difference can be, and that's when it usually comes back to bite me in the ass. I almost never use unsigned ints now.
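A minimal sketch of the pitfall (the variable names are made up):

#include <cinttypes>
#include <cstdint>
#include <cstdio>

int main() {
    uint32_t items_sold = 3;
    uint32_t items_returned = 5;
    // Both counts are legitimately non-negative, but their difference
    // wraps modulo 2^32: this prints 4294967294 rather than -2.
    uint32_t net = items_sold - items_returned;
    std::printf("%" PRIu32 "\n", net);
    return 0;
}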
I disagree; signed integer arithmetic in C and C++ is just toxic. Sure, if you need to compute the difference between two integers which have both been pre-checked to lie between, say, -100 and +100, then fine, use signed ints... but for arbitrary input you need to do more work.
There's example code in the CERT secure coding guidelines (look under 'Subtraction'):
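Not the guideline's exact code, but the idea is a precondition check along these lines (a sketch):

#include <climits>

// Writes a - b to *res and returns true only when the subtraction
// cannot overflow; otherwise leaves *res untouched and returns false.
bool checked_sub(int a, int b, int* res) {
    if ((b > 0 && a < INT_MIN + b) || // a - b would fall below INT_MIN
        (b < 0 && a > INT_MAX + b)) { // a - b would exceed INT_MAX
        return false;
    }
    *res = a - b;
    return true;
}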
All arithmetic in C and C++ is toxic. That's the reality of using bounded-precision types. Honestly, I wish they'd had the foresight not to use the traditional infix operators for built-in types; they practically beg programmers to implicitly treat built-in types like the mathematical types they very vaguely resemble.
Really, working directly in fixed-precision arithmetic is absurd. In order to be able to rely on its correctness with any degree of certainty, you need to very carefully track each operation and its bounds, at which point you may as well have just used arbitrary-precision types, explicitly encoded your constraints, and had the compiler optimize things down to scalar types when possible, warning when not.
The funny thing is that fixed-precision arithmetic is used literally everywhere, and it just works. I'd say it's good enough for most practical purposes.
It is not used literally everywhere. It does not always "just work," as the original post demonstrates. It often happens to be good enough for most practical purposes, yes, but arbitrary-precision arithmetic is better for most practical purposes.
Fixed-precision arithmetic has one main advantage over arbitrary-precision arithmetic: it is more time- and space-efficient. This advantage only applies if the fixed-precision arithmetic is actually correct and the fixed-precision arithmetic meets some concrete time or space constraint which arbitrary-precision arithmetic fails to meet. It generally takes time and effort to demonstrate that these conditions hold; because one can rely on the correctness of arbitrary-precision arithmetic without doing so, arbitrary-precision arithmetic should then generally be the default choice.
This assumes that you care about making relatively strong guarantees about the correctness of your programs. If for some reason you don't, then sure, use ints and whatnot for everything. If you do, though, I suspect you'll find that it's easier to track down a performance bottleneck caused by using bignums than an obscure bug triggered by GCC applying an inappropriate optimization based on overflow analysis.
PostgreSQL doesn't give you an unsigned int option, but if it did, I wouldn't use it.
Having a negative pkey space is actually useful. In LSMB we reserve all negative ids for test cases, which are guaranteed to roll back. This has a number of advantages, including the ability to run a full test run on a production system without any possibility of leaving traces in the db.
Most DBs don't support unsigned int [0] as a type (though it's perfectly sensible to have a constraint that enforces >0).
[0] though several do support UUIDs, which are essentially unsigned 128-bit ints, and which (with a well-selected generation mechanism) are better as server-assigned surrogate keys than sequential integers, signed or unsigned, anyway.
That seems like bad advice to me. A possible infinite loop is given as justification, in the case of wrongly implemented reverse iteration (counting down an unsigned loop variable). Well, I claim that an infinite loop is a much more noticeable bug than undefined overflow behaviour, negative view counts, etc. Unsigned ints make certain bugs impossible that, with signed ints, will (hopefully, famous last words) at best trigger assertions, if those are enabled...
One problem with this is that the sizes of STL containers are returned unsigned, and with high warning levels, compilers will warn about comparing a signed int with one of these sizes.
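The classic instance (a sketch; GCC and Clang flag the comparison under -Wsign-compare):

#include <vector>

void visit_all(const std::vector<int>& v) {
    // warning: comparison of integer expressions of different signedness
    for (int i = 0; i < v.size(); ++i) {
        (void)v[i];
    }
}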
"You should not use the unsigned integer types such as uint32_t, unless there is a valid reason such as representing a bit pattern rather than a number, or you need defined overflow modulo 2^N. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this." [0]
[0]: http://google-styleguide.googlecode.com/svn/trunk/cppguide.h...