> There are also 255 representations of almost all representable numbers. [...] Aside from the fact that you're wasting an entire byte of your representation
How is this different than any other floating-point representation? I'm pretty sure IEEE floating-point has the same redundancy, though numbers are normalized so comparisons are cheaper as you note. But IEEE doubles "waste" even more bits due to the 2^52 representations of NaN.
> For most comparisons [...] it will take around FIFTY INSTRUCTIONS TO CHECK IF TWO NUMBERS ARE EQUAL OR NOT.
Good point, sounds like a notable weakness and barrier to adoption.
> Crockford's bugaboo with IEEE 754 floating-point is bizarre, verging on pathological.
He calls it "the most frequently reported bug in JavaScript." Wouldn't you be interested in improving on the most common cause of user confusion in a technology you care about?
IEEE 754 doesn't waste any bits – there is only a single representation of each value (except for multiple NaNs). In this proposal, there are 255 representations of most values, which means that it has almost an entire byte of redundancy. The waste is bad, but the lack of a canonical representation of each value is worse.
I personally think that the way to handle floating-point confusion is better user education. However, if you really want a decimal standard, then, as I mentioned above, there already is one that is part of the IEEE 754 standard. Not only do there exist hardware implementations, but there are also high-quality software implementations.
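For what it's worth, a minimal sketch of that point using Python's standard decimal module (whose arithmetic follows the General Decimal Arithmetic specification underlying the IEEE 754-2008 decimal formats – an illustration of a software implementation, not an endorsement of any particular one):

>>> from decimal import Decimal
>>> Decimal("0.1") + Decimal("0.2")
Decimal('0.3')
>>> Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
True

Constructing from strings keeps the values exactly decimal, which is what makes the comparison come out true.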
A better approach to making things more intuitive in all bases, not just base 10, is using rational numbers. The natural way is to use reduced pairs of integers, but this is unfortunately quite prone to overflow. You can improve on that by using reduced ratios of – guess what – floating-point numbers.
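As a quick illustration of the rational approach (a sketch with Python's built-in Fraction type, which keeps pairs reduced automatically and, being arbitrary-precision, sidesteps the overflow issue at the cost of unbounded growth):

>>> from fractions import Fraction
>>> Fraction(1, 10) + Fraction(2, 10)
Fraction(3, 10)
>>> Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10)
True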
> There are also 255 representations of almost all representable numbers. For example, 10 is 1 x 10^1 or 10 x 10^0 – or any one of 253 other representations.
You are not correct. The representation with the smallest significand is 1 x 10^1, and you can't go any further into positive exponents. Conversely, the largest power of 10 that fits in a 56-bit signed significand is 10 000 000 000 000 000, which pairs with an exponent of -15. So there are exactly 17 representations of 10, and that's the worst it gets. All other numbers except powers of 10 have fewer representations, and most real-world data affected by noise has a single representation, because it uses the full precision of the significand and can't be shifted right or left without overflow or loss of precision.
So the redundancy is much less than you think: one in 10 real values has two representations, one in 100 has three, and so on. This is common for other decimal formats and not that big of a problem; detecting zero is a simple NOR gate on all significand bits.
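A sketch of that counting argument, assuming the layout described in the DEC64 write-up (a 56-bit signed coefficient and an 8-bit signed exponent, with -128 reserved for NaN):

COEFF_MAX = 2**55 - 1          # largest 56-bit signed coefficient
EXPONENTS = range(-127, 128)   # -128 is reserved for NaN

count = 0
for exp in EXPONENTS:
    # 10 == coeff * 10**exp needs coeff = 10**(1 - exp), so exp <= 1 keeps
    # the coefficient an integer, and it must still fit in 56 signed bits.
    if exp <= 1 and 10 ** (1 - exp) <= COEFF_MAX:
        count += 1

print(count)  # 17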
The real problem with this format is the very high price in hardware (changing the exponent requires recomputing the significand) and its complete unsuitability for any kind of numerical problem or scientific number crunching. Designing a floating-point format takes numerical scientists and hardware designers, not assembly programmers and language designers.
Heck, the only reason he put the exponent in the lower byte and not the upper byte, where it would have ensured perfect compatibility with most positive integers, is that x64 assembly does not allow direct access to the upper byte.
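To make the layout point concrete, here is a sketch assuming the packing described in the DEC64 write-up (coefficient in the high 56 bits, exponent in the low 8 bits), positive coefficients only:

def dec64_pack(coefficient, exponent):
    # DEC64 as described: coefficient in the high 56 bits, exponent in the low 8.
    return (coefficient << 8) | (exponent & 0xFF)

def upper_byte_pack(coefficient, exponent):
    # The alternative layout: exponent in the top byte instead.
    return ((exponent & 0xFF) << 56) | coefficient

print(hex(dec64_pack(5, 0)))       # 0x500 - not the same bits as the integer 5
print(hex(upper_byte_pack(5, 0)))  # 0x5   - identical to the plain integer 5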
You are right. I forgot about the interaction between the exponent and the significand. The lack of a canonical representation is still quite problematic.
In one format, IEEE 754 has 24576 possible representations of zero[1], which fits your definition of "wasted bits". Some of your other criticisms might be valid, but at this point I'd like to see an accurate technical comparison between DEC64 and the decimal formats of IEEE 754.
This is why decimal floating-point formats are kind of a disaster in general and are only implemented in relatively rare hardware intended for financial uses. In many of those applications, using a decimal fixed-point representation is better – i.e. counting in millionths of pennies (you can still count up to roughly ±92 billion dollars with 64 bits). But yes, a technical comparison of different decimal formats would definitely be interesting. I suspect that despite the occasional failure of intuitiveness, we're far better off with binary formats and better programmer education.
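A sketch of that fixed-point idea with plain integer arithmetic (Python ints standing in for a signed 64-bit counter; one unit is a millionth of a penny, i.e. 10^-8 dollars, so a 64-bit count spans roughly ±92 billion dollars):

UNITS_PER_DOLLAR = 100 * 1_000_000   # millionths of a penny

price = 19_99 * 1_000_000            # $19.99, represented exactly
tax   = price * 825 // 10_000        # 8.25% tax, truncated toward zero
total = price + tax

print(total)                         # 2163917500 units
print(total / UNITS_PER_DOLLAR)      # 21.639175 dollars (float only for display)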
I feel like that bug has more to do with JavaScript's (and clearly Crockford's) inane desire to pretend that all numeric types are the same.
Nobody who does anything with numbers believes that! Even if all you can do is count on your fingers, you believe in the difference between integers and floats. They have entirely different algebraic properties, and it takes a whole hell of a lot of work to get from one to the other – there's even a whole class (fractions) in between.
I'm not sure what that has to do with it. Even if you are ok with the idea that integers and "decimal numbers" are different, it's still confusing that 0.1 + 0.2 != 0.3.
It's confusing because it is very difficult to look at a decimal number and know whether it can be represented exactly as base-2 floating point. It's especially confusing because you get no feedback about it! Here is a Ruby session:
The precise value of double(0.1) is 0.1000000000000000055511151231257827021181583404541015625. That is precise, not an approximation.
If you know of a program in any of these languages that will print this value for "0.1" using built-in functionality, please let me know because I would love to know about it.
Likewise the precise value of double(1e50) is 100000000000000007629769841091887003294964970946560. Anything else is an approximation of its true value.
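(As a sketch, Python's standard decimal module will show those exact stored values – not by default printing, but because Decimal(float) converts the underlying double without rounding:)

>>> from decimal import Decimal
>>> Decimal(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> Decimal(1e50)
Decimal('100000000000000007629769841091887003294964970946560')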
In another message you said that what's really important is that the string representation uniquely identifies the precise value. While that will help you reconstruct the value later, it does not help you understand why 0.1 + 0.2 != 0.3.
It helps because 0.1 + 0.2 produces 0.30000000000000004 for 64-bit floats – so at least you can see that this value isn't the same as 0.3. In Ruby you just get two values that print the same yet aren't equal, which is way more confusing. I agree that printing the minimal number of digits required for reconstruction does not help with explaining why 0.1, 0.2 and 0.3 in 64-bit floats aren't the real values 1/10, 2/10 and 3/10.
rasky at monocle in ~
↪ python
Python 2.7.5 (default, Sep 2 2013, 05:24:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0.1
0.1
>>> ^D
rasky at monocle in ~
↪ python3
Python 3.3.3 (default, Dec 24 2013, 13:54:32)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0.1
0.1
It used to, but it was changed to reduce users' confusion.
We may be talking past each other here. Ruby is lying in the sense that there are multiple distinct float values that it will print as 0.3 – in particular, confusion ensues when two values look the same but are unequal. These other languages print each distinct float value differently, using just enough decimal digits to reconstruct the exact binary value. Ruby doesn't give you enough digits to reconstruct the value you have. Nobody actually prints the full correct value, because it's fifty digits long and completely redundant given that you know you're dealing with a 64-bit float.
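(A sketch of the "just enough digits" behaviour, here in Python 3, where repr produces the shortest string that round-trips to the same bits:)

>>> x = 0.1 + 0.2
>>> x
0.30000000000000004
>>> float(repr(x)) == x
True
>>> x == 0.3
False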
Yeah, that's my point. As real numbers, 0.1 and 0.2 do add exactly to 0.3, but in any finite representation of the reals you'll get rounding errors on some computations like this. If you use an infinite representation, then equality is no longer computable.