If you want a default decimal floating-point type, the only defensible choice is the decimal128 type standardized by IEEE 754. It has a fully-defined arithmetic, specified by experts who have spent their careers thinking about the issues involved, and is wide enough to exactly represent the US national debt in every currency used on earth.
There are situations where other decimal floating-point types are appropriate, but if you do not understand the tradeoffs you are making, you should be using decimal128.
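For anyone who wants to poke at it, here is a minimal sketch (mine, not anything official) of decimal128-level arithmetic using Python's standard decimal module configured to the decimal128 parameters (34 significant digits, exponent range roughly -6143..+6144). It mimics the arithmetic behaviour, not the IEEE bit encoding:

    from decimal import Decimal, Context, ROUND_HALF_EVEN

    # IEEE decimal128 parameters: 34 significant digits, exponents -6143..+6144.
    ctx = Context(prec=34, Emin=-6143, Emax=6144, rounding=ROUND_HALF_EVEN)

    print(ctx.add(Decimal("0.1"), Decimal("0.2")))   # 0.3, exact in decimal

    # Plenty of headroom for money: a large amount in cents times a rate.
    # Exact here; half-even rounding only kicks in past 34 digits.
    debt_cents = Decimal("2153000000000000")
    print(ctx.multiply(debt_cents, Decimal("1.0375")))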
I made a few notes the last time I saw this type come up somewhere:
- It has significantly less exponent range than IEEE decimal64 (it effectively throws away almost three bits in order to have the exponent fit in a byte; 2^56 is ~7.2E16, which means that it can represent some, but not all, 17-digit significands; the effective working precision is 16 digits, which actually requires only ~53.15 bits; see the quick check after this list).
- Even if you weren’t going to use those extra bits for exponent, they could be profitably used for other purposes.
- Lack of infinity is a mild annoyance.
- Rounding is biased for add/sub/mul, broken for divide.
- It's not significantly more computationally efficient than the IEEE formats, despite handwaving by the author.
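Quick sanity check of the figures in the first bullet:

    import math

    print(2 ** 56)               # 72057594037927936, i.e. ~7.2e16
    print(math.log2(10 ** 16))   # ~53.15 bits for 16 decimal digits
    print(math.log2(10 ** 17))   # ~56.47 bits, so not every 17-digit significand fits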
----
Edit: I would be remiss not to note that Intel has made available a well-tested complete implementation of IEEE 754 decimal64 and decimal128 under 3-clause BSD: http://www.netlib.org/misc/intel/
I don't know how much effort Intel is allocating to this project; I sent a bug report to the author listed on that page in 2015. The bug was acknowledged and fixed, with a new release promised in "several days", but nothing has appeared yet.
That thread is rather depressing, being mostly just kneejerk reactions and insults. From what I can tell, the actual criticisms are just:
1. "rounding modes and overflow behavior are not addressed", which is incorrect.
2. "Where's exp(), log(), sin(), cos()?", which you can find in dec64_math.c.
3. "There are also 255 representations of almost all representable numbers", which is incorrect.
4. "it will take around FIFTY INSTRUCTIONS TO CHECK IF TWO NUMBERS ARE EQUAL OR NOT" in the slow case, which is probably true but is unlikely to be common given the design of the type. It can undoubtedly be improved with effort.
5. "The exponent bits should be the higher order bits. Otherwise, this type breaks compatibility with existing comparator circuitry", which seems like an odd comment given the type is not normalized.
6. "With both fields being 2's compliment, you're wasting a bit, just to indicate sign", which AFAICT is just false.
7. "there's no fraction support", which I don't understand.
So the only valid criticism I saw in that whole thread is #4, and even that is only a tenth as valid as its author thought it was.
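For #4, here is a rough sketch (my own, not the DEC64 sources) of what equality on unnormalized (coefficient, exponent) pairs involves; the fast path is trivial, and the slow path has to align exponents, which is where the instruction count comes from:

    def dec_equal(c1: int, e1: int, c2: int, e2: int) -> bool:
        """Compare two values of the form coefficient * 10**exponent."""
        # Fast path: identical representations are trivially equal.
        if c1 == c2 and e1 == e2:
            return True
        # Zero is equal only to zero, whatever the exponents say.
        if c1 == 0 or c2 == 0:
            return c1 == c2
        # Slow path: bring the larger-exponent operand down to the smaller
        # exponent and compare coefficients directly.
        if e1 > e2:
            c1, e1, c2, e2 = c2, e2, c1, e1    # ensure e1 <= e2
        return c2 * 10 ** (e2 - e1) == c1

    print(dec_equal(100, 0, 1, 2))   # True: 100 * 10^0 == 1 * 10^2
    print(dec_equal(100, 0, 1, 3))   # False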
Re 1: it is true that there is a single paragraph "discussing" rounding in https://github.com/douglascrockford/DEC64/blob/master/dec64.... -- which is not much. If there is more, I did not find it, and would genuinely appreciate a pointer! In the meantime, though, alternative rounding methods are missing, and those can be quite important for both scientific and business computations. I guess they could be added. But a lot of things "could be done"; the fact is, they haven't been done so far. As it stands, DEC64 is mostly a proposal for people to sit down and work out an actual standard. Perhaps this will happen one day, but so far not many people seem convinced it's worth investing effort in that. (I am also not sure whether the author is interested in feedback and collaboration? I see no indication of that anywhere.)
Re 2: these are at best toy implementations, at worst dangerous (in the sense that they may provide wildly inaccurate results due to convergence problems).
Re 3: agreed. Though there is still a gargantuan number of values which have 2 or more representations, and that makes all kinds of comparisons more complicated. Dealing with that efficiently in SW is difficult, and more so in HW. It might be worth it if the advantages outweighed the cost, but at least I personally don't see it.
Re 4: if it can be "undoubtedly improved with effort", why hasn't it been done in several years? Sure, it may be possible, but I will keep my doubts for the time being :-)
> In the meantime, though, alternative rounding methods are missing, and those can be quite important for both scientific and business computations.
My inexpert understanding is that modifying rounding modes is super niche and poorly supported by most things, so this doesn't strike me as much of a problem. A saner replacement to rounding mode flags would just be to offer different operations for those rare cases they are wanted.
> Dealing with that efficiently in SW is difficult, and more so in HW.
Not really; you never really need to normalize values and not doing so makes basically everything other than comparisons cheaper. I don't see how normalizing around every arithmetic operation would make the hardware any simpler.
> if it can be "undoubtedly improved with effort", why hasn't it been done in several years?
Because this is one guy's project and it hasn't seen much (any?) use.
> So the only valid criticism I saw in that whole thread is #4, and even that is only a tenth as valid as its author thought it was.
Not really here to defend these criticisms, but if #4 is only a tenth as valid as the author thought it was, then does that mean that you know the worst case instruction count to compare two DEC64 numbers is five? Also, we're talking x86 here: how many μops and static instruction bytes are we talking on either side of this? Are all of the branches usually predictable? Can we ever afford to inline something as critical as the comparison operation?
As for the validity of #5, I think this might make more sense in the context of hardware implementations, where a low-order exponent position could increase complexity (requiring adjustments to reuse any of the FPU datapath) AFAIK.
All I meant is that the author of the comment said it would apply in most cases, where even pessimistically it seems it should apply one time in ten. I don't have detailed performance information.
3. Correct, there are not 255 possibilities; it's closer to 50 (I have no paper to work it out on right now). They exist because of the desire to "not require normalization", which is not a good idea.
4. To implement compare quickly you can probably short circuit by first checking that the values are normalized, and normalizing them if not. From a hardware point of view you just added two registers for compare, in addition to the logic for normalizing. For hardware it is probably cheaper to unconditionally normalize.
All math operations will also need to normalize inputs first because otherwise you throw away precision for no reason. For similar reasons all outputs will need to be normalized anyway. Again, given this why would you choose to not just have an implied leading bit?
5. Uh, yeah, what? If you're making dedicated hardware you could randomize the bit ordering; that's what making hardware is all about ;)
> Correct there are not 255 possibilities, it’s closer to 50
1 in 10 values have two representations, 1 in 100 have three, etc., since you can only change the exponent when the (normalized) value is 0 mod 10.
> For hardware it is probably cheaper to unconditionally normalize.
Given normalization is going to be infrequent (most values will either just be integers or have a single representation), and that arithmetic is more common than comparisons, I don't see why this would be. Normalizing a binary value is much easier, so you can't carry floating point intuitions over.
> All math operations will also need to normalize inputs first because otherwise you throw away precision for no reason.
No, you just conform the exponents and handle overflow.
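Roughly what I mean, as a sketch (mine, not DEC64's actual code): add by conforming the exponents, and only touch the coefficient again if it overflows.

    COEF_MAX = 2 ** 55 - 1   # assuming a 56-bit signed coefficient, as in DEC64

    def dec_add(c1: int, e1: int, c2: int, e2: int) -> tuple[int, int]:
        """Add coefficient*10**exponent values without normalizing the inputs."""
        if e1 > e2:
            c1, e1, c2, e2 = c2, e2, c1, e1    # ensure e1 <= e2
        c2 *= 10 ** (e2 - e1)                  # conform exponents
        c, e = c1 + c2, e1
        # Overflow handling: drop a digit at a time (rounding to nearest,
        # ties away from zero here; a real implementation picks its own rule)
        # until the coefficient fits. Exponent-range overflow is ignored here.
        while not -COEF_MAX - 1 <= c <= COEF_MAX:
            q, r = divmod(abs(c), 10)
            if r >= 5:
                q += 1
            c = q if c >= 0 else -q
            e += 1
        return c, e

    print(dec_add(123, -2, 5, 0))   # (623, -2), i.e. 1.23 + 5 == 6.23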
The desperate clinging to a numeric system oriented to the binary internals of a computer, in the human interface to those internals, is insane. Choosing a base-2 numeric system as the default number system in any high-level language is just shortsighted and stupid.
Maybe DEC64 is flawed, but can we stop pretending that we like binary-based default systems for any reason other than the barrier to entry they create for newcomers and the visceral feeling of communing with computer internals?
Binary has a place -- in low-level languages. Like assembler. There's nothing wrong with any language providing a non-decimal system as an optional number type.
The system we have in many high-level languages is a Frankenstein's monster of a binary system that you usually interact with through a decimal representation.
The proposal gets at least one thing right: the languages of the future will have a numeric system default that has less insanity. But it will take the passing of a generation or two of developers.
There's no such thing as a "binary" or "decimal" number.
In the real world, there are natural numbers, integers, rationals and real numbers.
Computer languages are designed with types that mimic this real-world number stack. Low-level binary implementation details don't leak unless you're overflowing or using bit operations.
What you're really complaining about is the fact that rationals aren't a first-class type in any popular language. With that I agree; it's a shame that corners were cut and we got three number types instead of four.
On the contrary, decimal number types (such as the one in C# that I am most familiar with) are addressing another kind of number that occurs very frequently within "the real world" and is inherently in base ten. Namely, numbers with a fixed number of digits past the decimal point and with specific rounding rules. These are incredibly common in finance.
The whole point is to support the kind of accounting that happens in real business, which does not do things with rational numbers. One third of your bill does not charge you one third of a penny, these aren't rational numbers but the operations on them involve rounding.
Dec64 does not reflect what happens in real business any more than IEEE floats do. "There are no reals" applies to currency too, but in a stronger form: in currency there are no fractions either. Translation: in real life, no one can give you a fraction of a cent. So the programmer has to make a decision about what happens to those fractions when you give 1/3 off.
Every newbie programmer tries to avoid thinking about this by using IEEE floats. They discover, usually years later, after some anal auditor has come down on them like a ton of bricks for the hundredth time, that the dropped low-order bits from 1/3 of a cent hit that 1-in-a-billion case and affected a significant digit. And then it finally dawns on them that 1 in a billion isn't really 1 in a billion, because thousands of such calculations get combined into a single profit and loss figure that is out by 2 cents, and chasing that 2 cents for 2 weeks only to discover it was caused by a computer that can't compute really, really pisses off the auditor. That's when you realise that if you aren't thinking about those fractions of a cent as hard as a C programmer focuses on malloc(), they will have gone to whoop whoop in half the code you have written. You will have nightmares about divide signs for the rest of your life. Crockford seems to think Dec64 allows the programmer to avoid thinking about the problem. He is just as wrong as every newbie programmer.
There is only one safe format for currency that accurately reflects this reality, one that forces you to accept that you must think about those fractions of a cent. It is the humble integer. You put in it the lowest denomination: cents in the case of the USA. And then when you write total = subtotal / 3 * 4 and you test it, your error stands out like dog's balls, and you fix it.
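To make that concrete, a small illustration of the integer-cents approach (assumptions and names mine): splitting $100.00 three ways, with the leftover cent assigned explicitly rather than vanishing into dropped bits.

    total_cents = 10_000                        # $100.00 stored as integer cents

    share, remainder = divmod(total_cents, 3)   # 3333 cents each, 1 cent left over
    parts = [share + (1 if i < remainder else 0) for i in range(3)]

    print(parts)                        # [3334, 3333, 3333]
    print(sum(parts) == total_cents)    # True: every fraction of a cent accounted for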
Tangent: in the real world, there are no real numbers. Whether or not there are arbitrary rational numbers is something of an open question.
"Binary number" as used by grandparent really refers to dyadic rationals (https://en.wikipedia.org/wiki/Dyadic_rational), which are a perfectly well-defined dense subset of the rationals. Similarly, "decimal number" is really "terminating decimal expansion" (or whatever you want to call the decimal analogue of dyadic rational), which is again a well-defined dense subset of the rationals. This is a perfectly valid mathematical distinction; the numbers that people work with day-to-day are much more frequently the latter.
Hrmmm. I thought something was fishy with the statement: "Tangent: in the real world, there are no real numbers". The reals are defined as the set of all rational and irrational numbers on the number line. See https://en.wikipedia.org/wiki/Real_number for a reasonably pedagogical discussion.
There are reals, they do exist. In the "real" world (as poorly defined as that is).
The issue is, fundamentally, that what programming languages call "real numbers" are not real numbers. They are an approximation to a subset of the reals. This approximation has holes, and the implementations work to some degree to define regions of applicability and regions of inapplicability. Usually people get hung up on, or caught in, the various traps (inadvertently) added to the specs for "Reals".
It's generally better to say "floating point numbers" than "Reals" in CS, simply because floating point numbers are the subset of the reals that we are actually accustomed to using.
I definitely agree with the comment on rationals. I am a fan of Perl 6, Julia and other languages' ability to use rationals as first-class number types.
Sadly, as with other good ideas that require people alter their code/libraries, I fear this will not catch on due to implicit momentum of existing systems.
In a very reasonable sense, the real numbers do not exist in the real world. Almost all real numbers are non-computable, so under apparently reasonable assumptions about what experiments you can conduct, there is no experiment you can do that will produce a measurement with the value of most real numbers.
From this, it's fairly non-controversial to say that only the computable reals exist; these are a tiny (measure-zero) subset of the reals.
If you go further and assume a fundamentally discrete universe (much more controversial), then all you can really measure are integers.
For Floats in particular, binary implementation details leak all the time due to rounding. A number like 0.0625 can be represented in binary exactly with only 4 bits, but a number like 0.1 can only be represented approximately even when using 64 bits.
This could be solved with a rational-style data type, but I consider the fact that real-style data types don't capture that to also be the implementation leaking.
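You can watch the leak happen: converting a float to Decimal or Fraction (standard library, nothing exotic) shows the exact binary value that was actually stored.

    from decimal import Decimal
    from fractions import Fraction

    print(Decimal(0.0625))  # 0.0625 exactly (1/16 is a dyadic rational)
    print(Decimal(0.1))     # 0.1000000000000000055511151231257827021181583404541015625

    print(Fraction(0.0625)) # 1/16
    print(Fraction(0.1))    # 3602879701896397/36028797018963968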
Rationals are not generally useful for scientific computation because eventually you need to start invoking the relatively costly Euclidean algorithm to do basic operations, and the compute time is no longer constant. Also, lots of operations take square roots anyway, so you wind up in trouble that way.
Yeah, you're right, there's a good historic reason for this current mess. Computing originated in scientific computational math problems, so the algorithms and data types are biased for that.
Javascript programmers don't need a number type designed for accurately computing trig and square roots and representing pi, but they got one anyways.
XLISP (and therefore XLISP-STAT) had rational numbers[1] and came out in 1988, R2RS[2] introduced rational numbers in 1985. The Lisp Machines Lisps had rational numbers. There were a lot of people who had rational numbers in the 1980s and the early 1990s.
Yep, it seems people need to understand the details of base translation before writing a comment :p
Anyway, IIRC you can round-trip 15 decimal digits correctly with an IEEE double. Don't know if it's floor(15.955) or something else, but it's close enough for me to consider DEC64 quite useless compared to existing, quite well-designed and widely used implementation of FP.
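The round-trip claim is easy to demonstrate (illustration only): 15 significant decimal digits survive decimal -> double -> decimal, while a double needs up to 17 digits to survive the trip in the other direction.

    s = "0.123456789012345"                # 15 significant digits
    assert format(float(s), ".15g") == s   # recovered exactly

    x = 0.1
    assert float(format(x, ".17g")) == x   # 17 digits always round-trip a double
    print(format(x, ".17g"))               # 0.10000000000000001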
How does this relate to https://en.wikipedia.org/wiki/IEEE_754 (2008 edition) which added decimal floating point in a couple of sizes including 64 bit? It's strange to publish something in the same space without any reference to that. It appears to be incompatible (IEEE 754 decimal64 has 53 bits of significand and 11 bits of exponent; this thing has 56 bits and 8). How is that helpful?! Libraries and hardware supporting IEEE 754 exist. What am I missing?
Edit: it does say "A later revision of IEEE 754 attempted to remedy this, but the formats it recommended were so inefficient that it has not found much acceptance." at the end. Hmm.
This has got to be very slow, both in hardware and software implementations, compared to IEEE packed decimal float.
Addition and subtraction on anything other than matched exponents is going to need rounds of multiplication-by-10, however you implement it. Using IEEE-754 packed decimal, you only need a handful of gates per digit to unpack into BCD.
> DEC64 is intended to be the only number type in the next generation of application programming languages.
Intended by whom? A lone voice (however correct), a standards body, or an industry consortium?
This article should really carry some authorship information.
A much better discussion and design for decimal arithmetic on binary computers was done by Mike Cowlishaw (http://speleotrove.com/decimal/). He shifted the entire computation chain into decimal (representing decimal digits in a packed form as well).
Ugh, having an explicit bit before the radix point was a mistake Intel made in x87. It introduces a pile of horror, as there end up being multiple representations of the vast majority of numbers, which means you have to normalize prior to any comparison operation, or accept that your comparisons may be wrong.
The explicit leading bit directly halves the usable significand space (which is how you get room for multiple representations).
If you want insight into how awful this is, x87 has pseudo-infinities, pseudo-NaNs, unnormal values, and pseudo-denormals.
The solution Intel eventually took was to recognize that the leading bit was useless, but require it to be set as though it were implicit, and treat any case where that is wrong as invalid.
So yes you do want a format that requires normalization, because the alternative is one that requires normalization anyway, but is also insane for any kind of comparison, and wastes precision needlessly.
Nice. It is unbelievable that none of today's widely used programming languages has standard support for handling money correctly. I don't want to know how many programs use (binary) floating point to do it.
(Disclaimer: I don't work with any financial figures and I have not checked whether the dec64 proposal is a sound one.)
Languages* (or “platforms”) have decimal types. Or you can use integers and agree on a normalization (such as cents or 1/1000 of a cent). Because languages* also have integers.
In Java, the java.math.BigDecimal is not a primitive data type. It also has an awkward api due to lack of operator overloading (thanks to BigDecimal not being a primitive data type it seems).
Well that's true, and if you're going to argue that the lack of equally or more concise syntactic support is an omission, I agree. I think 1.23 should denote a decimal floating point constant, and a special suffix should be required if you want to use binary.
But I think it's important to make sure programmers don't end up with the idea that they might as well store money amounts in binary floating point because the language they are using doesn't support decimal.
> In Java, the java.math.BigDecimal is not a primitive data type. It also has an awkward api due to lack of operator overloading (thanks to BigDecimal not being a primitive data type it seems).
If you're on the JVM, Groovy is nicer in that regard. It uses by default BigDecimal and BigInteger for literal numeric types, and has operator overloading: http://groovy-lang.org/syntax.html
> It uses by default BigDecimal and BigInteger for literal numeric types
Apache Groovy uses by default BigDecimal for literal decimal types, but only uses BigInteger for literal integral types when it would overflow the largest available primitive type for that literal. You can force a BigInteger by using the suffix `g` on the integral literal. Perhaps misunderstanding this is the source of some hard-to-find bugs in some of your Groovy code.
Of course more digits than 2 exist. In Germany petrol/gas has always been priced with 3 decimals. And in B2B it's even more common. I'm talking about the 98% of consumer business where the precision is full cents for each transaction.
In general you need to know when to round and to how many digits. But my complaint was that languages have a natural API for binary floating point, which is useful for scientific number crunching, but rarely anything comparable for commercial calculations.
I have not tried to design a natural API for fixed (but parameterizable) decimal precision and the implicit rounding required. But I would be surprised if it's impossible to come up with something less verbose (no explicit constructors everywhere) and less error-prone (no forgetting to call rounding).
Also, I'm not trying to endorse the Python API. I don't even use it. I just pointed out that its API is easier, since the operators work with its decimal data type, as opposed to the Java version.
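For reference, the sort of thing I mean (a sketch using the standard decimal module; the variable names are mine): operators work on the decimal type directly, and rounding to cents is an explicit, named step.

    from decimal import Decimal, ROUND_HALF_UP

    CENT = Decimal("0.01")

    price = Decimal("19.99")
    vat_rate = Decimal("0.19")

    net = price * 3                                                 # 59.97
    vat = (net * vat_rate).quantize(CENT, rounding=ROUND_HALF_UP)   # 11.39
    print(net, vat, net + vat)                                      # 59.97 11.39 71.36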
The databases have number formats that represent numerical values for money correctly. And they’re probably the most widely used programming systems for dealing with money.
Correct. But databases are not programming languages.
I have never done any SW development in the commercial domain. Do you say in banking system the amounts never make it to the host language, they are always calculated in the database?
A cashier system most likely doesn't even have a database. At least not used for every item.
But my main question was: The system that handles my bank account is it written in SQL only (or mainly)? I really don't know, because I have never seen the source code of such a system.
For the average cashier / checkout system I would doubt that, although again I have never seen the source.
The problem is that code is written by many people, so you have to agree on a way to divide (for example). Then you have to enforce that convention, and current standard programming languages (Python, Java, C, ...) don't help you there.
We need a decimal format because the people who read the code are sometimes not coders who know about the various tricks for packing decimals into a uint64.
So, u64 is perfectly doable, sure, but it doesn't scale with the various types of people who write/read the code.
(Same thing for floats: you can do money computations with them, but that will only work if the people understand the trade-offs of the representation; and believe me, they're not legion.)
That works fine for addition and subtraction. When you divide (or do percent calculations), integer cents are no longer enough; you need to deal with rounding manually, or with some custom class, for every calculation.
Yeah, but that's rounding error, which isn't limited to just money. There's no universal precision/representation across all the currencies, and currencies change all the time, so it would be wildly inappropriate to bake into a programming (language?) standard.
Also, what's with the aversion to traits/types? This is the whole point of having operators/types.
Correct. Precision must be parameterized for every variable.
2 is the most commonly used value (for money), but not the only reasonable one. And rounding needs to be done at certain steps but not at others. Typically internal calculations are expected to be done with maximum precision, and every time you have an item visible to the customer you round. Often to full cents, but there are cases where you show a fixed number of decimal cents even to the customer. I doubt that cases where you show maximum precision to a customer are common.
IEEE 754-2008 defines decimal floating point with sound specifications for rounding and arithmetic.
It's provided in most higher-level environments these days (Java, .NET - where I think it might even be a VM primitive?).
I still question the value of it but it does allow “perfect” representation of values common in decimal systems. But by the same token it can’t represent everything in all other bases. Shrug.
Yes, I have heard COBOL has support for fix point decimals. But I have never used COBOL and I guess few people would encourage to start new software projects in COBOL, even if the use case is commercial/financial.
The most important part of IEEE decimal arithmetic is the (weakened, made optional) mandate that decimal floating point perform exact arithmetic: deterministic cross-platform results on the same inputs. This is important not just for financial accounting, which motivated this departure from binary floating point (where error is tolerated), but for any distributed system that needs reliable computation in the presence of heterogeneous hardware or compiler optimizations. It is embarrassing that we live with this state of affairs in 2018, with no way to do deterministic/exact real number arithmetic in programs without emulating the FPU in software.
IEEE binary floating point is deterministic. There is no such thing as tolerated errors in compliant implementations.
Non-compliant implementations are widespread for performance reasons, but that is a completely different story.
Also, I'm not sure about GPUs, but FPUs are typically compliant in HW; it is typically the compilers that have faster approximate modes, and I think for mainstream ones only when you explicitly enable such optimizations, which are disabled by default.
I imagine this would be useful for applications which do a lot of conversion from/to ieee754 to/from text... Programming languages and spreadsheet software come to mind.
Has anyone ported a non-trivial application to use DEC64, and compared the results?
> In modern systems, this sort of memory saving is pointless. By giving programmers a choice of number types, programmers are required to waste their time making choices that don’t matter. Even worse, making a bad choice can lead to a loss of accuracy or destructive bugs. This is a bad practice that is very deeply ingrained.
Boundless arrogance, minimal information.
If you want a decimal floating point type, use the IEEE formats, there are high quality implementations everywhere, and if you're using C++ you can probably switch without much more than a string replacement and a couple header includes. Douglas Crockford can be forgiven for having apparently no concept of computing at the limits of the machine (in terms of cache latency, memory size, and CPU throughput), but if you get comfortable with this level of ignorance and aren't famous, you will be highly replaceable.
Added:
Moore's law is effectively dead. This means that for a given CPU microarchitecture family and your monetary or spatial budget for memory, you have limited resources to achieve your goals. If you write an inefficient program with no regard for performance at the numerics level, your program will no longer automatically get much faster and cheaper to run, you will instead be contributing to the pile of performance debt your successors will be cursing and shedding tears over.
Furthermore, with his "loss of accuracy" comment, he seems to imply that his 64 bit decimal types are even remotely large enough that common users will not lose accuracy (by which I suppose he really means precision).
I don't agree with everything in Jens Nockert's "silly review", but he's right about a lot of things: http://blog.aventine.se/2014/03/09/a-silly-review-of-dec64.h...