9999999999999999.0 – 9999999999999998.0

twtw · on Jan 5, 2019

I don't understand all the crap that IEEE 754 gets. I appreciate that it may be surprising that 0.1 + 0.2 != 0.3 at first, or that many people are not educated about floating point, but I don't understand the people who "understand" floating point and continue to criticize it for the 0.1 + 0.2 "problem."

The fact is that IEEE 754 is an exceptionally good way to approximate the reals in computers with a minimum number of problems or surprises. People who don't appreciate this should try to do math in fixed point to gain some insight into how little you have to think about doing math in floating point.

This isn't to say there aren't issues with IEEE 754 - of course there are. Catastrophic cancellation and friends are not fun, and there are some criticisms to be made with how FP exceptions are usually exposed, but these are pretty small problems considering the problem is to fit the reals into 64/32/16 bits and have fast math.

svat · on Jan 5, 2019

> considering the problem is to fit the reals into 64/32/16 bits and have fast math

Floating-point numbers (and IEEE-754 in particular) are a good solution to this problem, but is it the right problem?

I think the "minimum of surprises" part isn't true. Many programmers develop incorrect mental models when starting to program, and get no feedback to correct them until much later (when they get surprised).

It is true that for the problem you mentioned, IEEE 754 is a good tradeoff (though Gustafson has some interesting ideas with “unums”: https://web.stanford.edu/class/ee380/Abstracts/170201-slides... / http://johngustafson.net/unums.html / https://en.wikipedia.org/w/index.php?title=Unum_(number_form... ). But many programmers do not realize how they are approximating, and the "fixed number of bits" may not be a strict requirement in many cases. (For example, languages that have arbitrary precision integers by default don't seem to suffer for it overall, relative to those that have 32-bit or 64-bit integers.)

Even without moving away from the IEEE-754 standard, there are ways languages could be designed to minimize surprises. A couple of crazy ideas: Imagine if typing the literal 0.1 into a program gave an error or warning saying it cannot be represented exactly and has been approximated to 0.100000000000000005551, and one had to type "~0.1" or "nearest(0.1)" or add something at the top of the program to suppress such errors/warnings. At a very slight cost, one gives more feedback to the user to either fix their mental model or switch to a more appropriate type for their application. Similarly if the default print/to-string on a float showed ranges (e.g. printing the single-precision float corresponding to 0.1, namely 0.100000001490116119385, would show "between 0.09999999776482582 and 0.10000000521540642" or whatever) and one had to do an extra step or add something to the top of the program to get the shortest approximation ("0.1").

tasty_freeze · on Jan 6, 2019

When I went to university in 1982, one of the lower level courses was called "Numerical Methods". It went over all of the issues related to precision, stability, as well as a host of common numerical integration and approximation methods.

I'm just a sample size of one, but isn't this kind of class a requirement for CS majors?

analog31 · on Jan 6, 2019

I majored in math and physics. In math, our version of the course was called "numerical analysis," and we covered those things. In physics, the behavior of floating point math was covered in one of our lab courses.

I don't know what's taught to CS majors, and as others have pointed out, programmers don't necessarily study CS.

I believe the issue is just that the pitfalls of floating point are not apparent without a certain level of math education.

But there may be one more pitfall, which is that those of us using FP regularly, also happen to be "scientific" or "exploratory" programmers who haven't learned a lot of formal software engineering discipline (including me). So we understand the math but might be more prone to making mistakes with it.

I do kind of like the idea of flagging any number that is potentially exposed to a FP issue. We all make mistakes. Displaying all floats in exponential notation by default would be a good enough warning to the wary. We only display them as decimals for readability.

wwweston · on Jan 6, 2019

It wasn't required at my school a decade or two later. Floating point representations were touched on somewhat in the intro to computer architecture class where we wrote assembly for a MIPS simulator. I got an extra dose because I switched my major to Math, where there was an entire course on Numerical Methods (though it wasn't required by either the Math or CS depts).

I'm a little amazed in retrospect that the Math department was where one had to go to get a class in the mechanical details of computing when as a subject it's usually considered (and often in practice is) notably up the ladder of abstraction. And this was a weird outlier as the single most practical upper division class offered by the Math department at the time...

kccqzy · on Jan 6, 2019

In my case, the CS class only covered the representation of floating point numbers (a sign bit, exponent bits, fraction bits, issues like bias etc) but not things like numerical approximation or integration methods. Those were in a separate class under the math department. And I think that's fair; after all those are really about scientific computing, not so much about computer science.

sampo · on Jan 6, 2019

> isn't this kind of class a requirement for CS majors?

I think most CS departments dropped numerical analysis from their requirements by the end of 1980s. Nowadays you are more likely to find such a course in some dusty corner of math or engineering departments.

vlovich123 · on Jan 6, 2019

My university moved it from a 100 series course to a 200 series course but it's still being taught to ECE undergrads.

The problem is more that we don't have the tools to track and understand how the errors are altered as we do the math (e.g. how would you even begin to try representing catastrophic cancellation at compile-time) and doing the numerical error analysis on the abstract math itself is hard once the math gets complex let alone trying to figure it out after you've optimized the code for performance & tweaked the algorithms for real-world data/discrete space.

Now perhaps it could be possible to do it at runtime in some way but I suspect the performance of that is prohibitive to the point where arbitrary precision math or decimal numbers is going to be a better solution.

dzhiurgis · on Jan 6, 2019

CS isn’t really required for majority of developers.

svat · on Jan 6, 2019

From a 1998 interview with William Kahan, the “father” of IEEE-754 floating point (emphasis mine):

> My reasoning was based on the requirements of a mass market: A lot of code involving a little floating-point will be written by many people who have never attended my (nor anyone else's) numerical analysis classes. We had to enhance the likelihood that their programs would get correct results. At the same time we had to ensure that people who really are expert in floating-point could write portable software and prove that it worked, since so many of us would have to rely upon it. There were a lot of almost conflicting requirements on the way to a balanced design.

I imagine that the number of people writing code without having taken a numerical methods class has only increased since the late 1970s being talked about, or even in the two decades since that interview.

Aeolun · on Jan 6, 2019

Not in many of the CS related majors. Though I guess it is for hard CS.

jimmaswell · on Jan 6, 2019

My CS course at a community college went over the limitations of floats in detail, in the CS class.

copperx · on Jan 6, 2019

How detailed, if I may ask?

jimmaswell · on Jan 7, 2019

I remember we learned how the representation worked bit-for-bit, and how it being stored in binary meant it couldn't perfectly represent everything in decimal. 1.01b meaning 12^0 + 02^-1 + 1*2^-2 for example.

sampo · on Jan 5, 2019

There is a proposal about "unum" or "posit" number system. They give more precision for small numbers (small meaning smaller than about 10^70, for 64bit numbers), less precision for huge numbers, and an overall larger range, than the floating point system.

https://en.wikipedia.org/wiki/Unum_(number_format)

http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf

(They are definitely not any easier to understand than the floating point system, though.)

CodeArtisan · on Jan 6, 2019

there is also DEC64

http://dec64.com/

acqq · on Jan 6, 2019

> Even without moving away from the IEEE-754 standard, there are ways languages could be designed to minimize surprises.

Floating point in base 10 is already in the standard since 2008:

https://en.m.wikipedia.org/wiki/Decimal_floating_point

The standard is not to blame, the lack of demand for that feature is.

Most of the potential users don’t know that they can demand from their software and hardware suppliers that feature. Using it there would be less “surprises.”

mark-r · on Jan 6, 2019

Showing that the result of 9999999999999999.0 – 9999999999999998.0 is a number between 1.9999999999999998 and 2.0000000000000124 will not solve the problem. IEEE floating point doesn't keep track of loss of precision.

svat · on Jan 6, 2019

To be clear, my point was that if programmers always saw floating-point numbers printed out as a range, from their beginning programming days, more of them would be likely to understand floating-point numbers better — or at least avoid the (impossible) idea that they map 1:1 with the real numbers. Having understood floating-point numbers, they would know what to expect from 9999999999999999.0 – 9999999999999998.0 with 64-bit floating-point. So though seeing a range here won't help magically restore precision that's been lost, having seen ranges earlier would help, before trying to carrying out this calculation.

Nevertheless, you have a good point and what I take away from this is that showing a range for the end result of a computation (instead for a given number directly entered by the user/programmer) can be misleading if the result of the exact computation wouldn't actually have been in that range.

crankylinuxuser · on Jan 5, 2019

When NASA can't even get it right, because of "surprises", there's no chance in hell I'm blaming us mere mortal programmers... or even 10x wizards. (0)

It's time to look at other ways to depict fractional parts of numbers in a computer. I know that one can express any rational number as a integer fraction. And our computers are incapable of expressing a irrational number exactly - it does so to a certain precision... In other words, every number a computer expresses is a rational number.

The exception is if the computer could express irrational numbers as symbolics, then we could work with the symbolic instead. And then as a last pass, the symbolic could convert to a imprecise rational depiction, or express as its native type.

(0) https://itsfoss.com/a-floating-point-error-that-caused-a-dam...

twtw · on Jan 6, 2019

That failure had nothing to do with floating point specifically. The same failure could have occurred when converting any data type with larger range (including rationals or arbitrary precision numbers) to 16 bit integer.

I would say the fact that floats can have data larger than INT16_MAX is hardly a surprise. That was just a bug, not some great and surprising drawback of floating point.

saagarjha · on Jan 5, 2019

Rationals get unwieldy quickly, even with the simplest of arithmetic. A couple of additions is enough to get a large denominator.

bjourne · on Jan 6, 2019

That has not been my experience using a language with builtin support for rationals. The rational is simplified after each operation so it never grows unwieldy large. It is slower than floats, but imo vastly superior for most use cases.

johndough · on Jan 6, 2019

> The rational is simplified after each operation so it never grows unwieldy large.

We recently had an exponential memory growth bug because rationals can not always be simplified, for example if you start with (2/3) and repeatedly square it. Fortunately, this was not in user-facing code, so there was no chance of a denial-of-service attack, but that's definitely something to watch out for with rationals.

Retra · on Jan 6, 2019

"Most" use cases? You mean the use cases that don't involve heavy math, like graphics, physics, and statistics?

stephencanon · on Jan 7, 2019

Even simplifying after every operation, in the typical case exact rationals grow exponentially in the number of terms in the computation. This means that either: (a) you cannot use them for any non-trivial computation. (b) you have to round them, in which case they are strictly worse than floating-point numbers because they have redundant representations and a very non-uniform distribution.

trishmapow2 · on Jan 5, 2019

Ariane 5 is ESA not NASA

empath75 · on Jan 5, 2019

You can work directly in reals.

http://hackage.haskell.org/package/exact-real

SideQuark · on Jan 6, 2019

That package approximates reals. It fails to do equality on infinite precision reals, for example.

lenticular · on Jan 5, 2019

Yeah, the limitations of FP are well-known to anyone who does much numerical work.

Floating point numbers are the optimal minimum message length method of representing reals with an improper Jeffery's prior distribution. A Jeffery's prior is a prior that is invariant under reparameterization, which is a mandatory property for approximating the reals.

In this case, it is where Prob(log(|x|)) is proportional to a constant.

Thus, we aren't going to ever do better than floats if we are programming on physical computers that exist in this universe. There is a reason why all numerical code uses them. Best to learn their limitations if you are going to use them, otherwise use arbitrary precision.

3pt14159 · on Jan 6, 2019

Outside of the academic world decimals are almost always a better solution if performance isn't critical.

Most logic is multiplicative. For example, apply a 30% tax on a dollar quantity and display both subtotal and grand total. With floats, there are inequalities. With decimal there usually aren't unless you're dividing, but we already have to deal with divide errors in base ten, and it is much more likely to need to represent 0.30 than 1/3 and because decimal shares a base with binary (since it's factors are 5 and 2) binary doesn't really get us anything but headaches anyway. It's true that there are still gotchas, but they happen less often and usually don't end up looking stupid and weird for no reason. That 0.1 + 0.2 = 0.300000000000001 is dumb and we all know it.

jacobolus · on Jan 6, 2019

> Outside of the academic world decimals are almost always a better solution

Is “academic world” now a shorthand for “all numerical computing”?

Decimals basically never make sense, except possibly in some calculations related to money. Those make up a minuscule part of modern computer use.

Maybe decimals are also better for homework assignments for schoolchildren?

The type of applications where decimals are useful are by and large insensitive to compute speed and need no special hardware support. You can easily write your code for decimal arithmetic on top of integer arithmetic hardware.

Those of us who need binary floating point for graphics, audio, games, engineering, science, .... won’t stop you.

dotancohen · on Jan 6, 2019

Even with money I use integers. Instead of dollars (or local currency), I store values internally as pennies (or local equivalent 1/100 of main currency). Sometimes when working with interest I'll need to work with floats, and some databases I have values stored as DECIMAL(8,2) instead of INT, but for the most part I've saved quite a few headaches by keeping my values in INTs.

loa-in-backup · on Jan 8, 2019

There are currencies that have the lowest value coin as 1/20 of main coin.

CamouflagedKiwi · on Jan 6, 2019

That seems okay until you need to track sub-penny accuracy somewhere, then you have a big problem.

Lots of applications don't need that, but a surprising amount do, so it's not a global solution.

hakfoo · on Jan 6, 2019

But there's still a minimum significant value which can be defined from the problem space. Do DECIMAL(16,8) or whatever.

In some situations, there may even be industry or legal standards as to what can be considered rounding noise.

toyg · on Jan 6, 2019

No need for snark. You might be correct that, by “computational volume”, handling currency values might be considered a niche; but even something like World of Warcraft has to handle money at some point.

nearbuy · on Jan 6, 2019

That doesn't seem dumb at all. Making BCD the default would mean floats use 17% more space for the same precision. That might seem like a small loss, but it's also for an incredibly small gain. Programmers would still have to be aware that testing two decimals for exact equality is dangerous. I don't see the problem with 0.1 + 0.2 = 0.300000000000001 if you aren't testing floats for direct equality.

otabdeveloper2 · on Jan 6, 2019

> it is much more likely to need to represent 0.30 than 1/3

Citation needed, because this isn't really true.

Even if one concedes your (unspoken) idea that only financial transactions aren't "academic" (which also isn't true), in the real world financial transactions will typically include currency conversions, and those will have all sorts of weird non-decimal factors.

SideQuark · on Jan 6, 2019

You cannot do anything in finance with such reasoning. Take something simple, say a mortgage at 5% compounded 12 times a year. To compute payments using some fixed length representation or decimal is going to lead to more error than to use the usual floating point. This rabbit hole would continue for many applications.

Floating point makes them all much easier to do well.

jes5199 · on Jan 6, 2019

Have you worked on finance software? I have - we always used ints for everything, so we could avoid rounding suprises

bigiain · on Jan 6, 2019

I did a web project in the gambling space ~10 years back - we were legally required to perform all calculations as integers in ten thousandths of a cent (or millionths of a dollar).

We chose to _not_ do _any_ calculations client side in Javascript...

dotancohen · on Jan 6, 2019

Which regulation is that? I've worked on financial applications, but not gambling, and I've not heard of this regulation. I should probably know about it!

bigiain · on Jan 6, 2019

Australian, or possibly Tasmanian state regs. This would have been around 2011 or so (The Samsung Galaxy S2 was the "top of the range Android phone" at the time...)

ajanuary · on Jan 6, 2019

I worked for a gambling company in the UK for a bit. They did all their maths in pennies, not thousandths of a penny.

SideQuark · on Jan 7, 2019

Yes, I have. I also have a math PhD, have written scads of scientific and numerical software, and have written articles on floating-point math. So now that we have enough of our personal accolades out of the way, let's focus on facts regarding calculations:

How did you use ints to compute compounded interest on loans? I asked that above, and you avoided it. I ask again.

For example, suppose you have a mortgage where you lent $100K at 5% annual, compounded 12 times a year, for 30 years, and you need basic values regarding this loan.

Often in such calculations you need to compute 100K*(1+0.05/12)^360. How do you do that with integers? Naively you need (1+(1/20)/12)^360, which as a reduced fraction each of the numerator and denominator have over 2800 binary digits. Do you really do this with integers?

Now put that in a mortgage trading or pricing system where it needs to do millions/billions of those per second.

Doing this as double gives enough precision to make the difference between computed and infinitely precise negligible (approx. 10^-17 error).

It's easy to make examples where doing incremental calculations, rounding to pennies and storing, results in long term error. In these cases I don't see how do to it with integers without massive overhead.

And this is a trivial, common example. Doing stuff like hedge fund stuff, or anything using numerical integration to make models for pricing, would be astounding hardly to do with integer only math.

What finance software did you write? A simple ledger works fine as integers. Anything more complex will hit performance and scaling issues soon after the basics.

DougBTX · on Jan 6, 2019

I’ve not done any real finance programming, so is this a reasonable explanation?

Currency is stored as a count of cents (millicents if being fancy). Therefore the two main features of floats are not useful:

- Support for very small numbers is not needed. Floats dedicate approx half their range to numbers between -1 and +1, this is wasted when counting whole cents.

- Support for very large numbers at the expense of precision is actively bad, as the precision must always be down to individual cents.

So the useful range of floats is much reduced when using floats for counting, approx 54 bits out of a 64 bit float are used. Instead ints (“counting numbers”) are much better for counting cents than floats (which approximate the continuous real numbers in a finite number of bits).

shakna · on Jan 6, 2019

IBM Decimals are the standard for finance applications. Integers, not floating point.

Floats don't have the precision required.

erik_seaberg · on Jan 5, 2019

There's no reason for every step of a computation to be confined to the same very small message length. And the necessary error analysis should be built into the language, preferably in the same "advanced users only, here be dragons" package as the imprecise types themselves.

TFortunato · on Jan 5, 2019

So interestingly, processor makers are on the same page with you re: computations, and lots of processors can internally do computations in "extended precision", e.g. 80-bit floats, only converting to/from 64-bit doubles at the start and end of the computation.

https://en.m.wikipedia.org/wiki/Extended_precision

garmaine · on Jan 5, 2019

That hasn’t been the case for over a decade.

ealloc · on Jan 6, 2019

It may be coming back though.

IBM's new and rising supercomputer architecture, POWER9, supports hardware IEEE binary128 floats (quad precision). Their press claims the current fastest supercomputer in the world uses POWER9.

The ppc64 architecture (still produced by IBM) supports "double-double" precision for the long-double type, which is a bit hacky and software-defined, but has 106 bit mantissa.

And ARM's aarch64 architecture supports IEEE binary128 long-doubles as well, though it is implemented in software now (by compiler). Maybe they plan a hardware implementation in the future?

TFortunato · on Jan 6, 2019

How do you mean? The x86-64 instruction set / abi specifies long doubles as 80-bits, and still supports them ...

simonbyrne · on Jan 6, 2019

Essentially there are two different sets of floating point instructions on x86 and x86-64: - the x87 instructions, which descend from the original 8087 coprocessor (and have 80-bit registers), and - the SSE instructions, which descend from the Pentium MMX feature set, are faster, support SIMD operations, and can be fully pipelined.

The x87 instructions are basically for legacy compatibility, or if you manually use long doubles on some platforms.

The idea behind extended precision registers was good in theory, but ultimately caused too much hassle in practice.

TFortunato · on Jan 6, 2019

Yep - and there are absolutely some cases where you do want to manually use it which is why the x86_64 ABI on SysV (used by Linux and OS X, still specifies the long double type as 80-bits, and why GCC and Clang will still emit these instructions when long doubles are used!

(Sorry, this is more for the folks who aren't familiar with this, since it seems like you are familiar, but I didn't want it to seem like this isn't widely supported when they read "legacy" or "some platforms")

Here is a good toy examples that runs into the same numbers shown in the parent, showing the two different instruction types, and that long double can give you the correct answer, while still being run in hardware, vs. going all the way to float128s which are currently emulated in software!

Code w/ Assembly: https://godbolt.org/z/W3ZmqJ Output: https://onlinegdb.com/Sy_I3Q1ME

simonbyrne · on Jan 6, 2019

I agree that extended precision can be very useful, though I think the failing was on the software side: basically languages and compilers didn't provide useful constructs to control things like register spilling (which caused the truncation of the extended precision).

The current hardware trends seem to be providing instructions for compensated arithmetic, like FMA and "2sum" operations. I think this is ultimately a better solution, and will make it possible to give finer control of accuracy (though there will still be challenges on the software/language side of how to make use of them).

conistonwater · on Jan 6, 2019

All the floating-point arithmetic that is natively supported these days is the 32- and 64-bit kind in SSE instruction sets and its extensions. The fact that something is "available" doesn't mean much in terms of actual support. As far as I know, long double means 128-bit floats in modern clang/gcc, and they are done in software.

TFortunato · on Jan 6, 2019

Long doubles are typically 80-bit x87 "extended precision" doubles as far as I've seen. (Except on windows :-P ). It's part of the reason why LLVM has the 80 bit float type.

https://en.cppreference.com/w/cpp/language/types

https://software.intel.com/en-us/articles/size-of-long-integ...

They are definitely still supported in modern Intel processors. That said, there can be some confusion because they end up being padded to 16 bytes for alignment reasons, so take 128 bits of memory, but they are still only 10 byte types.

They are a distinct type from the "quad" precision float128 type, which is software emulated as you mentioned.

All that being said, you are right that most of the time float math ends up in SSE style instructions, but as soon as you add long doubles to the mix, the compiler will emit x87 style float instructions to gain the extra precision.

Example: https://godbolt.org/z/PMZVdb

leiroigh · on Jan 6, 2019

And nobody uses this terrible mis-feature in practice, everything runs via 64 bit xmm registers.

Rightly so, because programmers want their optimizing compiler to decide when to put a variable on the stack and when to elide a store/load cycle by keeping it in a register. With 80 bit precision, this makes a semantic difference and you end up in volatile hell.

TFortunato · on Jan 6, 2019

Yeah I agree that everything typically runs in XMM registers and that's what people want. I'm not sure what about the availability of extended precision makes it s a misfeature? For some cases it IS what you want, and it's nice to be able to opt in to using it..

EDIT: If I had some application where I needed the extended range, like maybe I was going to run into the exact numbers above, I'd appreciate the ability to opt-in to this. Totally agree I wouldn't want the compiler to surprise me with it, but also not terrible, or useless.

Code w/ Assembly: https://godbolt.org/z/W3ZmqJ Output: https://onlinegdb.com/Sy_I3Q1ME

BeeOnRope · on Jan 6, 2019

To be fair, the problem you describe isn't inherent to 80-bit floating point values. If you use 80-bit values in your ABI or language definition, it won't occur - it occurs when you try to user a wider type to implement a narrower type, e.g., implementing 64-bit floats (as specified in the ABI or language) with 80-bit operations.

In that case, the extra precision is present and "carried across" operation when registers or the dedicated floating point stack is used, but is discarded when values are stored to a narrower 64-bit location. So the problem is one really of mismatch between the language/ABI size and the supported hardware size. Of course, 80 bits isn't a popular floating point size any more in modern languages, so this happens a lot.

garmaine · on Jan 7, 2019

The x87 ISA does, yes, and they are supported for binary compatibility reasons. However the actual x87 registers are shadowed by the vector registers so you can only use one. Any modern vectorizing compiler uses the vector instructions for FPU arithmetic, even when scalar, with a max precision of 64-bit.

gpderetta · on Jan 8, 2019

>The x87 ISA does, yes, and they are supported for binary compatibility reasons.

Well x86-64 is not binary compatible with x86 so that's not the reason. It is mostly for software relying on either the rounding quirks or the extended 80 bit precision I guess.

> However the actual x87 registers are shadowed by the vector registers so you can only use one

You are confusing with the legacy MMX registers which are deader than the x87 for stack. XMM registers do not shadow the for stack.

loeg · on Jan 6, 2019

The x86-64 instruction set does not specify any particular ABI and definitely does not specify the precision of a C language type.

TFortunato · on Jan 6, 2019

Fair point, I was a little fast and loose with my words there, which is definitely dangerous when it comes to things like C language / ABI standards! :-P

lenticular · on Jan 6, 2019

A small message length means small memory, which is important in physical computers. It is information-theoretic optimal. This is a well-defined term.

erpellan · on Jan 6, 2019

The limitations should be well known. One of the first things I check when joining a financial software project is how the system represents money. I’m rarely surprised.

(It’s inevitably floats or doubles)

ayidnelm · on Jan 6, 2019

> Floating point numbers are the optimal minimum message length method of representing reals with an improper Jeffery's prior distribution.

Do you have a link to a proof or discussion of this? I haven't heard this before and I would love to have this statement unpacked a little more.

vanderZwan · on Jan 6, 2019

> Thus, we aren't going to ever do better than floats if we are programming on physical computers that exist in this universe.

Maybe, but that doesn't mean the particular implementation of floats being used is the best one. See also: Unums and Posits

https://posithub.org/

nestorD · on Jan 6, 2019

To my understanding, experiments with Unums showed shortcoming that Gustafson didn't anticipate and lead to Posits which drop the fixed length constraint. Doing that makes improving precision a lot easier but at the cost of computation time. Overall I am not convinced that the current implementation is optimal but it is a very good trade-off between speed and precision.

vanderZwan · on Jan 6, 2019

> Doing that makes improving precision a lot easier but at the cost of computation time.

Not quite. The difference in computation time is the current the lack of hardware support, not something inherent to the underlying encoding method. So in practice you are right, but in, for example, embedded contexts without floating point hardware, the performance advantages of IEEE floats should disappear (especially if using a 16 or 8 bit posit suffices).

Posits are simpler to implement than IEEE floats (less edge cases) and use more bits for actual numbers whereas IEEE floats waste about half on NaNs. The use of tapered precision is also nice.

tntn · on Jan 6, 2019

Even if hardware support existed, it seems like a variable length encoding has some inherent overhead relative to a fixed length encoding. If you have a "base length" of e.g. 32 bits and occasionally expand to 64, there's an inherent cost there in both computation and memory, presumably for greater precision. Perhaps that overhead could be minimal with hardware support, but it seems it must have some.

vanderZwan · on Jan 6, 2019

Those are type one unums, not posits. What you are saying about variable length encoding may be true, but it does not actually apply to the current comparison. Type 2 unums are also fixed length, but have other issues.

tntn · on Jan 6, 2019

'nestorD was discussing the effects and overheads of "dropping the fixed length constraint" in the comment you replied to.

vanderZwan · on Jan 7, 2019

Oh darn, you're right. My bad!

In my defense, the comment he replied to got downvoted and I thought it was nestorD, so I was "primed" to misinterpret his comment as criticizing unums in general.

chrisseaton · on Jan 5, 2019

People get upset that floating point can’t represent all infinite number of real numbers exactly - I can’t understand how they think that’s going to be possible in a finite 64 bits.

m0zg · on Jan 5, 2019

To hit the point home a little harder: you can easily iterate through the entire representable set of float32 on a modern machine within seconds. I've encountered many engineers who don't quite get that.

chrisseaton · on Jan 5, 2019

Right - if you have a monadic function that takes a 32-bit float, your tests should probably literally cover every single input value.

jancsika · on Jan 5, 2019

Wait, where did OP's 64-bit slot go?

> I can’t understand how they think that’s going to be possible in a finite 64 bits.

You apparently stole 32 of them to make your bat.

If you put them back your tests balloon to half a century each.

tom_ · on Jan 5, 2019

An alternative calculation: https://news.ycombinator.com/item?id=18109432

"You can rent a Skylake chip on Google Cloud that'll perform 1.6 trillion 64 bit operations per second for $0.96/hr preemptively. That's enough to run one instruction over a 64 bit address space exhaustively over 120 days, or for ~$2800"

It might not make economic sence to actually make this happen for any realistic test, but it's interesting that it might actually be feasible to do it on any kind of human timescale...

dannypgh · on Jan 6, 2019

At some point, your test switches from testing the code, to testing the machine the code runs on. That likely happens before 120 days.

josefx · on Jan 6, 2019

Given issues like the intel fdiv bug that may make sense to test if you really want to avoid running into hardware specific bugs.

dannypgh · on Jan 6, 2019

Maybe, but I'd rather a test suite that's designed to test hardware, rather than overloading some code's unit tests.

I think most unit tests are best served by testing key values- e.g. values before and after any intended behavior change, values that represent min/max possible values, values indicative of typical use.

The unit test can serve as documentation of what the code is intended to do, and meaninglessly invoking every unit test over the range of floats obscures that.

There are certainly cases where all values should be tested, but I don't think that's all cases.

chrisseaton · on Jan 5, 2019

Wow that's aggressively snarky.

I presume they were saying 'and for 32-bit floats you also get this property that you can...'

tshaddox · on Jan 5, 2019

Or on any computer at all, even an “infinite” (at least unbounded) computer like a Turing machine, considering that almost all real numbers are not computable.

ummonk · on Jan 5, 2019

Well, you don't need to represent all the real numbers. You can get quite far with just rationals or algebraic numbers, although you'll have trouble with exponentials and trignometry. And computable numbers are basically superior to any other number system for computation.

You of course need an unbounded but finite amount of space to store these numbers, which is perfectly fine.

dnautics · on Jan 5, 2019

> And computable numbers are basically superior to any other number system for computation.

I don't think that's really quite true. The point of FP is that you don't get any wierd statefulness in your compute complexity as values accumulate, every operation basically has O(1) compute time where N is the number of previous operations you've done. For rationals and algebraics that isn't the case.

ori_b · on Jan 5, 2019

You'll have massive performance issues the second you end up with a relatively prime numerator or denominator that ends up in an iterative algorithm.

yoz-y · on Jan 6, 2019

To me the only downside of IEEE 754 is that most languages including C and C++ do not provide a sensible canonical comparison methods. This leads to surprised beginners and then a ton of home made solutions which are often not appropriate.

MauranKilom · on Jan 9, 2019

Would you really want a default comparison where a == b does not imply a - b == 0?

yoz-y · on Jan 9, 2019

I think it depends, in languages which have implicit type coercion I think that would hurt. In languages like swift, where you need to explicitly cast even an Int to Double it would be less of a footgun. I'd rather floats have some overloaded operator maybe ~=, for approximate comparison.

delhanty · on Jan 6, 2019

Exactly!

Very far from a floating point expert here, but what I do is to scale-down by a few odd prime-power factors as appropriate:

Scaling down by powers of 5 is obviously appropriate for decimals, currency etc.

Scaling down by powers of 3 is good for angles measured in the degrees, minutes, seconds system.

If one scales down a lot there is an increased risk of overflow, so one can compensate by scaling up some powers of 2.

The way I think of this is as using my own manual exponent bias [0].

>the exponent is stored in the range 1 .. 254 (0 and 255 have special meanings), and is interpreted by subtracting the bias for an 8-bit exponent (127) to get an exponent value in the range −126 .. +127.

So, for example, even single-precision number are always exact multiples of 1/(2^126), and I'm just changing the denominator to contain powers of 3, 5, 7, ... etc.

[0] https://en.wikipedia.org/wiki/Exponent_bias

microcolonel · on Jan 6, 2019

Integers are a lot less trouble for many currency problems, but I think some people are afraid of multiplying integer fractions.

In financial calculations I've seen, figures are given in standard magnitudes (per cent, per mille, basis points, integer cents, etc.) which, if you're lucky with your language, can be encoded as types which can be promoted to higher precision (somewhat) transparently.

ummonk · on Jan 5, 2019

Presumably we could actually make decimal floating point computation the default and greatly reduce the amount of surprise. I don't think the performance difference would be an issue for most software.

smallnamespace · on Jan 5, 2019

Decimal floating point won't avoid this issue, for a sufficiently large value the ulp would be 10.

ummonk · on Jan 5, 2019

It would solve more common issues like this though:

> I appreciate that it may be surprising that 0.1 + 0.2 != 0.3 at first, or that many people are not educated about floating point, but I don't understand the people who "understand" floating point and continue to criticize it for the 0.1 + 0.2 "problem."

That's not a calculation that should require a high level of precision.

smallnamespace · on Jan 5, 2019

The correct solution is to understand how floating point number systems work and use near comparisons for floats.

Decimal fp is still 'wrong' for, say, 1/3 + 1/3 = 2/3.

int_19h · on Jan 6, 2019

A lot of real-world data is already in base-10 for obvious reasons, and so an arrangement that lets you add, subtract and multiply those without worrying is worthwhile, even if it can't handle something more exotic.

smallnamespace · on Jan 6, 2019

Would you really call 'any rational with divisible factors other than 2 and 5' to be 'exotic'?

Maybe we really should move back to base-60 like the Babylonians used, then you could at least divide by 3.

int_19h · on Jan 7, 2019

Because humans standardized on base-10, and computers are ultimately for humans to use?

MauranKilom · on Jan 9, 2019

Maybe we should also add data types to every language that can convert exactly between inches, feet, miles and every other non-base-10 unit?

The argument "we want to look at base-10 in the end so it should be the internal representation" is really weak and ignores basically every other practical aspect.

int_19h · on Jan 6, 2019

The way to avoid this issue is to avoid floating-point numbers that have any implicit zeroes (due to exponent) after its significant digits. Basically restrict the range to only values where it's guaranteed that for any x1 and x2 from the range, (x1-x2) produces a non-zero dx such that x2+dx == x1.

The only example off the top of my head that is floating point is C# "decimal", which actually originates from the Decimal data type in OLE Automation object model (which could be seen in VB6, and can still be seen in VBA):

https://msdn.microsoft.com/en-us/library/cc237603.aspx

Note this bit:

"scale: MUST be the power of 10 by which to divide the 96-bit integer represented by Hi32 * 2^64 + Lo64. The value MUST be in the range of 0 to 28, inclusive."

The reason why it's limited to 28 is because the 96-bit mantissa can represent up to 28 decimal digits exactly. The way it's enforced, any operation that produces a result outside of this range is an overflow error (exception in .NET).

dbaupp · on Jan 6, 2019

I believe IEEE754 floats have that subtraction/addition guarantee (as long as the hardware doesn't map subnormals to zero). The problem in this case is the input numbers are rounded when they are converted from text/decimal to a float, and so aren't exact.

int_19h · on Jan 7, 2019

> I believe IEEE754 floats have that subtraction/addition guarantee (as long as the hardware doesn't map subnormals to zero).

They don't - all 11 bits of the exponent (for float64) are in use, so you can have something like 1e300, and then you can't e.g. add 1 to it and get a different number.

   >>> x = 1e100
   >>> x
   1e+100

   >>> y = x + 1
   >>> y
   1e+100

   >>> x - y
   0.0

dbaupp · on Jan 8, 2019

That's something else: x == y, so the difference must be 0. What you said above is

  if x != y, x - y != 0

which holds for all x and y, but is different to

  if dx != 0, x + dx != x

which fails for some (many!) x and dx.

ori_b · on Jan 5, 2019

Would ` 9999999999999999.0 – 9999999999999998.0 == 10.0000` really be less surprising?

sampo · on Jan 6, 2019

I think the numbers would need to be something like

9999999999999995.0 – 9999999999999994.0 == 10.0000

twtw · on Jan 6, 2019

Binary-coded decimal formats have more or less already lived and died (both fixed and floating point). They still have areas of applicability, but this idea is very much not a new one - x86 used to have native BCD support, but it was taken out in amd64 IIRC.

hedora · on Jan 6, 2019

I took the table to be a handy guide to where arbitrary precision is the default vs. hw accelerated math.

Filtered by languages I care about, I guess I have no choice but to learn perl 6 if I want correct (but presumably slow) floating point with elegant syntax (my taste might not match yours).

I’d be curious to know what the random GPU languages and new vector instruction sets do with this computation. I don’t think they’re all 754 compliant.

twtw · on Jan 6, 2019

Can't comment on the situation with other GPU languages, but CUDA on GPUs since fermi are 754 compliant, with the exception that certain status flags are unavailable.

etCeteraaa · on Jan 5, 2019

Because if there are obvious edge and corner cases, like overflow scenarios, a professional system will either ensure that expectations are lived up to, or flatly denied as errors.

No surprises.

Gibbon1 · on Jan 5, 2019

> exceptionally good way to approximate

You answered your question. 99% of the time being exact is a requirement and calculation speed is utterly unimportant, thus using IEEE 754 results in programs that are fundamentally broken.

coddingtonbear · on Jan 5, 2019

Is that really true? In my experience, 99.9% of the time I don't need an exact number; the vanishingly few times when I have such a need (almost entirely calculations involving currency), using a fixed point representation is simple enough.

dkarl · on Jan 5, 2019

People do different kinds of work, so there are programmers who experience it both ways, 99% of the time floats are good solution or 99% of the time floats are an incorrect solution. Because of history and language support, classes and other resources for learning to program teach you to use floating-point numbers and don't bother with alternatives. As a result you have a lot of programmers who default to treating every number with a dot in it as floating point number, and they get burned by it, and instead of realizing it's just a gap in their education that they can correct, they treat overuse of floats as a mistaken industry-wide consensus that needs to be overturned.

zokier · on Jan 5, 2019

I think there is argument to be made for high-level languages defaulting for arbitrary precision math ("make it correct first, fast second"). But considering that we are still fumbling around with fixed-width integers and that is much simpler domain after all, I don't hold my breath on "solving" the problem of reals any time soon.

Gibbon1 · on Jan 6, 2019

That's exactly my point, why is the default a lossy format? And consider the distinction between variables and calculations. Formats like IEEE754 are designed for performing fast high accuracy transformations on matrices. I have no complaints about that. But the default arbitrary number format should be able to store exact integers and ratio's.

Gibbon1 · on Jan 5, 2019

You don't need an exact number but customer data is universally decimal. Soon as you blindly convert that to IEE754 everything is now broken.

coddingtonbear · on Jan 5, 2019

Is it really, though? I'm honestly struggling to think of a non-currency situation in which fractional customer data necessarily be handled as a decimal value -- and, honestly, even if the availability heuristic might make them seem more common than they are, I'd be astonished if even a single percent of general calculations programmers collectively ask computers to perform are involving currency. Most real-life situations just don't even inherently _have_ that kind of precision, let alone need it. Seriously, I can't think of a time when I've needed to store a coordinate or a person's height as a decimal value to prevent something from being broken.

tedunangst · on Jan 6, 2019

My 5/8s wrench disagrees. Happens to store quite nicely in a float, however.

svat · on Jan 5, 2019

A useful website for these that I ran across recently: https://float.exposed/

For example, entering 9999999999999999.0 into "double" gives https://float.exposed/0x4341c37937e08000 and entering 9999999999999998.0 gives https://float.exposed/0x4341c37937e07fff

My wishlist for such a page would contain two additional features:

1. Allow entering expressions like "a OP b == c", so that one can enter "0.1 + 0.2 == 0.3" or "9999999999999999.0 - 9999999999999998.0 == 1.0" and see the terms on the left-hand side and right-hand side.

2. Show for each float the explicit actual range of real numbers that will be represented by that float. For example, show that every real number in the range [9999999999999999, 10000000000000001] is represented by 10000000000000000, and that every real number in the range (9999999999999997, 9999999999999999) is represented by 9999999999999998.

The author of this one has a blog post about it: https://ciechanow.ski/exposing-floating-point/ and I also like a shorter (unrelated) page that nicely explains the tradeoffs involved in floating-point representations and the IEEE 754 standard, by usefully starting with an 8-bit format: http://www.toves.org/books/float/

tzs · on Jan 5, 2019

The IEEE 754 calculator at http://weitz.de/ieee/ does some of what you ask for. You can enter two numbers, see the details of their representation, and do plus, minus, times, or divide using them as operands and see the result.

DougBTX · on Jan 5, 2019

> 9999999999999999.0 into "double" gives https://float.exposed/0x4341c37937e08000

Nice that it reformats the input to "10000000000000000.0", gets the point across that a 64 bit double float just doesn't have enough bits to exactly represent 9999999999999999.0, but that it does happen to be able to represent 9999999999999998.0.

loeg · on Jan 6, 2019

An easy rule of thumb is each 3 decimal digits takes 10 bits to represent. 9999999999999999 is 16 (= 15 + 1) decimal digits. And 3 bits can only represent 0-7. So you need more than 3 bits for that final decimal digit. So, 50 + 4 bits.

IEEE 754 64-bit floats have 53 significant bits ("mantissa").

csours · on Jan 5, 2019

This is awesome, I tried to read and understand the 754 float spec before, and I didn't really get it.

Try playing around with half precision, it makes things a lot easier to understand.

scrollaway · on Jan 5, 2019

OT: What a lovely tld .exposed is. I… really wonder about its majority userbase.

al2o3cr · on Jan 5, 2019

The arithmetic is correct - the problem is that "9999999999999999.0" isn't representable exactly.

9999999999999998.0 in IEEE754 is 0x4341C37937E07FFF

"9999999999999999.0" in IEEE754 is 0x4341C37937E08000 - the significand is exactly one higher.

With an exponent of 53, the ULP is 2 - so parsing "9999999999999999.0" returns 1.0E16 because it's the next representable number.

    Using one of these workarounds requires a certain prescience of the
    data domain, so they were not generally considered for the table above.

Doing arithmetic reliably with fixed-precision arithmetic always requires understanding of the data domain. If you need arbitrary precision, you'll need to pay the overhead costs of arbitrary-precision: either by opting-in by using the right library, or by default in languages like Perl6 and Wolfram.

alanfranz · on Jan 5, 2019

What is the "right answer"? Is the article claiming that such languages don't respect IEEE-754, or that IEEE-754 is shit?

If you want arbitrary precision, use an arbitrary precision datatype. If you use fixed precision, you'll need to know how those floats work.

Pointless article, imho.

sbierwagen · on Jan 5, 2019

Note that the last example in the list, Soup, handles the expression "correctly", and also happens to be a programming language the author is working on.

ken · on Jan 6, 2019

> Is the article claiming that such languages don't respect IEEE-754, or that IEEE-754 is shit?

No, I don't think so. Where does that come from? The page doesn't mention FP standards at all.

> If you want arbitrary precision, use an arbitrary precision datatype.

That's the point. Half of them don't offer this feature. The other half make it very awkward, and not the default.

We went through this exercise years ago with integers. These days, there are basically two types of languages. Languages which aim for usability first (like Python and Ruby), which use bigints by default, and languages which aim for performance first (like C++ and Swift), which use fixints by default. It's even somewhat similar with strings: the Rubys and Pythons of the world use Unicode everywhere, even though it's slower. No static limits.

With real numbers, we're in a weird middle ground where every language still uses fixnums by default, even those which aim for usability over performance, and which don't have any other static limits encoded in the language. It's a strange inconsistency.

I predict that in 10 years, we'll look back on this inconsistency the same way we now look back on early versions of today's languages where bigints needed special syntax.

geocar · on Jan 6, 2019

> Pointless article, imho.

I'm sorry you thought so. It pops up pretty often and always seems to spark a lot of conversation, so I think most programmers that give it any thought can find it a very interesting area of study.

There's an incredible amount of creep: We have what starts with nice notation (like x-y) and have to trade a (massively increased) load in either our minds or in the heat our computer generates. I don't think that's right, and I think the language we use can help us do better.

> What is the "right answer"?

What do you think it is?

Everyone wants the punchline, but this isn't a riddle, and if this problem had a simple answer I suspect everyone would do it. Languages are trying different things here: Keeping access to that specialised subtraction hardware is valuable, but our brains are expensive too. We see source-code-characters, lexicographically similar but with wildly differing internals. We want the simplest possible notation and we want access to the fastest possible results. It doesn't seem like we can have it all, does it?

mcguire · on Jan 5, 2019

I think the surprise was that Go uses arbitrary precision for constants.

Retric · on Jan 5, 2019

There are several.

If you subtract two numbers close to each other with fixed precision you don’t know what the revealed digits are. (1000 +/- .5) - (999 +/- .5) = 1 +/- 1.

Thus 0, 1, and 2 are all within the correct range.

csours · on Jan 5, 2019

What does revile mean in this context?

Retric · on Jan 5, 2019

Edit: revealed

Floating point numbers have X digits of accuracy based on the format. (Using base 10 for simplicity) Let’s say .100 to .999 times 10^x.

But what happens when you have .123x10^3 - .100x10^3. It’s .23? x 10^2 but what is that ? we might prefer to pick 0 but it really could be anything. We can’t even be sure about the 3. If the numbers where .1226 x 10^3 and .1004 x 10^3 that just got rounded the correct number would be .222 x 10^2

lenticular · on Jan 5, 2019

Yeah, that's just a limitation of the format. Approximating an uncountably infinite quantity of numbers with only 64 bits is never going to be exact.

Y However, you aren't going to do any better without using vastly more expensive arbitrary precision.

derefr · on Jan 5, 2019

You could see it as a "limitation of the format", or you could see it as exchanging one type of mathematical object for another.

For example, CPU integers aren't like mathematical integers. CPU integers wrap around. So CPU integers aren't "really" the integers—CPU integers are actually the ring of integers modulo 2n , with their names changed!

I'm not sure what the name of the ring(?) that contains all the IEEE754 floating-point numbers and their relations is called, but it certainly exists.

And, rather than thinking of yourself as imprecisely computing on the reals, you can think of what you're doing as exact computation on members of the IEEE754 field-object—a field-object where 9999999999999999.0 - 9999999999999998.0 being anything other than 2.0 would be incorrect. Even though the answer, in the reals, is 1.0.

lenticular · on Jan 6, 2019

Floating point numbers aren't a ring. In fact, they aren't even associative. Thus, they are don't even rise to the level of group or even monoid.

mabbo · on Jan 5, 2019

The point is to illustrate a simple fact that most of us know- but maybe some don't.

https://m.xkcd.com/1053/

twtw · on Jan 5, 2019

It doesn't even illustrate that particularly well. As is, the page just seems to be pointing at floating point and yelling "wrong", with no information on what's actually happening.

By all means embrace the surprise and educate today's 10,000, by why not actually explain why these are reasonable answers and the mechanics here behind the scenes?

jacobtwotwo · on Jan 5, 2019

I was one of those people today! This intrigued me enough to learn more about IEEE 754.

Thank you for the relevant xkcd!

dangerbird2 · on Jan 5, 2019

The right answer is to convert to an integer or bignum. If the language reads 9999999999999999.0 as a 32 bit float, you will get 0.0. If it's a double, you'll get 2.0.

zamadatix · on Jan 5, 2019

I don't think there is a "right" answer. Defaulting to bignum makes no more sense than defaulting to float for inputs "1" and "3" if the operation to be performed on the next line is division. Symbolic doesn't make sense all of the time either, what if it's a calculator app and the user enters "2*π", they probably don't want "2π" to be the result.

If we're going to try to find a "right" answer from a language view without knowing the exact program and use cases then the most reasonable compromise is likely "error" because types weren't specified on the constants or parsing functions.

InclinedPlane · on Jan 6, 2019

There's is a mathematically correct answer for this problem given their decimal representation. That's the correct answer for the math, period. What "good enough" behavior is for a system that uses numbers under the hood depends on context and is only something that the developer can know. Maybe they're doing 3D graphics and single precision floats are fine, maybe they're doing accounting and they need accuracy to 100ths or 1000ths of a whole number.

The appropriate default is, I would argue, the one which preserves the mathematically correct answer (as close as possible) in the majority of cases and enables coders to override the default behavior if they want to specify the exact underlying numerical representation they desire (instead of it being automatic). That goes along with the "principle of least surprise" which is always a good de facto starting point for any human/computer interaction.

bjourne · on Jan 6, 2019

Take a piece of pen and paper and subtract the two numbers. Whatever number you get for the difference is "the right answer."

gpderetta · on Jan 6, 2019

Ok, I'll try with pi and e. I'll be right back.

InclinedPlane · on Jan 6, 2019

The point is that this reveals a common weakness in most programming languages. Not that floating point math has limits, but that this isn't well communicated to the user. One of the hallmarks of good programming language design is the "principle of least surprise" which things like funky floating point problems definitely fall into. Not everyone who uses programming languages, in fact very few of them, have taken numerical analysis, and many devs are not well versed in the weaknesses of floating point math. So much so that a very common way for devs to become acquainted with those limits and weaknesses is by simply blundering into them, unknowingly writing bugs, and then finding the hard way the sharp corners in the dark. This is not ideal.

Consider a similar example, pointers. Some languages (like C and C++) use pointers heavily and it's expected that devs using those languages will be experienced with them. However, pointers are very "sharp" tools and have to be used exceedingly carefully to avoid creating programs with major defects (crashes, memory leaks, vulnerabilities, etc.) They are so hard to get right that even software written by the best coders in the world commonly has major defects in it related to pointer use. This problem is so troubling to some that there are many languages (java, javascript, python, C#, rust, etc.) which have been designed to avoid a lot of the most difficult to use aspects of languages like C and C++, they use garbage collection for memory management, they discourage you from using pointers directly, and so on. However, even those languages do very little to protect the user from blundering into a mindfield of floating point math.

Consider, for example, simply this statement:

x = 9999999999999999.0

Seems rather straightforward, right? But it's not, it's a lie. Because in many languages the value of x won't be as above, it'll be (to one decimal digit precision) 10000000000000000.0 instead. Whereas the value of ....98.0 is the same as the double precision float representation to one decimal digit precision (thus the difference between the two comes out as 2.0 instead of 1.0). Now, maybe in a "the handle is also a knife" language like C this is fine, but we have so many languages which go to such extremes everywhere else to protect the user from hurting themselves except when it comes to floating point math. And here's a perfect case where the compiler, runtime, or IDE could toss an error or a warning. Here you have a perfect example of trying to tell the language something you want which it can't do for you in the way you've written, that sounds like an error to me. The string representation of this number implies that you want a precision of at least the 1's place in the decimal representation, and possibly down to tenths. If that's not possible, then it would be helpful for the toolchain you're using for development to tell you that's impossible as close to you doing it as possible, so that you know what's actually going on under the hood and the limitations involved.

Something which would also drive developers towards actually learning the limitations of floating point numbers closer to when they start using them in potentially dangerous ways than instead of having to learn by fumbling around and finding all the sharp edges in the dark. The sharp edges are known already, tools should help you find and avoid them not help new developers run into them again and again.

misterdoubt · on Jan 5, 2019

Using one of these workarounds requires a certain prescience of the data domain

I'm a little concerned if merely knowing the existence of floating point arithmetic constitutes "prescience."

seanalltogether · on Jan 5, 2019

Are there any mainstream languages that consider a decimal number to be a primitive type? I feel like floating point numbers are far less meaningful in every day programs. Even 2d graphics would be easier with decimal numbers. Unless you're using numbers that scale from very small to very large, like 3d games or scientific calculations, you don't actually want to use floating point.

mschuetz · on Jan 5, 2019

> Are there any mainstream languages that consider a decimal number to be a primitive type

Mathematica. But it's not particularely fast.

> Unless you're using numbers that scale from very small to very large, like 3d games or scientific calculations, you don't actually want to use floating point.

Unfortunately, we can sometimes only use floats in 3D graphics and floats aren't even good for semi-large to large 3D scenes. Unity is a particular bad offender. It's not even necessary for meshes but having double precision transformation matrices would make life so much easier. Could simply use double precision world and view matrices, then multiply them together and the large terms would cancel out in the resulting worldView matrix, which can then by cast back to single precision floats.

MauranKilom · on Jan 9, 2019

Not to mention z-fighting for distant objects.

misterdoubt · on Jan 5, 2019

Depends how you define 'primitive type.' A decimal number is built-in for C# and comes along with the standard libraries of Ruby, Python, Java, at least.

jimhefferon · on Jan 5, 2019

Racket: https://docs.racket-lang.org/guide/numbers.html

softawre · on Jan 7, 2019

C#

https://docs.microsoft.com/en-us/dotnet/csharp/language-refe...

lozenge · on Jan 5, 2019

Cobol?

twtw · on Jan 5, 2019

Julia has built in rationals (as do a few other languages).

I'm not aware of any language (other than Wolfram) that defaults to storing something like 0.1 as 1/10 - i.e. uses the decimal constant notation for rationals, rather than having some secondary syntax or library.

kccqzy · on Jan 6, 2019

Even in Wolfram, 0.1 is not the same as 1/10.

    In[1]:= Precision[0.1]

    Out[1]= MachinePrecision

    In[2]:= Precision[1/10]

    Out[2]= \[Infinity]

twtw · on Jan 6, 2019

Ah, thanks for the correction.

I don't currently have a license, so out of curiosity is 0.3 == 3/10 in wolfram?

kccqzy · on Jan 6, 2019

Yes that's true.

According to the documentation of Equal†,

> Approximate numbers with machine precision or higher are considered equal if they differ in at most their last seven binary digits (roughly their last two decimal digits).

Which is why in Mathematica, 0.1+0.2==0.3 is also True.

If you need a kind of equality comparison that returns False for 0.3 and 3/10, use SameQ. Funnily, SameQ[0.1+0.2,0.3] is also True, because SameQ allows two machine precision numbers to differ in their last binary digit.

†: https://reference.wolfram.com/language/ref/Equal.html

sgt101 · on Jan 6, 2019

Yes, but the Julia REPL produces 2.0 as an answer and casting both these to BigInt doesn't work either.

twtw · on Jan 6, 2019

Thus my second paragraph. You have to opt in to use other formats.

Casting to bigint doesn't work because the problem occurs when converting the decimal constant in the source to floating point. You would have to convince the parser to parse the constant as something besides a float.

sgt101 · on Jan 7, 2019

Agree - but it's a shame, fundamentally what ever the reason this is really, really egregious behaviour

rurban · on Jan 5, 2019

In Common Lisp it's even standardized. Arithmetic is too important to be left to wrong CPU intrinsics.

kccqzy · on Jan 6, 2019

There are issues with arbitrary precision decimal numbers. For one, you can't deal with things like 1/3: these are repeating decimals so they need infinite memory to represent.

ummonk · on Jan 5, 2019

Do you mean some fixed point decimal number? Cause the normal way to do decimal numbers would still be floating point.

derefr · on Jan 5, 2019

They mean arbitrary-precision decimal arithmetic (i.e. a struct containing bignum x and integer y where the connoted value is x*10^y, such that multiplication can be defined simply as the independent multiplication of the value parts and of the exponent parts.)

mongol · on Jan 6, 2019

lalaithion · on Jan 5, 2019

There are plenty of languages without the concept of a primitive type...

garethrees · on Jan 6, 2019

The linked post is a bit poorly expressed, but I think there is a good point there: fixed-size binary floating-point numbers are a compromise, and they are a poor compromise for some applications, and difficult to use reliably without knowing about numerical analysis. (For example, suppose you have an array of floating-point numbers and you want to add them up, getting the closest representable approximation to the true sum. This is a very simple problem and ought to have a very simple solution, but with floating-point numbers it does not [1].)

Perhaps it is time for the developers of new programming languages to consider using a different approach to representing approximations to real numbers, for example something like the General Decimal Arithmetic Specification [2], and to relegate fixed-size binary floating-point numbers to a library for use by experts.

There is an analogy with integers: historically, languages like C provided fixed-size binary integers with wrap-around or undefined behaviour on overflow, but with experience we recognise that these are a poor compromise, responsible for many bugs, and suitable only for careful use by experts. Modern languages with arbitrary-precision integers are much easier to write reliable programs in.

[1] https://en.wikipedia.org/wiki/Kahan_summation_algorithm [2] http://speleotrove.com/decimal/decarith.html

MauranKilom · on Jan 9, 2019

Do note that UB on integer overflow is (at least nowadays) more of a compiler wish for optimization reasons than it is technically necessary (your CPU will indeed just wrap around if you don't live in the 80s anymore, but a C++ compiler might have assumed that won't happen for a signed loop index).

f2f · on Jan 5, 2019

Also worth checking: http://0.30000000000000004.com/

most popular previous discussion: https://news.ycombinator.com/item?id=10558871

chaitanya · on Jan 6, 2019

There's an easier way to specify long floats in Common Lisp: use the exponent marker "L" e.g. 9999999999999999.0L0. No need to bind or set reader variables.

That said, even in Common Lisp I think its only CLISP (among the free implementations) that gives the correct answer for long floats.

CLISP:

    [1]> (- 9999999999999999.0L0 9999999999999998.0L0)
    1.0L0

SBCL, CMUCL and Clozure CL:

    * (- 9999999999999999.0L0 9999999999999998.0L0)
    2.0d0

The standard only mandates a minimum precision of 50 bits for both double and long floats, so there's no guarantee that using long floats will give the correct answer, as we can see.

http://www.lispworks.com/documentation/HyperSpec/Body/t_shor...

mark-r · on Jan 6, 2019

Is floating point math broken?

https://stackoverflow.com/questions/588004/is-floating-point...

No, it's just that a lot of people don't understand its limitations.

mbostock · on Jan 5, 2019

It’s nice that JavaScript has arbitrary-precision integers now. 9999999999999999n - 9999999999999998n === 1n

dsalaj · on Jan 5, 2019

Google calculator gives answer 0 where as duckduckgo calculator answers with 2. xD

pdw · on Jan 5, 2019

That's disappointing. The Android calculator has an awesome arbitrary precision engine (implemented by Hans Boehm!).

https://cacm.acm.org/magazines/2017/8/219594-small-data-comp...

ivanhoe · on Jan 6, 2019

Interesting thing is that Google calculator will give 2 if you fill in the numbers by clicking on the calculator buttons, instead of writing them in the search bar.

azhenley · on Jan 5, 2019

Bing gives 1 though!

sampo · on Jan 6, 2019

So duckduckgo uses the normal 64bit floating point, and the clever people at bing automatically switch to bignums when needed. But I have no idea what google does to get that 0?

azhenley · on Jan 6, 2019

I think Google is truncating the numbers. You get 0 even if you do:

9999999999999999 – 9999999999999990

sampo · on Jan 6, 2019

    9999999999999999 - 9999999999999971 ==  0
    9999999999999999 - 9999999999999970 == 30
    9999999999999999 - 9999999999999969 == 32
    9999999999999999 - 9999999999999966 == 34

nly · on Jan 5, 2019

This is particularly sucky to solve in C and C++ because you don't get arbitrary precision literals.

    #include <boost/multiprecision/cpp_dec_float.hpp>
    #include <boost/lexical_cast.hpp>
    #include <iostream>

    using fl50 = boost::multiprecision::cpp_dec_float_50;

    int main() {
        auto a = boost::lexical_cast<fl50>("9999999999999999.7");
        auto b = boost::lexical_cast<fl50>("9999999999999998.5");
        std::cout << (a - b) << "\n";
    }

works

    int main() {
        fl50 a = 9999999999999999.7;
        fl50 b = 9999999999999998.5;
        std::cout << (a - b) << "\n";
    }

doesn't, even if you change fl50 out for a quad precision binary float type.

svat · on Jan 5, 2019

> Even user-defined literals in C++11 and later don't let you express custom floating point expressions

Note that in your code sample you're not actually using user-defined literals (https://en.cppreference.com/w/cpp/language/user_literal). This works (based on on your earlier code sample and adding user-defined literals):

    #include <boost/multiprecision/cpp_dec_float.hpp>
    #include <boost/lexical_cast.hpp>
    #include <iostream>
    using fl50 = boost::multiprecision::cpp_dec_float_50;
    fl50 operator"" _w(const char* s) { return boost::lexical_cast<fl50>(s); }
    int main() {
        fl50 a = 9999999999999999.7_w;
        fl50 b = 9999999999999998.5_w;
        std::cout << (a - b) << "\n";
    }

nly · on Jan 5, 2019

Thanks, it's nice to be wrong! For some reason I had it in my head that you couldn't get the token as a char const* for floating point expressions...

rg3 · on Jan 5, 2019

Note in C you can get the correct result if you use long doubles, which normally go to 80 bits[1]:

printf("%Lf\n", 9999999999999999.0L - 9999999999999998.0L);

In my x86_64 computer it breaks when you add enough digits. At this point it started outputting 0.0 as the difference:

printf("%Lf\n", 99999999999999999999.0L - 99999999999999999998.0L);

With 63 bits for the fraction part you more or less get around 19 decimal digits of precision, and the expression above uses 20 significant digits.

[1] https://en.wikipedia.org/wiki/Extended_precision#x86_extende...

thanatos_dem · on Jan 6, 2019

Interestingly, SQLite gets it wrong, returning 2.0, but MySQL, MariaDB, Postgres, and Cockroach all get it right at 1.0

I guess this comes down to most of them having implementations of arbitrary precision decimals.

zeroimpl · on Jan 7, 2019

In PostgreSQL, if you specify a decimal literal, it is assumed to be type NUMERIC (arbitrary precision) by default, as opposed to FLOAT or DOUBLE PRECISION.

If you stored your values in table rows as DOUBLE PRECISION, you would of course get the wrong answer.

tzury · on Jan 6, 2019

With python, I get 2 even when using Decimals.

    Python 2.7.3 (default, Oct 26 2016, 21:01:49)
    [GCC 4.6.3] on linux2
    Type "help", "copyright", "credits" or "license" for more 
    information.
    >>> from decimal import *
    >>> getcontext().prec
    28
    >>> a=Decimal(9999999999999999.0)
    >>> b=Decimal(9999999999999998.0)
    >>> a-b
    Decimal('2')

That is unexpected.

minitech · on Jan 6, 2019

The issue is that 9999999999999999.0 == 10000000000000000.0. You need to pass a string: Decimal('9999999999999999.0')

maxnoe · on Jan 6, 2019

This is because you create a python float (64-bit) before passing this float to the Decimal class.

  >>> a = 9999999999999999.0
  >>> a
  1e+16

lelf · on Jan 6, 2019

  >>> Decimal('9999999999999999.0')-Decimal('9999999999999998.0')
  Decimal('1.0')

rscho · on Jan 6, 2019

There is interesting ongoing research on representing exact reals: https://youtu.be/pMDoNfKXYZg

Veedrac · on Jan 6, 2019

Quick summary of the talk:

A specialist number representation is made for exact representation of values in geometric calculations (think CAD). Numbers are represented as sums of rational multiples of cos(iπ/2n).

Exact summation, multiplication and division (not shown) of these quantities are possible, and certain edge-cases (eg. sqrt) have special-case handling.

The system was integrated into and tested on an existing codebase.

The speaker was also one of the authors of Herbie, if other people remember that.

gpm · on Jan 5, 2019

2 with 64 bit floats, 0 with 32 bit floats.

sp332 · on Jan 5, 2019

I could understand 0, but how does it get 2?

codeflo · on Jan 5, 2019

FP numbers are (roughly) stored in the form m×2^e (m = mantissa, e = exponent). When numbers can't be represented exactly, m is rounded. My guess is that these numbers end up being encoded as 4999999999999999×2 and 4999999999999999.5×2, where the latter is rounded up to 5000000000000000×2.

twtw · on Jan 5, 2019

The nearest doubles to each of these two decimal constants end up being roughly 2 apart. Whereas for fp32 both decimal constants are stored as the same float.

tomxor · on Jan 5, 2019

Because 9999999999999999 is rounded to 10000000000000000 before any math even happens. Precision != order of magnitude.

theoh · on Jan 5, 2019

These integers are so large that they cannot be precisely represented by 32 bit or 64 bit floats. So there's a rounding effect. https://stackoverflow.com/a/1848953

your-nanny · on Jan 5, 2019

because 2 is the interval of precision at that scale. In floating point loss of precision scales with magnitude.

Dylan16807 · on Jan 5, 2019

x999 rounds up when stored, x998 stays the same.