> *Key Takeaways:* > *[...]* > *Floating-point arithmetic is hard.* I have succe...

user3939382 · on June 16, 2021

There are libraries that offer more appropriate ways of dealing with it, but last time I ran into a FP-related bug (something to do with parsing xlsx into MySQL) I fixed it quickly by converting everything to strings and doing some unholy procedure on them. It worked but it wasn’t my proudest moment as a programmer.

tomrod · on June 16, 2021

I wish to learn a better way. FP is sure to byte again and again.

marcosdumay · on June 17, 2021

Well, manually floating the point in a string is sure to bite again and again too, but way more frequently than in binary.

There is actually no better way, if you try to calculate over the reals (with computers or whatever you want), you are prone to be bitten. Once in a while there's an article about intervalar algebra on HN, those are a great opportunity to just nod positively and remember all of the flaws of intervalar algebra I got to learn on my school's physics labs. (And yeah, those flaws do fit some problems better than FP, but not all.)

rini17 · on June 17, 2021

Pity the rational numbers (fractions) did not catch on. Of course it has flaws too but bit easier to grasp. And it handles important cases like 1/3 or 1/10 exactly.

exporectomy · on June 16, 2021

As long as you're using it to represent what could be physical measurements of real-valued quantities, it's nearly impossible to go wrong. Problems happen when you want stupendous precision or human readability.

Numerically unstable algorithms are a problem too but again, intuitively so if you think of the numbers as physical measurements.

brandmeyer · on June 16, 2021

I am regularly reminded of William Kahan's (the godfather of IEEE-754 floating point) admonition: A floating-point calculation should usually carry twice as many bits in intermediate results as the input and output deserve. He makes this observation on the basis of having seen many real world numerical bugs which are corrupt in half of the carried digits.

These bugs are so subtle and so pervasive that its almost always cheaper to throw more hardware at the problem than it is to hire a numerical analyst. Chances are that you aren't clever enough to unit test your way out of them, either.

jrochkind1 · on June 17, 2021

Yep, floating point numbers are intended for scientific computation on measured values; however many gotchas they hsve when used as intended, there are even MORE if you start using them for numbers that are NOT that. money or any kind of "count" rather than measurement (like, say, a number of bytes).

The trouble is that people end up using them for any non-integer ("real") numbers. It turns out that in modern times scientific calculations with measured values are not necessarily the bulk of calculations in actually written software.

In the 21st century, i don't think there's any good reason for literals like `21.2` to represent IEEE floats instead of a non-integer data representation that works more how people expect for 'exact' numbers (ie, based on decimal instead of binary arithmetic; supporting more significant digits than an IEEE float; so-called "BigDecimal"), at the cost of some performance that you can usually afford.

And yet, in every language I know, even newer ones, a decimal literal represents a float! It's just asking for trouble. IEEE float should be the 'special case' requiring special syntax or instantiation, a literal like `98.3` should get you a BigDecimal!

IEEE floats are a really clever algorithm for a time when memory was much more constrained and scientific computing was a larger portion of the universe of software. But now they ought to be a specialty tool, not the go-to for representing non-integer numbers.

cycomanic · on June 17, 2021

I think you are significantly underestimate the prevalence of floating point calculations, there is a reason why Intel and AMD created all the special simd instructions. Multimedia is a big user for example. You also seriously underestimate the performance cost of using decimal types, we are talking orders of magnitude.

jrochkind1 · on June 17, 2021

Fair! Good point about multimedia/animation/etc.

There are still a lot of people doing a lot of work in which they hardly ever want a floating point number but end up using it because it's the "obvious" one that happens when you just write `4.2`, and the BigDecimal is cumbersome to use.

exporectomy · on June 17, 2021

I like that idea too. I wonder why Python doesn't use bigdecimals by default. Maybe because it seems to require you to choose a precision?

seoaeu · on June 17, 2021

Notably, this is only true of 64-bit floats. Sticking to 32-bit floats saves memory and sometimes are faster to compute with, but you can absolutely run into precision problems with them. When tracking time, you'll only have millisecond precision for under 5 hours. When representing spacial coordinates, positions on the Earth will only be precise to a handful of meters.

necheffa · on June 16, 2021

I do a lot of floating point math at work and constantly run into problems either from someone else's misunderstanding, my own misunderstanding, or we just moved to a new microarchitecture and CPU dispatch hits a little different manifesting itself as rounding error to write off (public safety industry).

exporectomy · on June 17, 2021

If you expect bit-for-bit reproducible results, then yea, you'd have to know about the nitty-gritty details. The values should usually still correspond to the same thing in common real world precision though.

RhysU · on June 17, 2021

> it's nearly impossible to go wrong

It's a matter of time if one doesn't know to look for numerically stable algorithms. Or if one thinks performance merits dropping stability.

https://github.com/RhysU/ar/issues/3 was an old saga in that vein.

tehjoker · on June 17, 2021

Unfortunately, that doesn't work when you have to do:

1 - quantity2 / (quantity1 - quantity2)

... or some such thing. If quantity1 and 2 are similar, ouch!

exporectomy · on June 17, 2021

Not sure if there's a mistake in that expression, since if they're similar, you're already going to get some ridiculously large magnitude (unphysical) result. Maybe you mean calculating the error between two values or convergence testing? In that case, it hardly matters if whether you do

quantity2/quantity1 - 1

or

(quantity2 - quantity1) / quantity1

with double precision and physically reasonable values.

opheliate · on June 16, 2021

So you have problems if you want a precise answer, you want to display your answer, or if you want to use any of a large number of useful algorithms? That sounds like it’s quite easy to go wrong.

exporectomy · on June 17, 2021

You can't want a precise answer from physical measurements unless you don't know how to measure things. Display should be done with libraries, and numerical instability makes algorithms basically useless, so you pretty much have to be inventing it yourself.

pvg · on June 16, 2021

I consider the domain sophisticated enough to be an independent skill

It's been a whole field with its own patron saint for a quite a while, take a look at

https://en.wikipedia.org/wiki/William_Kahan

RhysU · on June 17, 2021

Just this week I watched someone discover that computing summary statistics in 32-bit on a large dataset is a bad idea. The computer science curricula needs to incorporate more computational science. It's a shame to charge someone tens of thousands of USD and to not warn them that floating point has some obvious footcanons.

bigiain · on June 17, 2021

> Just this week I watched someone discover that computing summary statistics in 32-bit on a large dataset is a bad idea. The computer science curricula needs to incorporate more computational science.

Sadly, I suspect too many "computer science" courses have turned into "vocational coding" courses, and now those people are computing summary statistics on large datasets in Javascript...

bqmjjx0kac · on June 17, 2021

Could you shed some light on what they did wrong, and what would be a better way to do it?

cellularmitosis · on June 17, 2021

not OP, but the hint is in “computing summary statistics in 32-bit on a large dataset”.

A large dataset means lots of values, maybe we can assume the number of values is way bigger than any individual value. Perhaps think of McDonalds purchases nation-wide: billions of values but each value is probably less than $10.

The simplest summary statistic would be a grand total (sum). If you have a good mental model of floats, you immediately see the problem!

The mental model of floats which I use is 1) floats are not numbers, they are buckets, and 2) as you get further away from zero, the buckets get bigger.

So let’s say you are calculating the sum, and it is already at 1 billion, and the next purchase is $3.57. You take 1 billion, you add 3.57 to it, and you get... 1 billion. And this happens for all of the rest of the purchases as well.

Remember: 1 billion is not a number, it is a bucket, and it turns out that when you are that far away from zero, the size of the bucket is 64. So 3.57 is simply not big enough to reach the next bucket.

RhysU · on June 17, 2021

Well explained! All of the later contributions to the sum are effectively ignored or their contributions severely damaged in 32-bit because the "buckets" are big.

It was precisely this problem. The individual had done all data preparation/normalization in 32-bit because the model training used 32-bit on the GPU. It's a very reasonable mistake if one hasn't been exposed to floating point woes. I was pleased to see that the individual ultimately caught it when observing that 2 libraries disagreed about the mean.

Computing a 64-bit mean was enough. Compensated (i.e. Kahan) summation would have worked too.

bqmjjx0kac · on June 17, 2021

Thanks for the explanation!