
32-bit float can match your desktop. It really just takes a few compiler flags (like avoiding -funsafe-math-optimizations), setting the rounding mode, and not using the 80-bit x87 mode (largely disused since the 64-bit transition).





I understand what you are saying ...

You aren't guaranteed that your microcontroller's float is going to match your desktop. Microcontrollers are riddled with bugs. Unless you truly need floats and fixed point isn't fast enough, my recommendation is still to use fixed point if the application is high-reliability.

Especially if your code needs to be portable across ARM, RISC-V, etc.


Many microcontrollers today, including ARM, RISC-V, and Xtensa parts, have IEEE-compliant FPUs or libms available. Same numeric format, same rounding, same result.

Fixed point isn't bad at all, just often slower when a compliant FPU is available.


> IEEE compliant FPUs or libms available. Same numeric format, same rounding, same result.

IEEE only mandates correctly rounded results, within ½ ULP (the best possible), for basic operations such as addition, subtraction, multiplication, division, and square root.

For many other ones such as trigonometric functions, exponential and logarithms, results can (and do) vary between conforming implementations.

https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.h...:

“The IEEE standard does not require transcendental functions to be exactly rounded because of the table maker's dilemma. To illustrate, suppose you are making a table of the exponential function to 4 places. Then exp(1.626) = 5.0835. Should this be rounded to 5.083 or 5.084? If exp(1.626) is computed more carefully, it becomes 5.08350. And then 5.083500. And then 5.0835000. Since exp is transcendental, this could go on arbitrarily long before distinguishing whether exp(1.626) is 5.083500...0ddd or 5.0834999...9ddd. Thus it is not practical to specify that the precision of transcendental functions be the same as if they were computed to infinite precision and then rounded. Another approach would be to specify transcendental functions algorithmically. But there does not appear to be a single algorithm that works well across all hardware architectures. Rational approximation, CORDIC, and large tables are three different techniques that are used for computing transcendentals on contemporary machines. Each is appropriate for a different class of hardware, and at present no single algorithm works acceptably over the wide range of current hardware.”


IEEE 754-2019 says for the transcendental functions (the ones in §9.2):

> A conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain.

so all of them are supposed to be correctly rounded (clause 9 operations are optional, but an implementation that provides them must round correctly). I think IEEE 754-2008 also requires correct rounding, but I don't have that spec in front of me right now.

In practice, they're not correctly rounded: the C specification explicitly disclaims any requirement that they be (§F.3¶20), reserving the cr_ prefix for future mandatory correctly-rounded variants.


Thanks! Reading https://grouper.ieee.org/groups/msc/ANSI_IEEE-Std-754-2019/b... (“It is believed that any existing implementation of 754-2008 conforms to 754-2019”) it seems IEEE 754-2008 also required it.

Even with that (and ignoring C's "we don't support that"), it can still be hard to write C code that provides identical results on all platforms. For example, I don't think much code uses float_t or double_t, or checks FLT_EVAL_METHOD (https://en.cppreference.com/w/c/types/limits/FLT_EVAL_METHOD)


So the things you mention aren't that useful for getting consistent numerical results. You really have to get into obscure platforms like mainframes to find systems where float and double aren't IEEE 754 single and double precision, respectively. FLT_EVAL_METHOD is largely only relevant if you're working on 32-bit x86 code, and even then, you can sidestep those problems if you're willing to require hardware no more than 20 years old or so.

The actual things you need for consistency are extreme vigilance about the command-line options you use, and bringing your own math library implementations rather than using the standard library. You also need vigilance about your dependencies, because somebody deciding to enable denormal flushing screws everybody in the same process.


Ah, I've had a slightly different task many times: porting a high-level algorithm from MATLAB, LabVIEW, or Keras to C.

As part of this I construct a series of test inputs and confirm that the C results are bitwise equivalent to the high-level implementation. It's usually as simple as aligning the rounding mode, disabling fused multiply-add, and a few other compiler flags that shouldn't be project defaults.

The other fun part is using the vector unit - for that we have to define IEEE arithmetic in the order the embedded device does it (usually 4x or 8x interleaved), port that back up, and verify.

Never did use a whole lot of transcendentals - maybe due to the domains I worked in.



