16-bit math look-up tables – the unexpected power of scaled-integer math (wilsonminesco.com)
114 points by mirceasoaica on April 1, 2017 | 51 comments



For those who are interested, CORDIC (for COordinate Rotation DIgital Computer) is another technique from that era. It "is a simple and efficient algorithm to calculate hyperbolic and trigonometric functions, typically converging with one digit (or bit) per iteration." See https://en.wikibooks.org/wiki/Digital_Circuits/CORDIC

For instructional purposes, here is a simple Python implementation that uses only adds and shifts in the inner loop, followed by a single scaled-multiply to finish it up: https://code.activestate.com/recipes/576792-polar-to-rectang...
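And here is the same idea as a minimal integer sketch in C (my own illustration, separate from the linked recipe): only shifts, adds, and compares in the loop, then one scaled multiply at the end to remove the CORDIC gain. The Q16 format, the atan table, and the 1/K ~ 0.607253 constant are the standard textbook choices.

    #include <stdint.h>
    #include <stdio.h>

    #define ITER 16
    #define Q    16                        /* Q16 fixed point */

    /* round(atan(2^-i) * 2^16) for i = 0..15 */
    static const int32_t atan_tab[ITER] = {
        51472, 30386, 16055, 8150, 4091, 2047, 1024, 512,
          256,   128,    64,   32,   16,    8,    4,   2
    };

    /* Rotate (x0, y0) by angle z (Q16 radians, |z| <= pi/2): shifts and
       adds in the loop, then one scaled multiply per output to undo the
       CORDIC gain (1/K = 0.607253 ~ 39797 in Q16). */
    static void cordic_rotate(int32_t x0, int32_t y0, int32_t z,
                              int32_t *xr, int32_t *yr)
    {
        int32_t x = x0, y = y0;
        for (int i = 0; i < ITER; i++) {
            int32_t dx = x >> i, dy = y >> i;
            if (z >= 0) { x -= dy; y += dx; z -= atan_tab[i]; }
            else        { x += dy; y -= dx; z += atan_tab[i]; }
        }
        *xr = (int32_t)(((int64_t)x * 39797) >> Q);
        *yr = (int32_t)(((int64_t)y * 39797) >> Q);
    }

    int main(void)
    {
        int32_t c, s;
        cordic_rotate(1 << Q, 0, 34315 /* ~pi/6 in Q16 */, &c, &s);
        printf("cos ~ %f  sin ~ %f\n", c / 65536.0, s / 65536.0);
        /* expect roughly 0.866 and 0.500 */
        return 0;
    }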


IIRC, CORDIC has some issues as you scale the number of bits you want--you can't just run CORDIC 64 times to get 64 bits of accuracy, for example.

I think the point of diminishing returns is 14-16 bits, but it's been a long time and I'd be happy to be proven wrong.


I used CORDIC on a FPGA a few years ago to compute atan2.


Cute. However, unless you're using a CPU from the 6502 era, it's probably not worth the trouble for multiplication and division. Today's low-end CPUs have good multiply hardware, and for a few dollars more, you get a decent FPU. Trig functions, though, may be worth precomputing. The standard libraries for trig functions often grind their way out to far more precision than you need for graphics or control, and that takes time. Interpolating from a modest-sized table can be faster.


> However, unless you're using a CPU from the 6502 era, it's probably not worth the trouble for multiplication and division.

When we're talking PCs, fixed-point math stayed popular for a few generations past the 6502 era. The 6502 had no multiply or divide instructions at all, and up to the 80386 there was only integer multiply and divide, which was slow as molasses. Before the 80486, fixed point wasn't a matter of speed, it was a matter of survival (at least for graphics and such).
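For anyone who never lived through it, here is a minimal sketch of the 16.16 fixed-point arithmetic that era of code leaned on (an illustration only; the real thing was usually hand-written assembly): multiply and divide are just integer ops plus a shift to re-align the binary point.

    #include <stdint.h>
    #include <stdio.h>

    typedef int32_t fix16;          /* 16 integer bits, 16 fraction bits */

    #define FIX_ONE        (1 << 16)
    #define INT_TO_FIX(a)  ((fix16)((a) << 16))

    /* widen to 64 bits, then shift to keep the point in place */
    static fix16 fix_mul(fix16 a, fix16 b) { return (fix16)(((int64_t)a * b) >> 16); }
    static fix16 fix_div(fix16 a, fix16 b) { return (fix16)((((int64_t)a) << 16) / b); }

    int main(void)
    {
        fix16 a = FIX_ONE + FIX_ONE / 2;                  /* 1.5   */
        fix16 b = fix_div(INT_TO_FIX(5), INT_TO_FIX(4));  /* 1.25  */
        printf("%f\n", fix_mul(a, b) / 65536.0);          /* 1.875 */
        return 0;
    }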


I remember my old 8086 PC turbo'd to 8 MHz would crawl along as it rendered the wireframe space shuttle demo image that came with Autocad. I somehow got my hands on an 8087 math coprocessor, plugged that bad boy in, and the difference was phenomenal. Good times.


Indeed, I was using fixed-point maths on PlayStation 1 games in the mid-to-late '90s. It was often responsible for the gaps you'd see between polygons on many PS1 games.


Gaps were mostly a sign of sloppy developers and rushed schedules. Compare Tomb Raider 1 on both PC/PSX with Destruction Derby (Reflections) and Gran Turismo 1/2 (Polyphony).


Yes, there were ways to avoid it (shared vertex calculations), but the effect was mostly caused by two separate vertices that were supposed to be at the same position ending up slightly apart, because each vertex was positioned with a different transformation matrix.


I was using it for Z80, Game Boy Color homebrew. Getting down in the bits like this is... bracing!


I remember doing a Game Boy Color game after working on a PS1 title. And even though I came from an 8-bit 6502 background (BBC Micro), going back to it was hell. 'Bracing' is an understatement! I couldn't imagine much worse these days.


It’s usually possible to reframe problems so they don't require trig functions at all.

For instance, you can represent rotations as unit magnitude complex numbers, compose them using complex multiplication, and trivially get whatever trig functions you want out. If you need to compress them for I/O, take the stereographic projection (requires 1 division per point for both forward and inverse transform) and then optionally reduce the precision of the result.


Constructing a unit complex number from an angle, though, requires trigonometry. Precomputing and reusing it is identical to precomputing the sine and cosine of that angle and reusing those; the complex number itself doesn't simplify anything here beyond storing both the sine and cosine in one variable.


The only time you need to start with an angle is if a human or other external system is feeding it to you, and the only time you need to convert to an angle is when you need to present the data to a human or other external system.

The point here is that you can usually get rid of evaluating transcendental functions in the middle of the number crunching part of your code, which is where you would care most about saving operations.


I don't think that's right. A rotation implies an angle. If it's not coming from a human, then you are either hard-coding an angle to rotate by, in which case you need to take the sine and cosine of that angle to generate a complex number, or you're getting the rotation implied by some coordinates, hard-coded or otherwise, in which case you need to normalise a complex number, which still requires a square root.

I don't think there's a way to do rotations without transcendental functions (or square roots) that isn't equivalent to merely precomputing the transcendental functions or square roots.

Maybe square roots are faster, in which case you have a point that avoiding transcendental functions is the way to go.


I think they're referring to quaternions. May require some trig to generate the initial quaternion, but then in your number crunching code you can multiply with other quaternions (representing rotation composition) and then even convert the result to a 3x3 rotation matrix, both with no trig. This is how game engines handle storing 3D rotation.
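A minimal sketch of that pipeline (my own illustration, not any particular engine's API): the only trig is in building a quaternion from an axis and angle; composing rotations and converting to a 3x3 matrix are trig-free.

    #include <math.h>
    #include <stdio.h>

    typedef struct { double w, x, y, z; } quat;

    /* the only trig: build a rotation about a unit axis by 'angle' radians */
    static quat quat_from_axis_angle(double ax, double ay, double az, double angle)
    {
        double s = sin(angle / 2);
        quat q = { cos(angle / 2), ax * s, ay * s, az * s };
        return q;
    }

    /* composition of rotations: Hamilton product, no trig */
    static quat quat_mul(quat a, quat b)
    {
        quat r = {
            a.w*b.w - a.x*b.x - a.y*b.y - a.z*b.z,
            a.w*b.x + a.x*b.w + a.y*b.z - a.z*b.y,
            a.w*b.y - a.x*b.z + a.y*b.w + a.z*b.x,
            a.w*b.z + a.x*b.y - a.y*b.x + a.z*b.w
        };
        return r;
    }

    /* unit quaternion -> 3x3 rotation matrix, also no trig */
    static void quat_to_matrix(quat q, double m[3][3])
    {
        m[0][0] = 1 - 2*(q.y*q.y + q.z*q.z);
        m[0][1] = 2*(q.x*q.y - q.w*q.z);
        m[0][2] = 2*(q.x*q.z + q.w*q.y);
        m[1][0] = 2*(q.x*q.y + q.w*q.z);
        m[1][1] = 1 - 2*(q.x*q.x + q.z*q.z);
        m[1][2] = 2*(q.y*q.z - q.w*q.x);
        m[2][0] = 2*(q.x*q.z - q.w*q.y);
        m[2][1] = 2*(q.y*q.z + q.w*q.x);
        m[2][2] = 1 - 2*(q.x*q.x + q.y*q.y);
    }

    int main(void)
    {
        /* two 45-degree rotations about Z compose into one 90-degree rotation */
        quat a = quat_from_axis_angle(0, 0, 1, 0.78539816339744831);
        quat q = quat_mul(a, a);
        double m[3][3];
        quat_to_matrix(q, m);
        printf("%.3f %.3f\n%.3f %.3f\n", m[0][0], m[0][1], m[1][0], m[1][1]);
        /* expect  0.000 -1.000 / 1.000 0.000 */
        return 0;
    }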


We’re talking about possible representations of a rotation in the plane. The usual representation is “angle measure”, e.g. radians, degrees, or “binary radians”. This representation makes it easy to compose rotations (just add the angle measures), and the binary version is an efficient use of bits, but if you want to do anything else, e.g. rotate some arbitrary vector, then you need to do a bunch of transcendental function evaluations, which is computationally expensive.

I am recommending using the Cartesian (x, y) coordinate representation instead. This is often called a “complex number” x + iy on the “complex unit circle”, or you can think of it as (cosine, sine) if you prefer. Composition of rotations in this representation is still straightforward: (a + ib)(c + id) = (ac − bd) + i(ad + bc).

If you need to compress this down to one number for transmission or storage, you can take the stereographic projection (“half-angle tangent”), (x, y) ↦ y / (x + 1), and optionally also reduce the precision. This saves you at least half the bits vs. storing the pair of coordinates, and is relatively inexpensive to compute (takes one division per point).

You are right that to normalize an arbitrary vector requires taking a square root. Fortunately, this doesn’t need to be done too often.

As an added bonus, this method generalizes much more easily than angle measure to representing higher-dimensional rotations and points on higher-dimensional spheres. For instance, I recommend using Cartesian coordinates (or the stereographic projection for compression) instead of latitude and longitude for storing points on the 2-sphere such as geographical places. Many pieces of mapping software are constantly converting back and forth (implicitly or explicitly) between latitude/longitude and Cartesian coordinates for computing distances, directions, and areas, finding intermediate points, applying 3-dimensional rotations, clipping shapes, and so on. This is slow and complicated vs. just using Cartesian coordinates as the internal representation.

It is usually unnecessary to compute the angle measure corresponding to a rotation unless communicating with a human. Angle measure is just one of several possible representations, and it’s only the most common because it gets taught to every child.
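A small sketch of that representation in C (an illustration only; note the half-angle tangent is undefined at x = -1, i.e. a 180-degree rotation):

    #include <math.h>
    #include <stdio.h>

    typedef struct { double x, y; } rot;   /* assumed unit: x^2 + y^2 = 1 */

    /* composition of rotations: (a.x + i a.y)(b.x + i b.y), no trig */
    static rot rot_compose(rot a, rot b)
    {
        rot r = { a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x };
        return r;
    }

    /* stereographic projection ("half-angle tangent"); undefined at x = -1 */
    static double rot_pack(rot r) { return r.y / (r.x + 1.0); }

    /* inverse projection back to the unit circle */
    static rot rot_unpack(double t)
    {
        double d = 1.0 + t * t;
        rot r = { (1.0 - t * t) / d, 2.0 * t / d };
        return r;
    }

    int main(void)
    {
        rot a = { cos(0.3), sin(0.3) };   /* angles appear only at the boundary */
        rot b = { cos(0.5), sin(0.5) };
        rot c = rot_compose(a, b);        /* rotation by 0.8 rad */
        printf("c = (%f, %f), expect (%f, %f)\n", c.x, c.y, cos(0.8), sin(0.8));

        rot d = rot_unpack(rot_pack(c));  /* round trip through one number */
        printf("d = (%f, %f)\n", d.x, d.y);
        return 0;
    }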


An example, a solar sailing toy: http://wry.me/t/gravity/gravity.html

Yes, there's a sqrt. (And another for gravity.) No trig needed or wanted.


I don't quite agree. Complex analysis is more than powerful enough to recover the unit complex number corresponding to a specific angle without referencing any kind of trigonometry. Of course you'll have reconstructed the sine and cosine functions that way, but you don't need to use a single trigonometric function or theorem to get there.


In terms of implementation, what are you not quite agreeing with?


Except for the cache hit.


If you're interpolating and only need 16 bits of precision, you need only a small table. 32 or 64 entries should be enough for sine and cosine.
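For concreteness, a sketch of a 64-entry sine table with linear interpolation, all in integers once the table exists (the table size and Q-formats here are arbitrary choices; whether 64 entries really reaches a given accuracy target depends on your error bound):

    #include <stdint.h>
    #include <math.h>
    #include <stdio.h>

    #define LUT_BITS 6
    #define LUT_SIZE (1 << LUT_BITS)       /* 64 entries plus one guard entry */

    static int16_t sin_lut[LUT_SIZE + 1];  /* Q14: -16384..16384 */

    /* fill the table once (or bake it at build time on a float-less target) */
    static void lut_init(void)
    {
        for (int i = 0; i <= LUT_SIZE; i++)
            sin_lut[i] = (int16_t)lround(16384.0 *
                sin(6.28318530717958647692 * i / LUT_SIZE));
    }

    /* phase: Q16 "turns", 0..65535 = one full cycle; returns Q14 sine */
    static int16_t sin_q14(uint16_t phase)
    {
        uint32_t idx  = phase >> (16 - LUT_BITS);              /* top 6 bits  */
        uint32_t frac = phase & ((1u << (16 - LUT_BITS)) - 1); /* low 10 bits */
        int32_t  a = sin_lut[idx], b = sin_lut[idx + 1];
        return (int16_t)(a + (((b - a) * (int32_t)frac) >> (16 - LUT_BITS)));
    }

    int main(void)
    {
        lut_init();
        /* phase 0x2000 = 1/8 turn = 45 degrees; expect ~0.7071 * 16384 = 11585 */
        printf("%d\n", sin_q14(0x2000));
        return 0;
    }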


An example of the linear-interpolated lookup table approach using fixed-point math is the sine generation[0] in my music synthesizer. It generates a table of 1024 entries, then lerps between those. This gives you precision beyond the least perceptible audio error, and is pretty fast.

When I first wrote that code, I was targeting 32-bit ARM, with possibly very weak floating point units. These days, on the types of chips you'd see in phones (or computers in the $9-$40 range like the rpi 3), the floating point calculation is _so_ fast (especially with SIMD) that the cost of the memory lookup starts dominating. So, the NEON implementation computes 4 sin values in a gulp, using a floating point polynomial approximation of about the same precision as above[1].

[0] https://github.com/google/music-synthesizer-for-android/blob...

[1] https://github.com/google/music-synthesizer-for-android/blob...


It’s probably a generally better idea to use a polynomial approximation if you have a floating point unit; a degree 8 polynomial for sin(x) on the range [0, π/2] gets you to just about the limits of single precision floating point. If you only need about 4 digits of precision, you can use a degree 5 polynomial.
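For illustration, a sketch with plain Taylor terms through x^7, which is good for roughly 4 decimal digits on [0, pi/2]; a proper minimax (Remez) fit of the same degree, like the polynomials mentioned above, does noticeably better:

    #include <math.h>
    #include <stdio.h>

    /* Odd Taylor polynomial: x - x^3/6 + x^5/120 - x^7/5040 (Horner form).
       Worst-case error on [0, pi/2] is around 1.6e-4, i.e. ~4 digits. */
    static float sin_poly(float x)
    {
        float x2 = x * x;
        return x * (1.0f + x2 * (-1.0f/6 + x2 * (1.0f/120 + x2 * (-1.0f/5040))));
    }

    int main(void)
    {
        float x = 1.0472f;   /* ~pi/3 */
        printf("poly %.6f   libm %.6f\n", sin_poly(x), sinf(x));
        return 0;
    }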


Couldn't you do better by only using the range [0, π/4] and noting that sin(x) = cos(x - π/2)?


Sure, then you can get away with an even smaller degree polynomial.


FWIW, Applesoft for the 6502 uses a 5th order poly.


Back in the days of 80286s and 80386s, there was a popular fractal-generating program that did all calculations in fixed-point arithmetic, because PCs usually had no FPU back then.

Fractint[1]. The site has a certificate error but otherwise works fine, and the software still gets updated from time to time.

[1] https://fractint.org/
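In that spirit, here is a from-scratch sketch of a fixed-point Mandelbrot inner loop (not Fractint's code; the Q8.24 format is chosen so the squared terms can't overflow 32 bits at this zoom, and deep zooms need wider types):

    #include <stdint.h>
    #include <stdio.h>

    #define FRAC 24
    typedef int32_t fix;                                  /* Q8.24 */

    static fix fmul(fix a, fix b)    { return (fix)(((int64_t)a * b) >> FRAC); }
    static fix from_double(double d) { return (fix)(d * (1 << FRAC)); }

    /* classic z <- z^2 + c iteration, integer-only inner loop */
    static int mandel(fix cr, fix ci, int max_iter)
    {
        fix zr = 0, zi = 0;
        for (int i = 0; i < max_iter; i++) {
            fix zr2 = fmul(zr, zr), zi2 = fmul(zi, zi);
            if (zr2 + zi2 > from_double(4.0)) return i;   /* escaped */
            zi = fmul(zr + zr, zi) + ci;                  /* 2*zr*zi + ci */
            zr = zr2 - zi2 + cr;
        }
        return max_iter;
    }

    int main(void)
    {
        /* crude ASCII render of the classic view */
        for (int y = -12; y <= 12; y++) {
            for (int x = -39; x <= 20; x++) {
                int n = mandel(from_double(x / 20.0), from_double(y / 10.0), 64);
                putchar(n == 64 ? '#' : (n > 8 ? '+' : ' '));
            }
            putchar('\n');
        }
        return 0;
    }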


Used it to generate uniform random plasma fields for game textures. Great program!


With the right color map they made a nice sky with clouds. Color cycling them through a smooth full color map was also nice.


I used this exact approach just yesterday when I needed to calculate some sine values on an 8-bit MCU. Turns out including a precalculated table takes way less program memory than linking softfloat and trig libraries. 8051 is not dead, folks!
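For illustration, a sketch of that kind of table: a 65-byte quarter-wave table plus a little mirror/negate logic, no floating point anywhere (the values are round(127*sin((pi/2)*i/64)) for i = 0..64, generated offline; on an 8051 toolchain the table would normally be qualified to live in code memory, e.g. SDCC's __code).

    #include <stdint.h>
    #include <stdio.h>

    /* quarter-wave sine table, i = 0..64 */
    static const int8_t quarter_sine[65] = {
          0,   3,   6,   9,  12,  16,  19,  22,  25,  28,  31,  34,  37,  40,  43,  46,
         49,  51,  54,  57,  60,  63,  65,  68,  71,  73,  76,  78,  81,  83,  85,  88,
         90,  92,  94,  96,  98, 100, 102, 104, 106, 107, 109, 111, 112, 113, 115, 116,
        117, 118, 120, 121, 122, 122, 123, 124, 125, 125, 126, 126, 126, 127, 127, 127,
        127
    };

    /* phase: 8 bits, 256 steps per cycle; returns sine as signed Q7 (-127..127) */
    static int8_t sin8(uint8_t phase)
    {
        uint8_t idx = phase & 0x3F;              /* position within the quarter */
        if (phase & 0x40) idx = 64 - idx;        /* 2nd/4th quarter: mirror     */
        int8_t v = quarter_sine[idx];
        return (phase & 0x80) ? (int8_t)-v : v;  /* second half-cycle: negate   */
    }

    int main(void)   /* host-side sanity check */
    {
        printf("%d %d %d %d\n", sin8(0), sin8(64), sin8(128), sin8(192));
        /* expect 0 127 0 -127 */
        return 0;
    }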



Unums are unlikely to gain much usage. Posits[1][2], also by Gustafson, are a more reasonable alternative to IEEE-754 floating point (but will still have a difficult time displacing IEEE-754, if they can at all).

[1] http://web.stanford.edu/class/ee380/Abstracts/170201-slides.... [2] https://www.youtube.com/watch?v=aP0Y1uAA-2Y


Posits seem impressive. My only concern is that the lack of NaNs seems like a bug rather than a feature.

It's true that some programmers do the silliest things when faced with NaNs. But the fact is they are useful. You often want to do calculations over big matrices, where some elements simply don't have a mathematically defined answer (usually because of div0s but also because input data might have holes).

It would be a royal pain if the whole calculation stopped every time you had to add two infinities.


Isaac Yonemoto has suggested the use of ±∞ as a NaN, and it seems to work exactly like NaN does other than x/±∞ = 0 whereas you want x/NaN = NaN. I hate hardware exceptions. If you have a code that occasionally hits a NaN, that means the code is still under development. You would never release such a code into the wild, because it clearly is not safe to use and has not done the basic blocking and tackling of guarding against input arguments that will prevent any indeterminate forms, square roots of negative values, etc.

I haven't written them up yet, but _valids_ are what you want if you want software that can gracefully and mathematically handle the results that make floats generate a NaN. Think of the valid computing environment as the numerical debugging environment for posits. It's slower and ultra-careful and rigorous, but once you get your algorithm to the point where it never tries to color outside the lines, then switch to posits and go FAST.

Leaving a NaN in a number system designed for lean speed is a mixing of computing esthetics. Which do you want? Rigorous and careful and mathematical, or good enough, fast, and cheap? You have to make up your mind, because if you _mix_ the two esthetics in one number system, guess what: You get neither. It won't be fast, because it has to check for exceptions all the time, and it won't be mathematical because it keeps replacing correct answers with answers within its vocabulary (that is, it rounds). IEEE floats are a mixture of the two esthetics, and that is their fatal flaw.


> the lack of NaNs seems like a bug rather than a feature.

I think this is a reasonable concern. I'll propose to John that we make there be an optional "mode" where the infinity token is treated as "NaN". In reality, this mode just amounts to "ignore NaN traps", because the way that it's done in my hardware models, it requires almost no extra hardware.


What if we want both NaN and infinity?


If you really want both, then 1) use IEEE floats, 2) write your own data type, or 3) use a boxed data type, which is really the best solution for the dominant "missing data" abuse case anyway.


Actually, posits handle NaNs already, by interrupting the calculation and doing whatever you have set up to handle the exception. What they do NOT do is represent Not-a-Number with a number. If a programmer is about to compute, say, x/y and in running the code it sometimes hits the case y = 0, then any competent programmer can write a conditional test to guard against that happening. It is not reasonable to ask computer hardware to magically continue to work somehow when it hits a bug in a program.

Think of it this way: posits have a signaling NaN but do not have a quiet NaN.


I think I would probably choose to have one NaN rather than +/-Infinity. Infinity itself is not a number -- it's just a special NaN.



It's not a number if you are like me and say "a number is an element of a ring." (We're not talking about ordinals or cardinals, the subject being rationals, otherwise I might have a different definition.)

The projectively extended real line doesn't support adding infinity to itself, which is what the grandparent to your comment was talking about. The problem is that the projectively extended real line does not have both a positive and negative infinity.


Polynomials are numbers now? :-)


Mainly, I was poking a bit of fun at the idea that "number" even has a rigorous definition. You can't really assert whether infinity is a number or not without 1) knowing what they mean by "number" or 2) knowing what they mean by "infinity." If you want "numbers" to be a field extension of the reals and you want it to contain an "infinity," then the hyperreal numbers might be appropriate [1]. If you are instead content with having an infinitesimal and having only a ring extension, then the dual numbers might be appropriate [2].

[1] https://en.wikipedia.org/wiki/Hyperreal_number [2] https://en.wikipedia.org/wiki/Dual_number

But, anyway, if you can claim infinity is a number because someone thought of the real projective line, I can say polynomials are numbers. Square matrices, too --- I think of square matrices as being big numbers; rank measures how invertible a big number is.

A stranger consequence of my definition is that a continuous real-valued function on a space X is a "number." If X is a single point, then such a function is the same as a real number. If X is a discrete set of n points, then the set of functions is R^n. If X is compact (I think that's sufficient?), then the maximal ideals in the set of continuous functions are in correspondence with X itself. This suggests that for any ring, one may imagine there to be a space that it is the functions of, whose points are the maximal ideals of that ring. For the ring of complex one-variable polynomials, the points correspond to C itself (the maximal ideals are generated by (x-c) for varying constants c). So, yeah, polynomials are numbers now.


> In real analysis, the projectively extended real line [...] is the extension of the number line by a point denoted ∞.

> The extended complex numbers consist of the complex numbers C together with ∞.

It may be an "extended number", but it is not a number.


If you can consider a “complex number” or a “transcendental number” to be a “number”, then there’s really no reason to not also consider ∞ to be a number.

In general, the boundary of the category of ideas (if you like, elements of some formal model) that can be called “numbers” is a very fuzzy and arbitrary one.

Some people might reject as “numbers” anything other than the counting numbers 1, 2, 3, 4. Others might allow “negative numbers” or ratios. Still others are happy to include quaternions or infinite strings of digits output by some computer program. Meh.


Āryabhaṭa's sine table, the first known mathematical LUT, was made around AD 500:

https://en.wikipedia.org/wiki/Āryabhaṭa's_sine_table


> first known mathematical LUT

Babylonian multiplication and reciprocal tables are about 2500 years older.

For that matter, Hipparchus and Ptolemy’s chord tables are also centuries older, https://en.wikipedia.org/wiki/Ptolemy's_table_of_chords


>And by the way, real-world I/O in control situations is never floating-point, whether timers, counters, A/D and D/A converters, servos, etc..

Well, actually ;) The AdLib/Sound Blaster Yamaha OPL2 and OPL3 both used external floating-point DACs (YM3014B, YAC512).


>> Jack Crenshaw addresses it in chapter 5 of his fine book "Math Toolkit for Real-Time Programming."

Ah shit, can't believe I missed this all these years. Jack Crenshaw, i.e. of Let's Build a Compiler. Love that guy.


Is there a standard C library that allows for easy replacement of expensive functions like fsin or fcos with lookup tables? Also, don't those poison the cache?



