Hacker News new | past | comments | ask | show | jobs | submit login
Evolutionary algorithm to find Quake III's fast inverse square hack with Python (multigrad.blogspot.com)
173 points by soravux on April 8, 2014 | hide | past | favorite | 51 comments



Řrřola (a famous demoscene coder [1]) found a far better constant than the original (a 2.7-fold accuracy improvement) by using an optimisation algorithm over all three constants present in the original inverse square hack.

His algorithm and the result (along with a plot of relative error) can be found on his blog [2].

[1]: See, for example, this 256-byte raycaster https://www.youtube.com/watch?v=R35UuntQQF8

[2]: http://rrrola.wz.cz/inv_sqrt.html


This is really interesting. I'll be sure to include these references in my next blog post on the subject. The goal of the post was to find the equation from scratch, though. As it can be seen, the constant optimization was less the spotlight of the post.


Not to mention your blog engine/site is broken. Please fix the fact that we can't click on the right scrollbar due to your fancy shmancy javascript side menu.

Put it on the left hand side, or don't overlay it over the scrollbar.

Damn #hipstercoders


The placement of any site elements on top of the scrollbar is indeed annoying. Is that a Blogspot thing, or is it due to a customization?


It's blogspots "Dynamic Views". Complete dreck. I can only assume it's the default when you create a blog nowadays as I have no idea why anybody (least of all developers) would opt into that crap.


Someone at blogspot needs a smack on the head. The site takes several seconds to load, which is absolutely ludicrous for displaying plain text, and when it finally does it's unusable as you've noted.


It's a sadly commonly used theme that also does not display anything at all if you don't enable Javascript.


[deleted]


Requiring tons of javascript and seconds of load time to display plaintext? Yeah, that is pretty crazy.


Everything is on default from Blogspot. I only put the Gist source code displayer and the LaTeX math javascript. If anyone is offering a new design, I would gladly take it... As long as I can concentrate on the content and not the container.


Has anyone gone in an plugged this into the Quake III / IOQuake 3 source and see if it's make a difference? Speed or better looking geometry?


its not faster and the accuracy improvement is more of a curiosity than of real value.

infact on modern desktop/console hardware invsqrt hack is really bad. it causes type aliasing which is 'expensive' in native software terms. the x86(-64) chips underlying them all now has a dedicated instruction for this, as do the PPCs that came in the generation before... still of value on ARM though which is becoming more and more important.


There is an amazing, easy to use peice of software for doing stuff like this called Eureqa (http://www.nutonian.com/products/eureqa/). I've used it to find approximations for a lot of different functions.


Am I correct when I think of this as a curve/surface fitting tool that tries to automatically find a low complexity equation together with parameters that fits some provided data?


The term for this is called "Symbolic Regression".


Yes, that's exactly it.


On the flip side of the coin, there's ries, which takes a number and finds a mathematical expression that approximates it: http://mrob.com/pub/ries/


I had a program for my HP-48GX that use to do something similar. It was especially useful when I knew that my decimal answer had some numerical constant, such as sqrt(2), and I wanted to convert it to something more reasonable. It would only return a single value, so I had to be careful that I didn't just use it every time, since just because an answer could approximate a multiple of Pi, it might only be a coincidence and not a good answer.


Some of that was actually built-in to the calculator. The "→Q" and "→Q𝛑" functions took a floating-point number and return a rational number or rational expression in terms of pi. The tolerance for the approximation was determined by how many significant digits the calculator was configured to display. It was pretty nice for a calculator that didn't include a computer algebra system.


It was specifically an improved library of those functions. I'll have to dig around on hpcalc.org and see if I can find it again. You could do algebraic calculations if you wrapped the statement in quotes, but once you got used to RPN that made little sense to do.


It was specifically this library: http://www.hpcalc.org/details.php?id=1403.


That's really cool, I've been looking for something like that.


Eureqa seems a wonderful software for this specific application. If only I could get 2,500$ for a one-year license...


It has a 30 day free trial and you can use it for free if you limit to a few hundred datapoints (not ideal but workable.)

It's not perfect for this application though because it doesn't bitwise operations or use execution time as a fitness function.


There are cheaper ones such as GeneXproTools.


This reminds of another trick on the C64 to multiply two 8 Bit numbers a*b = [ ((a+b)/2)^2 - ((a-b)/2)^2 ]

more or less: one 8 bit additon, one 8 bit subtraction, two table lookups for the squares and one 16 bit subtraction.


Were the divide by twos just bit shift operations? (I have no idea if the 6510 had bit shift operators.)


Yep, it does.

ROL rotate left

ROR rotate right

ASL arithmetic shift left

LSR logic shift right

But for it to work for all 2-complement coded values we need to ensure that the MSB is copied back into itself to keep the value the same sign, which can be done with another shift.

; Divide a signed 16-bit value by 2

        LDA MEM+1    ;Load the MSB

        ASL A        ;Copy the sign bit into C

        ROR MEM+1    ;restore MSB

        ROR MEM+0    ;Rotate the LSB

It's a bunch of cycles but nothing compared to a generic division.


Well, I think it is part of the table.

http://codebase64.org/doku.php?id=base:fast_8bit_multiplicat...

But the C64 had also a shifting operation.


Hacker's Delight [1][2] is an amazing book dedicated to discussing about all kinds of bit hacking, including fast integer square root, cube root calculation.

[1]: http://www.hackersdelight.org/

[2]: http://www.amazon.com/Hackers-Delight-Edition-Henry-Warren/d...


Now the question is: could you arrive at this algorithm without previous knowledge of what it should converge to?


That is the exact goal of the post. I arrived to the algorithm through evolution with random initialization. I enforced absolutely no heuristic to make it converge to this equation.


I know, but can we prove there's no heuristic on that algorithm? Because if the answer is "yes", this should be general enough to find other optimizations. It's fun to think about.


The best answer I can provide is: The code is there. There are no heuristics in it. Everytime you run it, you get different results (because of random initialization seeds and the stochastic nature of EA). It may find the (a - (x >> 1)) equation on a specific execution, or not. Over the runs I made, this equation (or similar) was the most popular and nothing come close to it. In fact, it finds a lot of other optimizations; either less accurate or way more complex. I remember getting tens and tens of operations in equations with "bof" accuracies.


Why is that a relevant question now? How would doing this be more helpful without the results of the algorithm you're attempting to cut corners to approximate?


Is anyone else seeing a perpetual "loading" string where the example code is supposed to be?


It's javascript that loads the Gist into the page. You may have NoScript or similar that prevents the loader to fetch the code. The first code is the one from the Wikipedia page (fast inverse square root) and the second is this one: https://gist.github.com/soravux/9673839


Shameless plug some of you might like to read: http://www.flipcode.com/tpractice/

(I used to be really into game programming when I was in my teens.)


Why on earth would they create a named constant for 1.5F, but not for 0x5f3759df?


What would you name it?

    #define AND_THEN_A_MIRACLE_OCCURS (0x5f3759df)


Fair enough. Still, it seems silly to name a constant THREE_HALVES. What information does that add over 1.5? What happens if you change it and it becomes wrong?


MAGIC_NUMBER?


With fast native code its more than practical to search the entire space of floating point numbers and guarantee that you find the optimal magic number:

http://jheriko-rtw.blogspot.co.uk/2009/04/understanding-and-...

As already mentioned in another comment it has been taken further by Řrřola who has optimised a version with the newton step including the constants involved in that.

http://rrrola.wz.cz/inv_sqrt.html


For a moment I thought this was a way to workaround any patents with this hack, by rediscovering the hack dynamically; but I could be remembering a different patented technique.


You might be thinking of depth-fail shadow volumes, semi-famous for being patented by Creative despite having been discovered independently by a number of people (including Carmack) over the years.


That's actually a cool idea. I wonder if that would hold up.


Under US law it would not, as even genuine independent (re-)discovery is not a defense to infringement.


That's pretty ridiculous. The purpose of the law is to keep people from unfairly copying ideas, not rediscovering them.

However this is different. It's not a person rediscovering it, but an algorithm which happens to do exactly the same thing but is provably not unique or patented. It's a different algorithm that just happens to converge on the same process.

This is perfectly analogous to patents on real world inventions - there are plenty of cases of different machines that do the same thing. You can't patent the output of an algorithm, just the process by which it gets to that output.


You can be liable for triple damages if the court decides you knew about the patent and infringed anyway. Normal damages if you did not. I believe independent invention is meant to be so rare as to be suspicious (despite many documented cases where this happened).

The (supposed) purpose is to give an incentive to inventors to freely broadcasting their inventions, by giving them a monopoly on it so they can recoup their investment. Without that incentive, inventors would keep their inventions trade secrets.


and how do you prove this?


This is probably one of my favourite HN posts of 2014. Outstanding.


Oh, I thought this gonna be like regex golf:

http://xkcd.com/1313/

Misleading title.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: