My summary (from someone who is not in the field but likes backpropagation):
The core idea behind this type of approach ("parametric encoding") is that you learn a scene as some spatial data + a (small) neural network. For example, a 128^3 grid of data values and a 10k parameter model. In the forward pass you feed whatever data is at the voxel(s) in question to the network, and the backward pass updates both the network and the same voxel(s).
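Here's roughly what I mean, as a toy PyTorch sketch of my own (not the paper's fused CUDA kernels; the grid size, feature width and MLP shape are just illustrative numbers):

```python
# Toy "parametric encoding": a learnable feature grid plus a tiny MLP.
# Gradients flow into both the MLP weights and the grid entries that were looked up.
import torch
import torch.nn as nn

class GridEncodingModel(nn.Module):
    def __init__(self, res=128, feat_dim=2):
        super().__init__()
        # Dense grid of learnable features (the "spatial data"), res^3 cells.
        self.grid = nn.Parameter(torch.zeros(res, res, res, feat_dim))
        # Small MLP (a few thousand parameters) mapping looked-up features to an output.
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 4),                    # e.g. RGB + density
        )
        self.res = res

    def forward(self, xyz):                      # xyz in [0, 1)^3, shape (B, 3)
        idx = (xyz * self.res).long().clamp(0, self.res - 1)
        feats = self.grid[idx[:, 0], idx[:, 1], idx[:, 2]]  # nearest-voxel lookup
        return self.mlp(feats)

model = GridEncodingModel()
pred = model(torch.rand(1024, 3))
loss = pred.square().mean()                      # placeholder loss
loss.backward()                                  # both model.grid and the MLP get gradients
```

(The real thing interpolates between the surrounding grid corners rather than snapping to one voxel, but the gradient flow is the same idea.)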
The innovation in this paper is in how the spatial data is represented. Prior work includes dense grids, multi-resolution grids and octrees to name some - but all of them are either GPU-unfriendly or waste parameters on empty space. They figured that they can just hash the coordinates and use them directly as an index into a data array (edit: a multi-resolution stack of data arrays - sorry for not getting this right initially), with hash collisions left to the network to figure out (it's gonna figure out whether there's a collision on a fine layer through info from the coarser ones, I guess).
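If I read it right, the lookup on a fine level is something like the sketch below (pure illustration on my part: the table size is made up, and the primes are just the usual spatial-hashing constants):

```python
# Toy hash lookup for one fine resolution level: integer voxel coordinates are
# hashed straight into a fixed-size feature table, so memory no longer grows as res^3.
import numpy as np

TABLE_SIZE = 2 ** 14                        # far fewer entries than res^3
PRIMES = (1, 2654435761, 805459861)         # large primes commonly used for spatial hashing

def hash_index(x, y, z):
    """XOR the coordinate*prime products together, then wrap to the table size."""
    return ((x * PRIMES[0]) ^ (y * PRIMES[1]) ^ (z * PRIMES[2])) % TABLE_SIZE

# One learnable feature table per resolution level; coarse levels can be indexed 1:1,
# fine levels share entries (the hash collisions the network has to disambiguate).
feature_table = np.zeros((TABLE_SIZE, 2), dtype=np.float32)

res = 512                                   # this level's resolution
pt = np.array([0.13, 0.71, 0.42])           # query point in [0, 1)^3
vx, vy, vz = (pt * res).astype(np.int64)
feats = feature_table[hash_index(vx, vy, vz)]   # concatenated with other levels, fed to the MLP
```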
(Relatively) few parameters + GPU-friendly data structure = fast training. Tempted to try and implement this myself...
I think the key here is that e.g. surface information only grows at an O(N²) rate, whereas the number of grid points scales as O(N³). The hash function approach means your arrays will be densely filled with detailed information, whereas even a coarsely sampled regular grid would still leave most of the array holding "nothing here" information.
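Rough numbers for N = 128, treating "detail" as a shell of roughly 6·N² voxels around a surface (a crude assumption, but it shows the gap):

```python
# Back-of-the-envelope sparsity of a dense grid when detail lives on a surface.
N = 128
total_cells = N ** 3                 # 2,097,152 voxels in a dense grid
surface_cells = 6 * N ** 2           # ~98,304 voxels if detail hugs a surface
print(surface_cells / total_cells)   # ~0.047, i.e. >95% of a dense grid says "nothing here"
```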
Your comment made me realize that I forgot to mention the multi-resolution aspect of their hash encoding (there are several data arrays corresponding to different resolutions - coarse ones are 1:1 indexed but finer ones have hash collisions for the network to deal with). It's in the title, but I should still include it.
Why? The point of research is to push the limits of what's possible, not to build something that runs on every single platform.
I find it remarkable that most recent deep learning papers release the source code needed to reproduce their result -- and even more remarkable that many papers, like this one, can be reproduced on hardware that a hobbyist can afford.
And if you'd like this to run on a CPU, you're welcome to port it. The code is open source after all.
The reason they run on a GPU isn’t spite. It’s because the work for neural net based ML is inherently dependent on vast amounts of independent floating point operations.
CPUs tend to have very few FPUs per core, so you max out a modern system's CPUs' idealised throughput at maybe 40-80 concurrent streams. On top of that, the FPUs on a CPU are generally required to perform fully compliant IEEE 754 arithmetic at at least 32 bits of precision.
Modern GPUs can have that number of FPUs per hardware thread, and then have a few hundred of those hardware threads. Each of those GPU FPUs is also faster, as it can both elide some elements of IEEE 754 and operate at lower precision (fp16) to get even more performance.
So you could read the paper and implement it on a CPU, and the very best that you, or anyone, could do would be literally orders of magnitude slower than the GPU implementation.
That’s why you don’t see them doing it on a CPU, let alone in Python.
The reason the research is coming out of NVIDIA is that this kind of research is inherently GPU limited. So if it came out of AMD, Intel, Google, or Apple, it would be dependent on either a GPU or non-programmable NN-specific hardware. If it came out of academia it would still be on a GPU, because none of this is remotely practical on a CPU.
Well, if you're able to write your models in TensorFlow 1.15, we can shorten that list to: Windows 10 and Python 3.6+. Microsoft has done something quite interesting with tensorflow-directml [0, 1]. A friend is training convolutional networks on a Ryzen 5 3500U ultrabook at about the same speed my old notebook with a GeForce 940MX could manage. I'm tempted to test it on a 4600H when I have a bit of time; it could be interesting if the iGPU is able to access a large portion of the 24GB of RAM that system has.
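If anyone wants to poke at it: as I understand it, tensorflow-directml is installed in place of the normal TensorFlow 1.15 wheel and existing TF 1.x code runs unchanged. Something like the snippet below (untested on my side) should show whether a DirectML device is being picked up:

```python
# Quick check after `pip install tensorflow-directml` (replaces the regular TF 1.15 wheel):
# existing TF 1.x code should run unchanged, with the DirectML device listed alongside the CPU.
import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)                       # expected to report a 1.15.x build
for d in device_lib.list_local_devices():   # look for a DML-backed device entry
    print(d.name, d.device_type)
```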
Machine learning research often scales up to solve a new problem, and then scales down the solution until it's actually usable. Object detection, for example, is now fully usable on a phone CPU.
I'm sorry, but if you're doing GPU calculations then you want a powerful video card, unless your research is on improving the performance of algorithms on less powerful hardware. There are only so many hours in a day.
Everything is written in Python because early on in the process people realized that you are not doing anything special and are certainly not doing actual math: you are just wiring together libraries that you barely understand, using a cookbook that someone else provided, to generate results that you cannot explain, so that an investor can tick a checkbox for a feature list that you never see. If you are just gluing together C/C++ libraries then there are worse languages that could have been selected, but once momentum gathered behind Python as the glue language it was hard to divert to another one (e.g. see how hard the Julia folks are trying, and failing, to do just that...)
To be fair, almost every deep learning paper that comes out needs something like 10x GPU cloud nodes to run on.
The days when you could run anything significant on a single $1k graphics card are long gone.
This is, ironically, the first time (that I'm aware of) that you could distill this NeRF stuff down to a size that runs on a single consumer GPU (RTX 20-series or higher)
…so, some of your points are fair, but hey, at least these folk are trying to bring this down from “only usable by large corporations” to “runs on your desktop”.
I mean, it’s not perfect, but I think in this case you’re complaining about something abstract, when these folk are actually going in the right direction.
I like FOSS a lot. Normal programming languages have relatively small downloads and run on normal CPUs dating about 10 years back with almost no issue.
GPU workloads always want some odd driver that has a gigantic download, and they're constantly coming up with new reasons to force you to the newest APIs, which means you have to buy new chips that have the right architecture or firmware for the new APIs.
So I have to buy this co-processor, and then I can't even treat it like a black box that I send commands to, I need a gigabyte-scale SDK or something to issue the commands on my behalf.
I can't stand it. It's as if there was a tiny window when programming was simple, after I learned about FOSS, and before GPGPU caught on. As if the personal computer really will turn out to have been a fad.
Ok, GPGPU isn't "general purpose" in the basic sense; it means "not just graphics". No CPU is going to be able to get performance in NNs that matches that of a GPU. A CPU simply cannot do the work. The closest a "general" CPU gets to that kind of thing are the big vector machines like the old Crays or Itanium's EPIC architecture. Programming for either of those architectures is non-trivial, and for normal software those architectures are slower than normal CPUs.
Despite the trade-offs those systems made, consumer GPUs ended up with better performance, because a lot of the things a general CPU has to do interfere with the performance of pure numerical computation.
For some additional context, when the original NeRF paper (https://arxiv.org/pdf/2003.08934.pdf) was published 2 years ago, it reportedly took at least 12 hours (depending on hardware used of course) to train on the scene with the bulldozer. This has now been reduced to about 5 seconds (!), with realtime rendering of the result.
The gigapixel example could be done with Fourier features, which takes a few minutes to train (on Colab-like resources). Definitely still a huge improvement though (and one based more on clever hashing techniques than on optimization).
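For anyone curious, the Fourier-feature trick referenced here is basically: project the 2D pixel coordinates through a random Gaussian matrix and feed sines/cosines of the result into a small MLP. A minimal sketch (sizes and sigma are illustrative):

```python
# Minimal random Fourier-feature encoding (Tancik et al. style); sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
num_features, sigma = 256, 10.0
B = rng.normal(0.0, sigma, size=(num_features, 2))   # random projection for 2D pixel coords

def fourier_features(xy):
    """Map (N, 2) coords in [0, 1]^2 to (N, 2*num_features) sin/cos features."""
    proj = 2.0 * np.pi * xy @ B.T
    return np.concatenate([np.sin(proj), np.cos(proj)], axis=-1)

coords = rng.random((4, 2))
print(fourier_features(coords).shape)                # (4, 512) -> input to a small MLP
```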
Why not billions of triangles? Unreal is betting on Nanite because triangles have so many nice properties in addition to having the whole art pipeline already set up.
(I could not get the URL to load. Maybe HN hugged it)
Triangles have no volume and no diffraction occurs inside them as it does with Platonic solids. The idea is that real-time raytracing will allow complex variations and interactions of "Platonic dust particles" and the rays bouncing and refracting between and in them. It would be a more expressive "clay" for the AI to tinker with than triangles - the orientation/color/transparency changes of each solid will be able to elicit more visual effects than doing it with flat triangles.
Neural rendering? I doubt it. Check out Deep Learning Super Sampling (DLSS) from NVIDIA though, which has to be plumbed into the game itself to enable it.
This is probably going to fight virtual geometry tech like Unreal's Nanite, which is still using triangles but using clever automated LoD and GPGPU rasterization so that rendering e.g. 20 million pixel-sized triangles is fast and looks just as good as rendering a trillion triangles. (normally very small or thin triangles are a pathological case for hardware rasterizers)