Show HN: Real-Time 3D Gaussian Splatting in WebGL (antimatter15.com)
309 points by antimatter15 on Sept 11, 2023 | 59 comments



This is really cool! The control scheme is confusing though. Instead of the typical WASD for moving and using the mouse to look around, dragging the mouse moves forwards and backwards and orbits around some point, A and D strafe, while W and S look up and down.

EDIT: Looks like a full list of controls is in the readme: https://github.com/antimatter15/splat#controls


Author here - I'm sorry about the camera controls! Happy to accept pull requests that replace it with something more sensible.

The original idea was to be able to navigate around with just the arrow keys (conceptually, by turning yourself around in place and walking back and forth).


This is insanely cool!

If you integrate this with ThreeJS you'd have a lot of control options for free!

Whilst you're here, I have a question for you: it seems like you don't render real Gaussians (I see sharp edges in many cases). Is this a bug on my side, or is this an optimization made to be able to run fast? I created an issue to discuss, if you prefer: https://github.com/antimatter15/splat/issues/2


If you do an update, consider this a vote for WASD + mouselook. It's a ubiquitous scheme among everyone with an interest in real-time computer graphics.


No need to apologize, it's a minor thing! Anyway really neat stuff and I love seeing it here.


It's very similar to the N64 FPS controls (e.g. GoldenEye): arrow keys (joystick) for the "primary movements" of forward/backward and yaw, with which you can move and look anywhere in a 2D space. Then, WASD (C buttons) for the "secondary movements" of strafe and pitch.


It's pretty telling that you had to reach back 26 years to find a control scheme that could be used as an analogy. I don't even know where to start with the mouse controls. Up/down translation lock after right click; reverse yaw. Thing is a test of patience!


It actually seems to be the FreeCAD control scheme almost verbatim. I always hated that thing and its insistence on not providing any way to orbit around the up vector.

Like, are there people whose heads don't rotate on-axis around their necks, but instead end up sideways and rolled when they turn, to whom this makes perfect sense? I can't see any other explanation.


FWIW, OP, I liked the control scheme a lot (using mouse only).


Being brutally honest here: I just can't get over the control scheme enough to even appreciate the rendering demo. It is unusably unintuitive and awful.


Really cool! I am also working on a port of gaussian-splatting [0], but to WebGPU.

Like all the other implementations I have seen so far, this one makes the same mistake when projecting the ellipsoids in a perspective projection: first you calculate the covariance in 3D and then project that to 2D [1]. This approach only works with parallel / orthographic projections, and applying it to perspective projections leads to incorrect results. That is because perspective projections have three additional effects:

- Parallax movements (that is, the view plane moving parallel to the ellipsoids) change the shape of the projected ellipse. E.g. a sphere only appears circular when it is in the center of the view; once it moves toward the edges it becomes stretched into an ellipse. This effect is manually counterbalanced by this matrix, I believe [2].

- Rotating an ellipsoid can change the position it appears at; in other words, it creates additional translation. This effect is zero if the ellipsoid has one of its three axes pointing straight at the view (parallel to the normal of the view plane). But if it is rotated 45°, the end of the ellipsoid that is closer to the view plane becomes larger through the perspective while the other end becomes smaller. Put together, this slightly shifts the center of the appearance away from the projected center of the ellipsoid.

- Conic sections can result not only in ellipses but also in parabolas and hyperbolas. This, however, is an edge case that only happens when the ellipsoid intersects the view plane, and it can probably be ignored since one would clip away such ellipsoids anyway.

The last two effects are not accounted for in these calculations in any of the implementations I have seen so far. What would be correct to do instead? Do not calculate the 3D covariance. Instead, calculate the bounding cone around the ellipsoid with its vertex at the camera position (the perspective origin). Then intersect that cone with the view plane; the resulting conic section is guaranteed to be the correct contour of the perspective projection of the ellipsoid.

[0]: https://github.com/graphdeco-inria/gaussian-splatting [1]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2... [2]: https://github.com/antimatter15/splat/blob/3695c57e8828fedc2...
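
For what it's worth, projective geometry gives a closed form for that last construction (this is my addition, not something from the linked code): writing the ellipsoid as a quadric with dual Q* and the pinhole camera as a 3x4 matrix P, the exact outline is the conic C whose dual is

  C^* = P \, Q^* \, P^\top

so no linearization of the projection is needed to get the correct contour.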


In general, a Gaussian is no longer a true Gaussian after camera projection since the pinhole camera projection function is nonlinear (due to dividing by z). However, if the Gaussian is small relative to the size of the image, you can approximate it by linearizing the projection function. Therefore the Gaussian splatting paper uses the Jacobian of the projection function, as described in equation 5 of the paper [0]. In practice, this approximation is extremely good. This Jacobian is the matrix you mentioned in the third link, and it is mathematically sound, not "manually counterbalanced". For a derivation, see [1].

[0] https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_...

[1] https://math.stackexchange.com/a/4716514/43771
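
As a concrete sketch of that linearization (illustrative code, not the repo's actual shader; it assumes the splat mean (tx, ty, tz) and covariance sigma are already in camera coordinates, i.e. the W from the paper has been applied, and fx, fy are focal lengths in pixels):

  type Mat3 = number[][]; // plain row-major 3x3 matrices for readability, not speed

  function transpose(m: Mat3): Mat3 {
    return m[0].map((_, j) => m.map(row => row[j]));
  }

  function mul(a: Mat3, b: Mat3): Mat3 {
    return a.map(row =>
      b[0].map((_, j) => row.reduce((acc, v, k) => acc + v * b[k][j], 0))
    );
  }

  // Linearized projection from equation 5: Sigma' = J * Sigma * J^T, with J the
  // Jacobian of (x, y, z) -> (fx * x / z, fy * y / z) evaluated at the splat mean.
  // The third row of J is padded with zeros; only the 2x2 image-plane block matters.
  function project2DCovariance(
    sigma: Mat3, tx: number, ty: number, tz: number, fx: number, fy: number
  ): [number, number, number] {
    const J: Mat3 = [
      [fx / tz, 0, -fx * tx / (tz * tz)],
      [0, fy / tz, -fy * ty / (tz * tz)],
      [0, 0, 0],
    ];
    const cov = mul(mul(J, sigma), transpose(J));
    // Upper-triangular entries of the 2x2 screen-space covariance.
    return [cov[0][0], cov[0][1], cov[1][1]];
  }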


I read the paper and I am aware that the Gaussian projection is an approximation anyway (hence I spoke about ellipsoids, not Gaussians). Still, one could at least aim to get the iso contour right, and yes, using the Jacobian matrix is not unsound, just incomplete. As I said, this approach cannot produce the distinctive "wiggle" that you get from rotating an ellipsoid while staring dead center at it.


True, it is an approximation after all. But it is a useful approximation since the main advantage of Gaussian splatting is the speed.


Yeah, I think you're right: they're pretending the projection is a linear transformation (in Cartesian coordinates) and using it to transform the Gaussian.

Or viewed alternatively they're approximating the projection by assuming all of the Gaussian is at a fixed depth, which I suppose works if it is far enough away.

A projective transformation of a Gaussian seems somewhat annoying, though I assume someone will have done it before. It seems like it should be possible with projective coordinates, but the final projection back to Cartesian coordinates is tricky.

For what it's worth, projecting a contour is also wrong; the whole density changes, which also affects the contours.


Hi. I'm not very familiar with the Gaussian splat technique, but aren't they essentially quads with some intrinsic data in the vertices? I thought projecting quads was already a solved problem. Could you elaborate on how this differs from a simple array of quads? Thank you.


If you can implement the intersecting bounding cone idea without impacting frame rates, that's going to be even smoother on WebGPU, but it would be interesting to see the difference apples to apples with this type of implementation.


Dynamic? I.e. video?


When you zoom out there's lots of visible polygon edges that don't look like they should really be there, as if it's trying to draw soft 'blobs' but the texture coords aren't quite right? Is that a bug or an intentional part of the technique?


Intentional.

Basically it's a semi-dense point cloud [1], but instead of a point, there is a blob which has been coloured, angled and scaled to match the input picture. This means they are optimised to be viewed from a certain distance.

Think of it like a 3D vector drawing: if you zoom in too much, or pull one part away, it all starts to look a bit funky.

[1] https://www.researchgate.net/publication/326621750/figure/fi...


So far I have only seen Gaussian splatting used on photographic data. Would it make sense to use it for other graphics data, too? In other words, does it have potential to be used in games?


Depends. Radiance field approaches (like Gaussian splatting) are basically 3D photos. They only capture color at geometry (position and direction), but have no concept of surfaces, materials and light transport in general (emission, absorption, transmission, reflection, scattering, etc.). In other words, they can only do static scenes (no animations) with pre-baked lighting.

The industry seems to be trying to move away from this with things like PBR (physically based rendering) and ray / path tracing, which enable far better dynamic lighting.

Also, they are extremely space inefficient at the moment. A scene that would take a good traditional rendering engine a few dozen GB would instead take terabytes. Though, that might improve in the future with more optimization.

One exception to the above, where gaussian splatting might be interesting to see is procedural / generated content (possibly even animated). Especially for volumetric effects which currently use particle systems, like smoke, fire, clouds, flowing water, etc.


I thought I understood that the specular highlights and view-dependent color problems you mention are massively improved by adding spherical harmonics to each ellipsoid?


Sure, why not? It's just a fancy point cloud. I can easily imagine an open world Minecraft-esque game that uses this for its base engine instead of voxels.


Would this technique work for video? The readme of the INRIA work [1] seems to imply a model is trained per static scene; does that rule out video?

[1] https://github.com/graphdeco-inria/gaussian-splatting


It's already a thing [1]. They also have a project website [2] with some nice videos, although the code hasn't yet been released.

[1] https://arxiv.org/abs/2308.09713

[2] https://dynamic3dgaussians.github.io/


The recording from the point of view of the football they were tossing at each other made me feel things. My friend mentioned that it's like the 'braindance' from Cyberpunk 2077.


Wow - it didn't occur to me but it feels exactly like a braindance.


What am I looking at?


Gaussian splatting is a fancy word for a point cloud, but with coloured shapes instead of points.

It's been around for ages, but it was never used because if you have a million points in a point cloud, you'd need to artistically manipulate a million points.

It's like 3D hair: in principle it's pretty simple, just render a billion hairs, but in practice it's hard to make it look good.

Here we tell a machine learning model to adjust the angle, colour, shape and size of a million primitives (i.e. a square, circle, triangle, etc.) so that it looks like the photos we provide.
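
In data terms, each of those primitives boils down to a handful of numbers, roughly like this (a sketch; the field names are illustrative, not the actual file format):

  // Rough shape of one "splat"; real formats pack these into a compact binary layout.
  interface Splat {
    position: [number, number, number];          // where the blob sits in space
    rotation: [number, number, number, number];  // quaternion giving its orientation
    scale: [number, number, number];             // how stretched it is along each axis
    color: [number, number, number];             // base RGB (view-dependent terms come extra)
    opacity: number;                             // how see-through it is
  }

The training step then nudges these numbers, across millions of splats, until renders of the cloud match the input photos.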


It's a little bit more than that. Gaussians are view-dependent, which means that they can capture the full radiance field of the scene, rather than just the color and geometry of the objects. All the light bouncing around from different objects can be reproduced, including reflections etc.

See the reflections here: https://www.youtube.com/watch?v=mD0oBE9LJTQ

This is also pretty good, but more subtle: https://www.youtube.com/watch?v=tJTbEoxxj0U
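
In the original method, that view dependence is stored as per-splat spherical harmonic coefficients and evaluated along the viewing direction. A minimal sketch with just the degree-0 and degree-1 terms (sign and ordering conventions vary between implementations):

  // View-dependent color from low-order spherical harmonics.
  // sh[0] is the view-independent (DC) RGB term, sh[1..3] are the degree-1 terms;
  // dir is the normalized direction from the camera to the splat center.
  const SH_C0 = 0.28209479177387814; // 1 / (2 * sqrt(pi))
  const SH_C1 = 0.4886025119029199;  // sqrt(3 / (4 * pi))

  function shColor(
    sh: [number, number, number][],
    dir: [number, number, number]
  ): [number, number, number] {
    const [x, y, z] = dir;
    const channel = (i: number) =>
      0.5 + SH_C0 * sh[0][i]
          - SH_C1 * y * sh[1][i]
          + SH_C1 * z * sh[2][i]
          - SH_C1 * x * sh[3][i];
    // Results are usually clamped to [0, 1] before blending.
    return [channel(0), channel(1), channel(2)];
  }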


This implementation does not support view-dependence though (mentioned in the readme)


> Gaussians are view-dependent,

Indeed, but that's just adding view-dependent points.


My initial understanding is these scenes can’t be made dynamic (animated, physically responsive). Is that correct?


Basically this: https://github.com/graphdeco-inria/gaussian-splatting (a somewhat different approach to rendering 3D scenes).


A thought occurred to me: is this similar to how Media Molecule's Dreams on the PS4/5 renders its scenes?


> Media Molecule “Dreams” has a splat-based renderer (I think the shipped version is not purely splat-based but a combination of several techniques).

From: https://aras-p.info/blog/2023/09/05/Gaussian-Splatting-is-pr...

Good eye


Interesting video here (from 2015) on the development of the Dreams rendering tech: https://www.youtube.com/watch?v=u9KNtnCZDMI


Also maybe worth mentioning that the 2nd author of the InstantNGP paper is also a cofounder of and lead tech guy at Media Molecule, Alex Evans. I've been a huge fan of his work since 90s demoscene and briefly got to work with him at Lionhead Studios, guy's a legit genius.


Does this use the method proposed by Kerbl and Kopanas at SIGGRAPH 2023?

https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/


Yes, but this is just the splatting/rendering part and not the optimization part that generates the reconstruction in the first place.


This is beyond cool. Point clouds are one thing but this… this is amazing. Kudos and great job. It even runs on my work Lenovo at 60fps.


It runs on my mid range phone at 36fps. Did not expect that.

Lots of artefacts though, especially if I move the camera.


those artifacts are part of the algorithm.


Wow this is insanely cool.

If you make it work within ThreeJS, you're going to leave a trace in the history of 3D on the web with that stuff!


I've never experienced this set of mouse controls for a 3D view before and was highly confused for a bit.


Very impressive! Curious what the frame rate would be like for stereoscopic rendering of the same scene on the same hardware. Are there optimizations to be had past the halfway mark?


Definitely. One of the time-consuming parts of rendering is sorting the gaussians by distance to the camera, which for two nearby cameras could be optimized. This also goes for adjacent frames: assuming smooth motion, I'm pretty sure there is some speedup to be had by assuming the previous sort will be close to correct rather than starting from scratch each frame.
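
A sketch of that last idea (hypothetical helper names; real implementations often use a counting sort on quantized depths instead):

  // Refresh a back-to-front ordering of splat indices, starting from the previous
  // frame's order. Insertion sort runs in roughly linear time when the input is
  // already nearly sorted, which holds for small camera motions; after a large
  // jump you would fall back to a full sort.
  // depth(i) is assumed to return the current camera-space depth of splat i.
  function resortByDepth(order: Uint32Array, depth: (i: number) => number): void {
    for (let i = 1; i < order.length; i++) {
      const idx = order[i];
      const d = depth(idx);
      let j = i - 1;
      // Shift splats that are now nearer than idx one slot toward the back,
      // keeping farthest-first (back-to-front) order.
      while (j >= 0 && depth(order[j]) < d) {
        order[j + 1] = order[j];
        j--;
      }
      order[j + 1] = idx;
    }
  }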


Fancy! I like that on mobile I can drag to move around!


Gotta try it on a PC later; for some reason, on iOS the cloudiness feels more like NeRF than Gaussian splatting to me.


Is it possible to increase the number of points (resolution) with some setting? I want to see a more refined view on a higher-end machine.


Click through to the GitHub repo for a list of the controls (I didn't think to try spacebar!) and links to other example scenes.


Gaussian splatting is the new sensation of the summer in the 3D scanning field. Will it live up to its expectations?


Wow. I was literally just working on my own implementation. You beat me to it! Great work!


Why the hell is it that everyone who makes these "clever" demos provides the world's shittiest camera that adds unwelcome roll?

late 90s bedroom me is shaking his head.


Euler angles aren't cool anymore.


Can't wait to pull this up on my desktop tomorrow.


Runs fine on my 2016 iPhone SE. Kudos.


Last sentence of the readme…!




