Given their setup of 120 to 160 cameras to capture the performances, the results seem kinda just "meh". While it's neat that they were able to generate 3D models from all that data, it's also not very surprising.
Maybe their data processing methods are more impressive than their end results.
It reminds me of the bullet time rigs used for The Matrix over 20 years ago, except I don't think those rigs were creating 3D models of people.
What is your opinion on the future of film and acting given the trajectory of AI and tech like this? Do you see significant changes to the way movies will be shot, or to the level of involvement of human actors in film?
Not really? Film acting is a delicate art. And it's 1000x faster than computer animation. You can shoot a cheap film in maybe 3 weeks, on a tight schedule, and the actors will be able to deliver what you need.
Animating the equivalent film would take somewhere between 1 and 3 years.
So no, I don't see a way that any of this will affect film actors. We already have fully functional digi-doubles; they just don't use NeRFs.
Pointing a camera at something or someone interesting is still somewhere around 1000x more efficient than computer animation. I don't see a near- or mid-term future where this changes.
I think the appeal of this NeRF tech is in the ability to use real captured imagery, like traditional filmmaking, AND the ability to reframe, refocus, and recompose shots, like in 3D scenes.
...yeah, but there was no interpolation happening in the original The Matrix. Every single frame we saw corresponded to a single camera that took a photo.
New works like this one allow interpolation.
It's fair to ask if this is similar to what we saw in the Matrix sequels, where I believe they used different solutions. But again, I think the Matrix sequels mostly used entirely CG actors...? Not interpolation of recorded acting.
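To make the interpolation point concrete: a NeRF is a continuous scene function, so rendering is just ray marching through a learned volume, and the virtual camera can sit at poses no physical camera ever occupied. Here's a toy sketch in NumPy, with a hard-coded sphere standing in for the trained network (nothing here is from the paper, just an illustration of the idea):

    import numpy as np

    def toy_radiance_field(pts):
        # Hypothetical stand-in for the trained NeRF MLP: returns per-point
        # color and volume density. Here it's just a solid reddish sphere.
        inside = np.linalg.norm(pts, axis=-1) < 1.0
        sigma = np.where(inside, 10.0, 0.0)                  # volume density
        rgb = np.where(inside[..., None], [1.0, 0.2, 0.2], 0.0)
        return rgb, sigma

    def render_ray(origin, direction, n_samples=64, near=0.5, far=4.0):
        # Standard NeRF-style volume rendering: sample points along the ray,
        # turn density into per-segment opacity, composite front to back.
        t = np.linspace(near, far, n_samples)
        pts = origin + t[:, None] * direction
        rgb, sigma = toy_radiance_field(pts)
        delta = (far - near) / n_samples
        alpha = 1.0 - np.exp(-sigma * delta)                 # segment opacity
        trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))
        weights = alpha * trans                              # per-sample weight
        return (weights[:, None] * rgb).sum(axis=0)          # composited color

    # The scene is a continuous function, so a "camera" is just a ray origin
    # and direction. x = -0.5 and 0.5 play the role of two physical rig
    # cameras; x = 0.0 is an in-between pose no camera ever occupied.
    for x in (-0.5, 0.0, 0.5):
        print(x, render_ray(np.array([x, 0.0, -3.0]),
                            np.array([0.0, 0.0, 1.0])))

A real NeRF swaps the hard-coded sphere for a trained MLP and adds view-dependent color and hierarchical sampling, but the compositing math is the same, which is why reframing and interpolating between rig cameras comes for free.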
Fair criticism, but there's not a lot of other foundational work in this space either; they need to capture and compute the "ground truth" in order to train and evaluate their models. That requires a lot more data than is used by the NeRF reconstruction.
The closest technology to this that I had previously seen was MS' volumetric video capture for "holoportation" and 3D video recording at their SF Reactor facility, and this seems like a significant improvement[0]. That also used an array of cameras in a studio setting, but the resulting color and model quality wasn't nearly as accurate and sharp as these examples.
Are there other examples of similar tech in use at a comparable level of quality in recent history? Compared to any end results I've personally seen, it looks very impressive.
Their results are clearly 'meh' given the number of inputs: in [0] they use only 2-4 cameras in real time to achieve the same or better quality (see their supplementary material), and [1] shows video view synthesis generated from 4 cameras.
Do we have the understanding of neuroanatomy today to make a Light Ocular-Oriented Kinetic Emotive Responses (L.O.O.K.E.R.) device? We certainly have the technology.