Astounding, but I can't help but be underwhelmed by the results. Look at the Shelby Cobra dropped in at 1:01 [0] - it's a vaguely grey car-shaped thing, like all of the rest of the models, that captures little of the source image.
It seems like an April Fool's joke to me. The dramatic narration overlaid on lumpy terrible-looking cars made me laugh out loud several times. I mean, I'm sure what they're doing is technically impressive but the results are downright awful. All the cars look like they're beaten to hell and back and I honestly wouldn't be able to pick "KITT" out of a lineup of any other black car if it weren't for the cylon light on the front.
I was thinking that the results looked pretty bad, until the video mentioned storyboarding and it clicked for me. Even as low-quality as the models are, they can still work great as placeholders. So this would allow very quick iteration on designing an overall scene, pulling items from reference images from Google searches etc., and when you are happy with the result you can send it to an army of 3D modelers (or designers or whatever) to be built.
So the comparison point should probably be the grey boxes or whatever people have been using as placeholders/throwaways so far, and compared to that, these generated models actually look pretty great.
Definitely not great... but I bet that in a few years it will be shockingly good. Experts have probably been working on it for years, and now that they've chosen a sufficient model structure with effective convolutions and whatnot, it's just a matter of tweaking the parameters and feeding it more and more data (3D scanning tools are making this easier as time goes on, too).
That’s just the “texture” view. There’s also the parts and materials view that helpfully segmented the various pieces of the car.
I’m not sure why they chose to show off the “texture” view without even explaining what it is.
Also, while it may look iffy right now, it seems they’ve done the heavy lifting. A round of polish, and in a couple of years nobody will be laughing at this. (I’m already impressed, personally.)
The heavy lifting is the fine detail in the 3D models, which isn't done yet. Vaguely car-shaped blobs that don't even respect the original general shape of the car (e.g. the Cobra or the Toyota SUV) but have the correct texture extrapolated onto them are nifty, but nothing more.
Yeah...that looks...frankly worse than the kind of thing I did in my undergrad 3d graphics class, where we did some edge detection, tessellated it, and projected it into 3D space.
>This gives creators of any skill level the ability to easily generate models for diverse uses - storyboarding, pre-vis, game content, architectural models, and more.
It sounds like they expect you to use it when you just need a vaguely correct shape anyway, at least as a starting point for further refinement.
They turned KITT into a blurry 3d blob, poorly lit to hide the flaws. Barely qualifying as a full 3d model. "Black car" is the appropriate name for the result, nothing more.
They disabled comments on the article, and on the youtube video. Combined with YouTube's newly hidden dislike ratings, we can't know how many viewers are unimpressed. Hype has more breathing room than ever before.
I don't get the negative comments. The results look alright, but they lack detail at this point. This will improve over time, and there are probably a lot of applications that don't need this level of detail anyway. I'd guess this will put some 3D modelers out of a job x years into the future - classic disruptive tech stuff!
I like to take a line from Two Minute Papers[1] - if this is what one "paper" was able to produce, imagine what the improvements will be two "papers" down the line.
Same. These results are far better than I (with little 3D modeling experience) could generate.
A pro could do better, but a) do you have a pro available? and b) can they afford the time to model all of the items you want to throw into your mockup?
Please let me know if this already exists, but: video games are reaching huge file sizes, in part due to HD graphics. Why not just generate, say, the faces of pedestrians on the fly? Is it important that some character in my copy of the game looks identical in all copies of the game? I would argue it just needs a unique face to go with the name.
I've seen some work on, and expect some games have used, a similar method where they'll have a bidirectional process to generate something from a seed, AND create a seed after having crafted something, allowing them to mix and match, and still achieve a super high compression rate (as, at the end, all that is stored for each face or other element is the seed that generates it).
Certainly, there are similar things in user space where you can share your creations from some games using a code that was generated or similar.
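A minimal sketch of that seed idea, assuming a hypothetical parametric face model (the parameter names here are made up): the only thing stored or shared per NPC is an integer seed, and every client deterministically regenerates the same parameter set from it.

```python
import random

# Hypothetical parameters of a parametric face model.
FACE_PARAMS = ["jaw_width", "nose_length", "eye_spacing", "skin_tone", "hair_style"]

def face_from_seed(seed: int) -> dict:
    """Deterministically regenerate a face's parameters from a stored seed."""
    rng = random.Random(seed)
    return {name: rng.random() for name in FACE_PARAMS}

# Only the seed needs to ship with the game or be shared between players;
# each client reconstructs an identical face from it.
npc_seed = 123456789
face = face_from_seed(npc_seed)
print(face)
```

Going the other way (recovering a compact seed or code from something a player has hand-crafted) is the harder half, which is presumably why those share codes tend to encode the parameters directly rather than a true seed.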
Well, there's Minecraft. Its worlds are huge and procedurally generated.
As to faces, Skyrim does that. There aren't unique face meshes for every NPC, there's some base meshes and various parameters to adjust texture and shape.
I mean is file size really a concern? I would rather have a 300 GB game with something like quixel megascanned assets than procedurally generated crap.
The latter sounds nice, but in practice game execs just use that to cut corners and you end up with an unrewarding grind fest of a game.
Aesthetics and gameplay are completely separate topics. Procedural generation can increase the visual quality without having anything to do with how you play.
For example, increasing background details (more pedestrians, cars, traffic, sounds, etc) in a city scene can greatly add to the atmosphere regardless of your specific gameplay loop.
Spoken dialog is a killer. Studios have to contract voice actors and then it's very difficult to make tweaks later. Why not fancy text to speech with variables for intonation, smoker/nonsmoker, age, gender, accent, etc.
Going even further, things like GPT can generate text given an input. It's beyond me, but couldn't something similar be used to describe environments to be generated on the fly, or to generate NPC stories on the fly given inputs?
I think a lot of this tech is coming together and games are about to get a whole lot larger, more immersive, and cheaper to produce. Even just replacing voice actors would be a huge leap.
There are games that completely waste their space, but we can’t lump them all together. Some create vast detailed and beautiful worlds from their storage budget. Some of them, however, are large without any compelling reason to be, with many gigabytes of wasted space on assets that are barely used.
There's a trade-off between processing time and storage space in pretty much everything to do with games. For example, the lighting in a scene is often precomputed offline and then baked into a texture which ships with the game. This increases the speed at which a frame can be rendered, but also means that your game just gained another few MB in weight. A similar thing happens with animations of clothing: these are often physically based simulations done offline which are then baked in (again, textures are often used to store the information, mapping RGB to XYZ).
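As a rough illustration of that RGB-to-XYZ baking (a sketch, not any particular engine's format; the bounds and layout are assumptions), suppose per-vertex offsets were normalized into [0, 1] and written into a texture's colour channels offline. The runtime then just reverses the mapping:

```python
import numpy as np

# Assumed world-space bounds used when the offsets were baked offline;
# real pipelines would store these alongside the texture.
BOUNDS_MIN = np.array([-1.0, -1.0, -1.0])
BOUNDS_MAX = np.array([ 1.0,  1.0,  1.0])

def decode_baked_offsets(texture_rgb: np.ndarray) -> np.ndarray:
    """Map normalized RGB values (0..1) back to XYZ offsets in world units."""
    return BOUNDS_MIN + texture_rgb * (BOUNDS_MAX - BOUNDS_MIN)

# One texel per vertex per animation frame; here, a single frame of 3 vertices.
frame = np.array([[0.5, 0.5, 0.5],   # decodes to ( 0,  0, 0)
                  [1.0, 0.5, 0.5],   # decodes to ( 1,  0, 0)
                  [0.5, 0.0, 0.5]])  # decodes to ( 0, -1, 0)
print(decode_baked_offsets(frame))
```

The real decode usually happens in a vertex shader, but the arithmetic is the same: the texture is just a cheap, GPU-friendly container for precomputed XYZ data.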
If you were to generate faces "on the fly" as a level streamed off disk, that would likely eat too much of your frame budget and your game would grind to a stuttering halt. The other way of doing it (and this is generally how games which use procgen work) would be to have a precompute step when you first install or launch the game. This might give you a long install or launch time and would still mean a huge install size, but you would avoid taking up a chunk of your frame budget. Or you could do some clever scheduling and level design where you force the player through "tunnels" between areas, which gives you enough time to generate a few new faces and load in assets before they enter an area (see the sketch below). This is a pretty common technique where you need to load in a bunch of assets or do some heavy calculations. The Witcher 2 had an interesting and slightly buggy version of this: whenever Geralt opened a door to go from outside to inside, the camera would swing around so you couldn't see inside the building, a second-long animation would play, and then the camera would turn back so you could see into the now fully loaded interior of the building. It didn't work properly, but it was a good idea.
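A toy sketch of the frame-budget idea (the budget value and function names are made up): queued faces are generated only while there is time left in the current frame's slice, so the work is spread across the "tunnel" section instead of causing a hitch.

```python
import time

FRAME_BUDGET_MS = 2.0  # assumed slice of each frame reserved for procgen work

def generate_face(npc_id: int) -> dict:
    """Stand-in for the expensive face-generation step."""
    return {"npc": npc_id}

def process_generation_queue(queue: list, budget_ms: float = FRAME_BUDGET_MS) -> list:
    """Generate as many queued faces as fit within the per-frame budget."""
    done = []
    start = time.perf_counter()
    while queue and (time.perf_counter() - start) * 1000.0 < budget_ms:
        done.append(generate_face(queue.pop(0)))
    return done

# Called once per frame while the player walks through the loading corridor.
pending = list(range(100))
finished_this_frame = process_generation_queue(pending)
```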
You could do what ms flight does and have a locally installed low resolution world and a high resolution streamed world. Adapting this you could stream in procgen'd faces and other items generated offline. I think this would work rather nicely.
Lastly, I know it was just an example, but faces are probably the last thing you want to procgen without passing the results by a human filter, as humans are incredibly sensitive to them looking "off". If all your characters end up looking grotesque and off-putting, you might be shooting yourself in the foot. See the furore about eFootball's terrifying crowd models.
I think most open world AAA games have done this for over a decade. They either vary a base face model randomly, or mix from hundreds of variants, hairstyles, clothing options, accessories etc. Literally no one is making 400 unique models for a crowd of 400.
Yes! The tech mentioned in the article solves this problem: no more "we have to remove/compress all of these models/textures because our game can't be 300GB". Just use 10 high-res pictures of the object from all angles instead of a model. This could save orders of magnitude of disk space
It also solves the problem of "we have to choose what objects to model because our 3D artists don't have infinite time". Now the artists review GAN output and tweak what the algorithm gets wrong, creating tons of content without needing to create every vertex from scratch. Very exciting
Not sure if "10 high-res pictures of the object from all angles instead of a model" saves any disk space at all, considering models in games are usually just one high-res picture UV-wrapped onto vertices.
Also, creating new content from a GAN might not be super appealing for the artistic qualities of the game. Why not do what Escape from Tarkov seems to do with photogrammetry on real objects? Many props like soup cans, couches, whole environments, and weapons can be acquired in the real world reasonably easily.
It’s great for Indie game devs who just want quick assets and don’t want to go through the hassle of finding some artist and paying obscene money for a couple assets. My inability to get good consistent artwork is what has always demotivated me from actually finishing any kind of game.
I remember as a kid messing around with early desktop-computer photogrammetry software: the kind where you have to print out some black-and-white marker circles and place them around some trinket you want to capture in 3D.
...it worked, but the output *.3ds files were low-resolution, the textures only looked good from one direction, and the material's lighting was way off.
So the concept is nothing new, but it's never as simple as taking even a bunch of photos, let alone 1 photo - there's so much information necessary for future renders that cannot be captured from photos alone.
Curious: does anyone know of software that could generate body sizing for an individual based on image stills or video taken from a normal mobile phone? Essentially creating a 3D human model, but with a high degree of accuracy.
I don't know of one yet. Lots of people are working on it, using various strategies. The holy grail would be providing a virtual fitting-room experience, or monitoring individuals' fitness.
A picture always has scale uncertainty (a 2-meter human viewed from 1 meter away looks the same as a 1-meter human viewed from 0.5 meters away), so that is an additional problem that must be taken care of.
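A quick sanity check of that ambiguity under the standard pinhole model (the focal length here is an arbitrary made-up number; only the ratio matters):

```python
def projected_size(height_m: float, distance_m: float, focal_px: float = 1000.0) -> float:
    """Pinhole projection: image height in pixels is focal * (height / distance)."""
    return focal_px * height_m / distance_m

# A 2 m person at 1 m and a 1 m person at 0.5 m project to the same image size,
# so a single photo cannot tell them apart without extra scale information.
print(projected_size(2.0, 1.0))   # 2000.0
print(projected_size(1.0, 0.5))   # 2000.0
```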
But recent phones now have 3D sensors that provide information that could be useful.
To generate 3D human models there is also MakeHuman. In the old days there was software called FaceGen that could generate 3D face models and automatically fit their parameters to two pictures using an iterative refinement procedure.
Deep-learning approaches usually estimate everything jointly in a single step, so they are faster but often less accurate. But there are models that learn to refine a previously generated model, so you can apply them repeatedly and get improved quality (Denoising Diffusion Probabilistic Models are one generic class of models that does this).
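Sketching the "apply the refiner repeatedly" idea; `refine_model` is hypothetical, standing in for a single trained denoising/refinement step:

```python
def refine_model(mesh, step: int):
    """Hypothetical single refinement step, e.g. one denoising pass of a trained network."""
    ...  # placeholder: returns an improved estimate of the mesh
    return mesh

def iterative_refine(initial_mesh, num_steps: int = 10):
    """Start from a coarse single-shot estimate and refine it repeatedly."""
    mesh = initial_mesh
    for step in range(num_steps):
        mesh = refine_model(mesh, step)
    return mesh
```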
A commercial (and therefore far more expensive) version of this tool is https://www.capturingreality.com. Cool that Nvidia is starting to make this more accessible, even if the results aren't quite usable yet.
Wasn't there just a tool posted here to do this on a Show HN recently? I can't remember the name of it now. I believe you had to upload more than one image for it to work but they had an api.
Edit: after looking, this is what I was thinking of.
Yeah, much of human intelligence is also just statistics. We do a lot of unsupervised learning and reasoning based on correlation.
The opposite of this would be causal reasoning, but that is hard for humans as well. Not everyone can use causal reasoning; it requires many years of training. The discovery of causal mechanisms (science) is slow, and it takes many people to do it.
For example the president of Turkey believes interest rates should be lowered in a hyperinflation. What can we do? Causal reasoning is hard.
I think we do something else, slightly different from causal reasoning. We're generating hypotheses and testing them out, in an iterative process - like a GAN (generator + discriminator) or Actor-Critic (policy + value function). More famously, AlphaGo generated many rollouts for each move: one model to generate moves, another to evaluate their consequences.
[0]: https://youtu.be/gz5E9wszZSI?t=61