Take this idea, and apply it to Google Earth. Have it procedurally generate the rest of the data as I zoom in.
It's incredible that I can check out my small home town on Google Earth now entirely in 3D (which wasn't the case just a few months ago). Yet the trees/cars are still these blocky, low-resolution things sticking out of the ground. Imagine Google Earth procedurally generating a high-resolution mesh as I zoomed in. Train it with high-resolution photogrammetry of other similar locations for the ground truth, and let me zoom in endlessly in their VR app.
I like that idea as a separate program (or just in the VR app) but I think it would be confusing/misleading in Google Earth itself. At the very least there needs to be a clear user-facing indication as to which content is procedurally generated (ie fictional) versus photographed (ie real). Obviously there’s a grey area with image processing but I think there’s a real concern with prioritizing nice pictures over actual information.
Yeah, flying over the small rural farm I grew up on was disappointing, since Flight Simulator had procedurally generated a whole compound full of random buildings that made it look like some cult compound.
With that said, I bet this would choke on lots of actual Minecraft worlds, because people often build things using blocks where the semantics get thrown completely out the window in favor of aesthetics. Want a big mural on the wall? You're going to be building the wall itself out of differently-colored blocks of wool.
Maybe they'll solve that part one day :)
Edit: That said, it could choke in some really interesting ways...
Question for the experts: Is it possible that GANs will be used for rendering video game environments in the near-ish future? This has been one of my private predictions with respect to upcoming tech, but I'd love to know if people are already thinking about this, or alternatively, why it won't happen.
Non-real-time, as in generating levels for your game before release, is doable today. Real-time will probably be doable soon, especially for shared-world games where a server can generate for multiple players at a time rather than for a single player.
Even real-time should be doable today if you design your game with that in mind. You don't really need to generate all the textures, just a compact representation of the level, which is then rendered normally after the fact.
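As a rough illustration of what I mean (a toy sketch of my own, nothing to do with the paper's architecture; the tile vocabulary and the tiny untrained torch model are made up for the example): the network only has to emit tile IDs, and the engine renders those with its normal pipeline.

    import numpy as np
    import torch
    import torch.nn as nn

    N_TILES = 8      # hypothetical tile vocabulary: grass, water, stone, ...
    GRID = 32        # generate a 32x32 tile map, not pixels

    class LevelGenerator(nn.Module):
        """Toy generator: latent vector -> per-cell tile logits."""
        def __init__(self, latent_dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(latent_dim, 512), nn.ReLU(),
                nn.Linear(512, GRID * GRID * N_TILES),
            )

        def forward(self, z):
            logits = self.net(z).view(-1, GRID, GRID, N_TILES)
            return logits.argmax(dim=-1)   # discrete tile IDs, cheap to store and render

    # The expensive network runs once per level (or per chunk); the engine then
    # renders the tile IDs with its ordinary pipeline, textures and all.
    gen = LevelGenerator()
    tiles = gen(torch.randn(1, 64))[0].numpy()          # (32, 32) array of tile IDs
    print(tiles.shape, np.bincount(tiles.ravel(), minlength=N_TILES))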
Artistically, developers could do some trippy dream sequences with GANs, where the glitchiness and training artifacts add to the immersion. Because one can sample GANs or mix latent dimensions, the experience can be tailored individually, for instance based on the character's decisions.
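For concreteness, something like this (a hypothetical sketch; `generator` stands in for any pretrained GAN generator that maps a latent vector to an image, which is my assumption here, not an existing API):

    import torch

    def dream_sequence(generator, z_start, z_end, steps=30):
        """Blend two latent codes to get a smooth, dreamlike transition."""
        frames = []
        for t in torch.linspace(0.0, 1.0, steps):
            z = (1 - t) * z_start + t * z_end   # linear interpolation in latent space
            frames.append(generator(z.unsqueeze(0)))
        return frames

    # e.g. pick z_start / z_end based on the character's decisions:
    # frames = dream_sequence(generator, torch.randn(512), torch.randn(512))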
I'm not sure if this is what you're asking about, but here's a Two Minute Papers (dear fellow scholars!) video about a deep learning paper for super sampling with some applications for games: https://www.youtube.com/watch?v=OzHenjHBBds
Maybe I'm just too dumb, but I wish these papers would cut the nonsense and explain the key elements in layman's terms with simple examples. I'm super curious how you can do something like this in a fully unsupervised fashion, but "Hybrid Voxel-conditional Neural Rendering" doesn't mean much to me. Maybe if I knew what "voxel-bounded neural radiance fields" were...
The "voxel-bounded neural radiance field" part matters because neural radiance fields (NeRF) are prior research this builds on. At a very high level it's just voxel data to image generation using some form of neural nets. I didn't look at the paper, but I'd hope it summarizes neural radiance fields; if not, it will at least cite them, and then you can read that paper and see how this one extends the work.
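If it helps, the core NeRF trick (my paraphrase, not this paper's code) is: march a ray through the scene, ask a neural net for density and color at each sample point, and alpha-composite the samples into a pixel. The `field` below is a dummy stand-in for the trained network.

    import numpy as np

    def render_ray(field, origin, direction, near=0.0, far=4.0, n_samples=64):
        ts = np.linspace(near, far, n_samples)
        pts = origin + ts[:, None] * direction           # (n_samples, 3) sample points
        sigma, rgb = field(pts)                          # density (n,), color (n, 3)
        delta = ts[1] - ts[0]
        alpha = 1.0 - np.exp(-sigma * delta)             # opacity of each segment
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
        weights = alpha * trans
        return (weights[:, None] * rgb).sum(axis=0)      # final pixel color

    # dummy "field": constant density, greenish color everywhere
    dummy = lambda p: (np.ones(len(p)) * 0.1, np.tile([0.2, 0.6, 0.3], (len(p), 1)))
    print(render_ray(dummy, np.zeros(3), np.array([0.0, 0.0, 1.0])))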
Personally prefer blocky Voxel Art to the photoreal scene ;)
NVidia also released their RTXDI SDK for global illumination at a scale of millions of dynamic lights in real time. Combined with GANCraft, anyone could become a world class environmental artist using only Pixel Art tools.
Yes, there's all sorts of weirdness in their rendering, but that's what you get in a research paper. Put that in the hands of actual game designers and you will have incredible possibilities.
Wow, that site and all its autoplay videos crashed my phone. I get that you want to show off the cool tech, but please don't put that many videos on autoplay.
It looks impressive, but what exactly is the machine learning doing on the original to produce the result?
And wouldn't it be possible to simply take the original Minecraft map as a height map and texture map and then regenerate a new world from the original world data with more advanced post-processing? You could interpolate and randomize more detail into the scene than you started with.
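Something like this toy upsampling is what I have in mind (made-up numbers, just to show "interpolate and randomize more detail"):

    import numpy as np
    from scipy.ndimage import zoom

    rng = np.random.default_rng(0)
    coarse = rng.integers(60, 70, size=(16, 16)).astype(float)      # 16x16 block heights

    fine = zoom(coarse, 8, order=3)                                  # 128x128 via cubic interpolation
    fine += zoom(rng.normal(0, 1, (32, 32)), 4, order=3) * 0.5       # low-frequency "fake" detail
    fine += rng.normal(0, 0.1, fine.shape)                           # plus a bit of high-frequency noise

    print(coarse.shape, "->", fine.shape)   # (16, 16) -> (128, 128)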
It's not really adding any meaningful detail per se. Where there's a grass block, it's just rendering grass. All it is doing is projecting a stable image of "grass" (taken from a labeled image database) into that voxel.
Not to minimize the awesomeness of that... doing it stably in 3D while moving the camera is the point of this paper, and is amazing.
But it's not really adding detail beyond "these are the kinds of pixels that grass has, and the AI figured out we can put them in this arrangement without making things jumpy."
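A crude way to picture it (my simplification, not the paper's actual pipeline): rasterize the labeled voxels into a per-pixel label mask, and let a conditional image model fill in "grass-like pixels" wherever the mask says grass.

    import numpy as np

    GRASS, WATER, STONE = 1, 2, 3
    world = np.full((8, 8), GRASS)       # top-down map of block labels
    world[2:5, 2:5] = WATER
    world[6, :] = STONE

    # Upsample labels to "screen" resolution; each output pixel inherits the label
    # of the voxel it falls on, so the mask stays stable as long as the voxels do.
    screen = np.kron(world, np.ones((16, 16), dtype=int))
    print(screen.shape)   # (128, 128) per-pixel semantic labels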
profit: now you can see what any town would look like with complete streets. I call it Complete Street View.
Please do implement. Of course it would be dreamlike; that's a strength, as you wouldn't want the GAN to make design recommendations, just give a plausible feel.
The image translators work for the construct program, but there is way too much information to decode the Matrix. You get used to it. I don't even see the code, all I see is blonde, brunette, redhead...
At the end of the paper it says that one frame takes 10 seconds to render (roughly 0.1 fps, so 30 fps would need about a 300x speedup). I wonder whether one day this method will be able to render in real time (say 30 fps).
Maybe, but OTOH we have very efficient 3D rendering technology that we understand very well. If I had more compute, I'd want to raytrace everything in real-time, but I wouldn't feel the need to bring neural networks into the mix. A better use case of machine learning is probably to help procedurally generate the data to be rendered. It would be really neat to be able to turn a few photos of a real-world location into high quality 3D meshes with no gaps, for example.
This creates realistic scenery from voxels. If you could do this in real time then you could have a game with the flexibility of Minecraft yet the appearance of a more photorealistic game. Imagine playing a game that looked like Control except everything is destructible and constructible. It's an exciting idea.
Almost definitely. There's many ways to optimize further with software and hardware is only getting better. I wouldn't be surprised if it's doable today with some cheating, a bit more hardware and a lot of work on optimization.
It doesn't look like it uses any texture information. I think it only takes in a list of block locations and spits out a scene. I would think you would have to train it with every different combination of textures.
That's mostly a product of Minecraft's technical choices. Modern computers can render axis-aligned voxel grids on the order of 1,000,000^3 (think Minecraft scale, but the blocks are sub-millimeter) with PBR/GI in real time. Making it interactive would be another story, I suppose.
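Back-of-the-envelope on why that only works with sparse structures (my aside, not something the parent spelled out): a dense 1,000,000^3 grid at one byte per voxel is an exabyte, so real systems store something octree-like that skips empty space.

    # Dense storage is hopeless at that scale:
    dense_bytes = 1_000_000 ** 3           # one byte per voxel
    print(dense_bytes / 1e18, "exabytes")  # 1.0

    # Minimal sketch of a sparse voxel octree node (illustration only, not a real engine):
    class OctreeNode:
        __slots__ = ("children", "material")
        def __init__(self, material=0):
            self.children = None      # None = leaf; otherwise a list of 8 child nodes
            self.material = material  # 0 = empty space, skipped during traversal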
The clickbait article makes people believe "you can create 3D models of ANY 2D object", but in reality this only comes down to cars, cats and human faces. We have only so many datasets that are suitable for a GAN.
The neural net part of this seems somewhat trivial and also misapplied. This is not a real-time renderer, and I would hazard that if you gave someone who knows GLSL the task, they would produce something far and away more compelling than this, and it could probably render at well over 1 FPS.
They would produce something which wouldn't generalize to other types of environments without another huge load of human labor.
Your complaint could be made about just about any new technology. It's usually worse than what came before it at first, but the value is in the potential to eventually become better than what came before it.