This beach does not exist (thisbeachdoesnotexist.com)
238 points by vsemecky on July 19, 2021 | 82 comments



I think that these images are less convincing than "thispersondoesnotexist". The network did well to the extent that a tree is similar to a person: long palm leaves look like green hair, and look passable.

The network did badly where it had to take into account global illumination, or distance to the object and its level of detail: 1) tree trunks are mostly flat and do not depend on the possible sun location; 2) shadows don't match the trees and the rocks; 3) nearby rocks often lack texture and are not properly illuminated (they look flat).

It's not surprising given that the method is fundamentally local (CNN-based). I think better results will be achieved when a generator creates a scene graph rather than a raster. Are there works which actually create a 3D scene (or use it as an internal representation)?


Given that these scenes are produced from existing rasters, I don’t know how you would do that without either data classification or time of day/GI inference. You’d have to do spatial inference to produce the 3D scenes, which is more complicated, and at that point, if you’re using the generated raster output as textures, you’d have to normalize them to their albedos and then do global illumination from there.


I do not follow recent CV research anymore, but there were various "3D from a single image" techniques even 10-15 years ago. I suppose some progress has been made. A quick search turns up this paper https://openaccess.thecvf.com/content/CVPR2021/papers/Zhang_... So it's possible to infer not only a depth map, but also an accurate scene graph. Illumination estimation and texture synthesis have also been done in the past. Maybe it is not necessary to go all the way into proper 3D and ray tracing, but it may be possible to use a 3D representation as one of the intermediate layers. My point is these generators should embed some kind of domain model to look more realistic.


In some of these pictures [1][2][3] the vegetation and the rocks look as if they had been "photoshopped" by some amateur. Anyway it's a great start for sure!

[1] https://thisbeachdoesnotexist.com/data/seeds-075/4375.jpg

[2] https://thisbeachdoesnotexist.com/data/seeds-075/3300.jpg

[3] https://thisbeachdoesnotexist.com/data/seeds-075/7638.jpg


Yeah, the texture on the rocks in [3] looks more like the hide of a giant animal, like an elephant or a rhino, than it does stone.


Nice effort, but as with all these "this X does not exist" image generators, it gives you an uncanny feeling. Something about some of the images is slightly off, but you (or at least I) are not able to put your finger on it most of the time. One of the things I was able to put my finger on is the palm trees: some of them have leaves hanging in mid-air...


For me, what really stood out were the rocks — real rocks just don’t have those sorts of textures.


I thought they looked great at first but as soon as I started taking a closer look, everything started to look very wrong. I'm certain growing up near an ocean gave me a more critical eye.


Agreed on the nice effort comment. The ones showing the scene from a distance took me some extra time to figure out what was "off". For me it is the whitecaps on the water (if present). Some of the pictures appear to show waves going in the opposite direction of their surroundings. One of those things that when you see it, you can't really unsee it.


I was hoping to see something like in Contact where everything looks normal except something is weird. They made the waves run in reverse so they appear to be originating from the beach.


Can you share links to some others of these?


Or entire trees...


Does anyone else get extremely powerful uncanny valley vibes from these AI-generated images? Even the ones that are pretty good give me a sort of deep existential horror.

Actually the original “this person does not exist” wasn’t so bad, but things like this, DALL-E, or the sheep one that was posted the other day I find extremely unsettling and can’t explain why.


Absolutely. I've been wondering if noticing that 'something feels wrong' is a deeply ingrained response related perhaps to survival. When a possible threat is detected, even when it's not clear what the threat is, it's time to feel very uncomfortable.


>> Absolutely. I've been wondering if noticing that 'something feels wrong' is a deeply ingrained response related perhaps to survival.

I thought the latest thinking is that most of what we "see" is really a projection from our own mind (similar to how these images are generated) and our attention is drawn to those areas where our projection and sensory input don't match. So yes, we are evolved to be alerted when "something feels wrong" because it's unexpected and could be a danger.

The mismatch can happen at any level of the hierarchy. One person here said some waves appeared to be moving the wrong way for the rest of a scene. Things like that are at a higher level of abstraction than rock textures or tree trunks with gaps. Some things might seem wrong but take more time to identify or explain what the problem is.


Most of human comfort (which isn't pharmaceutically induced) comes from a sense of familiarity. Unfortunately, this is also where most people get their sense of "truthiness," but that's another story.

Anyways, if you were raised in a machine with these images and then shown the real world, I wonder if it would be a relief or also uncanny?

I know that when it comes to sound, it's really hard to create organic sounds. And that part of our ability to hear sounds happens (theoretically) before the brain, in the form of grouping harmonic series together into isolated sound sources. Perhaps eyes do something similar? Some sort of pre-processing that functionally works for all natural phenomena, kind of like the harmonic series does for sound.


I was going to post the same, but you said it. For me it's like: if this can "not exist" now, we're at that level already. We're approaching a "what is real" situation, should we be able to put this into a person's memory.


Yes, I had to close the page after about 5-10 seconds of browsing. Interestingly, I have been slowly cutting back on coffee and am a bit sluggish as a result, but after looking at those images my mind got a bit of jolt.


Wonder if you activated a fight or flight response


Launching a new hobby deep learning project https://ThisBeachDoesNotExist.com - a synthetic beach image generator built on a neural network StyleGAN.


Is this a Show HN?

It looks nice, but I don't feel safe using untrusted pickle weights of pretrained models, as they can allow for arbitrary code execution.


Probably yes, although I'm not sure I understand exactly what ShowHN is.

I fully understand your concerns, but I don't know how to guarantee that the pkl is ok. Don't run it on your computer, but in some isolated environment, like Colab.


https://news.ycombinator.com/showhn.html

I'd argue that running untrusted code in a Colab is even worse, as you'd risk your account instead of just your computer.

I don't think pickle files can be loaded safely. It's better to store the weights in a NumPy archive (npz), which can be loaded without that security risk by passing allow_pickle=False (the default since NumPy 1.16.3).
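To make the npz suggestion concrete, here is a minimal sketch of a save/load round trip. The array names and shapes are made up for illustration and are not taken from the project; the point is only that the arrays are stored in plain .npy form inside a zip archive and read back with pickling disabled.

```python
import os
import tempfile

import numpy as np

# Hypothetical weights for illustration; a real model would have many more arrays.
weights = {
    "layer1_kernel": np.random.rand(3, 3).astype(np.float32),
    "layer1_bias": np.zeros(3, dtype=np.float32),
}

path = os.path.join(tempfile.mkdtemp(), "weights.npz")

# np.savez writes each array as a plain .npy entry inside a zip archive.
np.savez(path, **weights)

# With allow_pickle=False, NumPy refuses to unpickle object arrays,
# so reading the archive does not go through pickle at all.
with np.load(path, allow_pickle=False) as archive:
    loaded = {name: archive[name] for name in archive.files}
```

This only covers the numeric arrays. Whoever loads them still needs trusted code that rebuilds the network around those arrays.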


The solution is to use a different format that is safer.


Do these projects ever have a way to ensure the output isn't just one of the elements of the training set? Is it simply something that would be of negligible probability of happening?


Typically they would investigate something like the closest (in terms of eg MSE) image from the training set.
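As a rough sketch of that check, assuming the training set fits in memory as one array of same-sized images (the function name and the random data below are hypothetical, not from any particular project):

```python
import numpy as np


def nearest_training_image(generated, training_set):
    """Return (index, mse) of the training image closest to `generated`.

    generated:    array of shape (H, W, C)
    training_set: array of shape (N, H, W, C)
    """
    gen = generated.astype(np.float64)
    train = training_set.astype(np.float64)
    # One MSE value per training image, averaged over all pixels and channels.
    mse = ((train - gen) ** 2).mean(axis=(1, 2, 3))
    best = int(mse.argmin())
    return best, float(mse[best])


# Stand-in data: five random 8x8 RGB "training images".
rng = np.random.default_rng(0)
training_set = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)

# A "generated" image that is a near-copy of training image 3.
generated = training_set[3].copy()
generated[0, 0, 0] ^= 1  # change a single channel value by exactly 1

index, mse = nearest_training_image(generated, training_set)
```

A very small minimum MSE for a generated image would be a hint that the generator reproduced a training sample; what counts as "very small" is a judgment call and depends on the data.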


MSE is kind of crappy for images; you'd use something image-specific, like SSIM.


Sure, but if you are looking for memorization MSE will give you a decent clue I would think.


How do you compute MSE between images?


How to get MSE between images:

1. Express each pixel as a vector of numbers, probably with RGB.

2. For each pair of corresponding pixels, and for each color channel, subtract the value in image 1 from the value in image 2.

3. Square those differences, add them up, and divide by the number of values you summed.

In this way you can get mean squared error for each pixel or for the entire image.
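The steps above can be written out literally in plain Python. This is a sketch that assumes both images have the same size and are given as nested lists of (R, G, B) tuples; the tiny two-pixel images are made up for illustration.

```python
def mse(image_a, image_b):
    """Mean squared error between two same-sized images.

    Each image is a list of rows; each row is a list of (R, G, B) tuples.
    """
    total = 0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            for channel_a, channel_b in zip(pixel_a, pixel_b):
                diff = channel_a - channel_b  # step 2: per-channel difference
                total += diff * diff          # step 3: square and accumulate
                count += 1
    return total / count                      # the "mean" in MSE


# Two 1x2 images that differ in one channel of one pixel.
image_1 = [[(0, 0, 0), (255, 255, 255)]]
image_2 = [[(0, 0, 0), (250, 255, 255)]]
error = mse(image_1, image_2)
```

Here one of the six channel values differs by 5, so the result is 25/6. In practice you would do the same arithmetic on whole arrays at once rather than in Python loops.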


Right but I don't see how it's useful in this context given that rotations, reflections and translations of an object across the canvas can give you high MSE while retaining most of the original image.
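A deliberately extreme sketch of that objection: a synthetic striped image shifted by a single pixel has the same content, yet the MSE jumps to the largest value possible for 8-bit data. The stripe pattern is contrived to make the effect as big as possible; natural photos are smoother, so a small shift would usually cost less.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two same-shaped arrays."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    return float(((a - b) ** 2).mean())


# Synthetic 8x8 grayscale image: alternating black (0) and white (255) columns.
columns = np.arange(8) % 2 * 255
image = np.tile(columns, (8, 1)).astype(np.uint8)

# Translate the image one pixel to the right, wrapping around the edge.
shifted = np.roll(image, shift=1, axis=1)

same = mse(image, image)     # identical images
moved = mse(image, shifted)  # same stripes, moved by one pixel
```

Every pixel in `shifted` differs from `image` by 255, so `moved` is 255 squared even though a human would call the two images the same picture.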


Agreed, but it would be surprising (impossible depending on the arch) if the CNN learned such transformations.


Would love to see some blind study on what people actually find "uncanny" vs. what they think is uncanny because they are told it's a generated image. Maybe I don't fully understand the concept of the "uncanny valley" but simply saying This Looks Shopped isn't uncanniness.


The problem with that is that the very definition of "uncanny" is that it is subjective. You see it and feel that it is off, without being able to explain why.


I'm waiting for a "this X does not exist" website generator.


"This this x does not exist does not exist"


"This GAN does not exist"


This GAN could not exist.


"This website does not exist"

recursion paradox joke :)


I was actually thinking about the details of implementing this after reading the title.


Same :) seems like you could get pretty decent results just selecting random nouns?


I would like to also couple an algorithm to generate those images. Maybe just randomly select nouns for the title, but pass on augmented/cleaned versions of the title to some generator + CLIP to autogenerate fitting images.


It used to be called stumbleupon.


There's a character in "Soylent Green", an old fellow who claims that there used to be all sorts of wonderful nature back in the past, but no-one believes him. I feel like this stuff could facilitate a similar situation.


NVidia has some interesting developments in this vein, specifically with temporal coherence so subsequent frames of animation are consistent over time: https://www.youtube.com/watch?v=jl0XCslxwB0


When they do these "Random Latent Walk" animations, it becomes apparent that some features sort of... stay still (i.e., the detail in the leaves seems to not actually move, but the larger features such as branches move around it).

Does this artifacting have a name?


This problem was recently addressed in detail here[1] which serves as a good overview.

1. https://arxiv.org/abs/2106.12423 (project page: https://nvlabs.github.io/alias-free-gan/)


Holy... ! That second video, the character on the right, one of her incisors ends up in the middle of her mouth. It's amazing how these things can get so many details right, just from analyzing existing images, but some details like the symmetry of teeth get lost.


Is this the end of stock footage copyright?


One can hope, but I'm afraid it'll only make things more complicated.

In theory you could train a generator on stock footage. In theory the stock footage copyright holder could sue you for creating a derivative work without paying them. In theory that claim could be bunkus if the creator used a different training set - how do you prove a generated image's origins?


Decent case that the derivative work is transformative and protected by fair use. (And yes I realize that doesn't stop them from suing you, just pointing out)


How would you know which stock photo was used for training? Someone may take a whole bunch of images they buy on a dark market, train a huge model, and dump it on the public internet over torrent or something. And there would be no way to know which images were used to train the model.


Perhaps as this technique becomes more and more used there will be regulations on the source data set, such as the requirement to prove you own all of it (or that it is in the public domain).


Just like Copilot is the end of GPL-ed code?


We’ll probably end up seeing new IP law to incentivize/protect the collection of training data. Creating training data is crazy expensive, but exploiting it without permission/attribution/compensation is currently legally protected as fair use.


> Is this the end of stock footage copyright?

We probably need ThisMonumentDoesntExist.com for that.


Would the AI technically own the copyright to the work they just produced?


I like to download the image sets from these "x does not exist" sites. Most only have a few thousand images with sequential filenames (one notable exception being thispersondoesnotexist, which has no filenames, so I made a script that hashed the files and discarded duplicates. I stopped at 50,000 images without finding a single duplicate).

Anyway, to grab all these beach images, I used this command:

    wget -w 3 --random-wait https://thisbeachdoesnotexist.com/data/seeds-075/{1..9999}.jpg

Average file size is 130KB, total 1.3GB.
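The hash-and-discard script mentioned above for thispersondoesnotexist is not shown, so the following is only a guess at the idea: hash each downloaded file's bytes and keep one file per distinct digest. The function name and the use of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def unique_files(directory):
    """Return one path per distinct file content found in `directory`."""
    seen = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # Keep the first file seen for each digest; later ones are duplicates.
        seen.setdefault(digest, path)
    return list(seen.values())
```

Note that this only catches byte-identical files. Two images that look the same but were re-encoded would hash differently.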


One thing that gets me about GANs is that whatever approach is used to generate the outputs always results in GAN-look; these images always look “fuzzy” and objects have “hairy” boundaries which I suspect is the result of some common gradient function technique blending the output in ways that are identifiably artificial.

Plenty of photographs have crisp object definitions in the real world. These do not. Trees blend into humans standing on the beach.


How do you know it really does not exist? There is this effect called 'overfitting', where you might expect an exact sample from the training set to show up in the video.


As someone who is terrified of the deep sea (caused probably by Lovecraft's work), I find the artifacts on the images truly unsettling.


Nice!

Only a small feature request: When you click the button to get another set of random images or another cluster in the knn, the page waits until the new image is loaded before updating, so it "does nothing" for half a second. I'd like it to instantly show a spinner or some other signal that it's waiting for data and will update soon.


My favorite of these is This Chemical Does Not Exist:

https://www.thischemicaldoesnotexist.com/


Interesting work. I wish the video paused on at least some of the beaches though so you can actually look at them before they transform into the next one.


The video on the site is very enjoyable to have on in the background at 0.25x speed.


It's like one of those sci-fi time travel movies or docuseries showing the evolution of the earth/tectonic plates/etc.


Long-shot request: Generate 3D models of beaches like this for a flight simulator.


I recognize some photos from pexels(dot)com, they are all public domain.


The video made me seasick


Plot twist: These are all real beaches but they no longer exist because of climate change.


Plot twist: These were all real but don't exist anymore because no man can step in the same river twice, let alone swim at the same beach


Plot twist: What is a man? Naked chicken sits in cave watching shadows on walls....


The rocks are really bad


Rock is dead, they say.


What a fancy way to wreck a nice beach.


on a beach... nowhere.


Surely, the title should be "the photo of this beach does not exist". Beach photos are always a bit samey, so it is probably more straightforward to train an AI to build a random beach photo than to actually build a map of a beach.


I'd say these images have high artistic value. In the coming years it is very likely that "real art" will be confined to snobs and most people will be able to buy non-mass-produced, high-quality, automatically generated art.


I've never understood the appeal of photorealistic art anyways, so I'll be happy to be labeled a snob that likes art that looks like art.


I agree, I think most photorealistic art lost most of its appeal when color photography came out.

Maybe there's a brief moment of "I'm impressed with how much work you put into this" but for that to happen, you kind of have to see the work being done.

I think that's why impressionism has such staying power, impressionists realized that your brain doesn't need photorealism. They were panned by their contemporaries who were focused on photorealism, but time has proven they were onto something.

Even now the impressionist-style neural nets don't get it right.


>Maybe there's a brief moment of "I'm impressed with how much work you put into this" but for that to happen, you kind of have to see the work being done.

Or be knowledgeable enough to estimate the effort and skill required. Ironically, things I took for granted when I was young have impressed me more and more over the years. Probably because I've either tried to do them myself or learned about their backgrounds.

So I'm not surprised artists and art enthusiasts would value realism more than most people. Just like how speed metal is probably more popular among guitarists. The technique becomes a source of value rather than just an implementation detail.


I'm an artist that doesn't like photorealism. I can appreciate the skill without wanting to spend much time looking at it. The imperfections and accidents are what make art pieces interesting to me.



