Show HN: Stock Photos Using Stable Diffusion

nostromo · on Sept 30, 2022

Some of these are a bit nightmarish!

https://replicate.com/api/models/stability-ai/stable-diffusi...

I love it. But whoever entered "spider salad" and "cockroach salad" previously so it would show up when I searched for "salad" -- I'm mad at you.

ericmcer · on Sept 30, 2022

Yeah I would love a university to use this on their website. https://replicate.com/api/models/stability-ai/stable-diffusi...

IWorkAtBigCo · on Oct 1, 2022

Found the next Welch's Grape Juice kid: https://replicate.com/api/models/stability-ai/stable-diffusi...

dqpb · on Oct 1, 2022

The terrifying thing is this is what we look like to the AI. It sees us as melty faced monsters.

badrabbit · on Oct 1, 2022

Although it would make a good art piece at a campus

jarrenae · on Sept 30, 2022

In V2 we're planning to add a voting system and additional filtering/tagging to solve for a lot of these unusual/nightmareish summoned images.

I for one am sorry for your cockroach salad jump-scare, but of course, you know summoning from beyond is tricky business.

bambax · on Oct 1, 2022

Actual stock photo websites have human reviewers and it's a big part of their value proposition.

lyjackal · on Oct 1, 2022

I searched for “happy” and I tend to agree with the nightmarish look. Pretty much all of the results looked like they would love to use their happy teeth to eat you. Consistently hitting the uncanny valley

jarrenae · on Oct 1, 2022

Yeah, unfortunately uncanny valley is spot on for most generated faces with this version of Stable Diffusion.

I've seen some people tackle the problem by running faces through a secondary AI that makes faces look "better" [1] though in this case "better" just means "pretty" by recent western standards, so that has its own set of shortcomings.

I'd be interested to hear what you think about the inanimate object searches?

[1] https://replicate.com/tencentarc/gfpgan

GaggiX · on Oct 1, 2022

> AI that makes faces look "better" [1] though in this case "better" just means "pretty" by recent western standards

The AI is not making the faces pretty, it's just restorating them, even the "ugly" ones. The AI was trained on FFHQ, a dataset that contains considerable variation in terms of age, ethnicity; also it was trained by Tencent ARC Lab, so I have no idea how the west has anything to do with this.

lolc · on Oct 2, 2022

It makes the faces average, which correlates with perceived beauty.

mrtksn · on Sept 30, 2022

Sometimes can be cute: https://dropover.cloud/d6b973#841f5cc5-f92f-44fe-93ec-9e9b9d...

The uncanny valley is real but I guess these issues can be fixed with a "face sanity engine".

"Limbs sanity engine" can help too: https://dropover.cloud/8bf440#ce54ae49-bb16-4ce9-97d1-71b3b8...

q7xvh97o2pDhNrh · on Sept 30, 2022

Gives a whole new meaning to "debugging."

apsdsm · on Oct 1, 2022

This might just be the best stock photo site for YouTube creepypasta videos.

roganp · on Sept 30, 2022

Oh yes. Some are very creepy / hilarious. Awesome just the same.

selcuka · on Oct 1, 2022

No wonder they call it ghostly.

thefilmore · on Oct 1, 2022

I'm seeing images with a dreamstime[0] watermark when searching "man in suit":

https://replicate.com/api/models/stability-ai/stable-diffusi...

[0]: https://www.dreamstime.com

an1sotropy · on Oct 1, 2022

Nice find! So it was trained with dreamstime images.

Do the output images come with licensing and copyright images, so that dreamstime can be compensated for downstream commercial use?

What a legal mess.

saasxyz · on Oct 8, 2022

I found an image with istock watermark in huggingface stabilityAI space itself.

an1sotropy · on Oct 1, 2022

Found some more with "team discussion in well-lit office" (trying to think of most corporate possible phrasing)

https://replicate.com/api/models/stability-ai/stable-diffusi...

"team discussion in large office"

https://replicate.com/api/models/stability-ai/stable-diffusi...

rany_ · on Oct 1, 2022

I'm surprised it didn't mangle their watermark. It's extremely clear!

CameronBanga · on Sept 30, 2022

Having a button in the search bar that's a blue circle that says Photo, etc, and then not having it start the generation process when clicked feels odd to me. Took me about 30 seconds to realize I had to hit the enter key. Would likely feel weirder on mobile.

jarrenae · on Sept 30, 2022

Agreed. Mobile we have an added "Search" button appear, but that's on my list of improvements to make.

yamtaddle · on Oct 1, 2022

UX suggestion: example search already performed on the landing page. You can fake it a bit so it's not actually hitting your search logic (and incurring that cost) every time. Just so when you arrive you see the sort of thing a search might return.

[EDIT] Actually instead of dropping straight into the actual search-result UI, how about scrunching the header up a tad more (there's already a bunch of incomplete-looking space under it) and a row of example images with example searches that might bring them up:

    [ Image ]       [ Image ]      [ Image ]
    "Cats playing    "The moon,    "Statue of
     baseball"        made of       liberty
                      cheese"       driving a car"

jarrenae · on Oct 1, 2022

Thanks for the feedback, and I totally agree. I'd really hoped to fill out the landing page a bit more before we launched today, but we weren't able to roll it out in time.

Honestly I'm taking a lot of inspiration from Unsplash, but gearing up to add very unique features specifically based around generation.

We're going to be iterating on this pretty fast, so I wouldn't be surprised if we have something there within the week.

an1sotropy · on Oct 1, 2022

None of the text-to-image tools seem to really understand 3D geometry, so I feel safe for now. Look at examples for icosahedron [1] vs dodecahedron [2] vs octahedron [3] None of the images were actually geometrically correct - is that quibbling? Maybe, but sometimes for some audience words actually mean something, not just some vague evocation of the angular aesthetic of something. Has someone delineated the words that will not appear in a stock photography prompt? If there was some feedback like "I'm confident in this" to "I'm guessing here, user beware", it would be a lot more usable.

[1] https://replicate.com/api/models/stability-ai/stable-diffusi...

[2] https://replicate.com/api/models/stability-ai/stable-diffusi...

[3] https://replicate.com/api/models/stability-ai/stable-diffusi...

jarrenae · on Oct 1, 2022

That's one of the things I've found deeply interesting about the current generation of tools, there's little (if any) comprehension going on, it's really just trying to "enhance" a blur/bit of noise to make the image it was told to make.

And I'm not sure I completely know what you mean, but we are planning to add voting and tagging to improve filtering for images.

londons_explore · on Oct 1, 2022

You saw this right:

https://dreamfusion3d.github.io/

That's the same type of diffusion model used here, and without any further training, it is constrained to generate something that is consistent from all angles when viewed in 3d.

an1sotropy · on Oct 1, 2022

Thanks for pointing that out, maybe I’ll try it.

The question is, if I say “icosahedron” will it actually make a (3D) icosahedron

barbazoo · on Sept 30, 2022

Not sure I understand how to use this. I searched for "monkey on car" and these are the "categories" I get:

"a dead monkey", "a monkey dancing", "a dead monkey" (again), "a ca"

roganp · on Sept 30, 2022

They are offering you a previously generated image. Need to click the button at bottom of page to get an original rendering "from beyond"

jarrenae · on Sept 30, 2022

Exactly. We'll improve button location in another iteration.

jarrenae · on Sept 30, 2022

Also we'll have to add reporting for specific search terms. We do have a NSFW filter on by default, but there are often things that skirt around the rules while are hard to filter for.

itishappy · on Oct 1, 2022

The word "penis" returns surprisingly good results and even better related terms.

I find it a amusing that the site seems almost eager to suggest related terms that it then refuses to generate images of.

jarrenae · on Oct 1, 2022

haha well I hadn't considered this, but you've actually stumbled upon something interesting. Duly noted!

We're using a synonym API that's separate from the image summoning. Basically our filtering happens after the synonym API, but before the image summoning API.

mutant · on Oct 1, 2022

Orgy did well too

gus_massa · on Sept 30, 2022

They take too long to generate, but there is no clear indication of that. You should add a spinning mouse or other thing that shows that the server is working. (A robot paining a canvas would be nice, but you need someone that can make nice drawings. A hourglass or a spinning circle are good enough.)

jarrenae · on Sept 30, 2022

Agreed. That's already one of the things I have on the list for v2, "make image summoning more obvious/loading" and also we'll improve the button location for "Summoning new images" because it's likely that users won't want to scroll to the bottom just to generate new images.

gus_massa · on Sept 30, 2022

I'd make the working indicator a top priority. It's not so difficult to add, but it makes the UI much better.

jarrenae · on Sept 30, 2022

You're absolutely right. We'll likely have a fix rolled out for this and a few other things on Monday.

LanternLight83 · on Sept 30, 2022

Summoning, I like that~

jarrenae · on Sept 30, 2022

Seances aren't easy things to get working with ones and zeros.

sritchie · on Oct 1, 2022

Or in any other context, for that matter!

knicholes · on Sept 30, 2022

Dall-E 2 does something great: Show prompts and examples of images that those prompts generate. This educates your consumer to be able to get more of what they want while they wait. It kind of tickles the desire for mastery.

smeej · on Oct 1, 2022

It's fascinating how much AI struggles to mimic signs and text. With as much as we enter text into computers, my instinct was to think this should be really easy for computers, but they don't actually receive and process the abstraction of writing like we do, do they?

We use shapes to indicate sounds and sequences to make words, but the computer is ultimately just getting 1 or 0, on or off. It doesn't seem that it does have the associations we use intuitively because of how humans interact with language.

jarrenae · on Oct 1, 2022

I'm excited to see the next versions of Stable Diffusion. I think we'll probably see this improve significantly over the next year.

asdkhadsj · on Oct 1, 2022

Frankly i am quite hopeful for 10 years. Still not sure if/when i want them driving, but these versions of SD make me feel like they're entirely dumb yet feature rich (almost).

Ie i never really thought we'd get to the point where you could tell a computer to do something like in the movies without the computer _also_ seeming intelligent at the same time.

The thought of the computer being very capable but still no more smart than my terminal is.. interesting.

ihateolives · on Oct 1, 2022

Vegan gave me this one: https://replicate.com/api/models/stability-ai/stable-diffusi...

I'm actually mildly impressed.

wodenokoto · on Oct 1, 2022

This is awesome. I see this coming builtin to power point.

Cellist eating a donut is super freakish!

https://replicate.com/api/models/stability-ai/stable-diffusi...

MikeDelta · on Oct 1, 2022

Looks like a donut eating a cellist.

dillondoyle · on Oct 1, 2022

The suggested search results are amazing in such a ridiculous way.

"paper" produced "a man reading a newspaper while riding a walrus"

"a wolf reading a newspaper"

"Trapped inside infinity"

and I got to say, the wawlrus readers look passable at a glance when shrunk to low res

https://replicate.com/api/models/stability-ai/stable-diffusi...

onwardly · on Oct 1, 2022

I imagine that is a short term problem.

dillondoyle · on Oct 2, 2022

is it a problem!? ;)

rany_ · on Oct 1, 2022

It still has trouble understanding sentences, it feels to me that it just generates images based on keywords and not the meaning of my sentence.

For example, I tried "attractive woman disgusted by an ugly bystander" and the generated images show a disgusted woman with no "ugly bystander".

Similar situation with "man angry at a squirrel seeks revenge" (generated image shows an angry squirrel with no man in the image, when the man was the one supposed to be angry..)

andybak · on Oct 1, 2022

This is the biggest difference between SD and Dall-E (and Imagen) to my mind. SD can produce stunning results but it tends to treat prompts as "word salad" rather than a grammatical instruction.

krick · on Oct 1, 2022

Not sure how to evaluate that. Maybe it's kinda fun, but… I mean, generating crappy images from text isn't exactly new by now. It may be "an early version" (and this is exactly why I struggle to evaluate that — obviously, we shouldn't be too judgemental of "an early version"), but it surely isn't "a truly functional stock photo platform" yet. I mean, by far. "By a light-year" kind of far.

jarrenae · on Oct 1, 2022

This is definitely a fair assessment. I think a lot of the "wow" factor is just seeing the generated images in the first place.

In truth I think a lot of value will be added as we start improving filtering. Once users are able to vote on "usable" or "unusable" images, or request variations of an existing photo.

I've genuinely used it for 3-4 photos where I would have previously used Unsplash, and I'm optimistic that I can get that number to steadily trend upwards.

I don't expect this to erase any of the existing stock photo tools on the market, though I do think this will add some new value to the space. Honestly my goal was "will my mom be able to use this?"

Hope that helps clarify the goal a bit more, and I do really appreciate the feedback!

Kye · on Oct 1, 2022

stargate sg-1 at a furry convention:

https://replicate.com/api/models/stability-ai/stable-diffusi...

https://replicate.com/api/models/stability-ai/stable-diffusi... (cool suit)

https://replicate.com/api/models/stability-ai/stable-diffusi...

agluszak · on Oct 1, 2022

Usability note: please add a clickable "search" button.

ericmcer · on Sept 30, 2022

Whoa this is cool and I would def used a more refined version of it. The images with people are a little bit... freaky but objects and animals look fine.

I wonder if this exists inside of Squarespace or Wordpress. I imagine the ability to generate quality license free stock photos would be a huge selling point for them.

jarrenae · on Sept 30, 2022

We're going to add voting to help empower users to sort between better/worse summoned images. And an API tool for devs to leverage is planned as well.

pimlottc · on Oct 1, 2022

The animals definitely do not look fine to me, all the results for "cat" I saw were pretty squarely in the uncanny valley.

bscphil · on Oct 1, 2022

It's sort of interesting, given the undeniable power that these new AI techniques have, just how limited the output is at the moment. Only 512x512 images.

I tried a specific query - "man running from a tiger" - and none of the provided images were even close. Seems to be a common problem.

init2null · on Oct 1, 2022

Upscaling with a neural network resizer such as Topaz Gigapixel Ai dramatically improves the usability of the art. On1 also has a resizer that might be a good less-expensive choice.

jarrenae · on Oct 1, 2022

Upsizing is something I'm really hoping to bring in the next version of the site.

I'm imagining a variety of different tools to improve the images, so ideally if an image is good enough for somebody, they could "enhance it" and from then on, that image would be available at a higher resolution for all users.

glenneroo · on Oct 1, 2022

I've had great results upscaling with various versions of the open source RealESRGANx4 which I usually just run via Visions of Chaos (which also supports lots of other upscalers, stable diffusion, disco diffusion, and hundreds of other AI related tools).

jtxt · on Oct 2, 2022

I really like this idea! Related results work fairly well. Tons of potential here!

Ideas: Allow voting for prompts. Allow voting for results. (But try to prevent the rich get richer effect... https://medium.com/hacking-and-gonzo/how-hacker-news-ranking...) Allow requesting more results for a given prompt.

bug: When there is an error, make it so "back" goes to before the error, instead of before I went to the website perhaps?

switchstance · on Sept 30, 2022

I am so thankful we got out of the stock business when we did.

AI generated photos, videos, music and animations are here, and I believe it's only a matter of time before they replace a large percentage of the stock websites/companies.

jarrenae · on Sept 30, 2022

That's sort of the reason we started building this. I think there will absolutely always be room for paid, high quality stock photos, but "content" at the speed of thought is here, and I'm excited to see how the space evolves.

kaetemi · on Oct 1, 2022

The suggested tags when searching "anime girl" are just a bit creepy.

wheresmycraisin · on Sept 30, 2022

Omg I cannot wait for human faces to become non-freaky with this technology. People pay real money to sites like Getty or Adobe (the former of which is owned by a corp that you may or may not find politically compatible with your beliefs) to fill their landing pages. And for specific categories, for example "happy asian couple", there's only a few models to choose from so it becomes repetitive fast.

jmcphers · on Sept 30, 2022

If you need a non-freaky human face generated by AI, look no further than: https://thispersondoesnotexist.com/

wheresmycraisin · on Sept 30, 2022

That's good for some situations where all you need is a headshot, yes. Maybe it can be combined with this stable diffusion stuff?

cercatrova · on Oct 1, 2022

Dreambooth is what you're looking for.

jarrenae · on Sept 30, 2022

I can't wait either. We're going to add follow-up solutions to upres, expand, and improve facial features. Additionally, we're aiming to improve search terminology on the back end to start providing more relevant results for exactly those sorts of searches.

awb · on Sept 30, 2022

The landscape and city photos are stunning.

The ones with people and animals tend to have distorted faces or bodies.

But keep up the good work. There’s a definitely a market for this.

jarrenae · on Oct 1, 2022

Thanks a lot! Of course Stable Diffusion is doing most of the heavy lifting here, but we're working hard to make the UI way more user-friendly and provide real lasting utility.

If you have any suggestions, I'm all ears.

riedel · on Oct 3, 2022

How safe is it to use those stock photos? How sure can one be that stable diffusion does not create any copyrighted work? Is the training data all freely licenced or is there a mechanism that does not recreate the training data, otherwise I could see remaining risks.

jarrenae · on Oct 3, 2022

In complete transparency, the "jury is out" on this sort of conversation, meaning we don't completely know. Getty Images/Unsplash said they're not allowing AI images because of this concern, but I personally believe that's because they're concerned about the competition/controversy.

Stable Diffusion, like Dalle-2, and MidJourney were 100% trained on unlicensed copyrighted work, which is probably where most of the concern is focused.

Even in this HN thread, there are examples of "summoned" images with what looks to be watermarks or signatures from notable sites and artists.

We technically haven't written our license yet, but it'll basically be along the lines of "don't resell these images without significant alterations, don't use these images to launch a similar tool, don't re-distribute or sell the images as is to other stock photo sites."

These images (AI generated in general, not directly GhostlyStock's photos) have already been used in and on magazine covers, artist album art, and many other places.

My personal take is: - yes these tools have been trained on copyrighted work - yes it does seem rather insensitive to use an artist's name to replicate their style - yes I/we do have rights to license these generated works as we please

The space is rapidly evolving, and I'm also very interested to see how this all plays out over time.

note: I wrote this rather rapidly, so forgive me if I seem a bit blunt about this. I do think it's a conversation that should be taken seriously.

mobiuscog · on Oct 3, 2022

I was intrigued by the text I had as a subtitle in one run:

> Please Contact Us US$1,000 One or more parties reside in a country not supported by Escrow.com. Please contact us to discuss alternate payment options with a Flippa Representative.

BonoboIO · on Oct 1, 2022

https://replicate.com/api/models/stability-ai/stable-diffusi...

Nightmare stuff.

omwow · on Oct 1, 2022

Would love the option to not just have square images, but other formats as well (especially 16:9)

jarrenae · on Oct 1, 2022

Spot on. In my initial sketches for the site I had vertical/horizontal sorting for images, and I think we'll be able to add that pretty quickly.

luismmolina · on Oct 1, 2022

This needs a lot of curation. Maybe put some ads and offer a percentage to users willing to give ratings to the quality of the images. I will be happy to spend some time clicking , to earn some money for a beer once in a while.

jarrenae · on Oct 1, 2022

You're spot on with the curation! I'm really exciting to add voting and weighing to the images. I think that's where this'll really start to shine. "Girl reading in a coffee shop" generates about 1 usable image per summon, so I'm looking forward to empowering users to manually filter for more usable images.

needle0 · on Oct 1, 2022

512x512 is standard for SD generated images, but seems pretty low res when you look at it as a stock photo. Might be good to provide an AI upscaled version of the image for download.

JimWestergren · on Oct 1, 2022

Use the Lexica API and show images from them as well! https://lexica.art/docs

an1sotropy · on Oct 1, 2022

For a bad time, search for “open” there

dvh · on Oct 1, 2022

SD is so human that it too have a problem drawing bicycle.

bredren · on Oct 1, 2022

A simple url-argument-based API to get back a photo, similar to source.unsplash.com’s behavior would be terrific.

jarrenae · on Oct 1, 2022

100% Agreed on this one. I was really hoping we'd have that feature ready before launching today. Hopefully we'll have an early version up in the next week or so.

WalterBright · on Sept 30, 2022

"D-Man programming" didn't turn up anything interesting, though images.google.com got it right :D

jarrenae · on Sept 30, 2022

For more vague terms, this probably won't do too well for a while, but this definitely isn't going to "replace" anything else completely in the near future. It's really intended to be a new tool to add to the toolbox.

napsterbr · on Sept 30, 2022

Gotta love some of these suggestions:

"The Ethereum logo glowing above wet pavement, with a bull charging"

Alex3917 · on Oct 1, 2022

If I search for "bicycle repair shop", all I see are images for car repair shops.

asah · on Sept 30, 2022

[ferrari] got me pictures of ferrets. then it got HN hug.

jarrenae · on Sept 30, 2022

Interesting. Over time we'll be working to improve more specific searches to prevent unusual errors. Right now we're just loosely comparing letters in searches.

And yeah, our servers are working overtime, but what's fun is that as more images are generated, there's less of a need to "bulk generate" because many searches show existing results.

yieldcrv · on Oct 1, 2022

Id like to normalize the term “commissioning an AI”

rasz · on Oct 1, 2022

penis

shows fountain pen

poopnugget · on Oct 1, 2022

Doesnt work. I type my prompt and it gives results for something different.

jarrenae · on Oct 1, 2022

Sorry, it's a little clunky at the moment. Just to clarify, you're actually directly searching for images that have been previously been summoned by other users. Our recommendation algorithm right now is "pretty bad" because we're just checking for words that might be related.

But if you scroll down to the bottom of the search results, you can "summon more images from beyond", and then you'll see a direct prompt generation to your search.

Ideally this'll enable people to reuse images that are really good and viable over and over again just like stock photos instead of searching hard for new images from random image generations, which is pretty exciting I think.

Let me know if this helps!

nixpulvis · on Sept 30, 2022

Terrible.

jarrenae · on Sept 30, 2022

I'm interested to hear why you think so? If theres's something specifically wrong with what we've built that we could possibly improve, I'd appreciate the feedback.

nixpulvis · on Oct 1, 2022

It's a horror show. Creepy and weird stock photos for sure. Useful for Halloween perhaps, but uncanny as hell. I cannot think of much else to say other than that it's terrible.

Maybe some people want terrible, but that doesn't change the fact that an arm is melding into the mouth of this girl's face, and all I asked for was "smiling".

jarrenae · on Oct 1, 2022

Yeah, that's the unfortunate side effect of generated images, there's no comprehension of generation on the AI's side of things. People rarely come out correctly right now, but animals tend to do a bit better. Inanimate objects are mostly useable.

This is more of a proof of concept for us, and I'm trying to solve for some of this by adding voting and improved prompts on the back end. I'm hopeful that in a few weeks we'll have a much improved version out with less "horror show" images overall.