
So I've done a bit of comparative testing between Janus 7B and Flux Dev, strictly considering PROMPT ADHERENCE, since Janus is limited to 384x384. As mentioned elsewhere, upscaling is a FAR simpler problem to solve than adherence.

Results testing star symmetry, spatial positioning, unusual imagery:

https://imgur.com/a/nn9c0hB

Prior to Flux, 90% of my SD images had one dimension smaller than 480-512px. I prefer the smaller images both for speed and for bulk/batch work: I can "explore the latent space", which to me means running truly random images until one catches my eye, then exploring the nearby seeds and subseeds. There's the main model seed, and then a smaller latent-space seed that kind of mutates your image slightly. All images in a batch might share the first seed, but the second seeds are all different. That's just what I call exploring the latent space. I can make a video, because I doubt what I typed makes perfect sense.
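For the curious, here's roughly what I believe the second seed does under the hood: A1111's variation seed draws a second noise tensor and spherically interpolates it into the main seed's noise by the variation strength, so small strengths stay close to the original starting latent. A minimal sketch in torch (the seeds, strength, and latent shape here are just illustrative):

    import torch

    def slerp(t, a, b):
        # Spherical interpolation between two noise tensors
        a_n, b_n = a / a.norm(), b / b.norm()
        omega = torch.acos((a_n * b_n).sum().clamp(-1, 1))
        so = torch.sin(omega)
        return (torch.sin((1 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b

    def variation_noise(seed, subseed, strength, shape=(4, 64, 64)):
        # Start from the main seed's noise, nudged toward the subseed's
        base = torch.randn(shape, generator=torch.Generator().manual_seed(seed))
        vary = torch.randn(shape, generator=torch.Generator().manual_seed(subseed))
        return slerp(strength, base, vary)

    # One main seed, four subseeds -> a batch of slight mutations of one image
    batch = [variation_noise(1496589915, sub, 0.1) for sub in range(4)]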

Nice. A couple of Discord users back in the early days of SD were doing something similar: generating random alphanumeric positive/negative prompts and then pushing the seed/subseed values up and down.
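Something like this, presumably, fed into the positive/negative prompt boxes (just my guess at what they were doing, not their actual script):

    import random
    import string

    def random_prompt(length=24):
        # Random alphanumeric gibberish as a prompt
        return "".join(random.choices(string.ascii_letters + string.digits, k=length))

    positive, negative = random_prompt(), random_prompt()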

In my experience, changing the seed even by a single digit can drastically alter the image, so I'd be curious to know how truly "adjacent" these images actually are.


It doesn't drastically alter the images, in my experience. It's more like changing the trim on a dress or the shape of the drapes; the structure and composition of nearby images stay similar.

random seed

https://imgur.com/a/ySOUKSM

variation seed

https://imgur.com/a/GSo0Sjm

Sorry, I did the HN thing (I didn't show my work):

> A neon abyss with a sports car in the foreground

> Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1496589915, Size: 512x512, Model hash: c35782bad8, Model: realisticVisionV13_v13, Variation seed: 1496589915, Variation seed strength: 0.1, Version: 1.6.0

(The first image set is the same, except the seeds were random; the main seed is one of the first 4, though.)
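If anyone wants to reproduce this without clicking through the UI: the webUI exposes the same parameters over HTTP when launched with the --api flag. The field names below are from the txt2img endpoint as I remember it, so treat this as a sketch:

    import requests

    # automatic1111 webUI running locally with --api (default port 7860)
    payload = {
        "prompt": "A neon abyss with a sports car in the foreground",
        "steps": 20,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": 7,
        "seed": 1496589915,
        "subseed": 1496589915,       # "Variation seed" in the UI
        "subseed_strength": 0.1,     # "Variation seed strength"
        "width": 512,
        "height": 512,
    }
    r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
    images = r.json()["images"]      # list of base64-encoded PNGs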


Could you send me the video if you ever end up making it? I don't understand how jumping between nearby seeds means anything in the latent space. As far as I know, it's closer to a hash function, where the output is drastically different for small changes in the input.
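A quick sanity check of what I mean (torch here, but I'd expect any framework's RNG to behave the same way):

    import torch

    def noise(seed, shape=(4, 64, 64)):
        return torch.randn(shape, generator=torch.Generator().manual_seed(seed))

    a, b = noise(1496589915), noise(1496589916)   # seeds one apart
    sim = torch.nn.functional.cosine_similarity(a.flatten(), b.flatten(), dim=0)
    print(sim.item())  # ~0.0: adjacent seeds give uncorrelated starting latents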

I posted two replies to your sibling comments: one with a demo of what I mean (two batches of 4, first with completely random seeds and then with latent-space "random" seeds), and a second comment with a single imgur link that shows the only setting I touched plus an explanation of how I use it.

I apologize if this isn't what "exploring latent space" means, but /shrug, that's how I use it, and I'm the only one I know who knows anything about any of this.

Edit to add: I get frustrated pretty easily on HN because it's hard to tell who's blowing smoke and who's actually earnest or knows what they're talking about (either is fine). I end up typing a whole lot into this box about how these tools work, how I use them, the limitations, unexpected features...


Seeds are noticeably "nearby" each other? That is very unexpected to me.

Variation seeds are nearby; that's what I call the latent space. See my reply with the two imgur links to your sibling commenter.

That sounds fascinating. Would you mind writing up a demo on how to do that?

https://imgur.com/a/PpYGnOz

Unsure about other UIs, but: you can usually set a seed, and also see the seed of an image you've generated. So generate or load an image you like so you have its seed, and lock the seed. Then find the variation seed setting and lock that too (in automatic1111's webUI it automatically locks to the main seed). Now adjust the variation strength. If you're doing small images you can keep this small, because the variations will be very minor. I set 0.1, which I also use with larger images if I'm looking for a specific color smear or something, but once I narrow it down I reduce it to 0.05 or below.

When you click an image in a UI it ought to load all the details into the configurable parts, including the variation seed / subseed, which means you can just keep exploring around individual variations' spaces too: expand the strength a bit if you get stuck in local minima (or boring images), and reduce it to get the image you want to rescale to publish or whatever.
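And if you'd rather script it than click: the same workflow maps onto the webUI API mentioned upthread. Lock the seed and subseed, then sweep the strength to widen or narrow the search (same hedges as before about the field names; the subseed value here is made up):

    import base64
    import requests

    base = {
        "prompt": "A neon abyss with a sports car in the foreground",
        "steps": 20,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": 7,
        "seed": 1496589915,    # locked main seed
        "subseed": 12345,      # locked variation seed (hypothetical value)
        "width": 512,
        "height": 512,
    }

    # Widen the strength when stuck on boring images, shrink it to refine
    for strength in (0.02, 0.05, 0.1, 0.2):
        r = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img",
                          json={**base, "subseed_strength": strength})
        png = base64.b64decode(r.json()["images"][0])
        with open(f"variation_{strength}.png", "wb") as f:
            f.write(png)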

Ask it to create a salad with an avocado chopped in half. See whether each half has a nut in it.

It would be worth throwing imagen3/imagefx into the comparison.

Good idea - I've updated the comparisons with Imagen 3 and DALL-E 3. I also cherry-picked the best result from each GenAI system, out of a max of 12 generations.




