
What do you mean by extreme requirements? There are lots of SDXL fine-tunes available on Civitai, like https://civitai.com/models/119012/bluepencil-xl for anime. The relevant Discords for models/apps are full of people doing this at home.

Or are you looking at some very specific definition / threshold for fine tuning here?




There is something I find rather hard to communicate about the difference between these models on civitai and what I think a competent model should be able to do.

I'd describe them like "a bike with no handlebars" because they are incredibly difficult to steer to where you want.

For example if you look at the preview images like this one: https://civitai.com/images/3615715

The model seems to have completely ignored a good 35% of the text input. Most egregious, I find, is the (flat chest:2.0), where the parentheses denote a strengthening of that specific part of the prompt. The values I see people use with good general models range from 1.05 to 1.15. In comparison, 2.0 is an extremely large value, and it _still did not work at all_ if you take a look at the actual image.
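For readers unfamiliar with the weighting syntax described above, here is a minimal sketch of how a `(text:weight)` prompt could be split into weighted parts. The function name `parse_weights` and the regex are assumptions made for illustration; this is not the parser of any particular UI, and it ignores nesting and escapes.

```python
import re

# Matches one "(some text:1.5)" group. This mirrors the common web-UI
# convention only loosely; it is an illustrative assumption, not a real parser.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")

def parse_weights(prompt, default=1.0):
    """Return (text, weight) pairs for each comma-separated prompt part."""
    parts = []
    for raw in prompt.split(","):
        piece = raw.strip()
        if not piece:
            continue
        match = WEIGHTED.fullmatch(piece)
        if match:
            parts.append((match.group(1).strip(), float(match.group(2))))
        else:
            # Unweighted parts get the default weight.
            parts.append((piece, default))
    return parts

print(parse_weights("a castle, (red roof:1.1), (fog:2.0)"))
```

Under this reading, a weight of 2.0 asks for roughly twice the default emphasis, which is why it stands out against the 1.05 to 1.15 range.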


> the model seems to have completely ignored a good 35% of the text input,

Well, when you blindly cargo cult a prompting style designed to work around issues with SD1.x in SDXL and in the process spam several hundred tokens, mostly slight variants, into the 75-ish-token window (which, yes, the UIs use a merging strategy to try to accommodate), you have that problem.

> most egregiously I find the (flat chest:2.0)

The flat chest is fighting to compensate for the also heavily weighted (hands on breasts:1.5), which affects not only hand placement but also the concept of "breasts". The biases trained into many of the community models with that term mean that having that concept in the prompt, heavily weighted, takes a lot to counteract. So, no, I don't think it's ignoring that.
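To make the 75-ish-token window concrete, here is a rough sketch of splitting an overlong prompt into windows. Whitespace splitting stands in for the real CLIP tokenizer, and `chunk_tokens` is a hypothetical helper; how a given UI merges the resulting windows varies and is not shown.

```python
def chunk_tokens(tokens, window=75):
    """Split a token list into consecutive windows of at most `window` items."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]

# Stand-in tokenizer (assumption): real UIs use the CLIP tokenizer,
# not whitespace splitting, so real token counts will differ.
tokens = ("masterpiece, best quality, " * 60).split()
chunks = chunk_tokens(tokens)
print(len(tokens), [len(c) for c in chunks])
```

A prompt of several hundred tokens ends up spread over multiple windows, so any single term competes with many near-duplicates for influence.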


I'm just going out on a limb here, but a paid service needs to be good with limited input. I've used SD locally quite a lot and it takes quite a bit of work through x/y plots to find combinations of settings that produce good images somewhat consistently. Even when using decent fine tunings from CivitAI.

When I use a decent paid service, pretty much every prompt gives me a good response out of the box. Which is good, because otherwise I'd have no use for paid services, since I can run it all locally. This causes me to go to a paid service whenever I want something quick, but don't need full control. When I do want full control, I stick to my local solution, but that takes a lot more time.
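The x/y plot workflow mentioned above amounts to sweeping a grid of settings and comparing the outputs. A minimal sketch of enumerating such a grid follows; the parameter names `cfg_scale` and `steps` and their values are illustrative assumptions, and the actual image generation call is deliberately left out.

```python
from itertools import product

# Hypothetical axes for the sweep; names and values are assumptions.
cfg_scales = [5.0, 7.0, 9.0]
step_counts = [20, 30]

# One settings dict per cell of the x/y grid.
grid = [
    {"cfg_scale": cfg, "steps": steps}
    for cfg, steps in product(cfg_scales, step_counts)
]

for cell in grid:
    # A local pipeline would render one image per cell here.
    print(cell)
```

Each extra axis multiplies the number of images to render and inspect, which is where the time cost of the local approach comes from.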



