StableStudio, an open-source release of DreamStudio (stability.ai)
233 points by jacooper on May 17, 2023 | 48 comments



Getting to the meat of the release (from their github [1]):

> What's the difference between StableStudio and DreamStudio?

    Not much! There are a few tweaks we made to make the project more community-friendly:
    - We removed DreamStudio-specific branding.
    - All "over-the-wire" API calls have been replaced by a plugin system which allows you to easily swap out the back-end.
        - On release, we'll only be providing a plugin for the Stability API, but with a little bit of TypeScript, you can create your own.
    - We removed Stability-specific account features such as billing, API key management, etc.
        - These features are still available at DreamStudio's account page.
Doesn't look like any of their models are getting open sourced alongside the app, but they're making it possible to swap out the image-generation backend using plugins.

[1] https://github.com/Stability-AI/StableStudio
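To make the plugin idea concrete, here's a hypothetical sketch of what a swappable back-end plugin could look like in TypeScript. The interface names (`BackendPlugin`, `GenerationRequest`, the `/generate` route) are all made up for illustration; the real plugin API lives in the StableStudio repo and will differ in detail.

```typescript
// Hypothetical sketch of a back-end plugin; the actual StableStudio
// plugin interface is defined in their repo and differs in detail.

interface GenerationRequest {
  prompt: string;
  width: number;
  height: number;
  steps: number;
}

interface GenerationResult {
  // Base64-encoded image data for each generated image
  images: string[];
}

interface BackendPlugin {
  name: string;
  generate(req: GenerationRequest): Promise<GenerationResult>;
}

// A toy plugin that forwards requests to a self-hosted endpoint,
// standing in for "swap out the back-end with a little TypeScript".
function createLocalPlugin(baseUrl: string): BackendPlugin {
  return {
    name: "local",
    async generate(req: GenerationRequest): Promise<GenerationResult> {
      const res = await fetch(`${baseUrl}/generate`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(req),
      });
      if (!res.ok) throw new Error(`generation failed: ${res.status}`);
      return (await res.json()) as GenerationResult;
    },
  };
}
```

The point is just that the UI only ever talks to the `generate` method, so any server (hosted or local) can sit behind it.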


The models will be open-sourced too!

They aren't quite ready yet...

https://twitter.com/EMostaque/status/1641795867740086272


The claim that Stable Diffusion XL will be open sourced has to be taken with a pinch of salt.

Stability AI already claim Stable Diffusion (non-XL) is an “Open Source Model”[1]. It's not. The code is open source but the model is proprietary.

[1] https://stability.ai/


> The code is open source but the model is proprietary.

Are you saying that because the Open RAIL M license doesn't conform to the OSI definition of "open source"?

I agree that it's not "open source", but I don't think that means it's "proprietary".


Yes, the OpenRAIL licences are deliberately not open source. Being open source or not is binary, and proprietary is the antithesis of open source.

There are of course additional ways to describe specific proprietary things, like distributable in this case. Distributable but non-free, because the software license is morally judgemental, and limits 'fields of endeavour'.


I'm with you on "not open source", but I don't think "proprietary" works as the opposite of "open source" here.

The dictionary definition of "proprietary" tends towards "one that possesses, owns, or holds exclusive right to something" - that doesn't quite fit here, because models licensed under OpenRAIL aren't exclusively restricted to their owners.

They have terms governing how they can be used that are more restrictive than open source licenses, but they don't enforce "exclusive" usage by their owners.


Source available is the fitting term here.


Model weights can't really be described as source code though. The equivalence isn't exact, but I'd describe the weights more as the compiled binary, with the training data & schedule being the source (which is sort of under an open source license, with the complication of LAION's "it's just links"). The fact it costs $1 million to "compile" isn't relevant.

This isn't to defend Stability particularly though - they've been getting increasingly slow and restrained in their model releases. Charitably because they're attracting a lot of heat from political and anti-AI aligned groups. Uncharitably because they've taken a lot of funding now.

(Edit: typo)


> Model weights can't really be described as source code though. The equivalence isn't exact, but I'd describe the weights more as the compiled binary, with the training data & schedule being the source

I think this is a really interesting discussion! I see where you're coming from, but I'm minded to disagree in part.

For one, I think it's possible to release model weights under a liberal licence, yet train on proprietary data. (ChatGPT is trained on oodles of proprietary data, but that doesn't limit what OpenAI do with the model). Normally, obviously, the binary is a derivative work of the source.

Also, the GPL defines source code as 'the preferred form for modification'. I don't disagree that model weights are a black box. But we've seen loads of fine tuning of LLaMA, so we don't always need to train models from scratch.

Ideally, of course, having both unencumbered training data and model weights would be perfect. But in the interim, given I don't have that million dollars, I'll settle for the latter.


Yeah, neither view is a perfect fit. Another example is vision transformer backbones, where a common generic base weight is used to fine-tune all sorts of different processes (segmentation, image to text, etc). The terminology (and licenses) haven't really kept up.

A properly unencumbered model would be my preference too. The community generally seems a bit laissez-faire with license compliance though, so the restrictions currently don't generate much push back. (Plus it's not totally clear that you can copyright model weights at all, given they're the output of an automatic process).


They claim their other models are "open source" too, but their licence violates several points of the open source definition.


very cool


This is still really nifty.

> All "over-the-wire" API calls have been replaced by a plugin system which allows you to easily swap out the back-end.


Kudos to Stability for focussing on open source when "Open"AI is trying to dig a moat around their castle.


If it's legit. Based on some of the earlier comments, it seems like this claim might be bullshit.


How is Stability AI going to make money? What they provide is super cool and VCs are into it but at some point they need to start giving them returns. Is the plan to get acquired?


Their business plan is to build custom models for clients/companies who want to own models built on their own data. No comment on the viability of such a business model.

https://twitter.com/EMostaque/status/1649152422634221593


You can buy credits to generate images using DreamStudio. In theory, this seems more convenient than running software locally. But last I checked, their image quality wasn’t all that interesting compared to MidJourney, so I stopped using it.

The Discord-based UI for MidJourney is clunky and unreliable, and waiting for images to generate is annoying. Image quality has improved quite a bit, but it’s no better at following instructions. So it seems like there’s an opening for a competitor, but the competition would need to actually be better.


Some ideas:

AI platform - They could charge a subscription fee for access to premium features on their open-source StableStudio platform.

Consulting services - They could charge businesses for AI and machine learning consulting services on a per-project or retainer basis.

Acquisition - A larger company with more resources could acquire Stability AI to scale its operations and reach a wider audience.


When even users are recommending a subscription service, that's when you know we have become brainwashed into thinking it's a healthy or required practice.


They're a B2B finetuning and fully managed model solution. Open Source is free marketing for them.


That’s some expensive free marketing


Stable Diffusion cost only $1 million to train; that's like five marketing people at SF salaries. Not sure they'd achieve better results than SD got them.


Citation needed? That seems incredibly low. Is it considering the cost of paying researchers?


According to the man himself, Emad


>That’s some expensive free marketing

No, it's free.

They need to make the models already in order to sell them as managed models.

Giving them away costs nothing once they're made.


I wonder if they will merge with Huggingface


They don't sell an API? I'm sure they could bring a lot of customers to a computing provider.


No local inference support as of yet at least.

    Here are a few things we’d be excited to support…

        Local inference through WebGPU
        Local inference through stable-diffusion-webui
        Desktop installation
        ControlNet tools
        Tell us what you want to see!


Since gyre.ai (my project) re-implements the Stability APIs, it should "just work" as a local inference option.

I haven't tested though, and the API spec has lots of under-specified bits, so could be some stuff I need to tidy up.

(Alternatively, despite the post content, there is already a webui plugin in the github repo; no idea of its current state.)


Doesn't that just mean that local inference runs on the CPU?


Will this become more popular than automatic1111 webUI ?


Looks like they are trying to make it work with a1111.

  Here are a few things we’d be excited to support…
   Local inference through WebGPU
   Local inference through stable-diffusion-webui
   Desktop installation
   ControlNet tools


I wouldn't bet on it. Automatic1111 already has a strong community and growing ecosystem. Stability.ai would have to try really hard and offer something significant to win people over.


Very curious about this as well. I'd imagine not, unless something dramatic happens, as Automatic has gathered the bulk of public support and interest.


I’d rather they focus on getting their hosted solution/API on par with Automatic. It needs ControlNet, it needs seamless images, and all the other goodies that are now pretty standard.

I like a hosted solution. Can we just make it much better please? If I am paying you to host a service I don’t want to also have to go an give my labour to create the software. That’s why I wanted a hosted service in the first place.


People keep hyping this Automatic UI, but having tried it, it's honestly so awful. Can't believe people are saying it's better than DreamStudio.


The UI is awful, the functionality is not.


For serious generation control, nothing else comes remotely close. It is the most impressive github project I have ever installed tbh. The flexibility of the system is unmatched in the image gen space. It feels like I am back in the 90s.


Agreed, the UI is not good. But DreamStudio is also quite poor.

The functionality of Automatic, however, is top notch.

SD without ControlNet is now totally obsolete.


So StableStudio is just a plugin-extendable user interface for interacting with an image generation model (e.g. via queries against some opaque remote endpoint), but does not provide the actual image generation model?

So the actually interesting meat is not part of the offer?


Don't discount the value (or difficulty) of a great user interface

There are multiple servers that provide the backend "meat" already: gyre.ai (my personal project); Automatic1111 webui; InvokeAI; plus all the hosted APIs to name just a few.

There are no UI interfaces I'd consider great. I think Flying Dog aistudio[1] is the best (although I'm biased), but still early days. The UI part of WebUI is ... functional(ish).

DreamStudio / StableStudio is super basic in capability at the moment, but it's a space with tons of room for improvement.

[1] https://www.youtube.com/watch?v=FcGlA38M52M

(Edit: markdown doesn't work here)


Can it be used via an API?


If you want to use Stable Diffusion t2i and i2i today via a self-hosted API, it's very easy to do so with auto1111: just click the API link at the bottom of the gradio app, and there's extensive documentation on the endpoints.
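For reference, a minimal sketch of such a call in TypeScript, assuming the webui was launched with the `--api` flag on its default address (127.0.0.1:7860). The `/sdapi/v1/txt2img` endpoint accepts many more parameters than shown here; the `buildPayload` helper and its defaults are my own illustration, not part of the webui.

```typescript
// Minimal txt2img request against a locally running AUTOMATIC1111 webui
// (started with --api). The full parameter set is much larger; see the
// endpoint docs linked from the bottom of the gradio app.

interface Txt2ImgPayload {
  prompt: string;
  negative_prompt?: string;
  steps?: number;
  width?: number;
  height?: number;
}

// Illustrative helper: fill in some common defaults for a request body.
function buildPayload(
  prompt: string,
  overrides: Partial<Txt2ImgPayload> = {}
): Txt2ImgPayload {
  return { prompt, steps: 20, width: 512, height: 512, ...overrides };
}

async function txt2img(
  payload: Txt2ImgPayload,
  base = "http://127.0.0.1:7860"
): Promise<string[]> {
  const res = await fetch(`${base}/sdapi/v1/txt2img`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`txt2img failed: ${res.status}`);
  // The response carries base64-encoded images in an `images` array.
  const data = (await res.json()) as { images: string[] };
  return data.images;
}
```

Usage would be something like `await txt2img(buildPayload("a cat in a hat"))`, which resolves to an array of base64 image strings to decode and save.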


I think it can only use other APIs itself. Looks like they want to make it work with a1111's API, and they will ship a plugin that talks to Stability's paid API.


Did you miss the large API link in the header?


Unkind response, and the API link goes to info about Stability's API, not info about an API for interacting with the StableStudio app.


You don't use APIs to interact with apps. Apps use APIs to do actions.



