Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

As someone who's largely "OK" with morality clauses in otherwise liberal AI licenses, I think we should start calling these "weights-available" models to distinguish from capital-F Free Software[1] ones.

I'm starting to get irritated by all these 'non-commercial' licensed models, though, because there is no such thing as a non-commercial license. In copyright law, merely having the work in question is considered a commercial benefit. So you need to specify every single act you think is 'non-commercial', and users of the license have to read and understand that. Even Creative Commons' NC clause only specifies one; they say that filesharing is not commercial. So it's just a fancy covenant not to sue BitTorrent users.

And then there's LLaMA, whose model weights were only ever shared privately with other researchers. Everyone using LLaMA publicly is likely pirating it. Actual weights-available or Free models already exist, such as BLOOM, Dolly, StableLM[0], Pythia, GPT-J, GPT-NeoX, and CerebrasGPT.

[0] Untuned only; the instruction-tuned models are frustratingly CC-BY-NC-SA because apparently nobody made an open dataset for instruction tuning.

[1] Insamuch as an AI model trained on copyrighted data can even be considered Free.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: