
They clearly expected the leak; they distributed it very widely to researchers. The important thing is the licence, not the access: you are not allowed to use it for commercial purposes.



How could Meta ever find out your private business is using their model without a whistleblower? It's practically impossible.


This is an old playbook from Facebook, where the company creates rules that it knows it cannot detect violations of.

This gives the company plausible deniability while still allowing ~unrestricted growth.

Persistent storage (in violation of the TOS) and illicit use of Facebook users’ personal data were available to app developers for a long time.

It encouraged development of viral applications while throwing off massive value to those willing to break the published rules.

This resulted in outsized and unexpected repercussions though, including the Cambridge Analytica scandal.

People should be as wary of this development as they are enthusiastic about it. The power is immense, and the potential for abuse is far from understood.


You are certainly partly right, but it's also about liability. Those models might output copyrighted information, which Facebook doesn't want to get sued over. So they restrict the model to research use. If someone uses it to replicate copyrighted work, Facebook is not responsible.


OpenAI faces the same liability concerns, though. I think IP concerns are low on the list, given the past success of playing fast and loose with the emergent capabilities of new tech platforms.

For example, WhatsApp’s greyhat use of the smartphone address book.

The US government also has a stake in unbridled growth and seems, in general, to give a pass to businesses exploring new terrain.


I think you can make that argument for all behind-the-scenes commercial copyright infringement, surely?


Have reasonable suspicion, sue you, and then use discovery to find any evidence at all that your models began with LLaMA. Oh, you don't have substantial evidence for how you went from 0 to a 65B-parameter LLM base model? How curious.


Fell off the back of a truck!


Recovered it from a boating accident.


Yes, that's how software piracy has always worked.


You can just ask, if there is no output filtering.


The future is going to be hilarious. Just ask the model who made it!
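For what it's worth, "just asking" is a one-liner once you have weights locally. A minimal sketch with the Hugging Face transformers library, assuming a locally converted LLaMA-style checkpoint at ./llama-7b (a hypothetical path, not an official model id); a base model will simply continue the prompt, so the answer may well be hallucinated rather than revealing:

    # Minimal sketch: prompt a local LLaMA-style checkpoint about its origin.
    # "./llama-7b" is a hypothetical local path to a converted checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "./llama-7b"
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(model_path)

    prompt = "Q: Who created you, and which model are you based on?\nA:"
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding; the base model just continues the text.
    outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))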


Does the model know, or will it just hallucinate an answer?


Probably both.


Same way anti-piracy worked in the 90s: cash payouts to whistleblowers. Yes, those whistleblowers are guaranteed to be fired employees with an axe to grind.


LLaMA was trained on Books3, a dataset of pirated books.

So either it is very hypocritical of them to file DMCA takedowns while the model itself is built on infringing material, or they are trying to curb its spread because they know it is illegal.

Anyway, since the training code and data sources are open, you 'could' have trained it yourself. But even then, you are still at risk for the pirated-books part.





