They clearly expected the leak; they distributed it very widely to researchers. The important thing is the licence, not the access: you are not allowed to use it for commercial purposes.
You are certainly partly right, but it's also about liability. Those models might output copyrighted material, which Facebook doesn't want to get sued over. So they restrict the model to research. If someone then uses it to replicate copyrighted work, Facebook isn't responsible.
OpenAI faces the same liability concerns, though. I think IP concerns are low on the list, given the past success of playing fast and loose with the emergent capabilities of new tech platforms.
For example, WhatsApp’s greyhat use of the smartphone address book.
The US government also has a stake in unbridled growth and seems, in general, to give a pass to businesses exploring new terrain.
They can have reasonable suspicion, sue you, and then use discovery to find any evidence at all that your models began with LLaMA. Oh, you don't have substantial evidence for how you went from 0 to a 65B-parameter LLM base model? How curious.
Same way anti-piracy worked in the 90s: cash payouts to whistleblowers. Yes, those whistleblowers are guaranteed to be fired employees with an axe to grind.
LLaMA uses Books3, a source of pirated books, to train the model.
So either it is very hypocritical of them to apply the DMCA while the model itself is illegal, or they are trying to somewhat stop it from spreading because they know it is illegal.
Anyway, since the training code and data sources are open source, you 'could' have trained it yourself. But even then, you would still be at risk over the pirated books.