> Adding an LLM abstraction layer doesn't make the existing laws (or social/mora...

PragmaticPulp · on April 12, 2023

> for possibly stealing artist's work in the open source domain

The provenance of the training set is key. Every LLM company so far has been extremely careful to avoid using people's private data for LLM training, and for good reason.

If a company were to train an LLM exclusively on a single person's private data and then use that LLM to make decisions about that person, the intention is very clearly to access that person's private data. There is no way they could argue otherwise.

dragonwriter · on April 12, 2023

> Every LLM company so far has been extremely careful to avoid using private people's data for LLM training

No, they haven’t. (Now, if you said “people's private data” instead of “private people's data”, you’d be, at least, less wrong.)

softfalcon · on April 12, 2023

I've spoken with a lawyer about data collection in the past and I think there might be a case if you were to:

- collect thousands of people's data

- anonymize it

- then shadow correlate the data in a web

- then trace a trail through said web for each "individual"

- then train several individuals as models

- then abstract that with a model on top of those models

Now you have a legal case that it's merely an academic research into independent behaviors affecting a larger model. Even though you may have collected private data, the anonymization of it might fall under ethical data collection purposes (Meta uses this loophole for their shadow profiling).

Unfortunately, I don't think it is as cut and dry as you explained. As far as I know, these laws are already being side-stepped.

For the record, I don't like it. I think this is a bad thing. Unfortunately, it's still arguably "legal".

Wowfunhappy · on April 12, 2023

I realize that data can be de-anonymized, but if the same party anonymized and de-anonymized the data... well, IANAL, and you apparently talked to one, but that doesn't seem like something a court would like.