
Or, alternatively, copyright risk is a major concern for real customers, and this is a major step forward in addressing that.

Not everything needs to be so cynical. What’s good for investors can be good for users as well.




You’re talking about the issue as if it weren’t your posts, your pictures, or the work of artists you’re into.

This isn’t a “copyright risk”, it’s a Silicon Valley corporation getting away with declaring copyright just… obsolete.


They are my content, actually, from the last ~15 years of being on the internet. I don't care about it personally, and even if I did, it's really obviously fair use, so even if I find it objectionable I can't legally compel anyone to stop.


What on earth is fair use about a public company deriving its whole valuation from processing content taken from the internet without any regard for licensing or robots.txt rules?

The technology is cool, I get it. But saying “I don’t mind, they can use my content” is on par with “I don’t need privacy, I have nothing to hide” in terms of statement quality.


I can see the specific argument you're making as something along the lines of: just because some people are OK with their content being used by large corporations to train their for-profit LLMs doesn't mean it should be OK for those companies to take my content and use it to train their LLMs without my permission.

But, I suppose I see it the other way as well. Just because you don't want large corporations to train their LLMs using your content doesn't mean that society has to settle on making it illegal. As an imperfect analogy: just because some people don't want to have their picture taken when they are out in public doesn't mean that taking pictures of people in public ought to be illegal.

So I think we have to get past the "I don't like this, so it is evil" kind of thinking. As in the analogy to pictures of people in public, there is some expectation of privacy we give up when we go out in public. Perhaps there is an analogy there to content we freely release to the public. Perhaps we need stricter guidelines on LLM attribution. I don't have an answer, but I'm not going to let this decision be made de facto by the strong emotions of individuals who have already made up their minds.


Approximately all of that content becomes significantly more useful to society when fed into a blender and released as ChatGPT, as long as the latter is generally available, even accounting for it being for-profit and for any near-term consequences of propping up an SV company. By significantly I mean orders of magnitude, and that's going by the most naive method of dividing the utility flowing from ChatGPT by the size of the training data.

So yeah, it may not be ideal, but it's so much in the general public interest that bringing up copyright seems... in poor taste.

(Curiously, I don't feel the same about image models. Perhaps that's because image models compete with current work of real artists. LLMs, at this point, don't meaningfully compete with anyone whose copyright their training possibly infringed.)


I agree with you on OpenAI, I disagree on LLMs in general. I wish someone would use all the content available ever to train a great open-source LLM.


> This isn’t a “copyright risk”, it’s a Silicon Valley corporation getting away with declaring copyright just… obsolete.

While this does not fully represent my views on what's a very complex issue, since you phrased it like this, I feel compelled to say: about damn time someone did it.


Or, if it turns out that copyright is a major issue, OpenAI now has standing to fight any claims.


did an AI write this


Nice jab, but it doesn't sound like AI, so it's not as cutting as you probably thought.



