Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I think scale is what changes the nature of the thing. At the point where you're having a machine consume billions of documents I don't think you could reasonably call that reading anymore. But what you are doing in my eyes is indexing, and the legal basis for that is heavily dependent on what you do with it.

If a human reads it that would be a reproduction of the work, but if you serve that page as a cache to a human you're okay, usually.

If you compile all that information in a database and use it to answer search queries that's also okay, and nothing forbids you from using machine learning on that data to better answer those search queries.

Both of the above are actually being challenged right now but for the time being they're fine.

But that database is a derivative work, in that it contains copyrighted material and so how you use it matters if you want to avoid infringement — for example a Google employee SSHing to a server to read NYT articles isn't kosher.

What isn't clear is whether the model is a derivative work. Does it contain the information or is it new information created from the training data Sure, if you're clever you could probably encode information in the weights and use it as a fancy zip file but that's a matter of intent. If you use Rewind or Windows Recall and it captures a screenshot of a NYT article and then displays it back to you later is that a reproduction? Surely not. And that's an autonomous system that stores copywritten data and regurgitates it verbatim.

So if it's impractical to actually use it for piracy and it very obviously isn't anyone's intent for it to be used as such then I think it's hard to argue it shouldn't be allowed, even on data that was acquired through back channels.

But copyright is more political than logical so who knows what the legal landscape will be in 5 years, especially when AI companies have every incentive to use their lawyers to pull the ladder up behind them.



Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: