> The fact that a private company (or companies) can do so in the first place with such great impact is an opportunity for disruption. The fact that it has already occurred is an opportunity for libraries like the Internet Archive.
No, it isn't. What IA stores long-term is relevant to future generations, less so to us now. What matters for us is that you can censor anything on IA, retroactively, by updating robots.txt.
IA won't be able to capitalize on this opportunity for disruption until copyright law gets completely overhauled. I don't see this happening soon, as the powers that be - both public and private - are all aligned in their interest to make IP protection even stronger.
> that you can censor anything on IA, retroactively, by updating robots.txt.
IA claims to have fixed that, and it's been a couple years since I've caught them respecting robots.txt. If you have examples of them respecting robots.txt more recent than, say, 2018... citation please? This was a serious problem in the past, but I had hoped (and believed) it was no longer a thing.
(I'm deliberately avoiding saying "They don't do that anymore.", since it's a low-probability, high-impact event, and I may just not have encountered it, but complying with robots.txt would be a really vile thing for a supposed library to do.)
Regardless of whether or not you personally have witnessed this, it is the Internet Archive's stated content removal policy:
> Do you collect all the sites on the Web?
> No, the Archive collects web pages that are publicly available. We do not archive pages that require a password to access, pages that are only accessible when a person types into and sends a form, or pages on secure servers. Pages may not be archived due to robots exclusions and some sites are excluded by direct site owner request.
> ...
> What is the Wayback Machine's Copyright Policy?
> The Internet Archive respects the intellectual property rights and other proprietary rights of others. The Internet Archive may, in appropriate circumstances and at its discretion, remove certain content or disable access to content that appears to infringe the copyright or other intellectual property rights of others. If you believe that your copyright has been violated by material available through the Internet Archive, please provide the Internet Archive Copyright Agent with the following information:...
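For anyone unsure what a "robots exclusion" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example.com URL are all made up for illustration, and the `ia_archiver` user-agent token is an assumption on my part (it is the name commonly associated with the Archive's crawler). This only shows how a compliant crawler would read such a rule; it says nothing about whether IA currently honors it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish. The "ia_archiver"
# token is an assumption, not something confirmed in this thread.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)


# A crawler identifying as ia_archiver is excluded from the whole site,
# while any other crawler falls through to the wildcard rule.
print(is_allowed(ROBOTS_TXT, "ia_archiver", "https://example.com/page"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/page"))
```

The retroactive part of the complaint is a policy choice on the archive's side, not something robots.txt itself expresses: the file only describes what a crawler may fetch now.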