
The problem is that the Internet Archive exists on legally shaky ground. Neither they nor anyone else has a right to archive copyrighted web content and display it to the public. They manage to continue doing so in part because they're clearly non-commercial. They also manage to continue doing so because they voluntarily honor robots.txt, even retroactively.

Libraries/archives have no special exemption from copyright law, which is actually a good thing, because otherwise libraries would presumably need to be licensed in some way by the government to qualify for special treatment.




Why not look at WHOIS information when getting an update, and then class a site as 'different' based on whether that changes? In most cases, a new domain owner means the site isn't the same as the earlier versions.

You'd then just have to stop the archive indexing/showing content after the WHOIS information changed, while leaving the stuff before it intact. Maybe you'd then have a nice form to report pages you want removed/hidden (for the edge cases), or even a separate robots.txt/meta declaration you can make confirming you're the same person that owns the site. After all, most of the reasons why sites go missing aren't deliberate attempts to rewrite history, but domain squatters not wanting their holding pages indexed.

Feels like it'd be so easy to implement robots.txt in a more logical way on the Internet Archive.
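
A rough sketch of the rule described above, in Python. Everything here is hypothetical: the `Snapshot` type, the `registrant` field and `visible_snapshots` are made-up names for illustration, not anything the Internet Archive actually exposes. It assumes each capture was stored alongside whatever registrant string WHOIS returned at the time, with `None` standing in for an anonymized or missing record, and it treats `None` conservatively (as if the owner had not changed).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    captured: date
    registrant: Optional[str]  # None = WHOIS record anonymized or missing (assumption)


def visible_snapshots(snapshots: List[Snapshot], robots_blocks_now: bool) -> List[Snapshot]:
    """Return the captures that stay public under the WHOIS-change rule."""
    ordered = sorted(snapshots, key=lambda s: s.captured)
    if not robots_blocks_now:
        return ordered  # no exclusion requested, nothing to hide

    # Treat the registrant on the newest capture as the current owner.
    current = ordered[-1].registrant if ordered else None
    if current is None:
        # Cannot tell who owns the site now: fall back to hiding everything.
        return []

    # Walk backwards to find where the current owner's tenure starts.
    cutoff = len(ordered)
    for i in range(len(ordered) - 1, -1, -1):
        owner = ordered[i].registrant
        if owner is not None and owner != current:
            break  # a different, known owner: older captures stay visible
        cutoff = i  # same owner, or unknown (hidden to be safe)
    return ordered[:cutoff]
```

With two captures registered to one owner followed by two registered to a squatter who now blocks crawlers, only the squatter-era captures would be hidden. The hard part is outside this function: it only works if the stored registrant strings are meaningful in the first place.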


It's been suggested, but there's no way to automatically do it correctly. The WHOIS info might be anonymized, in which case a change means nothing at all. It might just be someone's name and address, with no way of verifying who that someone works for. Also meaningless. Better just to default to something safe, and spend your manpower on something more important.


    > the Internet Archive exists on legally shaky ground
Not least because of the EU's 'right to be forgotten'.


That doesn't seem likely to apply.



