
The article seems to be suggesting that the Million Dollar Home Page has in some sense failed to fulfill its promise because many of the links are now dead. I don't follow that logic at all. To me it seems that the MDHP's job was to be an iconic piece of internet history, and it has entirely fulfilled its end of the bargain.



I don't see how the article suggests that. The article is using the Million Dollar Homepage as an example to draw attention to interesting things about internet history, including the complexity of archiving internet content. The page itself isn't being attacked, but rather used to exemplify a broader concern.

The article uses the Million Dollar Home Page as an example of an interesting historical artifact that tells us a lot about the internet of 2005 in terms of design, the context in which the page was created, etc.

However, the Million Dollar Home Page is also used as an example of the complexity of archiving internet content, i.e. archiving a page in a complete way must also involve archiving the pages it links to, otherwise the functionality of the archived page will decay over the years. This has important implications for its usefulness as a historical artifact.

Most of us remember what the internet was like when the Million Dollar Home Page was created, but many years from now, it will be challenging for people studying the history of the internet to really know what it was like back then unless we archive things in a way that preserves functionality.
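
To make the "archive the pages it links to" point concrete, here is a minimal sketch of the first step: listing the outbound links a complete archive of a page would also have to capture. The names `LinkCollector` and `outbound_links` are made up for illustration, and collecting `<area>` tags alongside `<a>` is an assumption about how such a page might be marked up. Actually fetching and storing the targets is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href targets from <a> and <area> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "area"):
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative hrefs against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def outbound_links(html, base_url):
    """Return de-duplicated http(s) links that point to a different host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    base_host = urlparse(base_url).netloc
    result = []
    for link in collector.links:
        parts = urlparse(link)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parts.netloc != base_host and link not in result:
            result.append(link)
    return result
```

Each URL returned is a page that would need its own snapshot, taken at the same time, for the archived copy to stay clickable.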


that's beautiful and all, but nothing works like that. a page is lots of things. the html and js can be preserved like an artifact in a museum, ok. but the rest, the server hosting it, its connection to the internet, it being http only, etc... has to be preserved like architectural works. yes, some building facade might still look like it was built in the 1800s, but everything in it is updated to conform to fire code, exit signaling, etc.

so, in the end, even if we managed to keep that page up, it would still be of very little interest to the people of the future.

the proper way to do things is to document everything. by good historians.


The article does not purport to have a solution to this problem. It is a contribution to an ongoing effort to describe the problem and understand it so that solutions may be pursued.


I feel like I'm going to get a proverbial reprimand for asking the obvious, but why, in all this time, has no solution been proposed?


"No perfect solution has been proposed." might be more accurate.


That sounds like an excellent opportunity for corporate IT departments to make a good haul of PR and general kudos, for making an effort to release their archived caches, wherever they may be stored due to data retention policies.

I have a huge soft spot for projects where you can get the most happening because you are not required to jump a known hurdle to usefully contribute.

Overnight I was fretting about the necessity of sorting out any residual legal issues that might attach to digging out old cache dumps.

But I was forgetting that the very same companies are commonly invested in just the tech to sort out the problem.

The problem is one which can present in all kinds of ways. Do you have any chance of finding adult content in your cache? Do you care about how much you are seen surfing the competition's websites? Will URLs reveal that anonymous forum login from 2002, slagging your rivals' benchmarking? Did you put Squid on your intranet or webmaster, without https because your predecessor thought it was on the non-routable private range? Did you use DNS in any way to point to document resources that are accessible to users via the proxy server and Squid?

Anyhow lots of data protection suites have been used to purge archives and remove any trace of activity or files best kept private.

That's how the latest HDS kit is sold: cluster FS with hyperconverged local nodes crunching security, audit and search results.

I'm sure I'm only wishing on a prayer that you can find great troves for redecorating the empty space that the WWW once pointed to.

But imagine that we really could fill enough gaps in the dead link forests!

Even just attempting could be a superb way to promote your storage products and bless your customers who offer up the raw stores with lots of great ways to engage regular journalists with the subject.

I totally love best-effort projects, and especially the ambitious, culturally interesting ones.

So does Joe Public.

I'm in.

Where do we start?

(If I have the time, I'm quite serious about this. My profession is advertising, not online but the traditional way of doing it. I have just got my head back from exploding with the multitude of ways to sell, promote, demo, boast, reminisce, predict, forecast, warn people about how society will crumble without personal all flash arrays... Last one might be pushing it a bit, but I see a fantastic deal of business on the back of the idea. It's beautiful because everyone is invited to participate, no vendor or company is locked out. So the public is not getting the boring official line and canned quotes. This is real people showing the technology, but also showing the extent to which we discard valuable culture. For thousands of years mankind grew as our means of recording documentation and thought and expression grew. Look now how easily we will throw it all away!!!! I know we can do better.)


One way to look at it is as emphasizing the power of link rot. It typically cost dozens or hundreds of dollars to put a single link in the MDHP, up to the max of $37k; the linkers were motivated and knew that much of the value was in having a valid link to click on. But barely a decade on, even this isn't enough for a majority of links to still work. Interests are too ephemeral, people move on too much, control of domains and hosting is too hard to maintain, even when you are paying thousands of dollars. Linkrot is inevitable.



