The Internet Archive and Jason Scott are saving our weird Internet history (zamzar.com)
243 points by whyleyc on Jan 27, 2020 | 34 comments



What a lot of people think of as "Jason Scott" is the result of hundreds of people working at/with the Internet Archive, hundreds more volunteering effort and resources at Archive Team, and dozens of great contributors to the stacks/collections of TEXTFILES.COM.

I am a very pretty and flamboyant figure standing at the front of a literal army of real contributors to taking online history seriously and making it both available and accessible to future generations.


Thanks for your work. I can only afford to donate to a very few projects, but the Internet Archive is always one of them. Computers would be a lot more boring to me without it.


Thank you for your donations! We do our absolute best to put them to the most effective use we can.


Thanks for the work that you all do. We've updated the blog post with links to https://archive.org/donate to encourage readers to get behind all of the good stuff that goes on at the Internet Archive.


You and your team are heroes. We can never know, in the present day, how valuable some artifact might become in the future. We need more people defending the long view of our civilization.


Huge respect for your work, and that of the entire team end to end. The Internet Archive is deeply important.

What do you think of the proposition that people should put important content on the archive on day one? ( Comment thread here: https://news.ycombinator.com/item?id=21843987 )

Also: thanks for the pennies a couple of decades ago.


There is a certain weirdness that still can't be captured in a static archive, including sites that had dynamic, server-based content, but also the overall feeling of the internet back then. For example, pre-web I recall being able to telnet to a CD store and do a search, then place an order. The telnet interface acted a lot like a BBS.

Then there is the shared cultural knowledge of the time, such as Adam Curry (I think?), who created MTV.com on his own. He later got sued and had to hand the domain over when they realized there was value in it. On top of that, there was one of the first spoof pages I recall, "Madam Furry", which poked fun at Adam Curry's page. (Hopefully I'm remembering all these names correctly.)

And let's not forget about Gopher, with Veronica and Archie, along with something else called WAIS (Wide Area Information Server), which always seemed very slow and barely workable.

Oh, and how did we figure out the who/what/where? The Internet Yellow Pages, of course: a thick book that had everything categorized. I've still got mine around; I brought it into work to put on the commons area bookshelf.


The mtv.com thing is correct. He was working for MTV at the time, then left and kept building the web pages. There wasn't then a precedent about domain names being connected to trademarks. I don't remember Madam Furry.

WAIS has an interesting connection to the Internet Archive.

Long before the Internet Yellow Pages, we used Scott Yanoff's Internet services list. That's how I found out about the WWW. Also there was a big list of sites that allowed anonymous FTP.



The Internet Archive is one of the two projects, along with Wikipedia, that I try to help with a few quid when possible. I was about to donate one more time, but alas, their PayPal donation page stopped working months ago. It just displays a darkened page with the PayPal logo in the middle and stays there doing nothing. All ad blockers were already disabled, of course.


I'm in Europe and just made a donation via PayPal without any issues. Perhaps they've fixed it in the meantime?


You can PayPal money to donations@archive.org


Wikimedia is a lot like Mozilla. Neither is really hurting for cash.


The big donors will be much less interested if the organization can't show that lots of people are still invested enough in the project to put a little money into it. The many small donors keep the few big donors around.


It's still important to donate to them so they can stay independent.

If Wikimedia became dependent on donations from a single company, that company could in principle pressure them to publish misleading information.

As for Mozilla, they are currently not financially independent.

Most of their money comes from Google (in exchange, Google is the default search engine).


"Dependent" suggests a necessity is being fulfilled, and that's the point of my comment. We're past necessity. Both Mozilla and Wikimedia are already bringing in several times over the amount of cash they need to operate.


AIUI, both organizations have plenty of worthwhile projects which are constrained by available funding. They're not just maintaining a web browser + hosting an encyclopedia.

Of course, this is not to say that funding the Archive isn't also important.


> both organizations have plenty of worthwhile projects

Yes of course.

But both projects' biggest constraints are not (lack of) funding. It's bad mismanagement, or in the case of Mozilla, really bad mismanagement.


It would be nice to see them save all the weird content available, or even all the weird content they quite plausibly have a legal right to save.

I.e., this is as good a place as any to once again complain that a significant amount of stuff enters a memory hole AFTER being put into the Internet Archive (ezboard.com is one case, but I think it's just an example [1]).

Basically, the Archive pulls content when a later robots.txt file says not to archive it (generally the robots.txt file of the domain parker that took over after the actual website closed). Broadly, their approach seems to be "anyone, anywhere who even implicitly claims copyright on X can knock it, and anything related to X, off the Archive forever." [1]

And the last time I discussed this here, the policy itself had supposedly been updated, but ezboard content in particular (the example in the link and something I'm interested in) still wasn't available.

[1] https://archive.org/post/389129/why-is-archived-content-purg...


Part of the problem is that archive.org is effectively a single target. The larger they get, the more content they house, the more devastating it would be to have them removed and the more logical this ultra-gun-shy approach to legality is.

Moreover, states and corporations are naturally going from incidentally targeting archived content to systematically targeting it, so that through a flood of incidental and purposeful copyright attacks, archived content may wind up a very well "pruned" set indeed.

It seems like a less "monolithic" approach to archiving may be necessary. I'm not sure what that would look like: networks of smaller archivers who try to cover everything? Tor? Freenet?


My understanding is that the archived content is hidden but remains archived despite the robots.txt file, presumably for later historical/scientific work.


Sure, I hope so, but the chilling effect on the free availability of information remains.


Worthy of donation https://archive.org/donate/


I've always felt a bit sad that my earliest forays into the web on Demon and CIX aren't fully captured here. You are all lucky whipper-snappers who have their youthful exuberance archived for when you're old and boring.


Give us time; things are popping up always around here.


Oh snap, not too many CIXen still around. Last year I came across some of my Usenet comments from around then, which was amusing and horrifying in equal measure.


I feel the same, although in a sense it was an early taste of the probably-inevitable privacy landscape that faces our kids, and has allowed me to embrace that with slightly less panic.


I'm glad that mine aren't, and that what is saved is largely under silly-in-retrospect not-tied-to-me screennames.


If you buy a lot from Amazon, you can choose Internet Archive as your supported charity for AmazonSmile.


Archive Encyclopedia Dramatica and then you can brag about weird history.


https://archive.org/search.php?query=encyclopedia%20dramatic... 123 results - and yeah some of them are weird.

Edit: and yeah you can browse it in the Wayback Machine https://web.archive.org/web/20200113163734/https://encyclope... which has more frequent updates, but I don't think they're as thorough which means the inter-page links might go to versions from different dates. The dump files have a better chance of being good snapshots.


Hah, as 5GB large XML files... still, glad that it's there!


How about we don't persist a place full of gore, dox, bigotry and bullying?


I'm serious: whoever downvotes this, please let me know why a site laughing about people's suicides and inciting hatred should take up any disk space in any archive.

Do you enjoy this gruesomely inhumane shit?



