Forever Storage (querna.org)
27 points by pquerna on June 12, 2010 | hide | past | favorite | 16 comments



The novel about this is Neal Stephenson's Anathem... where people strive to keep just the necessary information alive (like, "Where did we store the atomic waste a couple of thousand years ago?").

No matter how your data is going to be stored, it will eventually have to be copied to another medium. And if you want to store it "forever", you will get data corruption, either because of transmission errors or because the storage medium fails. Keeping the data redundant is not enough for guaranteed data safety, as no matter how many copies you create, one day all of them can fail at the same time.

But you can take steps to ensure that your data is realistically safe for, say, 1000 years (which is not by any means "forever", but then who but an archaeology AI will read your data at that time?). The Long Now Foundation is thinking about stuff like this (http://www.longnow.org/).

An interesting question is, how do you ensure that your data will not be intentionally wiped out because it is deemed (religiously, politically, legally) offensive? A metal band's CD cover is now considered dangerously close to child porn hereabouts (Scorpions, Virgin Killer). That may be enough to make some people erase your data for decency's sake. A dead man cannot fight a cease and desist order.

A different view on the matter would be: you are going to die, and your data is going to die eventually. And even if it is around, it's likely that no one will want to read it anyway. But your existence will have consequences for the rest of (human) history, as just by living you provide an input to the human race. This input may be in the form of biological data (children) or pure information (a blog article about "forever storage" which inspires the next great startup founder to do great things). Just go on, do something positive besides leaving a huge carbon footprint and atomic waste that lasts for a million years.


Long Now's Rosetta Stone is the perfect project. Monel, their chosen alloy, doesn't seem to have any issue being stable for 1000 years. Their human-visible text is obviously too bulky to store a reasonable database of 500GB or 1TB, but there is no reason it couldn't support a more compact encoding.

(Alternately, you could use optical storage and hope redundancy and ECC will be enough. For example, if you want to store 500GB, you could take twenty 50GB Blu-ray discs, split the data into 25GB chunks, and then generate 25GB of ECC data for each disc, such as http://en.wikipedia.org/wiki/Parchive files. Do this every year, and you'll be able to correct files back and forth between archive-sets as well as within individual discs. Even if the Blu-ray discs degrade, there ought to be enough of a trace left for things like electron microscopes to pick up.)
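
To make the redundancy idea concrete, here is a minimal sketch in Python. It is not what Parchive does (PAR2 uses Reed-Solomon codes and can survive several lost blocks); it swaps in plain XOR parity, which can rebuild exactly one missing chunk. The function names and the sample chunks are made up for illustration.

```python
from functools import reduce

def xor_parity(chunks):
    """Return the byte-wise XOR of a list of equal-length chunks."""
    if len({len(c) for c in chunks}) != 1:
        raise ValueError("all chunks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def recover_missing(surviving_chunks, parity):
    """Rebuild the single missing chunk from the survivors plus the parity chunk."""
    return xor_parity(list(surviving_chunks) + [parity])

# Hypothetical stand-ins for three 25GB chunks.
chunks = [b"disc-one", b"disc-two", b"disc-3!!"]
parity = xor_parity(chunks)

# Pretend the second chunk became unreadable and rebuild it.
rebuilt = recover_missing([chunks[0], chunks[2]], parity)
```

XOR parity only tolerates one lost chunk per parity set, which is why a real archive would probably want Reed-Solomon style recovery data instead.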


I wonder if there could be a distributed/open solution to this... not so easy, probably. Some BitTorrent-style distributed storage comes to mind, where you have to provide x times the amount of data you would like to store to others, with y% availability.

Like a data-storage ponzi scheme, but it might be sustainable because storage and bandwidth keep getting cheaper.
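
A rough back-of-the-envelope sketch of how x and y interact, assuming (optimistically) that peers fail independently and that each of the x copies is a full replica. The function name is hypothetical.

```python
def replication_availability(copies, node_availability):
    """Probability that at least one of `copies` full replicas is reachable.

    Assumes each peer is up with probability `node_availability`,
    independently of the others (a strong assumption in practice).
    """
    if copies < 1 or not 0.0 <= node_availability <= 1.0:
        raise ValueError("need copies >= 1 and 0 <= node_availability <= 1")
    return 1.0 - (1.0 - node_availability) ** copies

# Example: 3 replicas on peers that are each online 90% of the time.
three_copies = replication_availability(3, 0.9)
```

Correlated failures (everyone losing interest in the network at once) are exactly what this simple model ignores.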


Tahoe-LAFS seems to fit the bill... erasure coding for the win!

http://tahoe-lafs.org/trac/tahoe-lafs
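
Why erasure coding helps, as a hedged sketch: with a k-of-n code, any k of the n shares are enough to rebuild the file. Assuming each share sits on a node that is independently available with probability p, the chance of recovery is a binomial tail. This is generic math, not Tahoe-LAFS code, and the 3-of-10 parameters below are just an illustrative choice.

```python
from math import comb

def erasure_availability(needed, total, node_availability):
    """Probability that at least `needed` of `total` shares are reachable.

    Assumes every share lives on a separate node that is up with
    probability `node_availability`, independently of the others.
    """
    if not 1 <= needed <= total:
        raise ValueError("need 1 <= needed <= total")
    p = node_availability
    return sum(
        comb(total, i) * p**i * (1 - p) ** (total - i)
        for i in range(needed, total + 1)
    )

# Illustrative 3-of-10 encoding on nodes that are each up 90% of the time.
three_of_ten = erasure_availability(3, 10, 0.9)
```

A 3-of-10 encoding stores about 3.3 times the original size, so this compares against roughly three full replicas at the same storage cost.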


Putting all those 250GB hard drives lying around to good use!


Technically I agree this is completely possible; the challenges are more likely to be economic. How do you make sure the company exists 1000 years in the future? That question poses itself especially for a potential startup providing this service, but would even be relevant if an established company were to do it.

If I were to trust anyone with doing this, it would probably be the Catholic Church. Can't think of anyone else with a sufficient track record.


Yes, surviving a societal collapse is pretty hard.

I think a societal collapse, in which you assume the basic structure of capitalistic motivations has failed, might only have 2 real outcomes:

* Star Trek: Money becomes mostly pointless, and you hope that, through the altruism of other humans, the data is restored.

* Mad Max: Welp, sucks for your data, it'll be lost in the sands of time.

I think the more important perspective is, if you can make the first 200 years, the chances of someone else picking up the data increase massively -- right now most data is created and destroyed in years, never mind decades.


What if it wasn't a company? What if you created an "heirloom box" that stored all your important data?

Your children could each have their own box to store their important data. Their box will be synced with your data as well.

The grandchildren would have boxes that sync with their parents' and grandparents' data.

I could see these advantages:

1) As time goes on, these boxes contain a greater number of people's important data and become more valuable.

2) It is also somewhat secure, since the data is only in the hands of your children and grandchildren while you are alive.

3) Each generation would have a newer version of the box. The hardware is never obsolete.


All it would require is a society that honors, with full force, contracts with the dead. The problem is that this also creates inertia in our system of law: for example, what if someone signed their descendants into slavery 400 years ago? If we honored that contract, it would mean making an exception to laws against slavery.

Similarly, whatever mechanisms or means the company is currently depending on to preserve its data might be made illegal in the future (because the information will have to be copied or updated to newer formats to maintain readability, no matter what it is, so there will necessarily be an intelligent component doing the translation). If it just hires normal people to upgrade the data, capitalism might be abolished in the future. If it depends on religious devotion, religion might be disallowed from preserving private knowledge. If it creates an AI, that AI might later be found to be a slave, and set free...


If what you have is truly worth preserving for that long, transcode it into http://en.wikipedia.org/wiki/Junk_DNA (several times over to allow for mutations) and splice it into sperm/egg/embryo cells at fertility clinics. Let your "progeny" multiply across the planet with Nature taking care of replicating your data through space and time. For retrieval: I suppose that in the distant future, DNA databases will probably be preserved for easy lookup by sequence and thus so will your data.

The human genome is ~3B base pairs = ~6B bits (2 bits per base pair) = ~715MB. Only 1.5% constitute protein-coding genes. Let's say some of that "junk DNA" is actually useful; we could still get away with ~200MB of payload per human cell. Further scaling can be obtained by targeting other organisms with large populations (bees, rabbits, cockroaches, mosquitoes, bacteria) or even by synthesizing new viruses or parasites that hitch along with humans for an evolutionary ride while preserving your data at the same time ...
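
A toy sketch of the transcoding step, assuming the simplest possible mapping of 2 bits per base (A=00, C=01, G=10, T=11). The mapping and function names are made up for illustration; a real scheme would add heavy redundancy against mutations and avoid biologically awkward sequences.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11

def bytes_to_dna(data):
    """Encode bytes as a base string, 4 bases per byte, high bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(sequence):
    """Decode a base string produced by bytes_to_dna back into bytes."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

encoded = bytes_to_dna(b"forever")
decoded = dna_to_bytes(encoded)
```

At 4 bases per byte, a 200MB payload would need on the order of 800 million bases.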

(Disclaimer: I'm in the middle of an X-files marathon)


It's very easy to keep the data intact, if you know when, exactly, you want access to it again: as in the book referenced in the article, The Forever War, just put the data on an object in space, relativistically accelerate it, and set it for a return course in X years. From the object's frame of reference, much less time (and degradation) will pass.
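
For a feel of the numbers, a small sketch of the time dilation involved. It assumes the object cruises at a constant fraction of light speed for the whole trip and ignores the acceleration and turnaround phases; the function names are hypothetical.

```python
from math import sqrt

def lorentz_factor(beta):
    """Lorentz factor for a speed given as a fraction of c (0 <= beta < 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / sqrt(1.0 - beta * beta)

def onboard_years(earth_years, beta):
    """Time experienced on the object while `earth_years` pass at home."""
    return earth_years / lorentz_factor(beta)

# At 0.6c, a 1000-year round trip ages the payload by about 800 years.
aged = onboard_years(1000, 0.6)
```

The payoff only gets dramatic very close to c: shrinking 1000 years to 10 needs a Lorentz factor of 100, which is a speed above 0.9999c.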

Or, even simpler: find a reflective surface X/2 lightyears away, and shoot coherent light at it with sufficient power.

The real problem is preserving the information, which degrades even faster than the data. Even communicating a very simple message[1] across more than a 1000-year gap turns out to be nearly impossible, as it will be intentionally misinterpreted in the direction of whatever will make the finders happy.

[1] http://www.damninteresting.com/this-place-is-not-a-place-of-...


It's very easy to keep the data intact, if you know when, exactly, you want access to it again: as in the book referenced in the article, The Forever War, just put the data on an object in space, relativistically accelerate it...

It's "easy" to relativistically accelerate something!? (Macroscopic, not fundamental particles or ions.) This is the old wag about how, in Google engineer speak, "trivial" means the theory is known and all that lies between now and implementation is 300 million dollars of planning and engineering.


Preserving data for future historians is a good thing, but why is it so important to preserve your own data? Do you really think that's what will be the most valuable to them? It seems better to contribute to projects like the Internet Archive.

Also, the value of data depends partially on its scarcity, and I expect data about us won't be particularly scarce. More data is better, but when doing a bulk analysis, 10 terabytes of family pictures will not be all that different from 10.1 terabytes of family pictures.


I think the best way to guarantee your data to live on is to get it in a form you can save for the future as easily as a photo album.

Some time ago I saw (maybe even here on Hacker News) something about "setting your data in stone"...

http://primera.eu/millenniata/millenniata-en.html is what I could quickly find about it just now; there is/was a more detailed page explaining it, with a heat and ice test of this medium.


A few years ago I won a (very small) business plan competition for a company describing a consumer backup service like this one.

The kicker is figuring out what would happen if the company stops trading for whatever reason (e.g. bankruptcy). Although I didn't follow through with it, I think you can build a reasonably safe winding-down process that at the very least returns the data to its owners.


I was thinking about this. You don't charge people to add data - just to remove it.



