It's not about recovering a large amount of data; it's about recovering a large % of your own stored data.

Amazon starts to charge you extra anytime you exceed restoring 5% of the data.

If, for example, you save all your tax-related documents in Glacier and are then audited, the accounts department or the government will want all the information. Not 5% of it. Not 10% of it. Everything. At that point Amazon will have you over a barrel, because getting the data out in a reasonable time frame will cost exponentially more than dripping out the data over the course of 20 months.




> Amazon starts to charge you extra anytime you exceed restoring 5% of the data.

Isn't one way to get past this... increasing your data usage by 20x? If OP used less than $1 a month, then if he uploaded $20 worth of junk data, he could get the original data (now 5% of the total) back "for free". Sure, it's $20, but it beats $150+.
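
To make the padding arithmetic concrete, here is a rough sketch. It takes the 5% figure quoted above at face value, assumes the bill scales linearly with stored data, and ignores any proration of the allowance within the month. The function names are made up for illustration.

```python
# Assumption: the free restore allowance is a flat 5% of everything stored,
# as quoted in this thread. Not taken from any official pricing page.
FREE_PERCENT = 5


def padded_total(real_gb, free_percent=FREE_PERCENT):
    """Total GB you must store so that real_gb equals the free allowance."""
    return real_gb * 100 / free_percent


def junk_needed(real_gb, free_percent=FREE_PERCENT):
    """GB of junk to upload on top of the real data."""
    return padded_total(real_gb, free_percent) - real_gb


def padded_monthly_bill(current_bill, free_percent=FREE_PERCENT):
    """Monthly storage bill after padding, assuming cost is linear in size."""
    return current_bill * 100 / free_percent


if __name__ == "__main__":
    print(junk_needed(10))           # junk GB needed for 10 GB of real data
    print(padded_monthly_bill(1.0))  # bill after padding a $1/month store
```

With a 5% allowance the store has to grow to 20x its size, so a $1/month bill becomes roughly $20/month, which is where the "$20" above comes from.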


It looks like it is even more complicated than that. You can get 5% out per month at no charge, but only if you spread it out across the entire month. The extra charges kick in the first time you exceed 5%/30 in a single day.
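
A quick sketch of what the 5%/30 proration means in practice, assuming the rule works exactly as described in this comment (5% per month, split evenly over a 30-day month). The names are hypothetical and this is not Amazon's billing formula, just the arithmetic.

```python
import math
from fractions import Fraction

# Assumptions taken from this thread, not from official pricing docs.
MONTHLY_FREE_PERCENT = 5
DAYS_PER_MONTH = 30


def daily_free_allowance(stored_gb):
    """GB you can restore per day without extra charges (exact fraction)."""
    if stored_gb <= 0:
        raise ValueError("stored_gb must be positive")
    return Fraction(stored_gb) * MONTHLY_FREE_PERCENT / 100 / DAYS_PER_MONTH


def days_to_restore_free(stored_gb, restore_gb):
    """Days needed to pull restore_gb out while staying under the daily cap."""
    return math.ceil(Fraction(restore_gb) / daily_free_allowance(stored_gb))


if __name__ == "__main__":
    print(float(daily_free_allowance(100)))  # daily cap for a 100 GB store
    print(days_to_restore_free(100, 100))    # days to restore all of it
```

Under these assumptions, restoring everything for free takes 600 days regardless of how much you store, which is the "20 months" figure mentioned upthread.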


A better way would be if you had 20+ categories of totally unrelated data, like your tax stuff, your code, your diary from 1995-2010, ... Since these are unrelated, you are not likely to need every one of them at the same time, ASAP.

Though it's hard for me to imagine having so many categories of unrelated, useful and important data.


Glacier is about storing a large amount of data in a cost-efficient manner, when you do not anticipate needing it 1) cheaply and 2) quickly.

Legal issues, at least in the US, do not have those requirements.

I have had situations in the past 12 months where recovering past data would be worth >$1k for the right 100k of data.


>If, for example, you save all your tax-related documents in Glacier and are then audited, the accounts department or the government will want all the information. Not 5% of it. Not 10% of it. Everything.

Are you sure about that? I haven't worked with tax litigation specifically, but I've worked with e-discovery w.r.t. e-mail and I can assure you that no one ever asks for all the e-mails sent by a particular company over all time. It's always a matter of asking for the e-mails sent by or received by a select group of people, over a fairly discrete time period. For something like this, a Glacier store might make sense, if it was coupled with an online metadata cache stored in e.g. S3.
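
As a rough illustration of the metadata-cache idea: keep a small, always-online index that maps each message to the cold-storage archive holding it, then query the index to decide which archives are worth retrieving. Everything below (the record fields, `select_archives`, the archive IDs) is hypothetical and uses an in-memory list instead of any real S3 or Glacier API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchiveRecord:
    archive_id: str    # hypothetical ID of the archive in cold storage
    sender: str
    recipients: tuple  # addresses the message was sent to
    sent_on: date
    size_bytes: int


def select_archives(index, people, start, end):
    """Return records sent by or to any of `people` between start and end."""
    people = set(people)
    return [
        r for r in index
        if start <= r.sent_on <= end
        and (r.sender in people or people & set(r.recipients))
    ]


def total_bytes(records):
    """Size of the retrieval request implied by the selected records."""
    return sum(r.size_bytes for r in records)
```

Only the archives returned by `select_archives` would need to be pulled from cold storage, so the request covers a few people over a bounded date range and not the whole store.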


With tax litigation the issue is that you have to prove you didn't simply shift money and accounting briefs around, and the only way to realistically prove it is to show all the statements for the time period that you're required to keep them (I think that's the last 6 years).

The government basically comes to you and says they think you owe X, and you have to prove that false to their satisfaction. The more data you give your CPA to work with, the better.



