Amazon Drive removing unlimited storage plan

jasonsync · on June 8, 2017

It's frustrating to watch cloud companies subsidize their storage, in order to break into the market with a product that is too good to be true.

This strategy is the worst possible scenario for our entire industry. Users feel slighted, and the trustworthiness of the cloud in general is gradually eroded, as the scenario plays itself out over and over.

It feels like ransomware. Pay more or we'll delete your files.

When Dropbox for Business launched, it was $12.50 / user / month for "all the storage you need". Just recently Dropbox announced pricing changes, which will take effect in 2018. The new unlimited plan is $20.00 / user / month. And for those unwilling to pay, there's now a fixed storage tier, which is slightly cheaper than the original price, but it's capped at 1 TB.

Microsoft OneDrive included unlimited storage with any Office 365 subscription. After millions of users bought in, Microsoft dropped the maximum storage to 1 TB. Users were then given the choice of deleting their files, or moving elsewhere.

Mozy had an unlimited plan, and then dropped it and raised prices. SugarSync had an unlimited plan, but eventually dropped it.

Barracuda offered virtually unlimited cloud storage with their Copy.com service. A few years in and Barracuda shut down the entire service. Users were given very little notice, and had to move elsewhere.

Bitcasa, one of the original "unlimited" cloud storage providers (and a TechCrunch Disrupt Battlefield finalist) crashed and burned three years in. Again, users were given very little notice, and there was talk of a class-action lawsuit.

Time and time again we're seeing startups burn through their capital subsidizing the storage as some sort of brilliant marketing plan. It's not.

Disclaimer, I work at https://www.sync.com

hbosch · on June 8, 2017

The trouble here is, I think, "unlimited" tends to just mean "a lot" and for Average Joe that's fine. If you go to an all-you-can eat buffet, you will eventually be unwelcome after you have 20 plates stacked on your table. Is that a business problem or a customer abuse problem?

The ugly fact is the Unlimited Amazon Drive has been abused by media pirates and data-hoarders to store up to 20TB (and sometimes more) at what any reasonable person would say is an unreasonable cost. Not to mention, much of this media is accessed and streamed often. Wander over to /r/PlexShares to get an idea: imagine if someone had 20TB of 4K and BluRay movies stored on your service @ a very generous $60/yr, streaming that data 24/7 to 10, 15, maybe 20 people via Plex (with each streamer paying the media manager $10 a month) all over the world. While torrenting all day. Suddenly that sounds more like abuse IMO.

Of course, I agree that you shouldn't call something "Unlimited" if it isn't. But it's not a one-sided issue and I thought I'd bring that up. I don't personally know anyone, in real life, who uses Amazon Drive. Most people don't even know it exists. The only time I see it discussed, especially the "unlimited" tier, is on /r/datahoarders and /r/seedboxes as an exploitable deal.

As far as I can tell, Prime Photos will remain unlimited. I wonder how long it'll be before media hoarders hide their content in Google/Prime Photos with a convenient CLI tool?

mrighele · on June 8, 2017

> The ugly fact is the Unlimited Amazon Drive has been abused by media pirates and data-hoarders to store up to 20TB

Somebody on reddit bragged about reaching more than 1PB [0]

> Of course, I agree that you shouldn't call something "Unlimited" if it isn't.

I am maybe nitpicking, but it really was unlimited. Nobody is being billed for exceeding a given threshold or stopped from using it. The service is being discontinued and people will not be able to renew it. They never said "forever" :-).

It's like I got a special deal from my gym for unlimited access. If next year they won't offer it anymore I cannot say "it was not unlimited".

> As far as I can tell, Prime Photos will remain unlimited. I wonder how long it'll be before media hoarders hide their content in Google/Prime Photos with a convenient CLI tool?

In another thread on reddit somebody was already talking about that, so I guess it won't be long.

[0] https://www.reddit.com/r/DataHoarder/comments/5s7q04/i_hit_a...

BinaryIdiot · on June 8, 2017

> It's like I got a special deal from my gym for unlimited access. If next year they won't offer it anymore I cannot say "it was not unlimited".

I don't think this analogy works. If you used a lot of space, or really any space over the amount that they change to as the upper limit, that data is now at risk. Some places will make it read only until you move it off (up to a certain amount). Most will simply delete it after a certain amount of time if you don't move it.

Your gym can't really take back an amount of your previous, unlimited usage of the gym. If you have legit uses where you're way over the limit this can pose a real issue. I ran into this when storing photos and videos on Microsoft's OneDrive after they lowered the crazy high amount they were given...I eventually decided to pay the higher fees to cover my data so I wouldn't lose it until I had more time to move it all off.

I no longer use OneDrive.

csdreamer7 · on June 8, 2017

> It's like I got a special deal from my gym for unlimited access. If next year they won't offer it anymore I cannot say "it was not unlimited".

That is a very anti-consumer way of looking at it. Storage, especially for businesses, is not like a gym membership. These bait and switch tactics are harmful for the consumer as well as the industry itself.

You place a certain trust with data storage companies. Alot of media companies can easily have 30 terabytes of data to backup or share. They kill the unlimited plan with essentially a price hike. Now I am wondering "when is the next price hike coming"?

Amazon has already cut their CLI interface and their web interface is terrible. I would rather just keep it on a NAS with NextCloud.

mrighele · on June 8, 2017

> That is a very anti-consumer way of looking at it. Storage, especially for businesses, is not like a gym membership. > These bait and switch tactics are harmful for the consumer as well as the industry itself.

The point of the comparison was to stress the fact that unlimited amount of space doesn't necessarily mean for an unlimited amount of time.

> Storage, especially for businesses, is not like a gym membership.

I though that the unlimited plan was only for personal use. If it was open to businesses I understand why it became so quickly a money sink for Amazon.

PretzelFisch · on June 8, 2017

I don't think you can call it bait and switch. That would imply that they offered you unlimited but only gave you a limited amount. They did allow unlimited storage, they have now decided to remove this product and offer something else in it's place. The consumer can choose to stay or leave. It's difficult to leave, but it's the same kind of deal when your apartment's lease is not renewed. I can't really see this as anti-consumer

csdreamer7 · on June 8, 2017

I assumed bait and switch now encompassed this marketing tactic. If you know the exact term this tactic is called please let me know.

> The consumer can choose to stay or leave. . It's difficult to leave, but it's the same kind of deal when your apartment's lease is not renewed. I can't really see this as anti-consumer

Your argument is simply: consumers deal with something like this for an unrelated industry so it is not anti-consumer. That is not a good argument.

edit: also, renting is a very poor example. There are laws that govern how much rent can be raised that vary based on jurisdiction. If renting wasn't anti-consumer why would such laws exist? No such laws against gouging against for data storage which undermines your argument.

dhimes · on June 8, 2017

You place a certain trust with data storage companies.

I think the point is that now we know not to.

csdreamer7 · on June 8, 2017

Noah has been ranting about this long before Amazon and Onedrive did this.

halomru · on June 8, 2017

all-you-can eat buffets work because each person has to pay seperately and the amount any person can eat in one setting has hard physical limits that are fairly low.

Unlimited in the computing world usually has practical limits many orders of magnitude above normal usage, making outliers much, much more expensive. But you can't substantially increase the price because you can't afford to lose your normal customers and be left with only outliers.

CobrastanJorji · on June 8, 2017

I don't have a whole lot of sympathy for the argument that "we meant a lot but UNLIMITED was a much better marketing term, so we said that."

kbar13 · on June 8, 2017

You are missing the point. Amazon's service was truly unlimited

msq · on June 8, 2017

> Amazon's service was truly unlimited

It is not and never was. Check TOS

> 5.2 Suspension and Termination. Your rights under the Agreement will automatically terminate without notice if you fail to comply with its terms. We may terminate the Agreement or restrict, suspend, or terminate your use of the Services at our discretion without notice at any time, including if we determine that your use violates the Agreement, is improper, substantially exceeds or differs from normal use by other users, or otherwise involves fraud or misuse of the Services or harms our interests or those of another user of the Services. If your Service Plan is restricted, suspended, or terminated, you may be unable to access Your Files and you will not receive any refund of fees or any other compensation.

https://www.amazon.com/gp/help/customer/display.html?nodeId=...

Dylan16807 · on June 9, 2017

But did they ever kick anyone off? They didn't kick off people using hundreds of terabytes or more. And abuse language is there on every ToS. I'd say it truly was unlimited while they were selling it.

msq · on June 19, 2017

> But did they ever kick anyone off?

Sure they did. Saw plenty of people complaining about it on r/DataHoarders months before this big cut-off happened.

WhitneyLand · on June 8, 2017

>it's not a one-sided issue

It's totally a one sided issue. Abuse and fraud are not specific to Amazon, they are attempted at every business and industry under the sun.

It's just not credible to believe they initially thought the offer was long term sustainable without any limitations, protective terminations for abuse, etc. This is no more complex than what it appears. A deliberate, and effective, marketing campaign that would eventually have to come to an end.

Maybe they could at least use some kind of old school caveat like "while supplies last", or "until the rate of user acquisition is strategically outweighed by the subsidies required to absorb our losses".

BoorishBears · on June 8, 2017

I think without abuse, for most users it was sustainable. If I had to guess, the average HDD size being sold on a budget PC today is closer to 250GB (I've been seeing plenty with 32GB EMMC drives).

That's 4 PCs worth of content without going over the limit, and post non-technical users are probably using even less than that.

They could probably account for outliers using way more, but as long as most users were "average" users backing up word docs and family pictures they'd be fine.

The problem is "data horders" latched on much harder than average users. I wouldn't be surprised if most of those "average" users aren't affected by this change at all. I can't imagine my non-technical parents (for example) generating 1TB of data to back up very easily at all.

fapjacks · on June 8, 2017

But they aren't "abusing" the system if the system offers "unlimited" bandwidth/storage/whatever. Companies are being very dishonest because none of them actually mean "unlimited" when they say "unlimited". They should have just said 1TB from the beginning. But they were trying to fool people into signing up because they used -- disingenuously -- the word "unlimited". So screw the companies. They shouldn't say things they don't mean.

BoorishBears · on June 8, 2017

My point is more, to the average consumer it really was "unlimited". When you can backup 4 lifetimes worth of your PC with a service, for "practical" purposes it is unlimited

fapjacks · on June 8, 2017

Well what I'm saying is that companies should then offer 1TB of storage, not "unlimited" and then yank the rug out from underneath people that are actually using the "unlimited" storage they were advertised. The storage is not unlimited if any consumer can get to a point where they're using too much storage. The companies then call these people "abusive", but I personally don't think it makes it any less scummy of these companies if we can find some value of "unlimited" (like "practically unlimited") that sort of fits the story of them pulling out of their promise. It's the companies that are being abusive, not the consumer.

BoorishBears · on June 8, 2017

I guess we just don't agree on that.

To me unlimited is impossible right off the bat if you want to take it literally (there must be some finite limit to how much free storage Amazon has). Both sides in the agreement have some definition of unlimited that is less that unlimited, and to me hosting TBs of pirated content and porn is going past what a reasonable definition of unlimited and turning into abuse. If only the minority of users with legitimate TBs of data of their own creation to back up had used it I doubt AWS would have had trouble profiting without storage limits. But with people abusing it (or using it as piracy storage or mass internet backups if you want to claim that's not abuse) I don't see why they shouldn't have put an end to the plan.

Taek · on June 8, 2017

I have seen ACD accounts storing more than 1000 TB of data (mostly collected by scraping porn sites) for a mere $60 a year.

If you offer unlimited, expect people to take advantage.

zurn · on June 8, 2017

Some data hoarders can be acommodated since the service only cares about average usage per user.

Also data deduplication works even for data hoarders if they are hoarding media commonly shared on the internet.

Given the above points, I think there is no basis to call it "abuse".

vsl · on June 9, 2017

They are encrypting it to avoid detection of this TOS violation.

yazan94 · on June 8, 2017

Are there any other alternatives to Amazon that still offer unlimited storage?

WhitneyLand · on June 8, 2017

You forgot one more reason this is crappy behavior. It unfairly crowds out startups who want to go with a sustainable business model from the outset.

I realize "fairness" is not an excuse for inability to compete with the marketing of other companies. But if a company wants to test customer appetite for a service with pricing that is upfront honest, stable, and bound to only get better over time, there's never enough time to give it a proper shot.

StillBored · on June 8, 2017

This is just the technical equivalent of dumping or subsidizing, and has become fairly common in quite a number of industries to keep out competitors. The complete lack of regulation enforcement has been a problem for at least the last 40 years. To me it also seems to be a hallmark of the SV venture capital scene, where the majority of the capital isn't used to build a better product but to flood the market for a few years to gain a controlling share.

tootie · on June 8, 2017

You should never trust unlimited anything. It's never really unlimited. They should really just offer a "as big as a reasonable person could want" package. For a household that wants to backup some phones and PCs and the family photo album, you can offer 5-10TB of storage and they'll never fill it up. But eventually you know that a handful of power users are going to abuse the system to the point that you have to stop.

ytjohn · on June 8, 2017

I like how Google did their storage. They started off with a something that was (at the time) insane amount of storage for email, like 1GB. Then they slowly increased that over time as storage costs decreased. It was never unlimited, but it seemed unlimited compared to their competitors, who were often at 1/10th or even 1/100th the space.

Granted that's more difficult to do today as more players have embraced the concept. Google Photos (formerly Picasa) came out with unlimited storage for photos, and Amazon Cloud Drive is a response to that. ACD still offers unlimited photos, but they should never have tried offering unlimited "other files" storage. They have been doing cloud storage to technically adept people for years. I don't know why they thought they could offer a rather similar service, mark it as unlimited, and not expect people to use the cheap unlimited option over their per-use pricing plan.

fapjacks · on June 9, 2017

But I say that they're not "abusing" the system as long as the company is offering "unlimited" service. They're actually not offering "unlimited" service, so it's the companies that are abusing the buyers, not the other way around. If companies offered like you said, a "large enough" package with some limit, instead of "unlimited", then I would absolutely agree that they're abusing the system.

ballenf · on June 8, 2017

I disagree on the effect this has on consumers. I am a paying customer of several cloud storage services, 2 of which you mention.

The effect on me of the temporary unlimited tiers is that I get to know my data usage requirements for a while before I have to choose a tier.

It's really a very welcome intro to cloud storage. I can easily see that even with truly unlimited storage I only used xxGB and therefore can choose the appropriate tier with confidence that my costs won't jump.

One surprise about these unlimited deals is that I'd rather pay Apple for photo storage than use a third-party service that is free (or included in Prime, e.g.). I tried multiple free photo storage services, but the convenience of Apple's offering far outweighs the cost saving.

If you want unlimited cloud storage the G Suite $10 / month plan is still unlimited. (Actually it says 2 PB in Google Photos when backing up original size photos & videos.)

blazespin · on June 8, 2017

"unlimited" is such a stupid word. TANSTAAFL. There should be an advertising law against the word "unlimited".

Semaphor · on June 9, 2017

FWIW, backblaze still has unlimited storage. Of course it's cold storage, so abusing that is harder.

illumin8 · on June 8, 2017

To be fair, this is not "pay more or we'll delete your files."

That would be a terrible way to treat customers. Customers will simply lose the ability to upload new files. All their existing content will still be there.

panopticon · on June 8, 2017

> You have a 180-day grace period to either delete content to bring your total content within the free quota, or to sign up for a paid storage plan. After 180 days in an over-quota status, content will be deleted (starting with the most recent uploads first) until your account is no longer over quota.

Looks like that's not the route Amazon took.

williamle8300 · on June 8, 2017

Yup. Totally agreed.

But duuuude. How much did you pay for that domain?! NOICE.

jasonsync · on June 8, 2017

Back in 2013 we originally launched with the domain name "Sync.us" (we didn't have a choice, the .com was taken). Instant regret, as it made branding difficult, and our marketing efforts simply drove traffic to the many, more recognizable sync branded TLDs.

Then a few months later, one of our team members inadvertently visited Sync.com (a common Freudian domain name slip for all of us), and discovered the domain was up for auction. Instead of subsidizing our storage, like most of our competition was doing at the time, we diverted a substantial portion of our marketing budget towards winning the auction ;-)

It wasn't cheap.

yuhong · on June 17, 2017

I found out about OneDrive for Business which on some plans lets you bump up beyond 5TB by calling support. I wonder how that works.

klodolph · on June 8, 2017

If you're cynical you'd think "of course they'd offer a free version and then make you pay for it." But that's basically the cloud storage market in a nutshell. All the cloud storage providers are in a race to the bottom, and once they have your data they want to upsell you on other cloud technologies. If you just need to store data, that's actually great news, because these providers are working hard on the optimizations that make it possible to give you storage as cheap as they can.

Cheap options are Glacier at $0.004/GB, Backblaze at $0.005/GB, GCS Coldline at $0.007/GB. That's the monthly cost for bare cloud storage with no egress. Anything cheaper than about $50/year for 1TB will fall in one of two categories:

1. Subsidized by the provider, but probably not for long. This has happened so many times I can't even remember them all, Amazon is just the most recent.

2. Strictly DIY, probably with a much higher chance of data loss than you realize.

kefka · on June 8, 2017

Before you mention "Cheap options" and Amazon Glacier, read this: https://news.ycombinator.com/item?id=10921365

"How I ended up paying $150 for a single 60GB download from Amazon Glacier"

> If you’re retrieving more than 5% of your stuff, expect to pay a fee, “starting at $0.011 per gigabyte”.

Key words here is "starting at". as in, no cap on how much they can and will charge you for your data. Cause you know, 60GB is realllly expensive to transfer :(

_______________________________

What Glacier seems to be for, is more like compliance paperwork and other "We are required to save this for X years" kind of paperwork. Because most of those types of documents, you don't care about and only legally have to keep available. And if a court case comes along, the Glacier retrieval price can just be added on top of court fees (which in that case, Glacier is probably a pittance even at the $10k level).

klodolph · on June 8, 2017

There's a lot of articles about these things that seem to have the same summary: "I didn't read the documentation for how pricing works and then I was surprised by how pricing worked." Some people were shocked by Glacier's egress pricing and others were shocked by the cost per file. Both of these are approximately zero, most of the time, for most people, if they know what they're doing. If you don't know what you're doing and haven't read the docs you shouldn't be using Glacier.

> Cause you know, 60GB is realllly expensive to transfer :(

Network and disk IO capacity planning is hard. You can't just buy GBs of egress at $0.01 and then resell them at $0.01, because you're paying for maximum capacity and turning around and selling average usage. Similarly, when you have a bunch of data shoved into unused sections of disk, you can't just read them back out without affecting whatever else is reading from the same disk. If you want to sell something close to cost, you need to reflect the pricing structure of what you're paying to your customers.

So if you upload a bunch of data to a super-cold storage system that's even named after a geological formation that stays frozen for centuries, to remind you how cold it is, and then you make it hot by trying to download it all at once, you'd expect it to be more expensive.

Just do your regular cost-benefit analysis and it should be fine, based on how long you plan on storing the data and how likely or often you think you'll need to restore it.

kefka · on June 8, 2017

The root problem here is , that Amazon Glacier is as opaque as mud about pricing. Or did the words "Starting At" not pique your buzzword bullshit detector? It certainly did mine.

Pricing means sifting through a legalese dense block of text to try to come up with even a way to estimate the costs. That is a problem. I should be given a rough cost expectation up front. Sure, it's not going to be exact and all, but This article is about a cost factor of *180.. That's the difference between an iPad and a Ferrari.

Other places of varying types of backup systems are a lot more clearer to tell how much I should be expected to pay. Or at least, I get within +/- 10%. Glacier? "Yeah, Don't worry about it till we bill ya!" That's the scam with all these cloud services really. Once we have your nuts in a vice, we can squeeze and extract, cause you don't have any other choice, now do you?

klodolph · on June 8, 2017

I think the information you're working with might be out of date, I don't see the words "starting at" anywhere. Glacier egress pricing is $0.01 per GB and $0.05 per 1,000 requests. That's for standard retrievals. Not so complicated.

Before Feb 2017, it did work with the peak hourly request fees. Those were well documented. Apparently peak transfer rates were "confusing". I didn't think so, after all, that's similar to how I pay for internet access at home (I pay based on capacity, not based on bytes transferred).

Scaevolus · on June 8, 2017

Glacier pricing has changed since that post. It's now a much simpler model, around $0.10/GB to download. The original pricing was very complicated though!

https://aws.amazon.com/glacier/pricing/

glenneroo · on June 8, 2017

So how would you categorize CrashPlan? They offer unlimited storage for $60/year. I've had around 8TB backed up there for the last 2-3 years. They have existed since 2001 (no idea how long the unlimited feature has existed).

On the other hand, supporting your point #1, I used to have several TB on StreamLoad which 10ish years ago used to also offer unlimited... until they suffered a critical hardware failure and lost tons of customer data. They then changed to MediaMax, then suddenly disappeared.

Perhaps you know of others, or this list is incomplete, but Wikipedia has a list of file services, of which there aren't many who offer an unlimited plan:

https://en.wikipedia.org/wiki/Comparison_of_file_hosting_ser...

klodolph · on June 8, 2017

CrashPlan is "unlimited" but it's not a storage service, it's a backup service. From what I understand, you have to use their client software, and that software will use RAM proportional to the total size of the data you're storing. In practice this puts a hard limit on the amount of data which is proportional to the amount of RAM you have.

So heavy users are not only capped, they're also subsidized by the light users.

freeone3000 · on June 8, 2017

I've got about 12TB on Crashplan, and the client's never used more than about 2GB. I doubt your "cap" applies in practice.

nucleardog · on June 8, 2017

So about 170mb/TB if this scales linearly for some reason.

Some users were apparently bragging about having 1PB+ on ACD. That would mean the RAM usage would actually be a useful upper bound, as I doubt many of these guys have 170GB of RAM in their media server PCs.

gwillen · on June 8, 2017

In addition to the comment below about CrashPlan requiring you to use their client software, which limits abusabiliy -- I am also seeing signs of strain from CrashPlan. They have always had a policy that claimed they would delete data from machines that hadn't touched the service in a long time, but they had never implemented this in practice before; just last year they finally started sending out threatening letters to people saying they're going to delete data on short notice. (And they didn't really back down when people got angry.) This tells me the storage expense is starting to hurt them.

devicenull · on June 9, 2017

Yep, I cancelled my CrashPlan account when they pulled that. IIRC they gave me under a week to either download the data for 5-6 machines, or lose it forever.

ValentineC · on June 8, 2017

> So how would you categorize CrashPlan? They offer unlimited storage for $60/year. I've had around 8TB backed up there for the last 2-3 years.

CrashPlan's upload speed is a joke for people located outside North America. I had an account for ~3 years, but found it difficult to upload more than 6TB. (There are rumours that CrashPlan throttles uploads.)

On the other hand, I've managed to backup most of my ~15TB NAS (and more) on Amazon Cloud Drive.

GordonS · on June 11, 2017

I'm in the UK and evaluated CrashPlan a couple of years ago - IIRC, it was so slow it was going to take several weeks to upload my 1TB of data.

ghshephard · on June 8, 2017

Crashplan is a backup service. Different category of company.

samnwa · on June 8, 2017

Ever hear of Sia? That's a new decentralized option gunning for Amazon. Website is sia.tech.

klodolph · on June 8, 2017

It's very difficult to take Sia seriously. For one thing, Sia only has 25T of data. That's total data, in the entire global network, including parity blocks. And when I checked prices today, it was more expensive than Glacier—but I can only imagine how much the prices would swing if you suddenly bought 2.5T of storage.

Sia's decentralization is not unique. Amazon, Google, and Microsoft cloud storage are also decentralized. The idea of reselling unused bytes on disks that you already own is also not unique—again, Amazon, Google, and Microsoft already do this.

Taek · on June 8, 2017

Sia's decentralization is unique. It's unique not because it spreads data all over the world, but because there's no central player that controls prices or decides to shut things down. Amazon, Google, Microsoft, they can change terms or disable services whenever they want. Sia is a lot more robust, because it's governed via a blockchain, and you put your data on 30+ individual parties, making it extremely unlikely that a sufficient number of them all shutdown simultaneously.

This situation with retiring unlimited storage on ACD would not be possible with Sia. Amazon is changing terms and conditions, Sia is a blockchain where no party has the power to do that.

-----

To address your pricing comment, there are more than 1000 TB for sale on the Sia network. Adding 2.5 TB to the network would not move the price at all, that's really not how pricing works on Sia.

The reason that it's more expensive than it used to be is because prices are set entirely in Siacoin. When the siacoin prices rise, hosts need to manually re-adjust their prices to keep the same USD price. As of writing, there are no tools that will let this happen automatically. The siacoin price has doubled 6 times in 6 months. The result is that a lot of the defaults are now grossly expensive, and hosts need to be very familiar with the pricing mechanisms to respond accordingly. Most have not, though the ones that have are seeing much higher utilization than the ones that haven't.

We will be releasing stuff in the next month to help hosts set prices more intelligently, that should move the prices back to the competitive spot that they historically been at. The high prices right now are merely a result of market confusion among the hosts.

cableshaft · on June 8, 2017

Hi Taek, Sia recently came on my radar, and it seems pretty interesting usage of the blockchain, and I've started following it.

Is there a good place to keep an eye on the roadmap for the technology (with info like what you had in the last paragraph)?

pjc50 · on June 8, 2017

> The siacoin price has doubled 6 times in 6 months.

Presumably this represents entirely speculative activity driven by the other fashionable blockchains?

cableshaft · on June 8, 2017

I'm sure that's part of it, but you can't really avoid that. But there are a ton of altcoins out there now, and most of them don't have interesting tech behind them or much to distinguish them from any of the others. Sia seems to have interesting tech behind it, to me at least.

prirun · on June 8, 2017

Have to be careful with Glacier: the retrieval costs can be very high and are extremely hard to calculate.

Amazon S3 Infrequent Access has higher retrieval costs, though not as crazy as Glacier, and a delete penalty. Deleting a file after 1 day will cost 30x normal S3 storage fees.

Likewise, Coldline sounds great, but they have higher retrieval fees (not so bad) and a delete penalty. The Coldline delete penalty will cost 90x the regular storage cost if a file is deleted the next day. I don't recommend Coldline unless you know you won't be deleting files before 90 days.

cm2187 · on June 8, 2017

If it is backup, it doesn't need to be super safe. The risk that you lose both your primary data, (optionally your local backup) and your online backup at the same time is pretty insignificant, given that the online backup will be uncorrelated as in a different physical location (primary and local backup very correlated I agree).

klodolph · on June 8, 2017

> The risk that you lose both your primary data, (optionally your local backup) and your online backup at the same time is pretty insignificant, given that the online backup will be uncorrelated as in a different physical location (primary and local backup very correlated I agree).

Actually, that reasoning is exactly what I wanted to talk about. When the primary is lost, there is a much higher chance than you'd expect that a secondary contains unrecoverable errors. This is why RAID 5 arrays fail to rebuild after a disk failure so often—they're supposed to be able to tolerate a single disk failing, but they can't tolerate a disk failing and any other IO error at the same time. Part of this is due to how short the timeout is for failed reads in RAID setups, but I've still seen a lot of RAID 5 arrays fail, and I've seen a few RAID 6 arrays fail too.

On top of that, there's the high chance of configuration errors in DIY systems.

cm2187 · on June 8, 2017

Agree. Having a script doing a full read of all the data every couple of months and sending a report by email (that you would notice if not sent) is a sort of must have.

w458cmau · on June 8, 2017

I am still looking for a good way of doing this.

Weekly integrity check of all backed up data. E-Mail which informs of result. Web interface which shows overview of results of historical checks. External service which sends E-Mail if integrity check failed to run (e.g. https://deadmanssnitch.com/).

rwiggins · on June 8, 2017

FreeNAS gets fairly close to this out of the box. My home server runs ZFS with RAIDZ2. By default, I think, there's a weekly cronjob to scrub the ZFS pool (integrity check everything, as I understand it), and then the results of that scrub are emailed to me.

I don't believe it has a web interface with historical checks, although I could be wrong. That said, it might be stored in a log file somewhere.

I also don't have an external service that would send me an email if it failed to run. That said, I would get an email if cronjobs had a mysterious error; otherwise, if the server itself was dead, my data would not be accessible on my home network, so I'd notice.

If the home server dies tragically, well, I hope Google Cloud Storage is doing similar integrity checks -- that's where my offsite backups are.

cm2187 · on June 8, 2017

Integrity is harder because you have to sort of maintain a signature of every file on the side. What I do is to just have a script that reads all data. If some data is unreadable an exception will be thrown and a text message will be sent to me. And a confirmation email is sent otherwise.

In .net it's only a few lines of codes. Haven't thought of deadmansnitch but it's a good idea. Would just take one more line of code.

drdaeman · on June 8, 2017

git-annex has fsck command that can test the data and rclone remote so you can use any configuration of "cloud" storages to hold the chunks.

mongol · on June 8, 2017

I would argue the opposite. When you reach for your backup, you are in a dire situation. If it fails you then, things will be very gloom.

himlion · on June 8, 2017

The problem is sometimes you only find out your backup is corrupted when you need it.

qeternity · on June 8, 2017

If you don't test your backups, then they're not backups.

jjn2009 · on June 8, 2017

yup, the assertion that both failing is highly unlikely is under the assumption that you are checking the status of each copy regularly.

cm2187 · on June 8, 2017

In fact for any disk, it is important to have a script that reads all the data at least every couple of months. It will force the bad sectors to be identified or to be notified you have a bad disk before things get worse.

cm2187 · on June 8, 2017

Yeah, agree. And I am not talking about enterprise level of reliability. More like personnal / small office.

But for anything above a few TB, running your own hardware will be way cheaper if you take say a 5y horizon. Disks are so large today that you don't need a big config. In fact you might even keep two copies to reduce the risk.

Of course there is the occasional trip to the datacentre. I made that mistake. I pay more in uber than I saved by picking a datacentre far away.

w458cmau · on June 8, 2017

Do you need co-location though? I dropped an HP Microserver at a friend's place, which backs up snapshots of my main backup server at home daily. Seems to work well so far, at quite limited costs. Since my main backup server also backs up his NAS we both win. Since all backups are encrypted there is also little privacy risk from theft (or snooping on one another).

cm2187 · on June 8, 2017

With 1Gbps symetrical connections slowly becoming more frequent this would be a much cheaper alternative (assuming you only run it for backups, I also run a mail server and some websites). But in London it is not very practical. I know very few people who have really fast broadband.

jo909 · on June 8, 2017

My backup contains files and old versions of files I no longer have on the primary storage. It is much more than a 1:1 copy, and can not be recreated.

I need to feel safe that I can reasonably undo my changes on the primary storage, even if it takes me years to realise what I did.

This is why I keep two backups. Until now, the secondary offsite copy was on ACD...

dsr_ · on June 8, 2017

That's not a backup. That's an archive. You need to backup archives the same way you do any other data you care about.

_vvhw · on June 8, 2017

If it is backup, then it needs to be scrubbed regularly to detect and repair bit rot.

thebspatrol · on June 8, 2017

Ah, the old "give them a bunch of storage and then ask for more money to keep storing it" meme.

guiambros · on June 8, 2017

Indeed.

My annual cost will jump from $60 to $180. That's too much for simple offline backup , so it's time to start looking for options again :(

Glacier may a more affordable option, but my experience a few years ago has been terrible.

Any suggestions? Google Drive is also pricey ($240); Crashplan is incompatible with NAS, and tarsnap is out of question (>$6,000/year).

zeta0134 · on June 8, 2017

I personally run syncthing on several devices, and don't worry about the cloud. It's self-hosted, devices replicate files between themselves, and there's no real limit other than hard drive space. It runs on just about anything too; several of my backup systems are Raspberry Pis.

It can be a bit weird to set up initially, and is a lot less magical in the interest of putting you in control for privacy reasons, but the flexibility added is pretty useful. I have a music folder that I sync to my phone without needing to pull the rest of my backups along with it, since they wouldn't fit anyway. Several of my larger folders aren't backed up on every single device for similar reasons, but some of my really important smaller folders (documents, photos, regular backups of my website's database) go on everything just because it can.

Anyway, check it out. Highly recommended all around: https://syncthing.net/

Symbiote · on June 8, 2017

If you accidentally delete some files (even all the files!), won't Syncthing delete all the "backups"?

I don't use Syncthing, I use an rsync script I wrote over 10 years ago, using the --link-dest option to keep incremental backups for around 2 years.

This relies on Zsh's fancy globbing, but the gist of it is:

    date=$(date +%Y%m%d-%H%M)

    [for loop over users]

    older=( $backups/$user/*(N/om) )

    rsync --archive --recursive \
        --fuzzy --partial --partial-dir=$backups/$user/.rsync-partial \
        --log-file=$tempfile --link-dest=${^older[1,20]} \
        --files-from=$configdir/I-$user \
        --exclude-from=$configdir/X-$user \
        $user@$from:/ $backups/$user/$date/

anc84 · on June 8, 2017

Syncthing has options to store versions of files so that scenario is easily avoided: https://docs.syncthing.net/users/versioning.html

lewisl9029 · on June 8, 2017

Unfortunately, in my experience, Syncthing's versioning mechanisms leave much to be desired compared to what I'm used to from Dropbox. AFAIK all of Syncthing's versioning schemes only keep versions of files that have been changed _on other devices_, and not those that have changed on the device itself, whereas what I'm looking for is an option to keep a synchronized version history for all files on all devices, and the ability to more intuitively roll back and roll forward the state of any file to any revision without having to mess with manually moving and replacing files and reading timestamps (better yet would be the ability to do so for entire directories, but I realize this would probably be very difficult to accomplish across devices in a decentralized manner).

rovr138 · on June 8, 2017

I used a similar script for a long time but I'm using now rnsapshot.

GordonS · on June 8, 2017

For me, one of the main benefits of cloud-based backup is that it's off-site - so if my house burns down, my data is still safe.

brixon · on June 8, 2017

Think about a Media Safe. Some are really expensive, but this one is not too bad. Just a really small storage area. https://www.amazon.com/First-Alert-2040F-Water-Media/dp/B000...

guiambros · on June 9, 2017

What about break ins? Someone enters your place and steal your NAS (and the Media Safe)...

guiambros · on June 9, 2017

This.

That's my primary use case for Amazon Drive. I have a robust rsync of the workstations and laptops to a NAS, and then to a second (incremental-only, no delete) NAS. Works great, but if the house burns down, or if someone breaks in and steals the computers, I want to ensure there's a copy somewhere.

russdpale · on June 8, 2017

If that is your main concern you could always put it on an external drive and put it a bank safe deposit box. I've thought about doing that for at least the very important things, perhaps even printing some important pictures too.

HelloNurse · on June 8, 2017

You just need another house to burn down.

Don't you have friends or relatives at a reasonable distance who can set up mutual backups on each other's home servers?

GordonS · on June 8, 2017

Yes, but nobody else with a FTTC internet connection with an unlimited bandwidth allowance (I'm in the UK). I have 2TB of data, so speed is important.

creepydata · on June 8, 2017

I don't have a single friend that has a home server. Most adults don't even own computers anymore, just phones and perhaps an iPad.

ytjohn · on June 8, 2017

A good scenario is building a backup server/nas solution that you can put in a little cubby at your friends place. There's trust involved that you're not using their internet to hack the government, and you have to be mindful of their bandwidth/power costs. So not a rackmount server or even a tower, but something much smaller and very appliance looking. A nuc sitting atop a wd passport or their "my book".

If it provides them a benefit like an in-house plex server, even better.

richardknop · on June 8, 2017

Another option would be to rent a safety deposit box at a bank for $25 per year and store your backups there as flash drives. Cheap and very secure.

Of course it requires you going to the bank regularly to update the backup.

lewisl9029 · on June 8, 2017

I've moved mostly to syncing through Syncthing for my devices too, but I'm curious what people use for sharing files with others and accessing files through a browser on machines you don't control?

olegkikin · on June 9, 2017

So you're one house fire away from losing all your data forever.

mkup · on June 8, 2017

There's Siacoin, a cryptocurrency/blockchain built around the idea of decentralized encrypted p2p storage.

They're dirt cheap to store as of now: median contract price is $12/TB·mo, but network storage utilization is currenly only 2%, so actual deals settle on about $2/TB·mo. Downside is that exchange rate of their coin is highly volatile, at least was during last month.

https://sia.tech/

http://siahub.info/

http://siapulse.com/page/network (Prices tab)

lewisl9029 · on June 8, 2017

Do these decentralized storage networks provide any guarantees in terms of durability, redundancy and availability? I've been looking into Siacoin, Filecoin, Storj and the like, but lack of clarity around some important concerns have so far prevented me from taking them seriously as a backup solution:

1. Performing a restore in a timely fashion on a large dataset seems like a tall order if these networks don't impose any minimums for the upstream bandwidth of the hosts.

2. Files can completely disappear from the network if the machines that are hosting them happen to go dark for whatever reason, which seems to be a much more likely occurrence for some random schnub hosting files for beer money than it would be for traditional storage providers that have SLAs and reputations to uphold.

Maybe these concerns are unfounded, and some or all of these networks already have measures in place to address them? I'd appreciate it if someone more familiar with these networks could enlighten me if that's the case.

Geee · on June 8, 2017

In addition to redundancy, Sia has the concept of collateral, which is basically money locked in a smart contract that says "I'm willing to bet this money that I'm not going to lose your files". I.e. Hosts lose the money if they fail to store your files.

Different hosts have different amount of collateral, and it's both an important security measure as well as market mechanism.

Also, Sia is completely decentralized (unlike StorJ for example), so it can't be intervened with by anyone which might result in lost files.

Taek · on June 8, 2017

Speaking as a Sia developer, I can address your concerns.

> these networks don't impose any minimums for the upstream bandwidth of the hosts.

Sia today primarily handles that through gross redundancy. If you are using the default installation, you're going to be putting your files on 50 hosts. A typical host selection is going to include at least a few sitting on large pipes. Downloads on Sia today typically run at about 80mbps. (the graph is really spiky though, it'll spike between about 40mbps and 300mbps).

We have updates in the pipeline that will allow you to speedtest hosts before signing up with them, and will allow you to continually monitor their performance over time. If they cease to be fast enough for your specific needs, you'll drop them in favor of a new host. ETA on that is probably ~August.

> Files can completely disappear from the network if the machines that are hosting them happen to go dark for whatever reason

We take host quality very seriously, and it's one of the reasons that our network has 300 hosts while our competitors are reporting something like 20,000 hosts. To be a host on Sia, you have to put up your own money as collateral. You have to go through this long setup process, and there are several features that renters will check for to make sure that you are maintaining your host well and being serious about hosting. Someone who just sets Sia up out of their house and then doesn't maintain it is going to have a very poor score and isn't going to be selected as a host for the most part.

Every time someone puts data on your machine, you have to put up some of your own money as collateral. If you go dark, that money is forfeit. This scares away a lot of hosts, but that's absolutely fine with us. If you aren't that serious about hosting we don't want you on our network.

> but lack of clarity around some important concerns have so far prevented me from taking them seriously

We are in the middle of a re-branding that we hope introduces more clarity around this type of stuff as it relates to our network.

ericflo · on June 8, 2017

This is the one I've got my eye on - once the marketplace boots up on both sides, it's going to be hard to compete against it. I suspect some day even the big providers like Amazon and Google will sell into these kinds of marketplaces.

klodolph · on June 8, 2017

I'm calling it, it's not gonna happen.

For data storage, you need error encoding. Sia does that, but you pay for it. So for 1TB of data, you upload 2TB to the network (that's how Sia is configured) and at the current $2.02/TB per month, that's $4.04/TB, which is more expensive than Glacier. Glacier charges funny for downloads but Sia charges for downloads too.

I assume that if you wanted to store ~2.5TB like we're talking about, you'd be paying more than $4/TB, because 2.5TB is 10% of the total of all data currently stored in Sia, currently 24.5 TB. (By comparison the major cloud providers are undoubtedly in the exabyte range of actual data stored. Or for another comparison, you could comfortably hold 24.5 TB of storage media in one hand.)

Sia promises to be cheap because you're using unused bytes in hard drives that people already bought, but that's exactly what Amazon, Google, and Microsoft are already doing, except their data centers are built in places where the electricity costs less than what you're paying. Plus they don't charge you extra for data redundancy.

ericflo · on June 8, 2017

In that case, Sia provides an avenue for an new company with access to cheap electricity to compete with Amazon, Google, and Microsoft without investing a cent in marketing or product. They will just plug in and start receiving payments, and strengthen the network and lower the price in the process.

Another cool thing is Sia lets hosters set their storage and bandwidth prices, so specialized hosts will likely pop up. For example one host might use tape drives, set cheap storage cost and expensive bandwidth cost. Clients can prioritize as desired. SSD servers with good peering can do the opposite.

The real interesting part will be when you can create one-time-use URLs to pass out, which connect directly to the network - effectively turning it into a distributed CDN.

Taek · on June 8, 2017

The $2 / TB / Mo we've traditionally advertised as our price included 3x redundancy. The math we've done on reliability suggests that really you only need about 1.5x redundancy once you are using 96 hosts for storage.

The network prices today are less friendly, though that's primarily due to market confusion. The siacoin price has doubled 6 times in 6 months, and there's no mechanic to automatically re-adjust host prices and the coin price moves around. So hosts are all currently advertising storage at these hugely inflated rates, and newcomers to Sia don't realize that these aren't really competitive prices.

Though, I will assert that even at our current prices it's not price that's the primary barrier to adoption. It's some combination of usability, and uncertainty. Sia is pretty hard to set up (it's around 8 steps, with two of those steps taking over an hour to complete), and a lot of people are not certain that Sia is truly stable enough to hold their data.

We're focused on addressing these issues.

Geee · on June 8, 2017

You can't compare to Glacier. S3 is a more comparable product. And obviously redundancy is already in the price, or did you think there's no redundancy?

klodolph · on June 8, 2017

From what I understand, your client does the error encoding and pays for raw data storage on the network, rather than trusting the network to do error encoding. You can configure the encoding to whatever you want, you just end up paying more for more redundant encodings.

jshmrsn · on June 8, 2017

Isn't this exactly what Pied Piper gets used for in later seasons?

styx31 · on June 8, 2017

Currently trying Backblaze: https://www.backblaze.com/b2/cloud-storage.html. Overall fits my needs.

Sukotto · on June 8, 2017

I used Backblaze for several years before closing out my account in 2012.

Initial backup took a long time. There was no easy way to prioritize, for example, my photos over system files. I ended up manually prioritizing by disallowing pretty much my entire filesystem, and gradually allowing folders to sync. First, photos, then documents, then music, etc.

Eventually it all got synced up and it was trouble-free... until I tried to get my data back out.

The short version of the story is that a power surge fried my local system. I bought a new one and had some stress when it appeared the BB client was going to sync my empty filesystem (processing it as a mass delete of my files). I managed to disable the sync in time.

Then I discovered there was no way to set the local BB client to pull my files back down. Instead, I had use their web-based file manager to browse all my folders and mark what I wanted to download. BB would then zip-archive that stuff which would then only be available as an http download. There was no Rsync, no torrent, no recovery if the download failed halfway, and no way to keep track of what I had recently downloaded. Also, iirc, they were limited to a couple of GB in size per file. (which didn't matter because at that time, the download would always fail it the file was larger than __MB (I don't remember the exact number. 100MB? 300? Also hazy on the official zipfile size limit)

So I had to carefully chunk up my filesystem for download because the only other option BB offered was to buy a pre-filled harddrive from them (that they would ship to me).

I felt like Backblaze was going out of their way to make it hard for me in order to sell me that harddrive of my data. I felt angry about that and stubbornly downloaded my data one miserable zipfile at a time until I had everything.

Once I was reasonably sure I had everything I cared about, I closed my account and haven't looked back.

[Edit to add] This was at least 5 years ago. No doubt their service has improved since then.

fencepost · on June 8, 2017

I would think that for a full restore you might be better off with their restore by mail. Note that if you move your data off the drive they ship then send it back they refund the charge for it.

timlin · on June 8, 2017

I use Backblaze but haven't had to do a restore yet. It appears their current limits are 500gb per zip file. They also have a "BackBlaze Downloader" utility (Mac & Windows) that has the ability to resume interrupted downloads.

https://help.backblaze.com/hc/en-us/articles/217665888-How-t...

MrRadar · on June 8, 2017

It looks like styx31 linked to B2 which is a separate service from their backup service that's closer to S3 or Google Cloud Storage. With that you can use rclone which should avoid the issues you encountered, though at higher cost if you have a lot of data (there are per-GB storage and download fees).

deweller · on June 8, 2017

My restore experience was also poor with Backblaze. The download speed was slow. If I had to restore an entire drive it would have take me many days to download the entire thing.

I switched to Arq with Amazon Drive as the storage backend.

jve · on June 8, 2017

Well you get to choose cheap backup vs expensive restore. Better than impossible restore (that is if you don't do backups)

rrix2 · on June 8, 2017

I'm in the process of uploading 2TB of backups to B2, it's ridiculously cheap, and they don't charge for upload BW, just storage and download

darawk · on June 8, 2017

You could try out one of these new storage cryptocurrencies:

https://www.storj.io

http://sia.tech/

https://filecoin.io/

https://maidsafe.net/

I haven't used them myself so I can't vouch for the UX or quality, but they should be able to offer pretty low prices.

zwischenzug · on June 8, 2017

I feel like a luddite but I have three backups at home (PC HD, 2 rsync'd USB drives I bought several years ago) and one off-site backup (encrypted HD in locker at work). Far cheaper afaict than any cloud backup.

w458cmau · on June 8, 2017

I think this is a good basic and relatively low-tech strategy.

Do you do versioning? As in what happens if your files are silently corrupted e.g. by accident or by malware? Rsync would overwrite your files, and you might even overwrite your off-site backup when you connect it.

My main reason for going beyond such a set-up though is that it takes time, effort and remembering to sync the off-site backup by taking it home, syncing and putting it back. And during that time all your data is in the same place. If something happens to your home during that time (break-in, flooding, fire...) you're out of luck. Unless your rsync'd drives are also encrypted and you just switch one of them with the off-site one for rotation.

zwischenzug · on June 8, 2017

One of my backups is 'add only'

The really key stuff is in git repos.

Most of the data (films, mostly) I could stand to lose.

dingaling · on June 8, 2017

My offsite backup is likewise an encrypted disk stored at a friend's house, and vice versa. After the initial hardware puchase cost it's free.

sokoloff · on June 8, 2017

And, based on my experience, generally horrifically out of date.

jamespo · on June 8, 2017

* cheap * capacity * convenient

choose 2

teekert · on June 8, 2017

I have a Raspberry Pi at my parents home (which has r/w to the disk attached to my fathers Airport extreme), it rsyncs every night with the server in my basement (which has all my data on 2 disks.) It also syncs my parents data back to me. It works well but I still need to add a feature to email me if syncing somehow halts or errors out. I use "rsync -av" (over SSH), so nothing is ever deleted.

dspillett · on June 8, 2017

> so nothing is ever deleted.

It could be overwritten though. A good backup protects you from more than just destruction at the primary site. There are various relatively efficient ways to arrange snapshots when using rsync as your backup tool.

Also, remember to explicitly test your backups occasionally, preferably with some sort of automation because you will forget to do it manually, so detect unexpected problems (maybe the drive(s)/filesystem in the backup device are slowly going bad but in a way that only affect older data and don;t stop new changes being pushed in).

w458cmau · on June 8, 2017

Versioning backups seems like a must. Encrypting malware is a thing and has been for a while, just like rm -rf type mistakes which are subsequently propagated automatically to "backups".

dspillett · on June 8, 2017

Another thing that I do with my backups is making it so that the main machine can't access the backups directly and vice-versa. It is slightly more faf to setup, adds points of failure (though automated testing is still possible), and is a little more expensive (you need one extra host) but to significantly so.

My "live" machines push data to an intermediate machine, the the backup locations pull data from there. This means that the is no one machine/account that can authenticate against everything. Sending information back for testing purposes (a recursive directory listing normally, a listing with full hashes once a month, which in each case gets compared to the live data and differences flagged for inspection) is the same in reverse.

This way a successful attack on my live machines can't be used to attack the backups and vice-versa. To take everything you need to hack into all three hosts separately.

Of course as with all security systems, safe+reliable+secure+convenient storage of credentials is the next problem...

JoblessWonder · on June 8, 2017

This is especially true with Ransomware type attacks that encrypt/corrupt data. Having a backup of unusable files isn't doing anyone any good.

tw04 · on June 8, 2017

Crashplan isn't incompatible with NAS. You can either mount a share and run it from your workstation, or run it directly on the NAS itself. The core of the product is Java so it runs on just about any architecture to boot.

snuxoll · on June 8, 2017

Coming from someone who tried to do this setup, it wasn't worth it. CrashPlan's client isn't something you generally would want to run on your NAS, it takes memory proportionate to the amount of data on your disk (and a fair amount of RAM, at that) and unless you're running an GUI on your NAS it's impossible to configure without a huge headache.

You can run it from your workstation, but if you've got a reasonable amount of data on your NAS then the memory issues will bite you again. Something like Backblaze B2 is more expensive, but I'd rather pay $10/mo to backup the 2TB of data on my NAS (growing every day) and use CrashPlan to backup my computers only.

ValentineC · on June 8, 2017

> CrashPlan's client isn't something you generally would want to run on your NAS, it takes memory proportionate to the amount of data on your disk (and a fair amount of RAM, at that) and unless you're running an GUI on your NAS it's impossible to configure without a huge headache.

CrashPlan's client is able to attach to a headless instance [1], but the RAM requirement does mean that it's only really usable on NASes with expandable RAM.

[1] https://support.code42.com/CrashPlan/4/Configuring/Use_Crash...

guiambros · on June 9, 2017

+1.

I used Crashplan for 3 years on a Synology NAS. It's a disaster. Every time there was a Synology upgrade, the CP headless server would stop working, and you'd need to reinstall, re-set the keys, etc.

After 10 ou 15 times doing this, I got rid of Crashplan entirely, migrated my backups to Amazon Drive, and never looked back.

Given the lack of decent options, seems the best choice will really be to pony up the $180 for 3TB that Amazon will start charging next year...

klodolph · on June 8, 2017

If you were paying $60/year for 2-3T of cloud storage then Amazon was subsidizing you. Even Glacier would cost $120/year for 2.5T, and Glacier is so cheap that everyone's trying to figure out how they could possibly sell Glacier and still be making money.

fnord123 · on June 8, 2017

>Even Glacier would cost $120/year for 2.5G

$120/year for 2.5T.

https://aws.amazon.com/glacier/pricing/

klodolph · on June 8, 2017

Yes, thank you for pointing out the typo.

askvictor · on June 8, 2017

Isn't the catch that you have to pay to get your data out of there?

mikevm · on June 8, 2017

Why is CrashPlan incompatible with NAS? I am running it on a headless Ubuntu server and it works just fine (you just need about 1GB of RAM for every TB of storage).

glenneroo · on June 8, 2017

And if you happen to be running FreeNAS, there is even a plugin available via the GUI (same RAM rules apply).

mikevm · on June 8, 2017

Same for Synology.

guiambros · on June 9, 2017

Have you upgraded your Synology OS lately? For 3 years, every time I did it, the headless CP server would stop working.

mikevm · on June 9, 2017

I don't actually have Synology. A friend of mine does and he runs CrashPlan on it.

imron · on June 8, 2017

If it's just backup, and it's from a single computer (with potentially multiple external harddrives), then maybe BackBlaze: https://www.backblaze.com/cloud-backup.html

toast76 · on June 8, 2017

Backblaze (and others) don't support backing up from a NAS, which, for a family, is impractical.

Polyisoprene · on June 8, 2017

https://www.backblaze.com/b2/partner-synology.html

ptman · on June 8, 2017

But isn't that with B2 pricing, not the $5/month unlimited pricing?

snuxoll · on June 8, 2017

Sure, but B2's pricing isn't too expensive anyway. If I had all 7TB of usable space filled up on my NAS it'd cost me $35/mo - that's easily doable, even for a digital packrat like myself.

imron · on June 9, 2017

That's $420 a year, which is well over what the grand-grand-*-poster of this sidethread mentioned was too much.

Even for his smaller data size of 3TB it still works out to $180 a year which is the same as what he'd have to pay Amazon.

victorhooi · on June 11, 2017

For Google storage - you can get GSuite (https://gsuite.google.com/pricing.html) - $10 a user a month, for unlimited storage via Google Drive.

You can then mount the drive using DriveFS:

https://blog.google/products/g-suite/introducing-new-enterpr...

It's basically a FUSE filesystem built on top of Google Drive.

Alternatively, you can use the Drive Sync Client, if you want to just sync stuff back and forth (without a virtual FS).

nightfly · on June 8, 2017

The Glacier storage class on S3 would probably better if you like Amazon and are okay with Glacier's price. Backblaze's B2 is pretty cheap too, and has a nice API.

ptman · on June 8, 2017

Isn't Glacier separate from S3?

nightfly · on June 8, 2017

It's both: https://aws.amazon.com/s3/storage-classes/

algesten · on June 8, 2017

I'm using Resilio Sync (formerly BitTorrent Sync) which makes like a private cloud dropbox between my machines/phone. No versioning, but pretty solid.

izacus · on June 8, 2017

Google Cloud Nearline storage is rather cheap, doesn't have as much limitations as Glacier and is AWS API compatible so NAS backup software works with it.

rabboRubble · on June 8, 2017

Re Crashplan & NAS... I've managed to get NAS back up to work. Are you certain on this point? I am going to double check my set up.

I have the MacOS CrashPlan client configured to back up a variety of NAS shares when the NAS is powered on and the share is mounted. Only about 4 shares, and I made a point to mount them and leave them mounted until the sync completed.

The shares are cold storage, so once synced, they stay virtually unchanged.

tehlike · on June 8, 2017

maybe use https://camlistore.org/ on GCE?

nodesocket · on June 8, 2017

For fun, I just checked out Google Cloud Platform. 1 TB of regional storage costs $20.00 a month not including bandwidth which could be huge.

1 TB of egress bandwidth is $120.00 a month.

klodolph · on June 8, 2017

Regional storage is inappropriate for backup. On GCS, backup should be nearline or coldline, depending on how long you think it will be there.

Presumably you'd pay the $120 bandwidth fee seldom or never.

nodesocket · on June 8, 2017

Ok, Google Storage Nearline is still $10 a month for 1 TB. That's $120 a year vs $59.99 a year for Amazon Drive not including Google bandwidth which could be significant.

klodolph · on June 8, 2017

I feel like the comment about bandwidth got ignored—you only pay for egress bandwidth, which basically means you only are paying high bandwidth fees if you lost all of your data and it's an emergency, at which point they seem pretty reasonable because you just lost your house in a fire or something like that. Uploading is free (well, you pay your ISP).

Most of the time, people only need to restore a few files from backup because they were accidentally deleted. The bandwidth costs for a few GB here and there are pretty cheap.

Mathnerd314 · on June 8, 2017

I've been thinking that a p2p backup solution (encrypted storage, storage cryptocurrency, occasional random requests to make sure they're still around) would work. I guess these guys: https://storj.io/. $15/TB of storage, $50/TB of bandwidth. and competitors: https://news.ycombinator.com/item?id=13723722

dawnerd · on June 8, 2017

Where are you getting 240 for google drive? It's only costing me 120/year. (Well technically through gsuite)

guiambros · on June 9, 2017

$19.99 per month for 2TB plan [1].

[1] https://www.google.com/settings/storage

victorhooi · on June 11, 2017

You can buy a GSuite plan (https://gsuite.google.com/pricing.html) - which is $10 a user a month, for unlimited storage.

PinguTS · on June 8, 2017

Make a tool that makes photo files out of data files. In its simplest form you just need to make the according header files (added benefit: additional meta data can be easily encoded). Because as a Amazon Prime photo storage is for free and unlimited.

Could be a nice hack.

dingo_bat · on June 8, 2017

They could be reencoding, witch will destroy the data.

crest · on June 8, 2017

Use forward error correction.

hasch · on June 8, 2017

GCE's Nearline maybe?

luca_ing · on June 8, 2017

Backblaze b2?

AtlasLion · on June 8, 2017

I was just reading on reddit how some users were uploading petabytes of data to it. I am an ACD user, but I can't blame them for stopping that.

brandon272 · on June 8, 2017

I was always amused when I warned people on /r/datahoarder against abusing the service because Amazon would inevitably put an end to it. I was always told that I had no idea what I was talking about and was given many rationalizations about why Amazon wouldn't care about users storing dozens or hundreds of TB of files on the service.

selckin · on June 8, 2017

Not entirely fair, there are whole communities around storing 100s of TB on amazon

pedroaraujo · on June 8, 2017

It is fair from the moment you offer an unlimited plan and even more fair when you make a service of it and charge for it.

Customers are customers, not product managers. It is only natural to make use of a service you pay for.

selckin · on June 8, 2017

Indeed, and it's in their right to stop offering that when the period you payed for ends.

Its understandable from their point of view to offer unlimited and be awesome but not expect this kind of usage that is not sustainable. So they made a mistake and are correcting it.

It's hard to see it as a deliberate strategy to pull in users and then charge them more when they are "locked in"

_pmf_ · on June 8, 2017

> So they made a mistake and are correcting it.

Do they also refund people for their time wasted assuming this was a sustainable service, or does this "correction" only work in one direction?

vitalysh · on June 8, 2017

I'm not even mad. /r/datahoarders brought this on themselves. Who in their right mind expects to upload 100s of TBs of data, encrypted and pay 59.99?

ghughes · on June 8, 2017

People who understand the literal meaning of the word "unlimited"?

vitalysh · on June 8, 2017

Yes, Amazon is to blame here as well. They shouldn't have offered unlimited service.

At the same time, I don't get, why would you encrypt your "Linux ISO's"? Let the AWS dedup do its job, don't abuse it, and everyone is happy.

cf · on June 8, 2017

I don't get why if they don't mean unlimited just say up to 20TB/mo.

chc · on June 8, 2017

Possibly because there actually wasn't any limit. Maybe if a handful people were exceeding $LOTS TB, they don't care, but if 60% of users exceed $LOTS TB, the service becomes unsustainable. In this case, the service really is unlimited (there genuinely is no limit that you're not allowed to go over), and if you wanted that effect, advertising a limit would be net negative — a high limit would encourage the "too many users use a lot" case and lead to the same result we get now where the plan has to be canceled for unsustainability, and a low limit would defeat the purpose.

unixhero · on June 8, 2017

Because it isn't linux iso's.

nimchimpsky · on June 8, 2017

were they being serious, I can't tell.

tgjsrkghruksd · on June 8, 2017

@vitalysh

> At the same time, I don't get, why would you encrypt your "Linux ISO's"? Let the AWS dedup do its job, don't abuse it, and everyone is happy.

Because if you are a self-proclaimed data hoarder, do you have the time to sort through and selectively classify your hoard to "encrypt this ISO don't encrypt that tarball" on a file-by-file basis across many terabytes?

How much would be saved by deduping anyway? If they're not deliberately making it easy/redundant, even if you got 300TB down to 100TB or such, a single order-of-magnitude reduction doesn't fundamentally change the economics of "unlimited."

Blame data hoarders, but don't blame encryption.

vitalysh · on June 8, 2017

I store a bit of data at home (only ~20TB). Really easy to sort. There are plenty of apps that do it for you. This extension with those keywords in filename goes to this directory. Others to another dirs.

I only have my pictures and personal data in AWS cloud, encrypted. They way I set it up? Point rclone to relevant directories and skip the rest.

theshrike79 · on June 8, 2017

Except Amazon revoked rcloud's key a while back.

Any recommendations on the "plenty of apps" that sort your data for easy searching?

awirth · on June 8, 2017

As someone completely unfamiliar with this space, this prompted me to do some reading into this rclone issue. I'll record it here for anyone else similarly curious.

It seems that as of a few months ago, two popular (unofficial) command line clients for ACD (Amazon Cloud drive) were acd-cli[1] and rclone[2], both of which are open source. Importantly the ACD API is OAuth based, and these two programs took different approaches to managing their OAuth app credentials. acd-cli's author provided an app on GCE that managed the app credentials and performed the auth. rclone on the other hand embedded the credentials into their source, and did the oauth dance through a local server.

On April 15th someone reported an issue on acd-cli titled "Not my file"[3] in a user alleged that they had received someone else's file from using the tool. The author refered them to amazon support. The issue was updated again on May 13th with another user that had the same problem - this time with better documentation. The user reached out to security@amazon.com to report the issue.

Amazon's security team determined that their system was not at fault, but pointed out a race condition in the source for the acd-cli auth server (sharing the auth state in a global variable between requests...) and disabled the acd-cli app access to protect customers.[4]

In response to this banning, one user suggested that a workaround to get acd-cli working again would be to use the developer option for local oauth dance, and use rclone's credentials (from the public rclone source).[5] This got rclone's credentials banned as well,[6] presumably when the amazon team noticed that they were publicly available.

To top this all off, the ACD team also closed down API registration for new apps around this time (which seems to have already been a strenuous process). I suppose the moral of the story is that OAuth is hard.

[1]: https://github.com/yadayada/acd_cli [2]: https://github.com/ncw/rclone [3]: https://github.com/yadayada/acd_cli/issues/549 [4]: https://github.com/yadayada/acd_cli/pull/562#issuecomment-30... [5]: https://github.com/yadayada/acd_cli/pull/562#issuecomment-30... [6]: https://forum.rclone.org/t/rclone-has-been-banned-from-amazo...

oridecon · on June 8, 2017

I hope this (and the many more examples) put a stop to this "unlimited" bs. You can't say people were abusing a service that throws that keyword for marketing reasons.

brandon272 · on June 8, 2017

That is very selective of them. While their marketing materials said "unlimited", people chose to ignore the ToS which stated that they wouldn't tolerate abuse and that abuse was basically whatever they determined it to be.

Kocrachon · on June 8, 2017

One guy in particular admitted to having 1PB stored. People like him fucked the rest of us over.

PKop · on June 8, 2017

Yes.. but them not having an upper limit doomed "the rest of you" from the beginning. Is anyone surprised some would do that? Is Amazon? Should they be? Of course not..

awirth · on June 8, 2017

Source on the 1PB guy for the curious: https://www.reddit.com/r/DataHoarder/comments/5s7q04/i_hit_a...

chrisan · on June 8, 2017

Ouch, reading those comments, even by the OP... the writing was on the wall then even

FireBeyond · on June 8, 2017

Who is now at 1.5PB, while someone else replies to him who has a 1.4PB flair, and another has a 1.1PB flair...

dawnerd · on June 8, 2017

Looks like it's really the plex people to blame. They were hosting tons of TBs of pirated movies/tv shows.

thanksgiving · on June 8, 2017

And why is that a problem? Copyright is theft.

derefr · on June 8, 2017

Corporations see "complicity in an illegal act" as a negative utility far larger than the ultimate lifetime value of any single customer. So, when you do something illegal (even if for dumb reasons) and use a corporate service to do so, you've got to expect that said corporation will immediately try to distance themselves from complicity in that act by terminating your account with them. This is one of those "inherent in the structure of the free market" things.

thanksgiving · on June 8, 2017

Why? Isn't this like a private storage? Unless people are sharing the files, why should Amazon care what's in the files?

richmarr · on June 8, 2017

So, first of all I think you're focusing on the wrong thing.

The whole point of an unlimited tier is to attract large numbers of outsiders who don't want the cognitive burden of figuring out $/GB/month and estimating how many GB photos they'll need to store.

What we're talking about here is that they got some customers like that, but they also got a small number of customers taking them for a ride, call them 'power users' the kind of customers who (as we see elsewhere in these comments) won't stick around if the price changes.

There's nothing wrong with these power users storing huge amounts of data at subsidised price, just like there's nothing wrong with Amazon changing the pricing. They just decided to stop subsidising that behaviour and probably take a slight hit on a conversion rate somewhere.

As for your question about 'private' storage, it's a grey area. Privacy isn't absolute, especially in cases where a company is by inaction helping you breaking the law (whether you agree with the law or not). Companies work very hard to distance themselves from responsibility for their customers actions and don't want to jeopardise that by letting it get out of hand

thanksgiving · on June 8, 2017

> Privacy isn't absolute, especially in cases where a company is by inaction helping you breaking the law (whether you agree with the law or not). Companies work very hard to distance themselves from responsibility for their customers actions and don't want to jeopardise that by letting it get out of hand

How does this work with Google Play Music (you can upload up to 50k songs for free and listen to it "on the cloud")?

I think you are focusing on the wrong thing. Corporations don't care about the law any more than individuals do. Laws and regulations are just guidelines if you are determined enough to get your way. Look at all the Uber stories. Pretty sure people here still like Travis for his tenacity no matter what you say about his morality.

I think we often forget that humans wrote the laws we have today. They didn't come to us in stone tablets down the mountain top. At the end of the day, these laws don't matter. They are not written in stone so as to speak. We should always strive to do better. Intellectual property is a sham. I mean think about it. I think there is legitimate intellectual property, the trademark.

I think it is wrong for me to sell "Microsoft Windows" (even if I wasn't charging any money) if I had modified the software and added malware into it. But me watching a movie or reading a book without paying royalties does not hurt anyone.

Please think about it. Just because something is legal does not make it right and just because something is illegal does not make it wrong. We need to calibrate our laws based on our image and not the other way round. We write the laws. The laws don't write us.

richmarr · on June 15, 2017

> Corporations don't care about the law any more than individuals do.

I'm struggling to find a connection between the points that I made in my comment and the points in your reply. Suspect we have some miscommunication here... my own comment wasn't spectacularly well filtered.

I'll bite on these though;

> Laws and regulations are just guidelines if you are determined enough to get your way. Look at all the Uber stories.

Don't conflate civil or criminal law with the work of regulatory bodies, who in my experience with the FCA and OFT are very open and collaborative without any need for "tenacity".

Uber work very hard on marketing and competition, but they are allowed to succeed to regulators who WANT them to succeed despite their amoral hussle, not because of it. Regulators in my experience (the FCA and OFT specifically) are very open and collaborative. They understand that markets move on and regulations sometimes lead and sometimes follow.

> Please think about it. Just because something is legal does not make it right...

So, I'm assuming from this comment that you're quite young. Just for you information; I suspect most folks on HN are already aware of the delta between legality and morality.

I'd also recommend thinking about the subjective nature of morality, and the causes and malleable nature of it.