
Ah, the old "give them a bunch of storage and then ask for more money to keep storing it" meme.



Indeed.

My annual cost will jump from $60 to $180. That's too much for simple offline backup, so it's time to start looking for options again :(

Glacier may be a more affordable option, but my experience with it a few years ago was terrible.

Any suggestions? Google Drive is also pricey ($240); Crashplan is incompatible with NAS, and tarsnap is out of the question (>$6,000/year).


I personally run syncthing on several devices, and don't worry about the cloud. It's self-hosted, devices replicate files between themselves, and there's no real limit other than hard drive space. It runs on just about anything too; several of my backup systems are Raspberry Pis.

It can be a bit weird to set up initially, and is a lot less magical in the interest of putting you in control for privacy reasons, but the added flexibility is pretty useful. I have a music folder that I sync to my phone without needing to pull the rest of my backups along with it, since they wouldn't fit anyway. Several of my larger folders aren't backed up on every single device for similar reasons, but some of my really important smaller folders (documents, photos, regular backups of my website's database) go on everything just because they can.

Anyway, check it out. Highly recommended all around: https://syncthing.net/


If you accidentally delete some files (even all the files!), won't Syncthing delete all the "backups"?

I don't use Syncthing, I use an rsync script I wrote over 10 years ago, using the --link-dest option to keep incremental backups for around 2 years.

This relies on Zsh's fancy globbing, but the gist of it is:

    date=$(date +%Y%m%d-%H%M)

    # $users is assumed to hold the list of users; $backups, $configdir,
    # $tempfile and $from are set elsewhere in the script
    for user in $users; do

        # existing snapshot directories, newest first
        # (zsh glob qualifiers: N = no error if none match, / = dirs only, om = order by mtime)
        older=( $backups/$user/*(N/om) )

        # ${^older[1,20]} distributes --link-dest= over the 20 newest snapshots
        # (rsync accepts at most 20 --link-dest directories); unchanged files are
        # hard-linked against them, so each dated directory looks like a full backup
        rsync --archive --recursive \
            --fuzzy --partial --partial-dir=$backups/$user/.rsync-partial \
            --log-file=$tempfile --link-dest=${^older[1,20]} \
            --files-from=$configdir/I-$user \
            --exclude-from=$configdir/X-$user \
            $user@$from:/ $backups/$user/$date/
    done


Syncthing has options to store versions of files so that scenario is easily avoided: https://docs.syncthing.net/users/versioning.html


Unfortunately, in my experience, Syncthing's versioning mechanisms leave much to be desired compared to what I'm used to from Dropbox. AFAIK all of Syncthing's versioning schemes only keep versions of files that have been changed _on other devices_, not of files changed on the device itself. What I'm looking for is an option to keep a synchronized version history for all files on all devices, and the ability to more intuitively roll back and roll forward the state of any file to any revision without having to mess with manually moving and replacing files and reading timestamps (better yet would be the ability to do so for entire directories, but I realize this would probably be very difficult to accomplish across devices in a decentralized manner).


I used a similar script for a long time, but now I'm using rsnapshot.


For me, one of the main benefits of cloud-based backup is that it's off-site - so if my house burns down, my data is still safe.


Think about a Media Safe. Some are really expensive, but this one is not too bad. Just a really small storage area. https://www.amazon.com/First-Alert-2040F-Water-Media/dp/B000...


What about break-ins? Someone enters your place and steals your NAS (and the Media Safe)...


This.

That's my primary use case for Amazon Drive. I have a robust rsync of the workstations and laptops to a NAS, and then to a second (incremental-only, no delete) NAS. Works great, but if the house burns down, or if someone breaks in and steals the computers, I want to ensure there's a copy somewhere.


If that is your main concern you could always put it on an external drive and put it in a bank safe deposit box. I've thought about doing that for at least the very important things, perhaps even printing some important pictures too.


You just need another house to burn down.

Don't you have friends or relatives at a reasonable distance who can set up mutual backups on each other's home servers?


Yes, but nobody else with an FTTC internet connection with an unlimited bandwidth allowance (I'm in the UK). I have 2TB of data, so speed is important.


I don't have a single friend that has a home server. Most adults don't even own computers anymore, just phones and perhaps an iPad.


A good scenario is building a backup server/NAS solution that you can put in a little cubby at your friend's place. There's trust involved that you're not using their internet to hack the government, and you have to be mindful of their bandwidth/power costs. So not a rackmount server or even a tower, but something much smaller and very appliance-looking. A NUC sitting atop a WD Passport or one of their "My Book" drives.

If it provides them a benefit like an in-house Plex server, even better.


Another option would be to rent a safety deposit box at a bank for $25 per year and store your backups there as flash drives. Cheap and very secure.

Of course it requires going to the bank regularly to update the backup.


I've moved mostly to syncing through Syncthing for my devices too, but I'm curious: what do people use for sharing files with others, and for accessing files through a browser on machines you don't control?


So you're one house fire away from losing all your data forever.


There's Siacoin, a cryptocurrency/blockchain built around the idea of decentralized encrypted p2p storage.

It's dirt cheap to store data there as of now: the median contract price is $12/TB·mo, but network storage utilization is currently only 2%, so actual deals settle at about $2/TB·mo. The downside is that the exchange rate of their coin is highly volatile, at least it was during the last month.

https://sia.tech/

http://siahub.info/

http://siapulse.com/page/network (Prices tab)


Do these decentralized storage networks provide any guarantees in terms of durability, redundancy and availability? I've been looking into Siacoin, Filecoin, Storj and the like, but lack of clarity around some important concerns has so far prevented me from taking them seriously as a backup solution:

1. Performing a restore in a timely fashion on a large dataset seems like a tall order if these networks don't impose any minimums for the upstream bandwidth of the hosts.

2. Files can completely disappear from the network if the machines that are hosting them happen to go dark for whatever reason, which seems to be a much more likely occurrence for some random schnub hosting files for beer money than it would be for traditional storage providers that have SLAs and reputations to uphold.

Maybe these concerns are unfounded, and some or all of these networks already have measures in place to address them? I'd appreciate it if someone more familiar with these networks could enlighten me if that's the case.


In addition to redundancy, Sia has the concept of collateral, which is basically money locked in a smart contract that says "I'm willing to bet this money that I'm not going to lose your files", i.e. hosts lose the money if they fail to store your files.

Different hosts have different amounts of collateral, and it's both an important security measure and a market mechanism.

Also, Sia is completely decentralized (unlike Storj, for example), so no one can interfere with it in a way that might result in lost files.


Speaking as a Sia developer, I can address your concerns.

> these networks don't impose any minimums for the upstream bandwidth of the hosts.

Sia today primarily handles that through gross redundancy. If you are using the default installation, you're going to be putting your files on 50 hosts. A typical host selection is going to include at least a few sitting on large pipes. Downloads on Sia today typically run at about 80 Mbps (the graph is really spiky, though; it swings between about 40 Mbps and 300 Mbps).

We have updates in the pipeline that will allow you to speedtest hosts before signing up with them, and will allow you to continually monitor their performance over time. If they cease to be fast enough for your specific needs, you'll drop them in favor of a new host. ETA on that is probably ~August.

> Files can completely disappear from the network if the machines that are hosting them happen to go dark for whatever reason

We take host quality very seriously, and it's one of the reasons that our network has 300 hosts while our competitors are reporting something like 20,000 hosts. To be a host on Sia, you have to put up your own money as collateral. You have to go through this long setup process, and there are several features that renters will check for to make sure that you are maintaining your host well and being serious about hosting. Someone who just sets Sia up out of their house and then doesn't maintain it is going to have a very poor score and isn't going to be selected as a host for the most part.

Every time someone puts data on your machine, you have to put up some of your own money as collateral. If you go dark, that money is forfeit. This scares away a lot of hosts, but that's absolutely fine with us. If you aren't that serious about hosting we don't want you on our network.

> but lack of clarity around some important concerns have so far prevented me from taking them seriously

We are in the middle of a re-branding that we hope introduces more clarity around this type of stuff as it relates to our network.


This is the one I've got my eye on - once the marketplace boots up on both sides, it's going to be hard to compete against it. I suspect some day even the big providers like Amazon and Google will sell into these kinds of marketplaces.


I'm calling it, it's not gonna happen.

For data storage, you need error encoding. Sia does that, but you pay for it. So for 1TB of data, you upload 2TB to the network (that's how Sia is configured) and at the current $2.02/TB per month, that's $4.04/TB, which is more expensive than Glacier. Glacier charges funny for downloads but Sia charges for downloads too.

I assume that if you wanted to store ~2.5TB like we're talking about, you'd be paying more than $4/TB, because 2.5TB is 10% of all data currently stored in Sia, which is 24.5 TB. (By comparison, the major cloud providers are undoubtedly in the exabyte range of actual data stored. Or for another comparison, you could comfortably hold 24.5 TB of storage media in one hand.)

Sia promises to be cheap because you're using unused bytes in hard drives that people already bought, but that's exactly what Amazon, Google, and Microsoft are already doing, except their data centers are built in places where the electricity costs less than what you're paying. Plus they don't charge you extra for data redundancy.


In that case, Sia provides an avenue for a new company with access to cheap electricity to compete with Amazon, Google, and Microsoft without investing a cent in marketing or product. They will just plug in and start receiving payments, strengthening the network and lowering the price in the process.

Another cool thing is that Sia lets hosts set their own storage and bandwidth prices, so specialized hosts will likely pop up. For example, one host might use tape drives and set a cheap storage cost and an expensive bandwidth cost. Clients can prioritize as desired. SSD servers with good peering can do the opposite.

The real interesting part will be when you can create one-time-use URLs to pass out, which connect directly to the network - effectively turning it into a distributed CDN.


The $2 / TB / Mo we've traditionally advertised as our price included 3x redundancy. The math we've done on reliability suggests that really you only need about 1.5x redundancy once you are using 96 hosts for storage.

The network prices today are less friendly, though that's primarily due to market confusion. The siacoin price has doubled 6 times in 6 months, and there's no mechanism to automatically re-adjust host prices as the coin price moves around. So hosts are all currently advertising storage at these hugely inflated rates, and newcomers to Sia don't realize that these aren't really competitive prices.

Though, I will assert that even at our current prices it's not price that's the primary barrier to adoption. It's some combination of usability and uncertainty. Sia is pretty hard to set up (it's around 8 steps, with two of those steps taking over an hour to complete), and a lot of people are not certain that Sia is truly stable enough to hold their data.

We're focused on addressing these issues.


You can't compare to Glacier. S3 is a more comparable product. And obviously redundancy is already in the price, or did you think there's no redundancy?


From what I understand, your client does the error encoding and pays for raw data storage on the network, rather than trusting the network to do error encoding. You can configure the encoding to whatever you want, you just end up paying more for more redundant encodings.


Isn't this exactly what Pied Piper gets used for in later seasons?


Currently trying Backblaze: https://www.backblaze.com/b2/cloud-storage.html. Overall fits my needs.


I used Backblaze for several years before closing out my account in 2012.

Initial backup took a long time. There was no easy way to prioritize, for example, my photos over system files. I ended up manually prioritizing by disallowing pretty much my entire filesystem, and gradually allowing folders to sync. First, photos, then documents, then music, etc.

Eventually it all got synced up and it was trouble-free... until I tried to get my data back out.

The short version of the story is that a power surge fried my local system. I bought a new one and had some stress when it appeared the BB client was going to sync my empty filesystem (processing it as a mass delete of my files). I managed to disable the sync in time.

Then I discovered there was no way to set the local BB client to pull my files back down. Instead, I had to use their web-based file manager to browse all my folders and mark what I wanted to download. BB would then zip-archive that stuff, which would only be available as an http download. There was no rsync, no torrent, no recovery if the download failed halfway, and no way to keep track of what I had recently downloaded. Also, iirc, the zip files were limited to a couple of GB in size, which didn't matter because at that time the download would always fail if the file was larger than __MB (I don't remember the exact number. 100MB? 300? I'm also hazy on the official zipfile size limit).

So I had to carefully chunk up my filesystem for download because the only other option BB offered was to buy a pre-filled harddrive from them (that they would ship to me).

I felt like Backblaze was going out of their way to make it hard for me in order to sell me that harddrive of my data. I felt angry about that and stubbornly downloaded my data one miserable zipfile at a time until I had everything.

Once I was reasonably sure I had everything I cared about, I closed my account and haven't looked back.

[Edit to add] This was at least 5 years ago. No doubt their service has improved since then.


I would think that for a full restore you might be better off with their restore by mail. Note that if you move your data off the drive they ship and then send it back, they refund the charge for it.


I use Backblaze but haven't had to do a restore yet. It appears their current limit is 500 GB per zip file. They also have a "Backblaze Downloader" utility (Mac & Windows) that can resume interrupted downloads.

https://help.backblaze.com/hc/en-us/articles/217665888-How-t...


It looks like styx31 linked to B2, which is a separate service from their backup product and closer to S3 or Google Cloud Storage. With that you can use rclone, which should avoid the issues you encountered, though at higher cost if you have a lot of data (there are per-GB storage and download fees).
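For anyone going this route, a minimal rclone sketch (the remote and bucket names here are made up; the B2 remote is assumed to have been set up beforehand with "rclone config"):

    # one-way copy of a NAS share into a B2 bucket; nothing is deleted on the remote side
    rclone copy /mnt/nas/photos b2remote:my-backup-bucket/photos --transfers 8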


My restore experience with Backblaze was also poor. The download speed was slow. If I had to restore an entire drive, it would have taken me many days to download the entire thing.

I switched to Arq with Amazon Drive as the storage backend.


Well, you get to choose: cheap backup vs. expensive restore. Better than an impossible restore (that is, if you don't do backups).


I'm in the process of uploading 2TB of backups to B2. It's ridiculously cheap, and they don't charge for upload bandwidth, just storage and download.


You could try out one of these new storage cryptocurrencies:

https://www.storj.io

http://sia.tech/

https://filecoin.io/

https://maidsafe.net/

I haven't used them myself so I can't vouch for the UX or quality, but they should be able to offer pretty low prices.


I feel like a luddite but I have three backups at home (PC HD, 2 rsync'd USB drives I bought several years ago) and one off-site backup (encrypted HD in locker at work). Far cheaper afaict than any cloud backup.


I think this is a good basic and relatively low-tech strategy.

Do you do versioning? As in what happens if your files are silently corrupted e.g. by accident or by malware? Rsync would overwrite your files, and you might even overwrite your off-site backup when you connect it.

My main reason for going beyond such a set-up though is that it takes time, effort and remembering to sync the off-site backup by taking it home, syncing and putting it back. And during that time all your data is in the same place. If something happens to your home during that time (break-in, flooding, fire...) you're out of luck. Unless your rsync'd drives are also encrypted and you just switch one of them with the off-site one for rotation.


One of my backups is 'add only'

The really key stuff is in git repos.

Most of the data (films, mostly) I could stand to lose.


My offsite backup is likewise an encrypted disk stored at a friend's house, and vice versa. After the initial hardware purchase cost it's free.


And, based on my experience, generally horrifically out of date.


* cheap

* capacity

* convenient

choose 2


I have a Raspberry Pi at my parents' home (which has r/w access to the disk attached to my father's AirPort Extreme); it rsyncs every night with the server in my basement (which has all my data on 2 disks). It also syncs my parents' data back to me. It works well, but I still need to add a feature to email me if syncing somehow halts or errors out. I use "rsync -av" (over SSH), so nothing is ever deleted.
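One way to bolt on that email-on-failure piece (a sketch only, not the setup described above; host names, paths and the address are placeholders, and it assumes a working "mail" command such as mailx or an msmtp wrapper on the Pi):

    #!/bin/sh
    # run nightly from cron; mail the log if rsync hangs past 6 hours or exits non-zero
    LOG=$(mktemp)
    timeout 6h rsync -av -e ssh pi@homeserver:/data/ /mnt/backup/data/ >"$LOG" 2>&1
    STATUS=$?
    if [ "$STATUS" -ne 0 ]; then
        mail -s "nightly rsync failed (exit $STATUS)" me@example.com < "$LOG"
    fi
    rm -f "$LOG"

The timeout wrapper covers the "halts" case as well as outright errors, since a hung transfer gets killed and reported the same way.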


> so nothing is ever deleted.

It could be overwritten though. A good backup protects you from more than just destruction at the primary site. There are various relatively efficient ways to arrange snapshots when using rsync as your backup tool.

Also, remember to explicitly test your backups occasionally, preferably with some sort of automation because you will forget to do it manually, to detect unexpected problems (maybe the drive(s)/filesystem in the backup device are slowly going bad, but in a way that only affects older data and doesn't stop new changes being pushed in).


Versioning backups seems like a must. Encrypting malware is a thing and has been for a while, just like rm -rf type mistakes which are subsequently propagated automatically to "backups".


Another thing that I do with my backups is making it so that the main machine can't access the backups directly and vice versa. It is slightly more faff to set up, adds points of failure (though automated testing is still possible), and is a little more expensive (you need one extra host), but not significantly so.

My "live" machines push data to an intermediate machine, the the backup locations pull data from there. This means that the is no one machine/account that can authenticate against everything. Sending information back for testing purposes (a recursive directory listing normally, a listing with full hashes once a month, which in each case gets compared to the live data and differences flagged for inspection) is the same in reverse.

This way a successful attack on my live machines can't be used to attack the backups, and vice versa. To take everything, you would need to hack into all three hosts separately.

Of course as with all security systems, safe+reliable+secure+convenient storage of credentials is the next problem...


This is especially true with Ransomware type attacks that encrypt/corrupt data. Having a backup of unusable files isn't doing anyone any good.


Crashplan isn't incompatible with NAS. You can either mount a share and run it from your workstation, or run it directly on the NAS itself. The core of the product is Java so it runs on just about any architecture to boot.


Coming from someone who tried this setup, it wasn't worth it. CrashPlan's client isn't something you generally want to run on your NAS: it takes memory proportional to the amount of data on your disk (and a fair amount of RAM at that), and unless you're running a GUI on your NAS it's impossible to configure without a huge headache.

You can run it from your workstation, but if you've got a reasonable amount of data on your NAS then the memory issues will bite you again. Something like Backblaze B2 is more expensive, but I'd rather pay $10/mo to back up the 2TB of data on my NAS (growing every day) and use CrashPlan to back up my computers only.


> CrashPlan's client isn't something you generally want to run on your NAS: it takes memory proportional to the amount of data on your disk (and a fair amount of RAM at that), and unless you're running a GUI on your NAS it's impossible to configure without a huge headache.

CrashPlan's client is able to attach to a headless instance [1], but the RAM requirement does mean that it's only really usable on NASes with expandable RAM.

[1] https://support.code42.com/CrashPlan/4/Configuring/Use_Crash...


+1.

I used Crashplan for 3 years on a Synology NAS. It's a disaster. Every time there was a Synology upgrade, the CP headless server would stop working, and you'd need to reinstall, re-set the keys, etc.

After 10 or 15 times doing this, I got rid of Crashplan entirely, migrated my backups to Amazon Drive, and never looked back.

Given the lack of decent options, seems the best choice will really be to pony up the $180 for 3TB that Amazon will start charging next year...


If you were paying $60/year for 2-3T of cloud storage then Amazon was subsidizing you. Even Glacier would cost $120/year for 2.5T, and Glacier is so cheap that everyone's trying to figure out how they could possibly sell Glacier and still be making money.


>Even Glacier would cost $120/year for 2.5G

$120/year for 2.5T.

https://aws.amazon.com/glacier/pricing/


Yes, thank you for pointing out the typo.


Isn't the catch that you have to pay to get your data out of there?


Why is CrashPlan incompatible with NAS? I am running it on a headless Ubuntu server and it works just fine (you just need about 1GB of RAM for every TB of storage).


And if you happen to be running FreeNAS, there is even a plugin available via the GUI (same RAM rules apply).


Same for Synology.


Have you upgraded your Synology OS lately? For 3 years, every time I did it, the headless CP server would stop working.


I don't actually have Synology. A friend of mine does and he runs CrashPlan on it.


If it's just backup, and it's from a single computer (with potentially multiple external hard drives), then maybe Backblaze: https://www.backblaze.com/cloud-backup.html


Backblaze (and others) don't support backing up from a NAS, which, for a family, is impractical.



But isn't that with B2 pricing, not the $5/month unlimited pricing?


Sure, but B2's pricing isn't too expensive anyway. If I had all 7TB of usable space filled up on my NAS it'd cost me $35/mo - that's easily doable, even for a digital packrat like myself.


That's $420 a year, which is well over what the grand-grand-*-poster of this sidethread mentioned was too much.

Even for his smaller data size of 3TB it still works out to $180 a year which is the same as what he'd have to pay Amazon.


For Google storage - you can get GSuite (https://gsuite.google.com/pricing.html) - $10 a user a month, for unlimited storage via Google Drive.

You can then mount the drive using DriveFS:

https://blog.google/products/g-suite/introducing-new-enterpr...

It's basically a FUSE filesystem built on top of Google Drive.

Alternatively, you can use the Drive Sync Client, if you want to just sync stuff back and forth (without a virtual FS).


The Glacier storage class on S3 would probably be better if you like Amazon and are okay with Glacier's price. Backblaze's B2 is pretty cheap too, and has a nice API.


Isn't Glacier separate from S3?



I'm using Resilio Sync (formerly BitTorrent Sync), which acts like a private Dropbox-style cloud between my machines/phone. No versioning, but pretty solid.


Google Cloud Nearline storage is rather cheap, doesn't have as many limitations as Glacier, and is AWS-API compatible, so NAS backup software works with it.


Re Crashplan & NAS... I've managed to get NAS backup to work. Are you certain about this point? I am going to double-check my setup.

I have the MacOS CrashPlan client configured to back up a variety of NAS shares when the NAS is powered on and the share is mounted. Only about 4 shares, and I made a point to mount them and leave them mounted until the sync completed.

The shares are cold storage, so once synced, they stay virtually unchanged.


maybe use https://camlistore.org/ on GCE?


For fun, I just checked out Google Cloud Platform. 1 TB of regional storage costs $20.00 a month not including bandwidth which could be huge.

1 TB of egress bandwidth is $120.00 a month.


Regional storage is inappropriate for backup. On GCS, backup should be nearline or coldline, depending on how long you think it will be there.

Presumably you'd pay the $120 bandwidth fee seldom or never.
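For the curious, roughly what that looks like with gsutil (a sketch; the bucket name, region and local path are placeholders, and gsutil is assumed to be installed and authenticated):

    # create a Nearline bucket
    gsutil mb -c nearline -l us-east1 gs://my-backup-bucket
    # mirror a local directory into it, with parallel transfers
    gsutil -m rsync -r /mnt/nas/backup gs://my-backup-bucket/nas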


Ok, Google Storage Nearline is still $10 a month for 1 TB. That's $120 a year vs $59.99 a year for Amazon Drive not including Google bandwidth which could be significant.


I feel like the comment about bandwidth got ignored—you only pay for egress bandwidth, which basically means you only are paying high bandwidth fees if you lost all of your data and it's an emergency, at which point they seem pretty reasonable because you just lost your house in a fire or something like that. Uploading is free (well, you pay your ISP).

Most of the time, people only need to restore a few files from backup because they were accidentally deleted. The bandwidth costs for a few GB here and there are pretty cheap.


I've been thinking that a p2p backup solution (encrypted storage, a storage cryptocurrency, occasional random requests to make sure the data is still around) would work. I guess these guys: https://storj.io/. $15/TB of storage, $50/TB of bandwidth. And competitors: https://news.ycombinator.com/item?id=13723722


Where are you getting $240 for Google Drive? It's only costing me $120/year. (Well, technically through GSuite.)


$19.99 per month for the 2TB plan [1].

[1] https://www.google.com/settings/storage


You can buy a GSuite plan (https://gsuite.google.com/pricing.html) - which is $10 a user a month, for unlimited storage.


Make a tool that makes photo files out of data files. In its simplest form you just need to prepend the appropriate image headers (added benefit: additional metadata can be easily encoded), because Amazon Prime photo storage is free and unlimited.

Could be a nice hack.
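A crude sketch of the idea (file names made up): most JPEG readers stop at the end-of-image marker, so arbitrary bytes appended after a small real photo ride along invisibly, as long as the service stores the file byte-for-byte.

    # wrap: append the archive after a tiny valid JPEG
    cat cover.jpg data.tar.gz > photo-0001.jpg

    # unwrap: strip the known-size cover image from the front (GNU stat syntax)
    tail -c +$(( $(stat -c %s cover.jpg) + 1 )) photo-0001.jpg > restored.tar.gz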


They could be re-encoding, which would destroy the data.


Use forward error correction.
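For example with par2 (a sketch; the redundancy percentage and file names are arbitrary):

    # create 30% recovery data alongside the archive
    par2 create -r30 data.tar.gz.par2 data.tar.gz
    # later: check the stored copy and repair it if it came back damaged
    par2 verify data.tar.gz.par2
    par2 repair data.tar.gz.par2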


GCS's Nearline maybe?


Backblaze b2?


I was just reading on reddit how some users were uploading petabytes of data to it. I am an ACD user, but I can't blame them for stopping that.


I was always amused when I warned people on /r/datahoarder against abusing the service because Amazon would inevitably put an end to it. I was always told that I had no idea what I was talking about and was given many rationalizations about why Amazon wouldn't care about users storing dozens or hundreds of TB of files on the service.


Not entirely fair; there are whole communities around storing 100s of TB on Amazon.


It is fair from the moment you offer an unlimited plan and even more fair when you make a service of it and charge for it.

Customers are customers, not product managers. It is only natural to make use of a service you pay for.


Indeed, and it's within their rights to stop offering that when the period you paid for ends.

It's understandable from their point of view to offer unlimited and be awesome, but not to expect this kind of usage, which is not sustainable. So they made a mistake and are correcting it.

It's hard to see it as a deliberate strategy to pull in users and then charge them more when they are "locked in"


> So they made a mistake and are correcting it.

Do they also refund people for their time wasted assuming this was a sustainable service, or does this "correction" only work in one direction?


I'm not even mad. /r/datahoarder brought this on themselves. Who in their right mind expects to upload 100s of TBs of data, encrypted, and pay $59.99?


People who understand the literal meaning of the word "unlimited"?


Yes, Amazon is to blame here as well. They shouldn't have offered unlimited service.

At the same time, I don't get, why would you encrypt your "Linux ISO's"? Let the AWS dedup do its job, don't abuse it, and everyone is happy.


I don't get why, if they don't mean unlimited, they don't just say up to 20TB/mo.


Possibly because there actually wasn't any limit. Maybe if a handful people were exceeding $LOTS TB, they don't care, but if 60% of users exceed $LOTS TB, the service becomes unsustainable. In this case, the service really is unlimited (there genuinely is no limit that you're not allowed to go over), and if you wanted that effect, advertising a limit would be net negative — a high limit would encourage the "too many users use a lot" case and lead to the same result we get now where the plan has to be canceled for unsustainability, and a low limit would defeat the purpose.


Because it isn't linux iso's.


Were they being serious? I can't tell.


@vitalysh

> At the same time, I don't get, why would you encrypt your "Linux ISO's"? Let the AWS dedup do its job, don't abuse it, and everyone is happy.

Because if you are a self-proclaimed data hoarder, do you have the time to sort through and selectively classify your hoard to "encrypt this ISO don't encrypt that tarball" on a file-by-file basis across many terabytes?

How much would be saved by deduping anyway? If they're not deliberately making it easy/redundant, even if you got 300TB down to 100TB or such, a single order-of-magnitude reduction doesn't fundamentally change the economics of "unlimited."

Blame data hoarders, but don't blame encryption.


I store a bit of data at home (only ~20TB). Really easy to sort. There are plenty of apps that do it for you: this extension with those keywords in the filename goes to this directory, others go to other dirs.

I only have my pictures and personal data in the AWS cloud, encrypted. The way I set it up? Point rclone at the relevant directories and skip the rest.


Except Amazon revoked rclone's key a while back.

Any recommendations on the "plenty of apps" that sort your data for easy searching?


As someone completely unfamiliar with this space, this prompted me to do some reading into this rclone issue. I'll record it here for anyone else similarly curious.

It seems that as of a few months ago, two popular (unofficial) command-line clients for ACD (Amazon Cloud Drive) were acd-cli[1] and rclone[2], both of which are open source. Importantly, the ACD API is OAuth based, and these two programs took different approaches to managing their OAuth app credentials. acd-cli's author provided an app on GCE that managed the app credentials and performed the auth. rclone, on the other hand, embedded the credentials in its source and did the OAuth dance through a local server.

On April 15th someone reported an issue on acd-cli titled "Not my file"[3], in which a user alleged that they had received someone else's file from using the tool. The author referred them to Amazon support. The issue was updated again on May 13th by another user who had the same problem, this time with better documentation. That user reached out to security@amazon.com to report the issue.

Amazon's security team determined that their system was not at fault, but pointed out a race condition in the source for the acd-cli auth server (sharing the auth state in a global variable between requests...) and disabled the acd-cli app access to protect customers.[4]

In response to this banning, one user suggested that a workaround to get acd-cli working again would be to use the developer option for local oauth dance, and use rclone's credentials (from the public rclone source).[5] This got rclone's credentials banned as well,[6] presumably when the amazon team noticed that they were publicly available.

To top this all off, the ACD team also closed down API registration for new apps around this time (which seems to have already been a strenuous process). I suppose the moral of the story is that OAuth is hard.

[1]: https://github.com/yadayada/acd_cli

[2]: https://github.com/ncw/rclone

[3]: https://github.com/yadayada/acd_cli/issues/549

[4]: https://github.com/yadayada/acd_cli/pull/562#issuecomment-30...

[5]: https://github.com/yadayada/acd_cli/pull/562#issuecomment-30...

[6]: https://forum.rclone.org/t/rclone-has-been-banned-from-amazo...


I hope this (and the many other examples) puts a stop to this "unlimited" BS. You can't say people were abusing a service that throws that keyword around for marketing reasons.


That is very selective of them. While their marketing materials said "unlimited", people chose to ignore the ToS which stated that they wouldn't tolerate abuse and that abuse was basically whatever they determined it to be.


One guy in particular admitted to having 1PB stored. People like him fucked the rest of us over.


Yes.. but them not having an upper limit doomed "the rest of you" from the beginning. Is anyone surprised some would do that? Is Amazon? Should they be? Of course not..



Ouch. Reading those comments, even by the OP... the writing was on the wall even then.


Who is now at 1.5PB, while someone else replies to him who has a 1.4PB flair, and another has a 1.1PB flair...


Looks like it's really the Plex people to blame. They were hosting tons of TBs of pirated movies/TV shows.


And why is that a problem? Copyright is theft.


Corporations see "complicity in an illegal act" as a negative utility far larger than the ultimate lifetime value of any single customer. So, when you do something illegal (even if for dumb reasons) and use a corporate service to do so, you've got to expect that said corporation will immediately try to distance themselves from complicity in that act by terminating your account with them. This is one of those "inherent in the structure of the free market" things.


Why? Isn't this like a private storage? Unless people are sharing the files, why should Amazon care what's in the files?


So, first of all I think you're focusing on the wrong thing.

The whole point of an unlimited tier is to attract large numbers of outsiders who don't want the cognitive burden of figuring out $/GB/month and estimating how many GB photos they'll need to store.

What we're talking about here is that they got some customers like that, but they also got a small number of customers taking them for a ride. Call them 'power users': the kind of customers who (as we see elsewhere in these comments) won't stick around if the price changes.

There's nothing wrong with these power users storing huge amounts of data at subsidised price, just like there's nothing wrong with Amazon changing the pricing. They just decided to stop subsidising that behaviour and probably take a slight hit on a conversion rate somewhere.

As for your question about 'private' storage, it's a grey area. Privacy isn't absolute, especially in cases where a company is, by inaction, helping you break the law (whether you agree with the law or not). Companies work very hard to distance themselves from responsibility for their customers' actions and don't want to jeopardise that by letting it get out of hand.


> Privacy isn't absolute, especially in cases where a company is, by inaction, helping you break the law (whether you agree with the law or not). Companies work very hard to distance themselves from responsibility for their customers' actions and don't want to jeopardise that by letting it get out of hand

How does this work with Google Play Music (you can upload up to 50k songs for free and listen to it "on the cloud")?

I think you are focusing on the wrong thing. Corporations don't care about the law any more than individuals do. Laws and regulations are just guidelines if you are determined enough to get your way. Look at all the Uber stories. Pretty sure people here still like Travis for his tenacity no matter what you say about his morality.

I think we often forget that humans wrote the laws we have today. They didn't come to us on stone tablets from the mountain top. At the end of the day, these laws don't matter; they are not written in stone, so to speak. We should always strive to do better. Intellectual property is a sham. I mean, think about it. I think there is one legitimate form of intellectual property: the trademark.

I think it is wrong for me to sell "Microsoft Windows" (even if I wasn't charging any money) if I had modified the software and added malware into it. But me watching a movie or reading a book without paying royalties does not hurt anyone.

Please think about it. Just because something is legal does not make it right and just because something is illegal does not make it wrong. We need to calibrate our laws based on our image and not the other way round. We write the laws. The laws don't write us.


> Corporations don't care about the law any more than individuals do.

I'm struggling to find a connection between the points that I made in my comment and the points in your reply. Suspect we have some miscommunication here... my own comment wasn't spectacularly well filtered.

I'll bite on these though;

> Laws and regulations are just guidelines if you are determined enough to get your way. Look at all the Uber stories.

Don't conflate civil or criminal law with the work of regulatory bodies, who in my experience with the FCA and OFT are very open and collaborative without any need for "tenacity".

Uber work very hard on marketing and competition, but they are allowed to succeed by regulators who WANT them to succeed despite their amoral hustle, not because of it. Regulators in my experience (the FCA and OFT specifically) are very open and collaborative. They understand that markets move on and that regulations sometimes lead and sometimes follow.

> Please think about it. Just because something is legal does not make it right...

So, I'm assuming from this comment that you're quite young. Just for your information: I suspect most folks on HN are already aware of the delta between legality and morality.

I'd also recommend thinking about the subjective nature of morality, and the causes and malleable nature of it.


Hi. Out of interest, what is it that you do in your life to generate income / money for food?


I am a programmer (:


I call it the Dropbox Mantra.



