Backblaze restore for Personal Backup is awful
203 points by cloogshicer on Dec 12, 2021 | 230 comments
I've been using Backblaze for a few years for my home computer. You know how everyone keeps telling you, a backup is only a true backup once you've done at least one restore? Now I know why (silly me).

I just got a "Safety Freeze" error [1] - essentially some inconsistency with my backup. Backblaze will not tell me the actual cause of this inconsistency. It's possible that some data might be missing - Backblaze doesn't tell me though.

The only official solution is to manually check all files (millions in my case). I also can't download a full backup since Backblaze only allows downloads of up to 500GB at once. So my only option is to do a full hard drive restore, costing $189 + customs in Europe, so in the end probably closer to 300€. But even then I won't know if/which of my data was corrupted.

What bugs me is that the Backblaze desktop software should be able to resolve this - it should be possible to hash all the files in the most recent backup and cross-check them against the hashes of the files on my machine.
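
(To be concrete: the local half of that comparison is trivial - something like the sketch below would produce a manifest of hashes; it's the matching list of server-side hashes that Backblaze won't give me.)

    # hypothetical sketch: hash everything in my home folder into a manifest
    find ~ -type f -print0 | xargs -0 shasum -a 256 > local_manifest.txt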

Not sure what I should do now.

Edit: This is the response from the Backblaze support:

> I apologize, but this [=manually checking all files] would in fact be the only sure fire way if you are concerned about any deleted files. There wouldn't be a way to compare hashes the way you describe in this case as that mechanism is simply not implemented into the Backblaze software. The Backblaze software is intended to prevent data loss. It would not be intended to be able to automatically cross reference local and server data against each other to display what is and is not backed up on our servers.

They can't be serious with this?

[1] https://www.backblaze.com/safety_frozen.html




On macOS, Backblaze ships with 21 identical copies of the same executable, nearly 200 MB in all, presumably because they don't realize you can just execute one binary n times (with different argv[0] if they'd like) and they haven't written their code to be thread safe.

Needless to say it doesn't instill confidence in the quality of software engineering that I rely on for disaster recovery.

    % openssl md5 /Library/Backblaze.bzpkg/bztrans*
    MD5(/Library/Backblaze.bzpkg/bztrans_thread00)= 772308dbd9b8083f4dc1c31bfe6a28da  
    MD5(/Library/Backblaze.bzpkg/bztrans_thread01)= 772308dbd9b8083f4dc1c31bfe6a28da
    ...
    MD5(/Library/Backblaze.bzpkg/bztrans_thread19)= 772308dbd9b8083f4dc1c31bfe6a28da
    MD5(/Library/Backblaze.bzpkg/bztransmit)= 772308dbd9b8083f4dc1c31bfe6a28da


> presumably because they don't realize you can just execute one binary n times

This seems needlessly dismissive. I feel like they definitely know that you can execute the same binary multiple times.


Okay, so let's steelman it: what's a better explanation? Maybe a weird build issue? (Though I'm struggling to think of specifics that would match.)


A build issue could totally be it. I feel like that's really common in the frontend / NPM world if you're not careful: if two of your dependencies rely on a third sub-dependency, but those dependencies have specific and incompatible version requirements for that sub-dependency, then you'll end up with two different copies of that sub-dependency in order to satisfy those version requirements.


> > Backblaze ships with 21 identical copies of the same executable
>
> This seems needlessly dismissive. I feel like they definitely know that you can execute the same binary multiple times.

I'm the programmer at Backblaze who made the copies on purpose; I wrote some extra code to do this, and it's meant to help us debug certain things. Yes, they are identical: the installer ships with only one copy of the executable and then makes the copies on purpose. I get to explain this from time to time. :-)

In Windows when you want to know what is going on behind the scenes, you can bring up Task Manager and look at the different names of the different processes that are running. On the Macintosh this is called Activity Monitor, same sort of thing. The different names for the executables are for different "threads" which have different roles. Backblaze is multi-threaded to get higher performance.

The parent coordination process is called "bztransmit". But when doing the actual transmission it spawns the bztrans_thread01, bztrans_thread02, bztrans_thread03, etc.

So BEFORE I made multiple copies of the executables with different names, a customer would say "bztransmit is hung" or "bztransmit is using up too much memory", and there was very little visibility into this. But now that I've made multiple copies with different executable names, when the same customer says "bztrans_thread03 is hung" or "bztransmit is using too much memory", we have immediately narrowed down what to look at.

Here is a screenshot showing what "Chrome" looks like to me in Windows, and how it compares to Backblaze's bztransmit in Windows: https://i.imgur.com/KOJHJ9Q.jpg In that screenshot, you can see there is the "main thread" and the "worker threads". Meanwhile Chrome is just one big list of processes all named the same thing (see the screenshot). I prefer the Backblaze system, but I understand it upsets some customers who prefer the Chrome experience.

That's it. It's not some huge mystery.

One question asked here was whether we know you can launch the same executable twice. Yes, and we do that. bztrans_thread05 is launched for thread 05, thread 25, thread 45, thread 65, etc. It's THREAD_NUMBER mod 20. Here is what it looks like to hit 500 Mbits/sec upload speeds; this isn't photoshopped, it's a real screenshot on my development computer: https://i.imgur.com/hthLZvZ.gif
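
In other words, the name is chosen with nothing fancier than this (illustrative shell only, the client is not actually a shell script):

    exe="bztrans_thread$(printf '%02d' $((THREAD_NUMBER % 20)))"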

Another question is: why not use hard links or symbolic links? That's the only real optimization possible here; everything else was on purpose. The answer is not an excuse, it's just an explanation if you are curious. The software we develop at Backblaze is cross platform, so what we like to do is make the most general form first that will always work, then if customers complain or we want to refine it we add special-purpose code per file system or per platform. The most general thing to do is make full copies. We could then go on to make links on the Mac WHEN POSSIBLE and the equivalent on Windows WHEN POSSIBLE, but it never became a large priority. The reason I can't use one technology is that we support several file systems, and not all of them are the same or support the same technology for links.

Every feature we have is the result of prioritizing it over working on other things. Until recently, we did not have a lot of funding or an infinite supply of programmers, so we had to choose what order to implement each feature in. I'm not saying we got all the priorities correct, or that we did things in the correct order. I don't really even think there is one correct order. For example, some individual home user customers prefer saving 180 MBytes of their valuable boot SSD space over me implementing single sign on for our corporate customers. On the other hand some corporate customers DEMANDED single sign on or they wouldn't purchase the product at all. They are both correct, but there is only one of me, so we made some judgement calls and left the multiple copies and worked on single sign on. Some customers were happy, some are miserable.

We do have open client reqs for both Windows and Mac programmers, so if you would like to make a good salary with full benefits and help us out, come join us! :-)


Thanks for the explanation (and congrats on the IPO).

Consider also that having multiple distinct binaries interferes with (dis)allowing Backblaze to reach the internet with process-based firewalls like Little Snitch, because each copy needs to be configured separately.

Some tools, like Docker, have a function for collecting relevant diagnostics. Perhaps it would be useful to you to migrate towards a solution like that rather than asking the user to identify a misbehaving process on their own.


> and congrats on the IPO

Thanks, that was super exciting for us. After 14 years, I claim (and this is controversial) that we're no longer a startup and now we're just a mid-sized publicly traded company. :-)

> process-based firewalls like Little Snitch, because each copy needs to be configured separately

Yeah, that was actually a surprise and unfortunate. What the Mac architect (one of my business partners) and I think is that now that it is nice and stable, we might go down to 1 or 2 bztrans_thread executables, and one bztransmit. That seems like a better tradeoff where we waste much MUCH less disk space, and it is only 3 executables to allowlist in Little Snitch, and it achieves basically what we want now that it's stable and working well.

Originally there were 10 threads MAXIMUM, and we made 10 copies. And each copy was linking with shared libraries so it was only 10 MBytes of disk space which nobody noticed. Then Windows lost their friggin' minds with one of their releases and forced us to link statically, which bloated it way up to 5 or 10 MBytes per executable. Then we went to 20 threads maximum and the whole thing was silly. When we went to 100 threads maximum we said "enough" and went to mod 20 for re-using executable names.

By the way, ALL OF THIS could be avoided if Microsoft and Apple provided an API to set the name displayed in Task Manager/Activity Monitor. Maybe that's a security issue, I don't know. But frankly wouldn't it be SUPER TOTALLY USEFUL if chrome displayed the current web page loaded in the process name of each and every chrome process? Then you would know which one to kill when something goes sideways.


>By the way, ALL OF THIS could be avoided if Microsoft and Apple provided an API to set the name displayed in Task Manager/Activity Monitor.

The problem is process lists should be showing the true state of the computer. It wouldn't be a good idea to hide the actual executable name. But it sounds like it could be useful to add another column for "label", so that threads could set a label and offer more insight on the process list.


> useful to add another column for "label",

Yeah, that would work really well. When you look at the "services" control panel in Windows, there are two columns. One column is the "Name" of the service, and another column is a longer explanation with the column header "Description". I put a small description in there for bzserv (our service) plus a URL to our company website. I think this is just being polite; customers who don't recognize what "bzserv" is can immediately find more information on it.


Eh, even Clang does this. Install it on Windows and you'll get clang-cl.exe, clang++.exe, clang-cpp.exe, and clang.exe, all 90 MB executables that are almost entirely the same, but not bit-for-bit identical, so you can't hardlink them. I actually hate this too, but my point is it doesn't necessarily say much about software quality.


They are identical copies of each other. Since symlink support is not universal on Windows (possible since around the time WSL was introduced, but it still requires either special privileges or developer mode enabled), many tools including GNU autotools fall back to copying. Since recently (https://reviews.llvm.org/D99170), LLVM at least tries to create a symlink on Windows, but this still caused problems because symlinks are kind of unexpected on Windows. The NSIS installer used by the official Windows installer (https://llvm.org/builds/) does not support symlinks at all, so the installer includes 4 identical copies of Clang.

A clean solution would be to have stub executables that link to a central libClang.dll/libLLVM.dll (-DCLANG_LINK_CLANG_DYLIB=ON), but this is not supported under Windows because a DLL can export at most 2^16 symbols. Some work would need to be invested to make process launching on Windows work differently than on other platforms, but then disk space is not that much of a problem.


The Clangs are different from each other in some installations, and identical in others. In particular the Visual Studio installation doesn't have identical binaries, but the MSYS2 installation does. You'll have to check yours. For example, if you dir "%ProgramFiles%\Microsoft Visual Studio\2022\Community\VC\Tools\Llvm\bin\clang*.exe" you'll see they're slightly different sizes.

(Also note that in this case you don't need symlinks per se, just identical binaries with hardlinks. Each executable could inspect argv[0] and figure out what it's invoked as, and behave accordingly.)

What I don't understand is why they can't just bundle everything into a single DLL, then make whatever stub executables they want just call the exported main() in that DLL. I think that might be what MSYS2's copy already does, though I haven't checked. That DLL wouldn't need to export anything else, just the main() function that each .exe stub would forward to. And it can handle everything else internally if it so desires.
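
To illustrate the argv[0] idea from above, a rough sketch only ("clang-real" is a made-up name for a single combined driver, though --driver-mode itself is a real clang flag):

    #!/bin/sh
    # Hypothetical wrapper: one real driver behind several hardlinked names,
    # dispatching on the name it was invoked as (busybox-style).
    case "$(basename "$0")" in
        clang)     exec clang-real --driver-mode=gcc "$@" ;;
        clang++)   exec clang-real --driver-mode=g++ "$@" ;;
        clang-cl)  exec clang-real --driver-mode=cl  "$@" ;;
        clang-cpp) exec clang-real --driver-mode=cpp "$@" ;;
    esac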


Different reported sizes don't always mean the files are actually different (I'm not sure if that is completely true using 'dir'): https://superuser.com/questions/1353064/why-does-size-on-dis...


I'm well aware of all the nuances but I assure you the files are different here.


> Since symlink support is not universal on Windows (Possible since around the time WSL was introduced, but still requires either special privileges or developer mode enabled)

Symlinks have been supported since Vista, though by default creating a symlink does indeed require Admin rights. Hardlinks can be created by anybody and are extensively used by Windows itself.


> Symlinks have been supported since Vista

I'm not aware of even Windows 10 or Windows 11 being able to create Symlinks on FAT32 and exFAT, but I could be wrong? But it doesn't matter, the code as written is cross platform, it works on Apple Journaled File Systems, APFS, exFAT, etc. We then compile it for Macintosh, compile it for Windows, and compile it for Linux.

We can then spend extra time to carefully detect each filesystem and each platform and then make the optimization if we can. And it is a valid criticism that we have not done this yet. But no matter what, we need this general code that will always work FIRST; the links are a space optimization to save valuable SSD space when it is possible.


But these are bit-for-bit identical.


maybe they're not identical but backblaze has discovered 20 hash collisions that are also valid executables? that would be impressive in a way


Bit-for-bit identical means that they are actually identical though.


Edit: You're talking about the Backblaze binaries? Yeah that's why I said Clang's is almost bit-for-bit identical, so it's clear to everyone it's not 100% the same as the Backblaze situation. Not sure what your complaint is about my comment though? It's still practically the same problem, and you can solve it just as well with the argv[0] tricks or whatever.

Or are you saying your heuristic for determining the underlying engineering quality changes its output based on 1 bit flip? Identical executables is awful engineering, but a few bytes different (out of 90MB) is great engineering?


Not sure why you think clang having multiple not-identical executables is relevant to Backblaze having multiple identical executables.


> Not sure why you think clang having multiple not-identical executables is relevant to Backblaze having multiple identical executables.

Really? You genuinely don't see why I would think a case with >99.99% identical executables might be relevant to a case with 100% identical executables?


I genuinely don’t see why either.

With 100% identical executables, there does not appear to be a good reason to have multiple copies under any circumstance.

With anything < 100% identical, well, maybe there’s a good reason to have multiples. Who knows? Id probably give someone the benefit of the doubt and figure there was some engineering challenge that made it faster/easier to do it that way.

So yes, 100% identical is completely different than almost 100% identical.


There is a reason: the majority of the file size comes from statically linked LLVM libraries.

You can instead configure LLVM’s build system to build a single dynamic library and have the tools link to it, and this eliminates all of the duplication. However, it apparently comes with a “substantial performance penalty” [1] due to the nature of dynamic linking. (This actually surprises me, and I wonder whether it’s only referring to the inability to do LTO, or whether even LTO-less static linking is faster. Aside from the startup time issue.)
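
(For reference, that single-dynamic-library configuration is just a couple of CMake switches at configure time; these flag names are my understanding of the LLVM build options, so double-check them against [1]:)

    cmake -G Ninja ../llvm \
        -DLLVM_ENABLE_PROJECTS=clang \
        -DLLVM_BUILD_LLVM_DYLIB=ON \
        -DLLVM_LINK_LLVM_DYLIB=ON \
        -DCLANG_LINK_CLANG_DYLIB=ON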

A theoretical alternative would be to build all of the tools into a single executable, à la Busybox, where the combined executable would inspect argv[0] to figure out which tool’s code should be run. That way you could statically link the LLVM libraries without duplicating them in multiple executables. LLVM’s build system does not support this. I think it would be nice if it did, but it would be nontrivial to implement.

[1] https://llvm.org/docs/BuildingADistribution.html


> (This actually surprises me, and I wonder whether it’s only referring to the inability to do LTO, or whether even LTO-less static linking is faster.)

I think the latter is also the case (though to a much lesser extent) on x64. One of the unfortunate features of x64 is it lacks direct 64-bit jumps, so every jump to an external library ends up being an indirect call. (In fact, with a potential memory load on top of that.) This was kind of surprising for me when I learned it too; it doesn't apply to x86.


> So yes, 100% identical is completely different than almost 100% identical.

They're completely (!) different? And you're saying this despite the fact that the comment I replied to was discussing cases where one could "just execute one binary n times (with different argv[0] if they'd like)"... which is something you can do with different executables just as well as with identical ones? It's not just a little different but completely different? So different that not only you don't see any similarity, but you also cannot fathom why I might think there's some similarity?!


Would you please stop posting in the flamewar style to HN? It's not what this site is for, and it destroys what it is for. We've had to ask you about this more than once in the past already. If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.


Apologies, but I still don't know how to respond in a situation like this. Can I ask how you would respond to this case yourself? I hope you can at least see why I found it difficult to swallow when people shut down my comment telling me there's zero similarity between a 99.99% case and a 100% case. It seems needlessly hyperbolic to me at best, and quite obviously frustrating. Looking back, I almost feel like it baited me into this. How do you see this from your standpoint? Does it feel like a good-faith comment to you that they can't tell why I would see 99.99% and 100% as similar?


I hope I can answer that for you. The core issue is that you are taking the original point out of context and kind of robotically parsing snippets.

Context is important. For example, chimpanzee DNA is 99% identical to human DNA. Without context you could make the argument, “They are almost identical!” and be correct. But in the context of discussing the ability to fly to the moon and return safely, the two genomes are completely different.


I appreciate the reply. I don't understand what context you believe I was missing? The analogy is incredibly confusing to me too, because humans are more than their DNA: you obviously can't hardlink humans, nor could you replace them with stubs that call some "common" humans, it's not like the obstacle to doing these goes away even with 100% equal DNA vs. 99%.

But with computer code, you very much can (for example) easily replace all the executables with a combined one that gets hardlinked with different names, such that they behave differently depending on argv[0]. They don't even need to be 100% identical for us to be able to do this; it works just as well with 1% identical code—you just end up with bigger combined binaries. This is the Busybox approach (notice it combines executables that have hardly anything in common), and it's in fact exactly what Clang already sometimes does (like on MSYS2); one would think they could take the same approach here. This is also precisely what the comment I replied to was saying, right? This is the context I was replying to. It's so confusing to me that you claim I'm missing context when I was in fact addressing the context I saw directly—and it was the other comments that were not.


No, actually. And your incredulous doubling-down is, well, making it more obvious that you seem to be missing the point.

If they're 99% the same, it's generously easy to assume that there's a material difference. That assumption is completely nonsensical if they're 100% identical.

So no, for the sake of every bit of context in this conversation, it does not make sense that you'd bring up an unrelated scenario of "similar" binaries.


The 'these' being referenced here are the Backblaze executables, not your clang executables.


They're not saying the clang binaries are bit-for-bit identical, but that the Backblaze ones are.


This only happens on Windows; on *nix systems these all point to `clang-<version>`.


This makes slightly more sense if it's the case that they are statically linking the Clang and LLVM libraries into each tool. Each one has a slightly different main() (e.g., one parses gcc-style args) but ends up using the same library functionality.


They are the same binary, and on Linux they are just soft links.


Interestingly they're also the same binary on Windows in MSYS2. But not if you install through whatever installer Visual Studio uses (probably the native one).


This is hilarious. Their marketing drivels on about how they employ a bunch of Apple engineers and the MacOS client is top-notch.


Even if they're not very smart couldn't they hard link these?


I had previously symlinked them all to one copy and there was no apparent consequence for my backups.


What about your restores ;)


Per their support, there are known issues which can cause the client and their servers to become "out of sync". If this happens, your backups will stop occurring (even though the app will indicate that all files are backed up successfully through the current date). The only notice you will receive that this has occurred is after 14 days, when they send you an email to warn you that they'll delete your backups if you don't fix it.

They don't seem to recognize why backups silently failing is a very bad thing.

I stopped using Backblaze after that revelation.


I think this is exactly what happened here. Or at least extremely similar.


You know what, on Friday I had Backblaze tell me that I had a drive missing that had been disconnected for 14 days. My external USB drive failed to start up correctly after a power outage. However, I had 3 or 4 notices from Backblaze in that time of successfully completed backups. I think the default is to warn after not successfully backing up for 5 days. I set that to 2 instead. Some of the checks seem disconnected from each other.

I’ve also had issues with downloading my 2TB of data that Backblaze has. It takes about 3 or 4 days, if you’re successful. With retries and failures, wall time was about a week. And I have a 1G/1G internet connection. And doing the work to break it into 500GB chunks sucks.

I’d like to back up more of my files but I worry I would not be able to get it back.


Do you have a local friend who would like to swap backup services with you? Each of you buys and configures a Raspberry Pi with a 2TB disk. Encrypt a storage filesystem. Load up your first backup, then swap boxes. Now you each have remote rsync/rsnapshot/rclone/ZFS/whatever targets, and if you run into big trouble, you can go get the box back for a really fast restore.
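
A minimal sketch of the nightly push, assuming the friend's box is reachable over SSH as "friendpi" (names and paths are placeholders):

    # push a copy of my stuff onto the friend's encrypted disk every night
    rsync -az --delete ~/Documents/ friendpi:/backups/$(hostname)/documents/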


Disclaimer: I work at Backblaze so I'm biased and you should keep me honest.

> if you run into big trouble, you can go get the box back for a really fast restore

Backblaze provides this service for our customers! Customers can ask for an 8 TByte encrypted USB attached external hard drive to be prepared in the Backblaze datacenter with all their data beautifully restored on it, then we FedEx this anywhere in the world to them. This is a free service as long as the customer returns the USB drive to Backblaze in a reasonable amount of time (a couple months, and we can work with customers if they need longer). Or customers can pay $189 (which includes the drive, the data, and world wide shipping) and keep the 8 TByte USB drive.

You can read more about the USB hard drive restores here: https://www.backblaze.com/blog/usb-hard-drive-restore/ and why it is a free service here: https://help.backblaze.com/hc/en-us/articles/217665948-Resto...

What we often see is that if a customer's laptop is stolen or crashes, they sign into the Backblaze website to download the 3 or 4 individual files they were working on when the laptop was stolen. Let's say that is a term paper due the next day. That way they are back up and running IN SECONDS. Then the customer orders a free USB drive with 8 TBytes of their data which will show up in a few days. They can live without their wedding photos and their music for 3 days, but that term paper has to be handed in.


incredible business answer..


A lot of people sing BB's praises but I never had a good experience with them. The client was always slow, buggy, and resource hungry, and its UI is terrible. They got shirty with me for reporting bugs when I was using a macOS beta. And finally, at some point even though nothing about my computer changed (it was a Mac Mini, what was going to change), I got a message saying some security/copy protection system had detected that my computer was "different", and I had to un-install and re-install the entire app to fix it (there apparently being no easier way to unset a flag). I uninstalled and skipped the second part.

Instead of using BB, get a Synology/Qnap/FreeNAS box to backup all your stuff locally, and back that up to another service (e.g. Glacier or Synology's own C2).


I caution the casual reader against Glacier. It's not what it appears at a glance. Your files should be put into a single archive before upload, otherwise you'll spend weeks waiting for AWS scripts to manage old files.

B2/S3 is what most people want.


We have 23TB of images stored in S3 and I was recently looking at moving them to Backblaze to save hundreds of dollars per month. These are all individual image files, because reasons.

Then I realized that S3 Glacier and Deep Archive were even less expensive than B2. I looked a bit further and found that Glacier/DA files have some fairly chonky metadata that must be stored in normal S3, and for a lot of our images the metadata was larger than the image in question. So Glacier/DA would increase our storage costs. Overall it probably wasn't a money-saving situation.

The ideal use case is to bundle those up into a tar file or something and store those large files, and manage the metadata and indexing/access ourselves.

So, using rclone to copy 11TB of data to B2.


Wasabi is also a little cheaper than AWS S3 afaik


Keep in mind when you create something in wasabi you pay for it for 3 months, even if you delete it minutes later.


> otherwise you'll spend weeks waiting for AWS scripts to manage old files.

Can you elaborate on this part?


Every file uploaded is considered an immutable archive. It does not have a version history. So let's say you have 100,000 files you backed up and want to update them, and you don't want to pay for the storage of the old files. You need to request a manifest of hashes for all files. This will take a few days to generate, then you will be given a JSON file that is over a gigabyte. Next, you will write a script to delete each file one at a time, rate limited to one request per second. Have fun.
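
Roughly this, if you do it with the AWS CLI (a sketch; the vault name and the one-second sleep are illustrative):

    # archive IDs come out of the inventory-retrieval job's JSON
    jq -r '.ArchiveList[].ArchiveId' inventory.json | while read -r id; do
        aws glacier delete-archive --account-id - --vault-name myvault --archive-id "$id"
        sleep 1    # stay under the request rate limit
    done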


Are you maybe referring to Glacier "vaults" (the original Glacier API)? With the old Glacier Vault API you had to initiate an "inventory-retrieval" job with an SNS topic etc. It took days. Painful.

But these days you can store objects in an S3 bucket and specify the storage class as "GLACIER" for "S3 Glacier Flexible Retrieval" (or "GLACIER_IR" for S3 Glacier Immediate Retrieval or "DEEP_ARCHIVE" for S3 Glacier Deep Archive). You can use the regular S3 APIs. We haven't seen any rate limiting on this approach.

The only difference from the "online" storage classes like STANDARD, STANDARD_IA, etc is that downloading an object with GLACIER/GLACIER_IR/DEEP_ARCHIVE storage class requires first making it downloadable by calling the S3 "restore" API on it, and then waiting until it's downloadable (1-5 minutes for GLACIER_IR, 3-5 hours for GLACIER, and 5-12 hours for DEEP_ARCHIVE).
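
A sketch of that flow with the AWS CLI (bucket and key names are placeholders):

    # upload straight into a cold storage class
    aws s3 cp backup.tar s3://my-bucket/backup.tar --storage-class DEEP_ARCHIVE
    # later: request the restore, then poll the Restore: header until it's ready
    aws s3api restore-object --bucket my-bucket --key backup.tar \
        --restore-request '{"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}}'
    aws s3api head-object --bucket my-bucket --key backup.tar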


I just bought a Quantum Superloader3 last Xmas. Each LTO-8 tape (it can take 16 in 2 magazines, I use 15 + 1 cleaning tape) will hold 12 TB without compression, 30TB with, and 7 of them can back up the 100TB used on the 128TB of disk that is the house RAID array.

It takes about 2 days to make a full backup, and I can fit incrementals for the next 5 days on the batch-of-7. Then I switch to the second magazine, and do the same thing. I actually have 3 magazines, one of which I swap in and out every week, and during the before-times, I'd take that off-site to work.

I have ~30 years of data, from back when I was in college and writing CD-ROMs for backup, all on the one system. Admittedly, the major space-taking thing is the Plex library, but I wouldn't want to lose that either. It takes about 5 minutes to walk into the garage (where the server-rack is), swap magazines and I'm done - the rest is automatic.

I have vague ideas for writing a more-efficient tar designed for this specific type of setup (big disk with attached tape). The best way to do it I think is to have multiple threads reading and bzip2-compressing data, piping blobs through to a singleton tape-writer thread. Every now and then (50GB, 500GB, 1TB ?) close the device and reopen the non-rewindable device to get a record-marker on the tape, and then store the tape/record-marker/byte-offset etc. into a SQLite database on the disk. That way I'd get:

- High levels of compression without making the tape head wait for the data, which ruins the tape head. Multiple threads pooling highly-compressed data into a "record"

- fast lookup of what is where, I'm thinking a SQL LIKE query syntax for search, against the disk-based DB. No more waiting for the record to page in from the end of the tape.

- fast find on-tape, since you'd know to just do the equivalent of 'mt fsf N' before you actually have to start reading data

Right now, tar is good enough. One of these days when I get time, I'll write 'bar' (Backup And Restore :)
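
In the meantime, a crude approximation of that pipeline with off-the-shelf tools (pbzip2 and mbuffer standing in for the compressor pool and the tape-writer thread; device names and sizes are only examples):

    tar cf - /raid/photos | pbzip2 -p8 | mbuffer -m 4G -P 90 -s 256k -o /dev/nst0
    mt -f /dev/nst0 tell    # note the block position for the catalog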


Interesting. Would you mind if I asked you a few questions off-thread about the QS3? I'm working on adding a local tape option to my homelab (https://www.reddit.com/r/HomeDataCenter/comments/pira9v/powe...)

I need a solution for backing up the stored states for my 100 trillion digit PI calculation efforts.


> I need a solution for backing up the stored states for my 100 trillion digit PI calculation efforts.

O_O


Sure, fire away. I didn't know HN even had a PM feature :)


Small question about the SuperLoader3: how noisy is it?


Too loud for an office. Which is a shame because in these pandemic times, I repurposed the shed at the bottom of the garden as an office (insulated, add power and a/c) and I wanted it there.

It's not noisy when writing to tape, but the mechanism is noisy when a tape is being loaded, the magazine is being shuffled to get the right tape, etc. It's the mechanical parts rather than the tape drive itself that's too loud, especially with Webex conferencing being a part of the day now.

So I have it set up in the garage, in the server-rack. I was worried about temperatures in the Summer, so I bought a 100W solar panel, an attic fan, and linked them up, positioning the fan above the rack. That fan shifts so much air that the in-rack fans (with temperature monitoring) didn't get above 85 all summer, which is pretty amazing for the Bay Area. The tape deck seems to be fine in that sort of temperature, and yes I do do the occasional 'tar tvf' to check the data is readable :)


Thanks. Are there any continuously running fans when it's not actively working ? Or something else that makes a noise ?

I got an HP IP KVM switch a while ago. When it's on, it makes as much noise as a bunch of servers or a blade center. Got some fans for retrofit...


Yes. I'd forgotten about that. There is a reasonably-above-ambient-noise fan that is running constantly. Another reason it was banished to the server-rack in the garage.


One more: did you get it at full price, or did you find it somewhere at a discount/used? I've been trying for a few years, on and off, to find something.


I got mine at Backupworks.com[1] - they seem to have a perpetual sale on for pricier items like this. They gave me a discount on a batch of tapes bought at the same time and threw in some barcodes and a cleaner tape as well.

[1] https://www.backupworks.com/quantum-superloader-3-LTO.aspx


You need to differentiate between BB Personal Backup and the BB B2 service, which is more like what you suggested. But these days I just use rsync.net + Wasabi + Kopia + rclone.


I really want to use rsync.net, but the price per GB scares me off.


I have one of their borg accounts, which is a fraction of the normal price. https://www.rsync.net/products/borg.html


Please email us about the "HN Readers Discount" which we have offered for 12-ish years ...


Do you understand that popping up like Beetlejuice every time someone mentions rsync pricing and being snarky about an unpublished discount is not a good look?

Particularly given you never address the commenter's point, which is that your pricing is pretty expensive?

"Price high, low volume" is certainly a valid pricing strategy, but you have no right to be snarky when people say you're priced high.

The fact that people are consistently describing you as overpriced means your marketing really isn't showing the corresponding value to them.


Not sure if they're overpriced when they also have the borg discount, and rsync.net is one of the only services I can borg or "zfs send" to directly. Add the fact that they've been running for nearly 20 years, so I know they won't disappear next year, and it feels not overpriced.


I have 3.5TB backed up to Backblaze for $5 a month. I haven't found any other online backup option that provides anywhere near that cost.

I use a RAID 1 to handle drive failure and also keep local backups on a NAS. BB is my third layer of backup. I've never run into issues with BB backups so I'm happy for what I get for the price.


> I have 3.5TB backed up to Backblaze

The question this thread seems to be raising is... do you? Do you really? Are you sure?


S3 Glacier Deep Archive, $0.00099 per GB per month.

I have a ZFS-based NAS, and periodically do an incremental backup (zfs send) of the entire dataset, encrypt it with gpg, and pipe it straight up to S3 Deep Archive. Works like a charm.

The catch with S3 deep archive is if you want to get the data back... It's reliable, but you will pay quite a bit more. So as a last resort backup, it's perfect.


Can you tell us how it's done specifically? Are you zfs send'ing to another directory and encrypting it entirely on the fly while transferring to Glacier?

Does it do incremental backup transfer to Glacier or does it have to transfer the entire encrypted blob every time?


No intermediate directory/file at all, all done on the fly.

    sudo zfs send -i <LAST_BACKUPED_SNAPNAME> <CUR_SNAPNAME> | gpg -e -r <YOURKEY> --compress-algo none -v | pv | aws s3 cp --storage-class DEEP_ARCHIVE - s3://<PATH_TO_DEST>.zfs.gpg
The very first time you do it, you will need to do a full backup (ie. without the `-i <...>` option). Afterwards, subsequent backups can be done with the -i, so only the incremental difference will be backed up.

I have a path/naming scheme for the .zfs.gpg files on S3 which includes the snapshot from/to names. This allows me to determine what the latest backed-up snapshot name is (so the next one can be incremental against that), and it's also used when restoring, since the order of restores matters.
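
Something along these lines (illustrative only, not my exact key layout):

    s3://my-backups/tank/data/tank@2021-11-01__tank@2021-12-01.zfs.gpg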


But how do you verify or test your backups in this scenario?


Pretty much the exact reverse of backing it up https://news.ycombinator.com/item?id=29541729

    aws s3 cp s3://... - | gpg -d ... | zfs recv ...
When restoring, the order of restores matters: you first need to restore the full snapshot, and then the subsequent incremental ones in order.


I meant, how to test them without incurring a large cost.


Ah gotcha, I haven't done full restore of my main dataset.

I've only verified with a smaller test dataset to validate the workflow on s3 deep archive (retrieval is $0.02/GB). I've done full backup/restore with the zfs send/gpg/recv workflow successfully (to a non aws s3 destination), and used s3 for quite a long time for work and personal without issue, so personally I have high confidence in the entire workflow.


If you want to try another sketchy service, I use iDrive which was $9.95 for 10TB for the first year.


> get a Synology/Qnap/FreeNAS box to backup all your stuff locally, and back that up to another service (e.g. Glacier or Synology's own C2).

I'm not a big fan of backing-up a back-up and opted for a Time Machine backup to a local nas and in parallel, an off-site backup to B2 with Arq on my macs.


I have a Synology NAS currently and use their HyperBackup hosted tier. Maxes out at 1TB sadly, not even enough for my laptop backup. C2 on the other hand is well priced for bulk storage but does not have the “free” deduplication.


I use HyperBackup to backup multiple TB to Backblaze B2. Works great, FWIW.


The hosted version I'm thinking of is their Synology C2 personal it seems.

Currently I use it to back up my lossless music collection and nothing else.

Backblaze B2 was on the table too, however I think my asymetric internet connection is my biggest issue right now. Only 40-50Mbps upload won't do much for backing up multiple TBs of data. May need to consider pre-seeding drive option if I can justify the cost.

Other solution was a separate, lower spec Synology I can pre-seed and send to a friend's house who has a homelab.


I love Backblaze, I really do, but holy hell is the user experience bad if you ever need to actually interact with any part of their software -- client or website. And it's been that way forever, and it's super frustrating that it seems like they've put zero effort into improving any of it.

Though reading some of the "backup got frozen, compare everything by hand" stories from here, I'm starting to wonder if maybe I shouldn't put my data somewhere else. Problem is, I've yet to find another "fire and forget" type of solution for Windows that works anywhere near as well.


There doesn’t seem much reason to “love Backblaze” based on what you said. You might as well be transferring your files into a volcano and then have to trek into the volcano in the hopes you might get lucky and everything hasn’t burned up if you ever have to recover something.


Absolutely agree. I have had it save my butt a few times when I damaged an important file and was able to pull an older version, but every single time I have done this, it made me trust their tooling for whole system recovery less and less. I do wish there was a better option. Currently considering Arq for the next computer.


I use Arq on a Mac Mini to have last resort offsite copies of my Time Machine backups on AWS Glacier. I must be breaking some proper backup practices, but I know nothing about nothing. Being the guy who knows nothing, I can say Arq is great. It allows you to dissociate the backup software from the backup service, which means I get to have a piece of software I know runs well, and is financed by my $50 purchase price.


I was unable to recover backed up files on backblaze. It just doesn’t work. They don’t seem to do any periodic integrity checks on the cold data. I lost most of my files, they were able to recover maybe 20% of the data.

Use anything else, but not backblaze.


This reminds me of the xkcd comic "TornadoGuard"[0]. Whatever else the company is doing, if their core functionality doesn't fundamentally work when it needs to, then what is everyone paying for?

[0]https://m.xkcd.com/937/


I've never had a problem with Backblaze - wasn't it at some point one of the gold standards for backups?


Only by virtue of being the most recognised name in this space.


But they publish hard disk reports and their PR person talks to people on HN; sometimes founder as well. They must be the best.


More information about the "safety freeze" from a Backblaze engineer: https://old.reddit.com/r/backblaze/comments/hvcbpw/safety_fr...


"The log files that list what Backblaze has backed up are called "bz_done" files. They list 'what has been done'. Here is where they are located on your computer: ... WARNING: don't edit those files -> you are guaranteed to corrupt your backup. You'll lose everything."

Why does the integrity of the backup rely on files stored on the computer being backed up? This seems so stupid that I'm sure I'm missing a clue.


>Why does the integrity of the backup rely on files stored on the computer being backed up? This seems so stupid that I'm sure I'm missing a clue.

Reading the explanation in the reddit thread, that's not the impression I got at all.

1. If your computer exploded, your backup integrity would not be compromised

2. If gremlins in your computer did mess with the file, your backups could be compromised. That sounds bad, until you realize that gremlins in your computer could also compromise the executable to do other things that could compromise the backup, (eg. telling the server to delete existing backup data because the retention period has passed or whatever, or simply uploading bad data and waiting for the retention period of 30 days to pass). Moral of the story: if the computer doing the backup can't be trusted to operate correctly, all bets are off.


Because otherwise a trashed computer could back up corrupted data over the good backup.


It's not a backup then. It's a synced copy.


That's really what the personal 'backups' are, it's just a synced copy that will happily mirror corrupted data and delete files that were deleted (not sure if there's any delay). If you want real backups then use the B2 storage directly.


It's a backup with a retention policy. In backblaze's case, it's 30 days. I'm fairly certain that still counts as a backup. Do you think that only immutable, append-only storage counts as "backup"?


Yup. Sounds pretty terrifying.


That can't possibly be how it works. No way.


Well, their suggested fix basically says that I should "unlink" my computer from the online backup, and relink it (by re-installing the Backblaze app). But before I do that, I should download any missing files, since they will be deleted from the backup upon re-link (if they can't be found on my computer).

But I can't do that without knowing which files are missing.


Wow, that engineer just... "doesn't get it".

He doesn't understand the fundamentals of the problem he's solving, and is actively writing code that basically puts tools down and starts shouting "EVERYONE STOP!" in response to expected scenarios.

Most filesystems provide no guarantees at all by default on writes. NTFS journals metadata writes, but not data writes. Append-only files are absolutely expected to be truncated. The application must deal with this either by being insensitive to rollback, or by explicitly requesting a write cache flush! This is very well known to anyone that has ever written any kind of write-ahead-log, database engine, etc... There's a bazillion articles about how this is intended behaviour and no amount of screaming and shouting will make it go away. Learn the storage API guarantees, or STOP writing code that has critical dependencies on storage!

This quote especially seemed childish and immature to me:

> "And this one makes me actively angry, because both Microsoft and Apple will happily throw away portions of your files and not tell you about it"

No, this is NOT Microsoft's or Apple's fault. It is 100% HIS fault for not understanding what's going on. Even if a file flush is correctly requested, consumer HDD and SSD drives are well known to ignore this and keep things in volatile RAM cache to boost their IOPS numbers at the expense of durability. Only "enterprise" drives honour this API properly, and even then there are corner-cases around things like 512/4K sectors and torn writes. Similarly, consumer drives don't have power protection, so partial sector writes are entirely possible and should be expected if they lose power mid-write, or just crash at an inopportune time.

To be more constructive: The correct thing to do is that a log writer must always end log writes with a checksum of some sort. If the log is truncated (for any reason!), then it must recover starting from the last-known-good checksum. Throwing your hands up and saying "NO MORE BACKUPS FOR YOU! EVER!" is not the right response. Retrying backups from the last-known-good point automatically is the correct response.

PS: Some of his other rants are also a lack of understanding of thread safety and/or the lack of ECC RAM in consumer-grade PCs causing random misbehaviour. Heck, you'd also have to deal with bad sectors, corrupt filesystems, and a bunch of other corner cases that just makes this guy scream and shout instead of writing robust code...

PPS: I just realised that the reddit post is by the 'CTO and Founder of Backblaze'. Wow. Note to self, do not use Backblaze for anything, ever. If the CTO is this clueless, then their products can't be trusted.


I've never worked with a CTO that actually wrote code and I have worked with many that were not domain experts. The CTO is a manager, not an engineer.

I'd be more interested in the skill of the developers themselves than the CTO.


He said he worked on this code personally.


it's even worse: there's no guarantee that if you don't touch something on the filesystem it will not be corrupted. It used to be almost the case with HDDs, but now you're using SSD and NVMe devices. Basically, it can happen that when there's a sudden power off, they will do weird stuff like "set the 6th bit of every byte in an entire region". In essence, whatever your strategy for making/keeping a summary of the source side of your backup, you will always need to be able to figure out the state/diff from scratch. This needs to work. Everything else is merely there to speed things up.


Checksums are reasonably robust against this kind of thing, but there are storage systems out there (most SSDs!) that can even reorder blocks! So you can have valid checksums but still get corrupt data…


| So you can have valid checksums but still get corrupt data…

Yup, therefore:

| you will always need to be able to figure out the state/diff from scratch.

So if you evaluate a backup system, this needs to be the first thing to check. "what if I backup and accidentally lose my log/summary/sync-state ?"


Sounds like they should probably be using a SQLite database instead of trying to make a custom hand rolled solution using raw files.


SQLite can’t magically protect against consumer hard drives not respecting cache flush commands.

Even if you could guarantee some things, refusing to run backups or restores in the face of rollback is just Wrong with a capital W.


Disclaimer: I'm the engineer you are commenting on, I just want to straighten out one mis-understanding.

> This quote especially seemed childish and immature to me:
>
> > "And this one makes me actively angry, because both Microsoft and Apple will happily throw away portions of your files and not tell you about it"
>
> No, this is NOT Microsoft's or Apple's fault. It is 100% HIS fault for not understanding what's going on. Even if a file flush is correctly requested

Several times you seem to jump to the assumption that I don't understand fsync and disk flushing and that is the core issue. You aren't understanding what I'm criticizing. Here is an example of what bothers me:

You take a picture at your wedding, and you store it on your hard drive. You like the photo, it means a lot to you, and you use it as the background for your desktop FOR FIVE YEARS. You have rebooted hundreds of times, and it's always the background for your desktop. Then one day 5 years after your wedding, you reboot your laptop, and it seems to take a little longer to boot, and then after you sign into the laptop half the image you use for your desktop background is scrambled. The middle of it looks like dirty snow. You didn't get any reports of any issues from the OS manufacturer, but now one of your photos is corrupted.

This isn't because the software that wrote the photo 5 years earlier forgot to flush the picture to disk. It just isn't. Behind the scenes, as your laptop was booting from an aging drive, it probably encountered some issue, and it went about fixing the problem as best it could - which I have no problem with. My issue is the drive lost some data, and if the OS manufacturer would let you know this occurred you could take useful actions like order a new drive, prepare a restore from a few weeks ago before that issue occurred, etc.

> No, this is NOT Microsoft's or Apple's fault.

It isn't their fault that the drive is going bad, I agree. Drives go bad, that's why we have backups. My issue is that the OS manufacturers try to cover up too much, keep too much hidden from the user, and don't let the user know data loss has occurred (or might have occurred). And yes, I hold them accountable for not telling customers what is going on. It isn't anybody's "fault" that it occurred, but there is a responsibility to let customers know about it so the customer can take appropriate actions so they don't lose data (or more data).

I try to write incredibly paranoid software. Part of the reason is that is the "average" environment the Backblaze client runs in is more unstable than what most software developers are used to. The whole point of backups is to run when the computer is going sideways, it has bad RAM, it's losing disk sectors, or a customer's cat likes sleeping on the keyboard because it's warm, and the fans are clogged with cat fur. And because the family has teenage children that don't know about computer security problems, they download and install unstable junk from all over the internet because why not? That's the environment my software runs in, and I take my job of trying to protect my customer's data very seriously.


Thanks! I already found that, unfortunately, they also just say: Do a full restore via shipped hard drive (expensive for non-USA).


> What was the bug? It was a threading bug.

lol


I don't like Backblaze because they require you to hand over your encryption key to their website to restore which kills any hope of it being a 0-knowledge solution.

Right now I'm using Arq Backup + S3 and have been happy.


I was about to use Backblaze Personal and then noticed that. Was then about to use Arq + Backblaze B2 when I realized that (1) Arq supports OneDrive, (2) I have a TB of OneDrive as part of my Microsoft 365 subscription, and (3) I only use cloud storage for file transfer between mobile devices and desktop, so my OneDrive space was almost all unused, so went with Arq + OneDrive. 3 years of weekly backups later and I've still only used about half of my 1 TB.

The only thing I'm not happy about with Arq is that a "verify" downloads all the backup data to checksum it. That takes 3 or 4 days on my connection.

I thought I read that many cloud storage provider APIs provide a way to ask the server for a checksum of a stored blob. I'd have expected Arq to make use of that, but maybe it is not reliable (the server might just report what the checksum is supposed to be, not actually read the data and checksum it?).

Arq documents their storage format. I wonder if it would be possible to use a VM on Azure to access my OneDrive storage and do the checksumming on the VM?


Yes, online storage services do store a checksum of the files. They can use that internally to periodically scrub their stored data. For example, if S3 has stored 3 copies of your file, they can use the hash to distinguish good and bad copies, and replace bad copies. For erasure-coded services like Backblaze B2, they can reconstruct a file using any 17 of 20 erasure segments, so with a known checksum they can figure out which segment is bad, delete it, and use the other segments to generate a replacement. I was told by Backblaze that they do periodically scrub B2 files.

HashBackup (I'm the author) has a lot of ways to verify backups. For B2 uploads, it generates a SHA1 hash before sending the data, stores that in a local database, sends the data with the hash, B2 generates an independent hash of the data, and compares it to the one HB sent. If there is a mismatch, an upload error occurs and the upload is retried.

On download, B2 sends the data and SHA1 hash (I'm assuming it verifies the hash during erasure reconstruction also), HB generates a SHA1 hash for the received data, verifies it against the SHA1 B2 sent, and verifies it against the SHA1 stored when the file was uploaded.

After the upload, files can be verified by:

- downloading the complete file as above, then verifying every block in the file

- getting the file size and SHA1 hash from B2 and verifying those against the local database (no data download)

- downloading N random samples from each file and verifying the block-level SHA1 for each sample

- downloading a small percentage of the files every day to verify the backup over time (this can be done with whole files or random samples)

If a problem is detected and you have other copies of the backup on another storage service or locally, HashBackup's selftest will fetch all copies of a file to reconstruct the original backup data to correct any errors, then upload the corrected file to all destinations.
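
(The same end-to-end idea, reproduced by hand for a single file just to make the mechanism concrete - this is plain shell, not HashBackup itself:)

    shasum -a 1 archive.bin > archive.bin.sha1    # record the hash before upload
    # ...upload, then later download the provider's copy back as archive.bin...
    shasum -a 1 -c archive.bin.sha1               # any corruption fails the check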


Thanks a lot for your work on HashBackup. I can't recommend it enough. It has been great and working for me for years, no data loss and multiple restores. Top notch documentation too, just a bit sad that it isn't open source so that I could contribute. But the quality is there and it has been really solid for me.


Arq backup to OneDrive used to be very slow since IIRC they were using an old version of the OneDrive API with pretty ridiculous latencies. How is it these days?


Arq 7 uses the Microsoft Graph API, the one we're supposed to use now.


Thanks for the reply!


The "back up and validate" function only downloads all the data if you're using a storage location that doesn't provide checksums (SFTP or NAS/folder). So, it shouldn't be downloading very much if you're using it with OneDrive. Please email support@arqbackup.com so we can follow up on this. Thanks.


I don't have automatic updates enabled and it has been a while since I manually updated. I just updated and the "back up and validate" menu item is no longer there for my OneDrive backup. It is still there for my backup to a local removable drive.

I see from this Reddit thread [1] that it was removed for storage locations that don't need it a few months ago, saying

> Back Up and Validate is a waste of time and money for any storage location that provides checksums. That's why we removed it. It only causes pain and cost.

My last "back up and validate" to OneDrive was a couple months before you removed it, so it looks like I was just hitting the "waste of time and money" case that prompted its removal.

From the Reddit thread, it looks like the way to do a check that everything is OK on both ends when using a storage provider that provides checksums is to clear the cache and to modify the backup plan (just editing it and saving is sufficient), and then do a backup. The cleared cache makes it get the checksums for everything in the backup and the modified backup plan makes it rescan everything locally. Is that correct?

[1] https://www.reddit.com/r/Arqbackup/comments/n87vuw/missing_b...


There's no need to edit the backup plan. As I wrote in the reddit thread, clear the cache and do a backup. Sorry it's not more clear.


How do they think this could be acceptable?


I switched to Arq after a very bad customer service experience (one they easily could've solved but decided to be jerks about instead)


I do the same, but with B2 which is nice cause it can be very cheap.


Could you give a ballpark figure for how much that costs you, and how much data you're dealing with?


Like $1-2/month. For a very long time it was more like $1-2/year, for roughly 188 GB. I have hourly incremental backups of certain files, like all my source files (which includes things like node_modules, etc.).

I do a full, every-file backup to a NAS on occasion.


I use Arq with B2, which seems like a happy compromise.


I'm a fan of Arq and DIY storage, but how do you handle versioning?


Arq has baked in file versioning unless you mean something else?


Ah, of course. Sorry. :)


Has it ever been marketed as zero knowledge?


Same.

Arq + Wasabi


So frustrating.

Backup is like an insurance policy - your only interaction with the product is the "front-end" (e.g. "how easy is it to configure and buy what I need"), as opposed to what you're really paying for (the provider's ability to deliver in your time of need).

I worked at a backup company [1] (acquired by Quantum in 2014 [2]) where we built a peer-to-peer storage network. We ran "restore" drills early and often :)

[1] https://www.linkedin.com/company/symform/about/ [2] https://www.crunchbase.com/acquisition/quantum-corp-acquires...


Seeing as you're an expert, what's your current online backup solution?


Backblaze is an awful piece of software when you look at it from a "backup software" point of view. It's made pretty, simple, native (or is it Electron now?) - yes. But it stops there. On top of that, if you read the ifs, buts, and gotchas, you'd want to stay far away from them.

They have downright absurd data deletion/retention and versioning rules.

Besides I do not trust any service that promises to give anything “unlimited” for a fixed cost.

As I usually mention in comments on this topic - I'd strongly urge people to use and support backup tools like borgbackup.org (Vorta is an excellent GUI), restic.net (a GUI is glaringly missing), and kopia.io (up and coming; comes with a GUI); for smaller datasets there's the very good but more expensive Tarsnap (not FOSS).

And then there are others - https://github.com/restic/others#list-of-backup-software


> They’ve downright absurd data deletion/retention and versioning rules.

laughs in Code42

Code 42, makers of Crashplan, who shut down their consumer backup service and deleted the private keys needed to decrypt backups on your own drives...

...followed that up by recently announcing that they were changing file retention rules to purge all files older than 90 days. They lied and said it would be effective in a week or so, but it turns out that it was effective upon your archives being run through the periodic compact process after the announcement, and of course some people were at the head of that line.

Crashplan Pro Enterprise was a brilliant backup system and they drove it into the ground through staggering sales and support incompetence.

Now they sell some buzzword-box-ticking "stop your employees from COPYING THAT FLOPPY!" big-brother-esque profiling/monitoring software.

Thank you for mentioning those other backup programs, by the way. Hadn't heard of one or two.


IME it is vastly more robust to use your backup client of choice (e.g. Duplicati) wrapping Backblaze B2, their S3 competitor. As a bonus, it becomes much easier to migrate to or from them once your data is accessible via an open API, particularly when you have limited local upload bandwidth like a lot of US cable internet customers do.


I want to like Duplicati so much, but it breaks if you even think about looking at it. It's like the P&R Jail meme.

"Interrupt the initial backup? We corrupt the backups."

"Relocate the backup files and update the backup destination? We corrupt your backups."

"Update to a new version? Believe it or not, we corrupt your backups."

The eight hundred open GitHub issues are also pretty frightening for a now many-years-old piece of backup software. They really, really need to spend six months or more just nailing as many issues as possible, starting first and foremost with bugs relating to restores.


Take a look at Duplicacy as an alternative, then. Similar core idea (takes a folder, chunks it up, tracks revisions, etc.), but it is faster than Duplicati and restic. It has some other nice features like deduplication, multi-client support, etc.

Kopia is another up and coming option too.


I went and looked back at Duplicacy and tried to remember why I never went with it, and after 30 minutes of head scratching, realized there was a "buy" button and it all came back:

Only one computer for personal use, no GUI. Otherwise, annual licensing.

Fuck that.


This entitled attitude irks me to no end. I'm curious, how much of your work do you donate to others?

I would like to benefit from some of your free work, for personal use.


The personal backup plan is extremely cheap and almost unlimited.

Maybe they will rethink the product (and cost) and phase it out ("Migrate to B2, our professional, industry-leading offering, and save 10% off your first bill" [of $2865]).


If I did my calculation right, based on my usage, including miscellaneous egress and API fees, I could store about 1.25 TB on B2 for 7 bucks a month. I would bet their median personal backup customer is under that.
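(Rough math, assuming B2's 2021 storage price of about $0.005/GB-month: 1,250 GB x $0.005 ≈ $6.25/month for storage, with the rest of the $7 going to download and transaction fees.)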


Can highly recommend this method. And you can just use B2/S3 to store other stuff too.


That would be very frustrating. I've been using B2 to store my encrypted backups (via Duplicati in my case) and that has been solid (and cheap) in both backup and restores. For those leery of their personal backup solution, maybe go that route.


" ... wouldn't be a way to compare hashes the way you describe ..."

Hashing files shouldn't be hard ...

  ssh user@rsync.net sha256 some/dir/some/file

  ssh user@rsync.net sha512t256 some/dir/some/file  [1]
... or if you like it quick and dirty:

  ssh user@rsync.net cksum some/dir/some/file  [2]
[1] SHA-512t256 is a version of SHA-512 truncated to only 256 bits. On 64-bit hardware, this algorithm is approximately 50% faster than SHA-256 but with the same level of security.

[2] https://www.rsync.net/resources/howto/remote_commands.html


I don't use Backblaze, but I use rclone, which can be connected to a Backblaze backend. Rclone is open source and has a subcommand, check[0], that can compare files between remote/local or remote1/remote2. I suggest using it to find the missing files.

[0] https://rclone.org/commands/rclone_check/
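A minimal sketch of that check, assuming a B2 remote already configured in rclone under the name "b2" and a hypothetical bucket "my-backup" (rclone compares sizes and hashes where the backend supports them):

  rclone check /path/to/local b2:my-backup \
    --missing-on-dst missing-on-remote.txt \
    --differ differing-files.txt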


Pretty sure what you mean is Backblaze B2, but the author is talking about Backblaze Personal Backup.


Yes, sorry, I didn't know they were different. Well, at least I learnt something today.


To actually resolve this issue, here's what I think I'll do (alternative suggestions much appreciated!):

- Buy a new hard drive that's big enough to fit all my data

- Download all the data from Backblaze to the new drive (will probably take a few days/weeks)

- Write some tool that does the hashing/matching for me to see if anything's missing/corrupted (does something like this exist already? a rough sketch is below this list)

- Switch to a better service or just local backups
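For the hashing/matching step, a rough sketch with standard tools (assumes GNU coreutils; on macOS substitute "shasum -a 256" for "sha256sum"; the two directory paths are placeholders):

  (cd /data/original && find . -type f -print0 | xargs -0 sha256sum | sort -k2) > local.sha256
  (cd /mnt/restore   && find . -type f -print0 | xargs -0 sha256sum | sort -k2) > restore.sha256

  # any output here is a file that is missing, extra, or has different contents
  diff local.sha256 restore.sha256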


I echo what some others said: rather than buy an HDD just to verify, why not run an EC2 VM or similar, then just nuke it when you're done? Remember, ingress bandwidth to AWS is free.


Hmm, thanks for the suggestion, but I have a lot of data (> 5TB), I think this would get expensive very quickly.


If you end up doing that, you may find:

  diff -rq
useful. It lets you do a recursive "diff" comparing two folders; it also reports if a file is present in both folders but with different contents. I always use that command when moving big folders around.


I've used Beyond Compare, worked really well.


Could Beyond Compare be used as a backup solution to transfer files from a file server to a backup drive (like a Synology)?


Another vote here for Beyond Compare. Awesome software on both PC and Mac.


Thanks for the suggestion!


At the very least, you don't need to do the first two steps. Backblaze can ship you your backup on a hard drive, which you can then return for a full refund.

https://www.backblaze.com/restore.html


Thank you for the suggestion!

The problem is that this is very expensive if you're not in the US. I'd have to pay for customs + return shipping, which would add up to much more than the cost of a new drive.


Temporary import of goods is a thing in many jurisdictions. Have the sender fill out the paperwork correctly, and return the drive when you're done. Then you might be exempt from VAT, customs duty, and those infuriating processing fees the couriers charge you.

It's not /quite/ the same, but thankfully even after Brexit I can export broken electronics from the UK to the EU for warranty repair—without being charged a fee when the repaired item is returned to me. (The manufacturer in the EU isn't charged a fee either, when they receive the broken item.)

As for returns shipping, maybe you can shop around, but yeah, it can be expensive.


Thanks, I'll look into that!


I'm looking for a new cloud backup service, one that works on Linux/macOS and with multiple machines. I've heard many mixed things about Backblaze -- ranging from "It's amazing!" to "Use their [pay-per-GB] buckets unless you are using a single Windows or macOS computer" to "Other providers are cheaper". Certainly the best in terms of $/GB seems to be OpenDrive's "unlimited" option, which only includes "no NASes" as a rather nebulous condition.

I'd love to know -- what do other HN users use for this? The story author's point about backups only being backups once you've used them to restore really rings true to me.


I use rsync.net and borgmatic[1], backing up about a terabyte. It's about the same price as S3 (with no egress charges; cheaper if you use just their Borg plan[2]), and you can back up in a multitude of ways, from rsync to ZFS snapshots.

It's not as user-friendly as something with a GUI, but IMO anyone on HN should be able to get it working in about 30 minutes.

1. https://torsion.org/borgmatic/

2. https://rsync.net/products/attic.html
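If it helps anyone evaluating this route, the underlying Borg commands that borgmatic drives look roughly like this (a sketch only; the rsync.net hostname and repo path here are hypothetical placeholders):

  # one-time: create an encrypted repository on the remote
  borg init --encryption=repokey-blake2 user@your-host.rsync.net:backups/main

  # each run: create a dated archive, then prune old ones
  borg create --stats user@your-host.rsync.net:backups/main::'{hostname}-{now}' /home /etc
  borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 user@your-host.rsync.net:backups/main

  # periodically: verify repository consistency, re-reading the stored data
  borg check --verify-data user@your-host.rsync.net:backups/main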


Borgmatic (or plain Borg) to rsync.net is appealing.

One thing to notice about the super-affordable Borg plan is that it doesn't include free ZFS snapshots. My understanding is that you can have the SSH key used by the host to push its backup restricted to only Borg, and only append-only, within its repo... but if there's another way to access the ZFS (e.g., with an unrestricted SSH key), the Borg repo could be deleted. And then you might really want automatic ZFS snapshots as an additional layer of protection.
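For reference, that append-only restriction is usually done with a forced command on the key, something like the sketch below (whether the provider supports every borg serve option is worth confirming with them; the key itself is truncated and the repo path is made up):

  # in ~/.ssh/authorized_keys on the storage side (all on one line)
  command="borg serve --append-only --restrict-to-repository backups/main",restrict ssh-ed25519 AAAA... backup-key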


"One thing to notice about the super-affordable Borg plan is that it doesn't include free ZFS snapshots."

To clarify - it doesn't include free ZFS snapshots. You can configure and use them but we count them towards your quota.

If your data doesn't change much, they won't use up much space ...


Thanks, that rsync.net Borg plan looks like a great deal.


This is true. However, rsync.net/borgmatic is my third-tier backup, and Borg does the versioning/retention rather than ZFS.

1. I have local zfs snapshots on my machine(s)

2. These get sent to my home NAS

3. They are then sent to a NAS I have at a friend's house

4. Then Borgmatic does its thing to rsync.net

This covers my needs, I could upgrade to the full plan but if I need to be going to rsync.net shit has already hit the fan.


B2 is, at smaller capacities, cheaper than their Backup offering. You have to implement your own uploader, though. (We pay something like $8.50/month for ~2TB of storage in B2 and the necessary transactions to keep the backup up-to-date with rclone)
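For anyone curious what "your own uploader" amounts to here, a minimal rclone sketch (the remote and bucket names are made up):

  # mirror /data into the bucket; --fast-list reduces the number of B2 API calls
  rclone sync /data b2:my-backup-bucket --transfers 8 --fast-list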

Their Backup client is so buggy and unreliable that you're honestly better off just backing up locally; there's less chance of your backups ending up worthless that way.


rsync.net has a special pricing tier for Borg users that might be worth looking into.

https://github.com/borgbackup/borg

https://rsync.net/products/attic.html


1. Sign up for Dropbox

2. Sign up for the optional 'PackRat' service, or whatever they call it nowadays

3. xcopy c:\*.* d:\dropbox\backup /s /e /d

Somewhat tongue-in-cheek, but that's basically what I do, using the Windows equivalent of a cron job to refresh the backup once a day. Not on the whole c: drive, just on my development directory tree.
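(For the curious, the Windows cron equivalent here is Task Scheduler; a sketch, with the task name and paths made up:)

  schtasks /Create /TN "NightlyDevBackup" /SC DAILY /ST 02:00 ^
    /TR "xcopy c:\dev d:\dropbox\backup\dev /s /e /d /y"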

The ability to dig through older versions of all files has been incredibly valuable, given the large number of build targets and other assorted binaries that aren't under source control.


I use the Backblaze client on laptops and desktops, and B2 for NAS backups. No complaints.


Backblaze is handy for being able to get at your files from any device or location regardless of whether your computer is online. I wouldn't rely on it for anything else. A backup service that can't do differential comparisons on your data is not a backup service, it's an upload service.


Personally I bought a used tape machine and some $20 2.5TB tapes.

It supports hardware encryption and the backup process is pretty much "tar cvf /dev/st0 /some/files". Downside is you don't actually save money doing this, because the drive costs ~$800. Upside is you're not relying on anybody else for worst-case scenario recovery of your important data.

The actual read/write speeds are about as fast as an HDD, but if you're compressing, it's easy to get bottlenecked on the CPU and get a fraction of that.

Mine supports hardware encryption with AES; newer ones have hardware compression and better read/write speeds and storage per tape.

Actually, Backblaze already supports shipping an HDD, so if they supported shipping cheap tapes they might be usable for archival backups too.
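For anyone tempted by this route, the write-then-verify loop is roughly the following (a sketch, assuming a SCSI tape drive at /dev/st0 and GNU tar; tar's -d mode re-reads the tape and diffs it against the files on disk):

  # write the archive to tape
  tar -cvf /dev/st0 /some/files

  # rewind, then re-read the tape and compare against the filesystem
  mt -f /dev/st0 rewind
  cd / && tar -dvf /dev/st0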


Are you not concerned with having the backups in the same location as your source? Fire, flood, and theft are rare, but keeping everything in one place doesn't protect against them.


They're tapes with encrypted data, and they're $20 each. You can write many copies, check the hash, and distribute them in many places.

Put them in safety deposit boxes at the bank, give copies to a friend, bury them in your backyard, and - yes - store a set or two in the closet.

Because they're encrypted, there is the issue of the encryption key needing to be managed. But that's the only issue.



What do you do if the $800 drive dies? The ~ubiquity of Ethernet / fast-ish Wi-Fi is one of the best reasons to use online backups.

> Mine supports hardware encryption with AES

Can you get the bytes off the tapes without that particular drive (or that particular model of drives)? I prefer software RAID over hardware RAID because I don't want to have to keep a replacement card around.


What's the upside to using tape? You mentioned that you don't save money doing this, at least with your scale. Why not just back up to a few hard drives?


Hard drives go bad more frequently than tape, and don't actually save you any work if you're making multiple redundant encrypted copies.

I have 26 terabytes on my NAS, so with 2 copies I'd probably use $800 in HDDs? I could swap to a new 10TB HDD every few hours just like a tape, attach group labels just like a tape, it'd have similar speed, and I'd probably write new images to the same HDDs every so often.

The word "just" sounds like you think it's less work, but it sounds the same to me. FWIW, I did roll from HDD to HDD for years, so I'm familiar with the concept.


If I were in your situation with my personal data, I'd spend the money to get the restore hard drive shipped plus get hard drives/arrays for working space to do hash comparisons with the BB data. At this point, for some files, you may only have one good copy.


Thanks for the suggestion, I think this is close to what I'll do (see below). Really sucks though, this is exactly what BB was supposed to prevent in the first place, all this manual labor.


The cloud is a great availability tool for personal work.

It is a poor backup for personal work because the terms and conditions are for B2B.

The terms and conditions are suited for contractual obligations under due diligence. They allow a business to avoid negligence claims when something goes wrong in a third party contract.

The service is not designed around the sentimental value of baby's first steps. Stop paying for storage and it goes from viewable on everyone's iPad to gone.

Personal work should be backed up on physical media. Multiple copies in multiple locations. If there's a copy in the cloud, that's convenient. But it is not durable.

Good luck.


I don't know, I would say B2 (or S3, etc.) are perfectly suitable as a secondary backup location. They are as durable or more durable than a NAS that I have in a friend's basement or something.


They are as durable as automatic billing from a credit card.

Disks in your friend's basement are more durable when not attached to a network and sitting in a Pelican case instead.

One of the things a personal backup might want to survive is a long period of incapacitation and/or neglect or higher priorities. The cloud, because it takes a dependency on credit cards, doesn't do that. A spinning hard disk in a box tends to, and if it doesn't spin up, there is still the physical media from which the data can be recovered.

S3, etc. are suitable for business records. You can insure against their loss. If you don't have money to pay the bill you're probably not really going to need a backup much longer. etc.

To put it another way, there is a decades-long track record of success with physical media. There is about a decade and a half of total experience with the cloud.


The continuous payment is an important point.

But for the sample size of me, I feel like I am far more likely to keep paying for cheap cloud storage for 10 years than I am to adequately maintain and test an offsite backup myself.

And to be clear, this is secondary backup, on top of my primary, onsite physical backup.


I recommend sending encrypted ZFS snapshots to rsync.net's ZFS plan.
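A minimal sketch of that workflow, assuming an encrypted dataset named tank/data and an rsync.net account set up to accept zfs receive (the hostname and remote dataset names are hypothetical):

  # snapshot locally, then send the raw (still-encrypted) stream offsite
  zfs snapshot tank/data@2021-12-12
  zfs send -w tank/data@2021-12-12 | ssh user@zfs.rsync.net zfs receive -F data/backups/tank-data

  # later runs send only the incremental difference between snapshots
  zfs snapshot tank/data@2021-12-13
  zfs send -w -i tank/data@2021-12-12 tank/data@2021-12-13 | ssh user@zfs.rsync.net zfs receive data/backups/tank-data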


Contact Backblaze support and see what they suggest. They may be able to point you to log files that offer more information on the reason for the safety freeze.


I already did. They literally told me to manually cross-reference my hundreds of millions of files (edit: seems like it's "only" about half a million. I was mistaken).

I did follow up, but haven't gotten a response yet. Will update as I get it.

Edit: New update:

> I apologize, but this would in fact be the only sure fire way if you are concerned about any deleted files. There wouldn't be a way to compare hashes the way you describe in this case as that mechanism is simply not implemented into the Backblaze software. The Backblaze software is intended to prevent data loss. It would not be intended to be able to automatically cross reference local and server data against each other to display what is and is not backed up on our servers.


So, let me tell you about how I was getting safety frozen, and why.

I had a Mac that crashed and had a whole bunch of files go missing. Backblaze didn't want to continue backups with a huge subset of my disk's files gone when it hadn't gotten notifications from the file-change API saying that this stuff was deleted or should be missing. So they safety-froze my machine.

It's cool that Backblaze notices this. But yes, the question of what to do next isn't clear. Does one just start a new backup? Does one scour the backup looking for anything that may be missing on your machine, now? Etc.


Yes, that's exactly the issue - that I don't know if/which files are missing/corrupted.


I find it very hard to believe that you were told to do something that equates to eyeballing hundreds of millions of files.

Would you mind posting the exact information you gave them, and their response - redacting sensitive information of course.


It sounds ridiculous, doesn't it?

This was my message to them:

> Hi, I was safety frozen. How can I find out the cause? I haven't made any significant changes to my system lately. In the support document it says that I should check if any data is missing - I have a lot of data backed up in backblaze, how can I know for sure nothing is missing or corrupted? How should I proceed? I've checked with CrystalDiskInfo and it seems that all SMART data is fine. Computer behaves normally. I've attached bzlogs and bzreports folders, just in case they're important.

And this was their response:

> Unfortunately there is simply no way to confirm what caused a safety freeze from our end. We can only provide the most common causes in this case, however we wouldn't be able to pinpoint the exact cause. The only way to verify any missing/deleted data would be to access your View/Restore Files page and cross reference what is found on our servers and what is found locally on your system. There wouldn't be any direct or automatic method of checking what data is missing, if any.

Edit, New Update:

> I apologize, but this would in fact be the only sure fire way if you are concerned about any deleted files. There wouldn't be a way to compare hashes the way you describe in this case as that mechanism is simply not implemented into the Backblaze software. The Backblaze software is intended to prevent data loss. It would not be intended to be able to automatically cross reference local and server data against each other to display what is and is not backed up on our servers.


"Hundreds of millions of files" sounds more like a database to me...


I was mistaken actually. It's "only" about 600k files. Sorry, and thanks for mentioning it! I'll correct it above too if I can.


No worries. It just reminded me of a conversation I had with a colleague when I was a Solaris admin...

He asked me what would happen if he put millions of very small files in a directory. I explained about filesystems: that each file would take up a 4k sector plus space in the directory file, and that the behavior of the shell is undefined, per POSIX, once there are more than 10,000 files in a directory...

I'm pretty sure he tried it anyways :-)


Arq Backup and B2 is a much better and cheaper solution.


Is B2 the other offering from Backblaze? I would be interested in hearing why this plus Arq is better than the product the OP is having difficulty with.


Maybe you could restore the backblaze data to a cloud VM such as Azure VM or EC2 - the export should be faster to there than back to your home.


Could this be related to Backblaze powering down their service temporarily because of the Log4j 0-day exploit? (It should be back up now, though.)

https://www.backblaze.com/blog/system-maintenance-update-log...


I tried to use Backblaze to save personal files on a laptop I was going to lose access to; at first it said it was incorrectly permissioned. I changed a few settings and the message went away. My fun surprise when I went to restore from the backup: 90% of my files were missing, and my former computer was bricked.


I like my simple setup of Time Machine + TimeMachineEditor + a Synology on a protected plug. Battle-tested with multiple MacBook setups by restoring a backup. One thing I learned is that it doesn't work over a VPN connection from thousands of miles away; the network is too brittle.


Do you also have an off-site backup in case of fire, theft, tyrannosaurus, etc.?


No, the data I consider really vital is on some kind of cloud service usually: Github, 1password or mail servers. My DR strategy assumes those events are black swans that I don't need to try manage. It might be a case of considerable pain and some lost data, but I won't be totally zeroed.


This is really awful. I'm curious what solution people who have had to do restores recommend. I'm curious too if anyone has any thoughts on Tarsnap:

https://www.tarsnap.com/


I used to be on Crashplan, but moved to Backblaze when Crashplan went company-only.

I have to say, the Crashplan client is far superior to the Backblaze one, which frequently complains that I am out of disk space when I'm not, and just generally seems unreliable.


Maybe I’m missing something, but as far as I understand the simplest course of action is to reinstall Backblaze and inherit the existing backup. Or is there something that’s preventing you from doing that?


I am sure that BB engineers (or even higher ups) read this.

You seem to be a very customer-oriented and open company.

Let's take all that new stock money and public-market exposure and get things going. You've got the foundation built.


Backblaze's earnings report is tomorrow. Coincidence?


Backblaze doesn’t seem to be profitable.

How could they provide the resources required to improve their products?

Such issues will remain unless prices are increased.


I use Backblaze and am quite happy with the service. I wouldn't use it as my sole backup method however.


Try a full recovery and check it works. They lost most of my files.


I did several years back. It took a while but worked as expected.


Have you done a restore?


I have a couple of times and it works fine. I suppose I have yet to run into trouble with either of their offerings. Most restores have been individual files or large archives (in the 5TB range).


Hardware companies are seldom good at software. The ones that are good at it really stand out.


BackBlaze professes to be a software company that isn't good at hardware.


Can you acknowledge you've been hit by a rare bug? The service isn't awful.

My own experience with Backblaze personal backup was negative because I had so many directories and files that it took them longer than they anticipated to prepare a recovery drive, but I know that I don't have a typical use case and I recommend it without reservation to people who want that kind of whole computer backup.


> Can you acknowledge you've been hit by a rare bug? The service isn't awful.

It seems to me that service that can't deal with the bug I'm actually having is awful, whether that bug is rare or common. As an individual user, I care about my experience, not the statistical aggregate of user experiences. (Generalised 'I' here—this is not my bug.)


Well, sure, but in the context of a discussion forum it's less helpful, and usually the goal of writing it publicly is to deter potential happy users. We see this often in software support forums where someone has a rare bug and is offended that others won't stop their usage because of it.


This doesn't seem to be a rare bug, but intended behavior as hardware components fail: https://www.backblaze.com/safety_frozen.html

Also, the official solution (checking all backed up files manually) is genuinely awful, even if this were actually a rare bug.


I agree about that line in the documentation being too simple (though it shouldn't deter the technical users here.) They should advise to order a backup drive and compare all the files, or provide CLI instructions to check the safety log file against the one-liner. They seem to provide much better, candid support for hardware issues/safety freezes on reddit, based on what others have posted here, which is cool of them. Contacting support should yield similarly complete answers.


I already contacted support and this is the only thing I got so far. They literally told me to manually match all those files.

Ordering a drive from them is very expensive if you're not in the US like me.


Too bad. Yes, the location does make it rough.

There are some funny stories about Backblaze personally delivering an emergency drive to someone in a remote part of the world, but they don't help the median user.


Can you acknowledge that critical bugs aren't rare as the comments here should show?


After re-reviewing the thread, some of the reddit threads and another support forum, it doesn’t seem appropriate to do that. Interested in prevalence data to the contrary, of course.


Then you clearly weren't reading.



