Yev from Backblaze here! We do it because it's interesting! Initially, since we're pretty transparent, we were hoping that others would join in the fun and share their stats so we'd know how our environment and hard drives stacked up against others - but no one at our scale has really done that yet. We do have a lot of off-the-record confirmation that our experience is somewhat similar to others, which is neat - but we were just trying to be transparent and share something interesting from our infrastructure. A lot of folks jumped on it and found it interesting so we keep it going!
Plus along the way some folks find out about us and sign up for the services we offer (B2 Cloud Storage and Computer Backup) and that's nice too! Plus we also like these conversations and at the end of the day, it's fun!
The consistent transparency has guided many of my acquisition decisions personally and professionally. It also drove me to seriously examine B2, which is used personally and professionally as well.
Would like to encourage your organization to keep publishing these works, and the works like your POD. It’s really spurred on a lot of innovation and sharing.
> why do they share this info? Is it to show they’re reliable or just for curiosity? Or some other reason?
It has many facets. It's attractive for us nerds. It also helps their providers to spot problematic models. They also show their technical prowess, and lastly I always take a look at the models that fail most and try to avoid them in the data center, if I can.
Is it? Have you had good experiences loading the backups? I was on the lookout recently for some offsite backup solution and came across this blog post that didn’t inspire confidence in Backblaze.
In addition to the other reasons mentioned, I guess it's valuable feedback for their providers, and if that feedback is acted upon, it's beneficial for them.
"Transparency breeds trust. We’re in the business of asking customers to trust us with their data. It seems reasonable to demonstrate why we’re worthy of your trust." - Backblaze
What would be interesting for SSDs is the percentage of advertised TBW when (or just before) the SSD failed, i.e. 100% if the SSD fails at exactly its advertised TBW, 50% if it fails at half the TBW, and 200% if it lasts two times the advertised TBW.
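A minimal sketch of that metric, assuming you know how many terabytes had been written when the drive died and the rated TBW from the datasheet. The function name and the 600 TBW rating are made up for illustration:

```python
def tbw_percent_at_failure(written_tb: float, rated_tbw_tb: float) -> float:
    """Terabytes written at failure, as a percentage of the advertised TBW."""
    if rated_tbw_tb <= 0:
        raise ValueError("rated TBW must be positive")
    return 100.0 * written_tb / rated_tbw_tb


# Hypothetical drive rated for 600 TBW.
print(tbw_percent_at_failure(600, 600))   # 100.0: failed exactly at its rating
print(tbw_percent_at_failure(300, 600))   # 50.0: failed at half the rating
print(tbw_percent_at_failure(1200, 600))  # 200.0: lasted twice the rating
```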
A well written SSD firmware simply slows down with age. It will never get to failure because the slowdown gets so extreme it becomes unusable. The drive also gets slightly smaller (by passing write failures to the OS to mark sectors bad).
That's because a "worn out" flash sector is never fully worn out - it can still store some data, just less than the error correction can correct. It is possible to combine two sectors to have extra error correction data to still recover a sector. Now you have less than half the performance.
Worn out flash also doesn't hold data long - perhaps only a few hours before too many bits have flipped and it is unreadable. To fix that, you need to rewrite the data, which slows everything down more.
And now that you have a bunch of unreliable sectors, you also need "super-sectors" which can do sector-based hierarchical erasure coding to recover data from sectors where even the methods above have caused data to be lost. This slows down writes even more.
In the worst case, reading a single sector requires reading every sector on the drive to reconstruct. Clearly that's going to be slow enough that the drive will have stopped being used long before that.
Sadly some drive firmware doesn't implement some or all of the above, so they appear to have "failed" and become unreadable, which IMO is inexcusable when it's very easy to design so that worn out drives become slow instead.
While I agree SSDs should either slow down or fail read-only, the rest seems like wishful thinking and/or extreme exaggeration.
Do you have any examples of, or references to, drives that implement your suggested algorithms?
I wouldn't expect drives to have sector-splicing and super-sectors (though multi-level cells regularly store fewer bits/cell), infinitely degrade their size via write-failures or rewrite data every hour. Especially frequent rewriting would self-destruct the drive if it weren't already preceded by catastrophic data loss.
Super interesting. Do you have any examples of drives that have the proper firmware in your experience? Sounds much better to own than something that suddenly fails.
Sadly not. Black-box testing an SSD for these kinds of features takes months and hundreds of drives, and manufacturers will never talk about the inner workings of their firmware. Many big SSD users develop their own SSD hardware and firmware partly for this reason.
SSD firmware is also a spectrum between "correct" and "performance", and I know of no SSDs that for example maintain all of the acknowledged data on a power failure. Sure, many SSDs may typically do that, but that isn't a guarantee when the power fails in worst-case conditions.
Old Apple SSDs for example have a special extra wire on the connector specifically for "impending power failure" to help them do that. PC SSDs don't even have a standard message to mean "power failure expected in 250 milliseconds".
One thing I've always wondered: are these drives what people would recommend that you stick in a desktop or NAS, or are these 'datacenter' drives that are overkill for consumer use?
I would definitely recommend skipping any current large WD drives for home use. < 6TB are SMR and problematic in a NAS, and 6TB+ have a very irritating noise that no-one seems to be able to diagnose (sounds like a scanner when idle). I bought some 8TB WD Golds, and I was expecting them to be louder, but they do something very weird when idling and it's an extremely penetrating sound. It seems to be present on most large models: https://community.wd.com/t/strange-noise-coming-from-10tb-dr...
Backblaze has said in the past that they don't buy enterprise drives because they didn't notice any difference from consumer ones. I don't know if that's still their policy.
The large Toshiba drives they're using are enterprise drives. A couple of weeks ago these drives were among the cheapest (€/TB) drives to get here in the Netherlands, so I assume Backblaze just bought them because of their price.
Disclaimer: This is my personal experience from being an HPC sysadmin and old school computer enthusiast.
If there's one trend I've seen from using generations of HDDs, excluding some problematic generations like the first SATA Seagate Barracudas and early WD Caviars which died for no reason at all, it's that newer generation HDDs are always more reliable than the previous generation, regardless of their class (datacenter / consumer).
For the last 10 years or so (starting with the introduction of first WD Green / Blue / Black series), the HDDs are exceptionally reliable unless you abuse them on purpose (like continuous random read/write benchmarking).
I've replaced two 11 year old WD Blacks w/o any problems this year to upgrade to two IronWolf Pro NAS drives, because I wanted something dense and PMR. At the office, I changed an old Seagate Constellation ES.2 (aka Barracuda enterprise) drive since it started to develop bad sectors (which I removed from an old disk storage unit anyway). IIRC, it was around 10 years old too, with a much heavier workload history.
Looks like the most differentiating factors between enterprise and consumer drives are the command sets they support and features they bundle. NAS and other enterprise drives have features to make them more reliable in harsher conditions (heat, vibration, operational knocks induced by hot swapping, etc.).
If you're getting an enterprise disk with a storage unit, you're probably also getting disks with special firmware developed for this brand anyway, so they're not off the shelf enterprise drives.
At the end of the day, for normal operating conditions, device class doesn't matter for the home user, but for density and speed, you might need to get an enterprise drive anyway.
I can only speak about the Exos drives, but the Ironwolf (Pro) drives are basically just the Exos drives relabeled for the consumer market. I'm actually about to buy 6 14TB Exos x16 drives to replace 7 6TB Exos 7E8 drives that are 5 years old. Frequently you can get the Exos drives cheaper than Ironwolf anyway.
> Frequently you can get the Exos drives cheaper than Ironwolf anyway.
You can. I’ve been getting the 16s and they are great but with caveats. The size makes volume creation and expansion insanely long (like a week per added drive). Not Seagate’s fault I know.
The noise. They are loud. They are either chirping away or grinding away.
I once had a 20-drive NAS with 1 TB Samsung Spinpoints F1s.
The NAS is replaced but I still have the drives just for labbing / testing purposes.
I never had a drive failure during the lifetime of the NAS. Probably because it was off most of the time and only powered-on with wakeonlan when needed.
So those drives don’t have many hours on them. But recently they started dying. I lost 3 of them this year during some tests.
Imagine that these drives are probably 10+ years old.
Age does seem to matter.
Obviously this is a small uncontrolled sample but it seems that you really should keep this in mind when you run a NAS at home. Keep an eye on the SMART parameters as suggested by Backblaze and really consider replacing drives at some point. I would be afraid that drives do start dying at the same time due to age.
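For keeping an eye on SMART at home, a rough sketch of the kind of check meant here. It assumes the five attributes Backblaze has written about watching (5, 187, 188, 197, 198) and a simple "raw value above zero" rule; both the list and the threshold are assumptions to verify against their posts, and `smart_warnings` is a made-up helper that expects you to have already read the raw values with whatever tool you use:

```python
# Attribute IDs assumed to be the ones worth watching; verify before relying on them.
WATCHED_ATTRIBUTES = {
    5: "Reallocated Sectors Count",
    187: "Reported Uncorrectable Errors",
    188: "Command Timeout",
    197: "Current Pending Sector Count",
    198: "Offline Uncorrectable",
}


def smart_warnings(raw_values):
    """raw_values maps SMART attribute ID -> raw value.

    Returns the names of watched attributes whose raw value is above zero.
    """
    return [
        name
        for attr_id, name in WATCHED_ATTRIBUTES.items()
        if raw_values.get(attr_id, 0) > 0
    ]


# Made-up readings for one old drive (attribute 9 is power-on hours, not watched).
print(smart_warnings({5: 0, 9: 41000, 187: 3, 197: 8}))
```

A non-empty result is a hint to plan a replacement, not proof the drive is about to die.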
Does Backblaze ever power cycle the drives (either on a schedule or due to planned/unforeseen circumstances)?
If so, it would be interesting to know how many drives failed on the day of a power cycle vs days with no power cycle.
I know other providers have found that "power cycle days" can be 100x more deadly for drives than "non-power-cycle days". It can have a massive impact when estimating data loss probabilities - since unforeseen power cycle days tend to impact more than one drive at a time...
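The comparison being asked for could be sketched like this, assuming a daily log that records whether a power cycle happened, how many drives were in service, and how many failed. The `DayRecord` layout and the numbers are invented for illustration, not Backblaze's data format:

```python
from dataclasses import dataclass


@dataclass
class DayRecord:
    power_cycled: bool  # did the system get power cycled that day?
    drive_count: int    # drives in service that day
    failures: int       # drives that failed that day


def failures_per_drive_day(records, power_cycled):
    """Failures per drive-day over the days matching the power_cycled flag."""
    matching = [r for r in records if r.power_cycled == power_cycled]
    drive_days = sum(r.drive_count for r in matching)
    failures = sum(r.failures for r in matching)
    return failures / drive_days if drive_days else float("nan")


def power_cycle_risk_ratio(records):
    """How many times deadlier a power-cycle day is than a normal day."""
    return failures_per_drive_day(records, True) / failures_per_drive_day(records, False)


# Made-up log: four normal days and one power-cycle day for 1024 drives.
log = [
    DayRecord(False, 1024, 0),
    DayRecord(False, 1024, 1),
    DayRecord(True, 1024, 2),
    DayRecord(False, 1024, 0),
    DayRecord(False, 1024, 0),
]
print(power_cycle_risk_ratio(log))
```

With only one or two power cycles a year, the power-cycle bucket holds very few drive-days, so the ratio will be noisy.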
Andy for Backblaze here: I looked at that 3-4 years ago. It looks like power cycling increased failure rates, but we don't power cycle our systems very often, maybe 1-2 times a year, so we're not the best use case. This is on my list for a relook one of these days; if we find anything interesting we'll let folks know.
Think of the classic "bathtub" curve (which says that young drives fail more frequently, old drives fail more frequently, and mid-age drives are most reliable).
That curve doesn't seem to match the data here. Or if it does, it says the "old" increase in failure rate happens at over 5 years.
I would guess Backblaze will replace these old drives because they are too small/too slow/use too much power before they replace them for being too unreliable.
Andy for Backblaze here: A while back we did an analysis of drive failure over time, i.e. the bathtub curve. It is probably a good idea to update that, as I believe we are seeing lower failure rates upfront these days.
I feel like SSD failure has little correlation with hours running, and more to do with TBW. Would be nice to see some read/write totals on these stats going forward.
Depends on the SSDs and the cause of failure. There have been high profile cases where a firmware bug meant an absolute cap on hours running.
I've been involved in server farms with thousands of (mostly Intel) SSDs and (mostly WD) spinning drives; the spinning drives tended to have pre-failure indicators, but we couldn't figure out any indicators before SSD failure and generally they would just completely disappear from the bus when they did fail. The failure rate was significantly less though. Our write rate wasn't very high and tended to be small writes; for busy disks, more than what we could do with a spinning disk, but usually not near the capability of the drives.
Just a homelabber, but in my 40+ drive homelab I've noticed the same. HDDs normally toss a couple SMART errors when dying. For SSDs I've gone through maybe 10 of them so far, and they just suddenly kick the bucket without warning.
My experience with WD SSDs is that they just slow down more and more as they get close to the TBW limits, without failing fully / dropping off the bus (as another user describes here: https://news.ycombinator.com/item?id=27040491).
Just echoing the sentiment of loving Backblaze. The HDD stats really align with the open source roots of the company. I love that they also open source the designs of their storage pods.
This "show your work" strategy helps me trust them at the end of the day. With this kind of storage being a commodity, a high level of openness could be a competitive advantage.
Someday I'm going to take those numbers and try to see how the AFR varies with the age of the drive. I expect 8 year old drives to have a higher AFR than 3yo, and 6 month olds to be somewhere in between, for example.
And then, we could compute what the life expectancy of a given model is given how long you've had it (just bought, few years old, etc)
It'd be fun to compare those across vendors and drive models. There's maybe even enough data at this point that some of the numbers might be meaningful! =)
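A rough sketch of the AFR-by-age idea, assuming the data has been reduced to one row per drive per day with the drive's age and whether it failed that day, and using the usual annualized definition (failures divided by drive-years, times 100). The row layout and `afr_by_age` are illustrative assumptions, not the published dataset's schema:

```python
from collections import defaultdict


def afr_by_age(rows):
    """rows: iterable of (age_days, failed) pairs, one per drive per day.

    Returns {age in whole years: annualized failure rate in percent}.
    """
    drive_days = defaultdict(int)
    failures = defaultdict(int)
    for age_days, failed in rows:
        bucket = age_days // 365  # 0 = first year of service, 1 = second, ...
        drive_days[bucket] += 1
        failures[bucket] += int(failed)
    return {
        bucket: 100.0 * failures[bucket] / (drive_days[bucket] / 365)
        for bucket in sorted(drive_days)
    }


# Made-up data: two drive-years in year 0 with one failure, one clean drive-year in year 3.
rows = [(d % 365, d == 0) for d in range(730)]
rows += [(3 * 365 + d, False) for d in range(365)]
print(afr_by_age(rows))  # {0: 50.0, 3: 0.0}
```

Grouping the same rows by (model, age bucket) instead would give the per-model comparison, though buckets with few drive-days will swing wildly.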
Andy at Backblaze here. We do look at drive model failure over time. We did a post on this topic several years ago, 2015? At the time, most drives followed the bathtub curve of failure, but I'm not sure that is still the case. I think it's time to update that report.
But I have a question - why do they share this info? Is it to show they’re reliable or just for curiosity? Or some other reason?