
I understand his decision to skip the HW RAID controller (I like mdadm too). But a BBU is definitely critical to this op. NVMe has so many more queues that on a busy box it's going to hold way too much uncommitted data and metadata. Now the only other reasonable option is a UPS but it adds significant RUs/cost to the setup.


I am no expert, but are you sure that uncommitted data on the SSD is an issue?

The Intel DC P3600 has a built-in capacitor that should give it enough backup power to commit the data. The SanDisk Extreme Pro disks don't have a volatile cache, but instead use SLC NAND flash for caching, which will survive a power loss.

There is of course still the issue that data sent to the server and stored in main memory will be lost if you lose power, but a HW RAID controller with battery backup would not have prevented that (but a UPS of course might).


It would definitely prevent an uncommitted data problem if fsync or O_DIRECT were used (which should always be used for critical writes).
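For what it's worth, here is a minimal sketch of what "using fsync" looks like on the application side, in Python on a POSIX system. The `durable_write` helper and the file name are made up for illustration, and whether the data truly survives a power cut still depends on the drive and controller honoring the flush, which is the point being argued in this thread.

```python
import os
import tempfile


def durable_write(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync has been called on it.

    Hypothetical helper for illustration; assumes a POSIX filesystem.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data and metadata to the device.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new directory entry
    # itself is flushed, not just the file contents.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "critical.bin")
        durable_write(target, b"commit-record")
        with open(target, "rb") as f:
            print(f.read())
```

Anything written without the `os.fsync` calls can sit in the page cache in RAM, which neither a controller BBU nor a drive capacitor covers.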

UPSes defend against power outages, but they're only one rung on the data-integrity ladder. Controller BBUs, on the other hand, protect against both power outages AND kernel panics.


Indeed, we have fsync enabled on our database servers (PostgreSQL).


(And just wanted to point out that that is only one part of the equation)


I tend to agree with you here, runarb. I truly don't believe there is much that a hardware RAID controller and battery will protect you from with modern flash storage. At this point I think time is better spent on improving and tuning the database and filesystem for resiliency.


UPSs all the way. With high random writes you'll get a lot of buffering, so even if you have a BBU for the RAID, the data might still be in RAM.

The order of shutdown is important too. It should really be 1) consumers of storage (within the first minute or so), 2) storage (about 15 minutes after).

If your network wobbles or the storage turns off too fast, everything hangs, and you're in poo-stained-pants town.

Ideally your DNS/AD shouldn't ever willingly shut down, or if it does, it should be well after the last bit of your production system has turned off.
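A rough sketch of that ordering as a plan you could sort and hand to whatever drives the shutdown. The tier names and the `shutdown_order` helper are hypothetical; the 1 and 15 minute figures are the ones mentioned above, and the DNS/AD delay is just an assumed placeholder for "well after everything else".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    delay_minutes: int  # minutes after the outage is detected


# Assumed plan: only the 1 and 15 minute values come from the comment above;
# the 30 minute DNS/AD delay is a placeholder for "last, if at all".
SHUTDOWN_PLAN = [
    Tier("storage", 15),
    Tier("DNS/AD", 30),
    Tier("storage consumers", 1),
]


def shutdown_order(plan):
    """Return tier names sorted by when they should be powered down."""
    return [tier.name for tier in sorted(plan, key=lambda t: t.delay_minutes)]


if __name__ == "__main__":
    for name in shutdown_order(SHUTDOWN_PLAN):
        print(name)
```

The point is only that consumers go first and the things they depend on go last; the actual delays depend on how long your UPS can carry the load.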


I went down the same road - using mdadm and a UPS instead of a HW RAID controller with battery backup unit. LSI MegaRAID SAS 9261-8i SGL with a battery backup costs roughly 650 EUR; Eaton 9130i 1500VA Rack 2U UPS costs 850 EUR. So it's a bit more expensive, but you would be buying that UPS anyway.


I'd be willing to assume they have a UPS capable of handling the servers in each rack to allow for controlled shutdowns in the event of a power outage.

Hell, I have a UPS in place for my desktop, modem, and router at home.



