
It isn't that there's no rhyme or reason to it. The issue has to do with layering caches over each other.

If you layer caches over each other and they don't pass hit rates to each other, bad things happen, especially for designs like ARC. The ZFS caches explicitly take hit rate into account... but with a page cache sitting over them, they don't see the true hit rates. Heck, they're effectively a secondary cache ;). Then L2ARC has the same issue one layer down. And there's the issue.
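
To make the distortion concrete, here's a toy sketch (plain Python, nothing ZFS-specific; the cache sizes and workload are made up): stack two LRU caches and only let the front layer's misses fall through, the way the page cache sits above the ARC. The back cache's observed hit rate craters even though the workload itself is very cacheable:

    # Toy model: front = page cache stand-in, back = ARC stand-in.
    # The back cache only ever sees the front's misses, so its view
    # of the access pattern (and its hit rate) is badly distorted.
    from collections import OrderedDict
    import random

    class LRU:
        def __init__(self, size):
            self.size, self.d = size, OrderedDict()
            self.hits = self.misses = 0

        def access(self, key):
            if key in self.d:
                self.d.move_to_end(key)
                self.hits += 1
                return True
            self.misses += 1
            self.d[key] = None
            if len(self.d) > self.size:
                self.d.popitem(last=False)  # evict least recently used
            return False

    front, back = LRU(64), LRU(256)
    random.seed(0)
    for _ in range(100_000):
        key = int(random.paretovariate(1.2)) % 1024  # zipf-ish: a few hot keys
        if not front.access(key):                    # only misses fall through
            back.access(key)

    print("front hit rate:", front.hits / (front.hits + front.misses))
    print("back hit rate: ", back.hits / (back.hits + back.misses))

The hot keys never reach the back layer at all, so anything it tries to learn about frequency and recency (which is exactly what ARC does) is based on a stream that's already had the hot set stripped out.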

It isn't magical. It is just annoying.

Honestly, on Linux: btrfs is probably better for your sanity than ZFS.

As far as ceph goes, if you understand it and can do the systems engineering... yeah, it is great stuff. My only complaint with ceph is that it's just too complex for most mortals. (Though once you get to any real scale... you can't avoid that complexity.)

With ceph you can do some crazy shit if you know what you are doing. Want to do a seamless cluster migration? We got ya. ;) Not as good as AFS, probably. But it is pretty scary.




What you say makes lots of sense. I can see how hit rates not being passed down would cause issues.

But anecdotally I think I'm in the same spot many others are. Btrfs let me down a couple of times (once with lots of truncated files, and another time I ran out of inode space on a mostly empty disk, iirc) and I'm in no hurry to give it another chance.

zfs has just happily worked for me. I ran it as a root filesystem for years. All the snapshot/send/receive stuff just worked and stayed out of my way.

Note that this was for personal use, with SSDs and later NVMe, so I wasn't comparing it head to head with anything. Which might be why I didn't notice issues like page caching behavior.


ZFS isn't what lets you down in the ZFS stack. It's memory fragmentation due to the ARC, and the ARC not responding to memory pressure correctly.
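
For what it's worth, the blunt workaround people usually reach for is capping the ARC so it can't balloon in the first place, via the OpenZFS zfs_arc_max module parameter (the same knob is writable at runtime through /sys/module/zfs/parameters/zfs_arc_max). The 4 GiB figure here is just an example; pick whatever leaves your workload enough headroom:

    # /etc/modprobe.d/zfs.conf
    # Cap the ARC at 4 GiB (value is in bytes) so it can't fight the
    # rest of the system for memory under pressure.
    options zfs zfs_arc_max=4294967296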

I'm sure there are parts of btrfs that still suck. Don't get me wrong. And you can pull the ran-out-of-space-so-now-you-can't-delete-files trick with ZFS too ;).

Personally, I only put btrfs into use on my laptop about 1.5 years ago. So... Take that for what you will. I'm pretty conservative.


True, ARC isn't magic; it's more that it makes poor decisions on occasion, with no reliable way to work around them, i.e. no guaranteed way to ensure a cache hit.

I would posit, however, that if you need to guarantee cache hits... you don't really need a cache, you need to roll your own flash storage tier and find a way to intelligently manage what's present on it at a given time. I've done this, namely for an environment that had very, very clear, scheduled needs for hot data, warm data, and archive data. It's fairly trivial to take files and just scoot them to a different storage array when they're X days old on a nightly basis (a toy sketch of that kind of job is below). That system is completely agnostic to, well, everything; I've done it on Windows hosts and, while it's more of a pain in the ass, it worked fine there too.

That was back in the hardware-RAID-or-nothing days, when having a terabyte of flash available was fairly exotic and expensive, but as a model, programmatically tiered storage isn't outdated by any means and probably fits many environments better than ZFS and "as much memory as accounting will approve."
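
Here's roughly the shape of that nightly job, as a toy sketch (the mount points and the 14-day cutoff are invented for illustration):

    # Nightly tiering sketch: move files from the hot flash tier to the
    # warm tier once they're older than the cutoff. Paths are hypothetical.
    import shutil, time
    from pathlib import Path

    HOT = Path("/mnt/flash/hot")       # hypothetical fast tier
    WARM = Path("/mnt/bulk/warm")      # hypothetical warm tier
    CUTOFF = time.time() - 14 * 86400  # 14 days; pick what fits

    for f in list(HOT.rglob("*")):  # snapshot the listing before moving things
        if f.is_file() and f.stat().st_mtime < CUTOFF:
            dest = WARM / f.relative_to(HOT)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(f), str(dest))  # handles cross-filesystem moves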

Also, I don't really 100% agree that ceph is too complex for mere mortals; it's more that it very, very quickly becomes too complex for your average Linux sysadmin to handle if you're not careful.

I really should dip my toes deeper into btrfs for production usage, though. I've played with it in the lab and I run it at home almost exclusively, but I've never done any big-boy-britches production stuff with it, and it very well might be a good tool to have in my pocket.


> The issue has to do with layering caches over each other.

I think this is mostly incorrect. In my understanding, the Linux page cache is mostly skipped. The problem, to the extent there is one, is occasionally duplicated data. My understanding is that this only affects mmap'd files, and perhaps the dentry and inode caches?
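
One way to eyeball this on a Linux ZFS box (assuming the usual OpenZFS kstat path) is to compare the ARC's size against the kernel's page cache; if normal reads really do bypass the page cache, the Cached figure should stay small relative to the ARC on a mostly-ZFS machine:

    # Compare ARC size vs. kernel page cache on Linux (OpenZFS).
    def arc_size_bytes():
        # arcstats data lines look like: "size  4  123456789"
        with open("/proc/spl/kstat/zfs/arcstats") as f:
            for line in f:
                parts = line.split()
                if parts and parts[0] == "size":
                    return int(parts[2])

    def page_cache_kib():
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("Cached:"):
                    return int(line.split()[1])  # value is in kB

    print("ARC size (MiB):  ", arc_size_bytes() // 2**20)
    print("page cache (MiB):", page_cache_kib() // 1024)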



