Note that NILFS makes old, crappy SSDs scream. Also, the article makes NILFS look good by comparing it to old filesystems, not COW-tree filesystems that are the current state of the art. NILFS may have merit, but this article doesn't demonstrate that in an honest way.
One of the most noticeable features of NILFS is that it can “continuously and automatically save instantaneous states of the file system without interrupting service”. NILFS refers to these as checkpoints. In contrast, other file systems, such as ZFS, can provide snapshots, but they have to suspend operation to take them. NILFS doesn’t have to do this; the snapshots (checkpoints) are part of the file system design itself.
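For a rough idea of what that looks like in practice (assuming the nilfs-utils tools are installed; the device and checkpoint number below are just placeholders):

    # list the checkpoints NILFS has been creating automatically
    lscp

    # promote checkpoint 42 to a snapshot so the cleaner won't reclaim it
    chcp ss 42

    # mount that snapshot read-only alongside the live filesystem
    mount -t nilfs2 -o ro,cp=42 /dev/sdb1 /mnt/snapshot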
Pardon my ignorance of SSDs and how they work, but aren't seek times supposed to be low or negligible? Isn't a log-structured FS optimised for a disk with high seek time?
I guess my questions boil down to:
1. Why is this approach fast on an SSD? Wouldn't the performance boost be more noticeable on a spinning disk?
2. Shouldn't someone try to make an FS optimised for SSD characteristics? Has someone already done this?
Isn't a log-structured FS optimised for a disk with high seek time?
Sort of. It makes writes sequential (and thus fast) and reads nearly random (and thus slow).
Why is this approach fast on an SSD? Wouldn't the performance boost be more noticeable on a spinning disk?
Old, crappy SSDs have very slow random writes (the SSD equivalent of seeking), so log structuring increases write performance by orders of magnitude. Since random and sequential reads are equally fast on SSDs, the randomized layout doesn't hurt read performance the way it does on spinning disks.
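If you want to see the gap for yourself, something like the following fio runs will show it (the test file and sizes are placeholders, and flags may vary a bit between fio versions):

    # same block size throughout, so only the access pattern differs
    fio --name=seqwrite  --filename=testfile --size=1G --bs=4k --rw=write     --direct=1
    fio --name=randwrite --filename=testfile --size=1G --bs=4k --rw=randwrite --direct=1
    fio --name=seqread   --filename=testfile --size=1G --bs=4k --rw=read      --direct=1
    fio --name=randread  --filename=testfile --size=1G --bs=4k --rw=randread  --direct=1

On an old SSD the randwrite numbers tend to be the outlier; on a decent recent drive the four results come out much closer together.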
Shouldn't someone try to make an FS optimised for SSD characteristics?
If the characteristics are different for every SSD model, no. Recent SSDs are becoming largely insensitive to access pattern, so the correct "optimization" is probably to do nothing beyond issuing trim.
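Concretely, that mostly means either periodic fstrim or mounting with discard (the device and mount point below are placeholders):

    # trim free space periodically, e.g. from a cron job or systemd timer
    fstrim -v /mnt/ssd

    # or let the filesystem issue discards as it deletes (ext4 shown here)
    mount -o discard /dev/sdb1 /mnt/ssd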
I suspect, without being certain, that recent SSDs are doing this by implementing something like a log-structured filesystem in the translation layer. I also suspect that this is something that's better done in your OS kernel than in your SSD controller, because the kernel has better information to work with.
The correct optimization would be one that minimizes wearing out the drive: essentially, something that moves hot data away from blocks that are in danger of being worn out.
While a speed boost is great and all, don't you have the added concern of making sure your blocks don't wear out on an SSD? Because log-structured file systems commit metadata and data in large sequential writes, they need a background cleaner that constantly moves live data around to reclaim the segments left fragmented by files that are later deleted. It seems like there would just be too much writing going on if what you want is a long-lasting SSD.
They do mention in the article that it would be a bad idea to use NILFS on your root partition because of heavy traffic, so let's say I decide to use it on a volume where I store all my media (where I pretty much write once, read a lot, and never delete). Now, a) why would I use an SSD for this, given that they're so much more expensive, and b) wouldn't I get comparable performance from an HDD here, since my files will be laid out pretty much sequentially given my workload? Doesn't the same apply to companies storing mounds of data? Where is the middle ground where I would want to use NILFS on an SSD? One where I a) won't be sending enough write traffic to wear out my SSD too quickly and b) will be reading a lot of random blocks?
What am I missing? I see the merits in using NILFS, but I just don't see the purpose in comparing its performance against other filesystems on SSDs.
Log-structured filesystems (including GC, checkpointing, and wear-leveling) have been quite prevalent in the embedded Linux world for several years already, e.g. jffs2 / yaffs / logfs / ubifs.
(Of course, these typically target raw NAND flash via the MTD / UBI layer, so they aren't directly suited to the SSD block-device abstraction.)
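For the curious, a typical ubifs setup looks roughly like this (assuming mtd-utils and a raw NAND device exposed as /dev/mtd0; the volume name and size are placeholders):

    # attach the raw NAND MTD device to the UBI layer
    ubiattach /dev/ubi_ctrl -m 0

    # create a UBI volume and mount ubifs on top of it
    ubimkvol /dev/ubi0 -N rootfs -s 64MiB
    mount -t ubifs ubi0:rootfs /mnt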
"How would you implement a swap on such a filesystem? You couldn't, could you?"
The days of "swap" are numbered. The future is a smart memory hierarchy that uses knowledge about size/speed tradeoffs to decide where to store things, but that won't look much like "swap".
Even with SSDs, it has become very unusual for "the system is currently swapping" to mean anything other than "the system's performance characteristics just became completely unacceptable". I stopped using a swap partition somewhere around 512MB to 1GB of RAM, and now my laptop, which I didn't even try to max out the memory on, shipped with 4GB. I can hardly even fill up the cached buffers in 4GB in any reasonable period of time.
(Please note the difference between "very unusual" and "never". In fact, I can cite a server in my care that benefits from using a bit of swap. But that server is the exception, not the rule.)
We'll still have swap as long as CPUs support virtual memory. Nothing beats using a dumb programming language and process model and, without explicitly deciding anything, being able to scale past physical memory until your working set just barely fits.
Further, there may be some performance benefit to swapping idle process memory to make room for more buffer cache.
It's true that for interactive use (especially a laptop with spun-down drives), you may be happy without swap. But I don't see the facility being mothballed.
How would you implement a swap on such a filesystem? You couldn't, could you?
I guess that's mostly a non-issue, as on Linux you usually don't use a swap file but a whole partition. There is support for swap files, but they're usually a last resort. Let's face it, if you know how to set up NILFS2, then you're probably able to create a swap partition.
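For reference, either way it's only a couple of commands (the device and size below are placeholders):

    # dedicated swap partition
    mkswap /dev/sdb2
    swapon /dev/sdb2

    # or, as a last resort, a swap file on a filesystem that supports one
    fallocate -l 2G /swapfile   # some filesystems want dd instead of fallocate
    chmod 600 /swapfile
    mkswap /swapfile
    swapon /swapfile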