
> You also note that reading a file sequentially from disk is very fast, which it is, but there is no guarantee that the file's contents are actually sequential on disk (fragmentation), right?

Correct. And there are actually two layers of fragmentation to worry about: the traditional filesystem-level kind, where a file is split across many separate chunks of the drive's logical block address space (fixable with a defrag operation), and fragmentation hidden inside the SSD's flash translation layer, a consequence of the file's contents being written or updated at different times.

The latter often has a much smaller effect than you might expect for what sounds like a pathological corner case: https://images.anandtech.com/graphs/graph16136/sustained-sr.... shows typically only a 2-3x slowdown from artificially induced fragmentation at the FTL level. But if the OS also has to issue many small read commands to reassemble a file, throughput will suffer badly unless the OS can keep lots of requests in flight in parallel (which depends on it being able to locate many file extents from each read of the filesystem's B+ tree, or equivalent, and on it actually sending those read requests to the drive in a large batch).
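A toy sketch of that difference (Python, using the POSIX-only `os.pread`; the 4 KiB extent size and the shuffled extent order are invented for illustration, not taken from any real filesystem): one large sequential read versus reassembling the same file from many small per-extent requests issued out of order.

```python
import os
import random
import tempfile

CHUNK = 4096                  # pretend each 4 KiB block is a separate extent
FILE_SIZE = CHUNK * 256       # 1 MiB test file

# Create a scratch file full of random bytes.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(FILE_SIZE))
    path = f.name

# Unfragmented case: one large sequential read request.
with open(path, "rb") as f:
    sequential = f.read()

# "Fragmented" case: one small read per extent, issued in a shuffled
# order, the way a badly fragmented layout can force the OS to behave.
extents = list(range(FILE_SIZE // CHUNK))
random.shuffle(extents)
parts = {}
fd = os.open(path, os.O_RDONLY)
try:
    for i in extents:
        parts[i] = os.pread(fd, CHUNK, i * CHUNK)  # one request per extent
finally:
    os.close(fd)
reassembled = b"".join(parts[i] for i in sorted(parts))

print(reassembled == sequential)   # same bytes either way
print(len(extents))                # 256 read commands instead of 1
os.unlink(path)
```

The data that comes back is identical; the cost is the number of round trips, which is why batching those requests (or finding many extents per metadata read) matters so much.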



