This is very interesting. However, the OP's idea of "failure" is when something ...

wmf · on Oct 26, 2010

> It's possible that silent data corruption occurred much earlier.

That is possible, but since flash uses ECC the controller should notice corruption.

> Also, does the OP know whether any write buffering is going on?

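There shouldn't be any in the OS, since he's already using O_DIRECT, which bypasses the page cache. Whatever buffering the drive does internally is a separate matter.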
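For anyone who hasn't used O_DIRECT, here is a minimal sketch of what such a write looks like on Linux. The 4096-byte alignment and the helper names (`aligned_buffer`, `write_direct`) are my assumptions for illustration, not the OP's code. The real alignment requirement depends on the device and filesystem.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on device and filesystem


def aligned_buffer(size, fill=b"\xa5"):
    """Return a page-aligned buffer of `size` bytes (anonymous mmap is page-aligned)."""
    if size % BLOCK:
        raise ValueError("size must be a multiple of BLOCK")
    buf = mmap.mmap(-1, size)
    buf.write(fill * size)
    return buf


def write_direct(path, buf):
    """Write `buf` to `path`, bypassing the Linux page cache (os.O_DIRECT is Linux-only)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
    try:
        return os.write(fd, buf)
    finally:
        os.close(fd)
```

Calling `write_direct("/some/path", aligned_buffer(BLOCK))` should write one block, though some filesystems reject O_DIRECT outright. Note that this only skips the kernel's cache; it says nothing about what the drive's controller does with the data afterwards.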

> why some writes took much longer than others

That's probably caused by block erases and the copying of still-valid data that goes with them, since one erase is required for every ~1MB of data written.
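One way to check that guess, sketched under assumptions: time every write, flag the ones far slower than the median, and look at how many bytes were written between slow ones. If erases are the cause, the gaps should cluster around the erase granularity. The function names, the 5x threshold, and the sample latencies below are all made up for illustration.

```python
import statistics


def slow_writes(latencies, factor=5.0):
    """Return indexes of writes slower than `factor` times the median latency."""
    median = statistics.median(latencies)
    return [i for i, t in enumerate(latencies) if t > factor * median]


def bytes_between(indexes, write_size):
    """Bytes written between consecutive slow writes."""
    return [(b - a) * write_size for a, b in zip(indexes, indexes[1:])]


# Fabricated latencies in seconds for 4 KiB writes, not measurements.
sample = [0.001] * 600
for i in (255, 511):
    sample[i] = 0.050

slow = slow_writes(sample)        # indexes 255 and 511
gaps = bytes_between(slow, 4096)  # 256 writes * 4096 bytes = 1048576 bytes apart
```

On a real drive the pattern will be noisier than this, since the controller decides when to erase and may do it in the background.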

If y'all are really interested in this topic, there are some good academic papers:

http://nvsl.ucsd.edu/ftest/

InclinedPlane · on Oct 26, 2010

Flash drives use ECC precisely so that they can detect when memory cells are nearing the end of their useful writable lifetime before the data is unreadable.
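To make that concrete, here is a toy sketch of the idea: an error-correcting code can repair a flipped bit and also report that it had to, and the count of repairs is the early-warning signal. This uses Hamming(7,4) purely as a stand-in. Real controllers use much stronger codes (BCH, LDPC) over whole pages, and the function names here are my own.

```python
def encode(d):
    """Encode 4 data bits as a 7-bit Hamming codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p4 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p4, d2, d3, d4]


def decode(c):
    """Return (data bits, corrected), fixing at most one flipped bit."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit, 0 if clean
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome != 0


def read_block(codewords):
    """Decode a block; return (data, number of codewords that needed correction)."""
    results = [decode(c) for c in codewords]
    return [bits for bits, _ in results], sum(fixed for _, fixed in results)


original = [[1, 0, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 1, 0]]
stored = [encode(d) for d in original]
stored[0][4] ^= 1  # simulate a worn cell flipping a bit
stored[3][1] ^= 1
data, corrected = read_block(stored)  # data comes back intact, corrected == 2
```

The data reads back fine, but `corrected` is nonzero. A controller that sees that count climbing for a block can retire the block while the contents are still recoverable.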