> Shouldn't it time out only when the server is no longer ACK-ing packets?

That's exactly what happens. The server ACKs data until its write buffer fills, then stops responding until the buffer has been flushed to disk. If flushing the buffer takes longer than the client's timeout, the client gives up.

I have personally watched this happen in Wireshark, with the server not ACKing for more than 10 minutes.
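
To get a feel for the scale of the mismatch, here's a rough back-of-the-envelope sketch. The numbers are hypothetical, just to illustrate why flushing a big dirty buffer can easily outlast a client timeout:

    # Hypothetical numbers, only to show the orders of magnitude involved.
    ram_gb = 64                  # total RAM on the server
    dirty_ratio = 0.20           # roughly the default vm.dirty_ratio of 20%
    disk_write_mb_s = 50         # sustained disk write speed
    client_timeout_s = 60        # typical client-side timeout

    # Data that can pile up in RAM before the kernel forces a flush.
    dirty_buffer_mb = ram_gb * 1024 * dirty_ratio
    flush_time_s = dirty_buffer_mb / disk_write_mb_s

    print(f"Dirty buffer: {dirty_buffer_mb:.0f} MB")
    print(f"Time to flush: {flush_time_s:.0f} s")
    print("Client gives up before the flush finishes:",
          flush_time_s > client_timeout_s)

With those made-up numbers the flush takes around 260 seconds, so a client with a one-minute timeout bails long before the server comes back.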




That's not it. I only had this problem on a Fast Ethernet connection (because I had to share the cable between two connections). The server could write ~50 MB/s, but the transfer still timed out on the 10 MB/s upload.


It's possible you were seeing a different problem, but this issue is more likely to appear on a faster network connection, because the network transfer outpaces the disk writes.

You can confirm this by watching the Dirty and Writeback numbers in /proc/meminfo.
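
If it helps, something along these lines will poll those two counters (just a quick sketch; a shell one-liner works equally well):

    # Poll the Dirty and Writeback counters from /proc/meminfo once a second.
    import time

    def dirty_writeback():
        values = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, rest = line.split(":", 1)
                if key in ("Dirty", "Writeback"):
                    values[key] = int(rest.split()[0])  # reported in kB
        return values

    while True:  # Ctrl-C to stop
        v = dirty_writeback()
        print(f"Dirty: {v['Dirty']} kB   Writeback: {v['Writeback']} kB")
        time.sleep(1)

If Dirty climbs into the gigabytes and then the transfer stalls while Writeback churns, you're looking at this problem.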

Tuning the vm.dirty* settings can help, as described here:

https://lonesysadmin.net/2013/12/22/better-linux-disk-cachin...
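
For reference, the relevant knobs live under /proc/sys/vm. Here's a quick way to dump your current values; what to lower them to depends on your RAM and disk speed, so treat the article's suggestions as a starting point rather than gospel:

    # Print the current vm.dirty* sysctl values from /proc/sys/vm.
    # Lowering the dirty_background_* and dirty_* limits makes the kernel
    # start writeback earlier and caps how much dirty data piles up in RAM.
    import glob
    import os

    for path in sorted(glob.glob("/proc/sys/vm/dirty_*")):
        with open(path) as f:
            print(f"vm.{os.path.basename(path)} = {f.read().strip()}")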


Holy shit, I think you and the above comment, along with this thread, may have finally given me the answer to one of the few problems I was never able to solve.

About 4-5 years ago, I was working on a project, and part of that was copying large amounts of data to a system via NFS. At exactly 30 minutes, NFS would croak and the transfer would fail.

I think this buffer fill-and-empty cycle was fucking killing it. It's a shame I don't work there anymore; I'd definitely want to try tweaking these settings and see if I could solve it.


Yeah, that does sound like the symptoms of the problem I discovered. If you ever witness it again, the trick is to watch /proc/meminfo for the Dirty and Writeback numbers.

And it's the vm.dirty* settings you'd change to fix it, as described here: https://lonesysadmin.net/2013/12/22/better-linux-disk-cachin...



