MickDick wrote: To make things clear, what exactly is the current hypothesis for the lag?
I've explained it previously, both in this thread and in others, but the concrete symptoms are that individual VFS operations, seemingly at random, take a really long time (sometimes up to 20 seconds) to complete, particularly metadata-heavy ones such as open(), rename(), close() and the like. The problem is that I don't know what they are blocking on, nor how to find that out. Processes calling sync() or fsync() do seem particularly likely to make other processes block, but they are far from exclusively responsible.
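For anyone who wants to put numbers on these symptoms, here is a minimal sketch that times one open/close/rename cycle and flags calls that exceed a threshold. The function names (`probe_once`, `slow_calls`), the probe file names, and the 1-second threshold are all made up for illustration. Note that this only shows that a call stalled and for how long; it says nothing about what the call was blocking on.

```python
import os
import tempfile
import time


def timed(label, func, *args):
    """Run func(*args); return (label, elapsed seconds, result)."""
    start = time.monotonic()
    result = func(*args)
    return label, time.monotonic() - start, result


def probe_once(directory):
    """Time one open/close/rename cycle inside `directory`.

    Returns a list of (label, seconds) pairs. The file names are
    hypothetical placeholders, not anything the system depends on.
    """
    src = os.path.join(directory, "vfs-probe.tmp")
    dst = os.path.join(directory, "vfs-probe.renamed")
    samples = []

    label, secs, fd = timed("open", os.open, src, os.O_CREAT | os.O_WRONLY, 0o600)
    samples.append((label, secs))
    label, secs, _ = timed("close", os.close, fd)
    samples.append((label, secs))
    label, secs, _ = timed("rename", os.rename, src, dst)
    samples.append((label, secs))

    os.unlink(dst)  # clean up the probe file
    return samples


def slow_calls(samples, threshold=1.0):
    """Keep only the samples that took at least `threshold` seconds."""
    return [(label, secs) for label, secs in samples if secs >= threshold]


if __name__ == "__main__":
    # Assumption: the temp directory lives on the filesystem under suspicion;
    # pass dir=... to TemporaryDirectory to probe a different mount.
    with tempfile.TemporaryDirectory() as directory:
        for _ in range(100):
            for label, secs in slow_calls(probe_once(directory)):
                print(f"{label} stalled for {secs:.2f}s")
            time.sleep(0.1)
```

Running this in a loop while another process is doing heavy writes or calling fsync() should show whether the stalls line up with that activity, although it cannot confirm the cause.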