It is an interesting question -- Given actual dropbox usage patterns, is there there an caching approach that is better than LRU for real Dropbox customers?
The answer I got was, hybrid LFU/LRU algorithms could work better, but they are complicated and were never tested against real Dropbox access patterns.
is there any known algorithm that looks at the relationship between cached items? Ie. File A is accessed, then B, then C, then D, then E, and so on. A would a have stronger relationship to B, one step away, than to D, two steps away. So if we later access file A, the algorithm would know there's a higher probability that we need File B next, so it could check if file B is in the cache and if not, prefetch it and save it in the cache.
Seems like a natural place for an application of a 1st-order markov model - one state for every entity (file), record the load events from each previously loaded file to the next file, and then your cache can predictively load a set of files by computing the most likely transitions.
Going higher than 1-order might make it even smarter, but with the cost of taking more memory (increasing the likelyhood of thrashing).
It actually seems like a very novel and interesting approach, although I don't know if it's been done before.
(Quick googling reveals that there is a LOT of work done on this approach, but I don't have the time right now to see if it's a good one or not.)
Didn't Microsoft introduce something similar in Windows Vista, where it tries to keep the system cache occupied in order to have a better response time for the user?
In the case of viewing a slideshow of photographs on a webpage (and, doubtless, some other things), the browser cache probably further reduces the usefulness of LRU on the server side and reduces the chance of worst-case for MRU.
It is an interesting question -- Given actual dropbox usage patterns, is there there an caching approach that is better than LRU for real Dropbox customers?
The answer I got was, hybrid LFU/LRU algorithms could work better, but they are complicated and were never tested against real Dropbox access patterns.