In my experience with Gemini it definitely does not take a few minutes. I think that's a big difference between Claude and Gemini. I don't know exactly what Google is doing under the hood there, I don't think it's just quantization, but it's definitely much faster than Claude.
Caching a code base is tricky, because whenever you modify the code base, you're invalidating parts of the cache and due to conditional probability any changed tokens will change the results.
Caching a code base is tricky, because whenever you modify the code base, you're invalidating parts of the cache and due to conditional probability any changed tokens will change the results.