Great post. This exact limitation of web LLMs is why I'm leaning strongly towards local models for the easier stuff: with a local model, prompt caching can dramatically speed up fixed, repetitive tasks.
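For example, with llama-cpp-python you can attach an in-RAM cache so the KV state for a fixed prompt prefix is reused across calls instead of being re-evaluated every time. A minimal sketch (the model path and the `FIXED_PREFIX` classifier prompt are placeholders, not anything from the post):

```python
from llama_cpp import Llama, LlamaCache

# Load a local GGUF model; path is a placeholder.
llm = Llama(model_path="./model.gguf", n_ctx=4096, verbose=False)

# In-RAM cache: KV state for previously seen prompt prefixes is reused.
llm.set_cache(LlamaCache(capacity_bytes=2 << 30))

# A fixed task prompt shared by every call (hypothetical example).
FIXED_PREFIX = "You are a commit-message classifier. Label the message:\n"

# The shared prefix is only processed once; later calls with the same
# prefix load the cached state and only evaluate the new suffix tokens.
for msg in ["fix: null check in parser", "feat: add retry logic"]:
    out = llm(FIXED_PREFIX + msg, max_tokens=16)
    print(out["choices"][0]["text"].strip())
```

For long system prompts and short per-item inputs, skipping the prefix re-evaluation is where most of the speedup comes from.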
But frontier models are just too damn good and convenient, so I don't think it's possible to fully get away from web LLMs.