I noticed that if I take >5 minutes to read and/or understand claude's response, Anthropic's prompt cache expires. The next message forces a full context rebuild, which costs ~1.25x (write) vs ~0.1x (read) and hits rate limits faster.
So I updated Grov, my open source tool to include a "heartbeat" mode, (--extended-cache) that sends a minimal token (just a .) every 4 minutes during idle time.
We also just shipped a Cloud Sync feature so your agent learns from your teammates' sessions (shared reasoning traces), but the cache keep-alive is the main "hack" in this release. It currently works only for claude code CLI.
Code is Apache 2.0. Let me know what you think of the proxy implementation.