Also, what this adds is mostly overhead at the wrong level of abstraction, not visibility.
> Important note for Claude Code users: Claude Code already handles prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box.
Source: their GitHub
From the FAQ:
You're right, and it's a fair question. Claude Code does handle prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box. You don't need this plugin for that.
This plugin is for a different layer: when you build your own apps or agents with the Anthropic SDK. Raw SDK calls don't get automatic caching unless you place cache_control breakpoints yourself. This plugin does that automatically, plus gives you visibility into what's being cached, hit rates, and real savings — which Claude Code doesn't expose.
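For anyone unfamiliar with the distinction the FAQ is drawing: on raw SDK calls you have to mark the cacheable prefix yourself. A minimal sketch of what that looks like, just the request payload you'd pass to `client.messages.create(**params)` (the system-prompt text and model id are placeholders):

```python
# Sketch of a Messages API request with a manual cache breakpoint.
# Without the cache_control marker, a raw SDK call gets no caching.
LONG_SYSTEM_PROMPT = "You are a code-review assistant. " * 200  # placeholder prefix

params = {
    "model": "claude-sonnet-4-5",  # example model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # The breakpoint: everything up to and including this block
            # becomes a cacheable prefix for subsequent requests.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [{"role": "user", "content": "What does this repo do?"}],
}
```

Placing (and rotating) markers like this as the conversation grows is the part the plugin claims to automate.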
> Claude Code already handles prompt caching automatically for its own API calls
Claude Code is an app. The API layer is different.
When did people start thinking that the Claude Code app and the API are the same thing?
Are these just all confused vibe coders?
The first thing on the page is "Automatic prompt caching for Claude Code."
Why should one expect this to actually be "Automatic prompt caching for new apps you develop with Claude Code"?
It appears to be hard to explain what this plugin does, and the authors did a terrible job of it; they did not even try.
Also, the Anthropic API already introduced prompt caching: https://platform.claude.com/docs/en/build-with-claude/prompt...
What is new here?
As a matter of fact, I think this is not a problem at all, as Anthropic makes it extremely easy to cache stuff; you just set your preferred cache breakpoint on the last message, and Anthropic will automatically cache it under the hood. Every distinct message is another cache point: e.g. they first compute the hash of all messages; if that's not found, the hash of all messages minus one, and so on.
It’s really a non-problem.
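A toy sketch of the longest-prefix lookup described above, purely illustrative (the real service matches token prefixes server-side; the function and cache here are made up):

```python
import hashlib

def longest_cached_prefix(messages, cache):
    """Hash the full message list, then drop the last message and retry,
    until a cached prefix is found. Not Anthropic's actual code."""
    for end in range(len(messages), 0, -1):
        key = hashlib.sha256(repr(messages[:end]).encode()).hexdigest()
        if key in cache:
            return end, key  # number of messages served from cache
    return 0, None

# Toy usage: pretend the first two messages were cached on an earlier call.
msgs = ["system prompt", "user turn 1", "user turn 2"]
cache = {hashlib.sha256(repr(msgs[:2]).encode()).hexdigest(): "cached-state"}
hit_len, _ = longest_cached_prefix(msgs, cache)  # matches the 2-message prefix
```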
I am so confused why you chose an MCP server to solve this; wouldn't a regular API at least have some merit in how it could be used (in that it doesn't require an LLM to invoke it)?
> "Hasn't Anthropic's new auto-caching feature solved this?"
> Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.