Prompt-caching – auto-injects Anthropic cache breakpoints (90% token savings)

2026-03-13 11:38 · 69 points · 27 comments · prompt-caching.ai

Open source MCP plugin that automatically injects prompt cache breakpoints into Claude Code sessions. Up to 90% token cost reduction — zero config.

⏳ Pending approval in the official Claude Code plugin marketplace. Install directly from GitHub in the meantime:

Run these inside Claude Code — no npm, no config file, no restart needed:

/plugin marketplace add https://github.com/flightlesstux/prompt-caching
/plugin install prompt-caching@ercan-ermis

Claude Code's plugin system handles everything automatically. The get_cache_stats tool is available immediately after install.


Read the original article

Comments

  • By numlocked · 2026-03-13 12:57 · 2 replies

    As per its own FAQ, this plugin is out of date and doesn't actually do anything incremental re: caching:

    > "Hasn't Anthropic's new auto-caching feature solved this?"

    > Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.

    • By orphea · 2026-03-13 13:00 · 2 replies

      I don't understand, and I'm curious: why does a dead-on-arrival open source tool need a separate domain?

        Domain Name: prompt-caching.ai
        Updated Date: 2026-03-12T20:31:44Z
        Creation Date: 2026-03-12T20:27:35Z
        Registry Expiry Date: 2028-03-12T20:27:35Z

      • By imjonse · 2026-03-13 14:55

        It's more likely the other way around: the .ai domain, with a fairly generic and maybe future-proof name, needed a quick vibecoded project so it wouldn't be empty at launch.

      • By derrida · 2026-03-13 13:12 · 1 reply

        Is it perhaps because this is for Claude Code, but there are other tools that use Anthropic's API, like custom agents? (Some of which I prefer over Claude Code, e.g. sketch.dev, which is now called shelley at exe.dev.)

        • By stingraycharles · 2026-03-13 13:29

          No, because this doesn’t actually “fix” any existing code. It’s only useful for helping an LLM to modify your code to adjust the caching parameters in the right place, but it doesn’t have the correct API for that.

    • By thepasch · 2026-03-13 15:32

      I’m pretty sure whoever made this didn’t read the website they asked their LLM to generate for them.

  • By somesnm · 2026-03-13 12:15 · 4 replies

    Hasn't this been largely solved by auto-caching introduced recently by Anthropic, where you pass "cache_control": {"type": "ephemeral"} in your request and it puts breakpoints automatically? https://platform.claude.com/docs/en/build-with-claude/prompt...
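The request shape that comment describes can be sketched as a plain payload dict. This is built purely from the comment's wording: the top-level placement of `cache_control` and the model name are assumptions for illustration, not verified against current docs, and nothing is actually sent.

```python
# Auto-caching as described in the comment above: a single top-level
# cache_control flag, with breakpoint placement left to the API.
# Built as a plain request body; field placement follows the comment's wording.
auto_cached_request = {
    "model": "claude-sonnet-4-5",            # illustrative model name
    "max_tokens": 1024,
    "cache_control": {"type": "ephemeral"},  # top level, per the comment
    "system": "Long, stable system prompt and tool definitions...",
    "messages": [{"role": "user", "content": "First turn"}],
}

# The stable prefix (model, system prompt, tools) is what caching can reuse
# across turns; no per-block annotation is needed in this style.
assert "cache_control" not in auto_cached_request["messages"][0]
```

The contrast with the older style is that breakpoints no longer have to be attached to individual content blocks by hand.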

    • By philipp-gayret · 2026-03-13 12:27 · 1 reply

      Looking at my own usage with Claude Code out of the box, with nothing special set up around caching: for this month, according to ccusage, I have 0.2M input tokens, 0.6M output, 10M cache create, and 311M cache read, for 322M total. Seems to me that it caches quite heavily out of the box, but if I can trim my usage somehow with these kinds of tools I'd love to know.
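A rough sanity check on those ccusage numbers, assuming Claude Sonnet list prices ($3/M input, $15/M output) and the documented 5-minute ephemeral cache multipliers (writes billed at 1.25x the input price, reads at 0.1x). All prices are assumptions for illustration, not the commenter's actual bill:

```python
# Token volumes from the comment above, in millions.
M_INPUT, M_OUTPUT, M_CACHE_WRITE, M_CACHE_READ = 0.2, 0.6, 10.0, 311.0
IN_PRICE, OUT_PRICE = 3.00, 15.00  # assumed USD per million tokens

# What the month would cost if every cached token were billed as fresh input.
uncached = (M_INPUT + M_CACHE_WRITE + M_CACHE_READ) * IN_PRICE \
           + M_OUTPUT * OUT_PRICE

# Cost with caching: writes at 1.25x input price, reads at 0.1x.
cached = (M_INPUT * IN_PRICE
          + M_CACHE_WRITE * IN_PRICE * 1.25
          + M_CACHE_READ * IN_PRICE * 0.10
          + M_OUTPUT * OUT_PRICE)

savings = 1 - cached / uncached
print(f"${uncached:.2f} uncached vs ${cached:.2f} cached -> {savings:.1%} saved")
```

Under those assumptions the out-of-the-box caching is already saving roughly 86% versus paying full input price for every token, which is close to the "up to 90%" headline figure.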

      • By stingraycharles · 2026-03-13 12:44

        This is not about caching things for stuff that others built, it’s solely to modify code that you’re writing that will use Anthropic’s API endpoints.

    • By stingraycharles · 2026-03-13 12:26

      Yes, it has, this is a non-problem, and even if it was a problem, an MCP server would most definitely be one of the worst ways to fix it.

    • By gostsamo · 2026-03-13 12:50

      It is answered in the FAQ.

    • By ermis · 2026-03-13 12:36 · 1 reply

      [flagged]

      • By stingraycharles · 2026-03-13 12:39

        Please don’t use AI for writing comments.

        Also, what this adds is mostly overhead at the wrong level of abstraction, not visibility.

  • By katspaugh · 2026-03-13 12:35 · 2 replies

    > This plugin is built for developers building their own applications with the Anthropic API.

    > Important note for Claude Code users: Claude Code already handles prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box.

    Source: their GitHub

    • By jasonlotito · 2026-03-13 13:44 · 1 reply

      Does anyone actually read anymore?

      From the FAQ:

      > You're right, and it's a fair question. Claude Code does handle prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box. You don't need this plugin for that.

      > This plugin is for a different layer: when you build your own apps or agents with the Anthropic SDK. Raw SDK calls don't get automatic caching unless you place cache_control breakpoints yourself. This plugin does that automatically, plus gives you visibility into what's being cached, hit rates, and real savings — which Claude Code doesn't expose.
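The manual breakpoint placement the FAQ refers to looks roughly like this, built as a plain Messages API request body rather than an actual SDK call (the model name and prompt text are illustrative; this is a sketch of the documented per-block `cache_control` field, not the plugin's own code):

```python
# Manual cache breakpoint on a raw Anthropic Messages API request: a stable
# prefix (system prompt, tool definitions) is only cached if you attach
# cache_control to the last block of that prefix yourself.
def build_request(system_text: str, user_text: str) -> dict:
    """Return a Messages API payload with a cache breakpoint on the system block."""
    return {
        "model": "claude-sonnet-4-5",  # illustrative model name
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Breakpoint: the prefix up to and including this block becomes
                # a cache entry; later requests with an identical prefix are
                # billed as (much cheaper) cache reads.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }

payload = build_request(
    "You are a code reviewer. <long, stable style guide goes here>",
    "Review this diff.",
)
```

Forgetting that per-block annotation is exactly the gap the plugin claims to fill for raw SDK users.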

      > Claude Code already handles prompt caching automatically for its own API calls

      Claude Code is an app. The API layer is different.

      When did people start thinking that the Claude Code app and the API are the same thing?

      Are these just all confused vibe coders?

      • By user34283 · 2026-03-13 14:05 · 1 reply

        Is this a joke?

        The first thing on the page is "Automatic prompt caching for Claude Code."

        Why should one expect this to actually be "Automatic prompt caching for new apps you develop with Claude Code"?

        It appears to be hard to explain what this plugin does, and the authors did a terrible job; they did not even try.

        • By jasonlotito · 2026-03-13 18:33

          > Is this a joke?

          Yes, your comment is a joke. I agree.

HackerNews