MCP integration

Token Saver

Cut token cost without cutting quality. An installer MCP that teaches your agent to spend fewer tokens on its own prose while keeping code, commands and context verbatim.

@prom.codes/saver is the third prom.codes MCP server. Where the context engine answers "what does this codebase look like?" and the memory server answers "what did we already learn and decide?", the saver answers a blunter question: "why is this agent burning tokens narrating itself?" It installs a small efficient-output rule block into your agent runtime so the model spends fewer tokens on its own prose — preamble, play-by-play narration, pasted tool-output recaps — while keeping everything that actually matters verbatim.

The tagline is "lean, not terse." It shapes phrasing, never what the agent does and never what it remembers.

Pure JS — `npx` just works

Unlike the context and memory servers, Saver has no database, no embeddings, and no native modules. It is pure JavaScript, so npx -y @prom.codes/saver@latest is turnkey on every platform — no build step, no native-dependency compilation, no platform-specific binaries. If npx has ever stalled on a native build for you, this one won't.

What it keeps verbatim (anti-context-loss)

This is the whole point, so it is stated plainly: correctness, safety, completeness and the agent's own working context come first; token-saving comes second. If they conflict, Saver does not compress. The rules leave the following untouched, word for word:

Code, diffs and commands — never paraphrased or trimmed.
API names, file paths and exact error strings — kept literally.
Load-bearing caveats, trade-offs and negations — anything hinging on "not", "only", "unless" stays in full.
Destructive or multi-step sequences — the steps an agent must not drop or reorder are preserved.
The agent's own working context — anything the agent will need on a later turn is never compressed away.
The closing summary at the end of a task — leanness applies to narrating work in progress, never to the report you get when the agent hands a task back. What it did, what changed, what it verified, what is still open: in full, at every level. A summary you have to ask for costs more than it ever saved.

What it does trim is the model's own filler: conversational preamble, step-by-step narration of work already shown in tool calls, and restating tool output that the reader can already see.

Tools

Saver exposes two tools: setup (the on-ramp) and status (a health check). The rule only saves tokens once it is installed into a runtime config file — and an MCP server can't act on its own, so the Saver makes that happen two ways:

Auto-install on startup (default on). When the Saver boots in a project that already has a config file (CLAUDE.md / AGENTS.md / .cursor / .augment), it installs the marked rule block automatically — idempotent, never creates a new file, skipped when no project is open (home/root). Opt out with PROMETHEUS_SAVER_AUTO_SETUP=off. This is why the Saver "just works" after you add it.
setup (explicit). Run it once per workspace to install into all detected runtimes, including creating AGENTS.md if you have no config yet.

Both write only inside a marked block — foreign content is never touched. status reports the resolved workspace root, the key state, and which runtimes already have the block — so you can confirm where it writes and whether it's in.

If a window opens with no project, the workspace falls back to the host's working directory (often your home dir); setup then refuses to write rule files into your home dir — open your project folder first (Claude Code passes it via CLAUDE_PROJECT_DIR).

The `setup` tool

Argument	Values	Default	Purpose
`level`	`lite` · `balanced` · `aggressive`	`balanced`	How hard to lean out the prose.
`runtimes`	array of `claude-code` / `cursor` / `augment` / `agents`	auto-detected	Which runtime rule files to write. Omit to let Saver detect what's present.

Where the rules land

setup installs the rule block into the rule file for each detected runtime, always inside a marked, idempotently-replaced block:

Runtime	Rule file
Claude Code	`CLAUDE.md`
Cursor	`.cursor/rules/prom-saver.mdc`
Augment	`.augment/rules/prom-saver.md`
Generic agents	`AGENTS.md`

Re-running setup updates the block in place rather than duplicating it, and only the content between the markers is ever rewritten — your own instructions in those files are left alone.

Levels

Level	Typical output-token savings	What it does	When to use
`lite`	~10–15%¹	Drops filler only — preamble and obvious narration.	The lightest touch; you want to keep almost all of the agent's voice.
`balanced`	~15–20% (measured²)	Default and recommended. The conservative setting — trims prose meaningfully while protecting all the verbatim categories above.	Almost everyone. Start here.
`aggressive`	~15–30% (measured², uneven)	Telegraphic output; opt-in. Auto-falls back to `balanced` for code-generation, debugging and destructive work.	Power users who want maximum leanness and accept the terser style on routine turns.

¹ lite is not separately benchmarked; the figure is an estimate — it trims a subset of what balanced does. ² balanced and aggressive were first measured on claude-haiku-4-5 across six representative coding turns (single-shot, temp 0); a broader replay across real coding sessions measured ~16% fewer output tokens for balanced — hence the range above.

Scope discipline (opt-in, second rule)

The levels above shape prose, which is the smaller pool. This second, opt-in rule goes after the bigger one: an agentic run costs context size × turns, so the saving that matters is fewer rounds — and rounds are lost to changes that miss, not to words. A wrong edit costs a review round, a fix round, and the full context of both.

{ "name": "setup", "arguments": { "scope": true } }

It installs a separate marked block (prom-scope) next to the efficient-output one — its own block, not a fourth level, because it is a different axis. scope: false removes it again; omitting the argument leaves it exactly as it is. What it tells the agent:

Understand generously; change deliberately. Reading is cheap next to guessing wrong. This is the important half: an agent that reads less and then guesses is more expensive, not less.
Work with what is there — reuse the existing helper, follow the file's conventions, and change what the task needs (a bug fix does not need the tidy-up around it).
Rewrite when the code genuinely cannot carry the task — broken, unsafe, unprotected at a critical point — and say in one sentence why the smaller change would not have held. It is not a rule against real work.
Tests, error handling at real boundaries, validation and security checks are never "unnecessary work."

Setup

Saver needs no API key and no workspace root — and because it's pure JS there's no index to build. One command:

claude mcp add saver -- npx -y @prom.codes/saver@latest

The -- separator before the command is required. See the Claude Code install page for scopes and the optional .mcp.json form (Cursor uses .cursor/mcp.json, VS Code .vscode/mcp.json — same JSON shape).

Then ask your agent to run setup once per workspace. It detects which runtimes are present (Claude Code, Cursor, Augment, generic AGENTS.md) and installs the efficient-output rule block. The install is idempotent: re-running updates the block in place instead of duplicating it.

Environment variables

Everything is optional — both default sensibly, so the bare install above is all most people need.

Variable	Default	Purpose
`PROMETHEUS_API_KEY`	—	Optional. The same `prom_live_…` key as the context/memory servers; ties the install to your account and is the hook for future metering. Today it changes nothing about what `setup` writes, and a malformed value just warns and continues keyless.
`PROMETHEUS_WORKSPACE_ROOT`	auto-detected	Override only to point `setup` at a different repo than the one you're in.
`PROMETHEUS_SAVER_SCOPE`	off	`on` makes startup auto-install the opt-in scope-discipline rule as well.
`PROMETHEUS_SAVER_AUTO_SETUP`	on	`off` stops the server installing the rule block on startup.

Requirements

Node ≥ 20.10. No database, no embeddings, no native modules — the whole package is JavaScript, which is what makes npx turnkey here.