@zapgun/clawback
v0.7.1
Published
Tokenmaxxing for Claude Code.
Maintainers
Readme
clawback
clawback is a tokenmaxxer — a transparent local gateway to Anthropic (and other
providers) that:
- shows you exactly where your tokens are going,
- gives you a curated set of one-click optimization knobs to make them go further,
- gives you a HUD inside
claudethat lets you know where you are in your token burn.
Your same plan, your same quota, just way more efficient.
clawback is not an agent and works with any
agent harness (see below).
Quickstart
To start clawback with a default-secure config; run Claude Code; and open the dashboard in your browser:
npx @zapgun/clawback quickstartContents
- Why
- Description
- Direct API Access (and Agent Setup)
- The Seven Knobs
- The Dashboard
- DIY Verify
- Configuration
- Privacy & Security
- Testing
- Contact
- License
Why
Anthropic's API has a bunch of optimizations to help us use it more efficiently, but using them effectively is tricky. clawback handles that complexity for you.
The big optimization is using the prompt cache. This turns expensive calls into cheap ones, immediately. Cache reads bill at 12–20× less than cache writes (tier-dependent). But the cache evicts silently —
- after 5 idle minutes (the default unspecified TTL),
- after 60 idle minutes (the default specified TTL),
- at midnight, when the date Claude Code injects into your system prompt changes, causing cache key rotation,
- whenever your system or tools prompts change, etc.
Every eviction means your next turn rebuilds context you already paid for with expensive "write" tokens instead of cheap "read" tokens. clawback sits in front of Anthropic and applies our optimizations to keep you in-cache:
- keep-alive — synthetic pings every 1–4 minutes (or 15–45 in extended cadence) to bridge idle gaps.
- 1h cache TTL — injects
cache_control: ttl=1hso eligible sessions use Anthropic's premium cache tier. - 5m cache TTL - removes
cache_control: ttl=1hto drop you down to cheaper writes if you don't need the premium tier. - strip-ephemeral — normalizes volatile prompt fragments (dates,
<env>blocks, rotating billing tokens) before forwarding, so your prompt cache stays stable.
Bottom line: clawback keeps you in-cache so that expensive cache writes become cheap cache reads. This means longer sessions, fewer cold starts, and more work, all out of the same quota.
clawback also gives you manual knobs for other optimizations, like traffic shaping; and gives you a passthrough mode so you can see baseline claude use with no protections applied.
Description
clawback quickstart writes ./CLAWBACK.md with sensible defaults, wires Claude Code's statusline with the clawback HUD, starts clawback in server mode (if it isn't already running), and launches claude with ANTHROPIC_BASE_URL pointed at clawback.
It binds the dashboard to loopback (host: "127.0.0.1") and mints a fresh adminToken alongside, so the setup is safe out of the box, and then
opens the local dashboard in your browser of choice. (Pass --lan to bind 0.0.0.0 with TLS for phone/remote access.)
Run each piece:
clawback init --local # create ./CLAWBACK.md
clawback setup claude # wire the HUD into Claude Code's statusline.
clawback claude # launch `claude` connected to `clawback` (launch server if necessary).Launch the server separately:
clawback &
ANTHROPIC_BASE_URL=http://localhost:8080 claudeAttach more claude sessions:
Once you're running a clawback server, run clawback claude in a new terminal to connect another session.
Uninstall
clawback uninstall claude strips the statusline HUD from ~/.claude/settings.json (or <cwd>/.claude/settings.json with --project), and leaves every other setting untouched.
Direct API Access (and Agent Setup)
To use clawback as a gateway to the Anthropic API, or to use it with an agent,
or another tool like the Claude Desktop app:
clawback &
export ANTHROPIC_BASE_URL=http://localhost:8080
open /Applications/Claude.appPoint any tool that needs the Anthropic API at a clawback server using ANTHROPIC_BASE_URL (or other configuration setting).
The Seven Knobs
| Toggle | What it does | Why you'd flip it |
|--------|--------------|-------------------|
| passthrough | Turns off every clawback intervention (baseline). | Measure clawback versus baseline. |
| keep-alive | Synthetic pings keep the prompt cache warm. | Avoid cache expiry between turns. |
| 1h cache TTL | Use Anthropic's premium extended cache tier. | Make the cache survive long idle gaps. |
| strip-ephemeral | Normalize dates / env / billing tokens in system prompts. | Prevent silent cache invalidation across days. |
| extended cadence | 15–45 min ping cadence (pairs with 1h TTL). | Cut keep-alive spend ~6–12×. |
| mobile | gzip outgoing + non-streaming responses. | Save battery on tethered links. |
| auto-continue | Resume sessions automatically after hitting a rate-limit. | Finish overnight jobs without babysitting. |
Flip any of them from the dashboard (hotkeys 1–7) or via the admin endpoints:
curl -X POST -H 'content-type: application/json' \
-d '{"action":"toggle"}' \
http://localhost:8080/_proxy/extend-cache-ttlIf you're calling from another machine (any non-loopback client), remember to also send the adminToken as a bearer token.
The Dashboard
Open http://localhost:8080/_proxy/ui/ to see:
- Manual toggles for all optimization knobs.
- Live time-series charts — context %, quota (5-hour and weekly windows), cache hit rate, turn %, tokens/sec, and time-to-first-token, with one line per session plus aggregates and baselines.
- Mode-change markers — a vertical line at every toggle flip, so you can see how each metric responds to each intervention.
- Suggestions — clawback watches your traffic and recommends a knob when your stats imply you'd benefit (e.g. low cache hit rate → strip-ephemeral; rate-limit walls → auto-continue). Enable with one click.
- Session filter bar to drill into a single
claudesession.clawback claudemints a per-session id, so each claude instance gets its own metrics, statusline, and session record.--label <name>names it in the UI,--resume <id>re-attaches a resumed claude to its history.
--remote <url> points clawback to an instance running on another host. This is useful for monitoring efficiency gains across a fleet of claude instances.
DIY Verify
The A/B harness we use for statistical validation lives in the repo — under
.skills/ and benchmark/, which aren't bundled into the npm package — so clone
the repo to run it:
git clone https://github.com/zapgun-ai/clawback && cd clawback.skills/ab/scripts/ab_block.sh runs a counterbalanced A/B block — passthrough
baseline vs. treatment stack — then analyzes and charts the result with confidence intervals. You can use this to generate statistically precise efficiency reports.
Here's an example of how to validate haiku (more capable models show even larger gains):
# headline profile (5–30 min idle gaps), ≥200 turns/arm for a reportable %:
bash .skills/ab/scripts/ab_block.sh --profile L2 --turns 200 --driver pty \
--model claude-haiku-4-5-20251001 --out runs/L2The efficiency win grows as turns cross the 5-minute, 1-hour, and midnight eviction boundaries, in addition to other optimizations.
Configuration
clawback uses a layered configuration file system with CLI overrides. In
order of increasing priority, settings are pulled from:
- Defaults —
src/config.js. - Global —
${XDG_CONFIG_HOME:-$HOME/.config}/clawback/CLAWBACK.md. For example:~/.config/clawback/CLAWBACK.md. - Local —
./CLAWBACK.md(auto-discovered), or--config <path>. - CLI flags —
clawback --helpfor the full list.
If --passthrough is set all interventions are disabled and clawback becomes
a byte-transparent gateway to the API.
See the example config for complete default settings and documentation.
Here are some common flags:
--host <host> Bind host (default 127.0.0.1)
--port <port> Bind port (default 8080)
--upstream <url> Upstream Anthropic URL
--passthrough Transparent byte-forwarding gateway
--keep-alive <on|off> Ping scheduler (default on)
--inject-extended-cache-ttl <on|off> ttl=1h injection (default on)
--keep-alive-mode-extended <on|off> 15–45 min cadence (default off)
--strip-ephemeral-from-system <on|off> Normalize system prompt (default on)
--mobile gzip + non-streaming bundle
--auto-continue Auto-resume claude after cap clears
--admin-token <secret> Bearer token for mutating admin endpointsObserver mode (good for non-Anthropic upstreams)
Our cache interventions are Anthropic-specific, but the telemetry isn't. Point clawback at an OpenAI-compatible upstream in --passthrough mode and you keep the dashboard, statusline, and per-session metrics:
clawback --upstream https://api.x.ai --passthrough --port 8081Privacy & Security
By default clawback runs entirely on your machine. There is no product telemetry yet. By
default it binds to loopback. On a LAN bind, the auto-minted adminToken gates
mutating admin endpoints; read endpoints (health, sessions, metrics, the UI)
stay open and expose stats but never prompt content, API keys, or the
adminToken.
Expose clawback to the public internet at your own risk. We typically tunnel
it through Tailscale, WireGuard, or an SSH port-forward instead. For HTTPS on
a trusted LAN, clawback init-cert mints a self-signed cert and --tls on
serves it.
Testing
These should all pass and run clean:
npm test # jest
npm run lint # biome
npm run check # biome lint + format check (CI gate)Contact
License
See LICENSE.
