claumon-pro
v0.1.0
Claumon Pro
Predict and visualize your AI API usage limits before you hit them.
Claude Pro and Max users only see current usage snapshots — not where they're heading. Agents get cut off mid-task with no warning, wasting compute. Claumon Pro reads your local Claude Code usage data, forecasts when you'll hit your rate limits, and alerts you in time to throttle agents.
Claumon Pro · Anthropic (Claude Code) · plan: max5x
5h window ████████████░░░░░░░░ 58% resets 07:48 PM
weekly ██████████░░░░░░░░░░ 48% rolling 7d · 2.4M tokens
burn rate ▁▁▁▁▁▁▁▁▁▁█▅ 50.4k tok/min ↗ active (high confidence)
forecast limit hit ~03:15 PM (19 min) ⚠ before reset
today $13.20 est · 1.3M tokens · fable 100%
How it works
Claude Pro/Max plans have no official usage API — so Claumon parses your local Claude Code transcripts (~/.claude/projects/**/*.jsonl), which record per-message token usage. No API key, no network calls, no data leaves your machine.
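As a rough illustration of the parsing step, the sketch below pulls per-message token usage out of JSONL text. The field names (`timestamp`, `message.model`, `message.usage.input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) are assumptions about the transcript layout, and `UsageRecord` and `parseUsageLines` are hypothetical names, not Claumon's actual internals.

```typescript
// Hypothetical record shape; the transcript field names read below are
// assumptions, not a documented schema.
interface UsageRecord {
  timestamp: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  cacheWriteTokens: number;
  cacheReadTokens: number;
}

// Parse JSONL text (one JSON object per line) and keep only entries that
// carry a usage block. Malformed lines are skipped instead of thrown on,
// since a transcript may be mid-write when it is read.
function parseUsageLines(jsonl: string): UsageRecord[] {
  const records: UsageRecord[] = [];
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let entry: any;
    try {
      entry = JSON.parse(trimmed);
    } catch {
      continue; // partially written or corrupt line
    }
    const usage = entry?.message?.usage;
    if (!usage || typeof entry.timestamp !== "string") continue;
    records.push({
      timestamp: entry.timestamp,
      model: String(entry.message.model ?? "unknown"),
      inputTokens: Number(usage.input_tokens ?? 0),
      outputTokens: Number(usage.output_tokens ?? 0),
      cacheWriteTokens: Number(usage.cache_creation_input_tokens ?? 0),
      cacheReadTokens: Number(usage.cache_read_input_tokens ?? 0),
    });
  }
  return records;
}
```

Keeping the parser a pure function over a string leaves file discovery (walking ~/.claude/projects) as a separate concern and makes the parsing logic easy to unit test.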
- 5-hour window tracking — models Claude's session-based rolling limit
- Burn-rate forecasting — EWMA over the last hour projects your ETA-to-limit
- Alerts before cutoff — macOS notification + terminal bell at 70%/90% and when limit is < 30 min away
- Anomaly detection — flags usage spikes vs your 7-day baseline
- Cost estimates — per-model pricing applied to your actual token mix
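The burn-rate forecast in the list above can be sketched as an exponentially weighted moving average over per-minute token counts, followed by a simple division to get the ETA. The smoothing factor `alpha`, the per-minute bucketing, and the function names are illustrative assumptions; the README does not state the actual parameters.

```typescript
// EWMA over per-minute token counts, ordered oldest to newest.
// alpha is a hypothetical smoothing factor: higher values react faster
// to recent activity.
function ewmaRate(tokensPerMinute: number[], alpha = 0.3): number {
  let rate: number | null = null;
  for (const sample of tokensPerMinute) {
    rate = rate === null ? sample : alpha * sample + (1 - alpha) * rate;
  }
  return rate ?? 0;
}

// Minutes until the budget is exhausted at the smoothed rate.
// Returns 0 when the budget is already spent and null when usage is idle,
// in which case no limit hit is projected.
function minutesToLimit(used: number, budget: number, rate: number): number | null {
  if (used >= budget) return 0;
  if (rate <= 0) return null;
  return (budget - used) / rate;
}
```

For example, with 500 tokens used against a budget of 1000 and a smoothed rate of 50 tokens per minute, `minutesToLimit(500, 1000, 50)` gives 10 minutes. Comparing that figure with the time left until the window resets is what would distinguish a "before reset" warning from a harmless projection.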
Install
npm install -g claumon-pro
Requires Node 20+ and at least one Claude Code session on this machine.
Usage
claumon # one-shot dashboard
claumon watch # live dashboard + alerts (refreshes every 30s)
claumon watch --interval 10
claumon export --csv --days 7 > usage.csv
claumon providers # list data sources
Configuration
Exact Pro/Max token limits are unpublished, so budgets are estimates. Two options:
claumon config --plan max5x # use a preset (pro | max5x | max20x)
claumon config --auto # calibrate to YOUR observed peak session (recommended)
claumon config --threshold 0.5 0.8 0.95 # custom alert thresholds
claumon config --plan custom --five-hour-budget 2000000
--auto scans your full history, finds your busiest-ever 5h session, and sets the budget to that ceiling + 5%. The closer you've come to a real cutoff, the more accurate it gets.
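The calibration rule described for --auto (busiest session plus 5%) reduces to a few lines. This is a minimal sketch that assumes the per-session weighted totals have already been computed; `calibrateBudget` and its `headroom` parameter are hypothetical names.

```typescript
// Budget = busiest observed 5h session total plus headroom (5% per the text).
// Returns null when there is no history to calibrate from.
function calibrateBudget(sessionTotals: number[], headroom = 0.05): number | null {
  if (sessionTotals.length === 0) return null;
  const peak = sessionTotals.reduce((a, b) => Math.max(a, b), 0);
  return Math.round(peak * (1 + headroom));
}
```

With session totals of 1,200,000, 2,000,000, and 800,000 tokens, the calibrated budget comes out at 2,100,000. Note that a budget derived this way is only a lower bound on the real limit unless one of those sessions actually ended in a cutoff.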
Config lives at ~/.claumon/config.json.
Claude Code Plugin
This repo doubles as a Claude Code plugin — usage awareness inside your sessions:
/plugin marketplace add jamestubman/claumon-pro # or a local path
/plugin install claumon@claumon-pro
What you get:
- /claumon:usage — full dashboard rendered in-session
- Session-start summary — one line of usage context when a session opens
- Automatic limit warnings — when you cross 70%/90% or are on pace to hit the limit within 30 min, a warning is injected into Claude's context and Claude switches to token-frugal behavior (concise output, fewer speculative tool calls). Debounced — fires on band changes, not every prompt.
- /claumon:statusline — opt-in status line gauge: ▮▮▮▯▯ 61% · resets 19:48 · ⚠ limit in 47m
- Skill — ask naturally: "how much budget do I have left?", "when does my window reset?"
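The debounced warning behavior in the list above (fire on band changes, not on every prompt) could look roughly like this. The band names, the choice to treat "limit within 30 minutes" as a warn-level band, and the function names are all assumptions made for illustration.

```typescript
type Band = "ok" | "warn" | "critical";

// Map utilization (0..1) and projected minutes-to-limit onto a band.
// The 70%, 90%, and 30-minute thresholds come from the text; folding the
// pace check into the "warn" band is an assumption.
function bandFor(utilization: number, minutesToLimit: number | null): Band {
  if (utilization >= 0.9) return "critical";
  if (utilization >= 0.7) return "warn";
  if (minutesToLimit !== null && minutesToLimit < 30) return "warn";
  return "ok";
}

// Debouncer: returns a band to warn about only when it differs from the
// previously seen band, so repeated prompts inside one band stay silent.
// Dropping back to "ok" updates the state but emits no warning.
function createBandNotifier(): (band: Band) => Band | null {
  let last: Band = "ok";
  return (band: Band): Band | null => {
    if (band === last) return null;
    last = band;
    return band === "ok" ? null : band;
  };
}
```

A caller would create one notifier per session and pass it `bandFor(...)` on each prompt; a non-null result is the cue to inject a warning.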
The plugin runs the bundled zero-dependency CLI (dist/claumon.cjs) — no npm install needed.
Tiers
| | Free | Pro ($6/mo) |
|---|---|---|
| Anthropic (Claude Code) tracking | ✅ | ✅ |
| Forecasts + macOS/terminal alerts | ✅ | ✅ |
| OpenAI + other providers | — | coming soon |
| Slack/email alerts | — | coming soon |
| Export reports | CSV/JSON | scheduled |
Development
npm install
npm test # vitest — parser, window math, forecast
npm run dev # run CLI from source
npm run build # compile to dist/
Notes
- Weighted tokens: input + 5×output + 1.25×cache_write + 0.1×cache_read — approximates the cost basis limits are enforced on, rather than raw counts.
- The 5h window is session-based (opens at your first message after the previous window expires), matching Claude's observed behavior.
- Weekly usage shown as a rolling 7-day sum; Anthropic's actual weekly reset anchor isn't observable from local data.
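The first two notes can be made concrete with a short sketch: one function applying the weighted-token formula, and one assigning messages to session-based 5h windows. The type and function names are hypothetical, and timestamps are assumed to be milliseconds since the epoch in ascending order.

```typescript
interface Usage {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

// Weighted tokens, using the coefficients given in the notes.
function weightedTokens(u: Usage): number {
  return u.input + 5 * u.output + 1.25 * u.cacheWrite + 0.1 * u.cacheRead;
}

const WINDOW_MS = 5 * 60 * 60 * 1000; // 5 hours

// For each message timestamp (ascending), return the start of the window
// it belongs to. A new window opens at the first message that arrives
// after the previous window has expired, rather than on a fixed clock grid.
function windowStarts(timestamps: number[]): number[] {
  const starts: number[] = [];
  let current = Number.NEGATIVE_INFINITY; // no window open yet
  for (const t of timestamps) {
    if (t >= current + WINDOW_MS) current = t;
    starts.push(current);
  }
  return starts;
}
```

Summing `weightedTokens` over the messages that share a window start gives that session's usage, which is the quantity a 5h budget would be compared against.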
