token-bleed
v0.2.9
See exactly what your Claude Code and Codex sessions cost. Per prompt. Per session. Per project.
Token Bleed
See exactly what Claude Code, Codex, and OpenCode are costing you. Per session. Per project. Per prompt.
Quick start (recommended)
npx token-bleed
Installation
Install permanently
npm install -g token-bleed
token-bleed
Run on login (Mac)
npm install -g token-bleed
token-bleed install
# Token Bleed now starts automatically.
# Open http://localhost:3847 anytime.
Fix log retention (do this once)
token-bleed fix-retention
The problem
Claude Code, Codex, and OpenCode are productive. They're also expensive when you're not watching.
They manage context automatically, fire tool calls in the background, and read files you didn't ask them to read. By the time your bill lands, you have no idea which project burned $40 or which prompt pattern is costing you three times what it should.
Token Bleed fixes that. It reads your local Claude Code, Codex, and OpenCode session data and turns it into a real cost dashboard. No API key, no cloud, no telemetry. Your data never leaves your machine.
What it does
Cost visibility, not estimates
Total spend, daily trends, average session cost, and per-message breakdowns. Filter by time period and agent: Claude Code, Codex, or OpenCode. Costs follow the configured model pricing, including cache write and cache read rates.
When you shipped
Your coding-agent sessions rendered as a contribution-style heatmap, with colors blending by agent based on your daily usage. See your build cadence at a glance, not just what you spent.

Project, session and prompt breakdown
See which projects are burning the most and which models you're actually using. Drill down from project to session to individual message. Every number is traceable.

Session Compare
Pick any two sessions and diff them side by side. Token counts, cost, cache behavior, tool call volume. Useful when you're testing prompt strategies and want to know which approach is actually cheaper, not just which feels faster.

Model Compare
Compare two models across the same workload. Input tokens, output tokens, cache hit rate, total cost. Makes the Opus vs Sonnet decision data instead of instinct.
Cache hit rate tracking
Prompt caching is the biggest lever most builders aren't using correctly. Token Bleed tracks your cache hit rate so you can see whether your workflow is actually taking advantage of it, and by how much.
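As a rough illustration, a cache hit rate can be expressed as the share of prompt tokens served from cache. This README does not say how Token Bleed defines the metric, so the formula and the `cacheHitRate` helper below are assumptions, not the tool's implementation.

```typescript
// Assumed definition: cache-read tokens divided by all prompt-side tokens
// (fresh input + cache writes + cache reads). Hypothetical helper.
function cacheHitRate(input: number, cacheWrite: number, cacheRead: number): number {
  const totalPromptTokens = input + cacheWrite + cacheRead;
  // Avoid dividing by zero for sessions with no recorded prompt tokens.
  return totalPromptTokens === 0 ? 0 : cacheRead / totalPromptTokens;
}

// A session with 100 fresh, 100 cache-write, and 800 cache-read tokens.
const exampleHitRate = cacheHitRate(100, 100, 800);
```

Under this definition, a session that never reads from cache scores 0, which is the kind of case the optimization signals call out.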
Optimization signals
Surfaces patterns in your usage: sessions with no cache hits, high tool call counts, and tasks where you're paying Opus prices without needing Opus.

Connect any model to Claude Code
Token Bleed isn't just a dashboard. It also helps you connect Claude Code to non-Anthropic models. Use the built-in model bridge to run Claude Code against OpenAI, Gemini, or local models via Ollama and LiteLLM, and track their costs in one place.
How it works
Token Bleed reads local usage data from the tools you already run:
| Agent | Local source |
| ----- | ------------ |
| Claude Code | ~/.claude/projects/**/*.jsonl |
| Codex | ~/.codex/sessions/**/*.jsonl |
| OpenCode | ~/.local/share/opencode/opencode.db |
It parses token usage and model info, then computes cost using built-in pricing, custom pricing from Settings, or OpenCode's recorded cost where no Token Bleed pricing is configured.
No network requests. No accounts. Runs at localhost:3847.
Data refreshes from disk every 5 minutes or on demand via the Refresh button.
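To make the parsing step concrete, here is a minimal sketch that sums token usage from the text of a JSONL session file. The record shape (a `message.usage` object with Anthropic-style field names) is an assumption about the log format, not something this README specifies, and `sumUsage` is a hypothetical helper rather than Token Bleed's own code.

```typescript
interface UsageTotals {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

// Hypothetical helper: sum token usage across every line of a JSONL string.
// Assumes each line is one JSON object, some of which carry message.usage.
function sumUsage(jsonl: string): UsageTotals {
  const totals: UsageTotals = { input: 0, output: 0, cacheWrite: 0, cacheRead: 0 };
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let record: any;
    try {
      record = JSON.parse(trimmed);
    } catch {
      continue; // skip truncated or malformed lines
    }
    const usage = record?.message?.usage;
    if (!usage) continue; // not every record carries usage data
    totals.input += usage.input_tokens ?? 0;
    totals.output += usage.output_tokens ?? 0;
    totals.cacheWrite += usage.cache_creation_input_tokens ?? 0;
    totals.cacheRead += usage.cache_read_input_tokens ?? 0;
  }
  return totals;
}
```

Taking the file contents as a string keeps the sketch independent of how the files are discovered and read from disk.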
Models supported
Built-in pricing for Claude and Codex models. OpenCode models and other custom/local models can be priced in Settings; if no custom price exists, OpenCode sessions fall back to the cost recorded by OpenCode. Prefix matching handles future versioned IDs automatically.
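One way prefix matching could work is to pick the longest known pricing key that a model ID starts with. The longest-match rule and the `matchModelPrefix` helper are assumptions made for illustration; the README does not describe the actual matching logic.

```typescript
// Pricing keys taken from the tables in this README.
const knownPrefixes = ["claude-3-5-sonnet", "claude-3-opus", "gpt-5.4", "gpt-5.4-mini"];

// Hypothetical helper: return the longest known prefix of modelId,
// or undefined when no key matches.
function matchModelPrefix(modelId: string, prefixes: string[]): string | undefined {
  let best: string | undefined;
  for (const prefix of prefixes) {
    if (modelId.startsWith(prefix) && (best === undefined || prefix.length > best.length)) {
      best = prefix;
    }
  }
  return best;
}

// A dated ID resolves to its base key; the "-mini" key wins over the shorter "gpt-5.4".
const datedMatch = matchModelPrefix("claude-3-5-sonnet-20241022", knownPrefixes);
const miniMatch = matchModelPrefix("gpt-5.4-mini-preview", knownPrefixes);
```

Preferring the longest match matters when one key is a prefix of another, as with gpt-5.4 and gpt-5.4-mini.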
Claude (Claude Code)
| Model | Input | Output | Cache Write | Cache Read |
| ----------------- | ----- | ------ | ----------- | ---------- |
| claude-opus-4-7 | $15 | $75 | $18.75 | $1.50 |
| claude-sonnet-4-6 | $3 | $15 | $3.75 | $0.30 |
| claude-haiku-4-5 | $0.80 | $4 | $1.00 | $0.08 |
| claude-3-5-sonnet | $3 | $15 | $3.75 | $0.30 |
| claude-3-5-haiku | $0.80 | $4 | $1.00 | $0.08 |
| claude-3-opus | $15 | $75 | $18.75 | $1.50 |
| claude-3-haiku | $0.25 | $1.25 | $0.30 | $0.03 |
OpenAI (Codex)
| Model | Input | Output | Cache Read |
| ------------ | ----- | ------ | ---------- |
| gpt-5.4 | $2.50 | $15 | $0.25 |
| gpt-5.5 | $5 | $30 | $0.50 |
| gpt-5.4-mini | $0.75 | $4.50 | $0.075 |
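The arithmetic behind a session cost can be sketched as follows. The pricing tables do not state units, so this assumes the rates are USD per million tokens; `costUsd` and the sample token counts are hypothetical, and only the claude-sonnet-4-6 rates come from the Claude table.

```typescript
interface ModelRates {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

interface TokenCounts {
  input: number;
  output: number;
  cacheWrite: number;
  cacheRead: number;
}

// claude-sonnet-4-6 rates from the table, assumed to be USD per million tokens.
const sonnetRates: ModelRates = { input: 3, output: 15, cacheWrite: 3.75, cacheRead: 0.3 };

// Hypothetical helper: multiply each token bucket by its rate, then scale to dollars.
function costUsd(tokens: TokenCounts, rates: ModelRates): number {
  const weighted =
    tokens.input * rates.input +
    tokens.output * rates.output +
    tokens.cacheWrite * rates.cacheWrite +
    tokens.cacheRead * rates.cacheRead;
  return weighted / 1_000_000;
}

// Illustrative session: 3.00 + 3.00 + 1.50 + 0.60, roughly $8.10 in total.
const sampleCost = costUsd(
  { input: 1_000_000, output: 200_000, cacheWrite: 400_000, cacheRead: 2_000_000 },
  sonnetRates,
);
```

In this example the two million cache-read tokens cost $0.60, where the same tokens billed as fresh input would cost $6.00.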
Custom Models & Pricing
You can add your own pricing for any custom model in the Settings tab. This allows you to track costs for Gemini, local models, or any other provider with the same precision as built-in models.
By default, local and custom models show usage data but report $0 cost until their pricing is configured.
Local model quirks
Token Bleed works with any model Claude Code, Codex, or OpenCode connects to, including local models via Ollama or similar.
One thing to know: local model servers do not implement prompt caching, so they report the full conversation context as input_tokens on every turn instead of incremental deltas. This means input token totals for local model sessions will be significantly higher than equivalent Claude sessions and are not directly comparable. Session Compare and Model Compare flag this when a local model is present.
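A simplified model shows why the totals diverge: if every turn re-reports the whole conversation as input, the summed input grows much faster than the number of new tokens. The `reportedInputTotal` helper and the per-turn numbers are hypothetical and ignore details such as system prompts.

```typescript
// contextGrowth[i] = tokens added to the conversation on turn i (simplified model).
// Returns the summed input_tokens when each turn reports the full context so far.
function reportedInputTotal(contextGrowth: number[]): number {
  let context = 0;
  let total = 0;
  for (const added of contextGrowth) {
    context += added; // the conversation grows by the new tokens
    total += context; // but the whole context is reported as input again
  }
  return total;
}

// Three turns adding 1000, 500, and 500 tokens: reported as 1000 + 1500 + 2000.
const growth = [1000, 500, 500];
const reportedTotal = reportedInputTotal(growth); // 4500
const newTokens = growth.reduce((sum, n) => sum + n, 0); // 2000
```

Here 2,000 tokens of actual conversation show up as 4,500 input tokens, and the gap widens with every additional turn.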
API
The server exposes a REST API if you want to build on top of it.
| Method | Path | Description |
| ------ | ---------------------------- | ------------------------------------------------------------- |
| GET | /api/stats | Global totals and summary |
| GET | /api/projects | Per-project cost and usage |
| GET | /api/sessions | Paginated session list (filterable by source, project, model) |
| GET | /api/sessions/:id | Single session detail |
| GET | /api/sessions/:id/messages | Per-message breakdown for a session |
| GET | /api/models | Per-model aggregated stats |
| GET | /api/models/comparison | Side-by-side stats for two models |
| GET | /api/daily | Daily cost and activity over time |
| GET | /api/meta | Date range and cleanup period from your Claude settings |
| GET | /api/refresh | Invalidate the in-memory cache |
| POST | /api/settings | Update cleanupPeriodDays in ~/.claude/settings.json |
All list endpoints accept a ?since=YYYY-MM-DD query param to filter by date. Endpoints that return sessions, projects, prompts, models, stats, or daily activity also accept ?source=claude, ?source=codex, ?source=opencode, or comma-separated combinations.
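A small helper for composing these query strings might look like the sketch below. `buildApiUrl` is a hypothetical function, not part of Token Bleed; it assumes the default address localhost:3847 and only uses the `since` and `source` parameters described above.

```typescript
type AgentSource = "claude" | "codex" | "opencode";

interface QueryOptions {
  since?: string; // YYYY-MM-DD
  sources?: AgentSource[]; // joined with commas, as the API accepts
}

// Hypothetical helper: build a URL for one of the documented endpoints.
function buildApiUrl(path: string, opts: QueryOptions = {}): string {
  const params: string[] = [];
  if (opts.since) params.push(`since=${encodeURIComponent(opts.since)}`);
  if (opts.sources && opts.sources.length > 0) params.push(`source=${opts.sources.join(",")}`);
  const query = params.length > 0 ? `?${params.join("&")}` : "";
  return `http://localhost:3847${path}${query}`;
}

// Sessions from Claude Code and Codex since the start of 2025.
const sessionsUrl = buildApiUrl("/api/sessions", {
  since: "2025-01-01",
  sources: ["claude", "codex"],
});
```

The resulting URL can be passed to any HTTP client while the server is running. The response shape is not documented in this README, so inspect the JSON before relying on specific fields.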
Stack
- Runtime: Node.js 18+
- Server: Fastify
- Frontend: Vanilla TypeScript, no framework
License
MIT. Build with it, fork it, ship it.
Built by Richard Sylvester · AI Revenue Club
