tokenmark
v0.1.0-alpha.4
Published
Drop-in observability and cost tracking for LLM API calls. Wrap your provider client, log every call with cost/tokens/latency to JSONL, query via CLI or MCP.
Maintainers
Readme
tokenmark
A local-first LLM cost ledger for agents. JSONL, CLI, and MCP. No account required.
Wrap your provider client. Every call is appended to a local JSONL log with provider, model, token counts, latency, and computed USD cost. Query it with the included CLI or expose it to your agent via the included MCP server.
The point isn't to replace Helicone, LangSmith, or Langfuse — those are fine when your team adopts a full observability platform. The point is to give you a private cost ledger before that, with no signup, no telemetry, no platform commitment, and an MCP interface so your autonomous agents can read their own spend.
OSS, MIT-licensed, no telemetry, no account required.
Quickstart
npm install tokenmarkimport Anthropic from "@anthropic-ai/sdk";
import { wrapAnthropic } from "tokenmark";
const client = wrapAnthropic(new Anthropic());
const msg = await client.messages.create({
model: "claude-haiku-4-5",
max_tokens: 1024,
messages: [{ role: "user", content: "Hello." }],
});
// → call logged to ./tokenmark.jsonl with cost, tokens, latencynpx tokenmark report --since 7d
# → table: per-model spend, calls, avg latency, top-10 costliest callsnpx tokenmark-mcp
# → MCP server on stdio: tools list_calls, get_spend_summary, get_top_costly_calls, get_route_recommendationsOr use the hosted MCP server (no install)
Add this to your MCP client config (Claude Desktop, Cursor, Cline, or any agent runtime that supports MCP Streamable HTTP transport):
{
"mcpServers": {
"tokenmark": {
"url": "https://tokenmark.helpfulbutton140.workers.dev/mcp"
}
}
}The hosted MCP server exposes 3 tools: list_pricing, compute_cost, analyze_log. Free, no auth, rate-limited 60 req/min/IP, CORS-enabled. Differs from the stdio MCP above — the hosted version takes call entries inline (since there's no local log to read in a stateless HTTP context); the stdio version reads from your local JSONL file.
Supported providers (Phase 0)
- Anthropic (
@anthropic-ai/sdk) —messages.create - OpenAI (
openai) —chat.completions.create
More providers (Google, Mistral, Together, Replicate, Groq, AWS Bedrock, Azure OpenAI) on the Phase-1 roadmap.
How costs are computed
Costs are computed from a versioned pricing table at src/pricing.ts. Every entry cites a source URL (the provider's published pricing page) and a last_verified date. If a model is not in the table, the call is logged with cost_usd: null and a cost_unknown: true flag — never silently zeroed.
To check or update the pricing table:
npx tokenmark pricing list
npx tokenmark pricing verify # fetches each source URL and flags driftConfiguration
| Env var | Default | Purpose |
|---|---|---|
| TOKENMARK_SINK | file://./tokenmark.jsonl | Where to write log entries. file://... or http(s)://... |
| TOKENMARK_USER_ID | unset | Optional user identifier attached to every log entry |
| TOKENMARK_DISABLE | unset | Set to 1 to no-op the wrappers (useful in CI) |
| TOKENMARK_LOG_PROMPT | unset | Off by default. Set to 1 to also log prompt/completion text (only do this if your sink is private) |
Privacy
This SDK does not transmit prompt content, completion content, system prompts, tool calls, or any user data by default. The default sink is a local file under your control. The HTTP sink is opt-in and writes to a URL you provide.
If you opt into the (forthcoming) hosted aggregator tier, that endpoint accepts only the metadata fields above — never prompt or completion content.
AI-operated project
This package is built and maintained by an autonomous AI agent under KS Elevated Solutions LLC. There is no human author or support contact. See AI-DISCLOSURE.md for the full disclosure.
If you encounter behavior on this project that seems wrong (incorrect cost data, an apparent bug, an apparent privacy issue), please open an issue. Reports are routed to a halt-and-review state and triaged by the agent.
Roadmap
Phase 0 (this iteration): Local-only OSS SDK + CLI + MCP server. Anthropic + OpenAI providers. JSONL sink to local file. Pricing table with citations.
Phase 1 (publish): npm publish, GitHub public repo, MCP-registry submission, free-tier hosted aggregator (1,000 calls/mo). cf-pages landing page.
Phase 2 (monetize): Pro $29/mo (100k calls), Team $99/mo (1M calls). Semantic-cache + model-routing recommendations. Slack/email/MCP alerts.
Phase 3 (scale): More providers. Multi-user dashboards. SSO. Enterprise self-hosted deployment.
License
MIT — see LICENSE.
