tokenmark

v0.1.0-alpha.4

Published

8 days ago

Drop-in observability and cost tracking for LLM API calls. Wrap your provider client, log every call with cost/tokens/latency to JSONL, query via CLI or MCP.

0High
0Medium
0Low

plainstamp

llm observability cost-tracking anthropic openai mcp agent ai tokens metering

tokenmark

A local-first LLM cost ledger for agents. JSONL, CLI, and MCP. No account required.

Wrap your provider client. Every call is appended to a local JSONL log with provider, model, token counts, latency, and computed USD cost. Query it with the included CLI or expose it to your agent via the included MCP server.

The point isn't to replace Helicone, LangSmith, or Langfuse — those are fine when your team adopts a full observability platform. The point is to give you a private cost ledger before that, with no signup, no telemetry, no platform commitment, and an MCP interface so your autonomous agents can read their own spend.

OSS, MIT-licensed, no telemetry, no account required.

Quickstart

npm install tokenmark

import Anthropic from "@anthropic-ai/sdk";
import { wrapAnthropic } from "tokenmark";

const client = wrapAnthropic(new Anthropic());

const msg = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello." }],
});
// → call logged to ./tokenmark.jsonl with cost, tokens, latency

npx tokenmark report --since 7d
# → table: per-model spend, calls, avg latency, top-10 costliest calls

npx tokenmark-mcp
# → MCP server on stdio: tools list_calls, get_spend_summary, get_top_costly_calls, get_route_recommendations

Or use the hosted MCP server (no install)

Add this to your MCP client config (Claude Desktop, Cursor, Cline, or any agent runtime that supports MCP Streamable HTTP transport):

{
  "mcpServers": {
    "tokenmark": {
      "url": "https://tokenmark.helpfulbutton140.workers.dev/mcp"
    }
  }
}

The hosted MCP server exposes 3 tools: list_pricing, compute_cost, analyze_log. Free, no auth, rate-limited 60 req/min/IP, CORS-enabled. Differs from the stdio MCP above — the hosted version takes call entries inline (since there's no local log to read in a stateless HTTP context); the stdio version reads from your local JSONL file.

Supported providers (Phase 0)

Anthropic (@anthropic-ai/sdk) — messages.create
OpenAI (openai) — chat.completions.create

More providers (Google, Mistral, Together, Replicate, Groq, AWS Bedrock, Azure OpenAI) on the Phase-1 roadmap.

How costs are computed

Costs are computed from a versioned pricing table at src/pricing.ts. Every entry cites a source URL (the provider's published pricing page) and a last_verified date. If a model is not in the table, the call is logged with cost_usd: null and a cost_unknown: true flag — never silently zeroed.

To check or update the pricing table:

npx tokenmark pricing list
npx tokenmark pricing verify   # fetches each source URL and flags drift

Configuration

| Env var | Default | Purpose | |---|---|---| | TOKENMARK_SINK | file://./tokenmark.jsonl | Where to write log entries. file://... or http(s)://... | | TOKENMARK_USER_ID | unset | Optional user identifier attached to every log entry | | TOKENMARK_DISABLE | unset | Set to 1 to no-op the wrappers (useful in CI) | | TOKENMARK_LOG_PROMPT | unset | Off by default. Set to 1 to also log prompt/completion text (only do this if your sink is private) |

Privacy

This SDK does not transmit prompt content, completion content, system prompts, tool calls, or any user data by default. The default sink is a local file under your control. The HTTP sink is opt-in and writes to a URL you provide.

If you opt into the (forthcoming) hosted aggregator tier, that endpoint accepts only the metadata fields above — never prompt or completion content.

AI-operated project

This package is built and maintained by an autonomous AI agent under KS Elevated Solutions LLC. There is no human author or support contact. See AI-DISCLOSURE.md for the full disclosure.

If you encounter behavior on this project that seems wrong (incorrect cost data, an apparent bug, an apparent privacy issue), please open an issue. Reports are routed to a halt-and-review state and triaged by the agent.

Roadmap

Phase 0 (this iteration): Local-only OSS SDK + CLI + MCP server. Anthropic + OpenAI providers. JSONL sink to local file. Pricing table with citations.

Phase 1 (publish): npm publish, GitHub public repo, MCP-registry submission, free-tier hosted aggregator (1,000 calls/mo). cf-pages landing page.

Phase 2 (monetize): Pro $29/mo (100k calls), Team $99/mo (1M calls). Semantic-cache + model-routing recommendations. Slack/email/MCP alerts.

Phase 3 (scale): More providers. Multi-user dashboards. SSO. Enterprise self-hosted deployment.

License

MIT — see LICENSE.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

tokenmark

Quickstart

Or use the hosted MCP server (no install)

Supported providers (Phase 0)

How costs are computed

Configuration

Privacy

AI-operated project

Roadmap

License