# TOON — Token-Oriented Object Notation
TOON is a serialization format for LLM context windows. It replaces JSON with a compact, indentation-aware notation that cuts 30–57% of tokens from structured data — no information lost, no fine-tuning required.
## The Problem
You're building an LLM-powered feature. You serialize some data and stuff it into a prompt:
```js
const context = JSON.stringify(records) // 18,471 tokens
```

That's expensive. JSON was designed for machines to parse, not for transformer attention. Every `"key":` is repeated on every row. Every `{`, `}`, `[`, `]` is a token that carries almost no information. At scale — 200 log events, 50 API responses, a deep config object — this bloats your context window and your bill.
TOON fixes this at the serialization layer:
```js
import { stringify } from '@alxmss/toon'

const context = stringify(records) // 9,985 tokens — 45.9% less
```

Same data. Same LLM. No prompt engineering. Just fewer tokens.
## How It Works
TOON uses three core compression mechanisms:
**1. HRV (Header-Row-Value)** — for arrays of uniform objects, keys are declared once in a header row instead of being repeated on every item:

```
// JSON: 627 tokens
[{"method":"GET","path":"/users","auth":true},{"method":"POST","path":"/users","auth":true},...]

// TOON: 334 tokens (-46.7%)
endpoints[len:3]:
  # method | path | auth
  > GET | /users | true
  > POST | /users | true
  > GET | /health | false
```

**2. Dot-path compression** — single-child object chains collapse to a flat path:

```
// JSON: 88 tokens
{"db":{"primary":{"host":{"address":"10.0.0.1"}}}}

// TOON: 38 tokens (-56.8%)
db.primary.host.address: 10.0.0.1
```

**3. Quote elision** — strings that can't be ambiguous (IPs, semver, URL paths, bare words) are never quoted: `3.2.1`, `10.0.0.1`, and `/users/{id}` stay as-is.
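From the API side, the same HRV mechanism is a one-liner; a minimal sketch, with the expected output copied from the HRV example above:

```ts
import { stringify } from '@alxmss/toon'

const endpoints = [
  { method: 'GET',  path: '/users',  auth: true },
  { method: 'POST', path: '/users',  auth: true },
  { method: 'GET',  path: '/health', auth: false },
]

// Top-level values must be plain objects, so wrap the array in a key.
console.log(stringify({ endpoints }))
// endpoints[len:3]:
//   # method | path | auth
//   > GET | /users | true
//   > POST | /users | true
//   > GET | /health | false
```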
## TOON + RTK: Two Layers, Zero Overlap
If you use RTK (the token-saving command wrapper for Claude Code), the two tools work at different layers — one does not replace the other.
| Layer | Tool | What it compresses | When |
|-------|------|--------------------|------|
| Development | RTK | Shell command output entering Claude Code's context | `rtk vitest`, `rtk git diff`, `rtk tsc` |
| Runtime | TOON | Structured data sent to your LLM in production | `stringify(data)` in your app |
RTK keeps your dev loop lean. TOON keeps your users' API calls cheap. Stack both for a two-layer token budget.
```
Development                          Runtime
───────────────────────────────      ──────────────────────────────────────
rtk vitest   → Claude Code           stringify(payload) → your LLM API call
rtk git diff → Claude Code           parse(llmOutput)   ← LLM response
rtk tsc      → Claude Code
```

## Installation
```bash
npm i @alxmss/toon
```

One-time Claude Code setup (teaches Claude to use TOON automatically in this project):
```bash
npx @alxmss/toon init
```

This writes a TOON conventions block into your project's `CLAUDE.md`. From that point on, Claude Code uses `stringify()` and `TOON_SYSTEM_PROMPT` whenever it writes a feature that sends data to an LLM.
## Quick Start
### Serialize data for an LLM
```js
import Anthropic from '@anthropic-ai/sdk'
import { stringify, TOON_SYSTEM_PROMPT } from '@alxmss/toon'

const anthropic = new Anthropic() // reads ANTHROPIC_API_KEY from env

const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5', // any Claude model id
  max_tokens: 1024,
  system: myInstructions + '\n\n' + TOON_SYSTEM_PROMPT,
  messages: [{
    role: 'user',
    content: stringify(data), // not JSON.stringify
  }],
})
```

`TOON_SYSTEM_PROMPT` (~280 tokens) teaches the model to read and emit TOON. Use `TOON_SYSTEM_PROMPT_COMPACT` (~90 tokens) for follow-up calls once the model is context-trained.
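A follow-up call might then look like this; a sketch continuing the snippet above (`priorTurns` and `nextQuestion` are placeholders, not package exports):

```ts
import { TOON_SYSTEM_PROMPT_COMPACT } from '@alxmss/toon'

// Once the model has seen the full prompt, the ~90-token compact
// variant is enough on subsequent turns.
const followUp = await anthropic.messages.create({
  model: 'claude-sonnet-4-5', // placeholder model id
  max_tokens: 1024,
  system: myInstructions + '\n\n' + TOON_SYSTEM_PROMPT_COMPACT,
  messages: [...priorTurns, { role: 'user', content: nextQuestion }],
})
```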
### Parse LLM output back to an object
```js
import { parse } from '@alxmss/toon'

const result = parse(llmOutput)
// → plain JS object, round-trip exact
```

### Validate before parsing
```js
import { lint } from '@alxmss/toon'

const issues = lint(toonString)
// [{ severity: 'error', line: 4, column: 1, message: '...' }]
// Empty array = structurally valid
```
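A common pattern is to gate `parse()` behind `lint()`; a sketch (the `safeParse` helper is illustrative, not a package export):

```ts
import { lint, parse } from '@alxmss/toon'

// Illustrative helper: reject structurally invalid TOON before parsing.
function safeParse(toon: string): Record<string, unknown> | null {
  const issues = lint(toon)
  if (issues.some((i) => i.severity === 'error')) {
    for (const i of issues) console.warn(`${i.line}:${i.column} ${i.message}`)
    return null
  }
  return parse(toon)
}
```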
### Measure savings on your own data

```bash
npx @alxmss/toon check data.json
```

```
TOON Density Report — data.json
────────────────────────────────────────────────────────────
Metric                       JSON       TOON       Delta
────────────────────────────────────────────────────────────
Bytes                        56792      27334      -51.9%
Tokens (cl100k_base)         18471      9985       -45.9%
────────────────────────────────────────────────────────────
Fits in window (128k)        10×        20×        ×2.00
────────────────────────────────────────────────────────────
Density Score                45.9% reduction
[█████████░░░░░░░░░░░] 45.9%

Context Expansion Factor: 2.00× — TOON fits 2.00× more data in the same window
```

## When to Use TOON
TOON's compression is structural — it eliminates repeated key names and punctuation, not values. The gain scales with schema uniformity.
### Maximum benefit (40–57%)
| Use Case | Typical Saving | Why |
|----------|----------------|-----|
| Observability pipelines (CloudWatch, Datadog, Loki) | ~46% | Log events are the most uniform data in existence → HRV |
| GitHub / REST API responses (repos, issues, PRs) | ~46% | Repeated field names across paginated records → HRV |
| Infrastructure config (k8s, Terraform) | ~57% | Long single-child chains → dot-path compression |
| RAG pipelines with structured records | 35–47% | DB rows, product catalogs, CRM contacts → HRV |
| Agentic tool schemas / endpoint inventories | 35–45% | Repeated schema fields across tool definitions → HRV |
### Good benefit (30–40%)
| Use Case | Typical Saving | Why |
|----------|----------------|-----|
| CI bots and PR review agents (mixed payloads) | ~35% | Flat KV metadata + tabular arrays → dot-path + HRV |
| LLM data transformation (validate / enrich / classify) | 30–45% | Uniform input records → savings on both request and response |
### Diminishing returns
- Prose documents (articles, emails, legal text) — no structural repetition to eliminate
- Tiny payloads (< 50 tokens) — `[len:N]` anchor overhead isn't amortized
- Highly irregular arrays — falls back to block format, still ~20–30% savings
### Not a fit
- Human-edited config files — YAML/TOML are more ergonomic to write
- Binary or streaming data
- Top-level arrays — wrap in `{ items: [...] }` first
## Why the Savings Compound
- Lower API cost — input tokens are priced per token; 46% fewer tokens = 46% less on that payload
- More data per window — a 200k Claude window holds 1.85× more complete records in TOON than JSON
- Faster time-to-first-token — smaller prompts start streaming sooner
- Fewer RAG round-trips — fitting more records per call reduces retrieval calls per session
For a pipeline processing 1M CloudWatch events/day, the measured 45.9% reduction translates to ~42M tokens saved — before latency improvements.
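The arithmetic behind that figure, using the stress-test numbers reported below:

```ts
// 200 CloudWatch events = 18,471 JSON tokens (see Measured Results)
const tokensPerEvent = 18_471 / 200                 // ≈ 92.4
const dailyJsonTokens = tokensPerEvent * 1_000_000  // ≈ 92.4M tokens/day
const dailySaved = dailyJsonTokens * 0.459          // ≈ 42.4M tokens/day saved
```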
## Measured Results
### Synthetic fixtures (validated on every `npm test`)
| Shape | JSON Tokens | TOON Tokens | Savings |
|-------|-------------|-------------|---------|
| 12-row uniform table | 627 | 334 | 46.7% |
| 6-row sparse table (1 optional col) | 294 | 196 | 33.3% |
| Mixed document (KV + HRV + dot-path) | 415 | 271 | 34.7% |
| Deeply nested config (3 levels) | 88 | 38 | 56.8% |
| Non-uniform block array | 111 | 77 | 30.6% |
### Real-world stress test (`scripts/stress-test.ts`)
| Fixture | JSON Tokens | TOON Tokens | Reduction | 200k window CEF |
|---------|-------------|-------------|-----------|-----------------|
| CloudWatch logs — 200 events × 9 fields | 18,471 | 9,985 | 45.9% | 2.00× |
| GitHub repos — 50 repos × 12 fields | 5,457 | 2,919 | 46.5% | 1.89× |
CEF = Context Expansion Factor: how many more complete documents fit in the same window.
Tokenizer: `cl100k_base` (same encoder as GPT-4). Run `node_modules/.bin/tsx scripts/stress-test.ts` to reproduce.
### LLM handshake verification
TOON_SYSTEM_PROMPT was verified against claude-sonnet-4-6 on a 12-row HRV CloudWatch log with a three-part reasoning task. Score: 5/5 — correct on highest latency value, timestamp, most-errored user, error count, and root-cause pattern. The TOON payload used ~180 tokens vs ~420 for equivalent JSON — 57% less on the reasoning task itself.
```bash
ANTHROPIC_API_KEY=sk-... node_modules/.bin/tsx scripts/handshake-test.ts
```

## API Reference
### `stringify(value, options?)`
```ts
stringify(value: Record<string, unknown>, options?: {
  indent?: 2 | 4          // default: 2
  dotPath?: boolean       // default: true — compress single-child chains
  sizeHints?: boolean     // default: true — emit [len:N] anchors
  hrvThreshold?: number   // default: 0.5 — max extra/base key ratio for sparse HRV
}): string
```

Throws `TypeError` if the top-level value is not a plain object. Throws `ToonSerializationError` on circular references.
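As a usage sketch of the options above (`config` is a placeholder object):

```ts
import { stringify } from '@alxmss/toon'

// Plain key-per-line output: no dot-path collapsing, no [len:N] anchors.
const out = stringify(config, { dotPath: false, sizeHints: false })
```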
### `parse(input, options?)`
```ts
parse(input: string, options?: {
  validateHints?: boolean // default: true — throw on [len:N] length mismatch
}): Record<string, unknown>
```

Throws `ToonParseError` (with `.line`, `.column`, `.suggestion`) on violations.
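A handling sketch, assuming `ToonParseError` is exported alongside `parse` (the docs name the class, but its export is an assumption here):

```ts
import { parse, ToonParseError } from '@alxmss/toon'

try {
  const data = parse(llmOutput)
} catch (err) {
  // Assumes ToonParseError is a public export.
  if (err instanceof ToonParseError) {
    console.error(`parse failed at ${err.line}:${err.column}: ${err.message}`)
    if (err.suggestion) console.error(`suggestion: ${err.suggestion}`)
  } else {
    throw err
  }
}
```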
### `lint(input)`
```ts
lint(input: string): Issue[]
// Issue: { severity: 'error' | 'warning', line: number, column: number, message: string }
```

Never throws. Returns all structural issues in one pass — use before `parse()` in CI or editor integrations.
### `TOON_SYSTEM_PROMPT` / `TOON_SYSTEM_PROMPT_COMPACT`
Pre-written system prompt snippets. Full (~280 tokens) for first-time integration; compact (~90 tokens) for follow-up calls.
## CLI
```bash
# One-time project setup
npx @alxmss/toon init              # writes TOON block to ./CLAUDE.md
npx @alxmss/toon init --global     # writes to ~/.claude/CLAUDE.md

# Measure token savings on any JSON file
toon check data.json
toon check data.json --window 32000   # custom context window
toon check data.json --toon           # also print the TOON output
```

## Format Reference
### Sigils
| Sigil | Role | Example |
|-------|------|---------|
| `key: value` | Key-value pair | `name: Alice` |
| `a.b.c: value` | Dot-path (single-child chain) | `db.host: localhost` |
| `[len:N]` | Size hint / structural anchor | `users[len:3]:` |
| `# col1 \| col2` | HRV header | `# id \| name \| role?` |
| `> v1 \| v2` | HRV data row | `> 1 \| Alice \| admin` |
| `- key: value` | Block array item | `- method: GET` |
| `~` | Null / absent | `latency: ~` |
| `col?` | Optional HRV column | `# id \| note?` |
| `//` | Line comment | `// deprecated` |
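Putting the sigils together, a small illustrative document (hypothetical data; constructs taken only from the table above):

```
// deployment snapshot (illustrative)
db.primary.host: 10.0.0.1
replicas[len:2]:
  # id | name | note?
  > 1 | alpha | ~
  > 2 | beta | standby
checks[len:1]:
  - method: GET
```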
### Type inference (first match wins)
```
~ or null        → null
true / false     → boolean
bare integer     → int
bare float       → float
"…"              → string
[…]              → inline array
{…}              → inline object
anything else    → bare string (IPs, semver, URL paths never need quotes)
```
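A sketch of how these rules land in parsed output; the input lines are borrowed from earlier examples, and the nested expansion of dot-paths is assumed from the round-trip guarantee:

```ts
import { parse } from '@alxmss/toon'

const toon = [
  'db.primary.host.address: 10.0.0.1', // bare string (IP, never quoted)
  'latency: ~',                        // ~ → null
  'retries: 3',                        // bare integer → number
].join('\n')

console.log(parse(toon))
// → { db: { primary: { host: { address: '10.0.0.1' } } }, latency: null, retries: 3 }
```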
### Array tier selection

```
All scalars                       → key[len:N]: a, b, c
All objects, no optional keys     → HRV-uniform (# / > rows)
Objects with ≤50% optional keys   → HRV-sparse (col? columns, ~ for absent)
Otherwise                         → Block array (- items)
```
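For instance, an all-scalar array should take the first tier's single-line form; a sketch:

```ts
import { stringify } from '@alxmss/toon'

// All-scalar arrays collapse to a single line per the first tier above.
console.log(stringify({ tags: ['alpha', 'beta', 'gamma'] }))
// tags[len:3]: alpha, beta, gamma
```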
## Spec

Formal EBNF grammar: `spec/GRAMMAR.md`
Known limitations:

- Block string `|` is parse-only (stringify uses quoted strings instead)
- `NaN`/`Infinity` serialize as quoted strings
- Top-level arrays are not supported — wrap in `{ items: [...] }`
