
@alxmss/toon

v1.0.3

Token-Oriented Object Notation — high-density serialization for LLM context windows

TOON — Token-Oriented Object Notation

TOON is a serialization format for LLM context windows. It replaces JSON with a compact, indentation-aware notation that cuts 30–57% of tokens from structured data — no information lost, no fine-tuning required.


The Problem

You're building an LLM-powered feature. You serialize some data and stuff it into a prompt:

const context = JSON.stringify(records)  // 18,471 tokens

That's expensive. JSON was designed for machines to parse, not for transformer attention. Every "key": is repeated on every row. Every {, }, [, ] is a token that carries no information density. At scale — 200 log events, 50 API responses, a deep config object — this bloats your context window and your bill.

TOON fixes this at the serialization layer:

import { stringify } from '@alxmss/toon'
const context = stringify(records)  // 9,985 tokens — 45.9% less

Same data. Same LLM. No prompt engineering. Just fewer tokens.


How It Works

TOON uses three core compression mechanisms:

1. HRV (Header-Row-Value) — for arrays of uniform objects, keys are declared once in a header row instead of repeated on every item:

// JSON: 627 tokens
[{"method":"GET","path":"/users","auth":true},{"method":"POST","path":"/users","auth":true},...]

// TOON: 334 tokens (-46.7%)
endpoints[len:3]:
  # method | path    | auth
  > GET    | /users  | true
  > POST   | /users  | true
  > GET    | /health | false

2. Dot-path compression — single-child object chains collapse to a flat path:

// JSON: 88 tokens
{"db":{"primary":{"host":{"address":"10.0.0.1"}}}}

// TOON: 38 tokens (-56.8%)
db.primary.host.address: 10.0.0.1

3. Quote elision — strings that can't be ambiguous (IPs, semver, URL paths, bare words) are never quoted. 3.2.1, 10.0.0.1, /users/{id} stay as-is.
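The dot-path rule from mechanism 2 can be sketched in a few lines of TypeScript. This is an illustrative helper, not the library's internals: it follows single-child object chains and joins their keys with dots.

```typescript
// Illustrative sketch of the single-child collapse rule (not the
// library's actual implementation).
function collapseChain(key: string, value: unknown): [string, unknown] {
  let path = key
  let v = value
  // Keep descending while the current value is an object with exactly one key.
  while (
    v !== null && typeof v === 'object' && !Array.isArray(v) &&
    Object.keys(v).length === 1
  ) {
    const childKey = Object.keys(v)[0]
    path = `${path}.${childKey}`
    v = (v as Record<string, unknown>)[childKey]
  }
  return [path, v]
}

collapseChain('db', { primary: { host: { address: '10.0.0.1' } } })
// → ['db.primary.host.address', '10.0.0.1']
```

Objects with more than one key stop the descent, which is why only genuine single-child chains collapse.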


TOON + RTK: Two Layers, Zero Overlap

If you use RTK (the token-saving command wrapper for Claude Code), these tools work at different layers — one does not replace the other.

| Layer | Tool | What it compresses | When |
|-------|------|--------------------|------|
| Development | RTK | Shell command output entering Claude Code's context | rtk vitest, rtk git diff, rtk tsc |
| Runtime | TOON | Structured data sent to your LLM in production | stringify(data) in your app |

RTK keeps your dev loop lean. TOON keeps your users' API calls cheap. Stack both for a two-layer token budget.

Development                        Runtime
───────────────────────────────    ──────────────────────────────────────
rtk vitest      → Claude Code      stringify(payload) → your LLM API call
rtk git diff    → Claude Code      parse(llmOutput)   ← LLM response
rtk tsc         → Claude Code

Installation

npm i @alxmss/toon

One-time Claude Code setup (teaches Claude to use TOON automatically in this project):

npx @alxmss/toon init

This writes a TOON conventions block into your project's CLAUDE.md. From that point on, Claude Code uses stringify() and TOON_SYSTEM_PROMPT whenever it writes a feature that sends data to an LLM.


Quick Start

Serialize data for an LLM

import { stringify, TOON_SYSTEM_PROMPT } from '@alxmss/toon'

const response = await anthropic.messages.create({
  system: myInstructions + '\n\n' + TOON_SYSTEM_PROMPT,
  messages: [{
    role: 'user',
    content: stringify(data),   // not JSON.stringify
  }],
})

TOON_SYSTEM_PROMPT (~280 tokens) teaches the model to read and emit TOON. Use TOON_SYSTEM_PROMPT_COMPACT (~90 tokens) for follow-up calls once the model is context-trained.

Parse LLM output back to an object

import { parse } from '@alxmss/toon'

const result = parse(llmOutput)
// → plain JS object, round-trip exact

Validate before parsing

import { lint } from '@alxmss/toon'

const issues = lint(toonString)
// [{ severity: 'error', line: 4, column: 1, message: '...' }]
// Empty array = structurally valid

Measure savings on your own data

npx @alxmss/toon check data.json
  TOON Density Report — data.json
  ────────────────────────────────────────────────────────────
  Metric                              JSON      TOON     Delta
  ────────────────────────────────────────────────────────────
  Bytes                              56792     27334    -51.9%
  Tokens (cl100k_base)               18471      9985    -45.9%
  ────────────────────────────────────────────────────────────
  Fits in window (128k)                10×       20×      ×2.00
  ────────────────────────────────────────────────────────────

  Density Score  45.9% reduction
  [█████████░░░░░░░░░░░] 45.9%

  Context Expansion Factor: 2.00× — TOON fits 2.00× more data in the same window

When to Use TOON

TOON's compression is structural — it eliminates repeated key names and punctuation, not values. The gain scales with schema uniformity.

Maximum benefit (40–57%)

| Use Case | Typical Saving | Why |
|----------|----------------|-----|
| Observability pipelines (CloudWatch, Datadog, Loki) | ~46% | Log events are the most uniform data in existence → HRV |
| GitHub / REST API responses (repos, issues, PRs) | ~46% | Repeated field names across paginated records → HRV |
| Infrastructure config (k8s, Terraform) | ~57% | Long single-child chains → dot-path compression |
| RAG pipelines with structured records | 35–47% | DB rows, product catalogs, CRM contacts → HRV |
| Agentic tool schemas / endpoint inventories | 35–45% | Repeated schema fields across tool definitions → HRV |

Good benefit (30–40%)

| Use Case | Typical Saving | Why |
|----------|----------------|-----|
| CI bots and PR review agents (mixed payloads) | ~35% | Flat KV metadata + tabular arrays → dot-path + HRV |
| LLM data transformation (validate / enrich / classify) | 30–45% | Uniform input records → savings on both request and response |

Diminishing returns

  • Prose documents (articles, emails, legal text) — no structural repetition to eliminate
  • Tiny payloads (< 50 tokens) — [len:N] anchor overhead isn't amortized
  • Highly irregular arrays — falls back to block format, still ~20–30% savings

Not a fit

  • Human-edited config files — YAML/TOML are more ergonomic to write
  • Binary or streaming data
  • Top-level arrays — wrap in { items: [...] } first

Why the Savings Compound

  • Lower API cost — input tokens are priced per token; 46% fewer tokens = 46% less on that payload
  • More data per window — a 200k Claude window holds 1.85× more complete records in TOON than JSON
  • Faster time-to-first-token — smaller prompts start streaming sooner
  • Fewer RAG round-trips — fitting more records per call reduces retrieval calls per session

For a pipeline processing 1M CloudWatch events/day, the measured 45.9% reduction translates to ~42M tokens saved — before latency improvements.
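The ~42M figure follows directly from the measured fixture: 18,471 JSON tokens per 200 events scales to roughly 92.4M tokens per day at 1M events, and a 45.9% reduction on that is about 42.4M tokens. A quick back-of-the-envelope check:

```typescript
// Back-of-the-envelope check on the pipeline figure above.
const jsonTokensPerFixture = 18_471   // measured: 200 CloudWatch events in JSON
const eventsPerFixture = 200
const eventsPerDay = 1_000_000
const reduction = 0.459               // measured TOON reduction

const jsonTokensPerDay = (jsonTokensPerFixture / eventsPerFixture) * eventsPerDay
const tokensSavedPerDay = jsonTokensPerDay * reduction
// jsonTokensPerDay ≈ 92.4M, tokensSavedPerDay ≈ 42.4M
```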


Measured Results

Synthetic fixtures (validated on every npm test)

| Shape | JSON Tokens | TOON Tokens | Savings |
|-------|-------------|-------------|---------|
| 12-row uniform table | 627 | 334 | 46.7% |
| 6-row sparse table (1 optional col) | 294 | 196 | 33.3% |
| Mixed document (KV + HRV + dot-path) | 415 | 271 | 34.7% |
| Deeply nested config (3 levels) | 88 | 38 | 56.8% |
| Non-uniform block array | 111 | 77 | 30.6% |

Real-world stress test (scripts/stress-test.ts)

| Fixture | JSON Tokens | TOON Tokens | Reduction | 200k window CEF |
|---------|-------------|-------------|-----------|-----------------|
| CloudWatch logs — 200 events × 9 fields | 18,471 | 9,985 | 45.9% | 2.00× |
| GitHub repos — 50 repos × 12 fields | 5,457 | 2,919 | 46.5% | 1.89× |

CEF = Context Expansion Factor: how many more complete documents fit in the same window.
Tokenizer: cl100k_base (same encoder as GPT-4). Run node_modules/.bin/tsx scripts/stress-test.ts to reproduce.

LLM handshake verification

TOON_SYSTEM_PROMPT was verified against claude-sonnet-4-6 on a 12-row HRV CloudWatch log with a three-part reasoning task. Score: 5/5 — correct on highest latency value, timestamp, most-errored user, error count, and root-cause pattern. The TOON payload used ~180 tokens vs ~420 for equivalent JSON — 57% less on the reasoning task itself.

ANTHROPIC_API_KEY=sk-... node_modules/.bin/tsx scripts/handshake-test.ts

API Reference

stringify(value, options?)

stringify(value: Record<string, unknown>, options?: {
  indent?: 2 | 4       // default: 2
  dotPath?: boolean    // default: true  — compress single-child chains
  sizeHints?: boolean  // default: true  — emit [len:N] anchors
  hrvThreshold?: number // default: 0.5  — max extra/base key ratio for sparse HRV
}): string

Throws TypeError if top-level value is not a plain object.
Throws ToonSerializationError on circular references.
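Circular structures are the classic failure mode for any serializer. One common way to detect them, shown here as a sketch rather than the library's actual logic, is to track the current traversal path in a WeakSet and unwind it on the way back up:

```typescript
// Sketch of cycle detection (the library throws ToonSerializationError;
// this illustrative version throws a plain Error).
function assertAcyclic(value: unknown, path = new WeakSet<object>()): void {
  if (value === null || typeof value !== 'object') return
  if (path.has(value)) throw new Error('circular reference')
  path.add(value)
  for (const child of Object.values(value)) assertAcyclic(child, path)
  path.delete(value)  // shared but non-cyclic references are fine
}
```

Deleting on the way out is what distinguishes a true cycle from the same object legitimately appearing twice in different branches.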

parse(input, options?)

parse(input: string, options?: {
  validateHints?: boolean  // default: true — throw on [len:N] length mismatch
}): Record<string, unknown>

Throws ToonParseError (with .line, .column, .suggestion) on violations.

lint(input)

lint(input: string): Issue[]
// Issue: { severity: 'error' | 'warning', line: number, column: number, message: string }

Never throws. Returns all structural issues in one pass — use before parse() in CI or editor integrations.

TOON_SYSTEM_PROMPT / TOON_SYSTEM_PROMPT_COMPACT

Pre-written system prompt snippets. Full (~280 tokens) for first-time integration; compact (~90 tokens) for follow-up calls.


CLI

# One-time project setup
npx @alxmss/toon init           # writes TOON block to ./CLAUDE.md
npx @alxmss/toon init --global  # writes to ~/.claude/CLAUDE.md

# Measure token savings on any JSON file
toon check data.json
toon check data.json --window 32000   # custom context window
toon check data.json --toon           # also print the TOON output

Format Reference

Sigils

| Sigil | Role | Example |
|-------|------|---------|
| key: value | Key-value pair | name: Alice |
| a.b.c: value | Dot-path (single-child chain) | db.host: localhost |
| [len:N] | Size hint / structural anchor | users[len:3]: |
| # col1 \| col2 | HRV header | # id \| name \| role? |
| > v1 \| v2 | HRV data row | > 1 \| Alice \| admin |
| - key: value | Block array item | - method: GET |
| ~ | Null / absent | latency: ~ |
| col? | Optional HRV column | # id \| note? |
| // | Line comment | // deprecated |

Type inference (first match wins)

~ or null      → null
true / false   → boolean
bare integer   → int
bare float     → float
"…"            → string
[…]            → inline array
{…}            → inline object
anything else  → bare string   (IPs, semver, URL paths never need quotes)
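The first-match-wins order above can be sketched as a small function. This mirrors the documented rules, not the package's parser, and omits inline arrays/objects for brevity:

```typescript
// Illustrative scalar inference following the first-match-wins order above.
// Inline arrays and objects are omitted; this is not the package's parser.
function inferScalar(token: string): unknown {
  if (token === '~' || token === 'null') return null
  if (token === 'true' || token === 'false') return token === 'true'
  if (/^-?\d+$/.test(token)) return parseInt(token, 10)
  if (/^-?\d+\.\d+$/.test(token)) return parseFloat(token)
  if (token.startsWith('"') && token.endsWith('"')) return token.slice(1, -1)
  return token  // bare string: IPs, semver, URL paths survive unquoted
}

inferScalar('10.5')    // → 10.5 (float)
inferScalar('3.2.1')   // → '3.2.1' (semver stays a bare string)
```

Note how `3.2.1` falls through both numeric patterns and lands in the bare-string case, which is exactly why quote elision is safe for semver and IPs.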

Array tier selection

All scalars                        → key[len:N]: a, b, c
All objects, no optional keys      → HRV-uniform  (# / > rows)
Objects with ≤50% optional keys    → HRV-sparse   (col? columns, ~ for absent)
Otherwise                          → Block array  (- items)
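The tier rules above reduce to a small decision function. This is a sketch of the documented behavior (using the default 0.5 hrvThreshold), not the library's code:

```typescript
// Sketch of the documented tier rules (default hrvThreshold 0.5);
// not the library's implementation.
type Tier = 'inline-scalars' | 'hrv-uniform' | 'hrv-sparse' | 'block'

function selectTier(arr: unknown[]): Tier {
  const isObj = (v: unknown): v is Record<string, unknown> =>
    v !== null && typeof v === 'object' && !Array.isArray(v)
  // All scalars → inline comma list
  if (arr.every(v => !isObj(v) && !Array.isArray(v))) return 'inline-scalars'
  // Any non-object member → block array
  if (!arr.every(isObj)) return 'block'
  const keySets = arr.map(v => Object.keys(v))
  // Base keys appear in every row; everything else is optional
  const base = keySets[0].filter(k => keySets.every(ks => ks.includes(k)))
  const extra = new Set(keySets.flat()).size - base.length
  if (extra === 0) return 'hrv-uniform'
  return extra / base.length <= 0.5 ? 'hrv-sparse' : 'block'
}
```

For example, two rows sharing `id` and `name` with one row adding `note` give an extra/base ratio of 0.5, which still qualifies for HRV-sparse.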

Spec

Formal EBNF grammar: spec/GRAMMAR.md

Known limitations:

  • Block string | is parse-only (stringify uses quoted strings instead)
  • NaN / Infinity serialize as quoted strings
  • Top-level arrays are not supported — wrap in { items: [...] }