toon-format-parser
v1.1.0
Bidirectional parser for the TOON (Token-Oriented Object Notation) format, optimized for LLMs
TOON Parser
Description
TOON (Token-Oriented Object Notation) is a human-readable data format optimized for Large Language Models (LLMs). This package provides a bidirectional parser that encodes JavaScript objects to TOON and decodes TOON strings back into JavaScript objects.
TOON combines the readability of YAML with the efficiency of tabular formats and introduces Collection Markers for unambiguous parsing of large datasets, making it ideal for LLM interactions and context-window optimization.
Key Features
- Collection Markers: Use items[N]: to denote arrays and items[N]{headers}: for tabular data.
- Tabular Efficiency: Automatic detection and encoding of object arrays into compact CSV-like rows.
- LLM Optimized: Minimizes tokens by removing redundant brackets, braces, and quotes.
- Robust Parsing: Handles complex nested structures and unquoted strings with ease.
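To make the marker and tabular-detection features concrete, here is a minimal sketch of how an encoder might decide between the two marker forms. The helpers `isTabular` and `collectionMarker` are hypothetical names introduced for illustration and are not part of this package's API. The rule used here (same keys in the same order, primitive values only) is an assumption, and the real encoder may apply different criteria.

```javascript
// Hypothetical helpers for illustration only; not part of the package's API.
const isPrimitive = (v) =>
  v === null || ['string', 'number', 'boolean'].includes(typeof v);

const isPlainObject = (v) =>
  v !== null && typeof v === 'object' && !Array.isArray(v);

// Assumed rule: an array is "tabular" when every element is a plain object
// with the same keys in the same order and only primitive values.
function isTabular(arr) {
  if (!Array.isArray(arr) || arr.length === 0 || !isPlainObject(arr[0])) {
    return false;
  }
  const keys = Object.keys(arr[0]);
  if (keys.length === 0) return false;
  return arr.every((row) => {
    if (!isPlainObject(row)) return false;
    const rowKeys = Object.keys(row);
    return (
      rowKeys.length === keys.length &&
      keys.every((k, i) => rowKeys[i] === k && isPrimitive(row[k]))
    );
  });
}

// Builds items[N]{headers}: for tabular arrays and items[N]: otherwise.
function collectionMarker(arr) {
  return isTabular(arr)
    ? `items[${arr.length}]{${Object.keys(arr[0]).join(',')}}:`
    : `items[${arr.length}]:`;
}

const users = [
  { name: 'Alice', role: 'admin' },
  { name: 'Bob', role: 'user' },
];
console.log(collectionMarker(users)); // items[2]{name,role}:
```

An array whose objects contain nested arrays or objects falls back to the plain items[N]: form in this sketch, since its rows cannot be flattened into a single delimited line.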
Installation
Install TOON Parser using Bun:
bun add toon-parser
Quick Start
import { encode, decode } from 'toon-parser';
// Encode JavaScript data to TOON format
const data = {
  sessionId: "abc-123",
  users: [
    { name: 'Alice', role: 'admin' },
    { name: 'Bob', role: 'user' }
  ]
};
const toonString = encode(data);
console.log(toonString);
/* Output:
sessionId: abc-123
users: items[2]{name,role}:
Alice,admin
Bob,user
*/
// Decode back to a JavaScript object
const decoded = decode(toonString);
Examples
Tabular Arrays with Markers
TOON uses markers to help LLMs and parsers understand the structure and size of collections:
const activities = [
  { time: '08:00', task: 'Breakfast', location: 'Rome' },
  { time: '10:00', task: 'Coliseum Tour', location: 'Piazza del Colosseo' }
];
const toon = encode(activities);
// Output:
// items[2]{time,task,location}:
// 08:00,Breakfast,Rome
// 10:00,Coliseum Tour,Piazza del Colosseo
Simple Arrays (Dash format)
For lists of primitives or heterogeneous data:
const tags = ['travel', 'itinerary', 2025];
const toon = encode(tags);
// Output:
// - travel
// - itinerary
// - 2025
Complex Nested Structures
const trip = {
  title: "Italy 2025",
  days: [
    {
      day: 1,
      events: [
        { time: "09:00", desc: "Arrival" },
        { time: "12:00", desc: "Lunch" }
      ]
    }
  ]
};
const toon = encode(trip);
/* Output:
title: "Italy 2025"
days: items[1]:
- day: 1
events: items[2]{time,desc}:
09:00,Arrival
12:00,Lunch
*/
API Reference
encode(data, options?)
Encodes data to TOON format.
indent: Number of spaces (default: 2).
delimiter: Tabular separator (default: ,).
decode(toon, options?)
Decodes TOON string to JavaScript objects.
trim: Trim whitespace from values (default: true).
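To illustrate what the delimiter and trim options could mean for tabular data, here is a rough sketch of parsing a single tabular block. `parseTabularBlock` is a hypothetical helper written for this example; it is not the package's decode() and makes several assumptions: header names are always comma-separated, every cell is returned as a string with no type inference, and a row count that disagrees with the marker is treated as an error.

```javascript
// Hypothetical helper for illustration only; not the package's decode().
// Parses one block of the form:
//   items[N]{key1,key2}:
//   value1,value2
function parseTabularBlock(text, { delimiter = ',', trim = true } = {}) {
  const [header, ...rows] = text
    .split('\n')
    .filter((line) => line.trim() !== '');
  const match = /^items\[(\d+)\]\{([^}]*)\}:$/.exec((header ?? '').trim());
  if (!match) throw new Error(`Not a tabular header: ${header}`);

  const expected = Number(match[1]);
  const keys = match[2].split(','); // assumption: headers always use commas
  if (rows.length !== expected) {
    // The [N] marker makes truncated or padded input detectable.
    throw new Error(`Expected ${expected} rows, found ${rows.length}`);
  }

  return rows.map((row) => {
    const cells = row.split(delimiter);
    return Object.fromEntries(
      keys.map((key, i) => {
        const cell = cells[i] ?? '';
        return [key, trim ? cell.trim() : cell];
      })
    );
  });
}

const activities = parseTabularBlock(
  'items[2]{time,task,location}:\n' +
    '08:00,Breakfast,Rome\n' +
    '10:00,Coliseum Tour,Piazza del Colosseo'
);
console.log(activities[1].task); // Coliseum Tour
```

This sketch does not handle cells that contain the delimiter itself; a real parser would need quoting or escaping rules for that case.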
Size Comparison
TOON is significantly more compact than JSON, especially for structured data sets common in LLM prompts.
| Format | Size (Bytes) | Reduction |
| :--- | :--- | :--- |
| JSON | 27,905 | 0% |
| TOON | 9,704 | ~65% |
Benchmark based on a complex 4-day travel itinerary with nested activities and checklists.
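The reduction figure can be checked from the two sizes in the table. The sketch below shows one way to measure and compare payload sizes for your own data. `utf8Bytes` and `reductionPercent` are hypothetical helper names, and measuring UTF-8 bytes is an assumption, since the benchmark does not state how sizes were counted.

```javascript
// Hypothetical helpers for illustration only; not part of the package's API.

// Size of a string in bytes when encoded as UTF-8.
const utf8Bytes = (text) => new TextEncoder().encode(text).length;

// Percentage saved by the candidate relative to the baseline,
// rounded to one decimal place.
function reductionPercent(baselineBytes, candidateBytes) {
  return Math.round((1 - candidateBytes / baselineBytes) * 1000) / 10;
}

// Sizes taken from the table above.
console.log(reductionPercent(27905, 9704)); // 65.2

// For your own data, compare the two serialized strings, for example:
// reductionPercent(utf8Bytes(JSON.stringify(data)), utf8Bytes(toonString))
```

Byte counts are only a proxy for token counts, so the savings seen by a given model's tokenizer may differ from the byte-level figure.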
Contributing
See the project specification for technical details on the format.
