yson-format
v0.1.0
Hyper-compact LLM-native data format that saves 35-60% tokens compared to JSON
YSON: Hyper-Compact Data Format for LLMs
Save 35-60% on LLM API costs by using fewer tokens for the same data.
YSON (pronounced "why-son") is a data serialization format designed for AI applications. It reduces token usage while maintaining full compatibility with JSON, making your LLM applications faster and cheaper.
Why YSON?
Massive Cost Savings
Every token costs money. YSON uses 35-60% fewer tokens than JSON.
Real-world impact:
- 10,000 API calls/day with JSON: ~$15/day
- Same calls with YSON: ~$6-9/day
- Annual savings: $2,000-3,000+ (roughly $6-9 saved per day × 365 days)
The Difference
JSON (45 tokens):
{
  "users": [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Carol", "age": 35}
  ]
}
YSON (18 tokens - 60% reduction):
users
$S id name age
1 Alice 30
2 Bob 25
3 Carol 35
Key Features
- Schema-Based Compression: Define structure once, reuse for all rows
- Space-Delimited: No unnecessary punctuation
- Smart Quoting: Quotes only when needed
- Human-Readable: Easy to read, write, and debug
- 100% JSON Compatible: Perfect round-trip conversion (see the sketch below)
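For example, a round trip is just encode followed by decode. This is a minimal sketch using the YSONConverter API documented below; it assumes decode returns a plain object rather than a JSON string:
import assert from 'node:assert/strict';
import { YSONConverter } from './src/index.js';
// Round-trip sketch: encode to YSON, decode back, and check the result
// deep-equals the original (assumes decode returns an object, not a string).
const original = {
  users: [
    { id: 1, name: 'Alice', age: 30 },
    { id: 2, name: 'Bob', age: 25 },
    { id: 3, name: 'Carol', age: 35 },
  ],
};
const yson = YSONConverter.encode(original);
const restored = YSONConverter.decode(yson);
assert.deepStrictEqual(restored, original); // perfect round trip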
Quick Start
# Clone and install
git clone https://github.com/YuvrajGupta1808/yson.git
cd yson
npm install
# Run demo
npm run demo
Basic Usage
import { YSONConverter } from './src/index.js';
// JSON to YSON
const data = { user: { id: 1, name: "Alice" } };
const yson = YSONConverter.encode(data);
console.log(yson);
// Output:
// user
// id 1 name Alice
// YSON to JSON
const json = YSONConverter.decode(yson);
// Count tokens
const tokens = YSONConverter.countTokens(yson);
Live Examples with Gemini API
# Get API key from https://aistudio.google.com/app/apikey
# Add to .env: GEMINI_API_KEY=your-key
npm run gemini:compare # Compare formats
npm run gemini:retrieval # Test accuracy
npm run gemini:optimize # See optimization
npm run gemini:format # Input/output efficiency
npm run gemini:company # Data analysis
When to Use YSON
Perfect For:
- High-volume LLM API calls
- RAG systems with large context
- AI agent communication
- Cost-sensitive applications
- Structured data in prompts (sketched below)
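As a sketch of the "structured data in prompts" case: encode the data before it goes into the prompt. This assumes the @google/generative-ai SDK and the GEMINI_API_KEY variable from the Gemini examples above; the model name and the question are placeholders, not part of YSON itself.
import { GoogleGenerativeAI } from '@google/generative-ai';
import { YSONConverter } from './src/index.js';
// Sketch: the same structured data, sent in fewer tokens by encoding it to
// YSON before interpolating it into the prompt (model name is a placeholder).
const data = { users: [{ id: 1, name: 'Alice', age: 30 }, { id: 2, name: 'Bob', age: 25 }] };
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });
const prompt = `User data in YSON format:\n${YSONConverter.encode(data)}\n\nHow old is Bob?`;
const result = await model.generateContent(prompt);
console.log(result.response.text());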
Proven Results
Token Reduction vs JSON:
- Simple objects: 40-60% fewer tokens
- Arrays with schemas: 50-70% fewer tokens
- Nested structures: 30-40% fewer tokens
Tested with the Google Gemini API on real-world data, with accuracy identical to the equivalent JSON prompts.
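To check these numbers on your own payloads, compare the two token counters from the API reference below. This is a sketch; it assumes JSONParser is exported from the same module as YSONConverter.
import { YSONConverter, JSONParser } from './src/index.js';
// Sketch: measure the reduction on your own data by comparing the token
// count of the JSON string with the token count of the YSON string.
const payload = { users: [{ id: 1, name: 'Alice', age: 30 }, { id: 2, name: 'Bob', age: 25 }] };
const jsonTokens = JSONParser.countTokens(JSON.stringify(payload));
const ysonTokens = YSONConverter.countTokens(YSONConverter.encode(payload));
console.log(`JSON: ${jsonTokens} tokens, YSON: ${ysonTokens} tokens`);
console.log(`Reduction: ${Math.round((1 - ysonTokens / jsonTokens) * 100)}%`);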
Works With
- OpenAI (GPT-4, GPT-3.5)
- Google Gemini
- Anthropic Claude
- Open-source models (Llama, Mistral, etc.)
- Any LLM that accepts text input
API Reference
// YSONConverter
YSONConverter.encode(jsonObject) // JSON to YSON
YSONConverter.decode(ysonString) // YSON to JSON
YSONConverter.countTokens(ysonString) // Count tokens
// JSONParser
JSONParser.parse(jsonString) // JSON string to AST
JSONParser.stringify(ast, pretty=false) // AST back to a JSON string
JSONParser.countTokens(jsonString) // Count tokens of a JSON string
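A short usage sketch of the JSONParser side, assuming it is exported from './src/index.js' like YSONConverter:
import { JSONParser } from './src/index.js';
// Sketch: parse a raw JSON string into an AST, pretty-print it back out,
// and count the tokens of the original string.
const raw = '{"id":1,"name":"Alice"}';
const ast = JSONParser.parse(raw);
console.log(JSONParser.stringify(ast, true)); // pretty-printed JSON
console.log(JSONParser.countTokens(raw));     // token count of the raw string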
License
MIT
Start saving on your LLM costs today.
