@arkadia/data

v0.1.11

Published

3 days ago

Parser and Stringifier for Arkadia Data Format (AKD)


amada-org

arkadia.solutions

ai data format parser schema arkadia

@arkadia/data

                                   ; i  :J
                               U, .j..fraaM.  nl
                            b h.obWMkkWWMMWMCdkvz,k
                         ! .mQWM:o hiMoMW v.uaXMdohbi
                        hI,MMmaIao.Wo .IMkoh FCMwqoXa
                      ,.c.aWdM. d,aToW  .    Mb!. MopfQ.L
                       jhj.xoM :k    aCu F: w MpmqMvMMI,I
                      bzMhz:W    .Mw . o lYh ai M iMa pM.j
                     hzqWWM;    M;o.WMWWMkMX f.a aa bModpo.
                     ;tMbbv   xp oJMMWWWWMMMM iv  dLMXakM:T
                       mdh        MMWWWWWWWbQLCzurjktvMor
                      ,QFw ;M,b .MWWWWWWWMWMWd  xz   M,kd X
                      qjMIo IMTW.WWWWWMWWWM.o.I   rpULaMdi.
                       .mMM  uoWWWMWWWWWWp qM,,M l M;mMbrI
                        f nm  MMW MWWjMuMj  I  o   LbMac
                              WWdMWWWW Mv a.b..aauMhMwQf
                              MoWWW,WWtjonJMWtoMdoaoMI
                              MMMM Mi    xd:Mm tMwo Cr,
                             xMMc .otqokWMMMao:oio.
                             MW    .   C..MkTIo
                            WW
                           QWM
                           WW
                          uMW
                          WW
                          MW

The High-Density, Token-Efficient Data Protocol for Large Language Models.

Arkadia Data Format (AKD) is a schema-first protocol designed specifically to optimize communication with LLMs. By stripping away redundant syntax (like repeated JSON keys) and enforcing strict typing, AKD offers up to 30% token savings, faster parsing, and a metadata layer invisible to your application logic but fully accessible to AI models.

✨ Key Features

📉 Token Efficiency: Reduces context window usage by replacing verbose JSON objects with dense Positional Records (Tuples).
🛡️ Type Safety: Enforces types (int, float, bool, string) explicitly in the schema before data reaches the LLM.
🧠 Metadata Injection: Use #tags and $attributes to pass context (e.g., source confidence, deprecation warnings) to the LLM without polluting your data structure.
⚡ High Performance: Zero-dependency, lightweight parser built for high-throughput Node.js/Edge environments.
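
To make the token-efficiency idea concrete, here is a minimal sketch that compares the size of repeated-key JSON with a positional layout. It does not call @arkadia/data: the strings are hand-rolled to mirror the Quick Start output format, and character count is used only as a rough stand-in for tokens (real counts depend on the tokenizer).

```typescript
// Sketch only: hand-rolled strings, not output from @arkadia/data.
const rows = [
  { id: 1, name: "Alice", active: true },
  { id: 2, name: "Bob", active: false },
  { id: 3, name: "Carol", active: true },
];

// JSON repeats every key in every record.
const jsonChars = rows
  .map((row) => JSON.stringify(row).length)
  .reduce((sum, n) => sum + n, 0);

// Positional layout: keys and types are stated once in the schema header,
// then each record carries only its values, in schema order.
const header = "<id:number,name:string,active:bool>";
const tuples = rows.map(
  (row) => `(${row.id},${JSON.stringify(row.name)},${row.active})`
);
const positionalChars =
  header.length + tuples.reduce((sum, t) => sum + t.length, 0);

console.log({ jsonChars, positionalChars });
```

The header is a fixed cost, so the gap widens as the number of records grows; for a single small record the positional form may not be smaller at all.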

📦 Installation

npm install @arkadia/data
# or
yarn add @arkadia/data
# or
pnpm add @arkadia/data

🚀 Quick Start

Basic Usage

import { encode, decode } from '@arkadia/data';

// 1. Encode: JavaScript Object -> AKD String
const data = { id: 1, name: 'Alice', active: true };

// Default encoding (compact)
const encoded = encode(data);
console.log(encoded);
// Output: <id:number,name:string,active:bool>(1,"Alice",true)

// 2. Decode: AKD String -> JavaScript Object
const input = '<score:number>(98.5)';
const result = decode(input);

if (result.errors.length === 0) {
  console.log(result.node.value); // 98.5
} else {
  console.error('Parse errors:', result.errors);
}

🛠 API Reference

`encode(data: unknown, config?: EncodeConfig): string`

Serializes a JavaScript value into an AKD string.

data: The value to serialize (string, number, boolean, array, or object).
config: Optional settings.
  compact (boolean): Removes whitespace. Default: true.
  colorize (boolean): Adds ANSI colors for terminal output. Default: false.
  escapeNewLines (boolean): Escapes \n in strings. Default: false.
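
The defaults listed above can be pictured as a plain options merge. This is a sketch under assumptions: `EncodeConfigSketch` and `resolveConfig` are hypothetical names that mirror the documented options, and the real `EncodeConfig` type may carry more fields or resolve defaults differently.

```typescript
// Hypothetical mirror of the documented options; not the library's own type.
interface EncodeConfigSketch {
  compact?: boolean;
  colorize?: boolean;
  escapeNewLines?: boolean;
}

// Defaults as stated in the option list above.
const DEFAULTS: Required<EncodeConfigSketch> = {
  compact: true,
  colorize: false,
  escapeNewLines: false,
};

// Caller-supplied options override the defaults; omitted ones fall back.
function resolveConfig(
  config: EncodeConfigSketch = {}
): Required<EncodeConfigSketch> {
  return { ...DEFAULTS, ...config };
}

console.log(resolveConfig({ colorize: true }));
```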

`decode(text: string, config?: DecodeConfig): DecodeResult`

Parses an AKD string into a structured node tree.

text: The raw AKD string.
config: Optional settings.
  debug (boolean): Enables internal logging.
Returns DecodeResult:
  node: The root node (exposes .value, .dict(), .json()).
  errors: Array of parsing errors; empty when parsing succeeded.
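
Because decode reports problems through the errors array rather than by throwing, callers may want a small wrapper that turns a failed parse into an exception. The sketch below assumes only the result shape described above (node.value and errors); `DecodeResultLike` and `unwrap` are hypothetical names, and the stand-in object replaces a real decode call so the example runs without the package installed.

```typescript
// Hypothetical local type mirroring the documented result shape.
interface DecodeResultLike<T> {
  node: { value: T };
  errors: unknown[];
}

// Return the decoded value, or throw if the parser reported any errors.
function unwrap<T>(result: DecodeResultLike<T>): T {
  if (result.errors.length > 0) {
    throw new Error(`AKD decode failed with ${result.errors.length} error(s)`);
  }
  return result.node.value;
}

// Stand-in with the same shape the Quick Start reads from after decoding.
const ok: DecodeResultLike<number> = { node: { value: 98.5 }, errors: [] };
console.log(unwrap(ok)); // 98.5
```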

⚡ Benchmarks

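Why switch? Because every token counts. In the benchmark below, AKCD (Arkadia Compressed Data) uses the fewest tokens of the four formats tested, though the timings reported for AKCD and AKD (4.40 ms and 4.29 ms) are higher than JSON's 0.15 ms.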

BENCHMARK SUMMARY:

   JSON  █████████████████████░░░░     6921 tok     0.15 ms
   AKCD  ████████████████░░░░░░░░░     5416 tok     4.40 ms
   AKD   ███████████████████░░░░░░     6488 tok     4.29 ms
   TOON  █████████████████████████     8198 tok     2.36 ms

   FORMAT     TOKENS       VS JSON
   ---------------------------------
   AKCD       5416         -21.7%
   AKD        6488         -6.3%
   JSON       6921         +0.0%
   TOON       8198         +18.5%

CONCLUSION: Switching to AKCD saves 1505 tokens (21.7%) compared to JSON.
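
The "VS JSON" column follows directly from the token counts in the table. As a quick check, the sketch below recomputes each percentage from those counts; the function and variable names are illustrative only.

```typescript
// Token counts copied from the benchmark table above.
const tokens = { AKCD: 5416, AKD: 6488, JSON: 6921, TOON: 8198 } as const;
type Format = keyof typeof tokens;

// Percentage difference in tokens relative to the JSON baseline.
function percentVsJson(format: Format): number {
  return ((tokens[format] - tokens.JSON) / tokens.JSON) * 100;
}

// Format with an explicit sign and one decimal place, as in the table.
function formatDelta(format: Format): string {
  const pct = percentVsJson(format);
  return `${pct >= 0 ? "+" : ""}${pct.toFixed(1)}%`;
}

for (const format of Object.keys(tokens) as Format[]) {
  console.log(format, tokens[format], formatDelta(format));
}
```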

📖 Syntax Specification

AKD separates structure (Schema) from content (Data).

1. Primitives

Primitive values are automatically typed. Strings are quoted, numbers and booleans are bare.

| Type | Input | Encoded Output |
| ----------- | --------- | ----------------- |
| Integer | 123 | <number>123 |
| String | "hello" | <string>"hello" |
| Boolean | true | <bool>true |
| Null | null | <null>null |
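
The mapping in the table can be sketched as a small function. This is not the library's encoder: `encodePrimitiveSketch` is a hypothetical name, and quoting strings with JSON.stringify is an assumption about how AKD escapes string contents.

```typescript
type Primitive = number | string | boolean | null;

// Reproduce the table rows: a type prefix followed by the bare or quoted value.
function encodePrimitiveSketch(value: Primitive): string {
  if (value === null) return "<null>null";
  if (typeof value === "number") return `<number>${value}`;
  if (typeof value === "boolean") return `<bool>${value}`;
  // Assumption: strings are double-quoted with JSON-style escaping.
  return `<string>${JSON.stringify(value)}`;
}

console.log(encodePrimitiveSketch(123));     // <number>123
console.log(encodePrimitiveSketch("hello")); // <string>"hello"
console.log(encodePrimitiveSketch(true));    // <bool>true
console.log(encodePrimitiveSketch(null));    // <null>null
```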

2. Schema Definition (`@Type`)

Define the structure once to avoid repeating keys.

/* Define a User type */
@User <
  id: number,
  name: string,
  role: string
>

3. Data Structures

Positional Records (Tuples)

The most efficient way to represent objects. Values must match the schema order.

/* Schema: <x:number, y:number> */
(10, 20)
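
Since a positional record carries no keys, a consumer has to pair each value with the schema field at the same index. A minimal sketch of that pairing, with `toNamed` as a hypothetical helper (the package's own node API may expose this differently):

```typescript
// Pair schema field names with tuple values by position.
function toNamed(
  fields: readonly string[],
  values: readonly unknown[]
): Record<string, unknown> {
  if (fields.length !== values.length) {
    throw new Error(
      `expected ${fields.length} values, received ${values.length}`
    );
  }
  const record: Record<string, unknown> = {};
  fields.forEach((name, index) => {
    record[name] = values[index];
  });
  return record;
}

// Schema <x:number, y:number> with the tuple (10, 20).
console.log(toNamed(["x", "y"], [10, 20])); // { x: 10, y: 20 }
```

The length check is why order and arity matter: a missing or extra value would otherwise silently shift every later field.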

Named Records (Objects)

Flexible key-value pairs, similar to JSON, used when the schema is loose or the data is sparse.

{
  id: 1,
  name: "Admin"
}

Lists

Dense arrays. Can be homogeneous (for example, a list of strings) or mixed.

[ "active", "pending", "closed" ]

4. Metadata System

AKD allows you to inject metadata that is visible to the LLM but ignored by the parser when decoding back to your application.

Attributes (`$key=value`) & Tags (`#flag`)

@Product <
  $version="2.0"
  sku: string,

  /* Tagging a field as deprecated */
  #deprecated
  legacy_id: int
>

5. Escaped Identifiers (Backticks)

AKD allows spaces, symbols, and special characters in names when they are wrapped in backticks (`). This applies to schema names, field keys, and metadata attributes.

@`System User+` <
  // $`last-sync`="2024-05-10" //
  `Full Name`: string,
  `is-active?`: bool,
  $`Special ID*` id: number
>
{
  `Full Name`: "John Doe",
  `is-active?`: true,
  id: 101
}
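
When generating AKD by hand, a helper can decide whether a name needs backticks. The sketch below is built on two assumptions that this README does not spell out: that names made only of letters, digits, and underscores are safe unquoted, and that a name containing a backtick cannot be represented (no escape rule is documented here). `quoteIdentifierSketch` is a hypothetical name.

```typescript
// Assumption: these names need no escaping.
const PLAIN_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function quoteIdentifierSketch(name: string): string {
  if (name.includes("`")) {
    // No escape rule for embedded backticks is documented, so refuse.
    throw new Error("names containing a backtick are not handled by this sketch");
  }
  return PLAIN_IDENTIFIER.test(name) ? name : "`" + name + "`";
}

console.log(quoteIdentifierSketch("id"));
console.log(quoteIdentifierSketch("Full Name"));
console.log(quoteIdentifierSketch("is-active?"));
```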

6. Prompt Output Mode (`--prompt-output`)

This mode is designed for Large Language Models (LLMs). It transforms AKD into a Structural Blueprint: a template for the model to follow. Instead of raw data values, it renders a recursive, human-readable schema structure.

Key Features:

Full Structural Expansion: Anonymous nested types are fully expanded into braces {}.
Semantic Hinting: Field-level comments from the schema are injected directly into the template.
Representative Sampling: Lists show a single blueprint element followed by a continuation hint (...), saving tokens while maintaining clarity.

Example Usage:

# Generate a structural template for an LLM
echo '<[ /* id */ id: number, name: string, val: <id: string, num: number> ]>' | akd dec -f akd --prompt-output -

Output:

[
  {
    id: number /* id */,
    name: string,
    val: {
      id: string,
      num: number
    }
  },
  ... /* repeat pattern for additional items */
]

Why use it?

Reduce Hallucination: The LLM sees exactly what types and formats are expected for every field.
Context Efficiency: By showing only one example in a list, you define the logic without wasting the context window on repetitive data.
Implicit Instruction: The transition from positional () to named {} in prompt mode helps the AI differentiate between the "Instructions" and the final "Compact Output".
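
To show how such a blueprint can be produced, here is a minimal recursive renderer that reproduces the example output above. It is a sketch, not the CLI's implementation: `SchemaNode`, `Field`, and `renderBlueprint` are hypothetical, and the schema is built by hand instead of being parsed from AKD text.

```typescript
interface Field {
  name: string;
  node: SchemaNode;
  comment?: string; // field-level comment carried into the template
}

type SchemaNode =
  | { kind: "primitive"; type: string }
  | { kind: "record"; fields: Field[] }
  | { kind: "list"; element: SchemaNode };

function renderBlueprint(node: SchemaNode, depth = 0): string {
  const pad = "  ".repeat(depth);
  const inner = "  ".repeat(depth + 1);
  if (node.kind === "primitive") return node.type;
  if (node.kind === "record") {
    // Nested records are expanded into braces, one field per line.
    const lines = node.fields.map((field) => {
      const hint = field.comment ? ` /* ${field.comment} */` : "";
      return `${inner}${field.name}: ${renderBlueprint(field.node, depth + 1)}${hint}`;
    });
    return `{\n${lines.join(",\n")}\n${pad}}`;
  }
  // Lists show one representative element plus a continuation hint.
  const sample = renderBlueprint(node.element, depth + 1);
  return `[\n${inner}${sample},\n${inner}... /* repeat pattern for additional items */\n${pad}]`;
}

const num: SchemaNode = { kind: "primitive", type: "number" };
const str: SchemaNode = { kind: "primitive", type: "string" };

const schema: SchemaNode = {
  kind: "list",
  element: {
    kind: "record",
    fields: [
      { name: "id", node: num, comment: "id" },
      { name: "name", node: str },
      {
        name: "val",
        node: {
          kind: "record",
          fields: [
            { name: "id", node: str },
            { name: "num", node: num },
          ],
        },
      },
    ],
  },
};

console.log(renderBlueprint(schema));
```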

📄 License

This project is licensed under the MIT License.
