@piscodm/claw-compressor
v0.1.3
LLM context compression plugin for OpenClaw — reduces token usage by compressing tool results and normalizing messages at write time.
claw-compressor
An OpenClaw plugin that compresses tool results and normalizes messages at write time, reducing LLM token usage across all sessions — no proxy, no routing, no payments required.
Why
Every tool call (file reads, shell output, web fetches, API responses) gets stored in the session transcript and re-sent to the LLM on every subsequent turn. Most of that output is noise. A 10KB exec result usually has ~300 chars of useful signal. The rest is repeated headers, empty lines, and irrelevant detail.
This plugin intercepts messages at write time, compresses them once, and stores the compressed version permanently. Every future turn in that session benefits automatically.
This plugin works with any provider (Anthropic, OpenAI, etc.) called directly — it hooks into OpenClaw's session pipeline, not the HTTP layer.
How It Works
The plugin registers four OpenClaw hooks:
tool_result_persist
Fires before a tool result is written to the session transcript. This is the main compression path — the biggest wins come from compressing exec, read, web_fetch, and other tool outputs.
Compression is only applied when the content exceeds observationThreshold (default: 500 chars). Below that, the result is stored as-is.
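The gate can be sketched as follows (a minimal illustration — the function name and hook wiring are assumptions, not the plugin's actual source):

```javascript
// Sketch of the tool_result_persist gate: store small results verbatim,
// compress anything above the threshold. `compress` is a placeholder.
const OBSERVATION_THRESHOLD = 500; // default from the config table below

function onToolResultPersist(text, compress) {
  if (text.length <= OBSERVATION_THRESHOLD) return text; // stored as-is
  return compress(text); // compressed once, persisted permanently
}
```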
before_message_write
Fires before any message (user, assistant, system) is written. Used only for whitespace normalization — a low-risk pass that catches redundant blank lines and leading/trailing space.
Tool results are skipped here since they're already handled by tool_result_persist.
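The normalization pass amounts to two low-risk transformations (an illustrative approximation, not the plugin's exact rules):

```javascript
// Sketch of the whitespace pass: trim trailing spaces/tabs per line,
// then collapse runs of 2+ blank lines into a single blank line.
function normalizeWhitespace(text) {
  return text
    .split('\n')
    .map(line => line.replace(/[ \t]+$/, '')) // drop trailing whitespace
    .join('\n')
    .replace(/\n{3,}/g, '\n\n');              // collapse blank-line runs
}
```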
gateway_start
Fires once when the gateway starts. Kicks off an async, non-blocking check against the npm registry to see if a newer version of the plugin is available. No startup latency — runs after a 3-second delay.
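Conceptually the check looks like this (a simplified sketch; the function names are assumptions, though the registry endpoint shown is npm's standard `latest` dist-tag URL):

```javascript
// Sketch: non-blocking update check, fired 3 seconds after gateway start.
// Any failure (offline, slow registry) is swallowed silently.
function isNewer(latest, current) {
  const a = latest.split('.').map(Number);
  const b = current.split('.').map(Number);
  for (let i = 0; i < 3; i++) {
    if ((a[i] || 0) !== (b[i] || 0)) return (a[i] || 0) > (b[i] || 0);
  }
  return false;
}

function scheduleUpdateCheck(currentVersion, onUpdate) {
  setTimeout(async () => {
    try {
      const res = await fetch('https://registry.npmjs.org/@piscodm/claw-compressor/latest');
      const { version } = await res.json();
      if (isNewer(version, currentVersion)) onUpdate(version);
    } catch {
      // fail silently — no impact on startup
    }
  }, 3000);
}
```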
session_start
Fires at the start of each agent session. If an update was found at gateway start, injects a system event into the session so the agent naturally informs the user. Each session is notified at most once per gateway boot.
Compression Layers
Three layers run in order:
| Layer | What it does | Default |
|-------|-------------|---------|
| observation | Extracts key signal from large tool outputs (errors, status codes, JSON keys, first/last lines) | ✅ enabled |
| json-compact | Strips whitespace from pretty-printed JSON blobs embedded in output | ✅ enabled |
| whitespace | Normalizes repeated blank lines, trims trailing spaces | ✅ enabled |
Layers that require the model to decode a codebook (dictionary, path shortening, dynamic codebook) are intentionally excluded — they add tokens in the header and create risk of the model misunderstanding compressed content.
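The ordered pipeline amounts to folding the enabled layers over the text (an illustrative sketch; the real pipeline lives in src/compression/index.ts):

```javascript
// Sketch of compress(): apply enabled layers in order and record
// which ones actually shrank the content (as seen in the stats "layers" field).
function compress(text, layers) {
  const applied = [];
  let out = text;
  for (const { name, enabled, apply } of layers) {
    if (!enabled) continue;
    const next = apply(out);
    if (next.length < out.length) applied.push(name);
    out = next;
  }
  return { text: out, layers: applied };
}
```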
Observation Layer Details
The heaviest hitter. For tool results over the threshold:
- Detects and preserves error messages / stack traces
- Preserves status lines (exit codes, HTTP status, success/failure markers)
- Extracts top-level JSON keys (structure without values)
- Keeps the first and last N lines of output
- Truncates to `observationMaxLength` (default: 300 chars)
Result: a 10KB exec output becomes a 200-char summary that retains all actionable information.
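A stripped-down sketch of that extraction (the heuristics here are illustrative assumptions; the real logic in src/compression/layers/observation.ts is richer):

```javascript
// Sketch of the observation layer: keep error/status lines plus the
// first and last N lines, dedupe, then truncate to maxLength.
function observe(text, { maxLength = 300, edgeLines = 3 } = {}) {
  const lines = text.split('\n');
  const signal = lines.filter(l => /error|exception|exit code|HTTP \d{3}|fail/i.test(l));
  const head = lines.slice(0, edgeLines);
  const tail = lines.slice(-edgeLines);
  const kept = [...new Set([...head, ...signal, ...tail])].join('\n');
  return kept.length > maxLength ? kept.slice(0, maxLength) + '…' : kept;
}
```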
Installation
openclaw plugins install @piscodm/claw-compressor
openclaw gateway restart
That's it. The plugin activates automatically on the next gateway start. You'll see this in the logs:
[plugins] Claw Compressor active | layers: observation, json-compact, whitespace | threshold: 500 chars
Development / Local Install
To run from source (edits take effect after npm run build + gateway restart):
git clone https://github.com/jpbedoya/claw-compressor
cd claw-compressor
npm install
npm run build
openclaw plugins install -l .
openclaw gateway restart
Stats
Compression events are logged to ~/.openclaw/compressor-stats.jsonl (one JSON line per compressed message).
In chat: /compress_stats — shows last 7 days
With period: /compress_stats 30 — shows last 30 days (max 90)
From the shell:
node -e "
const fs = require('fs');
const lines = fs.readFileSync(process.env.HOME + '/.openclaw/compressor-stats.jsonl','utf8').trim().split('\n').map(l=>JSON.parse(l));
const totalOrig = lines.reduce((s,e)=>s+e.originalChars,0);
const totalComp = lines.reduce((s,e)=>s+e.compressedChars,0);
const saved = totalOrig - totalComp;
console.log('Saved:', saved.toLocaleString(), '/', totalOrig.toLocaleString(), '(' + Math.round(saved/totalOrig*100) + '%)');
console.log('Est tokens saved:', Math.round(saved/4).toLocaleString());
"
Each stats entry looks like:
{
"timestamp": "2026-03-05T04:26:32.370Z",
"toolName": "read",
"sessionKey": "agent:main:telegram:group:-1001234567890:topic:123",
"originalChars": 8432,
"compressedChars": 287,
"savedChars": 8145,
"layers": ["observation", "json-compact"]
}
Stats are global — all sessions (Telegram, Discord, sub-agents, etc.) write to the same file.
Configuration
In ~/.openclaw/openclaw.json, under plugins.entries.claw-compressor.config:
{
"plugins": {
"entries": {
"claw-compressor": {
"enabled": true,
"config": {
"enabled": true,
"observation": true,
"jsonCompact": true,
"whitespace": true,
"observationThreshold": 500,
"observationMaxLength": 300,
"logStats": true,
"statsPath": "~/.openclaw/compressor-stats.jsonl"
}
}
}
}
}
All fields are optional — defaults shown above.
| Field | Default | Description |
|-------|---------|-------------|
| enabled | true | Master switch. Set to false to disable all compression. |
| observation | true | Enable observation layer (main compression) |
| jsonCompact | true | Enable JSON compaction |
| whitespace | true | Enable whitespace normalization |
| observationThreshold | 500 | Min chars to trigger observation compression |
| observationMaxLength | 300 | Max chars after observation compression |
| logStats | true | Write stats to JSONL file |
| statsPath | ~/.openclaw/compressor-stats.jsonl | Path to stats file |
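Unset fields fall back to the defaults above; conceptually this is a shallow merge (sketch only — not the plugin's actual config loader):

```javascript
// Sketch: user config shallow-merged over defaults (mirrors the table above).
const DEFAULT_CONFIG = {
  enabled: true,
  observation: true,
  jsonCompact: true,
  whitespace: true,
  observationThreshold: 500,
  observationMaxLength: 300,
  logStats: true,
  statsPath: '~/.openclaw/compressor-stats.jsonl',
};

function resolveConfig(userConfig = {}) {
  return { ...DEFAULT_CONFIG, ...userConfig };
}
```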
Project Structure
claw-compressor/
├── package.json # npm package with openclaw.extensions field
├── tsconfig.json # TypeScript config (ESM, NodeNext)
├── openclaw.plugin.json # Plugin manifest (id, name, configSchema)
├── README.md # This file
├── src/
│ ├── index.ts # Plugin entry — registers hooks + /compress_stats command
│ ├── stats.ts # JSONL stats writer (initStats, logStats, buildStats)
│ ├── report.ts # Stats reader + formatter (loadStats, buildReport, formatReport)
│ └── compression/
│ ├── types.ts # CompressionConfig, DEFAULT_CONFIG, CompressionResult
│ ├── index.ts # compress() pipeline
│ └── layers/
│ ├── observation.ts # Observation layer (main compression logic)
│ ├── json-compact.ts # JSON compaction
│ └── whitespace.ts # Whitespace normalization
└── dist/ # Compiled output (loaded by OpenClaw)
Development
# Build
npm run build
# Build + watch
npm run build -- --watch
# After any change to src/:
npm run build && openclaw gateway restart
Key Technical Notes
- `AgentMessage` content format: Tool results (`role: "toolResult"`) use `content: Array<{ type: "text", text: string } | { type: "image", ... }>` — never a plain string. The plugin handles both formats via `getTextContent()` / `withContent()`.
- `tool_result_persist` is synchronous: The hook runner calls it with `runToolResultPersist()` (no async). Our compression is also synchronous — no issue.
- Write-time compression: Content is compressed once when stored. The model never sees the original. Every subsequent turn sends the compressed version automatically.
- Stats are cumulative: The JSONL file grows indefinitely. No rotation is built in — prune manually if needed.
- Update check is fully async: `gateway_start` fires the npm registry check after a 3-second delay. If the network is slow or down, the check fails silently — no impact on startup.
- `session_start` dedup: Each session is notified about a pending update at most once per gateway boot, tracked via an in-memory Set. No disk writes needed.
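The dual-format content handling mentioned above can be pictured as (a hypothetical reimplementation of `getTextContent()`, not the plugin's actual code):

```javascript
// Sketch: tool-result content may be a plain string (other message roles)
// or an array of typed blocks. Concatenate text blocks; skip images.
function getTextContent(content) {
  if (typeof content === 'string') return content;
  return content
    .filter(block => block.type === 'text')
    .map(block => block.text)
    .join('\n');
}
```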
Quality Assurance
The compression pipeline has two safety properties:
1. Per-tool thresholds — Read and web_fetch only compress above 2,000 chars and preserve up to 1,200 chars (the content is informational). exec and process compress above 300 chars and keep up to 400 (the output is mostly ephemeral noise).
2. Observation layer preserves actionable signal — error lines, status codes, key JSON fields, and first/last lines of output are always extracted before truncation.
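Those per-tool limits can be pictured as a lookup table (numbers taken from the text above; the structure and names are assumptions):

```javascript
// Sketch of the per-tool limits described above.
// Unlisted tools fall back to the global defaults (500 / 300).
const TOOL_LIMITS = {
  read:      { threshold: 2000, maxLength: 1200 }, // informational: keep more
  web_fetch: { threshold: 2000, maxLength: 1200 },
  exec:      { threshold: 300,  maxLength: 400 },  // ephemeral noise: compress hard
  process:   { threshold: 300,  maxLength: 400 },
};

function limitsFor(tool) {
  return TOOL_LIMITS[tool] || { threshold: 500, maxLength: 300 };
}
```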
Session transcript inspection confirmed: compressed results are meaningful and agents operated normally on compressed context. The most extreme cases (e.g. `read`: 3378→9 chars) turned out to be genuine short outputs — e.g. a `read` call on a non-existent file that returned an ENOENT error, correctly compressed to "not found".
Test Suite
18 tests across all layers (npm test, runs in ~150ms):
- Whitespace layer — collapses blank lines, trims trailing spaces, passes short content unchanged
- JSON compaction — validates round-trip correctness (`JSON.parse(result)` must equal original)
- Observation layer — asserts error lines, status, and result markers survive compression; respects `maxLength`
- Pipeline end-to-end — per-tool fixture tests assert key signal is preserved; ratio sanity checks ensure output is never longer than input
The fixture tests are the quality gate: each uses a real-world tool output and a must contain list of strings that the compressed result is required to include.
Results (First 2 Days, 2026-03-05 → 2026-03-06)
762 compression events across 28 sessions (including multiple spawned sub-agents):
| Metric | Value |
|--------|-------|
| Original size | 3,749,467 chars |
| Stored size | 856,574 chars |
| Chars saved | 2,892,893 (77%) |
| Tokens saved | ~723,223 |
| Cost saved | ~$2.17 @ $3/1M tokens |
Compression by Tool
| Tool | Events | Compression |
|------|--------|-------------|
| exec | 422 | 87% |
| browser | 22 | 97% |
| read | 250 | 74% |
| process | 7 | 82% |
| sessions_list | 3 | 97% |
| subagents | 1 | 96% |
browser and sessions_list lead at 97% — their outputs are large structured blobs with little reusable signal. exec at 87% is the volume driver (422 events). read at 74% is the char-volume driver (2.9M chars original).
Sub-agents drive the majority of savings — each doing heavy exec/read/browser work that compresses 87–97%. The orchestrating parent session compresses less (short sessions_spawn and message outputs have little to compress).
At this rate: ~$1/day saved on typical multi-agent usage. Token savings also improve model focus — less noise in the context window means less confusion over long sessions.
