@piscodm/claw-compressor
v0.1.9
LLM context compression plugin for OpenClaw — reduces token usage by compressing tool results and normalizing messages at write time.
claw-compressor
An OpenClaw plugin that compresses tool results and normalizes messages at write time, reducing LLM token usage across all sessions — no proxy, no routing, no payments required.
Why
Every tool call (file reads, shell output, web fetches, API responses) gets stored in the session transcript and re-sent to the LLM on every subsequent turn. Most of that output is noise. A 10KB exec result usually has ~300 chars of useful signal. The rest is repeated headers, empty lines, and irrelevant detail.
This plugin intercepts messages at write time, compresses them once, and stores the compressed version permanently. Every future turn in that session benefits automatically.
Works with any provider (Anthropic, OpenAI, etc.) — hooks into OpenClaw's session pipeline, not the HTTP layer.
Installation
openclaw plugins install @piscodm/claw-compressor
openclaw gateway restart
That's it. You'll see this in the logs on next start:
Claw Compressor active | layers: observation, json-compact, whitespace | threshold: 500 chars
Local / Dev Install
git clone https://github.com/jpbedoya/claw-compressor
cd claw-compressor
npm install && npm run build
openclaw plugins install -l .
openclaw gateway restart
How It Works
The plugin registers four OpenClaw hooks:
tool_result_persist
Fires before a tool result is written to the session transcript. Main compression path — biggest wins from compressing exec, web_fetch, browser, and other tool outputs. Only fires when content exceeds observationThreshold (default: 500 chars).
before_message_write
Fires before any message (user, assistant, system) is written. Used only for whitespace normalization — a low-risk pass safe on all message types. Tool results are skipped here (already handled above).
gateway_start
Fires once at gateway start. Kicks off an async, non-blocking npm registry check for newer versions. 3-second delay so startup isn't affected.
session_start
Fires at the start of each agent session. If an update was found, injects a system event so the agent naturally informs the user. Each session is notified at most once per gateway boot.
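The update flow described above (compare versions, then notify each session once per boot) can be sketched as follows. This is a minimal illustration only: `isNewerVersion` and `UpdateNotifier` are hypothetical names, the registry fetch itself is left out, and pre-release tags are ignored.

```typescript
// Hypothetical helpers for illustration; not the plugin's actual code.

/** True when `candidate` is a higher dotted numeric version than `current`. */
function isNewerVersion(candidate: string, current: string): boolean {
  const a = candidate.split(".").map((part) => parseInt(part, 10) || 0);
  const b = current.split(".").map((part) => parseInt(part, 10) || 0);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const left = a[i] ?? 0;
    const right = b[i] ?? 0;
    if (left !== right) return left > right;
  }
  return false;
}

/** Remembers which sessions were already notified during this gateway boot. */
class UpdateNotifier {
  private latest: string | null = null;
  private readonly notified = new Set<string>();
  private readonly current: string;

  constructor(current: string) {
    this.current = current;
  }

  /** Called once the asynchronous registry check has a result. */
  setLatest(version: string): void {
    this.latest = version;
  }

  /** Returns a notice for this session, or null if there is nothing new to say. */
  noticeFor(sessionKey: string): string | null {
    if (this.latest === null || !isNewerVersion(this.latest, this.current)) return null;
    if (this.notified.has(sessionKey)) return null;
    this.notified.add(sessionKey);
    return `claw-compressor ${this.latest} is available (installed: ${this.current})`;
  }
}
```

Comparing parts numerically matters here: a plain string comparison would rank 0.1.9 above 0.1.10.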
Compression Layers
Three layers run in order:
| Layer | What it does | Default |
|-------|-------------|---------|
| observation | Extracts key signal from large tool outputs (errors, status codes, JSON keys, first/last lines) | ✅ |
| json-compact | Strips whitespace from pretty-printed JSON blobs | ✅ |
| whitespace | Normalizes repeated blank lines, trims trailing spaces | ✅ |
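A rough sketch of how layers could be chained in order, assuming each layer is a pure string-to-string function. The `Layer` type and `runLayers` are illustrative names, not the plugin's API. The JSON layer here only handles text that is entirely JSON, and the observation layer is left out to keep the example short.

```typescript
// Illustrative layer pipeline; names and shapes are assumptions.
type Layer = { name: string; apply: (text: string) => string };

// json-compact: if the whole text parses as JSON, re-serialize it without indentation.
const jsonCompact: Layer = {
  name: "json-compact",
  apply: (text) => {
    const trimmed = text.trim();
    if (!trimmed.startsWith("{") && !trimmed.startsWith("[")) return text;
    try {
      return JSON.stringify(JSON.parse(trimmed));
    } catch {
      return text; // not valid JSON, leave untouched
    }
  },
};

// whitespace: trim trailing spaces/tabs and collapse runs of blank lines to one.
const whitespace: Layer = {
  name: "whitespace",
  apply: (text) =>
    text
      .split("\n")
      .map((line) => line.replace(/[ \t]+$/, ""))
      .join("\n")
      .replace(/\n{3,}/g, "\n\n"),
};

// Runs the layers in order and records which ones actually changed the text.
function runLayers(text: string, layers: Layer[]): { text: string; fired: string[] } {
  const fired: string[] = [];
  let current = text;
  for (const layer of layers) {
    const next = layer.apply(current);
    if (next !== current) fired.push(layer.name);
    current = next;
  }
  return { text: current, fired };
}
```

Recording which layers fired is what would feed a per-layer counter like the one shown by /compress_stats.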
Observation Layer
The heaviest hitter. For tool results over the threshold:
- Detects and preserves error messages / stack traces
- Preserves status lines (exit codes, HTTP status, success/failure markers)
- Extracts top-level JSON keys (structure without values)
- Keeps the first and last N lines of output
- Truncates to observationMaxLength
Result: a 10KB exec output becomes a ~300-char summary that keeps the actionable parts (errors, status lines, structure).
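The steps above could look roughly like this. It is a sketch under stated assumptions: the regular expressions, the `edgeLines` option (the "N" in the list), and the omitted-lines marker are all invented for illustration, and the plugin's real heuristics may differ.

```typescript
// Illustrative observation pass; patterns and option names are assumptions.
interface ObservationOptions {
  threshold: number; // minimum chars before the layer fires
  maxLength: number; // hard cap on the summary length
  edgeLines: number; // how many first/last lines to keep
}

const ERROR_PATTERN = /\b(error|exception|traceback|failed|fatal)\b/i;
const STATUS_PATTERN = /\b(exit code|status|HTTP\/\d)/i;

// Returns the top-level keys if the text is a JSON object, otherwise null.
function topLevelJsonKeys(text: string): string[] | null {
  try {
    const parsed: unknown = JSON.parse(text);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return Object.keys(parsed);
    }
  } catch {
    // not JSON; fall through
  }
  return null;
}

function observe(text: string, opts: ObservationOptions): string {
  if (text.length <= opts.threshold) return text;

  // Structure without values for JSON objects.
  const keys = topLevelJsonKeys(text);
  if (keys !== null) return `JSON object with keys: ${keys.join(", ")}`.slice(0, opts.maxLength);

  // Otherwise keep edge lines plus anything that looks like an error or status line.
  const lines = text.split("\n").filter((line) => line.trim() !== "");
  const out: string[] = [];
  let previous = -1;
  lines.forEach((line, index) => {
    const isEdge = index < opts.edgeLines || index >= lines.length - opts.edgeLines;
    if (!isEdge && !ERROR_PATTERN.test(line) && !STATUS_PATTERN.test(line)) return;
    if (index > previous + 1) out.push(`[${index - previous - 1} lines omitted]`);
    out.push(line);
    previous = index;
  });
  return out.join("\n").slice(0, opts.maxLength);
}
```

The final `slice` is a blunt cut; a real implementation might prefer to drop whole lines so that error messages are never split in the middle.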
Per-Tool Overrides
Tools that produce informational content get a higher threshold and larger max length. Tools whose output is mostly ephemeral noise get compressed aggressively. File reads are excluded entirely — the model needs to read their full content.
| Tool | Behavior |
|------|----------|
| read, Read | Observation disabled — content is preserved as-is |
| web_fetch | Threshold: 4000 chars, max: 2000 chars |
| web_search | Threshold: 1000 chars, max: 800 chars |
| exec, process | Threshold: 300 chars, max: 300–400 chars |
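One way to resolve the effective settings for a tool is to layer the built-in table over the global defaults, then apply any user overrides. This is a sketch, not the plugin's code: `settingsFor` is a hypothetical helper, the precedence order is an assumption, and the exec/process max is documented as a range (300 to 400), so 300 is used here for both.

```typescript
// Illustrative settings resolution; names and precedence are assumptions.
interface ToolOverride {
  observation?: boolean;
  threshold?: number;
  maxLength?: number;
}

interface ObservationSettings {
  observation: boolean;
  threshold: number;
  maxLength: number;
}

// Global defaults (observationThreshold / observationMaxLength).
const GLOBAL_DEFAULTS: ObservationSettings = { observation: true, threshold: 500, maxLength: 300 };

// Built-in overrides mirroring the table above.
const BUILT_IN: Record<string, ToolOverride> = {
  read: { observation: false },
  Read: { observation: false },
  web_fetch: { threshold: 4000, maxLength: 2000 },
  web_search: { threshold: 1000, maxLength: 800 },
  exec: { threshold: 300, maxLength: 300 },
  process: { threshold: 300, maxLength: 300 },
};

// Later spreads win: user overrides > built-in overrides > global defaults.
function settingsFor(toolName: string, user: Record<string, ToolOverride> = {}): ObservationSettings {
  return { ...GLOBAL_DEFAULTS, ...BUILT_IN[toolName], ...user[toolName] };
}
```

For example, `settingsFor("web_fetch")` yields a 4000-char threshold, while an unknown tool falls back to the 500/300 defaults.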
Commands
/compress_stats [days]
Shows compression savings for the last N days (default 7, max 90).
📦 Claw Compressor Stats (last 7d)
Overall
• Events compressed: 428
• Chars saved: 1.2M / 1.4M (83%)
• Est. tokens saved: ~303K
• Est. cost saved: ~$0.91 (@ $3.00/M tokens)
⚠️ Fidelity Warnings
• `web_fetch` — 2 high-compression events (p95: 94%) — may be discarding real content
Consider raising the threshold or disabling observation for these tools.
By tool (avg | p50 | p95)
• `exec` — 890K saved | avg 87% | p50 85% | p95 97% | 312 calls
• `browser` — 210K saved | avg 94% | p50 92% | p95 99% | 38 calls
• `web_fetch` ⚠️ — 95K saved | avg 71% | p50 68% | p95 94% | 22 calls
By layer
• `observation` — fired 380x
• `whitespace` — fired 280x
• `json-compact` — fired 95x
/compress_sample [n]
Shows the last N before/after compression excerpts (default 5, max 20). Useful for manually verifying the compressor isn't discarding real content.
Samples are lightly redacted before storage — emails, bearer tokens, and long hex strings are replaced with [EMAIL], [REDACTED], [TOKEN].
🔬 Compression QA Samples (last 3)
[1] `exec` — 91% compressed · Mar 8, 10:23 AM
Before (4821 chars):
...
After (432 chars):
...
Samples are captured randomly (~20% of events) up to the daily cap (qaSamplesPerDay, default 5).
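Sampling and redaction as described above might be sketched like this. The regular expressions are assumptions, the placeholders are assumed to map in the order listed (emails, bearer tokens, long hex strings), and `DailySampler` is a hypothetical name that resets its counter on the UTC calendar day.

```typescript
// Illustrative redaction and sampling; patterns and names are assumptions.
function redact(text: string): string {
  return text
    .replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[EMAIL]")
    .replace(/Bearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]")
    .replace(/\b[0-9a-fA-F]{32,}\b/g, "[TOKEN]");
}

class DailySampler {
  private day = "";
  private count = 0;
  private readonly perDay: number;
  private readonly rate: number;
  private readonly random: () => number;

  // `random` is injectable so the behaviour can be tested deterministically.
  constructor(perDay = 5, rate = 0.2, random: () => number = Math.random) {
    this.perDay = perDay;
    this.rate = rate;
    this.random = random;
  }

  /** Decides whether the current compression event should be stored as a sample. */
  shouldCapture(now: Date): boolean {
    const today = now.toISOString().slice(0, 10); // UTC date, e.g. "2026-03-08"
    if (today !== this.day) {
      this.day = today;
      this.count = 0;
    }
    if (this.count >= this.perDay) return false; // daily cap reached
    if (this.random() >= this.rate) return false; // roughly 20% of events pass
    this.count++;
    return true;
  }
}
```

Checking the cap before drawing the random number keeps the sampler cheap once the day's quota is used up.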
Fidelity Guardrails
The plugin monitors for suspicious compression ratios that may indicate real content is being discarded:
- Read-like tools (read, Read, web_fetch, web_search, memory_get, memory_search, image, pdf, canvas, browser): warns if compressed >70%
- Any tool: warns if compressed >95%
When triggered:
- A ⚠️ FIDELITY RISK warning is logged immediately to the gateway log
- The event is flagged with fidelityWarning: true in the stats JSONL
- The tool appears in the fidelity warnings section of /compress_stats
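The two thresholds above reduce to a small check. A minimal sketch, assuming the percentage is rounded to a whole number as in the stats file; `savedPct` and `hasFidelityRisk` are illustrative function names.

```typescript
// Illustrative guardrail check; function names are assumptions.
const READ_LIKE_TOOLS = new Set([
  "read", "Read", "web_fetch", "web_search", "memory_get",
  "memory_search", "image", "pdf", "canvas", "browser",
]);

/** Percentage of characters removed, rounded to a whole number. */
function savedPct(originalChars: number, compressedChars: number): number {
  if (originalChars <= 0) return 0;
  return Math.round(((originalChars - compressedChars) / originalChars) * 100);
}

/** True when the compression ratio looks suspicious for this tool. */
function hasFidelityRisk(toolName: string, pct: number): boolean {
  if (pct > 95) return true; // applies to any tool
  return READ_LIKE_TOOLS.has(toolName) && pct > 70; // stricter for read-like tools
}
```

With the numbers from the stats example (4821 chars down to 432), `savedPct` gives 91, which is below the 95% limit for exec and so raises no warning.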
Configuration
In ~/.openclaw/openclaw.json, under plugins.entries.claw-compressor.config:
{
  "plugins": {
    "entries": {
      "claw-compressor": {
        "enabled": true,
        "config": {
          "enabled": true,
          "observation": true,
          "jsonCompact": true,
          "whitespace": true,
          "observationThreshold": 500,
          "observationMaxLength": 300,
          "logStats": true,
          "statsPath": "~/.openclaw/compressor-stats.jsonl",
          "qaSampling": true,
          "qaSamplesPerDay": 5,
          "qaSamplesPath": "~/.openclaw/compressor-samples.jsonl",
          "tokenPricePer1M": 3.0
        }
      }
    }
  }
}
All fields are optional; defaults shown above.
| Field | Default | Description |
|-------|---------|-------------|
| enabled | true | Master switch |
| observation | true | Enable observation layer |
| jsonCompact | true | Enable JSON compaction |
| whitespace | true | Enable whitespace normalization |
| observationThreshold | 500 | Min chars before observation fires |
| observationMaxLength | 300 | Max chars after observation |
| logStats | true | Write stats to JSONL |
| statsPath | ~/.openclaw/compressor-stats.jsonl | Stats file path |
| qaSampling | true | Enable QA sample capture |
| qaSamplesPerDay | 5 | Max samples captured per calendar day |
| qaSamplesPath | ~/.openclaw/compressor-samples.jsonl | Samples file path |
| tokenPricePer1M | 3.0 | Token price (USD/1M) for cost estimate display |
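Since every field is optional, a typed defaults object merged with the user's partial config is one straightforward way to model this. The interface below mirrors the table; `resolveConfig` and `activeLayers` are hypothetical helpers, not the contents of the plugin's types.ts.

```typescript
// Illustrative config model; helper names are assumptions.
interface CompressorConfig {
  enabled: boolean;
  observation: boolean;
  jsonCompact: boolean;
  whitespace: boolean;
  observationThreshold: number;
  observationMaxLength: number;
  logStats: boolean;
  statsPath: string;
  qaSampling: boolean;
  qaSamplesPerDay: number;
  qaSamplesPath: string;
  tokenPricePer1M: number;
}

const DEFAULT_CONFIG: CompressorConfig = {
  enabled: true,
  observation: true,
  jsonCompact: true,
  whitespace: true,
  observationThreshold: 500,
  observationMaxLength: 300,
  logStats: true,
  statsPath: "~/.openclaw/compressor-stats.jsonl",
  qaSampling: true,
  qaSamplesPerDay: 5,
  qaSamplesPath: "~/.openclaw/compressor-samples.jsonl",
  tokenPricePer1M: 3.0,
};

// User-supplied fields replace the defaults; everything else keeps its default.
function resolveConfig(user: Partial<CompressorConfig> = {}): CompressorConfig {
  return { ...DEFAULT_CONFIG, ...user };
}

// Layer names in pipeline order, honouring the master switch.
function activeLayers(config: CompressorConfig): string[] {
  if (!config.enabled) return [];
  const layers: string[] = [];
  if (config.observation) layers.push("observation");
  if (config.jsonCompact) layers.push("json-compact");
  if (config.whitespace) layers.push("whitespace");
  return layers;
}
```

With no overrides, `activeLayers(resolveConfig())` lists the same three layers that appear in the startup log line.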
Per-Tool Overrides
Override thresholds or disable observation entirely for specific tools:
{
  "toolOverrides": {
    "my_tool": { "threshold": 2000, "maxLength": 1000 },
    "sensitive_tool": { "observation": false }
  }
}
Stats File Format
~/.openclaw/compressor-stats.jsonl — one JSON line per compression event:
{
  "timestamp": "2026-03-08T17:23:11.204Z",
  "toolName": "exec",
  "sessionKey": "agent:main:telegram:group:-1001234567890:topic:123",
  "originalChars": 4821,
  "compressedChars": 432,
  "savedChars": 4389,
  "savedPct": 91,
  "layers": ["observation", "json-compact"],
  "fidelityWarning": false
}
Stats are global: all sessions (Telegram, Discord, sub-agents, etc.) write to the same file.
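Because the file is plain JSONL, it is easy to analyze outside the plugin. The sketch below parses events and derives totals, percentiles, and a cost estimate. The 4 chars-per-token ratio is an assumption (it roughly matches the sample report's 1.2M chars and ~303K tokens), and the nearest-rank percentile method is a choice made here, not necessarily what report.ts does.

```typescript
// Illustrative stats reader; helper names and estimation method are assumptions.
interface StatsEvent {
  timestamp: string;
  toolName: string;
  sessionKey: string;
  originalChars: number;
  compressedChars: number;
  savedChars: number;
  savedPct: number;
  layers: string[];
  fidelityWarning: boolean;
}

// One JSON object per non-empty line.
function parseStats(jsonl: string): StatsEvent[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as StatsEvent);
}

// Nearest-rank percentile over a list of numbers.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function summarize(events: StatsEvent[], tokenPricePer1M = 3.0, charsPerToken = 4) {
  const savedChars = events.reduce((sum, event) => sum + event.savedChars, 0);
  const estTokensSaved = Math.round(savedChars / charsPerToken);
  const pcts = events.map((event) => event.savedPct);
  return {
    events: events.length,
    savedChars,
    estTokensSaved,
    estCostSaved: (estTokensSaved / 1_000_000) * tokenPricePer1M,
    p50: percentile(pcts, 50),
    p95: percentile(pcts, 95),
  };
}
```

To use it, read the stats file with whatever file API is available and pass the text to `parseStats`; filtering by `toolName` or `timestamp` before calling `summarize` gives per-tool or per-period figures.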
Project Structure
claw-compressor/
├── package.json
├── tsconfig.json
├── openclaw.plugin.json
├── README.md
├── src/
│   ├── index.ts              # Plugin entry: hooks + commands
│   ├── stats.ts              # JSONL stats writer
│   ├── report.ts             # Stats reader + formatter (percentiles, cost, fidelity)
│   ├── samples.ts            # QA sample capture + formatting
│   └── compression/
│       ├── types.ts          # Config types + defaults
│       ├── index.ts          # compress() pipeline
│       └── layers/
│           ├── observation.ts
│           ├── json-compact.ts
│           └── whitespace.ts
└── dist/                     # Compiled output
Development
npm run build # compile
npm run build -- --watch # watch mode
npm test # run test suite (~150ms, 18 tests)
# After any src/ change:
npm run build && openclaw gateway restart
Key Technical Notes
- Write-time compression: Content is compressed once when stored. Every subsequent turn in that session sends the compressed version automatically — zero per-request overhead.
- AgentMessage content format: Tool results (role: "toolResult") use content: ContentBlock[]. The plugin handles both string and block-array formats via getTextContent() / withContent().
- Deduplication: If the same tool output appears twice in a session (e.g. same file read twice), the second occurrence is replaced with a back-reference instead of being stored again in full.
- tool_result_persist is synchronous: Compression is synchronous; no async needed.
- Stats are cumulative: The JSONL grows indefinitely. Prune manually if needed, or just delete to reset.
- Update check is fully async: Fires after 3s at gateway start, fails silently if network is unavailable.
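The deduplication note can be illustrated with a small per-session cache. This is only a sketch: `SessionDeduper`, the back-reference wording, and the minimum-length cutoff are all assumptions, and the text itself is used as the map key where a real implementation might store a hash instead.

```typescript
// Illustrative per-session deduplication; names and reference format are assumptions.
class SessionDeduper {
  private readonly seen = new Map<string, number>();
  private nextId = 1;
  private readonly minLength: number;

  // Short outputs are not worth replacing: the reference could be longer than the text.
  constructor(minLength = 64) {
    this.minLength = minLength;
  }

  /** Returns the text to store: the original on first sight, a short reference on repeats. */
  store(text: string): string {
    if (text.length < this.minLength) return text;
    const existing = this.seen.get(text);
    if (existing !== undefined) return `[duplicate of tool result #${existing}]`;
    this.seen.set(text, this.nextId++);
    return text;
  }
}
```

Everything here is synchronous, which fits a hook like tool_result_persist that cannot await.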
