@cldy-com/clawsqueezer
v1.0.0
Stale content eviction for OpenClaw: squeeze images, tool results, and exec outputs from context. Compaction fires 2-3x less often.
Your messages are 3% of context. Images, tool results, and exec outputs are 86%. After the LLM has processed them, they're dead weight, but they stay in context until compaction fires.
ClawSqueezer evicts stale heavy content before each LLM call. Compaction fires 2-3x less often.
The Problem
Real OpenClaw session breakdown (analyzed from production data):
| Content | Count | Tokens | Share | Notes |
|---------|-------|--------|-------|-------|
| Tool results | 160 | 65,000 | 42% | File reads, exec outputs, web fetches |
| Images | 1 | 48,000 | 30% | ONE base64 screenshot |
| Tool call args | 160 | 18,000 | 12% | SSH commands, file paths, arguments |
| Assistant text | 128 | 18,000 | 11% | LLM responses |
| User messages | 75 | 4,000 | 3% | Your actual words |
| Overhead | - | 4,000 | 2% | Tool IDs, thinking |

~157K tokens filling a 200K context window.

That image was seen 20 turns ago. Those file reads were processed and acted upon. But they're still sitting in context, eating tokens, until compaction triggers an expensive LLM call to summarize everything.
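A breakdown like the one above can be approximated in a few lines of TypeScript. This is a rough sketch, assuming a simple 4-characters-per-token heuristic and a simplified message shape; it is not OpenClaw's actual message format or tokenizer:

```typescript
// Simplified message shape (an assumption - real OpenClaw messages are richer).
type Block = { kind: "text" | "image" | "tool_result" | "tool_args"; content: string };
type Message = { role: "user" | "assistant" | "tool"; blocks: Block[] };

// Crude heuristic: ~4 characters per token. Good enough for spotting
// which categories dominate, not for exact accounting.
const estimateTokens = (s: string): number => Math.ceil(s.length / 4);

// Sum estimated tokens per content category across a session.
function breakdown(messages: Message[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const msg of messages) {
    for (const block of msg.blocks) {
      const key = block.kind === "text" ? `${msg.role}_text` : block.kind;
      totals[key] = (totals[key] ?? 0) + estimateTokens(block.content);
    }
  }
  return totals;
}
```

Running something like this over an exported session shows whether images and tool results actually dominate your context before you commit to an eviction plugin.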
The Solution
assemble() runs before every LLM call and evicts stale content:
Image (48K tokens, 5 turns old)
→ "[image was here - 48,000 tokens, processed 5 turns ago]"
→ 48K tokens freed

File read (10K tokens, 8 turns old)
→ "[tool result squeezed - was 40,000 chars - Preview: import { Router }...]"
→ 9.5K tokens freed

Exec output (5K tokens, 6 turns old)
→ "[exec: npm run build 2>&1]"
→ 5K tokens freed

Recent content is never touched. Only stale heavy blocks get squeezed.
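The eviction step itself is just a block-for-placeholder swap. Here is a minimal sketch of what producing those placeholders could look like; the block shape, field names, and token heuristic are illustrative assumptions, not the plugin's internals:

```typescript
type Block =
  | { kind: "image"; data: string; turn: number }
  | { kind: "tool_result"; content: string; turn: number }
  | { kind: "placeholder"; content: string };

// Crude ~4 chars/token heuristic (an assumption).
const approxTokens = (s: string): number => Math.ceil(s.length / 4);

// Swap a stale image for a ~20-token marker that records what was lost.
function squeezeImage(block: Extract<Block, { kind: "image" }>, currentTurn: number): Block {
  const age = currentTurn - block.turn;
  return {
    kind: "placeholder",
    content: `[image was here - ${approxTokens(block.data)} tokens, processed ${age} turns ago]`,
  };
}

// Swap a stale tool result for a marker plus a short preview.
function squeezeToolResult(
  block: Extract<Block, { kind: "tool_result" }>,
  keepPreviewChars = 200,
): Block {
  const preview = block.content.slice(0, keepPreviewChars);
  return {
    kind: "placeholder",
    content: `[tool result squeezed - was ${block.content.length} chars - Preview: ${preview}...]`,
  };
}
```

Keeping a short preview means the LLM can still recognize what was evicted and re-read the file if it turns out to matter.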
Requirements
- OpenClaw >= 2026.3.7 (ContextEngine plugin slot)
- Node.js >= 20
Make sure your installation meets these requirements before enabling the plugin.
Installation
```shell
# From npm
openclaw plugins install @cldy-com/clawsqueezer

# From GitHub
openclaw plugins install https://github.com/cldy-com/ClawSqueezer

# From local path (for development)
openclaw plugins install /path/to/ClawSqueezer --link
```

Then activate it as the context engine:
```shell
openclaw config set plugins.slots.contextEngine clawsqueezer
```

Or in openclaw.json:
```json
{
  "plugins": {
    "slots": {
      "contextEngine": "clawsqueezer"
    }
  }
}
```

Configuration
```json
{
  "plugins": {
    "config": {
      "clawsqueezer": {
        "staleTurns": 4,
        "minTokensToSqueeze": 200,
        "keepPreviewChars": 200,
        "imageAgeTurns": 2
      }
    }
  }
}
```

| Option | Default | Description |
|--------|---------|-------------|
| `staleTurns` | 4 | Turns before content is eligible for eviction |
| `minTokensToSqueeze` | 200 | Minimum token size to consider evicting |
| `keepPreviewChars` | 200 | Characters of preview to keep from evicted content |
| `imageAgeTurns` | 2 | Turns before images are evicted (lower = more aggressive) |
Rollback
If anything goes wrong, rollback is instant:
```shell
# Soft - back to default, plugin stays installed
openclaw config unset plugins.slots.contextEngine

# Medium - plugin won't load
openclaw plugins disable clawsqueezer

# Hard - completely gone
openclaw plugins uninstall clawsqueezer
```

No data loss. ClawSqueezer only modifies messages in memory during assemble(); it never writes to session files.
How It Works
```
Message arrives → OpenClaw processes normally
        │
        ▼
assemble() fires before LLM call
        │
        ▼
┌──────────────────────┐
│ Scan messages for:   │
│ • Images > N turns   │
│ • Tool results > N   │
│ • Exec outputs > N   │
│ • Large tool args    │
└──────────┬───────────┘
           │
Replace with tiny placeholders
           │
           ▼
LLM sees lean context → more room for work
Compaction fires less often → saves money
```

What Gets Squeezed
| Content Type | Typical Size | After Squeeze | When |
|--------------|--------------|---------------|------|
| Base64 image | 48,000 tokens | ~20 tokens | After 2 turns |
| Tool result (file read, exec output) | 2,000-10,000 tokens | ~50 tokens | After 4 turns |
| Tool result (web fetch) | 1,000-5,000 tokens | ~30 tokens | After 4 turns |
| Tool call arguments | 200-2,000 tokens | ~20 tokens | After 4 turns |
What's Never Touched
- Recent messages (within `staleTurns`)
- User text messages (always small)
- Assistant text responses (the actual conversation)
- Thinking blocks
- Small tool results (below `minTokensToSqueeze`)
- Tool call structure (type, id, name preserved for API pairing)
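The skip rules above collapse into a single eligibility check that runs before anything is squeezed. A sketch in terms of the configuration options from the table; the block shape and the exact age comparison are assumptions, not the plugin's actual code:

```typescript
interface SqueezeConfig {
  staleTurns: number;         // turns before tool output is stale
  minTokensToSqueeze: number; // blocks smaller than this are left alone
  imageAgeTurns: number;      // images go stale sooner
}

interface BlockInfo {
  kind: "image" | "tool_result" | "exec_output" | "user_text" | "assistant_text" | "thinking";
  tokens: number;
  ageTurns: number; // turns since the block entered the conversation
}

// True only for heavy, stale images and tool/exec output; everything on the
// "never touched" list falls through to false.
function isEligible(block: BlockInfo, cfg: SqueezeConfig): boolean {
  // Conversation text and thinking blocks are never squeezed.
  if (block.kind === "user_text" || block.kind === "assistant_text" || block.kind === "thinking") {
    return false;
  }
  // Small blocks aren't worth the placeholder.
  if (block.tokens < cfg.minTokensToSqueeze) return false;
  // Images use the more aggressive threshold.
  const threshold = block.kind === "image" ? cfg.imageAgeTurns : cfg.staleTurns;
  return block.ageTurns >= threshold; // inclusive comparison assumed; the plugin may differ
}
```

Ordering the checks this way means the cheap never-touched test short-circuits before any token counting matters.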
Production Results
First production deployment:
Before ClawSqueezer: context fills to 180K → compaction fires
After ClawSqueezer: 73 blocks evicted, ~96K tokens freed per call
(Image: 11K freed | Tool results: 85K freed)

Standalone Usage
```typescript
import { squeeze } from "@cldy-com/clawsqueezer";

const { messages: squeezed, stats } = squeeze(messages, {
  staleTurns: 4,
  imageAgeTurns: 2,
});

console.log(`Freed ${stats.tokensFreed} tokens`);
console.log(`Evicted: ${stats.imagesEvicted} images, ${stats.toolResultsEvicted} tool results`);
```

Why Not Just Compress The Summary?
We tried that first. It works (55% smaller summaries), but the real waste isn't in the summary; it's in the 86% of context that's stale images and tool outputs. Compressing the summary saves 3K tokens. Evicting one old screenshot saves 48K.
License
Apache-2.0. Built by CLDY.
