@rahadiana/opencode-ultrapress

v0.2.16

Published

a month ago

Token compression plugin for OpenCode AI — multi-layer filtering, semantic compression, and dynamic context pruning.


rahadiana

opencode plugin compression token context pruning llm

UltraPress — DCP + Caveman + RTK Token Compression

A 4-layer token compression plugin for OpenCode AI, combining three complementary compression techniques to maximize context retention while minimizing token consumption.

| Technique | Layer | Strategy |
|-----------|-------|----------|
| DCP (Dynamic Context Placeholder) | L3 | Compresses stale conversation spans into expandable summaries |
| Caveman (Semantic Compression) | L2 | NLP/MLM/LLM-based message reduction preserving meaning |
| RTK (Real-Time Truncation) | L1 + L4 | Command-specific output filtering + deduplication |

Published as: @tarquinen/opencode-dcp on npm
Local name: @rahadiana/opencode-ultrapress
Requires: OpenCode AI with plugin support

Installation

# === Recommended: install from npm as an OpenCode plugin ===
opencode plugin @tarquinen/opencode-dcp@latest --global

# === Local development / fork ===
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build
# Then add to opencode.json:
#   "plugin": ["/path/to/opencode-ultrapress"]

After installation, restart OpenCode and verify:

/up stats

After upgrading, rebuild (npm run build for a local checkout) and restart OpenCode. If caching issues persist, uninstall and reinstall the plugin.

Quick Start

/up              Toggle compression on/off
/up enable       Enable compression
/up disable      Disable compression
/up config       View current configuration
/up stats        Layer-by-layer statistics
/up config k=v   Set a config value (persisted across restarts)

All 4 layers are enabled by default with conservative settings. Run /up stats immediately to see your baseline.

Architecture

The plugin registers 5 hooks across the OpenCode lifecycle. Data flows through the hooks in sequence, and each compression layer applies a distinct strategy:

   command.execute.before        tool.execute.after          chat.message
   ┌──────────────────────┐      ┌──────────────────┐       ┌──────────────────────┐
   │ /up command handling │      │ L1 Output Filter │       │ L2 Caveman Semantic  │
   │ Config persistence   │      │ L4 Cleanup       │       │ L3 DCP Compression   │
   │ Version notification │      │ (dedup + purge)  │       │ Token sync           │
   └──────────────────────┘      └──────────────────┘       └──────────────────────┘
                                                                      │
                                                                      ▼
                                             ┌───────────────────────────────────┐
                                             │ experimental.chat.messages       │
                                             │ .transform → Strip injections    │
                                             │ experimental.session.compacting  │
                                             │ → Inject protected context       │
                                             └───────────────────────────────────┘

Full architecture documentation with diagrams: docs/architecture.md
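The sequential flow in the diagram can be sketched as an ordered list of layers per hook. This is only an illustration: the hook names come from this README, but the Layer type, the runHook helper, and the trivial layer bodies are hypothetical stand-ins, not the plugin's code or the OpenCode plugin API.

```typescript
// Hypothetical sketch: each hook runs its layers in a fixed order.
type Layer = { name: string; apply: (text: string) => string };

// Hook names are from the README; the layer bodies are trivial placeholders.
const pipeline: Record<string, Layer[]> = {
  "tool.execute.after": [
    { name: "L1 output filter", apply: (t) => t.replace(/\n{3,}/g, "\n\n") },
    { name: "L4 cleanup", apply: (t) => t.trim() },
  ],
  "chat.message": [
    { name: "L2 semantic", apply: (t) => t.replace(/\s+/g, " ") },
    { name: "L3 DCP", apply: (t) => t },
  ],
};

// Pass the text through every layer registered for the hook, in order,
// and record which layers ran.
function runHook(hook: string, text: string): { text: string; trace: string[] } {
  const trace: string[] = [];
  let current = text;
  for (const layer of pipeline[hook] ?? []) {
    current = layer.apply(current);
    trace.push(layer.name);
  }
  return { text: current, trace };
}
```

Calling runHook("tool.execute.after", output) applies the L1 stand-in before the L4 stand-in, which mirrors the ordering shown in the diagram.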

Layer Reference

L1 — Output Filter (RTK)

Real-time command-specific output truncation.

Files: src/layers/layer1-output-filter.ts, src/filters/
Hook: tool.execute.after

Each command type gets a custom filter:

| Command | Strategy |
|---------|----------|
| bash (build) | HEADER/SUMMARY/FILES stripped. Keep errors + final status |
| bash (docker) | Tags/registry/config stripped. Keep errors + image ID |
| bash (grep) | Empty results stripped. Keep total count |
| git status | One file per line. Empty sections stripped |
| git diff | Only +/- lines preserved. Unchanged collapsed |
| git log | One-liner: hash date message |
| git add/commit/push | Single-line summary |
| ls/find/tree | Dedup paths, grouped by directory, max 3 per dir |
| cat (single file) | Content preserved verbatim |
| Jest/Vitest/Pytest/Cargo/Bun | Failures + summary only. Passing stripped |
| Generic fallback | ANSI stripped, blank lines collapsed, smart truncation (70/25 split) |

When output is truncated, the full output is tee-saved to a temporary .log file and the truncated text carries a marker of the form [truncated, see path].

Config: outputFilter.maxCharsPerOutput (default 6000), outputFilter.skipTools, outputFilter.customFilters
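The generic fallback row can be illustrated with a minimal sketch. It assumes that the "70/25 split" means roughly 70% of the character budget is kept from the head of the output and 25% from the tail; the README does not spell this out, and the function name genericFilter and the marker text are hypothetical.

```typescript
// Hypothetical sketch of a generic fallback filter; not the plugin's actual code.
// Assumption: "70/25 split" = 70% of the budget from the head, 25% from the tail.
const ANSI_PATTERN = /\x1b\[[0-9;]*[A-Za-z]/g;

function genericFilter(output: string, maxChars = 6000): string {
  // Strip ANSI escape sequences such as color codes.
  const plain = output.replace(ANSI_PATTERN, "");
  // Collapse any run of blank lines into a single blank line.
  const collapsed = plain.replace(/\n\s*\n+/g, "\n\n");
  if (collapsed.length <= maxChars) return collapsed;

  const headLen = Math.floor(maxChars * 0.7);
  const tailLen = Math.floor(maxChars * 0.25);
  const omitted = collapsed.length - headLen - tailLen;
  // Keep the start and the end, and say how much was dropped in between.
  return (
    collapsed.slice(0, headLen) +
    "\n[truncated " + omitted + " chars]\n" +
    collapsed.slice(collapsed.length - tailLen)
  );
}
```

Keeping both ends reflects a common pattern in command output, where the invocation context sits at the top and the final status or error sits at the bottom. The sketch does not write the tee-saved .log file.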

L2 — Caveman Semantic

NLP-based message compression preserving meaning.

Files: src/layers/layer2-caveman.ts, src/caveman/
Hook: chat.message

Three backends:

| Mode | Backend | Dependencies | Speed | Quality |
|------|---------|-------------|-------|---------|
| nlp | TF-IDF sentence scoring | None | ~1-5ms | Medium |
| mlm | Transformers.js embeddings | @huggingface/transformers | ~50-200ms | Good |
| llm | LLM summarization | Provider model | ~500ms-2s | Best |

All modes protect code blocks, errors, and short messages.
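As a rough illustration of the nlp mode, the sketch below scores each sentence by its mean TF-IDF weight (treating every sentence as a document) and keeps the top-scoring half. The function names, the keepRatio parameter, the error pattern, and the choice to skip any message containing a code fence are all assumptions made for this example; only minLengthChars (250) comes from the defaults in this README.

```typescript
// Hypothetical sketch of TF-IDF sentence scoring; not the plugin's implementation.
const FENCE = "`".repeat(3); // built indirectly so the sample nests safely in a fence
const ERROR_PATTERN = /\b(error|exception|failed)\b/i; // assumed "error" markers

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// Naive splitter: breaks on ., ! and ? (it will also split decimals like "1.2").
function splitSentences(text: string): string[] {
  const parts = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return parts.map((s) => s.trim()).filter((s) => s.length > 0);
}

function compressNlp(message: string, keepRatio = 0.5, minLengthChars = 250): string {
  // Protected: short messages and anything containing a fenced code block.
  if (message.length < minLengthChars || message.indexOf(FENCE) !== -1) return message;

  const sentences = splitSentences(message);
  const docs = sentences.map(tokenize);

  // Document frequency: in how many sentences each term appears.
  const df = new Map<string, number>();
  docs.forEach((doc) => {
    new Set(doc).forEach((term) => df.set(term, (df.get(term) ?? 0) + 1));
  });

  // Sentence score: mean TF-IDF weight of its terms.
  const scores = docs.map((doc) => {
    if (doc.length === 0) return 0;
    let total = 0;
    doc.forEach((term) => {
      total += Math.log(sentences.length / (df.get(term) ?? 1));
    });
    return total / doc.length;
  });

  const keepCount = Math.max(1, Math.ceil(sentences.length * keepRatio));
  const ranked = scores
    .map((score, index) => ({ score, index }))
    .sort((a, b) => b.score - a.score);
  const keep = new Set<number>();
  ranked.slice(0, keepCount).forEach((r) => keep.add(r.index));
  // Protected: sentences that look like errors are always kept.
  sentences.forEach((s, i) => {
    if (ERROR_PATTERN.test(s)) keep.add(i);
  });

  // Emit the kept sentences in their original order.
  return sentences.filter((_, i) => keep.has(i)).join(" ");
}
```

Terms shared by every sentence get an IDF of zero, so boilerplate phrasing contributes nothing and sentences with distinctive vocabulary rank highest.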

L3 — DCP (Dynamic Context Placeholder)

Conversation span compression with expandable summaries.

Files: src/layers/layer3-dcp.ts, src/dcp/
Hook: chat.message

Adaptive pipeline:

1. Context Monitor: tracks token counts by role
2. Scorer: rates messages by purpose, recency, and role
3. Pruner: selects stale candidates, groups into batches
4. Compress State: generates high-fidelity summaries
5. Storage: persists originals for later expansion via ultrapress_expand

Token-aware: auto-compression is skipped when a batch is smaller than batchThreshold (default 5).

Protected (never compressed): system prompts, last N messages (preserveLastN), messages with critical keywords.
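The scorer and pruner steps, together with the protection rules above, might look roughly like the following. The config names preserveLastN, scoreThreshold, and batchThreshold and their defaults come from this README; the Message type, the role weights, the 50/50 recency-to-role blend, and the critical keyword list are invented for illustration, and scoring by purpose is left out.

```typescript
// Hypothetical sketch of scoring and stale-candidate selection.
type Role = "system" | "user" | "assistant" | "tool";
interface Message { id: string; role: Role; text: string }

// Assumed weights: higher means more worth keeping verbatim.
const ROLE_WEIGHT: Record<Role, number> = { system: 1, user: 0.8, assistant: 0.6, tool: 0.3 };
// Assumed critical keywords; the real list is not given in the README.
const CRITICAL = /\b(todo|must|never|important)\b/i;

function scoreMessage(msg: Message, index: number, total: number): number {
  const recency = total > 1 ? index / (total - 1) : 1; // 0 = oldest, 1 = newest
  return 0.5 * recency + 0.5 * ROLE_WEIGHT[msg.role];
}

function selectStale(
  messages: Message[],
  opts = { preserveLastN: 4, scoreThreshold: 0.45, batchThreshold: 5 },
): Message[] {
  const cutoff = messages.length - opts.preserveLastN;
  const candidates = messages.filter(
    (msg, i) =>
      i < cutoff && // last N messages are protected
      msg.role !== "system" && // system prompts are protected
      !CRITICAL.test(msg.text) && // critical keywords are protected
      scoreMessage(msg, i, messages.length) < opts.scoreThreshold,
  );
  // Skip compression entirely when the batch is too small to be worth it.
  return candidates.length < opts.batchThreshold ? [] : candidates;
}
```

Returning an empty list below the threshold avoids paying the summarization cost for a handful of messages.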

L4 — Cleanup (RTK)

Post-filter deduplication and error purging.

Files: src/layers/layer4-cleanup.ts, src/cleanup/
Hook: tool.execute.after (alongside L1)

| Strategy | Behavior |
|----------|----------|
| Dedup | Identical consecutive tool calls collapsed to one |
| Error Purge | Stale errors purged after N turns (default 5) |
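Both cleanup strategies are simple enough to sketch. The ToolCall shape and the function names below are hypothetical. Two assumptions go beyond what the README states: the first call of a duplicate run is the one kept, and an error counts as stale once it is at least N turns old.

```typescript
// Hypothetical sketch of L4 cleanup; the ToolCall shape is an assumption.
interface ToolCall {
  tool: string;
  input: string;
  output: string;
  isError: boolean;
  turn: number; // conversation turn in which the call happened
}

// Collapse runs of identical consecutive tool calls, keeping the first of each run.
function dedupConsecutive(calls: ToolCall[]): ToolCall[] {
  const result: ToolCall[] = [];
  for (const call of calls) {
    const prev = result[result.length - 1];
    const same =
      prev !== undefined &&
      prev.tool === call.tool &&
      prev.input === call.input &&
      prev.output === call.output;
    if (!same) result.push(call);
  }
  return result;
}

// Drop error results that are at least `turns` turns old; keep everything else.
function purgeStaleErrors(calls: ToolCall[], currentTurn: number, turns = 5): ToolCall[] {
  return calls.filter((c) => !c.isError || currentTurn - c.turn < turns);
}
```

Only consecutive duplicates are collapsed, so a repeated command that is separated by other calls survives, since its output may reflect changed state.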

Commands

All commands are prefixed with /up:

| Command | Description |
|---------|-------------|
| /up | Toggle compression on/off |
| /up enable | Enable compression |
| /up disable | Disable compression |
| /up stats | Full layer-by-layer statistics |
| /up config | View current persistent config (JSON) |
| /up config k=v | Set config (e.g. notification=detailed) |
| /up schema | Print JSON config schema |
| /up notify off | Disable notifications |
| /up notify minimal | Show only major events (default) |
| /up notify detailed | Show per-layer notifications |

Stats Example

UltraPress Session Stats
========================
Total Messages  : 140
Sent Messages   : 55
Received        : 85
Total Original  : 670.0k
Total Compressed: 261.9k
Overall Savings : 408.1k (+61%)

-- Layer Breakdown --
L1 (Output Filter)
  120 tools filtered, saved 46.2k
L2 (Semantic)
  30 messages, saved 175.9k
L3 (Summary)
  95 compressions, saved 40.6M (accumulated)
L4 (Cleanup)
  25 tools deduped, saved 61.5k
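The headline figures in this example follow from simple arithmetic: 670.0k - 261.9k = 408.1k saved, which is about 61% of the original. A small sketch of that calculation, with hypothetical helper names:

```typescript
// Hypothetical helpers reproducing the arithmetic behind the summary lines.
function formatTokens(n: number): string {
  if (n >= 1000000) return (n / 1000000).toFixed(1) + "M";
  if (n >= 1000) return (n / 1000).toFixed(1) + "k";
  return String(n);
}

function savings(original: number, compressed: number): { saved: number; percent: number } {
  const saved = original - compressed;
  // Guard against division by zero on an empty session.
  const percent = original > 0 ? Math.round((saved / original) * 100) : 0;
  return { saved, percent };
}

const s = savings(670000, 261900);
console.log("Overall Savings : " + formatTokens(s.saved) + " (+" + s.percent + "%)");
```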

Hook Reference

| Hook | Handles |
|------|---------|
| command.execute.before | /up commands, config persistence, version check |
| tool.execute.after | L1 output filter + L4 cleanup |
| chat.message | L2 semantic + L3 DCP compression, token sync, post-command suppression |
| experimental.chat.messages.transform | Strip [analyze-mode]/[search-mode] artifacts |
| experimental.session.compacting | Inject protected context during compaction |

Configuration

Full reference: docs/konfigurasi-lengkap.md (Bahasa Indonesia)

Config is auto-created at ~/.config/opencode/ultrapress.plugin.json on first run.

Defaults

{
  "enabled": true,
  "enableDebug": false,
  "notification": "minimal",
  "outputFilter": {
    "enabled": true, "maxCharsPerOutput": 6000, "teeSaveOnTruncate": true,
    "skipTools": ["task"], "customFilters": []
  },
  "semantic": {
    "enabled": true, "mode": "nlp", "model": "Xenova/all-MiniLM-L6-v2",
    "compressUserMessages": true, "compressAssistantMessages": false,
    "compressToolOutputs": true, "protectCodeBlocks": true,
    "protectErrors": true, "minLengthChars": 250, "skipTools": ["task"]
  },
  "summarization": {
    "enabled": true, "preserveLastN": 4, "scoreThreshold": 0.45,
    "batchThreshold": 5, "maxBatchSize": 12, "contextLines": 3, "memoryLimit": 300
  },
  "cleanup": {
    "enabled": true,
    "deduplication": { "enabled": true },
    "purgeErrors": { "enabled": true, "turns": 5 }
  },
  "commands": { "enabled": true }
}

/up config outputFilter.maxCharsPerOutput=4000
/up config semantic.mode=llm
/up config summarization.preserveLastN=6
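A dotted key=value assignment like the ones above can be applied to a nested config object along these lines. This is a sketch under assumptions: the function names and the scalar coercion rules are hypothetical, unknown keys are rejected here although the plugin may behave differently, and writing the result back to ultrapress.plugin.json is left out.

```typescript
// Hypothetical sketch of applying "section.key=value" to a nested config.
type ConfigValue = string | number | boolean | ConfigObject | ConfigValue[];
interface ConfigObject { [key: string]: ConfigValue }

// Coerce "true"/"false" and numeric strings; leave everything else as a string.
function parseScalar(raw: string): string | number | boolean {
  if (raw === "true") return true;
  if (raw === "false") return false;
  const num = Number(raw);
  return raw.trim() !== "" && !isNaN(num) ? num : raw;
}

function setConfigValue(config: ConfigObject, assignment: string): void {
  const eq = assignment.indexOf("=");
  if (eq <= 0) throw new Error('expected key=value, got "' + assignment + '"');
  const keys = assignment.slice(0, eq).split(".");
  const value = parseScalar(assignment.slice(eq + 1));
  const leaf = keys.pop() ?? "";

  // Walk down to the section that owns the leaf key.
  let node: ConfigObject = config;
  for (const key of keys) {
    const child = node[key];
    if (typeof child !== "object" || Array.isArray(child)) {
      throw new Error('unknown config section "' + key + '"');
    }
    node = child;
  }
  if (!(leaf in node)) throw new Error('unknown config key "' + leaf + '"');
  node[leaf] = value;
}
```

Rejecting unknown keys catches typos such as semantic.mod=llm early instead of silently adding a field that nothing reads.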

Custom Tools

Two tools the LLM can invoke:

| Tool | Usage |
|------|-------|
| ultrapress_compress(mode, from_id, to_id, message_ids) | Manually compress a message range or specific IDs |
| ultrapress_expand(block_id, topic) | Expand a previously compressed block by ID or topic |
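To show how expansion by ID or by topic could work, here is a minimal in-memory sketch. The BlockStore class, its method signatures, the block ID format, and the substring-based topic match are all hypothetical; the actual tools take the parameters listed in the table, and their storage format is not described here.

```typescript
// Hypothetical in-memory store for compressed blocks; not the plugin's storage layer.
interface CompressedBlock {
  id: string;
  topic: string;
  summary: string;
  originals: string[]; // original messages kept for later expansion
}

class BlockStore {
  private blocks = new Map<string, CompressedBlock>();
  private nextId = 1;

  // Store the originals and return the ID of the new block.
  compress(topic: string, originals: string[], summary: string): string {
    const id = "b" + this.nextId++;
    this.blocks.set(id, { id, topic, summary, originals: originals.slice() });
    return id;
  }

  // Look up by exact block ID, or else by case-insensitive topic substring.
  expand(query: { blockId?: string; topic?: string }): string[] | undefined {
    if (query.blockId !== undefined) {
      return this.blocks.get(query.blockId)?.originals;
    }
    const needle = (query.topic ?? "").toLowerCase();
    if (needle === "") return undefined;
    const matches: CompressedBlock[] = [];
    this.blocks.forEach((block) => {
      if (block.topic.toLowerCase().indexOf(needle) !== -1) matches.push(block);
    });
    return matches[0]?.originals; // first match in insertion order
  }
}
```

The topic lookup gives the model a way back to the originals when it remembers what a block was about but not its ID.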

Development

Source Map

src/
  index.ts                  Plugin entry, hook registration
  commands/slash.ts         /up command handler, stats, config setter
  config/                   Schema, defaults, validation
  layers/                   4 compression layer files
  caveman/                  L2: NLP/MLM/LLM implementations
  dcp/                      L3: scorer, pruner, compression, storage, tools
  cleanup/                  L4: dedup, error purge
  filters/                  L1: bash, git, fs, test, generic filters
  utils/                    Logger, token counter

Commands

npm install       Install deps (npm lockfile)
npm run lint      Typecheck (tsc --noEmit)
npm test          Unit tests (bun test)
npm run build     Build dist/ (tsup, ESM + d.ts)
npm run benchmark Performance benchmarks

CI/CD

| Workflow | Trigger | Steps |
|----------|---------|-------|
| CI | PR / push | lint -> test -> build |
| Publish | Tag v* | CI gates -> npm publish -> GitHub Release |

The version is synced from the git tag, with the leading v stripped. Do not publish manually.
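The tag-to-version rule amounts to a one-line transformation. The helper below is a hypothetical illustration, including its validation pattern; the actual workflow script is not shown in this README.

```typescript
// Hypothetical helper: derive the package version from a release tag.
function versionFromTag(tag: string): string {
  const version = tag.replace(/^v/, ""); // strip a single leading "v"
  // Accept MAJOR.MINOR.PATCH with an optional pre-release suffix.
  if (!/^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/.test(version)) {
    throw new Error('tag "' + tag + '" is not a valid version tag');
  }
  return version;
}
```

For example, the tag v0.2.16 yields the package version 0.2.16.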

Troubleshooting

| Issue | Fix |
|-------|-----|
| Plugin not loaded | opencode config get plugin to verify. Rebuild if local |
| /up not working | Restart OpenCode. Check plugin registration |
| L3 not compressing | summarization.enabled must be true. Batch must reach threshold |
| L2 MLM/LLM failing | Fall back to nlp mode. Install @huggingface/transformers |
| Caching issues | opencode plugin remove @tarquinen/opencode-dcp then reinstall |
