openclaw-plugin-tokenranger v2026.3.2
openclaw-plugin-tokenranger
TokenRanger is a community plugin for OpenClaw that compresses session context through a local SLM (via Ollama) before sending to cloud LLMs — reducing input token costs by 50–80%.
Table of contents
- How it works
- Requirements
- Install
- Configuration
- Commands
- Upgrading
- Uninstalling
- Performance
- Graceful degradation
- Contributing
How it works
```
User message → OpenClaw gateway
  → before_agent_start hook
  → Turn 1? Skip (full fidelity for first message)
  → Turn 2+: strip code blocks, send history to localhost:8100/compress
  → FastAPI sidecar runs LangChain LCEL chain (Ollama)
  → Compressed summary returned as prependContext
  → Cloud LLM receives compressed context instead of full history
```

Inference strategy is auto-selected based on GPU availability:
| Strategy | Trigger | Model | Approach |
|---|---|---|---|
| full | GPU available | mistral:7b | Deep semantic summarization |
| light | CPU only | phi3.5:3b | Extractive bullet points |
| passthrough | Ollama unreachable | — | Truncate to last 20 lines, no compression |
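The selection rule in the table is simple enough to state directly. A minimal sketch, assuming the three strategy names from the table; the function and parameter names here are illustrative, not part of the plugin's actual API:

```python
def choose_strategy(ollama_reachable: bool, gpu_available: bool) -> str:
    """Mirror the auto-selection table: passthrough when Ollama is down,
    deep summarization ("full") on GPU, extractive "light" mode on CPU.
    Illustrative sketch only -- not the plugin's real internals."""
    if not ollama_reachable:
        return "passthrough"
    return "full" if gpu_available else "light"
```

Setting `compressionStrategy` to anything other than `auto` (see Configuration below) bypasses this selection entirely.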
Requirements
- OpenClaw ≥ 2026.2.0 (install guide)
- Ollama installed and running locally (ollama.com)
- Python 3.10+ (for the FastAPI compression sidecar)
Install
1. Install the plugin
```
openclaw plugins install openclaw-plugin-tokenranger
```

To pin an exact version (recommended for production):

```
openclaw plugins install openclaw-plugin-tokenranger@2026.3.2 --pin
```
2. Run first-time setup
```
openclaw tokenranger setup
```

`setup` does the following automatically:

- Pulls the required Ollama models (`mistral:7b` + `phi3.5:3b`)
- Creates a Python virtualenv and installs FastAPI/LangChain deps
- Registers the TokenRanger sidecar as a system service:
  - Linux: systemd user unit (`tokenranger.service`)
  - macOS: launchd agent (`com.peterjohannmedina.tokenranger.plist`)
- Starts the sidecar on `localhost:8100`
3. Restart the gateway

```
openclaw gateway restart
```

4. Verify

```
openclaw tokenranger
```

You should see your current settings and sidecar status (reachable/unreachable).
Manual sidecar start (if needed)
If the system service didn't register, you can start the sidecar directly:
```
# Linux / macOS
~/.openclaw/extensions/tokenranger/service/start.sh
```

Configuration
After install, configure under plugins.entries.tokenranger.config in your openclaw.json
(edit via openclaw config set plugins.entries.tokenranger.config.<key> <value>):
| Key | Default | Description |
|---|---|---|
| serviceUrl | http://127.0.0.1:8100 | TokenRanger FastAPI sidecar URL |
| timeoutMs | 10000 | Max wait per request before fallthrough |
| minPromptLength | 500 | Min chars of history before compressing |
| ollamaUrl | http://127.0.0.1:11434 | Ollama API base URL |
| preferredModel | mistral:7b | Model used in full GPU strategy |
| compressionStrategy | auto | auto / full / light / passthrough |
| inferenceMode | auto | auto / cpu / gpu / remote |
Example — force CPU-only light mode:
```
openclaw config set plugins.entries.tokenranger.config.compressionStrategy light
openclaw config set plugins.entries.tokenranger.config.inferenceMode cpu
openclaw gateway restart
```

Commands
| Command | Description |
|---|---|
| /tokenranger | Show current settings and sidecar health |
| /tokenranger mode gpu | Force GPU (full) compression strategy |
| /tokenranger mode cpu | Force CPU (light) compression strategy |
| /tokenranger mode off | Disable compression (passthrough) |
| /tokenranger model | List available Ollama models |
| /tokenranger toggle | Enable / disable the plugin |
Upgrading
TokenRanger follows calendar versioning (YYYY.M.D[-patch.N]), matching the OpenClaw release cadence.
Check for updates
```
openclaw plugins update tokenranger --dry-run
```

This shows the available version without applying anything.
Apply an update
```
openclaw plugins update tokenranger
openclaw tokenranger setup    # re-runs sidecar setup if service files changed
openclaw gateway restart
```

Note: `setup` is idempotent — it only pulls new models or reinstalls deps if versions have changed. It will not wipe your existing config.
Pin to a specific version
If you want to lock to a known-good release:
```
openclaw plugins install openclaw-plugin-tokenranger@<version> --pin
openclaw tokenranger setup
openclaw gateway restart
```

To see all published versions:

```
npm view openclaw-plugin-tokenranger versions --json
```

After major OpenClaw upgrades
Check CHANGELOG.md in this repo for any breaking config key renames or sidecar API changes before upgrading TokenRanger across a major OpenClaw version bump.
Uninstalling
```
openclaw plugins uninstall tokenranger
openclaw gateway restart
```

This removes the plugin from config and deletes its install directory. The Python sidecar and system service are left in place — to fully remove:
```
# Linux
systemctl --user stop tokenranger && systemctl --user disable tokenranger
rm ~/.config/systemd/user/tokenranger.service

# macOS
launchctl unload ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist
rm ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist
```

Performance
5-turn Discord conversation benchmark (GPU, mistral:7b-instruct):
| Turn | Input tokens | Compressed | Reduction | Latency |
|---|---|---|---|---|
| 2 | 732 | 125 | 82.9% | 1,086 ms |
| 3 | 1,180 | 150 | 87.3% | 1,375 ms |
| 4 | 1,685 | 212 | 87.4% | 1,960 ms |
| 5 | 2,028 | 277 | 86.3% | 2,420 ms |
Cumulative: 5,866 → 885 tokens (84.9% reduction), ~1.6s avg/turn.
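The reduction percentages follow directly from the token counts; a quick sanity check (the cumulative 5,866 total presumably also counts the full-fidelity first turn, which the per-turn table omits):

```python
# Token counts from the benchmark table above: turn -> (input, compressed).
turns = {2: (732, 125), 3: (1180, 150), 4: (1685, 212), 5: (2028, 277)}

for turn, (raw, compressed) in turns.items():
    reduction = (1 - compressed / raw) * 100
    print(f"turn {turn}: {reduction:.1f}% reduction")

# The stated cumulative figures (5,866 -> 885 tokens) give the overall number:
overall = (1 - 885 / 5866) * 100
print(f"overall: {overall:.1f}%")   # 84.9%
```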
CPU (phi3.5:3b-mini) benchmarks TBD.
Graceful degradation
TokenRanger never breaks your chat. At every failure point there's a silent fallthrough:
- Sidecar unreachable → passthrough (message sent to cloud LLM uncompressed)
- Ollama timeout → passthrough
- Compression returns empty string → original message used
- Plugin disabled → zero overhead, standard OpenClaw routing
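The fallthrough points above amount to one rule: any failure returns the original history. A minimal sketch of that wrapper, assuming a `/compress` endpoint that returns a JSON body with a `summary` field (both are assumptions about the sidecar API, not documented contract):

```python
import json
import urllib.error
import urllib.request

def compress_or_passthrough(history: str,
                            url: str = "http://127.0.0.1:8100/compress",
                            timeout_s: float = 10.0) -> str:
    """Return the compressed summary, or the original history on any
    failure (sidecar down, timeout, empty result) -- never raise into
    the chat path. Hypothetical sketch of the plugin's fallthrough."""
    try:
        req = urllib.request.Request(
            url,
            data=json.dumps({"history": history}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            summary = json.loads(resp.read()).get("summary", "")
        return summary or history      # empty compression -> original used
    except (OSError, ValueError):      # URLError is an OSError subclass
        return history                 # unreachable / timeout -> passthrough
```

Because every exception path returns `history` unchanged, a dead sidecar degrades to exactly the uncompressed behavior you would have without the plugin.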
Contributing
Issues and PRs welcome: https://github.com/peterjohannmedina/openclaw-plugin-tokenranger
For discussion, find us in the OpenClaw Discord.
Release process (maintainers)
See CONTRIBUTING.md for the full release and versioning workflow.
License
MIT — see LICENSE
