openclaw-plugin-tokenranger

v2026.3.2

TokenRanger — context compression plugin for OpenClaw. Reduces cloud LLM token costs by 50-80% via local SLM summarization (Ollama).

openclaw-plugin-tokenranger

TokenRanger is a community plugin for OpenClaw that compresses session context with a local small language model (SLM), served by Ollama, before the context is sent to a cloud LLM. This reduces input token costs by 50–80%.



How it works

User message → OpenClaw gateway
  → before_agent_start hook
  → Turn 1? Skip (full fidelity for first message)
  → Turn 2+: strip code blocks, send history to localhost:8100/compress
  → FastAPI sidecar runs LangChain LCEL chain (Ollama)
  → Compressed summary returned as prependContext
  → Cloud LLM receives compressed context instead of full history
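
The flow above can be sketched in a few lines of Python. This is only an illustration of the turn check and code-block stripping, not the plugin's actual hook: the names `strip_code_blocks` and `build_prepend_context` are hypothetical, and the `compress` callable stands in for the request to the sidecar.

```python
import re
from typing import Callable, Optional

# Three backticks, built programmatically so this example can sit inside a fence.
FENCE = "`" * 3
# Matches a fenced code block, including both fences.
CODE_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)


def strip_code_blocks(text: str) -> str:
    """Remove fenced code blocks so only prose is sent for summarization."""
    return CODE_BLOCK.sub("", text)


def build_prepend_context(
    turn: int,
    history: list[str],
    compress: Callable[[str], str],
) -> Optional[str]:
    """Return compressed context for turn 2+, or None to keep full history."""
    if turn <= 1:
        return None  # first message is passed through at full fidelity
    stripped = strip_code_blocks("\n".join(history))
    return compress(stripped)  # hypothetical stand-in for the sidecar call
```

Passing `compress` in as a callable keeps the sketch independent of the sidecar's request format, which this README does not document.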

Inference strategy is auto-selected based on GPU availability:

| Strategy | Trigger | Model | Approach |
|---|---|---|---|
| full | GPU available | mistral:7b | Deep semantic summarization |
| light | CPU only | phi3.5:3b | Extractive bullet points |
| passthrough | Ollama unreachable | — | Truncate to last 20 lines, no compression |
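
As a rough sketch of the selection rules in the table, the decision could look like the following. The function names, the `Strategy` type, and the order of the checks are assumptions made for illustration; how the plugin actually detects a GPU is not described here, so it is passed in as a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    model: Optional[str]  # None when no model is involved


def select_strategy(ollama_reachable: bool, gpu_available: bool) -> Strategy:
    """Pick a strategy following the table: passthrough, full, or light."""
    if not ollama_reachable:
        return Strategy("passthrough", None)
    if gpu_available:
        return Strategy("full", "mistral:7b")
    return Strategy("light", "phi3.5:3b")


def passthrough(history: str, keep_lines: int = 20) -> str:
    """Truncate to the last `keep_lines` lines without compressing."""
    return "\n".join(history.splitlines()[-keep_lines:])
```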


Requirements

  • OpenClaw ≥ 2026.2.0 (install guide)
  • Ollama installed and running locally (ollama.com)
  • Python 3.10+ (for the FastAPI compression sidecar)

Install

1. Install the plugin

openclaw plugins install openclaw-plugin-tokenranger

To pin an exact version (recommended for production):

openclaw plugins install openclaw-plugin-tokenranger@<version> --pin

2. Run first-time setup

openclaw tokenranger setup

setup does the following automatically:

  • Pulls the required Ollama models (mistral:7b + phi3.5:3b)
  • Creates a Python virtualenv and installs FastAPI/LangChain deps
  • Registers the TokenRanger sidecar as a system service:
    • Linux: systemd user unit (tokenranger.service)
    • macOS: launchd agent (com.peterjohannmedina.tokenranger.plist)
  • Starts the sidecar on localhost:8100

3. Restart the gateway

openclaw gateway restart

4. Verify

openclaw tokenranger

You should see your current settings and sidecar status (reachable/unreachable).
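
If you want to check reachability yourself, a plain TCP probe against the sidecar's default address is enough to tell whether anything is listening. This is a minimal sketch and not the check the plugin performs; the function name `sidecar_reachable` is hypothetical, and it makes no assumption about the sidecar's HTTP endpoints.

```python
import socket


def sidecar_reachable(host: str = "127.0.0.1", port: int = 8100, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

A successful connection only shows that some process is listening on the port, so treat it as a quick first check rather than a health check.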

Manual sidecar start (if needed)

If the system service didn't register, you can start the sidecar directly:

# Linux / macOS
~/.openclaw/extensions/tokenranger/service/start.sh

Configuration

After install, configure under plugins.entries.tokenranger.config in your openclaw.json (edit via openclaw config set plugins.entries.tokenranger.config.<key> <value>):

| Key | Default | Description |
|---|---|---|
| serviceUrl | http://127.0.0.1:8100 | TokenRanger FastAPI sidecar URL |
| timeoutMs | 10000 | Max wait per request before fallthrough |
| minPromptLength | 500 | Min chars of history before compressing |
| ollamaUrl | http://127.0.0.1:11434 | Ollama API base URL |
| preferredModel | mistral:7b | Model used in full GPU strategy |
| compressionStrategy | auto | auto / full / light / passthrough |
| inferenceMode | auto | auto / cpu / gpu / remote |

Example — force CPU-only light mode:

openclaw config set plugins.entries.tokenranger.config.compressionStrategy light
openclaw config set plugins.entries.tokenranger.config.inferenceMode cpu
openclaw gateway restart
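
To make the defaults and allowed values concrete, here is a small sketch that merges user overrides onto the defaults from the table and rejects unknown keys or values. The helper names `resolve_config` and `should_compress` are hypothetical, and treating `minPromptLength` as an inclusive threshold is an assumption.

```python
# Defaults and allowed values copied from the configuration table.
DEFAULTS = {
    "serviceUrl": "http://127.0.0.1:8100",
    "timeoutMs": 10000,
    "minPromptLength": 500,
    "ollamaUrl": "http://127.0.0.1:11434",
    "preferredModel": "mistral:7b",
    "compressionStrategy": "auto",
    "inferenceMode": "auto",
}

ALLOWED = {
    "compressionStrategy": {"auto", "full", "light", "passthrough"},
    "inferenceMode": {"auto", "cpu", "gpu", "remote"},
}


def resolve_config(overrides: dict) -> dict:
    """Merge overrides onto the defaults and validate enumerated keys."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    config = {**DEFAULTS, **overrides}
    for key, allowed in ALLOWED.items():
        if config[key] not in allowed:
            raise ValueError(f"{key} must be one of {sorted(allowed)}")
    return config


def should_compress(history: str, config: dict) -> bool:
    """Assumption: compress once history reaches minPromptLength characters."""
    return len(history) >= config["minPromptLength"]
```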

Commands

| Command | Description |
|---|---|
| /tokenranger | Show current settings and sidecar health |
| /tokenranger mode gpu | Force GPU (full) compression strategy |
| /tokenranger mode cpu | Force CPU (light) compression strategy |
| /tokenranger mode off | Disable compression (passthrough) |
| /tokenranger model | List available Ollama models |
| /tokenranger toggle | Enable / disable the plugin |
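
The three `mode` commands map onto the compression strategies named earlier. A minimal sketch of that mapping, with a hypothetical `strategy_for_command` helper that is not part of the plugin:

```python
# Mode argument -> compression strategy, as listed in the commands table.
MODE_TO_STRATEGY = {"gpu": "full", "cpu": "light", "off": "passthrough"}


def strategy_for_command(command: str) -> str:
    """Translate '/tokenranger mode <gpu|cpu|off>' into a strategy name."""
    parts = command.split()
    if len(parts) != 3 or parts[0] != "/tokenranger" or parts[1] != "mode":
        raise ValueError(f"not a mode command: {command!r}")
    try:
        return MODE_TO_STRATEGY[parts[2]]
    except KeyError:
        raise ValueError(f"unknown mode: {parts[2]!r}") from None
```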


Upgrading

TokenRanger follows calendar versioning (YYYY.M.D[-patch.N]), matching the OpenClaw release cadence.
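
For scripts that need to compare releases, the version scheme can be parsed into a sortable tuple. This is a sketch under two assumptions not stated in this README: a leading `v` is optional, and a `-patch.N` release sorts after its base version.

```python
import re
from typing import NamedTuple

# YYYY.M.D with an optional -patch.N suffix and an optional leading "v".
CALVER = re.compile(r"^v?(\d{4})\.(\d{1,2})\.(\d{1,2})(?:-patch\.(\d+))?$")


class CalVer(NamedTuple):
    year: int
    month: int
    day: int
    patch: int  # 0 when no -patch.N suffix is present


def parse_calver(version: str) -> CalVer:
    """Parse a YYYY.M.D[-patch.N] string into a comparable tuple."""
    match = CALVER.match(version)
    if match is None:
        raise ValueError(f"not a YYYY.M.D[-patch.N] version: {version!r}")
    year, month, day, patch = match.groups()
    return CalVer(int(year), int(month), int(day), int(patch or 0))
```

Comparing the parsed tuples avoids the string-ordering trap where `2026.10.1` would sort before `2026.9.30`.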

Check for updates

openclaw plugins update tokenranger --dry-run

This shows the available version without applying anything.

Apply an update

openclaw plugins update tokenranger
openclaw tokenranger setup   # re-runs sidecar setup if service files changed
openclaw gateway restart

Note: setup is idempotent — it only pulls new models or reinstalls deps if versions have changed. It will not wipe your existing config.

Pin to a specific version

If you want to lock to a known-good release:

openclaw plugins install openclaw-plugin-tokenranger@<version> --pin
openclaw tokenranger setup
openclaw gateway restart

To see all published versions:

npm view openclaw-plugin-tokenranger versions --json

After major OpenClaw upgrades

Check CHANGELOG.md in this repo for any breaking config key renames or sidecar API changes before upgrading TokenRanger across a major OpenClaw version bump.


Uninstalling

openclaw plugins uninstall tokenranger
openclaw gateway restart

This removes the plugin from config and deletes its install directory. The Python sidecar and system service are left in place — to fully remove:

# Linux
systemctl --user stop tokenranger && systemctl --user disable tokenranger
rm ~/.config/systemd/user/tokenranger.service

# macOS
launchctl unload ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist
rm ~/Library/LaunchAgents/com.peterjohannmedina.tokenranger.plist

Performance

5-turn Discord conversation benchmark (GPU, mistral:7b-instruct). Turn 1 is never compressed, so the table starts at turn 2:

| Turn | Input tokens | Compressed | Reduction | Latency |
|---|---|---|---|---|
| 2 | 732 | 125 | 82.9% | 1,086ms |
| 3 | 1,180 | 150 | 87.3% | 1,375ms |
| 4 | 1,685 | 212 | 87.4% | 1,960ms |
| 5 | 2,028 | 277 | 86.3% | 2,420ms |

Cumulative: 5,866 → 885 tokens (84.9% reduction), ~1.6s avg/turn.
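
The reduction figures can be reproduced from the token counts with a short calculation. The sketch below uses only the numbers printed above; `reduction_pct` is a hypothetical helper. Note that the four listed rows sum to 5,625 → 764 tokens, so the cumulative line presumably counts something the table does not list, possibly turn 1.

```python
# (turn, input tokens, compressed tokens) copied from the benchmark table.
ROWS = [
    (2, 732, 125),
    (3, 1180, 150),
    (4, 1685, 212),
    (5, 2028, 277),
]


def reduction_pct(original: int, compressed: int) -> float:
    """Percentage of tokens removed, rounded to one decimal place."""
    return round(100 * (1 - compressed / original), 1)


# Per-turn reductions, keyed by turn number.
per_turn = {turn: reduction_pct(inp, out) for turn, inp, out in ROWS}

# Totals over the rows that are actually listed in the table.
listed_input = sum(inp for _, inp, _ in ROWS)
listed_output = sum(out for _, _, out in ROWS)
```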

CPU (phi3.5:3b-mini) benchmarks TBD.


Graceful degradation

TokenRanger never breaks your chat. At every failure point there's a silent fallthrough:

  • Sidecar unreachable → passthrough (message sent to cloud LLM uncompressed)
  • Ollama timeout → passthrough
  • Compression returns empty string → original message used
  • Plugin disabled → zero overhead, standard OpenClaw routing
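
The fallthrough rules above amount to a wrapper that never lets a compression failure reach the user. A minimal sketch, assuming the sidecar call is wrapped in a `compress` callable that raises on failure; the name `compress_or_passthrough` is hypothetical, and treating a whitespace-only result as empty is an assumption.

```python
from typing import Callable


def compress_or_passthrough(message: str, compress: Callable[[str], str]) -> str:
    """Return the compressed text, or the original message on any failure."""
    try:
        compressed = compress(message)
    except Exception:
        # Sidecar unreachable, Ollama timeout, or any other error:
        # fall through silently and keep the original message.
        return message
    # An empty (or whitespace-only) result counts as a failed compression.
    return compressed if compressed.strip() else message
```

Catching every exception is deliberate here, since the stated goal is that a compression failure never interrupts the chat.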

Contributing

Issues and PRs welcome: https://github.com/peterjohannmedina/openclaw-plugin-tokenranger

For discussion, find us in the OpenClaw Discord.

Release process (maintainers)

See CONTRIBUTING.md for the full release and versioning workflow.


License

MIT — see LICENSE