@arvoretech/pi-smart-context

v0.2.0

Intelligent model routing and prompt compression for Pi/Kiro

Intelligent model routing and retrieval-augmented prompt compression for Pi/Kiro.

Features

Model Routing

Uses Haiku (fast, cheap) to classify task complexity based on the full conversation context, not just the current message. So a bare "bora" (Portuguese slang for "let's go") after a complex architecture discussion still routes to Opus.

| Classification | Model | When |
|---|---|---|
| trivial | claude-haiku-4-5 | Greetings, meta-conversation, no pending task |
| simple | claude-sonnet-4-6 | Single-file fixes, quick questions |
| medium | claude-sonnet-4-6 | Standard multi-file work (deterministic baseline) |
| complex | claude-opus-4-8 | Architecture, large refactors, security audits |
| Large context (>500K) | claude-sonnet-4-6 | 1M window needed |
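The mapping in the table can be sketched as a small lookup. This is only an illustration, not the extension's actual `router.ts`: the function name `pickModel`, the assumption that the 500K threshold is measured in tokens, the choice to let the large-context rule take precedence, and the fallback to the medium model for unknown labels are all hypothetical.

```typescript
// Hypothetical sketch: only the labels, model IDs, and the 500K figure
// come from the table; everything else is an assumption.
const MODEL_BY_LABEL = new Map<string, string>([
  ["trivial", "claude-haiku-4-5"],
  ["simple", "claude-sonnet-4-6"],
  ["medium", "claude-sonnet-4-6"],
  ["complex", "claude-opus-4-8"],
]);

// Assumed to be a token count; the README does not state the unit.
const LARGE_CONTEXT_THRESHOLD = 500_000;

function pickModel(label: string, contextTokens: number): string {
  // Assumed precedence: a very large context overrides the classification,
  // because the 1M window is needed regardless of task complexity.
  if (contextTokens > LARGE_CONTEXT_THRESHOLD) return "claude-sonnet-4-6";
  // Unknown or malformed labels fall back to the "medium" baseline model.
  return MODEL_BY_LABEL.get(label.trim().toLowerCase()) ?? "claude-sonnet-4-6";
}
```

For example, `pickModel("complex", 40_000)` selects the Opus model, while the same label with a 600,000-token context selects Sonnet.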

Retrieval-Augmented Compression

The core principle (from the prompt-compression literature): the model never loses access to information — it just pays less to carry it by default.

Compressed or dropped content is replaced by a summary plus a recover_context("id") hint. The original is kept in an in-memory store. If the model actually needs the detail, it calls the recover_context tool to pull back the full text. Because nothing is discarded for good, compression can be aggressive without permanently losing detail.
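A minimal sketch of the stub-and-recover idea, assuming a plain `Map` as the in-memory store. The helper names `stash` and `recoverContext`, the `ctx-N` id format, and the stub wording are hypothetical; only the `recover_context("id")` hint comes from the description above.

```typescript
// Hypothetical in-memory store: id -> original, uncompressed text.
const contentStore = new Map<string, string>();
let nextId = 0;

// Keep the original and return the short form the model will see instead.
function stash(original: string, summary: string): string {
  const id = `ctx-${++nextId}`;
  contentStore.set(id, original);
  return `${summary}\n[compressed; call recover_context("${id}") for the full text]`;
}

// What a recover_context tool handler might do with the id it receives.
function recoverContext(id: string): string {
  return contentStore.get(id) ?? `No stored content for id "${id}".`;
}
```

The stub costs a few dozen characters per turn, while the full text is only paid for on the turns where the model asks for it.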

Pipeline

| Stage | Technique | Safety |
|---|---|---|
| Tool output (structural) | Log folding, n-gram dedup, JSON tabularize, cross-turn delta | Lossless / near-lossless |
| BM25 relevance | Score old messages vs current query | n/a |
| Haiku summarization | Summarize old messages preserving load-bearing facts, cached by hash | Lossy but recoverable |
| Retrieval drop | Replace low-relevance content with stub + recover hint | Recoverable |
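To make the BM25 relevance stage concrete, here is a sketch of standard Okapi BM25 scoring of old messages against the current query. It is not taken from `bm25.ts`: the tokenizer, the function names, and the defaults `k1 = 1.2` and `b = 0.75` are common textbook choices assumed for illustration.

```typescript
// Simplistic tokenizer (an assumption): lowercase alphanumeric runs.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

// Returns one BM25 score per document; higher means more relevant to the query.
function bm25Scores(query: string, docs: string[], k1 = 1.2, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = docs.length;
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / Math.max(n, 1);

  // Document frequency: in how many documents each term appears.
  const df = new Map<string, number>();
  for (const doc of tokenized) {
    for (const term of Array.from(new Set(doc))) {
      df.set(term, (df.get(term) ?? 0) + 1);
    }
  }

  const queryTerms = Array.from(new Set(tokenize(query)));
  return tokenized.map((doc) => {
    // Term frequency within this document.
    const tf = new Map<string, number>();
    for (const term of doc) tf.set(term, (tf.get(term) ?? 0) + 1);

    let score = 0;
    for (const term of queryTerms) {
      const f = tf.get(term) ?? 0;
      if (f === 0) continue;
      const nt = df.get(term) ?? 0;
      const idf = Math.log(1 + (n - nt + 0.5) / (nt + 0.5));
      score += (idf * f * (k1 + 1)) / (f + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return score;
  });
}
```

Messages that share no terms with the query score exactly zero, which makes them natural candidates for the later summarization or retrieval-drop stages.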

Cache-aware (critical)

Anthropic/Kiro use prompt caching keyed by prefix. Compression that rewrites the context differently each turn would break the cache and increase cost.

Two protections:

  1. Runtime cache detection — the extension inspects the last assistant message's cacheRead/cacheWrite. If the provider is actively caching, lossy compression of the prefix is disabled (only safe structural compression of new tool output runs). No cache break, ever.
  2. Stable/monotonic compression: when cache is off, once a message is compressed the identical compressed form is reused on every subsequent turn. If caching is enabled later, the prefix changes only once; the cache rebuilds against that form and stays stable afterwards.

Note: at the time of writing, the Kiro provider reports cacheRead: 0 / cacheWrite: 0 across sessions — caching is effectively off, so compression is pure savings (the full context is re-billed every turn with no cache to break). The cache-detection path future-proofs the extension for when Kiro enables caching.
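The two protections could look roughly like the sketch below. The message shape (a `usage` object carrying `cacheRead` and `cacheWrite`) and the function names are assumptions made for illustration; the real Pi/Kiro message types may differ. The memo here is keyed by the original text for brevity, where a content hash would serve the same purpose.

```typescript
// Hypothetical message shape; only the cacheRead/cacheWrite names come from the README.
interface Usage { cacheRead?: number; cacheWrite?: number }
interface Message { role: "user" | "assistant" | "tool"; content: string; usage?: Usage }

// Protection 1: look at the most recent assistant message for cache activity.
function providerIsCaching(messages: Message[]): boolean {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === "assistant") {
      return (m.usage?.cacheRead ?? 0) > 0 || (m.usage?.cacheWrite ?? 0) > 0;
    }
  }
  return false; // no assistant message yet, so nothing indicates caching
}

// Protection 2: once a message has a compressed form, always reuse that exact form.
const compressedForms = new Map<string, string>();

function stableCompress(original: string, compress: (text: string) => string): string {
  const known = compressedForms.get(original);
  if (known !== undefined) return known;
  const result = compress(original);
  compressedForms.set(original, result);
  return result;
}
```

A caller would skip lossy prefix compression whenever `providerIsCaching` returns true, and route every lossy rewrite through `stableCompress` otherwise, so the prefix bytes never change twice.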

Aggressive quality gate

  • Last 4 turns never compressed (active working set)
  • Compression only applied if it saves >15%
  • Haiku summary only used if it beats the original by >15%; otherwise falls back to a recoverable stub
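The three rules above can be expressed as a small gate. This sketch assumes savings are measured in characters (the stats command reports chars saved) and that a "turn" is identified by its index in the history; the function and constant names are hypothetical.

```typescript
const PROTECTED_TURNS = 4; // last 4 turns are never compressed
const MIN_SAVINGS = 0.15;  // a candidate must save more than 15%

// Fraction of characters saved by using `candidate` instead of `original`.
function savings(original: string, candidate: string): number {
  return original.length === 0 ? 0 : 1 - candidate.length / original.length;
}

// Rules 1 and 2: keep recent turns intact, and only accept worthwhile compression.
function gate(turnIndex: number, totalTurns: number, original: string, candidate: string): string {
  if (turnIndex >= totalTurns - PROTECTED_TURNS) return original;
  return savings(original, candidate) > MIN_SAVINGS ? candidate : original;
}

// Rule 3: prefer the Haiku summary only if it clears the same bar,
// otherwise fall back to the recoverable stub.
function chooseOldMessageForm(original: string, summary: string, stub: string): string {
  return savings(original, summary) > MIN_SAVINGS ? summary : stub;
}
```

With ten turns of history, indices 6 through 9 are always returned unchanged, whatever the candidate looks like.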

Commands

  • /smart-context — Stats: chars saved, avg ratio, Haiku calls/cache hits, recoverable items

Architecture

src/
├── index.ts                      # Hooks + recover_context tool
├── router.ts                     # Haiku-based complexity classification
└── compression/
    ├── pipeline.ts               # Orchestrates stages, cache-stable, retrieval-augmented
    ├── store.ts                  # Content store for recover_context
    ├── haiku-summarize.ts        # Haiku summarizer with hash cache
    ├── types.ts
    └── stages/
        ├── bm25.ts               # BM25 relevance scoring
        ├── dedup.ts              # N-gram line deduplication
        ├── log-fold.ts           # Log error extraction + folding
        ├── json-compact.ts       # JSON array tabularization
        └── delta.ts              # Cross-turn delta compression
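As one example of what a structural stage might do, here is a sketch of JSON array tabularization: an array of objects with identical keys is rewritten as a header row plus value rows, so the keys are not repeated for every element. This is an assumed reading of the technique, not the contents of `json-compact.ts`; the function name and the tab-separated output format are hypothetical.

```typescript
// Returns a tab-separated table, or null when the rows are not uniform
// (in which case a caller would leave the JSON untouched).
function tabularize(rows: Array<Record<string, unknown>>): string | null {
  if (rows.length === 0) return null;
  const keys = Object.keys(rows[0]);
  const uniform = rows.every((row) => {
    const rowKeys = Object.keys(row);
    return (
      rowKeys.length === keys.length &&
      keys.every((key) => Object.prototype.hasOwnProperty.call(row, key))
    );
  });
  if (!uniform) return null;

  const header = keys.join("\t");
  // JSON.stringify keeps strings quoted, so value types stay distinguishable.
  const body = rows.map((row) => keys.map((key) => JSON.stringify(row[key])).join("\t"));
  return [header, ...body].join("\n");
}
```

For a long array of small records, most of the original characters are repeated key names and punctuation, which is where the savings come from.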

Usage

cd arvore-pi-extensions && pnpm install
cd packages/smart-context && pnpm build

Add to your Pi packages. The extension hooks into before_agent_start (routing), context (compression), and tool_result (structural tool-output compression).