
thincontext

Drop-in TypeScript middleware that compresses LLM context before it hits the API.

Every agent re-sends the same file reads, system prompts, and tool outputs on every turn. Thincontext sits in the middle and removes the redundancy — transparently, without changing your message format.

Agent → ContextCompressor.compress(messages) → LLM API

Node.js ≥ 18 · TypeScript · ESM + CJS


Install

npm install thincontext

Quick start

import { ContextCompressor } from 'thincontext'

const compressor = new ContextCompressor()

const { messages, stats } = await compressor.compress(myMessages)

console.log(`${stats.savedTokens} tokens saved (${((1 - stats.compressionRatio) * 100).toFixed(1)}%)`)

Zero configuration is needed for the default hash-based dedup behaviour. Add embed and summarize to unlock the full pipeline.


What it does

Five compression stages can run in sequence on every compress() call:

| Stage | What it does | Requires |
|---|---|---|
| Summarizer | Decays old conversation turns: verbatim → summary → dropped | summarize fn |
| Deduplicator | Skips system/tool content the LLM already saw this session | nothing (hash) or embed fn (semantic) |
| Chunker | Extracts only relevant lines from large code/document context | embed fn |
| ReferenceCompressor | Replaces repeated large blocks with short [ref:...] tokens | nothing |
| BudgetManager | Drops lowest-priority messages to fit a hard token budget | nothing |

Each module only activates when its dependencies are provided — the compressor degrades gracefully.
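
As a minimal sketch of that gradual activation; the embed and summarize function shapes below are assumptions, not the published types:

import { ContextCompressor } from 'thincontext'

// Assumed shapes: embed maps texts to vectors, summarize maps text to shorter text.
const embed = async (texts: string[]): Promise<number[][]> => {
  // call your embedding model here; zero vectors keep the sketch self-contained
  return texts.map(() => new Array(384).fill(0))
}

const summarize = async (text: string): Promise<string> => {
  // call an LLM here; truncation is only a stand-in
  return text.slice(0, 200)
}

// No deps: hash dedup, ReferenceCompressor and BudgetManager are available.
const basic = new ContextCompressor()

// Adding embed also activates semantic dedup and the Chunker.
const withEmbeddings = new ContextCompressor({ embed })

// Adding summarize as well means all five stages can run.
const fullPipeline = new ContextCompressor({ embed, summarize })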

In practice

In a typical coding agent session where the same files are read across multiple turns:

  • first read of a large file: content is normalised and passed through
  • subsequent turns: the full file content can be replaced with a short reference or duplicate marker
  • older conversation turns (if summarize is configured): progressively compressed to short summaries, then dropped
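
A sketch of that repeated-read pattern; the message shape is borrowed from the priorities example below, and which messages qualify for dedup (roles, window) is an assumption:

import { ContextCompressor } from 'thincontext'

const compressor = new ContextCompressor()
const fileBody = 'export const add = (a: number, b: number) => a + b\n'.repeat(200)

// First turn: the file content is normalised and passed through.
const turn1 = await compressor.compress([
  { role: 'user', content: `Read src/math.ts:\n${fileBody}` },
])

// Later turn: the same read can come back as a short reference or duplicate marker,
// because compressor state persists across compress() calls.
const turn2 = await compressor.compress([
  { role: 'user', content: `Read src/math.ts:\n${fileBody}` },
])

console.log(turn1.stats.savedTokens, turn2.stats.savedTokens)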

In real Pi testing, thincontext produced meaningful savings on repeated tool-heavy turns, but not on every turn.


Important: savings are opportunistic, not guaranteed

Thincontext does not guarantee token savings on every turn.

A Pi footer like:

🗜 -0% chars

can be completely normal even when the extension is installed and working.

Helps most when

  • the agent reads the same files repeatedly across turns
  • the agent produces the same or very similar tool output multiple times
  • there are large tool results that exceed the truncation limit
  • repeated outputs are old enough to pass the dedup window

Helps less when

  • most output is new and unique
  • the session is dominated by fresh writes/edits
  • the repeated content is still too recent to deduplicate
  • tool outputs are already short
  • protected modification history must remain visible

Why you may see 0% savings

Some turns consist mostly of:

  • one-off bash output
  • fresh read results
  • recent edit / write operations
  • unique install logs or error logs

In those cases, thincontext may correctly decide that there is little or nothing safe to compress.
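
Callers can detect that case from the stats object shown in the quick start; the sample content here is purely illustrative:

import { ContextCompressor } from 'thincontext'

const compressor = new ContextCompressor()
const oneOffBashOutput = '$ npm ci\nadded 212 packages in 4s'

const { messages, stats } = await compressor.compress([
  { role: 'user', content: oneOffBashOutput },
])

// savedTokens of 0 just means nothing qualified this turn; messages pass through unchanged.
if (stats.savedTokens === 0) console.log('no compression this turn (expected sometimes)')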


Options

new ContextCompressor({
  budget: 8000,
  embed: myEmbedFn,
  summarize: mySummarizeFn,
  countTokens: myTokenFn,

  dedup: {
    strategy: 'hash',
    threshold: 0.92,
    maxVectors: 5000,
  },

  summarization: {
    keepLastFull: 5,
    summarizeBeyond: 10,
  },

  chunking: {
    maxLines: 50,
    contextLines: 5,
    minLines: 100,
  },
})

Adapters

Adapters ship as separate entrypoints — zero impact on the core bundle if unused.

Embedding

import { openaiEmbed } from 'thincontext/embeddings/openai'
import { localEmbed } from 'thincontext/embeddings/local'

const compressor = new ContextCompressor({
  embed: openaiEmbed({ apiKey: process.env.OPENAI_API_KEY! }),
  // or: embed: await localEmbed()
})

Summarization

import { anthropicSummarize } from 'thincontext/summarize/anthropic'
import { openaiSummarize } from 'thincontext/summarize/openai'

const compressor = new ContextCompressor({
  summarize: anthropicSummarize({ apiKey: process.env.ANTHROPIC_API_KEY! }),
})

Message conversion

import { fromOpenAI } from 'thincontext/adapters/openai'
import { fromAnthropic } from 'thincontext/adapters/anthropic'
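
Usage is along these lines; this is a sketch, and the exact adapter signatures are an assumption, not documented here:

import { ContextCompressor } from 'thincontext'
import { fromOpenAI } from 'thincontext/adapters/openai'

const compressor = new ContextCompressor()

// Assumption: fromOpenAI maps an OpenAI-style chat history to thincontext messages.
const { messages } = await compressor.compress(
  fromOpenAI([
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Summarise this repo.' },
  ]),
)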

Message priorities

Tag messages to control how BudgetManager handles token pressure:

const messages = [
  { role: 'system', content: 'You are...', priority: 'critical' },
  { role: 'user', content: ragChunk, priority: 'low' },
  { role: 'assistant', content: lastReply, priority: 'high' },
]

Priorities: 'critical' · 'high' · 'normal' · 'low'
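
Under a hard budget, the lowest-priority messages are the first to go, per the BudgetManager stage above. A sketch, with illustrative content and an assumed drop order among equal priorities:

import { ContextCompressor } from 'thincontext'

const tight = new ContextCompressor({ budget: 500 })

const { messages } = await tight.compress([
  { role: 'system', content: 'You are a coding agent.', priority: 'critical' }, // kept under pressure
  { role: 'user', content: 'retrieved docs...'.repeat(400), priority: 'low' },  // first to be dropped
  { role: 'assistant', content: 'Previous reply.', priority: 'high' },
])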


Session persistence

State (seen hashes, summary cache, ref table) lives in memory and survives across compress() calls.

const snapshot = compressor.export()
const compressor2 = ContextCompressor.restore(snapshot, { budget: 8000 })
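
To carry a session across process restarts, the snapshot can presumably be written to disk; that it is plain, JSON-serialisable data is an assumption:

import { writeFile, readFile } from 'node:fs/promises'
import { ContextCompressor } from 'thincontext'

const compressor = new ContextCompressor()

// Assumption: export() returns plain JSON-safe data.
await writeFile('.thincontext.json', JSON.stringify(compressor.export()))

// Later, in a new process:
const saved = JSON.parse(await readFile('.thincontext.json', 'utf8'))
const restored = ContextCompressor.restore(saved, { budget: 8000 })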

Integrations

Pi agent

Install as a Pi package — the extension is bundled inside the npm package:

pi install npm:thincontext

Or add to your ~/.pi/agent/settings.json:

{
  "packages": ["npm:thincontext"]
}

The extension hooks Pi's context event to compress messages before every LLM call, with tool result deduplication and a live footer:

🗜 -72% chars

Commands inside Pi:

/thincontext on|off|reset|budget <n>|lines <n>|dedup-after <turns>|debug
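
For example, to set the token budget to 8000:

/thincontext budget 8000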

Pi-specific notes

Current defaults are conservative:

  • maxToolLines = 300
  • dedupAfterTurns = 2
  • recent edit/write tool results are protected from budget dropping

Known limitations:

  • bash writes such as sed -i or echo > file are not reliably detected as modification records
  • truncation can hide important information that appears late in very long output
  • token estimates shown by the extension are approximate; Pi's own usage counters are more trustworthy
  • a given turn may show no savings even when the extension is working correctly

Claude Code

No context interception hook exists in Claude Code's interactive CLI — there is no equivalent to Pi's context event that fires before each LLM call.

The thincontext library still works for custom SDK/wrapper workflows, but a true drop-in Claude Code CLI plugin equivalent to the Pi extension is not currently possible with the available integration surface.
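
A hedged sketch of that wrapper route using the Anthropic SDK directly; the model name and the assumption that compressed messages remain valid SDK messages are mine, not the library's:

import Anthropic from '@anthropic-ai/sdk'
import { ContextCompressor } from 'thincontext'

const client = new Anthropic()
const compressor = new ContextCompressor({ budget: 8000 })

async function send(history: { role: 'user' | 'assistant'; content: string }[]) {
  // Compress the running history before every API call.
  const { messages, stats } = await compressor.compress(history)
  console.log(`saved ~${stats.savedTokens} tokens this turn`)
  return client.messages.create({
    model: 'claude-3-5-sonnet-latest', // pick your model
    max_tokens: 1024,
    messages: messages as { role: 'user' | 'assistant'; content: string }[],
  })
}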


Token counting for Claude

The built-in token estimates are based on cl100k_base, GPT-4's tokenizer; Claude's tokenizer differs, so expect some variance. See docs/token-counting.md for guidance on supplying a custom counter.
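
A rough custom counter can be plugged in via the countTokens option shown earlier; the (text: string) => number signature is an assumption:

import { ContextCompressor } from 'thincontext'

// Crude heuristic (~4 characters per token); swap in a real tokenizer for accuracy.
const compressor = new ContextCompressor({
  countTokens: (text: string) => Math.ceil(text.length / 4),
})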


What this is not

  • not an LLM proxy
  • not a RAG system
  • not model-specific
  • not a browser library

Publishing

The repo includes a GitLab pipeline that:

  • runs typecheck/tests on pushes
  • publishes to npm on version tags like v1.0.0

After publish, users can install with:

npm install thincontext

or in Pi:

pi install npm:thincontext

Development

npm ci
npm run typecheck
npm test
npm run build

License

MIT