
context-compact

v2.0.0
LLM context window compactor — summarizes old conversation history to free up context space without losing continuity.

The Problem

Long-running AI agent sessions accumulate thousands of messages. When the context window fills up, naive truncation drops old messages and the agent loses critical context: active tasks, decisions made, identifiers referenced. The conversation breaks.

context-compact solves this by summarizing old messages via your LLM of choice, then replacing them with a compact summary. The agent retains continuity while freeing up context space.

Install

npm install context-compact

Quick Start — Anthropic SDK

import Anthropic from "@anthropic-ai/sdk";
import { compactIfNeeded, type SummarizeFn } from "context-compact";

const client = new Anthropic();

const summarize: SummarizeFn = async (messages, instructions, previousSummary) => {
  const systemPrompt = previousSummary
    ? `${instructions}\n\nPrevious summary:\n${previousSummary}`
    : instructions;

  const response = await client.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 4096,
    system: systemPrompt,
    messages: messages.map((m) => ({
      role: m.role === "assistant" ? "assistant" : "user",
      content: typeof m.content === "string" ? m.content : JSON.stringify(m.content),
    })),
  });

  return response.content[0].type === "text" ? response.content[0].text : "";
};

// In your agent loop:
const result = await compactIfNeeded({
  messages: conversationHistory,
  contextWindowTokens: 200_000, // Claude's context window
  summarize,
});

// Use result.messages as the new conversation history
conversationHistory = result.messages;

Quick Start — OpenAI SDK

import OpenAI from "openai";
import { compactIfNeeded, type SummarizeFn } from "context-compact";

const client = new OpenAI();

const summarize: SummarizeFn = async (messages, instructions, previousSummary) => {
  const systemContent = previousSummary
    ? `${instructions}\n\nPrevious summary:\n${previousSummary}`
    : instructions;

  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      { role: "system", content: systemContent },
      ...messages.map((m) => ({
        role: m.role as "user" | "assistant" | "system",
        content: typeof m.content === "string" ? m.content : JSON.stringify(m.content),
      })),
    ],
  });

  return response.choices[0].message.content ?? "";
};

const result = await compactIfNeeded({
  messages: conversationHistory,
  contextWindowTokens: 128_000,
  summarize,
});

conversationHistory = result.messages;

How Chunking Works

The summarization model itself has a context limit. You can't send 150k tokens of history to a model with a 128k window, and even a history that technically fits must leave enough headroom for the model to write its response.

context-compact splits old messages into chunks that fit within the summarization model's context, then summarizes sequentially with a running summary carried forward. For very long histories, it can split into parallel parts, summarize each independently, then merge.

Messages: [m1, m2, m3, ..., m500]
                    ↓
         Split into chunks
      [m1..m100] [m101..m200] [m201..m300] ...
                    ↓
       Summarize sequentially
      summary₁ → summary₂ → summary₃ → ...
                    ↓
          Final summary + kept messages

The chunk size adapts to message size. If messages are large (e.g., code files in tool results), chunks shrink to avoid exceeding the summarization model's limit.
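The size-aware split described above can be sketched roughly as follows. This is a simplified illustration of the idea, not the library's actual implementation; `chunkMessages` and the `Msg` type are hypothetical names, and the chars / 4 estimate mirrors the heuristic described in the Token Estimation section below.

```typescript
// Illustrative sketch of size-aware chunking (not the library's code).
type Msg = { role: string; content: string };

// chars / 4 heuristic, as described in the Token Estimation section.
const estimate = (m: Msg): number => Math.ceil(m.content.length / 4);

function chunkMessages(messages: Msg[], maxChunkTokens: number): Msg[][] {
  const chunks: Msg[][] = [];
  let current: Msg[] = [];
  let budget = 0;

  for (const msg of messages) {
    const cost = estimate(msg);
    // Start a new chunk when adding this message would exceed the budget,
    // so large messages (e.g. code files in tool results) shrink the chunk.
    if (current.length > 0 && budget + cost > maxChunkTokens) {
      chunks.push(current);
      current = [];
      budget = 0;
    }
    current.push(msg);
    budget += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

Each chunk is then summarized in order, with the previous chunk's summary passed along as `previousSummary` so the running summary carries forward.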

Identifier Preservation

Agent conversations are full of opaque identifiers that must survive summarization exactly:

  • UUIDs: 550e8400-e29b-41d4-a716-446655440000
  • File paths: /src/components/Auth/LoginForm.tsx
  • URLs: https://api.example.com/v2/users/42
  • Hashes: sha256:a3f2b8c...
  • IPs/ports: 192.168.1.100:8080

By default (identifierPolicy: "strict"), the summarization prompt instructs the LLM to preserve these verbatim. You can customize or disable this:

// Custom policy
compactIfNeeded({
  // ...
  options: {
    identifierPolicy: "custom",
    identifierInstructions: "Preserve all file paths and URLs exactly.",
  },
});

// Disable
compactIfNeeded({
  // ...
  options: { identifierPolicy: "off" },
});

Token Estimation

Token counting without a tokenizer SDK uses a chars / 4 heuristic. This underestimates for code and unicode text. The safetyMargin option (default: 1.2) compensates by multiplying the estimate when making threshold decisions.

import { estimateTokens } from "context-compact";

const tokens = estimateTokens(messages);
// Use for budgeting, not exact billing
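In spirit, the heuristic and the threshold decision amount to something like the following sketch. This is my approximation of the documented behavior, not the library's exact code; `roughTokens` and `shouldCompact` are hypothetical names, while the 0.85 trigger ratio and 1.2 safety margin match the defaults documented below.

```typescript
// Sketch of the chars/4 estimate plus safety margin (illustrative only).
type Msg = { role: string; content: string };

function roughTokens(messages: Msg[]): number {
  const chars = messages.reduce((n, m) => n + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Threshold decision: compact when the padded estimate crosses triggerRatio.
function shouldCompact(
  messages: Msg[],
  contextWindowTokens: number,
  triggerRatio = 0.85,
  safetyMargin = 1.2,
): boolean {
  return roughTokens(messages) * safetyMargin >= contextWindowTokens * triggerRatio;
}
```

Because the estimate is padded upward before the comparison, compaction errs on the side of triggering early rather than overflowing the real context window.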

API Reference

compactIfNeeded(params)

Main entry point. Evaluates whether compaction is needed and compacts if so.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| messages | Message[] | required | Full conversation history |
| contextWindowTokens | number | required | Model's context window size |
| triggerRatio | number | 0.85 | Fraction of context that triggers compaction |
| keepRatio | number | 0.5 | Fraction of tokens to keep as recent history |
| summarize | SummarizeFn | required | Your LLM summarization callback |
| options | CompactionOptions | {} | See options below |

Returns a CompactResult with compacted (whether compaction actually ran), the new messages, the summary, and stats.

compact(params)

Lower-level: compact unconditionally.

| Parameter | Type | Description |
|-----------|------|-------------|
| toSummarize | Message[] | Messages to compress |
| toKeep | Message[] | Messages to preserve verbatim |
| summarize | SummarizeFn | Your LLM summarization callback |
| options | CompactionOptions | See options below |

estimateTokens(messages)

Estimate token count without an API call.

repairToolPairing(messages)

Fix orphaned tool_result blocks whose tool_use partner was dropped. Returns { messages, droppedOrphanCount }.
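A minimal sketch of what such a repair pass might look like, assuming Anthropic-style tool_use/tool_result content blocks. This is an illustration of the idea, not the library's implementation; `repairOrphans` and the `Block`/`Msg` types are hypothetical names.

```typescript
// Illustrative orphan-repair pass: drop tool_result blocks whose tool_use
// partner was summarized away.
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string };
type Msg = { role: "user" | "assistant"; content: Block[] };

function repairOrphans(messages: Msg[]): { messages: Msg[]; droppedOrphanCount: number } {
  // Collect the ids of all surviving tool_use blocks.
  const toolUseIds = new Set<string>();
  for (const m of messages)
    for (const b of m.content) if (b.type === "tool_use") toolUseIds.add(b.id);

  let dropped = 0;
  const repaired = messages.map((m) => ({
    ...m,
    content: m.content.filter((b) => {
      if (b.type === "tool_result" && !toolUseIds.has(b.tool_use_id)) {
        dropped++; // orphan: its tool_use partner is gone
        return false;
      }
      return true;
    }),
  }));
  return { messages: repaired, droppedOrphanCount: dropped };
}
```

Repair like this matters because most provider APIs reject a tool_result whose matching tool_use is absent from the history.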

CompactionOptions

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| identifierPolicy | "strict" \| "off" \| "custom" | "strict" | Identifier preservation mode |
| identifierInstructions | string | — | Custom instructions for "custom" policy |
| customInstructions | string | — | Extra instructions appended to prompt |
| maxChunkTokens | number | auto | Max tokens per summarization chunk |
| safetyMargin | number | 1.2 | Token estimate multiplier |
| parts | number | 2 | Parallel summarization parts |
| signal | AbortSignal | — | Cancellation signal |

SummarizeFn

type SummarizeFn = (
  messages: Message[],
  instructions: string,
  previousSummary?: string,
) => Promise<string>;

You implement this with your LLM SDK. The library calls it with messages to summarize, instructions for the summarization, and an optional running summary from prior chunks.

Security

Before messages are passed to your summarize callback, all tool_result.details fields are stripped. These often contain large, untrusted payloads from tool executions (file contents, API responses) that should not reach the summarization model.
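In spirit, the stripping step looks something like the following sketch, assuming a details field on tool_result blocks as stated above. The `stripDetails` name and loose types are hypothetical; the library's actual pass may differ.

```typescript
// Sketch: remove untrusted tool_result.details payloads before summarization.
type Block = { type: string; details?: unknown; [key: string]: unknown };
type Msg = { role: string; content: Block[] };

function stripDetails(messages: Msg[]): Msg[] {
  return messages.map((m) => ({
    ...m,
    content: m.content.map((b) => {
      if (b.type === "tool_result" && "details" in b) {
        const { details, ...rest } = b;
        return rest; // the large untrusted payload never reaches the summarizer
      }
      return b;
    }),
  }));
}
```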

License

MIT