@agentsy/context

v0.2.4


Compression, drift detection, and reversible output shaping

Compression, drift detection, reversible output shaping, and cache-friendly prompt planning for LLM applications.

Status

Version: 0.2.0-alpha.0
License: GPL-3.0-or-later
Published: @agentsy/core v0.2.0, @agentsy/types v0.1.1

Installation

npm install @agentsy/context @agentsy/core @agentsy/types

Quick Start

Compress Conversation History

import { compressConversation } from '@agentsy/context';

interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

const messages: Message[] = [
  { role: 'system', content: 'You are an AI programming assistant...' },
  { role: 'user', content: 'Help me refactor this function...' },
  { role: 'assistant', content: 'Here is the refactored code...' },
  // ... more messages
];

const result = compressConversation(messages, {
  maxTokens: 200000,
  preserveLast: 2, // Keep last 2 messages for continuity
  estimateTokens: (msg) => Math.ceil(msg.content.length / 4)
});

console.log(`Dropped ${result.droppedCount} messages`);
console.log(`Retained ${result.retained.length} messages`);
console.log(`Estimated tokens: ${result.estimatedTokens}`);

Compress Output

import { compressOutput } from '@agentsy/context';

const longResponse = `
  This is a very long response that contains a lot of filler text.
  Basically, you should consider removing unnecessary words.
  Here's some code:

  \`\`\`typescript
  const example = "preserve this exactly";
  \`\`\`

  And a link: https://example.com/docs
`;

const compressed = compressOutput(longResponse, {
  level: 'full',
  preserve: {
    codeFences: true,
    inlineCode: true,
    urls: true
  }
});

console.log(compressed);
// Output: Code blocks, inline code, and URLs preserved exactly
// Filler words removed: "basically", "should consider", "a lot of"

Cache Prompt Plans

import { createCachePromptPlan } from '@agentsy/context';
// @agentsy/providers is a separate package, not covered by the install command above
import { applyOpenAIPromptCaching } from '@agentsy/providers/caching';

const plan = createCachePromptPlan({
  prefix: 'ctx-v1',
  provider: 'openai'
});

const cached = applyOpenAIPromptCaching('prompt body', plan);

console.log(cached.prompt_cache_key); // openai:ctx-v1

Manual Compaction

import { createManualCompaction } from '@agentsy/context';

const result = createManualCompaction({
  focus: 'architecture',
  maxTokens: 200,
  messages: ['diff --git a/a b/a', 'plain prose'],
  sessionId: 'sess-1'
});

console.log(result.summary.focus); // architecture
console.log(result.summary.nextSteps); // ['rehydrate:architecture']

Token Management: See @agentsy/tokenomics for createInMemoryTokenManager, PacingController, and budget management.

API Reference

compressConversation

Compresses a conversation history to fit within a token budget.

function compressConversation<TMessage>(
  messages: readonly TMessage[],
  options: CompressionOptions<TMessage>
): CompressionResult<TMessage>

Parameters:

  • messages: Array of messages to compress
  • options.maxTokens: Maximum tokens to retain
  • options.preserveLast: Number of recent messages to always preserve (default: 0)
  • options.estimateTokens: Function to estimate tokens per message

Returns:

  • retained: Array of messages that fit in budget
  • droppedCount: Number of messages dropped
  • estimatedTokens: Estimated token count of retained messages
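
To make the parameters and return fields above concrete, here is a minimal sketch of one way a budget-based retention pass could work. It is not the @agentsy/context implementation: `sketchCompress`, `ChatMessage`, and `SketchResult` are hypothetical names, and the newest-first strategy is an assumption made for illustration.

```typescript
// Hypothetical sketch; NOT the @agentsy/context implementation.
interface ChatMessage {
  role: 'user' | 'assistant' | 'system';
  content: string;
}

interface SketchResult<T> {
  retained: T[];
  droppedCount: number;
  estimatedTokens: number;
}

function sketchCompress<T>(
  messages: readonly T[],
  maxTokens: number,
  preserveLast: number,
  estimateTokens: (msg: T) => number
): SketchResult<T> {
  let start = messages.length; // index of the oldest retained message
  let used = 0;

  // Walk from newest to oldest so the most recent context wins.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i]);
    const mustKeep = i >= messages.length - preserveLast;
    // Stop at the first older message that would exceed the budget.
    if (!mustKeep && used + cost > maxTokens) break;
    used += cost;
    start = i;
  }

  const retained = messages.slice(start);
  return {
    retained,
    droppedCount: messages.length - retained.length,
    estimatedTokens: used
  };
}

const history: ChatMessage[] = [
  { role: 'system', content: 'a'.repeat(40) },    // 10 estimated tokens
  { role: 'user', content: 'b'.repeat(40) },      // 10
  { role: 'assistant', content: 'c'.repeat(40) }, // 10
  { role: 'user', content: 'd'.repeat(20) }       // 5
];

const sketch = sketchCompress(history, 16, 1, (m) => Math.ceil(m.content.length / 4));
console.log(sketch.droppedCount, sketch.estimatedTokens);
```

This sketch keeps a contiguous window of recent messages, so an early system prompt can be dropped. Whether the real function treats system messages specially is not stated here, so check its behavior before relying on it.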

compressOutput

Compresses output text while preserving code blocks, URLs, and other critical elements.

function compressOutput(
  input: string,
  options?: OutputCompressionOptions
): string

Parameters:

  • input: Text to compress
  • options.level: Compression level - 'lite' (40-50%), 'full' (65-75%), 'ultra' (75-87%)
  • options.preserve: What to preserve (codeFences, inlineCode, urls)

Returns: Compressed text string
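
The preserve options can be pictured as a "protect, rewrite, restore" pass. The sketch below shows that general technique under stated assumptions: `stripFillerPreservingCode`, the filler-word list, and the placeholder scheme are all hypothetical, and nothing here claims to match how compressOutput works internally.

```typescript
// Hypothetical sketch of "protect, rewrite, restore"; not the library's algorithm.
const FENCE = '`'.repeat(3); // three backticks, built indirectly so this sample nests in Markdown

// Fenced blocks, inline code spans, and URLs are treated as protected spans.
const PROTECTED = new RegExp(
  FENCE + '[\\s\\S]*?' + FENCE + '|`[^`\\n]+`|https?://\\S+',
  'g'
);

// Illustrative filler list (an assumption, not the library's word list).
const FILLER_WORDS = /\b(?:basically|actually|really|just)\b,?[ \t]*/gi;

function stripFillerPreservingCode(input: string): string {
  const protectedSpans: string[] = [];

  // 1. Swap each protected span for an opaque NUL-delimited placeholder.
  const masked = input.replace(PROTECTED, (match) => {
    protectedSpans.push(match);
    return '\u0000' + (protectedSpans.length - 1) + '\u0000';
  });

  // 2. Rewrite only the unprotected prose.
  const trimmed = masked.replace(FILLER_WORDS, '');

  // 3. Restore the protected spans exactly as they were.
  return trimmed.replace(
    /\u0000(\d+)\u0000/g,
    (_placeholder: string, index: string) => protectedSpans[Number(index)]
  );
}

const cleaned = stripFillerPreservingCode(
  'Basically, run `just build` and really read https://example.com/just-docs now.'
);
console.log(cleaned);
```

The word "just" survives inside the inline code and the URL because both were masked before the prose was rewritten.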

compressOutputDetailed / compressOutputV2

Use the detailed helpers when you need content kind, routing, metrics, or reversible markers.

Use Cases

1. Context Window Management

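import { compressConversation } from '@agentsy/context';

// Same Message shape as in the Quick Start example
interface Message {
  role: 'user' | 'assistant' | 'system';
  content: string;
}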

function prepareLLMRequest(
  messages: Message[],
  maxTokens: number
): Message[] {
  const result = compressConversation(messages, {
    maxTokens: maxTokens - 10000, // Safety margin
    preserveLast: 2,
    estimateTokens: (msg) => Math.ceil(msg.content.length / 4)
  });

  if (result.droppedCount > 0) {
    console.warn(`Dropped ${result.droppedCount} messages to fit budget`);
  }

  return result.retained;
}

2. Cost-Aware Request Routing

import { createInMemoryTokenManager } from '@agentsy/tokenomics';

const manager = createInMemoryTokenManager();

async function routeRequest(
  model: string,
  estimatedTokens: number
): Promise<string> {
  const budget = await manager.createBudget({
    maxCost: 10.0,
    maxTokens: 100000,
    model,
    name: 'routing-budget',
    periodMs: 3600000,
    priority: 'medium',
    provider: 'openai',
    resetStrategy: 'rolling'
  });

  const allocation = await manager.requestTokens({
    estimatedTokens,
    estimatedCost: estimatedTokens * 0.00001, // $0.01 per 1K tokens
    model,
    provider: 'openai',
    requestType: 'completion',
    budgetId: budget.id
  });

  if (allocation.conditions) {
    // Try cheaper model
    return 'gpt-4o-mini';
  }

  return model;
}

3. Output Compression for Token Savings

import { compressOutput } from '@agentsy/context';

function compressAssistantResponse(response: string): string {
  // 'full' targets the 65-75% range listed in the API reference
  return compressOutput(response, {
    level: 'full',
    preserve: {
      codeFences: true,
      inlineCode: true,
      urls: true
    }
  });
}

4. Multi-Budget Management

import { createInMemoryTokenManager } from '@agentsy/tokenomics';

const manager = createInMemoryTokenManager();

// Create separate budgets for different models
const gpt4Budget = await manager.createBudget({
  maxCost: 50.0,
  maxTokens: 500000,
  model: 'gpt-4',
  name: 'gpt-4-budget',
  periodMs: 3600000,
  priority: 'high',
  provider: 'openai',
  resetStrategy: 'rolling'
});

const claudeBudget = await manager.createBudget({
  maxCost: 30.0,
  maxTokens: 200000,
  model: 'claude-3-5-sonnet',
  name: 'claude-budget',
  periodMs: 3600000,
  priority: 'medium',
  provider: 'anthropic',
  resetStrategy: 'rolling'
});

// Request from the appropriate budget.
// The return type is inferred from manager.requestTokens.
async function requestTokens(
  model: string,
  tokens: number
) {
  const budgetId = model === 'gpt-4' ? gpt4Budget.id : claudeBudget.id;

  return await manager.requestTokens({
    budgetId,
    estimatedTokens: tokens,
    estimatedCost: tokens * 0.00001,
    model,
    provider: model === 'gpt-4' ? 'openai' : 'anthropic',
    requestType: 'completion'
  });
}

Performance Characteristics

Compression Performance

  • Output compression: <10ms average for typical responses
  • Conversation compression: <50ms for 100-message histories
  • Token estimation: <1ms per message

Accuracy

  • Token estimation: Conservative (never underestimates)
  • Compression preservation: 100% accuracy for code blocks, URLs, paths
  • Budget enforcement: Deterministic, no race conditions

Best Practices

1. Conservative Token Estimation

Always overestimate token counts to avoid exceeding model limits:

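const estimateTokens = (msg: Message) => {
  // ~4 chars per token is a typical average for English, not an upper bound;
  // use a smaller divisor (e.g. 3) for code or non-English text to stay conservative
  return Math.ceil(msg.content.length / 4);
};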
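
If you want the estimate to lean high, one option is a smaller divisor plus a fixed per-message overhead. This is a sketch only: `estimateTokensConservatively` is a hypothetical helper, and the defaults of 3 characters per token and 4 overhead tokens are assumptions, not values taken from the library.

```typescript
// Hypothetical helper; the divisor and overhead are assumptions, not library defaults.
function estimateTokensConservatively(
  text: string,
  charsPerToken = 3,
  overheadTokens = 4
): number {
  if (charsPerToken <= 0) {
    throw new RangeError('charsPerToken must be positive');
  }
  // Rounding up plus a fixed per-message overhead biases the estimate high.
  return Math.ceil(text.length / charsPerToken) + overheadTokens;
}

const sampleText = 'x'.repeat(12);
const naiveEstimate = Math.ceil(sampleText.length / 4);          // 3
const paddedEstimate = estimateTokensConservatively(sampleText); // 4 + 4 = 8
console.log(naiveEstimate, paddedEstimate);
```

It can be passed to compressConversation as `estimateTokens: (msg) => estimateTokensConservatively(msg.content)`. For exact counts, a real tokenizer for your model is the safer choice.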

2. Preserve Critical Content

Always preserve code, URLs, and file paths in output compression:

compressOutput(response, {
  level: 'full',
  preserve: {
    codeFences: true,
    inlineCode: true,
    urls: true
  }
});

3. Use Safety Margins

Leave a buffer between the budget and actual usage:

const safeMaxTokens = model.maxInputTokens - 10000; // 10K safety margin
const estimateTokens = (msg: Message) => Math.ceil(msg.content.length / 4);
compressConversation(messages, { maxTokens: safeMaxTokens, estimateTokens });
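
A fixed 10K margin can be too large for small context windows and too small for large ones. The sketch below scales the margin with the window and reserves room for the model's reply. `safeTokenBudget` and its parameters are hypothetical names, and the 5% default is an assumption.

```typescript
// Hypothetical helper; names and the 5% default are illustrative assumptions.
function safeTokenBudget(
  contextWindow: number,
  reservedForOutput: number,
  marginPercent = 5
): number {
  // Integer arithmetic avoids floating-point surprises in the margin.
  const margin = Math.ceil((contextWindow * marginPercent) / 100);
  // Never return a negative budget.
  return Math.max(0, contextWindow - reservedForOutput - margin);
}

const inputBudget = safeTokenBudget(128000, 4096); // 128000 - 4096 - 6400
console.log(inputBudget); // 117504
```

The result can be used as the `maxTokens` value when calling compressConversation.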

4. Monitor Budget Status

Regularly check budget status to proactively manage costs:

const status = await manager.getBudgetStatus(budgetId);
if (status.remainingCost < status.totalCost * 0.2) {
  console.warn('Less than 20% of budget remaining');
}
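
The threshold check can be pulled into a small pure function so it is easy to test and reuse. This sketch assumes only the `remainingCost` and `totalCost` fields used above. `BudgetSnapshot`, `BudgetLevel`, and `classifyBudget` are hypothetical names, not part of @agentsy/tokenomics.

```typescript
// Hypothetical sketch; only remainingCost and totalCost come from the example above.
interface BudgetSnapshot {
  remainingCost: number;
  totalCost: number;
}

type BudgetLevel = 'ok' | 'warning' | 'exhausted';

function classifyBudget(
  snapshot: BudgetSnapshot,
  warnBelowRatio = 0.2
): BudgetLevel {
  // Treat an empty or invalid budget as exhausted to avoid dividing by zero.
  if (snapshot.totalCost <= 0 || snapshot.remainingCost <= 0) {
    return 'exhausted';
  }
  const remainingRatio = snapshot.remainingCost / snapshot.totalCost;
  return remainingRatio < warnBelowRatio ? 'warning' : 'ok';
}

const budgetLevel = classifyBudget({ remainingCost: 1, totalCost: 10 });
console.log(budgetLevel); // warning
```

A caller could switch to a cheaper model on 'warning' and stop issuing requests on 'exhausted'.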

License

GPL-3.0-or-later - See LICENSE.md for details.

Contributing

See IMPLEMENTATION-PLAN.md for development roadmap.

Related Packages