
ai-sdk-context-management v0.8.1

Context management for AI SDK agents: explicit prompt preparation, optional agent tools, and structured telemetry.

Every agent eventually runs into the same problem: the next model call cannot see everything. Context management is the policy that decides:

  • what stays verbatim
  • what gets compressed
  • what can be dropped
  • what should stay stable for prompt caching
  • when the agent should manage its own future context

ai-sdk-context-management sits at the model boundary. It does not replace your thread store or orchestrator. It prepares the exact messages payload you send to the AI SDK before each call and exposes optional tools that let the agent shape later turns.

Installation

npm install ai @ai-sdk/provider ai-sdk-context-management ai-sdk-system-reminders

Quick Start

The minimum integration is small:

import { generateText } from "ai";
import {
  ToolResultDecayStrategy,
  createContextManagementRuntime,
} from "ai-sdk-context-management";

// `messages`, `agentTools`, and `baseModel` are provided by your application:
// the conversation history, the agent's tool map, and the AI SDK model instance.
const requestContext = {
  conversationId: "conv-123",
  agentId: "agent-456",
};

const runtime = createContextManagementRuntime({
  strategies: [new ToolResultDecayStrategy({ maxPromptTokens: 24_000 })],
});

const prepared = await runtime.prepareRequest({
  requestContext,
  messages,
  tools: {
    ...agentTools,
    ...runtime.optionalTools,
  },
  model: {
    provider: "openrouter",
    modelId: "anthropic/claude-4",
  },
});

const result = await generateText({
  model: baseModel,
  messages: prepared.messages,
  tools: {
    ...agentTools,
    ...runtime.optionalTools,
  },
  toolChoice: prepared.toolChoice,
  providerOptions: prepared.providerOptions,
  experimental_context: { contextManagement: requestContext },
});

await prepared.reportActualUsage(result.usage.inputTokens);

That is the core contract:

  • call runtime.prepareRequest(...) immediately before the model call
  • merge in runtime.optionalTools if you use tool-emitting strategies
  • pass the same request context to experimental_context for optional tool execution
  • call reportActualUsage(...) with the actual input-token count after the model step completes

If you want the full stack version with telemetry, summarization, scratchpad, and reminders, run examples/04-composed-strategies.ts.

Request Context Contract

prepareRequest(...) requires requestContext.

Optional tools read experimental_context.contextManagement.

Both must contain the same request-scoped identity:

{
  conversationId: string;
  agentId: string;
  agentLabel?: string;
}

If that context is missing, prepareRequest(...) cannot run and context-management tools will reject execution.
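The shape above can be enforced with a small guard before calling prepareRequest(...). The interface fields come from the contract above; the guard function itself is a sketch, not part of the library's exported API:

```typescript
// Request-scoped identity shared by prepareRequest(...) and the optional
// tools. A guard like this mirrors the check the library performs.
interface ContextManagementRequestContext {
  conversationId: string;
  agentId: string;
  agentLabel?: string;
}

function assertRequestContext(value: unknown): ContextManagementRequestContext {
  const ctx = value as Partial<ContextManagementRequestContext> | null | undefined;
  if (!ctx || typeof ctx.conversationId !== "string" || typeof ctx.agentId !== "string") {
    throw new Error(
      "context management requires { conversationId, agentId } on the request context"
    );
  }
  return ctx as ContextManagementRequestContext;
}
```

Running this guard once at the top of your agent loop catches a missing identity early, instead of at tool-execution time.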

Strategy Index

Per-strategy docs live in src/strategies/.

| Strategy | What changes in the prompt | What the agent gets | Docs | Runnable example |
| --- | --- | --- | --- | --- |
| SystemPromptCachingStrategy | Moves system messages into a stable prefix and can consolidate them | Better cache reuse and less prompt churn | docs | 06-system-prompt-caching.ts |
| SlidingWindowStrategy | Keeps the recent tail, can optionally preserve a head, and drops older non-system turns | Bounded context with simple recency bias or setup preservation | docs | 01-sliding-window.ts |
| ToolResultDecayStrategy | Leaves recent tool results raw, then replaces older oversized results with placeholders based on depth and total tool-context pressure | Keeps the reasoning chain while hiding only the heaviest payloads when tool usage actually grows | docs | 02-tool-result-decay.ts |
| SummarizationStrategy | Replaces older turns with a tagged summary block using either summarize(...) or model | Older facts survive in compressed form without replaying the whole middle | docs | 03-summarization.ts, 07-model-backed-summarization.ts |
| ScratchpadStrategy | Injects persisted scratchpad state and can remove stale tool exchanges | Structured working state, note edits, and selective forgetting | docs | 08-scratchpad.ts |
| PinnedMessagesStrategy | Marks specific tool call IDs as protected before pruning | Lets the agent keep the evidence it considers critical | docs | 09-pinned-messages.ts |
| CompactionToolStrategy | Compacts old history only after the agent asks for it | Agent-controlled compression at task boundaries | docs | 10-compaction-tool.ts |
| ContextUtilizationReminderStrategy | Appends a warning block when the prompt gets tight | Gives the agent time to summarize or compact before failure | docs | 11-context-utilization-reminder.ts |
| ContextWindowStatusStrategy | Appends a compact token-usage status block to the latest user turn | Gives the agent explicit working-budget and raw-window visibility | docs | n/a |

Strategy Ordering

Strategies run in array order. A good default is:

  1. SystemPromptCachingStrategy
  2. PinnedMessagesStrategy
  3. pruning and compression strategies
  4. agent-directed context tools
  5. reminder strategies

In practice that usually means:

  1. SystemPromptCachingStrategy
  2. PinnedMessagesStrategy
  3. SlidingWindowStrategy (optionally with headCount), ToolResultDecayStrategy, or SummarizationStrategy
  4. ScratchpadStrategy or CompactionToolStrategy
  5. ContextUtilizationReminderStrategy
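Ordering matters because each strategy sees the messages the previous one produced. A simplified sketch of that array-order execution (the real strategy interface also carries tools, telemetry, and token estimates; this only illustrates the fold):

```typescript
// Minimal model of array-order strategy execution: a left fold over the
// strategies array, where each strategy transforms the upstream output.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };

interface MessageStrategy {
  name: string;
  apply(messages: Message[]): Message[];
}

function runStrategies(messages: Message[], strategies: MessageStrategy[]): Message[] {
  // strategies[0] runs first; the last strategy sees everything upstream did.
  return strategies.reduce((acc, s) => s.apply(acc), messages);
}

// Example: a pruner followed by a reminder. If the reminder ran first,
// the pruner could drop the reminder it had just appended.
const keepLastTwo: MessageStrategy = {
  name: "sliding-window",
  apply: (msgs) => msgs.slice(-2),
};

const appendReminder: MessageStrategy = {
  name: "utilization-reminder",
  apply: (msgs) => [
    ...msgs,
    { role: "user", content: "<reminder>context is getting tight</reminder>" },
  ],
};
```

This is why reminder strategies go last in the recommended order: they annotate whatever prompt survives the earlier pruning and compression passes.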

Choosing A Stack

  • Short, bounded conversations: SlidingWindowStrategy
  • Preserve setup plus the latest turns: SlidingWindowStrategy({ headCount, keepLastMessages })
  • Tool-heavy agents: SystemPromptCachingStrategy + ToolResultDecayStrategy
  • Long-running agents: SystemPromptCachingStrategy + PinnedMessagesStrategy + ToolResultDecayStrategy + SummarizationStrategy({ model })
  • Agents that self-manage context: SystemPromptCachingStrategy + PinnedMessagesStrategy + ScratchpadStrategy + CompactionToolStrategy
  • Full graduated stack: run examples/04-composed-strategies.ts

Tool Result Decay

ToolResultDecayStrategy decays tool context using:

  • effectiveDepth = depth * pressureFactor(toolContextTokens)
  • default pressure anchors:
    • 100 -> 0.05
    • 5_000 -> 1
    • 50_000 -> 5

That means low-token tool usage can remain intact for many turns, while heavy tool sessions decay aggressively much earlier.
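The curve between anchors is not specified above, so the sketch below assumes simple piecewise-linear interpolation, clamped at both ends; treat it as an illustration of the effectiveDepth formula, not the strategy's actual implementation:

```typescript
// Sketch of the pressure curve: depthFactor interpolated between anchors
// (assumed linear here), clamped outside the anchored range.
type Anchor = { toolTokens: number; depthFactor: number };

const defaultAnchors: Anchor[] = [
  { toolTokens: 100, depthFactor: 0.05 },
  { toolTokens: 5_000, depthFactor: 1 },
  { toolTokens: 50_000, depthFactor: 5 },
];

function pressureFactor(toolContextTokens: number, anchors: Anchor[] = defaultAnchors): number {
  if (toolContextTokens <= anchors[0].toolTokens) return anchors[0].depthFactor;
  const last = anchors[anchors.length - 1];
  if (toolContextTokens >= last.toolTokens) return last.depthFactor;
  for (let i = 1; i < anchors.length; i++) {
    const hi = anchors[i];
    if (toolContextTokens <= hi.toolTokens) {
      const lo = anchors[i - 1];
      const t = (toolContextTokens - lo.toolTokens) / (hi.toolTokens - lo.toolTokens);
      return lo.depthFactor + t * (hi.depthFactor - lo.depthFactor);
    }
  }
  return last.depthFactor;
}

// effectiveDepth = depth * pressureFactor(toolContextTokens)
const effectiveDepth = (depth: number, toolContextTokens: number) =>
  depth * pressureFactor(toolContextTokens);
```

At 100 tool tokens a result three turns deep has an effective depth of only 0.15, so it survives; at 50,000 tool tokens the same result behaves as if it were fifteen turns deep.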

You can tune the curve with pressureAnchors and the warning forecast with warningForecastExtraTokens:

new ToolResultDecayStrategy({
  pressureAnchors: [
    { toolTokens: 100, depthFactor: 0.05 },
    { toolTokens: 5_000, depthFactor: 1 },
    { toolTokens: 50_000, depthFactor: 5 },
  ],
  placeholderMinSourceTokens: 800,
  warningForecastExtraTokens: 10_000,
});

Warnings are emitted through the runtime reminder path, either into systemReminderContext or inline in the prompt, with machine-readable attributes:

  • tool_call_ids
  • placeholder_ids
  • forecast_extra_tool_tokens
  • forecast_tool_context_tokens

Runnable Examples

All examples are local and deterministic. They use mock models, print the transformed prompt, and call out what to look for in the output.

| Example | Run | What to look for |
| --- | --- | --- |
| 01-sliding-window.ts | cd examples && npx tsx 01-sliding-window.ts | The oldest exchange disappears, so the model only sees the recent tail |
| 02-tool-result-decay.ts | cd examples && npx tsx 02-tool-result-decay.ts | Pressure-aware decay keeps light tool history longer, then replaces older heavy results with placeholders |
| 03-summarization.ts | cd examples && npx tsx 03-summarization.ts | A tagged summary system message replaces older turns |
| 04-composed-strategies.ts | cd examples && npx tsx 04-composed-strategies.ts | Multiple strategies stack cleanly and telemetry shows what ran |
| 05-sliding-window-head.ts | cd examples && npx tsx 05-sliding-window-head.ts | Setup context and the latest blocker remain, but the middle drops out |
| 06-system-prompt-caching.ts | cd examples && npx tsx 06-system-prompt-caching.ts | System instructions consolidate into a stable prefix |
| 07-model-backed-summarization.ts | cd examples && npx tsx 07-model-backed-summarization.ts | A model-generated summary replaces older discussion |
| 08-scratchpad.ts | cd examples && npx tsx 08-scratchpad.ts | scratchpad(...) changes what the next turn sees |
| 09-pinned-messages.ts | cd examples && npx tsx 09-pinned-messages.ts | One pinned tool result survives while other old ones decay |
| 10-compaction-tool.ts | cd examples && npx tsx 10-compaction-tool.ts | compact_context(...) compacts now and reuses the stored summary later |
| 11-context-utilization-reminder.ts | cd examples && npx tsx 11-context-utilization-reminder.ts | The latest user message gains a warning before hard pruning starts |

See examples/README.md for the full example index.

Runtime API

createContextManagementRuntime({ strategies, telemetry, estimator, systemReminderContext })

Returns:

  • prepareRequest
  • optionalTools

prepareRequest(...) returns:

  • messages
  • providerOptions
  • toolChoice
  • reportActualUsage(actualInputTokens)

The runtime merges tools from all strategies and throws on tool-name collisions.
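The collision behavior can be pictured as a merge over per-strategy tool maps that refuses duplicate names. This is a sketch of the described behavior, not the runtime's actual merge code:

```typescript
// Merge tool maps from multiple strategies, throwing when two strategies
// export the same tool name (mirrors the runtime's documented behavior).
function mergeToolMaps<T>(...toolMaps: Record<string, T>[]): Record<string, T> {
  const merged: Record<string, T> = {};
  for (const tools of toolMaps) {
    for (const [name, tool] of Object.entries(tools)) {
      if (name in merged) {
        throw new Error(`tool name collision: "${name}" is exported by more than one strategy`);
      }
      merged[name] = tool;
    }
  }
  return merged;
}
```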

If systemReminderContext is provided, reminders are routed through ai-sdk-system-reminders. Otherwise, reminders are appended inline to the prompt.

Scratchpad API

ScratchpadStrategy exposes a scratchpad(...) tool for maintaining current working state across turns.

The tool supports:

  • description: required one-line note describing what this scratchpad update is doing
  • setEntries: merge key/value entries into the scratchpad
  • replaceEntries: replace all key/value entries
  • removeEntryKeys: delete specific entries by key
  • preserveTurns: keep the first N and last N user/assistant turns from before this scratchpad(...) call, trimming only the middle while preserving the raw messages inside those kept turns
  • omitToolCallIds: remove completed tool exchanges after the important parts have been captured

Entry names are intentionally unconstrained.
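The entry-update parameters above compose like a pure reducer over the entry map. The parameter names come from the tool's documented interface; the reducer itself is an illustration of the semantics, not the library's implementation:

```typescript
// Pure-reducer sketch of scratchpad entry updates:
// replaceEntries swaps the whole map, setEntries merges into it, and
// removeEntryKeys deletes individual keys afterwards.
type ScratchpadEntries = Record<string, string>;

interface ScratchpadUpdate {
  setEntries?: ScratchpadEntries;
  replaceEntries?: ScratchpadEntries;
  removeEntryKeys?: string[];
}

function applyScratchpadUpdate(
  state: ScratchpadEntries,
  update: ScratchpadUpdate
): ScratchpadEntries {
  const next: ScratchpadEntries = update.replaceEntries
    ? { ...update.replaceEntries }
    : { ...state, ...(update.setEntries ?? {}) };
  for (const key of update.removeEntryKeys ?? []) {
    delete next[key];
  }
  return next;
}
```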

Hosts can pass emptyStateGuidance to ScratchpadStrategy if they want to inject host-specific empty-scratchpad hints or behavioral guidance. The library does not add those hints by default.

Telemetry

The runtime can emit raw events for every request:

  • runtime-start
  • strategy-complete
  • tool-execute-start
  • tool-execute-complete
  • tool-execute-error
  • runtime-complete

These events include request context, before/after token estimates, removed-tool deltas, pinned-tool deltas, and the final provider-facing prompt snapshot after prepareRequest(...) runs. The runtime-complete event also includes the final providerOptions and toolChoice.
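A telemetry sink is just a consumer of that event stream. The event names below come from the list above, but the payload fields used here (before/after token estimates on strategy-complete) are assumed shapes, shown only to illustrate how a sink might aggregate savings:

```typescript
// Hypothetical sink that totals tokens saved per strategy for a request.
// Field names on the payloads are assumptions for illustration.
type ContextTelemetryEvent =
  | { type: "runtime-start"; conversationId: string }
  | { type: "strategy-complete"; strategy: string; beforeTokens: number; afterTokens: number }
  | { type: "runtime-complete"; conversationId: string };

function summarizeSavings(events: ContextTelemetryEvent[]): Record<string, number> {
  const savings: Record<string, number> = {};
  for (const e of events) {
    if (e.type === "strategy-complete") {
      savings[e.strategy] = (savings[e.strategy] ?? 0) + (e.beforeTokens - e.afterTokens);
    }
  }
  return savings;
}
```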

Utilities

The package also exports:

  • createDefaultPromptTokenEstimator
  • createLlmSummarizer
  • buildSummaryTranscript
  • buildDeterministicSummary
  • CONTEXT_MANAGEMENT_KEY

Running Locally

bun test
bun run typecheck
bun run build