ai-sdk-context-management
v0.8.1
Explicit context management strategies for AI SDK language models
Context management for AI SDK agents: explicit prompt preparation, optional agent tools, and structured telemetry.
Every agent eventually runs into the same problem: the next model call cannot see everything. Context management is the policy that decides:
- what stays verbatim
- what gets compressed
- what can be dropped
- what should stay stable for prompt caching
- when the agent should manage its own future context
ai-sdk-context-management sits at the model boundary. It does not replace your thread store or orchestrator. It prepares the exact messages payload you send to AI SDK before each call and exposes optional tools that let the agent shape later turns.
Installation
```shell
npm install ai @ai-sdk/provider ai-sdk-context-management ai-sdk-system-reminders
```

Quick Start
The minimum integration is small:
```typescript
import { generateText } from "ai";
import {
  ToolResultDecayStrategy,
  createContextManagementRuntime,
} from "ai-sdk-context-management";

const requestContext = {
  conversationId: "conv-123",
  agentId: "agent-456",
};

const runtime = createContextManagementRuntime({
  strategies: [new ToolResultDecayStrategy({ maxPromptTokens: 24_000 })],
});

const prepared = await runtime.prepareRequest({
  requestContext,
  messages,
  tools: {
    ...agentTools,
    ...runtime.optionalTools,
  },
  model: {
    provider: "openrouter",
    modelId: "anthropic/claude-4",
  },
});

const result = await generateText({
  model: baseModel,
  messages: prepared.messages,
  tools: {
    ...agentTools,
    ...runtime.optionalTools,
  },
  toolChoice: prepared.toolChoice,
  providerOptions: prepared.providerOptions,
  experimental_context: { contextManagement: requestContext },
});

await prepared.reportActualUsage(result.usage.inputTokens);
```

That is the core contract:
- call `runtime.prepareRequest(...)` immediately before the model call
- merge in `runtime.optionalTools` if you use tool-emitting strategies
- pass the same request context to `experimental_context` for optional tool execution
- call `reportActualUsage(...)` with the actual input-token count after the model step completes
If you want the full stack version with telemetry, summarization, scratchpad, and reminders, run examples/04-composed-strategies.ts.
Request Context Contract
prepareRequest(...) requires requestContext.
Optional tools read experimental_context.contextManagement.
Both must contain the same request-scoped identity:
```typescript
{
  conversationId: string;
  agentId: string;
  agentLabel?: string;
}
```

If that context is missing, prepareRequest(...) cannot run and context-management tools will reject execution.
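The shared shape can be captured on the host side as a type plus a narrowing guard. This helper is illustrative, not a library export; it only checks the two required fields:

```typescript
// Mirrors the request-context contract above. `agentLabel` stays optional.
interface RequestContext {
  conversationId: string;
  agentId: string;
  agentLabel?: string;
}

// Host-side guard: verify the required identity fields before calling
// prepareRequest(...) or executing context-management tools.
function isRequestContext(value: unknown): value is RequestContext {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.conversationId === "string" && typeof v.agentId === "string";
}
```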
Strategy Index
Per-strategy docs live in src/strategies/.
| Strategy | What changes in the prompt | What the agent gets | Docs | Runnable example |
| --- | --- | --- | --- | --- |
| SystemPromptCachingStrategy | Moves system messages into a stable prefix and can consolidate them | Better cache reuse and less prompt churn | docs | 06-system-prompt-caching.ts |
| SlidingWindowStrategy | Keeps the recent tail, can optionally preserve a head, and drops older non-system turns | Bounded context with simple recency bias or setup preservation | docs | 01-sliding-window.ts |
| ToolResultDecayStrategy | Leaves recent tool results raw, then replaces older oversized results with placeholders based on depth and total tool-context pressure | Keeps the reasoning chain while hiding only the heaviest payloads when tool usage actually grows | docs | 02-tool-result-decay.ts |
| SummarizationStrategy | Replaces older turns with a tagged summary block using either summarize(...) or model | Older facts survive in compressed form without replaying the whole middle | docs | 03-summarization.ts, 07-model-backed-summarization.ts |
| ScratchpadStrategy | Injects persisted scratchpad state and can remove stale tool exchanges | Structured working state, note edits, and selective forgetting | docs | 08-scratchpad.ts |
| PinnedMessagesStrategy | Marks specific tool call IDs as protected before pruning | Lets the agent keep the evidence it considers critical | docs | 09-pinned-messages.ts |
| CompactionToolStrategy | Compacts old history only after the agent asks for it | Agent-controlled compression at task boundaries | docs | 10-compaction-tool.ts |
| ContextUtilizationReminderStrategy | Appends a warning block when the prompt gets tight | Gives the agent time to summarize or compact before failure | docs | 11-context-utilization-reminder.ts |
| ContextWindowStatusStrategy | Appends a compact token-usage status block to the latest user turn | Gives the agent explicit working-budget and raw-window visibility | docs | n/a |
Strategy Ordering
Strategies run in array order. A good default is:
- SystemPromptCachingStrategy
- PinnedMessagesStrategy
- pruning and compression strategies
- agent-directed context tools
- reminder strategies
In practice that usually means:
- SystemPromptCachingStrategy
- PinnedMessagesStrategy
- SlidingWindowStrategy (optionally with headCount), ToolResultDecayStrategy, or SummarizationStrategy
- ScratchpadStrategy or CompactionToolStrategy
- ContextUtilizationReminderStrategy
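That ordering can be sketched as a runtime configuration. The class names come from the strategy table above, but the constructor options shown here (and the no-argument constructors) are illustrative assumptions; check each strategy's docs for its real options:

```typescript
import {
  ContextUtilizationReminderStrategy,
  PinnedMessagesStrategy,
  ScratchpadStrategy,
  SystemPromptCachingStrategy,
  ToolResultDecayStrategy,
  createContextManagementRuntime,
} from "ai-sdk-context-management";

// Strategies run in array order: stable prefix first, protection before
// pruning, agent-directed tools next, reminders last.
const runtime = createContextManagementRuntime({
  strategies: [
    new SystemPromptCachingStrategy(),
    new PinnedMessagesStrategy(),
    new ToolResultDecayStrategy({ maxPromptTokens: 24_000 }),
    new ScratchpadStrategy(),
    new ContextUtilizationReminderStrategy(),
  ],
});
```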
Choosing A Stack
- Short, bounded conversations: SlidingWindowStrategy
- Preserve setup plus the latest turns: SlidingWindowStrategy({ headCount, keepLastMessages })
- Tool-heavy agents: SystemPromptCachingStrategy + ToolResultDecayStrategy
- Long-running agents: SystemPromptCachingStrategy + PinnedMessagesStrategy + ToolResultDecayStrategy + SummarizationStrategy({ model })
- Agents that self-manage context: SystemPromptCachingStrategy + PinnedMessagesStrategy + ScratchpadStrategy + CompactionToolStrategy
- Full graduated stack: run examples/04-composed-strategies.ts
Tool Result Decay
ToolResultDecayStrategy now decays tool context using:
effectiveDepth = depth * pressureFactor(toolContextTokens)

Default pressure anchors:

- 100 tool-context tokens -> depth factor 0.05
- 5_000 tool-context tokens -> depth factor 1
- 50_000 tool-context tokens -> depth factor 5
That means low-token tool usage can remain intact for many turns, while heavy tool sessions decay aggressively much earlier.
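One way to picture the pressure curve is piecewise-linear interpolation between the default anchors, clamped at both ends. The library's exact interpolation is not documented here, so this sketch is only guaranteed to reproduce the anchor points themselves:

```typescript
type PressureAnchor = { toolTokens: number; depthFactor: number };

// Default anchors from the section above.
const defaultAnchors: PressureAnchor[] = [
  { toolTokens: 100, depthFactor: 0.05 },
  { toolTokens: 5_000, depthFactor: 1 },
  { toolTokens: 50_000, depthFactor: 5 },
];

// Illustrative curve: linear between adjacent anchors, clamped outside them.
function pressureFactor(
  toolContextTokens: number,
  anchors: PressureAnchor[] = defaultAnchors,
): number {
  if (toolContextTokens <= anchors[0].toolTokens) return anchors[0].depthFactor;
  const last = anchors[anchors.length - 1];
  if (toolContextTokens >= last.toolTokens) return last.depthFactor;
  for (let i = 1; i < anchors.length; i++) {
    const lo = anchors[i - 1];
    const hi = anchors[i];
    if (toolContextTokens <= hi.toolTokens) {
      const t = (toolContextTokens - lo.toolTokens) / (hi.toolTokens - lo.toolTokens);
      return lo.depthFactor + t * (hi.depthFactor - lo.depthFactor);
    }
  }
  return last.depthFactor;
}

// effectiveDepth = depth * pressureFactor(toolContextTokens)
const effectiveDepth = (depth: number, toolContextTokens: number): number =>
  depth * pressureFactor(toolContextTokens);
```

At 5_000 tool tokens the factor is 1, so depth passes through unchanged; at 100 tool tokens a depth of 10 shrinks to an effective depth of 0.5, which is why light tool usage survives many turns.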
You can tune the curve with pressureAnchors and the warning forecast with warningForecastExtraTokens:
```typescript
new ToolResultDecayStrategy({
  pressureAnchors: [
    { toolTokens: 100, depthFactor: 0.05 },
    { toolTokens: 5_000, depthFactor: 1 },
    { toolTokens: 50_000, depthFactor: 5 },
  ],
  placeholderMinSourceTokens: 800,
  warningForecastExtraTokens: 10_000,
});
```

Warnings are emitted through the runtime reminder path, either into systemReminderContext or inline in the prompt, with machine-readable attributes:
- tool_call_ids
- placeholder_ids
- forecast_extra_tool_tokens
- forecast_tool_context_tokens
Runnable Examples
All examples are local and deterministic. They use mock models, print the transformed prompt, and show exactly what is interesting about the output.
| Example | Run | What to look for |
| --- | --- | --- |
| 01-sliding-window.ts | cd examples && npx tsx 01-sliding-window.ts | The oldest exchange disappears, so the model only sees the recent tail |
| 02-tool-result-decay.ts | cd examples && npx tsx 02-tool-result-decay.ts | Pressure-aware decay keeps light tool history longer, then replaces older heavy results with placeholders |
| 03-summarization.ts | cd examples && npx tsx 03-summarization.ts | A tagged summary system message replaces older turns |
| 04-composed-strategies.ts | cd examples && npx tsx 04-composed-strategies.ts | Multiple strategies stack cleanly and telemetry shows what ran |
| 05-sliding-window-head.ts | cd examples && npx tsx 05-sliding-window-head.ts | Setup context and the latest blocker remain, but the middle drops out |
| 06-system-prompt-caching.ts | cd examples && npx tsx 06-system-prompt-caching.ts | System instructions consolidate into a stable prefix |
| 07-model-backed-summarization.ts | cd examples && npx tsx 07-model-backed-summarization.ts | A model-generated summary replaces older discussion |
| 08-scratchpad.ts | cd examples && npx tsx 08-scratchpad.ts | scratchpad(...) changes what the next turn sees |
| 09-pinned-messages.ts | cd examples && npx tsx 09-pinned-messages.ts | One pinned tool result survives while other old ones decay |
| 10-compaction-tool.ts | cd examples && npx tsx 10-compaction-tool.ts | compact_context(...) compacts now and reuses the stored summary later |
| 11-context-utilization-reminder.ts | cd examples && npx tsx 11-context-utilization-reminder.ts | The latest user message gains a warning before hard pruning starts |
See examples/README.md for the full example index.
Runtime API
createContextManagementRuntime({ strategies, telemetry, estimator, systemReminderContext })
Returns:
- prepareRequest
- optionalTools
prepareRequest(...) returns:
- messages
- providerOptions
- toolChoice
- reportActualUsage(actualInputTokens)
The runtime merges tools from all strategies and throws on tool-name collisions.
If systemReminderContext is provided, reminders are routed through ai-sdk-system-reminders. Otherwise, reminders are appended inline to the prompt.
Scratchpad API
ScratchpadStrategy exposes a scratchpad(...) tool for maintaining current working state across turns.
The tool supports:
- description: required one-line note describing what this scratchpad update is doing
- setEntries: merge key/value entries into the scratchpad
- replaceEntries: replace all key/value entries
- removeEntryKeys: delete specific entries by key
- preserveTurns: keep the first N and last N user/assistant turns from before this scratchpad(...) call, trimming only the middle while preserving the raw messages inside those kept turns
- omitToolCallIds: remove completed tool exchanges after the important parts have been captured
Entry names are intentionally unconstrained.
Hosts can pass emptyStateGuidance to ScratchpadStrategy if they want to inject host-specific empty-scratchpad hints or behavioral guidance. The library does not add those hints by default.
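The key/value operations compose in a predictable way. The sketch below gives one plausible semantics for setEntries, replaceEntries, and removeEntryKeys on a plain record; the real tool also handles description, preserveTurns, and omitToolCallIds, which are out of scope here, and the function name is hypothetical:

```typescript
type ScratchpadEntries = Record<string, string>;

interface ScratchpadUpdate {
  setEntries?: ScratchpadEntries;
  replaceEntries?: ScratchpadEntries;
  removeEntryKeys?: string[];
}

// Illustrative state transition: replace discards prior entries, set merges,
// remove deletes by key. Applied in that order within a single update.
function applyScratchpadUpdate(
  state: ScratchpadEntries,
  update: ScratchpadUpdate,
): ScratchpadEntries {
  let next: ScratchpadEntries = update.replaceEntries
    ? { ...update.replaceEntries }
    : { ...state };
  if (update.setEntries) next = { ...next, ...update.setEntries };
  for (const key of update.removeEntryKeys ?? []) delete next[key];
  return next;
}
```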
Telemetry
The runtime can emit raw events for every request:
- runtime-start
- strategy-complete
- tool-execute-start
- tool-execute-complete
- tool-execute-error
- runtime-complete
These events include request context, before/after token estimates, removed-tool deltas, pinned-tool deltas, and the final provider-facing prompt snapshot after prepareRequest(...) runs. The runtime-complete event also includes the final providerOptions and toolChoice.
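A host can collect those events with a simple sink. The event names below come from the list above, but the payload fields and the emit(...) hook shape are assumptions for this sketch; check the telemetry option's actual signature:

```typescript
type ContextTelemetryEventType =
  | "runtime-start"
  | "strategy-complete"
  | "tool-execute-start"
  | "tool-execute-complete"
  | "tool-execute-error"
  | "runtime-complete";

// Hypothetical event shape: real events carry request context, token
// estimates, and prompt snapshots as described above.
interface ContextTelemetryEvent {
  type: ContextTelemetryEventType;
  conversationId: string;
  detail?: Record<string, unknown>;
}

// In-memory sink, handy for tests and debugging.
function createCollectingTelemetry() {
  const events: ContextTelemetryEvent[] = [];
  return {
    emit(event: ContextTelemetryEvent) {
      events.push(event);
    },
    events,
  };
}
```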
Utilities
The package also exports:
- createDefaultPromptTokenEstimator
- createLlmSummarizer
- buildSummaryTranscript
- buildDeterministicSummary
- CONTEXT_MANAGEMENT_KEY
Running Locally
```shell
bun test
bun run typecheck
bun run build
```