@p-vbordei/context-compressor
v1.0.0
Context window compression and token-budget management for LLMs
context-compressor
A robust context length manager and conversation compressor for LLM agent workspaces. It handles secret scrubbing/redaction, token estimation, tool output pruning/deduplication, and progressive context summarization.
License
Apache License 2.0 (100% independent and open-source).
Features
- Secret Redaction: Regex-based masking for database connection strings, JWTs, Slack/GitHub/Discord tokens, private keys, environment variables, authorization headers, and phone numbers.
- Token Estimation: Fast character-based token approximation that mirrors the Python floor-division formula (length + 3) // 4, i.e. roughly one token per four characters, rounded up.
- Tool Result Pruning & Deduplication: MD5-based elimination of duplicate tool outputs, one-line summaries for very large tool outputs (terminal output, file reads/writes), and truncation of JSON parameters.
- Progressive Compression: Automatically protects the head of the conversation (system instructions, first user/assistant exchange) and the tail turns, while progressively summarizing the middle turns through a customizable callback.
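The token-estimation formula in the list above translates directly to TypeScript. The sketch below is illustrative only: `estimateTokens` is a hypothetical name chosen for this example, not necessarily the function this package exports.

```typescript
// Hypothetical helper; the package's real export may have a different name.
// Mirrors Python's (length + 3) // 4: one token per four characters, rounded up.
function estimateTokens(text: string): number {
  return Math.floor((text.length + 3) / 4);
}

console.log(estimateTokens("Start task X.")); // 13 characters -> 4
```

Note that a JavaScript string's `length` counts UTF-16 code units, whereas Python's `len` counts code points, so the two can differ for text containing characters outside the Basic Multilingual Plane.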
Installation
npm install context-compressor
Usage
1. Sensitive Text Redaction
Use redactSensitiveText to mask sensitive API keys, database connection strings, and authorization tokens:
import { redactSensitiveText } from 'context-compressor';
const dirtyLogs = "Connecting with DB URL postgresql://admin:superSecretPassword@localhost:5432/mydb and token ghp_1234567890abcdefghij1234567890abcdefghij";
const cleanLogs = redactSensitiveText(dirtyLogs, true);
console.log(cleanLogs);
// "Connecting with DB URL postgresql://admin:***@localhost:5432/mydb and token ghp_12...fghij"
2. Context Compression and Summarization
Set up ContextCompressor to manage long prompt messages and fit within target context windows:
import { ContextCompressor, Message } from 'context-compressor';
const compressor = new ContextCompressor({
  contextLength: 8000,
  thresholdPercent: 0.50, // Compress when active prompt exceeds 4000 tokens
  protectFirstN: 1, // Protect system + first user message
  protectLastN: 4, // Protect latest 4 messages from summarizing
  summarizeCallback: async (prompt: string) => {
    // Invoke an LLM to summarize the formatted turns.
    // callLlm is a placeholder for your own LLM client, not part of this package.
    const response = await callLlm({
      system: "Summarize this part of the conversation compactly, keeping key facts.",
      prompt
    });
    return response.text;
  }
});
let messages: Message[] = [
  { role: 'system', content: 'You are an agent.' },
  { role: 'user', content: 'Start task X.' },
  // ... many middle tool calls, terminal outputs, and file modifications ...
  { role: 'assistant', content: 'Here is what we have.' },
  { role: 'user', content: 'Now show me the status.' }
];
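// Token count of the current prompt. Prefer your provider's reported usage if available;
// here it is approximated with the (length + 3) // 4 formula, assuming string content.
const promptTokens = messages.reduce((sum, m) => sum + Math.floor((m.content.length + 3) / 4), 0);
// Check if we exceed the budget and need compression
if (compressor.shouldCompress(promptTokens)) {
  messages = await compressor.compress(messages, promptTokens);
}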
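To make the configuration values above more concrete, here is a minimal sketch of how a threshold and the protected head and tail could be derived from them. It is an assumption about the general approach, not the package's actual implementation; `ChatMessage`, `compressionThreshold`, and `splitForCompression` are hypothetical names introduced only for this illustration.

```typescript
// Hypothetical sketch; not the library's internal code.
interface ChatMessage {
  role: string;
  content: string;
}

interface SplitResult {
  head: ChatMessage[];   // protected leading messages
  middle: ChatMessage[]; // candidates for summarization
  tail: ChatMessage[];   // protected most recent messages
}

// Token count above which compression would be triggered.
function compressionThreshold(contextLength: number, thresholdPercent: number): number {
  return Math.floor(contextLength * thresholdPercent);
}

// Partition messages into protected head, summarizable middle, and protected tail.
// Head and tail never overlap: short conversations simply have an empty middle.
function splitForCompression(
  messages: ChatMessage[],
  protectFirstN: number,
  protectLastN: number
): SplitResult {
  const headEnd = Math.min(protectFirstN, messages.length);
  const tailStart = Math.max(headEnd, messages.length - protectLastN);
  return {
    head: messages.slice(0, headEnd),
    middle: messages.slice(headEnd, tailStart),
    tail: messages.slice(tailStart),
  };
}

console.log(compressionThreshold(8000, 0.5)); // 4000
```

Under these assumptions, only the `middle` slice would be formatted and passed to a summarization callback, and its summary would be placed between the untouched `head` and `tail`.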