@stackforgeai/copilot-context

v1.0.0

Context management, hierarchical chunking, retrieval pipelines, schema-enforced structured output, and persistent session memory for GitHub Copilot SDK — all calls guarded by @stackforgeai/copilot-guard.


Overview

@stackforgeai/copilot-context is a production-grade module that solves the core challenges of working with large-context LLM workflows:

  • Context windows grow unbounded — this module manages context budgets with auto-compaction
  • RAG retrieval lacks structure — hierarchical chunking with parent-child expansion and multi-stage retrieval pipelines
  • LLM outputs are unpredictable — schema-enforced structured output with validation and retry
  • Conversation history explodes token costs — session memory with auto-compaction preserves continuity within budget
  • Latency is invisible — P50/P95/P99 latency tracking across all operations

All LLM calls are routed through @stackforgeai/copilot-guard for token budget enforcement. Direct @github/copilot-sdk access is never used.


Features

ContextManager

  • Bounded context window with configurable token budget
  • Priority-based entry retention during compaction
  • Auto-compaction when utilization exceeds threshold
  • LLM-powered summarization of older entries via the guard
  • High-priority entries (≥5) are protected from compaction
  • Real-time utilization tracking
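
The retention rules above (entries with priority ≥ 5 are protected, and the most recent entries are always kept) can be pictured as a selection step that runs before summarization. The following is a minimal standalone sketch, not the package's implementation: `ContextEntry` and `selectCompactable` are hypothetical names, and a default priority of 0 is an assumption.

```typescript
// Hypothetical entry shape for illustration; the package's internal types may differ.
interface ContextEntry {
  content: string;
  role: string;
  priority: number; // assumed: higher means more important, 0 when unspecified
}

// From the feature list: entries with priority >= 5 are never compacted.
const PROTECTED_PRIORITY = 5;

// Split entries into those that may be summarized and those kept verbatim.
function selectCompactable(
  entries: ContextEntry[],
  keepRecentCount: number,
): { compactable: ContextEntry[]; kept: ContextEntry[] } {
  const recentStart = Math.max(0, entries.length - keepRecentCount);
  const compactable: ContextEntry[] = [];
  const kept: ContextEntry[] = [];
  entries.forEach((entry, index) => {
    const isRecent = index >= recentStart;
    const isProtected = entry.priority >= PROTECTED_PRIORITY;
    (isRecent || isProtected ? kept : compactable).push(entry);
  });
  return { compactable, kept };
}

const demo = selectCompactable(
  [
    { content: "System instructions", role: "system", priority: 5 },
    { content: "Old question", role: "user", priority: 0 },
    { content: "Old answer", role: "assistant", priority: 0 },
    { content: "Recent question", role: "user", priority: 0 },
    { content: "Recent answer", role: "assistant", priority: 0 },
  ],
  2,
);
console.log(demo.compactable.map((e) => e.content)); // [ 'Old question', 'Old answer' ]
```

Only the `compactable` entries would be handed to the summarizer; the `kept` entries stay in the window unchanged.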

ChunkingEngine

  • Hierarchical parent-child document chunking
  • Configurable chunk size and overlap
  • Parent chunks group children for expansion during retrieval
  • Flat (non-hierarchical) mode available
  • Metadata propagation to all chunks
  • Parent/child lookup helpers
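
To make the parent-child idea concrete, here is a small standalone sketch of overlapping chunking followed by grouping children under parents. It is an illustration under stated assumptions, not the ChunkingEngine itself: whitespace-separated words stand in for tokens, and `SketchChunk` and `chunkHierarchically` are hypothetical names.

```typescript
// Hypothetical chunk shape; field names are illustrative only.
interface SketchChunk {
  id: string;
  text: string;
  parentId?: string;   // set on child chunks
  childIds?: string[]; // set on parent chunks
}

// Assumption: words approximate tokens, which keeps the sketch dependency-free.
function chunkHierarchically(
  docId: string,
  text: string,
  chunkSize: number,
  overlap: number,
  childrenPerParent: number,
): SketchChunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize || childrenPerParent <= 0) {
    throw new Error("require 0 <= overlap < chunkSize and childrenPerParent > 0");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const step = chunkSize - overlap; // each window starts `step` words after the previous one
  const children: SketchChunk[] = [];
  const ranges: Array<[number, number]> = [];
  for (let start = 0; start < words.length; start += step) {
    const end = Math.min(start + chunkSize, words.length);
    ranges.push([start, end]);
    children.push({ id: `${docId}-c${children.length}`, text: words.slice(start, end).join(" ") });
    if (end === words.length) break; // the last window reached the end of the document
  }
  const parents: SketchChunk[] = [];
  for (let i = 0; i < children.length; i += childrenPerParent) {
    const group = children.slice(i, i + childrenPerParent);
    const parentId = `${docId}-p${parents.length}`;
    const [from] = ranges[i];
    const [, to] = ranges[i + group.length - 1];
    group.forEach((child) => {
      child.parentId = parentId;
    });
    // The parent covers the full word span of its children, without repeating the overlap.
    parents.push({ id: parentId, text: words.slice(from, to).join(" "), childIds: group.map((c) => c.id) });
  }
  return [...parents, ...children];
}

const demoChunks = chunkHierarchically("doc-1", "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9", 4, 1, 2);
const demoParents = demoChunks.filter((c) => c.childIds !== undefined);
const demoChildren = demoChunks.filter((c) => c.parentId !== undefined);
console.log(`${demoParents.length} parents, ${demoChildren.length} children`); // 2 parents, 3 children
```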

RetrievalPipeline

  • 4-stage pipeline: embed → retrieve → rerank → stitch
  • Term-based retrieval (no external vector DB required)
  • LLM-powered reranking via the guard for relevance scoring
  • Parent-chunk expansion (child match → full parent context)
  • Per-stage latency instrumentation
  • Configurable retrieval count and top-K
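
The retrieve and parent-expansion steps can be sketched without any vector database. The example below is a simplified illustration, assuming a plain term-overlap score; the README does not document the package's actual scoring function, and `IndexedChunk`, `termScore`, and `retrieveWithExpansion` are hypothetical names. The LLM rerank stage is omitted.

```typescript
interface IndexedChunk {
  id: string;
  text: string;
  parentId?: string;
}

const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Score = fraction of distinct query terms that also occur in the chunk.
function termScore(query: string, text: string): number {
  const queryTerms = Array.from(new Set(tokenize(query)));
  if (queryTerms.length === 0) return 0;
  const textTerms = new Set(tokenize(text));
  return queryTerms.filter((term) => textTerms.has(term)).length / queryTerms.length;
}

// Rank child chunks, then swap each match for its parent (deduplicated) when one exists.
function retrieveWithExpansion(
  query: string,
  children: IndexedChunk[],
  parents: Map<string, IndexedChunk>,
  topK: number,
): IndexedChunk[] {
  const ranked = children
    .map((chunk) => ({ chunk, score: termScore(query, chunk.text) }))
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
  const seen = new Set<string>();
  const results: IndexedChunk[] = [];
  for (const { chunk } of ranked) {
    const parent = chunk.parentId !== undefined ? parents.get(chunk.parentId) : undefined;
    const expanded = parent ?? chunk;
    if (!seen.has(expanded.id)) {
      seen.add(expanded.id);
      results.push(expanded);
    }
  }
  return results;
}

const parentIndex = new Map<string, IndexedChunk>([
  ["p0", { id: "p0", text: "Login uses OAuth2 tokens. Sessions expire after one hour." }],
  ["p1", { id: "p1", text: "Billing runs nightly." }],
]);
const childIndex: IndexedChunk[] = [
  { id: "c0", text: "Login uses OAuth2 tokens.", parentId: "p0" },
  { id: "c1", text: "Sessions expire after one hour.", parentId: "p0" },
  { id: "c2", text: "Billing runs nightly.", parentId: "p1" },
];
const hits = retrieveWithExpansion("How does OAuth2 login work?", childIndex, parentIndex, 5);
console.log(hits.map((h) => h.id)); // [ 'p0' ]
```

Matching on small child chunks keeps scoring precise, while returning the parent gives the model the surrounding context.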

SchemaEnforcer

  • Schema-driven structured JSON output from LLMs
  • Type validation: string, number, boolean, array, object
  • Required/optional field support
  • Automatic retry with error feedback on validation failure
  • JSON extraction from markdown fences and preamble
  • Array and object schemas supported
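
Two of the steps listed above, pulling JSON out of a fenced or prefaced reply and checking field types, can be sketched as follows. This is an independent illustration rather than the SchemaEnforcer source; `FieldSpec`, `extractJson`, and `validateFields` are hypothetical names, and treating fields as required unless `required` is `false` is an assumption.

```typescript
type FieldType = "string" | "number" | "boolean" | "array" | "object";

interface FieldSpec {
  name: string;
  type: FieldType;
  required?: boolean; // assumption: fields are required unless this is false
}

// Build the three-backtick fence marker programmatically.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE);

// Pull the JSON payload out of a reply that may wrap it in a fence or preamble.
function extractJson(reply: string): unknown {
  const fenced = reply.match(FENCED_BLOCK);
  const candidate = fenced ? fenced[1] : reply;
  const start = candidate.search(/[\[{]/);
  const end = Math.max(candidate.lastIndexOf("}"), candidate.lastIndexOf("]"));
  if (start === -1 || end < start) throw new Error("No JSON found in reply");
  return JSON.parse(candidate.slice(start, end + 1));
}

// Return human-readable validation errors; an empty array means the value is valid.
function validateFields(value: unknown, fields: FieldSpec[]): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["expected a JSON object"];
  }
  const record = value as Record<string, unknown>;
  const errors: string[] = [];
  for (const field of fields) {
    const v = record[field.name];
    if (v === undefined) {
      if (field.required !== false) errors.push(`missing field "${field.name}"`);
      continue;
    }
    const actual = Array.isArray(v) ? "array" : v === null ? "null" : typeof v;
    if (actual !== field.type) {
      errors.push(`field "${field.name}": expected ${field.type}, got ${actual}`);
    }
  }
  return errors;
}

const reply = `Here is the endpoint:\n${FENCE}json\n{"method":"GET","path":"/users","description":42}\n${FENCE}`;
const fieldErrors = validateFields(extractJson(reply), [
  { name: "method", type: "string" },
  { name: "path", type: "string" },
  { name: "description", type: "string" },
]);
console.log(fieldErrors); // [ 'field "description": expected string, got number' ]
```

Error strings like these are the kind of feedback that can be appended to the prompt on a retry.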

SessionMemory

  • Token-budgeted conversation memory
  • Auto-compaction via LLM summarization
  • Snapshot/restore for session persistence
  • Configurable auto-compact toggle
  • Render as prompt-ready context string

ContextObserver

  • Per-operation latency recording
  • P50/P95/P99 percentile calculations
  • Aggregated summary across operations
  • JSON export for observability backends
  • time() helper for automatic latency measurement
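
For readers unfamiliar with percentile tracking, the sketch below shows one way P50/P95/P99 can be derived from recorded durations, along with a `time()`-style wrapper. It uses the nearest-rank definition, which is an assumption; the package may compute percentiles differently. `LatencyRecorder` is a hypothetical name, not the ContextObserver API.

```typescript
// Nearest-rank percentile over recorded samples (one common definition).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

class LatencyRecorder {
  private samples = new Map<string, number[]>();

  record(operation: string, durationMs: number): void {
    const list = this.samples.get(operation) ?? [];
    list.push(durationMs);
    this.samples.set(operation, list);
  }

  // Run fn, recording how long it took even if it throws.
  async time<T>(operation: string, fn: () => Promise<T> | T): Promise<T> {
    const start = Date.now();
    try {
      return await fn();
    } finally {
      this.record(operation, Date.now() - start);
    }
  }

  summary(operation: string): { count: number; p50: number; p95: number; p99: number } {
    const list = this.samples.get(operation) ?? [];
    return {
      count: list.length,
      p50: percentile(list, 50),
      p95: percentile(list, 95),
      p99: percentile(list, 99),
    };
  }
}

const recorder = new LatencyRecorder();
for (let ms = 1; ms <= 100; ms++) recorder.record("retrieve", ms);
const latency = recorder.summary("retrieve");
console.log(latency); // { count: 100, p50: 50, p95: 95, p99: 99 }
```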

Installation

npm install @stackforgeai/copilot-context @stackforgeai/copilot-guard @github/copilot-sdk

Requirements:

  • Node.js ≥ 20
  • @github/copilot-sdk must be installed as a peer dependency
  • @stackforgeai/copilot-guard is a direct dependency

Usage Examples

Context Management with Auto-Compaction

import { ContextManager } from "@stackforgeai/copilot-context";

const manager = new ContextManager({
  maxTokens: 4_000,
  compactionModel: "gpt-4o-mini",
  compactionThreshold: 0.8,  // Auto-compact at 80% utilization
  keepRecentCount: 2,        // Always keep the 2 most recent entries
});

// Add context entries with priority
await manager.add("System instructions for the assistant.", "system", 5); // High priority
await manager.add("User asked about API design.", "user");
await manager.add("Suggested REST endpoints.", "assistant");

// Check utilization
console.log(`${manager.getTotalTokens()} tokens used (${(manager.getUtilization() * 100).toFixed(0)}%)`);

// Render for inclusion in a prompt
const context = manager.render();

// Manual compaction (also happens automatically on add)
const result = await manager.compact();
console.log(`Compacted ${result.entriesCompacted} entries, saved ${result.tokensBefore - result.tokensAfter} tokens`);

Hierarchical Chunking + Retrieval

import { ChunkingEngine, RetrievalPipeline } from "@stackforgeai/copilot-context";

// Chunk a document (documentText is a string you have already loaded)
const engine = new ChunkingEngine({
  chunkSize: 256,       // tokens per chunk
  overlap: 32,          // overlap between chunks
  hierarchical: true,   // create parent-child groups
  childrenPerParent: 4,
});

const chunks = engine.chunk("doc-1", documentText);
console.log(`${engine.getParents(chunks).length} parents, ${engine.getChildren(chunks).length} children`);

// Build retrieval pipeline
const pipeline = new RetrievalPipeline({
  model: "gpt-4o-mini",
  retrievalCount: 20,
  rerankTopK: 5,
  expandToParent: true, // Expand child matches to parent for richer context
});

pipeline.index(chunks);

const result = await pipeline.retrieve({ query: "How does authentication work?", topK: 5 });
console.log(`Retrieved ${result.chunks.length} chunks in ${result.totalDurationMs}ms`);
console.log("Context:", result.context);

// Check per-stage latency
for (const stage of result.stages) {
  console.log(`${stage.stage}: ${stage.durationMs}ms`);
}

Schema-Enforced Structured Output

import { SchemaEnforcer } from "@stackforgeai/copilot-context";

const enforcer = new SchemaEnforcer({
  model: "gpt-4o-mini",
  maxRetries: 2,
});

const schema = {
  name: "APIEndpoint",
  fields: [
    { name: "method", type: "string", description: "HTTP method" },
    { name: "path", type: "string", description: "URL path" },
    { name: "description", type: "string", description: "What the endpoint does" },
  ],
  isArray: true,
};

const endpoints = await enforcer.enforce(
  "Design 5 REST API endpoints for a user management service.",
  schema,
);
// endpoints is an array of objects that passed validation;
// if every attempt fails validation, enforce() throws instead of returning
console.log(endpoints);

Session Memory with Persistence

import { SessionMemory } from "@stackforgeai/copilot-context";

const memory = new SessionMemory({
  maxTokens: 2_000,
  compactionModel: "gpt-4o-mini",
  sessionId: "project-alpha",
  autoCompact: true,
});

await memory.addTurn("user", "I need a REST API for user management.");
await memory.addTurn("assistant", "I'll design endpoints for CRUD operations.");
await memory.addTurn("user", "Add OAuth2 authentication.");

// Render for inclusion in a prompt
const context = memory.render();

// Save session state
const snapshot = memory.getSnapshot();
// Store snapshot to file/DB...

// Restore in a new instance
const restored = new SessionMemory({ maxTokens: 2_000, compactionModel: "gpt-4o-mini" });
restored.restore(snapshot);
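
The "store snapshot to file/DB" step is left open above. One possible approach is a pair of JSON file helpers, sketched below with Node's built-in `fs` module. This assumes the snapshot is plain JSON-serializable data, which the README does not confirm; `saveSnapshot`, `loadSnapshot`, and the stand-in snapshot fields are hypothetical.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: the snapshot contains only JSON-serializable data (no class
// instances, functions, or cyclic references). Verify against the real type.
function saveSnapshot(filePath: string, snapshot: unknown): void {
  writeFileSync(filePath, JSON.stringify(snapshot, null, 2), "utf8");
}

function loadSnapshot<T>(filePath: string): T {
  return JSON.parse(readFileSync(filePath, "utf8")) as T;
}

// Stand-in data for the demo; a real snapshot would come from memory.getSnapshot().
const exampleSnapshot = {
  sessionId: "project-alpha",
  turns: [{ role: "user", content: "Add OAuth2 authentication." }],
};

const snapshotPath = join(mkdtempSync(join(tmpdir(), "session-")), "snapshot.json");
saveSnapshot(snapshotPath, exampleSnapshot);
const reloaded = loadSnapshot<typeof exampleSnapshot>(snapshotPath);
console.log(reloaded.sessionId); // project-alpha
```

The reloaded object would then be passed to `restore()` on a fresh SessionMemory instance, as in the example above.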

Configuration

ContextManager

| Option | Type | Default | Description |
|---|---|---|---|
| maxTokens | number | — | Maximum token budget for the context window |
| compactionModel | string | — | Model ID for LLM-powered compaction |
| compactionThreshold | number | 0.8 | Utilization ratio (0–1) that triggers auto-compaction |
| keepRecentCount | number | 2 | Minimum entries to keep uncompacted |

ChunkingEngine

| Option | Type | Default | Description |
|---|---|---|---|
| chunkSize | number | 512 | Target chunk size in tokens |
| overlap | number | 64 | Overlap tokens between adjacent chunks |
| hierarchical | boolean | true | Whether to create parent-child groups |
| childrenPerParent | number | 4 | Number of child chunks per parent |

RetrievalPipeline

| Option | Type | Default | Description |
|---|---|---|---|
| model | string | — | Model for LLM-powered reranking |
| retrievalCount | number | 20 | Candidates to retrieve before reranking |
| rerankTopK | number | 5 | Top results after reranking |
| expandToParent | boolean | true | Expand child matches to parent chunks |

SchemaEnforcer

| Option | Type | Default | Description |
|---|---|---|---|
| model | string | — | Model for structured output generation |
| maxRetries | number | 2 | Retry attempts on validation failure |
| timeout | number | 60000 | Guard call timeout in ms |

SessionMemory

| Option | Type | Default | Description |
|---|---|---|---|
| maxTokens | number | — | Maximum token budget for memory |
| compactionModel | string | — | Model for compaction summarization |
| sessionId | string | auto | Session identifier |
| autoCompact | boolean | true | Auto-compact when budget exceeded |


Architecture Overview

┌────────────────────────────────────────────────────────┐
│                  @stackforgeai/copilot-context          │
│                                                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐ │
│  │ContextManager│  │ChunkingEngine│  │SchemaEnforcer│ │
│  │ (compaction) │  │(parent/child)│  │  (validate)  │ │
│  └──────┬───────┘  └──────┬───────┘  └──────┬───────┘ │
│         │                 │                  │         │
│  ┌──────┴───────┐  ┌─────┴────────┐  ┌─────┴──────┐  │
│  │SessionMemory │  │  Retrieval   │  │  Context   │  │
│  │ (persistence)│  │  Pipeline    │  │  Observer  │  │
│  └──────┬───────┘  └──────┬───────┘  └────────────┘  │
│         │                 │                            │
│         └────────┬────────┘                            │
│                  ▼                                     │
│       ┌──────────────────┐                             │
│       │   IGuard (DI)    │                             │
│       └────────┬─────────┘                             │
└────────────────┼───────────────────────────────────────┘
                 ▼
       ┌──────────────────┐
       │ @stackforgeai/   │
       │ copilot-guard    │
       │ (token budget)   │
       └────────┬─────────┘
                ▼
       ┌──────────────────┐
       │ @github/         │
       │ copilot-sdk      │
       └──────────────────┘

Key architectural decisions:

  • All LLM calls flow through the IGuard interface → CopilotGuard → copilot-sdk
  • No direct copilot-sdk imports in any source file except via the guard
  • Dependency injection via IGuard enables unit testing with mocks
  • ContextObserver is embedded in each component for latency tracking
  • Hierarchical chunking follows the parent-child expansion pattern used in production RAG systems
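
The testing benefit of injecting the guard can be illustrated with a mock. The README does not show the IGuard interface, so the shape below (`GuardLike` with a single `complete` method) is purely hypothetical, as are `Summarizer` and `MockGuard`. The point is only the pattern: a component receives its LLM dependency through the constructor, so a test can substitute a fake that records calls.

```typescript
// Hypothetical stand-in for the real IGuard interface, which is not shown in this README.
interface GuardLike {
  complete(prompt: string): Promise<string>;
}

// A component that never talks to an LLM directly, only through the injected guard.
class Summarizer {
  constructor(private readonly guard: GuardLike) {}

  async summarize(entries: string[]): Promise<string> {
    return this.guard.complete("Summarize the following entries:\n" + entries.join("\n"));
  }
}

// Test double: records every prompt and returns a canned reply, with no network access.
class MockGuard implements GuardLike {
  readonly calls: string[] = [];

  async complete(prompt: string): Promise<string> {
    this.calls.push(prompt);
    return "canned summary";
  }
}

const mock = new MockGuard();
const pendingSummary = new Summarizer(mock).summarize(["User asked about API design.", "Suggested REST endpoints."]);
pendingSummary.then((text) => console.log(text, `(${mock.calls.length} guard call)`));
```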

Troubleshooting

"Could not find a declaration file for module '@stackforgeai/copilot-guard'"

Ensure @stackforgeai/copilot-guard is installed and has been built (npm run build in the guard package). The dist/ folder must contain .d.ts files.

"@github/copilot-sdk is not installed"

Install the peer dependency:

npm install @github/copilot-sdk

"All N attempts failed validation for schema"

The LLM consistently returned output that did not match the schema. Try:

  • Simplifying the schema (fewer fields, simpler types)
  • Using a more capable model (e.g., gpt-4.1 instead of gpt-4o-mini)
  • Increasing maxRetries
  • Adding more context to the task prompt
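
To see why raising maxRetries or simplifying the schema helps, it is useful to picture the retry loop. The sketch below is a generic illustration, not the package's code: `Generate` stands in for a guarded LLM call, `generateWithRetry` is a hypothetical name, and treating maxRetries as retries after the first attempt (so maxRetries + 1 attempts in total) is an assumption.

```typescript
type Generate = (prompt: string) => Promise<string>; // hypothetical stand-in for a guarded LLM call
type Validate = (raw: string) => string[]; // returns validation errors, empty when valid

async function generateWithRetry(
  task: string,
  generate: Generate,
  validate: Validate,
  maxRetries = 2,
): Promise<string> {
  const attempts = maxRetries + 1; // assumption: one initial attempt plus maxRetries retries
  let prompt = task;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const raw = await generate(prompt);
    const errors = validate(raw);
    if (errors.length === 0) return raw;
    // Feed the errors back so the next attempt can correct them.
    prompt = `${task}\n\nYour previous reply was invalid:\n- ${errors.join("\n- ")}\nReturn corrected JSON only.`;
  }
  throw new Error(`All ${attempts} attempts failed validation`);
}

// Demo with a stub that fails once, then returns valid JSON.
let calls = 0;
const flakyGenerate: Generate = async () => (++calls === 1 ? "not json" : '{"ok": true}');
const isJson: Validate = (raw) => {
  try {
    JSON.parse(raw);
    return [];
  } catch {
    return ["reply is not valid JSON"];
  }
};

const demoResult = generateWithRetry("Return a JSON object.", flakyGenerate, isJson);
demoResult.then((raw) => console.log(raw, `after ${calls} calls`));
```

Each retry costs another LLM call, so a higher maxRetries trades tokens for a better chance of a valid result.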

Auto-compaction not triggering

Check that compactionThreshold is set (default 0.8). Compaction triggers only when both conditions hold: adding the new entry would push utilization above the threshold, and the context holds more entries than keepRecentCount.
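
When debugging, it can help to evaluate the trigger condition by hand. The function below restates the rule described above as a standalone check; `shouldAutoCompact` and its parameter names are hypothetical, and whether the entry count includes the incoming entry is not specified in this README (the sketch counts only existing entries).

```typescript
interface CompactionCheck {
  currentTokens: number;        // tokens already in the context
  incomingTokens: number;       // estimated tokens of the entry being added
  entryCount: number;           // assumption: existing entries, excluding the new one
  maxTokens: number;
  compactionThreshold?: number; // documented default: 0.8
  keepRecentCount?: number;     // documented default: 2
}

function shouldAutoCompact(check: CompactionCheck): boolean {
  const threshold = check.compactionThreshold ?? 0.8;
  const keepRecent = check.keepRecentCount ?? 2;
  const projectedUtilization = (check.currentTokens + check.incomingTokens) / check.maxTokens;
  return projectedUtilization > threshold && check.entryCount > keepRecent;
}

// 3,300 / 4,000 = 0.825, above 0.8, with 5 entries: compaction would trigger.
console.log(shouldAutoCompact({ currentTokens: 3000, incomingTokens: 300, entryCount: 5, maxTokens: 4000 })); // true
// Same utilization but only 2 entries, all within keepRecentCount: no compaction.
console.log(shouldAutoCompact({ currentTokens: 3000, incomingTokens: 300, entryCount: 2, maxTokens: 4000 })); // false
```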

Token estimation is approximate

Token counts use a chars / 4 heuristic. Actual token counts depend on the model's tokenizer. For precise budgeting, track the outputTokens from guard responses.
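
As a rough illustration of the heuristic, the sketch below estimates tokens as characters divided by four. Rounding up is an assumption, since the README does not state the rounding rule, and `estimateTokens` is a hypothetical name.

```typescript
// chars / 4 heuristic; rounding up is an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Utilization as a fraction of the budget, based on the estimate above.
function estimateUtilization(texts: string[], maxTokens: number): number {
  const total = texts.reduce((sum, t) => sum + estimateTokens(t), 0);
  return total / maxTokens;
}

console.log(estimateTokens("abcdefgh"));  // 2
console.log(estimateTokens("abcdefghi")); // 3
console.log(estimateUtilization(["abcdefgh", "abcdefghi"], 10)); // 0.5
```

Because the ratio of characters to tokens varies with the tokenizer and the kind of text, estimates like this can drift noticeably from real usage, which is why leaving headroom in maxTokens is prudent.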


DISCLAIMER AND LIMITATION OF LIABILITY

IMPORTANT: THIS SOFTWARE IS PROVIDED STRICTLY ON AN "AS IS" AND "AS AVAILABLE" BASIS.

BY USING THIS SOFTWARE, YOU ACKNOWLEDGE AND AGREE THAT:

  • THE SOFTWARE MAY CONTAIN BUGS, DEFECTS, DESIGN FLAWS, LOGIC ERRORS, SECURITY ISSUES, OR INCOMPLETE FEATURES
  • THE SOFTWARE MAY FAIL TO LIMIT OR PREVENT TOKEN USAGE, API REQUESTS, COST OVERRUNS, OR BILLING EVENTS
  • TOKEN ESTIMATION, CONTEXT COMPACTION, CHUNKING, RETRIEVAL, SCHEMA VALIDATION, AND MEMORY MANAGEMENT FEATURES MAY BE INACCURATE, INCOMPLETE, OR NON-FUNCTIONAL
  • THE SOFTWARE MAY PRODUCE UNEXPECTED RESULTS
  • THE SOFTWARE MAY NOT BE SUITABLE FOR PRODUCTION ENVIRONMENTS
  • THE SOFTWARE MAY NOT PREVENT EXCESSIVE CHARGES FROM AI PROVIDERS OR CLOUD SERVICES

THIS SOFTWARE DOES NOT GUARANTEE:

  • COST SAVINGS
  • BILLING PROTECTION
  • TOKEN ACCURACY
  • FINANCIAL PROTECTION
  • RETRIEVAL ACCURACY
  • SCHEMA COMPLIANCE
  • CONTEXT PRESERVATION
  • SESSION CONTINUITY
  • SYSTEM STABILITY
  • SECURITY
  • RELIABILITY
  • FITNESS FOR ANY PARTICULAR PURPOSE

TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW:

THE AUTHORS, CONTRIBUTORS, MAINTAINERS, COPYRIGHT HOLDERS, AFFILIATES, AND DISTRIBUTORS SHALL NOT BE LIABLE FOR ANY CLAIMS, DAMAGES, LOSSES, LIABILITIES, OR EXPENSES OF ANY KIND, INCLUDING BUT NOT LIMITED TO:

  • API FEES
  • TOKEN CHARGES
  • CLOUD COMPUTE COSTS
  • INFRASTRUCTURE COSTS
  • FINANCIAL LOSSES
  • LOST PROFITS
  • BUSINESS INTERRUPTION
  • SERVICE OUTAGES
  • DATA LOSS
  • DATA CORRUPTION
  • SECURITY INCIDENTS
  • INDIRECT DAMAGES
  • INCIDENTAL DAMAGES
  • CONSEQUENTIAL DAMAGES
  • SPECIAL DAMAGES
  • PUNITIVE DAMAGES
  • MISUSE OF THE SOFTWARE
  • FAILURE OF SAFETY FEATURES
  • FAILURE OF TOKEN LIMITS
  • FAILURE OF CONTEXT COMPACTION
  • FAILURE OF RETRIEVAL ACCURACY
  • FAILURE OF SCHEMA VALIDATION
  • FAILURE OF SESSION MEMORY
  • ERRORS IN TOKEN ESTIMATION
  • EXCESSIVE BILLING EVENTS
  • PRODUCTION FAILURES

USE OF THIS SOFTWARE IS ENTIRELY AT YOUR OWN RISK.

YOU ARE SOLELY RESPONSIBLE FOR:

  • VERIFYING ALL OUTPUTS
  • MONITORING API USAGE
  • MONITORING TOKEN CONSUMPTION
  • MONITORING BILLING
  • IMPLEMENTING ADDITIONAL SAFEGUARDS
  • TESTING IN YOUR OWN ENVIRONMENT
  • CONFIGURING APPROPRIATE LIMITS
  • VALIDATING ALL EXECUTION LOGIC
  • MAINTAINING BACKUPS AND RECOVERY PROCEDURES

THIS PROJECT SHOULD NOT BE USED AS THE SOLE OR PRIMARY MECHANISM FOR COST CONTROL, BILLING GOVERNANCE, SECURITY, OR PRODUCTION SAFETY.

ALWAYS IMPLEMENT INDEPENDENT PROVIDER-SIDE BILLING ALERTS, RATE LIMITS, BUDGET CONTROLS, AND MONITORING SYSTEMS.

IF YOU DO NOT AGREE WITH THESE TERMS, DO NOT USE THIS SOFTWARE.


License

MIT License

Copyright (c) 2026 StackForgeAI

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

For full license text, see the LICENSE file.