@stackforgeai/copilot-guard

v1.0.2

Published

a month ago

Lightweight runtime quota and usage guard for GitHub Copilot SDK integration.

xerrex

@stackforgeai/copilot-guard

Guardrails for AI SDK usage to help reduce accidental excessive token consumption, runaway request loops, recursive prompt execution, and unsafe request orchestration patterns.

Overview

@stackforgeai/copilot-guard is a lightweight middleware/guard layer designed to help developers add defensive protections around AI SDK calls.

This project focuses on preventing common mistakes such as:

accidental high-volume request loops
recursive AI execution chains
unbounded retries
excessive concurrent requests
unexpected token consumption
uncontrolled batch processing
runaway orchestration logic

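Of these, the current release enforces only a cumulative output token budget for premium models; request, retry, concurrency, and loop limits are not implemented (see Non-Goals). The package is intended to be provider-agnostic and may be used alongside AI SDKs and tooling ecosystems.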

Features

Premium model output token budget limiting
Premium model detection (static list + live SDK billing metadata)
Actual output token tracking from API responses
Hard block when premium token limit is reached
Live model list loading with billing metadata
Usage introspection via getUsage()
Lightweight integration
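
Taken together, the features above describe a running counter that is checked before each premium call and updated from each response. A minimal sketch of that idea follows. It is not the package's source: `TokenBudget`, its method names, and the clamping of `remaining` at zero are assumptions made for illustration.

```javascript
// Illustrative sketch only; NOT the package's implementation.
// All names in this block are hypothetical.
class TokenBudget {
    constructor(limit) {
        this.limit = limit; // maximum cumulative output tokens
        this.used = 0;      // output tokens recorded so far
    }

    // Throw before a premium call if the budget is already spent.
    assertAvailable(model) {
        if (this.used >= this.limit) {
            throw new Error(
                `Blocked: '${model}' is premium; used ${this.used} / ${this.limit} output tokens.`
            );
        }
    }

    // Record the output token count reported by an API response.
    record(outputTokens) {
        this.used += outputTokens;
    }

    // Same shape as the object documented for getUsage().
    usage() {
        return {
            premiumTokensUsed: this.used,
            premiumLimit: this.limit,
            remaining: Math.max(0, this.limit - this.used) // clamping is an assumption
        };
    }
}

const budget = new TokenBudget(100);
budget.assertAvailable('gpt-4.1'); // passes: nothing has been used yet
budget.record(120);                // one response can exceed the limit
console.log(budget.usage());
```

In a design like this, tokens are counted only after a response arrives, so a single call can push usage past the limit; the block takes effect on the next premium call.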

Installation

npm install @stackforgeai/copilot-guard

Quick Start

const { CopilotGuard } = require('@stackforgeai/copilot-guard');

const guard = new CopilotGuard({
    premiumLimit: 50000  // max output tokens allowed for premium models
});

async function main() {
    // Optionally load live model metadata for accurate premium detection
    await guard.loadAvailableModels();

    const result = await guard.sendAndWait({
        model: 'gpt-4.1',
        prompt: 'Hello, world!'
    });

    console.log(result);
    console.log(guard.getUsage());
    // { premiumTokensUsed: 12, premiumLimit: 50000, remaining: 49988 }
}

main().catch(console.error); // surface errors, including a blocked premium call

Example: Blocking When Token Budget Is Exhausted

const { CopilotGuard } = require('@stackforgeai/copilot-guard');

const guard = new CopilotGuard({ premiumLimit: 100 });

// Once premiumTokensUsed >= premiumLimit, further calls to premium
// models throw an error rather than making an API request.
async function main() {
    try {
        const result = await guard.sendAndWait({
            model: 'gpt-4.1',
            prompt: 'Summarize this document...'
        });
        console.log(result);
    } catch (err) {
        console.error(err.message);
        // [CopilotGuard] Blocked: 'gpt-4.1' is a premium model.
        // Premium token limit reached: used 100 / 100 output tokens.
    }
}

main();

If the premium token limit is reached, the guard throws before making any API call.
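
One way a caller might react to a blocked request is to retry with a different model. The helper below is a sketch under stated assumptions: `sendWithFallback` is a hypothetical name, and it assumes a blocked call rejects with an `Error` whose message starts with `[CopilotGuard] Blocked`, as in the sample output above. That message format may change between versions, so treat the check as fragile.

```javascript
// Hypothetical helper; not part of the package.
// `guard` is anything exposing sendAndWait(req), such as a CopilotGuard instance.
async function sendWithFallback(guard, req, fallbackModel) {
    try {
        return await guard.sendAndWait(req);
    } catch (err) {
        const blocked = err instanceof Error &&
            err.message.startsWith('[CopilotGuard] Blocked');
        if (!blocked) {
            throw err; // unrelated failure: do not mask it
        }
        // Retry once with the caller-supplied fallback model.
        return guard.sendAndWait({ ...req, model: fallbackModel });
    }
}
```

The fallback model is supplied by the caller; which models count as non-premium depends on your plan and is not something this sketch can determine.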

Example: Using Messages and Attachments

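// Runs inside an async function; `guard` is a CopilotGuard instance.
const result = await guard.sendAndWait({
    model: 'claude-3.7-sonnet',
    messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Explain recursion.' }
    ],
    // Optional. The 'file' type and the path are illustrative assumptions;
    // check the SDK for the attachment types it accepts.
    attachments: [{ type: 'file', path: './notes.md' }]
});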

Configuration Reference

const guard = new CopilotGuard({
    premiumLimit: 100000  // maximum output tokens allowed across all premium model calls
});

| Option | Type | Description |
|---|---|---|
| premiumLimit | number | Maximum cumulative output tokens allowed for premium models |

API Reference

`sendAndWait(req, timeout?)`

Sends a request to the Copilot SDK and waits for the response. If the model is premium and the premium token budget is already exhausted, it throws without making the API call.

req: {
    model: string;           // model id (e.g. 'gpt-4.1', 'claude-3.7-sonnet')
    prompt?: string;         // plain text prompt
    messages?: { role: string; content: string }[];  // chat messages
    attachments?: { type: string; path?: string }[]; // optional file attachments
}
timeout?: number             // milliseconds to wait for a response (default: 60000)
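
The request shape above can be checked before a call is made. The function below is a hypothetical pre-flight check that mirrors only the documented field types; `checkRequest` is not part of the package, and the guard's own validation, if it performs any, may differ.

```javascript
// Hypothetical pre-flight check mirroring the documented request shape.
// Returns a list of problems; an empty list means the shape looks valid.
function checkRequest(req) {
    if (req === null || typeof req !== 'object') {
        return ['req must be an object'];
    }
    const problems = [];
    if (typeof req.model !== 'string' || req.model === '') {
        problems.push('model must be a non-empty string');
    }
    if (req.prompt !== undefined && typeof req.prompt !== 'string') {
        problems.push('prompt must be a string');
    }
    if (req.messages !== undefined) {
        const ok = Array.isArray(req.messages) && req.messages.every(
            (m) => m !== null && typeof m === 'object' &&
                typeof m.role === 'string' && typeof m.content === 'string'
        );
        if (!ok) problems.push('messages must be an array of { role, content } objects');
    }
    if (req.attachments !== undefined) {
        const ok = Array.isArray(req.attachments) && req.attachments.every(
            (a) => a !== null && typeof a === 'object' && typeof a.type === 'string' &&
                (a.path === undefined || typeof a.path === 'string')
        );
        if (!ok) problems.push('attachments must be an array of { type, path? } objects');
    }
    return problems;
}

console.log(checkRequest({ model: 'gpt-4.1', prompt: 'Hello, world!' })); // prints []
```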

`loadAvailableModels()`

Fetches the live model list from the Copilot SDK and caches billing metadata. Call this once at startup for accurate premium model detection.
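
Premium detection is described as combining a static list with live billing metadata. The sketch below shows one plausible way to combine the two. Everything in it is an assumption: the `STATIC_PREMIUM` contents, the `billing.isPremium` field, and the rule that live metadata takes precedence are placeholders, since the real list and metadata shape are not documented here.

```javascript
// Hypothetical sketch; the field names and precedence rule are assumptions.
const STATIC_PREMIUM = new Set(['example-premium-model']); // placeholder entry

// liveModels: a Map from model id to cached metadata, or null if not loaded.
function isPremium(modelId, liveModels) {
    const live = liveModels && liveModels.get(modelId);
    if (live && live.billing && typeof live.billing.isPremium === 'boolean') {
        return live.billing.isPremium; // live metadata wins when present
    }
    return STATIC_PREMIUM.has(modelId); // otherwise fall back to the static list
}

const cache = new Map([['some-model', { billing: { isPremium: true } }]]);
console.log(isPremium('some-model', cache));
console.log(isPremium('example-premium-model', null));
```

A fallback of this kind would explain why loading the live list is optional but recommended: without it, a model missing from the static list would not be treated as premium.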

`getUsage()`

Returns current token usage stats:

{ premiumTokensUsed: number, premiumLimit: number, remaining: number }
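
The usage object lends itself to a simple early-warning check. The helper below is a hypothetical sketch built only on the documented shape; `budgetStatus` and the 80% default threshold are assumptions, not part of the package.

```javascript
// Hypothetical helper built on the documented getUsage() shape.
function budgetStatus(usage, warnAt = 0.8) {
    if (usage.remaining <= 0) {
        return 'exhausted';
    }
    // Treat a zero or negative limit as fully consumed to avoid dividing by zero.
    const fraction = usage.premiumLimit > 0
        ? usage.premiumTokensUsed / usage.premiumLimit
        : 1;
    return fraction >= warnAt ? 'warning' : 'ok';
}

// Using the numbers from the Quick Start output:
console.log(budgetStatus({ premiumTokensUsed: 12, premiumLimit: 50000, remaining: 49988 })); // prints "ok"
```

In practice the argument would be `guard.getUsage()`, called after each request.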

Intended Use Cases

AI SDK wrappers
Agent frameworks
Prompt orchestration systems
Batch AI processing
Background workers
CLI tools
AI-enabled services
Experimental AI workflows

Goals

This project aims to help developers:

guard against accidental premium model token overuse
surface live token usage during a session
block requests before they hit the API when limits are reached
experiment more safely with premium AI models

Non-Goals

This package does NOT guarantee:

prevention of all excessive billing events
accurate token estimation (output tokens are taken from the API response; pre-call estimation is not performed)
request count limiting, concurrency limiting, retry limiting, or throttling (not implemented)
loop detection (not implemented)
compatibility with non-Copilot AI providers
prevention of all logic errors
production-grade fault tolerance
complete protection against misuse
compliance or security certification

Developers remain fully responsible for validating behavior in their own environments.
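
Since request count limiting is not implemented, callers who need it have to add it themselves. The wrapper below is one minimal way to cap the number of calls in a process. It is a hypothetical caller-side sketch: `limitCalls` is not provided by the package, and it does not replace provider-side limits.

```javascript
// Hypothetical caller-side safeguard; not provided by the package.
// Wraps any async function and rejects once maxCalls attempts have been made.
function limitCalls(fn, maxCalls) {
    let calls = 0;
    return async function limited(...args) {
        if (calls >= maxCalls) {
            throw new Error(`Call limit reached: ${calls} / ${maxCalls}`);
        }
        calls += 1; // count the attempt before awaiting, so concurrent calls are counted too
        return fn(...args);
    };
}
```

It could wrap the guard as `const send = limitCalls((req) => guard.sendAndWait(req), 20);`, after which a runaway loop fails on the 21st attempt instead of continuing to issue requests.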

Compatibility

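This package targets the GitHub Copilot SDK. It may also be usable alongside the following, though this is not guaranteed (see Non-Goals):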

OpenAI SDKs
Anthropic SDKs
GitHub Copilot-related tooling
local LLM runtimes
custom AI orchestration systems
experimental AI frameworks

Compatibility may vary depending on implementation details.

Development Status

This project may be experimental, under active development, incomplete, or subject to breaking changes at any time.

Interfaces, behaviors, APIs, and internal logic may change without notice.

DISCLAIMER AND LIMITATION OF LIABILITY

IMPORTANT: THIS SOFTWARE IS PROVIDED STRICTLY ON AN "AS IS" AND "AS AVAILABLE" BASIS.

BY USING THIS SOFTWARE, YOU ACKNOWLEDGE AND AGREE THAT:

THE SOFTWARE MAY CONTAIN BUGS, DEFECTS, DESIGN FLAWS, LOGIC ERRORS, SECURITY ISSUES, OR INCOMPLETE FEATURES
THE SOFTWARE MAY FAIL TO LIMIT OR PREVENT TOKEN USAGE, API REQUESTS, COST OVERRUNS, OR BILLING EVENTS
TOKEN ESTIMATION, RATE LIMITING, LOOP DETECTION, THROTTLING, AND SAFETY FEATURES MAY BE INACCURATE, INCOMPLETE, OR NON-FUNCTIONAL
THE SOFTWARE MAY PRODUCE UNEXPECTED RESULTS
THE SOFTWARE MAY NOT BE SUITABLE FOR PRODUCTION ENVIRONMENTS
THE SOFTWARE MAY NOT PREVENT EXCESSIVE CHARGES FROM AI PROVIDERS OR CLOUD SERVICES

THIS SOFTWARE DOES NOT GUARANTEE:

COST SAVINGS
BILLING PROTECTION
TOKEN ACCURACY
FINANCIAL PROTECTION
REQUEST SAFETY
SYSTEM STABILITY
SECURITY
RELIABILITY
FITNESS FOR ANY PARTICULAR PURPOSE

TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW:

THE AUTHORS, CONTRIBUTORS, MAINTAINERS, COPYRIGHT HOLDERS, AFFILIATES, AND DISTRIBUTORS SHALL NOT BE LIABLE FOR ANY CLAIMS, DAMAGES, LOSSES, LIABILITIES, OR EXPENSES OF ANY KIND, INCLUDING BUT NOT LIMITED TO:

API FEES
TOKEN CHARGES
CLOUD COMPUTE COSTS
INFRASTRUCTURE COSTS
FINANCIAL LOSSES
LOST PROFITS
BUSINESS INTERRUPTION
SERVICE OUTAGES
DATA LOSS
DATA CORRUPTION
SECURITY INCIDENTS
INDIRECT DAMAGES
INCIDENTAL DAMAGES
CONSEQUENTIAL DAMAGES
SPECIAL DAMAGES
PUNITIVE DAMAGES
MISUSE OF THE SOFTWARE
FAILURE OF SAFETY FEATURES
FAILURE OF RATE LIMITS
FAILURE OF TOKEN LIMITS
FAILURE OF LOOP DETECTION
FAILED REQUEST BLOCKING
ERRORS IN COST ESTIMATION
EXCESSIVE BILLING EVENTS
PRODUCTION FAILURES

USE OF THIS SOFTWARE IS ENTIRELY AT YOUR OWN RISK.

YOU ARE SOLELY RESPONSIBLE FOR:

VERIFYING ALL OUTPUTS
MONITORING API USAGE
MONITORING TOKEN CONSUMPTION
MONITORING BILLING
IMPLEMENTING ADDITIONAL SAFEGUARDS
TESTING IN YOUR OWN ENVIRONMENT
CONFIGURING APPROPRIATE LIMITS
VALIDATING ALL EXECUTION LOGIC
MAINTAINING BACKUPS AND RECOVERY PROCEDURES

THIS PROJECT SHOULD NOT BE USED AS THE SOLE OR PRIMARY MECHANISM FOR COST CONTROL, BILLING GOVERNANCE, SECURITY, OR PRODUCTION SAFETY.

ALWAYS IMPLEMENT INDEPENDENT PROVIDER-SIDE BILLING ALERTS, RATE LIMITS, BUDGET CONTROLS, AND MONITORING SYSTEMS.

IF YOU DO NOT AGREE WITH THESE TERMS, DO NOT USE THIS SOFTWARE.

License

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND.

For full license text, see the LICENSE file.
