@stackforgeai/copilot-skills
v1.0.0
Composable skill engineering framework for GitHub Copilot SDK — define, register, compose, and execute reusable AI skills with typed I/O, context awareness, caching, and full observability, all routed through copilot-guard.
Overview
@stackforgeai/copilot-skills provides a structured, production-ready layer for building reusable AI "skills" — typed prompt recipes with defined inputs, outputs, and execution policies — on top of the @github/copilot-sdk.
Inspired by the emerging trend of skill files and prompt engineering as first-class developer interfaces, this package treats each AI capability as a composable, testable, cacheable unit. Skills can be chained sequentially, run in parallel, gated by approval handlers, and observed with latency percentiles and token metrics.
All LLM calls are routed through @stackforgeai/copilot-guard. Direct SDK access is not permitted. copilot-guard controls token budgets, premium model governance, and execution authorization.
Features
- Typed Skill Definitions — Structured descriptors with prompt templates, input schemas, model overrides, and execution policies
- SkillRegistry — Central registry for discovering, filtering, and managing skills by name, tag, or search query
- SkillRunner — Executes skills through copilot-guard with input validation, approval gates, and caching
- SkillComposer — Chain skills sequentially with output routing and conditional step control, or run them in parallel
- BuiltinSkills — Pre-built, production-ready skill templates: summarize, classify, extract, translate, review-code, generate-tests, draft-email, analyze-sentiment
- SkillCache — TTL-based result cache keyed by (skillName, input) to avoid redundant API calls and reduce token spend
- SkillObserver — Per-skill execution metrics with P50/P95/P99 latency percentiles and cache hit tracking
- Approval Gates — Per-skill requiresApproval flag with pluggable async approval handlers
- Template Engine — Lightweight {{variable}}, {{#if}}, and {{#each}} interpolation without external dependencies
- Zero-dependency validation — Schema-based input validation with descriptive error messages
- Full TypeScript support — Strict types for all APIs, inputs, outputs, and configuration
- copilot-guard enforced — Token budgets, premium model blocking, and usage introspection are always active
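The P50/P95/P99 figures listed for SkillObserver are latency percentiles. The sketch below shows one common way to compute them, the nearest-rank method. Which method the package actually uses is not documented here, so treat the `percentile` helper and the sample latencies as hypothetical illustrations rather than the library's implementation.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are less than or equal to it.
// Hypothetical helper for illustration; not part of the package's API.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error('percentile() needs at least one sample');
  }
  if (p <= 0 || p > 100) {
    throw new RangeError('p must be in the range (0, 100]');
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Made-up latencies for ten runs of one skill, in milliseconds.
const latenciesMs = [120, 340, 400, 410, 430, 480, 520, 610, 820, 1050];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
console.log({ p50, p95, p99 });
```

With only ten samples, P95 and P99 both land on the slowest run, which is why tail percentiles only become informative once a skill has been executed many times.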
Installation
npm install @stackforgeai/copilot-skills @stackforgeai/copilot-guard @github/copilot-sdk

Usage Examples
Basic: Register and run a custom skill
import { CopilotSkills } from '@stackforgeai/copilot-skills';

const skills = new CopilotSkills({
  premiumLimit: 100, // max billing units for premium model calls (free models cost 0)
  defaultModel: 'gpt-4.1', // valid free Copilot model (billing.multiplier = 0)
  enableCaching: true,
});

skills.register({
  name: 'product-description',
  description: 'Generate e-commerce product copy.',
  systemPrompt: 'You are a professional copywriter. Write persuasively.',
  promptTemplate:
    'Write a product description for {{productName}}.\n' +
    '{{#if features}}Key features: {{features}}\n{{/if}}' +
    'Target audience: {{audience}}',
  outputFormat: 'text',
  cacheable: true,
  inputSchema: {
    fields: [
      { name: 'productName', type: 'string', required: true },
      { name: 'audience', type: 'string', required: true },
      { name: 'features', type: 'string', required: false },
    ],
  },
});

const result = await skills.run('product-description', {
  productName: 'UltraSound Pro Headphones',
  audience: 'audiophiles',
  features: '40-hour battery, ANC, lossless audio',
});

console.log(result.text); // LLM-generated description
console.log(result.tokensUsed); // Output tokens consumed
console.log(result.cached); // false on first call
console.log(result.latencyMs); // Wall-clock latency
console.log(result.traceId); // Unique trace ID

Built-in Skill Templates
Register all pre-built skills in one line:
import { CopilotSkills, BuiltinSkills } from '@stackforgeai/copilot-skills';

const skills = new CopilotSkills({ premiumLimit: 100 });

skills.registerAll([
  BuiltinSkills.summarize(),
  BuiltinSkills.classify(),
  BuiltinSkills.extract(),
  BuiltinSkills.translate(),
  BuiltinSkills.reviewCode(),
  BuiltinSkills.generateTests(),
  BuiltinSkills.draftEmail(),
  BuiltinSkills.analyzeSentiment(),
]);

// Summarize
const summary = await skills.run('summarize', {
  text: 'Long article text...',
  maxBullets: 5,
});

// Translate
const translation = await skills.run('translate', {
  text: 'Hello, world!',
  targetLanguage: 'Japanese',
});

// Analyze sentiment
const sentiment = await skills.run('analyze-sentiment', {
  text: 'This product is absolutely fantastic!',
});
// sentiment.text → {"sentiment":"positive","confidence":0.97,"reasoning":"..."}

Override any built-in field when registering:
skills.register(
  BuiltinSkills.summarize({ model: 'gpt-4.1', cacheTTLMs: 3_600_000 }),
  true, // override
);

Sequential Composition
Chain skills where each step's output feeds the next:
const result = await skills.compose(
  [
    // Step 1: summarize
    { skill: 'summarize' },
    // Step 2: translate the summary — maps previous output to next input
    {
      skill: 'translate',
      inputMapper: (previousOutput, originalInput) => ({
        text: String(previousOutput),
        targetLanguage: originalInput.targetLanguage as string,
      }),
    },
    // Step 3: classify sentiment — skipped if text is too short
    {
      skill: 'analyze-sentiment',
      condition: (prev) => String(prev).length > 20,
      inputMapper: (prev) => ({ text: String(prev) }),
    },
  ],
  { text: 'Long document...', targetLanguage: 'French' },
);

console.log(result.finalOutput); // output of last executed step
console.log(result.totalTokens); // tokens across all steps
console.log(result.steps.length); // number of steps that ran

Parallel Execution
Run multiple independent skills at the same time:
const result = await skills.parallel(
  ['summarize', 'classify', 'analyze-sentiment'],
  { text: 'Product review text...', categories: 'positive, neutral, negative' },
);

for (const step of result.steps) {
  console.log(step.skillName, '→', step.result.text);
}

Approval Gates
Require human (or automated) sign-off before a skill executes:
// Option 1: Global handler in config
const skills = new CopilotSkills({
  globalApprovalHandler: async (skillName, input) => {
    console.log(`Approve "${skillName}"?`, input);
    return true; // Replace with real user prompt or policy check
  },
});

// Register the gated skill on the instance created above
skills.register({
  name: 'delete-records',
  description: 'Generate a delete SQL statement',
  promptTemplate: 'Generate DELETE SQL for table {{table}} where {{condition}}',
  requiresApproval: true,
});

// Option 2: Per-call handler (used for this call instead of the global handler)
// confirmWithHuman() is a placeholder for your own async confirmation function
const result = await skills.run(
  'delete-records',
  { table: 'users', condition: 'inactive = true' },
  { approvalHandler: async () => confirmWithHuman() },
);

Observability
// Guard token usage
console.log(skills.getUsage());
// { premiumTokensUsed: 340, premiumLimit: 50000, remaining: 49660 }

// Per-skill metrics with latency percentiles
const metrics = skills.getMetrics('summarize');
console.log(metrics);
// {
//   skillName: 'summarize',
//   totalRuns: 12,
//   cacheHits: 7,
//   totalTokens: 250,
//   avgLatencyMs: 430,
//   p50LatencyMs: 400,
//   p95LatencyMs: 820,
//   p99LatencyMs: 1050,
// }

// All skills at once
const all = skills.getAllMetrics();

// Invalidate cache for a specific skill
skills.invalidateCache('summarize');

// Clear entire cache
skills.clearCache();

Registry Management
// Search by name, description, or tag
const results = skills.searchSkills('code'); // finds 'review-code', 'generate-tests'
const codingSkills = skills.listSkillsByTag('engineering');

// Check if registered
if (skills.hasSkill('summarize')) {
  console.log('Ready to summarize.');
}

// Remove a skill
skills.unregister('temp-skill');

// Fluent registration
skills
  .register(BuiltinSkills.summarize())
  .register(BuiltinSkills.translate())
  .register(myCustomSkill);

Configuration
// Shape of the options object passed to `new CopilotSkills({ ... })`.
// The interface name is illustrative; the option names, types, and defaults are those listed below.
interface CopilotSkillsConfig {
  // Inject a custom guard (or mock for testing) — default: creates CopilotGuard
  guard?: IGuard;

  // Max cumulative billing units for the default CopilotGuard instance — default: 100
  // Free models (billing.multiplier=0) cost 0 per call; premium models cost their multiplier per call
  premiumLimit?: number;

  // Default model when a skill does not specify one — default: 'gpt-4.1' (free model, multiplier=0)
  // Must be a valid Copilot SDK model ID. Call guard.loadAvailableModels() to enumerate valid IDs.
  defaultModel?: string;

  // Set to false to disable result caching globally — default: true
  enableCaching?: boolean;

  // Called for all skills with requiresApproval: true when no per-call handler is provided
  globalApprovalHandler?: (skillName: string, input: SkillInput) => Promise<boolean>;
}

SkillDefinition fields
| Field | Type | Required | Description |
|---|---|---|---|
| name | string | ✅ | Unique registry key |
| description | string | ✅ | Human-readable summary |
| promptTemplate | string | ✅ | Template with {{var}}, {{#if}}, {{#each}} syntax |
| version | string | — | Semantic version (e.g. "1.0.0") |
| model | string | — | Override default model for this skill |
| tags | string[] | — | Categorization tags for search and filtering |
| systemPrompt | string | — | System role prompt; enables messages format |
| outputFormat | 'text'\|'json'\|'markdown'\|'list' | — | Injects format instructions prefix |
| inputSchema | SkillSchema | — | Field definitions for pre-execution validation |
| requiresApproval | boolean | — | If true, approval handler must return true before execution |
| cacheable | boolean | — | If true, identical inputs return cached output |
| cacheTTLMs | number | — | Cache time-to-live in milliseconds (default: 300_000) |
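To make the promptTemplate syntax concrete, here is a minimal sketch of how `{{var}}` substitution and a non-nested `{{#if}}` block could be rendered. It is not the package's renderTemplate: the function name `renderSketch`, the "present" rule in `isPresent`, and the choice to render missing values as an empty string are all assumptions, and `{{#each}}` and nested blocks are left out.

```typescript
type TemplateInput = Record<string, unknown>;

// Assumption: a value counts as "present" unless it is undefined, null, false, or ''.
function isPresent(value: unknown): boolean {
  return value !== undefined && value !== null && value !== false && value !== '';
}

// Hypothetical renderer covering {{var}} and non-nested {{#if var}}...{{/if}} only.
function renderSketch(template: string, input: TemplateInput): string {
  // 1. Keep or drop the body of each {{#if name}}...{{/if}} block.
  const withoutBlocks = template.replace(
    /\{\{#if (\w+)\}\}([\s\S]*?)\{\{\/if\}\}/g,
    (_match: string, key: string, body: string) => (isPresent(input[key]) ? body : ''),
  );
  // 2. Substitute the remaining {{name}} placeholders; missing values become ''.
  return withoutBlocks.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) =>
    isPresent(input[key]) ? String(input[key]) : '',
  );
}

const template =
  'Write a product description for {{productName}}.\n' +
  '{{#if features}}Key features: {{features}}\n{{/if}}' +
  'Target audience: {{audience}}';

const withFeatures = renderSketch(template, {
  productName: 'UltraSound Pro Headphones',
  audience: 'audiophiles',
  features: '40-hour battery, ANC',
});

// Without `features`, the whole "Key features" line is dropped.
const withoutFeatures = renderSketch(template, {
  productName: 'UltraSound Pro Headphones',
  audience: 'audiophiles',
});

console.log(withFeatures);
console.log(withoutFeatures);
```

Resolving the conditional blocks before the plain placeholders means a dropped block never leaves a half-rendered line behind.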
Architecture Overview
┌─────────────────────────────────────────────────────┐
│                    CopilotSkills                    │
│   (Unified facade — registry + runner + composer)   │
└──────────┬───────────────────────┬──────────────────┘
           │                       │
    ┌──────▼──────┐        ┌───────▼───────┐
    │SkillRegistry│        │ SkillComposer │
    │ register()  │        │ compose()     │
    │ search()    │        │ parallel()    │
    │ listByTag() │        └───────┬───────┘
    └─────────────┘                │
                                   │
                          ┌────────▼────────┐
                          │   SkillRunner   │
                          │ 1. Validate     │
                          │ 2. Approve      │
                          │ 3. Cache check  │
                          │ 4. Render tmpl  │
                          │ 5. guard.send() │ ←── ALL LLM calls
                          │ 6. Cache store  │
                          │ 7. Record obs   │
                          └────────┬────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │        copilot-guard        │
                    │  Token budget enforcement   │
                    │  Premium model blocking     │
                    │  Usage introspection        │
                    └──────────────┬──────────────┘
                                   │
                    ┌──────────────▼──────────────┐
                    │     @github/copilot-sdk     │
                    └─────────────────────────────┘

Supporting components:
  SkillCache     — TTL-based result cache (keyed by skillName + input)
  SkillObserver  — Per-skill metrics with P50/P95/P99 latency percentiles
  BuiltinSkills  — Pre-built skill definition factory (8 templates)
  renderTemplate — {{var}} / {{#if}} / {{#each}} template engine

Data flow for a single skill execution
- CopilotSkills.run(name, input) → SkillRunner.run()
- Input validated against inputSchema
- Approval gate checked (if requiresApproval)
- Cache lookup — returns immediately on hit
- renderTemplate(promptTemplate, input) produces final prompt
- copilot-guard.sendAndWait(model, prompt) — ONLY path to the AI model
- Result stored in SkillCache (if cacheable)
- Observation recorded in SkillObserver
- SkillResult returned to caller
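The data flow above can be sketched as a single function. Everything in this block is a stand-in for illustration: `SketchSkill`, `SketchDeps`, and `runSketch` are hypothetical names, the guard call is replaced by a plain `send` function, and the code is kept synchronous for brevity even though the real calls are asynchronous. Counting cache hits before returning is also an assumption, made because the reported metrics include cacheHits.

```typescript
// Hypothetical shapes; the real SkillDefinition has more fields.
interface SketchSkill {
  name: string;
  promptTemplate: string;
  requiredFields: string[];
  requiresApproval?: boolean;
  cacheable?: boolean;
}

interface SketchResult {
  text: string;
  cached: boolean;
}

interface SketchDeps {
  approve: (skillName: string) => boolean; // stand-in for an approval handler
  send: (prompt: string) => string; // stand-in for the guard call
  cache: Map<string, string>; // stand-in for SkillCache (no TTL here)
  observed: string[]; // stand-in for SkillObserver
}

function runSketch(
  skill: SketchSkill,
  input: Record<string, string>,
  deps: SketchDeps,
): SketchResult {
  // Validate: every required field must be present.
  for (const field of skill.requiredFields) {
    if (!(field in input)) {
      throw new Error(`Input validation failed: missing '${field}'`);
    }
  }
  // Approve: gated skills need an explicit yes.
  if (skill.requiresApproval && !deps.approve(skill.name)) {
    throw new Error('Execution rejected by the approval handler');
  }
  // Cache lookup. Note: JSON.stringify is sensitive to key order.
  const key = `${skill.name}:${JSON.stringify(input)}`;
  const hit = skill.cacheable ? deps.cache.get(key) : undefined;
  if (hit !== undefined) {
    deps.observed.push(`${skill.name}:hit`);
    return { text: hit, cached: true };
  }
  // Render: plain {{var}} substitution only.
  const prompt = skill.promptTemplate.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, k: string) => input[k] ?? '',
  );
  // Send, store, record.
  const text = deps.send(prompt);
  if (skill.cacheable) deps.cache.set(key, text);
  deps.observed.push(`${skill.name}:miss`);
  return { text, cached: false };
}

let sendCalls = 0;
const deps: SketchDeps = {
  approve: () => true,
  send: (prompt) => {
    sendCalls += 1;
    return `echo: ${prompt}`;
  },
  cache: new Map(),
  observed: [],
};
const greet: SketchSkill = {
  name: 'greet',
  promptTemplate: 'Say hello to {{who}}',
  requiredFields: ['who'],
  cacheable: true,
};

const first = runSketch(greet, { who: 'Ada' }, deps);
const second = runSketch(greet, { who: 'Ada' }, deps);
console.log(first.cached, second.cached, sendCalls); // false true 1
```

Because validation and approval run before the cache lookup, a cached answer is never returned for input that would be rejected or for a call that was not approved.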
Troubleshooting
[SkillRunner] Skill '...' is not registered
→ Call skills.register(definition) or skills.registerAll([...]) before skills.run().
[SkillRunner] Input validation failed
→ Check that all required fields in inputSchema are present in the input object and have the correct types.
[SkillRunner] Skill '...' requires approval but no approvalHandler was provided
→ Either set globalApprovalHandler in the constructor config or pass approvalHandler in SkillRunOptions.
[CopilotGuard] Blocked: premium token limit reached
→ Increase premiumLimit in the constructor, or switch to a non-premium model. Check usage with skills.getUsage().
@github/copilot-sdk is not installed
→ Run npm install @github/copilot-sdk in your project.
[SkillRunner] Execution rejected by the approval handler
→ Your approvalHandler returned false. This is expected behavior when an approval gate denies execution.
Skills not found in search() or listByTag()
→ Check that tags are defined on the skill definition and that the search query appears in them (matching is a case-insensitive substring match).
Cached result returned when fresh data is needed
→ Call skills.invalidateCache(skillName) to clear the cache for that skill, or pass { bypassCache: true } in SkillRunOptions.
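For the stale-cache case above, it can help to see how a TTL cache keyed by (skillName, input) typically behaves. The class below is a minimal sketch under stated assumptions, not the package's SkillCache: the name `TtlCacheSketch`, the JSON.stringify-based input key, the lazy eviction on read, and the injectable clock are all illustrative choices. The 300_000 ms default mirrors the documented cacheTTLMs default.

```typescript
interface CacheEntry {
  value: string;
  expiresAt: number; // absolute time in ms
}

// Hypothetical TTL cache grouped by skill name, then by serialized input.
class TtlCacheSketch {
  private readonly bySkill = new Map<string, Map<string, CacheEntry>>();
  private readonly now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(skillName: string, input: unknown): string | undefined {
    const inputKey = JSON.stringify(input);
    const entry = this.bySkill.get(skillName)?.get(inputKey);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Lazily evict entries whose TTL has elapsed.
      this.bySkill.get(skillName)?.delete(inputKey);
      return undefined;
    }
    return entry.value;
  }

  set(skillName: string, input: unknown, value: string, ttlMs = 300_000): void {
    const forSkill = this.bySkill.get(skillName) ?? new Map<string, CacheEntry>();
    forSkill.set(JSON.stringify(input), { value, expiresAt: this.now() + ttlMs });
    this.bySkill.set(skillName, forSkill);
  }

  // Drop every cached result for one skill.
  invalidate(skillName: string): void {
    this.bySkill.delete(skillName);
  }

  // Drop everything.
  clear(): void {
    this.bySkill.clear();
  }
}

let fakeNow = 0;
const cache = new TtlCacheSketch(() => fakeNow);

cache.set('summarize', { text: 'abc' }, 'cached summary', 1_000);
const fresh = cache.get('summarize', { text: 'abc' }); // still within the TTL

fakeNow = 1_000;
const expired = cache.get('summarize', { text: 'abc' }); // TTL has elapsed

cache.set('summarize', { text: 'abc' }, 'new summary', 1_000);
cache.invalidate('summarize');
const afterInvalidate = cache.get('summarize', { text: 'abc' });

console.log(fresh, expired, afterInvalidate);
```

Grouping entries by skill name makes per-skill invalidation a single map deletion, at the cost of one extra lookup on every read.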
DISCLAIMER AND LIMITATION OF LIABILITY
IMPORTANT: THIS SOFTWARE IS PROVIDED STRICTLY ON AN "AS IS" AND "AS AVAILABLE" BASIS.
BY USING THIS SOFTWARE, YOU ACKNOWLEDGE AND AGREE THAT:
- THE SOFTWARE MAY CONTAIN BUGS, DEFECTS, DESIGN FLAWS, LOGIC ERRORS, SECURITY ISSUES, OR INCOMPLETE FEATURES
- THE SOFTWARE MAY FAIL TO LIMIT OR PREVENT TOKEN USAGE, API REQUESTS, COST OVERRUNS, OR BILLING EVENTS
- SKILL CACHING, INPUT VALIDATION, APPROVAL GATES, AND SAFETY FEATURES MAY BE INACCURATE, INCOMPLETE, OR NON-FUNCTIONAL
- THE SOFTWARE MAY PRODUCE UNEXPECTED OR INCORRECT OUTPUTS FROM AI MODELS
- THE SOFTWARE MAY NOT BE SUITABLE FOR PRODUCTION ENVIRONMENTS WITHOUT ADDITIONAL SAFEGUARDS
- THE SOFTWARE MAY NOT PREVENT EXCESSIVE CHARGES FROM AI PROVIDERS OR CLOUD SERVICES
- APPROVAL GATES MAY BE BYPASSED, FAIL, OR BEHAVE UNEXPECTEDLY
THIS SOFTWARE DOES NOT GUARANTEE:
- COST SAVINGS
- BILLING PROTECTION
- TOKEN ACCURACY
- FINANCIAL PROTECTION
- REQUEST SAFETY
- SYSTEM STABILITY
- SECURITY
- RELIABILITY
- FITNESS FOR ANY PARTICULAR PURPOSE
- CORRECTNESS OF AI MODEL OUTPUTS
- EFFECTIVENESS OF APPROVAL GATES
- CACHE ACCURACY OR FRESHNESS
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW:
THE AUTHORS, CONTRIBUTORS, MAINTAINERS, COPYRIGHT HOLDERS, AFFILIATES, AND DISTRIBUTORS SHALL NOT BE LIABLE FOR ANY CLAIMS, DAMAGES, LOSSES, LIABILITIES, OR EXPENSES OF ANY KIND, INCLUDING BUT NOT LIMITED TO:
- API FEES
- TOKEN CHARGES
- CLOUD COMPUTE COSTS
- INFRASTRUCTURE COSTS
- FINANCIAL LOSSES
- LOST PROFITS
- BUSINESS INTERRUPTION
- SERVICE OUTAGES
- DATA LOSS
- DATA CORRUPTION
- SECURITY INCIDENTS
- INDIRECT DAMAGES
- INCIDENTAL DAMAGES
- CONSEQUENTIAL DAMAGES
- SPECIAL DAMAGES
- PUNITIVE DAMAGES
- MISUSE OF THE SOFTWARE
- FAILURE OF SAFETY FEATURES
- FAILURE OF APPROVAL GATES
- FAILURE OF TOKEN LIMITS
- STALE OR INCORRECT CACHED RESULTS
- FAILED REQUEST BLOCKING
- ERRORS IN COST ESTIMATION
- EXCESSIVE BILLING EVENTS
- PRODUCTION FAILURES
USE OF THIS SOFTWARE IS ENTIRELY AT YOUR OWN RISK.
YOU ARE SOLELY RESPONSIBLE FOR:
- VERIFYING ALL AI MODEL OUTPUTS BEFORE USE
- MONITORING API USAGE AND TOKEN CONSUMPTION
- MONITORING BILLING AND IMPLEMENTING PROVIDER-SIDE BUDGET CONTROLS
- IMPLEMENTING ADDITIONAL SAFEGUARDS APPROPRIATE TO YOUR ENVIRONMENT
- TESTING IN YOUR OWN ENVIRONMENT BEFORE PRODUCTION DEPLOYMENT
- CONFIGURING APPROPRIATE LIMITS AND APPROVAL POLICIES
- VALIDATING ALL EXECUTION LOGIC AND SKILL BEHAVIOR
- MAINTAINING BACKUPS AND RECOVERY PROCEDURES
THIS PROJECT SHOULD NOT BE USED AS THE SOLE OR PRIMARY MECHANISM FOR COST CONTROL, BILLING GOVERNANCE, SECURITY, COMPLIANCE, OR PRODUCTION SAFETY.
ALWAYS IMPLEMENT INDEPENDENT PROVIDER-SIDE BILLING ALERTS, RATE LIMITS, BUDGET CONTROLS, AND MONITORING SYSTEMS.
IF YOU DO NOT AGREE WITH THESE TERMS, DO NOT USE THIS SOFTWARE.
License
MIT License
Copyright (c) 2026 StackForgeAI
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
