@stackforgeai/copilot-genkit

v1.0.0

Published

8 days ago

Production-grade AI orchestration framework for GitHub Copilot SDK — flows, prompts, tools, structured output, middleware, and observability, all routed through copilot-guard.

xerrex

copilot github-copilot copilot-sdk copilot-guard genkit ai-framework flows prompts tools structured-output middleware observability function-calling schema-validation

Production-grade AI orchestration framework for the GitHub Copilot SDK — flows, prompts, tools, structured output, middleware, and observability. All LLM calls are routed through @stackforgeai/copilot-guard for token budget enforcement.

Overview

copilot-genkit is a production-grade AI orchestration framework purpose-built for the GitHub Copilot SDK. It delivers:

Flows — typed, observable async functions with automatic tracing
Prompts — Handlebars-style templates with role markers and variable interpolation
Tools — type-safe function definitions for model-driven tool calling
Structured Output — schema-enforced JSON output with automatic validation and retry
Middleware — composable request/response hooks (retry, cache, logging, custom)
Registry — centralized action discovery for flows, prompts, and tools
Observability — span-based tracing with latency percentiles and JSON export

All calls are guarded by copilot-guard — direct @github/copilot-sdk access is never used.

Features

| Feature | Description |
|---|---|
| defineFlow() | Create typed, traced async flows that can call generate() internally |
| definePrompt() | Handlebars-style templates with {{role "system"}}, {{#if}}, {{#each}} |
| defineTool() | Register tools for LLM function calling with type-safe handlers |
| generate() | Core generation with optional schema validation, tools, and middleware |
| Schema Validation | Lightweight schema enforcement with auto-repair on failure |
| Middleware | Composable retry(), cache(), logging(), and createMiddleware() |
| Registry | List, lookup, and discover all registered actions |
| Tracing | Automatic span creation, trace hierarchy, latency P50/P95/P99 |
| Guard Integration | All LLM calls routed through IGuard for token budget control |

Installation

npm install @stackforgeai/copilot-genkit @stackforgeai/copilot-guard @github/copilot-sdk

Or from the monorepo:

cd copilot-genkit
npm install
npm run build

Usage Examples

Below are several hands-on examples that show common copilot-genkit workflows. Each example includes:

Purpose: what the example demonstrates and when to use it.
Who it's for: suggested audience level (Beginner / Intermediate / Advanced).
Code: runnable snippet showing the API usage.
Notes & tips: background knowledge, pitfalls, and next steps.

Quick Start

Purpose: A minimal end-to-end example showing how to create a CopilotGenkit instance and make a simple generation call. Use this to verify your environment and SDK authentication.

Who it's for: Beginner — get something working quickly.

import { CopilotGenkit } from '@stackforgeai/copilot-genkit';

const ai = new CopilotGenkit({
  // Let copilot-guard select a safe default free model; avoid hardcoding premium models.
  premiumLimit: 100,
});

const response = await ai.generate({
  prompt: 'Explain event-driven architecture in 3 bullets.',
});
console.log(response.text);

Notes & tips:

Background: CopilotGenkit routes requests through copilot-guard, which enforces the token budget and validates model names. If you see model validation errors, call guard.loadAvailableModels() on your copilot-guard instance before making the first request.
Next steps: Try a prompt template or add output.schema (see Structured Output) to enforce structured responses.

Flows

Purpose: Demonstrates how to create a reusable, typed async flow that calls generate() internally and is instrumented with tracing.

Who it's for: Intermediate — building composable app logic around LLM calls.

const summarizer = ai.defineFlow(
  { name: 'summarize' },
  async (input: { text: string }, ctx) => {
    const result = await ai.generate({
      prompt: `Summarize: ${input.text}`,
    });
    return result.text;
  },
);

const result = await summarizer.run({ text: 'Long document...' });
console.log(result.output);    // Summarized text
console.log(result.traceId);   // Trace ID for observability
console.log(result.latencyMs); // Execution time

Notes & tips:

Background: Flows encapsulate business logic and make it easier to test and observe behavior. They are ideal for higher-level application features like summarization, extraction, or multi-step pipelines.
Instrumentation: Traces include traceId and span hierarchy so you can correlate downstream tool calls and retries.
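
To make the shape of a flow result concrete, here is a minimal sketch of how a wrapper could attach a trace ID and a latency measurement to any async function. It is not copilot-genkit's implementation of defineFlow(); `traceFlow`, `FlowResult`, and the injectable `now` clock are hypothetical names used only for illustration.

```typescript
// Hypothetical sketch: NOT the library's defineFlow(), only an illustration
// of wrapping an async function so each run reports output, traceId, latency.
interface FlowResult<O> {
  output: O;
  traceId: string;
  latencyMs: number;
}

let traceCounter = 0;

function traceFlow<I, O>(
  name: string,
  fn: (input: I) => Promise<O>,
  now: () => number = Date.now, // injectable clock, handy in tests
): { run: (input: I) => Promise<FlowResult<O>> } {
  return {
    async run(input: I): Promise<FlowResult<O>> {
      const traceId = `${name}-${++traceCounter}`;
      const start = now();
      const output = await fn(input);
      return { output, traceId, latencyMs: now() - start };
    },
  };
}

// Usage: a stand-in "shorten" flow that does not call any model.
const shorten = traceFlow('shorten', async (input: { text: string }) =>
  input.text.slice(0, 10),
);
```

Because the wrapper owns the timing and the ID, the wrapped function stays free of tracing code, which is the main benefit the notes above describe.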

Prompt Templates

Purpose: Show how to author reusable role-based prompt templates with role markers and variable interpolation.

Who it's for: Beginner → Intermediate — useful for consistent system/user messaging patterns.

const greeting = ai.definePrompt(
  { name: 'greeting' },
  `{{role "system"}}
You are a friendly assistant who speaks {{language}}.

{{role "user"}}
Hello, my name is {{name}}. Tell me about {{topic}}.`,
);

// Render to messages (no API call)
const messages = greeting.render({ language: 'English', name: 'Ada', topic: 'AI' });

// Generate a response (calls guard)
const response = await greeting.generate({ language: 'English', name: 'Ada', topic: 'AI' });

Notes & tips:

Background: Prompt templates help maintain consistent system instructions (tone, persona) and reduce prompt duplication across services.
Best practice: Put safety-critical or high-level context in the system role. Keep user prompts concise and data-driven.
Advanced: Use conditional blocks and loops to render lists or optional fields.
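
For readers curious how role markers can turn one template string into a list of messages, the following is a rough sketch. It is an assumption-laden illustration, not the library's template engine: `renderRoleTemplate` and `Message` are hypothetical names, and conditionals and loops are deliberately left out.

```typescript
// Hypothetical sketch of role-marker parsing; the real template engine in
// copilot-genkit may behave differently (conditionals and loops are omitted).
interface Message {
  role: string;
  content: string;
}

function renderRoleTemplate(
  template: string,
  vars: Record<string, string>,
): Message[] {
  // Substitute {{name}} placeholders; unknown variables become empty strings.
  const interpolate = (text: string): string =>
    text.replace(/\{\{\s*(\w+)\s*\}\}/g, (_m: string, key: string) => vars[key] ?? '');

  // Splitting on a regex with one capture group yields
  // [textBeforeFirstMarker, role1, body1, role2, body2, ...].
  const parts = template.split(/\{\{role "(\w+)"\}\}/);
  const messages: Message[] = [];
  for (let i = 1; i < parts.length; i += 2) {
    const role = parts[i] ?? '';
    const body = parts[i + 1] ?? '';
    messages.push({ role, content: interpolate(body).trim() });
  }
  return messages;
}

const rendered = renderRoleTemplate(
  '{{role "system"}}\nYou speak {{language}}.\n\n{{role "user"}}\nHello, I am {{name}}.',
  { language: 'English', name: 'Ada' },
);
// rendered holds two messages: one with role "system", one with role "user".
```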

Structured Output

Purpose: Show how to request schema-enforced JSON output and let copilot-genkit validate and auto-repair model output when possible.

Who it's for: Intermediate → Advanced — useful when your application requires reliable, machine-readable responses.

const result = await ai.generate({
  prompt: 'List 3 REST API endpoints for a task app.',
  output: {
    schema: {
      isArray: true,
      fields: [
        { name: 'method', type: 'string' },
        { name: 'path', type: 'string' },
        { name: 'description', type: 'string' },
      ],
    },
    format: 'json',
  },
});

console.log(result.output); // Validated JSON array

Notes & tips:

Background: The schema system accepts a declarative description of expected fields. CopilotGenkit injects format instructions into the prompt and validates the returned JSON.
Failure handling: If parsing or validation fails, the library may make a single repair attempt by asking the model to reformat its output to match the schema. For production, always validate on your side too.
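
Validating on your side can be as small as the sketch below. It assumes the { isArray, fields } schema shape shown in the example above and models the repair step as an injected callback; in a real application that callback would be an asynchronous model call. `validateAgainst`, `parseWithRepair`, and `OutputSchema` are hypothetical names, not library exports.

```typescript
// Hypothetical sketch: a tiny validator for the { isArray, fields } schema
// shape used above, plus a single "repair" retry. Not the library's code.
type FieldType = 'string' | 'number' | 'boolean';

interface OutputSchema {
  isArray?: boolean;
  fields: { name: string; type: FieldType }[];
}

function validateAgainst(value: unknown, schema: OutputSchema): string[] {
  if (schema.isArray && !Array.isArray(value)) return ['expected an array'];
  const items: unknown[] = schema.isArray ? (value as unknown[]) : [value];
  const errors: string[] = [];
  items.forEach((item, index) => {
    if (typeof item !== 'object' || item === null) {
      errors.push(`item ${index}: expected an object`);
      return;
    }
    const record = item as Record<string, unknown>;
    for (const field of schema.fields) {
      if (typeof record[field.name] !== field.type) {
        errors.push(`item ${index}: "${field.name}" should be ${field.type}`);
      }
    }
  });
  return errors;
}

// Parse model text; on failure, ask once for a corrected version via `repair`.
function parseWithRepair(
  text: string,
  schema: OutputSchema,
  repair: (badText: string, errors: string[]) => string,
): unknown {
  for (let attempt = 0; attempt < 2; attempt++) {
    let errors = ['invalid JSON'];
    try {
      const parsed: unknown = JSON.parse(text);
      errors = validateAgainst(parsed, schema);
      if (errors.length === 0) return parsed;
    } catch {
      // JSON.parse threw: keep the default "invalid JSON" error.
    }
    if (attempt === 0) text = repair(text, errors);
  }
  throw new Error('output did not match schema after one repair attempt');
}

const endpointSchema: OutputSchema = {
  isArray: true,
  fields: [
    { name: 'method', type: 'string' },
    { name: 'path', type: 'string' },
  ],
};
```

Capping the loop at one repair keeps cost predictable: a second failure surfaces as an error instead of triggering further model calls.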

Tools / Function Calling

Purpose: Demonstrates registering a tool (function) and enabling the model to request a tool call; the host executes the tool and returns results to the application.

Who it's for: Advanced — building tool-enabled agents or safe function-calling flows.

const calculator = ai.defineTool(
  {
    name: 'calculator',
    description: 'Perform arithmetic',
    inputSchema: {
      fields: [
        { name: 'operation', type: 'string' },
        { name: 'a', type: 'number' },
        { name: 'b', type: 'number' },
      ],
    },
  },
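  (input) => {
    if (input.operation === 'add') return input.a + input.b;
    if (input.operation === 'multiply') return input.a * input.b;
    // Fail loudly instead of returning 0, which would look like a valid result.
    throw new Error(`Unsupported operation: ${input.operation}`);
  },
);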

// Use in generation
const result = await ai.generate({
  prompt: 'Calculate 15 * 23',
  tools: [calculator.definition],
});

if (result.finishReason === 'tool_call') {
  const { tool, input } = result.output;
  const toolResult = await ai.executeToolCall(tool, input);
  console.log('Result:', toolResult);
}

Notes & tips:

Background: Tools let you expose safe, deterministic functionality (calculators, DB lookups, internal APIs) to the model without giving it direct system access. The model can request a tool call by returning a structured tool-call object.
Safety: Validate tool inputs before executing, and enforce timeouts and resource limits on tool execution.
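
The safety advice above can be sketched with two small helpers: one that checks input types against a field list, and one that bounds execution time. Both `checkToolInput` and `withTimeout` are hypothetical helpers written for this illustration; they are not part of copilot-genkit.

```typescript
// Hypothetical helpers (not part of copilot-genkit): check tool input types
// and bound execution time before trusting a model-requested call.
type ToolField = { name: string; type: 'string' | 'number' };

function checkToolInput(
  input: Record<string, unknown>,
  fields: ToolField[],
): string[] {
  return fields
    .filter((f) => typeof input[f.name] !== f.type)
    .map((f) => `"${f.name}" must be a ${f.type}`);
}

function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`tool timed out after ${ms}ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

const calcFields: ToolField[] = [
  { name: 'operation', type: 'string' },
  { name: 'a', type: 'number' },
  { name: 'b', type: 'number' },
];

// The model sent "23" as a string, so one problem is reported.
const problems = checkToolInput({ operation: 'add', a: 15, b: '23' }, calcFields);
// problems: ['"b" must be a number']
```

Note that a timeout built on Promise.race stops waiting for the tool but does not cancel the underlying work; cancellation needs cooperation from the tool itself.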

Middleware

Purpose: Illustrate common middleware patterns (retry, caching, logging) and how to compose them for resilient LLM calls.

Who it's for: Intermediate — useful for production reliability and observability.

import { retry, cache, logging, createMiddleware } from '@stackforgeai/copilot-genkit';

const result = await ai.generate({
  prompt: 'Resilient call',
  use: [
    logging(),
    retry({ maxRetries: 3, initialDelayMs: 500 }),
    cache({ ttlMs: 60_000 }),
  ],
});

Notes & tips:

Background: Middleware composes cross-cutting behaviors. retry() helps with transient errors, cache() reduces cost for repeat calls, and logging() captures request/response for debugging.
Cost awareness: Use caching and deterministic prompts to minimize repeated premium model calls.
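
To illustrate what "composing" middleware means, here is a minimal sketch of an onion-style chain in which the first middleware in the list runs outermost. The `Middleware`, `Next`, and `compose` names, and the request/response shapes, are assumptions made for this example; they do not mirror the library's actual middleware signature.

```typescript
// Hypothetical sketch of an "onion" middleware chain; the types below are
// illustrative and do not mirror copilot-genkit's real middleware signature.
type Req = { prompt: string };
type Res = { text: string };
type Next = (req: Req) => Promise<Res>;
type Middleware = (req: Req, next: Next) => Promise<Res>;

// Wrap `handler` so that the first middleware in the list runs outermost.
function compose(middlewares: Middleware[], handler: Next): Next {
  return middlewares.reduceRight<Next>(
    (next, mw) => (req) => mw(req, next),
    handler,
  );
}

const events: string[] = [];

const logCalls: Middleware = async (req, next) => {
  events.push(`start: ${req.prompt}`);
  const res = await next(req);
  events.push(`done: ${res.text}`);
  return res;
};

// Retry up to `maxRetries` extra times (no delay, to keep the sketch short).
const retryTimes = (maxRetries: number): Middleware => async (req, next) => {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await next(req);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
};

// A fake model call that fails once, then succeeds.
let calls = 0;
const flaky: Next = async (req) => {
  calls++;
  if (calls === 1) throw new Error('transient');
  return { text: `echo ${req.prompt}` };
};

const guardedCall = compose([logCalls, retryTimes(2)], flaky);
```

With this ordering the logger sees one start and one finish even though the inner call ran twice, which shows why the position of a middleware in the list matters.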

Observability

Purpose: Show how to inspect traces and latency percentiles produced by CopilotGenkit instrumentation.

Who it's for: Intermediate → Advanced — monitoring and performance tuning.

// List recent traces
const traces = ai.listTraces(10);
for (const t of traces) {
  console.log(`${t.traceId} | ${t.status} | spans: ${t.spans.length}`);
}

// Get latency percentiles (the API reference lists this as getLatencyPercentiles(type, name))
const perf = ai.getLatencyPercentiles('generate');
console.log(`P50: ${perf.p50}ms, P95: ${perf.p95}ms, P99: ${perf.p99}ms`);

// Export all traces as JSON
const data = ai.exportTraces();

Notes & tips:

Background: Tracing provides correlation between flows, prompts, tool calls, and middleware. Use traceId to examine a single end-to-end run in detail.
Production: Export traces to your observability backend and correlate with request metadata (user id, tenant, environment).
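
If you export span durations and want to recompute percentiles yourself, the nearest-rank method is one simple option. The sketch below is an assumption about how such numbers can be derived; the library may use a different method (for example, interpolation), and `latencyPercentiles` here is a hypothetical standalone function.

```typescript
// Hypothetical sketch: nearest-rank percentiles over span durations.
// The library may use a different method (for example, interpolation).
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) return 0;
  // Nearest-rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p / 100) * sortedMs.length);
  return sortedMs[Math.max(rank, 1) - 1] ?? 0;
}

function latencyPercentiles(durationsMs: number[]) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}

const latencyStats = latencyPercentiles([120, 80, 300, 95, 110, 2000, 105, 90, 100, 130]);
// latencyStats: { p50: 105, p95: 2000, p99: 2000 }
```

With only ten samples, one slow call dominates both P95 and P99, so tail percentiles are only meaningful once enough traces have been collected.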

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                     CopilotGenkit                           │
│                                                             │
│  ┌──────────┐   ┌──────────┐  ┌──────────┐  ┌────────────┐  │
│  │  Flows   │   │ Prompts  │  │  Tools   │  │  Registry  │  │
│  └────┬─────┘   └────┬─────┘  └────┬─────┘  └────────────┘  │
│       │              │             │                        │
│       └──────┬───────┘             │                        │
│              ▼                     │                        │
│  ┌──────────────────────┐          │                        │
│  │     generate()       │◄─────────┘                        │
│  │  + Schema Validator  │                                   │
│  │  + Middleware Chain  │                                   │
│  └──────────┬───────────┘                                   │
│             │                                               │
│  ┌──────────▼──────────┐     ┌───────────────────────────┐  │
│  │      Tracer         │     │       Middleware          │  │
│  │  (Spans, Traces,    │     │  retry / cache / logging  │  │
│  │   Percentiles)      │     │  + custom                 │  │
│  └─────────────────────┘     └───────────────────────────┘  │
│                                                             │
└──────────────────────┬──────────────────────────────────────┘
                       │
                       ▼
            ┌─────────────────────┐
            │   copilot-guard     │
            │  (IGuard interface) │
            │  Token budget       │
            │  Model validation   │
            └──────────┬──────────┘
                       │
                       ▼
            ┌─────────────────────┐
            │  @github/copilot-sdk│
            │  (peer dependency)  │
            └─────────────────────┘

Key design principles:

All LLM calls go through copilot-guard — no direct SDK access
Dependency injection via IGuard interface for testing
Composable middleware chain for cross-cutting concerns
Automatic span-based tracing for all operations
Lightweight schema validation without external dependencies (no Zod)
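
The dependency-injection principle can be sketched as follows. `GuardLike` is an invented stand-in with a single `send` method; the real IGuard interface is not documented here and may have different members. The point is only the pattern: application code depends on an interface, so a test can pass in a fake.

```typescript
// Hypothetical sketch of dependency injection for tests. `GuardLike` is an
// invented stand-in; the real IGuard interface may have different members.
interface GuardLike {
  send(prompt: string): Promise<string>;
}

class Summarizer {
  private readonly guard: GuardLike;

  constructor(guard: GuardLike) {
    this.guard = guard;
  }

  async summarize(text: string): Promise<string> {
    return this.guard.send(`Summarize: ${text}`);
  }
}

// A fake guard records prompts and returns canned text, so tests need
// neither network access nor token budget.
class FakeGuard implements GuardLike {
  readonly prompts: string[] = [];

  async send(prompt: string): Promise<string> {
    this.prompts.push(prompt);
    return 'canned summary';
  }
}

const fakeGuard = new FakeGuard();
const summarizerUnderTest = new Summarizer(fakeGuard);
```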

API Reference

CopilotGenkit

| Method | Description |
|---|---|
| generate(req) | Send a generation request through the guard with middleware/schema support |
| defineFlow(config, fn) | Define a typed, traced async flow |
| definePrompt(config, template) | Define a reusable prompt template |
| defineTool(options, handler) | Define a tool for function calling |
| executeToolCall(name, input) | Execute a registered tool by name |
| renderTemplate(template, vars) | Render a template string (no API call) |
| listActions(type?) | List registered actions (flows, prompts, tools) |
| hasAction(type, name) | Check if an action is registered |
| getTrace(traceId) | Get a trace record by ID |
| listTraces(limit?) | List recent traces |
| getLatencyPercentiles(type, name) | Get P50/P95/P99 latency metrics |
| exportTraces() | Export all traces as JSON |
| getUsage() | Get current guard token usage stats |

Middleware Functions

| Function | Description |
|---|---|
| retry(opts) | Exponential backoff retry with configurable max retries |
| cache(opts) | In-memory response cache with TTL and max entries |
| logging(opts) | Request/response logging with custom logger |
| createMiddleware(name, fn) | Create a custom named middleware |
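
The two ideas named in the table, exponential backoff and a TTL cache with a maximum number of entries, can be sketched in a few lines. This is an illustration under assumptions, not the internals of retry() or cache(): `backoffDelays` and `TtlCache` are hypothetical names, the delay is assumed to double on each retry, and eviction is assumed to drop the oldest entry.

```typescript
// Hypothetical sketch, not the library's retry()/cache() internals.

// Delay before each retry, doubling from the initial delay.
function backoffDelays(maxRetries: number, initialDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, i) => initialDelayMs * 2 ** i);
}

// In-memory cache with a time-to-live and a cap on the number of entries.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key); // re-inserting moves the key to the newest slot
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

const retrySchedule = backoffDelays(3, 500); // [500, 1000, 2000]
```

The injectable `now` parameter lets tests advance time deterministically instead of sleeping.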

Utility Exports

| Export | Description |
|---|---|
| renderTemplate(template, vars) | Standalone template rendering |
| parseRoleTemplate(template, vars) | Parse role-based templates to messages |
| createTool(options, handler) | Create a tool action |
| formatToolsForPrompt(tools) | Format tool definitions for prompt injection |
| parseToolCall(text) | Parse tool call JSON from model response |
| SchemaValidator | Standalone schema validation class |
| ActionRegistry | Standalone action registry |
| Tracer | Standalone tracer |
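
As a rough picture of what parsing a tool call out of model text involves, consider the sketch below. The { tool, input } shape follows the Tools example earlier in this README; everything else is an assumption. `extractToolCall` is a hypothetical function, and the real parseToolCall() may accept other formats or behave differently.

```typescript
// Hypothetical sketch of extracting a tool call from model text. The
// { tool, input } shape follows the example in this README; the real
// parseToolCall() may accept other formats.
interface ToolCall {
  tool: string;
  input: Record<string, unknown>;
}

function extractToolCall(text: string): ToolCall | null {
  // Take the outermost {...} span so surrounding prose is ignored.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    const parsed: unknown = JSON.parse(text.slice(start, end + 1));
    if (typeof parsed !== 'object' || parsed === null) return null;
    const { tool, input } = parsed as Record<string, unknown>;
    if (typeof tool !== 'string') return null;
    if (typeof input !== 'object' || input === null || Array.isArray(input)) {
      return null;
    }
    return { tool, input: input as Record<string, unknown> };
  } catch {
    return null; // not valid JSON
  }
}

const parsedCall = extractToolCall(
  'Sure, calling the tool: {"tool":"calculator","input":{"operation":"multiply","a":15,"b":23}}',
);
```

Returning null instead of throwing lets the caller treat "no tool call" as an ordinary text response.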

Troubleshooting

Build Errors

| Error | Solution |
|---|---|
| Cannot find module '@stackforgeai/copilot-guard' | Build copilot-guard first: cd ../copilot-guard && npm run build |
| Cannot find module 'node:crypto' | Upgrade to Node.js >=20 |
| ERR_UNKNOWN_FILE_EXTENSION | Install tsx: npm install --save-dev tsx |

Runtime Errors

| Error | Solution |
|---|---|
| Guard blocked or request failed | Increase premiumLimit or check token usage with getUsage() |
| Schema validation failed | Check that the model output matches your schema definition |
| Tool not found | Ensure the tool is registered with defineTool() before calling executeToolCall() |
| Action already registered | Each action name must be unique within its type |
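
The "Action already registered" rule is easier to reason about with a small model of a registry keyed by type and name. The class below is a hypothetical sketch written for this explanation; it is not the ActionRegistry exported by the package.

```typescript
// Hypothetical sketch of a registry keyed by action type and name; it shows
// why registering the same name twice within one type fails.
type ActionType = 'flow' | 'prompt' | 'tool';

class SimpleRegistry {
  private readonly actions = new Map<string, unknown>();

  register(type: ActionType, name: string, action: unknown): void {
    const key = `${type}/${name}`;
    if (this.actions.has(key)) {
      throw new Error(`Action already registered: ${key}`);
    }
    this.actions.set(key, action);
  }

  has(type: ActionType, name: string): boolean {
    return this.actions.has(`${type}/${name}`);
  }

  list(type?: ActionType): string[] {
    const keys = Array.from(this.actions.keys());
    return type ? keys.filter((k) => k.startsWith(`${type}/`)) : keys;
  }
}

const actionRegistry = new SimpleRegistry();
actionRegistry.register('flow', 'summarize', {});
actionRegistry.register('tool', 'summarize', {}); // same name, different type: allowed
```

In this model, a second register('flow', 'summarize', ...) call throws, which is a common symptom of defining the same flow twice, for example when a module is evaluated more than once.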

DISCLAIMER AND LIMITATION OF LIABILITY

THIS SOFTWARE IS PROVIDED STRICTLY ON AN "AS IS" AND "AS AVAILABLE" BASIS.

BY USING THIS SOFTWARE, YOU ACKNOWLEDGE AND AGREE THAT:

THE SOFTWARE MAY CONTAIN BUGS, DEFECTS, DESIGN FLAWS, LOGIC ERRORS, SECURITY ISSUES, OR INCOMPLETE FEATURES.
THE SOFTWARE MAY FAIL TO LIMIT OR PREVENT TOKEN USAGE, API REQUESTS, COST OVERRUNS, OR BILLING EVENTS.
TOKEN ESTIMATION, RATE LIMITING, LOOP DETECTION, THROTTLING, AND SAFETY FEATURES MAY BE INACCURATE, INCOMPLETE, OR NON-FUNCTIONAL.
SCHEMA VALIDATION, STRUCTURED OUTPUT ENFORCEMENT, AND TOOL CALL PARSING MAY PRODUCE INCORRECT OR INCOMPLETE RESULTS.
FLOW EXECUTION, MIDDLEWARE CHAINS, AND OBSERVABILITY TRACING MAY NOT FUNCTION AS EXPECTED UNDER ALL CONDITIONS.

THE AUTHORS, CONTRIBUTORS, MAINTAINERS, COPYRIGHT HOLDERS, AFFILIATES, AND DISTRIBUTORS SHALL NOT BE LIABLE FOR ANY CLAIMS, DAMAGES, LOSSES, LIABILITIES, OR EXPENSES OF ANY KIND, INCLUDING BUT NOT LIMITED TO:

API FEES, TOKEN CHARGES, CLOUD COMPUTE COSTS, OR OTHER FINANCIAL LOSSES
DATA LOSS, CORRUPTION, OR SECURITY INCIDENTS
INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, OR PUNITIVE DAMAGES
LOSS OF PROFITS, REVENUE, GOODWILL, OR BUSINESS OPPORTUNITIES

USE OF THIS SOFTWARE IS ENTIRELY AT YOUR OWN RISK.

YOU ARE SOLELY RESPONSIBLE FOR:

VERIFYING ALL OUTPUTS AND GENERATED CONTENT
MONITORING API USAGE, TOKEN CONSUMPTION, AND BILLING
IMPLEMENTING ADDITIONAL SAFEGUARDS AND VALIDATION
TESTING MIDDLEWARE, FLOWS, AND TOOL INTEGRATIONS THOROUGHLY

THIS PROJECT SHOULD NOT BE USED AS THE SOLE OR PRIMARY MECHANISM FOR COST CONTROL, BILLING GOVERNANCE, SECURITY, OR PRODUCTION SAFETY.

ALWAYS IMPLEMENT INDEPENDENT PROVIDER-SIDE BILLING ALERTS, RATE LIMITS, BUDGET CONTROLS, AND MONITORING SYSTEMS.

License

MIT License. See package.json for details.
