@x12i/ai-gateway

Unified gateway for LLM provider routing and management with production-ready features: context propagation, usage tier tracking, activity tracking, and comprehensive metadata. Built on top of @x12i/ai-providers-router with integrations for @x12i/x-models, @x12i/activix (see package.json for pinned versions), and @x12i/logxer for structured logging.

Mandatory runtime identity (v9+)

Every invoke / invokeChat request must include identity: the full runtime envelope from the upstream client (not invented inside the gateway).

  • identity.jobId and identity.taskId are only taken from that upstream object. The gateway never generates, rewrites, or back-fills them from deprecated top-level jobId / taskId fields.
  • If identity is missing or jobId / taskId are empty, the gateway logs a warning via Logxer (missingRuntimeIdentityObject / missingUpstreamIdentityFields) when a logger is configured, and still attaches the merged envelope so the rest of the pipeline can proceed.
  • The same merged object is request.identity, forwarded to the router, returned as response.metadata.identity, and persisted on Activix as runContext (same reference as request.identity).

See Identity contract and Logger initialization.
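
For reference, a minimal sketch of the identity envelope shape implied by this contract (field names are taken from the examples in this README; the authoritative definition is docs/IDENTITY_OBJECT_CONTRACT.md):

// Assumed shape of the upstream runtime identity envelope.
interface RuntimeIdentity {
  sessionId: string;                               // Activix session correlation
  instance: { instanceId: string; type: string };  // executing instance descriptor
  aiRequestId: string;                             // per-request correlation id
  jobId: string;                                   // upstream job id (never generated by the gateway)
  taskId: string;                                  // upstream task id (never generated by the gateway)
  agentId?: string;                                // optional agent correlation
}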

Features

  • 🔀 Provider Routing: Dynamic provider registration and automatic routing with fallback support
  • 📊 Context Propagation: aiRequestId and identity propagation for distributed tracing (see Identity contract)
  • ⚡ Usage Tier Tracking: RPM/TPM limit enforcement via @x12i/x-models
  • 📈 Activity Tracking: Comprehensive activity logging via @x12i/activix v6 (xronox-activitix), fixed Mongo collections ai-activities / bad-requests, validated root-level outer / inner I/O plus runContext for Activix 6
  • 📝 Structured Logging: Production-ready logging via @x12i/logxer (LogMeta jobId / sessionId / correlationId, optional debugKind, default runtimeIdentity from env) with diagnostic tracing for instruction resolution and propagation debugging
  • 📋 Rich Metadata: Detailed execution metadata (latency, tokens, model, cost, aiRequestId, identity)
  • 🏥 Health Checks: Monitor provider health and availability
  • 🔄 Request/Response Interceptors: Modify requests and responses
  • 🔍 Auto-Discovery: Automatically discover and register installed provider packages
  • 🎯 Object Type Output Support: Parse responses into typed inference outputs (classification, extraction, Q&A, etc.) via @x12i/outputs-library
  • ✅ Enhanced Schema Validation: Strict/non-strict validation modes, automatic schema resolution from instruction metadata, and graceful outputs library fallback (v1.7.0+)
  • ✅ Graceful Outputs Library Error Handling: Automatic fallback parsing, clear error detection, and parsing method metadata (v1.7.1+)
  • ✅ Guaranteed Consistent Structure: Always returns consistent structure at all levels (content, parsedContent, parsedOutput) - JSON is always JSON, text is always text, structures are forced when needed (v1.7.4+)
  • 📋 Automatic Output Schema Guidance: Automatically extends instructions with JSON schema expectations when outputType/schema is available (v2.1.1+)
  • 🔍 Output Structure Audit: Automatically audits response structure against schema - identifies missing/extra fields, always available when schema exists (v2.1.1+)
  • 🔄 Automatic Retry: Intelligent retry logic for network errors, server errors (5xx), and throttling (429) with exponential backoff
  • 📚 Content Resolver (nx-content): Resolve instruction keys, prompt keys, and instructions blocks from local folder or git repo via nx-content. See Content Resolver — Upstream Guide.
  • 📋 Instruction Metadata API: Fetch structured metadata (outputType, schema, validation rules) for metadata-driven inference systems (v1.6.9+)
  • 🔧 Response Transformation Hooks: Transform responses at different stages (preParse, postParse, preValidate, postValidate) for output mapping and data normalization (v1.6.9+)
  • 🔧 Custom/Dynamic Instructions Mode: Use instructions that already contain full JSON schema - no schema formatting added, instructions used exactly as provided (v3.0.4+)
  • 🤖 Response Repair Fallback: In mode=prod, performs a minimal in-gateway repair attempt for malformed JSON/Markdown responses (logs a warning when used). In mode=debug, parsing failures hard-fail for maximum visibility.
  • 📊 Response Fix Metadata: Track when and how responses were fixed, including fix strategy, confidence, and warnings (v3.0.4+)
  • 🔍 Instruction Optimizer: Use AI to analyze and fix poorly-written instructions - meta-feature that improves instruction quality (v3.0.4+)
  • 🧪 Instruction Testing: Test instructions by running them and analyzing if responses match expected format (v3.0.4+)
  • 📝 Multiple Output Modes: Support for JSON output, structured text output, and two-step conversion (v3.0.5+)
  • 📋 Dual Instruction Formats: Support for JSON schema instructions and structured text format specifications (v3.0.5+)
  • 📚 Standard Object Types: Reference standard object types by name (e.g., 'sentiment-analysis') instead of defining schemas manually - includes examples, validation, and structured text instructions (v3.0.6+)
  • 🔍 Auto-Extraction of Output Formats: Automatically extract output format specifications from instruction templates when using structured-text mode - no need to manually specify flexMdFormat or primaryObjectType (v3.3.3+)
  • ✅ Output Format Validation: Validates output format specifications using flex-md SDK before sending to LLM, with configurable minimum compliance level (L0-L3)
  • 📋 Contract Output Parsing: Parse AI responses against expected schemas and store results in activity records for compliance monitoring (v6.3.1+)

Installation

npm install @x12i/ai-gateway

📚 Documentation: After installation, documentation is available in:

  • node_modules/@x12i/ai-gateway/CONTENT_RESOLVER_UPSTREAM_GUIDE.md - Content resolver (nx-content): config, keys, local/git, upstream checklist
  • node_modules/@x12i/ai-gateway/docs/IDENTITY_OBJECT_CONTRACT.md - Identity contract for Activix (sessionId + instance)
  • node_modules/@x12i/ai-gateway/docs/LOGGER_INITIALIZATION.md - Required reading: How to properly initialize logger
  • node_modules/@x12i/ai-gateway/TROUBLESHOOTING.md - Troubleshooting guide
  • node_modules/@x12i/ai-gateway/TROUBLESHOOTING_TOOLBOX.md - Diagnostic tools
  • node_modules/@x12i/ai-gateway/INTEGRATION_GUIDANCE.md - Integration guidance

🔧 Troubleshooting Helpers: Import diagnostic functions directly:

import { validateAIRequest, diagnoseRequest, formatDiagnostic } from '@x12i/ai-gateway';

🔍 Debugging: Enable detailed request logging and comprehensive diagnostic tracing:

export AI_GATEWAY_DEBUG=true           # Basic request logging
export AI_GATEWAY_DEBUG_REQUEST=true   # Detailed request structure
export FLEX_MD_MIN_COMPLIANCE_LEVEL=L0 # Output format validation level (L0/L1/L2/L3, default: L0)

This logs the exact request structure received by invoke(), including property descriptors, which is critical for debugging validation errors like "objectTypes is required".
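
A hedged usage sketch of those helpers (the exports are real, as shown in the import above, but the exact signatures and return shapes here are assumptions):

// Assumed usage: validate a request before invoking the gateway,
// and render a readable diagnostic when validation fails.
import { validateAIRequest, diagnoseRequest, formatDiagnostic } from '@x12i/ai-gateway';

const request = { jobId: 'job-123', agentId: 'agent-456', instructions: 'Reply briefly.' };

const validation = validateAIRequest(request);  // assumed: returns a validation result
if (!validation?.valid) {
  const diagnostic = diagnoseRequest(request);  // assumed: returns a structured diagnostic
  console.error(formatDiagnostic(diagnostic));  // assumed: formats it for humans
}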

🔍 Advanced Diagnostic Logging

The gateway includes comprehensive diagnostic logging for instruction resolution and propagation debugging. When debug logging is enabled, the following diagnostic events are logged:

Phase 2 Instruction Resolution:

  • instructions.phase2.validation_inputs - Logs comparison inputs for key echo validation
  • instructions.phase2.resolution_result - Logs resolution status, source, and attempts

Instruction Propagation Chain:

  • instructions.propagation.autoExtract.entry - Instruction hash at auto-extraction
  • instructions.propagation.constructMessages.entry - Instruction hash at message construction
  • instructions.propagation.providerInvoke.entry - System prompt hash at LLM invocation

Resolution Detection:

  • instructions.constructMessages.entry - Detects if constructMessages receives resolved instructions

Gate Checks:

  • gate.activityStart.precheck - Validates instructions before activity tracking
  • gate.llmInvoke.precheck - Validates instructions before LLM calls

Error Handling:

  • badRequest.written - Confirms bad request path execution

Benefits:

  • Trace ID: Each request gets a stable trace ID for correlation across all logs
  • Hash Chain: Track instruction content changes through the entire pipeline
  • Fail-Safe Gating: Verify that invalid instructions are properly rejected
  • Resolution Audit: Detect double-resolution or propagation failures

All diagnostic logs include traceId, jobId, and agentId for correlation. Content is safely redacted (first 80 chars only) with hashes for comparison.

Quick Start

Basic Usage with Enhanced Gateway

import { AIGateway } from '@x12i/ai-gateway';
import { OpenAIProvider } from '@x12i/ai-provider-openai';
import { GrokProvider } from '@x12i/ai-provider-grok';

// Create enhanced gateway
const gateway = new AIGateway({
  defaultProvider: 'openai',
  fallbackChain: ['grok'],
  usageTier: 'tier-3',  // RPM/TPM limits
  enableActivityTracking: true,
  enableUsageTracking: true,
  enableLogging: true
});

// Register providers
gateway.register(new OpenAIProvider({ 
  apiKey: process.env.OPENAI_API_KEY 
}));
gateway.register(new GrokProvider({
  apiKey: process.env.GROK_API_KEY
}));

// Invoke with mandatory runtime identity (upstream job/task correlation)
const response = await gateway.invoke({
  aiRequestId: 'call-001',
  agentId: 'agent-456',
  instructions: 'Reply briefly.',
  identity: {
    sessionId: 'run-1',
    instance: { instanceId: 'agent-456', type: 'ai-reasoner' },
    aiRequestId: 'call-001',
    jobId: 'job-123',
    taskId: 'task-789',
    agentId: 'agent-456'
  },
  workingMemory: { input: 'Hello!' }
  // … primaryObjectType / flexMdFormat / messages as required by your request type
});

// Response includes comprehensive metadata (including `identity`)
console.log(response.metadata);

Using Base Router (Direct Access)

import { LLMProviderRouter } from '@x12i/ai-gateway';

// Use base router if you don't need enhanced features
const router = new LLMProviderRouter({
  defaultProvider: 'openai',
  fallbackChain: ['grok']
});

router.register(new OpenAIProvider({ 
  apiKey: process.env.OPENAI_API_KEY 
}));

const response = await router.invoke({
  messages: [{ role: 'user', content: 'Hello!' }]
});

Provider registration and OpenRouter (no manual register required)

If you only use the gateway (e.g. via @woroces/ai-tasks) and do not call gateway.register() or configure the router yourself:

  • OpenRouter: Set OPEN_ROUTER_KEY or OPENROUTER_API_KEY in the environment and do not set USE_OPENROUTER=false. The gateway then enables OpenRouter mode so the router can route without any registered provider (requires router support). Load .env before any code that creates the gateway; if the gateway is created by another package (e.g. ai-skills) before env is loaded, pass the key explicitly in the gateway config: openrouter: { apiKey: process.env.OPEN_ROUTER_KEY ?? process.env.OPENROUTER_API_KEY }.
  • Direct providers: Set the relevant API key (e.g. OPENAI_API_KEY, GROK_API_KEY). The gateway lazy-auto-registers these on first invoke()/invokeChat(), so you do not need to call autoRegisterProviders or register().

If you see "No provider specified and no providers registered" or "Provider not registered: openrouter", set OPEN_ROUTER_KEY (or another provider’s API key), ensure .env is loaded before the process that creates the gateway, or pass openrouter: { apiKey } in the gateway config. See TROUBLESHOOTING.md for details.

Setup Guide

This guide shows where and how to configure each functionality of the AI Gateway.

Configuration Overview

The AI Gateway can be configured in multiple ways:

  1. Gateway Constructor - Main configuration when creating the gateway
  2. JSON Default Files - Default configurations loaded from src/defaults/ (model-config.json, instructions-blocks.json)
  3. Environment Variables - Via nx-config2 (for logging and other settings)
    • FLEX_MD_MIN_COMPLIANCE_LEVEL - Minimum flex-md compliance level for output format validation (default: L0). Valid values: L0, L1, L2, L3. See Output Format Validation section for details. When set to L0 (default), no format validation is required. When set to L1 or higher, format specifications are required in instructions and validation errors will reject requests.
  4. Request-Level - Override gateway defaults per request

1. Logging Configuration (@x12i/logxer)

Logger initialization: The gateway uses @x12i/logxer. Pass a Logxer from createLogxer, or omit logger and let the gateway build a default (still Logxer-based). See Logger Initialization Guide for mandatory identity on requests and LogMeta / debugKind usage.

Where to configure:

  • Gateway constructor: enableLogging, packageName, logger
  • Environment variables: {PREFIX}_LOGS_LEVEL (canonical per-package level), legacy {PREFIX}_LOG_LEVEL, plus {PREFIX}_LOG_FORMAT, file sinks, unified logger, etc., as documented for @x12i/logxer.

How to configure:

import { AIGateway } from '@x12i/ai-gateway';
import { createLogxer } from '@x12i/logxer';

// Create logger once at application startup
const logger = createLogxer(
  { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
  {
    logLevel: 'info',           // verbose|debug|info|warn|error
    logFormat: 'json',          // text|json|yaml|table
    logToFile: true,
    logFilePath: '/var/log/app.log',
    enableUnifiedLogger: true,
    unifiedLogger: {
      transports: { papertrail: true },
      service: 'my-app',
      env: 'production'
    },
    runtimeIdentity: {
      service: 'my-app',
      env: process.env.NODE_ENV,
      version: process.env.npm_package_version
    }
  }
);

const gateway = new AIGateway({
  enableLogging: true,
  logger,
  packageName: 'MY_APP'
});

Why use the same Logxer?

  • Consistent log format across your application
  • Unified logging destination
  • LogMeta correlation (jobId, sessionId, correlationId, debugKind, runtimeIdentity) matches gateway and Activix fields
  • Single point of control for log levels

Per-package log level (@x12i/logxer):

  • Canonical: {PREFIX}_LOGS_LEVEL — same envPrefix as createLogxer / your packageName (e.g. MY_APP → MY_APP_LOGS_LEVEL).
  • Legacy: {PREFIX}_LOG_LEVEL is used only if {PREFIX}_LOGS_LEVEL is not set.
  • Default when both are unset: warn (not info, not silent).
  • Silence this package’s diagnostics: off, none, or silent (case-insensitive).
  • Values: off | none | silent | error | warn | info | debug | verbose.

If the gateway builds the default logger and you omit packageName, the prefix is AI_GATEWAY → e.g. AI_GATEWAY_LOGS_LEVEL.

Environment variables (examples):

MY_APP_LOGS_LEVEL=info             # raise verbosity (or use debug / verbose)
MY_APP_LOGS_LEVEL=off              # silence this package’s logs
# MY_APP_LOG_LEVEL=info            # legacy; ignored if MY_APP_LOGS_LEVEL is set

MY_APP_LOG_FORMAT=json             # text|json|yaml|table
MY_APP_LOG_TO_FILE=true
MY_APP_LOG_FILE=/var/log/app.log
MY_APP_LOG_TO_UNIFIED=true
DEBUG=my-app                       # elevates verbose/debug when package is not fully silent

Diagnostic logging: When debug level is enabled, the gateway emits diagnostic logs for instruction resolution and propagation through the same Logxer pipeline.

What happens if logger is not provided: The gateway creates a default Logxer via createLogxer. For production, prefer passing your app’s logger so levels, transports, and runtimeIdentity stay aligned with the rest of your stack.

2. Activity Tracking Configuration (xronox-activitix via @x12i/activix v6)

Activix version: This gateway targets @x12i/activix v6.x (built on @xronoces/xronox-store). The dependency range is declared in package.json (currently ^6.5.1). Activity I/O is stored at the document root as outer (and optional inner); the deprecated nested structure wrapper is not used.

Where to configure:

  • Gateway constructor: enableActivityTracking, activityTracker
  • Environment variables (auto-configured when no custom tracker provided):
    • MONGO_URI (required) - MongoDB connection string
    • MONGO_LOGS_DB or MONGO_DB (required) - Database name

✅ Centralized configuration

The gateway resolves activity tracking from environment variables via src/config/activity-tracking-config.ts. Main and bad-request collection names are fixed at the package level (not overridden by env): ai-activities for normal runs and bad-requests for failed/invalid requests. skill-executions is used for skill-related flows when applicable.

Environment variable priority (connection only):

  • Database: MONGO_LOGS_DB → MONGO_DB (no default, must be provided)
  • URI: MONGO_URI (required)

⚠️ CRITICAL: correlation and identity

  • aiRequestId (required on each gateway request): Primary correlation id for this LLM call; the gateway does not invent a jobId for you.
  • Run context (Activix BSON field runContext): Same object as request.identity (including required upstream jobId and taskId), plus sessionId and nested instance: { instanceId, type } when present; see Identity contract.
  • jobTypeId, taskTypeId: Optional aggregation / grouping fields (unchanged semantics).
  • Each activity: Gets its own unique database record with unique _id (MongoDB ObjectId).
  • Two-phase tracking: startActivity() creates a new record; logSuccess() / logFailure() update the same record by that record’s id.

Runtime objects observability (debug only):

@x12i/ai-gateway exports runtimeObjects for runtime diagnostics. This package is a leaf runtime package, so runtimeObjects?.packagesRuntimeObjects is always [].

Runtime objects are available only in debug mode:

mode=debug

debug is the default when mode is omitted. In production, use:

mode=prod

When mode=prod, runtimeObjects is undefined.

import { runtimeObjects } from '@x12i/ai-gateway';

const activities = await runtimeObjects?.activixClient?.getJobActivities({ jobId });
const logs = await runtimeObjects?.logxerClient?.getJobLogs({ jobId });

The gateway only exposes official queryable clients. It exposes activixClient only when the effective Activix client already implements getJobActivities(), and logxerClient only when the effective Logxer client already implements getJobLogs(). The gateway does not query Mongo, Logxer storage, or private package internals to emulate missing query APIs.

See Runtime Objects Observability Methodology for the reusable package-level contract.

Recommended (auto-configured from environment variables):

import { AIGateway } from '@x12i/ai-gateway';
import { OpenAIProvider } from '@x12i/ai-provider-openai';

// Set environment variables:
// MONGO_URI=mongodb://localhost:27017
// MONGO_LOGS_DB=logs-db

const gateway = new AIGateway({
  enableActivityTracking: true,  // default: true
  // Activix is auto-configured; writes use collections ai-activities / bad-requests / skill-executions
});

gateway.register(new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY
}));

Advanced (custom Activix v6 instance):

If you pass your own Activix, configure the same collection names the gateway expects so routing matches persistence:

import { AIGateway } from '@x12i/ai-gateway';
import { Activix } from '@x12i/activix';

const statusValues = {
  started: 'started',
  inProgress: 'in_progress',
  completed: 'success',
  failed: 'failed',
  timeout: 'timeout'
};

const activityTracker = new Activix({
  collections: [
    { name: 'ai-activities', statusValues },
    { name: 'skill-executions', statusValues },
    { name: 'bad-requests', statusValues }
  ]
});

const gateway = new AIGateway({
  enableActivityTracking: true,          // default: true
  activityTracker,                       // plug in custom tracker
});

What gets tracked (persisted when DB is configured):

  • Identity: Fields aligned with request.identity / Activix runContext: aiRequestId, upstream jobId and taskId, sessionId, instance, plus optional jobTypeId, agentId, taskTypeId, etc., as provided
  • Timing: startTime, endTime, duration, status (started|success|failed)
  • Request data: Stored in request object (instructions, prompt, input, messages, workingMemory)
  • Config data: Stored in config object (model, provider, temperature, maxTokens)
  • Response data: Stored in response object (content, metadata)
  • Cost: Calculated and stored per activity

Best Practices for Type IDs:

  • jobTypeId: Use MD5 hash of your job type string (e.g., MD5('data-processing-job')) for consistent job-level aggregation
  • taskTypeId: Use MD5 hash of your task/instruction text (e.g., MD5('What is the capital of France?')) for consistent task-level aggregation
  • If taskTypeId is not provided, it's auto-generated from the pre-parsed instructions MD5 hash
  • Same type = same hash = easy aggregation and tracking across multiple jobs/tasks
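
A minimal sketch of generating these hashes with Node's built-in crypto module (the same helper appears in the activity-tracking example later in this README):

import * as crypto from 'crypto';

// MD5 of a stable string gives a consistent aggregation id.
function md5(text: string): string {
  return crypto.createHash('md5').update(text).digest('hex');
}

const jobTypeId = md5('data-processing-job');              // job-level aggregation
const taskTypeId = md5('What is the capital of France?');  // task-level aggregation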

Key design points:

  • ✅ Each activity = separate database record with unique _id
  • aiRequestId = per-request correlation (required); jobId / taskId come from upstream identity (required on each request; see v9+ contract above)
  • ✅ Request data sent once in startActivity() (creates new record)
  • ✅ Response data sent once in logSuccess() (updates same record by _id)

Default: Activity tracking is enabled by default; without DB config it will log but not persist.

✅ Activix v6 integration

  1. Configuration (activity-tracking-config.ts):

    • Mongo connection from env; collection names ai-activities and bad-requests are fixed for consistency across deployments.
  2. Lifecycle (@x12i/activix v6):

    • startRecord / completeRecord / failRecord (two-phase lifecycle)
    • ✅ Status transitions: started → success or failed (per your statusValues mapping)
    • ✅ Persistence via xronox-store queue semantics (see Activix package docs)
  3. Testing (ai-gateway):

    • Standalone test available (npm run test:activities:standalone) that bypasses config parsing issues
    • Tests activity lifecycle end-to-end: creation → completion → database persistence
    • See .tests/TESTING_GUIDE.md for complete testing documentation

See: .reports/new/ACTIVITY_LIFECYCLE_IMPROVEMENTS_VERIFICATION_REPORT.md for complete verification details.

Skill Execution Tracking

When executing skills (instruction keys starting with skills/), the gateway automatically tracks skill executions separately from gateway invocations. Skill executions are stored in the skill-executions collection and support parent-child relationships.

Required Fields:

  • instructions: Skill instruction key (e.g., skills/professional-answer)
  • inferenceType: Recommended - type of inference (e.g., question-answer, classification)

Optional Fields:

  • masterSkillActivityId: Parent skill activity ID (when a skill calls another skill)
  • skillId: Skill identifier (auto-detected from instruction key, can be overridden)
  • masterSkillId: Parent skill identifier (when a skill calls another skill)

Example: Basic Skill Execution

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'my-agent',
  instructions: 'skills/professional-answer',
  inferenceType: 'question-answer', // Recommended
  skillId: 'skills/professional-answer', // Optional, auto-detected from instructions
  input: { question: 'What is...' },
  primaryObjectType: 'professional-answer'
});

// Get the activity ID for linking child skills
const activityId = response.metadata?.activityId;

Example: Nested Skill Execution (Skill Calling Another Skill)

// Parent skill execution
const parentResponse = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'my-agent',
  instructions: 'skills/parent-skill',
  inferenceType: 'analysis',
  skillId: 'skills/parent-skill'
});

// When parent skill calls child skill, pass parent's activityId and skillId
const childResponse = await gateway.invoke({
  jobId: 'job-123', // ✅ Same jobId (links activities)
  agentId: 'my-agent',
  instructions: 'skills/child-skill',
  inferenceType: 'question-answer',
  skillId: 'skills/child-skill',
  masterSkillActivityId: parentResponse.metadata?.activityId, // ✅ Parent's activity ID
  masterSkillId: 'skills/parent-skill' // ✅ Parent's skill ID
});

Automatic Tracking:

The gateway automatically:

  • Detects skill executions from instruction keys starting with skills/
  • Connects to instruction metadata (key, version) from content resolution
  • Routes to skill-executions collection (ActivityManager + Activix handle routing)
  • Returns activityId in response metadata for linking child skills
  • Supports parent-child skill relationships via masterSkillActivityId and masterSkillId

For detailed integration guides, see the documentation files listed under Installation (for example, INTEGRATION_GUIDANCE.md).

3. Usage Tracking Configuration (x-models)

Where to configure:

  • Gateway constructor: enableUsageTracking, usageTier
  • JSON defaults: src/defaults/model-config.json (not used for usage tracking)

How to configure:

const gateway = new AIGateway({
  enableUsageTracking: true,  // Default: true
  usageTier: 'tier-3'         // RPM/TPM limits: 'tier-1' | 'tier-2' | 'tier-3'
});

Note: If @x12i/x-models is not available or has export issues, usage tracking will gracefully degrade with warnings logged.

Default: Usage tracking is enabled by default with usageTier: 'tier-3'

4. Default Model and Engine Configuration

Where to configure:

  1. JSON Defaults (lowest priority): src/defaults/model-config.json
  2. Gateway Constructor (medium priority): defaultModel, defaultEngine
  3. Request Config (highest priority): request.config.model, request.config.provider

How to configure:

Step 1: Create/Edit JSON defaults (src/defaults/model-config.json):

{
  "defaultModel": "gpt-4o",
  "defaultEngine": "openai",
  "temperature": 0.7,
  "maxTokens": 2000,
  "topP": 1.0,
  "frequencyPenalty": 0.0,
  "presencePenalty": 0.0
}

Step 2: Gateway constructor:

const gateway = new AIGateway({
  defaultModel: 'gpt-5-nano',      // Overrides JSON default
  defaultEngine: 'openai',            // Overrides JSON default
  temperature: 0.9,                  // Overrides JSON default
  maxTokens: 4000                    // Overrides JSON default
});

Step 3: Request-level override:

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'You are helpful',
  input: 'Hello',
  config: {
    model: 'gpt-4o',        // Overrides gateway default
    provider: 'openai',     // Overrides gateway default
    temperature: 0.5        // Overrides gateway default
  }
});

Priority Order:

  1. Request config (highest)
  2. Gateway constructor config
  3. JSON defaults (lowest)

5. InstructionsBlocks Configuration

Where to configure:

  1. JSON Defaults (lowest priority): src/defaults/instructions-blocks.json
  2. Content resolver (nx-content) (medium priority): blocks under e.g. blocks/{blockName}/{agentId} (see Content Resolver — Upstream Guide)
  3. Gateway Constructor (highest priority): instructionsBlocks object

How to configure:

Step 1: Create/Edit JSON defaults (src/defaults/instructions-blocks.json):

{
  "input-prefix": "Please process the following input:",
  "default-prompt": "You are a helpful assistant."
}

Step 2: Gateway constructor:

const gateway = new AIGateway({
  instructionsBlocks: {
    'input-prefix': 'Custom prefix from config:',  // Overrides JSON default
    'custom-block': 'Custom block content'
  }
});

Step 3: Content Registry (if available):

  • Store blocks at paths supported by nx-content (e.g. blocks/{blockName}/{agentId} or with taskTypeId). When content resolver is configured (via contentRegistryConfig or env vars), instructions blocks are resolved from local or git.

Priority Order (highest to lowest):

  1. Gateway constructor instructionsBlocks (highest priority)
  2. Content registry with taskTypeId (if taskTypeId provided)
  3. Content registry without taskTypeId
  4. JSON defaults (lowest priority)

See: Content Resolver — Upstream Guide for configuration, env vars, key vs text rule, and checklist.

6. Instruction Resolution (Content Resolver / nx-content)

How it works: The gateway uses nx-content to resolve content. Instruction type is determined by whitespace:

  • No spaces → Key (resolved from local folder or git)
  • Has spaces → Literal text (used as-is)

There is no option to override this; the spaces rule is the only decision point.
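
For example, the whitespace rule is the only thing that distinguishes these two requests:

// Key: no spaces, so it is resolved via nx-content (local folder or git).
await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'skills/professional-answer',
  input: 'What is AI?'
});

// Literal: contains spaces, so it is used as-is.
await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'You are a helpful assistant.',
  input: 'What is AI?'
});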

Configuration: Pass contentRegistryConfig (or legacy contentRegistry) when creating the gateway:

// Local content only
const gateway = new AIGateway({
  contentRegistryConfig: {
    localPath: '.metadata'   // or absolute path
  }
});

// Local + Git (mode: 'dev' = local wins, 'prod' = git wins)
const gateway = new AIGateway({
  contentRegistryConfig: {
    localPath: '.metadata',
    mode: 'dev',
    github: {
      repo: process.env.GITHUB_REPO_URL,
      token: process.env.GITHUB_TOKEN,
      branch: 'main'
    }
  }
});

Environment variables (used when no explicit config): CONTENT_REGISTRY_LOCAL_ROOT, CONTENT_REGISTRY_MODE, GITHUB_REPO_URL, GITHUB_TOKEN, CONTENT_REGISTRY_GIT_BRANCH.
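
A sketch of the corresponding environment configuration (values below are placeholders):

CONTENT_REGISTRY_LOCAL_ROOT=.metadata
CONTENT_REGISTRY_MODE=dev
GITHUB_REPO_URL=https://github.com/your-org/your-content-repo
GITHUB_TOKEN=your-token
CONTENT_REGISTRY_GIT_BRANCH=main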

Behavior:

  • Keys (no spaces) → resolved from nx-content (local or git); never sent as message content
  • Literal text (has spaces) → used as-is
  • Unresolvable key → error; no LLM call

File layout: e.g. skills/<name>.instructions.md, skills/<name>.prompt.md under the content root. See Content Resolver — Upstream Guide for full layout, checklist, and diagnostics.

7. Template Parsing Configuration (workingMemory)

Where to configure:

  • Request-level: workingMemory object

How to configure:

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  // Instructions can be a key (resolved from content resolver) or text (parsed as template)
  instructions: 'professional-answer.instructions',  // Key with suffix
  // OR: instructions: 'You are a {{role}} assistant.',  // Text with template variables
  context: 'User is working on {{project}} project.',
  // Prompts can be a key (resolved from content resolver) or text (parsed as template)
  prompt: 'professional-answer.prompt',  // Key with suffix
  // OR: prompt: 'Analyze this {{type}}: {{input}}',  // Text with template variables
  input: 'This is a review',
  workingMemory: {
    role: 'helpful',
    project: 'AI Gateway',
    type: 'product review',
    input: 'This is a review'
  }
});

What gets parsed:

  • instructions - Resolved from content resolver (nx-content) if it's a key (no spaces), or parsed as template if text
  • context - Parsed as template with workingMemory
  • prompt - Resolved from content resolver if it's a key (no spaces), or parsed as template if text
  • All parsed using @x12i/rendrix (v4+) with workingMemory, shortTermMemory, experienceMemory, knowledgeMemory

Rendrix (@x12i/rendrix) v4 template protocol

  • Simple placeholders {{name}} or {{a.b.c}} are required (MUST): if resolution is undefined after the usual memory merge, rendering throws TemplateResolutionError from the parser. The gateway rethrows that error (it is not converted into a silent fallback).
  • Values that do not throw include null, empty string "", 0, and false.
  • Optional placeholders: {{path |}} (empty if missing) or {{path | fallback text}} (literal fallback when missing).
  • Helpers, blocks, {{file:...}}, {{json ...}}, etc. follow the parser’s own rules; the MUST/optional rules above apply to plain path mustaches.
  • For full parser API details, see @x12i/rendrix README / CHANGELOG.md.
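
A small sketch of the MUST vs optional placeholder rules above:

// {{role}} is a MUST placeholder: if it resolves to undefined, the parser
// throws TemplateResolutionError and the gateway rethrows it.
// {{project | an unnamed project}} is optional with a literal fallback.
const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'You are a {{role}} assistant working on {{project | an unnamed project}}.',
  input: 'Hello',
  workingMemory: { role: 'helpful' }  // project is missing, so the fallback text is used
});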

Gateway template options (passthrough)

  • GatewayConfig.templateRendering — default TemplateRenderOptions for every invoke() render path (merged after packaged src/defaults/template-rendering.json, which ships with subPathSearch.enabled: false). Your gateway config overrides that JSON.
  • templateRenderOptions on the request (ChatRequest / AIRequest) — merged on top of the gateway default for that call only (per-field override; subPathSearch fields merge with request winning).
  • Supported fields match the parser: templateId, subPathSearch (enabled, roots), silentMissingMustTokens (legacy Handlebars-style silence for missing MUST paths).
  • Sub-path root priority: subPathSearch.roots is an ordered list. The parser tries roots in array order; the first root that resolves the leaf path wins (see ISSUE-005). There is no separate “priority” field—the order of roots is the priority. Omit roots when enabled is true to use @x12i/rendrix packaged defaults.
// Example: prefer execution.*, then input.*, then inputs.* when a full path misses
new AIGateway({
  templateRendering: {
    subPathSearch: {
      enabled: true,
      roots: ['execution', 'input', 'inputs']
    }
  }
});
  • Memory overlay priority for value resolution (highest first): templateTokens (merged into short-term before render) → shortTermMemory → workingMemory → experienceMemory → knowledgeMemory (see the sketch after this list).
  • Root config.defaults.json may include a templateRendering block for apps that merge this file into GatewayConfig. Packaged template-rendering.json includes a sample roots order (used when you turn enabled on; while enabled is false, roots are ignored by the parser).
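
A sketch of the overlay order, assuming shortTermMemory is supplied on the request alongside workingMemory (as the memory list in section 7 suggests): the same key defined in two memories resolves from the higher-priority one.

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'You are a {{role}} assistant.',
  input: 'Hello',
  shortTermMemory: { role: 'concise' },  // higher priority
  workingMemory: { role: 'helpful' }     // shadowed: {{role}} renders as 'concise'
});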

Template-Based Prompts:

  • Prompts work exactly like instructions - both can be resolved using explicit keys. See Content Resolver — Upstream Guide.
  • Both instructions and prompts receive the same memory context for template rendering
  • Use explicit keys with suffixes: professional-answer.instructions and professional-answer.prompt
  • See Prompt Template Usage Guide for details

Note: Requires @x12i/rendrix ^4.x (already a dependency).

8. Provider Registration

Where to configure:

  • Automatic (Recommended): Environment variables - providers auto-register on gateway creation
  • Manual: Runtime: gateway.register(provider)

Automatic Registration (v4.0.7+):

Providers are automatically registered based on environment variables when the gateway is created:

// Set environment variables in .env:
// OPENAI_API_KEY=sk-...
// GROK_API_KEY=xai-...
// ANTHROPIC_API_KEY=sk-ant-... (optional)
// GOOGLE_API_KEY=... (optional)

const gateway = new AIGateway({
  defaultProvider: 'openai',
  fallbackChain: ['grok']
});

// Providers are automatically registered! No manual registration needed.
// The gateway will log which providers were auto-registered.

📋 Configuration Reference:

See .env.example in the project root for a comprehensive guide to all environment variables, including:

  • Provider API keys (required/optional)
  • MongoDB/activity tracking configuration
  • Content registry setup (S3, GitHub, Redis)
  • Internal system actions configuration
  • Logging and debugging options

Copy .env.example to .env and fill in your values.

Supported Providers (Auto-Registration):

  • OpenAI: OPENAI_API_KEY → Auto-registers openai provider
  • Grok: GROK_API_KEY → Auto-registers grok provider
  • Anthropic: ANTHROPIC_API_KEY → Auto-registers anthropic provider (if package installed)
  • Google: GOOGLE_API_KEY → Auto-registers google provider (if package installed)
  • Cohere: COHERE_API_KEY → Auto-registers cohere provider (if package installed)
  • Mistral: MISTRAL_API_KEY → Auto-registers mistral provider (if package installed)

Manual Registration (Optional):

You can still manually register providers if needed:

import { OpenAIProvider } from '@x12i/ai-provider-openai';
import { GrokProvider } from '@x12i/ai-provider-grok';

const gateway = new AIGateway({
  defaultProvider: 'openai',
  fallbackChain: ['grok']
});

// Manual registration (optional - auto-registration handles this if env vars are set)
gateway.register(new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY
}));

gateway.register(new GrokProvider({
  apiKey: process.env.GROK_API_KEY
}));

Note: Auto-registration only registers providers that:

  1. Have their API key set in environment variables
  2. Have their provider package installed (e.g., @x12i/ai-provider-openai)

If a provider package is not installed, auto-registration will skip it gracefully (with a debug log for optional providers, warning for required ones).

9. Complete Configuration Example

import { AIGateway } from '@x12i/ai-gateway';
import { createLogxer } from '@x12i/logxer';
import { Activix } from '@x12i/activix';
import { OpenAIProvider } from '@x12i/ai-provider-openai';

// 1. Configure activity tracker (and reuse its logger)
// Single source of truth: set up the logger once, pass it to the tracker,
// then reuse the same logger for the gateway.
const logger = createLogxer(
  { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
  {
    logLevel: 'info',
    logFormat: 'json',
    enableUnifiedLogger: true
  }
);

const statusValues = {
  started: 'started',
  inProgress: 'in_progress',
  completed: 'success',
  failed: 'failed',
  timeout: 'timeout'
};

const activityTracker = new Activix({
  collections: [
    { name: 'ai-activities', statusValues },
    { name: 'skill-executions', statusValues },
    { name: 'bad-requests', statusValues }
  ]
});

// 2. Create gateway with all configurations
const gateway = new AIGateway({
  // Provider routing
  defaultProvider: 'openai',
  defaultModel: 'gpt-4o',
  defaultEngine: 'openai',
  fallbackChain: ['grok'],
  
  // Usage tracking
  enableUsageTracking: true,
  usageTier: 'tier-3',
  
  // Activity tracking
  enableActivityTracking: true,
  activityTracker: activityTracker,
  
  // Logging
  enableLogging: true,
  packageName: 'MY_APP',
  logger: logger,
  
  // Content resolver (local and/or git)
  contentRegistryConfig: {
    localPath: '.metadata',
    // optional: mode: 'prod', github: { repo: process.env.GITHUB_REPO_URL, token: process.env.GITHUB_TOKEN }
  },
  
  // InstructionsBlocks
  instructionsBlocks: {
    'input-prefix': 'Custom prefix:'
  },
  
  // LLM defaults
  temperature: 0.7,
  maxTokens: 2000
});

// 3. Register providers
gateway.register(new OpenAIProvider({
  apiKey: process.env.OPENAI_API_KEY
}));

Configuration Priority Summary

For each configuration option, priority is (highest to lowest):

  1. Request-level config (in invoke() call)
  2. Gateway constructor config
  3. Content resolver (nx-content) (for instructionsBlocks only)
  4. JSON defaults (from src/defaults/)

Enhanced Gateway Features

1. Context Propagation (Job ID)

The gateway automatically propagates jobId through the entire request lifecycle for distributed tracing.

const response = await gateway.invoke({
  // Minimum required fields
  jobId: 'job-123',           // required
  agentId: 'agent-456',       // required
  instructions: 'You are a helpful assistant.', // required

  // Provide either messages OR prompt/input
  messages: [{ role: 'user', content: 'What is AI?' }],
  // OR:
  // prompt: 'professional-answer.prompt',  // Key resolved from content resolver (nx-content)
  // input: 'What is AI?',
  // Optional extra context inserted between instructions and user prompt/input
  // context: 'Only answer with a single sentence.',

  // Optional
  taskId: 'task-789',
  taskTypeId: 'question-answering',  // Or auto-generated from instructions MD5 hash
  graphId: 'graph-123',              // Optional: Graph execution context
  nodeId: 'node-456',                // Optional: Node execution context
  config: { model: 'gpt-5-nano' }
});

// jobId is automatically:
// - Attached to request metadata
// - Included in response metadata
// - Logged in all log entries
// - Tracked in activity logs

Request requirements:

  • Required fields: jobId, agentId, and instructions are mandatory
  • Content field (choose one):
    • messages array (for tool calling or custom message sequences)
    • OR prompt + input (prompt template resolved from content resolver or parsed with template variables)
    • OR input alone (uses default prefix from instructionsBlocks)
  • Optional fields: context (inserted as system message between instructions and user content), workingMemory (for template variables), graphId/nodeId/coreSkillId (for graph execution context - see Graph Execution Support)
  • Model requirement: config.model must be supplied per request (no default model)

Important Notes:

  • instructions is always required, even when using messages array
  • When messages is provided, the gateway constructs system messages from instructions (and context if provided), then appends your messages array
  • context is inserted as a separate system message between instructions and the user prompt/input
  • Template-Based Instructions and Prompts: Both instructions and prompt can be resolved from the content resolver (nx-content) using keys (e.g., professional-answer.instructions and professional-answer.prompt). Both receive the same memory context for template rendering. See Content Resolver — Upstream Guide and Prompt Template Usage Guide.
  • All template fields (instructions, context, prompt) support template variables via workingMemory

Benefits:

  • Full request traceability across distributed systems
  • Correlate logs, activities, and metrics by jobId
  • Debug complex multi-step AI workflows

2. Usage Tier Tracking (RPM/TPM Limits)

The gateway integrates with @x12i/x-models to enforce usage tier limits and prevent rate limit errors.

import { AIGateway, getTierInfo } from '@x12i/ai-gateway';

// Initialize with usage tier
const gateway = new AIGateway({
  usageTier: 'tier-3',  // 5,000 RPM, 2M TPM
  enableUsageTracking: true
});

// Get tier information
const tierInfo = getTierInfo('tier-3');
console.log(`RPM Limit: ${tierInfo?.rpm}, TPM Limit: ${tierInfo?.tpm}`);

// Gateway automatically:
// - Records every request to x-models
// - Calculates RPM/TPM consumption
// - Logs consumption percentages
// - Prevents exceeding tier limits

Available Tiers:

  • tier-1: 500 RPM, 500K TPM
  • tier-2: 5,000 RPM, 1M TPM
  • tier-3: 5,000 RPM, 2M TPM (default)
  • tier-4: 10,000 RPM, 4M TPM
  • tier-5: 15,000 RPM, 40M TPM

3. Activity Tracking (xronox-activitix via @x12i/activix v6)

The gateway uses @x12i/activix v6 (xronox-activitix) for full lifecycle logging. Recommended: enable MongoDB persistence so tracking is automatic. Default collections: ai-activities, bad-requests, skill-executions (see section 2).

⚠️ CRITICAL: correlation, identity, and unique record ids

IMPORTANT DESIGN CONCEPTS:

  1. Per-request correlation

    • aiRequestId (required): One id per gateway invocation; used as the primary leaf correlation field (stored on the activity row and inside Activix runContext).
    • identity.jobId and identity.taskId (required): Taken only from the upstream identity object; the gateway does not invent them.
    • jobTypeId, taskTypeId: Optional aggregation fields (same ideas as before).
    • Activity: Each individual LLM request is a separate activity with its own unique record.
  2. Mongo _id is the unique row key

    • jobId / taskId on the row mirror upstream identity for correlation; multiple activities may share a jobId when you intend grouping.
    • Activix updates rows by the record id from the start phase, not by jobId.
  3. Two-phase tracking (Activix v6)

    • Phase 1 (start): Creates a NEW database record with unique _id
      • Sends request-side data: request, config, runContext, root-level outer (and optional inner) I/O, startTime, status: 'started' (plus other gateway metadata)
      • Returns metadata containing the unique record id for completion
    • Phase 2 (complete / fail): Updates the SAME record by that id
      • Sends response/error data: response, endTime, duration, cost, status
      • Does not re-send full request payload
  4. Data structure (v2.6.0+):

    • Request fields (messages, instructions, prompt, input, context, workingMemory) → ONLY in request object
    • Config fields (model, provider, temperature, maxTokens) → ONLY in config object
    • Response fields (content, metadata) → ONLY in response object
    • NO duplication: Fields are NOT at root level, only in their structured objects

Example: same logical job, three LLM calls

Each call must have a distinct aiRequestId and a full identity (including jobId and taskId from upstream). Use the same identity.jobId (and distinct taskId per call, or your own convention) if you want to group rows in Mongo.

import * as crypto from 'crypto';

function md5(text: string): string {
  return crypto.createHash('md5').update(text).digest('hex');
}

const jobTypeId = md5('data-processing-job');

const identityBase = {
  jobId: 'job-123',
  sessionId: 'sess-1',
  instance: { instanceId: 'inst-1', type: 'gateway' }
};

await gateway.invoke({
  aiRequestId: 'req-001',
  identity: { ...identityBase, taskId: 'task-001' },
  jobTypeId,
  agentId: 'agent-1',
  // objectTypes, messages, ...
});

await gateway.invoke({
  aiRequestId: 'req-002',
  identity: { ...identityBase, taskId: 'task-002' },
  jobTypeId,
  agentId: 'agent-1',
  // objectTypes, messages, ...
});

// Query in Mongo (main collection name is ai-activities):
// db.getCollection('ai-activities').find({ 'runContext.aiRequestId': 'req-001' })
// db.getCollection('ai-activities').find({ 'runContext.jobId': 'job-123' })

Configuration

import { Activix } from '@x12i/activix';

const statusValues = {
  started: 'started',
  inProgress: 'in_progress',
  completed: 'success',
  failed: 'failed',
  timeout: 'timeout'
};

const activityTracker = new Activix({
  collections: [
    { name: 'ai-activities', statusValues },
    { name: 'skill-executions', statusValues },
    { name: 'bad-requests', statusValues }
  ]
});

const gateway = new AIGateway({
  enableActivityTracking: true,
  activityTracker
});

// Auto-persisted by the tracker:
// - Each activity creates a new record with unique _id
// - Start/end/duration, status (started|success|failed)
// - Provider, model, cost
// - Request/response metadata, errors
// - Correlation via runContext (and mirrored top-level fields); optional jobId for grouping

Database Record Structure

{
  // Unique identifier (MongoDB auto-generated)
  _id: ObjectId('693970636e8d0f171e4aa528'),  // ← UNIQUE per activity
  
  // Activix v6: canonical correlation BSON object `runContext` (same reference as `request.identity`, merged with gateway fields)
  runContext: {
    sessionId: 'sess-1',
    instance: { instanceId: 'gw-1', type: 'gateway' },
    aiRequestId: 'req-abc',
    jobId: 'job-123',
    jobTypeId: 'xyz789...',
    agentId: 'agent-456',
    taskId: 'task-789',
    taskTypeId: 'abc123...',
    graphId: 'graph-456',
    nodeId: 'node-789',
    masterSkillId: '...',
    masterSkillActivityId: '...'
  },
  // Mirrored / denormalized top-level fields may also appear from the gateway payload (query either as needed)
  aiRequestId: 'req-abc',
  sessionId: 'sess-1',
  instance: { instanceId: 'gw-1', type: 'gateway' },
  jobId: 'job-123',
  jobTypeId: 'xyz789...',
  agentId: 'agent-456',
  taskId: 'task-789',
  taskTypeId: 'abc123...',
  graphId: 'graph-456',
  nodeId: 'node-789',

  // Required activity I/O: root-level `outer` (optional `inner[]` for steps) — Activix v6 (see @x12i/activix docs)
  outer: { input: { ... }, output: { ... } | null, metadata: { ... } },
  // inner: [ { input, output, metadata, startedAt, endedAt, ... }, ... ],
  
  // Timing
  startTime: 1765372020804,
  endTime: 1765372021535,   // Added by logSuccess
  duration: 731,            // Added by logSuccess
  status: 'success',        // Updated by logSuccess (was 'started')
  
  // Request data (from startActivity - ONLY in request object)
  request: {
    raw: { 
      instructions,  // Original instructions (before template parsing)
      context,      // Original context (before template parsing)
      prompt        // Original prompt (before template parsing)
    },
    parsed: { 
      instructions,  // Parsed instructions (after template parsing with workingMemory)
      context,       // Parsed context (after template parsing with workingMemory)
      prompt         // Parsed prompt (after template parsing with workingMemory, includes input if provided)
    },
    input: "...",     // Original input text
    messages: [...],  // Final constructed messages array
    workingMemory: {...}  // Working memory used for template parsing
  },
  
  // Config data (from startActivity - ONLY in config object)
  config: {
    model: 'gpt-5-nano',
    provider: 'openai',
    temperature: 0.7,
    maxTokens: 1000,
    rawConfig: {...}
  },
  
  // Response data (from logSuccess - ONLY in response object)
  response: {
    content: "...",
    metadata: {...}
  },
  
  // Cost (from logSuccess)
  cost: 0.002,
  
  // Metadata
  createdAt: Date,
  updatedAt: Date
}

Key points:

  • ✅ Each activity = separate record with unique _id
  • aiRequestId = per-request correlation (required on invoke)
  • runContext.jobId / runContext.taskId = upstream identity (required on invoke since v9+)
  • ✅ Request data sent once at activity start; response data on completion
  • ✅ Updates use Activix record id / _id, not jobId

Retry Tracking (@x12i/activix v6)

The gateway automatically retries network errors, server errors (5xx), and throttling (429) with exponential backoff. Retry attempts are tracked and stored in activity records.

Retry Metadata Structure:

// Success case - retry metadata in response.metadata.retries
{
  response: {
    metadata: {
      retries: {
        count: 2,  // Number of retry attempts
        attempts: [
          {
            attempt: 1,                    // 1-based attempt number
            timestamp: 1234567890,          // When retry occurred
            error: "fetch failed",          // Error message
            errorType: "network",           // Error classification
            delayMs: 1000                   // Delay before retry
          },
          {
            attempt: 2,
            timestamp: 1234568890,
            error: "fetch failed",
            errorType: "network",
            delayMs: 2000
          }
        ]
      }
    }
  }
}

// Failure case - retry count in error message
{
  status: "failed",
  error: "Grok API network error: fetch failed [Retries: 3]"
}

Error Types:

  • network: Network errors (fetch failed, DNS, connectivity)
  • http-429: Throttling/rate limiting
  • http-5xx: Server errors (500, 502, 503, etc.)
  • timeout: Timeout errors
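
The delays shown above (1000 ms, then 2000 ms) are consistent with standard exponential backoff; a minimal sketch of that calculation (the base delay and cap are illustrative assumptions, not the gateway's exact values):

// Delay doubles on each attempt, capped at a maximum.
function backoffDelayMs(attempt: number, baseDelayMs = 1000, maxDelayMs = 30_000): number {
  return Math.min(baseDelayMs * 2 ** (attempt - 1), maxDelayMs);
}

backoffDelayMs(1); // 1000
backoffDelayMs(2); // 2000
backoffDelayMs(3); // 4000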

Querying Activities with Retries:

// Query activities that had retries (main collection name is ai-activities)
const activitiesWithRetries = await db.getCollection('ai-activities').find({
  'response.metadata.retries.count': { $gt: 0 }
});

// Query activities with network errors that were retried
const networkRetries = await db.getCollection('ai-activities').find({
  'response.metadata.retries.attempts.errorType': 'network'
});

// Query activities that failed after retries
const failedAfterRetries = await db.getCollection('ai-activities').find({
  status: 'failed',
  error: /\[Retries: \d+\]/
});

Requirements:

  • @x12i/activix required for retry tracking metadata persistence
  • Backward compatible: Works with older versions (retry metadata just won't be stored)

4. Response Structure (v2.1.0+)

The gateway returns a comprehensive response structure that captures the full lifecycle: raw provider response, gateway normalization, inference parsing, and calculated metrics.

Complete Response Structure

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'You are a helpful assistant.',
  input: 'What is AI?',
  config: { model: 'gpt-5-nano' }
});

// Response structure:
{
  // ============================================
  // Raw Provider Response (from router)
  // ============================================
  content: string,              // Normalized string (always present)
  rawText?: string,             // Original raw text from provider (before parsing)
  
  // Raw content from provider (if preserved)
  // Note: response.content is normalized, rawContent would be in routerResponse
  
  // ============================================
  // Gateway Normalization & Parsing
  // ============================================
  parsedContent?: TContent,     // Parsed JSON object/array (if content was JSON)
  
  metadata: {
    // Content type classification
    contentType?: 'string' | 'object' | 'array' | 'null',
    
    // ============================================
    // Gateway Calculated Metrics
    // ============================================
    jobId?: string,             // Job ID for correlation
    latencyMs: number,          // Execution time in milliseconds
    tokens: {
      prompt: number,           // Input tokens
      completion: number,       // Output tokens
      total: number,            // Total tokens
      // Cache token support (if available)
      cacheInputTokens?: number,
      cacheOutputTokens?: number,
      cacheTotalTokens?: number
    },
    model?: string,             // Model ID used (e.g., 'gpt-4o', 'claude-sonnet-4')
    provider?: string,          // Provider used (e.g., 'openai', 'anthropic')
    cost?: number,              // Cost in USD (if available)
    
    // ============================================
    // Inference Output Parsing (if inferenceType provided)
    // ============================================
    parsedOutput?: unknown,     // Typed inference output (classification, Q&A, etc.)
    inferenceType?: string,     // Inference type used (e.g., 'classification')
    outputValidationErrors?: string[], // Schema validation errors (if validation enabled)
    
    // ============================================
    // Provider Metadata (from router)
    // ============================================
    // Additional metadata from provider response
    // (merged from routerResponse.metadata)
  },
  
  // ============================================
  // Usage Information (from router)
  // ============================================
  usage?: {
    cost?: number,              // Cost from provider
    // Additional usage fields from provider
  }
}

Response Structure Breakdown

1. Raw Provider Response:

  • content - Normalized string (always present, never "[object Object]")
  • rawText - Original raw text from provider (preserved if available)
  • usage - Usage information from provider (cost, tokens if available)
  • Provider metadata merged into response.metadata

2. Gateway Normalization & Parsing:

  • parsedContent - Parsed JSON object/array (if content was JSON)
  • metadata.contentType - Type classification: 'string' | 'object' | 'array' | 'null'

3. Inference Output Parsing (if inferenceType provided in request):

  • metadata.parsedOutput - Typed inference output (classification, question-answer, extraction, etc.)
  • metadata.inferenceType - Inference type used
  • metadata.outputValidationErrors - Schema validation errors (if validation enabled)

4. Gateway Calculated Metrics:

  • metadata.jobId - Job ID for correlation
  • metadata.latencyMs - Request duration in milliseconds
  • metadata.tokens - Token breakdown (prompt, completion, total, cache tokens)
  • metadata.cost - Cost in USD
  • metadata.model - Model ID used
  • metadata.provider - Provider used

Example: Full Response

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'Classify sentiment',
  input: 'I love this product!',
  inferenceType: 'classification',
  parseOptions: { classes: ['positive', 'negative', 'neutral'] },
  config: { model: 'gpt-5-nano' }
});

// Complete response structure:
{
  // Normalized content (always string)
  content: '{"label":"positive","confidence":0.95}',
  
  // Raw text from provider
  rawText: '{"label":"positive","confidence":0.95}',
  
  // Parsed JSON (if content was JSON)
  parsedContent: { label: 'positive', confidence: 0.95 },
  
  metadata: {
    // Content classification
    contentType: 'object',
    
    // Gateway metrics
    jobId: 'job-123',
    latencyMs: 1250,
    tokens: {
      prompt: 100,
      completion: 50,
      total: 150
    },
    model: 'gpt-5-nano',
    provider: 'openai',
    cost: 0.002,
    
    // Inference output (parsed)
    parsedOutput: {
      label: 'positive',
      confidence: 0.95
    },
    inferenceType: 'classification',
    outputValidationErrors: undefined // No validation errors
  }
}

Note: The response structure captures the full lifecycle from raw provider response through gateway normalization to final parsed inference output, providing complete observability and traceability.

5. Structured Logging

The gateway uses @x12i/logxer for structured logging with LogMeta correlation. See Logger initialization.

import { createLogxer } from '@x12i/logxer';

const gateway = new AIGateway({
  enableLogging: true,
  packageName: 'MY_APP',
  logger: createLogxer(
    { packageName: 'MY_APP', envPrefix: 'MY_APP', debugNamespace: 'my-app' },
    {
      logLevel: 'info',
      logFormat: 'json',
      enableUnifiedLogger: true
    }
  )
});

// All operations are automatically logged:
// - Request initiation with jobId
// - Provider/model selection
// - Usage consumption
// - Success/failure with full context

6. Object Type Output Support (@x12i/outputs-library)

The gateway integrates with @x12i/outputs-library to parse LLM responses into typed inference outputs (classification, question-answer, extraction, etc.).

Overview

When you specify an inferenceType in your request, the gateway automatically:

  1. Parses the response into a typed output object
  2. Validates against JSON Schema (optional)
  3. Provides type-safe access to structured data

Installation

npm install @x12i/outputs-library

Note: @x12i/outputs-library is automatically installed as a dependency.

Dependency Resolution: The gateway includes npm overrides to resolve version conflicts between the outputs library and content-registry. The integration uses dynamic imports for graceful degradation - if the outputs library is not available, the gateway will continue to work (parsing will be skipped with a warning).

Installation: The package.json includes overrides to handle version conflicts automatically. If you still encounter issues:

npm install --legacy-peer-deps

Note for Package Maintainers: The @x12i/outputs-library package should update its peer dependency from @xronoces/content-registry@^1.0.0 to @xronoces/content-registry@>=1.0.0 or ^1.0.0 || >=2.7.0 to support both versions. See DEPENDENCY_RESOLUTION.md for details.
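
For reference, an npm overrides block in package.json takes this general shape (the entry below is an illustrative assumption; see this package's package.json and DEPENDENCY_RESOLUTION.md for the real entries):

{
  "overrides": {
    "@xronoces/content-registry": ">=1.0.0"
  }
}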

Supported Inference Types

  • classification - Classify content into predefined categories
  • question-answer - Answer questions based on context
  • extraction - Extract structured data from unstructured text
  • summarization - Generate summaries of content
  • risk-assessment - Assess risks with scores and factors
  • recommendation - Generate recommendations with priorities
  • transformation - Transform data between formats

Basic Usage

import { AIGateway } from '@x12i/ai-gateway';
import type { ClassificationOutput } from '@x12i/ai-gateway';

const gateway = new AIGateway({
  defaultProvider: 'openai'
});

// Request with inference type
const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'Classify the sentiment of the text.',
  input: 'This product is amazing!',
  inferenceType: 'classification',
  parseOptions: {
    classes: ['positive', 'negative', 'neutral']
  }
});

// Access parsed output
const classification = response.metadata.parsedOutput as ClassificationOutput;
console.log(classification.classes); // ['positive', 'negative', 'neutral']
console.log(classification.confidence); // { positive: 0.85, negative: 0.1, neutral: 0.05 }

With Schema Validation

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'Extract user information.',
  input: 'Name: John Doe, Email: [email protected]',
  inferenceType: 'extraction',
  validateOutputSchema: true  // Enable validation
});

// Check validation errors (if any)
if (response.metadata.outputValidationErrors) {
  console.warn('Validation errors:', response.metadata.outputValidationErrors);
}

// Access parsed output
const extraction = response.metadata.parsedOutput as ExtractionOutput;
console.log(extraction.extracted); // { name: 'John Doe', email: '[email protected]' }

Question-Answer Example

const response = await gateway.invoke({
  jobId: 'job-123',
  agentId: 'agent-456',
  instructions: 'Answer the question based on the contex