@x12i/ai-gateway
v10.2.2
AI Gateway - Unified interface for LLM provider routing and management
Unified gateway for LLM provider routing, structured logging, optional Activix activity persistence, and cost/model resolution via @x12i/ai-tools v3. Built on @x12i/ai-providers-router, @x12i/logxer, @x12i/rendrix (templates), and @x12i/flex-md (output-format hints and max-token lookup).
What this package does
| Area | Behavior |
|------|----------|
| Routing | Registers providers (or lazy-registers from env), invokes the router with merged model config, retries, and optional fallback chain. |
| invoke() | Builds messages from instructions + prompt templates + workingMemory; requires runtime identity and actionType / actionRef. |
| invokeChat() | Raw chat-style requests; no instruction builder or action classification. |
| Cost | Steps A→D on every successful invoke() / invokeChat(): router cost first, then @x12i/ai-tools catalog via calculateFromRecord when still unpriced. Single path — resolveCostCompletionWithAiTools. |
| Activix | Optional Mongo-backed activity rows; billing written from gateway-computed slice on completeRecord (outer.cost + root fields). No Activix autoCost re-pricing. |
| Trace mode | diagnostics.mode === 'trace' adds metadata.attempts[], metadata.usage, and per-attempt costUsd / costStatus. |
Pinned dependency versions are in package.json (currently Activix ^8.6, ai-tools ^3.1, ai-profiles ^3.2, ai-providers-router ^4.9).
Installation
npm install @x12i/ai-gateway
Peer/provider packages (for example @x12i/ai-provider-openai) are installed by your application when you register those providers.
Mandatory runtime identity (v9+)
Every invoke() / invokeChat() request must include identity from the upstream client (the gateway does not invent jobId / taskId).
- Missing or empty identity.jobId / identity.taskId → warn logs when a logger is configured; the call may still proceed with the merged envelope.
- The same object is used as request.identity, response.metadata.identity, router context, and Activix runContext.
See Identity contract.
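A calling application can run a small pre-flight check that mirrors the warning rule above. This is a minimal sketch rather than the gateway's own code; `IdentityLike` and `findMissingIdentityFields` are hypothetical names introduced for illustration.

```typescript
// Hypothetical shape: only the two fields the gateway warns about are modelled.
interface IdentityLike {
  jobId?: string;
  taskId?: string;
  [key: string]: unknown;
}

// Returns the names of the identity fields that are missing or empty.
function findMissingIdentityFields(identity: IdentityLike | undefined): string[] {
  const required = ['jobId', 'taskId'] as const;
  return required.filter((field) => {
    const value = identity?.[field];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Usage: warn before handing the request to the gateway.
const missingIdentityFields = findMissingIdentityFields({ jobId: 'job-123', taskId: '' });
if (missingIdentityFields.length > 0) {
  console.warn(`identity is missing: ${missingIdentityFields.join(', ')}`);
}
```

Checking upstream keeps the fix close to the client that owns jobId / taskId, which matches the rule that the gateway does not invent them.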
Action classification (invoke() only)
| Field | Required | Values / notes |
|-------|----------|----------------|
| actionType | Yes | One of 'skill', 'preSkill', 'postSkill' |
| actionRef | Yes | Non-empty stable id (e.g. skill path) |
Copied onto identity for Activix and activity metadata. Not used by invokeChat().
Quick start
import { AIGateway } from '@x12i/ai-gateway';

const gateway = new AIGateway({
  defaultProvider: 'openrouter',
  enableLogging: true,
  enableActivityTracking: true,
  aiTools: {
    enabled: true,
    resolveModels: true,
    calculateCost: true
  }
});

const response = await gateway.invoke({
  aiRequestId: 'call-001',
  agentId: 'agent-456',
  actionType: 'skill',
  actionRef: 'skills/quick-reply',
  instructions: 'Reply briefly.',
  prompt: '{{input}}',
  identity: {
    sessionId: 'run-1',
    instance: { instanceId: 'agent-456', type: 'ai-reasoner' },
    aiRequestId: 'call-001',
    jobId: 'job-123',
    taskId: 'task-789',
    agentId: 'agent-456'
  },
  workingMemory: { input: 'Hello!' },
  config: { model: 'openai/gpt-4o-mini', provider: 'openrouter', maxTokens: 256 }
});
console.log(response.content, response.metadata?.costUsd, response.metadata?.tokens);
Providers without manual register()
- OpenRouter: Set OPENROUTER_API_KEY in .env. The gateway always passes this key to the router when set. By default, OpenRouter is preferred for routing (including when you also have direct keys such as OPENAI_API_KEY). @x12i/ai-tools resolves concrete model ids + provider via resolveInvokeModel() (catalog normalization, OpenRouter vs direct transport, router proxy flags). Pass catalog model ids such as openai/gpt-4o-mini or { provider: 'openrouter', model: 'deepseek/deepseek-v4-pro' }. Composite display slugs like openrouter/deepseek/deepseek-v4-pro are split at ingress (see Invoke model ingress below). Profile/choice aliases (cheap/default, …) must be resolved upstream (ai-tasks / resolveAIProfile); the gateway rejects them with GATEWAY_ALIAS_MODEL_REJECTED.
- PREFER_OPENROUTER=false: Do not prefer OpenRouter when a direct provider API key exists; use the direct provider instead. OpenRouter is still used as fallback when the request targets a provider without a direct key (e.g. anthropic without ANTHROPIC_API_KEY). It does not disable OpenRouter while OPENROUTER_API_KEY is set.
- Direct providers: Set OPENAI_API_KEY, GROK_API_KEY, etc. Registered lazily on first invoke.
Details: OpenRouter env.
Load .env before constructing the gateway if another package creates it first.
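The transport rules in the list above can be read as a small decision function. The sketch below encodes that reading only; it is not the router's implementation. `chooseTransport` and the `DIRECT_KEY_VARS` mapping are hypothetical, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
type EnvMap = Record<string, string | undefined>;

// Hypothetical mapping from provider name to its direct API key variable.
const DIRECT_KEY_VARS: Record<string, string> = {
  openai: 'OPENAI_API_KEY',
  anthropic: 'ANTHROPIC_API_KEY',
  grok: 'GROK_API_KEY',
};

// Decide which transport a request would use under the rules described above.
function chooseTransport(provider: string, env: EnvMap): 'openrouter' | 'direct' | 'none' {
  const hasOpenRouter = Boolean(env.OPENROUTER_API_KEY);
  const keyVar = DIRECT_KEY_VARS[provider];
  const hasDirect = keyVar !== undefined && Boolean(env[keyVar]);
  // Assumption: only the literal string "false" turns the preference off.
  const preferOpenRouter = env.PREFER_OPENROUTER !== 'false';

  // OpenRouter is used when preferred, or as fallback when no direct key exists.
  if (hasOpenRouter && (preferOpenRouter || !hasDirect)) return 'openrouter';
  if (hasDirect) return 'direct';
  return 'none'; // no usable key for this provider
}

// Both keys set, preference disabled: the direct provider is used.
console.log(chooseTransport('openai', {
  OPENROUTER_API_KEY: 'sk-or-example',
  OPENAI_API_KEY: 'sk-example',
  PREFER_OPENROUTER: 'false',
}));
```

Note that PREFER_OPENROUTER=false never yields 'none' for a provider while OPENROUTER_API_KEY is set, which matches the fallback behavior described for anthropic without ANTHROPIC_API_KEY.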
Base router only
import { LLMProviderRouter } from '@x12i/ai-gateway';
const router = new LLMProviderRouter({ defaultProvider: 'openai' });
// register providers, then router.invoke({ messages: [...] })
Configuration
Gateway constructor (common flags)
| Option | Default | Purpose |
|--------|---------|---------|
| enableLogging | true | Logxer pipeline |
| logger | built-in | Pass your app createLogxer() instance |
| enableActivityTracking | true | Activix persistence (needs Mongo env when no activityTracker) |
| activityTracker | — | Custom Activix instance (collection names must still match package constants) |
| enableUsageTracking | true | In-process usage tier helper |
| aiTools | see below | Model resolution + catalog pricing |
| mode | 'debug' | One of 'dev', 'debug', 'prod'; sets ai-tools model resolution strictness (see below) |
| diagnostics | — | { mode: 'trace' } for rich metadata.attempts / metadata.usage |
| retry | code defaults | Provider invoke retry; override per request (see Runtime defaults) |
| temperature, topP, frequencyPenalty, presencePenalty | code defaults | Gateway-wide sampling; override per request |
| maxTokens | — | Required on every invoke (see below); optional gateway-wide default |
Packaged defaults: only defaults/template-rendering.json (Rendrix merge at init). No packaged model, instructions blocks, or rate-limit JSON.
Runtime defaults (v10+)
Constants exported from @x12i/ai-gateway — not env vars. Downstream packages should re-export or pass through on their public invoke API.
| Constant | Default | Override priority |
|----------|---------|-------------------|
| GATEWAY_DEFAULT_TEMPERATURE | 0.7 | modelConfig > request.config > GatewayConfig > constant |
| GATEWAY_DEFAULT_TOP_P | 1.0 | same |
| GATEWAY_DEFAULT_FREQUENCY_PENALTY | 0.0 | same |
| GATEWAY_DEFAULT_PRESENCE_PENALTY | 0.0 | same |
| GATEWAY_DEFAULT_RETRY | { maxRetries: 3, initialDelay: 1000, maxDelay: 30000, backoffMultiplier: 2, enableJitter: true, throttlingDelay: 5000 } | request.config.retry > request.retry > GatewayConfig.retry > constant |
import {
GATEWAY_DEFAULT_RETRY,
GATEWAY_DEFAULT_TEMPERATURE,
resolveRetryConfig
} from '@x12i/ai-gateway';
Required on every invoke: config.model (or modelConfig.model) and maxTokens (request.config, modelConfig, GatewayConfig, or internalSystemActions). Missing model → ModelRequiredError (code: 'MODEL_REQUIRED'). Missing maxTokens → MaxTokensRequiredError (code: 'MAX_TOKENS_REQUIRED'). Profile/choice alias at invoke → GatewayAliasModelRejectedError (code: 'GATEWAY_ALIAS_MODEL_REJECTED'). There is no packaged default model, no flex-md / Optimixer auto-fill, and no GATEWAY_DEFAULT_MAX_TOKENS. Use @x12i/optimixer in the client that wraps this gateway if you want adaptive completion budgets.
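The override priority and the two required fields can be illustrated with a self-contained sketch. It does not call the gateway; `resolveEffectiveConfig`, `firstDefined`, `ConfigLayer`, and `RequiredFieldError` are hypothetical names. It also assumes that maxTokens follows the same layer order as temperature and that modelConfig.model takes precedence over config.model, neither of which is stated above.

```typescript
interface ConfigLayer {
  model?: string;
  maxTokens?: number;
  temperature?: number;
}

const DEFAULT_TEMPERATURE = 0.7; // mirrors GATEWAY_DEFAULT_TEMPERATURE

// Hypothetical error type carrying the documented error codes.
class RequiredFieldError extends Error {
  code: string;
  constructor(code: 'MODEL_REQUIRED' | 'MAX_TOKENS_REQUIRED') {
    super(code);
    this.code = code;
  }
}

// Returns the first defined value, scanning from highest to lowest priority.
function firstDefined<T>(...values: (T | undefined)[]): T | undefined {
  return values.find((v) => v !== undefined);
}

function resolveEffectiveConfig(
  modelConfig: ConfigLayer,
  requestConfig: ConfigLayer,
  gatewayConfig: ConfigLayer,
) {
  // No packaged default model: a missing model is an error.
  const model = firstDefined(modelConfig.model, requestConfig.model);
  if (!model) throw new RequiredFieldError('MODEL_REQUIRED');

  // No default maxTokens constant either.
  const maxTokens = firstDefined(
    modelConfig.maxTokens,
    requestConfig.maxTokens,
    gatewayConfig.maxTokens,
  );
  if (maxTokens === undefined) throw new RequiredFieldError('MAX_TOKENS_REQUIRED');

  // modelConfig > request.config > GatewayConfig > constant
  const temperature =
    firstDefined(modelConfig.temperature, requestConfig.temperature, gatewayConfig.temperature) ??
    DEFAULT_TEMPERATURE;

  return { model, maxTokens, temperature };
}

const effective = resolveEffectiveConfig(
  {},
  { model: 'openai/gpt-4o-mini', maxTokens: 256 },
  { temperature: 0.2 },
);
console.log(effective);
```

Using an explicit `undefined` check instead of a truthiness test keeps a deliberate `temperature: 0` from being replaced by the default.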
Rate limiting: removed from the gateway. See upstream rate-limit spec — implement in @x12i/ai-providers-router.
Template rendering (defaults/template-rendering.json)
Used by @x12i/rendrix when parsing instructions, prompt, and context:
- Loaded at gateway init from defaults/template-rendering.json (copied to dist/defaults/ on build).
- Merged with GatewayConfig.templateRendering.
- Per-request override via templateRenderOptions, smartInput, smartInputRenderOptions.
Flow: mergeGatewayAndRequestTemplateRenderOptions() → parseTemplate() → Rendrix render(). Details: UPSTREAM_TEMPLATE_RENDERING_AND_PARSER_V4.md.
Downstream passthrough (ai-skills, ai-tasks, graph-engine)
Hosts wrapping the gateway should expose on their public API:
| Field | Required | Notes |
|-------|----------|-------|
| model | Yes | Never omit — gateway does not infer a model |
| provider | When not fully resolved by OpenRouter + ai-tools | |
| temperature, topP, frequencyPenalty, presencePenalty, maxTokens | Optional | Document defaults from GATEWAY_DEFAULT_* |
| retry | Optional | Same shape as RetryConfig; defaults from GATEWAY_DEFAULT_RETRY |
| mode | Optional | One of 'dev', 'debug', 'prod'; pass through to GatewayConfig.mode |
| Billing | Read-only on response | response.metadata.costUsd, costStatus, tokens — gateway-owned; do not re-price |
| templateRenderOptions / smartInput | Optional | Rendrix overrides |
Instructions must be complete caller text — the gateway no longer injects packaged instruction blocks.
Activix response size cap
DEFAULT_ACTIVITY_FULL_RESPONSE_MAX_CHARS (512_000) caps JSON stored in Activix content.fullResponse when diagnostics allow it. Override with diagnostics.activityFullResponseMaxChars on the invoke request.
Environment (selected)
| Variable | Role |
|----------|------|
| MONGO_URI, MONGO_LOGS_DB / MONGO_DB / MONGO_AI_LOGS_DB | Activix when no custom tracker |
| ACTIVIX_DB_NAME | Activix database override (falls back to MONGO_AI_LOGS_DB → MONGO_LOGS_DB → MONGO_DB) |
| mode / MODE | Operational mode (dev, debug, prod) — expose to downstream clients |
| AI_GATEWAY_LOGS_LEVEL | Log threshold for gateway diagnostics (AI_GATEWAY prefix): error … verbose |
| AI_GATEWAY_VERBOSE | Full payload lines (still requires AI_GATEWAY_LOGS_LEVEL=verbose) |
| LOGXER_PACKAGE_LEVELS | Bulk stack levels, e.g. AI_GATEWAY:info,AI_PROVIDER_ROUTER:debug |
| OPENROUTER_API_KEY | OpenRouter key; always wired when set (required for OpenRouter transport) |
| PREFER_OPENROUTER | Optional; default prefer OpenRouter when key is set. false = use direct provider keys when present; OpenRouter still used as fallback when a provider has no key |
| Other provider keys | OPENAI_API_KEY, GROK_API_KEY, etc. |
Logging details: Logger initialization. Package identity and audit checklist: LOGXER_INTEGRATION_CHECKLIST.md.
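The database-name fallback listed for ACTIVIX_DB_NAME can be sketched as a simple ordered lookup. `resolveActivixDbName` is a hypothetical helper, not a gateway export, and treating an empty string as unset is an assumption.

```typescript
// Walks the documented fallback chain and returns the first usable value.
function resolveActivixDbName(env: Record<string, string | undefined>): string | undefined {
  const order = ['ACTIVIX_DB_NAME', 'MONGO_AI_LOGS_DB', 'MONGO_LOGS_DB', 'MONGO_DB'];
  for (const name of order) {
    const value = env[name];
    if (value) return value; // assumption: empty strings count as unset
  }
  return undefined;
}

// MONGO_LOGS_DB wins over MONGO_DB when no higher-priority variable is set.
console.log(resolveActivixDbName({ MONGO_DB: 'app', MONGO_LOGS_DB: 'logs' }));
```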
Logxer identity
| Field | Value | Env / filter |
|-------|-------|----------------|
| packageName (log package column) | AIGateway | — |
| envPrefix | AI_GATEWAY | AI_GATEWAY_LOGS_LEVEL, LOGXER_PACKAGE_LEVELS |
| debugNamespace | ai-gateway | DEBUG=ai-gateway |
Console lines show app: (host app from cwd package.json) separately from package: (AIGateway for gateway code, host package for host code).
Exports: GATEWAY_LOGXER_PACKAGE, GATEWAY_LOG_ENV_PREFIX, createGatewayLogger, resolveGatewayVerboseEnabled.
Invoke model ingress (v10.2+)
Before catalog lookup, mergeConfig() normalizes the invoke wire shape via normalizeInvokeModel from @x12i/ai-profiles (defense in depth — idempotent when upstream already fixed):
| Input | Result |
|-------|--------|
| model: 'openrouter/deepseek/deepseek-v4-pro', provider: 'unspecified' | { provider: 'openrouter', model: 'deepseek/deepseek-v4-pro' } → catalog lookup |
| { provider: 'openrouter', model: 'deepseek/deepseek-v4-pro' } | Pass-through unchanged |
| model: 'cheap/default' (profile/choice alias) | GATEWAY_ALIAS_MODEL_REJECTED — resolve upstream via ai-tasks / resolveAIProfile |
Exported helpers: normalizeInvokeModelAtIngress, GatewayAliasModelRejectedError.
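To make the ingress table concrete, here is a simplified stand-in for the normalization step. It is not `normalizeInvokeModel` from @x12i/ai-profiles: the function name, the transport prefix list, and the alias set are hypothetical, and the real alias detection is likely richer than a fixed set.

```typescript
interface InvokeModel {
  provider?: string;
  model: string;
}

// Hypothetical lists; the real sets live in the upstream packages.
const TRANSPORT_PREFIXES = ['openrouter'];
const PROFILE_ALIASES = new Set(['cheap/default']);

class AliasRejectedError extends Error {
  code = 'GATEWAY_ALIAS_MODEL_REJECTED';
}

function normalizeModelSketch(input: InvokeModel): InvokeModel {
  // Profile/choice aliases must be resolved upstream, so they are rejected here.
  if (PROFILE_ALIASES.has(input.model)) {
    throw new AliasRejectedError(`alias "${input.model}" must be resolved upstream`);
  }
  // Composite slug: split the transport prefix off the catalog model id.
  for (const prefix of TRANSPORT_PREFIXES) {
    if (input.model.startsWith(prefix + '/')) {
      return { provider: prefix, model: input.model.slice(prefix.length + 1) };
    }
  }
  // Already normalized: pass through unchanged.
  return input;
}

const normalized = normalizeModelSketch({
  provider: 'unspecified',
  model: 'openrouter/deepseek/deepseek-v4-pro',
});
console.log(normalized);
```

Because an already-split input has no transport prefix left in `model`, applying the function a second time returns the same value, which is the idempotence the table relies on.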
@x12i/ai-tools v3 (models + cost)
Engine-owned catalog bootstrap and post-call billing. Consumers read metadata.costUsd / costStatus only — no direct @x12i/ai-tools dependency for cost.
Resolution order (after every successful LLM call)
| Step | Condition | Result |
|------|-----------|--------|
| A | Router/provider returned finite costUsd (or equivalent) | costStatus: "priced", set cost |
| B | Tokens + catalog pricing succeeds (isAuthoritative, not unknownModel, finite cost ≥ 0) | priced (+ optional breakdown) |
| C | Tokens but no price | unpriced |
| D | No usage | omit costUsd and costStatus |
Step A always wins; explicit router costStatus: "unpriced" is never overridden by catalog.
Implemented in resolveCostCompletionWithAiTools only ( CostCalculator.calculateFromRecord via buildGatewayPricingRecord for Step B). Upstream target: resolveInvokeBilling in ai-tools — AI_TOOLS_INVOKE_BILLING_ORCHESTRATOR_SPEC.md.
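The A→D order can be expressed as one function, which may help when reading costStatus values in tests. This sketch only models the decision table; it is not resolveCostCompletionWithAiTools. `resolveBillingSketch`, `CallResult`, and the `catalogPrice` callback are hypothetical stand-ins, and the callback collapses the isAuthoritative / unknownModel checks into "returns a number or undefined".

```typescript
type CostStatus = 'priced' | 'unpriced';

interface CallResult {
  routerCostUsd?: number;        // cost reported by the router/provider, if any
  routerCostStatus?: CostStatus; // explicit status from the router, if any
  totalTokens?: number;          // token usage, if reported
}

interface Billing {
  costUsd?: number;
  costStatus?: CostStatus;
}

function resolveBillingSketch(
  call: CallResult,
  catalogPrice: (tokens: number) => number | undefined,
): Billing {
  // Step A: a finite router cost always wins.
  if (call.routerCostUsd !== undefined && Number.isFinite(call.routerCostUsd)) {
    return { costUsd: call.routerCostUsd, costStatus: 'priced' };
  }
  // Step D: no usage means no billing fields at all.
  if (call.totalTokens === undefined || call.totalTokens <= 0) {
    return {};
  }
  // An explicit "unpriced" from the router is never overridden by the catalog.
  if (call.routerCostStatus === 'unpriced') {
    return { costStatus: 'unpriced' };
  }
  // Step B: catalog pricing, accepted only for a finite cost >= 0.
  const price = catalogPrice(call.totalTokens);
  if (price !== undefined && Number.isFinite(price) && price >= 0) {
    return { costUsd: price, costStatus: 'priced' };
  }
  // Step C: tokens but no price.
  return { costStatus: 'unpriced' };
}

console.log(resolveBillingSketch({ totalTokens: 1200 }, () => 0.5));
```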
aiTools config (aligned with funcx / generic engine contract)
| Flag | Default | Purpose |
|------|---------|---------|
| enabled | true | Bootstrap AiModelsCatalogClient + CostCalculator |
| calculateCost | true | Run post-call catalog pricing when router did not price |
| resolveModels | true | mergeConfig() → resolveInvokeModel() |
| modelsOnly | true | Reject profile shortcuts at catalog resolution (cheapest, …). Aliases are always rejected at ingress regardless of this flag. |
| bundledOnly | false | Offline bundled catalogs only |
| costIncludeBreakdown | false | Include prompt/completion breakdown on priced results |
| catalogLane | "text" (ai-tools default) | Catalog lane for resolution + cost lookup (text, image, …) |
| cacheTtlMs | ai-tools default (24h) | In-memory catalog cache TTL |
- No Catalox / Firestore: catalogs come from ai-tools open-assets JSON (optional bundledOnly).
Gateway exports the model orchestrator from @x12i/ai-tools ≥ 3.3.0 (resolveInvokeModel, preferOpenRouter, …). Profile/choice keys (cheap/default, …) must be resolved before invoke — the gateway does not run alias resolution. Shortcuts like cheapest / bare cheap are rejected at ingress or catalog resolution.
Gateway billing helpers (exported for tests/integrators): resolveCostCompletionWithAiTools, buildGatewayPricingRecord, catalogPricingSucceeded, buildTraceUsageSummary, enrichTraceAttemptsWithBilling.
Activity tracking (@x12i/activix 8.6)
When tracking is enabled and no custom tracker is supplied, the gateway constructs Activix with fixed collection names (see src/config/activity-tracking-config.ts):
| Collection | Typical use |
|------------|-------------|
| ai-actions | Normal gateway invocations |
| bad-requests | Validation / pre-start failures |
| skill-executions | Skill-specific rows |
Lifecycle: startRecord → completeRecord / failRecord keyed by activityId (not jobId).
Successful completion (no duplicate billing on outer.metadata):
- Root: cost, costUsd, costStatus, metadata (routing + billing mirror; metadata.modelUsed for studio)
- outer.metadata: routing only (modelUsed, provider, …)
- outer.cost: Activix cost shape (usd, tokens, provider, model, details)
- response.metadata: same billing slice as returned to callers
Activix 8.6 runs materializeRecordRoutingAndBilling on persist (root config mirror from routing metadata). Gateway resolves billing before completeRecord and sets outer.cost from that slice.
autoCost: Activix 8.6 default is false. The gateway keeps autoCost: false on the default activity manager so billing is not recomputed via ai-tools at persist time (no second pricing path). Custom activityTracker instances may opt in to Activix autoCost (uses @x12i/ai-tools v3 calculateFromRecord when enabled).
Mongo env: MONGO_URI + MONGO_LOGS_DB or MONGO_DB.
Response metadata and cost
On every successful invoke() and invokeChat():
- metadata.provider, modelUsed, maxTokensRequested, effectiveModelConfig (invoke only)
- metadata.tokens, costStatus, costUsd, optional costBreakdown; cost mirrors costUsd when priced
Client rules (ai-skills, graph-engine, etc.)
| metadata.costStatus | Meaning | Client action |
|------------------------|---------|---------------|
| priced | Gateway resolved a billable USD amount | Use metadata.costUsd (or cost) |
| unpriced | Tokens recorded; no authoritative price | Do not call ai-tools or re-price |
| (absent) | No token usage | No billing signal |
Do not add a direct @x12i/ai-tools dependency for post-call cost. For Activix rows you write yourself, use normalizeToActivixCostShape (re-exported from @x12i/activix) from costUsd + metadata.tokens.
Full contract: AI Gateway invoke execution metadata.
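On the client side, the costStatus rules reduce to a single guard. A minimal sketch follows, assuming only the metadata fields named in this section; `billableUsd` and `BillingMetadata` are hypothetical names, not gateway exports.

```typescript
interface BillingMetadata {
  costStatus?: 'priced' | 'unpriced';
  costUsd?: number;
}

// Returns the USD amount to record, or null when there is nothing to bill.
function billableUsd(metadata: BillingMetadata | undefined): number | null {
  if (metadata?.costStatus === 'priced' && typeof metadata.costUsd === 'number') {
    return metadata.costUsd;
  }
  // 'unpriced' or absent: record nothing and do not re-price.
  return null;
}

// Usage: accumulate spend across several responses.
const responses: BillingMetadata[] = [
  { costStatus: 'priced', costUsd: 0.25 },
  { costStatus: 'unpriced' },
  {},
];
const totalUsd = responses.reduce((sum, m) => sum + (billableUsd(m) ?? 0), 0);
console.log(totalUsd);
```

Returning null rather than 0 for unpriced calls lets a caller distinguish "free" from "unknown" if it needs to report that separately.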
Trace diagnostics
await gateway.invoke({
...request,
diagnostics: { mode: 'trace' }
});
Adds metadata.attempts, metadata.usage, metadata.requestIds, and per-attempt costUsd / costStatus after catalog enrichment.
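Once trace mode is on, per-attempt billing can be rolled up by the caller. The sketch below assumes each attempt carries only the costUsd / costStatus fields mentioned above; the `TraceAttempt` type and `summarizeAttempts` helper are hypothetical, and real attempt objects probably hold more fields.

```typescript
interface TraceAttempt {
  costUsd?: number;
  costStatus?: 'priced' | 'unpriced';
}

// Sums priced attempts and counts those that recorded tokens without a price.
function summarizeAttempts(attempts: TraceAttempt[]) {
  let pricedUsd = 0;
  let unpricedCount = 0;
  for (const attempt of attempts) {
    if (attempt.costStatus === 'priced' && typeof attempt.costUsd === 'number') {
      pricedUsd += attempt.costUsd;
    } else if (attempt.costStatus === 'unpriced') {
      unpricedCount += 1;
    }
  }
  return { attempts: attempts.length, pricedUsd, unpricedCount };
}

const traceSummary = summarizeAttempts([
  { costStatus: 'priced', costUsd: 0.25 },
  { costStatus: 'unpriced' },
  { costStatus: 'priced', costUsd: 0.5 },
]);
console.log(traceSummary);
```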
Operational modes
| Mode | Model resolution | Notes |
|------|------------------|-------|
| dev | Strict — unknown profile/model fails at mergeConfig when aiTools.resolveModels is on | Best for CI / local |
| debug | Same strict resolution | Default when env unset |
| prod | Same strict resolution | No implicit default model — callers must pass model |
Set via constructor mode or env mode / MODE. Downstream hosts should document and expose mode so graph/skill callers know resolution behavior.
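A downstream host that exposes mode may want to resolve it the same way in one place. The following is a sketch under stated assumptions: the constructor value wins over the environment, `mode` is checked before `MODE`, and unrecognized values are ignored. None of these orderings is confirmed above, and `resolveModeSketch` is a hypothetical helper.

```typescript
type GatewayMode = 'dev' | 'debug' | 'prod';

function isGatewayMode(value: unknown): value is GatewayMode {
  return value === 'dev' || value === 'debug' || value === 'prod';
}

function resolveModeSketch(
  constructorMode: string | undefined,
  env: Record<string, string | undefined>,
): GatewayMode {
  // Assumed priority: constructor option, then env mode, then env MODE.
  const candidates = [constructorMode, env.mode, env.MODE];
  for (const candidate of candidates) {
    if (isGatewayMode(candidate)) return candidate;
  }
  return 'debug'; // documented default when nothing is set
}

console.log(resolveModeSketch(undefined, { MODE: 'prod' }));
```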
Every mode requires an explicit model on the request (concrete catalog id or normalized OpenRouter slug). Unknown models throw ModelResolutionError. Profile/choice aliases throw GatewayAliasModelRejectedError at ingress — resolve them upstream (ai-tasks).
Testing
| Script | What it runs |
|--------|----------------|
| npm test | All unit/integration tests in .tests/run-all.js (tsx, no network) |
| npm run test:ai-tools | ai-tools + cost + trace helper unit tests |
| npm run test:ai-tools:live | Real invoke + dev strict model check (needs API key) |
| npm run test:flex-md-parsing | flex-md parsing scenarios |
| npm run test:flex-md-esm-regression | ESM build regression for flex-md |
| npm run test:prepublish | build + npm test |
Live tests use LIVE_TEST_PROVIDER / LIVE_TEST_MODEL (default openrouter + openai/gpt-4o-mini). Set LIVE_SKIP_INVOKE=1 to skip the LLM call. Profile alias invokes (LIVE_TEST_PROFILES=1) are no longer supported — resolve profiles upstream (ai-tasks) before calling the gateway.
Documentation index
Full index: docs/README.md. Upstream gaps: docs/upstream-reports/.
| Document | Topic |
|----------|--------|
| AI_GATEWAY_INVOKE_EXECUTION_METADATA.md | Metadata, cost, trace, Activix completion |
| IDENTITY_OBJECT_CONTRACT.md | Identity / runContext |
| OPENROUTER_ENV.md | OPENROUTER_API_KEY and PREFER_OPENROUTER |
| UPSTREAM_PROFILE_RESOLUTION_AND_OPENROUTER_FALLBACK.md | Profile routing + OpenRouter fallback |
| UPSTREAM_TEMPLATE_RENDERING_AND_PARSER_V4.md | Parser v4 + template-rendering.json |
| GRAPH_EXECUTION_SUPPORT.md | Graph / node identity |
| LOGGER_INITIALIZATION.md | Logxer setup |
| DUAL_PACKAGE_SETUP_GUIDE.md | ESM + CJS publish layout |
Troubleshooting helpers
import { validateAIRequest, diagnoseRequest, formatDiagnostic } from '@x12i/ai-gateway';
Enable gateway/router diagnostics:
export AI_GATEWAY_LOGS_LEVEL=debug
export AI_PROVIDER_ROUTER_LOGS_LEVEL=debug
# Optional full I/O payloads (requires _LOGS_LEVEL=verbose on the relevant package):
export AI_GATEWAY_VERBOSE=true
export AI_PROVIDER_ROUTER_VERBOSE=true
Build and publish
npm run build # ESM + CJS + defaults copy + CJS verify
npm run test:prepublish
Published files: dist/, dist-cjs/, config.defaults.json, README.md.
License
MIT — see package metadata.
