@pkent/aigateway

v1.2.0

Published

3 days ago

A provider-neutral LLM client library for OpenAI, Anthropic, Qwen, GLM, and OpenRouter


pkent

llm openai anthropic openrouter qwen glm gateway ai

AIGateway

A provider-neutral LLM client library. One class routes chat / vision / streaming requests to OpenAI, Anthropic, Qwen (Alibaba DashScope), GLM (Zhipu AI), OpenRouter, or an OpenAI-compatible gateway (aibroker) based on the model name, and normalizes every response into a single shape.

No server, no .env — all configuration is passed into the constructor.

Install

npm install @pkent/aigateway

Requires Node.js 18+ (uses native ESM and async iterators). The package is ES-module only.

Quick start

import AIGateway from '@pkent/aigateway';

const g = new AIGateway('anthropic/claude-opus-4.8', process.env.ANTHROPIC_API_KEY);

const res = await g.chat([{ role: 'user', content: 'Hello' }]);
console.log(res.content[0].text);
console.log(res.usage.total_tokens);

Models are addressed as <provider>/<model> (e.g. anthropic/claude-opus-4.8, openai/gpt-4o). The leading provider segment selects the provider; the rest is the model id sent upstream.

One instance is bound to one model and one API key. To use a different model, construct another instance.
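Because each instance is bound to a single model, an application that talks to several models may want to build each instance once and reuse it. The following is a minimal sketch of that idea, not part of the package: `createGatewayCache` and its `create` callback are hypothetical names, and the callback is where you would call `new AIGateway(model, key)` with whatever key lookup suits your setup.

```javascript
// Hypothetical helper (not provided by @pkent/aigateway): memoizes one
// gateway instance per model id. `create` is any function that builds an
// instance, for example:
//   (model) => new AIGateway(model, keyFor(model))   // keyFor is assumed
function createGatewayCache(create) {
  const cache = new Map();
  return function gatewayFor(model) {
    if (!cache.has(model)) {
      cache.set(model, create(model)); // built on first use only
    }
    return cache.get(model);
  };
}
```

The cache is keyed by model id only, so it assumes one API key per model.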

Constructor

new AIGateway(model, key, options?)

model (string, required) — <provider>/<model>. The provider segment selects the provider (see the routing table); an unknown or missing segment throws AIGatewayError (code: 'unsupported_model').
key (string, required) — the API key for the resolved provider.
options (object, optional):

| Option | Type | Applies to | Default | Description |
|-----------|--------|-----------------|---------------------|-------------|
| baseURL | string | all | provider default | Override the upstream endpoint (regional endpoints, proxies, self-hosted). Required for aibroker (no default). |
| maxTokens | number | all | unset (see below) | Default max output tokens; a per-call maxTokens overrides it. |
| timeout | number | all | unset (SDK default) | Default per-call request timeout in ms; a per-call timeout overrides it. |
| referer | string | OpenRouter only | omitted | Sent as the HTTP-Referer attribution header. |
| title | string | OpenRouter only | omitted | Sent as the X-Title attribution header. |
| client | object | advanced | n/a | Inject a pre-built SDK client (or a compatible fake for testing). When set, key/baseURL/referer/title are not used to build a client. |

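The constructor performs no network calls. It throws AIGatewayError for an invalid model (code: 'invalid_model'), an unknown or missing provider segment (code: 'unsupported_model'), a missing key (code: 'invalid_api_key', unless a client is injected), or a missing baseURL for aibroker (code: 'missing_base_url').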

Read-only properties & discovery

g.model            // the bound model string, e.g. 'anthropic/claude-opus-4.8'
g.provider         // resolved provider id, e.g. 'anthropic'
AIGateway.providers()
// => [
//   { id: 'openrouter', prefix: 'openrouter/' },
//   { id: 'anthropic',  prefix: 'anthropic/' },
//   { id: 'qwen',       prefix: 'qwen/' },
//   { id: 'glm',        prefix: 'glm/' },
//   { id: 'openai',     prefix: 'openai/' },
//   { id: 'aibroker',   prefix: 'aibroker/' }
// ]
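The prefix list can be used to check a user-supplied model id before constructing an instance. The sketch below is an illustration rather than the library's own resolver: `resolveProvider` is a hypothetical helper, and returning `null` (instead of throwing) for an unknown prefix or an empty model id is a choice made here for convenience.

```javascript
// Hypothetical pre-check (not the library's resolver). `providers` is the
// array returned by AIGateway.providers(), i.e. entries shaped like
// { id: 'openai', prefix: 'openai/' }.
function resolveProvider(modelId, providers) {
  if (typeof modelId !== 'string') return null;
  const match = providers.find((p) => modelId.startsWith(p.prefix));
  if (!match) return null; // unknown or missing provider segment
  // Only the leading provider segment is removed; the rest is kept verbatim.
  const upstreamModel = modelId.slice(match.prefix.length);
  if (upstreamModel === '') return null; // e.g. "openai/" with no model id
  return { provider: match.id, upstreamModel };
}
```

With the list shown above, `resolveProvider('openrouter/anthropic/claude-opus-4.8', AIGateway.providers())` yields provider `openrouter` and upstream model `anthropic/claude-opus-4.8`.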

Methods

`await g.chat(messages, options?)`

`await g.vision(messages, options?)`

vision is chat with image content blocks in messages. Both resolve to the normalized v2 response described under Response shape. options may include temperature, maxTokens, responseFormat, signal (an AbortSignal to cancel the request), and timeout (ms). responseFormat is mapped to the OpenAI-compatible response_format parameter and is ignored by Anthropic.

const g = new AIGateway('openai/gpt-4o', OPENAI_KEY);
const res = await g.vision([
  {
    role: 'user',
    content: [
      { type: 'text', text: 'What is in this image?' },
      { type: 'image_url', image_url: { url: 'https://example.com/cat.png' } },
    ],
  },
], { temperature: 0.2 });
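When vision messages are built in several places, a small builder keeps the block shape consistent. This is a minimal sketch: `imageMessage` is a hypothetical helper that only reproduces the `text` and `image_url` block shapes used in the example above. Whether several images per message are accepted depends on the upstream model.

```javascript
// Hypothetical helper: builds a user message with one text block followed
// by one image_url block per URL, matching the shapes shown above.
function imageMessage(text, imageUrls) {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      ...imageUrls.map((url) => ({ type: 'image_url', image_url: { url } })),
    ],
  };
}

// Usage (g is an existing AIGateway instance):
//   const res = await g.vision([
//     imageMessage('What is in this image?', ['https://example.com/cat.png']),
//   ]);
```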

`g.stream(messages, options?)`

Returns a ChatStream (not a promise). It is async-iterable — yielding { type: 'text_delta', text } deltas — and exposes a .final promise that resolves to the full response once streaming completes.

const stream = g.stream([{ role: 'user', content: 'Write a haiku' }], { temperature: 0.7 });

for await (const delta of stream) {
  process.stdout.write(delta.text);
}

const final = await stream.final;
console.log('\n', final.stop_reason, final.usage);

If the upstream request fails, the iterator throws and .final rejects with the same error. You may await .final without iterating (deltas buffer in memory), or iterate without awaiting .final.

stream accepts the same signal and timeout options as chat and vision; aborting the signal makes the iterator throw and rejects .final.
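Iterating and awaiting .final can be combined in one helper when you want both live output and the completed response. The following is a sketch under the contract described above (an async-iterable of `text_delta` events plus a `.final` promise); `collectStream` and its `onText` callback are hypothetical names, not part of the package.

```javascript
// Hypothetical helper: forwards each text delta to an optional callback,
// accumulates the full text, then waits for the final response.
async function collectStream(stream, onText) {
  let text = '';
  for await (const delta of stream) {
    if (delta.type !== 'text_delta') continue; // skip anything unexpected
    text += delta.text;
    if (onText) onText(delta.text);
  }
  const final = await stream.final;
  return { text, final };
}

// Usage (g is an existing AIGateway instance):
//   const { text, final } = await collectStream(
//     g.stream([{ role: 'user', content: 'Write a haiku' }]),
//     (chunk) => process.stdout.write(chunk),
//   );
```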

Model routing

Every model is addressed as <provider>/<model>. The leading provider segment selects the provider and is stripped before the upstream call. There is no catch-all — an unknown or missing segment throws AIGatewayError (code: 'unsupported_model').

| Model id | Provider | Upstream model | Notes |
|-----------------------------|------------|------------------|-------|
| anthropic/<model> | anthropic | <model> | Anthropic SDK. |
| openai/<model> | openai | <model> | OpenAI-compatible. |
| qwen/<model> | qwen | <model> | OpenAI-compatible. Default base URL https://dashscope-intl.aliyuncs.com/compatible-mode/v1. |
| glm/<model> | glm | <model> | OpenAI-compatible. Default base URL https://open.bigmodel.cn/api/paas/v4. |
| openrouter/<vendor>/<model> | openrouter | <vendor>/<model> | Only the openrouter/ segment is stripped, leaving OpenRouter's native id. Default base URL https://openrouter.ai/api/v1. |
| aibroker/<remainder> | aibroker | <remainder> | OpenAI-compatible gateway meta-provider. Only the aibroker/ segment is stripped; the remainder is forwarded verbatim and the gateway does its own routing. Requires baseURL (no default); the constructor throws AIGatewayError (code: 'missing_base_url') without it. |

OpenRouter is opt-in: send openrouter/anthropic/claude-opus-4.8 to route through OpenRouter, or anthropic/claude-opus-4.8 to hit Anthropic directly.

aibroker is a gateway meta-provider: it forwards to any OpenAI-compatible gateway you point it at, so it embeds no host and requires a baseURL. Only the leading aibroker/ segment is stripped; everything after it is sent upstream as the model id, letting the gateway do its own routing (aibroker/openai/chatgpt-5.5 → openai/chatgpt-5.5, aibroker/openrouter/openai/chatgpt-5.5 → openrouter/openai/chatgpt-5.5). The key is passed straight through as the Bearer token for the gateway.

const g = new AIGateway('aibroker/openai/chatgpt-5.5', GATEWAY_TOKEN, {
  baseURL: 'https://your-gateway/v1',
});

Adding a provider: drop a module into src/providers/ exporting { id, prefix, matches, create } (where matches tests the <id>/ segment) and register it in src/providers/registry.js. create(config) returns { id, chat, vision, stream }.

Response shape

{
  "id": "msg_01ABCXYZ",
  "object": "response",
  "created": 1776675600,
  "provider": "anthropic",
  "model": "claude-sonnet-4-5",
  "role": "assistant",
  "stop_reason": "end_turn",
  "content": [{ "type": "text", "text": "Hello! How can I help you today?" }],
  "usage": {
    "input_tokens": 10,
    "output_tokens": 9,
    "total_tokens": 19,
    "cached_input_tokens": 6,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 6
  }
}

Cache-related usage fields are optional and only present when the upstream provider reports them.
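Since every provider is normalized to this shape, small accessors can be written once and reused. The helpers below are a sketch and not part of the package: `responseText` and `cacheUsage` are hypothetical names, and they rely only on the fields shown in the example response.

```javascript
// Hypothetical helper: concatenates the text of all text blocks.
function responseText(res) {
  return res.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
}

// Hypothetical helper: the cache fields are optional, so default them to 0
// when the upstream provider did not report them.
function cacheUsage(usage) {
  return {
    cached: usage.cached_input_tokens ?? 0,
    created: usage.cache_creation_input_tokens ?? 0,
    read: usage.cache_read_input_tokens ?? 0,
  };
}
```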

Cache hints

Request content blocks may include a provider-neutral cache hint:

cache: true — prefer caching this stable block.
cache: false or omission — no cache hint.

Mapping:

Anthropic — cache: true on a text block becomes cache_control.
OpenAI-compatible providers — the hint is accepted but stripped before the upstream request, so caller intent is preserved without breaking compatibility.

await g.chat([
  {
    role: 'system',
    content: [
      { type: 'text', text: 'Core instructions', cache: true },
      { type: 'text', text: 'Output valid JSON', cache: true },
    ],
  },
  { role: 'user', content: [{ type: 'text', text: 'Live request data' }] },
]);
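To make the mapping concrete, here is a rough sketch of what the two translations could look like for a single content block. It is an illustration only and may differ from the library's internals: both function names are hypothetical, and the `{ type: 'ephemeral' }` value is the form Anthropic's API uses for `cache_control`.

```javascript
// Illustrative only (not the library's code). For Anthropic, a text block
// with cache: true gains a cache_control marker; the neutral hint is removed.
function toAnthropicBlock(block) {
  const { cache, ...rest } = block;
  return cache === true && rest.type === 'text'
    ? { ...rest, cache_control: { type: 'ephemeral' } }
    : rest;
}

// Illustrative only. For OpenAI-compatible providers the hint is dropped so
// the upstream request contains no unknown field.
function stripCacheHint(block) {
  const { cache, ...rest } = block;
  return rest;
}
```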

`maxTokens` behavior

maxTokens defaults to unset, which preserves each provider's native behavior:

Anthropic requires max_tokens, so it falls back to 4096 when none is supplied.
OpenAI-compatible providers do not require a cap, so none is sent — output is not truncated by the library.

Supply maxTokens (in the constructor or per call) to apply a cap to every provider. A per-call value overrides the constructor default.
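The precedence described above can be summarized in a few lines. This is a sketch of the documented rules, not the library's source; `resolveMaxTokens` is a hypothetical name, and `undefined` stands for "send no cap upstream".

```javascript
// Sketch of the documented precedence: per-call value, then constructor
// default, then the Anthropic-only fallback of 4096.
const ANTHROPIC_FALLBACK_MAX_TOKENS = 4096;

function resolveMaxTokens(provider, constructorDefault, perCall) {
  const requested = perCall ?? constructorDefault;
  if (requested != null) return requested; // an explicit cap applies to every provider
  return provider === 'anthropic' ? ANTHROPIC_FALLBACK_MAX_TOKENS : undefined;
}
```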

Cancellation & timeouts

Pass an AbortSignal as signal to cancel an in-flight request; pass timeout (ms) to bound it. Both are forwarded to the underlying provider SDK's request options and apply to chat, vision, and stream.

const controller = new AbortController();
const res = g.chat(messages, { signal: controller.signal });
// ...later:
controller.abort();   // res rejects with the SDK's APIUserAbortError

For a hard wall-clock cap that also honors an external cancel, combine a timeout signal with your own:

const signal = AbortSignal.any([AbortSignal.timeout(120_000), runSignal]);
const res = await g.chat(messages, { signal });

The timeout option applies to each attempt separately, and the SDK may retry a timed-out attempt, so the total elapsed time can exceed it. Prefer a combined signal (as above) when you need a guaranteed wall-clock bound. Aborting a stream rejects both the async iterator and the .final promise.
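On runtimes where AbortSignal.any is not available, the same wall-clock bound can be built by hand with an AbortController and a timer. The following is a minimal sketch, assuming only that `g.chat` honors the `signal` option as described above; `chatWithDeadline` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: runs g.chat under a hard wall-clock deadline while
// still honoring an optional caller-supplied AbortSignal.
async function chatWithDeadline(g, messages, ms, externalSignal) {
  const controller = new AbortController();
  const onAbort = () => controller.abort(externalSignal.reason);
  if (externalSignal) {
    if (externalSignal.aborted) controller.abort(externalSignal.reason);
    else externalSignal.addEventListener('abort', onAbort, { once: true });
  }
  const timer = setTimeout(
    () => controller.abort(new Error(`deadline of ${ms} ms exceeded`)),
    ms,
  );
  try {
    return await g.chat(messages, { signal: controller.signal });
  } finally {
    // Always release the timer and listener so nothing outlives the call.
    clearTimeout(timer);
    if (externalSignal) externalSignal.removeEventListener('abort', onAbort);
  }
}
```

The rejection you observe is whatever the underlying call throws when its signal aborts, so with a real provider expect the SDK's abort error rather than the deadline message.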

Errors

Input errors (invalid model, key, baseURL, or messages) throw AIGatewayError with a code: invalid_model, unsupported_model, invalid_api_key, missing_base_url, invalid_messages, invalid_message, invalid_message_role, or invalid_message_content.
Upstream/provider errors propagate as the underlying SDK error — chat / vision reject; stream throws on the iterator and rejects .final.
An aborted request rejects with the SDK's APIUserAbortError; a timed-out request with APIConnectionTimeoutError. Like other provider errors, these propagate unchanged (not wrapped in AIGatewayError).

import AIGateway, { AIGatewayError } from '@pkent/aigateway';

try {
  await g.chat(messages);
} catch (err) {
  if (err instanceof AIGatewayError) {
    console.error('Bad request:', err.code, err.message);
  } else {
    console.error('Provider error:', err);
  }
}
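If you need to branch on aborts and timeouts as well, the error categories above can be folded into one classifier. The sketch below is hypothetical and matches on the constructor name so that it needs no SDK imports; that is an assumption that can break under minification or renamed classes, so prefer `instanceof` checks where the error classes are importable.

```javascript
// Hypothetical classifier based on the error categories listed above.
// ASSUMPTION: each error's constructor name matches the class name.
function classifyError(err) {
  const name = err?.constructor?.name;
  if (name === 'AIGatewayError') return { kind: 'input', code: err.code };
  if (name === 'APIUserAbortError') return { kind: 'aborted' };
  if (name === 'APIConnectionTimeoutError') return { kind: 'timeout' };
  return { kind: 'provider' }; // any other upstream or SDK error
}
```

A caller might retry on `timeout`, stay silent on `aborted`, and surface `input` errors to the developer with their `code`.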

Testing

npm test

Tests run fully offline via injected fake clients (options.client). They cover provider resolution, constructor validation, the v2 response shape for chat / vision, streaming deltas + .final, error propagation, the OpenRouter prefix strip, cache-hint mapping, and maxTokens behavior.
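The same injection point can be used in your own tests. The sketch below builds a fake for an OpenAI-compatible provider and rests on an assumption this README does not confirm: that the provider calls `client.chat.completions.create(params)`, as the official OpenAI SDK client exposes. The returned object follows the OpenAI chat completion response format; `createFakeOpenAIClient` is a hypothetical name.

```javascript
// Hypothetical fake client. ASSUMPTION: the OpenAI-compatible providers call
// client.chat.completions.create(params); verify against the package source.
function createFakeOpenAIClient(replyText) {
  const calls = []; // records every request for later assertions
  return {
    calls,
    chat: {
      completions: {
        async create(params) {
          calls.push(params);
          return {
            id: 'chatcmpl-fake',
            object: 'chat.completion',
            created: 0,
            model: params.model,
            choices: [
              {
                index: 0,
                message: { role: 'assistant', content: replyText },
                finish_reason: 'stop',
              },
            ],
            usage: { prompt_tokens: 1, completion_tokens: 1, total_tokens: 2 },
          };
        },
      },
    },
  };
}

// Usage (the key is not needed when a client is injected):
//   const fake = createFakeOpenAIClient('hi');
//   const g = new AIGateway('openai/gpt-4o', undefined, { client: fake });
```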
