trpc-ai-providers

v0.0.2

Published

a month ago

tRPC procedures and core failover for multi-provider text and image LLMs

thmsmt

trpc llm ai openai anthropic gemini perplexity stability replicate streaming multi-provider failover

trpc-ai-providers

Plug in a bunch of text and image providers, let them fail over in order, and ship a tiny tRPC factory so you are not copy-pasting OpenAI boilerplate for the fifth time this week. API keys and vendor SDKs stay on the server; the /client entry is just subscription helpers, so there is no sneaking openai into the browser bundle.

The package stays intentionally low-level, but it includes two features that help in real apps:

- preflight diagnostics so you can see which providers are actually configured and in what order they will run;
- seeded routing so a conversation can stick to the same provider family instead of bouncing between model personalities.

v1 is still deliberately boring: no tool-calling circus. Streaming goes out as a subscription, so your transport needs to actually support that (SSE, WebSocket, whatever you already use).

Getting started

Install

pnpm add trpc-ai-providers
# or: npm install trpc-ai-providers

Peer dependencies: @trpc/server ^11, zod ^4. For browser streaming helpers, add @trpc/client ^11 and a subscription-capable tRPC link.

Five steps to a first response

1. Add the package and peers above.
2. Set at least one provider env var (Environment variables); otherwise you will have a bad time (no_providers_available).
3. Server: import trpc-ai-providers/server, bolt createAiProvidersProcedures onto your router, or call completeWithTextFailover directly if you are not feeling tRPC today.
4. Client (optional): use your typed trpc client; for chat streams, import helpers from trpc-ai-providers/client only (not /server).
5. Run your bundle analyzer once. If openai, @google/generative-ai, or @anthropic-ai/sdk land in client chunks, something went wrong: fix import paths before shipping (you owe yourself a coffee break to untangle it).

Import paths

| Entry | Use in | Contents |
| --- | --- | --- |
| trpc-ai-providers or /types | Shared | Zod schemas and types. Safe for forms and import type. |
| trpc-ai-providers/server | Server only | Procedures factory, failover, env helpers, diagnostics, seeded routing helpers, errors. |
| trpc-ai-providers/client | Browser only | runChatStreamSubscription, createChatStreamController, mapMessagesToChatStreamInput, parseChatStreamDelta, createChatStreamAbortController. Still no SDKs. |

Barebones server

A. One-shot completion (no tRPC)

import {
  completeWithTextFailover,
  textCredentialsFromEnv,
} from "trpc-ai-providers/server";

const result = await completeWithTextFailover({
  credentials: textCredentialsFromEnv(process.env),
  messages: [{ role: "user", content: "Say OK in one word." }],
});

// result.text, result.provider, result.model
// Optional on direct failover only: result.latencyMs, result.attempts (if any failures/skips),
// result.usage (OpenAI non-streaming maps API usage when present)

If nothing is configured, you get AiProviderFailoverError with kind: "no_providers_available".

B. tRPC router (chat + image)

import { initTRPC } from "@trpc/server";
import {
  createAiProvidersProcedures,
  textCredentialsFromEnv,
  imageCredentialsFromEnv,
} from "trpc-ai-providers/server";

const t = initTRPC.context<{ headers: Headers }>().create();

const ai = createAiProvidersProcedures({
  procedure: t.procedure,
  resolveCredentials: () => ({
    ...textCredentialsFromEnv(process.env),
    ...imageCredentialsFromEnv(process.env),
  }),
});

export const appRouter = t.router({
  ai: t.router(ai),
});

export type AppRouter = typeof appRouter;

Procedures under ai:

ai.chat.complete: mutation, full reply.
ai.chat.stream: subscription, { delta: string } chunks.
ai.image.generate: mutation, base64 image + media type.

Image providerOptions (tRPC input):

OpenAI: SDK snake_case extras (size, quality, output_format, …). prompt, model, and stream always come from the procedure / server after sanitization. OpenAI defaults to the GPT Image family (gpt-image-1 unless you pass openaiModel). DALL·E 2/3 still work via openaiModel with response_format: b64_json on the server.
Gemini (Nano Banana): native image via generateContent with server-forced responseModalities: ["TEXT","IMAGE"]. Optional providerOptions.gemini.generationConfig is merged after sanitization (client cannot override contents or responseModalities). Default image model is gemini-2.5-flash-image unless you pass geminiModel or set GEMINI_IMAGE_MODEL. Uses the same GEMINI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY as text Gemini.
Stability: extra multipart fields merged before server-set prompt, output_format, width, and height (those always win). The API model is fixed to stable-image-core; the low-level stabilityModel option on generateImageWithFailover is currently unused (reserved for a future release; use providerOptions.stability for vendor knobs instead).
Replicate: sanitized bag merged into the prediction input object; prompt always wins over the client. Polling respects the same AbortSignal as other providers (per-attempt signal and your overall budget), with no hidden extra wall-clock cap inside the Replicate adapter.

resolveCredentials must run on the server only. Never accept API keys from untrusted input.

Barebones client (streaming)

Use your app’s trpc setup. Import stream helpers from trpc-ai-providers/client only.

import {
  createChatStreamAbortController,
  mapMessagesToChatStreamInput,
  runChatStreamSubscription,
} from "trpc-ai-providers/client";

const ac = createChatStreamAbortController();

const { text, aborted } = await runChatStreamSubscription(trpc.ai.chat.stream, {
  messages: mapMessagesToChatStreamInput(rows),
  signal: ac.signal,
  onTextUpdate: (full) => setDisplayedText(full),
  // Same optional fields as the server: providerOrder, timeoutMs, models, generationOptions, providerOptions
});

// Call this from a cancel handler while the stream is still running:
// the pending promise then resolves with aborted: true instead of throwing.
ac.abort();

If signal is already aborted when you call runChatStreamSubscription, it resolves immediately with { text: "", aborted: true } without opening a subscription. Other failures still reject.
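For intuition about how { delta: string } chunks turn into the full string handed to onTextUpdate, here is a minimal sketch. It is not the library's code: toChunk and createTextAccumulator are hypothetical names, and skipping empty or malformed payloads is an assumption about sensible behavior rather than something the package documents.

```typescript
// Hypothetical sketch: illustrative names, not exports of trpc-ai-providers.
type ChatStreamChunk = { delta: string };

// Narrow an unknown wire payload to a chunk; returns null for anything else.
function toChunk(payload: unknown): ChatStreamChunk | null {
  if (typeof payload !== "object" || payload === null) return null;
  const delta = (payload as { delta?: unknown }).delta;
  return typeof delta === "string" ? { delta } : null;
}

// Accumulate deltas and report the full text so far after each non-empty chunk.
function createTextAccumulator(onTextUpdate: (full: string) => void) {
  let text = "";
  return {
    push(payload: unknown): void {
      const chunk = toChunk(payload);
      if (chunk === null || chunk.delta === "") return; // assumed: ignore junk and empty deltas
      text += chunk.delta;
      onTextUpdate(text);
    },
    get text(): string {
      return text;
    },
  };
}

// Usage: feed chunks in arrival order; the callback always sees the whole reply so far.
const updates: string[] = [];
const accumulator = createTextAccumulator((full) => {
  updates.push(full);
});
for (const payload of [{ delta: "Hel" }, { delta: "" }, { nope: 1 }, { delta: "lo" }]) {
  accumulator.push(payload);
}
```

Passing the full text (rather than the delta) to the callback is what lets a UI call setDisplayedText directly without keeping its own buffer.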

Stream lifecycle controller (one active run; a new start aborts the previous):

import {
  createChatStreamController,
  mapMessagesToChatStreamInput,
} from "trpc-ai-providers/client";

const chatStream = createChatStreamController({
  procedure: trpc.ai.chat.stream,
});

await chatStream.start({
  messages: mapMessagesToChatStreamInput(rows),
  onTextUpdate: (full) => setDisplayedText(full),
});

chatStream.stop();

Use this from a hook or service so the UI does not reimplement AbortController wiring. For custom signal merging, call runChatStreamSubscription directly.

App-side pattern: drive streams from the submit path; avoid auto-starting from useEffect on every message-list change.

Core API (no tRPC)

import {
  completeWithTextFailover,
  streamWithTextFailover,
  generateImageWithFailover,
  textCredentialsFromEnv,
  imageCredentialsFromEnv,
} from "trpc-ai-providers/server";

await generateImageWithFailover({
  credentials: imageCredentialsFromEnv(process.env),
  prompt: "A red circle on white",
  signal: AbortSignal.timeout(120_000),
  providerOptions: {
    openai: { size: "1024x1024", quality: "low" },
  },
});

Wire input and sanitization

tRPC input uses Zod; providerOptions and portable generationOptions extras are still untrusted client data on the wire. Before merge into vendor calls, the library:

- strips forbidden SDK keys per provider (so clients cannot override messages, model, prompt, etc. where the server owns them);
- drops unsafe object keys such as __proto__, constructor, and prototype;
- caps token-style numeric fields to a fixed ceiling.
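The three rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's sanitizer: sanitizeProviderOptions, the TOKEN_FIELDS list, and the 8192 ceiling are all made up for the example, and the sketch only handles the top level of the bag.

```typescript
// Hypothetical sketch: key lists and the ceiling are assumptions, not library values.
const UNSAFE_KEYS = new Set(["__proto__", "constructor", "prototype"]);
const TOKEN_FIELDS = new Set(["max_tokens", "maxOutputTokens"]); // assumed field names
const TOKEN_CEILING = 8192; // assumed ceiling

function sanitizeProviderOptions(
  input: unknown,
  forbiddenKeys: ReadonlySet<string>,
): Record<string, unknown> {
  // A null-prototype object cannot be used to reach Object.prototype.
  const out: Record<string, unknown> = Object.create(null);
  if (typeof input !== "object" || input === null || Array.isArray(input)) return out;

  for (const [key, value] of Object.entries(input)) {
    // Rules 1 and 2: drop server-owned keys and prototype-pollution keys.
    if (UNSAFE_KEYS.has(key) || forbiddenKeys.has(key)) continue;
    // Rule 3: clamp token-style numbers; drop them if they are not finite numbers.
    if (TOKEN_FIELDS.has(key)) {
      if (typeof value !== "number" || !Number.isFinite(value)) continue;
      out[key] = Math.min(Math.max(1, Math.floor(value)), TOKEN_CEILING);
      continue;
    }
    out[key] = value;
  }
  return out;
}

// JSON.parse is used here because it creates "__proto__" as an own property,
// which is how a hostile payload would arrive over the wire.
const hostile: unknown = JSON.parse(
  '{"__proto__":{"polluted":true},"model":"evil","size":"1024x1024","max_tokens":999999}',
);
const cleaned = sanitizeProviderOptions(hostile, new Set(["model", "prompt", "stream"]));
```

In this sketch only size and a clamped max_tokens survive; a real sanitizer would also need to walk nested objects such as a generationConfig bag.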

See SECURITY.md for where secrets must live and what not to log.

Environment variables

Text

| Variable | Purpose |
| --- | --- |
| OPENAI_API_KEY | OpenAI |
| GEMINI_API_KEY or GOOGLE_GENERATIVE_AI_API_KEY | Gemini |
| PERPLEXITY_API_KEY | Perplexity |
| ANTHROPIC_API_KEY | Anthropic |
| LLM_PROVIDER_ORDER | Comma list: openai,gemini,perplexity,anthropic |
| LLM_TIMEOUT_MS | Per-attempt timeout ms (default 60000) |
| OPENAI_MODEL, GEMINI_MODEL, PERPLEXITY_MODEL, ANTHROPIC_MODEL | Model overrides |

Image

| Variable | Purpose |
| --- | --- |
| STABILITY_API_KEY | Stability |
| REPLICATE_API_TOKEN | Replicate |
| IMAGE_PROVIDER_ORDER | Comma list: openai,gemini,stability,replicate |
| GEMINI_IMAGE_MODEL | Gemini image model override (default gemini-2.5-flash-image) |
| IMAGE_TIMEOUT_MS or LLM_TIMEOUT_MS | Default per-attempt image timeout (120000) |
| REPLICATE_IMAGE_MODEL | Replicate slug (default black-forest-labs/flux-schnell) |

Failover behavior (the short version)

Order: providerOrder on the request wins when it has valid ids (deduped, junk dropped); otherwise env; otherwise defaults.
Missing keys: skipped_missing_credentials, not “vendor down”.
Timeouts: Each attempt merges your signal with a per-attempt timer. On text helpers, attemptTimeoutMs beats timeoutMs when both are set; otherwise env defaults apply.
Streaming: Failover only before the first non-empty delta; then the current provider owns the rest of the stream.
Empty replies: May become empty_response and trigger failover when policy allows.
Cancel (text): Your abort stops the chain (complete may omit a signal; stream expects one).
Image: Packaged input.timeoutMs is per provider attempt. The image.generate mutation also applies a total wall-clock budget (300_000 ms) for the whole failover chain. When you call generateImageWithFailover directly, your signal is the only overall budget: use AbortSignal.timeout(...) for a wall clock, or AbortController for manual cancel. Manual cancel on that signal surfaces as failover kind: "aborted"; AbortSignal.timeout surfaces as kind: "timeout" when reason looks like a timeout (see abortSignalLooksLikeTimeout on trpc-ai-providers/server).
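To make the ordering rule concrete (request first, then env, then defaults, with duplicates and junk removed), here is a small sketch. The names normalizeOrder and resolveOrder are hypothetical, and trimming and lowercasing ids is an assumption; the package may normalize differently.

```typescript
// Hypothetical sketch of the order-resolution rule; not the library's implementation.
const TEXT_PROVIDERS = ["openai", "gemini", "perplexity", "anthropic"] as const;
type TextProvider = (typeof TEXT_PROVIDERS)[number];

function isTextProvider(id: string): id is TextProvider {
  return (TEXT_PROVIDERS as readonly string[]).indexOf(id) !== -1;
}

// Keep valid ids in first-seen order, without duplicates; drop everything else.
function normalizeOrder(ids: readonly string[]): TextProvider[] {
  const seen = new Set<TextProvider>();
  for (const raw of ids) {
    const id = raw.trim().toLowerCase(); // assumed normalization
    if (isTextProvider(id)) seen.add(id);
  }
  return Array.from(seen);
}

// Request order wins when it contains at least one valid id;
// otherwise the env comma list; otherwise the built-in default order.
function resolveOrder(requestOrder?: readonly string[], envOrder?: string): TextProvider[] {
  const fromRequest = normalizeOrder(requestOrder ?? []);
  if (fromRequest.length > 0) return fromRequest;
  const fromEnv = normalizeOrder((envOrder ?? "").split(","));
  if (fromEnv.length > 0) return fromEnv;
  return [...TEXT_PROVIDERS];
}

const fromRequestOrder = resolveOrder(["gemini", "bogus", "gemini", "openai"], "anthropic");
const fromEnvOrder = resolveOrder(["bogus"], "anthropic, openai");
const fromDefaults = resolveOrder();
```

Note that a request list made entirely of junk falls through to env rather than producing an empty chain, which matches the "wins when it has valid ids" wording above.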

Diagnostics / preflight

This is useful for health endpoints, startup logs, admin panels, and support tickets.

import {
  getAiProviderDiagnostics,
  imageCredentialsFromEnv,
  textCredentialsFromEnv,
} from "trpc-ai-providers/server";

const diagnostics = getAiProviderDiagnostics({
  text: {
    credentials: textCredentialsFromEnv(process.env),
    messages: [{ role: "user", content: "hello" }],
  },
  image: {
    credentials: imageCredentialsFromEnv(process.env),
    prompt: "A red circle on white",
  },
});

console.log(diagnostics.text?.resolvedOrder);
console.log(diagnostics.text?.providers);
console.log(diagnostics.image?.resolvedAvailableProviders);

What you get back:

- the default order,
- the resolved order after providerOrder / env / policy,
- the attempt timeout that will apply,
- a per-provider row with:
  - credentialConfigured,
  - includedInResolvedOrder,
  - priority,
  - resolved model,
  - simple reason like missing_credentials or not_in_resolved_order.

No secrets are returned: only booleans and model labels.
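As a rough picture of how such rows can be derived without ever echoing a key, consider the sketch below. The field names come from the list above; buildRows, the shape of the apiKeys argument, and the idea that priority is the index in the resolved order are assumptions made for illustration only.

```typescript
// Hypothetical sketch: field names follow the list above, everything else is assumed.
type DiagnosticRow = {
  provider: string;
  credentialConfigured: boolean;
  includedInResolvedOrder: boolean;
  priority: number | null; // assumed: position in the resolved order, null if absent
  reason?: "missing_credentials" | "not_in_resolved_order";
};

function buildRows(
  defaultOrder: readonly string[],
  resolvedOrder: readonly string[],
  apiKeys: Readonly<Record<string, string | undefined>>,
): DiagnosticRow[] {
  return defaultOrder.map((provider) => {
    // Only a boolean leaves this function; the key itself is never copied.
    const credentialConfigured = Boolean(apiKeys[provider]?.trim());
    const index = resolvedOrder.indexOf(provider);
    const row: DiagnosticRow = {
      provider,
      credentialConfigured,
      includedInResolvedOrder: index !== -1,
      priority: index === -1 ? null : index,
    };
    if (!credentialConfigured) row.reason = "missing_credentials";
    else if (index === -1) row.reason = "not_in_resolved_order";
    return row;
  });
}

const diagnosticRows = buildRows(
  ["openai", "gemini", "anthropic"],
  ["anthropic", "openai"],
  { openai: "sk-example", anthropic: "" },
);
```

Serializing diagnosticRows for a health endpoint is safe in this sketch because the key string never enters the row objects.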

Seeded routing / conversation affinity

This lets you keep a conversation pinned to a stable provider preference using a seed such as:

conversation id,
tenant id,
user id,
experiment bucket.

Under the hood it uses weighted rendezvous hashing: the order is deterministic, spreads traffic fairly, stays stable for the same seed, and still works with failover.

import type { TextProviderId } from "trpc-ai-providers";
import {
  completeWithTextFailover,
  createSeededProviderPolicy,
  textCredentialsFromEnv,
} from "trpc-ai-providers/server";

const providerPolicy = createSeededProviderPolicy<TextProviderId>({
  seed: "conversation:42",
  preferred: ["anthropic"],
  weights: {
    anthropic: 3,
    openai: 2,
    gemini: 1,
    perplexity: 1,
  },
});

const result = await completeWithTextFailover({
  credentials: textCredentialsFromEnv(process.env),
  messages: [{ role: "user", content: "Write a tiny limerick." }],
  providerPolicy,
});

You can also derive the seed from the policy context:

import type { TextProviderId } from "trpc-ai-providers";
import { createSeededProviderPolicy } from "trpc-ai-providers/server";

const providerPolicy = createSeededProviderPolicy<TextProviderId>({
  seed: ({ messages }) => {
    const firstUserMessage = messages?.find((m) => m.role === "user");
    return firstUserMessage?.content.slice(0, 40) ?? "anonymous";
  },
});
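If weighted rendezvous hashing is new to you, the sketch below shows the general technique on its own. It is not the package's implementation: rankBySeed and fnv1a are hypothetical helpers, the FNV-1a hash is chosen only because it fits in a few lines (a production version would likely want a stronger hash), and the preferred option from the example above is not modeled.

```typescript
// Hypothetical sketch of weighted rendezvous (highest-random-weight) hashing.

// FNV-1a over UTF-16 code units: deterministic, not cryptographic.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // unsigned 32-bit
}

// Rank ids for a seed. Each id gets score = -weight / ln(u), where u is a
// pseudo-random number in (0, 1) derived from (seed, id). Higher score first.
function rankBySeed<Id extends string>(seed: string, weights: Record<Id, number>): Id[] {
  const ids = (Object.keys(weights) as Id[]).filter((id) => weights[id] > 0);
  const scored = ids.map((id) => {
    const u = (fnv1a(`${seed}\u0000${id}`) + 1) / (0x100000000 + 1); // strictly inside (0, 1)
    return { id, score: -weights[id] / Math.log(u) };
  });
  return scored
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id))
    .map((entry) => entry.id);
}

const fullRanking = rankBySeed("conversation:42", {
  anthropic: 3,
  openai: 2,
  gemini: 1,
  perplexity: 1,
});
const sameSeedAgain = rankBySeed("conversation:42", {
  anthropic: 3,
  openai: 2,
  gemini: 1,
  perplexity: 1,
});
// Dropping one provider leaves the relative order of the others untouched,
// because every score depends only on (seed, id, weight).
const withoutAnthropic = rankBySeed("conversation:42", {
  openai: 2,
  gemini: 1,
  perplexity: 1,
});
```

That last property is why this approach plays well with failover: if the top-ranked provider is unavailable, the rest of the chain for that seed is the same as it would have been anyway.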

Observability (optional)

Pass observer on completeWithTextFailover, streamWithTextFailover, or generateImageWithFailover:

onAttempt: after each failure/skip (attemptIndex). No secrets or raw bodies.
onSuccess: once on success with provider, model, latencyMs, and prior failure/skip attempts only.

Observer callbacks are wrapped in try/catch so they cannot break failover.
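The try/catch wrapping mentioned above amounts to something like this sketch. The event fields other than attemptIndex, and the helper name notifyAttempt, are assumptions for illustration; check the package types for the real shapes.

```typescript
// Hypothetical sketch: only attemptIndex is taken from the text above.
type AttemptEvent = { attemptIndex: number; provider: string; kind: string };
type FailoverObserver = { onAttempt?: (event: AttemptEvent) => void };

// Invoke the observer, but never let a buggy callback interrupt the failover chain.
function notifyAttempt(observer: FailoverObserver | undefined, event: AttemptEvent): void {
  try {
    observer?.onAttempt?.(event);
  } catch {
    // Intentionally ignored: observability must not change request outcomes.
  }
}

const seenEvents: AttemptEvent[] = [];
const recordingObserver: FailoverObserver = {
  onAttempt: (event) => {
    seenEvents.push(event);
  },
};
const throwingObserver: FailoverObserver = {
  onAttempt: () => {
    throw new Error("metrics backend is down");
  },
};

notifyAttempt(recordingObserver, { attemptIndex: 0, provider: "openai", kind: "timeout" });
notifyAttempt(throwingObserver, { attemptIndex: 1, provider: "gemini", kind: "provider_error" });
notifyAttempt(undefined, { attemptIndex: 2, provider: "anthropic", kind: "empty_response" });
```

The practical upshot: it is fine to do logging or metrics work in these callbacks, but do not rely on them to veto or retry an attempt.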

Provider policy (low-level only)

Optional synchronous providerPolicy(ctx) on failover option objects returns a provider id list. It runs once per request with defaultOrder and previousAttempts: []. Output is normalized like providerOrder. Not exposed on packaged tRPC inputs in this release.

If you do not want to write your own policy function, use createSeededProviderPolicy from /server.

Usage metadata

UsageMetadata on LlmResult.usage appears when the vendor returns usage (OpenAI non-streaming complete maps usage). Packaged chat.complete still returns only text / provider / model; use completeWithTextFailover if you need usage on the wire.

Errors and tRPC codes

Failures are AiProviderFailoverError with kind and attempts[]. Do not branch on message strings.

tRPC mapping:

no_providers_available, config_error, skipped_missing_credentials → BAD_REQUEST
aborted → CLIENT_CLOSED_REQUEST
timeout, provider_error, empty_response → INTERNAL_SERVER_ERROR
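Written out as code, the mapping above looks like the sketch below. The FailoverKind union here is just the kinds listed in this section, and toTrpcCode is a hypothetical helper name; the package may expose its own types for this.

```typescript
// Hypothetical sketch: a typed version of the kind-to-code table above.
type FailoverKind =
  | "no_providers_available"
  | "config_error"
  | "skipped_missing_credentials"
  | "aborted"
  | "timeout"
  | "provider_error"
  | "empty_response";

type TrpcCode = "BAD_REQUEST" | "CLIENT_CLOSED_REQUEST" | "INTERNAL_SERVER_ERROR";

// Exhaustive switch: adding a new kind to the union makes this fail to compile
// until the mapping is updated.
function toTrpcCode(kind: FailoverKind): TrpcCode {
  switch (kind) {
    case "no_providers_available":
    case "config_error":
    case "skipped_missing_credentials":
      return "BAD_REQUEST";
    case "aborted":
      return "CLIENT_CLOSED_REQUEST";
    case "timeout":
    case "provider_error":
    case "empty_response":
      return "INTERNAL_SERVER_ERROR";
  }
}

const codeForAbort = toTrpcCode("aborted");
const codeForTimeout = toTrpcCode("timeout");
const codeForMissingKeys = toTrpcCode("skipped_missing_credentials");
```

Branching on kind this way, rather than on message text, keeps client handling stable even if error wording changes between releases.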

Keep messages readable; never paste raw keys into them.

AI-facing docs

README.md for humans.
LLMS.md for agents / code assistants: public entrypoints, server/client import boundaries, security invariants, and safe change patterns.

Publishing checklist

pnpm run build in this package (tsc emits dist/).
pnpm run check for a no-emit TS pass.
package.json files lists dist, README.md, LICENSE, CHANGELOG.md, SECURITY.md, LLMS.md.
Peers: @trpc/server ^11, zod ^4, optional @trpc/client ^11 for /client.

Vendor docs

OpenAI
Gemini (text + native image / “Nano Banana” models)
Perplexity
Anthropic
Stability
Replicate
tRPC subscriptions

License

MIT
