@salesforce/llm-gateway-sdk
v0.14.0
Salesforce LLM Gateway SDK
Typed SDK for the Salesforce LLM Gateway API. Provides an HTTP client with JWT management, model abstraction, streaming (SSE) support, and request/response tracing.
Quick Start
Closed source. This package is published to npm under the Salesforce Public Code License and is for use by Salesforce only.
import { createJWT, createLLMGatewayClient } from '@salesforce/llm-gateway-sdk';
// 1. Create a JWT and client
const jwt = await createJWT({
accessToken: 'your-access-token',
instanceUrl: 'https://your-instance.salesforce.com',
});
const client = createLLMGatewayClient({ jwt });
// 2. Send a non-streaming chat request
const response = await client.chat({
messages: [{ role: 'user', content: 'Hello!' }],
generation_settings: { max_tokens: 100 },
});
console.log(response.data.generatedText);
// 3. Or stream the response
const stream = await client.chatStream({
messages: [{ role: 'user', content: 'Tell me a story.' }],
generation_settings: { max_tokens: 500 },
});
for await (const chunk of stream.data) {
if (chunk.generatedText) process.stdout.write(chunk.generatedText);
if (chunk.done) console.log('\nUsage:', chunk.usage);
}
API Reference
createLLMGatewayClient(options): LLMGatewayClient
Creates a ready-to-use client. This is the recommended entry point.
import { createJWT, createLLMGatewayClient, Models, ModelName, SfApiEnv } from '@salesforce/llm-gateway-sdk';
const jwt = await createJWT({
accessToken: 'your-access-token',
instanceUrl: 'https://your-instance.salesforce.com',
});
const client = createLLMGatewayClient({ jwt, env: SfApiEnv.Prod });
client.setModel(Models.getByName(ModelName.GPT_5));
Accepts GatewayClientOptions:
| Option | Type | Default | Description |
| -------------------------- | ----------------------- | ------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| jwt | JSONWebToken | — | JWT for authentication (required). |
| env? | SfApiEnv | prod | API environment. |
| basePath? | string | LLMG_BASE_PATH env var or '/einstein/gpt/code/v1.1' | Override the API base path. See route selection rationale. |
| clientFeatureIdOverride? | string | — | Override x-client-feature-id header instead of using the JWT minting feature ID. |
| retry? | RetryOptions \| false | see Retries | Retry policy for transient failures. Pass false to disable retries entirely. |
LLMGatewayClient (interface)
The core contract consumers program against. All factory functions return this type.
interface LLMGatewayClient {
getModel(): Model;
setModel(model: Model): void;
chat(request: ChatRequest, options?: ChatOptions): Promise<ChatResponse>;
chatStream(request: ChatRequest, options?: ChatOptions): Promise<ChatStreamResponse>;
onTrace(callback: (event: TraceEvent) => void): Unsubscribe;
onTelemetry(callback: TelemetryEventCallback): Unsubscribe;
onLog(callback: (record: LogRecord) => void): Unsubscribe;
startTracing(filePath: string): void;
stopTracing(): void;
dispose(): void;
}
ChatOptions
type ChatOptions = {
abortSignal?: AbortSignal;
};
Models
abstract class Model {
abstract readonly name: ModelName;
abstract readonly displayId: string;
abstract readonly maxInputTokens: number;
abstract readonly maxOutputTokens: number;
abstract readonly contextWindow: number;
abstract readonly supportsPromptCache: boolean;
// The file formats this model accepts, with per-format caps. `[]` = text-only.
abstract readonly supportedFormats: readonly SupportedFileFormat[];
}
type SupportedFileFormat = {
name: 'png' | 'jpeg' | 'pdf';
mimeType: MimeType; // 'image/png' | 'image/jpeg' | 'application/pdf'
maxBytesPerFile?: number; // per-format byte cap; unset = only the global 15 MiB applies
maxFilesPerRequest?: number; // per-format count cap; unset = only the global 10-file cap applies
};
enum ModelName {
GPT_5 = 'llmgateway__OpenAIGPT5',
GPT_5_4 = 'llmgateway__OpenAIGPT54',
GPT_5_5 = 'llmgateway__OpenAIGPT55',
CLAUDE_SONNET_4_5 = 'llmgateway__BedrockAnthropicClaude45Sonnet',
CLAUDE_SONNET_4_6 = 'llmgateway__BedrockAnthropicClaude46Sonnet',
CLAUDE_OPUS_4_5 = 'llmgateway__BedrockAnthropicClaude45Opus',
CLAUDE_OPUS_4_6 = 'llmgateway__BedrockAnthropicClaude46Opus',
CLAUDE_OPUS_4_7 = 'llmgateway__BedrockAnthropicClaude47Opus',
}
A model's supportedFormats array is the single source of truth for multimodal capability and caps. A file is accepted
only if its mimeType matches a declared format; an empty array means the model is text-only.
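The acceptance rule above can be sketched as follows. This is a minimal illustration, not SDK code: the types are local stand-ins for the documented shapes, and the helpers `isTextOnly` and `acceptsMimeType` are hypothetical names, not SDK exports.

```typescript
// Local stand-ins mirroring the shapes documented above; not imported from the SDK.
type MimeTypeString = 'image/png' | 'image/jpeg' | 'application/pdf';
type SupportedFileFormat = {
  name: 'png' | 'jpeg' | 'pdf';
  mimeType: MimeTypeString;
  maxBytesPerFile?: number;
  maxFilesPerRequest?: number;
};
type ModelLike = { supportedFormats: readonly SupportedFileFormat[] };

// Hypothetical helper: an empty supportedFormats array means the model is text-only.
function isTextOnly(model: ModelLike): boolean {
  return model.supportedFormats.length === 0;
}

// Hypothetical helper: a file is accepted only if its mimeType matches a declared format.
function acceptsMimeType(model: ModelLike, mimeType: string): boolean {
  return model.supportedFormats.some((format) => format.mimeType === mimeType);
}

const textOnlyModel: ModelLike = { supportedFormats: [] };
const pngOnlyModel: ModelLike = {
  supportedFormats: [{ name: 'png', mimeType: 'image/png' }],
};

console.log(isTextOnly(textOnlyModel)); // true
console.log(acceptsMimeType(pngOnlyModel, 'image/png')); // true
console.log(acceptsMimeType(pngOnlyModel, 'application/pdf')); // false
```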
Built-in Models
| Model | ModelName | Context Window | Max Output | Images | PDFs | Prompt Cache |
| ----------------- | ------------------- | -------------- | ---------- | ------ | ---- | ------------ |
| GPT-5 | GPT_5 | 272K | 128K | Yes | Yes | No |
| GPT-5.4 | GPT_5_4 | 1.05M | 128K | Yes | Yes | No |
| GPT-5.5 (BETA) | GPT_5_5 | 1.05M | 128K | Yes | Yes | No |
| Claude Sonnet 4.5 | CLAUDE_SONNET_4_5 | 200K | 8192 | Yes | Yes | Yes |
| Claude Sonnet 4.6 | CLAUDE_SONNET_4_6 | 200K | 16384 | Yes | Yes | Yes |
| Claude Opus 4.5 | CLAUDE_OPUS_4_5 | 200K | 64K | Yes | Yes | Yes |
| Claude Opus 4.6 | CLAUDE_OPUS_4_6 | 1M | 128K | Yes | Yes | Yes |
| Claude Opus 4.7 | CLAUDE_OPUS_4_7 | 1M | 128K | Yes | Yes | Yes |
GPT-5.5 is gated behind the AIModelBetaEnabled org preference; orgs without that gate enabled will get a gateway error
at request time.
Per-format multimodal caps for Claude 4.x on Bedrock: 3.75 MiB/PNG, 3.75 MiB/JPEG, 4.5 MiB/PDF, and a 5-PDF per-request cap. GPT-5 / GPT-5.4 / GPT-5.5 declare the same three formats but no per-format caps — only the global 15 MiB / 10-file caps bind, since no per-file rejection has been observed on GPT-5 below the gateway's raw-payload wall.
Models Registry
const Models = {
getDefault(): Model; // ClaudeSonnet46
getByName(name: ModelName): Model;
};
createClaudeModel (forward-compat escape hatch)
When the LLM Gateway publishes a new Bedrock-Anthropic Claude variant before this SDK has been updated, consumers can
opt into it by name without waiting for a release. The factory builds a Model instance using the shared
AnthropicClaudeResponseProcessor (correct for every Bedrock-Claude variant the gateway hosts) with conservative
Sonnet/Opus 4.5-baseline caps. Newer flagship variants ship with larger windows — pass overrides to dial caps in for
the specific model.
function createClaudeModel(gatewayId: string, overrides?: ClaudeModelOverrides): Model;
type ClaudeModelOverrides = {
displayId?: string;
maxInputTokens?: number; // default: 200_000 (Sonnet/Opus 4.5 baseline)
maxOutputTokens?: number; // default: 64_000 (Sonnet/Opus 4.5 baseline)
contextWindow?: number; // default: 200_000 (Sonnet/Opus 4.5 baseline)
supportsPromptCache?: boolean; // default: true
supportedFormats?: readonly SupportedFileFormat[]; // default: PNG/JPEG/PDF Claude caps
permittedParameters?: string[]; // e.g. ['temperature', 'top_p']
customHeaders?: Record<string, string>; // e.g. { 'anthropic-beta': '...' }
};
import { createClaudeModel } from '@salesforce/llm-gateway-sdk';
const model = createClaudeModel('llmgateway__BedrockAnthropicClaude48Opus', {
displayId: 'Claude Opus 4.8',
maxInputTokens: 1_000_000,
contextWindow: 1_000_000,
maxOutputTokens: 128_000,
});
client.setModel(model);
The factory is intentionally Claude-only. Prefer the Models.getByName(...) registry for any model the SDK already
ships, since the registry instance is the canonical source for caps and stays in sync with releases.
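To see which caps stay at the baseline when only some overrides are passed, here is a rough sketch of default-plus-override resolution. It assumes a plain object-spread merge; how createClaudeModel actually combines the values is not documented here, and `resolveCaps` and `BASELINE_CAPS` are hypothetical names.

```typescript
// Local stand-in for the numeric/boolean caps listed in ClaudeModelOverrides.
type Caps = {
  maxInputTokens: number;
  maxOutputTokens: number;
  contextWindow: number;
  supportsPromptCache: boolean;
};

// Baseline defaults as documented in the ClaudeModelOverrides comments.
const BASELINE_CAPS: Caps = {
  maxInputTokens: 200_000,
  maxOutputTokens: 64_000,
  contextWindow: 200_000,
  supportsPromptCache: true,
};

// Assumption: overrides win key by key; any key not supplied keeps its baseline value.
function resolveCaps(overrides: Partial<Caps> = {}): Caps {
  return { ...BASELINE_CAPS, ...overrides };
}

const partial = resolveCaps({ maxOutputTokens: 128_000 });
console.log(partial.maxOutputTokens); // 128000
console.log(partial.contextWindow); // 200000 (still the baseline)
```

Under this assumption, raising only maxOutputTokens leaves the input and context caps at 200K, which is why the example above overrides all three together for a 1M-window model.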
Request Types
type ChatRequest = {
messages: ChatMessageIn[];
generation_settings: GenerationSettings;
reasoning_settings?: ReasoningSettings;
tools?: ChatCompletionFunctionTool[];
tool_config?: ToolConfig;
};
type ChatMessageIn = {
role: 'user' | 'assistant' | 'system' | 'tool';
content: string;
files?: ChatMessageFile[];
tool_call_id?: string;
tool_call_name?: string;
tool_invocations?: ToolInvocationIn[];
};
type GenerationSettings = {
max_tokens?: number;
temperature?: number;
stop_sequences?: string[];
frequency_penalty?: number;
presence_penalty?: number;
parameters?: object;
};
type ToolConfig = {
mode: 'auto' | 'none' | 'tool' | 'any';
allowed_tools?: { type: 'function'; name: string }[];
};
type ChatCompletionFunctionTool = {
type?: 'function';
function: {
name: string;
description?: string;
parameters?: {
type: 'object';
properties?: Record<string, LlmgPropertyDefinition>;
required?: string[];
};
strict?: boolean;
};
};
Response Types
type ChatResponse = {
status: number;
data: ChatResponseData;
};
type ChatResponseData = {
generatedText?: string;
finishReason?: FinishReason;
error?: LLMGError;
toolInvocations?: ToolInvocation[];
usage?: TokenUsage;
};
type ChatStreamResponse = {
status: number;
data: AsyncGenerator<ChatStreamChunk>;
};
type ChatStreamChunk = {
generatedText: string;
done: boolean;
finishReason?: FinishReason;
error?: LLMGError;
toolInvocations?: ToolInvocation[];
usage?: TokenUsage;
};
type ToolInvocation = {
id: string;
// `arguments` is always a JSON-parseable string. Tool calls with no arguments
// surface as `arguments: "{}"` (the stringified empty object).
function: { name: string; arguments: string };
};
type TokenUsage = {
inputTokens: number;
outputTokens: number;
totalTokens: number;
reasoningTokens?: number;
cacheReadTokens?: number;
cacheWriteTokens?: number;
};
type FinishReason = 'stop' | 'length' | 'tool_calls' | 'end_turn' | 'max_tokens' | 'tool_use';
Multimodal Input
Attach images and PDFs to any ChatMessageIn via the optional files array. The same files[] shape works on both
chat() and chatStream().
import { createLLMGatewayClient, MimeType, Models, ModelName } from '@salesforce/llm-gateway-sdk';
import { readFile } from 'node:fs/promises';
import { randomUUID } from 'node:crypto';
const client = createLLMGatewayClient({ jwt });
client.setModel(Models.getByName(ModelName.CLAUDE_SONNET_4_6));
const imageBytes = await readFile('./screenshot.png');
const response = await client.chat({
messages: [
{
role: 'user',
content: 'Describe what you see.',
files: [
{
fileId: randomUUID(),
mimeType: MimeType.Png,
dataType: 'base64',
data: imageBytes.toString('base64'),
fileName: 'screenshot.png',
},
],
},
],
generation_settings: { max_tokens: 200 },
});
ChatMessageFile
type ChatMessageFile = {
fileId: string;
mimeType: MimeType; // 'image/png' | 'image/jpeg' | 'application/pdf'
dataType: 'base64';
data: string;
fileName?: string;
};
Casing is camelCase by design. Even though the surrounding ChatRequest envelope is snake_case (generation_settings, max_tokens, …), the inner FileData object uses camelCase per the gateway's OpenAPI schema.
Supported MIME types
MimeType.Png (image/png), MimeType.Jpeg (image/jpeg), MimeType.Pdf (application/pdf). A file is accepted
only if its mimeType matches one of the configured model's supportedFormats; anything else is rejected with
LLMGClientError code MODEL_DOES_NOT_SUPPORT_FORMAT. Import the constant to avoid hard-coding the wire string:
import { MimeType } from '@salesforce/llm-gateway-sdk';.
Transport
dataType: 'base64' only in v1. uri and sfDrive are deferred with .
Limits
| Layer | Limit |
| ------------------------------- | ----------------------------------------------------------------- |
| Files per request (global) | 10 |
| Total decoded bytes (global) | 15 MiB |
| Per-PNG/JPEG bytes — Claude 4.x | 3.75 MiB (from the png/jpeg supportedFormats[].maxBytesPerFile) |
| Per-PDF bytes — Claude 4.x | 4.5 MiB (from the pdf supportedFormats[].maxBytesPerFile) |
| PDFs per request — Claude 4.x | 5 (from the pdf supportedFormats[].maxFilesPerRequest) |
| GPT-5 / 5.4 / 5.5 caps | Declare png/jpeg/pdf with no per-format caps; global limits only |
The SDK approximates each file's decoded size as Math.floor(data.length * 0.75), which is within 2 bytes of the exact
decoded length.
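The size approximation can be checked by hand against standard padded base64, where every 4 characters carry 3 bytes and each trailing `=` stands for one missing byte. The sketch below assumes padded input; both function names are illustrative and not SDK exports.

```typescript
// The approximation quoted above: 3 decoded bytes for every 4 base64 characters.
function approxDecodedBytes(data: string): number {
  return Math.floor(data.length * 0.75);
}

// Exact decoded length for padded base64: subtract one byte per trailing '='.
function exactDecodedBytes(data: string): number {
  const padding = data.endsWith('==') ? 2 : data.endsWith('=') ? 1 : 0;
  return (data.length / 4) * 3 - padding;
}

// 'Man', 'Ma', and 'M' encoded as base64.
for (const data of ['TWFu', 'TWE=', 'TQ==']) {
  console.log(data, approxDecodedBytes(data), exactDecodedBytes(data));
}
// TWFu 3 3
// TWE= 3 2
// TQ== 3 1
```

For padded input the approximation never under-counts: it over-counts by exactly the number of padding characters, so a file sitting right at a cap can be rejected up to 2 bytes early.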
Internal limits are noted here: https://docs.internal.salesforce.com/ai/einstein/gateway/models-and-providers/
Externally documented LLM limits: https://help.salesforce.com/s/articleView?id=ai.generative_ai_llm_multimodal_support.htm&type=5
Validation errors
| Code | Cause |
| ------------------------------- | ------------------------------------------------------------------------------------ |
| TOO_MANY_FILES | Combined files[] count across all messages > 10 (global cap). |
| FILES_TOO_LARGE | Total decoded bytes across all files > 15 MiB (global cap). |
| MODEL_DOES_NOT_SUPPORT_FORMAT | A file's mimeType is not in the configured model's supportedFormats. |
| FILE_TOO_LARGE_FOR_FORMAT | A single file exceeds its format's maxBytesPerFile cap for the configured model. |
| TOO_MANY_FILES_FOR_FORMAT | The count of files of one format exceeds its maxFilesPerRequest cap for the model. |
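The limits and codes above can be combined into one pre-flight pass. The following is an independent sketch rather than the SDK's validator: `checkFiles` is a hypothetical helper that returns the first violated code instead of throwing LLMGClientError, and the order in which the checks run is an assumption.

```typescript
// Local stand-ins; only the fields needed for the checks are modelled.
type FileLike = { mimeType: string; data: string };
type FormatCap = { mimeType: string; maxBytesPerFile?: number; maxFilesPerRequest?: number };

const MIB = 1024 * 1024;
const GLOBAL_MAX_FILES = 10;
const GLOBAL_MAX_BYTES = 15 * MIB;

// Per-format caps quoted in the Limits table for Claude 4.x.
const CLAUDE_FORMATS: FormatCap[] = [
  { mimeType: 'image/png', maxBytesPerFile: 3.75 * MIB },
  { mimeType: 'image/jpeg', maxBytesPerFile: 3.75 * MIB },
  { mimeType: 'application/pdf', maxBytesPerFile: 4.5 * MIB, maxFilesPerRequest: 5 },
];

const approxBytes = (data: string): number => Math.floor(data.length * 0.75);

// Returns the first violated code, or null when every check passes.
function checkFiles(files: FileLike[], formats: FormatCap[]): string | null {
  if (files.length > GLOBAL_MAX_FILES) return 'TOO_MANY_FILES';
  let totalBytes = 0;
  const countByMime = new Map<string, number>();
  for (const file of files) {
    const format = formats.find((f) => f.mimeType === file.mimeType);
    if (!format) return 'MODEL_DOES_NOT_SUPPORT_FORMAT';
    const bytes = approxBytes(file.data);
    if (format.maxBytesPerFile !== undefined && bytes > format.maxBytesPerFile) {
      return 'FILE_TOO_LARGE_FOR_FORMAT';
    }
    const count = (countByMime.get(file.mimeType) ?? 0) + 1;
    countByMime.set(file.mimeType, count);
    if (format.maxFilesPerRequest !== undefined && count > format.maxFilesPerRequest) {
      return 'TOO_MANY_FILES_FOR_FORMAT';
    }
    totalBytes += bytes;
  }
  return totalBytes > GLOBAL_MAX_BYTES ? 'FILES_TOO_LARGE' : null;
}

const pdf: FileLike = { mimeType: 'application/pdf', data: 'AAAA' };
const sixPdfs = Array.from({ length: 6 }, () => pdf);
console.log(checkFiles(sixPdfs, CLAUDE_FORMATS)); // TOO_MANY_FILES_FOR_FORMAT
console.log(checkFiles([pdf], CLAUDE_FORMATS)); // null
```

Six tiny PDFs pass the global 10-file cap but trip the 5-PDF per-format cap, which is the case the per-format codes exist to distinguish.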
Trust Layer. Files sent as binary base64 bypass Salesforce Trust Layer protections (toxicity scoring, data masking). If those protections matter for your use case, extract the content client-side and send it as text.
Reusing validation outside the client
chat() / chatStream() run this validation automatically. Integrations that build requests through a different
transport (e.g. an agent harness speaking a provider-native protocol) can run the identical pre-flight check via the
exported validateMultimodalFiles(files, model) — it throws the same LLMGClientError codes as the client. files
only needs { mimeType, data } (the MultimodalFile type), so callers holding partial file shapes don't need to
construct a full ChatMessageFile.
import { Models, ModelName, validateMultimodalFiles } from '@salesforce/llm-gateway-sdk';
const model = Models.getByName(ModelName.CLAUDE_SONNET_4_6);
validateMultimodalFiles([{ mimeType: 'image/png', data: base64Png }], model); // throws LLMGClientError on a cap/format violation
Tracing
Subscribe to per-request tracing via onTrace(), or use the convenience startTracing()/stopTracing() methods to
write trace events to a Markdown file.
// Callback-based tracing
const offTrace = client.onTrace((event) => {
if (event.type === TraceType.Response) {
console.log(`[${event.status}] ${event.totalDurationMs}ms`);
}
});
offTrace(); // unsubscribe
// File-based tracing
client.startTracing('./trace.md');
// ... make requests ...
client.stopTracing();
type TraceEvent = TraceRequestEvent | TraceResponseEvent;
type TraceRequestEvent = {
type: TraceType.Request;
timestamp: Date;
url: string;
method: string;
model: string;
body: Record<string, unknown>;
};
type TraceResponseEvent = {
type: TraceType.Response;
timestamp: Date;
model: string;
status: number;
error?: Error;
timeToFirstTokenMs?: number;
usage?: TokenUsage;
totalDurationMs?: number;
responseText?: string;
xClientTraceId?: string;
};
enum TraceType {
Request = 'request',
Response = 'response',
Info = 'info',
}
TraceFileWriter
Lower-level class for manual trace file management:
const writer = new TraceFileWriter(client, { filePath: './trace.md' });
// ... make requests ...
writer.detach();
Telemetry & Structured Logs
LLMGatewayClient exposes two more typed event streams. Consumers subscribe via onTelemetry() and onLog(), both of
which return an Unsubscribe closure.
type TelemetryEvent =
| RequestStartedEvent
| RequestCompletedEvent
| RequestFailedEvent
| RequestRetryEvent
| JwtRefreshedEvent
| JwtRefreshFailedEvent;
type RequestStartedEvent = {
type: 'request-started';
timestamp: Date;
url: string;
method: string;
model: string;
};
type RequestCompletedEvent = {
type: 'request-completed';
timestamp: Date;
model: string;
status: number;
durationMs: number;
timeToFirstTokenMs?: number;
usage?: TokenUsage;
};
type RequestFailedEvent = {
type: 'request-failed';
timestamp: Date;
model: string;
status?: number;
durationMs: number;
error: Error;
};
type RequestRetryEvent = {
type: 'request-retry';
timestamp: Date;
model: string;
attempt: number; // 1-indexed: attempt 1 is emitted before the first retry
delayMs: number; // jittered or Retry-After-driven wait
status?: number; // set when the retry was triggered by an HTTP response (429/5xx)
error?: Error; // set when the retry was triggered by a transport-layer failure
};
type JwtRefreshedEvent = {
type: 'jwt-refreshed';
timestamp: Date;
durationMs: number;
};
type JwtRefreshFailedEvent = {
type: 'jwt-refresh-failed';
timestamp: Date;
durationMs: number;
error: Error;
};
type LogRecord = {
level: 'debug' | 'info' | 'warn' | 'error';
message: string;
timestamp: Date;
context?: Record<string, unknown>;
error?: Error;
};
const offTelemetry = client.onTelemetry((event) => {
if (event.type === 'request-completed') {
histogram('llm_request_ms', event.durationMs, { model: event.model });
}
});
const offLog = client.onLog((record) => {
logger[record.level](record.context, record.message);
});
Log messages emitted by the SDK
Each LogRecord carries a context object; the model key is populated on every record whenever a model is set.
| Level | Message | Notable context keys |
| ----- | ------------------------------------ | ---------------------------------------------------- |
| debug | Model set to {name} | model |
| debug | Refreshing expired JWT | — |
| info | JWT refreshed | durationMs |
| warn | Rate limit hit | model, reset, remaining, limit |
| warn | Unexpected status {code} | model, status |
| warn | SSE parse error | model, chunk, reason |
| warn | Retrying transient request failure | model, attempt, delayMs, status, errorCode |
| warn | Retries exhausted | model, status |
Disposal
dispose() releases event resources: it detaches any active trace writer and clears the trace, telemetry, and log
buses. After disposal every consumer-facing method throws LLMGClientError. dispose() itself is idempotent.
In-flight streams returned by a prior chatStream() call are not aborted by dispose(); the async generator keeps
yielding until it naturally completes. Use the caller-supplied AbortSignal if you need to cancel an active stream.
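Cancelling a stream from the consumer side might look like the sketch below. With the real client the signal would be passed as `chatStream(request, { abortSignal: controller.signal })`, per ChatOptions. Here `fakeStream` is a hypothetical stand-in generator so the example runs on its own, and its behaviour on abort (it simply stops yielding) is an assumption, not a statement about how the SDK surfaces cancellation.

```typescript
// Stand-in chunk shape; the real generator comes from chatStream().
type Chunk = { generatedText: string; done: boolean };

// Hypothetical stand-in stream that checks the signal before each chunk.
async function* fakeStream(words: string[], signal: AbortSignal): AsyncGenerator<Chunk> {
  for (let i = 0; i < words.length; i++) {
    if (signal.aborted) return; // assumption: stop quietly once aborted
    yield { generatedText: words[i], done: i === words.length - 1 };
  }
}

// Consume chunks and abort once `limit` chunks have been received.
async function collectUntil(limit: number): Promise<string[]> {
  const controller = new AbortController();
  const received: string[] = [];
  for await (const chunk of fakeStream(['a', 'b', 'c', 'd'], controller.signal)) {
    received.push(chunk.generatedText);
    if (received.length === limit) controller.abort(); // cancel the rest of the stream
  }
  return received;
}

collectUntil(2).then((received) => console.log(received)); // [ 'a', 'b' ]
```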
SfApiEnv
enum SfApiEnv {
Dev = 'dev',
Perf = 'perf',
Prod = 'prod',
Stage = 'stage',
Test = 'test',
}
Retries
By default the client retries transient failures with exponential backoff and full jitter. Retries cover only connection establishment and the initial HTTP response — once the response body starts streaming, errors propagate to the caller.
The SDK retries on:
- HTTP 429, 500, 502, 503, 504
- Socket / network errors: ECONNRESET, ECONNREFUSED, ENOTFOUND, ENETDOWN, ENETUNREACH, EHOSTDOWN, ETIMEDOUT, UND_ERR_SOCKET, UND_ERR_CONNECT_TIMEOUT, UND_ERR_HEADERS_TIMEOUT, UND_ERR_BODY_TIMEOUT
- HTTP/2 GOAWAY (surfaces as UND_ERR_SOCKET)
Retry-After and Salesforce LLMG's x-ratelimit-reset headers are honored when present, used verbatim without
jitter or further clamping (the server already specified the wait). The hint is bounded only by maxRetryAfterMs — if
the server asks for longer, the SDK does not retry and instead surfaces the response so the caller sees the
structured error (rather than the SDK retrying too soon and cascading more 429s). Computed exponential-backoff retries
are bounded separately by maxDelayMs; that cap does not apply to server-driven hints.
maxTotalElapsedMs provides an optional wall-clock ceiling on a single call (default: no deadline). When the next sleep
would exceed it, the loop stops retrying and surfaces the most recent error / response, even if maxAttempts hasn't
been reached.
JWT-bearing headers are recomputed on every attempt, so a long Retry-After wait never replays a stale token.
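The delay rules above can be sketched as a single function. This is a minimal illustration under stated assumptions: the exact backoff formula the SDK uses is not documented here, so the commonly described full-jitter form (a uniform draw between 0 and the capped exponential) is used; `nextRetryDelayMs` is a hypothetical name; and maxAttempts is left to the caller's loop.

```typescript
// Local stand-in for the retry settings, with every field required.
type RetryPolicy = {
  initialDelayMs: number;
  maxDelayMs: number;
  maxRetryAfterMs: number;
  backoffFactor: number;
  maxTotalElapsedMs: number;
};

// The documented RetryOptions defaults.
const DEFAULT_POLICY: RetryPolicy = {
  initialDelayMs: 100,
  maxDelayMs: 2000,
  maxRetryAfterMs: 60000,
  backoffFactor: 2,
  maxTotalElapsedMs: Infinity,
};

// Returns how long to wait before retry `retryNumber` (1 = first retry),
// or null when the call should stop retrying and surface the response.
function nextRetryDelayMs(
  retryNumber: number,
  elapsedMs: number,
  policy: RetryPolicy,
  serverHintMs?: number,
  random: () => number = Math.random,
): number | null {
  let delayMs: number;
  if (serverHintMs !== undefined) {
    // Server hints are used verbatim and bounded only by maxRetryAfterMs.
    if (serverHintMs > policy.maxRetryAfterMs) return null;
    delayMs = serverHintMs;
  } else {
    // Full jitter: uniform in [0, min(maxDelayMs, initial * factor^(n-1))).
    const exponential = policy.initialDelayMs * Math.pow(policy.backoffFactor, retryNumber - 1);
    delayMs = Math.floor(random() * Math.min(policy.maxDelayMs, exponential));
  }
  // Stop when the next sleep would push past the wall-clock ceiling.
  return elapsedMs + delayMs > policy.maxTotalElapsedMs ? null : delayMs;
}

const half = () => 0.5; // fixed "random" source so the output is reproducible
console.log(nextRetryDelayMs(1, 0, DEFAULT_POLICY, undefined, half)); // 50
console.log(nextRetryDelayMs(10, 0, DEFAULT_POLICY, undefined, half)); // 1000
console.log(nextRetryDelayMs(1, 0, DEFAULT_POLICY, 5000)); // 5000
console.log(nextRetryDelayMs(1, 0, DEFAULT_POLICY, 120000)); // null
```

Keeping the server-hint branch separate from the computed branch mirrors the two independent caps: maxDelayMs never clamps a hint, and maxRetryAfterMs never affects computed backoff.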
Retries do not fire on:
- 4xx that aren't 429 (400, 401, 403, 404, 422, …)
- Caller-initiated AbortSignal cancellation
- Errors after the response body has begun streaming
- Server hints that exceed maxRetryAfterMs; the response is surfaced instead
type RetryOptions = {
/** Total attempts including the first. Default: 3. Must be >= 1. To disable retries, pass `retry: false`. */
maxAttempts?: number;
/** Initial delay before the first retry, in ms. Default: 100. */
initialDelayMs?: number;
/** Maximum delay between *computed* exponential-backoff retries, in ms. Default: 2000. */
maxDelayMs?: number;
/** Maximum delay accepted from a server-driven hint (`Retry-After` / `x-ratelimit-reset`), in ms. Default: 60000. */
maxRetryAfterMs?: number;
/** Exponential backoff multiplier. Default: 2. */
backoffFactor?: number;
/**
* Hard ceiling on cumulative wall-clock time a single call may spend across all attempts and
* inter-attempt sleeps, in ms. When the next sleep would push past this deadline the loop bails
* and surfaces the most recent error / response. Default: `Infinity` (no deadline).
*/
maxTotalElapsedMs?: number;
};
// Tune the policy: more attempts, shorter initial delay
const client = createLLMGatewayClient({ jwt, retry: { maxAttempts: 5, initialDelayMs: 50 } });
// Disable retries entirely
const noRetry = createLLMGatewayClient({ jwt, retry: false });
// Observe each retry attempt
client.onTelemetry((event) => {
if (event.type === 'request-retry') {
// `error` is only set for transport-layer failures, so guard the access.
console.warn(`retry #${event.attempt} after ${event.delayMs}ms`, event.status, event.error?.message);
}
});
When retries are exhausted, the underlying response is surfaced as a structured LLMGClientError — the messageCode,
cause (parsed LLMGError), and formatted rate-limit message are preserved exactly as on the non-retry path.
Error Handling
The client throws LLMGClientError for API errors (4xx/5xx). Rate-limit responses (429) include formatted reset time
information.
import { createLLMGatewayClient, LLMGClientError, LLMGClientErrorCode } from '@salesforce/llm-gateway-sdk';
try {
await client.chat(request);
} catch (err) {
if (err instanceof LLMGClientError) {
if (err.code === LLMGClientErrorCode.TooManyFiles) {
// narrow on a stable client-raised code …
}
console.error(`LLMG error [${err.code}]: ${err.message}`);
}
}
LLMGClientErrorCode enumerates the stable codes the client raises itself (lifecycle + pre-flight multimodal
validation). Gateway-side errors (4xx/5xx) carry the upstream errorData.messageCode verbatim instead — those values
are an opaque string and are not part of this enum.
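Because gateway-side codes are opaque strings, it can help to narrow a code before branching on it. The sketch below covers only the five pre-flight validation codes listed under Validation errors. It assumes the error's code carries those exact strings; lifecycle codes are not listed in this README and are left out; and `isMultimodalValidationCode` is a hypothetical helper, not an SDK export.

```typescript
// The five pre-flight validation codes documented under "Validation errors".
const MULTIMODAL_VALIDATION_CODES = [
  'TOO_MANY_FILES',
  'FILES_TOO_LARGE',
  'MODEL_DOES_NOT_SUPPORT_FORMAT',
  'FILE_TOO_LARGE_FOR_FORMAT',
  'TOO_MANY_FILES_FOR_FORMAT',
] as const;

type MultimodalValidationCode = (typeof MULTIMODAL_VALIDATION_CODES)[number];

// Type guard: true only for the documented client-side validation codes.
function isMultimodalValidationCode(code: string): code is MultimodalValidationCode {
  return (MULTIMODAL_VALIDATION_CODES as readonly string[]).indexOf(code) !== -1;
}

console.log(isMultimodalValidationCode('TOO_MANY_FILES')); // true
console.log(isMultimodalValidationCode('SOME_GATEWAY_MESSAGE_CODE')); // false
```

Anything that fails the guard should be treated as an opaque upstream value and logged or surfaced as-is rather than matched against a fixed list.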
Function Calling Example
const response = await client.chat({
messages: [{ role: 'user', content: 'Get the weather for San Francisco' }],
generation_settings: { max_tokens: 200 },
tools: [
{
type: 'function',
function: {
name: 'get_weather',
description: 'Get weather for a location',
parameters: {
type: 'object',
properties: { location: { type: 'string' } },
required: ['location'],
},
},
},
],
tool_config: { mode: 'auto' },
});
if (response.data.toolInvocations) {
for (const invocation of response.data.toolInvocations) {
console.log(`Tool: ${invocation.function.name}, Args: ${invocation.function.arguments}`);
}
}
Advanced: createJWT
Creates an auto-refreshing JWT from raw credentials. Establishes a connection, mints an initial token (fail-fast on
invalid credentials), and returns a self-refreshing JSONWebToken:
import { createJWT } from '@salesforce/llm-gateway-sdk';
const jwt = await createJWT({
accessToken: 'your-access-token',
instanceUrl: 'https://your-instance.salesforce.com',
featureId: 'MyFeature',
});
| Option | Type | Default | Description |
| -------------- | -------- | --------------------------------------------- | -------------------------------------- |
| accessToken | string | — | Salesforce access token. |
| instanceUrl | string | — | Salesforce instance URL. |
| mintingPath? | string | '/ide/auth' | Token minting endpoint path. |
| featureId? | string | LLMG_FEATURE_ID env var or 'VibesService' | Feature identifier sent with requests. |
Advanced: createJWTFromConnection
Creates an auto-refreshing JWT from an existing OrgConnection. Use this when you already have a validated connection
and want to avoid the extra network call that createJWT performs to establish one:
import { createJWTFromConnection } from '@salesforce/llm-gateway-sdk';
import type { OrgConnection } from '@salesforce/agentic-common';
const jwt = await createJWTFromConnection(orgConnection, { featureId: 'MyFeature' });
| Option | Type | Default | Description |
| --------------- | --------------- | --------------------------------------------- | ---------------------------------------- |
| orgConnection | OrgConnection | — | An already-authenticated org connection. |
| mintingPath? | string | '/ide/auth' | Token minting endpoint path. |
| featureId? | string | LLMG_FEATURE_ID env var or 'VibesService' | Feature identifier sent with requests. |
Advanced: DefaultLLMGatewayClientFactory
For dependency-injection scenarios where a factory interface is needed:
interface LLMGatewayClientFactory {
create(jwt: JSONWebToken, options?: { env?: SfApiEnv }): LLMGatewayClient;
}
Development
See DEVELOPING.md for build-from-source setup, scripts, E2E testing, and packaging.
See ARCHITECTURE.md for implementation details on the request pipeline, SSE parsing, response processors, and JWT lifecycle.
