@salesforce/llm-gateway-sdk

v0.14.0

Salesforce LLM Gateway SDK

Typed SDK for the Salesforce LLM Gateway API. Provides an HTTP client with JWT management, model abstraction, streaming (SSE) support, and request/response tracing.

Quick Start

Closed source. This package is published to npm under the Salesforce Public Code License and is for use by Salesforce only.

import { createJWT, createLLMGatewayClient } from '@salesforce/llm-gateway-sdk';

// 1. Create a JWT and client
const jwt = await createJWT({
  accessToken: 'your-access-token',
  instanceUrl: 'https://your-instance.salesforce.com',
});
const client = createLLMGatewayClient({ jwt });

// 2. Send a non-streaming chat request
const response = await client.chat({
  messages: [{ role: 'user', content: 'Hello!' }],
  generation_settings: { max_tokens: 100 },
});
console.log(response.data.generatedText);

// 3. Or stream the response
const stream = await client.chatStream({
  messages: [{ role: 'user', content: 'Tell me a story.' }],
  generation_settings: { max_tokens: 500 },
});

for await (const chunk of stream.data) {
  if (chunk.generatedText) process.stdout.write(chunk.generatedText);
  if (chunk.done) console.log('\nUsage:', chunk.usage);
}

API Reference

createLLMGatewayClient(options): LLMGatewayClient

Creates a ready-to-use client. This is the recommended entry point.

import { createJWT, createLLMGatewayClient, Models, ModelName, SfApiEnv } from '@salesforce/llm-gateway-sdk';

const jwt = await createJWT({
  accessToken: 'your-access-token',
  instanceUrl: 'https://your-instance.salesforce.com',
});

const client = createLLMGatewayClient({ jwt, env: SfApiEnv.Prod });
client.setModel(Models.getByName(ModelName.GPT_5));

Accepts GatewayClientOptions:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| jwt | JSONWebToken | (none) | JWT for authentication (required). |
| env? | SfApiEnv | prod | API environment. |
| basePath? | string | LLMG_BASE_PATH env var or '/einstein/gpt/code/v1.1' | Override the API base path. See route selection rationale. |
| clientFeatureIdOverride? | string | (none) | Override x-client-feature-id header instead of using the JWT minting feature ID. |
| retry? | RetryOptions \| false | below | Retry policy for transient failures. Pass false to disable retries entirely. |

LLMGatewayClient (interface)

The core contract consumers program against. All factory functions return this type.

interface LLMGatewayClient {
  getModel(): Model;
  setModel(model: Model): void;

  chat(request: ChatRequest, options?: ChatOptions): Promise<ChatResponse>;
  chatStream(request: ChatRequest, options?: ChatOptions): Promise<ChatStreamResponse>;

  onTrace(callback: (event: TraceEvent) => void): Unsubscribe;
  onTelemetry(callback: TelemetryEventCallback): Unsubscribe;
  onLog(callback: (record: LogRecord) => void): Unsubscribe;

  startTracing(filePath: string): void;
  stopTracing(): void;

  dispose(): void;
}

ChatOptions

type ChatOptions = {
  abortSignal?: AbortSignal;
};

Models

abstract class Model {
  abstract readonly name: ModelName;
  abstract readonly displayId: string;
  abstract readonly maxInputTokens: number;
  abstract readonly maxOutputTokens: number;
  abstract readonly contextWindow: number;
  abstract readonly supportsPromptCache: boolean;
  // The file formats this model accepts, with per-format caps. `[]` = text-only.
  abstract readonly supportedFormats: readonly SupportedFileFormat[];
}

type SupportedFileFormat = {
  name: 'png' | 'jpeg' | 'pdf';
  mimeType: MimeType; // 'image/png' | 'image/jpeg' | 'application/pdf'
  maxBytesPerFile?: number; // per-format byte cap; unset = only the global 15 MiB applies
  maxFilesPerRequest?: number; // per-format count cap; unset = only the global 10-file cap applies
};

enum ModelName {
  GPT_5 = 'llmgateway__OpenAIGPT5',
  GPT_5_4 = 'llmgateway__OpenAIGPT54',
  GPT_5_5 = 'llmgateway__OpenAIGPT55',
  CLAUDE_SONNET_4_5 = 'llmgateway__BedrockAnthropicClaude45Sonnet',
  CLAUDE_SONNET_4_6 = 'llmgateway__BedrockAnthropicClaude46Sonnet',
  CLAUDE_OPUS_4_5 = 'llmgateway__BedrockAnthropicClaude45Opus',
  CLAUDE_OPUS_4_6 = 'llmgateway__BedrockAnthropicClaude46Opus',
  CLAUDE_OPUS_4_7 = 'llmgateway__BedrockAnthropicClaude47Opus',
}

A model's supportedFormats array is the single source of truth for multimodal capability and caps. A file is accepted only if its mimeType matches a declared format; an empty array means the model is text-only.

Built-in Models

| Model | ModelName | Context Window | Max Output | Images | PDFs | Prompt Cache |
| --- | --- | --- | --- | --- | --- | --- |
| GPT-5 | GPT_5 | 272K | 128K | Yes | Yes | No |
| GPT-5.4 | GPT_5_4 | 1.05M | 128K | Yes | Yes | No |
| GPT-5.5 (BETA) | GPT_5_5 | 1.05M | 128K | Yes | Yes | No |
| Claude Sonnet 4.5 | CLAUDE_SONNET_4_5 | 200K | 8192 | Yes | Yes | Yes |
| Claude Sonnet 4.6 | CLAUDE_SONNET_4_6 | 200K | 16384 | Yes | Yes | Yes |
| Claude Opus 4.5 | CLAUDE_OPUS_4_5 | 200K | 64K | Yes | Yes | Yes |
| Claude Opus 4.6 | CLAUDE_OPUS_4_6 | 1M | 128K | Yes | Yes | Yes |
| Claude Opus 4.7 | CLAUDE_OPUS_4_7 | 1M | 128K | Yes | Yes | Yes |

GPT-5.5 is gated behind the AIModelBetaEnabled org preference; orgs without that gate enabled will get a gateway error at request time.

Per-format multimodal caps for Claude 4.x on Bedrock: 3.75 MiB/PNG, 3.75 MiB/JPEG, 4.5 MiB/PDF, and a 5-PDF per-request cap. GPT-5 / GPT-5.4 / GPT-5.5 declare the same three formats but no per-format caps — only the global 15 MiB / 10-file caps bind, since no per-file rejection has been observed on GPT-5 below the gateway's raw-payload wall.

Models Registry

const Models = {
  getDefault(): Model;              // ClaudeSonnet46
  getByName(name: ModelName): Model;
};

createClaudeModel (forward-compat escape hatch)

When the LLM Gateway publishes a new Bedrock-Anthropic Claude variant before this SDK has been updated, consumers can opt into it by name without waiting for a release. The factory builds a Model instance using the shared AnthropicClaudeResponseProcessor (correct for every Bedrock-Claude variant the gateway hosts) with conservative Sonnet/Opus 4.5-baseline caps. Newer flagship variants ship with larger windows — pass overrides to dial caps in for the specific model.

function createClaudeModel(gatewayId: string, overrides?: ClaudeModelOverrides): Model;

type ClaudeModelOverrides = {
  displayId?: string;
  maxInputTokens?: number; // default: 200_000 (Sonnet/Opus 4.5 baseline)
  maxOutputTokens?: number; // default: 64_000 (Sonnet/Opus 4.5 baseline)
  contextWindow?: number; // default: 200_000 (Sonnet/Opus 4.5 baseline)
  supportsPromptCache?: boolean; // default: true
  supportedFormats?: readonly SupportedFileFormat[]; // default: PNG/JPEG/PDF Claude caps
  permittedParameters?: string[]; // e.g. ['temperature', 'top_p']
  customHeaders?: Record<string, string>; // e.g. { 'anthropic-beta': '...' }
};

import { createClaudeModel } from '@salesforce/llm-gateway-sdk';

const model = createClaudeModel('llmgateway__BedrockAnthropicClaude48Opus', {
  displayId: 'Claude Opus 4.8',
  maxInputTokens: 1_000_000,
  contextWindow: 1_000_000,
  maxOutputTokens: 128_000,
});
client.setModel(model);

The factory is intentionally Claude-only. Prefer the Models.getByName(...) registry for any model the SDK already ships, since the registry instance is the canonical source for caps and stays in sync with releases.

Request Types

type ChatRequest = {
  messages: ChatMessageIn[];
  generation_settings: GenerationSettings;
  reasoning_settings?: ReasoningSettings;
  tools?: ChatCompletionFunctionTool[];
  tool_config?: ToolConfig;
};

type ChatMessageIn = {
  role: 'user' | 'assistant' | 'system' | 'tool';
  content: string;
  files?: ChatMessageFile[];
  tool_call_id?: string;
  tool_call_name?: string;
  tool_invocations?: ToolInvocationIn[];
};

type GenerationSettings = {
  max_tokens?: number;
  temperature?: number;
  stop_sequences?: string[];
  frequency_penalty?: number;
  presence_penalty?: number;
  parameters?: object;
};

type ToolConfig = {
  mode: 'auto' | 'none' | 'tool' | 'any';
  allowed_tools?: { type: 'function'; name: string }[];
};

type ChatCompletionFunctionTool = {
  type?: 'function';
  function: {
    name: string;
    description?: string;
    parameters?: {
      type: 'object';
      properties?: Record<string, LlmgPropertyDefinition>;
      required?: string[];
    };
    strict?: boolean;
  };
};

Response Types

type ChatResponse = {
  status: number;
  data: ChatResponseData;
};

type ChatResponseData = {
  generatedText?: string;
  finishReason?: FinishReason;
  error?: LLMGError;
  toolInvocations?: ToolInvocation[];
  usage?: TokenUsage;
};

type ChatStreamResponse = {
  status: number;
  data: AsyncGenerator<ChatStreamChunk>;
};

type ChatStreamChunk = {
  generatedText: string;
  done: boolean;
  finishReason?: FinishReason;
  error?: LLMGError;
  toolInvocations?: ToolInvocation[];
  usage?: TokenUsage;
};

type ToolInvocation = {
  id: string;
  // `arguments` is always a JSON-parseable string. Tool calls with no arguments
  // surface as `arguments: "{}"` (the stringified empty object).
  function: { name: string; arguments: string };
};

type TokenUsage = {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  reasoningTokens?: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
};

type FinishReason = 'stop' | 'length' | 'tool_calls' | 'end_turn' | 'max_tokens' | 'tool_use';
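
The FinishReason union appears to mix two naming styles ('stop' / 'length' / 'tool_calls' next to 'end_turn' / 'max_tokens' / 'tool_use'), so a caller may want to fold them into a single outcome while accumulating stream chunks. The sketch below is not SDK code: the types are trimmed local copies of the ones above, and Outcome, classifyFinish, StreamState, emptyState, and foldChunk are hypothetical names. The pairing of reasons is an assumption based only on their names.

```typescript
// Trimmed local copies of the README types, so the sketch runs without the SDK.
type FinishReason = 'stop' | 'length' | 'tool_calls' | 'end_turn' | 'max_tokens' | 'tool_use';
type TokenUsage = { inputTokens: number; outputTokens: number; totalTokens: number };
type ChatStreamChunk = {
  generatedText: string;
  done: boolean;
  finishReason?: FinishReason;
  usage?: TokenUsage;
};

// Hypothetical caller-side outcome; the SDK does not export this.
type Outcome = 'completed' | 'truncated' | 'tool_call';

// Assumption: reasons with similar names carry the same meaning.
function classifyFinish(reason: FinishReason): Outcome {
  switch (reason) {
    case 'stop':
    case 'end_turn':
      return 'completed';
    case 'length':
    case 'max_tokens':
      return 'truncated';
    case 'tool_calls':
    case 'tool_use':
      return 'tool_call';
  }
}

type StreamState = {
  text: string;
  done: boolean;
  finishReason?: FinishReason;
  usage?: TokenUsage;
};

const emptyState: StreamState = { text: '', done: false };

// Pure reducer: append text and keep the latest finish reason and usage seen.
function foldChunk(state: StreamState, chunk: ChatStreamChunk): StreamState {
  return {
    text: state.text + chunk.generatedText,
    done: state.done || chunk.done,
    finishReason: chunk.finishReason ?? state.finishReason,
    usage: chunk.usage ?? state.usage,
  };
}

// Hand-written chunks standing in for `stream.data`.
const sampleChunks: ChatStreamChunk[] = [
  { generatedText: 'Once upon ', done: false },
  { generatedText: 'a time', done: false },
  {
    generatedText: '',
    done: true,
    finishReason: 'max_tokens',
    usage: { inputTokens: 12, outputTokens: 4, totalTokens: 16 },
  },
];

const finalState = sampleChunks.reduce(foldChunk, emptyState);
```

With a real stream, the same reducer could be applied inside the for await loop over stream.data, which keeps the accumulation logic testable without a network call.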

Multimodal Input

Attach images and PDFs to any ChatMessageIn via the optional files array. The same files[] shape works on both chat() and chatStream().

import { createLLMGatewayClient, MimeType, Models, ModelName } from '@salesforce/llm-gateway-sdk';
import { readFile } from 'node:fs/promises';
import { randomUUID } from 'node:crypto';

// `jwt` is created with createJWT(...) as shown in Quick Start
const client = createLLMGatewayClient({ jwt });
client.setModel(Models.getByName(ModelName.CLAUDE_SONNET_4_6));

const imageBytes = await readFile('./screenshot.png');

const response = await client.chat({
  messages: [
    {
      role: 'user',
      content: 'Describe what you see.',
      files: [
        {
          fileId: randomUUID(),
          mimeType: MimeType.Png,
          dataType: 'base64',
          data: imageBytes.toString('base64'),
          fileName: 'screenshot.png',
        },
      ],
    },
  ],
  generation_settings: { max_tokens: 200 },
});

ChatMessageFile

type ChatMessageFile = {
  fileId: string;
  mimeType: MimeType; // 'image/png' | 'image/jpeg' | 'application/pdf'
  dataType: 'base64';
  data: string;
  fileName?: string;
};

Casing is camelCase by design. Even though the surrounding ChatRequest envelope is snake_case (generation_settings, max_tokens, …), the inner FileData object uses camelCase per the gateway's OpenAPI schema.

Supported MIME types

MimeType.Png (image/png), MimeType.Jpeg (image/jpeg), MimeType.Pdf (application/pdf). A file is accepted only if its mimeType matches one of the configured model's supportedFormats; anything else is rejected with LLMGClientError code MODEL_DOES_NOT_SUPPORT_FORMAT. Import the constant to avoid hard-coding the wire string: import { MimeType } from '@salesforce/llm-gateway-sdk';.

Transport

Only dataType: 'base64' is supported in v1. uri and sfDrive are deferred.

Limits

| Layer | Limit |
| --- | --- |
| Files per request (global) | 10 |
| Total decoded bytes (global) | 15 MiB |
| Per-PNG/JPEG bytes, Claude 4.x | 3.75 MiB (from the png/jpeg supportedFormats[].maxBytesPerFile) |
| Per-PDF bytes, Claude 4.x | 4.5 MiB (from the pdf supportedFormats[].maxBytesPerFile) |
| PDFs per request, Claude 4.x | 5 (from the pdf supportedFormats[].maxFilesPerRequest) |
| GPT-5 / 5.4 / 5.5 caps | Declare png/jpeg/pdf with no per-format caps; global limits only |

The SDK approximates decoded size as Math.floor(data.length * 0.75), which is within 2 bytes of the exact decoded length.
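
To see why the approximation above overshoots by at most 2 bytes, it can be compared with an exact count that subtracts base64 padding. This is a minimal sketch with hypothetical helper names (approxDecodedBytes, exactDecodedBytes); it assumes standard base64 input and is not the SDK's internal code.

```typescript
// The approximation from the text: 3 decoded bytes for every 4 base64 characters.
function approxDecodedBytes(data: string): number {
  return Math.floor(data.length * 0.75);
}

// Exact decoded length for standard base64: each trailing '=' stands for one
// byte that is not really there.
function exactDecodedBytes(data: string): number {
  const padding = data.endsWith('==') ? 2 : data.endsWith('=') ? 1 : 0;
  return Math.floor(data.length * 0.75) - padding;
}

// Base64 for "abc", "hello", and "a" (0, 1, and 2 padding characters).
const samples = ['YWJj', 'aGVsbG8=', 'YQ=='];

// Overshoot of the approximation for each sample: 0, 1, and 2 bytes.
const overshoots = samples.map((s) => approxDecodedBytes(s) - exactDecodedBytes(s));
```

Because the approximation never undercounts, a size check based on it can only reject slightly early, never admit a file that is over the cap.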

Internal limits are noted here: https://docs.internal.salesforce.com/ai/einstein/gateway/models-and-providers/

Externally documented LLM limits: https://help.salesforce.com/s/articleView?id=ai.generative_ai_llm_multimodal_support.htm&type=5

Validation errors

| Code | Cause |
| --- | --- |
| TOO_MANY_FILES | Combined files[] count across all messages > 10 (global cap). |
| FILES_TOO_LARGE | Total decoded bytes across all files > 15 MiB (global cap). |
| MODEL_DOES_NOT_SUPPORT_FORMAT | A file's mimeType is not in the configured model's supportedFormats. |
| FILE_TOO_LARGE_FOR_FORMAT | A single file exceeds its format's maxBytesPerFile cap for the configured model. |
| TOO_MANY_FILES_FOR_FORMAT | The count of files of one format exceeds its maxFilesPerRequest cap for the model. |
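
The limits and error codes above can be read as a single pre-flight pass over the files. The following sketch illustrates that reading with local types and a hypothetical findViolation function; the order in which the checks run is an assumption, the function returns a code instead of throwing LLMGClientError, and none of this is the SDK's actual implementation.

```typescript
// Local stand-ins for the README types, so the sketch runs without the SDK.
type SupportedFileFormat = {
  name: 'png' | 'jpeg' | 'pdf';
  mimeType: 'image/png' | 'image/jpeg' | 'application/pdf';
  maxBytesPerFile?: number;
  maxFilesPerRequest?: number;
};
type FileInput = { mimeType: string; data: string };
type ValidationCode =
  | 'TOO_MANY_FILES'
  | 'FILES_TOO_LARGE'
  | 'MODEL_DOES_NOT_SUPPORT_FORMAT'
  | 'FILE_TOO_LARGE_FOR_FORMAT'
  | 'TOO_MANY_FILES_FOR_FORMAT';

const MIB = 1024 * 1024;
const GLOBAL_MAX_FILES = 10;
const GLOBAL_MAX_BYTES = 15 * MIB;

// Size approximation described in the Limits section.
const approxBytes = (data: string): number => Math.floor(data.length * 0.75);

// Hypothetical helper: returns the first violated rule, or null if none.
// Assumption: global count first, then per-file checks, then global bytes.
function findViolation(
  files: FileInput[],
  formats: readonly SupportedFileFormat[],
): ValidationCode | null {
  if (files.length > GLOBAL_MAX_FILES) return 'TOO_MANY_FILES';

  let totalBytes = 0;
  const countByFormat: Record<string, number> = {};

  for (const file of files) {
    const format = formats.find((f) => f.mimeType === file.mimeType);
    if (!format) return 'MODEL_DOES_NOT_SUPPORT_FORMAT';

    const size = approxBytes(file.data);
    if (format.maxBytesPerFile !== undefined && size > format.maxBytesPerFile) {
      return 'FILE_TOO_LARGE_FOR_FORMAT';
    }

    const count = (countByFormat[format.name] ?? 0) + 1;
    countByFormat[format.name] = count;
    if (format.maxFilesPerRequest !== undefined && count > format.maxFilesPerRequest) {
      return 'TOO_MANY_FILES_FOR_FORMAT';
    }

    totalBytes += size;
  }

  return totalBytes > GLOBAL_MAX_BYTES ? 'FILES_TOO_LARGE' : null;
}

// Claude 4.x caps as listed in the Limits table.
const CLAUDE_FORMATS: readonly SupportedFileFormat[] = [
  { name: 'png', mimeType: 'image/png', maxBytesPerFile: 3.75 * MIB },
  { name: 'jpeg', mimeType: 'image/jpeg', maxBytesPerFile: 3.75 * MIB },
  { name: 'pdf', mimeType: 'application/pdf', maxBytesPerFile: 4.5 * MIB, maxFilesPerRequest: 5 },
];

const verdict = findViolation([{ mimeType: 'image/gif', data: 'AAAA' }], CLAUDE_FORMATS);
```

Here verdict is 'MODEL_DOES_NOT_SUPPORT_FORMAT', because image/gif is not among the declared formats. For real requests, the exported validateMultimodalFiles(files, model) remains the check to rely on.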

Trust Layer. Files sent as binary base64 bypass Salesforce Trust Layer protections (toxicity scoring, data masking). If those protections matter for your use case, extract the content client-side and send it as text.

Reusing validation outside the client

chat() / chatStream() run this validation automatically. Integrations that build requests through a different transport (e.g. an agent harness speaking a provider-native protocol) can run the identical pre-flight check via the exported validateMultimodalFiles(files, model) — it throws the same LLMGClientError codes as the client. files only needs { mimeType, data } (the MultimodalFile type), so callers holding partial file shapes don't need to construct a full ChatMessageFile.

import { Models, ModelName, validateMultimodalFiles } from '@salesforce/llm-gateway-sdk';

const model = Models.getByName(ModelName.CLAUDE_SONNET_4_6);
validateMultimodalFiles([{ mimeType: 'image/png', data: base64Png }], model); // throws LLMGClientError on a cap/format violation

Tracing

Subscribe to per-request tracing via onTrace(), or use the convenience startTracing()/stopTracing() methods to write trace events to a Markdown file.

// Callback-based tracing
const offTrace = client.onTrace((event) => {
  if (event.type === TraceType.Response) {
    console.log(`[${event.status}] ${event.totalDurationMs}ms`);
  }
});
offTrace(); // unsubscribe

// File-based tracing
client.startTracing('./trace.md');
// ... make requests ...
client.stopTracing();

type TraceEvent = TraceRequestEvent | TraceResponseEvent;

type TraceRequestEvent = {
  type: TraceType.Request;
  timestamp: Date;
  url: string;
  method: string;
  model: string;
  body: Record<string, unknown>;
};

type TraceResponseEvent = {
  type: TraceType.Response;
  timestamp: Date;
  model: string;
  status: number;
  error?: Error;
  timeToFirstTokenMs?: number;
  usage?: TokenUsage;
  totalDurationMs?: number;
  responseText?: string;
  xClientTraceId?: string;
};

enum TraceType {
  Request = 'request',
  Response = 'response',
  Info = 'info',
}

TraceFileWriter

Lower-level class for manual trace file management:

const writer = new TraceFileWriter(client, { filePath: './trace.md' });
// ... make requests ...
writer.detach();

Telemetry & Structured Logs

LLMGatewayClient exposes two more typed event streams. Consumers subscribe via onTelemetry() and onLog(), both of which return an Unsubscribe closure.

type TelemetryEvent =
  | RequestStartedEvent
  | RequestCompletedEvent
  | RequestFailedEvent
  | RequestRetryEvent
  | JwtRefreshedEvent
  | JwtRefreshFailedEvent;

type RequestStartedEvent = {
  type: 'request-started';
  timestamp: Date;
  url: string;
  method: string;
  model: string;
};

type RequestCompletedEvent = {
  type: 'request-completed';
  timestamp: Date;
  model: string;
  status: number;
  durationMs: number;
  timeToFirstTokenMs?: number;
  usage?: TokenUsage;
};

type RequestFailedEvent = {
  type: 'request-failed';
  timestamp: Date;
  model: string;
  status?: number;
  durationMs: number;
  error: Error;
};

type RequestRetryEvent = {
  type: 'request-retry';
  timestamp: Date;
  model: string;
  attempt: number; // 1-indexed: attempt 1 is emitted before the first retry
  delayMs: number; // jittered or Retry-After-driven wait
  status?: number; // set when the retry was triggered by an HTTP response (429/5xx)
  error?: Error; // set when the retry was triggered by a transport-layer failure
};

type JwtRefreshedEvent = {
  type: 'jwt-refreshed';
  timestamp: Date;
  durationMs: number;
};

type JwtRefreshFailedEvent = {
  type: 'jwt-refresh-failed';
  timestamp: Date;
  durationMs: number;
  error: Error;
};

type LogRecord = {
  level: 'debug' | 'info' | 'warn' | 'error';
  message: string;
  timestamp: Date;
  context?: Record<string, unknown>;
  error?: Error;
};

const offTelemetry = client.onTelemetry((event) => {
  if (event.type === 'request-completed') {
    histogram('llm_request_ms', event.durationMs, { model: event.model });
  }
});

const offLog = client.onLog((record) => {
  logger[record.level](record.context, record.message);
});
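
Because TelemetryEvent is a discriminated union on type, a subscriber can narrow with a switch and keep per-model counters. The sketch below uses trimmed local copies of the event types and a hypothetical createStatsCollector helper; it is one possible consumer, not part of the SDK.

```typescript
// Trimmed local copies of the telemetry event types documented above.
type TelemetryEvent =
  | { type: 'request-started'; model: string }
  | { type: 'request-completed'; model: string; durationMs: number }
  | { type: 'request-failed'; model: string; durationMs: number; error: Error }
  | { type: 'request-retry'; model: string; attempt: number; delayMs: number }
  | { type: 'jwt-refreshed'; durationMs: number }
  | { type: 'jwt-refresh-failed'; durationMs: number; error: Error };

type ModelStats = {
  completed: number;
  failed: number;
  retries: number;
  completedDurationMs: number;
};

// Hypothetical helper: aggregates request outcomes per model.
function createStatsCollector() {
  const byModel: Record<string, ModelStats> = {};

  const statsFor = (model: string): ModelStats => {
    let stats = byModel[model];
    if (!stats) {
      stats = { completed: 0, failed: 0, retries: 0, completedDurationMs: 0 };
      byModel[model] = stats;
    }
    return stats;
  };

  return {
    handle(event: TelemetryEvent): void {
      switch (event.type) {
        case 'request-completed': {
          const stats = statsFor(event.model);
          stats.completed += 1;
          stats.completedDurationMs += event.durationMs;
          break;
        }
        case 'request-failed':
          statsFor(event.model).failed += 1;
          break;
        case 'request-retry':
          statsFor(event.model).retries += 1;
          break;
        default:
          // request-started and JWT events are ignored in this sketch.
          break;
      }
    },
    snapshot: (): Record<string, ModelStats> => byModel,
  };
}

const collector = createStatsCollector();
collector.handle({ type: 'request-started', model: 'demo-model' });
collector.handle({ type: 'request-retry', model: 'demo-model', attempt: 1, delayMs: 80 });
collector.handle({ type: 'request-completed', model: 'demo-model', durationMs: 420 });
collector.handle({ type: 'jwt-refreshed', durationMs: 35 });
```

Wiring it up should be a matter of passing (event) => collector.handle(event) to client.onTelemetry(); 'demo-model' is a placeholder, not a real gateway model id.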

Log messages emitted by the SDK

Each LogRecord carries a context object; the model key is populated on every record whenever a model is set.

| Level | Message | Notable context keys |
| --- | --- | --- |
| debug | Model set to {name} | model |
| debug | Refreshing expired JWT | (none) |
| info | JWT refreshed | durationMs |
| warn | Rate limit hit | model, reset, remaining, limit |
| warn | Unexpected status {code} | model, status |
| warn | SSE parse error | model, chunk, reason |
| warn | Retrying transient request failure | model, attempt, delayMs, status, errorCode |
| warn | Retries exhausted | model, status |

Disposal

dispose() releases event resources: it detaches any active trace writer and clears the trace, telemetry, and log buses. After disposal every consumer-facing method throws LLMGClientError. dispose() itself is idempotent.

In-flight streams returned by a prior chatStream() call are not aborted by dispose(); the async generator keeps yielding until it naturally completes. Use the caller-supplied AbortSignal if you need to cancel an active stream.

SfApiEnv

enum SfApiEnv {
  Dev = 'dev',
  Perf = 'perf',
  Prod = 'prod',
  Stage = 'stage',
  Test = 'test',
}

Retries

By default the client retries transient failures with exponential backoff and full jitter. Retries cover only connection establishment and the initial HTTP response — once the response body starts streaming, errors propagate to the caller.

The SDK retries on:

  • HTTP 429, 500, 502, 503, 504
  • Socket / network errors: ECONNRESET, ECONNREFUSED, ENOTFOUND, ENETDOWN, ENETUNREACH, EHOSTDOWN, ETIMEDOUT, UND_ERR_SOCKET, UND_ERR_CONNECT_TIMEOUT, UND_ERR_HEADERS_TIMEOUT, UND_ERR_BODY_TIMEOUT
  • HTTP/2 GOAWAY (surfaces as UND_ERR_SOCKET)

Retry-After and Salesforce LLMG's x-ratelimit-reset headers are honored when present, used verbatim without jitter or further clamping (the server already specified the wait). The hint is bounded only by maxRetryAfterMs — if the server asks for longer, the SDK does not retry and instead surfaces the response so the caller sees the structured error (rather than the SDK retrying too soon and cascading more 429s). Computed exponential-backoff retries are bounded separately by maxDelayMs; that cap does not apply to server-driven hints.
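
For context on what a server hint looks like: the standard HTTP Retry-After header carries either a number of seconds or an HTTP date. The sketch below converts such a value to milliseconds and applies the maxRetryAfterMs bound described above. parseRetryAfterMs and shouldRetryWithHint are hypothetical names, the format of x-ratelimit-reset is not covered, and how the SDK actually parses these headers is not documented here.

```typescript
// Converts a Retry-After header value to a wait in milliseconds.
// Returns null when the value is neither delay-seconds nor a parseable date.
function parseRetryAfterMs(headerValue: string, nowMs: number): number | null {
  const trimmed = headerValue.trim();

  // Form 1: a non-negative integer number of seconds.
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // Form 2: an HTTP date; the wait is the distance from "now", never negative.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs);
}

// Mirrors the rule in the text: a hint longer than maxRetryAfterMs means
// "do not retry, surface the response" (default taken from RetryOptions).
function shouldRetryWithHint(waitMs: number, maxRetryAfterMs = 60000): boolean {
  return waitMs <= maxRetryAfterMs;
}

const twoMinutes = parseRetryAfterMs('120', Date.now());
const retryAllowed = twoMinutes !== null && shouldRetryWithHint(twoMinutes);
```

In this example a 120-second hint exceeds the 60000 ms default, so retryAllowed is false and the caller would see the structured error instead of a retry.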

maxTotalElapsedMs provides an optional wall-clock ceiling on a single call (default: no deadline). When the next sleep would exceed it, the loop stops retrying and surfaces the most recent error / response, even if maxAttempts hasn't been reached.

JWT-bearing headers are recomputed on every attempt, so a long Retry-After wait never replays a stale token.
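
The delay rules described above can be sketched as one pure function. This is an illustration under stated assumptions, not the SDK's retry loop: nextRetryDelay and RetryState are hypothetical names, "full jitter" is interpreted as a uniform draw between 0 and the exponential ceiling, and the defaults are copied from the RetryOptions doc comments.

```typescript
type RetryPolicy = {
  maxAttempts: number;
  initialDelayMs: number;
  maxDelayMs: number;
  maxRetryAfterMs: number;
  backoffFactor: number;
  maxTotalElapsedMs: number;
};

// Defaults as documented on RetryOptions.
const DEFAULT_POLICY: RetryPolicy = {
  maxAttempts: 3,
  initialDelayMs: 100,
  maxDelayMs: 2000,
  maxRetryAfterMs: 60000,
  backoffFactor: 2,
  maxTotalElapsedMs: Infinity,
};

type RetryState = {
  attemptsMade: number; // attempts already performed, including the first call
  elapsedMs: number; // wall-clock time spent on this call so far
  serverHintMs?: number; // wait requested by the server, if any
};

// Returns the wait before the next attempt, or null when the caller should
// stop retrying and surface the most recent error or response.
function nextRetryDelay(
  state: RetryState,
  policy: RetryPolicy = DEFAULT_POLICY,
  random: () => number = Math.random,
): number | null {
  if (state.attemptsMade >= policy.maxAttempts) return null;

  let delayMs: number;
  if (state.serverHintMs !== undefined) {
    // Server hints are used verbatim, bounded only by maxRetryAfterMs.
    if (state.serverHintMs > policy.maxRetryAfterMs) return null;
    delayMs = state.serverHintMs;
  } else {
    // Exponential ceiling capped by maxDelayMs, then full jitter below it.
    const exponential =
      policy.initialDelayMs * Math.pow(policy.backoffFactor, state.attemptsMade - 1);
    const ceiling = Math.min(policy.maxDelayMs, exponential);
    delayMs = Math.floor(random() * ceiling);
  }

  // Optional wall-clock deadline across all attempts and sleeps.
  if (state.elapsedMs + delayMs > policy.maxTotalElapsedMs) return null;
  return delayMs;
}

// With a fixed "random" of 0.5, the first retry waits half of initialDelayMs.
const firstDelay = nextRetryDelay({ attemptsMade: 1, elapsedMs: 0 }, DEFAULT_POLICY, () => 0.5);
```

Injecting the random source keeps the function deterministic in tests; a real loop would sleep for the returned delay, rebuild headers, and try again.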

Retries do not fire on:

  • 4xx that aren't 429 (400, 401, 403, 404, 422, …)
  • Caller-initiated AbortSignal cancellation
  • Errors after the response body has begun streaming
  • Server hints that exceed maxRetryAfterMs — the response is surfaced instead

type RetryOptions = {
  /** Total attempts including the first. Default: 3. Must be >= 1. To disable retries, pass `retry: false`. */
  maxAttempts?: number;
  /** Initial delay before the first retry, in ms. Default: 100. */
  initialDelayMs?: number;
  /** Maximum delay between *computed* exponential-backoff retries, in ms. Default: 2000. */
  maxDelayMs?: number;
  /** Maximum delay accepted from a server-driven hint (`Retry-After` / `x-ratelimit-reset`), in ms. Default: 60000. */
  maxRetryAfterMs?: number;
  /** Exponential backoff multiplier. Default: 2. */
  backoffFactor?: number;
  /**
   * Hard ceiling on cumulative wall-clock time a single call may spend across all attempts and
   * inter-attempt sleeps, in ms. When the next sleep would push past this deadline the loop bails
   * and surfaces the most recent error / response. Default: `Infinity` (no deadline).
   */
  maxTotalElapsedMs?: number;
};

// Tighten the policy
const client = createLLMGatewayClient({ jwt, retry: { maxAttempts: 5, initialDelayMs: 50 } });

// Disable retries entirely
const noRetry = createLLMGatewayClient({ jwt, retry: false });

// Observe each retry attempt
client.onTelemetry((event) => {
  if (event.type === 'request-retry') {
    // `error` is only set for transport-layer failures, so guard the access
    console.warn(`retry #${event.attempt} after ${event.delayMs}ms`, event.status, event.error?.message);
  }
});

When retries are exhausted, the underlying response is surfaced as a structured LLMGClientError — the messageCode, cause (parsed LLMGError), and formatted rate-limit message are preserved exactly as on the non-retry path.

Error Handling

The client throws LLMGClientError for API errors (4xx/5xx). Rate-limit responses (429) include formatted reset time information.

import { createLLMGatewayClient, LLMGClientError, LLMGClientErrorCode } from '@salesforce/llm-gateway-sdk';

try {
  await client.chat(request);
} catch (err) {
  if (err instanceof LLMGClientError) {
    if (err.code === LLMGClientErrorCode.TooManyFiles) {
      // narrow on a stable client-raised code …
    }
    console.error(`LLMG error [${err.code}]: ${err.message}`);
  }
}

LLMGClientErrorCode enumerates the stable codes the client raises itself (lifecycle + pre-flight multimodal validation). Gateway-side errors (4xx/5xx) carry the upstream errorData.messageCode verbatim instead — those values are an opaque string and are not part of this enum.

Function Calling Example

const response = await client.chat({
  messages: [{ role: 'user', content: 'Get the weather for San Francisco' }],
  generation_settings: { max_tokens: 200 },
  tools: [
    {
      type: 'function',
      function: {
        name: 'get_weather',
        description: 'Get weather for a location',
        parameters: {
          type: 'object',
          properties: { location: { type: 'string' } },
          required: ['location'],
        },
      },
    },
  ],
  tool_config: { mode: 'auto' },
});

if (response.data.toolInvocations) {
  for (const invocation of response.data.toolInvocations) {
    console.log(`Tool: ${invocation.function.name}, Args: ${invocation.function.arguments}`);
  }
}
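
Since arguments always arrives as a JSON string, handling a tool call usually means parsing it, running a local handler, and packaging the result as a 'tool' message. The sketch below shows one way to do that with local types; runTools and ToolHandler are hypothetical names, and using tool_call_id / tool_call_name on the reply is an assumption based on the ChatMessageIn fields, not a confirmed gateway contract.

```typescript
// Local stand-ins for the README types, so the sketch runs without the SDK.
type ToolInvocation = {
  id: string;
  function: { name: string; arguments: string };
};
type ToolMessage = {
  role: 'tool';
  content: string;
  tool_call_id: string;
  tool_call_name: string;
};

// Hypothetical handler signature: parsed arguments in, JSON-serializable result out.
type ToolHandler = (args: Record<string, unknown>) => unknown;

// Runs each invocation through its handler and builds one 'tool' message per call.
// Unknown tools and handler/parse failures are reported as an error payload.
function runTools(
  invocations: ToolInvocation[],
  handlers: Record<string, ToolHandler>,
): ToolMessage[] {
  return invocations.map((invocation): ToolMessage => {
    const { name, arguments: rawArgs } = invocation.function;
    const handler = handlers[name];
    let content: string;

    if (!handler) {
      content = JSON.stringify({ error: `Unknown tool: ${name}` });
    } else {
      try {
        const args = JSON.parse(rawArgs) as Record<string, unknown>;
        content = JSON.stringify(handler(args));
      } catch (err) {
        content = JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
      }
    }

    return { role: 'tool', content, tool_call_id: invocation.id, tool_call_name: name };
  });
}

// Illustrative handler with a hard-coded result.
const handlers: Record<string, ToolHandler> = {
  get_weather: (args) => ({ location: args.location, tempF: 61 }),
};

const toolMessages = runTools(
  [
    { id: 'call_1', function: { name: 'get_weather', arguments: '{"location":"San Francisco"}' } },
    { id: 'call_2', function: { name: 'get_time', arguments: '{}' } },
  ],
  handlers,
);
```

The resulting messages would be appended to the conversation for a follow-up chat() call; whether the assistant turn must also be echoed back with tool_invocations is not shown here.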

Advanced: createJWT

Creates an auto-refreshing JWT from raw credentials. Establishes a connection, mints an initial token (fail-fast on invalid credentials), and returns a self-refreshing JSONWebToken:

import { createJWT } from '@salesforce/llm-gateway-sdk';

const jwt = await createJWT({
  accessToken: 'your-access-token',
  instanceUrl: 'https://your-instance.salesforce.com',
  featureId: 'MyFeature',
});

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| accessToken | string | (none) | Salesforce access token. |
| instanceUrl | string | (none) | Salesforce instance URL. |
| mintingPath? | string | '/ide/auth' | Token minting endpoint path. |
| featureId? | string | LLMG_FEATURE_ID env var or 'VibesService' | Feature identifier sent with requests. |

Advanced: createJWTFromConnection

Creates an auto-refreshing JWT from an existing OrgConnection. Use this when you already have a validated connection and want to avoid the extra network call that createJWT performs to establish one:

import { createJWTFromConnection } from '@salesforce/llm-gateway-sdk';
import type { OrgConnection } from '@salesforce/agentic-common';

const jwt = await createJWTFromConnection(orgConnection, { featureId: 'MyFeature' });

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| orgConnection | OrgConnection | (none) | An already-authenticated org connection. |
| mintingPath? | string | '/ide/auth' | Token minting endpoint path. |
| featureId? | string | LLMG_FEATURE_ID env var or 'VibesService' | Feature identifier sent with requests. |

Advanced: DefaultLLMGatewayClientFactory

For dependency-injection scenarios where a factory interface is needed:

interface LLMGatewayClientFactory {
  create(jwt: JSONWebToken, options?: { env?: SfApiEnv }): LLMGatewayClient;
}

Development

See DEVELOPING.md for build-from-source setup, scripts, E2E testing, and packaging.

See ARCHITECTURE.md for implementation details on the request pipeline, SSE parsing, response processors, and JWT lifecycle.