

OpenRouter TypeScript SDK


A complete, type-safe TypeScript SDK for the OpenRouter API. Node.js only (ESM), with full API coverage, streaming support, and comprehensive error handling.

Features

  • Full API Coverage: Chat completions, streaming, models, providers, credits, analytics
  • Type Safety: Complete TypeScript types for all endpoints and responses
  • Streaming: ReadableStream (low-level) or AsyncIterable (recommended)
  • Advanced Features: Tool calling, structured outputs, multimodal (vision), provider preferences
  • Batch Requests: Execute multiple requests concurrently with rate limiting
  • Validation Helpers: Pre-validate parameters, check model capabilities, truncate messages
  • Reliability: Automatic retry with exponential backoff, timeouts, proper error handling
  • Security: Automatic redaction of sensitive data in logs
  • Logging: Multiple logger implementations (default, silent, formatted)
  • 100% Test Coverage: 92 tests covering all features

Installation

npm install @pierreraby/openrouter-client
# or
pnpm add @pierreraby/openrouter-client
# or
yarn add @pierreraby/openrouter-client

Quick Start

import OpenRouterClient from '@pierreraby/openrouter-client';

const client = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY
});

// Simple chat completion
const response = await client.createChatCompletion({
  model: 'openai/gpt-3.5-turbo',
  messages: [
    { role: 'user', content: 'Hello!' }
  ]
});

console.log(response.choices[0].message.content);

Streaming (Recommended)

// Using AsyncIterable (cleanest approach)
for await (const chunk of client.streamChatCompletion({
  model: 'openai/gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Tell me a story' }]
})) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
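
If you need lower-level control, createChatCompletionStream() returns a ReadableStream instead. A minimal sketch, assuming the stream yields parsed chunk objects (see examples/02-streaming.ts for the exact shape):

// Using ReadableStream (low-level) — assumes parsed chunk objects
const stream = await client.createChatCompletionStream({
  model: 'openai/gpt-3.5-turbo',
  messages: [{ role: 'user', content: 'Tell me a story' }]
});

const reader = stream.getReader();
try {
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    const content = value?.choices?.[0]?.delta?.content;
    if (content) process.stdout.write(content);
  }
} finally {
  reader.releaseLock();
}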

Examples

The examples/ directory contains comprehensive examples for all features:

Basic Usage (01-03)

  • 01-basic-usage.ts: Client initialization, simple chat completion
  • 02-streaming.ts: ReadableStream vs AsyncIterable streaming
  • 03-tool-calls.ts: Function calling with helpers

Advanced Features (04-07)

  • 04-structured-outputs.ts: JSON mode and json_schema
  • 05-multimodal.ts: Vision with images (URL, base64, multiple)
  • 06-provider-preferences.ts: Provider routing, fallbacks, quantization
  • 07-cost-tracking.ts: Cost monitoring with getGeneration(), getCredits()

Production Patterns (08-12)

  • 08-error-handling.ts: Robust error handling strategies
  • 09-retry-backoff.ts: Retry configuration and best practices
  • 10-prompt-caching.ts: Anthropic caching for 90% cost reduction
  • 11-model-capabilities.ts: Discover model features and validate compatibility
  • 12-rate-limits.ts: Monitor rate limits, budgets, and usage

Validation & Optimization (13-16)

  • 13-validation-helpers.ts: Parameter validation, feature checking, message truncation
  • 14-batch-requests.ts: Concurrent batch processing with rate limiting
  • 15-tool-message-validation.ts: Tool message formatting and common validation errors
  • 16-immediate-cost-tracking.ts: Immediate cost tracking via response.usage (recommended)

Run examples with:

tsx examples/01-basic-usage.ts

Configuration

const client = new OpenRouterClient({
  apiKey: string;              // Required: Your OpenRouter API key
  baseURL?: string;            // Default: 'https://openrouter.ai/api/v1'
  timeout?: number;            // Default: 30000 (30s)
  maxRetries?: number;         // Default: 3
  retryDelay?: number;         // Default: 1000 (1s initial delay)
  headers?: Record<string, string>; // Additional headers
  logger?: Logger;             // Custom logger
  logLevel?: LogLevel;         // 'error' | 'warn' | 'info' | 'debug'
});

Recommended Configurations

Development:

const client = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  maxRetries: 1,
  logLevel: 'debug'
});

Production:

const client = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  timeout: 60000,
  maxRetries: 5,
  retryDelay: 2000,
  logLevel: 'error'
});

Advanced Features

Tool Calling (Function Calling)

const tools = [
  {
    type: 'function' as const,
    function: {
      name: 'get_weather',
      description: 'Get current weather',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' }
        },
        required: ['location']
      }
    }
  }
];

// Conversation history (reused when sending tool results back)
const messages = [
  { role: 'user', content: "What's the weather in Paris?" }
];

const response = await client.createChatCompletion({
  model: 'openai/gpt-4o-mini',
  messages,
  tools,
  tool_choice: 'auto'
});

// Parse and execute tool calls
if (response.choices[0].message.tool_calls) {
  // Keep the assistant message (which carries the tool_calls) in the history
  messages.push(response.choices[0].message);

  const parsedCalls = OpenRouterClient.parseToolCalls(
    response.choices[0].message.tool_calls
  );

  for (const call of parsedCalls) {
    // yourFunctions: your own map of tool name -> implementation
    const result = yourFunctions[call.function.name](call.function.arguments);

    // Expects string content; use createToolResponseFromResult for objects
    const toolMessage = OpenRouterClient.createToolResponseMessage(
      call.id,
      result,
      call.function.name
    );
    messages.push(toolMessage);
  }
}
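
With the tool results appended, send the conversation back so the model can produce its final answer. A sketch of this standard round trip (executeToolCalls() can also run the calls for you; see examples/03-tool-calls.ts):

// Second request: the model sees the tool results and answers in natural language
const finalResponse = await client.createChatCompletion({
  model: 'openai/gpt-4o-mini',
  messages
});
console.log(finalResponse.choices[0].message.content);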

Structured Outputs (JSON Schema)

const response = await client.createChatCompletion({
  model: 'openai/gpt-4o-mini',
  messages: [{ role: 'user', content: 'Generate a person profile' }],
  response_format: {
    type: 'json_schema',
    json_schema: {
      name: 'person_profile',
      strict: true,
      schema: {
        type: 'object',
        properties: {
          name: { type: 'string' },
          age: { type: 'number' },
          occupation: { type: 'string' }
        },
        required: ['name', 'age', 'occupation']
      }
    }
  }
});

const person = JSON.parse(response.choices[0].message.content!);
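
Because the model's reply arrives as a string, you can pair the schema with a matching interface to keep type safety downstream (the PersonProfile interface here is illustrative):

interface PersonProfile {
  name: string;
  age: number;
  occupation: string;
}

const profile: PersonProfile = JSON.parse(response.choices[0].message.content!);
console.log(`${profile.name} (${profile.age}) works as ${profile.occupation}`);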

Multimodal (Vision)

const response = await client.createChatCompletion({
  model: 'openai/gpt-4o-mini',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'What is in this image?' },
        {
          type: 'image_url',
          image_url: {
            url: 'https://example.com/image.jpg',
            detail: 'high'
          }
        }
      ]
    }
  ]
});
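
Base64-encoded images use the same message shape with a data URL, following the OpenAI-compatible format. A small sketch (the file path is illustrative; see examples/05-multimodal.ts):

import { readFileSync } from 'node:fs';

const imageBase64 = readFileSync('photo.jpg').toString('base64');

const response = await client.createChatCompletion({
  model: 'openai/gpt-4o-mini',
  messages: [
    {
      role: 'user',
      content: [
        { type: 'text', text: 'Describe this photo' },
        {
          type: 'image_url',
          image_url: { url: `data:image/jpeg;base64,${imageBase64}` }
        }
      ]
    }
  ]
});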

Cost Tracking

// Get account credits
const credits = await client.getCredits();
console.log(`Remaining: $${credits.total_credits - credits.total_usage}`);

// Track specific generation (⚠️ NOT immediately available - see note below)
const response = await client.createChatCompletion({ /* ... */ });
const stats = await client.getGeneration(response.id);
console.log(`Cost: $${stats.total_cost}`);

// ⚠️ RECOMMENDED: use response.usage for immediate cost tracking
if (response.usage) {
  console.log(`Prompt tokens: ${response.usage.prompt_tokens}`);
  console.log(`Completion tokens: ${response.usage.completion_tokens}`);
  console.log(`Total tokens: ${response.usage.total_tokens}`);
  // Calculate approximate cost based on model pricing
}

// Estimate before request
const messages = [/* ... */];
const estimatedTokens = client.countMessagesTokens(messages);
console.log(`Estimated tokens: ${estimatedTokens}`);

Note: getGeneration() statistics are not immediately available after a request completes. OpenRouter needs time to process them. For real-time cost tracking, use response.usage instead (see example 16).
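
To turn the token estimate into a rough dollar figure before sending, you can combine it with the pricing returned by getModelCapabilities(). A sketch, assuming pricing is expressed per 1K tokens (as in the Model Capabilities section below); verify the unit for your SDK version before relying on it:

// Rough pre-flight cost estimate (assumes caps.pricing is per 1K tokens)
const caps = await client.getModelCapabilities('openai/gpt-3.5-turbo');
const estimatedCost = (estimatedTokens * caps.pricing.prompt) / 1000;
console.log(`Estimated prompt cost: ~$${estimatedCost.toFixed(5)}`);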

Prompt Caching (Anthropic)

Reduce costs by up to 90% by caching portions of your prompts with Anthropic's Claude models:

// Mark system prompt as cacheable (must be >1024 tokens for Claude 3.5 Sonnet)
const systemPrompt = OpenRouterClient.markMessageAsCacheable({
  role: 'system',
  content: 'Long instructions, examples, or context that will be reused...' // >1024 tokens
});

// First call: cache creation (10% surcharge)
const response1 = await client.createChatCompletion({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    systemPrompt,
    { role: 'user', content: 'First question' }
  ],
  usage: { include: true }  // ✅ Get detailed cache metrics
});

// Second call: cache hit (90% discount)
const response2 = await client.createChatCompletion({
  model: 'anthropic/claude-3.5-sonnet',
  messages: [
    systemPrompt,
    { role: 'user', content: 'Second question' }
  ],
  usage: { include: true }
});

// Track cache performance (real-time)
console.log('Cached tokens:', response2.usage?.prompt_tokens_details?.cached_tokens);
// Output: 1668 (90% discount on these tokens!)

// Or track via generation ID (async, more accurate)
const stats = await client.getGeneration(response2.id);
console.log('Cache discount:', stats.cache_discount); // e.g., 0.0045036 ($)
console.log('Native cached tokens:', stats.native_tokens_cached); // e.g., 1668

Two methods to track cache metrics:

  1. Real-time with usage: { include: true } (recommended for development)

    • Returns prompt_tokens_details.cached_tokens in response
    • Adds ~200ms latency to final response
    • Best for debugging and real-time monitoring
  2. Async with getGeneration(id) (recommended for production)

    • Returns cache_discount (actual $ savings) and native_tokens_cached
    • No latency impact on responses
    • Best for cost analytics and reporting

Requirements:

  • Minimum 1024 tokens for Claude 3.7/3.5 Sonnet and 3 Opus
  • Minimum 2048 tokens for Claude 3.5/3 Haiku
  • Cache expires after 5 minutes of inactivity

Best practices:

  • Cache stable content (system prompts, reference docs, examples)
  • Don't cache dynamic content (user messages, real-time data)
  • Use provider-specific models (e.g., anthropic/claude-3.5-sonnet)
  • See examples/10-prompt-caching.ts for complete examples with both tracking methods

Model Capabilities Discovery

Automatically discover what features a model supports before using it:

const caps = await client.getModelCapabilities('anthropic/claude-3.5-sonnet');

// Check capabilities
if (caps.supportsVision) {
  // Can send images
}
if (caps.supportsTools) {
  // Can use function calling
}
if (caps.supportsJSON) {
  // Can use response_format
}

// Access detailed info
console.log('Context length:', caps.maxContextLength);
console.log('Input modalities:', caps.inputModalities); // ['text', 'image']
console.log('Supported params:', caps.supportedParameters);
console.log('Pricing:', caps.pricing); // { prompt: 0.003, completion: 0.015 }

Use cases:

  • Validate model compatibility before requests
  • Build dynamic UIs that adapt to model capabilities
  • Auto-select the best model for your needs (see the sketch below)
  • See examples/11-model-capabilities.ts for advanced patterns
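
For example, a minimal auto-selection sketch built on supportsFeature(), picking the first candidate model that can handle images (the candidate list is illustrative):

// Pick the first model from a candidate list that supports vision
const candidates = [
  'openai/gpt-4o-mini',
  'anthropic/claude-3.5-sonnet',
  'openai/gpt-3.5-turbo'
];

let chosen: string | undefined;
for (const id of candidates) {
  if (await client.supportsFeature(id, 'vision')) {
    chosen = id;
    break;
  }
}
console.log('Selected model:', chosen ?? 'none support vision');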

Rate Limits & Usage Monitoring

Track your API usage, budgets, and rate limits in real-time:

// Get detailed key information
const keyInfo = await client.getKeyInfo();
console.log('Usage:', keyInfo.usage);
console.log('Limit:', keyInfo.limit || 'Unlimited');
console.log('Free tier:', keyInfo.is_free_tier);
if (keyInfo.rate_limit) {
  console.log(`${keyInfo.rate_limit.requests} requests per ${keyInfo.rate_limit.interval}`);
}

// Get credits with current rate limit status
const credits = await client.getCredits();
console.log('Credits remaining:', credits.total_credits - credits.total_usage);
if (credits.rate_limit) {
  console.log('Requests remaining:', credits.rate_limit.remaining);
  console.log('Resets at:', new Date(credits.rate_limit.reset * 1000));
}

Benefits:

  • Prevent 429 errors with proactive throttling (see the sketch below)
  • Monitor budget usage in real-time
  • Set up alerts before hitting limits
  • See examples/12-rate-limits.ts for monitoring patterns
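
A minimal throttling sketch built on getCredits(), sleeping until the window resets when few requests remain (the threshold is illustrative; field names follow the example above):

// Wait out the rate-limit window when close to exhaustion
async function throttleIfNeeded(client: OpenRouterClient) {
  const credits = await client.getCredits();
  if (credits.rate_limit && credits.rate_limit.remaining < 2) {
    const resetMs = credits.rate_limit.reset * 1000 - Date.now();
    if (resetMs > 0) {
      console.warn(`Near rate limit, sleeping ${resetMs}ms`);
      await new Promise((resolve) => setTimeout(resolve, resetMs));
    }
  }
}

await throttleIfNeeded(client);
const response = await client.createChatCompletion({ /* ... */ });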

Validation Helpers

Pre-validate requests before sending them to save costs and avoid errors:

// Check if a model supports a specific feature
const supportsVision = await client.supportsFeature(
  'anthropic/claude-3.5-sonnet',
  'vision'
);

if (!supportsVision) {
  console.log('This model cannot process images');
}

// Validate parameters against model capabilities
const validation = await client.validateParams('openai/gpt-3.5-turbo', {
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
  tools: [/* ... */],
  max_tokens: 5000
});

if (!validation.valid) {
  console.error('Errors:', validation.errors);
  // Example: ["Model doesn't support streaming", "max_tokens exceeds limit"]
}

if (validation.warnings?.length) {
  console.warn('Warnings:', validation.warnings);
  // Example: ["max_tokens is high and may be expensive"]
}

// Truncate conversation to fit context window
const longConversation = [
  { role: 'system', content: 'You are helpful' },
  // ... 50+ messages
];

const truncated = client.truncateMessages(longConversation, 4000);
// Keeps system message + most recent messages that fit in 4000 tokens

Benefits:

  • Validate before spending credits on invalid requests
  • Prevent errors for unsupported features
  • Auto-truncate long conversations (FIFO, preserves system message)
  • See examples/13-validation-helpers.ts for complete workflows

Batch Requests

Execute multiple chat completion requests concurrently with automatic rate limiting:

// Prepare multiple requests
const requests = [
  {
    model: 'openai/gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Translate "hello" to French' }]
  },
  {
    model: 'openai/gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Translate "hello" to Spanish' }]
  },
  {
    model: 'openai/gpt-3.5-turbo',
    messages: [{ role: 'user', content: 'Translate "hello" to German' }]
  }
];

// Execute with concurrency control
const results = await client.batchChatCompletion(requests, {
  maxConcurrent: 5,      // Max 5 concurrent requests (default)
  stopOnError: false     // Continue on errors (default)
});

// Process results
results.forEach((result, idx) => {
  if (result.success && result.response) {
    console.log(`Request ${idx}:`, result.response.choices[0].message.content);
  } else {
    console.error(`Request ${idx} failed:`, result.error?.message);
  }
});

Options:

  • maxConcurrent: Limit concurrent requests (default: 5)
  • stopOnError: Stop on first error (default: false)

Benefits:

  • 2-5x faster than sequential requests
  • Automatic concurrency control
  • Individual error handling per request
  • See examples/14-batch-requests.ts for advanced patterns

Error Handling

import { OpenRouterError } from '@pierreraby/openrouter-client';

try {
  const response = await client.createChatCompletion({ /* ... */ });
} catch (error) {
  if (error instanceof OpenRouterError) {
    console.error('OpenRouter Error:', {
      message: error.message,
      status: error.status,
      code: error.code,
      requestId: error.requestId
    });
    
    if (error.status === 429) {
      // Handle rate limit
    } else if (error.status && error.status >= 500) {
      // Handle server error
    }
  }
}

Logging

import { formattedLogger, createLogger, silentLogger } from '@pierreraby/openrouter-client';

// Formatted logger with timestamps and colors
const formattedClient = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  logger: formattedLogger,
  logLevel: 'info'
});

// Custom prefixed logger
const prefixedClient = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  logger: createLogger('MyApp'),
  logLevel: 'debug'
});

// Silent logger (no output)
const silentClient = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  logger: silentLogger
});
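
You can also supply your own logger. A hypothetical sketch, assuming the Logger interface exposes one method per log level (error, warn, info, debug); check the TypeDoc for the exact signature:

// Hypothetical custom logger — verify the Logger interface in the TypeDoc
const jsonLogger = {
  error: (...args: unknown[]) => console.error(JSON.stringify({ level: 'error', args })),
  warn: (...args: unknown[]) => console.warn(JSON.stringify({ level: 'warn', args })),
  info: (...args: unknown[]) => console.info(JSON.stringify({ level: 'info', args })),
  debug: (...args: unknown[]) => console.debug(JSON.stringify({ level: 'debug', args }))
};

const jsonClient = new OpenRouterClient({
  apiKey: process.env.OPENROUTER_API_KEY!,
  logger: jsonLogger
});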

API Reference

📚 Complete API Documentation (TypeDoc)

See docs/INDEX.md for architectural decisions and contribution guidelines.

Main Methods

Chat Completions:

  • createChatCompletion(params) - Standard chat completion
  • streamChatCompletion(params) - Streaming with AsyncIterable (recommended)
  • createChatCompletionStream(params) - Streaming with ReadableStream
  • batchChatCompletion(requests, options?) - Execute multiple requests concurrently

Models & Providers:

  • listModels() - Get available models
  • getModel(id) - Get model details
  • getModelEndpoints(id) - Get model endpoints
  • getModelCapabilities(id) - Get detailed model capabilities
  • listProviders() - Get available providers

Account & Usage:

  • getCredits() - Get account credits (with rate limits)
  • getKeyInfo() - Get API key information and limits
  • getActivity() - Get activity analytics
  • getGeneration(id) - Get generation statistics

Validation & Helpers:

  • supportsFeature(modelId, feature) - Check if model supports a feature
  • validateParams(modelId, params) - Validate parameters against model
  • truncateMessages(messages, maxTokens) - Truncate messages to fit context
  • countTokens(text) - Estimate tokens in text
  • countMessagesTokens(messages) - Estimate tokens in messages
  • validateApiKey() - Validate API key

Static Helpers

  • OpenRouterClient.parseToolCalls(toolCalls) - Parse tool calls
  • OpenRouterClient.createToolResponseMessage(id, content, name?) - Create tool response (requires string content)
  • OpenRouterClient.createToolResponseFromResult(id, result, name?) - Create tool response from any object (auto-serializes)
  • OpenRouterClient.executeToolCalls(toolCalls, functions) - Execute tool calls
  • OpenRouterClient.markMessageAsCacheable(message) - Mark message for caching

Development

# Install dependencies
pnpm install

# Run tests
pnpm test

# Run tests in watch mode
pnpm test:watch

# Build
pnpm build

# Lint
pnpm lint

# Format
pnpm format

Requirements

  • Node.js 22.x LTS or later (native fetch support)
  • TypeScript 5.9.x or later
  • ESM only (no CommonJS)

License

MIT

Contributing

See docs/INDEX.md for contribution guidelines and architecture decisions.