lunar-sdk

v0.2.0

Published

13 days ago

diogovieira

llm ai openai lunar-api lunar inference fallback streaming

Lunar SDK

TypeScript SDK for Lunar LLM Inference API - OpenAI-compatible with intelligent fallbacks.

Installation

npm install lunar

Quick Start

import { Lunar } from "lunar";

// Initialize client (uses LUNAR_API_KEY env var)
const client = new Lunar();

// Chat completion
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);
console.log(`Cost: $${response.usage?.total_cost_usd}`);

Authentication

Set your API key via environment variable:

export LUNAR_API_KEY="your-api-key"

Or pass it directly:

const client = new Lunar({ apiKey: "your-api-key" });

JWT authentication is also supported:

const client = new Lunar({ jwt: "eyJhbG..." });
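
The three options above can be combined into a single lookup that fails early when no credential is available. The sketch below is a minimal illustration, not the SDK's own logic: `resolveCredentials` and `Credentials` are hypothetical names, and the precedence (explicit `apiKey`, then `jwt`, then the `LUNAR_API_KEY` variable) is an assumption.

```typescript
// Hypothetical result type; the SDK's internal representation is not documented here.
type Credentials =
  | { kind: "apiKey"; value: string }
  | { kind: "jwt"; value: string };

// Picks a credential from explicit options first, then from the environment.
// The precedence order is an assumption made for this sketch.
function resolveCredentials(
  options: { apiKey?: string; jwt?: string },
  env: Record<string, string | undefined>,
): Credentials {
  if (options.apiKey) return { kind: "apiKey", value: options.apiKey };
  if (options.jwt) return { kind: "jwt", value: options.jwt };
  const fromEnv = env["LUNAR_API_KEY"];
  if (fromEnv) return { kind: "apiKey", value: fromEnv };
  throw new Error("No credentials: set LUNAR_API_KEY or pass apiKey or jwt");
}
```

In a Node.js application, the second argument would typically be `process.env`.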

Features

Chat Completions

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" },
  ],
});

console.log(response.choices[0].message.content);

Streaming

Stream responses token by token:

const stream = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Write a short story" }],
  stream: true,
});

for await (const chunk of stream) {
  // Some chunks may carry no text (for example, a role-only or final chunk)
  const token = chunk.choices[0]?.delta?.content;
  if (token) {
    process.stdout.write(token);
  }
}
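
When the complete text is needed after streaming finishes, the chunks can be accumulated as they arrive. The sketch below is a minimal example that assumes chunks have the shape used in the loop above; `ChunkLike`, `collectText`, and `fakeStream` are hypothetical names for illustration and are not part of the SDK.

```typescript
// Minimal chunk shape, assumed from the streaming loop above (not the SDK's own type).
interface ChunkLike {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Accumulates the text of every chunk and optionally forwards each piece to a callback.
async function collectText(
  stream: AsyncIterable<ChunkLike>,
  onToken?: (token: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content;
    if (token) {
      text += token;
      onToken?.(token);
    }
  }
  return text;
}

// Hypothetical stand-in for a real stream, so the sketch runs without network access.
async function* fakeStream(): AsyncGenerator<ChunkLike> {
  yield { choices: [{ delta: { content: "Once " } }] };
  yield { choices: [{ delta: {} }] }; // a chunk without text is skipped
  yield { choices: [{ delta: { content: "upon a time" } }] };
}
```

With a real stream, passing `(t) => process.stdout.write(t)` as the callback would print tokens as they arrive while still returning the full text at the end.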

Text Completions

const response = await client.completions.create({
  model: "gpt-4o-mini",
  prompt: "Once upon a time",
  max_tokens: 100,
});

console.log(response.choices[0].text);

Fallbacks

Lunar falls back to alternative models when the primary model fails with an infrastructure error (5xx, 429 rate limit, connection error, or timeout). Fallbacks can be set per request or globally in the client config.

// Per-request fallbacks
const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
  fallbacks: ["claude-3-haiku", "llama-3.1-8b"],
});

// Global fallbacks via config
const client = new Lunar({
  fallbacks: {
    "gpt-4o-mini": ["claude-3-haiku", "llama-3.1-8b"],
    "gpt-4": ["claude-3-opus"],
  },
});

Fallback behavior:

Triggers fallback: 5xx errors, 429 rate limit, connection errors, timeouts
Does NOT trigger fallback: 400 bad request, 401 auth error, 403 forbidden (client errors won't be fixed by another model)
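
The rules above can be expressed as a small predicate plus a loop over the model list. This is a sketch of the documented behavior under stated assumptions, not the SDK's implementation: `Failure`, `Attempt`, `shouldFallback`, and `tryModels` are hypothetical names, and treating every 4xx status other than 429 as non-retriable is an assumption (the list above names only 400, 401, and 403).

```typescript
// Hypothetical failure description used only by this sketch.
type Failure =
  | { kind: "http"; status: number }
  | { kind: "connection" }
  | { kind: "timeout" };

type Attempt<T> = { ok: true; value: T } | { ok: false; failure: Failure };

// True for 5xx, 429, connection errors, and timeouts; false for other client errors.
function shouldFallback(failure: Failure): boolean {
  if (failure.kind !== "http") return true;
  return failure.status === 429 || (failure.status >= 500 && failure.status <= 599);
}

// Tries the primary model, then each fallback in order.
// Stops at the first success or at the first failure that another model would not fix.
async function tryModels<T>(
  models: string[],
  attempt: (model: string) => Promise<Attempt<T>>,
): Promise<Attempt<T>> {
  if (models.length === 0) throw new Error("at least one model is required");
  let last: Attempt<T> | undefined;
  for (const model of models) {
    last = await attempt(model);
    if (last.ok || !shouldFallback(last.failure)) return last;
  }
  return last as Attempt<T>; // every model failed with a retriable error
}
```

Here `models` would hold the primary model followed by its fallbacks, for example `["gpt-4o-mini", "claude-3-haiku", "llama-3.1-8b"]`.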

Force Provider

Prefix the model name with the provider name and a slash to force that provider:

const response = await client.chat.completions.create({
  model: "openai/gpt-4o-mini", // Forces OpenAI provider
  messages: [{ role: "user", content: "Hello!" }],
});
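
Client code that logs or validates model names may want to split a prefixed name into its parts. The sketch below assumes the `provider/model` form shown above, with the first slash as the separator; `ModelRef` and `parseModelRef` are hypothetical helpers, and how the API itself parses the string is not documented here.

```typescript
// Hypothetical parsed form of a model string.
interface ModelRef {
  provider?: string;
  model: string;
}

// Splits "provider/model" at the first slash; a string without a usable prefix
// is returned unchanged as the model name.
function parseModelRef(id: string): ModelRef {
  const slash = id.indexOf("/");
  if (slash <= 0 || slash === id.length - 1) return { model: id };
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```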

List Models and Providers

// List available models
const models = await client.models.list();
for (const model of models) {
  console.log(`${model.id} (owned by ${model.owned_by})`);
}

// List providers for a model
const providers = await client.providers.list("gpt-4o-mini");
for (const provider of providers) {
  console.log(`${provider.id}: ${provider.type} (enabled: ${provider.enabled})`);
}

Cost Tracking

Every response reports token counts, cost, and latency in its usage field:

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(`Input tokens: ${response.usage?.prompt_tokens}`);
console.log(`Output tokens: ${response.usage?.completion_tokens}`);
console.log(`Input cost: $${response.usage?.input_cost_usd}`);
console.log(`Output cost: $${response.usage?.output_cost_usd}`);
console.log(`Total cost: $${response.usage?.total_cost_usd}`);
console.log(`Latency: ${response.usage?.latency_ms}ms`);
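
To track spend across several requests, the per-response figures can be summed. The sketch below assumes only the usage fields logged above; `UsageLike`, `UsageTotals`, and `sumUsage` are hypothetical names, not SDK exports.

```typescript
// Assumed subset of the usage object, based on the fields logged above.
interface UsageLike {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_cost_usd?: number;
}

interface UsageTotals {
  requests: number;
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
}

// Sums usage across responses; a missing usage object or field counts as zero.
function sumUsage(usages: Array<UsageLike | undefined>): UsageTotals {
  const totals: UsageTotals = { requests: 0, promptTokens: 0, completionTokens: 0, costUsd: 0 };
  for (const usage of usages) {
    totals.requests += 1;
    totals.promptTokens += usage?.prompt_tokens ?? 0;
    totals.completionTokens += usage?.completion_tokens ?? 0;
    totals.costUsd += usage?.total_cost_usd ?? 0;
  }
  return totals;
}
```

A call such as `sumUsage([first.usage, second.usage])` returns the combined totals. Because costs are floating-point numbers, long-running totals can accumulate small rounding errors, so round only when displaying the result.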

Configuration

const client = new Lunar({
  apiKey: "your-api-key",                // Or use LUNAR_API_KEY env var
  baseUrl: "https://api.lunar-sys.com",  // Custom API endpoint
  timeout: 60000,                        // Request timeout in ms
  numRetries: 3,                         // Retries for transient errors
  fallbacks: {                           // Global fallback configuration
    "gpt-4o-mini": ["claude-3-haiku"],
  },
});

Error Handling

import {
  Lunar,
  LunarError,
  APIError,
  BadRequestError,
  AuthenticationError,
  RateLimitError,
  ServerError,
} from "lunar";

const client = new Lunar();

try {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  });
} catch (e) {
  if (e instanceof BadRequestError) {
    console.log(`Invalid request: ${e}`);
  } else if (e instanceof AuthenticationError) {
    console.log(`Auth failed: ${e}`);
  } else if (e instanceof RateLimitError) {
    console.log(`Rate limited: ${e}`);
  } else if (e instanceof ServerError) {
    console.log(`Server error: ${e}`);
  } else if (e instanceof APIError) {
    console.log(`API error [${e.statusCode}]: ${e}`);
  } else if (e instanceof LunarError) {
    console.log(`Lunar error: ${e}`);
  } else {
    throw e; // Not an SDK error: rethrow instead of swallowing it
  }
}

License

MIT