nx-ai

Zero-config, local-first AI gateway. Works like @ai-sdk/gateway but with cost tracking, fallbacks, caching, rate limiting, logging, and 120+ providers out of the box.


✨ Features

  • 🚀 120+ Providers - Support for OpenAI, Anthropic, Groq, and 120+ more via LiteLLM
  • 🔒 Type Safety - TypeScript ENUM types for providers and models (autocomplete, no typos)
  • 📝 Structured Prompts - Separate instructions, prompt, and context for RAG and few-shot learning
  • Auto-Validation - API key validation on initialization with detailed reports
  • 🏥 Health Checks - Built-in provider health monitoring
  • 🐛 Rich Diagnostics - Comprehensive error messages with troubleshooting guides
  • 💰 Cost Tracking - Automatic cost calculation for all requests with reasoning token support
  • 🔄 Fallbacks - Automatic fallback to backup models for reliability
  • 💾 Caching - Optional Redis caching for responses to reduce costs
  • 📊 Logging - Optional MongoDB logging via Xronox for audit trails
  • 📈 Telemetry - Optional open-source telemetry via Langfuse (self-hostable)
  • 🔁 Retries - Configurable retry logic for resilience
  • 🌊 Streaming - Full support for streaming responses
  • 🛠️ TypeScript - Full TypeScript support with type definitions
  • 🏷️ Metadata Tracking - JobId, AgentId, TaskType, TaskId for correlation
  • 🎯 Tool/Function Calling - Unified interface with automatic provider mapping
  • 📋 Structured Outputs - JSON mode and JSON Schema support with Zod validation
  • 🧠 Reasoning Modes - Unified configuration for o1/o3, Claude, Gemini reasoning models
  • Rate Limiting - Smart rate limiting with queue, auto-detection, and fallback
  • 📦 Batch Execution - Native batch support with provider batch API optimization
  • 🔗 MCP Support - Model Context Protocol server discovery and tool routing
  • 🎨 Smart Model Selector - Task-aware routing with preference-based model selection
  • 📊 Model Registry - Comprehensive model capabilities and scoring matrix
  • 💬 Routing Explanations - Human-readable explanations for model selection decisions

📋 Changelog

v2.0.7 (Latest)

  • Added: Support for GPT-5 series models - gpt-5, gpt-5-pro, gpt-5-mini, gpt-5-nano
  • Added: Support for GPT-5.1 series models - gpt-5.1-pro, gpt-5.1-chat-latest
  • Added: Support for O-series reasoning models - o3-pro, o3-deep-research, o4-mini-deep-research
  • 🔧 Improved: Model registry now supports provider-organized JSON structure (matches matrix structure)
  • 🔧 Improved: Scoring matrix now loads from metadata/models.matrix.json as primary source
  • Fixed: Registry code now properly handles both old (flat) and new (provider-organized) JSON formats
  • Fixed: All new models added to both features.json and matrix.json with proper scoring

v2.0.6

  • Added: Support for 8 additional model provider enums - DeepSeek, XAI, Amazon, Qwen, Perplexity, Together, Fireworks, and Ollama
  • Added: Type-safe overloads for all new model enums in both provider+model and model-only formats
  • 🔧 Improved: Model registry now loads from metadata/models.features.json as primary source (more reliable than markdown extraction)
  • 🔧 Improved: Added missing providers (XAI, AMAZON, QWEN, VOYAGE) to provider enum
  • 📚 Added: Comprehensive documentation explaining why structured prompts need conversion (docs/WHY_CONVERT_STRUCTURED_PROMPTS.md)
  • Fixed: Enum detection logic now includes all new model enums for proper runtime detection
  • Fixed: Provider inference logic updated to handle all new model enums with correct mappings

v2.0.5

  • 🐛 Fixed: ENUM runtime detection - Fixed detection logic to check enum arrays instead of typeof checks (TypeScript string enums are strings at runtime)
  • ✅ ENUM types now work correctly: nxai(OpenAIModel.GPT_4O) and nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O) both work
  • 🔧 Added: convertStructuredPrompt() helper function to convert structured prompts to messages format (required due to AI SDK validation)

v2.0.3

  • 🐛 Fixed: ENUM runtime support - TypeScript ENUM types now work correctly at runtime
  • 🐛 Fixed: Structured prompts runtime - Structured prompts now work with generateText() and streamText()
  • 🐛 Fixed: TypeScript type definitions - Added module augmentation to support StructuredPrompt in AI SDK functions
  • ✅ All critical bugs from v2.0.2 have been resolved

v2.0.2

  • Initial release with full feature set

📦 Installation

npm install nx-ai

Peer Dependencies:

  • ai >= 3.0.0 (Vercel AI SDK)
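
If the AI SDK is not already in your project, install it alongside nx-ai:

npm install nx-ai ai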

🚀 Quick Start

Basic Usage

Type-Safe with ENUMs (Recommended)

import { nxai, NxAiProvider, OpenAIModel, AnthropicModel } from 'nx-ai';
import { generateText } from 'ai';

// ✅ Type-safe: provider + model
const model1 = nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O);
const model2 = nxai(NxAiProvider.ANTHROPIC, AnthropicModel.CLAUDE_3_5_SONNET);

// ✅ Type-safe: model-only (provider inferred)
const model3 = nxai(OpenAIModel.GPT_4O);
const model4 = nxai(AnthropicModel.CLAUDE_3_5_HAIKU);

// ✅ Autocomplete works - IDE shows all available models
const { text, usage } = await generateText({ 
  model: model1, 
  prompt: 'Hello, world!' 
});

console.log(text);
console.log('Job ID:', model1.jobId); // Auto-generated UUID

String-Based (Backward Compatible)

import { nxai } from 'nx-ai';
import { generateText } from 'ai';

// ✅ Still works - backward compatible
const model = nxai('openai/gpt-4o');
const { text, usage } = await generateText({ 
  model, 
  prompt: 'Hello, world!' 
});

console.log(text);
console.log('Job ID:', model.jobId); // Auto-generated UUID

Streaming

import { nxai } from 'nx-ai';
import { streamText } from 'ai';

const model = nxai('openai/gpt-4o');
const result = await streamText({
  model,
  prompt: 'Write a story...',
});

for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

📚 Documentation

Initialization

Initialize nx-ai with API keys, logging, and/or telemetry:

import { initNxAi, closeNxAi } from 'nx-ai';

// Initialize with API keys (recommended for production)
await initNxAi({
  // API keys for ANY provider supported by LiteLLM (120+ providers)
  apiKeys: {
    openai: 'sk-proj-...',        // OpenAI
    anthropic: 'sk-ant-...',      // Anthropic
    google: '...',                 // Google/Gemini
    groq: '...',                  // Groq
    cohere: '...',               // Cohere
    mistral: '...',              // Mistral
    // ... ANY other LiteLLM-supported provider
  },
  // Optional: Logging
  logging: {
    enabled: true,
    mongoUri: 'mongodb://localhost:27017',
    dbName: 'nx-ai-db',
  },
  // Optional: Telemetry
  telemetry: {
    enabled: true,
    type: 'langfuse',
    publicKey: 'pk-...',
    secretKey: 'sk-...',
    host: 'http://localhost:3000', // Self-hosted Langfuse
  },
});

// ... use nx-ai ...

// Cleanup when done
await closeNxAi();

Note: If you don't call initNxAi(), nx-ai falls back to environment variables (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY). However, the provider must still be specified explicitly when calling nxai().

Type-Safe Model Selection with ENUMs

Recommended: Use TypeScript ENUMs for type safety and autocomplete:

import { nxai, OpenAIModel, AnthropicModel, GoogleModel, GroqModel, NxAiProvider } from 'nx-ai';

// ✅ Type-safe with ENUMs (recommended)
const model1 = nxai(OpenAIModel.GPT_4O);
const model2 = nxai(AnthropicModel.CLAUDE_3_5_SONNET);
const model3 = nxai(GoogleModel.GEMINI_1_5_PRO);
const model4 = nxai(GroqModel.LLAMA_3_70B);

// ✅ Provider/model format with ENUMs
const model5 = nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O);

Backward Compatible: String format still works:

// ✅ CORRECT: Provider explicit in format
const model1 = nxai('openai/gpt-4o');

// ✅ CORRECT: Provider in options
const model2 = nxai('gpt-4o', { provider: 'openai' });

// ❌ INCORRECT: Provider not specified
const model3 = nxai('gpt-4o'); // Error: Provider must be specified

Model Options

const model = nxai('openai/gpt-4o', {
  // REQUIRED: Explicitly specify provider (if not in model name format)
  provider: 'openai',
  // Optional: Override API key for this specific model instance
  apiKey: 'sk-proj-user-specific-key',
  // Fallback models (tried in order if primary fails)
  fallback: ['openai/gpt-3.5-turbo', 'anthropic/claude-3-haiku'],
  
  // Caching
  cache: true,
  redisUrl: 'redis://localhost:6379',
  
  // Retries
  retries: 3,
  
  // Custom metadata for tracking
  jobId: 'custom-job-123',      // Custom job ID (auto-generated if not provided)
  agentId: 'agent-1',            // Agent identifier
  taskType: 'classification',   // Task type
  taskId: 'task-456',            // Task identifier
});

Logging (MongoDB via Xronox)

All requests are logged to the ai-activities collection with full metadata:

await initNxAi({
  logging: {
    enabled: true,
    mongoUri: 'mongodb://localhost:27017',
    dbName: 'nx-ai-db',
    // Optional: S3 for file storage
    s3: {
      endpoint: 'https://s3.amazonaws.com',
      accessKeyId: 'your-access-key',
      secretAccessKey: 'your-secret-key',
      bucket: 'your-bucket-name',
    },
    // Optional: Engine selection
    engine: 'xronoxCore', // or 'nxMongo'
  },
});

Logged Data:

  • jobId, agentId, taskType, taskId
  • startTime, endTime, duration
  • params, request, response
  • cost, status, error (if failed)
  • engine, provider, model
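
For reference, a logged document might look roughly like this (shape and values are hypothetical; the fields mirror the list above, with the params/request/response payloads omitted for brevity):

{
  jobId: 'user-123-request-456',
  agentId: 'customer-support-bot',
  taskType: 'ticket-resolution',
  taskId: 'ticket-789',
  startTime: '2025-01-01T12:00:00.000Z',
  endTime: '2025-01-01T12:00:01.250Z',
  duration: 1250, // ms
  engine: 'xronoxCore',
  provider: 'openai',
  model: 'gpt-4o',
  cost: 0.0042, // USD
  status: 'success',
}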

Telemetry (Langfuse)

Track all requests in your Langfuse dashboard:

await initNxAi({
  telemetry: {
    enabled: true,
    type: 'langfuse',
    publicKey: process.env.LANGFUSE_PUBLIC_KEY,
    secretKey: process.env.LANGFUSE_SECRET_KEY,
    host: 'http://localhost:3000', // Self-hosted or cloud.langfuse.com
    projectId: 'my-project', // Optional
  },
});

Features:

  • Full request/response tracing
  • Streaming support with real-time chunks
  • Usage tracking
  • Cost tracking
  • Error tracking
  • Self-hostable (MIT license)

Fallbacks

Automatically try backup models if primary fails:

const model = nxai('openai/gpt-4o', {
  fallback: [
    'openai/gpt-3.5-turbo',
    'anthropic/claude-3-haiku',
  ],
  retries: 3,
});

Answer Caching (Xronox)

Answer-level caching is automatic: when logging is enabled, responses are cached with no extra configuration. Cached answers persist indefinitely (no TTL) and are returned even if they were cached months ago, unless you force a fresh answer.

import { initNxAi, nxai, convertStructuredPrompt } from 'nx-ai';
import { generateText } from 'ai';

// Initialize with logging - caching is AUTOMATIC
await initNxAi({
  logging: {
    enabled: true, // Required for caching (uses Xronox)
    mongoUri: 'mongodb://localhost:27017',
    dbName: 'nx-ai-db',
  },
  // Optional: Configure cache mode globally (defaults to 'perModel')
  answerCache: {
    enabled: true,
    defaultMode: 'perModel', // 'off' | 'perModel' | 'anyModel'
  },
});

// Create model - NO answerCache config needed! Caching is automatic.
const model = nxai('openai/gpt-4o-mini');

// Use structured prompts: instructions (system role), prompt (user), context.
// Convert to messages first, as the AI SDK requires (see Structured Prompts below).
const prompt = convertStructuredPrompt({
  instructions: 'You are a helpful geography assistant. Answer concisely.',
  prompt: 'What is the capital of France?',
  context: ['France is a country in Western Europe.'],
});

// First request - goes to LLM
const result1 = await generateText({ model, prompt });
// result1.response.headers['x-nxai-cache'] === 'miss'

// Second request - served from cache (even if cached months ago)
const result2 = await generateText({ model, prompt });
// result2.response.headers['x-nxai-cache'] === 'hit'

// Force fresh answer (bypasses cache but still writes)
const modelForceFresh = nxai('openai/gpt-4o-mini', {
  answerCache: { forceFresh: true }, // Only user choice: force fresh or not
});
const result3 = await generateText({ model: modelForceFresh, prompt });
// result3.response.headers['x-nxai-cache'] === 'miss'

Cache Modes:

  • perModel - Cache key includes provider + model (isolated per model)
  • anyModel - Cache key only includes input (reusable across models)
  • off - No caching

Response Metadata: Responses include cache information in headers:

  • x-nxai-cache: 'hit' | 'miss'
  • x-nxai-cache-source: 'llm' | 'cache'
  • x-nxai-cache-mode: Cache mode used
  • x-nxai-cache-origin: Original provider/model that generated the cached answer
  • x-nxai-cache-cached-at: ISO timestamp when answer was cached

Note: Answer caching requires Xronox logging to be enabled. The cache is stored in the nxai-cache collection.

Cache Behavior:

  • Automatic - Works automatically when logging is enabled
  • No TTL - Cached answers persist indefinitely until manually deleted
  • Cache hits return immediately with source: 'cache' in response metadata
  • Cache misses proceed to LLM and store the result
  • forceFresh: true bypasses cache read but still writes the new response
  • Per-model mode ensures model isolation (useful for A/B testing)
  • Any-model mode maximizes reuse across different models

Parallel Requests: Send the same prompt to multiple models/engines in parallel and collect all results together:

// Create multiple model instances
const models = [
  { name: 'openai-1', model: nxai('openai/gpt-4o-mini') },
  { name: 'openai-2', model: nxai('openai/gpt-4o-mini') }, // Same model, multiple times
  { name: 'anthropic', model: nxai('anthropic/claude-3-haiku') },
];

// Execute all requests in parallel
const results = await Promise.all(
  models.map(async ({ name, model }) => {
    const result = await generateText({ model, prompt });
    return { name, response: result.text, cache: result.response?.headers?.['x-nxai-cache'] };
  })
);

// All results available together
console.log(results); // Array of all responses

Response Caching (Redis)

Cache responses using Redis (legacy, for backward compatibility):

const model = nxai('openai/gpt-4o', {
  cache: true,
  redisUrl: 'redis://localhost:6379',
});

Custom Metadata

Track requests with custom metadata:

const model = nxai('openai/gpt-4o', {
  jobId: 'user-123-request-456',
  agentId: 'customer-support-bot',
  taskType: 'ticket-resolution',
  taskId: 'ticket-789',
});

Structured Prompts

Separate instructions, prompt, and context for better RAG, few-shot learning, and multi-turn conversations:

import { nxai, NxAiProvider, OpenAIModel, convertStructuredPrompt } from 'nx-ai';
import { generateText } from 'ai';

const model = nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O);

// Structured prompt format - convert to messages before calling AI SDK
const structured = {
  instructions: 'You are a helpful assistant that responds in JSON format.',
  prompt: 'What is the capital of France?',
  context: [
    'France is a country in Europe.',
    'Paris is the largest city in France.',
  ],
};

const messages = convertStructuredPrompt(structured);
await generateText({
  model,
  prompt: messages,
});

// RAG example (vectorDB and query are placeholders for your retrieval setup)
const retrievedDocs = await vectorDB.search(query, { limit: 5 });
const ragPrompt = convertStructuredPrompt({
  instructions: 'Answer the question based only on the provided context.',
  prompt: query,
  context: retrievedDocs.map(doc => doc.content),
});
await generateText({ model, prompt: ragPrompt });

// Context with metadata
const reviewPrompt = convertStructuredPrompt({
  instructions: 'Review the code and provide feedback.',
  prompt: codeToReview,
  context: [
    {
      content: 'Project uses TypeScript',
      source: 'tsconfig.json',
      metadata: { version: '5.0' },
    },
  ],
});
await generateText({ model, prompt: reviewPrompt });

// Backward compatible - string prompts still work
await generateText({
  model,
  prompt: 'Simple string prompt',
});

Benefits:

  • ✅ Automatic conversion to AI SDK message format
  • ✅ Perfect for RAG applications
  • ✅ Supports few-shot learning
  • ✅ Context with metadata support
  • ✅ Backward compatible with string prompts

Note: The AI SDK validates prompts before calling doGenerate(), so structured prompts must be converted with convertStructuredPrompt() before being passed to generateText() or streamText(). See Why We Need to Convert Structured Prompts (docs/WHY_CONVERT_STRUCTURED_PROMPTS.md) for the full explanation.

Tool/Function Calling

Unified tool calling interface with automatic provider mapping:

const model = nxai('openai/gpt-4o', {
  tools: [
    {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: { type: 'string' },
          unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
        },
        required: ['location']
      }
    }
  ],
  toolChoice: 'auto', // 'auto' | 'required' | 'none' | { name: 'specific_tool' }
  parallelToolCalls: true, // Enable parallel execution (where supported)
});
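
Because nx-ai models implement the AI SDK's LanguageModelV2 interface (see Architecture below), tool calls requested by the model should surface on the standard generateText result. A sketch under that assumption (the dispatch logic is illustrative):

import { generateText } from 'ai';

const { text, toolCalls } = await generateText({
  model,
  prompt: 'What is the weather in Paris?',
});

// Illustrative dispatch; toolName/args follow the AI SDK tool-call shape
for (const call of toolCalls ?? []) {
  if (call.toolName === 'get_weather') {
    console.log('Weather requested for:', call.args);
  }
}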

Structured Outputs

Unified response format with automatic schema enforcement:

import { z } from 'zod';

const model = nxai('openai/gpt-4o', {
  responseFormat: {
    type: 'json_schema',
    schema: z.object({
      sentiment: z.enum(['positive', 'negative', 'neutral']),
      confidence: z.number().min(0).max(1),
      keywords: z.array(z.string()),
    }),
    strict: true, // Enforce 100% compliance (where supported)
  },
});
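
To consume the result, one option is to parse the returned text; with strict: true the output should already conform to the schema. A sketch, assuming the structured output arrives as JSON text:

import { generateText } from 'ai';

const { text } = await generateText({
  model,
  prompt: 'Classify the sentiment of: "Great product, fast shipping!"',
});

// Parse the JSON payload; re-validate with the same Zod schema for a hard guarantee
const parsed = JSON.parse(text);
console.log(parsed.sentiment, parsed.confidence, parsed.keywords);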

Reasoning Modes

Unified reasoning configuration for reasoning models:

const model = nxai('openai/o3-mini', {
  reasoning: {
    effort: 'high', // 'low' | 'medium' | 'high'
    maxTokens: 4000, // Budget for thinking tokens
    includeInResponse: false, // Whether to return reasoning tokens
  },
});

Rate Limiting

Smart rate limiting with queue and auto-detection:

await initNxAi({
  rateLimiting: {
    enabled: true,
    strategy: 'queue', // 'queue' | 'reject' | 'fallback'
    redis: 'redis://localhost:6379', // Distributed rate limiting
    limits: {
      'openai/gpt-4o': { rpm: 500, tpm: 150000 },
      'anthropic/*': { rpm: 1000 }, // Wildcards
    },
    autoDetect: true, // Learn limits from 429 responses
  },
});

Batch Execution

Process many prompts efficiently with cost optimization:

import { nxaiBatch } from 'nx-ai';

const results = await nxaiBatch({
  model: 'openai/gpt-4o',
  requests: [
    { prompt: 'Summarize: ...' },
    { prompt: 'Translate: ...' },
    { prompt: 'Classify: ...' },
  ],
  concurrency: 10, // Parallel requests
  useBatchAPI: true, // Use provider batch API if available (50% cheaper)
  jobId: 'batch-123',
});

Embeddings

Unified embeddings interface:

import { nxaiEmbed } from 'nx-ai';

const embedding = nxaiEmbed('openai/text-embedding-3-large', {
  dimensions: 1024, // Reduce dimensions (where supported)
  fallback: ['cohere/embed-english-v3.0'],
});

const { vectors, usage } = await embedding.embed([
  'First document',
  'Second document',
]);

Smart Model Selector

Intelligent model routing based on task and preferences:

import { pickBestModel, explainRouting, NxAiProvider } from 'nx-ai';

// Auto-select best model for a task
const result = pickBestModel({
  provider: NxAiProvider.OPENAI,
  taskType: 'coding',
  importance: { cost: 2, speed: 2, reasoning: 3 },
});

if (result) {
  const explanation = explainRouting(result);
  console.log(explanation.text);
  // "Picked openai / gpt-4.1-mini (chat model, cost tier: cheap) 
  //  because you prioritized reasoning, problem solving, cognitive tasks..."
  
  const model = nxai(`${result.provider}/${result.model.id}`);
}

// Or use the selector in nxai()
const model = nxai('auto', {
  selector: {
    strategy: 'cost-optimized', // 'cost-optimized' | 'quality-first' | 'balanced'
    constraints: {
      maxCostPer1k: 0.01,
      minContextWindow: 32000,
      requiresVision: true,
      requiresTools: true,
    },
  },
});

Health Checks

Monitor provider health before making requests:

import { checkProviderHealth, checkAllProvidersHealth, NxAiProvider } from 'nx-ai';

// Check specific provider
const health = await checkProviderHealth(NxAiProvider.OPENAI);
console.log(`Status: ${health.status}`); // 'healthy' | 'degraded' | 'unhealthy'
console.log(`Response time: ${health.responseTime}ms`);

// Check all configured providers
const allHealth = await checkAllProvidersHealth();

Rich Error Handling

Get comprehensive diagnostic information when errors occur:

import { NxAiError } from 'nx-ai';

try {
  const { text } = await generateText({ model, prompt: 'Hello' });
} catch (error) {
  if (error instanceof NxAiError) {
    // Rich diagnostic information
    console.error('Error Code:', error.code);
    console.error('Provider:', error.provider);
    console.error('API Key Source:', error.diagnostics.apiKeySource);
    
    // Possible causes and suggested fixes
    error.diagnostics.troubleshooting.possibleCauses.forEach((cause, i) => {
      console.error(`${i + 1}. ${cause}`);
    });
    
    // Or use formatted diagnostic string
    console.error(error.toDiagnosticString());
  }
}

📖 API Reference

nxai(model, options?)

Creates a new AI model instance compatible with Vercel AI SDK. Supports both ENUM types (recommended) and string-based model IDs.

Type-Safe Overloads (Recommended):

// With ENUMs - type-safe, autocomplete
nxai(OpenAIModel.GPT_4O, options?)
nxai(AnthropicModel.CLAUDE_3_5_SONNET, options?)
nxai(GoogleModel.GEMINI_1_5_PRO, options?)
nxai(GroqModel.LLAMA_3_70B, options?)

// Provider/model format with ENUMs
nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O, options?)

String-Based (Backward Compatible):

// Format: provider/model-name
nxai('openai/gpt-4o', options?)

// Or with provider option
nxai('gpt-4o', { provider: 'openai', ...options })

Parameters:

  • model - Model identifier (ENUM or string)
  • options (NxAiOptions, optional):
    • provider?: string - REQUIRED if using string model without provider prefix
    • apiKey?: string - Override API key for this specific model instance (highest priority)
    • fallback?: string[] - Array of fallback model IDs
    • cache?: boolean - Enable caching (requires redisUrl)
    • retries?: number - Number of retry attempts (default: 0)
    • redisUrl?: string - Redis URL for caching
    • jobId?: string - Custom job ID (auto-generated UUID if not provided)
    • agentId?: string - Agent identifier for logging
    • taskType?: string - Task type for logging
    • taskId?: string - Task identifier for logging

Returns: NxAiModel instance with .jobId property

Examples:

import { nxai, OpenAIModel, NxAiProvider } from 'nx-ai';

// ✅ Type-safe with ENUMs (recommended)
const model1 = nxai(OpenAIModel.GPT_4O, { jobId: 'my-job' });

// ✅ Provider/model format with ENUMs
const model2 = nxai(NxAiProvider.OPENAI, OpenAIModel.GPT_4O);

// ✅ String format (backward compatible)
const model3 = nxai('openai/gpt-4o', { jobId: 'my-job' });

// ✅ With per-model API key override
const model4 = nxai(OpenAIModel.GPT_4O, { 
  apiKey: 'sk-proj-user-specific-key',
  jobId: 'my-job' 
});

console.log(model1.jobId); // 'my-job'

initNxAi(config)

Initialize global API keys, logging, and telemetry. Call once at app startup.

Parameters:

  • config (NxAiGlobalConfig):
    • apiKeys?: { [provider: string]: string } - API keys for ANY provider supported by LiteLLM (120+ providers)
      • Examples: { openai: 'sk-...', anthropic: 'sk-ant-...', google: '...' }
      • Keys set here override environment variables
    • validateApiKeys?: boolean - Auto-validate API keys on initialization (default: false)
    • validationTimeout?: number - Timeout for validation in ms (default: 5000)
    • skipValidationFor?: string[] - Skip validation for specific providers
    • diagnostics?: { enabled, level?, logToConsole?, logToFile? } - Enable diagnostic mode
    • logging?: { enabled, mongoUri, dbName, s3?, engine? }
    • telemetry?: { enabled, type: 'langfuse', publicKey, secretKey, host?, projectId? }

Example:

await initNxAi({
  apiKeys: {
    openai: 'sk-proj-...',
    anthropic: 'sk-ant-...',
    google: '...',
  },
  logging: { enabled: true, mongoUri: '...', dbName: '...' },
  telemetry: { enabled: true, type: 'langfuse', publicKey: '...', secretKey: '...' },
});

API Key Resolution Priority:

  1. Per-model apiKey option (highest priority)
  2. initNxAi() configured keys
  3. Environment variables (fallback, e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY)
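
A minimal sketch of the three levels together (keys are placeholders; assumes OPENAI_API_KEY is set in the environment):

import { initNxAi, nxai } from 'nx-ai';

await initNxAi({ apiKeys: { anthropic: 'sk-ant-...' } });

const a = nxai('openai/gpt-4o', { apiKey: 'sk-proj-override' }); // 1. per-model key wins
const b = nxai('anthropic/claude-3-haiku');                      // 2. initNxAi() key
const c = nxai('openai/gpt-4o');                                 // 3. OPENAI_API_KEY from env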

closeNxAi()

Close all connections and cleanup resources. Call when shutting down.

Example:

await closeNxAi();

checkProviderHealth(provider)

Check health of a specific provider.

Parameters:

  • provider (string | NxAiProvider) - Provider to check

Returns: Promise<NxAiHealthCheck> with status, response time, and error details

Example:

import { checkProviderHealth, NxAiProvider } from 'nx-ai';

const health = await checkProviderHealth(NxAiProvider.OPENAI);
console.log(health.status); // 'healthy' | 'degraded' | 'unhealthy'

checkAllProvidersHealth()

Check health of all configured providers.

Returns: Promise<NxAiHealthCheck[]> - Array of health checks

Example:

import { checkAllProvidersHealth } from 'nx-ai';

const allHealth = await checkAllProvidersHealth();

nx (alias)

Short alias for nxai:

import { nx, OpenAIModel } from 'nx-ai';
const model = nx(OpenAIModel.GPT_4O);

ENUM Types

Type-safe ENUMs for providers and models:

import { 
  NxAiProvider,
  OpenAIModel,
  AnthropicModel,
  GoogleModel,
  GroqModel
} from 'nx-ai';

// Provider ENUM
NxAiProvider.OPENAI
NxAiProvider.ANTHROPIC
NxAiProvider.GOOGLE
// ... 120+ providers

// Model ENUMs
OpenAIModel.GPT_4O
AnthropicModel.CLAUDE_3_5_SONNET
GoogleModel.GEMINI_1_5_PRO
GroqModel.LLAMA_3_70B

🎯 Examples

See the examples/ directory for complete, isolated examples.

Run examples:

npx tsx examples/01-basic-usage.ts

🧠 Smart Model Routing

nx-ai includes a sophisticated task-aware routing system that automatically selects the best model based on your task type and preferences.

Task-Aware Routing

import { pickBestModel, explainRouting, explainRoutingForOptions } from 'nx-ai';

// Select best model for coding task
const result = pickBestModel({
  taskType: 'coding',
  importance: { cost: 2, speed: 2, reasoning: 3 },
});

if (result) {
  const explanation = explainRouting(result);
  console.log(explanation.text);
  // "Picked openai / gpt-4.1-mini (chat model, cost tier: cheap) 
  //  because you prioritized reasoning, problem solving, cognitive tasks.
  //  Specifically, reasoning is strong (score 8/10); problem solving is strong (score 9/10).
  //  It is particularly strong for problem solving (score 9/10)."
}

// One-shot: pick and explain
const explained = explainRoutingForOptions({
  taskType: 'heavy_reasoning',
  importance: { reasoning: 5, problemSolving: 4, cost: 1 },
});

if (explained) {
  console.log(explained.text);
  console.table(explained.details.importanceRank);
}

Available Task Types

  • general - General-purpose tasks
  • chatbot - Conversational interfaces (emphasizes speed)
  • analysis - Data analysis and interpretation
  • coding - Code generation and debugging
  • heavy_reasoning - Complex multi-step reasoning
  • planning - Task planning and orchestration
  • classification - Simple classification tasks (emphasizes cost/speed)
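
As a quick illustration of these task types, a cost/speed-weighted pick for bulk classification might look like this (the importance weights are arbitrary):

import { pickBestModel, explainRouting } from 'nx-ai';

const cheapAndFast = pickBestModel({
  taskType: 'classification',
  importance: { cost: 5, speed: 4, reasoning: 1 },
});

if (cheapAndFast) {
  console.log(explainRouting(cheapAndFast).text);
}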

Model Registry

nx-ai includes a comprehensive model registry with:

  • Capabilities - Tool calling, structured outputs, reasoning, vision, etc.
  • Pricing - Accurate per-token pricing with reasoning token support
  • Scoring Matrix - 0-10 scores for cost, speed, reasoning, problem solving, cognitive tasks
  • Rate Limits - Provider-specific rate limit information

The registry is loaded from metadata/models.features.json (primary source) with a fallback to docs/models.md; the scoring matrix is loaded from metadata/models.matrix.json with a fallback to docs/modelsMatrix.md. The registry can be queried programmatically:

import { 
  getModelCapabilities, 
  getModelPricing, 
  getDefaultModel,
  findModelsByScores 
} from 'nx-ai';

// Get model capabilities
const caps = getModelCapabilities('openai', 'gpt-4o');
console.log(caps?.capabilities?.toolCalling?.supported); // true

// Get pricing
const pricing = getModelPricing('openai', 'gpt-4o');
console.log(pricing?.inputPer1k); // 0.0025

// Get default model for provider
const defaultModel = getDefaultModel('openai', 'chat'); // 'gpt-4.1-mini'

// Find models by score criteria
const fastModels = findModelsByScores({
  minSpeedScore: 8,
  kind: 'chat',
});

🌐 Supported Providers

All providers supported by LiteLLM (120+), including:

  • OpenAI - GPT-4, GPT-3.5, etc.
  • Anthropic - Claude 3 Opus, Sonnet, Haiku
  • Google - Gemini Pro, Gemini Ultra
  • Groq - Llama 3, Mixtral
  • Cohere - Command, Command R
  • Mistral - Mistral Large, Medium, Small
  • Meta - Llama 2, Llama 3
  • And 100+ more...

See LiteLLM documentation for the complete list.

🏗️ Architecture

Code Organization

The codebase is organized into focused modules:

  • src/types.ts - Type definitions
  • src/utils.ts - Utility functions (cost calculation, etc.)
  • src/logger.ts - Logging functionality
  • src/mapper.ts - AI SDK to LiteLLM mapping
  • src/callback.ts - Callback handlers
  • src/model.ts - NxAiModel implementation
  • src/init.ts - Initialization and configuration
  • src/index.ts - Main exports

How It Works

  1. Model Creation: nxai() creates a NxAiModel instance
  2. Request Processing: Model implements LanguageModelV2 interface
  3. LiteLLM Integration: Requests are mapped and sent via LiteLLM
  4. Logging: If enabled, requests are logged to MongoDB
  5. Telemetry: If enabled, requests are tracked in Langfuse
  6. Response Mapping: LiteLLM responses are mapped back to AI SDK format

🔧 Configuration

Environment Variables

# MongoDB (for logging)
MONGO_URI=mongodb://localhost:27017

# Langfuse (for telemetry)
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
LANGFUSE_HOST=http://localhost:3000

# Redis (for caching)
REDIS_URL=redis://localhost:6379

Package Configuration

For private @xronoces/xronox package access, configure .npmrc:

@xronoces:registry=https://npm.pkg.github.com
//npm.pkg.github.com/:_authToken=your-github-token

📊 Logging & Telemetry Details

Xronox Logging

  • Collection: ai-activities
  • Metadata: Full request/response data, timing, costs, errors
  • Storage: MongoDB (required), S3 (optional)
  • Engine: nxMongo (default) or xronoxCore (with S3)

Langfuse Telemetry

  • License: MIT (open-source)
  • Hosting: Self-hostable or cloud
  • Features: Tracing, datasets, evaluations, UI dashboard
  • Streaming: Real-time chunk tracking

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📝 License

Apache-2.0 - See LICENSE file for details.

💡 Tips

  • Use jobId to correlate requests across your systems
  • Enable caching for repeated prompts to reduce costs
  • Use fallbacks for production reliability
  • Enable logging for audit trails and debugging
  • Use telemetry for observability and monitoring
  • All features are optional - use only what you need
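
Putting several of these tips together, a production-leaning setup might look like this (connection strings and IDs are placeholders):

import { initNxAi, nxai } from 'nx-ai';

await initNxAi({
  logging: { enabled: true, mongoUri: 'mongodb://localhost:27017', dbName: 'nx-ai-db' },
});

const model = nxai('openai/gpt-4o', {
  fallback: ['openai/gpt-3.5-turbo', 'anthropic/claude-3-haiku'],
  retries: 3,
  cache: true,
  redisUrl: 'redis://localhost:6379',
  jobId: 'checkout-flow-123',
});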

Made with ❤️ for the AI community