
@ancatag/n-r

v0.2.3

Official Node.js/TypeScript SDK for Nova AI API with route-config-based orchestration

@ancatag/n-r

Official Node.js/TypeScript SDK for the Nova-route AI API.
Save 60-80% on AI token costs with intelligent multi-tier caching and an OpenAI-compatible interface.

Nova-route is an AI infrastructure platform that reduces your AI API costs by 60-80% through intelligent caching, semantic similarity matching, and RAG optimization. This SDK provides a drop-in replacement for OpenAI with automatic cost savings and route-config-based AI orchestration.

Features

  • 💰 60-80% Token Cost Reduction - Multi-tier caching (hot + semantic) automatically saves on redundant API calls
  • 🔄 OpenAI-Compatible API - Drop-in replacement, just change your base URL
  • Dual Transport - REST (default) and gRPC (lower overhead) support
  • 🌊 Streaming Support - Real-time response streaming with cancellation
  • 🧠 RAG Integration - Retrieval-Augmented Generation for document-based AI
  • 🎯 Smart Routing - Automatic model selection and route configuration
  • 📊 Cache Analytics - Track savings, hit rates, and token usage
  • 🔒 TypeScript First - Full type safety with exported types
  • 🚀 Zero Configuration - Works out of the box with sensible defaults

Installation

npm install @ancatag/n-r
# or
pnpm add @ancatag/n-r
# or
yarn add @ancatag/n-r

Requirements: Node.js 18+ (uses native fetch)

Quick Start

Basic Chat Completion

import { NovaClient } from '@ancatag/n-r';

const client = new NovaClient({
  apiKey: process.env.NOVA_API_KEY || 'nova_sk_...',
});

// Recommended: Use route config ID for consistent behavior
const response = await client.chat.create({
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7,
  max_tokens: 1000,
  nova: {
    routeConfigId: 'your-route-config-id' // Use specific route config
  }
});

// Legacy: Model field still supported but route config recommended
const responseLegacy = await client.chat.create({
  model: 'llama2', // Falls back to project default model
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

console.log(response.choices[0].message.content);
console.log('Tokens used:', response.usage.total_tokens);
console.log('Cache hit:', response.nova?.cacheHit);
console.log('Tokens saved:', response.nova?.tokensSaved);

Streaming

const stream = client.chat.createStream({
  messages: [{ role: 'user', content: 'Tell me a story' }],
  temperature: 0.7,
  nova: {
    routeConfigId: 'your-route-config-id' // Recommended: use route config
  }
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  if (content) {
    process.stdout.write(content); // Print as it arrives
  }
  
  if (chunk.choices[0]?.finish_reason) {
    console.log('\n\nStream complete!');
    break;
  }
}

Usage Examples

Non-Streaming Chat Completion

import { NovaClient } from '@ancatag/n-r';

const client = new NovaClient({
  apiKey: 'nova_sk_...',
  baseUrl: 'https://api.nova.ai', // Optional: defaults to http://localhost:3000
  timeoutMs: 60000, // Optional: request timeout (default: 60000)
  maxRetries: 3, // Optional: max retries (default: 2)
});

const response = await client.chat.create({
  model: 'gpt-4', // Uses project default if not specified
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Explain quantum computing in simple terms' }
  ],
  temperature: 0.7,
  max_tokens: 1000,
});

// Access response
console.log(response.choices[0].message.content);

// Access Nova-specific metrics
if (response.nova) {
  console.log('Cache hit:', response.nova.cacheHit);
  console.log('Cache layer:', response.nova.cacheLayer); // 'hot' | 'semantic' | null
  console.log('Tokens saved:', response.nova.tokensSaved);
  console.log('Response time:', response.nova.responseTimeMs, 'ms');
  console.log('Request ID:', response.nova.requestId);
}

Streaming with Cancellation

const controller = new AbortController();

// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000);

try {
  for await (const chunk of client.chat.createStream(
    {
      model: 'llama2',
      messages: [{ role: 'user', content: 'Write a long story' }],
    },
    controller.signal
  )) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
  }
} catch (error: any) {
  if (error.name === 'AbortError') {
    console.log('Stream cancelled');
  }
}

Models API

// List all available models in your project
const models = await client.models.list();
console.log('Available models:', models.map(m => m.id));

// Get specific model details
const model = await client.models.get('llama2');
console.log('Model:', model);

Error Handling

import { NovaClient, NovaError } from '@ancatag/n-r';

const client = new NovaClient({
  apiKey: 'nova_sk_...',
});

try {
  const response = await client.chat.create({
    model: 'invalid-model',
    messages: [{ role: 'user', content: 'Hello' }],
  });
} catch (error) {
  if (error instanceof NovaError) {
    console.error('Nova API Error:', error.message);
    console.error('Status:', error.status);
    console.error('Code:', error.code);
    console.error('Type:', error.type);
    
    // Handle specific error codes
    switch (error.code) {
      case 'invalid_api_key':
        console.error('Invalid API key');
        break;
      case 'model_not_found':
        console.error('Model not found or not accessible');
        break;
      case 'rate_limit_exceeded':
        console.error('Rate limit exceeded');
        break;
    }
  } else {
    console.error('Unexpected error:', error);
  }
}
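For transient failures such as `rate_limit_exceeded`, you may want to retry with backoff on top of the client's built-in `maxRetries`. A minimal, illustrative wrapper (the backoff schedule here is an assumption, not documented SDK behavior; `withRetry` and `backoffMs` are hypothetical helpers, not part of the SDK):

```typescript
/** Exponential backoff with a cap: 500ms, 1s, 2s, ... up to 8s. */
function backoffMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

/** Retry a call, but only when the error carries a rate-limit code. */
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      lastError = error;
      // Rethrow anything that isn't a rate limit immediately.
      if (error?.code !== 'rate_limit_exceeded') throw error;
      await new Promise((resolve) => setTimeout(resolve, backoffMs(attempt)));
    }
  }
  throw lastError;
}
```

Usage: `const response = await withRetry(() => client.chat.create({ messages }));`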

Nova-Specific Features

Nova extends the OpenAI API with powerful features for cost optimization and advanced routing:

Cache Control

const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
  
  // Nova-specific options
  nova: {
    skipCache: false, // Skip cache lookup for this request (default: false)
  },
});

// Response includes cache information
console.log('Cache hit:', response.nova?.cacheHit);
console.log('Cache layer:', response.nova?.cacheLayer); // 'hot' | 'semantic' | null
console.log('Tokens saved:', response.nova?.tokensSaved);

Route Configuration

const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
  
  nova: {
    routeConfigId: 'route-config-uuid', // Use specific route configuration
  },
});

RAG (Retrieval-Augmented Generation)

RAG enables AI models to answer questions using your own documents as context. Instead of sending entire documents with every request, Nova-route automatically retrieves only the relevant chunks that match your query, dramatically reducing token usage (70-90% savings) while improving accuracy.

How RAG Works:

  1. Upload documents (PDF, TXT, MD) to a route config via REST API
  2. Documents are automatically parsed, chunked, and embedded
  3. During chat completions, relevant chunks are automatically retrieved and injected as context
  4. Only chunks that fit within your token budget are included

Using RAG with the SDK:

RAG works automatically once documents are uploaded and processed. Simply use a route config that has RAG enabled:

// RAG is automatic - no code changes needed!
const response = await client.chat.create({
  model: 'your-route-config-id', // Route config with ragEnabled: true
  messages: [
    { role: 'user', content: 'What is the vacation policy?' }
  ],
});

// Response includes context from your uploaded documents
console.log(response.choices[0].message.content);

Document Upload (via REST API):

Document upload is done via the REST API (not SDK methods). Here's a complete example:

const API_BASE_URL = 'https://api.nova.ai';
const JWT_TOKEN = 'your-jwt-token'; // From /auth/login
const ROUTE_CONFIG_ID = 'your-route-config-id';

// 1. Upload document
async function uploadDocument(file: File) {
  const formData = new FormData();
  formData.append('file', file);

  const response = await fetch(
    `${API_BASE_URL}/rag/collections/${ROUTE_CONFIG_ID}/documents`,
    {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${JWT_TOKEN}`,
      },
      body: formData,
    }
  );

  const document = await response.json();
  console.log('Document uploaded:', document.id);
  
  // Poll for processing completion
  return pollDocumentStatus(document.id);
}

// 2. Check processing status
async function pollDocumentStatus(documentId: string) {
  const maxAttempts = 30;
  const delayMs = 2000;

  for (let i = 0; i < maxAttempts; i++) {
    const response = await fetch(
      `${API_BASE_URL}/rag/documents/${documentId}`,
      {
        headers: {
          'Authorization': `Bearer ${JWT_TOKEN}`,
        },
      }
    );

    const document = await response.json();

    if (document.status === 'completed') {
      console.log('Document processed! Chunks:', document.chunkCount);
      return document;
    }

    if (document.status === 'failed') {
      throw new Error(`Processing failed: ${document.errorMessage}`);
    }

    await new Promise(resolve => setTimeout(resolve, delayMs));
  }

  throw new Error('Document processing timeout');
}

// 3. Use RAG in chat completions (automatic)
const response = await client.chat.create({
  model: ROUTE_CONFIG_ID, // Route config with ragEnabled: true
  messages: [
    { role: 'user', content: 'What is the vacation policy?' }
  ],
});

Token Savings with RAG:

  • Without RAG: Send entire documents (10,000+ tokens)
  • With RAG: Only relevant chunks (500-2,000 tokens)
  • Savings: 70-90% reduction in prompt tokens

RAG Configuration:

RAG settings are configured per route config:

  • chunkSize: 100-2048 tokens per chunk (default: 512)
  • chunkOverlap: 0-200 tokens overlap (default: 50)
  • topK: 1-20 chunks to retrieve (default: 5)
  • similarityThreshold: 0.5-0.95 minimum similarity (default: 0.7)

See RAG SDK Documentation for complete details.
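As a sketch, the per-route-config RAG settings above could be modeled like this (the `RagSettings` interface name is hypothetical — these values are set in the dashboard or REST API, not through this SDK; the field names and defaults come from the list above):

```typescript
interface RagSettings {
  chunkSize: number;           // 100-2048 tokens per chunk
  chunkOverlap: number;        // 0-200 tokens of overlap between chunks
  topK: number;                // 1-20 chunks to retrieve
  similarityThreshold: number; // 0.5-0.95 minimum similarity
}

// Defaults as documented above.
const ragDefaults: RagSettings = {
  chunkSize: 512,
  chunkOverlap: 50,
  topK: 5,
  similarityThreshold: 0.7,
};
```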

Custom Metadata

const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
  
  nova: {
    metadata: {
      userId: '123',
      sessionId: 'abc',
      feature: 'chatbot',
    },
  },
});

System Prompt Override

const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
  
  nova: {
    systemPromptOverride: 'You are a specialized technical assistant.',
  },
});

gRPC Transport

For lower overhead and better performance, use gRPC transport:

import { NovaClient } from '@ancatag/n-r';

const client = new NovaClient({
  apiKey: process.env.NOVA_API_KEY || 'nova_sk_...',
  transport: 'grpc', // Use gRPC instead of REST
  grpcUrl: '0.0.0.0:50051', // Optional: defaults to 0.0.0.0:50051
});

// Same API, lower overhead
const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
});

// Streaming also works with gRPC
for await (const chunk of client.chat.createStream({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hey' }],
})) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Transport Options:

  • transport: 'rest' (default) - Uses HTTP fetch
  • transport: 'grpc' - Uses gRPC over grpc-js with ts-proto stubs

API Reference

Client Configuration

interface NovaClientConfig {
  /** Nova API key (required, format: nova_sk_...) */
  apiKey: string;
  /** Base URL for REST API (default: http://localhost:3000) */
  baseUrl?: string;
  /** gRPC URL (default: 0.0.0.0:50051) */
  grpcUrl?: string;
  /** Preferred transport: 'rest' | 'grpc' (default: 'rest') */
  transport?: 'rest' | 'grpc';
  /** Request timeout in milliseconds (default: 60000) */
  timeoutMs?: number;
  /** Maximum number of retries for failed requests (default: 2) */
  maxRetries?: number;
}

Chat Completions

client.chat.create(request)

Create a non-streaming chat completion.

Parameters:

  • request: ChatCompletionRequest - Chat completion request (OpenAI-compatible)

Returns: Promise<ChatCompletionResponse>

client.chat.createStream(request, signal?)

Create a streaming chat completion.

Parameters:

  • request: ChatCompletionRequest - Chat completion request
  • signal?: AbortSignal - Optional abort signal for cancellation

Returns: AsyncIterable<ChatCompletionChunk>

Models

client.models.list()

List all available models in your project.

Returns: Promise<Model[]>

client.models.get(modelId)

Get details for a specific model.

Parameters:

  • modelId: string - Model identifier

Returns: Promise<Model>

Type Exports

import type {
  ChatMessage,
  ChatCompletionRequest,
  ChatCompletionResponse,
  ChatCompletionChunk,
  ChatCompletionChoice,
  ChatCompletionUsage,
  Model,
  NovaClientConfig,
  NovaTransport,
} from '@ancatag/n-r';

import { NovaError } from '@ancatag/n-r';

Nova-Specific Extensions

Request Extensions

interface NovaRequestExtensions {
  nova?: {
    /** Skip cache lookup for this request */
    skipCache?: boolean;
    /** Route config ID - specifies which route configuration to use */
    routeConfigId?: string;
    /** Enable RAG (Retrieval-Augmented Generation) for this request */
    ragEnabled?: boolean;
    /** Additional metadata to attach to the request */
    metadata?: Record<string, any>;
    /** Override the system prompt for this request */
    systemPromptOverride?: string;
  };
}

Response Extensions

interface NovaResponseExtensions {
  nova?: {
    /** Whether this response was served from cache */
    cacheHit: boolean;
    /** Cache layer used: 'hot' (exact match) | 'semantic' (similarity match) | null */
    cacheLayer?: 'hot' | 'semantic' | null;
    /** Number of tokens saved by cache hit */
    tokensSaved: number;
    /** Response time in milliseconds */
    responseTimeMs: number;
    /** Unique request ID for tracking */
    requestId: string;
  };
}
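A small helper that turns the metrics above into a log line can be handy; this is an illustrative utility (`summarizeNova` is not an SDK export), typed against the fields defined in `NovaResponseExtensions`:

```typescript
interface NovaMetrics {
  cacheHit: boolean;
  cacheLayer?: 'hot' | 'semantic' | null;
  tokensSaved: number;
  responseTimeMs: number;
  requestId: string;
}

/** One-line summary of the Nova metrics attached to a response. */
function summarizeNova(nova?: NovaMetrics): string {
  if (!nova) return 'no nova metrics';
  const layer = nova.cacheHit ? ` (${nova.cacheLayer ?? 'unknown'} layer)` : '';
  return `cacheHit=${nova.cacheHit}${layer}, saved=${nova.tokensSaved} tokens, ${nova.responseTimeMs}ms`;
}
```

Usage: `console.log(summarizeNova(response.nova));`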

Advanced Features

Multi-Tier Caching

Nova automatically uses two cache layers:

  1. Hot Cache - Exact match caching (7-30 day TTL)

    • Instant responses for identical requests
    • SHA-256 hash-based lookup
  2. Semantic Cache - Similarity matching (95% threshold)

    • Matches semantically similar prompts
    • Vector embedding-based similarity search
    • Available on paid plans
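The hot layer's SHA-256, exact-match lookup can be sketched as hashing a canonical serialization of the request, so byte-identical requests map to the same cache key. This is an illustration of the idea, not the service's actual key derivation:

```typescript
import { createHash } from 'node:crypto';

/**
 * Derive a hot-cache key from a request object. Assumes a stable field
 * order in the serialized request (JSON.stringify preserves insertion
 * order for plain objects).
 */
function hotCacheKey(request: object): string {
  const canonical = JSON.stringify(request);
  return createHash('sha256').update(canonical).digest('hex');
}
```

Two identical requests produce the same key; any difference in content yields a different key, which is why the hot layer only serves exact matches.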

Cost Savings Tracking

Every response includes savings metrics:

const response = await client.chat.create({
  model: 'llama2',
  messages: [{ role: 'user', content: 'Hello' }],
});

if (response.nova?.cacheHit) {
  console.log(`Saved ${response.nova.tokensSaved} tokens`);
  console.log(`Cache layer: ${response.nova.cacheLayer}`);
}

Model Routing

Nova automatically routes requests to the correct provider based on:

  • Model identifier in request
  • Project default model configuration
  • Route configuration (if specified)

// Uses project default if model not specified
const response = await client.chat.create({
  messages: [{ role: 'user', content: 'Hello' }],
  // model is optional - uses project default
});

RAG (Retrieval-Augmented Generation)

RAG provides 70-90% token savings by automatically retrieving only relevant document chunks instead of sending entire documents.

How RAG Works:

  1. Automatic Context Retrieval: When you make a chat completion to a route config with ragEnabled: true, Nova-route automatically:

    • Extracts the query from the last user message
    • Generates a vector embedding of the query
    • Searches Qdrant for semantically similar chunks
    • Selects top-K chunks above similarity threshold
    • Manages token budget to include only chunks that fit
    • Injects retrieved chunks as context before the user's query
  2. Token Budget Management: Nova-route automatically calculates:

    Token Budget = Context Window - Existing Prompt Tokens - Headroom (200 tokens)

    Only chunks that fit within this budget are included.

  3. Prompt Format: The final prompt sent to the AI model includes:

    [System Prompt]
    [Pre-prompt Items]
       
    Context:
    [Relevant chunk 1 from your documents]
    [Relevant chunk 2 from your documents]
       
    Query: [User's original question]
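The budgeting and prompt assembly in steps 1-3 can be sketched as follows. Token counts here are supplied by the caller, chunk selection is a simple greedy pass over already-ranked chunks, and the prompt template mirrors the format shown above — all simplified illustrations, not the service's exact implementation:

```typescript
interface Chunk {
  text: string;
  tokens: number;
}

/** Token Budget = Context Window - Existing Prompt Tokens - 200 headroom. */
function tokenBudget(contextWindow: number, promptTokens: number): number {
  return contextWindow - promptTokens - 200;
}

/** Greedily keep retrieved chunks (already ranked by similarity) that fit. */
function selectChunks(chunks: Chunk[], budget: number): Chunk[] {
  const kept: Chunk[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.tokens > budget) break;
    kept.push(chunk);
    used += chunk.tokens;
  }
  return kept;
}

/** Assemble the final prompt in the format shown above. */
function assemblePrompt(systemPrompt: string, chunks: Chunk[], query: string): string {
  const context = chunks.map((c) => c.text).join('\n');
  return `${systemPrompt}\n\nContext:\n${context}\n\nQuery: ${query}`;
}
```

For example, an 8,192-token context window with a 3,000-token prompt leaves a 4,992-token budget for retrieved chunks.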

RAG Benefits:

  • 70-90% Token Savings: Only relevant chunks vs. full documents
  • Improved Accuracy: AI responses grounded in your documents
  • Automatic: No manual context management needed
  • Scalable: Works with large document collections
  • Intelligent: Semantic search finds relevant content even with different wording

Example Token Savings:

  • Full document: 10,000 tokens
  • Relevant chunks: 1,500 tokens
  • Savings: 8,500 tokens (85%)

See RAG SDK Documentation for complete setup and configuration details.

TypeScript Support

Full TypeScript support with comprehensive type definitions:

import { NovaClient } from '@ancatag/n-r';
import type {
  ChatCompletionRequest,
  ChatCompletionResponse,
  NovaClientConfig,
} from '@ancatag/n-r';

const config: NovaClientConfig = {
  apiKey: process.env.NOVA_API_KEY!,
  baseUrl: 'https://api.nova.ai',
};

const client = new NovaClient(config);

async function chat(
  request: ChatCompletionRequest
): Promise<ChatCompletionResponse> {
  return await client.chat.create(request);
}

Requirements

  • Node.js: 18.0.0 or higher (uses native fetch)
  • TypeScript: 5.0+ (for type definitions, optional)

Getting Started with Nova-route

  1. Sign up at nova.ai (Free plan available)
  2. Create a project in the dashboard
  3. Configure your models (BYOP or use hosted providers)
  4. Generate an API key (format: nova_sk_...)
  5. Install the SDK and start saving on token costs!

Migration from OpenAI

Switching from OpenAI to Nova-route is simple:

// Before (direct OpenAI)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// After (via Nova-route) - Just change the SDK!
import { NovaClient } from '@ancatag/n-r';

const client = new NovaClient({
  apiKey: process.env.NOVA_API_KEY, // Get from Nova-route dashboard
  baseUrl: 'https://api.nova.ai', // Nova-route API endpoint
});

// Your code stays exactly the same!
const response = await client.chat.create({
  model: 'gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }],
});

License

ISC


Built with ❤️ by the Nova-route team