@illuma-ai/llm-router

v2.0.4

Superfast semantic routing layer for LLMs and agents. Uses embedding similarity to classify queries into configurable route categories. The library provides the routing engine; consumers define their own routes, utterances, and model mappings.

Installation

npm install @illuma-ai/llm-router

Quick Start

import { SemanticRouter, createEncoder } from '@illuma-ai/llm-router';

const encoder = createEncoder(); // defaults to Bedrock Titan; set LLM_ROUTER_ENCODER for others

const router = new SemanticRouter({
  encoder,
  routes: [
    {
      name: 'billing',
      utterances: ['payment issue', 'invoice question', 'subscription cost'],
      scoreThreshold: 0.3,
    },
    {
      name: 'technical',
      utterances: ['bug report', 'API error', 'integration help'],
      scoreThreshold: 0.3,
    },
  ],
  topK: 5,
  aggregation: 'mean',
});

await router.initialize(); // embeds all utterances (batch via encoder)

const result = await router.route('my payment failed');
console.log(result.name);            // 'billing'
console.log(result.similarityScore); // 0.52

How It Works

Query → Encoder → Embedding → Cosine Similarity (vs utterance index) → Best Route + Score
  1. Initialize — all utterances are embedded and stored in an in-memory cosine similarity index.
  2. Route — the user's query is embedded (1 API call), compared against the index, and the best-matching route above its scoreThreshold is returned.
  3. No match — if no route clears its threshold, { name: null, similarityScore: 0 } is returned. The consumer decides the fallback.
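The three steps above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the function names, the hand-written 2-D "embeddings", and the exact interpretation of topK with mean aggregation (average each route's scores among the K nearest utterances) are assumptions made for illustration.

```typescript
// Toy sketch of embedding-similarity routing (assumed logic, not the library's code).
type Vec = number[];

interface IndexedUtterance { route: string; embedding: Vec; }
interface RouteConfig { name: string; scoreThreshold: number; }
interface Choice { name: string | null; similarityScore: number; }

// Cosine similarity between two vectors of equal length.
function cosine(a: Vec, b: Vec): number {
  let dot = 0, normA = 0, normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

function routeQuery(
  query: Vec,
  index: IndexedUtterance[],
  routes: RouteConfig[],
  topK = 5,
): Choice {
  // Score every stored utterance and keep the topK most similar.
  const top = index
    .map((u) => ({ route: u.route, score: cosine(query, u.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);

  // Mean aggregation: average the scores of each route's hits within the top K.
  const perRoute: Record<string, { total: number; count: number }> = {};
  for (const scored of top) {
    const entry = perRoute[scored.route] ?? { total: 0, count: 0 };
    perRoute[scored.route] = { total: entry.total + scored.score, count: entry.count + 1 };
  }

  // Pick the best route that clears its own threshold; otherwise report no match.
  let best: Choice = { name: null, similarityScore: 0 };
  for (const r of routes) {
    const entry = perRoute[r.name];
    if (entry === undefined) continue;
    const mean = entry.total / entry.count;
    if (mean >= r.scoreThreshold && mean > best.similarityScore) {
      best = { name: r.name, similarityScore: mean };
    }
  }
  return best;
}

// Hand-written vectors standing in for real embeddings.
const demoIndex: IndexedUtterance[] = [
  { route: 'billing', embedding: [1, 0] },
  { route: 'billing', embedding: [0.9, 0.1] },
  { route: 'technical', embedding: [0, 1] },
  { route: 'technical', embedding: [0.1, 0.9] },
];
const demoRoutes: RouteConfig[] = [
  { name: 'billing', scoreThreshold: 0.3 },
  { name: 'technical', scoreThreshold: 0.3 },
];

const demoHit = routeQuery([1, 0.1], demoIndex, demoRoutes);
console.log(demoHit.name); // 'billing'
```

A query vector that points away from every utterance (for example [-1, -1] here) produces only negative similarities, so no route clears its threshold and the sketch returns { name: null, similarityScore: 0 }, mirroring step 3.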

Supported Encoders

Select an encoder with the LLM_ROUTER_ENCODER environment variable, or pass an encoder instance programmatically.

| Encoder | Env Value | Batching | Notes |
|---------|-----------|----------|-------|
| AWS Bedrock Titan Embed v2 | bedrock (default) | 1 text/call, 5 concurrent | Symmetric encoding |
| Cohere Embed v4 (via Bedrock) | cohere-bedrock | 96 texts/call | Asymmetric encoding (recommended) |
| OpenAI Embeddings | openai | Native batch | Uses native fetch, zero extra deps |
| Custom | — | — | Extend BaseEncoder |

Why Cohere is Recommended

  • Faster initialization: 96 texts per API call vs Titan's 1-at-a-time
  • Asymmetric encoding: separate search_query and search_document modes for better retrieval accuracy
  • Higher score range: 0.40–0.75 (vs Titan's 0.08–0.60) gives clearer route separation
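To make the initialization difference concrete, here is a back-of-the-envelope sketch that counts embedding calls from the batching figures in the encoder table. The function name, the 300-utterance example, and the assumption that Cohere batches are sent one at a time are illustrative, not taken from the library.

```typescript
// Rough call-count estimate for embedding a set of utterances at initialization.
// textsPerCall and concurrency come from the encoder table; everything else is assumed.
function embeddingCallPlan(
  utteranceCount: number,
  textsPerCall: number,
  concurrency: number,
): { calls: number; sequentialRounds: number } {
  const calls = Math.ceil(utteranceCount / textsPerCall);
  // With `concurrency` calls in flight at once, this many rounds run back to back.
  const sequentialRounds = Math.ceil(calls / concurrency);
  return { calls, sequentialRounds };
}

const titanPlan = embeddingCallPlan(300, 1, 5);   // 300 calls in 60 rounds
const coherePlan = embeddingCallPlan(300, 96, 1); // 4 calls in 4 rounds (concurrency of 1 is an assumption)

console.log(titanPlan, coherePlan);
```

Actual startup time also depends on per-call latency and rate limits, which this estimate ignores.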

Custom Encoder

Extend BaseEncoder to plug in any embedding provider:

import { BaseEncoder, SemanticRouter } from '@illuma-ai/llm-router';

class MyEncoder extends BaseEncoder {
  readonly name = 'my-encoder';
  readonly type = 'custom';
  readonly scoreThreshold = 0.3;

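  // myEmbeddingService is a placeholder for your own embedding client.
  // encode() returns one vector per input document.
  async encode(docs: string[]): Promise<number[][]> {
    return Promise.all(docs.map((doc) => myEmbeddingService.encode(doc)));
  }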
}

const router = new SemanticRouter({
  encoder: new MyEncoder(),
  routes: myRoutes,
});
await router.initialize();
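For unit tests it can be handy to have an encode() body that needs no network access. The following is a minimal sketch of a deterministic hashed bag-of-words embedding with the same signature as the encode method above. The names toyEmbed, hashToken, and toyEncode are hypothetical, and the vectors carry no real semantic meaning, so this is only suitable for exercising plumbing, not for judging routing accuracy.

```typescript
// Deterministic toy embedding: hash each token into a bucket, then L2-normalize.
function hashToken(token: string, buckets: number): number {
  let h = 0;
  for (let i = 0; i < token.length; i++) {
    h = (h * 31 + token.charCodeAt(i)) >>> 0; // keep h an unsigned 32-bit integer
  }
  return h % buckets;
}

function toyEmbed(text: string, dims = 64): number[] {
  const vec: number[] = new Array<number>(dims).fill(0);
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
  for (const token of tokens) {
    const idx = hashToken(token, dims);
    vec[idx] = (vec[idx] ?? 0) + 1;
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? vec : vec.map((x) => x / norm);
}

// Same shape as the encode() method shown above: one vector per input document.
async function toyEncode(docs: string[]): Promise<number[][]> {
  return docs.map((doc) => toyEmbed(doc));
}

const sampleVector = toyEmbed('payment issue');
console.log(sampleVector.length); // 64
```

Inside a BaseEncoder subclass, the encode method could delegate to a function like toyEncode.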

Warm-Up

Pre-initialize the router at application startup to eliminate cold-start latency on the first user query:

import { SemanticRouter, createEncoder } from '@illuma-ai/llm-router';

// At server startup
const router = new SemanticRouter({ encoder: createEncoder(), routes: myRoutes });
await router.initialize();
await router.route('warm-up'); // optional: force a throwaway query to fully warm caches

Treat the router as a singleton in your application: initialize it once at startup and reuse the same instance for every request.
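One way to enforce the initialize-once pattern is to memoize the initialization promise so that concurrent callers share a single in-flight initialization. The sketch below is generic and does not use the library; the once helper and the stand-in init function are hypothetical.

```typescript
// Memoize an async initializer so it runs at most once, even under concurrent calls.
function once<T>(init: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = init().catch((err) => {
        pending = undefined; // allow a retry if initialization failed
        throw err;
      });
    }
    return pending;
  };
}

let initCalls = 0;

// Stand-in for: construct a SemanticRouter, await router.initialize(), return it.
const getRouter = once(async () => {
  initCalls += 1;
  return { ready: true };
});

// Two callers arriving before initialization finishes receive the same promise.
const firstCall = getRouter();
const secondCall = getRouter();
console.log(firstCall === secondCall, initCalls); // true 1
```

Request handlers would then await getRouter() before routing instead of constructing their own router.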

Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| LLM_ROUTER_ENCODER | bedrock | Encoder type: bedrock, cohere-bedrock, openai |
| LLM_ROUTER_EMBEDDING_DIMENSIONS | 1024 | Embedding dimensions (256, 512, 1024, or 1536 for Cohere) |
| BEDROCK_AWS_ACCESS_KEY_ID | — | AWS access key (falls back to AWS_ACCESS_KEY_ID) |
| BEDROCK_AWS_SECRET_ACCESS_KEY | — | AWS secret key |
| BEDROCK_AWS_DEFAULT_REGION | us-east-1 | AWS region |
| OPENAI_API_KEY | — | OpenAI API key (when using OpenAI encoder) |
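If your application wants to log or validate these settings before creating an encoder, the defaults and the access-key fallback from the table can be mirrored in a few lines. This is a sketch of consumer-side code, not how the library reads its configuration; readEncoderSettings and the EncoderSettings shape are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

interface EncoderSettings {
  encoder: string;
  dimensions: number;
  region: string;
  accessKeyId: string | undefined;
}

// Apply the documented defaults and the AWS_ACCESS_KEY_ID fallback.
function readEncoderSettings(env: Env): EncoderSettings {
  const dimensions = Number(env['LLM_ROUTER_EMBEDDING_DIMENSIONS'] ?? '1024');
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error('LLM_ROUTER_EMBEDDING_DIMENSIONS must be a positive integer');
  }
  return {
    encoder: env['LLM_ROUTER_ENCODER'] ?? 'bedrock',
    dimensions,
    region: env['BEDROCK_AWS_DEFAULT_REGION'] ?? 'us-east-1',
    accessKeyId: env['BEDROCK_AWS_ACCESS_KEY_ID'] ?? env['AWS_ACCESS_KEY_ID'],
  };
}

const defaultSettings = readEncoderSettings({});
console.log(defaultSettings.encoder, defaultSettings.dimensions, defaultSettings.region);
// bedrock 1024 us-east-1
```

In a Node.js application the argument would typically be process.env.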

Pre-built Route Definitions

The library ships with 3 pre-built model-tier routes (MODEL_TIER_ROUTES) as a reference. These are general-purpose and may not fit your use case — define your own routes for best results.

| Tier | Semantic Space |
|------|----------------|
| moderate | Greetings, Q&A, standard code, writing |
| complex | Deep analysis, debugging, architecture |
| expert | Research, academic, strategic planning |

Important: Route definitions belong in the consumer, not the library. The library provides the routing engine; you provide the utterances and model mapping that make sense for your domain.
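Since the model mapping and the no-match fallback both live in the consumer, a small lookup is often all that is needed. The sketch below assumes the RouteChoice shape documented under Types; the model IDs and the pickModel helper are placeholders, not real identifiers.

```typescript
// Same shape as the RouteChoice type documented for this library.
interface RouteChoice {
  name: string | null;
  similarityScore: number;
}

// Hypothetical consumer-owned mapping from route name to model ID.
const MODEL_BY_ROUTE: Record<string, string> = {
  billing: 'small-model-id',
  technical: 'large-model-id',
};
const FALLBACK_MODEL = 'default-model-id';

// Resolve a model, falling back when no route matched or the route is unmapped.
function pickModel(choice: RouteChoice): string {
  if (choice.name === null) return FALLBACK_MODEL;
  return MODEL_BY_ROUTE[choice.name] ?? FALLBACK_MODEL;
}

console.log(pickModel({ name: 'billing', similarityScore: 0.52 })); // small-model-id
console.log(pickModel({ name: null, similarityScore: 0 }));         // default-model-id
```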

API Reference

Core Classes

| Class | Description |
|-------|-------------|
| SemanticRouter | Main router — encoder + routes, handles init and routing |
| BedrockTitanEncoder | AWS Bedrock Titan Embed v2 encoder |
| CohereBedrockEncoder | Cohere Embed v4 via AWS Bedrock |
| OpenAIEncoder | OpenAI embeddings encoder |
| BaseEncoder | Abstract base class for custom encoders |
| LocalIndex | In-memory cosine similarity index |

SemanticRouter Methods

| Method | Description |
|--------|-------------|
| initialize() | Embed all utterances and build the index. Call once at startup. |
| route(query, options?) | Route a query — returns RouteChoice ({ name, similarityScore }) |
| routeWithScores(query, options?) | Route with full breakdown — returns { choice, allScores } |

Factory Functions

| Function | Description |
|----------|-------------|
| createEncoder(config?) | Create an encoder from config or env vars |
| registerEncoder(name, factory) | Register a custom encoder type |

Legacy High-Level API

These functions are retained for backwards compatibility; new code should use SemanticRouter directly:

| Function | Description |
|----------|-------------|
| routeToModel(query, config?) | Route query to a concrete model ID (uses built-in presets) |
| createModelTierRouter(config?) | Create/get singleton router with built-in tier routes |
| warmUp(config?) | Pre-initialize the built-in singleton router |

Types

type ModelTier = 'moderate' | 'complex' | 'expert';

interface Route {
  name: string;
  utterances: string[];
  scoreThreshold: number;
  description?: string;
}

interface RouteChoice {
  name: string | null;
  similarityScore: number;
}

interface ScoredRoute {
  name: string;
  score: number;
}
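A per-route score breakdown makes it possible to flag queries where two routes score almost the same. The sketch below assumes that the allScores value returned by routeWithScores is an array of ScoredRoute objects, which the types above suggest but do not state. The topMargin and isAmbiguous helpers and the 0.05 margin are hypothetical.

```typescript
// Same shape as the ScoredRoute type listed above.
interface ScoredRoute {
  name: string;
  score: number;
}

// Gap between the best and second-best route scores (0 when there are no scores).
function topMargin(allScores: ScoredRoute[]): number {
  const sorted = allScores.slice().sort((a, b) => b.score - a.score);
  const best = sorted[0];
  const runnerUp = sorted[1];
  if (best === undefined) return 0;
  return runnerUp === undefined ? best.score : best.score - runnerUp.score;
}

// Treat a routing decision as ambiguous when the top two routes are nearly tied.
function isAmbiguous(allScores: ScoredRoute[], minMargin = 0.05): boolean {
  return topMargin(allScores) < minMargin;
}

const closeCall: ScoredRoute[] = [
  { name: 'billing', score: 0.52 },
  { name: 'technical', score: 0.5 },
];
console.log(isAmbiguous(closeCall)); // true
```

An ambiguous result could be logged for review or sent to the same fallback used when no route matches.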

Testing

npm test                  # Unit tests
npm run test:integration  # Integration tests (requires AWS credentials)
npm run test:e2e          # End-to-end accuracy tests
npm run test:coverage     # Coverage report

Building

npm run build       # Build CJS + ESM bundles to dist/
npm run type-check  # TypeScript type checking

Issues & Support

Report bugs and request features at github.com/illuma-ai/llm-router-issues.

For licensing inquiries or modification permissions, contact: [email protected]

License

Elastic License 2.0 (ELv2) — Copyright (c) 2024-2026 Illuma AI.

You are free to use, copy, and distribute this software. You may not provide it as a hosted/managed service. See LICENSE for full terms.