llm-switchboard
v1.0.4
Blazing-fast, zero-cost local LLM router. Classify and route prompts to specialized AI models (OpenAI, Claude, Gemini, Llama) with <1ms latency using heuristic rules.
🌟 Key Features
- 💸 Zero-Cost Routing: Runs 100% locally. No expensive LLM-based classification calls.
- ⚡ Ultra-Low Latency: Heuristic-based classification adds less than 1ms to your stack.
- 🧠 Tiered Intelligence: Automatically maps prompts to SIMPLE, MEDIUM, COMPLEX, or REASONING tiers.
- 🤖 Agentic Detection: Specialized logic to identify multi-step, tool-heavy tasks.
- 🌍 Multilingual Support: Native intent detection for 10+ major languages.
- 🛠️ Developer First: Type-safe, customizable, and works with Bun, Node.js, and Deno.
🚀 Why llm-switchboard?
In high-volume AI applications, using high-end models (like GPT-4o or Claude 3.5 Sonnet) for every request is a waste of both time and money. Traditional routers use another LLM call to classify the prompt, which adds latency and cost.
llm-switchboard solves this by using a high-performance heuristic engine that scores prompts across 14 weighted dimensions instantly.
📦 Installation
# Using Bun (Recommended)
bun install llm-switchboard
# Using NPM
npm install llm-switchboard
# Using Yarn
yarn add llm-switchboard

🚦 Smart Tiering System
llm-switchboard classifies every prompt into one of four tiers, allowing you to map specific models to specific task complexities.
| Tier | Task Type | Ideal For | Default Model |
| :--- | :--- | :--- | :--- |
| 🟢 SIMPLE | Utility | Greetings, yes/no, simple data extraction. | moonshot/kimi-k2.5 |
| 🟡 MEDIUM | Creative | Summarization, standard chat, basic coding. | xai/grok-code-fast-1 |
| 🔴 COMPLEX | Technical | Systems design, deep analysis, large context. | google/gemini-3.1-pro-preview |
| 🧠 REASONING| Logic | Math, proofs, complex debugging, multi-step logic. | xai/grok-4-1-fast-reasoning |
📖 Usage
⚙️ Global Configuration
Set your model preferences once at application startup.
import { configureRouter, getProductionModel } from "llm-switchboard";
// Configure your routing table
configureRouter({
tiers: {
SIMPLE: { primary: "meta-llama/llama-3-8b-instruct" },
MEDIUM: { primary: "anthropic/claude-3-haiku" }
},
agenticTiers: {
// Models highly optimized for multi-step tool use
COMPLEX: { primary: "anthropic/claude-3-5-sonnet-20241022" },
REASONING: { primary: "openai/o3-mini" }
},
overrides: {
agenticMode: true
}
});
// Get the best model for a prompt
const model = getProductionModel("What is the weather like in Tokyo?");
console.log(model); // => "meta-llama/llama-3-8b-instruct"

Configuration Parameters:
- tiers: The standard routing table mapping task complexity (SIMPLE, MEDIUM, COMPLEX, REASONING) to specific models. Each tier requires a primary model.
- agenticTiers: An alternative routing table. When agenticMode is true (or when the router automatically detects a multi-step agentic prompt), requests are routed to the models defined here instead. This lets you keep standard workloads cheap while reserving premium tool-calling models for agentic tasks.
- overrides.agenticMode: A boolean. When set to true, it forces the router to ALWAYS prefer models from the agenticTiers config, ignoring the standard tiers.
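The precedence between the two tables can be sketched as follows. This is a simplified illustration of the documented behavior, not the library's internal code; selectTable and the promptIsAgentic flag are hypothetical names introduced here.

```typescript
// Illustrative sketch of routing-table precedence (hypothetical helper).
type TierName = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";
type TierTable = Partial<Record<TierName, { primary: string }>>;

interface RouterConfig {
  tiers: TierTable;
  agenticTiers?: TierTable;
  overrides?: { agenticMode?: boolean };
}

// Prefer agenticTiers when agenticMode is forced on (or the prompt was
// detected as agentic); fall back to the standard table otherwise.
function selectTable(config: RouterConfig, promptIsAgentic: boolean): TierTable {
  const agentic = config.overrides?.agenticMode || promptIsAgentic;
  if (agentic && config.agenticTiers) return config.agenticTiers;
  return config.tiers;
}

const config: RouterConfig = {
  tiers: { SIMPLE: { primary: "meta-llama/llama-3-8b-instruct" } },
  agenticTiers: { COMPLEX: { primary: "anthropic/claude-3-5-sonnet-20241022" } },
  overrides: { agenticMode: true },
};

console.log(selectTable(config, false).COMPLEX?.primary);
// => "anthropic/claude-3-5-sonnet-20241022" (agenticMode forces the agentic table)
```

Note that when agenticMode is on but no agenticTiers table is configured, a router sketched this way still falls back to the standard tiers rather than failing.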
🎯 Per-Request Overrides
Override global settings for specific, high-priority, or sensitive prompts without affecting the rest of your app.
const prompt = "Analyze this highly confidential dataset.";
const model = getProductionModel(prompt, {
customTiers: {
COMPLEX: {
primary: "local-mixtral-8x7b"
}
},
customAgenticTiers: {
// Override the global agentic tier for this request
REASONING: { primary: "deepseek/deepseek-r1" }
},
agenticMode: false // Explicitly bypass agentic routing for this single prompt
});

Per-Request Parameters:
- customTiers: Deep-merges with the global tiers mapping for this specific request.
- customAgenticTiers: Deep-merges with the global agenticTiers mapping.
- agenticMode: (boolean) Enable or disable agent-optimized model selection strictly for this prompt.
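The merge semantics can be illustrated with a small sketch: per-request entries win tier-by-tier, while tiers absent from the override keep their global value. The mergeTiers helper below is hypothetical, shown only to clarify the behavior.

```typescript
// Hypothetical sketch of per-request tier merging (not the library's code).
type TierName = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";
type TierTable = Partial<Record<TierName, { primary: string }>>;

// Request-level entries replace global entries for the same tier;
// untouched tiers fall through from the global table.
function mergeTiers(global: TierTable, override: TierTable = {}): TierTable {
  return { ...global, ...override };
}

const globalTiers: TierTable = {
  SIMPLE: { primary: "meta-llama/llama-3-8b-instruct" },
  COMPLEX: { primary: "anthropic/claude-3-5-sonnet-20241022" },
};

const merged = mergeTiers(globalTiers, {
  COMPLEX: { primary: "local-mixtral-8x7b" }, // request-level override
});

console.log(merged.SIMPLE?.primary);  // => "meta-llama/llama-3-8b-instruct" (global preserved)
console.log(merged.COMPLEX?.primary); // => "local-mixtral-8x7b" (overridden for this request)
```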
📊 How it Works
The classification engine analyzes prompts across multiple dimensions including:
- Token Density: Estimating semantic weight vs. length.
- Syntactic Markers: Detecting code chunks, mathematical notation, and imperative verbs.
- Instruction Depth: Identifying complex formatting demands (JSON, Tables, CSV).
- Agentic Signatures: Multi-step planning patterns and tool-use intent.
- Domain Context: Scanning for technical terminology and high-entropy keywords.
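As a rough illustration of how a weighted heuristic might combine such signals, here is a toy classifier. The regexes, weights, and thresholds below are invented for demonstration; the package's actual 14-dimension engine is more sophisticated.

```typescript
// Toy weighted heuristic classifier (illustrative only; invented weights).
type Tier = "SIMPLE" | "MEDIUM" | "COMPLEX" | "REASONING";

function classify(prompt: string): Tier {
  // Reasoning intent short-circuits to the REASONING tier.
  const reasoning = /\b(prove|derive|step[- ]by[- ]step)\b/i.test(prompt);
  if (reasoning) return "REASONING";

  let score = 0;
  if (prompt.length > 400) score += 2;                                  // token density proxy
  if (/\bfunction\b|=>|\bclass\b/.test(prompt)) score += 2;             // syntactic markers (code)
  if (/\b(JSON|table|CSV)\b/i.test(prompt)) score += 1;                 // instruction depth
  if (/\b(architecture|distributed|algorithm)\b/i.test(prompt)) score += 2; // domain context

  if (score >= 3) return "COMPLEX";
  if (score >= 1) return "MEDIUM";
  return "SIMPLE";
}

console.log(classify("Hi there!"));                        // => "SIMPLE"
console.log(classify("Prove this theorem step by step.")); // => "REASONING"
console.log(classify("Summarize this as a JSON table."));  // => "MEDIUM"
```

Because every dimension is a cheap string test, a scorer of this shape runs in well under a millisecond, which is what makes LLM-free routing possible.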
🧪 Development & Testing
We include a comprehensive test suite to help you benchmark classification accuracy.
bun run test

📄 License
MIT © Uo1428
