@kb-labs/llm-router

v1.2.0


Adaptive LLM router with tier-based model selection and fallback support.


@kb-labs/llm-router

Adaptive LLM Router with tier-based model selection for KB Labs Platform.

Overview

LLM Router provides an abstraction layer that isolates plugins from LLM providers and models. Plugins specify what they need (tier + capabilities), and the platform decides how to fulfill it.

Key Principles:

  • Plugin isolation - Plugins don't know about providers/models
  • User-defined tiers - Users decide what "small/medium/large" means for them
  • Adaptive resolution - Platform adapts to available models
  • Simple by default - Minimal config works out of the box

Installation

pnpm add @kb-labs/llm-router

Quick Start

Plugin Usage (via SDK)

import { useLLM } from '@kb-labs/sdk';

// Simple - uses configured default tier
const llm = useLLM();

// Request specific tier
const llm = useLLM({ tier: 'small' });   // Simple tasks
const llm = useLLM({ tier: 'large' });   // Complex tasks

// Request capabilities
const llm = useLLM({ tier: 'medium', capabilities: ['coding'] });

// Use LLM
if (llm) {
  const result = await llm.complete('Generate commit message');
  console.log(result.content);
}

Configuration

Minimal config in kb.config.json:

{
  "platform": {
    "adapters": {
      "llm": "@kb-labs/adapters-openai"
    },
    "adapterOptions": {
      "llm": {
        "tier": "medium",
        "defaultModel": "gpt-4o"
      }
    }
  }
}

Centralized Cache/Stream Defaults

The platform can manage cache and stream defaults centrally via adapterOptions.llm.executionDefaults. Plugins then call plain useLLM() and get consistent behavior by default.

{
  "platform": {
    "adapterOptions": {
      "llm": {
        "defaultTier": "medium",
        "executionDefaults": {
          "cache": {
            "mode": "prefer",
            "scope": "segments",
            "ttlSec": 3600
          },
          "stream": {
            "mode": "prefer",
            "fallbackToComplete": true
          }
        }
      }
    }
  }
}

Plugin-level override remains available as an escape hatch:

const llm = useLLM({
  tier: 'medium',
  execution: {
    cache: { mode: 'require', key: 'mind-rag:v2' }
  }
});

Merge priority (later entries override earlier ones):

  1. platform executionDefaults
  2. plugin useLLM({ execution })
  3. per-call llm.complete(..., { execution })
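Under these rules, the three layers can be combined with a per-section merge in which later layers win. A minimal sketch, assuming simplified shapes for the execution options (the real types live in @kb-labs/llm-router and may differ):

```typescript
// Simplified stand-ins for the real execution option types.
interface ExecutionOptions {
  cache?: { mode?: string; scope?: string; ttlSec?: number; key?: string };
  stream?: { mode?: string; fallbackToComplete?: boolean };
}

// Merge layers in priority order: platform defaults first, then the
// plugin's useLLM({ execution }), then per-call options. Later layers
// override earlier ones field by field within each section.
function mergeExecution(
  ...layers: (ExecutionOptions | undefined)[]
): ExecutionOptions {
  const out: ExecutionOptions = {};
  for (const layer of layers) {
    if (!layer) continue;
    if (layer.cache) out.cache = { ...out.cache, ...layer.cache };
    if (layer.stream) out.stream = { ...out.stream, ...layer.stream };
  }
  return out;
}
```

So a plugin's `cache: { mode: 'require' }` replaces the platform's `mode: 'prefer'` while the platform's `ttlSec` survives untouched.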

How To Verify Cache Is Working

Analytics events include cache outcome and billing breakdown:

  • llm.cache.hit
  • llm.cache.miss
  • llm.cache.bypass

Completion/tool events also include:

  • cacheReadTokens
  • cacheWriteTokens
  • billablePromptTokens
  • estimatedUncachedCost
  • estimatedCost
  • estimatedCacheSavingsUsd

Practical signal that cache works:

  • llm.cache.hit appears regularly;
  • cacheReadTokens > 0;
  • billablePromptTokens < promptTokens;
  • estimatedCacheSavingsUsd > 0.
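Taken together, these fields make the check mechanical. A small sketch, assuming a simplified event shape built from the field names listed above (the real event payload may carry more fields):

```typescript
// Hypothetical subset of a completion analytics event, using the
// field names from the billing breakdown above.
interface LLMCompletionEvent {
  promptTokens: number;
  cacheReadTokens: number;
  billablePromptTokens: number;
  estimatedCacheSavingsUsd: number;
}

// Returns true when the event shows the practical signals that the
// cache is working: tokens were read from cache, the billable prompt
// shrank, and the estimated savings are positive.
function cacheLooksEffective(e: LLMCompletionEvent): boolean {
  return (
    e.cacheReadTokens > 0 &&
    e.billablePromptTokens < e.promptTokens &&
    e.estimatedCacheSavingsUsd > 0
  );
}
```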

Tier System

Tiers are User-Defined Slots

small / medium / large are NOT tied to specific models. They are abstract slots that users fill with whatever models they want.

| Tier   | Plugin Intent                    | User Decides                  |
|--------|----------------------------------|-------------------------------|
| small  | "This task is simple"            | "What model for simple stuff" |
| medium | "Standard task"                  | "My workhorse model"          |
| large  | "Complex task, need max quality" | "When I really need quality"  |

Example Configurations

# Budget-conscious: everything on mini
small: gpt-4o-mini
medium: gpt-4o-mini
large: gpt-4o-mini

# Standard gradient
small: gpt-4o-mini
medium: gpt-4o
large: o1

# Anthropic-first
small: claude-3-haiku
medium: claude-3.5-sonnet
large: claude-opus-4

# Local-first with cloud fallback
small: ollama/llama-3-8b
medium: ollama/llama-3-70b
large: claude-opus-4
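In kb.config.json terms, the standard gradient might look like the following. Note that the tiers key is an assumption for illustration; this README does not show the exact tier-mapping option, so check the adapter's documentation for the real field name:

```json
{
  "platform": {
    "adapters": {
      "llm": "@kb-labs/adapters-openai"
    },
    "adapterOptions": {
      "llm": {
        "defaultTier": "medium",
        "tiers": {
          "small": "gpt-4o-mini",
          "medium": "gpt-4o",
          "large": "o1"
        }
      }
    }
  }
}
```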

Adaptive Resolution

Escalation (Silent)

If a plugin requests a lower tier than the configured one, the platform escalates silently:

Plugin requests: small
Configured:      medium
Result:          medium (no warning)

Degradation (With Warning)

If a plugin requests a higher tier than the configured one, the platform degrades it and logs a warning:

Plugin requests: large
Configured:      medium
Result:          medium (⚠️ warning logged)

Resolution Table

| Request | Configured | Result | Note        |
|---------|------------|--------|-------------|
| small   | small      | small  | Exact match |
| small   | medium     | medium | Escalate ✅ |
| small   | large      | large  | Escalate ✅ |
| medium  | small      | small  | Degrade ⚠️  |
| medium  | medium     | medium | Exact match |
| medium  | large      | large  | Escalate ✅ |
| large   | small      | small  | Degrade ⚠️  |
| large   | medium     | medium | Degrade ⚠️  |
| large   | large      | large  | Exact match |
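The table reduces to one rule: the configured tier always wins, and the only difference is whether a warning is logged. A minimal sketch of that rule (not the package's actual implementation):

```typescript
type LLMTier = 'small' | 'medium' | 'large';

// Rank tiers so requests can be compared against the configuration.
const TIER_ORDER: Record<LLMTier, number> = { small: 0, medium: 1, large: 2 };

// The configured tier is always the result; a warning is logged only
// when the request had to be degraded (requested > configured).
function resolveTier(
  requested: LLMTier,
  configured: LLMTier
): { tier: LLMTier; degraded: boolean } {
  const degraded = TIER_ORDER[requested] > TIER_ORDER[configured];
  if (degraded) {
    console.warn(
      `LLM tier degraded: requested '${requested}', using '${configured}'`
    );
  }
  return { tier: configured, degraded };
}
```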

Capabilities

Capabilities describe task-specific requirements:

| Capability | Description       | Typical Models                |
|------------|-------------------|-------------------------------|
| fast       | Lowest latency    | gpt-4o-mini, haiku, flash     |
| reasoning  | Complex reasoning | o1, claude-opus               |
| coding     | Code-optimized    | claude-sonnet, gpt-4o         |
| vision     | Image support     | gpt-4o, claude-sonnet, gemini |

// Request with capabilities
const llm = useLLM({ tier: 'medium', capabilities: ['coding'] });
const llm = useLLM({ capabilities: ['vision'] });

API Reference

Types

// Tier (user-defined quality slot)
type LLMTier = 'small' | 'medium' | 'large';

// Capability (task-specific requirements)
type LLMCapability = 'reasoning' | 'coding' | 'vision' | 'fast';

// Options for useLLM()
interface UseLLMOptions {
  tier?: LLMTier;
  capabilities?: LLMCapability[];
}

Functions

// Get LLM with tier selection
function useLLM(options?: UseLLMOptions): ILLM | undefined;

// Check if LLM is available
function isLLMAvailable(): boolean;

// Get configured tier
function getLLMTier(): LLMTier | undefined;

ILLMRouter Interface

interface ILLMRouter {
  getConfiguredTier(): LLMTier;
  resolve(options?: UseLLMOptions): LLMResolution;
  hasCapability(capability: LLMCapability): boolean;
  getCapabilities(): LLMCapability[];
}

Architecture

┌─────────────────────────────────────────────────────────────┐
│                      PLUGIN LAYER                           │
│  ┌───────────────────────────────────────────────────────┐ │
│  │  Plugins see ONLY:                                     │ │
│  │  • tier: 'small' | 'medium' | 'large'                  │ │
│  │  • capabilities: 'reasoning' | 'coding' | ...          │ │
│  │                                                        │ │
│  │  Plugins DON'T KNOW:                                   │ │
│  │  ✗ Providers (openai, anthropic, google)               │ │
│  │  ✗ Models (gpt-4o, claude-3-opus)                      │ │
│  │  ✗ API keys, endpoints, pricing                        │ │
│  └───────────────────────────────────────────────────────┘ │
│                            │                                │
│                   useLLM({ tier, capabilities })            │
│                            │                                │
├────────────────────────────┼────────────────────────────────┤
│                     PLATFORM LAYER                          │
│                            ▼                                │
│  ┌───────────────────────────────────────────────────────┐ │
│  │                     LLM Router                         │ │
│  │  • Adaptive tier resolution                            │ │
│  │  • Capability matching                                 │ │
│  │  • Transparent ILLM delegation                         │ │
│  └───────────────────────────────────────────────────────┘ │
│                            │                                │
│                            ▼                                │
│  ┌───────────────────────────────────────────────────────┐ │
│  │                   ILLM Adapter                         │ │
│  │  (OpenAI, Anthropic, Google, Ollama, etc.)             │ │
│  └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘


License

MIT