bar-dynamic-orchestrator

v2.0.11

Published

19 days ago

BAR - Dynamic Multi-Model Orchestrator — Business-Aware LLM Routing pipeline

Vulnerabilities: 0 High, 0 Medium, 0 Low

nipunhevavitharana-npm

llm router query-decomposition multi-model orchestration ai

BAR — Dynamic Multi-Model Orchestrator

Business-Aware LLM Routing — a fully standalone JS/TS library that automatically routes queries to the right LLM based on content type, with built-in query decomposition for complex multi-part questions.

No Python. No FastAPI. No backend server required.

How It Works

BAR Pipeline

BAR runs a 5-component pipeline on every query — each component is independently configurable:

| # | Component | What it does |
|---|-------------------|----------------------------------------------------|
| 1 | Detector | Decides if the query is simple or complex |
| 2 | Decomposer | Splits complex queries into sub-queries |
| 3 | Execution Planner | Assigns PARALLEL or SEQUENTIAL tags to sub-queries |
| 4 | Router | Classifies each sub-query into a category |
| 5 | Aggregator | Combines all sub-answers into one final response |
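To make the data flow concrete, here is a minimal, self-contained sketch of how five stages like these could compose. Every function below is a hypothetical stand-in written for illustration (the detector regex, the splitter, and the keyword router are toy placeholders), not the library's internals.

```typescript
// Toy stand-ins for the five BAR stages; none of this is the library's real code.
type SubQuery = { id: string; text: string; dependsOn: string[] };

// 1. Detector: treat a query as complex if it contains ", then ".
const detect = (query: string): boolean => /,\s*then\s+/i.test(query);

// 2. Decomposer: split on ", then " and chain each part to the previous one.
function decompose(query: string): SubQuery[] {
  return query.split(/,\s*then\s+/i).map((text, i) => ({
    id: `$${i + 1}`,
    text: text.trim(),
    dependsOn: i === 0 ? [] : [`$${i}`],
  }));
}

// 3. Planner: sub-queries without dependencies are PARALLEL, the rest SEQUENTIAL.
const planTag = (sq: SubQuery): 'PARALLEL' | 'SEQUENTIAL' =>
  sq.dependsOn.length === 0 ? 'PARALLEL' : 'SEQUENTIAL';

// 4. Router: a keyword lookup standing in for a real classifier.
const classify = (text: string): string =>
  /python|javascript|script|function/i.test(text) ? 'CODE' : 'FACTUAL_KNOWLEDGE';

// 5. Aggregator: the "simple" style, one labeled section per sub-answer.
const aggregate = (answers: string[]): string =>
  answers.map((a, i) => `Part ${i + 1}: ${a}`).join('\n\n');

// `answer` is injected so the sketch runs without calling any LLM.
function runPipeline(query: string, answer: (text: string, category: string) => string) {
  const subQueries: SubQuery[] = detect(query)
    ? decompose(query)
    : [{ id: '$1', text: query, dependsOn: [] }];
  const categories = subQueries.map((sq) => classify(sq.text));
  return {
    isComplex: subQueries.length > 1,
    tags: subQueries.map(planTag),
    routing: subQueries.map((sq, i) => `${sq.id} → ${categories[i]}`),
    answer: aggregate(subQueries.map((sq, i) => answer(sq.text, categories[i]))),
  };
}
```

Passing the answer generator in as a callback keeps the sketch offline; in the real library that step is where the per-category model would be called.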

Installation

npm install bar-dynamic-orchestrator
npx bar-orchestrator setup-models

setup-models downloads all ML models (~430MB) into ~/.bar_model_cache/. This runs once — every subsequent use loads from local cache instantly.

Project setup

Make sure your package.json has "type": "module":

{
  "type": "module"
}

Quick Start

import { BAROrchestrator } from 'bar-dynamic-orchestrator';

const bar = new BAROrchestrator({ openaiKey: 'sk-...' });

const result = await bar.route('Explain the symptoms of diabetes');
console.log(result.answer);
console.log(result.routing);          // ["$1 → MEDICAL"]
console.log(result.executionPlan);    // stages, parallel/sequential counts
console.log(result.componentLatency); // latency per component in ms
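The package also documents an `OPENAI_API_KEY` environment variable as an alternative to passing `openaiKey`. If you want to avoid hardcoding the key, a small resolver along these lines may help. `resolveOpenAIKey` is a hypothetical helper, not part of the package, and the environment is passed in as a plain object so the function is easy to test.

```typescript
// Hypothetical helper: prefer an explicit key, else fall back to OPENAI_API_KEY.
// `env` is a parameter so this runs anywhere; in Node, pass process.env.
type Env = Record<string, string | undefined>;

function resolveOpenAIKey(explicitKey: string | undefined, env: Env): string {
  const fromOption = explicitKey ? explicitKey.trim() : '';
  const fromEnv = (env['OPENAI_API_KEY'] || '').trim();
  const key = fromOption || fromEnv;
  if (!key) {
    throw new Error('No OpenAI key: pass openaiKey or set OPENAI_API_KEY');
  }
  return key;
}
```

In a Node script this could be used as `new BAROrchestrator({ openaiKey: resolveOpenAIKey(undefined, process.env) })`, which fails early with a clear message when no key is configured.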

Full Configuration

Every component of the pipeline can be configured independently:

import { BAROrchestrator } from 'bar-dynamic-orchestrator';

const bar = new BAROrchestrator({
  // ── Required ──────────────────────────────────────────────────────────────
  openaiKey: 'sk-...',                   // or set OPENAI_API_KEY env var

  // ── Component 1: Complexity Detector ──────────────────────────────────────
  detector: 'rule_based',                // 'rule_based' | 'llm'
  //  rule_based → regex + heuristics, zero cost, instant (default)
  //  llm        → GPT judges complexity, most accurate

  // ── Component 2: Query Decomposer ─────────────────────────────────────────
  decomposer: 'rule_based',              // 'rule_based' | 'llm'
  //  rule_based → pattern-based splitting, instant, zero cost (default)
  //  llm        → GPT decomposes, handles nuanced multi-part queries

  // ── Component 4: Router ───────────────────────────────────────────────────
  router: 'llm',                         // 'llm' | 'transformer' | 'rag' | 'rnn'
  //  llm         → GPT-4o-mini classifies, 86.5% acc, ~$0.0003/query (default)
  //  transformer → DeBERTa ONNX local, 92.0% F1, $0, ~233MB first download
  //  rag         → ChromaDB hybrid retrieval, 86.1% F1, $0, ~188MB first download
  //  rnn         → TextCNN local, 89.9% F1, $0, ~9MB first download, 2.5ms

  // ── Component 5: Aggregator ───────────────────────────────────────────────
  aggregation: 'simple',                 // 'simple' | 'llm'
  //  simple → labeled sections "Part 1 / Part 2" (default)
  //  llm    → GPT synthesizes all answers into one unified natural response

  // ── Answer generation model ───────────────────────────────────────────────
  model: 'gpt-4o-mini',                  // default model for answer generation

  // ── Category → Model mapping ──────────────────────────────────────────────
  categoryModelMap: {
    REASONING:             'gpt-4o',
    CODE:                  'gpt-4o',
    MEDICAL:               'gpt-4o',
    LEGAL:                 'gpt-4o',
    FINANCE:               'gpt-4o',
    CREATIVE_WRITING:      'gpt-4o-mini',
    FACTUAL_KNOWLEDGE:     'gpt-4o-mini',
    INSTRUCTION_FOLLOWING: 'gpt-4o-mini',
    CONVERSATIONAL:        'gpt-4o-mini',
    MULTILINGUAL:          'gpt-4o-mini',
    SUMMARIZATION:         'gpt-4o-mini',
  },
});
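One plausible reading of how `model` and `categoryModelMap` interact is a lookup with a fallback: use the model mapped to the category, otherwise the default. The README does not show the library's own resolution logic, so `pickModel` below is a hypothetical sketch of that idea.

```typescript
// Hypothetical lookup: use the category's mapped model, else the default model.
type CategoryModelMap = Partial<Record<string, string>>;

function pickModel(
  category: string,
  categoryModelMap: CategoryModelMap,
  defaultModel: string = 'gpt-4o-mini',
): string {
  return categoryModelMap[category] || defaultModel;
}

const overrides: CategoryModelMap = { MEDICAL: 'gpt-4o', SUMMARIZATION: 'gpt-4o-mini' };
console.log(pickModel('MEDICAL', overrides));        // gpt-4o
console.log(pickModel('CONVERSATIONAL', overrides)); // gpt-4o-mini (fallback)
```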

Config presets

| Preset | detector | decomposer | router | aggregation | Best for |
|--------|--------------|--------------|---------------|-------------|------------------------------------|
| A | rule_based | rule_based | llm | simple | Easiest setup, zero model download |
| B | rule_based | rule_based | rnn | simple | Fastest inference, tiny download |
| C | rule_based | rule_based | transformer | simple | Highest accuracy, free after download |
| D | llm | llm | transformer | llm | Maximum quality: LLM detector, decomposer, and aggregator with the transformer router |
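The presets are just combinations of the four option values, so they can be kept as plain objects and spread into the constructor options. The `PRESETS` constant and `needsModelDownload` helper below are illustrative names, not exports of the package.

```typescript
// The four presets from the table, as plain option objects (names are illustrative).
type Preset = {
  detector: 'rule_based' | 'llm';
  decomposer: 'rule_based' | 'llm';
  router: 'llm' | 'transformer' | 'rag' | 'rnn';
  aggregation: 'simple' | 'llm';
};

const PRESETS: Record<'A' | 'B' | 'C' | 'D', Preset> = {
  A: { detector: 'rule_based', decomposer: 'rule_based', router: 'llm', aggregation: 'simple' },
  B: { detector: 'rule_based', decomposer: 'rule_based', router: 'rnn', aggregation: 'simple' },
  C: { detector: 'rule_based', decomposer: 'rule_based', router: 'transformer', aggregation: 'simple' },
  D: { detector: 'llm', decomposer: 'llm', router: 'transformer', aggregation: 'llm' },
};

// Only the llm router works without a local model download.
const needsModelDownload = (p: Preset): boolean => p.router !== 'llm';

// Usage idea: new BAROrchestrator({ openaiKey, ...PRESETS.C })
```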

Router Approaches (Component 4)

| Router | How it works | Accuracy | Latency | Cost/query | First-use download |
|---------------|-------------------------------------------|-----------|----------|-----------------|--------------------|
| llm | GPT-4o-mini classifies via API | 86.5% | ~1s | ~$0.0003 | None |
| transformer | DeBERTa-v3 ONNX runs locally in Node.js | 92.0% F1 | ~50ms | $0 | ~233MB (once) |
| rag | ChromaDB hybrid vector + BM25 retrieval | 86.1% F1 | ~100ms | $0 | ~188MB (once) |
| rnn | TextCNN + FastText runs locally | 89.9% F1 | ~2.5ms | $0 | ~9MB (once) |

First-use downloads: ML models are automatically downloaded from HuggingFace Hub and cached in ~/.bar_model_cache/. Every subsequent use loads from local cache — fast and free.
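If you want to pick a router programmatically, the trade-offs in the table can be encoded as data. The sketch below copies the published figures and applies a simple selection rule (highest reported score within a download and latency budget). The rule is an assumption for illustration; note that the llm figure is reported as accuracy while the others are F1, so the scores are not strictly comparable.

```typescript
// Figures copied from the router table; the selection logic is illustrative only.
type RouterName = 'llm' | 'transformer' | 'rag' | 'rnn';
type RouterInfo = {
  name: RouterName;
  score: number;      // reported accuracy (llm) or F1 (others), in percent
  latencyMs: number;  // approximate latency
  downloadMB: number; // first-use download size
  usesApi: boolean;   // true if each query costs an API call
};

const ROUTERS: RouterInfo[] = [
  { name: 'llm',         score: 86.5, latencyMs: 1000, downloadMB: 0,   usesApi: true },
  { name: 'transformer', score: 92.0, latencyMs: 50,   downloadMB: 233, usesApi: false },
  { name: 'rag',         score: 86.1, latencyMs: 100,  downloadMB: 188, usesApi: false },
  { name: 'rnn',         score: 89.9, latencyMs: 2.5,  downloadMB: 9,   usesApi: false },
];

// Pick the highest-scoring router that fits the budget, or undefined if none fits.
function chooseRouter(
  maxDownloadMB: number,
  maxLatencyMs: number,
  allowApi: boolean,
): RouterName | undefined {
  const fits = ROUTERS.filter(
    (r) => r.downloadMB <= maxDownloadMB && r.latencyMs <= maxLatencyMs && (allowApi || !r.usesApi),
  );
  fits.sort((a, b) => b.score - a.score);
  return fits.length > 0 ? fits[0].name : undefined;
}

console.log(chooseRouter(10, 10, false));    // rnn
console.log(chooseRouter(500, 1000, true));  // transformer
```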

API Reference

`bar.route(query)` — Full 5-component pipeline

const result = await bar.route(
  'Compare Python and JavaScript for ML, then suggest which to learn first'
);

console.log(result.isComplex);
// true

console.log(result.subQueries);
// ["Compare Python and JavaScript for ML", "suggest which to learn first"]

console.log(result.executionPlan);
// {
//   stages:      [["$1"], ["$2"]],
//   nParallel:   1,
//   nSequential: 1,
//   subQueries: [
//     { id: "$1", text: "Compare...", dependsOn: [], execTag: "PARALLEL" },
//     { id: "$2", text: "suggest...", dependsOn: ["$1"], execTag: "SEQUENTIAL" },
//   ]
// }

console.log(result.routing);
// ["$1 → CODE", "$2 → INSTRUCTION_FOLLOWING"]

console.log(result.answer);
// unified final answer

console.log(result.componentLatency);
// { detect: 2, decompose: 5, plan: 0, route: 1200, answer: 1200, aggregate: 300 }
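A `componentLatency` object like the one in the example is easy to summarize when profiling. The helper below is a hypothetical utility, and the total is only a rough figure that assumes the components run back to back.

```typescript
// Summarize a componentLatency object shaped like the example (values in ms).
type ComponentLatency = { [component: string]: number };

function summarizeLatency(latency: ComponentLatency): { totalMs: number; slowest: string } {
  let totalMs = 0;
  let slowest = '';
  let slowestMs = -Infinity;
  for (const component of Object.keys(latency)) {
    const ms = latency[component];
    totalMs += ms;
    if (ms > slowestMs) { // strict > keeps the first component on ties
      slowest = component;
      slowestMs = ms;
    }
  }
  return { totalMs, slowest };
}

const summary = summarizeLatency({
  detect: 2, decompose: 5, plan: 0, route: 1200, answer: 1200, aggregate: 300,
});
console.log(summary); // { totalMs: 2707, slowest: 'route' }
```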

`bar.detect(query)` — Component 1 only

Detect whether a query is simple or complex, without running the full pipeline.

const simple = await bar.detect('What is Python?');
// { query: "What is Python?", isComplex: false, detector: "rule_based" }

const complex = await bar.detect(
  'Compare Python and JavaScript, then explain which is better for ML'
);
// { query: "...", isComplex: true, detector: "rule_based" }
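The README describes the `rule_based` detector as regex plus heuristics but does not list the rules. As a rough idea of what such a detector could look like, here is a sketch with made-up heuristics; the patterns and the `looksComplex` name are assumptions, not the library's rules.

```typescript
// Illustrative heuristics only; the library's actual rules are not documented here.
const SEQUENCE_MARKERS = /\b(then|after that|afterwards|and also|as well as)\b/i;
const ACTION_VERBS = /\b(compare|explain|write|summarize|suggest|list|describe|translate)\b/gi;

function looksComplex(query: string): boolean {
  const questionMarks = (query.match(/\?/g) || []).length;
  const actions = (query.match(ACTION_VERBS) || []).length;
  // Complex if it chains steps, asks several questions, or requests several actions.
  return SEQUENCE_MARKERS.test(query) || questionMarks > 1 || actions > 1;
}

console.log(looksComplex('What is Python?')); // false
console.log(
  looksComplex('Compare Python and JavaScript, then explain which is better for ML'),
); // true
```

Heuristics like these cost nothing per query, which matches the stated zero-cost trade-off, but they will misjudge queries that an LLM-based detector would handle.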

`bar.decompose(query)` — Component 2 only

Break a query into sub-queries, without routing or answering.

const result = await bar.decompose(
  'Explain diabetes symptoms and write a Python script to track blood sugar'
);

console.log(result.subQueries);
// ["Explain diabetes symptoms", "write a Python script to track blood sugar"]

console.log(result.decomposer);  // "rule_based"
console.log(result.latencyMs);   // 3
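Pattern-based splitting of this kind can be approximated with a single regular expression. The sketch below is an assumption about how such a splitter might work, not the library's implementation: it cuts at "then", at semicolons, and at an "and" that is directly followed by a known action verb.

```typescript
// Illustrative splitter, not the library's rules: cut at "then", ";", or an
// "and" that is directly followed by one of a few known action verbs.
const SPLIT_POINT =
  /\s*(?:,?\s*\bthen\b|;|\band\b(?=\s+(?:write|explain|suggest|compare|summarize|list)\b))\s*/i;

function splitQuery(query: string): string[] {
  return query
    .split(SPLIT_POINT)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

console.log(splitQuery('Explain diabetes symptoms and write a Python script to track blood sugar'));
// [ 'Explain diabetes symptoms', 'write a Python script to track blood sugar' ]
```

The lookahead keeps an ordinary "and" (as in "Python and JavaScript") from splitting a sub-query in the middle of a noun phrase.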

`bar.plan(subQueries)` — Component 3 only

Build a parallel/sequential execution plan from a list of sub-queries.

const result = bar.plan([
  { id: "$1", text: "Fetch the data",         depends_on: [] },
  { id: "$2", text: "Clean the data",         depends_on: ["$1"] },
  { id: "$3", text: "Visualize the results",  depends_on: ["$2"] },
]);

console.log(result.stages);
// [["$1"], ["$2"], ["$3"]]

console.log(result.subQueries);
// [
//   { id: "$1", execTag: "PARALLEL",   dependsOn: [] },
//   { id: "$2", execTag: "SEQUENTIAL", dependsOn: ["$1"] },
//   { id: "$3", execTag: "SEQUENTIAL", dependsOn: ["$2"] },
// ]

console.log(result.nParallel);    // 1
console.log(result.nSequential);  // 2
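The output above is consistent with grouping sub-queries into stages by dependency depth and tagging anything with dependencies as SEQUENTIAL. The sketch below implements that reading; it is an assumption about the planner's behavior, uses camelCase `dependsOn` throughout for simplicity, and expects every dependency id to exist with no cycles.

```typescript
// Illustrative planner: group sub-queries into stages by dependency depth.
// Assumes every dependency id exists and there are no cycles.
type PlanInput = { id: string; text: string; dependsOn: string[] };
type ExecTag = 'PARALLEL' | 'SEQUENTIAL';

function buildPlan(subQueries: PlanInput[]) {
  const byId: { [id: string]: PlanInput } = {};
  const depth: { [id: string]: number } = {};
  subQueries.forEach((sq) => { byId[sq.id] = sq; });

  // Depth 0 = no dependencies; otherwise one more than the deepest dependency.
  const depthOf = (id: string): number => {
    if (depth[id] === undefined) {
      const deps = byId[id].dependsOn;
      depth[id] = deps.length === 0 ? 0 : 1 + Math.max(...deps.map(depthOf));
    }
    return depth[id];
  };

  const stages: string[][] = [];
  const tagged = subQueries.map((sq) => {
    const d = depthOf(sq.id);
    (stages[d] = stages[d] || []).push(sq.id);
    const execTag: ExecTag = sq.dependsOn.length === 0 ? 'PARALLEL' : 'SEQUENTIAL';
    return { id: sq.id, execTag, dependsOn: sq.dependsOn };
  });

  const nParallel = tagged.filter((t) => t.execTag === 'PARALLEL').length;
  return { stages, subQueries: tagged, nParallel, nSequential: tagged.length - nParallel };
}

const chain = buildPlan([
  { id: '$1', text: 'Fetch the data', dependsOn: [] },
  { id: '$2', text: 'Clean the data', dependsOn: ['$1'] },
  { id: '$3', text: 'Visualize the results', dependsOn: ['$2'] },
]);
console.log(chain.stages); // [ [ '$1' ], [ '$2' ], [ '$3' ] ]
```

Sub-queries that land in the same stage have no dependency on each other, so a runner could execute each stage with `Promise.all` and move to the next stage once it settles.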

`bar.classify(query)` — Component 4 only

Classify a query into a category, without decomposing or answering.

const finance = await bar.classify('What are the tax implications of selling stocks?');
// → "FINANCE"

const code = await bar.classify('Write a recursive fibonacci function in Python');
// → "CODE"

const summary = await bar.classify('Summarize this article: ...');
// → "SUMMARIZATION"

Supported Categories

| Category | Default model | Example queries |
|-------------------------|-----------------|----------------------------------------------|
| REASONING | gpt-4o | Logic puzzles, math, multi-step inference |
| CODE | gpt-4o | Programming, debugging, algorithms |
| MEDICAL | gpt-4o | Symptoms, treatments, health advice |
| LEGAL | gpt-4o | Contracts, law, compliance questions |
| FINANCE | gpt-4o | Investing, tax, budgeting, economics |
| CREATIVE_WRITING | gpt-4o-mini | Stories, poems, essays, creative content |
| FACTUAL_KNOWLEDGE | gpt-4o-mini | Facts, history, geography, science |
| INSTRUCTION_FOLLOWING | gpt-4o-mini | Step-by-step tasks, how-to guides |
| CONVERSATIONAL | gpt-4o-mini | Casual chat, opinions, greetings |
| MULTILINGUAL | gpt-4o-mini | Non-English queries, translation requests |
| SUMMARIZATION | gpt-4o-mini | Summarize or condense content |
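When category names flow through your own code (for example, as keys of `categoryModelMap`), a typed list and a guard can catch typos at compile time and unexpected labels at runtime. `CATEGORIES`, `isCategory`, and `toCategory` are hypothetical helpers built from the names in the table, not exports of the package.

```typescript
// Category names from the table; the helpers themselves are hypothetical.
const CATEGORIES = [
  'REASONING', 'CODE', 'MEDICAL', 'LEGAL', 'FINANCE', 'CREATIVE_WRITING',
  'FACTUAL_KNOWLEDGE', 'INSTRUCTION_FOLLOWING', 'CONVERSATIONAL',
  'MULTILINGUAL', 'SUMMARIZATION',
] as const;

type Category = (typeof CATEGORIES)[number];

// Narrow an arbitrary string (e.g. a classifier's raw output) to a known category.
function isCategory(value: string): value is Category {
  return (CATEGORIES as readonly string[]).indexOf(value) !== -1;
}

// Normalize loosely formatted labels such as " finance " or "creative writing".
function toCategory(raw: string): Category | undefined {
  const normalized = raw.trim().toUpperCase().replace(/[\s-]+/g, '_');
  return isCategory(normalized) ? normalized : undefined;
}

console.log(toCategory('creative writing')); // CREATIVE_WRITING
console.log(toCategory('SPORTS'));           // undefined
```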

Agent Framework Integrations

LangChain

import { BAROrchestrator } from 'bar-dynamic-orchestrator';
import { DynamicTool } from '@langchain/core/tools';

const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });

const barTool = new DynamicTool({
  name: 'bar_router',
  description: 'Routes a query to the best LLM and returns the answer',
  func: async (query: string) => {
    const result = await bar.route(query);
    return result.answer;
  },
});

AutoGen

import { BAROrchestrator } from 'bar-dynamic-orchestrator';

const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });

async function barRoute(query: string): Promise<string> {
  const result = await bar.route(query);
  return result.answer;
}
// Register barRoute as a tool in your AutoGen agent

LlamaIndex

import { BAROrchestrator } from 'bar-dynamic-orchestrator';
import { FunctionTool } from 'llamaindex';

const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });

const barTool = FunctionTool.from(
  async ({ query }: { query: string }) => {
    const result = await bar.route(query);
    return result.answer;
  },
  {
    name: 'bar_router',
    description: 'Routes a query to the best LLM and returns the answer',
    parameters: {
      type: 'object',
      properties: { query: { type: 'string' } },
      required: ['query'],
    },
  }
);

Requirements

Node.js >= 18
OpenAI API key: required for answer generation and for the llm options of the router, detector, decomposer, and aggregator

License

MIT — Nipun Hevavitharana
