bar-dynamic-orchestrator
v2.0.11
BAR - Dynamic Multi-Model Orchestrator — Business-Aware LLM Routing pipeline
BAR — Dynamic Multi-Model Orchestrator
Business-Aware LLM Routing — a fully standalone JS/TS library that automatically routes queries to the right LLM based on content type, with built-in query decomposition for complex multi-part questions.
No Python. No FastAPI. No backend server required.
How It Works

BAR runs a 5-component pipeline on every query — each component is independently configurable:
| # | Component         | What it does                                       |
|---|-------------------|----------------------------------------------------|
| 1 | Detector          | Decides if the query is simple or complex          |
| 2 | Decomposer        | Splits complex queries into sub-queries            |
| 3 | Execution Planner | Assigns PARALLEL or SEQUENTIAL tags to sub-queries |
| 4 | Router            | Classifies each sub-query into a category          |
| 5 | Aggregator        | Combines all sub-answers into one final response   |
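To make the order of the table concrete, here is a minimal sketch of how five such stages could be chained. It is not the library's internal code: the `Stages` interface, `runPipeline`, and the `answer` callback are hypothetical names introduced for illustration, and everything is synchronous to keep the example short.

```typescript
// Hypothetical stage signatures, named after the five components in the table.
interface Stages {
  detect: (query: string) => boolean;          // 1. is the query complex?
  decompose: (query: string) => string[];      // 2. split into sub-queries
  plan: (subQueries: string[]) => number[][];  // 3. stages of sub-query indexes
  route: (subQuery: string) => string;         // 4. category label
  aggregate: (answers: string[]) => string;    // 5. merge sub-answers
}

// `answer` stands in for the per-sub-query LLM call (an assumption of this sketch).
function runPipeline(
  query: string,
  stages: Stages,
  answer: (subQuery: string, category: string) => string,
): string {
  // Simple queries skip decomposition and are treated as a single sub-query.
  const subQueries = stages.detect(query) ? stages.decompose(query) : [query];
  const answers = new Array<string>(subQueries.length).fill('');
  for (const stage of stages.plan(subQueries)) {
    // Items inside one stage do not depend on each other,
    // so a real implementation could run them concurrently.
    for (const index of stage) {
      const category = stages.route(subQueries[index]);
      answers[index] = answer(subQueries[index], category);
    }
  }
  return stages.aggregate(answers);
}
```

Passing the stages in as plain functions mirrors the idea that each component can be swapped independently.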
Installation
npm install bar-dynamic-orchestrator
npx bar-orchestrator setup-models
setup-models downloads all ML models (~430MB) into ~/.bar_model_cache/. This runs once; every subsequent use loads instantly from the local cache.
Project setup
Make sure your package.json has "type": "module":
{
"type": "module"
}
Quick Start
import { BAROrchestrator } from 'bar-dynamic-orchestrator';
const bar = new BAROrchestrator({ openaiKey: 'sk-...' });
const result = await bar.route('Explain the symptoms of diabetes');
console.log(result.answer);
console.log(result.routing); // ["$1 → MEDICAL"]
console.log(result.executionPlan); // stages, parallel/sequential counts
console.log(result.componentLatency); // latency per component in ms
Full Configuration
Every component of the pipeline can be configured independently:
import { BAROrchestrator } from 'bar-dynamic-orchestrator';
const bar = new BAROrchestrator({
// ── Required ──────────────────────────────────────────────────────────────
openaiKey: 'sk-...', // or set OPENAI_API_KEY env var
// ── Component 1: Complexity Detector ──────────────────────────────────────
detector: 'rule_based', // 'rule_based' | 'llm'
// rule_based → regex + heuristics, zero cost, instant (default)
// llm → GPT judges complexity, most accurate
// ── Component 2: Query Decomposer ─────────────────────────────────────────
decomposer: 'rule_based', // 'rule_based' | 'llm'
// rule_based → pattern-based splitting, instant, zero cost (default)
// llm → GPT decomposes, handles nuanced multi-part queries
// ── Component 4: Router ───────────────────────────────────────────────────
router: 'llm', // 'llm' | 'transformer' | 'rag' | 'rnn'
// llm → GPT-4o-mini classifies, 86.5% acc, ~$0.0003/query (default)
// transformer → DeBERTa ONNX local, 92.0% F1, $0, ~233MB first download
// rag → ChromaDB hybrid retrieval, 86.1% F1, $0, ~188MB first download
// rnn → TextCNN local, 89.9% F1, $0, ~9MB first download, 2.5ms
// ── Component 5: Aggregator ───────────────────────────────────────────────
aggregation: 'simple', // 'simple' | 'llm'
// simple → labeled sections "Part 1 / Part 2" (default)
// llm → GPT synthesizes all answers into one unified natural response
// ── Answer generation model ───────────────────────────────────────────────
model: 'gpt-4o-mini', // default model for answer generation
// ── Category → Model mapping ──────────────────────────────────────────────
categoryModelMap: {
REASONING: 'gpt-4o',
CODE: 'gpt-4o',
MEDICAL: 'gpt-4o',
LEGAL: 'gpt-4o',
FINANCE: 'gpt-4o',
CREATIVE_WRITING: 'gpt-4o-mini',
FACTUAL_KNOWLEDGE: 'gpt-4o-mini',
INSTRUCTION_FOLLOWING: 'gpt-4o-mini',
CONVERSATIONAL: 'gpt-4o-mini',
MULTILINGUAL: 'gpt-4o-mini',
SUMMARIZATION: 'gpt-4o-mini',
},
});
Config presets
| Preset | detector | decomposer | router | aggregation | Best for |
|--------|--------------|--------------|---------------|-------------|------------------------------------|
| A | rule_based | rule_based | llm | simple | Easiest setup, zero model download |
| B | rule_based | rule_based | rnn | simple | Fastest inference, tiny download |
| C | rule_based | rule_based | transformer | simple | Highest accuracy, free after download |
| D | llm | llm | transformer | llm | Maximum quality, LLM detector, decomposer, and aggregator |
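The presets can be kept as plain objects and spread into the constructor options. The sketch below is one way to do that; `PRESETS` and the local types are hypothetical helpers written for this example (the library is not stated to export them), and the values are copied from the table.

```typescript
// Local types mirroring the option values listed under Full Configuration.
type StageMode = 'rule_based' | 'llm';
type RouterMode = 'llm' | 'transformer' | 'rag' | 'rnn';
type AggregationMode = 'simple' | 'llm';

interface PipelineOptions {
  detector: StageMode;
  decomposer: StageMode;
  router: RouterMode;
  aggregation: AggregationMode;
}

// Hypothetical helper: one entry per row of the presets table.
const PRESETS: Record<'A' | 'B' | 'C' | 'D', PipelineOptions> = {
  A: { detector: 'rule_based', decomposer: 'rule_based', router: 'llm', aggregation: 'simple' },
  B: { detector: 'rule_based', decomposer: 'rule_based', router: 'rnn', aggregation: 'simple' },
  C: { detector: 'rule_based', decomposer: 'rule_based', router: 'transformer', aggregation: 'simple' },
  D: { detector: 'llm', decomposer: 'llm', router: 'transformer', aggregation: 'llm' },
};

// Usage, assuming the constructor accepts these keys as shown in Full Configuration:
// const bar = new BAROrchestrator({ openaiKey: 'sk-...', ...PRESETS.C });
```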
Router Approaches (Component 4)
| Router | How it works | Accuracy | Latency | Cost/query | First-use download |
|---------------|-------------------------------------------|-----------|----------|-----------------|--------------------|
| llm | GPT-4o-mini classifies via API | 86.5% | ~1s | ~$0.0003 | None |
| transformer | DeBERTa-v3 ONNX runs locally in Node.js | 92.0% F1 | ~50ms | $0 | ~233MB (once) |
| rag | ChromaDB hybrid vector + BM25 retrieval | 86.1% F1 | ~100ms | $0 | ~188MB (once) |
| rnn | TextCNN + FastText runs locally | 89.9% F1 | ~2.5ms | $0 | ~9MB (once) |
First-use downloads: ML models are downloaded automatically from the Hugging Face Hub and cached in ~/.bar_model_cache/. Every subsequent use loads from the local cache, with no further download or cost.
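If the router has to be chosen programmatically, the trade-offs in the table can be encoded as data. The following is a rough sketch under stated assumptions: `ROUTER_PROFILES` and `pickRouter` are hypothetical names, the numbers are the approximate figures from the table, and the score column mixes accuracy (llm) with F1 (the others), so the ranking is only indicative.

```typescript
type RouterName = 'llm' | 'transformer' | 'rag' | 'rnn';

interface RouterProfile {
  name: RouterName;
  score: number;       // accuracy or F1 from the table, in percent
  latencyMs: number;   // approximate per-query latency
  downloadMb: number;  // first-use download size
  usesApi: boolean;    // true when classification itself calls the OpenAI API
}

// Figures copied from the Router Approaches table.
const ROUTER_PROFILES: RouterProfile[] = [
  { name: 'llm', score: 86.5, latencyMs: 1000, downloadMb: 0, usesApi: true },
  { name: 'transformer', score: 92.0, latencyMs: 50, downloadMb: 233, usesApi: false },
  { name: 'rag', score: 86.1, latencyMs: 100, downloadMb: 188, usesApi: false },
  { name: 'rnn', score: 89.9, latencyMs: 2.5, downloadMb: 9, usesApi: false },
];

interface RouterLimits {
  maxLatencyMs?: number;
  maxDownloadMb?: number;
  allowApi?: boolean;
}

// Returns the highest-scoring router that satisfies every limit, or undefined if none does.
function pickRouter(limits: RouterLimits = {}): RouterName | undefined {
  const { maxLatencyMs = Infinity, maxDownloadMb = Infinity, allowApi = true } = limits;
  const candidates = ROUTER_PROFILES.filter(
    (r) => r.latencyMs <= maxLatencyMs && r.downloadMb <= maxDownloadMb && (allowApi || !r.usesApi),
  );
  candidates.sort((a, b) => b.score - a.score);
  return candidates[0]?.name;
}
```

For example, `pickRouter({ maxDownloadMb: 10 })` leaves only `llm` and `rnn` as candidates and prefers the one with the higher listed score.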
API Reference
bar.route(query) — Full 5-component pipeline
const result = await bar.route(
'Compare Python and JavaScript for ML, then suggest which to learn first'
);
console.log(result.isComplex);
// true
console.log(result.subQueries);
// ["Compare Python and JavaScript for ML", "suggest which to learn first"]
console.log(result.executionPlan);
// {
// stages: [["$1"], ["$2"]],
// nParallel: 1,
// nSequential: 1,
// subQueries: [
// { id: "$1", text: "Compare...", dependsOn: [], execTag: "PARALLEL" },
// { id: "$2", text: "suggest...", dependsOn: ["$1"], execTag: "SEQUENTIAL" },
// ]
// }
console.log(result.routing);
// ["$1 → CODE", "$2 → INSTRUCTION_FOLLOWING"]
console.log(result.answer);
// unified final answer
console.log(result.componentLatency);
// { detect: 2, decompose: 5, plan: 0, route: 1200, answer: 1200, aggregate: 300 }
bar.detect(query) — Component 1 only
Detect whether a query is simple or complex, without running the full pipeline.
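For intuition about what a regex-and-heuristics detector can look like, here is a small sketch. The patterns are guesses made for this example and are not the rules the library ships; `MULTI_PART_PATTERNS` and `looksComplex` are hypothetical names.

```typescript
// Illustrative patterns only: each one hints that the query contains more than one task.
const MULTI_PART_PATTERNS: RegExp[] = [
  /,\s*then\b/i,                                            // "..., then ..."
  /\band then\b/i,                                          // "... and then ..."
  /\bafter that\b/i,                                        // "... after that ..."
  /\band\s+(?:write|explain|compare|summarize|suggest)\b/i, // "... and <second instruction>"
];

// Flags a query as complex when it has several question marks or matches any pattern.
function looksComplex(query: string): boolean {
  const questionMarks = (query.match(/\?/g) ?? []).length;
  return questionMarks > 1 || MULTI_PART_PATTERNS.some((pattern) => pattern.test(query));
}
```

A heuristic like this costs nothing per query but misses phrasings it has no pattern for, which is the trade-off against the llm detector described under Full Configuration.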
const simple = await bar.detect('What is Python?');
// { query: "What is Python?", isComplex: false, detector: "rule_based" }
const complex = await bar.detect(
  'Compare Python and JavaScript, then explain which is better for ML'
);
// { query: "...", isComplex: true, detector: "rule_based" }
bar.decompose(query) — Component 2 only
Break a query into sub-queries, without routing or answering.
const result = await bar.decompose(
'Explain diabetes symptoms and write a Python script to track blood sugar'
);
console.log(result.subQueries);
// ["Explain diabetes symptoms", "write a Python script to track blood sugar"]
console.log(result.decomposer); // "rule_based"
console.log(result.latencyMs); // 3
bar.plan(subQueries) — Component 3 only
Build a parallel/sequential execution plan from a list of sub-queries.
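One way to derive such a plan is to layer sub-queries by their dependencies: a sub-query joins the earliest stage in which all of its dependencies are already done. The sketch below illustrates that idea and is not the library's planner. `buildPlan` and its types are hypothetical, the input uses `dependsOn` for brevity, and the tagging rule (no dependencies means PARALLEL, otherwise SEQUENTIAL) is inferred from the example outputs in this README.

```typescript
interface SubQueryNode {
  id: string;
  dependsOn: string[];
}

type ExecTag = 'PARALLEL' | 'SEQUENTIAL';

interface PlanSketch {
  stages: string[][];
  execTags: Record<string, ExecTag>;
  nParallel: number;
  nSequential: number;
}

function buildPlan(subQueries: SubQueryNode[]): PlanSketch {
  const done = new Set<string>();
  const stages: string[][] = [];
  let remaining = [...subQueries];

  while (remaining.length > 0) {
    // Everything whose dependencies are all satisfied can run in this stage.
    const ready = remaining.filter((q) => q.dependsOn.every((dep) => done.has(dep)));
    if (ready.length === 0) {
      throw new Error('Cyclic or unknown dependency');
    }
    stages.push(ready.map((q) => q.id));
    ready.forEach((q) => done.add(q.id));
    remaining = remaining.filter((q) => !done.has(q.id));
  }

  // Assumed tagging rule: independent sub-queries are PARALLEL, dependent ones SEQUENTIAL.
  const execTags: Record<string, ExecTag> = {};
  let nParallel = 0;
  let nSequential = 0;
  for (const q of subQueries) {
    if (q.dependsOn.length === 0) {
      execTags[q.id] = 'PARALLEL';
      nParallel += 1;
    } else {
      execTags[q.id] = 'SEQUENTIAL';
      nSequential += 1;
    }
  }
  return { stages, execTags, nParallel, nSequential };
}
```

Throwing on an empty `ready` set keeps a circular dependency from looping forever.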
const result = bar.plan([
{ id: "$1", text: "Fetch the data", depends_on: [] },
{ id: "$2", text: "Clean the data", depends_on: ["$1"] },
{ id: "$3", text: "Visualize the results", depends_on: ["$2"] },
]);
console.log(result.stages);
// [["$1"], ["$2"], ["$3"]]
console.log(result.subQueries);
// [
// { id: "$1", execTag: "PARALLEL", dependsOn: [] },
// { id: "$2", execTag: "SEQUENTIAL", dependsOn: ["$1"] },
// { id: "$3", execTag: "SEQUENTIAL", dependsOn: ["$2"] },
// ]
console.log(result.nParallel); // 1
console.log(result.nSequential); // 2
bar.classify(query) — Component 4 only
Classify a query into a category, without decomposing or answering.
const finance = await bar.classify('What are the tax implications of selling stocks?');
// → "FINANCE"
const code = await bar.classify('Write a recursive fibonacci function in Python');
// → "CODE"
const summary = await bar.classify('Summarize this article: ...');
// → "SUMMARIZATION"
Supported Categories
| Category | Default model | Example queries |
|-------------------------|-----------------|----------------------------------------------|
| REASONING | gpt-4o | Logic puzzles, math, multi-step inference |
| CODE | gpt-4o | Programming, debugging, algorithms |
| MEDICAL | gpt-4o | Symptoms, treatments, health advice |
| LEGAL | gpt-4o | Contracts, law, compliance questions |
| FINANCE | gpt-4o | Investing, tax, budgeting, economics |
| CREATIVE_WRITING | gpt-4o-mini | Stories, poems, essays, creative content |
| FACTUAL_KNOWLEDGE | gpt-4o-mini | Facts, history, geography, science |
| INSTRUCTION_FOLLOWING | gpt-4o-mini | Step-by-step tasks, how-to guides |
| CONVERSATIONAL | gpt-4o-mini | Casual chat, opinions, greetings |
| MULTILINGUAL | gpt-4o-mini | Non-English queries, translation requests |
| SUMMARIZATION | gpt-4o-mini | Summarize or condense content |
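The category-to-model lookup implied by this table can be sketched as a map with fallbacks. This is an illustration, not the library's code: `resolveModel` is a hypothetical helper, and the merge order (caller overrides first, then the defaults above, then `gpt-4o-mini`) is an assumption about how `categoryModelMap` and `model` could interact.

```typescript
// Fallback when a category has no mapping, matching the documented default for `model`.
const DEFAULT_MODEL = 'gpt-4o-mini';

// Defaults copied from the Supported Categories table.
const DEFAULT_CATEGORY_MODELS: Record<string, string> = {
  REASONING: 'gpt-4o',
  CODE: 'gpt-4o',
  MEDICAL: 'gpt-4o',
  LEGAL: 'gpt-4o',
  FINANCE: 'gpt-4o',
  CREATIVE_WRITING: 'gpt-4o-mini',
  FACTUAL_KNOWLEDGE: 'gpt-4o-mini',
  INSTRUCTION_FOLLOWING: 'gpt-4o-mini',
  CONVERSATIONAL: 'gpt-4o-mini',
  MULTILINGUAL: 'gpt-4o-mini',
  SUMMARIZATION: 'gpt-4o-mini',
};

// Assumed precedence: caller override, then table default, then DEFAULT_MODEL.
function resolveModel(category: string, overrides: Record<string, string> = {}): string {
  return overrides[category] ?? DEFAULT_CATEGORY_MODELS[category] ?? DEFAULT_MODEL;
}
```

With this precedence, `resolveModel('SUMMARIZATION', { SUMMARIZATION: 'gpt-4o' })` picks the override, while an unknown category falls back to `gpt-4o-mini`.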
Agent Framework Integrations
LangChain
import { BAROrchestrator } from 'bar-dynamic-orchestrator';
import { DynamicTool } from '@langchain/core/tools';
const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });
const barTool = new DynamicTool({
name: 'bar_router',
description: 'Routes a query to the best LLM and returns the answer',
func: async (query: string) => {
const result = await bar.route(query);
return result.answer;
},
});
AutoGen
import { BAROrchestrator } from 'bar-dynamic-orchestrator';
const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });
async function barRoute(query: string): Promise<string> {
const result = await bar.route(query);
return result.answer;
}
// Register barRoute as a tool in your AutoGen agent
LlamaIndex
import { BAROrchestrator } from 'bar-dynamic-orchestrator';
import { FunctionTool } from 'llamaindex';
const bar = new BAROrchestrator({ openaiKey: 'sk-...', router: 'transformer' });
const barTool = FunctionTool.from(
async ({ query }: { query: string }) => {
const result = await bar.route(query);
return result.answer;
},
{
name: 'bar_router',
description: 'Routes a query to the best LLM and returns the answer',
parameters: {
type: 'object',
properties: { query: { type: 'string' } },
required: ['query'],
},
}
);
Requirements
- Node.js >= 18
- OpenAI API key: needed for answer generation and for the llm router, detector, and decomposer options
License
MIT — Nipun Hevavitharana
