agent-orchestra

v0.1.0

Multi-agent AI orchestration with confidence scoring for TypeScript

Why agent-orchestra?

Most AI agent frameworks give you building blocks but leave the hard problems to you: when should an agent's output be trusted? What happens when two agents disagree? How do you prevent a confident-but-wrong agent from causing damage?

agent-orchestra is a TypeScript framework for multi-agent orchestration where confidence scoring is the core primitive, not an afterthought. Every agent returns a calibrated confidence score. The orchestrator uses these scores to decide whether to proceed, cross-validate with another agent, or escalate to a human. The result is an AI system that knows what it doesn't know.

npm install agent-orchestra

Quickstart

import {
  Orchestrator,
  defineAgent,
  ConfidenceThresholds,
} from "agent-orchestra";

// 1. Define specialized agents
const reviewer = defineAgent({
  id: "code-reviewer",
  description: "Reviews code changes for correctness",
  execute: async (task, context) => {
    const analysis = await yourLLM.analyze(task.payload);
    return {
      result: analysis.findings,
      confidence: analysis.confidence,
      rationale: analysis.reasoning,
    };
  },
});

const security = defineAgent({
  id: "security-scanner",
  description: "Checks for security vulnerabilities",
  execute: async (task, context) => {
    const scan = await yourLLM.scan(task.payload);
    return {
      result: scan.vulnerabilities,
      confidence: scan.confidence,
      rationale: scan.reasoning,
    };
  },
});

// 2. Create the orchestrator
const orchestra = new Orchestrator({
  agents: [reviewer, security],
  thresholds: ConfidenceThresholds.DEFAULT,
  onEscalation: async (result) => {
    console.log(`Escalating: ${result.rationale}`);
  },
});

// 3. Run
const result = await orchestra.run({
  type: "code-review",
  payload: { diff: "..." },
});

console.log(result.aggregateConfidence); // 0.87
console.log(result.escalations);         // []

That's it: two agent definitions, one orchestrator, and a single run call give you a working multi-agent system with confidence gating.

Architecture

graph TB
    subgraph Orchestrator
        TQ[Task Queue] --> DC[Decomposer]
        DC --> RT[Router]
        RT --> AG[Aggregator]
        AG --> CG[Confidence Gate]
        CG -->|≥ 0.85| PR[Proceed]
        CG -->|0.60 – 0.84| CV[Cross-Validate]
        CG -->|< 0.60| ES[Escalate]
    end

    subgraph Agents
        A1[Agent A]
        A2[Agent B]
        A3[Agent C]
    end

    subgraph Context
        CB[Context Bus]
        HS[Score History]
    end

    RT --> A1 & A2 & A3
    A1 & A2 & A3 --> AG
    A1 & A2 & A3 -.-> CB
    ES -.-> HS

    style CG fill:#1a1a2e,color:#fff
    style PR fill:#16a34a,color:#fff
    style CV fill:#ca8a04,color:#fff
    style ES fill:#dc2626,color:#fff

The orchestrator receives a task, decomposes it into sub-tasks, routes each to the appropriate agent, collects results with confidence scores, and makes a decision. It never writes code or produces content itself — its job is coordination and judgment.

Core Concepts

Agents

An agent is a specialized unit that performs one task well. Each agent implements the Agent interface: a single execute method that takes a Task and returns an AgentResult with a confidence score.

import { defineAgent } from "agent-orchestra";

const myAgent = defineAgent({
  id: "my-agent",
  description: "Does one thing well",
  taskTypes: ["analysis"],        // optional: restrict to task types
  schemaVersion: 1,               // optional: reject incompatible tasks

  execute: async (task, context) => {
    // context.bus — read/write to the shared context bus
    // context.history — past results for this task chain
    const priorFindings = context.bus.get("upstream-findings");

    const result = await doWork(task.payload, priorFindings);

    // Write to context bus for downstream agents
    context.bus.set("my-findings", result.findings);

    return {
      result: result.data,
      confidence: result.confidence,  // 0–1
      rationale: "Explanation of confidence level",
      evidencePaths: result.filesExamined,
    };
  },
});

Agents are model-agnostic. Use OpenAI, Anthropic, a local model, or a deterministic function — agent-orchestra doesn't care how you get the result, only that you return a confidence score with it.

Confidence Scoring

Every AgentResult includes a confidence field (0–1). The framework provides rubric helpers to keep scores calibrated:

import { ConfidenceRubric } from "agent-orchestra";

const reviewRubric = new ConfidenceRubric({
  high:   { range: [0.9, 1.0], criteria: "Full context, established patterns, small diff" },
  medium: { range: [0.7, 0.9], criteria: "Well-understood but touches integration boundaries" },
  low:    { range: [0.5, 0.7], criteria: "Multiple plausible interpretations exist" },
  guess:  { range: [0.0, 0.5], criteria: "Insufficient context to make a determination" },
});

// Use in your agent
const score = reviewRubric.score("medium", 0.78);
// Returns 0.78, validated against the range
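
The README does not say what score does when a value falls outside the range for its level. As a minimal sketch of the validation idea, the stand-in below assumes it throws a RangeError. The class name RubricSketch and the throwing behavior are illustrative assumptions, not the library's implementation.

```typescript
// Hypothetical stand-in for ConfidenceRubric; not the library's code.
type Level = "high" | "medium" | "low" | "guess";

interface Band {
  range: [number, number]; // inclusive lower and upper bound
  criteria: string;
}

class RubricSketch {
  private readonly bands: Record<Level, Band>;

  constructor(bands: Record<Level, Band>) {
    this.bands = bands;
  }

  // Returns the value unchanged when it lies inside the level's range.
  // Assumption: an out-of-range (or NaN) value is a programming error.
  score(level: Level, value: number): number {
    const [lo, hi] = this.bands[level].range;
    if (!(value >= lo && value <= hi)) {
      throw new RangeError(`${value} is outside the "${level}" range [${lo}, ${hi}]`);
    }
    return value;
  }

  // One line per level, for pasting into a prompt or a log.
  describe(): string {
    return (Object.keys(this.bands) as Level[])
      .map((level) => {
        const { range, criteria } = this.bands[level];
        return `${level} [${range[0]}, ${range[1]}]: ${criteria}`;
      })
      .join("\n");
  }
}

const rubricSketch = new RubricSketch({
  high: { range: [0.9, 1.0], criteria: "Full context, established patterns, small diff" },
  medium: { range: [0.7, 0.9], criteria: "Well-understood but touches integration boundaries" },
  low: { range: [0.5, 0.7], criteria: "Multiple plausible interpretations exist" },
  guess: { range: [0.0, 0.5], criteria: "Insufficient context to make a determination" },
});

console.log(rubricSketch.score("medium", 0.78)); // 0.78
```

Failing loudly on a mismatch makes it hard for an agent to label a result "medium" while reporting a score that belongs to another band.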

The orchestrator uses thresholds to gate decisions:

| Score | Action | Description |
|-------|--------|-------------|
| ≥ 0.85 | Proceed | Result is trusted. Move to next step. |
| 0.60 – 0.84 | Cross-validate | Route to a different agent for a second opinion. |
| < 0.60 | Escalate | Pause for human review. |

Thresholds are configurable per deployment:

import { ConfidenceThresholds } from "agent-orchestra";

// Built-in presets
ConfidenceThresholds.DEFAULT    // { proceed: 0.85, review: 0.60 }
ConfidenceThresholds.STRICT     // { proceed: 0.92, review: 0.75 }
ConfidenceThresholds.PERMISSIVE // { proceed: 0.75, review: 0.50 }

// Custom
const custom = { proceed: 0.88, review: 0.65 };
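
The gating rule in the table above can be sketched as a small pure function over a threshold pair. The names gate, GateAction, and DEFAULT_SKETCH are hypothetical and exist only for this illustration. The threshold shape ({ proceed, review }) is taken from the presets shown above.

```typescript
// Illustrative only: `gate` and `GateAction` are not exports of agent-orchestra.
interface ThresholdConfig {
  proceed: number; // scores at or above this are trusted
  review: number;  // scores at or above this, but below `proceed`, get a second opinion
}

type GateAction = "proceed" | "cross-validate" | "escalate";

function gate(confidence: number, thresholds: ThresholdConfig): GateAction {
  if (confidence >= thresholds.proceed) return "proceed";
  if (confidence >= thresholds.review) return "cross-validate";
  return "escalate"; // also reached for NaN, which fails both comparisons
}

const DEFAULT_SKETCH: ThresholdConfig = { proceed: 0.85, review: 0.6 };

console.log(gate(0.87, DEFAULT_SKETCH)); // proceed
console.log(gate(0.72, DEFAULT_SKETCH)); // cross-validate
console.log(gate(0.41, DEFAULT_SKETCH)); // escalate
```

Checking the higher threshold first means a malformed score can only fall through to escalation, never to a silent proceed.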

Cross-Agent Chaining

When Agent A's output becomes Agent B's input, confidence propagates:

import { propagateConfidence } from "agent-orchestra";

const scores = [0.85, 0.78, 0.91];
propagateConfidence(scores); // ≈ 0.70 (product ≈ 0.60, floor 0.78 × 0.9 = 0.702); conservative by design

The formula: max(product(scores), min(scores) * 0.9). Multiplicative attenuation penalizes long uncertain chains while the floor prevents unbounded pessimism.
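
The stated formula can be written out in a few lines. The function below is a sketch under that formula only. It is named propagateSketch to make clear it is not the library's source, and the empty-chain result of 0 follows the API reference example.

```typescript
// Sketch of max(product(scores), min(scores) * 0.9); the real implementation may differ.
function propagateSketch(scores: number[]): number {
  if (scores.length === 0) return 0; // documented result for an empty chain

  const product = scores.reduce((acc, s) => acc * s, 1); // multiplicative attenuation
  const floor = Math.min(...scores) * 0.9;               // bound on pessimism
  return Math.max(product, floor);
}

console.log(propagateSketch([0.9, 0.85]).toFixed(3));       // 0.765
console.log(propagateSketch([0.5, 0.9]).toFixed(3));        // 0.450
console.log(propagateSketch([0.85, 0.78, 0.91]).toFixed(3)); // 0.702 (floor beats the product of about 0.603)
```

With two scores the product and the floor often coincide. The floor starts to matter on longer chains, where the raw product would otherwise keep shrinking.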

The orchestrator handles chaining automatically:

const orchestra = new Orchestrator({
  agents: [analyzer, reviewer, tester],
  chains: [
    { from: "analyzer", to: "reviewer", when: "always" },
    { from: "reviewer", to: "tester", when: "confidence_below", threshold: 0.85 },
  ],
});
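
To make the chain rules concrete, here is one way they might be evaluated after an agent finishes. The ChainRule type and the nextAgents helper are assumptions for illustration. Only the field names (from, to, when, threshold) come from the config above.

```typescript
// Hypothetical evaluation of chain rules; not the orchestrator's actual logic.
type ChainRule =
  | { from: string; to: string; when: "always" }
  | { from: string; to: string; when: "confidence_below"; threshold: number };

// Returns the ids of agents that should run next, given who just finished
// and the confidence it reported.
function nextAgents(rules: ChainRule[], fromId: string, confidence: number): string[] {
  return rules
    .filter((rule) => rule.from === fromId)
    .filter((rule) => rule.when === "always" || confidence < rule.threshold)
    .map((rule) => rule.to);
}

const chainRules: ChainRule[] = [
  { from: "analyzer", to: "reviewer", when: "always" },
  { from: "reviewer", to: "tester", when: "confidence_below", threshold: 0.85 },
];

console.log(nextAgents(chainRules, "analyzer", 0.99)); // [ 'reviewer' ]
console.log(nextAgents(chainRules, "reviewer", 0.8));  // [ 'tester' ]
console.log(nextAgents(chainRules, "reviewer", 0.9));  // []
```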

Disagreement Detection

When two agents assess the same artifact and disagree, the orchestrator detects it:

const orchestra = new Orchestrator({
  agents: [reviewer, security],
  onDisagreement: async (disagreement) => {
    // disagreement.agents — ["code-reviewer", "security-scanner"]
    // disagreement.summaryA — reviewer's rationale
    // disagreement.summaryB — security's rationale
    await notifyHuman(disagreement);
  },
});

The orchestrator never resolves disagreements by averaging scores or picking the higher one. Disagreement between confident agents is a signal that human judgment is needed.

Context Bus

Agents communicate through a shared context bus rather than stuffing full outputs into prompts:

// Agent A writes structured findings
context.bus.set("security-findings", {
  vulnerabilities: [...],
  scannedFiles: [...],
  uncertainties: ["Could not determine if input is sanitized at L42"],
});

// Agent B reads only what it needs
const findings = context.bus.get("security-findings");

This keeps each agent's prompt focused and prevents context pollution.
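
For intuition, a bus with this get/set surface can be as small as a typed wrapper around a Map. The sketch below is an assumption about the shape, not the library's implementation. ContextBusSketch and SecurityFindings are hypothetical names.

```typescript
// Minimal in-memory bus; illustrative only.
class ContextBusSketch {
  private readonly entries = new Map<string, unknown>();

  set(key: string, value: unknown): void {
    this.entries.set(key, value);
  }

  // The caller names the expected type; nothing is validated at runtime.
  get<T = unknown>(key: string): T | undefined {
    return this.entries.get(key) as T | undefined;
  }
}

interface SecurityFindings {
  vulnerabilities: string[];
  scannedFiles: string[];
  uncertainties: string[];
}

const bus = new ContextBusSketch();

// Agent A writes structured findings.
bus.set("security-findings", {
  vulnerabilities: [],
  scannedFiles: ["src/auth.ts"],
  uncertainties: ["Could not determine if input is sanitized"],
});

// Agent B reads only the key it needs and handles the missing case.
const secFindings = bus.get<SecurityFindings>("security-findings");
console.log(secFindings?.uncertainties.length ?? 0); // 1
console.log(bus.get("not-written-yet"));             // undefined
```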

Circuit Breaker

Built-in protection against runaway agents:

const orchestra = new Orchestrator({
  agents: [reviewer],
  circuitBreaker: {
    maxConsecutiveFailures: 3,
    maxActionsPerMinute: 20,
    maxTokenBudget: 500_000,
    onTrip: async (reason) => {
      await alertOps(`Circuit breaker tripped: ${reason}`);
    },
  },
});

Circuit breakers require manual reset by default. An agent that enters a failure loop should not be allowed to retry on its own.
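
The manual-reset behavior is easier to see in code. The sketch below models only the consecutive-failure rule and the reset step. Rate and token budgets are left out, onTrip is synchronous for brevity, and the BreakerSketch class is hypothetical.

```typescript
// Hypothetical breaker: trips after N consecutive failures and stays open until reset().
class BreakerSketch {
  private consecutiveFailures = 0;
  private tripped = false;
  private readonly maxConsecutiveFailures: number;
  private readonly onTrip: (reason: string) => void;

  constructor(maxConsecutiveFailures: number, onTrip: (reason: string) => void) {
    this.maxConsecutiveFailures = maxConsecutiveFailures;
    this.onTrip = onTrip;
  }

  get isOpen(): boolean {
    return this.tripped;
  }

  recordSuccess(): void {
    // A success clears the streak, but never re-closes a tripped breaker.
    if (!this.tripped) this.consecutiveFailures = 0;
  }

  recordFailure(): void {
    if (this.tripped) return;
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.maxConsecutiveFailures) {
      this.tripped = true;
      this.onTrip(`${this.consecutiveFailures} consecutive failures`);
    }
  }

  // Manual reset: only an operator (or explicit code path) calls this.
  reset(): void {
    this.tripped = false;
    this.consecutiveFailures = 0;
  }
}

const tripReasons: string[] = [];
const breaker = new BreakerSketch(3, (reason) => tripReasons.push(reason));

breaker.recordFailure();
breaker.recordFailure();
breaker.recordFailure();
console.log(breaker.isOpen); // true
breaker.recordSuccess();
console.log(breaker.isOpen); // true (still open until reset)
breaker.reset();
console.log(breaker.isOpen); // false
```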

API Reference

defineAgent(config)

Creates a new agent instance.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| id | string | Yes | Unique agent identifier |
| description | string | Yes | Human-readable description |
| taskTypes | string[] | No | Task types this agent handles (all if omitted) |
| schemaVersion | number | No | Reject tasks with incompatible schema versions |
| execute | (task, context) => Promise<AgentResult> | Yes | The agent's work function |

Returns: Agent

new Orchestrator(config)

Creates the orchestrator.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| agents | Agent[] | Yes | Array of agents to coordinate |
| thresholds | ThresholdConfig | No | Confidence thresholds (default: ConfidenceThresholds.DEFAULT) |
| chains | ChainConfig[] | No | Cross-agent chaining rules |
| circuitBreaker | CircuitBreakerConfig | No | Circuit breaker settings |
| maxChainDepth | number | No | Max chaining depth before forced escalation (default: 4) |
| onEscalation | (result) => Promise<void> | No | Called when a result is escalated |
| onDisagreement | (disagreement) => Promise<void> | No | Called when agents disagree |

orchestrator.run(task)

Executes a task through the orchestration pipeline.

| Parameter | Type | Description |
|-----------|------|-------------|
| task | Task | The task to execute |

Returns: Promise<OrchestratorResult>

interface OrchestratorResult {
  results: AgentResult[];
  escalations: Escalation[];
  disagreements: Disagreement[];
  aggregateConfidence: number;
  tokensUsed: number;
  durationMs: number;
}

AgentResult

Returned by every agent execution.

interface AgentResult {
  agentId: string;
  taskId: string;
  result: unknown;
  confidence: number;        // 0–1
  rationale: string;         // why this confidence level
  evidencePaths?: string[];  // files/resources examined
}

propagateConfidence(scores)

Computes aggregate confidence across a chain.

propagateConfidence([0.9, 0.85])  // 0.765
propagateConfidence([0.5, 0.9])   // 0.45
propagateConfidence([])            // 0

ConfidenceRubric

Helper for anchoring confidence scores to observable criteria.

const rubric = new ConfidenceRubric({ ... });
rubric.score(level, value)  // Validates value is in the level's range
rubric.describe()           // Returns human-readable rubric description

Comparison

| | agent-orchestra | LangGraph | CrewAI | AutoGen/AG2 |
|---|---|---|---|---|
| Core abstraction | Confidence-gated orchestration | State machine graphs | Role-based crews | Multi-party conversation |
| Confidence scoring | First-class primitive with rubrics, calibration, propagation | Not built-in | Not built-in | Not built-in |
| Disagreement detection | Automatic with configurable resolution | Manual | Manual | Emergent (uncontrolled) |
| Circuit breakers | Built-in | Not built-in | Not built-in | Not built-in |
| Cross-agent chaining | Declarative with confidence gating | Graph edges | Sequential/parallel tasks | Conversation turns |
| Human-in-the-loop | Confidence-triggered escalation | Checkpoint-based | Limited | Manual |
| Language | TypeScript | Python, TypeScript | Python | Python |
| Model lock-in | None | LangChain ecosystem | None | None |
| Framework weight | ~12 KB (zero dependencies) | Heavy (LangChain) | Medium | Medium |

When to use agent-orchestra: You need multiple AI agents to coordinate on tasks where reliability matters more than speed, and you want fine-grained control over when the system trusts its own output.

When to use something else: You need a full application framework (LangGraph), rapid prototyping with role-based agents (CrewAI), or conversational multi-agent research (AutoGen).

Features at a Glance

  • Zero runtime dependencies — ships at ~13 KB minified
  • Dual CJS/ESM — works in Node.js, Bun, Deno, and bundlers
  • Full TypeScript — strict types with exported interfaces for everything
  • Model-agnostic — use OpenAI, Anthropic, local models, or deterministic functions
  • 38 tests — comprehensive coverage of confidence, agents, circuit breakers, and orchestration

Observability

agent-orchestra emits structured events for every orchestration decision:

const orchestra = new Orchestrator({
  agents: [reviewer, security],
  logger: {
    onAgentStart: (agentId, task) => { /* ... */ },
    onAgentComplete: (agentId, result) => { /* ... */ },
    onConfidenceGate: (result, decision) => { /* ... */ },
    onDisagreement: (disagreement) => { /* ... */ },
    onCircuitBreak: (reason) => { /* ... */ },
  },
});

Every event includes agentId, taskId, timestamp, tokensUsed, and confidence. Pipe these to your observability stack (Datadog, Grafana, LangSmith, or a plain JSON log) for dashboards, alerting, and calibration monitoring.
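
As one way to get a plain JSON log, the hooks can be wrapped so that each call emits a single JSON line. This is a sketch under assumptions. The hook names come from the example above, but the payload shapes are treated as opaque, and jsonLineLogger, the Sink type, and the event labels are hypothetical.

```typescript
// Sketch: turns each logger hook into one JSON line written to a sink.
type Sink = (line: string) => void;

function jsonLineLogger(sink: Sink) {
  const emit = (event: string, data: Record<string, unknown>): void => {
    sink(JSON.stringify({ event, loggedAt: new Date().toISOString(), ...data }));
  };

  return {
    onAgentStart: (agentId: string, task: unknown) => emit("agent_start", { agentId, task }),
    onAgentComplete: (agentId: string, result: unknown) => emit("agent_complete", { agentId, result }),
    onConfidenceGate: (result: unknown, decision: unknown) => emit("confidence_gate", { result, decision }),
    onDisagreement: (disagreement: unknown) => emit("disagreement", { disagreement }),
    onCircuitBreak: (reason: unknown) => emit("circuit_break", { reason }),
  };
}

// Collect lines in memory here; in practice the sink could be console.log or a file stream.
const logLines: string[] = [];
const jsonLogger = jsonLineLogger((line) => {
  logLines.push(line);
});

jsonLogger.onAgentStart("code-reviewer", { type: "code-review" });
jsonLogger.onConfidenceGate({ confidence: 0.72 }, "cross-validate");
console.log(logLines.length); // 2
```

The returned object could then be passed as the logger option shown above, assuming the hook signatures match.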

Contributing

See CONTRIBUTING.md for development setup, testing, and PR guidelines.

License

MIT © Amey Bhalerao