agnostic-agents
v1.3.3
A Node.js package for multi-LLM agent support
agnostic-agents
agnostic-agents is a Node.js runtime OS for building provider-agnostic agent systems.
The longer direction is broader than provider agnosticism alone: the package is
meant to stay agnostic to how intelligence is produced, whether that comes from
an LLM, tool, workflow, simulator, verifier, or human checkpoint.
It is designed for projects that need more than a chat wrapper:
- inspectable runs
- checkpoints and replay
- approvals and policy-gated tools
- workflows and delegation
- grounded retrieval and layered memory
- governed memory with provenance, retention, conflict handling, and access controls
- distributed handoff across processes or services
- evals, benchmarks, and incident analysis
- API and protocol tools imported from OpenAPI, curl, and MCP discovery
- import, sandboxing, recording, and simulation of external tools before production use
The package also includes a separate coordination layer above the runtime for:
- structured critique
- disagreement resolution
- task decomposition
- coordination benchmarks
It also includes governed learning, fleet rollout, and assurance surfaces for:
- reviewed improvement proposals and bounded adaptation
- staged rollout and canary control across many runtimes
- invariant checks and rollout blocking before unsafe changes spread
The maintained architecture is:
- runtime OS first
- coordination intelligence above the runtime
- governed learning above coordination
- fleet and assurance layers above runtime, coordination, and learning
- operator control above those layers
- next-horizon capability routing, governed memory, and budgeted autonomy on top of the current core
Install
npm install agnostic-agents
The package ships a maintained index.d.ts, so TypeScript projects get typed access to the public surface without extra setup.
What This Package Is Good For
Use agnostic-agents when you want:
- a single-agent runtime with tools, approvals, memory, and replay
- a workflow runner with explicit steps and child-run lineage
- a provider-agnostic execution layer behind one adapter contract
- a runtime substrate for higher-level worker or organization systems
- a package that keeps control, observability, and governance in the open
- a package that can turn evidence into governed improvement proposals instead of hidden self-modification
- a package that can keep learned changes inside explicit safety envelopes and measure their effect after review
- a package that can turn eval, incident, and branch evidence into concrete runtime or coordination change plans
- a package that can benchmark learned changes and halt adaptation when those changes start regressing outcomes
- a package that can stage, compare, and roll back changes across a fleet of runtimes
- a package that can block unsafe rollout candidates with explicit invariants and assurance reports
- a package that can evolve toward capability-aware routing instead of static model/provider defaults
- a package that treats memory as governed operational knowledge instead of ad hoc retrieval glue
- a package that can audit what memory was stored, recalled, blocked, expired, or superseded
- a package that treats supervised autonomy as a core operating model rather than an exception path
- a package that can turn OpenAPI files, curl commands, and MCP discovery into governed executable tools
- a package that can bootstrap tools from Postman collections, record real tool I/O, mock them offline, and resolve secrets outside prompts and code
Do not think of it as only a prompt helper or a chat abstraction. The maintained direction is a runtime control layer for serious agent systems.
The repo is intentionally split so that:
- runtime primitives stay general-purpose and portable
- coordination logic stays inspectable and separate from the runtime kernel
- learning/adaptation remains governed instead of hidden in opaque agent glue
- fleet rollout and assurance remain operator-visible instead of collapsing into opaque control-plane logic
Fastest Start
If you just want a working tool-using agent, start here:
const { Agent, Tool, OpenAIAdapter } = require('agnostic-agents');

const adapter = new OpenAIAdapter(process.env.OPENAI_API_KEY, {
  model: 'gpt-4o-mini',
});

const calculator = new Tool({
  name: 'calculate',
  description: 'Evaluate a basic arithmetic expression.',
  parameters: {
    type: 'object',
    properties: {
      expression: { type: 'string' },
    },
    required: ['expression'],
  },
  // Demo only: Function() executes arbitrary code, so never feed it untrusted input.
  implementation: async ({ expression }) => ({
    result: Function(`"use strict"; return (${expression})`)(),
  }),
});

const agent = new Agent(adapter, {
  tools: [calculator],
  description: 'Use tools when they help.',
  defaultConfig: { temperature: 0.2, maxTokens: 300 },
});

// CommonJS has no top-level await, so wrap the call in an async function.
async function main() {
  const answer = await agent.sendMessage('What is 12 * 7?');
  console.log(answer);
}

main().catch(console.error);
If you want the maintained runtime path instead of the shortest chat path, use Agent.run():
const { Agent, Tool, OpenAIAdapter, RunInspector } = require('agnostic-agents');

const adapter = new OpenAIAdapter(process.env.OPENAI_API_KEY, {
  model: 'gpt-4o-mini',
});

const sendUpdate = new Tool({
  name: 'send_status_update',
  parameters: {
    type: 'object',
    properties: {
      recipient: { type: 'string' },
      summary: { type: 'string' },
    },
    required: ['recipient', 'summary'],
  },
  // The policy metadata makes the runtime pause for approval before this tool runs.
  metadata: {
    executionPolicy: 'require_approval',
    sideEffectLevel: 'external_write',
  },
  implementation: async ({ recipient, summary }) => ({
    delivered: true,
    recipient,
    summary,
  }),
});

const agent = new Agent(adapter, {
  tools: [sendUpdate],
  verifier: adapter,
  defaultConfig: { selfVerify: true },
});

// CommonJS has no top-level await, so wrap the run in an async function.
async function main() {
  let run = await agent.run('Send Paulo a short update saying the runtime is ready.');
  if (run.status === 'waiting_for_approval') {
    run = await agent.resumeRun(run.id, {
      approved: true,
      reason: 'approved in demo',
    });
  }
  console.dir(RunInspector.summarize(run), { depth: null });
}

main().catch(console.error);
Core Surfaces
Runtime
The maintained runtime surface centers on:
- Agent
- Run
- RunInspector
- ToolPolicy
- ApprovalInbox
- GovernanceHooks
That gives you:
- pause/resume/cancel
- tool approval gating
- replay and branching
- structured events
- run inspection and trace export
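The pause-for-approval flow listed above can be pictured with a small self-contained sketch. It does not use the package and is not its implementation: `startRun`, `resumeRun`, and the event names are hypothetical, and only the `executionPolicy: 'require_approval'` metadata and the `waiting_for_approval` status come from the examples in this README.

```javascript
// Hypothetical sketch of approval gating; not the agnostic-agents implementation.
function startRun(tool, args) {
  const run = { id: 'run-1', status: 'running', events: [], pending: null, output: null };
  if (tool.metadata && tool.metadata.executionPolicy === 'require_approval') {
    // Park the call instead of executing it, and record why the run stopped.
    run.status = 'waiting_for_approval';
    run.pending = { tool, args };
    run.events.push({ type: 'approval_requested', tool: tool.name });
    return run;
  }
  return execute(run, tool, args);
}

function execute(run, tool, args) {
  run.output = tool.implementation(args);
  run.status = 'completed';
  run.events.push({ type: 'tool_completed', tool: tool.name });
  return run;
}

function resumeRun(run, decision) {
  if (run.status !== 'waiting_for_approval') {
    throw new Error(`Run ${run.id} is not waiting for approval`);
  }
  const { tool, args } = run.pending;
  run.pending = null;
  run.events.push({ type: 'approval_resolved', approved: decision.approved, reason: decision.reason });
  if (!decision.approved) {
    run.status = 'cancelled';
    return run;
  }
  return execute(run, tool, args);
}

const sendUpdateTool = {
  name: 'send_status_update',
  metadata: { executionPolicy: 'require_approval' },
  implementation: ({ recipient }) => ({ delivered: true, recipient }),
};

let gatedRun = startRun(sendUpdateTool, { recipient: 'Paulo' });
const statusBeforeApproval = gatedRun.status;
gatedRun = resumeRun(gatedRun, { approved: true, reason: 'approved in demo' });
console.log(statusBeforeApproval, '->', gatedRun.status);
```

Because every transition is appended to `events`, the same record that drives the pause can also serve inspection and trace export.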
API and MCP Tool Import
The maintained import surfaces are:
- OpenAPILoader
- ApiLoader
- CurlLoader
- PostmanLoader
- MCPDiscoveryLoader
Use them when you want to:
- turn an OpenAPI file into executable runtime tools
- normalize a custom API spec into runtime tools
- bootstrap a tool from a working curl command
- bootstrap tools from Postman collections and variables
- discover remote MCP tools and expose them through the normal tool/runtime path
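As a rough illustration of the first item, the sketch below maps one OpenAPI operation onto the tool shape used in the Fastest Start examples (name, description, JSON-schema parameters, policy metadata). The helper `operationToToolDefinition` is hypothetical; the output of the real `OpenAPILoader` may look different.

```javascript
// Hypothetical helper; the real OpenAPILoader output shape may differ.
function operationToToolDefinition(path, method, operation) {
  const properties = {};
  const required = [];
  for (const param of operation.parameters || []) {
    properties[param.name] = {
      type: (param.schema && param.schema.type) || 'string',
      description: param.description || '',
    };
    if (param.required) required.push(param.name);
  }
  const isRead = method.toLowerCase() === 'get';
  return {
    // Fall back to a name derived from method and path when operationId is absent.
    name: operation.operationId || `${method}_${path.replace(/[^a-zA-Z0-9]+/g, '_')}`,
    description: operation.summary || `${method.toUpperCase()} ${path}`,
    parameters: { type: 'object', properties, required },
    // Assumption: anything other than GET is treated as a side-effecting call.
    metadata: isRead
      ? {}
      : { executionPolicy: 'require_approval', sideEffectLevel: 'external_write' },
  };
}

const toolDefinition = operationToToolDefinition('/status-updates', 'post', {
  operationId: 'createStatusUpdate',
  summary: 'Send a status update',
  parameters: [
    { name: 'recipient', in: 'query', required: true, schema: { type: 'string' } },
    { name: 'priority', in: 'query', schema: { type: 'integer' } },
  ],
});

console.log(JSON.stringify(toolDefinition, null, 2));
```

Deriving the approval metadata at import time is one way to keep imported tools on the same governed path as hand-written ones.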
The maintained utility surfaces above the runtime also include:
- SecretResolver
- SchemaNormalizer
- ToolRecorder
- ToolMockBuilder
- ToolSandboxRunner
- PromptArtifact
- PromptRegistry
- RunRecipe
- WorkflowPreset
- IncidentBundleExporter
- CredentialDelegationKit
- RoutePolicySimulator
The first maintained memory-governance surfaces now include:
- MemoryProvenanceLedger
- MemoryRetentionPolicy
- MemoryAccessController
- MemoryConflictResolver
- MemoryAuditView
- MemoryGovernanceBenchmarkSuite
- MemoryGovernanceDiagnostics
- MemoryGovernanceReviewWorkflow
Workflow and Delegation
For explicit multi-step work, use:
- Workflow
- WorkflowStep
- AgentWorkflowStep
- WorkflowRunner
- DelegationRuntime
- DelegationContract
This is the maintained orchestration layer. Legacy Task and Orchestrator remain only for compatibility.
Example:
const {
  Agent,
  Workflow,
  AgentWorkflowStep,
  WorkflowRunner,
  DelegationRuntime,
  DelegationContract,
  OpenAIAdapter,
} = require('agnostic-agents');

const adapter = new OpenAIAdapter(process.env.OPENAI_API_KEY);
const delegationRuntime = new DelegationRuntime();

const researcher = new Agent(adapter, {
  description: 'Produce concise factual bullet points.',
});
const writer = new Agent(adapter, {
  description: 'Turn findings into short status updates.',
});

const workflow = new Workflow({
  id: 'daily-sync',
  steps: [
    new AgentWorkflowStep({
      id: 'research',
      agent: researcher,
      delegationRuntime,
      delegationContract: new DelegationContract({
        id: 'research-contract',
        assignee: 'researcher',
        requiredInputs: ['prompt'],
      }),
      prompt: 'List three runtime capabilities in bullet form.',
    }),
    new AgentWorkflowStep({
      id: 'draft_update',
      agent: writer,
      // Runs only after the research step has produced its output.
      dependsOn: ['research'],
      delegationRuntime,
      delegationContract: new DelegationContract({
        id: 'writer-contract',
        assignee: 'writer',
        requiredInputs: ['prompt'],
      }),
      prompt: ({ results }) => `Turn this research into a short update:\n${results.research.output}`,
    }),
  ],
});

// CommonJS has no top-level await, so run the workflow inside an async function.
async function main() {
  const run = await new WorkflowRunner({ workflow }).run('Prepare a daily sync update');
  console.log(run.output);
}

main().catch(console.error);
Memory and Retrieval
Use:
- Memory for conversation, working, profile, policy, and semantic layers
- RAG for grounded retrieval with provenance
const { Agent, Memory, RAG, LocalVectorStore, OpenAIAdapter } = require('agnostic-agents');

const adapter = new OpenAIAdapter(process.env.OPENAI_API_KEY);
const memory = new Memory();
const rag = new RAG({
  adapter,
  vectorStore: new LocalVectorStore(),
});

// CommonJS has no top-level await, so do the async setup inside a function.
async function main() {
  await rag.index(['Runtime control requires replay, branching, and inspection.']);
  await memory.setWorkingMemory('current_project', 'Validate the runtime release');

  const agent = new Agent(adapter, {
    memory,
    rag,
    description: 'Use retrieved context when relevant.',
  });
  console.log(await agent.sendMessage('What does runtime control require?'));
}

main().catch(console.error);
Distributed Execution
Use shared stores plus distributed envelopes when one process creates a run and another continues it.
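const { Agent, FileRunStore, OpenAIAdapter } = require('agnostic-agents');

// Any adapters work here; OpenAI is used only to keep the snippet complete.
const adapterA = new OpenAIAdapter(process.env.OPENAI_API_KEY);
const adapterB = new OpenAIAdapter(process.env.OPENAI_API_KEY);

// Both agents share one run store, which is what makes the handoff possible.
const runStore = new FileRunStore({ directory: './runtime-runs' });
const processA = new Agent(adapterA, { runStore });
const processB = new Agent(adapterB, { runStore });

async function main() {
  const run = await processA.run('Prepare the handoff.');
  const envelope = await processA.createDistributedEnvelope(run.id, {
    action: 'replay',
    checkpointId: run.checkpoints[run.checkpoints.length - 1].id,
    metadata: { handoffTarget: 'worker-b', transport: 'service_call' },
  });
  const remoteRun = await processB.continueDistributedRun(envelope);
  console.log(remoteRun.status);
}

main().catch(console.error);
Coordination Layer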
Above the runtime core, the package now includes:
- CritiqueProtocol
- CritiqueSchemaRegistry
- risk-class and artifact-type-aware critique defaults through CritiqueSchemaRegistry
- TrustRegistry
- DisagreementResolver
- CoordinationLoop
- DecompositionAdvisor
- CoordinationBenchmarkSuite
- CoordinationRoleContract
- RoleAwareCoordinationPlanner
- CoordinationTrace
- DisagreementResolver strategies for weighted, majority, trust-consensus, and severity-first coordination
- VerificationStrategySelector
- MultiPassVerificationEngine
- CoordinationQualityTracker
- CoordinationDiagnostics
This layer is for:
- structured critique records
- trust-weighted disagreement handling
- role-aware decomposition and assignment
- disagreement strategies shaped by trust, severity, and consensus mode
- multi-pass and adversarial verification paths for higher-risk coordination
- verifier quality tracked separately from executor quality
- operator-facing coordination diagnostics for disagreement, missing roles, and verification escalation
- coordination evals and benchmarks
It is intentionally separate from the runtime kernel.
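To make the difference between majority and trust-weighted disagreement handling concrete, here is a minimal self-contained sketch. The function `resolveDisagreement`, the critique record shape, and the default trust of 0.5 are assumptions made for illustration; they are not the `DisagreementResolver` API.

```javascript
// Hypothetical sketch of two disagreement strategies; not the package's DisagreementResolver.
function resolveDisagreement(critiques, trust, strategy = 'weighted') {
  const totals = {};
  for (const { reviewer, verdict } of critiques) {
    // Majority counts every reviewer equally; weighted uses the reviewer's trust score.
    const weight = strategy === 'majority' ? 1 : (trust[reviewer] ?? 0.5);
    totals[verdict] = (totals[verdict] || 0) + weight;
  }
  const ranked = Object.entries(totals).sort((a, b) => b[1] - a[1]);
  const [verdict, score] = ranked[0];
  return { strategy, verdict, score, totals };
}

const critiques = [
  { reviewer: 'verifier-a', verdict: 'accept' },
  { reviewer: 'verifier-b', verdict: 'accept' },
  { reviewer: 'verifier-c', verdict: 'reject' },
];
const trust = { 'verifier-a': 0.3, 'verifier-b': 0.3, 'verifier-c': 0.9 };

const byMajority = resolveDisagreement(critiques, trust, 'majority');
const byTrust = resolveDisagreement(critiques, trust, 'weighted');
console.log(byMajority.verdict, byTrust.verdict);
```

With these inputs the two strategies disagree: two low-trust reviewers outvote one high-trust reviewer under majority, while the weighted totals favor the high-trust verdict. Returning `totals` alongside the verdict keeps the decision inspectable.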
Forward Direction
The current package already includes:
- v9 Policy OS core
- v10 State OS core
- v11 Interop OS core
- v12 Coordination OS core
- v13 Learning OS core
- v14 Fleet OS core
- v15 Assurance OS baseline
The maintained direction after this is refinement and extension of those layers:
- interop depth and external ecosystem adoption
- stronger coordination quality and diagnostics
- governed learning refinement instead of hidden autonomy
- stronger fleet and assurance hardening above the completed core
The next horizon after the current shipped core is:
- v16 Operator OS: richer day-2 control, intervention, triage, and governance continuity
That sequence is intentional. The package should become smarter without collapsing policy, state, interop, coordination, and learning into one opaque layer.
The current maintained v14 baseline starts with:
- FleetRolloutPlan
- FleetHealthMonitor
- FleetCanaryEvaluator
- FleetSafetyController
- FleetImpactComparator
- FleetRollbackAdvisor
- RouteFleetDiagnostics
The current maintained v15 baseline starts with:
- InvariantRegistry
- AssuranceSuite
- AssuranceReport
- AssuranceGuardrail
- AssuranceRecoveryPlanner
- FleetSafetyController
- FleetImpactComparator
- FleetRollbackAdvisor
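The idea of blocking an unsafe rollout candidate with explicit invariants can be sketched in a few lines of plain JavaScript. Everything here is hypothetical (the invariant shape, `evaluateRollout`, the candidate fields, and the 2% threshold); it illustrates the concept rather than the `InvariantRegistry` or `AssuranceSuite` APIs.

```javascript
// Hypothetical invariants; each one is a named predicate over a rollout candidate.
const invariants = [
  {
    id: 'error-rate-ceiling',
    description: 'Canary error rate must stay at or below 2%',
    check: (candidate) => candidate.canary.errorRate <= 0.02,
  },
  {
    id: 'approval-policy-intact',
    description: 'Externally writing tools must still require approval',
    check: (candidate) =>
      candidate.tools.every(
        (tool) =>
          tool.sideEffectLevel !== 'external_write' ||
          tool.executionPolicy === 'require_approval'
      ),
  },
];

// Produce a small report: the rollout is blocked if any invariant fails.
function evaluateRollout(candidate, rules) {
  const violations = rules
    .filter((rule) => !rule.check(candidate))
    .map(({ id, description }) => ({ id, description }));
  return { candidateId: candidate.id, blocked: violations.length > 0, violations };
}

const report = evaluateRollout(
  {
    id: 'rollout-42',
    canary: { errorRate: 0.05 },
    tools: [
      {
        name: 'send_status_update',
        sideEffectLevel: 'external_write',
        executionPolicy: 'require_approval',
      },
    ],
  },
  invariants
);

console.log(JSON.stringify(report, null, 2));
```

The report names each violated invariant, so an operator can see why a candidate was stopped rather than only that it was.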
The current v16 baseline starts with:
- OperatorSummary
- OperatorInterventionPlanner
- OperatorTriageWorkflow
- GovernanceRecordLedger
- AuditStitcher
- GovernanceTimeline
- OperatorDashboardSnapshot
- OperatorControlLoop
Useful docs for these layers:
Policy and Governance
Use:
- ToolPolicy for raw policy logic
- PolicyPack for portable policy artifacts
- PolicySimulator for policy simulation over requests, runs, and trace bundles
- PolicyEvaluationRecord for portable policy evaluation artifacts
- PolicyDecisionReport.explain() for operator-facing policy explanations
- PolicyScopeResolver for scoped policy inheritance across runtime, workflow, agent, and distributed handoff layers
- PolicyLifecycleManager for draft promotion and rollback of policy packs
- ApprovalEscalationPolicySuite for simulating approval and escalation policy scenarios before rollout
- RecoveryPolicyGate for applying policy constraints to replay, branch, and resume recovery paths
- CompensationPolicyPlanner for applying policy to compensation recommendations on side-effecting work
- CoordinationPolicyGate for applying policy to coordination outcomes like retry, reject, or escalate
- ExtensionHost for contributed policy and governance behavior
- ProductionPolicyPack for a maintained production-oriented preset
- ApprovalInbox and GovernanceHooks for operator-facing control
State and Replay
Use:
- Run for inspectable runtime state
- TraceSerializer for portable run/trace export
- StateBundle for portable run-plus-memory state snapshots
- StateDiff for high-level state comparison
- StateBundleSerializer for import/export and validation of state bundles
- StateContractRegistry for authoritative-versus-derived state contracts
- StateIntegrityChecker for pre-restore integrity checks
- StateConsistencyChecker for coherence checks across run state, memory, and portable job metadata
- MemoryAccessContractRegistry for explicit memory access contracts across runtime, workflow, coordination, learning, and operator surfaces
- StateRestorePlanner for cross-environment restore planning from portable state bundles
- StateDurableRestoreSuite for process, queue, and service restore scenarios plus workflow/scheduler durability steps
- StateIncidentReconstructor for offline incident reconstruction directly from portable state bundles
This means users can add or mutate policy dynamically without patching core runtime code.
Useful docs for these surfaces:
Provider Surface
The package supports multiple providers behind one adapter contract.
Shared maintained contract:
- generateText()
- getCapabilities()
- supports(capability)
Capability support and certification vary by provider. Use these docs instead of assuming parity:
If you want the strongest maintained end-to-end path today, start with OpenAI.
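A minimal sketch of what the shared contract implies for calling code follows. `EchoAdapter`, `pickAdapter`, the capability names, and the `generateText` argument and return shapes are assumptions made for illustration; only the three method names come from the contract above.

```javascript
// Hypothetical in-memory adapter: a sketch of the contract, not a class shipped by the package.
class EchoAdapter {
  constructor(name, capabilities) {
    this.name = name;
    this.capabilities = new Set(capabilities);
  }

  // Assumed signature: takes chat-style messages and resolves to { message }.
  async generateText(messages) {
    const last = messages[messages.length - 1];
    return { message: `[${this.name}] ${last.content}` };
  }

  getCapabilities() {
    return [...this.capabilities];
  }

  supports(capability) {
    return this.capabilities.has(capability);
  }
}

// Pick the first adapter that declares the capability instead of assuming parity.
function pickAdapter(adapters, capability) {
  const match = adapters.find((adapter) => adapter.supports(capability));
  if (!match) {
    throw new Error(`No adapter supports "${capability}"`);
  }
  return match;
}

const adapters = [
  new EchoAdapter('basic', ['generateText']),
  new EchoAdapter('tools', ['generateText', 'toolCalling']),
];

const chosen = pickAdapter(adapters, 'toolCalling');
console.log(chosen.name, chosen.getCapabilities());
```

Checking `supports()` before relying on a feature fails loudly when no provider qualifies, which is usually easier to debug than a silent fallback.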
Maintained Examples
Local/no-key examples:
- npm run example:local-tool
- npm run example:local-rag
- npm run example:local-rag-tool
- npm run example:reference-worker
- npm run example:reference-incident
- npm run example:reference-operator
- npm run example:reference-operator-dashboard
- npm run example:reference-evals
- npm run example:reference-replay-benchmarks
- npm run example:reference-adaptive-benchmarks
- npm run example:reference-v7-audit
- npm run example:reference-coordination-review
- npm run example:reference-decomposition-advisor
- npm run example:reference-coordination-benchmarks
- npm run example:reference-coordination-policy-gate
- npm run example:reference-role-aware-coordination
- npm run example:reference-production-policy-pack
- npm run example:reference-policy-simulation
- npm run example:reference-policy-inheritance
- npm run example:reference-policy-lifecycle
- npm run example:reference-approval-escalation-policy-suite
- npm run example:reference-recovery-policy-gate
- npm run example:reference-compensation-policy-planner
- npm run example:reference-state-bundle
- npm run example:reference-state-restore-planner
- npm run example:reference-state-incident-reconstructor
- npm run example:reference-file-backed-stack
- npm run example:reference-worker-coordination-benchmarks
- npm run example:reference-openapi
- npm run example:reference-durable-backends
- npm run example:reference-distributed-handoff
- npm run example:reference-distributed-incident
- npm run example:reference-remote-control-plane
- npm run example:reference-deployment-split
- npm run example:reference-distributed-recovery
Provider-backed examples:
- npm run example:openai
- npm run example:gemini
- npm run example:openai-runtime
- npm run example:openai-v3-runtime
- npm run example:openai-v4-runtime
See examples/README.md for labels and intent.
Recommended Starting Paths
If you are new to the package:
- start with local-tool or openai
- move to Agent.run() and RunInspector
- add ToolPolicy and ApprovalInbox if side effects matter
- add WorkflowRunner and delegation if you need explicit orchestration
- move to file-backed or split deployment references when persistence and operations matter
If you are evaluating the package for serious runtime usage:
- read docs/reference-integrations.md
- read docs/common-stack-integrations.md
- run reference-file-backed-stack
- run reference-deployment-split
- run reference-public-control-plane
- review docs/operator-workflows.md
Documentation Map
API and package surface:
Runtime operations:
- Reference integrations
- Common stack integrations
- Operator workflows
- Operator architecture
- Operator day-2 guidance
- Operator checklists
- Operator OS
- Governed improvement
- Fleet OS
- Assurance OS
- Multi-runtime operations
- Public control-plane references
- Run and trace visualization
- Distributed execution
- Remote control planes
- Distributed identities
- Storage backends
Policy, governance, and security:
Evals and ecosystem:
- Benchmarking
- Benchmark fixtures
- Capability fabric
- Community summaries
- Community roadmap status
- Plugin authoring
- Extension certification and compatibility
- Ecosystem certification guidance
- Runtime extension contributor guidance
- Interop OS
- Interop artifact registry
- Interop schema evolution
- Interop contract validation
- Interop certification kits
- Third-party control plane interop
- MCP interoperability
- OpenAPI examples
- Cookbook
Provider guidance:
Development
Run the maintained test suites with:
npm test
Other useful commands:
npm run test:unit
npm run test:integration
npm run test:live
npm run test:all
npm run test:coverage
Current Scope
This package currently targets:
- provider-agnostic runtime execution
- inspectable runs with replay and lineage
- approval-gated tool execution
- layered memory and grounded retrieval
- explicit workflows and delegation
- distributed handoff and recovery
- adaptive runtime tuning and operator-facing evals
- a separate coordination layer above the runtime
- governed learning and bounded adaptation
- fleet rollout and rollback control
- assurance suites and rollout guardrails
- operator-centered triage and intervention workflows
- operator dashboard snapshots and control-loop references
Forward Direction
The current package already covers the runtime, coordination, governed
learning, fleet, assurance, and operator core program through v16.
The next horizon is not "add basic runtime features." That substrate already exists. The next versions are about turning that substrate into a stronger autonomy operating layer:
- v17 Capability Fabric OS: capability-aware routing across models, tools, methods, simulators, verifiers, budgets, and trust zones
- v18 Memory Governance OS: governed memory with provenance, retention, redaction, conflict handling, and operator-visible lifecycle rules
- v19 Budgeted Autonomy OS: uncertainty thresholds, reusable human approvals, jurisdiction/tenant-aware policies, and explicit autonomy budgets
- v20 Enterprise Autonomy OS: integrated execution graphs, transactional side-effect discipline, multi-agent safety controls, and a coherent AI operating-layer story
The target shape is six layers:
- kernel
- policy
- workflow
- intelligence
- memory
- operator
The governing rule stays the same:
- reliability over demo complexity
- observability over abstraction hype
- supervised autonomy over blind automation
- governed memory over ad hoc retrieval
- policy and evaluation over hidden self-modification
The first maintained v17 surface is CapabilityRouter, which adds
explainable capability-aware ranking above raw provider fallback.
That routing surface now also plugs into decomposition, role-aware coordination,
and verification strategy selection.
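Explainable capability-aware ranking, as opposed to a fixed provider fallback order, might look roughly like the sketch below. It is self-contained and hypothetical: `rankCandidates`, the candidate fields, the capability names, and the scoring formula are illustrative assumptions, not the `CapabilityRouter` API.

```javascript
// Hypothetical ranking sketch; not the package's CapabilityRouter.
function rankCandidates(candidates, requirements) {
  const preferred = requirements.preferred || [];
  return candidates
    .map((candidate) => {
      const missing = requirements.required.filter(
        (cap) => !candidate.capabilities.includes(cap)
      );
      const preferredHits = preferred.filter((cap) => candidate.capabilities.includes(cap));
      const overBudget = candidate.costPerCall > requirements.maxCostPerCall;
      const eligible = missing.length === 0 && !overBudget;

      // Every decision carries its reasons so an operator can audit the ranking.
      const reasons = [];
      if (missing.length > 0) reasons.push(`missing required: ${missing.join(', ')}`);
      if (overBudget) reasons.push(`cost ${candidate.costPerCall} exceeds budget ${requirements.maxCostPerCall}`);
      if (preferredHits.length > 0) reasons.push(`matches preferred: ${preferredHits.join(', ')}`);

      const score = eligible ? preferredHits.length * 10 - candidate.costPerCall : 0;
      return { id: candidate.id, eligible, score, reasons };
    })
    // Eligible candidates first, then higher score first.
    .sort((a, b) => (b.eligible - a.eligible) || (b.score - a.score));
}

const ranked = rankCandidates(
  [
    { id: 'fast-model', capabilities: ['text'], costPerCall: 1 },
    { id: 'tool-model', capabilities: ['text', 'toolCalling', 'structuredOutput'], costPerCall: 3 },
    { id: 'premium-model', capabilities: ['text', 'toolCalling', 'structuredOutput', 'vision'], costPerCall: 10 },
  ],
  { required: ['text', 'toolCalling'], preferred: ['structuredOutput'], maxCostPerCall: 5 }
);

console.log(JSON.stringify(ranked, null, 2));
```

Ineligible candidates stay in the result with their reasons rather than being dropped, so the ranking also explains what was ruled out.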
It does not try to be:
- a hosted control plane
- a closed worker-management product
- a provider-specific framework disguised as a generic runtime
And it should not become:
- a runtime that hides coordination and learning decisions behind unreadable internal heuristics
- a self-modifying agent loop that bypasses policy, replay, or operator review
