agent-conductor
v0.3.0
Published
A tiny dependency-free TypeScript runtime for structured agent actions.
Readme
agent-conductor
agent-conductor is a tiny, dependency-free TypeScript runtime for structured agent actions.
It helps applications expose a small, safe action surface to AI agents: agents can discover capabilities, propose ordered plans, preview impact, and execute approved actions through your trusted backend code.
The small surface area is intentional. MCP tool definitions and runtime manifests usually end up inside the model context window, so every action name, description, schema, and tool result competes with the user's actual task for tokens. agent-conductor is designed around a compact workflow surface instead of a one-tool-per-API-endpoint wrapper.
Discover capabilities → create plan → validate → preview → executeInstead of giving an agent arbitrary code execution or hundreds of low-level tools, agent-conductor gives it allowlisted workflow actions, schema validation, dry runs, approval-friendly previews, and sequential execution.
What it is
agent-conductor is a structured action runtime.
Agent / MCP client
↓
Small tool surface
↓
agent-conductor runtime
↓
Developer-defined actions
↓
Application backendYour app owns the dangerous parts: auth, data access, side effects, billing, logging, and tenant boundaries. agent-conductor owns the middle: discovery, validation, preview, execution, and MCP-compatible tool descriptions. Keep actions coarse enough to match the model's natural workflow, but strict enough that each action is still safe, auditable, and easy to approve.
It is not:
- an AI framework
- an MCP framework
- a sandbox
- a database/task system
- an LLM client
- a workflow language
Install
npm install agent-conductorThis package currently has no runtime dependencies.
Quick start
import { action, createRuntime, s } from "agent-conductor";
type AppContext = {
userId: string;
tasks: {
createTask(input: {
title: string;
priority?: "low" | "normal" | "high";
}): Promise<{ id: string }>;
assignTask(input: {
taskId: string;
assignee: string;
}): Promise<{ id: string }>;
};
};
const runtime = createRuntime<AppContext>({
actions: {
create_task: action({
description: "Create a task",
tags: ["tasks", "planning"],
input: s.object({
title: s.string().min(1).max(200),
priority: s.enum(["low", "normal", "high"]).optional(),
}),
output: s.object({
id: s.string(),
}),
preview: ({ input }) => ({
title: `Create task "${input.title}"`,
impact: "Creates one new task",
}),
requiresApproval: true,
risk: "low",
execute: async ({ input, ctx }) => {
return ctx.tasks.createTask(input);
},
}),
assign_task: action({
description: "Assign a task to a teammate",
tags: ["tasks", "collaboration"],
input: s.object({
taskId: s.string(),
assignee: s.string(),
}),
preview: ({ input }) => ({
title: `Assign task ${input.taskId} to ${input.assignee}`,
impact: "Updates one task assignment",
}),
execute: async ({ input, ctx }) => {
return ctx.tasks.assignTask(input);
},
}),
},
limits: {
maxActions: 50,
},
});Use it with an ordered plan:
const plan = {
summary: "Plan a launch checklist",
actions: [
{
id: "announcement",
type: "create_task",
input: {
title: "Draft launch announcement",
priority: "high",
},
},
{
type: "assign_task",
input: {
taskId: { $ref: "announcement.output.id" },
assignee: "Sam",
},
},
],
};
const validation = await runtime.validate(plan, ctx);
const preview = await runtime.preview(plan, ctx);
// Show preview to a user, then execute after approval.
const result = await runtime.execute(plan, ctx);Example: OpenAI + SQLite
The core runtime does not call an LLM or own your data layer, but it is designed to sit between them.
A typical flow looks like this:
User request
↓
OpenAI generates a JSON plan
↓
agent-conductor validates and previews the plan
↓
User approves
↓
agent-conductor executes allowlisted actions against SQLiteSee examples/openai-sqlite-tasks.ts for a complete example using:
- OpenAI Chat Completions via
fetch - built-in
node:sqlite create_task,assign_task, andcomplete_taskactions- preview-before-execute approval
- step output references like
{ "$ref": "first_task.output.id" }
Run it with a recent Node version that supports node:sqlite:
OPENAI_API_KEY=sk-... npx tsx examples/openai-sqlite-tasks.ts \
"Create a launch checklist with three tasks and assign the first one to Sam"
tsxis only used to run the TypeScript example directly. It is not required by agent-conductor.
For a smaller Anthropic Claude example, see examples/claude-minimal.ts. It exposes one add_note action, asks Claude to return a JSON plan, previews it, and executes it:
ANTHROPIC_API_KEY=sk-ant-... npx tsx examples/claude-minimal.ts \
"Add a note saying ship the tiny runtime"Runtime API
runtime.search(query?);
runtime.planSchema();
runtime.describe();
runtime.validate(plan, ctx);
runtime.preview(plan, ctx);
runtime.execute(plan, ctx, options?);
runtime.mcpTools();
runtime.handleToolCall(name, input, ctx);search(query?)
Returns a compact list of available actions. Search is intentionally simple in v1: lowercase matching across action name, description, tags, and input field names.
runtime.search("tasks");{
"actions": [
{
"type": "create_task",
"description": "Create a task",
"tags": ["tasks", "planning"],
"input": {
"type": "object",
"properties": {
"title": { "type": "string" },
"priority": {
"type": "string",
"enum": ["low", "normal", "high"],
"optional": true
}
},
"required": ["title"],
"additionalProperties": false
},
"output": {
"type": "object",
"properties": {
"id": { "type": "string" }
},
"required": ["id"],
"additionalProperties": false
},
"requiresApproval": true,
"risk": "low"
}
]
}planSchema()
Returns a compact JSON-schema-like description of the plan envelope, including known action types and their input schemas.
runtime.planSchema();Treat this as prompt material. It should be small enough to include in model context when you ask an agent to produce a plan.
describe()
Returns a runtime manifest containing action capabilities and the plan schema. This is useful when prompting an LLM to produce valid plans.
const manifest = runtime.describe();The manifest is intentionally compact, but it still grows with every action, field, description, and output schema. Prefer a handful of workflow-shaped actions over a large mirror of your internal API.
validate(plan, ctx)
Validates plan shape, action count limits, known action types, input schemas, and optional authorization.
Validation failures return structured errors instead of throwing:
{
"ok": false,
"errors": [
{
"code": "INVALID_INPUT",
"message": "Expected one of: low, normal, high",
"path": "actions.0.input.priority",
"actionIndex": 0,
"actionType": "create_task"
}
]
}preview(plan, ctx)
Validates the plan and returns approval-friendly preview data. It never executes action handlers.
{
"ok": true,
"summary": "Plan a launch checklist",
"steps": [
{
"index": 0,
"id": "announcement",
"type": "create_task",
"title": "Create task \"Draft launch announcement\"",
"impact": "Creates one new task"
}
],
"requiresApproval": true,
"approval": {
"required": true,
"stepIndexes": [0],
"destructive": false,
"highestRisk": "low"
}
}execute(plan, ctx, options?)
Revalidates the plan and executes actions sequentially. Execution stops on the first failed action.
const result = await runtime.execute(plan, ctx, {
signal,
dryRun: false,
});The runtime passes an AbortSignal to each action. Cancellation is cooperative: handlers should observe the signal for best results. A timeout or external abort makes execute() return an aborted/failed result, but JavaScript cannot forcibly stop handler code that ignores the signal, so side effects may still complete in the background.
Actions
An action is the unit of work an agent may request.
type Action<Input, Output, Context> = {
description: string;
tags?: string[];
input: Schema<Input>;
output?: Schema<Output>;
requiresApproval?: boolean;
destructive?: boolean;
risk?: "low" | "medium" | "high";
preview?: (args: {
input: Input;
ctx: Context;
}) => MaybePromise<Preview>;
execute: (args: {
input: Input;
ctx: Context;
signal?: AbortSignal;
}) => MaybePromise<Output>;
authorize?: (args: {
input: Input;
ctx: Context;
}) => MaybePromise<boolean>;
idempotencyKey?: (args: {
input: Input;
ctx: Context;
}) => string | undefined;
};Required fields:
descriptioninputexecute
Optional fields:
tagsoutput— validates action outputs and exposes output shape during capability discoveryrequiresApproval— defaults totrue; used by previews and capability descriptionsdestructive— marks actions that may delete, overwrite, charge, send, or otherwise cause higher-impact side effectsrisk— optional"low" | "medium" | "high"metadata for approval UIspreviewauthorizeidempotencyKey— skips duplicate actions with the same key within one execution and reuses the prior output
Plan format
Plans are ordered lists, not programs.
type Plan = {
summary?: string;
actions: PlanStep[];
};
type PlanStep = {
id?: string;
type: string;
input: unknown;
};There is no branching, looping, imports, network access, filesystem access, or generated code execution. Agents compose allowlisted actions; they do not create new execution semantics.
References between steps
Later steps can reference outputs from earlier steps:
{
"actions": [
{
"id": "announcement",
"type": "create_task",
"input": {
"title": "Draft launch announcement"
}
},
{
"type": "assign_task",
"input": {
"taskId": {
"$ref": "announcement.output.id"
},
"assignee": "Sam"
}
}
]
}Reference rules:
- refs must be pure objects:
{ "$ref": "stepId.output.path" } - malformed refs are rejected during validation
- refs can only point to previous steps
- refs can only read from outputs
- refs are resolved before each step executes
- resolved inputs are validated again before execution
- if a referenced step has an
outputschema, ref output paths are checked during validation - without an
outputschema, unresolved output paths may still fail at execution - no expressions or string interpolation
Schema builder
agent-conductor includes a small serializable schema builder exposed as s.
const input = s.object({
title: s.string().min(1).max(200),
priority: s.enum(["low", "normal", "high"]).optional(),
labels: s.array(s.string()).max(10).optional(),
});
const parsed = input.parse(value);
const json = input.toJSON();Supported schemas:
s.string()with.min(length),.max(length),.regex(pattern)s.number()with.min(value),.max(value)s.int()with.min(value),.max(value)s.boolean()s.literal(value)s.enum(values)s.array(schema)with.min(length),.max(length)s.object(shape)— rejects unknown propertiess.union([...])s.discriminatedUnion(key, [...])s.optional(schema)/.optional()s.nullable(schema)/.nullable()s.describe(schema, description)/.describe(description)s.refine(schema, predicate, message)/.refine(predicate, message)s.record(schema)s.unknown()
No Zod. No Ajv. No runtime dependencies.
Context boundary
The host application provides context for every operation:
await runtime.execute(plan, {
userId,
orgId,
tasks,
permissions,
logger,
});Actions receive the same context:
execute: async ({ input, ctx }) => {
if (!ctx.permissions.canCreateTask) {
throw new Error("Not allowed");
}
return ctx.tasks.createTask(input);
};agent-conductor does not manage auth, storage, tenant isolation, logging, tracing, feature flags, rate limits, or billing. Put those in your context and action handlers.
Hooks
Hooks provide basic observability and integration points without a plugin system. Hook failures are treated as programmer/integration errors and may throw from the runtime method that invoked them.
const runtime = createRuntime({
actions,
hooks: {
beforeValidate({ plan, ctx }) {},
afterValidate({ plan, result, ctx }) {},
beforeStep({ step, input, ctx }) {},
afterStep({ step, output, ctx }) {},
onStepError({ step, error, ctx }) {},
},
});Limits and safety
const runtime = createRuntime({
actions,
limits: {
maxActions: 50,
maxInputBytes: 100_000,
maxOutputBytes: 100_000,
stepTimeoutMs: 10_000,
},
});Execution is sequential in v1. Parallelism, transactions, rollback, queues, retries, and branching are intentionally excluded.
Limit values must be non-negative safe integers. Passing undefined for a limit override keeps the default value.
MCP-compatible surface
agent-conductor does not depend on the MCP SDK, but it can expose SDK-agnostic tool definitions. In most MCP clients, those tool definitions are serialized into the model context before generation. That means MCP design is partly context design: a large tool set can make the model slower, less reliable, and more expensive before it calls a single tool.
const tools = runtime.mcpTools();Default tools:
search_capabilitiesvalidate_planpreview_planexecute_plan
Dispatch tool calls through:
const response = await runtime.handleToolCall(name, input, ctx);
// execute_plan also accepts { dryRun: true } in its tool input.A host using an MCP SDK can wire the tools manually:
for (const tool of runtime.mcpTools()) {
server.tool(
tool.name,
tool.description,
tool.inputSchema,
async input => runtime.handleToolCall(tool.name, input, ctx),
);
}agent-conductor does not know about transports, sessions, clients, stdio, or HTTP.
MCP context budget
When exposing a runtime through MCP, optimize for the model's working context rather than for REST-style endpoint purity:
- Prefer workflow actions over thin endpoint wrappers. For example,
get_project_snapshotis usually easier for a model thanget_project,get_member,get_permission, andget_activitychained together. - Keep action names explicit and namespaced enough to survive clients with many installed tools.
- Write descriptions for the model, not just for humans. Include when to use the action and what kind of result to expect, but avoid long examples unless they materially improve tool choice.
- Keep schemas tight. Remove optional fields the model does not need, use enums for constrained choices, and avoid deeply nested inputs when a flatter shape is enough.
- Return compact outputs. Tool results also enter the context window, so prefer IDs, summaries, and bounded lists over full records by default.
- Use
runtime.search(query)before showing a full manifest when the client flow allows it. Search lets an agent narrow the action set before you spend tokens on every capability. - Test with a small context budget or a weaker/local model. If tool selection only works with a huge context window, the MCP surface is probably doing too much.
This is why agent-conductor exposes four stable MCP tools for the runtime itself and keeps your application actions inside the plan manifest. The model can discover capabilities, validate a proposed plan, preview impact, and execute after approval without carrying hundreds of direct backend tools.
Error model
Runtime errors are structured:
type RuntimeError = {
code:
| "INVALID_PLAN"
| "UNKNOWN_ACTION"
| "UNKNOWN_TOOL"
| "INVALID_INPUT"
| "UNAUTHORIZED"
| "ACTION_FAILED"
| "LIMIT_EXCEEDED"
| "ABORTED";
message: string;
path?: string;
actionIndex?: number;
actionType?: string;
};Normal validation failures return { ok: false, errors }. Programmer errors, such as invalid runtime definitions, may throw.
v1 scope
Included:
- define allowlisted actions
- tiny schema builder
- capability search
- plan validation
- approval-friendly previews
- sequential execution
- simple output references
- MCP-compatible tool definitions
- MCP-compatible tool-call dispatcher
- hooks
- simple limits
Excluded:
- generated code execution
- sandboxing
- LLM calls
- MCP SDK dependency
- full JSON Schema support
- OpenAPI import
- generated UI
- hosted approval flow
- persistence
- queueing
- retries
- transactions
- rollback
- branching and loops
- parallel execution
Development
npm install
npm run check
# Or run individual checks:
npm run typecheck
npm test
npm run coverage
npm run build
npm run smokeCoverage is configured with Vitest and V8. The goal is 100% coverage for the public v1 surface.
License
MIT
