@agent-compose/sdk
v0.5.7
Client library for agent-compose — define agents, runtimes, and workflows, and invoke them against an agent-compose server.
TypeScript SDK for agent-compose. Use it to:
- Author workflows that run agentic LLM loops inside isolated sandboxes
- Define runtimes that wrap a coding-CLI tool (Claude Code, OpenAI Desktop, …) into a sandbox-portable agent loop
- Register, invoke, observe, and cancel workflows via the HTTP API (AgentComposeClient)
- Manage factories, secrets, API keys, and snapshots programmatically
The hierarchy: a team owns one or more factories (project containers); each factory owns workflow templates, secrets, and runs. Workflows are versioned per (factory, name, version). New code that doesn't care about factories transparently lands in default — every team has one.
Installation
npm install @agent-compose/sdk
# peer dep:
npm install zod
Authoring a workflow
A workflow is async (ctx, sandbox) => T. It takes two positional arguments:
- ctx carries the run identity (run.id), the caller's input, and observability helpers (setMetadata, step).
- sandbox is a capability the engine constructs once for the run. Pass it to agent({ sandbox, ... }) and to any helper that takes a SandboxProvider (file writers, git utilities, command runners).
// my-workflow.ts
import { defineWorkflow, agent, claudeRuntime } from "@agent-compose/sdk";
import PROMPT from "./prompt.md" with { type: "text" };

export default defineWorkflow({
  async run(ctx, sandbox) {
    const repo = (ctx.input?.repo as string | undefined) ?? "owner/repo";
    const result = await agent({
      sandbox,
      runtime: claudeRuntime,
      prompt: `${PROMPT}\n\nRepository: ${repo}`,
      tools: ["Bash", "Read", "Edit", "Write", "Grep", "Glob"],
      budget: { turnsPerIteration: 40, maxIterations: 8 },
    });
    await ctx.setMetadata({ summary: result.status?.summary });
    return { ok: result.status?.completed ?? false };
  },
  // Optional: outbound network rules that the runner sandbox will enforce
  // (Vercel only; E2B ignores them). Use `$VAR` placeholders for secrets that
  // get resolved from the per-workflow secret store at dispatch time.
  networkPolicy: {
    allow: {
      "*": [],
      "api.anthropic.com": [{ transform: [{ headers: { "x-api-key": "$ANTHROPIC_API_KEY" } }] }],
    },
  },
});
defineWorkflow is thin sugar: it returns the bare run function with
networkPolicy / placeholders / snapshots
attached as metadata that the bundler picks up at registration time. A
plain export default async (ctx, sandbox) => {...} is also valid; you
just lose the metadata channel.
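The comment in the networkPolicy example says `$VAR` placeholders are resolved from the per-workflow secret store at dispatch time. As a rough illustration of what that kind of substitution involves, here is a minimal sketch. The helper name `resolvePlaceholders`, the placeholder grammar (`$` followed by an upper-case identifier), and the choice to throw on an unknown name are all assumptions made for this example, not the server's actual behaviour.

```typescript
// Hypothetical helper, not part of @agent-compose/sdk: replaces "$NAME"
// tokens in header values with entries from a secrets map.
type HeaderMap = Record<string, string>;

function resolvePlaceholders(headers: HeaderMap, secrets: Record<string, string>): HeaderMap {
  const resolved: HeaderMap = {};
  for (const [headerName, value] of Object.entries(headers)) {
    resolved[headerName] = value.replace(
      /\$([A-Z_][A-Z0-9_]*)/g,
      (_match: string, key: string): string => {
        const secret = secrets[key];
        // Assumption: an unknown placeholder is an error rather than left as-is.
        if (secret === undefined) throw new Error(`No secret named ${key}`);
        return secret;
      },
    );
  }
  return resolved;
}

const resolvedHeaders = resolvePlaceholders(
  { "x-api-key": "$ANTHROPIC_API_KEY" },
  { ANTHROPIC_API_KEY: "sk-example" },
);
console.log(resolvedHeaders["x-api-key"]); // prints "sk-example"
```

The point of the placeholder form is that the workflow source only ever contains the name of the secret, never its value.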
What the workflow can do with ctx
interface WorkflowCtx {
  run: { id: string };
  input?: Record<string, unknown>;
  setMetadata: (data: Record<string, unknown>) => Promise<void>;
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
}
step("phase-name", () => …) wraps a phase for the run timeline and emits
step_started / step_completed / step_failed lifecycle events with
duration. Use it for setup, external API calls, or anything you want
visible on the dashboard's run detail page.
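To make the step lifecycle concrete, the sketch below shows one way a helper with the same signature could emit the three events named above. It is a stand-in written for this README, not the engine's implementation: `makeStep`, the `StepEvent` shape, and the `durationMs` field name are assumptions.

```typescript
// Illustrative only: the event shape and field names are assumptions.
type StepEvent = {
  type: "step_started" | "step_completed" | "step_failed";
  step: string;
  durationMs?: number;
};

function makeStep(emit: (e: StepEvent) => void) {
  return async function step<T>(stepName: string, fn: () => Promise<T>): Promise<T> {
    const startedAt = Date.now();
    emit({ type: "step_started", step: stepName });
    try {
      const value = await fn();
      emit({ type: "step_completed", step: stepName, durationMs: Date.now() - startedAt });
      return value;
    } catch (err) {
      emit({ type: "step_failed", step: stepName, durationMs: Date.now() - startedAt });
      throw err; // re-throw so the caller still sees the failure
    }
  };
}

const recorded: StepEvent[] = [];
const step = makeStep((e) => {
  recorded.push(e);
});

const stepDone = step("clone-repo", async () => "done");
stepDone.then((value) => {
  console.log(value, recorded.map((e) => e.type));
});
```

Because the wrapper re-throws, wrapping a phase in step should not change how errors propagate through the workflow; it only adds timeline entries.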
The agent loop
agent({
  sandbox,          // the workflow's sandbox arg
  runtime,          // claudeRuntime, or your own via createClaudeRuntime / defineRuntime
  prompt,           // raw markdown; `--- frontmatter ---` is auto-stripped
  tools?,           // model tool allowlist (defaults inside agentLoop)
  budget?,          // { turnsPerIteration, maxIterations }
  workingDir?,      // every shell command runs here
  responseSchema?,  // zod schema; when set, the loop demands a `<response>` block on exit
  onAgentEvent?,    // per-message hook (e.g. wire to telemetry)
  onIteration?,     // per-iteration hook with parsed `<status>` block
})
// → AgentLoopResult { status?, response? (when responseSchema set), iterations, … }
The protocol is simple: the model emits XML-tagged blocks (<status> /
<response>) that the loop parses. See sdk/src/agent/protocol-suffix.md for
the full instructions appended to every prompt.
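For intuition about the tagged-block protocol, here is a small sketch of pulling the inner text of a block out of model output. The SDK exports its own parsers (parseAgentStatus / parseAgentResponse), whose internals are not shown in this README; `extractTaggedBlock` is a hypothetical helper, and taking the last occurrence of a tag is an assumption. The payload format inside the block is not specified here, so the sketch returns the raw text.

```typescript
// Hypothetical helper for illustration; not the SDK's parser.
function extractTaggedBlock(text: string, tag: string): string | undefined {
  const openTag = `<${tag}>`;
  const closeTag = `</${tag}>`;
  // Assumption: if the model emitted several blocks, the last one wins.
  const start = text.lastIndexOf(openTag);
  if (start === -1) return undefined;
  const end = text.indexOf(closeTag, start + openTag.length);
  if (end === -1) return undefined; // unterminated block: treat as absent
  return text.slice(start + openTag.length, end).trim();
}

const modelOutput = "All tests pass.\n<status>\ncompleted\n</status>\n";
console.log(extractTaggedBlock(modelOutput, "status")); // prints "completed"
console.log(extractTaggedBlock(modelOutput, "response")); // prints undefined
```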
Defining a runtime
A "runtime" wraps an agent's underlying execution model — usually a coding
CLI like Claude Code or OpenAI Desktop — so agent can drive it. The SDK
ships built-ins; you only need a custom one for an exotic provider.
Built-in runtimes
import {
  createClaudeRuntime, // factory, takes config
  claudeRuntime,       // pre-built default (DEFAULT_CLAUDE_MODEL)
  ClaudeRunner,        // class, if you need to override
} from "@agent-compose/sdk";
// openAIDesktopRuntime is NOT in the package root (it pulls in `sharp` for
// screenshot capture; the native binding can't be cross-compiled). Import
// directly when you actually want the desktop runtime:
import openAIDesktopRuntime from "@agent-compose/sdk/runtimes/openai-desktop.js";
Custom runtime
import { defineRuntime, type AgentRuntime } from "@agent-compose/sdk";

const myRuntime: AgentRuntime = defineRuntime({
  create: (sandbox, opts) => {
    // Return a ModelExecutionContract; see sdk/src/types/runtime.ts
    return {
      sendMessage({ prompt, sessionId, signal }) {
        // Async generator that yields AgentMessage chunks the loop parses.
        return /* … */;
      },
    };
  },
});
AgentRuntime is a tagged record with create(sandbox, RuntimeOptions) →
ModelExecutionContract. There is no provider field on it — the
runtime is bound to the workflow at author time (you pass it to agent),
not selected by the server.
Registering a workflow
The agentc CLI handles the bundling-and-registration step for you:
agentc register my-workflow.ts -n my-workflow
Under the hood, that calls bundleWorkflow(workflowPath) (which resolves imports
and inlines runtime sources via dynamic-require traversal) and then POSTs the
bundled source to /api/v1/factories/<slug>/templates. If you need
to drive registration from your own build pipeline, you can do the same
thing via the SDK directly:
import { AgentComposeClient, bundleWorkflow } from "@agent-compose/sdk";

const client = new AgentComposeClient(
  "https://your-server.example.com",
  process.env.AGENT_COMPOSE_API_KEY!,
);
const bundled = await bundleWorkflow("./my-workflow.ts");
await client.register({
  name: "my-workflow",
  source: bundled.source,
  runtimes: bundled.runtimes, // [{ name, source }], embedded so the runner has them locally
  schedule: "*/30 * * * *",   // optional cron
  factorySlug: "default",     // optional; defaults to "default"
  // snapshots, networkPolicy, placeholders: all optional
});
register() requires the caller's API key to carry the admin scope
(or full team access for legacy keys without scopes).
Invoking a workflow
Two flavours:
// Fire-and-forget: returns the run id immediately.
const { id } = await client.invoke("my-workflow", {
  repo: "owner/repo",
});
// Block until the run settles (default 30min timeout, 1s poll).
const status = await client.invokeAndWait("my-workflow", { repo: "owner/repo" }, {
  timeoutMs: 5 * 60_000,
  pollIntervalMs: 2000,
});
console.log(status.status); // "success" | "failed" | "abandoned" | "canceled"
console.log(status.output); // workflow's return value
output is the workflow's run() return value (whatever defineWorkflow({
async run() { return … } }) resolves to). setMetadata() writes to a
separate metadata field — useful for "side-channel" facts (PR url, plan
url) without polluting the structured return.
invoke and invokeAndWait both accept { factorySlug, snapshots,
networkPolicy, placeholders, parentRunId } as the
third argument. Per-invocation snapshots merge field-by-field with
the registered default. factorySlug defaults to "default".
Auto parent/child tracing
The SDK detects process.env.RUN_ID (set by the runner sandbox on every
dispatch) and automatically threads it as parentRunId on subsequent
invoke() calls. Workflows that fan out to other workflows get a
parent/child tree in the dashboard for free. Pass parentRunId: null
to opt out.
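The three cases described above (omitted, explicit id, explicit null) can be sketched as a small resolution function. This is an illustration of the documented rule rather than the SDK's source; `resolveParentRunId` is a hypothetical name, and the assumption that an explicit id takes precedence over RUN_ID is ours.

```typescript
// Sketch of the documented rule, not the SDK's source:
//   explicit null   -> opt out, no parent
//   explicit string -> use it (assumed to win over the environment)
//   omitted         -> fall back to RUN_ID from the environment
function resolveParentRunId(
  explicit: string | null | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  if (explicit === null) return undefined;
  if (explicit !== undefined) return explicit;
  return env["RUN_ID"];
}

const sandboxEnv = { RUN_ID: "run_parent" };
console.log(resolveParentRunId(undefined, sandboxEnv)); // prints "run_parent"
console.log(resolveParentRunId(null, sandboxEnv)); // prints undefined
console.log(resolveParentRunId("run_other", sandboxEnv)); // prints "run_other"
```

Distinguishing null from undefined is what lets "not specified" and "explicitly none" behave differently.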
Cancelling a run
await client.cancelRun(runId);
cancelRun is idempotent: cancelling an already-terminal run returns the current state
without throwing. The server stamps the run as canceled, kills any live
sandboxes, and emits a run_canceled event on the stream.
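The idempotency rule can be pictured as a state transition that leaves terminal states untouched. The sketch below is a toy model, not server code: the terminal statuses are the ones listed for invokeAndWait above, while "running" is a hypothetical name for a non-terminal state and `cancelSketch` is a made-up function.

```typescript
// "running" is a placeholder for any non-terminal state (an assumption).
type RunState = "running" | "success" | "failed" | "abandoned" | "canceled";

const TERMINAL_STATES: ReadonlySet<RunState> = new Set<RunState>([
  "success",
  "failed",
  "abandoned",
  "canceled",
]);

function cancelSketch(current: RunState): RunState {
  // Already settled: report the existing state instead of failing.
  if (TERMINAL_STATES.has(current)) return current;
  return "canceled";
}

console.log(cancelSketch("running")); // prints "canceled"
console.log(cancelSketch("success")); // prints "success"
```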
Streaming live logs
streamRunLogs returns an async generator of RunEvents in real time,
re-attaching via SSE under the hood. Pass lastEventId (the highest
seq you've already processed) to resume after a reconnect.
for await (const ev of client.streamRunLogs(runId, { lastEventId: 0 })) {
  console.log(ev.event, ev.seq, ev.data);
  if (ev.event === "run_complete" || ev.event === "run_failed" || ev.event === "run_canceled") {
    break;
  }
}
AbortSignal works too: pass { signal } and call controller.abort()
to tear the stream down from the caller side.
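The resume behaviour is easiest to see with a toy event log. The sketch below simulates a dropped connection and a reconnect that passes the highest seq already processed. It does not call streamRunLogs; `eventsAfter` is a stand-in for the server, and the assumption is that resuming yields only events with a seq greater than lastEventId.

```typescript
type SketchEvent = { seq: number; event: string };

// Stand-in for the server: returns events with seq greater than lastEventId.
function eventsAfter(log: SketchEvent[], lastEventId: number): SketchEvent[] {
  return log.filter((e) => e.seq > lastEventId);
}

const serverLog: SketchEvent[] = [
  { seq: 1, event: "step_started" },
  { seq: 2, event: "step_completed" },
  { seq: 3, event: "run_complete" },
];

let lastSeen = 0;
const processed: number[] = [];

// First connection: pretend it drops after delivering a single event.
for (const ev of eventsAfter(serverLog, lastSeen).slice(0, 1)) {
  processed.push(ev.seq);
  lastSeen = Math.max(lastSeen, ev.seq);
}

// Reconnect, resuming from the highest seq already handled.
for (const ev of eventsAfter(serverLog, lastSeen)) {
  processed.push(ev.seq);
  lastSeen = Math.max(lastSeen, ev.seq);
}

console.log(processed.join(",")); // prints "1,2,3"
```

With the real client, the same bookkeeping would amount to remembering ev.seq inside the for await loop and passing it as lastEventId on the next call.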
Factories
Factories are project containers within a team. Each factory has its own
workflow templates, secrets, runs, and (optionally) scoped API keys. New
projects don't need to think about them — default is auto-created per
team and is what the SDK falls back to when factorySlug is omitted.
// CRUD on factories
await client.createFactory({ slug: "ci-bots", name: "CI Bots", description: "…" });
const factories = await client.listFactories();
const f = await client.getFactory("ci-bots");
await client.updateFactory("ci-bots", { name: "Continuous-Integration Bots" });
await client.deleteFactory("ci-bots");

// Templates list: flat across factories, or scoped to one
const all = await client.listTemplates();
const scoped = await client.listTemplates({ factorySlug: "ci-bots" });

// Register / invoke / secret operations all accept factorySlug
await client.register({ name: "scrape", source, factorySlug: "ci-bots", … });
await client.invoke("scrape", { url: "…" }, { factorySlug: "ci-bots" });
await client.setSecret("scrape", "GH_TOKEN", "ghp_…", { factorySlug: "ci-bots" });
CLI equivalents: agentc factory list | create | get | update | delete,
plus --factory <slug> on every other command.
Per-workflow secrets
Secrets live in GCP Secret Manager, one row per (factory, workflow, key).
They're injected as env vars into the runner sandbox at dispatch time,
never persisted in the VM. Values are write-only — the API only returns
metadata (key, timestamps).
await client.setSecret("my-workflow", "ANTHROPIC_API_KEY", process.env.ANTHROPIC_API_KEY!);
const list = await client.listSecrets("my-workflow"); // [{ key, createdAt, updatedAt }]
await client.deleteSecret("my-workflow", "STALE_KEY");

// Scope to a non-default factory:
await client.setSecret("scrape", "GH_TOKEN", "ghp_…", { factorySlug: "ci-bots" });
Mutations require admin scope.
API keys
Mint and list scoped keys programmatically (requires an admin-scoped
caller key). New keys are returned once, in the same response as the
metadata — copy the ac_… value immediately.
const created = await client.createApiKey({
  name: "ci-dispatcher",
  scopes: ["read", "invoke"],
  expiresAt: new Date(Date.now() + 30 * 86_400_000).toISOString(), // 30 days
  // factorySlug: "ci-bots" // optional; scopes the key to a single factory
});
console.log(created.key); // "ac_…", the only time you'll see this
const all = await client.listApiKeys();
CLI equivalent: agentc keys create <name> --scopes read,invoke
--expires-in 30d.
Usage
const usage = await client.getUsage(
  new Date(Date.now() - 30 * 86_400_000),
  new Date(),
);
// usage.rows: [{ day, runs, sandbox_seconds, … }]
CLI equivalent: agentc usage.
Snapshots (replay-friendly sandboxes)
Long-running workflows can capture the runner sandbox as a Vercel snapshot
on success (snapshots: { saveLatest: true }). Other workflows reference
that snapshot via snapshots.bootFrom to boot into the same prepared VM
(deps installed, repo cloned, etc.) instead of repeating setup.
// Capture per-invocation:
await client.invoke("my-workflow", input, { snapshots: { saveLatest: true } });

// Boot from a captured snapshot; pick the id from `agentc snapshot
// list` or the dashboard snapshots page:
await client.invoke("my-workflow", input, {
  snapshots: { bootFrom: { snapshotId: "snap_…" } },
});

// Set a default at registration time:
defineWorkflow({ run, snapshots: { saveLatest: true } });

// Retain every step's snapshot (not just the latest):
defineWorkflow({ run, snapshots: { saveLatest: true, retainSteps: true } });

// Browse / clean up:
const page = await client.listSnapshotsPage({ workflow: "my-workflow", limit: 50 });
const snaps = page.data;
await client.deleteRunSnapshot(snaps[0].runId, snaps[0].snapshotId);
CLI equivalents: agentc snapshot list / agentc snapshot delete <run-id> <snapshot-id>.
Per-invocation overrides merge field-by-field
The snapshots object on invoke() is merged with the registered template's snapshots config — you can override bootFrom alone without losing saveLatest, or vice versa.
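A field-by-field merge of this kind behaves like a shallow object spread, as the sketch below shows. The `SnapshotConfig` type only lists the fields mentioned in this README, and `mergeSnapshots` is a hypothetical helper; the server's actual merge may handle more fields or edge cases.

```typescript
// Fields taken from the examples above; the real type may have more.
type SnapshotConfig = {
  saveLatest?: boolean;
  retainSteps?: boolean;
  bootFrom?: { snapshotId: string };
};

function mergeSnapshots(registered: SnapshotConfig, override: SnapshotConfig = {}): SnapshotConfig {
  // Shallow merge: fields present in the override win, the rest are kept.
  return { ...registered, ...override };
}

const registeredDefaults: SnapshotConfig = { saveLatest: true };
const effective = mergeSnapshots(registeredDefaults, {
  bootFrom: { snapshotId: "snap_example" }, // placeholder id for the sketch
});

console.log(effective.saveLatest); // prints true
console.log(effective.bootFrom?.snapshotId); // prints "snap_example"
```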
Authentication
The SDK accepts a Bearer API key (ac_…). Mint one from the dashboard:
sign in at <server-url>/login, then Settings → API Keys → Create key.
Default scopes (read + invoke) are right for a CI / dispatch caller. Tick
admin only if this key needs to register templates, mint other keys, or
manage secrets.
const client = new AgentComposeClient(
  process.env.AGENT_COMPOSE_URL!,
  process.env.AGENT_COMPOSE_API_KEY!,
);
The dashboard itself uses the cookie-bound session path; the SDK is for programmatic / server-to-server callers.
Public exports — quick reference
| Export | What |
|---|---|
| defineWorkflow | Attach metadata to a workflow run function |
| defineRuntime | Wrap an agent execution provider as an AgentRuntime |
| defineSandboxEnvironment | Sugar for declaring a workflow whose primary purpose is to build a snapshot for others to boot from |
| agent / agentLoop | Embed an LLM loop inside a workflow |
| runWorkflow | Local engine for running a workflow in-process (test harness) |
| bundleWorkflow | Resolve + inline a workflow's runtime sources for registration |
| claudeRuntime / createClaudeRuntime / ClaudeRunner | Built-in Claude Code runtime + factory |
| AgentComposeClient | HTTP client — register, invoke, cancel, stream logs, factories, snapshots, secrets, API keys, usage |
| AgentComposeError | Thrown by every non-2xx HTTP response |
| parseAgentStatus / parseAgentResponse / AgentStatusSchema / AgentMessageSchema | Protocol parsers |
| parseSseStream | Generic SSE chunk decoder (used by streamRunLogs) |
| createSandbox / reconnectSandbox / killAllSandboxes / killSandboxById / getSandboxQuotas / listOwnedSandboxes / deleteSandboxSnapshot | Sandbox-provider helpers (Vercel + E2B) |
Type exports: WorkflowFn, WorkflowCtx, WorkflowDefinition,
WorkflowHooks, AgentBudget, AgentRuntime, RuntimeOptions,
ModelExecutionContract, McpServerConfig, AgentMessage (and its
variants), AgentStatus, RunStatus, RegisterResult, RunEvent,
FactoryRow, SnapshotListEntry, ApiKey, ApiKeyCreated,
UsageRollupRow, UsageResponse, CancelRunResponse, AgentLoopResult,
AgentOpts, SandboxProvider, DesktopSandboxProvider,
SandboxNetworkPolicy, SandboxCreateOpts, OwnedSandbox,
BundledWorkflow.
For the canonical signatures, follow your IDE's go-to-definition into
@agent-compose/sdk — sdk/src/index.ts is the public surface and the
files it re-exports from carry full inline docstrings.
Errors
All non-2xx HTTP responses throw AgentComposeError(status, message). The
message is the server's { error: string } body when present, falling
back to the HTTP status text:
import { AgentComposeError } from "@agent-compose/sdk";
try {
  await client.invoke("missing-workflow");
} catch (err) {
  if (err instanceof AgentComposeError && err.status === 404) {
    // template not registered
  }
}
invokeAndWait throws AgentComposeError(504, …) on timeout for symmetry.
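The message fallback described in this section (use the body's error string when present, otherwise the HTTP status text) can be sketched as follows. `errorMessageFrom` is a hypothetical helper written for illustration; the client's own handling of non-JSON or malformed bodies is not documented here, so treating them as "no error field" is an assumption.

```typescript
// Hypothetical helper: picks a message the way the section above describes.
function errorMessageFrom(bodyText: string, statusText: string): string {
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (typeof parsed === "object" && parsed !== null && "error" in parsed) {
      const candidate = (parsed as { error: unknown }).error;
      if (typeof candidate === "string") return candidate;
    }
  } catch (_err) {
    // Body was not JSON; fall through to the status text.
  }
  return statusText;
}

console.log(errorMessageFrom('{"error":"template not found"}', "Not Found")); // prints "template not found"
console.log(errorMessageFrom("<html>oops</html>", "Bad Gateway")); // prints "Bad Gateway"
```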
