@team-monet/aart
v0.8.0
Agentic Automation RunTime — a governed block/workflow runtime for AI agents
aart — Agentic Automation RunTime
AI creates reusable automation blocks and workflows. Users govern them. A runtime executes them deterministically. Reports prove it.
aart is a workflow & automation framework for AI agents: it turns an
agent's one-off work into reusable, user-approved, evidence-producing
automations — web tasks, API workflows, data pulls, recurring checks, and
(as one flagship use case) QA and release verification. Your agent (Claude
Code, Codex, …) authors a workflow; aart validates it, you approve it once in
chat, and every run leaves a structured report (per-step trace, pass/fail,
screenshots). aart never calls an LLM — the agent does the authoring; aart is
the governed runtime.
Ships with a core pack (browser, HTTP, assertions) — and agents can extend
it with their own blocks: npm-powered node blocks and workspace packs. Drive
it from your agent over MCP, or from the CLI.
Install
npm i -g @team-monet/aart # installs the `aart` command
aart doctor # checks Node + sandbox + browser, with fix hints
# for the browser blocks (browser.*), download Chromium + its system libs:
npx playwright install --with-deps chromium
- Node ≥ 20 is required, both at runtime and to build from source. The node-block sandbox (isolated-vm) and its tests need Node ≥ 22; on Node 20 the sandbox is absent and custom node blocks are unavailable.
- The node-block sandbox (isolated-vm) is an optional, prebuilt native addon: install needs no compiler on macOS (Apple Silicon), Linux x64/arm64 (including WSL2), or Windows x64. If it can't install, aart still works; only custom node blocks need it (npm i isolated-vm).
Verify it's working:
aart --help
aart list # shows the built-in core blocks (browser.*, http.*, data.*, file.*, flow.*, assert.*)
Use it with your coding agent (recommended)
Point your agent's MCP config at aart (no global install needed for this path):
{ "command": "npx", "args": ["-y", "@team-monet/aart", "mcp"] }
No configuration is required to start: with nothing set, aart uses a per-user
default workspace (~/.aart), created on first run. To scope a project's blocks,
runs, and approvals to that project (its own registry/approvals/secrets),
point it at the project directory:
{ "command": "npx", "args": ["-y", "@team-monet/aart", "mcp"],
"env": { "AART_WORKSPACE": "/path/to/your/project" } }Then just ask, e.g. "using aart, check my app at http://localhost:3000 shows the dashboard." The agent discovers the blocks, drafts a workflow, shows you what it does and asks you to approve it in chat, runs it, and shows you the report.
Prime your agent (recommended)
So the agent reaches for aart before you have to ask, paste this into your
project's AGENTS.md / CLAUDE.md / agent rules:
You have aart available via MCP — a workflow/automation framework, not just a
test runner. When a task is worth running again, should be trustworthy, or
should become a durable asset — author an aart workflow instead of a one-off
shell command. aart saves it (named/versioned), the user approves it once, and
every run leaves an evidence report (per-step trace, pass/fail, screenshots).
Good fits: recurring web/API automations, data pulls, report generation,
smoke/health checks, release verification. Use shell for one-off probes; a
durable automation that needs a library can declare npm `dependencies`
(approval-gated). Loop: aa_list_blocks → draft → aa_validate →
aa_register_block → show the user & ask → aa_approve → aa_run_workflow.
What you can build
Compose workflows from the built-in primitives — no custom code needed for most web/HTTP work:
- Web automations (browser.*): navigate, fill, click, read the rendered page as data (extract_text, html), query the live DOM (eval), screenshot.
- API workflows & integrations: http.request (any method, headers, auth via {{secrets.X}}) lets you chain calls and branch on responses with if/then/else.
- Data, files & reports: data.parse / data.stringify (JSON/YAML/CSV), workspace-scoped file.read / file.write, http.download, and artifact.write to attach generated reports to the run's evidence. flow.sleep + step jumps give you polling loops; flow.fail gives branches a clean exit.
- Governed host commands: a command block wraps a CLI you'd otherwise shell out to (git, gh, kubectl, builds). You approve the exact command shape (fixed binary + argv template, spawned without a shell); inputs fill argv slots only, and every execution lands in run history, so shell work becomes an audit trail instead of scrollback.
- Custom logic: a sandboxed node block parses/transforms data between steps. Declare dependencies (npm packages / node: built-ins) and it runs as real Node.js with those libraries instead (approval-gated).
- QA & release verification: the same primitives plus assert.* and evidence reports make smoke tests, acceptance checks, and post-deploy verification a flagship use case (re-run on every deploy with proof).
- Your own native blocks: agents can author workspace packs (.aa/packs/<name>), which are families of trusted blocks plus shared capabilities (long-lived sessions, clients). Registering a pack records a content hash and never executes it; it loads only after you approve, and any edit breaks the seal until re-approved.
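To illustrate why "inputs fill argv slots only" matters for governed host commands, here is a minimal sketch. The `CommandShape` type, the `{{name}}` slot syntax, and the `fillArgv` helper are all assumptions invented for this example; aart's real block schema may look quite different.

```typescript
// Hypothetical shape of an approved command: a fixed binary plus an argv template.
interface CommandShape {
  binary: string;
  argvTemplate: string[];
}

// Fill each {{name}} slot with the matching input. Every template entry stays exactly
// one argv element, so an input cannot add arguments or change the binary.
function fillArgv(shape: CommandShape, inputs: Record<string, string | undefined>): string[] {
  return shape.argvTemplate.map((part) =>
    part.replace(/\{\{(\w+)\}\}/g, (_match: string, slot: string) => {
      const value = inputs[slot];
      if (value === undefined) {
        throw new Error(`missing input: ${slot}`);
      }
      return value;
    })
  );
}

const approvedShape: CommandShape = {
  binary: "git",
  argvTemplate: ["log", "-n", "{{count}}", "{{branch}}"],
};

// The hostile-looking input stays a single argument; nothing interprets the ";".
const filledArgv = fillArgv(approvedShape, { count: "5", branch: "main; rm -rf /" });
console.log(approvedShape.binary, filledArgv);
```

In Node.js, handing a binary and an argv array like this to `child_process.spawn` (whose `shell` option defaults to `false`) runs the binary directly, so shell metacharacters in an input are passed through as literal text rather than executed.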
(Until 0.4 the core blocks were namespaced qa.* — those ids resolve forever,
so existing workflows keep running.)
Custom code is sandboxed by default (a pure-compute V8 isolate). Everything more powerful is approval-gated, not impossible: dependency-bearing blocks and workspace packs run unsandboxed — the approval summary says so in plain words, and nothing runs until you agree. aart never runs ad-hoc shell commands.
See what ran
aart dashboard serves a local, read-only web UI (127.0.0.1 only): the
block catalog with approval status, run history with per-step traces and
inputs/outputs, inline screenshots and artifacts, and workspace-pack status.
Approval and registration stay in the governed chat/CLI flows — the dashboard
only shows evidence.
Governance (the approval gate)
Every registration lands as draft and can't run until approved. Approval is
conversational: the agent shows you what a workflow does and asks; when you
say yes it records the approval. You never have to touch a terminal.
- A draft can't run until approved (referenced blocks must be approved too). Editing + re-registering resets it to draft, so the agent asks again.
- Prefer a stricter, out-of-band gate? Set AART_STRICT_APPROVAL=1; then only the CLI aart approve <id> (run by you) can approve. Review any time with aart show <id>.
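The gate rules above can be modelled as a small state machine. This is only a sketch under stated assumptions: the `BlockRegistry` class, its method names, and the ids `smoke` and `http.check` are hypothetical, and the model only gates running (whether aart also restricts approving a workflow whose blocks are still drafts is not stated here).

```typescript
// Hypothetical in-memory model of the approval gate; names are illustrative only.
type ApprovalStatus = "draft" | "approved";

interface Definition {
  id: string;
  status: ApprovalStatus;
  uses: string[]; // ids of blocks this definition references
}

class BlockRegistry {
  private defs = new Map<string, Definition>();

  // Registering (or re-registering after an edit) always lands as draft.
  register(id: string, uses: string[] = []): void {
    this.defs.set(id, { id, status: "draft", uses });
  }

  approve(id: string): void {
    const def = this.defs.get(id);
    if (def === undefined) throw new Error(`unknown definition: ${id}`);
    def.status = "approved";
  }

  // Runnable only if the definition and everything it references are approved.
  canRun(id: string, seen: Set<string> = new Set()): boolean {
    if (seen.has(id)) return true; // already checked on this walk; avoids cycles
    seen.add(id);
    const def = this.defs.get(id);
    if (def === undefined || def.status !== "approved") return false;
    return def.uses.every((dep) => this.canRun(dep, seen));
  }
}

const registry = new BlockRegistry();
registry.register("http.check");
registry.register("smoke", ["http.check"]);
registry.approve("smoke");
console.log(registry.canRun("smoke")); // false: the referenced block is still a draft
registry.approve("http.check");
console.log(registry.canRun("smoke")); // true
registry.register("smoke", ["http.check"]); // edited and re-registered
console.log(registry.canRun("smoke")); // false again until re-approved
```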
Secrets & workspace
- Workspace: state lives under <workspace>/.aa (created on first run). Resolution is explicit, with no cwd magic, so the CLI, aart dashboard, and the MCP server always agree on where state lives: --workspace <dir> → $AART_WORKSPACE → the per-user default ~/.aart. To scope a project, set AART_WORKSPACE, and set it in both your MCP config and your shell so the agent, the CLI, and the dashboard all point at the same place (or pass --workspace). Otherwise everything uses ~/.aart.
- Secrets: reference credentials as {{secrets.NAME}}, sourced from AART_SECRET_<NAME> env vars or <workspace>/.aa/secrets.json. They're best-effort redacted from reports, so never put a real secret in an input. (Screenshot contents aren't scrubbed; mask secret fields via the screenshot block's mask option.)
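A rough sketch of how `{{secrets.NAME}}` references and best-effort redaction could fit together is shown below. Everything here is hypothetical: the helper names, the choice to check the `AART_SECRET_<NAME>` variable before the secrets file (the precedence is not specified above), and the `[REDACTED]` marker are assumptions, not aart's implementation.

```typescript
// Hypothetical sketch; env-before-file precedence is an assumption.
type SecretLookup = Record<string, string | undefined>;

function lookupSecret(secretName: string, env: SecretLookup, fileSecrets: SecretLookup): string {
  const value = env[`AART_SECRET_${secretName}`] ?? fileSecrets[secretName];
  if (value === undefined) throw new Error(`unknown secret: ${secretName}`);
  return value;
}

// Replace each {{secrets.NAME}} reference and remember which values were used.
function resolveSecrets(
  template: string,
  env: SecretLookup,
  fileSecrets: SecretLookup
): { resolved: string; usedValues: string[] } {
  const usedValues: string[] = [];
  const resolved = template.replace(/\{\{secrets\.(\w+)\}\}/g, (_m: string, secretName: string) => {
    const value = lookupSecret(secretName, env, fileSecrets);
    usedValues.push(value);
    return value;
  });
  return { resolved, usedValues };
}

// Best-effort redaction: mask literal occurrences of the resolved values.
function redact(report: string, secretValues: string[]): string {
  let out = report;
  for (const value of secretValues) {
    if (value !== "") out = out.split(value).join("[REDACTED]");
  }
  return out;
}

const demoEnv: SecretLookup = { AART_SECRET_API_TOKEN: "s3cr3t" };
const { resolved, usedValues } = resolveSecrets("Bearer {{secrets.API_TOKEN}}", demoEnv, {});
console.log(redact(`sent header: ${resolved}`, usedValues));
```

Literal matching of this kind only catches values that were resolved through a reference and appear unchanged; a secret typed directly into an input, or one that shows up encoded or inside a screenshot, would slip past it, which is consistent with the "best-effort" caveat above.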
CLI
The agent does everything via MCP, but the same is available as commands:
aart list [--json]                      list blocks (with approval status)
aart context                            the full authoring guide + catalog + schema
aart validate <file>                    validate a draft definition
aart block add <file>                   register a definition (lands as draft)
aart show / approve <id>                review / approve a definition
aart pack register|approve|list <name>  workspace packs (agent-authored native blocks)
aart run <id|file> [--yes]              run a workflow → report
aart report <runId>                     replay a past report
aart dashboard [--port 4400]            local read-only web UI: blocks, run history, artifacts
aart doctor                             check setup
aart mcp                                start the MCP server (stdio)
WSL2 / Linux notes
- aart runs its own headless Chromium Linux-side (not your Windows Chrome); no display needed. Use npx playwright install --with-deps chromium.
- Target URL: an app inside WSL2 is at http://localhost:PORT. An app on the Windows host isn't reachable via localhost under default WSL2 NAT; use the host IP or mirrored networking (.wslconfig → [wsl2] networkingMode=mirrored, Win11 22H2+). Remote URLs just work.
- Keep the project off /mnt/c/… (slow 9p I/O).
License
Apache-2.0
