@pi-orca/agents
v0.0.5
Subagent lifecycle via SDK/RPC/tmux
pi-orca-agents
Subagent lifecycle for the Pi coding agent — spawn, monitor, recover, and reap child agents across three isolation modes.
Version: 0.0.2-dev
Status: Phase 3 complete (347 tests passing)
License: MIT
Overview
pi-orca-agents manages the full lifecycle of Pi subagents: spawning, monitoring, stopping, and recovering child agents defined as markdown templates with YAML frontmatter. Agents may run in-process (SDK isolation), out-of-process via RPC (pi --mode rpc child), or in a tmux pane or window (tmux — pane by default, window opt-in). Each isolation mode trades off latency, crash isolation, and user attachability differently — pick the one that fits the workload.
The extension is independently useful — it ships a lightweight completion mechanism via per-agent status files — and gains richer async coordination when pi-orca-messages is also loaded. A worktree allocator gives any spawning workflow optional git-branch isolation, so two agents can edit the same repo without stepping on each other.
Spec: docs/spec-v0.3.0.md §5. Full implementation plan and decisions log: docs/phase-3-plan.md. End-user walkthrough: docs/agents-guide.md.
Quick Start
# 1. Install (monorepo or from npm)
npm install @pi-orca/agents
# 2. Load as a Pi extension (example settings.json fragment)
{
"extensions": ["./node_modules/@pi-orca/agents/dist/index.js"]
}
# 3. In a running Pi session, spawn a scout
/agents spawn scout "Find every call site of authenticateUser()"
# 4. Watch the widget (Alt+A)
🤖 Agents
◉ scout-1 sdk fast $0.0021 123 / 856 tokens
On first run, default templates (scout / planner / worker / reviewer) are bootstrapped to ~/.pi/agent/orca/agents/. The LLM can also call the agent tool directly; it exposes the spawn, list, stop, recover, status, attach, and prompt actions that the slash commands provide.
Concepts
- Templates. Markdown files with YAML frontmatter (<name>.md) define an agent's model, tools, restrictions, isolation mode, lifecycle, and system prompt. User-scope (~/.pi/agent/orca/agents/) is overridden by project-scope (<project>/.pi/orca/agents/).
- Status files. Each spawned agent writes its own <parent-session>.orca/agents/<agentId>.yaml. There is one writer per file: the parent never modifies a child's status. Globbing this directory gives /agents list and the widget aggregate state without shared mutable memory.
- Isolation modes. sdk (in-process), process (RPC child), tmux (tmux pane by default, window opt-in). Fixed at spawn time. See the decision matrix below.
- Persistent agents. Templates may set lifecycle: persistent (default one-shot) so the child stays alive between prompts; the parent drives the next turn with /agents prompt|tell|ask <id> <task> and the status YAML flips running → idle → running → … per cycle. Works across all three isolation modes. See spec §5.13.
- Parent liveness protocol. Three layers: shutdown guard (/quit with running children), child self-rescue (ancestor-walking PID check), and recovery on parent restart (/agents recover). See spec §5.1–5.3.
- Worktree isolation. Templates may opt into useWorktree: true so the agent runs in a fresh git worktree under <repo>/.pi/worktrees/<parentSessionId>/<agentId>/ on branch orca/<parentSessionId>/<agentId>. Cleanup is gated by a W9 uncommitted-work safeguard.
- Notify channel. Best-effort Unix-socket pings short-circuit poll-loop latency for parent→child control and child→parent heartbeat/completion events. Filesystem state is always authoritative: sockets are a hint, never a correctness dependency.
Default Templates
Bootstrapped to ~/.pi/agent/orca/agents/<name>.md on first run if absent. Edit freely or shadow at project scope.
| Template | Model alias | Thinking | Context | Tools | Isolation | Use case |
|---|---|---|---|---|---|---|
| scout | fast | minimal | fresh | read, grep, find, ls, bash | sdk | Read-only investigation and discovery |
| planner | thinker | medium | fresh | read, write, bash, ls | sdk | Decompose work into dependency DAGs |
| worker | balanced | low | fresh | read, write, edit, bash, ls | sdk | Implement code per assigned task |
| reviewer | balanced | medium | fresh | read, ls, grep | sdk | Read-only code review |
All four use completionNotify: parent, restrictionsMode: override, and useWorktree: false. Override any of these per template at user or project scope.
Model resolution
A template's model: field accepts any of:
- A pi-orca-models alias name (fast, balanced, thinker, coder, or a custom alias): resolves when @pi-orca/models is loaded.
- The synthetic-provider-qualified form pi-orca-models/<alias>: same as a bare alias but unambiguous when an alias collides with a real model id.
- A bare model id like claude-haiku-4-5: resolves through the Pi SDK's ModelRegistry when exactly one of your configured providers offers it; ambiguous bare ids (multiple providers, same id) require qualification.
- A literal provider/model-id like anthropic/claude-haiku-4-5: passes through unchanged.
Resolution happens at spawn time. Mid-session changes via /models set or session-level overrides take effect on the next spawn — no restart required. When nothing resolves, the spawner errors with a message naming @pi-orca/models. See spec §8.6.1 for the cascade order.
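The resolution rules above can be sketched as a small function. This is a minimal illustration, not the package's resolver: resolveModel, AliasMap, and ProviderCatalog are hypothetical names, the catalog stands in for the Pi SDK's ModelRegistry, and trying aliases before bare model ids is an assumption (spec §8.6.1 defines the real cascade order).

```typescript
// Hypothetical shapes; the real alias store and ModelRegistry are not shown in this README.
type AliasMap = Record<string, string>;          // alias -> "provider/model-id"
type ProviderCatalog = Record<string, string[]>; // provider -> model ids it offers

const ALIAS_PROVIDER = "pi-orca-models";

function resolveModel(model: string, aliases: AliasMap, catalog: ProviderCatalog): string {
  const slash = model.indexOf("/");
  if (slash !== -1) {
    const provider = model.slice(0, slash);
    const id = model.slice(slash + 1);
    if (provider === ALIAS_PROVIDER) {
      // Qualified alias form: pi-orca-models/<alias>.
      const target = aliases[id];
      if (target === undefined) {
        throw new Error(`Unknown alias "${id}" (is @pi-orca/models loaded?)`);
      }
      return target;
    }
    // Literal provider/model-id passes through unchanged.
    return model;
  }

  // Assumption: a bare name is tried as an alias before it is tried as a model id.
  const aliased = aliases[model];
  if (aliased !== undefined) return aliased;

  // Bare model id: valid only when exactly one provider offers it.
  const providers = Object.keys(catalog).filter(
    (p) => (catalog[p] ?? []).indexOf(model) !== -1,
  );
  if (providers.length === 1) return `${providers[0]}/${model}`;
  if (providers.length > 1) {
    throw new Error(`Ambiguous model id "${model}": offered by ${providers.join(", ")}`);
  }
  throw new Error(`Cannot resolve model "${model}" (is @pi-orca/models loaded?)`);
}
```

Throwing on the ambiguous case mirrors the rule that bare ids offered by several providers require qualification.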
Isolation Modes — How to Choose
Each agent runs under exactly one isolation mode. The mode is fixed at spawn time and cannot be changed once the agent is running. There is no automatic fallback between modes — picking the right one up front matters.
| Aspect | SDK (in-process) | RPC (process) | tmux |
|---|---|---|---|
| Overhead | Lowest — shared process, no spawn cost | Medium — full Node bootstrap per child | Medium — Node bootstrap + tmux pane/window |
| Crash isolation | None — a child crash takes down the parent | Full — child crashes leave the parent untouched | Full — child crashes are contained to its tmux pane/window |
| Survives parent death | No — dies with the parent | No — RPC pipe closes on parent exit | Yes — the tmux pane/window keeps running after parent exits |
| Real-time events to parent | Direct — AgentSession.subscribe() callbacks | Direct — RpcClient.onEvent() over stdout pipe | Indirect — parent reads status files (+ optional notify-socket pings) |
| Cost visibility | Live via session.getSessionStats() in-process | Near-live via child's status-file heartbeat (also notify-pings) | Near-live via child's status-file heartbeat (also notify-pings) |
| User can attach | No live attach (in-process), but /agents attach surfaces tail -f <sessionPath> (live tail) and pi --session <sessionPath> (post-termination) hints | Limited — /agents attach resumes the saved session in a new TUI | Yes — /agents attach switches focus to the tmux pane (default) or window (opt-in via tmux.target: window / --tmux-target window) |
| Console log | Merged into parent stdout | Captured via RPC framing; viewable via the debug overlay | Persistent per-agent log at <orca-dir>/agents/<agentId>.log |
| Recoverable via /agents recover | Only if the parent is still alive; otherwise lost with parent | Yes — SessionManager.open(savedSessionPath) resumes from saved JSONL | Yes — same recovery path; the tmux pane/window may also still be running |
| External deps | None | None | Requires tmux on PATH (and for target: pane — the default — the parent pi must be running inside a tmux client; outside tmux the spawner transparently falls back to a new window) |
| Quoting/injection surface | None — direct function calls | None — task strings pass through RPC framing | Mitigated — task strings written to a tempfile, read by a wrapper script (never substituted into a shell command); agentId sanitized to [A-Za-z0-9_-]+ |
| Typical use case | Short scout/recon tasks where shared memory and lowest latency matter | Background workers, reviewers, anything that should survive the LLM's current turn but not the user's session | Long-running, user-attachable agents (e.g., a worker the user wants to peek at), or jobs that must outlive the parent session |
A practical rule of thumb: SDK for fast, ephemeral, parent-coupled work; RPC for isolated workers whose lifetime matches the user's session; tmux when the user needs to watch the agent live, or when the agent must outlive the parent.
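As a rough illustration, the rule of thumb reduces to a three-way decision. The Workload fields and the chooseIsolation helper below are hypothetical; they are not part of the extension, which never picks a mode automatically.

```typescript
type Isolation = "sdk" | "process" | "tmux";

// Hypothetical description of what a task needs from its agent.
interface Workload {
  userWantsLiveView: boolean;   // the user wants to watch or attach to the agent
  mustOutliveParent: boolean;   // the job must keep running after the parent exits
  needsCrashIsolation: boolean; // a child crash must not take down the parent
}

function chooseIsolation(w: Workload): Isolation {
  // tmux is the only mode that is attachable and survives parent death.
  if (w.userWantsLiveView || w.mustOutliveParent) return "tmux";
  // RPC gives full crash isolation, but the child dies with the parent.
  if (w.needsCrashIsolation) return "process";
  // Otherwise take the lowest-overhead, parent-coupled mode.
  return "sdk";
}
```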
Selecting Isolation at Spawn Time
The TUI user has five input surfaces for choosing isolation, in precedence order (later overrides earlier):
- Template default in YAML frontmatter: isolation: sdk (allowed values: sdk, process, tmux; default sdk). See spec §5.2.
- Inheritance from parent template: when a subagent spawns another subagent, isolation is inherited from the parent's resolved template unless the child template explicitly overrides it. See spec §5.3.
- Interactive /agents create wizard: when scaffolding a new template via /agents create <name>, the wizard prompts for isolation among other fields. The chosen value is written into the new template's frontmatter and becomes its default. See spec §5.11.
- Per-spawn tool-call parameter: programmatic spawning via the agent tool accepts isolation as a per-call override: agent({ action: "spawn", template, isolation }). See spec §5.10.
- Per-spawn slash command flag: /agents spawn <template> [task] --isolation <sdk|process|tmux> overrides every lower-precedence surface for that one spawn.
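The precedence above might be sketched as a single fallback chain. This is an illustration under assumptions: resolveIsolation and IsolationInputs are hypothetical names, and the tool-call parameter and the slash flag are collapsed into one per-spawn override because a given spawn arrives through only one of them.

```typescript
type Isolation = "sdk" | "process" | "tmux";

// Hypothetical inputs gathered by a spawner before launch.
interface IsolationInputs {
  templateIsolation?: Isolation; // explicit isolation: value in the child's frontmatter
  parentIsolation?: Isolation;   // parent's resolved mode, when a subagent spawns a subagent
  perSpawnOverride?: Isolation;  // tool-call isolation parameter or --isolation flag
}

function resolveIsolation(inputs: IsolationInputs): Isolation {
  return (
    inputs.perSpawnOverride ??   // a per-spawn override wins for this one spawn
    inputs.templateIsolation ??  // an explicit template value beats inheritance
    inputs.parentIsolation ??    // otherwise inherit from the parent's resolved template
    "sdk"                        // documented default
  );
}
```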
There is intentionally no global config override for isolation. Each agent template names its own preferred mode; per-spawn overrides handle the rest. Adding a global setting would couple unrelated agents (e.g., forcing a fast scout into tmux when the user only wanted a single worker isolated) and is rejected by design.
Environment inheritance. process and tmux children both inherit the parent pi process's full process.env so provider API keys (OPENROUTER_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.) reach the child without any extra setup. RPC gets this for free from RpcClient ({ ...process.env, ...env }). Tmux requires explicit forwarding via -e VAR=value arguments to both tmux new-window (window target) and tmux split-window (pane target, the default), because tmux otherwise spawns the new pane/window using the tmux server's environment — whatever shell first started the server — not the client's shell. The spawner builds those -e pairs via buildTmuxEnvArgs (engine/spawners/tmux.ts) on every spawn and reuses them across both invocations.
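The -e forwarding described above could look roughly like this. The sketch is not the source of buildTmuxEnvArgs: buildEnvArgs is a hypothetical stand-in, and sorting the keys is an assumption made only to keep the output deterministic.

```typescript
// Turn an environment map into repeated tmux "-e VAR=value" argument pairs.
function buildEnvArgs(env: Record<string, string | undefined>): string[] {
  const args: string[] = [];
  for (const key of Object.keys(env).sort()) {
    const value = env[key];
    if (value === undefined) continue; // unset variables are not forwarded
    args.push("-e", `${key}=${value}`);
  }
  return args;
}
```

A caller might splice the result into the argument array for tmux split-window or tmux new-window. Passing it as an array rather than a shell string keeps values with spaces or quotes intact.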
/agents attach <id> Behavior by Mode
/agents attach <id> behaves differently per isolation mode:
- SDK → returns a structured error. When the agent has its own on-disk session file (context: fork), the error surfaces the absolute path plus two hints: tail -f <sessionPath> for live inspection while running, and pi --session <sessionPath> to open the saved transcript in a fresh TUI after termination. When the agent runs entirely in-memory (context: fresh), there is no on-disk file to tail; the error explains that and points users at the orca:agent-completion LLM-context entry the parent receives on terminal status (spec §5.8).
- RPC → returns a structured error containing an absolute pi --session <path> hint so the user can resume the saved session in a fresh TUI.
- tmux (pane target, default) → switches focus to the agent's pane via tmux select-window -t <paneId> (resolves the containing window from the pane id) followed by tmux select-pane -t <paneId>.
- tmux (window target) → switches focus to the agent's window via tmux select-window -t <windowId>.
Tab-completion for <id> reflects this: /agents attach <tab> lists only non-SDK running agents — SDK agents still surface the read-only inspection hints described above, but cannot be attached interactively.
Tool Actions (agent tool)
The extension registers a single agent tool with a discriminated-union action parameter. LLMs invoke it programmatically; users invoke the same actions via slash commands.
| Action | Required params | Optional | Result |
|---|---|---|---|
| spawn | template | task, isolation, lifecycle, tmuxTarget, tmuxSplit, labels | { agentId, isolation, sessionPath, pid?, tmuxWindowId?, tmuxPaneId?, tmuxTarget? } |
| list | — | — | { agents: [{ agentId, templateName, status, isolation, model, pid, lastHeartbeat, cost }] } |
| stop | agentId | — | { agentId, isolation } — graceful then hard-kill after gracefulStopWaitSeconds |
| recover | — | — | { scanned, abandoned, respawned, skipped, worktreeSweep? } |
| status | agentId | — | Full status YAML contents |
| attach | agentId | — | { tmuxWindowId?, tmuxPaneId?, tmuxTarget? } (tmux only) or structured error (sdk/rpc) |
| prompt | agentId, task | — | { agentId, isolation } — drives the next turn of a persistent agent (spec §5.13); rejected when one-shot, terminal, or running |
All results carry success: boolean, action, and either message (success) or error (failure).
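The discriminated-union shape can be pictured with the following types. They are reconstructed from the table above and are not the package's declarations: the type names are hypothetical, and several optional spawn fields (tmuxTarget, tmuxSplit, labels) are left out for brevity.

```typescript
type Isolation = "sdk" | "process" | "tmux";

// One union member per action; the action field is the discriminant.
type AgentAction =
  | { action: "spawn"; template: string; task?: string; isolation?: Isolation; lifecycle?: "one-shot" | "persistent" }
  | { action: "list" }
  | { action: "recover" }
  | { action: "stop" | "status" | "attach"; agentId: string }
  | { action: "prompt"; agentId: string; task: string };

// Every result carries success and action, plus message on success or error on failure.
type AgentResult =
  | { success: true; action: AgentAction["action"]; message: string }
  | { success: false; action: AgentAction["action"]; error: string };

function describeAction(a: AgentAction): string {
  switch (a.action) {
    case "spawn":
      return `spawn ${a.template}` + (a.isolation ? ` (${a.isolation})` : "");
    case "list":
    case "recover":
      return a.action;
    case "stop":
    case "status":
    case "attach":
      return `${a.action} ${a.agentId}`;
    case "prompt":
      return `prompt ${a.agentId}: ${a.task}`;
    default: {
      // Compile-time exhaustiveness check: adding an action without a case fails to type-check.
      const unreachable: never = a;
      return unreachable;
    }
  }
}
```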
Slash Commands
| Command | Description |
|---|---|
| /agents (no args) | Toggle the agent widget |
| /agents spawn <template> [task...] [--isolation sdk\|process\|tmux] [--lifecycle one-shot\|persistent] [--tmux-target pane\|window] [--tmux-split horizontal\|vertical] | Spawn a new agent. Task is everything after the template name; --isolation and --lifecycle override template defaults per spec §5.11. --tmux-target / --tmux-split override the template's tmux: block; both are rejected unless effective isolation is tmux. |
| /agents list | List all agents in the current session |
| /agents stop <agentId> | Stop a running agent (graceful, then hard-kill) |
| /agents status <agentId> | Show full status YAML for an agent |
| /agents attach <agentId> | Attach to an agent (tmux only; sdk/rpc surface a hint) |
| /agents prompt\|tell\|ask <agentId> <task> | Drive the next prompt cycle of a persistent agent (spec §5.13). tell / ask are aliases for prompt. |
| /agents recover | Sweep the parent's status directory + worktree lockfiles; respawn what's recoverable |
| /agents create [name] | Interactive wizard: scaffold a new template at user or project scope |
| /agents validate [name] | Validate one template (by name) or every template in scope |
All subcommands tab-complete. /agents spawn <tab> lists templates annotated with their scope (user/project) and default isolation; pressing space after the template suggests --isolation, --tmux-target, and --tmux-split and each flag's tab list narrows to its valid values. /agents stop|status <tab> shows only running, non-terminal agents. /agents attach <tab> further excludes SDK agents (they can't be attached to a tmux pane or window).
The agent tool exposes spawn / list / stop / recover / status / attach / prompt. create and validate are slash-only because they require interactive prompts (create) or chatty per-template output (validate) that doesn't fit a single-shot tool response.
Widget
Alt+A toggles the agent widget. Bare /agents (no args) also toggles. The widget lifecycle mirrors pi-orca-tasks and pi-orca-messages:
- Auto-show on session_start when the parent's sibling folder already has agent status files.
- Auto-show on 0→>0 transitions during a session, both user-initiated (/agents spawn, the agent tool's spawn action) and external (another tool or recovery sweep creates a status file). Spawn-triggered shows happen immediately; external transitions surface within polling.idleIntervalSeconds (default 30s).
- Auto-hide on >0→0 transitions: when the last agent reaches a terminal state and the status directory is empty, the widget closes itself. Triggered both from /agents stop / /agents recover and from the idle poller.
- Live cost refresh: the idle poller re-reads <orca-dir>/agents/*.yaml every polling.idleIntervalSeconds. Heartbeats update each agent's cost.{inputTokens, outputTokens, totalCost} (see §6.3.1), so token/cost figures track running agents with worst-case staleness equal to one poll interval.
🤖 Agents
◉ scout-1 sdk fast $0.0021 123 / 856 tokens
◉ worker-2 process balanced $0.0145 2.1k / 9.8k tokens
⚠ planner-1 sdk thinker $0.0080 421 / 3.2k tokens orphaned (parent died)
✓ reviewer-1 sdk balanced $0.0044 312 / 1.4k tokens
────────────────────────────
Σ $0.0290 2.9k / 15.3k tokens
| Icon | Status | Notes |
|---|---|---|
| ◉ | running | Live; heartbeat within stuckThresholdSeconds |
| ◐ | idle | Persistent agent between prompts (spec §5.13). Alive; awaiting /agents prompt. |
| ⚠ | orphaned | Parent died or heartbeat stale; surfaces above terminal results |
| ✗ | failed | Surfaces cleanupBlockedReason inline when set |
| ✓ | completed | Successful exit |
| — | abandoned | Template missing on respawn, or respawn failed |
Sort order: running → orphaned → failed → completed → abandoned, then by agentId lexicographically. The totals row appears only when there's a non-zero cost or token count. The cost summary file (<orca-dir>/cost-summary.yaml) is preferred over computed totals when present.
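The stated sort order corresponds to a two-key comparator, sketched below. Row and sortRows are hypothetical names. The idle status is omitted because the order above does not say where it ranks.

```typescript
type Status = "running" | "orphaned" | "failed" | "completed" | "abandoned";

interface Row {
  agentId: string;
  status: Status;
}

// Lower rank sorts first, matching running → orphaned → failed → completed → abandoned.
const RANK: Record<Status, number> = {
  running: 0,
  orphaned: 1,
  failed: 2,
  completed: 3,
  abandoned: 4,
};

function sortRows(rows: Row[]): Row[] {
  // Copy first so the caller's array is not mutated.
  return [...rows].sort(
    (a, b) =>
      RANK[a.status] - RANK[b.status] ||
      (a.agentId < b.agentId ? -1 : a.agentId > b.agentId ? 1 : 0),
  );
}
```

Applied to the four agents in the sample widget, this yields scout-1, worker-2, planner-1, reviewer-1, the same order the sample shows.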
Filesystem Layout
<parent-session>.orca/
├── agents/
│   ├── <agentId>.yaml        # Per-agent status (one writer)
│   ├── <agentId>.log         # tmux only: persistent console log
│   ├── <agentId>.sock        # Parent→child control socket (RPC/tmux)
│   └── .counter-<template>   # Sticky-counter lock for agentId allocation
├── cost-summary.yaml         # Aggregated cost across terminated agents
└── notify.sock               # Child→parent multiplexed event socket

~/.pi/agent/orca/agents/      # User-scope templates
<project>/.pi/orca/agents/    # Project-scope templates (shadows user)

<repo>/.pi/
├── worktrees/<parentSessionId>/<agentId>/   # Per-agent git worktree
└── active-<agentId>.lock                    # Per-agent worktree lockfile

The parent-session sibling folder is the unit of recovery: /agents recover scans agents/*.yaml, applies liveness checks, and routes each non-terminal entry to respawn / abandon / skip.
Path resolution and cross-machine portability
Two forms of every session reference are persisted side-by-side throughout the agents extension (see spec §2.6.1 / §2.6.1.1):
- Relative sessionPath: canonical, portable. Stored in status YAMLs and used for filesystem layout (<orca-dir>/agents/<agentId>.yaml).
- Absolute sessionFile: captured at write time on the writer's host. Authoritative when it still resolves locally (no sync); falls back to the relative form when it doesn't.
engine/session-self.ts::resolveParentSessionFilePath(localSessionsRoot, sessionFile, sessionPath) is the single resolver. Spawner inputs (parentSessionFile, parentSessionPath) carry both forms; the SDK / RPC / tmux spawners call the resolver once at the top to compute the effective absolute path used for SessionManager.forkFrom. If both forms are empty (in-memory SDK parents), fork is rejected — only context: fresh is valid for in-memory parents. getOrcaSiblingDir additionally detects absolute input defensively so any caller still gets the right directory.
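A simplified picture of the two-form lookup follows. It is a sketch under assumptions, not the code in engine/session-self.ts: resolveSessionFile is a hypothetical name, the existence check is injected so the example stays free of filesystem calls, paths are joined POSIX-style, and returning undefined for a stale absolute path with no relative form is a guess about an unspecified case.

```typescript
function resolveSessionFile(
  localSessionsRoot: string,
  sessionFile: string, // absolute form captured on the writer's host; may be empty
  sessionPath: string, // relative, portable form; may be empty
  exists: (path: string) => boolean,
): string | undefined {
  // The absolute form is authoritative while it still resolves on this machine.
  if (sessionFile !== "" && exists(sessionFile)) return sessionFile;
  // Otherwise rebuild an absolute path from the portable relative form.
  if (sessionPath !== "") {
    return `${localSessionsRoot.replace(/\/+$/, "")}/${sessionPath}`;
  }
  // Nothing usable (for example an in-memory parent): the caller must reject context: fork.
  return undefined;
}
```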
Concurrency contract on status writes
atomicWrite uses per-call unique tmp suffixes so concurrent writes to the same target file cannot collide on rename. The earlier shared-tmp implementation manifested as ENOENT: rename ... .tmp -> ... whenever any two writers raced (heartbeat tick + initial status write, postmaster registry merges, etc.). createHeartbeat additionally serializes its own writes through a Promise chain and reads-merges-then-writes so fields written by other paths survive subsequent ticks. The status YAML on disk remains the canonical source of truth; in-process events (agentEvents for SDK, notify-socket pings for RPC/tmux) are only low-latency hints for the parent widget.
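The unique-suffix idea can be illustrated with a synchronous simplification. The function below is a hypothetical stand-in, not the package's atomicWrite. It only shows why a fresh tmp name per call removes the shared-tmp collision.

```typescript
import { mkdtempSync, readFileSync, readdirSync, renameSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical synchronous stand-in for an atomic status write.
function atomicWriteSync(target: string, contents: string): void {
  // A unique suffix per call means two writers never share (or steal) a tmp file.
  const tmp = `${target}.${randomUUID()}.tmp`;
  writeFileSync(tmp, contents, "utf8");
  // Renaming within one directory swaps the file in as a single step,
  // so readers see either the old contents or the new, never a partial write.
  renameSync(tmp, target);
}

// Demo: two writes to the same status file in a scratch directory.
const demoDir = mkdtempSync(join(tmpdir(), "orca-demo-"));
const statusFile = join(demoDir, "scout-1.yaml");
atomicWriteSync(statusFile, "status: running\n");
atomicWriteSync(statusFile, "status: completed\n");
const finalContents = readFileSync(statusFile, "utf8");
const leftovers = readdirSync(demoDir).filter((name) => name.endsWith(".tmp"));
```

With a shared tmp name, the second writer's rename could find the file already moved by the first, which is the ENOENT failure described above.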
Worktree Isolation
Templates with useWorktree: true get an isolated git branch and worktree at spawn time. This lets parallel agents edit the same repo without merge conflicts mid-flight.
- Branch: orca/<parentSessionId>/<agentId> (always -b, never -B; fresh-branch creation is mandatory).
- Path: <repo>/.pi/worktrees/<parentSessionId>/<agentId>/.
- Lockfile: <repo>/.pi/active-<agentId>.lock (per-agent, never global) carries pid, parentSessionId, branch, worktreePath, acquiredAt.
- Cleanup hook fires from each spawner's terminal paths (SDK agent_end / RPC onTerminate / stopXAgent). Tmux normal completion falls back to the recovery sweep.
- W9 uncommitted-work safeguard. Before git worktree remove --force, the cleanup module runs git status --porcelain inside the worktree. Non-empty output → skip removal, write cleanupBlockedReason: "uncommitted changes in worktree" into the status YAML, surface a warning notify. Set agents.worktreeForceCleanupOnTerminal: true to bypass.
- Spawn-failure rollback. If the spawner throws after worktree allocation, the orchestrator drops the half-allocated worktree (force=true) before re-throwing, so transient failures leak nothing.
- Recovery sweep. /agents recover scans <repo>/.pi/active-*.lock, applies isPidAlive, and reclaims worktrees whose owners are dead. The sweep result surfaces in RecoverAgentsResult.worktreeSweep (scanned, reclaimed, blocked, skippedLive, failed).
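The W9 gate in the list above comes down to one decision on the output of git status --porcelain. The sketch below separates that decision from the git calls; decideWorktreeCleanup and CleanupDecision are hypothetical names, while the reason string is the one quoted above.

```typescript
interface CleanupDecision {
  remove: boolean;
  cleanupBlockedReason?: string;
}

function decideWorktreeCleanup(porcelainOutput: string, forceCleanup: boolean): CleanupDecision {
  // Empty porcelain output means a clean worktree; the force flag bypasses the check.
  if (forceCleanup || porcelainOutput.trim() === "") {
    return { remove: true };
  }
  // Dirty worktree: keep it and record why cleanup was skipped.
  return { remove: false, cleanupBlockedReason: "uncommitted changes in worktree" };
}
```

Here forceCleanup plays the role of agents.worktreeForceCleanupOnTerminal.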
Parent Liveness Protocol
Three independent layers — defense in depth — prevent orphaned children from running unbounded.
Layer 1 — Shutdown guard (parent side, /quit). When the parent session is about to exit with running non-SDK children, the guard prompts the user (interactive: ctx.ui.select over Wait | Stop agents first | Orphan and quit; non-interactive: stop all). Configurable via parentLiveness.shutdownGuard: false.
Layer 2: Child self-rescue (child side, periodic poll). Each non-SDK child polls its parent's orca:session-self entry every idleIntervalSeconds. If the parent's PID is dead (isPidAlive returns false) for longer than orphanGracePeriodSeconds, AND the child is idle, the child writes status: orphaned and calls ctx.shutdown(). Before giving up, it walks the ancestor chain to find a live grandparent for completion delivery. SDK children skip this layer (they share the parent's event loop, so any condition that prevents the parent's heartbeat from advancing also prevents the child's check from firing). Configurable via parentLiveness.childSelfRescue: false.
Layer 3 — Recovery on parent restart (/agents recover). When a parent dies and is later resurrected (new PID, same session file), the user runs /agents recover. The dispatcher reads every non-terminal status YAML in the sibling folder, applies liveness checks per isolation mode, and routes each agent to one of three actions: respawn (template still exists, child dead) → re-launch under the live parent; abandon (template missing) → mark status: abandoned; skip (RPC live-PID no-rebind, or SDK live + tmux live conflict). Also drives the worktree-lockfile sweep described above.
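The Layer 3 routing can be summarized as a small decision function. This is a simplification with hypothetical names: the per-mode liveness checks are collapsed into one childAlive flag, and checking liveness before template existence is an assumption about ordering.

```typescript
type RecoveryAction = "respawn" | "abandon" | "skip";

// Hypothetical summary of one status YAML plus the checks run against it.
interface RecoveryInput {
  terminal: boolean;       // status is already terminal
  childAlive: boolean;     // result of the per-mode liveness check
  templateExists: boolean; // the agent's template can still be loaded
}

function planRecovery(entry: RecoveryInput): RecoveryAction | undefined {
  if (entry.terminal) return undefined;  // only non-terminal entries are routed
  if (entry.childAlive) return "skip";   // a live child is left alone (no rebind)
  return entry.templateExists ? "respawn" : "abandon";
}
```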
Configuration
AgentsConfig lives in OrcaConfig.agents. Defaults are in @pi-orca/core's DEFAULT_CONFIG.
| Key | Default | Effect |
|---|---|---|
| agents.worktreeForceCleanupOnTerminal | false | When true, terminal-status worktree cleanup bypasses the W9 uncommitted-work safeguard. Use for fire-and-forget environments where uncommitted work in a child worktree is always expendable. |
| agents.forkSizeWarnBytes | 1_000_000 | When context: fork would replay more than this many bytes of parent history to an SDK child, surface a warning at spawn. 0 disables. |
Cross-cutting config that affects agents:
| Key | Default | Effect |
|---|---|---|
| polling.heartbeatIntervalSeconds | 30 | Child heartbeat write cadence |
| polling.idleIntervalSeconds | 30 | Parent idle poll for stuck/orphan detection |
| polling.stuckThresholdSeconds | 120 | Time since lastHeartbeat before a still-alive agent is marked orphaned |
| parentLiveness.shutdownGuard | true | Enable Layer 1 (§5.1) |
| parentLiveness.childSelfRescue | true | Enable Layer 2 (§5.2) |
| parentLiveness.orphanGracePeriodSeconds | 60 | Grace window after detecting parent death before child shuts down |
| parentLiveness.gracefulStopWaitSeconds | 180 | Max wait for stopAllAgents graceful drain before hard kill |
| parentLiveness.spawnBootDeadlineSeconds | 10 | Max wait for RPC/tmux child to write its first status YAML before being declared failed |
| parentLiveness.shutdownWaitMaxSeconds | 600 | Max wait the shutdown guard's "Wait" branch will hold |
| notify.enabled | process.platform !== "win32" | Enable the parent↔child notify channel |
| notify.sockPathStrategy | "hashed-tmp" | hashed-tmp keeps the socket path under 100 chars on long session paths |
User-level config at ~/.pi/agent/orca/config.yaml; project-level shadows it at <project>/.pi/orca/config.yaml.
Real-Time Notification Channel
The agents extension uses the shared notification socket from @pi-orca/core to short-circuit poll-loop latency. The filesystem state is always authoritative — every receiver runs the existing idle poll and a notify-socket listener; the socket is a best-effort hint that the file just changed, so the receiver can read it now instead of waiting for the next poll tick. See docs/spec-v0.3.0.md §2.9 for full design.
Two flavors of socket exist per session:
- Child → parent at <orca-dir>/notify.sock. The session inbox is multiplexed across all extensions via a type-keyed handler registry, so a single socket file serves agents, messages (planned), and any future consumer without sprawling per-extension files. Messages: { type: "agent-update", agentId, event: "heartbeat" | "completion" | "error" }. SDK agents skip this because the parent is already subscribed to the in-process AgentSession event stream.
- Parent → child at <orca-dir>/agents/<agentId>.sock. The child stands this up when lifecycle: "persistent" AND isolation in {process, tmux} so the parent can push out-of-band control. Messages: { type: "agent-control", command: "abort" | "reread-status" | "graceful-stop" | "prompt", reason?, body? }. The prompt command (spec §5.13) carries the new turn's task in body; the child invokes pi.sendUserMessage(body). abort / reread-status / graceful-stop mirror the existing /agents stop and shutdown-drain semantics. One-shot agents do not allocate a socket; they receive control via the shutdown guard's per-mode hard-kill path.
If a ping fails (no listener, Windows, broken connection, timeout), the receiver still picks up the change at the next idle poll. The socket is a latency optimization, never a correctness dependency.
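A type-keyed handler registry of the kind described for notify.sock might be sketched as follows. NotifyRegistry is a hypothetical class, and the assumption that each message arrives as one JSON string is mine; the actual framing is defined in packages/core/src/utils/notify.ts.

```typescript
type Handler = (msg: Record<string, unknown>) => void;

class NotifyRegistry {
  private handlers = new Map<string, Handler>();

  // Each extension registers the message type it cares about.
  register(type: string, handler: Handler): void {
    this.handlers.set(type, handler);
  }

  // Returns true when a handler ran. Malformed or unknown messages are dropped,
  // which is safe because the status files stay authoritative.
  dispatch(raw: string): boolean {
    let msg: unknown;
    try {
      msg = JSON.parse(raw);
    } catch {
      return false;
    }
    if (typeof msg !== "object" || msg === null) return false;
    const record = msg as Record<string, unknown>;
    const type = record["type"];
    if (typeof type !== "string") return false;
    const handler = this.handlers.get(type);
    if (handler === undefined) return false;
    handler(record);
    return true;
  }
}
```

Dropping a bad ping instead of throwing fits the best-effort contract: the receiver's next idle poll picks up the change anyway.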
Child-mode lifecycle
When pi-orca-agents is loaded into an RPC or tmux child process, its session_start handler detects the orca:agent-instance entry that the parent pre-seeded into the child's session JSONL and stands up the child-side lifecycle in engine/child-bootstrap.ts:
- Heartbeat writer to the same agents/<agentId>.yaml the parent reads. Cost is accumulated locally from every assistant message_end (summing usage.input, usage.output, and usage.cost.total), because the ReadonlySessionManager exposed to extensions does not include getSessionStats(); only the parent-side in-process AgentSession does.
- Synchronous first write at boot (heartbeat.update({...identity}) → heartbeat.flush() → heartbeat.start()). The flush is what lets the parent's verifyChildBoot watchdog see mtime advance within parentLiveness.spawnBootDeadlineSeconds (default 10 s); without it the next scheduled write is heartbeatIntervalMs (default 30 s) away and boot always times out.
- Per-turn cost flush. Every turn_end calls heartbeat.flush() before the notify ping fires, dropping the latest accumulated cost on disk so the parent's re-read on receipt of the ping sees fresh tokens instead of stale data from the previous 30 s interval tick.
- turn_end drain captures the closing assistant text into lastAssistantText; when agent-control: graceful-stop has set drainFlag, the next turn boundary triggers terminal flush.
- agent_end one-shot terminal flush. The natural "LLM finished the task" event runs finalizeStatus("completed", lastAssistantText) followed by best-effort ctx.shutdown(). The child's YAML reaches terminal state before the parent's RpcClient.client.stop() SIGTERM lands, eliminating the previous flush-vs-shutdown race.
- Single terminal-flush coordinator so multiple shutdown paths (explicit shutdown(reason), agent_end, turn_end drain, session_shutdown, Layer 2 onOrphaned) write at most one terminal status.
- Notify-socket ping on every heartbeat and completion to the path recorded in orca:agent-instance.parentNotifySockPath. Best-effort: if it fails, the parent's idle poll still finds the change in the YAML.
- Layer 2 child self-rescue via startParentMonitor (spec §5.7). When the parent's PID disappears AND ctx.isIdle() returns true AND the grace period has elapsed, the child writes status: orphaned and shuts down.
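The self-rescue condition in the last item is a conjunction of three checks, which can be written as a pure predicate. RescueInput and shouldSelfRescue are hypothetical names; the real monitor also walks the ancestor chain, which this sketch leaves out.

```typescript
interface RescueInput {
  parentAlive: boolean;                  // latest PID liveness result for the parent
  parentDeadSinceMs: number | undefined; // when the parent was first observed dead
  nowMs: number;
  orphanGracePeriodSeconds: number;      // parentLiveness.orphanGracePeriodSeconds
  isIdle: boolean;                       // the child is not in the middle of a turn
}

function shouldSelfRescue(input: RescueInput): boolean {
  if (input.parentAlive || input.parentDeadSinceMs === undefined) return false;
  // The parent must have been dead for longer than the grace period.
  const deadForMs = input.nowMs - input.parentDeadSinceMs;
  const graceElapsed = deadForMs > input.orphanGracePeriodSeconds * 1000;
  // A busy child keeps working; it only gives up at an idle point.
  return graceElapsed && input.isIdle;
}
```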
The parent side stands up a single createNotifyServer(sockPath) at session_start, where sockPath resolves via resolveNotifySockPath(orcaDir, strategy) (default hashed-tmp keeps the path well under the 108-byte Unix socket cap). The registered agent-update handler (buildAgentUpdateHandler in src/index.ts) re-emits on the in-process agentEvents bus so the widget and idle poller converge with the SDK in-process path. On terminal pings (event: "completion" | "error"), the handler additionally reads the child's status YAML and calls enqueueCompletion(...) before re-emitting — that's the RPC/tmux equivalent of the SDK's in-process agent_end enqueue, and it's what gets the agent's result body into the parent LLM's context. See packages/core/src/utils/notify.ts for the protocol.
Agent Completion Delivery to the Parent LLM
Detection of a terminal status (the agent's YAML on disk) is separate from delivering the result to the parent LLM's message stream. The status YAML is canonical persistence; the LLM only sees what we explicitly push into its in-flight message array.
The extension uses two delivery paths to surface completions, mirroring the pattern in pi-orca-messages:
- pi.on("context") hook (mid-turn injection). Registered once at register() time. Fires before every LLM call. Drains a per-process completion queue and pushes the formatted batch into event.messages as a synthetic user message. The in-flight turn sees the completion immediately. Cannot use pi.sendUserMessage here: pi rejects it mid-build with "Agent is already processing."
- Idle pi.sendUserMessage (visible delivery). When the in-process agentEvents bus emits completion or error AND ctx.isIdle() is true AND the queue is non-empty, the parent extension drains the queue via pi.sendUserMessage(text). The body lands as a visible user message in the parent's chat and triggers a normal LLM turn.
Three enqueue sites feed the queue, depending on isolation and on whether the notify ping survived:
- SDK: engine/spawners/sdk.ts calls enqueueCompletion(...) inside the agent_end subscription callback, in the parent's process.
- RPC / tmux: the parent's buildAgentUpdateHandler (src/index.ts) calls enqueueCompletion(...) when it receives a notify ping with event: "completion" | "error".
- Poll-tick fallback: runAgentPollTick (widget-lifecycle.ts) detects any agent that transitioned non-terminal → terminal between two polling.idleIntervalSeconds ticks and enqueues from there. This is the safety net for notify-down environments (Windows, EADDRINUSE, notify.enabled: false, or a lost ping).
All three sites guard with markCompletionEnqueued(agentId) / hasCompletionBeenEnqueued(agentId) in completion-queue.ts, so even when multiple paths observe the same terminal transition the queue ends up with at most one entry per agentId. The dedup set is reset alongside the queue on session-boundary lifecycle hooks. Both delivery paths drain the same in-memory queue, so each completion is delivered exactly once. The body format follows spec §5.8: a one-line "Agent <id> (<template>) <exitReason>." prefix followed by the trimmed result text (or "(No assistant text was produced.)" when the agent produced no closing assistant text). Multiple completions in the queue batch into a single message with a "<N> agents completed:" header and --- separators.
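The dedup-then-batch behavior can be approximated with a small queue. This is a sketch rather than the contents of completion-queue.ts: the class and field names are hypothetical, and the exact whitespace around the header and the --- separators is assumed.

```typescript
interface Completion {
  agentId: string;
  template: string;
  exitReason: string;
  resultText: string;
}

function formatBody(c: Completion): string {
  const text = c.resultText.trim() || "(No assistant text was produced.)";
  return `Agent ${c.agentId} (${c.template}) ${c.exitReason}.\n${text}`;
}

class CompletionQueue {
  private pending: Completion[] = [];
  private seen = new Set<string>();

  // Returns false when this agentId was already enqueued (the dedup guard).
  enqueue(c: Completion): boolean {
    if (this.seen.has(c.agentId)) return false;
    this.seen.add(c.agentId);
    this.pending.push(c);
    return true;
  }

  // Empties the queue and returns one message, or undefined when nothing is pending.
  drain(): string | undefined {
    if (this.pending.length === 0) return undefined;
    const bodies = this.pending.map(formatBody);
    this.pending = [];
    if (bodies.length === 1) return bodies[0];
    return `${bodies.length} agents completed:\n\n${bodies.join("\n\n---\n\n")}`;
  }

  // Session-boundary reset clears both the queue and the dedup set.
  reset(): void {
    this.pending = [];
    this.seen.clear();
  }
}
```

Because both delivery paths would call the same drain(), whichever fires first takes the batch and the other finds the queue empty.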
Why not write through a second SessionManager.open on the parent's session? The parent's runtime uses its own in-memory SessionManager instance for LLM context and does not reload from disk. A second writer's appendCustomMessageEntry lands on disk but is invisible to the parent's runtime — confirmed empirically when an earlier version of this code took that route and the user observed no completion message in the parent's chat at all. The getParentWritable handle (engine/parent-writable.ts) is retained for entries that only need on-disk persistence (e.g. orca:agent-fallback-warning); it is intentionally NOT used for completion delivery.
The queue is reset on session-boundary lifecycle hooks (session_before_switch, session_before_fork, session_shutdown).
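The dedup guard, the single drain point, and the session-boundary reset fit together roughly as below. This is a minimal sketch: the real module exposes functions such as markCompletionEnqueued and hasCompletionBeenEnqueued, while the CompletionQueue class and its method names here are hypothetical stand-ins.

```typescript
// Illustrative sketch only: a class-based stand-in for the module-level queue and dedup set.
interface PendingCompletion {
  agentId: string;
  body: string;
}

class CompletionQueue {
  private pending: PendingCompletion[] = [];
  private enqueuedIds = new Set<string>();

  hasBeenEnqueued(agentId: string): boolean {
    return this.enqueuedIds.has(agentId);
  }

  // Returns false when this agentId was already enqueued, so a second observer is a no-op.
  enqueue(entry: PendingCompletion): boolean {
    if (this.enqueuedIds.has(entry.agentId)) return false;
    this.enqueuedIds.add(entry.agentId);
    this.pending.push(entry);
    return true;
  }

  // Hands over everything pending and empties the queue; the dedup set is kept,
  // so a late duplicate after delivery is still rejected.
  drain(): PendingCompletion[] {
    const drained = this.pending;
    this.pending = [];
    return drained;
  }

  // Called on session boundaries: clears both the queue and the dedup set.
  reset(): void {
    this.pending = [];
    this.enqueuedIds.clear();
  }
}

const completionQueue = new CompletionQueue();
const firstEnqueue = completionQueue.enqueue({ agentId: "scout-1", body: "from notify ping" });
const secondEnqueue = completionQueue.enqueue({ agentId: "scout-1", body: "from poll tick" });
const delivered = completionQueue.drain();
const lateDuplicate = completionQueue.enqueue({ agentId: "scout-1", body: "late ping" });
completionQueue.reset();
const afterReset = completionQueue.enqueue({ agentId: "scout-1", body: "new session" });
```

Keeping the dedup set separate from the pending list is what lets a drained completion stay deduplicated until the next session boundary.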
Architecture
packages/agents/src/
├── index.ts                     # Extension entry: tool, commands, shortcuts, lifecycle
├── widget.ts                    # Alt+A widget renderer + component factory
└── engine/                      # Pure engine modules (no Pi/UI dependencies)
    ├── agent-events.ts          # In-process EventEmitter for SDK progress/completion/error updates
    ├── agent-id.ts              # Sticky-counter allocator with PID-based lock
    ├── bootstrap.ts             # First-run default-template bootstrap
    ├── child-mode.ts            # Child-side self-identification (orca:agent-instance)
    ├── completion-notify.ts     # formatAgentCompletionBody + AGENT_COMPLETION_CUSTOM_TYPE
    ├── completion-queue.ts      # Per-process pending-completion queue (drained by index.ts)
    ├── liveness.ts              # Shutdown guard, stopAllAgents, drain hooks
    ├── parent-monitor.ts        # Child Layer 2: parent PID poll + ancestor walk
    ├── parent-writable.ts       # Cached writable SessionManager for disk-only entry writes
    ├── pi-binary.ts             # capturePiInvocation, captureInheritedCliFlags
    ├── recovery.ts              # /agents recover: plan + execute + worktree sweep
    ├── scaffold.ts              # /agents create + /agents validate
    ├── session-self.ts          # orca:session-self entry helpers + resolveParentSessionFilePath
    ├── spawn.ts                 # spawnAgent orchestrator (template resolution → isolation routing)
    ├── status.ts                # Status YAML read/write/list, terminal-status helpers
    ├── template.ts              # Template loader, inheritance resolution
    ├── tool.ts                  # Pure dispatcher for the `agent` tool
    ├── verify-child-boot.ts     # Watchdog for spawn → first-heartbeat deadline
    ├── worktree.ts              # Git worktree allocate/cleanup/sweep with injectable GitExec
    └── spawners/
        ├── sdk.ts               # In-process via createAgentSession
        ├── rpc.ts               # Out-of-process via RpcClient
        └── tmux.ts              # tmux new-window + wrapper script

The dispatcher (tool.ts) is pure: it depends only on engine modules and accepts test fakes via context injection. The Pi-shaped wiring (pi.registerTool, pi.registerCommand, pi.registerShortcut, event handlers) all lives in index.ts.
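To illustrate what context injection buys, here is a simplified sketch of a pure dispatcher driven by a test fake. The DispatchContext shape, the ToolResult envelope, and dispatchAction are hypothetical and synchronous for brevity; the actual tool.ts presumably handles more actions and asynchronous spawns.

```typescript
// Illustrative sketch only: the context shape and result envelope are assumptions,
// not the package's types.
interface DispatchContext {
  spawnAgent(template: string, task: string): { agentId: string };
}

type ToolResult = { ok: true; agentId: string } | { ok: false; error: string };

interface ToolInput {
  action: string;
  template?: string;
  task?: string;
}

// Pure in the sense used above: every side effect goes through the injected context.
function dispatchAction(ctx: DispatchContext, input: ToolInput): ToolResult {
  switch (input.action) {
    case "spawn": {
      if (!input.template || !input.task) {
        return { ok: false, error: "spawn requires template and task" };
      }
      try {
        return { ok: true, agentId: ctx.spawnAgent(input.template, input.task).agentId };
      } catch (err) {
        // Failures are turned into an error envelope instead of propagating.
        return { ok: false, error: err instanceof Error ? err.message : String(err) };
      }
    }
    default:
      return { ok: false, error: `unknown action: ${input.action}` };
  }
}

// A test fake: records calls instead of starting a real agent.
const recordedCalls: Array<[string, string]> = [];
const fakeContext: DispatchContext = {
  spawnAgent(template, task) {
    recordedCalls.push([template, task]);
    return { agentId: `${template}-1` };
  },
};

const spawned = dispatchAction(fakeContext, { action: "spawn", template: "scout", task: "Find call sites" });
const rejected = dispatchAction(fakeContext, { action: "teleport" });
```

Because the dispatcher never touches Pi directly, a test can assert on recordedCalls and on the returned envelope without a running session.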
Test Coverage
347 tests across 18 files, all passing. Real tmux, git, and child pi processes are stubbed via injectable execs; tests run sub-second except for liveness.test.ts (5s due to deliberate timer fixtures).
| File | Tests | Coverage |
|---|---|---|
| agent-id.test.ts | 15 | Sticky-counter allocation, PID-based lock retry, concurrent spawn id distinctness |
| child-mode.test.ts | 13 | orca:agent-instance self-identification, terminal-flush dedup, mapEntryDataToIdentity |
| liveness.test.ts | 17 | Shutdown guard interactive + non-interactive, stopAllAgents per mode, drain timeouts |
| parent-monitor.test.ts | 10 | Ancestor walk, dead-parent grace period, no-overlap polling |
| pi-binary.test.ts | 14 | capturePiInvocation, captureInheritedCliFlags, -ne handling |
| recovery.test.ts | 20 | Plan/execute, abandon/respawn/skip routing, worktree sweep integration |
| scaffold.test.ts | 30 | /agents create wizard paths, /agents validate (required fields, model alias, tool names), name pattern |
| session-self.test.ts | 8 | orca:session-self write/read, absolute-path resolution |
| spawn.test.ts | 19 | Orchestrator: template resolution, isolation routing, useWorktree allocation, respawn dedup |
| spawner-rpc.test.ts | 18 | RpcClient subscribe-before-start, PID discovery via status YAML, force-kill timeout |
| spawner-sdk.test.ts | 22 | agent_end handling, fork-bloat warning, original-task dedup, abort path |
| spawner-tmux.test.ts | 30 | Wrapper-script quoting, agentId sanitization, tmux -V probe, pipe-pane |
| status.test.ts | 15 | YAML round-trip, exitReason: "" empty-string, terminal-status helpers |
| template.test.ts | 20 | Project-over-user, required fields, inheritance resolution, extensions: [] SDK rejection |
| tool.test.ts | 31 | Every dispatcher action path, error envelopes, recoverAction onRespawn closure |
| verify-child-boot.test.ts | 11 | Deadline fires, stands down on timely status, diagnostic includes inherited flags |
| widget.test.ts | 28 | Format helpers, sort order, totals row, status icons, width clamping |
| worktree.test.ts | 26 | -b only, stale-branch retry, W9 safeguard, idempotency, sweep |
Dependencies
- @pi-orca/core: Shared types, utilities, filesystem helpers, notify channel
- @earendil-works/pi-coding-agent: Pi SDK: SessionManager, createAgentSession, RpcClient, extension types (provided by Pi runtime)
- yaml: YAML parse/serialize for templates and status files
Optional integration: @pi-orca/messages is detected at runtime via messages.registerSession() for the §5.8 completion-via-delivery.send path used by RPC/tmux agents that have a distinct sessionId from the parent. SDK agents share the parent's process and reach the parent's LLM through the in-process completion queue + pi.on("context") hook (see Agent Completion Delivery above) — no messages-package dependency required.
Documentation
- End-user guide — Hands-on walkthroughs (recommended starting point)
- Specification §5 — Full design
- Phase 3 plan — Implementation roadmap and §13 decisions log
- Phase 3 completion report — Validation against spec
License
MIT License — see LICENSE for details.
