@freshsqueezed/playdough

v1.0.2

Published

2 months ago

Local coding agent orchestrator with AFK ralph loops


mateogordo

playdough

A local coding-agent orchestrator for AFK ralph loops.

You drop a markdown file into .playdough/lanes/ describing what you want done.
Playdough runs the lane in a sandboxed worktree on a schedule (or on demand), iterating the agent until it emits <COMPLETE> or hits the iteration cap.
The commits land on a playdough/<lane> branch, and (optionally) a PR is opened on completion.

Playdough is provider-agnostic: it ships with first-party providers for claude-code, codex, and gemini, and runs them inside Docker or Podman sandboxes. It suits jobs such as fixing flaky tests overnight, grinding through a backlog of typo fixes, or running a code-review pass on every PR, with the orchestration and sandboxes running entirely on your own machine.

Prerequisites

- Git
- Node.js 20 or newer
- A container runtime. Playdough needs an isolated environment to run agents in:
  - Docker Desktop: the default, most common for local development
  - Podman: a rootless alternative
- An agent CLI installed in the sandbox image (Claude Code by default; Codex and Gemini via --provider)
- An API key for the agent provider you choose (ANTHROPIC_API_KEY, OPENAI_API_KEY, or GEMINI_API_KEY)

Quick start

Install playdough into your repo:
```
pnpm add -D @freshsqueezed/playdough
```
Scaffold the .playdough/ config directory. This writes a Dockerfile, an example lane, a config file, a .env.example, and a run.ts entry point.
```
npx playdough init
```
Pass --provider claude-code,codex,gemini to wire multiple agent CLIs into the sandbox image, or --sandbox podman to scaffold for Podman instead of Docker.

Copy .env.example to .env (gitignored) and fill in your API keys:

```
cp .playdough/.env.example .playdough/.env
# edit .playdough/.env
```

Build the sandbox image:
```
npx playdough docker build-image
```
Start the daemon. It watches .playdough/lanes/, schedules runs, and serves the dashboard.
```
npx playdough daemon
```
The dashboard is at http://localhost:7777/ by default.
Trigger the example lane manually:
```
npx playdough lane run example
```

The daemon claims the trigger on its next tick, materializes a worktree, runs the agent inside the sandbox, and reports progress on the dashboard.

How it works

Playdough orchestrates four pieces:

- Lanes: markdown files in .playdough/lanes/ with YAML frontmatter (provider, model, success command, schedule). The body is the prompt the agent sees.
- Worktrees: each run gets a fresh git worktree on a playdough/<lane> branch, forked from the lane's baseBranch.
- Sandbox: a Docker or Podman container that bind-mounts the worktree. The agent CLI runs as a non-root user with the env from .playdough/.env.
- Daemon: a long-running orchestrator that watches lane files, evaluates schedules (cron / interval / manual), enforces a concurrencyCap, and persists run state in a SQLite db at .playdough/state.db.

Each run executes the agent in a loop: prompt → stream output → check for the <COMPLETE> token → run the lane's successCommand (e.g. pnpm test). The loop repeats until the run succeeds, reaches the iteration cap, or the idle timeout fires. Sessions resume across iterations where the provider supports it.
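
The loop described above can be sketched roughly as follows. This is a minimal illustration, not playdough's actual implementation: `runLane`, `LoopDeps`, and `LoopResult` are hypothetical names, the agent and success command are injected as plain synchronous functions to keep the sketch self-contained, and the idle timeout and session resume are left out. It also assumes the success command only runs once the token has been seen.

```typescript
// Hypothetical sketch of the per-run loop; all names are illustrative only.
const COMPLETE_TOKEN = "<COMPLETE>";

interface LoopDeps {
  // Runs one agent iteration and returns everything it printed.
  runAgent: (prompt: string, iteration: number) => string;
  // Runs the lane's successCommand; true means it exited cleanly.
  runSuccessCommand: () => boolean;
}

interface LoopResult {
  completed: boolean;
  iterations: number;
}

function runLane(prompt: string, maxIterations: number, deps: LoopDeps): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const output = deps.runAgent(prompt, i);
    // Assumption: the token only counts when it sits on its own line.
    const sawToken = output
      .split("\n")
      .some((line) => line.trim() === COMPLETE_TOKEN);
    if (sawToken && deps.runSuccessCommand()) {
      return { completed: true, iterations: i };
    }
  }
  // Iteration cap reached without a passing success check.
  return { completed: false, iterations: maxIterations };
}
```

Injecting the agent and the success command keeps the control flow testable without a container runtime; a real loop would await both calls and stream output as it arrives.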

Lanes

A lane is a single markdown file. The frontmatter configures the run; the body is the prompt sent to the agent.

```
---
name: typo-sweep
provider: claude-code
model: claude-opus-4-7
baseBranch: main
successCommand: "pnpm test"
maxIterations: 5
schedule:
  kind: cron
  expression: "0 */6 * * *"
---

Scan the repo for typos in markdown files (`*.md`).

For each typo you find, fix it in place with the smallest possible edit.
When done, write a short `.playdough/pr.md` (title on the first line as H1,
then a brief body) and print `<COMPLETE>` on its own line.
```
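
To make the frontmatter/body split concrete, here is a small sketch of how a lane file like the one above could be separated into its two parts. `splitLaneFile` and `LaneParts` are hypothetical names, not playdough's parser, and the YAML itself is left unparsed.

```typescript
// Hypothetical helper: separates a lane file's frontmatter from its prompt body.
interface LaneParts {
  frontmatter: string; // raw YAML text, still unparsed
  body: string; // prompt sent to the agent
}

function splitLaneFile(source: string): LaneParts {
  const lines = source.split(/\r?\n/);
  if (lines[0]?.trim() !== "---") {
    throw new Error("lane file must start with a '---' frontmatter fence");
  }
  // Find the closing fence that follows the opening one.
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) {
    throw new Error("frontmatter is missing its closing '---'");
  }
  return {
    frontmatter: lines.slice(1, close).join("\n"),
    body: lines.slice(close + 1).join("\n").trim(),
  };
}
```

The raw frontmatter string would then be handed to a YAML parser; that step is omitted to keep the sketch dependency-free.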

Frontmatter fields:

| Field | Default | Description |
| --- | --- | --- |
| name | none | Required. Used to derive the branch playdough/<name> |
| provider | from config | Agent provider: claude-code, codex, gemini, or a plugin |
| model | from config | Model string passed through to the provider CLI |
| baseBranch | main | Branch the worktree forks from |
| successCommand | none | Shell command run inside the sandbox to gate completion |
| maxIterations | 10 | Hard ceiling on agent iterations per run |
| idleTimeoutSec | 600 | Per-iteration idle timeout before the agent is killed |
| retries | 3 | Consecutive failures before the lane is dead-lettered |
| resumeSession | true | Resume the provider session across iterations (where supported) |
| schedule | { kind: "manual" } | manual, { kind: "cron", expression }, or { kind: "interval", intervalMs } |
| pr | none | Optional PR publishing config (title/body templates, reviewers, labels) |
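
The three schedule shapes map naturally onto a discriminated union. The sketch below is an assumption about how such a field could be modeled and checked, not playdough's exported types: `Schedule` and `isDue` are hypothetical, an interval lane that has never run is assumed to be due immediately, and cron evaluation is left out because it needs a cron parser.

```typescript
// Hypothetical types mirroring the `schedule` field; not playdough's own types.
type Schedule =
  | { kind: "manual" }
  | { kind: "cron"; expression: string }
  | { kind: "interval"; intervalMs: number };

// Decides whether a lane is due, given when it last started (epoch ms, or
// null if it never ran) and the current time.
function isDue(schedule: Schedule, lastRunAt: number | null, now: number): boolean {
  switch (schedule.kind) {
    case "manual":
      // Only an explicit trigger starts a manual lane.
      return false;
    case "interval":
      // Assumption: a lane that has never run is due right away.
      return lastRunAt === null || now - lastRunAt >= schedule.intervalMs;
    case "cron":
      throw new Error("cron schedules need a cron parser; not covered by this sketch");
  }
}
```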

CLI commands

`playdough init`

Scaffolds .playdough/ into the current repo: Dockerfile, .env.example, playdough.config.ts, lanes/example.md, and run.ts. Detects an installed container runtime (docker or podman) and writes config accordingly.

| Flag | Default | Description |
| --- | --- | --- |
| --force | false | Overwrite existing files instead of skipping them |
| --sandbox docker\|podman | autodetect | Force a container runtime instead of probing the host |
| --provider <list> | claude-code | Comma-separated agent providers to install into the Dockerfile and .env.example |

`playdough daemon`

Runs the orchestrator in the foreground. Watches .playdough/lanes/, schedules runs, claims triggers, and serves the dashboard on dashboardPort (default 7777). Ctrl-C drains in-flight runs; a second Ctrl-C force-exits.
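
The two-stage Ctrl-C behavior can be pictured as a tiny state machine. This is a sketch under assumptions, not the daemon's real code: `ShutdownState`, `ShutdownAction`, and `onInterrupt` are hypothetical names.

```typescript
// Hypothetical state machine for "first Ctrl-C drains, second force-exits".
type ShutdownState = "running" | "draining" | "exited";
type ShutdownAction = "none" | "stop-claiming-and-drain" | "force-exit";

interface Transition {
  state: ShutdownState;
  action: ShutdownAction;
}

function onInterrupt(state: ShutdownState): Transition {
  switch (state) {
    case "running":
      // First interrupt: stop claiming new triggers, let in-flight runs finish.
      return { state: "draining", action: "stop-claiming-and-drain" };
    case "draining":
      // Second interrupt: give up on in-flight runs and exit now.
      return { state: "exited", action: "force-exit" };
    case "exited":
      return { state: "exited", action: "none" };
  }
}
```

In a Node.js process, a function like this would typically be called from a `process.on("SIGINT", ...)` handler that keeps the current state in a variable.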

`playdough lane list`

Lists every lane parsed from .playdough/lanes/ along with the status of its most recent run.

`playdough lane run <name>`

Enqueues a manual trigger for a lane. The daemon claims it on the next tick — no need to restart.

`playdough lane requeue <name>`

Pops a lane out of the dead-letter state (after retries consecutive failures) and enqueues a fresh manual trigger.
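
As an illustration of the dead-letter rule, the sketch below tracks consecutive failures and parks a lane once they reach `retries`. `LaneHealth`, `recordRun`, and `requeue` are hypothetical names, and resetting the counter on requeue is an assumption rather than documented behavior.

```typescript
// Hypothetical bookkeeping for dead-lettering; not playdough's state schema.
interface LaneHealth {
  consecutiveFailures: number;
  deadLettered: boolean;
}

function recordRun(health: LaneHealth, succeeded: boolean, retries: number): LaneHealth {
  // A dead-lettered lane stays parked until it is requeued.
  if (health.deadLettered) return health;
  // Any success resets the failure streak.
  if (succeeded) return { consecutiveFailures: 0, deadLettered: false };
  const consecutiveFailures = health.consecutiveFailures + 1;
  return { consecutiveFailures, deadLettered: consecutiveFailures >= retries };
}

// Assumption: requeueing clears the streak so the lane gets a fresh start.
function requeue(): LaneHealth {
  return { consecutiveFailures: 0, deadLettered: false };
}
```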

`playdough doctor [--fix]`

Scans for orphaned worktrees, stuck containers, stale SQLite locks, and runs whose state on disk has drifted from the database. Without --fix, prints a report. With --fix, cleans them up (refuses to run while a daemon is up).
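
One of these checks, drift between worktrees on disk and runs recorded in the database, comes down to a set difference. The sketch below shows that idea only; `findDrift` and `DriftReport` are hypothetical names and the inputs are assumed to be lists of worktree paths gathered elsewhere.

```typescript
// Hypothetical drift check: compares worktrees found on disk with the
// worktrees that the database believes belong to active runs.
interface DriftReport {
  orphanedWorktrees: string[]; // on disk, but no active run owns them
  missingWorktrees: string[]; // referenced by a run, but gone from disk
}

function findDrift(worktreesOnDisk: string[], activeRunWorktrees: string[]): DriftReport {
  const onDisk = new Set(worktreesOnDisk);
  const active = new Set(activeRunWorktrees);
  return {
    orphanedWorktrees: worktreesOnDisk.filter((p) => !active.has(p)).sort(),
    missingWorktrees: activeRunWorktrees.filter((p) => !onDisk.has(p)).sort(),
  };
}
```

A report-only mode would print both lists, while a fixing mode would act on them.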

`playdough docker build-image` / `playdough podman build-image`

Builds the sandbox image from .playdough/Dockerfile. Pass --tag <tag> to override the default playdough/sandbox:latest.

Configuration

All per-repo configuration lives in .playdough/:

```
.playdough/
├── Dockerfile             # Sandbox environment (customize as needed)
├── playdough.config.ts    # Daemon config + lane defaults
├── lanes/                 # One markdown file per lane
│   └── example.md
├── .env                   # API keys (gitignored)
├── .env.example           # Template
├── state.db               # SQLite: run history, triggers, locks
└── run.ts                 # Programmatic entry point (optional)
```

playdough.config.ts is loaded by both the daemon and the programmatic API. A missing config falls back to the built-in defaults.

```
export default {
  concurrencyCap: 2,
  dashboardPort: 7777,
  sandbox: "docker",
  sandboxImage: "playdough/sandbox:latest",
  defaults: {
    provider: "claude-code",
    model: "claude-opus-4-7",
    baseBranch: "main",
    maxIterations: 10,
    idleTimeoutSec: 600,
    retries: 3,
    resumeSession: true,
  },
};
```

Lane-level fields override the matching defaults field. concurrencyCap is enforced by the daemon — at most N lane runs execute simultaneously regardless of how many triggers are pending.
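
Both rules can be illustrated with two small functions. This is a sketch under assumptions, not the daemon's code: `LaneDefaults`, `resolveLane`, and `claimable` are hypothetical names, the field names follow the config shown above, and claiming pending triggers in first-in, first-out order is an assumption.

```typescript
// Hypothetical shape of the `defaults` block from playdough.config.ts.
interface LaneDefaults {
  provider: string;
  model: string;
  baseBranch: string;
  maxIterations: number;
  idleTimeoutSec: number;
  retries: number;
  resumeSession: boolean;
}

// Lane-level values win; anything the lane leaves unset falls back to defaults.
function resolveLane(defaults: LaneDefaults, lane: Partial<LaneDefaults>): LaneDefaults {
  return {
    provider: lane.provider ?? defaults.provider,
    model: lane.model ?? defaults.model,
    baseBranch: lane.baseBranch ?? defaults.baseBranch,
    maxIterations: lane.maxIterations ?? defaults.maxIterations,
    idleTimeoutSec: lane.idleTimeoutSec ?? defaults.idleTimeoutSec,
    retries: lane.retries ?? defaults.retries,
    resumeSession: lane.resumeSession ?? defaults.resumeSession,
  };
}

// At most `concurrencyCap` runs execute at once; extra triggers stay pending.
// Assumption: pending triggers are claimed in first-in, first-out order.
function claimable(pending: string[], runningCount: number, concurrencyCap: number): string[] {
  const freeSlots = Math.max(0, concurrencyCap - runningCount);
  return pending.slice(0, freeSlots);
}
```

With `concurrencyCap: 2` and one run already in flight, only the first of three pending triggers would be claimed on the next tick.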

Programmatic API

Playdough also exports run() for use in scripts, CI pipelines, or custom tooling. This is what .playdough/run.ts uses by default.

```
import { run } from "@freshsqueezed/playdough";

const result = await run({
  lane: ".playdough/lanes/example.md",
  onStdout: (chunk) => process.stdout.write(chunk),
  onStderr: (chunk) => process.stderr.write(chunk),
});

console.log(result.completed);     // true once the agent emitted <COMPLETE> and successCommand passed
console.log(result.iterations);    // per-iteration results, including stdout/stderr tails
console.log(result.branch);        // playdough/<lane>
```

run() accepts overrides for repoRoot, sandboxImage, env, maxIterations, the sandbox/agent providers, the state db path, and the PR publisher. See PlaydoughRunOptions in src/run.ts.

Development

Working on playdough itself? Here's how to run it against another project locally.

Clone, install, build

```
git clone https://github.com/<your-fork>/playdough.git
cd playdough
pnpm install
pnpm run build
```

The build runs tsc for the library and Vite for the dashboard UI. The CLI entry point is dist/cli.js.

Link into a target project

Once built, register the local checkout as a global symlink so playdough resolves to your working copy:

```
# In the playdough repo
pnpm link --global
```

Then in the project you want to test against:

```
# In the target repo
pnpm link --global playdough
npx playdough init
npx playdough daemon
```

When you make changes in the playdough repo, rerun pnpm run build and the linked CLI picks up the new dist/ immediately — no relinking required. To unlink:

```
# In the target repo
pnpm unlink --global playdough

# In the playdough repo
pnpm unlink --global
```

npm link and yarn link work the same way if you're not using pnpm.

Running the daemon directly from source

If you don't want to link, you can run the daemon against any repo by pointing the CLI at it:

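```
# Build once in the playdough checkout
cd /path/to/playdough
pnpm run build

# Then run the built CLI from the target repo's cwd
cd /path/to/target-repo
node /path/to/playdough/dist/cli.js daemon
```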

Or use pnpm exec:

```
cd /path/to/target-repo
pnpm exec --dir /path/to/playdough playdough daemon
```

Feedback loops

```
pnpm run typecheck   # tsc --noEmit across the library and dashboard-ui
pnpm run test        # vitest run
pnpm run test:watch  # vitest in watch mode
```

Both typecheck and test must pass before committing. Tests use the real better-sqlite3 driver against a state file in a temporary directory.

Repo layout

```
src/                  # Library + CLI source
  cli.ts              # Entry point for the `playdough` bin
  daemon.ts           # Long-running orchestrator
  loop.ts             # Per-iteration agent loop
  lane.ts             # Lane file parser
  sandbox.ts          # Docker/Podman exec wrappers
  state.ts            # SQLite-backed run/trigger/lock store
  ...
dashboard-ui/         # Vite + React dashboard served by the daemon
ralph/                # Reference implementation of the ralph loop in bash
dist/                 # Build output (gitignored)
```

License

ISC
