agentspace-cli

v0.1.4

Published

4 days ago

Docker-based helpers for running coding agents in isolated workspaces.

imrec

agent claude-code cli codex coding-agent docker isolation sandbox workspace yolo

agentspace

Run long-running, autonomous coding agents in "YOLO" mode, safely.

agentspace is an alternative to running coding agents inside tmux, screen, or another terminal multiplexer, and to juggling a pile of git worktrees by hand. It uses Docker to give each task its own long-running container that you can detach from and reattach to at will, as you would with a multiplexer, but with full workspace isolation built in.

Each task gets its own disposable workspace (a Docker volume) and its own throwaway container. The agent edits, tests, and builds inside that isolation; you keep full control over review, commits, and merges. agentspace currently wraps Codex and Claude Code.

Quick start

1. Prerequisites

| Requirement                | Why                                                |
| -------------------------- | -------------------------------------------------- |
| Node.js (current LTS, 20+) | installs and runs the agentspace CLI via npm       |
| Docker                     | runs the agent containers and workspace volumes    |
| Git                        | configured on the host with access to your remotes |

The agent image is pulled automatically from GitHub Container Registry on first use, so there's no manual build needed.

2. Install

npm i -g agentspace-cli      # installs the `agentspace` command
agentspace --version         # print the installed version (also -v, version)

3. Spawn and talk to the agent

Run from inside the git repository you want the agent to work on:

cd ~/code/my-project

agentspace spawn claude fix-login   # clone into a fresh workspace, start an agent, and attach

spawn infers the repo from the current folder (its origin URL and your checked-out branch), cuts an agent/<repo>/<task> branch in an isolated volume, starts the agent container, and attaches you straight to it. Talk to the agent directly in your terminal.

When you want to step away, detach with the Docker standard: press Ctrl-P then Ctrl-Q in succession. The agent keeps running in the background. Reattach whenever you like:

agentspace fix-login attach         # drop back into the running agent

4. Review and promote

agentspace fix-login review                       # status, diff stat, and the full diff
agentspace fix-login commit ["optional msg"]      # show diff, generate a message, commit on accept
agentspace fix-login commit-push ["optional msg"] # same, then push the branch in one step

Use commit to commit your changes inside the workspace, or commit-push if you want to push your branch immediately. Both generate the commit message for you from the staged diff (and let you edit it before committing), so run them with no message argument. Only pass a message (agentspace fix-login commit "my message") when you specifically want to write it yourself.

Once you've pushed (via commit-push or push), the task is just a normal agent/<repo>/<task> branch on your remote, so you can merge your work through your normal git procedures: open a pull request or merge it locally.

5. Clean up

agentspace fix-login purge          # remove the container and its workspace + sessions volumes

purge tears the task down completely once you're done with it.

Docker support

Some projects need Docker themselves: a Postgres for local dev, a containerized app, a docker compose stack. Spawn the task with --docker and it gets its own isolated, nested Docker daemon running as a sidecar:

agentspace spawn claude my-app --docker

The agent can then use docker run, docker compose, and docker build normally, and the host's Docker is never exposed. See Run Docker inside a task for ports, reaching services, and the security boundary.

Preview in a browser

See the agent's work running, without committing first, by starting a preview: a sidecar container that mounts the same workspace volume, runs a dev server, and publishes a port.

agentspace my-task preview node:24-slim --port 5173 \
  --cmd "npm install && npm run dev -- --host 0.0.0.0 --port 5173"

Because the preview shares the workspace volume, it sees edits live and hot-reloads as the agent works. See Preview results in a browser for the full story.

That's the whole loop. Everything below is reference.

How it works

Workspace = state. A Docker volume holds the git repo and the agent's changes.
Container = tool. A disposable agent container runs against that volume.
Git = promotion. Short-lived helper containers clone/diff/commit/push; the agent container never holds git credentials.

All networked git (clone, fetch, push) runs on the host with your normal git setup, so credentials, host keys, and commit signing stay native. History moves between host and volume as git bundles piped over stdin/stdout, so there's no SSH agent forwarding into containers and no host-specific socket plumbing.
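The bundle-over-a-pipe idea can be tried with plain git. The sketch below is an illustration, not agentspace's actual code: send_branch and receive_branch are hypothetical helper names, and the receiver spools stdin to a temporary file because reading a bundle directly from stdin is not available in every git version.

```shell
# Hypothetical helpers (not part of agentspace): move history as a git bundle over a pipe.

# send_branch <repo-dir> <branch>
# Writes a bundle of <branch> to stdout ("-" means stdout for `git bundle create`).
send_branch() {
  git -C "$1" bundle create - "$2"
}

# receive_branch <repo-dir> <branch>
# Reads a bundle from stdin and fetches <branch> from it as origin/<branch>.
receive_branch() {
  local tmp rc
  tmp=$(mktemp) || return 1
  cat > "$tmp"                      # spool the piped bundle to a file git can open
  git -C "$1" fetch -q "$tmp" "refs/heads/$2:refs/remotes/origin/$2"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, `send_branch ~/code/my-project main | receive_branch /path/to/other-repo main` copies main across without either side needing network access or credentials; only bytes on a pipe cross the boundary.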

Restarts resume the agent's existing session rather than starting blank, and the work is always just a normal git branch, so agentspace adds no lock-in.

Git writes are off-limits to agents, on purpose

agentspace is deliberately opinionated: the agent never modifies git. It edits, tests, and builds, and it may run read-only git (status, log, diff, show, blame, ...) to inspect the repo, but you own review, commit, push, and merge. This keeps an autonomous YOLO-mode agent from rewriting history, force-pushing, or leaking credentials, and it's why promotion is a separate, human-driven step.

Enforcement lives entirely in the agent image, not the CLI:

Claude Code uses a system-managed PreToolUse hook (/etc/claude-code/) that allows read-only git but hard-blocks any mutating command. It fires even under --dangerously-skip-permissions, backed by deny rules for the mutating subcommands and a ~/.claude/CLAUDE.md instruction.
Codex uses execpolicy rules that allow read-only subcommands and leave git otherwise forbidden, plus a ~/.codex/AGENTS.md instruction.

The instruction files are re-seeded into the home volume on every start, so the policy applies to login volumes created before it existed too.
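The policy is a default-deny allowlist: a short set of read-only subcommands passes and everything else is refused. The sketch below only illustrates that shape. git_policy is a hypothetical function, not the hook or the execpolicy rules shipped in the image, and it lists only the five subcommands named above, while the real allowlist is longer.

```shell
# Hypothetical illustration of an allowlist with a default-deny fallback.
# git_policy <subcommand>: prints "allow" or "deny".
git_policy() {
  case $1 in
    status|log|diff|show|blame) echo allow ;;   # read-only subcommands named in this README
    *)                          echo deny ;;    # everything else, including unknown subcommands
  esac
}
```

With this shape, a subcommand nobody thought about (say, a future git feature) is denied until someone adds it, which is the safer failure mode for an autonomous agent.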

Ejecting

Nothing locks you in, in either direction:

Want agents that do use git? The policy is in the image, not the CLI. Build your own image with the hook and policy files removed, then point AGENTSPACE_IMAGE at it.
Want to drop agentspace entirely? Every task is just a standard agent/<repo>/<task> branch. Push it and carry on with plain git and your usual PR flow; there's nothing proprietary to migrate off.

For an even harder guarantee in the other direction, you can stub out the git binary in a custom image (not done by default, since some build tools read git metadata).

Language runtimes

The agent image is built on node:24-slim, so Node.js / JavaScript / TypeScript projects work out of the box. Other language runtimes (Python, Go, Rust, and so on) aren't preinstalled yet; first-class support for more is on the roadmap.

Until then, you don't have to wait: extend the image yourself and point AGENTSPACE_IMAGE at it.

# my-agent.Dockerfile
FROM ghcr.io/imrec/agentspace:latest
RUN apt-get update && apt-get install -y --no-install-recommends \
      python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

docker build -t my-agent -f my-agent.Dockerfile .
AGENTSPACE_IMAGE=my-agent agentspace spawn claude my-task

Building FROM ghcr.io/imrec/agentspace:latest keeps the bundled agents and the read-only-git policy; the same approach lets you bake in any other tooling your project needs. (For --docker tasks, language runtimes can instead live in the containers the agent runs; see Run Docker inside a task.)

OS support

| OS      | Status                       |
| ------- | ---------------------------- |
| macOS   | tested                       |
| Linux   | tested                       |
| Windows | unverified, expected to work |

agentspace depends only on Git, Docker, and Node.js, all of which run on Windows, so it should work there out of the box; it just hasn't been tested yet. If you run it on Windows, reports (success or bug) are very welcome.

Command reference

Run task commands from anywhere; the agent branch, its origin, and the base branch are all recorded in the workspace volume.

agentspace <setup-command> [args]     # spawn, pull, refresh, list, cache
agentspace <task> <command> [args]    # everything that acts on a workspace

Setup commands (spawn, pull, refresh, list, cache) come first. Every command that acts on an existing workspace takes the task first, then the command, like agentspace my-task check or agentspace my-task shell. Where a command takes a tool argument (spawn, refresh), the tool is codex or claude.

Spawn a workspace

agentspace spawn codex my-task     # seed the workspace and start a detached agent

agentspace spawn codex my-task --from git@<host>:me/other.git                # seed from a different repo
agentspace spawn codex my-task --from ../sibling-checkout --base develop     # seed from a local path + base

By default, spawn infers the repo from the current folder: its origin URL and the branch you have checked out (resolved to origin/<branch>). Pass --from <url|path> to seed from somewhere else; it accepts any clone source git understands (a remote URL or a local repo path). The recorded origin becomes that source, and later push and update commands target it, so make sure you have push access there. --base <branch> overrides the seeded branch; without it, a --from source uses the remote's default branch (its HEAD), and the current folder uses your checked-out branch. With neither --from nor a usable origin, spawn exits with an error.
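The precedence for the seeded branch can be written out as a small decision function. This is a sketch of the rules as described in the paragraph above, not the CLI's code; resolve_base is a hypothetical name and the output strings are only descriptive.

```shell
# Hypothetical helper: which branch does spawn seed from?
# resolve_base <from> <base> <current-branch>   (pass "" for an option that was not given)
resolve_base() {
  local from=$1 base=$2 current=$3
  if [ -n "$base" ]; then
    echo "$base (from --base)"
  elif [ -n "$from" ]; then
    echo "HEAD of $from (remote default branch)"
  else
    echo "origin/$current (checked-out branch)"
  fi
}
```

To see what a remote's default branch actually is before spawning with --from, `git ls-remote --symref <url> HEAD` prints the ref that HEAD points at.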

spawn creates a workspace volume agentspace-<task>-vol, clones origin into it, syncs the current branch, cuts an agent/<repo>/<task> branch, and starts a detached container named agentspace-<task>. It also creates a per-task agentspace-<task>-sessions volume holding the agent's conversation transcripts.

All tasks for a tool share one home volume (codex-home / claude-home) for tool credentials and config.

Work with a running agent

agentspace my-task shell      # open a shell in the agent container
agentspace my-task logs       # follow the container logs
agentspace my-task attach     # attach to the agent (restarts it first if stopped)
agentspace my-task restart    # recreate the agent on the latest image (revives it from a leftover volume)
agentspace my-task stop       # stop the container
agentspace my-task rm         # remove the container (its volumes survive until purge)
agentspace my-task purge      # remove the container and its workspace + sessions volumes
agentspace list               # list workspace tool, task, uptime, and running-agent status
agentspace list --status      # also compare workspaces to origin/base for git status
agentspace restart-all        # recreate every workspace's agent on the latest local image

restart recreates, so it picks up new images and credentials. A plain docker restart keeps a container's original image, so restart instead removes the container and starts a fresh one against the same workspace and sessions volumes — the agent resumes its task, now on the latest pulled image and with whatever credentials the shared home volume currently holds. Because it depends only on the workspace volume, restart also revives a task that was rm'd down to a bare volume (the one way back from rm short of re-spawning). restart-all does the same across every workspace at once; it does not pull, so run agentspace pull first when you want newer images.

Sessions survive restarts. Conversation transcripts live in the per-task agentspace-<task>-sessions volume (mounted over ~/.claude/projects or ~/.codex/sessions), separate from the shared home volume that holds credentials. On start the container resumes the task's most recent session if one exists, otherwise begins fresh, so attach drops you back into your ongoing conversation instead of a blank one.

Detaching. While attached, press Ctrl-P then Ctrl-Q in succession (the Docker standard) to detach your terminal and leave the agent running in the background. Run agentspace <task> attach to reattach. Ctrl-C is not forwarded into the agent, so it won't interrupt the running turn.

Review and promote

agentspace my-task check                       # is the work committed, pushed, up to date, or already on base?
agentspace my-task review                      # status, diff stat, and full diff
agentspace my-task update main                 # rebase the workspace onto origin/main
agentspace my-task commit ["optional msg"]     # show diff, generate a message, commit on accept (no push)
agentspace my-task push                        # push the workspace's branch to its origin
agentspace my-task commit-push ["optional msg"] # commit (same as above) and push in one step

check is a read-only health report: it fetches origin and tells you, in plain language, whether your latest changes are committed, whether every commit is on the remote, whether your branch is behind the base, and whether every branch commit is already on the base, with the exact command to fix each gap. It exits non-zero when something still needs doing, so it doubles as a pre-merge gate in scripts.
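Because check reports through its exit status, a script can use it as a gate. A minimal sketch follows; premerge_gate is a hypothetical wrapper name and the messages are illustrative, while the only agentspace call it makes is the check command described above.

```shell
# Hypothetical wrapper around `agentspace <task> check` for use in scripts.
# premerge_gate <task>: succeeds only when check exits zero.
premerge_gate() {
  local task=$1
  if agentspace "$task" check; then
    echo "ok: $task is ready to merge"
  else
    echo "blocked: $task still needs work (see the check output above)" >&2
    return 1
  fi
}
```

A typical use is `premerge_gate fix-login && git merge origin/agent/my-project/fix-login`, so the merge never runs while check still reports a gap.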

commit and commit-push show the staged diff before committing (with your host git identity) and pushing. By default they generate the commit message for you: the workspace's tool drafts a one-line subject from the staged diff and drops it into an editable prompt, so you can accept it as-is or tweak it. Pass a message argument (commit "my message") only when you want to write it yourself; then it's used as-is after a yes/no confirm, with no generation step.

A task is a branch (agent/<repo>/<task>). push publishes it to origin; review and merge through your normal pull-request flow, which respects branch protection, required checks, and reviews. For a direct local merge instead:

git switch <base>
git pull --ff-only
git merge origin/agent/<repo>/<task>
git push origin <base>

Preview results in a browser

See the agent's work running, without committing first, by starting a preview: a sidecar container that mounts the same workspace volume, runs a dev server, and publishes a port.

# Node example: install deps and run a dev server, published on localhost:5173
agentspace my-task preview node:24-slim --port 5173 \
  --cmd "npm install && npm run dev -- --host 0.0.0.0 --port 5173"

agentspace my-task preview-logs      # follow install/build output and the server URL
agentspace my-task preview-restart   # recreate the preview from its saved settings
agentspace my-task preview-stop      # stop and remove the preview container

Because the preview shares the workspace volume, it sees edits live and hot-reloads as the agent works. The runtime comes from the image you name, so point it at python:3.12, rust:1, or anything else and supply the matching --cmd. --port accepts 5173 (published on 127.0.0.1), 8080:80 (host:container), or 0.0.0.0:8080:80 to expose it on your LAN. Run agentspace <task> preview with no arguments for usage; pass --replace to recreate a running preview.
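The three --port forms can be read as follows. describe_port is a hypothetical helper that only explains a spec; it is not how the CLI parses the flag. Two details are assumptions: that a bare port maps to the same port inside the container, and that the two-part form is also published on 127.0.0.1.

```shell
# Hypothetical helper: explain a --port spec in the three forms listed above.
# describe_port <spec>
describe_port() {
  local spec=$1 addr=127.0.0.1 host container
  case $spec in
    *:*:*)                       # address:host:container, e.g. 0.0.0.0:8080:80
      addr=${spec%%:*}
      spec=${spec#*:}
      host=${spec%%:*}
      container=${spec#*:}
      ;;
    *:*)                         # host:container, e.g. 8080:80 (loopback assumed)
      host=${spec%%:*}
      container=${spec#*:}
      ;;
    *)                           # bare port, e.g. 5173 (same container port assumed)
      host=$spec
      container=$spec
      ;;
  esac
  echo "$addr:$host -> container port $container"
}
```

For instance, `describe_port 0.0.0.0:8080:80` reports that port 80 in the container is reachable on every host interface at port 8080, which is the form that exposes the preview to your LAN.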

To recreate a preview without re-typing the image, ports, cmd, and env, use agentspace <task> preview-restart. It removes the container and starts a fresh one from the settings the preview was created with (stored on the container), so you get a clean slate — env files are re-read, picking up any host-side changes.

Environment variables. Pass --env-file <file> to load a host file of KEY=value lines (e.g. your project's .env) into the preview container, and --env KEY=value to set individual vars. Both are repeatable; --env wins over --env-file, and a later --env-file wins over an earlier one. The file is read from the host where you run agentspace (not from inside the workspace volume), so point it at a .env on your machine:

agentspace my-task preview node:24-slim --port 5173 --env-file .env \
  --env NODE_ENV=development \
  --cmd "npm install && npm run dev -- --host 0.0.0.0 --port 5173"
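The precedence rules (later --env-file beats earlier, --env beats both) amount to "last writer wins" over an ordered list of sources. The sketch below reproduces that ordering so you can predict the effective value of a variable before starting a preview. merge_env is a hypothetical helper, and its line parsing (plain KEY=value, everything else skipped) is a simplifying assumption rather than the CLI's actual parser.

```shell
# Hypothetical helper: compute the effective environment from env files and overrides.
# merge_env <env-file>... -- <KEY=value>...
merge_env() {
  (
    while [ $# -gt 0 ] && [ "$1" != "--" ]; do
      cat "$1" && echo          # echo guards against a file with no trailing newline
      shift
    done
    [ $# -gt 0 ] && shift       # drop the "--" separator
    for pair in "$@"; do printf '%s\n' "$pair"; done
  ) | awk -F= '
    /^[A-Za-z_][A-Za-z0-9_]*=/ {
      key = $1
      if (!(key in val)) order[++n] = key      # remember first-seen order
      val[key] = substr($0, length(key) + 2)   # later lines overwrite earlier ones
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" val[order[i]] }
  '
}
```

Running `merge_env .env.defaults .env -- NODE_ENV=development` prints one line per variable with the value that would win under the rules above.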

The server must bind 0.0.0.0, not localhost. A server listening only on 127.0.0.1 inside the container is unreachable from the host. Most dev servers need a flag for this (Vite --host 0.0.0.0, Next.js -H 0.0.0.0, Django runserver 0.0.0.0:8000, or HOST=0.0.0.0).

When any previews exist, list shows a PREVIEW column with each task's published port, and rm/purge tear the preview down along with the task.

Run Docker inside a task

Some projects need Docker themselves: a Postgres for local dev, a containerized app, a docker compose stack. Spawn the task with --docker and it gets its own isolated, nested Docker daemon, running as a sidecar:

agentspace spawn claude my-app --docker
agentspace my-app docker-logs    # follow the daemon's startup / pulls / builds

The agent (and any preview) can then use docker run, docker compose, and docker build normally. This never exposes the host's Docker: the host socket is root-equivalent and is never mounted into a task; containers the agent starts live inside the nested daemon's namespace. Tasks spawned without --docker get no daemon, no per-task network, and no Docker access.

State persists in a per-task agentspace-<task>-docker-lib volume (images, build cache, volumes, DB data), surviving restart/stop and host reboots. The sidecar follows the agent's lifecycle, and purge removes it along with the network and data volume. When any task has a nested daemon, list shows a DOCKER column with the daemon's status.

Reaching services. The agent, preview, and nested daemon share a private per-task network on which the daemon is named services. Anything the agent publishes inside the daemon is reachable at services:<port>: e.g. run docker run -d -p 5432:5432 postgres and point your app at services:5432.

Reaching it from your browser. Host port publishing is fixed when the daemon starts, so name the ports you want at spawn time with --ports (same forms as preview, plus ranges):

agentspace spawn claude my-app --docker --ports 8000-8010
# inside the agent: docker run -d -p 8005:8080 webapp  ->  http://localhost:8005

Security: what `--docker` does and doesn't protect against

Be clear-eyed about this boundary: it is weaker than a task without --docker, and it is not a sandbox for untrusted code:

Protected: an agent mishap. An agent acting in good faith but autonomously (YOLO mode) can't reach your machine through the nested daemon: no host Docker socket, no host mounts, so even docker run -v /:/host … mounts the sidecar's filesystem, not yours. The worst it can do is wreck its own throwaway daemon and per-task volumes. ✔
Not protected: a deliberate escape. The nested daemon runs --privileged (Docker-in-Docker requires it) and the agent fully controls it. Code that is actively trying to break out, such as a prompt-injection payload, can start a privileged nested container with well-known paths to the host kernel. Assume an attacker who can inject instructions into the agent can reach the host. ✘

So enable --docker only for repositories and prompts you would already trust on your machine. If you need a hard boundary against hostile code, run agentspace on a host that provides one: a stronger container runtime such as Sysbox, a user-space kernel sandbox such as gVisor, or a microVM (Kata Containers, Firecracker).

Sharing image pulls across tasks (optional cache)

Each task's daemon is isolated, so concurrent --docker tasks each pull the same images independently, which is wasteful on bandwidth and Docker Hub rate-limits when you run many at once. Set AGENTSPACE_DOCKER_MIRROR=1 to enable a shared pull-through cache: a single registry:2 proxy every --docker task uses as a Docker Hub mirror. The first task to need an image fetches it; the rest are served locally.

AGENTSPACE_DOCKER_MIRROR=1 agentspace spawn claude a --docker
AGENTSPACE_DOCKER_MIRROR=1 agentspace spawn claude b --docker   # b's pulls hit the cache

agentspace cache status     # up | stopped | not created
agentspace cache up         # pre-warm / start it
agentspace cache down       # stop it (keeps cached layers in its volume)

Saves bandwidth, pull time, and rate-limit pressure (N tasks become one upstream pull). Does not save disk: each daemon still unpacks its own copy.
Covers Docker Hub only; ghcr.io/quay.io/etc. pull directly (which still covers most base images: postgres, redis, nginx, node, python, …).
A host-wide singleton, kept running across tasks (not removed by purge); manage it with agentspace cache.
Optional AGENTSPACE_DOCKERHUB_USER / AGENTSPACE_DOCKERHUB_PASSWORD let the cache authenticate its own upstream pulls for a higher rate limit.
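Whether a given image benefits from the cache depends on which registry its name resolves to. The sketch below applies Docker's usual naming rule (the first path component is a registry host only if it contains a dot or a colon, or is localhost) to tell Docker Hub images from the rest. is_docker_hub_image is a hypothetical helper, not an agentspace command.

```shell
# Hypothetical helper: does this image reference resolve to Docker Hub?
# is_docker_hub_image <image>: exit status 0 for Docker Hub, 1 for any other registry.
is_docker_hub_image() {
  local first
  case $1 in
    */*) first=${1%%/*} ;;     # inspect the component before the first slash
    *)   return 0 ;;           # no slash, e.g. "postgres:16": a Docker Hub official image
  esac
  case $first in
    docker.io|index.docker.io) return 0 ;;   # Docker Hub spelled out explicitly
    localhost|*.*|*:*)         return 1 ;;   # looks like a registry host: ghcr.io, quay.io, host:5000
    *)                         return 0 ;;   # e.g. "library/redis" or "myuser/app" on Docker Hub
  esac
}
```

So `is_docker_hub_image postgres:16` succeeds (served through the mirror), while `is_docker_hub_image ghcr.io/imrec/agentspace:latest` fails (pulled directly).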

Naming

For a repository named example and task fix-login:

| Resource         | Name                                       |
| ---------------- | ------------------------------------------ |
| Container        | agentspace-fix-login                       |
| Workspace volume | agentspace-fix-login-vol                   |
| Sessions volume  | agentspace-fix-login-sessions              |
| Home volume      | codex-home / claude-home (shared per tool) |
| Branch           | agent/example/fix-login                    |

With --docker, a task also gets a daemon sidecar agentspace-fix-login-docker, a private network agentspace-fix-login-net, and a data volume agentspace-fix-login-docker-lib.

The container and volume are keyed by task name only; the repo name appears only in the branch. So task names must be unique across all your repositories. spawn refuses a task name that already has a workspace; remove it first with agentspace <task> rm (or purge).
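When scripting around agentspace (for example to inspect a volume with plain docker), the names can be derived from the repo and task. The helper below is a hypothetical convenience that just restates the naming scheme in this section; it is not an agentspace command.

```shell
# Hypothetical helper: print the resource names for a repo and task.
# agentspace_names <repo> <task> [docker]   (pass "docker" to include the --docker resources)
agentspace_names() {
  local repo=$1 task=$2
  printf 'container=%s\n'        "agentspace-$task"
  printf 'workspace_volume=%s\n' "agentspace-$task-vol"
  printf 'sessions_volume=%s\n'  "agentspace-$task-sessions"
  printf 'branch=%s\n'           "agent/$repo/$task"
  if [ "${3:-}" = "docker" ]; then
    printf 'docker_sidecar=%s\n' "agentspace-$task-docker"
    printf 'docker_network=%s\n' "agentspace-$task-net"
    printf 'docker_volume=%s\n'  "agentspace-$task-docker-lib"
  fi
}
```

Note that only the branch line uses the repo name, which is why two repositories cannot both have a task called fix-login at the same time.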

Updating the agent image

The CLI keeps the agent image fresh on its own: spawn pulls it at most once a day (tracked in ~/.agentspace/state.json), so you pick up new releases without a registry round-trip on every run. Force a refresh anytime:

agentspace pull            # update the local image only
agentspace pull -r         # also reboot running agents onto it (-y to skip the prompt)

A plain pull leaves running agents on their original image until they're recreated. pull -r removes and re-creates every workspace not already on the new image (agents running or stopped on an older one, plus tasks left as a bare volume) against the same workspace and sessions volumes, so each agent resumes its conversation where it left off. Recreating an agent interrupts its in-flight turn, which is why pull -r asks for confirmation first. (To recreate every workspace regardless of image, use agentspace restart-all.)
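The once-a-day pull that spawn performs is a simple freshness check against a stored timestamp. The sketch below shows the idea with epoch seconds. pull_is_due is a hypothetical function, the 24-hour window is an assumption about what "once a day" means, and the layout of ~/.agentspace/state.json is not documented here, so the sketch takes the timestamp as an argument instead of reading the file.

```shell
# Hypothetical freshness check (not the CLI's code).
# pull_is_due <last-pull-epoch> <now-epoch>: succeeds when at least 24 hours have passed.
pull_is_due() {
  [ $(( $2 - $1 )) -ge 86400 ]
}
```

In a script this reads as `pull_is_due "$last" "$(date +%s)" && agentspace pull`, where last is whatever timestamp you recorded after your previous pull.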

Refresh shared tool credentials without touching workspaces or sessions:

agentspace refresh claude
agentspace refresh claude -y  # skip the restart confirmation

The refresh runs the tool's login flow in a temporary container against the shared home volume (claude-home / codex-home), then restarts the agents for that tool that were already running. Session transcripts stay in each task's own agentspace-<task>-sessions volume, so the restarted agents resume their existing conversations. Claude's container login follows the normal Claude Code flow: if the browser callback cannot reach the container, copy the login URL and paste the resulting code back into the terminal.

The image is published to ghcr.io/imrec/agentspace:latest by the Publish agent image workflow (multi-arch amd64/arm64). The daily rebuild is cacheless, so each image carries the latest Codex / Claude Code / opencode.

Configuration

Bring your own skills, commands, and settings

Drop your own Claude Code / Codex customizations into ~/.agentspace and every task picks them up automatically, with no per-spawn flags. The directory mirrors the in-container home layout, split by tool:

~/.agentspace/
├── claude/                 # overlaid onto ~/.claude in the container
│   ├── settings.json       #   your settings (cannot weaken the git guardrails)
│   ├── skills/             #   your skills
│   ├── commands/           #   your slash commands
│   ├── agents/             #   your subagents
│   └── CLAUDE.md           #   your global memory (kept; managed note appended)
└── codex/                  # overlaid onto ~/.codex in the container
    ├── config.toml         #   your Codex config
    ├── prompts/            #   your saved prompts
    └── AGENTS.md           #   your global guidance (kept; managed note appended)

Because it's just a folder, it travels well: keep it in a dotfiles repo, sync it across machines, or share a team baseline. Point AGENTSPACE_CONFIG_HOME elsewhere to use a different location.
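To start from an empty overlay, you only need the directories from the tree above. The function below is a hypothetical convenience, not something the CLI ships. It creates the folders and leaves the files (settings.json, CLAUDE.md, config.toml, AGENTS.md) for you to add.

```shell
# Hypothetical helper: create the overlay directory skeleton.
# scaffold_agentspace_config [dir]
# Defaults to $AGENTSPACE_CONFIG_HOME, then ~/.agentspace. Prints the directory it used.
scaffold_agentspace_config() {
  local root=${1:-${AGENTSPACE_CONFIG_HOME:-$HOME/.agentspace}}
  mkdir -p \
    "$root/claude/skills" \
    "$root/claude/commands" \
    "$root/claude/agents" \
    "$root/codex/prompts" || return 1
  echo "$root"
}
```

Since mkdir -p is idempotent, running it against an existing overlay leaves your files untouched.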

Untested. Mounting your user config into every task is a new feature that hasn't been thoroughly tested yet. It may not behave as expected; reports (success or bug) are very welcome.

On every container start the folder is mounted read-only (so a YOLO agent can't rewrite your source) and copied into the home volume, where the tool can read and update it. The managed "never touch git" guardrails are then re-asserted on top, so your config can extend the environment but never drop them. Hard enforcement lives in /etc and on the host (see Git writes are off-limits to agents, on purpose), outside any volume you can reach, so a custom settings.json cannot re-enable git. Edits apply on the next restart, the same as a credential refresh.

Environment overrides

| Variable | Purpose |
| -------- | ------- |
| AGENTSPACE_IMAGE | use a different/pinned tag or a locally built agent image. Setting it disables the daily auto-pull (you manage updates). |
| AGENTSPACE_CONFIG_HOME | host directory for your skills/commands/settings overlay (default ~/.agentspace). |
| AGENTSPACE_GIT_IMAGE | override the git-helper image (default alpine/git:latest). |
| AGENTSPACE_STATE_DIR | override where the last-pull timestamp is stored. |
| AGENTSPACE_DOCKER_IMAGE | override the nested Docker daemon image for --docker tasks (default docker:dind). |
| AGENTSPACE_DOCKER_MIRROR | enable the shared pull-through cache for --docker tasks. |
| AGENTSPACE_REGISTRY_IMAGE | override the cache's registry:2 image. |
| AGENTSPACE_DOCKERHUB_USER / AGENTSPACE_DOCKERHUB_PASSWORD | authenticate the cache's upstream pulls for a higher rate limit. |

To build and test the image locally:

npm run image:build                       # builds ghcr.io/imrec/agentspace:latest
AGENTSPACE_IMAGE=ghcr.io/imrec/agentspace:latest agentspace spawn codex my-task

The GHCR package must be public for unauthenticated docker pull, or run docker login ghcr.io first.

Limitations & contributing

Known gaps and rough edges; contributions are welcome:

Tools: only Codex and Claude Code are wired up today.
Language runtimes: the image ships with Node.js; broader runtime support is on the roadmap. For now, bring your own image.
Windows: unverified (see above).
Pull-through cache: mirrors Docker Hub only; ghcr.io / quay.io pull directly.
--docker is not a security sandbox against hostile code; see the security note.

Found a bug or want a feature? Open an issue or PR at github.com/ImreC/agentspace. For a tour of the codebase and the conventions to follow, read AGENTS.md.

Development

npm run dev -- <command>   # run the CLI from source with tsx
npm run build              # bundle to dist/index.mjs
npm run check              # oxlint + tsc
npm run test               # run the Vitest suite
npm run format             # format with oxfmt

License

MIT.