@jaredlikes/openclaw-plugin-ansible
v0.1.4
Secure mesh coordination plugin for OpenClaw: invite/join handshake, resilient task+message routing, capability lifecycle, and delegation/execution skill-pair contracts.
OpenClaw Plugin: Ansible
Secure mesh control plane for OpenClaw: invite/join handshake, durable routing, and auditable delegation contracts.
Ansible enables a single agent identity (e.g., "Jane") to operate seamlessly across multiple devices. It synchronizes tasks, messages, and shared context in real-time using CRDTs (Yjs) over a secure mesh network (Tailscale), with explicit safety and governance gates for high-risk operations.
This repo also documents a pragmatic way to use Ansible as a reliable inter-agent communication substrate today: treat the shared Yjs document as the durable inbox, and treat auto-dispatch as an optimization (not the only delivery mechanism).
Four Pillars
| Pillar | What It Does |
|---|---|
| Ring of Trust | Invite/join admission, auth-gate tickets, token lifecycle, and optional signed capability manifest verification |
| Mesh Sync | Yjs CRDT replicated state over Tailscale for durable messages, tasks, and context |
| Capability Routing | Capability contracts with delegation + execution skill-pair semantics and governance gates |
| Lifecycle Ops | Lock sweep, retention, coordinator sweeps, receipts, backpressure, and escalation controls |
Full Functional Surface
- Invite/join onboarding with token exchange and optional auth-gate websocket ticket handshake before Yjs attach.
- Durable shared-state transport for messages and tasks with per-agent delivery tracking.
- Reconcile heartbeat + retry logic so missed task/message injection is recovered automatically.
- Send receipts so operators can see when agents place work onto the mesh.
- Capability lifecycle (publish/unpublish/list/health/evidence) with provenance trust controls.
- Delegation policy distribution + per-node ACK tracking.
- Task governance (claim/accept/close, SLA sweep, backpressure policy, escalation controls).
- Admin controls (gateway admin nomination/distribution plus scoped token patterns).
Delegation + Execution Skill Pair Model
- A capability publish registers a contract, not just a label.
- Each capability contract references a delegation skill (requester side) and an execution skill (executor side).
- The delegation skill defines when/how to create tasks/messages and expected ACK/completion semantics.
- The execution skill defines accept-to-close lifecycle behavior in one run, including replies and error paths.
- Publishing/unpublishing updates routing eligibility, and lifecycle evidence records install/wire outcomes per target.
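To make the skill-pair model concrete, here is a hypothetical TypeScript shape for a capability contract. The field names are illustrative assumptions, not the plugin's actual schema; the sample values mirror the `capability publish` CLI example later in this README.

```typescript
// Hypothetical contract shape (assumed field names, not the plugin schema).
interface CapabilityContract {
  id: string;                                      // e.g. "cap.example"
  name: string;
  version: string;                                 // semver
  owner: string;                                   // executor-side owner
  delegationSkill: { name: string; version: string }; // requester side
  executionSkill: { name: string; version: string };  // executor side
  contract: string;                                // schema URI
  riskClass?: "low" | "high";                      // gates human approval
}

// Values taken from the publish example in the CLI section below.
const example: CapabilityContract = {
  id: "cap.example",
  name: "Example",
  version: "1.0.0",
  owner: "executor",
  delegationSkill: { name: "ansible-delegate-example", version: "1.0.0" },
  executionSkill: { name: "ansible-executor-example", version: "1.0.0" },
  contract: "schema://ansible/cap.example/1.0.0",
};
```
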
Key Concepts
Hemispheres vs. Friends (Default: Friends)
Ansible coordinates hemispheres — mirrored instances of the same agent identity that share memory, context, and purpose. Think of it like one brain controlling multiple bodies:
| | Hemispheres (Ansible) | Friends / Employees |
|---|---|---|
| Identity | Same agent (e.g., "Jane" on VPS + "Jane" on Mac) | Different agents (e.g., "Jane" and "Alex") |
| Memory | Shared via CRDT sync | Separate memory stores |
| Purpose | Same goals, different capabilities | Different roles and responsibilities |
| Communication | Self-to-self (direct, efficient) | Inter-agent (polite, contextual) |
| Session | Shared session state | Independent sessions |
A hemisphere is your agent's presence on another machine.
In many setups, you do not want every agent to see cross-node context or have inbound ansible messages routed into the default agent. In those setups, treat nodes as friends/employees and centralize ops in a single operator agent (for example, an "Architect").
Node Topology
- Backbone nodes (always-on) — Servers, VPS instances. Handle long-running tasks, scheduled work, background coordination. Host the Yjs WebSocket server.
- Edge nodes (intermittent) — Laptops, desktops. Have local filesystem access, run interactively with the user. Connect to backbone on startup.
Prerequisites
1. OpenClaw
Install OpenClaw on all nodes:
npm install -g openclaw

Each node needs a working OpenClaw gateway (`openclaw gateway`, or managed via launchd/systemd).
2. Tailscale
Ansible uses Tailscale for secure, zero-config networking between nodes.
- Install Tailscale on all nodes: tailscale.com/download
- Sign in to the same Tailscale network (tailnet) on every node
- Enable MagicDNS in your Tailscale admin console so you can use hostnames like `jane-vps` instead of IPs
- Verify connectivity between nodes: `tailscale ping <other-node-hostname>`
Important Tailscale details:
- Backbone peers in the ansible config MUST use Tailscale MagicDNS hostnames or Tailscale IPs (100.x.y.z), NOT SSH aliases or public IPs
- If running inside Docker, Tailscale runs on the host, not inside the container. The container reaches Tailscale peers via the host's network
- The ansible WebSocket port (default 1235) is separate from the OpenClaw gateway port (default 18789) — never mix them
3. Ansible Skill (Recommended)
Install the companion skill so your agent knows how to use ansible:
cd ~/.openclaw/workspace/skills
git clone https://github.com/likesjx/openclaw-skill-ansible.git ansible

Restart your OpenClaw gateway to pick up the skill.
To enforce base ansible skills across all configured agent workspaces on a gateway:
openclaw ansible skills sync --skill ansible
openclaw ansible skills verify --skill ansible

`sync` links the skill into each agent workspace and is safe by default (it will not replace existing mismatched paths unless you pass `--force-replace`).
Sync Skill Registry Entries Across Workspaces
If you want custom skills (for example `ansible-codex-comm`) to be slash-addressable in multiple workspace contexts, sync the `### Available skills` block in `AGENTS.md`:
just sync-agents-skills-dry-run
just sync-agents-skills

This uses `scripts/sync-agents-skills.sh` and reads target workspaces from `~/.openclaw/openclaw.json`.
Installation
1. Install the Plugin
On every node:
openclaw plugins install likesjx/openclaw-plugin-ansible

For local development:

openclaw plugins install /path/to/repo --link

2. Configure
Add the ansible plugin to ~/.openclaw/openclaw.json on each node.
Backbone Node (VPS / Docker)
// ~/.openclaw/openclaw.json
{
"plugins": {
"entries": {
"ansible": {
"enabled": true,
"config": {
"tier": "backbone",
"listenPort": 1235,
"listenHost": "0.0.0.0",
"authGate": {
"enabled": true,
"nodeIdParam": "nodeId",
"inviteParam": "invite",
"ticketParam": "ticket",
"requireTicketForUnknown": true,
"authPort": 1236,
"exchangePath": "/ansible/auth/exchange",
"ticketTtlSeconds": 60,
"requireNodeProof": true,
"rateLimitMax": 30,
"rateLimitWindowSeconds": 60
},
"manifestTrust": {
"allowUnsignedLegacy": true,
"trustedPublisherKeys": {
"vps-jane": "-----BEGIN PUBLIC KEY-----\n<ed25519-pubkey>\n-----END PUBLIC KEY-----"
}
},
"capabilities": ["always-on"]
}
}
}
}
}

When running in Docker, expose the port bound to your Tailscale IP:
# docker-compose.yml
services:
  jane:
    ports:
      # Bind to Tailscale IP only (NOT 0.0.0.0) for security
      - "100.x.y.z:1235:1235"

Edge Node (Mac / Laptop)
// ~/.openclaw/openclaw.json
{
"plugins": {
"entries": {
"ansible": {
"enabled": true,
"config": {
"tier": "edge",
"backbonePeers": [
"ws://jane-vps:1235"
],
"dispatchHeartbeatSeconds": 20,
"sendReceiptAgents": ["architect"],
"capabilities": ["local-files", "voice"]
}
}
}
}
}

`backbonePeers` must use Tailscale MagicDNS hostnames or Tailscale IPs. SSH config aliases do NOT work here.
When `authGate.enabled=true`, unknown nodes can be admitted in two ways: with an invite token or with a one-time ticket. With an invite token, connect as:
ws://jane-vps:1235/?nodeId=<new-node-id>&invite=<invite-token>
Known authorized nodes reconnect with nodeId only.
For stricter admission, set requireTicketForUnknown=true and use one-time short-lived tickets:
# on inviter/backbone
openclaw ansible ws-ticket --token <invite-token> --node <new-node-id> --ttl-seconds 60

Then connect with:
ws://jane-vps:1235/?nodeId=<new-node-id>&ticket=<ws-ticket>
You can also mint tickets via HTTP exchange (no gateway admin token required):
curl -sS -X POST http://jane-vps:1236/ansible/auth/exchange \
-H 'content-type: application/json' \
-d '{
"inviteToken": "<invite-token>",
"nodeId": "<new-node-id>",
"nonce": "n-123456",
"clientPubKey": "<PEM-public-key>",
"clientProof": "<base64-signature>"
}'

Then connect with the returned ticket:
ws://jane-vps:1235/?nodeId=<new-node-id>&ticket=<ticket>
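The `clientProof` in the exchange request can be produced with Node's built-in Ed25519 support. A minimal sketch follows; exactly which bytes the plugin expects to be signed is an assumption here (the nonce alone), so treat docs/protocol.md as authoritative.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Hypothetical sketch: derive clientProof by signing the nonce with an
// Ed25519 key. The signed payload is an assumption; see docs/protocol.md.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const nonce = "n-123456";
const payload = Buffer.from(nonce, "utf8");

// Node's Ed25519 API takes null for the digest argument.
const clientProof = sign(null, payload, privateKey).toString("base64");
const clientPubKey = publicKey.export({ type: "spki", format: "pem" }).toString();

// Server side: verify the proof against the submitted public key.
const ok = verify(null, payload, publicKey, Buffer.from(clientProof, "base64"));
```

`clientPubKey` and `clientProof` would then fill the `clientPubKey` / `clientProof` fields of the JSON body shown above.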
Architect-Managed (Recommended for Multi-Agent Ops)
If you want ansible to be operated only by a dedicated agent (e.g., Architect), disable:
- prompt context injection
- auto-dispatch of inbound ansible messages into the default agent
{
"plugins": {
"entries": {
"ansible": {
"enabled": true,
"config": {
"tier": "edge",
"backbonePeers": ["ws://jane-vps:1235"],
"injectContext": false,
"dispatchIncoming": false
}
}
}
}
}

In this mode, the operator agent should poll and respond using tools like `ansible_read_messages` and `ansible_send_message`.
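A minimal sketch of such a poll-and-respond tick, with the agent tools stubbed as injected async functions. The real `ansible_read_messages` / `ansible_send_message` / `ansible_mark_read` tools are invoked by the agent runtime; the function signatures and message shape here are illustrative assumptions.

```typescript
// Illustrative operator loop for Architect-managed mode. Tool signatures
// and the message shape are assumptions, not the plugin's actual API.
type AnsibleMsg = { id: string; from: string; text: string };

async function operatorTick(
  readUnread: () => Promise<AnsibleMsg[]>,           // ansible_read_messages
  send: (to: string, text: string) => Promise<void>, // ansible_send_message
  markRead: (ids: string[]) => Promise<void>,        // ansible_mark_read
): Promise<number> {
  const inbox = await readUnread();
  for (const msg of inbox) {
    // Respond (or delegate) per message, then acknowledge below.
    await send(msg.from, `ack ${msg.id}`);
  }
  await markRead(inbox.map((m) => m.id));
  return inbox.length; // messages handled this tick
}
```

Because the Yjs document is durable, a crashed operator simply re-reads unread messages on the next tick; nothing is lost between polls.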
Reliability & Delivery Semantics (If You Want To Rely On It)
Ansible has two distinct mechanisms:
- Durable state replication: messages/tasks/context are written into the shared Yjs document and replicated across nodes.
- Auto-dispatch (optional): when a node observes inbound work (messages, and explicitly-assigned tasks) in the shared Yjs doc, it can inject that work into the agent loop as a normal inbound turn.
What this means today:
- Messages are durable (persist in the Yjs doc; readable via `ansible_read_messages`; visible in context injection if enabled).
- Auto-dispatch is best-effort realtime + reconnect-safe:
  - New messages dispatch immediately while connected.
  - On reconnect (provider `sync=true`), the dispatcher reconciles backlog deterministically (timestamp order) and injects any undelivered items.
  - Heartbeat reconciliation (default every 20s) re-scans pending deliveries so missed observe events do not strand messages/tasks.
  - Dispatch failures are retried with exponential backoff (with jitter) instead of being "seen forever".
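The reconcile-and-retry mechanics above can be sketched as follows. This is illustrative only, with assumed names and shapes, not the plugin's actual internals.

```typescript
// Illustrative sketch of dispatch reconciliation: deterministic backlog
// ordering plus exponential backoff with jitter. Names are assumptions.
type PendingDelivery = { id: string; ts: number; attempts: number };

// Deterministic replay: oldest first, ties broken by id, so every node
// reconciles undelivered items in the same order after a reconnect.
function reconcileOrder(pending: PendingDelivery[]): PendingDelivery[] {
  return [...pending].sort((a, b) => a.ts - b.ts || a.id.localeCompare(b.id));
}

// Exponential backoff with full jitter: the delay ceiling doubles per
// failed attempt (capped), and the actual delay is randomized so retries
// across nodes do not synchronize.
function retryDelayMs(attempts: number, baseMs = 1_000, capMs = 60_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempts);
  return Math.floor(Math.random() * ceiling);
}
```
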
Send/delegate visibility:
- `ansible_send_message` and `ansible_delegate_task` emit a compact send receipt message.
- Default receipt recipients are:
  - the gateway admin agent
  - plus any configured `sendReceiptAgents`
If you want to "completely rely" on Ansible for inter-agent communication, treat the shared Yjs doc as the source of truth and the dispatcher as the delivery worker. You can still keep manual tools (`ansible_read_messages`, `ansible_find_task`) as an operator backstop.
For a concrete protocol and improvement plan, see docs/protocol.md.
For the practical "how do I add a new agent/gateway" guide, see docs/setup.md.
3. Bootstrap the Network
- Start the backbone: restart OpenClaw on the VPS.
- Bootstrap (run on the backbone node): `openclaw ansible bootstrap`
- Invite edge nodes (run on the backbone): `openclaw ansible invite --tier edge --node <expected-node-id>`
- Join (run on each edge node): `openclaw ansible join --token <token-from-invite>`
How It Works
Message Dispatch
When one hemisphere sends a message, the ansible dispatcher automatically injects it into the receiving hemisphere's agent loop — just like a Telegram or Twitch message would. The agent processes it as a full turn and can reply, call tools, or delegate tasks.
Replies are delivered back through the Yjs document automatically.
Important: backlog is durable and will also be delivered on reconnect via reconciliation; this is what makes restarts/offline edges reliable.
Session Isolation
Each sender gets a separate ansible session (`ansible:{nodeId}`). Conversation history is preserved per hemisphere, so ongoing coordination has continuity. This mirrors how Telegram creates per-chat sessions.
State Sync
All state is synchronized via Yjs CRDTs:
- Messages: Inter-hemisphere communication
- Tasks: Delegated work items with claim/complete lifecycle
- Context: Current focus, active threads, recent decisions
- Pulse: Online status and heartbeat data
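Yjs handles convergence automatically; the plain-TypeScript sketch below only illustrates the property that makes the sync safe. Merging two replicas' message logs as a union keyed by message id, ordered by timestamp, yields the same log on every node regardless of delivery order. This is a conceptual simplification, not how Yjs represents state internally.

```typescript
// Conceptual sketch only: Yjs CRDTs provide this convergence for real.
type MeshMessage = { id: string; ts: number; from: string; text: string };

// Union by id, then deterministic ordering: the result is independent of
// which replica's log arrives first (commutative) and of re-merging the
// same log twice (idempotent).
function mergeLogs(a: MeshMessage[], b: MeshMessage[]): MeshMessage[] {
  const byId = new Map<string, MeshMessage>();
  for (const m of [...a, ...b]) byId.set(m.id, m);
  return [...byId.values()].sort(
    (x, y) => x.ts - y.ts || x.id.localeCompare(y.id),
  );
}
```

Commutativity and idempotence are what let hemispheres sync in any order; real Yjs documents track far richer causality than this.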
Agent Tools
| Tool | Description |
|---|---|
| ansible_status | Check who's online, what they're working on, pending tasks |
| ansible_delegate_task | Create a task for another hemisphere |
| ansible_claim_task | Pick up a pending task |
| ansible_complete_task | Mark a claimed task as done |
| ansible_send_message | Send a message (targeted or broadcast) |
| ansible_update_context | Update your current focus, threads, or decisions |
| ansible_read_messages | Read messages (unread by default) |
| ansible_mark_read | Mark messages as read |
| ansible_delete_messages | Operator-only emergency purge (destructive; strongly discouraged for agent workflows) |
| ansible_get_coordination | Read coordinator configuration (who coordinates, sweep cadence) |
| ansible_set_coordination_preference | Record your preferred coordinator/cadence (per-node preference) |
| ansible_set_coordination | Set coordinator configuration (initial setup or last-resort failover) |
| ansible_set_retention | Configure coordinator roll-off (daily prune of closed tasks by TTL) |
| ansible_get_delegation_policy | Read shared delegation policy + per-agent ACK records |
| ansible_set_delegation_policy | Coordinator-only publish/update delegation policy (+ optional notify) |
| ansible_ack_delegation_policy | Record this agent's ACK for the current policy version/checksum |
`ansible_delete_messages` is intentionally high-friction (confirm token + required justification + explicit filters) and should only be used by human operators for emergency cleanup. It is hard-gated to nodes that advertise the `admin` capability.
CLI Commands
openclaw ansible status # Show network health and nodes
openclaw ansible nodes # List authorized nodes
openclaw ansible tasks # View shared task list
openclaw ansible send --message "hi" # Send a manual message
openclaw ansible retention set # Configure closed-task roll-off (coordinator-only service)
openclaw ansible messages-delete --dry-run --from architect --reason "Emergency cleanup of stale chatter"
openclaw ansible delegation show # Show policy + ACK status
openclaw ansible delegation set # Publish policy from markdown file (coordinator-only)
openclaw ansible delegation ack # ACK current policy
openclaw ansible capability list # List published capability contracts + eligibility
openclaw ansible capability publish --id cap.example --name "Example" --version 1.0.0 --owner executor --delegation-skill-name ansible-delegate-example --delegation-skill-version 1.0.0 --executor-skill-name ansible-executor-example --executor-skill-version 1.0.0 --contract schema://ansible/cap.example/1.0.0
openclaw ansible capability publish --id cap.highrisk --name "High Risk" --version 1.0.0 --owner executor --delegation-skill-name ansible-delegate-highrisk --delegation-skill-version 1.0.0 --executor-skill-name ansible-executor-highrisk --executor-skill-version 1.0.0 --contract schema://ansible/cap.highrisk/1.0.0 --approval-artifact CAB-1234 --approval-note "Approved in CAB on 2026-03-03"
openclaw ansible capability unpublish --id cap.example
openclaw ansible tasks claim <taskId> --eta-seconds 900 --plan "scan, patch, validate" # emits accepted ACK contract
openclaw ansible bootstrap # Initialize as first node
openclaw ansible invite --tier edge --node <expected-node-id> # Generate node-bound invite token
openclaw ansible join --token <tok>              # Join with invite token

Capability publish provenance (G2_PROVENANCE) verifies signed manifests with configured trusted keys:
- signature formats: `ed25519:<base64>` or `ed25519:<keyId>:<base64>`
- trust source: `manifestTrust.trustedPublisherKeys`
- legacy fallback: `manifestTrust.allowUnsignedLegacy` (only when `signedManifestRequired=false`)
- high-risk governance: when `riskClass=high` and `requiresHumanApprovalForHighRisk=true`, publish requires `approval_artifact_id` (CLI: `--approval-artifact`)
- publish-path safety: manifest secret-like literals are blocked before publish; lifecycle metadata is redacted for sensitive keys/patterns
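A sketch of how the documented signature string formats could be parsed and checked against `manifestTrust.trustedPublisherKeys`. This is illustrative, not the plugin's code; only the two string formats above are taken from the docs.

```typescript
import { createPublicKey, verify } from "node:crypto";

// Illustrative verifier for "ed25519:<base64>" and
// "ed25519:<keyId>:<base64>" signature strings. Not the plugin's code.
function verifyManifestSig(
  manifestBytes: Buffer,
  signature: string,
  trustedPublisherKeys: Record<string, string>, // keyId -> PEM public key
): boolean {
  const parts = signature.split(":");
  if (parts[0] !== "ed25519") return false;
  let keyIds: string[];
  let sigB64: string;
  if (parts.length === 3) {
    keyIds = [parts[1]];            // keyId-qualified: try that key only
    sigB64 = parts[2];
  } else if (parts.length === 2) {
    keyIds = Object.keys(trustedPublisherKeys); // unqualified: try all
    sigB64 = parts[1];
  } else {
    return false;
  }
  const sig = Buffer.from(sigB64, "base64");
  return keyIds.some((id) => {
    const pem = trustedPublisherKeys[id];
    return pem ? verify(null, manifestBytes, createPublicKey(pem), sig) : false;
  });
}
```
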
Gateway Transport Security (CLI -> Gateway)
By default, the CLI targets local loopback (http://127.0.0.1:<port>), which is acceptable for local-only traffic.
For remote gateway calls, use HTTPS:
export OPENCLAW_GATEWAY_URL="https://gateway.example.com"

Security guardrail:
- The CLI refuses non-loopback `http://` endpoints by default.
- To override intentionally (not recommended), set `OPENCLAW_ALLOW_INSECURE_REMOTE_HTTP=1`.
External Coding Agent Token Lifecycle (Recommended)
Use a two-step flow so admins never hand out long-lived tokens directly:
# 1) Admin issues temporary invite (single-use, short TTL)
openclaw ansible agent invite --id codex --ttl-minutes 15 --as admin --token "$OPENCLAW_ANSIBLE_TOKEN"
# 2) Agent accepts invite and receives permanent token (rotated on accept)
openclaw ansible agent accept --invite-token <temp_invite_token> \
  --write-token-file ~/.openclaw/runtime/ansible/codex.token

Notes:
- Invite tokens are one-time and expire automatically.
- Accepting an invite mints a permanent `agent_token` and invalidates the invite.
- Any other outstanding invites for the same agent are revoked after successful accept.
- Admin can inspect invite state with `openclaw ansible agent invites` (or `openclaw ansible agent invites --all`).
- Admin can inspect non-secret auth lifecycle metadata with `openclaw ansible agent list` (token hint + issued/rotated/accepted timestamps).
- Admin-sensitive operations (invite, token issue, destructive message delete) require a valid admin `agent_token`.
External Agent Rotation Runbook
Two supported rotation paths:
- Immediate rotate (admin-driven):

openclaw ansible agent token-issue --id codex

- Re-invite rotate (recommended for unattended coding agents):

openclaw ansible agent invite --id codex --ttl-minutes 15 --as admin --token "$OPENCLAW_ANSIBLE_TOKEN"
openclaw ansible agent accept --invite-token <temp> --write-token-file ~/.openclaw/runtime/ansible/codex.token

Recommended policy:
- Rotate every 30 days (or immediately after suspected exposure).
- Prefer re-invite flow when you need explicit handoff/acceptance proof.
Automatic Token Storage Options
Choose one primary storage path per coding agent:
- Environment variable (simple)
export OPENCLAW_ANSIBLE_TOKEN="<agent_token>"

- Restricted runtime file (recommended baseline)

openclaw ansible agent accept --invite-token <temp> \
  --write-token-file ~/.openclaw/runtime/ansible/codex.token
chmod 600 ~/.openclaw/runtime/ansible/codex.token

- OS key vault / secret manager (best for production)
  - macOS: Keychain
  - Linux: `pass`, Secret Service, or a cloud secret manager
  - Windows: Credential Manager / DPAPI-backed store
For automation, retrieve the token from the vault at process start and export it into `OPENCLAW_ANSIBLE_TOKEN` in memory only.
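As a sketch of the restricted-runtime-file baseline, an agent process might refuse a token file that is readable by group/other before keeping the token in memory. The function name and behavior here are illustrative, not part of the plugin.

```typescript
import { readFileSync, statSync } from "node:fs";

// Illustrative loader: enforce chmod-600 on the token file, then keep the
// secret in-memory only (never write it elsewhere or log it).
function loadAgentToken(path: string): string {
  const mode = statSync(path).mode & 0o777;
  if ((mode & 0o077) !== 0) {
    throw new Error(`${path} is readable by group/other; run: chmod 600 ${path}`);
  }
  return readFileSync(path, "utf8").trim();
}

// e.g. process.env.OPENCLAW_ANSIBLE_TOKEN =
//   loadAgentToken(`${process.env.HOME}/.openclaw/runtime/ansible/codex.token`);
```
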
Updating (Maintainers + Users)
Maintainers (this repo)
This plugin is typically installed from GitHub, so dist/ must be committed.
- Make changes in `src/` and/or docs.
- Build: `npm ci && npm run build`
- Verify `dist/` changed as expected.
- Commit both `src/` and `dist/` (plus docs), then push.
Release Preflight (Plugin + Skill)
For npm plugin release readiness in this repo:
npm run release:preflight

For ClawHub skill readiness against the skill repo:

npm run test:skill:preflight -- --skill-dir=/Users/jaredlikes/code/openclaw-skill-ansible

For full dual-track release steps, see:
docs/release-dual-track-v1.md
Users (machines running OpenClaw)
After updating the plugin:
- Update the plugin checkout (either via `openclaw plugins update ansible` if you have an install record, or by re-running `openclaw plugins install likesjx/openclaw-plugin-ansible`).
- Run `openclaw ansible setup` to align skill + config (use `--dry-run` first if desired).
- Restart the gateway (`openclaw gateway restart`, or your supervisor).
`openclaw ansible setup` intentionally updates skill + config only. Plugin code update remains a separate explicit step.
Gateway Deploy Hygiene (Recommended)
To avoid recurring `dist/*` merge conflicts on gateways:
- Use the repo Node version from `.nvmrc` (22.22.0).
- Deploy from a clean checkout only.
- Use: `./scripts/safe-deploy-pull.sh`

This script:
- fails fast if tracked files are dirty,
- runs `git pull --ff-only`, and
- runs `npm run build`.
If it fails with a dirty tree, resolve/stash local changes first (do not force pull).
Troubleshooting
Connection Refused
- Check the backbone is running: `openclaw ansible status`
- Check the firewall: `sudo ufw allow 1235/tcp` (or allow on the Tailscale interface only)
- Docker: ensure `listenHost: "0.0.0.0"` is set in the backbone config
- Try using the Tailscale IP directly: `ws://100.x.y.z:1235`
Tailscale Issues
- Run `tailscale ping <hostname>` to verify the tunnel
- Ensure MagicDNS is enabled in your Tailscale admin console
- Inside Docker containers, Tailscale hostname resolution depends on host DNS — if DNS is broken on the host, containers will fail too
"Ansible not initialized"
- The gateway must be running and the Yjs document must be synced
- For edge nodes, wait for the first successful sync with the backbone
Node ID Shows Container Hash Instead of Hostname
- When running inside Docker, the hostname is the container ID (e.g., `2ad9255a2f3e`), not the Tailscale hostname
- This is cosmetic: messages still route correctly because the dispatcher processes all new messages regardless of the `to` field
Known Issues
- Gemini provider `.filter()` crash: if using the `google-gemini-cli` provider and the session transcript contains a corrupted `toolResult` message, the pi-ai library crashes with "Cannot read properties of undefined (reading 'filter')". Workaround: reset the session with `/new` or switch to a different provider. This is an upstream bug in `@mariozechner/pi-ai`.
Architecture
See docs/architecture.md for detailed technical architecture.
Naming and Trademark Notice
This project's "Ansible" name references the fictional ansible communication concept from science fiction and is unrelated to Red Hat Ansible or Ansible Automation Platform.
License
MIT
