@agenticengineeringagency/ultimate-harness
v0.7.0
Ultimate Harness
Ultimate Harness is a runtime-agnostic software-development harness for agentic engineering work.
It sits above coding agents and agent runtimes. Instead of becoming "one more coding agent", it standardizes the durable artifacts and lifecycle around agentic work:
request / issue / spec
-> workflow profile
-> mission packet
-> runtime adapter
-> runtime execution with sandbox policy
-> verification result
-> human review
-> promotion into canonical project state

The goal is to combine proven patterns from specification-driven development, agent workflow systems, skill libraries, and sandboxing tools into a practical harness for planning, specifying, executing, verifying, and safely iterating on software work.
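As a rough illustration, the lifecycle listed above can be modeled as an ordered set of stages. The stage names are copied from the list; the `LIFECYCLE_STAGES` constant and the `nextStage` helper are hypothetical names used only for this sketch and are not part of the UH codebase.

```typescript
// Stage names are taken from the lifecycle list; everything else is illustrative.
const LIFECYCLE_STAGES = [
  "request / issue / spec",
  "workflow profile",
  "mission packet",
  "runtime adapter",
  "runtime execution with sandbox policy",
  "verification result",
  "human review",
  "promotion into canonical project state",
] as const;

type LifecycleStage = (typeof LIFECYCLE_STAGES)[number];

// Returns the stage that follows `stage`, or null once the final stage is reached.
function nextStage(stage: LifecycleStage): LifecycleStage | null {
  const index = LIFECYCLE_STAGES.indexOf(stage);
  return index < LIFECYCLE_STAGES.length - 1 ? LIFECYCLE_STAGES[index + 1] : null;
}
```

Modeling the stages as a closed union means a misspelled stage name is rejected at compile time, which mirrors the harness's general preference for failing early on typos.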
Current status
UH ships an end-to-end CLI with a schema-backed artifact lifecycle and five wired adapters (hermes, codex, hermes-proxy, openrouter active; oh-my-pi experimental). Sandboxes support git-worktree (default) and directory backends (container planned). Latest release: v0.7.0 on npm. See docs/ROADMAP.md for status and CHANGELOG.md for release notes.
| Adapter | Status | Notes |
|---|---|---|
| hermes | active | Reference adapter. Pinned to Hermes Agent ≥ 0.14.0. |
| codex | active | Drives codex exec --sandbox workspace-write --json --output-last-message against codex-cli ≥ 0.130.0. Verified end-to-end against the live ChatGPT backend. |
| oh-my-pi | experimental | Drives omp --print --mode json. Missions can route to any OMP-supported model (including Anthropic-tier via OMP's stealth surface) by setting runtime_config_overrides.model. Read docs/runbooks/anthropic-via-omp.md before routing Claude through OMP; the ToS posture is documented there. |
| hermes-proxy | active | HTTP client targeting a local hermes proxy instance (Hermes Agent ≥ 0.14.0). Officially sanctioned OAuth-backed subscription routing — replaces the OMP stealth path. See docs/architecture/adapter-hermes-proxy.md and docs/runbooks/hermes-proxy-setup.md. |
| openrouter | active | OpenAI-compat HTTP client for openrouter.ai — the cheapest pay-per-token routing target. API key via OPENROUTER_API_KEY (never the manifest); a missing key makes uh adapter check openrouter degrade gracefully. See docs/runbooks/openrouter-setup.md. |
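One way to read "degrade gracefully" for the openrouter adapter is that a missing key produces a reportable status instead of a crash. The sketch below assumes that reading. The `AdapterCheck` shape and the `checkOpenRouter` function are hypothetical, and the real output of `uh adapter check` is not shown in this README.

```typescript
// Hypothetical result shape for an adapter check; not UH's actual schema.
type AdapterCheck =
  | { adapter: string; status: "ready" }
  | { adapter: string; status: "degraded"; reason: string };

// The key is read from the environment only, never from the adapter manifest.
function checkOpenRouter(env: Record<string, string | undefined>): AdapterCheck {
  const key = env["OPENROUTER_API_KEY"]?.trim();
  if (!key) {
    return {
      adapter: "openrouter",
      status: "degraded",
      reason: "OPENROUTER_API_KEY is not set",
    };
  }
  return { adapter: "openrouter", status: "ready" };
}
```

In a Node or Bun process this would be called as `checkOpenRouter(process.env)`. Passing the environment in as an argument keeps the check easy to test without touching real credentials.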
Cross-cutting protocols every adapter participates in:
- UH-28 runtime-final-message capture: every adapter prompts the model to emit a fenced uh-runtime-final-message block; the harness extracts it into runtime-final.txt for cross-runtime parity. See the protocol section of docs/architecture/runtime-adapter-contract.md.
- UH-26 per-runtime strict runtime_config validation: typos in adapter manifests fail at load time.
- UH-27 / UH-33 mission runtime_config_overrides: missions override adapter defaults per-run with the same typo safety.
- UH-34 untracked-file diff capture: diff.patch includes brand-new files, not just modified-tracked ones.
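To make the UH-28 capture step concrete, here is a minimal sketch of pulling a sentinel block out of raw model output. It assumes the sentinel is a Markdown-style fence whose info string is `uh-runtime-final-message`; the authoritative format lives in docs/architecture/runtime-adapter-contract.md. The `extractFinalMessage` function and the choice to keep the last block are assumptions made for illustration.

```typescript
// Three backticks, built programmatically so this example can sit inside a fence.
const FENCE = "`".repeat(3);

// Returns the body of the last sentinel block, or null when the output has none.
function extractFinalMessage(output: string): string | null {
  // Assumed shape: opening fence + info string, body lines, closing fence.
  const sentinel = new RegExp(
    `${FENCE}uh-runtime-final-message[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n${FENCE}`,
    "g",
  );
  let last: string | null = null;
  let match: RegExpExecArray | null;
  while ((match = sentinel.exec(output)) !== null) {
    last = match[1].trim();
  }
  return last;
}
```

Keeping the last match rather than the first is a guess at a sensible default: if a runtime echoes the prompt, an earlier block may be the instruction text and not the model's answer. A `null` result corresponds to the "when present" qualifier on runtime-final.txt.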
Documentation
Start with the vision, the documentation home, and the roadmap. Direct links:
- VISION — what UH is, who it's for, and what we won't accept
- Glossary
- Product requirements
- MVP scope
- Architecture overview
- Runtime adapter contract — includes the UH-28 sentinel protocol
- Mission packet schema
- Verification and promotion lifecycle
Runbooks:
- Codex E2E smoke
- Anthropic via oh-my-pi
- Hermes Proxy setup
- Hermes Proxy E2E smoke (UH-38 promotion record)
- OpenRouter setup
- Sandbox backends
- Publishing
- Honcho persistent memory (oh-my-pi)
Install
bun add -g @agenticengineeringagency/ultimate-harness
uh --help

The package is published to the npm registry and is installable with Bun's package manager. The CLI binary is uh.
Quick start
bun install
bun run build

# Initialize .harness/ project state.
uh init
# Confirm a runtime is available (one of: hermes, codex, hermes-proxy, openrouter, oh-my-pi).
uh adapter check hermes
# Create and validate a mission packet.
uh mission create m1-example \
--title "Example mission" \
--workflow spec-first-feature \
--objective "Demonstrate the mission lifecycle"
uh validate --all-missions
# Render the runtime invocation without launching.
uh mission dry-run .harness/missions/m1-example/mission.yaml --runtime hermes
# Execute the mission. --runtime accepts: hermes | codex | hermes-proxy | openrouter | oh-my-pi.
uh mission run .harness/missions/m1-example/mission.yaml --runtime hermes
# Run the mission's declared verification checks.
uh verify m1-example
# Record a human promotion decision.
uh promote m1-example --approved-by "Reviewer Name" --change README.md
# Inspect harness state.
uh status

For the package bin and dev loop:
bun run dev -- --help # tsx-driven dev runner
node dist/cli.js --help # built CLI
npm link && uh --help # local bin install after build

These commands are for local development from a source checkout; the published package is installed as described under Install.
Mission-level runtime overrides
Missions select which model / runtime config to use per-run without editing the shared adapter manifest:
# .harness/missions/<id>/mission.yaml
runtime_config_overrides:
model: anthropic/claude-opus-4-7
thinking: medium

Mission overrides merge over the adapter manifest's config.runtime_config and are strict-validated by the per-runtime Zod schema, so typos fail fast.
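The merge-then-validate behavior can be sketched without any dependencies. This is not the harness's implementation: a plain allow-list stands in for the per-runtime Zod schema, and `ALLOWED_KEYS` and `applyOverrides` are hypothetical names. The `model` and `thinking` keys come from the example above; the real set of accepted keys per runtime is not documented here.

```typescript
type RuntimeConfig = Record<string, unknown>;

// Hypothetical allow-list for a single runtime, standing in for a strict schema.
const ALLOWED_KEYS = new Set(["model", "thinking"]);

// Rejects unknown override keys, then layers overrides on top of manifest defaults.
function applyOverrides(manifest: RuntimeConfig, overrides: RuntimeConfig): RuntimeConfig {
  const unknown = Object.keys(overrides).filter((key) => !ALLOWED_KEYS.has(key));
  if (unknown.length > 0) {
    throw new Error(`Unknown runtime_config keys: ${unknown.join(", ")}`);
  }
  // Later spreads win, so mission overrides take precedence over the manifest.
  return { ...manifest, ...overrides };
}
```

With this sketch, an override of `{ thinking: "medium" }` keeps the manifest's `model` untouched, while a typo such as `{ modle: "..." }` throws before any runtime is launched.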
Durable artifacts
Mission-scoped:
mission.yaml— schema-backed mission packet.prompt.md— rendered runtime prompt for the run.runtime-session.yaml— runtime command, args, status, timestamps, exit code.events.ndjson— runtime lifecycle + adapter-specific event stream.runtime-final.txt— model's one-paragraph summary (UH-28 sentinel-extracted when present).runtime-result.yaml— terminal status + artifact refs.diff.patch—git diffincluding untracked new files (UH-34).verification.yaml—uh verifyoutput.promotion.yaml— human approval / rejection / deferral.
Project-scoped: .harness/project.yaml, .harness/workflows/, .harness/adapters/, .harness/sandboxes/, .harness/skills/, .harness/audit/events.ndjson.
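The mission-scoped layout can be expressed as a small path helper. The file names come from the list above and the `.harness/missions/<id>/` prefix matches the Quick start commands. The `MISSION_ID` pattern is an assumed constraint chosen for illustration (the Safety model section only says IDs are constrained to avoid path traversal), and `missionArtifactPath` is a hypothetical name.

```typescript
// Artifact file names as listed for mission-scoped state.
const MISSION_ARTIFACTS = [
  "mission.yaml",
  "prompt.md",
  "runtime-session.yaml",
  "events.ndjson",
  "runtime-final.txt",
  "runtime-result.yaml",
  "diff.patch",
  "verification.yaml",
  "promotion.yaml",
] as const;

type MissionArtifact = (typeof MISSION_ARTIFACTS)[number];

// Assumed ID rule: lowercase letters, digits, and hyphens; no dots or slashes.
const MISSION_ID = /^[a-z0-9][a-z0-9-]*$/;

// Builds the repository-relative path for one artifact of one mission.
function missionArtifactPath(missionId: string, artifact: MissionArtifact): string {
  if (!MISSION_ID.test(missionId)) {
    throw new Error(`Invalid mission ID: ${missionId}`);
  }
  return `.harness/missions/${missionId}/${artifact}`;
}
```

Because the ID pattern excludes `.` and `/`, a value such as `../etc` is rejected before it can be joined into a path.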
Safety model
Ultimate Harness is designed around explicit gates rather than direct mutation of canonical state:
- Schemas validate every persisted artifact (uh.project.v0, uh.workflow.v0, uh.mission.v0, uh.runtime-session.v0, uh.runtime-result.v0, uh.verification.v0, uh.promotion.v0).
- Mission IDs, workflow profile names, and artifact paths are constrained to avoid path traversal.
- Runtime artifact persistence refuses symlinked .harness, mission directories, or artifact targets.
- Sandboxes are git-worktree-backed; missions run with cwd set to the sandbox path. Bound mission packets are seeded into the worktree at create time (UH-29).
- Codex runs with --sandbox workspace-write; oh-my-pi runs with --no-extensions --no-skills by default.
- Promotion is a separate human approval step. A promoted decision is blocked unless verification.yaml is passed.
- Event streams and YAML records provide an audit trail for execution, verification, and promotion.
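The promotion gate in the list above can be sketched as a single guard. The `promoted` decision and the `passed` status are taken from the text; the `rejected` and `deferred` value names, the `VerificationRecord` shape, and the `assertPromotable` function are assumptions for illustration. The sketch also assumes that only a `promoted` decision requires a passing verification, which is one reading of the rule.

```typescript
// "promoted" comes from the README; the other two names are assumed spellings.
type Decision = "promoted" | "rejected" | "deferred";

// Hypothetical minimal view of verification.yaml; null means the file is absent.
type VerificationRecord = { status: string } | null;

// Throws when a promotion is attempted without a passing verification.
function assertPromotable(decision: Decision, verification: VerificationRecord): void {
  if (decision !== "promoted") {
    return; // Assumed: rejections and deferrals are recorded without a gate.
  }
  if (verification === null || verification.status !== "passed") {
    throw new Error("Promotion blocked: verification.yaml is not passed");
  }
}
```

Treating a missing verification record the same as a failed one keeps the gate closed by default, so a mission that was never verified cannot be promoted by accident.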
Inspiration
Ultimate Harness studies and selectively integrates ideas from:
- Specsafe — specification safety and issue-driven development.
- BMAD Method — structured agent roles and delivery workflows.
- superpowers — composable agent capabilities.
- GSD — fresh-context execution and durable project context.
- matt-pocock/skills — focused reusable engineering skills.
- oh-my-openagent — multi-agent harness patterns.
- OpenSpec — artifact-guided specification workflows.
- Hermes Agent — wired as the reference adapter.
- Codex CLI — wired adapter for OpenAI's coding agent.
- oh-my-pi and Pi — wired oh-my-pi adapter; Pi tracked as a future addition.
- AgentFS: copy-on-write sandboxing patterns (design at docs/architecture/sandbox-agentfs.md).
See the comparison matrix and adopt/reject/defer log for the current design position.
Project vision
- Specification-first planning and execution.
- Portable mission packets for bounded agentic work.
- Runtime adapters for multiple coding agents.
- Reusable skills and workflow profiles.
- Sandboxed environments for safer autonomous development.
- Structured verification and human approval gates.
- Clear audit trails for decisions, file changes, checks, and promotion.
