ultima-harness

v0.3.4

Published

19 days ago

Autonomous build pipeline with mechanical enforcement for Claude Code projects

0High
0Medium
0Low

gtbeltrani

claude-code autonomous-build ai-agent build-pipeline mechanical-enforcement

ultima-harness

Ultima Harness is a Windows-first autonomous build harness for Claude Code projects.

It turns a project idea into a PRD, agent-spec contracts, a task checklist, and a Ralph build loop where worker Claude sessions implement tasks in isolated git worktrees. Workers can commit code, but they do not decide whether the work passes. Ralph runs mechanical verification, merges passing work, discards failed worktrees, and updates BUILD_STATUS.md.

Status

Package version: 0.3.4
Supported OS: Windows
Default git base branch for this repository: main
Rust/Cargo verification remains supported
Package: shell is the supported non-Rust verifier path

What Is Included

The npm package includes the runtime artifacts needed to run without building from source during setup:

| Component | Purpose | | --- | --- | | ultima-harness CLI | Machine setup, project init, repair, doctor, uninstall | | agent-spec.exe | Spec parsing, linting, task planning, lifecycle and scenario verification | | harness.exe | Claude Code hook handler binary from claude-code-harness | | context-mode | MCP server and hook scripts for context routing | | claude-mem daemon files | MCP search/runtime server, context generator, worker service, status helpers | | /ultima-* commands | Claude Code operator commands for greenfield planning, production-change lanes, build, review, save/load | | ralph.sh | Worktree-isolated autonomous build loop | | guardrail scripts | Worker safety checks for dangerous commands and low-quality edits |

Requirements

Install these before using the harness:

| Tool | Why | | --- | --- | | Node.js 18+ and npm | Install and run the package | | Claude Code CLI | Operator and worker agent runtime | | Git | Project history and worktree isolation | | Git Bash | Runs generated shell scripts on Windows | | Python 3 | Project guardrail scripts | | Go/Rust toolchains | Only needed when rebuilding artifacts from source |

The package is declared win32 because setup and Ralph currently assume Windows paths, Git Bash, and Windows Terminal behavior.

Install

npm install -g ultima-harness

The postinstall step copies:

agent-spec.exe to ~/.ultima-harness/bin/agent-spec.exe
harness.exe to ~/.ultima-harness/bin/harness.exe

If the install is incomplete, ultima-harness setup and ultima-harness repair package fail with explicit missing-artifact messages.

Machine Setup

Run once per machine:

ultima-harness setup

Setup does the following:

Writes ~/.ultima-harness/config.json
Deploys the bundled module to ~/.ultima-harness/modules/ultima-harness/
Copies runtime dependencies and claude-mem daemon sidecars
Registers MCP servers:
- context-mode
- claude-mem
Wires Claude Code hooks into ~/.claude/settings.json
Registers the bundled plugin for skills, agents, and modes
Installs agent-spec.exe into ~/.ultima-harness/bin/

Use this after package upgrades if you want the global hook/MCP install refreshed:

ultima-harness repair hooks

Project Setup

Inside a git repository:

git init
ultima-harness init
git add .
git commit -m "Initial Ultima project"

ultima-harness init creates or updates:

| Path | Purpose | | --- | --- | | .gitignore | Ignores harness runtime state: .ralph/, .claude/state/, .claude/sessions/ | | .claude/commands/ | /ultima-* operator commands | | .claude/settings.json | Project guardrail hooks with absolute paths | | CLAUDE.md | Short project usage note | | docs/ | PRD and documentation location | | specs/ | agent-spec contract location | | scripts/ralph.sh | Build loop | | scripts/ultima-init.sh | Phase 0 scaffold/lint/plan helper | | scripts/guardrail-check.py | Worker command/edit guardrail | | scripts/code-quality-check.py | Worker code quality guardrail |

Existing generated files are skipped if current. If an owned generated file has local edits, repair backs it up before replacing it.

For an already-initialized project after a package upgrade:

ultima-harness repair project
git add .gitignore scripts/ralph.sh .claude/commands scripts/ultima-init.sh
git commit -m "Update Ultima harness project files"

You only need repair project for projects initialized before the package fix or projects where generated files are stale.

Operator Workflow

Run these inside Claude Code from the project root.

| Command | Purpose | | --- | --- | | /ultima-prd | Interview for a PRD and write docs/*-prd.md | | /ultima-plan | Convert PRD phases into specs and BUILD_STATUS.md | | /ultima-build | Run Ralph for the next incomplete phase | | /ultima-review | Manually inspect phase status and lifecycle verification | | /ultima-edit | Make deliberate PRD/spec edits | | /ultima-fix | Create a production bugfix lane for existing-project delta work | | /ultima-feature | Create a scoped production feature lane | | /ultima-optimize | Create a behavior-preserving optimization lane | | /ultima-security | Create a security hardening lane | | /ultima-accept | Approve or request changes after mandatory human review | | /ultima-save | Save current progress context | | /ultima-load | Restore and cross-check saved progress context |

Normal flow:

/ultima-prd
/ultima-plan
git add docs specs BUILD_STATUS.md .claude/commands/ultima-plan.md
git commit -m "Add PRD and phase specs"
/ultima-build
/ultima-review

Commit the PRD/spec baseline before /ultima-build. Ralph requires HEAD because it creates worktrees and verifies changes against the phase contract.

Production Change Lanes

For existing projects, do not make a fresh PRD just to fix a bug, add a small feature, optimize behavior, or harden security. Use a production-change lane instead:

/ultima-fix       # bug/regression
/ultima-feature   # scoped feature delta
/ultima-optimize  # performance/resource work without behavior changes
/ultima-security  # security hardening or vulnerability fix
git add docs/changes specs BUILD_STATUS.md
git commit -m "Add production change lane"
/ultima-build
/ultima-accept

These commands replace /ultima-prd and /ultima-plan only for delta work. They are deliberately more structured than a normal ad hoc edit: each lane first reads the existing project context, interviews for the lane-specific facts, validates the resulting intake packet, and asks for explicit operator approval before writing files. The next command after committing the generated lane baseline is /ultima-build; do not run /ultima-plan for these phases.

Each lane writes:

docs/changes/phase<N>-<type>-<slug>.md
specs/phase<N>-<type>-<slug>.spec.md
appended phase tasks in BUILD_STATUS.md

Lane intake requirements:

| Command | Required intake | | --- | --- | | /ultima-fix | Expected behavior, actual behavior, reproduction steps, affected surface, regression test contract, and rollback notes. If reproduction is missing, the lane must be investigation-only. | | /ultima-feature | User outcome, new behavior, compatibility expectations, docs or migration impact, and regression checks. Interface choices should be confirmed rather than silently invented. | | /ultima-optimize | Baseline command, baseline metric, target metric, allowed tradeoffs, and behavior-preservation tests. Behavior changes are forbidden unless the operator explicitly allows them. | | /ultima-security | Threat model, protected asset, entry point, attacker capability, impact, safe exploit or regression test, negative tests, and security review checklist. |

Production-change specs include prod-change plus one lane tag: bugfix, feature, optimization, or security. They must also use the normal agent-spec contract rules, including exact Package: shell / Filter: selectors and whole-phase-verifiable first tasks. If a spec uses Package: shell, the implementation task must include every required tests/<Filter>.sh script instead of deferring verifier creation to a later task.

The lane command should lint the generated spec with:

agent-spec lint specs\phase<N>-<type>-<slug>.spec.md

Ralph completes the phase, runs lifecycle verification, appends HUMAN_REVIEW_REQUIRED: phase<N>, and stops. /ultima-build refuses to start later production-change work until /ultima-accept records both:

HUMAN_REVIEW_APPROVED: phase<N>
PRODUCTION_READY: phase<N>

If the operator requests changes, /ultima-accept records review feedback under docs/reviews/, adds follow-up tasks, and sends the phase back through /ultima-build.

Specs And Verification

Specs live under specs/phase<N>*.spec.md.

Use path-only entries under Allowed Changes:

## Boundaries
### Allowed Changes
- calc.py
- tests/

Do not write prose as an allowed path, such as Create tests in tests/; agent-spec treats allowed-change bullets as literal path patterns.

Shell Selectors

For non-Rust CLI or script projects, use Package: shell:

Scenario: valid addition
  Test:
    Package: shell
    Filter: calc_valid_add
  Given calc.py exists at project root
  When the user runs `python calc.py 2 3`
  Then stdout contains exactly `5` and exit code is 0

This maps to:

tests/calc_valid_add.sh

The shell script is run from the repository root. Exit 0 means pass; nonzero means fail.

Missing tests/<Filter>.sh is a verifier failure.

Cargo Selectors

For Rust projects, keep using Cargo selectors:

Test:
  Package: my_crate
  Filter: test_name

Ralph tells workers to follow the selector. It does not tell Python projects to create Cargo tests.

Ralph Build Loop

/ultima-build runs:

./scripts/ralph.sh <phase> --max-iterations <N> --disallowed-tools Agent -- --dangerously-skip-permissions

Each iteration:

Creates .ralph/work/p<phase>-i<iteration> on a throwaway branch
Writes worker-specific .claude/settings.json
Opens a worker Claude session in that worktree
Waits for the done sentinel
Cleans worker-only Claude runtime state before verification
Runs optional project test command or Cargo fallback
Runs agent-spec verify --change-scope worktree
Merges passing commits back to the main worktree
Marks the next BUILD_STATUS.md task complete
Discards failed worktrees and retries

Ralph is fail-closed:

no worker commit: discard
verifier missing: fail before launch
no initial commit: fail before launch
verification failure: discard
timeout: leave the worktree for manual recovery and do not merge

When a phase finishes, Ralph adds PHASE_COMPLETE and runs lifecycle verification.

Runtime State

The harness writes local runtime files during normal use:

.ralph/
.claude/state/
.claude/sessions/

Generated projects ignore these paths. They should not be committed and should not count against agent-spec boundary checks.

If an old project is missing the ignores:

ultima-harness repair project
git add .gitignore
git commit -m "Ignore Ultima runtime artifacts"

claude-mem

Setup registers claude-mem as an MCP server and wires SessionStart/Stop hooks.

Expected behavior:

SessionStart starts or health-checks the worker daemon
SessionStart injects bounded recent context as valid Claude hook JSON
MCP tools are available for search and observation fetch
Hook failures produce safe fallback JSON and stderr diagnostics
Hook timeouts are bounded so memory cannot stall Claude startup indefinitely

Useful checks:

claude mcp list
bash "$USERPROFILE/.ultima-harness/modules/ultima-harness/scripts/hook-handlers/memory-session-start.sh"

The viewer URL printed by claude-mem context may require the viewer UI assets to be present. MCP/search success and SessionStart injection are the primary functional checks.

Repair Commands

ultima-harness repair package
ultima-harness repair hooks
ultima-harness repair claude-mem
ultima-harness repair project

| Command | What it repairs | | --- | --- | | repair package | Required packaged artifacts and native SQLite load | | repair hooks | Deployed module, MCP servers, global hooks, plugin registration | | repair claude-mem | claude-mem settings, credentials, daemon cleanup | | repair project | Project-owned scripts, commands, hooks, .gitignore |

Doctor

ultima-harness doctor

doctor is read-only. It checks prerequisites, config, project files, and claude-mem health. Use repair commands for modifications.

Build And Test This Package

From the repository root:

npm.cmd run build:artifacts
npm.cmd run smoke:package
npm.cmd run smoke:installed
cargo test --manifest-path src\agent-spec\Cargo.toml
Push-Location src\claude-code-harness\go; go test ./internal/hookhandler/ -run Setup -v; Pop-Location

Packaging locally:

npm.cmd pack
npm.cmd install -g D:\Projects\ultima-harness\ultima-harness-0.3.2.tgz
ultima-harness repair hooks

Publishing

Before publishing, ensure package.json has a version not already present on npm:

npm.cmd view ultima-harness version
npm.cmd publish

Do not publish generated local runtime folders or tarballs. The package uses a files allowlist.

Repository Hygiene

Default branch is main.
Do not merge release work into initial-package.
Do not commit .claude/, .omx/, .ralph/, or generated .tgz files.
Keep Cargo verification behavior intact when adding non-Rust verifier paths.
Keep shell verification explicit through Package: shell.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

ultima-harness

Status

What Is Included

Requirements

Install

Machine Setup

Project Setup

Operator Workflow

Production Change Lanes

Specs And Verification

Shell Selectors

Cargo Selectors

Ralph Build Loop

Runtime State

claude-mem

Repair Commands

Doctor

Build And Test This Package

Publishing

Repository Hygiene