subagent-harness
v0.5.0
Published
Parse, validate, and compose rich subagent definitions into runtime-specific formats.
Maintainers
Readme
subagent-harness
Evolve your agent in one place. Use it anywhere.
Stop copy-pasting prompts across .cursorrules, Claude Code, and your CI pipelines. subagent-harness takes your single markdown file and auto-magically translates it for every AI tool you use.
📖 Read the full story: Why AI Agents need a "Compiler" and Governance Flow
flowchart LR
subgraph State ["Scattered Agent State"]
direction TB
P[📝 Prompts]
C[⚙️ Configs]
S[🧩 Skills]
end
SSOT(("📄 Rich .agent.md<br/><b>(Single Source of Truth)</b>"))
COMPILER{{"⚡ subagent-harness<br/><i>(Compiler)</i>"}}
subgraph Runtimes ["Target Runtimes"]
direction TB
IDE["💻 IDE<br/><i>(Cursor, etc.)</i>"]
CLI["⌨️ CLI<br/><i>(Claude Code)</i>"]
PROD["🚀 Production<br/><i>(JSON Artifacts)</i>"]
end
State -->|"Flatten"| SSOT
SSOT -->|"Compile & Validate"| COMPILER
COMPILER --> IDE
COMPILER --> CLI
COMPILER --> PROD
%% Styling
style SSOT fill:#1f6feb,stroke:#58a6ff,stroke-width:3px,color:#ffffff
style COMPILER fill:#2d333b,stroke:#d29922,stroke-width:2px,color:#c9d1d9
style IDE fill:#2d333b,stroke:#238636,stroke-width:2px,color:#c9d1d9
style CLI fill:#2d333b,stroke:#238636,stroke-width:2px,color:#c9d1d9
style PROD fill:#2d333b,stroke:#238636,stroke-width:2px,color:#c9d1d9
style Runtimes fill:none,stroke:#8b949e,stroke-dasharray: 5 5,color:#8b949e
style State fill:none,stroke:#8b949e,stroke-dasharray: 5 5,color:#8b949e
style P fill:#22272e,stroke:#444c56,color:#c9d1d9
style C fill:#22272e,stroke:#444c56,color:#c9d1d9
style S fill:#22272e,stroke:#444c56,color:#c9d1d9Supported Targets
| Target | Generated Format | Purpose |
| ------ | ---------------- | ------- |
| Cursor | .cursor/agents/*.md | Local IDE autocomplete & agent generation |
| Codex | .codex/agents/*.md | OpenAI Codex CLI (codex exec --instructions); same markdown shape as Cursor |
| Claude Code | .claude/skills/*.md | Global CLI skills |
| Production | dist/agents/*.json | CI/CD pipelines & backend SDK consumption |
More integrations planned: Windsurf, Copilot, Cline.
How it works: The Output
You write one file. The harness generates the rest.
# 1. You write this (Single Source of Truth)
my-agents/
└── changelog-extractor.agent.md
# 2. Run subagent-harness
$ pnpm exec subagent-compose --apply
# 3. It generates these auto-magically:
.cursor/agents/
└── changelog-extractor.md # Formatted natively for Cursor
.codex/agents/
└── changelog-extractor.md # Same markdown; OpenAI Codex CLI (`runtime: codex` in config)
.claude/skills/
└── changelog-extractor/SKILL.md # Formatted natively for Claude Code
dist/
└── changelog-extractor.json # Strict JSON for your backend/CIThe Problem — "Living Agent" Drift
Agents are living artifacts. Their prompts, configs, and skills evolve constantly. When the same agent lives in two places — your IDE and your CI pipeline — iteration guarantees drift.
A ground-truth story
Bob builds changelog-extractor, a tiny sub-agent:
| Component | Initial state |
|-----------|---------------|
| Prompt | "Read recent commits, group by feature/fix, output a markdown list." |
| Skills | read_git_log · read_jira_ticket |
He copies this definition into Cursor (.cursor/agents/) for local dev, and into the CI pipeline (rich JSON) for automated GitHub Releases.
Then the iteration begins:
| # | What changed | Detail |
|---|-------------|--------|
| 1 | Format | Marketing wants customer-facing language → prompt rewritten |
| 2 | New skill | Needs PR context → adds fetch_pr_description |
| 3 | Config | LLM hallucinating → temperature: 0.1, hard constraint added |
Bob updates the CI config. He forgets to sync the Cursor copy.
Developers now run a stale local agent that hallucinates, misses PR context, and outputs raw technical jargon — while the production version works fine.
Root cause: No Single Source of Truth. No automated composition. Iterating on an agent guarantees environment drift.
How subagent-harness fixes it:
- Single file — Write
changelog-extractor.agent.mdonce. This is the only copy. - Auto-validate — On commit, the harness checks structure and required fields.
- Compose per runtime —
composeemits full config for CI, and stripped proprietary.mdfor Cursor. - Zero drift — Next iteration, edit one file. The change propagates everywhere automatically.
Quickstart
# 1. Install
pnpm add -D subagent-harness
# 2. Preview what the harness will generate (no file writes)
pnpm exec subagent-compose \
--src ~/my-custom-agents \
--dst ~/.cursor/agents \
--dry-run
# 3. Generate runtime-ready files
pnpm exec subagent-compose \
--src ~/my-custom-agents \
--dst ~/.cursor/agents \
--applyReload your IDE window → open the Subagents list → your agent is discovered, formatted, and in sync with the SSOT.
With subagent.config.json, compose only selected runtimes (issue #14):
pnpm exec subagent-compose --apply --target codex
pnpm exec subagent-compose --apply --target cursor --target production
pnpm exec subagent-compose --apply --target all # same as omitting --target--src / --dst legacy mode only allows --target cursor or --target all.
Production JSON contract (issue #15)
For runtime: production, compose runs an extra SSOT fidelity check on the JSON (including during --dry-run):
promptmust be a non-empty string- If the source declares a
modelwith a name (top-level or via the target’sprofile),modelin JSON must be an object with the samename(catches silent"inherited"/ wrong-provider fallbacks downstream) - If
.agent.ext.yamldefinescontentSchemaas a mapping, top-level keys must match the composed JSON
Any parse failure, validation error, or contract violation increments the failure count; the CLI exits 1 so CI can gate on subagent-compose --dry-run or --apply.
Drift check (optional): after --apply, confirm on-disk artifacts still match a fresh compose from SSOT:
pnpm exec subagent-compose --apply --target production
pnpm exec subagent-compose --verify --target production--verify requires at least one active production target. It exits 2 if combined with --apply or --clean, or if no production target is selected.
Full walkthrough: 5-Minute Quickstart
Built-in Quality Gates
It's not just a format converter; it's a governance pipeline. Before any agent is compiled, it passes through 3 layers of tests to ensure absolute reliability:
- Schema Validation (L1): Prevents missing fields, broken YAML frontmatter, or syntax errors.
- Runtime Contract (L2): Ensures the generated output matches exactly what Cursor/Claude actually supports.
- Live Smoke Check (L3): (Optional) Boots up a real process to verify the compiled agent won't crash your IDE or backend.
Run all layers locally:
pnpm test:e2eRun the reusable matrix runner (L1/L2 + strict production probe):
pnpm test:matrixTo include real Subject-Harness probes in strict mode:
SUBJECT_HARNESS_CLI_CMD='subject-harness run --artifact "$SUBJECT_ARTIFACT_PATH"' \
SUBJECT_HARNESS_API_MODULE=./path/to/subject-api-probe.mjs \
node scripts/run-matrix.mjs --targets "production,subject-cli,subject-api" --strictOptional L3 probes for composed Markdown agents: set CURSOR_RUNTIME_CHECK_CMD, CODEX_RUNTIME_CHECK_CMD, or CLAUDE_RUNTIME_CHECK_CMD to a shell command; pnpm test:l3 passes AGENT_FILE pointing at the temp artifact. Markdown probes use a 5-minute Vitest timeout so a real codex exec / IDE round-trip can finish. To require a probe (fail CI if unset), add that target to L3_REQUIRE_TARGETS (comma-separated, e.g. production,codex).
Example (OpenAI Codex CLI — instructions from the composed agent file on stdin):
export CODEX_RUNTIME_CHECK_CMD='codex exec -s read-only --skip-git-repo-check --ephemeral - < "$AGENT_FILE"'
pnpm test:l3Programmatic Embedding (CLI / Production)
subagent-harness is not only a CLI. You can import it as a package inside your own terminal app, backend worker, or release pipeline.
import { readFileSync } from "node:fs";
import {
parseRichAgentMarkdown,
validateRichAgent,
composeSubagent
} from "subagent-harness";
const sourcePath = "agents/changelog-extractor.agent.md";
const content = readFileSync(sourcePath, "utf8");
const doc = parseRichAgentMarkdown(sourcePath, content);
const validation = validateRichAgent(doc);
if (!validation.ok) {
throw new Error(
`Invalid agent definition: ${validation.issues.map((i) => i.code).join(", ")}`
);
}
// Built-in targets
const cursorAgent = composeSubagent(doc, "cursor");
const codexAgent = composeSubagent(doc, "codex"); // same markdown as cursor; use with `.codex/agents/`
// You can also map `doc.frontmatter` + `doc.body` into your own runtime schema.
// For `production` JSON, import `validateProductionComposeOutput` to mirror the CLI SSOT checks (issue #15).This lets product runtime and IDE runtime consume the same SSOT file while keeping environment-specific adapters isolated.
Demo Video
Staying Updated
Status: Pre-RC. The format and API are stabilizing but not yet frozen. Follow releases to stay informed.
subagent-harness uses git tags and GitHub Releases as the primary version signal. Choose whichever subscription method fits your workflow:
| Method | How | Best for |
|--------|-----|----------|
| Watch → Releases | Click Watch on this repo → Custom → check Releases only | Lightweight human notification |
| Dependency automation | Configure Dependabot, Renovate, or similar tools to monitor this repo's tags | Auto-PR when a new version is available |
| Downstream registry | Add your project to downstream.json via PR | Receive an automated issue on each release |
| RSS | Subscribe to https://github.com/ERerGB/subagent-harness/releases.atom | Feed reader integration |
Samantha Mini-Test
A small integration of the Samantha project (emotional AI companion from Her) into subagent-harness:
pnpm compose --dry-run # Preview
pnpm compose --apply # Compose to .cursor/agentsSee Samantha Quickstart for full walkthrough.
Integration Best Practices
The Extension Boundary
When integrating subagent-harness into your project, the most important architectural decision is understanding where the harness stops and your project begins.
subagent-harness (standardized) Your project (project-specific)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
.agent.md → parse → validate Interpret extensions semantics
.agent.ext.yaml → merge extensions Scenario bundle resolution
Profile → resolve model + skills Agent orchestration / scheduling
Output → AgentDefinition or artifact Runtime lifecycle (spawn, freeze…)
Contract validation Map skill names → tool implementationsThe harness never interprets extensions. It passes them through as an opaque Record<string, unknown>. All domain-specific logic built on extension fields belongs in your project's runtime layer.
What goes where
| Concern | Where it lives | Why |
|---------|---------------|-----|
| Agent identity (name, description, prompt) | Harness — .agent.md | Universal across all runtimes |
| Model config and profile resolution | Harness — loadAgent() | Deterministic merge logic, runtime-agnostic |
| Extension fields (config, contentSchema, archetype…) | Harness stores, your project interprets | Semantics are domain-specific |
| Scenario bundles (refs, overrides, disabled) | Your project only | Product-level composition decisions |
| Agent scheduling and lifecycle | Your project only | Depends on your interaction model |
| Skill → tool binding | Your project only | Tools are runtime-specific |
Example: consuming production artifacts
import { loadAgentFromDisk, composeSubagent } from "subagent-harness";
// Harness: SSOT → typed artifact
const doc = loadAgentFromDisk("agents/my-agent.agent.md");
const json = JSON.parse(composeSubagent(doc, "production"));
// Your project: interpret extensions, wire up runtime
const maxIdleTurns = json.config?.maxIdleTurns ?? 10; // your domain logic
const contentSchema = json.contentSchema; // your card schema
// ... feed into your own scheduler / executorWhy scenario bundles are NOT in the harness
A scenario bundle (e.g. "which agents run together, with what overrides, excluding which others") is a product decision. Different projects compose agents differently:
- A meeting app might run 3 agents in parallel monitoring a transcript
- A CI pipeline might run 1 agent per PR sequentially
- A CLI tool might spawn agents as task executors
The harness provides loadAgent() for each individual agent. How you compose multiple agents is your product's job, not a library concern.
See issue #16 for the
loadAgent()typed API that formalizes this boundary.
Docs
| Document | Purpose | |----------|---------| | 5-Minute Quickstart | Hands-on onboarding guide | | Samantha Quickstart | Samantha mini-test integration | | YAML Subset | Supported YAML features and parser boundaries | | Beta Feedback Form | Structured feedback for testers | | Governance Agreement | Maintainer agreement & migration triggers | | Governance Navigation | Governance entry point | | Trusted Publishing | npm publish via GitHub Actions OIDC | | SDK Probe Contract | L3 real-runtime probe input/output and exit-code contract | | Protocol Iteration Principles | Lean core-protocol evolution rules | | Project Structuring Worklog | Project-by-project import and protocol observations |
License
Apache-2.0
