cc-harness

v0.1.1

Published

9 days ago

Harness Engineering audit tool + toolkit for AI coding agents, distilled from Claude Code source architecture

Downloads

127

Vulnerabilities: 0 High, 0 Medium, 0 Low

vincent.bian

claude-code codex harness-engineering ai-agent audit skills toolkit cursor windsurf copilot

Scan any project. Score it against 10 engineering principles. Get actionable fix suggestions.

What is this

cc-harness evaluates whether your AI coding agent workflow (Claude Code, Codex, Cursor, or any custom system) follows Harness Engineering principles. The principles were extracted from an analysis of Claude Code's 4756-file TypeScript source and Codex's Rust codebase.

It combines two layers:

- Deterministic scan (50 points): a script checks file existence, structure, and configuration
- LLM semantic analysis (50 points): Claude evaluates content quality and design intent

The total score is out of 100, with a per-dimension breakdown and remediation links.
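As a rough illustration of how the two layers add up, here is a minimal sketch. The `DimensionScore` and `Summary` shapes and the `summarize` function are hypothetical names for this example, not the tool's actual output format, and integer scores from 0 to 5 per layer are an assumption based on the "10 dims × 5" breakdown.

```typescript
// Hypothetical shapes: the real cc-harness JSON output may differ.
interface DimensionScore {
  id: string;       // e.g. "P1"
  script: number;   // 0..5 from the deterministic scan
  semantic: number; // 0..5 from the LLM analysis (assumed integer)
}

interface Summary {
  scriptTotal: number;   // out of 50 for 10 dimensions
  semanticTotal: number; // out of 50 for 10 dimensions
  total: number;         // out of 100
}

// Sum both layers across all dimensions, rejecting out-of-range values.
function summarize(dimensions: DimensionScore[]): Summary {
  let scriptTotal = 0;
  let semanticTotal = 0;
  for (const d of dimensions) {
    for (const value of [d.script, d.semantic]) {
      if (!Number.isInteger(value) || value < 0 || value > 5) {
        throw new RangeError(`${d.id}: layer score must be an integer from 0 to 5`);
      }
    }
    scriptTotal += d.script;
    semanticTotal += d.semantic;
  }
  return { scriptTotal, semanticTotal, total: scriptTotal + semanticTotal };
}
```

With ten dimensions, each layer tops out at 50, so `total` is the score out of 100.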

Quick Start

As a Claude Code plugin (recommended)

Enable in your Claude Code settings:

// .claude/settings.json
{
  "enabledPlugins": {
    "cc-harness@auto-research": true
  }
}

Then run /harness-audit in any project.

Manual skill install

# Clone and install the skill
git clone https://github.com/auto-research/cc-harness.git
cp -r cc-harness/skills/harness-audit ~/.claude/skills/harness-audit

# Run in your project — type: /harness-audit

Global npm install

npm i -g cc-harness

# Scan any project (script layer only, no LLM)
cc-harness scan --root /path/to/your/project

Script-only mode (no global install)

git clone https://github.com/auto-research/cc-harness.git
cd cc-harness && npm install

# Scan any project
npm run scan -- --root /path/to/your/project

Sample Output

  ╦ ╦╔═╗╦═╗╔╗╔╔═╗╔═╗╔═╗  ╔═╗╦ ╦╔╦╗╦╔╦╗
  ╠═╣╠═╣╠╦╝║║║║╣ ╚═╗╚═╗  ╠═╣║ ║ ║║║ ║
  ╩ ╩╩ ╩╩╚═╝╚╝╚═╝╚═╝╚═╝  ╩ ╩╚═╝═╩╝╩ ╩
  v0.1.0 · Harness Engineering Compliance Audit

  Project    ai-assistant-hub
  Score      41 / 50 (script layer)

  ──────────────────────────────────────────

  P1  Constraint Codification  █████████░  5/5  PASS
  P2  Control Plane Layering   ████████░░  4/5  PASS
  P3  Workflow Continuity      █████████░  5/5  PASS
  P4  Tool Governance          █████████░  5/5  PASS
  P5  Context Budget           ████████░░  4/5  PASS
  P6  Error Path               ████░░░░░░  2/5  WARN
  P7  Interrupt Recovery       ██████░░░░  3/5  WARN
  P8  Role Separation          █████████░  5/5  PASS
  P9  Independent Verification ██████░░░░  3/5  WARN
  P10 Team Institution         █████████░  5/5  PASS

With the Claude Code skill (/harness-audit), you also get the LLM layer, for a full score out of 100.
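The status column in the sample can be read as a simple threshold on each dimension's score. The sketch below infers the cutoffs from the sample rows only (4 and 5 show PASS, 2 and 3 show WARN). The FAIL cutoff is an assumption, since the sample contains no FAIL row, and `statusFor` is a hypothetical name rather than a function from the tool.

```typescript
type Status = "PASS" | "WARN" | "FAIL";

// Cutoffs inferred from the sample output: 4-5 => PASS, 2-3 => WARN.
// Treating 0-1 as FAIL is an assumption; the sample shows no FAIL rows.
function statusFor(score: number): Status {
  if (score >= 4) return "PASS";
  if (score >= 2) return "WARN";
  return "FAIL";
}
```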

The 10 Principles

These principles were extracted from Claude Code's source architecture and validated against Codex's design:

| # | Principle | What it means |
|---|-----------|---------------|
| 1 | Model is unstable | Don't trust it as a colleague; constrain it as an executor |
| 2 | Prompt is control plane | Not personality text; it's a priority chain of behavior blocks |
| 3 | Query loop is heartbeat | The core isn't single answers; it's continuous execution |
| 4 | Tools are managed interfaces | More power needs finer constraints. Bash is most dangerous |
| 5 | Context is working memory | Not a trash can; it's a budgeted resource |
| 6 | Error path = main path | `prompt too long` is inevitable, not exceptional |
| 7 | Recovery means continuation | Not rollback; pick up where you left off |
| 8 | Multi-agent needs role separation | Not cloning; it's uncertainty partitioning |
| 9 | Verification must be independent | Don't let the writer judge the writing |
| 10 | Institution > individual skill | Skills are reusable policy slices; hooks attach to lifecycle |

Full details: toolkit/principles.md

Toolkit

The toolkit/ directory contains ready-to-use templates:

Team (3-10 person teams)

| Template | What you get |
|----------|-------------|
| claude-md-template.md | Three-layer CLAUDE.md structure |
| hooks-starter.md | 4 ready-to-use hooks (health check, write guard, bash guard, lint) |
| agent-roles-template.md | 3 preset roles (Implementer, Reviewer, Architect) |
| governance-starter.md | rules + dont + quality-gates + verification templates |
| workflow-template.md | Workflow definition + progress tracking |
| checklist.md | 10-minute team audit (10 yes/no questions) |

Enterprise (building your own agent system)

| Reference | What you learn |
|-----------|---------------|
| architecture.md | 6-component harness architecture |
| query-loop-design.md | Query loop state machine design (the most important one) |
| tool-governance.md | 8-step tool execution pipeline |
| context-management.md | 5-stage compression strategy |
| multi-agent-patterns.md | Multi-agent scheduling + verification separation |
| error-recovery.md | Error path design for agent systems |
| checklist.md | 25-item enterprise design audit |

How it Works

  ┌─────────────────────────────┐
  │  /harness-audit             │
  └──────────┬──────────────────┘
             │
  ┌──────────▼──────────────────┐
  │  Layer 1: scan.ts           │
  │  Deterministic file scan    │
  │  10 dims × 5 checks = 50    │──→ JSON (reproducible)
  │  Same commit = same score   │
  └──────────┬──────────────────┘
             │
  ┌──────────▼──────────────────┐
  │  Layer 2: LLM Analysis      │
  │  Read discovered files      │
  │  10 dims × 5 semantic = 50  │──→ Quality judgment
  │  "Is this actually good?"   │
  └──────────┬──────────────────┘
             │
  ┌──────────▼──────────────────┐
  │  Report                     │
  │  Score /100 + PASS/WARN/FAIL│
  │  Top 3 actions + toolkit    │──→ Terminal or Markdown
  │  links to fix templates     │
  └─────────────────────────────┘
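To make the Layer 1 arithmetic concrete, the following sketch scores a dimension as one point per passing check and picks the weakest dimensions for the report. The `Check` and `DimensionResult` shapes are hypothetical, and choosing the "top 3 actions" by lowest score is an assumption for illustration; scan.ts may rank them differently.

```typescript
// Hypothetical check and dimension shapes; scan.ts may be organized differently.
interface Check {
  description: string;
  passed: boolean;
}

interface DimensionResult {
  id: string;      // e.g. "P6"
  name: string;    // e.g. "Error Path"
  checks: Check[]; // expected to hold 5 checks per dimension
}

// One point per passing check, so a 5-check dimension scores 0..5.
// The result depends only on the inputs, so the same files give the same score.
function scoreDimension(d: DimensionResult): number {
  return d.checks.filter((c) => c.passed).length;
}

// Assumed ranking: the lowest-scoring dimensions come first.
// Ties keep their input order because Array.prototype.sort is stable.
function topActions(results: DimensionResult[], limit = 3): DimensionResult[] {
  return [...results]
    .sort((a, b) => scoreDimension(a) - scoreDimension(b))
    .slice(0, limit);
}
```

Copying the array before sorting keeps the caller's dimension order intact for the per-dimension breakdown.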

Tool Compatibility

cc-harness is tool-agnostic. It probes configuration paths for multiple AI coding tools:

| Tool | Files Detected |
|------|---------------|
| Claude Code | CLAUDE.md, .claude/settings.json, .claude/agents/ |
| Codex | AGENTS.md, .codex/hooks/, .codex/agents/ |
| Cursor | .cursorrules |
| Windsurf | .windsurfrules |
| GitHub Copilot | copilot-instructions.md, .github/copilot-instructions.md |
| Generic | governance/, workflows/, agents/, .husky/, lint-staged |
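One way to picture the probing is a lookup from tool name to probe paths, matched against the project's file list. The probe paths below are copied from the table above (the Generic row is left out). The `PROBES` constant, the `detectTools` function, and the prefix-matching rule for directories are assumptions for this sketch, not the scanner's actual code.

```typescript
// Probe paths copied from the compatibility table; the matching logic is illustrative.
const PROBES: Record<string, string[]> = {
  "Claude Code": ["CLAUDE.md", ".claude/settings.json", ".claude/agents/"],
  "Codex": ["AGENTS.md", ".codex/hooks/", ".codex/agents/"],
  "Cursor": [".cursorrules"],
  "Windsurf": [".windsurfrules"],
  "GitHub Copilot": ["copilot-instructions.md", ".github/copilot-instructions.md"],
};

// `paths` holds project-relative file paths using "/" separators.
// A probe ending in "/" matches any file under that directory;
// any other probe must match a path exactly.
function detectTools(paths: string[]): string[] {
  const matches = (probe: string, path: string): boolean =>
    probe.endsWith("/") ? path.startsWith(probe) : path === probe;
  return Object.entries(PROBES)
    .filter(([, probes]) => probes.some((probe) => paths.some((p) => matches(probe, p))))
    .map(([tool]) => tool);
}
```

Taking the file list as an argument keeps the function free of filesystem access, which makes it easy to test against a fixed list.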

Contributing

See docs/CONTRIBUTING.md for how to:

- Add probe paths for new AI tools
- Add new check items
- Contribute toolkit templates

Sources

The 10 principles were distilled from three source-code analysis documents:

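- Xiao Tan AI: Claude Code 源码架构深度解析 V2.0 ("Claude Code Source Architecture Deep Dive V2.0", a 4756-file TypeScript source analysis)
- @wquguru: Harness Engineering: Claude Code 设计指南 ("Harness Engineering: Claude Code Design Guide", 10 principles from the CC source)
- @wquguru: Claude Code 和 Codex 的 Harness 设计哲学 ("The Harness Design Philosophy of Claude Code and Codex", a CC vs Codex comparative analysis)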

Full source details: docs/sources.md

License

MIT
