harnesskit
v0.1.1
Plug-and-play SDK to set up Harness Engineering (agent-first development) in any repo, any IDE, any git provider
⚡ harnesskit
Plug-and-play SDK for Harness Engineering — agent-first development in any repo, any IDE, any git provider.
What Is This?
harnesskit implements the Harness Engineering workflow as a universal, pluggable SDK. One command scaffolds the complete agent-first development infrastructure — structured knowledge base, specialized agent personas, architecture enforcement, execution plans, and quality tracking — adapted to your language, IDE, and git provider.
npx harnesskit init
That's it. Your repo is now set up for agent-first development.
The Problem
OpenAI showed that with the right scaffolding, AI agents can build and ship real products. But their setup is deeply integrated with Codex. The patterns are universal — the tooling isn't:
- AGENTS.md tells agents about your repo (60k+ repos use it)
- Agent Skills give agents reusable capabilities (adopted by every major tool)
- But nobody has packaged the full orchestration layer: execution plans, quality grades, architecture enforcement, agent-to-agent review loops, doc gardening
harnesskit is that missing layer.
How It Works
graph TD
%% Setup Phase
Init[<code>npx harnesskit init</code>] -->|Detects Stack| GenEnv
subgraph "Scaffolded Knowledge & Rules"
GenEnv[Generates Infrastructure] --> AgentsMD[AGENTS.md]
GenEnv --> Rules[Layered Arch Rules<br/>Quality Scores]
GenEnv --> Configs[IDE configs<br/>VS Code, Cursor, etc.]
end
%% Development Phase
User[Developer] -->|Prompts| Planner
subgraph "The Agent Workflow"
Planner[Planner Agent] -->|Reads Context| Rules
Planner -->|Creates Plan| ExecPlans[docs/exec-plans]
ExecPlans -->|Handoff| Implementer[Implementer Agent]
Implementer -->|Writes Code| Code[Codebase]
Code -->|Triggers| Reviews
subgraph "Review & Enforcement"
Reviews[Sub-agents] -.->|PR/Review| ArchReview[Arch Reviewer]
Reviews -.->|PR/Review| SecReview[Security Reviewer]
ArchReview -.->|Checks| Rules
SecReview -.->|Checks| Rules
ArchReview -->|Fails| Implementer
SecReview -->|Fails| Implementer
end
Reviews -->|All Pass| Ship[Ship it! 🚀]
end
%% Styles
style Init fill:#3b82f6,stroke:#2563eb,color:#fff
style Ship fill:#10b981,stroke:#059669,color:#fff
style Code fill:#64748b,stroke:#475569,color:#fff
Quick Start
New Project
mkdir my-project && cd my-project
git init
npx harnesskit init
Existing Project
cd my-existing-repo
npx harnesskit init # Interactive wizard
# or
npx harnesskit init --yes # Auto-detect everything
Specify Your Stack
npx harnesskit init --lang python --ide cursor,vscode --git ado
Supported Environments
Languages
| Language | Auto-Detected By | Build/Test/Lint Commands |
|----------|-----------------|------------------------|
| Node.js / TypeScript | package.json | npm run build, npm test, npm run lint |
| Python | pyproject.toml, requirements.txt | python -m build, pytest, ruff check |
| .NET (C#/F#) | *.csproj, *.sln | dotnet build, dotnet test, dotnet format |
| Java / Kotlin | pom.xml, build.gradle | ./gradlew build, ./gradlew test |
| Go | go.mod | go build, go test, golangci-lint |
| Rust | Cargo.toml | cargo build, cargo test, cargo clippy |
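The auto-detection column above can be read as a lookup from marker files to a language. The following is a minimal sketch of that idea, not harnesskit's actual detection code: the function name `detectLanguages` and the language keys other than `python` (which appears in the `--lang` example) are assumptions made for illustration.

```javascript
// Marker files per language, copied from the table above.
// The language keys and the matching logic are assumptions for illustration.
const LANGUAGE_MARKERS = [
  { lang: 'node', exact: ['package.json'], suffixes: [] },
  { lang: 'python', exact: ['pyproject.toml', 'requirements.txt'], suffixes: [] },
  { lang: 'dotnet', exact: [], suffixes: ['.csproj', '.sln'] },
  { lang: 'java', exact: ['pom.xml', 'build.gradle'], suffixes: [] },
  { lang: 'go', exact: ['go.mod'], suffixes: [] },
  { lang: 'rust', exact: ['Cargo.toml'], suffixes: [] },
];

// Returns every language whose marker appears among the repo's root file names.
function detectLanguages(rootFileNames) {
  return LANGUAGE_MARKERS
    .filter(({ exact, suffixes }) =>
      rootFileNames.some(
        (name) => exact.includes(name) || suffixes.some((s) => name.endsWith(s))
      )
    )
    .map(({ lang }) => lang);
}

console.log(detectLanguages(['package.json', 'README.md'])); // [ 'node' ]
console.log(detectLanguages(['App.csproj', 'go.mod']));      // [ 'dotnet', 'go' ]
```

A repo with markers for several languages yields several entries, which is one plausible way a polyglot repo could be handled.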
IDEs / Agent Runtimes
| IDE | Config Format | What's Generated |
|-----|--------------|-----------------|
| VS Code + GitHub Copilot | .github/agents/*.agent.md | 4 custom agents + copilot-instructions.md |
| Cursor | .cursor/rules/*.md | 4 rules with Cursor frontmatter |
| Claude Code | .claude/agents/*.md + CLAUDE.md | CLAUDE.md with @-imports + 4 subagents |
| Windsurf | .windsurf/rules/*.md | 4 workspace rules |
| JetBrains (Junie) | .junie/guidelines.md | Unified guidelines file |
| All tools | AGENTS.md | Always generated (universal standard) |
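To make the table concrete, here is a small sketch that turns an `--ide` value into the config locations listed above. It is an illustration only: `plannedConfigPaths` is a hypothetical helper, and the keys `claude`, `windsurf`, and `junie` are assumed (only `cursor` and `vscode` appear in the examples in this README).

```javascript
// Config locations per IDE key, taken from the "Config Format" column above.
// Keys other than 'cursor' and 'vscode' are assumed names.
const IDE_OUTPUTS = new Map([
  ['vscode', ['.github/agents/*.agent.md']],
  ['cursor', ['.cursor/rules/*.md']],
  ['claude', ['.claude/agents/*.md', 'CLAUDE.md']],
  ['windsurf', ['.windsurf/rules/*.md']],
  ['junie', ['.junie/guidelines.md']],
]);

// Expands a comma-separated IDE list into the path patterns that would be generated.
function plannedConfigPaths(ideFlag) {
  const ides = ideFlag.split(',').map((s) => s.trim()).filter(Boolean);
  const paths = new Set(['AGENTS.md']); // always generated, per the table
  for (const ide of ides) {
    const outputs = IDE_OUTPUTS.get(ide);
    if (!outputs) throw new Error(`Unknown IDE key: ${ide}`);
    outputs.forEach((p) => paths.add(p));
  }
  return [...paths];
}

console.log(plannedConfigPaths('cursor,vscode'));
```

Using a `Set` keeps `AGENTS.md` from appearing twice if several IDEs are requested.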
Git Providers
| Provider | Auto-Detected By | Special Features |
|----------|-----------------|-----------------|
| GitHub | github.com in remote | Cloud agent support, PR via gh |
| Azure DevOps | dev.azure.com in remote | PR via az repos, work item linking |
| GitLab | gitlab in remote | MR via glab |
| Bitbucket | bitbucket.org in remote | PR via bb |
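The detection rules in this table amount to substring checks on the git remote URL. A minimal sketch under that assumption follows; `detectGitProvider` and the provider keys other than `ado` (seen in the `--git ado` example) are hypothetical, and the real tool may match differently.

```javascript
// Substring markers from the "Auto-Detected By" column above.
// Provider keys other than 'ado' are assumed names.
const PROVIDER_MARKERS = [
  ['github', 'github.com'],
  ['ado', 'dev.azure.com'],
  ['gitlab', 'gitlab'],
  ['bitbucket', 'bitbucket.org'],
];

// Returns the first provider whose marker occurs in the remote URL, or null.
function detectGitProvider(remoteUrl) {
  const url = remoteUrl.toLowerCase();
  const match = PROVIDER_MARKERS.find(([, marker]) => url.includes(marker));
  return match ? match[0] : null;
}

console.log(detectGitProvider('git@github.com:org/repo.git'));              // github
console.log(detectGitProvider('https://dev.azure.com/org/proj/_git/repo')); // ado
console.log(detectGitProvider('https://example.com/repo.git'));             // null
```

Because the GitLab marker is just `gitlab`, a self-hosted instance such as `gitlab.example.com` would also match in this sketch.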
Commands
| Command | What It Does |
|---------|-------------|
| harnesskit init | Interactive setup wizard |
| harnesskit init --yes | Non-interactive, auto-detect everything |
| harnesskit enforce | Validate architecture layer rules |
| harnesskit doctor | Check setup health and completeness |
| harnesskit garden | Find stale docs, broken refs, completed plans |
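To give a feel for what `harnesskit enforce` checks, here is a rough sketch of a layer-rule import scan. It is not the tool's implementation: the layer names, the rule that a layer may only import from itself or earlier layers, and the `findLayerViolations` helper are all assumptions for illustration.

```javascript
// Hypothetical layers, ordered from lowest to highest.
// Assumed rule: a file may import from its own layer or from earlier ones only.
const LAYERS = ['domain', 'service', 'ui'];

// Index of the layer a path belongs to, based on its directory segments (-1 if none).
function layerOf(path) {
  const segments = path.split('/');
  return LAYERS.findIndex((layer) => segments.includes(layer));
}

// files: object mapping file path -> source text.
// Flags import-like lines that mention a higher layer as a whole word.
function findLayerViolations(files) {
  const importLine = /^\s*(import|from|using|use)\b|require\(/;
  const violations = [];
  for (const [path, source] of Object.entries(files)) {
    const ownLayer = layerOf(path);
    if (ownLayer === -1) continue; // file is outside the layered tree
    source.split('\n').forEach((line, i) => {
      if (!importLine.test(line)) return;
      LAYERS.forEach((layer, idx) => {
        const mentionsLayer = new RegExp(`(^|[^\\w])${layer}([^\\w]|$)`).test(line);
        if (idx > ownLayer && mentionsLayer) {
          violations.push({ path, line: i + 1, imports: layer });
        }
      });
    });
  }
  return violations;
}

console.log(findLayerViolations({
  'src/domain/user.js': "import { render } from '../ui/render.js';",
  'src/ui/page.js': "import { User } from '../domain/user.js';",
}));
```

Matching on keywords and path segments rather than parsing each language is what keeps a scan like this language-agnostic, at the cost of occasional false positives.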
What Gets Generated
4 Specialized Agents
| Agent | Role | Tools |
|-------|------|-------|
| Planner | Creates execution plans, never writes code | Read-only |
| Implementer | Writes code following plans and architecture | Full access |
| Reviewer | Reviews for architecture, tests, quality | Read + terminal |
| Doc Gardener | Finds and fixes stale documentation | Read + write |
Knowledge Base (docs/)
| File | Purpose |
|------|---------|
| ARCHITECTURE.md | Layer rules, dependency diagram, key directories |
| QUALITY_SCORE.md | Per-domain quality grades (A through F) |
| design-docs/core-beliefs.md | 8 agent-first operating principles |
| exec-plans/active/ | In-flight execution plans |
| exec-plans/active/_template.md | Plan template with Goal, Steps, Acceptance Criteria |
| exec-plans/completed/ | Archived completed plans |
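Since these files link to each other, one of the checks a gardening pass needs is spotting references to files that no longer exist. The sketch below shows one way that could work; `findBrokenRefs` is a hypothetical helper and says nothing about how `harnesskit garden` is actually implemented.

```javascript
// Returns relative link targets in a markdown string that are not in existingPaths.
// External URLs and pure in-page anchors are ignored.
function findBrokenRefs(markdown, existingPaths) {
  const known = new Set(existingPaths);
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g; // [text](target)
  const broken = [];
  for (const match of markdown.matchAll(linkPattern)) {
    const target = match[1].split('#')[0]; // drop any #fragment
    const isExternal = /^[a-z][a-z0-9+.-]*:/i.test(target);
    if (target && !isExternal && !known.has(target)) broken.push(target);
  }
  return broken;
}

const agentsMd =
  'See [architecture](docs/ARCHITECTURE.md), [old notes](docs/OLD.md), ' +
  'and the [site](https://example.com).';
console.log(findBrokenRefs(agentsMd, ['docs/ARCHITECTURE.md'])); // [ 'docs/OLD.md' ]
```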
The Agent Workflow
You (human intent)
│
├─→ Planner Agent
│ Creates structured execution plan
│ Saves to docs/exec-plans/active/
│ → Handoff: "Start Implementation"
│
├─→ Implementer Agent
│ Follows plan step by step
│ Runs tests after each step
│ → Handoff: "Request Review"
│
├─→ Reviewer Agent
│ Checks architecture + tests + quality
│ Reports PASS or FAIL
│ → If FAIL: "Fix Issues" → back to Implementer
│ → If PASS: Ship it
│
└─→ Doc Gardener Agent (periodic)
Finds stale docs, broken links
Archives completed plans
Updates quality scores
Design Principles
- Zero dependencies: pure Node.js, no npm install needed for the CLI
- Universal standard first: always generates AGENTS.md (read by every tool)
- Progressive disclosure: short AGENTS.md → deep docs/ → IDE-specific configs
- Language agnostic: works with Node, Python, .NET, Java, Go, Rust
- IDE agnostic: one init generates configs for all your IDEs at once
- Git provider agnostic: GitHub, ADO, GitLab, Bitbucket
- Non-destructive: never overwrites existing files (safe for existing repos)
- Composable: use the CLI or import as a library
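The non-destructive principle can be pictured as a planning step that separates files to create from files to leave alone. This is a minimal sketch under that assumption; `planWrites` is a hypothetical name and the real CLI may decide this differently.

```javascript
// desiredFiles: object mapping path -> content the scaffold would like to write.
// existingPaths: paths already present in the repo.
// Existing files are never scheduled for writing; they are reported as skipped.
function planWrites(desiredFiles, existingPaths) {
  const existing = new Set(existingPaths);
  const create = [];
  const skip = [];
  for (const path of Object.keys(desiredFiles)) {
    (existing.has(path) ? skip : create).push(path);
  }
  return { create, skip };
}

const plan = planWrites(
  { 'AGENTS.md': '# Agents', 'docs/ARCHITECTURE.md': '# Architecture' },
  ['AGENTS.md']
);
console.log(plan); // { create: [ 'docs/ARCHITECTURE.md' ], skip: [ 'AGENTS.md' ] }
```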
Programmatic API
import { init, enforce, doctor, garden, generateAgents } from 'harnesskit';
// Generate harness scaffold programmatically
await init('/path/to/repo', { yes: true, lang: 'python', ide: 'cursor,vscode' });
// Run architecture enforcement
await enforce('/path/to/repo');
// Check setup health
await doctor('/path/to/repo');
Mapping to OpenAI's Harness Engineering
| Harness Principle | harnesskit Implementation |
|---|---|
| AGENTS.md as table of contents | Auto-generated AGENTS.md (~50 lines, links to docs/) |
| Structured docs/ knowledge base | docs/ — ARCHITECTURE, QUALITY_SCORE, exec-plans, core-beliefs |
| Layered architecture enforcement | harnesskit enforce — language-agnostic import scanner |
| Agent-to-agent review loop | Planner → Implementer → Reviewer handoff agents |
| Execution plans as first-class artifacts | docs/exec-plans/active/ with templates |
| Garbage collection / doc gardening | harnesskit garden + Doc Gardener agent |
| Progressive disclosure | AGENTS.md → docs/ → IDE-specific instructions |
| Corrections are cheap | Non-blocking flow, fix-up PRs encouraged |
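The Planner → Implementer → Reviewer handoff loop in the table can be modeled as a small state machine. The sketch below is only an illustration of that control flow: `runWorkflow` and the step names are assumptions, and the periodic Doc Gardener is left out because it runs outside this loop.

```javascript
// Handoff rules: Planner -> Implementer -> Reviewer,
// then back to Implementer on FAIL or on to 'ship' on PASS.
const HANDOFFS = new Map([
  ['planner', () => 'implementer'],
  ['implementer', () => 'reviewer'],
  ['reviewer', (result) => (result === 'PASS' ? 'ship' : 'implementer')],
]);

// Walks the loop using a scripted list of review outcomes and returns the visited steps.
function runWorkflow(reviewResults) {
  const trace = ['planner'];
  let current = 'planner';
  let reviewIndex = 0;
  while (current !== 'ship') {
    let result;
    if (current === 'reviewer') {
      if (reviewIndex >= reviewResults.length) {
        throw new Error('Ran out of review results before a PASS');
      }
      result = reviewResults[reviewIndex++];
    }
    current = HANDOFFS.get(current)(result);
    trace.push(current);
  }
  return trace;
}

console.log(runWorkflow(['FAIL', 'PASS']).join(' -> '));
// planner -> implementer -> reviewer -> implementer -> reviewer -> ship
```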
Contributing
See CONTRIBUTING.md for setup, code style, and PR process.
Acknowledgments
harnesskit is an open-source implementation of the Harness Engineering methodology published by OpenAI in 2025. The core principles — repository-as-source-of-truth, structured agent instructions, layered architecture enforcement, agent-to-agent review loops, and execution plans — originate from OpenAI's research on making AI coding agents effective at scale.
This project adapts those principles into a universal, IDE-agnostic, language-agnostic CLI tool. It is not affiliated with, endorsed by, or sponsored by OpenAI.
License
MIT — see LICENSE.
Built for the agent-first era.
Humans steer. Agents execute. harnesskit sets up the environment.
