anvil-ai

v0.7.0

Published

2 months ago

Lightweight AI Code Factory — Build anything from a single command. Pure TypeScript, zero setup.


fepvenancio

ai autonomous-agents code-generation typescript claude cli developer-tools forge

Anvil

Lightweight AI Code Factory — Build anything from a single command. Zero setup.

Anvil orchestrates a team of AI agents to build entire projects from a natural-language spec. It is the spiritual successor to Forge: same structured agent roles, same review rigor, radically simplified.

npx anvil-ai run "Build a REST API for a todo app with Express and TypeScript"

One command. You get a complete project with a clean git history, passing tests, and a full audit trail.

Quick Start

# Create a new project directory
mkdir my-project && cd my-project && git init

# Build it
npx anvil-ai run "Build a CLI calculator with add, subtract, multiply, divide"

Requirements: Node.js 22+, Git, and an authenticated AI CLI (Claude Code, Gemini CLI, or any other AI CLI that provides auth). Anvil inherits authentication from the parent environment, so no API key is needed.

What Happens

You ──► "Build a todo API"
              │
         ┌────▼─────┐
         │  Planner  │  Spec → JSON plan with tasks, dependencies, interface contracts
         └────┬─────┘
              │
         Plan Critic ──► Deterministic + LLM validation (loop till clean)
              │
         Plan Review ──► Y / n / edit
              │
         ┌────▼────┐
         │  Wave 1  │  Independent tasks in parallel (git worktrees)
         │ Workers  │  Each reads context from earlier waves
         └────┬────┘
              │
         Sub-Judges ──► tsc / vitest / touch-map / security / interface
              │         (5 judges, all code — $0, no AI)
              │
              │         ✗ Failed? → Retry with error context (up to 2x)
              │
         ┌────▼────┐
         │  Wave 2  │  Dependent tasks execute next
         └────┬────┘
              │
         Sub-Judges ──► (same 5 checks)
              │
         Final Integration ──► tsc + vitest on fully merged codebase
              │
         High Court ──► AI architectural review
              │         merge ✓ / human_required ⚠ / abort ✗
              │
         Librarian ──► README.md + ARCHITECTURE.md
              │
         Done ──► Clean git history, full audit trail
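
The wave structure in the diagram can be pictured as a layering of the task dependency graph: a task runs in the first wave in which all of its dependencies have already completed. The sketch below illustrates that idea only. The `Task` shape, the `dependsOn` field, and the `planWaves` function are hypothetical names chosen for this example and are not taken from Anvil's source or from the `roadmap.json` schema.

```typescript
// Hypothetical task shape; the real roadmap.json schema may differ.
interface Task {
  id: string;
  dependsOn: string[]; // ids of tasks that must finish first
}

// Group tasks into waves: each wave contains every task whose
// dependencies were all placed in earlier waves.
function planWaves(tasks: Task[]): string[][] {
  const known = new Set(tasks.map((t) => t.id));
  for (const t of tasks) {
    for (const dep of t.dependsOn) {
      if (!known.has(dep)) {
        throw new Error(`Task ${t.id} depends on unknown task ${dep}`);
      }
    }
  }

  const done = new Set<string>();
  let remaining = [...tasks];
  const waves: string[][] = [];

  while (remaining.length > 0) {
    const ready = remaining.filter((t) => t.dependsOn.every((d) => done.has(d)));
    if (ready.length === 0) {
      // Nothing can make progress, so the remaining tasks form a cycle.
      throw new Error("Dependency cycle detected");
    }
    waves.push(ready.map((t) => t.id));
    for (const t of ready) done.add(t.id);
    remaining = remaining.filter((t) => !done.has(t.id));
  }
  return waves;
}

// Example plan with made-up task ids.
const waves = planWaves([
  { id: "scaffold", dependsOn: [] },
  { id: "logic", dependsOn: ["scaffold"] },
  { id: "cli", dependsOn: ["scaffold"] },
  { id: "tests", dependsOn: ["logic", "cli"] },
]);
console.log(waves);
```

In this example `logic` and `cli` land in the same wave because neither depends on the other, which is the case where parallel execution in separate worktrees would apply.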

Commands

anvil run "spec"                    # Build from natural language
anvil run "spec" --skip-review      # Skip interactive plan review
anvil run "spec" --stack python     # Use Python stack preset
anvil run "spec" --spec todo.md     # Read detailed spec from file
anvil run "spec" --sequential       # Force sequential execution
anvil stacks                        # List available stack presets
anvil status                        # View last build state
anvil cost                          # Token/cost breakdown
anvil logs                          # View build logs
anvil logs --wave 2                 # Logs for a specific wave

Stack Presets

anvil run "Build X"                    # Default: TypeScript
anvil run "Build X" --stack python     # Python + FastAPI + pytest
anvil run "Build X" --stack go         # Go + Chi + stdlib testing
anvil run "Build X" --stack react      # React 19 + Vite + Vitest

Agent Roles

| Agent | Type | What it does |
|-------|------|-------------|
| Planner | AI (JSON) | Spec → plan with tasks, deps, interface contracts (exports[]) |
| Plan Critic | Code + AI | Validates plan structure, loops until clean |
| Worker | AI (tool use) | Executes one task in isolated git worktree. Reads context, self-verifies with tsc/vitest |
| Sub-Judges | Code only ($0) | tsc, vitest, touch-map, security (5 regex rules), interface contract enforcement |
| High Court | AI (JSON) | Architectural review. Abort → git reset --hard (nothing leaks) |
| Librarian | AI (markdown) | Generates README.md + ARCHITECTURE.md |
| Cost Auditor | Code only | Tracks tokens per call, calculates cost per wave |
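
To make interface contract enforcement concrete, here is a minimal sketch of how a code-only check could compare a task's declared `exports[]` against what a generated file actually exports. It is an illustration under stated assumptions: `InterfaceContract`, `findExportedNames`, and `missingExports` are hypothetical names, and the regex scan is a simplification that ignores default exports and `export type { ... }` lists. Anvil's own judge may work differently, for example through the TypeScript compiler API.

```typescript
// Hypothetical contract shape: the names one task promises to export.
interface InterfaceContract {
  file: string;
  exports: string[];
}

// Collect named exports from source text using two regex passes:
//   1. export function|const|let|var|class|interface|type|enum Name
//   2. export { a, b as c }
function findExportedNames(source: string): Set<string> {
  const names = new Set<string>();

  const declaration =
    /export\s+(?:async\s+)?(?:function\*?|const|let|var|class|interface|type|enum)\s+([A-Za-z_$][\w$]*)/g;
  for (const match of source.matchAll(declaration)) {
    names.add(match[1]);
  }

  const list = /export\s*\{([^}]*)\}/g;
  for (const match of source.matchAll(list)) {
    for (const part of match[1].split(",")) {
      // For "local as exported", the exported name is the last piece.
      const pieces = part.trim().split(/\s+as\s+/);
      const exported = pieces[pieces.length - 1].trim();
      if (exported) names.add(exported);
    }
  }
  return names;
}

// Return the declared names that the source does not export.
function missingExports(contract: InterfaceContract, source: string): string[] {
  const found = findExportedNames(source);
  return contract.exports.filter((name) => !found.has(name));
}

const generated = [
  "export function add(a: number, b: number): number { return a + b; }",
  "const sub = (a: number, b: number): number => a - b;",
  "export { sub as subtract };",
].join("\n");

const contract: InterfaceContract = {
  file: "src/calculator.ts",
  exports: ["add", "subtract", "multiply"],
};
console.log(missingExports(contract, generated)); // [ 'multiply' ]
```

A non-empty result would correspond to a failed check, whose message could then be fed back to the worker as error context on retry.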

What You Get

your-project/
├── src/                    # Generated source code
├── tests/                  # Generated tests (if requested)
├── README.md               # Auto-generated by Librarian
├── ARCHITECTURE.md         # Auto-generated by Librarian
├── package.json            # Project config
└── .anvil/                 # Audit trail
    ├── roadmap.json        # The execution plan
    ├── cost-report.json    # Token usage + cost breakdown
    ├── high-court-report.json
    └── reports/            # Per-wave Sub-Judge reports

Plus a clean git history:

feat(anvil): Create project scaffold
feat(anvil): Implement calculator logic
feat(anvil): Add CLI entry point and tests
docs(anvil): auto-generated README and ARCHITECTURE

v0.2.0 — What's New

First fully successful end-to-end build. Benchmark: CLI calculator, 3 waves, $0.26, all judges pass.

| Feature | Description |
|---------|-------------|
| Interface Contracts | Planner declares exact exports per task. InterfaceJudge enforces them. |
| Wave Retry Loop | Failed waves retry 2x with error context injected into worker prompts |
| Plan Critic | Deterministic structural validation + LLM review before execution |
| Final Integration Check | tsc + vitest on fully merged codebase before High Court |
| Security Judge | Catches eval(), hardcoded secrets, SQL injection, innerHTML, insecure HTTP |
| Worker Self-Verification | Workers run tsc + vitest before declaring complete |
| Context Injection | Workers read actual file contents from earlier waves (no more guessing imports) |
| Stack Presets | --stack typescript/python/go/react |
| Brownfield Support | Detects existing projects, injects file tree + export signatures |
| Worker Timeout | 5-minute AbortController per worker |
| Lockfile De-confliction | Parallel workers don't conflict on package-lock.json |
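
As an illustration of what a regex-based Security Judge could look like, the sketch below scans source text line by line against one rule for each of the five categories named above. The rule ids, the patterns themselves, and the `scan` function are assumptions written for this example. They are not Anvil's actual rules, and patterns this simple will produce both false positives and false negatives.

```typescript
interface SecurityRule {
  id: string;
  pattern: RegExp; // no "g" flag, so test() stays stateless
  message: string;
}

interface Finding {
  rule: string;
  line: number; // 1-based line number
  message: string;
}

// Hypothetical patterns, one per category from the feature table.
const RULES: SecurityRule[] = [
  { id: "no-eval", pattern: /\beval\s*\(/, message: "eval() call" },
  {
    id: "hardcoded-secret",
    pattern: /(?:api[_-]?key|secret|password|token)\s*[:=]\s*["'][^"']{8,}["']/i,
    message: "Possible hardcoded secret",
  },
  {
    id: "sql-injection",
    pattern: /\b(?:SELECT|INSERT|UPDATE|DELETE)\b[^;\n]*(?:\$\{|["'`]\s*\+)/i,
    message: "SQL string built from interpolated input",
  },
  { id: "inner-html", pattern: /\.innerHTML\s*=/, message: "Direct innerHTML assignment" },
  {
    id: "insecure-http",
    pattern: /["'`]http:\/\/(?!localhost|127\.0\.0\.1)/,
    message: "Plain http:// URL",
  },
];

// Check every line against every rule and collect the matches.
function scan(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const rule of RULES) {
      if (rule.pattern.test(text)) {
        findings.push({ rule: rule.id, line: index + 1, message: rule.message });
      }
    }
  });
  return findings;
}

const sample = [
  'const apiKey = "sk-1234567890abcdef";',
  "el.innerHTML = userInput;",
  "const ok = 1;",
].join("\n");
console.log(scan(sample));
```

Because the check is plain code, it costs nothing to run after every wave, which matches the "$0, no AI" note on the Sub-Judges.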

Cost

Typical builds cost $0.25–$30 depending on complexity. Workers are 93–98% of spend.

| Project Size | Tasks | Cost |
|-------------|-------|------|
| Simple (calculator, CLI tool) | 4-5 | $0.25–$3 |
| Medium (REST API with tests) | 10-15 | $5–$13 |
| Complex (full-stack app) | 20-30 | $15–$30 |
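
The per-wave cost figures and the worker share of spend can be derived from per-call token counts. The sketch below shows one way to do that arithmetic. The `CallRecord` shape, the convention of treating planning as wave 0, and the prices in the example are placeholders invented for illustration. They do not reflect Anvil's `cost-report.json` format or any provider's actual pricing.

```typescript
// Hypothetical per-call record; field names are assumptions.
interface CallRecord {
  wave: number; // 0 is used here for planning, by assumption
  role: "planner" | "worker" | "high-court" | "librarian";
  inputTokens: number;
  outputTokens: number;
}

// Prices in USD per million tokens (placeholder values only).
interface Pricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const MILLION = 1e6;

function callCost(call: CallRecord, pricing: Pricing): number {
  return (
    (call.inputTokens / MILLION) * pricing.inputPerMillion +
    (call.outputTokens / MILLION) * pricing.outputPerMillion
  );
}

// Sum the cost of all calls in each wave.
function costByWave(calls: CallRecord[], pricing: Pricing): Map<number, number> {
  const totals = new Map<number, number>();
  for (const call of calls) {
    totals.set(call.wave, (totals.get(call.wave) ?? 0) + callCost(call, pricing));
  }
  return totals;
}

// Fraction of total spend attributable to workers (0 when there is no spend).
function workerShare(calls: CallRecord[], pricing: Pricing): number {
  let total = 0;
  let workers = 0;
  for (const call of calls) {
    const cost = callCost(call, pricing);
    total += cost;
    if (call.role === "worker") workers += cost;
  }
  return total === 0 ? 0 : workers / total;
}

const pricing: Pricing = { inputPerMillion: 3, outputPerMillion: 15 }; // placeholder
const calls: CallRecord[] = [
  { wave: 0, role: "planner", inputTokens: 10_000, outputTokens: 2_000 },
  { wave: 1, role: "worker", inputTokens: 200_000, outputTokens: 40_000 },
  { wave: 2, role: "worker", inputTokens: 100_000, outputTokens: 20_000 },
];
console.log(costByWave(calls, pricing));
console.log(`worker share: ${(workerShare(calls, pricing) * 100).toFixed(1)}%`);
```

With these made-up numbers the workers account for roughly 97% of spend, which happens to fall inside the 93–98% range quoted above. That agreement comes from the chosen inputs and is not a measurement.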

Development

git clone https://github.com/fepvenancio/anvil.git
cd anvil
npm install
npm test              # 174 tests
npm run typecheck     # strict mode, zero errors
npm run dev -- run "Build a hello world Express app"

Tech Stack

| Dependency | Purpose |
|-----------|---------|
| @anthropic-ai/claude-agent-sdk | Claude Code Agent SDK (workers, planner, high court) |
| commander | CLI framework |
| simple-git | Git worktree management |
| zod | Schema validation (plans, reports, config) |
| p-limit | Parallel wave execution |
| chalk + ora | Terminal UI |
| pino | Structured JSON logging |

License

MIT