ancoder-skill-cli

v0.13.1

Published

10 hours ago

CLI for managing everything-claude-code (ECC) components — agents, skills, commands, rules, hooks, MCP configs. Single binary, all assets embedded.


mccree_npm

skill claude claude-code anthropic agentskills everything-claude-code ecc cli agents mcp

skill-cli

CLI for managing and testing Anthropic Agent Skills (e.g. anthropics/skills).

Install (npm)

# Global
npm install -g ancoder-skill-cli

# Or run without installing
npx ancoder-skill-cli --help

The npm package is self-contained and includes prebuilt binaries for:

- macOS arm64
- macOS x64
- Linux arm64
- Linux x64
- Windows x64

After install, the wrapper selects the correct bundled binary for the current platform automatically.
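
As a rough illustration of that selection step, the sketch below maps an OS/CPU pair to a binary name. It is not the actual wrapper (which ships as `bin/skill-cli.js`); the lookup table and helper names are hypothetical, and it assumes the bundled files use the same names as the release assets listed under "Publish to npm".

```python
import platform

# Hypothetical lookup table: (os, arch) -> bundled binary name.
# Assumption: bundled files are named like the GitHub Release assets.
BINARIES = {
    ("darwin", "arm64"): "skill-cli-darwin-arm64",
    ("darwin", "x64"): "skill-cli-darwin-x64",
    ("linux", "arm64"): "skill-cli-linux-arm64",
    ("linux", "x64"): "skill-cli-linux-x64",
    ("win32", "x64"): "skill-cli-win32-x64.exe",
}

# Normalize Python's platform names to the Node-style names used above.
OS_ALIASES = {"windows": "win32"}
ARCH_ALIASES = {"x86_64": "x64", "amd64": "x64", "aarch64": "arm64"}


def select_binary(system: str, machine: str) -> str:
    """Return the binary name for an OS/CPU pair, or raise if unsupported."""
    os_name = OS_ALIASES.get(system.lower(), system.lower())
    arch = ARCH_ALIASES.get(machine.lower(), machine.lower())
    try:
        return BINARIES[(os_name, arch)]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {os_name}/{arch}") from None


def current_binary() -> str:
    """Select the binary for the machine this script runs on."""
    return select_binary(platform.system(), platform.machine())
```

Calling `current_binary()` on a platform outside the five listed above raises `RuntimeError` rather than falling back to a guess.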

Build from source (Go)

cd skill-cli
go build -o bin/skill-cli .
# Then run: ./bin/skill-cli --help
# Or via the Node wrapper: node bin/skill-cli.js --help

Commands

| Command | Description |
|--------|-------------|
| `skill-cli validate <path>` | Validate SKILL.md, skill.contract.yaml, and evals/*.yaml |
| `skill-cli list [--path <dir>]` | List installed skills |
| `skill-cli create <name> [--path <dir>]` | Create a skill scaffold with contract and smoke eval templates |
| `skill-cli test <path>` | Check that a skill has trigger docs, contract, and eval coverage |
| `skill-cli verify <path> [--suite smoke]` | Run a machine-readable verification suite end-to-end |
| `skill-cli generate <name> --desc "..."` | Generate a complete skill using Claude CLI with OMC autopilot (default) |
| `skill-cli generate <name> --desc "..." --adversarial` | Generate with OMC, then run isolated generator/evaluator contract negotiation and review |
| `skill-cli install [--no-omc]` | Install ECC components into ~/.claude/ (includes OMC by default) |
| `skill-cli install --component omc` | Install only the bundled OMC multi-agent orchestration layer |

Machine-Readable Skill Layout

Task-oriented skills can now include a deterministic verification harness:

my-skill/
├── SKILL.md
├── skill.contract.yaml
├── evals/
│   └── smoke.yaml
├── fixtures/
└── scripts/

- skill.contract.yaml defines the executable contract: entrypoint, inputs, outputs, invariants, and datasets.
- evals/*.yaml files define runnable verification suites with deterministic checks such as file existence, required content, and JSON assertions.
- skill-cli verify materializes fixture data into a temp workspace, runs the skill entrypoint, and enforces the declared checks.

skill-cli verify executes local code declared by the skill contract, so only run it against trusted skills and repositories.
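
To make the three kinds of deterministic check concrete, here is a minimal sketch of how they might be evaluated against a workspace. The check schema (`type`, `path`, `text`, `key`, `value`) and the type names `file_exists`, `contains`, and `json_equals` are hypothetical; the real `evals/*.yaml` format is not shown in this README.

```python
import json
import tempfile
from pathlib import Path


def run_check(workspace: Path, check: dict) -> bool:
    """Evaluate one check (hypothetical schema) against files in workspace."""
    target = workspace / check["path"]
    kind = check["type"]
    if kind == "file_exists":
        return target.is_file()
    if kind == "contains":
        return target.is_file() and check["text"] in target.read_text(encoding="utf-8")
    if kind == "json_equals":
        if not target.is_file():
            return False
        data = json.loads(target.read_text(encoding="utf-8"))
        return isinstance(data, dict) and data.get(check["key"]) == check["value"]
    raise ValueError(f"unknown check type: {kind}")


def run_suite(workspace: Path, checks: list[dict]) -> dict[str, bool]:
    """Run every check and return a result per check id."""
    return {check["id"]: run_check(workspace, check) for check in checks}


def demo() -> dict[str, bool]:
    """Build a throwaway workspace, write sample outputs, and check them."""
    with tempfile.TemporaryDirectory() as tmp:
        ws = Path(tmp)
        (ws / "out.md").write_text("# Title\nhello\n", encoding="utf-8")
        (ws / "meta.json").write_text(json.dumps({"pages": 3}), encoding="utf-8")
        checks = [
            {"id": "exists", "type": "file_exists", "path": "out.md"},
            {"id": "content", "type": "contains", "path": "out.md", "text": "# Title"},
            {"id": "pages", "type": "json_equals", "path": "meta.json", "key": "pages", "value": 3},
            {"id": "missing", "type": "file_exists", "path": "absent.md"},
        ]
        return run_suite(ws, checks)
```

Each check depends only on files in the workspace, so repeated runs over the same outputs give the same result.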

Adversarial Skill Generation

skill-cli generate --adversarial adds an independent evaluator pass after the default OMC generation pipeline:

skill-cli generate pdf-to-md --desc "Convert PDF files to Markdown" --adversarial

This mode uses two isolated Claude CLI contexts:

- A generator-side claude -p process proposes concrete acceptance criteria for the generated skill.
- An evaluator-side claude -p process negotiates the contract, runs skill-cli validate, skill-cli test, and skill-cli verify, and writes .adversarial/diff-report.json.

The evaluator fails the run when critical issues are found or the score is below 0.80. Deterministic gate failures are always converted into critical diff items and cap the score below 0.60.
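
The pass/fail rule above can be sketched as a small function. This is an illustration, not the evaluator's actual code: the names are hypothetical, and the cap value of 0.59 is an assumption standing in for "below 0.60".

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.80
# Assumption: any value strictly below 0.60 would satisfy the stated cap.
GATE_FAILURE_CAP = 0.59


@dataclass
class Verdict:
    score: float
    critical_count: int
    passed: bool


def judge(raw_score: float, critical_count: int, gate_failures: int) -> Verdict:
    """Apply the gate cap, then the critical-issue and threshold rules."""
    score = raw_score
    if gate_failures > 0:
        # Each deterministic gate failure becomes a critical diff item.
        critical_count += gate_failures
        score = min(score, GATE_FAILURE_CAP)
    passed = critical_count == 0 and score >= PASS_THRESHOLD
    return Verdict(score=score, critical_count=critical_count, passed=passed)
```

For example, `judge(0.95, 0, 1)` fails even though the raw score is high, because the gate failure both adds a critical item and caps the score.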

Publish to npm

1. Set repository.url in package.json to your GitHub repo (e.g. git+https://github.com/your-org/skill-cli.git).
2. Build binaries per platform before publishing the npm package:
   ```
   bash scripts/build-all.sh
   ```
3. Optionally attach the same binaries to a GitHub Release with these names:
   - skill-cli-darwin-arm64, skill-cli-darwin-x64
   - skill-cli-linux-x64, skill-cli-linux-arm64
   - skill-cli-win32-x64.exe
4. Publish the package:

npm login --registry=https://registry.npmjs.org/
npm publish --access public --registry=https://registry.npmjs.org/ --userconfig ~/.npmrc

Users who npm install -g ancoder-skill-cli get a fully bundled package. No extra binary download is required during install.

Test-Driven Skill Development (100:10:1 Architecture)

skill-cli adopts a test-driven approach to skill development, inspired by oh-my-claudecode's multi-agent orchestration patterns. The core principle: invest the majority of compute in building robust test skills, not the skill itself.

Time Allocation: 100:10:1

When creating a skill for a task, the system simultaneously creates a main skill and a test skill:

| Phase | Time Share | Purpose |
|-------|-----------|---------|
| Test skill development | 90% (100 units) | Build an automated evaluator that compares expected vs actual output, locating specific differences |
| Main skill development | 9% (10 units) | Implement the actual skill, guided by test skill feedback |
| Execution & verification | 1% (1 unit) | Final end-to-end smoke test |
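
The percentages in the table are rounded. A quick calculation shows how the 100:10:1 ratio maps to shares of the total; the dictionary keys are illustrative labels only.

```python
# Units per phase, taken from the 100:10:1 ratio.
ALLOCATION = {"test_skill": 100, "main_skill": 10, "verification": 1}


def shares(weights: dict[str, int]) -> dict[str, float]:
    """Convert ratio units into percentages of the total, to one decimal."""
    total = sum(weights.values())
    return {name: round(100 * units / total, 1) for name, units in weights.items()}


print(shares(ALLOCATION))
# {'test_skill': 90.1, 'main_skill': 9.0, 'verification': 0.9}
```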

Architecture

Phase 1: Test Skill Development (90% compute)
  generate structured acceptance criteria
  -> N planners generate test strategies in parallel
  -> critic reviews + eliminates weak strategies
  -> N executors implement test skills in parallel
  -> golden test evaluation (tournament selection)
  -> repeat until precision threshold met
  -> best test skill selected

Phase 2: Main Skill Development (9% compute)
  generate main skill
  -> test skill verifies (independent executor)
  -> structured diff feedback injected into next prompt
  -> repeat until test skill passes
  -> main skill complete

Phase 3: Final Verification (1% compute)
  end-to-end smoke test

Key Design Principles

1. Separation of Author and Reviewer

The agent that generates the main skill and the agent that runs the test skill operate in separate contexts. This prevents self-approval bias. The verify phase spawns an independent executor to run the test skill, ensuring honest evaluation (borrowed from OMC's verifier lane pattern).

2. Structured Diff Feedback

Test skills output structured diff reports instead of simple pass/fail:

diffs:
  - location: "page 3, paragraph 2"
    type: "content_loss"
    severity: "critical"
    expected: "table with 3 columns and 5 rows"
    actual: "table missing entirely"
  - location: "page 5, heading"
    type: "format_drift"
    severity: "warning"
    expected: "## Second-level heading"
    actual: "### Third-level heading"

This structured feedback is injected back into the main skill's improvement loop, enabling targeted fixes rather than blind retries.
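
One way such a report could be turned into prompt feedback is sketched below. The field names come from the example report above; the `DiffItem` class, the severity ordering, and the message wording are assumptions for illustration.

```python
from dataclasses import dataclass

# Assumed ordering: lower number means more urgent.
SEVERITY_ORDER = {"critical": 0, "warning": 1}


@dataclass(frozen=True)
class DiffItem:
    location: str
    type: str
    severity: str
    expected: str
    actual: str


def render_feedback(diffs: list[DiffItem]) -> str:
    """Format diffs as a numbered fix list, most severe first."""
    ordered = sorted(
        diffs, key=lambda d: SEVERITY_ORDER.get(d.severity, len(SEVERITY_ORDER))
    )
    lines = ["Fix the following differences, most severe first:"]
    for i, d in enumerate(ordered, start=1):
        lines.append(
            f"{i}. [{d.severity}] {d.type} at {d.location}: "
            f"expected {d.expected!r}, got {d.actual!r}"
        )
    return "\n".join(lines)


EXAMPLE_DIFFS = [
    DiffItem("page 5, heading", "format_drift", "warning",
             "## Second-level heading", "### Third-level heading"),
    DiffItem("page 3, paragraph 2", "content_loss", "critical",
             "table with 3 columns and 5 rows", "table missing entirely"),
]
```

`render_feedback(EXAMPLE_DIFFS)` lists the critical `content_loss` item before the `format_drift` warning, so the next iteration addresses the most damaging difference first.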

3. QA Cycling with Early Exit

Borrowed from OMC's UltraQA pattern:

- Test skill finds issues -> structured diagnosis -> main skill fixes -> retest -> loop
- Same error appearing 3 times triggers early exit (avoids infinite compute burn)
- Maximum 5 QA cycles per iteration
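
The cycle limits above can be sketched as a bounded loop. The callables `run_test` and `apply_fix` are hypothetical stand-ins for the test skill and the main skill's fix step, and representing an error as a plain string signature is an assumption.

```python
from collections import Counter
from typing import Callable, Optional

MAX_CYCLES = 5   # maximum QA cycles per iteration
MAX_REPEATS = 3  # the same error seen this many times triggers early exit


def qa_loop(
    run_test: Callable[[], Optional[str]],
    apply_fix: Callable[[str], None],
) -> str:
    """Run test/fix cycles until pass, repeated error, or cycle limit."""
    seen: Counter = Counter()
    for _ in range(MAX_CYCLES):
        error = run_test()  # None means the test skill found no issues
        if error is None:
            return "passed"
        seen[error] += 1
        if seen[error] >= MAX_REPEATS:
            return "early_exit"
        apply_fix(error)
    return "max_cycles"
```

The function returns `"passed"`, `"early_exit"`, or `"max_cycles"`, so the caller can tell a stuck loop apart from one that simply ran out of cycles.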

4. Tournament Selection for Test Skills

During the 90% test skill development phase, multiple test strategies are generated in parallel and evaluated against golden tests (known-correct input/output pairs). The strategy with the highest detection precision wins, similar to OMC's self-improve tournament selection.
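
A minimal sketch of the selection step follows. This README does not define "detection precision", so the sketch assumes the standard definition: of the issues a strategy flags on the golden tests, the fraction that are real. The function names and the defect-ID representation are hypothetical.

```python
def precision(flagged: set[str], true_defects: set[str]) -> float:
    """Fraction of flagged issues that are real; 0.0 if nothing was flagged."""
    if not flagged:
        return 0.0
    return len(flagged & true_defects) / len(flagged)


def select_winner(
    results: dict[str, set[str]], true_defects: set[str]
) -> tuple[str, float]:
    """Return the strategy with the highest precision and its score."""
    scored = {
        name: precision(flagged, true_defects) for name, flagged in results.items()
    }
    winner = max(scored, key=scored.get)  # ties go to the first strategy listed
    return winner, scored[winner]


# Hypothetical golden-test outcome: IDs of defects each strategy flagged.
TRUE_DEFECTS = {"t1", "t2", "t3"}
RESULTS = {
    "strategy_a": {"t1", "x9"},  # one real defect, one false alarm
    "strategy_b": {"t1", "t2"},  # two real defects, no false alarms
    "strategy_c": set(),         # flagged nothing
}
```

With this data, `select_winner(RESULTS, TRUE_DEFECTS)` picks `strategy_b`. Precision alone ignores missed defects, so a real tournament might also weigh recall; that choice is outside what this README specifies.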

5. PRD-Driven Acceptance Criteria

Test skills define concrete, testable acceptance criteria rather than vague statements such as "implementation is complete":

Bad:  "PDF conversion works correctly"
Good: "All tables with merged cells are preserved as HTML <table> blocks
       with correct colspan/rowspan attributes"

Example: PDF-to-Markdown Skill

For a PDF-to-Markdown conversion skill:

- Test skill (100 min): Compares original PDF content with generated Markdown, detecting content loss (missing paragraphs, tables, images), format drift (heading levels, list styles), and encoding issues. Outputs structured diffs with page/paragraph-level location info.
- Main skill (10 min): Implements PDF parsing and Markdown generation, iteratively improved by test skill feedback.
- Verification (1 min): End-to-end smoke test on fixture PDFs.

`skill_eval` Check Type

The verify system supports a skill_eval check type that invokes a test skill as a verification oracle:

checks:
  - id: quality-check
    type: skill_eval
    skill: pdf-to-md-test
    config:
      threshold: 0.95
      output_format: structured_diff

Verify Phase: Independent Executor

During the loop's verify phase, a separate Claude executor is spawned to run the test skill. This executor:

- Has no shared context with the main skill's executor
- Produces an objective evaluation report
- Returns structured diff feedback that feeds into the next iteration

This mirrors OMC's principle: "Keep authoring and review as separate passes."

oh-my-claudecode (OMC) Integration

skill-cli embeds the full oh-my-claudecode multi-agent orchestration bundle (synced from GitHub release v4.13.6) and installs it into ~/.claude/omc/ by default. This gives any skill-cli user a single-command path to OMC's agents, skills, hooks, and runtime scripts without needing to clone the OMC repo or configure the plugin marketplace separately.

What gets installed

When you run skill-cli install, OMC is installed alongside ECC components:

| OMC asset | Install target |
|-----------|---------------|
| 19 agents (analyst, architect, executor, planner, critic, verifier, …) | ~/.claude/omc/agents/ |
| 38 skills (autopilot, ralph, ralplan, deep-interview, team, ultrawork, ultraqa, self-improve, …) | ~/.claude/omc/skills/ |
| Runtime scripts (hook helpers, session lifecycle, skill injector, …) | ~/.claude/omc/scripts/ (executable bit preserved for .sh/.mjs/.cjs/.js/.ts) |
| hooks.json | Merged into ~/.claude/settings.json with $CLAUDE_PLUGIN_ROOT rewritten to the absolute OMC install path |
| Templates | ~/.claude/omc/templates/ |
| .claude-plugin/ manifest, LICENSE, CHANGELOG, VERSION | ~/.claude/omc/ |

Flags

# Default — installs ECC + OMC
skill-cli install

# Skip OMC entirely (opt-out)
skill-cli install --no-omc

# Install only the OMC bundle
skill-cli install --component omc

# Preview without writing files
skill-cli install --dry-run

Browse embedded OMC content

skill-cli list --type omc       # list embedded OMC agents and skills
skill-cli info autopilot        # show the autopilot skill content
skill-cli doctor                # verify OMC install health and version

Why the hook rewrite matters

OMC hooks are authored for the Claude Code plugin system and reference scripts via $CLAUDE_PLUGIN_ROOT/scripts/.... Because skill-cli installs OMC as a plain directory (not as a marketplace plugin), the installer rewrites $CLAUDE_PLUGIN_ROOT → ${claudeDir}/omc at merge time so hooks resolve correctly without the plugin loader.
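
The rewrite can be pictured as a recursive string substitution over the hooks JSON. The sketch below is an approximation of the idea, not the installer's Go code; the script name `example.mjs` and the install path are hypothetical.

```python
from typing import Any

PLACEHOLDER = "$CLAUDE_PLUGIN_ROOT"


def rewrite_plugin_root(node: Any, install_path: str) -> Any:
    """Return a copy of a JSON-like tree with the placeholder replaced."""
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, install_path)
    if isinstance(node, list):
        return [rewrite_plugin_root(item, install_path) for item in node]
    if isinstance(node, dict):
        return {key: rewrite_plugin_root(value, install_path) for key, value in node.items()}
    return node  # numbers, booleans, and null pass through unchanged


# Hypothetical hooks fragment; the script name is made up for illustration.
EXAMPLE_HOOKS = {
    "hooks": {
        "SessionStart": [
            {"hooks": [{"type": "command",
                        "command": "node $CLAUDE_PLUGIN_ROOT/scripts/example.mjs"}]}
        ]
    }
}

REWRITTEN = rewrite_plugin_root(EXAMPLE_HOOKS, "/home/user/.claude/omc")
```

After the rewrite, the command reads `node /home/user/.claude/omc/scripts/example.mjs`, which resolves without the plugin loader setting `$CLAUDE_PLUGIN_ROOT`.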

If you already have OMC installed via the Claude Code plugin marketplace, skill-cli install places a separate, self-contained copy under ~/.claude/omc/ and does not touch the marketplace install. The two copies can coexist; hooks from both sources fire in sequence.

Upgrading OMC

The embedded OMC version is pinned to the release tagged in embedded/omc/VERSION. To bump it, re-run the sync workflow that downloads a fresh GitHub release tarball into embedded/omc/ and rebuild.

Meta-Harness (experimental)

meta-harness/ is a Python sub-project that implements the outer-loop harness optimizer from arXiv:2603.28052 (Stanford, 2026).

Architecture

meta-harness search   ←  outer loop (Python, Claude Code proposer)
      │
      └─ skill-cli eval validate / run / ls / diff   ←  evaluator backend (Go)
                │
                └─ harness.py (user-supplied Python)  ←  inner execution layer

Two independent binaries — intentionally decoupled:

- skill-cli knows nothing about meta-harness; it only runs harness candidates and emits scores/traces.
- meta-harness knows nothing about OMC internals; it calls skill-cli via CLI contract only.

Quick start

# Build skill-cli
go build -o bin/skill-cli .

# Install meta-harness
cd meta-harness
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Run smoke test (no API key needed)
cd ..
bash scripts/meta-harness-smoke.sh

# Real search (requires ANTHROPIC_API_KEY + claude CLI)
meta-harness search \
  --suite meta-harness/domains/text_classification/suite.yaml \
  --out search-runs/run-01 \
  --max-iter 5 \
  --k 2 \
  --seed meta-harness/domains/text_classification/seeds/zero_shot.py \
  --seed meta-harness/domains/text_classification/seeds/few_shot.py \
  --skill-cli bin/skill-cli \
  --samples 20

CLI contract (skill-cli eval)

| Command | Description |
|---|---|
| `skill-cli eval validate <dir>` | Cheap structural check (exit 0 = valid) |
| `skill-cli eval run <dir> --suite <f> --out <d>` | Full eval → scores.json + traces/ |
| `skill-cli eval ls --store <d> [--pareto]` | List / filter candidates |
| `skill-cli eval diff <a> <b> --store <d>` | Code + score diff |
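
A caller that respects this contract only needs to build the command line and read the results. The sketch below shows one possible shape; the helper names are hypothetical, and it assumes `scores.json` is written directly inside the `--out` directory, which the table does not state explicitly.

```python
import json
import subprocess
from pathlib import Path


def eval_validate_argv(skill_cli: str, candidate_dir: str) -> list[str]:
    """Command line for the cheap structural check."""
    return [skill_cli, "eval", "validate", candidate_dir]


def eval_run_argv(skill_cli: str, candidate_dir: str, suite: str, out_dir: str) -> list[str]:
    """Command line for a full evaluation run."""
    return [skill_cli, "eval", "run", candidate_dir, "--suite", suite, "--out", out_dir]


def is_valid(skill_cli: str, candidate_dir: str) -> bool:
    """Per the contract, exit code 0 means the candidate is valid."""
    return subprocess.run(eval_validate_argv(skill_cli, candidate_dir)).returncode == 0


def run_candidate(skill_cli: str, candidate_dir: str, suite: str, out_dir: str) -> dict:
    """Run a full eval and load the scores it wrote."""
    subprocess.run(eval_run_argv(skill_cli, candidate_dir, suite, out_dir), check=True)
    # Assumption: scores.json sits at the top level of out_dir.
    return json.loads((Path(out_dir) / "scores.json").read_text(encoding="utf-8"))
```

Passing arguments as a list avoids shell quoting issues with paths that contain spaces.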

Tuning

The meta-harness/src/meta_harness/skill.md file is the most important lever on search quality. Per Appendix D of the paper: run 3–5 short iterations (--max-iter 3) specifically to debug and refine it before committing to a full run.

License

MIT
