@dzhechkov/harness-cli

v0.3.89

The dz CLI — install AI skills for Claude Code, Codex, OpenCode, Hermes. 11 commands, 7 presets, 4 platform adapters.

The dz CLI — the main entry point to the DZ Harness Hub. Install AI skills for Claude Code, Codex, OpenCode, Hermes, OpenClaude, GitHub Copilot from a single command.

Why dz?

dz is a package manager + cross-compiler for your AI agent harness. Write a skill once in one canonical form; dz installs it into any agent's harness, holds it to a quality bar, and lets the harness learn over time.

The problem. You accumulate ~117 skills (design-thinking, QE, devops, web3, MCP, academic…). Five pains follow:

  1. Every agent wants a different layout. Claude Code reads .claude/skills/, Codex .codex/, OpenCode/Hermes/OpenClaude their own. Hand-maintaining N copies is sync hell.
  2. Skills arrive from many upstream repos — they must be canonicalized (brought to one form) and kept in sync without losing provenance.
  3. It's hard to know which skill to reach for out of a hundred.
  4. Quality drifts — there's no single bar.
  5. Experience doesn't accumulate — the harness doesn't learn from feedback.

The answer — one canon → many platforms. There is a single source of truth (a CanonicalSkill); dz compiles it for each target — so the same skill drops into .claude/skills/, .codex/, etc. without hand-copying.

Every command maps to one of five jobs:

| Job | Commands | What it does |
|-----|----------|--------------|
| Author / canonicalize | auto-canonicalize, sync-upstream, diff, create-skill | pull a skill from any repo into one canonical form + keep it in sync with upstream |
| Install / assemble | init, setup, install, compose, presets, upgrade | deploy the right set of skills into a chosen agent harness (6 targets) |
| Find / recommend | registry, scout, recommend, skill-advisor | for a task, suggest which skill / preset / package to use |
| Guarantee quality | benchmark (L0 A–F), verify, doctor | one bar: 20 deterministic checks per skill |
| Learn | teach, pretrain, roam (reward-learning) | accumulate patterns, improve recommendations over time |

(+ ops: publish, stats, downloads, dashboard, plugin.)

Analogy: npm for distribution, a compiler / Babel for one source → many targets (adapters for 6 agents), and a linter / CI for a quality bar (benchmark) — but for AI agent skills, not ordinary code.

Install

npm install -g @dzhechkov/harness-cli

# Update to latest version (run from outside any workspace project):
cd /tmp && npm install -g @dzhechkov/harness-cli@latest

Note: If you get EUNSUPPORTEDPROTOCOL workspace:*, you're inside a pnpm/yarn workspace. Run the install from /tmp or ~ instead.

User Journey — from install to mastery

All 32 commands mapped to a real workflow:

DISCOVER → INSTALL → USE → CREATE → MAINTAIN → SHARE

Phase 1: Discover (what's available?)

npm install -g @dzhechkov/harness-cli    # install the CLI

dz help                                   # see all commands
dz pretrain                                # analyze project files → recommend by tech stack
dz recommend "build API and deploy to K8s" # keyword match → skills + toolkits
dz recommend "work on this project"        # generic? → auto-runs pretrain → recommends by stack
dz stats                                  # 33 packages, 117 skills, 6 targets, 11 presets
dz dashboard                              # visual panel — packages, adapters, skill packs
dz registry                               # browse all 117 skills by category
dz registry search kubernetes             # find specific skills
dz registry --category devops             # filter by domain
dz downloads                              # npm weekly download stats

Phase 2: Install (set up your workspace)

# Full setup with self-learning (recommended):
dz setup --target claude-code --preset devops  # pretrain + hooks + JSONL memory

# With AgentDB vector memory (semantic search + self-learning):
dz setup --target claude-code --preset devops --memory agentdb  # .rvf + 41 MCP tools

# Or just install skills (no learning):
dz init --target claude-code --preset devops   # 28 DevOps skills
dz init --target openclaude --preset web3      # 12 DeFi skills for OpenClaude
dz init --target codex --preset mcp            # 16 MCP skills for Codex

# Or pick individual skills:
dz init --target claude-code --select terraform,kubernetes,docker-compose

# Or install from any npm package:
dz install @dzhechkov/skills-devops            # npm install + copy skills

# Verify everything is correct:
dz verify                                       # structural validation
dz doctor                                       # 7 health checks
dz list                                         # show installed skills
dz info --id terraform                          # detailed info about a skill

Phase 3: Use (work with your agent)

# Now use Claude Code / Codex / OpenCode / Hermes normally.
# Skills are auto-discovered from the platform's skills directory.
# Example in Claude Code:
#   "Review this PR" → pr-review skill activates
#   "Design an API" → api-design skill activates
#   "Fix this CI" → ci-fix skill activates

Phase 4: Create (build your own skills)

# Scaffold a new skill:
dz create-skill --name my-skill --description "What it does" --tier 2

# With BTO-compatible eval templates:
dz create-skill --name my-skill --bto

# Benchmark your skill (aim for Grade A):
dz benchmark .claude/skills/my-skill           # single skill — 20 L0 checks
dz benchmark packages/@dzhechkov/skills-devops --all   # batch all
dz benchmark skill-a --compare skill-b          # A/B compare

# Find skills to canonicalize from the ecosystem:
dz scout                                        # scan 9 sources (GitHub, npm+plugins, HN, ...)
dz scout --deep                                 # deep analysis with SKILL.md parsing
dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Phase 5: Maintain (keep skills fresh)

# Check for upstream changes (canonicalized skills):
dz sync-upstream --list                                 # which packages have external sources?
dz sync-upstream --all                                  # check all against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one

# Check installed skills vs canonical:
dz upgrade                                      # shows which skills need update
dz upgrade --target openclaude                  # check specific platform

# Sync canonical to legacy layout:
dz sync                                         # canonical → project skills
dz migrate                                      # detect legacy installations

# Orchestrate dynamic workflows:
dz workflow --task coverage-lift                 # parallel coverage improvement
dz workflow --task security-audit               # adversarial security scan

# Cross-host state sync:
dz roam --apply                                 # sync agent state across machines

Phase 6: Share (publish to the world)

# Publish updated packages to npm:
dz publish --dry-run                            # preview
dz publish --filter skills-devops               # publish specific package
dz publish                                      # publish all changed packages

Three Ways to Install Skills

| | Individual Skill | Preset | npx Package |
|---|---|---|---|
| What | 1 SKILL.md file | Curated list of skill names | Full toolkit with orchestration |
| Contains | Instructions for 1 task | N skill references | Skills + commands + rules + shards + agents + memory |
| Pipeline | No | No | Yes (phases, checkpoints, governance) |
| Self-learning | No | dz setup adds it | Built-in |
| Install | dz init --select X | dz setup --preset X | npx @dzhechkov/X init |
| Example | terraform | devops (28 skills) | keysarium (7-phase research) |

# One skill:
dz init --target claude-code --select design-thinking

# Curated set by topic (recommended):
dz setup --target claude-code --preset meta          # 16 development skills + self-learning

# Full toolkit with orchestrated pipeline:
npx @dzhechkov/keysarium init                        # 7-phase research + commands + memory

When to use which:

  • Need 1 specific capability → --select
  • Need a themed set that works together → --preset
  • Need a full pipeline with commands and governance → npx

Available Presets (11)

| Preset | Skills | Description |
|--------|--------|-------------|
| meta | 16 | Development process (explore, goap-research, problem-solver, design-thinking, feature-adr, knowledge-extractor, understand-anything-bridge, agentshield-scan, adversarial-verifier, skill-advisor) |
| qe-engineer | 20 | Quality engineering (test-gen, coverage, chaos, defect, ...) |
| bto | 1 | Build-Benchmark-Test-Optimize pipeline |
| health | 8 | Medical AI (diagnostics, drugs, labs, clinical decisions) |
| keysarium | 9 | Full research toolkit (feature-adr, presentation, reverse-eng) |
| p-replicator | 10 | AI product development (/replicate, SPARC PRD, pipeline-forge) |
| feature-adr | 5 | Feature pipeline (feature-adr, explore, frontend-design) |
| devops | 28 | DevOps skills (terraform, kubernetes, c4-architecture, incident-response, problem-management, risk-assessment, ...) |
| web3 | 12 | Web3/DeFi (quicknode, zerion, symbiosis, bankr, veil, neynar, ...) |
| mcp | 16 | MCP servers (agentdb, brave-search, gmail, gitlab, comfyui, notion, ...) |
| academic | 5 | Thesis defense (review, questions, doc-check, live defense + answer eval) |

Standalone Packages (install via npx, no dz CLI needed)

| Package | Install | What it does |
|---------|---------|-------------|
| @dzhechkov/keysarium | npx @dzhechkov/keysarium init | Full 7-phase research toolkit |
| @dzhechkov/design-thinking | npx @dzhechkov/design-thinking init | d.school 6-phase Design Thinking (8 skills) |
| @dzhechkov/trip-planner | npx @dzhechkov/trip-planner init | Travel itinerary → interactive mobile site (pending publish) |
| @dzhechkov/p-replicator | npx @dzhechkov/p-replicator init | AI product development (/replicate pipeline) |
| @dzhechkov/health-advisor | npx @dzhechkov/health-advisor init | Medical AI (25 skills) |
| @dzhechkov/skills-bto | npx @dzhechkov/skills-bto init | BTO benchmarking (Build-Test-Optimize) |
| @dzhechkov/skills-feature-adr | npx @dzhechkov/skills-feature-adr init | 11-step feature pipeline |
| @dzhechkov/skills-edu-site | npx @dzhechkov/skills-edu-site init | Gamified edu site generator |
| @dzhechkov/skills-transcript-site | npx @dzhechkov/skills-transcript-site init | Transcript → interactive site |
| @dzhechkov/skills-analyst-manual | npx @dzhechkov/skills-analyst-manual init | 3-phase analyst composite |

Difference: dz init --preset installs individual skills from .claude/skills/ source into a target platform tree. Standalone npx packages have their own CLI and install a complete toolkit with commands, rules, shards, and agents — a richer but self-contained experience.

A skill and its npx toolkit are not duplicates — they're a graduation. Several skills (e.g. feature-adr, design-thinking) exist BOTH as a skill inside a dz preset AND as a standalone npx package. The preset's SKILL.md is fully functional on its own (the whole methodology — modules + references — travels with it, and it auto-activates by description), and it's the only way to compile that capability to the non-Claude platforms (Codex/OpenCode/Hermes/OpenClaude) via dz. The npx package adds project-level runtime governance around the same skill: a slash command, governance rules, a context shard, and (for feature-adr) reward-learning + /harvest. So: pick the skill/preset for a working capability across platforms; pick the npx toolkit when you want it as a governed, command-driven fixture of one project.

All Commands (32)

dz setup             --target <name> [--preset <name>] [--memory agentdb] [--no-hooks] [--install-driver] [--force]
dz init              --target <name> [--preset <name>] [--select id,id,...] [--force]
dz install           <npm-pkg> [--target <name>] [--project <dir>]
dz teach             "<pattern>" [--reward <0-1>] [--domain <name>]
dz pretrain          [--project <dir>]
dz recommend         "<task description>"
dz compose           <preset1+preset2+...> [--target <name>]
dz diff              <skill-dir>
dz upgrade           [--target <name>] [--project <dir>]
dz verify            [--skills-dir <dir>] [--target <name>]
dz sync              [--canonical <dir>] [--project <dir>] [--dry-run] [--force]
dz update            (alias for sync)
dz list              [--skills-dir <dir>]
dz info              --id <skill-id> [--skills-dir <dir>]
dz create-skill      --name <id> [--description <text>] [--tier 1|2|3] [--bto]
dz registry          [search <query>] [--category <cat>]
dz benchmark         <skill-dir> [--compare <dir>] [--all]
dz mcp-scan          [path] [--json]   (static agent-permission audit; exit 0/1/2 = clean/medium/high)
dz publish           [--filter <name>] [--dry-run] [--bump-only]
dz auto-canonicalize --source <github-url> --pack <skills-pack>
dz sync-upstream     [--package <dir>] [--list] [--all]
dz scout             [--topics <list>] [--since <date>] [--deep]
dz workflow          --task <name> [--dry-run]
dz plugin            [--version <ver>]
dz downloads
dz migrate           [--project <dir>]
dz stats
dz dashboard
dz doctor            [--project <dir>]
dz roam              [--apply] [--slug <slug>]
dz import-ecc       [--local-path <dir>] [--select id,id,...] [--limit N] [--output <dir>] [--force]
dz help

Targets (5 platforms)

All 5 platforms natively support the agentskills.io SKILL.md format:

| Target | Skills directory | Native SKILL.md? |
|--------|-----------------|:---:|
| claude-code | .claude/skills/ | Yes |
| codex | .agents/skills/ | Yes (docs) |
| opencode | .opencode/skills/ | Yes (also scans .claude/skills/) |
| hermes | .hermes/skills/ | Yes |
| openclaude | .openclaude/skills/ | Yes |

Same SKILL.md file, different directory — no format conversion needed.
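
As a rough illustration of "same file, different directory", the table above can be read as a lookup from target name to skills directory. The sketch below is not dz's implementation: the helper name `skill_install_path` and the `<skills-dir>/<skill-id>/SKILL.md` layout are assumptions made for the example; only the directory names come from the table.

```python
from pathlib import PurePosixPath

# Skills directory per target, copied from the table above.
SKILLS_DIRS = {
    "claude-code": ".claude/skills",
    "codex": ".agents/skills",
    "opencode": ".opencode/skills",
    "hermes": ".hermes/skills",
    "openclaude": ".openclaude/skills",
}


def skill_install_path(target: str, skill_id: str) -> PurePosixPath:
    """Return where a skill's SKILL.md would land for a target (hypothetical helper)."""
    if target not in SKILLS_DIRS:
        raise ValueError(f"unknown target: {target}")
    # Assumed layout: one directory per skill, holding its SKILL.md.
    return PurePosixPath(SKILLS_DIRS[target]) / skill_id / "SKILL.md"


for name in ("claude-code", "codex"):
    print(skill_install_path(name, "terraform"))
```

The file content is identical in both cases; only the prefix changes with the target.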

Optional platform enrichment (skills work without these):

| Platform | Optional extra | What it adds |
|----------|---------------|-------------|
| Codex | agents/openai.yaml | UI metadata (icons, display_name, MCP deps) |
| OpenCode | opencode.json + .opencode/agents/*.md | Config, custom agents |
| Hermes | cli-config.yaml | Agent config, persona, memory |

Workflows (Opus 4.8+ dynamic workflows)

dz workflow --task coverage-lift     # parallel coverage improvement
dz workflow --task mutation-kill     # kill surviving mutants
dz workflow --task canonicalize      # canonicalize new packages
dz workflow --task security-audit    # adversarial security scan

Scout (ecosystem intelligence)

dz scout                              # quick scan — radar mode
dz scout --deep                       # deep analysis — AI analyst mode
dz scout --topics mcp-server,ai-agent # custom topics
dz scout --since 2026-05-01           # only recent repos

Radar mode (dz scout) scans 9 sources in parallel (GitHub + npm + HN + MCP Registry + Glama + OSSInsight + Smithery + Semantic Scholar + arXiv):

  1. Detects skill format — SKILL.md, plugin.json, .claude/skills/, .claude-plugin/, MCP manifests
  2. Scores relevance — format (40%) + stars (30%) + recency (20%) + novelty (10%)
  3. Compares against our 32 packages — finds skills we don't have
  4. Recommends — integrate (score ≥70) / monitor (40-69 + ≥50 stars) / skip
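
The scoring and recommendation steps above can be sketched as a weighted sum followed by two thresholds. This is only an illustration of the stated weights and cut-offs, not the scout's actual code: the function names are hypothetical, and it assumes each sub-score is already normalized to a 0-100 scale.

```python
def relevance_score(fmt: float, stars: float, recency: float, novelty: float) -> float:
    """Weighted sum of four sub-scores, each assumed to be on a 0-100 scale."""
    # Weights from the list above: format 40%, stars 30%, recency 20%, novelty 10%.
    return (40 * fmt + 30 * stars + 20 * recency + 10 * novelty) / 100


def recommend(score: float, star_count: int) -> str:
    """Map a relevance score and raw star count to integrate / monitor / skip."""
    if score >= 70:
        return "integrate"
    if score >= 40 and star_count >= 50:
        return "monitor"
    return "skip"


score = relevance_score(fmt=100, stars=50, recency=50, novelty=0)
print(score, recommend(score, star_count=120))
```

Note that a mid-range score alone is not enough for "monitor"; the repo also needs at least 50 stars.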

Deep analyst mode (dz scout --deep) goes further for top-scored repos:

  1. Downloads SKILL.md from each repo, parses frontmatter + body
  2. Finds closest match in our inventory by keyword overlap
  3. Explains the delta — what the found skill adds that ours doesn't
  4. Recommends integration path:
    • canonicalize — high-signal novel skill → new @dzhechkov/skills-* pack
    • merge — similar to existing skill → add unique features to ours
    • new-preset — novel skill → add to preset or create new pack
    • skip — already in our inventory
  5. Gap analysis — identifies trending categories across the ecosystem that our harness lacks
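
Step 2 above matches by "keyword overlap" without naming a metric. One plausible reading is Jaccard similarity over the words of each description; the sketch below assumes that metric and a simple lowercase tokenizer, and the inventory entries are made up for the example.

```python
import re
from typing import Dict, Optional, Set, Tuple


def keywords(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def closest_match(description: str, inventory: Dict[str, str]) -> Tuple[Optional[str], float]:
    """Return (skill_id, Jaccard overlap) of the best inventory match, or (None, 0.0)."""
    found = keywords(description)
    best_id, best = None, 0.0
    for skill_id, inv_description in inventory.items():
        ours = keywords(inv_description)
        union = found | ours
        overlap = len(found & ours) / len(union) if union else 0.0
        if overlap > best:
            best_id, best = skill_id, overlap
    return best_id, best


# Hypothetical inventory descriptions, for illustration only.
inventory = {
    "brutal-honesty-review": "honest code review with security checklist",
    "terraform": "provision cloud infrastructure",
}
print(closest_match("Automated OWASP-focused review of code", inventory))
```

A found skill with no overlapping inventory entry returns `(None, 0.0)`, which corresponds to the "—" shown in the Closest match column.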

Example deep analysis output:

## 🔬 Deep Analysis

### cool/agent-toolkit (★500)
2/3 skills are novel

| Skill | Description | Closest match | Integration | Rationale |
|-------|------------|---------------|-------------|-----------|
| code-review | Automated OWASP-focused review | brutal-honesty-review | **merge** | Similar to ours — merge OWASP checklist |
| deploy-check | Pre-deploy validation gates | — | **canonicalize** | High-signal novel skill (500 stars) |

## 📊 Harness Gap Analysis

| Category | Frequency | Recommendation |
|----------|-----------|---------------|
| deploy-automation | 12 repos | Create @dzhechkov/skills-devops — high demand |
| data-pipeline | 5 repos | Monitor — emerging trend |

Powered by @dzhechkov/scout.

BTO integration (create-skill --bto)

# Scaffold a new skill with BTO-compatible 3-layer evaluation:
dz create-skill --name my-skill --bto

# What you get:
#   evals/my-skill.yaml       — BTO eval with L0/L1/L2 layers
#   references/judge-rubrics.md — scoring rubrics for 3-judge panel

The --bto flag generates eval templates compatible with /bto-test:

| Layer | What | Gate |
|-------|------|------|
| L0 | Deterministic checks (U1-U5 universal + S1-S15 skill-specific) | Pass rate >= 80% |
| L1 | Single LLM judge (Haiku); 5 dimensions: Clarity, Completeness, Actionability, Quality, Anti-patterns | Average >= 7.0 |
| L2 | 3-judge panel (Sonnet): Expert (0.40), Critic (0.30), Auditor (0.30); 5 dimensions: Methodology, Depth, Correctness, Usability, Robustness | Weighted avg >= 7.0 |

After scaffolding, fill in the SKILL.md protocol and run /bto-test .claude/skills/my-skill to evaluate.
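
The three gates can be expressed as small predicates. The sketch below only restates the thresholds and judge weights from the table; it is not the /bto-test implementation. The function names are hypothetical, and it assumes each L2 judge reports a single 0-10 score (for example, its own average over the five dimensions).

```python
from typing import Dict, List

# Judge weights for the L2 panel, from the table above.
L2_WEIGHTS = {"expert": 0.40, "critic": 0.30, "auditor": 0.30}


def l0_passes(passed: int, total: int) -> bool:
    """L0 gate: at least 80% of deterministic checks pass."""
    return total > 0 and passed / total >= 0.80


def l1_passes(dimension_scores: List[float]) -> bool:
    """L1 gate: the single judge's average across dimensions is at least 7.0."""
    return sum(dimension_scores) / len(dimension_scores) >= 7.0


def l2_weighted(judge_scores: Dict[str, float]) -> float:
    """Weighted average of the three judges' scores."""
    return sum(L2_WEIGHTS[judge] * score for judge, score in judge_scores.items())


def l2_passes(judge_scores: Dict[str, float]) -> bool:
    """L2 gate: weighted panel average is at least 7.0."""
    return l2_weighted(judge_scores) >= 7.0


print(l0_passes(16, 20), l1_passes([8, 7, 7, 6, 8]))
print(l2_passes({"expert": 8, "critic": 7, "auditor": 6}))
```

Because the Expert carries the largest weight, a strong Expert score can offset a weaker Critic or Auditor score, but not the reverse to the same degree.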

dz install — install skills from any npm package

# Install skills from any npm package directly
dz install @dzhechkov/skills-devops
dz install @dzhechkov/skills-web3 --target openclaude
dz install @lythos/skill-curator --target claude-code

Runs npm install, discovers SKILL.md files in the package, copies them to the target platform directory. Works with any agentskills.io-compatible npm package.
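
The discover-and-copy step can be pictured roughly as follows. This is a minimal sketch, not the code dz runs: the helper name `copy_skills` is hypothetical, and it assumes each skill is a directory that contains a SKILL.md and is copied whole, keyed by its directory name.

```python
import shutil
from pathlib import Path
from typing import List


def copy_skills(package_dir: Path, target_dir: Path) -> List[str]:
    """Copy every directory holding a SKILL.md into target_dir; return the skill ids."""
    installed = []
    for skill_md in sorted(package_dir.rglob("SKILL.md")):
        skill_dir = skill_md.parent
        # Assumption: the skill id is the name of the directory that holds SKILL.md.
        destination = target_dir / skill_dir.name
        shutil.copytree(skill_dir, destination, dirs_exist_ok=True)
        installed.append(skill_dir.name)
    return installed
```

With a package unpacked under `node_modules/`, a call such as `copy_skills(Path("node_modules/@dzhechkov/skills-devops"), Path(".claude/skills"))` would mirror the effect described above for the claude-code target.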

dz sync-upstream — check for upstream updates

dz sync-upstream --list                                    # show packages with external sources
dz sync-upstream --all                                     # check ALL packages against upstream
dz sync-upstream --package packages/@dzhechkov/skills-devops  # check one package

Discovers all skill packs with sources.json, fetches SKILL.md from origin repos, reports which skills have upstream changes.

dz upgrade — check installed skills for updates

dz upgrade                           # check .claude/skills/ against canonical
dz upgrade --target openclaude       # check .openclaude/skills/

Compares installed skills with canonical source, reports which need dz init --force to update.

dz downloads — npm weekly download stats

dz downloads     # fetch weekly downloads for all 31 packages

dz benchmark — L0 quality gate

dz benchmark packages/@dzhechkov/skills-devops/terraform     # single skill
dz benchmark packages/@dzhechkov/skills-devops --all          # batch all
dz benchmark skill-a --compare skill-b                        # A/B compare

20 graded deterministic checks (U1-U5 universal + S1-S15 skill-specific) + S16 advisory (capability-declaration nudge, not graded). Grade A = 95%+. For L1/L2 LLM judges, use /bto-test inside Claude Code.
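
To make the "graded vs advisory" distinction concrete, the sketch below computes a pass rate over the 20 graded checks while ignoring S16. It is an illustration under assumptions: results are modeled as a mapping from check id to pass/fail, the function names are hypothetical, and only the Grade A threshold is encoded because the other grade boundaries are not stated here.

```python
from typing import Dict

ADVISORY_CHECKS = {"S16"}  # reported, but not graded


def l0_pass_rate(results: Dict[str, bool]) -> float:
    """Percentage of graded checks that passed (advisory checks are ignored)."""
    graded = {cid: ok for cid, ok in results.items() if cid not in ADVISORY_CHECKS}
    return 100.0 * sum(graded.values()) / len(graded)


def is_grade_a(results: Dict[str, bool]) -> bool:
    """Grade A means a pass rate of 95% or more."""
    return l0_pass_rate(results) >= 95.0


# U1-U5 and S1-S15 graded, S16 advisory; one graded failure, advisory also failing.
results = {f"U{i}": True for i in range(1, 6)}
results.update({f"S{i}": True for i in range(1, 16)})
results["S3"] = False
results["S16"] = False
print(l0_pass_rate(results), is_grade_a(results))
```

With 20 graded checks, a single failure still lands exactly on the 95% line; a second failure drops the skill below Grade A.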

dz mcp-scan — static agent-permission audit

dz mcp-scan .                  # scan a project/pack (default: .)
dz mcp-scan . --json           # machine-readable report

"npm audit for agent tools." Reads (never executes) .claude/settings*.json and .mcp.json/.vscode/mcp.json, then emits a 3-tier verdict with capability-level findings. Exit codes: 0 clean · 1 medium · 2 high (so CI fails on any non-clean surface). Flags: wildcard/shell grants, secrets-reachable (Read + MCP active, no .env deny), hardcoded MCP env secrets, interpreter/package-runner MCP servers, enableAllProjectMcpServers, missing default-deny. Rules adapted from the MetaHarness threat-model.
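
The exit-code contract can be sketched as a worst-severity rule. The mapping of verdicts to exit codes comes from the text above; the assumption that the overall verdict equals the highest severity among the findings, and the function names, are illustrative only.

```python
from typing import List

# Exit codes as documented: 0 clean, 1 medium, 2 high.
EXIT_CODES = {"clean": 0, "medium": 1, "high": 2}


def verdict(severities: List[str]) -> str:
    """Overall verdict, assumed to be the worst severity among the findings."""
    if "high" in severities:
        return "high"
    if "medium" in severities:
        return "medium"
    return "clean"


def exit_code(severities: List[str]) -> int:
    """Process exit code for a list of finding severities."""
    return EXIT_CODES[verdict(severities)]


print(exit_code([]), exit_code(["medium"]), exit_code(["medium", "high"]))
```

Since any non-zero exit fails a typical CI step, both medium and high verdicts stop the pipeline, which is the behavior the parenthetical above describes.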

# Build-time capability reconciliation (project grants vs installed skills' declarations):
dz mcp-scan . --reconcile                  # report under-grant (skill needs a denied capability) + over-grant
dz mcp-scan . --reconcile --emit-policy    # also write .dz/policy/mcp-policy.json (least-privilege, advisory)
dz mcp-scan . --reconcile --fail-on-undergrant   # CI: exit 1 if a skill declares a need the grants forbid

--reconcile is build-time and advisory: dz never enforces; it reports the grant-vs-declaration gap and (with --emit-policy) emits a least-privilege policy for a host to enforce. Under-grant is MEDIUM (the host will starve the skill); over-grant is an advisory CANDIDATE (a grant may be for the operator). Declared limits are reported but inert (settings.json has no timeout field). The verdict is unaffected unless --fail-on-undergrant is set.
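
Conceptually, reconciliation is a pair of set differences between what the project grants and what installed skills declare. The sketch below illustrates that idea only: the function names are hypothetical, the capability names are placeholders, and treating "not granted" as "forbidden" is a simplifying assumption.

```python
from typing import Dict, Set


def reconcile(granted: Set[str], declared: Set[str]) -> Dict[str, Set[str]]:
    """Compare project grants with the capabilities that installed skills declare."""
    return {
        "under_grant": declared - granted,  # skill needs it, project does not grant it
        "over_grant": granted - declared,   # granted, but no skill declares a need
    }


def exit_status(report: Dict[str, Set[str]], fail_on_undergrant: bool = False) -> int:
    """Exit 1 only when under-grants exist and the caller opted in; otherwise 0."""
    return 1 if fail_on_undergrant and report["under_grant"] else 0


# Placeholder capability names, for illustration only.
report = reconcile(granted={"Read", "Bash"}, declared={"Read", "WebFetch"})
print(report, exit_status(report), exit_status(report, fail_on_undergrant=True))
```

Over-grants never change the exit status in this sketch, matching their description above as advisory candidates.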

dz publish — automated npm publish

dz publish --dry-run                          # preview what would publish
dz publish --filter skills-devops             # publish specific package
dz publish --filter skills-devops --bump-only # bump version only, no publish

dz auto-canonicalize — discover skills in GitHub repos

dz auto-canonicalize --source github.com/user/repo --pack packages/@dzhechkov/skills-devops

Scans a GitHub repo for SKILL.md files, generates dz create-skill commands.

dz registry — searchable skill index

dz registry                    # visual panel: 117 skills in 6 categories
dz registry search security    # fuzzy search
dz registry --category mcp     # filter by category

dz stats + dz dashboard

dz stats        # Quick metrics: packages, skills, targets, presets
dz dashboard    # Visual panel with all packages, adapters, skill packs

Example: Thesis Defense Preparation (Academic Preset)

# Install with AgentDB (remembers patterns across students):
dz setup --target claude-code --preset academic --memory agentdb

# Or lightweight:
# dz init --target claude-code --preset academic

Prepare: Create a folder per student with thesis.pdf + review.pdf + external-review.pdf + antiplagiat.pdf.

Pre-defense (open Claude Code in student folder):

"Check document package completeness"     → document-checker
"Analyze this thesis"                     → dissertation-review (format, criteria, grade)
"Generate 6 defense questions"            → question-generator (basic → critical, page refs)

During defense (feed live transcript via Whisper + VB-Cable):

"Analyze this defense transcript"         → defense-evaluator (structure, coverage, delivery)
"Evaluate the student's answers"          → answer-assessor (completeness, depth, reviewer alignment)

| When | Skill | What it does |
|------|-------|-------------|
| Before | document-checker | Package completeness: thesis, reviews, antiplagiat |
| Before | dissertation-review | State Examination Commission (ГЭК) criteria, research/project format, grade 1-10, team project check |
| Before | question-generator | 4-6 questions with page refs and expected keywords |
| During | defense-evaluator | Live transcript → structure, coverage, delivery quality |
| During | answer-assessor | Q&A evaluation → completeness, depth, reviewer remarks |

Key features: grade corridor, per-criterion 1-10 scoring, TO BE vs data detection, LTV/CAC > 10 warning, reviewer divergence, raise/lower conditions, compact mode (a 1-page summary report, "компактная справка" in Russian), summary table across all students. With AgentDB, patterns persist.

Skills contain only evaluation criteria and methodology — no student data.

Batch mode: S3 archive → agent swarm

# Download and extract: each student = subfolder with .zip
curl -o students.zip "https://s3.example.com/bucket/students.zip"
mkdir students && cd students && 7z x ../students.zip
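# Extract each student's archive in a subshell so a failed 7z run cannot leave the loop in the wrong directory
for f in *.zip; do mkdir -p "${f%.zip}" && (cd "${f%.zip}" && 7z x "../$f"); done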

Then in Claude Code:

"For each student folder: run document-checker → dissertation-review → question-generator.
 Save справка.md per student with clickable inline links to pages (p. 45, sec. 2.3)
 and external sources ([JTBD](https://hbr.org/...)). Run all students in parallel."

With AgentDB, patterns persist across students — grading calibration improves with each analysis.


Example: Product Discovery with Design Thinking

# With self-learning (recommended — remembers HADI patterns, JTBD insights across sessions):
dz setup --target claude-code --preset meta
dz setup --target claude-code --preset meta --memory agentdb  # + semantic search

# Or without self-learning:
dz init --target claude-code --preset meta
# Or individually:
dz init --target claude-code --select design-thinking

Then in Claude Code:

"Design a mobile app for booking coworking spaces"
→ design-thinking skill activates
→ 6-phase protocol runs with complexity tier auto-selection

6-Phase Protocol

Phase 1: EMPATHIZE  → STOP gate: request user interview data + goap-research for market data
Phase 2: DEFINE     → JTBD Canvas + CJM AS IS + Ishikawa root cause analysis
Phase 3: IDEATE     → HADI hypotheses + Lean Canvas / Osterwalder BMC + GTM + Unit Economics
Phase 4: PROTOTYPE  → MVP (fidelity spectrum) + CJM/VSM TO BE (labeled as hypotheses)
Phase 5: TEST       → STOP gate: request usability test data + risk analysis + HADI validation
Phase 6: VALIDATE   → Pilot with variance analysis: projected vs actual → Scale/Iterate/Pivot/Kill

Complexity Tiers (auto-selected)

| Tier | When | Phases | Integrations |
|------|------|--------|-------------|
| S | Quick user insight | 1→2→5 | explore + goap-research |
| M | New feature | 1→2→3→4→5 | + frontend-design + six-thinking-hats |
| L | New product | 1→2→3→4→5→6 | + qcsd-swarm + reverse-engineering-unicorn |
| XL | Platform / ecosystem | All | All optional integrations (aqe init recommended) |

Key Safeguards

  • Never fabricates data — STOP gates pause for real interview/survey/test data
  • TO BE ≠ data — projections labeled as hypotheses, validated via pilot (Phase 6)
  • LTV/CAC > 10 flagged as suspicious (Skok 2013)
  • Loop-back protocol — Phase 5 can invalidate Phase 2 and return upstream
  • 22 methodologies with academic validation tiers (Strong/Moderate/Practitioner/Weak)
  • 23 validation rules (DT-001 through DT-023) enforce quality per tier
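
The LTV/CAC safeguard is simple arithmetic, and a minimal sketch makes the threshold explicit. The function name is hypothetical and the check is only an illustration of the rule stated above, not the skill's validation code.

```python
def ltv_cac_suspicious(ltv: float, cac: float) -> bool:
    """Flag unit economics whose LTV/CAC ratio exceeds 10."""
    if cac <= 0:
        raise ValueError("CAC must be positive")
    return ltv / cac > 10


print(ltv_cac_suspicious(ltv=1200, cac=100))  # ratio 12
print(ltv_cac_suspicious(ltv=300, cac=100))   # ratio 3
```

A ratio of exactly 10 is not flagged here, since the rule is stated as strictly greater than 10.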

What's included vs what's optional

Core DT — the meta preset includes all required dependencies (16 skills):

dz setup --target claude-code --preset meta
# → explore, goap-research-ed25519, problem-solver-enhanced,
#   design-thinking, feature-adr, knowledge-extractor,
#   understand-anything-bridge, ... (16 total)

Full DT — for ALL optional integrations, install agentic-qe:

npm install -g agentic-qe && aqe init --auto
# → 94 QE skills + 55 agents in .claude/skills/ and .claude/agents/
# → six-thinking-hats, qcsd-ideation-swarm, frontend-design, brutal-honesty-review

Or cherry-pick: dz compose meta+keysarium for competitive analysis.

| Optional Skill | Source | What it adds |
|---------------|--------|-------------|
| frontend-design | aqe init / keysarium | HTML/React prototypes (Phase 4) |
| six-thinking-hats | aqe init | Team ideation (Phase 3) |
| qcsd-ideation-swarm | aqe init | 9-agent quality risk (Phase 2-3) |
| reverse-engineering-unicorn | keysarium | Competitor CJM+JTBD (Phase 1) |

Without optional skills, design-thinking uses built-in fallbacks.

BTO benchmark: L0 Grade A (100%), L2 Opus weighted 7.58/10.


Example: Import Skills from ECC

dz install @dzhechkov/skills-ecc                 # 20 curated ECC skills
dz import-ecc --limit 50                         # import 50 from GitHub
dz import-ecc --local-path /path/to/ECC          # from local clone (fast)
dz import-ecc --select docker-patterns,tdd       # cherry-pick

Example: Security Scan with AgentShield

# In Claude Code: "scan my agent config for security issues"
# → agentshield-scan skill activates (170 rules, 10 categories)
npx ecc-agentshield scan --format sarif           # SARIF for GitHub Code Scanning

Example: 4-Axis Risk Scoring

dz init --target codex --preset meta --enrich
# → agents/openai.yaml includes risk_level per skill
# Axes: base_tool + file_sensitivity + blast_radius + irreversibility

Example: Understand & Develop an Existing Project

# 1. Analyze project → get recommendations
dz pretrain                                     # detects stack, recommends presets
dz recommend "work on this Node.js API"         # suggests skills + toolkits

# 2. Install skills (choose your level)
dz setup --target claude-code --preset meta --memory agentdb  # 16 skills (includes feature-adr)
dz setup --target claude-code --preset qe-engineer             # + 20 QE skills

# Want the full feature-adr toolkit with /feature-adr command + governance?
npx @dzhechkov/skills-feature-adr init                         # adds slash command + rules + shards
# See: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr

# preset = SKILL.md only (auto-activates on matching tasks)
# npx = full toolkit (slash command + governance + rules)

Install Understand-Anything plugin, then in Claude Code:

# 3. Map the codebase
/understand                                      # builds knowledge graph
# → understand-anything-bridge feeds architecture context to all skills

# 4. Develop with full context
"Add a payment module"
# → feature-adr runs with architecture awareness (layers, hot spots, dependencies)
# → see: https://www.npmjs.com/package/@dzhechkov/skills-feature-adr
# → code generation informed by real dependency graph
# → QE review targets tests at high-impact files
# → agentshield-scan checks new configs for security

# 5. Verify impact
"What files are affected by my changes?"
# → blast radius calculation → targeted test generation

Architecture-aware development: every skill knows the codebase structure.


Example: AI-Assisted Reasoning & Self-Improvement

# Auto-select reasoning strategy:
"Compare 3 architectures"      → structured-reasoning: Tree-of-Thought (branches + scoring)
"Debug this test"              → structured-reasoning: Chain-of-Thought (linear trace)
"We've been looping"           → structured-reasoning: Reflection-Suppression (break loop)

# Self-review before delivering:
"Write a migration and verify" → reflection-loop: draft → critique → revise (max 3 rounds)

# Manage long sessions:
"Context is getting long"      → context-window-management: checkpoint + prune + continue

# Learn from success:
"Extract this as a skill"      → skill-crystallizer: trace → reusable SKILL.md

All of these skills are included in the meta preset.
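
The draft → critique → revise cycle with a round limit can be sketched as a small loop. This is a minimal illustration, assuming the loop stops early once the critique finds nothing; `reflection_loop` and the toy `draft`, `critique`, and `revise` functions are hypothetical stand-ins for what the agent does, not part of the skill.

```python
def reflection_loop(draft, critique, revise, max_rounds=3):
    """Run draft -> critique -> revise until no issues remain or rounds run out."""
    text = draft()
    rounds = 0
    for _ in range(max_rounds):
        issues = critique(text)
        if not issues:
            break  # nothing left to fix, deliver early
        text = revise(text, issues)
        rounds += 1
    return text, rounds


# Toy stand-ins (assumptions for illustration only).
def draft():
    return "ALTER TABLE users ADD COLUMN email TEXT"


def critique(sql):
    issues = []
    if "IF NOT EXISTS" not in sql:
        issues.append("not idempotent")
    if not sql.endswith(";"):
        issues.append("missing semicolon")
    return issues


def revise(sql, issues):
    # Fix only the first reported issue per round.
    if issues[0] == "not idempotent":
        return sql.replace("ADD COLUMN", "ADD COLUMN IF NOT EXISTS")
    return sql + ";"


final, rounds = reflection_loop(draft, critique, revise)
print(rounds, final)
```

Capping the rounds keeps the loop from polishing indefinitely; with a lower `max_rounds` the result may still carry open issues.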


Self-Learning: JSONL vs AgentDB

DZ Harness supports two memory backends for self-learning:

dz setup --target claude-code --preset devops                    # JSONL (default, lightweight)
dz setup --target claude-code --preset devops --memory agentdb   # AgentDB (vector memory)

| Capability | JSONL (default) | AgentDB (--memory agentdb) |
|------------|----------------|------------------------------|
| Session tracking | Append-only JSONL log | HNSW vector store (.rvf) |
| Pattern storage | dz teach → patterns.jsonl | dz teach → .rvf + agentdb_pattern_store |
| Search | Keyword (grep) | Semantic (HNSW nearest-neighbor, cosine similarity) |
| Retrieval | Sequential scan | O(log n) approximate nearest neighbor |
| Self-learning | Frequency-based | 9 RL algorithms + Thompson Sampling bandit |
| Memory tiers | Flat file | 3-tier (working → short-term → long-term) |
| Reflexion | Reward scores (0-1) | Episodic memory (task + outcome + self-critique) |
| Causal reasoning | No | Cypher-like graph queries (X caused Y) |
| Skill composition | Manual (presets) | Bandit-picked skill chains (A→B→C) |
| Audit trail | No | Cryptographic attestation log |
| Size | ~0 KB | 4.6 MB (agentdb) |
| MCP tools | 0 | 41 tools (pattern, reflexion, causal, skill, hierarchy) |
| Dependencies | None | agentdb (optional, via npx) |
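
To make the Search row concrete, here is a rough sketch of the difference between a keyword scan and cosine-similarity ranking. Everything in it is an assumption for illustration: the pattern texts, the three-dimensional vectors (real embeddings are far larger), and the function names. It ranks exactly, whereas an HNSW index approximates the same ordering without scanning every record.

```python
import math


def keyword_search(patterns, term):
    """Sequential scan: keep patterns whose text contains the term (grep-like)."""
    return [p["text"] for p in patterns if term.lower() in p["text"].lower()]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(patterns, query_vec, k=1):
    """Exact top-k by cosine similarity (an HNSW index would approximate this)."""
    ranked = sorted(patterns, key=lambda p: cosine(p["vec"], query_vec), reverse=True)
    return [p["text"] for p in ranked[:k]]


# Hypothetical stored patterns with made-up embedding vectors.
patterns = [
    {"text": "retry flaky network calls", "vec": [0.9, 0.1, 0.0]},
    {"text": "pin docker base image", "vec": [0.0, 0.2, 0.9]},
    {"text": "back off on HTTP 429", "vec": [0.8, 0.3, 0.1]},
]
query_vec = [1.0, 0.2, 0.0]  # made-up embedding of "handle rate limiting"

print(keyword_search(patterns, "rate limit"))
print(semantic_search(patterns, query_vec, k=2))
```

The keyword scan finds nothing because no stored pattern contains the literal phrase, while the vector ranking still surfaces the two network-related patterns.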

AgentDB self-learning algorithms

When using --memory agentdb, the following algorithms automatically tune search quality:

  1. Thompson Sampling — multi-armed bandit for ranking search results
  2. UCB1 (Upper Confidence Bound) — exploration-exploitation balancing
  3. EXP3 — adversarial bandit for non-stationary environments
  4. Softmax — temperature-based action selection
  5. Epsilon-Greedy — simple exploration with decay
  6. Gradient Bandit — preference-based action selection
  7. Contextual Bandit — context-aware ranking using features
  8. REINFORCE — policy gradient for complex reward landscapes
  9. PPO-lite — proximal policy optimization for stable learning

The bandit automatically selects the best algorithm for your usage pattern — no manual tuning needed.
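
For readers unfamiliar with bandits, the following is a minimal Beta-Bernoulli Thompson Sampling sketch. It is not AgentDB's code: the `ThompsonSampler` class, the two arm names, and the simulated success rates are all assumptions chosen to show the idea of sampling from each arm's posterior and playing the best sample.

```python
import random


class ThompsonSampler:
    """Beta-Bernoulli Thompson Sampling over a fixed set of arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # Start every arm from a uniform Beta(1, 1) prior.
        self.wins = {arm: 1 for arm in arms}
        self.losses = {arm: 1 for arm in arms}

    def choose(self):
        # Draw one sample per arm from its posterior and pick the largest.
        samples = {
            arm: self.rng.betavariate(self.wins[arm], self.losses[arm])
            for arm in self.wins
        }
        return max(samples, key=samples.get)

    def update(self, arm, success):
        # A success or failure shifts that arm's posterior.
        if success:
            self.wins[arm] += 1
        else:
            self.losses[arm] += 1


# Simulated environment with made-up success rates per ranking strategy.
true_rates = {"keyword": 0.2, "semantic": 0.8}
sampler = ThompsonSampler(true_rates, seed=42)
env = random.Random(7)
picks = {arm: 0 for arm in true_rates}
for _ in range(500):
    arm = sampler.choose()
    picks[arm] += 1
    sampler.update(arm, env.random() < true_rates[arm])

print(picks)
```

Because each choice is a random draw from the posterior, weaker arms are still tried occasionally, which is how the method balances exploration against exploitation.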

How to enable AgentDB

# One command — everything is set up:
dz setup --target claude-code --preset devops --memory agentdb

This creates .dz/memory.rvf, registers the agentdb MCP server (41 tools), and configures session hooks. The agent can immediately use agentdb_pattern_store, agentdb_reflexion_recall, etc. — no additional dz init needed.

| Command | When to use |
|---------|-------------|
| dz setup --memory agentdb | Recommended: full setup in one step |
| dz init --select agentdb-memory | Lightweight: only the SKILL.md guide (see below) |

What does dz init --select agentdb-memory actually do?

This is the lightweight path — it installs only the skill documentation, without configuring the backend:

Step 1: Auto-discovers agentdb-memory/ in skills-mcp package
Step 2: Copies to .claude/skills/agentdb-memory/
          ├── SKILL.md              ← instructions for the agent
          ├── schemas/output.json
          ├── scripts/validate-config.json
          └── evals/agentdb-memory.yaml

Step 3: Claude Code auto-discovers the skill from .claude/skills/
Step 4: When agent encounters a matching task, it reads SKILL.md
Step 5: SKILL.md teaches the agent WHICH tools to call and WHEN

What it does NOT do (unlike dz setup --memory agentdb):

  • Does NOT create .dz/memory.rvf
  • Does NOT register agentdb MCP server
  • Does NOT configure session hooks

After dz init --select agentdb-memory, the user must manually add the MCP server:

claude mcp add agentdb -- npx agentdb@latest mcp start

When this is useful:

  • You already have agentdb installed separately and just want the skill guide
  • You want to teach the agent about agentdb tools without committing to the full .dz/ infrastructure
  • You're in a team where agentdb is managed centrally but each developer needs the skill docs

How it works

  • dz init compiles canonical skills from the agentskills.io standard into the target platform's layout
  • Writing is additive — existing files are never overwritten without --force
  • All 5 platform adapters produce byte-identical output (ADR-005)
  • dz doctor runs 7 health checks (including Node version, adapters, config, SQLite, and skills)
  • dz migrate detects legacy keysarium/bto installations and recommends a migration path
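
The additive-write rule can be illustrated with a short sketch. This is an assumption about the behavior described above, not the CLI's source: `write_additive` is a hypothetical helper, and its `force` parameter merely mirrors the role of the `--force` flag.

```python
import tempfile
from pathlib import Path


def write_additive(path, content, force=False):
    """Write content unless the file already exists; return True if written."""
    path = Path(path)
    if path.exists() and not force:
        return False  # keep the existing file untouched
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical skill path inside a throwaway directory.
    skill = Path(tmp) / ".claude" / "skills" / "demo" / "SKILL.md"
    first = write_additive(skill, "v1")
    second = write_additive(skill, "v2")  # skipped: file already exists
    kept = skill.read_text(encoding="utf-8")
    forced = write_additive(skill, "v2", force=True)
    replaced = skill.read_text(encoding="utf-8")

print(first, second, kept, forced, replaced)
```

Returning a flag instead of raising lets a caller report which files were skipped, which suits a setup command that may be re-run safely.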

Use Cases

1. Short-term product research (one-off study)

Goal: Quickly research a product idea, its competitors, and its market, and get a structured report.

# Option A: via dz CLI
dz init --target claude-code --preset meta
# Then in Claude Code:
#   /explore "Research the market for AI-powered code review tools"
#   /feature-adr "Summarize findings into an ADR"

# Option B: via keysarium (full 7-phase pipeline)
npx @dzhechkov/keysarium init
# Then in Claude Code:
#   /casarium "AI-powered code review tools — market analysis"
#   → Phase 0: Discovery → Phase 1: Exploration → Phase 2: Paranoid Research
#   → Phase 3: Solution Design → Phase 4: Architecture → Phase 5: Presentation

What you get:

  • meta preset: /explore clarifies the problem → /feature-adr structures findings as ADR decisions
  • keysarium: full 7-phase pipeline with dream cycles, background workers, and presentation generation

Best for: Quick study (hours), competitive analysis, technology evaluation.


2. Long-term product research (evolving over time)

Goal: Continuously gather data, add new sources, and "recalculate" the product vision as insights accumulate.

# Install keysarium (research pipeline) + evidence-wiki (knowledge base)
npx @dzhechkov/keysarium init
# Copy evidence-wiki plugin into your project:
npx @dzhechkov/evidence-wiki   # or git clone https://github.com/djd1m/evidence-wiki

npm install -g @dzhechkov/harness-cli
dz init --target claude-code --preset meta

Workflow — iterative research cycles with evidence wiki:

Week 1:  /casarium "Product X — initial research"
         → researches/ directory created with findings
         → .keysarium/memory/ stores patterns + reward scores

         /wiki-generate                              ← evidence-wiki
         → Scans researches/, ADRs, docs
         → Generates wiki/concepts/*.md (atomic pages with inline sources)
         → Builds wiki/graph.json (knowledge graph)
         → wiki/INDEX.md links everything

Week 2:  Add new data → /casarium "Product X — update with Q2 metrics"
         → Memory recalls Week 1 patterns (reward-calibrated learning)
         → New findings merged with existing, conflicts resolved

         /wiki-generate --check                      ← re-generates wiki
         → New concepts added, existing updated
         → Every claim verified: triple-pillar protocol requires N independent
           typed sources (ADR + methodology + research)
         → Stale concepts flagged, broken evidence links detected

         /triple-check wiki/concepts/pricing-model.md ← verify specific page
         → Checks that every factual claim has inline source citations
         → Flags unsupported statements

Week N:  /casarium "Product X — pivot analysis after customer feedback"
         → Full history in memory layer + evidence wiki
         → /harvest extracts reusable knowledge patterns
         → /wiki-generate rebuilds the entire knowledge graph
         → Product vision "recalculated" — the wiki IS the living product model
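
The triple-pillar check in the workflow above can be read as "a claim passes only when it cites every required source type". The sketch below encodes that reading under stated assumptions: the three type names are taken from the workflow text, while `missing_pillars`, the record layout, and the file paths are hypothetical.

```python
# Source types named in the workflow; the identifiers themselves are assumed.
REQUIRED_PILLARS = ("adr", "methodology", "research")


def missing_pillars(sources, required=REQUIRED_PILLARS):
    """Return the required source types that a claim does not cite yet."""
    cited = {source["type"] for source in sources}
    return [pillar for pillar in required if pillar not in cited]


# Hypothetical citations attached to one claim on a wiki concept page.
claim_sources = [
    {"type": "adr", "ref": "docs/adr/0007-pricing.md"},
    {"type": "research", "ref": "researches/product-x/pricing.md"},
    {"type": "research", "ref": "researches/product-x/interviews.md"},
]

gaps = missing_pillars(claim_sources)
print("supported" if not gaps else f"unsupported, missing: {gaps}")
```

Two research citations do not make up for the absent methodology source, so this sample claim would be flagged.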

The evidence-wiki advantage:

| Without evidence-wiki | With evidence-wiki |
|----------------------|-------------------|
| Research in markdown files | Atomic concept pages with inline sources |
| Findings scattered across researches/ | Interlinked knowledge graph (graph.json) |
| "I think we decided X" | Every claim has a cited source (triple-pillar) |
| Hard to see what changed | /wiki-generate --check diffs the knowledge base |
| No verification | /triple-check enforces evidence discipline |

Key features for long-term research:

  • Evidence wiki (@dzhechkov/evidence-wiki): atomic concept pages where every factual claim carries inline sources; knowledge graph for cross-referencing; triple-pillar protocol (N independent typed sources per claim)
  • Reward-calibrated memory (@dzhechkov/memory Reflexion): each checkpoint response trains the system: "ok" = excellent (1.0), feedback = good (0.7), rework = needs_work (0.3)
  • Agent SDK Dreaming: between sessions, patterns are consolidated and distilled
  • /harvest (knowledge-extractor skill): extracts reusable patterns from completed research into lib/ templates
  • SQLite + FTS5 backend: scales to 100k+ records with full-text search across all research sessions
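
As a small illustration of the reward scores listed above, the sketch below turns a history of checkpoint outcomes into a single number. Only the three labels and their values come from the list; averaging them, the `average_reward` helper, and the sample history are assumptions, since the aggregation the memory layer uses is not described here.

```python
# Reward values per checkpoint outcome, as listed above.
REWARD = {"excellent": 1.0, "good": 0.7, "needs_work": 0.3}


def average_reward(outcomes):
    """Mean reward over checkpoint outcomes, or None when there is no history."""
    if not outcomes:
        return None
    return sum(REWARD[outcome] for outcome in outcomes) / len(outcomes)


# Hypothetical history for one research pattern across four checkpoints.
history = ["excellent", "good", "needs_work", "excellent"]
print(average_reward(history))
```

A pattern whose score drifts toward 0.3 has mostly led to rework, which makes it a weaker candidate to recall in later sessions.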

Best for: Product strategy over months, continuous market monitoring, evolving product vision with evidence-backed decisions.


3. Product research + working prototype

Goal: Research the product AND build a functional prototype.

Option A: Sequential — research first, then code

# Step 1: Install research + development presets
npx @dzhechkov/keysarium init
# OR:
dz init --target claude-code --preset keysarium

# Step 2: Research phase
#   /casarium "SaaS platform for team retrospectives"
#   → Phase 0-2: Discovery, Exploration, Paranoid Research
#   → Phase 3: Solution Design (with CJM prototype)
#   → Result: researches/<slug>/ with full analysis

# Step 3: Switch to development
dz init --target claude-code --preset feature-adr

# Step 4: Build using research outputs
#   /feature-adr "Build the retrospective platform based on research in researches/<slug>/"
#   → Step 0: Router classifies as L/XL
#   → Step 1-5: Requirements, ADRs, DDD, Architecture (informed by research)
#   → Step 6: Implementation plan
#   → Step 7: Code generation (with /frontend-design for UI)
#   → Step 8-9: QE review + fleet assessment

What you get: Research artifacts in researches/, then code in features/<slug>/ + actual repository changes. Research directly feeds into ADR decisions.

Option B: Parallel — research and code simultaneously with p-replicator

# Install the full product development toolkit
npx @dzhechkov/p-replicator init

# Single pipeline: research → requirements → prototype
#   /replicate "SaaS platform for team retrospectives"
#   → Reverse-engineers similar products (reverse-engineering-unicorn)
#   → Generates SPARC PRD (sparc-prd-mini)
#   → Validates requirements (requirements-validator)
#   → Creates the project structure (pipeline-forge)
#   → Builds the prototype (cc-toolkit-generator-enhanced)
#   → Reviews with brutal honesty (brutal-honesty-review)

What you get: A working prototype generated from research in a single /replicate pipeline run. Faster than Option A, but with shallower research.

Comparison

| Aspect | Option A (Sequential) | Option B (p-replicator) |
|--------|----------------------|------------------------|
| Research depth | Deep (7-phase keysarium) | Moderate (reverse-engineering) |
| Code quality | High (11-step feature-adr + QE) | Good (pipeline-forge + review) |
| Time | Days to weeks | Hours to days |
| Best for | Complex products, regulated domains | MVPs, hackathons, quick validation |
| Packages | keysarium + feature-adr preset | p-replicator |
| Research artifacts | researches/ directory | Embedded in PRD |
| Code artifacts | features/<slug>/ + repo changes | Generated project |

Tip: For maximum rigor, combine both — use p-replicator for a quick prototype, then run /feature-adr --full-qe-extended on the generated code for production-grade quality engineering.


Status

v0.3.89, published on npm. Also available as a Claude Code plugin. Part of DZ Harness Hub.

Claude Plugin

DZ Harness Hub is available as a Claude Code plugin:

# Via marketplace (when published):
claude plugin marketplace add djd1m/dz-harness-hub
claude plugin install dz-harness-hub@dz-harness-hub

# Or test locally:
claude --plugin-dir /path/to/dz-harness-hub

# Generate plugin manifest from current inventory:
dz plugin --version 0.3.89

The .claude-plugin/ directory contains plugin.json + marketplace.json compatible with pi-claude-marketplace and skill-hub.

Related Projects

Skill sources

  • agentic-qe — 20 QE skills + 55 agents (test generation, coverage, chaos, QCSD swarms)
  • ECC — 20 curated skills (agent patterns, autonomous loops, docker, git workflows)
  • AgentShield — Security scanning (170 rules for .claude/ configs)
  • Understand-Anything — Codebase knowledge graph → architecture context

Platform & infrastructure

  • AgentDB — Self-learning vector memory (--memory agentdb, 41 MCP tools)
  • agentskills.io — Open standard for SKILL.md format (adopted by all 5 platforms)
  • OpenAI Codex — 2nd target platform
  • OpenCode — 3rd target platform (160K+ stars)
  • Hermes Agent — 4th target platform
  • OpenClaude — 5th target platform (28K+ stars)