@momentiq/dark-factory-cli
v2.6.0
Dark Factory OSS CLI — multi-vendor adversarial critic orchestration (Cursor, Codex, Gemini, Grok) with min-complete-quorum aggregation and trusted-surface rebind
Dark Factory OSS CLI — multi-vendor adversarial critic orchestration.
What this package gives you
Nine Dark Factory services, consumable as a TypeScript library and (where
relevant) as df subcommands:
1. Critic Orchestrator (./adapters/*) — vendor-neutral adapter contract (CriticAdapter) with concrete adapters for Cursor SDK, OpenAI Codex SDK, Google Gemini, Grok (xAI via OpenAI-compatible API), and MiniMax M3 (minimax-direct-sdk, via OpenRouter's OpenAI-compatible endpoint; requires OPEN_ROUTER_API_KEY). MiniMax is not part of the default four-vendor quorum — it is an optional fifth adapter a consumer or the hosted runtime can wire in.
2. Policy Engine (./policy/*) — gate evaluation, min-complete-quorum aggregation, TDD classifier, finding-rubric strip, verification routes, profile resolution, and config loading.
3. Trusted-Surface Rebind (./trusted-surface/*) — when a commit modifies the trusted policy surface (config + guidance files + prompt fragments), the rebind reads those inputs from the parent ref so the commit is reviewed against the prior baseline (self-modification guard).
4. Per-SHA Evidence Store (./evidence/*) — canonical per-SHA quality-gate evidence path layout + runner that writes/reads evidence files atomically.
5. Cycle-Doc Trailer Validator (./cycle-doc-validator/* + df validate-cycle-doc) — enforces per-PR Cycle: / Issue: / ProjectItem: trailer rules.
6. Merge Queue Admission Policy (./policy/merge-queue.ts + df admit-pr) — plan-vs-code PR classifier + the typed ruleset shape (defaultMainRulesetShape, defaultCeReviewRulesetShape, defaultMergeQueueRule) that consumers declare so the branch-protection auditor can detect drift against it.
7. Branch-Protection Drift Detector (./branch-protection/* + df audit-branch-protection) — compares a declarative spec.yaml against the live GitHub ruleset.
8. Audit / Compliance Trail (./evidence/audit-trail.ts + df audit stats) — the _runs.ndjson NDJSON sink + read/summarize/agreement-rate/quorum-stats helpers behind make agent-review-stats. Every critic run, every gate verdict, every bypass invocation appends here.
9. Cycle Tracker Sync + PR Attribution (./cycle-tracker-sync/* + df sync-trackers + df attribute-pr) — reconciles GitHub tracker issues with cycle docs + writes the Cycle Ref custom field on PR project items.
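The Trusted-Surface Rebind entry above describes a self-modification guard: a commit that edits the policy inputs is reviewed against its parent's copy of them. A minimal sketch of that decision follows. It is not the package's implementation; the function names, the PolicyRef type, and the path prefixes are all hypothetical and only illustrate the idea.

```typescript
// Hypothetical sketch: none of these names come from the package's API.
type PolicyRef = { ref: string; rebound: boolean };

// Assumed trusted surface (config, guidance files, prompt fragments).
// These prefixes are illustrative placeholders, not the real layout.
const TRUSTED_PREFIXES: string[] = [".agent-review/", "docs/guidance/", "prompts/"];

function touchesTrustedSurface(changedFiles: string[], prefixes: string[]): boolean {
  return changedFiles.some((file) => prefixes.some((prefix) => file.startsWith(prefix)));
}

// If the commit edits the trusted surface, read policy inputs from its first
// parent (`<sha>^` in git revision syntax) so the commit cannot relax the
// rules it is reviewed under. Otherwise read them from the commit itself.
function selectPolicyRef(
  sha: string,
  changedFiles: string[],
  prefixes: string[] = TRUSTED_PREFIXES,
): PolicyRef {
  return touchesTrustedSurface(changedFiles, prefixes)
    ? { ref: `${sha}^`, rebound: true }
    : { ref: sha, rebound: false };
}
```

Under these assumptions, a commit that changes .agent-review/config.json would be evaluated with the configuration from its parent, while a commit that only changes source files would use its own.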
The package also ships five reusable GitHub Actions workflows
(.github/workflows/*.yml) that consumers wire up via uses:. See the
root README for the consumer wiring
pattern.
Status
1.0.0 — shipped on npm. Library API + the hook-facing binary surface
(review, gate-push, doctor, gates, stats) are stable. The
df critic subcommand is the CI cold-path (API-key) counterpart to the
subscription-auth local hooks.
Install
npm install @momentiq/dark-factory-cli
Library usage
import {
  runReview,
  evaluateCommitGate,
  buildReviewPacket,
  loadAgentReviewConfig,
  runValidateCycleDoc,
  runAuditBranchProtection,
  runSyncCycleTrackers,
  runAttributePrCycleRef,
} from "@momentiq/dark-factory-cli";
// Assumption: the script runs from the consumer repo root.
const repoRoot = process.cwd();
const loaded = await loadAgentReviewConfig(repoRoot);
const outcome = await runReview({ loaded, /* ... */ });
// Service #5 — validate a PR's cycle/issue trailers (subprocess-wraps the
// bundled Python script). Inherits stdio by default.
await runValidateCycleDoc({
  env: { PR_NUMBER: "1234", PR_TITLE: "feat: ...", PR_BODY: "..." },
});
CLI
df --help
df --version
# Python-backed subcommands — each forwards remaining argv to the bundled
# Python script verbatim, so `df <sub> --help` returns the Python argparse
# banner.
df validate-cycle-doc --help
df audit-branch-protection --use-bundled-default-spec --repo owner/repo
df sync-trackers --dry-run
df attribute-pr # env-driven; needs PR_NUMBER, PR_NODE_ID, PR_BODY_FILE, PROJECT_TOKEN
# Pure-TS subcommands.
df audit stats --path .git/agent-reviews/_runs.ndjson
df admit-pr --files-stdin # newline-separated file paths on stdin
df admit-pr --files docs/roadmap/cycles/cycle1.md,packages/cli/src/cli.ts
# Hook-facing subcommands (subscription cost model).
df review --commit HEAD --profile local --foreground
df gate-push # local pre-push, reads stdin (default: gate HEAD only)
df gate-push --full-range # legacy: gate every commit in the push range
df gate-push --commit HEAD --ci # CI replay
df doctor --profile local # env + per-adapter auth check
df gates # static gates, no LLM
df stats # alias for `df audit stats`
# Audit-mode inspection (NOT a gate).
df findings --range origin/main..HEAD # per-commit findings for the range
df findings --range origin/main..HEAD --json # df_findings-shaped JSON array
# Stdio Model Context Protocol server — exposes the CLI surface to any
# MCP-speaking agent.
df mcp # start the stdio MCP server
df mcp --help # config snippets for Claude Code, Cursor, Codex
# Bundled-skill installer (consumer-shape — implements DFP #192).
df skills list # list bundled skills (name, version, summary)
df skills install <name> # render + write .claude/skills/<name>/
df skills install --all # install every skill declared `enabled: true`
# in darkfactory.yaml
df skills install <name> --force # overwrite a hand-edited rendered file
Note on --use-bundled-default-spec: the bundled spec-default.yaml asserts the standard Dark Factory required-status-check contexts (e.g. agent-critic, cycle-doc-validation). It exists as a working starting point for first-run audits. Consumers SHOULD author their own spec.yaml matching their repo's actual posture — running the bundled default against an arbitrary repo will surface drift against contexts that don't exist there.
For consumer repos — hook wiring + subscription cost model
The hook-facing subcommands (review, gate-push, doctor, gates,
stats) are designed to power consumer repos' .husky/post-commit and
.husky/pre-push hooks. The cost model is critical: per-commit critic
invocations from API tokens cost $1000s/week on a busy repo, while
subscription-auth invocations (using the developer's existing Cursor /
Codex / Claude CLI logins) are flat-rate.
Subscription auth — what runs on each git push
| Subcommand | Hook | Cost model |
| --- | --- | --- |
| df review | .husky/post-commit (background) | Subscription — consumes Cursor / Codex / Claude CLI logins via the active profile's auth pins. No API spend by default. |
| df gate-push | .husky/pre-push | Free — reads pre-existing artifacts, no LLM calls. Default (Cycle 13 / dark-factory-platform#149): gates the HEAD commit only; intermediate commits are iteration receipts (df findings --range surfaces them un-gated). Opt-in legacy: --full-range or DF_GATE_FULL_RANGE=1 gates every commit in the range. Soundness caveat: HEAD's per-SHA artifact reviews parent..HEAD only, NOT base..HEAD — use --full-range or the CI cold-path agent-critic workflow (which reviews the full PR diff) when cumulative-state evidence is required. |
| df findings --range <base>..<head> | None (operator-run) | Free. Walks every commit's per-SHA artifact in the range for audit-mode inspection. NOT a gate; does not re-run critics. The companion-surface to the final-commit-only df gate-push default. |
| df doctor | None (operator-run) | Free. Validates that per-adapter auth source is reachable. |
| df gates | None (operator-run) | Free. Runs static quality gates per validation.requiredQualityGates. |
| df stats | None (operator-run) | Free. Reads .git/agent-reviews/_runs.ndjson. |
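The df gate-push row above distinguishes the default (gate HEAD only) from the opt-in legacy mode (gate every commit in the range). A small sketch of that selection rule is given here, assuming the push range is already available as a list of SHAs ordered oldest first. The commitsToGate function and GateOptions type are hypothetical names for illustration; only the --full-range flag and the DF_GATE_FULL_RANGE=1 variable come from this README.

```typescript
// Hypothetical sketch of the HEAD-only vs full-range selection.
interface GateOptions {
  fullRange?: boolean;                       // stands in for --full-range
  env?: Record<string, string | undefined>;  // stands in for process.env
}

// commits: SHAs in the push range, oldest first, HEAD last.
function commitsToGate(commits: string[], options: GateOptions = {}): string[] {
  const fullRange =
    options.fullRange === true || options.env?.["DF_GATE_FULL_RANGE"] === "1";
  // Default: only the final commit is gated; intermediate commits are
  // treated as iteration receipts and left to audit-mode inspection.
  return fullRange ? [...commits] : commits.slice(-1);
}
```

With a range of three commits, this sketch returns only the last SHA by default and all three when either opt-in is set. It does not address the soundness caveat in the table: gating HEAD alone still relies on an artifact that covers parent..HEAD only.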
CI cold-path (the 4 vendor API keys: CURSOR_API_KEY, CODEX_API_KEY,
GEMINI_API_KEY, XAI_API_KEY) is intentionally the fallback only —
used when:
- The first PR on a fresh repo runs critic before any developer has run hooks locally.
- A hook bypass landed and the CI gate needs to re-evaluate.
- The developer hasn't logged in to a vendor CLI yet (cursor login / codex login / Claude desktop OAuth).
.agent-review/config.json — the profile that pins subscription auth
{
  "version": 2,
  "critics": [
    { "id": "cursor-local-chief-engineer", "adapter": "cursor-sdk", ... },
    { "id": "codex-local-chief-engineer", "adapter": "codex-sdk", ... }
  ],
  "profiles": {
    "local": {
      "criticIds": ["cursor-local-chief-engineer", "codex-local-chief-engineer"],
      "quorum": 1,
      "auth": {
        "codex-local-chief-engineer": "chatgpt"
      }
    },
    "cloud": {
      "criticIds": ["cursor-local-chief-engineer", "codex-local-chief-engineer"],
      "quorum": 2,
      "auth": {
        "codex-local-chief-engineer": "api"
      }
    }
  }
}
The local profile pins codex to "chatgpt" — the Codex SDK will use
~/.codex/auth.json (from codex login) and NOT fall back to
CODEX_API_KEY even if it's set in env. This is the firewall against
accidental API-token billing.
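The pinning rule above can be read as a small decision function: a "chatgpt" pin never consults the API key, even when one is present. The sketch below illustrates that reading only. resolveCodexAuth, AuthPin, and AuthSource are hypothetical names, and the hasLoginFile flag is a stand-in for checking whether ~/.codex/auth.json exists; the package's real resolution logic is not shown in this README.

```typescript
// Hypothetical sketch of the auth-pin firewall; not the package's API.
type AuthPin = "chatgpt" | "api";

type AuthSource =
  | { kind: "subscription"; path: string }
  | { kind: "api-key"; envVar: string }
  | { kind: "unavailable"; reason: string };

function resolveCodexAuth(
  pin: AuthPin,
  env: Record<string, string | undefined>,
  hasLoginFile: boolean, // assumed to be probed by the caller
): AuthSource {
  if (pin === "chatgpt") {
    // Pinned to subscription auth: CODEX_API_KEY is deliberately ignored,
    // so a missing login fails loudly instead of silently billing the key.
    return hasLoginFile
      ? { kind: "subscription", path: "~/.codex/auth.json" }
      : { kind: "unavailable", reason: "run `codex login` first" };
  }
  return env["CODEX_API_KEY"]
    ? { kind: "api-key", envVar: "CODEX_API_KEY" }
    : { kind: "unavailable", reason: "CODEX_API_KEY is not set" };
}
```

The design point is the absence of a fallback branch under the "chatgpt" pin: the only outcomes are subscription auth or an explicit failure.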
df doctor --profile local validates the configured subscription source
is reachable. Run it after first-time setup.
Sample .husky/post-commit
#!/usr/bin/env bash
set -euo pipefail
if [[ "${AGENT_REVIEW_SKIP:-}" == "1" ]]; then
  echo "df: skipped by AGENT_REVIEW_SKIP=1"
  exit 0
fi
SHA="$(git rev-parse HEAD)"
COMMON_DIR="$(git rev-parse --git-common-dir 2>/dev/null || echo .git)"
mkdir -p "${COMMON_DIR}/agent-reviews"
LOG_FILE="${COMMON_DIR}/agent-reviews/post-commit.log"
# Detached background invocation — does not block the commit.
AGENT_REVIEW_PROFILE=local nohup npx df review --commit "${SHA}" \
  >"${LOG_FILE}" 2>&1 </dev/null &
disown || true
echo "df: review started for ${SHA:0:12} (log: ${LOG_FILE})"
Sample .husky/pre-push
#!/usr/bin/env bash
set -euo pipefail
if [[ -n "${AGENT_REVIEW_BYPASS:-}" ]]; then
  echo "df: pre-push gate BYPASSED — reason: ${AGENT_REVIEW_BYPASS}" >&2
  exit 0
fi
npx df gate-push --profile local
Doppler bootstrap (optional)
For repos that manage secrets with Doppler, place DOPPLER_TOKEN in
<main-checkout>/.env and the bootstrap loader will hoist it from any
worktree:
echo 'DOPPLER_TOKEN=dp.st.dev.…' > <main-checkout>/.env
chmod 600 <main-checkout>/.env
The default allowlist is just DOPPLER_TOKEN. Consumers that use
project-scoped service-token vars (e.g. DOPPLER_SERVICE_TOKEN_ACME) can
pass a custom allowlist to loadDopplerBootstrapEnv() via the library
API — see packages/cli/src/doppler-bootstrap.ts for the
serviceTokenAlias parameter that bridges to DOPPLER_TOKEN.
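To make the allowlist idea concrete, here is a rough sketch of reading a .env file and keeping only allowlisted variables. It is not loadDopplerBootstrapEnv() and does not model the serviceTokenAlias bridging; pickAllowlistedEnv is a hypothetical helper, and the parsing rules (comments, blank lines, one pair of surrounding quotes) are assumptions about a typical .env format.

```typescript
// Hypothetical sketch: hoist only allowlisted keys from .env content.
function pickAllowlistedEnv(
  dotenv: string,
  allowlist: string[] = ["DOPPLER_TOKEN"],
): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const rawLine of dotenv.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip blanks and comments
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", or no "=" at all
    const key = line.slice(0, eq).trim();
    if (!allowlist.includes(key)) continue; // everything else stays out of the environment
    // Strip one pair of matching surrounding quotes, if present.
    picked[key] = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
  }
  return picked;
}
```

Passing a custom allowlist such as ["DOPPLER_SERVICE_TOKEN_ACME"] would pick up a project-scoped variable instead, while any other secrets that happen to sit in the same file are ignored.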
First-time setup checklist
1. npm install --ignore-scripts @momentiq/dark-factory-cli
2. Add .agent-review/config.json with the local profile (above).
3. Add .husky/post-commit + .husky/pre-push (samples above).
4. git config --local core.hooksPath .husky
5. cursor login / codex login (or Claude desktop OAuth) on the workstation.
6. Run df doctor --profile local — should report all OK.
7. Commit something — observe .git/agent-reviews/<sha>.json arrives.
8. (Optional) Set up CI with CURSOR_API_KEY / CODEX_API_KEY / GEMINI_API_KEY / XAI_API_KEY as repo secrets for the cold path.
System requirements
- Node.js >=20
- Python 3.11+ — required for services #5, #7, #9. The package bundles the source Python scripts (validate_cycle_doc.py, audit_branch_protection.py, sync_cycle_trackers.py, attribute_pr_cycle_ref.py) and wraps each in a TypeScript subprocess spawn. A pure-TS rewrite is on the roadmap and will eliminate this dependency in a future release.
- gh CLI (authenticated) — all four Python scripts shell out to gh api for GitHub queries. CI invocations provide GH_TOKEN / PROJECT_TOKEN via environment.
- git on PATH — the rebind + config-from-ref code paths shell out to git, and the Python scripts use git rev-parse --show-toplevel to discover the consumer repo root.
Repo root detection for Python-backed services
The bundled Python scripts resolve the consumer repo root in this order:
1. $DF_REPO_ROOT environment variable (explicit override).
2. git rev-parse --show-toplevel from the current working directory.
3. Legacy __file__-relative fallback (preserved for in-tree dev-mode pytest runs).
The TypeScript wrappers set DF_REPO_ROOT automatically when a repoRoot
option is supplied — pass it when invoking outside a git worktree.
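The resolution order described in this section can be sketched as a pure function. The bundled scripts implement it in Python; the TypeScript below is only an illustration of the precedence, with hypothetical names (resolveRepoRoot, RootInputs) and with the git lookup and the legacy path passed in as plain values rather than computed.

```typescript
// Hypothetical sketch of the three-step precedence; not the bundled code.
interface RootInputs {
  env: Record<string, string | undefined>;
  gitTopLevel: string | null; // result of `git rev-parse --show-toplevel`, null on failure
  legacyFallback: string;     // stand-in for the __file__-relative path
}

function resolveRepoRoot({ env, gitTopLevel, legacyFallback }: RootInputs): string {
  const override = env["DF_REPO_ROOT"]?.trim();
  if (override) return override;        // 1. explicit override wins
  if (gitTopLevel) return gitTopLevel;  // 2. git's view of the worktree root
  return legacyFallback;                // 3. legacy fallback
}
```

This also shows why supplying repoRoot matters outside a git worktree: with no override and no git top level, only the legacy fallback remains.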
Bundled skills (df skills install — DFP #192)
The CLI ships a small set of consumer-shape templated skills that any
repo adopting Dark Factory can install with one command — no fork, no
hand-edit. The skill body templates live in this repo under skills/<name>/;
the renderer substitutes {{REPO_NAME}}, {{ADR_DIR}},
{{CYCLE_DOCS_DIR}}, {{QUALITY_GATE_TARGETS}}, etc. against the
consumer's darkfactory.yaml at install time.
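As an illustration of the placeholder substitution described above, the sketch below replaces {{NAME}} tokens from a lookup table. It is not the package's renderer: renderTemplate is a hypothetical function, the mapping from darkfactory.yaml keys to placeholder names is not modeled, and leaving unknown placeholders untouched is an assumption rather than documented behavior.

```typescript
// Hypothetical sketch of {{PLACEHOLDER}} substitution.
function renderTemplate(
  template: string,
  values: Record<string, string | undefined>,
): string {
  // Match {{UPPER_SNAKE_CASE}} tokens; unknown names are left as-is
  // (an assumption made for this sketch).
  return template.replace(
    /\{\{([A-Z0-9_]+)\}\}/g,
    (placeholder: string, name: string) => values[name] ?? placeholder,
  );
}

const rendered = renderTemplate(
  "Review {{REPO_NAME}}; ADRs live in {{ADR_DIR}}.",
  { REPO_NAME: "My Repo", ADR_DIR: "docs/ADR" },
);
```

Here rendered holds the template text with both placeholders filled from the supplied values.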
Bundled today:
- chief-engineer-review — AI-native architectural review (autonomous + conversational modes). Originated in momentiq-ai/sage3c.
- chief-engineer-blitz — orchestrated multi-PR delivery doctrine (six phases: Plan → Spec → Implement → Triage → Validate → Closure). Originated in momentiq-ai/dark-factory-platform.
Consumer setup (one-time per repo):
# 1. Author darkfactory.yaml at repo root (see the schema below).
# 2. Install:
df skills install --all # installs every skill marked enabled
# … or one at a time:
df skills install chief-engineer-review
darkfactory.yaml schema (every key optional — install falls back to
sensible defaults):
repo:
  displayName: "My Repo"
  slug: "my-repo"
  ownerRepo: "my-org/my-repo"
docs:
  manifesto: docs/PRINCIPLES.md
  adrDir: docs/ADR
  cycleDocsDir: docs/roadmap/cycles
  rfcDir: docs/rfcs
  prdDir: docs/prds
agents:
  chiefEngineer: .claude/agents/chief-engineer.md
qualityGates:
  - make quality-gates
  - make test
worktreeRoot: .claude/worktrees
agentCommitterOrg: my-org # for the claude-code+<handle>@<org>.ai committer
skills:
  chief-engineer-review:
    enabled: true
  chief-engineer-blitz:
    enabled: true
Re-install semantics:
- A re-install with identical inputs is a no-op (the rendered file carries an install-hash in its GENERATED header; matching hash → unchanged).
- A re-install with different inputs (config changed, template upstream changed) overwrites the rendered file.
- A re-install where the rendered file has been hand-edited (no GENERATED header detected) is skipped — pass --force to overwrite.
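The three re-install rules above amount to a small decision table. A sketch of it follows, assuming the rendered file's header can be matched with a regular expression. The header format used here is invented for the example (the real GENERATED header layout is not shown in this README), and decideInstall and InstallAction are hypothetical names.

```typescript
// Hypothetical sketch of the re-install decision; header format is assumed.
type InstallAction = "write" | "unchanged" | "overwrite" | "skip";

// Assumed header shape for illustration only.
const GENERATED_HEADER = /^<!-- GENERATED install-hash=([0-9a-f]+) -->/;

function decideInstall(
  existing: string | null, // current file content, or null if absent
  newHash: string,         // hash of the inputs for this install
  force: boolean = false,  // stands in for --force
): InstallAction {
  if (existing === null) return "write"; // first install
  const match = GENERATED_HEADER.exec(existing);
  // No header means the file was hand-edited: leave it alone unless forced.
  if (match === null) return force ? "overwrite" : "skip";
  // Same inputs produce the same hash, so there is nothing to do.
  return match[1] === newHash ? "unchanged" : "overwrite";
}
```

Keying the no-op on a hash of the inputs, rather than on file timestamps, is what would let both config changes and upstream template changes trigger an overwrite.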
MCP tool parity: the df mcp server exposes df_skills_install +
df_skills_list with the same semantics, so MCP-speaking agents (Claude
Code, Cursor) can install skills programmatically without shelling out.
Adding a new bundled skill: see skills/README.md for the manifest +
template authoring rules.
License
Apache-2.0. The OSS critic surface is a public artifact. The hosted Dark Factory runtime layers proprietary calibrated prompts and a calibrated bypass-classifier on top of this CLI; those are out-of-scope here.
