crewswarm-cli
v1.0.0
Multi-model AI coding CLI with 45+ tools, multi-agent swarm, and real-time streaming — works with GPT, Claude, Gemini, Grok, DeepSeek, and more
crew-cli
AI coding CLI with an execution quality engine. 29+ models score 100/100 across 12 providers — cheap models perform like premium ones because the engine prevents failures, forces verification, and learns from past runs.
npm i -g crewswarm-cli
crew doctor
crew auto "fix the divide-by-zero bug in src/math.ts"
Execution Quality Engine
No other CLI has this. Claude Code, Codex, and Gemini CLI run blind loops. crew-cli wraps every task in 8 quality modules:
- Failure memory — blocks repeated bad tool calls after 1 failure
- Verification gate — forces proof (test/build) before accepting "done"
- Edit gate — rejects "done" without file changes on mutation tasks
- Action ranking — steers model toward highest-value next action per turn
- Task mode strategies — bugfix/feature/refactor/test_repair get different approaches
- Patch critic — catches unread edits, churn, scope creep in real time
- Adaptive weights — learns from past runs which action patterns work best
- Deterministic QA — checks filesystem ground truth, not LLM judgment
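The first three modules can be pictured as small guards around the agent loop. The sketch below is illustrative only: `FailureMemory`, `TurnState`, and `canAcceptDone` are hypothetical names rather than crew-cli internals, and it assumes a tool call is identified by its name plus its JSON-serialized arguments.

```typescript
// Hypothetical sketch; not crew-cli's actual implementation.

/** Remembers failed tool calls so the same call is not repeated. */
class FailureMemory {
  private failed = new Set<string>();

  // Assumption: a call is identified by tool name + JSON-serialized args.
  private key(tool: string, args: unknown): string {
    return `${tool}:${JSON.stringify(args)}`;
  }

  recordFailure(tool: string, args: unknown): void {
    this.failed.add(this.key(tool, args));
  }

  /** True if this exact call already failed once and should be blocked. */
  isBlocked(tool: string, args: unknown): boolean {
    return this.failed.has(this.key(tool, args));
  }
}

interface TurnState {
  mutationTask: boolean;       // the task is expected to change files
  filesChanged: number;        // files edited so far
  verificationPassed: boolean; // a test/build command has succeeded
}

/** Combines the edit gate and the verification gate. */
function canAcceptDone(state: TurnState): { ok: boolean; reason?: string } {
  if (state.mutationTask && state.filesChanged === 0) {
    return { ok: false, reason: "edit gate: no file changes on a mutation task" };
  }
  if (!state.verificationPassed) {
    return { ok: false, reason: "verification gate: no passing test/build" };
  }
  return { ok: true };
}
```

With guards like these, a model that claims "done" without edits or without a passing check is sent back into the loop instead of being trusted.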
Result: Groq Llama 3.3 70B (free) scores the same as GPT-5.4 ($0.15/call). Gemini Flash at $0.003/task matches Claude Opus. The engine is the equalizer.
Providers (12 verified at 4/4 presets)
Claude (OAuth), OpenAI (OAuth), Gemini, DeepSeek, Grok, Groq, Kimi, GLM, Qwen, MiniMax, OpenRouter (387 models), OpenCode/Zen (39 models + free tier).
Command-line interface for crewswarm agent orchestration with local safety rails (sandbox diffs, session state, routing/cost logs), team sync, CI/browser helpers, and voice mode.
- OVERVIEW.md - 🚀 1-minute summary of what this is and how it works
- INSTRUCTION-STACK.md - canonical instruction precedence and composition
- PERMISSIONS-MODEL.md - canonical read/write/shell/approval behavior by mode
Requirements
- Node.js 20+
- Git
- Optional for full integration: running CrewSwarm gateway (http://127.0.0.1:5010)
Install
npm install
npm run build
Run the CLI:
node bin/crew.js --help
Core Commands
crew chat "refactor auth middleware"
crew chat "build auth API with tests" --modefast6
crew dispatch crew-coder "fix failing tests"
crew dispatch crew-coder "harden auth middleware" --preset quality
crew run -t "build auth API with tests" # unified pipeline (resumable)
crew run --resume pipeline-<trace-id> # resume/replay from checkpoint trace
crew run --resume pipeline-<trace-id> --from-phase execute
crew explore "refactor database layer" # parallel speculative execution
crew plan "add OAuth login" --parallel
crew preview
crew apply --check "npm test"
crew rollback
Intelligence Commands
crew map --graph # visual dependency graph
crew shell "list large files" # NL to shell command translation
crew docs "how does auth work" # RAG search over docs/markdown
crew blast-radius # impact analysis of current changes
crew capabilities # runtime capability handshake
crew memory "auth login" # recall prior task memory
crew lsp check src/cli/index.ts # TypeScript diagnostics
crew lsp complete src/cli/index.ts 10 5
crew repl # interactive multi-agent REPL
crew tui # terminal UI adapter (same runtime as REPL)
crew github "list open issues" # NL GitHub flows
crew github doctor # GitHub CLI health check
Advanced Commands
crew sync --status
crew privacy --show
crew serve --port 4317 # unified /v1 API + /mcp endpoint
crew exec "vim src/server.ts" # interactive terminal (PTY)
crew listen --duration-sec 6
crew browser-debug --url http://127.0.0.1:4319
crew ci-fix --check "npm test"
crew doctor
Pipeline Runtime Flags
- CREW_USE_UNIFIED_ROUTER=false - force-disable UnifiedPipeline routing path
- CREW_LEGACY_ROUTER=true - use legacy router/legacy standalone execution path
- CREW_DUAL_L2_ENABLED=true - enable Dual-L2 planning/decomposition
- CREW_QA_LOOP_ENABLED=true - run QA -> fixer -> final QA gate before completion
- CREW_QA_MAX_ROUNDS=2 - max fixer rounds in QA loop
- CREW_CONTEXT_BUDGET_CHARS=7000 - per-worker retrieved artifact context budget
- CREW_CONTEXT_MAX_CHUNKS=8 - max retrieved artifact chunks per worker
- CREW_CONTEXT_PACK_TTL_HOURS=24 - TTL for persisted context-pack cache in .crew/context-packs
- CREW_TOOL_MODE=auto|native|markers - tool execution mode (default auto)
- CREW_GEMINI_DYNAMIC_DECLARATIONS=true|false - use dynamic Gemini declaration builder (default true)
- CREW_ENABLE_ADVANCED_ADAPTER_TOOLS=true|false - enable safe advanced adapter tools in default pipeline (default true)
- CREW_NO_ROUTER=true|false - skip router classification and force execute-parallel flow
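As an illustration of how flags like these might be read into a typed configuration, here is a minimal sketch. `PipelineFlags`, `readPipelineFlags`, and the helper functions are hypothetical and are not crew-cli's code. Only the `auto` default for `CREW_TOOL_MODE` is documented above; the other fallbacks reuse the example values from the list and are assumptions.

```typescript
// Hypothetical sketch; crew-cli's real flag parsing may differ.
type ToolMode = "auto" | "native" | "markers";

interface PipelineFlags {
  toolMode: ToolMode;         // CREW_TOOL_MODE (documented default: auto)
  qaLoopEnabled: boolean;     // CREW_QA_LOOP_ENABLED
  qaMaxRounds: number;        // CREW_QA_MAX_ROUNDS
  contextBudgetChars: number; // CREW_CONTEXT_BUDGET_CHARS
}

type Env = Record<string, string | undefined>;

function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined) return fallback;
  return value.trim().toLowerCase() === "true";
}

function parseIntFlag(value: string | undefined, fallback: number): number {
  const n = Number.parseInt(value ?? "", 10);
  // Reject NaN and negative values instead of passing them downstream.
  return Number.isFinite(n) && n >= 0 ? n : fallback;
}

function parseToolMode(value: string | undefined): ToolMode {
  // Unknown values fall back to the documented default.
  return value === "native" || value === "markers" ? value : "auto";
}

function readPipelineFlags(env: Env): PipelineFlags {
  return {
    toolMode: parseToolMode(env["CREW_TOOL_MODE"]),
    // Assumption: these fallbacks mirror the example values in the flag list;
    // the source does not state them as defaults.
    qaLoopEnabled: parseBool(env["CREW_QA_LOOP_ENABLED"], true),
    qaMaxRounds: parseIntFlag(env["CREW_QA_MAX_ROUNDS"], 2),
    contextBudgetChars: parseIntFlag(env["CREW_CONTEXT_BUDGET_CHARS"], 7000),
  };
}
```

In a Node.js process this would typically be called as `readPipelineFlags(process.env)`. Taking the environment as a parameter keeps the function easy to test.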
CLI preset flags (chat/auto/dispatch):
- --preset fast6|turbo6|balanced|quality
- --modefast6 shortcut for fast6
- --new-task (chat only) ignores pending clarification resume and starts fresh
Preset summary:
- fast6: 6 parallel workers, QA 2 rounds, no-router, speed-focused
- turbo6: 6 parallel workers, QA off, no-router, max throughput
- balanced: 4 workers, QA 1 round, no-router, mixed speed/quality
- quality: 3 workers, QA 2 rounds + stricter gates, no-router
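The preset summary maps naturally onto a lookup table. The values below are transcribed from that summary; the type and function names (`PresetConfig`, `PRESETS`, `resolvePreset`) are hypothetical and only illustrate one way to validate a `--preset` value.

```typescript
// Values come from the preset summary; names and structure are hypothetical.
type PresetName = "fast6" | "turbo6" | "balanced" | "quality";

interface PresetConfig {
  workers: number;
  qaRounds: number;     // 0 means QA is off
  strictGates: boolean; // "stricter gates" in the quality preset
  noRouter: boolean;
}

const PRESETS: Record<PresetName, PresetConfig> = {
  fast6:    { workers: 6, qaRounds: 2, strictGates: false, noRouter: true },
  turbo6:   { workers: 6, qaRounds: 0, strictGates: false, noRouter: true },
  balanced: { workers: 4, qaRounds: 1, strictGates: false, noRouter: true },
  quality:  { workers: 3, qaRounds: 2, strictGates: true,  noRouter: true },
};

function isPresetName(value: string): value is PresetName {
  return Object.prototype.hasOwnProperty.call(PRESETS, value);
}

/** Returns the preset config or throws on an unknown preset name. */
function resolvePreset(value: string): PresetConfig {
  if (!isPresetName(value)) {
    const known = Object.keys(PRESETS).join("|");
    throw new Error(`unknown preset "${value}"; expected one of ${known}`);
  }
  return PRESETS[value];
}
```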
Diagnostics & Health
crew doctor # checks Node.js, Git, API keys, gateway, MCP, updates (~3s)
crew doctor --gateway http://custom:5010
crew doctor validates your environment and suggests fixes:
- API key detection — shows which of 10 providers are configured
- Cheapest-first hints — when no keys found, recommends Gemini (free) and Groq (free)
- Gateway health — verifies crew-lead is reachable
- MCP server health — checks configured MCP servers
- Update check — shows if a newer version is available on npm
Key Engine Features
| Feature | Status |
|---|---|
| Streaming output | ✅ All providers — Gemini, OpenAI, Anthropic, Grok, DeepSeek, Groq, OpenRouter |
| Session continuity | ✅ SessionManager persists history across REPL sessions |
| Auto-approve mode | ✅ --always-approve flag for unattended execution |
| Turn compression | ✅ Topic-Action-Summary keeps prompts lean on long sessions |
| JIT context | ✅ Files discovered by tools are indexed for subsequent turns |
| Repo-map RAG | ✅ TF-IDF semantic search injected before execution |
| Auto-retry | ✅ Failed tool calls retry up to 3 times with auto-correction |
| Infinite loop detection | ✅ Repeating-action detector stops stuck agents |
| Multimodal vision | ✅ --image flag for Gemini, Claude, GPT-4o, Grok Vision |
| Cost tracking | ✅ Per-session token costs for all providers |
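The auto-retry and loop-detection rows describe two common agent safeguards. The sketch below shows the general idea. It is not crew-cli's implementation, the function names are hypothetical, and it assumes "up to 3 times" means three total attempts and that an agent counts as stuck after three identical actions in a row.

```typescript
// Hypothetical sketch of bounded retry and repeated-action detection.

/**
 * Runs `attempt` up to `maxAttempts` times. Each retry receives the previous
 * error message so the caller can correct its arguments.
 */
function retryWithCorrection<T>(
  attempt: (lastError: string | null) => T,
  maxAttempts = 3, // assumption: 3 total attempts
): T {
  let lastError: string | null = null;
  for (let i = 0; i < maxAttempts; i++) {
    try {
      return attempt(lastError);
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  throw new Error(`failed after ${maxAttempts} attempts: ${lastError}`);
}

/** Flags a stuck agent when the same action repeats `limit` times in a row. */
function isStuck(actions: string[], limit = 3): boolean {
  if (actions.length < limit) return false;
  const tail = actions.slice(-limit);
  return tail.every((a) => a === tail[0]);
}
```

Passing the last error back into the next attempt is what separates auto-correction from a blind retry.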
Adaptive QA + reliability:
- CREW_QA_SMALL_EDIT_THRESHOLD=1 and CREW_QA_SMALL_EDIT_ROUNDS=1 reduce QA rounds for tiny edits
- CREW_DECOMPOSE_MAX_ATTEMPTS=2 retries lightweight decomposition on failure
- CREW_SELF_CONSISTENCY_GATE_ENABLED=true validates synthesized final output against worker evidence
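One plausible reading of the small-edit settings is sketched below. It assumes that edit size is measured as the number of files changed, which the source does not specify. `QaSettings` and `qaRoundsFor` are hypothetical names.

```typescript
// Hypothetical sketch; the real size metric and logic may differ.
interface QaSettings {
  maxRounds: number;          // e.g. CREW_QA_MAX_ROUNDS
  smallEditThreshold: number; // e.g. CREW_QA_SMALL_EDIT_THRESHOLD
  smallEditRounds: number;    // e.g. CREW_QA_SMALL_EDIT_ROUNDS
}

/** Picks the QA round budget, using fewer rounds for tiny edits. */
function qaRoundsFor(filesChanged: number, s: QaSettings): number {
  // Assumption: an edit is "tiny" when it touches at most `smallEditThreshold` files.
  if (filesChanged <= s.smallEditThreshold) {
    return Math.min(s.smallEditRounds, s.maxRounds);
  }
  return s.maxRounds;
}
```

Under these assumptions, with `maxRounds: 2`, `smallEditThreshold: 1`, and `smallEditRounds: 1`, a single-file edit gets one QA round and a five-file edit gets two.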
Standalone default:
- standalone mode now uses UnifiedPipeline by default.
- pass --legacy-router to any command for temporary legacy fallback.
L1/L2/L3 Use Cases
- Use case 1 (Code engine path): command-driven execution (dispatch, auto, run) with full L2/L3 pipeline.
- Use case 2 (Chat-directed execution): user chats with L1 (crew chat), L2 decides/forces execution path, L3 runs workers/tools.
Clarification rule:
- L1 returns final completion when done.
- If L3 emits unresolved ask_user, L1 returns only clarification questions and waits for user input.
- Next crew chat message auto-resumes the pending trace using saved traceId and prior plan artifacts.
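The clarification rule amounts to a small piece of session state. The sketch below models it with hypothetical types (`WorkerResult`, `ChatSession`). It is not crew-cli's code and leaves out the saved plan artifacts.

```typescript
// Hypothetical sketch of the clarification/resume behavior.
type WorkerResult =
  | { kind: "done"; output: string }
  | { kind: "ask_user"; questions: string[] };

interface PendingTrace {
  traceId: string;
  questions: string[];
}

class ChatSession {
  private pending: PendingTrace | null = null;

  /** L1 reply: either the final output or only the clarification questions. */
  handleResult(traceId: string, result: WorkerResult): string {
    if (result.kind === "ask_user") {
      this.pending = { traceId, questions: result.questions };
      return result.questions.join("\n");
    }
    this.pending = null;
    return result.output;
  }

  /** Decides which trace the next chat message should run under. */
  nextTraceId(newTask: boolean, freshId: string): string {
    // --new-task ignores the pending clarification and starts fresh.
    if (newTask || this.pending === null) {
      this.pending = null;
      return freshId;
    }
    return this.pending.traceId;
  }
}
```

The key property is that an unanswered `ask_user` is sticky: the next message resumes the same trace unless the user explicitly opts out with `--new-task`.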
Quick Benchmarking
# Compare latency/pass behavior of presets
node scripts/benchmark-presets.mjs
crew cost now includes pipeline observability counters:
- qa_approved, qa_rejected, qa_rounds_avg
- context_chunks_used, context_chars_saved_est
Context Flags
chat and dispatch accept these context injection flags:
- --docs - auto-retrieve relevant doc chunks via collections search
- --cross-repo - inject sibling repo context
- --context-file <path> - attach a file
- --context-repo <path> - attach git context from another repo
- --stdin - pipe stdin as context
What Is Implemented
- Phase 1 (MVP): complete
- Phase 2 (Intelligence): complete
- Phase 3 (Polish/Launch): complete
- Phase 4 (Advanced): complete
- Phase 5 (3-Tier LLM Scale-Up): complete
See ROADMAP.md for tracked completion.
Testing
npm run build
npm run check
npm test
Latest local QA pass (2026-03-01):
- Build: passing
- Check: passing
- Tests: 178 passing, 0 failing
Documentation
- QUICKSTART.md
- INSTRUCTION-STACK.md
- PERMISSIONS-MODEL.md
- EXAMPLES.md
- API.md
- API-UNIFIED-v1.md — unified dashboard/CLI/headless contract
- MCP-CLI-INTEGRATION.md — Codex/Cursor/Claude MCP setup
- openapi.unified.v1.json — OpenAPI spec
- TROUBLESHOOTING.md
- CONTRIBUTING.md
- SECURITY.md
