@wazir-dev/cli
v1.5.0
Host-native engineering OS kit for AI coding agents — roles, phases, expertise modules, quality gates for Claude, Codex, Gemini & Cursor
AI agents degrade on long tasks — context rots, reviews get rubber-stamped, verification is an honor system. Wazir is the operating model that's missing.
```
$ /wazir Build a REST API with authentication
[clarify] 3 questions asked, answers collected
[specify] 47 acceptance criteria written
[spec-gate] APPROVED
[plan] 6 implementation tasks
[plan-gate] APPROVED
[execute] 6/6 tasks complete, 43 tests passing
[verify] 0 lint errors, proof artifact generated
[review] 2 findings, both resolved
[learn] 3 learnings captured
Pipeline complete. 3/3 gates passed.
```

Quick Start
Requires Claude Code and Node.js 20+.
```
npm install -g @wazir-dev/cli
```

Then in Claude Code:

```
/plugin marketplace add MohamedAbdallah-14/Wazir
/plugin install wazir
```

Run a task:

```
/wazir Build a REST API for task management with authentication
```

Control the depth: `/wazir quick ...` for fast fixes, `/wazir deep ...` for the full pipeline, `/wazir audit ...` for dedicated audits.
What Makes This Different
clarify → specify → [gate] → design → [gate] → plan → [gate] → execute → verify → review → learn
Mandatory research phase. Before any code is written, a researcher fetches live API docs, changelogs, and prior art. Not optional — a pipeline phase.
Adversarial review. The reviewer is never the author. Three gates reject work back until quality passes.
AI writing detection and removal. Strips AI patterns from specs, comments, and commit messages. Output reads like a human wrote it.
Published compliance data. Every run scores itself across five dimensions. Numbers in SQLite, not marketing.
324 expertise modules, deterministic composition. Modules compose based on project context across 13 domains. Same input, same composition, every time.
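A deterministic composition like the one described could be sketched as a pure function from project context to an ordered module list. This is a minimal illustration with hypothetical module names and selection rules, not Wazir's actual catalog or engine:

```typescript
// Hypothetical sketch: deterministic module composition from project context.
// Module names and selection rules are illustrative only.
interface ProjectContext {
  language: string;
  framework: string;
  hasTests: boolean;
}

function composeModules(ctx: ProjectContext): string[] {
  const modules: string[] = [];
  if (ctx.language === "typescript") modules.push("ts-strict-mode");
  if (ctx.framework === "express") modules.push("express-security");
  if (!ctx.hasTests) modules.push("test-scaffolding");
  // Sorted so the same context always yields the same ordered composition.
  return modules.sort();
}

const ctx = { language: "typescript", framework: "express", hasTests: false };
console.log(composeModules(ctx));
// ["express-security", "test-scaffolding", "ts-strict-mode"]
```

Because selection depends only on the context object and the result is sorted, re-running with the same input always produces the same composition.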
Export compiler: one source, four host-native packages. Write the process once, compile to Claude, Codex, Gemini, and Cursor.
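The export-compiler idea can be pictured as one process definition rendered into per-host formats. The types, function names, and output formats below are hypothetical, for illustration only; they are not Wazir's actual compiled artifacts:

```typescript
// Hypothetical sketch of an export compiler: one process definition,
// several host-native outputs. Formats are illustrative, not Wazir's.
type Host = "claude" | "codex" | "gemini" | "cursor";

interface ProcessDef {
  name: string;
  phases: string[];
}

function compileFor(def: ProcessDef, host: Host): string {
  const flow = def.phases.join(" -> ");
  switch (host) {
    case "claude": // e.g. a slash-command markdown file
      return `# /${def.name}\n${flow}`;
    case "cursor": // e.g. a JSON rules file
      return JSON.stringify({ command: def.name, flow });
    default: // plain text for hosts without a richer format
      return `${def.name}: ${flow}`;
  }
}

const def = { name: "wazir", phases: ["clarify", "specify", "execute"] };
console.log(compileFor(def, "claude"));
// # /wazir
// clarify -> specify -> execute
```

The point of the pattern: the process source is written once, and each host target is just another render function over the same definition.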
Fresh agent per step. Each pipeline phase gets a clean context window. No carryover. No contamination between research, implementation, and review.
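The gate mechanics described in this section, where a gate rejects work back to the producing phase until quality passes, can be sketched roughly as follows. Names (`Phase`, `runWithGate`) and the retry policy are hypothetical; Wazir's real internals are not shown in this README:

```typescript
// Hypothetical sketch of a gate-enforced pipeline phase (not Wazir's real API).
type PhaseResult = { output: string; passed: boolean };
type Phase = (input: string) => PhaseResult;

// A gate re-runs the phase, rejecting its work back until it passes.
function runWithGate(phase: Phase, input: string, maxRetries = 3): string {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const result = phase(input);
    if (result.passed) return result.output;
  }
  throw new Error("Gate rejected work after max retries");
}

// Toy "specify" phase that only passes once enough criteria accumulate.
let criteria = 0;
const specify: Phase = (task) => {
  criteria += 20;
  return { output: `${task}: ${criteria} criteria`, passed: criteria >= 40 };
};

const spec = runWithGate(specify, "Build a REST API");
console.log(spec); // "Build a REST API: 40 criteria"
```

In the real system each retry would also start from a clean context window, which is what the "fresh agent per step" property above is about.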
Wazir vs. Alternatives
| Dimension | Wazir | Claude Code (bare) | Superpowers | Spec-Kit | Raw Prompting |
|---|:---:|:---:|:---:|:---:|:---:|
| Enforced delivery pipeline | 14 phases, 3 gates | — | — | Spec-first | — |
| Mandatory pre-coding research | ✓ | — | — | — | — |
| Adversarial review (reviewer ≠ author) | ✓ | — | — | — | — |
| Expertise composition per task | 324 modules | — | ~15 skills | Extensions | — |
| Published compliance measurement | ✓ | — | — | — | — |
| AI writing detection + removal | ✓ | — | — | — | — |
| Export compiler (one source → host-native) | 4 hosts | — | 5 hosts (per-host install) | 3 hosts | — |
Wazir is not competing with these tools — it learned from all of them. See Acknowledgments.
Documentation
| Section | What You'll Find |
|---|---|
| Architecture | System design, component interactions, context tiers |
| Roles & Workflows | 10 roles, 14 phases, gate mechanics |
| Composition Engine | How 324 modules are assembled per task |
| Pipeline Vision | Every design decision with research citations |
| Research Index | 122 research files across 14 categories |
Wazir (وزير): Arabic for advisor.
Active development. The pipeline works. Rough edges remain.
Acknowledgments

| Project | What Wazir Learned |
|---|---|
| superpowers | Skill system architecture, bootstrap injection pattern |
| spec-kit | Specification-driven development patterns |
Full acknowledgments · Contributing · Security · MIT License
