argus-qa

v1.5.0

Published

2 months ago

Argus — All-Seeing QA Guardian. An autonomous Claude Code agent that performs end-to-end quality assurance: codebase intelligence, Playwright exploration, ticket analysis, AC writing, tracker updates, and validation.

argus-qa

Argus — All-Seeing QA Guardian & Knowledge Base Generator for Claude Code.

A zero-dependency npm package that installs autonomous Claude Code agents directly into your project's .claude/ directory, making them immediately available in Claude Code — including on the Max plan.

Three agents in one package:

Argus QA Agent — Performs a complete, six-phase QA sweep: reads your codebase, explores the running application with Playwright, harvests your GitHub/Jira tickets, writes professional Acceptance Criteria and Manual Test Cases, updates the tracker, and validates everything with a final Playwright pass.
Knowledge Base Agent — Analyzes your codebase and produces comprehensive, structured documentation in a docs/ directory. Detects languages and frameworks automatically, fires parallel sub-agents to document each technology, and supports incremental re-runs.
Sentinel Agent — Reads the Knowledge Base, analyzes your source code, and generates comprehensive test scenarios — unit tests, integration tests, E2E tests, and edge cases — then creates structured tickets in GitHub Issues or Jira. The bridge between understanding and testing.

Quick Start

# In your project root
npx argus-qa init

Then open Claude Code and say:

Run Argus. App is at http://localhost:3000, using GitHub Issues.

Or to generate project documentation:

Create knowledge base

Or to generate test scenarios and create testing tickets (requires Knowledge Base):

Run Sentinel. Using GitHub Issues.

That's it. The agents take over from there.

Installation

You don't need to install the package globally. npx argus-qa init downloads and runs it in one step.

If you prefer a local dev dependency (so your team always runs the same version):

npm install --save-dev argus-qa
npx argus-qa init

Or globally, if you want argus-qa available in every project:

npm install -g argus-qa
cd your-project
argus-qa init

What `init` does

Running init creates the following files in your project, which Claude Code reads automatically on startup:

.claude/
├── agents/
│   ├── argus.md                          ← QA orchestrator agent
│   ├── knowledge-base.md                 ← Knowledge Base orchestrator agent
│   └── sentinel.md                       ← Sentinel test-scenario agent
└── skills/
    ├── argus/
    │   ├── 00-project-intelligence.md    ← Phase 1: codebase analysis
    │   ├── 01-playwright-explorer.md     ← Phase 2: visual app exploration
    │   ├── 02-ticket-harvester.md        ← Phase 3: ticket fetch + Playwright probes
    │   ├── 03-qa-documentation.md        ← Phase 4: ACs, test cases, bug reports
    │   ├── 04-ticket-updater.md          ← Phase 5: GitHub/Jira updates
    │   └── 05-playwright-validator.md    ← Phase 6: Playwright validation pass
    ├── knowledge-base/
    │   ├── phases/
    │   │   ├── 00-technology-detection.md    ← detect languages, frameworks, tools
    │   │   ├── 01-architecture-planning.md   ← plan docs/ directory structure
    │   │   ├── 02-documentation-orchestrator.md ← fire parallel doc agents
    │   │   └── 03-tracker-logging-manager.md ← checksums, tracking, logs
    │   ├── languages/                        ← 12 language analysis skills
    │   │   ├── python.md, javascript.md, typescript.md, csharp.md,
    │   │   ├── java.md, go.md, rust.md, ruby.md,
    │   │   └── php.md, swift.md, kotlin.md, cpp.md
    │   └── frameworks/                       ← 17 framework analysis skills
    │       ├── django.md, fastapi.md, flask.md, express.md,
    │       ├── nestjs.md, nextjs.md, nuxtjs.md, react.md,
    │       ├── vue.md, angular.md, svelte.md, tailwindcss.md,
    │       └── vuetify.md, dotnet.md, spring-boot.md, rails.md, laravel.md
    └── sentinel/
        ├── phases/
        │   ├── 00-kb-ingestion.md            ← read and synthesize Knowledge Base
        │   ├── 01-edge-case-discovery.md     ← dispatch parallel analysis agents
        │   ├── 02-ticket-drafting.md         ← draft tickets per finding
        │   └── 03-tracker-publisher.md       ← approval gate + publish to tracker
        └── analysis/                         ← 8 analysis category skills
            ├── error-handling.md, input-validation.md,
            ├── concurrency.md, security.md,
            ├── performance.md, state-management.md,
            └── api-contracts.md, data-integrity.md

Claude Code discovers agents in .claude/agents/ and skills in .claude/skills/ automatically — no additional configuration is needed.

Commands

`argus-qa init`

Installs the Argus, Knowledge Base, and Sentinel agents and their skills into the current project. Safe by default: files that already exist are skipped. Use --force to overwrite.

npx argus-qa init
npx argus-qa init --force     # overwrite existing files
npx argus-qa init --dry-run   # preview without writing anything
npx argus-qa init --yes       # skip confirmation prompt (CI-friendly)
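
The skip-unless-forced behavior described above can be sketched as a small planning function. This is an illustration only, not the package's actual implementation: the function name planInstall and its return shape are hypothetical, and the only behavior taken from the text is that existing files are skipped unless force is set.

```javascript
// Hypothetical sketch: decide what `init` would do with each bundled file.
const fs = require('fs');
const path = require('path');

/**
 * @param {string[]} relativePaths files the package wants to install
 * @param {string} projectRoot     target project directory
 * @param {{force?: boolean}} opts mirrors the --force flag
 * @returns {{write: string[], skip: string[]}}
 */
function planInstall(relativePaths, projectRoot, opts = {}) {
  const plan = { write: [], skip: [] };
  for (const rel of relativePaths) {
    const exists = fs.existsSync(path.join(projectRoot, rel));
    // Existing files are left alone unless force is set.
    if (exists && !opts.force) plan.skip.push(rel);
    else plan.write.push(rel);
  }
  return plan;
}
```

Computing a plan before touching the disk is one way to support a preview mode such as --dry-run: print the plan and stop, or apply it.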

`argus-qa update`

Re-installs all files from the version bundled in the currently installed package. Run this after upgrading argus-qa.

npx argus-qa update
npx argus-qa update --dry-run
npx argus-qa update --yes

`argus-qa status`

Shows which files are installed, missing, or outdated compared to the current package version.

npx argus-qa status

Example output:

  ✔ .claude/agents/argus.md  (up to date)
  ✔ .claude/skills/argus/00-project-intelligence.md  (up to date)
  ↑ .claude/skills/argus/03-qa-documentation.md  (outdated — run argus-qa update)
  ✖ .claude/skills/argus/05-playwright-validator.md  (not installed)
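
The three states in this output could be derived roughly as follows. This is a sketch under assumptions: the README does not say how "outdated" is determined, so the content comparison below is a guess, and classifyFile is a hypothetical helper, not part of the package.

```javascript
// Hypothetical sketch: classify one installed file against the bundled copy.
const fs = require('fs');
const path = require('path');

/**
 * @param {string} projectRoot    project directory containing .claude/
 * @param {string} relPath        path of the file relative to projectRoot
 * @param {string} bundledContent content shipped in the current package version
 * @returns {'not installed' | 'up to date' | 'outdated'}
 */
function classifyFile(projectRoot, relPath, bundledContent) {
  const target = path.join(projectRoot, relPath);
  if (!fs.existsSync(target)) return 'not installed';
  // Assumption: a file counts as outdated when its content differs.
  const installed = fs.readFileSync(target, 'utf8');
  return installed === bundledContent ? 'up to date' : 'outdated';
}
```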

`argus-qa uninstall`

Removes all files installed by argus-qa. The .claude/ directory itself is left in place, so other agents and skills you have installed are not affected.

npx argus-qa uninstall
npx argus-qa uninstall --dry-run
npx argus-qa uninstall --yes

Triggering Argus in Claude Code

Once installed, Argus is triggered by natural language phrases in Claude Code. You don't need to use a specific command syntax — Claude Code recognises the intent from context.

The following phrases (and variations) will activate Argus:

"Run Argus. App is at http://localhost:3000, using GitHub Issues."
"Argus — full QA sweep. Jira project KEY. Staging URL: https://staging.myapp.com"
"Start a full quality assurance run with Argus"
"qa the project", "full qa sweep", "audit the project"

Argus will ask a few setup questions if it can't infer what it needs (app URL, tracker type, credentials for Jira), then proceed through its six phases.

What Argus Produces

Every run writes to .argus/run-<TIMESTAMP>/ in your project root, with a latest symlink for easy access:

.argus/
└── run-20250305T120000/
    ├── 00-project-intel/
    │   ├── BUSINESS_LOGIC.md      ← every business rule, user role, and workflow
    │   ├── ARCHITECTURE.md        ← tech stack, API surface, domain model
    │   ├── TEST_COVERAGE.md       ← existing coverage + gap analysis
    │   └── CODEBASE_MAP.md        ← annotated directory structure
    ├── 01-playwright-exploration/
    │   ├── EXPLORATION_REPORT.md  ← every route and feature found
    │   └── screenshots/
    ├── 02-ticket-analysis/
    │   ├── TICKET_OVERVIEW.md
    │   └── tickets/<ID>/
    │       ├── TICKET_DETAIL.md   ← enriched with Playwright findings
    │       └── screenshots/
    ├── 03-qa-documentation/
    │   └── tickets/<ID>/
    │       ├── ACCEPTANCE_CRITERIA.md
    │       ├── MANUAL_TEST_CASES.md
    │       └── BUG_REPORT.md      ← only present when bugs were found
    ├── 04-tracker-updates/
    │   └── UPDATE_LOG.md
    └── 05-validation/
        ├── VALIDATION_REPORT.md
        └── screenshots/
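
For scripts that need to locate or create a run directory, the name shown above (run-20250305T120000) can be reproduced with a small formatter. This is a minimal sketch: runDirName is a hypothetical helper, and the use of UTC is an assumption, since the README does not state which time zone the timestamp uses.

```javascript
// Hypothetical sketch: build a directory name like "run-20250305T120000".
function runDirName(date = new Date()) {
  const pad = (n) => String(n).padStart(2, '0');
  // Assumption: the timestamp is expressed in UTC.
  const stamp =
    String(date.getUTCFullYear()) +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    'T' +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds());
  return `run-${stamp}`;
}
```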

The Six Phases

Phase 1 — Project Intelligence reads the entire codebase (source files, tests, config, API definitions, migration files) and produces four structured documents covering business logic, architecture, test coverage gaps, and a codebase map. This gives Argus a foundation to interpret everything that follows.

Phase 2 — Playwright Explorer opens a real Chromium browser, creates a test user through the app's own registration flow, navigates every discoverable route, fills out forms, and captures screenshots. It behaves much like an exploratory tester on their first day with the product.

Phase 3 — Ticket Harvest fetches all open tickets from GitHub Issues and/or Jira, then fires a targeted Playwright session per ticket — navigating to the relevant feature and attempting to observe or reproduce the reported behavior. This phase spawns parallel sub-agents for speed.

Phase 4 — QA Documentation is the intellectual core of the run. Per-ticket parallel sub-agents each read all Phase 1–3 context and produce three professional QA documents: Acceptance Criteria in Given/When/Then format, step-by-step Manual Test Cases with priority levels, and Bug Reports for any defects discovered along the way.

Phase 5 — Tracker Updates posts the Phase 4 documentation to GitHub or Jira as structured comments, adds QA labels, and creates new bug tickets for anything Argus found. Argus always shows you the full update manifest and waits for explicit approval before posting a single thing to the tracker.

Phase 6 — Playwright Validation converts the written test cases into executable Playwright scripts, runs them against the live application, and produces a validation report that distinguishes between application bugs, documentation inaccuracies, and environment issues.
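
To make the Given/When/Then format from Phase 4 concrete, here is a sketch of how one criterion could be rendered as text. The field names (id, given, when, then) and the sample values are illustrative assumptions; the actual layout of ACCEPTANCE_CRITERIA.md is not specified in this README.

```javascript
// Hypothetical sketch: render one acceptance criterion as Given/When/Then text.
function renderCriterion({ id, given, when, then }) {
  return [
    `${id}`,
    `  Given ${given}`,
    `  When ${when}`,
    `  Then ${then}`,
  ].join('\n');
}

// Example with made-up content.
const sample = renderCriterion({
  id: 'AC-1',
  given: 'a registered user on the login page',
  when: 'they submit valid credentials',
  then: 'they are redirected to the dashboard',
});
```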

Knowledge Base Agent

The Knowledge Base agent analyzes your codebase and generates structured documentation in a docs/ directory. It works in four phases:

Phase 0 — Technology Detection scans your project for languages, frameworks, build tools, testing frameworks, and infrastructure. It detects 12 languages (Python, JavaScript, TypeScript, C#, Java, Go, Rust, Ruby, PHP, Swift, Kotlin, C++) and 17 frameworks (Django, FastAPI, Flask, Express, NestJS, Next.js, Nuxt.js, React, Vue, Angular, Svelte, TailwindCSS, Vuetify, .NET, Spring Boot, Rails, Laravel).

Phase 1 — Architecture Planning creates a docs/plan.json with the complete documentation structure, metadata.json files explaining each directory's purpose, and initializes docs/tracker.md for incremental tracking.

Phase 2 — Documentation Orchestrator fires parallel Task sub-agents (max 5 concurrent), each using the appropriate language or framework skill to analyze relevant source files and produce detailed documentation. Each skill provides an 8-section analysis covering project structure, idioms, testing, dependencies, build tools, libraries, code organization, and documentation standards.

Phase 3 — Tracker & Logging Manager computes SHA-256 checksums for all documented source files, finalizes the tracker for incremental re-runs, and generates CLAUDE.md files summarizing KB coverage.
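
The "max 5 concurrent" limit in Phase 2 is a standard bounded-concurrency pattern. The sketch below shows that pattern in plain Node.js for illustration; it is not how Claude Code schedules Task sub-agents, and runWithLimit is a hypothetical name.

```javascript
// Sketch of bounded concurrency: run async tasks with at most `limit` in flight.
async function runWithLimit(tasks, limit = 5) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results; // same order as the input tasks
}
```

Each element of tasks is a function returning a promise, so work starts only when a worker picks it up.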

Trigger phrases

"Create knowledge base"
"Generate project docs"
"Build documentation"
"Document this codebase"

Incremental re-runs

When you run the Knowledge Base agent again after code changes, it reads docs/tracker.md, detects which source files have changed via SHA-256 checksums, and only re-documents the changed modules. New files are detected automatically.
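
Checksum-based change detection of this kind can be sketched with Node's built-in crypto module. The shape of the recorded map is an assumption (the format of docs/tracker.md is not documented here), and both function names are hypothetical.

```javascript
// Sketch: detect changed and new source files by comparing SHA-256 digests.
const crypto = require('crypto');
const fs = require('fs');

function sha256File(filePath) {
  return crypto.createHash('sha256').update(fs.readFileSync(filePath)).digest('hex');
}

/**
 * @param {Record<string, string>} recorded path -> digest from the previous run
 *        (assumed to have been parsed out of the tracker beforehand)
 * @param {string[]} currentFiles source files present now
 */
function diffAgainstTracker(recorded, currentFiles) {
  const changed = [];
  const added = [];
  for (const file of currentFiles) {
    const known = Object.prototype.hasOwnProperty.call(recorded, file);
    if (!known) added.push(file);
    else if (recorded[file] !== sha256File(file)) changed.push(file);
  }
  return { changed, added };
}
```

Only the files in changed and added would need to be re-documented; everything else can be left untouched.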

Output structure

docs/
├── plan.json                 ← full KB creation plan
├── tracker.md                ← file checksums for incremental updates
├── detection-manifest.json   ← technology detection results
├── metadata.json             ← root-level metadata
├── logs/                     ← progress and response logs
├── <technology>/             ← one directory per detected technology
│   ├── metadata.json         ← why this directory exists
│   ├── overview.md           ← high-level overview
│   ├── architecture.md       ← architecture patterns
│   ├── modules/              ← per-module documentation
│   ├── patterns.md           ← idioms and patterns
│   ├── dependencies.md       ← dependency analysis
│   ├── testing.md            ← testing approach
│   └── build-deploy.md       ← build and deployment
├── cross-cutting/            ← how technologies interact
└── CLAUDE.md                 ← summary for AI context

Sentinel Agent

The Sentinel agent reads your Knowledge Base (docs/) and generates comprehensive test scenarios — unit tests, integration tests, E2E tests, and edge cases. It then creates structured tickets in GitHub Issues or Jira so your team has a complete testing roadmap.

Sentinel does NOT look for bugs. It asks: "What tests should exist for this codebase?"

Prerequisite: Run the Knowledge Base agent first ("Create knowledge base").

Trigger phrases

"Run Sentinel. Using GitHub Issues."
"Generate test scenarios"
"Create test tickets"
"Sentinel sweep", "Find test gaps", "Plan testing"

Four phases

Phase 0 — KB Ingestion reads the entire docs/ Knowledge Base and produces a condensed summary, a test coverage map, and a testing strategy.

Phase 1 — Test Scenario Discovery dispatches parallel sub-agents across 8 testing categories: Error Handling, Input Validation, Concurrency, Security, Performance, State Management, API Contracts, and Data Integrity. Each sub-agent generates test scenarios armed with KB context.

Phase 2 — Ticket Drafting organizes scenarios and drafts tickets classified as: UNIT (unit tests), INT (integration tests), E2E (end-to-end tests), or EDGE (edge case tests). Default limit: 50 tickets per run.

Phase 3 — Tracker Publisher shows the full publish manifest and waits for explicit approval before creating any tickets.
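
The classification and per-run cap from Phase 2 can be illustrated as follows. Only the four type codes and the default limit of 50 come from the text above; the draftTickets function, the scenario fields, and the ID scheme (for example UNIT-001) are hypothetical.

```javascript
// Hypothetical sketch: classify scenarios into ticket drafts and cap the count.
const TICKET_TYPES = ['UNIT', 'INT', 'E2E', 'EDGE'];

function draftTickets(scenarios, limit = 50) {
  const counters = {};
  return scenarios
    .filter((s) => TICKET_TYPES.includes(s.type)) // drop unknown types
    .slice(0, limit)                               // enforce the per-run cap
    .map((s) => {
      counters[s.type] = (counters[s.type] || 0) + 1;
      // Assumed ID scheme: type prefix plus a zero-padded counter per type.
      const id = `${s.type}-${String(counters[s.type]).padStart(3, '0')}`;
      return { id, type: s.type, title: s.title };
    });
}
```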

Output structure

.sentinel/
└── run-<TIMESTAMP>/
    ├── 00-kb-analysis/
    │   ├── KB_SUMMARY.md             ← condensed KB synthesis
    │   ├── TEST_COVERAGE_MAP.md      ← what's tested vs what's not
    │   └── TESTING_STRATEGY.md       ← recommended test types per module
    ├── 01-test-scenarios/
    │   ├── SCENARIO_SUMMARY.md       ← aggregated scenarios
    │   └── categories/               ← one file per testing category
    ├── 02-tickets/
    │   ├── TICKET_MANIFEST.md        ← summary of all drafted tickets
    │   └── drafts/                   ← individual ticket files (UNIT/INT/E2E/EDGE)
    └── 03-published/
        ├── PUBLISH_LOG.md            ← chronological action log
        └── created-tickets/          ← copies with ticket URLs

Approval Checkpoints

Argus pauses for your explicit approval at five points during a run. This is intentional: the agent gathers information autonomously, but humans stay in control of every decision that has real-world effects (such as posting to a tracker or writing test documentation that will guide a team).

The five checkpoints are:

After Phase 1 (before Playwright exploration)
After Phase 2 (before ticket analysis)
After Phase 3 (before writing QA docs)
After Phase 4 (before touching the tracker; also shows the full update manifest)
After Phase 6 (final report)

You can stop after any checkpoint without losing work — all prior-phase documents are already written.
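
The checkpoint behavior amounts to a phase loop with an approval gate. The sketch below is a conceptual model only; Argus is driven by agent prompts rather than code like this, and runWithCheckpoints, the phase objects, and the approve callback are all hypothetical.

```javascript
// Conceptual sketch: run phases in order, pausing for approval at checkpoints.
async function runWithCheckpoints(phases, approve) {
  const completed = [];
  for (const phase of phases) {
    await phase.run(); // the phase writes its documents before any pause
    completed.push(phase.name);
    if (phase.checkpoint && !(await approve(phase.name))) {
      break; // stopping here keeps every earlier phase's output
    }
  }
  return completed;
}

// Checkpoints follow Phases 1, 2, 3, 4, and 6, as listed above.
const CHECKPOINT_PHASES = new Set([1, 2, 3, 4, 6]);
```

Because each phase finishes writing before the gate is evaluated, declining at a checkpoint never discards completed work.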

Prerequisites

The following tools need to be available in your environment before running Argus. Most will already be present in a typical development setup.

Node.js 18 or newer and npm are required for Playwright. Playwright itself is installed automatically by Argus if it isn't already present.

Git — the project directory should be a git repository.

GitHub CLI (gh) authenticated via gh auth login, if you are using GitHub Issues as your ticket tracker. Verify with gh auth status.

Jira CLI or a Jira MCP connection, if using Jira.

A running application — Argus needs to be able to open a browser and load your app. Provide the base URL when invoking Argus (e.g., http://localhost:3000 or your staging URL).
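
If you want to verify these prerequisites from a script before invoking Argus, a pre-flight check might look like the sketch below. The helper names and messages are hypothetical and not part of argus-qa; the only external commands used are git rev-parse --is-inside-work-tree and gh auth status.

```javascript
// Hypothetical pre-flight check for the prerequisites listed above.
const { spawnSync } = require('child_process');

function nodeMajor(version = process.versions.node) {
  return Number.parseInt(version.split('.')[0], 10);
}

// True only if the command exists and exits with status 0.
function commandSucceeds(cmd, args) {
  const result = spawnSync(cmd, args, { stdio: 'ignore' });
  return result.status === 0;
}

function checkPrerequisites({ tracker } = {}) {
  const problems = [];
  if (nodeMajor() < 18) problems.push('Node.js 18 or newer is required');
  if (!commandSucceeds('git', ['rev-parse', '--is-inside-work-tree'])) {
    problems.push('the project directory is not a git repository');
  }
  if (tracker === 'github' && !commandSucceeds('gh', ['auth', 'status'])) {
    problems.push('GitHub CLI is not authenticated; run gh auth login');
  }
  return problems; // an empty array means every check passed
}
```

The Jira and running-application checks are omitted because they depend on your specific setup.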

Works with Claude Code Max

Argus uses Claude Code's sub-agent spawning (the Task tool) to run Phase 3 and Phase 4 work in parallel. This works on the Max plan and any Claude Code subscription that supports multi-agent workflows. The agent file declares tools: Task, Bash, Read, Write, Edit, Glob, Grep, mcp__github, mcp__jira — Claude Code will use whichever of these are available in your environment.

`.gitignore` recommendations

You can either commit .argus/ to give your team a permanent, searchable QA history, or ignore it to keep it local. If you want to keep the text documents but exclude large binary files:

# Keep text docs, exclude screenshots and video recordings
.argus/**/screenshots/
.argus/**/flows/

# Or ignore the entire audit directory
.argus/

Programmatic API

The package also exposes a small programmatic API for build scripts and monorepo tooling:

const argus = require('argus-qa');

// Install silently (no prompts)
const result = argus.install({ force: true });
console.log(`Installed ${result.installedCount} files`);

// Check status
const { missing, outdated } = argus.status();
if (missing.length > 0) {
  console.warn('Argus files missing — run argus-qa init');
}

// Remove all files
argus.uninstall();

License

MIT
