devxira

v2.0.0

Published

17 days ago

MCP server for developer workflows — PR review, daily reports, task management, plugin research

Downloads

665

_hifromabul

mcp mcp-server claude devtools github wordpress workflow mcp-prompts

Devxira

An MCP server that turns plain-English requests into senior-engineer developer workflows — PR review loops, QA automation, daily standup reports, competitor research, and WordPress plugin work — driven through your existing MCP client.

Devxira is a Model Context Protocol server. It gives your AI agent (Claude Code, Claude Desktop, Cursor, ChatGPT, …) two things: a few live tools that run real work in-process (Git, WordPress), and a library of workflow prompts — battle-tested, multi-step playbooks (PR-review loops, exploratory QA, competitor research) that your agent executes by orchestrating companion MCP servers like GitHub, ClickUp, and Playwright. You drive all of it by just describing what you want.

What is Devxira?

Capabilities come in two kinds, and the difference matters:

flowchart LR
    U["You: plain-English request"] --> C["MCP client<br/>Claude Code · Desktop · Cursor · ChatGPT"]
    C --> D{{"devxira MCP server"}}

    D --> L["Live tools<br/>execute in-process"]
    L --> G["git_* — shell, stdio only"]
    L --> W["wordpress_* — wordpress.org API"]
    G --> R1["real results"]
    W --> R1

    D --> P["Workflow prompts<br/>return a playbook, not a result"]
    P --> A["Your agent runs the playbook"]
    A --> CS["Companion MCP servers<br/>GitHub · ClickUp · Playwright · web"]
    CS --> R2["PRs reviewed · tasks filed · apps QA'd · reports written"]

Live tools (git_*, wordpress_*) do the work themselves and hand back real results.
Workflow prompts (qa_pilot, pr_bulk_review_loop, product_intelligence, …) return a detailed instruction set; your agent then carries it out, calling companion MCP servers (GitHub, ClickUp, Playwright) as needed. If a required companion isn't installed, the workflow tells your agent to print the one-line install command and stop — nothing fails silently.

You never call a workflow with special syntax. You describe the goal; the agent picks the matching workflow and fills in the arguments from your sentence.
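The difference between the two kinds can be sketched as a tagged union. This is only an illustration: the type names, field names, and `nextAction` helper below are hypothetical and not part of Devxira's API.

```typescript
// Hypothetical shapes for the two kinds of capability described above.
type LiveToolResponse = { kind: "live_tool"; tool: string; result: string };
type WorkflowPromptResponse = {
  kind: "workflow_prompt";
  workflow: string;
  steps: string[];      // the playbook the agent still has to execute
  companions: string[]; // companion MCP servers the playbook calls
};
type DevxiraResponse = LiveToolResponse | WorkflowPromptResponse;

// What is left for the agent to do once the server has answered.
function nextAction(response: DevxiraResponse): string {
  if (response.kind === "live_tool") {
    // The work already ran inside the server; only the result needs relaying.
    return `show result of ${response.tool}`;
  }
  // Nothing has run yet: the agent carries out the playbook itself.
  const via = response.companions.length > 0 ? response.companions.join(", ") : "no companion servers";
  return `run ${response.steps.length} steps of ${response.workflow} via ${via}`;
}
```

A live tool response is already the answer, while a workflow prompt response is a to-do list for the agent.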

Quickstart (60 seconds)

Add the server to Claude Code:
```
claude mcp add devxira -- npx devxira
```
Confirm it's connected: run /mcp — devxira should be listed. (Restart Claude Code if you added it mid-session.)

Try it. Type, in plain English:

Run test_connection

Then a real one:

Research the WordPress plugin wpforms and plan an MVP

That's the whole interface: plain English, no special commands. A workflow that needs GitHub, ClickUp, or Playwright tells you the exact claude mcp add … command when that companion is missing. See Installation for other clients (Desktop, Cursor, Windsurf, Codex) and companion setup.

How requests become workflows

flowchart LR
    A["You describe a goal<br/>'Audit PR #42'"] --> B[Agent matches it to a workflow]
    B --> C[Agent pulls args from your sentence<br/>pr = 42]
    C --> D{Needs a companion?}
    D -- yes, missing --> E[Prints install command + stops]
    D -- yes, present --> F[Runs the playbook via companion]
    D -- no --> F
    F --> G[Report / fixes / PR comment]

You only add as much detail as you need — every instruction follows the same shape:

[ run <workflow> | describe the goal ] + the required input + (optional) any options, in plain words

Using qa_pilot as the example, each row does more than the one above it:

| Level | Style | Example instruction | What the agent runs |
|------|-------|---------------------|------------------|
| 1 | Describe the goal (let the agent choose) | "QA-test my app at http://localhost:3000 before I merge." | qa_pilot · feature mode · defaults |
| 2 | Name it + required input | "Run qa_pilot on http://localhost:3000." | qa_pilot with defaults |
| 3 | Name it + options (just say the option words) | "Run qa_pilot on https://staging.acme.com in deploy mode, depth deep, focus checkout." | sets mode, depth, focus |
| 4 | Name it + sources (derive cases from a PR / task) | "Run qa_pilot on http://localhost:3000 for PR #42 and task abc123." | adds pr + task_id as case sources |

In the catalog below, bold args are required and a | b | c lists an enum's allowed values — say any value in your sentence to set it.
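To make the sentence-to-arguments mapping concrete, here is a rough sketch for qa_pilot. In practice the agent's language understanding does this step; the keyword matcher below is a hypothetical stand-in, and `sketchQaPilotArgs` is not a Devxira function. Free-text arguments such as focus are deliberately left out.

```typescript
// Enum values copied from the qa_pilot argument list in this README.
const QA_PILOT_ENUMS: Record<string, string[]> = {
  env: ["local", "staging", "production"],
  mode: ["feature", "deploy"],
  depth: ["smoke", "standard", "deep", "exhaustive"],
};

// Hypothetical helper: naive keyword matching, standing in for the agent.
function sketchQaPilotArgs(sentence: string): Record<string, string> {
  const args: Record<string, string> = {};
  const urlMatch = sentence.match(/https?:\/\/[^\s,]+/);
  if (urlMatch) {
    // Drop a trailing sentence period that the URL pattern may have swallowed.
    args["base_url"] = urlMatch[0].replace(/[.]+$/, "");
  }
  // Remove the URL so words inside it (for example "staging") are not read as options.
  const rest = (urlMatch ? sentence.replace(urlMatch[0], " ") : sentence).toLowerCase();
  for (const argName of Object.keys(QA_PILOT_ENUMS)) {
    const values = QA_PILOT_ENUMS[argName] || [];
    const hits = values.filter((value) => new RegExp("\\b" + value + "\\b").test(rest));
    if (hits.length > 0) args[argName] = hits[0];
  }
  return args;
}
```

With the level 3 instruction from the table, this sketch yields base_url, mode deploy, and depth deep; anything not mentioned keeps its default.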

Workflow catalog

12 workflow prompts, one line each; full arguments and an example for every workflow are listed under Workflow details. The Companion column shows which workflows need a companion server; if it's missing, the workflow prints the install command instead of failing.

Where output goes: report- and screenshot-producing workflows (qa_test, qa_pilot, competitor_exploration, product_intelligence, research_plugin) always write to a standard, gitignored run folder — .devxira/evidence/<workflow>/<timestamp>-<slug>/ — with consistent screenshot filenames. There are no output_path/baseline_dir location params; the location is fixed, so output is predictable every run.
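As an illustration of that layout, the run folder could be assembled like this. The README fixes only the overall pattern; the timestamp format, the slug rules, and the `evidenceRunFolder` name are assumptions made for this sketch.

```typescript
// Hypothetical helper: builds .devxira/evidence/<workflow>/<timestamp>-<slug>/
function evidenceRunFolder(workflow: string, label: string, when: Date): string {
  // Assumed timestamp format: compact UTC, for example 20250102-030405.
  const stamp = when.toISOString().slice(0, 19).replace(/[-:]/g, "").replace("T", "-");
  // Assumed slug rule: lowercase, runs of other characters become one hyphen.
  const slug = label.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  return `.devxira/evidence/${workflow}/${stamp}-${slug}/`;
}

// Example call with a fixed date so the result is reproducible.
const exampleRunFolder = evidenceRunFolder("qa-pilot", "Staging checkout", new Date(Date.UTC(2025, 0, 2, 3, 4, 5)));
// exampleRunFolder is ".devxira/evidence/qa-pilot/20250102-030405-staging-checkout/"
```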

| Workflow | What it does | Required | Companion |
|---|---|---|:--:|
| test_connection | Verify every integration (GitHub, ClickUp, Playwright) is wired up | none | none |
| daily_report | Build a standup report from GitHub activity, post it to ClickUp | org | GitHub · ClickUp |
| audit_pr | One-pass deep audit → severity-rated bug report | pr | GitHub |
| pr_feedback_loop | Incremental review → verify → fix → re-review loop (small/medium PRs) | none | GitHub |
| pr_bulk_review_loop | Bulk parallel discovery → cluster → fix-plan → apply (large / cross-cutting) | none | GitHub · Playwright* |
| implement_task | End-to-end ClickUp task implementation with a PR | task_id | GitHub · ClickUp |
| research_plugin | WordPress plugin market analysis + MVP plan | plugin | none |
| build_plugin | Turn a research report into a step-by-step AI build plan | plugin | none |
| qa_test | Quick browser QA: verify specific bugs with screenshot evidence | url | Playwright |
| qa_pilot | Full human-like exploratory + acceptance + regression QA (MAP→COVER→DONE) | base_url | Playwright |
| product_intelligence | Evidence-based competitor/feature research → 18-section report | target | Web · GitHub* · Playwright* |
| competitor_exploration | Hands-on browser exploration of a competitor's live product | platform | Playwright |

* Optional / conditional. product_intelligence uses built-in web search/fetch — GitHub and Playwright are optional boosters. For pr_bulk_review_loop, Playwright fires automatically only when the PR touches UI.
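The "print the install command and stop" behavior can be pictured as a preflight check. The sketch below is hypothetical: the helper name and the data layout are illustrative, while the companion requirements and install commands are copied from this README.

```typescript
type Companion = "github" | "clickup" | "playwright";

// Install commands as given in the Companion Servers section.
const INSTALL_COMMANDS: Record<Companion, string> = {
  github: 'claude mcp add --transport http github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer YOUR_GITHUB_PAT"',
  clickup: "claude mcp add clickup --transport http https://mcp.clickup.com/mcp",
  playwright: "claude mcp add playwright -- npx @playwright/mcp@latest",
};

// Required companions for a few workflows, taken from the catalog table.
const REQUIRED_COMPANIONS: Record<string, Companion[]> = {
  daily_report: ["github", "clickup"],
  audit_pr: ["github"],
  implement_task: ["github", "clickup"],
  qa_test: ["playwright"],
  research_plugin: [],
};

// Hypothetical preflight: returns the commands to print, or an empty
// list when every required companion is already installed.
function missingCompanionCommands(workflow: string, installed: Companion[]): string[] {
  const required = REQUIRED_COMPANIONS[workflow] || [];
  return required
    .filter((companion) => installed.indexOf(companion) === -1)
    .map((companion) => INSTALL_COMMANDS[companion]);
}
```

An empty result means the workflow may proceed; otherwise the agent prints the returned commands and stops.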

Workflow details

test_connection

Checks the Devxira server plus GitHub / ClickUp / Playwright reachability.

Args: (none)
Example: "Run test_connection."

daily_report

Collects PRs and commits from GitHub activity, gathers your input, formats a standup report, and posts it to a ClickUp channel.

Args: **org** · channel_id
Companions: GitHub, ClickUp
Example: "Generate my daily standup report for the dorik org."

audit_pr

Single-pass deep review of a PR → severity-rated bug report (CRITICAL / HIGH / MEDIUM / Minor) with file:line evidence and suggested fixes. For a closed review→fix loop, use pr_feedback_loop.

Args: **pr**
Companions: GitHub
Example: "Audit PR #42 and give me a severity-rated bug report."

pr_feedback_loop

Codex and Claude review the diff in parallel, a critic subagent verifies each finding and pushes back on false positives, and a fixer applies the verified fixes. The loop then re-runs until the review comes back clean or max_iterations is reached. Best for small/medium PRs and fast feedback. Works on a GitHub PR or local changes.

Args: pr (omit or pass local to review local changes) · max_iterations (default 5) · mode (fix | dry_run)
Companions: GitHub (PR mode)
Example: "Run pr_feedback_loop on PR #42."
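The control flow of this loop can be sketched as follows. In the real workflow each stage is an agent or subagent step rather than a function; the `LoopStages` interface is hypothetical, and treating dry_run as a single report-only pass is an assumption.

```typescript
type LoopFinding = { id: string; verified: boolean };

// Hypothetical stage signatures standing in for the agent steps.
interface LoopStages {
  review: () => LoopFinding[];                          // reviewers propose findings
  verify: (findings: LoopFinding[]) => LoopFinding[];   // critic keeps confirmed ones
  fix: (findings: LoopFinding[]) => void;               // fixer applies verified fixes
}

function feedbackLoopSketch(stages: LoopStages, maxIterations = 5, mode: "fix" | "dry_run" = "fix") {
  let iterations = 0;
  let outstanding: LoopFinding[] = [];
  while (iterations < maxIterations) {
    iterations++;
    outstanding = stages.verify(stages.review());
    // A review with no verified findings ends the loop as clean.
    if (outstanding.length === 0) return { clean: true, iterations, outstanding };
    // Assumed dry_run behavior: report the findings without touching files.
    if (mode === "dry_run") break;
    stages.fix(outstanding);
  }
  // Cap reached (or dry run): the last fixes were not re-reviewed.
  return { clean: false, iterations, outstanding };
}
```

The cap matters because a fix can introduce a new finding; without it the loop could oscillate indefinitely.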

pr_bulk_review_loop

Parallel specialized discovery passes (no fixing) → consolidate + dedupe + cluster by root cause → evidence-grounded specialist critics on the full list → architect a coordinated fix plan → apply all fixes in dependency order → final verify. Best for medium/large PRs and root-cause / cross-cutting bugs. Honors CLAUDE.md and .claude/skills/*, persists false-positive learnings across runs. See Deep dives for how to choose between the two PR loops and the research behind this one.

Args: pr (omit or local for local changes) · discovery_passes (default 4) · outer_iterations (default 1) · mode (fix | dry_run | plan_only) · verify_with_browser (auto | always | never) · precision (high | balanced | recall)
Companions: GitHub (PR mode); Playwright auto for UI changes
Example: "Bulk-review PR #42 with 3 discovery passes."
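The consolidate, dedupe, and cluster step might look roughly like this. It is a sketch under assumptions: the `RawFinding` fields are hypothetical, and the root-cause label is assumed to be supplied by the discovery passes rather than computed here.

```typescript
type RawFinding = { file: string; line: number; message: string; rootCause: string; pass: number };

// Hypothetical consolidation: drop duplicates reported by several passes,
// then group what remains by root cause.
function clusterFindings(findings: RawFinding[]): Record<string, RawFinding[]> {
  const seen: Record<string, boolean> = Object.create(null);
  const clusters: Record<string, RawFinding[]> = Object.create(null);
  for (const finding of findings) {
    // Same location and message from two passes counts as one finding.
    const key = finding.file + ":" + finding.line + ":" + finding.message;
    if (seen[key]) continue;
    seen[key] = true;
    const group = clusters[finding.rootCause] || [];
    group.push(finding);
    clusters[finding.rootCause] = group;
  }
  return clusters;
}
```

Each cluster can then be handed to the fix planner as a unit, which is how one fix can close several findings at once.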

implement_task

Fetches a ClickUp task and implements it end-to-end: explore the codebase, plan, code, open a PR, update the task.

Args: **task_id**
Companions: GitHub, ClickUp
Example: "Implement ClickUp task abc123."

research_plugin

Market analysis, competitor comparison, gap analysis, MVP features, and a development timeline for a WordPress plugin or plugin idea.

Args: **plugin** (slug, name, wordpress.org URL, or topic)
Example: "Research the WordPress plugin wpforms and plan an MVP."

build_plugin

Turns a research_plugin report into a step-by-step AI build plan.

Args: **plugin**
Example: "Run build_plugin for wpforms."

qa_test

Drives Playwright to verify specific bugs / edge cases with screenshot evidence and a short report. For a full acceptance or regression pass, use qa_pilot.

Args: **url** · pr (extract bug list from the review) · bugs (comma-separated descriptions)
Companions: Playwright (+ GitHub if pr given)
Example: "QA test http://localhost:3000 and verify these bugs: …"

qa_pilot

Automated senior-QA pass that drives a real browser end-to-end, in one of two modes:

FEATURE mode (pre-merge) proves acceptance criteria + edge/negative/boundary cases → PASS / PASS-WITH-ISSUES / FAIL.
DEPLOY mode (post-deploy) runs smoke + regression on the live site → GO / GO-WITH-WATCH / NO-GO.

The run proceeds in three phases: MAP the app into a coverage ledger → COVER each item (happy-path + edge cases, UI/UX, a11y, and, where they apply, security/perf/i18n/data-integrity/API-contract) → DONE when every ledger item has a verdict.

Environment-aware (production hardened; credentials runtime-only, never persisted). Regression is stateless by default, with opt-in golden-master visual baselines via baseline: true (stored under .devxira/baselines/). The report + screenshots land in a timestamped run folder under .devxira/evidence/qa-pilot/.

Args: **base_url** · env (local | staging | production) · mode (feature | deploy) · pr · task_id · focus · depth (smoke | standard | deep | exhaustive) · baseline (true | false)
Companions: Playwright (+ GitHub for pr, ClickUp for task_id)
Example: "Run qa_pilot against https://staging.acme.com in feature mode, depth deep, focus checkout."
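The coverage ledger and its DONE condition can be sketched as below. The README names the FEATURE-mode outcomes but not how they are derived, so the rollup rule, the per-item verdict values, and the extra INCOMPLETE state are all assumptions of this sketch.

```typescript
type ItemVerdict = "pass" | "issue" | "fail";
type LedgerItem = { id: string; verdict?: ItemVerdict };

// DONE as described above: every ledger item has a verdict.
function ledgerDone(ledger: LedgerItem[]): boolean {
  return ledger.every((item) => item.verdict !== undefined);
}

// Assumed rollup: any failure fails the run, any issue downgrades it.
function featureVerdict(ledger: LedgerItem[]): "PASS" | "PASS-WITH-ISSUES" | "FAIL" | "INCOMPLETE" {
  if (!ledgerDone(ledger)) return "INCOMPLETE"; // hypothetical state, not in the README
  if (ledger.some((item) => item.verdict === "fail")) return "FAIL";
  if (ledger.some((item) => item.verdict === "issue")) return "PASS-WITH-ISSUES";
  return "PASS";
}
```

Tracking coverage as a ledger keeps the stopping condition explicit: the run ends when nothing is left unjudged, not when the agent runs out of ideas.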

product_intelligence

Researches real competitors across primary sources (docs, changelogs, GitHub, G2/Capterra, Reddit, HN, Product Hunt, app stores), extracts evidence-backed + confidence-scored features, classifies + scores them, mines user feedback, maps gaps and AI opportunities, then synthesizes a category-leading design. Every claim traces to a fetched source. 10-stage pipeline → 18-section report.

Args: **target** · domain · competitors (comma-separated seeds) · depth (scan | standard | deep) · focus
Companions: built-in web search + fetch; GitHub & Playwright optional
Example: "Run product_intelligence on 'AI note-taking apps'."

competitor_exploration

Drives a real browser to use a competitor's product: visits every reachable page, tests each feature (subject to safety tiers), reverse-engineers major workflows step-by-step, simulates human error to watch recovery, and rotates through 7 personas. Every finding carries screenshot / URL / interaction-log evidence with confidence + verification tags. Acts only on accounts you own/authorize; hard-to-reverse actions require confirmation. 15-section report.

Args: **platform** (name or app URL) · url · goals (comma-separated) · depth (scan | standard | deep | exhaustive) · focus
Companions: Playwright
Example: "Explore competitor Notion at https://notion.so, deep depth."

Tools reference

These are live tools — they execute and return real results (unlike workflow prompts).

Git (6) — stdio mode only

| Tool | Description | Read-only |
|------|-------------|:---------:|
| git_status | Show working tree status | Yes |
| git_diff | Show staged/unstaged/ref changes | Yes |
| git_create_branch | Create and checkout a new branch | No |
| git_commit | Stage files and commit | No |
| git_push | Push to remote | No |
| git_log | Show recent commit history | Yes |

Git tools shell out locally, so they're registered only in stdio mode (not over HTTP).
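One way to picture the transport-dependent registration is a helper that returns the tool names offered for each mode. This is a hypothetical sketch; Devxira's actual startup code may be organized differently. The tool names are the ones listed in this README.

```typescript
type ServerTransport = "stdio" | "http";

const GIT_TOOLS = ["git_status", "git_diff", "git_create_branch", "git_commit", "git_push", "git_log"];
const WORDPRESS_TOOLS = ["wordpress_get_plugin", "wordpress_search_plugins"];

// Hypothetical helper: which live tools a client sees for a given transport.
function toolsForTransport(transport: ServerTransport): string[] {
  // Git tools run shell commands on the local machine, so they are
  // offered only when the server runs locally over stdio.
  return transport === "stdio" ? GIT_TOOLS.concat(WORDPRESS_TOOLS) : WORDPRESS_TOOLS.slice();
}
```

Under this sketch a remote HTTP deployment exposes only the two WordPress tools.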

WordPress (2)

| Tool | Description | Read-only |
|------|-------------|:---------:|
| wordpress_get_plugin | Get plugin details from wordpress.org | Yes |
| wordpress_search_plugins | Search plugins by keyword | Yes |

Installation

Claude Code

claude mcp add devxira -- npx devxira

Claude Desktop

Add to your config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "devxira": {
      "command": "npx",
      "args": ["devxira"]
    }
  }
}

VS Code / Cursor

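For VS Code, add this to .vscode/mcp.json in your workspace. For Cursor, put the same entry in .cursor/mcp.json under a top-level mcpServers key instead of servers.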

{
  "servers": {
    "devxira": {
      "command": "npx",
      "args": ["devxira"]
    }
  }
}

Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "devxira": {
      "command": "npx",
      "args": ["devxira"]
    }
  }
}

OpenAI Codex

codex --mcp-server "npx devxira"

Companion Servers (optional)

Install these alongside devxira for full workflow support:

# GitHub — for PR review, code search, and GitHub operations
claude mcp add --transport http github https://api.githubcopilot.com/mcp/ -H "Authorization: Bearer YOUR_GITHUB_PAT"

# ClickUp — for daily reports and task management
claude mcp add clickup --transport http https://mcp.clickup.com/mcp

# Playwright — for browser-based QA testing
claude mcp add playwright -- npx @playwright/mcp@latest

Authentication

All external authentication is handled by companion MCP servers:

GitHub — Requires a GitHub Personal Access Token (PAT). Create one at github.com/settings/tokens with repo and read:user scopes.
ClickUp — Authenticates via browser on first use through ClickUp's official MCP server.
Playwright — No authentication needed (runs locally).

Deep dives

| Aspect | pr_feedback_loop (incremental) | pr_bulk_review_loop (bulk) |
|---|---|---|
| Loop shape | small bite (find few → verify → fix → re-find) | one big sweep (find all → verify all → fix all → final verify) |
| Best for | small/medium PRs, fast feedback, well-known codebase | medium/large PRs, security audits, root-cause / pattern bugs, unfamiliar codebases |
| Catches | individual bugs as they surface | cross-cutting patterns (one fix closes 5 findings) |
| Wall-clock | longer (sequential rounds) | shorter (parallel discovery + fix in batches) |
| Risk | misses cross-file patterns, can oscillate | longer time-to-first-fix; bugs may interact under batch fix |
| Args | pr, max_iterations, mode | pr, discovery_passes, outer_iterations, mode, verify_with_browser, precision |

Naming history: review_pr was renamed to audit_pr in v1.2.0 to free the name for the loop tools and avoid colliding with project-level /review-pr slash commands. pr_bulk_review_loop was added in v1.3.0.

The bulk loop's design is grounded in published code-review-tool research (CodeRabbit, Cursor Bugbot V1→V2 retrospective, CriticGPT, AutoReview FSE 2025, the Multi-Agent Code Verification paper, and bug-bash methodology):

Specialist critics over generic — domain-specific critics (Correctness / Security / Performance / UI / Backend-Tenancy / Type-safety) detect more bugs with far fewer false positives, because their detections are statistically independent.
"Evidence not vibes" — every critic verdict requires executable proof (grep / ast-grep / quoted code / browser screenshots). Pure-opinion verdicts are downgraded to NEEDS_INFO.
Asymmetric thresholds, not voting — Security alone marking VERIFIED blocks; Correctness needs two specialists or one + reproduction; UI requires a screenshot or DOM excerpt. Prevents the "many agents unanimously endorse a non-existent bug" failure mode.
Dynamic per-finding context lookup (Cursor Bugbot V2 lesson) — critics grep/Read the actual code per finding rather than trusting an inlined excerpt.
Persistent learnings — refuted findings are appended to .claude/review-learnings.md; future runs auto-skip matching patterns (overridable with new evidence).
Architect / Editor split (Aider) — one subagent plans the fix batch with dependency ordering; separate fixers apply patches per file-group.
Browser verification uses Playwright MCP via the bundled qa_test prompt (or codex exec with Playwright MCP wired).
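The asymmetric thresholds above can be read as a per-domain rule rather than a vote count. The sketch below encodes one such reading; the `ReviewFinding` fields and the function name are hypothetical, and interpreting "two specialists" as any two critics marking the finding VERIFIED is an assumption.

```typescript
type ReviewFinding = {
  domain: "security" | "correctness" | "ui";
  verifiedBy: string[];        // specialist critics that marked the finding VERIFIED
  hasReproduction: boolean;    // an executable reproduction exists
  hasScreenshotOrDom: boolean; // a screenshot or DOM excerpt is attached
};

// Assumed encoding of the thresholds: each domain has its own bar.
function meetsVerificationThreshold(finding: ReviewFinding): boolean {
  if (finding.verifiedBy.length === 0) return false;
  // Security: the Security critic alone is enough.
  if (finding.domain === "security") return finding.verifiedBy.indexOf("security") !== -1;
  // Correctness: two specialists, or one plus a reproduction.
  if (finding.domain === "correctness") return finding.verifiedBy.length >= 2 || finding.hasReproduction;
  // UI: only with a screenshot or DOM excerpt.
  return finding.hasScreenshotOrDom;
}
```

Because the bar depends on the kind of evidence a domain can produce, several critics agreeing without evidence does not by itself verify a UI or correctness finding.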

Development

Prerequisites

Node.js >= 18
npm

Setup

git clone https://github.com/abuldev/devxira.git
cd devxira
npm install

Build

npm run build

Run

# stdio mode (for MCP clients)
npm start

# HTTP mode (for remote deployment)
npm run start:http

Test

npm test

Test with MCP Inspector

npx @modelcontextprotocol/inspector node build/cli.js

License

MIT