devsquad-mcp
v0.1.0
DevSquad MCP
Turn vague coding prompts into safe, role-based execution workflows for AI coding agents.
DevSquad MCP is an MCP server that helps coding agents work in a structured, evidence-first way. It is built for agents such as Claude Code, Codex, Cursor, Antigravity, and other MCP-compatible coding tools.
The main coding agent remains the executor. DevSquad MCP does not run real sub-agents, inspect your project, edit files, execute shell commands, or call external APIs. It returns safe instructions, collaboration gates, diagnosis playbooks, review checks, and workflow state so the main agent avoids guessing.
Product Goal
AI coding agents often fail in predictable ways:
- They start coding from vague prompts.
- They assume frameworks, libraries, routes, schemas, or user intent.
- They apply broad fixes for narrow errors.
- They install packages before checking existing dependencies.
- They touch auth, payments, database, or deployment code without enough confirmation.
- They give junior users confident answers without explaining what is known, unknown, and risky.
DevSquad MCP adds a lightweight senior-developer workflow layer. It forces the agent to classify the prompt, ask for missing decisions, inspect before editing, submit findings, pass review gates, and only then move forward.
What It Does
- Classifies prompt clarity.
- Routes work into implementation, discovery-first, or investigation-only workflows.
- Splits work by role: Architect, Backend, Frontend, Database, Testing, Reviewer, Investigation, and DevOps.
- Adds collaboration questions for missing product decisions.
- Adds junior-friendly task guidance when requested.
- Diagnoses common compile-time and runtime error patterns with deterministic playbooks.
- Returns safe inspection steps instead of running commands itself.
- Tracks workflow state in local JSON storage.
- Reviews submitted task results with rule-based gates.
- Generates final merge strategy, test checklist, and risk summary.
What It Does Not Do
- Does not run multiple LLMs.
- Does not edit project files.
- Does not execute shell commands.
- Does not inspect arbitrary filesystem paths.
- Does not read .env files.
- Does not require API keys.
- Does not call OpenAI, Anthropic, Gemini, GitHub, databases, or external APIs.
- Does not directly verify source code. It reviews submitted workflow artifacts and task results.
Core Safety Rule
DevSquad MCP tells the main coding agent what to do. It does not do the work itself.
Correct flow:
User: "Implement login"
DevSquad MCP: "Architect Agent should inspect package.json, routes, middleware, auth files, and database schema. Do not edit files."
Main coding agent: inspects the project.
Main coding agent: submits findings with submit_task_result.
DevSquad MCP: reviews the findings and allows expansion only when safe.
Incorrect flow:
DevSquad MCP reads package.json directly.
DevSquad MCP edits files.
DevSquad MCP runs npm install.
DevSquad MCP runs migrations.
Those actions are intentionally not implemented.
Prompt Clarity Levels
Level 1: Clear Prompt
Example:
Add Google login using NextAuth and update navbar
Behavior:
- Creates an implementation workflow.
- Starts with architecture confirmation.
- Moves through backend, frontend, tests, and review.
- Still applies confirmation gates in strict mode for risky areas.
Level 2: Partial Prompt
Example:
Implement login
Behavior:
- Creates a discovery-first workflow.
- First task blocks editing.
- Agent must inspect existing setup and return findings.
- Workflow can expand only after approved discovery.
Level 3: Very Vague Prompt
Example:
Make it work
Behavior:
- Creates an investigation-only workflow.
- Agent must collect evidence before fixes.
- Editing is blocked until the problem is understood.
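The three levels above amount to a small lookup from clarity level to workflow kind. The sketch below is only an illustration of that mapping: the type names, the `planForLevel` function, and the `firstStep` field are hypothetical and are not taken from the DevSquad MCP source.

```typescript
// Hypothetical types: the real server may model levels and workflows differently.
type ClarityLevel = 1 | 2 | 3;
type WorkflowKind = "implementation" | "discovery-first" | "investigation-only";

interface WorkflowPlan {
  kind: WorkflowKind;
  // What the main coding agent is asked to do first, paraphrased from the README.
  firstStep: string;
}

// One entry per clarity level described above.
const PLANS: Record<ClarityLevel, WorkflowPlan> = {
  1: { kind: "implementation", firstStep: "confirm the architecture" },
  2: { kind: "discovery-first", firstStep: "inspect the existing setup and return findings" },
  3: { kind: "investigation-only", firstStep: "collect evidence before any fix" },
};

function planForLevel(level: ClarityLevel): WorkflowPlan {
  return PLANS[level];
}
```

Under these assumptions, a partial prompt such as "Implement login" (Level 2) would map to the discovery-first plan.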
Collaboration Modes
DevSquad supports three modes.
guided
Default mode. Adds practical collaboration questions when product decisions or evidence are missing.
Use this for normal development.
{
"prompt": "Implement login",
"collaborationMode": "guided"
}
junior
Adds beginner-friendly explanations to each task:
- plain-English summary
- why the task matters
- what to inspect first
- common mistakes
- example expected output
- when to stop and ask the user
Use this when the user has little experience or wants the agent to explain its reasoning clearly.
{
"prompt": "Fix auth",
"collaborationMode": "junior"
}
strict
Adds stronger confirmation gates, especially for:
- authentication
- payments
- database/schema/migrations
- deployments
- secrets/configuration
- destructive or production-like changes
Use this when hallucinated changes would be expensive or dangerous.
{
"prompt": "Create Stripe checkout with webhook verification and store payment status",
"collaborationMode": "strict"
}
Error Diagnosis
DevSquad includes a diagnose_error tool for compile-time and runtime errors.
It does not promise to solve every possible error. That would be dishonest. Instead, it uses deterministic playbooks for common categories and falls back to investigation when evidence is weak.
Initial diagnosis categories:
- TypeScript module resolution: TS2307, Cannot find module
- TypeScript type mismatch: TS2322, TS2339
- Node module resolution: ERR_MODULE_NOT_FOUND, MODULE_NOT_FOUND
- Node runtime errors: TypeError, ReferenceError
- Next.js hydration errors
- Prisma schema/client mismatch
- OAuth callback/config errors
- Stripe/payment webhook signature errors
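A deterministic playbook of this kind can be pictured as a table of text signals per category, with a fallback when nothing matches. The sketch below is a minimal illustration under that assumption, not the package's actual implementation: the category identifiers, the `PLAYBOOKS` table, and the `diagnose` function are hypothetical, and only the four categories whose signals are listed above are included.

```typescript
// Hypothetical playbook table built from the signals listed in this README.
interface Playbook {
  category: string;
  signals: RegExp[];
}

const PLAYBOOKS: Playbook[] = [
  { category: "typescript-module-resolution", signals: [/TS2307/, /Cannot find module/] },
  { category: "typescript-type-mismatch", signals: [/TS2322/, /TS2339/] },
  { category: "node-module-resolution", signals: [/ERR_MODULE_NOT_FOUND/, /MODULE_NOT_FOUND/] },
  { category: "node-runtime-error", signals: [/TypeError/, /ReferenceError/] },
];

interface Diagnosis {
  category: string;
  matchedSignals: string[];
}

// Picks the playbook with the most matching signals; earlier entries win ties.
// Falls back to "investigation" when no signal matches at all.
function diagnose(errorText: string): Diagnosis {
  let best: Diagnosis = { category: "investigation", matchedSignals: [] };
  for (const playbook of PLAYBOOKS) {
    const matched = playbook.signals
      .filter((signal) => signal.test(errorText))
      .map((signal) => signal.source);
    if (matched.length > best.matchedSignals.length) {
      best = { category: playbook.category, matchedSignals: matched };
    }
  }
  return best;
}
```

Because the matching is plain pattern lookup, the same error text always yields the same category, and unknown text is routed to investigation instead of a guessed fix.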
Example input:
{
"errorText": "TS2307: Cannot find module '@/components/Button' or its corresponding type declarations.",
"whenItHappened": "npm run build",
"projectContext": {
"framework": "Next.js",
"language": "TypeScript"
},
"collaborationMode": "junior"
}
Example output includes:
- category
- confidence
- matched signals
- likely causes
- evidence needed
- safe inspection steps
- user questions
- blocked actions
- suggested Investigation Agent workflow
- junior explanation when collaborationMode is junior
MCP Tools
classify_prompt
Classifies prompt clarity.
Input:
{
"prompt": "Implement login",
"projectContext": {
"framework": "Next.js",
"database": "PostgreSQL",
"authLibrary": "NextAuth",
"testing": "Vitest"
}
}
diagnose_error
Diagnoses submitted error text and returns a safe fix workflow.
Input:
{
"errorText": "Hydration failed because the initial UI does not match what was rendered on the server.",
"whenItHappened": "opening dashboard page",
"projectContext": {
"framework": "Next.js"
},
"collaborationMode": "guided"
}
start_workflow
Creates and saves a workflow.
Input:
{
"prompt": "Implement login",
"collaborationMode": "junior",
"projectContext": {
"framework": "Next.js"
}
}
get_next_task
Returns the next safe task and marks it in_progress.
Input:
{
"workflowId": "wf_example"
}
submit_task_result
Stores a result from the main coding agent and marks the task completed.
Input:
{
"workflowId": "wf_example",
"taskId": "T1",
"result": {
"projectType": "Next.js",
"authLibraryDetected": "none",
"databaseDetected": "PostgreSQL",
"ormDetected": "Prisma",
"missingDecisions": ["provider"],
"recommendedImplementationOptions": [
{
"name": "Google OAuth with NextAuth",
"pros": ["standard Next.js flow"],
"cons": ["requires OAuth env vars"],
"recommended": true
}
],
"recommendedDefault": "Google OAuth with NextAuth",
"requiresUserConfirmation": false
}
}
review_task_result
Reviews a completed task result using rule-based checks.
Input:
{
"workflowId": "wf_example",
"taskId": "T1"
}
get_workflow_status
Returns grouped workflow state.
Input:
{
"workflowId": "wf_example"
}
expand_workflow_after_discovery
Appends implementation tasks after approved discovery or investigation.
Input:
{
"workflowId": "wf_example",
"selectedApproach": "Google OAuth with NextAuth"
}
final_review
Checks whether all required tasks are approved and returns merge/test/risk strategy.
Input:
{
"workflowId": "wf_example"
}
Example Agent Flow
For a junior user asking:
Fix auth
The intended flow is:
- classify_prompt returns Level 3 / investigation-only.
- start_workflow creates an Investigation Agent task.
- get_next_task tells the main agent to inspect evidence, not edit.
- The main agent asks the user for missing error text, auth provider, and reproduction steps if needed.
- The main agent submits findings with submit_task_result.
- review_task_result approves only if evidence, likely cause, confidence, and recommended next step are present.
- Only after approval can the workflow expand into implementation.
This is the anti-hallucination loop.
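The approval condition in that flow (evidence, likely cause, confidence, and recommended next step must all be present) is the kind of rule that can be checked without reading any source code. The following is a rough sketch of such a gate, assuming hypothetical field names (`evidence`, `likelyCause`, `confidence`, `recommendedNextStep`) and a hypothetical `reviewFindings` function; the actual result schema accepted by review_task_result may differ.

```typescript
// Hypothetical shape of investigation findings submitted by the main agent.
interface InvestigationFindings {
  evidence?: string[];
  likelyCause?: string;
  confidence?: string;
  recommendedNextStep?: string;
}

interface ReviewOutcome {
  approved: boolean;
  // Names of required fields that were absent or empty.
  missing: string[];
}

// Rule-based gate: approve only when every required field carries content.
function reviewFindings(findings: InvestigationFindings): ReviewOutcome {
  const missing: string[] = [];
  if (!findings.evidence || findings.evidence.length === 0) missing.push("evidence");
  if (!findings.likelyCause?.trim()) missing.push("likelyCause");
  if (!findings.confidence?.trim()) missing.push("confidence");
  if (!findings.recommendedNextStep?.trim()) missing.push("recommendedNextStep");
  return { approved: missing.length === 0, missing };
}
```

A gate like this only proves that the fields were filled in, which is why the quality of the review still depends on the quality of what the main agent submits.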
Installation
npm install
Development
npm run dev
Build
npm run build
Test
npm test
Local Examples
npx tsx src/examples/clearPromptExample.ts
npx tsx src/examples/partialPromptExample.ts
npx tsx src/examples/vaguePromptExample.ts
npx tsx src/examples/diagnoseErrorExample.ts
Example MCP Client Config
After publishing, MCP clients can use the package through npx:
{
"mcpServers": {
"devsquad": {
"command": "npx",
"args": ["-y", "devsquad-mcp"]
}
}
}
For local development from this repository:
{
"mcpServers": {
"devsquad": {
"command": "node",
"args": ["/absolute/path/to/devsquad-mcp/dist/index.js"]
}
}
}
On Windows, forward slashes are safest:
{
"mcpServers": {
"devsquad": {
"command": "node",
"args": [
"C:/Users/annub/OneDrive/Desktop/PersonalProjects/Devsquad/devsquad-mcp/dist/index.js"
]
}
}
}
Runtime Storage
Workflow state is stored at:
.devsquad-mcp/workflows.json
The server creates the folder and file if they are missing. If the file is empty or contains invalid JSON, it recovers with {} and logs a warning to stderr. Writes go through:
workflows.json.tmp -> workflows.json
Runtime state is never stored inside src/.
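One common way to get the behaviour described here is to write the new state to a temporary file and then rename it over the real one, so a crash mid-write cannot leave a half-written workflows.json. The sketch below assumes that this is what the `workflows.json.tmp -> workflows.json` arrow means; `saveStore`, `loadStore`, and the demo path are hypothetical names for illustration, built only on Node's standard `fs`, `path`, and `os` modules.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type WorkflowStore = Record<string, unknown>;

// Write to <file>.tmp first, then rename it over the target (assumed write path).
function saveStore(file: string, store: WorkflowStore): void {
  fs.mkdirSync(path.dirname(file), { recursive: true });
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(store, null, 2), "utf8");
  fs.renameSync(tmp, file);
}

// Create the file if missing; recover with {} when it is empty or invalid JSON.
function loadStore(file: string): WorkflowStore {
  if (!fs.existsSync(file)) {
    saveStore(file, {});
    return {};
  }
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(file, "utf8"));
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as WorkflowStore;
    }
    throw new Error("top-level value is not an object");
  } catch (err) {
    // Warnings go to stderr so stdout stays reserved for protocol messages.
    console.error(`[devsquad-mcp] WARN could not read ${file}, starting from {}: ${String(err)}`);
    return {};
  }
}

// Demo in a throwaway temp directory, not in a real project.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "devsquad-demo-"));
const demoFile = path.join(demoDir, ".devsquad-mcp", "workflows.json");
saveStore(demoFile, { wf_example: { status: "in_progress" } });
```

Calling `loadStore(demoFile)` afterwards returns the saved object, and overwriting the file with invalid JSON makes it return an empty object with a warning on stderr.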
Security Model
DevSquad MCP is intentionally conservative:
- no shell execution
- no file edits
- no package installs
- no migrations
- no Git operations
- no arbitrary filesystem reads
- no .env reads
- no secret collection
- no external API calls
- no hidden network dependency
For stdio MCP, stdout is reserved for protocol messages. DevSquad writes logs to stderr only.
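In a Node.js stdio server, this rule usually comes down to never calling `console.log` for diagnostics. The helper below is a small sketch of that convention; the `formatLog` and `log` names and the `[devsquad-mcp]` prefix are hypothetical, not the package's real logger.

```typescript
type LogLevel = "info" | "warn" | "error";

// Builds the log line separately so the format can be checked without I/O.
function formatLog(level: LogLevel, message: string): string {
  return `[devsquad-mcp] ${level.toUpperCase()} ${message}`;
}

// console.error writes to stderr in Node.js, so stdout carries protocol messages only.
function log(level: LogLevel, message: string): void {
  console.error(formatLog(level, message));
}

log("warn", "workflows.json was empty, starting from {}");
```

Keeping every diagnostic on stderr means an MCP client reading stdout never sees a stray log line mixed into the protocol stream.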
Design Philosophy
DevSquad is not trying to be a smarter chatbot. It is trying to make the coding agent behave more like a careful senior developer:
- understand the request
- identify missing decisions
- gather evidence
- ask the user when needed
- make a scoped plan
- implement in role-based stages
- review submitted work
- verify before final handoff
The product is most useful for junior developers and non-expert users because it makes the agent expose its reasoning and uncertainty instead of confidently guessing.
Limitations
- Diagnosis is pattern-based, not omniscient.
- Final review validates submitted task results, not source code directly.
- The main coding agent still needs to inspect files, run tests, and make changes.
- The quality of review depends on the quality of submitted task results.
- New error categories should be added as small playbooks with tests, not as one giant catch-all prompt.
