aidevo v1.0.0-beta.3
AI Development Observability Platform - Track what AI actually does in your codebase
AIDevOS
AI Development Observability Platform
Vibe coding with receipts.
AI writes more and more of your code — but what did it actually do? How many tasks did it complete? How many bugs did it introduce? How often did it deviate from your architecture?
AIDevOS gives you the answers.
Quick Start · MCP Server · How It Works · Dashboard · Skills · CLI Reference
The Problem
You're using Claude Code or Cursor to build features. The AI generates hundreds of lines of code. But at the end of the day:
- You can't tell how many of those lines were right the first time
- You can't track which architectural rules the AI keeps violating
- You can't measure whether the AI is actually saving you time
- You can't prove your AI-assisted productivity in reviews or reports
- You can't compare how different models perform on your codebase
All that development process data? Gone. Every single session.
The Solution
AIDevOS works in two modes — pick the one that fits your workflow:
| Mode | What you get | Setup effort |
|------|-------------|--------------|
| Data Collection Only (via MCP) | Full observability with zero workflow changes | Add one JSON config block |
| Full Workflow (MCP + Skills + Commands) | Observability + structured AI SOPs + self-improving rules | npm install + aidevo init |
Most teams start with data collection. Your AI tool calls MCP tools automatically as it works — you change nothing about how you code. When you're ready for structured workflows, upgrade to full mode.
Quick Start
Path A: Data Collection Only (Recommended)
Add the MCP server config to your AI tool and start working. That's it.
Claude Code — add to .mcp.json in your project root:
```json
{
  "mcpServers": {
    "aidevo": {
      "command": "npx",
      "args": ["-y", "aidevo", "mcp"]
    }
  }
}
```

Cursor — add to .cursor/mcp.json:

```json
{
  "mcpServers": {
    "aidevo": {
      "command": "npx",
      "args": ["-y", "aidevo", "mcp"]
    }
  }
}
```

VS Code Copilot — add to .vscode/mcp.json:

```json
{
  "servers": {
    "aidevo": {
      "command": "npx",
      "args": ["-y", "aidevo", "mcp"]
    }
  }
}
```

Windsurf — add to ~/.codeium/windsurf/mcp_config.json:

```json
{
  "mcpServers": {
    "aidevo": {
      "command": "npx",
      "args": ["-y", "aidevo", "mcp"]
    }
  }
}
```

No `aidevo init` or `aidevo start` needed. The MCP server uses lazy init — it auto-creates `.aidevos/` and `run.json` on the first tool call.
Then view your data:
```sh
npx aidevo dashboard
```

Open http://localhost:2375 — real-time dashboard with live updates.
Path B: Full Workflow
```sh
# Install globally
npm install -g aidevo

# Initialize in your project (interactive setup)
cd your-project
aidevo init
```

`aidevo init` now offers mode selection:
- Data collection only — sets up MCP config for your chosen AI tool(s)
- Full workflow — MCP config + 14 AI Skills + slash commands + project rules
Multi-tool support: select one or more AI tools (Claude Code, Cursor, VS Code Copilot, Windsurf) and AIDevOS writes the correct MCP config for each.
Then start building:
```sh
# Create a new development run
aidevo start

# Place your PRD, then let AI take over
/workflow
```

The AI will execute: Requirement Analysis -> User Confirmation -> Task Decomposition -> Code Generation -> Self-Review -> loop until done.
MCP Server
The MCP server is the primary data collection mechanism. Your AI tool calls these tools automatically as part of its normal workflow — no extra prompts or commands required.
9 MCP Tools
| Tool | Description |
|------|-------------|
| aidevos_task_start | Mark a task as in-progress |
| aidevos_task_done | Mark a task as completed |
| aidevos_log_bug | Record a bug found during development |
| aidevos_bug_fix | Record a bug fix |
| aidevos_log_review | Log a self-review result (pass/fail) |
| aidevos_log_deviation | Record when AI output deviates from expectations |
| aidevos_log_files | Track file changes (added, modified, deleted) |
| aidevos_highlight | Capture notable achievements or milestones |
| aidevos_status | Return current run status as structured data |
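On the wire, each of these tools is invoked as a standard MCP `tools/call` JSON-RPC request over stdio. A minimal sketch of what an AI tool might send for `aidevos_task_done` — the `arguments` shape here is an illustrative assumption, not the published schema:

```typescript
// Hypothetical MCP tools/call request for aidevos_task_done.
// MCP messages are JSON-RPC 2.0; the argument names below are
// assumptions for illustration only.
const request = {
  jsonrpc: "2.0" as const,
  id: 1,
  method: "tools/call",
  params: {
    name: "aidevos_task_done",
    arguments: { id: "TASK-01" },
  },
};

// Serialized as a single line on the MCP server's stdin.
const wire = JSON.stringify(request);
```

Your AI tool constructs these calls for you; you never write them by hand.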
MCP Prompts
The server exposes an aidevos-guide prompt that teaches your AI tool when and how to call each tool. AI tools that support MCP prompts will automatically understand the observability protocol.
Lazy Init
No manual setup required for data collection. On the first MCP tool call, the server will:
- Create `.aidevos/` if it doesn't exist
- Create `run.json` for the current branch and developer
- Start recording immediately
Token Auto-Collection
For Claude Code users, AIDevOS automatically reads Claude session files to collect token usage data:
- Token usage per task and per bug fix
- Breakdown: `input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`
- Aggregated totals at the run level
This enables accurate ROI calculation — you can see exactly how many tokens each task or bug fix consumed and correlate cost with output.
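As a sketch of how those four counters roll up from tasks to run-level totals — the four field names match the breakdown above, but the per-task record shape is an assumption:

```typescript
// Sum Claude token counters across tasks into run-level totals.
// TaskTokens is a hypothetical shape; the field names mirror the
// breakdown AIDevOS collects from Claude session files.
interface TaskTokens {
  input_tokens: number;
  output_tokens: number;
  cache_creation_input_tokens: number;
  cache_read_input_tokens: number;
}

function totalTokens(tasks: TaskTokens[]): TaskTokens {
  const zero: TaskTokens = {
    input_tokens: 0,
    output_tokens: 0,
    cache_creation_input_tokens: 0,
    cache_read_input_tokens: 0,
  };
  return tasks.reduce((acc, t) => ({
    input_tokens: acc.input_tokens + t.input_tokens,
    output_tokens: acc.output_tokens + t.output_tokens,
    cache_creation_input_tokens:
      acc.cache_creation_input_tokens + t.cache_creation_input_tokens,
    cache_read_input_tokens:
      acc.cache_read_input_tokens + t.cache_read_input_tokens,
  }), zero);
}

const totals = totalTokens([
  { input_tokens: 1200, output_tokens: 800, cache_creation_input_tokens: 300, cache_read_input_tokens: 5000 },
  { input_tokens: 900, output_tokens: 600, cache_creation_input_tokens: 0, cache_read_input_tokens: 7000 },
]);
```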
How It Works
AIDevOS is not an AI coding agent. It's an observability layer that standardizes how your existing AI tools work.
```
┌──────────────────────────────────────────────────────┐
│ Your IDE (Claude Code / Cursor / VS Code / Windsurf) │
│                                                      │
│  ┌──────────┐  MCP calls  ┌────────────────────┐     │
│  │ AI Agent ├────────────→│ AIDevOS MCP Server │     │
│  │          │             │      (9 tools)     │     │
│  │          │    reads    ┌────────────────────┐     │
│  │          ├────────────→│  .aidevos/skills/  │     │
│  │          │             │      (14 SOPs)     │     │
│  └──────────┘             └─────────┬──────────┘     │
└─────────────────────────────────────┼────────────────┘
                                      │
                                      ▼
                               ┌─────────────┐
                               │  run.json   │  <- Single source of truth
                               └──────┬──────┘
                                      │
                          ┌───────────┼───────────┐
                          ▼           ▼           ▼
                    ┌──────────┐ ┌────────┐ ┌─────────┐
                    │Dashboard │ │Reports │ │Analysis │
                    │ (React)  │ │ (.md)  │ │ (JSON)  │
                    └──────────┘ └────────┘ └─────────┘
```

The MCP path (top arrow) is the primary data collection mechanism. Skills are optional and used only in full workflow mode.
Three-Layer Data Model
| Layer | File | Scope | Purpose |
|-------|------|-------|---------|
| L0 | run.json | Per developer | Every task, bug, deviation, review, rule |
| L1 | requirement.json | Per branch | Aggregated developer stats, module assignments |
| L2 | index.json | Per project | Cross-branch overview for team leads |
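To make the layering concrete, here is an illustrative roll-up from L0 (per-developer `run.json`) to L1 (per-branch `requirement.json`) — the field names are assumptions for illustration, not the actual schema:

```typescript
// Hypothetical aggregation of per-developer run stats (L0) into
// branch-level stats (L1). Field names are illustrative only.
interface RunStats {
  developer: string;
  tasksDone: number;
  bugs: number;
}

function aggregateBranch(runs: RunStats[]) {
  return {
    developers: runs.map((r) => r.developer),
    tasksDone: runs.reduce((n, r) => n + r.tasksDone, 0),
    bugs: runs.reduce((n, r) => n + r.bugs, 0),
  };
}

const branch = aggregateBranch([
  { developer: "alice", tasksDone: 8, bugs: 2 },
  { developer: "bob", tasksDone: 5, bugs: 1 },
]);
```

L2 (`index.json`) repeats the same idea one level up, aggregating across branches.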
The Self-Improving Loop
This is what makes AIDevOS different. It's not just tracking — it's learning.
```
AI generates code
        │
        ▼
Self-review catches issues
        │
        ├── Pass -> next task
        │
        └── Fail -> fix -> record deviation
                     │
                     ▼
            Is it a missing rule?
                     │
              Yes ───┤
                     ▼
           Sediment as project rule
                     │
                     ▼
           AI reads rules next time
                     │
                     ▼
       Same mistake never happens again
```

Over time, your `.aidevos/rules/` grows into a project-specific AI knowledge base that makes every subsequent development run more accurate.
Dashboard
Real-time visualization powered by React + ECharts with dark theme.
Branch Detail View — deep dive into a single development run:
- KPI cards: task completion, deviation rate, bug count, review pass rate, ROI
- Task timeline with stage breakdown
- Node time distribution (where did the AI spend time?)
- Bug severity distribution
- Review issue categories
- File change heatmap
- Token usage breakdown per task
Project Overview — team lead perspective across all branches:
- Requirement status ring chart
- Developer efficiency comparison
- Cross-branch totals and highlights
```sh
aidevo dashboard              # Default port 2375
aidevo dashboard --port 3000  # Custom port
```

The dashboard updates in real-time via SSE — no refresh needed.
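SSE is a plain-text protocol: each live update arrives as a small text frame. A minimal parser sketch for a single frame — the `run-updated` event name and payload are assumptions, not AIDevOS's actual event schema:

```typescript
// Parse one Server-Sent Events frame into { event, data }.
// Per the SSE format, a frame is "event:" / "data:" lines
// terminated by a blank line.
function parseSseFrame(frame: string): { event: string; data: string } {
  let event = "message"; // SSE default event type
  const data: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) data.push(line.slice(5).trim());
  }
  return { event, data: data.join("\n") };
}

// Hypothetical frame the dashboard might receive:
const msg = parseSseFrame('event: run-updated\ndata: {"tasksDone":4}\n\n');
```

In the browser, the React dashboard would consume these frames via the standard `EventSource` API.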
Skills
AIDevOS ships with 14 AI Skills — structured SOPs that tell your AI tool exactly what to do and how to record it. Skills are used in full workflow mode and are optional for data-collection-only setups.
Workflow Skills (auto-orchestrated)
| Skill | Role | What it does |
|-------|------|-------------|
| workflow-orchestrator | Project Manager | Orchestrates the full loop, handles interruption recovery |
| requirement-analyzer | Architect | Analyzes PRD, generates analysis.md, waits for user confirmation |
| task-splitter | Tech Lead | Decomposes analysis into atomic, testable tasks |
| code-generator | Senior Engineer | Writes code strictly following tasks and project rules |
| self-reviewer | QA Lead | Reviews code against all project rules, pass or fail |
| bug-fixer | Debug Expert | Fixes bugs found during self-review |
Manual Skills (user-triggered)
| Skill | Command | When to use |
|-------|---------|-------------|
| audit | /audit | Scan your codebase to auto-generate project rules |
| deviation-recorder | /deviation | Record when AI output doesn't match expectations |
| rules-evolver | /rules-evolver | Evolve and maintain project rules from PR feedback |
Utility Skills
| Skill | Purpose |
|-------|---------|
| dashboard-generator | Generate dashboard configuration |
| commit-code | Git commit assistant |
| docx-to-markdown | Convert DOCX PRD files to Markdown |
| mcp-reviewer | External MCP-based code review |
| dev-flower | Development flow visualization |
Rules System
AIDevOS uses a Registry + Generated Views pattern for project rules:
```
.aidevos/rules.json  <- Source of truth (committed to git)
.aidevos/rules/*.md  <- Auto-generated views (gitignored)
```

- Fingerprint dedup: a SHA-256 hash prevents duplicate rules across parallel branches
- Auto-merge: `aidevo rules merge` resolves git conflicts by taking the union
- Similarity detection: `aidevo rules dedupe` finds near-duplicate rules via Jaccard similarity
- Category system: `component`, `api`, `style`, `i18n`, `architecture`, `state-management`, `routing`, `testing`, `process`, `general`
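The two dedup mechanisms can be sketched as follows — the exact normalization and tokenization AIDevOS uses are assumptions, but the techniques (SHA-256 fingerprints for exact duplicates, Jaccard similarity over word sets for near-duplicates) are the ones named above:

```typescript
import { createHash } from "node:crypto";

// Content fingerprint: identical rules (modulo trivial whitespace
// and case differences, an assumed normalization) hash to the same
// value, so parallel branches can't register the same rule twice.
function fingerprint(rule: string): string {
  return createHash("sha256").update(rule.trim().toLowerCase()).digest("hex");
}

// Jaccard similarity: |A ∩ B| / |A ∪ B| over word sets, used to
// flag near-duplicate rules that a hash would miss.
function jaccard(a: string, b: string): number {
  const A = new Set(a.toLowerCase().split(/\s+/));
  const B = new Set(b.toLowerCase().split(/\s+/));
  const inter = [...A].filter((w) => B.has(w)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 0 : inter / union;
}

const dup = fingerprint("Use Drawer for detail views") ===
            fingerprint("  use drawer for detail views ");
const sim = jaccard(
  "Use Drawer for detail views",
  "Use Drawer component for detail views",
);
```

A high `sim` score (here 5 shared words out of 6 total) is the kind of signal `aidevo rules dedupe` surfaces for review.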
Rules are automatically rebuilt on every aidevo start and every rule sedimentation.
CLI Reference
| Command | Description |
|---------|-------------|
| aidevo init | Interactive project setup: mode selection (data collection / full workflow), multi-tool support |
| aidevo start | Create a new development run for current branch |
| aidevo status | Show current run status in terminal |
| aidevo log <sub> | Write structured data to run.json (12 subcommands) |
| aidevo dashboard | Launch real-time visualization dashboard |
| aidevo mcp | Start the MCP server (used in MCP config, not called directly) |
| aidevo rules <sub> | Manage rules registry (build, dedupe, merge, list) |
| aidevo reindex | Rebuild project-level index from all runs |
| aidevo report | Generate markdown performance report (--scope me/team) |
| aidevo update | Update all skills to latest version |
| aidevo migrate | Migrate old run.json format to current schema |
aidevo log Subcommands
```sh
aidevo log task --title "Create API layer" --stage "Infrastructure" --prd-phase "PRD1"
aidevo log task-start --id TASK-01
aidevo log task-done --id TASK-01
aidevo log bug --title "Type mismatch" --severity high --source self-review
aidevo log bug-fix --id BUG-01 --fix "Fixed response type"
aidevo log deviation --title "Wrong component" --root-cause rule-missing --category component-usage
aidevo log review --task-id TASK-01 --result pass --scope "src/api/"
aidevo log rule --content "Use Drawer for detail views" --category component
aidevo log file --path "src/api/user.ts" --change-type modified --lines-added 50
aidevo log cost --tokens 125000 --stage "requirement-analysis"
aidevo log highlight --content "FCP reduced from 3.2s to 0.8s"
```

All writes are validated against the schema. Invalid enum values or missing required fields are rejected with clear error messages.
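As a sketch of what that validation looks like for `aidevo log bug` — only `high` is confirmed by the examples above; the other severity values and the validator shape are assumptions:

```typescript
// Hypothetical enum/required-field validation for a bug log write.
// Only "high" appears in the documented examples; the rest of the
// severity list is an assumption for illustration.
const SEVERITIES = ["low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITIES)[number];

function validateBug(input: { title?: string; severity?: string }): string[] {
  const errors: string[] = [];
  if (!input.title) {
    errors.push("missing required field: title");
  }
  if (!input.severity || !SEVERITIES.includes(input.severity as Severity)) {
    errors.push(`invalid severity: expected one of ${SEVERITIES.join(", ")}`);
  }
  return errors; // empty array means the write is accepted
}

const ok = validateBug({ title: "Type mismatch", severity: "high" });
const bad = validateBug({ severity: "urgent" });
```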
Project Structure
```
your-project/
├── .mcp.json                 # MCP server config (Claude Code)
├── .cursor/mcp.json          # MCP server config (Cursor)
├── .vscode/mcp.json          # MCP server config (VS Code Copilot)
├── .aidevos/
│   ├── config.json           # Project configuration
│   ├── rules.json            # Rules registry (source of truth)
│   ├── rules/                # Auto-generated rule views (gitignored)
│   ├── index.json            # Project-level aggregation
│   ├── skills/               # 14 AI Skill SOPs
│   └── runs/
│       └── [branch]/
│           ├── prd.md              # PRD document (shared)
│           ├── analysis.md         # Requirement analysis (shared)
│           ├── requirement.json    # Branch-level aggregation (shared)
│           └── [developer]/
│               └── run.json        # All structured dev data (personal)
├── .claude/commands/         # Slash commands (Claude Code)
└── CLAUDE.md                 # Project rules + iron laws
```

Testing

```sh
npm test
```

82 tests across 5 test suites:

- `mcp-server.test` — MCP protocol, all 9 tools, prompts, lazy init, end-to-end data verification
- `rules.test` — Fingerprint dedup, registry CRUD, view generation, merge, similarity detection
- `cli-log.test` — All 12 log subcommands, enum validation, metrics calculation, requirement.json sync
- `cli-start.test` — run.json structure integrity, schema alignment, gitignore management
- `reindex.test` — Index aggregation, status inference, edge cases
Tech Stack
- CLI: Node.js + TypeScript (zero runtime dependencies)
- MCP Server: Model Context Protocol over stdio
- Dashboard: React 19 + ECharts + Tailwind CSS 4
- Data: JSON files (no database required)
- Real-time: Server-Sent Events (SSE)
- Tests: Node.js built-in test runner
Philosophy
The performance of a system is not determined by its strongest component, but by the synergy between all parts.
AIDevOS is built on three iron laws:
- No hallucination — When uncertain, ask. Never guess.
- No unauthorized docs — Don't generate documents without explicit permission.
- Clean workspace — Test scripts must be deleted after passing. Keep the project lean.
These laws are enforced in every AI Skill and injected into your project's global rules.
Contributing
Issues and PRs welcome.
