ci-triage

v0.3.3

Published

4 days ago

Open-source CI failure triage for humans and agents — smart log parsing, flake detection, structured JSON, MCP server.


clanka

ci triage github-actions flaky-tests mcp coding-agent devtools ci-cd

ci-triage

Open-source CI failure triage for humans and agents. Parses CI logs, classifies failures, detects flaky tests, and outputs structured JSON — with an MCP server for coding agents.

v0.2 adds LLM root-cause analysis, a persistent SQLite flake database, and multi-CI support (GitHub, GitLab, CircleCI).

Why

When CI fails, everyone does the same dance: open the run, scroll logs, squint at errors. Coding agents have it worse — they can't scroll. ci-triage gives both humans and agents structured, queryable failure data.

Features

🔍 Smart log parsing — extracts errors, file:line references, stack traces
🏷️ Failure classification — 16 categories with severity levels and confidence scores
🧪 Flake detection — in-memory (single run) + SQLite persistent history across runs
🤖 LLM root-cause analysis — OpenAI Responses API, gated on OPENAI_API_KEY, graceful fallback
🌐 Multi-CI — GitHub Actions, GitLab CI, CircleCI, with auto-detection
📋 JUnit XML support — parses standard test result files
📊 Structured JSON output — agent-consumable schema with full failure context
🔧 MCP server — coding agents (Codex, Claude Code) can query failures programmatically
⚙️ GitHub Action — drop into any workflow with PR comments and artifacts

Quick Start

CLI

# Install globally
npm install -g ci-triage

# Or use without installing
npx ci-triage owner/repo

Basic usage

# Triage the most recent failed run
ci-triage owner/repo

# JSON output (for agents)
ci-triage owner/repo --json

# Triage a specific run
ci-triage owner/repo --run 12345

# Save markdown report
ci-triage owner/repo --md triage.md

v0.2 flags

# LLM root-cause analysis (requires OPENAI_API_KEY)
ci-triage owner/repo --llm

# Override LLM model (default: gpt-4.1-mini)
ci-triage owner/repo --llm --llm-model gpt-4o-mini

# Force a specific CI provider
ci-triage owner/repo --provider gitlab
ci-triage owner/repo --provider circleci

# Show flaky tests from persistent history
ci-triage flakes owner/repo

# Print version
ci-triage --version

Multi-CI provider setup

| Provider | Authentication | Auto-detected by |
|----------|----------------|------------------|
| GitHub Actions | gh auth login (uses gh CLI) | default |
| GitLab CI | GITLAB_TOKEN | .gitlab-ci.yml in cwd |
| CircleCI | CIRCLE_TOKEN | .circleci/ in cwd |

Auto-detection runs at startup — pass --provider to override.
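
The detection rules in the table can be sketched roughly as follows. This is a hypothetical illustration, not ci-triage's actual implementation: the function name `detectProvider`, the `"github"` identifier, and the order in which GitLab and CircleCI are checked are all assumptions, since the README only documents the marker files and the `--provider` override.

```typescript
// Hypothetical sketch of provider auto-detection; not ci-triage's real code.
type Provider = "github" | "gitlab" | "circleci";

// `exists` reports whether a path relative to the working directory exists.
// Injecting it keeps the sketch free of filesystem dependencies.
function detectProvider(
  exists: (relativePath: string) => boolean,
  override?: Provider,
): Provider {
  // An explicit --provider value wins over detection.
  if (override) return override;
  // Assumption: GitLab is checked before CircleCI; the order is not documented.
  if (exists(".gitlab-ci.yml")) return "gitlab";
  if (exists(".circleci")) return "circleci";
  // GitHub Actions is the documented default.
  return "github";
}
```

In a Node script, `exists` could be backed by `fs.existsSync` resolved against `process.cwd()`.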

GitHub Action

- uses: clankamode/ci-triage@v1
  with:
    token: ${{ secrets.GITHUB_TOKEN }}
    flake-detect: true
    comment: true
    json-artifact: true

MCP Server (for coding agents)

Codex CLI (~/.codex/config.toml):

[mcp_servers.ci-triage]
command = "npx"
args = ["-y", "ci-triage", "--mcp"]

Claude Code:

claude mcp add ci-triage -- npx -y ci-triage --mcp

Tools exposed:

| Tool | Description |
|------|-------------|
| triage_run | Full triage report for a CI run |
| list_failures | Recent failed runs summary |
| is_flaky | Flake history for a specific test |
| suggest_fix | Fix suggestions for failures |
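
For readers wiring up their own client, an MCP tool invocation is a JSON-RPC `tools/call` request carrying the tool name and an arguments object. The sketch below builds such a request for the tools listed above. The helper `buildToolCall` and the argument names (`repo`, `test`) are assumptions for illustration only; the README does not document each tool's input schema, so check the server's `tools/list` response for the real one.

```typescript
// Tool names come from the table above; everything else here is illustrative.
type ToolName = "triage_run" | "list_failures" | "is_flaky" | "suggest_fix";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Builds the JSON-RPC envelope an MCP client sends to invoke a tool.
function buildToolCall(
  id: number,
  name: ToolName,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical argument names: `repo` and `test` are not confirmed by the README.
const request = buildToolCall(1, "is_flaky", {
  repo: "owner/repo",
  test: "integration::auth::test_token_refresh",
});
```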

SQLite Flake Database

ci-triage persists run outcomes to ~/.ci-triage/flake.db. Over time, this builds a cross-session history of test pass/fail patterns, so flake detection can compare a test's results across many runs instead of relying on single-run heuristics.

# After a few triage runs, list known flaky tests
ci-triage flakes owner/repo

# Output:
# Test Name                                               Fails  Passes   Ratio Last Seen
# ─────────────────────────────────────────────────────────────────────────────────────
# integration::auth::test_token_refresh                       3       7   30.0% 2026-02-25
# e2e::dashboard::renders_on_slow_network                     2       5   28.6% 2026-02-24
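
The Ratio column in the sample output is consistent with fails divided by total recorded runs (3 of 10 and 2 of 7). A minimal sketch of that arithmetic, assuming this is how the column is derived; `FlakeRecord`, `failRatio`, and `formatRatio` are hypothetical names, not part of ci-triage:

```typescript
// Hypothetical record shape mirroring the columns of `ci-triage flakes`.
interface FlakeRecord {
  test: string;
  fails: number;
  passes: number;
}

// Fraction of recorded runs in which the test failed (0 when there is no history).
function failRatio(r: FlakeRecord): number {
  const total = r.fails + r.passes;
  return total === 0 ? 0 : r.fails / total;
}

// Formats the ratio as a percentage with one decimal place.
function formatRatio(r: FlakeRecord): string {
  return `${(failRatio(r) * 100).toFixed(1)}%`;
}

const tokenRefresh: FlakeRecord = {
  test: "integration::auth::test_token_refresh",
  fails: 3,
  passes: 7,
};
const slowNetwork: FlakeRecord = {
  test: "e2e::dashboard::renders_on_slow_network",
  fails: 2,
  passes: 5,
};
```

With these inputs, `formatRatio` yields `30.0%` and `28.6%`, matching the sample rows.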

LLM Root-Cause Analysis

When OPENAI_API_KEY is set (or --llm is passed), ci-triage calls the OpenAI Responses API to produce a human-readable root cause and fix suggestions:

OPENAI_API_KEY=sk-... ci-triage owner/repo --llm

# ── LLM Root-Cause Analysis ─────────────────────────────
# Model: gpt-4.1-mini (openai)
# Root Cause: The auth token refresh test fails intermittently due to a race
#             condition in the mock clock setup.
# Fix Suggestions:
#   • Use vi.useFakeTimers() with explicit tick advancement
#   • Add a 50ms buffer after token expiry before asserting refresh
#   • Consider extracting token timing into a testable utility
# Tokens: 1240 in / 187 out — est. $0.0008
# ─────────────────────────────────────────────────────────

If the LLM is unavailable, ci-triage silently falls back to heuristic suggestions. The JSON output includes analysis.mode so consumers can tell which path produced the analysis.

Output Schema

{
  "version": "1.0",
  "repo": "owner/repo",
  "run_id": 12345,
  "run_url": "https://github.com/...",
  "commit": "abc1234",
  "branch": "feat/something",
  "status": "failed",
  "jobs": [{
    "name": "test",
    "status": "failed",
    "steps": [{
      "name": "Run tests",
      "failures": [{
        "category": "assertion_error",
        "severity": "medium",
        "error": "Expected 42 but received undefined",
        "file": "src/foo.test.ts",
        "line": 42,
        "flaky": { "is_flaky": true, "confidence": 0.92 },
        "suggested_fix": "Check null handling in Foo.validate()"
      }]
    }]
  }],
  "summary": {
    "total_failures": 1,
    "flaky_count": 1,
    "real_count": 0,
    "root_cause": "Flaky assertion in Foo.validate"
  },
  "analysis": {
    "mode": "llm",
    "provider": "openai",
    "model": "gpt-4.1-mini",
    "root_cause": "Race condition in mock clock setup",
    "fix_suggestions": ["Use vi.useFakeTimers() with explicit tick advancement"],
    "llm": {
      "usage": {
        "input_tokens": 1240,
        "output_tokens": 187,
        "estimated_cost_usd": 0.0008
      }
    }
  }
}
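
A consumer of the `--json` output might model the report and separate real failures from flaky ones along these lines. This is a sketch under assumptions: the interfaces cover only fields shown in the sample above, which fields are optional is a guess, and `realFailures` and `usedLlm` are hypothetical helpers rather than anything ci-triage ships.

```typescript
// Partial types inferred from the sample report; optionality is an assumption.
interface Failure {
  category: string;
  severity: string;
  error: string;
  file?: string;
  line?: number;
  flaky?: { is_flaky: boolean; confidence: number };
  suggested_fix?: string;
}
interface Step { name: string; failures: Failure[]; }
interface Job { name: string; status: string; steps: Step[]; }
interface TriageReport {
  repo: string;
  run_id: number;
  status: string;
  jobs: Job[];
  analysis?: { mode: string };
}
interface LocatedFailure extends Failure { job: string; step: string; }

// Collects failures not marked flaky, tagged with the job and step they came from.
function realFailures(report: TriageReport): LocatedFailure[] {
  const out: LocatedFailure[] = [];
  for (const job of report.jobs) {
    for (const step of job.steps) {
      for (const f of step.failures) {
        if (!(f.flaky && f.flaky.is_flaky)) {
          out.push({ ...f, job: job.name, step: step.name });
        }
      }
    }
  }
  return out;
}

// True when the analysis came from the LLM path ("llm" is the value in the sample).
function usedLlm(report: TriageReport): boolean {
  return report.analysis !== undefined && report.analysis.mode === "llm";
}
```

A script could feed it the CLI output with `JSON.parse` and act only on the non-flaky entries.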

Failure Categories

| Category | Severity | Description |
|----------|----------|-------------|
| oom | critical | Out of memory / heap exhaustion |
| port_conflict | high | Address already in use |
| permission_error | high | EACCES / permission denied |
| module_not_found | high | Missing dependency or import |
| assertion_error | medium | Test failure / assertion mismatch |
| docker_error | high | Container / image failure |
| network_error | medium | Connection refused / timeout |
| missing_env | high | Missing environment variable |
| missing_file | medium | File or artifact not found |
| rate_limited | medium | HTTP 429 / rate limiting |
| missing_lockfile | high | Lockfile not committed |
| dependency_security | high | Vulnerable dependency |
| timeout | medium | Step or job timeout |
| lint_error | low | Linter / formatter failure |
| type_error | high | TypeScript / compilation error |
| unknown | low | Unrecognized (manual triage) |
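
An agent consuming the report may want to handle the most severe failures first. The sketch below encodes the table as a lookup and sorts by severity. The names `CATEGORY_SEVERITY`, `severityOf`, and `sortBySeverity` are hypothetical, and treating unlisted categories as `low` (mirroring the `unknown` row) is an assumption.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

// Category-to-severity mapping transcribed from the table above.
const CATEGORY_SEVERITY: Record<string, Severity> = {
  oom: "critical",
  port_conflict: "high",
  permission_error: "high",
  module_not_found: "high",
  assertion_error: "medium",
  docker_error: "high",
  network_error: "medium",
  missing_env: "high",
  missing_file: "medium",
  rate_limited: "medium",
  missing_lockfile: "high",
  dependency_security: "high",
  timeout: "medium",
  lint_error: "low",
  type_error: "high",
  unknown: "low",
};

// Lower rank means more urgent.
const SEVERITY_RANK: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// Assumption: categories missing from the table are treated like `unknown`.
function severityOf(category: string): Severity {
  return Object.prototype.hasOwnProperty.call(CATEGORY_SEVERITY, category)
    ? CATEGORY_SEVERITY[category]
    : "low";
}

// Returns a new array ordered from most to least severe.
function sortBySeverity<T extends { category: string }>(items: T[]): T[] {
  return items
    .slice()
    .sort((a, b) => SEVERITY_RANK[severityOf(a.category)] - SEVERITY_RANK[severityOf(b.category)]);
}
```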

Development

git clone https://github.com/clankamode/ci-triage
cd ci-triage
npm install
npm run build
npm test

License

MIT

Credits

Built by Clanka ⚡ — an autonomous engineer.
