tooleval

v0.1.0

Published

a month ago

npm audit for MCP servers — zero-config heuristic testing that discovers tools and runs 21 automated security & quality checks per tool

Vulnerabilities: 0 High, 0 Medium, 0 Low

touchskyer

mcp model-context-protocol testing security audit cli tool-testing path-traversal schema-validation ssrf command-injection sarif ci

tooleval

npm audit for MCP servers — zero-config, zero-API-cost heuristic testing

ToolEval connects to any MCP server via stdio, discovers all tools, and runs 21 automated checks per tool covering schema validation, security, resilience, and correctness. No API keys. No LLM costs. Just plug in your server command and get results.
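For readers curious what "connects via stdio and discovers all tools" involves at the protocol level, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client writes to a server's stdin. It is not taken from ToolEval's source: the helper names, the client name `example-checker`, and the protocol version string are assumptions chosen for illustration.

```python
import json
from itertools import count

_next_id = count(1)  # JSON-RPC request ids: 1, 2, 3, ...

def request(method, params=None):
    """Build a JSON-RPC 2.0 request serialized as one line of JSON."""
    message = {"jsonrpc": "2.0", "id": next(_next_id), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)

def notification(method):
    """Build a JSON-RPC 2.0 notification (no id, so no response is expected)."""
    return json.dumps({"jsonrpc": "2.0", "method": method})

# Message sequence: handshake, tool discovery, then an empty-argument call.
# The tool name comes from the sample output in this README; the rest is assumed.
handshake = [
    request("initialize", {
        "protocolVersion": "2024-11-05",  # assumed version string
        "capabilities": {},
        "clientInfo": {"name": "example-checker", "version": "0.0.0"},
    }),
    notification("notifications/initialized"),
    request("tools/list"),
    request("tools/call", {"name": "create_entities", "arguments": {}}),
]

for line in handshake:
    print(line)
```

The last message mirrors what check B ("Empty call resilience") describes: a `tools/call` with `{}` as arguments.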

Quick Start

npx tooleval npx @modelcontextprotocol/server-filesystem /tmp

That's it. One command, full report.

Output Formats

Text (default)

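npx tooleval npx @modelcontextprotocol/server-filesystem /tmp

Sample output (captured from a run against @modelcontextprotocol/server-memory rather than the filesystem server above):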

🔍 ToolEval Spike — Generic MCP Server Checker
📦 Server: npx -y @modelcontextprotocol/server-memory

📋 Discovered 9 tools:
   • create_entities — Create multiple new entities in the knowledge graph
   • create_relations — Create multiple new relations between entities
   ...

────────────────────────────────────────────────────────────
🔧 Testing: create_entities
   ✅ A. Schema exists — 1 props
   ✅ B. Empty call (no crash) — error (expected)
   ✅ C. Response shape — 1 items
   ...

════════════════════════════════════════════════════════════
📊 SUMMARY
  ✅ create_entities: 18/18 checks
  Total: 162/162 checks passed
  ✅ ALL CLEAR
════════════════════════════════════════════════════════════

JSON

npx tooleval --format json npx @modelcontextprotocol/server-filesystem /tmp

Returns structured JSON with per-tool results — perfect for CI pipelines.

SARIF

npx tooleval --format sarif npx @modelcontextprotocol/server-filesystem /tmp

Outputs SARIF format — integrates with GitHub Code Scanning, Azure DevOps, and other SARIF-compatible tools.
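If you want a quick summary of a SARIF file before uploading it, a short script can count results by severity. The sketch below relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`). The sample document is hand-written for illustration and does not reflect ToolEval's actual rule IDs or messages.

```python
import json
from collections import Counter

def summarize_sarif(sarif):
    """Count SARIF results by level across all runs."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result without an explicit level is treated as "warning" here,
            # which is SARIF's default when the rule does not configure one.
            counts[result.get("level", "warning")] += 1
    return dict(counts)

# Hand-written stand-in for tooleval.sarif; rule ids and messages are made up.
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example"}},
        "results": [
            {"ruleId": "example/rule-1", "level": "error", "message": {"text": "..."}},
            {"ruleId": "example/rule-2", "message": {"text": "..."}},
        ],
    }],
}

print(summarize_sarif(sample))  # {'error': 1, 'warning': 1}
```

For a real report, load the file first: `summarize_sarif(json.load(open("tooleval.sarif")))`.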

HTML

npx tooleval --format html npx @modelcontextprotocol/server-filesystem /tmp > report.html

Self-contained HTML report — open in any browser, share with stakeholders.

The 21 Checks

| # | Check | What it tests |
|---|-------|--------------|
| A | Schema exists | Tool exposes a valid inputSchema object |
| B | Empty call resilience | Calling with {} doesn't crash the server |
| C | Response shape | Response has valid MCP content array |
| D | Response time | Responds within 10 seconds |
| E | Path traversal | Classic ../../../etc/passwd is rejected |
| F | Multi-vector traversal | 6 path traversal bypass techniques blocked |
| G | Schema validation (Ajv) | inputSchema compiles as valid JSON Schema |
| H | Error info leakage | No stack traces, secrets, or paths in errors |
| I | Large input (1MB) | Server handles 1MB payloads gracefully |
| J | Concurrent resilience | 5 simultaneous calls all return successfully |
| K | SSRF probe | Internal network URLs (169.254.x, localhost) are rejected |
| L | Command injection | Shell metacharacters in string params are rejected |
| M | Secret detection | Responses don't leak API keys, tokens, or credentials |
| N | Type coercion | Wrong types (string→int) are handled gracefully |
| O | Idempotency | Repeated identical calls produce consistent results |
| P | Timeout escalation | Slow inputs don't hang the server indefinitely |
| Q | Unicode handling | Unicode/special chars don't crash or corrupt |
| R | Required fields | Missing required fields produce proper errors |
| S | Nested depth | Deeply nested objects are handled gracefully |
| T | Enum boundary | Out-of-range enum values are rejected |
| U | Description quality | Tool has a meaningful description |
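To give a feel for what a heuristic check such as H (error info leakage) or M (secret detection) can look like, here is a small sketch that scans an error message for a few telltale patterns. The pattern set and the function name `find_leaks` are assumptions for illustration. ToolEval's real rules are not documented here and may differ substantially.

```python
import re

# Illustrative patterns only; not ToolEval's actual rule set.
LEAK_PATTERNS = {
    # Node.js stack frame, e.g. "    at readFile (/app/fs.js:10:5)"
    "node_stack_trace": re.compile(r"^\s+at .+:\d+:\d+\)?$", re.MULTILINE),
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    # Absolute paths under common home directories or a Windows drive letter
    "absolute_path": re.compile(r"(?:/home/|/Users/|[A-Za-z]:\\)[^\s'\"]+"),
    # Shape of an AWS access key id
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def find_leaks(error_text):
    """Return the sorted names of all patterns that match the error text."""
    return sorted(
        name for name, pattern in LEAK_PATTERNS.items()
        if pattern.search(error_text)
    )

noisy = "Error: ENOENT\n    at readFile (/home/app/src/fs.js:10:5)"
clean = "Invalid arguments: 'path' is required"

print(find_leaks(noisy))  # ['absolute_path', 'node_stack_trace']
print(find_leaks(clean))  # []
```

Pattern-based checks like this are cheap and deterministic, which is what makes them suitable for repeated CI runs, but they can produce both false positives and false negatives.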

Checks are automatically skipped when not applicable (e.g., path traversal skipped for tools without path params).
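One plausible way to decide applicability is to inspect each tool's inputSchema for parameters that look like paths or URLs. The sketch below illustrates that idea under stated assumptions: the hint lists and the function names are hypothetical, and this is not necessarily how ToolEval makes the decision.

```python
# Hypothetical name hints; a real checker may use different or richer signals.
PATH_HINTS = ("path", "file", "dir", "folder")
URL_HINTS = ("url", "uri", "endpoint")

def string_params(input_schema):
    """Names of top-level properties declared with type "string"."""
    props = input_schema.get("properties", {})
    return [name for name, spec in props.items() if spec.get("type") == "string"]

def applicable_checks(input_schema):
    """Guess which security checks make sense for a tool's input schema."""
    names = [name.lower() for name in string_params(input_schema)]
    return {
        "path_traversal": any(hint in n for n in names for hint in PATH_HINTS),
        "ssrf": any(hint in n for n in names for hint in URL_HINTS),
        # Shell metacharacters can be tried on any string parameter.
        "command_injection": bool(names),
    }

read_file_schema = {
    "type": "object",
    "properties": {"path": {"type": "string"}, "limit": {"type": "integer"}},
}
print(applicable_checks(read_file_schema))
# {'path_traversal': True, 'ssrf': False, 'command_injection': True}
```

A tool whose only parameter is an array or object would get `False` for all three, so those checks would be skipped rather than counted as passes or failures.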

CI Integration

GitHub Actions

name: MCP Server Audit
on: [push, pull_request]

jobs:
  tooleval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - name: Run ToolEval
        run: npx tooleval --format json node ./dist/server.js > tooleval-report.json
      - name: Check results
        run: npx tooleval node ./dist/server.js
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: tooleval-report
          path: tooleval-report.json

GitHub Code Scanning (SARIF)

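      - name: Run ToolEval (SARIF)
        run: npx tooleval --format sarif node ./dist/server.js > tooleval.sarif
      # Upload even when ToolEval exits non-zero, so failed checks still reach Code Scanning.
      # The job also needs `permissions: security-events: write` for the upload to succeed.
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: tooleval.sarif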

Why ToolEval

No API key needed — pure heuristic checks, no LLM calls
No LLM cost — run it 1000 times in CI, it's free
Tests the tool, not the AI — validates the MCP server interface directly
Catches real security issues — path traversal, SSRF, command injection, info leakage
Zero config — just point it at your server command
CI-ready — JSON/SARIF/HTML output + proper exit codes

Exit Codes

| Code | Meaning |
|------|---------|
| 0 | All checks passed |
| 1 | Some checks failed |
| 2 | Fatal error (server failed to connect, etc.) |
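In a custom CI script you may want to tell "checks failed" apart from "the tool could not run". The following sketch wraps the CLI and maps the exit codes from the table above to messages. The function names are hypothetical, and it assumes `npx` is on the PATH.

```python
import subprocess

# Meanings taken from the exit code table in this README.
EXIT_MEANINGS = {
    0: "all checks passed",
    1: "some checks failed",
    2: "fatal error (server failed to connect, etc.)",
}

def describe_exit(code):
    """Translate a ToolEval exit code into a human-readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")

def run_tooleval(server_command):
    """Run ToolEval against a server command; return (exit_code, description)."""
    completed = subprocess.run(["npx", "tooleval", *server_command])
    return completed.returncode, describe_exit(completed.returncode)

print(describe_exit(1))  # some checks failed
```

For example, `code, meaning = run_tooleval(["node", "./dist/server.js"])` runs the audit, after which a script could fail the build on code 1 but retry or alert differently on code 2.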

Requirements

Node.js >= 18
The MCP server must be launchable via a shell command (stdio transport)

License

MIT
