@dj_abstract/mcp-audit
v0.4.0
Published
Security auditor for Model Context Protocol (MCP) servers — scans tool definitions for prompt injection, tool poisoning, unsafe combinations, and other AI-native vulnerabilities.
Maintainers
Readme
mcp-audit
A security auditor for Model Context Protocol (MCP) servers. Scans tool, resource, and prompt definitions for AI-native security issues — prompt injection, tool poisoning, dangerous capability combinations, schema permissiveness, and more.
Why this exists: MCP servers ship arbitrary text directly into the host LLM's context. A malicious or sloppy server can manipulate any agent that connects to it. As the MCP ecosystem grows, the surface for prompt injection, tool poisoning, and "lethal trifecta" capability combinations grows with it. There's no shortage of CVE scanners. There's almost nothing focused on the threats that are unique to agent infrastructure.
What it checks
| Rule | Severity (worst case) | What it catches |
|------|-----------------------|------------------|
| prompt-injection | critical | Instruction overrides, role redefinition, fake system tags, system-prompt extraction, silent-exfiltration directives, and other injection patterns embedded in tool/prompt/resource descriptions. |
| invisible-instructions | critical | Unicode Tag block characters (the "ASCII Smuggler" attack), zero-width characters, control characters, and large base64 blobs hidden in descriptions. |
| tool-poisoning | high | Hidden capabilities (params not mentioned in the description), read-only claims contradicted by mutating params, descriptions that reference a different tool name. |
| unsafe-tool-combos | critical | "Lethal trifecta" combinations on a single server: shell-exec + network-egress, file-read + network-egress, secret-read + network-egress, file-write + shell-exec. |
| sensitive-output | high | Tools whose names suggest they return secrets, env vars, credentials, sessions, or private keys. |
| destructive-no-confirm | medium | Destructive tools (delete_*, drop_*, kill_*) with no confirmation parameter. |
| schema-permissiveness | high | Unbounded string params on command-shaped surfaces, missing inputSchema, additionalProperties: true, undefined object structures. |
| unauthenticated-server | high | Remote (HTTP/SSE) MCP servers that accept connections without auth. |
| excessive-scope | medium | A single server spanning many unrelated capability domains (filesystem + network + shell + db + …) — large blast radius if compromised. |
Install
One-shot with npx (no install):
npx @dj_abstract/mcp-audit scan --stdio "node ./my-mcp-server.js"Global install:
npm install -g @dj_abstract/mcp-audit
mcp-audit --helpOr clone and run from source:
git clone https://github.com/abregoarthur-star/mcp-audit
cd mcp-audit
npm install
node bin/mcp-audit.js --helpRequires Node.js 20+.
Usage
Scan a local stdio MCP server
mcp-audit scan --stdio "node ./my-mcp-server.js"Scan a remote HTTP/SSE server
mcp-audit scan --url https://mcp.example.com/sse --bearer "$TOKEN"
mcp-audit scan --url https://mcp.example.com --header "X-Api-Key: $KEY"Scan a static manifest
Useful for offline audits, CI pipelines, or auditing in-process SDK servers (see "Auditing Agent SDK servers" below).
mcp-audit scan --manifest server.jsonOutput formats
mcp-audit scan --stdio "..." --html report.html # standalone HTML, share-friendly
mcp-audit scan --stdio "..." --json report.json # JSON for CI / automation
mcp-audit scan --stdio "..." --sarif results.sarif # SARIF 2.1.0 — GitHub code-scanning compatible
mcp-audit scan --stdio "..." --json # JSON to stdout
mcp-audit scan --stdio "..." --quiet --json | jq ... # pipingCI gate
Exit non-zero if any finding meets a severity threshold:
mcp-audit scan --stdio "..." --fail-on highGitHub Actions (native Code Scanning integration)
Drop-in Action that runs the scan, emits SARIF, and surfaces findings in your PR's Security tab and inline on Files Changed:
- uses: abregoarthur-star/mcp-audit-action@v1
with:
manifest: ./mcp-manifest.json
fail-on: highFull docs and recipes: mcp-audit-action.
Auditing Agent SDK servers
Servers built with the Anthropic Agent SDK's createSdkMcpServer() run in-process; they are not standalone stdio servers. Use the bundled extractor to dump them as a manifest first:
node bin/extract-sdk-server.js path/to/your-mcp.js exportName /tmp/manifest.json
mcp-audit scan --manifest /tmp/manifest.json --html report.htmlSample finding
CRITICAL Shell execution + network egress on same server
rule: unsafe-tool-combos · target: server/brain-tools
A single server provides both arbitrary command execution and outbound
network capability. Any prompt-injection that lands here can run a
command and exfiltrate the output in one hop.
evidence:
shell_exec: ["execute_command"]
network_out: ["create_linkedin_draft","security_intel","market_intel",
"send_telegram","manage_tasks","read_email","send_email"]
remediation:
Split capabilities across separate MCP servers with separate trust
boundaries. The host agent can compose them, but a compromise of one
server should not yield the full kill chain.
refs:
- https://owasp.org/www-project-top-10-for-large-language-model-applications/
- https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/Threat model
mcp-audit is a static analyzer of an MCP server's surface. It does not execute tools, send payloads, or attempt exploitation. Every check is read-only:
- For stdio servers: spawn the server, perform the MCP
initialize/tools/list/resources/list/prompts/listhandshakes, then close. - For HTTP/SSE servers: connect, list, close.
- For manifests: pure file read.
This makes it safe to run against production servers, including third-party servers you don't own.
It will not catch:
- Vulnerabilities in tool implementations (e.g. SQL injection inside a
query_dbhandler). - Behavior that only manifests at call time (e.g. rate-limit issues, time-of-check / time-of-use bugs).
- Backdoored binaries or supply-chain compromise of the server itself.
Pair it with conventional SAST/DAST and supply-chain scanning.
Differential audits (diff)
Detect rug-pulls and drift between two snapshots of an MCP server. New to 0.3.0.
# First time — save a baseline
mcp-audit scan --stdio "..." --json baseline.json
# Later — diff current state against baseline
mcp-audit diff baseline.json current.json
mcp-audit diff baseline.json current.json --fail-on high # CI gateWhat it catches:
| Severity | Detects |
|---|---|
| CRITICAL | A new tool introduces a capability class (shell-exec, network-egress, secret-read) the server didn't have before — silent capability expansion, classic rug-pull. |
| CRITICAL | Prompt-injection markers appeared in a tool description that wasn't there before. |
| CRITICAL | An existing tool's capability class widened (e.g. its name or schema now implies shell execution where it previously didn't). |
| HIGH | Server-level capability drift — the union of the server's capabilities has grown. |
| HIGH | readOnlyHint annotation removed — a previously read-only tool can now mutate state. |
| HIGH | inputSchema widened with additionalProperties: true. |
| HIGH | Tool description materially rewritten (>25% length delta). |
| MEDIUM | New tool added (no new capability class). |
| MEDIUM | Tool removed. |
| MEDIUM | Required parameters dropped from inputSchema. |
| LOW | Cosmetic description or schema edits. |
Pair with CI: if you connect your agent to a third-party MCP server, run mcp-audit scan --json current.json nightly and mcp-audit diff prior.json current.json --fail-on high to page on silent changes. Your agents should not discover a new execute_command tool on a server they've trusted for months.
Programmatic API
import { audit, diff } from '@dj_abstract/mcp-audit';
// Scan
const report = await audit({ stdio: 'node ./server.js' });
console.log(report.summary.bySeverity);
// Diff
const result = await diff('baseline.json', 'current.json');
console.log(result.summary, result.changes);
for (const f of result.findings) {
console.log(f.severity, f.ruleId, f.title);
}Roadmap
- Detection-only Nuclei-style remote checks (auth bypass probes, CORS misconfig)
- Per-tool permission-cost scoring (rank which tools deserve human-in-the-loop gating)
- Integration with the MCP server registry for community scoring
- Recipe for Brain Agent SDK to call
audit()before connecting to any new server
References
- OWASP Top 10 for LLM Applications
- Tool Poisoning Attacks (Invariant Labs)
- The Lethal Trifecta (Simon Willison)
- Hiding and finding text with Unicode Tags (Embrace The Red)
- MITRE ATLAS
- Model Context Protocol specification
Related tools
Part of a detect → inventory → test → generate → defend pipeline for AI-agent security:
| Layer | Tool | Role |
|---|---|---|
| Detect | mcp-audit (you are here) | Static audit of MCP server definitions |
| Detect | mcp-audit-sweep | Reproducible sweep of public MCP servers (methodology + report) |
| Inventory | @dj_abstract/agent-capability-inventory | Fleet-wide tool catalog with data-sensitivity tags |
| Test | prompt-eval | Runtime prompt-injection eval harness against a live agent |
| Generate | @dj_abstract/prompt-genesis | LLM-driven adversarial attack corpus generator (feeds prompt-eval) |
| Defend | @dj_abstract/agent-firewall | Call-time defensive middleware for tool invocations |
License
MIT — see LICENSE.
