magenta-canon

v0.8.1

Published

8 hours ago

A verifiable MCP accountability gateway for AI-agent tool calls: allows authorized calls, blocks unauthorized calls, records both, and produces cryptographic evidence anyone can verify.


royalohioholdings

mcp model-context-protocol ai-agents tool-calls gateway transparency-log merkle verifiable accountability audit

Magenta Canon

A verifiable MCP accountability gateway for AI-agent tool calls. It sits between an MCP host (Claude Code, Claude Desktop, Cursor, …) and downstream MCP tools, and:

allows authorized calls · blocks unauthorized calls · records both · and produces cryptographic evidence anyone can verify.

Try it now (nothing to clone)

Installing from npm? A plain npx magenta-canon (or npm i magenta-canon) installs the current release — the latest dist-tag tracks it (it is also tagged next). It is a proven reference implementation, verified on macOS/Linux/Windows — not production-hosted infrastructure yet.

npx magenta-canon demo                  # full local proof loop: allow · block · witness · verify · tamper
npx magenta-canon vertical fintech-refund # sector demo in buyer language (vertical --list shows all 8)
npx magenta-canon verify --self-test    # verifier self-check: all verdict levels incl. a tamper FAIL
npx magenta-canon gateway <config.json> # the stdio MCP capability gateway
npx magenta-canon mirror --self-test    # external STH mirror: catch operator history rewrites
npx magenta-canon sentinel pack         # Sentinel Mesh: invariant-check the packed artifact

Magenta Canon is a proven reference implementation, not production-hosted infra yet. See what ships vs. what doesn't below.

In 60 seconds

AI agents are starting to take real actions through tools (refunds, deploys, file writes). The hard question becomes: was that action authorized, and can you prove what actually happened?

Magenta answers it. You point your agent's MCP connection at the Magenta Gateway. For every tool call it:

gates the call against an operator-delegated capability (e.g. "may refund up to $100") — default deny;
witnesses the decision — allowed and refused — as a hash-chained, signed execution receipt in an append-only Merkle transparency log;
forwards allowed calls to the real tool; blocks the rest (the tool is never even contacted) and returns an error the agent can read.
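
The gate step above can be pictured with a minimal sketch. The `ToolCall`, `Capability`, and `Decision` shapes and the `gate` function are hypothetical names chosen for illustration, not Magenta Canon's real capability format; only the default-deny rule and the refund figures come from this README.

```typescript
// Hypothetical shapes for illustration only; not the real capability format.
type ToolCall = { tool: string; args: Record<string, string | number> };
type Capability = { tool: string; maxAmountCents: number };
type Decision = { allowed: boolean; reason: string };

function gate(call: ToolCall, capabilities: Capability[]): Decision {
  const cap = capabilities.find((c) => c.tool === call.tool);
  // Default deny: a call with no matching capability is refused.
  if (cap === undefined) {
    return { allowed: false, reason: `no capability delegated for tool "${call.tool}"` };
  }
  const amount = call.args["amount_cents"];
  if (typeof amount !== "number") {
    return { allowed: false, reason: "amount_cents missing or not a number" };
  }
  if (amount > cap.maxAmountCents) {
    return {
      allowed: false,
      reason: `exceeds delegated ceiling (${amount} > ${cap.maxAmountCents} cents)`,
    };
  }
  return { allowed: true, reason: "within delegated capability" };
}

// "May refund up to $100", expressed in cents.
const capabilities: Capability[] = [{ tool: "refund", maxAmountCents: 10000 }];

const allowedDecision = gate(
  { tool: "refund", args: { amount_cents: 8900, order_id: "4471" } },
  capabilities,
);
const blockedDecision = gate(
  { tool: "refund", args: { amount_cents: 25000, order_id: "4471" } },
  capabilities,
);
const unknownToolDecision = gate({ tool: "deploy", args: {} }, capabilities);

console.log(allowedDecision, blockedDecision, unknownToolDecision);
```

In this sketch both refusals carry a readable reason, mirroring the idea that the agent receives an error it can act on.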

Afterward, anyone can run a standalone verifier over the published evidence and get VERIFIED — without trusting the server that produced it. Tamper with the record and verification fails. That's the whole point: not "trust our log," but "verify the receipt yourself."
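
Why a tampered record fails verification can be shown with a small hash chain. This is a sketch under assumptions: the `Receipt` fields and the JSON-array serialization are invented for illustration, and signatures and the Merkle log are left out. The real wire format is described in docs/MAGENTA_VERIFICATION_SPEC.md.

```typescript
import { createHash } from "node:crypto";

// Hypothetical receipt shape; the real format is defined by the verification spec.
type Receipt = {
  seq: number;
  decision: "allowed" | "refused";
  action: string;
  prevHash: string;
  hash: string;
};

const GENESIS = "0".repeat(64);

function receiptHash(r: Omit<Receipt, "hash">): string {
  const body = JSON.stringify([r.seq, r.decision, r.action, r.prevHash]);
  return createHash("sha256").update(body).digest("hex");
}

// Each new receipt commits to the hash of the one before it.
function appendReceipt(chain: Receipt[], decision: Receipt["decision"], action: string): Receipt[] {
  const last = chain[chain.length - 1];
  const body = { seq: chain.length, decision, action, prevHash: last ? last.hash : GENESIS };
  return [...chain, { ...body, hash: receiptHash(body) }];
}

// Recomputes every hash and link; any edited field breaks the chain.
function verifyChain(chain: Receipt[]): boolean {
  let prev = GENESIS;
  let index = 0;
  for (const r of chain) {
    if (r.seq !== index || r.prevHash !== prev || r.hash !== receiptHash(r)) return false;
    prev = r.hash;
    index += 1;
  }
  return true;
}

let chain: Receipt[] = [];
chain = appendReceipt(chain, "allowed", "refund 8900 cents order 4471");
chain = appendReceipt(chain, "refused", "refund 25000 cents order 4471");

// Rewrite history: pretend the first refund was for a different amount.
const tampered = chain.map((r) => (r.seq === 0 ? { ...r, action: "refund 100 cents order 4471" } : r));

const honestOk = verifyChain(chain);
const tamperedOk = verifyChain(tampered);
console.log({ honestOk, tamperedOk });
```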

See it run (30-second demo)

The fastest way to see it is to run the one-command proof loop yourself:

npx magenta-canon demo

It runs the whole loop — an allowed call, a blocked call, a verified receipt, and a tamper case that fails. A narrated walkthrough is in the explainer video, and the Understand it visually section below shows each step. (A terminal-recording GIF is coming; see docs/DEMO_RECORDING.md.)

Understand it visually

The whole idea in three pictures — full set (get started, architecture, the downstream non-bypass proof, one-command demo) in the Visual Guide and on the live site.

What it does — authorize what's allowed, block what isn't, record proof of both:

Magenta Canon sits between an AI agent and a tool server, authorizing, blocking, and recording every tool call.

Allowed vs blocked — an $89 refund is forwarded; a $250 refund stops at the gate:

An $89 refund is authorized and forwarded; a $250 refund is blocked and never reaches the tool server.

Verify the receipt — check the proof yourself; one flipped byte fails verification:

The magenta-verify checker returns VERIFIED for a valid receipt and VERIFICATION FAILED when a single byte is tampered.

A narrated walkthrough is in the explainer video on the live site.

The proof (we don't rely on claims)

Two recorded, reproducible, end-to-end live runs are kept in the repo:

docs/MCP_GATEWAY_RUN.md — the wedge: an agent tried two tool calls; one was allowed, one was blocked, both were recorded, the downstream tool's own log proves the blocked call never arrived, and magenta-verify returned VERIFIED.
docs/VERIFICATION_RUN.md — the trust anchor: operator publishes evidence → independent verifier checks it → a tampered bundle fails (exit code 1).

To catch operator history rewrites (not just tampered bundles), run an independent STH mirror: docs/MIRROR_OPERATIONAL_RUNBOOK.md — "Verify the receipt — and compare the signed tree head against an outside mirror."
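
The idea behind an outside mirror can be sketched as follows. The `TreeHead` shape, the `SthMirror` class, and the finding names are hypothetical and are not the magenta-mirror/1 format. The sketch only compares heads it has seen; proving that a larger tree extends a smaller one needs a Merkle consistency proof, which is omitted here, and signature checks are omitted too.

```typescript
// Hypothetical observation shape; not the real mirror record format.
type TreeHead = { treeSize: number; rootHash: string };
type Finding =
  | { kind: "ok" }
  | { kind: "equivocation" | "rollback"; detail: string };

class SthMirror {
  private readonly rootBySize = new Map<number, string>();
  private largestSize = 0;

  observe(sth: TreeHead): Finding {
    const prior = this.rootBySize.get(sth.treeSize);
    if (prior !== undefined) {
      // Same size, different root: the operator showed two histories.
      return prior === sth.rootHash
        ? { kind: "ok" }
        : { kind: "equivocation", detail: `two roots observed at size ${sth.treeSize}` };
    }
    if (sth.treeSize < this.largestSize) {
      // A new head smaller than one already seen: the log appears to shrink.
      return { kind: "rollback", detail: `size ${sth.treeSize} after size ${this.largestSize}` };
    }
    this.rootBySize.set(sth.treeSize, sth.rootHash);
    this.largestSize = sth.treeSize;
    return { kind: "ok" };
  }
}

const mirror = new SthMirror();
const first = mirror.observe({ treeSize: 1, rootHash: "aa" });
const second = mirror.observe({ treeSize: 3, rootHash: "bb" });
const repeat = mirror.observe({ treeSize: 3, rootHash: "bb" });
const rewritten = mirror.observe({ treeSize: 3, rootHash: "cc" });
const shrunk = mirror.observe({ treeSize: 2, rootHash: "dd" });

console.log(first.kind, second.kind, repeat.kind, rewritten.kind, shrunk.kind);
```

The mirror is only useful when it is run by a party other than the operator, since the point is to hold a copy of history the operator cannot edit.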

Why this matters and where it's going: docs/LAUNCH_MANIFESTO.md — Verifiable Agent Actions: stop trusting the log, verify the receipt.

Architecture

  MCP host (Claude Code / Desktop / Cursor)
        │  tools/call
        ▼
  ┌─────────────────────┐     gate + witness     ┌──────────────────────────┐
  │   Magenta Gateway    │ ─────────────────────▶ │  Control plane            │
  │   (stdio proxy)      │                        │  • capability gate        │
  │                      │ ◀───── allow/deny ──── │  • transparency-log witness│
  └─────────┬───────────┘                        │  • /api/trust/evidence    │
            │ forward IF allowed                  └──────────────┬───────────┘
            ▼                                                    │ signed receipts
  downstream MCP tool                                            ▼
  (refund, search, …)                          evidence bundle ──▶ magenta-verify
                                                                   = VERIFIED ✓

The verifier imports nothing from the server — only node:crypto + tweetnacl — so it's independently auditable. See docs/MAGENTA_VERIFICATION_SPEC.md.
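
To illustrate why an independently supplied witness key matters, here is a minimal Ed25519 sketch. It uses only `node:crypto` (the real verifier also uses tweetnacl), and the serialized tree head is a made-up JSON object, not the real STH encoding.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// The witness key pair. In practice the verifier would hold only the public half,
// obtained out of band, which is what makes the origin check independent.
const witness = generateKeyPairSync("ed25519");

// Hypothetical tree head encoding, for illustration only.
const sthBytes = Buffer.from(JSON.stringify({ tree_size: 2, root_hash: "ab".repeat(32) }));

// For Ed25519, node:crypto takes null as the digest algorithm.
const signature = sign(null, sthBytes, witness.privateKey);

// 1. The genuine tree head verifies against the pinned witness key.
const genuineOk = verify(null, sthBytes, witness.publicKey, signature);

// 2. Flipping a single bit of the signed bytes breaks the signature.
const tamperedBytes = Buffer.from(sthBytes);
tamperedBytes.writeUInt8(tamperedBytes.readUInt8(0) ^ 0x01, 0);
const tamperedOk = verify(null, tamperedBytes, witness.publicKey, signature);

// 3. A signature from some other key does not pass against the pinned key,
//    even though it is a perfectly valid signature for that other key.
const impostor = generateKeyPairSync("ed25519");
const impostorSignature = sign(null, sthBytes, impostor.privateKey);
const impostorOk = verify(null, sthBytes, witness.publicKey, impostorSignature);

console.log({ genuineOk, tamperedOk, impostorOk });
```

Case 3 is the reason a bundle on its own can vouch for its integrity but not for who produced it: a key shipped inside the bundle could belong to anyone.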

Quickstart — one command

npm install
npm run demo

npm run demo runs the entire proof loop locally and narrates it end to end: it boots the control plane, bootstraps trust, drives the MCP gateway through one allowed and one blocked tool call, proves the blocked call never reached the downstream tool, publishes the evidence bundle, verifies it with the standalone magenta-verify (→ VERIFIED), then tampers with one byte of the record and shows verification fails. It is local/dev only — an ephemeral port and a fresh universe in a temp dir, cleaned up on exit.

# These manual steps run from a REPO CHECKOUT (they boot the full hosted
# control plane, which does not ship in the npm package). From an npm install,
# just run `npx magenta-canon demo` — it is the same proof loop, self-contained.
# 1. boot the control plane (the witness + evidence surface) and bootstrap trust
INTERNAL_API_KEY=operator-secret-xyz MAGENTA_STATE_FILE=/tmp/magenta-state.json \
  PORT=5000 npx tsx server/index.ts &
# wait until it is healthy before bootstrapping (the server takes a few seconds)
until curl -sf http://localhost:5000/api/health >/dev/null 2>&1; do sleep 1; done
curl -s -X POST http://localhost:5000/internal/founder/ceremony \
  -H 'X-Internal-Key: operator-secret-xyz' -d '{}'

# 2. run the live gateway demo (gateway spawns a real downstream MCP server)
DOWNSTREAM_LOG=/tmp/downstream-calls.log \
  node scripts/mcp-demo-drive.mjs examples/magenta-gateway.demo.config.json

# 3. publish evidence and verify it independently
curl -s http://localhost:5000/api/trust/evidence > bundle.json
npx magenta-canon verify bundle.json --expected-witness-key witness.pub
                                       # → RESULT: ORIGIN AND INTEGRITY VERIFIED
# without --expected-witness-key the honest maximum is: RESULT: INTEGRITY VERIFIED
# (the bundle can vouch for its math, not for who produced it)

Full walkthrough: docs/MCP_GATEWAY.md.

Durable evidence (file ledger) — new in 0.4.0

By default evidence lives in memory (the demo's fresh-universe behavior). For evidence that survives a process restart, point the gateway/control plane at an append-only file ledger and pin the witness identity:

MAGENTA_LEDGER_FILE=/var/lib/magenta/ledger.jsonl \
MAGENTA_WITNESS_SECRET=<hex> MAGENTA_WITNESS_PUBLIC=<hex> \
  <your magenta process>

Every receipt and signed tree head is fsync-committed before it is reported recorded; on restart the ledger is strictly replayed (chain, issuer signatures, Merkle roots, checkpoint history) or startup fails loudly. One exclusive writer per ledger; corruption is refused, never repaired silently. This is single-writer local/reference durability — see docs/FILE_LEDGER.md for the format, lock recovery, crash-tail policy, backup guidance, and limitations.
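
The commit-then-report and strict-replay behavior can be sketched like this. It is an assumption-laden illustration: the `Entry` shape and the one-JSON-object-per-line layout are invented here, there is no writer lock or crash-tail handling, and the real format is in docs/FILE_LEDGER.md.

```typescript
import { createHash } from "node:crypto";
import {
  closeSync, existsSync, fsyncSync, mkdtempSync, openSync,
  readFileSync, writeFileSync, writeSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical ledger entry; not the real on-disk format.
type Entry = { seq: number; payload: string; prevHash: string; hash: string };
const GENESIS = "0".repeat(64);

const entryHash = (seq: number, payload: string, prevHash: string): string =>
  createHash("sha256").update(JSON.stringify([seq, payload, prevHash])).digest("hex");

// Strict replay: a gap, a broken link, or a bad hash throws instead of being repaired.
function replay(file: string): Entry[] {
  if (!existsSync(file)) return [];
  const entries: Entry[] = [];
  let prev = GENESIS;
  for (const line of readFileSync(file, "utf8").split("\n")) {
    if (line === "") continue;
    const e = JSON.parse(line) as Entry;
    const ok =
      e.seq === entries.length &&
      e.prevHash === prev &&
      e.hash === entryHash(e.seq, e.payload, e.prevHash);
    if (!ok) throw new Error(`ledger corrupt at entry ${entries.length}`);
    entries.push(e);
    prev = e.hash;
  }
  return entries;
}

function append(file: string, payload: string): Entry {
  const entries = replay(file);
  const last = entries[entries.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = entries.length;
  const entry: Entry = { seq, payload, prevHash, hash: entryHash(seq, payload, prevHash) };
  const fd = openSync(file, "a"); // append-only open
  try {
    writeSync(fd, JSON.stringify(entry) + "\n");
    fsyncSync(fd); // flush to disk before the entry is reported as recorded
  } finally {
    closeSync(fd);
  }
  return entry;
}

const ledgerFile = join(mkdtempSync(join(tmpdir(), "ledger-sketch-")), "ledger.jsonl");
append(ledgerFile, "allowed refund 8900");
append(ledgerFile, "refused refund 25000");
const replayed = replay(ledgerFile);

// Edit one payload on disk, then replay again: startup should fail loudly.
writeFileSync(ledgerFile, readFileSync(ledgerFile, "utf8").replace("8900", "8901"));
let corruptionDetected = false;
try {
  replay(ledgerFile);
} catch {
  corruptionDetected = true;
}
console.log({ entries: replayed.length, corruptionDetected });
```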

Witness identity & authorized key rotation — new in 0.5.0

Instead of raw env keys, the witness can hold its identity in an encrypted keystore (scrypt → AES-256-GCM, header-bound AAD) and rotate its signing key under cryptographic authorization — the outgoing key signs the rotation, the incoming key countersigns, and the signed rotation record is committed into the evidence ledger itself before the new key signs anything:

MAGENTA_LEDGER_FILE=/var/lib/magenta/ledger.jsonl \
MAGENTA_WITNESS_KEYFILE=/var/lib/magenta/witness.keystore \
MAGENTA_WITNESS_PASSPHRASE=<operator passphrase> \
  <your magenta process>

Key material alone never grants authority — a validated, durably committed rotation record does. Verifiers replay the rotation chain from a pinned anchor: historical checkpoints validate against the key that was authorized for their epoch, substitution/rollback/forks are refused, and --expected-witness-key still earns ORIGIN AND INTEGRITY VERIFIED across rotations from the original pinned key. A crash at any point in the rotation ceremony reconciles deterministically at next boot (recover the committed fact, abandon the orphaned key, or fail closed — never silent activation).
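
The dual-signature rule can be made concrete with a short sketch. The `Rotation` record, its JSON serialization, and the function names are hypothetical; keystore handling, epochs tied to checkpoints, and crash reconciliation are all omitted. The real protocol is in docs/WITNESS_IDENTITY.md.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical rotation record; not the real wire format.
type Rotation = { epoch: number; oldPub: string; newPub: string; sigByOld: string; sigByNew: string };
type KeyPair = { publicKey: KeyObject; privateKey: KeyObject };

const pubHex = (k: KeyObject): string => k.export({ type: "spki", format: "der" }).toString("hex");
const keyFromHex = (hex: string): KeyObject =>
  createPublicKey({ key: Buffer.from(hex, "hex"), format: "der", type: "spki" });
const rotationBytes = (epoch: number, oldPub: string, newPub: string): Buffer =>
  Buffer.from(JSON.stringify(["rotate", epoch, oldPub, newPub]));

// The outgoing key signs the rotation and the incoming key countersigns it.
function makeRotation(epoch: number, outgoing: KeyPair, incoming: KeyPair): Rotation {
  const oldPub = pubHex(outgoing.publicKey);
  const newPub = pubHex(incoming.publicKey);
  const msg = rotationBytes(epoch, oldPub, newPub);
  return {
    epoch,
    oldPub,
    newPub,
    sigByOld: sign(null, msg, outgoing.privateKey).toString("hex"),
    sigByNew: sign(null, msg, incoming.privateKey).toString("hex"),
  };
}

// Replays rotations from the pinned anchor. Returns the currently authorized
// key, or null if any record is out of order, unlinked, or badly signed.
function replayRotations(anchorPub: string, rotations: Rotation[]): string | null {
  let current = anchorPub;
  let expectedEpoch = 1;
  for (const r of rotations) {
    const msg = rotationBytes(r.epoch, r.oldPub, r.newPub);
    const ok =
      r.epoch === expectedEpoch &&
      r.oldPub === current &&
      verify(null, msg, keyFromHex(r.oldPub), Buffer.from(r.sigByOld, "hex")) &&
      verify(null, msg, keyFromHex(r.newPub), Buffer.from(r.sigByNew, "hex"));
    if (!ok) return null;
    current = r.newPub;
    expectedEpoch += 1;
  }
  return current;
}

const k0 = generateKeyPairSync("ed25519");
const k1 = generateKeyPairSync("ed25519");
const k2 = generateKeyPairSync("ed25519");
const attacker = generateKeyPairSync("ed25519");
const anchor = pubHex(k0.publicKey);

const authorized = replayRotations(anchor, [makeRotation(1, k0, k1), makeRotation(2, k1, k2)]);

// Forgery: the attacker has a key but not the outgoing key, so it signs both slots itself.
const forgedMsg = rotationBytes(1, anchor, pubHex(attacker.publicKey));
const forgedSig = sign(null, forgedMsg, attacker.privateKey).toString("hex");
const forged: Rotation = {
  epoch: 1, oldPub: anchor, newPub: pubHex(attacker.publicKey), sigByOld: forgedSig, sigByNew: forgedSig,
};
const forgedResult = replayRotations(anchor, [forged]);
console.log({ rotatedToK2: authorized === pubHex(k2.publicKey), forgedResult });
```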

Scope honesty: this is local/reference custody (file keystore, not KMS/HSM), and it covers the witness identity only — founder/receipt-issuer key custody is a separate forthcoming lane (in file-ledger mode the issuer key still regenerates on restart; old evidence keeps verifying). Keystore format, rotation protocol, boot-reconciliation table, and operator procedures: docs/WITNESS_IDENTITY.md (ships in the package).

Sentinel Mesh foundation — new in 0.6.0

Magenta Canon 0.6.0 introduces an executable Sentinel Mesh that protects repository state, npm artifacts, dependency boundaries, release promotion, witness identity, and cryptographically authorized witness-key rotation: versioned invariants, deterministic repository / artifact / witness evaluation, evidence-preserving violations, and scoped release-promotion blocking.

magenta-canon sentinel repository        # invariant-check a git repository
magenta-canon sentinel artifact <tgz|dir>  # invariant-check a packed artifact
magenta-canon sentinel pack              # npm pack, then check the result
magenta-canon sentinel witness --ledger ledger.jsonl  # witness rotation-authority check
magenta-canon sentinel ledger ledger.jsonl  # ledger truth: sequence, chains, Merkle, checkpoints (new in 0.7.0)
magenta-canon sentinel execution bundle.json      # authority-bound action evidence (new in 0.8.0)
magenta-canon sentinel configuration config.json  # one operation, one canonical authority path (new in 0.8.0)
magenta-canon sentinel mirror records.jsonl       # independent checkpoint observation (new in 0.8.0)
magenta-canon sentinel invariants        # list the invariant registry

The Witness & Rotation Sentinel (MC-WIT-001…011) watches the 0.5.0 rotation-authority model itself: pinned-anchor continuity, dual-signed rotation validity, exact epoch boundaries, epoch-correct STH signing, retired-key resurrection / pre-activation refusal, and keystore↔ledger agreement — re-derived independently from the published wire formats, so it can disagree with a wrong (or compromised) server.

The Ledger Sentinel (MC-LEDGER-001…021, new in 0.7.0) is the independent judge of ledger truth: contiguous receipt sequence, hash-chain and issuer signatures, RFC 6962 Merkle re-derivation, checkpoint monotonicity and equivocation refusal, corruption fail-closed vs bounded crash-tail quarantine, writer/lock authority (never auto-stolen), witness agreement, and mirror consistency. It reports five distinct outcomes — integrity / origin (pinned anchor only) / operational exclusivity (lock metadata only) / not-evaluated / violation — and is the contract any future storage backend (PostgreSQL, hosted) must satisfy before carrying authority (docs/LEDGER_SENTINEL.md, ships in the package).
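
For readers unfamiliar with the RFC 6962 Merkle tree hash that the sentinel re-derives, a compact sketch follows. It implements the tree hash as RFC 6962 defines it (0x00 prefix for leaves, 0x01 for interior nodes, split at the largest power of two below the leaf count). The function names and sample leaves are illustrative, and this is not Magenta Canon's own code.

```typescript
import { createHash } from "node:crypto";

const LEAF_PREFIX = Buffer.from([0x00]);
const NODE_PREFIX = Buffer.from([0x01]);

function sha256(...parts: Buffer[]): Buffer {
  const h = createHash("sha256");
  for (const p of parts) h.update(p);
  return h.digest();
}

// RFC 6962 Merkle Tree Hash over an ordered list of leaf byte strings.
function merkleRoot(leaves: readonly Buffer[]): Buffer {
  const first = leaves[0];
  if (first === undefined) return sha256(); // empty tree: hash of the empty string
  if (leaves.length === 1) return sha256(LEAF_PREFIX, first);
  let k = 1;
  while (k * 2 < leaves.length) k *= 2; // largest power of two strictly below n
  return sha256(NODE_PREFIX, merkleRoot(leaves.slice(0, k)), merkleRoot(leaves.slice(k)));
}

const leaves = ["receipt-0", "receipt-1", "receipt-2"].map((s) => Buffer.from(s));
const root = merkleRoot(leaves);

// Changing any leaf changes the root, so a signed root commits to every receipt.
const altered = [...leaves];
altered[1] = Buffer.from("receipt-1-edited");
const alteredRoot = merkleRoot(altered);

console.log(root.toString("hex"), root.equals(alteredRoot));
```

The distinct leaf and node prefixes are what stop an interior node from being passed off as a leaf.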

Sentinel Mesh V1 — complete in 0.8.0

Magenta Canon 0.8.0 completes the Sentinel Mesh V1: eight sentinels plus a coordinator, 105 ratified invariants across twelve domains, every one with a named enforcer (the coverage law is a CI gate), zero live reserved entries. The three sentinels added in 0.8.0 are:

Execution Sentinel (MC-EXEC-001…014) — binds a governed action to the signature-authenticated receipt's committed action_hash, never to operator params; a refused action that appears downstream, or a missing receipt, is caught.
Configuration & Dual-Authority Sentinel (MC-CONFIG-001…014) — enforces one operation, one canonical authority path; independently enumerates configuration sources, precedence, fallbacks, and reviewed↔effective drift.
Mirror Sentinel (MC-MIRROR-001…012) — re-derives a magenta-mirror/1 observation stream independently of the mirror that wrote it: record-hash chain, STH signatures, anti-equivocation across all observed sizes.

Two meta-sentinels close the mesh: the Anomaly Sentinel (MC-ANOM-001…004) is the Attack KB's regression memory — a ratified attack that stops being detected fails the build — and the Sentinel Coordinator (MC-COORD-001…003) combines per-sentinel findings into one scoped, deterministic, tamper-evident magenta-sentinel-report/1 without overriding, downgrading, or suppressing any finding, freezing only the affected authority class. Full inventory, limitations, and the V1 boundary: docs/SENTINEL_MESH_V1_COMPLETION.md.

Sentinels are read-only, redacting (paths + fingerprints, never secret bodies), deterministic, network-free, and emit magenta-sentinel-violation/1 evidence records; exit codes map to a scoped promotion decision (eligible / blocked / requires-independent-review). Records at this layer are unsigned local diagnostics, and the freeze is a release/build promotion gate — not a production kill switch, a hosted Sentinel service, continuous monitoring, or a probabilistic/unknown-threat detector. Honest scope and the full model: docs/SENTINEL_MESH.md.

From the repo vs. as a CLI

From a repo checkout (above): npm install then npm run demo.
As a package/CLI — the same three capabilities without cloning internals. Published on npm; a plain npx magenta-canon installs the current release (the latest dist-tag — a proven reference implementation, not production-hosted infra yet):
```
npx magenta-canon demo                 # the full local proof loop
npx magenta-canon verify <bundle.json> # independently verify an evidence bundle
npx magenta-canon gateway <config.json> # run the stdio MCP capability gateway
```
verify exits 0 on VERIFIED, 1 on VERIFICATION FAILED. The packaged demo boots a headless control plane (no frontend needed) and is verified end-to-end on macOS, Linux, and Windows. What ships, what doesn't, and how to test the package locally: docs/NPM_PACKAGING.md. (Published under both the latest and next dist-tags.)
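
Because the verdict is carried in the exit code, a script can gate on it. The wrapper below is a hedged sketch: `verdictFromExit` and `verifyBundle` are hypothetical helper names, and treating any exit code other than 0 or 1 as "UNKNOWN" is an assumption of this sketch rather than documented behavior.

```typescript
import { spawnSync } from "node:child_process";

type Verdict = "VERIFIED" | "VERIFICATION FAILED" | "UNKNOWN";

// Documented: 0 means VERIFIED, 1 means VERIFICATION FAILED.
// Assumption: anything else (including a killed process) is treated as UNKNOWN.
function verdictFromExit(status: number | null): Verdict {
  if (status === 0) return "VERIFIED";
  if (status === 1) return "VERIFICATION FAILED";
  return "UNKNOWN";
}

// Runs the packaged verifier on a bundle and maps its exit code to a verdict.
function verifyBundle(bundlePath: string): Verdict {
  const result = spawnSync("npx", ["magenta-canon", "verify", bundlePath], {
    stdio: "inherit",
    shell: process.platform === "win32", // npx is a .cmd shim on Windows
  });
  return verdictFromExit(result.status);
}

// Example use in a pipeline step (not executed here):
//   if (verifyBundle("bundle.json") !== "VERIFIED") process.exit(1);
console.log(verdictFromExit(0), "|", verdictFromExit(1), "|", verdictFromExit(null));
```

Failing closed on "UNKNOWN" keeps a crashed or missing verifier from being mistaken for a pass.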

Connect your MCP host

Point Claude Code / Claude Desktop / Cursor at the gateway (it proxies your real downstream MCP server, configured in examples/magenta-gateway.demo.config.json):

{
  "mcpServers": {
    "magenta-gated-tools": {
      "command": "npx",
      "args": [
        "tsx",
        "/abs/path/to/magenta-canon/scripts/mcp-gateway.ts",
        "/abs/path/to/magenta-canon/examples/magenta-gateway.demo.config.json"
      ]
    }
  }
}

Every tool call your agent makes now flows through the gate and is witnessed.

Demo transcript

HANDSHAKE_OK tools=refund
ALLOWED_CALL isError=false :: refund executed: {"amount_cents":8900,"order_id":"4471"}
REFUSED_CALL isError=true  :: Blocked by Magenta capability gate: exceeds delegated
  refund ceiling (25000 > 10000 cents). Witnessed as receipt e993d86ef721cf94…

# downstream tool's own log — ground truth of what actually reached it:
{"name":"refund","args":{"amount_cents":8900,"order_id":"4471"}}     ← only the allowed call
# the blocked $250 refund is ABSENT — it never reached the tool.

$ npx magenta-canon verify bundle.json --expected-witness-key witness.pub
  witness-key source: INDEPENDENT (supplied by caller)
  [PASS] STH signature verifies
  [PASS] STH witness key matches INDEPENDENTLY supplied key
  [PASS] receipt chain intact
  [PASS] receipt issuer signatures verify (2/2)
  [PASS] recomputed Merkle root == signed STH root
  RESULT: ORIGIN AND INTEGRITY VERIFIED            (exit code 0)

Security model

Magenta gives accountability, not magic: an action is gated before it happens and recorded so that tampering is detectable by an independent party. One condition matters most: the insider guarantee holds only when the Signed Tree Head (STH) is mirrored to an outside party. The code emits and verifies the STH; running the mirror is an operational step. Read the full threat model and assumptions: docs/SECURITY_MODEL.md.

Current status (honest)

✅ Proven reference implementation — validated live in a development sandbox.
✅ Capability gate, witness, signed receipts, standalone verifier, stdio MCP gateway — all tested (full suite green).
⏳ Production durability (single-writer / Postgres) is not yet complete — the demo uses file-backed persistence.
⏳ The demo's downstream MCP server is minimal (real stdio JSON-RPC, enough to prove the path; not a production tool server).
⏳ Transport is stdio only — no Streamable HTTP, no hosted/multi-tenant.
Capabilities here are operator-configured (Act 1, proven). Per-call agent-signed certificates (Act 2) are a stronger, separate mode that is also proven (docs/AGENT_ACCOUNTABILITY_SPEC.md) but is not exercised by the stdio demo.

This is a credible, reproducible wedge — not a claim of production readiness.

Open source vs. paid

The verifier, spec, evidence formats, stdio gateway, and the local reference control plane are free and open under Apache-2.0 (verification must never require trusting us); hosted witness, external STH mirroring, multi-tenant / team policy control plane, remote HTTP gateway, and compliance reporting are the reserved paid surface — not shipped in this repo. This boundary is ratified: docs/OSS_VS_PAID.md.

Docs index

| Doc | What it is |
|---|---|
| docs/VISUAL_GUIDE.md | Visual walkthrough: what it does, allowed vs blocked, verify the receipt, architecture, explainer video |
| docs/LAUNCH_MANIFESTO.md | The category thesis: verifiable agent accountability |
| docs/DEMO_VIDEO_STORYBOARD.md | Storyboard + capture plan for the npm run demo video/GIF |
| docs/MCP_GATEWAY.md | How the gateway works + run guide |
| docs/MCP_GATEWAY_RUN.md | Live end-to-end gateway proof |
| docs/VERIFICATION_RUN.md | Live trust-anchor proof (+ negative control) |
| docs/MAGENTA_VERIFICATION_SPEC.md | Language-agnostic verification spec |
| docs/SECURITY_MODEL.md | Threat model & trust assumptions |
| docs/AGENT_ACCOUNTABILITY_SPEC.md | Agent-action accountability (Act 1 + Act 2) |
| docs/OSS_VS_PAID.md | Open/paid boundary (ratified) |
| docs/NPM_PACKAGING.md | What the npm package ships, and how to verify it locally |

Stack

Express + TypeScript backend · React + Tailwind/shadcn frontend · Drizzle ORM (PostgreSQL, for the durability roadmap) · Ed25519 + SHA-256 + RFC 6962 Merkle.

Platform operational requirements

Magenta Canon also operates as a structural execution-eligibility control plane (binary ALLOWED/BLOCKED gating of software artifacts via computed conformance; epoch-based append-only baselines; non-circumventable, no override flags). The following operational rules are load-bearing for the live deployment and must be preserved.

Crawl & SEO enforcement (non-negotiable)

All public pages MUST comply with the Sovereign HTML Base Template and the Universal Crawl Enforcement Prompt in /docs:

Every page renders complete semantic HTML without JavaScript.
Exactly one <h1> per page.
Required: <main>, <article>, canonical link, JSON-LD.
Head ordering: charset → viewport → title → description → robots → canonical.

Validate by disabling JavaScript and viewing source; if <title>, <meta description>, <link rel="canonical">, exactly one <h1>, or <article> body text is missing, the page is invalid. Build-time check:

npm run crawl:lint

Any PR, agent, or refactor that violates these rules is invalid. See docs/SOVEREIGN_HTML_BASE.html (canonical template) and docs/CRAWL_ENFORCEMENT.md (full rules).
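
As a rough picture of what such a check involves, the sketch below tests a rendered HTML string against the rules listed above. It is not the actual `npm run crawl:lint` implementation: `lintPage` is a hypothetical name, and the regular expressions are a simplification that a real linter would likely replace with an HTML parser.

```typescript
// Returns a list of rule violations for one page of server-rendered HTML.
function lintPage(html: string): string[] {
  const problems: string[] = [];

  const h1Count = (html.match(/<h1[\s>]/gi) ?? []).length;
  if (h1Count !== 1) problems.push(`expected exactly one <h1>, found ${h1Count}`);

  const required: Array<[string, RegExp]> = [
    ["<main>", /<main[\s>]/i],
    ["<article>", /<article[\s>]/i],
    ["<title>", /<title[\s>]/i],
    ["meta description", /<meta[^>]+name=["']description["']/i],
    ["canonical link", /<link[^>]+rel=["']canonical["']/i],
    ["JSON-LD", /<script[^>]+type=["']application\/ld\+json["']/i],
  ];
  for (const [name, re] of required) {
    if (!re.test(html)) problems.push(`missing ${name}`);
  }

  // Head ordering: charset, viewport, title, description, robots, canonical.
  const headOrder: Array<[string, RegExp]> = [
    ["charset", /<meta[^>]+charset=/i],
    ["viewport", /<meta[^>]+name=["']viewport["']/i],
    ["title", /<title[\s>]/i],
    ["description", /<meta[^>]+name=["']description["']/i],
    ["robots", /<meta[^>]+name=["']robots["']/i],
    ["canonical", /<link[^>]+rel=["']canonical["']/i],
  ];
  let lastPos = -1;
  for (const [name, re] of headOrder) {
    const pos = html.search(re);
    if (pos === -1) continue; // presence is reported by the checks above
    if (pos < lastPos) problems.push(`head ordering: ${name} is out of order`);
    lastPos = Math.max(lastPos, pos);
  }
  return problems;
}

const goodPage = [
  "<!doctype html><html><head>",
  '<meta charset="utf-8">',
  '<meta name="viewport" content="width=device-width">',
  "<title>Example</title>",
  '<meta name="description" content="Example page">',
  '<meta name="robots" content="index,follow">',
  '<link rel="canonical" href="https://example.com/">',
  '<script type="application/ld+json">{}</script>',
  "</head><body><main><article><h1>Example</h1><p>Body text.</p></article></main></body></html>",
].join("\n");

const badPage = "<html><head><title>Bad</title></head><body><h1>One</h1><h1>Two</h1></body></html>";

const goodProblems = lintPage(goodPage);
const badProblems = lintPage(badPage);
console.log(goodProblems, badProblems);
```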

Contact

[email protected]