taintctl
v0.0.0
Published
Content-aware provenance layer for Claude Agent SDK. Detects dangerous content at every tool I/O boundary, propagates classification across sub-agent dispatch, and enforces fail-closed policy.
Downloads
137
Maintainers
Readme
taintctl
Content-aware provenance layer for Claude Agent SDK and other agent frameworks. Detects dangerous content at every tool I/O boundary, propagates classification across sub-agent dispatch, and enforces policy with a fail-closed default.
Status: pre-alpha. Design phase. No runnable code yet.
Why this exists
Agent systems built on Claude Agent SDK (and similar orchestrator-worker patterns like LangGraph, CrewAI) recognize dangerous content unevenly across calls. When agent A reads .env and dispatches a sub-agent with that data in its prompt, sub-agent B has no signal that the data was already classified as sensitive.
Existing guardrails operate at single-LLM-call granularity and do not propagate classification state across sub-agent dispatch boundaries. Existing MCP scanners operate on static descriptions and trust-on-first-use, not runtime data flow.
taintctl fills that gap.
What's different
| Tool | What it does | Cross-subagent provenance? | Live visualization? |
|---|---|:-:|:-:|
| mcp-scan | Static MCP description scanning | ❌ | ❌ |
| mcp-context-protector | Trust-on-first-use config pinning | ❌ | ❌ |
| Lakera Guard / NeMo / guardrails-ai | Single-call content classification | ❌ | ❌ |
| Claude Code permission system | Syntactic allow/deny prompts | ❌ | ❌ |
| taintctl | Content classification + cross-subagent ledger + flow graph UI | ✅ | ✅ (Stage 3) |
Roadmap
| Stage | Weeks | Deliverable | |---|---|---| | 0 (Pre-code) | Week 0 | Verify SDK hook coverage, name reserved, baseline benchmarks captured | | 1 | Weeks 1-4 | Claude Agent SDK middleware + content classifiers + policy engine + terminal UI | | 2 | Weeks 5-7 | Cross-subagent provenance (channel-a fingerprint + channel-b prompt injection) | | 3 | Weeks 8-12 | Static SPA flow-graph UI + README screencast |
Full design: docs/design-2026-05-20.md
Active tasks: TODO.md
Limitations (acknowledged, not hidden)
- v1 only handles verbatim taint flow. When an LLM paraphrases sensitive data, channel-a (sha256 fingerprint) breaks. Channel-b (system-prompt warnings to sub-agents) is a partial mitigation but its effectiveness is an empirical question, not a guarantee.
- v1 only ships a TypeScript adapter for Claude Agent SDK. Python adapter, LangGraph, AutoGen, CrewAI, OpenClaw, Hermes are v1.1+.
- v1 prompt-injection detection is pattern-based. Paraphrased prompt injections will be missed. Documented as known gap, not silently broken.
- Not a defense against a malicious parent agent. Standard guardrail assumption: the agent we sit inside is honest-but-naive, not adversary-controlled.
Validation
- AgentDojo prompt-injection-marker subset: Stage 1 gate is recall ≥ 0.65 (deterministic detector floor)
- InjecAgent: baseline numbers in CI on every PR
- Multi-agent scenarios: 8-12 in
benchmarks/multiagent/, derived from a fork ofdamn-vulnerable-MCP-server
License
MIT — see LICENSE
Related work
This is the author's second project in MCP/agent security. The first is
MCP-Security-Framework, which scans MCP servers
for vulnerable patterns. The two projects are complementary: MCP-Security-Framework is a
static scanner; taintctl is a runtime provenance layer.
