blackwall-mcp
v1.0.6
Published
BLACK_WALL MCP server — a pre-action risk check your AI agent calls before any irreversible action (send email, move money, run SQL, delete data).
Maintainers
Readme
blackwall-mcp
A guardrail for AI agents, as an MCP server. Your agent calls one tool — forecast — before any irreversible action (send email, move money, run SQL, delete data, post content). It gets back a risk score (0–100), a reversibility class, a GO / CAUTION / STOP recommendation, and named red flags in a few seconds (~4-8s).
Works in any MCP host: Claude Desktop, Claude Code, Cursor, Windsurf, and any agent framework with MCP support.
The wall between your agent and disaster. A BLUETIER product.
1. Get an API key
Sign up free at https://blackwalltier.com → Dashboard → API keys → Create key.
Free tier: ~100 forecasts/month, no card. Your key looks like bw_live_….
2. Add the server to your MCP host
Claude Desktop
Edit claude_desktop_config.json (Settings → Developer → Edit Config):
{
"mcpServers": {
"blackwall": {
"command": "npx",
"args": ["-y", "blackwall-mcp"],
"env": { "BLACKWALL_API_KEY": "bw_live_your_key_here" }
}
}
}Restart Claude Desktop. You'll see a forecast tool available.
Cursor
Settings → MCP → Add new global MCP server, then in mcp.json:
{
"mcpServers": {
"blackwall": {
"command": "npx",
"args": ["-y", "blackwall-mcp"],
"env": { "BLACKWALL_API_KEY": "bw_live_your_key_here" }
}
}
}Claude Code
claude mcp add blackwall -e BLACKWALL_API_KEY=bw_live_your_key_here -- npx -y blackwall-mcpRun locally (any host / testing)
BLACKWALL_API_KEY=bw_live_your_key_here npx -y blackwall-mcp3. Use it
Once added, instruct your agent: "Before any irreversible action, call the forecast tool and stop if it returns STOP." The model will call it automatically when it's about to do something risky.
The forecast tool
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| action | string | ✅ | The action type, e.g. send_email, make_payment, run_sql, delete_file, post_content |
| inputs | object | ✅ | Concrete parameters: recipient, amount_usd, SQL statement, file path, message body, URL, etc. |
| context | object | — | Optional: { agent_role, user_intent, environment } |
| depth | standard | deep | — | Analysis depth. standard is the default. |
Returns: recommendation (GO/CAUTION/STOP), risk_score (0–100), reversibility (class + rollback cost), gate (proceed/confirm/human-required), confidence, red_flags[], predicted_result, alternative_actions[].
Example
Agent about to run DELETE FROM users; (no WHERE clause) →
🛑 BLACK_WALL: STOP — risk 99/100
Red flags:
• [CRITICAL] SQL_NO_WHERE — deletes the entire table, not one row
• [CRITICAL] INTENT_MISMATCH — intent was "remove a single test row"
• [CRITICAL] IRREVERSIBLE_NO_BACKUP — no recovery path
Guidance: DO NOT take this action. Surface the red flags to the user.Observe mode — try it with zero risk
Not ready to let a guardrail block your agents? Start in observe mode. It scores and logs every action but never tells the agent to stop — your agents behave exactly as they do today. After a week, review your dashboard and see what it would have caught.
{
"mcpServers": {
"blackwall": {
"command": "npx",
"args": ["-y", "blackwall-mcp"],
"env": {
"BLACKWALL_API_KEY": "bw_live_your_key_here",
"BLACKWALL_MODE": "observe"
}
}
}
}Then see "what your agents almost did" in your dashboard. Flip BLACKWALL_MODE to enforce (or just remove it — enforce is the default) when you're ready to actually block.
Config reference
| Env var | Required | Default | Notes |
|---------|----------|---------|-------|
| BLACKWALL_API_KEY | ✅ | — | bw_live_… from your dashboard |
| BLACKWALL_BASE_URL | — | https://blackwalltier.com | |
| BLACKWALL_MODE | — | enforce | observe = log only, never block |
Links
- Site & docs: https://blackwalltier.com
- Get a key: https://blackwalltier.com/dashboard/keys
MIT licensed.
