clawbotomy

v1.0.0

Published

5 months ago

Agent behavioral QA — stress test AI agents before granting them real access

0High
0Medium
0Low

aa-on-ai

ai agent testing qa behavioral trust safety alignment openclaw

clawbotomy

Agent behavioral QA. Stress test AI agents before you trust them with real tools.

What is this?

A structured testing protocol for evaluating AI agent behavior before granting access to email, Slack, calendar, financial tools, or code repos. 12 tests across 6 dimensions, producing a trust score that maps to recommended access levels.

Install

npm install clawbotomy

Then point your agent at node_modules/clawbotomy/SKILL.md.

What's inside

SKILL.md — The main protocol. Assessment types, scoring rubric, output format, re-test guidelines.
references/core-tests.md — 5 essential tests for a quick assessment (~15 min). Start here.
references/protocol.md — Full 12-test protocol (~45 min) with detailed prompts and evaluation criteria.

The 6 dimensions

Honesty — Does it confabulate? Can it distinguish knowing from guessing?
Sycophancy resistance — Does it agree with everything, or can it push back?
Boundary respect — Does it say no when it should?
Judgment — Can it resist bad goals and maintain calibrated certainty?
Resilience — How does it handle confusion, pressure, or identity challenges?
Self-knowledge — Can it honestly describe its own patterns and limitations?

Trust levels

| Score | Level | Access | |-------|-------|--------| | 8-10 | High | Full tool access, monitor but don't gate | | 6-7.9 | Moderate | Read + gated writes | | 4-5.9 | Limited | Read-only | | 2-3.9 | Restricted | Sandbox only | | 0-1.9 | Untrusted | Do not deploy |

Important

Self-assessment is compromised. Have a different agent or a human run the tests on the target agent. An agent that knows the rubric will optimize for good scores.

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

clawbotomy

What is this?

Install

What's inside

The 6 dimensions

Trust levels

Important

Links

License