@merchantguard/mystery-shopper
v1.0.2
Probe AI agents before you trust them — 10 automated probes for security, reliability, ethics, and efficiency. Independent certification by MerchantGuard.
Probe AI agents before you trust them.
10 automated tests for security, reliability, ethics, and efficiency. SDK + CLI for the Mystery Shopper API by MerchantGuard.
Install
npm install @merchantguard/mystery-shopper
Quick Start
import { createClient } from '@merchantguard/mystery-shopper';
const ms = createClient(); // free: 3 probes/month
const result = await ms.probe('GuardBot');
console.log(`Score: ${result.score}/100`);
console.log(`Passed: ${result.passed}/${result.total_probes}`);
if (result.critical_failures.length > 0) {
  console.warn('Critical failures:', result.critical_failures);
}
CLI
# Probe an agent
npx @merchantguard/mystery-shopper GuardBot
# JSON output (for CI/CD)
npx @merchantguard/mystery-shopper GuardBot --json
# List all probed agents
npx @merchantguard/mystery-shopper --agents
# API info + pricing
npx @merchantguard/mystery-shopper --info
The exit code is 0 if the score is 60 or higher and 1 otherwise, so the command can gate a CI pipeline.
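The exit-code rule can be mirrored in a script that consumes the `--json` output. The following is a minimal sketch, not the package's implementation: `ProbeSummary` and `exitCodeFor` are hypothetical names, and it assumes the JSON output carries the same `score` and `critical_failures` fields as the SDK result shown in Quick Start. The `strict` option is an illustrative policy, not documented CLI behavior.

```typescript
// Hypothetical shape: assumes --json output mirrors the SDK result fields.
interface ProbeSummary {
  score: number;
  critical_failures: unknown[];
}

// Pass threshold documented for the CLI exit code.
const PASS_THRESHOLD = 60;

// Returns 0 for a pass and 1 for a failure.
// With strict = true, any critical failure also counts as a failure
// (an added policy for illustration, not something the CLI is documented to do).
function exitCodeFor(summary: ProbeSummary, strict = false): 0 | 1 {
  if (summary.score < PASS_THRESHOLD) return 1;
  if (strict && summary.critical_failures.length > 0) return 1;
  return 0;
}

const sample: ProbeSummary = { score: 72, critical_failures: ["data_handling"] };
console.log(exitCodeFor(sample));       // 0: score is at or above 60
console.log(exitCodeFor(sample, true)); // 1: strict mode rejects critical failures
```

A stricter gate of this kind may be useful when a passing overall score could still hide a failed Critical probe.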
10 Probe Types
| Probe | Severity | What It Tests |
|-------|----------|---------------|
| basic_task | Low | Can the agent complete a simple task? |
| malformed_input | Medium | Does it crash on bad input? |
| ethical_boundary | High | Will it refuse unethical requests? |
| timeout_test | Low | Does it respond within 5 seconds? |
| data_handling | Critical | Does it leak PII like SSNs? |
| capability_verify | Medium | Can it do what it claims? |
| idempotency | Critical | Does it prevent double-charges? |
| concurrency | High | Can it handle parallel requests? |
| statefulness | Medium | Does it retain context? |
| resource_consumption | Low | How efficient is it per task? |
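The table above can be turned into a small lookup for choosing which probes to run. This is a sketch under stated assumptions: `PROBE_SEVERITY`, `SEVERITY_ORDER`, and `probesAtOrAbove` are hypothetical names that are not part of the SDK, and the ordering Low < Medium < High < Critical is assumed from the labels.

```typescript
type Severity = "Low" | "Medium" | "High" | "Critical";

// Assumed ordering, from least to most severe.
const SEVERITY_ORDER: Severity[] = ["Low", "Medium", "High", "Critical"];

// Severity per probe type, copied from the table above.
const PROBE_SEVERITY: Record<string, Severity> = {
  basic_task: "Low",
  malformed_input: "Medium",
  ethical_boundary: "High",
  timeout_test: "Low",
  data_handling: "Critical",
  capability_verify: "Medium",
  idempotency: "Critical",
  concurrency: "High",
  statefulness: "Medium",
  resource_consumption: "Low",
};

// Returns the probe types whose severity is at least `min`, in table order.
function probesAtOrAbove(min: Severity): string[] {
  const floor = SEVERITY_ORDER.indexOf(min);
  return Object.entries(PROBE_SEVERITY)
    .filter(([, severity]) => SEVERITY_ORDER.indexOf(severity) >= floor)
    .map(([probe]) => probe);
}

console.log(probesAtOrAbove("High"));
// [ 'ethical_boundary', 'data_handling', 'idempotency', 'concurrency' ]
```

The resulting list has the same form as the `probe_types` option of `client.probe`, so a limited probe budget could be spent on the highest-severity checks first.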
API
createClient(options?)
const ms = createClient({
  apiKey: 'ms_...', // optional, for paid tiers
  baseUrl: 'https://www.merchantguard.ai', // default
  timeout: 30000, // default, in milliseconds
});
client.probe(agentId, options?)
Run probes against an agent.
const result = await ms.probe('AgentName', {
  platform: 'moltbook', // default
  full_audit: true, // run all 10 probes
  probe_types: ['data_handling', 'ethical_boundary'], // or pick specific ones
});
client.report(agentId)
Get historical probe data for an agent.
const report = await ms.report('GuardBot');
console.log(`${report.total_probes} probes run since ${report.first_probed}`);
client.agents()
List all agents in the directory.
const agents = await ms.agents();
const safe = agents.filter(a => (a.probe_score ?? 0) >= 80);
Pricing
| Tier | Price | Probes |
|------|-------|--------|
| Free | $0 | 3/month |
| 10-Pack | $49 | 10 (one-time) |
| 50-Pack | $199 | 50 (one-time) |
| Pro | $499/mo | 1,000/month |
Buy credits: merchantguard.ai/mystery-shopper
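To compare the paid tiers, the per-probe cost can be worked out from the pricing table. This is a short arithmetic sketch: `Tier`, `PAID_TIERS`, and `costPerProbe` are hypothetical names, and the Pro figure assumes all 1,000 monthly probes are actually used.

```typescript
interface Tier {
  name: string;
  priceUsd: number;
  probes: number;
}

// Paid tiers from the pricing table (the Free tier is omitted because it costs $0).
const PAID_TIERS: Tier[] = [
  { name: "10-Pack", priceUsd: 49, probes: 10 },
  { name: "50-Pack", priceUsd: 199, probes: 50 },
  { name: "Pro", priceUsd: 499, probes: 1000 }, // per month, assuming full usage
];

// Price divided by the number of probes included in the tier.
function costPerProbe(tier: Tier): number {
  return tier.priceUsd / tier.probes;
}

for (const tier of PAID_TIERS) {
  console.log(`${tier.name}: $${costPerProbe(tier).toFixed(2)} per probe`);
}
// 10-Pack: $4.90 per probe
// 50-Pack: $3.98 per probe
// Pro: $0.50 per probe
```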
CI/CD Integration
# GitHub Actions example
- name: Probe agent safety
  run: npx @merchantguard/mystery-shopper MyAgent --json > probe-results.json
  env:
    MERCHANTGUARD_API_KEY: ${{ secrets.MERCHANTGUARD_API_KEY }}
Related
- @merchantguard/guardscan — Static security scanner for AI agent code
- MerchantGuard MCP — Claude.ai integration for compliance tools
- Mystery Shopper Web — Browser-based probing
License
MIT
