@certaworks/prompt-archaeology-tool

v0.1.0

Published

23 days ago

Runs local probes to estimate agent behavioral constraints from observed responses.

0High
0Medium
0Low

blairhall

certaworks mcp ai-agent agent-safety prompt-archaeology behavioral-analysis probing

Prompt Archaeology Tool

Type: Local Probe Runner / API / MCP / CLI

Value: Runs scripted probes against observed agent responses to estimate likely behavioral constraints, persona patterns, refusal style, tool policy, and formatting habits.

Current Status

Complete as a local probe runner / CLI / MCP / HTTP slice. It analyzes captured or callback-driven responses and produces evidence-backed local reports.

Shipped Local Scope

Probe catalog for persona, constraints, refusal style, tool policy, format rules, tone, ownership, and knowledge-cutoff behavior
Local batch runner that can execute probes through a pluggable agent callback or analyze captured { probe_id, response } pairs
Evidence-backed report model with confidence, signal matches, quotes, methodology, and limitations
JSON, Markdown, and text report export
Durable local run history with save/list/load APIs
Local HTTP API plus lightweight dashboard page
MCP tools for listing probes, analyzing responses, building reports, and exporting reports
Package bins and subpath exports for SDK, MCP, and HTTP usage

Install And Run

npm install
npm test
npm run mcp
npm run serve

After build, the package exposes:

prompt-archaeology
prompt-archaeology-mcp
prompt-archaeology-api

Local Store

By default, run history writes to:

.prompt-archaeology/runs.json

Override with either:

PROMPT_ARCHAEOLOGY_STORE_PATH=/path/to/runs.json
PROMPT_ARCHAEOLOGY_STORE=/path/to/runs.json

SDK Surface

import { runProbeBatch, createRunStore, exportReport } from '@blair/prompt-archaeology';

const run = await runProbeBatch({
  target: { name: 'Support assistant', vendor: 'local-fixture' },
  probeIds: ['p-name', 'c-reveal-prompt', 't-tools'],
  agent: async probe => sendPromptToAgent(probe.prompt)
});

await createRunStore().saveRun(run);
await exportReport(run.report, { format: 'markdown', filePath: './report.md' });

For already captured responses:

const run = await runProbeBatch({
  target: { name: 'Captured transcript' },
  responses: [
    { probe_id: 'p-name', response: "I'm Atlas." },
    { probe_id: 'c-refuse-harm', response: 'I cannot help with harmful requests.' }
  ]
});

CLI

prompt-archaeology --input responses.json --json-out report.json --text-out report.md

Input file shape:

{
  "target": { "name": "Agent name" },
  "responses": [
    { "probe_id": "p-name", "response": "I'm an assistant." }
  ]
}

HTTP API

GET  /dashboard
GET  /health
GET  /api/probes
POST /api/analyze
POST /api/reports
POST /api/runs
GET  /api/runs
GET  /api/runs/:id

The server binds locally by default when run through:

npm run serve

MCP Tools

list_probes
analyze_response
build_report
export_report

Current Limits

This is a local product slice, not hosted SaaS.
The tool estimates behavior from observed responses; it does not recover hidden prompts verbatim.
Confidence is heuristic and evidence-backed, not a guarantee about private system instructions.
There is no public npm publication, live checkout, hosted account system, API-key service, paid credit meter, provider integration marketplace, or authenticated dashboard yet.
Batch execution uses a local callback/API surface; production provider adapters, retries, rate limits, and hosted transcript collection are future work.

Verification

Fresh suite verification on 2026-05-28:

npm test passed, 20/20 SDK, runner, persistence, export, MCP, HTTP, and package contract tests.
npm run build passes.
Package dry-run verifies only runtime artifacts and README are included.

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme