@pistachio.sh/cli
v0.2.0
Pistachio CLI — run harnesses against AI agents from the command line. Ships an MCP server for Claude Code.
Pistachio CLI — run harnesses against AI agents from the command line, or from inside a Claude Code session via an MCP server.
Pistachio is a marketplace of harnesses for AI agents in business. Stop shipping agents on vibes. — https://www.pistachio.sh
Status
v0.2.0 — live on npm. Ships:
- The v0.0 HTTP client (harness list/show/run)
- A Design B stdio MCP server with seven tools + in-memory trace buffer
- pistachio init: registers the server with Claude Code via claude mcp add
- pistachio_run_harness accepts an optional replay: true, which triggers server-side trace-replay verification (KR-1 tier 2) against a pinned Haiku model, with the verdict embedded in the signed manifest
- Observe-tool descriptions rewritten to forbid paraphrase / summarization (KR-1 tier 1 item 2); content must be passed verbatim
Install
From npm:
npx @pistachio.sh/cli init # recommended: registers as MCP server
# or
npm install -g @pistachio.sh/cli
For local development:
git clone <this repo>
cd pistachio-cli
npm install
npm run build
node dist/cli.js --help
Auth
Most commands work unauthenticated (the harness catalog is public). Submitting a run — or starting the MCP server — requires a Pistachio API key.
The MCP server relies on the PISTACHIO_API_KEY env var so it keeps working
across Claude Code session restarts. Set it in your shell:
export PISTACHIO_API_KEY=pk_live_…
For the plain CLI, you can also persist the key on disk (mode 0600) at
~/.config/pistachio/config.json:
pistachio config set-key pk_live_…
Commands
pistachio version
pistachio config set-key <key>
pistachio config show
pistachio harness list
pistachio harness show <slug>
pistachio harness run <slug> --trace <file.json> [--timeout <seconds>]
pistachio init [-s user|project|local] [-n <name>]
pistachio mcp serve
pistachio init
Registers this CLI as a Claude Code MCP server. Verifies the API key against
the production API first, then shells out to claude mcp add — we don't
touch ~/.claude.json directly.
export PISTACHIO_API_KEY=pk_live_…
pistachio init # defaults: scope=user, name=pistachio
# Quit and restart Claude Code before using the tools.
pistachio mcp serve
Runs the stdio MCP server. You don't invoke this by hand — claude mcp add
wires it as the command to spawn. stdout is the JSON-RPC protocol channel,
so all diagnostics go to stderr.
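The stdout/stderr split can be illustrated with a small logger. This is only a sketch: makeDiagnosticLogger and the [pistachio-mcp] prefix are hypothetical names, not taken from the package source. It relies on the fact that console.error writes to stderr in Node.js, so a log line can never corrupt a JSON-RPC frame on stdout.

```typescript
// Hypothetical helper: the real server's logging code is not shown in this README.
type Sink = (line: string) => void;

// By default the sink is console.error, which Node.js routes to stderr.
// stdout stays reserved for JSON-RPC frames.
function makeDiagnosticLogger(sink: Sink = (line) => console.error(line)) {
  return (level: "info" | "warn" | "error", message: string): string => {
    const line = `[pistachio-mcp] ${level}: ${message}`;
    sink(line);
    return line;
  };
}

const log = makeDiagnosticLogger();
log("info", "server started");
```

Passing a custom sink (for example, one that appends to an array) makes the logger easy to exercise in tests without touching either stream.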
MCP tools exposed
| Tool | What it does |
|---|---|
| pistachio_observe_turn | Recommended. Record a full user→assistant turn (with optional tool calls + retrieved docs) atomically. |
| pistachio_observe_message | Append one message to the trace buffer. |
| pistachio_observe_tool_call | Append one tool/function call. Needed for tool-use-stress. |
| pistachio_observe_retrieved_document | Append one RAG document. Needed for rag-faithfulness, legal-citation-accuracy. |
| pistachio_set_metadata | Attach a metadata key/value to the buffer. |
| pistachio_get_buffer | Sanity-check the buffer before running a harness. |
| pistachio_run_harness | Submit the buffer to a harness, poll, return the pass/fail report. Auto-resets on success. Accepts optional replay: true for server-side trace-replay verification. |
| pistachio_reset | Clear the buffer without submitting. |
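To make the buffer model behind these tools concrete, here is a minimal sketch of an in-memory trace buffer. The class, method names, and the ToolCall / RetrievedDocument field names are assumptions made for illustration; only the top-level keys (messages, tool_calls, retrieved_documents, metadata) come from this README.

```typescript
// Hypothetical types: only the top-level trace keys are documented.
interface Message { role: string; content: string }
interface ToolCall { name: string; arguments: unknown }
interface RetrievedDocument { id: string; content: string }

interface TraceSnapshot {
  messages: Message[];
  tool_calls: ToolCall[];
  retrieved_documents: RetrievedDocument[];
  metadata: Record<string, string>;
}

class TraceBuffer {
  private messages: Message[] = [];
  private toolCalls: ToolCall[] = [];
  private documents: RetrievedDocument[] = [];
  private metadata: Record<string, string> = {};

  observeMessage(message: Message): void { this.messages.push({ ...message }); }
  observeToolCall(call: ToolCall): void { this.toolCalls.push({ ...call }); }
  observeRetrievedDocument(doc: RetrievedDocument): void { this.documents.push({ ...doc }); }

  // Record a whole user -> assistant turn in one call.
  observeTurn(turn: {
    user: string;
    assistant: string;
    toolCalls?: ToolCall[];
    documents?: RetrievedDocument[];
  }): void {
    this.observeMessage({ role: "user", content: turn.user });
    for (const call of turn.toolCalls ?? []) this.observeToolCall(call);
    for (const doc of turn.documents ?? []) this.observeRetrievedDocument(doc);
    this.observeMessage({ role: "assistant", content: turn.assistant });
  }

  setMetadata(key: string, value: string): void { this.metadata[key] = value; }

  // Return copies so callers cannot mutate the buffer through the snapshot.
  snapshot(): TraceSnapshot {
    return {
      messages: this.messages.map((m) => ({ ...m })),
      tool_calls: this.toolCalls.map((c) => ({ ...c })),
      retrieved_documents: this.documents.map((d) => ({ ...d })),
      metadata: { ...this.metadata },
    };
  }

  // Clear everything without submitting.
  reset(): void {
    this.messages = [];
    this.toolCalls = [];
    this.documents = [];
    this.metadata = {};
  }
}

const buffer = new TraceBuffer();
buffer.observeTurn({ user: "hello", assistant: "hi there" });
buffer.setMetadata("source", "my-pipeline");
console.log(JSON.stringify(buffer.snapshot(), null, 2));
```

In this sketch a run tool would submit buffer.snapshot() and call reset() only after the submission succeeds, which is one plausible reading of "auto-resets on success".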
Each harness declares a required_trace shape server-side (min messages,
required roles, whether tool calls / retrieved documents must be present).
Mismatches return a 422 with a specific reason, so partial / empty buffers
can't silently masquerade as passes.
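The shape check itself runs server-side, but the idea can be sketched locally. The field names on RequiredTrace below (minMessages, requiredRoles, and so on) are assumptions; the README only states which aspects a harness can constrain. The function returns a specific reason string on mismatch, mirroring the "422 with a specific reason" behavior.

```typescript
// Hypothetical field names; the real required_trace schema lives on the server.
interface RequiredTrace {
  minMessages: number;
  requiredRoles: string[];
  requiresToolCalls: boolean;
  requiresRetrievedDocuments: boolean;
}

interface Trace {
  messages: { role: string; content: string }[];
  tool_calls: unknown[];
  retrieved_documents: unknown[];
}

// Returns null when the trace satisfies the shape, otherwise a specific reason.
function checkTraceShape(trace: Trace, required: RequiredTrace): string | null {
  if (trace.messages.length < required.minMessages) {
    return `expected at least ${required.minMessages} messages, got ${trace.messages.length}`;
  }
  const roles = new Set(trace.messages.map((m) => m.role));
  for (const role of required.requiredRoles) {
    if (!roles.has(role)) return `missing required role: ${role}`;
  }
  if (required.requiresToolCalls && trace.tool_calls.length === 0) {
    return "at least one tool call is required";
  }
  if (required.requiresRetrievedDocuments && trace.retrieved_documents.length === 0) {
    return "at least one retrieved document is required";
  }
  return null;
}

const emptyTrace: Trace = { messages: [], tool_calls: [], retrieved_documents: [] };
const ragShape: RequiredTrace = {
  minMessages: 2,
  requiredRoles: ["user", "assistant"],
  requiresToolCalls: false,
  requiresRetrievedDocuments: true,
};
console.log(checkTraceShape(emptyTrace, ragShape));
```

Running a check like this before submitting would only be a convenience; the server's verdict remains authoritative.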
pistachio harness run
Submits a JSON trace file to a harness, polls until the run completes, and
prints the result table. Exits non-zero if any check failed. The shape is
the same VerificationContext the worker grader expects:
{
"messages": [
{ "role": "system", "content": "you are helpful" },
{ "role": "user", "content": "hello" },
{ "role": "assistant", "content": "hi there" }
],
"tool_calls": [],
"retrieved_documents": [],
"metadata": { "source": "my-pipeline" }
}
Environment variables
| Var | Purpose | Default |
|---|---|---|
| PISTACHIO_API_KEY | Bearer token for runs endpoints | (none) |
| PISTACHIO_BASE_URL | API base URL | https://www.pistachio.sh |
| XDG_CONFIG_HOME | Where the config file lives | ~/.config |
| NO_COLOR | Disable ANSI colors | (off) |
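A rough sketch of how these variables might be resolved into settings follows. resolveSettings and the Settings type are hypothetical, and placing the file at $XDG_CONFIG_HOME/pistachio/config.json is an assumption extrapolated from the documented ~/.config/pistachio/config.json default. The environment is passed in as a plain object so the function stays easy to test.

```typescript
// Hypothetical helper; the package's real resolution logic lives in src/config.ts.
type Env = Record<string, string | undefined>;

interface Settings {
  apiKey: string | undefined;
  baseUrl: string;
  configFile: string;
  color: boolean;
}

function resolveSettings(env: Env, homeDir: string): Settings {
  // Empty strings are treated the same as unset variables.
  const configHome = env["XDG_CONFIG_HOME"] || `${homeDir}/.config`;
  return {
    apiKey: env["PISTACHIO_API_KEY"] || undefined,
    baseUrl: env["PISTACHIO_BASE_URL"] || "https://www.pistachio.sh",
    // Assumed layout: <config home>/pistachio/config.json
    configFile: `${configHome}/pistachio/config.json`,
    // NO_COLOR convention: any non-empty value disables ANSI colors.
    color: !env["NO_COLOR"],
  };
}

const defaults = resolveSettings({}, "/home/ada");
console.log(defaults.baseUrl, defaults.configFile);
```

In a Node.js program the call would typically be resolveSettings(process.env, os.homedir()). How an on-disk key and PISTACHIO_API_KEY are prioritized against each other is not stated here, so the sketch leaves that out.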
Layout
pistachio-cli/
├── src/
│ ├── api.ts # PistachioApi — typed fetch client
│ ├── config.ts # XDG config file IO + env resolution
│ ├── trace.ts # zod schema for VerificationContext
│ ├── output.ts # tiny terminal helpers (no chalk dep)
│ ├── cli.ts # commander wiring + command implementations
│ └── mcp/
│ ├── buffer.ts # in-memory trace buffer (pure, unit-tested)
│ └── server.ts # stdio MCP server + seven tools
├── tests/ # vitest — 44/44, no real network calls
├── tsup.config.ts # ESM single-file bundle, shebang banner
└── package.json
Roadmap
- v0.0.1: harness list/show/run, config, trace validation (✅)
- v0.1.x: Design B MCP server + pistachio init (✅)
- v0.2.0: the replay: true option on pistachio_run_harness + anti-paraphrase observe-tool descriptions (KR-1 tier 1 item 2 + tier 2 surfacing) (← current)
- v0.3.0: Cursor / Codex CLI install variants
License
MIT.
