@pistachio.sh/cli

v0.2.0

Published

a month ago

Pistachio CLI — run harnesses against AI agents from the command line. Ships an MCP server for Claude Code.


sodiumsun

`@pistachio.sh/cli`

Pistachio CLI — run harnesses against AI agents from the command line, or from inside a Claude Code session via an MCP server.

Pistachio is a marketplace of harnesses for AI agents in business. Stop shipping agents on vibes. — https://www.pistachio.sh

Status

v0.2.0 — live on npm. Ships:

The v0.0 HTTP client (harness list/show/run)
A Design B stdio MCP server with seven tools + in-memory trace buffer
pistachio init — registers the server with Claude Code via claude mcp add
pistachio_run_harness accepts an optional replay: true — triggers server-side trace-replay verification (KR-1 tier 2) against a pinned Haiku model, with the verdict embedded in the signed manifest
Observe-tool descriptions rewritten to forbid paraphrase / summarization (KR-1 tier 1 item 2) — content must be passed verbatim

Install

From npm:

npx @pistachio.sh/cli init          # recommended — registers as MCP server
# or
npm install -g @pistachio.sh/cli

From source (local dev):

git clone <this repo>
cd pistachio-cli
npm install
npm run build
node dist/cli.js --help

Auth

Most commands work unauthenticated (the harness catalog is public). Submitting a run — or starting the MCP server — requires a Pistachio API key.

The MCP server relies on the PISTACHIO_API_KEY env var so it keeps working across Claude Code session restarts. Set it in your shell:

export PISTACHIO_API_KEY=pk_live_…

For the plain CLI, you can also persist the key on disk (mode 0600) at ~/.config/pistachio/config.json:

pistachio config set-key pk_live_…
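The two key sources described above could be combined roughly as follows. This is a minimal sketch, not the CLI's actual code: the precedence (environment variable first, config file second) and the `apiKey` field name in the config file are assumptions, and `resolveApiKey` is a hypothetical helper.

```typescript
// Hypothetical shape of the on-disk config; the real field name may differ.
interface ConfigFile {
  apiKey?: string;
}

type Env = Record<string, string | undefined>;

// Assumed precedence: PISTACHIO_API_KEY wins, then the persisted config.
// `configJson` is the raw file contents, or undefined if the file is absent.
function resolveApiKey(env: Env, configJson: string | undefined): string | undefined {
  const fromEnv = env.PISTACHIO_API_KEY?.trim();
  if (fromEnv) return fromEnv;
  if (configJson === undefined) return undefined;
  try {
    const parsed = JSON.parse(configJson) as ConfigFile;
    return typeof parsed.apiKey === "string" && parsed.apiKey.length > 0
      ? parsed.apiKey
      : undefined;
  } catch {
    // Malformed or non-object JSON is treated the same as "no key stored".
    return undefined;
  }
}
```

Passing the environment and file contents in as arguments keeps the function free of I/O, which makes it easy to unit-test.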

Commands

pistachio version
pistachio config set-key <key>
pistachio config show

pistachio harness list
pistachio harness show <slug>
pistachio harness run <slug> --trace <file.json> [--timeout <seconds>]

pistachio init [-s user|project|local] [-n <name>]
pistachio mcp serve

`pistachio init`

Registers this CLI as a Claude Code MCP server. Verifies the API key against the production API first, then shells out to claude mcp add — we don't touch ~/.claude.json directly.

export PISTACHIO_API_KEY=pk_live_…
pistachio init        # defaults: scope=user, name=pistachio
# Quit and restart Claude Code before using the tools.
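As an illustration of the "shell out to claude mcp add" step, the argument vector might be assembled like this. The defaults (scope `user`, name `pistachio`) come from this README; the exact flags and ordering the CLI passes are an assumption, and `buildClaudeAddArgs` is a hypothetical helper.

```typescript
type Scope = "user" | "project" | "local";

interface InitOptions {
  scope?: Scope; // -s, defaults to "user"
  name?: string; // -n, defaults to "pistachio"
}

// Builds the arguments for `claude mcp add`. The flag layout is an
// assumption for illustration; only the defaults are taken from the README.
function buildClaudeAddArgs(opts: InitOptions = {}): string[] {
  const scope = opts.scope ?? "user";
  const name = opts.name ?? "pistachio";
  // Everything after "--" is the command Claude Code spawns for the server.
  return ["mcp", "add", "--scope", scope, name, "--", "pistachio", "mcp", "serve"];
}
```

The resulting array could then be handed to a process spawner such as Node's `child_process.spawn("claude", args)`, which avoids shell quoting issues.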

`pistachio mcp serve`

Runs the stdio MCP server. You don't invoke this by hand — claude mcp add wires it as the command to spawn. stdout is the JSON-RPC protocol channel, so all diagnostics go to stderr.
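The stdout/stderr split matters because any stray print on stdout would corrupt the protocol stream. A minimal sketch of that discipline, assuming newline-delimited JSON-RPC framing as used by the MCP stdio transport (the `frame` and `logDiagnostic` names and the log prefix are hypothetical):

```typescript
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result: unknown;
}

// Serializes one protocol message as a single line of JSON.
// JSON.stringify escapes embedded newlines, so the only raw "\n" is the terminator.
function frame(message: JsonRpcResponse): string {
  return JSON.stringify(message) + "\n";
}

// Diagnostics go through console.error, which writes to stderr in Node.js,
// leaving stdout free for protocol messages only.
function logDiagnostic(text: string): void {
  console.error(`[pistachio-mcp] ${text}`);
}

const line = frame({ jsonrpc: "2.0", id: 1, result: { note: "line one\nline two" } });
logDiagnostic(`framed ${line.length} characters`);
```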

MCP tools exposed

| Tool | What it does |
|---|---|
| pistachio_observe_turn | Recommended. Record a full user→assistant turn (with optional tool calls + retrieved docs) atomically. |
| pistachio_observe_message | Append one message to the trace buffer. |
| pistachio_observe_tool_call | Append one tool/function call. Needed for tool-use-stress. |
| pistachio_observe_retrieved_document | Append one RAG document. Needed for rag-faithfulness, legal-citation-accuracy. |
| pistachio_set_metadata | Attach a metadata key/value to the buffer. |
| pistachio_get_buffer | Sanity-check the buffer before running a harness. |
| pistachio_run_harness | Submit the buffer to a harness, poll, return the pass/fail report. Auto-resets on success. Accepts optional replay: true for server-side trace-replay verification. |
| pistachio_reset | Clear the buffer without submitting. |
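To make the buffer semantics concrete, here is a rough sketch of an in-memory trace buffer with pure update functions. It is not the code in `src/mcp/buffer.ts`; the function names and signatures are hypothetical, and only the top-level field names are taken from the trace format documented in this README.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

interface TraceBuffer {
  messages: Message[];
  tool_calls: unknown[];
  retrieved_documents: unknown[];
  metadata: Record<string, string>;
}

function emptyBuffer(): TraceBuffer {
  return { messages: [], tool_calls: [], retrieved_documents: [], metadata: {} };
}

// Records a whole user→assistant turn in one step. Content is stored verbatim,
// and the input buffer is left untouched (a new buffer is returned).
function observeTurn(
  buf: TraceBuffer,
  user: string,
  assistant: string,
  toolCalls: unknown[] = [],
  docs: unknown[] = [],
): TraceBuffer {
  return {
    messages: [
      ...buf.messages,
      { role: "user", content: user },
      { role: "assistant", content: assistant },
    ],
    tool_calls: [...buf.tool_calls, ...toolCalls],
    retrieved_documents: [...buf.retrieved_documents, ...docs],
    metadata: { ...buf.metadata },
  };
}

// Attaches one metadata key/value, again without mutating the input.
function setMetadata(buf: TraceBuffer, key: string, value: string): TraceBuffer {
  return { ...buf, metadata: { ...buf.metadata, [key]: value } };
}
```

In this sketch a reset is simply a fresh `emptyBuffer()`, which mirrors the auto-reset after a successful run.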

Each harness declares a required_trace shape server-side (min messages, required roles, whether tool calls / retrieved documents must be present). Mismatches return a 422 with a specific reason, so partial / empty buffers can't silently masquerade as passes.
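The server-side shape check can be approximated locally before submitting. The sketch below is only an illustration: the `RequiredTrace` field names are hypothetical (the README names the constraints but not their wire format), and the real validation happens on the server.

```typescript
interface Trace {
  messages: { role: string; content: string }[];
  tool_calls: unknown[];
  retrieved_documents: unknown[];
}

// Hypothetical field names for a harness's required_trace declaration.
interface RequiredTrace {
  minMessages: number;
  requiredRoles: string[];
  needsToolCalls: boolean;
  needsRetrievedDocuments: boolean;
}

// Returns a specific reason for the first mismatch found, or null if the trace fits.
function mismatchReason(trace: Trace, req: RequiredTrace): string | null {
  if (trace.messages.length < req.minMessages) {
    return `expected at least ${req.minMessages} messages, got ${trace.messages.length}`;
  }
  const present = new Set(trace.messages.map((m) => m.role));
  for (const role of req.requiredRoles) {
    if (!present.has(role)) return `missing required role: ${role}`;
  }
  if (req.needsToolCalls && trace.tool_calls.length === 0) {
    return "harness requires at least one tool call";
  }
  if (req.needsRetrievedDocuments && trace.retrieved_documents.length === 0) {
    return "harness requires at least one retrieved document";
  }
  return null;
}
```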

`pistachio harness run`

Submits a JSON trace file to a harness, polls until the run completes, and prints the result table. Exits non-zero if any check failed. The shape is the same VerificationContext the worker grader expects:

{
  "messages": [
    { "role": "system", "content": "you are helpful" },
    { "role": "user", "content": "hello" },
    { "role": "assistant", "content": "hi there" }
  ],
  "tool_calls": [],
  "retrieved_documents": [],
  "metadata": { "source": "my-pipeline" }
}
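The submit, poll, and exit-code behaviour described above can be sketched as a small loop. This is an assumption-laden illustration rather than the CLI's implementation: the status values, the `RunState` shape, and the injected `fetchRun` callback are all hypothetical stand-ins for the real API client.

```typescript
type RunStatus = "pending" | "running" | "completed";

interface Check {
  name: string;
  passed: boolean;
}

interface RunState {
  status: RunStatus;
  checks: Check[];
}

// Polls `fetchRun` until the run completes or the timeout elapses.
async function pollUntilDone(
  fetchRun: () => Promise<RunState>,
  timeoutMs: number,
  intervalMs = 1000,
): Promise<RunState> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const run = await fetchRun();
    if (run.status === "completed") return run;
    if (Date.now() >= deadline) throw new Error("timed out waiting for run");
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Maps the report to a process exit code: non-zero if any check failed.
function exitCodeFor(run: RunState): number {
  return run.checks.every((c) => c.passed) ? 0 : 1;
}
```

Injecting `fetchRun` keeps the loop testable without real network calls, in the same spirit as the test suite noted in the layout section.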

Environment variables

| Var | Purpose | Default |
|---|---|---|
| PISTACHIO_API_KEY | Bearer token for runs endpoints | (none) |
| PISTACHIO_BASE_URL | API base URL | https://www.pistachio.sh |
| XDG_CONFIG_HOME | Where the config file lives | ~/.config |
| NO_COLOR | Disable ANSI colors | (off) |
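A sketch of how these variables and their defaults might be resolved in one place. The `resolveSettings` helper and the `Settings` shape are hypothetical; the defaults are the ones listed in the table, and the NO_COLOR handling follows the common convention that any non-empty value disables color.

```typescript
type Env = Record<string, string | undefined>;

interface Settings {
  apiKey: string | undefined;
  baseUrl: string;
  configFile: string;
  color: boolean;
}

// Applies the documented defaults. `homeDir` is passed in so the function
// stays pure; a caller would typically supply the user's home directory.
function resolveSettings(env: Env, homeDir: string): Settings {
  const configHome = env.XDG_CONFIG_HOME || `${homeDir}/.config`;
  return {
    apiKey: env.PISTACHIO_API_KEY || undefined,
    baseUrl: env.PISTACHIO_BASE_URL || "https://www.pistachio.sh",
    configFile: `${configHome}/pistachio/config.json`,
    // Unset or empty NO_COLOR leaves color on; any other value turns it off.
    color: !env.NO_COLOR,
  };
}
```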

Layout

pistachio-cli/
├── src/
│   ├── api.ts            # PistachioApi — typed fetch client
│   ├── config.ts         # XDG config file IO + env resolution
│   ├── trace.ts          # zod schema for VerificationContext
│   ├── output.ts         # tiny terminal helpers (no chalk dep)
│   ├── cli.ts            # commander wiring + command implementations
│   └── mcp/
│       ├── buffer.ts     # in-memory trace buffer (pure, unit-tested)
│       └── server.ts     # stdio MCP server + seven tools
├── tests/                # vitest — 44/44, no real network calls
├── tsup.config.ts        # ESM single-file bundle, shebang banner
└── package.json

Roadmap

v0.0.1 — harness list/show/run, config, trace validation (✅)
v0.1.x — Design B MCP server + pistachio init (✅)
v0.2.0 — replay: true on pistachio_run_harness + anti-paraphrase observe-tool descriptions (KR-1 tier 1 item 2 + tier 2 surfacing) (← current)
v0.3.0 — Cursor / Codex CLI install variants

License

MIT.
