@root-signals/scorable-cli
v0.5.0
CLI for Scorable
The scorable CLI is a command-line tool for interacting with the Scorable API. It lets you manage and execute Judges and Evaluators, view execution logs, and run prompt testing experiments directly from the terminal.
Requires Node.js 20 or higher.
Installation
curl -sSL https://scorable.ai/cli/install.sh | sh
Or install directly with npm:
npm install -g @root-signals/scorable-cli
Or run without installing via npx:
npx @root-signals/scorable-cli judge list
Authentication
Option 1 — Free demo key (no registration required):
scorable auth demo-key
Creates a temporary key and saves it to ~/.scorable/settings.json.
Option 2 — Permanent key from scorable.ai/register:
# Interactively
scorable auth set-key
# From argument
scorable auth set-key sk-your-api-key
Option 3 — Environment variable (takes precedence over saved key):
export SCORABLE_API_KEY="sk-your-api-key"
The key lookup order is: SCORABLE_API_KEY env var → api_key in ~/.scorable/settings.json → temporary_api_key in ~/.scorable/settings.json.
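The lookup order can be sketched in plain shell (an illustration of the documented precedence, not the CLI's actual implementation; assumes jq is installed):

```shell
# Resolve the API key with the documented precedence:
# env var first, then api_key, then temporary_api_key from settings.json.
resolve_key() {
  settings="${1:-$HOME/.scorable/settings.json}"
  if [ -n "$SCORABLE_API_KEY" ]; then
    echo "$SCORABLE_API_KEY"
  elif [ -f "$settings" ]; then
    jq -r '.api_key // .temporary_api_key // empty' "$settings"
  fi
}
```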
Scorable Skills for AI Coding Agents
Install Scorable skills into your project so your AI coding agent (Claude Code, Cursor, etc.) can integrate evaluators automatically:
scorable skills-add
Once installed, open your coding agent in your project and use the prompt:
"Integrate scorable evaluators"
Judge Management
List judges
scorable judge list
Options: --page-size, --cursor, --search, --name, --ordering
Get a judge
scorable judge get <judge_id>
Create a judge
scorable judge create --name "My Judge" --intent "Evaluate response quality."
Options: --name (required), --intent (required), --stage, --evaluator-references (JSON string, e.g. '[{"id": "eval-id"}]')
Update a judge
scorable judge update <judge_id> --name "Updated Name"
Options: --name, --stage, --evaluator-references (use "[]" to clear)
Delete a judge
scorable judge delete <judge_id>
Prompts for confirmation. Use --yes to skip.
Duplicate a judge
scorable judge duplicate <judge_id>
Judge Execution
Execute by ID
scorable judge execute <judge_id> --request "What is the capital of France?" --response "Paris"
Options: --request, --response, --turns (JSON array of conversation turns), --contexts (JSON list), --expected-output, --tag (repeatable), --user-id, --session-id, --system-prompt
Pipe a response via stdin:
echo "Paris" | scorable judge execute <judge_id> --request "What is the capital of France?"
cat response.txt | scorable judge execute <judge_id>
For multi-turn conversations, pass the full history as a JSON array:
scorable judge execute <judge_id> --turns '[{"role":"user","content":"Hi"},{"role":"assistant","content":"Hello!"}]'
Execute by name
scorable judge execute-by-name "My Judge" --request "What is the capital of France?" --response "Paris"
Accepts the same options as execute. Stdin piping and --turns work the same way.
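For longer conversations, the --turns payload is easier to build with jq than to hand-write (a sketch; jq is not part of the CLI and must be installed separately):

```shell
# Build the JSON conversation array safely from shell variables,
# letting jq handle quoting and escaping.
user_msg="What is the capital of France?"
assistant_msg="Paris"
turns=$(jq -n --arg u "$user_msg" --arg a "$assistant_msg" \
  '[{role: "user", content: $u}, {role: "assistant", content: $a}]')
echo "$turns"
# Then pass it along:
#   scorable judge execute <judge_id> --turns "$turns"
```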
Evaluator Management
List evaluators
scorable evaluator list
Options: --page-size, --cursor, --search, --name, --ordering
Get an evaluator
scorable evaluator get <evaluator_id>
Create an evaluator
scorable evaluator create \
--name "My Evaluator" \
--scoring-criteria "Does the {{ response }} directly answer the user's question?" \
--intent "Evaluate response relevance"
Options: --name (required), --scoring-criteria (required — must contain {{ request }} and/or {{ response }}), --intent or --objective-id (one required), --system-message, --models (JSON array), --overwrite, --objective-version-id
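Because --scoring-criteria must mention {{ request }} and/or {{ response }}, a script can sanity-check the string before calling create (a minimal sketch in plain POSIX shell):

```shell
# Fail fast if the criteria string lacks the required template placeholders.
criteria="Does the {{ response }} directly answer the user's question?"
case "$criteria" in
  *"{{ request }}"*|*"{{ response }}"*)
    echo "criteria ok" ;;
  *)
    echo "error: criteria must contain {{ request }} or {{ response }}" >&2
    exit 1 ;;
esac
```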
Update an evaluator
scorable evaluator update <evaluator_id> --name "Updated Name"
Options: --name, --scoring-criteria, --system-message, --models (JSON array), --objective-id, --objective-version-id
Delete an evaluator
scorable evaluator delete <evaluator_id>
Prompts for confirmation. Use --yes to skip.
Duplicate an evaluator
scorable evaluator duplicate <evaluator_id>
Evaluator Execution
Execute by ID
scorable evaluator execute <evaluator_id> --request "What is 2+2?" --response "4"
Options: --request, --response, --turns (JSON array of conversation turns), --contexts (JSON list), --expected-output, --tag (repeatable), --user-id, --session-id, --system-prompt, --variables (JSON object of extra template variables)
Stdin piping and --turns work the same way as for judge execution.
For evaluators with custom template placeholders beyond {{request}}/{{response}}:
scorable evaluator execute <evaluator_id> --request "Hello" --variables '{"lang":"EN","topic":"science"}'
Execute by name
scorable evaluator execute-by-name "My Evaluator" --request "What is 2+2?" --response "4"
Accepts the same options as execute, including --variables.
Execution Logs
List execution logs
scorable execution-log list
Options: --page-size, --cursor, --search, --evaluator-id, --judge-id, --model, --tags, --score-min, --score-max, --created-at-after, --created-at-before, --owner-email
Get an execution log
scorable execution-log get <log_id>
Prompt Testing
Initialize a config file and run experiments:
scorable pt init
scorable pt runUse a custom config path:
scorable pt run --config path/to/prompt-tests.yaml
The prompt-test command is an alias for pt.
Config file format
prompts:
  - "Extract info from: {{text}}"
inputs:
  - vars:
      text: "John Doe, [email protected]"
# Or use a dataset instead of inline inputs:
# dataset_id: "<uuid>"
models:
  - gpt-4o-mini
  - gemini-2.5-flash-lite
evaluators:
  - name: Precision
  - name: Confidentiality
# Optional: enforce structured output
# response_schema:
#   type: object
#   properties:
#     name: { type: string }
Results are displayed in a table and a browser link is printed for the full comparison view.
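If you enable structured output, the commented response_schema block expands along JSON Schema lines. A sketch (which JSON Schema keywords are actually supported is an assumption here):

```yaml
prompts:
  - "Extract the contact's name and email from: {{text}}"
inputs:
  - vars:
      text: "John Doe, [email protected]"
models:
  - gpt-4o-mini
evaluators:
  - name: Precision
response_schema:
  type: object
  properties:
    name: { type: string }
    email: { type: string }
  required: [name, email]
```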
Development
npm install
npm run build # compile TypeScript
npm test # run tests
npm run typecheck # type-check without emitting
npm run lint # lint with oxlint
npm run fmt # format with oxfmt