rubric-chat
v0.4.0
A strict 0–100 score for AI conversations. Auto-discovers Claude Code, Codex CLI, and Cursor sessions; also accepts ChatGPT and Claude.ai exports. Six dimensions, eight archetypes, shareable score card.
Auto-discover terminal-agent sessions on disk or upload an export — Rubric grades the prompting craft on six dimensions (specificity, context, structure, iteration, scope, meta-prompting), names an archetype, and gives you a public score card you can share.
Install
npm i -g rubric-chat
Requires Node ≥ 18.
Usage
# Interactive wizard — auto-login, picks a session, scores it, opens the report.
rubric-chat
# Common subcommands
rubric-chat status # who am I, what plan, where are the URLs pointed
rubric-chat login # browser-callback login (or --no-browser to paste a token)
rubric-chat logout
rubric-chat list # show discovered local sessions
rubric-chat rate <id> # score a specific session id
rubric-chat rate --file path # score a JSON file (ChatGPT or Claude.ai export)
# Output modes
rubric-chat rate <id> --json # machine-readable
rubric-chat rate <id> --quiet # exit code + URL only
# Fully offline scoring — your prompts never leave your machine
rubric-chat rate <id> --local
Local mode (--local)
rate --local scores the session entirely on-device with a small open-weight
model — no account, no API call, no monthly cap, nothing uploaded. The default is
Gemma 4 E2B (Apache 2.0) running on ONNX Runtime via transformers.js. The
first run downloads the model (~3.2 GB, cached in ~/.rubric/models/); every
later run works fully offline.
rubric-chat rate <id> --local # default (gemma-4-e2b, ~5 GB RAM)
rubric-chat rate <id> --local --model gemma-4-e4b # bigger judge (~8 GB RAM)
rubric-chat rate <id> --local --model gemma-3-1b # low-RAM machines (~2 GB)
rubric-chat rate <id> --local --yes # skip the first-run download prompt
Available models: gemma-4-e2b (default), gemma-4-e4b, gemma-3-4b,
gemma-3-1b. Gemma 4 runs on the ONNX engine; Gemma 3 runs on llama.cpp with
grammar-enforced output.
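For scripts that drive the CLI, the model list above can be written down as data. The following is a minimal sketch, not part of rubric-chat: the names `LOCAL_MODELS`, `pickLocalModel`, and `localRateArgs` are hypothetical, and the RAM figures are the approximate ones quoted in the usage comments (no figure is given for gemma-3-4b, so it is left unset and never auto-selected).

```typescript
// Facts copied from the README: model names, engines, approximate RAM.
// The README gives no RAM figure for gemma-3-4b, so it is left undefined.
type Engine = "onnx" | "llama.cpp";

interface LocalModel {
  name: string;
  engine: Engine;
  approxRamGb?: number;
}

const LOCAL_MODELS: LocalModel[] = [
  { name: "gemma-4-e2b", engine: "onnx", approxRamGb: 5 }, // default
  { name: "gemma-4-e4b", engine: "onnx", approxRamGb: 8 },
  { name: "gemma-3-4b", engine: "llama.cpp" },
  { name: "gemma-3-1b", engine: "llama.cpp", approxRamGb: 2 },
];

// Hypothetical helper: pick the largest documented model that fits in RAM.
function pickLocalModel(freeRamGb: number): LocalModel | undefined {
  return LOCAL_MODELS
    .filter((m) => m.approxRamGb !== undefined && m.approxRamGb <= freeRamGb)
    .sort((a, b) => (b.approxRamGb ?? 0) - (a.approxRamGb ?? 0))[0];
}

// Build the argument list for `rubric-chat rate <id> --local --model <name>`.
function localRateArgs(sessionId: string, model: LocalModel, skipPrompt = false): string[] {
  const args = ["rate", sessionId, "--local", "--model", model.name];
  if (skipPrompt) args.push("--yes"); // skips the first-run download prompt
  return args;
}
```

The resulting array can be passed to any process spawner. Only flags documented above are emitted.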
Local scores use the exact same rubric, weights, and strictness curve as the
server, but a small on-device judge is not a frontier judge: expect a deviation
of roughly ±10 points (local scoring tends to be generous) and plainer feedback.
The report is badged LOCAL (approximate); run without --local for the calibrated
score and a shareable card.
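One way to read a local score is as a band rather than a point. The sketch below assumes the "roughly ±10 points" figure can be treated as a symmetric margin clamped to the 0–100 scale; `localScoreRange` is a hypothetical helper, not something the CLI exposes.

```typescript
// Assumption: treat the README's "roughly ±10 points" as a symmetric margin.
const LOCAL_MARGIN = 10;

// Returns [low, high], clamped to the 0–100 scale the score uses.
function localScoreRange(localScore: number, margin = LOCAL_MARGIN): [number, number] {
  const clamp = (n: number) => Math.min(100, Math.max(0, n));
  return [clamp(localScore - margin), clamp(localScore + margin)];
}
```

Because local scoring tends to be generous, the calibrated score is more likely to sit in the lower half of that band.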
Supported sources
| Source | How it's discovered | Status |
| --- | --- | --- |
| Claude Code | ~/.claude/projects/**/*.jsonl | ✅ |
| Codex CLI | ~/.codex/sessions/**/*.jsonl | ✅ |
| Cursor | local SQLite at ~/Library/Application Support/Cursor/User/globalStorage/state.vscdb (macOS) — verified against composerData / bubbleId schema _v: 16 | ✅ |
| Cline / Roo / Aider / Continue | per-tool storage | ⏳ planned |
| ChatGPT export | upload conversations.json via --file | ✅ |
| Claude.ai export | upload Anthropic data export JSON via --file | ✅ |
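The two glob patterns in the table reduce to a simple rule: a `.jsonl` file anywhere under the tool's root directory. As an illustration only (the CLI's actual discovery code is not shown here), a path classifier could look like this; `classifySessionPath` and the `Source` labels are hypothetical names, and POSIX-style `/` separators are assumed.

```typescript
type Source = "claude-code" | "codex-cli";

// Directory roots taken from the table; `home` stands in for "~".
const ROOTS: Array<[Source, string]> = [
  ["claude-code", ".claude/projects/"],
  ["codex-cli", ".codex/sessions/"],
];

// Returns the source a file belongs to, or undefined if it matches neither glob.
// Assumes POSIX-style "/" separators.
function classifySessionPath(filePath: string, home: string): Source | undefined {
  if (!filePath.endsWith(".jsonl")) return undefined;
  const base = home.endsWith("/") ? home : home + "/";
  for (const [source, root] of ROOTS) {
    if (filePath.startsWith(base + root)) return source;
  }
  return undefined;
}
```

Cursor is left out because it is read from a SQLite database rather than matched by file path.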
What gets graded
The whole session as one artifact, on six weighted dimensions, with a strict curve so a competent median lands near 50 (not 85). Full rubric, weights, and the strictness formula are at https://rubric.chat/methodology.
What we look at
Only your turns. The CLI strips assistant text locally before any payload leaves your machine — see the apps/cli/src/sources/*.test.ts files in the repo, which assert this at the wire level. User turns are also PII-redacted on the server before they reach the scoring model.
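The idea of stripping assistant text before upload can be sketched as a filter over normalized turns. The `Turn` shape and `userTurnsOnly` below are assumptions made for illustration; the real types live in the repo's source parsers and may differ.

```typescript
// Hypothetical normalized turn shape; the real CLI's types are not shown in the README.
interface Turn {
  role: "user" | "assistant" | "system";
  content: string;
}

// Keep only the user's turns, so assistant text never enters the payload.
function userTurnsOnly(turns: Turn[]): Turn[] {
  return turns.filter((t) => t.role === "user");
}

const session: Turn[] = [
  { role: "user", content: "Refactor parse() to return a Result type." },
  { role: "assistant", content: "ASSISTANT-REPLY: here is the refactor..." },
  { role: "user", content: "Keep the public signature unchanged." },
];

// What a serialized payload would contain under this sketch.
const payload = JSON.stringify(userTurnsOnly(session));
```

A wire-level test in this style would assert that no assistant string appears anywhere in the serialized payload.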
Configuration
| Env var | Default | What it does |
| --- | --- | --- |
| RUBRIC_API_BASE_URL | https://api.rubric.chat | Override for local dev |
| RUBRIC_WEB_BASE_URL | https://rubric.chat | Override for local dev |
| RUBRIC_LOCAL_MODEL_URI | registry default | Override the --local GGUF model URI |
Credentials are stored in the OS keychain (macOS Keychain, libsecret on Linux). Fallback: ~/.rubric/credentials (chmod 600).
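The environment overrides in the table follow the usual "env var or default" pattern. A minimal sketch, assuming an empty value should fall back to the default; `RubricConfig` and `resolveConfig` are hypothetical names, not the CLI's internals.

```typescript
interface RubricConfig {
  apiBaseUrl: string;
  webBaseUrl: string;
  localModelUri?: string; // undefined means "use the registry default"
}

// Takes the environment as a plain object so it is easy to test.
function resolveConfig(env: Record<string, string | undefined>): RubricConfig {
  return {
    apiBaseUrl: env["RUBRIC_API_BASE_URL"] || "https://api.rubric.chat",
    webBaseUrl: env["RUBRIC_WEB_BASE_URL"] || "https://rubric.chat",
    localModelUri: env["RUBRIC_LOCAL_MODEL_URI"] || undefined,
  };
}
```

In a Node script this would typically be called as `resolveConfig(process.env)`.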
License
MIT
