# ccretro

Claude Code retrospective analysis tool — analyze prompt quality, waste, and usage patterns.
Analyze your Claude Code sessions to find where you're wasting money and how to stop.
## Why
Claude Code costs money per turn. Depending on how you write prompts, the same task can cost 2–5x more.
The problem is that you can't see where the waste happens. ccretro parses your `~/.claude/projects/` session logs and gives you:
- Per-turn prompt quality scores with category-aware normalization
- Wasted tokens from retries, corrections, and duplicates — in dollar estimates
- 5 anti-pattern detectors that flag inefficient habits automatically
- Concrete action items: split session here, switch model there, save $X
```
Session: abc12345-...
Turns: 9 | Avg PQS: 81 (norm: 109%) | Iteration: 100%
Waste: 7901 tokens (11.7%)

💡 Action Items (potential savings: ~$3.20 / 2 turns):
  1. [context] Using offset/limit on Read tool would have saved ~$2.59
  2. [split] Splitting session around turn 4 would have saved ~$0.59

Anti-patterns (2):
  !! [context_stuffing] Turn 7: tool results consumed 68% of context
   ! [model_mismatch] Turn 0: expensive model used for simple task
```

## Install

```shell
npm install -g ccretro
```

Requires Node.js 18+. No native binaries — runs on sql.js (WASM), works on any OS.
## Quick Start
The recommended way to use ccretro is through Claude Code slash commands.

```shell
ccretro setup
```

This copies the slash commands to `~/.claude/commands/`. Then inside any Claude Code session, type:

- `/retro` — Retrospective on the current session. Analyzes every turn, finds waste, coaches your prompts, and suggests concrete improvements.
- `/retro-weekly` — Weekly summary across all sessions. Shows trends, growth assessment, and action items for next week.

That's it. The slash commands handle everything — they run ccretro under the hood with a multi-model pipeline (Haiku + Sonnet + Opus) and present the results in your language.
## CLI (optional)
You can also use ccretro directly from the terminal:

```shell
# See which projects you have
ccretro list-projects

# Analyze recent sessions
ccretro analyze --project /path/to/project --limit 5

# Visual dashboard in your browser
ccretro serve
```

## Features
### Session Analysis
```shell
# Single session
ccretro analyze --session <session-id>

# Batch analyze recent sessions
ccretro analyze --project /path/to/project --since 2025-01-27 --limit 10

# JSON output for scripting
ccretro analyze --session <session-id> --json
```

Per-turn metrics:

| Metric | What it tells you |
|--------|-------------------|
| PQS (0–100) | Prompt execution quality: first-attempt success, error avoidance, scope efficiency, completion |
| Normalized PQS | Performance relative to category expectations (debugging, generation, etc.). 100% = meeting baseline |
| Iteration Ratio | Effective turns / total turns. Drops when you retry or correct |
| Context Budget | Context window health. Tracks compaction count, peak usage ratio |
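A quick sketch of how the two derived metrics above relate to raw numbers. The category baselines and rounding here are invented for illustration, not taken from ccretro's source (which ships its own baselines):

```javascript
// Illustrative category baselines — placeholders, not ccretro's real values.
const CATEGORY_BASELINE_PQS = { debugging: 68, generation: 78, refactoring: 74 };

// Normalized PQS: raw score relative to the category's expected score,
// so 100% means "meeting baseline" regardless of task difficulty.
function normalizedPqs(rawPqs, category) {
  const baseline = CATEGORY_BASELINE_PQS[category] ?? 75;
  return Math.round((rawPqs / baseline) * 100);
}

// Iteration ratio: effective turns / total turns. Retries and corrections
// count against you because they consume turns without adding new work.
function iterationRatio(totalTurns, retriesAndCorrections) {
  return (totalTurns - retriesAndCorrections) / totalTurns;
}

console.log(normalizedPqs(81, "debugging")); // a "norm: NNN%"-style figure
console.log(iterationRatio(9, 0));           // 1.0, shown as "Iteration: 100%"
```

The point of the normalization is that a raw PQS of 81 can be excellent for a debugging session and merely average for straight code generation.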
### Prompt Coaching
```shell
ccretro coach --session <session-id>
ccretro coach --project /path/to/project
```

Evaluates your prompt text itself (PTQS): specificity, clarity, scope, and context. Grades each turn A/B/C/D with specific improvement suggestions.
### Correction Cause Analysis
```shell
ccretro correction-causes --session <session-id>
ccretro correction-causes --session <session-id> --detail detailed --json
```

Goes beyond detecting that corrections happened — analyzes why they happened. Builds correction chains from consecutive correction turns, scores each chain's reliability (5 weighted factors), and generates structured prompts for model-based root cause classification.

7 cause categories:

| Cause | Meaning | Typical User Contribution |
|-------|---------|---------------------------|
| ambiguous_prompt | Prompt was genuinely unclear or multi-interpretable | 60–80% |
| missing_context | Critical info omitted (file paths, constraints, env) | 60–80% |
| requirements_changed | User changed their mind after seeing the result | 70–90% |
| scope_creep | Prompt tried to do too many things at once | 60–80% |
| model_misunderstood | Prompt was clear but model missed instructions | 10–30% |
| technical_complexity | Task required iteration to refine approach | 30–50% |
| tool_failure | Right approach but tool call failed | 10–30% |
The `--json` output includes pre-built `prompts.system` and `prompts.user` fields designed for direct model consumption. The `/retro` slash command uses these automatically.
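If you want to feed those prompts to a model yourself, a sketch like the following works. Only the `prompts.system` and `prompts.user` fields are documented above; the rest of the sample report shape is a guess, so inspect real `ccretro correction-causes --json` output before relying on it:

```javascript
// Turn a correction-causes JSON report into chat messages for any model API.
// Assumes only the documented prompts.system / prompts.user fields exist.
function extractClassifierMessages(report) {
  const { system, user } = report.prompts ?? {};
  if (!system || !user) throw new Error("no pre-built prompts in report");
  return [
    { role: "system", content: system },
    { role: "user", content: user },
  ];
}

// Hypothetical sample matching the documented field names:
const report = {
  prompts: {
    system: "You classify correction root causes into 7 categories...",
    user: "Chain 1 (reliability 0.84): origin prompt ... correction prompt ...",
  },
};
const messages = extractClassifierMessages(report);
console.log(messages.map((m) => m.role).join(",")); // system,user
```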
Reliability scoring (5 factors):
- Keyword strength (0.20) — strong/moderate/weak correction signals
- Temporal proximity (0.15) — time gap between origin and correction
- Semantic coherence (0.20) — trigram Jaccard similarity between prompts
- Tool/file overlap (0.20) — shared tools and file paths
- Resolution signal (0.10) — whether the chain resolved
Chains with reliability ≥ 0.8 are analyzed at chain level, chains at 0.5–0.8 with surrounding context, and chains below 0.5 at session level.
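A minimal sketch of the scoring idea above. The factor weights come from the list; the trigram tokenization, the weighted-average combination rule, and the normalization are assumptions about how such a score could be combined, not ccretro's actual implementation:

```javascript
// Character-trigram Jaccard similarity between two prompts (the "semantic
// coherence" signal above). 1 = identical trigram sets, 0 = no overlap.
function trigramJaccard(a, b) {
  const grams = (s) => {
    const t = s.toLowerCase().replace(/\s+/g, " ");
    const out = new Set();
    for (let i = 0; i + 3 <= t.length; i++) out.add(t.slice(i, i + 3));
    return out;
  };
  const A = grams(a), B = grams(b);
  if (A.size === 0 && B.size === 0) return 1;
  let inter = 0;
  for (const g of A) if (B.has(g)) inter++;
  return inter / (A.size + B.size - inter);
}

// Weights from the README. Each factor is assumed to be a 0–1 signal.
const WEIGHTS = {
  keywordStrength: 0.2,
  temporalProximity: 0.15,
  semanticCoherence: 0.2,
  toolFileOverlap: 0.2,
  resolutionSignal: 0.1,
};

function chainReliability(factors) {
  let score = 0, total = 0;
  for (const [name, w] of Object.entries(WEIGHTS)) {
    score += w * (factors[name] ?? 0);
    total += w; // normalize, since the listed weights sum to 0.85
  }
  return score / total;
}

const origin = "fix the failing auth test in src/login.ts";
const correction = "no, the auth test in src/login.ts — mock the token client";
const reliability = chainReliability({
  keywordStrength: 1.0,   // "no, ..." is a strong correction signal
  temporalProximity: 0.9, // correction came right after the origin turn
  semanticCoherence: trigramJaccard(origin, correction),
  toolFileOverlap: 1.0,   // same file touched in both turns
  resolutionSignal: 1.0,  // chain ended with a passing result
});
console.log(reliability.toFixed(2));
```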
### Anti-pattern Detection
5 inefficiency patterns detected automatically:

| Pattern | What it means |
|---------|---------------|
| Shotgun Debugging | 3+ corrections within 5 turns. You're guessing instead of debugging systematically |
| Context Stuffing | Single turn consumes 30%+ of context. Use offset/limit on Read tool |
| Blind Retry | Same prompt repeated after error. Try a different approach instead |
| Model Mismatch | Opus used for simple lookups. Sonnet is 5x cheaper for these |
| Premature Session | ≤3 turns + errors. Plan your task before starting a session |
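The Shotgun Debugging rule, for example, is a simple sliding-window count. This sketch uses a hypothetical turn shape (`{ isCorrection }`), whereas ccretro derives correction flags from real session logs:

```javascript
// Flag every 5-turn window containing 3+ correction turns.
function findShotgunDebugging(turns, window = 5, threshold = 3) {
  const hits = [];
  for (let start = 0; start + window <= turns.length; start++) {
    const corrections = turns
      .slice(start, start + window)
      .filter((t) => t.isCorrection).length;
    if (corrections >= threshold) hits.push({ startTurn: start, corrections });
  }
  return hits;
}

// Turns 2, 3, and 5 are corrections, so the windows starting at turns 1 and 2
// each contain three corrections and get flagged.
const session = [false, false, true, true, false, true, false].map(
  (isCorrection) => ({ isCorrection })
);
console.log(findShotgunDebugging(session));
```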
### Action Items
Instead of abstract scores, ccretro shows savings in dollars and turns:
- Split session — cost of regeneration after compaction
- Avoid retries — output tokens wasted on corrections/duplicates
- Reduce context — input tokens from context stuffing excess
- Switch model — Opus→Sonnet price difference
- Improve prompt — tokens wasted on shotgun debugging
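The dollar figures reduce to waste-token counts times per-token prices. The prices below are illustrative placeholders, not necessarily what ccretro's tables or Anthropic's current price list say, so treat this as back-of-envelope arithmetic only:

```javascript
// Illustrative USD prices per million tokens — verify against current rates.
const PRICE_PER_MTOK = {
  opus: { in: 15, out: 75 },
  sonnet: { in: 3, out: 15 },
};

// "Avoid retries": output tokens regenerated for corrections/duplicates.
function retryWasteUsd(wastedOutputTokens, model) {
  return (wastedOutputTokens / 1e6) * PRICE_PER_MTOK[model].out;
}

// "Switch model": Opus→Sonnet price difference on the same token volume.
function modelSwitchSavingUsd(inputTokens, outputTokens) {
  const cost = (p) => (inputTokens / 1e6) * p.in + (outputTokens / 1e6) * p.out;
  return cost(PRICE_PER_MTOK.opus) - cost(PRICE_PER_MTOK.sonnet);
}

console.log(retryWasteUsd(7901, "opus").toFixed(2));         // wasted retries
console.log(modelSwitchSavingUsd(20000, 2000).toFixed(2));   // Opus→Sonnet
```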
### Web Dashboard
```shell
ccretro serve              # http://127.0.0.1:3000
ccretro serve --port 8080
```

Browse projects → sessions → per-turn drilldown. PQS trend charts, category performance, task segments, anti-patterns — all visual.

JSON API available at `/api/projects`, `/api/sessions`, `/api/session/:id`, `/api/aggregate`.
### Project Aggregation
```shell
ccretro sync --project /path/to/project
ccretro aggregate --project /path/to/project --since 2025-01-27
```

Daily breakdown, PQS trends, top tools, best/worst sessions — powered by SQLite cache for instant results.
### More Commands
```shell
# PQS change patterns within a session (warmup, fatigue, volatile, etc.)
ccretro patterns --session <session-id>

# Outlier session detection across a project
ccretro anomalies --project /path/to/project

# Analyze WHY corrections occurred in a session
ccretro correction-causes --session <session-id>
ccretro correction-causes --session <session-id> --detail detailed --json

# Auto-generate slash commands from repeated prompt patterns
ccretro recommend --project /path/to/project --generate

# Install prompt quality hook (blocks submission if PTQS < 30)
ccretro install-hook --threshold 30

# Cache management
ccretro cache-status
ccretro cache-clear
```

## Slash Commands in Detail
Both commands use a multi-model pipeline: Haiku detects your language, Sonnet crunches the data, and Opus writes the final retrospective.
### /retro — Session Retrospective
Analyzes a single session end-to-end:
- Lists your recent sessions and asks which one to review
- Runs `ccretro analyze`, `ccretro coach`, and `ccretro correction-causes` on the selected session
- Identifies weak prompts (PQS < 60), quotes them, and drafts improved rewrites
- Classifies why each correction chain occurred (ambiguous prompt, missing context, model error, etc.) with user vs model attribution
- Surfaces waste patterns, anti-patterns, and habit analysis
- Presents a cohesive retrospective in your language
- Auto-updates `.claude/ccretro/coaching.md` so your next session benefits
### /retro-weekly — Weekly Summary
Aggregates an entire week of sessions:
- Runs aggregation, anomaly detection, and pattern recommendation
- Compares this week vs last week (PQS trend, cost, waste rate)
- Classifies your trajectory: Growing / Plateau / Declining
- Flags outlier sessions worth investigating
- Suggests slash commands you can auto-generate from repeated prompts
- Delivers action items specific enough to follow immediately
## Token Usage
Both /retro and /retro-weekly use a multi-model pipeline. Here are typical token consumption ranges:
### /retro (Single Session)
| Phase | Model | Input tokens | Output tokens |
|-------|-------|--------------|---------------|
| Language detection | Haiku | ~500 | ~10 |
| Data analysis | Sonnet | 5K–50K | 3K–5K |
| Final retrospective | Opus | 10K–60K | 3K–8K |
| Total | | 15K–110K | 6K–13K |
Input token count scales with session length (number of turns and tool calls). A 10-turn session typically uses ~30K total input tokens; a 30+ turn session can reach ~100K+.
### /retro-weekly (Weekly Summary)
| Phase | Model | Input tokens | Output tokens |
|-------|-------|--------------|---------------|
| Language detection | Haiku | ~500 | ~10 |
| Data analysis | Sonnet | 10K–100K | 5K–8K |
| Final retrospective | Opus | 15K–110K | 5K–8K |
| Total | | 25K–210K | 10K–16K |
Weekly analysis scales with the number of sessions in the period. A typical week (10–20 sessions) uses ~80K total input tokens.
### Cost Control
Both commands show an estimated token breakdown before running the expensive model calls and ask for confirmation. After completion, they display actual usage vs estimate so you can track real costs over time.
To proceed immediately, answer "yes" at the confirmation prompt.
## Metrics & Methodology
For a detailed explanation of every metric, how scores are calculated, and how to interpret them in retrospectives, see `docs/metrics.md`.
## Tips for Better Prompts
Based on what ccretro surfaces most often:
- Be specific upfront — include file paths, error messages, and expected behavior instead of "fix this"
- One task per prompt — multiple requests in one prompt increase partial failure rates
- Don't repeat after errors — include the error message and try a different approach
- Read files partially — use offset/limit on Read tool instead of loading entire files
- Split before compaction — ccretro tells you when to start a new session
- Use Sonnet for simple questions — exploration and confirmation tasks don't need Opus
## License
MIT
