rlhf-feedback-loop
v0.6.13
Feedback-Driven Development (FDD) for AI agents — capture preference signals, steer behavior via Thompson Sampling, and export KTO/DPO training pairs for downstream fine-tuning.
MCP Memory Gateway
Local-first memory and feedback pipeline for AI agents. Captures thumbs-up/down signals, promotes reusable memories, generates prevention rules from repeated failures, and exports KTO/DPO pairs for fine-tuning.
Works with any MCP-compatible agent: Claude, Codex, Gemini, Amp, Cursor.
What It Does
thumbs up/down → validate → promote to memory → vector index → prevention rules → DPO export
- Capture — capture_feedback MCP tool accepts signals with context
- Validate — Rubric engine gates promotion (vague feedback is rejected with clarification prompts)
- Remember — Promoted memories stored in JSONL + LanceDB vectors
- Prevent — Repeated failures auto-generate prevention rules
- Export — KTO/DPO pairs for downstream fine-tuning
- Bridge — JSONL file watcher auto-ingests signals from external sources (Amp plugins, hooks, scripts)
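The Validate step above can be sketched as a simple rubric gate. This is an illustrative sketch only; the function names, vague-phrase list, and thresholds are assumptions, not the package's actual implementation:

```python
# Illustrative sketch of the Capture → Validate → Remember gate.
# Names and thresholds are assumptions, not this package's API.

VAGUE_PHRASES = {"good", "bad", "nice", "wrong", "ok"}

def validate(signal: str, context: str) -> tuple[bool, str]:
    """Rubric gate: reject feedback too vague to promote to memory."""
    words = context.lower().split()
    if len(words) < 4:
        return False, "Please describe what specifically went right or wrong."
    if all(w in VAGUE_PHRASES for w in words):
        return False, "Vague feedback rejected; add concrete detail."
    return True, "promoted"

def capture_feedback(signal: str, context: str, memories: list[dict]) -> dict:
    # Only validated feedback is promoted to the memory store.
    ok, note = validate(signal, context)
    if ok:
        memories.append({"signal": signal, "context": context})
    return {"promoted": ok, "note": note}
```

The point of the gate is that a bare "bad" carries no learnable signal, so it is bounced back with a clarification prompt instead of polluting the memory store.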
Quick Start
# Add to any MCP-compatible agent
claude mcp add rlhf -- npx -y rlhf-feedback-loop serve
codex mcp add rlhf -- npx -y rlhf-feedback-loop serve
amp mcp add rlhf -- npx -y rlhf-feedback-loop serve
gemini mcp add rlhf "npx -y rlhf-feedback-loop serve"
# Or auto-detect all installed platforms
npx rlhf-feedback-loop init
MCP Tools
| Tool | Description |
|------|-------------|
| capture_feedback | Accept up/down signal + context, validate, promote to memory |
| recall | Vector-search past feedback and prevention rules for current task |
| feedback_stats | Approval rate, per-skill/tag breakdown, trend analysis |
| feedback_summary | Human-readable recent feedback summary |
| prevention_rules | Generate prevention rules from repeated mistakes |
| export_dpo_pairs | Build DPO preference pairs from promoted memories |
| construct_context_pack | Bounded context pack from contextfs |
| evaluate_context_pack | Record context pack outcome (closes learning loop) |
| list_intents | Available action plan templates |
| plan_intent | Generate execution plan with policy checkpoints |
| context_provenance | Audit trail of context decisions |
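A DPO preference pair couples a chosen and a rejected response to the same prompt. A minimal sketch of how export_dpo_pairs might assemble pairs from promoted memories — field names follow common DPO dataset conventions (prompt/chosen/rejected) and are not necessarily this package's exact schema:

```python
# Sketch: build DPO preference pairs from promoted feedback memories.
# Groups memories by prompt, then pairs each positive (chosen) response
# with each negative (rejected) one. Field names are assumptions based
# on common DPO dataset conventions.
from collections import defaultdict

def export_dpo_pairs(memories: list[dict]) -> list[dict]:
    by_prompt = defaultdict(lambda: {"positive": [], "negative": []})
    for m in memories:
        by_prompt[m["prompt"]][m["signal"]].append(m["response"])
    pairs = []
    for prompt, buckets in by_prompt.items():
        for chosen in buckets["positive"]:
            for rejected in buckets["negative"]:
                pairs.append({"prompt": prompt,
                              "chosen": chosen,
                              "rejected": rejected})
    return pairs
```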
CLI
npx rlhf-feedback-loop init # Scaffold .rlhf/ + configure MCP
npx rlhf-feedback-loop serve # Start MCP server (stdio) + watcher
npx rlhf-feedback-loop status # Learning curve dashboard
npx rlhf-feedback-loop watch # Watch .rlhf/ for external signals
npx rlhf-feedback-loop watch --once # Process pending signals and exit
npx rlhf-feedback-loop capture # Capture feedback via CLI
npx rlhf-feedback-loop stats # Analytics + Revenue-at-Risk
npx rlhf-feedback-loop rules # Generate prevention rules
npx rlhf-feedback-loop export-dpo # Export DPO training pairs
npx rlhf-feedback-loop risk # Train/query boosted risk scorer
npx rlhf-feedback-loop self-heal # Run self-healing diagnostics
JSONL File Watcher
The serve command automatically starts a background watcher that monitors feedback-log.jsonl for entries written by external sources (Amp plugins, shell hooks, CI scripts). These entries are routed through the full captureFeedback() pipeline — validation, memory promotion, vector indexing, and DPO eligibility.
# Standalone watcher
npx rlhf-feedback-loop watch --source amp-plugin-bridge
# Process pending entries once and exit
npx rlhf-feedback-loop watch --once
External sources write entries with a source field:
{"signal":"positive","context":"Agent fixed bug on first try","source":"amp-plugin-bridge","tags":["amp-ui-bridge"]}
The watcher tracks its position via .rlhf/.watcher-offset for crash-safe, idempotent processing.
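The offset mechanic can be sketched as follows. The file names match the README (feedback-log.jsonl, a watcher-offset file), but the processing logic is an illustrative assumption, not the package's actual code:

```python
# Sketch of crash-safe, idempotent JSONL watching: persist the byte
# offset of the last processed entry so a restart never re-ingests
# signals that were already handled.
import json
import os

def process_pending(log_path: str, offset_path: str, handle) -> int:
    """Process entries appended since the saved offset; return count."""
    offset = 0
    if os.path.exists(offset_path):
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    count = 0
    # Binary mode so tell()/seek() give exact byte positions.
    with open(log_path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line:
                break
            if line.strip():
                handle(json.loads(line))
                count += 1
        offset = f.tell()
    with open(offset_path, "w") as f:
        f.write(str(offset))  # persist position for crash-safe resume
    return count
```

Because the offset is only advanced after entries are handled, re-running the watcher over an unchanged log is a no-op, which is what makes external writers (plugins, hooks, CI scripts) safe to fan in concurrently.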
Learning Curve Dashboard
npx rlhf-feedback-loop status
╔══════════════════════════════════════╗
║ RLHF Learning Curve Dashboard ║
╠══════════════════════════════════════╣
║ Total signals: 148 ║
║ Positive: 45 (30%) ║
║ Negative: 103 (70%) ║
║ Recent (last 20): 20% ║
║ Trend: 📉 declining ║
║ Memories: 17 ║
║ Prevention rules: 9 ║
╠══════════════════════════════════════╣
║ Top failure domains: ║
║ execution-gap 4 ║
║ asked-not-doing 2 ║
║ speed 2 ║
╠══════════════════════════════════════╣
║ Learning curve (approval % by window)║
║ [1-10] 10% ██ ║
║ [11-20] 20% ████ ║
║ [21-30] 35% ███████ ║
║ [31-40] 30% ██████ ║
╚══════════════════════════════════════╝
Architecture
Five-phase pipeline: Capture → Validate → Remember → Prevent → Export
Agent (Claude/Codex/Amp/Gemini)
│
├── MCP tool call ──→ captureFeedback()
├── REST API ────────→ captureFeedback()
├── CLI ─────────────→ captureFeedback()
└── External write ──→ JSONL ──→ Watcher ──→ captureFeedback()
│
▼
┌─────────────────┐
│ Full Pipeline │
│ • Schema valid │
│ • Rubric gate │
│ • Memory promo │
│ • Vector index │
│ • Risk scoring │
│ • RLAIF audit │
│ • DPO eligible │
└─────────────────┘
Agent Runner Contract
- WORKFLOW.md: scope, proof-of-work, hard stops, and done criteria for isolated agent runs
- .github/ISSUE_TEMPLATE/ready-for-agent.yml: bounded intake template for "Ready for Agent" tickets
- .github/pull_request_template.md: proof-first handoff format for PRs
License
MIT. See LICENSE.
