rlhf-feedback-loop

v0.6.13

Feedback-Driven Development (FDD) for AI agents — capture preference signals, steer behavior via Thompson Sampling, and export KTO/DPO training pairs for downstream fine-tuning.
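
The description above mentions steering behavior via Thompson Sampling. As a rough illustration only, the sketch below treats each candidate behavior as a bandit arm with a Beta posterior built from thumbs-up/down counts, draws one sample per arm, and picks the highest draw. The arm names, the `chooseArm` helper, and the uniform Beta(1, 1) prior are assumptions made for this example; the package's own implementation may differ.

```javascript
// Standard normal draw via the Box-Muller transform.
function sampleNormal(rng) {
  const u1 = 1 - rng(); // shift to (0, 1] so Math.log never sees 0
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Gamma(shape, 1) draw via Marsaglia-Tsang; valid for shape >= 1.
function sampleGamma(shape, rng) {
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    let x, v;
    do {
      x = sampleNormal(rng);
      v = 1 + c * x;
    } while (v <= 0);
    v = v * v * v;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(a, b) draw as a ratio of two Gamma draws.
function sampleBeta(a, b, rng) {
  const x = sampleGamma(a, rng);
  const y = sampleGamma(b, rng);
  return x / (x + y);
}

// Thompson Sampling: one posterior draw per arm, pick the largest.
// `arms` maps a (hypothetical) behavior name to its up/down counts.
function chooseArm(arms, rng = Math.random) {
  let best = null;
  let bestDraw = -1;
  for (const [name, { up, down }] of Object.entries(arms)) {
    const draw = sampleBeta(up + 1, down + 1, rng); // Beta(1, 1) prior
    if (draw > bestDraw) {
      best = name;
      bestDraw = draw;
    }
  }
  return best;
}

// Example: an arm with mostly positive feedback is chosen most of the time,
// while arms with little data still get explored occasionally.
console.log(chooseArm({ concise: { up: 12, down: 3 }, verbose: { up: 2, down: 1 } }));
```

Because selection is driven by random posterior draws rather than raw approval rates, arms with few signals keep some chance of being tried.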

Readme

MCP Memory Gateway

Local-first memory and feedback pipeline for AI agents. Captures thumbs-up/down signals, promotes reusable memories, generates prevention rules from repeated failures, and exports KTO/DPO pairs for fine-tuning.

Works with any MCP-compatible agent: Claude, Codex, Gemini, Amp, Cursor.

What It Does

thumbs up/down → validate → promote to memory → vector index → prevention rules → DPO export
  1. Capture: the capture_feedback MCP tool accepts signals with context
  2. Validate: Rubric engine gates promotion (vague feedback is rejected with clarification prompts)
  3. Remember: Promoted memories stored in JSONL + LanceDB vectors
  4. Prevent: Repeated failures auto-generate prevention rules
  5. Export: KTO/DPO pairs for downstream fine-tuning
  6. Bridge: JSONL file watcher auto-ingests signals from external sources (Amp plugins, hooks, scripts)
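
The Validate step in the list above can be pictured as a gate that either promotes an entry or sends back a clarification prompt. The sketch below is only an illustration of that idea: `gateFeedback`, the word-count threshold, the vague-word list, and the `"negative"` signal value are all assumptions, not the package's rubric engine.

```javascript
// Hypothetical rubric, invented for this example.
const VAGUE_WORDS = new Set(["good", "bad", "nice", "wrong", "ok", "meh"]);
const MIN_WORDS = 4;

function gateFeedback(entry) {
  if (entry.signal !== "positive" && entry.signal !== "negative") {
    return {
      promoted: false,
      reason: "invalid-signal",
      prompt: 'Use signal "positive" or "negative".',
    };
  }
  const words = String(entry.context ?? "").trim().split(/\s+/).filter(Boolean);
  const vague =
    words.length < MIN_WORDS ||
    words.every((w) => VAGUE_WORDS.has(w.toLowerCase()));
  if (vague) {
    // Rejected entries get a question back instead of being stored.
    return {
      promoted: false,
      reason: "vague-context",
      prompt: "What specifically worked or failed, and in which task?",
    };
  }
  return {
    promoted: true,
    memory: { signal: entry.signal, context: words.join(" "), tags: entry.tags ?? [] },
  };
}

console.log(gateFeedback({ signal: "negative", context: "bad" }).reason);
console.log(gateFeedback({ signal: "positive", context: "Agent fixed bug on first try" }).promoted);
```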

Quick Start

# Add to any MCP-compatible agent
claude mcp add rlhf -- npx -y rlhf-feedback-loop serve
codex mcp add rlhf -- npx -y rlhf-feedback-loop serve
amp mcp add rlhf -- npx -y rlhf-feedback-loop serve
gemini mcp add rlhf "npx -y rlhf-feedback-loop serve"

# Or auto-detect all installed platforms
npx rlhf-feedback-loop init

MCP Tools

| Tool | Description |
|------|-------------|
| capture_feedback | Accept up/down signal + context, validate, promote to memory |
| recall | Vector-search past feedback and prevention rules for current task |
| feedback_stats | Approval rate, per-skill/tag breakdown, trend analysis |
| feedback_summary | Human-readable recent feedback summary |
| prevention_rules | Generate prevention rules from repeated mistakes |
| export_dpo_pairs | Build DPO preference pairs from promoted memories |
| construct_context_pack | Bounded context pack from contextfs |
| evaluate_context_pack | Record context pack outcome (closes learning loop) |
| list_intents | Available action plan templates |
| plan_intent | Generate execution plan with policy checkpoints |
| context_provenance | Audit trail of context decisions |
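
Agents normally issue these tool calls for you, but it can help to see what one looks like on the wire. MCP tool invocations are JSON-RPC 2.0 requests using the `tools/call` method. The sketch below builds such a request for `capture_feedback`; the argument names (`signal`, `context`, `tags`) are assumptions borrowed from the JSONL entry example later in this README, since the tool's input schema is not listed here.

```javascript
// Build a JSON-RPC 2.0 request for MCP's tools/call method.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall(1, "capture_feedback", {
  // Assumed argument names; check the tool's published schema before relying on them.
  signal: "positive",
  context: "Agent fixed bug on first try",
  tags: ["amp-ui-bridge"],
});

// The MCP stdio transport carries one JSON message per line.
console.log(JSON.stringify(request));
```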

CLI

npx rlhf-feedback-loop init              # Scaffold .rlhf/ + configure MCP
npx rlhf-feedback-loop serve             # Start MCP server (stdio) + watcher
npx rlhf-feedback-loop status            # Learning curve dashboard
npx rlhf-feedback-loop watch             # Watch .rlhf/ for external signals
npx rlhf-feedback-loop watch --once      # Process pending signals and exit
npx rlhf-feedback-loop capture           # Capture feedback via CLI
npx rlhf-feedback-loop stats             # Analytics + Revenue-at-Risk
npx rlhf-feedback-loop rules             # Generate prevention rules
npx rlhf-feedback-loop export-dpo        # Export DPO training pairs
npx rlhf-feedback-loop risk              # Train/query boosted risk scorer
npx rlhf-feedback-loop self-heal         # Run self-healing diagnostics
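
The output format of `export-dpo` is not shown in this README. For orientation, DPO trainers commonly consume records of the shape `{prompt, chosen, rejected}`. The sketch below shows one way such pairs could be assembled from stored feedback, assuming each memory carries `prompt`, `response`, and `signal` fields; those field names and the `buildDpoPairs` helper are hypothetical.

```javascript
// Pair each positively rated response with a negatively rated one for the same prompt.
function buildDpoPairs(memories) {
  const byPrompt = new Map();
  for (const m of memories) {
    const bucket = byPrompt.get(m.prompt) ?? { positive: [], negative: [] };
    if (m.signal === "positive" || m.signal === "negative") {
      bucket[m.signal].push(m.response);
    }
    byPrompt.set(m.prompt, bucket);
  }
  const pairs = [];
  for (const [prompt, { positive, negative }] of byPrompt) {
    // Prompts with feedback of only one polarity yield no pair.
    const n = Math.min(positive.length, negative.length);
    for (let i = 0; i < n; i++) {
      pairs.push({ prompt, chosen: positive[i], rejected: negative[i] });
    }
  }
  return pairs;
}

const demoPairs = buildDpoPairs([
  { prompt: "Fix the failing test", response: "Patched the off-by-one", signal: "positive" },
  { prompt: "Fix the failing test", response: "Deleted the test", signal: "negative" },
  { prompt: "Write docs", response: "Added a README", signal: "positive" },
]);

// Emit as JSONL, one pair per line.
for (const pair of demoPairs) console.log(JSON.stringify(pair));
```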

JSONL File Watcher

The serve command automatically starts a background watcher that monitors feedback-log.jsonl for entries written by external sources (Amp plugins, shell hooks, CI scripts). These entries are routed through the full captureFeedback() pipeline — validation, memory promotion, vector indexing, and DPO eligibility.

# Standalone watcher
npx rlhf-feedback-loop watch --source amp-plugin-bridge

# Process pending entries once and exit
npx rlhf-feedback-loop watch --once

External sources write entries with a source field:

{"signal":"positive","context":"Agent fixed bug on first try","source":"amp-plugin-bridge","tags":["amp-ui-bridge"]}

The watcher tracks its position via .rlhf/.watcher-offset for crash-safe, idempotent processing.
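
To make the offset idea concrete, here is a minimal sketch of offset-tracked JSONL processing. It is not the package's watcher: `processPending` and its parameters are hypothetical, and the offset is assumed to be a byte position stored as plain text.

```javascript
const fs = require("node:fs");

// Process complete lines appended since the stored byte offset, then persist the new offset.
function processPending(logPath, offsetPath, handle) {
  let start = fs.existsSync(offsetPath)
    ? Number(fs.readFileSync(offsetPath, "utf8")) || 0
    : 0;
  const data = fs.readFileSync(logPath); // Buffer, so positions are in bytes
  if (start > data.length) start = 0; // log was truncated or rotated

  // Only consume up to the last newline; a trailing partial line waits for the next pass.
  const lastNewline = data.lastIndexOf(0x0a);
  if (lastNewline < start) return 0;

  const chunk = data.subarray(start, lastNewline + 1).toString("utf8");
  let count = 0;
  for (const line of chunk.split("\n")) {
    if (!line.trim()) continue;
    handle(JSON.parse(line));
    count++;
  }

  // Write the offset to a temp file and rename it, so a crash never leaves a torn offset.
  const tmp = offsetPath + ".tmp";
  fs.writeFileSync(tmp, String(lastNewline + 1));
  fs.renameSync(tmp, offsetPath);
  return count;
}
```

In this sketch, a crash between handling entries and saving the offset causes those entries to be replayed on the next run, so the handler should tolerate seeing the same entry twice (for example, by de-duplicating on a content hash).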

Learning Curve Dashboard

npx rlhf-feedback-loop status
╔══════════════════════════════════════╗
║     RLHF Learning Curve Dashboard    ║
╠══════════════════════════════════════╣
║ Total signals:    148                ║
║ Positive:          45  (30%)         ║
║ Negative:         103  (70%)         ║
║ Recent (last 20):  20%               ║
║ Trend:            📉 declining       ║
║ Memories:          17                ║
║ Prevention rules:   9                ║
╠══════════════════════════════════════╣
║ Top failure domains:                 ║
║   execution-gap     4                ║
║   asked-not-doing   2                ║
║   speed             2                ║
╠══════════════════════════════════════╣
║ Learning curve (approval % by window)║
║   [1-10]   10% ██                    ║
║   [11-20]  20% ████                  ║
║   [21-30]  35% ███████               ║
║   [31-40]  30% ██████                ║
╚══════════════════════════════════════╝
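
The learning-curve rows report approval percentage per window of signals. As an illustration of that calculation, the sketch below splits a signal history into fixed-size windows and renders one bar block per 5 percentage points, which matches the bar lengths in the sample output. The window size, the rounding, and the `approvalByWindow` helper are assumptions; the dashboard's actual windowing is not documented here.

```javascript
// Compute approval % for consecutive fixed-size windows of signals.
function approvalByWindow(signals, windowSize = 10) {
  const rows = [];
  for (let i = 0; i < signals.length; i += windowSize) {
    const chunk = signals.slice(i, i + windowSize);
    const positive = chunk.filter((s) => s === "positive").length;
    const pct = Math.round((positive / chunk.length) * 100);
    rows.push({
      label: `[${i + 1}-${i + chunk.length}]`,
      pct,
      bar: "█".repeat(Math.round(pct / 5)), // one block per 5 points
    });
  }
  return rows;
}

// Example history: 1 approval in the first ten signals, 2 in the next ten.
const history = [
  "positive", ...Array(9).fill("negative"),
  "positive", "positive", ...Array(8).fill("negative"),
];
for (const row of approvalByWindow(history)) {
  console.log(`${row.label.padEnd(8)} ${String(row.pct).padStart(3)}% ${row.bar}`);
}
```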

Architecture

Five-phase pipeline: Capture → Validate → Remember → Prevent → Export

Agent (Claude/Codex/Amp/Gemini)
  │
  ├── MCP tool call ──→ captureFeedback()
  ├── REST API ────────→ captureFeedback()
  ├── CLI ─────────────→ captureFeedback()
  └── External write ──→ JSONL ──→ Watcher ──→ captureFeedback()
                                        │
                                        ▼
                              ┌─────────────────┐
                              │  Full Pipeline   │
                              │  • Schema valid  │
                              │  • Rubric gate   │
                              │  • Memory promo  │
                              │  • Vector index  │
                              │  • Risk scoring  │
                              │  • RLAIF audit   │
                              │  • DPO eligible  │
                              └─────────────────┘

Agent Runner Contract

License

MIT. See LICENSE.