
pi-prompt-autoresearch

v0.1.1

A pi extension that iteratively improves prompts using execution-based evaluation, blind A/B comparison, and keep/discard decisions.

  • Generates an eval suite from your goal
  • Runs each prompt candidate across the suite and scores actual outputs
  • Performs blind A/B comparisons between incumbent and candidate
  • Keeps or discards each iteration based on eval scores and comparator preference
  • Benchmarks repeated runs and reports variance

Install

pi install npm:pi-prompt-autoresearch

From the public git repo:

pi install git:github.com/NicoAvanzDev/pi-prompt-autoresearch

From a local clone:

pi install .

Load without installing:

pi --no-extensions -e ./index.ts

Quick start

/autoresearch Write a prompt that produces a concise, factual summary of a long technical article.

That single command kicks off the full optimization loop. The extension will:

  1. Generate an initial prompt from your goal
  2. Build an eval suite tailored to the task
  3. Iterate — rewrite, evaluate, compare, keep or discard — for 10 rounds (configurable)
  4. Write the best prompt to AUTORESEARCH_PROMPT.md in your working directory

A live progress widget shows iteration count, scores, elapsed time, and ETA while it runs. When a new best prompt is found, you get a milestone update in chat.
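The ETA in the progress widget can be derived from average iteration time so far. A minimal sketch in TypeScript; the function name and signature are illustrative, not the extension's internal API:

```typescript
// Hypothetical sketch: estimate remaining time from the average
// duration of the iterations completed so far.
function estimateEtaMs(elapsedMs: number, done: number, total: number): number {
  if (done === 0) return NaN;            // no data yet, ETA unknown
  const perIteration = elapsedMs / done; // average iteration time so far
  return perIteration * (total - done);  // projected time for the rest
}

// e.g. 4 of 10 iterations in 120s → ~180s remaining
```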

Example session

> /autoresearch Write a prompt that turns raw meeting transcripts into structured JSON notes with attendees, action items, and decisions.

  Autoresearch ━━━━━━━━━━━━━━━━━━━━ 100%  10/10 iterations
  Goal    Turn meeting transcripts into structured JSON notes
  Score   0.92 (best) — +38% vs baseline
  Status  Completed in 4m 12s

✓ Best prompt written to AUTORESEARCH_PROMPT.md

You can also benchmark an existing prompt to measure consistency:

> /autoresearch-benchmark --runs 5 Write a prompt that extracts structured meeting notes as JSON.

  Benchmark complete — 5 runs
  Mean 0.88 · Min 0.84 · Max 0.91 · StdDev 0.03

How it works

Improve mode

For each /autoresearch run, the extension:

  1. generates an initial prompt from the user goal
  2. generates a small eval suite for the user goal
  3. runs the initial prompt on every eval case
  4. scores each case and computes an aggregate score
  5. generates a revised prompt candidate
  6. runs that candidate on every eval case
  7. evaluates the candidate across the full suite
  8. performs a blind A/B comparison between incumbent and candidate outputs
  9. keeps the candidate only if all three conditions hold:
    • the evaluator recommends keeping it
    • the aggregate score beats the current best
    • the blind comparator prefers the candidate
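The keep/discard rule at the end of the loop can be sketched in TypeScript. Field and function names here are illustrative, not the extension's actual API:

```typescript
// Hypothetical sketch of the keep/discard decision described above.
interface IterationResult {
  evalVerdict: "keep" | "discard";     // eval suite's recommendation
  aggregateScore: number;              // aggregate score across eval cases
  comparatorPrefersCandidate: boolean; // blind A/B comparison outcome
}

// All three conditions must hold for the candidate to replace the incumbent.
function shouldKeep(candidate: IterationResult, bestScore: number): boolean {
  return (
    candidate.evalVerdict === "keep" &&
    candidate.aggregateScore > bestScore &&
    candidate.comparatorPrefersCandidate
  );
}
```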

Benchmark mode

The benchmark workflow:

  1. generates an eval suite
  2. runs the prompt multiple times across that suite
  3. records per-run aggregate scores
  4. reports:
    • mean score
    • min/max score
    • variance
    • standard deviation
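The reported statistics are standard descriptive measures over the per-run scores. A self-contained sketch (this uses population variance; the extension may compute sample variance instead):

```typescript
// Hypothetical sketch of the per-run statistics the benchmark reports.
function benchmarkStats(scores: number[]) {
  const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
  // Population variance: mean of squared deviations from the mean.
  const variance =
    scores.reduce((a, s) => a + (s - mean) ** 2, 0) / scores.length;
  return {
    mean,
    min: Math.min(...scores),
    max: Math.max(...scores),
    variance,
    stdDev: Math.sqrt(variance),
  };
}
```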

Commands

Run autoresearch

/autoresearch <goal>

Example:

/autoresearch Write a prompt that produces a concise, factual summary of a long technical article.

Override iterations for one run:

/autoresearch --iterations 20 Write a prompt that generates a JSON API migration checklist.

Benchmark a prompt

/autoresearch-benchmark <goal>

Example:

/autoresearch-benchmark --runs 5 Write a prompt that extracts structured meeting notes as JSON.

Change the default iteration count

/autoresearch-iterations 20

Control a running job

/autoresearch-pause
/autoresearch-resume
/autoresearch-kill
/autoresearch-status

In interactive mode, the extension shows:

  • a persistent progress widget above the editor
  • an AI-generated goal summary
  • iteration and case progress
  • elapsed time and ETA, refreshed live while a job is running
  • current score, best score, and percentage improvement vs baseline
  • milestone updates in chat when a new best prompt is found, or when the job is paused/resumed/completed

During a run, the extension writes AUTORESEARCH_PROMPT.md in the current working directory with the raw best prompt text, updated at each iteration. Progress state is kept internal to the extension (pi session entries and the live UI widget).

Pause takes effect at the next safe checkpoint between long-running steps.

Tools

The extension exposes LLM-callable tools:

  • run_prompt_autoresearch
  • benchmark_prompt_autoresearch

run_prompt_autoresearch

Parameters:

  • goal: string
  • iterations?: number
  • evalCases?: number

benchmark_prompt_autoresearch

Parameters:

  • goal: string
  • runs?: number
  • evalCases?: number
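The two parameter lists above can be summarized as TypeScript shapes. This is a sketch inferred from the README; the extension's actual schema may differ, and `withDefaults` is a hypothetical helper added only to show how the optional fields fall back to the defaults in the Notes section:

```typescript
// Sketch of the tool parameter shapes listed above.
interface RunPromptAutoresearchParams {
  goal: string;        // what the optimized prompt should achieve
  iterations?: number; // default 10, max 100
  evalCases?: number;  // default 5, max 8
}

interface BenchmarkPromptAutoresearchParams {
  goal: string;
  runs?: number;      // default 3, max 10
  evalCases?: number; // default 5, max 8
}

// Hypothetical helper: apply the documented defaults for omitted fields.
const withDefaults = (p: RunPromptAutoresearchParams) => ({
  goal: p.goal,
  iterations: p.iterations ?? 10,
  evalCases: p.evalCases ?? 5,
});
```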

Notes

  • default improve iterations: 10
  • users can increase iterations up to 100
  • default benchmark runs: 3
  • benchmark runs can go up to 10
  • default eval cases: 5
  • eval cases can go up to 8
  • in interactive mode, /autoresearch copies the best prompt into the editor when finished
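The defaults and upper limits above suggest a simple clamping rule for user-supplied values. A hypothetical sketch, not the extension's actual code:

```typescript
// Hypothetical helper: clamp a user-supplied value to the documented limits,
// falling back to the default when the value is omitted.
function clampSetting(
  value: number | undefined,
  def: number,
  max: number,
  min = 1,
): number {
  if (value === undefined) return def;
  return Math.min(max, Math.max(min, Math.floor(value)));
}

// Documented limits from the notes above.
const iterations = (v?: number) => clampSetting(v, 10, 100); // default 10, max 100
const runs = (v?: number) => clampSetting(v, 3, 10);         // default 3, max 10
const evalCases = (v?: number) => clampSetting(v, 5, 8);     // default 5, max 8
```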