@sebastianandreasson/pi-autonomous-agents

v0.15.2

Portable unattended PI harness for developer/tester/visual-review loops.

PI Autonomous Agents

@sebastianandreasson/pi-autonomous-agents is an npm package for running a bounded unattended PI workflow inside another repository.

It orchestrates:

  • a developer turn
  • a fast local verification step
  • an independent tester turn
  • an optional focused developerFix turn when verification/tester finds a real issue
  • optional periodic visual review from screenshots

The package is intentionally generic. It handles supervision, prompts, runtime state, telemetry, retries, and guardrails. The consuming repo still owns its own tasks, instructions, tests, model endpoints, and screenshot capture flow.

Install

npm install -D @sebastianandreasson/pi-autonomous-agents

Then in the consuming repo, tell your agent:

Find SETUP.md in @sebastianandreasson/pi-autonomous-agents and set everything up for this repository.

The package ships a top-level SETUP.md specifically for that workflow.

What This Package Owns

  • unattended loop orchestration
  • PI Node SDK integration
  • config loading
  • prompt assembly
  • verification/tester/visual-review handoff
  • timeout and loop guards
  • telemetry and run summaries
  • runtime isolation and stale-run recovery

What Each Repo Must Provide

  • TODOS.md
  • repo-specific pi/DEVELOPER.md
  • repo-specific pi/TESTER.md
  • a fast bounded testCommand
  • model configuration that actually matches the local/cloud providers in use
  • optionally a screenshot capture command for visual review

Quick Start In A Repo

The normal setup shape is:

TODOS.md
pi.config.json
pi/
  DEVELOPER.md
  TESTER.md

Typical scripts:

  • pi:once and pi:run use the default sdk transport
  • pi:run also hosts the web UI on 127.0.0.1:4317 by default
  • pi:mock skips real agent execution

{
  "scripts": {
    "pi:mock": "PI_CONFIG_FILE=pi.config.json PI_TRANSPORT=mock PI_TEST_CMD= pi-harness once",
    "pi:once": "PI_CONFIG_FILE=pi.config.json pi-harness once",
    "pi:run": "PI_CONFIG_FILE=pi.config.json pi-harness run",
    "pi:report": "PI_CONFIG_FILE=pi.config.json pi-harness report",
    "pi:visual:once": "PI_CONFIG_FILE=pi.config.json pi-harness visual-once",
    "pi:visualize": "PI_CONFIG_FILE=pi.config.json pi-harness visualize"
  }
}

Start from templates/pi.config.example.json, templates/DEVELOPER.md, templates/TESTER.md, and templates/gitignore.fragment.

Request telemetry is enabled by default for SDK runs. pi-harness writes a managed Pi extension package under .pi/extensions/pi-harness-request-telemetry/ in the consuming repo, with a package.json manifest and index.mjs shim that Pi auto-discovers on the next resource reload. Disable that with PI_REQUEST_TELEMETRY_ENABLED=0 or "piRequestTelemetryEnabled": false.

By default the extension now stores compact request telemetry only:

  • requests.jsonl with exact request totals and summarized tool/file attribution
  • spans.jsonl with byte counts and attribution metadata, but not full prompt text

Verbose hook traces and raw span text are opt-in for debugging:

  • PI_REQUEST_TELEMETRY_STORE_HOOKS=1 or "piRequestTelemetryStoreHooks": true
  • PI_REQUEST_TELEMETRY_STORE_SPAN_TEXT=1 or "piRequestTelemetryStoreSpanText": true
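
Each telemetry switch above can be set either through an environment variable or through a config field. The sketch below shows one way such a pair could be resolved. The helper names are hypothetical, and the precedence (environment variable wins when set) is an assumption, since the README does not state it.

```javascript
// Hypothetical helper: resolve one boolean setting from an env var and a config field.
// Assumption: the env var wins when it is set; the README does not state the precedence.
function resolveFlag(envValue, configValue, defaultValue) {
  if (envValue !== undefined && envValue !== "") {
    return envValue !== "0" && envValue.toLowerCase() !== "false";
  }
  if (typeof configValue === "boolean") {
    return configValue;
  }
  return defaultValue;
}

// Defaults follow the README: telemetry on, verbose hooks and span text off.
function resolveTelemetrySettings(env, config) {
  return {
    enabled: resolveFlag(
      env.PI_REQUEST_TELEMETRY_ENABLED,
      config.piRequestTelemetryEnabled,
      true
    ),
    storeHooks: resolveFlag(
      env.PI_REQUEST_TELEMETRY_STORE_HOOKS,
      config.piRequestTelemetryStoreHooks,
      false
    ),
    storeSpanText: resolveFlag(
      env.PI_REQUEST_TELEMETRY_STORE_SPAN_TEXT,
      config.piRequestTelemetryStoreSpanText,
      false
    ),
  };
}
```

For example, resolveTelemetrySettings(process.env, config) with PI_REQUEST_TELEMETRY_ENABLED=0 in the environment would report enabled: false under these assumptions.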

CLI

pi-harness once
pi-harness run
pi-harness report
pi-harness clear-history
pi-harness visual-once
pi-harness visualize
pi-harness debug-live
pi-harness visual-review-worker

Use PI_CONFIG_FILE to point at the repo-local config file:

PI_CONFIG_FILE=pi.config.json pi-harness once

If PI_CONFIG_FILE is not set, the package falls back to the bundled generic pi.config.json.

Core Workflow

Each real iteration works like this:

  1. developer implements one unchecked task from TODOS.md.
  2. The harness runs the configured fast verification command.
  3. If verification passes, tester reviews the change independently.
  4. If tester or verification fails, the findings go back to developerFix for one focused repair pass.
  5. If tester reaches PASS, tester creates the final commit directly by default.
  6. Every N successful iterations, optional visual review can inspect screenshots and veto the success if it finds a real problem.

The default commit model is commitMode: "agent". The older harness-managed parsed commit-plan flow still exists as commitMode: "plan", but it is now a compatibility mode rather than the default.
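
The iteration steps above can be sketched as a small control loop. This is not the package's implementation: every step is an injected, hypothetical callback, the steps are synchronous for brevity (real agent turns would be asynchronous), and the re-check after developerFix is an assumption.

```javascript
// Sketch of one iteration. `steps` holds hypothetical callbacks:
// developer(), verify(), tester(), developerFix(findings), visualReview().
// verify/tester/visualReview return { ok: boolean, findings?: string }.
function runIteration(steps, options = {}) {
  const { successfulIterations = 0, visualReviewEvery = 0 } = options;
  const trace = [];
  const run = (name, ...args) => {
    trace.push(name);
    return steps[name](...args);
  };

  // Verification gates the tester: the tester only runs when verification passes.
  const check = () => {
    const verification = run("verify");
    return verification.ok ? run("tester") : verification;
  };

  run("developer");
  let result = check();
  if (!result.ok) {
    // Exactly one focused repair pass, followed by a re-check (assumed).
    run("developerFix", result.findings);
    result = check();
  }
  if (!result.ok) {
    return { status: "failed", findings: result.findings, trace };
  }

  // Periodic audit: only on every N-th successful iteration, and it can veto.
  const successCount = successfulIterations + 1;
  if (visualReviewEvery > 0 && successCount % visualReviewEvery === 0) {
    const visual = run("visualReview");
    if (!visual.ok) {
      return { status: "vetoed", findings: visual.findings, trace };
    }
  }
  return { status: "passed", trace };
}
```

The returned trace makes the order of turns easy to inspect, which is useful when reasoning about why an iteration ended as failed or vetoed.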

Recommended Model Setup

The package supports:

  • one default text model via piModel
  • one default visual-review model via visualReviewModel
  • optional per-role overrides via roleModels
  • per-model endpoint config in models
  • default transport via transport (sdk or mock)

Typical pattern:

  • local model for developer
  • local model for developerRetry
  • local model for developerFix
  • local or slightly stronger model for tester
  • stronger frontier model only for visualReview

Example:

{
  "piModel": "local/text-model",
  "visualReviewModel": "cloud/vision-model",
  "models": {
    "local/text-model": {
      "baseUrl": "http://localhost:8000/v1",
      "apiKey": "local",
      "vision": false
    },
    "local/tester-model": {
      "baseUrl": "http://localhost:8000/v1",
      "apiKey": "local",
      "vision": false
    },
    "cloud/vision-model": {
      "baseUrl": "https://api.openai.com/v1",
      "apiKeyEnv": "OPENAI_API_KEY",
      "vision": true
    }
  },
  "roleModels": {
    "developer": "local/text-model",
    "developerRetry": "local/text-model",
    "developerFix": "local/text-model",
    "tester": "local/tester-model",
    "visualReview": "cloud/vision-model"
  }
}

Important:

  • do not guess model ids
  • if using a custom OpenAI-compatible provider, verify <baseUrl>/models
  • if using PI models directly, verify pi --list-models
  • if PI_CODING_AGENT_DIR points at a repo-local PI home, make sure it is bootstrapped and contains models.json

The harness now preflights those checks before starting a real run.
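
To make the relationship between piModel, visualReviewModel, roleModels, and models concrete, here is a minimal sketch of a role-to-model lookup. The function name and the fallback order (role override first, then visualReviewModel for visualReview, otherwise piModel) are assumptions for illustration, not the harness's documented behavior.

```javascript
// Hypothetical resolver. Assumed fallback order (not confirmed by the README):
// roleModels[role] -> visualReviewModel (visualReview only) -> piModel.
function resolveRoleModel(config, role) {
  const roleOverride = config.roleModels ? config.roleModels[role] : undefined;
  const fallback =
    role === "visualReview" ? config.visualReviewModel : config.piModel;
  const modelId = roleOverride || fallback;
  if (!modelId) {
    throw new Error(`No model configured for role "${role}"`);
  }

  // Models without an entry in `models` may still be valid PI models,
  // so a missing endpoint is reported as null rather than treated as an error.
  const endpoint = (config.models && config.models[modelId]) || null;
  if (role === "visualReview" && endpoint && endpoint.vision === false) {
    throw new Error(
      `Model "${modelId}" is marked vision: false but is assigned to visualReview`
    );
  }
  return { role, modelId, endpoint };
}
```

A lookup like this catches one easy mistake early: pointing visualReview at a text-only model. It does not replace checking <baseUrl>/models or pi --list-models, because it cannot tell whether a model id actually exists at the provider.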

Important Config Fields

Common fields in pi.config.json:

  • taskFile
  • developerInstructionsFile
  • testerInstructionsFile
  • transport (sdk or mock)
  • piModel
  • piRequestTelemetryEnabled
  • models
  • roleModels
  • commitMode
  • promptMode
  • testCommand
  • visualReviewEnabled
  • visualCaptureCommand
  • failureArtifactDir
  • continueAfterSeconds
  • toolContinueAfterSeconds
  • noEventTimeoutSeconds
  • toolNoEventTimeoutSeconds
  • sameFileLoopBudget
  • loopHistoryLimit
  • largeFileWarningLines
  • largeSpecWarningLines

Key defaults:

  • transport: sdk
  • commitMode: agent
  • promptMode: compact
  • piTools: read,edit,write,find,ls,bash
  • continueAfterSeconds: 300
  • toolContinueAfterSeconds: 900
  • noEventTimeoutSeconds: 900
  • toolNoEventTimeoutSeconds: 1800
  • sameFileLoopBudget: 2
  • loopHistoryLimit: 25
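
As a rough illustration of how these defaults combine with a repo's pi.config.json, the sketch below overlays repo values on the defaults listed above. The helper is hypothetical, and piTools is kept as the comma-separated string shown in this README; its actual type in the config file is an assumption.

```javascript
// Key defaults as listed in the README. The piTools format is an assumption.
const KEY_DEFAULTS = Object.freeze({
  transport: "sdk",
  commitMode: "agent",
  promptMode: "compact",
  piTools: "read,edit,write,find,ls,bash",
  continueAfterSeconds: 300,
  toolContinueAfterSeconds: 900,
  noEventTimeoutSeconds: 900,
  toolNoEventTimeoutSeconds: 1800,
  sameFileLoopBudget: 2,
  loopHistoryLimit: 25,
});

// Hypothetical helper: overlay repo config on the defaults and reject
// transport/commitMode values the README does not list.
function applyKeyDefaults(repoConfig) {
  const merged = { ...KEY_DEFAULTS, ...repoConfig };
  if (!["sdk", "mock"].includes(merged.transport)) {
    throw new Error(`Unsupported transport: ${merged.transport}`);
  }
  if (!["agent", "plan"].includes(merged.commitMode)) {
    throw new Error(`Unsupported commitMode: ${merged.commitMode}`);
  }
  return merged;
}
```

With this shape, a repo config only needs to state what differs, for example { "testCommand": "npm run smoke", "commitMode": "plan" }, and everything else keeps its default.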

Prompt and Tooling Behavior

The package is optimized for local models by default:

  • prompts are compacted before handoff
  • changed-file lists and feedback excerpts are capped
  • prompts prefer read for source inspection
  • shell is intended for git, tests, and narrow diagnostics
  • SDK transport carries forward oversized shell-read warnings and loop/timeout guards
  • repeated same-file loop failures are remembered across iterations and escalate the next edit strategy
  • the supervisor emits large-file/spec warnings when touched files are getting risky

This is deliberate. Large monolith files, huge e2e specs, and broad TODO items are among the main causes of local-model drift and retry loops.
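
The same-file loop memory can be pictured as a bounded history plus a per-file counter. The sketch below is only a guess at the mechanics: the function name is hypothetical, and the rule "escalate once failures on one file exceed sameFileLoopBudget" is an assumption, as is the idea that loopHistoryLimit caps the remembered entries.

```javascript
// Hypothetical loop guard. `history` is a list of file paths, one per
// recorded same-file loop failure, oldest first.
function recordLoopFailure(history, file, limits = {}) {
  const { sameFileLoopBudget = 2, loopHistoryLimit = 25 } = limits;

  // Keep only the most recent entries so old failures eventually age out.
  const nextHistory = [...history, file].slice(-loopHistoryLimit);
  const failures = nextHistory.filter((entry) => entry === file).length;

  return {
    history: nextHistory,
    failures,
    // Assumed rule: escalate the edit strategy once the budget is exceeded.
    escalate: failures > sameFileLoopBudget,
  };
}
```

Under these assumptions, the third recorded failure on the same file would be the first one to request an escalated edit strategy.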

Recommended repo shape:

  • keep TODO items very small and implementation-shaped
  • split giant stores/modules before they become constant edit hotspots
  • split ever-growing end-to-end specs into scenario files
  • keep the default testCommand to a bounded smoke check, not a multi-minute happy-path run

Runtime Isolation And Recovery

Recent versions of the package isolate each run more aggressively:

  • active ownership lock at .pi-runtime/active-run.json
  • per-run runtime directory under .pi-runtime/runs/<runId>/
  • per-run PI sessions and telemetry
  • runId added to telemetry
  • in-progress iteration state persisted before agent work starts
  • stale run locks recovered when the owning PID is gone
  • timeout cleanup kills the full spawned process group, not only the direct child
  • parent-death watchers shut down orphaned supervisor layers instead of letting them continue under PPID 1

That is meant to prevent orphaned timed-out agents or concurrent supervisors from corrupting shared state.
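
Stale-lock recovery of this kind usually rests on a liveness probe for the PID recorded in the lock. A minimal sketch, assuming a lock object with a pid field (the real shape of .pi-runtime/active-run.json is not documented here) and using Node's process.kill with signal 0, which checks for a process without signaling it:

```javascript
// Signal 0 performs the existence check without delivering a signal.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (error) {
    // EPERM means the process exists but belongs to another user.
    return error.code === "EPERM";
  }
}

// Hypothetical classification of a lock record such as { pid, runId }.
// `probe` is injectable so the logic can be exercised without real processes.
function classifyLock(lock, currentPid, probe = isProcessAlive) {
  if (!lock) return "free";
  if (lock.pid === currentPid) return "owned";
  return probe(lock.pid) ? "held" : "stale";
}
```

A supervisor could take over a lock only when the result is "free" or "stale", and refuse to start when it is "held". PID reuse can make a dead run look alive, so a real implementation would likely want an extra check such as a start timestamp.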

Debugging Artifacts

Useful files during a run:

  • .pi-last-prompt.txt: Exact assembled prompt for the current role.
  • .pi-last-output.txt: Latest agent output snapshot.
  • .pi-last-verification.txt: Latest verification output snapshot.
  • .pi-last-iteration.json: Structured summary of the last completed iteration.
  • pi-output/failure-artifacts/: Compact failure artifacts with command, exit code, changed files, tester summary, and output excerpt.
  • .pi-state.json: Persistent harness state, including in-progress iteration data.
  • pi.log: Main run log.
  • pi_telemetry.jsonl
  • pi_telemetry.csv
  • pi-output/token-usage/events.jsonl: Normalized token-attribution event stream for downstream tools. Each row includes phase, role, kind, session/model, attribution bucket, tool/file context, and token counts.
  • pi-output/token-usage/summary.json: Derived structured token summary with totals plus breakdowns by phase, model, session, attribution, tool, file, and directory.
  • .pi-runtime/active-run.json
  • .pi-runtime/runs/<runId>/...

Each run also gets run-scoped token artifacts under .pi-runtime/runs/<runId>/token-usage.events.jsonl and .pi-runtime/runs/<runId>/token-usage.summary.json.
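
Because events.jsonl holds one JSON object per line, downstream tools can aggregate it with very little code. The sketch below sums token counts per group. The field names role and totalTokens are placeholders, not the documented schema, so check a real row before relying on them.

```javascript
// Hypothetical aggregation over JSONL text. Field names are assumptions:
// `groupBy` defaults to "role" and token counts are read from "totalTokens".
function summarizeTokens(jsonlText, groupBy = "role") {
  const totals = {};
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank and trailing lines
    const row = JSON.parse(line);
    const key = row[groupBy] ?? "unknown";
    totals[key] = (totals[key] ?? 0) + (row.totalTokens ?? 0);
  }
  return totals;
}
```

In a script, the text would come from reading pi-output/token-usage/events.jsonl, and passing "phase" as the second argument would group by phase instead, again assuming the rows use that key.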

pi-harness report summarizes recent telemetry and token artifacts and surfaces things like terminal reasons, large-file warnings, failure artifacts, and top token hotspots.

pi-harness run now also starts a lightweight local web UI that shows the orchestration flow. By default it listens on 127.0.0.1:4317. Override this with PI_VISUALIZER_HOST and PI_VISUALIZER_PORT, or set PI_VISUALIZER=0 to disable the embedded web UI for a run.

The visualizer uses server-sent events (SSE) for live updates instead of browser polling.
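
For readers unfamiliar with SSE, the wire format is plain text: data: and event: lines, with a blank line ending each event. The sketch below parses that format in general terms. It is not the visualizer's code, and the event name and JSON payload in the usage note are invented for illustration.

```javascript
// Minimal parser for the text/event-stream format. Handles `event:` and
// `data:` fields, comment lines starting with ":", and multi-line data.
function parseSseEvents(text) {
  const events = [];
  let eventType = "message"; // default type when no `event:` field is sent
  let dataLines = [];

  for (const line of text.split(/\r\n|\n|\r/)) {
    if (line === "") {
      // A blank line dispatches the event collected so far.
      if (dataLines.length > 0) {
        events.push({ event: eventType, data: dataLines.join("\n") });
      }
      eventType = "message";
      dataLines = [];
      continue;
    }
    if (line.startsWith(":")) continue; // comment, often used as keep-alive

    const colon = line.indexOf(":");
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? "" : line.slice(colon + 1);
    if (value.startsWith(" ")) value = value.slice(1); // strip one leading space

    if (field === "event") eventType = value;
    else if (field === "data") dataLines.push(value);
  }
  return events;
}
```

Given the text 'event: stage\ndata: {"role":"tester"}\n\n', this returns one event of type stage whose data can be passed to JSON.parse. In a browser, the built-in EventSource does this parsing, so a hand-written parser is mainly useful for tests and command-line tooling.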

pi-harness visualize still exists as a standalone viewer if you want to inspect run history without starting a new run.

The visualizer now includes:

  • TODO-centric main view with current task open by default
  • run history selector from .pi-runtime/runs/
  • orchestration flow for selected todo
  • 50/50 split between live worker feed and current repo edits
  • per-iteration stage graph with retries/rechecks in diagnostics
  • clickable graph nodes and timeline rows that show full event JSON
  • historical run summaries and per-run last output snapshots
  • live worker feed with thinking text, assistant text, tool calls, and tool output
  • feed controls to hide thinking and collapse repetitive deltas
  • pinned latest tool output panel

Visual Review Contract

Visual review is optional and generic. The harness does not know how to navigate your app.

If enabled, your repo must provide a real screenshot capture command that writes a manifest under the configured capture directory. The manifest shape is documented in docs/PI_SUPERVISOR.md.

Visual review should be used as a periodic audit, not as the default inner-loop gate.

Resetting Harness State

If you want to wipe harness-generated state and start fresh:

PI_CONFIG_FILE=pi.config.json pi-harness clear-history

That clears configured harness runtime/history artifacts and verifies they are gone. It does not remove project source files.

Development

In this package repo:

npm run check
npm test

For local visualizer iteration against a fake live SDK agent:

npm run debug:live-ui

Scenario variants:

node src/cli.mjs debug-live --reset --scenario noisy --task-count 24
node src/cli.mjs debug-live --reset --scenario retry

For the React/Vite visualizer UI dev loop:

npm run dev:visualizer:ui

For a production visualizer UI build:

npm run build:visualizer:ui

Publishing now automatically runs the check, tests, and UI build via prepublishOnly.

The debug:live-ui script seeds .pi-debug/live-ui/, runs the harness there with a streaming fake SDK fixture, hosts the visualizer, and gives a stable local repro loop for UI work. The React app lives in visualizer-ui/. The visualizer server now serves built assets from visualizer-ui/dist/ and falls back to a build-instructions page if the build artifacts are missing.

See docs/VISUALIZER_UI_PLAN.md for the migration plan.

The package requires Node >=20.