
@robhowley/pi-structured-return

v1.4.0


pi-structured-return

Structured command execution for pi agents: compact results for the model, full logs for humans.

Cross-platform Pi package that combines:

  • a structured-return skill for choosing compact / machine-readable command forms
  • a structured-return extension that captures output, stores artifacts, applies parsers, and falls back to tail + log path

The payoff: tens of thousands of tokens saved per session by filtering noise from test suites, linters, build tools, and data pipelines, so the model spends its context on signal, not boilerplate.

Token reduction

Tool output is designed for humans: source diffs, line annotations, timing breakdowns, absolute paths repeated on every line. Useful on a terminal. Expensive in a model context, especially on failure when output is most verbose and the model needs to act fast.

The test-runner and linter numbers below come from small toy fixtures:

  • Test runners: 3 tests, 1 passing, 1 assertion failure, 1 unexpected error.
  • Linters: 1 unused variable warning in a single file.

| Parser | Raw (tokens) | Structured (tokens) | Reduction | Notes |
|---|---|---|---|---|
| junit-xml (maven) | 1063 | 86 | 92% | build lifecycle noise with surefire stack traces per failure |
| junit-xml (go) | 400 | 58 | 86% | verbose output with full stack trace per failure |
| junit-xml (dotnet) | 487 | 107 | 78% | build header and VSTest output with per-failure stack traces |
| vitest-json | 348 | 75 | 78% | source diff with inline arrows and ANSI color codes per failure |
| cargo-test | 285 | 68 | 76% | cargo progress + test binary output with panic traces per failure |
| junit-xml (pytest) | 289 | 71 | 75% | verbose output with source snippets and summary footer |
| rspec-json | 212 | 55 | 74% | default output with backtrace |
| junit-xml (gradle) | 263 | 81 | 69% | gradle console output with build lifecycle noise |
| junit-xml (jest) | 309 | 99 | 68% | source annotations with deep jest-circus stack traces per failure |
| cargo-build | 225 | 77 | 66% | rustc error annotations with code spans and help text per error |
| minitest-text | 168 | 59 | 65% | default output with backtrace |
| ruff-json | 107 | 52 | 51% | source context + help text per error |
| eslint-json | 64 | 59 | 8% | already compact formatter |

Tokens counted with cl100k_base (tiktoken). Linter output is more compact than test runner output to begin with, so the baseline reduction is lower. The numbers above are measured against a single file with a single error — a conservative lower bound. Both ruff and eslint repeat absolute file paths per error in their raw output, so reduction grows as violations spread across more files.

Pipeline tools

dbt output is the noisiest tool in this repo relative to useful signal. Every run prints version info, adapter registration, project stats, concurrency settings, and per-node start/finish lines — all before any result.

The numbers below come from toy projects of 3–4 models; real projects run hundreds of models, where the noise scales linearly and the reduction compounds.

  • dbt run: 4 models, 1 passing, 1 DB error, 1 permissions error, 1 DAG skip.
  • dbt test: 3 tests, 1 passing, 1 uniqueness failure, 1 unit test diff.
  • dbt compile: 3 models compiled to SQL.

| Parser | Raw (tokens) | Structured (tokens) | Reduction | Notes |
|---|---|---|---|---|
| dbt-json (run, success) | 428 | 20 | 95% | version, adapter, concurrency, per-model start/finish — all noise on success |
| dbt-json (run, failure) | 618 | 198 | 68% | error messages, model paths, compiled code paths preserved |
| dbt-json (test) | 720 | 274 | 62% | unit test diff tables preserved verbatim; preamble stripped |
| dbt-json (compile) | 775 | 683 | 12% | compiled SQL is the signal and returned verbatim |

At 12 models, run failures hit 85% reduction. An 18-model DAG success: 1,645 → 20 tokens (99%).

Built-in parsers

  • junit-xml (JUnit XML — covers pytest --junitxml, Gradle, Maven, Jest with jest-junit, Go with go-junit-report, and any other tool that emits the JUnit XML schema)
  • vitest-json
  • rspec-json
  • minitest-text (parses default minitest output — no flags or reporters needed)
  • dbt-json (dbt run/test/compile --log-format json — errors, warnings, and unit test diffs from JSONL; compiled SQL returned for compile; success runs reduced to a one-line summary)
  • cargo-build (cargo build --message-format=json — errors with file, line, error code, and primary span label; warnings filtered out)
  • cargo-test (cargo test — assertion left/right values, panic messages, and file:line per failure; detects compilation failures and directs to cargo build --message-format=json)
  • ruff-json (ruff check only — ruff format has no json support)
  • eslint-json

Before / after

Raw pytest output (262 tokens):

============================= test session starts ==============================
platform darwin -- Python 3.14.2, pytest-9.0.2
collecting ... collected 3 items

test_math.py::test_adds_two_numbers_correctly PASSED                     [ 33%]
test_math.py::test_multiplies_two_numbers_correctly FAILED               [ 66%]
test_math.py::test_does_not_divide_by_zero FAILED                        [100%]

=================================== FAILURES ===================================
____________________ test_multiplies_two_numbers_correctly _____________________

    def test_multiplies_two_numbers_correctly():
>       assert 3 * 4 == 99
E       assert (3 * 4) == 99

test_math.py:5: AssertionError
_________________________ test_does_not_divide_by_zero _________________________

    def test_does_not_divide_by_zero():
>       result = 1 / 0
                 ^^^^^
E       ZeroDivisionError: division by zero

test_math.py:8: ZeroDivisionError
=========================== short test summary info ============================
FAILED test_math.py::test_multiplies_two_numbers_correctly
FAILED test_math.py::test_does_not_divide_by_zero - ZeroDivisionError: ...
========================= 2 failed, 1 passed in 0.01s ==========================

Structured result returned to the model (56 tokens):

pytest test_math.py --junitxml=.tmp/report.xml → cwd: project
2 failed, 1 passed
  test_math.py:5  assert (3 * 4) == 99
  test_math.py:8  ZeroDivisionError: division by zero
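Rendered as data, that compact result might correspond to an object like the following. This is a sketch: field names follow the structured result schema documented later in this README, but the exact values (exit code, log path) are assumed, not captured from a real run.

```typescript
// Hypothetical structured result for the pytest run above. Field names
// follow the documented schema; exitCode and logPath are assumed values.
const result = {
  tool: "pytest",
  exitCode: 1,
  status: "fail" as const,
  summary: "2 failed, 1 passed",
  cwd: "project",
  failures: [
    {
      id: "test_math.py::test_multiplies_two_numbers_correctly",
      file: "test_math.py",
      line: 5,
      message: "assert (3 * 4) == 99",
    },
    {
      id: "test_math.py::test_does_not_divide_by_zero",
      file: "test_math.py",
      line: 8,
      message: "ZeroDivisionError: division by zero",
    },
  ],
  artifact: ".tmp/report.xml",
  logPath: ".tmp/pytest-run.log", // assumed location
};
console.log(result.summary);
```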

Installation

pi install npm:@robhowley/pi-structured-return

How it works

  1. The agent runs commands through structured_return instead of bash.
  2. Full output is captured and stored as a log.
  3. A parser converts noisy CLI output into a compact structured result. If no parser matches, the last 200 lines and the log path are returned as a fallback.
  4. The agent receives the structured result in context — signal only, no noise.
  5. The full log is always available on disk for both the agent and humans to inspect.
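The fallback in step 3 can be sketched in a few lines of TypeScript. The function and field names here are illustrative, not the package's internals:

```typescript
// Sketch of the no-parser fallback: return the last 200 lines of the
// captured log plus the path to the full log on disk.
function fallbackResult(logPath: string, logText: string) {
  const lines = logText.split("\n");
  return {
    status: "error" as const,
    summary: "no parser matched; returning raw tail",
    rawTail: lines.slice(-200).join("\n"), // last 200 lines only
    logPath, // full log stays available for deeper inspection
  };
}
```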

Agentic loops

The token table above measures a single run. In an agentic loop the cost compounds — every tool result accumulates in context for the life of the task.

This applies to any loop: fixing a failing test suite, implementing a feature end-to-end, working through a migration, performance tuning execution times. The agent runs a command, reads the result, makes a change, runs it again. Each iteration adds another tool result to the context window. With a noisy CLI that means paying for the same verbose boilerplate every time.

A parser reduces each run to a one- or two-line signal. Over 15 iterations the difference isn't 80 tokens vs 15 tokens — it's 1,200 tokens vs 225 for a single command in a single task.
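The arithmetic above, spelled out. The per-run token counts are the illustrative figures from the paragraph, not measurements:

```typescript
// Compounding cost: each loop iteration adds another tool result to
// context, so per-run savings multiply across the task.
const iterations = 15;
const rawPerRun = 80;        // tokens of raw CLI output per run
const structuredPerRun = 15; // tokens of structured result per run

const rawTotal = iterations * rawPerRun;               // 1200
const structuredTotal = iterations * structuredPerRun; // 225
console.log(`${rawTotal} vs ${structuredTotal} tokens over ${iterations} runs`);
```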

Extending with project-local parsers

Built-in parsers cover common tools. For everything else — internal CLIs, custom test runners, proprietary lint tools — add a .pi/structured-return.json to your project root.

Why: keeps token costs low for tools the built-ins don't know about, without forking the package.

Two options:

1. Re-use a built-in parser

Route a project-specific command to an existing parser. Use this when your tool's output already matches a supported format (e.g. a test runner that emits JUnit XML).

// .pi/structured-return.json
{
  "parsers": [
    {
      "id": "acme-tests",
      "match": { "argvIncludes": ["acme", "test"] },
      "parseAs": "junit-xml"
    }
  ]
}

2. Write a custom parser

Point to a local .ts file for tools with unique output formats.

// .pi/structured-return.json
{
  "parsers": [
    {
      "id": "foo-json",
      "match": { "argvIncludes": ["foo-cli", "check"] },
      "module": "parsers/foo-cli.js"
    }
  ]
}

// .pi/parsers/foo-cli.ts
import fs from "node:fs";
import type { RunContext } from "@robhowley/pi-structured-return/types";

export default {
  id: "foo-json",
  async parse(ctx: RunContext) {
    const data = JSON.parse(fs.readFileSync(ctx.stdoutPath, "utf8"));
    return {
      tool: "foo-cli",
      status: data.ok ? "pass" : "fail",
      summary: data.ok ? "passed" : `${data.errors.length} errors`,
      failures: data.errors.map((e: any, i: number) => ({
        id: e.id ?? `error-${i}`,
        file: e.file,
        line: e.line,
        message: e.message,
      })),
      logPath: ctx.logPath,
    };
  },
};

The parser receives a RunContext (command, argv, cwd, stdout/stderr paths, artifact paths, log path) and returns a ParsedResult. Match rules support argvIncludes (array of required tokens) or regex (tested against the full argv string).
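Under those rules, match evaluation might look like the following sketch. The exact semantics (every argvIncludes token required; regex tested against the space-joined argv) are assumed from the description above, not taken from the package source:

```typescript
// Hypothetical evaluation of a parser match rule against a command's argv.
type MatchRule = { argvIncludes?: string[]; regex?: string };

function matches(rule: MatchRule, argv: string[]): boolean {
  if (rule.argvIncludes) {
    // Every required token must appear somewhere in argv.
    return rule.argvIncludes.every((token) => argv.includes(token));
  }
  if (rule.regex) {
    // Regex is tested against the full argv string.
    return new RegExp(rule.regex).test(argv.join(" "));
  }
  return false; // a rule with no criteria matches nothing
}
```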

Slash commands

  • /sr-parsers — list all registered parsers (built-in and project-local) with their match rules and targets

Structured result schema

Every parser returns the same shape. The model always knows where to look.

| Field | Type | Description |
|---|---|---|
| tool | string | Name of the tool that ran (eslint, pytest, etc.) |
| exitCode | number | Raw process exit code |
| status | pass \| fail \| error | Normalized outcome |
| summary | string | One-line human+model readable result (3 failed, 12 passed) |
| cwd | string | Working directory — anchor for resolving relative paths in failures |
| failures | { id, file?, line?, message?, rule? }[] | Per-failure details with relative file paths |
| artifact | string? | Path to the saved report file, if one was written |
| logPath | string | Path to full stdout+stderr log |
| rawTail | string? | Last 200 lines of log, included on fallback when no parser matched |
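The schema can be transcribed into a TypeScript interface. This is a sketch for reference, not the package's published type definitions:

```typescript
// Per-failure detail, with relative file paths anchored at cwd.
interface Failure {
  id: string;
  file?: string;
  line?: number;
  message?: string;
  rule?: string;
}

// The shape every parser returns, transcribed from the schema table.
interface ParsedResult {
  tool: string;                      // name of the tool that ran
  exitCode: number;                  // raw process exit code
  status: "pass" | "fail" | "error"; // normalized outcome
  summary: string;                   // one-line result, e.g. "3 failed, 12 passed"
  cwd: string;                       // working directory for resolving paths
  failures: Failure[];
  artifact?: string;                 // saved report file, if one was written
  logPath: string;                   // full stdout+stderr log
  rawTail?: string;                  // last 200 lines, fallback only
}
```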

Design

structured_return is a separate tool, not a wrapper around bash. Intercepting bash to silently rewrite commands would override a primitive the model and platform both rely on. Pi's philosophy is to extend rather than obfuscate: features are built on top of the platform, not hidden inside it. A dedicated tool honors that. It adds to the available surface, keeps bash honest, and leaves the choice explicit. The skill guides the model toward it; nothing is hijacked to get there.