npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@plune-ai/cli

v0.2.2

Published

AI-powered assertion testing for LLM apps — write assertions, run them against any provider, get a pass/fail report (local, CI, or diff).

Downloads

349

Readme

@plune-ai/cli

AI-powered assertion testing for LLM apps — a test runner for model behaviour.

npm CI license

Plune runs an assertion suite against an LLM provider and gives you a pass/fail report — locally, in CI, or as a regression diff between two runs. You describe the checks in one plune.yaml; Plune calls the model, evaluates each assertion, caches results, and reports token cost. Ten built-in assertion types cover plain text, JSON-schema, LLM-as-judge, and RAG metrics (faithfulness, answer-relevance, context-precision).

Install

npm install -g @plune-ai/cli      # or: pnpm add -g @plune-ai/cli
plune --version

Or run it without installing:

npx -y @plune-ai/cli run

Requires Node.js ≥ 20.

Quickstart

# 1. Scaffold plune.yaml, an example dataset, and .env.example
plune init

# 2. Add your provider key (read from the environment / .env — never written to disk)
echo 'ANTHROPIC_API_KEY=sk-ant-...' >> .env

# 3. Run the assertions
plune run
# → 1/1 passed · 0 failed · 0 errored · $0.0008

# 4. Re-render the last run, or diff two runs to catch regressions
plune report --format markdown
plune diff baseline.json current.json --fail-on-regression

Each run writes its full result to .plune/last-run.json.

Configuration

Plune reads a single plune.yaml, discovered by walking up from the working directory (or passed with -c <path>). This is what plune init scaffolds:

version: 1
provider:
  type: anthropic            # anthropic | openai | openrouter
  model: claude-3-5-sonnet-latest
evals:
  - id: example
    prompt: "Answer concisely. {{question}}"   # {{vars}} come from each dataset row
    dataset: datasets/example.jsonl            # a file path, or an inline `examples:` list
    assertions:
      - type: contains
        value: "Paris"

Datasets are JSONL, one row per line, shaped { "vars": { ... }, "expected"?: "..." }. The provider API key is read from the environment based on provider.type:

| Provider | provider.type | Environment variable | | ---------- | --------------- | -------------------- | | Anthropic | anthropic | ANTHROPIC_API_KEY | | OpenAI | openai | OPENAI_API_KEY | | OpenRouter | openrouter | OPENROUTER_API_KEY |

Assertion types

| Type | Passes when… | | --------------------- | ---------------------------------------------------------------- | | exact-match | output equals value (optional trim, ignore_case) | | contains | output contains value | | contains-any | output contains at least one of values | | contains-all | output contains every one of values | | json-schema | output validates against the JSON schema | | llm-judge | an LLM grades the output against criteria (≥ pass_threshold) | | semantic-similarity | embedding similarity to referencethreshold | | faithfulness | output is grounded in context (RAG) | | answer-relevance | output actually answers the question (RAG) | | context-precision | context is relevant to the question (RAG) |

Commands

| Command | Summary | | ------- | ------- | | plune run | Run the suite. Flags: --dry-run, --only <id\|tag> (repeatable), --bail, --no-cache, --concurrency <n>, --format console\|json\|markdown, -o, --output <file>. | | plune report | Re-render the most recent run. Flags: --format, -o. | | plune diff <baseline> <current> | Compare two plune run --format json outputs and report pass→fail regressions. Flags: --fail-on-regression, --format, -o. | | plune init | Scaffold plune.yaml, a sample dataset, and .env.example. Flags: --yes (non-interactive), --force. |

Global flags: -c, --config <path> · -v, --verbose · --no-color.

Exit codes: 0 everything passed · 1 an assertion failed · 2 configuration or execution error.

Programmatic API

The same engine that powers plune run is exported for use from your own code. Unlike the CLI, the library does not parse argv or auto-load .env — set the provider key in process.env yourself.

import { run } from '@plune-ai/cli';
import type { RunResult } from '@plune-ai/cli';

const result: RunResult = await run({ dryRun: false, configPath: 'plune.yaml' });
console.log(result.summary); // { total, passed, failed, errored, ... }

Use in CI

Run Plune on every pull request and post a regression diff as a sticky comment with the companion GitHub Action, plune-ai/eval-action:

- uses: plune-ai/eval-action@v1
  with:
    config: plune.yaml
    fail-on-regression: true

Contributing

Bug reports and pull requests are welcome — see CONTRIBUTING.md. For security issues, see SECURITY.md.

License

MIT © Plune Contributors