npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

@taskproof/spec

v0.2.1

Published

The taskproof task-spec format: versioned YAML schema with Zod validation for agent-usability tasks

Readme

@taskproof/spec

The task-spec format: a versioned YAML schema describing a task an AI agent should be able to complete on your site, plus the deterministic assertions that decide whether it did. This package is the single source of truth for the format — the CLI, every runner adapter, and the report generator all validate through it.

Pre-release; 0.x versions may break between releases.

A task spec

specVersion: '0.1'
id: pricing-trial
goal: 'Find the price of the Team plan and start a free trial'
entryUrl: 'https://example.com'
allowedDomains: # optional — defaults to the entryUrl hostname
  - example.com
  - '*.checkout-provider.example.com'
maxSteps: 20 # default 20, max 200
maxCostUsd: 1.00 # optional soft budget cap per run (Claude only; see note)
passPolicy: # default { k: 1, minPasses: 1 }
  k: 5 # run the task 5 times…
  minPasses: 4 # …pass if at least 4 succeed (agents are non-deterministic)
assertions: # at least one; ALL must hold for a run to pass
  - type: url
    pattern: '**/trial**'
  - type: dom
    selector: '[data-testid=trial-confirmation]'
    state: visible

Fields

| Field | Required | Notes | | ---------------- | -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | specVersion | yes | Format version, currently "0.1". Must be a quoted string — YAML reads unquoted 0.1 as a number, which cannot round-trip versions (0.10 becomes 0.1), so the parser rejects it with a quoting hint. | | id | yes | Kebab-case, stable across runs — baselines and diffs key on it. | | goal | yes | The natural-language instruction handed verbatim to each agent. | | description | no | Human-facing context; never shown to the agent. | | tags | no | Free-form labels for filtering (docs, smoke, …). | | entryUrl | yes | http(s) URL where every run starts. Credentials (user:pass@) are rejected — specs land in reports and artifacts. | | allowedDomains | no | Hostnames the agent may visit; *.example.com wildcards and localhost allowed. Defaults to the entry hostname. | | maxSteps | no | Hard step cap before the run is failed (default 20). | | maxCostUsd | no | Soft budget cap per run. Claude: the run stops before paying for a turn it can't afford, so it can overshoot by ≤1 turn's cost. browser-use: not enforced mid-run — it runs to maxSteps, so lower maxSteps to bound spend. A mistake-guard, not a hard limit. | | passPolicy | no | pass@k with a threshold. The default (k: 1) is a single smoke run for local dev; CI gating should set k ≥ 3 with a threshold — never a binary gate. | | assertions | yes | Deterministic checks, all required to pass. LLM-judge grading layers on top of these, never replaces them. | | judge | no | Natural-language rubric for the optional LLM judge (WebJudge layer). Runs only after the deterministic assertions pass, and can turn a pass into a fail, never the reverse. Omit to skip LLM judging. 1–2000 chars. |

Assertion types

  • url — glob pattern against the full final URL (scheme+host+path+query, fully anchored): { type: url, pattern: "**/order/confirmed**" }. A leading ** matches any host, so anchor the host (https://shop.example.com/**/confirmed**) when an off-domain match would be a false pass — browser-use blocks off-domain navigation via allowedDomains, but the Claude adapter does not.
  • dom — CSS selector against the final DOM with state: visible | attached | text | absent | hidden (text requires a text: substring): { type: dom, selector: "h1", state: text, text: "Thank you" }. absent = no element matches; hidden = a match exists but isn't visible (e.g. a modal closed to display:none) — use these to grade "the agent cleared the overlay". A negative check can pass vacuously on a blank/wrong page, so always pair absent/hidden with a positive anchor (a url, visible/text, or network assertion).
  • network — glob pattern against requests observed during the run, with optional method and status (exact code or class like "2xx"): { type: network, urlPattern: "**/api/orders", method: POST, status: "2xx" }

Unknown keys anywhere in the document are rejected — typos fail loudly at validate time, not silently at run time.

API

import { parseTaskSpec, safeParseTaskSpec, taskSpecSchema, type TaskSpec } from '@taskproof/spec';

const spec = parseTaskSpec(yamlSource, { filename: 'tasks/pricing.yaml' }); // throws TaskSpecError
const result = safeParseTaskSpec(yamlSource); // { ok: true, spec } | { ok: false, error }

Errors are typed: TaskSpecYamlError (unparseable YAML), UnsupportedSpecVersionError (unknown specVersion, carries .supported), TaskSpecValidationError (carries structured .issues with path + message).

Versioning

specVersion is a literal, not semver: parsers declare exactly which versions they support and fail fast on anything else. Format changes will go through a public RFC process once the spec is published; 0.x versions may break between releases.